Filename: xxx-pluggable-transport.txt
Title: Pluggable transports for circumvention
Author: Jacob Appelbaum, Nick Mathewson
Created: 15-Oct-2010
Status: Draft

Overview

  This proposal describes a way to decouple protocol-level obfuscation
  from the core Tor protocol in order to better resist client-bridge
  censorship.  Our approach is to specify a means to add pluggable
  transport implementations to Tor clients and bridges so that they can
  negotiate a superencipherment for the Tor protocol.

Scope

  This is a document about transport plugins; it does not cover
  discovery improvements, or bridgedb improvements.  While these
  requirements might be solved by a program that also functions as a
  transport plugin, this proposal only covers the requirements and
  operation of transport plugins.

Motivation

  Frequently, people want to try a novel circumvention method to help
  users connect to Tor bridges.  Some of these methods are already
  pretty easy to deploy: if the user knows an unblocked VPN or open
  SOCKS proxy, they can just use that with the Tor client today.

  Less easy to deploy are methods that require participation by both the
  client and the bridge.  In order of increasing sophistication, we
  might want to support:

  1. A protocol obfuscation tool that transforms the output of a TLS
     connection into something that looks like HTTP as it leaves the
     client, and back to TLS as it arrives at the bridge.
  2. An additional authentication step that a client would need to
     perform for a given bridge before being allowed to connect.
  3. An information passing system that uses a side-channel in some
     existing protocol to convey traffic between a client and a bridge
     without the two of them ever communicating directly.
  4. A set of clients to tunnel client->bridge traffic over an existing
     large p2p network, such that the bridge is known by an identifier
     in that network rather than by an IP address.

  We could in theory support these reasonably well with Tor as it stands
  today: every Tor client can take a SOCKS proxy to use for its outgoing
  traffic, so a suitable client proxy could handle the client's traffic
  and connections on its behalf, while a corresponding program on the
  bridge side could handle the bridge's side of the protocol
  transformation.  Nevertheless, there are some reasons to add support
  for transport plugins to Tor itself:

  1. It would be good for bridges to have a standard way to advertise
     which transports they support, so that clients can have multiple
     local transport proxies, and automatically use the right one for
     the right bridge.

  2. There are some changes to our architecture that we'll need for a
     system like this to work.  For testing purposes, if a bridge blocks
     off its regular ORPort and instead has an obfuscated ORPort, the
     bridge authority has no way to test it.  Also, unless the bridge
     has some way to tell that the bridge-side proxy at 127.0.0.1 is not
     the origin of all the connections it is relaying, it might decide
     that there are too many connections from 127.0.0.1, and start
     paring them down to avoid a DoS.

  3. Censorship and anticensorship techniques often evolve faster than
     the typical Tor release cycle.  As such, it's a good idea to
     provide ways to test out new anticensorship mechanisms on a more
     rapid basis.

  4. Transport obfuscation is a relatively distinct problem
     from the other privacy problems that Tor tries to solve, and it
     requires a fairly distinct skill-set from hacking the rest of Tor.
     By decoupling transport obfuscation from the Tor core, we hope to
     encourage people working on transport obfuscation who would
     otherwise not be interested in hacking Tor.

  5. Finally, we hope that defining a generic transport obfuscation plugin
     mechanism will be useful to other anticensorship projects.

Non-Goals

  We're not going to talk about automatic verification of plugin
  correctness and safety via sandboxing, proof-carrying code, or
  whatever.

  We need to do more with discovery and distribution, but that's not
  what this proposal is about.  We're pretty convinced that the problems
  are sufficiently orthogonal that we should be fine so long as we don't
  preclude a single program from implementing both transport and
  discovery extensions.

  This proposal is not about what transport plugins are the best ones
  for people to write.  We do, however, make some general
  recommendations for plugin authors in an appendix.

  We've considered issues involved with completely replacing Tor's TLS
  with another encryption layer, rather than layering it inside the
  obfuscation layer.  We describe how to do this in an appendix to the
  current proposal, though we are not currently sure whether it's a good
  idea to implement.

  We deliberately reject any design that would involve linking more code
  into Tor's process space.

Design overview

  To write a new transport protocol, an implementer must provide two
  pieces: a "Client Proxy" to run at the initiator side, and a "Server
  Proxy" to run at the server side.  These two pieces may or may not be
  implemented by the same program.

  Each client may run any number of Client Proxies.  Each one acts like
  a SOCKS proxy that accepts connections on localhost.  Each one
  runs on a different port, and implements one or more transport
  methods.  If the protocol has any parameters, they are passed from Tor
  inside the regular username/password parts of the SOCKS protocol.

  Bridges (and maybe relays) may run any number of Server Proxies: these
  programs provide an interface like stunnel-server (or whatever the
  option is): they get connections from the network (typically by
  listening for connections on the network) and relay them to the
  Bridge's real ORPort.

  To configure one of these programs, it should be sufficient simply to
  list it in your torrc.  The program tells Tor which transports it
  provides.  The Tor consensus should carry a new approved version number that
  is specific to pluggable transports; this will allow Tor to know when a
  particular transport is known to be unsafe or non-functional.

  Bridges (and maybe relays) report in their descriptors which transport
  protocols they support.  This information can be copied into bridge
  lines.  Bridges using a transport protocol may have multiple bridge
  lines.

  Any methods that are wildly successful, we can bake into Tor.

Specifications: Client behavior

  Bridge lines can now follow the extended format "bridge method
  address:port [[keyid=]id-fingerprint] [k=v] [k=v] [k=v]". To connect
  to such a bridge, a client must open a local connection to the SOCKS
  proxy for "method", and ask it to connect to address:port.  If
  [id-fingerprint] is provided, it should expect the public identity key
  on the TLS connection to match the digest provided in
  [id-fingerprint].  If any [k=v] items are provided, they are
  configuration parameters for the proxy: Tor should separate them with
  semicolons and put them in the user and password fields of the request,
  splitting them across the fields as necessary.  If a key or value
  must contain a semicolon or a backslash, it is escaped with a
  backslash.

  The "id-fingerprint" field is always provided in a field named
  "keyid", if it was given.  Method names must be C identifiers.

  Example: if the bridge line is "bridge trebuchet www.example.com:3333
     rocks=20 height=5.6m" AND if the Tor client knows that the
     'trebuchet' method is provided by a SOCKS5 proxy on
     127.0.0.1:19999, the client should connect to that proxy, ask it to
     connect to www.example.com:3333, and provide the string
     "rocks=20;height=5.6m" as the username, the password, or split
     across the username and password.
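
  The encoding described above can be sketched as follows.  This is a
  minimal illustration, not Tor's implementation: the function names are
  hypothetical, the 255-byte field limit comes from the SOCKS5
  username/password sub-negotiation (RFC 1929), and the choice to fill
  the username before spilling into the password is an assumption, since
  this proposal only says the string may be split "as necessary".

```python
SOCKS5_FIELD_MAX = 255  # RFC 1929 caps the username and password at 255 bytes each


def escape(text):
    """Backslash-escape backslashes and semicolons in a key or value."""
    return text.replace("\\", "\\\\").replace(";", "\\;")


def encode_args(pairs):
    """Join (key, value) pairs as 'k=v;k=v', escaping each key and value."""
    return ";".join(f"{escape(k)}={escape(v)}" for k, v in pairs)


def split_for_socks5(encoded):
    """Split the encoded string across the username and password fields.

    Assumption: fill the username first, then put the remainder in the
    password.  How to represent an empty field is left open here.
    """
    data = encoded.encode("utf-8")
    if len(data) > 2 * SOCKS5_FIELD_MAX:
        raise ValueError("arguments do not fit in the SOCKS5 auth fields")
    return data[:SOCKS5_FIELD_MAX], data[SOCKS5_FIELD_MAX:]
```

  For the trebuchet example, encode_args([("rocks", "20"), ("height",
  "5.6m")]) yields "rocks=20;height=5.6m", which fits entirely in the
  username field.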

  There are two ways to tell Tor clients about protocol proxies:
  external proxies and managed proxies.  An external proxy is configured
  with "ClientTransportPlugin trebuchet socks5 127.0.0.1:9999".  This
  tells Tor that another program is already running to handle
  'trebuchet' connections, and Tor doesn't need to worry about it.  A
  managed proxy is configured with "ClientTransportPlugin trebuchet
  exec /usr/libexec/tor-proxies/trebuchet [options]", and tells Tor to launch
  an external program on-demand to provide a socks proxy for 'trebuchet'
  connections. The Tor client only launches one instance of each
  external program with a given set of options, even if the same
  executable and options are listed for more than one method.

  If instead of a transport name, the torrc lists "*" for a managed proxy,
  tor uses that proxy for all transports that it supports.

  The same program can implement a managed or an external proxy: it just
  needs to take an argument saying which one to be.

Client proxy behavior

   When the Tor client launches a client proxy from the command line, it
   sets the environment variable
     "CLIENT_TRANSPORT_VER=1"
   to tell the proxy which versions of this configuration protocol
   it supports.  Future versions will give a comma-separated list.

   The Tor client also sets the environment variable
   CLIENT_TRANSPORTS to a comma-separated list of which methods this
   client should enable, or * if all methods should be enabled.

   The Tor client also sets STATE_LOCATION to a directory where
   the proxy should store state, if it wants to.

   The transport proxy replies by printing "VERSION: 1\n" to its
   stdout to say that it supports this protocol.  It must either
   pick a version that Tor told it about, or pick no version at all,
   and say "ERROR: no-version\n" and exit.

   It then needs to tell Tor which methods and ports it
   supports.  It does this by printing zero or more CMETHOD: lines
   to its stdout.  These look like

   CMETHOD: trebuchet SOCKS5 127.0.0.1:19999 ARGS=rocks,height \
              OPT-ARGS=tensile-strength

   The ARGS field lists mandatory parameters that must appear in every
   bridge line for this method. The OPT-ARGS field lists optional
   parameters.  If no ARGS or OPT-ARGS field is provided, Tor should not
   check the parameters in bridge lines for this method.

   The proxy should print a single "CMETHODS: DONE" line after it is
   finished telling Tor about the methods it provides.
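
   As a rough sketch of the proxy side of this exchange, the function
   below builds the lines a managed client proxy would print to stdout,
   given the environment Tor set for it.  The METHODS table and the
   function name are hypothetical, and treating an unset
   CLIENT_TRANSPORTS as "*" is an assumption rather than something this
   proposal specifies.

```python
SUPPORTED_VERSION = "1"

# Hypothetical method table:
#   name -> (SOCKS version, listen address, ARGS, OPT-ARGS)
METHODS = {
    "trebuchet": ("SOCKS5", "127.0.0.1:19999", "rocks,height", "tensile-strength"),
}


def handshake_lines(environ):
    """Return the startup lines a managed client proxy would print."""
    offered = environ.get("CLIENT_TRANSPORT_VER", "").split(",")
    if SUPPORTED_VERSION not in offered:
        # No version in common with Tor: report it and let the caller exit.
        return ["ERROR: no-version"]

    lines = [f"VERSION: {SUPPORTED_VERSION}"]
    wanted = environ.get("CLIENT_TRANSPORTS", "*")  # assumption: default to all
    enabled = None if wanted == "*" else set(wanted.split(","))
    for name, (socks, addr, args, opt_args) in METHODS.items():
        if enabled is None or name in enabled:
            lines.append(
                f"CMETHOD: {name} {socks} {addr} ARGS={args} OPT-ARGS={opt_args}"
            )
    lines.append("CMETHODS: DONE")
    return lines
```

   A real proxy would call handshake_lines(os.environ), print each line
   followed by a newline, flush stdout, and exit if the first line is an
   ERROR line.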

   The transport proxy MUST exit cleanly when it receives a SIGTERM from
   Tor.

   The Tor client MUST ignore lines beginning with a keyword and a colon
   if it does not recognize the keyword.
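
   On the Tor side, reading this output might look like the sketch
   below.  It is illustrative only: the function name and the returned
   dictionary layout are hypothetical, and it assumes each CMETHOD
   arrives on a single line (the backslash continuation in the example
   above is taken to be a documentation convenience).

```python
def parse_client_proxy_output(lines):
    """Parse a client proxy's startup lines into (version, methods)."""
    version = None
    methods = {}
    for raw in lines:
        keyword, sep, rest = raw.strip().partition(":")
        if not sep:
            continue  # not a "KEYWORD: ..." line
        rest = rest.strip()
        if keyword == "VERSION":
            version = rest
        elif keyword == "ERROR":
            raise RuntimeError(f"proxy reported an error: {rest}")
        elif keyword == "CMETHOD":
            name, socks, addr, *fields = rest.split()
            # None means "field absent": Tor should then skip parameter checks.
            info = {"socks": socks, "addr": addr, "args": None, "opt_args": None}
            for field in fields:
                key, _, value = field.partition("=")
                if key == "ARGS":
                    info["args"] = value.split(",") if value else []
                elif key == "OPT-ARGS":
                    info["opt_args"] = value.split(",") if value else []
            methods[name] = info
        elif keyword == "CMETHODS" and rest == "DONE":
            break
        # Any other keyword is unrecognized and is deliberately ignored.
    return version, methods
```

   A CMETHOD line with fewer than three fields raises ValueError in this
   sketch; a real implementation would presumably log and skip it.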

   In the future, if we need a control mechanism, we can use the
   stdin/stdout from Tor to the transport proxy.

   A transport proxy MUST handle SOCKS connect requests using the SOCKS
   version it advertises.

   Tor clients SHOULD NOT use any method from a client proxy unless it
   is both listed as a possible method for that proxy in torrc, and it
   is listed by the proxy as a method it supports.

Manually configuring a client proxy for a bridge

  All clients will support the methods "socks4" and "socks5".  Users can use
  these to configure a local proxy that doesn't support this plug-in
  infrastructure; developers can use them to test new proxies before
  they have added support for it.

  A bridge configured with these methods looks like:

     bridge socks4 www.example.com:8888 keyid=(fingerprint) \
                 proxy=127.0.0.1:9999 auth=xyz

  or

     bridge socks5 www.example.com:8888 keyid=(fingerprint) \
                 proxy=127.0.0.1:9999 user=x password=y

  The "proxy" argument for these methods is mandatory: it specifies a proxy
  to use when talking to the bridge.  The auth or user/password arguments for
  these methods are optional: they are passed to the proxy either as its
  authentication part (for socks4) or its username/password part (for
  socks5).

  The socks4 method uses SOCKS4 if the bridge is given as an IP
  address, and SOCKS4A if the bridge is given as a hostname.
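
  The choice between the two variants can be sketched in a few lines.
  The function name is hypothetical, and treating anything that is not
  an IPv4 literal (including an IPv6 literal) as a hostname is an
  assumption, since SOCKS4 itself can only carry IPv4 addresses.

```python
import ipaddress


def socks4_variant(bridge_host):
    """Return 'SOCKS4' for an IPv4 literal and 'SOCKS4A' otherwise."""
    try:
        ipaddress.IPv4Address(bridge_host)
    except ValueError:
        # Not an IPv4 literal, so let the proxy resolve the name.
        return "SOCKS4A"
    return "SOCKS4"
```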

  [We'll want to implement this part first, since it lets us test out
  per-bridge proxies, albeit in a user-unfriendly manner.]

Server behavior

  Server proxies are configured similarly to client proxies.  When
  launching a proxy, the server must tell it what ORPort it has
  configured, and what address (if any) it can listen on.  The
  server must tell the proxy which (if any) methods it should
  provide if it can; the proxy needs to tell the server which
  methods it is actually providing, and on what ports.

  When a client connects to the proxy, the proxy may need a way to
  tell the server some identifier for the client address.  It does
  this in-band.

  As before, the server lists proxies in its torrc.  These can be
  external proxies that run on their own, or managed proxies that Tor
  launches.

  An external proxy is configured as

      ServerTransportPlugin trebuchet address:port [options] -- param=val param=val...

  A managed proxy is configured as

      ServerTransportPlugin trebuchet exec /path/to/binary [options]
  or
      ServerTransportPlugin * exec /path/to/binary [options]

  The param=val pairs in the external proxy configuration, and the address,
  are advertised to make our bridge configuration.

  When possible, Tor should launch only one instance of each binary/option
  pair configured.  So if the torrc contains

     ClientTransportPlugin foo exec /usr/bin/megaproxy --foo
     ClientTransportPlugin bar exec /usr/bin/megaproxy --bar
     ServerTransportPlugin * exec /usr/bin/megaproxy --foo

  then Tor will launch the megaproxy binary twice: once with the option
  --foo and once with the option --bar.
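
  One way to picture this grouping is the sketch below, which maps each
  distinct (binary, options) pair to the transports requested from it.
  The function name and return shape are hypothetical, and the torrc
  parsing is deliberately simplified to the managed-proxy form shown
  above.

```python
import shlex


def plan_launches(torrc_lines):
    """Map each (binary, options...) tuple to the (side, method) pairs using it."""
    launches = {}
    for line in torrc_lines:
        directive, method, kind, *command = shlex.split(line)
        if kind != "exec":
            continue  # external proxies are already running; Tor launches nothing
        side = "client" if directive == "ClientTransportPlugin" else "server"
        launches.setdefault(tuple(command), []).append((side, method))
    return launches
```

  Applied to the three lines above, this yields two entries: one for
  "/usr/bin/megaproxy --foo", shared by the client 'foo' transport and
  the server wildcard, and one for "/usr/bin/megaproxy --bar".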

  When the server launches managed binaries, it sets these environment
  variables:
     SERVER_TRANSPORT_VER=1
        (As CLIENT_TRANSPORT_VER)

     EXT_SERVER_PORT=addr:portnum
        (A port on localhost that speaks the extended server protocol)

     ORPORT=addr:portnum
        (Our regular orport in a form suitable for local connections)

     BINDADDR=addr
        (An address on which to listen to connections.  This might be the
         advertised address, or might be a local address that Tor will
         forward ports to.)

     SERVER_TRANSPORTS=...
        (A comma-separated list of server methods that the proxy
         should support, or *).

     STATE_LOCATION=...
        (A directory where the proxy should store state, if it
        wants to.)

   It also opens an extended server port as described below.

Server proxy behavior

  The server proxy communicates with the server as the client proxy
  does with the client.  It starts with a version line to indicate which
  protocol version it has chosen (or an error line if it supports no
  version in common with Tor), then it lists a number of SMETHOD lines.

  Each SMETHOD line is of the form:

    SMETHOD: methodname address:port ARGS:k=v,k=v,k=v [Options]

  Or:

    SMETHOD-ERROR: methodname message

  SMETHOD and CMETHOD lines may be interspersed.

  After the last SMETHOD line, the proxy says "SMETHODS: DONE".

  Each SMETHOD line is a configured and working server method.

  The 'address:port' part of an SMETHOD line is the address to put
  in the bridge line.  The ARGS: part is a list of key-value pairs
  that the client needs to know.  The Options part is a list of
  space-separated K:V flags that Tor should know about.  Recognized
  options are:

      - FORWARD:1

        If this option is set, and address:port is not a publicly
        accessible address, then the bridge needs to forward some
        other address:port to address:port via upnp-helper.

      - DECLARE:K=V,...

        If this option is set, all the K=V options should be
        added as extension entries to the router descriptor.  (See
        below)

  Server transports may need to connect to the bridge and pass
  additional information about client connections that the bridge would
  ordinarily learn from the incoming connection itself.  To do this,
  they connect to the "extended server port" as given in
  EXT_SERVER_PORT, send a short amount of information, wait for a
  response, and then send the user traffic on that port.

  The extended server port protocol is as follows:

     COMMAND [2 bytes, big-endian]
     BODYLEN [2 bytes, big-endian]
     BODY [Bodylen bytes]

     Commands sent from the transport to the server are:

     [0x0000] DONE: There is no more information to give. (body ignored)

     [0x0001] USERADDR: an address:port string that represents the user's
       address.  If the transport doesn't actually do addresses,
       this shouldn't be sent.

     Replies sent from tor to the proxy are:

     [0x1001] OKAY: Send the user's traffic. (body ignored)

     [0x1002] DENY: Tor would prefer not to get more traffic from
       this address for a while. (body ignored)
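
  The framing above is simple enough to sketch directly.  The helper
  names below are hypothetical, and encoding the USERADDR body as ASCII
  is an assumption; only the command codes and the two big-endian
  2-byte fields come from the description above.

```python
import struct

DONE, USERADDR = 0x0000, 0x0001   # transport -> server
OKAY, DENY = 0x1001, 0x1002       # server -> transport
HEADER = struct.Struct(">HH")     # COMMAND, BODYLEN; both big-endian


def pack_message(command, body=b""):
    """Encode one COMMAND/BODYLEN/BODY message."""
    if len(body) > 0xFFFF:
        raise ValueError("body too long for a 2-byte BODYLEN")
    return HEADER.pack(command, len(body)) + body


def unpack_message(data):
    """Decode one message from the front of data; return (command, body, rest)."""
    if len(data) < HEADER.size:
        raise ValueError("incomplete header")
    command, bodylen = HEADER.unpack_from(data)
    end = HEADER.size + bodylen
    if len(data) < end:
        raise ValueError("incomplete body")
    return command, data[HEADER.size:end], data[end:]


def announce_client(addr_port):
    """Bytes a transport sends before relaying traffic: USERADDR, then DONE."""
    return pack_message(USERADDR, addr_port.encode("ascii")) + pack_message(DONE)
```

  After sending announce_client(...), the transport would read one reply
  with unpack_message and relay user traffic only if the command is
  OKAY.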

  [We could also use an out-of-band signalling method to tell Tor
  about client addresses, but that's a historically error-prone way
  to go about annotating connections.]

Advertising bridge methods:

  Bridges put the 'method' lines in their extra-info documents.

     method SP methodname SP address:port SP arglist NL

  The address:port part is as returned from an SMETHOD line.  The
  arglist is a K=V,... list as returned in the ARGS part of the
  SMETHOD line.

  If the SMETHOD line includes a DECLARE: part, the routerinfo gets
  a new line:

     method-info SP methodname SP arglist NL
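
  As an illustration of how an SMETHOD line could be turned into these
  descriptor lines, consider the sketch below.  The function names are
  hypothetical, and it assumes that Options are whitespace-separated
  K:V tokens with no embedded spaces.

```python
def parse_smethod(line):
    """Split 'SMETHOD: name addr:port ARGS:... [Options]' into its fields."""
    keyword, name, addr, *fields = line.split()
    if keyword != "SMETHOD:":
        raise ValueError("not an SMETHOD line")
    parsed = {"name": name, "addr": addr, "args": "", "options": {}}
    for field in fields:
        key, _, value = field.partition(":")
        if key == "ARGS":
            parsed["args"] = value
        else:
            parsed["options"][key] = value  # e.g. FORWARD, DECLARE
    return parsed


def descriptor_lines(parsed):
    """Return (extra-info 'method' line, 'method-info' line or None)."""
    method = f"method {parsed['name']} {parsed['addr']} {parsed['args']}"
    declared = parsed["options"].get("DECLARE")
    info = f"method-info {parsed['name']} {declared}" if declared else None
    return method, info
```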

Bridge authority behavior

  We need to specify a way to test different transport methods that
  bridges claim to support.  We should test as many as possible.  We
  should NOT require that we have a way to tra

Bridgedb behavior:

  Bridgedb can, given a set of router descriptors and their
  corresponding extrainfo documents, generate a set of bridge lines
  for each descriptor.  Bridgedb may want to avoid handing out
  methods that seem to get bridges blocked quickly.

Implementation plan

  First, we should implement per-bridge socks settings (as
  described above in "manually configuring a client proxy for a
  bridge") and the extended-server-port mechanism.  This will let
  bridges run transport proxies such that they can hand-generate
  bridge lines to give to clients for testing.  Once that's done,
  the next most important part seems to be getting the client-side
  automatic part written.  And once that's done, we can evaluate how
  much of the server side is easy for people to do and how much is
  hard.

  The "obfsproxy" obfuscating proxy is a likely candidate for an
  initial transport, as is Steven Murdoch's http thing or something
  similar.

Notes on plugins to write:

   We should ship a couple of null plugin implementations in one or two
   popular, portable languages so that people get an idea of how to
   write the stuff.

   1. We should have one that's just a proof of concept that does
      nothing but transfer bytes back and forth.

   1. We should not do a rot13 one.

   2. We should implement a basic proxy that does not transform the bytes at all

   1. We should implement DNS or HTTP using other software (as goodell
      did years ago with DNS) as an example of wrapping existing code into
      our plugin model.

   2. The obfuscated-ssh superencipherment is pretty trivial and pretty
      useful.  It makes the protocol stringwise unfingerprintable.

      1. Nick needs to be told firmly not to bikeshed the obfuscated-ssh
        superencipherment too badly

         1. Go ahead, bikeshed my day

   1. If we do a raw-traffic proxy, openssh tunnels would be the logical choice.

Appendix: recommendations for transports

  Be free/open-source software.  Also, if you think your code might
  someday do so well at circumvention that it should be implemented
  inside Tor, it should use the same license as Tor.

  Use libraries that Tor already requires. (You can rely on openssl and
  libevent being present if current Tor is present.)

  Be portable: most Tor users are on Windows, and most Tor developers
  are not, so designing your code for just one of these platforms will
  leave it with either a small userbase or poor auditing.

  Think secure: if your code is in a C-like language, and it's hard to
  read it and become convinced it's safe, then it's probably not safe.

  Think small: we want to minimize the bytes that a Windows user needs
  to download for a transport client.

  Avoid security-through-obscurity if possible.  Specify.

  Resist trivial fingerprinting: There should be no good string or regex
  to search for to distinguish your protocol from protocols permitted by
  censors.

  Imitate a real profile: There are many ways to implement most
  protocols -- and in many cases, most possible variants of a given
  protocol won't actually exist in the wild.