Give a proposal number to pluggable transport.

author: Nick Mathewson <nickm@torproject.org> 2011-03-15 17:15:37 -0400
committer: Nick Mathewson <nickm@torproject.org> 2011-03-15 17:15:37 -0400
commit: 57c77a2110616a6a5d0bbdee260936c1c13cad9a (patch)
tree: 21ded4d2805e2ec3afa5920c532cad217a8da809 /proposals/180-pluggable-transport.txt
parent: e83442ee76c18e0bc9f5814f6fa817300d9a43c2 (diff)
1 files changed, 530 insertions, 0 deletions
diff --git a/proposals/180-pluggable-transport.txt b/proposals/180-pluggable-transport.txt
new file mode 100644
index 0000000..2df0ced
--- /dev/null
+++ b/proposals/180-pluggable-transport.txt
@@ -0,0 +1,530 @@
+Filename: 180-pluggable-transport.txt
+Title: Pluggable transports for circumvention
+Author: Jacob Appelbaum, Nick Mathewson
+Created: 15-Oct-2010
+Status: Open
+
+Overview
+
+  This proposal describes a way to decouple protocol-level obfuscation
+  from the core Tor protocol in order to better resist client-bridge
+  censorship.  Our approach is to specify a means to add pluggable
+  transport implementations to Tor clients and bridges so that they can
+  negotiate a superencipherment for the Tor protocol.
+
+Scope
+
+  This is a document about transport plugins; it does not cover
+  discovery improvements, or bridgedb improvements.  While these
+  requirements might be solved by a program that also functions as a
+  transport plugin, this proposal only covers the requirements and
+  operation of transport plugins.
+
+Motivation
+
+  Frequently, people want to try a novel circumvention method to help
+  users connect to Tor bridges.  Some of these methods are already
+  pretty easy to deploy: if the user knows an unblocked VPN or open
+  SOCKS proxy, they can just use that with the Tor client today.
+
+  Less easy to deploy are methods that require participation by both the
+  client and the bridge.  In order of increasing sophistication, we
+  might want to support:
+
+  1. A protocol obfuscation tool that transforms the output of a TLS
+     connection into something that looks like HTTP as it leaves the
+     client, and back to TLS as it arrives at the bridge.
+  2. An additional authentication step that a client would need to
+     perform for a given bridge before being allowed to connect.
+  3. An information passing system that uses a side-channel in some
+     existing protocol to convey traffic between a client and a bridge
+     without the two of them ever communicating directly.
+  4. A set of clients to tunnel client->bridge traffic over an existing
+     large p2p network, such that the bridge is known by an identifier
+     in that network rather than by an IP address.
+
+  We could in theory support these fairly well with Tor as it stands
+  today: every Tor client can take a SOCKS proxy to use for its outgoing
+  traffic, so a suitable client proxy could handle the client's traffic
+  and connections on its behalf, while a corresponding program on the
+  bridge side could handle the bridge's side of the protocol
+  transformation.  Nevertheless, there are some reasons to add support
+  for transport plugins to Tor itself:
+
+  1. It would be good for bridges to have a standard way to advertise
+     which transports they support, so that clients can have multiple
+     local transport proxies, and automatically use the right one for
+     the right bridge.
+
+  2. There are some changes to our architecture that we'll need for a
+     system like this to work.  For testing purposes, if a bridge blocks
+     off its regular ORPort and instead has an obfuscated ORPort, the
+     bridge authority has no way to test it.  Also, unless the bridge
+     has some way to tell that the bridge-side proxy at 127.0.0.1 is not
+     the origin of all the connections it is relaying, it might decide
+     that there are too many connections from 127.0.0.1, and start
+     paring them down to avoid a DoS.
+
+  3. Censorship and anticensorship techniques often evolve faster than
+     the typical Tor release cycle.  As such, it's a good idea to
+     provide ways to test out new anticensorship mechanisms on a more
+     rapid basis.
+
+  4. Transport obfuscation is a relatively distinct problem
+     from the other privacy problems that Tor tries to solve, and it
+     requires a fairly distinct skill-set from hacking the rest of Tor.
+     By decoupling transport obfuscation from the Tor core, we hope to
+     encourage people working on transport obfuscation who would
+     otherwise not be interested in hacking Tor.
+
+  5. Finally, we hope that defining a generic transport obfuscation plugin
+     mechanism will be useful to other anticensorship projects.
+
+Non-Goals
+
+  We're not going to talk about automatic verification of plugin
+  correctness and safety via sandboxing, proof-carrying code, or
+  whatever.
+
+  We need to do more with discovery and distribution, but that's not
+  what this proposal is about.  We're pretty convinced that the problems
+  are sufficiently orthogonal that we should be fine so long as we don't
+  preclude a single program from implementing both transport and
+  discovery extensions.
+
+  This proposal is not about what transport plugins are the best ones
+  for people to write.  We do, however, make some general
+  recommendations for plugin authors in an appendix.
+
+  We've considered issues involved with completely replacing Tor's TLS
+  with another encryption layer, rather than layering it inside the
+  obfuscation layer.  We describe how to do this in an appendix to the
+  current proposal, though we are not currently sure whether it's a good
+  idea to implement.
+
+  We deliberately reject any design that would involve linking more code
+  into Tor's process space.
+
+Design overview
+
+  To write a new transport protocol, an implementer must provide two
+  pieces: a "Client Proxy" to run at the initiator side, and a "Server
+  Proxy" to run at the server side.  These two pieces may or may not be
+  implemented by the same program.
+
+  Each client may run any number of Client Proxies.  Each one acts like
+  a SOCKS proxy that accepts connections on localhost.  Each one
+  runs on a different port, and implements one or more transport
+  methods.  If the protocol has any parameters, they are passed from Tor
+  inside the regular username/password parts of the SOCKS protocol.
+
+  Bridges (and maybe relays) may run any number of Server Proxies: these
+  programs provide an interface like stunnel: they get connections from the
+  network (typically by listening for connections on the network) and relay
+  them to the Bridge's real ORPort.
+
+  To configure one of these programs, it should be sufficient simply to
+  list it in your torrc.  The program tells Tor which transports it
+  provides.  The Tor consensus should carry a new approved version number that
+  is specific for pluggable transport; this will allow Tor to know when a
+  particular transport is known to be unsafe or non-functional.
+
+  Bridges (and maybe relays) report in their descriptors which transport
+  protocols they support.  This information can be copied into bridge
+  lines.  Bridges using a transport protocol may have multiple bridge
+  lines.
+
+  Any methods that are wildly successful, we can bake into Tor.
+
+Specifications: Client behavior
+
+  We extend the bridge line format to allow you to say which method
+  to use to connect to a bridge.
+
+  The new format is:
+     "bridge method address:port [[keyid=]id-fingerprint] [k=v] [k=v] [k=v]"
+
+  To connect to such a bridge, the Tor program needs to know which
+  local SOCKS proxy will support the transport called "method".  It
+  then connects to this proxy, and asks it to connect to
+  address:port.  If [id-fingerprint] is provided, Tor should expect
+  the public identity key on the TLS connection to match the digest
+  provided in [id-fingerprint].  If any [k=v] items are provided,
+  they are configuration parameters for the proxy: Tor should
+  separate them with semicolons and put them in the user and
+  password fields of the request, splitting them across the fields
+  as necessary.  If a key or value must contain a semicolon or
+  a backslash, it is escaped with a backslash.
+
+  The "id-fingerprint" field is always provided in a field named
+  "keyid", if it was given.  Method names must be C identifiers.
+
+  Example: if the bridge line is "bridge trebuchet www.example.com:3333
+     rocks=20 height=5.6m" AND if the Tor client knows that the
+     'trebuchet' method is provided by a SOCKS5 proxy on
+     127.0.0.1:19999, the client should connect to that proxy, ask it to
+     connect to www.example.com, and provide the string
+     "rocks=20;height=5.6m" as the username, the password, or split
+     across the username and password.
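+
+  The encoding rule above can be sketched as follows.  This is only an
+  illustration, not Tor's implementation: the names escape() and
+  encode_params() are hypothetical, and the 255-byte limit per field is
+  the SOCKS5 username/password limit from RFC 1929, which this
+  proposal does not itself state.

```python
# Hypothetical helpers; not part of Tor or of this proposal.
MAX_FIELD = 255  # assumption: RFC 1929 caps username and password at 255 bytes


def escape(text):
    """Backslash-escape backslashes and semicolons in a key or value."""
    return text.replace("\\", "\\\\").replace(";", "\\;")


def encode_params(params):
    """Join k=v pairs with semicolons and split them across two fields.

    `params` is a list of (key, value) string pairs taken from a bridge line.
    Returns (username, password) as bytes.  How an empty password field is
    sent on the wire is left to the caller.
    """
    joined = ";".join(f"{escape(k)}={escape(v)}" for k, v in params)
    data = joined.encode("utf-8")
    if len(data) > 2 * MAX_FIELD:
        raise ValueError("parameters do not fit in the SOCKS5 auth fields")
    return data[:MAX_FIELD], data[MAX_FIELD:]


if __name__ == "__main__":
    print(encode_params([("rocks", "20"), ("height", "5.6m")]))
```

+  For the trebuchet bridge line, everything fits in the username
+  field and the password is left empty.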
+
+  There are two ways to tell Tor clients about protocol proxies:
+  external proxies and managed proxies.  An external proxy is configured
+  with
+     ClientTransportPlugin method socks4 address:port [auth=X]
+  or
+     ClientTransportPlugin method socks5 address:port [username=X] [password=Y]
+  as in
+     "ClientTransportPlugin trebuchet socks5 127.0.0.1:9999".
+  This example tells Tor that another program is already running to handle
+  'trebuchet' connections, and Tor doesn't need to worry about it.
+
+  A managed proxy is configured with
+     ClientTransportPlugin <method> exec <path> [options]
+  as in
+    "ClientTransportPlugin trebuchet exec /usr/libexec/trebuchet --managed"
+  This example tells Tor to launch an external program to provide a
+  SOCKS proxy for 'trebuchet' connections. The Tor client only
+  launches one instance of each external program with a given set of
+  options, even if the same executable and options are listed for
+  more than one method.
+
+  If instead of a transport name, the torrc lists "*" for a managed proxy,
+  Tor uses that proxy for all transports that it supports.  So
+  "ClientTransportPlugin * exec /usr/libexec/tor/foobar" tells Tor
+  that it should use the foobar plugin for everything that it supports.
+
+  If two proxies support the same method, Tor should use whichever
+  one is listed first.
+
+  The same program can implement a managed or an external proxy: it just
+  needs to take an argument saying which one to be.
+
+  See "Managed proxy behavior" for more information on the managed
+  proxy interface.
+
+Server behavior
+
+  Server proxies are configured similarly to client proxies.  When
+  launching a proxy, the server must tell it what ORPort it has
+  configured, and what address (if any) it can listen on.  The
+  server must tell the proxy which (if any) methods it should
+  provide if it can; the proxy needs to tell the server which
+  methods it is actually providing, and on what ports.
+
+  When a client connects to the proxy, the proxy may need a way to
+  tell the server some identifier for the client address.  It does
+  this in-band.
+
+  As before, the server lists proxies in its torrc.  These can be
+  external proxies that run on their own, or managed proxies that Tor
+  launches.
+
+  An external server proxy is configured as
+     ServerTransportPlugin method proxy address:port param=val..
+  as in
+     ServerTransportPlugin trebuchet proxy 127.0.0.1:999 rocks=heavy
+  The param=val pairs and the address are used to make the bridge
+  configuration information that we'll tell users.
+
+  A managed proxy is configured as
+      ServerTransportPlugin method exec /path/to/binary [options]
+  or
+      ServerTransportPlugin * exec /path/to/binary [options]
+
+  When possible, Tor should launch only one binary of each binary/option
+  pair configured.  So if the torrc contains
+
+     ClientTransportPlugin foo exec /usr/bin/megaproxy --foo
+     ClientTransportPlugin bar exec /usr/bin/megaproxy --bar
+     ServerTransportPlugin * exec /usr/bin/megaproxy --foo
+
+  then Tor will launch the megaproxy binary twice: once with the option
+  --foo and once with the option --bar.
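+
+  One way to picture the "one launch per binary/option pair" rule is
+  to group the exec lines by (path, options).  The function name
+  launch_plan() and its return shape are assumptions made for this
+  sketch; they are not taken from Tor.

```python
# Hypothetical sketch: group managed-proxy torrc lines by (binary, options).
def launch_plan(torrc_lines):
    """Map each (path, options) pair to the (directive, method) lines using it."""
    plan = {}
    for line in torrc_lines:
        parts = line.split()
        # Only "<directive> <method> exec <path> [options]" lines are managed.
        if len(parts) < 4 or parts[2] != "exec":
            continue
        directive, method, _, path, *options = parts
        if directive not in ("ClientTransportPlugin", "ServerTransportPlugin"):
            continue
        plan.setdefault((path, tuple(options)), []).append((directive, method))
    return plan


if __name__ == "__main__":
    torrc = [
        "ClientTransportPlugin foo exec /usr/bin/megaproxy --foo",
        "ClientTransportPlugin bar exec /usr/bin/megaproxy --bar",
        "ServerTransportPlugin * exec /usr/bin/megaproxy --foo",
    ]
    for (path, options), users in launch_plan(torrc).items():
        print(path, " ".join(options), "<-", users)
```

+  With the three lines above, the plan has two entries, so megaproxy
+  would be started twice.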
+
+Managed proxy interface
+
+   When the Tor client launches a client proxy from the command
+   line, it communicates via environment variables.  At a minimum,
+   it sets:
+
+      {Client and server}
+      HOME, PATH -- as you'd expect.
+
+      "STATE_LOCATION" -- a directory where the proxy should store
+       state if it wants to.  This directory is not required to
+       exist, but the proxy SHOULD be able to create it if it
+       doesn't.  The proxy SHOULD NOT store state elsewhere.
+
+      "MANAGED_TRANSPORT_VER=1" -- To tell the proxy which versions
+       of this configuration protocol Tor supports.  Future versions
+       will give a comma-separated list.  Clients MUST accept
+       comma-separated lists containing any version that they
+       recognize, and MUST work correctly even if some of the
+       versions they don't recognize are non-numeric.
+
+      {Client only}
+
+      "CLIENT_TRANSPORTS" -- a comma-separated list of which methods
+        this client should enable, or * if all methods should be
+        enabled.  The proxy SHOULD ignore methods that it doesn't
+        recognize.
+
+      {Server only}
+
+      "EXT_SERVER_PORT=addr:portnum" -- A port (probably on localhost) that
+        speaks the extended server protocol.
+
+      "ORPORT=addr:portnum" -- Our regular ORPort in a form suitable
+        for local connections.
+
+      "BINDADDR=addr" -- An address on which to listen for local
+         connections.  This might be the advertised address, or might
+         be a local address that Tor will forward ports to.  It MUST
+         be an address that will work with bind().
+
+      "SERVER_TRANSPORTS=..." -- A comma-separated list of server
+          methods that the proxy should support, or *
+
+  The transport proxy replies by writing NL-terminated lines to
+  stdout.  The metaformat is
+
+      Keyword OptArgs NL
+      OptArgs = Args |
+      Args = SP ArgChar | Args ArgChar
+      ArgChar = Any character but NUL or NL
+      Keyword = KeywordChar | Keyword KeywordChar
+      KeywordChar = All alphanumeric characters, dash, and underscore.
+
+  Tor MUST ignore lines with keywords that it doesn't recognize.
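+
+  A reader for this metaformat might look like the sketch below.  The
+  regular expression is one reading of the grammar above, and the
+  KNOWN set simply collects the keywords this proposal mentions; both,
+  like the name parse_line(), are assumptions of the example.

```python
import re

# Keyword, then optionally one space and at least one non-NUL, non-NL character.
LINE_RE = re.compile(r"([A-Za-z0-9_-]+)(?: ([^\x00\n]+))?")

# Assumption: the set of keywords named in this proposal.
KNOWN = {
    "VERSION", "ERROR",
    "CMETHOD", "CMETHOD-ERROR", "CMETHODS",
    "SMETHOD", "SMETHOD-ERROR", "SMETHODS",
}


def parse_line(raw):
    """Return (keyword, args) for a recognized line, or None to ignore it."""
    line = raw[:-1] if raw.endswith("\n") else raw
    match = LINE_RE.fullmatch(line)
    if match is None:
        return None  # does not fit the metaformat
    keyword, args = match.group(1), match.group(2) or ""
    if keyword not in KNOWN:
        return None  # unrecognized keywords are ignored
    return keyword, args


if __name__ == "__main__":
    for raw in ["VERSION 1\n", "FROBNICATE now\n", "CMETHODS DONE\n"]:
        print(repr(raw), "->", parse_line(raw))
```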
+
+  First, the proxy writes "VERSION 1" to say that it supports this
+  protocol. It must either pick a version that Tor told it about, or
+  pick no version at all, and say "ERROR no-version\n" and exit.
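+
+  On the proxy side, version selection could be sketched as below.
+  The SUPPORTED list and the function name negotiate() are
+  hypothetical.  The sketch treats MANAGED_TRANSPORT_VER as a
+  comma-separated list and tolerates non-numeric entries, as required
+  above.

```python
# Assumption: this proxy only implements version "1" of the protocol.
SUPPORTED = ["1"]


def negotiate(env_value):
    """Return the line the proxy should print for a MANAGED_TRANSPORT_VER value."""
    offered = [item.strip() for item in env_value.split(",")]
    for version in SUPPORTED:
        # Versions are compared as strings, so non-numeric entries are harmless.
        if version in offered:
            return f"VERSION {version}"
    return "ERROR no-version"


if __name__ == "__main__":
    import os
    print(negotiate(os.environ.get("MANAGED_TRANSPORT_VER", "")))
```

+  A real proxy would exit after printing "ERROR no-version"; the
+  sketch only computes the line to print.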
+
+  The proxy should then open its ports.  If running as a client
+  proxy, it should not use fixed ports; instead it should autoselect
+  ports to avoid conflicts.  A client proxy should by default only
+  listen on localhost for connections.
+
+  A server proxy SHOULD try to listen at a consistent port, though it
+  SHOULD pick a different one if the port it last used is now allocated.
+
+  A client or server proxy then should tell which methods it has
+  made available and how.  It does this by printing zero or more
+  CMETHOD and SMETHOD lines to its stdout.  These lines look like:
+
+   CMETHOD methodname SOCKS4/SOCKS5 address:port [ARGS=arglist] \
+        [OPT-ARGS=arglist]
+
+  as in
+
+   CMETHOD trebuchet SOCKS5 127.0.0.1:19999 ARGS=rocks,height \
+              OPT-ARGS=tensile-strength
+
+  The ARGS field lists mandatory parameters that must appear in
+  every bridge line for this method. The OPT-ARGS field lists
+  optional parameters.  If no ARGS or OPT-ARGS field is provided,
+  Tor should not check the parameters in bridge lines for this
+  method.
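+
+  The sketch below parses a CMETHOD line and checks the parameters of
+  a bridge line against it.  parse_cmethod() and params_ok() are
+  hypothetical names.  Rejecting parameters that are in neither list
+  is an assumption of this example: the text above only says that
+  ARGS are mandatory and OPT-ARGS are optional.

```python
def parse_cmethod(line):
    """Split a CMETHOD line (already joined onto one line) into its fields."""
    fields = line.split()
    if len(fields) < 4 or fields[0] != "CMETHOD":
        raise ValueError("not a CMETHOD line")
    name, protocol, addrport = fields[1:4]
    host, _, port = addrport.rpartition(":")
    required = optional = None  # None means "field not given"
    for extra in fields[4:]:
        if extra.startswith("OPT-ARGS="):
            optional = set(extra[len("OPT-ARGS="):].split(","))
        elif extra.startswith("ARGS="):
            required = set(extra[len("ARGS="):].split(","))
    return {"name": name, "protocol": protocol, "host": host,
            "port": int(port), "required": required, "optional": optional}


def params_ok(method, given_keys):
    """Check bridge-line parameter names against a parsed CMETHOD line."""
    if method["required"] is None and method["optional"] is None:
        return True  # no ARGS or OPT-ARGS: parameters are not checked
    required = method["required"] or set()
    optional = method["optional"] or set()
    given = set(given_keys)
    # Assumption: keys outside ARGS and OPT-ARGS are treated as errors.
    return required <= given and given <= (required | optional)


if __name__ == "__main__":
    m = parse_cmethod("CMETHOD trebuchet SOCKS5 127.0.0.1:19999 "
                      "ARGS=rocks,height OPT-ARGS=tensile-strength")
    print(m)
    print(params_ok(m, ["rocks", "height"]))
```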
+
+  The proxy should print a single "CMETHODS DONE" line after it is
+  finished telling Tor about the client methods it provides.  If it
+  tries to supply a client method but can't for some reason, it
+  should say:
+    CMETHOD-ERROR methodname "Message"
+
+  A proxy should tell Tor about the server methods it is providing
+  by printing zero or more SMETHOD lines.  These lines look like:
+
+    SMETHOD methodname address:port  [Options]
+
+  If there's an error setting up a configured server method, the
+  proxy should say:
+    SMETHOD-ERROR methodname "message"
+
+  The 'address:port' part of an SMETHOD line is the address to put
+  in the bridge line.  The ARGS: part is a list of key-value pairs
+  that the client needs to know.  The Options part is a list of
+  space-separated K:V flags that Tor should know about.  Recognized
+  options are:
+
+      - FORWARD:1
+
+        If this option is set, and address:port is not a publicly
+        accessible address, then the bridge needs to forward some
+        other address:port to address:port via upnp-helper.
+
+      - ARGS:k=v,k=v,k=v
+
+        If this option is set, the K=V arguments are added to the
+        extrainfo document.
+
+      - DECLARE:K=V,...
+
+        If this option is set, all the K=V options should be
+        added as extension entries to the router descriptor.  (See
+        below)
+
+      - USE-EXTPORT:1
+
+        If this option is set, the server plugin is using the
+        extended server port.
+
+  SMETHOD and CMETHOD lines may be interspersed.  After the last
+  SMETHOD line, the proxy says "SMETHODS DONE".
+
+  The proxy SHOULD NOT tell Tor about a server or client method
+  unless it is actually open and ready to use.
+
+  Tor clients SHOULD NOT use any method from a client proxy or
+  advertise any method from a server proxy UNLESS it is listed as a
+  possible method for that proxy in torrc, and it is listed by the
+  proxy as a method it supports.
+
+  Proxies should respond to a single INT signal by closing their
+  listener ports and not accepting any new connections, but keeping
+  all connections open, then terminating when connections are all
+  closed.  Proxies should respond to a second INT signal by shutting
+  down cleanly.
+
+The extended ORPort protocol.
+
+  Server transports may need to connect to the bridge and pass
+  additional information about client connections that the bridge
+  would ordinarily receive from the kernel's TCP stack.  To do this,
+  they connect to the "extended server port" as given in
+  EXT_SERVER_PORT, send a short amount of information, wait for a
+  response, and then send the user traffic on that port.
+
+  The extended server port protocol is as follows:
+
+     COMMAND [2 bytes, big-endian]
+     BODYLEN [2 bytes, big-endian]
+     BODY [Bodylen bytes]
+
+     Commands sent from the transport to the server are:
+
+     [0x0000] DONE: There is no more information to give. (body ignored)
+
+     [0x0001] USERADDR: an address:port string that represents the user's
+       address.  If the transport doesn't actually do addresses,
+       this shouldn't be sent.
+
+     Replies sent from tor to the proxy are:
+
+     [0x1001] OKAY: Send the user's traffic. (body ignored)
+
+     [0x1002] DENY: Tor would prefer not to get more traffic from
+       this address for a while. (body ignored)
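+
+  The framing above maps directly onto two big-endian 16-bit fields
+  followed by the body.  The helper names below are hypothetical, and
+  encoding the USERADDR string as ASCII is an assumption; the format
+  above only calls it an address:port string.

```python
import struct

# Command codes as listed above.
DONE, USERADDR = 0x0000, 0x0001
OKAY, DENY = 0x1001, 0x1002


def pack_command(command, body=b""):
    """Build COMMAND (2 bytes) + BODYLEN (2 bytes) + BODY, big-endian."""
    if len(body) > 0xFFFF:
        raise ValueError("body does not fit in a 2-byte length field")
    return struct.pack(">HH", command, len(body)) + body


def unpack_command(data):
    """Return (command, body, remaining bytes) from the front of `data`."""
    if len(data) < 4:
        raise ValueError("need at least 4 header bytes")
    command, bodylen = struct.unpack(">HH", data[:4])
    if len(data) < 4 + bodylen:
        raise ValueError("body is truncated")
    return command, data[4:4 + bodylen], data[4 + bodylen:]


def transport_preamble(user_addr=None):
    """Bytes a server transport sends before waiting for OKAY or DENY."""
    out = b""
    if user_addr is not None:  # skipped if the transport has no addresses
        out += pack_command(USERADDR, user_addr.encode("ascii"))
    return out + pack_command(DONE)


if __name__ == "__main__":
    print(transport_preamble("1.2.3.4:5").hex())
```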
+
+  [We could also use an out-of-band signalling method to tell Tor
+  about client addresses, but that's a historically error-prone way
+  to go about annotating connections.]
+
+Advertising bridge methods:
+
+  Bridges put the 'method' lines in their extra-info documents.
+
+     method SP methodname SP address:port SP arglist NL
+
+  The address:port part is as returned from an SMETHOD line.  The
+  arglist is a K=V,... list as returned in the ARGS part of the
+  SMETHOD line.
+
+  If the SMETHOD line includes a DECLARE: part, the routerinfo gets
+  a new line:
+
+     method-info SP methodname SP arglist NL
+
+Bridge authority behavior
+
+  We need to specify a way to test different transport methods that
+  bridges claim to support.  We should test as many as possible.  We
+  should NOT require that we have a way to tra
+
+Bridgedb behavior:
+
+  Bridgedb can, given a set of router descriptors and their
+  corresponding extrainfo documents, generate a set of bridge lines
+  for each descriptor.  Bridgedb may want to avoid handing out
+  methods that seem to get bridges blocked quickly.
+
+Implementation plan
+
+  First, we should implement per-bridge socks settings (as
+  described above in "manually configuring a client proxy for a
+  bridge") and the extended-server-port mechanism.  This will let
+  bridges run transport proxies such that they can hand-generate
+  bridge lines to give to clients for testing.
+
+  Once that's done, we can improve usability a little bit by
+  implementing external proxies.  Once that's done, we can see if we
+  need any managed proxies, or if the whole idea there is silly.
+
+  If we do, the next most important part seems to be getting
+  the client-side automatic part written.  And once that's done, we
+  can evaluate how much of the server side is easy for people to do
+  and how much is hard.
+
+  The "obfsproxy" obfuscating proxy is a likely candidate for an
+  initial transport, as is Steven Murdoch's http thing or something
+  similar.
+
+Notes on plugins to write:
+
+   We should ship a couple of null plugin implementations in one or two
+   popular, portable languages so that people get an idea of how to
+   write the stuff.
+
+   1. We should have one that's just a proof of concept that does
+      nothing but transfer bytes back and forth.
+
+   1. We should not do a rot13 one.
+
+   2. We should implement a basic proxy that does not transform the bytes at all
+
+   1. We should implement DNS or HTTP using other software (as goodell
+      did years ago with DNS) as an example of wrapping existing code into
+      our plugin model.
+
+   2. The obfuscated-ssh superencipherment is pretty trivial and pretty
+   useful.  It makes the protocol stringwise unfingerprintable.
+
+      1. Nick needs to be told firmly not to bikeshed the obfuscated-ssh
+        superencipherment too badly
+
+         1. Go ahead, bikeshed my day
+
+   1. If we do a raw-traffic proxy, openssh tunnels would be the logical choice.
+
+Appendix: recommendations for transports
+
+  Be free/open-source software.  Also, if you think your code might
+  someday do so well at circumvention that it should be implemented
+  inside Tor, it should use the same license as Tor.
+
+  Use libraries that Tor already requires. (You can rely on openssl and
+  libevent being present if current Tor is present.)
+
+  Be portable: most Tor users are on Windows, and most Tor developers
+  are not, so designing your code for just one of these platforms will
+  make it either get a small userbase, or poor auditing.
+
+  Think secure: if your code is in a C-like language, and it's hard to
+  read it and become convinced it's safe, then it's probably not safe.
+
+  Think small: we want to minimize the bytes that a Windows user needs
+  to download for a transport client.
+
+  Avoid security-through-obscurity if possible.  Specify.
+
+  Resist trivial fingerprinting: There should be no good string or regex
+  to search for to distinguish your protocol from protocols permitted by
+  censors.
+
+  Imitate a real profile: There are many ways to implement most
+  protocols -- and in many cases, most possible variants of a given
+  protocol won't actually exist in the wild.
+
+
+