author Nick Mathewson <nickm@torproject.org> 2011-03-15 17:15:37 -0400
committer Nick Mathewson <nickm@torproject.org> 2011-03-15 17:15:37 -0400
commit 57c77a2110616a6a5d0bbdee260936c1c13cad9a
tree 21ded4d2805e2ec3afa5920c532cad217a8da809 /proposals/180-pluggable-transport.txt
parent e83442ee76c18e0bc9f5814f6fa817300d9a43c2
Give a proposal number to pluggable transport.
Diffstat (limited to 'proposals/180-pluggable-transport.txt')
-rw-r--r-- proposals/180-pluggable-transport.txt 530
1 files changed, 530 insertions, 0 deletions
diff --git a/proposals/180-pluggable-transport.txt b/proposals/180-pluggable-transport.txt
new file mode 100644
index 0000000..2df0ced
--- /dev/null
+++ b/proposals/180-pluggable-transport.txt
@@ -0,0 +1,530 @@
+Filename: 180-pluggable-transport.txt
+Title: Pluggable transports for circumvention
+Author: Jacob Appelbaum, Nick Mathewson
+Created: 15-Oct-2010
+Status: Open
+
+Overview
+
+ This proposal describes a way to decouple protocol-level obfuscation
+ from the core Tor protocol in order to better resist client-bridge
+ censorship. Our approach is to specify a means to add pluggable
+ transport implementations to Tor clients and bridges so that they can
+ negotiate a superencipherment for the Tor protocol.
+
+Scope
+
+ This is a document about transport plugins; it does not cover
+ discovery improvements, or bridgedb improvements. While these
+ requirements might be solved by a program that also functions as a
+ transport plugin, this proposal only covers the requirements and
+ operation of transport plugins.
+
+Motivation
+
+ Frequently, people want to try a novel circumvention method to help
+ users connect to Tor bridges. Some of these methods are already
+ pretty easy to deploy: if the user knows an unblocked VPN or open
+ SOCKS proxy, they can just use that with the Tor client today.
+
+ Less easy to deploy are methods that require participation by both the
+ client and the bridge. In order of increasing sophistication, we
+ might want to support:
+
+ 1. A protocol obfuscation tool that transforms the output of a TLS
+ connection into something that looks like HTTP as it leaves the
+ client, and back to TLS as it arrives at the bridge.
+ 2. An additional authentication step that a client would need to
+ perform for a given bridge before being allowed to connect.
+ 3. An information passing system that uses a side-channel in some
+ existing protocol to convey traffic between a client and a bridge
+ without the two of them ever communicating directly.
+ 4. A set of clients to tunnel client->bridge traffic over an existing
+ large p2p network, such that the bridge is known by an identifier
+ in that network rather than by an IP address.
+
+ We could in theory support these fairly well with Tor as it stands
+ today: every Tor client can take a SOCKS proxy to use for its outgoing
+ traffic, so a suitable client proxy could handle the client's traffic
+ and connections on its behalf, while a corresponding program on the
+ bridge side could handle the bridge's side of the protocol
+ transformation. Nevertheless, there are some reasons to add support
+ for transport plugins to Tor itself:
+
+ 1. It would be good for bridges to have a standard way to advertise
+ which transports they support, so that clients can have multiple
+ local transport proxies, and automatically use the right one for
+ the right bridge.
+
+ 2. There are some changes to our architecture that we'll need for a
+ system like this to work. For testing purposes, if a bridge blocks
+ off its regular ORPort and instead has an obfuscated ORPort, the
+ bridge authority has no way to test it. Also, unless the bridge
+ has some way to tell that the bridge-side proxy at 127.0.0.1 is not
+ the origin of all the connections it is relaying, it might decide
+ that there are too many connections from 127.0.0.1, and start
+ paring them down to avoid a DoS.
+
+ 3. Censorship and anticensorship techniques often evolve faster than
+ the typical Tor release cycle. As such, it's a good idea to
+ provide ways to test out new anticensorship mechanisms on a more
+ rapid basis.
+
+ 4. Transport obfuscation is a relatively distinct problem
+ from the other privacy problems that Tor tries to solve, and it
+ requires a fairly distinct skill-set from hacking the rest of Tor.
+ By decoupling transport obfuscation from the Tor core, we hope to
+ encourage people working on transport obfuscation who would
+ otherwise not be interested in hacking Tor.
+
+ 5. Finally, we hope that defining a generic transport obfuscation plugin
+ mechanism will be useful to other anticensorship projects.
+
+Non-Goals
+
+ We're not going to talk about automatic verification of plugin
+ correctness and safety via sandboxing, proof-carrying code, or
+ whatever.
+
+ We need to do more with discovery and distribution, but that's not
+ what this proposal is about. We're pretty convinced that the problems
+ are sufficiently orthogonal that we should be fine so long as we don't
+ preclude a single program from implementing both transport and
+ discovery extensions.
+
+ This proposal is not about what transport plugins are the best ones
+ for people to write. We do, however, make some general
+ recommendations for plugin authors in an appendix.
+
+ We've considered issues involved with completely replacing Tor's TLS
+ with another encryption layer, rather than layering it inside the
+ obfuscation layer. We describe how to do this in an appendix to the
+ current proposal, though we are not currently sure whether it's a good
+ idea to implement.
+
+ We deliberately reject any design that would involve linking more code
+ into Tor's process space.
+
+Design overview
+
+ To write a new transport protocol, an implementer must provide two
+ pieces: a "Client Proxy" to run at the initiator side, and a "Server
+ Proxy" to run at the server side. These two pieces may or may not be
+ implemented by the same program.
+
+ Each client may run any number of Client Proxies. Each one acts like
+ a SOCKS proxy that accepts connections on localhost. Each one
+ runs on a different port, and implements one or more transport
+ methods. If the protocol has any parameters, they are passed from Tor
+ inside the regular username/password parts of the SOCKS protocol.
+
+ Bridges (and maybe relays) may run any number of Server Proxies: these
+ programs provide an interface like stunnel: they get connections from the
+ network (typically by listening for connections on the network) and relay
+ them to the Bridge's real ORPort.
+
+ To configure one of these programs, it should be sufficient simply to
+ list it in your torrc. The program tells Tor which transports it
+ provides. The Tor consensus should carry a new approved version number that
+ is specific for pluggable transport; this will allow Tor to know when a
+ particular transport is known to be unsafe or non-functional.
+
+ Bridges (and maybe relays) report in their descriptors which transport
+ protocols they support. This information can be copied into bridge
+ lines. Bridges using a transport protocol may have multiple bridge
+ lines.
+
+ Any methods that are wildly successful, we can bake into Tor.
+
+Specifications: Client behavior
+
+ We extend the bridge line format to allow you to say which method
+ to use to connect to a bridge.
+
+ The new format is:
+ "bridge method address:port [[keyid=]id-fingerprint] [k=v] [k=v] [k=v]"
+
+ To connect to such a bridge, the Tor program needs to know which
+ local SOCKS proxy will support the transport called "method". It
+ then connects to this proxy, and asks it to connect to
+ address:port. If [id-fingerprint] is provided, Tor should expect
+ the public identity key on the TLS connection to match the digest
+ provided in [id-fingerprint]. If any [k=v] items are provided,
+ they are configuration parameters for the proxy: Tor should
+ separate them with semicolons and put them in the user and
+ password fields of the request, splitting them across the fields
+ as necessary. If a key or value must contain a semicolon or
+ a backslash, it is escaped with a backslash.
+
+ The "id-fingerprint" field is always provided in a field named
+ "keyid", if it was given. Method names must be C identifiers.
+
+ Example: if the bridge line is "bridge trebuchet www.example.com:3333
+ rocks=20 height=5.6m" AND if the Tor client knows that the
+ 'trebuchet' method is provided by a SOCKS5 proxy on
+ 127.0.0.1:19999, the client should connect to that proxy, ask it to
+ connect to www.example.com, and provide the string
+ "rocks=20;height=5.6m" as the username, the password, or split
+ across the username and password.
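The encoding rule above can be sketched in Python. This is only an illustration, not part of the proposal: the function names are hypothetical, the 255-byte field limit comes from the SOCKS5 username/password format (RFC 1929), and the "fill the username first, then overflow into the password" policy is an assumption, since the text leaves the split up to Tor.

```python
def escape(text: str) -> str:
    """Backslash-escape backslashes and semicolons in a key or value."""
    # Escape backslashes first so the backslashes added for ';' are not doubled.
    return text.replace("\\", "\\\\").replace(";", "\\;")


def encode_socks_args(params, max_field=255):
    """Join k=v pairs with semicolons and split them across two fields.

    params is a list of (key, value) string pairs taken from a bridge line.
    max_field is the SOCKS5 username/password length limit (assumption:
    the username is filled first, the remainder goes in the password).
    """
    joined = ";".join(f"{escape(k)}={escape(v)}" for k, v in params)
    data = joined.encode("utf-8")
    if len(data) > 2 * max_field:
        raise ValueError("parameters do not fit in the SOCKS auth fields")
    return data[:max_field], data[max_field:]


if __name__ == "__main__":
    user, password = encode_socks_args([("rocks", "20"), ("height", "5.6m")])
    print(user, password)
```

For the trebuchet bridge line, the whole string fits in the username and the password is left empty. A real client would also have to decide what to send when a field ends up empty, which the proposal does not specify.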
+
+ There are two ways to tell Tor clients about protocol proxies:
+ external proxies and managed proxies. An external proxy is configured
+ with
+ ClientTransportPlugin method socks4 address:port [auth=X]
+ or
+ ClientTransportPlugin method socks5 address:port [username=X] [password=Y]
+ as in
+ "ClientTransportPlugin trebuchet socks5 127.0.0.1:9999".
+ This example tells Tor that another program is already running to handle
+ 'trebuchet' connections, and Tor doesn't need to worry about it.
+
+ A managed proxy is configured with
+ ClientTransportPlugin <method> exec <path> [options]
+ as in
+ "ClientTransportPlugin trebuchet exec /usr/libexec/trebuchet --managed"
+ This example tells Tor to launch an external program to provide a
+ socks proxy for 'trebuchet' connections. The Tor client only
+ launches one instance of each external program with a given set of
+ options, even if the same executable and options are listed for
+ more than one method.
+
+ If instead of a transport name, the torrc lists "*" for a managed proxy,
+ tor uses that proxy for all transports that it supports. So
+ "ClientTransportPlugin * exec /usr/libexec/tor/foobar" tells Tor
+ that it should use the foobar plugin for everything that it supports.
+
+ If two proxies support the same method, Tor should use whichever
+ one is listed first.
+
+ The same program can implement a managed or an external proxy: it just
+ needs to take an argument saying which one to be.
+
+ See "Managed proxy behavior" for more information on the managed
+ proxy interface.
+
+Server behavior
+
+ Server proxies are configured similarly to client proxies. When
+ launching a proxy, the server must tell it what ORPort it has
+ configured, and what address (if any) it can listen on. The
+ server must tell the proxy which (if any) methods it should
+ provide if it can; the proxy needs to tell the server which
+ methods it is actually providing, and on what ports.
+
+ When a client connects to the proxy, the proxy may need a way to
+ tell the server some identifier for the client address. It does
+ this in-band.
+
+ As before, the server lists proxies in its torrc. These can be
+ external proxies that run on their own, or managed proxies that Tor
+ launches.
+
+ An external server proxy is configured as
+ ServerTransportPlugin method proxy address:port param=val..
+ as in
+ ServerTransportPlugin trebuchet proxy 127.0.0.1:999 rocks=heavy
+ The param=val pairs and the address are used to make the bridge
+ configuration information that we'll tell users.
+
+ A managed proxy is configured as
+ ServerTransportPlugin method exec /path/to/binary [options]
+ or
+ ServerTransportPlugin * exec /path/to/binary [options]
+
+ When possible, Tor should launch only one binary of each binary/option
+ pair configured. So if the torrc contains
+
+ ClientTransportPlugin foo exec /usr/bin/megaproxy --foo
+ ClientTransportPlugin bar exec /usr/bin/megaproxy --bar
+ ServerTransportPlugin * exec /usr/bin/megaproxy --foo
+
+ then Tor will launch the megaproxy binary twice: once with the option
+ --foo and once with the option --bar.
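The "one launch per binary/option pair" rule can be illustrated with a small Python sketch. The helper name `plan_launches` is hypothetical, and the parsing is deliberately simplified (it assumes well-formed managed-proxy lines and shell-style quoting); it is not how Tor itself parses torrc.

```python
import shlex


def plan_launches(torrc_lines):
    """Group managed-proxy lines by the (binary, options) pair they launch.

    Returns a dict mapping (binary, options) to the list of
    (directive, method) entries that the single launched process serves.
    """
    launches = {}
    for line in torrc_lines:
        parts = shlex.split(line)
        if len(parts) < 4 or parts[2] != "exec":
            continue  # not a managed proxy line
        directive, method, _, binary, *options = parts
        if directive not in ("ClientTransportPlugin", "ServerTransportPlugin"):
            continue
        key = (binary, tuple(options))
        launches.setdefault(key, []).append((directive, method))
    return launches


if __name__ == "__main__":
    torrc = [
        "ClientTransportPlugin foo exec /usr/bin/megaproxy --foo",
        "ClientTransportPlugin bar exec /usr/bin/megaproxy --bar",
        "ServerTransportPlugin * exec /usr/bin/megaproxy --foo",
    ]
    for (binary, options), users in plan_launches(torrc).items():
        print(binary, " ".join(options), "serves", users)
```

With the three lines from the example, the result has two entries: the `--foo` process serves both the client method `foo` and the server wildcard, and the `--bar` process serves the client method `bar`.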
+
+Managed proxy interface
+
+ When the Tor client launches a client proxy from the command
+ line, it communicates via environment variables. At a minimum,
+ it sets:
+
+ {Client and server}
+ HOME, PATH -- as you'd expect.
+
+ "STATE_LOCATION" -- a directory where the proxy should store
+ state if it wants to. This directory is not required to
+ exist, but the proxy SHOULD be able to create it if it
+ doesn't. The proxy SHOULD NOT store state elsewhere.
+
+ "MANAGED_TRANSPORT_VER=1" -- To tell the proxy which versions
+ of this configuration protocol Tor supports. Future versions
+ will give a comma-separated list. Clients MUST accept
+ comma-separated lists containing any version that they
+ recognize, and MUST work correctly even if some of the
+ versions they don't recognize are non-numeric.
+
+ {Client only}
+
+ "CLIENT_TRANSPORTS" -- a comma-separated list of which methods
+ this client should enable, or * if all methods should be
+ enabled. The proxy SHOULD ignore methods that it doesn't
+ recognize.
+
+ {Server only}
+
+ "EXT_SERVER_PORT=addr:portnum" -- A port (probably on localhost) that
+ speaks the extended server protocol.
+
+ "ORPORT=addr:portnum" -- Our regular ORPort in a form suitable
+ for local connections.
+
+ "BINDADDR=addr" -- An address on which to listen for local
+ connections. This might be the advertised address, or might
+ be a local address that Tor will forward ports to. It MUST
+ be an address that will work with bind().
+
+ "SERVER_TRANSPORTS=..." -- A comma-separated list of server
+ methods that the proxy should support, or *
+
+ The transport proxy replies by writing NL-terminated lines to
+ stdout. The metaformat is
+
+ Keyword OptArgs NL
+ OptArgs = Args |
+ Args = SP ArgChar | Args ArgChar
+ ArgChar = Any character but NUL or NL
+ Keyword = KeywordChar | Keyword KeywordChar
+ KeywordChar = All alphanumeric characters, dash, and underscore.
+
+ Tor MUST ignore lines with keywords that it doesn't recognize.
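A reader of this metaformat might look like the Python sketch below. It is a strict reading of the grammar above (one space, then at least one argument character) and the names `LINE_RE` and `parse_line` are hypothetical; a real implementation may want to be more lenient.

```python
import re

# Keyword: alphanumerics, dash, underscore.
# Optional args: one space followed by one or more characters other than NUL/NL.
LINE_RE = re.compile(r"([A-Za-z0-9_-]+)(?: ([^\x00\n]+))?")


def parse_line(line):
    """Split one reply line into (keyword, args), or return None if malformed."""
    if line.endswith("\n"):
        line = line[:-1]  # drop the NL terminator
    match = LINE_RE.fullmatch(line)
    if match is None:
        return None
    return match.group(1), match.group(2) or ""


if __name__ == "__main__":
    for raw in ["VERSION 1\n", "CMETHODS DONE\n", "FUTURE-KEYWORD x y z\n"]:
        print(parse_line(raw))
```

The caller would then dispatch on the keyword and skip any keyword it does not know, as required above.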
+
+ First, the proxy writes "VERSION 1" to say that it supports this
+ protocol. It must either pick a version that Tor told it about, or
+ pick no version at all, and say "ERROR no-version\n" and exit.
+
+ The proxy should then open its ports. If running as a client
+ proxy, it should not use fixed ports; instead it should autoselect
+ ports to avoid conflicts. A client proxy should by default only
+ listen on localhost for connections.
+
+ A server proxy SHOULD try to listen at a consistent port, though it
+ SHOULD pick a different one if the port it last used is now allocated.
+
+ A client or server proxy then should tell which methods it has
+ made available and how. It does this by printing zero or more
+ CMETHOD and SMETHOD lines to its stdout. These lines look like:
+
+ CMETHOD methodname SOCKS4/SOCKS5 address:port [ARGS=arglist] \
+ [OPT-ARGS=arglist]
+
+ as in
+
+ CMETHOD trebuchet SOCKS5 127.0.0.1:19999 ARGS=rocks,height \
+ OPT-ARGS=tensile-strength
+
+ The ARGS field lists mandatory parameters that must appear in
+ every bridge line for this method. The OPT-ARGS field lists
+ optional parameters. If no ARGS or OPT-ARGS field is provided,
+ Tor should not check the parameters in bridge lines for this
+ method.
+
+ The proxy should print a single "CMETHODS DONE" line after it is
+ finished telling Tor about the client methods it provides. If it
+ tries to supply a client method but can't for some reason, it
+ should say:
+ CMETHOD-ERROR methodname "Message"
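The client-side report described above could be produced along these lines. This is a sketch only: `cmethod_line` and `client_report` are hypothetical helpers, and the order in which error lines are printed relative to CMETHOD lines is an assumption.

```python
def cmethod_line(name, socks_version, addr, port, args=(), opt_args=()):
    """Format one CMETHOD line for a listener that is already open."""
    fields = ["CMETHOD", name, socks_version, f"{addr}:{port}"]
    if args:
        fields.append("ARGS=" + ",".join(args))
    if opt_args:
        fields.append("OPT-ARGS=" + ",".join(opt_args))
    return " ".join(fields)


def client_report(methods, failures=()):
    """Return the lines a client proxy would print, in order.

    methods is a list of keyword dicts for cmethod_line; failures is a list
    of (methodname, message) pairs for methods that could not be set up.
    """
    lines = ["VERSION 1"]
    lines += [cmethod_line(**m) for m in methods]
    lines += [f'CMETHOD-ERROR {name} "{message}"' for name, message in failures]
    lines.append("CMETHODS DONE")
    return lines


if __name__ == "__main__":
    report = client_report(
        [dict(name="trebuchet", socks_version="SOCKS5", addr="127.0.0.1",
              port=19999, args=["rocks", "height"],
              opt_args=["tensile-strength"])],
        failures=[("ballista", "bind failed")],
    )
    for line in report:
        print(line, flush=True)  # Tor reads these from the proxy's stdout
```

Flushing after each line matters in practice, because stdout is usually block-buffered when it is a pipe.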
+
+ A proxy should tell Tor about the server methods it is providing
+ by printing zero or more SMETHOD lines. These lines look like:
+
+ SMETHOD methodname address:port [Options]
+
+ If there's an error setting up a configured server method, the
+ proxy should say:
+ SMETHOD-ERROR methodname "message"
+
+ The 'address:port' part of an SMETHOD line is the address to put
+ in the bridge line. The ARGS: part is a list of key-value pairs
+ that the client needs to know. The Options part is a list of
+ space-separated K:V flags that Tor should know about. Recognized
+ options are:
+
+ - FORWARD:1
+
+ If this option is set, and address:port is not a publicly
+ accessible address, then the bridge needs to forward some
+ other address:port to address:port via upnp-helper.
+
+ - ARGS:k=v,k=v,k=v
+
+ If this option is set, the K=V arguments are added to the
+ extrainfo document.
+
+ - DECLARE:K=V,...
+
+ If this option is set, all the K=V options should be
+ added as extension entries to the router descriptor. (See
+ below)
+
+ - USE-EXTPORT:1
+
+ If this option is set, the server plugin is using the
+ extended server port.
+
+ SMETHOD and CMETHOD lines may be interspersed. After the last
+ SMETHOD line, the proxy says "SMETHODS DONE".
+
+ The proxy SHOULD NOT tell Tor about a server or client method
+ unless it is actually open and ready to use.
+
+ Tor clients SHOULD NOT use any method from a client proxy or
+ advertise any method from a server proxy UNLESS it is listed as a
+ possible method for that proxy in torrc, and it is listed by the
+ proxy as a method it supports.
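The rule above amounts to taking the intersection of what torrc allows and what the proxy reports. A minimal Python sketch, assuming a hypothetical helper name and that a torrc wildcard is represented by the string "*":

```python
def usable_methods(configured, reported):
    """Return the reported methods that torrc also allows for this proxy.

    configured is the set of method names listed for the proxy in torrc
    (it may contain "*"); reported is the list of methods the proxy
    announced in its CMETHOD or SMETHOD lines.
    """
    reported = list(reported)
    if "*" in configured:
        return reported
    return [m for m in reported if m in configured]


if __name__ == "__main__":
    print(usable_methods({"foo"}, ["foo", "bar"]))
    print(usable_methods({"*"}, ["foo", "bar"]))
```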
+
+ Proxies should respond to a single INT signal by closing their
+ listener ports and not accepting any new connections, but keeping
+ all connections open, then terminating when connections are all
+ closed. Proxies should respond to a second INT signal by shutting
+ down cleanly.
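The two-stage shutdown can be modelled as a small state object, sketched here in Python. The class and function names are hypothetical, and the sketch leaves out the actual listener and connection handling; it only shows how the count of INT signals drives the decisions.

```python
import signal


class ShutdownState:
    """Track how many INT signals have arrived and what they imply."""

    def __init__(self):
        self.int_count = 0

    def on_int(self, signum=None, frame=None):
        # Signature matches what signal.signal() expects from a handler.
        self.int_count += 1

    @property
    def accepting(self):
        """True while the listener ports should stay open."""
        return self.int_count == 0

    def should_exit(self, open_connections):
        """Decide whether the proxy's main loop should terminate now."""
        if self.int_count >= 2:
            return True  # second INT: shut down without waiting
        return self.int_count == 1 and open_connections == 0


def install(state):
    """Register the handler; must be called from the main thread."""
    signal.signal(signal.SIGINT, state.on_int)
```

A main loop would call `install(state)` once at startup, stop accepting new connections as soon as `state.accepting` becomes false, and exit when `state.should_exit(n)` returns true for the current number of open connections.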
+
+The extended ORPort protocol.
+
+ Server transports may need to connect to the bridge and pass
+ additional information about client connections that the bridge
+ would ordinarily receive from the kernel's TCP stack. To do this,
+ they connect to the "extended server port" as given in
+ EXT_SERVER_PORT, send a short amount of information, wait for a
+ response, and then send the user traffic on that port.
+
+ The extended server port protocol is as follows:
+
+ COMMAND [2 bytes, big-endian]
+ BODYLEN [2 bytes, big-endian]
+ BODY [Bodylen bytes]
+
+ Commands sent from the transport to the server are:
+
+ [0x0000] DONE: There is no more information to give. (body ignored)
+
+ [0x0001] USERADDR: an address:port string that represents the user's
+ address. If the transport doesn't actually do addresses,
+ this shouldn't be sent.
+
+ Replies sent from tor to the proxy are:
+
+ [0x1001] OKAY: Send the user's traffic. (body ignored)
+
+ [0x1002] DENY: Tor would prefer not to get more traffic from
+ this address for a while. (body ignored)
+
+ [We could also use an out-of-band signalling method to tell Tor
+ about client addresses, but that's a historically error-prone way
+ to go about annotating connections.]
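The framing above maps directly onto two big-endian 16-bit integers followed by the body. The Python sketch below shows one way to pack and unpack such messages; the constant and function names are hypothetical, and only the numeric values and field layout come from the text.

```python
import struct

# Commands sent from the transport to the server.
CMD_DONE = 0x0000
CMD_USERADDR = 0x0001
# Replies sent from Tor to the proxy.
REPLY_OKAY = 0x1001
REPLY_DENY = 0x1002

HEADER = struct.Struct(">HH")  # COMMAND, BODYLEN: 2 bytes each, big-endian


def pack_message(command, body=b""):
    """Serialize one message: header followed by BODYLEN bytes of body."""
    if len(body) > 0xFFFF:
        raise ValueError("body too long for a 2-byte length field")
    return HEADER.pack(command, len(body)) + body


def unpack_message(data):
    """Parse one message from data; return (command, body, remaining bytes)."""
    if len(data) < HEADER.size:
        raise ValueError("incomplete header")
    command, bodylen = HEADER.unpack_from(data)
    end = HEADER.size + bodylen
    if len(data) < end:
        raise ValueError("incomplete body")
    return command, data[HEADER.size:end], data[end:]


if __name__ == "__main__":
    # What a server transport might send before relaying user traffic.
    hello = pack_message(CMD_USERADDR, b"192.0.2.1:4000") + pack_message(CMD_DONE)
    print(hello.hex())
```

After sending these two messages, the transport would wait for a reply and relay the user's traffic only if the reply command is `REPLY_OKAY`.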
+
+Advertising bridge methods:
+
+ Bridges put the 'method' lines in their extra-info documents.
+
+ method SP methodname SP address:port SP arglist NL
+
+ The address:port part is as returned from an SMETHOD line. The
+ arglist is a K=V,... list as returned in the ARGS part of the
+ SMETHOD line.
+
+ If the SMETHOD line includes a DECLARE: part, the routerinfo gets
+ a new line:
+
+ method-info SP methodname SP arglist NL
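The two descriptor line formats can be generated as in the Python sketch below. The helper names are hypothetical, and the handling of an empty argument list is left open because the proposal does not say what the line should look like in that case.

```python
def format_arglist(pairs):
    """Render (key, value) pairs as the K=V,... list used in both lines."""
    return ",".join(f"{k}={v}" for k, v in pairs)


def method_line(name, addr_port, args):
    """Extra-info line built from an SMETHOD line's address and ARGS part."""
    return f"method {name} {addr_port} {format_arglist(args)}\n"


def method_info_line(name, declared):
    """Router descriptor line built from an SMETHOD line's DECLARE part."""
    return f"method-info {name} {format_arglist(declared)}\n"


if __name__ == "__main__":
    print(method_line("trebuchet", "192.0.2.7:3333", [("rocks", "heavy")]), end="")
    print(method_info_line("trebuchet", [("range", "300m")]), end="")
```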
+
+Bridge authority behavior
+
+ We need to specify a way to test different transport methods that
+ bridges claim to support. We should test as many as possible. We
+ should NOT require that we have a way to tra
+
+Bridgedb behavior:
+
+ Bridgedb can, given a set of router descriptors and their
+ corresponding extrainfo documents, generate a set of bridge lines
+ for each descriptor. Bridgedb may want to avoid handing out
+ methods that seem to get bridges blocked quickly.
+
+Implementation plan
+
+ First, we should implement per-bridge socks settings (as
+ described above in "manually configuring a client proxy for a
+ bridge") and the extended-server-port mechanism. This will let
+ bridges run transport proxies such that they can hand-generate
+ bridge lines to give to clients for testing.
+
+ Once that's done, we can improve usability a little bit by
+ implementing external proxies. Once that's done, we can see if we
+ need any managed proxies, or if the whole idea there is silly.
+
+ If we do, the next most important part seems to be getting
+ the client-side automatic part written. And once that's done, we
+ can evaluate how much of the server side is easy for people to do
+ and how much is hard.
+
+ The "obfsproxy" obfuscating proxy is a likely candidate for an
+ initial transport, as is Steven Murdoch's http thing or something
+ similar.
+
+Notes on plugins to write:
+
+ We should ship a couple of null plugin implementations in one or two
+ popular, portable languages so that people get an idea of how to
+ write the stuff.
+
+ 1. We should have one that's just a proof of concept that does
+ nothing but transfer bytes back and forth.
+
+ 1. We should not do a rot13 one.
+
+ 2. We should implement a basic proxy that does not transform the bytes at all
+
+ 1. We should implement DNS or HTTP using other software (as Goodell
+ did years ago with DNS) as an example of wrapping existing code into
+ our plugin model.
+
+ 2. The obfuscated-ssh superencipherment is pretty trivial and pretty
+ useful. It makes the protocol stringwise unfingerprintable.
+
+ 1. Nick needs to be told firmly not to bikeshed the obfuscated-ssh
+ superencipherment too badly
+
+ 1. Go ahead, bikeshed my day
+
+ 1. If we do a raw-traffic proxy, openssh tunnels would be the logical choice.
+
+Appendix: recommendations for transports
+
+ Be free/open-source software. Also, if you think your code might
+ someday do so well at circumvention that it should be implemented
+ inside Tor, it should use the same license as Tor.
+
+ Use libraries that Tor already requires. (You can rely on openssl and
+ libevent being present if current Tor is present.)
+
+ Be portable: most Tor users are on Windows, and most Tor developers
+ are not, so designing your code for just one of these platforms will
+ make it either get a small userbase, or poor auditing.
+
+ Think secure: if your code is in a C-like language, and it's hard to
+ read it and become convinced it's safe, then it's probably not safe.
+
+ Think small: we want to minimize the bytes that a Windows user needs
+ to download for a transport client.
+
+ Avoid security-through-obscurity if possible. Specify.
+
+ Resist trivial fingerprinting: There should be no good string or regex
+ to search for to distinguish your protocol from protocols permitted by
+ censors.
+
+ Imitate a real profile: There are many ways to implement most
+ protocols -- and in many cases, most possible variants of a given
+ protocol won't actually exist in the wild.
+
+
+