proposal 203: Avoiding censorship by impersonating an HTTPS server

author: Nick Mathewson <nickm@torproject.org> 2012-06-25 18:20:11 -0400
committer: Nick Mathewson <nickm@torproject.org> 2012-06-25 18:24:01 -0400
commit: d3aa362f6e507031931ef1815512f4eefe3d2fb2 (patch)
tree: 3e306108fb5919c062934c6fd75f027e014f3a6a /proposals/203-https-frontend.txt
parent: b4195a51a98f0c54efbbf9a9e5241cd4ce6f57a4 (diff)
1 files changed, 247 insertions, 0 deletions
diff --git a/proposals/203-https-frontend.txt b/proposals/203-https-frontend.txt
new file mode 100644
index 0000000..f559d92
--- /dev/null
+++ b/proposals/203-https-frontend.txt
@@ -0,0 +1,247 @@
+Filename: 203-https-frontend.txt
+Title: Avoiding censorship by impersonating an HTTPS server
+Author: Nick Mathewson
+Created: 24 Jun 2012
+Status: Draft
+
+
+Overview:
+
+   One frequently proposed approach for censorship resistance is that
+   Tor bridges ought to act like another TLS-based service, and deliver
+   traffic to Tor only if the client can demonstrate some shared
+   knowledge with the bridge.
+
+   In this document, I discuss some design considerations for building
+   such systems, and propose a few possible architectures and designs.
+
+Background:
+
+   Most of our previous work on censorship resistance has focused on
+   preventing passive attackers from identifying Tor bridges, or from
+   doing so cheaply.  But active attackers exist, and exist in the wild:
+   right now, the most sophisticated censors use their anti-Tor passive
+   attacks only as a first round of filtering before launching a
+   secondary active attack to confirm suspected Tor nodes.
+
+   One idea we've been talking about for a while is that of having a
+   service that looks like an HTTPS service unless a client does some
+   particular secret thing to prove it is allowed to use it as a Tor
+   bridge.  Such a system would still succumb to passive traffic
+   analysis attacks (since the packet timings and sizes for HTTPS don't
+   look that much like Tor), but it would be enough to beat many current
+   censors.
+
+Goals and requirements:
+
+   We should make it impossible for a passive attacker who examines only
+   a few packets at a time to distinguish Tor->Bridge traffic from an
+   HTTPS client talking to an HTTPS server.
+
+   We should make it impossible for an active attacker talking to the
+   server to tell a Tor bridge server from a regular HTTPS server.
+
+   We should make it impossible for an active attacker who can MITM the
+   server to learn from the client whether it thought it was connecting
+   to an HTTPS server or a Tor bridge.  (This implies that an MITM
+   attacker shouldn't be able to learn anything that would help it
+   convince the server to act like a bridge.)
+
+   It would be nice to minimize the required code changes to Tor, and
+   the required code changes to any other software.
+
+   It would be good to avoid any requirement of close integration with
+   any particular HTTP or HTTPS implementation.
+
+   If we're replacing our own profile with that of an HTTPS service, we
+   should do so in a way that lets us use the profile of a popular
+   HTTPS implementation.
+
+   Efficiency would be good: layering TLS inside TLS is best avoided if
+   we can.
+
+Discussion:
+
+   We need an actual web server; HTTP and HTTPS are so complicated that
+   there's no practical way to behave in a bug-compatible way with any
+   popular webserver short of running that webserver.
+
+   More obviously, we need a TLS implementation (or we can't implement
+   HTTPS), and we need a Tor bridge (since that's the whole point of
+   this exercise).
+
+   So from a top-level point of view, the question becomes: how shall we
+   wire these together?
+
+   There are three obvious ways; I'll discuss them in turn below.
+
+Design #1: TLS in Tor
+
+   Under this design, Tor accepts HTTPS connections, decides which ones
+   don't look like the Tor protocol, and relays them to a webserver.
+
+                   +--------------------------------------+
+     +------+  TLS |  +------------+  http +-----------+  |
+     | User |<------> | Tor Bridge |<----->| Webserver |  |
+     +------+      |  +------------+       +-----------+  |
+                   |     trusted host/network             |
+                   +--------------------------------------+
+
+   This approach would let us use a completely unmodified webserver
+   implementation, but would require the most extensive changes in Tor:
+   we'd need to add yet another flavor to Tor's TLS ice cream parlor,
+   and try to emulate a popular webserver's TLS behavior even more
+   thoroughly.
+
+   To authenticate, we would need to take a hybrid approach, and begin
+   forwarding traffic to the webserver as soon as a webserver
+   might respond to the traffic.  This could be pretty complicated,
+   since it requires us to have a model of how the webserver would
+   respond to any given set of bytes.  As a workaround, we might try
+   relaying _all_ input to the webserver, and only replying as Tor in
+   the cases where the website hasn't replied.  (This would be likely
+   to create recognizable timing patterns, though.)
+
+   The authentication itself could use a system akin to Tor proposals
+   189/190, where an early AUTHORIZE cell shows knowledge of a shared
+   secret if the client is a Tor client.
+
+Design #2: TLS in the web server
+
+                   +----------------------------------+
+     +------+  TLS |  +------------+  tor0   +-----+  |
+     | User |<------> | Webserver  |<------->| Tor |  |
+     +------+      |  +------------+         +-----+  |
+                   |     trusted host/network         |
+                   +----------------------------------+
+
+   In this design, we write an Apache module or something that can
+   recognize an authenticator of some kind in an HTTPS header, or
+   recognize a valid AUTHORIZE cell, and respond by forwarding the
+   traffic to a Tor instance.
+
+   To avoid the efficiency issue of doing an extra local
+   encrypt/decrypt, we need to have the webserver talk to Tor over a
+   local unencrypted connection. (I've denoted this as "tor0" in the
+   diagram above.)  For implementation convenience, we might want to
+   implement that as a NULL TLS connection, so that the Tor server code
+   wouldn't have to change except to allow local NULL TLS connections in
+   this configuration.
+
+   For the Tor handshake to work properly here, we'll need a way for the
+   Tor instance to know which public key the webserver is configured to
+   use.
+
+   We wouldn't need to support the parts of the Tor link protocol used
+   to authenticate clients to servers: relays shouldn't be using this
+   subsystem at all.
+
+   The Tor client would need to connect and prove its status as a Tor
+   client.  If the client uses some means other than AUTHORIZE cells, or
+   if we want to do the authentication in a pluggable transport, and we
+   therefore decided to offload the responsibility for TLS itself to the
+   pluggable transport, that would scare me: Supporting pluggable
+   transports that have the responsibility for TLS would make it fairly
+   easy to mess up the crypto, and I'd rather not have it be so easy to
+   write a pluggable transport that accidentally makes Tor less secure.
+
+Design #3: Reverse proxy
+
+
+                   +----------------------------------+
+                   |  +-------+  http  +-----------+  |
+                   |  |       |<------>| Webserver |  |
+     +------+  TLS |  |       |        +-----------+  |
+     | User |<------> | Proxy |                       |
+     +------+      |  |       |  tor0  +-----------+  |
+                   |  |       |<------>|    Tor    |  |
+                   |  +-------+        +-----------+  |
+                   |     trusted host/network         |
+                   +----------------------------------+
+
+   In this design, we write a server-side proxy to sit in front of Tor
+   and a webserver, or repurpose some existing HTTPS proxy. Its role
+   will be to do TLS, and then forward connections to Tor or the
+   webserver as appropriate.  (In the web world, this kind of thing is
+   called a "reverse proxy", so that's the term I'm using here.)
+
+   To avoid fingerprinting, we should choose a proxy that's already in
+   common use as a TLS frontend for webservers -- nginx, perhaps.
+   Unfortunately, the more popular tools here seem to be pretty complex,
+   and the simpler tools less widely deployed.  More investigation would
+   be needed.
+
+   The authorization considerations would be as in Design #2 above; for
+   the reasons discussed there, it's probably a good idea to build the
+   necessary authorization into Tor itself.
+
+   I generally like this design best: it lets us isolate the "Check for
+   a valid authenticator and/or a valid or invalid HTTP header, and
+   react accordingly" question to a single program.
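+
+   A minimal sketch of that single decision point, in Python.  Everything
+   named here is an assumption made for illustration: the authenticator
+   construction (a hex-encoded HMAC over a fixed label), AUTH_LEN, and
+   the route() helper are hypothetical, and this is neither the
+   AUTHORIZE cell format nor MITM-resistant.  It also assumes that a
+   webserver may start responding once it sees a line ending.
+

```python
import hashlib
import hmac

AUTH_LEN = 64  # hypothetical: length of a hex-encoded SHA-256 HMAC


def expected_authenticator(shared_secret: bytes) -> bytes:
    """Stand-in authenticator derived from the shared secret (assumption)."""
    mac = hmac.new(shared_secret, b"example-bridge-auth", hashlib.sha256)
    return mac.hexdigest().encode("ascii")


def route(first_bytes: bytes, shared_secret: bytes) -> str:
    """Return "tor", "web", or "wait" for the first decrypted bytes."""
    head = first_bytes[:AUTH_LEN]
    # Assumed rule: a line ending means the webserver might already answer,
    # so hand the connection over rather than holding it back.
    if b"\r" in head or b"\n" in head:
        return "web"
    if len(head) < AUTH_LEN:
        return "wait"  # not enough input to decide yet
    # Constant-time comparison against the expected value.
    if hmac.compare_digest(head, expected_authenticator(shared_secret)):
        return "tor"
    return "web"
```

+   Under these assumptions the proxy would call route() each time more
+   bytes arrive after the TLS handshake, keep buffering on "wait", and
+   replay the buffered bytes to whichever backend is chosen.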
+
+How to authenticate: The easiest way
+
+   Designing a good MITM-resistant AUTHORIZE cell, or an equivalent
+   HTTP header, is an open problem that we should solve in proposals
+   190 and 191 and their successors.  I'm calling it out-of-scope here;
+   please see those proposals, their attendant discussion, and their
+   eventual successors.
+
+How to authenticate: a slightly harder way
+
+   Some proposals in this vein have in the past suggested a special
+   HTTP header to distinguish Tor connections from non-Tor connections.
+   This could work too, though it would require substantially larger
+   changes on the Tor client's part, would still require the client
+   to take measures to avoid MITM attacks, and would also require the
+   client to implement a particular browser's HTTP profile.
+
+Some considerations on distinguishability
+
+   Against a passive eavesdropper, the easiest way to avoid
+   distinguishability in server responses will be to use an actual web
+   server or reverse web proxy's TLS implementation.
+   (Distinguishability based on client TLS use is another topic
+   entirely.)
+
+   Against an active non-MITM attacker, the best probing attacks will be
+   ones designed to provoke the system into acting in ways different from
+   those in which a webserver would act: responding earlier than a web
+   server would respond, or later, or differently.  We need to make sure
+   that, whatever the front-end program is, it answers anything that
+   would qualify as a well-formed or ill-formed HTTP request whenever
+   the web server would.  This means, for example, that whatever the
+   correct form of client authorization turns out to be, no prefix of
+   that authorization is ever something that the webserver would respond
+   to.  With some web servers (I believe), that's as easy as making sure
+   that any valid authenticator isn't too long, and doesn't contain a CR
+   or LF character.  With others, the authenticator would need to be a
+   valid HTTP request, with all the attendant difficulty that would
+   raise.
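+
+   For the simpler case, the constraint on the authenticator can be
+   sketched as a check like the following.  The length limit and the
+   function name are hypothetical; the real limit would depend on the
+   particular web server, and this sketch does not cover servers that
+   require the authenticator to be a valid HTTP request.
+

```python
MAX_AUTH_LEN = 128  # hypothetical limit; depends on the web server in use


def is_safe_authenticator(candidate: bytes, max_len: int = MAX_AUTH_LEN) -> bool:
    """Check that a candidate authenticator is short and has no CR or LF.

    If it holds, no prefix of the authenticator ends a line, which is the
    property assumed to keep the web server from responding early.
    """
    if not candidate or len(candidate) > max_len:
        return False
    return b"\r" not in candidate and b"\n" not in candidate
```

+   A bridge operator could run such a check when generating a new
+   authenticator and simply regenerate it if the check fails.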
+
+   Against an attacker who can MITM the bridge, the best attacks will be
+   to wait for clients to connect and see how they behave.  In this
+   case, the client probably needs to be able to authenticate the bridge
+   certificate as presented in the initial TLS handshake -- or some
+   other aspect of the TLS handshake if we're feeling insane.  If the
+   certificate or handshake isn't as expected, the client should behave
+   as a web browser that's just received a bad TLS certificate.  (The
+   alternative there would be to try to impersonate an HTTPS client that
+   has just accepted a self-signed certificate.  But that would probably
+   require the Tor client to impersonate a full web browser, which isn't
+   realistic.)
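+
+   One way the client-side check could look, assuming the client learns
+   a SHA-256 fingerprint of the bridge certificate out of band (for
+   instance alongside the bridge address).  The fingerprint format and
+   both function names are hypothetical; the proposal does not specify
+   how the expected certificate is conveyed.
+

```python
import hashlib
import hmac


def certificate_matches_pin(cert_der: bytes, pinned_sha256_hex: str) -> bool:
    """Compare the DER certificate's SHA-256 digest with the pinned value."""
    actual = hashlib.sha256(cert_der).hexdigest()
    pinned = pinned_sha256_hex.strip().lower()
    # Encode both sides so the constant-time comparison accepts any input.
    return hmac.compare_digest(actual.encode("utf-8"), pinned.encode("utf-8"))


def client_action(cert_der: bytes, pinned_sha256_hex: str) -> str:
    """Pick the client's behaviour after the TLS handshake."""
    if certificate_matches_pin(cert_der, pinned_sha256_hex):
        return "authenticate-as-tor"
    # Unexpected certificate: behave like a browser rejecting a bad cert.
    return "abort-like-browser"
```

+   With Python's ssl module, the DER bytes could be obtained from
+   SSLSocket.getpeercert(binary_form=True) once the handshake completes.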
+
+Side note: What to put on the webserver?
+
+   To credibly pretend not to be ourselves, we must pretend to be
+   something else in particular -- and something not easily identifiable
+   or inherently worthless.  We should not, for example, have all
+   deployments of this kind use a fixed website, even if that website is
+   the default "Welcome to Apache" configuration: A censor would
+   probably feel that they weren't breaking anything important by
+   blocking all unconfigured websites with nothing on them.
+
+   Therefore, we should probably conceive of a system like this as
+   "Something to add to your HTTPS website" rather than as a standalone
+   installation.
+