author Nick Mathewson <nickm@torproject.org> 2003-11-01 06:47:19 +0000
committer Nick Mathewson <nickm@torproject.org> 2003-11-01 06:47:19 +0000
commit c826c5a95c2ac45d0d75b17448a94f88e4cafd1b
tree 332dabcb8b5f8249e7322e5b6efde12c5901784a /doc/tor-design.tex
parent b6d8d458f3c4dd7156384e87d4f324b931ec2ef1
Retitle and write section 8.
svn:r702
Diffstat (limited to 'doc/tor-design.tex')
-rw-r--r-- doc/tor-design.tex 266
1 files changed, 152 insertions, 114 deletions
diff --git a/doc/tor-design.tex b/doc/tor-design.tex
index ca0ecaf369..6a46075859 100644
--- a/doc/tor-design.tex
+++ b/doc/tor-design.tex
@@ -476,6 +476,7 @@ Tor's evolution.
\end{description}
\SubSection{Non-goals}
+\label{subsec:non-goals}
In favoring conservative, deployable designs, we have explicitly deferred
a number of goals. Many of these goals are desirable in anonymity systems,
but we choose to defer them either because they are solved elsewhere,
@@ -1539,124 +1540,161 @@ Mention jurisdictional arbitrage.
Pull attacks and defenses into analysis as a subsection
-\Section{Maintaining anonymity in Tor}
+\Section{Open Questions in Low-latency Anonymity}
\label{sec:maintaining-anonymity}
-\footnote{The first Onion Routing design \cite{or-ih96} protected against
-this threat to some
-extent by requiring users to hide network access behind an onion
-router/firewall that was also forwarding traffic from other nodes.
-However, it is desirable for users to
-benefit from Onion Routing even when they can't run their own
-onion routers.
-%Such users, especially if they engage in certain unusual
-%communication behaviors, may be identifiable \cite{wright03}.
-%To
-%complicate the possibility of such attacks Tor multiplexes many
-%stream down each circuit, but still rotates the circuit
-%periodically to avoid too much linkability from requests on a single
-%circuit.
-}
-
-I probably should have noted that this means loops will be on at least
-five hop routes, which should be rare given the distribution. I'm
-realizing that this is reproducing some of the thought that led to a
-default of five hops in the original onion routing design. There were
-some different assumptions, which I won't spell out now. Note that
-enclave level protections really change these assumptions. If most
-circuits are just two hops, then just a single link observer will be
-able to tell that two enclaves are communicating with high probability.
-So, it would seem that enclaves should have a four node minimum circuit
-to prevent trivial circuit insider identification of the whole circuit,
-and three hop minimum for circuits from an enclave to some nonclave
-responder. But then... we would have to make everyone obey these rules
-or a node that through timing inferred it was on a four hop circuit
-would know that it was probably carrying enclave to enclave traffic.
-Which... if there were even a moderate number of bad nodes in the
-network would make it advantageous to break the connection to conduct
-a reformation intersection attack. Ahhh! I gotta stop thinking
-about this and work on the paper some before the family wakes up.
-On Sat, Oct 25, 2003 at 06:57:12AM -0400, Paul Syverson wrote:
-> Which... if there were even a moderate number of bad nodes in the
-> network would make it advantageous to break the connection to conduct
-> a reformation intersection attack. Ahhh! I gotta stop thinking
-> about this and work on the paper some before the family wakes up.
-This is the sort of issue that should go in the 'maintaining anonymity
-with tor' section towards the end. :)
-Email from between roger and me to beginning of section above. Fix and move.
-
-
-[Put as much of this as a part of open issues as is possible.]
-
-[what's an anonymity set?]
-
-packet counting attacks work great against initiators. need to do some
-level of obfuscation for that. standard link padding for passive link
-observers. long-range padding for people who own the first hop. are
-we just screwed against people who insert timing signatures into your
-traffic?
-
-Even regardless of link padding from Alice to the cloud, there will be
-times when Alice is simply not online. Link padding, at the edges or
-inside the cloud, does not help for this.
-
-how often should we pull down directories? how often send updated
-server descs?
-
-when we start up the client, should we build a circuit immediately,
-or should the default be to build a circuit only on demand? should we
-fetch a directory immediately?
-
-would we benefit from greater synchronization, to blend with the other
-users? would the reduced speed hurt us more?
-
-does the "you can't see when i'm starting or ending a stream because
-you can't tell what sort of relay cell it is" idea work, or is just
-a distraction?
-
-does running a server actually get you better protection, because traffic
-coming from your node could plausibly have come from elsewhere? how
-much mixing do you need before this is actually plausible, or is it
-immediately beneficial because many adversary can't see your node?
-
-do different exit policies at different exit nodes trash anonymity sets,
-or not mess with them much?
-
-do we get better protection against a realistic adversary by having as
-many nodes as possible, so he probably can't see the whole network,
-or by having a small number of nodes that mix traffic well? is a
-cascade topology a more realistic way to get defenses against traffic
-confirmation? does the hydra (many inputs, few outputs) topology work
-better? are we going to get a hydra anyway because most nodes will be
+% There must be a better intro than this! -NM
+In addition to the open problems discussed in
+section~\ref{subsec:non-goals}, many other questions remain to be
+solved by future research before we can be truly confident that we
+have built a secure low-latency anonymity service.
+
+Many of these open issues are questions of balance. For example,
+how often should users rotate to fresh circuits? Too-frequent
+rotation is inefficient and expensive, but too-infrequent rotation
+makes the user's traffic linkable. Instead of opening a fresh
+circuit, clients can also limit linkability by exiting from a middle point
+of the circuit, or by truncating and re-extending the circuit, but
+more analysis is needed to determine the proper trade-off.
+[XXX mention predecessor attacks?]
+
+A similar question surrounds timing of directory operations:
+how often should directories be updated? With too-infrequent
+updates clients receive an inaccurate picture of the network; with
+too-frequent updates the directory servers are overloaded.
+
+%do different exit policies at different exit nodes trash anonymity sets,
+%or not mess with them much?
+%
+%% Why would they? By routing traffic to certain nodes preferentially?
+
+[XXX Choosing paths and path lengths: I'm not writing this bit till
+ Arma's pathselection stuff is in. -NM]
+
+%%%% Roger said that he'd put a path selection paragraph into section
+%%%% 4 that would replace this.
+%
+%I probably should have noted that this means loops will be on at least
+%five hop routes, which should be rare given the distribution. I'm
+%realizing that this is reproducing some of the thought that led to a
+%default of five hops in the original onion routing design. There were
+%some different assumptions, which I won't spell out now. Note that
+%enclave level protections really change these assumptions. If most
+%circuits are just two hops, then just a single link observer will be
+%able to tell that two enclaves are communicating with high probability.
+%So, it would seem that enclaves should have a four node minimum circuit
+%to prevent trivial circuit insider identification of the whole circuit,
+%and three hop minimum for circuits from an enclave to some nonclave
+%responder. But then... we would have to make everyone obey these rules
+%or a node that through timing inferred it was on a four hop circuit
+%would know that it was probably carrying enclave to enclave traffic.
+%Which... if there were even a moderate number of bad nodes in the
+%network would make it advantageous to break the connection to conduct
+%a reformation intersection attack. Ahhh! I gotta stop thinking
+%about this and work on the paper some before the family wakes up.
+%On Sat, Oct 25, 2003 at 06:57:12AM -0400, Paul Syverson wrote:
+%> Which... if there were even a moderate number of bad nodes in the
+%> network would make it advantageous to break the connection to conduct
+%> a reformation intersection attack. Ahhh! I gotta stop thinking
+%> about this and work on the paper some before the family wakes up.
+%This is the sort of issue that should go in the 'maintaining anonymity
+%with tor' section towards the end. :)
+%Email from between roger and me to beginning of section above. Fix and move.
+
+Throughout this paper, we have assumed that end-to-end traffic
+analysis cannot yet be defeated. But even high-latency anonymity
+systems can be vulnerable to end-to-end traffic analysis, if the
+traffic volumes are high enough, and if users' habits are sufficiently
+distinct \cite{disclosure,statistical-disclosure}. \emph{What can be
+ done to limit the effectiveness of these attacks against low-latency
+ systems?} Tor already makes some effort to conceal the starts and
+ends of streams by wrapping all long-range control commands in
+identical-looking relay cells, but more analysis is needed. Link
+padding could frustrate passive observers who count packets; long-range
+padding could work against observers who own the first hop in a
+circuit. But more research needs to be done in order to find an
+efficient and practical approach. Volunteers prefer not to run
+constant-bandwidth padding; but more sophisticated traffic shaping
+approaches remain somewhat unanalyzed. [XXX is this so?] Recent work
+on long-range padding \cite{long-range-padding} shows promise. One
+could also try to reduce correlation in packet timing by batching and
+re-ordering packets, but it is unclear whether this could improve
+anonymity without introducing so much latency as to render the
+network unusable.
+
+Even if passive timing attacks were wholly solved, active timing
+attacks would remain. \emph{What can
+ be done to address attackers who can introduce timing patterns into
+ a user's traffic?} [XXX mention likely approaches]
+
+%%% I think we cover this by framing the problem as ``Can we make
+%%% end-to-end characteristics of low-latency systems as good as
+%%% those of high-latency systems?'' Eliminating long-term
+%%% intersection is a hard problem.
+%
+%Even regardless of link padding from Alice to the cloud, there will be
+%times when Alice is simply not online. Link padding, at the edges or
+%inside the cloud, does not help for this.
+
+In order to scale to large numbers of users, and to prevent an
+attacker from observing the whole network at once, it may be necessary
+for low-latency anonymity systems to support far more servers than Tor
+currently anticipates. This introduces several issues. First, if
+approval by a centralized set of directory servers is no longer
+feasible, what mechanism should be used to prevent adversaries from
+signing up many spurious servers? (Tarzan and Morphmix present
+possible solutions.) Second, if clients can no longer have a complete
+picture of the network at all times, how do we prevent attackers from
+manipulating client knowledge? Third, if there are too many servers
+for every server to constantly communicate with every other, what kind
+of non-clique topology should the network use? [XXX cite george's
+ restricted-routes paper] (Whatever topology we choose, we need some
+way to keep attackers from manipulating their position within it.)
+Fourth, since no centralized authority is tracking server reliability,
+how do we prevent unreliable servers from rendering the network
+unusable? Fifth, do clients receive so much anonymity benefit from
+running their own servers that we should expect them all to do so, or
+do we need to find another incentive structure to motivate them?
+
+Alternatively, it may be the case that one of these problems proves
+intractable, or that the drawbacks to many-server systems prove
+greater than the benefits. Nevertheless, we may still do well to
+consider non-clique topologies. A cascade topology may provide more
+defense against traffic confirmation.
+% Why would it? Cite. -NM
+Does the hydra (many inputs, few outputs) topology work
+better? Are we going to get a hydra anyway because most nodes will be
middleman nodes?
-using a circuit many times is good because it's less cpu work.
- good because of predecessor attacks with path rebuilding.
- bad because predecessor attacks can be more likely to link you with a
- previous circuit since you're so verbose.
- bad because each thing you do on that circuit is linked to the other
- things you do on that circuit.
- how often to rotate?
- how to decide when to exit from middle?
- when to truncate and re-extend versus when to start new circuit?
-
-Because Tor runs over TCP, when one of the servers goes down it seems
-that all the circuits (and thus streams) going over that server must
-break. This reduces anonymity because everybody needs to reconnect
-right then (does it? how much?) and because exit connections all break
-at the same time, and it also reduces usability. It seems the problem
-is even worse in a p2p environment, because so far such systems don't
-really provide an incentive for nodes to stay connected when they're
-done browsing, so we would expect a much higher churn rate than for
-onion routing. Are there ways of allowing streams to survive the loss
-of a node in the path?
-
-discuss topologies. Cite George's non-freeroutes paper. Maybe this
-graf goes elsewhere.
-
-discuss attracting users; incentives; usability.
-
-Choosing paths and path lengths.
+%%% Do more with this paragraph once The TCP-over-TCP paragraph is
+%%% more integrated into Related works.
+%
+As mentioned in section~\ref{where-is-it-now}, Tor could improve its
+robustness against node failure by buffering stream data at the
+network's edges, and performing end-to-end acknowledgments. The
+efficacy of this approach remains to be tested, however, and there
+may be more effective means for ensuring reliable connections in the
+presence of unreliable nodes.
+
+%%% Keeping this original paragraph for a little while, since it
+%%% is not the same as what's written there now.
+%
+%Because Tor depends on TLS and TCP to provide a reliable transport,
+%when one of the servers goes down, all the circuits (and thus streams)
+%traveling over that server must break. This reduces anonymity because
+%everybody needs to reconnect right then (does it? how much?) and
+%because exit connections all break at the same time, and it also harms
+%usability. It seems the problem is even worse in a peer-to-peer
+%environment, because so far such systems don't really provide an
+%incentive for nodes to stay connected when they're done browsing, so
+%we would expect a much higher churn rate than for onion routing.
+%Are there ways of allowing streams to survive the loss of a node in the
+%path?
+
+% Roger or Paul suggested that we say something about incentives,
+% too, but I think that's a better candidate for our future work
+% section. After all, we will doubtlessly learn very much about why
+% people do or don't run and use Tor in the near future. -NM
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%