diff options
author | Nick Mathewson <nickm@torproject.org> | 2003-11-04 05:42:50 +0000 |
---|---|---|
committer | Nick Mathewson <nickm@torproject.org> | 2003-11-04 05:42:50 +0000 |
commit | 7b19ba8aa1f3fcf8b9ee5590eab60be17eecc9a4 (patch) | |
tree | b8937099678a48d275d524744eb5e863743da858 /doc | |
parent | a76a61fef67bb578fa010ceaa7c8dfe3620ec9da (diff) | |
download | tor-7b19ba8aa1f3fcf8b9ee5590eab60be17eecc9a4.tar.gz tor-7b19ba8aa1f3fcf8b9ee5590eab60be17eecc9a4.zip |
Tighten up 1-3; clarify a few points
svn:r745
Diffstat (limited to 'doc')
-rw-r--r-- | doc/tor-design.tex | 137 |
1 files changed, 70 insertions, 67 deletions
diff --git a/doc/tor-design.tex b/doc/tor-design.tex index a93b1a6fb4..cff2e177fc 100644 --- a/doc/tor-design.tex +++ b/doc/tor-design.tex @@ -51,8 +51,8 @@ \begin{abstract} We present Tor, a circuit-based low-latency anonymous communication -system. This second-generation Onion Routing system addresses limitations -in the original design. We add perfect forward secrecy, congestion +service. This second-generation Onion Routing system addresses limitations +in the original design. Tor adds perfect forward secrecy, congestion control, directory servers, integrity checking, variable exit policies, and a practical design for rendezvous points. Tor works on the real-world Internet, requires no special privileges or kernel modifications, requires @@ -81,10 +81,10 @@ the circuit. Traffic flowing down the circuit is sent in fixed-size Onion Routing project published several design and analysis papers \cite{or-ih96,or-jsac98,or-discex00,or-pet00}. While a wide area Onion Routing network was deployed for some weeks, the only long-running and -publicly accessible implementation of the original design was a fragile +publicly accessible implementation was a fragile proof-of-concept that ran on a single machine. Even this simple deployment processed connections from over sixty thousand distinct IP addresses from -all over the world at an average rate of about fifty thousand per day. +all over the world at a rate of about fifty thousand per day. But many critical design and deployment issues were never resolved, and the design has not been updated in several years. Here we describe Tor, a protocol for asynchronous, loosely federated onion @@ -96,6 +96,7 @@ Routing design: \item \textbf{Perfect forward secrecy:} The original Onion Routing design was vulnerable to a single hostile node recording traffic and later compromising successive nodes in the circuit and forcing them +% XXX Problem: at this point, readers don't know what an onion is. to decrypt it. Rather than using a single onion to lay each circuit, Tor now uses an incremental or \emph{telescoping} path-building design, where the initiator negotiates session keys with each successive hop in @@ -121,20 +122,20 @@ application-level request. This hurt performance by requiring multiple public key operations for every request, and also presented a threat to anonymity from building so many different circuits; see Section~\ref{sec:maintaining-anonymity}. Tor multiplexes multiple TCP -streams along each virtual circuit, to improve efficiency and anonymity. +streams along each virtual circuit to improve efficiency and anonymity. \item \textbf{Leaky-pipe circuit topology:} Through in-band signaling within the circuit, Tor initiators can direct traffic to nodes partway -down the circuit. This novel approach allows both for long-range +down the circuit. This novel approach allows for long-range padding to frustrate traffic shape and volume attacks at the initiator -\cite{defensive-dropping}, and, because circuits are used by more than one -application, allows traffic to exit the circuit from the middle---thus +\cite{defensive-dropping}, and +also allows traffic to exit the circuit from the middle---thus frustrating traffic shape and volume attacks based on observing the end of the circuit. \item \textbf{No mixing, padding, or traffic shaping:} The original Onion Routing design called for batching and reordering the cells arriving from -each circuit. It also included padding between onion routers and, in a +each source. It also included padding between onion routers and, in a later design, between onion proxies (that is, users) and onion routers \cite{or-ih96,or-jsac98}. The trade-off between padding protection and cost was discussed, but no general padding scheme was suggested. In @@ -151,21 +152,21 @@ leave these strategies out. address traffic bottlenecks. Unfortunately, typical approaches to load balancing and flow control in overlay networks involve inter-node control communication and global views of traffic. Tor's decentralized -congestion control uses end-to-end acks to maintain reasonable anonymity +congestion control uses end-to-end acks to maintain anonymity while allowing nodes at the edges of the network to detect congestion -or flooding attacks and send less data until the congestion subsides. +or flooding and send less data until the congestion subsides. \item \textbf{Directory servers:} The original Onion Routing design planned to flood link-state information through the network---an approach -that can be unreliable and open to partitioning attacks or outright +that can be unreliable and open to partitioning attacks or deception. Tor takes a simplified view toward distributing link-state -information. Certain more trusted onion routers also act as directory -servers: they provide signed \emph{directories} that describe known +information. Certain more trusted onion routers act as \emph{directory +servers}: they provide signed directories that describe known routers and their availability. Users periodically download these directories via HTTP. \item \textbf{Variable exit policies:} Tor provides a consistent mechanism -for each node to specify and advertise a policy describing the hosts +for each node to advertise a policy describing the hosts and ports to which it will connect. These exit policies are critical in a volunteer-based distributed infrastructure, because each operator is comfortable with allowing different types of traffic to exit the Tor @@ -181,8 +182,8 @@ Tor hampers these attacks by checking data integrity before it leaves the network. \item \textbf{Improved robustness to failed nodes:} A failed node -in the old design meant that circuit-building failed, but thanks to -Tor's step-by-step circuit building, users can notice failed nodes +in the old design meant that circuit building failed, but thanks to +Tor's step-by-step circuit building, users notice failed nodes while building circuits and route around them. Additionally, liveness information from directories allows users to avoid unreliable nodes in the first place. @@ -192,22 +193,21 @@ Tor provides an integrated mechanism for responder anonymity via location-protected servers. Previous Onion Routing designs included long-lived ``reply onions'' that could be used to build virtual circuits to a hidden server, but these reply onions did not provide forward -security, and would become useless if any node in the path went down +security, and became useless if any node in the path went down or rotated its keys. In Tor, clients negotiate {\it rendezvous points} to connect with hidden servers; reply onions are no longer required. \end{tightlist} -Unlike anonymity systems like Freedom \cite{freedom2-arch}, Tor only +Unlike Freedom \cite{freedom2-arch}, Tor only attempts to anonymize TCP streams. Because it does not require patches (or built-in support) in an operating system's network stack, this approach has proven valuable to Tor's portability and deployability. We have implemented most of the above features. Our source code is available under a free license, and, as far as we know, is unencumbered by -patents. We have recently begun deploying a widespread alpha network +patents. We have recently begun deploying a wide-area alpha network to test the design in practice, to get more experience with usability -and users, and to provide a research platform for experimenting with -new ideas. +and users, and to provide a research platform for experimentation. We review previous work in Section~\ref{sec:related-work}, describe our goals and assumptions in Section~\ref{sec:assumptions}, @@ -223,21 +223,22 @@ Routing project in Section~\ref{sec:conclusion}. \Section{Related work} \label{sec:related-work} -Modern anonymity systems date to Chaum's Mix-Net design +Modern anonymity systems date to Chaum's {\bf Mix-Net} design \cite{chaum-mix}. Chaum proposed hiding the correspondence between sender and recipient by wrapping messages in layers of public key cryptography, and relaying them -through a path composed of ``Mixes.'' These mixes in turn decrypt, delay, -and re-order messages, before relaying them along the sender-selected -path towards their destinations. +through a path composed of ``Mixes.'' Each of these mixes in turn +decrypts, delays, and re-orders messages, before relaying them toward +their destinations. Subsequent relay-based anonymity designs have diverged in two principal directions. Some have attempted to maximize anonymity at the cost of introducing comparatively large and variable latencies, -including Babel \cite{babel}, Mixmaster \cite{mixmaster-spec}, and -Mixminion \cite{minion-design}. Because of this +including {\bf Babel} \cite{babel}, {\bf Mixmaster} +\cite{mixmaster-spec}, and +{\bf Mixminion} \cite{minion-design}. Because of this decision, these \emph{high-latency} networks are well-suited for anonymous -email, but introduce too much lag for interactive tasks such as web browsing, +email, but introduce too much lag for interactive tasks like web browsing, internet chat, or SSH connections. Tor belongs to the second category: \emph{low-latency} designs that @@ -251,7 +252,7 @@ difficult for them to prevent an attacker who can eavesdrop both ends of the communication from correlating the timing and volume of traffic entering the anonymity network with traffic leaving it. These protocols are also vulnerable against active attacks in which an -adversary introduces timing patterns into traffic entering the network, and +adversary introduces timing patterns into traffic entering the network and looks for correlated patterns among exiting traffic. Although some work has been done to frustrate @@ -266,17 +267,17 @@ attempting to learn who is talking to whom, not to confirm a prior suspicion about who is talking to whom. The simplest low-latency designs are single-hop proxies such as the -Anonymizer \cite{anonymizer}, wherein a single trusted server strips the +{\bf Anonymizer} \cite{anonymizer}, wherein a single trusted server strips the data's origin before relaying it. These designs are easy to -analyze, but require end-users to trust the anonymizing proxy. +analyze, but require users to trust the anonymizing proxy. Concentrating the traffic to a single point increases the anonymity set -(the set of people a given user is hiding among), but it can make traffic +(the people a given user is hiding among), but can make traffic analysis easier: an adversary need only eavesdrop on the proxy to observe the entire system. More complex are distributed-trust, circuit-based anonymizing systems. In these designs, a user establishes one or more medium-term bidirectional -end-to-end circuits, and tunnels TCP streams in fixed-size cells. +end-to-end circuits, and tunnels data in fixed-size cells. Establishing circuits is computationally expensive and typically requires public-key cryptography, whereas relaying cells is comparatively inexpensive and @@ -285,44 +286,43 @@ Because a circuit crosses several servers, and each server only knows the adjacent servers in the circuit, no single server can link a user to her communication partners. -The Java Anon Proxy (also known as JAP or Web MIXes) uses fixed shared +The {\bf Java Anon Proxy} (also known as JAP or Web MIXes) uses fixed shared routes known as \emph{cascades}. As with a single-hop proxy, this approach aggregates users into larger anonymity sets, but again an attacker only needs to observe both ends of the cascade to bridge all -the system's traffic. The Java Anon Proxy's design provides -protection by padding between end users and the head of the cascade +the system's traffic. The Java Anon Proxy's design +calls for padding between end users and the head of the cascade \cite{web-mix}. However, it is not demonstrated whether the current implementation's padding policy improves anonymity. -PipeNet \cite{back01, pipenet}, another low-latency design proposed at +{\bf PipeNet} \cite{back01, pipenet}, another low-latency design proposed at about the same time as the original Onion Routing design, provided stronger anonymity at the cost of allowing a single user to shut down the network simply by not sending. Low-latency anonymous communication has also been designed for other environments such as ISDN \cite{isdn-mixes}. -In P2P designs like Tarzan \cite{tarzan:ccs02} and MorphMix +In P2P designs like {\bf Tarzan} \cite{tarzan:ccs02} and {\bf MorphMix} \cite{morphmix:fc04}, all participants both generate traffic and relay -traffic for others. These systems aim to prevent a peer -or observer from knowing whether a given peer originated a request +traffic for others. These systems aim to conceal +whether a given peer originated a request or just relayed it from another peer. While Tarzan and MorphMix use -layered encryption as above, Crowds \cite{crowds-tissec} simply assumes +layered encryption as above, {\bf Crowds} \cite{crowds-tissec} simply assumes an adversary who cannot observe the initiator: it uses no public-key -encryption, so nodes on a circuit can read that circuit's traffic. +encryption, so any node on a circuit can read that circuit's traffic. -Hordes \cite{hordes-jcs} is based on Crowds but also uses multicast -responses to hide the initiator. Herbivore \cite{herbivore} and P5 -\cite{p5} go even further, requiring broadcast. They make anonymity -and efficiency trade-offs to make broadcast more practical. +{\bf Hordes} \cite{hordes-jcs} is based on Crowds but also uses multicast +responses to hide the initiator. {\bf Herbivore} \cite{herbivore} and +{\bf P5} \cite{p5} go even further, requiring broadcast. These systems are designed primarily for communication between peers, although Herbivore users can make external connections by requesting a peer to serve as a proxy. -Systems like Freedom and the original Onion Routing build the circuit +Systems like {\bf Freedom} and the original Onion Routing build the circuit all at once, using a layered ``onion'' of public-key encrypted messages, each layer of which provides a set of session keys and the address of the next server in the circuit. Tor as described herein, Tarzan, MorphMix, -Cebolla \cite{cebolla}, and Rennhard's Anonymity Network \cite{anonnet} +{\bf Cebolla} \cite{cebolla}, and Rennhard's {\bf Anonymity Network} \cite{anonnet} build the circuit in stages, extending it one hop at a time. Section~\ref{subsubsec:constructing-a-circuit} describes how this @@ -344,7 +344,7 @@ identifying information from those requests, can take advantage of caching to limit the number of requests that leave the network, and can batch or encode those requests in order to minimize the number of connections. On the other hand, an IP-level anonymizer can handle nearly any protocol, -even ones unforeseen by their designers (though these systems require +even ones unforeseen by its designers (though these systems require kernel-level modifications to some operating systems, and so are more complex and less portable). TCP-level anonymity networks like Tor present a middle approach: they are fairly application neutral (so long as the @@ -354,9 +354,9 @@ they avoid the well-known inefficiencies of tunneling TCP over TCP \cite{tcp-over-tcp-is-bad}. Distributed-trust anonymizing systems need to prevent attackers from -adding too many servers and thus compromising too many user paths. +adding too many servers and thus compromising user paths. Tor relies on a small set of well-known directory servers, run by -independent parties, to make decisions about which nodes can +independent parties, to decide which nodes can join. Tarzan and MorphMix allow unknown users to run servers, and use a limited resource (like IP addresses) to prevent an attacker from controlling too much of the network. Crowds suggests requiring @@ -379,10 +379,10 @@ Eternity and Free~Haven. Like other low-latency anonymity designs, Tor seeks to frustrate attackers from linking communication partners, or from linking multiple communications to or from a single user. Within this -main goal, however, several design considerations have directed +main goal, however, several considerations have directed Tor's evolution. -\textbf{Deployability:} The design must be one that can be implemented, +\textbf{Deployability:} The design must be implemented, deployed, and used in the real world. This requirement precludes designs that are expensive to run (for example, by requiring more bandwidth than volunteers are willing to provide); designs that place a heavy @@ -391,21 +391,21 @@ implicate onion routers in illegal activities); and designs that are difficult or expensive to implement (for example, by requiring kernel patches, or separate proxies for every protocol). This requirement also precludes systems in which non-anonymous parties (such as websites) -must run our software. (We do not meet this goal for the current -rendezvous design, +must run our software. (Our rendezvous point design does not meet +this goal for non-anonymous users talking to hidden servers, however; see Section~\ref{sec:rendezvous}.) \textbf{Usability:} A hard-to-use system has fewer users---and because anonymity systems hide users among users, a system with fewer users -provides less anonymity. Usability is thus not only a convenience for Tor: +provides less anonymity. Usability is thus not only a convenience: it is a security requirement \cite{econymics,back01}. Tor should therefore not require modifying applications; should not introduce prohibitive delays; -and should require the user to make as few configuration decisions -as possible. Finally, Tor should be easily implemented on all commonly used +and should require users to make as few configuration decisions +as possible. Finally, Tor should be easily implemented on all commonly platforms; we cannot require users to change their operating system in order to be anonymous. (The current Tor implementation runs on Windows and -assorted Unix-clones including Linux, FreeBSD, and MacOS X.) +assorted Unix clones including Linux, FreeBSD, and MacOS X.) \textbf{Flexibility:} The protocol must be flexible and well-specified, so that it can serve as a test-bed for future research in low-latency @@ -430,7 +430,7 @@ In favoring simple, deployable designs, we have explicitly deferred several possible goals, either because they are solved elsewhere, or because their solution is an open research problem. -\textbf{Not Peer-to-peer:} Tarzan and MorphMix aim to scale to completely +\textbf{Not peer-to-peer:} Tarzan and MorphMix aim to scale to completely decentralized peer-to-peer environments with thousands of short-lived servers, many of which may be controlled by an adversary. This approach is appealing, but still has many open problems @@ -440,16 +440,17 @@ is appealing, but still has many open problems to provide a definitive solution to end-to-end timing or intersection attacks. Some approaches, such as running an onion router, may help; see Section~\ref{sec:attacks} for more discussion. +% XXX P2P issues here. -NM \textbf{No protocol normalization:} Tor does not provide \emph{protocol normalization} like Privoxy or the Anonymizer. If anonymization from the responder is desired for complex and variable -protocols such as HTTP, Tor must be layered with a filtering proxy such +protocols like HTTP, Tor must be layered with a filtering proxy such as Privoxy to hide differences between clients, and expunge protocol features that leak identity. -Note that by this separation Tor can also provide connections that -are anonymous to the network yet authenticated to the responder, for -example SSH. +Note that by this separation Tor can also provide services that +are anonymous to the network yet authenticated to the responder, like +SSH. Similarly, Tor does not currently integrate tunneling for non-stream-based protocols like UDP; this too must be provided by an external service. @@ -475,12 +476,14 @@ compromise some fraction of the onion routers on the network. In low-latency anonymity systems that use layered encryption, the adversary's typical goal is to observe both the initiator and the -responder. Passive attackers can confirm a suspicion that Alice is +responder. By observing both ends, passive attackers can confirm a +suspicion that Alice is talking to Bob if the timing and volume patterns of the traffic on the connection are distinct enough; active attackers can induce timing signatures on the traffic to \emph{force} distinct patterns. Tor provides some defenses against these \emph{traffic confirmation} attacks, for example by encouraging users to run their own onion routers, but it does +% XXX More P2P issues here. -NM not provide complete protection. Rather, we aim to prevent \emph{traffic analysis} attacks, where the adversary uses traffic patterns to learn which points in the network he should attack. @@ -497,7 +500,7 @@ to trustworthy routers to encourage users to send their traffic through compromised routers, or denying service to users to see if the traffic elsewhere in the network stops; or by introducing patterns into traffic that can later be -detected. The adversary might attack the directory servers to give users +detected. The adversary might subvert the directory servers to give users differing views of network state. Additionally, he can try to decrease the network's reliability by attacking nodes or by performing antisocial activities from reliable servers and trying to get them taken down; |