clean whitespace (no substantive changes)

svn:r976
author: Roger Dingledine <arma@torproject.org> 2004-01-07 12:08:07 +0000
committer: Roger Dingledine <arma@torproject.org> 2004-01-07 12:08:07 +0000
commit: 933d531f15c0719f65a4aa415180ca89cd00d90a (patch)
tree: f685eb8f4e8f3fd936aa962eab96705cdcb41a33
parent: bf63d281b402ed4ea799f80d5e47de15dd2e83a0 (diff)
1 files changed, 50 insertions, 50 deletions
diff --git a/doc/tor-design.tex b/doc/tor-design.tex
index 0536aa6f53..1c06bd3d9e 100644
--- a/doc/tor-design.tex
+++ b/doc/tor-design.tex
@@ -81,7 +81,7 @@ build a \emph{circuit}, in which each node (or ``onion router'' or ``OR'')
 in the path knows its predecessor and successor, but no other nodes in
 the circuit.  Traffic flowing down the circuit is sent in fixed-size
 \emph{cells}, which are unwrapped by a symmetric key at each node
-(like the layers of an onion) and relayed downstream. The 
+(like the layers of an onion) and relayed downstream. The
 Onion Routing project published several design and analysis papers
 \cite{or-ih96,or-jsac98,or-discex00,or-pet00}. While a wide area Onion
 Routing network was deployed briefly, the only long-running and
@@ -144,7 +144,7 @@ streams along each circuit to improve efficiency and anonymity.
 
 \textbf{Leaky-pipe circuit topology:} Through in-band signaling
 within the circuit, Tor initiators can direct traffic to nodes partway
-down the circuit. This novel approach 
+down the circuit. This novel approach
 allows traffic to exit the circuit from the middle---possibly
 frustrating traffic shape and volume attacks based on observing the end
 of the circuit. (It also allows for long-range padding if
@@ -257,7 +257,7 @@ difficult for them to prevent an attacker who can eavesdrop both ends of the
 communication from correlating the timing and volume
 of traffic entering the anonymity network with traffic leaving it.  These
 protocols are also vulnerable against active attacks in which an
-adversary introduces timing patterns into traffic entering the network and 
+adversary introduces timing patterns into traffic entering the network and
 looks
 for correlated patterns among exiting traffic.
 Although some work has been done to frustrate
@@ -274,7 +274,7 @@ confirmation (cf.\ Section~\ref{subsec:threat-model}).
 The simplest low-latency designs are single-hop proxies such as the
 {\bf Anonymizer} \cite{anonymizer}, wherein a single trusted server strips the
 data's origin before relaying it.  These designs are easy to
-analyze, but users must trust the anonymizing proxy. 
+analyze, but users must trust the anonymizing proxy.
 Concentrating the traffic to a single point increases the anonymity set
 (the people a given user is hiding among), but it is vulnerable if the
 adversary can observe all traffic going into and out of the proxy.
@@ -294,7 +294,7 @@ The {\bf Java Anon Proxy} (also known as JAP or Web MIXes) uses fixed shared
 routes known as \emph{cascades}.  As with a single-hop proxy, this
 approach aggregates users into larger anonymity sets, but again an
 attacker only needs to observe both ends of the cascade to bridge all
-the system's traffic.  The Java Anon Proxy's design 
+the system's traffic.  The Java Anon Proxy's design
 calls for padding between end users and the head of the cascade
 \cite{web-mix}. However, it is not demonstrated whether the current
 implementation's padding policy improves anonymity.
@@ -340,7 +340,7 @@ Tor, they may accept TCP streams and relay the data in those streams
 along the circuit, ignoring the breakdown of that data into TCP segments
 \cite{morphmix:fc04,anonnet}. Finally, they may accept application-level
 protocols (such as HTTP) and relay the application requests themselves
-along the circuit.  
+along the circuit.
 Making this protocol-layer decision requires a compromise between flexibility
 and anonymity.  For example, a system that understands HTTP, such as Crowds,
 can strip
@@ -449,7 +449,7 @@ normalization} like Privoxy or the Anonymizer. If anonymization from
 the responder is desired for complex and variable
 protocols like HTTP, Tor must be layered with a filtering proxy such
 as Privoxy to hide differences between clients, and expunge protocol
-features that leak identity. 
+features that leak identity.
 Note that by this separation Tor can also provide services that
 are anonymous to the network yet authenticated to the responder, like
 SSH. Similarly, Tor does not currently integrate
@@ -473,7 +473,7 @@ compromise some fraction of the onion routers.
 In low-latency anonymity systems that use layered encryption, the
 adversary's typical goal is to observe both the initiator and the
 responder. By observing both ends, passive attackers can confirm a
-suspicion that Alice is 
+suspicion that Alice is
 talking to Bob if the timing and volume patterns of the traffic on the
 connection are distinct enough; active attackers can induce timing
 signatures on the traffic to force distinct patterns. Rather
@@ -509,7 +509,7 @@ each of these attacks.
 \Section{The Tor Design}
 \label{sec:design}
 
-The Tor network is an overlay network; each onion router (OR) 
+The Tor network is an overlay network; each onion router (OR)
 runs as a normal
 user-level process without any special privileges.
 Each onion router maintains a long-term TLS \cite{TLS}
@@ -524,7 +524,7 @@ runs local software called an onion proxy (OP) to fetch directories,
 establish circuits across the network,
 and handle connections from user applications.  These onion proxies accept
 TCP streams and multiplex them across the circuits. The onion
-router on the other side 
+router on the other side
 of the circuit connects to the destinations of
 the TCP streams and relays data.
 
@@ -578,8 +578,8 @@ and \emph{destroy} (to tear down a circuit).
 Relay cells have an additional header (the relay header) after the
 cell header, containing a stream identifier (many streams can
 be multiplexed over a circuit); an end-to-end checksum for integrity
-checking; the length of the relay payload; and a relay command.  
-The entire contents of the relay header and the relay cell payload 
+checking; the length of the relay payload; and a relay command.
+The entire contents of the relay header and the relay cell payload
 are encrypted or decrypted together as the relay cell moves along the
 circuit, using the 128-bit AES cipher in counter mode to generate a
 cipher stream.
@@ -622,7 +622,7 @@ without delaying streams and thereby harming user experience.\\
 A user's OP constructs circuits incrementally, negotiating a
 symmetric key with each OR on the circuit, one hop at a time. To begin
 creating a new circuit, the OP (call her Alice) sends a
-\emph{create} cell to the first node in her chosen path (call him Bob).  
+\emph{create} cell to the first node in her chosen path (call him Bob).
 (She chooses a new
 circID $C_{AB}$ not currently used on the connection from her to Bob.)
 The \emph{create} cell's
@@ -694,7 +694,7 @@ whether the decrypted streamID is recognized---either because it
 corresponds to an open stream at this OR for the given circuit, or because
 it is the control streamID (zero).  If the OR recognizes the
 streamID, it accepts the relay cell and processes it as described
-below.  Otherwise, 
+below.  Otherwise,
 the OR looks up the circID and OR for the
 next step in the circuit, replaces the circID as appropriate, and
 sends the decrypted relay cell to the next OR.  (If the OR at the end
@@ -713,19 +713,19 @@ encrypts the cell payload (that is, the relay header and payload) with
 the symmetric key of each hop up to that OR.  Because the streamID is
 encrypted to a different value at each step, only at the targeted OR
 will it have a meaningful value.\footnote{
-  % Should we just say that 2^56 is itself negligible?  
-  % Assuming 4-hop circuits with 10 streams per hop, there are 33 
+  % Should we just say that 2^56 is itself negligible?
+  % Assuming 4-hop circuits with 10 streams per hop, there are 33
   % possible bad streamIDs before the last circuit.  This still
   % gives an error only once every 2 million terabytes (approx).
 With 56 bits of streamID per cell, the probability of an accidental
 collision is far lower than the chance of hardware failure.}
 This \emph{leaky pipe} circuit topology
-allows Alice's streams to exit at different ORs on a single circuit.  
+allows Alice's streams to exit at different ORs on a single circuit.
 Alice may choose different exit points because of their exit policies,
 or to keep the ORs from knowing that two streams
 originate from the same person.
 
-When an OR later replies to Alice with a relay cell, it 
+When an OR later replies to Alice with a relay cell, it
 encrypts the cell's relay header and payload with the single key it
 shares with Alice, and sends the cell back toward Alice along the
 circuit.  Subsequent ORs add further layers of encryption as they
@@ -836,7 +836,7 @@ Thus, we check integrity only at the edges of each stream. When Alice
 negotiates a key with a new hop, they each initialize a SHA-1
 digest with a derivative of that key,
 thus beginning with randomness that only the two of them know. From
-then on they each incrementally add to the SHA-1 digest the contents of 
+then on they each incrementally add to the SHA-1 digest the contents of
 all relay cells they create, and include with each relay cell the
 first four bytes of the current digest.  Each also keeps a SHA-1
 digest of data received, to verify that the received hashes are correct.
@@ -851,7 +851,7 @@ of computing the digests is minimal compared to doing the AES
 encryption performed at each hop of the circuit. We use only four
 bytes per cell to minimize overhead; the chance that an adversary will
 correctly guess a valid hash
-%, plus the payload the current cell, 
+%, plus the payload the current cell,
 is
 acceptably low, given that Alice or Bob tear down the circuit if they
 receive a bad hash.
@@ -861,7 +861,7 @@ receive a bad hash.
 
 Volunteers are generally more willing to run services that can limit
 their own bandwidth usage. To accommodate them, Tor servers use a
-token bucket approach \cite{tannenbaum96} to 
+token bucket approach \cite{tannenbaum96} to
 enforce a long-term average rate of incoming bytes, while still
 permitting short-term bursts above the allowed bandwidth. Current bucket
 sizes are set to ten seconds' worth of traffic.
@@ -908,7 +908,7 @@ reimplement full TCP windows (with sequence numbers,
 the ability to drop cells when we're full and retransmit later, and so
 on),
 because TCP already guarantees in-order delivery of each
-cell. 
+cell.
 %But we need to investigate further the effects of the current
 %parameters on throughput and latency, while also keeping privacy in mind;
 %see Section~\ref{sec:maintaining-anonymity} for more discussion.
@@ -950,9 +950,9 @@ Currently, non-data relay cells do not affect the windows. Thus we
 avoid potential deadlock issues, for example, arising because a stream
 can't send a \emph{relay sendme} cell when its packaging window is empty.
 
-These arbitrarily chosen parameters 
+These arbitrarily chosen parameters
 %are probably not optimal; more
-%research remains to find which parameters 
+%research remains to find which parameters
 seem to give tolerable throughput and delay; more research remains.
 
 \Section{Other design decisions}
@@ -1042,7 +1042,7 @@ given host or network---an external adversary cannot eavesdrop traffic
 between the private exit and the final destination, and so is less sure of
 Alice's destination and activities. Most onion routers will function as
 \emph{restricted exits} that permit connections to the world at large,
-but prevent access to certain abuse-prone addresses and services. 
+but prevent access to certain abuse-prone addresses and services.
 Additionally, in some cases the OR can authenticate clients to
 prevent exit abuse without harming anonymity \cite{or-discex00}.
 
@@ -1134,7 +1134,7 @@ an adversary could take over the network by creating many servers
 server administrator before they are included. Mechanisms for automated
 node approval are an area of active research, and are discussed more
 in Section~\ref{sec:maintaining-anonymity}.
-  
+
 Of course, a variety of attacks remain. An adversary who controls
 a directory server can track clients by providing them different
 information---perhaps by listing only nodes under its control, or by
@@ -1214,7 +1214,7 @@ identity even in the presence of router failure. Bob's service must
 not be tied to a single OR, and Bob must be able to tie his service
 to new ORs. \textbf{Smear-resistant:}
 A social attacker who offers an illegal or disreputable location-hidden
-service should not be able to ``frame'' a rendezvous router by 
+service should not be able to ``frame'' a rendezvous router by
 making observers believe the router created that service.
 %slander-resistant? defamation-resistant?
 \textbf{Application-transparent:} Although we require users
@@ -1257,7 +1257,7 @@ application integration is described more fully below.
       rendezvous cookie that it will use to recognize Bob.
 \item Alice opens an anonymous stream to one of Bob's introduction
       points, and gives it a message (encrypted to Bob's public key)
-      which tells him 
+      which tells him
       about herself, her chosen RP and the rendezvous cookie, and the
       first half of a DH
       handshake. The introduction point sends the message to Bob.
@@ -1296,7 +1296,7 @@ service. During normal situations, Bob's service might simply be offered
 directly from mirrors, while Bob gives out tokens to high-priority users. If
 the mirrors are knocked down,
 %by distributed DoS attacks or even
-%physical attack, 
+%physical attack,
 those users can switch to accessing Bob's service via
 the Tor rendezvous system.
 
@@ -1369,7 +1369,7 @@ reveal traffic patterns (both sent and received). Profiling via user
 connection patterns requires further processing, because multiple
 application streams may be operating simultaneously or in series over
 a single circuit.
-  
+
 \emph{Observing user content.} While content at the user end is encrypted,
 connections to responders may not be (indeed, the responding website
 itself may be hostile). While filtering content is not a primary goal
@@ -1394,20 +1394,20 @@ by running the OP on the Tor node or behind a firewall. This approach
 requires an observer to separate traffic originating at the onion
 router from traffic passing through it: a global observer can do this,
 but it might be beyond a limited observer's capabilities.
-  
+
 \emph{End-to-end size correlation.} Simple packet counting
 will also be effective in confirming
 endpoints of a stream. However, even without padding, we have some
 limited protection: the leaky pipe topology means different numbers
 of packets may enter one end of a circuit than exit at the other.
-  
+
 \emph{Website fingerprinting.} All the effective passive
 attacks above are traffic confirmation attacks,
 which puts them outside our design goals. There is also
 a passive traffic analysis attack that is potentially effective.
 Rather than searching exit connections for timing and volume
 correlations, the adversary may build up a database of
-``fingerprints'' containing file sizes and access patterns for 
+``fingerprints'' containing file sizes and access patterns for
 targeted websites. He can later confirm a user's connection to a given
 site simply by consulting the database. This attack has
 been shown to be effective against SafeWeb \cite{hintz-pet02}.
@@ -1415,7 +1415,7 @@ It may be less effective against Tor, since
 streams are multiplexed within the same circuit, and
 fingerprinting will be limited to
 the granularity of cells (currently 256 bytes). Additional
-defenses could include 
+defenses could include
 larger cell sizes, padding schemes to group websites
 into large sets, and link
 padding or long-range dummies.\footnote{Note that this fingerprinting
@@ -1464,7 +1464,7 @@ connection.  There is also a danger that application
 protocols and associated programs can be induced to reveal information
 about the initiator. Tor depends on Privoxy and similar protocol cleaners
 to solve this latter problem.
-  
+
 \emph{Run an onion proxy.} It is expected that end users will
 nearly always run their own local onion proxy. However, in some
 settings, it may be necessary for the proxy to run
@@ -1478,7 +1478,7 @@ of the Tor network can increase the value of this traffic
 by attacking non-observed nodes to shut them down, reduce
 their reliability, or persuade users that they are not trustworthy.
 The best defense here is robustness.
-  
+
 \emph{Run a hostile OR.}  In addition to being a local observer,
 an isolated hostile node can create circuits through itself, or alter
 traffic patterns to affect traffic at other nodes. Nonetheless, a hostile
@@ -1488,8 +1488,8 @@ run multiple ORs, and can persuade the directory servers
 that those ORs are trustworthy and independent, then occasionally
 some user will choose one of those ORs for the start and another
 as the end of a circuit. If an adversary
-controls $m>1$ out of $N$ nodes, he should be able to correlate at most 
-$\left(\frac{m}{N}\right)^2$ of the traffic in this way---although an 
+controls $m>1$ out of $N$ nodes, he should be able to correlate at most
+$\left(\frac{m}{N}\right)^2$ of the traffic in this way---although an
 adversary
 could possibly attract a disproportionately large amount of traffic
 by running an OR with an unusually permissive exit policy, or by
@@ -1497,7 +1497,7 @@ degrading the reliability of other routers.
 
 \emph{Introduce timing into messages.} This is simply a stronger
 version of passive timing attacks already discussed earlier.
-  
+
 \emph{Tagging attacks.} A hostile node could ``tag'' a
 cell by altering it. If the
 stream were, for example, an unencrypted request to a Web site,
@@ -1506,14 +1506,14 @@ the association. However, integrity checks on cells prevent
 this attack.
 
 \emph{Replace contents of unauthenticated protocols.}  When
-relaying an unauthenticated protocol like HTTP, a hostile exit node 
+relaying an unauthenticated protocol like HTTP, a hostile exit node
 can impersonate the target server. Clients
 should prefer protocols with end-to-end authentication.
 
 \emph{Replay attacks.} Some anonymity protocols are vulnerable
 to replay attacks.  Tor is not; replaying one side of a handshake
 will result in a different negotiated session key, and so the rest
-of the recorded session can't be used.  
+of the recorded session can't be used.
 
 \emph{Smear attacks.} An attacker could use the Tor network for
 socially disapproved acts, to bring the
@@ -1558,7 +1558,7 @@ ORs in the final directory as he wishes. We must ensure that directory
 server operators are independent and attack-resistant.
 
 \emph{Encourage directory server dissent.}  The directory
-agreement protocol assumes that directory server operators agree on 
+agreement protocol assumes that directory server operators agree on
 the set of directory servers.  An adversary who can persuade some
 of the directory server operators to distrust one another could
 split the quorum into mutually hostile camps, thus partitioning
@@ -1567,7 +1567,7 @@ this attack.
 
 \emph{Trick the directory servers into listing a hostile OR.}
 Our threat model explicitly assumes directory server operators will
-be able to filter out most hostile ORs. 
+be able to filter out most hostile ORs.
 % If this is not true, an
 % attacker can flood the directory with compromised servers.
 
@@ -1579,7 +1579,7 @@ accepting TLS connections from ORs but ignoring all cells. Directory
 servers must actively test ORs by building circuits and streams as
 appropriate.  The tradeoffs of a similar approach are discussed in
 \cite{mix-acc}.\\
-  
+
 \noindent{\large\bf Attacks against rendezvous points}\\
 \emph{Make many introduction requests.}  An attacker could
 try to deny Bob service by flooding his introduction points with
@@ -1587,7 +1587,7 @@ requests.  Because the introduction points can block requests that
 lack authorization tokens, however, Bob can restrict the volume of
 requests he receives, or require a certain amount of computation for
 every request he receives.
-  
+
 \emph{Attack an introduction point.} An attacker could
 disrupt a location-hidden service by disabling its introduction
 points.  But because a service's identity is attached to its public
@@ -1612,7 +1612,7 @@ with a session key shared by Alice and Bob.
 
 \Section{Open Questions in Low-latency Anonymity}
 \label{sec:maintaining-anonymity}
- 
+
 In addition to the non-goals in
 Section~\ref{subsec:non-goals}, many other questions must be solved
 before we can be confident of Tor's security.
@@ -1645,7 +1645,7 @@ three nodes unrelated to herself and her destination.
 %
 %Thus normally she chooses
 %three nodes, but if she is running an OR and her destination is on an OR,
-%she uses five. 
+%she uses five.
 Should Alice choose a nondeterministic path length (say,
 increasing it from a geometric distribution) to foil an attacker who
 uses timing to learn that he is the fifth hop and thus concludes that
@@ -1684,7 +1684,7 @@ immediately beneficial because of real-world adversaries that can't
 observe Alice's router, but can run routers of their own?
 
 To scale to many users, and to prevent an attacker from observing the
-whole network at once, it may be necessary 
+whole network at once, it may be necessary
 to support far more servers than Tor currently anticipates.
 This introduces several issues.  First, if approval by a centralized set
 of directory servers is no longer feasible, what mechanism should be used
@@ -1724,7 +1724,7 @@ Tor brings together many innovations into a unified deployable system. The
 next immediate steps include:
 
 \emph{Scalability:} Tor's emphasis on deployability and design simplicity
-has led us to adopt a clique topology, semi-centralized 
+has led us to adopt a clique topology, semi-centralized
 directories, and a full-network-visibility model for client
 knowledge. These properties will not scale past a few hundred servers.
 Section~\ref{sec:maintaining-anonymity} describes some promising
@@ -1831,7 +1831,7 @@ our overall usability.
 %     'Cypherpunk', 'Cypherpunks', 'Cypherpunk remailer'
 %     'Onion Routing design', 'onion router' [note capitalization]
 %     'SOCKS'
-%     Try not to use \cite as a noun.  
+%     Try not to use \cite as a noun.
 %     'Authorizating' sounds great, but it isn't a word.
 %     'First, second, third', not 'Firstly, secondly, thirdly'.
 %     'circuit', not 'channel'
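
Every hunk above removes only trailing spaces, which is what the commit message claims. As a minimal sketch (not the tooling actually used for this commit, which the log does not record), the cleanup and the "no substantive changes" check could be reproduced as follows. The function names are hypothetical, and the code assumes LF line endings.

```python
def strip_trailing_whitespace(text: str) -> str:
    """Remove trailing spaces and tabs from each line (assumes LF line endings)."""
    return "\n".join(line.rstrip(" \t") for line in text.split("\n"))


def differs_only_in_trailing_whitespace(old: str, new: str) -> bool:
    """Return True if old and new are identical once trailing whitespace is ignored."""
    return strip_trailing_whitespace(old) == strip_trailing_whitespace(new)


# Sample taken from the first hunk: the old line ends in "The " (with a space).
old = "(like the layers of an onion) and relayed downstream. The \nOnion Routing project\n"
new = "(like the layers of an onion) and relayed downstream. The\nOnion Routing project\n"

cleaned = strip_trailing_whitespace(old)
whitespace_only = differs_only_in_trailing_whitespace(old, new)
```

Running the check over the full pre- and post-commit versions of doc/tor-design.tex would be one way to confirm that the 50 changed lines carry no change in content.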