aboutsummaryrefslogtreecommitdiff
path: root/proposals/202-improved-relay-crypto.txt
diff options
context:
space:
mode:
authorNick Mathewson <nickm@torproject.org>2012-06-19 12:14:37 -0400
committerNick Mathewson <nickm@torproject.org>2012-06-19 12:14:45 -0400
commit8ab02c54456fa28acc8e51ef304c65ef95b53304 (patch)
tree2dfec8d74331cbbc6953201b3c22bff3f386ed80 /proposals/202-improved-relay-crypto.txt
parent1c30043671507c521cfa6150c554d7028e1daf96 (diff)
downloadtorspec-8ab02c54456fa28acc8e51ef304c65ef95b53304.tar.gz
torspec-8ab02c54456fa28acc8e51ef304c65ef95b53304.zip
Add proposal 202: Two improved relay encryption protocols for Tor cells
Diffstat (limited to 'proposals/202-improved-relay-crypto.txt')
-rw-r--r--proposals/202-improved-relay-crypto.txt816
1 files changed, 816 insertions, 0 deletions
diff --git a/proposals/202-improved-relay-crypto.txt b/proposals/202-improved-relay-crypto.txt
new file mode 100644
index 0000000..dafafdd
--- /dev/null
+++ b/proposals/202-improved-relay-crypto.txt
@@ -0,0 +1,816 @@
+Filename: 202-improved-relay-crypto.txt
+Title: Two improved relay encryption protocols for Tor cells
+Author: Nick Mathewson
+Created: 19 Jun 2012
+Status: Open
+
+
+Overview:
+
+ This document describes two candidate designs for a better Tor
+ relay encryption/decryption protocol, designed to stymie tagging
+ attacks and better accord with best practices in protocol design.
+
+ My hope is that readers will examine these protocols, evaluate their
+ security and practicality, improve on them, and help to pick one for
+ implementation in Tor.
+
+ In section 1, I describe Tor's current relay crypto protocol, its
+ drawbacks, how it fits in with the rest of Tor, and some
+ requirements/desiderata for a replacement. In sections 2 and 3, I
+ propose two alternative replacements for this protocol. In section
+ 4, I discuss their pros and cons.
+
+1. Background and Motivation
+
+1.0. A short overview of Tor's current protocols
+
+ The core pieces of the Tor protocol are the link protocol,
+ the circuit extend protocol, the relay protocol, and the stream
+ protocol. All are documented in [TorSpec].
+
+ Briefly:
+
+ - The link protocol handles all direct pairwise communication
+ between nodes. Everything else is transmitted over it. It
+ uses TLS.
+
+ - The circuit extend protocol uses public-key crypto to set up
+ multi-node virtual tunnels, called "circuits", from a client
+ through one or more nodes.
+
+ *** The relay protocol uses established circuits to communicate
+ from a client to a node on a circuit. That's the one we'll
+ be talking about here. ***
+
+ - The stream protocol is tunneled over relay protocol; clients
+ use it to tell servers to open anonymous TCP connections, to
+ send data, and so forth. Servers use it to report success or
+ failure opening anonymous TCP connections, to send data from
+ TCP connections back to clients, and so forth.
+
+ In more detail: The link protocol's job is to connect two
+ computers with an encrypted, authenticated stream, to
+ authenticate one or both of them to the other, and to provide a
+ channel for passing "cells" between them. The circuit extend
+ protocol's job is to set up circuits: persistent tunnels that
+ connect a Tor client to an exit node through a series of
+ (typically three) hops, each of which knows only the previous and
+ next hop, and each of which has a set of keys that they share
+ only with the client. Finally, the relay protocol's job is to
+ allow a client to communicate bidirectionally with the node(s) on
+ the circuit, once their shared keys have been established.
+
+ (We'll ignore the stream protocol for the rest of this document.)
+
+ Note on terminology: Tor nodes are sometimes called "servers",
+ "relays", or "routers". I'll use all these terms more or less
+ interchangeably. For simplicity's sake, I will call the party
+ who is constructing and using a circuit "the client" or "Alice",
+ even though nodes sometimes construct circuits too.
+
+ Tor's internal packets are called "cells". All the cells we deal
+ with here are 512 bytes long.
+
+ The nodes in a circuit are called its "hops"; most circuits are 3
+ hops long. This doesn't include the client: when Alice builds a
+ circuit through nodes Bob_1, Bob_2, and Bob_3, the first hop is
+ "Bob_1" and the last hop is "Bob_3".
+
+1.1. The current relay protocol and its drawbacks
+
+ [This section describes Tor's current relay protocol. It is not a
+ proposal; rather it is what we do now. Sections 2 and 3 have my
+ proposed replacements for it.]
+
+ A "relay cell" is a cell that is generated by the client to send
+ to a node, or by a node to send to the client. It's called a
+ "relay" cell because a node that receives one may need to relay
+ it to the next or previous node in the circuit (or to handle the
+ cell itself).
+
+ A relay cell moving towards the client is called "inbound"; a
+ cell moving away is called "outbound".
+
+ When a relay cell is constructed by the client, the client adds one
+ layer of crypto for each node that will process the cell, and gives
+ the cell to the first node in the circuit. Each node in turn then
+ removes one layer of crypto, and either forwards the cell to the next
+ node in the circuit or acts on that cell itself.
+
+ When a relay cell is constructed by a node, it encrypts it and sends
+ it to the preceding node in the circuit. Each node between the
+ originating node and the client then encrypts the cell and passes it
+ back to the preceding node. When the client receives the cell, it
+ removes layers of crypto until it has an unencrypted cell, which it
+ then acts on.
+
+ In the current protocol, the body of each relay cell contains, in
+ its unencrypted form:
+
+ Relay command [1 byte]
+ Zeros [2 bytes]
+ StreamID [2 bytes]
+ Digest [4 bytes]
+ Length [2 bytes]
+ Data [498 bytes]
+
+ (This only adds up to 509 bytes. That's because the Tor link
+ protocol transfers 512-byte cells, and has a 3 byte overhead per
+ cell. Not how we would do it if we were starting over at this
+ point.)
+
+ At every node of a circuit, the node relaying a cell
+ encrypts/decrypts it with AES128-CTR. If the cell is outbound
+ and the "zeros" field is set to all-zeros, the node additionally
+ checks whether 'digest' is correct. A correct digest is the
+ first 4 bytes of the running SHA1 digest of: a shared secret,
+ concatenated with all the relay cells destined for this node on
+ this circuit so far, including this cell. If _that's_ true, then
+ the node accepts this cell. (See section 6 of [TorSpec] for full
+ detail; see section A.1 for a more rigorous description.)
+
+ The current approach has some actual problems. Notably:
+
+ * It permits tagging attacks. Because AES_CTR is an XOR-based
+ stream cipher, an adversary who controls the first node in a
+ circuit can XOR anything he likes into the relay cell, and
+ then see whether he/she encounters an correspondingly
+ defaced cell at some exit that he also controls.
+
+ That is, the attacker picks some pattern P, and when he
+ would transmit some outbound relay cell C at hop 1, he
+ instead transmits C xor P. If an honest exit receives the
+ cell, the digest will probably be wrong, and the honest exit
+ will reject it. But if the attacker controls the exit, he
+ will notice that he has received a cell C' where the digest
+ doesn't match, but where C' xor P _does_ have a good digest.
+ The attacker will then know that his two nodes are on the
+ same circuit, and thereby be able to link the user (whom the
+ first node sees) to her activities (which the last node sees).
+
+ Back when we did the Tor design, this didn't seem like a
+ big deal, since an adversary who controls both the first and
+ last node in a circuit is presumed to win already based on
+ traffic correlation attacks. This attack seemed strictly
+ worse than that, since it was trivially detectable in the
+ case where the attacker _didn't_ control both ends. See
+ section 4.4 of the Tor paper [TorDesign] for our early
+ thoughts here; see Xinwen Fu et al's 2009 work for a more
+ full explanation of the in-circuit tagging attack [XF]; and
+ see "The 23 Raccoons'" March 2012 "Analysis of the Relative
+ Severity of Tagging Attacks" mail on tor-dev (and the
+ ensuing thread) for a discussion of why we may want to care
+ after all, due to attacks that use tagging to amplify route
+ capture. [23R]
+
+ It also has some less practical issues.
+
+ * The digest portion is too short. Yes, if you're an attacker
+ trying to (say) change an "ls *" into an "rm *", you can
+ only expect to get one altered cell out of 2^32 accepted --
+ and all future cells on the circuit will be rejected with
+ similar probability due to the change in the running hash
+ -- but 1/2^32 is a pretty high success rate for crypto attacks.
+
+ * It does MAC-then-encrypt. That makes smart people cringe.
+
+ * Its approach to MAC is H(Key | Msg), which is vulnerable to
+ length extension attack if you've got a Merkle-Damgard hash
+ (which we do). This isn't relevant in practice right now,
+ since the only parties who see the digest are the two
+ parties that rely on it (because of the MAC-then-encrypt).
+
+
+1.2. Some feature requirements
+
+ Relay cells can originate at the client or at a relay. Relay cells
+ that originate at the client are given to the first node in the
+ circuit, and constructed so that they will be decrypted and forwarded
+ by the first M-1 nodes in the circuit, and finally decrypted and
+ processed by the Mth node, where the client chooses M. (Usually, the
+ Mth node is the the last one, which will be an exit node.) Relay
+ cells that originate at a hop in the circuit are given to the
+ preceding node, and eventually delivered to the client.
+
+ Tor provides a so called "leaky pipe" circuit topology
+ [TorDesign]: a client can send a cell to any node in the circuit,
+ not just the last node. I'll try to keep that property, although
+ historically we haven't really made use of it.
+
+ In order to implement telescoping circuit construction (where the
+ circuit is built by iteratively asking the last node in the
+ circuit to extend the circuit one hop more), it's vital that the
+ last hop of the circuit be able to change.
+
+ Proposal 188 [Prop188] suggests that we implement a "bridge guards"
+ feature: making some (non-public) nodes insert an extra hop into
+ the circuit after themselves, in a way that will make it harder
+ for other nodes in the network to enumerate them. We
+ therefore want our circuits to be one-hop re-extensible: when the
+ client extends a circuit from Bob1 to Bob2, we want Bob1 to be
+ able to introduce a new node "Bob1.5" into the circuit such that
+ the circuit runs Alice->Bob1->Bob1.5->Bob2. (This feature has
+ been called "loose source routing".)
+
+ Any new approach should be able to coexist on a circuit
+ with the old approach. That is, if Alice wants to build a
+ circuit through Bob1, Bob2, and Bob3, and only Bob2 supports a
+ revised relay protocol, then Alice should be able to build a
+ circuit such that she can have Bob1 and Bob3 process each cell
+ with the current protocol, and Bob2 process it with a revised
+ protocol. (Why? Because if all nodes in a circuit needed to use
+ the same relay protocol, then each node could learn information
+ about the other nodes in the circuit from which relay protocol
+ was chosen. For example, if Bob1 supports the new protocol, and
+ sees that the old relay protocol is in use, and knows that Bob2
+ supports the new one, then Bob1 has learned that Bob3 is some
+ node that does not support the new relay protocol.)
+
+ Cell length needs to be constant as cells move through the
+ network. For historical reasons mentioned above in section 1.1,
+ the length of the encrypted part of a relay cell needs to be 509
+ bytes.
+
+1.3. Some security requirements
+
+ Two adjacent nodes on a circuit can trivially tell that they are
+ on the same circuit, and the first node can trivially tell who
+ the client is. Other than that, we'd like any attacker who
+ controls a node on the circuit not to have a good way to learn
+ any other nodes, even if he/she controls those nodes. [*]
+
+ Relay cells should not be malleable: no relay should be able to
+ alter a cell between an honest sender and an honest recipient in
+ a way that they cannot detect.
+
+ Relay cells should be secret: nobody but the sender and recipient
+ of a relay cell should be able to learn what it says.
+
+ Circuits should resist transparent, recoverable tagging attacks:
+ if an attacker controls one node in a circuit and alters a relay
+ cell there, no non-adjacent point in the circuit should be able
+ to recover the relay cell as it would have received it had the
+ attacker not altered it.
+
+ The above properties should apply to sequences of cells too:
+ an attacker shouldn't be able to change what sequence of cells
+ arrives at a destination (for example, by removing, replaying, or
+ reordering one or more cells) without being detected.
+
+ (Feel free to substitute whatever formalization of the above
+ requirements makes you happiest, and add whatever caveats are
+ necessary to make you comfortable. I probably missed at least
+ two critical properties.)
+
+ [*] Of course, an attacker who controls two points on a circuit
+ can probably confirm this through traffic correlation. But
+ we'd prefer that the cryptography not create other, easier
+ ways for them to do this.
+
+1.4. A note on algorithms
+
+ This document is deliberately agnostic concerning the choice of
+ cryptographic primitives -- not because I have no opinions about
+ good ciphers, MACs, and modes of operation -- but because
+ experience has taught me that mentioning any particular
+ cryptographic primitive will prevent discussion of anything else.
+
+ Please DO NOT suggest algorithms to use in implementing these
+ protocols yet. It will distract! There will be time later!
+
+ If somebody _else_ suggests algorithms to use, for goodness' sake
+ DON'T ARGUE WITH THEM! There will be time later!
+
+
+2. Design 1: Large-block encryption
+
+ In this approach, we use a tweakable large-block cipher for
+ encryption and decryption, and a tweak-chaining function TC.
+
+2.1. Chained large-block what now?
+
+ We assume the existence of a primitive that provides the desired
+ properties of a tweakable[Tweak] block cipher, taking blocks of any
+ desired size. (In our case, the block size is 509 bytes[*].)
+
+ It also takes a Key, and a per-block "tweak" parameter that plays
+ the same role that an IV plays in CBC, or that the counter plays
+ in counter mode.
+
+ The Tweak-chaining function TC takes as input a previous tweak, a
+ tweak chaining key, and a cell; it outputs a new tweak. Its
+ purpose is to make future cells undecryptable unless you have
+ received all previous cells. It could probably be something like
+ a MAC of the old tweak and the cell using the tweak chaining key
+ as the MAC key.
+
+ (If the initial tweak is secret, I am not sure that TC needs to
+ be keyed.)
+
+ [*] Some large-block cipher constructions use blocks whose size is
+ the multiple of some underlying cipher's block size. If we wind
+ up having to use one of those, we can use 496-byte blocks instead
+ at the cost of 2.5% wasted space.
+
+2.2. The protocol
+
+2.2.1. Setup phase
+
+ The circuit construction algorithm needs to produce forward and
+ backward keys Kf and Kb, the forward and backward tweak chaining
+ keys TCKf and TCKb, as well as initial tweak values Tf and Tb.
+
+2.2.2. The cell format
+
+ We replace the "Digest" and "Zeros" fields of the cell with a
+ single Z-byte "Zeros" field to determine when the cell is
+ recognized and correctly decrypted; its length is a security
+ parameter.
+
+2.2.3. The decryption operations
+
+ For a relay to handle an inbound RELAY cell, it sets Tb_next to
+ TC(TCKb, Tb, Cell). Then it encrypts the cell using the large
+ block cipher keyed with Kb and tweaked with Tb. Then it sets Tb
+ to Tb_next.
+
+ For a relay to handle an outbound RELAY cell, it sets Tf_next to
+ TC(TCKf, Tf, Cell). Then it decrypts the cell using the large
+ block cipher keyed with Kf and tweaked with Tf. Then it sets Tf
+ to Tf_next. Then it checks the 'Zeros' field on the cell;
+ if that field is all [00] bytes, the cell is for us.
+
+2.3. Security discussion
+
+ This approach is fairly simple (at least, no more complex than
+ its primitives) and achieves some of our security goals. Because
+ of the large block cipher approach, any change to a cell will
+ render that cell undecryptable, and indistinguishable from random
+ junk. Because of the tweak chaining approach, if even one cell
+ is missed or corrupted or reordered, future cells will also
+ decrypt into random junk.
+
+ The tagging attack in this case is turned into a circuit-junking
+ attack: an adversary who tries to mount it can probably confirm
+ that he was indeed first and last node on a target circuit
+ (assuming that circuits which turn to junk in this way are rare),
+ but cannot recover the circuit after that point.
+
+ As a neat side observation, note that this approach improves upon
+ Tor's current forward secrecy, by providing forward secrecy while
+ circuits are still operational, since each change to the tweak
+ should make previous cells undecryptable if the old tweak value
+ isn't recoverable.
+
+ The length of Zeros is a parameter for what fraction of "random
+ junk" cells will potentially be accepted by a router or client.
+ If Zeros is Z bytes long, then junk cells will be accepted with
+ P < 2^-(8*Z + 7). (The '+7' is there because the top 7 bits of
+ the Length field must also be 0.)
+
+ There's no trouble using this protocol in a mixed circuit, where
+ some nodes speak the old protocol and some speak the
+ large-block-encryption protocol.
+
+3. Design 2: short-MAC-and-pad
+
+ In this design, we behave more similarly to a mix-net design
+ (such as Mixmaster or Mixminion's headers). Each node checks a
+ MAC, and then re-pads the cell to its chosen length before
+ decoding the cell.
+
+ This design uses as a primitive a MAC and a stream cipher. It
+ might also be possible to use an authenticating cipher mode,
+ if we can find one that works like a stream cipher and allows us
+ to efficiently output authenticators for the stream so far.
+
+ NOTE TO AE/AEAD FANS: The encrypt-and-MAC model here could be
+ replaced with an authenticated encryption mode without too much
+ loss of generality.
+
+3.1. The protocol
+
+3.1.1 Setup phase
+
+ The circuit construction algorithm needs to produce forward and
+ backward keys Kf and Kb, forward and backward stream cipher IVs
+ IVf and IVb, and forward and backward MAC keys Mf and Mb.
+
+ Additionally, the circuit construction algorithm needs a way for
+ the client to securely (and secretly? XXX) tell each hop in the
+ circuit a value 'bf' for the number of bytes of MAC it should
+ expect on outbound cells and 'bb' for the number of bytes it
+ should use on cells it's generating. Each node can get a
+ different 'bf' and 'bb'. These values can be 0: if a node's bf
+ is 0, it doesn't authenticate cells; if its bb is 0, it doesn't
+ originate them.
+
+ The circuit construction algorithm also needs a way to tell each
+ the client to securely (and secretly? XXX) tell each hop in the
+ circuit whether it is allowed to be the final destination for
+ relay cells.
+
+ Set the stream Sf and the stream Sb to empty values.
+
+3.1.2. The cell format
+
+ The Zeros and Digest field of the cell format are removed.
+
+3.1.3. The relay operations
+
+ Upon receiving an outbound cell, a node removes the first b bytes
+ of the cell, and puts them aside as 'M'. The node then computes
+ between 0 and 2 MACs of the stream consisting of all previously
+ MAC'd data, plus the remainder of the cell:
+
+ If b>0 and there is a next hop, the node computes M_relay.
+
+ If this node was told to deliver traffic, or it is the last
+ node in the circuit so far, the node computes M_receive.
+
+ M_relay is computed as MAC(stream | "relay"); M_receive is
+ computed as MAC(stream | "receive").
+
+ If M = M_receive, this cell is for the node; it should process
+ it.
+
+ If M = M_relay, or if b == 0, this cell should be relayed.
+
+ If a MAC was computed and neither of the above cases was met,
+ then the cell is bogus; the node should discard it and destroy
+ the circuit.
+
+ The node then removes the first bf bytes of the cell, and pads the
+ cell at the end with bf zero bytes. Finally, the node decrypts
+ the whole remaining padded cell with the stream cipher.
+
+ To handle an inbound cell, the node simply does a stream cipher
+ with no checking.
+
+3.1.4. Generating inbound cells.
+
+ To generate an inbound cell, a node makes a 509-bb byte RELAY
+ cell C, encrypts that cell with Kb, appends the encrypted cell to
+ Sb, and prepends M_receive(Sb) to the cell.
+
+3.1.5. Generating outbound cells
+
+ Generating an outbound cell is harder, since we need to know what
+ padding the earlier nodes will generate in order to know what
+ padding the later nodes will receive and compute their MACs, but
+ we also need to know what MACs we'll send to the later nodes in
+ order to compute which MACs we'll send to the earlier ones.
+
+ Mixnet clients have needed to do this for ages, though, so the
+ algorithms are pretty well settled. I'll give one below in A.3.
+
+3.2. Security discussion
+
+ This approach is also simple and (given good parameter choices)
+ can achieve our goals. The trickiest part is the algorithm that
+ clients must follow to package cells, but that's no so bad.
+
+ It's not necessary to check MACs on inbound traffic, because
+ nobody but the client can tell scrambled messages from good ones,
+ and the client can be trusted to keep the client's own secrets.
+
+ With this protocol, if the attacker tries to do a tagging attack,
+ the circuit should get destroyed by the next node in the circuit
+ that has a nonzero "bf" value, with probability == 1-2^-(8*bf).
+ (If there are further intervening honest nodes, they also have a
+ chance to detect the attack.)
+
+ Similarly, any attempt to replay, or reorder outbound cells
+ should fail similarly.
+
+ The "bf" values could reveal to each node its position in the
+ circuit and the client preferences, depending on how we set them;
+ on the other hand, having a fixed bf value would reveal to the
+ last node the length of the circuit. Neither option seems great.
+
+ This protocol doesn't provide any additional forward secrecy
+ beyond what Tor has today. We could fix that by changing our use
+ of the stream cipher so that instead of running in counter mode
+ between cells, we use a tweaked stream cipher and change the
+ tweak with each cell (possibly based on the unused portion of the
+ MAC).
+
+ This protocol does support loose source routing, so long as the
+ no padding bytes are added by any router-added nodes.
+
+ In a circuit, every node has at least one relay cell sent to it:
+ even non-exit nodes get a RELAY_EXTEND cell.
+
+4. Discussion
+
+ I'm not currently seeing a reason to strongly prefer one of these
+ approaches over another.
+
+ In favor of large-block encryption:
+ - The space overhead seems smaller: we need to use up fewer
+ bytes in order to get equivalent looking security.
+
+ (For example, if we want to have P < 2^64 that a cell altered
+ by hop 1 could be accepted by hop 2 or hop 3, *and* we want P
+ < 2^64 that a cell altered by hop 2 could be accepted by hop
+ 3, the large-block approach needs about 8 bytes for the Zeros
+ field, whereas the short-MAC-and-pad approach needs 16 bytes
+ worth of MACs.)
+
+ - We get forward secrecy pretty easily.
+
+ - The format doesn't leak anything about the length of the
+ circuit, or limit it.
+
+ - We don't need complicated logic to set the 'b' parameters.
+
+ - It doesn't need tricky padding code.
+
+ In the favor of short-MAC-and-pad:
+ - All of the primitives used are much better analyzed and
+ understood. There's nothing esoteric there. The format
+ itself is similar to older, well-analyzed formats.
+
+ - Most of the constructions for the large-block-cipher approach
+ seem less efficient in CPU cycles than a good stream cipher
+ and a MAC. (But I don't want to discuss that now; see section
+ 1.4 above!)
+
+ Unclear:
+
+ - Suppose that an attacker controls the first and last hop of a
+ circuit, and tries an end-to-end tagging attack. With
+ large-block encryption, the tagged cell and all future cells
+ on the circuit turn to junk after the tagging attack, with
+ P~~100%. With short-MAC-and-pad, the circuit is destroyed at
+ the second hop, with P ~~ 1- 2^(-8*bf). Is one of these
+ approaches materially worse for the attacker?
+
+ - Can we do better than the "compute two MACs" approach for
+ distinguishing the relay and receive cases of the
+ short-MAC-and-pad protocol?
+
+ - To what extent can we improve these protocols?
+
+ - If we do short-MAC-and-pad, should we apply the forward
+ security hack alluded to in section 3.2?
+
+5. Acknowledgments
+
+ Thanks to the many reviewers of the initial drafts of this
+ proposal. If you can make any sense of what I'm saying, they
+ deserve much of the credit.
+
+A. Formal description
+
+ Note that in all these cases, more efficient descriptions exist.
+
+A.1. The current Tor relay protocol.
+
+ Relay cell format:
+
+ Relay command [1 byte]
+ Zeros [2 bytes]
+ StreamID [2 bytes]
+ Digest [4 bytes]
+ Length [2 bytes]
+ Data [498 bytes]
+
+ Circuit setup:
+
+ (Specified elsewhere; the client negotiates with each router in
+ a circuit the secret AES keys Kf, Kb, and the secret 'digest
+ keys' Df, and Db. They initialize AES counters Cf and Cb to
+ 0. They initialize the digest stream Sf to Df, and Sb to Db.)
+
+ HELPER FUNCTION: CHECK(Cell [in], Stream [in,out]):
+
+ 1. If the Zeros field of Cell is not [00 00], return False.
+ 2. Let Cell' = Cell with its Digest field set to [00 00 00 00].
+ 3. Let S' = Stream | Cell'.
+ 4. If SHA1(S') = the Digest field of Cell, set Stream to S',
+ and return True.
+ 5. Otherwise return False.
+
+ HELPER FUNCTION: CONSTRUCT(Cell' [in], Stream [in,out])
+
+ 1. Set the Zeros and Digest field of Cell' to [00] bytes.
+ 2. Set Stream to Stream | Cell'.
+ 3. Construct Cell from Cell' by setting the Digest field to
+ SHA1(Stream), and taking all other fields as-is.
+ 4. Return Cell.
+
+ HELPER_FUNCTION: ENCRYPT(Cell [in,out], Key [in], Ctr [in,out])
+ 1. Encrypt Cell's contents using AES128_CTR, with key 'Key' and
+ counter 'Ctr'. Increment 'Ctr' by the length of the cell.
+
+ HELPER_FUNCTION: DECRYPT(Cell [in,out], Key [in], Ctr [in,out])
+ 1. Same as ENCRYPT.
+
+
+ Router operation, upon receiving an inbound cell -- that is, one
+ sent towards the client.
+
+ 1. ENCRYPT(cell, Kb, Cb)
+ 2. Send the decrypted contents towards the client.
+
+ Router operation, upon receiving an outbound cell -- that is, one
+ sent away from the client.
+
+ 1. DECRYPT(cell, Kf, Cf)
+ 2. If CHECK(Cell, Sf) is true, this cell is for us. Do not
+ relay the cell.
+ 3. Otherwise, this cell is not for us. Send the decrypted cell
+ to the next hop on the circuit, or discard it if there is no
+ next hop.
+
+ Router operation, to create a relay cell that will be delivered
+ to the client.
+
+ 1. Construct a Relay cell Cell' with the relay command, length,
+ stream ID, and body fields set as appropriate.
+ 2. Let Cell = CONSTRUCT(Cell', Sb).
+ 3. ENCRYPT(Cell, Kb, Cb)
+ 4. Send the encrypted cell towards the client.
+
+ Client operation, receiving an inbound cell.
+
+ For each hop H in a circuit, starting with the first hop and
+ ending (possibly) with the last:
+
+ 1. DECRYPT(Cell, Kb_H, Cb_H)
+
+ 2. If CHECK(Cell, Sb_H) is true, this cell was sent from hop
+ H. Exit the loop, and return the cell in its current
+ form.
+
+ If we reach the end of the loop without finding the right hop,
+ the cell is bogus or corrupted.
+
+ Client operation, sending an outbound cell to hop H.
+
+ 1. Construct a Relay cell Cell' with the relay command, length,
+ stream ID, and body fields set as appropriate.
+ 2. Let Cell = CONSTRUCT(Cell', Sf_H)
+ 3. For i = H..1:
+ A. ENCRYPT(Cell, Sf_i, Cf_i)
+ 4. Deliver Cell to the first hop in the circuit.
+
+A.2. The large-block-cipher protocol
+
+ Same as A.1, except for the following changes.
+
+ The cell format is now:
+ Zeros [Z bytes]
+ Length [2 bytes]
+ StreamID [2 bytes]
+ Relay command [1 byte]
+ Data [504-Z bytes]
+
+ Ctr is no longer a counter, but a Tweak value.
+
+ Each key is now a tuple of (Key_Crypt, Key_TC)
+
+ Streams are no longer used.
+
+ HELPER FUNCTION: CHECK(Cell [in], Stream [in,out])
+ 1. If the Zeros field of Cell contains only [00] bytes,
+ return True.
+ 2. Otherwise return false.
+
+ HELPER FUNCTION: CONSTRUCT(Cell' [in], Stream [in,out])
+ 1. Let Cell be Cell', with its "Zeros" field set to Z [00]
+ bytes.
+ 2. Return Cell'.
+
+ HELPER FUNCTION: ENCRYPT(Cell [in,out], Key [in], Tweak [in,out])
+ 2. Encrypt Cell using Key and Tweak
+ 1. Let Tweak' = TC(Key_TC, Tweak, Cell)
+ 3. Set Tweak to Tweak'.
+
+ HELPER FUNCTION: DECRYPT(Cell [in,out], Key [in], Tweak [in,out])
+ 1. Let Tweak' = TC(Key_TC, Tweak, Cell)
+ 2. Decrypt Cell using Key and Tweak
+ 3. Set Tweak to Tweak'.
+
+A.3. The short-MAC-and-pad protocol.
+
+ Define M_relay(K,S) as MAC(K, S|"relay").
+ Define M_receive(K,S) as MAC(K, S|"receive").
+ Define Z(n) as a series of n [00] bytes.
+ Define BODY_LEN as 509
+
+ The cell message format is now:
+
+ Relay command [1 byte]
+ StreamID [2 bytes]
+ Length [2 bytes]
+ Data [variable bytes]
+
+ Helper function: CHECK(Cell [in], b [in], K [in], S [in,out]):
+ Let M = Cell[0:b]
+ Let Rest = Cell[b:...]
+ If b == 0:
+ Return (nil, Rest)
+ Let S' = S | Rest
+ If M == M_relay(K,S')[0:b]:
+ Set S = S'
+ Return ("relay", Rest)
+ If M == M_receive(K,S')[0:b]:
+ Set S = S'
+ Return ("receive", Rest)
+ Return ("BAD", nil)
+
+ HELPER_FUNCTION: ENCRYPT(Cell [in,out], Key [in], Ctr [in,out])
+ 1. Encrypt Cell's contents using AES128_CTR, with key 'Key' and
+ counter 'Ctr'. Increment 'Ctr' by the length of the cell.
+
+ HELPER_FUNCTION: DECRYPT(Cell [in,out], Key [in], Ctr [in,out])
+ 1. Same as ENCRYPT.
+
+ Router operation, upon receiving an inbound cell:
+ 1. ENCRYPT(cell, Kb, Cb)
+ 2. Send the decrypted contents towards the client.
+
+ Router operation, upon receiving an outbound cell:
+ 1. Let Status, Msg = CHECK(Cell, bf, Mf, Sf)
+ 2. If Status == "BAD", drop the cell and destroy the circuit.
+ 3. Let Cell' = Msg | Z(BODY_LEN - len(Msg))
+ 4. DECRYPT(Cell', Kf, Cf) [*]
+ 5. If Status == "receive" or (Status == nil and there is no
+ next hop), Cell' is for us: process it.
+ 6. Otherwise, send Cell' to the next node.
+
+ Router operation, sending a cell towards the client:
+ 1. Let Body = a relay cell body of BODY_LEN-bb_i bytes.
+ 2. Let Cell' = ENCRYPT(Body, Kb, Cb)
+ 3. Let Sb = Sb | Cell'
+ 4. Let M = M_receive(Mb, Sb)[0:b]
+ 5. Send the cell M | Cell' back towards the client.
+
+ Client operation, upon receiving an inbound cell:
+
+ For each hop H in the circuit, from first to last:
+
+ 1. Let Status, Msg = CHECK(Cell, bb_i, Mb_i, Sb_i)
+ 2. If Status = "relay", drop the cell and destroy
+ the circuit. (BAD is okay; it means that this hop didn't
+ originate the cell.)
+ 3. DECRYPT(Msg, Kb_i, Cb_i)
+ 4. If Status = "receive", this cell is from hop i; process
+ it.
+ 5. Otherwise, set Cell = Msg.
+
+ Client operation, sending an outbound cell:
+
+ Let BF = the total of all bf_i values.
+
+ 1. Construct a relay cell body Msg of BODY_LEN-BF bytes.
+ 2. For each hop i, let Stream_i = ENCRYPT(Kf_i,Z(CELL_LEN),Cf_i)
+ 3. Let Pad_0 = "".
+ 4. For i in range 1..N, where N is the number of hops:
+ Let Pad_i = Pad_{i-1} | Z(bf_i)
+ Let S_last = the last len(Pad_i) bytes of Stream_i.
+ Let Pad_i = Pad_i xor S_last
+ Now Pad_i is the padding as it will stand after node i
+ has processed it.
+
+ 5. For i in range N..1, where N is the number of hops:
+ If this is the last hop, let M_* = M_receive. Else let
+ M_* = M_relay.
+
+ Let Body = Msg xor the first len(Msg) bytes of Stream_i
+
+ Let M = M_*(Mf, Body | Pad_(i-1))
+
+ Set Msg = M[:bf_i] | Body
+
+ 6. Send Msg outbound to the first relay in the circuit.
+
+
+ [*] Strictly speaking, we could omit the pad-and-decrypt
+ operation once we know we're the final hop.
+
+
+
+R. References
+
+[Prop188] Tor Proposal 188: Bridge Guards and other anti-enumeration defenses
+ https://gitweb.torproject.org/torspec.git/blob/HEAD:/proposals/188-bridge-guards.txt
+
+[TorSpec] The Tor Protocol Specification
+ https://gitweb.torproject.org/torspec.git?a=blob_plain;hb=HEAD;f=tor-spec.txt
+
+[TorDesign] Dingledine et al, "Tor: The Second Generation Onion
+ Router",
+ https://svn.torproject.org/svn/projects/design-paper/tor-design.pdf
+
+[Tweak] Liskov et al, "Tweakable Block Ciphers",
+ http://www.cs.berkeley.edu/~daw/papers/tweak-crypto02.pdf
+
+[XF] Xinwen Fu et al, "One Cell is Enough to Break Tor's Anonymity"
+
+[23R] The 23 Raccoons, "Analysis of the Relative Severity of Tagging
+ Attacks" http://archives.seul.org/or/dev/Mar-2012/msg00019.html
+ (You'll want to read the rest of the thread too.)