``` Filename: 188-bridge-guards.txt Title: Bridge Guards and other anti-enumeration defenses Author: Nick Mathewson, Isis Lovecruft Created: 14 Oct 2011 Modified: 10 Sep 2015 Status: Reserve [NOTE: This proposal is marked as "reserve" because the enumeration technique it addresses does not currently seem to be in use. See ticket tor#7144 for more information. (2020 July 31)] 1. Overview Bridges are useful against censors only so long as the adversary cannot easily enumerate their addresses. I propose a design to make it harder for an adversary who controls or observes only a few nodes to enumerate a large number of bridges. Briefly: bridges should choose guard nodes, and use the Tor protocol's "Loose source routing" feature to re-route all extend requests from clients through an additional layer of guard nodes chosen by the bridge. This way, only a bridge's guard nodes can tell that it is a bridge, and the attacker needs to run many more nodes in order to enumerate a large number of bridges. I also discuss other ways to avoid enumeration, recommending some. These ideas are due to a discussion at the 2011 Tor Developers' Meeting in Waterloo, Ontario. Practically none of the ideas here are mine; I'm just writing up what I remember. 2. History and Motivation Under the current bridge design, an attacker who runs a node can identify bridges by seeing which "clients" make a large number of connections to it, or which "clients" make connections to it in the same way clients do. This has been a known attack since early versions {XXXX check} of the design document; let's try to fix it. 2.1. Related idea: Guard nodes The idea of guard nodes isn't new: since 0.1.1, Tor has used guard nodes (first designed as "Helper" nodes by Wright et al in {XXXX}) to make it harder for an adversary who controls a smaller number of nodes to eavesdrop on clients. The rationale was: an adversary who controls or observes only one entry and one exit will have a low probability of correlating any single circuit, but over time, if clients choose a random entry and exit for each circuit, such an adversary will eventually see some circuits from each client with a probability of 1, thereby building a statistical profile of the client's activities. Therefore, let each client choose its entry node only from among a small number of client-selected "guard" nodes: the client is still correlated with the same probability as before, but now the client has a nonzero chance of remaining unprofiled. 2.2. Related idea: Loose source routing Since the earliest versions of Onion Routing, the protocol has provided "loose source routing". In strict source routing, the source of a message chooses every hop on the message's path. But in loose source routing, the message traverses the selected nodes, but may also traverse other nodes as well. In other words, the client selects nodes N_a, N_b, and N_c, but the message may in fact traverse any sequence of nodes N_1...N_j, so long as N_1=N_a, N_x=N_b, and N_y=N_c, for 1 < x < y. Tor has retained this feature, but has not yet made use of it. 3. Design Every bridge currently chooses a set of guard nodes for its circuits. Bridges should also re-route client circuits through these circuits. Specifically, when a bridge receives a request from a client to extend a circuit, it should first create a circuit to its guard, and then relay that extend cell through the guard. The bridge should add an additional layer of encryption to outgoing cells on that circuit corresponding to the encryption that the guard will remove, and remove a layer of encryption on incoming cells on that circuit corresponding to the encryption that the guard will add. 3.1. Loose-Source Routed Circuit Construction Alice, an OP, is using a bridge, Bob, and she has chosen the following path through the network: Alice -> Bob -> Charlie -> Deidra However, Bob has decided to take advantage of the loose-source routing circuit characteristic (for example, in order to use a bridge guard), and Bob has chosen N additional loose-source routed hop(s), through which he will transparently relays cells. NOTE: For the purposes of bridge guards, N is always 1. However, for completion's sake, the following details of the circuit construction are generalized to include N > 1. Additionally, the following steps should hold for a hop at any position in Alice's circuit that has decided to take advantage of the loose-source routing feature, not only for bridge ORs. From Alice's perspective, her circuit path matches the one diagrammed above. However, the overall path of the circuit is: Alice -> Bob -> Guillaume -> Charlie -> Deidra From Bob's perspective, the circuit's path is: Alice -> Bob -> Guillaume -> Charlie -> UNKNOWN Interestingly, because Bob's behaviour towards Guillaume and choices of cell types is that of a normal OP, Guillaume's perspective of the circuit's path is: Bob -> Guillaume -> Charlie -> UNKNOWN That is, to Guillaume, Bob appears (for the most part) to be a normally connecting client. (See §4.1 for more detailed analysis.) 3.1.1. Detailed Steps of Loose-Source Routed Circuit Construction 1. Connection from OP Alice has connected to Bob, and she has sent to Bob either a CREATE/CREATE_FAST or CREATE2 cell. 2. Loose-Source Path Selection In anticipation of Alice's first RELAY_EARLY cell (which will contain an EXTEND cell to Alice's next hop), Bob begins constructing a loose-source routed circuit. To do so, Bob chooses N additional hop(s): 2.a. For the first additional hop, H_1, Bob chooses a suitable entry guard node, Guillaume, using the same algorithm as OPs. See "§5 Guard nodes" of path-spec.txt for additional information on the selection algorithm. 2.b. Each additional hop, [H_2, ..., H_N], is chosen at random from a list of suitable, non-excluded ORs. 3. Loose-Source Routed Circuit Extension and Cell Types Bob now follows the same procedure as OPs use to complete the key exchanges with his chosen additional hop(s). While undergoing these following substeps, Bob SHOULD continue to proceed with Step 4, below, in parallel, as an optimization for speeding up circuit construction. 3.a. Create Cells Bob sends the appropriate type of create cell to Guillaume. For ORs new enough to support the NTor handshake (nearly all of them at this point), Bob sends a CREATE2 cell. Otherwise, for ORs which only support the older TAP handshake, Bob sends either a CREATE or CREATE_FAST cell, using the same decision-making logic as OPs. See §4.1 for more information the distinguishability of bridges based upon whether they use CREATE versus CREATE_FAST. Also note that the CREATE2 cell has since become ubiquitous after this proposal was originally drafted. Thus, because we prefer ORs which use loose-source routing to behave (as much as possible) like OPs, we now prefer to use CREATE2. 3.b. Created Cells Later, when Bob receives a corresponding CREATED/CREATED_FAST or CREATED2 cell from Guillaume, Bob extracts key material for the shared forward and reverse keys, KG_f and KG_b, respectively. 3.c. Extend Cells When N > 1, for each additional hop, H_i, in [H_2, ..., H_N], Bob chooses the appropriate type of extend cell for H_i, and sends this extend cell to H_i-1, who transforms it into a create cell in order to perform the extension. To choose which type of extend cell to send, Bob uses the same algorithm as an OP to determine whether to use EXTEND or EXTEND2. Similar to the CREATE* cells above, for most modern ORs, this will very likely mean an EXTEND2 cell. 3.d. Extended Cells When a corresponding EXTENDED/EXTENDED2 cell is received for an additional hop, H_i, Bob extracts the shared forward and reverse keys, Ki_f and Ki_b, respectively. 4. Responding to the OP Now that the additional hops in Bob's loose-source routed circuit are chosen, and construction of the loose-source routed circuit has begun, Bob answers Alice's original CREATE/CREATE_FAST or CREATE2 cell (from Step 1) by sending the corresponding created cell type. Alice has now built a circuit through Bob, and the two share the negotiated forward and reverse keys, KB_n and KB_p, respectively. Note that Bob SHOULD do this step in tandem with the loose-source routed circuit construction procedure outlined in Step 3, above. 5. OP Circuit Extension Alice then wants to extend the circuit to node Charlie. She makes a hybrid-encrypted onionskin, encrypted to Charlie's public key, containing her chosen g^x value. She puts this in an extend cell: "Extend (Charlie's address) (Charlie's OR Port) (Onionskin) (Charlie's ID)". She encrypts this with KB_n and sends it as a RELAY_EARLY cell to Bob. Bob's behaviour is now dependent on whether the loose-source routed circuit construction steps (as outlined in Step 3, above) have already completed. 5.a. The Loose-Source Routed Circuit Construction is Incomplete If Bob has not yet finished the loose-source routed circuit construction, then Bob MUST store the first outgoing (i.e. exitward) RELAY_EARLY cell received from Alice until the loose-source routed circuit construction has been completed. If any incoming (i.e. toward the OP) RELAY* cell is received while the loose-source routed circuit is not fully constructed, Bob MUST drop the cell. If Bob has already stored Alice's first RELAY_EARLY cell, and Alice sends any additional RELAY* cell, then Bob SHOULD mark the entire circuit for close with END_CIRC_REASON_TORPROTOCOL. 5.b. The Loose-Source Routed Circuit Construction is Completed Later, when the loose-source routed circuit is fully constructed, Bob MUST send any stored cells from Alice outward by following the procedure described in Step 6.a. 6. Relay Cells When receiving a RELAY* cell in either direction, Bob MAY keep statistics on the number of relay cells encountered, as well as the number of relay cells relayed. 6.a. Outgoing Relay Cells Bob decrypts the RELAY* cell with KB_n. If the cell becomes recognized, Bob should now follow the relay command checks described in Step 6.c. Bob MUST encrypt the relay cell's underlying payload to each additional hop in the loose-source routed circuit, in reverse: for each additional hop, H_i, in [H_N, ..., H_1], Bob encrypts the relay cell payload to Ki_f, the shared forward key for the hop H_i. Bob MUST update the forward digest, DG_f, of the relay cell, regardless of whether or not the cell is recognized. See 6.c. for additional information on recognized cells. Bob now sends the cell outwards through the additional hops. At each hop, H_i, the hop removes a layer of the onionskin by decrypting the cell with Ki_f, and then hop H_i forwards the cell to the next addition additional hop H_i+1. When the final additional hop, H_N, received the cell, the OP's cell command and payload should be processed by H_N in the normal manner for an OR. 6.b. Incoming Relay Cells Bob MUST decrypt the relay cell's underlying payload from each additional hop in the loose-source routed circuit (in forward order, this time): For each additional hop, H_i, in [H_1, ..., H_N], Bob decrypts the relay cell payload with Ki_b, the shared backward key for the hop H_i. If the cell has becomes recognized after all decryptions, Bob should now follow the relay command checks described in Step 6.c. Bob MUST update the backward digest, DG_b, of the relay cell, regardless of whether or not the cell is recognized. See 6.c. for additional information on recognized cells. Bob encrypts the cell towards the OP with KB_p, and sends the cell inwards. 6.c. Recognized Cells If a relay cell, either incoming or outgoing, becomes recognized (i.e. Bob sees that the cell was intended for him) after decryption, and there is no stream attached to the circuit, then Bob SHOULD mark the circuit for close if the relay command contained within the cell is any of the following types: - RELAY_BEGIN - RELAY_CONNECTED - RELAY_END - RELAY_RESOLVE - RELAY_RESOLVED - RELAY_BEGIN_DIR Apart from the above checks, Bob SHOULD essentially treat every cell as "unrecognized" by following the en-/de-cryption procedures in Steps 6.a. and 6.b. regardless of whether the cell is actually recognized or not. That is, since this is a loose-source routed circuit, Bob SHOULD relay cells not intended for him *and* cells intended for him through the leaky pipe, no matter what the cell's underlying payload and command are. 3.1.2. Example Loose-Source Circuit Construction For example, given the following circuit path chosen by Alice: Alice -> Bob -> Charlie -> Deidra when Alice wishes to extend to node Charlie, and Bob the bridge is using only one additional loose-source routed hop, Guillaume, as his bridge guard, the following steps are taken: - Alice packages the extend into a RELAY_EARLY cell and encrypts the RELAY_EARLY cell with KB_f to Bob. - Bob receives the RELAY_EARLY cell from Alice, and he follows the procedure (outlined in §3.1.1. Step 6.a.) by: * Decrypting the cell with KB_f, * Encrypting the cell to the forward key, KG_f, which Bob shares with his guard node, Guillaume, * Updating the cell forward digest, DG_f, and * Sending the cell as a RELAY_EARLY cell to Guillaume. - When Guillaume receives the cell from Bob, he processes it by: * Decrypting the cell with KG_f. Guillaume now sees that it is a RELAY_EARLY cell containing an extend cell "intended" for him, containing: "Extend (Charlie's address) (Charlie's OR Port) (Onionskin) (Charlie's ID)". * Performing the circuit extension to the specified node, Charlie, by acting accordingly: creating a connection to Charlie if he doesn't have one, ensuring that the ID is as expected, and then sending the onionskin in a create cell on that connection. Note that Guillaume is behaving exactly as a regular node would upon receiving an Extend cell. * Now the handshake finishes. Charlie receives the onionskin and sends Guillaume "CREATED g^y,KH". * Making an extended cell for Bob which contains "E(KG_b, EXTENDED g^y KH)", and * Sending the extended cell to Bob. Note that Charlie and Guillaume are both still behaving in a manner identical to regular ORs. - Bob receives the extended cell from Guillaume, and he follows the procedure (outlined in §3.1.1. Step 6.b.) by: * Decrypting the cell with KG_b, * Encrypting the cell to Alice with KB_b, * Updating the cell backward digest, DG_b, and * Sending the cell to Alice. - Alice receives the cell, and she decrypts it with KB_b, just as she would have if Bob had extended to Charlie directly. She then processes the extended cell contained within to extract shared keys with Charlie. Note that Alice's behaviour is identical to regular OPs. 3.2. Additional Notes on the Construction Note that this design does not require that our stream cipher operations be commutative, even though they are. Note also that this design requires no change in behavior from any node other than Bob, and as we can see in the above example in §3.1.2 for Alice's circuit extension, Alice, Guillaume, and Charlie behave identical to a normal OP and normal ORs. Finally, observe that even though the circuit N hops longer than it would be otherwise, no relay's count of permissible RELAY_EARLY cells falls lower than it otherwise would. This is because the extra hop that Bob adds is done with RELAY_EARLY cells, then he continues to relay Alice's cells as RELAY_EARLY, until the appropriate maximum number of RELAY_EARLY cells is reached. Afterwards, further RELAY_EARLY cells from Alice are repackaged by Bob as normal RELAY cells. 4. Alternative designs 4.1. Client-enforced bridge guards What if Tor didn't have loose source routing? We could have bridges tell clients what guards to use by advertising those guard in their descriptors, and then refusing to extend circuits to any other nodes. This change would require all clients to upgrade in order to be able to use the newer bridges, and would quite possibly cause a fair amount of pain along the way. Fortunately, we don't need to go down this path. So let's not! 4.2. Separate bridge-guards and client-guards In the design above, I specify that bridges should use the same guard nodes for extending client circuits as they use for their own circuits. It's not immediately clear whether this is a good idea or not. Having separate sets would seem to make the two kinds of circuits more easily distinguishable (even though we already assume they are distinguishable). Having different sets of guards would also seem like a way to keep the nodes who guard our own traffic from learning that we're a bridge... but another set of nodes will learn that anyway, so it's not clear what we'd gain. One good reason to keep separate guard lists is to prevent the *client* of the bridge from being able to enumerate the guards that the bridge uses to protect its own traffic (by extending a circuit through the bridge to a node it controls, and finding out where the extend request arrives from). 5. Additional bridge enumeration methods and protections In addition to the design above, there are more ways to try to prevent enumeration. Right now, there are multiple ways for the node after a bridge to distinguish a circuit extended through the bridge from one originating at the bridge. (This lets the node after the bridge tell that a bridge is talking to it.) 5.1. Make it harder to tell clients from bridges When using the older TAP circuit handshake protocol, one of the giveaways is that the first hop in a circuit is created with CREATE_FAST cells, but all subsequent hops are created with CREATE cells. However, because nearly everything in the network now uses the newer NTor circuit handshake protocol, clients send CREATE2 cells to all hops, regardless of position. Therefore, in the above design, it's no longer quite so simple to distinguish an OP connecting through bridge from an actual OP, since all of the circuits that extend through a bridge now reach its guards through CREATE2 cells (whether the bridge originated them or not), and only as a fallback (e.g. if an additional node in the loose-source routed path does not support NTor) will the bridge ever use CREATE/CREATE_FAST. (Additionally, when using the fallback mathod, the behaviour for choosing either CREATE or CREATE_FAST is identical to normal OP behaviour.) The CREATE/CREATE_FAST distinction is not the only way for a bridge's guard to tell bridges from orginary clients, however. Most importantly, a busy bridge will open far more circuits than a client would. More subtly, the timing on response from the client will be higher and more highly variable that it would be with an ordinary client. I don't think we can make bridges behave wholly indistinguishably from clients: that's why we should go with guard nodes for bridges. [XXX For further research: we should study the methods by which a bridge guard can determine that they are acting as a guard for a bridge, rather than for a normal OP, and which methods are likely to be more accurate or efficient than others. -IL] 5.2. Bridge Reachability Testing Currently, a bridge's reachability is tested both by the bridge itself (called "self-testing") and by the BridgeAuthority. 5.2.1. Bridge Reachability Self-Testing Before a bridge uploads its descriptors to the BridgeAuthority, it creates a special type of testing circuit which ends at itself: Bob -> Guillaume -> Charlie -> Bob Thus, going to all this trouble to later use loose-source routing in order to relay Alice's traffic through Guillaume (rather than connecting directly to Charlie, as Alice intended) is diminished by the fact that Charlie can still passively enumerate bridges by waiting to be asked to connect to a node which is not contained within the consensus. We could get around this option by disabling self-testing for bridges entirely, by automatically setting "AssumeReachable 1" for all bridge relays… although I am not sure if this is wise. Our best idea thus far, for bridge reachability self-testing, is to create a circuit like so: Bridge → Guard → Middle → OtherMiddle → Guard → Bridge While, clearly, that circuit is just a little bit insane, it must be that way because we cannot simply do: Bridge → Guard → Middle → Guard → Bridge because the Middle would refuse to extend back to the previous node (all ORs follow this rule). Similarly, it would be inane to do: Bridge → Guard → Middle → OtherMiddle → Bridge because, obviously, that merely shifts the problem to OtherMiddle and accomplishes nothing. [XXX Is there something smarter we could do? —IL] 5.2.2. Bridge Reachability Testing by the BridgeAuthority After receiving Bob's descriptors, the BridgeAuthority attempts to connect to Bob's ORPort by making a direct TLS connection to the bridge's advertised ORPort. Should we change this behaviour? One the one hand, at least this does not enable any random OR in the entire network to enumerate bridges. On the other hand, any adversary who can observe packets from the BridgeAuthority is capable of enumeration. 6. Other considerations What fraction of our traffic is bridge traffic? Will this alter our circuit selection weights? ```