From e4e0d93d56ee8c1aec4c2efaa7046b651f0fe55c Mon Sep 17 00:00:00 2001
From: Nick Mathewson
Date: Thu, 12 Oct 2023 12:27:58 -0400
Subject: Move all text-only specifications into the OLD_TXT directory.

---
 path-spec.txt | 1051 ---------------------------------------------------------
 1 file changed, 1051 deletions(-)
 delete mode 100644 path-spec.txt

diff --git a/path-spec.txt b/path-spec.txt
deleted file mode 100644
index 33d50e5..0000000
--- a/path-spec.txt
+++ /dev/null
@@ -1,1051 +0,0 @@

                        Tor Path Specification

                           Roger Dingledine
                           Nick Mathewson

Note: This is an attempt to specify Tor as currently implemented. Future
versions of Tor will implement improved algorithms.

This document tries to cover how Tor chooses to build circuits and assign
streams to circuits.  Other implementations MAY take other approaches, but
implementors should be aware of the anonymity and load-balancing implications
of their choices.

                      THIS SPEC ISN'T DONE YET.

      The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
      NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
      "OPTIONAL" in this document are to be interpreted as described in
      RFC 2119.

Table of Contents

    1. General operation
        1.1. Terminology
        1.2. A relay's bandwidth
    2. Building circuits
        2.1. When we build
            2.1.0. We don't build circuits until we have enough directory info
            2.1.1. Clients build circuits preemptively
            2.1.2. Clients build circuits on demand
            2.1.3. Relays build circuits for testing reachability and bandwidth
            2.1.4. Hidden-service circuits
            2.1.5. Rate limiting of failed circuits
            2.1.6. When to tear down circuits
        2.2. Path selection and constraints
            2.2.1. Choosing an exit
            2.2.2. User configuration
        2.3. Cannibalizing circuits
        2.4. Learning when to give up ("timeout") on circuit construction
            2.4.1. Distribution choice
            2.4.2. How much data to record
            2.4.3. Parameter estimation
            2.4.4. Calculating timeout thresholds for circuits of different lengths
            2.4.5. How to record timeouts
            2.4.6. Detecting Changing Network Conditions
            2.4.7. Consensus parameters governing behavior
        2.5. Handling failure
    3. Attaching streams to circuits
    4. Hidden-service related circuits
    5. Guard nodes
        5.1. How consensus bandwidth weights factor into entry guard selection
    6. Server descriptor purposes
    7. Detecting route manipulation by Guard nodes (Path Bias)
        7.1. Measuring path construction success rates
        7.2. Measuring path usage success rates
        7.3. Scaling success counts
        7.4. Parametrization
        7.5. Known barriers to enforcement
    X. Old notes
        X.1. Do we actually do this?
        X.2. A thing we could do to deal with reachability.
        X.3. Some stuff that worries me about entry guards. 2006 Jun, Nickm.

1. General operation

   Tor begins building circuits as soon as it has enough directory
   information to do so (see section 5 of dir-spec.txt).  Some circuits are
   built preemptively because we expect to need them later (for user
   traffic), and some are built because of immediate need (for user traffic
   that no current circuit can handle, for testing the network or our
   reachability, and so on).

   [Newer versions of Tor (0.2.6.2-alpha and later):
   If the consensus contains Exits (the typical case), Tor will build both
   exit and internal circuits.  When bootstrap completes, Tor will be ready
   to handle an application requesting an exit circuit to services like the
   World Wide Web.

   If the consensus does not contain Exits, Tor will only build internal
   circuits.
   In this case, earlier statuses will have included "internal"
   as indicated above.  When bootstrap completes, Tor will be ready to handle
   an application requesting an internal circuit to hidden services at
   ".onion" addresses.

   If a future consensus contains Exits, exit circuits may become available.]

   When a client application creates a new stream (by opening a SOCKS
   connection or launching a resolve request), we attach it to an appropriate
   open circuit if one exists, or wait if an appropriate circuit is
   in-progress.  We launch a new circuit only if no current circuit can
   handle the request.  We rotate circuits over time to avoid some profiling
   attacks.

   To build a circuit, we choose all the nodes we want to use, and then
   construct the circuit.  Sometimes, when we want a circuit that ends at a
   given hop, and we have an appropriate unused circuit, we "cannibalize" the
   existing circuit and extend it to the new terminus.

   These processes are described in more detail below.

   This document describes Tor's automatic path selection logic only; path
   selection can be overridden by a controller (with the EXTENDCIRCUIT and
   ATTACHSTREAM commands).  Paths constructed through these means may
   violate some constraints given below.

1.1. Terminology

   A "path" is an ordered sequence of nodes, not yet built as a circuit.

   A "clean" circuit is one that has not yet been used for any traffic.

   A "fast" or "stable" or "valid" node is one that has the 'Fast' or
   'Stable' or 'Valid' flag set respectively, based on our current directory
   information.  A "fast" or "stable" circuit is one consisting only of
   "fast" or "stable" nodes.

   In an "exit" circuit, the final node is chosen based on waiting stream
   requests if any, and in any case it avoids nodes with exit policy of
   "reject *:*".  An "internal" circuit, on the other hand, is one where
   the final node is chosen just like a middle node (ignoring its exit
   policy).

   A "request" is a client-side stream or DNS resolve that needs to be
   served by a circuit.

   A "pending" circuit is one that we have started to build, but which has
   not yet completed.

   A circuit or path "supports" a request if it is okay to use the
   circuit/path to fulfill the request, according to the rules given below.
   A circuit or path "might support" a request if some aspect of the request
   is unknown (usually its target IP), but we believe the path probably
   supports the request according to the rules given below.

1.2. A relay's bandwidth

   Old versions of Tor did not report bandwidths in network status
   documents, so clients had to learn them from the routers' advertised
   relay descriptors.

   For versions of Tor prior to 0.2.1.17-rc, everywhere below where we
   refer to a relay's "bandwidth", we mean its clipped advertised
   bandwidth, computed by taking the smaller of the 'rate' and
   'observed' arguments to the "bandwidth" element in the relay's
   descriptor.  If a router's advertised bandwidth is greater than
   MAX_BELIEVABLE_BANDWIDTH (currently 10 MB/s), we clip it to that
   value.

   For more recent versions of Tor, we take the bandwidth value declared
   in the consensus, and fall back to the clipped advertised bandwidth
   only if the consensus does not have bandwidths listed.
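   As a non-normative illustration, the following Python sketch applies the
   rule above.  The function signature and the idea of passing the consensus
   value as an optional argument are assumptions made for the example; only
   the min()/clipping logic comes from this section.

      # Sketch only: compute the bandwidth used for path selection.
      MAX_BELIEVABLE_BANDWIDTH = 10 * 1000 * 1000   # 10 MB/s, in bytes/sec

      def relay_bandwidth(consensus_bw, rate, observed):
          """consensus_bw: value from the consensus, or None if absent.
          rate, observed: arguments of the descriptor's "bandwidth" line."""
          if consensus_bw is not None:
              return consensus_bw
          # Fall back to the clipped advertised bandwidth.
          return min(rate, observed, MAX_BELIEVABLE_BANDWIDTH)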
2. Building circuits

2.1. When we build

2.1.0. We don't build circuits until we have enough directory info

   There's a class of possible attacks where our directory servers
   only give us information about the relays that they would like us
   to use.  To prevent this attack, we don't build multi-hop
   circuits for real traffic (like those in 2.1.1, 2.1.2, 2.1.4
   below) until we have enough directory information to be
   reasonably confident this attack isn't being done to us.

   Here, "enough" directory information is defined as:

     * Having a consensus that's been valid at some point in the
       last REASONABLY_LIVE_TIME interval (24 hours).

     * Having enough descriptors that we could build at least some
       fraction F of all bandwidth-weighted paths, without taking
       ExitNodes/EntryNodes/etc into account.

       (F is set by the PathsNeededToBuildCircuits option,
       defaulting to the 'min_paths_for_circs_pct' consensus
       parameter, with a final default value of 60%.)

     * Having enough descriptors that we could build at least some
       fraction F of all bandwidth-weighted paths, _while_ taking
       ExitNodes/EntryNodes/etc into account.

       (F is as above.)

     * Having a descriptor for every one of the first
       NUM_USABLE_PRIMARY_GUARDS guards among our primary guards
       (see guard-spec.txt).

   We define the "fraction of bandwidth-weighted paths" as the product of
   these three fractions.

     * The fraction of descriptors that we have for nodes with the Guard
       flag, weighted by their bandwidth for the guard position.
     * The fraction of descriptors that we have for all nodes,
       weighted by their bandwidth for the middle position.
     * The fraction of descriptors that we have for nodes with the Exit
       flag, weighted by their bandwidth for the exit position.

   If the consensus has zero weighted bandwidth for a given kind of
   relay (Guard, Middle, or Exit), Tor instead uses the fraction of relays
   for which it has the descriptor (not weighted by bandwidth at all).

   If the consensus lists zero exit-flagged relays, Tor instead uses the
   fraction of middle relays.

2.1.1. Clients build circuits preemptively

   When running as a client, Tor tries to maintain at least a certain
   number of clean circuits, so that new streams can be handled
   quickly.  To increase the likelihood of success, Tor tries to
   predict what circuits will be useful by choosing from among nodes
   that support the ports we have used in the recent past (by default
   one hour).  Specifically, on startup Tor tries to maintain one clean
   fast exit circuit that allows connections to port 80, and at least
   two fast clean stable internal circuits in case we get a resolve
   request or hidden service request (at least three if we _run_ a
   hidden service).

   After that, Tor will adapt the circuits that it preemptively builds
   based on the requests it sees from the user: it tries to have two fast
   clean exit circuits available for every port seen within the past hour
   (each circuit can be adequate for many predicted ports -- it doesn't
   need two separate circuits for each port), and it tries to have the
   above internal circuits available if we've seen resolves or hidden
   service activity within the past hour.  If there are 12 or more clean
   circuits open, it doesn't open more even if it has more predictions.
   A sketch of this prediction logic appears below.

   Only stable circuits can "cover" a port that is listed in the
   LongLivedPorts config option.  Similarly, hidden service requests
   to ports listed in LongLivedPorts make us create stable internal
   circuits.
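   As a non-normative illustration, the following sketch tracks recently
   requested ports for prediction.  The class and method names are
   inventions for this example; the one-hour window and the two-circuit
   target come from the text above.

      import time

      PREDICTION_WINDOW = 60 * 60   # one hour, per section 2.1.1

      class PortPredictor:
          def __init__(self):
              self.last_seen = {}   # port -> time of most recent request

          def note_request(self, port):
              self.last_seen[port] = time.time()

          def predicted_ports(self):
              """Ports seen in the window; the client tries to keep two
              fast clean exit circuits that cover each of them."""
              cutoff = time.time() - PREDICTION_WINDOW
              return [p for p, t in self.last_seen.items() if t >= cutoff]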
   Note that if there are no requests from the user for an hour, Tor
   will predict no use and build no preemptive circuits.

   The Tor client SHOULD NOT store its list of predicted requests to a
   persistent medium.

2.1.2. Clients build circuits on demand

   Additionally, when a client request exists that no circuit (built or
   pending) might support, we create a new circuit to support the request.
   For exit connections, we pick an exit node that will handle the
   most pending requests (choosing arbitrarily among ties), launch a
   circuit to end there, and repeat until every unattached request
   might be supported by a pending or built circuit.  For internal
   circuits, we pick an arbitrary acceptable path, repeating as needed.

   Clients consider a circuit to become "dirty" as soon as a stream is
   attached to it, or some other request is performed over the circuit.
   If a circuit has been "dirty" for at least MaxCircuitDirtiness seconds,
   new circuits may not be attached to it.

   In some cases we can reuse an already established circuit if it's
   clean; see Section 2.3 (cannibalizing circuits) for details.

2.1.3. Relays build circuits for testing reachability and bandwidth

   Tor relays test reachability of their ORPort once they have
   successfully built a circuit (on startup and whenever their IP address
   changes).  They build an ordinary fast internal circuit with themselves
   as the last hop.  As soon as any testing circuit succeeds, the Tor
   relay decides it's reachable and is willing to publish a descriptor.

   We launch multiple testing circuits (one at a time), until we
   have NUM_PARALLEL_TESTING_CIRC (4) such circuits open.  Then we
   do a "bandwidth test" by sending a certain number of relay drop
   cells down each circuit: BandwidthRate * 10 / CELL_NETWORK_SIZE
   total cells divided across the four circuits, but never more than
   CIRCWINDOW_START (1000) cells total.  This exercises both outgoing and
   incoming bandwidth, and helps to jumpstart the observed bandwidth
   (see dir-spec.txt).

   Tor relays also test reachability of their DirPort once they have
   established a circuit, but they use an ordinary exit circuit for
   this purpose.

2.1.4. Hidden-service circuits

   See section 4 below.

2.1.5. Rate limiting of failed circuits

   If we fail to build a circuit N times in an X-second period (see Section
   2.3 for how this works), we stop building circuits until the X seconds
   have elapsed.
   XXXX

2.1.6. When to tear down circuits

   Clients should tear down circuits (in general) only when those circuits
   have no streams on them.  Additionally, clients should tear down
   stream-less circuits only under one of the following conditions:

     - The circuit has never had a stream attached, and it was created too
       long in the past (based on CircuitsAvailableTimeout or
       cbtlearntimeout, depending on timeout estimate status).

     - The circuit is dirty (has had a stream attached), and it has been
       dirty for at least MaxCircuitDirtiness.

2.2. Path selection and constraints

   We choose the path for each new circuit before we build it.  We choose the
   exit node first, followed by the other nodes in the circuit, front to
   back.  (In other words, for a 3-hop circuit, we first pick hop 3,
   then hop 1, then hop 2.)  All paths we generate obey the following
   constraints:

     - We do not choose the same router twice for the same path.
     - We do not choose any router in the same family as another in the same
       path.
       (Two routers are in the same family if each one lists the other
       in the "family" entries of its descriptor.)
     - We do not choose more than one router in a given /16 subnet
       (unless EnforceDistinctSubnets is 0).
     - We don't choose any non-running or non-valid router unless we have
       been configured to do so.  By default, we are configured to allow
       non-valid routers in "middle" and "rendezvous" positions.
     - If we're using Guard nodes, the first node must be a Guard (see
       section 5 below).
     - XXXX Choosing the length

   For "fast" circuits, we only choose nodes with the Fast flag.  For
   non-"fast" circuits, all nodes are eligible.

   For all circuits, we weight node selection according to router bandwidth.

   We also weight the bandwidth of Exit and Guard flagged nodes depending on
   the fraction of total bandwidth that they make up and depending upon the
   position they are being selected for.

   These weights are published in the consensus, and are computed as described
   in Section "Computing Bandwidth Weights" of dir-spec.txt.  They are:

      Wgg - Weight for Guard-flagged nodes in the guard position
      Wgm - Weight for non-flagged nodes in the guard position
      Wgd - Weight for Guard+Exit-flagged nodes in the guard position

      Wmg - Weight for Guard-flagged nodes in the middle position
      Wmm - Weight for non-flagged nodes in the middle position
      Wme - Weight for Exit-flagged nodes in the middle position
      Wmd - Weight for Guard+Exit-flagged nodes in the middle position

      Weg - Weight for Guard-flagged nodes in the exit position
      Wem - Weight for non-flagged nodes in the exit position
      Wee - Weight for Exit-flagged nodes in the exit position
      Wed - Weight for Guard+Exit-flagged nodes in the exit position

      Wgb - Weight for BEGIN_DIR-supporting Guard-flagged nodes
      Wmb - Weight for BEGIN_DIR-supporting non-flagged nodes
      Web - Weight for BEGIN_DIR-supporting Exit-flagged nodes
      Wdb - Weight for BEGIN_DIR-supporting Guard+Exit-flagged nodes

      Wbg - Weight for Guard-flagged nodes for BEGIN_DIR requests
      Wbm - Weight for non-flagged nodes for BEGIN_DIR requests
      Wbe - Weight for Exit-flagged nodes for BEGIN_DIR requests
      Wbd - Weight for Guard+Exit-flagged nodes for BEGIN_DIR requests

   If any of those weights is malformed or not present in a consensus,
   clients proceed with the regular path selection algorithm, setting
   the weights to the default value of 10000.

   Additionally, we may be building circuits with one or more requests in
   mind.  Each kind of request puts certain constraints on paths:

     - All service-side introduction circuits and all rendezvous paths
       should be Stable.
     - All connection requests for connections that we think will need to
       stay open a long time require Stable circuits.  Currently, Tor decides
       this by examining the request's target port, and comparing it to a
       list of "long-lived" ports.  (Default: 21, 22, 706, 1863, 5050,
       5190, 5222, 5223, 6667, 6697, 8300.)
     - DNS resolves require an exit node whose exit policy is not equivalent
       to "reject *:*".
     - Reverse DNS resolves require a version of Tor with advertised eventdns
       support (available in Tor 0.1.2.1-alpha-dev and later).
     - All connection requests require an exit node whose exit policy
       supports their target address and port (if known), or which "might
       support it" (if the address isn't known).  See 2.2.1.

   Rules for Fast? XXXXX
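   As a non-normative illustration, the sketch below chooses one node for a
   given position, weighting each candidate's bandwidth by the applicable
   Wxx value.  The data layout and function names are assumptions for the
   example; the 1/10000 weight unit and the weight-times-bandwidth rule
   come from the text above.

      import random

      def choose_node(candidates, weight_for_node):
          """candidates: list of (node, bandwidth) pairs.
          weight_for_node: node -> the Wxx weight for this position."""
          weights = [bw * weight_for_node(node) / 10000.0
                     for node, bw in candidates]
          nodes = [node for node, bw in candidates]
          return random.choices(nodes, weights=weights, k=1)[0]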
2.2.1. Choosing an exit

   If we know what IP address we want to connect to or resolve, we can
   trivially tell whether a given router will support it by simulating
   its declared exit policy.

   Because we often connect to addresses of the form hostname:port, we do not
   always know the target IP address when we select an exit node.  In these
   cases, we need to pick an exit node that "might support" connections to a
   given address:port with an unknown address.  An exit node "might support"
   such a connection if any clause that accepts any connections to that port
   precedes all clauses (if any) that reject all connections to that port.

   Unless requested to do so by the user, we never choose an exit node
   flagged as "BadExit" by more than half of the authorities who advertise
   themselves as listing bad exits.

2.2.2. User configuration

   Users can alter the default behavior for path selection with configuration
   options.

   - If "ExitNodes" is provided, then every request requires an exit node on
     the ExitNodes list.  (If a request is supported by no nodes on that list,
     and StrictExitNodes is false, then Tor treats that request as if
     ExitNodes were not provided.)

   - "EntryNodes" and "StrictEntryNodes" behave analogously.

   - If a user tries to connect to or resolve a hostname of the form
     <target>.<servername>.exit, the request is rewritten to a request for
     <target>, and the request is only supported by the exit whose nickname
     or fingerprint is <servername>.

   - When set, "HSLayer2Nodes" and "HSLayer3Nodes" relax Tor's path
     restrictions to allow nodes in the same /16 and node family to reappear
     in the path.  They also allow the guard node to be chosen as the RP, IP,
     and HSDIR, and as the hop before those positions.

2.3. Cannibalizing circuits

   If we need a circuit and have a clean one already established, in
   some cases we can adapt the clean circuit for our new
   purpose.  Specifically,

   For hidden service interactions, we can "cannibalize" a clean internal
   circuit if one is available, so we don't need to build those circuits
   from scratch on demand.

   We can also cannibalize clean circuits when the client asks to exit
   at a given node -- either via the ".exit" notation or because the
   destination is running at the same location as an exit node.

2.4. Learning when to give up ("timeout") on circuit construction

   Since version 0.2.2.8-alpha, Tor clients attempt to learn when to give
   up on circuits based on network conditions.

2.4.1. Distribution choice

   Based on studies of build times, we found that the distribution of
   circuit build times appears to be a Frechet distribution (and a multi-modal
   Frechet distribution, if more than one guard or bridge is used).  However,
   estimators and quantile functions of the Frechet distribution are difficult
   to work with and slow to converge.  So instead, since we are only interested
   in the accuracy of the tail, clients approximate the tail of the multi-modal
   distribution with a single Pareto curve.

2.4.2. How much data to record

   From our observations, the minimum number of circuit build times for a
   reasonable fit appears to be on the order of 100.  However, to keep a
   good fit over the long term, clients store the 1000 most recent circuit
   build times in a circular array.
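   As a non-normative illustration, the sketch below keeps the most recent
   1000 build times and bins them into the 10ms histogram later used for Xm
   in section 2.4.3.  The helper names are inventions for the example.

      from collections import Counter, deque

      CBT_BIN_WIDTH_MS = 10    # histogram bin width, per section 2.4.3
      CBT_MAX_TIMES = 1000     # most recent build times kept

      build_times = deque(maxlen=CBT_MAX_TIMES)   # circular array, in ms

      def record_build_time(ms):
          build_times.append(ms)

      def histogram():
          """Count build times by the midpoint of their 10ms bin."""
          return Counter(
              (t // CBT_BIN_WIDTH_MS) * CBT_BIN_WIDTH_MS + CBT_BIN_WIDTH_MS // 2
              for t in build_times)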
   These build times only include the times required to build three-hop
   circuits, and the times required to build the first three hops of circuits
   with more than three hops.  Circuits of fewer than three hops are not
   recorded, and hops past the third are not recorded.

   The Tor client should build test circuits at a rate of one every 'cbttestfreq'
   (10 seconds) until 'cbtmincircs' (100 circuits) are built, with a maximum of
   'cbtmaxopencircs' (default: 10) circuits open at once.  This allows a fresh
   Tor to have a CircuitBuildTimeout estimated within 30 minutes after install
   or network change (see section 2.4.6 below).

   Timeouts are stored on disk in a histogram of 10ms bin width, the same
   width used to calculate the Xm value below.  The timeouts recorded in the
   histogram must be shuffled after being read from disk, to preserve a
   proper expiration of old values after restart.

   Thus, some build time resolution is lost during restart.  Implementations may
   choose a different persistence mechanism than this histogram, but be aware
   that build time binning is still needed for parameter estimation.

2.4.3. Parameter estimation

   Once 'cbtmincircs' build times are recorded, Tor clients update the
   distribution parameters and recompute the timeout on every circuit
   completion (though see section 2.4.6 for when to pause and reset the
   timeout because too many circuits are timing out).

   Tor clients calculate the parameters for a Pareto distribution fitting the
   data using the maximum likelihood estimator.  For derivation, see:
   https://en.wikipedia.org/wiki/Pareto_distribution#Estimation_of_parameters

   Because build times are not a true Pareto distribution, we alter how Xm is
   computed.  In a max likelihood estimator, the mode of the distribution is
   used directly as Xm.

   Instead of using the mode of discrete build times directly, Tor clients
   compute the Xm parameter using the weighted average of the midpoints
   of the 'cbtnummodes' (10) most frequently occurring 10ms histogram bins.
   Ties are broken in favor of earlier bins (that is, in favor of bins
   corresponding to shorter build times).

   (The use of 10 modes was found to minimize error from the selected
   cbtquantile, with 10ms bins for quantiles 60-80, compared to many other
   heuristics.)

   To avoid ln(1.0+epsilon) precision issues, use log laws to rewrite the
   estimator for 'alpha' as the sum of logs followed by subtraction, rather
   than multiplication and division:

      alpha = n/(Sum_n{ln(MAX(Xm, x_i))} - n*ln(Xm))

   In this, n is the total number of build times that have completed, x_i is
   the ith recorded build time, and Xm is the mode average computed as above.

   All times below Xm are counted as having the Xm value via the MAX(),
   because in Pareto estimators, Xm is supposed to be the lowest value.
   However, since clients use mode averaging to estimate Xm, there can be
   values below our Xm.  Effectively, the Pareto estimator then treats
   everything smaller than Xm as if it had happened at Xm.  One can also see
   that if clients did not do this, alpha could underflow to become negative,
   which results in an exponential curve, not a Pareto probability
   distribution.
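   As a non-normative illustration, the sketch below implements the Xm and
   alpha estimators described above.  The function names are inventions for
   the example; the bin width, mode count, and formula come from this
   section.

      import math
      from collections import Counter

      def estimate_xm(build_times, bin_width=10, nummodes=10):
          """Weighted average of the midpoints of the 'cbtnummodes' most
          frequent bins; ties favor earlier (shorter) bins."""
          bins = Counter((t // bin_width) * bin_width + bin_width // 2
                         for t in build_times)
          # Sort by count descending, then midpoint ascending for ties.
          top = sorted(bins.items(), key=lambda kv: (-kv[1], kv[0]))[:nummodes]
          total = sum(count for _, count in top)
          return sum(mid * count for mid, count in top) / total

      def estimate_alpha(build_times, xm):
          """alpha = n/(Sum_n{ln(MAX(Xm, x_i))} - n*ln(Xm))"""
          n = len(build_times)
          log_sum = sum(math.log(max(xm, x)) for x in build_times)
          return n / (log_sum - n * math.log(xm))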
   The timeout itself is calculated by using the Pareto Quantile function (the
   inverted CDF) to give us the value on the CDF such that 80% of the mass
   of the distribution is below the timeout value (parameter 'cbtquantile').

   The Pareto Quantile Function (inverse CDF) is:

      F(q) = Xm/((1.0-q)^(1.0/alpha))

   Thus, clients obtain the circuit build timeout for 3-hop circuits by
   computing:

      timeout_ms = F(0.8)     # 'cbtquantile' == 0.8

   With this, we expect that the Tor client will accept the fastest 80% of the
   total number of paths on the network.

   Clients obtain the circuit close time to completely abandon circuits as:

      close_ms = F(0.99)      # 'cbtclosequantile' == 0.99

   To avoid waiting an unreasonably long period of time for circuits that
   simply have relays that are down, Tor clients cap timeout_ms at the max
   build time actually observed so far, and cap close_ms at twice this max,
   but at least 60 seconds:

      timeout_ms = MIN(timeout_ms, max_observed_timeout)
      close_ms = MAX(MIN(close_ms, 2*max_observed_timeout), 'cbtinitialtimeout')

2.4.4. Calculating timeout thresholds for circuits of different lengths

   The timeout_ms and close_ms estimates above are good only for 3-hop
   circuits, since only 3-hop circuits are recorded in the list of build
   times.

   To calculate the appropriate timeouts and close timeouts for circuits of
   other lengths, the client multiplies the timeout_ms and close_ms values
   by a scaling factor determined by the number of communication hops
   needed to build their circuits:

      timeout_ms[hops=N] = timeout_ms * Actions(N) / Actions(3)

      close_ms[hops=N] = close_ms * Actions(N) / Actions(3)

   where Actions(N) = N * (N + 1) / 2.

   To calculate timeouts for operations other than circuit building,
   the client should add X to Actions(N) for every round-trip communication
   required with the Xth hop.

2.4.5. How to record timeouts

   Pareto estimators begin to lose their accuracy if the tail is omitted.
   Hence, Tor clients actually calculate two timeouts: a usage timeout, and a
   close timeout.

   Circuits that pass the usage timeout are marked as measurement circuits,
   and are allowed to continue to build until the close timeout corresponding
   to the point 'cbtclosequantile' (default 99) on the Pareto curve, or 60
   seconds, whichever is greater.

   The actual completion times for these measurement circuits should be
   recorded.

   Implementations should completely abandon a circuit if the total build
   time exceeds the close threshold.  Such closed circuits should be ignored,
   as this typically means one of the relays in the path is offline.

2.4.6. Detecting Changing Network Conditions

   Tor clients attempt to detect both network connectivity loss and drastic
   changes in the timeout characteristics.

   To detect changing network conditions, clients keep a history of
   the timeout or non-timeout status of the past 'cbtrecentcount' circuits
   (20 circuits) that successfully completed at least one hop.  If more than
   90% of these circuits time out, the client discards all buildtime history,
   resets the timeout to 'cbtinitialtimeout' (60 seconds), and then begins
   recomputing the timeout.

   If the timeout was already at least 'cbtinitialtimeout',
   the client doubles the timeout.

   The records here (of how many circuits succeeded or failed among the most
   recent 'cbtrecentcount') are not stored as persistent state.  On reload,
   we start with a new, empty state.
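   As a non-normative illustration, the sketch below applies the reset rule
   above.  The function and argument names are inventions for the example.

      CBT_RECENT_COUNT = 20             # 'cbtrecentcount'
      CBT_INITIAL_TIMEOUT_MS = 60000    # 'cbtinitialtimeout'

      def check_network_change(recent_timeout_flags, timeout_ms, build_times):
          """recent_timeout_flags: booleans for the last 'cbtrecentcount'
          circuits that completed at least one hop (True == timed out)."""
          if len(recent_timeout_flags) < CBT_RECENT_COUNT:
              return timeout_ms
          if sum(recent_timeout_flags) / len(recent_timeout_flags) > 0.90:
              build_times.clear()       # discard all recorded build times
              if timeout_ms >= CBT_INITIAL_TIMEOUT_MS:
                  return timeout_ms * 2
              return CBT_INITIAL_TIMEOUT_MS
          return timeout_ms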
2.4.7. Consensus parameters governing behavior

   Clients that implement circuit build timeout learning should obey the
   following consensus parameters that govern behavior, in order to allow
   us to handle bugs or other emergent behaviors due to client circuit
   construction.  If these parameters are not present in the consensus,
   the listed default values should be used instead.

      cbtdisabled
        Default: 0
        Min: 0
        Max: 1
        Effect: If 1, all CircuitBuildTime learning code should be
                disabled and history should be discarded.  For use in
                emergency situations only.

      cbtnummodes
        Default: 10
        Min: 1
        Max: 20
        Effect: This value governs how many modes to use in the weighted
                average calculation of Pareto parameter Xm.  Selecting Xm as
                the average of multiple modes improves accuracy of the Pareto
                tail for quantile cutoffs from 60-80% (see cbtquantile).

      cbtrecentcount
        Default: 20
        Min: 3
        Max: 1000
        Effect: This is the number of circuit build outcomes (success vs
                timeout) to keep track of for the following option.

      cbtmaxtimeouts
        Default: 18
        Min: 3
        Max: 10000
        Effect: When this many timeouts happen in the last 'cbtrecentcount'
                circuit attempts, the client should discard all of its
                history and begin learning a fresh timeout value.

                Note that if this parameter's value is greater than the value
                of 'cbtrecentcount', then the history will never be
                discarded because of this feature.

      cbtmincircs
        Default: 100
        Min: 1
        Max: 10000
        Effect: This is the minimum number of circuits to build before
                computing a timeout.

                Note that if this parameter's value is higher than 1000 (the
                number of time observations that a client keeps in its
                circular buffer), circuit build timeout calculation is
                effectively disabled, and the default timeouts are used
                indefinitely.

      cbtquantile
        Default: 80
        Min: 10
        Max: 99
        Effect: This is the position on the quantile curve to use to set the
                timeout value.  It is a percent (10-99).

      cbtclosequantile
        Default: 99
        Min: Value of cbtquantile parameter
        Max: 99
        Effect: This is the position on the quantile curve to use to set the
                timeout value to use to actually close circuits.  It is a
                percent (10-99).

      cbttestfreq
        Default: 10
        Min: 1
        Max: 2147483647 (INT32_MAX)
        Effect: Describes how often in seconds to build a test circuit to
                gather timeout values.  Only applies if less than 'cbtmincircs'
                have been recorded.

      cbtmintimeout
        Default: 10
        Min: 10
        Max: 2147483647 (INT32_MAX)
        Effect: This is the minimum allowed timeout value in milliseconds.

      cbtinitialtimeout
        Default: 60000
        Min: Value of cbtmintimeout
        Max: 2147483647 (INT32_MAX)
        Effect: This is the timeout value to use before we have enough data
                to compute a timeout, in milliseconds.  If we do not have
                enough data to compute a timeout estimate (see cbtmincircs),
                then we use this interval both for the close timeout and the
                abandon timeout.

      cbtlearntimeout
        Default: 180
        Min: 10
        Max: 60000
        Effect: This is how long idle circuits will be kept open while cbt is
                learning a new timeout value.

      cbtmaxopencircs
        Default: 10
        Min: 0
        Max: 14
        Effect: This is the maximum number of circuits that can be open
                at the same time during the circuit build time learning phase.
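   As a non-normative illustration, an implementation might apply the
   Default/Min/Max rows above as follows.  The helper and the 'params'
   mapping are inventions for the example.

      def cbt_param(params, name, default, lo, hi):
          """Read a consensus parameter, falling back to its default and
          clamping it to the documented [Min, Max] range."""
          return max(lo, min(hi, params.get(name, default)))

      # Example usage with a parsed consensus-parameter mapping:
      # quantile = cbt_param(consensus_params, "cbtquantile", 80, 10, 99)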
2.5. Handling failure

   If an attempt to extend a circuit fails (either because the first create
   failed or a subsequent extend failed) then the circuit is torn down and is
   no longer pending.  (XXXX really?)  Requests that might have been
   supported by the pending circuit thus become unsupported, and a new
   circuit needs to be constructed.

   If a stream "begin" attempt fails with an EXITPOLICY error, we
   decide that the exit node's exit policy is not correctly advertised,
   so we treat the exit node as if it were a non-exit until we retrieve
   a fresh descriptor for it.

   Excessive amounts of either type of failure can indicate an
   attack on anonymity.  See section 7 for how excessive failure is handled.

3. Attaching streams to circuits

   When a circuit that might support a request is built, Tor tries to attach
   the request's stream to the circuit and sends a BEGIN, BEGIN_DIR,
   or RESOLVE relay cell as appropriate.  If the request completes
   unsuccessfully, Tor considers the reason given in the CLOSE relay
   cell.  [XXX yes, and?]

   After a request has remained unattached for SocksTimeout (2 minutes
   by default), Tor abandons the attempt and signals an error to the
   client as appropriate (e.g., by closing the SOCKS connection).

   XXX Timeouts and when Tor auto-retries.

    * What stream-end-reasons are appropriate for retrying.

   If no reply to BEGIN/RESOLVE, then the stream will time out and fail.

4. Hidden-service related circuits

   XXX Tracking expected hidden service use (client-side and hidserv-side)

5. Guard nodes

   We use Guard nodes (also called "helper nodes" in the research
   literature) to prevent certain profiling attacks.  For an overview of
   our Guard selection algorithm -- which has grown rather complex -- see
   guard-spec.txt.

5.1. How consensus bandwidth weights factor into entry guard selection

   When weighting a list of routers for choosing an entry guard, the following
   consensus parameters (from the "bandwidth-weights" line) apply:

      Wgg - Weight for Guard-flagged nodes in the guard position
      Wgm - Weight for non-flagged nodes in the guard position
      Wgd - Weight for Guard+Exit-flagged nodes in the guard position
      Wgb - Weight for BEGIN_DIR-supporting Guard-flagged nodes
      Wmb - Weight for BEGIN_DIR-supporting non-flagged nodes
      Web - Weight for BEGIN_DIR-supporting Exit-flagged nodes
      Wdb - Weight for BEGIN_DIR-supporting Guard+Exit-flagged nodes

   Please see "bandwidth-weights" in §3.4.1 of dir-spec.txt for more in-depth
   descriptions of these parameters.

   If a router has been marked as both an entry guard and an exit, then we
   prefer to use it more, with our preference for doing so (roughly) linearly
   increasing w.r.t. the router's non-guard bandwidth and bandwidth weight
   (calculated without taking the guard flag into account).  From proposal
   #236:

    |
    | Let Wpf denote the weight from the 'bandwidth-weights' line a
    | client would apply to N for position p if it had the guard
    | flag, Wpn the weight if it did not have the guard flag, and B the
    | measured bandwidth of N in the consensus.  Then instead of choosing
    | N for position p proportionally to Wpf*B or Wpn*B, clients should
    | choose N proportionally to F*Wpf*B + (1-F)*Wpn*B.

   where F is the weight as calculated using the above parameters.
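   As a non-normative illustration, the interpolation above can be written
   as follows.  The function name is an invention for the example; the
   formula comes from the quoted text, and the 1/10000 weight unit matches
   the consensus weights described earlier.

      def guard_selection_weight(B, Wpf, Wpn, F):
          """B: consensus bandwidth of node N; Wpf/Wpn: weights for the
          position with and without the Guard flag; F: fraction above."""
          return (F * Wpf * B + (1.0 - F) * Wpn * B) / 10000.0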
6. Server descriptor purposes

   There are currently three "purposes" supported for server descriptors:
   general, controller, and bridge.  Most descriptors are of type general
   -- these are the ones listed in the consensus, and the ones fetched
   and used in normal cases.

   Controller-purpose descriptors are those delivered by the controller
   and labelled as such: they will be kept around (and expire like
   normal descriptors), and they can be used by the controller in its
   EXTENDCIRCUIT commands.  Otherwise they are ignored by Tor when it
   chooses paths.

   Bridge-purpose descriptors are for routers that are used as bridges.  See
   doc/design-paper/blocking.pdf for more design explanation, or proposal
   125 for specific details.  Currently bridge descriptors are used in place
   of normal entry guards, for Tor clients that have UseBridges enabled.

7. Detecting route manipulation by Guard nodes (Path Bias)

   The Path Bias defense is designed to defend against a type of route
   capture where malicious Guard nodes deliberately fail or choke circuits
   that extend to non-colluding Exit nodes, to maximize their network
   utilization in favor of carrying only compromised traffic.

   In the extreme, the attack allows an adversary that carries c/n
   of the network capacity to deanonymize c/n of the network
   connections, breaking the O((c/n)^2) property of Tor's original
   threat model.  It also allows targeted attacks aimed at monitoring
   the activity of specific users, bridges, or Guard nodes.

   There are two points where path selection can be manipulated:
   during construction, and during usage.  Circuit construction
   can be manipulated by inducing circuit failures during circuit
   extend steps, which causes the Tor client to transparently retry
   the circuit construction with a new path.  Circuit usage can be
   manipulated by abusing the stream retry features of Tor (for
   example by withholding stream attempt responses from the client
   until the stream timeout has expired), at which point the Tor client
   will also transparently retry the stream on a new path.

   The defense as deployed therefore makes two independent sets of
   measurements of successful path use: one during circuit construction,
   and one during circuit usage.

   The intended behavior is for clients to ultimately disable the use
   of Guards responsible for excessive circuit failure of either type
   (see section 7.4); however, known issues with the Tor network currently
   restrict the defense to being informational only at this stage (see
   section 7.5).

7.1. Measuring path construction success rates

   Clients maintain two counts for each of their guards: a count of the
   number of times a circuit was extended to at least two hops through that
   guard, and a count of the number of circuits that successfully complete
   through that guard.  The ratio of these two numbers is used to determine
   a circuit success rate for that Guard.

   Circuit build timeouts are counted as construction failures if the
   circuit fails to complete before the 95% "right-censored" timeout
   interval, not the 80% timeout condition (see section 2.4).

   If a circuit closes prematurely after construction but before being
   requested to close by the client, this is counted as a failure.

7.2. Measuring path usage success rates

   Clients maintain two usage counts for each of their guards: a count
   of the number of usage attempts, and a count of the number of
   successful usages.

   A usage attempt means any attempt to attach a stream to a circuit.

   Usage success status is temporarily recorded by state flags on circuits.
   Guard usage success counts are not incremented until circuit close.
   A circuit is marked as successfully used if we receive a properly
   recognized RELAY cell on that circuit that was expected for the current
   circuit purpose.

   If subsequent stream attachments fail or time out, the successfully used
   state of the circuit is cleared, causing it once again to be regarded
   as a usage attempt only.

   Upon close by the client, all circuits that are still marked as usage
   attempts are probed using a RELAY_BEGIN cell constructed with a
   destination of the form 0.a.b.c:25, where a.b.c is a 24-bit random
   nonce.  If we get a RELAY_COMMAND_END in response matching our nonce,
   the circuit is counted as successfully used.

   If any unrecognized RELAY cells arrive after the probe has been sent,
   the circuit is counted as a usage failure.

   If the stream failure reason codes DESTROY, TORPROTOCOL, or INTERNAL
   are received in response to any stream attempt, such circuits are not
   probed and are declared usage failures.

   Prematurely closed circuits are not probed, and are counted as usage
   failures.

7.3. Scaling success counts

   To provide a moving average of recent Guard activity while
   still preserving the ability to verify correctness, we periodically
   "scale" the success counts by multiplying them by a scale factor
   between 0 and 1.0.

   Scaling is performed when either usage or construction attempt counts
   exceed a parametrized value.

   To avoid error due to scaling during circuit construction and use,
   currently open circuits are subtracted from the usage counts before
   scaling, and added back after scaling.

7.4. Parametrization

   The following consensus parameters tune various aspects of the
   defense.

      pb_mincircs
        Default: 150
        Min: 5
        Effect: This is the minimum number of circuits that must complete
                at least 2 hops before we begin evaluating construction rates.

      pb_noticepct
        Default: 70
        Min: 0
        Max: 100
        Effect: If the circuit success rate falls below this percentage,
                we emit a notice log message.

      pb_warnpct
        Default: 50
        Min: 0
        Max: 100
        Effect: If the circuit success rate falls below this percentage,
                we emit a warn log message.

      pb_extremepct
        Default: 30
        Min: 0
        Max: 100
        Effect: If the circuit success rate falls below this percentage,
                we emit a more alarmist warning log message.  If
                pb_dropguards is set to 1, we also disable the use of the
                guard.

      pb_dropguards
        Default: 0
        Min: 0
        Max: 1
        Effect: If the circuit success rate falls below pb_extremepct
                and this parameter is set to 1, we disable use of that
                guard.

      pb_scalecircs
        Default: 300
        Min: 10
        Effect: After this many circuits have completed at least two hops,
                Tor performs the scaling described in Section 7.3.

      pb_multfactor and pb_scalefactor
        Default: 1/2
        Min: 0.0
        Max: 1.0
        Effect: The double-precision result obtained from
                pb_multfactor/pb_scalefactor is multiplied by our current
                counts to scale them.

      pb_minuse
        Default: 20
        Min: 3
        Effect: This is the minimum number of circuits that we must attempt to
                use before we begin evaluating usage rates.

      pb_noticeusepct
        Default: 80
        Min: 3
        Effect: If the circuit usage success rate falls below this percentage,
                we emit a notice log message.

      pb_extremeusepct
        Default: 60
        Min: 3
        Effect: If the circuit usage success rate falls below this percentage,
                we emit a warning log message.  We also disable the use of the
                guard if pb_dropguards is set.

      pb_scaleuse
        Default: 100
        Min: 10
        Effect: After we have attempted to use this many circuits,
                Tor performs the scaling described in Section 7.3.
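   As a non-normative illustration, the scaling in Section 7.3 with the
   default 1/2 factor might look as follows.  The function and argument
   names are inventions for the example; applying the open-circuit
   adjustment to the attempt count follows the text of 7.3, and how open
   circuits affect the success count is an assumption here.

      def scale_counts(attempts, successes, open_circuits,
                       multfactor=1, scalefactor=2):
          """Multiply the counts by pb_multfactor/pb_scalefactor,
          excluding currently open circuits from the attempt count."""
          factor = float(multfactor) / float(scalefactor)
          attempts = (attempts - open_circuits) * factor + open_circuits
          successes = successes * factor
          return attempts, successes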
7.5. Known barriers to enforcement

   Due to intermittent CPU overload at relays, the normal rate of
   successful circuit completion is highly variable.  The Guard-dropping
   version of the defense is unlikely to be deployed until the ntor
   circuit handshake is enabled, or the nature of CPU overload induced
   failure is better understood.


X. Old notes

X.1. Do we actually do this?

   How to deal with network down.

   While all helpers are down/unreachable and there are no established
   or on-the-way testing circuits, launch a testing circuit.  (Do this
   periodically in the same way we try to establish normal circuits
   when things are working normally.)
   (Testing circuits are a special type of circuit, that streams won't
   attach to by accident.)

   When a testing circuit succeeds, mark all helpers up and hold
   the testing circuit open.

   If a connection to a helper succeeds, close all testing circuits.
   Else mark that helper down and try another.

   If the last helper is marked down and we already have a testing
   circuit established, then add the first hop of that testing circuit
   to the end of our helper node list, close that testing circuit,
   and go back to square one.  (Actually, rather than closing the
   testing circuit, can we get away with converting it to a normal
   circuit and beginning to use it immediately?)

   [Do we actually do any of the above?  If so, let's spec it.  If not,
   let's remove it. -NM]

X.2. A thing we could do to deal with reachability.

   And as a bonus, it leads to an answer to Nick's attack ("If I pick
   my helper nodes all on 18.0.0.0:*, then I move, you'll know where I
   bootstrapped") -- the answer is to pick your original three helper nodes
   without regard for reachability.  Then the above algorithm will add some
   more that are reachable for you, and if you move somewhere, it's more
   likely (though not certain) that some of the originals will become useful.
   Is that smart or just complex?

X.3. Some stuff that worries me about entry guards. 2006 Jun, Nickm.

   It is unlikely for two users to have the same set of entry guards.
   Observing a user is sufficient to learn its entry guards.  So, as we move
   around, entry guards make us linkable.  If we want to change guards when
   our location (IP? subnet?) changes, we have two bad options.  We could

     - Drop the old guards.  But if we go back to our old location,
       we'll not use our old guards.  For a laptop that sometimes gets used
       from work and sometimes from home, this is pretty fatal.
     - Remember the old guards as associated with the old location, and use
       them again if we ever go back to the old location.  This would be
       nasty, since it would force us to record where we've been.

   [Do we do any of this now?  If not, this should move into 099-misc or
   098-todo. -NM]