diff options
Diffstat (limited to 'spec/path-spec')
-rw-r--r-- | spec/path-spec/attaching-streams-to-circuits.md | 19 | ||||
-rw-r--r-- | spec/path-spec/building-circuits.md | 9 | ||||
-rw-r--r-- | spec/path-spec/cannibalizing-circuits.md | 15 | ||||
-rw-r--r-- | spec/path-spec/detecting-route-manipulation.md | 202 | ||||
-rw-r--r-- | spec/path-spec/general-operation.md | 97 | ||||
-rw-r--r-- | spec/path-spec/guard-nodes.md | 44 | ||||
-rw-r--r-- | spec/path-spec/handling-failure.md | 19 | ||||
-rw-r--r-- | spec/path-spec/hidden-service-related-circuits.md | 5 | ||||
-rw-r--r-- | spec/path-spec/index.md | 21 | ||||
-rw-r--r-- | spec/path-spec/learning-timeouts.md | 305 | ||||
-rw-r--r-- | spec/path-spec/path-selection-constraints.md | 146 | ||||
-rw-r--r-- | spec/path-spec/server-descriptor-purposes.md | 19 | ||||
-rw-r--r-- | spec/path-spec/when-we-build.md | 177 |
13 files changed, 1078 insertions, 0 deletions
diff --git a/spec/path-spec/attaching-streams-to-circuits.md b/spec/path-spec/attaching-streams-to-circuits.md
new file mode 100644
index 0000000..585885b
--- /dev/null
+++ b/spec/path-spec/attaching-streams-to-circuits.md
@@ -0,0 +1,19 @@
+<a id="path-spec.txt-3"></a>
+
+# Attaching streams to circuits
+
+When a circuit that might support a request is built, Tor tries to attach
+the request's stream to the circuit and sends a BEGIN, BEGIN_DIR,
+or RESOLVE relay
+cell as appropriate. If the request completes unsuccessfully, Tor
+considers the reason given in the CLOSE relay cell. \[XXX yes, and?\]
+
+After a request has remained unattached for SocksTimeout (2 minutes
+by default), Tor abandons the attempt and signals an error to the
+client as appropriate (e.g., by closing the SOCKS connection).
+
+XXX Timeouts and when Tor auto-retries.
+
+- What stream-end-reasons are appropriate for retrying.
+
+If no reply to BEGIN/RESOLVE, then the stream will time out and fail.
diff --git a/spec/path-spec/building-circuits.md b/spec/path-spec/building-circuits.md
new file mode 100644
index 0000000..dfb1332
--- /dev/null
+++ b/spec/path-spec/building-circuits.md
@@ -0,0 +1,9 @@
+<a id="path-spec.txt-2"></a>
+
+# Building circuits
+
+Here we describe a number of rules for building circuits: under what
+circumstances we do so, how we choose the paths for them, when we give
+up on an in-progress circuit, and what we do when circuit
+construction fails.
+
diff --git a/spec/path-spec/cannibalizing-circuits.md b/spec/path-spec/cannibalizing-circuits.md
new file mode 100644
index 0000000..3987839
--- /dev/null
+++ b/spec/path-spec/cannibalizing-circuits.md
@@ -0,0 +1,15 @@
+<a id="path-spec.txt-2.3"></a>
+
+# Cannibalizing circuits
+
+If we need a circuit and have a clean one already established, in
+some cases we can adapt the clean circuit for our new
+purpose.
Specifically, + +For hidden service interactions, we can "cannibalize" a clean internal +circuit if one is available, so we don't need to build those circuits +from scratch on demand. + +We can also cannibalize clean circuits when the client asks to exit +at a given node -- either via the ".exit" notation or because the +destination is running at the same location as an exit node. diff --git a/spec/path-spec/detecting-route-manipulation.md b/spec/path-spec/detecting-route-manipulation.md new file mode 100644 index 0000000..76be4d1 --- /dev/null +++ b/spec/path-spec/detecting-route-manipulation.md @@ -0,0 +1,202 @@ +<a id="path-spec.txt-7"></a> + +# Detecting route manipulation by Guard nodes (Path Bias) {#pathbias} + +The Path Bias defense is designed to defend against a type of route +capture where malicious Guard nodes deliberately fail or choke circuits +that extend to non-colluding Exit nodes to maximize their network +utilization in favor of carrying only compromised traffic. + +In the extreme, the attack allows an adversary that carries c/n +of the network capacity to deanonymize c/n of the network +connections, breaking the O((c/n)^2) property of Tor's original +threat model. It also allows targeted attacks aimed at monitoring +the activity of specific users, bridges, or Guard nodes. + +There are two points where path selection can be manipulated: +during construction, and during usage. Circuit construction +can be manipulated by inducing circuit failures during circuit +extend steps, which causes the Tor client to transparently retry +the circuit construction with a new path. Circuit usage can be +manipulated by abusing the stream retry features of Tor (for +example by withholding stream attempt responses from the client +until the stream timeout has expired), at which point the tor client +will also transparently retry the stream on a new path. 
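The threat-model arithmetic above can be illustrated with a small sketch (the function names and numbers are ours, not part of the spec): a passive adversary must control both the guard and the exit positions of a circuit, while a route-capturing adversary compromises every connection that enters one of its guards.

```python
# Hypothetical illustration of the c/n argument above; not Tor code.
def passive_compromise_rate(c: float, n: float) -> float:
    """Chance both guard and exit are malicious: roughly (c/n)^2."""
    return (c / n) ** 2

def route_capture_rate(c: float, n: float) -> float:
    """Chance of compromise when malicious guards fail circuits to
    honest exits until a colluding exit is chosen: roughly c/n."""
    return c / n

# An adversary with 10% of network capacity:
print(passive_compromise_rate(10, 100))  # about 0.01 (1% of connections)
print(route_capture_rate(10, 100))       # 0.1 (10% of connections)
```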
+ +The defense as deployed therefore makes two independent sets of +measurements of successful path use: one during circuit construction, +and one during circuit usage. + +The intended behavior is for clients to ultimately disable the use +of Guards responsible for excessive circuit failure of either type +(for the parameters to do this, see ["Parameterization"](#parameters) below); +however known issues with the Tor network currently +restrict the defense to being informational only at this stage +(see ["Known barriers to enforcement"](#barriers)). + +<a id="path-spec.txt-7.1"></a> + +## Measuring path construction success rates {#construction-success-rate} + +Clients maintain two counts for each of their guards: a count of the +number of times a circuit was extended to at least two hops through that +guard, and a count of the number of circuits that successfully complete +through that guard. The ratio of these two numbers is used to determine +a circuit success rate for that Guard. + +[Circuit build timeouts](./learning-timeouts.md) +are counted as construction failures if the +circuit fails to complete before the 95% "right-censored" timeout +interval, not the 80% timeout condition. + +If a circuit closes prematurely after construction but before being +requested to close by the client, this is counted as a failure. + +<a id="path-spec.txt-7.2"></a> + +## Measuring path usage success rates {#usage-success-rate} + +Clients maintain two usage counts for each of their guards: a count +of the number of usage attempts, and a count of the number of +successful usages. + +A usage attempt means any attempt to attach a stream to a circuit. + +Usage success status is temporarily recorded by state flags on circuits. +Guard usage success counts are not incremented until circuit close. A +circuit is marked as successfully used if we receive a properly +recognized RELAY cell on that circuit that was expected for the current +circuit purpose. 
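A minimal sketch of the two per-guard measurement pairs described in this section (the class and field names are illustrative assumptions, not identifiers from the Tor source):

```python
# Illustrative per-guard Path Bias accounting; names are assumptions.
class GuardPathBias:
    def __init__(self) -> None:
        self.circ_attempts = 0    # circuits extended to >= 2 hops
        self.circ_successes = 0   # circuits that completed construction
        self.use_attempts = 0     # circuits we tried to attach a stream to
        self.use_successes = 0    # circuits with an expected RELAY cell

    def construction_rate(self) -> float:
        return self.circ_successes / self.circ_attempts

    def usage_rate(self) -> float:
        return self.use_successes / self.use_attempts

g = GuardPathBias()
for completed in (True, True, False, True):
    g.circ_attempts += 1
    g.circ_successes += completed
print(g.construction_rate())  # 0.75
```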
+ +If subsequent stream attachments fail or time out, the successfully used +state of the circuit is cleared, causing it once again to be regarded +as a usage attempt only. + +Upon close by the client, all circuits that are still marked as usage +attempts are probed using a RELAY_BEGIN cell constructed with a +destination of the form 0.a.b.c:25, where a.b.c is a 24 bit random +nonce. If we get a RELAY_COMMAND_END in response matching our nonce, +the circuit is counted as successfully used. + +If any unrecognized RELAY cells arrive after the probe has been sent, +the circuit is counted as a usage failure. + +If the stream failure reason codes DESTROY, TORPROTOCOL, or INTERNAL +are received in response to any stream attempt, such circuits are not +probed and are declared usage failures. + +Prematurely closed circuits are not probed, and are counted as usage +failures. + +<a id="path-spec.txt-7.3"></a> + +## Scaling success counts {#scaling} + +To provide a moving average of recent Guard activity while +still preserving the ability to verify correctness, we periodically +"scale" the success counts by multiplying them by a scale factor +between 0 and 1.0. + +Scaling is performed when either usage or construction attempt counts +exceed a parametrized value. + +To avoid error due to scaling during circuit construction and use, +currently open circuits are subtracted from the usage counts before +scaling, and added back after scaling. + +<a id="path-spec.txt-7.4"></a> + +## Parametrization {#parameters} + +The following consensus parameters tune various aspects of the +defense. + +```text + pb_mincircs + Default: 150 + Min: 5 + Effect: This is the minimum number of circuits that must complete + at least 2 hops before we begin evaluating construction rates. + + pb_noticepct + Default: 70 + Min: 0 + Max: 100 + Effect: If the circuit success rate falls below this percentage, + we emit a notice log message. 
+
+ pb_warnpct
+ Default: 50
+ Min: 0
+ Max: 100
+ Effect: If the circuit success rate falls below this percentage,
+ we emit a warn log message.
+
+ pb_extremepct
+ Default: 30
+ Min: 0
+ Max: 100
+ Effect: If the circuit success rate falls below this percentage,
+ we emit a more alarmist warning log message. If
+ pb_dropguards is set to 1, we also disable the use of the
+ guard.
+
+ pb_dropguards
+ Default: 0
+ Min: 0
+ Max: 1
+ Effect: If the circuit success rate falls below pb_extremepct,
+ when pb_dropguards is set to 1, we disable use of that
+ guard.
+
+ pb_scalecircs
+ Default: 300
+ Min: 10
+ Effect: After this many circuits have completed at least two hops,
+ Tor performs the scaling described in
+ ["Scaling success counts"](#scaling).
+
+ pb_multfactor and pb_scalefactor
+ Default: 1/2
+ Min: 0.0
+ Max: 1.0
+ Effect: The double-precision result obtained from
+ pb_multfactor/pb_scalefactor is multiplied by our current
+ counts to scale them.
+
+ pb_minuse
+ Default: 20
+ Min: 3
+ Effect: This is the minimum number of circuits that we must attempt to
+ use before we begin evaluating usage rates.
+
+ pb_noticeusepct
+ Default: 80
+ Min: 3
+ Effect: If the circuit usage success rate falls below this percentage,
+ we emit a notice log message.
+
+ pb_extremeusepct
+ Default: 60
+ Min: 3
+ Effect: If the circuit usage success rate falls below this percentage,
+ we emit a warning log message. We also disable the use of the
+ guard if pb_dropguards is set.
+
+ pb_scaleuse
+ Default: 100
+ Min: 10
+ Effect: After we have attempted to use this many circuits,
+ Tor performs the scaling described in
+ ["Scaling success counts"](#scaling).
+```
+
+<a id="path-spec.txt-7.5"></a>
+
+## Known barriers to enforcement {#barriers}
+
+Due to intermittent CPU overload at relays, the normal rate of
+successful circuit completion is highly variable.
The Guard-dropping
+version of the defense is unlikely to be deployed until the ntor
+circuit handshake is enabled, or the nature of CPU-overload-induced
+failure is better understood.
diff --git a/spec/path-spec/general-operation.md b/spec/path-spec/general-operation.md
new file mode 100644
index 0000000..721ab3d
--- /dev/null
+++ b/spec/path-spec/general-operation.md
@@ -0,0 +1,97 @@
+<a id="path-spec.txt-1"></a>
+
+# General operation
+
+Tor begins building circuits as soon as it has
+[enough directory information](./when-we-build.md) to do so.
+Some circuits are
+built preemptively because we expect to need them later (for user
+traffic), and some are built because of immediate need (for user traffic
+that no current circuit can handle, for testing the network or our
+reachability, and so on).
+
+```text
+ [Newer versions of Tor (0.2.6.2-alpha and later):
+ If the consensus contains Exits (the typical case), Tor will build both
+ exit and internal circuits. When bootstrap completes, Tor will be ready
+ to handle an application requesting an exit circuit to services like the
+ World Wide Web.
+
+ If the consensus does not contain Exits, Tor will only build internal
+ circuits. In this case, earlier statuses will have included "internal"
+ as indicated above. When bootstrap completes, Tor will be ready to handle
+ an application requesting an internal circuit to hidden services at
+ ".onion" addresses.
+
+ If a future consensus contains Exits, exit circuits may become available.]
+```
+
+When a client application creates a new stream (by opening a SOCKS
+connection or launching a resolve request), we attach it to an appropriate
+open circuit if one exists, or wait if an appropriate circuit is
+in-progress. We launch a new circuit only
+if no current circuit can handle the request. We rotate circuits over
+time to avoid some profiling attacks.
+
+To build a circuit, we choose all the nodes we want to use, and then
+construct the circuit.
Sometimes, when we want a circuit that ends at a +given hop, and we have an appropriate unused circuit, we "cannibalize" the +existing circuit and extend it to the new terminus. + +These processes are described in more detail below. + +This document describes Tor's automatic path selection logic only; path +selection can be overridden by a controller (with the EXTENDCIRCUIT and +ATTACHSTREAM commands). Paths constructed through these means may +violate some constraints given below. + +<a id="path-spec.txt-1.1"></a> + +## Terminology + +A "path" is an ordered sequence of nodes, not yet built as a circuit. + +A "clean" circuit is one that has not yet been used for any traffic. + +A "fast" or "stable" or "valid" node is one that has the 'Fast' or +'Stable' or 'Valid' flag +set respectively, based on our current directory information. A "fast" +or "stable" circuit is one consisting only of "fast" or "stable" nodes. + +In an "exit" circuit, the final node is chosen based on waiting stream +requests if any, and in any case it avoids nodes with exit policy of +"reject *:*". An "internal" circuit, on the other hand, is one where +the final node is chosen just like a middle node (ignoring its exit +policy). + +A "request" is a client-side stream or DNS resolve that needs to be +served by a circuit. + +A "pending" circuit is one that we have started to build, but which has +not yet completed. + +A circuit or path "supports" a request if it is okay to use the +circuit/path to fulfill the request, according to the rules given below. +A circuit or path "might support" a request if some aspect of the request +is unknown (usually its target IP), but we believe the path probably +supports the request according to the rules given below. + +<a id="path-spec.txt-1.2"></a> + +## A relay's bandwidth {#bandwidth} + +Old versions of Tor did not report bandwidths in network status +documents, so clients had to learn them from the routers' advertised +relay descriptors. 
+
+For versions of Tor prior to 0.2.1.17-rc, everywhere below where we
+refer to a relay's "bandwidth", we mean its clipped advertised
+bandwidth, computed by taking the smaller of the 'rate' and
+'observed' arguments to the "bandwidth" element in the relay's
+descriptor. If a router's advertised bandwidth is greater than
+MAX_BELIEVABLE_BANDWIDTH (currently 10 MB/s), we clip to that
+value.
+
+For more recent versions of Tor, we take the bandwidth value declared
+in the consensus, and fall back to the clipped advertised bandwidth
+only if the consensus does not have bandwidths listed.
diff --git a/spec/path-spec/guard-nodes.md b/spec/path-spec/guard-nodes.md
new file mode 100644
index 0000000..af5750c
--- /dev/null
+++ b/spec/path-spec/guard-nodes.md
@@ -0,0 +1,44 @@
+<a id="path-spec.txt-5"></a>
+
+# Guard nodes
+
+We use Guard nodes (also called "helper nodes" in the research
+literature) to prevent certain profiling attacks. For an overview of
+our Guard selection algorithm -- which has grown rather complex -- see
+guard-spec.txt.
+
+<a id="path-spec.txt-5.1"></a>
+
+## How consensus bandwidth weights factor into entry guard selection {#bw-and-guards}
+
+When weighting a list of routers for choosing an entry guard, the following
+consensus parameters (from the "bandwidth-weights" line) apply:
+
+```text
+ Wgg - Weight for Guard-flagged nodes in the guard position
+ Wgm - Weight for non-flagged nodes in the guard position
+ Wgd - Weight for Guard+Exit-flagged nodes in the guard position
+ Wgb - Weight for BEGIN_DIR-supporting Guard-flagged nodes
+ Wmb - Weight for BEGIN_DIR-supporting non-flagged nodes
+ Web - Weight for BEGIN_DIR-supporting Exit-flagged nodes
+ Wdb - Weight for BEGIN_DIR-supporting Guard+Exit-flagged nodes
+```
+
+Please see "bandwidth-weights" in §3.4.1 of dir-spec.txt for more in-depth
+descriptions of these parameters.
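As a hedged sketch of how the guard-position weights above might be applied (the weight values, router data, and helper names here are invented for illustration; consensus weights are integers interpreted against a scale of 10000):

```python
import random

# Invented example data; real values come from the consensus.
def guard_weight(bw: int, is_guard: bool, is_exit: bool,
                 wgg: int, wgm: int, wgd: int) -> float:
    """Weight a router's bandwidth for the guard position."""
    if is_guard and is_exit:
        return bw * wgd / 10000.0
    if is_guard:
        return bw * wgg / 10000.0
    return bw * wgm / 10000.0

routers = [
    ("guard-only", 5000, True, False),
    ("guard+exit", 8000, True, True),
    ("unflagged", 3000, False, False),
]
# With Wgm = Wgd = 0 (illustrative), only the pure Guard node is eligible.
weights = [guard_weight(bw, g, e, wgg=5870, wgm=0, wgd=0)
           for _, bw, g, e in routers]
pick = random.choices([name for name, *_ in routers], weights=weights)[0]
print(pick)  # guard-only
```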
+ +If a router has been marked as both an entry guard and an exit, then we +prefer to use it more, with our preference for doing so (roughly) linearly +increasing w.r.t. the router's non-guard bandwidth and bandwidth weight +(calculated without taking the guard flag into account). From proposal +236: + +| +| Let Wpf denote the weight from the 'bandwidth-weights' line a +| client would apply to N for position p if it had the guard +| flag, Wpn the weight if it did not have the guard flag, and B the +| measured bandwidth of N in the consensus. Then instead of choosing +| N for position p proportionally to Wpf*B or Wpn*B, clients should +| choose N proportionally to F*Wpf*B + (1-F)*Wpn*B. + +where F is the weight as calculated using the above parameters. diff --git a/spec/path-spec/handling-failure.md b/spec/path-spec/handling-failure.md new file mode 100644 index 0000000..8db9245 --- /dev/null +++ b/spec/path-spec/handling-failure.md @@ -0,0 +1,19 @@ +<a id="path-spec.txt-2.5"></a> + +# Handling failure + +If an attempt to extend a circuit fails (either because the first create +failed or a subsequent extend failed) then the circuit is torn down and is +no longer pending. (XXXX really?) Requests that might have been +supported by the pending circuit thus become unsupported, and a new +circuit needs to be constructed. + +If a stream "begin" attempt fails with an EXITPOLICY error, we +decide that the exit node's exit policy is not correctly advertised, +so we treat the exit node as if it were a non-exit until we retrieve +a fresh descriptor for it. + +Excessive amounts of either type of failure can indicate an +attack on anonymity. +See [discussion of path bias detection](./detecting-route-manipulation.md) +for how excessive failure is handled. 
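The EXITPOLICY rule above might be sketched like this (the data model is an assumption made for illustration; C Tor tracks this state differently):

```python
class ExitView:
    """Client-side view of one exit node; illustrative only."""
    def __init__(self) -> None:
        self.usable_as_exit = True

    def on_stream_end(self, reason: str) -> None:
        if reason == "EXITPOLICY":
            # Advertised policy was wrong: treat as a non-exit for now.
            self.usable_as_exit = False

    def on_fresh_descriptor(self) -> None:
        # A newly fetched descriptor lifts the penalty.
        self.usable_as_exit = True

node = ExitView()
node.on_stream_end("EXITPOLICY")
assert not node.usable_as_exit
node.on_fresh_descriptor()
assert node.usable_as_exit
```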
diff --git a/spec/path-spec/hidden-service-related-circuits.md b/spec/path-spec/hidden-service-related-circuits.md new file mode 100644 index 0000000..fc6e30d --- /dev/null +++ b/spec/path-spec/hidden-service-related-circuits.md @@ -0,0 +1,5 @@ +<a id="path-spec.txt-4"></a> + +# Hidden-service related circuits + +XXX Tracking expected hidden service use (client-side and hidserv-side) diff --git a/spec/path-spec/index.md b/spec/path-spec/index.md new file mode 100644 index 0000000..55e5df8 --- /dev/null +++ b/spec/path-spec/index.md @@ -0,0 +1,21 @@ +# Tor Path Specification + +```text + Roger Dingledine + Nick Mathewson +``` + +Note: This is an attempt to specify Tor as currently implemented. Future +versions of Tor will implement improved algorithms. + +This document tries to cover how Tor chooses to build circuits and assign +streams to circuits. Other implementations MAY take other approaches, but +implementors should be aware of the anonymity and load-balancing implications +of their choices. + +THIS SPEC ISN'T DONE YET. + +The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL +NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and +"OPTIONAL" in this document are to be interpreted as described in +RFC 2119. diff --git a/spec/path-spec/learning-timeouts.md b/spec/path-spec/learning-timeouts.md new file mode 100644 index 0000000..e98bac3 --- /dev/null +++ b/spec/path-spec/learning-timeouts.md @@ -0,0 +1,305 @@ +<a id="path-spec.txt-2.4"></a> + +# Learning when to give up ("timeout") on circuit construction {#timing-out} + +Since version 0.2.2.8-alpha, Tor clients attempt to learn when to give +up on circuits based on network conditions. + +<a id="path-spec.txt-2.4.1"></a> + +## Distribution choice + +Based on studies of build times, we found that the distribution of +circuit build times appears to be a Frechet distribution (and a multi-modal +Frechet distribution, if more than one guard or bridge is used). 
However,
+estimators and quantile functions of the Frechet distribution are difficult
+to work with and slow to converge. So instead, since we are only interested
+in the accuracy of the tail, clients approximate the tail of the multi-modal
+distribution with a single Pareto curve.
+
+<a id="path-spec.txt-2.4.2"></a>
+
+## How much data to record {#observations}
+
+From our observations, the minimum number of circuit build times for a
+reasonable fit appears to be on the order of 100. However, to keep a
+good fit over the long term, clients store the 1000 most recent circuit build
+times in a circular array.
+
+These build times only include the times required to build three-hop
+circuits, and the times required to build the first three hops of circuits
+with more than three hops. Circuits of fewer than three hops are not
+recorded, and hops past the third are not recorded.
+
+The Tor client should build test circuits at a rate of one every 'cbttestfreq'
+(10 seconds) until 'cbtmincircs' (100 circuits) are built, with a maximum of
+'cbtmaxopencircs' (default: 10) circuits open at once. This allows a fresh
+Tor to have a CircuitBuildTimeout estimated within 30 minutes after install
+or network change
+(see [Detecting Changing Network Conditions](#changes-in-network) below).
+
+Timeouts are stored on disk in a histogram of 10ms bin width, the same
+width used to calculate the Xm value below. The timeouts recorded in the
+histogram must be shuffled after being read from disk, to preserve a
+proper expiration of old values after restart.
+
+Thus, some build time resolution is lost during restart. Implementations may
+choose a different persistence mechanism than this histogram, but be aware
+that build time binning is still needed for parameter estimation.
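The storage scheme above can be sketched as follows (the helper names and sample values are ours; only the 1000-entry circular buffer, the 10ms bins, and the shuffle-on-load step come from the text):

```python
import random
from collections import deque

BIN_MS = 10          # histogram bin width from the text
MAX_TIMES = 1000     # size of the circular array

def to_histogram(times):
    """Bin build times (ms) into 10ms-wide buckets for persistence."""
    hist = {}
    for t in times:
        b = (t // BIN_MS) * BIN_MS
        hist[b] = hist.get(b, 0) + 1
    return hist

def from_histogram(hist):
    """Re-expand bins at their midpoints and shuffle, so restored values
    carry no fake recency ordering into the expiration logic."""
    times = [b + BIN_MS // 2 for b, n in hist.items() for _ in range(n)]
    random.shuffle(times)
    return deque(times, maxlen=MAX_TIMES)

build_times = deque([512, 498, 1204, 733], maxlen=MAX_TIMES)
restored = from_histogram(to_histogram(build_times))
print(sorted(restored))  # [495, 515, 735, 1205] -- note the lost resolution
```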
+
+<a id="path-spec.txt-2.4.3"></a>
+
+## Parameter estimation
+
+Once 'cbtmincircs' build times are recorded, Tor clients update the
+distribution parameters and recompute the timeout on every circuit completion
+(though
+[see below](#changes-in-network)
+for when to pause and reset the timeout due to
+too many circuits timing out).
+
+Tor clients calculate the parameters for a Pareto distribution fitting the
+data using the maximum likelihood estimator. For derivation, see:
+<https://en.wikipedia.org/wiki/Pareto_distribution#Estimation_of_parameters>
+
+Because build times are not a true Pareto distribution, we alter how Xm is
+computed. In a max likelihood estimator, the mode of the distribution is
+used directly as Xm.
+
+Instead of using the mode of discrete build times directly, Tor clients
+compute the Xm parameter using the weighted average of the midpoints
+of the 'cbtnummodes' (10) most frequently occurring 10ms histogram bins.
+Ties are broken in favor of earlier bins (that is, in favor of bins
+corresponding to shorter build times).
+
+(The use of 10 modes was found to minimize error from the selected
+cbtquantile, with 10ms bins for quantiles 60-80, compared to many other
+heuristics.)
+
+To avoid ln(1.0+epsilon) precision issues, use log laws to rewrite the
+estimator for 'alpha' as the sum of logs followed by subtraction, rather
+than multiplication and division:
+
+`alpha = n/(Sum_n{ln(MAX(Xm, x_i))} - n*ln(Xm))`
+
+In this, n is the total number of build times that have completed, x_i is
+the ith recorded build time, and Xm is the weighted mode average of the
+x_i, as above.
+
+All times below Xm are counted as having the Xm value via the MAX(),
+because in Pareto estimators, Xm is supposed to be the lowest value.
+However, since clients use mode averaging to estimate Xm, there can be
+values below our Xm. Effectively, the Pareto estimator then treats
+everything smaller than Xm as having happened at Xm.
One can also see that if
+clients did not do this, alpha could underflow to become negative, which
+results in an exponential curve, not a Pareto probability distribution.
+
+The timeout itself is calculated by using the Pareto Quantile function (the
+inverted CDF) to give us the value on the CDF such that 80% of the mass
+of the distribution is below the timeout value (parameter 'cbtquantile').
+
+The Pareto Quantile Function (inverse CDF) is:
+
+`F(q) = Xm/((1.0-q)^(1.0/alpha))`
+
+Thus, clients obtain the circuit build timeout for 3-hop circuits by
+computing:
+
+`timeout_ms = F(0.8) # 'cbtquantile' == 0.8`
+
+With this, we expect that the Tor client will accept the fastest 80% of the
+total number of paths on the network.
+
+Clients obtain the circuit close time to completely abandon circuits as:
+
+`close_ms = F(0.99) # 'cbtclosequantile' == 0.99`
+
+To avoid waiting an unreasonably long period of time for circuits that
+simply have relays that are down, Tor clients cap timeout_ms at the max
+build time actually observed so far, and cap close_ms at twice this max,
+but at least 60 seconds:
+
+```text
+ timeout_ms = MIN(timeout_ms, max_observed_timeout)
+ close_ms = MAX(MIN(close_ms, 2*max_observed_timeout), 'cbtinitialtimeout')
+```
+
+<a id="path-spec.txt-2.4.3"></a>
+
+## Calculating timeout thresholds for circuits of different lengths {#different-lengths}
+
+The timeout_ms and close_ms estimates above are good only for 3-hop
+circuits, since only 3-hop circuits are recorded in the list of build
+times.
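Putting the pieces above together, here is a sketch of the estimator and the two thresholds (the sample build times and the Xm stand-in are invented; the formulas are the ones given above):

```python
import math

def pareto_alpha(times_ms, xm):
    """MLE for alpha using the log-law form with MAX(Xm, x_i)."""
    n = len(times_ms)
    return n / (sum(math.log(max(xm, x)) for x in times_ms) - n * math.log(xm))

def pareto_quantile(xm, alpha, q):
    """Inverse CDF: F(q) = Xm / (1-q)^(1/alpha)."""
    return xm / ((1.0 - q) ** (1.0 / alpha))

times = [400, 450, 500, 650, 900, 1400, 2600, 5100]  # invented samples
xm = 425  # stand-in for the weighted mode average
alpha = pareto_alpha(times, xm)
timeout_ms = pareto_quantile(xm, alpha, 0.8)    # 'cbtquantile' == 0.8
close_ms = pareto_quantile(xm, alpha, 0.99)     # 'cbtclosequantile' == 0.99
assert xm < timeout_ms < close_ms
```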
+
+To calculate the appropriate timeouts and close timeouts for circuits of
+other lengths, the client multiplies the timeout_ms and close_ms values
+by a scaling factor determined by the number of communication hops
+needed to build their circuits:
+
+```text
+timeout_ms[hops=N] = timeout_ms * Actions(N) / Actions(3)
+
+close_ms[hops=N] = close_ms * Actions(N) / Actions(3)
+```
+
+where `Actions(N) = N * (N + 1) / 2`.
+
+To calculate timeouts for operations other than circuit building,
+the client should add X to Actions(N) for every round-trip communication
+required with the Xth hop.
+
+<a id="path-spec.txt-2.4.4"></a>
+
+## How to record timeouts {#recording-timeouts}
+
+Pareto estimators begin to lose their accuracy if the tail is omitted.
+Hence, Tor clients actually calculate two timeouts: a usage timeout, and a
+close timeout.
+
+Circuits that pass the usage timeout are marked as measurement circuits,
+and are allowed to continue to build until the close timeout corresponding
+to the point 'cbtclosequantile' (default 99) on the Pareto curve, or 60
+seconds, whichever is greater.
+
+The actual completion times for these measurement circuits should be
+recorded.
+
+Implementations should completely abandon a circuit if the total build
+time exceeds the close threshold. Such closed circuits should be ignored,
+as this typically means one of the relays in the path is offline.
+
+<a id="path-spec.txt-2.4.5"></a>
+
+## Detecting Changing Network Conditions {#changes-in-network}
+
+Tor clients attempt to detect both network connectivity loss and drastic
+changes in the timeout characteristics.
+
+To detect changing network conditions, clients keep a history of
+the timeout or non-timeout status of the past 'cbtrecentcount' circuits
+(20 circuits) that successfully completed at least one hop.
If more than
+90% of these circuits time out, the client discards all buildtimes history,
+resets the timeout to 'cbtinitialtimeout' (60 seconds), and then begins
+recomputing the timeout.
+
+If the timeout was already at least `cbtinitialtimeout`,
+the client doubles the timeout.
+
+The records here (of how many circuits succeeded or failed among the most
+recent 'cbtrecentcount') are not stored as persistent state. On reload,
+we start with a new, empty state.
+
+<a id="path-spec.txt-2.4.6"></a>
+
+## Consensus parameters governing behavior {#parameters}
+
+Clients that implement circuit build timeout learning should obey the
+following consensus parameters that govern behavior, in order to allow
+us to handle bugs or other emergent behaviors due to client circuit
+construction. If these parameters are not present in the consensus,
+the listed default values should be used instead.
+
+```text
+ cbtdisabled
+ Default: 0
+ Min: 0
+ Max: 1
+ Effect: If 1, all CircuitBuildTime learning code should be
+ disabled and history should be discarded. For use in
+ emergency situations only.
+
+ cbtnummodes
+ Default: 10
+ Min: 1
+ Max: 20
+ Effect: This value governs how many modes to use in the weighted
+ average calculation of Pareto parameter Xm. Selecting Xm as the
+ average of multiple modes improves accuracy of the Pareto tail
+ for quantile cutoffs from 60-80% (see cbtquantile).
+
+ cbtrecentcount
+ Default: 20
+ Min: 3
+ Max: 1000
+ Effect: This is the number of circuit build outcomes (success vs
+ timeout) to keep track of for the following option.
+
+ cbtmaxtimeouts
+ Default: 18
+ Min: 3
+ Max: 10000
+ Effect: When this many timeouts happen in the last 'cbtrecentcount'
+ circuit attempts, the client should discard all of its
+ history and begin learning a fresh timeout value.
+
+ Note that if this parameter's value is greater than the value
+ of 'cbtrecentcount', then the history will never be
+ discarded because of this feature.
+
+ cbtmincircs
+ Default: 100
+ Min: 1
+ Max: 10000
+ Effect: This is the minimum number of circuits to build before
+ computing a timeout.
+
+ Note that if this parameter's value is higher than 1000 (the
+ number of time observations that a client keeps in its
+ circular buffer), circuit build timeout calculation is
+ effectively disabled, and the default timeouts are used
+ indefinitely.
+
+ cbtquantile
+ Default: 80
+ Min: 10
+ Max: 99
+ Effect: This is the position on the quantile curve to use to set the
+ timeout value. It is a percent (10-99).
+
+ cbtclosequantile
+ Default: 99
+ Min: Value of cbtquantile parameter
+ Max: 99
+ Effect: This is the position on the quantile curve to use to set the
+ timeout value to use to actually close circuits. It is a
+ percent (0-99).
+
+ cbttestfreq
+ Default: 10
+ Min: 1
+ Max: 2147483647 (INT32_MAX)
+ Effect: Describes how often in seconds to build a test circuit to
+ gather timeout values. Only applies if less than 'cbtmincircs'
+ have been recorded.
+
+ cbtmintimeout
+ Default: 10
+ Min: 10
+ Max: 2147483647 (INT32_MAX)
+ Effect: This is the minimum allowed timeout value in milliseconds.
+
+ cbtinitialtimeout
+ Default: 60000
+ Min: Value of cbtmintimeout
+ Max: 2147483647 (INT32_MAX)
+ Effect: This is the timeout value to use before we have enough data
+ to compute a timeout, in milliseconds. If we do not have
+ enough data to compute a timeout estimate (see cbtmincircs),
+ then we use this interval both for the close timeout and the
+ abandon timeout.
+
+ cbtlearntimeout
+ Default: 180
+ Min: 10
+ Max: 60000
+ Effect: This is how long idle circuits will be kept open while cbt is
+ learning a new timeout value.
+
+ cbtmaxopencircs
+ Default: 10
+ Min: 0
+ Max: 14
+ Effect: This is the maximum number of circuits that can be open
+ at the same time during the circuit build time learning phase.
+``` diff --git a/spec/path-spec/path-selection-constraints.md b/spec/path-spec/path-selection-constraints.md new file mode 100644 index 0000000..8cac4fc --- /dev/null +++ b/spec/path-spec/path-selection-constraints.md @@ -0,0 +1,146 @@ +<a id="path-spec.txt-2.2"></a> + +# Path selection and constraints + +We choose the path for each new circuit before we build it, +based on our current directory information. +(Clients and relays use the latest directory information they have; +directory authorities use their own opinions.) + +We choose the +exit node first, followed by the other nodes in the circuit, front to +back. (In other words, for a 3-hop circuit, we first pick hop 3, +then hop 1, then hop 2.) + +## Universal constraints + +All paths we generate obey the following +constraints: + +- We do not choose the same router twice for the same path. +- We do not choose any router in the same family as another in the same + path. (Two routers are in the same family if each one lists the other + in the "family" entries of its descriptor.) +- We do not choose more than one router in a given network range, + which defaults to /16 for IPv4 and /32 for IPv6. + (C Tor overrides this with `EnforceDistinctSubnets`; + Arti overrides this with `ipv[46]_subnet_family_prefix`.) +- The first node must be a Guard (see + discussion [below](./guard-nodes.md) and in the + [guard specification](../guard-spec)). +- XXXX Choosing the length + +## Special-purpose constraints + +Additionally, we may be building circuits with one or more requests in +mind. Each kind of request puts certain constraints on paths. + +Most circuits need to be "Fast". +For these, we only choose nodes with the `Fast` flag. +For non-"fast" circuits, nodes without the `Fast` flag are eligible. + +- TODO document which circuits (do not) need to be Fast. + +Similarly, some circuits need to be "Stable". +For these, we only choose nodes with the Stable flag. 
+ +- All service-side introduction circuits and all rendezvous paths + should be Stable. +- All connection requests for connections that we think will need to + stay open a long time require Stable circuits. Currently, Tor decides + this by examining the request's target port, and comparing it to a + list of "long-lived" ports. (Default: 21, 22, 706, 1863, 5050, + 5190, 5222, 5223, 6667, 6697, 8300.) + +<a id="path-spec.txt-2.2.1"></a> + +## Weighting node selection + +For all circuits, we weight node selection according to router bandwidth. + +We also weight the bandwidth of Exit and Guard flagged nodes depending on +the fraction of total bandwidth that they make up and depending upon the +position they are being selected for. + +These weights are published in the consensus, and are computed as described +in +["Computing Bandwidth Weights"](../dir-spec/computing-consensus.md#computing-bandwidth-weights) +in the directory specification. +They are: + +```text + Wgg - Weight for Guard-flagged nodes in the guard position + Wgm - Weight for non-flagged nodes in the guard Position + Wgd - Weight for Guard+Exit-flagged nodes in the guard Position + + Wmg - Weight for Guard-flagged nodes in the middle Position + Wmm - Weight for non-flagged nodes in the middle Position + Wme - Weight for Exit-flagged nodes in the middle Position + Wmd - Weight for Guard+Exit flagged nodes in the middle Position + + Weg - Weight for Guard flagged nodes in the exit Position + Wem - Weight for non-flagged nodes in the exit Position + Wee - Weight for Exit-flagged nodes in the exit Position + Wed - Weight for Guard+Exit-flagged nodes in the exit Position + + Wgb - Weight for BEGIN_DIR-supporting Guard-flagged nodes + Wmb - Weight for BEGIN_DIR-supporting non-flagged nodes + Web - Weight for BEGIN_DIR-supporting Exit-flagged nodes + Wdb - Weight for BEGIN_DIR-supporting Guard+Exit-flagged nodes + + Wbg - Weight for Guard+Exit-flagged nodes for BEGIN_DIR requests + Wbm - Weight for 
Guard+Exit-flagged nodes for BEGIN_DIR requests + Wbe - Weight for Guard+Exit-flagged nodes for BEGIN_DIR requests + Wbd - Weight for Guard+Exit-flagged nodes for BEGIN_DIR requests +``` + +If any of those weights is malformed or not present in a consensus, +clients proceed with the regular path selection algorithm setting +the weights to the default value of 10000. + + +## Choosing an exit + +If we know what IP address we want to connect to, we can +trivially tell whether a given router will support it by simulating +its declared exit policy. + +(DNS resolve requests are only sent to relays whose +exit policy is not equivalent to "reject *:*".) + +Because we often connect to addresses of the form hostname:port, we do not +always know the target IP address when we select an exit node. In these +cases, we need to pick an exit node that "might support" connections to a +given address port with an unknown address. An exit node "might support" +such a connection if any clause that accepts any connections to that port +precedes all clauses (if any) that reject all connections to that port. + +Unless requested to do so by the user, we never choose an exit node +flagged as "BadExit" by more than half of the authorities who advertise +themselves as listing bad exits. + +<a id="path-spec.txt-2.2.2"></a> + +## User configuration + +Users can alter the default behavior for path selection with configuration +options. + +```text + - If "ExitNodes" is provided, then every request requires an exit node on + the ExitNodes list. (If a request is supported by no nodes on that list, + and StrictExitNodes is false, then Tor treats that request as if + ExitNodes were not provided.) + + - "EntryNodes" and "StrictEntryNodes" behave analogously. 
+
+   - If a user tries to connect to or resolve a hostname of the form
+     <target>.<servername>.exit, the request is rewritten to a request for
+     <target>, and the request is only supported by the exit whose nickname
+     or fingerprint is <servername>.
+
+   - When set, "HSLayer2Nodes" and "HSLayer3Nodes" relax Tor's path
+     restrictions to allow nodes in the same /16 and node family to reappear
+     in the path. They also allow the guard node to be chosen as the RP, IP,
+     and HSDIR, and as the hop before those positions.
+```
diff --git a/spec/path-spec/server-descriptor-purposes.md b/spec/path-spec/server-descriptor-purposes.md
new file mode 100644
index 0000000..3a7dca0
--- /dev/null
+++ b/spec/path-spec/server-descriptor-purposes.md
@@ -0,0 +1,19 @@
+<a id="path-spec.txt-6"></a>
+
+# Server descriptor purposes
+
+There are currently three "purposes" supported for server descriptors:
+general, controller, and bridge. Most descriptors are of type general
+-- these are the ones listed in the consensus, and the ones fetched
+and used in normal cases.
+
+Controller-purpose descriptors are those delivered by the controller
+and labelled as such: they will be kept around (and expire like
+normal descriptors), and they can be used by the controller in its
+EXTENDCIRCUIT commands. Otherwise they are ignored by Tor when it
+chooses paths.
+
+Bridge-purpose descriptors are for routers that are used as bridges. See
+doc/design-paper/blocking.pdf for more design explanation, or proposal
+125 for specific details. Currently bridge descriptors are used in place
+of normal entry guards, for Tor clients that have UseBridges enabled.
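The three descriptor purposes above can be sketched as a small model. This is purely illustrative — the `DescriptorPurpose` enum and `usable_for_path_selection` helper are hypothetical names, not part of any Tor implementation:

```python
from dataclasses import dataclass
from enum import Enum, auto

class DescriptorPurpose(Enum):
    GENERAL = auto()     # listed in the consensus; fetched and used normally
    CONTROLLER = auto()  # delivered over the control port; kept but ignored
                         # when Tor chooses paths on its own
    BRIDGE = auto()      # used in place of entry guards when bridges are on

@dataclass
class Descriptor:
    fingerprint: str
    purpose: DescriptorPurpose

def usable_for_path_selection(desc: Descriptor, use_bridges: bool) -> bool:
    """Hypothetical check: may ordinary path selection use this descriptor?"""
    if desc.purpose is DescriptorPurpose.GENERAL:
        return True
    if desc.purpose is DescriptorPurpose.BRIDGE:
        # Bridge descriptors stand in for entry guards only when
        # UseBridges is enabled.
        return use_bridges
    # Controller-purpose descriptors are ignored in normal path choice.
    return False
```

The point of the sketch is that purpose acts as a filter: only general-purpose descriptors (plus bridge descriptors, when enabled) ever feed the path-selection algorithm.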
diff --git a/spec/path-spec/when-we-build.md b/spec/path-spec/when-we-build.md
new file mode 100644
index 0000000..7130978
--- /dev/null
+++ b/spec/path-spec/when-we-build.md
@@ -0,0 +1,177 @@
+<a id="path-spec.txt-2.1"></a>
+
+# When we build
+
+<a id="path-spec.txt-2.1.0"></a>
+
+## We don't build circuits until we have enough directory info
+
+There's a class of possible attacks where our directory servers
+only give us information about the relays that they would like us
+to use. To prevent this attack, we don't build multi-hop
+circuits
+(including
+[preemptive circuits](#preemptive),
+[on-demand circuits](#on-demand),
+[onion-service circuits](#onion-service),
+or [self-testing circuits](#self-test))
+for real traffic
+until we have enough directory information to be
+reasonably confident this attack isn't being done to us.
+
+Here, "enough" directory information is defined as:
+
+```text
+  * Having a consensus that's been valid at some point in the
+    last REASONABLY_LIVE_TIME interval (24 hours).
+
+  * Having enough descriptors that we could build at least some
+    fraction F of all bandwidth-weighted paths, without taking
+    ExitNodes/EntryNodes/etc into account.
+
+    (F is set by the PathsNeededToBuildCircuits option,
+    defaulting to the 'min_paths_for_circs_pct' consensus
+    parameter, with a final default value of 60%.)
+
+  * Having enough descriptors that we could build at least some
+    fraction F of all bandwidth-weighted paths, _while_ taking
+    ExitNodes/EntryNodes/etc into account.
+
+    (F is as above.)
+
+  * Having a descriptor for every one of the first
+    NUM_USABLE_PRIMARY_GUARDS guards among our primary guards. (see
+    guard-spec.txt)
+```
+
+We define the "fraction of bandwidth-weighted paths" as the product of
+these three fractions.
+
+```text
+  * The fraction of descriptors that we have for nodes with the Guard
+    flag, weighted by their bandwidth for the guard position.
+ * The fraction of descriptors that we have for all nodes, + weighted by their bandwidth for the middle position. + * The fraction of descriptors that we have for nodes with the Exit + flag, weighted by their bandwidth for the exit position. +``` + +If the consensus has zero weighted bandwidth for a given kind of +relay (Guard, Middle, or Exit), Tor instead uses the fraction of relays +for which it has the descriptor (not weighted by bandwidth at all). + +If the consensus lists zero exit-flagged relays, Tor instead uses the +fraction of middle relays. + +<a id="path-spec.txt-2.1.1"></a> + +## Clients build circuits preemptively {#preemptive} + +When running as a client, Tor tries to maintain at least a certain +number of clean circuits, so that new streams can be handled +quickly. To increase the likelihood of success, Tor tries to +predict what circuits will be useful by choosing from among nodes +that support the ports we have used in the recent past (by default +one hour). Specifically, on startup Tor tries to maintain one clean +fast exit circuit that allows connections to port 80, and at least +two fast clean stable internal circuits in case we get a resolve +request or hidden service request (at least three if we _run_ a +hidden service). + +After that, Tor will adapt the circuits that it preemptively builds +based on the requests it sees from the user: it tries to have two fast +clean exit circuits available for every port seen within the past hour +(each circuit can be adequate for many predicted ports -- it doesn't +need two separate circuits for each port), and it tries to have the +above internal circuits available if we've seen resolves or hidden +service activity within the past hour. If there are 12 or more clean +circuits open, it doesn't open more even if it has more predictions. + +Only stable circuits can "cover" a port that is listed in the +LongLivedPorts config option. 
Similarly, hidden service requests
+to ports listed in LongLivedPorts make us create stable internal
+circuits.
+
+Note that if there are no requests from the user for an hour, Tor
+will predict no use and build no preemptive circuits.
+
+The Tor client SHOULD NOT store its list of predicted requests to a
+persistent medium.
+
+<a id="path-spec.txt-2.1.2"></a>
+
+## Clients build circuits on demand {#on-demand}
+
+Additionally, when a client request exists that no circuit (built or
+pending) might support, we create a new circuit to support the request.
+For exit connections, we pick an exit node that will handle the
+most pending requests (choosing arbitrarily among ties), launch a
+circuit to end there, and repeat until every unattached request
+might be supported by a pending or built circuit. For internal
+circuits, we pick an arbitrary acceptable path, repeating as needed.
+
+Clients consider a circuit to become "dirty" as soon as a stream is
+attached to it, or some other request is performed over the circuit.
+If a circuit has been "dirty" for at least MaxCircuitDirtiness seconds,
+new circuits may not be attached to it.
+
+In some cases we can reuse an already established circuit if it's
+clean; see ["cannibalizing circuits"](./cannibalizing-circuits.md)
+for details.
+
+<a id="path-spec.txt-2.1.3"></a>
+
+## Relays build circuits for testing reachability and bandwidth {#self-test}
+
+Tor relays test reachability of their ORPort once they have
+successfully built a circuit (on startup and whenever their IP address
+changes). They build an ordinary fast internal circuit with themselves
+as the last hop. As soon as any testing circuit succeeds, the Tor
+relay decides it's reachable and is willing to publish a descriptor.
+
+We launch multiple testing circuits (one at a time), until we
+have NUM_PARALLEL_TESTING_CIRC (4) such circuits open.
Then we
+do a "bandwidth test" by sending a certain number of relay drop
+cells down each circuit: BandwidthRate * 10 / CELL_NETWORK_SIZE
+total cells divided across the four circuits, but never more than
+CIRCWINDOW_START (1000) cells total. This exercises both outgoing and
+incoming bandwidth, and helps to jumpstart the observed bandwidth
+(see dir-spec.txt).
+
+Tor relays also test reachability of their DirPort once they have
+established a circuit, but they use an ordinary exit circuit for
+this purpose.
+
+<a id="path-spec.txt-2.1.4"></a>
+
+## Hidden-service circuits {#onion-service}
+
+See ["Hidden-service-related circuits"](./hidden-service-related-circuits.md).
+
+<a id="path-spec.txt-2.1.5"></a>
+
+## Rate limiting of failed circuits
+
+If we fail to build a circuit N times in an X-second period
+(see ["Handling failure"](./handling-failure.md)
+for how this works), we stop building circuits until the X seconds
+have elapsed.
+XXXX
+
+<a id="path-spec.txt-2.1.6"></a>
+
+## When to tear down circuits
+
+Clients should tear down circuits (in general) only when those circuits
+have no streams on them. Additionally, clients should tear down
+stream-less circuits only under one of the following conditions:
+
+```text
+  - The circuit has never had a stream attached, and it was created too
+    long in the past (based on CircuitsAvailableTimeout or
+    cbtlearntimeout, depending on timeout estimate status).
+
+  - The circuit is dirty (has had a stream attached), and it has been
+    dirty for at least MaxCircuitDirtiness.
+```
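The teardown conditions above can be summarized in a short sketch. This is a hypothetical model, not real Tor code: the `Circuit` structure, its field names, and the `should_tear_down` helper are illustrative, and the timeout values must be supplied by the caller (corresponding to CircuitsAvailableTimeout/cbtlearntimeout and MaxCircuitDirtiness):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Circuit:
    num_streams: int              # streams currently attached
    created_at: float             # timestamp when the circuit finished building
    dirty_since: Optional[float]  # None until the first stream is attached

def should_tear_down(circ: Circuit, now: float,
                     circuits_available_timeout: float,
                     max_circuit_dirtiness: float) -> bool:
    """Return True if a stream-less client circuit may be torn down."""
    if circ.num_streams > 0:
        return False  # circuits carrying streams are left alone
    if circ.dirty_since is None:
        # Never had a stream: expire once it is older than the
        # availability timeout.
        return now - circ.created_at > circuits_available_timeout
    # Has carried a stream: expire after it has been dirty long enough.
    return now - circ.dirty_since >= max_circuit_dirtiness
```

The key design point the sketch captures is that the two timers never apply to the same circuit: a circuit is judged either by its age (if it was never used) or by how long it has been dirty (if it was), and never torn down while streams are attached.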