diff options
Diffstat (limited to 'doc')
-rw-r--r-- | doc/spec/path-spec.txt | 131 | ||||
-rw-r--r-- | doc/spec/proposals/151-path-selection-improvements.txt | 1 |
2 files changed, 131 insertions, 1 deletions
diff --git a/doc/spec/path-spec.txt b/doc/spec/path-spec.txt index 8a85718a08..ae5ab3e3dd 100644 --- a/doc/spec/path-spec.txt +++ b/doc/spec/path-spec.txt @@ -175,6 +175,7 @@ of their choices. XXXX + 2.2. Path selection and constraints We choose the path for each new circuit before we build it. We choose the @@ -295,8 +296,136 @@ of their choices. at a given node -- either via the ".exit" notation or because the destination is running at the same location as an exit node. +2.4. Learning when to give up ("timeout") on circuit construction + + Since version 0.2.2.8-alpha, Tor attempts to learn when to give up on + circuits based on network conditions. + +2.4.1 Distribution choice and parameter estimation + + Based on studies of build times, we found that the distribution of + circuit build times appears to be a Frechet distribution. However, + estimators and quantile functions of the Frechet distribution are + difficult to work with and slow to converge. So instead, since we + are only interested in the accuracy of the tail, we approximate + the tail of the distribution with a Pareto curve. + + We calculate the parameters for a Pareto distribution fitting the data + using the estimators at + http://en.wikipedia.org/wiki/Pareto_distribution#Parameter_estimation. + + Because this is not a true Pareto distribution, we alter how Xm is + computed. The Xm parameter is computed as the midpoint of the most + frequently occurring 50ms histogram bin, until the point where 1000 + circuits are recorded. After this point, the weighted average of the top + 3 midpoint modes is used as Xm. All times below this value are counted + as having the midpoint value of this weighted average bin. + + The timeout itself is calculated by using the Pareto Quantile function (the + inverted CDF) to give us the value on the CDF such that 80% of the mass + of the distribution is below the timeout value. + + Thus, we expect that the Tor client will accept the fastest 80% of + the total number of paths on the network. + +2.4.2. How much data to record + + From our observations, the minimum number of circuit build times for a + reasonable fit appears to be on the order of 100. However, to keep a + good fit over the long term, we store 1000 most recent circuit build times + in a circular array. + + The Tor client should build test circuits at a rate of one per + minute up until 100 circuits are built. This allows a fresh Tor to have + a CircuitBuildTimeout estimated within 1.5 hours after install, + upgrade, or network change (see below). + + Timeouts are stored on disk in a histogram of 50ms bin width, the same + width used to calculate the Xm value above. This histogram must be shuffled + after being read from disk, to preserve a proper expiration of old values + after restart. + +2.4.3. How to record timeouts + + Timeouts should be counted as the expectation of the region of + of the Pareto distribution beyond the cutoff. This is done by + generating a random sample for each timeout at points on the + curve beyond the current timeout cutoff up to the 90% quantile marker. + +2.4.4. Detecting Changing Network Conditions + + We attempt to detect both network connectivity loss and drastic + changes in the timeout characteristics. + + We assume that we've had network connectivity loss if 3 circuits + timeout and we've received no cells or TLS handshakes since those + circuits began. We then temporarily set the timeout to 60 seconds + and stop counting timeouts. + + If 3 more circuits timeout and the network still has not been + live within this new 60 second timeout window, we then discard + the previous timeouts during this period from our history. + + To detect changing network conditions, we keep a history of + the timeout or non-timeout status of the past 20 circuits that + successfully completed at least one hop. If more than 90% of + these circuits timeout, we discard all buildtimes history, reset + the timeout to 60, and then begin recomputing the timeout. + + If the timeout was already 60 or higher, we double the timeout. + +2.4.5. Consensus parameters governing behavior + + Clients that implement circuit build timeout learning should obey the + following consensus parameters that govern behavior, in order to allow + us to handle bugs or other emergent behaviors due to client circuit + construction. If these parameters are not present in the consensus, + the listed default values should be used instead. + + cbtrecentcount + Default: 20 + Effect: This is the number of circuit build times to keep track of + for the following option. + + cbtmaxtimeouts + Default: 18 + Effect: When this many timeouts happen in the last 'cbtrecentcount' + circuit attempts, the client should discard all of its + history and begin learning a fresh timeout value. + + cbtmincircs + Default: 100 + Effect: This is the minimum number of circuits to build before + computing a timeout. + + cbtquantile + Default: 80 + Effect: This is the position on the quantile curve to use to set the + timeout value. It is a percent (0-99). + + cbtmaxsynthquantile + Default: 90 + Effect: This is the maximum position on the quantile curve to use to + generate synthetic circuit build times for timeouts. It is a + percent (0-99). + + cbttestfreq + Default: 60 + Effect: Describes how often in seconds to build a test circuit to + gather timeout values. Only applies if less than 'cbtmincircs' + have been recorded. + + cbtmintimeout + Default: 2000 + Effect: This is the minimum allowed timeout value in milliseconds. + + cbtinitialtimeout + Default: 60000 + Effect: This is the timeout value to use before computing a timeout, + in milliseconds. + -2.4. Handling failure +2.5. Handling failure If an attempt to extend a circuit fails (either because the first create failed or a subsequent extend failed) then the circuit is torn down and is diff --git a/doc/spec/proposals/151-path-selection-improvements.txt b/doc/spec/proposals/151-path-selection-improvements.txt index 363eebae84..af89f21193 100644 --- a/doc/spec/proposals/151-path-selection-improvements.txt +++ b/doc/spec/proposals/151-path-selection-improvements.txt @@ -3,6 +3,7 @@ Title: Improving Tor Path Selection Author: Fallon Chen, Mike Perry Created: 5-Jul-2008 Status: Finished +In-Spec: path-spec.txt Overview |