summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
-rw-r--r--doc/spec/proposals/151-path-selection-improvements.txt138
1 files changed, 79 insertions, 59 deletions
diff --git a/doc/spec/proposals/151-path-selection-improvements.txt b/doc/spec/proposals/151-path-selection-improvements.txt
index 3362efbe2b..fd24a9c217 100644
--- a/doc/spec/proposals/151-path-selection-improvements.txt
+++ b/doc/spec/proposals/151-path-selection-improvements.txt
@@ -10,8 +10,8 @@ Overview
The performance of paths selected can be improved by adjusting the
CircuitBuildTimeout and avoiding failing guard nodes. This proposal
- describes a method of tracking buildtime statistics, and using those
- statistics to adjust the CircuitBuildTimeout and the number of guards.
+ describes a method of tracking buildtime statistics at the client, and
+ using those statistics to adjust the CircuitBuildTimeout.
Motivation
@@ -22,71 +22,91 @@ Motivation
Implementation
+ Storing Build Times
+
+ Circuit build times will be stored in the circular array
+ 'circuit_build_times' consisting of uint16_t elements as milliseconds.
+ The total size of this array will be based on the number of circuits
+ it takes to converge on a good fit of the long term distribution of
+ the circuit builds for a fixed link. We do not want this value to be
+ too large, because it will make it difficult for clients to adapt to
+ moving between different links.
+
+ From our initial observations, this value appears to be on the order
+ of 1000, but will be configurable in a #define NCIRCUITS_TO_OBSERVE.
+
+ Long Term Storage
+
+ The long-term storage representation will be implemented by storing a
+ histogram with BUILDTIME_BIN_WIDTH millisecond buckets (default 50) when
+ writing out the statistics to disk. The format of this histogram on disk
+ is yet to be finalized, but it will likely be of the format
+ 'CircuitBuildTime <bin> <count>'.
+ Example:
+
+ CircuitBuildTimeBin 1 100
+ CircuitBuildTimeBin 2 50
+ ...
+
+ Reading the histogram in will entail multiplying each bin by the
+ BUILDTIME_BIN_WIDTH and then inserting <count> values into the
+ circuit_build_times array each with the value of
+ <bin>*BUILDTIME_BIN_WIDTH.
+
Learning the CircuitBuildTimeout
Based on studies of build times, we found that the distribution of
- circuit buildtimes appears to be a Pareto distribution. The number
- of circuits to observe (ncircuits_to_cutoff) before changing the
- CircuitBuildTimeout will be tunable. From out measurements,
- ncircuits_to_cuttoff appears to be on the order of 100.
-
- In addition, the total number of circuits gathered
- (ncircuits_to_observe) will also be tunable. It is likely that
- ncircuits_to_observe will be somewhere on the order of 1000. The values
- can be represented compactly in Tor in milliseconds as a circular array
- of 16 bit integers. More compact long-term storage representations can
- be implemented by simply storing a histogram with 50 millisecond buckets
- when writing out the statistics to disk.
-
- Calculating the preferred CircuitBuildTimeout
-
- Circuits that have longer buildtimes than some x% of the estimated
- CDF of the Pareto distribution will be excluded. x will be tunable
- as well.
-
- Circuit timeouts
-
- In the event of a timeout, backoff values should include the 100-x%
- of expected CDF of timeouts. Also, in the event of network failure,
- the observation mechanism should stop collecting timeout data.
-
- Dropping Failed Guards
-
- In addition, we have noticed that some entry guards are much more
- failure prone than others. In particular, the circuit failure rates for
- the fastest entry guards was approximately 20-25%, where as slower
- guards exhibit failure rates as high as 45-50%. In [1], it was
- demonstrated that failing guard nodes can deliberately bias path
- selection to improve their success at capturing traffic. For both these
- reasons, failing guards should be avoided.
-
- We propose increasing the number of entry guards to five, and gathering
- circuit failure statistics on each entry guard. Any guards that exceed
- the average failure rate of all guards by 10% after we have
- gathered ncircuits_to_observe circuits will be replaced.
-
+ circuit buildtimes appears to be a Pareto distribution.
-Issues
+ We will calculate the parameters for a Pareto distribution
+ fitting the data using the estimators at
+ http://en.wikipedia.org/wiki/Pareto_distribution#Parameter_estimation.
- Impact on anonymity
+ The timeout itself will be calculated by solving the CDF for the
+ a percentile cutoff BUILDTIME_PERCENT_CUTOFF. This value
+ represents the percentage of paths the Tor client will accept out of
+ the total number of paths. We have not yet determined a good
+ cutoff for this mathematically, but 85% seems a good choice for now.
- Since this follows a Pareto distribution, large reductions on the
- timeout can be achieved without cutting off a great number of the
- total paths. However, hard statistics on which cutoff percentage
- gives optimal performance have not yet been gathered.
+ From http://en.wikipedia.org/wiki/Pareto_distribution#Definition,
+ the calculation we need is pow(BUILDTIME_PERCENT_CUTOFF/100.0, k)/Xm.
+
+ When to Begin Calculation
+
+ The number of circuits to observe (NCIRCUITS_TO_CUTOFF) before
+ changing the CircuitBuildTimeout will be tunable via a #define. From
+ our measurements, a good value for NCIRCUITS_TO_CUTOFF appears to be
+ on the order of 100.
+
+ Dealing with Timeouts
+
+ Timeouts should be counted as the expectation of the region of
+ of the Pareto distribution beyond the cutoff. The proposal will
+ be updated with this value soon.
+
+ Also, in the event of network failure, the observation mechanism
+ should stop collecting timeout data.
- Guard Turnover
+ Circuits that timeout will be destroyed, as this indicates one
+ or more of their respective nodes are currently overloaded.
- We contend that the risk from failing guards biasing path selection
- outweighs the risk of exposure to larger portions of the network
- for the first hop. Furthermore, from our observations, it appears
- that circuit failure is strongly correlated to node load. Allowing
- clients to migrate away from failing guards should naturally
- rebalance the network, and eventually clients should converge on
- a stable set of reliable guards. It is also likely that once clients
- begin to migrate away from failing guards, their load should go
- down, causing their failure rates to drop as well.
+ Client Hints
+ Some research still needs to be done to provide initial values
+ for CircuitBuildTimeout based on values learned from modem
+ users, DSL users, Cable Modem users, and dedicated links. A
+ radiobutton in Vidalia should eventually be provided that
+ sets CircuitBuildTimeout to one of these values and also
+ provide the option of purging all learned data, should any exist.
-[1] http://www.crhc.uiuc.edu/~nikita/papers/relmix-ccs07.pdf
+ These values can either be published in the directory, or
+ shipped hardcoded for a particular Tor version.
+
+Issues
+
+ Impact on anonymity
+ Since this follows a Pareto distribution, large reductions on the
+ timeout can be achieved without cutting off a great number of the
+ total paths. This will eliminate a great deal of the performance
+ variation of Tor usage.