From 95969867fcff48cf63ea9a247e9ac09810e160d2 Mon Sep 17 00:00:00 2001 From: Mike Perry Date: Tue, 12 Aug 2008 18:23:38 +0000 Subject: Updated to remove dropping of failing guards and just focus on the specifics of recording, storing, and learning circuitbuildtimeout parameters. svn:r16511 --- proposals/151-path-selection-improvements.txt | 138 +++++++++++++++----------- 1 file changed, 79 insertions(+), 59 deletions(-) (limited to 'proposals/151-path-selection-improvements.txt') diff --git a/proposals/151-path-selection-improvements.txt b/proposals/151-path-selection-improvements.txt index 3362efb..fd24a9c 100644 --- a/proposals/151-path-selection-improvements.txt +++ b/proposals/151-path-selection-improvements.txt @@ -10,8 +10,8 @@ Overview The performance of paths selected can be improved by adjusting the CircuitBuildTimeout and avoiding failing guard nodes. This proposal - describes a method of tracking buildtime statistics, and using those - statistics to adjust the CircuitBuildTimeout and the number of guards. + describes a method of tracking buildtime statistics at the client, and + using those statistics to adjust the CircuitBuildTimeout. Motivation @@ -22,71 +22,91 @@ Motivation Implementation + Storing Build Times + + Circuit build times will be stored in the circular array + 'circuit_build_times' consisting of uint16_t elements as milliseconds. + The total size of this array will be based on the number of circuits + it takes to converge on a good fit of the long term distribution of + the circuit builds for a fixed link. We do not want this value to be + too large, because it will make it difficult for clients to adapt to + moving between different links. + + From our initial observations, this value appears to be on the order + of 1000, but will be configurable in a #define NCIRCUITS_TO_OBSERVE. + + Long Term Storage + + The long-term storage representation will be implemented by storing a + histogram with BUILDTIME_BIN_WIDTH millisecond buckets (default 50) when + writing out the statistics to disk. The format of this histogram on disk + is yet to be finalized, but it will likely be of the format + 'CircuitBuildTime '. + Example: + + CircuitBuildTimeBin 1 100 + CircuitBuildTimeBin 2 50 + ... + + Reading the histogram in will entail multiplying each bin by the + BUILDTIME_BIN_WIDTH and then inserting values into the + circuit_build_times array each with the value of + *BUILDTIME_BIN_WIDTH. + Learning the CircuitBuildTimeout Based on studies of build times, we found that the distribution of - circuit buildtimes appears to be a Pareto distribution. The number - of circuits to observe (ncircuits_to_cutoff) before changing the - CircuitBuildTimeout will be tunable. From out measurements, - ncircuits_to_cuttoff appears to be on the order of 100. - - In addition, the total number of circuits gathered - (ncircuits_to_observe) will also be tunable. It is likely that - ncircuits_to_observe will be somewhere on the order of 1000. The values - can be represented compactly in Tor in milliseconds as a circular array - of 16 bit integers. More compact long-term storage representations can - be implemented by simply storing a histogram with 50 millisecond buckets - when writing out the statistics to disk. - - Calculating the preferred CircuitBuildTimeout - - Circuits that have longer buildtimes than some x% of the estimated - CDF of the Pareto distribution will be excluded. x will be tunable - as well. - - Circuit timeouts - - In the event of a timeout, backoff values should include the 100-x% - of expected CDF of timeouts. Also, in the event of network failure, - the observation mechanism should stop collecting timeout data. - - Dropping Failed Guards - - In addition, we have noticed that some entry guards are much more - failure prone than others. In particular, the circuit failure rates for - the fastest entry guards was approximately 20-25%, where as slower - guards exhibit failure rates as high as 45-50%. In [1], it was - demonstrated that failing guard nodes can deliberately bias path - selection to improve their success at capturing traffic. For both these - reasons, failing guards should be avoided. - - We propose increasing the number of entry guards to five, and gathering - circuit failure statistics on each entry guard. Any guards that exceed - the average failure rate of all guards by 10% after we have - gathered ncircuits_to_observe circuits will be replaced. - + circuit buildtimes appears to be a Pareto distribution. -Issues + We will calculate the parameters for a Pareto distribution + fitting the data using the estimators at + http://en.wikipedia.org/wiki/Pareto_distribution#Parameter_estimation. - Impact on anonymity + The timeout itself will be calculated by solving the CDF for the + a percentile cutoff BUILDTIME_PERCENT_CUTOFF. This value + represents the percentage of paths the Tor client will accept out of + the total number of paths. We have not yet determined a good + cutoff for this mathematically, but 85% seems a good choice for now. - Since this follows a Pareto distribution, large reductions on the - timeout can be achieved without cutting off a great number of the - total paths. However, hard statistics on which cutoff percentage - gives optimal performance have not yet been gathered. + From http://en.wikipedia.org/wiki/Pareto_distribution#Definition, + the calculation we need is pow(BUILDTIME_PERCENT_CUTOFF/100.0, k)/Xm. + + When to Begin Calculation + + The number of circuits to observe (NCIRCUITS_TO_CUTOFF) before + changing the CircuitBuildTimeout will be tunable via a #define. From + our measurements, a good value for NCIRCUITS_TO_CUTOFF appears to be + on the order of 100. + + Dealing with Timeouts + + Timeouts should be counted as the expectation of the region of + of the Pareto distribution beyond the cutoff. The proposal will + be updated with this value soon. + + Also, in the event of network failure, the observation mechanism + should stop collecting timeout data. - Guard Turnover + Circuits that timeout will be destroyed, as this indicates one + or more of their respective nodes are currently overloaded. - We contend that the risk from failing guards biasing path selection - outweighs the risk of exposure to larger portions of the network - for the first hop. Furthermore, from our observations, it appears - that circuit failure is strongly correlated to node load. Allowing - clients to migrate away from failing guards should naturally - rebalance the network, and eventually clients should converge on - a stable set of reliable guards. It is also likely that once clients - begin to migrate away from failing guards, their load should go - down, causing their failure rates to drop as well. + Client Hints + Some research still needs to be done to provide initial values + for CircuitBuildTimeout based on values learned from modem + users, DSL users, Cable Modem users, and dedicated links. A + radiobutton in Vidalia should eventually be provided that + sets CircuitBuildTimeout to one of these values and also + provide the option of purging all learned data, should any exist. -[1] http://www.crhc.uiuc.edu/~nikita/papers/relmix-ccs07.pdf + These values can either be published in the directory, or + shipped hardcoded for a particular Tor version. + +Issues + + Impact on anonymity + Since this follows a Pareto distribution, large reductions on the + timeout can be achieved without cutting off a great number of the + total paths. This will eliminate a great deal of the performance + variation of Tor usage. -- cgit v1.2.3-54-g00ecf