From c39cb7ecc0a98e024a46c058e5d9d461150c4c90 Mon Sep 17 00:00:00 2001
From: Mike Perry
Date: Sun, 6 Jul 2008 23:36:33 +0000
Subject: Add guard node failure plans to proposal.

svn:r15706
---
 proposals/151-path-selection-improvements.txt | 61 +++++++++++++++++++++------
 1 file changed, 47 insertions(+), 14 deletions(-)

diff --git a/proposals/151-path-selection-improvements.txt b/proposals/151-path-selection-improvements.txt
index 4d58396..3362efb 100644
--- a/proposals/151-path-selection-improvements.txt
+++ b/proposals/151-path-selection-improvements.txt
@@ -9,9 +9,9 @@ Status: Draft
 Overview

   The performance of paths selected can be improved by adjusting the
-  CircuitBuildTimeout and the number of guards. This proposal describes
-  a method of tracking buildtime statistics, and using those statistics
-  to adjust the CircuitBuildTimeout and the number of guards.
+  CircuitBuildTimeout and avoiding failing guard nodes. This proposal
+  describes a method of tracking buildtime statistics, and using those
+  statistics to adjust the CircuitBuildTimeout and the number of guards.

 Motivation

@@ -26,14 +26,17 @@ Implementation

     Based on studies of build times, we found that the distribution of
     circuit buildtimes appears to be a Pareto distribution. The number
-    of circuits to observe (ncircuits_to_observe) before changing the
-    CircuitBuildTimeout will be tunable. From our preliminary
-    measurements, it is likely that ncircuits_to_observe will be
-    somewhere on the order of 1000. The values can be represented
-    compactly in Tor in milliseconds as a circular array of 16 bit
-    integers. More compact long-term storage representations can be
-    implemented by simply storing a histogram with 50 millisecond
-    buckets when writing out the statistics to disk.
+    of circuits to observe (ncircuits_to_cutoff) before changing the
+    CircuitBuildTimeout will be tunable. From our measurements,
+    ncircuits_to_cutoff appears to be on the order of 100.
+
+    In addition, the total number of circuits gathered
+    (ncircuits_to_observe) will also be tunable. It is likely that
+    ncircuits_to_observe will be somewhere on the order of 1000. The values
+    can be represented compactly in Tor in milliseconds as a circular array
+    of 16 bit integers. More compact long-term storage representations can
+    be implemented by simply storing a histogram with 50 millisecond buckets
+    when writing out the statistics to disk.

   Calculating the preferred CircuitBuildTimeout

@@ -47,13 +50,43 @@ Implementation
     of expected CDF of timeouts. Also, in the event of network failure,
     the observation mechanism should stop collecting timeout data.

-  Other notes
+  Dropping Failed Guards
+
+    In addition, we have noticed that some entry guards are much more
+    failure prone than others. In particular, the circuit failure rates for
+    the fastest entry guards were approximately 20-25%, whereas slower
+    guards exhibit failure rates as high as 45-50%. In [1], it was
+    demonstrated that failing guard nodes can deliberately bias path
+    selection to improve their success at capturing traffic. For both these
+    reasons, failing guards should be avoided.
+
+    We propose increasing the number of entry guards to five, and gathering
+    circuit failure statistics on each entry guard. Any guards that exceed
+    the average failure rate of all guards by 10% after we have
+    gathered ncircuits_to_observe circuits will be replaced.
+
+
+Issues
+
+  Impact on anonymity

     Since this follows a Pareto distribution, large reductions on the
     timeout can be achieved without cutting off a great number of the
     total paths. However, hard statistics on which cutoff percentage
     gives optimal performance have not yet been gathered.

-Issues
+  Guard Turnover
+
+    We contend that the risk from failing guards biasing path selection
+    outweighs the risk of exposure to larger portions of the network
+    for the first hop. Furthermore, from our observations, it appears
+    that circuit failure is strongly correlated to node load. Allowing
+    clients to migrate away from failing guards should naturally
+    rebalance the network, and eventually clients should converge on
+    a stable set of reliable guards. It is also likely that once clients
+    begin to migrate away from failing guards, their load should go
+    down, causing their failure rates to drop as well.
+
+
+[1] http://www.crhc.uiuc.edu/~nikita/papers/relmix-ccs07.pdf

-  Impact on anonymity
-- 
cgit v1.2.3-54-g00ecf
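
Note on the "Dropping Failed Guards" text added by this patch: the replacement rule it describes can be sketched roughly as below. This is only an illustration, not Tor code. Every name in it (GuardStats, guards_to_replace, FAILURE_MARGIN) is hypothetical. It also makes two assumptions the proposal leaves open: that "by 10%" means ten percentage points above the average, and that the "average failure rate of all guards" is the unweighted mean of the per-guard rates.

```python
from dataclasses import dataclass

# Hypothetical names throughout; none of this is Tor's actual code.
NCIRCUITS_TO_OBSERVE = 1000  # proposal: "on the order of 1000"
FAILURE_MARGIN = 0.10        # ASSUMPTION: "10%" read as 10 percentage points


@dataclass
class GuardStats:
    """Per-guard circuit counters gathered by the client."""
    name: str
    attempts: int = 0
    failures: int = 0

    @property
    def failure_rate(self) -> float:
        # A guard with no attempts yet is treated as having a 0% failure rate.
        return self.failures / self.attempts if self.attempts else 0.0


def guards_to_replace(guards, ncircuits=NCIRCUITS_TO_OBSERVE,
                      margin=FAILURE_MARGIN):
    """Return the names of guards whose failure rate exceeds the mean
    per-guard failure rate by more than `margin`.

    Nothing is returned until at least `ncircuits` circuits have been
    observed across all guards.
    """
    if not guards:
        return []
    total_attempts = sum(g.attempts for g in guards)
    if total_attempts < ncircuits:
        return []  # not enough observations yet
    # ASSUMPTION: unweighted mean of per-guard rates, not a pooled rate.
    mean_rate = sum(g.failure_rate for g in guards) / len(guards)
    return [g.name for g in guards if g.failure_rate > mean_rate + margin]
```

For example, with five guards at 200 attempts each and failure rates of 20%, 25%, 22%, 23% and 50%, the mean is 28%, so under these assumptions only the 50% guard crosses the 38% threshold. Reading "by 10%" as a relative margin instead (1.1 times the mean) would flag guards much closer to the average, so the choice matters for guard turnover.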