aboutsummaryrefslogtreecommitdiff
path: root/proposals/151-path-selection-improvements.txt
blob: 4d583968fcd3c56729d3510b1d44772c12ab0611 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
Filename: 151-path-selection-improvements.txt
Title: Improving Tor Path Selection
Version:
Last-Modified:
Author: Fallon Chen, Mike Perry
Created: 5-Jul-2008
Status: Draft

Overview

  The performance of paths selected can be improved by adjusting the
  CircuitBuildTimeout and the number of guards. This proposal describes
  a method of tracking buildtime statistics, and using those statistics
  to adjust the CircuitBuildTimeout and the number of guards.

Motivation

  Tor's performance can be improved by excluding those circuits that
  have long buildtimes (and by extension, high latency). For those Tor
  users who require better performance and have lower requirements for
  anonymity, this would be a very useful option to have.

Implementation

  Learning the CircuitBuildTimeout

    Based on studies of build times, we found that the distribution of
    circuit buildtimes appears to be a Pareto distribution. The number
    of circuits to observe (ncircuits_to_observe) before changing the
    CircuitBuildTimeout will be tunable. From our preliminary
    measurements, it is likely that ncircuits_to_observe will be
    somewhere on the order of 1000. The values can be represented
    compactly in Tor in milliseconds as a circular array of 16 bit
    integers. More compact long-term storage representations can be
    implemented by simply storing a histogram with 50 millisecond
    buckets when writing out the statistics to disk.

  Calculating the preferred CircuitBuildTimeout

    Circuits that have longer buildtimes than some x% of the estimated
    CDF of the Pareto distribution will be excluded. x will be tunable
    as well.

  Circuit timeouts

    In the event of a timeout, backoff values should include the 100-x%
    of expected CDF of timeouts.  Also, in the event of network failure,
    the observation mechanism should stop collecting timeout data.

  Other notes

    Since this follows a Pareto distribution, large reductions on the
    timeout can be achieved without cutting off a great number of the
    total paths.  However, hard statistics on which cutoff percentage
    gives optimal performance have not yet been gathered.

Issues

  Impact on anonymity