1 files changed, 0 insertions, 148 deletions
diff --git a/doc/spec/proposals/151-path-selection-improvements.txt b/doc/spec/proposals/151-path-selection-improvements.txt
deleted file mode 100644
index af89f21193..0000000000
--- a/doc/spec/proposals/151-path-selection-improvements.txt
+++ /dev/null
@@ -1,148 +0,0 @@
-Filename: 151-path-selection-improvements.txt
-Title: Improving Tor Path Selection
-Author: Fallon Chen, Mike Perry
-Created: 5-Jul-2008
-Status: Finished
-In-Spec: path-spec.txt
-
-Overview
-
-  The performance of paths selected can be improved by adjusting the
-  CircuitBuildTimeout and avoiding failing guard nodes. This proposal
-  describes a method of tracking buildtime statistics at the client, and
-  using those statistics to adjust the CircuitBuildTimeout.
-
-Motivation
-
-  Tor's performance can be improved by excluding those circuits that
-  have long buildtimes (and by extension, high latency). For those Tor
-  users who require better performance and have lower requirements for
-  anonymity, this would be a very useful option to have.
-
-Implementation
-
-  Gathering Build Times
-
-    Circuit build times are stored in the circular array
-    'circuit_build_times' consisting of uint32_t elements as milliseconds.
-    The total size of this array is based on the number of circuits
-    it takes to converge on a good fit of the long term distribution of
-    the circuit builds for a fixed link. We do not want this value to be
-    too large, because it will make it difficult for clients to adapt to
-    moving between different links.
-
-    From our observations, the minimum value for a reasonable fit appears
-    to be on the order of 500 (MIN_CIRCUITS_TO_OBSERVE). However, to keep
-    a good fit over the long term, we store 5000 most recent circuits in
-    the array (NCIRCUITS_TO_OBSERVE).
-
-    The Tor client will build test circuits at a rate of one per
-    minute (BUILD_TIMES_TEST_FREQUENCY) up to the point of
-    MIN_CIRCUITS_TO_OBSERVE. This allows a fresh Tor to have
-    a CircuitBuildTimeout estimated within 8 hours after install,
-    upgrade, or network change (see below).
-
-  Long Term Storage
-
-    The long-term storage representation is implemented by storing a
-    histogram with BUILDTIME_BIN_WIDTH millisecond buckets (default 50) when
-    writing out the statistics to disk. The format this takes in the
-    state file is 'CircuitBuildTime <bin-ms> <count>', with the total
-    specified as 'TotalBuildTimes <total>'
-    Example:
-
-    TotalBuildTimes 100
-    CircuitBuildTimeBin 25 50
-    CircuitBuildTimeBin 75 25
-    CircuitBuildTimeBin 125 13
-    ...
-
-    Reading the histogram in will entail inserting <count> values
-    into the circuit_build_times array each with the value of
-    <bin-ms> milliseconds. In order to evenly distribute the values
-    in the circular array, the Fisher-Yates shuffle will be performed
-    after reading values from the bins.
-
-  Learning the CircuitBuildTimeout
-
-    Based on studies of build times, we found that the distribution of
-    circuit buildtimes appears to be a Frechet distribution. However,
-    estimators and quantile functions of the Frechet distribution are
-    difficult to work with and slow to converge. So instead, since we
-    are only interested in the accuracy of the tail, we approximate
-    the tail of the distribution with a Pareto curve starting at
-    the mode of the circuit build time sample set.
-
-    We will calculate the parameters for a Pareto distribution
-    fitting the data using the estimators at
-    http://en.wikipedia.org/wiki/Pareto_distribution#Parameter_estimation.
-
-    The timeout itself is calculated by using the Quartile function (the
-    inverted CDF) to give us the value on the CDF such that
-    BUILDTIME_PERCENT_CUTOFF (80%) of the mass of the distribution is
-    below the timeout value.
-
-    Thus, we expect that the Tor client will accept the fastest 80% of
-    the total number of paths on the network.
-
-  Detecting Changing Network Conditions
-
-    We attempt to detect both network connectivity loss and drastic
-    changes in the timeout characteristics.
-
-    We assume that we've had network connectivity loss if 3 circuits
-    timeout and we've received no cells or TLS handshakes since those
-    circuits began. We then set the timeout to 60 seconds and stop
-    counting timeouts.
-
-    If 3 more circuits timeout and the network still has not been
-    live within this new 60 second timeout window, we then discard
-    the previous timeouts during this period from our history.
-
-    To detect changing network conditions, we keep a history of
-    the timeout or non-timeout status of the past RECENT_CIRCUITS (20)
-    that successfully completed at least one hop. If more than 75%
-    of these circuits timeout, we discard all buildtimes history,
-    reset the timeout to 60, and then begin recomputing the timeout.
-
-  Testing
-
-    After circuit build times, storage, and learning are implemented,
-    the resulting histogram should be checked for consistency by
-    verifying it persists across successive Tor invocations where
-    no circuits are built. In addition, we can also use the existing
-    buildtime scripts to record build times, and verify that the histogram
-    the python produces matches that which is output to the state file in Tor,
-    and verify that the Pareto parameters and cutoff points also match.
-
-    We will also verify that there are no unexpected large deviations from
-    node selection, such as nodes from distant geographical locations being
-    completely excluded.
-
-  Dealing with Timeouts
-
-    Timeouts should be counted as the expectation of the region of
-    of the Pareto distribution beyond the cutoff. This is done by
-    generating a random sample for each timeout at points on the
-    curve beyond the current timeout cutoff.
-
-  Future Work
-
-    At some point, it may be desirable to change the cutoff from a
-    single hard cutoff that destroys the circuit to a soft cutoff and
-    a hard cutoff, where the soft cutoff merely triggers the building
-    of a new circuit, and the hard cutoff triggers destruction of the
-    circuit.
-
-    It may also be beneficial to learn separate timeouts for each
-    guard node, as they will have slightly different distributions.
-    This will take longer to generate initial values though.
-
-Issues
-
-  Impact on anonymity
-
-    Since this follows a Pareto distribution, large reductions on the
-    timeout can be achieved without cutting off a great number of the
-    total paths. This will eliminate a great deal of the performance
-    variation of Tor usage.