Move all text-only specifications into the OLD_TXT directory.

author: Nick Mathewson <nickm@torproject.org> 2023-10-12 12:27:58 -0400
committer: Nick Mathewson <nickm@torproject.org> 2023-10-12 12:27:58 -0400
commit: e4e0d93d56ee8c1aec4c2efaa7046b651f0fe55c (patch)
tree: 15a085da265ae3b2b70f29a70f910a5371059a78 /guard-spec.txt
parent: b719a373934d3e79ef833446c6aeeb19be485510 (diff)
download: torspec-e4e0d93d56ee8c1aec4c2efaa7046b651f0fe55c.tar.gz
torspec-e4e0d93d56ee8c1aec4c2efaa7046b651f0fe55c.zip
1 files changed, 0 insertions, 972 deletions
diff --git a/guard-spec.txt b/guard-spec.txt
deleted file mode 100644
index 154edae..0000000
--- a/guard-spec.txt
+++ /dev/null
@@ -1,972 +0,0 @@
-
-                      Tor Guard Specification
-
-                           Isis Lovecruft
-                         George Kadianakis
-                              Ola Bini
-                           Nick Mathewson
-
-Table of Contents
-
-    1. Introduction and motivation
-    2. State instances
-    3. Circuit Creation, Entry Guard Selection (1000 foot view)
-        3.1 Path selection
-            3.1.1 Managing entry guards
-            3.1.2 Middle and exit node selection
-        3.2 Circuit Building
-    4. The algorithm.
-        4.0. The guards listed in the current consensus. [Section:GUARDS]
-        4.1. The Sampled Guard Set. [Section:SAMPLED]
-        4.2. The Usable Sample [Section:FILTERED]
-        4.3. The confirmed-guard list. [Section:CONFIRMED]
-        4.4. The Primary guards [Section:PRIMARY]
-        4.5. Retrying guards. [Section:RETRYING]
-        4.6. Selecting guards for circuits. [Section:SELECTING]
-        4.7. When a circuit fails. [Section:ON_FAIL]
-        4.8. When a circuit succeeds [Section:ON_SUCCESS]
-        4.9. Updating the list of waiting circuits [Section:UPDATE_WAITING]
-        4.10. Whenever we get a new consensus. [Section:ON_CONSENSUS]
-        4.11. Deciding whether to generate a new circuit.
-        4.12. When we are missing descriptors.
-    A. Appendices
-        A.0. Acknowledgements
-        A.1. Parameters with suggested values. [Section:PARAM_VALS]
-        A.2. Random values [Section:RANDOM]
-        A.3. Why not a sliding scale of primaryness? [Section:CVP]
-        A.4. Controller changes
-        A.5. Persistent state format
-
-1. Introduction and motivation
-
-  Tor uses entry guards to prevent an attacker who controls some
-  fraction of the network from observing a fraction of every user's
-  traffic. If users chose their entries and exits uniformly at
-  random from the list of servers every time they build a circuit,
-  then an adversary who had (k/N) of the network would deanonymize
-  F=(k/N)^2 of all circuits... and after a given user had built C
-  circuits, the attacker would see them at least once with
-  probability 1-(1-F)^C.  With large C, the attacker would get a
-  sample of every user's traffic with probability 1.
-
-  To prevent this from happening, Tor clients choose a small number
-  of guard nodes (e.g. 3).  These guard nodes are the only
-  nodes that the client will connect to directly.  If they are not
-  compromised, the user's paths are not compromised.
-
-  This specification outlines Tor's guard housekeeping algorithm,
-  which tries to meet the following goals:
-
-    - Heuristics and algorithms for determining how and which guards
-      are chosen should be kept as simple and easy to understand as
-      possible.
-
-    - Clients in censored regions or who are behind a fascist
-      firewall who connect to the Tor network should not experience
-      any significant disadvantage in terms of reachability or
-      usability.
-
-    - Tor should make a best attempt at discovering the most
-      appropriate behavior, with as little user input and
-      configuration as possible.
-
-    - Tor clients should discover usable guards without too much
-      delay.
-
-    - Tor clients should resist (to the extent possible) attacks
-      that try to force them onto compromised guards.
-
-    - Should maintain the load-balancing offered by the path selection
-      algorithm
-
-2. State instances
-
-   In the algorithm below, we describe a set of persistent and
-   non-persistent state variables.  These variables should be
-   treated as an object, of which multiple instances can exist.
-
-   In particular, we specify the use of three particular instances:
-
-     A. UseBridges
-
-      If UseBridges is set, then we replace the {GUARDS} set in
-      [Sec:GUARDS] below with the list of configured
-      bridges.  We maintain a separate persistent instance of
-      {SAMPLED_GUARDS} and {CONFIRMED_GUARDS} and other derived
-      values for the UseBridges case.
-
-      In this case, we impose no upper limit on the sample size.
-
-    B. EntryNodes / ExcludeNodes / Reachable*Addresses /
-        FascistFirewall / ClientUseIPv4=0
-
-      If one of the above options is set, and UseBridges is not,
-      then we compare the fraction of usable guards in the consensus
-      to the total number of guards in the consensus.
-
-      If this fraction is less than {MEANINGFUL_RESTRICTION_FRAC},
-      we use a separate instance of the state.
-
-      (While Tor is running, we do not change back and forth between
-      the separate instance of the state and the default instance
-      unless the fraction of usable guards is 5% higher than, or 5%
-      lower than, {MEANINGFUL_RESTRICTION_FRAC}.  This prevents us
-      from flapping back and forth between instances if we happen to
-      hit {MEANINGFUL_RESTRICTION_FRAC} exactly.
-
-      If this fraction is less than {EXTREME_RESTRICTION_FRAC}, we use a
-      separate instance of the state, and warn the user.
-
-      [TODO: should we have a different instance for each set of heavily
-      restricted options?]
-
-   C. Default
-
-      If neither of the above variant-state instances is used,
-      we use a default instance.
-
-3. Circuit Creation, Entry Guard Selection (1000 foot view)
-
-   A circuit in Tor is a path through the network connecting a client to
-   its destination. At a high-level, a three-hop exit circuit will look
-   like this:
-
-   Client <-> Entry Guard <-> Middle Node <-> Exit Node <-> Destination
-
-   Entry guards are the only nodes which a client will connect to
-   directly. Exit relays are the nodes by which traffic exits the
-   Tor network in order to connect to an external destination.
-
-   3.1 Path selection
-
-   For any multi-hop circuit, at least one entry guard and middle node(s) are
-   required. An exit node is required if traffic will exit the Tor
-   network. Depending on its configuration, a relay listed in a
-   consensus could be used for any of these roles. However, this
-   specification defines how entry guards specifically should be selected and
-   managed, as opposed to middle or exit nodes.
-
-   3.1.1 Managing entry guards
-
-   At a high level, a relay listed in a consensus will move through the
-   following states in the process from initial selection to eventual
-   usage as an entry guard:
-
-      relays listed in consensus
-                 |
-               sampled
-               |     |
-         confirmed   filtered
-               |     |      |
-               primary      usable_filtered
-
-   Relays listed in the latest consensus can be sampled for guard usage
-   if they have the "Guard" flag. Sampling is random but weighted by
-   a measured bandwidth multiplied by bandwidth-weights (Wgg if guard only,
-   Wgd if guard+exit flagged).
-
-   Once a path is built and a circuit established using this guard, it
-   is marked as confirmed. Until this point, guards are first sampled
-   and then filtered based on information such as our current
-   configuration (see SAMPLED and FILTERED sections) and later marked as
-   usable_filtered if the guard is not primary but can be reached.
-
-   It is always preferable to use a primary guard when building a new
-   circuit in order to reduce guard churn; only on failure to connect to
-   existing primary guards will new guards be used.
-
-   3.1.2 Middle and exit node selection
-
-   Middle nodes are selected at random from relays listed in the latest
-   consensus, weighted by bandwidth and bandwidth-weights. Exit nodes are
-   chosen similarly but restricted to relays with a sufficiently permissive
-   exit policy.
-
-   3.2 Circuit Building
-
-   Once a path is chosen, Tor will use this path to build a new circuit.
-
-   If the circuit is built successfully, Tor will either use it
-   immediately, or Tor will wait for a circuit with a more preferred
-   guard if there's a good chance that it will be able to make one.
-
-   If the circuit fails in a way that makes us conclude that a guard
-   is not reachable, the guard is marked as unreachable, the circuit is
-   closed, and waiting circuits are updated.
-
-4. The algorithm.
-
-4.0.  The guards listed in the current consensus. [Section:GUARDS]
-
-   By {set:GUARDS} we mean the set of all guards in the current
-   consensus that are usable for all circuits and directory
-   requests. (They must have the flags: Stable, Fast, V2Dir, Guard.)
-
-      **Rationale**
-
-   We require all guards to have the flags that we potentially need
-   from any guard, so that all guards are usable for all circuits.
-
-4.1.  The Sampled Guard Set. [Section:SAMPLED]
-
-   We maintain a set, {set:SAMPLED_GUARDS}, that persists across
-   invocations of Tor. It is a subset of the nodes ordered by a sample idx that
-   we have seen listed as a guard in the consensus at some point.
-   For each such guard, we record persistently:
-
-      - {pvar:ADDED_ON_DATE}: The date on which it was added to
-        sampled_guards.
-
-        We set this value to a point in the past, using
-        RAND(now, {GUARD_LIFETIME}/10). See
-        Appendix [RANDOM] below.
-
-      - {pvar:ADDED_BY_VERSION}: The version of Tor that added it to
-        sampled_guards.
-
-      - {pvar:IS_LISTED}: Whether it was listed as a usable Guard in
-        the _most recent_ consensus we have seen.
-
-      - {pvar:FIRST_UNLISTED_AT}: If IS_LISTED is false, the publication date
-        of the earliest consensus in which this guard was listed such that we
-        have not seen it listed in any later consensus.  Otherwise "None."
-        We randomize this to a point in the past, based on
-          RAND(added_at_time, {REMOVE_UNLISTED_GUARDS_AFTER} / 5)
-
-   For each guard in {SAMPLED_GUARDS}, we also record this data,
-   non-persistently:
-
-      - {tvar:last_tried_connect}: A 'last tried to connect at'
-        time.  Default 'never'.
-
-      - {tvar:is_reachable}: an "is reachable" tristate, with
-        possible values { <state:yes>, <state:no>, <state:maybe> }.
-        Default '<maybe>.'
-
-               [Note: "yes" is not strictly necessary, but I'm
-                making it distinct from "maybe" anyway, to make our
-                logic clearer.  A guard is "maybe" reachable if it's
-                worth trying. A guard is "yes" reachable if we tried
-                it and succeeded.]
-
-      - {tvar:failing_since}: The first time when we failed to
-        connect to this guard. Defaults to "never".  Reset to
-        "never" when we successfully connect to this guard.
-
-      - {tvar:is_pending} A "pending" flag.  This indicates that we
-        are trying to build an exploratory circuit through the
-        guard, and we don't know whether it will succeed.
-
-      - {tvar:pending_since}: A timestamp.  Set whenever we set
-        {tvar:is_pending} to true; cleared whenever we set
-        {tvar:is_pending} to false. NOTE
-
-   We require that {SAMPLED_GUARDS} contain at least
-   {MIN_FILTERED_SAMPLE} guards from the consensus (if possible),
-   but not more than {MAX_SAMPLE_THRESHOLD} of the number of guards
-   in the consensus, and not more than {MAX_SAMPLE_SIZE} in total.
-   (But if the maximum would be smaller than {MIN_FILTERED_SAMPLE}, we
-   set the maximum at {MIN_FILTERED_SAMPLE}.)
-
-   To add a new guard to {SAMPLED_GUARDS}, pick an entry at random from
-   ({GUARDS} - {SAMPLED_GUARDS}), according to the path selection rules.
-
-   We remove an entry from {SAMPLED_GUARDS} if:
-
-      * We have a live consensus, and {IS_LISTED} is false, and
-        {FIRST_UNLISTED_AT} is over {REMOVE_UNLISTED_GUARDS_AFTER}
-        days in the past.
-
-     OR
-
-      * We have a live consensus, and {ADDED_ON_DATE} is over
-        {GUARD_LIFETIME} ago, *and* {CONFIRMED_ON_DATE} is either
-        "never", or over {GUARD_CONFIRMED_MIN_LIFETIME} ago.
-
-   Note that {SAMPLED_GUARDS} does not depend on our configuration.
-   It is possible that we can't actually connect to any of these
-   guards.
-
-     **Rationale**
-
-   The {SAMPLED_GUARDS} set is meant to limit the total number of
-   guards that a client will connect to in a given period.  The
-   upper limit on its size prevents us from considering too many
-   guards.
-
-   The first expiration mechanism is there so that our
-   {SAMPLED_GUARDS} list does not accumulate so many dead
-   guards that we cannot add new ones.
-
-   The second expiration mechanism makes us rotate our guards slowly
-   over time.
-
-   Ordering the {SAMPLED_GUARDS} set in the order in which we sampled those
-   guards and picking guards from that set according to this ordering improves
-   load-balancing. It is closer to offer the expected usage of the guard nodes
-   as per the path selection rules.
-
-   The ordering also improves on another objective of this proposal: trying to
-   resist an adversary pushing clients over compromised guards, since the
-   adversary would need the clients to exhaust all their initial
-   {SAMPLED_GUARDS} set before having a chance to use a newly deployed
-   adversary node.
-
-
-4.2. The Usable Sample [Section:FILTERED]
-
-   We maintain another set, {set:FILTERED_GUARDS}, that does not
-   persist. It is derived from:
-
-       - {SAMPLED_GUARDS}
-       - our current configuration,
-       - the path bias information.
-
-   A guard is a member of {set:FILTERED_GUARDS} if and only if all
-   of the following are true:
-
-       - It is a member of {SAMPLED_GUARDS}, with {IS_LISTED} set to
-         true.
-       - It is not disabled because of path bias issues.
-       - It is not disabled because of ReachableAddresses policy,
-         the ClientUseIPv4 setting, the ClientUseIPv6 setting,
-         the FascistFirewall setting, or some other
-         option that prevents using some addresses.
-       - It is not disabled because of ExcludeNodes.
-       - It is a bridge if UseBridges is true; or it is not a
-         bridge if UseBridges is false.
-       - Is included in EntryNodes if EntryNodes is set and
-         UseBridges is not. (But see 2.B above).
-
-   We have an additional subset, {set:USABLE_FILTERED_GUARDS}, which
-   is defined to be the subset of {FILTERED_GUARDS} where
-   {is_reachable} is <yes> or <maybe>.
-
-   We try to maintain a requirement that {USABLE_FILTERED_GUARDS}
-   contain at least {MIN_FILTERED_SAMPLE} elements:
-
-     Whenever we are going to sample from {USABLE_FILTERED_GUARDS},
-     and it contains fewer than {MIN_FILTERED_SAMPLE} elements, we
-     add new elements to {SAMPLED_GUARDS} until one of the following
-     is true:
-
-       * {USABLE_FILTERED_GUARDS} is large enough,
-     OR
-       * {SAMPLED_GUARDS} is at its maximum size.
-
-
-     ** Rationale **
-
-  These filters are applied _after_ sampling: if we applied them
-  before the sampling, then our sample would reflect the set of
-  filtering restrictions that we had in the past.
-
-4.3. The confirmed-guard list. [Section:CONFIRMED]
-
-  [formerly USED_GUARDS]
-
-  We maintain a persistent ordered list, {list:CONFIRMED_GUARDS}.
-  It contains guards that we have used before, in our preference
-  order of using them.  It is a subset of {SAMPLED_GUARDS}.  For
-  each guard in this list, we store persistently:
-
-      - {pvar:IDENTITY} Its fingerprint.
-
-      - {pvar:CONFIRMED_ON_DATE} When we added this guard to
-        {CONFIRMED_GUARDS}.
-
-        Randomized to a point in the past as RAND(now, {GUARD_LIFETIME}/10).
-
-  We append new members to {CONFIRMED_GUARDS} when we mark a circuit
-  built through a guard as "for user traffic."
-
-  Whenever we remove a member from {SAMPLED_GUARDS}, we also remove
-  it from {CONFIRMED_GUARDS}.
-
-      [Note: You can also regard the {CONFIRMED_GUARDS} list as a
-      total ordering defined over a subset of {SAMPLED_GUARDS}.]
-
-  Definition: we call Guard A "higher priority" than another Guard B
-  if, when A and B are both reachable, we would rather use A.  We
-  define priority as follows:
-
-     * Every guard in {CONFIRMED_GUARDS} has a higher priority
-       than every guard not in {CONFIRMED_GUARDS}.
-
-     * Among guards in {CONFIRMED_GUARDS}, the one appearing earlier
-       on the {CONFIRMED_GUARDS} list has a higher priority.
-
-     * Among guards that do not appear in {CONFIRMED_GUARDS},
-       {is_pending}==true guards have higher priority.
-
-     * Among those, the guard with earlier {last_tried_connect} time
-       has higher priority.
-
-     * Finally, among guards that do not appear in
-       {CONFIRMED_GUARDS} with {is_pending==false}, all have equal
-       priority.
-
-   ** Rationale **
-
-  We add elements to this ordering when we have actually used them
-  for building a usable circuit.  We could mark them at some other
-  time (such as when we attempt to connect to them, or when we
-  actually connect to them), but this approach keeps us from
-  committing to a guard before we actually use it for sensitive
-  traffic.
-
-4.4. The Primary guards [Section:PRIMARY]
-
-  We keep a run-time non-persistent ordered list of
-  {list:PRIMARY_GUARDS}.  It is a subset of {FILTERED_GUARDS}.  It
-  contains {N_PRIMARY_GUARDS} elements.
-
-  To compute primary guards, take the ordered intersection of
-  {CONFIRMED_GUARDS} and {FILTERED_GUARDS}, and take the first
-  {N_PRIMARY_GUARDS} elements.  If there are fewer than
-  {N_PRIMARY_GUARDS} elements, append additional elements to
-  PRIMARY_GUARDS chosen from ({FILTERED_GUARDS} - {CONFIRMED_GUARDS}),
-  ordered in "sample order" (that is, by {ADDED_ON_DATE}).
-
-  Once an element has been added to {PRIMARY_GUARDS}, we do not remove it
-  until it is replaced by some element from {CONFIRMED_GUARDS}.
-  That is: if a non-primary guard becomes confirmed and not every primary
-  guard is confirmed, then the list of primary guards list is regenerated,
-  first from the confirmed guards (as before), and then from any
-  non-confirmed primary guards.
-
-  Note that {PRIMARY_GUARDS} do not have to be in
-  {USABLE_FILTERED_GUARDS}: they might be unreachable.
-
-    ** Rationale **
-
-  These guards are treated differently from other guards.  If one of
-  them is usable, then we use it right away. For other guards
-  {FILTERED_GUARDS}, if it's usable, then before using it we might
-  first double-check whether perhaps one of the primary guards is
-  usable after all.
-
-4.5. Retrying guards. [Section:RETRYING]
-
-  (We run this process as frequently as needed. It can be done once
-  a second, or just-in-time.)
-
-  If a primary sampled guard's {is_reachable} status is <no>, then
-  we decide whether to update its {is_reachable} status to <maybe>
-  based on its {last_tried_connect} time, its {failing_since} time,
-  and the {PRIMARY_GUARDS_RETRY_SCHED} schedule.
-
-  If a non-primary sampled guard's {is_reachable} status is <no>, then
-  we decide whether to update its {is_reachable} status to <maybe>
-  based on its {last_tried_connect} time, its {failing_since} time,
-  and the {GUARDS_RETRY_SCHED} schedule.
-
-    ** Rationale **
-
-  An observation that a guard has been 'unreachable' only lasts for
-  a given amount of time, since we can't infer that it's unreachable
-  now from the fact that it was unreachable a few minutes ago.
-
-4.6. Selecting guards for circuits. [Section:SELECTING]
-
-  Every origin circuit is now in one of these states:
-
-     <state:usable_on_completion>,
-     <state:usable_if_no_better_guard>,
-     <state:waiting_for_better_guard>, or
-     <state:complete>.
-
-  You may only attach streams to <complete> circuits.
-  (Additionally, you may only send RENDEZVOUS cells, ESTABLISH_INTRO
-  cells, and INTRODUCE cells on <complete> circuits.)
-
-  The per-circuit state machine is:
-
-      New circuits are <usable_on_completion> or
-      <usable_if_no_better_guard>.
-
-      A <usable_on_completion> circuit may become <complete>, or may
-      fail.
-
-      A <usable_if_no_better_guard> circuit may become
-      <usable_on_completion>; may become <waiting_for_better_guard>; or may
-      fail.
-
-      A <waiting_for_better_guard> circuit will become <complete>, or will
-      be closed, or will fail.
-
-      A <complete> circuit remains <complete> until it fails or is
-      closed.
-
-      Each of these transitions is described below.
-
-  We keep, as global transient state:
-
-    * {tvar:last_time_on_internet} -- the last time at which we
-      successfully used a circuit or connected to a guard.  At
-      startup we set this to "infinitely far in the past."
-
-  When we want to build a circuit, and we need to pick a guard:
-
-    * If any entry in PRIMARY_GUARDS has {is_reachable} status of
-      <maybe> or <yes>, return one of the first
-      {NUM_USABLE_PRIMARY_GUARDS} or
-      {NUM_USABLE_PRIMARY_DIRECTORY_GUARDS} such guards, chosen
-      uniformly at random. The circuit is <usable_on_completion>.
-
-      [Note: We do not use {is_pending} on primary guards, since we
-      are willing to try to build multiple circuits through them
-      before we know for sure whether they work, and since we will
-      not use any non-primary guards until we are sure that the
-      primary guards are all down.  (XX is this good?)]
-
-    * Otherwise, if the ordered intersection of {CONFIRMED_GUARDS}
-      and {USABLE_FILTERED_GUARDS} is nonempty, return the first
-      entry in that intersection that has {is_pending} set to
-      false. Set its value of {is_pending} to true,
-      and set its {pending_since} to the current time.
-      The circuit
-      is now <usable_if_no_better_guard>.  (If all entries have
-      {is_pending} true, pick the first one.)
-
-    * Otherwise, if there is no such entry, select a member from
-      {USABLE_FILTERED_GUARDS} in sample order. Set its {is_pending} field to
-      true, and set its {pending_since} to the current time.
-      The circuit is <usable_if_no_better_guard>.
-
-    * Otherwise, if USABLE_FILTERED_GUARDS is empty, we have exhausted
-      all the sampled guards.  In this case we proceed by marking all guards
-      as <maybe> reachable so that we can keep on trying circuits.
-
-  Whenever we select a guard for a new circuit attempt, we update the
-  {last_tried_connect} time for the guard to 'now.'
-
-  In some cases (for example, when we need a certain directory feature,
-  or when we need to avoid using a certain exit as a guard), we need to
-  restrict the guards that we use for a single circuit. When this happens, we
-  remember the restrictions that applied when choosing the guard for
-  that circuit, since we will need them later (see [UPDATE_WAITING].).
-
-    ** Rationale **
-
-  We're getting to the core of the algorithm here.  Our main goals are to
-  make sure that
-
-    1. If it's possible to use a primary guard, we do.
-    2. We probably use the first primary guard.
-
-  So we only try non-primary guards if we're pretty sure that all
-  the primary guards are down, and we only try a given primary guard
-  if the earlier primary guards seem down.
-
-  When we _do_ try non-primary guards, however, we only build one
-  circuit through each, to give it a chance to succeed or fail.  If
-  ever such a circuit succeeds, we don't use it until we're pretty
-  sure that it's the best guard we're getting. (see below).
-
-         [XXX timeout.]
-
-4.7. When a circuit fails. [Section:ON_FAIL]
-
-   When a circuit fails in a way that makes us conclude that a guard
-   is not reachable, we take the following steps:
-
-      * Set the guard's {is_reachable} status to <no>.  If it had
-        {is_pending} set to true, we make it non-pending and clear
-        {pending_since}.
-
-      * Close the circuit, of course.  (This removes it from
-        consideration by the algorithm in [UPDATE_WAITING].)
-
-      * Update the list of waiting circuits.  (See [UPDATE_WAITING]
-        below.)
-
-   [Note: the existing Tor logic will cause us to create more
-   circuits in response to some of these steps; and also see
-   [ON_CONSENSUS].]
-
-    ** Rationale **
-
-   See [SELECTING] above for rationale.
-
-4.8. When a circuit succeeds [Section:ON_SUCCESS]
-
-   When a circuit succeeds in a way that makes us conclude that a
-   guard _was_ reachable, we take these steps:
-
-      * We set its {is_reachable} status to <yes>.
-      * We set its {failing_since} to "never".
-      * If the guard was {is_pending}, we clear the {is_pending} flag
-        and set {pending_since} to false.
-      * If the guard was not a member of {CONFIRMED_GUARDS}, we add
-        it to the end of {CONFIRMED_GUARDS}.
-
-      * If this circuit was <usable_on_completion>, this circuit is
-        now <complete>. You may attach streams to this circuit,
-        and use it for hidden services.
-
-      * If this circuit was <usable_if_no_better_guard>, it is now
-        <waiting_for_better_guard>.  You may not yet attach streams to it.
-        Then check whether the {last_time_on_internet} is more than
-        {INTERNET_LIKELY_DOWN_INTERVAL} seconds ago:
-
-           * If it is, then mark all {PRIMARY_GUARDS} as "maybe"
-             reachable.
-
-           * If it is not, update the list of waiting circuits. (See
-             [UPDATE_WAITING] below)
-
-   [Note: the existing Tor logic will cause us to create more
-   circuits in response to some of these steps; and see
-   [ON_CONSENSUS].]
-
-    ** Rationale **
-
-   See [SELECTING] above for rationale.
-
-4.9. Updating the list of waiting circuits [Section:UPDATE_WAITING]
-
-   We run this procedure whenever it's possible that a
-   <waiting_for_better_guard> circuit might be ready to be called
-   <complete>.
-
-   * If any circuit C1 is <waiting_for_better_guard>, AND:
-       * All primary guards have reachable status of <no>.
-       * There is no circuit C2 that "blocks" C1.
-     Then, upgrade C1 to <complete>.
-
-   Definition: In the algorithm above, C2 "blocks" C1 if:
-       * C2 obeys all the restrictions that C1 had to obey, AND
-       * C2 has higher priority than C1, AND
-       * Either C2 is <complete>, or C2 is <waiting_for_better_guard>,
-         or C2 has been <usable_if_no_better_guard> for no more than
-         {NONPRIMARY_GUARD_CONNECT_TIMEOUT} seconds.
-
-   We run this procedure periodically:
-
-   * If any circuit stays in <waiting_for_better_guard>
-     for more than {NONPRIMARY_GUARD_IDLE_TIMEOUT} seconds,
-     time it out.
-
-      **Rationale**
-
-   If we open a connection to a guard, we might want to use it
-   immediately (if we're sure that it's the best we can do), or we
-   might want to wait a little while to see if some other circuit
-   which we like better will finish.
-
-
-   When we mark a circuit <complete>, we don't close the
-   lower-priority circuits immediately: we might decide to use
-   them after all if the <complete> circuit goes down before
-   {NONPRIMARY_GUARD_IDLE_TIMEOUT} seconds.
-
-4.9.1. Without a list of waiting circuits [Section:NO_CIRCLIST]
-
-   As an alternative to the section [SECTION:UPDATE_WAITING] above,
-   this section presents a new way to maintain guard status
-   independently of tracking individual circuit status.  This
-   formulation gives a result equivalent or similar to the approach
-   above, but simplifies the necessary communications between the
-   guard and circuit subsystems.
-
-   As before, when all primary guards are Unreachable, we need to
-   try non-primary guards.  We select the first such guard (in
-   preference order) that is neither Unreachable nor Pending.
-   Whenever we give out such a guard, if the guard's status is
-   Unknown, then we call that guard "Pending" with its {is_pending}
-   flag, until the attempt to use it succeeds or fails.  We remember
-   when the guard became Pending with the {pending_since variable}.
-
-   After completing a circuit, the implementation must check whether
-   its guard is usable.  A guard's usability status may be "usable",
-   "unusable", or "unknown".  A guard is usable according to
-   these rules:
-
-   1. Primary guards are always usable.
-
-   2. Non-primary guards are usable _for a given circuit_ if every
-   guard earlier in the preference list is either unsuitable for
-   that circuit (e.g. because of family restrictions), or marked as
-   Unreachable, or has been pending for at least
-   `{NONPRIMARY_GUARD_CONNECT_TIMEOUT}`.
-
-   Non-primary guards are not usable _for a given circuit_ if some
-   guard earlier in the preference list is suitable for the circuit
-   _and_ Reachable.
-
-   Non-primary guards are unusable if they have not become
-   usable after `{NONPRIMARY_GUARD_IDLE_TIMEOUT}` seconds.
-
-   3. If a circuit's guard is not usable or unusable immediately, the
-   circuit is not discarded; instead, it is kept (but not used) until the
-   guard becomes usable or unusable.
-
-
-4.10. Whenever we get a new consensus. [Section:ON_CONSENSUS]
-
-   We update {GUARDS}.
-
-   For every guard in {SAMPLED_GUARDS}, we update {IS_LISTED} and
-   {FIRST_UNLISTED_AT}.
-
-   [**] We remove entries from {SAMPLED_GUARDS} if appropriate,
-   according to the sampled-guards expiration rules.  If they were
-   in {CONFIRMED_GUARDS}, we also remove them from
-   {CONFIRMED_GUARDS}.
-
-   We recompute {FILTERED_GUARDS}, and everything that derives from
-   it, including {USABLE_FILTERED_GUARDS}, and {PRIMARY_GUARDS}.
-
-   (Whenever one of the configuration options that affects the
-   filter is updated, we repeat the process above, starting at the
-   [**] line.)
-
-4.11. Deciding whether to generate a new circuit.
-  [Section:NEW_CIRCUIT_NEEDED]
-
-   We generate a new circuit when we don't have
-   enough circuits either built or in-progress to handle a given
-   stream, or an expected stream.
-
-   For the purpose of this rule, we say that <waiting_for_better_guard>
-   circuits are neither built nor in-progress; that <complete>
-   circuits are built; and that the other states are in-progress.
-
-4.12. When we are missing descriptors.
-   [Section:MISSING_DESCRIPTORS]
-
-   We need either a router descriptor or a microdescriptor in order
-   to build a circuit through a guard.  If we do not have such a
-   descriptor for a guard, we can still use the guard for one-hop
-   directory fetches, but not for longer circuits.
-
-   (Also, when we are missing descriptors for our first
-   {NUM_USABLE_PRIMARY_GUARDS} primary guards, we don't build
-   circuits at all until we have fetched them.)
-
-A. Appendices
-
-A.0. Acknowledgements
-
-  This research was supported in part by NSF grants CNS-1111539,
-  CNS-1314637, CNS-1526306, CNS-1619454, and CNS-1640548.
-
-A.1.  Parameters with suggested values. [Section:PARAM_VALS]
-
-   (All suggested values chosen arbitrarily)
-
-   {param:MAX_SAMPLE_THRESHOLD} -- 20%
-
-   {param:MAX_SAMPLE_SIZE} -- 60
-
-   {param:GUARD_LIFETIME} -- 120 days
-
-   {param:REMOVE_UNLISTED_GUARDS_AFTER} -- 20 days
-     [previously ENTRY_GUARD_REMOVE_AFTER]
-
-   {param:MIN_FILTERED_SAMPLE} -- 20
-
-   {param:N_PRIMARY_GUARDS} -- 3
-
-   {param:PRIMARY_GUARDS_RETRY_SCHED}
-
-      We recommend the following schedule, which is the one
-      used in Arti:
-
-      -- Use the "decorrelated-jitter" algorithm from "dir-spec.txt"
-         section 5.5 where `base_delay` is 30 seconds and `cap`
-         is 6 hours.
-
-      This legacy schedule is the one used in C tor:
-
-      -- every 10 minutes for the first six hours,
-      -- every 90 minutes for the next 90 hours,
-      -- every 4 hours for the next 3 days,
-      -- every 9 hours thereafter.
-
-   {param:GUARDS_RETRY_SCHED} --
-
-      We recommend the following schedule, which is the one
-      used in Arti:
-
-      -- Use the "decorrelated-jitter" algorithm from "dir-spec.txt"
-         section 5.5 where `base_delay` is 10 minutes and `cap`
-         is 36 hours.
-
-      This legacy schedule is the one used in C tor:
-
-      -- every hour for the first six hours,
-      -- every 4 hours for the 90 hours,
-      -- every 18 hours for the next 3 days,
-      -- every 36 hours thereafter.
-
-   {param:INTERNET_LIKELY_DOWN_INTERVAL} -- 10 minutes
-
-   {param:NONPRIMARY_GUARD_CONNECT_TIMEOUT} -- 15 seconds
-
-   {param:NONPRIMARY_GUARD_IDLE_TIMEOUT} -- 10 minutes
-
-   {param:MEANINGFUL_RESTRICTION_FRAC} -- .2
-
-   {param:EXTREME_RESTRICTION_FRAC} -- .01
-
-   {param:GUARD_CONFIRMED_MIN_LIFETIME} -- 60 days
-
-   {param:NUM_USABLE_PRIMARY_GUARDS} -- 1
-
-   {param:NUM_USABLE_PRIMARY_DIRECTORY_GUARDS} -- 3
-
-A.2. Random values [Section:RANDOM]
-
-   Frequently, we want to randomize the expiration time of something
-   so that it's not easy for an observer to match it to its start
-   time. We do this by randomizing its start date a little, so that
-   we only need to remember a fixed expiration interval.
-
-   By RAND(now, INTERVAL) we mean a time between now and INTERVAL in
-   the past, chosen uniformly at random.
-
-
-A.3. Why not a sliding scale of primaryness? [Section:CVP]
-
-   At one meeting, I floated the idea of having "primaryness" be a
-   continuous variable rather than a boolean.
-
-   I'm no longer sure this is a great idea, but I'll try to outline
-   how it might work.
-
-   To begin with: being "primary" gives it a few different traits:
-
-      1) We retry primary guards more frequently. [Section:RETRYING]
-
-      2) We don't even _try_ building circuits through
-         lower-priority guards until we're pretty sure that the
-         higher-priority primary guards are down. (With non-primary
-         guards, on the other hand, we launch exploratory circuits
-         which we plan not to use if higher-priority guards
-         succeed.) [Section:SELECTING]
-
-      3) We retry them all one more time if a circuit succeeds after
-         the net has been down for a while. [Section:ON_SUCCESS]
-
-   We could make each of the above traits continuous:
-
-      1) We could make the interval at which a guard is retried
-         depend continuously on its position in CONFIRMED_GUARDS.
-
-      2) We could change the number of guards we test in parallel
-         based on their position in CONFIRMED_GUARDS.
-
-      3) We could change the rule for how long the higher-priority
-         guards need to have been down before we call a
-         <usable_if_no_better_guard> circuit <complete> based on a
-         possible network-down condition.  For example, we could
-         retry the first guard if we tried it more than 10 seconds
-         ago, the second if we tried it more than 20 seconds ago,
-         etc.
-
-   I am pretty sure, however, that if these are worth doing, they
-   need more analysis!  Here's why:
-
-      * They all have the potential to leak more information about a
-        guard's exact position on the list.  Is that safe? Is there
-        any way to exploit that?  I don't think we know.
-
-      * They all seem like changes which it would be relatively
-        simple to make to the code after we implement the simpler
-        version of the algorithm described above.
-
-A.4. Controller changes
-
-   We will add to control-spec.txt a new possible circuit state, GUARD_WAIT,
-   that can be given as part of circuit events and GETINFO responses about
-   circuits.  A circuit is in the GUARD_WAIT state when it is fully built,
-   but we will not use it because a circuit with a better guard might
-   become built too.
-
-A.5. Persistent state format
-
-   The persistent state format doesn't need to be part of this
-   specification, since different implementations can do it
-   differently. Nonetheless, here's the one Tor uses:
-
-   The "state" file contains one Guard entry for each sampled guard
-   in each instance of the guard state (see section 2).  The value
-   of this Guard entry is a set of space-separated K=V entries,
-   where K contains any nonspace character except =, and V contains
-   any nonspace characters.
-
-   Implementations must retain any unrecognized K=V entries for a
-   sampled guard when they regenerate the state file.
-
-   The order of K=V entries is not allowed to matter.
-
-   Recognized fields (values of K) are:
-
-        "in" -- the name of the guard state instance that this
-        sampled guard is in.  If a sampled guard is in two guard
-        states instances, it appears twice, with a different "in"
-        field each time. Required.
-
-        "rsa_id" -- the RSA id digest for this guard, encoded in
-        hex. Required.
-
-        "bridge_addr" -- If the guard is a bridge, its configured address and
-        port (this can be the ORPort or a pluggable transport port). Optional.
-
-        "nickname" -- the guard's nickname, if any. Optional.
-
-        "sampled_on" -- the date when the guard was sampled. Required.
-
-        "sampled_by" -- the Tor version that sampled this guard.
-        Optional.
-
-        "unlisted_since" -- the date since which the guard has been
-        unlisted. Optional.
-
-        "listed" -- 0 if the guard is not listed; 1 if it is. Required.
-
-        "confirmed_on" -- date when the guard was
-        confirmed. Optional.
-
-        "confirmed_idx" -- position of the guard in the confirmed
-        list. Optional.
-
-        "pb_use_attempts", "pb_use_successes", "pb_circ_attempts",
-        "pb_circ_successes", "pb_successful_circuits_closed",
-        "pb_collapsed_circuits", "pb_unusable_circuits",
-        "pb_timeouts" -- state for the circuit path bias algorithm,
-        given in decimal fractions. Optional.
-
-   All dates here are given as a (spaceless) ISO8601 combined date
-   and time in UTC (e.g., 2016-11-29T19:39:31).
-
-
-TODO. Still non-addressed issues [Section:TODO]
-
-   Simulate to answer:  Will this work in a dystopic world?
-
-   Simulate actual behavior.
-
-   For all lifetimes: instead of storing the "this began at" time,
-   store the "remove this at" time, slightly randomized.
-
-   Clarify that when you get a <complete> circuit, you might need to
-   relaunch circuits through that same guard immediately, if they
-   are circuits that have to be independent.
-
-
-   Fix all items marked XX or TODO.
-
-   "Directory guards" -- do they matter?
-
-       Suggestion: require that all guards support downloads via BEGINDIR.
-       We don't need to worry about directory guards for relays, since we
-       aren't trying to prevent relay enumeration.
-
-   IP version preferences via ClientPreferIPv6ORPort
-
-       Suggestion: Treat it as a preference when adding to
-       {CONFIRMED_GUARDS}, but not otherwise.
-
author	Nick Mathewson <nickm@torproject.org>	2023-10-12 12:27:58 -0400
committer	Nick Mathewson <nickm@torproject.org>	2023-10-12 12:27:58 -0400
commit	e4e0d93d56ee8c1aec4c2efaa7046b651f0fe55c (patch)
tree	15a085da265ae3b2b70f29a70f910a5371059a78 /guard-spec.txt
parent	b719a373934d3e79ef833446c6aeeb19be485510 (diff)
download	torspec-e4e0d93d56ee8c1aec4c2efaa7046b651f0fe55c.tar.gz torspec-e4e0d93d56ee8c1aec4c2efaa7046b651f0fe55c.zip