From e4e0d93d56ee8c1aec4c2efaa7046b651f0fe55c Mon Sep 17 00:00:00 2001 From: Nick Mathewson Date: Thu, 12 Oct 2023 12:27:58 -0400 Subject: Move all text-only specifications into the OLD_TXT directory. --- guard-spec.txt | 972 --------------------------------------------------------- 1 file changed, 972 deletions(-) delete mode 100644 guard-spec.txt (limited to 'guard-spec.txt') diff --git a/guard-spec.txt b/guard-spec.txt deleted file mode 100644 index 154edae..0000000 --- a/guard-spec.txt +++ /dev/null @@ -1,972 +0,0 @@ - - Tor Guard Specification - - Isis Lovecruft - George Kadianakis - Ola Bini - Nick Mathewson - -Table of Contents - - 1. Introduction and motivation - 2. State instances - 3. Circuit Creation, Entry Guard Selection (1000 foot view) - 3.1 Path selection - 3.1.1 Managing entry guards - 3.1.2 Middle and exit node selection - 3.2 Circuit Building - 4. The algorithm. - 4.0. The guards listed in the current consensus. [Section:GUARDS] - 4.1. The Sampled Guard Set. [Section:SAMPLED] - 4.2. The Usable Sample [Section:FILTERED] - 4.3. The confirmed-guard list. [Section:CONFIRMED] - 4.4. The Primary guards [Section:PRIMARY] - 4.5. Retrying guards. [Section:RETRYING] - 4.6. Selecting guards for circuits. [Section:SELECTING] - 4.7. When a circuit fails. [Section:ON_FAIL] - 4.8. When a circuit succeeds [Section:ON_SUCCESS] - 4.9. Updating the list of waiting circuits [Section:UPDATE_WAITING] - 4.10. Whenever we get a new consensus. [Section:ON_CONSENSUS] - 4.11. Deciding whether to generate a new circuit. - 4.12. When we are missing descriptors. - A. Appendices - A.0. Acknowledgements - A.1. Parameters with suggested values. [Section:PARAM_VALS] - A.2. Random values [Section:RANDOM] - A.3. Why not a sliding scale of primaryness? [Section:CVP] - A.4. Controller changes - A.5. Persistent state format - -1. Introduction and motivation - - Tor uses entry guards to prevent an attacker who controls some - fraction of the network from observing a fraction of every user's - traffic. If users chose their entries and exits uniformly at - random from the list of servers every time they build a circuit, - then an adversary who had (k/N) of the network would deanonymize - F=(k/N)^2 of all circuits... and after a given user had built C - circuits, the attacker would see them at least once with - probability 1-(1-F)^C. With large C, the attacker would get a - sample of every user's traffic with probability 1. - - To prevent this from happening, Tor clients choose a small number - of guard nodes (e.g. 3). These guard nodes are the only - nodes that the client will connect to directly. If they are not - compromised, the user's paths are not compromised. - - This specification outlines Tor's guard housekeeping algorithm, - which tries to meet the following goals: - - - Heuristics and algorithms for determining how and which guards - are chosen should be kept as simple and easy to understand as - possible. - - - Clients in censored regions or who are behind a fascist - firewall who connect to the Tor network should not experience - any significant disadvantage in terms of reachability or - usability. - - - Tor should make a best attempt at discovering the most - appropriate behavior, with as little user input and - configuration as possible. - - - Tor clients should discover usable guards without too much - delay. - - - Tor clients should resist (to the extent possible) attacks - that try to force them onto compromised guards. - - - Should maintain the load-balancing offered by the path selection - algorithm - -2. State instances - - In the algorithm below, we describe a set of persistent and - non-persistent state variables. These variables should be - treated as an object, of which multiple instances can exist. - - In particular, we specify the use of three particular instances: - - A. UseBridges - - If UseBridges is set, then we replace the {GUARDS} set in - [Sec:GUARDS] below with the list of configured - bridges. We maintain a separate persistent instance of - {SAMPLED_GUARDS} and {CONFIRMED_GUARDS} and other derived - values for the UseBridges case. - - In this case, we impose no upper limit on the sample size. - - B. EntryNodes / ExcludeNodes / Reachable*Addresses / - FascistFirewall / ClientUseIPv4=0 - - If one of the above options is set, and UseBridges is not, - then we compare the fraction of usable guards in the consensus - to the total number of guards in the consensus. - - If this fraction is less than {MEANINGFUL_RESTRICTION_FRAC}, - we use a separate instance of the state. - - (While Tor is running, we do not change back and forth between - the separate instance of the state and the default instance - unless the fraction of usable guards is 5% higher than, or 5% - lower than, {MEANINGFUL_RESTRICTION_FRAC}. This prevents us - from flapping back and forth between instances if we happen to - hit {MEANINGFUL_RESTRICTION_FRAC} exactly. - - If this fraction is less than {EXTREME_RESTRICTION_FRAC}, we use a - separate instance of the state, and warn the user. - - [TODO: should we have a different instance for each set of heavily - restricted options?] - - C. Default - - If neither of the above variant-state instances is used, - we use a default instance. - -3. Circuit Creation, Entry Guard Selection (1000 foot view) - - A circuit in Tor is a path through the network connecting a client to - its destination. At a high-level, a three-hop exit circuit will look - like this: - - Client <-> Entry Guard <-> Middle Node <-> Exit Node <-> Destination - - Entry guards are the only nodes which a client will connect to - directly. Exit relays are the nodes by which traffic exits the - Tor network in order to connect to an external destination. - - 3.1 Path selection - - For any multi-hop circuit, at least one entry guard and middle node(s) are - required. An exit node is required if traffic will exit the Tor - network. Depending on its configuration, a relay listed in a - consensus could be used for any of these roles. However, this - specification defines how entry guards specifically should be selected and - managed, as opposed to middle or exit nodes. - - 3.1.1 Managing entry guards - - At a high level, a relay listed in a consensus will move through the - following states in the process from initial selection to eventual - usage as an entry guard: - - relays listed in consensus - | - sampled - | | - confirmed filtered - | | | - primary usable_filtered - - Relays listed in the latest consensus can be sampled for guard usage - if they have the "Guard" flag. Sampling is random but weighted by - a measured bandwidth multiplied by bandwidth-weights (Wgg if guard only, - Wgd if guard+exit flagged). - - Once a path is built and a circuit established using this guard, it - is marked as confirmed. Until this point, guards are first sampled - and then filtered based on information such as our current - configuration (see SAMPLED and FILTERED sections) and later marked as - usable_filtered if the guard is not primary but can be reached. - - It is always preferable to use a primary guard when building a new - circuit in order to reduce guard churn; only on failure to connect to - existing primary guards will new guards be used. - - 3.1.2 Middle and exit node selection - - Middle nodes are selected at random from relays listed in the latest - consensus, weighted by bandwidth and bandwidth-weights. Exit nodes are - chosen similarly but restricted to relays with a sufficiently permissive - exit policy. - - 3.2 Circuit Building - - Once a path is chosen, Tor will use this path to build a new circuit. - - If the circuit is built successfully, Tor will either use it - immediately, or Tor will wait for a circuit with a more preferred - guard if there's a good chance that it will be able to make one. - - If the circuit fails in a way that makes us conclude that a guard - is not reachable, the guard is marked as unreachable, the circuit is - closed, and waiting circuits are updated. - -4. The algorithm. - -4.0. The guards listed in the current consensus. [Section:GUARDS] - - By {set:GUARDS} we mean the set of all guards in the current - consensus that are usable for all circuits and directory - requests. (They must have the flags: Stable, Fast, V2Dir, Guard.) - - **Rationale** - - We require all guards to have the flags that we potentially need - from any guard, so that all guards are usable for all circuits. - -4.1. The Sampled Guard Set. [Section:SAMPLED] - - We maintain a set, {set:SAMPLED_GUARDS}, that persists across - invocations of Tor. It is a subset of the nodes ordered by a sample idx that - we have seen listed as a guard in the consensus at some point. - For each such guard, we record persistently: - - - {pvar:ADDED_ON_DATE}: The date on which it was added to - sampled_guards. - - We set this value to a point in the past, using - RAND(now, {GUARD_LIFETIME}/10). See - Appendix [RANDOM] below. - - - {pvar:ADDED_BY_VERSION}: The version of Tor that added it to - sampled_guards. - - - {pvar:IS_LISTED}: Whether it was listed as a usable Guard in - the _most recent_ consensus we have seen. - - - {pvar:FIRST_UNLISTED_AT}: If IS_LISTED is false, the publication date - of the earliest consensus in which this guard was listed such that we - have not seen it listed in any later consensus. Otherwise "None." - We randomize this to a point in the past, based on - RAND(added_at_time, {REMOVE_UNLISTED_GUARDS_AFTER} / 5) - - For each guard in {SAMPLED_GUARDS}, we also record this data, - non-persistently: - - - {tvar:last_tried_connect}: A 'last tried to connect at' - time. Default 'never'. - - - {tvar:is_reachable}: an "is reachable" tristate, with - possible values { , , }. - Default '.' - - [Note: "yes" is not strictly necessary, but I'm - making it distinct from "maybe" anyway, to make our - logic clearer. A guard is "maybe" reachable if it's - worth trying. A guard is "yes" reachable if we tried - it and succeeded.] - - - {tvar:failing_since}: The first time when we failed to - connect to this guard. Defaults to "never". Reset to - "never" when we successfully connect to this guard. - - - {tvar:is_pending} A "pending" flag. This indicates that we - are trying to build an exploratory circuit through the - guard, and we don't know whether it will succeed. - - - {tvar:pending_since}: A timestamp. Set whenever we set - {tvar:is_pending} to true; cleared whenever we set - {tvar:is_pending} to false. NOTE - - We require that {SAMPLED_GUARDS} contain at least - {MIN_FILTERED_SAMPLE} guards from the consensus (if possible), - but not more than {MAX_SAMPLE_THRESHOLD} of the number of guards - in the consensus, and not more than {MAX_SAMPLE_SIZE} in total. - (But if the maximum would be smaller than {MIN_FILTERED_SAMPLE}, we - set the maximum at {MIN_FILTERED_SAMPLE}.) - - To add a new guard to {SAMPLED_GUARDS}, pick an entry at random from - ({GUARDS} - {SAMPLED_GUARDS}), according to the path selection rules. - - We remove an entry from {SAMPLED_GUARDS} if: - - * We have a live consensus, and {IS_LISTED} is false, and - {FIRST_UNLISTED_AT} is over {REMOVE_UNLISTED_GUARDS_AFTER} - days in the past. - - OR - - * We have a live consensus, and {ADDED_ON_DATE} is over - {GUARD_LIFETIME} ago, *and* {CONFIRMED_ON_DATE} is either - "never", or over {GUARD_CONFIRMED_MIN_LIFETIME} ago. - - Note that {SAMPLED_GUARDS} does not depend on our configuration. - It is possible that we can't actually connect to any of these - guards. - - **Rationale** - - The {SAMPLED_GUARDS} set is meant to limit the total number of - guards that a client will connect to in a given period. The - upper limit on its size prevents us from considering too many - guards. - - The first expiration mechanism is there so that our - {SAMPLED_GUARDS} list does not accumulate so many dead - guards that we cannot add new ones. - - The second expiration mechanism makes us rotate our guards slowly - over time. - - Ordering the {SAMPLED_GUARDS} set in the order in which we sampled those - guards and picking guards from that set according to this ordering improves - load-balancing. It is closer to offer the expected usage of the guard nodes - as per the path selection rules. - - The ordering also improves on another objective of this proposal: trying to - resist an adversary pushing clients over compromised guards, since the - adversary would need the clients to exhaust all their initial - {SAMPLED_GUARDS} set before having a chance to use a newly deployed - adversary node. - - -4.2. The Usable Sample [Section:FILTERED] - - We maintain another set, {set:FILTERED_GUARDS}, that does not - persist. It is derived from: - - - {SAMPLED_GUARDS} - - our current configuration, - - the path bias information. - - A guard is a member of {set:FILTERED_GUARDS} if and only if all - of the following are true: - - - It is a member of {SAMPLED_GUARDS}, with {IS_LISTED} set to - true. - - It is not disabled because of path bias issues. - - It is not disabled because of ReachableAddresses policy, - the ClientUseIPv4 setting, the ClientUseIPv6 setting, - the FascistFirewall setting, or some other - option that prevents using some addresses. - - It is not disabled because of ExcludeNodes. - - It is a bridge if UseBridges is true; or it is not a - bridge if UseBridges is false. - - Is included in EntryNodes if EntryNodes is set and - UseBridges is not. (But see 2.B above). - - We have an additional subset, {set:USABLE_FILTERED_GUARDS}, which - is defined to be the subset of {FILTERED_GUARDS} where - {is_reachable} is or . - - We try to maintain a requirement that {USABLE_FILTERED_GUARDS} - contain at least {MIN_FILTERED_SAMPLE} elements: - - Whenever we are going to sample from {USABLE_FILTERED_GUARDS}, - and it contains fewer than {MIN_FILTERED_SAMPLE} elements, we - add new elements to {SAMPLED_GUARDS} until one of the following - is true: - - * {USABLE_FILTERED_GUARDS} is large enough, - OR - * {SAMPLED_GUARDS} is at its maximum size. - - - ** Rationale ** - - These filters are applied _after_ sampling: if we applied them - before the sampling, then our sample would reflect the set of - filtering restrictions that we had in the past. - -4.3. The confirmed-guard list. [Section:CONFIRMED] - - [formerly USED_GUARDS] - - We maintain a persistent ordered list, {list:CONFIRMED_GUARDS}. - It contains guards that we have used before, in our preference - order of using them. It is a subset of {SAMPLED_GUARDS}. For - each guard in this list, we store persistently: - - - {pvar:IDENTITY} Its fingerprint. - - - {pvar:CONFIRMED_ON_DATE} When we added this guard to - {CONFIRMED_GUARDS}. - - Randomized to a point in the past as RAND(now, {GUARD_LIFETIME}/10). - - We append new members to {CONFIRMED_GUARDS} when we mark a circuit - built through a guard as "for user traffic." - - Whenever we remove a member from {SAMPLED_GUARDS}, we also remove - it from {CONFIRMED_GUARDS}. - - [Note: You can also regard the {CONFIRMED_GUARDS} list as a - total ordering defined over a subset of {SAMPLED_GUARDS}.] - - Definition: we call Guard A "higher priority" than another Guard B - if, when A and B are both reachable, we would rather use A. We - define priority as follows: - - * Every guard in {CONFIRMED_GUARDS} has a higher priority - than every guard not in {CONFIRMED_GUARDS}. - - * Among guards in {CONFIRMED_GUARDS}, the one appearing earlier - on the {CONFIRMED_GUARDS} list has a higher priority. - - * Among guards that do not appear in {CONFIRMED_GUARDS}, - {is_pending}==true guards have higher priority. - - * Among those, the guard with earlier {last_tried_connect} time - has higher priority. - - * Finally, among guards that do not appear in - {CONFIRMED_GUARDS} with {is_pending==false}, all have equal - priority. - - ** Rationale ** - - We add elements to this ordering when we have actually used them - for building a usable circuit. We could mark them at some other - time (such as when we attempt to connect to them, or when we - actually connect to them), but this approach keeps us from - committing to a guard before we actually use it for sensitive - traffic. - -4.4. The Primary guards [Section:PRIMARY] - - We keep a run-time non-persistent ordered list of - {list:PRIMARY_GUARDS}. It is a subset of {FILTERED_GUARDS}. It - contains {N_PRIMARY_GUARDS} elements. - - To compute primary guards, take the ordered intersection of - {CONFIRMED_GUARDS} and {FILTERED_GUARDS}, and take the first - {N_PRIMARY_GUARDS} elements. If there are fewer than - {N_PRIMARY_GUARDS} elements, append additional elements to - PRIMARY_GUARDS chosen from ({FILTERED_GUARDS} - {CONFIRMED_GUARDS}), - ordered in "sample order" (that is, by {ADDED_ON_DATE}). - - Once an element has been added to {PRIMARY_GUARDS}, we do not remove it - until it is replaced by some element from {CONFIRMED_GUARDS}. - That is: if a non-primary guard becomes confirmed and not every primary - guard is confirmed, then the list of primary guards list is regenerated, - first from the confirmed guards (as before), and then from any - non-confirmed primary guards. - - Note that {PRIMARY_GUARDS} do not have to be in - {USABLE_FILTERED_GUARDS}: they might be unreachable. - - ** Rationale ** - - These guards are treated differently from other guards. If one of - them is usable, then we use it right away. For other guards - {FILTERED_GUARDS}, if it's usable, then before using it we might - first double-check whether perhaps one of the primary guards is - usable after all. - -4.5. Retrying guards. [Section:RETRYING] - - (We run this process as frequently as needed. It can be done once - a second, or just-in-time.) - - If a primary sampled guard's {is_reachable} status is , then - we decide whether to update its {is_reachable} status to - based on its {last_tried_connect} time, its {failing_since} time, - and the {PRIMARY_GUARDS_RETRY_SCHED} schedule. - - If a non-primary sampled guard's {is_reachable} status is , then - we decide whether to update its {is_reachable} status to - based on its {last_tried_connect} time, its {failing_since} time, - and the {GUARDS_RETRY_SCHED} schedule. - - ** Rationale ** - - An observation that a guard has been 'unreachable' only lasts for - a given amount of time, since we can't infer that it's unreachable - now from the fact that it was unreachable a few minutes ago. - -4.6. Selecting guards for circuits. [Section:SELECTING] - - Every origin circuit is now in one of these states: - - , - , - , or - . - - You may only attach streams to circuits. - (Additionally, you may only send RENDEZVOUS cells, ESTABLISH_INTRO - cells, and INTRODUCE cells on circuits.) - - The per-circuit state machine is: - - New circuits are or - . - - A circuit may become , or may - fail. - - A circuit may become - ; may become ; or may - fail. - - A circuit will become , or will - be closed, or will fail. - - A circuit remains until it fails or is - closed. - - Each of these transitions is described below. - - We keep, as global transient state: - - * {tvar:last_time_on_internet} -- the last time at which we - successfully used a circuit or connected to a guard. At - startup we set this to "infinitely far in the past." - - When we want to build a circuit, and we need to pick a guard: - - * If any entry in PRIMARY_GUARDS has {is_reachable} status of - or , return one of the first - {NUM_USABLE_PRIMARY_GUARDS} or - {NUM_USABLE_PRIMARY_DIRECTORY_GUARDS} such guards, chosen - uniformly at random. The circuit is . - - [Note: We do not use {is_pending} on primary guards, since we - are willing to try to build multiple circuits through them - before we know for sure whether they work, and since we will - not use any non-primary guards until we are sure that the - primary guards are all down. (XX is this good?)] - - * Otherwise, if the ordered intersection of {CONFIRMED_GUARDS} - and {USABLE_FILTERED_GUARDS} is nonempty, return the first - entry in that intersection that has {is_pending} set to - false. Set its value of {is_pending} to true, - and set its {pending_since} to the current time. - The circuit - is now . (If all entries have - {is_pending} true, pick the first one.) - - * Otherwise, if there is no such entry, select a member from - {USABLE_FILTERED_GUARDS} in sample order. Set its {is_pending} field to - true, and set its {pending_since} to the current time. - The circuit is . - - * Otherwise, if USABLE_FILTERED_GUARDS is empty, we have exhausted - all the sampled guards. In this case we proceed by marking all guards - as reachable so that we can keep on trying circuits. - - Whenever we select a guard for a new circuit attempt, we update the - {last_tried_connect} time for the guard to 'now.' - - In some cases (for example, when we need a certain directory feature, - or when we need to avoid using a certain exit as a guard), we need to - restrict the guards that we use for a single circuit. When this happens, we - remember the restrictions that applied when choosing the guard for - that circuit, since we will need them later (see [UPDATE_WAITING].). - - ** Rationale ** - - We're getting to the core of the algorithm here. Our main goals are to - make sure that - - 1. If it's possible to use a primary guard, we do. - 2. We probably use the first primary guard. - - So we only try non-primary guards if we're pretty sure that all - the primary guards are down, and we only try a given primary guard - if the earlier primary guards seem down. - - When we _do_ try non-primary guards, however, we only build one - circuit through each, to give it a chance to succeed or fail. If - ever such a circuit succeeds, we don't use it until we're pretty - sure that it's the best guard we're getting. (see below). - - [XXX timeout.] - -4.7. When a circuit fails. [Section:ON_FAIL] - - When a circuit fails in a way that makes us conclude that a guard - is not reachable, we take the following steps: - - * Set the guard's {is_reachable} status to . If it had - {is_pending} set to true, we make it non-pending and clear - {pending_since}. - - * Close the circuit, of course. (This removes it from - consideration by the algorithm in [UPDATE_WAITING].) - - * Update the list of waiting circuits. (See [UPDATE_WAITING] - below.) - - [Note: the existing Tor logic will cause us to create more - circuits in response to some of these steps; and also see - [ON_CONSENSUS].] - - ** Rationale ** - - See [SELECTING] above for rationale. - -4.8. When a circuit succeeds [Section:ON_SUCCESS] - - When a circuit succeeds in a way that makes us conclude that a - guard _was_ reachable, we take these steps: - - * We set its {is_reachable} status to . - * We set its {failing_since} to "never". - * If the guard was {is_pending}, we clear the {is_pending} flag - and set {pending_since} to false. - * If the guard was not a member of {CONFIRMED_GUARDS}, we add - it to the end of {CONFIRMED_GUARDS}. - - * If this circuit was , this circuit is - now . You may attach streams to this circuit, - and use it for hidden services. - - * If this circuit was , it is now - . You may not yet attach streams to it. - Then check whether the {last_time_on_internet} is more than - {INTERNET_LIKELY_DOWN_INTERVAL} seconds ago: - - * If it is, then mark all {PRIMARY_GUARDS} as "maybe" - reachable. - - * If it is not, update the list of waiting circuits. (See - [UPDATE_WAITING] below) - - [Note: the existing Tor logic will cause us to create more - circuits in response to some of these steps; and see - [ON_CONSENSUS].] - - ** Rationale ** - - See [SELECTING] above for rationale. - -4.9. Updating the list of waiting circuits [Section:UPDATE_WAITING] - - We run this procedure whenever it's possible that a - circuit might be ready to be called - . - - * If any circuit C1 is , AND: - * All primary guards have reachable status of . - * There is no circuit C2 that "blocks" C1. - Then, upgrade C1 to . - - Definition: In the algorithm above, C2 "blocks" C1 if: - * C2 obeys all the restrictions that C1 had to obey, AND - * C2 has higher priority than C1, AND - * Either C2 is , or C2 is , - or C2 has been for no more than - {NONPRIMARY_GUARD_CONNECT_TIMEOUT} seconds. - - We run this procedure periodically: - - * If any circuit stays in - for more than {NONPRIMARY_GUARD_IDLE_TIMEOUT} seconds, - time it out. - - **Rationale** - - If we open a connection to a guard, we might want to use it - immediately (if we're sure that it's the best we can do), or we - might want to wait a little while to see if some other circuit - which we like better will finish. - - - When we mark a circuit , we don't close the - lower-priority circuits immediately: we might decide to use - them after all if the circuit goes down before - {NONPRIMARY_GUARD_IDLE_TIMEOUT} seconds. - -4.9.1. Without a list of waiting circuits [Section:NO_CIRCLIST] - - As an alternative to the section [SECTION:UPDATE_WAITING] above, - this section presents a new way to maintain guard status - independently of tracking individual circuit status. This - formulation gives a result equivalent or similar to the approach - above, but simplifies the necessary communications between the - guard and circuit subsystems. - - As before, when all primary guards are Unreachable, we need to - try non-primary guards. We select the first such guard (in - preference order) that is neither Unreachable nor Pending. - Whenever we give out such a guard, if the guard's status is - Unknown, then we call that guard "Pending" with its {is_pending} - flag, until the attempt to use it succeeds or fails. We remember - when the guard became Pending with the {pending_since variable}. - - After completing a circuit, the implementation must check whether - its guard is usable. A guard's usability status may be "usable", - "unusable", or "unknown". A guard is usable according to - these rules: - - 1. Primary guards are always usable. - - 2. Non-primary guards are usable _for a given circuit_ if every - guard earlier in the preference list is either unsuitable for - that circuit (e.g. because of family restrictions), or marked as - Unreachable, or has been pending for at least - `{NONPRIMARY_GUARD_CONNECT_TIMEOUT}`. - - Non-primary guards are not usable _for a given circuit_ if some - guard earlier in the preference list is suitable for the circuit - _and_ Reachable. - - Non-primary guards are unusable if they have not become - usable after `{NONPRIMARY_GUARD_IDLE_TIMEOUT}` seconds. - - 3. If a circuit's guard is not usable or unusable immediately, the - circuit is not discarded; instead, it is kept (but not used) until the - guard becomes usable or unusable. - - -4.10. Whenever we get a new consensus. [Section:ON_CONSENSUS] - - We update {GUARDS}. - - For every guard in {SAMPLED_GUARDS}, we update {IS_LISTED} and - {FIRST_UNLISTED_AT}. - - [**] We remove entries from {SAMPLED_GUARDS} if appropriate, - according to the sampled-guards expiration rules. If they were - in {CONFIRMED_GUARDS}, we also remove them from - {CONFIRMED_GUARDS}. - - We recompute {FILTERED_GUARDS}, and everything that derives from - it, including {USABLE_FILTERED_GUARDS}, and {PRIMARY_GUARDS}. - - (Whenever one of the configuration options that affects the - filter is updated, we repeat the process above, starting at the - [**] line.) - -4.11. Deciding whether to generate a new circuit. - [Section:NEW_CIRCUIT_NEEDED] - - We generate a new circuit when we don't have - enough circuits either built or in-progress to handle a given - stream, or an expected stream. - - For the purpose of this rule, we say that - circuits are neither built nor in-progress; that - circuits are built; and that the other states are in-progress. - -4.12. When we are missing descriptors. - [Section:MISSING_DESCRIPTORS] - - We need either a router descriptor or a microdescriptor in order - to build a circuit through a guard. If we do not have such a - descriptor for a guard, we can still use the guard for one-hop - directory fetches, but not for longer circuits. - - (Also, when we are missing descriptors for our first - {NUM_USABLE_PRIMARY_GUARDS} primary guards, we don't build - circuits at all until we have fetched them.) - -A. Appendices - -A.0. Acknowledgements - - This research was supported in part by NSF grants CNS-1111539, - CNS-1314637, CNS-1526306, CNS-1619454, and CNS-1640548. - -A.1. Parameters with suggested values. [Section:PARAM_VALS] - - (All suggested values chosen arbitrarily) - - {param:MAX_SAMPLE_THRESHOLD} -- 20% - - {param:MAX_SAMPLE_SIZE} -- 60 - - {param:GUARD_LIFETIME} -- 120 days - - {param:REMOVE_UNLISTED_GUARDS_AFTER} -- 20 days - [previously ENTRY_GUARD_REMOVE_AFTER] - - {param:MIN_FILTERED_SAMPLE} -- 20 - - {param:N_PRIMARY_GUARDS} -- 3 - - {param:PRIMARY_GUARDS_RETRY_SCHED} - - We recommend the following schedule, which is the one - used in Arti: - - -- Use the "decorrelated-jitter" algorithm from "dir-spec.txt" - section 5.5 where `base_delay` is 30 seconds and `cap` - is 6 hours. - - This legacy schedule is the one used in C tor: - - -- every 10 minutes for the first six hours, - -- every 90 minutes for the next 90 hours, - -- every 4 hours for the next 3 days, - -- every 9 hours thereafter. - - {param:GUARDS_RETRY_SCHED} -- - - We recommend the following schedule, which is the one - used in Arti: - - -- Use the "decorrelated-jitter" algorithm from "dir-spec.txt" - section 5.5 where `base_delay` is 10 minutes and `cap` - is 36 hours. - - This legacy schedule is the one used in C tor: - - -- every hour for the first six hours, - -- every 4 hours for the 90 hours, - -- every 18 hours for the next 3 days, - -- every 36 hours thereafter. - - {param:INTERNET_LIKELY_DOWN_INTERVAL} -- 10 minutes - - {param:NONPRIMARY_GUARD_CONNECT_TIMEOUT} -- 15 seconds - - {param:NONPRIMARY_GUARD_IDLE_TIMEOUT} -- 10 minutes - - {param:MEANINGFUL_RESTRICTION_FRAC} -- .2 - - {param:EXTREME_RESTRICTION_FRAC} -- .01 - - {param:GUARD_CONFIRMED_MIN_LIFETIME} -- 60 days - - {param:NUM_USABLE_PRIMARY_GUARDS} -- 1 - - {param:NUM_USABLE_PRIMARY_DIRECTORY_GUARDS} -- 3 - -A.2. Random values [Section:RANDOM] - - Frequently, we want to randomize the expiration time of something - so that it's not easy for an observer to match it to its start - time. We do this by randomizing its start date a little, so that - we only need to remember a fixed expiration interval. - - By RAND(now, INTERVAL) we mean a time between now and INTERVAL in - the past, chosen uniformly at random. - - -A.3. Why not a sliding scale of primaryness? [Section:CVP] - - At one meeting, I floated the idea of having "primaryness" be a - continuous variable rather than a boolean. - - I'm no longer sure this is a great idea, but I'll try to outline - how it might work. - - To begin with: being "primary" gives it a few different traits: - - 1) We retry primary guards more frequently. [Section:RETRYING] - - 2) We don't even _try_ building circuits through - lower-priority guards until we're pretty sure that the - higher-priority primary guards are down. (With non-primary - guards, on the other hand, we launch exploratory circuits - which we plan not to use if higher-priority guards - succeed.) [Section:SELECTING] - - 3) We retry them all one more time if a circuit succeeds after - the net has been down for a while. [Section:ON_SUCCESS] - - We could make each of the above traits continuous: - - 1) We could make the interval at which a guard is retried - depend continuously on its position in CONFIRMED_GUARDS. - - 2) We could change the number of guards we test in parallel - based on their position in CONFIRMED_GUARDS. - - 3) We could change the rule for how long the higher-priority - guards need to have been down before we call a - circuit based on a - possible network-down condition. For example, we could - retry the first guard if we tried it more than 10 seconds - ago, the second if we tried it more than 20 seconds ago, - etc. - - I am pretty sure, however, that if these are worth doing, they - need more analysis! Here's why: - - * They all have the potential to leak more information about a - guard's exact position on the list. Is that safe? Is there - any way to exploit that? I don't think we know. - - * They all seem like changes which it would be relatively - simple to make to the code after we implement the simpler - version of the algorithm described above. - -A.4. Controller changes - - We will add to control-spec.txt a new possible circuit state, GUARD_WAIT, - that can be given as part of circuit events and GETINFO responses about - circuits. A circuit is in the GUARD_WAIT state when it is fully built, - but we will not use it because a circuit with a better guard might - become built too. - -A.5. Persistent state format - - The persistent state format doesn't need to be part of this - specification, since different implementations can do it - differently. Nonetheless, here's the one Tor uses: - - The "state" file contains one Guard entry for each sampled guard - in each instance of the guard state (see section 2). The value - of this Guard entry is a set of space-separated K=V entries, - where K contains any nonspace character except =, and V contains - any nonspace characters. - - Implementations must retain any unrecognized K=V entries for a - sampled guard when they regenerate the state file. - - The order of K=V entries is not allowed to matter. - - Recognized fields (values of K) are: - - "in" -- the name of the guard state instance that this - sampled guard is in. If a sampled guard is in two guard - states instances, it appears twice, with a different "in" - field each time. Required. - - "rsa_id" -- the RSA id digest for this guard, encoded in - hex. Required. - - "bridge_addr" -- If the guard is a bridge, its configured address and - port (this can be the ORPort or a pluggable transport port). Optional. - - "nickname" -- the guard's nickname, if any. Optional. - - "sampled_on" -- the date when the guard was sampled. Required. - - "sampled_by" -- the Tor version that sampled this guard. - Optional. - - "unlisted_since" -- the date since which the guard has been - unlisted. Optional. - - "listed" -- 0 if the guard is not listed; 1 if it is. Required. - - "confirmed_on" -- date when the guard was - confirmed. Optional. - - "confirmed_idx" -- position of the guard in the confirmed - list. Optional. - - "pb_use_attempts", "pb_use_successes", "pb_circ_attempts", - "pb_circ_successes", "pb_successful_circuits_closed", - "pb_collapsed_circuits", "pb_unusable_circuits", - "pb_timeouts" -- state for the circuit path bias algorithm, - given in decimal fractions. Optional. - - All dates here are given as a (spaceless) ISO8601 combined date - and time in UTC (e.g., 2016-11-29T19:39:31). - - -TODO. Still non-addressed issues [Section:TODO] - - Simulate to answer: Will this work in a dystopic world? - - Simulate actual behavior. - - For all lifetimes: instead of storing the "this began at" time, - store the "remove this at" time, slightly randomized. - - Clarify that when you get a circuit, you might need to - relaunch circuits through that same guard immediately, if they - are circuits that have to be independent. - - - Fix all items marked XX or TODO. - - "Directory guards" -- do they matter? - - Suggestion: require that all guards support downloads via BEGINDIR. - We don't need to worry about directory guards for relays, since we - aren't trying to prevent relay enumeration. - - IP version preferences via ClientPreferIPv6ORPort - - Suggestion: Treat it as a preference when adding to - {CONFIRMED_GUARDS}, but not otherwise. - -- cgit v1.2.3-54-g00ecf