aboutsummaryrefslogtreecommitdiff
path: root/guard-spec.txt
blob: 357583b91c276fe6652b1f72dc4f8f892127e6cc (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
                      Tor Guard Specification

                           Isis Lovecruft
                         George Kadianakis
                              Ola Bini
                           Nick Mathewson

Table of Contents

    1. Introduction and motivation
    2. State instances
    3. Circuit Creation, Entry Guard Selection (1000 foot view)
        3.1 Path selection
            3.1.1 Managing entry guards
            3.1.2 Middle and exit node selection
        3.2 Circuit Building
    4. The algorithm.
        4.0. The guards listed in the current consensus. [Section:GUARDS]
        4.1. The Sampled Guard Set. [Section:SAMPLED]
        4.2. The Usable Sample [Section:FILTERED]
        4.3. The confirmed-guard list. [Section:CONFIRMED]
        4.4. The Primary guards [Section:PRIMARY]
        4.5. Retrying guards. [Section:RETRYING]
        4.6. Selecting guards for circuits. [Section:SELECTING]
        4.7. When a circuit fails. [Section:ON_FAIL]
        4.8. When a circuit succeeds [Section:ON_SUCCESS]
        4.9. Updating the list of waiting circuits [Section:UPDATE_WAITING]
        4.10. Whenever we get a new consensus. [Section:ON_CONSENSUS]
        4.11. Deciding whether to generate a new circuit.
        4.12. When we are missing descriptors.
    A. Appendices
        A.0. Acknowledgements
        A.1. Parameters with suggested values. [Section:PARAM_VALS]
        A.2. Random values [Section:RANDOM]
        A.3. Why not a sliding scale of primaryness? [Section:CVP]
        A.4. Controller changes
        A.5. Persistent state format

1. Introduction and motivation

  Tor uses entry guards to prevent an attacker who controls some
  fraction of the network from observing a fraction of every user's
  traffic. If users chose their entries and exits uniformly at
  random from the list of servers every time they build a circuit,
  then an adversary who had (k/N) of the network would deanonymize
  F=(k/N)^2 of all circuits... and after a given user had built C
  circuits, the attacker would see them at least once with
  probability 1-(1-F)^C.  With large C, the attacker would get a
  sample of every user's traffic with probability 1.

  To prevent this from happening, Tor clients choose a small number
  of guard nodes (e.g. 3).  These guard nodes are the only
  nodes that the client will connect to directly.  If they are not
  compromised, the user's paths are not compromised.

  This specification outlines Tor's guard housekeeping algorithm,
  which tries to meet the following goals:

    - Heuristics and algorithms for determining how and which guards
      are chosen should be kept as simple and easy to understand as
      possible.

    - Clients in censored regions or who are behind a fascist
      firewall who connect to the Tor network should not experience
      any significant disadvantage in terms of reachability or
      usability.

    - Tor should make a best attempt at discovering the most
      appropriate behavior, with as little user input and
      configuration as possible.

    - Tor clients should discover usable guards without too much
      delay.

    - Tor clients should resist (to the extent possible) attacks
      that try to force them onto compromised guards.

    - Should maintain the load-balancing offered by the path selection
      algorithm

2. State instances

   In the algorithm below, we describe a set of persistent and
   non-persistent state variables.  These variables should be
   treated as an object, of which multiple instances can exist.

   In particular, we specify the use of three particular instances:

     A. UseBridges

      If UseBridges is set, then we replace the {GUARDS} set in
      [Sec:GUARDS] below with the list of configured
      bridges.  We maintain a separate persistent instance of
      {SAMPLED_GUARDS} and {CONFIRMED_GUARDS} and other derived
      values for the UseBridges case.

      In this case, we impose no upper limit on the sample size.

    B. EntryNodes / ExcludeNodes / Reachable*Addresses /
        FascistFirewall / ClientUseIPv4=0

      If one of the above options is set, and UseBridges is not,
      then we compare the fraction of usable guards in the consensus
      to the total number of guards in the consensus.

      If this fraction is less than {MEANINGFUL_RESTRICTION_FRAC},
      we use a separate instance of the state.

      (While Tor is running, we do not change back and forth between
      the separate instance of the state and the default instance
      unless the fraction of usable guards is 5% higher than, or 5%
      lower than, {MEANINGFUL_RESTRICTION_FRAC}.  This prevents us
      from flapping back and forth between instances if we happen to
      hit {MEANINGFUL_RESTRICTION_FRAC} exactly.

      If this fraction is less than {EXTREME_RESTRICTION_FRAC}, we use a
      separate instance of the state, and warn the user.

      [TODO: should we have a different instance for each set of heavily
      restricted options?]

   C. Default

      If neither of the above variant-state instances is used,
      we use a default instance.

3. Circuit Creation, Entry Guard Selection (1000 foot view)

   A circuit in Tor is a path through the network connecting a client to
   its destination. At a high-level, a three-hop exit circuit will look
   like this:

   Client <-> Entry Guard <-> Middle Node <-> Exit Node <-> Destination

   Entry guards are the only nodes which a client will connect to
   directly. Exit relays are the nodes by which traffic exits the
   Tor network in order to connect to an external destination.

   3.1 Path selection

   For any multi-hop circuit, at least one entry guard and middle node(s) are
   required. An exit node is required if traffic will exit the Tor
   network. Depending on its configuration, a relay listed in a
   consensus could be used for any of these roles. However, this
   specification defines how entry guards specifically should be selected and
   managed, as opposed to middle or exit nodes.

   3.1.1 Managing entry guards

   At a high level, a relay listed in a consensus will move through the
   following states in the process from initial selection to eventual
   usage as an entry guard:

      relays listed in consensus
                 |
               sampled
               |     |
         confirmed   filtered
               |     |      |
               primary      usable_filtered

   Relays listed in the latest consensus can be sampled for guard usage
   if they have the "Guard" flag. Sampling is random but weighted by
   a measured bandwidth multiplied by bandwidth-weights (Wgg if guard only,
   Wgd if guard+exit flagged).

   Once a path is built and a circuit established using this guard, it
   is marked as confirmed. Until this point, guards are first sampled
   and then filtered based on information such as our current
   configuration (see SAMPLED and FILTERED sections) and later marked as
   usable_filtered if the guard is not primary but can be reached.

   It is always preferable to use a primary guard when building a new
   circuit in order to reduce guard churn; only on failure to connect to
   existing primary guards will new guards be used.

   3.1.2 Middle and exit node selection

   Middle nodes are selected at random from relays listed in the latest
   consensus, weighted by bandwidth and bandwidth-weights. Exit nodes are
   chosen similarly but restricted to relays with a sufficiently permissive
   exit policy.

   3.2 Circuit Building

   Once a path is chosen, Tor will use this path to build a new circuit.

   If the circuit is built successfully, Tor will either use it
   immediately, or Tor will wait for a circuit with a more preferred
   guard if there's a good chance that it will be able to make one.

   If the circuit fails in a way that makes us conclude that a guard
   is not reachable, the guard is marked as unreachable, the circuit is
   closed, and waiting circuits are updated.

4. The algorithm.

4.0.  The guards listed in the current consensus. [Section:GUARDS]

   By {set:GUARDS} we mean the set of all guards in the current
   consensus that are usable for all circuits and directory
   requests. (They must have the flags: Stable, Fast, V2Dir, Guard.)

      **Rationale**

   We require all guards to have the flags that we potentially need
   from any guard, so that all guards are usable for all circuits.

4.1.  The Sampled Guard Set. [Section:SAMPLED]

   We maintain a set, {set:SAMPLED_GUARDS}, that persists across
   invocations of Tor. It is a subset of the nodes ordered by a sample idx that
   we have seen listed as a guard in the consensus at some point.
   For each such guard, we record persistently:

      - {pvar:ADDED_ON_DATE}: The date on which it was added to
        sampled_guards.

        We set this value to a point in the past, using
        RAND(now, {GUARD_LIFETIME}/10). See
        Appendix [RANDOM] below.

      - {pvar:ADDED_BY_VERSION}: The version of Tor that added it to
        sampled_guards.

      - {pvar:IS_LISTED}: Whether it was listed as a usable Guard in
        the _most recent_ consensus we have seen.

      - {pvar:FIRST_UNLISTED_AT}: If IS_LISTED is false, the publication date
        of the earliest consensus in which this guard was listed such that we
        have not seen it listed in any later consensus.  Otherwise "None."
        We randomize this to a point in the past, based on
          RAND(added_at_time, {REMOVE_UNLISTED_GUARDS_AFTER} / 5)

   For each guard in {SAMPLED_GUARDS}, we also record this data,
   non-persistently:

      - {tvar:last_tried_connect}: A 'last tried to connect at'
        time.  Default 'never'.

      - {tvar:is_reachable}: an "is reachable" tristate, with
        possible values { <state:yes>, <state:no>, <state:maybe> }.
        Default '<maybe>.'

               [Note: "yes" is not strictly necessary, but I'm
                making it distinct from "maybe" anyway, to make our
                logic clearer.  A guard is "maybe" reachable if it's
                worth trying. A guard is "yes" reachable if we tried
                it and succeeded.]

      - {tvar:failing_since}: The first time when we failed to
        connect to this guard. Defaults to "never".  Reset to
        "never" when we successfully connect to this guard.

      - {tvar:is_pending} A "pending" flag.  This indicates that we
        are trying to build an exploratory circuit through the
        guard, and we don't know whether it will succeed.

   We require that {SAMPLED_GUARDS} contain at least
   {MIN_FILTERED_SAMPLE} guards from the consensus (if possible),
   but not more than {MAX_SAMPLE_THRESHOLD} of the number of guards
   in the consensus, and not more than {MAX_SAMPLE_SIZE} in total.
   (But if the maximum would be smaller than {MIN_FILTERED_SAMPLE}, we
   set the maximum at {MIN_FILTERED_SAMPLE}.)

   To add a new guard to {SAMPLED_GUARDS}, pick an entry at random from
   ({GUARDS} - {SAMPLED_GUARDS}), according to the path selection rules.

   We remove an entry from {SAMPLED_GUARDS} if:

      * We have a live consensus, and {IS_LISTED} is false, and
        {FIRST_UNLISTED_AT} is over {REMOVE_UNLISTED_GUARDS_AFTER}
        days in the past.

     OR

      * We have a live consensus, and {ADDED_ON_DATE} is over
        {GUARD_LIFETIME} ago, *and* {CONFIRMED_ON_DATE} is either
        "never", or over {GUARD_CONFIRMED_MIN_LIFETIME} ago.

   Note that {SAMPLED_GUARDS} does not depend on our configuration.
   It is possible that we can't actually connect to any of these
   guards.

     **Rationale**

   The {SAMPLED_GUARDS} set is meant to limit the total number of
   guards that a client will connect to in a given period.  The
   upper limit on its size prevents us from considering too many
   guards.

   The first expiration mechanism is there so that our
   {SAMPLED_GUARDS} list does not accumulate so many dead
   guards that we cannot add new ones.

   The second expiration mechanism makes us rotate our guards slowly
   over time.

   Ordering the {SAMPLED_GUARDS} set in the order in which we sampled those
   guards and picking guards from that set according to this ordering improves
   load-balancing. It is closer to offer the expected usage of the guard nodes
   as per the path selection rules.

   The ordering also improves on another objective of this proposal: trying to
   resist an adversary pushing clients over compromised guards, since the
   adversary would need the clients to exhaust all their initial
   {SAMPLED_GUARDS} set before having a chance to use a newly deployed
   adversary node.


4.2. The Usable Sample [Section:FILTERED]

   We maintain another set, {set:FILTERED_GUARDS}, that does not
   persist. It is derived from:

       - {SAMPLED_GUARDS}
       - our current configuration,
       - the path bias information.

   A guard is a member of {set:FILTERED_GUARDS} if and only if all
   of the following are true:

       - It is a member of {SAMPLED_GUARDS}, with {IS_LISTED} set to
         true.
       - It is not disabled because of path bias issues.
       - It is not disabled because of ReachableAddresses policy,
         the ClientUseIPv4 setting, the ClientUseIPv6 setting,
         the FascistFirewall setting, or some other
         option that prevents using some addresses.
       - It is not disabled because of ExcludeNodes.
       - It is a bridge if UseBridges is true; or it is not a
         bridge if UseBridges is false.
       - Is included in EntryNodes if EntryNodes is set and
         UseBridges is not. (But see 2.B above).

   We have an additional subset, {set:USABLE_FILTERED_GUARDS}, which
   is defined to be the subset of {FILTERED_GUARDS} where
   {is_reachable} is <yes> or <maybe>.

   We try to maintain a requirement that {USABLE_FILTERED_GUARDS}
   contain at least {MIN_FILTERED_SAMPLE} elements:

     Whenever we are going to sample from {USABLE_FILTERED_GUARDS},
     and it contains fewer than {MIN_FILTERED_SAMPLE} elements, we
     add new elements to {SAMPLED_GUARDS} until one of the following
     is true:

       * {USABLE_FILTERED_GUARDS} is large enough,
     OR
       * {SAMPLED_GUARDS} is at its maximum size.


     ** Rationale **

  These filters are applied _after_ sampling: if we applied them
  before the sampling, then our sample would reflect the set of
  filtering restrictions that we had in the past.

4.3. The confirmed-guard list. [Section:CONFIRMED]

  [formerly USED_GUARDS]

  We maintain a persistent ordered list, {list:CONFIRMED_GUARDS}.
  It contains guards that we have used before, in our preference
  order of using them.  It is a subset of {SAMPLED_GUARDS}.  For
  each guard in this list, we store persistently:

      - {pvar:IDENTITY} Its fingerprint.

      - {pvar:CONFIRMED_ON_DATE} When we added this guard to
        {CONFIRMED_GUARDS}.

        Randomized to a point in the past as RAND(now, {GUARD_LIFETIME}/10).

  We append new members to {CONFIRMED_GUARDS} when we mark a circuit
  built through a guard as "for user traffic."

  Whenever we remove a member from {SAMPLED_GUARDS}, we also remove
  it from {CONFIRMED_GUARDS}.

      [Note: You can also regard the {CONFIRMED_GUARDS} list as a
      total ordering defined over a subset of {SAMPLED_GUARDS}.]

  Definition: we call Guard A "higher priority" than another Guard B
  if, when A and B are both reachable, we would rather use A.  We
  define priority as follows:

     * Every guard in {CONFIRMED_GUARDS} has a higher priority
       than every guard not in {CONFIRMED_GUARDS}.

     * Among guards in {CONFIRMED_GUARDS}, the one appearing earlier
       on the {CONFIRMED_GUARDS} list has a higher priority.

     * Among guards that do not appear in {CONFIRMED_GUARDS},
       {is_pending}==true guards have higher priority.

     * Among those, the guard with earlier {last_tried_connect} time
       has higher priority.

     * Finally, among guards that do not appear in
       {CONFIRMED_GUARDS} with {is_pending==false}, all have equal
       priority.

   ** Rationale **

  We add elements to this ordering when we have actually used them
  for building a usable circuit.  We could mark them at some other
  time (such as when we attempt to connect to them, or when we
  actually connect to them), but this approach keeps us from
  committing to a guard before we actually use it for sensitive
  traffic.

4.4. The Primary guards [Section:PRIMARY]

  We keep a run-time non-persistent ordered list of
  {list:PRIMARY_GUARDS}.  It is a subset of {FILTERED_GUARDS}.  It
  contains {N_PRIMARY_GUARDS} elements.

  To compute primary guards, take the ordered intersection of
  {CONFIRMED_GUARDS} and {FILTERED_GUARDS}, and take the first
  {N_PRIMARY_GUARDS} elements.  If there are fewer than
  {N_PRIMARY_GUARDS} elements, append additional elements to
  PRIMARY_GUARDS chosen from ({FILTERED_GUARDS} - {CONFIRMED_GUARDS}),
  ordered in "sample order" (that is, by {ADDED_ON_DATE}).

  Once an element has been added to {PRIMARY_GUARDS}, we do not remove it
  until it is replaced by some element from {CONFIRMED_GUARDS}.
  That is: if a non-primary guard becomes confirmed and not every primary
  guard is confirmed, then the list of primary guards list is regenerated,
  first from the confirmed guards (as before), and then from any
  non-confirmed primary guards.

  Note that {PRIMARY_GUARDS} do not have to be in
  {USABLE_FILTERED_GUARDS}: they might be unreachable.

    ** Rationale **

  These guards are treated differently from other guards.  If one of
  them is usable, then we use it right away. For other guards
  {FILTERED_GUARDS}, if it's usable, then before using it we might
  first double-check whether perhaps one of the primary guards is
  usable after all.

4.5. Retrying guards. [Section:RETRYING]

  (We run this process as frequently as needed. It can be done once
  a second, or just-in-time.)

  If a primary sampled guard's {is_reachable} status is <no>, then
  we decide whether to update its {is_reachable} status to <maybe>
  based on its {last_tried_connect} time, its {failing_since} time,
  and the {PRIMARY_GUARDS_RETRY_SCHED} schedule.

  If a non-primary sampled guard's {is_reachable} status is <no>, then
  we decide whether to update its {is_reachable} status to <maybe>
  based on its {last_tried_connect} time, its {failing_since} time,
  and the {GUARDS_RETRY_SCHED} schedule.

    ** Rationale **

  An observation that a guard has been 'unreachable' only lasts for
  a given amount of time, since we can't infer that it's unreachable
  now from the fact that it was unreachable a few minutes ago.

4.6. Selecting guards for circuits. [Section:SELECTING]

  Every origin circuit is now in one of these states:

     <state:usable_on_completion>,
     <state:usable_if_no_better_guard>,
     <state:waiting_for_better_guard>, or
     <state:complete>.

  You may only attach streams to <complete> circuits.
  (Additionally, you may only send RENDEZVOUS cells, ESTABLISH_INTRO
  cells, and INTRODUCE cells on <complete> circuits.)

  The per-circuit state machine is:

      New circuits are <usable_on_completion> or
      <usable_if_no_better_guard>.

      A <usable_on_completion> circuit may become <complete>, or may
      fail.

      A <usable_if_no_better_guard> circuit may become
      <usable_on_completion>; may become <waiting_for_better_guard>; or may
      fail.

      A <waiting_for_better_guard> circuit will become <complete>, or will
      be closed, or will fail.

      A <complete> circuit remains <complete> until it fails or is
      closed.

      Each of these transitions is described below.

  We keep, as global transient state:

    * {tvar:last_time_on_internet} -- the last time at which we
      successfully used a circuit or connected to a guard.  At
      startup we set this to "infinitely far in the past."

  When we want to build a circuit, and we need to pick a guard:

    * If any entry in PRIMARY_GUARDS has {is_reachable} status of
      <maybe> or <yes>, return one of the first
      {NUM_USABLE_PRIMARY_GUARDS} or
      {NUM_USABLE_PRIMARY_DIRECTORY_GUARDS} such guards, chosen
      uniformly at random. The circuit is <usable_on_completion>.

      [Note: We do not use {is_pending} on primary guards, since we
      are willing to try to build multiple circuits through them
      before we know for sure whether they work, and since we will
      not use any non-primary guards until we are sure that the
      primary guards are all down.  (XX is this good?)]

    * Otherwise, if the ordered intersection of {CONFIRMED_GUARDS}
      and {USABLE_FILTERED_GUARDS} is nonempty, return the first
      entry in that intersection that has {is_pending} set to
      false. Set its value of {is_pending} to true.  The circuit
      is now <usable_if_no_better_guard>.  (If all entries have
      {is_pending} true, pick the first one.)

    * Otherwise, if there is no such entry, select a member from
      {USABLE_FILTERED_GUARDS} in sample order. Set its {is_pending} field to
      true. The circuit is <usable_if_no_better_guard>.

    * Otherwise, if USABLE_FILTERED_GUARDS is empty, we have exhausted
      all the sampled guards.  In this case we proceed by marking all guards
      as <maybe> reachable so that we can keep on trying circuits.

  Whenever we select a guard for a new circuit attempt, we update the
  {last_tried_connect} time for the guard to 'now.'

  In some cases (for example, when we need a certain directory feature,
  or when we need to avoid using a certain exit as a guard), we need to
  restrict the guards that we use for a single circuit. When this happens, we
  remember the restrictions that applied when choosing the guard for
  that circuit, since we will need them later (see [UPDATE_WAITING].).

    ** Rationale **

  We're getting to the core of the algorithm here.  Our main goals are to
  make sure that

    1. If it's possible to use a primary guard, we do.
    2. We probably use the first primary guard.

  So we only try non-primary guards if we're pretty sure that all
  the primary guards are down, and we only try a given primary guard
  if the earlier primary guards seem down.

  When we _do_ try non-primary guards, however, we only build one
  circuit through each, to give it a chance to succeed or fail.  If
  ever such a circuit succeeds, we don't use it until we're pretty
  sure that it's the best guard we're getting. (see below).

         [XXX timeout.]

4.7. When a circuit fails. [Section:ON_FAIL]

   When a circuit fails in a way that makes us conclude that a guard
   is not reachable, we take the following steps:

      * Set the guard's {is_reachable} status to <no>.  If it had
        {is_pending} set to true, we make it non-pending.

      * Close the circuit, of course.  (This removes it from
        consideration by the algorithm in [UPDATE_WAITING].)

      * Update the list of waiting circuits.  (See [UPDATE_WAITING]
        below.)

   [Note: the existing Tor logic will cause us to create more
   circuits in response to some of these steps; and also see
   [ON_CONSENSUS].]

    ** Rationale **

   See [SELECTING] above for rationale.

4.8. When a circuit succeeds [Section:ON_SUCCESS]

   When a circuit succeeds in a way that makes us conclude that a
   guard _was_ reachable, we take these steps:

      * We set its {is_reachable} status to <yes>.
      * We set its {failing_since} to "never".
      * If the guard was {is_pending}, we clear the {is_pending} flag.
      * If the guard was not a member of {CONFIRMED_GUARDS}, we add
        it to the end of {CONFIRMED_GUARDS}.

      * If this circuit was <usable_on_completion>, this circuit is
        now <complete>. You may attach streams to this circuit,
        and use it for hidden services.

      * If this circuit was <usable_if_no_better_guard>, it is now
        <waiting_for_better_guard>.  You may not yet attach streams to it.
        Then check whether the {last_time_on_internet} is more than
        {INTERNET_LIKELY_DOWN_INTERVAL} seconds ago:

           * If it is, then mark all {PRIMARY_GUARDS} as "maybe"
             reachable.

           * If it is not, update the list of waiting circuits. (See
             [UPDATE_WAITING] below)

   [Note: the existing Tor logic will cause us to create more
   circuits in response to some of these steps; and see
   [ON_CONSENSUS].]

    ** Rationale **

   See [SELECTING] above for rationale.

4.9. Updating the list of waiting circuits [Section:UPDATE_WAITING]

   We run this procedure whenever it's possible that a
   <waiting_for_better_guard> circuit might be ready to be called
   <complete>.

   * If any circuit C1 is <waiting_for_better_guard>, AND:
       * All primary guards have reachable status of <no>.
       * There is no circuit C2 that "blocks" C1.
     Then, upgrade C1 to <complete>.

   Definition: In the algorithm above, C2 "blocks" C1 if:
       * C2 obeys all the restrictions that C1 had to obey, AND
       * C2 has higher priority than C1, AND
       * Either C2 is <complete>, or C2 is <waiting_for_better_guard>,
         or C2 has been <usable_if_no_better_guard> for no more than
         {NONPRIMARY_GUARD_CONNECT_TIMEOUT} seconds.

   We run this procedure periodically:

   * If any circuit stays in <waiting_for_better_guard>
     for more than {NONPRIMARY_GUARD_IDLE_TIMEOUT} seconds,
     time it out.

      **Rationale**

   If we open a connection to a guard, we might want to use it
   immediately (if we're sure that it's the best we can do), or we
   might want to wait a little while to see if some other circuit
   which we like better will finish.


   When we mark a circuit <complete>, we don't close the
   lower-priority circuits immediately: we might decide to use
   them after all if the <complete> circuit goes down before
   {NONPRIMARY_GUARD_IDLE_TIMEOUT} seconds.

4.10. Whenever we get a new consensus. [Section:ON_CONSENSUS]

   We update {GUARDS}.

   For every guard in {SAMPLED_GUARDS}, we update {IS_LISTED} and
   {FIRST_UNLISTED_AT}.

   [**] We remove entries from {SAMPLED_GUARDS} if appropriate,
   according to the sampled-guards expiration rules.  If they were
   in {CONFIRMED_GUARDS}, we also remove them from
   {CONFIRMED_GUARDS}.

   We recompute {FILTERED_GUARDS}, and everything that derives from
   it, including {USABLE_FILTERED_GUARDS}, and {PRIMARY_GUARDS}.

   (Whenever one of the configuration options that affects the
   filter is updated, we repeat the process above, starting at the
   [**] line.)

4.11. Deciding whether to generate a new circuit.
  [Section:NEW_CIRCUIT_NEEDED]

   We generate a new circuit when we don't have
   enough circuits either built or in-progress to handle a given
   stream, or an expected stream.

   For the purpose of this rule, we say that <waiting_for_better_guard>
   circuits are neither built nor in-progress; that <complete>
   circuits are built; and that the other states are in-progress.

4.12. When we are missing descriptors.
   [Section:MISSING_DESCRIPTORS]

   We need either a router descriptor or a microdescriptor in order
   to build a circuit through a guard.  If we do not have such a
   descriptor for a guard, we can still use the guard for one-hop
   directory fetches, but not for longer circuits.

   (Also, when we are missing descriptors for our first
   {NUM_USABLE_PRIMARY_GUARDS} primary guards, we don't build
   circuits at all until we have fetched them.)

A. Appendices

A.0. Acknowledgements

  This research was supported in part by NSF grants CNS-1111539,
  CNS-1314637, CNS-1526306, CNS-1619454, and CNS-1640548.

A.1.  Parameters with suggested values. [Section:PARAM_VALS]

   (All suggested values chosen arbitrarily)

   {param:MAX_SAMPLE_THRESHOLD} -- 20%

   {param:MAX_SAMPLE_SIZE} -- 60

   {param:GUARD_LIFETIME} -- 120 days

   {param:REMOVE_UNLISTED_GUARDS_AFTER} -- 20 days
     [previously ENTRY_GUARD_REMOVE_AFTER]

   {param:MIN_FILTERED_SAMPLE} -- 20

   {param:N_PRIMARY_GUARDS} -- 3

   {param:PRIMARY_GUARDS_RETRY_SCHED}
      -- every 10 minutes for the first six hours,
      -- every 90 minutes for the next 90 hours,
      -- every 4 hours for the next 3 days,
      -- every 9 hours thereafter.

   {param:GUARDS_RETRY_SCHED} --
      -- every hour for the first six hours,
      -- every 4 hours for the 90 hours,
      -- every 18 hours for the next 3 days,
      -- every 36 hours thereafter.

   {param:INTERNET_LIKELY_DOWN_INTERVAL} -- 10 minutes

   {param:NONPRIMARY_GUARD_CONNECT_TIMEOUT} -- 15 seconds

   {param:NONPRIMARY_GUARD_IDLE_TIMEOUT} -- 10 minutes

   {param:MEANINGFUL_RESTRICTION_FRAC} -- .2

   {param:EXTREME_RESTRICTION_FRAC} -- .01

   {param:GUARD_CONFIRMED_MIN_LIFETIME} -- 60 days

   {param:NUM_USABLE_PRIMARY_GUARDS} -- 1

   {param:NUM_USABLE_PRIMARY_DIRECTORY_GUARDS} -- 3

A.2. Random values [Section:RANDOM]

   Frequently, we want to randomize the expiration time of something
   so that it's not easy for an observer to match it to its start
   time. We do this by randomizing its start date a little, so that
   we only need to remember a fixed expiration interval.

   By RAND(now, INTERVAL) we mean a time between now and INTERVAL in
   the past, chosen uniformly at random.


A.3. Why not a sliding scale of primaryness? [Section:CVP]

   At one meeting, I floated the idea of having "primaryness" be a
   continuous variable rather than a boolean.

   I'm no longer sure this is a great idea, but I'll try to outline
   how it might work.

   To begin with: being "primary" gives it a few different traits:

      1) We retry primary guards more frequently. [Section:RETRYING]

      2) We don't even _try_ building circuits through
         lower-priority guards until we're pretty sure that the
         higher-priority primary guards are down. (With non-primary
         guards, on the other hand, we launch exploratory circuits
         which we plan not to use if higher-priority guards
         succeed.) [Section:SELECTING]

      3) We retry them all one more time if a circuit succeeds after
         the net has been down for a while. [Section:ON_SUCCESS]

   We could make each of the above traits continuous:

      1) We could make the interval at which a guard is retried
         depend continuously on its position in CONFIRMED_GUARDS.

      2) We could change the number of guards we test in parallel
         based on their position in CONFIRMED_GUARDS.

      3) We could change the rule for how long the higher-priority
         guards need to have been down before we call a
         <usable_if_no_better_guard> circuit <complete> based on a
         possible network-down condition.  For example, we could
         retry the first guard if we tried it more than 10 seconds
         ago, the second if we tried it more than 20 seconds ago,
         etc.

   I am pretty sure, however, that if these are worth doing, they
   need more analysis!  Here's why:

      * They all have the potential to leak more information about a
        guard's exact position on the list.  Is that safe? Is there
        any way to exploit that?  I don't think we know.

      * They all seem like changes which it would be relatively
        simple to make to the code after we implement the simpler
        version of the algorithm described above.

A.4. Controller changes

   We will add to control-spec.txt a new possible circuit state, GUARD_WAIT,
   that can be given as part of circuit events and GETINFO responses about
   circuits.  A circuit is in the GUARD_WAIT state when it is fully built,
   but we will not use it because a circuit with a better guard might
   become built too.

A.5. Persistent state format

   The persistent state format doesn't need to be part of this
   specification, since different implementations can do it
   differently. Nonetheless, here's the one Tor uses:

   The "state" file contains one Guard entry for each sampled guard
   in each instance of the guard state (see section 2).  The value
   of this Guard entry is a set of space-separated K=V entries,
   where K contains any nonspace character except =, and V contains
   any nonspace characters.

   Implementations must retain any unrecognized K=V entries for a
   sampled guard when they regenerate the state file.

   The order of K=V entries is not allowed to matter.

   Recognized fields (values of K) are:

        "in" -- the name of the guard state instance that this
        sampled guard is in.  If a sampled guard is in two guard
        states instances, it appears twice, with a different "in"
        field each time. Required.

        "rsa_id" -- the RSA id digest for this guard, encoded in
        hex. Required.

        "bridge_addr" -- If the guard is a bridge, its configured address and
        port (this can be the ORPort or a pluggable transport port). Optional.

        "nickname" -- the guard's nickname, if any. Optional.

        "sampled_on" -- the date when the guard was sampled. Required.

        "sampled_by" -- the Tor version that sampled this guard.
        Optional.

        "unlisted_since" -- the date since which the guard has been
        unlisted. Optional.

        "listed" -- 0 if the guard is not listed; 1 if it is. Required.

        "confirmed_on" -- date when the guard was
        confirmed. Optional.

        "confirmed_idx" -- position of the guard in the confirmed
        list. Optional.

        "pb_use_attempts", "pb_use_successes", "pb_circ_attempts",
        "pb_circ_successes", "pb_successful_circuits_closed",
        "pb_collapsed_circuits", "pb_unusable_circuits",
        "pb_timeouts" -- state for the circuit path bias algorithm,
        given in decimal fractions. Optional.

   All dates here are given as a (spaceless) ISO8601 combined date
   and time in UTC (e.g., 2016-11-29T19:39:31).


TODO. Still non-addressed issues [Section:TODO]

   Simulate to answer:  Will this work in a dystopic world?

   Simulate actual behavior.

   For all lifetimes: instead of storing the "this began at" time,
   store the "remove this at" time, slightly randomized.

   Clarify that when you get a <complete> circuit, you might need to
   relaunch circuits through that same guard immediately, if they
   are circuits that have to be independent.


   Fix all items marked XX or TODO.

   "Directory guards" -- do they matter?

       Suggestion: require that all guards support downloads via BEGINDIR.
       We don't need to worry about directory guards for relays, since we
       aren't trying to prevent relay enumeration.

   IP version preferences via ClientPreferIPv6ORPort

       Suggestion: Treat it as a preference when adding to
       {CONFIRMED_GUARDS}, but not otherwise.