aboutsummaryrefslogtreecommitdiff
path: root/proposals/247-hs-guard-discovery.txt
blob: a42161b2a4e049205e74caa74d3666b170fecb55 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
```
Filename: 247-hs-guard-discovery.txt
Title: Defending Against Guard Discovery Attacks using Vanguards
Authors: George Kadianakis and Mike Perry
Created: 2015-07-10
Status: Superseded
Superseded-by: 292-mesh-vanguards.txt

[This proposal is superseded by proposal 292-mesh-vanguards.txt based on our
analysis and experiences while implementing and simulating the vanguard design.]

0. Motivation

  A guard discovery attack allow attackers to determine the guard
  node of a Tor client. The hidden service rendezvous protocol
  provides an attack vector for a guard discovery attack since anyone
  can force an HS to construct a 3-hop circuit to a relay (#9001).

  Following the guard discovery attack with a compromise and/or
  coercion of the guard node can lead to the deanonymization of a
  hidden service.

1. Overview

  This document tries to make the above guard discovery + compromise
  attack harder to launch. It introduces a configuration
  option which makes the hidden service also pin the second and third
  hops of its circuits for a longer duration.

  With this new path selection, we force the adversary to perform a
  Sybil attack and two compromise attacks before succeeding. This is
  an improvement over the current state where the Sybil attack is
  trivial to pull off, and only a single compromise attack is required.

  With this new path selection, an attacker is forced to
  compromise one or more nodes before learning the guard node of a hidden
  service. This increases the uncertainty of the attacker, since
  compromise attacks are costly and potentially detectable, so an
  attacker will have to think twice before beginning a chain of node
  compromise attacks that he might not be able to complete.

1.1. Visuals

  Here is how a hidden service rendezvous circuit currently looks like:

                     -> middle_1 -> middle_A
                     -> middle_2 -> middle_B
                     -> middle_3 -> middle_C
                     -> middle_4 -> middle_D
       HS -> guard   -> middle_5 -> middle_E -> Rendezvous Point
                     -> middle_6 -> middle_F
                     -> middle_7 -> middle_G
                     -> middle_8 -> middle_H
                     ->   ...    ->  ...
                     -> middle_n -> middle_n

  this proposal pins the two middle nodes to a much more restricted
  set, as follows:

                                  -> guard_3A_A
                     -> guard_2_A -> guard_3A_B
                                  -> guard_3A_C -> Rendezvous Point
       HS -> guard_1
                                  -> guard_3B_D
                     -> guard_2_B -> guard_3B_E
                                  -> guard_3B_F -> Rendezvous Point


  Note that the third level guards are partitioned into buckets such that
  they are only used with one specific second-level guard. In this way,
  we ensure that even if an adversary is able to execute a Sybil attack
  against the third layer, they only get to learn one of the second-layer
  Guards, and not all of them. This prevents the adversary from gaining
  the ability to take their pick of the weakest of the second-level
  guards for further attack.

2. Design

  This feature requires the HiddenServiceGuardDiscovery torrc option
  to be enabled.

  When a hidden service picks its guard nodes, it also picks an
  additional NUM_SECOND_GUARDS-sized set of middle nodes for its
  `second_guard_set`. For each of those middle layer guards, it
  picks NUM_THIRD_GUARDS that will be used only with a specific
  middle node. These sets are unique to each hidden service created
  by a single Tor client, and must be kept separate and distinct.

  When a hidden service needs to establish a circuit to an HSDir,
  introduction point or a rendezvous point, it uses nodes from
  `second_guard_set` as the second hop of the circuit and nodes from
  that second hop's corresponding `third_guard_set` as third hops of
  the circuit.

  A hidden service rotates nodes from the 'second_guard_set' at a random
  time between MIN_SECOND_GUARD_LIFETIME hours and
  MAX_SECOND_GUARD_LIFETIME hours.

  A hidden service rotates nodes from the 'third_guard_set' at a random
  time between MIN_THIRD_GUARD_LIFETIME and MAX_THIRD_GUARD_LIFETIME
  hours.

  These extra guard nodes should be picked with the same path selection
  procedure that is used for regular middle nodes (though see Section 4.3
  and Section 5.1 for reasons to restrict this slightly beyond the current
  path selection rules).

  Each node's rotation time is tracked independently, to avoid disclosing
  the rotation times of the primary and second-level guards.

  XXX: IP and RP actually need to be separate 4th hops. On the server side,
  IP should be separate to better unlink IP from the 3rd layer guards,
  and on the client side, the RP needs to come from the full network to
  avoid cross-visit linkability. So it's seven proxies all teh time...

  XXX: What about hsdir fetch? to avoid targeting and visit linkability,
  it needs an emphemeral hop too.. Unless we believe that linkability is low?
  It is lower than IP linkability, since the hsdescs can be cached for a bit.
  But if we are worried about visit linkability, then client should also add
  an extra ephemeral hop during IP visits, making that circuit 8 hops long...

  XXX: Emphemeral hops for service side before RP?

  XXX: Really crazy idea: We can provide multiple path security levels.
  We could have full 4 hops, or combine Layer2+Layer3, or combine Layer1+Layer2
  and Layer3+Layer4 for lower-security HS circs..

  XXX: update the load balancing proposal with the outcome of this :/

  XXX how should proposal 241 ("Resisting guard-turnover attacks") be
      applied here?

2.1. Security parameters

  We set NUM_SECOND_GUARDS to 4 nodes and NUM_THIRD_GUARDS to 4 nodes (ie
  four sets of four). However, see Section 5.2 for some performance
  versus security tradeoffs and discussion.

  We set MIN_SECOND_GUARD_LIFETIME to 1 day, and
  MAX_SECOND_GUARD_LIFETIME to 32 days inclusive, for an average rotation
  rate of ~11 days, using the min(X,X) distribution specified in Section
  3.2.3.

  We set MIN_THIRD_GUARD_LIFETIME to 1 hour, and
  MAX_THIRD_GUARD_LIFETIME to 18 hours inclusive, for an average rotation
  rate of ~12 hours, using the max(X,X) distribution specified in Section
  3.2.3.

  The above parameters should be configurable in the Tor consensus and
  torrc.

  See Section 3 for more analysis on these constants.


3. Rationale and Security Parameter Selection

3.1. Threat model, Assumptions, and Goals

  Consider an adversary with the following powers:

     - Can launch a Sybil guard discovery attack against any node of a
       rendezvous circuit. The slower the rotation period of the node,
       the longer the attack takes. Similarly, the higher the percentage
       of the network is compromised, the faster the attack runs.

     - Can compromise any node on the network, but this compromise takes
       time and potentially even coercive action, and also carries risk
       of discovery.

  We also make the following assumptions about the types of attacks:

  1. A Sybil attack is observable by both people monitoring the network
     for large numbers of new nodes, as well as vigilant hidden service
     operators. It will require either large amounts of traffic sent
     towards the hidden service, multiple test circuits, or both.

  2. A Sybil attack against the second or first layer Guards will be
     more noisy than a Sybil attack against the third layer guard, since the
     second and first layer Sybil attack requires a timing side channel in
     order to determine success, whereas the Sybil success is almost
     immediately obvious to third layer guard, since it will be instructed
     to connect to a cooperating malicious rend point by the adversary.

  3. As soon as the adversary is confident they have won the Sybil attack,
     an even more aggressive circuit building attack will allow them to
     determine the next node very fast (an hour or less).

  4. The adversary is strongly disincentivized from compromising nodes that
     may prove useless, as node compromise is even more risky for the
     adversary than a Sybil attack in terms of being noticed.

  Given this threat model, our security parameters were selected so that
  the first two layers of guards should be hard to attack using a Sybil
  guard discovery attack and hence require a node compromise attack. Ideally,
  we want the node compromise attacks to carry a non-negligible probability of
  being useless to the adversary by the time they complete.

  On the other hand, the outermost layer of guards should rotate fast enough to
  _require_ a Sybil attack.

3.2. Parameter Tuning

3.2.1. Sybil rotation counts for a given number of Guards

  The probability of Sybil success for Guard discovery can be modeled as
  the probability of choosing 1 or more malicious middle nodes for a
  sensitive circuit over some period of time.

  P(At least 1 bad middle) = 1 - P(All Good Middles)
                           = 1 - P(One Good middle)^(num_middles)
                           = 1 - (1 - c/n)^(num_middles)

  c/n is the adversary compromise percentage

  In the case of Vanguards, num_middles is the number of Guards you rotate
  through in a given time period. This is a function of the number of vanguards
  in that position (v), as well as the number of rotations (r).

  P(At least one bad middle) = 1 - (1 - c/n)^(v*r)

  Here's detailed tables in terms of the number of rotations required for
  a given Sybil success rate for certain number of guards.

  1.0% Network Compromise:
   Sybil Success   One   Two  Three  Four  Five  Six  Eight  Nine  Ten  Twelve  Sixteen
    10%            11     6     4     3     3     2     2     2     2     1       1
    15%            17     9     6     5     4     3     3     2     2     2       2
    25%            29    15    10     8     6     5     4     4     3     3       2
    50%            69    35    23    18    14    12     9     8     7     6       5
    60%            92    46    31    23    19    16    12    11    10     8       6
    75%           138    69    46    35    28    23    18    16    14    12       9
    85%           189    95    63    48    38    32    24    21    19    16      12
    90%           230   115    77    58    46    39    29    26    23    20      15
    95%           299   150   100    75    60    50    38    34    30    25      19
    99%           459   230   153   115    92    77    58    51    46    39      29

  5.0% Network Compromise:
   Sybil Success   One   Two  Three  Four  Five  Six  Eight  Nine  Ten  Twelve  Sixteen
    10%             3     2     1     1     1     1     1     1     1     1       1
    15%             4     2     2     1     1     1     1     1     1     1       1
    25%             6     3     2     2     2     1     1     1     1     1       1
    50%            14     7     5     4     3     3     2     2     2     2       1
    60%            18     9     6     5     4     3     3     2     2     2       2
    75%            28    14    10     7     6     5     4     4     3     3       2
    85%            37    19    13    10     8     7     5     5     4     4       3
    90%            45    23    15    12     9     8     6     5     5     4       3
    95%            59    30    20    15    12    10     8     7     6     5       4
    99%            90    45    30    23    18    15    12    10     9     8       6

  10.0% Network Compromise:
   Sybil Success   One   Two  Three  Four  Five  Six  Eight  Nine  Ten  Twelve  Sixteen
    10%             2     1     1     1     1     1     1     1     1     1       1
    15%             2     1     1     1     1     1     1     1     1     1       1
    25%             3     2     1     1     1     1     1     1     1     1       1
    50%             7     4     3     2     2     2     1     1     1     1       1
    60%             9     5     3     3     2     2     2     1     1     1       1
    75%            14     7     5     4     3     3     2     2     2     2       1
    85%            19    10     7     5     4     4     3     3     2     2       2
    90%            22    11     8     6     5     4     3     3     3     2       2
    95%            29    15    10     8     6     5     4     4     3     3       2
    99%            44    22    15    11     9     8     6     5     5     4       3

  The rotation counts in these tables were generated with:
     def num_rotations(c, v, success):
       r = 0
       while 1-math.pow((1-c), v*r) < success: r += 1
       return r

3.2.2. Rotation Period

  As specified in Section 3.1, the primary driving force for the third
  layer selection was to ensure that these nodes rotate fast enough that
  it is not worth trying to compromise them, because it is unlikely for
  compromise to succeed and yield useful information before the nodes stop
  being used. For this reason we chose 1 to 18 hours, with a weighted
  distribution (Section 3.2.3) causing the expected average to be 12 hours.

  From the table in Section 3.2.1, with NUM_SECOND_GUARDS=4 and
  NUM_THIRD_GUARDS=4, it can be seen that this means that the Sybil attack
  will complete with near-certainty (99%) in 29*12 hours (14.5 days) for
  the 1% adversary, 3 days for the 5% adversary, and 1.5 days for the 10%
  adversary.

  Since rotation of each node happens independently, the distribution of
  when the adversary expects to win this Sybil attack in order to discover
  the next node up is uniform. This means that on average, the adversary
  should expect that half of the rotation period of the next node is already
  over by the time that they win the Sybil.

  With this fact, we choose our range and distribution for the second
  layer rotation to be short enough to cause the adversary to risk
  compromising nodes that are useless, yet long enough to require a
  Sybil attack to be noticeable in terms of client activity. For this
  reason, we choose a minimum second-layer guard lifetime of 1 day,
  since this gives the adversary a minimum expected value of 12 hours for
  during which they can compromise a guard before it might be rotated.
  If the total expected rotation rate is 11 days, then the adversary can
  expect overall to have 5.5 days remaining after completing their Sybil
  attack before a second-layer guard rotates away.

3.2.3. Rotation distributions

  In order to skew the distribution of the third layer guard towards
  higher values, we use max(X,X) for the distribution, where X is a
  random variable that takes on values from the uniform distribution.

  In order to skew the distribution of the second layer guard towards
  low values (to increase the risk of compromising useless nodes) we
  skew the distribution towards lower values, using min(X,X).

  Here's a table of expectation (arithmetic means) for relevant
  ranges of X (sampled from 0..N-1). The table was generated with the
  following python functions:

  def ProbMinXX(N, i): return (2.0*(N-i)-1)/(N*N)
  def ProbMaxXX(N, i): return (2.0*i+1)/(N*N)

  def ExpFn(N, ProbFunc):
    exp = 0.0
    for i in xrange(N): exp += i*ProbFunc(N, i)
    return exp

  The current choice for second-layer guards is noted with **, and
  the current choice for third-layer guards is noted with ***.

   Range  Exp[Min(X,X)]   Exp[Max(X,X)]
   10        2.85            6.15
   11        3.18            6.82
   12        3.51            7.49
   13        3.85            8.15
   14        4.18            8.82
   15        4.51            9.49
   16        4.84            10.16
   17        5.18            10.82***
   18        5.51            11.49
   19        5.84            12.16
   20        6.18            12.82
   21        6.51            13.49
   22        6.84            14.16
   23        7.17            14.83
   24        7.51            15.49
   25        7.84            16.16
   26        8.17            16.83
   27        8.51            17.49
   28        8.84            18.16
   29        9.17            18.83
   30        9.51            19.49
   31        9.84            20.16
   32        10.17**         20.83
   33        10.51           21.49
   34        10.84           22.16
   35        11.17           22.83
   36        11.50           23.50
   37        11.84           24.16
   38        12.17           24.83
   39        12.50           25.50

  The Cumulative Density Function (CDF) tells us the probability that a
  guard will no longer be in use after a given number of time units have
  passed.

  Because the Sybil attack on the third node is expected to complete at any
  point in the second node's rotation period with uniform probability, if we
  want to know the probability that a second-level Guard node will still be in
  use after t days, we first need to compute the probability distribution of
  the rotation duration of the second-level guard at a uniformly random point
  in time. Let's call this P(R=r).

  For P(R=r), the probability of the rotation duration depends on the selection
  probability of a rotation duration, and the fraction of total time that
  rotation is likely to be in use. This can be written as:

  P(R=r) = ProbMinXX(X=r)*r / \sum_{i=1}^N ProbMinXX(X=i)*i

  or in Python:

  def ProbR(N, r, ProbFunc=ProbMinXX):
     return ProbFunc(N, r)*r/ExpFn(N, ProbFunc)

  For the full CDF, we simply sum up the fractional probability density for
  all rotation durations. For rotation durations less than t days, we add the
  entire probability mass for that period to the density function. For
  durations d greater than t days, we take the fraction of that rotation
  period's selection probability and multiply it by t/d and add it to the
  density. In other words:

  def FullCDF(N, t, ProbFunc=ProbR):
    density = 0.0
    for d in xrange(N):
      if t >= d: density += ProbFunc(N, d)
      # The +1's below compensate for 0-indexed arrays:
      else: density += ProbFunc(N, d)*(float(t+1))/(d+1)
    return density

  Computing this yields the following distribution for our current parameters:

   t          P(SECOND_ROTATION <= t)
   1               0.07701
   2               0.15403
   3               0.22829
   4               0.29900
   5               0.36584
   6               0.42869
   7               0.48754
   8               0.54241
   9               0.59338
  10               0.64055
  11               0.68402
  12               0.72392
  13               0.76036
  14               0.79350
  15               0.82348
  16               0.85043
  17               0.87452
  18               0.89589
  19               0.91471
  20               0.93112
  21               0.94529
  22               0.95738
  23               0.96754
  24               0.97596
  25               0.98278
  26               0.98817
  27               0.99231
  28               0.99535
  29               0.99746
  30               0.99881
  31               0.99958
  32               0.99992
  33               1.00000

  This CDF tells us that for the second-level Guard rotation, the
  adversary can expect that 7.7% of the time, their third-level Sybil
  attack will provide them with a second-level guard node that has only
  1 day remaining before it rotates. 15.4% of the time, there will
  be only 2 day or less remaining, and 22.8% of the time, 3 days or less.

  Note that this distribution is still a day-resolution approximation. The
  actual numbers are likely even more biased towards lower values.

  In this way, we achieve our goal of ensuring that the adversary must
  do the prep work to compromise multiple second-level nodes before
  likely being successful, or be extremely fast in compromising a
  second-level guard after winning the Sybil attack.


4. Security concerns and mitigations

4.1. Mitigating fingerprinting of new HS circuits

  By pinning the middle nodes of rendezvous circuits, we make it
  easier for all hops of the circuit to detect that they are part of a
  special hidden service circuit with varying degrees of certainty.

  The Guard node is able to recognize a Vanguard client with a high
  degree of certainty because it will observe a client IP creating the
  overwhelming majority of its circuits to just a few middle nodes in
  any given 10-18 day time period.

  The middle nodes will be able to tell with a variable certainty that
  depends on both its traffic volume and upon the popularity of the
  service, because they will see a large number of circuits that tend to
  pick the same Guard and Exit.

  The final nodes will be able to tell with a similar level of certainty
  that depends on their capacity and the service popularity, because they
  will see a lot of rend handshakes that all tend to have the same second
  hop. The final nodes can also actively confirm that they have been
  selected for the third hop by creating multiple Rend circuits to a
  target hidden service, and seeing if they are chosen for the Rend point.

  The most serious of these is the Guard fingerprinting issue. When
  proposal 254-padding-negotiation is implemented, services that enable
  this feature should use those padding primitives to create fake circuits
  to random middle nodes that are not their guards, in an attempt to look
  more like a client.

  Additionally, if Tor Browser implements "virtual circuits" based on
  SOCKS username+password isolation in order to enforce the re-use of
  paths when SOCKS username+passwords are re-used, then the number of
  middle nodes in use during a typical user's browsing session will be
  proportional to the number of sites they are viewing at any one time.
  This is likely to be much lower than one new middle node every ten
  minutes, and for some users, may be close to the number of Vanguards
  we're considering.

  This same reasoning is also an argument for increasing the number of
  second-level guards beyond just two, as it will spread the hidden
  service's traffic over a wider set of middle nodes, making it both
  easier to cover, and behave closer to a client using SOCKS virtual
  circuit isolation.

4.2. Hidden service linkability

  Multiple hidden services on the same Tor instance should use separate
  second and third level guard sets; otherwise an adversary is trivially
  able to determine that the two hidden services are co-located by
  inspecting their current chosen rend point nodes.

  Unfortunately, if the adversary is still able to determine that two or
  more hidden services are run on the same Tor instance through some other
  means, then they are able to take advantage of this fact to execute a
  Sybil attack more effectively, since there will now be an extra set of
  guard nodes for each hidden service in use.

  For this reason, if Vanguards are enabled, and more than one hidden
  service is configured, the user should be advised to ensure that they do
  not accidentally leak that the two hidden services are from the same Tor
  instance.

  For cases where the user or application wants to deliberately link multiple
  different hidden services together (for example, to support concurrent file
  transfer and chat for the same identity), this behavior should be
  configurable. A torrc option DisjointHSVanguards should be provided that
  defaults to keeping the Vanguards separate for each hidden service.

4.3. Long term information leaks

  Due to Tor's path selection constraints, the client will never choose
  its primary guard node as later positions in the circuit. Over time,
  the absence of these nodes will give away information to the adversary.

  Unfortunately, the current solution (from bug #14917) of simply creating
  a temporary second guard connection to allow the primary guard to appear
  in some paths will make the hidden service fingerprinting problem worse,
  since only hidden services will exhibit this behavior on the local
  network. The simplest mitigation is to require that no Guard-flagged nodes
  be used for the second and third-level nodes at all, and to allow the
  primary guard to be chosen as a rend point.

  XXX: Dgoulet suggested using arbitrary subsets here rather than the
  no Guard-flag restriction, esp since Layer2 inference is still a
  possibility.

  XXX: If a Guard-flagged node is chosen for the alls IP or RP, raise
  protocolerror. Refuse connection. Or allow our guard/other nodes in
  IP/RP..

  Additionally, in order to further limit the exposure of secondary guards
  to sybil attacks, the bin position of the third-level guards should be
  stable over long periods of time. When choosing third-level guards, these
  guards should be given a fixed bin number so that if they are selected
  at a later point in the future, they are placed after the same
  second-level guard, and not a different one. A potential stateless way
  of accomplishing this is to assign third-level guards to a bin number
  such that H(bin_number | HS addr) is closest to the key for the
  third-level relay.

4.4. Denial of service

  Since it will be fairly trivial for the adversary to enumerate the
  current set of third-layer guards for a hidden service, denial of
  service becomes a serious risk for Vanguard users.

  For this reason, it is important to support a large number of
  third-level guards, to increase the amount of resources required to
  bring a hidden service offline by DoSing just a few Tor nodes.

  Even with multiple third-level guards, an adversary is still able to
  degrade either performance or user experience significantly, simply by
  taking out a fraction of them. The solution to this is to make use
  of the circuit build timeout code (Section 5.2) to have the hidden
  service retry the rend connection multiple times. Unfortunately, it is
  unwise to simply replace unresponsive third-level guards that fail to
  complete circuits, as this will accelerate the Sybil attack.

4.5. Path Bias

XXX: Re-use Prop#259 here.


5. Performance considerations

  The switch to a restricted set of nodes will very likely cause
  significant performance issues, especially for high-traffic hidden
  services. If any of the nodes they select happen to be temporarily
  overloaded, performance will suffer dramatically until the next
  rotation period.

5.1. Load Balancing

  Since the second and third level "guards" are chosen from the set of all
  nodes eligible for use in the "middle" hop (as per hidden services
  today), this proposal should not significantly affect the long-term load
  on various classes of the Tor network, and should not require any
  changes to either the node weight equations, or the bandwidth
  authorities.

  Unfortunately, transient load is another matter, as mentioned
  previously. It is very likely that this scheme will increase instances
  of transient overload at nodes selected by high-traffic hidden services.

  One option to reduce the impact of this transient overload is to
  restrict the set of middle nodes that we choose from to some percentage
  of the fastest middle-capable relays in the network. This may have
  some impact on load balancing, but since the total volume of hidden
  service traffic is low, it may be unlikely to matter.

5.2. Circuit build timeout and topology

  The adaptive circuit build timeout mechanism in Tor is what corrects
  for instances of transient node overload right now.

  The timeout will naturally tend to select the current fastest and
  least-loaded paths even through this set of restricted routes, but it
  may fail to behave correctly if there are a very small set of nodes in
  each guard set, as it is based upon assumptions about the current path
  selection algorithm, and it may need to be tuned specifically for
  Vanguards, especially if the set of possible routes is small.

  It turns out that a fully-connected/mesh (aka non-binned) second guard to
  third guard mapping topology is a better option for CBT for performance,
  because it will create a larger total set of paths for CBT to choose
  from while using fewer nodes.

  This comes at the expense of exposing all second-layer guards to a
  single sybil attack, but for small numbers of guard sets, it may be
  worth the tradeoff. However, it also turns out that this need not block
  implementation, as worst-case the data structures and storage needed to
  support a fully connected mesh topology can do so by simply replicating
  the same set of third-layer guards for each second-layer guard bin.

  Since we only expect this tradeoff to be worth it when the sets are
  small, this replication should not be expensive in practice.

5.3. OnionBalance

  At first glance, it seems that this scheme makes multi-homed hidden
  services such as OnionBalance[1] even more important for high-traffic
  hidden services.

  Unfortunately, if it is equally damaging to the user for any of their
  multi-homed hidden service locations to be discovered, then OnionBalance
  is strictly equivalent to simply increasing the number of second-level
  guard nodes in use, because an active adversary can perform simultaneous
  Sybil attacks against all of the rend points offered by the multi-homed
  OnionBalance introduction points.

  XXX: This actually matters for high-perf censorship resistant publishing.
  It is better for those users to use onionbalance than to up their guards,
  since redundancy is useful for them.

5.4. Default vs optional behavior

  We suggest this torrc option to be optional because it changes path
  selection in a way that may seriously impact hidden service performance,
  especially for high traffic services that happen to pick slow guard
  nodes.

  However, by having this setting be disabled by default, we make hidden
  services who use it stand out a lot. For this reason, we should in fact
  enable this feature globally, but only after we verify its viability for
  high-traffic hidden services, and ensure that it is free of second-order
  load balancing effects.

  Even after that point, until Single Onion Services are implemented,
  there will likely still be classes of very high traffic hidden services
  for whom some degree of location anonymity is desired, but for which
  performance is much more important than the benefit of Vanguards, so there
  should always remain a way to turn this option off.


6. Future directions

  Here are some more ideas for improvements that should be done sooner
  or later:

  - Do we want to consider using Tor's GeoIP country database (if present)
    to ensure that the second-layer guards are chosen from a different
    country as the first-layer guards, or does this leak too much information
    to the adversary?

  - What does the security vs performance tradeoff actually look like
    for different amounts of bins? Or for mesh vs bins? We may need
    to simulate or run CBT tests to learn this.

  - With this tradeoff information, do we want to provide the user
    (or application) with a choice of 3 different Vanguard sets?
    One could imagine "small", "medium", and "large", for example.


7. Acknowledgments

 Thanks to Aaron Johnson, John Brooks, Mike Perry and everyone else
 who helped with this idea.

 This research was supported in part by NSF grants CNS-1111539,
 CNS-1314637, CNS-1526306, CNS-1619454, and CNS-1640548.


Appendix A: Full Python program for generating tables in this proposal

#!/usr/bin/python
import math

############ Section 3.2.1 #################
def num_rotations(c, v, success):
  i = 0
  while 1-math.pow((1-c), v*i) < success: i += 1
  return i

def rotation_line(c, pct):
  print "    %2d%%        %6d%6d%6d%6d%6d%6d%6d%6d%6d%6d%8d" % \
     (pct, num_rotations(c, 1, pct/100.0), num_rotations(c, 2, pct/100.0), \
      num_rotations(c, 3, pct/100.0), num_rotations(c, 4, pct/100.0),
      num_rotations(c, 5, pct/100.0), num_rotations(c, 6, pct/100.0),
      num_rotations(c, 8, pct/100.0), num_rotations(c, 9, pct/100.0),
      num_rotations(c, 10, pct/100.0), num_rotations(c, 12, pct/100.0),
      num_rotations(c, 16, pct/100.0))

def rotation_table_321():
  for c in [1,5,10]:
    print "\n  %2.1f%% Network Compromise: " % c
    print "   Sybil Success   One   Two  Three  Four  Five  Six  Eight  Nine  Ten  Twelve  Sixteen"
    for success in [10,15,25,50,60,75,85,90,95,99]:
      rotation_line(c/100.0, success)

############ Section 3.2.3 #################
def ProbMinXX(N, i): return (2.0*(N-i)-1)/(N*N)
def ProbMaxXX(N, i): return (2.0*i+1)/(N*N)

def ExpFn(N, ProbFunc):
  exp = 0.0
  for i in xrange(N): exp += i*ProbFunc(N, i)
  return exp

def ProbR(N, r, ProbFunc=ProbMinXX):
  return ProbFunc(N, r)*r/ExpFn(N, ProbFunc)

def FullCDF(N, t, ProbFunc=ProbR):
  density = 0.0
  for d in xrange(N):
    if t >= d: density += ProbFunc(N, d)
    # The +1's below compensate for 0-indexed arrays:
    else: density += ProbFunc(N, d)*float(t+1)/(d+1)
  return density

def expectation_table_323():
  print "\n   Range  Min(X,X)   Max(X,X)"
  for i in xrange(10,40):
    print "   %2d      %2.2f       %2.2f" % (i, ExpFn(i,ProbMinXX), ExpFn(i, ProbMaxXX))

def CDF_table_323():
  print "\n   t          P(SECOND_ROTATION <= t)"
  for i in xrange(1,34):
    print "  %2d               %2.5f" % (i, FullCDF(33, i-1))

########### Output ############

# Section 3.2.1
rotation_table_321()

# Section 3.2.3
expectation_table_323()
CDF_table_323()


----------------------

1. https://onionbalance.readthedocs.org/en/latest/design.html#overview
```