spec/path-spec/detecting-route-manipulation.md



<a id="path-spec.txt-7"></a>

# Detecting route manipulation by Guard nodes (Path Bias) {#pathbias}

The Path Bias defense protects against a type of route capture in
which malicious Guard nodes deliberately fail or choke circuits that
extend to non-colluding Exit nodes, so that their network capacity is
spent carrying only compromised traffic.

In the extreme, the attack allows an adversary that carries c/n
of the network capacity to deanonymize c/n of the network
connections, breaking the O((c/n)^2) property of Tor's original
threat model. It also allows targeted attacks aimed at monitoring
the activity of specific users, bridges, or Guard nodes.
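To make the difference between the two fractions concrete, here is a minimal arithmetic sketch. The function names and the 10% capacity share are illustrative assumptions, and the model treats Guard and Exit selection as independent and weighted only by capacity, which is a simplification.

```python
def baseline_fraction(c: float, n: float) -> float:
    """Share of circuits with an adversarial Guard and Exit when the
    adversary does not interfere with path selection: (c/n)^2."""
    share = c / n
    return share * share


def path_bias_fraction(c: float, n: float) -> float:
    """Share of connections exposed in the extreme case of the attack,
    where every circuit through an adversarial Guard that reaches an
    honest Exit is failed and retried: c/n."""
    return c / n


if __name__ == "__main__":
    c, n = 10.0, 100.0  # hypothetical: adversary carries 10% of capacity
    print("without attack:", baseline_fraction(c, n))
    print("with attack:   ", path_bias_fraction(c, n))
```

With a 10% capacity share, the exposed fraction rises from about 1% to 10%.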

There are two points where path selection can be manipulated:
during construction, and during usage. Circuit construction
can be manipulated by inducing circuit failures during circuit
extend steps, which causes the Tor client to transparently retry
the circuit construction with a new path. Circuit usage can be
manipulated by abusing the stream retry features of Tor (for
example, by withholding stream attempt responses from the client
until the stream timeout has expired), at which point the Tor client
will also transparently retry the stream on a new path.

The defense as deployed therefore makes two independent sets of
measurements of successful path use: one during circuit construction,
and one during circuit usage.

The intended behavior is for clients to ultimately disable the use
of Guards responsible for excessive circuit failure of either type
(for the parameters that control this, see ["Parametrization"](#parameters) below).
However, known issues with the Tor network currently
restrict the defense to being informational only
(see ["Known barriers to enforcement"](#barriers)).

<a id="path-spec.txt-7.1"></a>

## Measuring path construction success rates {#construction-success-rate}

Clients maintain two counts for each of their guards: a count of the
number of times a circuit was extended to at least two hops through that
guard, and a count of the number of circuits that successfully complete
through that guard. The ratio of these two numbers is used to determine
a circuit success rate for that Guard.
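The two counters and their ratio can be sketched as follows. The class and method names are hypothetical and are not taken from the Tor source. The counters are floats because the scaling step described later multiplies them by a fraction.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GuardBuildCounts:
    """Hypothetical per-Guard construction counters."""
    attempts: float = 0.0   # circuits extended to at least two hops
    successes: float = 0.0  # circuits that completed through this Guard

    def note_second_hop(self) -> None:
        # Called once when a circuit through this Guard reaches hop two.
        self.attempts += 1

    def note_completed(self) -> None:
        # Called once when a circuit through this Guard finishes building.
        self.successes += 1

    def success_rate(self) -> Optional[float]:
        # No rate is defined until at least one attempt has been counted.
        if self.attempts == 0:
            return None
        return self.successes / self.attempts
```

For example, ten counted attempts with seven completions give a success rate of 0.7 for that Guard.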

[Circuit build timeouts](./learning-timeouts.md)
are counted as construction failures if the
circuit fails to complete before the 95% "right-censored" timeout
interval, not the 80% timeout condition.

If a circuit closes prematurely after construction but before being
requested to close by the client, this is counted as a failure.

<a id="path-spec.txt-7.2"></a>

## Measuring path usage success rates {#usage-success-rate}

Clients maintain two usage counts for each of their guards: a count
of the number of usage attempts, and a count of the number of
successful usages.

A usage attempt means any attempt to attach a stream to a circuit.

Usage success status is temporarily recorded by state flags on circuits.
Guard usage success counts are not incremented until circuit close. A
circuit is marked as successfully used if we receive a properly
recognized RELAY cell on that circuit that was expected for the current
circuit purpose.

If subsequent stream attachments fail or time out, the successfully used
state of the circuit is cleared, causing it once again to be regarded
as a usage attempt only.

Upon close by the client, all circuits that are still marked as usage
attempts are probed using a RELAY_BEGIN cell constructed with a
destination of the form 0.a.b.c:25, where a.b.c is a 24-bit random
nonce. If we get a RELAY_COMMAND_END in response matching our nonce,
the circuit is counted as successfully used.
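As an illustration, the probe destination could be built from a 24-bit nonce as shown below. The function name and the byte order used to map the nonce onto a, b, and c are assumptions made for this sketch. The text above only fixes the overall 0.a.b.c:25 form.

```python
import secrets

PROBE_PORT = 25  # port given in the text above


def make_probe_target(nonce=None):
    """Return (nonce, "0.a.b.c:25") for a 24-bit nonce.

    A fresh random nonce is drawn when none is supplied.
    """
    if nonce is None:
        nonce = secrets.randbits(24)
    if not 0 <= nonce < (1 << 24):
        raise ValueError("nonce must fit in 24 bits")
    # Assumed mapping: most significant byte first.
    a = (nonce >> 16) & 0xFF
    b = (nonce >> 8) & 0xFF
    c = nonce & 0xFF
    return nonce, f"0.{a}.{b}.{c}:{PROBE_PORT}"
```

The client would remember the nonce and compare it against the END response, so that a reply to some other stream cannot be mistaken for the probe reply.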

If any unrecognized RELAY cells arrive after the probe has been sent,
the circuit is counted as a usage failure.

If the stream failure reason codes DESTROY, TORPROTOCOL, or INTERNAL
are received in response to any stream attempt, such circuits are not
probed and are declared usage failures.

Prematurely closed circuits are not probed, and are counted as usage
failures.
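The close-time rules for a circuit that is still marked as a usage attempt can be summarized in a small decision function. This is a sketch of the rules as written above, not the Tor implementation. The names are hypothetical, and treating a probe that never receives a matching END as a failure is an assumption, since the text does not state that case explicitly.

```python
from typing import Optional, Tuple

# Stream failure reasons that rule out probing, per the text above.
FATAL_END_REASONS = frozenset({"DESTROY", "TORPROTOCOL", "INTERNAL"})


def classify_attempted_circuit(
    closed_prematurely: bool,
    end_reason: Optional[str],
    probe_got_matching_end: bool,
    stray_cell_after_probe: bool,
) -> Tuple[bool, bool]:
    """Return (probe_sent, counted_as_success) for a usage-attempt circuit."""
    if end_reason in FATAL_END_REASONS:
        return (False, False)  # not probed, usage failure
    if closed_prematurely:
        return (False, False)  # not probed, usage failure
    # Otherwise the client is closing the circuit itself and probes it.
    if stray_cell_after_probe:
        return (True, False)   # unrecognized RELAY cell after the probe
    # Assumption: an unanswered or mismatched probe counts as a failure.
    return (True, probe_got_matching_end)
```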

<a id="path-spec.txt-7.3"></a>

## Scaling success counts {#scaling}

To provide a moving average of recent Guard activity while
still preserving the ability to verify correctness, we periodically
"scale" the success counts by multiplying them by a scale factor
between 0 and 1.0.

Scaling is performed when either usage or construction attempt counts
exceed a parametrized value.

To avoid error due to scaling during circuit construction and use,
currently open circuits are subtracted from the usage counts before
scaling, and added back after scaling.
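The scaling step for a single counter can be sketched as follows. The function names are hypothetical. The default ratio of 1/2 and the default threshold of 100 are the pb_multfactor/pb_scalefactor and pb_scaleuse defaults from the Parametrization section, and which counters have open circuits subtracted is left to the caller here.

```python
def should_scale(attempts: float, threshold: int = 100) -> bool:
    """Scaling is triggered once the attempt count exceeds the threshold."""
    return attempts > threshold


def scale_count(count: float, open_circuits: int,
                mult_factor: int = 1, scale_factor: int = 2) -> float:
    """Scale a counter while leaving currently open circuits untouched."""
    if scale_factor <= 0:
        raise ValueError("scale_factor must be positive")
    ratio = mult_factor / scale_factor
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("scale ratio must lie between 0 and 1.0")
    settled = count - open_circuits    # remove circuits still in flight
    return settled * ratio + open_circuits  # scale, then add them back
```

With 110 counted attempts, 10 of which belong to circuits that are still open, the scaled count is (110 - 10) * 1/2 + 10 = 60. Excluding the open circuits keeps each of them worth one full attempt when it eventually closes.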

<a id="path-spec.txt-7.4"></a>

## Parametrization {#parameters}

The following consensus parameters tune various aspects of the
defense.

```text
     pb_mincircs
       Default: 150
       Min: 5
       Effect: This is the minimum number of circuits that must complete
               at least 2 hops before we begin evaluating construction rates.

     pb_noticepct
       Default: 70
       Min: 0
       Max: 100
       Effect: If the circuit success rate falls below this percentage,
               we emit a notice log message.

     pb_warnpct
       Default: 50
       Min: 0
       Max: 100
       Effect: If the circuit success rate falls below this percentage,
               we emit a warn log message.

     pb_extremepct
       Default: 30
       Min: 0
       Max: 100
       Effect: If the circuit success rate falls below this percentage,
               we emit a more alarmist warning log message. If
               pb_dropguards is set to 1, we also disable the use of the
               guard.

     pb_dropguards
       Default: 0
       Min: 0
       Max: 1
       Effect: If the circuit success rate falls below pb_extremepct,
               when pb_dropguards is set to 1, we disable use of that
               guard.

     pb_scalecircs
       Default: 300
       Min: 10
       Effect: After this many circuits have completed at least two hops,
               Tor performs the scaling described in
               ["Scaling success counts"](#scaling).

     pb_multfactor and pb_scalefactor
       Default: 1/2
       Min: 0.0
       Max: 1.0
       Effect: The double-precision result obtained from
               pb_multfactor/pb_scalefactor is multiplied by our current
               counts to scale them.

     pb_minuse
       Default: 20
       Min: 3
       Effect: This is the minimum number of circuits that we must attempt to
               use before we begin evaluating usage rates.

     pb_noticeusepct
       Default: 80
       Min: 3
       Effect: If the circuit usage success rate falls below this percentage,
               we emit a notice log message.

     pb_extremeusepct
       Default: 60
       Min: 3
       Effect: If the circuit usage success rate falls below this percentage,
               we emit a warning log message. We also disable the use of the
               guard if pb_dropguards is set.

     pb_scaleuse
       Default: 100
       Min: 10
       Effect: After we have attempted to use this many circuits,
               Tor performs the scaling described in
               ["Scaling success counts"](#scaling).
```
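As a rough illustration of how the construction thresholds interact, the sketch below maps a Guard's counts to a log level and a drop decision. The class and function names are hypothetical, and so is the choice to start evaluating once the attempt count reaches pb_mincircs (rather than strictly exceeds it). The default values are the ones listed above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class BuildBiasParams:
    """Defaults copied from the parameter list above."""
    mincircs: int = 150
    noticepct: int = 70
    warnpct: int = 50
    extremepct: int = 30
    dropguards: bool = False


def evaluate_build_rate(
    attempts: float,
    successes: float,
    params: BuildBiasParams = BuildBiasParams(),
) -> Tuple[Optional[str], bool]:
    """Return (log_level, drop_guard) for one Guard's construction counts."""
    # Too few samples: do not evaluate the rate at all.
    if attempts <= 0 or attempts < params.mincircs:
        return (None, False)
    pct = 100.0 * successes / attempts
    # Check the most severe threshold first.
    if pct < params.extremepct:
        return ("extreme", params.dropguards)
    if pct < params.warnpct:
        return ("warn", False)
    if pct < params.noticepct:
        return ("notice", False)
    return (None, False)
```

Under the default pb_dropguards value of 0, even the extreme threshold only produces a log message, which matches the informational-only status of the defense.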

<a id="path-spec.txt-7.5"></a>

## Known barriers to enforcement {#barriers}

Due to intermittent CPU overload at relays, the normal rate of
successful circuit completion is highly variable. The Guard-dropping
version of the defense is unlikely to be deployed until the ntor
circuit handshake is enabled, or until the nature of failures induced
by CPU overload is better understood.