
<a id="padding-spec.txt-2"></a>

# Connection-level padding

<a id="padding-spec.txt-2.1"></a>

## Background

Tor clients and relays make use of PADDING to reduce the resolution of
connection-level metadata retention by ISPs and surveillance infrastructure.

Such metadata retention is implemented by Internet routers in the form of
Netflow, jFlow, Netstream, or IPFIX records.  These records are emitted by
gateway routers in a raw form and then exported (often over plaintext) to a
"collector" that either records them verbatim, or reduces their granularity
further\[1\].

Netflow records and the associated data collection and retention tools are
very configurable, and have many modes of operation, especially when
configured to handle high throughput. However, at ISP scale, per-flow records
are very likely to be employed, since they are the default, and also provide
very high resolution in terms of endpoint activity, second only to full packet
and/or header capture.

Per-flow records record the endpoint connection 5-tuple, as well as the
total number of bytes sent and received by that 5-tuple during a particular
time period. They can store additional fields as well, but it is primarily
timing and bytecount information that concern us.

When configured to provide per-flow data, routers emit these raw flow
records periodically for all active connections passing through them
based on two parameters: the "active flow timeout" and the "inactive
flow timeout".

The "active flow timeout" causes the router to emit a new record
periodically for every active TCP session that continuously sends data. The
default active flow timeout for most routers is 30 minutes, meaning that a
new record is created for every TCP session at least every 30 minutes, no
matter what. This value can be configured from 1 minute to 60 minutes on
major routers.

The "inactive flow timeout" is used by routers to create a new record if a
TCP session is inactive for some number of seconds. It allows routers to
avoid the need to track a large number of idle connections in memory, and
instead emit a separate record only when there is activity. This value
ranges from 10 seconds to 600 seconds on common routers. It appears as
though no routers support a value lower than 10 seconds.

For reference, here are default values and ranges (in parentheses when
known) for common routers, along with citations to their manuals.

Some routers speak other collection protocols than Netflow, and in the
case of Juniper, use different timeouts for these protocols. Where this
is known to happen, it has been noted.

```text
                            Inactive Timeout              Active Timeout
    Cisco IOS[3]              15s (10-600s)               30min (1-60min)
    Cisco Catalyst[4]         5min                        32min
    Juniper (jFlow)[5]        15s (10-600s)               30min (1-60min)
    Juniper (Netflow)[6,7]    60s (10-600s)               30min (1-30min)
    H3C (Netstream)[8]        60s (60-600s)               30min (1-60min)
    Fortinet[9]               15s                         30min
    MikroTik[10]              15s                         30min
    nProbe[14]                30s                         120s
    Alcatel-Lucent[2]         15s (10-600s)               30min (1-600min)
```

The combination of the active and inactive netflow record timeouts allows us
to devise a low-cost padding defense that causes what would otherwise be
split records to "collapse" at the router even before they are exported to
the collector for storage. So long as a connection transmits data before the
"inactive flow timeout" expires, then the router will continue to count the
total bytes on that flow before finally emitting a record at the "active
flow timeout".

This means that for a minimal amount of padding that prevents the "inactive
flow timeout" from expiring, it is possible to reduce the resolution of raw
per-flow netflow data to the total amount of bytes sent and received in a 30
minute window. This is a vast reduction in resolution for HTTP, IRC, XMPP,
SSH, and other intermittent interactive traffic, especially when all
user traffic in that time period is multiplexed over a single connection
(as it is with Tor).
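
As a rough illustration of this collapsing effect, the following Python
sketch models a router's flow cache using only the two timeouts. The function
name `count_flow_records` and the packet traces are hypothetical, and real
exporters evaluate the active timeout on their own timers rather than on
packet arrival, so this is a simplified model and not a description of any
vendor's implementation.

```python
def count_flow_records(packet_times, inactive_timeout=15.0, active_timeout=1800.0):
    """Count the records a simplified router would emit for one flow.

    packet_times: sorted packet timestamps, in seconds.
    The open record is exported when the gap since the previous packet
    reaches the inactive timeout, or when the record has been open for the
    active timeout (checked here only when a packet arrives).
    """
    records = 0
    record_start = None
    last_packet = None
    for t in packet_times:
        if record_start is None:
            record_start = t
        elif (t - last_packet >= inactive_timeout
              or t - record_start >= active_timeout):
            records += 1        # the open record is exported
            record_start = t    # this packet opens a new record
        last_packet = t
    if record_start is not None:
        records += 1            # the last open record is exported eventually
    return records


# Hypothetical trace: three bursts separated by 60 s of silence.
bursty = [0, 1, 2, 62, 63, 123, 124]
# The same trace with an extra packet every 9 s, standing in for padding.
padded = sorted(set(bursty) | set(range(0, 125, 9)))

print(count_flow_records(bursty))  # 3
print(count_flow_records(padded))  # 1
```

With the idle gaps filled, the three per-burst records become a single
record that reveals only the total byte count for the whole period.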

Though flow measurement in principle can be bidirectional (counting cells
sent in both directions between a pair of IPs) or unidirectional (counting
only cells sent from one IP to another), we assume for safety that all
measurement is unidirectional, and so traffic must be sent by both parties
in order to prevent record splitting.

<a id="padding-spec.txt-2.2"></a>

## Implementation

Tor clients currently maintain one TLS connection to their Guard node to
carry actual application traffic, and make up to 3 additional connections to
other nodes to retrieve directory information.

We pad only the client's connection to the Guard node, and not any other
connection. We treat Bridge node connections to the Tor network as client
connections, and pad them, but otherwise do not pad between normal relays.

Both clients and Guards will maintain a timer for all application (i.e.,
non-directory) TLS connections. Every time a padding packet is sent by an
endpoint, that endpoint will sample a timeout value from
the max(X,X) distribution described in Section 2.3. The default
range is from 1.5 seconds to 9.5 seconds, subject to consensus
parameters as specified in Section 2.6.

(The timing is randomized to avoid making it obvious which cells are
padding.)

If another cell is sent for any reason before this timer expires, the timer
is reset to a new random value.

If the connection remains inactive until the timer expires, a
single PADDING cell will be sent on that connection (which will
also start a new timer).

In this way, the connection will only be padded in a given direction in
the event that it is idle in that direction, and will always transmit a
packet before the minimum 10 second inactive timeout.
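
The timer rules above can be sketched for one direction of a connection as
follows. This is a minimal simulation under stated assumptions, not the
reference implementation: the function names, the millisecond time base, and
the choice to arm the first timer at time 0 are all hypothetical.

```python
import random


def sample_timeout_ms(low_ms=1500, high_ms=9500, rng=random):
    """Return low + max(X, X), with X uniform on 0..R-1 and R = high - low."""
    r = high_ms - low_ms
    return low_ms + max(rng.randrange(r), rng.randrange(r))


def padding_times_ms(real_cell_times_ms, end_ms, rng=random):
    """Return the times at which one endpoint would send PADDING cells.

    real_cell_times_ms: times at which this endpoint sends non-padding cells.
    """
    pending = sorted(real_cell_times_ms)
    padding = []
    deadline = sample_timeout_ms(rng=rng)  # assumption: timer armed at t = 0
    while True:
        if pending and pending[0] <= deadline:
            now = pending.pop(0)      # a cell was sent first: reset the timer
        elif deadline <= end_ms:
            now = deadline
            padding.append(now)       # timer expired: send one PADDING cell
        else:
            break
        deadline = now + sample_timeout_ms(rng=rng)
    return padding


# Hypothetical trace: a short burst, one minute of silence, then one cell.
real = [0, 500, 1000, 60000]
pads = padding_times_ms(real, end_ms=120000, rng=random.Random(0))
events = sorted(real + pads)
print(max(b - a for a, b in zip(events, events[1:])) < 10000)  # True
```

Because every sampled timeout is below 9.5 seconds, no two consecutive cells
in this direction are ever 10 seconds apart, whatever the seed.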

(In practice, an implementation may not be able to determine when,
exactly, a cell is sent on a given channel.  For example, even though the
cell has been given to the kernel via a call to `send(2)`, the kernel may
still be buffering that cell.  In cases such as these, implementations
should use a reasonable proxy for the time at which a cell is sent: for
example, when the cell is queued.  If this strategy is used,
implementations should try to observe the innermost (closest to the wire)
queue that they practically can, and if this queue is already nonempty,
padding should not be scheduled until after the queue does become empty.)

<a id="padding-spec.txt-2.3"></a>

## Padding Cell Timeout Distribution Statistics { #distribution-statistics }

To limit the amount of padding sent, we do not sample each endpoint
timeout uniformly. Instead, we sample it from max(X,X), where X is
uniformly distributed.

If X is a random variable uniform from 0..R-1 (where R=high-low), then the
random variable Y = max(X,X) has Prob(Y == i) = (2.0*i + 1)/(R*R).
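
The probability formula can be checked by brute force for a small range.
The sketch below reads max(X,X) as the maximum of two independent draws of X
and compares an exhaustive enumeration against the closed form. The helper
names and the choice R = 8 are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def max_pmf_by_enumeration(r):
    """Exact PMF of max(X1, X2) for X1, X2 independent, uniform on 0..r-1."""
    counts = Counter(max(a, b) for a, b in product(range(r), repeat=2))
    return {i: Fraction(counts[i], r * r) for i in range(r)}


def max_pmf_closed_form(r):
    """Prob(Y == i) = (2*i + 1) / (R*R), as stated in the text."""
    return {i: Fraction(2 * i + 1, r * r) for i in range(r)}


r = 8
print(max_pmf_by_enumeration(r) == max_pmf_closed_form(r))  # True
print(sum(max_pmf_closed_form(r).values()))                 # 1
```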

Then, when both sides apply timeouts sampled from Y, the resulting
bidirectional padding packet rate is now a third random variable:
Z = min(Y,Y).

The distribution of Z is slightly bell-shaped, but mostly flat around the
mean. It also turns out that Exp\[Z\] ~= Exp\[X\]. Here's a table of average
values for each random variable:

```text
     R       Exp[X]    Exp[Z]    Exp[min(X,X)]   Exp[Y=max(X,X)]
     2000     999.5    1066        666.2           1332.8
     3000    1499.5    1599.5      999.5           1999.5
     5000    2499.5    2666       1666.2           3332.8
     6000    2999.5    3199.5     1999.5           3999.5
     7000    3499.5    3732.8     2332.8           4666.2
     8000    3999.5    4266.2     2666.2           5332.8
     10000   4999.5    5332.8     3332.8           6666.2
     15000   7499.5    7999.5     4999.5           9999.5
     20000   9999.5    10666.2    6666.2           13332.8
```
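
The averages in this table can be computed exactly from the definitions of
X, Y, and Z rather than estimated by sampling. The sketch below assumes that
both draws in each max or min are independent. The function name
`expectations` is hypothetical.

```python
def expectations(r):
    """Exact means for X uniform on 0..r-1, Y = max(X, X), Z = min(Y, Y).

    Returns (Exp[X], Exp[Z], Exp[min(X,X)], Exp[Y]), in table column order.
    """
    exp_x = (r - 1) / 2
    # Prob(Y == i) = (2*i + 1) / (r*r)
    exp_y = sum(i * (2 * i + 1) for i in range(r)) / (r * r)
    # For two independent draws, min + max equals the sum of the draws.
    exp_min_x = 2 * exp_x - exp_y
    # Prob(Y <= i) = ((i + 1) / r)**2, so Prob(Z > i) = (1 - ((i + 1) / r)**2)**2,
    # and Exp[Z] is the sum of Prob(Z > i) over i >= 0.
    exp_z = sum((1 - ((i + 1) / r) ** 2) ** 2 for i in range(r))
    return exp_x, exp_z, exp_min_x, exp_y


for r in (3000, 8000):
    print(r, [round(v, 1) for v in expectations(r)])
# 3000 [1499.5, 1599.5, 999.5, 1999.5]
# 8000 [3999.5, 4266.2, 2666.2, 5332.8]
```

In the continuous limit these means approach R/2, 8R/15, R/3, and 2R/3
respectively, which is why Exp\[Z\] lands close to Exp\[X\].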

<a id="padding-spec.txt-2.4"></a>

## Maximum overhead bounds

With the default parameters and the above distribution, we expect a
padded connection to send one padding cell every 5.5 seconds. This
averages to 103 bytes per second full duplex (~52 bytes/sec in each
direction), assuming a 512 byte cell and 55 bytes of TLS+TCP+IP headers.
For a client connection that remains otherwise idle for its expected
~50 minute lifespan (governed by the circuit available timeout plus a
small additional connection timeout), this is about 154.5KB of overhead
in each direction (309KB total).

With 2.5M completely idle clients connected simultaneously, 52 bytes per
second amounts to 130MB/second in each direction network-wide, which is
roughly the current amount of Tor directory traffic\[11\]. Of course, our
2.5M daily users will neither be connected simultaneously, nor entirely
idle, so we expect the actual overhead to be much lower than this.
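
The arithmetic behind these bounds is reproduced below as a short Python
calculation. It takes the figures stated above as inputs (5.5 seconds per
padding cell, 512 byte cells, 55 bytes of headers, a 50 minute lifespan, 2.5M
idle clients). The constant names are illustrative.

```python
CELL_BYTES = 512
HEADER_BYTES = 55            # TLS+TCP+IP headers, as assumed in the text
PADDING_INTERVAL_S = 5.5     # one padding cell per connection, both directions
LIFESPAN_S = 50 * 60         # expected idle connection lifespan
IDLE_CLIENTS = 2_500_000

bytes_per_cell = CELL_BYTES + HEADER_BYTES                  # 567
full_duplex_rate = round(bytes_per_cell / PADDING_INTERVAL_S)      # bytes/sec
per_direction_rate = round(bytes_per_cell / PADDING_INTERVAL_S / 2)

total_kb = full_duplex_rate * LIFESPAN_S / 1000
network_mb_per_s = IDLE_CLIENTS * per_direction_rate / 1_000_000

print(full_duplex_rate, per_direction_rate)  # 103 52
print(total_kb, total_kb / 2)                # 309.0 154.5
print(network_mb_per_s)                      # 130.0
```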

<a id="padding-spec.txt-2.5"></a>

## Reducing or Disabling Padding via Negotiation { #negotiation }

To allow mobile clients to either disable or reduce their padding overhead,
the PADDING_NEGOTIATE cell (tor-spec.txt section 7.2) may be sent from
clients to relays. This cell is used to instruct relays to cease sending
padding.

If the client has opted to use reduced padding, it continues to send
padding cells sampled from the range \[9000,14000\] milliseconds (subject to
consensus parameter alteration as per Section 2.6), still using the
Y=max(X,X) distribution. Since the padding is now unidirectional, the
expected frequency of padding cells is now governed by the Y distribution
above as opposed to Z. For a range of 5000ms, we can see that we expect to
send a padding packet every 9000+3332.8 = 12332.8ms.  We also halve the
circuit available timeout from ~50min down to ~25min, which causes the
client's OR connections to be closed shortly thereafter when it is idle,
thus reducing overhead.

These two changes cause the padding overhead to go from 309KB per one-time-use
Tor connection down to 69KB per one-time-use Tor connection. For continual
usage, the maximum overhead goes from 103 bytes/sec down to 46 bytes/sec.
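
These reduced-padding figures follow from the Y distribution and the same
567 bytes per padded cell (512 byte cell plus 55 bytes of headers) used
earlier. The calculation below is a sketch with illustrative variable names.
It assumes the halved lifespan is exactly 25 minutes.

```python
LOW_MS, HIGH_MS = 9000, 14000      # reduced padding range
BYTES_PER_CELL = 512 + 55
LIFESPAN_S = 25 * 60               # halved circuit available timeout

r = HIGH_MS - LOW_MS
# Exp[Y] for Y = max(X, X), using Prob(Y == i) = (2*i + 1) / (r*r)
exp_y = sum(i * (2 * i + 1) for i in range(r)) / (r * r)

interval_ms = LOW_MS + exp_y
rate = BYTES_PER_CELL / (interval_ms / 1000)   # bytes/sec, one direction only
overhead_kb = rate * LIFESPAN_S / 1000

print(round(interval_ms, 1))  # 12332.8
print(round(rate))            # 46
print(round(overhead_kb))     # 69
```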

If a client opts to completely disable padding, it sends a
PADDING_NEGOTIATE to instruct the relay not to pad, and then does not
send any further padding itself.

Currently, clients negotiate padding only when a channel is created,
immediately after sending their NETINFO cell.  Recipients SHOULD, however,
accept padding negotiation messages at any time.

If a client that previously negotiated reduced or disabled padding
wishes to re-enable default padding (i.e., padding according to the consensus
parameters), it SHOULD send PADDING_NEGOTIATE START with zero in the
ito_low_ms and ito_high_ms fields.  (It therefore SHOULD NOT copy the values
from its own established consensus into the PADDING_NEGOTIATE cell.)
This avoids the client needing to send updated padding negotiations if the
consensus parameters should change.  The recipient's clamping of the timing
parameters will cause the recipient to use its notion of the consensus
parameters.
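
To see why zero-valued fields are sufficient, consider the following sketch
of recipient-side clamping. It assumes that the recipient raises each
requested bound to at least its own consensus value. The exact clamping rule
is not spelled out in this section, so `clamp_requested_range` and
`PaddingRange` are hypothetical names for an assumed behavior, not the
reference implementation.

```python
from typing import NamedTuple


class PaddingRange(NamedTuple):
    low_ms: int
    high_ms: int


def clamp_requested_range(requested, consensus):
    """Assumed rule: a requested bound may not go below the consensus bound."""
    return PaddingRange(
        low_ms=max(requested.low_ms, consensus.low_ms),
        high_ms=max(requested.high_ms, consensus.high_ms),
    )


consensus = PaddingRange(1500, 9500)

# Reduced padding request: values above the consensus are kept.
print(clamp_requested_range(PaddingRange(9000, 14000), consensus))
# Zeros: the recipient falls back to its own consensus values.
print(clamp_requested_range(PaddingRange(0, 0), consensus))
# If the consensus later changes, the same zero request tracks it.
print(clamp_requested_range(PaddingRange(0, 0), PaddingRange(2000, 10000)))
```

Under this assumption the client never needs to know which consensus the
recipient currently holds.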

Clients and bridges MUST reject padding negotiation messages from relays,
and close the channel if they receive one.

<a id="padding-spec.txt-2.6"></a>

## Consensus Parameters Governing Behavior { #consensus-parameters }

Connection-level padding is controlled by the following consensus parameters:

```text
    * nf_ito_low
      - The low end of the range to send padding when inactive, in ms.
      - Default: 1500

    * nf_ito_high
      - The high end of the range to send padding, in ms.
      - Default: 9500
      - If nf_ito_low == nf_ito_high == 0, padding will be disabled.

    * nf_ito_low_reduced
      - For reduced padding clients: the low end of the range to send padding
        when inactive, in ms.
      - Default: 9000

    * nf_ito_high_reduced
      - For reduced padding clients: the high end of the range to send padding,
        in ms.
      - Default: 14000

    * nf_conntimeout_clients
      - The number of seconds to keep never-used circuits open and
        available for clients to use. Note that the actual client timeout is
        randomized uniformly from this value to twice this value.
      - The number of seconds to keep idle (not currently used) canonical
        channels open and available. (We do this to ensure a sufficient
        time duration of padding, which is the ultimate goal.)
      - This value is also used to determine how long, after a port has been
        used, we should attempt to keep building predicted circuits for that
        port. (See path-spec.txt section 2.1.1.)  This behavior was
        originally added to work around implementation limitations, but it
        serves as a reasonable default regardless of implementation.
      - For all use cases, reduced padding clients use half the consensus
        value.
      - Implementations MAY mark circuits held open past the reduced padding
        quantity (half the consensus value) as "not to be used for streams",
        to prevent their use from becoming a distinguisher.
      - Default: 1800

    * nf_pad_before_usage
      - If set to 1, OR connections are padded before the client uses them
        for any application traffic. If 0, OR connections are not padded
        until application data begins.
      - Default: 1

    * nf_pad_relays
      - If set to 1, we also pad inactive relay-to-relay connections
      - Default: 0

    * nf_conntimeout_relays
      - The number of seconds that idle relay-to-relay connections are kept
        open.
      - Default: 3600
```
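
As an illustration of how these parameters fit together, the sketch below
collects the defaults in one structure and derives the padding range and
client connection timeout from them. The class and function names are
hypothetical. Two details are assumptions not stated above: that setting
nf_ito_low and nf_ito_high to 0 also disables reduced padding, and that the
halving for reduced padding clients is applied before the timeout is
randomized.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class PaddingParams:
    """Consensus parameters with the defaults listed above."""
    nf_ito_low: int = 1500
    nf_ito_high: int = 9500
    nf_ito_low_reduced: int = 9000
    nf_ito_high_reduced: int = 14000
    nf_conntimeout_clients: int = 1800
    nf_pad_before_usage: int = 1
    nf_pad_relays: int = 0
    nf_conntimeout_relays: int = 3600


def padding_range_ms(params, reduced=False):
    """Return (low, high) in ms, or None when padding is disabled."""
    if params.nf_ito_low == 0 and params.nf_ito_high == 0:
        return None  # assumption: this also disables reduced padding
    if reduced:
        return params.nf_ito_low_reduced, params.nf_ito_high_reduced
    return params.nf_ito_low, params.nf_ito_high


def client_conn_timeout_s(params, reduced=False, rng=random):
    """Sample uniformly between the base timeout and twice the base timeout."""
    base = params.nf_conntimeout_clients
    if reduced:
        base = base // 2  # reduced padding clients use half the consensus value
    return rng.uniform(base, 2 * base)


defaults = PaddingParams()
print(padding_range_ms(defaults))                # (1500, 9500)
print(padding_range_ms(defaults, reduced=True))  # (9000, 14000)
print(padding_range_ms(PaddingParams(nf_ito_low=0, nf_ito_high=0)))  # None
```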