
<a id="padding-spec.txt-3"></a>

# Circuit-level padding {#circuit-level-padding}

The circuit padding system in Tor is an extension of the WTF-PAD
event-driven state machine design\[15\]. At a high level, this design places
one or more padding state machines at the client, and one or more padding
state machines at a relay, on each circuit.

State transition and histogram generation have been generalized to be fully
programmable, and probability distribution support was added to support more
compact representations like APE\[16\]. Additionally, packet count limits,
rate limiting, and circuit application conditions have been added.

At present, Tor uses this system to deploy two pairs of circuit padding
machines. They obscure differences between the setup phase of client-side
onion service circuits and that of general circuits, up to the first 10 relay
cells.

This specification covers only the resulting behavior of these padding
machines, and thus does not cover the state machine implementation details or
operation. For full details on using the circuit padding system to develop
future padding defenses, see the research developer documentation\[17\].

<a id="padding-spec.txt-3.1"></a>

## Circuit Padding Negotiation {#negotiation}

Circuit padding machines are advertised as "Padding" subprotocol versions
(see tor-spec.txt Section 9). The onion service circuit padding machines are
advertised as "Padding=2".

Because circuit padding machines only become active at certain points in
circuit lifetime, and because more than one padding machine may be active at
any given point in circuit lifetime, there is also a PADDING_NEGOTIATE
message and a PADDING_NEGOTIATED message. These are relay commands 41 and 42 respectively,
with relay headers as per section 6.1 of tor-spec.txt.

The fields in the body of a PADDING_NEGOTIATE message are
as follows:

```text
     const CIRCPAD_COMMAND_STOP = 1;
     const CIRCPAD_COMMAND_START = 2;

     const CIRCPAD_RESPONSE_OK = 1;
     const CIRCPAD_RESPONSE_ERR = 2;

     const CIRCPAD_MACHINE_CIRC_SETUP = 1;

     struct circpad_negotiate {
       u8 version IN [0];
       u8 command IN [CIRCPAD_COMMAND_START, CIRCPAD_COMMAND_STOP];

       u8 machine_type IN [CIRCPAD_MACHINE_CIRC_SETUP];

       u8 unused; // Formerly echo_request

       u32 machine_ctr;
     };
```

When a client wants to start a circuit padding machine, it first checks that
the desired destination hop advertises the appropriate subprotocol version for
that machine. It then sends a PADDING_NEGOTIATE message to that hop with
command=CIRCPAD_COMMAND_START, and machine_type=CIRCPAD_MACHINE_CIRC_SETUP (for
the circ setup machine, the destination hop is the second hop in the
circuit). The machine_ctr is the count of which machine instance this is on
the circuit. It is used to disambiguate shutdown requests.
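
As a rough illustration of the message body described above, the following Python sketch packs a `circpad_negotiate` structure. It assumes the fields are laid out in declaration order, in network byte order, with no padding between them; the helper name `encode_negotiate` is hypothetical and not part of any Tor implementation.

```python
import struct

CIRCPAD_COMMAND_STOP = 1
CIRCPAD_COMMAND_START = 2
CIRCPAD_MACHINE_CIRC_SETUP = 1

# Assumed wire layout: u8 version, u8 command, u8 machine_type,
# u8 unused, u32 machine_ctr (big-endian, no padding).
NEGOTIATE = struct.Struct("!BBBBI")


def encode_negotiate(command, machine_ctr,
                     machine_type=CIRCPAD_MACHINE_CIRC_SETUP):
    """Build the body of a PADDING_NEGOTIATE message (hypothetical helper)."""
    if command not in (CIRCPAD_COMMAND_START, CIRCPAD_COMMAND_STOP):
        raise ValueError("unknown command")
    if machine_type != CIRCPAD_MACHINE_CIRC_SETUP:
        raise ValueError("unknown machine type")
    # version is always 0; the former echo_request byte is sent as 0.
    return NEGOTIATE.pack(0, command, machine_type, 0, machine_ctr)


body = encode_negotiate(CIRCPAD_COMMAND_START, machine_ctr=1)
print(body.hex())  # 0002010000000001
```

The resulting 8 bytes would be carried as the body of a relay message with command 41.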

When a relay receives a PADDING_NEGOTIATE message, it checks that it supports
the requested machine, and sends a PADDING_NEGOTIATED message, which is formatted
in the body of a relay message with command number 42 (see tor-spec.txt
section 6.1), as follows:

```text
     struct circpad_negotiated {
       u8 version IN [0];
       u8 command IN [CIRCPAD_COMMAND_START, CIRCPAD_COMMAND_STOP];
       u8 response IN [CIRCPAD_RESPONSE_OK, CIRCPAD_RESPONSE_ERR];

       u8 machine_type IN [CIRCPAD_MACHINE_CIRC_SETUP];

       u32 machine_ctr;
     };
```

If the machine is supported, the response field will contain
CIRCPAD_RESPONSE_OK. If it is not, it will contain CIRCPAD_RESPONSE_ERR.

Either side may send a CIRCPAD_COMMAND_STOP to shut down the padding machines
(clients MUST only send circpad_negotiate, and relays MUST only send
circpad_negotiated for this purpose).

If the machine_ctr does not match the current machine instance count
on the circuit, the command is ignored.
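
The relay-side handling can be sketched roughly as follows. This is a simplified model, not the actual implementation: the function name `handle_negotiate`, the single `active_ctr` state variable, and the choice to apply the `machine_ctr` check only to STOP commands are all assumptions made for illustration.

```python
import struct

CIRCPAD_COMMAND_STOP = 1
CIRCPAD_COMMAND_START = 2
CIRCPAD_RESPONSE_OK = 1
CIRCPAD_RESPONSE_ERR = 2
CIRCPAD_MACHINE_CIRC_SETUP = 1

# Assumed layouts: fields in declaration order, big-endian, no padding.
NEGOTIATE = struct.Struct("!BBBBI")   # version, command, type, unused, ctr
NEGOTIATED = struct.Struct("!BBBBI")  # version, command, response, type, ctr

SUPPORTED_MACHINES = {CIRCPAD_MACHINE_CIRC_SETUP}


def handle_negotiate(body, active_ctr):
    """Return (reply_body_or_None, new_active_ctr) for one request.

    active_ctr is the machine instance count currently active on the
    circuit, or None if no machine is running (hypothetical state model).
    """
    _version, command, machine_type, _unused, ctr = NEGOTIATE.unpack(body)
    if machine_type not in SUPPORTED_MACHINES:
        reply = NEGOTIATED.pack(0, command, CIRCPAD_RESPONSE_ERR,
                                machine_type, ctr)
        return reply, active_ctr
    if command == CIRCPAD_COMMAND_START:
        reply = NEGOTIATED.pack(0, command, CIRCPAD_RESPONSE_OK,
                                machine_type, ctr)
        return reply, ctr
    if command == CIRCPAD_COMMAND_STOP:
        if ctr != active_ctr:
            # Stale shutdown request for an older instance: ignore it.
            return None, active_ctr
        reply = NEGOTIATED.pack(0, command, CIRCPAD_RESPONSE_OK,
                                machine_type, ctr)
        return reply, None
    return None, active_ctr  # unknown command: ignored in this sketch


reply, ctr = handle_negotiate(NEGOTIATE.pack(0, 2, 1, 0, 3), None)
stale, ctr_after_stale = handle_negotiate(NEGOTIATE.pack(0, 1, 1, 0, 2), ctr)
```

In this model, a STOP carrying counter 2 leaves the machine with counter 3 running, which is the disambiguation that `machine_ctr` is meant to provide.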

<a id="padding-spec.txt-3.2"></a>

## Circuit Padding Machine Message Management { #machine-msg-mgt }

Clients MAY send padding cells towards the relay before receiving the
circpad_negotiated response, to allow for outbound cover traffic before
negotiation completes.

Clients MAY send another PADDING_NEGOTIATE message before receiving the
circpad_negotiated response, to allow for rapid machine changes.

Relays MUST NOT send padding cells or PADDING_NEGOTIATED messages unless a
padding machine is active. Any padding cells or padding-related messages
that arrive at the client from unexpected relay sources are protocol
violations, and clients MAY immediately tear down such circuits to avoid
side channel risk.

<a id="padding-spec.txt-3.3"></a>

## Obfuscating client-side onion service circuit setup { #hiding-circ-setup }

The circuit padding currently deployed in Tor attempts to hide client-side
onion service circuit setup. Service-side setup is not covered, because doing
so would involve significantly more overhead, and/or require interaction with
the application layer.

The approach taken aims to make client-side introduction and rendezvous
circuits match the cell direction sequence and cell count of 3 hop general
circuits used for normal web traffic, for the first 10 cells only. The
lifespan of introduction circuits is also made to match the lifespan
of general circuits.

Note that inter-arrival timing is not obfuscated by this defense.

<a id="padding-spec.txt-3.3.1"></a>

### Common general circuit construction sequences { #circ-setup-sequences}

Most general Tor circuits used to surf the web or download directory
information start with the following 6-cell relay cell sequence (cells
surrounded in \[brackets\] are outgoing, the others are incoming):

\[EXTEND2\] -> EXTENDED2 -> \[EXTEND2\] -> EXTENDED2 -> \[BEGIN\] -> CONNECTED

When this is done, the client has established a 3-hop circuit and also opened
a stream to the other end. Usually after this comes a series of DATA messages
that either fetch pages, establish an SSL connection, or fetch directory
information:

\[DATA\] -> \[DATA\] -> DATA -> DATA...(inbound cells continue)

The above stream of 10 relay cells describes the vast majority of general
circuits that came out of Tor Browser during our testing, and it's what we use
to make introduction and rendezvous circuits blend in.

Please note that in this section we only investigate relay cells and not
connection-level cells like CREATE/CREATED or AUTHENTICATE/etc. that are used
during the link-layer handshake. The rationale is that connection-level cells
depend on the type of guard used and are not an effective fingerprint for a
network/guard-level adversary.

<a id="padding-spec.txt-3.3.2"></a>

### Client-side onion service introduction circuit obfuscation { #hiding-intro }

Two circuit padding machines work to hide client-side introduction circuits:
one machine at the origin, and one machine at the second hop of the circuit.
Each machine sends padding towards the other. The padding from the origin-side
machine terminates at the second hop and does not get forwarded to the actual
introduction point.

From Section 3.3.1 above, most general circuits have the following initial
relay cell sequence (outgoing cells marked in \[brackets\]):

```text
  [EXTEND2] -> EXTENDED2 -> [EXTEND2] -> EXTENDED2 -> [BEGIN] -> CONNECTED
    -> [DATA] -> [DATA] -> DATA -> DATA...(inbound data cells continue)
```

Whereas normal introduction circuits usually look like:

```text
  [EXTEND2] -> EXTENDED2 -> [EXTEND2] -> EXTENDED2 -> [EXTEND2] -> EXTENDED2
    -> [INTRO1] -> INTRODUCE_ACK
```

This means that up to the sixth cell (first line of each sequence above),
both general and intro circuits have identical cell sequences. After that
we want to mimic the second line sequence of

-> \[DATA\] -> \[DATA\] -> DATA -> DATA...(inbound data cells continue)

We achieve this by starting padding once INTRODUCE1 has been sent. With padding
negotiation cells, the second line in the common case looks like:

-> \[INTRO1\] -> \[PADDING_NEGOTIATE\] -> PADDING_NEGOTIATED -> INTRO_ACK

Then, the middle node will send between INTRO_MACHINE_MINIMUM_PADDING (7) and
INTRO_MACHINE_MAXIMUM_PADDING (10) cells, to match the "...(inbound data cells
continue)" portion of the trace (aka the rest of an HTTPS response body).
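
To make the matching argument concrete, the sketch below reduces each trace to its direction sequence ("O" for outgoing, "I" for incoming) and compares the first 10 cells. The helper `directions` and the list encoding of the traces are illustrative assumptions; the padded trace uses the minimum of 7 padding cells from the middle node.

```python
def directions(trace):
    """Map a trace to 'O' (outgoing, written in [brackets]) or 'I' (incoming)."""
    return "".join("O" if cell.startswith("[") else "I" for cell in trace)


# First four relay cells are common to every trace discussed here.
SETUP = ["[EXTEND2]", "EXTENDED2", "[EXTEND2]", "EXTENDED2"]

general = SETUP + ["[BEGIN]", "CONNECTED",
                   "[DATA]", "[DATA]", "DATA", "DATA"]
intro_plain = SETUP + ["[EXTEND2]", "EXTENDED2",
                       "[INTRO1]", "INTRODUCE_ACK"]
intro_padded = (SETUP + ["[EXTEND2]", "EXTENDED2",
                         "[INTRO1]", "[PADDING_NEGOTIATE]",
                         "PADDING_NEGOTIATED", "INTRODUCE_ACK"]
                + ["DROP"] * 7)  # 7 = INTRO_MACHINE_MINIMUM_PADDING

print(directions(general))            # OIOIOIOOII
print(directions(intro_plain))        # OIOIOIOI
print(directions(intro_padded)[:10])  # OIOIOIOOII
```

The unpadded introduction trace differs from the general trace at the eighth cell, while the padded trace matches it over the first 10 cells.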

We also set a special flag which keeps the circuit open even after the
introduction is performed. With this feature the circuit will stay alive for
the same duration as normal web circuits before they expire (usually 10
minutes).

<a id="padding-spec.txt-3.3.3"></a>

### Client-side rendezvous circuit hiding { #hiding-rendezvous }

Following a similar argument as for intro circuits, we are aiming for padded
rendezvous circuits to blend in with the initial cell sequence of general
circuits which usually look like this:

```text
  [EXTEND2] -> EXTENDED2 -> [EXTEND2] -> EXTENDED2 -> [BEGIN] -> CONNECTED
     -> [DATA] -> [DATA] -> DATA -> DATA...(incoming cells continue)
```

Whereas normal rendezvous circuits usually look like:

```text
  [EXTEND2] -> EXTENDED2 -> [EXTEND2] -> EXTENDED2 -> [EST_REND] -> REND_EST
     -> REND2 -> [BEGIN]
```

This means that up to the sixth cell (the first line), both general and
rend circuits have identical cell sequences.

After that we want to mimic a \[DATA\] -> \[DATA\] -> DATA -> DATA sequence.

With padding negotiation right after the REND_ESTABLISHED, the sequence
becomes:

```text
  [EXTEND2] -> EXTENDED2 -> [EXTEND2] -> EXTENDED2 -> [EST_REND] -> REND_EST
     -> [PADDING_NEGOTIATE] -> [DROP] -> PADDING_NEGOTIATED -> DROP...
```

After which normal application DATA-bearing cells continue on the circuit.

This way, rendezvous circuits look like general circuits until the end of
circuit setup.
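
One way to check this claim is to look for the first cell at which two traces disagree in direction. The sketch below does that for the rendezvous traces shown in this section; `first_divergence` and the trace lists are hypothetical helpers written for this example only.

```python
def directions(trace):
    """'O' for outgoing cells (written in [brackets]), 'I' for incoming."""
    return ["O" if cell.startswith("[") else "I" for cell in trace]


def first_divergence(a, b, limit=10):
    """Index of the first differing direction within `limit` cells, else None.

    Comparison stops at the end of the shorter trace.
    """
    pairs = zip(directions(a)[:limit], directions(b)[:limit])
    for index, (x, y) in enumerate(pairs):
        if x != y:
            return index
    return None


SETUP = ["[EXTEND2]", "EXTENDED2", "[EXTEND2]", "EXTENDED2"]
general = SETUP + ["[BEGIN]", "CONNECTED",
                   "[DATA]", "[DATA]", "DATA", "DATA"]
rend_plain = SETUP + ["[EST_REND]", "REND_EST", "REND2", "[BEGIN]"]
rend_padded = SETUP + ["[EST_REND]", "REND_EST",
                       "[PADDING_NEGOTIATE]", "[DROP]",
                       "PADDING_NEGOTIATED", "DROP"]

print(first_divergence(general, rend_plain))   # 6
print(first_divergence(general, rend_padded))  # None
```

Index 6 is the seventh cell, where an unpadded rendezvous circuit receives REND2 while a general circuit sends DATA.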

After that our machine gets deactivated, and we let the actual rendezvous
circuit shape the traffic flow. Since rendezvous circuits usually imitate
general circuits (their purpose is to surf the web), we can expect that they
will look alike.

<a id="padding-spec.txt-3.3.4"></a>

### Circuit setup machine overhead { #setup-overhead }

For the intro circuit case, we see that the origin-side machine just sends a
single PADDING_NEGOTIATE message, whereas the relay-side machine sends a
PADDING_NEGOTIATED message and between 7 and 10 DROP cells. This means that the
overhead of this machine is between 9 and 12 padding cells per introduction
circuit, or about 11 on average.

For the rend circuit case, this machine is quite light. Both sides send 2
padding cells, for a total of 4 padding cells.
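
The arithmetic behind these figures can be reproduced as follows. The sketch assumes that every DROP count between the minimum and maximum is equally likely, which this document does not state; under that assumption the mean works out to 10.5 cells, in line with the rounded figure for introduction circuits.

```python
from statistics import mean

INTRO_MACHINE_MINIMUM_PADDING = 7
INTRO_MACHINE_MAXIMUM_PADDING = 10

# Intro circuit: 1 PADDING_NEGOTIATE + 1 PADDING_NEGOTIATED + n DROP cells.
intro_totals = [
    1 + 1 + n
    for n in range(INTRO_MACHINE_MINIMUM_PADDING,
                   INTRO_MACHINE_MAXIMUM_PADDING + 1)
]
# Assumption: each DROP count is equally likely.
intro_mean = mean(intro_totals)

# Rend circuit: each side sends 2 padding-related cells.
rend_total = 2 + 2

print(intro_totals)  # [9, 10, 11, 12]
print(intro_mean)    # 10.5
print(rend_total)    # 4
```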

<a id="padding-spec.txt-3.4"></a>

## Circuit padding consensus parameters { #consenus-parameters }

The circuit padding system has a handful of consensus parameters that can
either disable circuit padding entirely, or rate limit the total overhead
at relays and clients.

```text
  * circpad_padding_disabled
    - If set to 1, no circuit padding machines will negotiate, and all
      current padding machines will cease padding immediately.
    - Default: 0

  * circpad_padding_reduced
    - If set to 1, only circuit padding machines marked as "reduced"/"low
      overhead" will be used. (Currently no such machines are marked
      as "reduced overhead").
    - Default: 0

  * circpad_global_allowed_cells
    - This is the number of padding cells that must be sent before
      the 'circpad_global_max_padding_percent' parameter is applied.
    - Default: 0

  * circpad_global_max_padding_percent
    - This is the maximum ratio of padding cells to total cells, specified
      as a percent. If the global ratio of padding cells to total cells
      across all circuits exceeds this percent value, no more padding is sent
      until the ratio becomes lower. 0 means no limit.
    - Default: 0

  * circpad_max_circ_queued_cells
    - This is the maximum number of cells that can be in the circuitmux queue
      before padding stops being sent on that circuit.
    - Default: CIRCWINDOW_START_MAX (1000)
```
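
The way these parameters interact can be sketched as a single predicate. The function `padding_allowed`, its arguments, and the exact comparison operators at the boundaries are assumptions made for illustration; only the parameter names and defaults come from the list above.

```python
def padding_allowed(padding_sent, total_sent, queued_cells, *,
                    circpad_padding_disabled=0,
                    circpad_global_allowed_cells=0,
                    circpad_global_max_padding_percent=0,
                    circpad_max_circ_queued_cells=1000):
    """Decide whether one more padding cell may be sent (hypothetical)."""
    if circpad_padding_disabled:
        return False
    # Per-circuit limit: stop padding when the queue is over the maximum.
    # (Whether the real check is '>' or '>=' is an assumption here.)
    if queued_cells > circpad_max_circ_queued_cells:
        return False
    # The global percentage is applied only after this many padding cells.
    if padding_sent < circpad_global_allowed_cells:
        return True
    # 0 means no global limit.
    if circpad_global_max_padding_percent == 0:
        return True
    # Integer form of: padding_sent / total_sent > percent / 100.
    return (padding_sent * 100
            <= circpad_global_max_padding_percent * total_sent)


print(padding_allowed(50, 100, 0))  # True
print(padding_allowed(50, 100, 0,
                      circpad_global_max_padding_percent=40))  # False
```

With the defaults, only `circpad_max_circ_queued_cells` constrains padding; the second call shows a 50% padding ratio being refused under a 40% cap.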