# A Padding Machine from Scratch

A quickstart guide by Tobias Pulls.

This document describes the process of building a "padding machine" in tor's new
circuit padding framework from scratch. Notes were taken as part of porting
[Adaptive Padding Early
(APE)](https://www.cs.kau.se/pulls/hot/thebasketcase-ape/) from basket2 to the
circuit padding framework. The goal is just to document the process and provide
useful pointers along the way, not create a useful machine. 

The quick and dirty plan is to:
1. clone and compile tor
2. use the newly built tor in TB and at a small (non-exit) relay we run
3. add a bare-bones APE padding machine
4. run the machine, inspect logs for activity
5. port APE's state machine without thinking much about parameters

## Clone and compile tor

```bash
git clone https://git.torproject.org/tor.git
cd tor
git checkout tor-0.4.1.5
```
Above we use the tag for tor-0.4.1.5 where the circuit padding framework was
released. Note that this version of the framework is missing many features and
fixes that have since been merged to origin/master. If you need the newest
framework features, you should use master instead.

```bash
sh autogen.sh 
./configure
make
```
When you run `./configure` you'll be told of missing dependencies and packages
to install on Debian-based distributions. Important: if you plan to run `tor` on
a relay as part of the real Tor network and your server runs a distribution that
uses systemd, then I'd recommend that you `apt install dpkg dpkg-dev
libevent-dev libssl-dev asciidoc quilt dh-apparmor libseccomp-dev dh-systemd
libsystemd-dev pkg-config dh-autoreconf libfakeroot zlib1g zlib1g-dev automake
liblzma-dev libzstd-dev` and ensure that tor has systemd support enabled:
`./configure --enable-systemd`. Without this, on a recent Ubuntu, my tor service
was forcefully restarted (SIGINT interrupt) by systemd every five minutes.

If you want to install on your local system, run `make install`. For our case we
just want the tor binary at `src/app/tor`.
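
Before swapping binaries around, it helps to confirm which version a given
binary reports. The following is a minimal sketch, assuming that `tor --version`
prints a line of the form `Tor version 0.4.1.5 (git-...)`. The helper name
`tor_version_of` is hypothetical and not part of tor.

```bash
# Hypothetical helper: print the version number reported by a tor binary.
# Assumes the first matching line of `--version` output looks like
# "Tor version 0.4.1.5 (git-439ca48989ece545)."
tor_version_of() {
  "$1" --version 2>/dev/null |
    sed -n 's/^Tor version \([0-9][0-9a-z.-]*\).*/\1/p' |
    head -n 1
}
```

For the tag checked out above, `tor_version_of src/app/tor` is expected to print
`0.4.1.5` if the assumption about the output format holds.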

## Use tor in TB and at a relay

Download and install a fresh Tor Browser (TB) from torproject.org. Make sure it
works. From the command line, relative to the folder created when you extracted
TB, run `./Browser/start-tor-browser --verbose` to get some basic log output.
Note the version of tor, in my case, `Tor 0.4.0.5 (git-bf071e34aa26e096)` as
part of TB 8.5.4. Shut down TB and replace `Browser/TorBrowser/Tor/tor` with the
`tor` binary that you compiled earlier. Start TB from the command line again;
you should see a different version, in my case `Tor 0.4.1.5
(git-439ca48989ece545)`.
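
The swap can be scripted so that the original binary is kept around. This is a
minimal sketch: `replace_tor_binary` is a hypothetical helper, and the `.orig`
backup suffix is our own convention, not something TB expects.

```bash
# Hypothetical helper: replace a tor binary while keeping one pristine backup.
# Usage: replace_tor_binary <new-binary> <binary-to-replace>
replace_tor_binary() {
  new_bin=$1
  target=$2
  [ -f "$new_bin" ] && [ -f "$target" ] || return 1
  # Back up the original only once, so repeated swaps never overwrite it.
  if [ ! -e "$target.orig" ]; then
    cp -p "$target" "$target.orig" || return 1
  fi
  cp "$new_bin" "$target" && chmod +x "$target"
}
```

With TB shut down and from the extracted TB folder, a call could look like
`replace_tor_binary ~/tor/src/app/tor Browser/TorBrowser/Tor/tor`, where `~/tor`
is an assumption about where the repository was cloned.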

The relay we run is also on Linux, and `tor` is located at `/usr/bin/tor`. To
view relevant logs since last boot, run `sudo journalctl -b /usr/bin/tor`, where
we find `Tor 0.4.0.5 running on Linux`. Copy the locally compiled `tor` to the
relay at a temporary location and then make sure its ownership and access
rights are identical to `/usr/bin/tor`. Next, shut down the running tor service
with `sudo service tor stop`, wait for it to stop (typically 30s), copy our
locally compiled tor to replace `/usr/bin/tor`, then start the service again.
Checking the logs we see `Tor 0.4.1.5 (git-439ca48989ece545)`.
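
Matching ownership and access rights can be done in one step. This is a minimal
sketch that assumes GNU coreutils (for the `--reference` option);
`match_owner_and_mode` is a hypothetical helper name.

```bash
# Hypothetical helper: give <file> the same owner, group, and mode as <ref>.
# Relies on the GNU coreutils --reference option; changing the owner
# typically requires root.
# Usage: match_owner_and_mode <ref> <file>
match_owner_and_mode() {
  ref=$1
  file=$2
  chown --reference="$ref" "$file" && chmod --reference="$ref" "$file"
}
```

Run as root on the relay, `match_owner_and_mode /usr/bin/tor /tmp/tor` would
prepare the new binary, where `/tmp/tor` is an assumed temporary location.
Comparing both files with `ls -l` afterwards is a cheap sanity check.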

Repeatedly shutting down a relay is detrimental to the network and should be
avoided. Sorry about that.

We have one more step left before we move on to the machine: configure TB to
always use our middle relay. Edit `Browser/TorBrowser/Data/Tor/torrc` and set
`MiddleNodes <fingerprint>`, where `<fingerprint>` is the fingerprint of the
relay. Start TB, visit a website, and manually confirm that the middle is used
by looking at the circuit display.
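
If you switch middle relays often, editing `torrc` by hand gets tedious. The
sketch below sets a single-valued option and drops any earlier line that set
it. `set_torrc_option` is a hypothetical helper, and it assumes each option sits
on its own line as `Option value`.

```bash
# Hypothetical helper: set a single-valued torrc option, replacing any
# earlier line that sets the same option (option names match
# case-insensitively).
# Usage: set_torrc_option <torrc> <option> <value>
set_torrc_option() {
  torrc=$1
  key=$2
  value=$3
  tmp=$(mktemp) || return 1
  # Keep every line except previous settings of this option.
  grep -iv "^[[:space:]]*$key[[:space:]]" "$torrc" > "$tmp" 2>/dev/null || true
  printf '%s %s\n' "$key" "$value" >> "$tmp"
  # Write through the existing file so its owner and mode are preserved.
  cat "$tmp" > "$torrc"
  status=$?
  rm -f "$tmp"
  return "$status"
}
```

A call would look like `set_torrc_option Browser/TorBrowser/Data/Tor/torrc
MiddleNodes "$FINGERPRINT"`, with `FINGERPRINT` holding the fingerprint of the
relay. Do not use this helper for options that may legitimately appear several
times, such as `Log`.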

## Add a bare-bones APE padding machine

Now the fun part. We have several resources at our disposal (mind that links
might be broken in the future, just search for the headings):
- The official [Circuit Padding Developer
  Documentation](https://storm.torproject.org/shared/ChieH_sLU93313A2gopZYT3x2waJ41hz5Hn2uG1Uuh7).
- Notes we made on the [implementation of the circuit padding
  framework](https://github.com/pylls/padding-machines-for-tor/blob/master/notes/circuit-padding-framework.md).
- The implementation of the current circuit padding machines in tor:
  [circuitpadding_machines.c](https://gitweb.torproject.org/tor.git/tree/src/core/or/circuitpadding_machines.c)
  and
  [circuitpadding_machines.h](https://gitweb.torproject.org/tor.git/tree/src/core/or/circuitpadding_machines.h).

Please consult the above links for details. Moving forward, the focus is to
describe what was done, not necessarily to explain why in full detail.

Since we plan to make changes to tor, create a new branch with
`git checkout -b circuit-padding-ape-machine tor-0.4.1.5`.

We start with declaring two functions, one for the machine at the client and one
at the relay, in `circuitpadding_machines.h`:

```c
void circpad_machine_relay_wf_ape(smartlist_t *machines_sl);
void circpad_machine_client_wf_ape(smartlist_t *machines_sl);
```

The definitions go into `circuitpadding_machines.c`:

```c
/**************** Adaptive Padding Early (APE) machine ****************/

/** 
 * Create a relay-side padding machine based on the APE design. 
 */
void
circpad_machine_relay_wf_ape(smartlist_t *machines_sl)
{
  circpad_machine_spec_t *relay_machine
    = tor_malloc_zero(sizeof(circpad_machine_spec_t));

  relay_machine->name = "relay_wf_ape";
  relay_machine->is_origin_side = 0; // relay-side

  // Pad to/from the middle relay, only when the circuit has streams
  relay_machine->target_hopnum = 2;
  relay_machine->conditions.min_hops = 2;
  relay_machine->conditions.state_mask = CIRCPAD_CIRC_STREAMS;

  // limits to help guard against excessive padding
  relay_machine->allowed_padding_count = 1;
  relay_machine->max_padding_percent = 1;

  // one state to start with: START (-> END, never takes a slot in states)
  circpad_machine_states_init(relay_machine, 1);
  relay_machine->states[CIRCPAD_STATE_START].
    next_state[CIRCPAD_EVENT_NONPADDING_SENT] =
    CIRCPAD_STATE_END;

  // register the machine
  relay_machine->machine_num = smartlist_len(machines_sl);
  circpad_register_padding_machine(relay_machine, machines_sl);
  
  log_info(LD_CIRC,
           "Registered relay WF APE padding machine (%u)",
           relay_machine->machine_num);
}

/** 
 * Create a client-side padding machine based on the APE design. 
 */
void
circpad_machine_client_wf_ape(smartlist_t *machines_sl)
{
  circpad_machine_spec_t *client_machine
    = tor_malloc_zero(sizeof(circpad_machine_spec_t));

  client_machine->name = "client_wf_ape";
  client_machine->is_origin_side = 1; // client-side

  /* Pad to/from the middle relay, only when the circuit has streams, and only
   * for general purpose circuits (typical for web browsing) */
  client_machine->target_hopnum = 2;
  client_machine->conditions.min_hops = 2;
  client_machine->conditions.state_mask = CIRCPAD_CIRC_STREAMS;
  client_machine->conditions.purpose_mask =
    circpad_circ_purpose_to_mask(CIRCUIT_PURPOSE_C_GENERAL);

  // limits to help guard against excessive padding
  client_machine->allowed_padding_count = 1;
  client_machine->max_padding_percent = 1;

  // one state to start with: START (-> END, never takes a slot in states)
  circpad_machine_states_init(client_machine, 1);
  client_machine->states[CIRCPAD_STATE_START].
    next_state[CIRCPAD_EVENT_NONPADDING_SENT] =
    CIRCPAD_STATE_END;

  client_machine->machine_num = smartlist_len(machines_sl);
  circpad_register_padding_machine(client_machine, machines_sl);
  log_info(LD_CIRC,
           "Registered client WF APE padding machine (%u)",
           client_machine->machine_num);
}
```

We also have to modify `circpad_machines_init()` in `circuitpadding.c` to
register our machines:

```c
  /* Register machines for the APE WF defense */
  circpad_machine_client_wf_ape(origin_padding_machines);
  circpad_machine_relay_wf_ape(relay_padding_machines);
```

We run `make` to get a new `tor` binary and copy it to our local TB. 

## Run the machine

To be able to view circuit info events in the console as we launch TB, we add
`Log [circ]info notice stdout` to `torrc` of TB.

Running TB to visit example.com we first find in the log:

```
Aug 30 18:36:43.000 [info] circpad_machine_client_hide_intro_circuits(): Registered client intro point hiding padding machine (0)
Aug 30 18:36:43.000 [info] circpad_machine_relay_hide_intro_circuits(): Registered relay intro circuit hiding padding machine (0)
Aug 30 18:36:43.000 [info] circpad_machine_client_hide_rend_circuits(): Registered client rendezvous circuit hiding padding machine (1)
Aug 30 18:36:43.000 [info] circpad_machine_relay_hide_rend_circuits(): Registered relay rendezvous circuit hiding padding machine (1)
Aug 30 18:36:43.000 [info] circpad_machine_client_wf_ape(): Registered client WF APE padding machine (2)
Aug 30 18:36:43.000 [info] circpad_machine_relay_wf_ape(): Registered relay WF APE padding machine (2)
```

All good, our machines are registered. Looking further we find:

```
Aug 30 18:36:55.000 [info] circpad_setup_machine_on_circ(): Registering machine client_wf_ape to origin circ 2 (5)
Aug 30 18:36:55.000 [info] circpad_node_supports_padding(): Checking padding: supported
Aug 30 18:36:55.000 [info] circpad_negotiate_padding(): Negotiating padding on circuit 2 (5), command 2
Aug 30 18:36:55.000 [info] circpad_machine_spec_transition(): Circuit 2 circpad machine 0 transitioning from 0 to 65535
Aug 30 18:36:55.000 [info] circpad_machine_spec_transitioned_to_end(): Padding machine in end state on circuit 2 (5)
Aug 30 18:36:55.000 [info] circpad_circuit_machineinfo_free_idx(): Freeing padding info idx 0 on circuit 2 (5)
Aug 30 18:36:55.000 [info] circpad_handle_padding_negotiated(): Middle node did not accept our padding request on circuit 2 (5)
```
We see that our middle supports padding (since we upgraded to tor-0.4.1.5), that
we attempt to negotiate, and that our machine starts on the client, transitions
to the end state, and is freed. The last line shows that the middle doesn't have
a padding machine that can run.
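
When repeating this check after each change, grepping a saved log is quicker
than reading it. The sketch below matches the info-level log strings shown
above. `padding_negotiation_status` is a hypothetical helper; it looks at the
whole log at once rather than per circuit, so treat its answer as a hint only.

```bash
# Hypothetical helper: summarize padding negotiation as seen in a client log.
# Prints "rejected" if the middle refused a padding request, "no-rejection"
# if negotiation was attempted and no refusal was logged, and "none" if no
# negotiation shows up at all.
# Usage: padding_negotiation_status <logfile>
padding_negotiation_status() {
  log=$1
  if grep -q 'Middle node did not accept our padding request' "$log"; then
    echo rejected
  elif grep -q 'Negotiating padding on circuit' "$log"; then
    echo no-rejection
  else
    echo none
  fi
}
```

This assumes the console output of TB was saved to a file, for example by
redirecting the output of `./Browser/start-tor-browser --verbose`.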

Next, we follow the same steps as earlier and replace the modified `tor` at our
middle relay. We don't update the logging there to avoid logging on the info
level on the live network. Looking at the client log again we see that
negotiation works as before except for the last line: it's missing, so the
machine is running at the middle as well.  

## Implementing the APE state machine

Porting is fairly straightforward: define the states for all machines, add two
more machines (for the receive portion of WTF-PAD, beyond AP), and pick
reasonable parameters for the distributions (I completely winged it now, as when
implementing APE). The [circuit-padding-ape-machine
branch](https://github.com/pylls/tor/tree/circuit-padding-ape-machine) contains
the commits for the full machines with plenty of comments. 

Some comments on the process:

- `tor-0.4.1.5` does not support two machines on the same circuit; the fix in
  https://trac.torproject.org/projects/tor/ticket/31111 has to be applied.
  The good news is that everything else seems to work after the small change in
  the fix.
- APE randomizes its distributions. Currently, this can only be done during
  start of `tor`. This makes sense in the censorship circumvention setting
  (`obfs4`), less so for WF defenses: further randomizing each circuit is likely
  a PITA for attackers with few downsides.
- It was annoying to figure out that the lack of systemd support in my compiled
  tor caused systemd to interrupt (SIGINT) my tor process at the middle relay
  every five minutes. Updated build steps above to hopefully save others the
  pain.
- There's for sure some bug on relays when sending padding cells too early (?).
  It can happen with some probability with the APE implementation due to
  `circpad_machine_relay_wf_ape_send()`. Will investigate next.
- Moving the registration of machines from the definition of the machines to
  `circpad_machines_init()` makes sense, as suggested in the circuit padding doc
  draft.

Remember that APE is just a proof-of-concept and we make zero claims about its
ability to withstand WF attacks, in particular those based on deep learning.