Diffstat (limited to 'doc/HACKING/CircuitPaddingQuickStart.md')
-rw-r--r-- | doc/HACKING/CircuitPaddingQuickStart.md | 266 |
1 files changed, 266 insertions, 0 deletions
# A Padding Machine from Scratch

A quickstart guide by Tobias Pulls.

This document describes the process of building a "padding machine" in tor's new
circuit padding framework from scratch. Notes were taken as part of porting
[Adaptive Padding Early
(APE)](https://www.cs.kau.se/pulls/hot/thebasketcase-ape/) from basket2 to the
circuit padding framework. The goal is just to document the process and provide
useful pointers along the way, not create a useful machine.

The quick and dirty plan is to:
1. clone and compile tor
2. use the newly built tor in TB and at a small (non-exit) relay we run
3. add a bare-bones APE padding machine
4. run the machine, inspect logs for activity
5. port APE's state machine without thinking much about parameters

## Clone and compile tor

```bash
git clone https://git.torproject.org/tor.git
cd tor
git checkout tor-0.4.1.5
```
Above we use the tag for tor-0.4.1.5 where the circuit padding framework was
released. Note that this version of the framework is missing many features and
fixes that have since been merged to origin/master. If you need the newest
framework features, you should use master instead.

```bash
sh autogen.sh
./configure
make
```
When you run `./configure` you'll be told of missing dependencies and packages
to install on Debian-based distributions.
Important: if you plan to run `tor` on
a relay as part of the real Tor network and your server runs a distribution that
uses systemd, then I'd recommend that you `apt install dpkg dpkg-dev
libevent-dev libssl-dev asciidoc quilt dh-apparmor libseccomp-dev dh-systemd
libsystemd-dev pkg-config dh-autoreconf libfakeroot zlib1g zlib1g-dev automake
liblzma-dev libzstd-dev` and ensure that tor has systemd support enabled:
`./configure --enable-systemd`. Without this, on a recent Ubuntu, my tor service
was forcefully restarted (SIGINT interrupt) by systemd every five minutes.

If you want to install on your local system, run `make install`. For our case we
just want the tor binary at `src/app/tor`.

## Use tor in TB and at a relay

Download and install a fresh Tor Browser (TB) from torproject.org. Make sure it
works. From the command line, relative to the folder created when you extracted
TB, run `./Browser/start-tor-browser --verbose` to get some basic log output.
Note the version of tor, in my case, `Tor 0.4.0.5 (git-bf071e34aa26e096)` as
part of TB 8.5.4. Shut down TB, copy the `tor` binary that you compiled earlier
and replace `Browser/TorBrowser/Tor/tor`. Start TB from the command line again;
you should see a different version, in my case `Tor 0.4.1.5
(git-439ca48989ece545)`.

The relay we run is also on Linux, and `tor` is located at `/usr/bin/tor`. To
view relevant logs since last boot, run `sudo journalctl -b /usr/bin/tor`, where
we find `Tor 0.4.0.5 running on Linux`. Copy the locally compiled `tor` to the
relay at a temporary location and then make sure its ownership and access
rights are identical to `/usr/bin/tor`. Next, shut down the running tor service
with `sudo service tor stop`, wait for it to stop (typically 30s), copy our
locally compiled tor to replace `/usr/bin/tor`, then start the service again.
Checking the logs we see `Tor 0.4.1.5 (git-439ca48989ece545)`.
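
The relay steps above can be sketched as two small shell helpers. This is only a sketch under stated assumptions: `match_perms` and `swap_relay_binary` are hypothetical names, the `--reference` option requires GNU coreutils, and the swap is assumed to run as root with the staged binary already copied to the relay.

```bash
# Hypothetical helper: give "$2" the same owner, group, and mode as "$1".
# Relies on the GNU coreutils --reference option of chown and chmod.
match_perms() {
  local ref="$1" target="$2"
  chown --reference="$ref" "$target" &&
    chmod --reference="$ref" "$target"
}

# Hypothetical helper: replace the installed binary with a staged build.
# Assumes root privileges and a service named "tor", as in the text above.
swap_relay_binary() {
  local staged="$1" installed="${2:-/usr/bin/tor}"
  match_perms "$installed" "$staged" &&
    service tor stop &&
    cp --preserve=mode,ownership "$staged" "$installed" &&
    service tor start
}

# Usage (as root on the relay), with an assumed staging path:
#   swap_relay_binary /tmp/tor-new
```

Stopping the service before copying avoids overwriting a binary that is still running, and chaining with `&&` means the service is only restarted if the copy succeeded.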

Repeatedly shutting down a relay is detrimental to the network and should be
avoided. Sorry about that.

We have one more step left before we move on to the machine: configure TB to
always use our middle relay. Edit `Browser/TorBrowser/Data/Tor/torrc` and set
`MiddleNodes <fingerprint>`, where `<fingerprint>` is the fingerprint of the
relay. Start TB, visit a website, and manually confirm that the middle is used
by looking at the circuit display.

## Add a bare-bones APE padding machine

Now the fun part. We have several resources at our disposal (mind that links
might be broken in the future, just search for the headings):
- The official [Circuit Padding Developer
  Documentation](https://storm.torproject.org/shared/ChieH_sLU93313A2gopZYT3x2waJ41hz5Hn2uG1Uuh7).
- Notes we made on the [implementation of the circuit padding
  framework](https://github.com/pylls/padding-machines-for-tor/blob/master/notes/circuit-padding-framework.md).
- The implementation of the current circuit padding machines in tor:
  [circuitpadding_machines.c](https://gitweb.torproject.org/tor.git/tree/src/core/or/circuitpadding_machines.c)
  and
  [circuitpadding_machines.h](https://gitweb.torproject.org/tor.git/tree/src/core/or/circuitpadding_machines.h).

Please consult the above links for details. Moving forward, the focus is to
describe what was done, not necessarily to explain in detail why.

Since we plan to make changes to tor, create a new branch: `git checkout -b
circuit-padding-ape-machine tor-0.4.1.5`.

We start with declaring two functions, one for the machine at the client and one
at the relay, in `circuitpadding_machines.h`:

```c
void circpad_machine_relay_wf_ape(smartlist_t *machines_sl);
void circpad_machine_client_wf_ape(smartlist_t *machines_sl);
```

The definitions go into `circuitpadding_machines.c`:

```c
/**************** Adaptive Padding Early (APE) machine ****************/

/**
 * Create a relay-side padding machine based on the APE design.
 */
void
circpad_machine_relay_wf_ape(smartlist_t *machines_sl)
{
  circpad_machine_spec_t *relay_machine
      = tor_malloc_zero(sizeof(circpad_machine_spec_t));

  relay_machine->name = "relay_wf_ape";
  relay_machine->is_origin_side = 0; // relay-side

  // Pad to/from the middle relay, only when the circuit has streams
  relay_machine->target_hopnum = 2;
  relay_machine->conditions.min_hops = 2;
  relay_machine->conditions.state_mask = CIRCPAD_CIRC_STREAMS;

  // limits to help guard against excessive padding
  relay_machine->allowed_padding_count = 1;
  relay_machine->max_padding_percent = 1;

  // one state to start with: START (-> END, never takes a slot in states)
  circpad_machine_states_init(relay_machine, 1);
  relay_machine->states[CIRCPAD_STATE_START].
      next_state[CIRCPAD_EVENT_NONPADDING_SENT] =
      CIRCPAD_STATE_END;

  // register the machine
  relay_machine->machine_num = smartlist_len(machines_sl);
  circpad_register_padding_machine(relay_machine, machines_sl);

  log_info(LD_CIRC,
           "Registered relay WF APE padding machine (%u)",
           relay_machine->machine_num);
}

/**
 * Create a client-side padding machine based on the APE design.
 */
void
circpad_machine_client_wf_ape(smartlist_t *machines_sl)
{
  circpad_machine_spec_t *client_machine
      = tor_malloc_zero(sizeof(circpad_machine_spec_t));

  client_machine->name = "client_wf_ape";
  client_machine->is_origin_side = 1; // client-side

  /** Pad to/from the middle relay, only when the circuit has streams, and only
   * for general purpose circuits (typical for web browsing)
   */
  client_machine->target_hopnum = 2;
  client_machine->conditions.min_hops = 2;
  client_machine->conditions.state_mask = CIRCPAD_CIRC_STREAMS;
  client_machine->conditions.purpose_mask =
      circpad_circ_purpose_to_mask(CIRCUIT_PURPOSE_C_GENERAL);

  // limits to help guard against excessive padding
  client_machine->allowed_padding_count = 1;
  client_machine->max_padding_percent = 1;

  // one state to start with: START (-> END, never takes a slot in states)
  circpad_machine_states_init(client_machine, 1);
  client_machine->states[CIRCPAD_STATE_START].
      next_state[CIRCPAD_EVENT_NONPADDING_SENT] =
      CIRCPAD_STATE_END;

  client_machine->machine_num = smartlist_len(machines_sl);
  circpad_register_padding_machine(client_machine, machines_sl);
  log_info(LD_CIRC,
           "Registered client WF APE padding machine (%u)",
           client_machine->machine_num);
}
```

We also have to modify `circpad_machines_init()` in `circuitpadding.c` to
register our machines:

```c
  /* Register machines for the APE WF defense */
  circpad_machine_client_wf_ape(origin_padding_machines);
  circpad_machine_relay_wf_ape(relay_padding_machines);
```

We run `make` to get a new `tor` binary and copy it to our local TB.

## Run the machine

To be able to view circuit info events in the console as we launch TB, we add
`Log [circ]info notice stdout` to the `torrc` of TB.
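
Adding the logging line by hand is fine; for repeated experiments, a small helper that appends it only once may be convenient. The sketch below assumes the `torrc` path mentioned earlier, relative to the TB folder, and `ensure_torrc_line` is a hypothetical name.

```bash
# Hypothetical helper: append a line to a torrc file unless it is already there.
ensure_torrc_line() {
  local torrc="$1" line="$2"
  # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
  grep -qxF -- "$line" "$torrc" 2>/dev/null || printf '%s\n' "$line" >> "$torrc"
}

# Usage, from the TB folder:
#   ensure_torrc_line Browser/TorBrowser/Data/Tor/torrc 'Log [circ]info notice stdout'
```

The fixed-string match matters here because the square brackets in `[circ]` would otherwise be read as a regular expression character class.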

Running TB to visit example.com we first find in the log:

```
Aug 30 18:36:43.000 [info] circpad_machine_client_hide_intro_circuits(): Registered client intro point hiding padding machine (0)
Aug 30 18:36:43.000 [info] circpad_machine_relay_hide_intro_circuits(): Registered relay intro circuit hiding padding machine (0)
Aug 30 18:36:43.000 [info] circpad_machine_client_hide_rend_circuits(): Registered client rendezvous circuit hiding padding machine (1)
Aug 30 18:36:43.000 [info] circpad_machine_relay_hide_rend_circuits(): Registered relay rendezvous circuit hiding padding machine (1)
Aug 30 18:36:43.000 [info] circpad_machine_client_wf_ape(): Registered client WF APE padding machine (2)
Aug 30 18:36:43.000 [info] circpad_machine_relay_wf_ape(): Registered relay WF APE padding machine (2)
```

All good, our machines are registered. Looking further we find:

```
Aug 30 18:36:55.000 [info] circpad_setup_machine_on_circ(): Registering machine client_wf_ape to origin circ 2 (5)
Aug 30 18:36:55.000 [info] circpad_node_supports_padding(): Checking padding: supported
Aug 30 18:36:55.000 [info] circpad_negotiate_padding(): Negotiating padding on circuit 2 (5), command 2
Aug 30 18:36:55.000 [info] circpad_machine_spec_transition(): Circuit 2 circpad machine 0 transitioning from 0 to 65535
Aug 30 18:36:55.000 [info] circpad_machine_spec_transitioned_to_end(): Padding machine in end state on circuit 2 (5)
Aug 30 18:36:55.000 [info] circpad_circuit_machineinfo_free_idx(): Freeing padding info idx 0 on circuit 2 (5)
Aug 30 18:36:55.000 [info] circpad_handle_padding_negotiated(): Middle node did not accept our padding request on circuit 2 (5)
```
We see that our middle supports padding (since we upgraded to tor-0.4.1.5), that
we attempt to negotiate, and that our machine starts on the client, transitions
to the end state, and is freed. The last line shows that the middle doesn't have
a padding machine that can run.
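
When repeating this check, it can help to capture the console output to a file and grep for the two messages that matter. The sketch below assumes the output was saved to a file; `padding_status` is a hypothetical helper and matches only message text quoted in the logs above.

```bash
# Hypothetical helper: summarize what a saved client log says about our machine.
padding_status() {
  local log="$1"
  if ! grep -q 'Registering machine client_wf_ape' "$log"; then
    echo "machine never attached to a circuit"
  elif grep -q 'Middle node did not accept our padding request' "$log"; then
    echo "middle rejected negotiation"
  else
    echo "no rejection logged"
  fi
}

# Usage, with an assumed capture file:
#   padding_status tb.log
```

Note that "no rejection logged" only means the rejection message is absent; it does not by itself show that padding cells are flowing.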

Next, we follow the same steps as earlier and install the modified `tor` at our
middle relay. We don't update the logging there to avoid logging on the info
level on the live network. Looking at the client log again we see that
negotiation works as before except for the last line: it's missing, so the
machine is running at the middle as well.

## Implementing the APE state machine

Porting is fairly straightforward: define the states for all machines, add two
more machines (for the receive portion of WTF-PAD, beyond AP), and pick
reasonable parameters for the distributions (I completely winged it now, as when
implementing APE). The [circuit-padding-ape-machine
branch](https://github.com/pylls/tor/tree/circuit-padding-ape-machine) contains
the commits for the full machines with plenty of comments.

Some comments on the process:

- `tor-0.4.1.5` does not support two machines on the same circuit; the following
  fix has to be made: https://trac.torproject.org/projects/tor/ticket/31111 .
  The good news is that everything else seems to work after the small change in
  the fix.
- APE randomizes its distributions. Currently, this can only be done during
  start of `tor`. This makes sense in the censorship circumvention setting
  (`obfs4`), less so for WF defenses: further randomizing each circuit is likely
  a PITA for attackers with few downsides.
- It was annoying to figure out that the lack of systemd support in my compiled
  tor caused systemd to interrupt (SIGINT) my tor process at the middle relay
  every five minutes. Updated build steps above to hopefully save others the
  pain.
- There's for sure some bug on relays when sending padding cells too early (?).
  It can happen with some probability with the APE implementation due to
  `circpad_machine_relay_wf_ape_send()`. Will investigate next.
- Moving the registration of machines from the definition of the machines to
  `circpad_machines_init()` makes sense, as suggested in the circuit padding doc
  draft.

Remember that APE is just a proof-of-concept and we make zero claims about its
ability to withstand WF attacks, in particular those based on deep learning.