<a id="tor-spec.txt-7"></a>

# Flow control{#flow-control}

<a id="tor-spec.txt-7.1"></a>

## Link throttling

Each client or relay should do appropriate bandwidth throttling to
keep its user happy.

Communicants rely on TCP's default flow control to push back when they
stop reading.

The mainline Tor implementation uses token buckets (one for reads,
one for writes) for the rate limiting.
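As a rough illustration of the token-bucket idea (not the mainline implementation), the sketch below refills a bucket at a fixed rate up to a burst ceiling and lets a caller take only as many bytes as there are tokens. The names `TokenBucket`, `rate`, `burst`, and `take` are hypothetical and chosen only for this example; a node would keep one such bucket for reads and one for writes.

```python
class TokenBucket:
    """Illustrative token bucket: refills at `rate` bytes/second up to `burst`."""

    def __init__(self, rate, burst, now=0.0):
        self.rate = rate      # refill rate in bytes per second (assumed unit)
        self.burst = burst    # maximum number of tokens the bucket can hold
        self.tokens = burst   # start with a full bucket
        self.last = now       # timestamp of the last refill

    def refill(self, now):
        # Add tokens for the elapsed time, never exceeding the burst size.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now

    def take(self, wanted, now):
        """Return how many of `wanted` bytes may be read or written right now."""
        self.refill(now)
        allowed = int(min(wanted, self.tokens))
        self.tokens -= allowed
        return allowed


read_bucket = TokenBucket(rate=100, burst=200)
write_bucket = TokenBucket(rate=100, burst=200)
```

When `take` returns less than was asked for, the caller simply stops reading or writing until more tokens accumulate, and TCP's own flow control pushes back on the peer.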

Since 0.2.0.x, Tor has let the user specify an additional pair of
token buckets for "relayed" traffic, so people can deploy a Tor relay
with strict rate limiting, but also use the same Tor as a client. To
avoid partitioning concerns we combine both classes of traffic over a
given OR connection, and keep track of the last time we read or wrote
a high-priority (non-relayed) cell. If it's been less than N seconds
(currently N=30), we give the whole connection high priority, else we
give the whole connection low priority. We also give low priority
to reads and writes for connections that are serving directory
information. See proposal 111 for details.
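The priority rule above can be sketched as follows. This is only an illustration under stated assumptions: the class `ConnPriority` and its method names are hypothetical, timestamps are plain seconds, and a connection that has never carried a non-relayed cell is assumed to be low priority.

```python
HIGH_PRIORITY_WINDOW_SECS = 30  # the "N" from the text above


class ConnPriority:
    """Tracks whether an OR connection currently counts as high priority."""

    def __init__(self):
        self.last_nonrelayed = None  # time of the last non-relayed cell, if any

    def note_nonrelayed_cell(self, now):
        # Called whenever a high-priority (non-relayed) cell is read or written.
        self.last_nonrelayed = now

    def is_high_priority(self, now, serves_directory=False):
        # Connections serving directory information are always low priority.
        if serves_directory:
            return False
        # Assumption: no non-relayed cell seen yet means low priority.
        if self.last_nonrelayed is None:
            return False
        return now - self.last_nonrelayed < HIGH_PRIORITY_WINDOW_SECS
```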

<a id="tor-spec.txt-7.2"></a>

## Link padding{#link-padding}

Link padding can be created by sending PADDING or VPADDING cells
along the connection; relay cells of type "DROP" can be used for
long-range padding.  The payloads of PADDING, VPADDING, or DROP
cells are filled with padding bytes. See [Cell Packet format](./cell-packet-format.md#cell-packet-format).

If the link protocol is version 5 or higher, link level padding is
enabled as per padding-spec.txt. On these connections, clients may
negotiate the use of padding with a CELL_PADDING_NEGOTIATE command
whose format is as follows:

```text
         Version           [1 byte]
         Command           [1 byte]
         ito_low_ms        [2 bytes]
         ito_high_ms       [2 bytes]
```

Currently, only version 0 of this cell is defined. In it, the command
field is either 1 (stop padding) or 2 (start padding). For the start
padding command, a pair of timeout values giving the low and high bounds
of the range for randomized padding timeouts may be specified as unsigned
integer values in milliseconds. The ito_low_ms field should not be lower
than the current consensus parameter value for nf_ito_low (default:
1500).  The ito_high_ms field should not be lower than ito_low_ms.
(If any party receives an out-of-range value, they clamp it so
that it is in-range.)

For the stop padding command, the timeout fields should be sent as
zero (to avoid client distinguishability) and ignored by the recipient.
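A minimal sketch of encoding and decoding the version 0 body described above. It assumes big-endian (network order) integers, and the function and constant names are hypothetical. The clamping shown is one reading of the out-of-range rule: ito_low_ms is raised to nf_ito_low, then ito_high_ms is raised to ito_low_ms.

```python
import struct

PADDING_STOP = 1
PADDING_START = 2
NF_ITO_LOW_DEFAULT = 1500  # default of the nf_ito_low consensus parameter

# Version, Command, ito_low_ms, ito_high_ms; big-endian is assumed here.
_BODY = struct.Struct(">BBHH")


def encode_padding_negotiate(command, ito_low_ms=0, ito_high_ms=0):
    """Build a version 0 body; timeouts are zeroed for the stop command."""
    if command == PADDING_STOP:
        ito_low_ms = ito_high_ms = 0
    return _BODY.pack(0, command, ito_low_ms, ito_high_ms)


def decode_padding_negotiate(body, nf_ito_low=NF_ITO_LOW_DEFAULT):
    """Return (command, low, high); low and high are None for stop."""
    version, command, low, high = _BODY.unpack_from(body)
    if version != 0:
        raise ValueError("unsupported version")
    if command == PADDING_STOP:
        return command, None, None  # timeout fields are ignored
    if command != PADDING_START:
        raise ValueError("unknown command")
    low = max(low, nf_ito_low)  # clamp up to the consensus minimum
    high = max(high, low)       # high must not be lower than low
    return command, low, high
```

For example, a start request with `ito_low_ms=1000` and `ito_high_ms=500` decodes to the clamped pair (1500, 1500) under the default consensus value.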

For more details on padding behavior, see padding-spec.txt.

<a id="tor-spec.txt-7.3"></a>

## Circuit-level flow control

To control a circuit's bandwidth usage, each OR keeps track of two
'windows', consisting of how many RELAY_DATA cells it is allowed to
originate or willing to consume.

These two windows are respectively named: the package window (packaged for
transmission) and the deliver window (delivered for local streams).

Because of our leaky-pipe topology, every relay on the circuit has a pair
of windows, and the OP has a pair of windows for every relay on the
circuit. These windows do not apply to relayed cells, however, and a relay
that is never used for streams will never decrement its window or cause the
client to decrement a window.

Each 'window' value is initially set based on the consensus parameter
'circwindow' in the directory (see dir-spec.txt), or to 1000 data cells if
no 'circwindow' value is given. In each direction, cells that are not
RELAY_DATA cells do not affect the window.

An OR or OP (depending on the stream direction) sends a RELAY_SENDME cell
to indicate that it is willing to receive more cells when its deliver
window has dropped by a full increment (100). For example, if the window
started at 1000, it should send a RELAY_SENDME when it reaches 900.

When an OR or OP receives a RELAY_SENDME, it increments its package window
by 100 (the circuit window increment) and resumes sending any remaining
RELAY_DATA cells.

If a package window reaches 0, the OR or OP stops reading from TCP
connections for all streams on the corresponding circuit, and sends no more
RELAY_DATA cells until receiving a RELAY_SENDME cell.

If a deliver window goes below 0, the circuit should be torn down.
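The window bookkeeping above can be sketched as follows. This is an illustration only: the class and method names are hypothetical, and the step where the sender of a RELAY_SENDME raises its own deliver window by one increment is an assumption that mirrors the package-window rule.

```python
CIRCWINDOW_START = 1000      # default when no 'circwindow' value is given
CIRCWINDOW_INCREMENT = 100   # circuit window increment


class CircuitWindows:
    """Package and deliver windows for one hop of a circuit."""

    def __init__(self, start=CIRCWINDOW_START):
        self.start = start
        self.package = start
        self.deliver = start

    def can_package(self):
        # At 0, stop reading from the streams' TCP connections.
        return self.package > 0

    def on_data_sent(self):
        if self.package <= 0:
            raise RuntimeError("package window exhausted; wait for RELAY_SENDME")
        self.package -= 1

    def on_sendme_received(self):
        self.package += CIRCWINDOW_INCREMENT

    def on_data_delivered(self):
        """Return True when a RELAY_SENDME should be sent now."""
        self.deliver -= 1
        if self.deliver < 0:
            raise ConnectionError("deliver window below 0; tear down circuit")
        if self.deliver <= self.start - CIRCWINDOW_INCREMENT:
            # Assumption: sending a SENDME reopens our deliver window.
            self.deliver += CIRCWINDOW_INCREMENT
            return True
        return False
```

Only RELAY_DATA cells would be passed to `on_data_sent` and `on_data_delivered`; other relay cells leave both windows untouched.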

Starting with tor-0.4.1.1-alpha, authenticated SENDMEs are supported
(version 1, see below). This means that both the OR and OP need to remember
the rolling digest of the cell that precedes (triggers) a RELAY_SENDME.
Such a cell can be identified because it brings the package window to a
multiple of the circuit window increment (100).

When the RELAY_SENDME version 1 arrives, it will contain a digest that MUST
match the one remembered. This represents a proof that the end point of the
circuit saw the sent cells. On failure to match, the circuit should be torn
down.
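One way the sending side might track the digests it expects back is sketched below. It assumes digests are remembered in a FIFO queue and compared in order; the class name and its methods are hypothetical, and `rolling_digest` stands in for whatever running digest the circuit's relay crypto maintains.

```python
import hmac
from collections import deque

CIRCWINDOW_INCREMENT = 100


class SendmeDigestTracker:
    """Remembers digests of RELAY_DATA cells that should trigger a SENDME."""

    def __init__(self, package_window=1000):
        self.package_window = package_window
        self.expected = deque()  # assumption: oldest expected digest first

    def on_data_sent(self, rolling_digest):
        self.package_window -= 1
        # A cell that brings the window to a multiple of the increment is
        # the one the other side will acknowledge with a RELAY_SENDME.
        if self.package_window % CIRCWINDOW_INCREMENT == 0:
            self.expected.append(bytes(rolling_digest[:20]))

    def on_sendme_v1(self, digest):
        """Return True if the digest matches; False means tear down."""
        if not self.expected:
            return False  # a SENDME arrived that we never expected
        ok = hmac.compare_digest(self.expected.popleft(), bytes(digest))
        if ok:
            self.package_window += CIRCWINDOW_INCREMENT
        return ok
```

`hmac.compare_digest` is used here only to get a constant-time comparison; the spec text does not require it.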

To ensure unpredictability, random bytes should be added to at least one
RELAY_DATA cell within each increment window. In other words, for every 100
cells (one increment), random bytes should be introduced in at least one cell.

<a id="tor-spec.txt-7.3.1"></a>

### SENDME Cell Format

A circuit-level RELAY_SENDME cell always has its StreamID=0.

An OR or OP must obey these two consensus parameters in order to know which
version to emit and accept.

```text
      'sendme_emit_min_version': Minimum version to emit.
      'sendme_accept_min_version': Minimum version to accept.
```

If a RELAY_SENDME version is received that is below the minimum accepted
version, the circuit should be closed.

The RELAY_SENDME payload contains the following:

```text
      VERSION     [1 byte]
      DATA_LEN    [2 bytes]
      DATA        [DATA_LEN bytes]
```

The VERSION tells us what is expected in the DATA section of length
DATA_LEN and how to handle it. The recognized values are:

0x00: The rest of the payload should be ignored.

0x01: Authenticated SENDME. The DATA section MUST contain:

```text
         DIGEST   [20 bytes]
```

If the DATA_LEN value is less than 20 bytes, the cell should be
dropped and the circuit closed. If the value is more than 20 bytes,
then the first 20 bytes should be read to get the DIGEST value.

The DIGEST is the rolling digest value from the RELAY_DATA cell that
immediately preceded (triggered) this RELAY_SENDME. The other side
matches this value against the digest it remembered for the
corresponding cell that it sent.

(Note that if the digest in use has an output length greater than 20
bytes, as is the case for the hop of an onion service rendezvous
circuit created by the hs_ntor handshake, we truncate the digest
to 20 bytes here.)

If the VERSION is unrecognized or below the minimum accepted version (taken
from the consensus), the circuit should be torn down.
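The parsing rules in this section can be sketched as below. The function name is hypothetical, integers are assumed to be big-endian, and treating a completely empty payload as version 0 is an assumption made for this example. A `ValueError` stands for "close the circuit".

```python
import struct

DIGEST_LEN = 20


def parse_sendme(payload, accept_min_version=0):
    """Return (version, digest_or_None); raise ValueError to close the circuit."""
    # Assumption: an empty payload is handled like version 0.
    version = payload[0] if payload else 0
    if version < accept_min_version or version not in (0, 1):
        raise ValueError("unrecognized or unacceptable SENDME version")
    if version == 0:
        return 0, None  # the rest of the payload is ignored
    if len(payload) < 3:
        raise ValueError("truncated SENDME payload")
    (data_len,) = struct.unpack_from(">H", payload, 1)
    if data_len < DIGEST_LEN or len(payload) < 3 + data_len:
        raise ValueError("DATA too short for a digest")
    # If DATA is longer than 20 bytes, only the first 20 are the DIGEST.
    return 1, bytes(payload[3:3 + DIGEST_LEN])
```

Here `accept_min_version` would be taken from the 'sendme_accept_min_version' consensus parameter.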

<a id="tor-spec.txt-7.4"></a>

## Stream-level flow control

Edge nodes use RELAY_SENDME cells to implement end-to-end flow
control for individual connections across circuits. Similarly to
circuit-level flow control, edge nodes begin with a window of cells
(500) per stream, and increment the window by a fixed value (50)
upon receiving a RELAY_SENDME cell. Edge nodes initiate RELAY_SENDME
cells when both a) the window is \<= 450, and b) there are fewer than
ten cell payloads remaining to be flushed at that edge.

Stream-level RELAY_SENDME cells are distinguished by having nonzero
StreamID. They are still empty; the body still SHOULD be ignored.
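The two-part condition for originating a stream-level RELAY_SENDME can be written as a small predicate. The function and parameter names are hypothetical; `queued_payloads` stands for the number of cell payloads still waiting to be flushed at this edge.

```python
STREAMWINDOW_START = 500     # initial per-stream window
STREAMWINDOW_INCREMENT = 50  # per-stream window increment
MAX_QUEUED_PAYLOADS = 10     # send only with fewer than this many queued


def should_send_stream_sendme(window, queued_payloads):
    """True when an edge node should originate a stream-level RELAY_SENDME."""
    window_low = window <= STREAMWINDOW_START - STREAMWINDOW_INCREMENT  # <= 450
    mostly_flushed = queued_payloads < MAX_QUEUED_PAYLOADS
    return window_low and mostly_flushed
```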