
<a id="tor-spec.txt-6.1"></a>
## Relay cells

Within a circuit, the OP and the end node use the contents of
RELAY packets to tunnel end-to-end commands and TCP connections
("Streams") across circuits.  End-to-end commands can be initiated
by either edge; streams are initiated by the OP.

End nodes that accept streams may be:
* exit relays (RELAY_BEGIN, anonymous),
* directory servers (RELAY_BEGIN_DIR, anonymous or non-anonymous),
* onion services (RELAY_BEGIN, anonymous via a rendezvous point).

The payload of each unencrypted RELAY cell consists of:

```text
         Relay command           [1 byte]
         'Recognized'            [2 bytes]
         StreamID                [2 bytes]
         Digest                  [4 bytes]
         Length                  [2 bytes]
         Data                    [Length bytes]
         Padding                 [PAYLOAD_LEN - 11 - Length bytes]
```

The relay commands are:

```text
         1 -- RELAY_BEGIN     [forward]
         2 -- RELAY_DATA      [forward or backward]
         3 -- RELAY_END       [forward or backward]
         4 -- RELAY_CONNECTED [backward]
         5 -- RELAY_SENDME    [forward or backward] [sometimes control]
         6 -- RELAY_EXTEND    [forward]             [control]
         7 -- RELAY_EXTENDED  [backward]            [control]
         8 -- RELAY_TRUNCATE  [forward]             [control]
         9 -- RELAY_TRUNCATED [backward]            [control]
        10 -- RELAY_DROP      [forward or backward] [control]
        11 -- RELAY_RESOLVE   [forward]
        12 -- RELAY_RESOLVED  [backward]
        13 -- RELAY_BEGIN_DIR [forward]
        14 -- RELAY_EXTEND2   [forward]             [control]
        15 -- RELAY_EXTENDED2 [backward]            [control]

        16..18 -- Reserved for UDP; not yet in use, see prop339.

        19..22 -- Reserved for Conflux, see prop329.

        32..40 -- Used for hidden services; see rend-spec-{v2,v3}.txt.

        41..42 -- Used for circuit padding; see Section 3 of padding-spec.txt.

        Used for flow control; see Section 4 of prop324.
        43 -- XON             [forward or backward]
        44 -- XOFF            [forward or backward]
```
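
As an illustration of the layout above, here is a minimal Python sketch that packs and unpacks an unencrypted RELAY payload. The function names are hypothetical. The value `PAYLOAD_LEN = 509` and the big-endian field encoding are assumptions taken from outside this section; adjust them if your cell format differs.

```python
import struct

# Assumption: the fixed cell body size; it is defined elsewhere in the spec.
PAYLOAD_LEN = 509

# command (1), recognized (2), stream_id (2), digest (4), length (2),
# assumed to be in network (big-endian) byte order.
HEADER = struct.Struct(">BHH4sH")
assert HEADER.size == 11


def pack_relay_payload(command, stream_id, data, padding=None):
    """Build an unencrypted payload with 'recognized' and 'digest' zeroed."""
    max_data = PAYLOAD_LEN - HEADER.size
    if len(data) > max_data:
        raise ValueError("data does not fit in one RELAY cell")
    pad_len = max_data - len(data)
    if padding is None:
        padding = b"\x00" * pad_len  # placeholder; not the recommended padding
    if len(padding) != pad_len:
        raise ValueError("padding has the wrong length")
    header = HEADER.pack(command, 0, stream_id, b"\x00" * 4, len(data))
    return header + data + padding


def unpack_relay_payload(payload):
    """Split a decrypted payload into its fields."""
    if len(payload) != PAYLOAD_LEN:
        raise ValueError("unexpected payload size")
    command, recognized, stream_id, digest, length = HEADER.unpack_from(payload)
    if length > PAYLOAD_LEN - HEADER.size:
        raise ValueError("Length field exceeds the available space")
    data = payload[HEADER.size:HEADER.size + length]
    return {
        "command": command,
        "recognized": recognized,
        "stream_id": stream_id,
        "digest": digest,
        "data": data,
    }
```

For example, `unpack_relay_payload(pack_relay_payload(2, 7, b"hello"))` yields a RELAY_DATA cell on stream 7 whose data is `b"hello"`.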

Commands labelled as "forward" must only be sent by the originator
of the circuit. Commands labelled as "backward" must only be sent by
other nodes in the circuit back to the originator. Commands labelled
as "forward or backward" can be sent by either the originator or other nodes.
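
The direction rule can be sketched as a lookup table. This is only an illustration covering the commands that the table above lists individually; `Direction`, `ALLOWED`, and `may_send` are hypothetical names, not part of any implementation.

```python
from enum import Flag, auto


class Direction(Flag):
    FORWARD = auto()   # sent by the originator of the circuit
    BACKWARD = auto()  # sent by another node back to the originator


BOTH = Direction.FORWARD | Direction.BACKWARD

# Directions copied from the command table above.
ALLOWED = {
    1: Direction.FORWARD,    # RELAY_BEGIN
    2: BOTH,                 # RELAY_DATA
    3: BOTH,                 # RELAY_END
    4: Direction.BACKWARD,   # RELAY_CONNECTED
    5: BOTH,                 # RELAY_SENDME
    6: Direction.FORWARD,    # RELAY_EXTEND
    7: Direction.BACKWARD,   # RELAY_EXTENDED
    8: Direction.FORWARD,    # RELAY_TRUNCATE
    9: Direction.BACKWARD,   # RELAY_TRUNCATED
    10: BOTH,                # RELAY_DROP
    11: Direction.FORWARD,   # RELAY_RESOLVE
    12: Direction.BACKWARD,  # RELAY_RESOLVED
    13: Direction.FORWARD,   # RELAY_BEGIN_DIR
    14: Direction.FORWARD,   # RELAY_EXTEND2
    15: Direction.BACKWARD,  # RELAY_EXTENDED2
    43: BOTH,                # XON
    44: BOTH,                # XOFF
}


def may_send(command, sender_is_originator):
    """Return True if the sender is allowed to emit this command."""
    needed = Direction.FORWARD if sender_is_originator else Direction.BACKWARD
    return bool(ALLOWED[command] & needed)
```

Commands that are not in the table raise `KeyError` here; their direction rules live in the documents that the table refers to.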

The 'recognized' field is used as a simple indication that the cell
is still encrypted. It is an optimization to avoid calculating
expensive digests for every cell. When sending cells, the unencrypted
'recognized' MUST be set to zero.

When receiving and decrypting cells, the 'recognized' field will always
be zero if we're the endpoint that the cell is destined for.  For cells
that we should relay, the 'recognized' field will usually be nonzero,
but will accidentally be zero with P=2^-16.

When handling a relay cell, if the 'recognized' field in a
decrypted relay payload is zero, the 'digest' field is computed as
the first four bytes of the running digest of all the bytes that have
been destined for this hop of the circuit or originated from this hop
of the circuit, seeded from Df or Db respectively (obtained in
section 5.2 above), and including this RELAY cell's entire payload
(taken with the digest field set to zero).  Note that these digests
_do_ include the padding bytes at the end of the cell, not only the
bytes covered by 'Length'.  If the digest is correct, the cell is
considered "recognized" for the purposes of decryption (see section 5.5 above).

(The digest does not include any bytes from relay cells that do
not start or end at this hop of the circuit. That is, it does not
include forwarded data. Therefore if 'recognized' is zero but the
digest does not match, the running digest at that node should
not be updated, and the cell should be forwarded on.)

All RELAY cells pertaining to the same tunneled stream have the same
StreamID.  StreamIDs are chosen arbitrarily by the OP.  No stream
may have a StreamID of zero. Instead, a StreamID of zero is used by
RELAY cells that affect the entire circuit rather than a particular
stream; they are marked in the table above as "[control]" cells.
(SENDME cells are marked as "sometimes control" because they
can include a StreamID or not depending on their purpose; see
Section 7.)
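
A receiver might check the StreamID rule along these lines. This is a hedged sketch that covers only commands 1 through 15 from the table above; the set names and `stream_id_is_valid` are hypothetical.

```python
# Commands marked "[control]" in the table above.
CONTROL_COMMANDS = {6, 7, 8, 9, 10, 14, 15}
# RELAY_SENDME is "sometimes control": both forms are acceptable.
RELAY_SENDME = 5
# The remaining commands from 1..15 refer to a particular stream.
STREAM_COMMANDS = {1, 2, 3, 4, 11, 12, 13}


def stream_id_is_valid(command, stream_id):
    """Check the StreamID against the command's control/stream role."""
    if command == RELAY_SENDME:
        return True
    if command in CONTROL_COMMANDS:
        return stream_id == 0
    if command in STREAM_COMMANDS:
        return stream_id != 0
    raise ValueError("command is outside the range this sketch covers")
```

What to do with a cell that fails this check is a policy decision that this section does not specify.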

The 'Length' field of a relay cell contains the number of bytes in
the relay payload which contain real payload data. The remainder of
the unencrypted payload is padded with padding bytes. Implementations
handle padding bytes of unencrypted relay cells as they do padding
bytes for other cell types; see Section 3.

The 'Padding' field is used to make relay cell contents unpredictable, to
avoid certain attacks (see proposal 289 for rationale). Implementations
SHOULD fill this field with four zero-valued bytes, followed by as many
random bytes as will fit.  (If there are fewer than 4 bytes for padding,
then they should all be filled with zero.)

Implementations MUST NOT rely on the contents of the 'Padding' field.
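
The recommended padding layout can be sketched as follows. `make_padding` is a hypothetical helper, and `os.urandom` is an assumed source of random bytes; an implementation may use a different generator.

```python
import os


def make_padding(pad_len):
    """Return pad_len bytes: up to four zero bytes, then random bytes."""
    if pad_len < 0:
        raise ValueError("padding length cannot be negative")
    zero_len = min(4, pad_len)
    # os.urandom(0) returns b"", so short padding is all zeros.
    return b"\x00" * zero_len + os.urandom(pad_len - zero_len)
```

Here `pad_len` is `PAYLOAD_LEN - 11 - Length`, as given in the payload layout.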

If the RELAY cell is recognized but the relay command is not
understood, the cell must be dropped and ignored. Its contents
still count with respect to the digests and flow control windows, though.

<a id="tor-spec.txt-6.1.1"></a>
### Calculating the 'Digest' field

The 'Digest' field is used to check whether a cell has been fully
decrypted, that is, whether all onion layers have been removed.  The
'Recognized' field alone is not sufficient for this, as outlined above.

When ENCRYPTING a RELAY cell, an implementation does the following:

```text
     # Encode the cell in binary (recognized and digest set to zero)
     tmp = cmd + [0, 0] + stream_id + [0, 0, 0, 0] + length + data + padding

     # Update the digest with the encoded data
     digest_state = hash_update(digest_state, tmp)
     digest = hash_calculate(digest_state)

     # The encoded data is the same as above with the digest field not being
     # zero anymore
     encoded = cmd + [0, 0] + stream_id + digest[0..4] + length + data +
               padding

     # Now we can encrypt the cell by adding the onion layers ...
```

When DECRYPTING a RELAY cell, an implementation does the following:

```text
     decrypted = decrypt(cell)

     # Replace the digest field in decrypted by zeros
     tmp = decrypted[0..5] + [0, 0, 0, 0] + decrypted[9..]

     # Update the running digest with the decrypted data, taken with its
     # digest field set to zero
     digest_state = hash_update(digest_state, tmp)
     digest = hash_calculate(digest_state)

     if digest[0..4] == decrypted[5..9]
       # The cell has been fully decrypted ...
     else
       # Not for this hop: discard the updated digest_state and forward ...
```

Note that only the binary data with the digest bytes set to zero is
taken into account when calculating the running digest.  The final
plain-text cells (with the digest field set to their actual value) are
not fed into the running digest.
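
The two procedures above can be sketched in Python as follows. This is an illustration under stated assumptions, not a reference implementation: SHA-1 stands in for the hash function (which is specified elsewhere), `seed` stands for Df or Db, and the class and method names are hypothetical. The receiver works on a copy of the running state so that a mismatching cell leaves the running digest untouched.

```python
import hashlib

DIGEST_START, DIGEST_END = 5, 9  # byte range of the 'Digest' field


def _with_zero_digest(payload):
    """Return the payload with the 'Digest' field replaced by zeros."""
    return payload[:DIGEST_START] + b"\x00" * 4 + payload[DIGEST_END:]


class RunningDigest:
    def __init__(self, seed):
        # Assumption: SHA-1 as the hash; seed is Df or Db.
        self._state = hashlib.sha1(seed)

    def seal(self, payload):
        """Sender side: fill in 'Digest' before the cell is encrypted."""
        self._state.update(_with_zero_digest(payload))
        tag = self._state.digest()[:4]
        return payload[:DIGEST_START] + tag + payload[DIGEST_END:]

    def check(self, payload):
        """Receiver side: True if the decrypted cell is for this hop."""
        if payload[1:3] != b"\x00\x00":  # 'Recognized' must be zero
            return False
        trial = self._state.copy()
        trial.update(_with_zero_digest(payload))
        if trial.digest()[:4] != payload[DIGEST_START:DIGEST_END]:
            return False  # running digest is left unchanged
        self._state = trial
        return True
```

Both edges must feed their cells through matching `RunningDigest` objects in the same order, since every recognized cell advances the shared running state.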