


<a id="rend-spec-v3.txt-2.1"></a>

# Deriving blinded keys and subcredentials {#SUBCRED}

In each time period (see \[TIME-PERIODS\] for a definition of time
periods), a hidden service host uses a different blinded private key
to sign its directory information, and clients use a different
blinded public key as the index for fetching that information.

For a candidate key derivation method, see Appendix \[KEYBLIND\].

Additionally, clients and hosts derive a subcredential for each
period. Knowledge of the subcredential is needed to decrypt hidden
service descriptors for each period and to authenticate with the
hidden service host in the introduction process. Unlike the
credential, it changes each period. Knowing the subcredential, even
in combination with the blinded private key, does not enable the
hidden service host to derive the main credential--therefore, it is
safe to put the subcredential on the hidden service host while
leaving the hidden service's private key offline.

The subcredential for a period is derived as:

```text
N_hs_subcred = H("subcredential" | N_hs_cred | blinded-public-key).
```

In the above formula, credential corresponds to:

```text
N_hs_cred = H("credential" | public-identity-key)
```

where `public-identity-key` is the public identity master key of the hidden
service.
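The two formulas above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: it assumes that `H` is SHA3-256 and that `|` is plain byte concatenation (neither is defined in this section), and the function names are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    # Assumption: H is SHA3-256; this section does not define H itself.
    return hashlib.sha3_256(data).digest()


def hs_credential(public_identity_key: bytes) -> bytes:
    # N_hs_cred = H("credential" | public-identity-key)
    return h(b"credential" + public_identity_key)


def hs_subcredential(public_identity_key: bytes,
                     blinded_public_key: bytes) -> bytes:
    # N_hs_subcred = H("subcredential" | N_hs_cred | blinded-public-key)
    cred = hs_credential(public_identity_key)
    return h(b"subcredential" + cred + blinded_public_key)
```

Because the blinded public key changes every time period, the subcredential changes with it, while the credential stays fixed for the lifetime of the identity key.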

# Locating, uploading, and downloading hidden service descriptors {#HASHRING}

To avoid attacks where a hidden service's descriptor is easily
targeted for censorship, we store them at different directories over
time, and use shared random values to prevent those directories from
being predictable far in advance.

Which Tor servers host a hidden service's descriptor depends on:

```text
         * the current time period,
         * the daily subcredential,
         * the hidden service directories' public keys,
         * a shared random value that changes in each time period,
           shared_random_value,
         * a set of network-wide networkstatus consensus parameters.
           (Consensus parameters are integer values voted on by authorities
           and published in the consensus documents, described in
           dir-spec.txt, section 3.3.)
```

Below we explain in more detail.

<a id="rend-spec-v3.txt-2.2.1"></a>

## Dividing time into periods {#TIME-PERIODS}

To prevent a single set of hidden service directories from becoming a
target for adversaries looking to permanently censor a hidden service,
hidden service descriptors are uploaded to different locations that
change over time.

The length of a "time period" is controlled by the consensus
parameter 'hsdir-interval', and is a number of minutes between 30 and
14400 (10 days). The default time period length is 1440 (one day).

Time periods start at the Unix epoch (Jan 1, 1970), and are computed by
taking the number of minutes since the epoch and dividing by the time
period length. However, we want our time periods to start at a regular
offset from the SRV voting schedule, so we subtract a "rotation time offset"
of 12 voting periods from the number of minutes since the epoch, before
dividing by the time period length (effectively making "our" epoch start at
Jan 1, 1970 12:00 UTC when the voting period is 1 hour).

Example: If the current time is 2016-04-13 11:15:01 UTC, then the number of
seconds since the epoch is 1460546101, and the number of minutes since the
epoch is 24342435. We then subtract the "rotation time offset" of 12\*60
minutes from the minutes since the epoch, to get 24341715. If the current
time period length is 1440 minutes, by doing the division we see that we are
currently in time period number 16903.

Specifically, time period #16903 began 16903\*1440\*60 + (12\*60\*60) seconds
after the epoch, at 2016-04-12 12:00 UTC, and ended at 16904\*1440\*60 +
(12\*60\*60) seconds after the epoch, at 2016-04-13 12:00 UTC.
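The arithmetic in the example above can be reproduced with a short Python sketch. The function and parameter names are illustrative only; the defaults assume a 1440-minute time period and a 60-minute voting period, as in the example.

```python
def time_period_num(unix_seconds: int,
                    period_length_min: int = 1440,
                    voting_interval_min: int = 60) -> int:
    """Return the time period number containing `unix_seconds`."""
    minutes_since_epoch = unix_seconds // 60
    # Rotation time offset: 12 voting periods, expressed in minutes.
    rotation_offset_min = 12 * voting_interval_min
    return (minutes_since_epoch - rotation_offset_min) // period_length_min


def time_period_start(period_num: int,
                      period_length_min: int = 1440,
                      voting_interval_min: int = 60) -> int:
    """Return the Unix time (in seconds) at which `period_num` begins."""
    rotation_offset_min = 12 * voting_interval_min
    return (period_num * period_length_min + rotation_offset_min) * 60


# 2016-04-13 11:15:01 UTC falls in time period 16903.
print(time_period_num(1460546101))
# Period 16903 begins at 2016-04-12 12:00 UTC (1460462400).
print(time_period_start(16903))
```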

<a id="rend-spec-v3.txt-2.2.2"></a>

## When to publish a hidden service descriptor {#WHEN-HSDESC}

Hidden services periodically publish their descriptor to the responsible
HSDirs. The set of responsible HSDirs is determined as specified in
\[WHERE-HSDESC\].

Specifically, every time a hidden service publishes its descriptor, it also
sets up a timer for a random time between 60 minutes and 120 minutes in the
future. When the timer triggers, the hidden service needs to publish its
descriptor again to the responsible HSDirs for that time period.
\[TODO: Control republish period using a consensus parameter?\]

<a id="rend-spec-v3.txt-2.2.2.1"></a>

### Overlapping descriptors {#OVERLAPPING-DESCS}

Hidden services need to upload multiple descriptors so that they can be
reached by clients whose consensuses are older or newer than their own.
Services need to upload their descriptors to the HSDirs *before* the
beginning of each upcoming time period, so that they are readily available
for clients to fetch. Furthermore, services should keep uploading their old
descriptor even after the end of a time period, so that they can be reached
by clients that still have consensuses from the previous time period.

Hence, services maintain two active descriptors at every point. Clients, on
the other hand, have no notion of overlapping descriptors, and instead
always download the descriptor for the current time period and shared random
value. It is the job of the service to ensure that descriptors will be
available for all clients. See section \[FETCHUPLOADDESC\] for how this is
achieved.

\[TODO: What to do when we run multiple hidden services in a single host?\]

<a id="rend-spec-v3.txt-2.2.3"></a>

## Where to publish a hidden service descriptor {#WHERE-HSDESC}

This section specifies how the HSDir hash ring is formed at any given
time. Whenever a time value is needed (e.g. to get the current time period
number), we assume that clients and services use the valid-after time from
their latest live consensus.

The following consensus parameters control where a hidden service
descriptor is stored:

```text
        hsdir_n_replicas = an integer in range [1,16] with default value 2.
        hsdir_spread_fetch = an integer in range [1,128] with default value 3.
        hsdir_spread_store = an integer in range [1,128] with default value 4.
           (Until 0.3.2.8-rc, the default was 3.)
```

To determine where a given hidden service descriptor will be stored
in a given period, after the blinded public key for that period is
derived, the uploading or downloading party calculates:

```text
        for replicanum in 1...hsdir_n_replicas:
            hs_service_index(replicanum) = H("store-at-idx" |
                                     blinded_public_key |
                                     INT_8(replicanum) |
                                     INT_8(period_length) |
                                     INT_8(period_num) )
```

where blinded_public_key is specified in section \[KEYBLIND\], period_length
is the length of the time period in minutes, and period_num is calculated
using the current consensus "valid-after" as specified in section
\[TIME-PERIODS\].
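As a hedged sketch, the service index computation could look like this in Python. It assumes that `H` is SHA3-256 and that `INT_8(n)` is an 8-byte big-endian encoding; neither is defined in this section, so treat both as assumptions. The function names are illustrative.

```python
import hashlib
import struct


def int_8(n: int) -> bytes:
    # Assumption: INT_8 encodes n as 8 bytes in network (big-endian) order.
    return struct.pack(">Q", n)


def hs_service_index(blinded_public_key: bytes, replicanum: int,
                     period_length: int, period_num: int) -> bytes:
    # Assumption: H is SHA3-256.
    return hashlib.sha3_256(
        b"store-at-idx"
        + blinded_public_key
        + int_8(replicanum)
        + int_8(period_length)
        + int_8(period_num)
    ).digest()


def all_service_indices(blinded_public_key: bytes, hsdir_n_replicas: int,
                        period_length: int, period_num: int) -> list:
    # Replica numbers run from 1 to hsdir_n_replicas inclusive.
    return [
        hs_service_index(blinded_public_key, r, period_length, period_num)
        for r in range(1, hsdir_n_replicas + 1)
    ]
```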

Then, for each node listed in the current consensus with the HSDir flag,
we compute a directory index for that node as:

```text
           hs_relay_index(node) = H("node-idx" | node_identity |
                                 shared_random_value |
                                 INT_8(period_num) |
                                 INT_8(period_length) )
```

where shared_random_value is the shared value generated by the authorities
in section \[PUB-SHAREDRANDOM\], and node_identity is the ed25519 identity
key of the node.
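A corresponding sketch for the relay index follows, under the same assumptions as before (`H` as SHA3-256 and `INT_8` as an 8-byte big-endian integer, neither of which is defined in this section). Note that here `period_num` comes before `period_length`, the reverse of the order used for the service index.

```python
import hashlib
import struct


def int_8(n: int) -> bytes:
    # Assumption: INT_8 encodes n as 8 bytes in network (big-endian) order.
    return struct.pack(">Q", n)


def hs_relay_index(node_identity: bytes, shared_random_value: bytes,
                   period_num: int, period_length: int) -> bytes:
    # Assumption: H is SHA3-256. node_identity is the node's ed25519
    # identity key, as stated in the text above.
    return hashlib.sha3_256(
        b"node-idx"
        + node_identity
        + shared_random_value
        + int_8(period_num)
        + int_8(period_length)
    ).digest()
```

Since the shared random value is an input, a relay's position on the ring cannot be computed before that value is published.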

Finally, for replicanum in 1...hsdir_n_replicas, the hidden service
host uploads descriptors to the first hsdir_spread_store nodes whose
indices immediately follow hs_service_index(replicanum). If any of those
nodes have already been selected for a lower-numbered replica of the
service, any nodes already chosen are disregarded (i.e. skipped over)
when choosing a replica's hsdir_spread_store nodes.

When choosing an HSDir to download from, clients choose randomly from
among the first hsdir_spread_fetch nodes after the indices.  (Note
that, in order to make the system better tolerate disappearing
HSDirs, hsdir_spread_fetch may be less than hsdir_spread_store.)
Again, nodes from lower-numbered replicas are disregarded when
choosing the spread for a replica.
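The selection rule described above can be sketched as a walk around a sorted ring. This is an illustration under stated assumptions: "immediately follow" is read as "strictly greater than", the walk wraps around at the end of the ring, and the function name and data layout are hypothetical.

```python
import bisect


def responsible_hsdirs(relay_indices: dict, service_indices: list,
                       spread: int) -> list:
    """Pick `spread` nodes per replica, skipping nodes already chosen.

    relay_indices maps a node identifier to its hs_relay_index (bytes).
    service_indices lists hs_service_index values, lowest replica first.
    """
    ring = sorted((idx, node) for node, idx in relay_indices.items())
    keys = [idx for idx, _ in ring]
    chosen = []
    for svc_idx in service_indices:
        # First ring position whose index is greater than svc_idx.
        start = bisect.bisect_right(keys, svc_idx)
        picked = 0
        for step in range(len(ring)):
            if picked == spread:
                break
            node = ring[(start + step) % len(ring)][1]  # wrap around
            if node in chosen:
                continue  # already selected for a lower-numbered replica
            chosen.append(node)
            picked += 1
    return chosen
```

A service would call this with `spread = hsdir_spread_store` and upload to every returned node. A client would use `spread = hsdir_spread_fetch` and pick one of the returned nodes at random.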

<a id="rend-spec-v3.txt-2.2.4"></a>

## Using time periods and SRVs to fetch/upload HS descriptors {#FETCHUPLOADDESC}

Hidden services and clients need to make correct use of time periods (TP)
and shared random values (SRVs) to successfully fetch and upload
descriptors. Furthermore, to avoid problems with skewed clocks, both clients
and services use the 'valid-after' time of a live consensus when making
decisions about uploading and fetching descriptors. By using the
consensus times as the ground truth here, we minimize the desynchronization
of clients and services due to system clock skew. Whenever time-based
decisions are made in this section, assume that they use consensus times and
not system times.

As \[PUB-SHAREDRANDOM\] specifies, consensuses contain two shared random
values (the current one and the previous one). Hidden services and clients
are asked to match these shared random values with descriptor time periods
and use the right SRV when fetching/uploading descriptors. This section
attempts to precisely specify how this works.

Let's start with an illustration of the system:

```text
      +------------------------------------------------------------------+
      |                                                                  |
      | 00:00      12:00       00:00       12:00       00:00       12:00 |
      | SRV#1      TP#1        SRV#2       TP#2        SRV#3       TP#3  |
      |                                                                  |
      |  $==========|-----------$===========|-----------$===========|    |
      |                                                                  |
      |                                                                  |
      +------------------------------------------------------------------+

                                      Legend: [TP#1 = Time Period #1]
                                              [SRV#1 = Shared Random Value #1]
                                              ["$" = descriptor rotation moment]
```

<a id="rend-spec-v3.txt-2.2.4.1"></a>

### Client behavior for fetching descriptors {#CLIENTFETCH}

And here is how clients use TPs and SRVs to fetch descriptors:

Clients always aim to synchronize their TP with the SRV, so they always want
to use TP#N with SRV#N. For time periods, clients always use the current
time period when fetching descriptors. For SRVs, if a client is in the time
segment between a new time period and a new SRV (i.e. the segments drawn
with "-"), it uses the current SRV; if it is in a time segment between a new
SRV and a new time period (i.e. the segments drawn with "="), it uses the
previous SRV.

Example:

```text
+------------------------------------------------------------------+
|                                                                  |
| 00:00      12:00       00:00       12:00       00:00       12:00 |
| SRV#1      TP#1        SRV#2       TP#2        SRV#3       TP#3  |
|                                                                  |
|  $==========|-----------$===========|-----------$===========|    |
|              ^           ^                                       |
|              C1          C2                                      |
+------------------------------------------------------------------+
```

If a client (C1) is at 13:00 right after TP#1, then it will use TP#1 and
SRV#1 for fetching descriptors. Also, if a client (C2) is at 01:00 right
after SRV#2, it will still use TP#1 and SRV#1.
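The client rule can be sketched as below. The sketch assumes the schedule shown in the diagram (1440-minute time periods, a 60-minute voting period, a new SRV at 00:00 UTC and a new time period at 12:00 UTC); the function name and the "current"/"previous" labels are illustrative, referring to the two SRVs listed in the consensus.

```python
def client_tp_and_srv(valid_after_unix: int):
    """Return (time period number, which consensus SRV to use)."""
    period_length_min = 1440     # assumed default hsdir-interval
    voting_interval_min = 60     # assumed voting period
    offset_min = 12 * voting_interval_min

    minutes = valid_after_unix // 60
    period_num = (minutes - offset_min) // period_length_min

    # Position within the assumed 24-hour SRV run that starts at 00:00 UTC.
    minute_of_day = minutes % 1440
    if minute_of_day >= offset_min:
        # "-" segment: after a new time period, before the next SRV.
        return period_num, "current"
    # "=" segment: after a new SRV, before the next time period.
    return period_num, "previous"


# C1 at 2016-04-13 13:00 UTC and C2 at 2016-04-14 01:00 UTC.
print(client_tp_and_srv(1460552400))
print(client_tp_and_srv(1460595600))
```

Both calls yield the same time period number; C1 takes the consensus's current SRV and C2 takes the previous one, which is the same underlying value because a new SRV was published between the two moments.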

<a id="rend-spec-v3.txt-2.2.4.2"></a>

### Service behavior for uploading descriptors {#SERVICEUPLOAD}

As discussed above, services maintain two active descriptors at any time. We
call these the "first" and "second" service descriptors. Services rotate
their descriptor every time they receive a consensus with a valid_after time
past the next SRV calculation time. They rotate their descriptors by
discarding their first descriptor, pushing the second descriptor to the
first, and rebuilding their second descriptor with the latest data.

Services, like clients, also employ a different logic for picking SRV and TP
values based on their position in the diagram above. Here is the logic:

<a id="rend-spec-v3.txt-2.2.4.2.1"></a>

#### First descriptor upload logic {#FIRSTDESCUPLOAD}

Here is the service logic for uploading its first descriptor:

When a service is in the time segment between a new time period and a new SRV
(i.e. the segments drawn with "-"), it uses the previous time period and
previous SRV for uploading its first descriptor: that's meant to cover
for clients that have a consensus that is still in the previous time period.

Example: Consider in the above illustration that the service is at 13:00
right after TP#1. It will upload its first descriptor using TP#0 and SRV#0.
So if a client still has an 11:00 consensus it will be able to access it
based on the client logic above.

Now if a service is in the time segment between a new SRV and a new time
period (i.e. the segments drawn with "=") it uses the current time period
and the previous SRV for its first descriptor: that's meant to cover clients
with an up-to-date consensus in the same time period as the service.

Example:

```text
+------------------------------------------------------------------+
|                                                                  |
| 00:00      12:00       00:00       12:00       00:00       12:00 |
| SRV#1      TP#1        SRV#2       TP#2        SRV#3       TP#3  |
|                                                                  |
|  $==========|-----------$===========|-----------$===========|    |
|                          ^                                       |
|                          S                                       |
+------------------------------------------------------------------+
```

Consider that the service is at 01:00 right after SRV#2: it will upload its
first descriptor using TP#1 and SRV#1.

<a id="rend-spec-v3.txt-2.2.4.2.2"></a>

#### Second descriptor upload logic {#SECONDDESCUPLOAD}

Here is the service logic for uploading its second descriptor:

When a service is in the time segment between a new time period and a new SRV
(i.e. the segments drawn with "-"), it uses the current time period and
current SRV for uploading its second descriptor: that's meant to cover for
clients that have an up-to-date consensus on the same TP as the service.

Example: Consider in the above illustration that the service is at 13:00
right after TP#1: it will upload its second descriptor using TP#1 and SRV#1.

Now if a service is in the time segment between a new SRV and a new time
period (i.e. the segments drawn with "=") it uses the next time period and
the current SRV for its second descriptor: that's meant to cover clients
with a newer consensus than the service (in the next time period).

Example:

```text
+------------------------------------------------------------------+
|                                                                  |
| 00:00      12:00       00:00       12:00       00:00       12:00 |
| SRV#1      TP#1        SRV#2       TP#2        SRV#3       TP#3  |
|                                                                  |
|  $==========|-----------$===========|-----------$===========|    |
|                          ^                                       |
|                          S                                       |
+------------------------------------------------------------------+
```

Consider that the service is at 01:00 right after SRV#2: it will upload its
second descriptor using TP#2 and SRV#2.
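The first and second descriptor rules can be summarized in one sketch. As with the earlier examples, it assumes the schedule in the diagrams (1440-minute time periods, 60-minute voting period, SRV at 00:00 UTC, time period change at 12:00 UTC). The names are hypothetical, and "current"/"previous" refer to the two SRVs carried in the service's consensus.

```python
def service_descriptor_params(valid_after_unix: int) -> dict:
    """Return the (time period number, SRV choice) for both descriptors."""
    offset_min = 12 * 60          # 12 voting periods of 60 minutes
    period_length_min = 1440      # assumed default hsdir-interval

    minutes = valid_after_unix // 60
    tp = (minutes - offset_min) // period_length_min

    if minutes % 1440 >= offset_min:
        # "-" segment: between a new time period and a new SRV.
        return {
            "first": (tp - 1, "previous"),   # previous TP, previous SRV
            "second": (tp, "current"),       # current TP, current SRV
        }
    # "=" segment: between a new SRV and a new time period.
    return {
        "first": (tp, "previous"),           # current TP, previous SRV
        "second": (tp + 1, "current"),       # next TP, current SRV
    }
```

In both segments the first descriptor pairs with the consensus's previous SRV and the second with its current SRV; only the time period numbers shift.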

<a id="rend-spec-v3.txt-2.2.4.3"></a>

### Directory behavior for handling descriptor uploads {#DIRUPLOAD}

Upon receiving a hidden service descriptor publish request, directories MUST
check the following:

```text
     * The outer wrapper of the descriptor can be parsed according to
       [DESC-OUTER]
     * The version-number of the descriptor is "3"
     * If the directory has already cached a descriptor for this hidden service,
       the revision-counter of the uploaded descriptor must be greater than the
       revision-counter of the cached one
     * The descriptor signature is valid
```

If any of these basic validity checks fails, the directory MUST reject the
descriptor upload.
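A directory's acceptance decision might be structured as in the following sketch. The `OuterDescriptor` type and its fields are hypothetical stand-ins for whatever a real parser and signature verifier produce; only the order and meaning of the checks come from the list above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OuterDescriptor:
    # Hypothetical result of parsing the outer wrapper ([DESC-OUTER]).
    version: int
    revision_counter: int
    signature_ok: bool  # outcome of a signature check done elsewhere


def accept_upload(uploaded: Optional[OuterDescriptor],
                  cached: Optional[OuterDescriptor]) -> bool:
    """Return True only if every basic validity check passes."""
    if uploaded is None:
        return False  # the outer wrapper could not be parsed
    if uploaded.version != 3:
        return False
    if cached is not None and \
            uploaded.revision_counter <= cached.revision_counter:
        return False  # must be strictly newer than the cached descriptor
    return uploaded.signature_ok
```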

NOTE: Even if the descriptor passes the checks above, its first and second
layers could still be invalid: directories cannot validate the encrypted
layers of the descriptor, as they do not have access to the public key of the
service (required for decrypting the first layer of encryption), or the
necessary client credentials (for decrypting the second layer).

<a id="rend-spec-v3.txt-2.2.5"></a>

## Expiring hidden service descriptors {#EXPIRE-DESC}

Hidden services set their descriptor's "descriptor-lifetime" field to 180
minutes (3 hours). Hidden services ensure that their descriptor will remain
valid in the HSDir caches, by republishing their descriptors periodically as
specified in \[WHEN-HSDESC\].

Hidden services MUST also keep their introduction circuits alive for as long
as descriptors including those intro points are valid (even if that's after
the time period has changed).

<a id="rend-spec-v3.txt-2.2.6"></a>

## URLs for anonymous uploading and downloading {#urls}

Hidden service descriptors conforming to this specification are uploaded
with an HTTP POST request to the URL `/tor/hs/<version>/publish` relative to
the hidden service directory's root, and downloaded with an HTTP GET
request for the URL `/tor/hs/<version>/<z>` where `<z>` is a base64 encoding of
the hidden service's blinded public key and `<version>` is the protocol
version which is "3" in this case.
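For illustration, the two request paths could be built as follows. The text only says "a base64 encoding", so this sketch assumes the standard base64 alphabet and keeps the padding character as emitted by Python's `base64` module; whether padding is expected on the wire is not stated here. The function names are hypothetical.

```python
import base64


def publish_path(version: int = 3) -> str:
    # Target of the HTTP POST, relative to the directory's root.
    return f"/tor/hs/{version}/publish"


def fetch_path(blinded_public_key: bytes, version: int = 3) -> str:
    # Assumption: standard base64 alphabet, padding left in place.
    z = base64.b64encode(blinded_public_key).decode("ascii")
    return f"/tor/hs/{version}/{z}"
```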

These requests must be made anonymously, on circuits not used for
anything else.

<a id="rend-spec-v3.txt-2.2.7"></a>

## Client-side validation of onion addresses {#addr-validation}

When a Tor client receives a prop224 onion address from the user, it
MUST first validate the onion address before attempting to connect or
fetch its descriptor. If the validation fails, the client MUST
refuse to connect.

As part of the address validation, Tor clients should check that the
underlying ed25519 key does not have a torsion component. If Tor accepted
ed25519 keys with torsion components, attackers could create multiple
equivalent onion addresses for a single ed25519 key, which would map to the
same service. We want to avoid that because it could lead to phishing
attacks and surprising behaviors (e.g. imagine a browser plugin that blocks
onion addresses, but could be bypassed using an equivalent onion address
with a torsion component).

The right way for clients to detect such fraudulent addresses (which should
only occur malevolently and never naturally) is to extract the ed25519
public key from the onion address and multiply it by the ed25519 group order
and ensure that the result is the ed25519 identity element. For more
details, please see \[TORSION-REFS\].
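The check described above can be sketched in pure Python using the ed25519 constants and point arithmetic from RFC 8032. This is an unoptimized illustration, not constant-time code, and the function names are hypothetical. It takes the raw 32-byte public key as input; extracting that key from the onion address is outside the scope of this sketch.

```python
P = 2**255 - 19                                   # field prime
L = 2**252 + 27742317777372353535851937790883648493  # group order
D = -121665 * pow(121666, P - 2, P) % P           # curve constant d
SQRT_M1 = pow(2, (P - 1) // 4, P)                 # square root of -1


def decode_point(encoded: bytes):
    """Decode a 32-byte key into extended coordinates, or return None."""
    if len(encoded) != 32:
        return None
    y = int.from_bytes(encoded, "little")
    sign = y >> 255
    y &= (1 << 255) - 1
    if y >= P:
        return None
    x2 = (y * y - 1) * pow(D * y * y + 1, P - 2, P) % P
    x = pow(x2, (P + 3) // 8, P)
    if (x * x - x2) % P != 0:
        x = x * SQRT_M1 % P
    if (x * x - x2) % P != 0:
        return None                               # not on the curve
    if x == 0 and sign == 1:
        return None
    if (x & 1) != sign:
        x = P - x
    return (x, y, 1, x * y % P)


def point_add(a, b):
    """Add two points given as (X, Y, Z, T)."""
    t1 = (a[1] - a[0]) * (b[1] - b[0]) % P
    t2 = (a[1] + a[0]) * (b[1] + b[0]) % P
    t3 = 2 * a[3] * b[3] * D % P
    t4 = 2 * a[2] * b[2] % P
    e, f, g, h = t2 - t1, t4 - t3, t4 + t3, t2 + t1
    return (e * f % P, g * h % P, f * g % P, e * h % P)


def scalar_mult(k: int, pt):
    """Double-and-add scalar multiplication."""
    result = (0, 1, 1, 0)                         # identity element
    while k > 0:
        if k & 1:
            result = point_add(result, pt)
        pt = point_add(pt, pt)
        k >>= 1
    return result


def has_torsion_component(public_key: bytes) -> bool:
    """True if multiplying by the group order does not give the identity."""
    pt = decode_point(public_key)
    if pt is None:
        raise ValueError("not a valid ed25519 point encoding")
    x, y, z, _ = scalar_mult(L, pt)
    return not (x % P == 0 and (y - z) % P == 0)
```

A client following the rule above would refuse to connect whenever `has_torsion_component` returns `True` for the key extracted from the address.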