$Id: /tor/branches/eventdns/doc/dir-spec.txt 9469 2006-11-01T23:56:30.179423Z nickm  $

                     Voting on the Tor Directory System

0. Scope and preliminaries

  This document describes a consensus voting scheme for Tor directories.
  Once it's accepted, it should be merged with dir-spec.txt.  Some
  preliminaries for authority and caching support should be done during
  the 0.1.2.x series; the main deployment should come during the 0.1.3.x
  series.

0.1. Goals and motivation: voting.

  The current directory system relies on clients downloading a separate
  network status statement for each directory authority from the caches,
  each statement signed by its authority.  Clients download a new
  statement every 30 minutes or so, choosing to replace the oldest
  statement they currently have.

  This creates a partitioning problem: different clients have different
  "most recent" networkstatus sources, and different versions of each
  (since authorities change their statements often).  Also, it is very
  redundant: most of the downloaded networkstatus documents are probably
  quite similar.

  So if we have clients only download a single multiply signed consensus
  network status statement, we can:
       - Save bandwidth.
       - Reduce client partitioning.
       - Reduce client-side and cache-side storage.
       - Simplify client-side voting code (by moving voting away from the
         client).

  We should try to do this without:
       - Assuming that client-side or cache-side clocks are more correct
         than we assume now.
       - Assuming that authority clocks are perfectly correct.
       - Degrading badly if an authority dies or is offline for a bit.

  We do not have to perform well if:
      - No clique of more than half the authorities can agree about who
        the authorities are.

1. The idea.

  Instead of publishing a network status whenever something changes,
  each authority publishes a fresh network status only once per
  "period" (say, 60 minutes).  Authorities either upload this network
  status (or "vote") to every other authority, or download every other
  authority's "vote" (see 3.1 below for discussion on push vs pull).

  After an authority has (or has become convinced that it won't be able to
  get) every other authority's vote, it deterministically computes a
  consensus networkstatus, and signs it.  Authorities download one
  another's signatures (or receive them by upload; see 3.1), and form a
  multiply signed consensus.  This multiply signed consensus is what
  caches cache and what clients download.

  If an authority is down, authorities vote based on what they *can*
  download/get uploaded.

  If an authority is "a little" down and only some authorities can reach
  it, authorities try to get its info from other authorities.

  If an authority computes the vote wrong, its signature isn't included on
  the consensus.

  Clients use a consensus if it is signed by more than half the
  authorities they recognize.  If they can't find any such consensus,
  clients either use an older version, or beg the user to adapt the list
  of authorities.
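
  The client-side acceptance rule above can be sketched as follows.  This
  is only an illustration: the function name and the representation of
  authorities as a set of nicknames are assumptions, and signature
  verification is taken to have happened already.

```python
def consensus_is_usable(signers, recognized):
    """Return True if more than half of the recognized authorities signed.

    signers:    authorities whose signatures on the consensus verified
                (verification itself is assumed to be done elsewhere).
    recognized: authorities this client is configured to trust.
    """
    recognized = set(recognized)
    # Signatures from authorities the client does not recognize do not count.
    valid = recognized & set(signers)
    # "More than half" is a strict majority; exactly half is not enough.
    return 2 * len(valid) > len(recognized)
```

  Note that the threshold is taken over all recognized authorities, not
  over the authorities that happened to sign.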

2. Details.

2.1. Vote specifications

  Votes in v2.1 are just like v2 network status documents.  We add these
  fields to the preamble:

     "vote-status" -- the word "vote".

     "valid-until" -- the time when this authority expects to publish its
        next vote.

     "known-flags" -- a space-separated list of flags that will sometimes
        be included on "s" lines later in the vote.

     "dir-source" -- as before, except the "hostname" part MUST be the
        authority's nickname, which MUST be unique among authorities, and
        MUST match the nickname in the "directory-signature" entry.

  Authorities SHOULD cache their most recently generated votes so they
  can persist them across restarts.  Authorities SHOULD NOT generate
  another document until valid-until has passed.

  Router entries in the vote MUST be sorted in ascending order by router
  identity digest.  The flags in "s" lines MUST appear in alphabetical
  order.
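
  A minimal sketch of this ordering, assuming each router entry is held as
  a (hex identity digest, set of flags) pair.  That representation and the
  function name are hypothetical, and "alphabetical" is approximated here
  by plain string comparison.

```python
def order_vote_entries(entries):
    """Sort router entries by identity digest and each entry's flags by name.

    entries: iterable of (identity_digest_hex, flags) pairs.
    Returns a list of (identity_digest_hex, sorted_flag_list) pairs.
    """
    # Compare digests as raw bytes so that hex letter case cannot affect order.
    by_digest = sorted(entries, key=lambda entry: bytes.fromhex(entry[0]))
    return [(digest, sorted(flags)) for digest, flags in by_digest]
```

  A fixed ordering matters because every authority must produce
  byte-identical output from the same inputs.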

  Votes SHOULD be synchronized to half-hour publication intervals (one
  hour? XXX say more; be more precise.)

  XXXX some way to request older networkstatus docs?


2.2. Consensus directory specifications

  Consensuses are like v2.1 votes, except for the following fields:

     "vote-status" -- the word "consensus".

     "published" is the latest of all the published times on the votes.

     "valid-until" is the earliest of all the valid-until times on the
       votes.

     "dir-source" and "fingerprint" and "dir-signing-key" and "contact"
       are included for each authority that contributed to the vote.

     "vote-digest" for each authority that contributed to the vote,
       calculated in the same way as the digest in the signature on that
       authority's vote.

     "client-versions" and "server-versions" are sorted in ascending
       order.

     "dir-options" and "known-flags" are not included.

  The fields MUST occur in the following order:
     "network-status-version"
     "vote-status"
     "published"
     "valid-until"
     For each authority, sorted in ascending order of nickname, case-
     insensitively:
         "dir-source", "fingerprint", "contact", "dir-signing-key",
         "vote-digest".
     "client-versions"
     "server-versions"

  The signatures at the end of the document appear as multiple instances
  of directory-signature, sorted in ascending order by nickname,
  case-insensitively.
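
  The time fields and the authority ordering described in this section
  could be derived from a set of votes roughly as below.  The dict keys
  and the function name are assumptions made for the example, not part of
  the format.

```python
from datetime import datetime

def consensus_header(votes):
    """Derive consensus preamble values from the contributing votes.

    votes: list of dicts with 'nickname', 'published' and 'valid_until'
           (the last two as datetime objects).
    """
    return {
        # Latest of all the published times on the votes.
        "published": max(v["published"] for v in votes),
        # Earliest of all the valid-until times on the votes.
        "valid-until": min(v["valid_until"] for v in votes),
        # Ascending by nickname, ignoring case; the same order is used for
        # the per-authority fields and for the trailing signatures.
        "authorities": sorted((v["nickname"] for v in votes), key=str.lower),
    }
```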

  A router entry should be included in the result if it is included by
  more than half of the authorities (total authorities, not just those
  whose votes we have).  A router entry has a flag set if that flag is set
  by more than half of the authorities who care about that flag.  [XXXX
  this creates a DOS incentive.  Can we remember what flags people set the
  last time we saw them?]
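
  One way to read the inclusion and flag rules is sketched below.  It
  assumes that an authority "cares about" a flag exactly when the flag
  appears in that authority's "known-flags" line; the text does not pin
  this down, so treat it as an interpretation.  The data layout and the
  function name are hypothetical.

```python
def compute_consensus_entries(votes, total_authorities):
    """Return {router_digest: sorted flag list} for the consensus.

    votes: list of dicts, one per vote we hold, each with
           'known_flags' (set of flag names) and
           'routers' (dict mapping router digest to its set of flags).
    total_authorities: count of all authorities, including those
           whose votes we could not get.
    """
    # Count how many votes list each router.
    listed = {}
    for vote in votes:
        for digest in vote["routers"]:
            listed[digest] = listed.get(digest, 0) + 1

    all_flags = set().union(*(v["known_flags"] for v in votes))
    result = {}
    for digest in sorted(listed):
        # Inclusion needs a majority of *all* authorities.
        if 2 * listed[digest] <= total_authorities:
            continue
        flags = []
        for flag in sorted(all_flags):
            # Assumption: only authorities listing the flag in known-flags
            # have an opinion about it.
            carers = [v for v in votes if flag in v["known_flags"]]
            setters = [v for v in carers
                       if flag in v["routers"].get(digest, set())]
            if 2 * len(setters) > len(carers):
                flags.append(flag)
        result[digest] = flags
    return result
```

  Under this reading, an authority that knows a flag but does not list the
  router at all counts against the flag.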

  [What does the signature hash cover ? XXX]

2.3. Agreement and timeline

  [XXXX publish signed vote summaries.]
  [XXXX URL list: vote, other people's votes, directory.]
  [XXXX in-progress URL vs done URL]
  [XXXX Store votes to disk.]

2.4. Distributing routerdescs between authorities

  Consensus will be more meaningful if authorities take steps to make sure
  that they all have the same set of descriptors _before_ the voting
  starts.  This is safe, since all descriptors are self-certified and
  timestamped: it's always okay to replace a signed descriptor with a more
  recent one signed by the same identity.
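
  The replacement rule can be sketched as a small merge step.  This
  assumes the descriptor's self-signature has already been checked, and
  uses a hypothetical in-memory store keyed by identity.

```python
def merge_descriptor(store, desc):
    """Keep only the most recent descriptor per identity.

    store: dict mapping identity to the descriptor currently held.
    desc:  dict with 'identity' and 'published' (any comparable timestamp);
           its signature is assumed to have been verified by the caller.
    Returns True if the store was updated.
    """
    current = store.get(desc["identity"])
    if current is None or desc["published"] > current["published"]:
        store[desc["identity"]] = desc
        return True
    # Older or equally old descriptors never displace what we already have.
    return False
```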

  In the long run, we might want some kind of sophisticated process here.
  For now, since authorities already download one another's networkstatus
  documents and use them to determine what descriptors to download from one
  another, we can rely on this existing mechanism to keep authorities up to
  date.

3. Questions and concerns

3.1. Push or pull?

  [XXXX]

3.2. Dropping "opt".

  The "opt" keyword in Tor's directory formats was originally intended to
  mean, "it is okay to ignore this entry if you don't understand it"; the
  default behavior has been "discard a routerdesc if it contains entries you
  don't recognize."

  But so far, every new flag we have added has been marked 'opt'.  It would
  probably make sense to change the default behavior to "ignore unrecognized
  fields", and add the statement that clients SHOULD ignore fields they don't
  recognize.  As a meta-principle, we should say that clients and servers
  MUST NOT have to understand new fields in order to use directory documents
  correctly.
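
  As an illustration of the proposed default, a parser that skips
  unrecognized fields might look like the sketch below.  The keyword set
  is an illustrative subset chosen for the example, and the line format
  is simplified to "keyword arguments".

```python
# Illustrative subset of keywords; not a complete or authoritative list.
KNOWN_KEYWORDS = {"router", "published", "bandwidth"}

def parse_entries(text):
    """Return (keyword, arguments) pairs, skipping unrecognized fields."""
    parsed = []
    for line in text.splitlines():
        if not line.strip():
            continue
        keyword, _, rest = line.partition(" ")
        # A leading "opt" carries no meaning under the proposed default;
        # drop it and look at the real keyword.
        if keyword == "opt":
            keyword, _, rest = rest.partition(" ")
        if keyword not in KNOWN_KEYWORDS:
            # Proposed behavior: ignore the field, keep the document.
            continue
        parsed.append((keyword, rest))
    return parsed
```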

  Of course, this will make it impossible to say, "The format has changed a
  lot; discard this quietly if you don't understand it." We could do that by
  adding a version field.

3.3. Multilevel keys.

  Replacing a directory authority's identity key in the event of a compromise
  would be tremendously annoying.  We'd need to tell every client to switch
  their configuration, or update to a new version with an updated list.  So
  long as some weren't upgraded, they'd be at risk from whoever had
  compromised the key.

  With this in mind, it's a shame that our current protocol forces us to
  store identity keys unencrypted in RAM.  We need some kind of signing key
  stored unencrypted, since we need to generate new descriptors/directories
  and rotate link and onion keys regularly.  (And since, of course, we can't
  ask server operators to be on-hand to enter a passphrase every time we
  want to rotate keys or sign a descriptor.)

  The obvious solution seems to be to have a signing-only key that lives
  indefinitely (months or longer) and signs descriptors and link keys, and a
  separate identity key that's used to sign the signing key.  Tor servers
  could run in one of several modes:
    1. Identity key stored encrypted.  You need to pick a passphrase when
       you enable this mode, and re-enter this passphrase every time you
       rotate the signing key.
    1'. Identity key stored separate.  You save your identity key to a
       floppy, and use the floppy when you need to rotate the signing key.
    2. All keys stored unencrypted.  In this case, we might not want to even
       *have* a separate signing key.  (We'll need to support no-separate-
       signing-key mode anyway to keep old servers working.)
    3. All keys stored encrypted. You need to enter a passphrase to start
       Tor.
  (Of course, we might not want to implement all of these.)

  Case 1 is probably most usable and secure, if we assume that people don't
  forget their passphrases or lose their floppies.  We could mitigate this a
  bit by encouraging people to PGP-encrypt their passphrases to themselves,
  or keep a cleartext copy of their secret key secret-split into a few
  pieces, or something like that.

  Migration presents another difficulty, especially with the authorities.  If
  we use the current set of identity keys as the new identity keys, we're in
  the position of having sensitive keys that have been stored on
  media-of-dubious-encryption up to now.  Also, we need to keep old clients
  (who will expect descriptors to be signed by the identity keys they know
  and love, and who will not understand signing keys) happy.

  I'd enumerate designs here, but I'm hoping that somebody will come up with
  a better one, so I'll try not to prejudice them with more ideas yet.

  Oh, and of course, we'll want to make sure that the keys are
  cross-certified. :)

  Ideas? -NM

3.4. Long and short descriptors

  Some of the costliest fields in the current directory protocol are ones
  that no client actually uses.  In particular, the "read-history" and
  "write-history" fields are used only by the authorities for monitoring the
  status of the network.  If we took them out, the size of a compressed list
  of all the routers would fall by about 60%.  (No other disposable field
  would save more than 2%.)

  One possible solution here is that routers should generate and upload a
  short-form and long-form descriptor.  Only the short-form descriptor should
  ever be used by anybody for routing.  The long-form descriptor should be
  used only for analytics and other tools.  (If we allowed people to route with
  long descriptors, we'd have to ensure that they stayed in sync with the
  short ones somehow.)

  Another possible solution would be to drop these fields from descriptors,
  and have them uploaded as a part of a separate "bandwidth report" to the
  authorities.  This could help prevent the mistake of using long descriptors
  in the place of short ones.

  Thoughts? -NM

4. Migration

  For directory voting, ...

caches need to start caching consensuses and accepting multisigned documents.