author Nick Mathewson <nickm@torproject.org> 2007-01-26 05:50:40 +0000
committer Nick Mathewson <nickm@torproject.org> 2007-01-26 05:50:40 +0000
commit c82fbcd057696c5b2c2143e7c5eddeab73d84a1c
tree 841f41b88c44d19cbf52026712c0e3532215e2f6 /proposals
parent f89ed001a0e06beeda070cab081f67a8676f84a2
Make a new directory for specification proposals, and move some proposals there. Also, move dir-spec-v1.txt to spec.
svn:r9415
Diffstat (limited to 'proposals')
-rw-r--r-- proposals/100-tor-spec-udp.txt 414
-rw-r--r-- proposals/101-dir-voting.txt 388
2 files changed, 802 insertions, 0 deletions
diff --git a/proposals/100-tor-spec-udp.txt b/proposals/100-tor-spec-udp.txt
new file mode 100644
index 0000000..9e4966c
--- /dev/null
+++ b/proposals/100-tor-spec-udp.txt
@@ -0,0 +1,414 @@
+[This proposed Tor extension has not been implemented yet. It is currently
+in request-for-comments state. -RD]
+
+ Tor Unreliable Datagram Extension Proposal
+
+ Marc Liberatore
+
+Abstract
+
+Contents
+
+0. Introduction
+
+ Tor is a distributed overlay network designed to anonymize low-latency
+ TCP-based applications. The current tor specification supports only
+ TCP-based traffic. This limitation prevents the use of tor to anonymize
+ other important applications, notably voice over IP software. This document
+ is a proposal to extend the tor specification to support UDP traffic.
+
+ The basic design philosophy of this extension is to add support for
+ tunneling unreliable datagrams through tor with as few modifications to the
+ protocol as possible. As currently specified, tor cannot directly support
+ such tunneling, as connections between nodes are built using transport layer
+ security (TLS) atop TCP. The latency incurred by TCP is likely unacceptable
+ to the operation of most UDP-based application level protocols.
+
+ Thus, we propose the addition of links between nodes using datagram
+ transport layer security (DTLS). These links allow packets to traverse a
+ route through tor quickly, but their unreliable nature requires minor
+ changes to the tor protocol. This proposal outlines the necessary
+ additions and changes to the tor specification to support UDP traffic.
+
+ We note that a separate set of DTLS links between nodes creates a second
+ overlay, distinct from that composed of TLS links. This separation and
+ resulting decrease in each anonymity set's size will make certain attacks
+ easier. However, it is our belief that VoIP support in tor will
+ dramatically increase its appeal, and correspondingly, the size of its user
+ base, number of deployed nodes, and total traffic relayed. These increases
+ should help offset the loss of anonymity that two distinct networks imply.
+
+1. Overview of Tor-UDP and its complications
+
+ As described above, this proposal extends the Tor specification to support
+ UDP with as few changes as possible. Tor's overlay network is managed
+ through TLS based connections; we will re-use this control plane to set up
+ and tear down circuits that relay UDP traffic. These circuits will be built atop
+ DTLS, in a fashion analogous to how Tor currently sends TCP traffic over
+ TLS.
+
+ The unreliability of DTLS circuits creates problems for Tor at two levels:
+
+ 1. Tor's encryption of the relay layer does not allow independent
+ decryption of individual records. If record N is not received, then
+ record N+1 will not decrypt correctly, as the counter for AES/CTR is
+ maintained implicitly.
+
+ 2. Tor's end-to-end integrity checking works under the assumption that
+ all RELAY cells are delivered. This assumption is invalid when cells
+ are sent over DTLS.
+
+ The fix for the first problem is straightforward: add an explicit sequence
+ number to each cell. To fix the second problem, we introduce a
+ system of nonces and hashes to RELAY packets.
+
+ In the following sections, we mirror the layout of the Tor Protocol
+ Specification, presenting the necessary modifications to the Tor protocol as
+ a series of deltas.
+
+2. Connections
+
+ Tor-UDP uses DTLS for encryption of some links. All DTLS links must have
+ corresponding TLS links, as all control messages are sent over TLS. All
+ implementations MUST support the DTLS ciphersuite "[TODO]".
+
+ DTLS connections are formed using the same protocol as TLS connections.
+ This occurs upon request, following a CREATE_UDP or CREATE_FAST_UDP cell,
+ as detailed in section 4.6.
+
+ Once a paired TLS/DTLS connection is established, the two sides send cells
+ to one another. All but two types of cells are sent over TLS links. RELAY
+ cells containing the commands RELAY_DATA_UDP and RELAY_DROP_UDP, specified
+ below, are sent over DTLS links. [Should all cells still be 512 bytes long?
+ Perhaps upon completion of a preliminary implementation, we should do a
+ performance evaluation for some class of UDP traffic, such as VoIP. - ML]
+ Cells may be sent embedded in TLS or DTLS records of any size or divided
+ across such records. The framing of these records MUST NOT leak any more
+ information than the above differentiation on the basis of cell type. [I am
+ uncomfortable with this leakage, but don't see any simple, elegant way
+ around it. -ML]
+
+ As with TLS connections, DTLS connections are not permanent.
+
+3. Cell format
+
+ Each cell contains the following fields:
+
+ CircID [2 bytes]
+ Command [1 byte]
+ Sequence Number [2 bytes]
+ Payload (padded with 0 bytes) [507 bytes]
+ [Total size: 512 bytes]
+
+ The 'Command' field holds one of the following values:
+ 0 -- PADDING (Padding) (See Sec 6.2)
+ 1 -- CREATE (Create a circuit) (See Sec 4)
+ 2 -- CREATED (Acknowledge create) (See Sec 4)
+ 3 -- RELAY (End-to-end data) (See Sec 5)
+ 4 -- DESTROY (Stop using a circuit) (See Sec 4)
+ 5 -- CREATE_FAST (Create a circuit, no PK) (See Sec 4)
+ 6 -- CREATED_FAST (Circuit created, no PK) (See Sec 4)
+ 7 -- CREATE_UDP (Create a UDP circuit) (See Sec 4)
+ 8 -- CREATED_UDP (Acknowledge UDP create) (See Sec 4)
+ 9 -- CREATE_FAST_UDP (Create a UDP circuit, no PK) (See Sec 4)
+ 10 -- CREATED_FAST_UDP(UDP circuit created, no PK) (See Sec 4)
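The cell layout and command table above can be sketched as a pack/unpack pair. This is a minimal illustration, not code from any Tor implementation: the big-endian byte order and the names `pack_cell` and `unpack_cell` are assumptions made for the example.

```python
import struct

CELL_LEN = 512
PAYLOAD_LEN = 507

# CircID (2 bytes), Command (1 byte), Sequence Number (2 bytes).
# Big-endian (network order) is an assumption; the proposal does not state it.
HEADER = struct.Struct(">HBH")

COMMANDS = {
    0: "PADDING", 1: "CREATE", 2: "CREATED", 3: "RELAY", 4: "DESTROY",
    5: "CREATE_FAST", 6: "CREATED_FAST", 7: "CREATE_UDP", 8: "CREATED_UDP",
    9: "CREATE_FAST_UDP", 10: "CREATED_FAST_UDP",
}


def pack_cell(circ_id, command, seq, payload=b""):
    """Build a 512-byte cell, padding the payload with 0 bytes."""
    if command not in COMMANDS:
        raise ValueError("unknown command: %r" % command)
    if len(payload) > PAYLOAD_LEN:
        raise ValueError("payload exceeds %d bytes" % PAYLOAD_LEN)
    return HEADER.pack(circ_id, command, seq) + payload.ljust(PAYLOAD_LEN, b"\x00")


def unpack_cell(cell):
    """Split a 512-byte cell into (circ_id, command, seq, payload)."""
    if len(cell) != CELL_LEN:
        raise ValueError("cell must be exactly %d bytes" % CELL_LEN)
    circ_id, command, seq = HEADER.unpack_from(cell)
    return circ_id, command, seq, cell[HEADER.size:]
```

The 5-byte header plus the 507-byte payload gives the 512-byte total stated above.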
+
+ The sequence number allows for AES/CTR decryption of RELAY cells
+ independently of one another; this functionality is required to support
+ cells sent over DTLS. The sequence number is described in more detail in
+ section 4.5.
+
+ [Should the sequence number only appear in RELAY packets? The overhead is
+ small, and I'm hesitant to force more code paths on the implementor. -ML]
+ [There's already a separate relay header that has other material in it,
+ so it wouldn't be the end of the world to move it there if it's
+ appropriate. -RD]
+
+ [Having separate commands for UDP circuits seems necessary, unless we can
+ assume a flag day event for a large number of tor nodes. -ML]
+
+4. Circuit management
+
+4.2. Setting circuit keys
+
+ Keys are set up for UDP circuits in the same fashion as for TCP circuits.
+ Each UDP circuit shares keys with its corresponding TCP circuit.
+
+ [If the keys are used for both TCP and UDP connections, how does it
+ work to mix sequence-number-less cells with sequenced-numbered cells --
+ how do you know you have the encryption order right? -RD]
+
+4.3. Creating circuits
+
+ UDP circuits are created in the same way as TCP circuits, using the *_UDP
+ cells as appropriate.
+
+4.4. Tearing down circuits
+
+ UDP circuits are torn down in the same way as TCP circuits, using the *_UDP
+ cells as appropriate.
+
+4.5. Routing relay cells
+
+ When an OR receives a RELAY cell, it checks the cell's circID and
+ determines whether it has a corresponding circuit along that
+ connection. If not, the OR drops the RELAY cell.
+
+ Otherwise, if the OR is not at the OP edge of the circuit (that is,
+ either an 'exit node' or a non-edge node), it de/encrypts the payload
+ with AES/CTR, as follows:
+ 'Forward' relay cell (same direction as CREATE):
+ Use Kf as key; decrypt, using sequence number to synchronize
+ ciphertext and keystream.
+ 'Back' relay cell (opposite direction from CREATE):
+ Use Kb as key; encrypt, using sequence number to synchronize
+ ciphertext and keystream.
+ Note that in counter mode, decrypt and encrypt are the same operation.
+ [Since the sequence number is only 2 bytes, what do you do when it
+ rolls over? -RD]
+
+ Each stream encrypted by a Kf or Kb has a corresponding unique state,
+ captured by a sequence number; the originator of each such stream chooses
+ the initial sequence number randomly, and increments it only with RELAY
+ cells. [This counts cells; unlike, say, TCP, tor uses fixed-size cells, so
+ there's no need for counting bytes directly. Right? - ML]
+ [I believe this is true. You'll find out for sure when you try to
+ build it. ;) -RD]
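One way to read "using sequence number to synchronize ciphertext and keystream" is to turn the sequence number into an offset into the AES/CTR keystream. The sketch below only does that arithmetic. It assumes that each RELAY cell consumes exactly one 507-byte payload of keystream and that the receiver knows the initial sequence number; neither detail is fixed by the text above, and `keystream_position` is a hypothetical name.

```python
AES_BLOCK = 16       # AES block size in bytes
PAYLOAD_LEN = 507    # cell payload length from section 3
SEQ_MOD = 1 << 16    # the sequence number is 2 bytes wide


def keystream_position(initial_seq, seq):
    """Return (counter_block_index, bytes_to_skip) for the cell numbered seq.

    Assumption: the keystream is consumed one whole payload per RELAY cell,
    starting at offset 0 for the cell that carries initial_seq.
    """
    cells_before = (seq - initial_seq) % SEQ_MOD
    offset = cells_before * PAYLOAD_LEN
    return divmod(offset, AES_BLOCK)
```

Because the difference is taken modulo 2^16, two cells that are 65536 apart map to the same keystream position. The sketch therefore does not answer the rollover question raised above; it only makes the ambiguity concrete.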
+
+ The OR then decides whether it recognizes the relay cell, by
+ inspecting the payload as described in section 5.1 below. If the OR
+ recognizes the cell, it processes the contents of the relay cell.
+ Otherwise, it passes the decrypted relay cell along the circuit if
+ the circuit continues. If the OR at the end of the circuit
+ encounters an unrecognized relay cell, an error has occurred: the OR
+ sends a DESTROY cell to tear down the circuit.
+
+ When a relay cell arrives at an OP, the OP decrypts the payload
+ with AES/CTR as follows:
+ OP receives data cell:
+ For I=N...1,
+ Decrypt with Kb_I, using the sequence number as above. If the
+ payload is recognized (see section 5.1), then stop and process
+ the payload.
+
+ For more information, see section 5 below.
+
+4.6. CREATE_UDP and CREATED_UDP cells
+
+ Users set up UDP circuits incrementally. The procedure is similar to that
+ for TCP circuits, as described in section 4.1. In addition to the TLS
+ connection to the first node, the OP also attempts to open a DTLS
+ connection. If this succeeds, the OP sends a CREATE_UDP cell, with a
+ payload in the same format as a CREATE cell. To extend a UDP circuit past
+ the first hop, the OP sends an EXTEND_UDP relay cell (see section 5) which
+ instructs the last node in the circuit to send a CREATE_UDP cell to extend
+ the circuit.
+
+ The relay payload for an EXTEND_UDP relay cell consists of:
+ Address [4 bytes]
+ TCP port [2 bytes]
+ UDP port [2 bytes]
+ Onion skin [186 bytes]
+ Identity fingerprint [20 bytes]
+
+ The address field and ports denote the IPv4 address and ports of the next OR
+ in the circuit.
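The EXTEND_UDP relay payload listed above is a fixed 214-byte record, so it can be sketched with a single struct format. Big-endian port encoding and the function names are assumptions for illustration only.

```python
import socket
import struct

# Address (4), TCP port (2), UDP port (2), onion skin (186), fingerprint (20).
# Network byte order for the ports is an assumption.
EXTEND_UDP = struct.Struct(">4sHH186s20s")


def pack_extend_udp(ipv4, tcp_port, udp_port, onion_skin, fingerprint):
    """Encode the relay payload of an EXTEND_UDP cell."""
    if len(onion_skin) != 186:
        raise ValueError("onion skin must be 186 bytes")
    if len(fingerprint) != 20:
        raise ValueError("identity fingerprint must be 20 bytes")
    return EXTEND_UDP.pack(socket.inet_aton(ipv4), tcp_port, udp_port,
                           onion_skin, fingerprint)


def unpack_extend_udp(data):
    """Decode an EXTEND_UDP relay payload; trailing padding is ignored."""
    addr, tcp_port, udp_port, onion_skin, fingerprint = EXTEND_UDP.unpack_from(data)
    return socket.inet_ntoa(addr), tcp_port, udp_port, onion_skin, fingerprint
```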
+
+ The payload for a CREATED_UDP cell or the relay payload for a
+ RELAY_EXTENDED_UDP cell is identical to that of the corresponding CREATED or
+ RELAY_EXTENDED cell. Both circuits are established using the same key.
+
+ Note that the existence of a UDP circuit implies the
+ existence of a corresponding TCP circuit, sharing keys, sequence numbers,
+ and any other relevant state.
+
+4.6.1 CREATE_FAST_UDP/CREATED_FAST_UDP cells
+
+ As above, the OP must successfully connect using DTLS before attempting to
+ send a CREATE_FAST_UDP cell. Otherwise, the procedure is the same as in
+ section 4.1.1.
+
+5. Application connections and stream management
+
+5.1. Relay cells
+
+ Within a circuit, the OP and the exit node use the contents of RELAY cells
+ to tunnel end-to-end commands, TCP connections ("Streams"), and UDP packets
+ across circuits. End-to-end commands and UDP packets can be initiated by
+ either edge; streams are initiated by the OP.
+
+ The payload of each unencrypted RELAY cell consists of:
+ Relay command [1 byte]
+ 'Recognized' [2 bytes]
+ StreamID [2 bytes]
+ Digest [4 bytes]
+ Length [2 bytes]
+ Data [496 bytes]
+
+ The relay commands are:
+ 1 -- RELAY_BEGIN [forward]
+ 2 -- RELAY_DATA [forward or backward]
+ 3 -- RELAY_END [forward or backward]
+ 4 -- RELAY_CONNECTED [backward]
+ 5 -- RELAY_SENDME [forward or backward]
+ 6 -- RELAY_EXTEND [forward]
+ 7 -- RELAY_EXTENDED [backward]
+ 8 -- RELAY_TRUNCATE [forward]
+ 9 -- RELAY_TRUNCATED [backward]
+ 10 -- RELAY_DROP [forward or backward]
+ 11 -- RELAY_RESOLVE [forward]
+ 12 -- RELAY_RESOLVED [backward]
+ 13 -- RELAY_BEGIN_UDP [forward]
+ 14 -- RELAY_DATA_UDP [forward or backward]
+ 15 -- RELAY_EXTEND_UDP [forward]
+ 16 -- RELAY_EXTENDED_UDP [backward]
+ 17 -- RELAY_DROP_UDP [forward or backward]
+
+ Commands labelled as "forward" must only be sent by the originator
+ of the circuit. Commands labelled as "backward" must only be sent by
+ other nodes in the circuit back to the originator. Commands marked
+ as either can be sent either by the originator or other nodes.
+
+ The 'recognized' field in any unencrypted relay payload is always set to
+ zero.
+
+ The 'digest' field can have two meanings. For all cells sent over TLS
+ connections (that is, all commands and all non-UDP RELAY data), it is
+ computed as the first four bytes of the running SHA-1 digest of all the
+ bytes that have been sent reliably and have been destined for this hop of
+ the circuit or originated from this hop of the circuit, seeded from Df or Db
+ respectively (obtained in section 4.2 above), and including this RELAY
+ cell's entire payload (taken with the digest field set to zero). Cells sent
+ over DTLS connections do not affect this running digest. Each cell sent
+ over DTLS (that is, RELAY_DATA_UDP and RELAY_DROP_UDP) has the digest field
+ set to the SHA-1 digest of the current RELAY cell's entire payload, with the
+ digest field set to zero. Coupled with a randomly-chosen streamID, this
+ provides per-cell integrity checking on UDP cells.
+ [If you drop malformed UDP relay cells but don't close the circuit,
+ then this 8 bytes of digest is not as strong as what we get in the
+ TCP-circuit side. Is this a problem? -RD]
+
+ When the 'recognized' field of a RELAY cell is zero, and the digest
+ is correct, the cell is considered "recognized" for the purposes of
+ decryption (see section 4.5 above).
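The per-cell digest for cells sent over DTLS, together with the "recognized" test, can be sketched as follows. The sketch assumes that the 4-byte digest field holds the first four bytes of the SHA-1 output, that header fields are big-endian, and that the relay payload fills the 507-byte cell payload from section 3. All function names are hypothetical.

```python
import hashlib
import struct

# Relay command (1), 'Recognized' (2), StreamID (2), Digest (4), Length (2).
RELAY_HEADER = struct.Struct(">BHH4sH")
DIGEST_START, DIGEST_END = 5, 9


def udp_cell_digest(relay_payload):
    """SHA-1 over the whole relay payload with the digest field zeroed.

    Truncating to four bytes is an assumption forced by the field width.
    """
    zeroed = (relay_payload[:DIGEST_START] + b"\x00" * 4
              + relay_payload[DIGEST_END:])
    return hashlib.sha1(zeroed).digest()[:4]


def seal_udp_relay(command, stream_id, data, payload_len=507):
    """Build an unencrypted relay payload for a cell sent over DTLS."""
    if len(data) > payload_len - RELAY_HEADER.size:
        raise ValueError("data does not fit in one cell")
    header = RELAY_HEADER.pack(command, 0, stream_id, b"\x00" * 4, len(data))
    body = (header + data).ljust(payload_len, b"\x00")
    return body[:DIGEST_START] + udp_cell_digest(body) + body[DIGEST_END:]


def is_recognized(relay_payload):
    """True if 'recognized' is zero and the per-cell digest matches."""
    _, recognized, _, digest, _ = RELAY_HEADER.unpack_from(relay_payload)
    return recognized == 0 and digest == udp_cell_digest(relay_payload)
```

Unlike the running digest used on TLS links, this check depends only on the cell itself, which is why a lost cell does not invalidate the ones that follow.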
+
+ (The digest does not include any bytes from relay cells that do
+ not start or end at this hop of the circuit. That is, it does not
+ include forwarded data. Therefore if 'recognized' is zero but the
+ digest does not match, the running digest at that node should
+ not be updated, and the cell should be forwarded on.)
+
+ All RELAY cells pertaining to the same tunneled TCP stream have the
+ same streamID. Such streamIDs are chosen arbitrarily by the OP. RELAY
+ cells that affect the entire circuit rather than a particular
+ stream use a StreamID of zero.
+
+ All RELAY cells pertaining to the same UDP tunnel have the same streamID.
+ This streamID is chosen randomly by the OP, but cannot be zero.
+
+ The 'Length' field of a relay cell contains the number of bytes in
+ the relay payload which contain real payload data. The remainder of
+ the payload is padded with NUL bytes.
+
+ If the RELAY cell is recognized but the relay command is not
+ understood, the cell must be dropped and ignored. Its contents
+ still count with respect to the digests, though. [Before
+ 0.1.1.10, Tor closed circuits when it received an unknown relay
+ command. Perhaps this will be more forward-compatible. -RD]
+
+5.2.1. Opening UDP tunnels and transferring data
+
+ To open a new anonymized UDP connection, the OP chooses an open
+ circuit to an exit that may be able to connect to the destination
+ address, selects a random streamID not yet used on that circuit,
+ and constructs a RELAY_BEGIN_UDP cell with a payload encoding the address
+ and port of the destination host. The payload format is:
+
+ ADDRESS | ':' | PORT | [00]
+
+ where ADDRESS can be a DNS hostname, or an IPv4 address in
+ dotted-quad format, or an IPv6 address surrounded by square brackets;
+ and where PORT is encoded in decimal.
+
+ [What is the [00] for? -NM]
+ [It's so the payload is easy to parse out with string funcs -RD]
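The RELAY_BEGIN_UDP payload format above can be sketched as an encoder and parser. The function names are hypothetical, and treating the payload as ASCII is an assumption.

```python
def encode_begin_udp(address, port):
    """Encode ADDRESS ':' PORT followed by a single NUL byte."""
    if not 0 < port < 65536:
        raise ValueError("port out of range")
    # Bare IPv6 literals are wrapped in square brackets, as the format requires.
    if ":" in address and not address.startswith("["):
        address = "[" + address + "]"
    return ("%s:%d" % (address, port)).encode("ascii") + b"\x00"


def parse_begin_udp(payload):
    """Return (address, port) from a RELAY_BEGIN_UDP payload."""
    text, nul, _ = payload.partition(b"\x00")
    if not nul:
        raise ValueError("missing terminating NUL")
    address, colon, port = text.decode("ascii").rpartition(":")
    if not colon or not address or not port.isdigit():
        raise ValueError("malformed ADDRESS:PORT")
    return address, int(port)
```

Splitting on the last colon keeps bracketed IPv6 addresses intact, and the NUL terminator lets the parser ignore any padding that follows.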
+
+ Upon receiving this cell, the exit node resolves the address as necessary.
+ If the address cannot be resolved, the exit node replies with a RELAY_END
+ cell. (See 5.4 below.) Otherwise, the exit node replies with a
+ RELAY_CONNECTED cell, whose payload is in one of the following formats:
+ The IPv4 address to which the connection was made [4 octets]
+ A number of seconds (TTL) for which the address may be cached [4 octets]
+ or
+ Four zero-valued octets [4 octets]
+ An address type (6) [1 octet]
+ The IPv6 address to which the connection was made [16 octets]
+ A number of seconds (TTL) for which the address may be cached [4 octets]
+ [XXXX Versions of Tor before 0.1.1.6 ignore and do not generate the TTL
+ field. No version of Tor currently generates the IPv6 format.]
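The two RELAY_CONNECTED payload formats above can be told apart by length once the payload has been trimmed to the relay 'Length' field. The sketch below assumes that trimming has already happened, that the TTL is big-endian, and that a bare 4-byte payload is the TTL-less form mentioned in the note above. `parse_connected` is a hypothetical name.

```python
import ipaddress
import struct


def parse_connected(data):
    """Return (address, ttl); ttl is None when no TTL field is present."""
    if len(data) == 25 and data[:4] == b"\x00" * 4 and data[4] == 6:
        # Four zero octets, address type 6, IPv6 address, TTL.
        address = ipaddress.IPv6Address(data[5:21])
        (ttl,) = struct.unpack_from(">I", data, 21)
        return address, ttl
    if len(data) == 8:
        # IPv4 address followed by TTL.
        address = ipaddress.IPv4Address(data[:4])
        (ttl,) = struct.unpack_from(">I", data, 4)
        return address, ttl
    if len(data) == 4:
        # Older form without a TTL field.
        return ipaddress.IPv4Address(data), None
    raise ValueError("unrecognized RELAY_CONNECTED payload")
```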
+
+ The OP waits for a RELAY_CONNECTED cell before sending any data.
+ Once a connection has been established, the OP and exit node
+ package UDP data in RELAY_DATA_UDP cells, and upon receiving such
+ cells, echo their contents to the corresponding socket.
+ RELAY_DATA_UDP cells sent to unrecognized streams are dropped.
+
+ RELAY_DROP_UDP cells are long-range dummies; upon receiving such
+ a cell, the OR or OP must drop it.
+
+5.3. Closing streams
+
+ UDP tunnels are closed in a fashion corresponding to TCP connections.
+
+6. Flow Control
+
+ UDP streams are not subject to flow control.
+
+7.2. Router descriptor format.
+
+The items' formats are as follows:
+ "router" nickname address ORPort SocksPort DirPort UDPPort
+
+ Indicates the beginning of a router descriptor. "address" must be
+ an IPv4 address in dotted-quad format. The last four numbers
+ indicate the ports at which this OR exposes
+ functionality. ORPort is a port at which this OR accepts TLS
+ connections for the main OR protocol; SocksPort is deprecated and
+ should always be 0; DirPort is the port at which this OR accepts
+ directory-related HTTP connections; and UDPPort is a port at which
+ this OR accepts DTLS connections for UDP data. If any port is not
+ supported, the value 0 is given instead of a port number.
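A parser for the extended "router" line might look like the sketch below. The `RouterLine` type and `parse_router_line` function are hypothetical, and the address in the usage note is taken from the documentation range rather than from a real router.

```python
import ipaddress
from typing import NamedTuple


class RouterLine(NamedTuple):
    nickname: str
    address: str
    or_port: int
    socks_port: int
    dir_port: int
    udp_port: int


def parse_router_line(line):
    """Parse: "router" nickname address ORPort SocksPort DirPort UDPPort."""
    parts = line.split()
    if len(parts) != 7 or parts[0] != "router":
        raise ValueError("not a router line with a UDPPort")
    nickname, address = parts[1], parts[2]
    ipaddress.IPv4Address(address)  # raises ValueError unless dotted-quad IPv4
    ports = [int(p) for p in parts[3:]]
    if any(not 0 <= p <= 65535 for p in ports):
        raise ValueError("port out of range")
    return RouterLine(nickname, address, *ports)
```

For example, `parse_router_line("router example 192.0.2.10 9001 0 9030 9002")` yields a `udp_port` of 9002, while a value of 0 in that position would mean the OR accepts no DTLS connections.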
+
+Other sections:
+
+What changes need to happen to each node's exit policy to support this? -RD
+
+Switching to UDP means managing the queues of incoming packets better,
+so we don't miss packets. How does this interact with doing large public
+key operations (handshakes) in the same thread?
+
+========================================================================
+COMMENTS
+========================================================================
+
+[16 May 2006]
+
+I don't favor this approach; it makes packet traffic partitioned from
+stream traffic end-to-end. The architecture I'd like to see is:
+
+ A *All* Tor-to-Tor traffic is UDP/DTLS, unless we need to fall back on
+ TCP/TLS for firewall penetration or something. (This also gives us an
+ upgrade path for routing through legacy servers.)
+
+ B Stream traffic is handled with end-to-end per-stream acks/naks and
+ retries. On failure, the data is retransmitted in a new RELAY_DATA cell;
+ a cell isn't retransmitted.
+
+We'll need to do A anyway, to fix our behavior on packet-loss. Once we've
+done so, B is more or less inevitable, and we can support end-to-end UDP
+traffic "for free".
+
+(Also, there are some details that this draft spec doesn't address. For
+example, what happens when a UDP packet doesn't fit in a single cell?)
+
+-NM
diff --git a/proposals/101-dir-voting.txt b/proposals/101-dir-voting.txt
new file mode 100644
index 0000000..4909701
--- /dev/null
+++ b/proposals/101-dir-voting.txt
@@ -0,0 +1,388 @@
+$Id: /tor/branches/eventdns/doc/dir-spec.txt 9469 2006-11-01T23:56:30.179423Z nickm $
+
+ Voting on the Tor Directory System
+
+0. Scope and preliminaries
+
+ This document describes a consensus voting scheme for Tor directories.
+ Once it's accepted, it should be merged with dir-spec.txt. Some
+ preliminaries for authority and caching support should be done during
+ the 0.1.2.x series; the main deployment should come during the 0.1.3.x
+ series.
+
+0.1. Goals and motivation: voting.
+
+ The current directory system relies on clients downloading separate
+ network status statements from the caches signed by each directory.
+ Clients download a new statement every 30 minutes or so, choosing to
+ replace the oldest statement they currently have.
+
+ This creates a partitioning problem: different clients have different
+ "most recent" networkstatus sources, and different versions of each
+ (since authorities change their statements often).
+
+ It also creates a scaling problem: most of the downloaded networkstatus
+ are probably quite similar, and the redundancy grows as we add more
+ authorities.
+
+ So if we have clients only download a single multiply-signed consensus
+ network status statement, we can:
+ - Save bandwidth.
+ - Reduce client partitioning.
+ - Reduce client-side and cache-side storage.
+ - Simplify client-side voting code (by moving voting away from the
+ client).
+
+ We should try to do this without:
+ - Assuming that client-side or cache-side clocks are more correct
+ than we assume now.
+ - Assuming that authority clocks are perfectly correct.
+ - Degrading badly if a few authorities die or are offline for a bit.
+
+ We do not have to perform well if:
+ - No clique of more than half the authorities can agree about who
+ the authorities are.
+
+1. The idea.
+
+ Instead of publishing a network status whenever something changes,
+ each authority instead publishes a fresh network status only once per
+ "period" (say, 60 minutes). Authorities either upload this network
+ status (or "vote") to every other authority, or download every other
+ authority's "vote" (see 3.1 below for discussion on push vs pull).
+
+ After an authority has (or has become convinced that it won't be able to
+ get) every other authority's vote, it deterministically computes a
+ consensus networkstatus, and signs it. Authorities download (or are
+ uploaded; see 3.1) one another's signatures, and form a multiply signed
+ consensus. This multiply-signed consensus is what caches cache and what
+ clients download.
+
+ If an authority is down, authorities vote based on what they *can*
+ download/get uploaded.
+
+ If an authority is "a little" down and only some authorities can reach
+ it, authorities try to get its info from other authorities.
+
+ If an authority computes the vote wrong, its signature isn't included on
+ the consensus.
+
+ Clients use a consensus if it is "trusted": signed by more than half the
+ authorities they recognize. If clients can't find any such consensus,
+ they use the most recent trusted consensus they have. If they don't
+ have any trusted consensus, they warn the user and refuse to operate
+ (and if DirServers is not the default, beg the user to adapt the list
+ of authorities).
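The client-side trust rule above reduces to a strict-majority test. The sketch assumes each signature has already been verified cryptographically and that authorities are identified by some comparable key such as a fingerprint string; `is_trusted` is a hypothetical name.

```python
def is_trusted(signers, recognized):
    """True if more than half of the recognized authorities signed.

    signers: identities whose signatures on the consensus verified.
    recognized: identities of the authorities this client recognizes.
    """
    recognized = set(recognized)
    valid = set(signers) & recognized
    return 2 * len(valid) > len(recognized)
```

With four recognized authorities, two valid signatures are not enough and three are. Signatures from unrecognized parties do not count toward the majority.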
+
+2. Details.
+
+2.1. Vote specifications
+
+ Votes in v2.1 are similar to v2 network status documents. We add these
+ fields to the preamble:
+
+ "vote-status" -- the word "vote".
+
+ "valid-until" -- the time when this authority expects to publish its
+ next vote.
+
+ "known-flags" -- a space-separated list of flags that will sometimes
+ be included on "s" lines later in the vote.
+
+ "dir-source" -- as before, except the "hostname" part MUST be the
+ authority's nickname, which MUST be unique among authorities, and
+ MUST match the nickname in the "directory-signature" entry.
+
+ Authorities SHOULD cache their most recently generated votes so they
+ can persist them across restarts. Authorities SHOULD NOT generate
+ another document until valid-until has passed.
+
+ Router entries in the vote MUST be sorted in ascending order by router
+ identity digest. The flags in "s" lines MUST appear in alphabetical
+ order.
+
+ Votes SHOULD be synchronized to half-hour publication intervals (one
+ hour? XXX say more; be more precise.)
+
+ XXXX some way to request older networkstatus docs?
+
+2.2. Consensus directory specifications
+
+ Consensuses are like v2.1 votes, except for the following fields:
+
+ "vote-status" -- the word "consensus".
+
+ "published" is the latest of all the published times on the votes.
+
+ "valid-until" is the earliest of all the valid-until times on the
+ votes.
+
+ "dir-source" and "fingerprint" and "dir-signing-key" and "contact"
+ are included for each authority that contributed to the vote.
+
+ "vote-digest" for each authority that contributed to the vote,
+ calculated as for the digest in the signature on the vote. [XXX
+ re-English this sentence]
+
+ "client-versions" and "server-versions" are sorted in ascending
+ order based on version-spec.txt.
+
+ "dir-options" and "known-flags" are not included.
+[XXX really? why not list the ones that are used in the consensus?
+For example, right now BadExit is in use, but no servers would be
+labelled BadExit, and it's still worth knowing that it was considered
+by the authorities. -RD]
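The "published" and "valid-until" rules above amount to a max and a min over the votes. A minimal sketch, assuming each vote has been parsed into a dict holding `datetime` values under those two keys (a hypothetical representation):

```python
from datetime import datetime


def consensus_times(votes):
    """Return (published, valid_until) for the consensus."""
    if not votes:
        raise ValueError("no votes to combine")
    published = max(v["published"] for v in votes)        # latest published time
    valid_until = min(v["valid-until"] for v in votes)    # earliest valid-until time
    return published, valid_until
```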
+
+ The fields MUST occur in the following order:
+ "network-status-version"
+ "vote-status"
+ "published"
+ "valid-until"
+ For each authority, sorted in ascending order of nickname, case-
+ insensitively:
+ "dir-source", "fingerprint", "contact", "dir-signing-key",
+ "vote-digest".
+ "client-versions"
+ "server-versions"
+
+ The signatures at the end of the document appear as multiple instances
+ of directory-signature, sorted in ascending order by nickname,
+ case-insensitively.
+
+ A router entry should be included in the result if it is included by more
+ than half of the authorities (total authorities, not just those whose votes
+ we have). A router entry has a flag set if it is included by more than
+ half of the authorities who care about that flag. [XXXX this creates an
+ incentive for attackers to DOS authorities whose votes they don't like.
+ Can we remember what flags people set the last time we saw them? -NM]
+ [Which 'we' are we talking here? The end-users never learn which
+ authority sets which flags. So you're thinking the authorities
+ should record the last vote they saw from each authority and if it's
+ within a week or so, count all the flags that it advertised as 'no'
+ votes? Plausible. -RD]
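The inclusion and flag rules above can be sketched as below. The vote representation is hypothetical. The sketch also assumes that "authorities who care about a flag" means those whose vote we hold and whose "known-flags" list names the flag; the text leaves open how to treat authorities whose votes are missing.

```python
from collections import Counter


def compute_consensus(votes, total_authorities):
    """Return {router_id: sorted list of flags} for the consensus.

    votes maps an authority name to a dict with:
      "known_flags": set of flag names that authority votes on
      "routers": {router_id: set of flags it assigns to that router}
    """
    listed = Counter()
    for vote in votes.values():
        listed.update(vote["routers"].keys())
    all_flags = set().union(*(v["known_flags"] for v in votes.values()))

    result = {}
    for router in sorted(listed):
        # Included only if listed by more than half of ALL authorities.
        if 2 * listed[router] <= total_authorities:
            continue
        flags = []
        for flag in sorted(all_flags):
            carers = [v for v in votes.values() if flag in v["known_flags"]]
            yes = sum(1 for v in carers if flag in v["routers"].get(router, ()))
            if 2 * yes > len(carers):
                flags.append(flag)
        result[router] = flags
    return result
```

Note that the two thresholds use different denominators: inclusion is measured against the total number of authorities, while each flag is measured only against the authorities that vote on it.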
+
+ The signature hash covers from the "network-status-version" line through
+ the characters "directory-signature" in the first "directory-signature"
+ line.
+
+ Consensus directories SHOULD be rejected if they are not signed by more
+ than half of the known authorities.
+
+2.2.1. Detached signatures
+
+ Assuming full connectivity, every authority should compute and sign the
+ same consensus directory in each period. Therefore, it isn't necessary to
+ download the consensus computed by each authority; instead, the authorities
+ only push/fetch each others' signatures. A "detached signature" document
+ contains a single "consensus-digest" entry and one or more
+ directory-signature entries. [XXXX specify more.]
+
+2.3. URLs and timelines
+
+2.3.1. URLs and timeline used for agreement
+
+ An authority SHOULD publish its vote immediately at the start of each voting
+ period. It does this by making it available at
+ http://<hostname>/tor/status-vote/current/authority.z
+ and sending it in an HTTP POST request to each other authority at the URL
+ http://<hostname>/tor/post/vote
+
+ If, N minutes after the voting period has begun, an authority does not have
+ a current statement from another authority, the first authority retrieves
+ the other's statement.
+
+ Once an authority has a vote from another authority, it makes it available
+ at
+ http://<hostname>/tor/status-vote/current/<fp>.z
+ where <fp> is the fingerprint of the other authority's identity key.
+
+ The consensus network status, along with as many signatures as the server
+ currently knows, should be available at
+ http://<hostname>/tor/status-vote/current/consensus.z
+ All of the detached signatures it knows for consensus status should be
+ available at:
+ http://<hostname>/tor/status-vote/current/consensus-signatures.z
+
+ Once an authority has computed and signed a consensus network status, it
+ should send its detached signature to each other authority in an HTTP POST
+ request to the URL:
+ http://<hostname>/tor/post/consensus-signature
+
+
+ [XXXX Store votes to disk.]
+
+2.3.2. Serving a consensus directory
+
+ Once the authority is done getting signatures on the consensus directory,
+ it should serve it from:
+ http://<hostname>/tor/status/consensus.z
+
+ Caches SHOULD download consensus directories from an authority and serve
+ them from the same URL.
+
+2.3.3. Timeline and synchronization
+
+ [XXXX]
+
+2.4. Distributing routerdescs between authorities
+
+ Consensus will be more meaningful if authorities take steps to make sure
+ that they all have the same set of descriptors _before_ the voting
+ starts. This is safe, since all descriptors are self-certified and
+ timestamped: it's always okay to replace a signed descriptor with a more
+ recent one signed by the same identity.
+
+ In the long run, we might want some kind of sophisticated process here.
+ For now, since authorities already download one another's networkstatus
+ documents and use them to determine what descriptors to download from one
+ another, we can rely on this existing mechanism to keep authorities up to
+ date.
+
+ [We should do a thorough read-through of dir-spec again to make sure
+ that the authorities converge on which descriptor to "prefer" for
+ each router. Right now the decision happens at the client, which is
+ no longer the right place for it. -RD]
+
+3. Questions and concerns
+
+3.1. Push or pull?
+
+ The URLs above define a push mechanism for publishing votes and consensus
+ signatures via HTTP POST requests, and a pull mechanism for downloading
+ these documents via HTTP GET requests. As specified, every authority will
+ post to every other. The "download if no copy has been received" mechanism
+ exists only as a fallback.
+
+3.2. Dropping "opt".
+
+ The "opt" keyword in Tor's directory formats was originally intended to
+ mean, "it is okay to ignore this entry if you don't understand it"; the
+ default behavior has been "discard a routerdesc if it contains entries you
+ don't recognize."
+
+ But so far, every new flag we have added has been marked 'opt'. It would
+ probably make sense to change the default behavior to "ignore unrecognized
+ fields", and add the statement that clients SHOULD ignore fields they don't
+ recognize. As a meta-principle, we should say that clients and servers
+ MUST NOT have to understand new fields in order to use directory documents
+ correctly.
+
+ Of course, this will make it impossible to say, "The format has changed a
+ lot; discard this quietly if you don't understand it." We could do that by
+ adding a version field.
+
+3.3. Multilevel keys.
+
+ Replacing a directory authority's identity key in the event of a compromise
+ would be tremendously annoying. We'd need to tell every client to switch
+ their configuration, or update to a new version with an updated list. So
+ long as some weren't upgraded, they'd be at risk from whoever had
+ compromised the key.
+
+ With this in mind, it's a shame that our current protocol forces us to
+ store identity keys unencrypted in RAM. We need some kind of signing key
+ stored unencrypted, since we need to generate new descriptors/directories
+ and rotate link and onion keys regularly. (And since, of course, we can't
+ ask server operators to be on-hand to enter a passphrase every time we
+ want to rotate keys or sign a descriptor.)
+
+ The obvious solution seems to be to have a signing-only key that lives
+ indefinitely (months or longer) and signs descriptors and link keys, and a
+ separate identity key that's used to sign the signing key. Tor servers
+ could run in one of several modes:
+ 1. Identity key stored encrypted. You need to pick a passphrase when
+ you enable this mode, and re-enter this passphrase every time you
+ rotate the signing key.
+ 1'. Identity key stored separate. You save your identity key to a
+ floppy, and use the floppy when you need to rotate the signing key.
+ 2. All keys stored unencrypted. In this case, we might not want to even
+ *have* a separate signing key. (We'll need to support no-separate-
+ signing-key mode anyway to keep old servers working.)
+ 3. All keys stored encrypted. You need to enter a passphrase to start
+ Tor.
+ (Of course, we might not want to implement all of these.)
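+
+ The two-level arrangement described above can be sketched as a pair of
+ signature checks. This is only an illustration: the type names are
+ invented here, and `verify` is a placeholder for a real public-key
+ signature check supplied by the caller. Key expiry and cross-certification
+ are not modeled.
+
```python
# Hypothetical sketch of the two-level check: the identity key certifies the
# signing key, and the signing key signs descriptors. All names here are
# illustrative; `verify` stands in for a real public-key signature check.

from typing import Callable, NamedTuple

# verify(public_key, message, signature) -> True if the signature is good
Verify = Callable[[bytes, bytes, bytes], bool]


class SigningKeyCert(NamedTuple):
    signing_key: bytes  # public half of the signing key
    signature: bytes    # made with the identity key over signing_key


class SignedDescriptor(NamedTuple):
    body: bytes
    signature: bytes    # made with the signing key over body


def descriptor_is_valid(
    identity_key: bytes,
    cert: SigningKeyCert,
    desc: SignedDescriptor,
    verify: Verify,
) -> bool:
    """Accept a descriptor only if both links of the chain check out."""
    # Link 1: the identity key vouches for the signing key.
    if not verify(identity_key, cert.signing_key, cert.signature):
        return False
    # Link 2: the signing key vouches for the descriptor itself.
    return verify(cert.signing_key, desc.body, desc.signature)
```
+
+ Under this split, only the signing key has to be available unencrypted
+ for day-to-day operation; the identity key is needed only when a new
+ signing key is certified.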
+
+ Case 1 is probably the most usable and secure, if we assume that people
+ don't forget their passphrases or lose their floppies. We could mitigate
+ that risk a bit by encouraging people to PGP-encrypt their passphrases to
+ themselves, or keep a cleartext copy of their secret key secret-split into
+ a few pieces, or something like that.
+
+ Migration presents another difficulty, especially with the authorities. If
+ we use the current set of identity keys as the new identity keys, we're in
+ the position of having sensitive keys that have been stored on
+ media-of-dubious-encryption up to now. Also, we need to keep old clients
+ (who will expect descriptors to be signed by the identity keys they know
+ and love, and who will not understand signing keys) happy.
+
+ I'd enumerate designs here, but I'm hoping that somebody will come up with
+ a better one, so I'll try not to prejudice them with more ideas yet.
+
+ Oh, and of course, we'll want to make sure that the keys are
+ cross-certified. :)
+
+ Ideas? -NM
+
+3.4. Long and short descriptors
+
+ Some of the costliest fields in the current directory protocol are ones
+ that no client actually uses. In particular, the "read-history" and
+ "write-history" fields are used only by the authorities for monitoring the
+ status of the network. If we took them out, the size of a compressed list
+ of all the routers would fall by about 60%. (No other disposable field
+ would save more than 2%.)
+
+ One possible solution here is that routers should generate and upload both
+ a short-form and a long-form descriptor. Only the short-form descriptor
+ should ever be used by anybody for routing. The long-form descriptor should
+ be used only for analytics and other tools. (If we allowed people to route
+ with long descriptors, we'd have to ensure that they stayed in sync with
+ the short ones somehow.) We can ensure that the short descriptors are used
+ by recommending only those in the network statuses.
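+
+ As a rough illustration, a short-form descriptor could be derived from a
+ long-form one by dropping the two history lines. The function name is
+ invented for this sketch, and a real short descriptor would need its own
+ signature, which is not handled here.
+
```python
# Hypothetical sketch: derive a short-form descriptor from a long-form one by
# dropping the bandwidth-history lines. Re-signing the result is omitted.

HISTORY_KEYWORDS = ("read-history", "write-history")


def short_form(long_descriptor):
    """Return the descriptor text without its read/write history lines."""
    kept = []
    for line in long_descriptor.splitlines():
        words = line.split()
        # Skip an optional leading "opt" when looking for the keyword.
        if words and words[0] == "opt":
            words = words[1:]
        if words and words[0] in HISTORY_KEYWORDS:
            continue
        kept.append(line)
    return "\n".join(kept) + "\n"
```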
+
+ Another possible solution would be to drop these fields from descriptors,
+ and have them uploaded as a part of a separate "bandwidth report" to the
+ authorities. This could help prevent the mistake of using long descriptors
+ in the place of short ones.
+
+ Thoughts? -NM
+
+3.5. Compression
+
+ Gzip would be easier to work with than zlib; bzip2 would result in smaller
+ data lengths. [Concretely, we're looking at about 10-15% space savings at
+ the expense of 3-5x longer compression time for using bzip2.] Doing
+ on-the-fly gzip requires zlib 1.2 or later; doing bzip2 requires bzlib.
+ Pre-compressing status documents in multiple formats would force us to use
+ more memory to hold them.
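+
+ The size and time tradeoff can be measured with a small script such as
+ the one below, which uses Python's standard zlib and bz2 modules. The
+ function name and the sample input are assumptions; the sample is highly
+ repetitive, so the figures quoted above would only be reproduced (or
+ refuted) by running it on a real list of router descriptors.
+
```python
# Hypothetical sketch: compare zlib (deflate) and bzip2 on the same input.
import bz2
import time
import zlib


def compare(data):
    """Return {name: (compressed_size, seconds)} for zlib and bzip2."""
    results = {}
    for name, compress in (("zlib", zlib.compress), ("bzip2", bz2.compress)):
        start = time.perf_counter()
        out = compress(data, 9)  # level 9: best compression for both
        results[name] = (len(out), time.perf_counter() - start)
    return results


if __name__ == "__main__":
    # Placeholder input; substitute real directory data for useful numbers.
    sample = b"router example 10.0.0.1 9001 0 0\n" * 5000
    for name, (size, seconds) in compare(sample).items():
        print(name, size, round(seconds, 4))
```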
+
+4. Migration
+
+ For directory voting:
+ * It would be cool if caches could get ready to download consensus
+ status docs, verify enough signatures, and serve them now. That way
+ once stuff works all we need to do is upgrade the authorities. Caches
+ don't need to verify the correctness of the format so long as it's
+ signed (or maybe multisigned?). We need to make sure that caches back
+ off very quickly from downloading consensus docs until they're
+ actually implemented.
+
+ For dropping the "opt" requirement:
+ * We stopped requiring it as of 0.1.2.5-alpha. We should stop generating
+ it once earlier formats are obsolete.
+
+ For multilevel keys:
+ * no idea
+
+ For long/short descriptors:
+ * In 0.1.2.x:
+ * Authorities should accept both, now, and silently drop short
+ descriptors.
+ * Routers should upload both once authorities accept them.
+ * There should be a "long descriptor" URL and the current "normal" URL.
+ Authorities should serve long descriptors from both URLs.
+ * Once tools that want long descriptors support fetching them from the
+ "long descriptor" URL:
+ * Have authorities remember short descriptors, and serve them from the
+ "normal" URL.
+