diff options
Diffstat (limited to 'doc/spec/dir-spec.txt')
-rw-r--r-- | doc/spec/dir-spec.txt | 873 |
1 files changed, 873 insertions, 0 deletions
diff --git a/doc/spec/dir-spec.txt b/doc/spec/dir-spec.txt new file mode 100644 index 0000000000..70c164af32 --- /dev/null +++ b/doc/spec/dir-spec.txt @@ -0,0 +1,873 @@ +$Id$ + + Tor directory protocol, version 2 + +0. Scope and preliminaries + + This directory protocol is used by Tor version 0.1.1.x and later. See + dir-spec-v1.txt for information on earlier versions. + +0.1. Goals and motivation + + There were several problems with the way Tor handles directory information + in version 0.1.0.x and earlier. Here are the problems we try to fix with + this new design, already implemented in 0.1.1.x: + 1. Directories were very large and use up a lot of bandwidth: clients + downloaded descriptors for all router several times an hour. + 2. Every directory authority was a trust bottleneck: if a single + directory authority lied, it could make clients believe for a time an + arbitrarily distorted view of the Tor network. + 3. Our current "verified server" system is kind of nonsensical. + + 4. Getting more directory authorities would add more points of failure + and worsen possible partitioning attacks. + + There are two problems that remain unaddressed by this design. + 5. Requiring every client to know about every router won't scale. + 6. Requiring every directory cache to know every router won't scale. + + We attempt to fix 1-4 here, and to build a solution that will work when we + figure out an answer for 5. We haven't thought at all about what to do + about 6. + +1. Outline + + There is a small set (say, around 10) of semi-trusted directory + authorities. A default list of authorities is shipped with the Tor + software. Users can change this list, but are encouraged not to do so, in + order to avoid partitioning attacks. + + Routers periodically upload signed "descriptors" to the directory + authorities describing their keys, capabilities, and other information. + Routers may act as directory mirrors (also called "caches"), to reduce + load on the directory authorities. They announce this in their + descriptors. + + Each directory authority periodically generates and signs a compact + "network status" document that lists that authority's view of the current + descriptors and status for known routers, but which does not include the + descriptors themselves. + + Directory mirrors download, cache, and re-serve network-status documents + to clients. + + Clients, directory mirrors, and directory authorities all use + network-status documents to find out when their list of routers is + out-of-date. If it is, they download any missing router descriptors. + Clients download missing descriptors from mirrors; mirrors and authorities + download from authorities. Descriptors are downloaded by the hash of the + descriptor, not by the server's identity key: this prevents servers from + attacking clients by giving them descriptors nobody else uses. + + All directory information is uploaded and downloaded with HTTP. + + Coordination among directory authorities is done client-side: clients + compute a vote-like algorithm among the network-status documents they + have, and base their decisions on the result. + +1.1. What's different from 0.1.0.x? + + Clients used to download a signed concatenated set of router descriptors + (called a "directory") from directory mirrors, regardless of which + descriptors had changed. + + Between downloading directories, clients would download "network-status" + documents that would list which servers were supposed to running. + + Clients would always believe the most recently published network-status + document they were served. + + Routers used to upload fresh descriptors all the time, whether their keys + and other information had changed or not. + +1.2. Document meta-format + + Router descriptors, directories, and running-routers documents all obey the + following lightweight extensible information format. + + The highest level object is a Document, which consists of one or more + Items. Every Item begins with a KeywordLine, followed by one or more + Objects. A KeywordLine begins with a Keyword, optionally followed by + whitespace and more non-newline characters, and ends with a newline. A + Keyword is a sequence of one or more characters in the set [A-Za-z0-9-]. + An Object is a block of encoded data in pseudo-Open-PGP-style + armor. (cf. RFC 2440) + + More formally: + + Document ::= (Item | NL)+ + Item ::= KeywordLine Object* + KeywordLine ::= Keyword NL | Keyword WS ArgumentsChar+ NL + Keyword = KeywordChar+ + KeywordChar ::= 'A' ... 'Z' | 'a' ... 'z' | '0' ... '9' | '-' + ArgumentChar ::= any printing ASCII character except NL. + WS = (SP | TAB)+ + Object ::= BeginLine Base-64-encoded-data EndLine + BeginLine ::= "-----BEGIN " Keyword "-----" NL + EndLine ::= "-----END " Keyword "-----" NL + + The BeginLine and EndLine of an Object must use the same keyword. + + When interpreting a Document, software MUST ignore any KeywordLine that + starts with a keyword it doesn't recognize; future implementations MUST NOT + require current clients to understand any KeywordLine not currently + described. + + The "opt" keyword was used until Tor 0.1.2.5-alpha for non-critical future + extensions. All implementations MUST ignore any item of the form "opt + keyword ....." when they would not recognize "keyword ....."; and MUST + treat "opt keyword ....." as synonymous with "keyword ......" when keyword + is recognized. + + Implementations before 0.1.2.5-alpha rejected any document with a + KeywordLine that started with a keyword that they didn't recognize. + Implementations MUST prefix items not recognized by older versions of Tor + with an "opt" until those versions of Tor are obsolete. + +2. Router operation + + ORs SHOULD generate a new router descriptor whenever any of the + following events have occurred: + + - A period of time (18 hrs by default) has passed since the last + time a descriptor was generated. + + - A descriptor field other than bandwidth or uptime has changed. + + - Bandwidth has changed by more than +/- 50% from the last time a + descriptor was generated, and at least a given interval of time + (20 mins by default) has passed since then. + + - Its uptime has been reset (by restarting). + + After generating a descriptor, ORs upload it to every directory + authority they know, by posting it to the URL + + http://<hostname:port>/tor/ + +2.1. Router descriptor format + + Every router descriptor MUST start with a "router" Item; MUST end with a + "router-signature" Item and an extra NL; and MUST contain exactly one + instance of each of the following Items: "published" "onion-key" + "link-key" "signing-key" "bandwidth". Additionally, a router descriptor + MAY contain any number of "accept", "reject", "fingerprint", "uptime", and + "opt" Items. Other than "router" and "router-signature", the items may + appear in any order. + + The items' formats are as follows: + "router" nickname address ORPort SocksPort DirPort + + Indicates the beginning of a router descriptor. "address" must be an + IPv4 address in dotted-quad format. The last three numbers indicate + the TCP ports at which this OR exposes functionality. ORPort is a port + at which this OR accepts TLS connections for the main OR protocol; + SocksPort is deprecated and should always be 0; and DirPort is the + port at which this OR accepts directory-related HTTP connections. If + any port is not supported, the value 0 is given instead of a port + number. + + "bandwidth" bandwidth-avg bandwidth-burst bandwidth-observed + + Estimated bandwidth for this router, in bytes per second. The + "average" bandwidth is the volume per second that the OR is willing to + sustain over long periods; the "burst" bandwidth is the volume that + the OR is willing to sustain in very short intervals. The "observed" + value is an estimate of the capacity this server can handle. The + server remembers the max bandwidth sustained output over any ten + second period in the past day, and another sustained input. The + "observed" value is the lesser of these two numbers. + + "platform" string + + A human-readable string describing the system on which this OR is + running. This MAY include the operating system, and SHOULD include + the name and version of the software implementing the Tor protocol. + + "published" YYYY-MM-DD HH:MM:SS + + The time, in GMT, when this descriptor was generated. + + "fingerprint" + + A fingerprint (a HASH_LEN-byte of asn1 encoded public key, encoded in + hex, with a single space after every 4 characters) for this router's + identity key. A descriptor is considered invalid (and MUST be + rejected) if the fingerprint line does not match the public key. + + [We didn't start parsing this line until Tor 0.1.0.6-rc; it should + be marked with "opt" until earlier versions of Tor are obsolete.] + + "hibernating" 0|1 + + If the value is 1, then the Tor server was hibernating when the + descriptor was published, and shouldn't be used to build circuits. + + [We didn't start parsing this line until Tor 0.1.0.6-rc; it should be + marked with "opt" until earlier versions of Tor are obsolete.] + + "uptime" + + The number of seconds that this OR process has been running. + + "onion-key" NL a public key in PEM format + + This key is used to encrypt EXTEND cells for this OR. The key MUST be + accepted for at least 1 week after any new key is published in a + subsequent descriptor. + + "signing-key" NL a public key in PEM format + + The OR's long-term identity key. + + "accept" exitpattern + "reject" exitpattern + + These lines describe the rules that an OR follows when + deciding whether to allow a new stream to a given address. The + 'exitpattern' syntax is described below. The rules are considered in + order; if no rule matches, the address will be accepted. For clarity, + the last such entry SHOULD be accept *:* or reject *:*. + + "router-signature" NL Signature NL + + The "SIGNATURE" object contains a signature of the PKCS1-padded + hash of the entire router descriptor, taken from the beginning of the + "router" line, through the newline after the "router-signature" line. + The router descriptor is invalid unless the signature is performed + with the router's identity key. + + "contact" info NL + + Describes a way to contact the server's administrator, preferably + including an email address and a PGP key fingerprint. + + "family" names NL + + 'Names' is a space-separated list of server nicknames or + hexdigests. If two ORs list one another in their "family" entries, + then OPs should treat them as a single OR for the purpose of path + selection. + + For example, if node A's descriptor contains "family B", and node B's + descriptor contains "family A", then node A and node B should never + be used on the same circuit. + + "read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL + "write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL + + Declare how much bandwidth the OR has used recently. Usage is divided + into intervals of NSEC seconds. The YYYY-MM-DD HH:MM:SS field + defines the end of the most recent interval. The numbers are the + number of bytes used in the most recent intervals, ordered from + oldest to newest. + + [We didn't start parsing these lines until Tor 0.1.0.6-rc; they should + be marked with "opt" until earlier versions of Tor are obsolete.] + + "eventdns" bool NL + + Declare whether this version of Tor is using the newer enhanced + dns logic. Versions of Tor without eventdns SHOULD NOT be used for + reverse hostname lookups. + + [All versions of Tor before 0.1.2.2-alpha should be assumed to have + this option set to 0 if it is not present. All Tor versions at + 0.1.2.2-alpha or later should be assumed to have this option set to + 1 if it is not present. Until 0.1.2.1-alpha-dev, this option was + not generated, even when eventdns was in use. Versions of Tor + before 0.1.2.1-alpha-dev did not parse this option, so it should be + marked "opt". With some future version, the old 'dnsworker' logic + will be removed, rendering this option of historical interest only.] + +2.1. Nonterminals in router descriptors + + nickname ::= between 1 and 19 alphanumeric characters, case-insensitive. + hexdigest ::= a '$', followed by 20 hexadecimal characters. + [Represents a server by the digest of its identity key.] + + exitpattern ::= addrspec ":" portspec + portspec ::= "*" | port | port "-" port + port ::= an integer between 1 and 65535, inclusive. + [Some implementations incorrectly generate ports with value 0. + Implementations SHOULD accept this, and SHOULD NOT generate it.] + + addrspec ::= "*" | ip4spec | ip6spec + ipv4spec ::= ip4 | ip4 "/" num_ip4_bits | ip4 "/" ip4mask + ip4 ::= an IPv4 address in dotted-quad format + ip4mask ::= an IPv4 mask in dotted-quad format + num_ip4_bits ::= an integer between 0 and 32 + ip6spec ::= ip6 | ip6 "/" num_ip6_bits + ip6 ::= an IPv6 address, surrounded by square brackets. + num_ip6_bits ::= an integer between 0 and 128 + + bool ::= "0" | "1" + + Ports are required; if they are not included in the router + line, they must appear in the "ports" lines. + +3. Network status format + + Directory authorities generate, sign, and compress network-status + documents. Directory servers SHOULD generate a fresh network-status + document when the contents of such a document would be different from the + last one generated, and some time (at least one second, possibly longer) + has passed since the last one was generated. + + The network status document contains a preamble, a set of router status + entries, and a signature, in that order. + + We use the same meta-format as used for directories and router descriptors + in "tor-spec.txt". Implementations MAY insert blank lines + for clarity between sections; these blank lines are ignored. + Implementations MUST NOT depend on blank lines in any particular location. + + As used here, "whitespace" is a sequence of 1 or more tab or space + characters. + + The preamble contains: + + "network-status-version" -- A document format version. For this + specification, the version is "2". + "dir-source" -- The authority's hostname, current IP address, and + directory port, all separated by whitespace. + "fingerprint" -- A base16-encoded hash of the signing key's + fingerprint, with no additional spaces added. + "contact" -- An arbitrary string describing how to contact the + directory server's administrator. Administrators should include at + least an email address and a PGP fingerprint. + "dir-signing-key" -- The directory server's public signing key. + "client-versions" -- A comma-separated list of recommended client + versions. + "server-versions" -- A comma-separated list of recommended server + versions. + "published" -- The publication time for this network-status object. + "dir-options" -- A set of flags, in any order, separated by whitespace: + "Names" if this directory authority performs name bindings. + "Versions" if this directory authority recommends software versions. + "BadExits" if the directory authority flags nodes that it believes + are performing incorrectly as exit nodes. + "BadDirectories" if the directory authority flags nodes that it + believes are performing incorrectly as directory caches. + + The dir-options entry is optional. The "-versions" entries are required if + the "Versions" flag is present. The other entries are required and must + appear exactly once. The "network-status-version" entry must appear first; + the others may appear in any order. Implementations MUST ignore + additional arguments to the items above, and MUST ignore unrecognized + flags. + + For each router, the router entry contains: (This format is designed for + conciseness.) + + "r" -- followed by the following elements, in order, separated by + whitespace: + - The OR's nickname, + - A hash of its identity key, encoded in base64, with trailing = + signs removed. + - A hash of its most recent descriptor, encoded in base64, with + trailing = signs removed. (The hash is calculated as for + computing the signature of a descriptor.) + - The publication time of its most recent descriptor, in the form + YYYY-MM-DD HH:MM:SS, in GMT. + - An IP address + - An OR port + - A directory port (or "0" for none") + "s" -- A series of whitespace-separated status flags, in any order: + "Authority" if the router is a directory authority. + "BadExit" if the router is believed to be useless as an exit node + (because its ISP censors it, because it is behind a restrictive + proxy, or for some similar reason). + "BadDirectory" if the router is believed to be useless as a + directory cache (because its directory port isn't working, + its bandwidth is always throttled, or for some similar + reason). + "Exit" if the router is useful for building general-purpose exit + circuits. + "Fast" if the router is suitable for high-bandwidth circuits. + "Guard" if the router is suitable for use as an entry guard. + "Named" if the router's identity-nickname mapping is canonical, + and this authority binds names. + "Stable" if the router is suitable for long-lived circuits. + "Running" if the router is currently usable. + "Valid" if the router has been 'validated'. + "V2Dir" if the router implements this protocol. + "v" -- The version of the Tor protocol that this server is running. If + the value begins with "Tor" SP, the rest of the string is a Tor + version number, and the protocol is "The Tor protocol as supported + by the given version of Tor." Otherwise, if the value begins with + some other string, Tor has upgraded to a more sophisticated + protocol versioning system, and the protocol is "a version of the + Tor protocol more recent than any we recognize." + + The "r" entry for each router must appear first and is required. The + "s" entry is optional (see Section 3.1 below for how the flags are + decided). Unrecognized flags on the "s" line and extra elements + on the "r" line must be ignored. The "v" line is optional; it was not + supported until 0.1.2.5-alpha, and it must be preceded with an "opt" + until all earlier versions of Tor are obsolete. + + The signature section contains: + + "directory-signature" nickname-of-dirserver NL Signature + + Signature is a signature of this network-status document + (the document up until the signature, including the line + "directory-signature <nick>\n"), using the directory authority's + signing key. + + We compress the network status list with zlib before transmitting it. + +3.1. Establishing server status + + (This section describes how directory authorities choose which status + flags to apply to routers, as of Tor 0.1.1.18-rc. Later directory + authorities MAY do things differently, so long as clients keep working + well. Clients MUST NOT depend on the exact behaviors in this section.) + + "Valid" -- a router is 'Valid' if it is running a version of Tor not + known to be broken, and the directory authority has not blacklisted + it as suspicious. + + "Named" -- Directory authority administrators may decide to support name + binding. If they do, then they must maintain a file of + nickname-to-identity-key mappings, and try to keep this file consistent + with other directory authorities. If they don't, they act as clients, and + report bindings made by other directory authorities (name X is bound to + identity Y if at least one binding directory lists it, and no directory + binds X to some other Y'.) A router is called 'Named' if the router + believes the given name should be bound to the given key. + + "Running" -- A router is 'Running' if the authority managed to connect to + it successfully within the last 30 minutes. + + "Stable" -- A router is 'Stable' if its uptime is above median for known + running, valid routers, and it's running a version of Tor not known to + drop circuits stupidly. (0.1.1.10-alpha through 0.1.1.16-rc are stupid + this way.) + + "Fast" -- A router is 'Fast' if its bandwidth is in the top 7/8ths for + known running, valid routers. + + "Guard" -- A router is a possible 'Guard' if it is 'Stable' and its + bandwidth is above median for known running, valid routers. If the total + bandwidth of Running Valid non-BadExit Exit servers is less than one third + of the total bandwidth of all Running Valid servers, no Exit is listed as + a Guard. + + "Authority" -- A router is called an 'Authority' if the authority + generating the network-status document believes it is an authority. + + "V2Dir" -- A router supports the v2 directory protocol if it has an open + directory port, and it is running a version of the directory protocol that + supports the functionality clients need. (Currently, this is + 0.1.1.9-alpha or later.) + + Directory server administrators may label some servers or IPs as + blacklisted, and elect not to include them in their network-status lists. + + Thus, the network-status list includes all non-blacklisted, + non-expired, non-superseded descriptors. + +4. Directory server operation + + All directory authorities and directory mirrors ("directory servers") + implement this section, except as noted. + +4.1. Accepting uploads (authorities only) + + When a router posts a signed descriptor to a directory authority, the + authority first checks whether it is well-formed and correctly + self-signed. If it is, the authority next verifies that the nickname + question is already assigned to a router with a different public key. + Finally, the authority MAY check that the router is not blacklisted + because of its key, IP, or another reason. + + If the descriptor passes these tests, and the authority does not already + have a descriptor for a router with this public key, it accepts the + descriptor and remembers it. + + If the authority _does_ have a descriptor with the same public key, the + newly uploaded descriptor is remembered if its publication time is more + recent than the most recent old descriptor for that router, and either: + - There are non-cosmetic differences between the old descriptor and the + new one. + - Enough time has passed between the descriptors' publication times. + (Currently, 12 hours.) + + Differences between router descriptors are "non-cosmetic" if they would be + sufficient to force an upload as described in section 2 above. + + Note that the "cosmetic difference" test only applies to uploaded + descriptors, not to descriptors that the authority downloads from other + authorities. + +4.2. Downloading network-status documents (authorities and caches) + + All directory servers (authorities and mirrors) try to keep a fresh + set of network-status documents from every authority. To do so, + every 5 minutes, each authority asks every other authority for its + most recent network-status document. Every 15 minutes, each mirror + picks a random authority and asks it for the most recent network-status + documents for all the authorities the authority knows about (including + the chosen authority itself). + + Directory servers and mirrors remember and serve the most recent + network-status document they have from each authority. Other + network-status documents don't need to be stored. If the most recent + network-status document is over 10 days old, it is discarded anyway. + Mirrors SHOULD store and serve network-status documents from authorities + they don't recognize, but SHOULD NOT use such documents for any other + purpose. Mirrors SHOULD discard network-status documents older than 48 + hours. + +4.3. Downloading and storing router descriptors (authorities and caches) + + Periodically (currently, every 10 seconds), directory servers check + whether there are any specific descriptors (as identified by descriptor + hash in a network-status document) that they do not have and that they + are not currently trying to download. + + If so, the directory server launches requests to the authorities for these + descriptors, such that each authority is only asked for descriptors listed + in its most recent network-status. When more than one authority lists the + descriptor, we choose which to ask at random. + + If one of these downloads fails, we do not try to download that descriptor + from the authority that failed to serve it again unless we receive a newer + network-status from that authority that lists the same descriptor. + + Directory servers must potentially cache multiple descriptors for each + router. Servers must not discard any descriptor listed by any current + network-status document from any authority. If there is enough space to + store additional descriptors, servers SHOULD try to hold those which + clients are likely to download the most. (Currently, this is judged + based on the interval for which each descriptor seemed newest.) + + Authorities SHOULD NOT download descriptors for routers that they would + immediately reject for reasons listed in 3.1. + +4.4. HTTP URLs + + "Fingerprints" in these URLs are base-16-encoded SHA1 hashes. + + The authoritative network-status published by a host should be available at: + http://<hostname>/tor/status/authority.z + + The network-status published by a host with fingerprint + <F> should be available at: + http://<hostname>/tor/status/fp/<F>.z + + The network-status documents published by hosts with fingerprints + <F1>,<F2>,<F3> should be available at: + http://<hostname>/tor/status/fp/<F1>+<F2>+<F3>.z + + The most recent network-status documents from all known authorities, + concatenated, should be available at: + http://<hostname>/tor/status/all.z + + The most recent descriptor for a server whose identity key has a + fingerprint of <F> should be available at: + http://<hostname>/tor/server/fp/<F>.z + + The most recent descriptors for servers with identity fingerprints + <F1>,<F2>,<F3> should be available at: + http://<hostname>/tor/server/fp/<F1>+<F2>+<F3>.z + + (NOTE: Implementations SHOULD NOT download descriptors by identity key + fingerprint. This allows a corrupted server (in collusion with a cache) to + provide a unique descriptor to a client, and thereby partition that client + from the rest of the network.) + + The server descriptor with (descriptor) digest <D> (in hex) should be + available at: + http://<hostname>/tor/server/d/<D>.z + + The most recent descriptors with digests <D1>,<D2>,<D3> should be + available at: + http://<hostname>/tor/server/d/<D1>+<D2>+<D3>.z + + The most recent descriptor for this server should be at: + http://<hostname>/tor/server/authority.z + [Nothing in the Tor protocol uses this resource yet, but it is useful + for debugging purposes. Also, the official Tor implementations + (starting at 0.1.1.x) use this resource to test whether a server's + own DirPort is reachable.] + + A concatenated set of the most recent descriptors for all known servers + should be available at: + http://<hostname>/tor/server/all.z + + For debugging, directories SHOULD expose non-compressed objects at URLs like + the above, but without the final ".z". + Clients MUST handle compressed concatenated information in two forms: + - A concatenated list of zlib-compressed objects. + - A zlib-compressed concatenated list of objects. + Directory servers MAY generate either format: the former requires less + CPU, but the latter requires less bandwidth. + + Clients SHOULD use upper case letters (A-F) when base16-encoding + fingerprints. Servers MUST accept both upper and lower case fingerprints + in requests. + +5. Client operation: downloading information + + Every Tor that is not a directory server (that is, those that do + not have a DirPort set) implements this section. + +5.1. Downloading network-status documents + + Each client maintains an ordered list of directory authorities. + Insofar as possible, clients SHOULD all use the same ordered list. + + For each network-status document a client has, it keeps track of its + publication time *and* the time when the client retrieved it. Clients + consider a network-status document "live" if it was published within the + last 24 hours. + + Clients try to have a live network-status document hours from *every* + authority, and try to periodically get new network-status documents from + each authority in rotation as follows: + + If a client is missing a live network-status document for any + authority, it tries to fetch it from a directory cache. On failure, + the client waits briefly, then tries that network-status document + again from another cache. The client does not build circuits until it + has live network-status documents from more than half the authorities + it trusts, and it has descriptors for more than 1/4 of the routers + that it believes are running. + + If the most recently _retrieved_ network-status document is over 30 + minutes old, the client attempts to download a network-status document. + When choosing which documents to download, clients treat their list of + directory authorities as a circular ring, and begin with the authority + appearing immediately after the authority for their most recently + retrieved network-status document. If this attempt fails, the client + retries at other caches several times, before moving on to the next + network-status document in sequence. + + Clients discard all network-status documents over 24 hours old. + + If enough mirrors (currently 4) claim not to have a given network status, + we stop trying to download that authority's network-status, until we + download a new network-status that makes us believe that the authority in + question is running. Clients should wait a little longer after each + failure. + + Clients SHOULD try to batch as many network-status requests as possible + into each HTTP GET. + + (Note: clients can and should pick caches based on the network-status + information they have: once they have first fetched network-status info + from an authority, they should not need to go to the authority directly + again.) + +5.2. Downloading and storing router descriptors + + Clients try to have the best descriptor for each router. A descriptor is + "best" if: + * It is the most recently published descriptor listed for that router + by at least two network-status documents. + OR, + * No descriptor for that router is listed by two or more + network-status documents, and it is the most recently published + descriptor listed by any network-status document. + + Periodically (currently every 10 seconds) clients check whether there are + any "downloadable" descriptors. A descriptor is downloadable if: + - It is the "best" descriptor for some router. + - The descriptor was published at least 10 minutes in the past. + (This prevents clients from trying to fetch descriptors that the + mirrors have probably not yet retrieved and cached.) + - The client does not currently have it. + - The client is not currently trying to download it. + - The client would not discard it immediately upon receiving it. + - The client thinks it is running and valid (see 6.1 below). + + If at least 16 known routers have downloadable descriptors, or if + enough time (currently 10 minutes) has passed since the last time the + client tried to download descriptors, it launches requests for all + downloadable descriptors, as described in 5.3 below. + + When a descriptor download fails, the client notes it, and does not + consider the descriptor downloadable again until a certain amount of time + has passed. (Currently 0 seconds for the first failure, 60 seconds for the + second, 5 minutes for the third, 10 minutes for the fourth, and 1 day + thereafter.) Periodically (currently once an hour) clients reset the + failure count. + + No descriptors are downloaded until the client has downloaded more than + half of the network-status documents. + + Clients retain the most recent descriptor they have downloaded for each + router so long as it is not too old (currently, 48 hours), OR so long as + it is recommended by at least one networkstatus AND no "better" + descriptor has been downloaded. [Versions of Tor before 0.1.2.3-alpha + would discard descriptors simply for being published too far in the past.] + [The code seems to discard descriptors in all cases after they're 5 + days old. True? -RD] + +5.3. Managing downloads + + When a client has no live network-status documents, it downloads + network-status documents from a randomly chosen authority. In all other + cases, the client downloads from mirrors randomly chosen from among those + believed to be V2 directory servers. (This information comes from the + network-status documents; see 6 below.) + + When downloading multiple router descriptors, the client chooses multiple + mirrors so that: + - At least 3 different mirrors are used, except when this would result + in more than one request for under 4 descriptors. + - No more than 128 descriptors are requested from a single mirror. + - Otherwise, as few mirrors as possible are used. + After choosing mirrors, the client divides the descriptors among them + randomly. + + After receiving any response client MUST discard any network-status + documents and descriptors that it did not request. + +6. Using directory information + + Everyone besides directory authorities uses the approaches in this section + to decide which servers to use and what their keys are likely to be. + (Directory authorities just believe their own opinions, as in 3.1 above.) + +6.1. Choosing routers for circuits. + + Tor implementations only pay attention to "live" network-status documents. + A network status is "live" if it is the most recently downloaded network + status document for a given directory server, and the server is a + directory server trusted by the client, and the network-status document is + no more than 1 day old. + + For time-sensitive information, Tor implementations focus on "recent" + network-status documents. A network status is "recent" if it is live, and + if it was published in the last 60 minutes. If there are fewer + than 3 such documents, the most recently published 3 are "recent." If + there are fewer than 3 in all, all are "recent.") + + Circuits SHOULD NOT be built until the client has enough directory + information: network-statuses (or failed attempts to download + network-statuses) for all authorities, network-statuses for at more than + half of the authorites, and descriptors for at least 1/4 of the servers + believed to be running. + + A server is "listed" if it is included by more than half of the live + network status documents. Clients SHOULD NOT use unlisted servers. + + Clients believe the flags "Valid", "Exit", "Fast", "Guard", "Stable", and + "V2Dir" about a given router when they are asserted by more than half of + the live network-status documents. Clients believe the flag "Running" if + it is listed by more than half of the recent network-status documents. + + These flags are used as follows: + + - Clients SHOULD NOT use non-'Valid' or non-'Running' routers unless + requested to do so. + + - Clients SHOULD NOT use non-'Fast' routers for any purpose other than + very-low-bandwidth circuits (such as introduction circuits). + + - Clients SHOULD NOT use non-'Stable' routers for circuits that are + likely to need to be open for a very long time (such as those used for + IRC or SSH connections). + + - Clients SHOULD NOT choose non-'Guard' nodes when picking entry guard + nodes. + + - Clients SHOULD NOT download directory information from non-'V2Dir' + caches. + +6.1. Managing naming + + In order to provide human-memorable names for individual server + identities, some directory servers bind names to IDs. Clients handle + names in two ways: + + When a client encounters a name it has not mapped before: + + If all the live "Naming" network-status documents the client has + claim that the name binds to some identity ID, and the client has at + least three live network-status documents, the client maps the name to + ID. + + When a user tries to refer to a router with a name that does not have a + mapping under the above rules, the implementation SHOULD warn the user. + After giving the warning, the implementation MAY use a router that at + least one Naming authority maps the name to, so long as no other naming + authority maps that name to a different router. If no Naming authority + maps the name to a router, the implementation MAY use any router that + advertises the name. + + Not every router needs a nickname. When a router doesn't configure a + nickname, it publishes with the default nickname "Unnamed". Authorities + SHOULD NOT ever mark a router with this nickname as Named; client software + SHOULD NOT ever use a router in response to a user request for a router + called "Unnamed". + +6.2. Software versions + + An implementation of Tor SHOULD warn when it has fetched (or has + attempted to fetch and failed four consecutive times) a network-status + for each authority, and it is running a software version + not listed on more than half of the live "Versioning" network-status + documents. + +6.3. Warning about a router's status. + + If a router tries to publish its descriptor to a Naming authority + that has its nickname mapped to another key, the router SHOULD + warn the operator that it is either using the wrong key or is using + an already claimed nickname. + + If a router has fetched (or attempted to fetch and failed four + consecutive times) a network-status for every authority, and at + least one of the authorities is "Naming", and no live "Naming" + authorities publish a binding for the router's nickname, the + router MAY remind the operator that the chosen nickname is not + bound to this key at the authorities, and suggest contacting the + authority operators. + + ... + +6.4. Router protocol versions + + A client should believe that a router supports a given feature if that + feature is supported by the router or protocol versions in more than half + of the live networkstatus's "v" entries for that router. In other words, + if the "v" entries for some router are: + v Tor 0.0.8pre1 + v Tor 0.1.2.11 + v FutureProtocolDescription 99 + then the client should believe that the router supports any feature + supported by 0.1.2.11. + + This is currently equivalent to believing the median declared version for + a router in all live networkstatuses. + +7. Standards compliance + + All clients and servers MUST support HTTP 1.0. + +7.1. HTTP headers + + Servers MAY set the Content-Length: header. Servers SHOULD set + Content-Encoding to "deflate" or "identity". + + Servers MAY include an X-Your-Address-Is: header, whose value is the + apparent IP address of the client connecting to them (as a dotted quad). + For directory connections tunneled over a BEGIN_DIR stream, servers SHOULD + report the IP from which the circuit carrying the BEGIN_DIR stream reached + them. [Servers before version 0.1.2.5-alpha reported 127.0.0.1 for all + BEGIN_DIR-tunneled connections.] + + Servers SHOULD disable caching of multiple network statuses or multiple + router descriptors. Servers MAY enable caching of single descriptors, + single network statuses, the list of all router descriptors, a v1 + directory, or a v1 running routers document. XXX mention times. + +7.2. HTTP status codes + + XXX We should write down what return codes dirservers send in what situations. + |