Diffstat (limited to 'doc')
110 files changed, 30 insertions, 24651 deletions
diff --git a/doc/Makefile.am b/doc/Makefile.am
index 693378c486..bc3d8df475 100644
--- a/doc/Makefile.am
+++ b/doc/Makefile.am
@@ -1,4 +1,3 @@
-
 # We use a two-step process to generate documentation from asciidoc files.
 #
 # First, we use asciidoc/a2x to process the asciidoc files into .1.in and
@@ -32,16 +31,12 @@ endif
 EXTRA_DIST = HACKING asciidoc-helper.sh \
 $(html_in) $(man_in) $(txt_in) \
 tor-rpm-creation.txt \
- tor-win32-mingw-creation.txt
+ tor-win32-mingw-creation.txt spec/README
 
 docdir = @docdir@
 
 asciidoc_product = $(nodist_man_MANS) $(doc_DATA)
 
-SUBDIRS = spec
-
-DIST_SUBDIRS = spec
-
 # Generate the html documentation from asciidoc, but don't do
 # machine-specific replacements yet
 $(html_in) :
diff --git a/doc/spec/Makefile.am b/doc/spec/Makefile.am
deleted file mode 100644
index e2fef42e81..0000000000
--- a/doc/spec/Makefile.am
+++ /dev/null
@@ -1,5 +0,0 @@
-
-EXTRA_DIST = tor-spec.txt rend-spec.txt control-spec.txt \
- dir-spec.txt socks-extensions.txt path-spec.txt \
- version-spec.txt address-spec.txt bridges-spec.txt
-
diff --git a/doc/spec/README b/doc/spec/README
new file mode 100644
index 0000000000..a7fa170020
--- /dev/null
+++ b/doc/spec/README
@@ -0,0 +1,10 @@
+The Tor specifications and proposals have moved to a new repository.
+
+To browse the specifications, go to
+ https://gitweb.torproject.org/torspec.git/tree
+
+To check out the specification repository, run
+ git clone git://git.torproject.org/torspec.git
+
+For other information on the repository, see
+ http://gitweb.torproject.org/torspec.git
diff --git a/doc/spec/address-spec.txt b/doc/spec/address-spec.txt
deleted file mode 100644
index ce6d2b65e7..0000000000
--- a/doc/spec/address-spec.txt
+++ /dev/null
@@ -1,58 +0,0 @@
-
- Special Hostnames in Tor
- Nick Mathewson
-
-1. Overview
-
- Most of the time, Tor treats user-specified hostnames as opaque: When
- the user connects to www.torproject.org, Tor picks an exit node and uses
- that node to connect to "www.torproject.org".
Some hostnames, however, - can be used to override Tor's default behavior and circuit-building - rules. - - These hostnames can be passed to Tor as the address part of a SOCKS4a or - SOCKS5 request. If the application is connected to Tor using an IP-only - method (such as SOCKS4, TransPort, or NATDPort), these hostnames can be - substituted for certain IP addresses using the MapAddress configuration - option or the MAPADDRESS control command. - -2. .exit - - SYNTAX: [hostname].[name-or-digest].exit - [name-or-digest].exit - - Hostname is a valid hostname; [name-or-digest] is either the nickname of a - Tor node or the hex-encoded digest of that node's public key. - - When Tor sees an address in this format, it uses the specified hostname as - the exit node. If no "hostname" component is given, Tor defaults to the - published IPv4 address of the exit node. - - It is valid to try to resolve hostnames, and in fact upon success Tor - will cache an internal mapaddress of the form - "www.google.com.foo.exit=64.233.161.99.foo.exit" to speed subsequent - lookups. - - The .exit notation is disabled by default as of Tor 0.2.2.1-alpha, due - to potential application-level attacks. - - EXAMPLES: - www.example.com.exampletornode.exit - - Connect to www.example.com from the node called "exampletornode". - - exampletornode.exit - - Connect to the published IP address of "exampletornode" using - "exampletornode" as the exit. - -3. .onion - - SYNTAX: [digest].onion - - The digest is the first eighty bits of a SHA1 hash of the identity key for - a hidden service, encoded in base32. - - When Tor sees an address in this format, it tries to look up and connect to - the specified hidden service. See rend-spec.txt for full details. - diff --git a/doc/spec/bridges-spec.txt b/doc/spec/bridges-spec.txt deleted file mode 100644 index 647118815c..0000000000 --- a/doc/spec/bridges-spec.txt +++ /dev/null @@ -1,249 +0,0 @@ - - Tor bridges specification - -0. 
Preface - - This document describes the design decisions around support for bridge - users, bridge relays, and bridge authorities. It acts as an overview - of the bridge design and deployment for developers, and it also tries - to point out limitations in the current design and implementation. - - For more details on what all of these mean, look at blocking.tex in - /doc/design-paper/ - -1. Bridge relays - - Bridge relays are just like normal Tor relays except they don't publish - their server descriptors to the main directory authorities. - -1.1. PublishServerDescriptor - - To configure your relay to be a bridge relay, just add - BridgeRelay 1 - PublishServerDescriptor bridge - to your torrc. This will cause your relay to publish its descriptor - to the bridge authorities rather than to the default authorities. - - Alternatively, you can say - BridgeRelay 1 - PublishServerDescriptor 0 - which will cause your relay to not publish anywhere. This could be - useful for private bridges. - -1.2. Recommendations. - - Bridge relays should use an exit policy of "reject *:*". This is - because they only need to relay traffic between the bridge users - and the rest of the Tor network, so there's no need to let people - exit directly from them. - - We invented the RelayBandwidth* options for this situation: Tor clients - who want to allow relaying too. See proposal 111 for details. Relay - operators should feel free to rate-limit their relayed traffic. - -1.3. Implementation note. - - Vidalia 0.0.15 has turned its "Relay" settings page into a tri-state - "Don't relay" / "Relay for the Tor network" / "Help censored users". - - If you click the third choice, it forces your exit policy to reject *:*. - - If all the bridges end up on port 9001, that's not so good. On the - other hand, putting the bridges on a low-numbered port in the Unix - world requires jumping through extra hoops. 
The current compromise is - that Vidalia makes the ORPort default to 443 on Windows, and 9001 on - other platforms. - - At the bottom of the relay config settings window, Vidalia displays - the bridge identifier to the operator (see Section 3.1) so he can pass - it on to bridge users. - -2. Bridge authorities. - - Bridge authorities are like normal v3 directory authorities, except - they don't create their own network-status documents or votes. So if - you ask a bridge authority for a network-status document or consensus, - they behave like a directory mirror: they give you one from one of - the main authorities. But if you ask the bridge authority for the - descriptor corresponding to a particular identity fingerprint, it will - happily give you the latest descriptor for that fingerprint. - - To become a bridge authority, add these lines to your torrc: - AuthoritativeDirectory 1 - BridgeAuthoritativeDir 1 - - Right now there's one bridge authority, running on the Tonga relay. - -2.1. Exporting bridge-purpose descriptors - - We've added a new purpose for server descriptors: the "bridge" - purpose. With the new router-descriptors file format that includes - annotations, it's easy to look through it and find the bridge-purpose - descriptors. - - Currently we export the bridge descriptors from Tonga to the - BridgeDB server, so it can give them out according to the policies - in blocking.pdf. - -2.2. Reachability/uptime testing - - Right now the bridge authorities do active reachability testing of - bridges, so we know which ones to recommend for users. - - But in the design document, we suggested that bridges should publish - anonymously (i.e. via Tor) to the bridge authority, so somebody watching - the bridge authority can't just enumerate all the bridges. But if we're - doing active measurement, the game is up. Perhaps we should back off on - this goal, or perhaps we should do our active measurement anonymously? - - Answering this issue is scheduled for 0.2.1.x. 
- -2.3. Future work: migrating to multiple bridge authorities - - Having only one bridge authority is both a trust bottleneck (if you - break into one place you learn about every single bridge we've got) - and a robustness bottleneck (when it's down, bridge users become sad). - - Right now if we put up a second bridge authority, all the bridges would - publish to it, and (assuming the code works) bridge users would query - a random bridge authority. This resolves the robustness bottleneck, - but makes the trust bottleneck even worse. - - In 0.2.2.x and later we should think about better ways to have multiple - bridge authorities. - -3. Bridge users. - - Bridge users are like ordinary Tor users except they use encrypted - directory connections by default, and they use bridge relays as both - entry guards (their first hop) and directory guards (the source of - all their directory information). - - To become a bridge user, add the following line to your torrc: - UseBridges 1 - - and then add at least one "Bridge" line to your torrc based on the - format below. - -3.1. Format of the bridge identifier. - - The canonical format for a bridge identifier contains an IP address, - an ORPort, and an identity fingerprint: - bridge 128.31.0.34:9009 4C17 FB53 2E20 B2A8 AC19 9441 ECD2 B017 7B39 E4B1 - - However, the identity fingerprint can be left out, in which case the - bridge user will connect to that relay and use it as a bridge regardless - of what identity key it presents: - bridge 128.31.0.34:9009 - This might be useful for cases where only short bridge identifiers - can be communicated to bridge users. - - In a future version we may also support bridge identifiers that are - only a key fingerprint: - bridge 4C17 FB53 2E20 B2A8 AC19 9441 ECD2 B017 7B39 E4B1 - and the bridge user can fetch the latest descriptor from the bridge - authority (see Section 3.4). - -3.2. 
Bridges as entry guards - - For now, bridge users add their bridge relays to their list of "entry - guards" (see path-spec.txt for background on entry guards). They are - managed by the entry guard algorithms exactly as if they were a normal - entry guard -- their keys and timing get cached in the "state" file, - etc. This means that when the Tor user starts up with "UseBridges" - disabled, he will skip past the bridge entries since they won't be - listed as up and usable in his networkstatus consensus. But to be clear, - the "entry_guards" list doesn't currently distinguish guards by purpose. - - Internally, each bridge user keeps a smartlist of "bridge_info_t" - that reflects the "bridge" lines from his torrc along with a download - schedule (see Section 3.5 below). When he starts Tor, he attempts - to fetch a descriptor for each configured bridge (see Section 3.4 - below). When he succeeds at getting a descriptor for one of the bridges - in his list, he adds it directly to the entry guard list using the - normal add_an_entry_guard() interface. Once a bridge descriptor has - been added, should_delay_dir_fetches() will stop delaying further - directory fetches, and the user begins to bootstrap his directory - information from that bridge (see Section 3.3). - - Currently bridge users cache their bridge descriptors to the - "cached-descriptors" file (annotated with purpose "bridge"), but - they don't make any attempt to reuse descriptors they find in this - file. The theory is that either the bridge is available now, in which - case you can get a fresh descriptor, or it's not, in which case an - old descriptor won't do you much good. - - We could disable writing out the bridge lines to the state file, if - we think this is a problem. - - As an exception, if we get an application request when we have one - or more bridge descriptors but we believe none of them are running, - we mark them all as running again. 
This is similar to the exception - already in place to help long-idle Tor clients realize they should - fetch fresh directory information rather than just refuse requests. - -3.3. Bridges as directory guards - - In addition to using bridges as the first hop in their circuits, bridge - users also use them to fetch directory updates. Other than initial - bootstrapping to find a working bridge descriptor (see Section 3.4 - below), all further non-anonymized directory fetches will be redirected - to the bridge. - - This means that bridge relays need to have cached answers for all - questions the bridge user might ask. This makes the upgrade path - tricky --- for example, if we migrate to a v4 directory design, the - bridge user would need to keep using v3 so long as his bridge relays - only knew how to answer v3 queries. - - In a future design, for cases where the user has enough information - to build circuits yet the chosen bridge doesn't know how to answer a - given query, we might teach bridge users to make an anonymized request - to a more suitable directory server. - -3.4. How bridge users get their bridge descriptor - - Bridge users can fetch bridge descriptors in two ways: by going directly - to the bridge and asking for "/tor/server/authority", or by going to - the bridge authority and asking for "/tor/server/fp/ID". By default, - they will only try the direct queries. If the user sets - UpdateBridgesFromAuthority 1 - in his config file, then he will try querying the bridge authority - first for bridges where he knows a digest (if he only knows an IP - address and ORPort, then his only option is a direct query). - - If the user has at least one working bridge, then he will do further - queries to the bridge authority through a full three-hop Tor circuit. - But when bootstrapping, he will make a direct begin_dir-style connection - to the bridge authority. 
- - As of Tor 0.2.0.10-alpha, if the user attempts to fetch a descriptor - from the bridge authority and it returns a 404 not found, the user - will automatically fall back to trying a direct query. Therefore it is - recommended that bridge users always set UpdateBridgesFromAuthority, - since at worst it will delay their fetches a little bit and notify - the bridge authority of the identity fingerprint (but not location) - of their intended bridges. - -3.5. Bridge descriptor retry schedule - - Bridge users try to fetch a descriptor for each bridge (using the - steps in Section 3.4 above) on startup. Whenever they receive a - bridge descriptor, they reschedule a new descriptor download for 1 - hour from then. - - If on the other hand it fails, they try again after 15 minutes for the - first attempt, after 15 minutes for the second attempt, and after 60 - minutes for subsequent attempts. - - In 0.2.2.x we should come up with some smarter retry schedules. - -3.6. Implementation note. - - Vidalia 0.1.0 has a new checkbox in its Network config window called - "My ISP blocks connections to the Tor network." Users who click that - box change their configuration to: - UseBridges 1 - UpdateBridgesFromAuthority 1 - and should add at least one bridge identifier. - diff --git a/doc/spec/control-spec-v0.txt b/doc/spec/control-spec-v0.txt deleted file mode 100644 index 3515d395a6..0000000000 --- a/doc/spec/control-spec-v0.txt +++ /dev/null @@ -1,498 +0,0 @@ - - TC: A Tor control protocol (Version 0) - --1. Deprecation - -THIS PROTOCOL IS DEPRECATED. It is still documented here because Tor -0.1.1.x happens to support much of it; but the support for v0 is not -maintained, so you should expect it to rot in unpredictable ways. Support -for v0 will be removed some time after Tor 0.1.2. - -0. Scope - -This document describes an implementation-specific protocol that is used -for other programs (such as frontend user-interfaces) to communicate -with a locally running Tor process. 
It is not part of the Tor onion -routing protocol. - -We're trying to be pretty extensible here, but not infinitely -forward-compatible. - -1. Protocol outline - -TC is a bidirectional message-based protocol. It assumes an underlying -stream for communication between a controlling process (the "client") and -a Tor process (the "server"). The stream may be implemented via TCP, -TLS-over-TCP, a Unix-domain socket, or so on, but it must provide -reliable in-order delivery. For security, the stream should not be -accessible by untrusted parties. - -In TC, the client and server send typed variable-length messages to each -other over the underlying stream. By default, all messages from the server -are in response to messages from the client. Some client requests, however, -will cause the server to send messages to the client indefinitely far into -the future. - -Servers respond to messages in the order they're received. - -2. Message format - -The messages take the following format: - - Length [2 octets; big-endian] - Type [2 octets; big-endian] - Body [Length octets] - -Upon encountering a recognized Type, implementations behave as described in -section 3 below. If the type is not recognized, servers respond with an -"ERROR" message (code UNRECOGNIZED; see 3.1 below), and clients simply ignore -the message. - -2.1. Types and encodings - - All numbers are given in big-endian (network) order. - - OR identities are given in hexadecimal, in the same format as identity key - fingerprints, but without spaces; see tor-spec.txt for more information. - -3. Message types - - Message types are drawn from the following ranges: - - 0x0000-0xEFFF : Reserved for use by official versions of this spec. - 0xF000-0xFFFF : Unallocated; usable by unofficial extensions. - -3.1. ERROR (Type 0x0000) - - Sent in response to a message that could not be processed as requested. - - The body of the message begins with a 2-byte error code. 
The following - values are defined: - - 0x0000 Unspecified error - [] - - 0x0001 Internal error - [Something went wrong inside Tor, so that the client's - request couldn't be fulfilled.] - - 0x0002 Unrecognized message type - [The client sent a message type we don't understand.] - - 0x0003 Syntax error - [The client sent a message body in a format we can't parse.] - - 0x0004 Unrecognized configuration key - [The client tried to get or set a configuration option we don't - recognize.] - - 0x0005 Invalid configuration value - [The client tried to set a configuration option to an - incorrect, ill-formed, or impossible value.] - - 0x0006 Unrecognized byte code - [The client tried to set a byte code (in the body) that - we don't recognize.] - - 0x0007 Unauthorized. - [The client tried to send a command that requires - authorization, but it hasn't sent a valid AUTHENTICATE - message.] - - 0x0008 Failed authentication attempt - [The client sent a well-formed authorization message.] - - 0x0009 Resource exhausted - [The server didn't have enough of a given resource to - fulfill a given request.] - - 0x000A No such stream - - 0x000B No such circuit - - 0x000C No such OR - - The rest of the body should be a human-readable description of the error. - - In general, new error codes should only be added when they don't fall under - one of the existing error codes. - -3.2. DONE (Type 0x0001) - - Sent from server to client in response to a request that was successfully - completed, with no more information needed. The body is usually empty but - may contain a message. - -3.3. SETCONF (Type 0x0002) - - Change the value of a configuration variable. The body contains a list of - newline-terminated key-value configuration lines. An individual key-value - configuration line consists of the key, followed by a space, followed by - the value. The server behaves as though it had just read the key-value pair - in its configuration file. 
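Editor's illustration (not part of the removed file): the version 0 framing from section 2 and the SETCONF body layout described above can be sketched in Python roughly as follows. The helper names (encode_message, decode_message, setconf_body) are hypothetical and do not come from Tor or any controller library; only the field sizes, byte order, and type codes are taken from the spec text.

```python
import struct

# Type codes as listed in the v0 spec text.
TYPE_ERROR = 0x0000
TYPE_DONE = 0x0001
TYPE_SETCONF = 0x0002


def encode_message(msg_type: int, body: bytes) -> bytes:
    """Frame one message: 2-octet length, 2-octet type (both big-endian), body."""
    if len(body) > 0xFFFF:
        raise ValueError("body does not fit in a single unfragmented message")
    return struct.pack(">HH", len(body), msg_type) + body


def decode_message(data: bytes):
    """Split one framed message off the front of data.

    Returns (type, body, remaining bytes).
    """
    if len(data) < 4:
        raise ValueError("need at least 4 header octets")
    length, msg_type = struct.unpack(">HH", data[:4])
    if len(data) < 4 + length:
        raise ValueError("incomplete body")
    return msg_type, data[4:4 + length], data[4 + length:]


def setconf_body(settings) -> bytes:
    """Build a SETCONF body: one 'key SP value NL' line per (key, value) pair."""
    return "".join(f"{key} {value}\n" for key, value in settings).encode("ascii")


# Example: ask the server to set ORPort to 9001.
wire = encode_message(TYPE_SETCONF, setconf_body([("ORPort", "9001")]))
```

A real controller would then read the reply with decode_message and check whether its type is DONE or ERROR; the socket handling is left out of this sketch.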
- - The server responds with a DONE message on success, or an ERROR message on - failure. - - When a configuration options takes multiple values, or when multiple - configuration keys form a context-sensitive group (see below), then - setting _any_ of the options in a SETCONF command is taken to reset all of - the others. For example, if two ORBindAddress values are configured, - and a SETCONF command arrives containing a single ORBindAddress value, the - new command's value replaces the two old values. - - To _remove_ all settings for a given option entirely (and go back to its - default value), send a single line containing the key and no value. - -3.4. GETCONF (Type 0x0003) - - Request the value of a configuration variable. The body contains one or - more NL-terminated strings for configuration keys. The server replies - with a CONFVALUE message. - - If an option appears multiple times in the configuration, all of its - key-value pairs are returned in order. - - Some options are context-sensitive, and depend on other options with - different keywords. These cannot be fetched directly. Currently there - is only one such option: clients should use the "HiddenServiceOptions" - virtual keyword to get all HiddenServiceDir, HiddenServicePort, - HiddenServiceNodes, and HiddenServiceExcludeNodes option settings. - -3.5. CONFVALUE (Type 0x0004) - - Sent in response to a GETCONF message; contains a list of "Key Value\n" - (A non-whitespace keyword, a single space, a non-NL value, a NL) - strings. - -3.6. SETEVENTS (Type 0x0005) - - Request the server to inform the client about interesting events. - The body contains a list of 2-byte event codes (see "event" below). - Any events *not* listed in the SETEVENTS body are turned off; thus, sending - SETEVENTS with an empty body turns off all event reporting. - - The server responds with a DONE message on success, and an ERROR message - if one of the event codes isn't recognized. 
(On error, the list of active - event codes isn't changed.) - -3.7. EVENT (Type 0x0006) - - Sent from the server to the client when an event has occurred and the - client has requested that kind of event. The body contains a 2-byte - event code followed by additional event-dependent information. Event - codes are: - 0x0001 -- Circuit status changed - - Status [1 octet] - 0x00 Launched - circuit ID assigned to new circuit - 0x01 Built - all hops finished, can now accept streams - 0x02 Extended - one more hop has been completed - 0x03 Failed - circuit closed (was not built) - 0x04 Closed - circuit closed (was built) - Circuit ID [4 octets] - (Must be unique to Tor process/time) - Path [NUL-terminated comma-separated string] - (For extended/failed, is the portion of the path that is - built) - - 0x0002 -- Stream status changed - - Status [1 octet] - (Sent connect=0,sent resolve=1,succeeded=2,failed=3, - closed=4, new connection=5, new resolve request=6, - stream detached from circuit and still retriable=7) - Stream ID [4 octets] - (Must be unique to Tor process/time) - Target (NUL-terminated address-port string] - - 0x0003 -- OR Connection status changed - - Status [1 octet] - (Launched=0,connected=1,failed=2,closed=3) - OR nickname/identity [NUL-terminated] - - 0x0004 -- Bandwidth used in the last second - - Bytes read [4 octets] - Bytes written [4 octets] - - 0x0005 -- Notice/warning/error occurred - - Message [NUL-terminated] - - <obsolete: use 0x0007-0x000B instead.> - - 0x0006 -- New descriptors available - - OR List [NUL-terminated, comma-delimited list of - OR identity] - - 0x0007 -- Debug message occurred - 0x0008 -- Info message occurred - 0x0009 -- Notice message occurred - 0x000A -- Warning message occurred - 0x000B -- Error message occurred - - Message [NUL-terminated] - -3.8. AUTHENTICATE (Type 0x0007) - - Sent from the client to the server. Contains a 'magic cookie' to prove - that client is really allowed to control this Tor process. 
The server - responds with DONE or ERROR. - - The format of the 'cookie' is implementation-dependent; see 4.1 below for - information on how the standard Tor implementation handles it. - -3.9. SAVECONF (Type 0x0008) - - Sent from the client to the server. Instructs the server to write out - its config options into its torrc. Server returns DONE if successful, or - ERROR if it can't write the file or some other error occurs. - -3.10. SIGNAL (Type 0x0009) - - Sent from the client to the server. The body contains one byte that - indicates the action the client wishes the server to take. - - 1 (0x01) -- Reload: reload config items, refetch directory. - 2 (0x02) -- Controlled shutdown: if server is an OP, exit immediately. - If it's an OR, close listeners and exit after 30 seconds. - 10 (0x0A) -- Dump stats: log information about open connections and - circuits. - 12 (0x0C) -- Debug: switch all open logs to loglevel debug. - 15 (0x0F) -- Immediate shutdown: clean up and exit now. - - The server responds with DONE if the signal is recognized (or simply - closes the socket if it was asked to close immediately), else ERROR. - -3.11. MAPADDRESS (Type 0x000A) - - Sent from the client to the server. The body contains a sequence of - address mappings, each consisting of the address to be mapped, a single - space, the replacement address, and a NL character. - - Addresses may be IPv4 addresses, IPv6 addresses, or hostnames. - - The client sends this message to the server in order to tell it that future - SOCKS requests for connections to the original address should be replaced - with connections to the specified replacement address. If the addresses - are well-formed, and the server is able to fulfill the request, the server - replies with a single DONE message containing the source and destination - addresses. If request is malformed, the server replies with a syntax error - message. The server can't fulfill the request, it replies with an internal - ERROR message. 
- - The client may decline to provide a body for the original address, and - instead send a special null address ("0.0.0.0" for IPv4, "::0" for IPv6, or - "." for hostname), signifying that the server should choose the original - address itself, and return that address in the DONE message. The server - should ensure that it returns an element of address space that is unlikely - to be in actual use. If there is already an address mapped to the - destination address, the server may reuse that mapping. - - If the original address is already mapped to a different address, the old - mapping is removed. If the original address and the destination address - are the same, the server removes any mapping in place for the original - address. - - {Note: This feature is designed to be used to help Tor-ify applications - that need to use SOCKS4 or hostname-less SOCKS5. There are three - approaches to doing this: - 1. Somehow make them use SOCKS4a or SOCKS5-with-hostnames instead. - 2. Use tor-resolve (or another interface to Tor's resolve-over-SOCKS - feature) to resolve the hostname remotely. This doesn't work - with special addresses like x.onion or x.y.exit. - 3. Use MAPADDRESS to map an IP address to the desired hostname, and then - arrange to fool the application into thinking that the hostname - has resolved to that IP. - This functionality is designed to help implement the 3rd approach.} - - [XXXX When, if ever, can mappings expire? Should they expire?] - [XXXX What addresses, if any, are safe to use?] - -3.12 GETINFO (Type 0x000B) - - Sent from the client to the server. The message body is as for GETCONF: - one or more NL-terminated strings. The server replies with an INFOVALUE - message. - - Unlike GETCONF, this message is used for data that are not stored in the - Tor configuration file, but instead. - - Recognized key and their values include: - - "version" -- The version of the server's software, including the name - of the software. 
(example: "Tor 0.0.9.4") - - "desc/id/<OR identity>" or "desc/name/<OR nickname>" -- the latest server - descriptor for a given OR, NUL-terminated. If no such OR is known, the - corresponding value is an empty string. - - "network-status" -- a space-separated list of all known OR identities. - This is in the same format as the router-status line in directories; - see tor-spec.txt for details. - - "addr-mappings/all" - "addr-mappings/config" - "addr-mappings/cache" - "addr-mappings/control" -- a NL-terminated list of address mappings, each - in the form of "from-address" SP "to-address". The 'config' key - returns those address mappings set in the configuration; the 'cache' - key returns the mappings in the client-side DNS cache; the 'control' - key returns the mappings set via the control interface; the 'all' - target returns the mappings set through any mechanism. - -3.13 INFOVALUE (Type 0x000C) - - Sent from the server to the client in response to a GETINFO message. - Contains one or more items of the format: - - Key [(NUL-terminated string)] - Value [(NUL-terminated string)] - - The keys match those given in the GETINFO message. - -3.14 EXTENDCIRCUIT (Type 0x000D) - - Sent from the client to the server. The message body contains two fields: - Circuit ID [4 octets] - Path [NUL-terminated, comma-delimited string of OR nickname/identity] - - This request takes one of two forms: either the Circuit ID is zero, in - which case it is a request for the server to build a new circuit according - to the specified path, or the Circuit ID is nonzero, in which case it is a - request for the server to extend an existing circuit with that ID according - to the specified path. - - If the request is successful, the server sends a DONE message containing - a message body consisting of the four-octet Circuit ID of the newly created - circuit. - -3.15 ATTACHSTREAM (Type 0x000E) - - Sent from the client to the server. 
The message body contains two fields: - Stream ID [4 octets] - Circuit ID [4 octets] - - This message informs the server that the specified stream should be - associated with the specified circuit. Each stream may be associated with - at most one circuit, and multiple streams may share the same circuit. - Streams can only be attached to completed circuits (that is, circuits that - have sent a circuit status 'built' event). - - If the circuit ID is 0, responsibility for attaching the given stream is - returned to Tor. - - {Implementation note: By default, Tor automatically attaches streams to - circuits itself, unless the configuration variable - "__LeaveStreamsUnattached" is set to "1". Attempting to attach streams - via TC when "__LeaveStreamsUnattached" is false may cause a race between - Tor and the controller, as both attempt to attach streams to circuits.} - -3.16 POSTDESCRIPTOR (Type 0x000F) - - Sent from the client to the server. The message body contains one field: - Descriptor [NUL-terminated string] - - This message informs the server about a new descriptor. - - The descriptor, when parsed, must contain a number of well-specified - fields, including fields for its nickname and identity. - - If there is an error in parsing the descriptor, the server must send an - appropriate error message. If the descriptor is well-formed but the server - chooses not to add it, it must reply with a DONE message whose body - explains why the server was not added. - -3.17 FRAGMENTHEADER (Type 0x0010) - - Sent in either direction. Used to encapsulate messages longer than 65535 - bytes in length. - - Underlying type [2 bytes] - Total Length [4 bytes] - Data [Rest of message] - - A FRAGMENTHEADER message MUST be followed immediately by a number of - FRAGMENT messages, such that lengths of the "Data" fields of the - FRAGMENTHEADER and FRAGMENT messages add to the "Total Length" field of the - FRAGMENTHEADER message. 
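Editor's illustration (not part of the removed file): one possible reading of the FRAGMENTHEADER rule above, sketched in Python. It assumes that the 2-octet Length field of every message counts its whole body, so the FRAGMENTHEADER message can carry at most 65535 - 6 data bytes, and that continuation messages use the FRAGMENT type (0x0011) from section 3.18. The function names are hypothetical.

```python
import struct

TYPE_FRAGMENTHEADER = 0x0010
TYPE_FRAGMENT = 0x0011
MAX_BODY = 0xFFFF  # largest body a 2-octet Length field can describe


def frame_message(msg_type: int, body: bytes) -> bytes:
    """Frame one message: 2-octet length, 2-octet type, body."""
    return struct.pack(">HH", len(body), msg_type) + body


def fragment(msg_type: int, payload: bytes) -> list:
    """Return the framed message(s) needed to carry payload."""
    if len(payload) <= MAX_BODY:
        # Messages shorter than 65536 bytes must not be fragmented.
        return [frame_message(msg_type, payload)]
    # Header fields: underlying type (2 bytes), total length (4 bytes).
    header = struct.pack(">HI", msg_type, len(payload))
    first = MAX_BODY - len(header)
    frames = [frame_message(TYPE_FRAGMENTHEADER, header + payload[:first])]
    for start in range(first, len(payload), MAX_BODY):
        frames.append(frame_message(TYPE_FRAGMENT, payload[start:start + MAX_BODY]))
    return frames


def reassemble(frames: list):
    """Inverse of fragment(): return (underlying type, payload)."""
    length, msg_type = struct.unpack(">HH", frames[0][:4])
    body = frames[0][4:4 + length]
    if msg_type != TYPE_FRAGMENTHEADER:
        return msg_type, body
    underlying, total = struct.unpack(">HI", body[:6])
    data = body[6:]
    for raw in frames[1:]:
        frag_len, frag_type = struct.unpack(">HH", raw[:4])
        if frag_type != TYPE_FRAGMENT:
            raise ValueError("expected a FRAGMENT message")
        data += raw[4:4 + frag_len]
    if len(data) != total:
        raise ValueError("Data fields do not add up to Total Length")
    return underlying, data
```

The sketch always packs fragments as full as possible; a receiver written this way still accepts smaller fragments, since reassemble only checks that the Data lengths sum to Total Length.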
- - Implementations MUST NOT fragment messages of length less than 65536 bytes. - Implementations MUST be able to process fragmented messages that not - optimally packed. - -3.18 FRAGMENT (Type 0x0011) - - Data [Entire message] - - See FRAGMENTHEADER for more information - -3.19 REDIRECTSTREAM (Type 0x0012) - - Sent from the client to the server. The message body contains two fields: - Stream ID [4 octets] - Address [variable-length, NUL-terminated.] - - Tells the server to change the exit address on the specified stream. No - remapping is performed on the new provided address. - - To be sure that the modified address will be used, this event must be sent - after a new stream event is received, and before attaching this stream to - a circuit. - -3.20 CLOSESTREAM (Type 0x0013) - - Sent from the client to the server. The message body contains three - fields: - Stream ID [4 octets] - Reason [1 octet] - Flags [1 octet] - - Tells the server to close the specified stream. The reason should be - one of the Tor RELAY_END reasons given in tor-spec.txt. Flags is not - used currently. Tor may hold the stream open for a while to flush - any data that is pending. - -3.21 CLOSECIRCUIT (Type 0x0014) - - Sent from the client to the server. The message body contains two - fields: - Circuit ID [4 octets] - Flags [1 octet] - - Tells the server to close the specified circuit. If the LSB of the flags - field is nonzero, do not close the circuit unless it is unused. - -4. Implementation notes - -4.1. Authentication - - By default, the current Tor implementation trusts all local users. - - If the 'CookieAuthentication' option is true, Tor writes a "magic cookie" - file named "control_auth_cookie" into its data directory. To authenticate, - the controller must send the contents of this file. - - If the 'HashedControlPassword' option is set, it must contain the salted - hash of a secret password. 
The salted hash is computed according to the - S2K algorithm in RFC 2440 (OpenPGP), and prefixed with the s2k specifier. - This is then encoded in hexadecimal, prefixed by the indicator sequence - "16:". Thus, for example, the password 'foo' could encode to: - 16:660537E3E1CD49996044A3BF558097A981F539FEA2F9DA662B4626C1C2 - ++++++++++++++++**^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - salt hashed value - indicator - You can generate the salted hash of a password by calling - 'tor --hash-password <password>' - or by using the example code in the Python and Java controller libraries. - To authenticate under this scheme, the controller sends Tor the original - secret that was used to generate the hash. - -4.2. Don't let the buffer get too big. - - If you ask for lots of events, and 16MB of them queue up on the buffer, - the Tor process will close the socket. - diff --git a/doc/spec/control-spec.txt b/doc/spec/control-spec.txt deleted file mode 100644 index 255adf00a4..0000000000 --- a/doc/spec/control-spec.txt +++ /dev/null @@ -1,1963 +0,0 @@ - - TC: A Tor control protocol (Version 1) - -0. Scope - - This document describes an implementation-specific protocol that is used - for other programs (such as frontend user-interfaces) to communicate with a - locally running Tor process. It is not part of the Tor onion routing - protocol. - - This protocol replaces version 0 of TC, which is now deprecated. For - reference, version 0 of TC is described in "control-spec-v0.txt". Implementors are - advised to avoid using TC directly, but instead to use a library that - can easily be updated to use the newer protocol. (Version 0 is used by Tor - versions 0.1.0.x; the protocol in this document only works with Tor - versions in the 0.1.1.x series and later.) - - The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL - NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and - "OPTIONAL" in this document are to be interpreted as described in - RFC 2119. - -1. 
Protocol outline - - TC is a bidirectional message-based protocol. It assumes an underlying - stream for communication between a controlling process (the "client" - or "controller") and a Tor process (or "server"). The stream may be - implemented via TCP, TLS-over-TCP, a Unix-domain socket, or so on, - but it must provide reliable in-order delivery. For security, the - stream should not be accessible by untrusted parties. - - In TC, the client and server send typed messages to each other over the - underlying stream. The client sends "commands" and the server sends - "replies". - - By default, all messages from the server are in response to messages from - the client. Some client requests, however, will cause the server to send - messages to the client indefinitely far into the future. Such - "asynchronous" replies are marked as such. - - Servers respond to messages in the order messages are received. - -2. Message format - -2.1. Description format - - The message formats listed below use ABNF as described in RFC 2234. - The protocol itself is loosely based on SMTP (see RFC 2821). - - We use the following nonterminals from RFC 2822: atom, qcontent - - We define the following general-use nonterminals: - - String = DQUOTE *qcontent DQUOTE - - There are explicitly no limits on line length. All 8-bit characters are - permitted unless explicitly disallowed. - - Wherever CRLF is specified to be accepted from the controller, Tor MAY also - accept LF. Tor, however, MUST NOT generate LF instead of CRLF. - Controllers SHOULD always send CRLF. - -2.2. Commands from controller to Tor - - Command = Keyword Arguments CRLF / "+" Keyword Arguments CRLF Data - Keyword = 1*ALPHA - Arguments = *(SP / VCHAR) - - Specific commands and their arguments are described below in section 3. - -2.3. 
Replies from Tor to the controller - - Reply = SyncReply / AsyncReply - SyncReply = *(MidReplyLine / DataReplyLine) EndReplyLine - AsyncReply = *(MidReplyLine / DataReplyLine) EndReplyLine - - MidReplyLine = StatusCode "-" ReplyLine - DataReplyLine = StatusCode "+" ReplyLine Data - EndReplyLine = StatusCode SP ReplyLine - ReplyLine = [ReplyText] CRLF - ReplyText = XXXX - StatusCode = 3DIGIT - - Specific replies are mentioned below in section 3, and described more fully - in section 4. - - [Compatibility note: versions of Tor before 0.2.0.3-alpha sometimes - generate AsyncReplies of the form "*(MidReplyLine / DataReplyLine)". - This is incorrect, but controllers that need to work with these - versions of Tor should be prepared to get multi-line AsyncReplies with - the final line (usually "650 OK") omitted.] - -2.4. General-use tokens - - ; CRLF means, "the ASCII Carriage Return character (decimal value 13) - ; followed by the ASCII Linefeed character (decimal value 10)." - CRLF = CR LF - - ; How a controller tells Tor about a particular OR. There are four - ; possible formats: - ; $Fingerprint -- The router whose identity key hashes to the fingerprint. - ; This is the preferred way to refer to an OR. - ; $Fingerprint~Nickname -- The router whose identity key hashes to the - ; given fingerprint, but only if the router has the given nickname. - ; $Fingerprint=Nickname -- The router whose identity key hashes to the - ; given fingerprint, but only if the router is Named and has the given - ; nickname. - ; Nickname -- The Named router with the given nickname, or, if no such - ; router exists, any router whose nickname matches the one given. - ; This is not a safe way to refer to routers, since Named status - ; could under some circumstances change over time. 
- ; - ; The tokens that implement the above follow: - - ServerSpec = LongName / Nickname - LongName = Fingerprint [ ( "=" / "~" ) Nickname ] - - Fingerprint = "$" 40*HEXDIG - NicknameChar = "a"-"z" / "A"-"Z" / "0" - "9" - Nickname = 1*19 NicknameChar - - ; What follows is an outdated way to refer to ORs. - ; Feature VERBOSE_NAMES replaces ServerID with LongName in events and - ; GETINFO results. VERBOSE_NAMES can be enabled starting in Tor version - ; 0.1.2.2-alpha and it is always-on in 0.2.2.1-alpha and later. - ServerID = Nickname / Fingerprint - - - ; Unique identifiers for streams or circuits. Currently, Tor only - ; uses digits, but this may change - StreamID = 1*16 IDChar - CircuitID = 1*16 IDChar - IDChar = ALPHA / DIGIT - - Address = ip4-address / ip6-address / hostname (XXXX Define these) - - ; A "Data" section is a sequence of octets concluded by the terminating - ; sequence CRLF "." CRLF. The terminating sequence may not appear in the - ; body of the data. Leading periods on lines in the data are escaped with - ; an additional leading period as in RFC 2821 section 4.5.2. - Data = *DataLine "." CRLF - DataLine = CRLF / "." 1*LineItem CRLF / NonDotItem *LineItem CRLF - LineItem = NonCR / 1*CR NonCRLF - NonDotItem = NonDotCR / 1*CR NonCRLF - -3. Commands - - All commands are case-insensitive, but most keywords are case-sensitive. - -3.1. SETCONF - - Change the value of one or more configuration variables. The syntax is: - - "SETCONF" 1*(SP keyword ["=" value]) CRLF - value = String / QuotedString - - Tor behaves as though it had just read each of the key-value pairs - from its configuration file. Keywords with no corresponding values have - their configuration values reset to 0 or NULL (use RESETCONF if you want - to set it back to its default). SETCONF is all-or-nothing: if there - is an error in any of the configuration settings, Tor sets none of them. - - Tor responds with a "250 configuration values set" reply on success. 
- If some of the listed keywords can't be found, Tor replies with a - "552 Unrecognized option" message. Otherwise, Tor responds with a - "513 syntax error in configuration values" reply on syntax error, or a - "553 impossible configuration setting" reply on a semantic error. - - When a configuration option takes multiple values, or when multiple - configuration keys form a context-sensitive group (see GETCONF below), then - setting _any_ of the options in a SETCONF command is taken to reset all of - the others. For example, if two ORBindAddress values are configured, and a - SETCONF command arrives containing a single ORBindAddress value, the new - command's value replaces the two old values. - - Sometimes it is not possible to change configuration options solely by - issuing a series of SETCONF commands, because the value of one of the - configuration options depends on the value of another which has not yet - been set. Such situations can be overcome by setting multiple configuration - options with a single SETCONF command (e.g. SETCONF ORPort=443 - ORListenAddress=9001). - -3.2. RESETCONF - - Remove all settings for a given configuration option entirely, assign - its default value (if any), and then assign the String provided. - Typically the String is left empty, to simply set an option back to - its default. The syntax is: - - "RESETCONF" 1*(SP keyword ["=" String]) CRLF - - Otherwise it behaves like SETCONF above. - -3.3. GETCONF - - Request the value of a configuration variable. The syntax is: - - "GETCONF" 1*(SP keyword) CRLF - - If all of the listed keywords exist in the Tor configuration, Tor replies - with a series of reply lines of the form: - 250 keyword=value - If any option is set to a 'default' value semantically different from an - empty string, Tor may reply with a reply line of the form: - 250 keyword - - Value may be a raw value or a quoted string. 
Tor will try to use - unquoted values except when the value could be misinterpreted through - not being quoted. - - If some of the listed keywords can't be found, Tor replies with a - "552 unknown configuration keyword" message. - - If an option appears multiple times in the configuration, all of its - key-value pairs are returned in order. - - Some options are context-sensitive, and depend on other options with - different keywords. These cannot be fetched directly. Currently there - is only one such option: clients should use the "HiddenServiceOptions" - virtual keyword to get all HiddenServiceDir, HiddenServicePort, - HiddenServiceNodes, and HiddenServiceExcludeNodes option settings. - -3.4. SETEVENTS - - Request the server to inform the client about interesting events. The - syntax is: - - "SETEVENTS" [SP "EXTENDED"] *(SP EventCode) CRLF - - EventCode = "CIRC" / "STREAM" / "ORCONN" / "BW" / "DEBUG" / - "INFO" / "NOTICE" / "WARN" / "ERR" / "NEWDESC" / "ADDRMAP" / - "AUTHDIR_NEWDESCS" / "DESCCHANGED" / "STATUS_GENERAL" / - "STATUS_CLIENT" / "STATUS_SERVER" / "GUARD" / "NS" / "STREAM_BW" / - "CLIENTS_SEEN" / "NEWCONSENSUS" / "BUILDTIMEOUT_SET" - - Any events *not* listed in the SETEVENTS line are turned off; thus, sending - SETEVENTS with an empty body turns off all event reporting. - - The server responds with a "250 OK" reply on success, and a "552 - Unrecognized event" reply if one of the event codes isn't recognized. (On - error, the list of active event codes isn't changed.) - - If the flag string "EXTENDED" is provided, Tor may provide extra - information with events for this connection; see 4.1 for more information. - NOTE: All events on a given connection will be provided in extended format, - or none. - NOTE: "EXTENDED" is only supported in Tor 0.1.1.9-alpha or later. - - Each event is described in more detail in Section 4.1. - -3.5. AUTHENTICATE - - Sent from the client to the server. 
The syntax is: - "AUTHENTICATE" [ SP 1*HEXDIG / QuotedString ] CRLF - - The server responds with "250 OK" on success or "515 Bad authentication" if - the authentication cookie is incorrect. Tor closes the connection on an - authentication failure. - - The format of the 'cookie' is implementation-dependent; see 5.1 below for - information on how the standard Tor implementation handles it. - - Before the client has authenticated, no command other than PROTOCOLINFO, - AUTHENTICATE, or QUIT is valid. If the controller sends any other command, - or sends a malformed command, or sends an unsuccessful AUTHENTICATE - command, or sends PROTOCOLINFO more than once, Tor sends an error reply and - closes the connection. - - To prevent some cross-protocol attacks, the AUTHENTICATE command is still - required even if all authentication methods in Tor are disabled. In this - case, the controller should just send "AUTHENTICATE" CRLF. - - (Versions of Tor before 0.1.2.16 and 0.2.0.4-alpha did not close the - connection after an authentication failure.) - -3.6. SAVECONF - - Sent from the client to the server. The syntax is: - "SAVECONF" CRLF - - Instructs the server to write out its config options into its torrc. The server - returns "250 OK" if successful, or "551 Unable to write configuration - to disk" if it can't write the file or some other error occurs. - - See also the "getinfo config-text" command, if the controller wants - to write the torrc file itself. - -3.7. SIGNAL - - Sent from the client to the server. The syntax is: - - "SIGNAL" SP Signal CRLF - - Signal = "RELOAD" / "SHUTDOWN" / "DUMP" / "DEBUG" / "HALT" / - "HUP" / "INT" / "USR1" / "USR2" / "TERM" / "NEWNYM" / - "CLEARDNSCACHE" - - The meanings of the signals are: - - RELOAD -- Reload: reload config items, refetch directory. (like HUP) - SHUTDOWN -- Controlled shutdown: if server is an OP, exit immediately. - If it's an OR, close listeners and exit after 30 seconds. 
- (like INT) - DUMP -- Dump stats: log information about open connections and - circuits. (like USR1) - DEBUG -- Debug: switch all open logs to loglevel debug. (like USR2) - HALT -- Immediate shutdown: clean up and exit now. (like TERM) - CLEARDNSCACHE -- Forget the client-side cached IPs for all hostnames. - NEWNYM -- Switch to clean circuits, so new application requests - don't share any circuits with old ones. Also clears - the client-side DNS cache. (Tor MAY rate-limit its - response to this signal.) - - The server responds with "250 OK" if the signal is recognized (or simply - closes the socket if it was asked to close immediately), or "552 - Unrecognized signal" if the signal is unrecognized. - -3.8. MAPADDRESS - - Sent from the client to the server. The syntax is: - - "MAPADDRESS" 1*(Address "=" Address SP) CRLF - - The first address in each pair is an "original" address; the second is a - "replacement" address. The client sends this message to the server in - order to tell it that future SOCKS requests for connections to the original - address should be replaced with connections to the specified replacement - address. If the addresses are well-formed, and the server is able to - fulfill the request, the server replies with a 250 message: - 250-OldAddress1=NewAddress1 - 250 OldAddress2=NewAddress2 - - containing the source and destination addresses. If the request is - malformed, the server replies with "512 syntax error in command - argument". If the server can't fulfill the request, it replies with - "451 resource exhausted". - - The client may decline to provide a body for the original address, and - instead send a special null address ("0.0.0.0" for IPv4, "::0" for IPv6, or - "." for hostname), signifying that the server should choose the original - address itself, and return that address in the reply. The server - should ensure that it returns an element of address space that is unlikely - to be in actual use. 
If there is already an address mapped to the - destination address, the server may reuse that mapping. - - If the original address is already mapped to a different address, the old - mapping is removed. If the original address and the destination address - are the same, the server removes any mapping in place for the original - address. - - Example: - C: MAPADDRESS 0.0.0.0=torproject.org 1.2.3.4=tor.freehaven.net - S: 250-127.192.10.10=torproject.org - S: 250 1.2.3.4=tor.freehaven.net - - {Note: This feature is designed to be used to help Tor-ify applications - that need to use SOCKS4 or hostname-less SOCKS5. There are three - approaches to doing this: - 1. Somehow make them use SOCKS4a or SOCKS5-with-hostnames instead. - 2. Use tor-resolve (or another interface to Tor's resolve-over-SOCKS - feature) to resolve the hostname remotely. This doesn't work - with special addresses like x.onion or x.y.exit. - 3. Use MAPADDRESS to map an IP address to the desired hostname, and then - arrange to fool the application into thinking that the hostname - has resolved to that IP. - This functionality is designed to help implement the 3rd approach.} - - Mappings set by the controller last until the Tor process exits: - they never expire. If the controller wants the mapping to last only - a certain time, then it must explicitly un-map the address when that - time has elapsed. - -3.9. GETINFO - - Sent from the client to the server. The syntax is as for GETCONF: - "GETINFO" 1*(SP keyword) CRLF - one or more NL-terminated strings. The server replies with an INFOVALUE - message, or a 551 or 552 error. - - Unlike GETCONF, this message is used for data that are not stored in the Tor - configuration file, and that may be longer than a single line. On success, - one ReplyLine is sent for each requested value, followed by a final 250 OK - ReplyLine. 
If a value fits on a single line, the format is: - 250-keyword=value - If a value must be split over multiple lines, the format is: - 250+keyword= - value - . - Recognized keys and their values include: - - "version" -- The version of the server's software, including the name - of the software. (example: "Tor 0.0.9.4") - - "config-file" -- The location of Tor's configuration file ("torrc"). - - "config-text" -- The contents that Tor would write if you send it - a SAVECONF command, so the controller can write the file to - disk itself. [First implemented in 0.2.2.7-alpha.] - - ["exit-policy/prepend" -- The default exit policy lines that Tor will - *prepend* to the ExitPolicy config option. - -- Never implemented. Useful?] - - "exit-policy/default" -- The default exit policy lines that Tor will - *append* to the ExitPolicy config option. - - "desc/id/<OR identity>" or "desc/name/<OR nickname>" -- the latest - server descriptor for a given OR, NUL-terminated. - - "desc-annotations/id/<OR identity>" -- outputs the annotations string - (source, timestamp of arrival, purpose, etc) for the corresponding - descriptor. [First implemented in 0.2.0.13-alpha.] - - "extra-info/digest/<digest>" -- the extrainfo document whose digest (in - hex) is <digest>. Only available if we're downloading extra-info - documents. - - "ns/id/<OR identity>" or "ns/name/<OR nickname>" -- the latest router - status info (v2 directory style) for a given OR. Router status - info is as given in - dir-spec.txt, and reflects the current beliefs of this Tor about the - router in question. Like directory clients, controllers MUST - tolerate unrecognized flags and lines. The published date and - descriptor digest are those believed to be best by this Tor, - not necessarily those for a descriptor that Tor currently has. - [First implemented in 0.1.2.3-alpha.] - - "ns/all" -- Router status info (v2 directory style) for all ORs we - have an opinion about, joined by newlines. 
[First implemented - in 0.1.2.3-alpha.] - - "ns/purpose/<purpose>" -- Router status info (v2 directory style) - for all ORs of this purpose. Mostly designed for /ns/purpose/bridge - queries. [First implemented in 0.2.0.13-alpha.] - - "desc/all-recent" -- the latest server descriptor for every router that - Tor knows about. - - "network-status" -- a space-separated list (v1 directory style) - of all known OR identities. This is in the same format as the - router-status line in v1 directories; see dir-spec-v1.txt section - 3 for details. (If VERBOSE_NAMES is enabled, the output will - not conform to dir-spec-v1.txt; instead, the result will be a - space-separated list of LongName, each preceded by a "!" if it is - believed to be not running.) This option is deprecated; use - "ns/all" instead. - - "address-mappings/all" - "address-mappings/config" - "address-mappings/cache" - "address-mappings/control" -- a \r\n-separated list of address - mappings, each in the form of "from-address to-address expiry". - The 'config' key returns those address mappings set in the - configuration; the 'cache' key returns the mappings in the - client-side DNS cache; the 'control' key returns the mappings set - via the control interface; the 'all' target returns the mappings - set through any mechanism. - Expiry is formatted as with ADDRMAP events, except that "expiry" is - always a time in GMT or the string "NEVER"; see section 4.1.7. - First introduced in 0.2.0.3-alpha. - - "addr-mappings/*" -- as for address-mappings/*, but without the - expiry portion of the value. Use of this value is deprecated - since 0.2.0.3-alpha; use address-mappings instead. - - "address" -- the best guess at our external IP address. If we - have no guess, return a 551 error. (Added in 0.1.2.2-alpha) - - "fingerprint" -- the contents of the fingerprint file that Tor - writes as a server, or a 551 if we're not a server currently. 
- (Added in 0.1.2.3-alpha) - - "circuit-status" - A series of lines as for a circuit status event. Each line is of - the form: - CircuitID SP CircStatus [SP Path] CRLF - - "stream-status" - A series of lines as for a stream status event. Each is of the form: - StreamID SP StreamStatus SP CircID SP Target CRLF - - "orconn-status" - A series of lines as for an OR connection status event. In Tor - 0.1.2.2-alpha with feature VERBOSE_NAMES enabled and in Tor - 0.2.2.1-alpha and later by default, each line is of the form: - LongName SP ORStatus CRLF - - In Tor versions 0.1.2.2-alpha through 0.2.2.1-alpha with feature - VERBOSE_NAMES turned off and before version 0.1.2.2-alpha, each line - is of the form: - ServerID SP ORStatus CRLF - - "entry-guards" - A series of lines listing the currently chosen entry guards, if any. - In Tor 0.1.2.2-alpha with feature VERBOSE_NAMES enabled and in Tor - 0.2.2.1-alpha and later by default, each line is of the form: - LongName SP Status [SP ISOTime] CRLF - - In Tor versions 0.1.2.2-alpha through 0.2.2.1-alpha with feature - VERBOSE_NAMES turned off and before version 0.1.2.2-alpha, each line - is of the form: - ServerID2 SP Status [SP ISOTime] CRLF - ServerID2 = Nickname / 40*HEXDIG - - The definition of Status is the same for both: - Status = "up" / "never-connected" / "down" / - "unusable" / "unlisted" - - [From 0.1.1.4-alpha to 0.1.1.10-alpha, entry-guards was called - "helper-nodes". Tor still supports calling "helper-nodes", but it - is deprecated and should not be used.] - - [Older versions of Tor (before 0.1.2.x-final) generated 'down' instead - of unlisted/unusable. Current Tors never generate 'down'.] - - [XXXX ServerID2 differs from ServerID in not prefixing fingerprints - with a $. This is an implementation error. It would be nice to add - the $ back in if we can do so without breaking compatibility.] 
- - "accounting/enabled" - "accounting/hibernating" - "accounting/bytes" - "accounting/bytes-left" - "accounting/interval-start" - "accounting/interval-wake" - "accounting/interval-end" - Information about accounting status. If accounting is enabled, - "enabled" is 1; otherwise it is 0. The "hibernating" field is "hard" - if we are accepting no data; "soft" if we're accepting no new - connections, and "awake" if we're not hibernating at all. The "bytes" - and "bytes-left" fields contain (read-bytes SP write-bytes), for the - start and the rest of the interval respectively. The 'interval-start' - and 'interval-end' fields are the borders of the current interval; the - 'interval-wake' field is the time within the current interval (if any) - where we plan[ned] to start being active. The times are GMT. - - "config/names" - A series of lines listing the available configuration options. Each is - of the form: - OptionName SP OptionType [ SP Documentation ] CRLF - OptionName = Keyword - OptionType = "Integer" / "TimeInterval" / "DataSize" / "Float" / - "Boolean" / "Time" / "CommaList" / "Dependant" / "Virtual" / - "String" / "LineList" - Documentation = Text - - "info/names" - A series of lines listing the available GETINFO options. Each is of - one of these forms: - OptionName SP Documentation CRLF - OptionPrefix SP Documentation CRLF - OptionPrefix = OptionName "/*" - - "events/names" - A space-separated list of all the events supported by this version of - Tor's SETEVENTS. - - "features/names" - A space-separated list of all the features supported by this version of - Tor's USEFEATURE. - - "ip-to-country/*" - Maps IP addresses to 2-letter country codes. For example, - "GETINFO ip-to-country/18.0.0.1" should give "US". - - "next-circuit/IP:port" - XXX todo. 
- - "dir/status-vote/current/consensus" [added in Tor 0.2.1.6-alpha] - "dir/status/authority" - "dir/status/fp/<F>" - "dir/status/fp/<F1>+<F2>+<F3>" - "dir/status/all" - "dir/server/fp/<F>" - "dir/server/fp/<F1>+<F2>+<F3>" - "dir/server/d/<D>" - "dir/server/d/<D1>+<D2>+<D3>" - "dir/server/authority" - "dir/server/all" - A series of lines listing directory contents, provided according to the - specification for the URLs listed in Section 4.4 of dir-spec.txt. Note - that Tor MUST NOT provide private information, such as descriptors for - routers not marked as general-purpose. When asked for 'authority' - information for which this Tor is not authoritative, Tor replies with - an empty string. - - "status/circuit-established" - "status/enough-dir-info" - "status/good-server-descriptor" - "status/accepted-server-descriptor" - "status/..." - These provide the current internal Tor values for various Tor - states. See Section 4.1.10 for explanations. (Only a few of the - status events are available as getinfo's currently. Let us know if - you want more exposed.) - "status/reachability-succeeded/or" - 0 or 1, depending on whether we've found our ORPort reachable. - "status/reachability-succeeded/dir" - 0 or 1, depending on whether we've found our DirPort reachable. - "status/reachability-succeeded" - "OR=" ("0"/"1") SP "DIR=" ("0"/"1") - Combines status/reachability-succeeded/*; controllers MUST ignore - unrecognized elements in this entry. - "status/bootstrap-phase" - Returns the most recent bootstrap phase status event - sent. Specifically, it returns a string starting with either - "NOTICE BOOTSTRAP ..." or "WARN BOOTSTRAP ...". Controllers should - use this getinfo when they connect or attach to Tor to learn its - current bootstrap state. - "status/version/recommended" - List of currently recommended versions. - "status/version/current" - Status of the current version. One of: new, old, unrecommended, - recommended, new in series, obsolete, unknown. 
- "status/clients-seen" - A summary of which countries we've seen clients from recently, - formatted the same as the CLIENTS_SEEN status event described in - Section 4.1.14. This GETINFO option is currently available only - for bridge relays. - - Examples: - C: GETINFO version desc/name/moria1 - S: 250+desc/name/moria1= - S: [Descriptor for moria1] - S: . - S: 250-version=Tor 0.1.1.0-alpha-cvs - S: 250 OK - -3.10. EXTENDCIRCUIT - - Sent from the client to the server. The format is: - "EXTENDCIRCUIT" SP CircuitID - [SP ServerSpec *("," ServerSpec) - SP "purpose=" Purpose] CRLF - - This request takes one of two forms: either the CircuitID is zero, in - which case it is a request for the server to build a new circuit, - or the CircuitID is nonzero, in which case it is a request for the - server to extend an existing circuit with that ID according to the - specified path. - - If the CircuitID is 0, the controller has the option of providing - a path for Tor to use to build the circuit. If it does not provide - a path, Tor will select one automatically from high capacity nodes - according to path-spec.txt. - - If CircuitID is 0 and "purpose=" is specified, then the circuit's - purpose is set. Two choices are recognized: "general" and - "controller". If not specified, circuits are created as "general". - - If the request is successful, the server sends a reply containing a - message body consisting of the CircuitID of the (maybe newly created) - circuit. The syntax is "250" SP "EXTENDED" SP CircuitID CRLF. - -3.11. SETCIRCUITPURPOSE - - Sent from the client to the server. The format is: - "SETCIRCUITPURPOSE" SP CircuitID SP Purpose CRLF - - This changes the circuit's purpose. See EXTENDCIRCUIT above for details. - -3.12. SETROUTERPURPOSE - - Sent from the client to the server. The format is: - "SETROUTERPURPOSE" SP NicknameOrKey SP Purpose CRLF - - This changes the descriptor's purpose. See +POSTDESCRIPTOR below - for details. 
- - NOTE: This command was disabled and made obsolete as of Tor - 0.2.0.8-alpha. It doesn't exist anymore, and is listed here only for - historical interest. - -3.13. ATTACHSTREAM - - Sent from the client to the server. The syntax is: - "ATTACHSTREAM" SP StreamID SP CircuitID [SP "HOP=" HopNum] CRLF - - This message informs the server that the specified stream should be - associated with the specified circuit. Each stream may be associated with - at most one circuit, and multiple streams may share the same circuit. - Streams can only be attached to completed circuits (that is, circuits that - have sent a circuit status 'BUILT' event or are listed as built in a - GETINFO circuit-status request). - - If the circuit ID is 0, responsibility for attaching the given stream is - returned to Tor. - - If HOP=HopNum is specified, Tor will choose the HopNumth hop in the - circuit as the exit node, rather than the last node in the circuit. - Hops are 1-indexed; generally, it is not permitted to attach to hop 1. - - Tor responds with "250 OK" if it can attach the stream, 552 if the circuit - or stream didn't exist, or 551 if the stream couldn't be attached for - another reason. - - {Implementation note: Tor will close unattached streams by itself, - roughly two minutes after they are born. Let the developers know if - that turns out to be a problem.} - - {Implementation note: By default, Tor automatically attaches streams to - circuits itself, unless the configuration variable - "__LeaveStreamsUnattached" is set to "1". Attempting to attach streams - via TC when "__LeaveStreamsUnattached" is false may cause a race between - Tor and the controller, as both attempt to attach streams to circuits.} - - {Implementation note: You can try to attachstream to a stream that - has already sent a connect or resolve request but hasn't succeeded - yet, in which case Tor will detach the stream from its current circuit - before proceeding with the new attach request.} - -3.14. 
POSTDESCRIPTOR - - Sent from the client to the server. The syntax is: - "+POSTDESCRIPTOR" [SP "purpose=" Purpose] [SP "cache=" Cache] - CRLF Descriptor CRLF "." CRLF - - This message informs the server about a new descriptor. If Purpose is - specified, it must be either "general", "controller", or "bridge", - else we return a 552 error. The default is "general". - - If Cache is specified, it must be either "no" or "yes", else we - return a 552 error. If Cache is not specified, Tor will decide for - itself whether it wants to cache the descriptor, and controllers - must not rely on its choice. - - The descriptor, when parsed, must contain a number of well-specified - fields, including fields for its nickname and identity. - - If there is an error in parsing the descriptor, the server must send a - "554 Invalid descriptor" reply. If the descriptor is well-formed but - the server chooses not to add it, it must reply with a 251 message - whose body explains why the server was not added. If the descriptor - is added, Tor replies with "250 OK". - -3.15. REDIRECTSTREAM - - Sent from the client to the server. The syntax is: - "REDIRECTSTREAM" SP StreamID SP Address [SP Port] CRLF - - Tells the server to change the exit address on the specified stream. If - Port is specified, changes the destination port as well. No remapping - is performed on the new provided address. - - To be sure that the modified address will be used, this event must be sent - after a new stream event is received, and before attaching this stream to - a circuit. - - Tor replies with "250 OK" on success. - -3.16. CLOSESTREAM - - Sent from the client to the server. The syntax is: - - "CLOSESTREAM" SP StreamID SP Reason *(SP Flag) CRLF - - Tells the server to close the specified stream. The reason should be one - of the Tor RELAY_END reasons given in tor-spec.txt, as a decimal. Flags is - not used currently; Tor servers SHOULD ignore unrecognized flags. 
Tor may - hold the stream open for a while to flush any data that is pending. - - Tor replies with "250 OK" on success, or a 512 if there aren't enough - arguments, or a 552 if it doesn't recognize the StreamID or reason. - -3.17. CLOSECIRCUIT - - The syntax is: - CLOSECIRCUIT SP CircuitID *(SP Flag) CRLF - Flag = "IfUnused" - - Tells the server to close the specified circuit. If "IfUnused" is - provided, do not close the circuit unless it is unused. - - Other flags may be defined in the future; Tor SHOULD ignore unrecognized - flags. - - Tor replies with "250 OK" on success, or a 512 if there aren't enough - arguments, or a 552 if it doesn't recognize the CircuitID. - -3.18. QUIT - - Tells the server to hang up on this controller connection. This command - can be used before authenticating. - -3.19. USEFEATURE - - Adding additional features to the control protocol sometimes will break - backwards compatibility. Initially such features are added into Tor and - disabled by default. USEFEATURE can enable these additional features. - - The syntax is: - - "USEFEATURE" *(SP FeatureName) CRLF - FeatureName = 1*(ALPHA / DIGIT / "_" / "-") - - Feature names are case-insensitive. - - Once enabled, a feature stays enabled for the duration of the connection - to the controller. A new connection to the controller must be opened to - disable an enabled feature. - - Features are a forward-compatibility mechanism; each feature will eventually - become a standard part of the control protocol. Once a feature becomes part - of the protocol, it is always-on. Each feature documents the version it was - introduced as a feature and the version in which it became part of the - protocol. - - Tor will ignore a request to use any feature that is always-on. Tor will give - a 552 error in response to an unrecognized feature. - - EXTENDED_EVENTS - - Same as passing 'EXTENDED' to SETEVENTS; this is the preferred way to - request the extended event syntax. 
- - This feature was first introduced in 0.1.2.3-alpha. It is always-on - and part of the protocol in Tor 0.2.2.1-alpha and later. - - VERBOSE_NAMES - - Replaces ServerID with LongName in events and GETINFO results. LongName - provides a Fingerprint for all routers, an indication of Named status, - and a Nickname if one is known. LongName is strictly more informative - than ServerID, which only provides either a Fingerprint or a Nickname. - - This feature was first introduced in 0.1.2.2-alpha. It is always-on and - part of the protocol in Tor 0.2.2.1-alpha and later. - -3.20. RESOLVE - - The syntax is - "RESOLVE" *Option *Address CRLF - Option = "mode=reverse" - Address = a hostname or IPv4 address - - This command launches a remote hostname lookup request for every specified - request (or reverse lookup if "mode=reverse" is specified). Note that the - request is done in the background: to see the answers, your controller will - need to listen for ADDRMAP events; see 4.1.7 below. - - [Added in Tor 0.2.0.3-alpha] - -3.21. PROTOCOLINFO - - The syntax is: - "PROTOCOLINFO" *(SP PIVERSION) CRLF - - The server reply format is: - "250-PROTOCOLINFO" SP PIVERSION CRLF *InfoLine "250 OK" CRLF - - InfoLine = AuthLine / VersionLine / OtherLine - - AuthLine = "250-AUTH" SP "METHODS=" AuthMethod *(",")AuthMethod - *(SP "COOKIEFILE=" AuthCookieFile) CRLF - VersionLine = "250-VERSION" SP "Tor=" TorVersion [SP Arguments] CRLF - - AuthMethod = - "NULL" / ; No authentication is required - "HASHEDPASSWORD" / ; A controller must supply the original password - "COOKIE" / ; A controller must supply the contents of a cookie - - AuthCookieFile = QuotedString - TorVersion = QuotedString - - OtherLine = "250-" Keyword [SP Arguments] CRLF - - PIVERSION: 1*DIGIT - - Tor MAY give its InfoLines in any order; controllers MUST ignore InfoLines - with keywords they do not recognize. Controllers MUST ignore extraneous - data on any InfoLine. 
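As an illustration of the PROTOCOLINFO reply grammar above, here is a minimal Python sketch of how a controller might read the InfoLines. The function names `parse_protocolinfo` and `unquote` are hypothetical, the reply is assumed to have already been split on CRLF, and the QuotedString handling is a simplification that only undoes backslash escapes of the form `\x`.

```python
import re

# Matches a QuotedString, including backslash-escaped characters.
QUOTED = r'"(?:[^"\\]|\\.)*"'


def unquote(s):
    """Drop the surrounding quotes and undo simple backslash escapes.

    Simplification (an assumption of this sketch): every "\\x" becomes "x".
    """
    return re.sub(r'\\(.)', r'\1', s[1:-1])


def parse_protocolinfo(lines):
    """Parse the lines of a PROTOCOLINFO reply into a dict."""
    info = {"piversion": None, "methods": [], "cookiefile": None, "tor": None}
    for line in lines:
        if line == "250 OK":
            break
        if not line.startswith("250-"):
            raise ValueError("unexpected reply line: %r" % line)
        keyword, _, rest = line[4:].partition(" ")
        if keyword == "PROTOCOLINFO":
            info["piversion"] = int(rest.split()[0])
        elif keyword == "AUTH":
            m = re.search(r'METHODS=(\S+)', rest)
            if m:
                info["methods"] = m.group(1).split(",")
            m = re.search(r'COOKIEFILE=(' + QUOTED + ')', rest)
            if m:
                info["cookiefile"] = unquote(m.group(1))
        elif keyword == "VERSION":
            m = re.search(r'Tor=(' + QUOTED + ')', rest)
            if m:
                info["tor"] = unquote(m.group(1))
        # InfoLines with unrecognized keywords are ignored on purpose.
    return info


if __name__ == "__main__":
    reply = [
        '250-PROTOCOLINFO 1',
        '250-AUTH METHODS=COOKIE,HASHEDPASSWORD COOKIEFILE="/tmp/cookie"',
        '250-VERSION Tor="0.2.1.19"',
        '250 OK',
    ]
    print(parse_protocolinfo(reply))
```

Because the lines are dispatched by keyword, the order of the InfoLines does not matter, and extra data after the recognized fields is skipped.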
- - PIVERSION is there in case we drastically change the syntax one day. For - now it should always be "1". Controllers MAY provide a list of the - protocolinfo versions they support; Tor MAY select a version that the - controller does not support. - - AuthMethod is used to specify one or more control authentication - methods that Tor currently accepts. - - AuthCookieFile specifies the absolute path and filename of the - authentication cookie that Tor is expecting and is provided iff - the METHODS field contains the method "COOKIE". Controllers MUST handle - escape sequences inside this string. - - The VERSION line contains the Tor version. - - [Unlike other commands besides AUTHENTICATE, PROTOCOLINFO may be used (but - only once!) before AUTHENTICATE.] - - [PROTOCOLINFO was not supported before Tor 0.2.0.5-alpha.] - -4. Replies - - Reply codes follow the same 3-character format as used by SMTP, with the - first character defining a status, the second character defining a - subsystem, and the third designating fine-grained information. - - The TC protocol currently uses the following first characters: - - 2yz Positive Completion Reply - The command was successful; a new request can be started. - - 4yz Temporary Negative Completion reply - The command was unsuccessful but might be reattempted later. - - 5yz Permanent Negative Completion Reply - The command was unsuccessful; the client should not try exactly - that sequence of commands again. - - 6yz Asynchronous Reply - Sent out-of-order in response to an earlier SETEVENTS command. - - The following second characters are used: - - x0z Syntax - Sent in response to ill-formed or nonsensical commands. - - x1z Protocol - Refers to operations of the Tor Control protocol. - - x5z Tor - Refers to actual operations of Tor system. - - The following codes are defined: - - 250 OK - 251 Operation was unnecessary - [Tor has declined to perform the operation, but no harm was done.] 
- - 451 Resource exhausted - - 500 Syntax error: protocol - - 510 Unrecognized command - 511 Unimplemented command - 512 Syntax error in command argument - 513 Unrecognized command argument - 514 Authentication required - 515 Bad authentication - - 550 Unspecified Tor error - - 551 Internal error - [Something went wrong inside Tor, so that the client's - request couldn't be fulfilled.] - - 552 Unrecognized entity - [A configuration key, a stream ID, circuit ID, event, - mentioned in the command did not actually exist.] - - 553 Invalid configuration value - [The client tried to set a configuration option to an - incorrect, ill-formed, or impossible value.] - - 554 Invalid descriptor - - 555 Unmanaged entity - - 650 Asynchronous event notification - - Unless specified to have specific contents, the human-readable messages - in error replies should not be relied upon to match those in this document. - -4.1. Asynchronous events - - These replies can be sent after a corresponding SETEVENTS command has been - received. They will not be interleaved with other Reply elements, but they - can appear between a command and its corresponding reply. For example, - this sequence is possible: - - C: SETEVENTS CIRC - S: 250 OK - C: GETCONF SOCKSPORT ORPORT - S: 650 CIRC 1000 EXTENDED moria1,moria2 - S: 250-SOCKSPORT=9050 - S: 250 ORPORT=0 - - But this sequence is disallowed: - C: SETEVENTS CIRC - S: 250 OK - C: GETCONF SOCKSPORT ORPORT - S: 250-SOCKSPORT=9050 - S: 650 CIRC 1000 EXTENDED moria1,moria2 - S: 250 ORPORT=0 - - Clients MUST tolerate more arguments in an asynchronous reply than - expected, and MUST tolerate more lines in an asynchronous reply than - expected. 
For instance, a client that expects a CIRC message like: - 650 CIRC 1000 EXTENDED moria1,moria2 - must tolerate: - 650-CIRC 1000 EXTENDED moria1,moria2 0xBEEF - 650-EXTRAMAGIC=99 - 650 ANONYMITY=high - - If clients ask for extended events, then each event line as specified below - will be followed by additional extensions. Additional lines will be of the - form - "650" ("-"/" ") KEYWORD ["=" ARGUMENTS] CRLF - Additional arguments will be of the form - SP KEYWORD ["=" ( QuotedString / * NonSpDquote ) ] - Such clients MUST tolerate lines with keywords they do not recognize. - -4.1.1. Circuit status changed - - The syntax is: - - "650" SP "CIRC" SP CircuitID SP CircStatus [SP Path] - [SP "REASON=" Reason [SP "REMOTE_REASON=" Reason]] CRLF - - CircStatus = - "LAUNCHED" / ; circuit ID assigned to new circuit - "BUILT" / ; all hops finished, can now accept streams - "EXTENDED" / ; one more hop has been completed - "FAILED" / ; circuit closed (was not built) - "CLOSED" ; circuit closed (was built) - - Path = LongName *("," LongName) - ; In Tor versions 0.1.2.2-alpha through 0.2.2.1-alpha with feature - ; VERBOSE_NAMES turned off and before version 0.1.2.2-alpha, Path - ; is as follows: - Path = ServerID *("," ServerID) - - Reason = "NONE" / "TORPROTOCOL" / "INTERNAL" / "REQUESTED" / - "HIBERNATING" / "RESOURCELIMIT" / "CONNECTFAILED" / - "OR_IDENTITY" / "OR_CONN_CLOSED" / "TIMEOUT" / - "FINISHED" / "DESTROYED" / "NOPATH" / "NOSUCHSERVICE" / - "MEASUREMENT_EXPIRED" - - The path is provided only when the circuit has been extended at least one - hop. - - The "REASON" field is provided only for FAILED and CLOSED events, and only - if extended events are enabled (see 3.19). Clients MUST accept reasons - not listed above. Reasons are as given in tor-spec.txt, except for: - - NOPATH (Not enough nodes to make circuit) - - The "REMOTE_REASON" field is provided only when we receive a DESTROY or - TRUNCATE cell, and only if extended events are enabled. 
It contains the - actual reason given by the remote OR for closing the circuit. Clients MUST - accept reasons not listed above. Reasons are as listed in tor-spec.txt. - -4.1.2. Stream status changed - - The syntax is: - - "650" SP "STREAM" SP StreamID SP StreamStatus SP CircID SP Target - [SP "REASON=" Reason [ SP "REMOTE_REASON=" Reason ]] - [SP "SOURCE=" Source] [ SP "SOURCE_ADDR=" Address ":" Port ] - [SP "PURPOSE=" Purpose] - CRLF - - StreamStatus = - "NEW" / ; New request to connect - "NEWRESOLVE" / ; New request to resolve an address - "REMAP" / ; Address re-mapped to another - "SENTCONNECT" / ; Sent a connect cell along a circuit - "SENTRESOLVE" / ; Sent a resolve cell along a circuit - "SUCCEEDED" / ; Received a reply; stream established - "FAILED" / ; Stream failed and not retriable - "CLOSED" / ; Stream closed - "DETACHED" ; Detached from circuit; still retriable - - Target = Address ":" Port - - The circuit ID designates which circuit this stream is attached to. If - the stream is unattached, the circuit ID "0" is given. - - Reason = "MISC" / "RESOLVEFAILED" / "CONNECTREFUSED" / - "EXITPOLICY" / "DESTROY" / "DONE" / "TIMEOUT" / - "NOROUTE" / "HIBERNATING" / "INTERNAL"/ "RESOURCELIMIT" / - "CONNRESET" / "TORPROTOCOL" / "NOTDIRECTORY" / "END" - - The "REASON" field is provided only for FAILED, CLOSED, and DETACHED - events, and only if extended events are enabled (see 3.19). Clients MUST - accept reasons not listed above. Reasons are as given in tor-spec.txt, - except for: - - END (We received a RELAY_END cell from the other side of this - stream.) - [XXXX document more. -NM] - - The "REMOTE_REASON" field is provided only when we receive a RELAY_END - cell, and only if extended events are enabled. It contains the actual - reason given by the remote OR for closing the stream. Clients MUST accept - reasons not listed above. Reasons are as listed in tor-spec.txt. 
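The STREAM event syntax above can be read with a short tolerant parser. The following Python sketch is only an illustration: `parse_stream_event` is a hypothetical name, the event is assumed to arrive as a single line without its CRLF, and any trailing KEYWORD=value arguments are collected without checking them against the lists in this section, since clients must accept unlisted values.

```python
def parse_stream_event(line):
    """Split a one-line "650 STREAM ..." event into its fields."""
    parts = line.split(" ")
    if len(parts) < 6 or parts[0] != "650" or parts[1] != "STREAM":
        raise ValueError("not a STREAM event: %r" % line)

    # Target = Address ":" Port; split on the last colon only.
    host, _, port = parts[5].rpartition(":")

    event = {
        "stream_id": parts[2],
        "status": parts[3],           # unknown statuses are passed through
        "circ_id": parts[4],
        "attached": parts[4] != "0",  # circuit ID "0" means unattached
        "target": (host, int(port)),
        "args": {},
    }

    # Optional arguments such as REASON=, REMOTE_REASON=, SOURCE=, PURPOSE=.
    for token in parts[6:]:
        key, sep, value = token.partition("=")
        if sep:
            event["args"][key] = value
    return event


if __name__ == "__main__":
    print(parse_stream_event(
        "650 STREAM 42 FAILED 0 www.example.com:80 REASON=TIMEOUT"))
```

Keeping the optional arguments in a plain dictionary means that new keywords or new reason values do not break the parser.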
- - "REMAP" events include a Source if extended events are enabled: - Source = "CACHE" / "EXIT" - Clients MUST accept sources not listed above. "CACHE" is given if - the Tor client decided to remap the address because of a cached - answer, and "EXIT" is given if the remote node we queried gave us - the new address as a response. - - The "SOURCE_ADDR" field is included with NEW and NEWRESOLVE events if - extended events are enabled. It indicates the address and port - that requested the connection, and can be (e.g.) used to look up the - requesting program. - - Purpose = "DIR_FETCH" / "UPLOAD_DESC" / "DNS_REQUEST" / - "USER" / "DIRPORT_TEST" - - The "PURPOSE" field is provided only for NEW and NEWRESOLVE events, and - only if extended events are enabled (see 3.19). Clients MUST accept - purposes not listed above. - -4.1.3. OR Connection status changed - - The syntax is: - - "650" SP "ORCONN" SP (LongName / Target) SP ORStatus [ SP "REASON=" - Reason ] [ SP "NCIRCS=" NumCircuits ] CRLF - - ORStatus = "NEW" / "LAUNCHED" / "CONNECTED" / "FAILED" / "CLOSED" - - ; In Tor versions 0.1.2.2-alpha through 0.2.2.1-alpha with feature - ; VERBOSE_NAMES turned off and before version 0.1.2.2-alpha, OR - ; Connection is as follows: - "650" SP "ORCONN" SP (ServerID / Target) SP ORStatus [ SP "REASON=" - Reason ] [ SP "NCIRCS=" NumCircuits ] CRLF - - NEW is for incoming connections, and LAUNCHED is for outgoing - connections. CONNECTED means the TLS handshake has finished (in - either direction). FAILED means a connection is being closed that - hasn't finished its handshake, and CLOSED is for connections that - have handshaked. - - A LongName or ServerID is specified unless it's a NEW connection, in - which case we don't know what server it is yet, so we use Address:Port. - - If extended events are enabled (see 3.19), optional reason and - circuit counting information is provided for CLOSED and FAILED - events. 
- - Reason = "MISC" / "DONE" / "CONNECTREFUSED" / - "IDENTITY" / "CONNECTRESET" / "TIMEOUT" / "NOROUTE" / - "IOERROR" / "RESOURCELIMIT" - - NumCircuits counts both established and pending circuits. - -4.1.4. Bandwidth used in the last second - - The syntax is: - "650" SP "BW" SP BytesRead SP BytesWritten *(SP Type "=" Num) CRLF - BytesRead = 1*DIGIT - BytesWritten = 1*DIGIT - Type = "DIR" / "OR" / "EXIT" / "APP" / ... - Num = 1*DIGIT - - BytesRead and BytesWritten are the totals. [In a future Tor version, - we may also include a breakdown of the connection types that used - bandwidth this second (not implemented yet).] - -4.1.5. Log messages - - The syntax is: - "650" SP Severity SP ReplyText CRLF - or - "650+" Severity CRLF Data 650 SP "OK" CRLF - - Severity = "DEBUG" / "INFO" / "NOTICE" / "WARN"/ "ERR" - -4.1.6. New descriptors available - - Syntax: - "650" SP "NEWDESC" 1*(SP LongName) CRLF - ; In Tor versions 0.1.2.2-alpha through 0.2.2.1-alpha with feature - ; VERBOSE_NAMES turned off and before version 0.1.2.2-alpha, it - ; is as follows: - "650" SP "NEWDESC" 1*(SP ServerID) CRLF - -4.1.7. New Address mapping - - Syntax: - "650" SP "ADDRMAP" SP Address SP NewAddress SP Expiry - [SP Error] SP GMTExpiry CRLF - - NewAddress = Address / "<error>" - Expiry = DQUOTE ISOTime DQUOTE / "NEVER" - - Error = "error=" ErrorCode - ErrorCode = XXXX - GMTExpiry = "EXPIRES=" DQUOTE IsoTime DQUOTE - - Error and GMTExpiry are only provided if extended events are enabled. - - Expiry is expressed as the local time (rather than GMT). This is a bug, - left in for backward compatibility; new code should look at GMTExpiry - instead. - - These events are generated when a new address mapping is entered in the - cache, or when the answer for a RESOLVE command is found. - -4.1.8. Descriptors uploaded to us in our role as authoritative dirserver - - Syntax: - "650" "+" "AUTHDIR_NEWDESCS" CRLF Action CRLF Message CRLF - Descriptor CRLF "." 
CRLF "650" SP "OK" CRLF - Action = "ACCEPTED" / "DROPPED" / "REJECTED" - Message = Text - -4.1.9. Our descriptor changed - - Syntax: - "650" SP "DESCCHANGED" CRLF - - [First added in 0.1.2.2-alpha.] - -4.1.10. Status events - - Status events (STATUS_GENERAL, STATUS_CLIENT, and STATUS_SERVER) are sent - based on occurrences in the Tor process pertaining to the general state of - the program. Generally, they correspond to log messages of severity Notice - or higher. They differ from log messages in that their format is a - specified interface. - - Syntax: - "650" SP StatusType SP StatusSeverity SP StatusAction - [SP StatusArguments] CRLF - - StatusType = "STATUS_GENERAL" / "STATUS_CLIENT" / "STATUS_SERVER" - StatusSeverity = "NOTICE" / "WARN" / "ERR" - StatusAction = 1*ALPHA - StatusArguments = StatusArgument *(SP StatusArgument) - StatusArgument = StatusKeyword '=' StatusValue - StatusKeyword = 1*(ALNUM / "_") - StatusValue = 1*(ALNUM / '_') / QuotedString - - Action is a string, and Arguments is a series of keyword=value - pairs on the same line. Values may be space-terminated strings, - or quoted strings. - - These events are always produced with EXTENDED_EVENTS and - VERBOSE_NAMES; see the explanations in the USEFEATURE section - for details. - - Controllers MUST tolerate unrecognized actions, MUST tolerate - unrecognized arguments, MUST tolerate missing arguments, and MUST - tolerate arguments that arrive in any order. - - Each event description below is accompanied by a recommendation for - controllers. These recommendations are suggestions only; no controller - is required to implement them. - - Compatibility note: versions of Tor before 0.2.0.22-rc incorrectly - generated "STATUS_SERVER" as "STATUS_SEVER". To be compatible with those - versions, tools should accept both. 
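The status event syntax and the compatibility note above could be handled along these lines. This Python sketch is an assumption-laden illustration, not a reference parser: `parse_status_event` is a hypothetical name, the event is assumed to be a single line without CRLF, and quoted values are unescaped in a simplified way that maps every `\x` to `x`.

```python
import re

# One keyword=value argument; the value is a QuotedString or a bare token.
_ARG = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S*)')
_EVENT = re.compile(r'650 (STATUS_\w+) (\w+) (\w+)(?: (.*))?$')


def parse_status_event(line):
    """Return (type, severity, action, args) for a status event line."""
    m = _EVENT.match(line)
    if not m:
        raise ValueError("not a status event: %r" % line)
    status_type, severity, action, rest = m.groups()

    # Tor before 0.2.0.22-rc misspelled STATUS_SERVER; accept both forms.
    if status_type == "STATUS_SEVER":
        status_type = "STATUS_SERVER"

    args = {}
    for key, value in _ARG.findall(rest or ""):
        if value.startswith('"'):
            value = re.sub(r'\\(.)', r'\1', value[1:-1])
        args[key] = value
    return status_type, severity, action, args


if __name__ == "__main__":
    print(parse_status_event(
        '650 STATUS_GENERAL WARN DANGEROUS_VERSION CURRENT=0.1.2.3 '
        'REASON=OBSOLETE RECOMMENDED="0.2.0.1, 0.2.0.2"'))
```

Returning the arguments as a dictionary lets a caller cope with missing, unrecognized, or reordered arguments by looking up only the keys it needs.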
- - Actions for STATUS_GENERAL events can be as follows: - - CLOCK_JUMPED - "TIME=NUM" - Tor spent enough time without CPU cycles that it has closed all - its circuits and will establish them anew. This typically - happens when a laptop goes to sleep and then wakes up again. It - also happens when the system is swapping so heavily that Tor is - starving. The "time" argument specifies the number of seconds Tor - thinks it was unconscious for (or alternatively, the number of - seconds it went back in time). - - This status event is sent as NOTICE severity normally, but WARN - severity if Tor is acting as a server currently. - - {Recommendation for controller: ignore it, since we don't really - know what the user should do anyway. Hm.} - - DANGEROUS_VERSION - "CURRENT=version" - "REASON=NEW/OBSOLETE/UNRECOMMENDED" - "RECOMMENDED=\"version, version, ...\"" - Tor has found that directory servers don't recommend its version of - the Tor software. RECOMMENDED is a comma-and-space-separated string - of Tor versions that are recommended. REASON is NEW if this version - of Tor is newer than any recommended version, OBSOLETE if - this version of Tor is older than any recommended version, and - UNRECOMMENDED if some recommended versions of Tor are newer and - some are older than this version. (The "OBSOLETE" reason was called - "OLD" from Tor 0.1.2.3-alpha up to and including 0.2.0.12-alpha.) - - {Controllers may want to suggest that the user upgrade OLD or - UNRECOMMENDED versions. NEW versions may be known-insecure, or may - simply be development versions.} - - TOO_MANY_CONNECTIONS - "CURRENT=NUM" - Tor has reached its ulimit -n or whatever the native limit is on file - descriptors or sockets. CURRENT is the number of sockets Tor - currently has open. The user should really do something about - this. The "current" argument shows the number of connections currently - open. - - {Controllers may recommend that the user increase the limit, or - increase it for them. 
Recommendations should be phrased in an - OS-appropriate way and automated when possible.} - - BUG - "REASON=STRING" - Tor has encountered a situation that its developers never expected, - and the developers would like to learn that it happened. Perhaps - the controller can explain this to the user and encourage her to - file a bug report? - - {Controllers should log bugs, but shouldn't annoy the user in case a - bug appears frequently.} - - CLOCK_SKEW - SKEW="+" / "-" SECONDS - MIN_SKEW="+" / "-" SECONDS. - SOURCE="DIRSERV:" IP ":" Port / - "NETWORKSTATUS:" IP ":" Port / - "OR:" IP ":" Port / - "CONSENSUS" - If "SKEW" is present, it's an estimate of how far we are from the - time declared in the source. (In other words, if we're an hour in - the past, the value is -3600.) If "MIN_SKEW" is present, it's a lower - bound. If the source is a DIRSERV, we got the current time from a - connection to a dirserver. If the source is a NETWORKSTATUS, we - decided we're skewed because we got a v2 networkstatus from far in - the future. If the source is OR, the skew comes from a NETINFO - cell from a connection to another relay. If the source is - CONSENSUS, we decided we're skewed because we got a networkstatus - consensus from the future. - - {Tor should send this message to controllers when it thinks the - skew is so high that it will interfere with proper Tor operation. - Controllers shouldn't blindly adjust the clock, since the more - accurate source of skew info (DIRSERV) is currently - unauthenticated.} - - BAD_LIBEVENT - "METHOD=" libevent method - "VERSION=" libevent version - "BADNESS=" "BROKEN" / "BUGGY" / "SLOW" - "RECOVERED=" "NO" / "YES" - Tor knows about bugs in using the configured event method in this - version of libevent. "BROKEN" libevents won't work at all; - "BUGGY" libevents might work okay; "SLOW" libevents will work - fine, but not quickly. If "RECOVERED" is YES, Tor managed to - switch to a more reliable (but probably slower!) libevent method. 
- - {Controllers may want to warn the user if this event occurs, though - generally it's the fault of whoever built the Tor binary and there's - not much the user can do besides upgrade libevent or upgrade the - binary.} - - DIR_ALL_UNREACHABLE - Tor believes that none of the known directory servers are - reachable -- this is most likely because the local network is - down or otherwise not working, and might help to explain for the - user why Tor appears to be broken. - - {Controllers may want to warn the user if this event occurs; further - action is generally not possible.} - - CONSENSUS_ARRIVED - Tor has received and validated a new consensus networkstatus. - (This event can be delayed a little while after the consensus - is received, if Tor needs to fetch certificates.) - - Actions for STATUS_CLIENT events can be as follows: - - BOOTSTRAP - "PROGRESS=" num - "TAG=" Keyword - "SUMMARY=" String - ["WARNING=" String - "REASON=" Keyword - "COUNT=" num - "RECOMMENDATION=" Keyword - ] - - Tor has made some progress at establishing a connection to the - Tor network, fetching directory information, or making its first - circuit; or it has encountered a problem while bootstrapping. This - status event is especially useful for users with slow connections - or with connectivity problems. - - "Progress" gives a number between 0 and 100 for how far through - the bootstrapping process we are. "Summary" is a string that can - be displayed to the user to describe the *next* task that Tor - will tackle, i.e., the task it is working on after sending the - status event. "Tag" is a string that controllers can use to - recognize bootstrap phases, if they want to do something smarter - than just blindly displaying the summary string; see Section 5 - for the current tags that Tor issues. - - The StatusSeverity describes whether this is a normal bootstrap - phase (severity notice) or an indication of a bootstrapping - problem (severity warn). 
- - For bootstrap problems, we include the same progress, tag, and - summary values as we would for a normal bootstrap event, but we - also include "warning", "reason", "count", and "recommendation" - key/value combos. The "count" number tells how many bootstrap - problems there have been so far at this phase. The "reason" - string lists one of the reasons allowed in the ORCONN event. The - "warning" argument is a string with any hints Tor has to offer about - why it's having trouble bootstrapping. - - The "reason" values are long-term-stable controller-facing tags to - identify particular issues in a bootstrapping step. The warning - strings, on the other hand, are human-readable. Controllers - SHOULD NOT rely on the format of any warning string. Currently - the possible values for "recommendation" are either "ignore" or - "warn" -- if ignore, the controller can accumulate the string in - a pile of problems to show the user if the user asks; if warn, - the controller should alert the user that Tor is pretty sure - there's a bootstrapping problem. - - Currently Tor uses recommendation=ignore for the first - nine bootstrap problem reports for a given phase, and then - uses recommendation=warn for subsequent problems at that - phase. Hopefully this is a good balance between tolerating - occasional errors and reporting serious problems quickly. - - ENOUGH_DIR_INFO - Tor now knows enough network-status documents and enough server - descriptors that it's going to start trying to build circuits now. - - {Controllers may want to use this event to decide when to indicate - progress to their users, but should not interrupt the user's browsing - to tell them so.} - - NOT_ENOUGH_DIR_INFO - We discarded expired statuses and router descriptors to fall - below the desired threshold of directory information. We won't - try to build any circuits until ENOUGH_DIR_INFO occurs again. 
- - {Controllers may want to use this event to decide when to indicate - progress to their users, but should not interrupt the user's browsing - to tell them so.} - - CIRCUIT_ESTABLISHED - Tor is able to establish circuits for client use. This event will - only be sent if we just built a circuit that changed our mind -- - that is, prior to this event we didn't know whether we could - establish circuits. - - {Suggested use: controllers can notify their users that Tor is - ready for use as a client once they see this status event. [Perhaps - controllers should also have a timeout if too much time passes and - this event hasn't arrived, to give tips on how to troubleshoot. - On the other hand, hopefully Tor will send further status events - if it can identify the problem.]} - - CIRCUIT_NOT_ESTABLISHED - "REASON=" "EXTERNAL_ADDRESS" / "DIR_ALL_UNREACHABLE" / "CLOCK_JUMPED" - We are no longer confident that we can build circuits. The "reason" - keyword provides an explanation: which other status event type caused - our lack of confidence. - - {Controllers may want to use this event to decide when to indicate - progress to their users, but should not interrupt the user's browsing - to do so.} - [Note: only REASON=CLOCK_JUMPED is implemented currently.] - - DANGEROUS_PORT - "PORT=" port - "RESULT=" "REJECT" / "WARN" - A stream was initiated to a port that's commonly used for - vulnerable-plaintext protocols. If the Result is "reject", we - refused the connection; whereas if it's "warn", we allowed it. - - {Controllers should warn their users when this occurs, unless they - happen to know that the application using Tor is in fact doing so - correctly (e.g., because it is part of a distributed bundle). 
They - might also want some sort of interface to let the user configure - their RejectPlaintextPorts and WarnPlaintextPorts config options.} - - DANGEROUS_SOCKS - "PROTOCOL=" "SOCKS4" / "SOCKS5" - "ADDRESS=" IP:port - A connection was made to Tor's SOCKS port using one of the SOCKS - approaches that doesn't support hostnames -- only raw IP addresses. - If the client application got this address from gethostbyname(), - it may be leaking target addresses via DNS. - - {Controllers should warn their users when this occurs, unless they - happen to know that the application using Tor is in fact doing so - correctly (e.g., because it is part of a distributed bundle).} - - SOCKS_UNKNOWN_PROTOCOL - "DATA=string" - A connection was made to Tor's SOCKS port that tried to use it - for something other than the SOCKS protocol. Perhaps the user is - using Tor as an HTTP proxy? The DATA is the first few characters - sent to Tor on the SOCKS port. - - {Controllers may want to warn their users when this occurs: it - indicates a misconfigured application.} - - SOCKS_BAD_HOSTNAME - "HOSTNAME=QuotedString" - Some application gave us a funny-looking hostname. Perhaps - it is broken? In any case it won't work with Tor and the user - should know. - - {Controllers may want to warn their users when this occurs: it - usually indicates a misconfigured application.} - - Actions for STATUS_SERVER can be as follows: - - EXTERNAL_ADDRESS - "ADDRESS=IP" - "HOSTNAME=NAME" - "METHOD=CONFIGURED/DIRSERV/RESOLVED/INTERFACE/GETHOSTNAME" - Our best idea for our externally visible IP has changed to 'IP'. - If 'HOSTNAME' is present, we got the new IP by resolving 'NAME'. If the - method is 'CONFIGURED', the IP was given verbatim as a configuration - option. If the method is 'RESOLVED', we resolved the Address - configuration option to get the IP. If the method is 'GETHOSTNAME', - we resolved our hostname to get the IP. 
If the method is 'INTERFACE', - we got the address of one of our network interfaces to get the IP. If - the method is 'DIRSERV', a directory server told us a guess for what - our IP might be. - - {Controllers may want to record this info and display it to the user.} - - CHECKING_REACHABILITY - "ORADDRESS=IP:port" - "DIRADDRESS=IP:port" - We're going to start testing the reachability of our external OR port - or directory port. - - {This event could affect the controller's idea of server status, but - the controller should not interrupt the user to tell them so.} - - REACHABILITY_SUCCEEDED - "ORADDRESS=IP:port" - "DIRADDRESS=IP:port" - We successfully verified the reachability of our external OR port or - directory port (depending on which of ORADDRESS or DIRADDRESS is - given.) - - {This event could affect the controller's idea of server status, but - the controller should not interrupt the user to tell them so.} - - GOOD_SERVER_DESCRIPTOR - We successfully uploaded our server descriptor to at least one - of the directory authorities, with no complaints. - - {Originally, the goal of this event was to declare "every authority - has accepted the descriptor, so there will be no complaints - about it." But since some authorities might be offline, it's - harder to get certainty than we had thought. As such, this event - is equivalent to ACCEPTED_SERVER_DESCRIPTOR below. Controllers - should just look at ACCEPTED_SERVER_DESCRIPTOR and should ignore - this event for now.} - - SERVER_DESCRIPTOR_STATUS - "STATUS=" "LISTED" / "UNLISTED" - We just got a new networkstatus consensus, and whether we're in - it or not in it has changed. Specifically, status is "listed" - if we're listed in it but previous to this point we didn't know - we were listed in a consensus; and status is "unlisted" if we - thought we should have been listed in it (e.g. we were listed in - the last one), but we're not. - - {Moving from listed to unlisted is not necessarily cause for - alarm. 
The relay might have failed a few reachability tests, - or the Internet might have had some routing problems. So this - feature is mainly to let relay operators know when their relay - has successfully been listed in the consensus.} - - [Not implemented yet. We should do this in 0.2.2.x. -RD] - - NAMESERVER_STATUS - "NS=addr" - "STATUS=" "UP" / "DOWN" - "ERR=" message - One of our nameservers has changed status. - - {This event could affect the controller's idea of server status, but - the controller should not interrupt the user to tell them so.} - - NAMESERVER_ALL_DOWN - All of our nameservers have gone down. - - {This is a problem; if it happens often without the nameservers - coming up again, the user needs to configure more or better - nameservers.} - - DNS_HIJACKED - Our DNS provider is providing an address when it should be saying - "NOTFOUND"; Tor will treat the address as a synonym for "NOTFOUND". - - {This is an annoyance; controllers may want to tell admins that their - DNS provider is not to be trusted.} - - DNS_USELESS - Our DNS provider is giving a hijacked address instead of well-known - websites; Tor will not try to be an exit node. - - {Controllers could warn the admin if the server is running as an - exit server: the admin needs to configure a good DNS server. - Alternatively, this happens a lot in some restrictive environments - (hotels, universities, coffeeshops) when the user hasn't registered.} - - BAD_SERVER_DESCRIPTOR - "DIRAUTH=addr:port" - "REASON=string" - A directory authority rejected our descriptor. Possible reasons - include malformed descriptors, incorrect keys, highly skewed clocks, - and so on. - - {Controllers should warn the admin, and try to cope if they can.} - - ACCEPTED_SERVER_DESCRIPTOR - "DIRAUTH=addr:port" - A single directory authority accepted our descriptor. 
- // actually notice - - {This event could affect the controller's idea of server status, but - the controller should not interrupt the user to tell them so.} - - REACHABILITY_FAILED - "ORADDRESS=IP:port" - "DIRADDRESS=IP:port" - We failed to connect to our external OR port or directory port - successfully. - - {This event could affect the controller's idea of server status. The - controller should warn the admin and suggest reasonable steps to take.} - -4.1.11. Our set of guard nodes has changed - - Syntax: - "650" SP "GUARD" SP Type SP Name SP Status ... CRLF - Type = "ENTRY" - Name = The (possibly verbose) nickname of the guard affected. - Status = "NEW" | "UP" | "DOWN" | "BAD" | "GOOD" | "DROPPED" - - [explain states. XXX] - -4.1.12. Network status has changed - - Syntax: - "650" "+" "NS" CRLF 1*NetworkStatus "." CRLF "650" SP "OK" CRLF - - The event is used whenever our local view of a relay status changes. - This happens when we get a new v3 consensus (in which case the entries - we see are a duplicate of what we see in the NEWCONSENSUS event, - below), but it also happens when we decide to mark a relay as up or - down in our local status, for example based on connection attempts. - - [First added in 0.1.2.3-alpha] - -4.1.13. Bandwidth used on an application stream - - The syntax is: - "650" SP "STREAM_BW" SP StreamID SP BytesWritten SP BytesRead CRLF - BytesWritten = 1*DIGIT - BytesRead = 1*DIGIT - - BytesWritten and BytesRead are the number of bytes written and read - by the application since the last STREAM_BW event on this stream. - - Note that from Tor's perspective, *reading* a byte on a stream means - that the application *wrote* the byte. That's why the order of "written" - vs "read" is opposite for stream_bw events compared to bw events. - - These events are generated about once per second per stream; no events - are generated for streams that have not written or read. 
These events - apply only to streams entering Tor (such as on a SOCKSPort, TransPort, - or so on). They are not generated for exiting streams. - -4.1.14. Per-country client stats - - The syntax is: - "650" SP "CLIENTS_SEEN" SP TimeStarted SP CountrySummary CRLF - - We just generated a new summary of which countries we've seen clients - from recently. The controller could display this for the user, e.g. - in their "relay" configuration window, to give them a sense that they - are actually being useful. - - Currently only bridge relays will receive this event, but once we figure - out how to sufficiently aggregate and sanitize the client counts on - main relays, we might start sending these events in other cases too. - - TimeStarted is a quoted string indicating when the reported summary - counts from (in GMT). - - The CountrySummary keyword has as its argument a comma-separated, - possibly empty set of "countrycode=count" pairs. For example (without - linebreak), - 650-CLIENTS_SEEN TimeStarted="2008-12-25 23:50:43" - CountrySummary=us=16,de=8,uk=8 - -4.1.15. New consensus networkstatus has arrived. - - The syntax is: - "650" "+" "NEWCONSENSUS" CRLF 1*NetworkStatus "." CRLF "650" SP - "OK" CRLF - - A new consensus networkstatus has arrived. We include NS-style lines for - every relay in the consensus. NEWCONSENSUS is a separate event from the - NS event, because the list here represents every usable relay: so any - relay *not* mentioned in this list is implicitly no longer recommended. - - [First added in 0.2.1.13-alpha] - -4.1.16. New circuit buildtime has been set. 
- - The syntax is: - "650" SP "BUILDTIMEOUT_SET" SP Type SP "TOTAL_TIMES=" Total SP - "TIMEOUT_MS=" Timeout SP "XM=" Xm SP "ALPHA=" Alpha SP - "CUTOFF_QUANTILE=" Quantile SP "TIMEOUT_RATE=" TimeoutRate SP - "CLOSE_MS=" CloseTimeout SP "CLOSE_RATE=" CloseRate - CRLF - Type = "COMPUTED" / "RESET" / "SUSPENDED" / "DISCARD" / "RESUME" - Total = Integer count of timeouts stored - Timeout = Integer timeout in milliseconds - Xm = Estimated integer Pareto parameter Xm in milliseconds - Alpha = Estimated floating point Pareto parameter alpha - Quantile = Floating point CDF quantile cutoff point for this timeout - TimeoutRate = Floating point ratio of circuits that time out - CloseTimeout = How long to keep measurement circs in milliseconds - CloseRate = Floating point ratio of measurement circuits that are closed - - A new circuit build timeout time has been set. If Type is "COMPUTED", - Tor has computed the value based on historical data. If Type is "RESET", - initialization or drastic network changes have caused Tor to reset - the timeout back to the default, to relearn again. If Type is - "SUSPENDED", Tor has detected a loss of network connectivity and has - temporarily changed the timeout value to the default until the network - recovers. If Type is "DISCARD", Tor has decided to discard timeout - values that likely happened while the network was down. If Type is - "RESUME", Tor has decided to resume timeout calculation. - - The Total value is the count of circuit build times Tor used in - computing this value. It is capped internally at the maximum number - of build times Tor stores (NCIRCUITS_TO_OBSERVE). - - The Timeout itself is provided in milliseconds. Internally, Tor rounds - this value to the nearest second before using it. - - [First added in 0.2.2.7-alpha] - -5. Implementation notes - -5.1. Authentication - - If the control port is open and no authentication operation is enabled, Tor - trusts any local user that connects to the control port. 
This is generally - a poor idea. - - If the 'CookieAuthentication' option is true, Tor writes a "magic cookie" - file named "control_auth_cookie" into its data directory. To authenticate, - the controller must send the contents of this file, encoded in hexadecimal. - - If the 'HashedControlPassword' option is set, it must contain the salted - hash of a secret password. The salted hash is computed according to the - S2K algorithm in RFC 2440 (OpenPGP), and prefixed with the s2k specifier. - This is then encoded in hexadecimal, prefixed by the indicator sequence - "16:". Thus, for example, the password 'foo' could encode to: - 16:660537E3E1CD49996044A3BF558097A981F539FEA2F9DA662B4626C1C2 - ++++++++++++++++**^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - salt hashed value - indicator - You can generate the salt of a password by calling - 'tor --hash-password <password>' - or by using the example code in the Python and Java controller libraries. - To authenticate under this scheme, the controller sends Tor the original - secret that was used to generate the password, either as a quoted string - or encoded in hexadecimal. - -5.2. Don't let the buffer get too big. - - If you ask for lots of events, and 16MB of them queue up on the buffer, - the Tor process will close the socket. - -5.3. Backward compatibility with v0 control protocol. - - The 'version 0' control protocol was replaced in Tor 0.1.1.x. Support - was removed in Tor 0.2.0.x. Every non-obsolete version of Tor now - supports the version 1 control protocol. - - For backward compatibility with the "version 0" control protocol, - Tor used to check whether the third octet of the first command is zero. - (If it was, Tor assumed that version 0 is in use.) - - This compatibility was removed in Tor 0.1.2.16 and 0.2.0.4-alpha. - -5.4. Tor config options for use by controllers - - Tor provides a few special configuration options for use by controllers. 
- These options can be set and examined by the SETCONF and GETCONF commands, - but are not saved to disk by SAVECONF. - - Generally, these options make Tor unusable by disabling a portion of Tor's - normal operations. Unless a controller provides replacement functionality - to fill this gap, Tor will not correctly handle user requests. - - __AllDirOptionsPrivate - - If true, Tor will try to launch all directory operations through - anonymous connections. (Ordinarily, Tor only tries to anonymize - requests related to hidden services.) This option will slow down - directory access, and may stop Tor from working entirely if it does not - yet have enough directory information to build circuits. - - (Boolean. Default: "0".) - - __DisablePredictedCircuits - - If true, Tor will not launch preemptive "general-purpose" circuits for - streams to attach to. (It will still launch circuits for testing and - for hidden services.) - - (Boolean. Default: "0".) - - __LeaveStreamsUnattached - - If true, Tor will not automatically attach new streams to circuits; - instead, the controller must attach them with ATTACHSTREAM. If the - controller does not attach the streams, their data will never be routed. - - (Boolean. Default: "0".) - - __HashedControlSessionPassword - - As HashedControlPassword, but is not saved to the torrc file by - SAVECONF. Added in Tor 0.2.0.20-rc. - - __ReloadTorrcOnSIGHUP - - If this option is true (the default), we reload the torrc from disk - every time we get a SIGHUP (from the controller or via a signal). - Otherwise, we don't. This option exists so that controllers can keep - their options from getting overwritten when a user sends Tor a HUP for - some other reason (for example, to rotate the logs). - - (Boolean. Default: "1") - -5.5. Phases from the Bootstrap status event. - - This section describes the various bootstrap phases currently reported - by Tor. 
Controllers should not assume that the percentages and tags - listed here will continue to match up, or even that the tags will stay - in the same order. Some phases might also be skipped (not reported) - if the associated bootstrap step is already complete, or if the phase - no longer is necessary. Only "starting" and "done" are guaranteed to - exist in all future versions. - - Current Tor versions enter these phases in order, monotonically. - Future Tors MAY revisit earlier stages. - - Phase 0: - tag=starting summary="Starting" - - Tor starts out in this phase. - - Phase 5: - tag=conn_dir summary="Connecting to directory mirror" - - Tor sends this event as soon as Tor has chosen a directory mirror -- - e.g. one of the authorities if bootstrapping for the first time or - after a long downtime, or one of the relays listed in its cached - directory information otherwise. - - Tor will stay at this phase until it has successfully established - a TCP connection with some directory mirror. Problems in this phase - generally happen because Tor doesn't have a network connection, or - because the local firewall is dropping SYN packets. - - Phase 10: - tag=handshake_dir summary="Finishing handshake with directory mirror" - - This event occurs when Tor establishes a TCP connection with a relay used - as a directory mirror (or its https proxy if it's using one). Tor remains - in this phase until the TLS handshake with the relay is finished. - - Problems in this phase generally happen because Tor's firewall is - doing more sophisticated MITM attacks on it, or doing packet-level - keyword recognition of Tor's handshake. - - Phase 15: - tag=onehop_create summary="Establishing one-hop circuit for dir info" - - Once TLS is finished with a relay, Tor will send a CREATE_FAST cell - to establish a one-hop circuit for retrieving directory information. - It will remain in this phase until it receives the CREATED_FAST cell - back, indicating that the circuit is ready. 
- - Phase 20: - tag=requesting_status summary="Asking for networkstatus consensus" - - Once we've finished our one-hop circuit, we will start a new stream - for fetching the networkstatus consensus. We'll stay in this phase - until we get the 'connected' relay cell back, indicating that we've - established a directory connection. - - Phase 25: - tag=loading_status summary="Loading networkstatus consensus" - - Once we've established a directory connection, we will start fetching - the networkstatus consensus document. This could take a while; this - phase is a good opportunity for using the "progress" keyword to indicate - partial progress. - - This phase could stall if the directory mirror we picked doesn't - have a copy of the networkstatus consensus so we have to ask another, - or it does give us a copy but we don't find it valid. - - Phase 40: - tag=loading_keys summary="Loading authority key certs" - - Sometimes when we've finished loading the networkstatus consensus, - we find that we don't have all the authority key certificates for the - keys that signed the consensus. At that point we put the consensus we - fetched on hold and fetch the keys so we can verify the signatures. - - Phase 45 - tag=requesting_descriptors summary="Asking for relay descriptors" - - Once we have a valid networkstatus consensus and we've checked all - its signatures, we start asking for relay descriptors. We stay in this - phase until we have received a 'connected' relay cell in response to - a request for descriptors. - - Phase 50: - tag=loading_descriptors summary="Loading relay descriptors" - - We will ask for relay descriptors from several different locations, - so this step will probably make up the bulk of the bootstrapping, - especially for users with slow connections. We stay in this phase until - we have descriptors for at least 1/4 of the usable relays listed in - the networkstatus consensus. 
This phase is also a good opportunity to - use the "progress" keyword to indicate partial steps. - - Phase 80: - tag=conn_or summary="Connecting to entry guard" - - Once we have a valid consensus and enough relay descriptors, we choose - some entry guards and start trying to build some circuits. This step - is similar to the "conn_dir" phase above; the only difference is - the context. - - If a Tor starts with enough recent cached directory information, - its first bootstrap status event will be for the conn_or phase. - - Phase 85: - tag=handshake_or summary="Finishing handshake with entry guard" - - This phase is similar to the "handshake_dir" phase, but it gets reached - if we finish a TCP connection to a Tor relay and we have already reached - the "conn_or" phase. We'll stay in this phase until we complete a TLS - handshake with a Tor relay. - - Phase 90: - tag=circuit_create summary="Establishing circuits" - - Once we've finished our TLS handshake with an entry guard, we will - set about trying to make some 3-hop circuits in case we need them soon. - - Phase 100: - tag=done summary="Done" - - A full 3-hop exit circuit has been established. Tor is ready to handle - application connections now. - diff --git a/doc/spec/dir-spec-v1.txt b/doc/spec/dir-spec-v1.txt deleted file mode 100644 index a92fc7999a..0000000000 --- a/doc/spec/dir-spec-v1.txt +++ /dev/null @@ -1,314 +0,0 @@ - - Tor Protocol Specification - - Roger Dingledine - Nick Mathewson - -0. Preliminaries - - THIS SPECIFICATION IS OBSOLETE. - - This document specifies the Tor directory protocol as used in version - 0.1.0.x and earlier. See dir-spec.txt for a current version. - -1. Basic operation - - There is a small number of directory authorities, and a larger number of - caches. Client and servers know public keys for the directory authorities. - Tor servers periodically upload self-signed "router descriptors" to the - directory authorities. 
Each authority publishes a self-signed "directory" - (containing all the router descriptors it knows, and a statement on which - are running) and a self-signed "running routers" document containing only - the statement on which routers are running. - - All Tors periodically download these documents, downloading the directory - less frequently than they do the "running routers" document. Clients - preferentially download from caches rather than authorities. - -1.1. Document format - - Router descriptors, directories, and running-routers documents all obey the - following lightweight extensible information format. - - The highest level object is a Document, which consists of one or more - Items. Every Item begins with a KeywordLine, followed by one or more - Objects. A KeywordLine begins with a Keyword, optionally followed by - whitespace and more non-newline characters, and ends with a newline. A - Keyword is a sequence of one or more characters in the set [A-Za-z0-9-]. - An Object is a block of encoded data in pseudo-Open-PGP-style - armor. (cf. RFC 2440) - - More formally: - - Document ::= (Item | NL)+ - Item ::= KeywordLine Object* - KeywordLine ::= Keyword NL | Keyword WS ArgumentsChar+ NL - Keyword = KeywordChar+ - KeywordChar ::= 'A' ... 'Z' | 'a' ... 'z' | '0' ... '9' | '-' - ArgumentChar ::= any printing ASCII character except NL. - WS = (SP | TAB)+ - Object ::= BeginLine Base-64-encoded-data EndLine - BeginLine ::= "-----BEGIN " Keyword "-----" NL - EndLine ::= "-----END " Keyword "-----" NL - - The BeginLine and EndLine of an Object must use the same keyword. - - When interpreting a Document, software MUST reject any document containing a - KeywordLine that starts with a keyword it doesn't recognize. - - The "opt" keyword is reserved for non-critical future extensions. All - implementations MUST ignore any item of the form "opt keyword ....." when - they would not recognize "keyword ....."; and MUST treat "opt keyword ....." 
- as synonymous with "keyword ......" when keyword is recognized. - -2. Router descriptor format. - - Every router descriptor MUST start with a "router" Item; MUST end with a - "router-signature" Item and an extra NL; and MUST contain exactly one - instance of each of the following Items: "published" "onion-key" "link-key" - "signing-key" "bandwidth". Additionally, a router descriptor MAY contain - any number of "accept", "reject", "fingerprint", "uptime", and "opt" Items. - Other than "router" and "router-signature", the items may appear in any - order. - - The items' formats are as follows: - "router" nickname address ORPort SocksPort DirPort - - Indicates the beginning of a router descriptor. "address" - must be an IPv4 address in dotted-quad format. The last - three numbers indicate the TCP ports at which this OR exposes - functionality. ORPort is a port at which this OR accepts TLS - connections for the main OR protocol; SocksPort is deprecated and - should always be 0; and DirPort is the port at which this OR accepts - directory-related HTTP connections. If any port is not supported, - the value 0 is given instead of a port number. - - "bandwidth" bandwidth-avg bandwidth-burst bandwidth-observed - - Estimated bandwidth for this router, in bytes per second. The - "average" bandwidth is the volume per second that the OR is willing - to sustain over long periods; the "burst" bandwidth is the volume - that the OR is willing to sustain in very short intervals. The - "observed" value is an estimate of the capacity this server can - handle. The server remembers the max bandwidth sustained output - over any ten second period in the past day, and another sustained - input. The "observed" value is the lesser of these two numbers. - - "platform" string - - A human-readable string describing the system on which this OR is - running. This MAY include the operating system, and SHOULD include - the name and version of the software implementing the Tor protocol. 
- - "published" YYYY-MM-DD HH:MM:SS - - The time, in GMT, when this descriptor was generated. - - "fingerprint" - - A fingerprint (a HASH_LEN-byte of asn1 encoded public key, encoded - in hex, with a single space after every 4 characters) for this router's - identity key. A descriptor is considered invalid (and MUST be - rejected) if the fingerprint line does not match the public key. - - [We didn't start parsing this line until Tor 0.1.0.6-rc; it should - be marked with "opt" until earlier versions of Tor are obsolete.] - - "hibernating" 0|1 - - If the value is 1, then the Tor server was hibernating when the - descriptor was published, and shouldn't be used to build circuits. - - [We didn't start parsing this line until Tor 0.1.0.6-rc; it should - be marked with "opt" until earlier versions of Tor are obsolete.] - - "uptime" - - The number of seconds that this OR process has been running. - - "onion-key" NL a public key in PEM format - - This key is used to encrypt EXTEND cells for this OR. The key MUST - be accepted for at least XXXX hours after any new key is published in - a subsequent descriptor. - - "signing-key" NL a public key in PEM format - - The OR's long-term identity key. - - "accept" exitpattern - "reject" exitpattern - - These lines, in order, describe the rules that an OR follows when - deciding whether to allow a new stream to a given address. The - 'exitpattern' syntax is described below. - - "router-signature" NL Signature NL - - The "SIGNATURE" object contains a signature of the PKCS1-padded - hash of the entire router descriptor, taken from the beginning of the - "router" line, through the newline after the "router-signature" line. - The router descriptor is invalid unless the signature is performed - with the router's identity key. - - "contact" info NL - - Describes a way to contact the server's administrator, preferably - including an email address and a PGP key fingerprint. 
- - "family" names NL - - 'Names' is a whitespace-separated list of server nicknames. If two ORs - list one another in their "family" entries, then OPs should treat them - as a single OR for the purpose of path selection. - - For example, if node A's descriptor contains "family B", and node B's - descriptor contains "family A", then node A and node B should never - be used on the same circuit. - - "read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL - "write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL - - Declare how much bandwidth the OR has used recently. Usage is divided - into intervals of NSEC seconds. The YYYY-MM-DD HH:MM:SS field defines - the end of the most recent interval. The numbers are the number of - bytes used in the most recent intervals, ordered from oldest to newest. - - [We didn't start parsing these lines until Tor 0.1.0.6-rc; they should - be marked with "opt" until earlier versions of Tor are obsolete.] - -2.1. Nonterminals in routerdescriptors - - nickname ::= between 1 and 19 alphanumeric characters, case-insensitive. - - exitpattern ::= addrspec ":" portspec - portspec ::= "*" | port | port "-" port - port ::= an integer between 1 and 65535, inclusive. - addrspec ::= "*" | ip4spec | ip6spec - ipv4spec ::= ip4 | ip4 "/" num_ip4_bits | ip4 "/" ip4mask - ip4 ::= an IPv4 address in dotted-quad format - ip4mask ::= an IPv4 mask in dotted-quad format - num_ip4_bits ::= an integer between 0 and 32 - ip6spec ::= ip6 | ip6 "/" num_ip6_bits - ip6 ::= an IPv6 address, surrounded by square brackets. - num_ip6_bits ::= an integer between 0 and 128 - - Ports are required; if they are not included in the router - line, they must appear in the "ports" lines. - -3. Directory format - - A Directory begins with a "signed-directory" item, followed by one each of - the following, in any order: "recommended-software", "published", - "router-status", "dir-signing-key". It may include any number of "opt" - items. 
After these items, a directory includes any number of router - descriptors, and a single "directory-signature" item. - - "signed-directory" - - Indicates the start of a directory. - - "published" YYYY-MM-DD HH:MM:SS - - The time at which this directory was generated and signed, in GMT. - - "dir-signing-key" - - The key used to sign this directory; see "signing-key" for format. - - "recommended-software" comma-separated-version-list - - A list of which versions of which implementations are currently - believed to be secure and compatible with the network. - - "running-routers" whitespace-separated-list - - A description of which routers are currently believed to be up or - down. Every entry consists of an optional "!", followed by either an - OR's nickname, or "$" followed by a hexadecimal encoding of the hash - of an OR's identity key. If the "!" is included, the router is - believed not to be running; otherwise, it is believed to be running. - If a router's nickname is given, exactly one router of that nickname - will appear in the directory, and that router is "approved" by the - directory server. If a hashed identity key is given, that OR is not - "approved". [XXXX The 'running-routers' line is only provided for - backward compatibility. New code should parse 'router-status' - instead.] - - "router-status" whitespace-separated-list - - A description of which routers are currently believed to be up or - down, and which are verified or unverified. Contains one entry for - every router that the directory server knows. Each entry is of the - format: - - !name=$digest [Verified router, currently not live.] - name=$digest [Verified router, currently live.] - !$digest [Unverified router, currently not live.] - or $digest [Unverified router, currently live.] - - (where 'name' is the router's nickname and 'digest' is a hexadecimal - encoding of the hash of the routers' identity key). 
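The four "router-status" entry forms listed above can be told apart with a few string checks. The following is a minimal Python sketch for illustration only: the names RouterStatusEntry, parse_router_status_entry and parse_router_status are hypothetical and do not come from Tor, and the sketch assumes the digest is plain hexadecimal as the text describes.

```python
import string
from typing import List, NamedTuple, Optional


class RouterStatusEntry(NamedTuple):
    name: Optional[str]  # nickname, or None when only a digest was given
    digest: str          # hex digest of the identity key, without the "$"
    verified: bool       # True only for the "name=$digest" forms
    live: bool           # False when the entry is prefixed with "!"


def parse_router_status_entry(entry: str) -> RouterStatusEntry:
    """Classify one whitespace-separated entry of a router-status line."""
    live = not entry.startswith("!")
    body = entry if live else entry[1:]
    if body.startswith("$"):
        # "$digest" or "!$digest": unverified router.
        name, digest = None, body[1:]
    else:
        # "name=$digest" or "!name=$digest": verified router.
        name, sep, digest = body.partition("=$")
        if not sep or not name:
            raise ValueError("malformed router-status entry: %r" % entry)
    if not digest or any(c not in string.hexdigits for c in digest):
        raise ValueError("digest is not hexadecimal: %r" % entry)
    return RouterStatusEntry(name, digest, name is not None, live)


def parse_router_status(arguments: str) -> List[RouterStatusEntry]:
    """Parse the whole argument string of a router-status line."""
    return [parse_router_status_entry(e) for e in arguments.split()]
```

A caller would pass everything after the "router-status" keyword to parse_router_status. The sketch only classifies entries; checking that a nickname and digest match a known router is left to the caller.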
- - When parsing this line, clients should only mark a router as - 'verified' if its nickname AND digest match the one provided. - - "directory-signature" nickname-of-dirserver NL Signature - - The signature is computed by computing the digest of the - directory, from the characters "signed-directory", through the newline - after "directory-signature". This digest is then padded with PKCS.1, - and signed with the directory server's signing key. - - If software encounters an unrecognized keyword in a single router descriptor, - it MUST reject only that router descriptor, and continue using the - others. Because this mechanism is used to add 'critical' extensions to - future versions of the router descriptor format, implementation should treat - it as a normal occurrence and not, for example, report it to the user as an - error. [Versions of Tor prior to 0.1.1 did this.] - - If software encounters an unrecognized keyword in the directory header, - it SHOULD reject the entire directory. - -4. Network-status descriptor - - A "network-status" (a.k.a "running-routers") document is a truncated - directory that contains only the current status of a list of nodes, not - their actual descriptors. It contains exactly one of each of the following - entries. - - "network-status" - - Must appear first. - - "published" YYYY-MM-DD HH:MM:SS - - (see section 3 above) - - "router-status" list - - (see section 3 above) - - "directory-signature" NL signature - - (see section 3 above) - -5. Behavior of a directory server - - lists nodes that are connected currently - speaks HTTP on a socket, spits out directory on request - - Directory servers listen on a certain port (the DirPort), and speak a - limited version of HTTP 1.0. Clients send either GET or POST commands. - The basic interactions are: - "%s %s HTTP/1.0\r\nContent-Length: %lu\r\nHost: %s\r\n\r\n", - command, url, content-length, host. - Get "/tor/" to fetch a full directory. 
- Get "/tor/dir.z" to fetch a compressed full directory. - Get "/tor/running-routers" to fetch a network-status descriptor. - Post "/tor/" to post a server descriptor, with the body of the - request containing the descriptor. - - "host" is used to specify the address:port of the dirserver, so - the request can survive going through HTTP proxies. - diff --git a/doc/spec/dir-spec-v2.txt b/doc/spec/dir-spec-v2.txt deleted file mode 100644 index d1be27f3db..0000000000 --- a/doc/spec/dir-spec-v2.txt +++ /dev/null @@ -1,896 +0,0 @@ - - Tor directory protocol, version 2 - -0. Scope and preliminaries - - This directory protocol is used by Tor versions 0.1.1.x and 0.1.2.x. See - dir-spec-v1.txt for information on earlier versions, and dir-spec.txt - for information on later versions. - -0.1. Goals and motivation - - There were several problems with the way Tor handled directory information - in version 0.1.0.x and earlier. Here are the problems we try to fix with - this new design, already implemented in 0.1.1.x: - 1. Directories were very large and used up a lot of bandwidth: clients - downloaded descriptors for all routers several times an hour. - 2. Every directory authority was a trust bottleneck: if a single - directory authority lied, it could make clients believe for a time an - arbitrarily distorted view of the Tor network. - 3. Our current "verified server" system is kind of nonsensical. - - 4. Getting more directory authorities would add more points of failure - and worsen possible partitioning attacks. - - There are two problems that remain unaddressed by this design. - 5. Requiring every client to know about every router won't scale. - 6. Requiring every directory cache to know every router won't scale. - - We attempt to fix 1-4 here, and to build a solution that will work when we - figure out an answer for 5. We haven't thought at all about what to do - about 6. - -1. Outline - - There is a small set (say, around 10) of semi-trusted directory - authorities. 
A default list of authorities is shipped with the Tor - software. Users can change this list, but are encouraged not to do so, in - order to avoid partitioning attacks. - - Routers periodically upload signed "descriptors" to the directory - authorities describing their keys, capabilities, and other information. - Routers may act as directory mirrors (also called "caches"), to reduce - load on the directory authorities. They announce this in their - descriptors. - - Each directory authority periodically generates and signs a compact - "network status" document that lists that authority's view of the current - descriptors and status for known routers, but which does not include the - descriptors themselves. - - Directory mirrors download, cache, and re-serve network-status documents - to clients. - - Clients, directory mirrors, and directory authorities all use - network-status documents to find out when their list of routers is - out-of-date. If it is, they download any missing router descriptors. - Clients download missing descriptors from mirrors; mirrors and authorities - download from authorities. Descriptors are downloaded by the hash of the - descriptor, not by the server's identity key: this prevents servers from - attacking clients by giving them descriptors nobody else uses. - - All directory information is uploaded and downloaded with HTTP. - - Coordination among directory authorities is done client-side: clients - compute a vote-like algorithm among the network-status documents they - have, and base their decisions on the result. - -1.1. What's different from 0.1.0.x? - - Clients used to download a signed concatenated set of router descriptors - (called a "directory") from directory mirrors, regardless of which - descriptors had changed. - - Between downloading directories, clients would download "network-status" - documents that would list which servers were supposed to be running. 
- - Clients would always believe the most recently published network-status - document they were served. - - Routers used to upload fresh descriptors all the time, whether their keys - and other information had changed or not. - -1.2. Document meta-format - - Router descriptors, directories, and running-routers documents all obey the - following lightweight extensible information format. - - The highest level object is a Document, which consists of one or more - Items. Every Item begins with a KeywordLine, followed by one or more - Objects. A KeywordLine begins with a Keyword, optionally followed by - whitespace and more non-newline characters, and ends with a newline. A - Keyword is a sequence of one or more characters in the set [A-Za-z0-9-]. - An Object is a block of encoded data in pseudo-Open-PGP-style - armor. (cf. RFC 2440) - - More formally: - - Document ::= (Item | NL)+ - Item ::= KeywordLine Object* - KeywordLine ::= Keyword NL | Keyword WS ArgumentsChar+ NL - Keyword = KeywordChar+ - KeywordChar ::= 'A' ... 'Z' | 'a' ... 'z' | '0' ... '9' | '-' - ArgumentChar ::= any printing ASCII character except NL. - WS = (SP | TAB)+ - Object ::= BeginLine Base-64-encoded-data EndLine - BeginLine ::= "-----BEGIN " Keyword "-----" NL - EndLine ::= "-----END " Keyword "-----" NL - - The BeginLine and EndLine of an Object must use the same keyword. - - When interpreting a Document, software MUST ignore any KeywordLine that - starts with a keyword it doesn't recognize; future implementations MUST NOT - require current clients to understand any KeywordLine not currently - described. - - The "opt" keyword was used until Tor 0.1.2.5-alpha for non-critical future - extensions. All implementations MUST ignore any item of the form "opt - keyword ....." when they would not recognize "keyword ....."; and MUST - treat "opt keyword ....." as synonymous with "keyword ......" when keyword - is recognized. 
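The keyword handling described above (skip unrecognized keywords, and treat "opt keyword" like "keyword" when the keyword is known) can be sketched in a few lines. This is an illustrative Python sketch, not Tor's parser: the function name parse_items and the known_keywords parameter are hypothetical, and the sketch deliberately skips Object blocks instead of decoding them.

```python
import re
from typing import Iterable, List, Tuple

# Keyword, optionally followed by whitespace and arguments (section 1.2).
KEYWORD_LINE = re.compile(r"^([A-Za-z0-9-]+)(?:[ \t]+(.*))?$")


def parse_items(document: str,
                known_keywords: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (keyword, arguments) pairs for the recognized items only."""
    known = set(known_keywords)
    items = []
    in_object = False
    for line in document.split("\n"):
        # Object armor and its base64 body are skipped in this sketch.
        if line.startswith("-----BEGIN "):
            in_object = True
            continue
        if line.startswith("-----END "):
            in_object = False
            continue
        if in_object or not line:
            continue
        match = KEYWORD_LINE.match(line)
        if match is None:
            raise ValueError("not a KeywordLine: %r" % line)
        keyword, args = match.group(1), match.group(2) or ""
        if keyword == "opt":
            # "opt keyword ..." is synonymous with "keyword ...".
            match = KEYWORD_LINE.match(args)
            if match is None:
                continue
            keyword, args = match.group(1), match.group(2) or ""
        if keyword in known:
            items.append((keyword, args))
        # Unrecognized keywords are ignored rather than rejected.
    return items
```

Passing the set of keywords an implementation understands as known_keywords gives the "ignore what you do not recognize" behavior; an older, stricter parser would raise an error at the point where this sketch silently continues.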
- - Implementations before 0.1.2.5-alpha rejected any document with a - KeywordLine that started with a keyword that they didn't recognize. - Implementations MUST prefix items not recognized by older versions of Tor - with an "opt" until those versions of Tor are obsolete. - - Other implementations that want to extend Tor's directory format MAY - introduce their own items. The keywords for extension items SHOULD start - with the characters "x-" or "X-", to guarantee that they will not conflict - with keywords used by future versions of Tor. - -2. Router operation - - ORs SHOULD generate a new router descriptor whenever any of the - following events have occurred: - - - A period of time (18 hrs by default) has passed since the last - time a descriptor was generated. - - - A descriptor field other than bandwidth or uptime has changed. - - - Bandwidth has changed by at least a factor of 2 from the last time a - descriptor was generated, and at least a given interval of time - (20 mins by default) has passed since then. - - - Its uptime has been reset (by restarting). - - After generating a descriptor, ORs upload it to every directory - authority they know, by posting it to the URL - - http://<hostname:port>/tor/ - -2.1. Router descriptor format - - Every router descriptor MUST start with a "router" Item; MUST end with a - "router-signature" Item and an extra NL; and MUST contain exactly one - instance of each of the following Items: "published" "onion-key" - "signing-key" "bandwidth". - - A router descriptor MAY have zero or one of each of the following Items, - but MUST NOT have more than one: "contact", "uptime", "fingerprint", - "hibernating", "read-history", "write-history", "eventdns", "platform", - "family". - - Additionally, a router descriptor MAY contain any number of "accept", - "reject", and "opt" Items. Other than "router" and "router-signature", - the items may appear in any order. 
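The cardinality rules for router descriptor Items can be checked with a short routine. The sketch below assumes the descriptor has already been split into an ordered list of Item keywords with any "opt" prefix resolved; the function name and message wording are hypothetical.

```python
from collections import Counter

# Items that MUST appear exactly once.
EXACTLY_ONE = {"published", "onion-key", "signing-key", "bandwidth"}
# Items that MAY appear, but at most once.
AT_MOST_ONE = {"contact", "uptime", "fingerprint", "hibernating",
               "read-history", "write-history", "eventdns",
               "platform", "family"}


def check_item_counts(keywords):
    """Return a list of problems found in one descriptor's Item keywords."""
    problems = []
    if not keywords or keywords[0] != "router":
        problems.append("descriptor must start with 'router'")
    if not keywords or keywords[-1] != "router-signature":
        problems.append("descriptor must end with 'router-signature'")
    counts = Counter(keywords)
    for kw in sorted(EXACTLY_ONE):
        if counts[kw] != 1:
            problems.append(f"'{kw}' must appear exactly once")
    for kw in sorted(AT_MOST_ONE):
        if counts[kw] > 1:
            problems.append(f"'{kw}' must not appear more than once")
    return problems
```

"accept" and "reject" Items are deliberately left unconstrained, since any number of them is allowed.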
- - The items' formats are as follows: - "router" nickname address ORPort SocksPort DirPort - - Indicates the beginning of a router descriptor. "address" must be an - IPv4 address in dotted-quad format. The last three numbers indicate - the TCP ports at which this OR exposes functionality. ORPort is a port - at which this OR accepts TLS connections for the main OR protocol; - SocksPort is deprecated and should always be 0; and DirPort is the - port at which this OR accepts directory-related HTTP connections. If - any port is not supported, the value 0 is given instead of a port - number. - - "bandwidth" bandwidth-avg bandwidth-burst bandwidth-observed - - Estimated bandwidth for this router, in bytes per second. The - "average" bandwidth is the volume per second that the OR is willing to - sustain over long periods; the "burst" bandwidth is the volume that - the OR is willing to sustain in very short intervals. The "observed" - value is an estimate of the capacity this server can handle. The - server remembers the maximum output bandwidth it sustained over any - ten-second period in the past day, and likewise the maximum sustained - input bandwidth. The "observed" value is the lesser of these two numbers. - - "platform" string - - A human-readable string describing the system on which this OR is - running. This MAY include the operating system, and SHOULD include - the name and version of the software implementing the Tor protocol. - - "published" YYYY-MM-DD HH:MM:SS - - The time, in GMT, when this descriptor was generated. - - "fingerprint" - - A fingerprint (a HASH_LEN-byte hash of the ASN.1-encoded public key, - encoded in hex, with a single space after every 4 characters) for this - router's identity key. A descriptor is considered invalid (and MUST be - rejected) if the fingerprint line does not match the public key. - - [We didn't start parsing this line until Tor 0.1.0.6-rc; it should - be marked with "opt" until earlier versions of Tor are obsolete.] 
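The fingerprint layout described above (hex digest, grouped four characters at a time) can be illustrated as follows. This sketch assumes the hash is SHA-1, so that HASH_LEN is 20 bytes, and assumes upper-case hex output; it takes the encoded key as opaque bytes and does not perform the ASN.1 encoding itself. Both function names are hypothetical.

```python
import hashlib


def format_fingerprint(encoded_key: bytes) -> str:
    """Hex-encode the SHA-1 digest of the key, in groups of 4 characters."""
    digest = hashlib.sha1(encoded_key).hexdigest().upper()
    return " ".join(digest[i:i + 4] for i in range(0, len(digest), 4))


def fingerprint_matches(line_args: str, encoded_key: bytes) -> bool:
    """Check a "fingerprint" line's arguments against the key's digest."""
    claimed = line_args.replace(" ", "").upper()
    return claimed == hashlib.sha1(encoded_key).hexdigest().upper()
```

A descriptor whose fingerprint line fails `fingerprint_matches` would be rejected under the rule above.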
- - "hibernating" 0|1 - - If the value is 1, then the Tor server was hibernating when the - descriptor was published, and shouldn't be used to build circuits. - - [We didn't start parsing this line until Tor 0.1.0.6-rc; it should be - marked with "opt" until earlier versions of Tor are obsolete.] - - "uptime" - - The number of seconds that this OR process has been running. - - "onion-key" NL a public key in PEM format - - This key is used to encrypt EXTEND cells for this OR. The key MUST be - accepted for at least 1 week after any new key is published in a - subsequent descriptor. - - "signing-key" NL a public key in PEM format - - The OR's long-term identity key. - - "accept" exitpattern - "reject" exitpattern - - These lines describe the rules that an OR follows when - deciding whether to allow a new stream to a given address. The - 'exitpattern' syntax is described below. The rules are considered in - order; if no rule matches, the address will be accepted. For clarity, - the last such entry SHOULD be accept *:* or reject *:*. - - "router-signature" NL Signature NL - - The "SIGNATURE" object contains a signature of the PKCS1-padded - hash of the entire router descriptor, taken from the beginning of the - "router" line, through the newline after the "router-signature" line. - The router descriptor is invalid unless the signature is performed - with the router's identity key. - - "contact" info NL - - Describes a way to contact the server's administrator, preferably - including an email address and a PGP key fingerprint. - - "family" names NL - - 'Names' is a space-separated list of server nicknames or - hexdigests. If two ORs list one another in their "family" entries, - then OPs should treat them as a single OR for the purpose of path - selection. - - For example, if node A's descriptor contains "family B", and node B's - descriptor contains "family A", then node A and node B should never - be used on the same circuit. 
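The mutual-listing rule for "family" entries can be sketched as below. The sketch compares names as exact strings, which is a simplification: it does not fold case for nicknames or match a nickname against the hexdigest of the same router. The function names and the `families` mapping are hypothetical.

```python
def same_family(a, b, families):
    """True only if a lists b AND b lists a in their "family" entries.

    `families` maps a node name to the set of names on its family line.
    """
    return b in families.get(a, set()) and a in families.get(b, set())


def circuit_allowed(path, families):
    """Reject any path that contains two nodes of the same family."""
    for i, first in enumerate(path):
        for second in path[i + 1:]:
            if same_family(first, second, families):
                return False
    return True
```

A one-sided listing is ignored, so a router cannot exclude another router from circuits merely by naming it.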
- - "read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL - "write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL - - Declare how much bandwidth the OR has used recently. Usage is divided - into intervals of NSEC seconds. The YYYY-MM-DD HH:MM:SS field - defines the end of the most recent interval. The numbers are the - number of bytes used in the most recent intervals, ordered from - oldest to newest. - - [We didn't start parsing these lines until Tor 0.1.0.6-rc; they should - be marked with "opt" until earlier versions of Tor are obsolete.] - - "eventdns" bool NL - - Declare whether this version of Tor is using the newer enhanced - DNS logic. Versions of Tor without eventdns SHOULD NOT be used for - reverse hostname lookups. - - [All versions of Tor before 0.1.2.2-alpha should be assumed to have - this option set to 0 if it is not present. All Tor versions at - 0.1.2.2-alpha or later should be assumed to have this option set to - 1 if it is not present. Until 0.1.2.1-alpha-dev, this option was - not generated, even when eventdns was in use. Versions of Tor - before 0.1.2.1-alpha-dev did not parse this option, so it should be - marked "opt". With 0.2.0.1-alpha, the old 'dnsworker' logic has - been removed, rendering this option of historical interest only.] - -2.2. Nonterminals in router descriptors - - nickname ::= between 1 and 19 alphanumeric characters, case-insensitive. - hexdigest ::= a '$', followed by 40 hexadecimal characters. - [Represents a server by the digest of its identity key.] - - exitpattern ::= addrspec ":" portspec - portspec ::= "*" | port | port "-" port - port ::= an integer between 1 and 65535, inclusive. - [Some implementations incorrectly generate ports with value 0. - Implementations SHOULD accept this, and SHOULD NOT generate it.] 
- - addrspec ::= "*" | ip4spec | ip6spec - ip4spec ::= ip4 | ip4 "/" num_ip4_bits | ip4 "/" ip4mask - ip4 ::= an IPv4 address in dotted-quad format - ip4mask ::= an IPv4 mask in dotted-quad format - num_ip4_bits ::= an integer between 0 and 32 - ip6spec ::= ip6 | ip6 "/" num_ip6_bits - ip6 ::= an IPv6 address, surrounded by square brackets. - num_ip6_bits ::= an integer between 0 and 128 - - bool ::= "0" | "1" - - Ports are required; if they are not included in the router - line, they must appear in the "ports" lines. - -3. Network status format - - Directory authorities generate, sign, and compress network-status - documents. Directory servers SHOULD generate a fresh network-status - document when the contents of such a document would be different from the - last one generated, and some time (at least one second, possibly longer) - has passed since the last one was generated. - - The network status document contains a preamble, a set of router status - entries, and a signature, in that order. - - We use the same meta-format as used for directories and router descriptors - in "tor-spec.txt". Implementations MAY insert blank lines - for clarity between sections; these blank lines are ignored. - Implementations MUST NOT depend on blank lines in any particular location. - - As used here, "whitespace" is a sequence of 1 or more tab or space - characters. - - The preamble contains: - - "network-status-version" -- A document format version. For this - specification, the version is "2". - "dir-source" -- The authority's hostname, current IP address, and - directory port, all separated by whitespace. - "fingerprint" -- A base16-encoded hash of the signing key's - fingerprint, with no additional spaces added. - "contact" -- An arbitrary string describing how to contact the - directory server's administrator. Administrators should include at - least an email address and a PGP fingerprint. - "dir-signing-key" -- The directory server's public signing key. 
- "client-versions" -- A comma-separated list of recommended client - versions. - "server-versions" -- A comma-separated list of recommended server - versions. - "published" -- The publication time for this network-status object. - "dir-options" -- A set of flags, in any order, separated by whitespace: - "Names" if this directory authority performs name bindings. - "Versions" if this directory authority recommends software versions. - "BadExits" if the directory authority flags nodes that it believes - are performing incorrectly as exit nodes. - "BadDirectories" if the directory authority flags nodes that it - believes are performing incorrectly as directory caches. - - The dir-options entry is optional. The "-versions" entries are required if - the "Versions" flag is present. The other entries are required and must - appear exactly once. The "network-status-version" entry must appear first; - the others may appear in any order. Implementations MUST ignore - additional arguments to the items above, and MUST ignore unrecognized - flags. - - For each router, the router entry contains: (This format is designed for - conciseness.) - - "r" -- followed by the following elements, in order, separated by - whitespace: - - The OR's nickname, - - A hash of its identity key, encoded in base64, with trailing = - signs removed. - - A hash of its most recent descriptor, encoded in base64, with - trailing = signs removed. (The hash is calculated as for - computing the signature of a descriptor.) - - The publication time of its most recent descriptor, in the form - YYYY-MM-DD HH:MM:SS, in GMT. - - An IP address - - An OR port - - A directory port (or "0" for none) - "s" -- A series of whitespace-separated status flags, in any order: - "Authority" if the router is a directory authority. - "BadExit" if the router is believed to be useless as an exit node - (because its ISP censors it, because it is behind a restrictive - proxy, or for some similar reason). 
- "BadDirectory" if the router is believed to be useless as a - directory cache (because its directory port isn't working, - its bandwidth is always throttled, or for some similar - reason). - "Exit" if the router is useful for building general-purpose exit - circuits. - "Fast" if the router is suitable for high-bandwidth circuits. - "Guard" if the router is suitable for use as an entry guard. - "Named" if the router's identity-nickname mapping is canonical, - and this authority binds names. - "Stable" if the router is suitable for long-lived circuits. - "Running" if the router is currently usable. - "Valid" if the router has been 'validated'. - "V2Dir" if the router implements this protocol. - "v" -- The version of the Tor protocol that this server is running. If - the value begins with "Tor" SP, the rest of the string is a Tor - version number, and the protocol is "The Tor protocol as supported - by the given version of Tor." Otherwise, if the value begins with - some other string, Tor has upgraded to a more sophisticated - protocol versioning system, and the protocol is "a version of the - Tor protocol more recent than any we recognize." - - The "r" entry for each router must appear first and is required. The - "s" entry is optional (see Section 3.1 below for how the flags are - decided). Unrecognized flags on the "s" line and extra elements - on the "r" line must be ignored. The "v" line is optional; it was not - supported until 0.1.2.5-alpha, and it must be preceded with an "opt" - until all earlier versions of Tor are obsolete. - - The signature section contains: - - "directory-signature" nickname-of-dirserver NL Signature - - Signature is a signature of this network-status document - (the document up until the signature, including the line - "directory-signature <nick>\n"), using the directory authority's - signing key. - - We compress the network status list with zlib before transmitting it. - -3.1. 
Establishing server status - - (This section describes how directory authorities choose which status - flags to apply to routers, as of Tor 0.1.1.18-rc. Later directory - authorities MAY do things differently, so long as clients keep working - well. Clients MUST NOT depend on the exact behaviors in this section.) - - In the below definitions, a router is considered "active" if it is - running, valid, and not hibernating. - - "Valid" -- a router is 'Valid' if it is running a version of Tor not - known to be broken, and the directory authority has not blacklisted - it as suspicious. - - "Named" -- Directory authority administrators may decide to support name - binding. If they do, then they must maintain a file of - nickname-to-identity-key mappings, and try to keep this file consistent - with other directory authorities. If they don't, they act as clients, and - report bindings made by other directory authorities (name X is bound to - identity Y if at least one binding directory lists it, and no directory - binds X to some other Y'.) A router is called 'Named' if the router - believes the given name should be bound to the given key. - - "Running" -- A router is 'Running' if the authority managed to connect to - it successfully within the last 30 minutes. - - "Stable" -- A router is 'Stable' if it is active, and either its - uptime is at least the median uptime for known active routers, or - its uptime is at least 30 days. Routers are never called stable if - they are running a version of Tor known to drop circuits stupidly. - (0.1.1.10-alpha through 0.1.1.16-rc are stupid this way.) - - "Fast" -- A router is 'Fast' if it is active, and its bandwidth is - in the top 7/8ths for known active routers. - - "Guard" -- A router is a possible 'Guard' if it is 'Stable' and its - bandwidth is above median for known active routers. 
If the total - bandwidth of active non-BadExit Exit servers is less than one third - of the total bandwidth of all active servers, no Exit is listed as - a Guard. - - "Authority" -- A router is called an 'Authority' if the authority - generating the network-status document believes it is an authority. - - "V2Dir" -- A router supports the v2 directory protocol if it has an open - directory port, and it is running a version of the directory protocol that - supports the functionality clients need. (Currently, this is - 0.1.1.9-alpha or later.) - - Directory server administrators may label some servers or IPs as - blacklisted, and elect not to include them in their network-status lists. - - Authorities SHOULD 'disable' any servers in excess of 3 on any single IP. - When there are more than 3 to choose from, authorities should first prefer - authorities to non-authorities, then prefer Running to non-Running, and - then prefer high-bandwidth to low-bandwidth. To 'disable' a server, the - authority *should* advertise it without the Running or Valid flag. - - Thus, the network-status list includes all non-blacklisted, - non-expired, non-superseded descriptors. - -4. Directory server operation - - All directory authorities and directory mirrors ("directory servers") - implement this section, except as noted. - -4.1. Accepting uploads (authorities only) - - When a router posts a signed descriptor to a directory authority, the - authority first checks whether it is well-formed and correctly - self-signed. If it is, the authority next verifies that the nickname - in question is not already assigned to a router with a different - public key. - Finally, the authority MAY check that the router is not blacklisted - because of its key, IP, or another reason. - - If the descriptor passes these tests, and the authority does not already - have a descriptor for a router with this public key, it accepts the - descriptor and remembers it. 
- - If the authority _does_ have a descriptor with the same public key, the - newly uploaded descriptor is remembered if its publication time is more - recent than the most recent old descriptor for that router, and either: - - There are non-cosmetic differences between the old descriptor and the - new one. - - Enough time has passed between the descriptors' publication times. - (Currently, 12 hours.) - - Differences between router descriptors are "non-cosmetic" if they would be - sufficient to force an upload as described in section 2 above. - - Note that the "cosmetic difference" test only applies to uploaded - descriptors, not to descriptors that the authority downloads from other - authorities. - -4.2. Downloading network-status documents (authorities and caches) - - All directory servers (authorities and mirrors) try to keep a fresh - set of network-status documents from every authority. To do so, - every 5 minutes, each authority asks every other authority for its - most recent network-status document. Every 15 minutes, each mirror - picks a random authority and asks it for the most recent network-status - documents for all the authorities the authority knows about (including - the chosen authority itself). - - Directory servers and mirrors remember and serve the most recent - network-status document they have from each authority. Other - network-status documents don't need to be stored. If the most recent - network-status document is over 10 days old, it is discarded anyway. - Mirrors SHOULD store and serve network-status documents from authorities - they don't recognize, but SHOULD NOT use such documents for any other - purpose. Mirrors SHOULD discard network-status documents older than 48 - hours. - -4.3. 
Downloading and storing router descriptors (authorities and caches) - - Periodically (currently, every 10 seconds), directory servers check - whether there are any specific descriptors (as identified by descriptor - hash in a network-status document) that they do not have and that they - are not currently trying to download. - - If so, the directory server launches requests to the authorities for these - descriptors, such that each authority is only asked for descriptors listed - in its most recent network-status. When more than one authority lists the - descriptor, we choose which to ask at random. - - If one of these downloads fails, we do not try to download that descriptor - from the authority that failed to serve it again unless we receive a newer - network-status from that authority that lists the same descriptor. - - Directory servers must potentially cache multiple descriptors for each - router. Servers must not discard any descriptor listed by any current - network-status document from any authority. If there is enough space to - store additional descriptors, servers SHOULD try to hold those which - clients are likely to download the most. (Currently, this is judged - based on the interval for which each descriptor seemed newest.) - - Authorities SHOULD NOT download descriptors for routers that they would - immediately reject for reasons listed in 3.1. - -4.4. HTTP URLs - - "Fingerprints" in these URLs are base-16-encoded SHA1 hashes. 
- - The authoritative network-status published by a host should be available at: - http://<hostname>/tor/status/authority.z - - The network-status published by a host with fingerprint - <F> should be available at: - http://<hostname>/tor/status/fp/<F>.z - - The network-status documents published by hosts with fingerprints - <F1>,<F2>,<F3> should be available at: - http://<hostname>/tor/status/fp/<F1>+<F2>+<F3>.z - - The most recent network-status documents from all known authorities, - concatenated, should be available at: - http://<hostname>/tor/status/all.z - - The most recent descriptor for a server whose identity key has a - fingerprint of <F> should be available at: - http://<hostname>/tor/server/fp/<F>.z - - The most recent descriptors for servers with identity fingerprints - <F1>,<F2>,<F3> should be available at: - http://<hostname>/tor/server/fp/<F1>+<F2>+<F3>.z - - (NOTE: Implementations SHOULD NOT download descriptors by identity key - fingerprint. This allows a corrupted server (in collusion with a cache) to - provide a unique descriptor to a client, and thereby partition that client - from the rest of the network.) - - The server descriptor with (descriptor) digest <D> (in hex) should be - available at: - http://<hostname>/tor/server/d/<D>.z - - The most recent descriptors with digests <D1>,<D2>,<D3> should be - available at: - http://<hostname>/tor/server/d/<D1>+<D2>+<D3>.z - - The most recent descriptor for this server should be at: - http://<hostname>/tor/server/authority.z - [Nothing in the Tor protocol uses this resource yet, but it is useful - for debugging purposes. Also, the official Tor implementations - (starting at 0.1.1.x) use this resource to test whether a server's - own DirPort is reachable.] 
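The multi-key URL forms listed above share one shape: a fixed path, the keys joined with "+", and an optional ".z" suffix for the compressed object. The helper below is a hedged sketch of that pattern; the function name, its parameters, and the example hostname are hypothetical.

```python
# Path prefixes taken from the URL forms listed in this section.
PATHS = {
    "status-by-fingerprint": "/tor/status/fp/",
    "descriptor-by-fingerprint": "/tor/server/fp/",
    "descriptor-by-digest": "/tor/server/d/",
}


def directory_url(hostname, kind, keys, compressed=True):
    """Build a directory URL for one or more hex-encoded keys."""
    if not keys:
        raise ValueError("at least one fingerprint or digest is required")
    suffix = ".z" if compressed else ""
    return "http://" + hostname + PATHS[kind] + "+".join(keys) + suffix
```

For example, `directory_url("dir.example:9030", "descriptor-by-digest", [d1, d2])` yields a single request for two descriptors, which is the form this section prefers over fetching by identity fingerprint.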
- - A concatenated set of the most recent descriptors for all known servers - should be available at: - http://<hostname>/tor/server/all.z - - For debugging, directories SHOULD expose non-compressed objects at URLs like - the above, but without the final ".z". - Clients MUST handle compressed concatenated information in two forms: - - A concatenated list of zlib-compressed objects. - - A zlib-compressed concatenated list of objects. - Directory servers MAY generate either format: the former requires less - CPU, but the latter requires less bandwidth. - - Clients SHOULD use upper case letters (A-F) when base16-encoding - fingerprints. Servers MUST accept both upper and lower case fingerprints - in requests. - -5. Client operation: downloading information - - Every Tor that is not a directory server (that is, those that do - not have a DirPort set) implements this section. - -5.1. Downloading network-status documents - - Each client maintains an ordered list of directory authorities. - Insofar as possible, clients SHOULD all use the same ordered list. - - For each network-status document a client has, it keeps track of its - publication time *and* the time when the client retrieved it. Clients - consider a network-status document "live" if it was published within the - last 24 hours. - - Clients try to have a live network-status document from *every* - authority, and try to periodically get new network-status documents from - each authority in rotation as follows: - - If a client is missing a live network-status document for any - authority, it tries to fetch it from a directory cache. On failure, - the client waits briefly, then tries that network-status document - again from another cache. The client does not build circuits until it - has live network-status documents from more than half the authorities - it trusts, and it has descriptors for more than 1/4 of the routers - that it believes are running. 
- - If the most recently _retrieved_ network-status document is over 30 - minutes old, the client attempts to download a network-status document. - When choosing which documents to download, clients treat their list of - directory authorities as a circular ring, and begin with the authority - appearing immediately after the authority for their most recently - retrieved network-status document. If this attempt fails (either it - fails to download at all, or the one it gets is not as good as the - one it has), the client retries at other caches several times, before - moving on to the next network-status document in sequence. - - Clients discard all network-status documents over 24 hours old. - - If enough mirrors (currently 4) claim not to have a given network status, - we stop trying to download that authority's network-status, until we - download a new network-status that makes us believe that the authority in - question is running. Clients should wait a little longer after each - failure. - - Clients SHOULD try to batch as many network-status requests as possible - into each HTTP GET. - - (Note: clients can and should pick caches based on the network-status - information they have: once they have first fetched network-status info - from an authority, they should not need to go to the authority directly - again.) - -5.2. Downloading and storing router descriptors - - Clients try to have the best descriptor for each router. A descriptor is - "best" if: - * It is the most recently published descriptor listed for that router - by at least two network-status documents. - OR, - * No descriptor for that router is listed by two or more - network-status documents, and it is the most recently published - descriptor listed by any network-status document. - - Periodically (currently every 10 seconds) clients check whether there are - any "downloadable" descriptors. A descriptor is downloadable if: - - It is the "best" descriptor for some router. 
- - The descriptor was published at least 10 minutes in the past. - (This prevents clients from trying to fetch descriptors that the - mirrors have probably not yet retrieved and cached.) - - The client does not currently have it. - - The client is not currently trying to download it. - - The client would not discard it immediately upon receiving it. - - The client thinks it is running and valid (see 6.1 below). - - If at least 16 known routers have downloadable descriptors, or if - enough time (currently 10 minutes) has passed since the last time the - client tried to download descriptors, it launches requests for all - downloadable descriptors, as described in 5.3 below. - - When a descriptor download fails, the client notes it, and does not - consider the descriptor downloadable again until a certain amount of time - has passed. (Currently 0 seconds for the first failure, 60 seconds for the - second, 5 minutes for the third, 10 minutes for the fourth, and 1 day - thereafter.) Periodically (currently once an hour) clients reset the - failure count. - - No descriptors are downloaded until the client has downloaded more than - half of the network-status documents. - - Clients retain the most recent descriptor they have downloaded for each - router so long as it is not too old (currently, 48 hours), OR so long as - it is recommended by at least one networkstatus AND no "better" - descriptor has been downloaded. [Versions of Tor before 0.1.2.3-alpha - would discard descriptors simply for being published too far in the past.] - [The code seems to discard descriptors in all cases after they're 5 - days old. True? -RD] - -5.3. Managing downloads - - When a client has no live network-status documents, it downloads - network-status documents from a randomly chosen authority. In all other - cases, the client downloads from mirrors randomly chosen from among those - believed to be V2 directory servers. 
(This information comes from the - network-status documents; see 6 below.) - - When downloading multiple router descriptors, the client chooses multiple - mirrors so that: - - At least 3 different mirrors are used, except when this would result - in more than one request for under 4 descriptors. - - No more than 128 descriptors are requested from a single mirror. - - Otherwise, as few mirrors as possible are used. - After choosing mirrors, the client divides the descriptors among them - randomly. - - After receiving any response, a client MUST discard any network-status - documents and descriptors that it did not request. - -6. Using directory information - - Everyone besides directory authorities uses the approaches in this section - to decide which servers to use and what their keys are likely to be. - (Directory authorities just believe their own opinions, as in 3.1 above.) - -6.1. Choosing routers for circuits. - - Tor implementations only pay attention to "live" network-status documents. - A network status is "live" if it is the most recently downloaded network - status document for a given directory server, and the server is a - directory server trusted by the client, and the network-status document is - no more than 1 day old. - - For time-sensitive information, Tor implementations focus on "recent" - network-status documents. A network status is "recent" if it is live, and - if it was published in the last 60 minutes. If there are fewer - than 3 such documents, the most recently published 3 are "recent." If - there are fewer than 3 in all, all are "recent." - - Circuits SHOULD NOT be built until the client has enough directory - information: network-statuses (or failed attempts to download - network-statuses) for all authorities, network-statuses for more than - half of the authorities, and descriptors for at least 1/4 of the servers - believed to be running. 
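The definition of "recent" network-status documents in 6.1 has two fallbacks, which a short sketch makes explicit. The function below is illustrative only: its name and parameters are hypothetical, and it assumes the caller passes publication times (in seconds) of documents already known to be live.

```python
def recent_statuses(live_published, now, window=60 * 60, minimum=3):
    """Return publication times of the "recent" statuses, newest first.

    A live status is recent if published within `window` seconds of `now`.
    If fewer than `minimum` qualify, the `minimum` most recently published
    live statuses are recent; if fewer than `minimum` exist, all are.
    """
    newest_first = sorted(live_published, reverse=True)
    within_window = [t for t in newest_first if now - t <= window]
    if len(within_window) >= minimum:
        return within_window
    return newest_first[:minimum]
```

The slice in the last line covers both fallbacks at once, since slicing a shorter list simply returns all of it.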
- - A server is "listed" if it is included by more than half of the live - network status documents. Clients SHOULD NOT use unlisted servers. - - Clients believe the flags "Valid", "Exit", "Fast", "Guard", "Stable", and - "V2Dir" about a given router when they are asserted by more than half of - the live network-status documents. Clients believe the flag "Running" if - it is listed by more than half of the recent network-status documents. - - These flags are used as follows: - - - Clients SHOULD NOT use non-'Valid' or non-'Running' routers unless - requested to do so. - - - Clients SHOULD NOT use non-'Fast' routers for any purpose other than - very-low-bandwidth circuits (such as introduction circuits). - - - Clients SHOULD NOT use non-'Stable' routers for circuits that are - likely to need to be open for a very long time (such as those used for - IRC or SSH connections). - - - Clients SHOULD NOT choose non-'Guard' nodes when picking entry guard - nodes. - - - Clients SHOULD NOT download directory information from non-'V2Dir' - caches. - -6.2. Managing naming - - In order to provide human-memorable names for individual server - identities, some directory servers bind names to IDs. Clients handle - names in two ways: - - When a client encounters a name it has not mapped before: - - If all the live "Naming" network-status documents the client has - claim that the name binds to some identity ID, and the client has at - least three live network-status documents, the client maps the name to - ID. - - When a user tries to refer to a router with a name that does not have a - mapping under the above rules, the implementation SHOULD warn the user. - After giving the warning, the implementation MAY use a router that at - least one Naming authority maps the name to, so long as no other naming - authority maps that name to a different router. If no Naming authority - maps the name to a router, the implementation MAY use any router that - advertises the name. 
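The two naming rules in 6.2 can be sketched as a pair of lookups. This is an assumption-laden illustration: each live Naming network-status is modeled as a dict from nickname to identity, the function names are hypothetical, and the case of a client with no Naming documents at all is treated here as "no mapping", which the text does not spell out.

```python
def map_name(name, naming_bindings, live_count):
    """Return the identity a name maps to, or None if no mapping is made.

    `naming_bindings` holds one nickname-to-identity dict per live Naming
    network-status; `live_count` counts all live network-statuses.
    """
    if live_count < 3 or not naming_bindings:
        return None
    identities = {binding.get(name) for binding in naming_bindings}
    if len(identities) == 1 and None not in identities:
        return identities.pop()
    return None


def fallback_after_warning(name, naming_bindings):
    """After warning the user: usable only if no two authorities disagree."""
    identities = {b[name] for b in naming_bindings if name in b}
    return identities.pop() if len(identities) == 1 else None
```

The first function requires every Naming document to agree; the second only requires that no Naming document contradicts the single binding found.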
- - Not every router needs a nickname. When a router doesn't configure a - nickname, it publishes with the default nickname "Unnamed". Authorities - SHOULD NOT ever mark a router with this nickname as Named; client software - SHOULD NOT ever use a router in response to a user request for a router - called "Unnamed". - -6.3. Software versions - - An implementation of Tor SHOULD warn when it has fetched (or has - attempted to fetch and failed four consecutive times) a network-status - for each authority, and it is running a software version - not listed on more than half of the live "Versioning" network-status - documents. - -6.4. Warning about a router's status. - - If a router tries to publish its descriptor to a Naming authority - that has its nickname mapped to another key, the router SHOULD - warn the operator that it is either using the wrong key or is using - an already claimed nickname. - - If a router has fetched (or attempted to fetch and failed four - consecutive times) a network-status for every authority, and at - least one of the authorities is "Naming", and no live "Naming" - authorities publish a binding for the router's nickname, the - router MAY remind the operator that the chosen nickname is not - bound to this key at the authorities, and suggest contacting the - authority operators. - - ... - -6.5. Router protocol versions - - A client should believe that a router supports a given feature if that - feature is supported by the router or protocol versions in more than half - of the live networkstatus's "v" entries for that router. In other words, - if the "v" entries for some router are: - v Tor 0.0.8pre1 (from authority 1) - v Tor 0.1.2.11 (from authority 2) - v FutureProtocolDescription 99 (from authority 3) - then the client should believe that the router supports any feature - supported by 0.1.2.11. - - This is currently equivalent to believing the median declared version for - a router in all live networkstatuses. - -7. 
Standards compliance - - All clients and servers MUST support HTTP 1.0. - -7.1. HTTP headers - - Servers MAY set the Content-Length: header. Servers SHOULD set - Content-Encoding to "deflate" or "identity". - - Servers MAY include an X-Your-Address-Is: header, whose value is the - apparent IP address of the client connecting to them (as a dotted quad). - For directory connections tunneled over a BEGIN_DIR stream, servers SHOULD - report the IP from which the circuit carrying the BEGIN_DIR stream reached - them. [Servers before version 0.1.2.5-alpha reported 127.0.0.1 for all - BEGIN_DIR-tunneled connections.] - - Servers SHOULD disable caching of multiple network statuses or multiple - router descriptors. Servers MAY enable caching of single descriptors, - single network statuses, the list of all router descriptors, a v1 - directory, or a v1 running routers document. XXX mention times. - -7.2. HTTP status codes - - XXX We should write down what return codes dirservers send in what situations. - diff --git a/doc/spec/dir-spec.txt b/doc/spec/dir-spec.txt deleted file mode 100644 index eebceeafd6..0000000000 --- a/doc/spec/dir-spec.txt +++ /dev/null @@ -1,2421 +0,0 @@ - - Tor directory protocol, version 3 - -0. Scope and preliminaries - - This directory protocol is used by Tor version 0.2.0.x-alpha and later. - See dir-spec-v1.txt for information on the protocol used up to the - 0.1.0.x series, and dir-spec-v2.txt for information on the protocol - used by the 0.1.1.x and 0.1.2.x series. - - Caches and authorities must still support older versions of the - directory protocols, until the versions of Tor that require them are - finally out of commission. - - This document merges and supersedes the following proposals: - - 101 Voting on the Tor Directory System - 103 Splitting identity key from regularly used signing key - 104 Long and Short Router Descriptors - - XXX when to download certificates. 
- XXX timeline - XXX fill in XXXXs - - The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL - NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and - "OPTIONAL" in this document are to be interpreted as described in - RFC 2119. - -0.1. History - - The earliest versions of Onion Routing shipped with a list of known - routers and their keys. When the set of routers changed, users needed to - fetch a new list. - - The Version 1 Directory protocol - -------------------------------- - - Early versions of Tor (0.0.2) introduced "Directory authorities": servers - that served signed "directory" documents containing a list of signed - "router descriptors", along with short summary of the status of each - router. Thus, clients could get up-to-date information on the state of - the network automatically, and be certain that the list they were getting - was attested by a trusted directory authority. - - Later versions (0.0.8) added directory caches, which download - directories from the authorities and serve them to clients. Non-caches - fetch from the caches in preference to fetching from the authorities, thus - distributing bandwidth requirements. - - Also added during the version 1 directory protocol were "router status" - documents: short documents that listed only the up/down status of the - routers on the network, rather than a complete list of all the - descriptors. Clients and caches would fetch these documents far more - frequently than they would fetch full directories. - - The Version 2 Directory Protocol - -------------------------------- - - During the Tor 0.1.1.x series, Tor revised its handling of directory - documents in order to address two major problems: - - * Directories had grown quite large (over 1MB), and most directory - downloads consisted mainly of router descriptors that clients - already had. 
- - * Every directory authority was a trust bottleneck: if a single - directory authority lied, it could make clients believe for a time - an arbitrarily distorted view of the Tor network. (Clients - trusted the most recent signed document they downloaded.) Thus, - adding more authorities would make the system less secure, not - more. - - To address these, we extended the directory protocol so that - authorities now published signed "network status" documents. Each - network status listed, for every router in the network: a hash of its - identity key, a hash of its most recent descriptor, and a summary of - what the authority believed about its status. Clients would download - the authorities' network status documents in turn, and believe - statements about routers iff they were attested to by more than half of - the authorities. - - Instead of downloading all router descriptors at once, clients - downloaded only the descriptors that they did not have. Descriptors - were indexed by their digests, in order to prevent malicious caches - from giving different versions of a router descriptor to different - clients. - - Routers began working harder to upload new descriptors only when their - contents were substantially changed. - - -0.2. Goals of the version 3 protocol - - Version 3 of the Tor directory protocol tries to solve the following - issues: - - * A great deal of bandwidth used to transmit router descriptors was - used by two fields that are not actually used by Tor routers - (namely read-history and write-history). We save about 60% by - moving them into a separate document that most clients do not - fetch or use. - - * It was possible under certain perverse circumstances for clients - to download an unusual set of network status documents, thus - partitioning themselves from clients who have a more recent and/or - typical set of documents. 
Even under the best of circumstances, - clients were sensitive to the ages of the network status documents - they downloaded. Therefore, instead of having the clients - correlate multiple network status documents, we have the - authorities collectively vote on a single consensus network status - document. - - * The most sensitive data in the entire network (the identity keys - of the directory authorities) needed to be stored unencrypted so - that the authorities can sign network-status documents on the fly. - Now, the authorities' identity keys are stored offline, and used - to certify medium-term signing keys that can be rotated. - -0.3. Some Remaining questions - - Things we could solve on a v3 timeframe: - - The SHA-1 hash is showing its age. We should do something about our - dependency on it. We could probably future-proof ourselves here in - this revision, at least so far as documents from the authorities are - concerned. - - Too many things about the authorities are hardcoded by IP. - - Perhaps we should start accepting longer identity keys for routers - too. - - Things to solve eventually: - - Requiring every client to know about every router won't scale forever. - - Requiring every directory cache to know every router won't scale - forever. - - -1. Outline - - There is a small set (say, around 5-10) of semi-trusted directory - authorities. A default list of authorities is shipped with the Tor - software. Users can change this list, but are encouraged not to do so, - in order to avoid partitioning attacks. - - Every authority has a very-secret, long-term "Authority Identity Key". - This is stored encrypted and/or offline, and is used to sign "key - certificate" documents. Every key certificate contains a medium-term - (3-12 months) "authority signing key", that is used by the authority to - sign other directory information. 
(Note that the authority identity - key is distinct from the router identity key that the authority uses - in its role as an ordinary router.) - - Routers periodically upload signed "router descriptors" to the - directory authorities describing their keys, capabilities, and other - information. Routers may also upload signed "extra info documents" - containing information that is not required for the Tor protocol. - Directory authorities serve router descriptors indexed by router - identity, or by hash of the descriptor. - - Routers may act as directory caches to reduce load on the directory - authorities. They announce this in their descriptors. - - Periodically, each directory authority generates a view of - the current descriptors and status for known routers. It sends a - signed summary of this view (a "status vote") to the other - authorities. The authorities compute the result of this vote, and sign - a "consensus status" document containing the result of the vote. - - Directory caches download, cache, and re-serve consensus documents. - - Clients, directory caches, and directory authorities all use consensus - documents to find out when their list of routers is out-of-date. - (Directory authorities also use vote statuses.) If it is, they download - any missing router descriptors. Clients download missing descriptors - from caches; caches and authorities download from authorities. - Descriptors are downloaded by the hash of the descriptor, not by the - server's identity key: this prevents servers from attacking clients by - giving them descriptors nobody else uses. - - All directory information is uploaded and downloaded with HTTP. - - [Authorities also generate and caches also cache documents produced and - used by earlier versions of this protocol; see dir-spec-v1.txt and - dir-spec-v2.txt for notes on those versions.] - -1.1. What's different from version 2? 
- - Clients used to download multiple network status documents, - corresponding roughly to "status votes" above. They would compute the - result of the vote on the client side. - - Authorities used to sign documents using the same private keys they used - for their roles as routers. This forced them to keep these extremely - sensitive keys in memory unencrypted. - - All of the information in extra-info documents used to be kept in the - main descriptors. - -1.2. Document meta-format - - Router descriptors, directories, and running-routers documents all obey the - following lightweight extensible information format. - - The highest level object is a Document, which consists of one or more - Items. Every Item begins with a KeywordLine, followed by zero or more - Objects. A KeywordLine begins with a Keyword, optionally followed by - whitespace and more non-newline characters, and ends with a newline. A - Keyword is a sequence of one or more characters in the set [A-Za-z0-9-]. - An Object is a block of encoded data in pseudo-Open-PGP-style - armor. (cf. RFC 2440) - - More formally: - - NL = The ascii LF character (hex value 0x0a). - Document ::= (Item | NL)+ - Item ::= KeywordLine Object* - KeywordLine ::= Keyword NL | Keyword WS ArgumentChar+ NL - Keyword = KeywordChar+ - KeywordChar ::= 'A' ... 'Z' | 'a' ... 'z' | '0' ... '9' | '-' - ArgumentChar ::= any printing ASCII character except NL. - WS = (SP | TAB)+ - Object ::= BeginLine Base-64-encoded-data EndLine - BeginLine ::= "-----BEGIN " Keyword "-----" NL - EndLine ::= "-----END " Keyword "-----" NL - - The BeginLine and EndLine of an Object must use the same keyword. - - When interpreting a Document, software MUST ignore any KeywordLine that - starts with a keyword it doesn't recognize; future implementations MUST NOT - require current clients to understand any KeywordLine not currently - described. - - The "opt" keyword was used until Tor 0.1.2.5-alpha for non-critical future - extensions. 
All implementations MUST ignore any item of the form "opt - keyword ....." when they would not recognize "keyword ....."; and MUST - treat "opt keyword ....." as synonymous with "keyword ......" when keyword - is recognized. - - Implementations before 0.1.2.5-alpha rejected any document with a - KeywordLine that started with a keyword that they didn't recognize. - When generating documents that need to be read by older versions of Tor, - implementations MUST prefix items not recognized by older versions of - Tor with an "opt" until those versions of Tor are obsolete. [Note that - key certificates, status vote documents, extra info documents, and - status consensus documents will never be read by older versions of Tor.] - - Other implementations that want to extend Tor's directory format MAY - introduce their own items. The keywords for extension items SHOULD start - with the characters "x-" or "X-", to guarantee that they will not conflict - with keywords used by future versions of Tor. - - In our document descriptions below, we tag Items with a multiplicity in - brackets. Possible tags are: - - "At start, exactly once": These items MUST occur in every instance of - the document type, and MUST appear exactly once, and MUST be the - first item in their documents. - - "Exactly once": These items MUST occur exactly one time in every - instance of the document type. - - "At end, exactly once": These items MUST occur in every instance of - the document type, and MUST appear exactly once, and MUST be the - last item in their documents. - - "At most once": These items MAY occur zero or one times in any - instance of the document type, but MUST NOT occur more than once. - - "Any number": These items MAY occur zero, one, or more times in any - instance of the document type. - - "Once or more": These items MUST occur at least once in any instance - of the document type, and MAY occur more. - -1.3. 
Signing documents - - Every signable document below is signed in a similar manner, using a - given "Initial Item", a final "Signature Item", a digest algorithm, and - a signing key. - - The Initial Item must be the first item in the document. - - The Signature Item has the following format: - - <signature item keyword> [arguments] NL SIGNATURE NL - - The "SIGNATURE" Object contains a signature (using the signing key) of - the PKCS1-padded digest of the entire document, taken from the - beginning of the Initial Item, through the newline after the Signature - Item's keyword and its arguments. - - Unless otherwise specified, the digest algorithm is SHA-1. - - All documents are invalid unless signed with the correct signing key. - - The "Digest" of a document, unless stated otherwise, is its digest *as - signed by this signature scheme*. - -1.4. Voting timeline - - Every consensus document has a "valid-after" (VA) time, a "fresh-until" - (FU) time and a "valid-until" (VU) time. VA MUST precede FU, which MUST - in turn precede VU. Times are chosen so that every consensus will be - "fresh" until the next consensus becomes valid, and "valid" for a while - after. At least 3 consensuses should be valid at any given time. - - The timeline for a given consensus is as follows: - - VA-DistSeconds-VoteSeconds: The authorities exchange votes. - - VA-DistSeconds-VoteSeconds/2: The authorities try to download any - votes they don't have. - - VA-DistSeconds: The authorities calculate the consensus and exchange - signatures. - - VA-DistSeconds/2: The authorities try to download any signatures - they don't have. - - VA: All authorities have a multiply signed consensus. - - VA ... FU: Caches download the consensus. (Note that since caches have - no way of telling what VA and FU are until they have downloaded - the consensus, they assume that the present consensus's VA is - equal to the previous one's FU, and that its FU is one interval after - that.) 
- - FU: The consensus is no longer the freshest consensus. - - FU ... (the current consensus's VU): Clients download the consensus. - (See note above: clients guess that the next consensus's FU will be - two intervals after the current VA.) - - VU: The consensus is no longer valid. - - VoteSeconds and DistSeconds MUST each be at least 20 seconds; FU-VA and - VU-FU MUST each be at least 5 minutes. - -2. Router operation and formats - - ORs SHOULD generate a new router descriptor and a new extra-info - document whenever any of the following events have occurred: - - - A period of time (18 hrs by default) has passed since the last - time a descriptor was generated. - - - A descriptor field other than bandwidth or uptime has changed. - - - Bandwidth has changed by a factor of 2 from the last time a - descriptor was generated, and at least a given interval of time - (20 mins by default) has passed since then. - - - Its uptime has been reset (by restarting). - - [XXX this list is incomplete; see router_differences_are_cosmetic() - in routerlist.c for others] - - ORs SHOULD NOT publish a new router descriptor or extra-info document - if none of the above events have occurred and not much time has passed - (12 hours by default). - - After generating a descriptor, ORs upload them to every directory - authority they know, by posting them (in order) to the URL - - http://<hostname:port>/tor/ - -2.1. Router descriptor format - - Router descriptors consist of the following items. For backward - compatibility, there should be an extra NL at the end of each router - descriptor. - - In lines that take multiple arguments, extra arguments SHOULD be - accepted and ignored. Many of the nonterminals below are defined in - section 2.3. - - "router" nickname address ORPort SOCKSPort DirPort NL - - [At start, exactly once.] - - Indicates the beginning of a router descriptor. "nickname" must be a - valid router nickname as specified in 2.3. 
"address" must be an IPv4 - address in dotted-quad format. The last three numbers indicate the - TCP ports at which this OR exposes functionality. ORPort is a port at - which this OR accepts TLS connections for the main OR protocol; - SOCKSPort is deprecated and should always be 0; and DirPort is the - port at which this OR accepts directory-related HTTP connections. If - any port is not supported, the value 0 is given instead of a port - number. (At least one of DirPort and ORPort SHOULD be set; - authorities MAY reject any descriptor with both DirPort and ORPort of - 0.) - - "bandwidth" bandwidth-avg bandwidth-burst bandwidth-observed NL - - [Exactly once] - - Estimated bandwidth for this router, in bytes per second. The - "average" bandwidth is the volume per second that the OR is willing to - sustain over long periods; the "burst" bandwidth is the volume that - the OR is willing to sustain in very short intervals. The "observed" - value is an estimate of the capacity this server can handle. The - server remembers the max bandwidth sustained output over any ten - second period in the past day, and another sustained input. The - "observed" value is the lesser of these two numbers. - - "platform" string NL - - [At most once] - - A human-readable string describing the system on which this OR is - running. This MAY include the operating system, and SHOULD include - the name and version of the software implementing the Tor protocol. - - "published" YYYY-MM-DD HH:MM:SS NL - - [Exactly once] - - The time, in GMT, when this descriptor (and its corresponding - extra-info document if any) was generated. - - "fingerprint" fingerprint NL - - [At most once] - - A fingerprint (a HASH_LEN-byte of asn1 encoded public key, encoded in - hex, with a single space after every 4 characters) for this router's - identity key. A descriptor is considered invalid (and MUST be - rejected) if the fingerprint line does not match the public key. 
- - [We didn't start parsing this line until Tor 0.1.0.6-rc; it should - be marked with "opt" until earlier versions of Tor are obsolete.] - - "hibernating" bool NL - - [At most once] - - If the value is 1, then the Tor server was hibernating when the - descriptor was published, and shouldn't be used to build circuits. - - [We didn't start parsing this line until Tor 0.1.0.6-rc; it should be - marked with "opt" until earlier versions of Tor are obsolete.] - - "uptime" number NL - - [At most once] - - The number of seconds that this OR process has been running. - - "onion-key" NL a public key in PEM format - - [Exactly once] - - This key is used to encrypt EXTEND cells for this OR. The key MUST be - accepted for at least 1 week after any new key is published in a - subsequent descriptor. It MUST be 1024 bits. - - "signing-key" NL a public key in PEM format - - [Exactly once] - - The OR's long-term identity key. It MUST be 1024 bits. - - "accept" exitpattern NL - "reject" exitpattern NL - - [Any number] - - These lines describe an "exit policy": the rules that an OR follows - when deciding whether to allow a new stream to a given address. The - 'exitpattern' syntax is described below. There MUST be at least one - such entry. The rules are considered in order; if no rule matches, - the address will be accepted. For clarity, the last such entry SHOULD - be accept *:* or reject *:*. - - "router-signature" NL Signature NL - - [At end, exactly once] - - The "SIGNATURE" object contains a signature of the PKCS1-padded - hash of the entire router descriptor, taken from the beginning of the - "router" line, through the newline after the "router-signature" line. - The router descriptor is invalid unless the signature is performed - with the router's identity key. - - "contact" info NL - - [At most once] - - Describes a way to contact the server's administrator, preferably - including an email address and a PGP key fingerprint. 
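The first-match rule for the "accept" and "reject" lines described above can be sketched as follows. The 'exitpattern' syntax is defined elsewhere in the spec and is not reproduced here, so this sketch assumes a simplified form: an address part that is "*", a single IPv4 address, or an address/bits prefix, followed by ":" and a port part that is "*", a single port, or a "low-high" range. The function names are hypothetical.

```python
import ipaddress


def pattern_matches(pattern, addr, port):
    """Check one simplified exitpattern (assumed syntax, see lead-in)."""
    addr_part, port_part = pattern.rsplit(":", 1)
    if addr_part != "*":
        network = ipaddress.ip_network(addr_part, strict=False)
        if ipaddress.ip_address(addr) not in network:
            return False
    if port_part == "*":
        return True
    low, _, high = port_part.partition("-")
    return int(low) <= port <= int(high or low)


def exit_allowed(policy, addr, port):
    """policy is an ordered list of (action, pattern) pairs, where action
    is "accept" or "reject". Rules are considered in order; if no rule
    matches, the address is accepted."""
    for action, pattern in policy:
        if pattern_matches(pattern, addr, port):
            return action == "accept"
    return True
```

Ending a policy with an explicit ("reject", "*:*") or ("accept", "*:*") entry, as the text recommends, means the final default is never reached.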
- - "family" names NL - - [At most once] - - 'Names' is a space-separated list of server nicknames or - hexdigests. If two ORs list one another in their "family" entries, - then OPs should treat them as a single OR for the purpose of path - selection. - - For example, if node A's descriptor contains "family B", and node B's - descriptor contains "family A", then node A and node B should never - be used on the same circuit. - - "read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL - [At most once] - "write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL - [At most once] - - Declare how much bandwidth the OR has used recently. Usage is divided - into intervals of NSEC seconds. The YYYY-MM-DD HH:MM:SS field - defines the end of the most recent interval. The numbers are the - number of bytes used in the most recent intervals, ordered from - oldest to newest. - - [We didn't start parsing these lines until Tor 0.1.0.6-rc; they should - be marked with "opt" until earlier versions of Tor are obsolete.] - - [See also migration notes in section 2.2.1.] - - "eventdns" bool NL - - [At most once] - - Declare whether this version of Tor is using the newer enhanced - dns logic. Versions of Tor with this field set to false SHOULD NOT - be used for reverse hostname lookups. - - [All versions of Tor before 0.1.2.2-alpha should be assumed to have - this option set to 0 if it is not present. All Tor versions at - 0.1.2.2-alpha or later should be assumed to have this option set to - 1 if it is not present. Until 0.1.2.1-alpha-dev, this option was - not generated, even when the new DNS code was in use. Versions of Tor - before 0.1.2.1-alpha-dev did not parse this option, so it should be - marked "opt". The dnsworker logic has been removed, so this option - should not be used by new server code. However, it can still be - used, and should still be recognized by new code until Tor 0.1.2.x - is obsolete.] - - "caches-extra-info" NL - - [At most once.] 
- - Present only if this router is a directory cache that provides - extra-info documents. - - [Versions before 0.2.0.1-alpha don't recognize this, and versions - before 0.1.2.5-alpha will reject descriptors containing it unless - it is prefixed with "opt"; it should be so prefixed until these - versions are obsolete.] - - "extra-info-digest" digest NL - - [At most once] - - "Digest" is a hex-encoded digest (using upper-case characters) of the - router's extra-info document, as signed in the router's extra-info - (that is, not including the signature). (If this field is absent, the - router is not uploading a corresponding extra-info document.) - - [Versions before 0.2.0.1-alpha don't recognize this, and versions - before 0.1.2.5-alpha will reject descriptors containing it unless - it is prefixed with "opt"; it should be so prefixed until these - versions are obsolete.] - - "hidden-service-dir" *(SP VersionNum) NL - - [At most once.] - - Present only if this router stores and serves hidden service - descriptors. If any VersionNum(s) are specified, this router - supports those descriptor versions. If none are specified, it - defaults to version 2 descriptors. - - [Versions of Tor before 0.1.2.5-alpha rejected router descriptors - with unrecognized items; the protocols line should be preceded with - an "opt" until these Tors are obsolete.] - - "protocols" SP "Link" SP LINK-VERSION-LIST SP "Circuit" SP - CIRCUIT-VERSION-LIST NL - - [At most once.] - - Both lists are space-separated sequences of numbers, to indicate which - protocols the server supports. As of 30 Mar 2008, specified - protocols are "Link 1 2 Circuit 1". See section 4.1 of tor-spec.txt - for more information about link protocol versions. - - [Versions of Tor before 0.1.2.5-alpha rejected router descriptors - with unrecognized items; the protocols line should be preceded with - an "opt" until these Tors are obsolete.] - - "allow-single-hop-exits" NL - - [At most once.] 
- - Present only if the router allows single-hop circuits to make exit - connections. Most Tor servers do not support this: this is - included for specialized controllers designed to support perspective - access and such. - - -2.2. Extra-info documents - - Extra-info documents consist of the following items: - - "extra-info" Nickname Fingerprint NL - [At start, exactly once.] - - Identifies what router this is an extra info descriptor for. - Fingerprint is encoded in hex (using upper-case letters), with - no spaces. - - "published" YYYY-MM-DD HH:MM:SS NL - - [Exactly once.] - - The time, in GMT, when this document (and its corresponding router - descriptor if any) was generated. It MUST match the published time - in the corresponding router descriptor. - - "read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL - [At most once.] - "write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL - [At most once.] - - As documented in 2.1 above. See migration notes in section 2.2.1. - - ("geoip-start" YYYY-MM-DD HH:MM:SS NL) - ("geoip-client-origins" CC=N,CC=N,... NL) - - Only generated by bridge routers (see blocking.pdf), and only - when they have been configured with a geoip database. - Non-bridges SHOULD NOT generate these fields. Contains a list - of mappings from two-letter country codes (CC) to the number - of clients that have connected to that bridge from that - country (approximate, and rounded up to the nearest multiple of 8 - in order to hamper traffic analysis). A country is included - only if it has at least one address. The time in - "geoip-start" is the time at which we began collecting geoip - statistics. - - "geoip-start" and "geoip-client-origins" have been replaced by - "bridge-stats-end" and "bridge-stats-ips" in 0.2.2.4-alpha. 
The - reason is that the measurement interval with "geoip-stats" as - determined by subtracting "geoip-start" from "published" could - have had a variable length, whereas the measurement interval in - 0.2.2.4-alpha and later is set to be exactly 24 hours long. In - order to clearly distinguish the new measurement intervals from - the old ones, the new keywords have been introduced. - - "bridge-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL - [At most once.] - - YYYY-MM-DD HH:MM:SS defines the end of the included measurement - interval of length NSEC seconds (86400 seconds by default). - - A "bridge-stats-end" line, as well as any other "bridge-*" line, - is only added when the relay has been running as a bridge for at - least 24 hours. - - "bridge-ips" CC=N,CC=N,... NL - [At most once.] - - List of mappings from two-letter country codes to the number of - unique IP addresses that have connected from that country to the - bridge and which are no known relays, rounded up to the nearest - multiple of 8. - - "dirreq-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL - [At most once.] - - YYYY-MM-DD HH:MM:SS defines the end of the included measurement - interval of length NSEC seconds (86400 seconds by default). - - A "dirreq-stats-end" line, as well as any other "dirreq-*" line, - is only added when the relay has opened its Dir port and after 24 - hours of measuring directory requests. - - "dirreq-v2-ips" CC=N,CC=N,... NL - [At most once.] - "dirreq-v3-ips" CC=N,CC=N,... NL - [At most once.] - - List of mappings from two-letter country codes to the number of - unique IP addresses that have connected from that country to - request a v2/v3 network status, rounded up to the nearest multiple - of 8. Only those IP addresses are counted that the directory can - answer with a 200 OK status code. - - "dirreq-v2-reqs" CC=N,CC=N,... NL - [At most once.] - "dirreq-v3-reqs" CC=N,CC=N,... NL - [At most once.] 
- - List of mappings from two-letter country codes to the number of - requests for v2/v3 network statuses from that country, rounded up - to the nearest multiple of 8. Only requests that the directory can - answer with a 200 OK status code are counted. - - "dirreq-v2-share" num% NL - [At most once.] - "dirreq-v3-share" num% NL - [At most once.] - - The share of v2/v3 network status requests that the directory - expects to receive from clients based on its advertised bandwidth - compared to the overall network bandwidth capacity. Shares are - formatted in percent with two decimal places. Shares are - calculated as means over the whole 24-hour interval. - - "dirreq-v2-resp" status=num,... NL - [At most once.] - "dirreq-v3-resp" status=num,... NL - [At most once.] - - List of mappings from response statuses to the number of requests - for v2/v3 network statuses that were answered with that response - status, rounded up to the nearest multiple of 4. Only response - statuses with at least 1 response are reported. New response - statuses can be added at any time. The current list of response - statuses is as follows: - - "ok": a network status request is answered; this number - corresponds to the sum of all requests as reported in - "dirreq-v2-reqs" or "dirreq-v3-reqs", respectively, before - rounding up. - "not-enough-sigs": a version 3 network status is not signed by a - sufficient number of requested authorities. - "unavailable": a requested network status object is unavailable. - "not-found": a requested network status is not found. - "not-modified": a network status has not been modified since the - If-Modified-Since time that is included in the request. - "busy": the directory is busy. - - "dirreq-v2-direct-dl" key=val,... NL - [At most once.] - "dirreq-v3-direct-dl" key=val,... NL - [At most once.] - "dirreq-v2-tunneled-dl" key=val,... NL - [At most once.] - "dirreq-v3-tunneled-dl" key=val,... NL - [At most once.] 
- - List of statistics about possible failures in the download process - of v2/v3 network statuses. Requests are either "direct" - HTTP-encoded requests over the relay's directory port, or - "tunneled" requests using a BEGIN_DIR cell over the relay's OR - port. The list of possible statistics can change, and statistics - can be left out from reporting. The current list of statistics is - as follows: - - Successful downloads and failures: - - "complete": a client has finished the download successfully. - "timeout": a download did not finish within 10 minutes after - starting to send the response. - "running": a download is still running at the end of the - measurement period for less than 10 minutes after starting to - send the response. - - Download times: - - "min", "max": smallest and largest measured bandwidth in B/s. - "d[1-4,6-9]": 1st to 4th and 6th to 9th decile of measured - bandwidth in B/s. For a given decile i, i/10 of all downloads - had a smaller bandwidth than di, and (10-i)/10 of all downloads - had a larger bandwidth than di. - "q[1,3]": 1st and 3rd quartile of measured bandwidth in B/s. One - fourth of all downloads had a smaller bandwidth than q1, one - fourth of all downloads had a larger bandwidth than q3, and the - remaining half of all downloads had a bandwidth between q1 and - q3. - "md": median of measured bandwidth in B/s. Half of the downloads - had a smaller bandwidth than md, the other half had a larger - bandwidth than md. - - "dirreq-read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM... NL - [At most once] - "dirreq-write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM... NL - [At most once] - - Declare how much bandwidth the OR has spent on answering directory - requests. Usage is divided into intervals of NSEC seconds. The - YYYY-MM-DD HH:MM:SS field defines the end of the most recent - interval. The numbers are the number of bytes used in the most - recent intervals, ordered from oldest to newest. 
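
A minimal Python sketch of reading one of the "dirreq-read-history" / "dirreq-write-history" lines described above. The function name `parse_history`, the regular expression, and the shape of the returned tuple are illustrative assumptions, not part of the specification. The sample values in the usage note are made up.

```python
import re
from datetime import datetime, timedelta

# Assumed layout, taken from the item description above:
#   keyword YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM...
HISTORY_RE = re.compile(
    r"^(dirreq-(?:read|write)-history) "
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"\((\d+) s\) ([\d,]*)$"
)


def parse_history(line):
    """Return (keyword, nsec, [(interval_end, byte_count), ...])."""
    match = HISTORY_RE.match(line.strip())
    if match is None:
        raise ValueError("not a dirreq history line: %r" % line)
    keyword, stamp, nsec, numbers = match.groups()
    nsec = int(nsec)
    last_end = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
    values = [int(v) for v in numbers.split(",") if v]
    intervals = []
    for index, value in enumerate(values):
        # Values are ordered oldest to newest; the timestamp marks the
        # end of the most recent interval, so step back from there.
        steps_back = len(values) - 1 - index
        intervals.append((last_end - timedelta(seconds=nsec * steps_back), value))
    return keyword, nsec, intervals
```

For example, `parse_history("dirreq-read-history 2010-03-01 12:00:00 (900 s) 10,20,30")` pairs the value 30 with the interval ending at 12:00:00 and the value 10 with the interval ending at 11:30:00.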
- - "entry-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL - [At most once.] - - YYYY-MM-DD HH:MM:SS defines the end of the included measurement - interval of length NSEC seconds (86400 seconds by default). - - An "entry-stats-end" line, as well as any other "entry-*" - line, is first added after the relay has been running for at least - 24 hours. - - "entry-ips" CC=N,CC=N,... NL - [At most once.] - - List of mappings from two-letter country codes to the number of - unique IP addresses that have connected from that country to the - relay and which are no known other relays, rounded up to the - nearest multiple of 8. - - "cell-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL - [At most once.] - - YYYY-MM-DD HH:MM:SS defines the end of the included measurement - interval of length NSEC seconds (86400 seconds by default). - - A "cell-stats-end" line, as well as any other "cell-*" line, - is first added after the relay has been running for at least 24 - hours. - - "cell-processed-cells" num,...,num NL - [At most once.] - - Mean number of processed cells per circuit, subdivided into - deciles of circuits by the number of cells they have processed in - descending order from loudest to quietest circuits. - - "cell-queued-cells" num,...,num NL - [At most once.] - - Mean number of cells contained in queues by circuit decile. These - means are calculated by 1) determining the mean number of cells in - a single circuit between its creation and its termination and 2) - calculating the mean for all circuits in a given decile as - determined in "cell-processed-cells". Numbers have a precision of - two decimal places. - - "cell-time-in-queue" num,...,num NL - [At most once.] - - Mean time cells spend in circuit queues in milliseconds. Times are - calculated by 1) determining the mean time cells spend in the - queue of a single circuit and 2) calculating the mean for all - circuits in a given decile as determined in - "cell-processed-cells". 
- - "cell-circuits-per-decile" num NL - [At most once.] - - Mean number of circuits that are included in any of the deciles, - rounded up to the next integer. - - "exit-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL - [At most once.] - - YYYY-MM-DD HH:MM:SS defines the end of the included measurement - interval of length NSEC seconds (86400 seconds by default). - - An "exit-stats-end" line, as well as any other "exit-*" line, is - first added after the relay has been running for at least 24 hours - and only if the relay permits exiting (where exiting to a single - port and IP address is sufficient). - - "exit-kibibytes-written" port=N,port=N,... NL - [At most once.] - "exit-kibibytes-read" port=N,port=N,... NL - [At most once.] - - List of mappings from ports to the number of kibibytes that the - relay has written to or read from exit connections to that port, - rounded up to the next full kibibyte. - - "exit-streams-opened" port=N,port=N,... NL - [At most once.] - - List of mappings from ports to the number of opened exit streams - to that port, rounded up to the nearest multiple of 4. - - "router-signature" NL Signature NL - [At end, exactly once.] - - A document signature as documented in section 1.3, using the - initial item "extra-info" and the final item "router-signature", - signed with the router's identity key. - -2.2.1. Moving history fields to extra-info documents. - - Tools that want to use the read-history and write-history values SHOULD - download extra-info documents as well as router descriptors. Such - tools SHOULD accept history values from both sources; if they appear in - both documents, the values in the extra-info documents are authoritative. - - New versions of Tor no longer generate router descriptors - containing read-history or write-history. Tools should continue to - accept read-history and write-history values in router descriptors - produced by older versions of Tor until all Tor versions earlier - than 0.2.0.x are obsolete. - -2.3. 
Nonterminals in router descriptors - - nickname ::= between 1 and 19 alphanumeric characters ([A-Za-z0-9]), - case-insensitive. - hexdigest ::= a '$', followed by 40 hexadecimal characters - ([A-Fa-f0-9]). [Represents a server by the digest of its identity - key.] - - exitpattern ::= addrspec ":" portspec - portspec ::= "*" | port | port "-" port - port ::= an integer between 1 and 65535, inclusive. - - [Some implementations incorrectly generate ports with value 0. - Implementations SHOULD accept this, and SHOULD NOT generate it. - Connections to port 0 are never permitted.] - - addrspec ::= "*" | ip4spec | ip6spec - ipv4spec ::= ip4 | ip4 "/" num_ip4_bits | ip4 "/" ip4mask - ip4 ::= an IPv4 address in dotted-quad format - ip4mask ::= an IPv4 mask in dotted-quad format - num_ip4_bits ::= an integer between 0 and 32 - ip6spec ::= ip6 | ip6 "/" num_ip6_bits - ip6 ::= an IPv6 address, surrounded by square brackets. - num_ip6_bits ::= an integer between 0 and 128 - - bool ::= "0" | "1" - -3. Formats produced by directory authorities. - - Every authority has two keys used in this protocol: a signing key, and - an authority identity key. (Authorities also have a router identity - key used in their role as a router and by earlier versions of the - directory protocol.) The identity key is used from time to time to - sign new key certificates using new signing keys; it is very sensitive. - The signing key is used to sign key certificates and status documents. - - There are three kinds of documents generated by directory authorities: - - Key certificates - Status votes - Status consensuses - - Each is discussed below. - -3.1. Key certificates - - Key certificates consist of the following items: - - "dir-key-certificate-version" version NL - - [At start, exactly once.] - - Determines the version of the key certificate. MUST be "3" for - the protocol described in this document. Implementations MUST - reject formats they don't understand. 
- - "dir-address" IPPort NL - [At most once] - - An IP:Port for this authority's directory port. - - "fingerprint" fingerprint NL - - [Exactly once.] - - Hexadecimal encoding without spaces based on the authority's - identity key. - - "dir-identity-key" NL a public key in PEM format - - [Exactly once.] - - The long-term authority identity key for this authority. This key - SHOULD be at least 2048 bits long; it MUST NOT be shorter than - 1024 bits. - - "dir-key-published" YYYY-MM-DD HH:MM:SS NL - - [Exactly once.] - - The time (in GMT) when this document and corresponding key were - last generated. - - "dir-key-expires" YYYY-MM-DD HH:MM:SS NL - - [Exactly once.] - - A time (in GMT) after which this key is no longer valid. - - "dir-signing-key" NL a key in PEM format - - [Exactly once.] - - The directory server's public signing key. This key MUST be at - least 1024 bits, and MAY be longer. - - "dir-key-crosscert" NL CrossSignature NL - - [At most once.] - - NOTE: Authorities MUST include this field in all newly generated - certificates. A future version of this specification will make - the field required. - - CrossSignature is a signature, made using the certificate's signing - key, of the digest of the PKCS1-padded hash of the certificate's - identity key. For backward compatibility with broken versions of the - parser, we wrap the base64-encoded signature in -----BEGIN ID - SIGNATURE---- and -----END ID SIGNATURE----- tags. Implementations - MUST allow the "ID " portion to be omitted, however. - - When encountering a certificate with a dir-key-crosscert entry, - implementations MUST verify that the signature is a correct signature - of the hash of the identity key using the signing key. - - "dir-key-certification" NL Signature NL - - [At end, exactly once.] - - A document signature as documented in section 1.3, using the - initial item "dir-key-certificate-version" and the final item - "dir-key-certification", signed with the authority identity key. 
- - Authorities MUST generate a new signing key and corresponding - certificate before the key expires. - -3.2. Vote and consensus status documents - - Votes and consensuses are more strictly formatted then other documents - in this specification, since different authorities must be able to - generate exactly the same consensus given the same set of votes. - - The procedure for deciding when to generate vote and consensus status - documents are described in section 1.4 on the voting timeline. - - Status documents contain a preamble, an authority section, a list of - router status entries, and one or more footer signature, in that order. - - Unlike other formats described above, a SP in these documents must be a - single space character (hex 20). - - Some items appear only in votes, and some items appear only in - consensuses. Unless specified, items occur in both. - - The preamble contains the following items. They MUST occur in the - order given here: - - "network-status-version" SP version NL. - - [At start, exactly once.] - - A document format version. For this specification, the version is - "3". - - "vote-status" SP type NL - - [Exactly once.] - - The status MUST be "vote" or "consensus", depending on the type of - the document. - - "consensus-methods" SP IntegerList NL - - [Exactly once for votes; does not occur in consensuses.] - - A space-separated list of supported methods for generating - consensuses from votes. See section 3.4.1 for details. Method "1" - MUST be included. - - "consensus-method" SP Integer NL - - [Exactly once for consensuses; does not occur in votes.] - - See section 3.4.1 for details. - - (Only included when the vote is generated with consensus-method 2 or - later.) - - "published" SP YYYY-MM-DD SP HH:MM:SS NL - - [Exactly once for votes; does not occur in consensuses.] - - The publication time for this status document (if a vote). - - "valid-after" SP YYYY-MM-DD SP HH:MM:SS NL - - [Exactly once.] 
- - The start of the Interval for this vote. Before this time, the - consensus document produced from this vote should not be used. - See 1.4 for voting timeline information. - - "fresh-until" SP YYYY-MM-DD SP HH:MM:SS NL - - [Exactly once.] - - The time at which the next consensus should be produced; before this - time, there is no point in downloading another consensus, since there - won't be a new one. See 1.4 for voting timeline information. - - "valid-until" SP YYYY-MM-DD SP HH:MM:SS NL - - [Exactly once.] - - The end of the Interval for this vote. After this time, the - consensus produced by this vote should not be used. See 1.4 for - voting timeline information. - - "voting-delay" SP VoteSeconds SP DistSeconds NL - - [Exactly once.] - - VoteSeconds is the number of seconds that we will allow to collect - votes from all authorities; DistSeconds is the number of seconds - we'll allow to collect signatures from all authorities. See 1.4 for - voting timeline information. - - "client-versions" SP VersionList NL - - [At most once.] - - A comma-separated list of recommended client versions, in - ascending order. If absent, no opinion is held about client - versions. - - "server-versions" SP VersionList NL - - [At most once.] - - A comma-separated list of recommended server versions, in - ascending order. If absent, no opinion is held about server - versions. - - "known-flags" SP FlagList NL - - [Exactly once.] - - A space-separated list of all of the flags that this document - might contain. A flag is "known" either because the authority - knows about them and might set them (if in a vote), or because - enough votes were counted for the consensus for an authoritative - opinion to have been formed about their status. - - "params" SP [Parameters] NL - - [At most once] - - Parameter ::= Keyword '=' Int32 - Int32 ::= A decimal integer between -2147483648 and 2147483647. 
- Parameters ::= Parameter | Parameters SP Parameter - - The parameters list, if present, contains a space-separated list of - case-sensitive key-value pairs, sorted in lexical order by - their keyword. Each parameter has its own meaning. - - (Only included when the vote is generated with consensus-method 7 or - later.) - - Commonly used "param" arguments at this point include: - - "circwindow" -- the default package window that circuits should - be established with. It started out at 1000 cells, but some - research indicates that a lower value would mean fewer cells in - transit in the network at any given time. Obeyed by Tor 0.2.1.20 - and later. - Min: 100, Max: 1000 - - "CircuitPriorityHalflifeMsec" -- the halflife parameter used when - weighting which circuit will send the next cell. Obeyed by Tor - 0.2.2.10-alpha and later. (Versions of Tor between 0.2.2.7-alpha - and 0.2.2.10-alpha recognized a "CircPriorityHalflifeMsec" parameter, - but mishandled it badly.) - Min: -1, Max: 2147483647 (INT32_MAX) - - "perconnbwrate" and "perconnbwburst" -- if set, each relay sets - up a separate token bucket for every client OR connection, - and rate limits that connection indepedently. Typically left - unset, except when used for performance experiments around trac - entry 1750. Only honored by relays running Tor 0.2.2.16-alpha - and later. (Note that relays running 0.2.2.7-alpha through - 0.2.2.14-alpha looked for bwconnrate and bwconnburst, but then - did the wrong thing with them; see bug 1830 for details.) - Min: 1, Max: 2147483647 (INT32_MAX) - - "refuseunknownexits" -- if set to one, exit relays look at - the previous hop of circuits that ask to open an exit stream, - and refuse to exit if they don't recognize it as a relay. The - goal is to make it harder for people to use them as one-hop - proxies. See trac entry 1751 for details. - Min: 0, Max: 1 - - See also "2.4.5. 
Consensus parameters governing behavior" - in path-spec.txt for a series of circuit build time related - consensus params. - - The authority section of a vote contains the following items, followed - in turn by the authority's current key certificate: - - "dir-source" SP nickname SP identity SP address SP IP SP dirport SP - orport NL - - [Exactly once, at start] - - Describes this authority. The nickname is a convenient identifier - for the authority. The identity is an uppercase hex fingerprint of - the authority's current (v3 authority) identity key. The address is - the server's hostname. The IP is the server's current IP address, - and dirport is its current directory port. XXXXorport - - "contact" SP string NL - - [At most once.] - - An arbitrary string describing how to contact the directory - server's administrator. Administrators should include at least an - email address and a PGP fingerprint. - - "legacy-key" SP FINGERPRINT NL - - [At most once] - - Lists a fingerprint for an obsolete _identity_ key still used - by this authority to keep older clients working. This option - is used to keep key around for a little while in case the - authorities need to migrate many identity keys at once. - (Generally, this would only happen because of a security - vulnerability that affected multiple authorities, like the - Debian OpenSSL RNG bug of May 2008.) - - The authority section of a consensus contains groups the following items, - in the order given, with one group for each authority that contributed to - the consensus, with groups sorted by authority identity digest: - - "dir-source" SP nickname SP identity SP address SP IP SP dirport SP - orport NL - - [Exactly once, at start] - - As in the authority section of a vote. - - "contact" SP string NL - - [At most once.] - - As in the authority section of a vote. - - "vote-digest" SP digest NL - - [Exactly once.] 
- - A digest of the vote from the authority that contributed to this - consensus, as signed (that is, not including the signature). - (Hex, upper-case.) - - Each router status entry contains the following items. Router status - entries are sorted in ascending order by identity digest. - - "r" SP nickname SP identity SP digest SP publication SP IP SP ORPort - SP DirPort NL - - [At start, exactly once.] - - "Nickname" is the OR's nickname. "Identity" is a hash of its - identity key, encoded in base64, with trailing equals sign(s) - removed. "Digest" is a hash of its most recent descriptor as - signed (that is, not including the signature), encoded in base64. - "Publication" is the - publication time of its most recent descriptor, in the form - YYYY-MM-DD HH:MM:SS, in GMT. "IP" is its current IP address; - ORPort is its current OR port, "DirPort" is it's current directory - port, or "0" for "none". - - "s" SP Flags NL - - [At most once.] - - A series of space-separated status flags, in alphabetical order. - Currently documented flags are: - - "Authority" if the router is a directory authority. - "BadExit" if the router is believed to be useless as an exit node - (because its ISP censors it, because it is behind a restrictive - proxy, or for some similar reason). - "BadDirectory" if the router is believed to be useless as a - directory cache (because its directory port isn't working, - its bandwidth is always throttled, or for some similar - reason). - "Exit" if the router is more useful for building - general-purpose exit circuits than for relay circuits. The - path building algorithm uses this flag; see path-spec.txt. - "Fast" if the router is suitable for high-bandwidth circuits. - "Guard" if the router is suitable for use as an entry guard. - "HSDir" if the router is considered a v2 hidden service directory. - "Named" if the router's identity-nickname mapping is canonical, - and this authority binds names. 
- "Stable" if the router is suitable for long-lived circuits. - "Running" if the router is currently usable. - "Unnamed" if another router has bound the name used by this - router, and this authority binds names. - "Valid" if the router has been 'validated'. - "V2Dir" if the router implements the v2 directory protocol. - "V3Dir" if the router implements this protocol. - - "v" SP version NL - - [At most once.] - - The version of the Tor protocol that this server is running. If - the value begins with "Tor" SP, the rest of the string is a Tor - version number, and the protocol is "The Tor protocol as supported - by the given version of Tor." Otherwise, if the value begins with - some other string, Tor has upgraded to a more sophisticated - protocol versioning system, and the protocol is "a version of the - Tor protocol more recent than any we recognize." - - Directory authorities SHOULD omit version strings they receive from - descriptors if they would cause "v" lines to be over 128 characters - long. - - "w" SP "Bandwidth=" INT [SP "Measured=" INT] NL - - [At most once.] - - An estimate of the bandwidth of this server, in an arbitrary - unit (currently kilobytes per second). Used to weight router - selection. - - Additionally, the Measured= keyword is present in votes by - participating bandwidth measurement authorities to indicate - a measured bandwidth currently produced by measuring stream - capacities. - - Other weighting keywords may be added later. - Clients MUST ignore keywords they do not recognize. - - "p" SP ("accept" / "reject") SP PortList NL - - [At most once.] - - PortList = PortOrRange - PortList = PortList "," PortOrRange - PortOrRange = INT "-" INT / INT - - A list of those ports that this router supports (if 'accept') - or does not support (if 'reject') for exit to "most - addresses". 
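
The "w" and "p" line grammars above can be read with a few lines of Python. This is a sketch under stated assumptions: the function names `parse_w_line` and `parse_p_line` and the return shapes are hypothetical, and the sample lines in the usage note are invented for illustration.

```python
def parse_w_line(line):
    """Parse a "w" line into a dict of the weighting keywords we recognize."""
    parts = line.rstrip("\n").split(" ")
    if parts[0] != "w":
        raise ValueError("not a w line")
    known = ("Bandwidth", "Measured")
    result = {}
    for item in parts[1:]:
        key, sep, value = item.partition("=")
        # The text above says unrecognized keywords are ignored.
        if sep and key in known:
            result[key] = int(value)
    return result


def parse_p_line(line):
    """Parse a "p" line into (action, [(low_port, high_port), ...])."""
    keyword, action, port_list = line.rstrip("\n").split(" ")
    if keyword != "p" or action not in ("accept", "reject"):
        raise ValueError("not a p line")
    ranges = []
    for item in port_list.split(","):
        low, sep, high = item.partition("-")
        # A bare INT is treated as a one-port range.
        ranges.append((int(low), int(high) if sep else int(low)))
    return action, ranges
```

With these definitions, `parse_w_line("w Bandwidth=1200 Measured=900 Future=1")` keeps only the two recognized keywords, and `parse_p_line("p accept 80,443,6660-6669")` yields three port ranges.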
- - The footer section is delineated in all votes and consensuses supporting - consensus method 9 and above with the following: - - "directory-footer" NL - - It contains two subsections, a bandwidths-weights line and a - directory-signature. - - The bandwidths-weights line appears At Most Once for a consensus. It does - not appear in votes. - - "bandwidth-weights" SP - "Wbd=" INT SP "Wbe=" INT SP "Wbg=" INT SP "Wbm=" INT SP - "Wdb=" INT SP - "Web=" INT SP "Wed=" INT SP "Wee=" INT SP "Weg=" INT SP "Wem=" INT SP - "Wgb=" INT SP "Wgd=" INT SP "Wgg=" INT SP "Wgm=" INT SP - "Wmb=" INT SP "Wmd=" INT SP "Wme=" INT SP "Wmg=" INT SP "Wmm=" INT NL - - These values represent the weights to apply to router bandwidths during - path selection. They are sorted in alphabetical order in the list. The - integer values are divided by BW_WEIGHT_SCALE=10000 or the consensus - param "bwweightscale". They are: - - Wgg - Weight for Guard-flagged nodes in the guard position - Wgm - Weight for non-flagged nodes in the guard Position - Wgd - Weight for Guard+Exit-flagged nodes in the guard Position - - Wmg - Weight for Guard-flagged nodes in the middle Position - Wmm - Weight for non-flagged nodes in the middle Position - Wme - Weight for Exit-flagged nodes in the middle Position - Wmd - Weight for Guard+Exit flagged nodes in the middle Position - - Weg - Weight for Guard flagged nodes in the exit Position - Wem - Weight for non-flagged nodes in the exit Position - Wee - Weight for Exit-flagged nodes in the exit Position - Wed - Weight for Guard+Exit-flagged nodes in the exit Position - - Wgb - Weight for BEGIN_DIR-supporting Guard-flagged nodes - Wmb - Weight for BEGIN_DIR-supporting non-flagged nodes - Web - Weight for BEGIN_DIR-supporting Exit-flagged nodes - Wdb - Weight for BEGIN_DIR-supporting Guard+Exit-flagged nodes - - Wbg - Weight for Guard flagged nodes for BEGIN_DIR requests - Wbm - Weight for non-flagged nodes for BEGIN_DIR requests - Wbe - Weight for Exit-flagged nodes for 
BEGIN_DIR requests - Wbd - Weight for Guard+Exit-flagged nodes for BEGIN_DIR requests - - These values are calculated as specified in Section 3.4.3. - - The signature contains the following item, which appears Exactly Once - for a vote, and At Least Once for a consensus. - - "directory-signature" SP identity SP signing-key-digest NL Signature - - This is a signature of the status document, with the initial item - "network-status-version", and the signature item - "directory-signature", using the signing key. (In this case, we take - the hash through the _space_ after directory-signature, not the - newline: this ensures that all authorities sign the same thing.) - "identity" is the hex-encoded digest of the authority identity key of - the signing authority, and "signing-key-digest" is the hex-encoded - digest of the current authority signing key of the signing authority. - -3.3. Assigning flags in a vote - - (This section describes how directory authorities choose which status - flags to apply to routers, as of Tor 0.2.0.0-alpha-dev. Later directory - authorities MAY do things differently, so long as clients keep working - well. Clients MUST NOT depend on the exact behaviors in this section.) - - In the below definitions, a router is considered "active" if it is - running, valid, and not hibernating. - - "Valid" -- a router is 'Valid' if it is running a version of Tor not - known to be broken, and the directory authority has not blacklisted - it as suspicious. - - "Named" -- Directory authority administrators may decide to support name - binding. If they do, then they must maintain a file of - nickname-to-identity-key mappings, and try to keep this file consistent - with other directory authorities. If they don't, they act as clients, and - report bindings made by other directory authorities (name X is bound to - identity Y if at least one binding directory lists it, and no directory - binds X to some other Y'.) 
A router is called 'Named' if the router - believes the given name should be bound to the given key. - - Two strategies exist on the current network for deciding on - values for the Named flag. In the original version, server - operators were asked to send nickname-identity pairs to a - mailing list of Naming directory authorities operators. The - operators were then supposed to add the pairs to their - mapping files; in practice, they didn't get to this often. - - Newer Naming authorities run a script that registers routers - in their mapping files once the routers have been online at - least two weeks, no other router has that nickname, and no - other router has wanted the nickname for a month. If a router - has not been online for six months, the router is removed. - - "Unnamed" -- Directory authorities that support naming should vote for a - router to be 'Unnamed' if its given nickname is mapped to a different - identity. - - "Running" -- A router is 'Running' if the authority managed to connect to - it successfully within the last 30 minutes. - - "Stable" -- A router is 'Stable' if it is active, and either its Weighted - MTBF is at least the median for known active routers or its Weighted MTBF - corresponds to at least 7 days. Routers are never called Stable if they are - running a version of Tor known to drop circuits stupidly. (0.1.1.10-alpha - through 0.1.1.16-rc are stupid this way.) - - To calculate weighted MTBF, compute the weighted mean of the lengths - of all intervals when the router was observed to be up, weighting - intervals by $\alpha^n$, where $n$ is the amount of time that has - passed since the interval ended, and $\alpha$ is chosen so that - measurements over approximately one month old no longer influence the - weighted MTBF much. - - [XXXX what happens when we have less than 4 days of MTBF info.] 
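
The weighted MTBF calculation described above can be sketched as follows. The text does not give a value for alpha or say in what unit n is measured, so the defaults below (`alpha=0.95`, n counted in 12-hour steps) are placeholders chosen only to make the example runnable. The function name and the interval representation are likewise assumptions.

```python
def weighted_mtbf(intervals, now, alpha=0.95, unit=12 * 3600):
    """Weighted mean length of uptime intervals.

    intervals: iterable of (start, end) times in seconds during which
    the router was observed to be up.
    now: current time in seconds.
    alpha, unit: ASSUMED discount factor and time step; not from the spec.
    """
    weighted_total = 0.0
    weight_sum = 0.0
    for start, end in intervals:
        n = (now - end) / unit   # time elapsed since the interval ended
        weight = alpha ** n      # older intervals count for less
        weighted_total += weight * (end - start)
        weight_sum += weight
    return weighted_total / weight_sum if weight_sum else 0.0
```

With `alpha=1.0` every interval is weighted equally and the result is the plain mean of the interval lengths. Any smaller alpha pulls the result toward the lengths of the most recent intervals.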
- - "Exit" -- A router is called an 'Exit' iff it allows exits to at - least two of the ports 80, 443, and 6667 and allows exits to at - least one /8 address space. - - "Fast" -- A router is 'Fast' if it is active, and its bandwidth is - either in the top 7/8ths for known active routers or at least 20KB/s. - - "Guard" -- A router is a possible 'Guard' if its Weighted Fractional - Uptime is at least the median for "familiar" active routers, and if - its bandwidth is at least median or at least 250KB/s. - - To calculate weighted fractional uptime, compute the fraction - of time that the router is up in any given day, weighting so that - downtime and uptime in the past counts less. - - A node is 'familiar' if 1/8 of all active nodes have appeared more - recently than it, OR it has been around for a few weeks. - - "Authority" -- A router is called an 'Authority' if the authority - generating the network-status document believes it is an authority. - - "V2Dir" -- A router supports the v2 directory protocol if it has an open - directory port, and it is running a version of the directory protocol that - supports the functionality clients need. (Currently, this is - 0.1.1.9-alpha or later.) - - "V3Dir" -- A router supports the v3 directory protocol if it has an open - directory port, and it is running a version of the directory protocol that - supports the functionality clients need. (Currently, this is - 0.2.0.?????-alpha or later.) - - "HSDir" -- A router is a v2 hidden service directory if it stores and - serves v2 hidden service descriptors and the authority managed to connect - to it successfully within the last 24 hours. - - Directory server administrators may label some servers or IPs as - blacklisted, and elect not to include them in their network-status lists. - - Authorities SHOULD 'disable' any servers in excess of 3 on any single IP. 
- When there are more than 3 to choose from, authorities should first prefer - authorities to non-authorities, then prefer Running to non-Running, and - then prefer high-bandwidth to low-bandwidth. To 'disable' a server, the - authority *should* advertise it without the Running or Valid flag. - - Thus, the network-status vote includes all non-blacklisted, - non-expired, non-superseded descriptors. - - The bandwidth in a "w" line should be taken as the best estimate - of the router's actual capacity that the authority has. For now, - this should be the lesser of the observed bandwidth and bandwidth - rate limit from the router descriptor. It is given in kilobytes - per second, and capped at some arbitrary value (currently 10 MB/s). - - The Measured= keyword on a "w" line vote is currently computed - by multiplying the previous published consensus bandwidth by the - ratio of the measured average node stream capacity to the network - average. If 3 or more authorities provide a Measured= keyword for - a router, the authorities produce a consensus containing a "w" - Bandwidth= keyword equal to the median of the Measured= votes. - - The ports listed in a "p" line should be taken as those ports for - which the router's exit policy permits 'most' addresses, ignoring any - accept not for all addresses, ignoring all rejects for private - netblocks. "Most" addresses are permitted if no more than 2^25 - IPv4 addresses (two /8 networks) were blocked. The list is encoded - as described in 3.4.2. - -3.4. Computing a consensus from a set of votes - - Given a set of votes, authorities compute the contents of the consensus - document as follows: - - The "valid-after", "valid-until", and "fresh-until" times are taken as - the median of the respective values from all the votes. - - The times in the "voting-delay" line are taken as the median of the - VoteSeconds and DistSeconds times in the votes. - - Known-flags is the union of all flags known by any voter. 
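
A short Python sketch of the preamble rules listed so far: medians of the time fields and of the voting-delay values, and the union of known flags. It uses the tie-breaking rule stated later in this section, under which ties in medians go to the smaller or earlier item (a "low-median"). The dictionary keys, the function names, and the sorted output for known-flags are assumptions made for the example.

```python
def low_median(values):
    """Median with ties broken in favor of the smaller or earlier item."""
    ordered = sorted(values)
    return ordered[(len(ordered) - 1) // 2]


def consensus_preamble(votes):
    """votes: list of dicts holding per-vote preamble fields (assumed layout).

    Times are "YYYY-MM-DD HH:MM:SS" strings, which sort chronologically
    when compared as plain strings.
    """
    result = {}
    for field in ("valid-after", "fresh-until", "valid-until"):
        result[field] = low_median([vote[field] for vote in votes])
    result["voting-delay"] = (
        low_median([vote["voting-delay"][0] for vote in votes]),  # VoteSeconds
        low_median([vote["voting-delay"][1] for vote in votes]),  # DistSeconds
    )
    # Union of every flag known by any voter; sorting is only for a
    # deterministic result in this sketch.
    result["known-flags"] = sorted(set().union(*(vote["known-flags"] for vote in votes)))
    return result
```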
- - Entries are given on the "params" line for every keyword on which any - authority voted. The values given are the low-median of all votes on - that keyword. - - "client-versions" and "server-versions" are sorted in ascending - order; A version is recommended in the consensus if it is recommended - by more than half of the voting authorities that included a - client-versions or server-versions lines in their votes. - - The authority item groups (dir-source, contact, fingerprint, - vote-digest) are taken from the votes of the voting - authorities. These groups are sorted by the digests of the - authorities identity keys, in ascending order. If the consensus - method is 3 or later, a dir-source line must be included for - every vote with legacy-key entry, using the legacy-key's - fingerprint, the voter's ordinary nickname with the string - "-legacy" appended, and all other fields as from the original - vote's dir-source line. - - A router status entry: - * is included in the result if some router status entry with the same - identity is included by more than half of the authorities (total - authorities, not just those whose votes we have). - - * For any given identity, we include at most one router status entry. - - * A router entry has a flag set if that is included by more than half - of the authorities who care about that flag. - - * Two router entries are "the same" if they have the same - <descriptor digest, published time, nickname, IP, ports> tuple. - We choose the tuple for a given router as whichever tuple appears - for that router in the most votes. We break ties first in favor of - the more recently published, then in favor of smaller server - descriptor digest. - - * The Named flag appears if it is included for this routerstatus by - _any_ authority, and if all authorities that list it list the same - nickname. 
However, if consensus-method 2 or later is in use, and - any authority calls this identity/nickname pair Unnamed, then - this routerstatus does not get the Named flag. - - * If consensus-method 2 or later is in use, the Unnamed flag is - set for a routerstatus if any authorities have voted for a different - identity to be Named with that nickname, or if any authority - lists that nickname/ID pair as Unnamed. - - (With consensus-method 1, Unnamed is set like any other flag.) - - * The version is given as whichever version is listed by the most - voters, with ties decided in favor of more recent versions. - - * If consensus-method 4 or later is in use, then routers that - do not have the Running flag are not listed at all. - - * If consensus-method 5 or later is in use, then the "w" line - is generated using a low-median of the bandwidth values from - the votes that included "w" lines for this router. - - * If consensus-method 5 or later is in use, then the "p" line - is taken from the votes that have the same policy summary - for the descriptor we are listing. (They should all be the - same. If they are not, we pick the most commonly listed - one, breaking ties in favor of the lexicographically larger - vote.) The port list is encoded as specified in 3.4.2. - - * If consensus-method 6 or later is in use and if 3 or more - authorities provide a Measured= keyword in their votes for - a router, the authorities produce a consensus containing a - Bandwidth= keyword equal to the median of the Measured= votes. - - * If consensus-method 7 or later is in use, the params line is - included in the output. - - The signatures at the end of a consensus document are sorted in - ascending order by identity digest. - - All ties in computing medians are broken in favor of the smaller or - earlier item. - -3.4.1.
Forward compatibility - - Future versions of Tor will need to include new information in the - consensus documents, but it is important that all authorities (or at least - half) generate and sign the same signed consensus. - - To achieve this, authorities list in their votes their supported methods - for generating consensuses from votes. Later methods will be assigned - higher numbers. Currently recognized methods: - "1" -- The first implemented version. - "2" -- Added support for the Unnamed flag. - "3" -- Added legacy ID key support to aid in authority ID key rollovers - "4" -- No longer list routers that are not running in the consensus - "5" -- adds support for "w" and "p" lines. - "6" -- Prefers measured bandwidth values rather than advertised - "7" -- Provides keyword=integer pairs of consensus parameters - "8" -- Provides microdescriptor summaries - "9" -- Provides weights for selecting flagged routers in paths - "10" -- Fixes edge case bugs in router flag selection weights - - Before generating a consensus, an authority must decide which consensus - method to use. To do this, it looks for the highest version number - supported by more than 2/3 of the authorities voting. If it supports this - method, then it uses it. Otherwise, it falls back to method 1. - - (The consensuses generated by new methods must be parsable by - implementations that only understand the old methods, and must not cause - those implementations to compromise their anonymity. This is a means for - making changes in the contents of consensus; not for making - backward-incompatible changes in their format.) - -3.4.2. Encoding port lists - - Whether the summary shows the list of accepted ports or the list of - rejected ports depends on which list is shorter (has a shorter string - representation). In case of ties we choose the list of accepted - ports. 
As an exception to this rule an allow-all policy is - represented as "accept 1-65535" instead of "reject " and a reject-all - policy is similarly given as "reject 1-65535". - - Summary items are compressed; that is, instead of "80-88,89-100" there - is only a single item of "80-100", and similarly instead of "20,21" a - summary will say "20-21". - - Port lists are sorted in ascending order. - - The maximum allowed length of a policy summary (including the "accept " - or "reject ") is 1000 characters. If a summary exceeds that length we - use an accept-style summary and list as much of the port list as is - possible within these 1000 characters. [XXXX be more specific.] - -3.4.3. Computing Bandwidth Weights - - Let weight_scale = 10000 - - Let G be the total bandwidth for Guard-flagged nodes. - Let M be the total bandwidth for non-flagged nodes. - Let E be the total bandwidth for Exit-flagged nodes. - Let D be the total bandwidth for Guard+Exit-flagged nodes. - Let T = G+M+E+D - - Let Wgd be the weight for choosing a Guard+Exit for the guard position. - Let Wmd be the weight for choosing a Guard+Exit for the middle position. - Let Wed be the weight for choosing a Guard+Exit for the exit position. - - Let Wme be the weight for choosing an Exit for the middle position. - Let Wmg be the weight for choosing a Guard for the middle position. - - Let Wgg be the weight for choosing a Guard for the guard position. - Let Wee be the weight for choosing an Exit for the exit position. - - Balanced network conditions then arise from solutions to the following - system of equations: - - Wgg*G + Wgd*D == M + Wmd*D + Wme*E + Wmg*G (guard bw = middle bw) - Wgg*G + Wgd*D == Wee*E + Wed*D (guard bw = exit bw) - Wed*D + Wmd*D + Wgd*D == D (aka: Wed+Wmd+Wgd = 1) - Wmg*G + Wgg*G == G (aka: Wgg = 1-Wmg) - Wme*E + Wee*E == E (aka: Wee = 1-Wme) - - We are short 2 constraints with the above set. The remaining constraints - come from examining different cases of network load.
The following - constraints are used in consensus method 10 and above. There is another, - incorrect and obsolete, set of constraints used for these same cases in - consensus method 9. For those, see dir-spec.txt in Tor 0.2.2.10-alpha - to 0.2.2.16-alpha. - - Case 1: E >= T/3 && G >= T/3 (Neither Exit nor Guard Scarce) - - In this case, the additional two constraints are: Wmg == Wmd, - Wed == 1/3. - - This leads to the solution: - Wgd = weight_scale/3 - Wed = weight_scale/3 - Wmd = weight_scale/3 - Wee = (weight_scale*(E+G+M))/(3*E) - Wme = weight_scale - Wee - Wmg = (weight_scale*(2*G-E-M))/(3*G) - Wgg = weight_scale - Wmg - - Case 2: E < T/3 && G < T/3 (Both are scarce) - - Let R denote the more scarce class (Rare) between Guard vs Exit. - Let S denote the less scarce class. - - Subcase a: R+D < S - - In this subcase, we simply devote all of D bandwidth to the - scarce class. - - Wgg = Wee = weight_scale - Wmg = Wme = Wmd = 0; - if E < G: - Wed = weight_scale - Wgd = 0 - else: - Wed = 0 - Wgd = weight_scale - - Subcase b: R+D >= S - - In this case, if M <= T/3, we have enough bandwidth to try to achieve - a balancing condition. - - Add constraints Wgg = 1, Wmd == Wgd to maximize bandwidth in the guard - position while still allowing exits to be used as middle nodes: - - Wee = (weight_scale*(E - G + M))/E - Wed = (weight_scale*(D - 2*E + 4*G - 2*M))/(3*D) - Wme = (weight_scale*(G-M))/E - Wmg = 0 - Wgg = weight_scale - Wmd = (weight_scale - Wed)/2 - Wgd = (weight_scale - Wed)/2 - - If this system ends up with any values out of range (i.e., negative, or - above weight_scale), use the constraints Wgg == 1 and Wee == 1, since - both those positions are scarce: - - Wgg = weight_scale - Wee = weight_scale - Wed = (weight_scale*(D - 2*E + G + M))/(3*D) - Wmd = (weight_scale*(D - 2*M + G + E))/(3*D) - Wme = 0 - Wmg = 0 - Wgd = weight_scale - Wed - Wmd - - If M > T/3, then the Wmd weight above will become negative.
Set it to 0 - in this case: - Wmd = 0 - Wgd = weight_scale - Wed - - Case 3: One of E < T/3 or G < T/3 - - Let S be the scarce class (of E or G). - - Subcase a: (S+D) < T/3: - if G=S: - Wgg = Wgd = weight_scale; - Wmd = Wed = Wmg = 0; - // Minor subcase, if E is more scarce than M, - // keep its bandwidth in place. - if (E < M) Wme = 0; - else Wme = (weight_scale*(E-M))/(2*E); - Wee = weight_scale-Wme; - if E=S: - Wee = Wed = weight_scale; - Wmd = Wgd = Wme = 0; - // Minor subcase, if G is more scarce than M, - // keep its bandwidth in place. - if (G < M) Wmg = 0; - else Wmg = (weight_scale*(G-M))/(2*G); - Wgg = weight_scale-Wmg; - - Subcase b: (S+D) >= T/3 - if G=S: - Add constraints Wgg = 1, Wmd == Wed to maximize bandwidth - in the guard position, while still allowing exits to be - used as middle nodes: - Wgg = weight_scale - Wgd = (weight_scale*(D - 2*G + E + M))/(3*D) - Wmg = 0 - Wee = (weight_scale*(E+M))/(2*E) - Wme = weight_scale - Wee - Wmd = (weight_scale - Wgd)/2 - Wed = (weight_scale - Wgd)/2 - if E=S: - Add constraints Wee == 1, Wmd == Wgd to maximize bandwidth - in the exit position: - Wee = weight_scale; - Wed = (weight_scale*(D - 2*E + G + M))/(3*D); - Wme = 0; - Wgg = (weight_scale*(G+M))/(2*G); - Wmg = weight_scale - Wgg; - Wmd = (weight_scale - Wed)/2; - Wgd = (weight_scale - Wed)/2; - - To ensure consensus, all calculations are performed using integer math - with a fixed precision determined by the bwweightscale consensus - parameter (defaults at 10000, Min: 1, Max: INT32_MAX). - - For future balancing improvements, Tor clients support 11 additional weights - for directory requests and middle weighting. These weights are currently - set at weight_scale, with the exception of the following groups of - assignments: - - Directory requests use middle weights: - Wbd=Wmd, Wbg=Wmg, Wbe=Wme, Wbm=Wmm - - Handle bridges and strange exit policies: - Wgm=Wgg, Wem=Wee, Weg=Wed - -3.5. 
Detached signatures - - Assuming full connectivity, every authority should compute and sign the - same consensus directory in each period. Therefore, it isn't necessary to - download the consensus computed by each authority; instead, the - authorities only push/fetch each others' signatures. A "detached - signature" document contains items as follows: - - "consensus-digest" SP Digest NL - - [At start, at most once.] - - The digest of the consensus being signed. - - "valid-after" SP YYYY-MM-DD SP HH:MM:SS NL - "fresh-until" SP YYYY-MM-DD SP HH:MM:SS NL - "valid-until" SP YYYY-MM-DD SP HH:MM:SS NL - - [As in the consensus] - - "directory-signature" - - [As in the consensus; the signature object is the same as in the - consensus document.] - - -4. Directory server operation - - All directory authorities and directory caches ("directory servers") - implement this section, except as noted. - -4.1. Accepting uploads (authorities only) - - When a router posts a signed descriptor to a directory authority, the - authority first checks whether it is well-formed and correctly - self-signed. If it is, the authority next verifies that the nickname - in question is not already assigned to a router with a different - public key. - Finally, the authority MAY check that the router is not blacklisted - because of its key, IP, or another reason. - - If the descriptor passes these tests, and the authority does not already - have a descriptor for a router with this public key, it accepts the - descriptor and remembers it. - - If the authority _does_ have a descriptor with the same public key, the - newly uploaded descriptor is remembered if its publication time is more - recent than the most recent old descriptor for that router, and either: - - There are non-cosmetic differences between the old descriptor and the - new one. - - Enough time has passed between the descriptors' publication times. - (Currently, 12 hours.) 
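The rule above for a descriptor whose public key the authority already knows can be sketched as a small predicate. This is a hedged illustration only: the function name `should_remember` and its parameters are hypothetical, the cosmetic/non-cosmetic judgment is assumed to be computed elsewhere and passed in as a boolean, and "enough time has passed" is assumed to mean a gap of at least 12 hours between publication times.

```python
from datetime import datetime, timedelta

# Assumed reading of "(Currently, 12 hours.)": a gap of at least 12 hours.
MIN_REPUBLISH_GAP = timedelta(hours=12)


def should_remember(old_published: datetime,
                    new_published: datetime,
                    non_cosmetic_change: bool) -> bool:
    """Decide whether a newly uploaded descriptor replaces the stored one."""
    # The upload must be strictly more recent than the stored descriptor.
    if new_published <= old_published:
        return False
    # Either condition from the list above is sufficient.
    if non_cosmetic_change:
        return True
    return new_published - old_published >= MIN_REPUBLISH_GAP
```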
- - Differences between router descriptors are "non-cosmetic" if they would be - sufficient to force an upload as described in section 2 above. - - Note that the "cosmetic difference" test only applies to uploaded - descriptors, not to descriptors that the authority downloads from other - authorities. - - When a router posts a signed extra-info document to a directory authority, - the authority again checks it for well-formedness and correct signature, - and checks that its digest matches the extra-info-digest in some router - descriptor that it believes is currently useful. If so, it accepts it and - stores it and serves it as requested. If not, it drops it. - -4.2. Voting (authorities only) - - Authorities divide time into Intervals. Authority administrators SHOULD - try to all pick the same interval length, and SHOULD pick intervals that - are commonly used divisions of time (e.g., 5 minutes, 15 minutes, 30 - minutes, 60 minutes, 90 minutes). Voting intervals SHOULD be chosen to - divide evenly into a 24-hour day. - - Authorities SHOULD act according to interval and delays in the - latest consensus. Lacking a latest consensus, they SHOULD default to a - 30-minute Interval, a 5 minute VotingDelay, and a 5 minute DistDelay. - - Authorities MUST take pains to ensure that their clocks remain accurate - within a few seconds. (Running NTP is usually sufficient.) - - The first voting period of each day begins at 00:00 (midnight) GMT. If - the last period of the day would be truncated by one-half or more, it is - merged with the second-to-last period. - - An authority SHOULD publish its vote immediately at the start of each voting - period (minus VoteSeconds+DistSeconds).
It does this by making it - available at - http://<hostname>/tor/status-vote/next/authority.z - and sending it in an HTTP POST request to each other authority at the URL - http://<hostname>/tor/post/vote - - If, at the start of the voting period, minus DistSeconds, an authority - does not have a current statement from another authority, the first - authority downloads the other's statement. - - Once an authority has a vote from another authority, it makes it available - at - http://<hostname>/tor/status-vote/next/<fp>.z - where <fp> is the fingerprint of the other authority's identity key. - And at - http://<hostname>/tor/status-vote/next/d/<d>.z - where <d> is the digest of the vote document. - - The consensus status, along with as many signatures as the server - currently knows, should be available at - http://<hostname>/tor/status-vote/next/consensus.z - All of the detached signatures it knows for consensus status should be - available at: - http://<hostname>/tor/status-vote/next/consensus-signatures.z - - Once there are enough signatures, or once the voting period starts, - these documents are available at - http://<hostname>/tor/status-vote/current/consensus.z - and - http://<hostname>/tor/status-vote/current/consensus-signatures.z - [XXX current/consensus-signatures is not currently implemented, as it - is not used in the voting protocol.] - - The other vote documents are analogously made available under - http://<hostname>/tor/status-vote/current/authority.z - http://<hostname>/tor/status-vote/current/<fp>.z - http://<hostname>/tor/status-vote/current/d/<d>.z - once the consensus is complete. - - Once an authority has computed and signed a consensus network status, it - should send its detached signature to each other authority in an HTTP POST - request to the URL: - http://<hostname>/tor/post/consensus-signature - - [XXX Note why we support push-and-then-pull.] - - [XXX possible future features include support for downloading old - consensuses.] - -4.3. 
Downloading consensus status documents (caches only) - - All directory servers (authorities and caches) try to keep a recent - network-status consensus document to serve to clients. A cache ALWAYS - downloads a network-status consensus if any of the following are true: - - The cache has no consensus document. - - The cache's consensus document is no longer valid. - Otherwise, the cache downloads a new consensus document at a randomly - chosen time in the first half-interval after its current consensus - stops being fresh. (This time is chosen at random to avoid swarming - the authorities at the start of each period. The interval size is - inferred from the difference between the valid-after time and the - fresh-until time on the consensus.) - - [For example, if a cache has a consensus that became valid at 1:00, - and is fresh until 2:00, that cache will fetch a new consensus at - a random time between 2:00 and 2:30.] - -4.4. Downloading and storing router descriptors (authorities and caches) - - Periodically (currently, every 10 seconds), directory servers check - whether there are any specific descriptors that they do not have and that - they are not currently trying to download. Caches identify these - descriptors by hash in the recent network-status consensus documents; - authorities identify them by hash in vote (if publication date is more - recent than the descriptor we currently have). - - [XXXX need a way to fetch descriptors ahead of the vote? v2 status docs can - do that for now.] - - If so, the directory server launches requests to the authorities for these - descriptors, such that each authority is only asked for descriptors listed - in its most recent vote (if the requester is an authority) or in the - consensus (if the requester is a cache). If we're an authority, and more - than one authority lists the descriptor, we choose which to ask at random. 
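The request-splitting rule for an authority can be sketched as below. This is an illustration under stated assumptions, not Tor's code: `plan_requests` is a hypothetical name, each authority's most recent vote is assumed to be available as a set of descriptor digests, and descriptors that no known vote lists are simply skipped.

```python
import random
from typing import Dict, Iterable, List, Set


def plan_requests(missing: Iterable[str],
                  votes: Dict[str, Set[str]],
                  rng: random.Random) -> Dict[str, List[str]]:
    """Map each authority to the missing descriptor digests to request from it.

    votes maps an authority name to the digests listed in its most recent
    vote, so an authority is only ever asked for descriptors it listed.
    """
    plan: Dict[str, List[str]] = {}
    for digest in missing:
        # Sort so the random choice depends only on the rng state.
        listers = sorted(a for a, listed in votes.items() if digest in listed)
        if not listers:
            continue  # nobody we know of lists this descriptor
        # When several authorities list the descriptor, pick one at random.
        plan.setdefault(rng.choice(listers), []).append(digest)
    return plan
```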
- - If one of these downloads fails, we do not try to download that descriptor - from the authority that failed to serve it again unless we receive a newer - network-status (consensus or vote) from that authority that lists the same - descriptor. - - Directory servers must potentially cache multiple descriptors for each - router. Servers must not discard any descriptor listed by any recent - consensus. If there is enough space to store additional descriptors, - servers SHOULD try to hold those which clients are likely to download the - most. (Currently, this is judged based on the interval for which each - descriptor seemed newest.) -[XXXX define recent] - - Authorities SHOULD NOT download descriptors for routers that they would - immediately reject for reasons listed in 3.1. - -4.5. Downloading and storing extra-info documents - - All authorities, and any cache that chooses to cache extra-info documents, - and any client that uses extra-info documents, should implement this - section. - - Note that generally, clients don't need extra-info documents. - - Periodically, the Tor instance checks whether it is missing any extra-info - documents: in other words, if it has any router descriptors with an - extra-info-digest field that does not match any of the extra-info - documents currently held. If so, it downloads whatever extra-info - documents are missing. Caches download from authorities; non-caches try - to download from caches. We follow the same splitting and back-off rules - as in 4.4 (if a cache) or 5.3 (if a client). - -4.6. General-use HTTP URLs - - "Fingerprints" in these URLs are base-16-encoded SHA1 hashes. - - The most recent v3 consensus should be available at: - http://<hostname>/tor/status-vote/current/consensus.z - - Starting with Tor version 0.2.1.1-alpha, it is also available at: - http://<hostname>/tor/status-vote/current/consensus/<F1>+<F2>+<F3>.z - - Where F1, F2, etc. are authority identity fingerprints the client trusts.
- Servers will only return a consensus if more than half of the requested - authorities have signed the document; otherwise a 404 error will be sent - back. The fingerprints can be shortened to a length of any multiple of - two, using only the leftmost part of the encoded fingerprint. Tor uses - 3 bytes (6 hex characters) of the fingerprint. - - Clients SHOULD sort the fingerprints in ascending order. Servers MUST - accept any order. - - Clients SHOULD use this format when requesting consensus documents from - directory authority servers and from caches running a version of Tor - that is known to support this URL format. - - A concatenated set of all the current key certificates should be available - at: - http://<hostname>/tor/keys/all.z - - The key certificate for this server (if it is an authority) should be - available at: - http://<hostname>/tor/keys/authority.z - - The key certificate for an authority whose authority identity fingerprint - is <F> should be available at: - http://<hostname>/tor/keys/fp/<F>.z - - The key certificate whose signing key fingerprint is <F> should be - available at: - http://<hostname>/tor/keys/sk/<F>.z - - The key certificate whose identity key fingerprint is <F> and whose signing - key fingerprint is <S> should be available at: - - http://<hostname>/tor/keys/fp-sk/<F>-<S>.z - - (As usual, clients may request multiple certificates using: - http://<hostname>/tor/keys/fp-sk/<F1>-<S1>+<F2>-<S2>.z ) - [The above fp-sk format was not supported before Tor 0.2.1.9-alpha.] - - The most recent descriptor for a server whose identity key has a - fingerprint of <F> should be available at: - http://<hostname>/tor/server/fp/<F>.z - - The most recent descriptors for servers with identity fingerprints - <F1>,<F2>,<F3> should be available at: - http://<hostname>/tor/server/fp/<F1>+<F2>+<F3>.z - - (NOTE: Implementations SHOULD NOT download descriptors by identity key - fingerprint.
This allows a corrupted server (in collusion with a cache) to - provide a unique descriptor to a client, and thereby partition that client - from the rest of the network.) - - The server descriptor with (descriptor) digest <D> (in hex) should be - available at: - http://<hostname>/tor/server/d/<D>.z - - The most recent descriptors with digests <D1>,<D2>,<D3> should be - available at: - http://<hostname>/tor/server/d/<D1>+<D2>+<D3>.z - - The most recent descriptor for this server should be at: - http://<hostname>/tor/server/authority.z - [Nothing in the Tor protocol uses this resource yet, but it is useful - for debugging purposes. Also, the official Tor implementations - (starting at 0.1.1.x) use this resource to test whether a server's - own DirPort is reachable.] - - A concatenated set of the most recent descriptors for all known servers - should be available at: - http://<hostname>/tor/server/all.z - - Extra-info documents are available at the URLs - http://<hostname>/tor/extra/d/... - http://<hostname>/tor/extra/fp/... - http://<hostname>/tor/extra/all[.z] - http://<hostname>/tor/extra/authority[.z] - (As for /tor/server/ URLs: supports fetching extra-info - documents by their digest, by the fingerprint of their servers, - or all at once. When serving by fingerprint, we serve the - extra-info that corresponds to the descriptor we would serve by - that fingerprint. Only directory authorities of version - 0.2.0.1-alpha or later are guaranteed to support the first - three classes of URLs. Caches may support them, and MUST - support them if they have advertised "caches-extra-info".) - - For debugging, directories SHOULD expose non-compressed objects at URLs like - the above, but without the final ".z". - Clients MUST handle compressed concatenated information in two forms: - - A concatenated list of zlib-compressed objects. - - A zlib-compressed concatenated list of objects.
- Directory servers MAY generate either format: the former requires less - CPU, but the latter requires less bandwidth. - - Clients SHOULD use upper case letters (A-F) when base16-encoding - fingerprints. Servers MUST accept both upper and lower case fingerprints - in requests. - -5. Client operation: downloading information - - Every Tor that is not a directory server (that is, those that do - not have a DirPort set) implements this section. - -5.1. Downloading network-status documents - - Each client maintains a list of directory authorities. Insofar as - possible, clients SHOULD all use the same list. - - Clients try to have a live consensus network-status document at all times. - A network-status document is "live" if the time in its valid-until field - has not passed. - - If a client is missing a live network-status document, it tries to fetch - it from a directory cache (or from an authority if it knows no caches). - On failure, the client waits briefly, then tries that network-status - document again from another cache. The client does not build circuits - until it has a live network-status consensus document, and it has - descriptors for more than 1/4 of the routers that it believes are running. - - (Note: clients can and should pick caches based on the network-status - information they have: once they have first fetched network-status info - from an authority, they should not need to go to the authority directly - again.) - - To avoid swarming the caches whenever a consensus expires, the - clients download new consensuses at a randomly chosen time after the - caches are expected to have a fresh consensus, but before their - consensus will expire. (This time is chosen uniformly at random from - the interval between the time 3/4 into the first interval after the - consensus is no longer fresh, and 7/8 of the time remaining after - that before the consensus is invalid.) 
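The download window described above can be worked through in code. This is a sketch under assumptions: the names `client_fetch_window` and `pick_fetch_time` are hypothetical, the three times are taken from the consensus's valid-after, fresh-until, and valid-until fields, and the interval length is assumed to be fresh-until minus valid-after.

```python
import random
from datetime import datetime
from typing import Tuple


def client_fetch_window(valid_after: datetime,
                        fresh_until: datetime,
                        valid_until: datetime) -> Tuple[datetime, datetime]:
    """Return (earliest, latest) times at which a client may fetch."""
    interval = fresh_until - valid_after
    # Start 3/4 of the way into the first interval after freshness ends.
    start = fresh_until + interval * 3 / 4
    # End 7/8 of the way through the time remaining until expiry.
    end = start + (valid_until - start) * 7 / 8
    return start, end


def pick_fetch_time(valid_after: datetime,
                    fresh_until: datetime,
                    valid_until: datetime,
                    rng: random.Random) -> datetime:
    """Choose a fetch time uniformly at random inside the window."""
    start, end = client_fetch_window(valid_after, fresh_until, valid_until)
    return start + (end - start) * rng.random()
```

With a consensus valid after 1:00, fresh until 2:00, and valid until 4:00, the window starts at 2:45 and ends 65.625 minutes later, a little after 3:50.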
- - [For example, if a client has a consensus that became valid at 1:00, - and is fresh until 2:00, and expires at 4:00, that client will fetch - a new consensus at a random time between 2:45 and 3:50, since 3/4 - of the one-hour interval is 45 minutes, and 7/8 of the remaining 75 - minutes is 65 minutes.] - -5.2. Downloading and storing router descriptors - - Clients try to have the best descriptor for each router. A descriptor is - "best" if: - * It is listed in the consensus network-status document. - - Periodically (currently every 10 seconds) clients check whether there are - any "downloadable" descriptors. A descriptor is downloadable if: - - It is the "best" descriptor for some router. - - The descriptor was published at least 10 minutes in the past. - (This prevents clients from trying to fetch descriptors that the - mirrors have probably not yet retrieved and cached.) - - The client does not currently have it. - - The client is not currently trying to download it. - - The client would not discard it immediately upon receiving it. - - The client thinks it is running and valid (see 6.1 below). - - If at least 16 known routers have downloadable descriptors, or if - enough time (currently 10 minutes) has passed since the last time the - client tried to download descriptors, it launches requests for all - downloadable descriptors, as described in 5.3 below. - - When a descriptor download fails, the client notes it, and does not - consider the descriptor downloadable again until a certain amount of time - has passed. (Currently 0 seconds for the first failure, 60 seconds for the - second, 5 minutes for the third, 10 minutes for the fourth, and 1 day - thereafter.) Periodically (currently once an hour) clients reset the - failure count. - - Clients retain the most recent descriptor they have downloaded for each - router so long as it is not too old (currently, 48 hours), OR so long as - no better descriptor has been downloaded for the same router.
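The retry schedule and launch condition described in this section can be sketched as follows. The names `retry_delay` and `should_launch` are hypothetical, and all durations are expressed in seconds; the values themselves are the "currently" figures quoted above.

```python
# Delays after the 1st, 2nd, 3rd, and 4th failure, in seconds.
RETRY_DELAYS = [0, 60, 5 * 60, 10 * 60]
ONE_DAY = 24 * 60 * 60

MIN_DOWNLOADABLE = 16          # routers with downloadable descriptors
MAX_WAIT = 10 * 60             # seconds since the last download attempt


def retry_delay(failures: int) -> int:
    """Seconds to wait before a descriptor is downloadable again."""
    if failures < 1:
        raise ValueError("failures must be at least 1")
    if failures <= len(RETRY_DELAYS):
        return RETRY_DELAYS[failures - 1]
    return ONE_DAY  # fifth and later failures


def should_launch(downloadable_count: int, seconds_since_last_try: int) -> bool:
    """Launch requests if enough descriptors are pending or enough time passed."""
    return (downloadable_count >= MIN_DOWNLOADABLE
            or seconds_since_last_try >= MAX_WAIT)
```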
- - [Versions of Tor before 0.1.2.3-alpha would discard descriptors simply for - being published too far in the past.] [The code seems to discard - descriptors in all cases after they're 5 days old. True? -RD] - -5.3. Managing downloads - - When a client has no consensus network-status document, it downloads it - from a randomly chosen authority. In all other cases, the client - downloads from caches randomly chosen from among those believed to be V2 - directory servers. (This information comes from the network-status - documents; see 6 below.) - - When downloading multiple router descriptors, the client chooses multiple - mirrors so that: - - At least 3 different mirrors are used, except when this would result - in more than one request for under 4 descriptors. - - No more than 128 descriptors are requested from a single mirror. - - Otherwise, as few mirrors as possible are used. - After choosing mirrors, the client divides the descriptors among them - randomly. - - After receiving any response, the client MUST discard any network-status - documents and descriptors that it did not request. - -6. Using directory information - - Everyone besides directory authorities uses the approaches in this section - to decide which servers to use and what their keys are likely to be. - (Directory authorities just believe their own opinions, as in 3.1 above.) - -6.1. Choosing routers for circuits. - - Circuits SHOULD NOT be built until the client has enough directory - information: a live consensus network status [XXXX fallback?] and - descriptors for at least 1/4 of the servers believed to be running. - - A server is "listed" if it is included by the consensus network-status - document. Clients SHOULD NOT use unlisted servers. - - These flags are used as follows: - - - Clients SHOULD NOT use non-'Valid' or non-'Running' routers unless - requested to do so.
- - - Clients SHOULD NOT use non-'Fast' routers for any purpose other than - very-low-bandwidth circuits (such as introduction circuits). - - - Clients SHOULD NOT use non-'Stable' routers for circuits that are - likely to need to be open for a very long time (such as those used for - IRC or SSH connections). - - - Clients SHOULD NOT choose non-'Guard' nodes when picking entry guard - nodes. - - - Clients SHOULD NOT download directory information from non-'V2Dir' - caches. - - See the "path-spec.txt" document for more details. - -6.2. Managing naming - - In order to provide human-memorable names for individual server - identities, some directory servers bind names to IDs. Clients handle - names in two ways: - - When a client encounters a name it has not mapped before: - - If the consensus lists any router with that name as "Named", or if - consensus-method 2 or later is in use and the consensus lists any - router with that name as having the "Unnamed" flag, then the name is - bound. (It's bound to the ID listed in the entry with the Named, - or to an unknown ID if no name is found.) - - When the user refers to a bound name, the implementation SHOULD provide - only the router with ID bound to that name, and no other router, even - if the router with the right ID can't be found. - - When a user tries to refer to a non-bound name, the implementation SHOULD - warn the user. After warning the user, the implementation MAY use any - router that advertises the name. - - Not every router needs a nickname. When a router doesn't configure a - nickname, it publishes with the default nickname "Unnamed". Authorities - SHOULD NOT ever mark a router with this nickname as Named; client software - SHOULD NOT ever use a router in response to a user request for a router - called "Unnamed". - -6.3. Software versions - - An implementation of Tor SHOULD warn when it has fetched a consensus - network-status, and it is running a software version not listed. - -6.4. 
Warning about a router's status. - - If a router tries to publish its descriptor to a Naming authority - that has its nickname mapped to another key, the router SHOULD - warn the operator that it is either using the wrong key or is using - an already claimed nickname. - - If a router has fetched a consensus document, and the - authorities do not publish a binding for the router's nickname, the - router MAY remind the operator that the chosen nickname is not - bound to this key at the authorities, and suggest contacting the - authority operators. - - ... - -6.5. Router protocol versions - - A client should believe that a router supports a given feature if that - feature is supported by the router or protocol versions in more than half - of the live networkstatuses' "v" entries for that router. In other words, - if the "v" entries for some router are: - v Tor 0.0.8pre1 (from authority 1) - v Tor 0.1.2.11 (from authority 2) - v FutureProtocolDescription 99 (from authority 3) - then the client should believe that the router supports any feature - supported by 0.1.2.11. - - This is currently equivalent to believing the median declared version for - a router in all live networkstatuses. - -7. Standards compliance - - All clients and servers MUST support HTTP 1.0. Clients and servers MAY - support later versions of HTTP as well. - -7.1. HTTP headers - - Servers MAY set the Content-Length: header. Servers SHOULD set - Content-Encoding to "deflate" or "identity". - - Servers MAY include an X-Your-Address-Is: header, whose value is the - apparent IP address of the client connecting to them (as a dotted quad). - For directory connections tunneled over a BEGIN_DIR stream, servers SHOULD - report the IP from which the circuit carrying the BEGIN_DIR stream reached - them. [Servers before version 0.1.2.5-alpha reported 127.0.0.1 for all - BEGIN_DIR-tunneled connections.] - - Servers SHOULD disable caching of multiple network statuses or multiple - router descriptors.
Servers MAY enable caching of single descriptors, - single network statuses, the list of all router descriptors, a v1 - directory, or a v1 running routers document. XXX mention times. - -7.2. HTTP status codes - - Tor delivers the following status codes. Some were chosen without much - thought; other code SHOULD NOT rely on specific status codes yet. - - 200 -- the operation completed successfully - -- the user requested statuses or serverdescs, and none of the ones we - requested were found (0.2.0.4-alpha and earlier). - - 304 -- the client specified an if-modified-since time, and none of the - requested resources have changed since that time. - - 400 -- the request is malformed, or - -- the URL is for a malformed variation of one of the URLs we support, - or - -- the client tried to post to a non-authority, or - -- the authority rejected a malformed posted document, or - - 404 -- the requested document was not found. - -- the user requested statuses or serverdescs, and none of the ones - requested were found (0.2.0.5-alpha and later). - - 503 -- we are declining the request in order to save bandwidth - -- user requested some items that we ordinarily generate or store, - but we do not have any available. - -9. Backward compatibility and migration plans - - Until Tor versions before 0.1.1.x are completely obsolete, directory - authorities should generate, and mirrors should download and cache, v1 - directories and running-routers lists, and allow old clients to download - them. These documents and the rules for retrieving, serving, and caching - them are described in dir-spec-v1.txt. - - Until Tor versions before 0.2.0.x are completely obsolete, directory - authorities should generate, mirrors should download and cache, v2 - network-status documents, and allow old clients to download them. - Additionally, all directory servers and caches should download, store, and - serve any router descriptor that is required because of v2 network-status - documents. 
These documents and the rules for retrieving, serving, and - caching them are described in dir-spec-v2.txt. - -A. Consensus-negotiation timeline. - - Period begins: this is the Published time. - Everybody sends votes - Reconciliation: everybody tries to fetch missing votes. - consensus may exist at this point. - End of voting period: - everyone swaps signatures. - Now it's okay for caches to download - Now it's okay for clients to download. - - Valid-after/valid-until switchover - diff --git a/doc/spec/path-spec.txt b/doc/spec/path-spec.txt deleted file mode 100644 index 7c313f8ab0..0000000000 --- a/doc/spec/path-spec.txt +++ /dev/null @@ -1,657 +0,0 @@ - - Tor Path Specification - - Roger Dingledine - Nick Mathewson - -Note: This is an attempt to specify Tor as currently implemented. Future -versions of Tor will implement improved algorithms. - -This document tries to cover how Tor chooses to build circuits and assign -streams to circuits. Other implementations MAY take other approaches, but -implementors should be aware of the anonymity and load-balancing implications -of their choices. - - THIS SPEC ISN'T DONE YET. - - The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL - NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and - "OPTIONAL" in this document are to be interpreted as described in - RFC 2119. - -1. General operation - - Tor begins building circuits as soon as it has enough directory - information to do so (see section 5 of dir-spec.txt). Some circuits are - built preemptively because we expect to need them later (for user - traffic), and some are built because of immediate need (for user traffic - that no current circuit can handle, for testing the network or our - reachability, and so on). - - When a client application creates a new stream (by opening a SOCKS - connection or launching a resolve request), we attach it to an appropriate - open circuit if one exists, or wait if an appropriate circuit is - in-progress. 
We launch a new circuit only - if no current circuit can handle the request. We rotate circuits over - time to avoid some profiling attacks. - - To build a circuit, we choose all the nodes we want to use, and then - construct the circuit. Sometimes, when we want a circuit that ends at a - given hop, and we have an appropriate unused circuit, we "cannibalize" the - existing circuit and extend it to the new terminus. - - These processes are described in more detail below. - - This document describes Tor's automatic path selection logic only; path - selection can be overridden by a controller (with the EXTENDCIRCUIT and - ATTACHSTREAM commands). Paths constructed through these means may - violate some constraints given below. - -1.1. Terminology - - A "path" is an ordered sequence of nodes, not yet built as a circuit. - - A "clean" circuit is one that has not yet been used for any traffic. - - A "fast" or "stable" or "valid" node is one that has the 'Fast' or - 'Stable' or 'Valid' flag - set respectively, based on our current directory information. A "fast" - or "stable" circuit is one consisting only of "fast" or "stable" nodes. - - In an "exit" circuit, the final node is chosen based on waiting stream - requests if any, and in any case it avoids nodes with exit policy of - "reject *:*". An "internal" circuit, on the other hand, is one where - the final node is chosen just like a middle node (ignoring its exit - policy). - - A "request" is a client-side stream or DNS resolve that needs to be - served by a circuit. - - A "pending" circuit is one that we have started to build, but which has - not yet completed. - - A circuit or path "supports" a request if it is okay to use the - circuit/path to fulfill the request, according to the rules given below. - A circuit or path "might support" a request if some aspect of the request - is unknown (usually its target IP), but we believe the path probably - supports the request according to the rules given below. - -1.1. 
A server's bandwidth - - Old versions of Tor did not report bandwidths in network status - documents, so clients had to learn them from the routers' advertised - server descriptors. - - For versions of Tor prior to 0.2.1.17-rc, everywhere below where we - refer to a server's "bandwidth", we mean its clipped advertised - bandwidth, computed by taking the smaller of the 'rate' and - 'observed' arguments to the "bandwidth" element in the server's - descriptor. If a router's advertised bandwidth is greater than - MAX_BELIEVABLE_BANDWIDTH (currently 10 MB/s), we clipped to that - value. - - For more recent versions of Tor, we take the bandwidth value declared - in the consensus, and fall back to the clipped advertised bandwidth - only if the consensus does not have bandwidths listed. - -2. Building circuits - -2.1. When we build - -2.1.1. Clients build circuits preemptively - - When running as a client, Tor tries to maintain at least a certain - number of clean circuits, so that new streams can be handled - quickly. To increase the likelihood of success, Tor tries to - predict what circuits will be useful by choosing from among nodes - that support the ports we have used in the recent past (by default - one hour). Specifically, on startup Tor tries to maintain one clean - fast exit circuit that allows connections to port 80, and at least - two fast clean stable internal circuits in case we get a resolve - request or hidden service request (at least three if we _run_ a - hidden service). - - After that, Tor will adapt the circuits that it preemptively builds - based on the requests it sees from the user: it tries to have two fast - clean exit circuits available for every port seen within the past hour - (each circuit can be adequate for many predicted ports -- it doesn't - need two separate circuits for each port), and it tries to have the - above internal circuits available if we've seen resolves or hidden - service activity within the past hour. 
If there are 12 or more clean - circuits open, it doesn't open more even if it has more predictions. - - Only stable circuits can "cover" a port that is listed in the - LongLivedPorts config option. Similarly, hidden service requests - to ports listed in LongLivedPorts make us create stable internal - circuits. - - Note that if there are no requests from the user for an hour, Tor - will predict no use and build no preemptive circuits. - - The Tor client SHOULD NOT store its list of predicted requests to a - persistent medium. - -2.1.2. Clients build circuits on demand - - Additionally, when a client request exists that no circuit (built or - pending) might support, we create a new circuit to support the request. - For exit connections, we pick an exit node that will handle the - most pending requests (choosing arbitrarily among ties), launch a - circuit to end there, and repeat until every unattached request - might be supported by a pending or built circuit. For internal - circuits, we pick an arbitrary acceptable path, repeating as needed. - - In some cases we can reuse an already established circuit if it's - clean; see Section 2.3 (cannibalizing circuits) for details. - -2.1.3. Servers build circuits for testing reachability and bandwidth - - Tor servers test reachability of their ORPort once they have - successfully built a circuit (on start and whenever their IP address - changes). They build an ordinary fast internal circuit with themselves - as the last hop. As soon as any testing circuit succeeds, the Tor - server decides it's reachable and is willing to publish a descriptor. - - We launch multiple testing circuits (one at a time), until we - have NUM_PARALLEL_TESTING_CIRC (4) such circuits open. Then we - do a "bandwidth test" by sending a certain number of relay drop - cells down each circuit: BandwidthRate * 10 / CELL_NETWORK_SIZE - total cells divided across the four circuits, but never more than - CIRCWINDOW_START (1000) cells total. 
This exercises both outgoing and - incoming bandwidth, and helps to jumpstart the observed bandwidth - (see dir-spec.txt). - - Tor servers also test reachability of their DirPort once they have - established a circuit, but they use an ordinary exit circuit for - this purpose. - -2.1.4. Hidden-service circuits - - See section 4 below. - -2.1.5. Rate limiting of failed circuits - - If we fail to build a circuit N times in a X second period (see Section - 2.3 for how this works), we stop building circuits until the X seconds - have elapsed. - XXXX - -2.1.6. When to tear down circuits - - XXXX - - -2.2. Path selection and constraints - - We choose the path for each new circuit before we build it. We choose the - exit node first, followed by the other nodes in the circuit. All paths - we generate obey the following constraints: - - We do not choose the same router twice for the same path. - - We do not choose any router in the same family as another in the same - path. - - We do not choose more than one router in a given /16 subnet - (unless EnforceDistinctSubnets is 0). - - We don't choose any non-running or non-valid router unless we have - been configured to do so. By default, we are configured to allow - non-valid routers in "middle" and "rendezvous" positions. - - If we're using Guard nodes, the first node must be a Guard (see 5 - below) - - XXXX Choosing the length - - For "fast" circuits, we only choose nodes with the Fast flag. For - non-"fast" circuits, all nodes are eligible. - - For all circuits, we weight node selection according to router bandwidth. - - We also weight the bandwidth of Exit and Guard flagged nodes depending on - the fraction of total bandwidth that they make up and depending upon the - position they are being selected for. - - These weights are published in the consensus, and are computed as described - in Section 3.4.3 of dir-spec.txt. 
They are: - - Wgg - Weight for Guard-flagged nodes in the guard position - Wgm - Weight for non-flagged nodes in the guard position - Wgd - Weight for Guard+Exit-flagged nodes in the guard position - - Wmg - Weight for Guard-flagged nodes in the middle position - Wmm - Weight for non-flagged nodes in the middle position - Wme - Weight for Exit-flagged nodes in the middle position - Wmd - Weight for Guard+Exit-flagged nodes in the middle position - - Weg - Weight for Guard-flagged nodes in the exit position - Wem - Weight for non-flagged nodes in the exit position - Wee - Weight for Exit-flagged nodes in the exit position - Wed - Weight for Guard+Exit-flagged nodes in the exit position - - Wgb - Weight for BEGIN_DIR-supporting Guard-flagged nodes - Wmb - Weight for BEGIN_DIR-supporting non-flagged nodes - Web - Weight for BEGIN_DIR-supporting Exit-flagged nodes - Wdb - Weight for BEGIN_DIR-supporting Guard+Exit-flagged nodes - - Wbg - Weight for Guard-flagged nodes for BEGIN_DIR requests - Wbm - Weight for non-flagged nodes for BEGIN_DIR requests - Wbe - Weight for Exit-flagged nodes for BEGIN_DIR requests - Wbd - Weight for Guard+Exit-flagged nodes for BEGIN_DIR requests - - Additionally, we may be building circuits with one or more requests in - mind. Each kind of request puts certain constraints on paths: - - - All service-side introduction circuits and all rendezvous paths - should be Stable. - - All connection requests for connections that we think will need to - stay open a long time require Stable circuits. Currently, Tor decides - this by examining the request's target port, and comparing it to a - list of "long-lived" ports. (Default: 21, 22, 706, 1863, 5050, - 5190, 5222, 5223, 6667, 6697, 8300.) - - DNS resolves require an exit node whose exit policy is not equivalent - to "reject *:*". - - Reverse DNS resolves require a version of Tor with advertised eventdns - support (available in Tor 0.1.2.1-alpha-dev and later). 
- - All connection requests require an exit node whose exit policy - supports their target address and port (if known), or which "might - support it" (if the address isn't known). See 2.2.1. - - Rules for Fast? XXXXX - -2.2.1. Choosing an exit - - If we know what IP address we want to connect to or resolve, we can - trivially tell whether a given router will support it by simulating - its declared exit policy. - - Because we often connect to addresses of the form hostname:port, we do not - always know the target IP address when we select an exit node. In these - cases, we need to pick an exit node that "might support" connections to a - given address port with an unknown address. An exit node "might support" - such a connection if any clause that accepts any connections to that port - precedes all clauses (if any) that reject all connections to that port. - - Unless requested to do so by the user, we never choose an exit server - flagged as "BadExit" by more than half of the authorities who advertise - themselves as listing bad exits. - -2.2.2. User configuration - - Users can alter the default behavior for path selection with configuration - options. - - - If "ExitNodes" is provided, then every request requires an exit node on - the ExitNodes list. (If a request is supported by no nodes on that list, - and StrictExitNodes is false, then Tor treats that request as if - ExitNodes were not provided.) - - - "EntryNodes" and "StrictEntryNodes" behave analogously. - - - If a user tries to connect to or resolve a hostname of the form - <target>.<servername>.exit, the request is rewritten to a request for - <target>, and the request is only supported by the exit whose nickname - or fingerprint is <servername>. - -2.3. Cannibalizing circuits - - If we need a circuit and have a clean one already established, in - some cases we can adapt the clean circuit for our new - purpose. 
Specifically, - - For hidden service interactions, we can "cannibalize" a clean internal - circuit if one is available, so we don't need to build those circuits - from scratch on demand. - - We can also cannibalize clean circuits when the client asks to exit - at a given node -- either via the ".exit" notation or because the - destination is running at the same location as an exit node. - -2.4. Learning when to give up ("timeout") on circuit construction - - Since version 0.2.2.8-alpha, Tor attempts to learn when to give up on - circuits based on network conditions. - -2.4.1 Distribution choice and parameter estimation - - Based on studies of build times, we found that the distribution of - circuit build times appears to be a Frechet distribution. However, - estimators and quantile functions of the Frechet distribution are - difficult to work with and slow to converge. So instead, since we - are only interested in the accuracy of the tail, we approximate - the tail of the distribution with a Pareto curve. - - We calculate the parameters for a Pareto distribution fitting the data - using the estimators in equation 4 from: - http://portal.acm.org/citation.cfm?id=1647962.1648139 - - This is: - - alpha_m = s/(ln(U(X)/Xm^n)) - - where s is the total number of completed circuits we have seen, and - - U(X) = x_max^u * Prod_s{x_i} - - with x_i as our i-th completed circuit time, x_max as the longest - completed circuit build time we have yet observed, u as the - number of unobserved timeouts that have no exact value recorded, - and n as u+s, the total number of circuits that either timeout or - complete. 
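As an illustration only (not part of the deleted spec text, and not Tor's actual implementation), the estimator defined above can be sketched in Python. The function name `pareto_alpha` and its argument names are hypothetical. The sketch expands ln(U(X)/Xm^n) with the usual log laws so that the product of build times is never formed directly, and it assumes every recorded time is at least Xm.

```python
import math


def pareto_alpha(build_times, num_censored, xm):
    """Right-censored Pareto shape estimate alpha_m = s / ln(U(X) / Xm^n).

    build_times:  completed circuit build times x_i (s = how many there are)
    num_censored: u, timeouts with no exact value recorded
    xm:           Xm, the Pareto scale parameter, in the same unit as x_i

    Hypothetical helper for illustration; assumes every x_i >= xm > 0.
    """
    s = len(build_times)
    if s == 0:
        raise ValueError("need at least one completed circuit")
    u = num_censored
    n = u + s
    x_max = max(build_times)
    # ln(U(X) / Xm^n) = u*ln(x_max) + Sum_s{ln(x_i)} - n*ln(Xm)
    log_ratio = (
        u * math.log(x_max)
        + sum(math.log(x) for x in build_times)
        - n * math.log(xm)
    )
    if log_ratio <= 0:
        raise ValueError("samples do not exceed Xm; alpha is undefined")
    return s / log_ratio
```

For example, `pareto_alpha([100, 200, 400], 0, 100)` evaluates 3/ln(8), since the log term reduces to ln(2) + ln(4).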
- - Using log laws, we compute this as the sum of logs to avoid - overflow and ln(1.0+epsilon) precision issues: - - alpha_m = s/(u*ln(x_max) + Sum_s{ln(x_i)} - n*ln(Xm)) - - This estimator is closely related to the parameters present in: - http://en.wikipedia.org/wiki/Pareto_distribution#Parameter_estimation - except they are adjusted to handle the fact that our samples are - right-censored at the timeout cutoff. - - Additionally, because this is not a true Pareto distribution, we alter - how Xm is computed. The Xm parameter is computed as the midpoint of the most - frequently occurring 50ms histogram bin, until the point where 1000 - circuits are recorded. After this point, the weighted average of the top - 'cbtnummodes' (default: 3) midpoint modes is used as Xm. All times below - this value are counted as having the midpoint value of this weighted average bin. - - The timeout itself is calculated by using the Pareto Quantile function (the - inverted CDF) to give us the value on the CDF such that 80% of the mass - of the distribution is below the timeout value. - - Thus, we expect that the Tor client will accept the fastest 80% of - the total number of paths on the network. - -2.4.2. How much data to record - - From our observations, the minimum number of circuit build times for a - reasonable fit appears to be on the order of 100. However, to keep a - good fit over the long term, we store 1000 most recent circuit build times - in a circular array. - - The Tor client should build test circuits at a rate of one per - minute up until 100 circuits are built. This allows a fresh Tor to have - a CircuitBuildTimeout estimated within 1.5 hours after install, - upgrade, or network change (see below). - - Timeouts are stored on disk in a histogram of 50ms bin width, the same - width used to calculate the Xm value above. This histogram must be shuffled - after being read from disk, to preserve a proper expiration of old values - after restart. - -2.4.3. 
How to record timeouts - - Circuits that pass the timeout threshold should be allowed to continue - building until a time corresponding to the point 'cbtclosequantile' - (default 95) on the Pareto curve, or 60 seconds, whichever is greater. - - The actual completion times for these circuits should be recorded. - Implementations should completely abandon a circuit and record a value - as an 'unknown' timeout if the total build time exceeds this threshold. - - The reason for this is that right-censored pareto estimators begin to lose - their accuracy if more than approximately 5% of the values are censored. - Since we wish to set the cutoff at 20%, we must allow circuits to continue - building past this cutoff point up to the 95th percentile. - -2.4.4. Detecting Changing Network Conditions - - We attempt to detect both network connectivity loss and drastic - changes in the timeout characteristics. - - We assume that we've had network connectivity loss if 3 circuits - timeout and we've received no cells or TLS handshakes since those - circuits began. We then temporarily set the timeout to 60 seconds - and stop counting timeouts. - - If 3 more circuits timeout and the network still has not been - live within this new 60 second timeout window, we then discard - the previous timeouts during this period from our history. - - To detect changing network conditions, we keep a history of - the timeout or non-timeout status of the past 20 circuits that - successfully completed at least one hop. If more than 90% of - these circuits timeout, we discard all buildtimes history, reset - the timeout to 60, and then begin recomputing the timeout. - - If the timeout was already 60 or higher, we double the timeout. - -2.4.5. 
Consensus parameters governing behavior - - Clients that implement circuit build timeout learning should obey the - following consensus parameters that govern behavior, in order to allow - us to handle bugs or other emergent behaviors due to client circuit - construction. If these parameters are not present in the consensus, - the listed default values should be used instead. - - cbtdisabled - Default: 0 - Min: 0 - Max: 1 - Effect: If 1, all CircuitBuildTime learning code should be - disabled and history should be discarded. For use in - emergency situations only. - - cbtnummodes - Default: 3 - Min: 1 - Max: 20 - Effect: This value governs how many modes to use in the weighted - average calculation of Pareto parameter Xm. A value of 3 introduces - some bias (2-5% of CDF) under ideal conditions, but allows for better - performance in the event that a client chooses guard nodes of radically - different performance characteristics. - - cbtrecentcount - Default: 20 - Min: 3 - Max: 1000 - Effect: This is the number of circuit build times to keep track of - for the following option. - - cbtmaxtimeouts - Default: 18 - Min: 3 - Max: 10000 - Effect: When this many timeouts happen in the last 'cbtrecentcount' - circuit attempts, the client should discard all of its - history and begin learning a fresh timeout value. - - cbtmincircs - Default: 100 - Min: 1 - Max: 10000 - Effect: This is the minimum number of circuits to build before - computing a timeout. - - cbtquantile - Default: 80 - Min: 10 - Max: 99 - Effect: This is the position on the quantile curve to use to set the - timeout value. It is a percent (10-99). - - cbtclosequantile - Default: 95 - Min: Value of cbtquantile parameter - Max: 99 - Effect: This is the position on the quantile curve to use to set the - timeout value to use to actually close circuits. It is a percent - (0-99). 
- - cbttestfreq - Default: 60 - Min: 1 - Max: 2147483647 (INT32_MAX) - Effect: Describes how often in seconds to build a test circuit to - gather timeout values. Only applies if less than 'cbtmincircs' - have been recorded. - - cbtmintimeout - Default: 2000 - Min: 500 - Max: 2147483647 (INT32_MAX) - Effect: This is the minimum allowed timeout value in milliseconds. - The minimum is to prevent rounding to 0 (we only check once - per second). - - cbtinitialtimeout - Default: 60000 - Min: Value of cbtmintimeout - Max: 2147483647 (INT32_MAX) - Effect: This is the timeout value to use before computing a timeout, - in milliseconds. - - -2.5. Handling failure - - If an attempt to extend a circuit fails (either because the first create - failed or a subsequent extend failed) then the circuit is torn down and is - no longer pending. (XXXX really?) Requests that might have been - supported by the pending circuit thus become unsupported, and a new - circuit needs to be constructed. - - If a stream "begin" attempt fails with an EXITPOLICY error, we - decide that the exit node's exit policy is not correctly advertised, - so we treat the exit node as if it were a non-exit until we retrieve - a fresh descriptor for it. - - XXXX - -3. Attaching streams to circuits - - When a circuit that might support a request is built, Tor tries to attach - the request's stream to the circuit and sends a BEGIN, BEGIN_DIR, - or RESOLVE relay - cell as appropriate. If the request completes unsuccessfully, Tor - considers the reason given in the CLOSE relay cell. [XXX yes, and?] - - - After a request has remained unattached for SocksTimeout (2 minutes - by default), Tor abandons the attempt and signals an error to the - client as appropriate (e.g., by closing the SOCKS connection). - - XXX Timeouts and when Tor auto-retries. - * What stream-end-reasons are appropriate for retrying. - - If no reply to BEGIN/RESOLVE, then the stream will timeout and fail. - -4. 
Hidden-service related circuits - - XXX Tracking expected hidden service use (client-side and hidserv-side) - -5. Guard nodes - - We use Guard nodes (also called "helper nodes" in the literature) to - prevent certain profiling attacks. Here's the risk: if we choose entry and - exit nodes at random, and an attacker controls C out of N servers - (ignoring bandwidth), then the - attacker will control the entry and exit node of any given circuit with - probability (C/N)^2. But as we make many different circuits over time, - then the probability that the attacker will see a sample of about (C/N)^2 - of our traffic goes to 1. Since statistical sampling works, the attacker - can be sure of learning a profile of our behavior. - - If, on the other hand, we picked an entry node and held it fixed, we would - have probability C/N of choosing a bad entry and being profiled, and - probability (N-C)/N of choosing a good entry and not being profiled. - - When guard nodes are enabled, Tor maintains an ordered list of entry nodes - as our chosen guards, and stores this list persistently to disk. If a Guard - node becomes unusable, rather than replacing it, Tor adds new guards to the - end of the list. When choosing the first hop of a circuit, Tor - chooses at - random from among the first NumEntryGuards (default 3) usable guards on the - list. If there are not at least 2 usable guards on the list, Tor adds - routers until there are, or until there are no more usable routers to add. - - A guard is unusable if any of the following hold: - - it is not marked as a Guard by the networkstatuses, - - it is not marked Valid (and the user hasn't set AllowInvalid entry) - - it is not marked Running - - Tor couldn't reach it the last time it tried to connect - - A guard is unusable for a particular circuit if any of the rules for path - selection in 2.2 are not met. 
In particular, if the circuit is "fast" - and the guard is not Fast, or if the circuit is "stable" and the guard is - not Stable, or if the guard has already been chosen as the exit node in - that circuit, Tor can't use it as a guard node for that circuit. - - If the guard is excluded because of its status in the networkstatuses for - over 30 days, Tor removes it from the list entirely, preserving order. - - If Tor fails to connect to an otherwise usable guard, it retries - periodically: every hour for six hours, every 4 hours for 3 days, every - 18 hours for a week, and every 36 hours thereafter. Additionally, Tor - retries unreachable guards the first time it adds a new guard to the list, - since it is possible that the old guards were only marked as unreachable - because the network was unreachable or down. - - Tor does not add a guard persistently to the list until the first time we - have connected to it successfully. - -6. Router descriptor purposes - - There are currently three "purposes" supported for router descriptors: - general, controller, and bridge. Most descriptors are of type general - -- these are the ones listed in the consensus, and the ones fetched - and used in normal cases. - - Controller-purpose descriptors are those delivered by the controller - and labelled as such: they will be kept around (and expire like - normal descriptors), and they can be used by the controller in its - CIRCUITEXTEND commands. Otherwise they are ignored by Tor when it - chooses paths. - - Bridge-purpose descriptors are for routers that are used as bridges. See - doc/design-paper/blocking.pdf for more design explanation, or proposal - 125 for specific details. Currently bridge descriptors are used in place - of normal entry guards, for Tor clients that have UseBridges enabled. - - -X. Old notes - -X.1. Do we actually do this? - -How to deal with network down. 
- - While all helpers are down/unreachable and there are no established - or on-the-way testing circuits, launch a testing circuit. (Do this - periodically in the same way we try to establish normal circuits - when things are working normally.) - (Testing circuits are a special type of circuit, that streams won't - attach to by accident.) - - When a testing circuit succeeds, mark all helpers up and hold - the testing circuit open. - - If a connection to a helper succeeds, close all testing circuits. - Else mark that helper down and try another. - - If the last helper is marked down and we already have a testing - circuit established, then add the first hop of that testing circuit - to the end of our helper node list, close that testing circuit, - and go back to square one. (Actually, rather than closing the - testing circuit, can we get away with converting it to a normal - circuit and beginning to use it immediately?) - - [Do we actually do any of the above? If so, let's spec it. If not, let's - remove it. -NM] - -X.2. A thing we could do to deal with reachability. - -And as a bonus, it leads to an answer to Nick's attack ("If I pick -my helper nodes all on 18.0.0.0:*, then I move, you'll know where I -bootstrapped") -- the answer is to pick your original three helper nodes -without regard for reachability. Then the above algorithm will add some -more that are reachable for you, and if you move somewhere, it's more -likely (though not certain) that some of the originals will become useful. -Is that smart or just complex? - -X.3. Some stuff that worries me about entry guards. 2006 Jun, Nickm. - - It is unlikely for two users to have the same set of entry guards. - Observing a user is sufficient to learn its entry guards. So, as we move - around, entry guards make us linkable. If we want to change guards when - our location (IP? subnet?) changes, we have two bad options. We could - - Drop the old guards. 
But if we go back to our old location, - we'll not use our old guards. For a laptop that sometimes gets used - from work and sometimes from home, this is pretty fatal. - - Remember the old guards as associated with the old location, and use - them again if we ever go back to the old location. This would be - nasty, since it would force us to record where we've been. - - [Do we do any of this now? If not, this should move into 099-misc or - 098-todo. -NM] - diff --git a/doc/spec/proposals/000-index.txt b/doc/spec/proposals/000-index.txt deleted file mode 100644 index f6f313e58d..0000000000 --- a/doc/spec/proposals/000-index.txt +++ /dev/null @@ -1,188 +0,0 @@ -Filename: 000-index.txt -Title: Index of Tor Proposals -Author: Nick Mathewson -Created: 26-Jan-2007 -Status: Meta - -Overview: - - This document provides an index to Tor proposals. - - This is an informational document. - - Everything in this document below the line of '=' signs is automatically - generated by reindex.py; do not edit by hand. 
- -============================================================ -Proposals by number: - -000 Index of Tor Proposals [META] -001 The Tor Proposal Process [META] -098 Proposals that should be written [META] -099 Miscellaneous proposals [META] -100 Tor Unreliable Datagram Extension Proposal [DEAD] -101 Voting on the Tor Directory System [CLOSED] -102 Dropping "opt" from the directory format [CLOSED] -103 Splitting identity key from regularly used signing key [CLOSED] -104 Long and Short Router Descriptors [CLOSED] -105 Version negotiation for the Tor protocol [CLOSED] -106 Checking fewer things during TLS handshakes [CLOSED] -107 Uptime Sanity Checking [CLOSED] -108 Base "Stable" Flag on Mean Time Between Failures [CLOSED] -109 No more than one server per IP address [CLOSED] -110 Avoiding infinite length circuits [ACCEPTED] -111 Prioritizing local traffic over relayed traffic [CLOSED] -112 Bring Back Pathlen Coin Weight [SUPERSEDED] -113 Simplifying directory authority administration [SUPERSEDED] -114 Distributed Storage for Tor Hidden Service Descriptors [CLOSED] -115 Two Hop Paths [DEAD] -116 Two hop paths from entry guards [DEAD] -117 IPv6 exits [ACCEPTED] -118 Advertising multiple ORPorts at once [ACCEPTED] -119 New PROTOCOLINFO command for controllers [CLOSED] -120 Shutdown descriptors when Tor servers stop [DEAD] -121 Hidden Service Authentication [FINISHED] -122 Network status entries need a new Unnamed flag [CLOSED] -123 Naming authorities automatically create bindings [CLOSED] -124 Blocking resistant TLS certificate usage [SUPERSEDED] -125 Behavior for bridge users, bridge relays, and bridge authorities [CLOSED] -126 Getting GeoIP data and publishing usage summaries [CLOSED] -127 Relaying dirport requests to Tor download site / website [DRAFT] -128 Families of private bridges [DEAD] -129 Block Insecure Protocols by Default [CLOSED] -130 Version 2 Tor connection protocol [CLOSED] -131 Help users to verify they are using Tor [NEEDS-REVISION] -132 A Tor Web 
Service For Verifying Correct Browser Configuration [DRAFT] -133 Incorporate Unreachable ORs into the Tor Network [DRAFT] -134 More robust consensus voting with diverse authority sets [REJECTED] -135 Simplify Configuration of Private Tor Networks [CLOSED] -136 Mass authority migration with legacy keys [CLOSED] -137 Keep controllers informed as Tor bootstraps [CLOSED] -138 Remove routers that are not Running from consensus documents [CLOSED] -139 Download consensus documents only when it will be trusted [CLOSED] -140 Provide diffs between consensuses [ACCEPTED] -141 Download server descriptors on demand [DRAFT] -142 Combine Introduction and Rendezvous Points [DEAD] -143 Improvements of Distributed Storage for Tor Hidden Service Descriptors [OPEN] -144 Increase the diversity of circuits by detecting nodes belonging the same provider [DRAFT] -145 Separate "suitable as a guard" from "suitable as a new guard" [OPEN] -146 Add new flag to reflect long-term stability [OPEN] -147 Eliminate the need for v2 directories in generating v3 directories [ACCEPTED] -148 Stream end reasons from the client side should be uniform [CLOSED] -149 Using data from NETINFO cells [OPEN] -150 Exclude Exit Nodes from a circuit [CLOSED] -151 Improving Tor Path Selection [FINISHED] -152 Optionally allow exit from single-hop circuits [CLOSED] -153 Automatic software update protocol [SUPERSEDED] -154 Automatic Software Update Protocol [SUPERSEDED] -155 Four Improvements of Hidden Service Performance [FINISHED] -156 Tracking blocked ports on the client side [OPEN] -157 Make certificate downloads specific [ACCEPTED] -158 Clients download consensus + microdescriptors [OPEN] -159 Exit Scanning [OPEN] -160 Authorities vote for bandwidth offsets in consensus [FINISHED] -161 Computing Bandwidth Adjustments [FINISHED] -162 Publish the consensus in multiple flavors [OPEN] -163 Detecting whether a connection comes from a client [OPEN] -164 Reporting the status of server votes [OPEN] -165 Easy migration for 
voting authority sets [OPEN] -166 Including Network Statistics in Extra-Info Documents [ACCEPTED] -167 Vote on network parameters in consensus [CLOSED] -168 Reduce default circuit window [OPEN] -169 Eliminate TLS renegotiation for the Tor connection handshake [DRAFT] -170 Configuration options regarding circuit building [DRAFT] -172 GETINFO controller option for circuit information [ACCEPTED] -173 GETINFO Option Expansion [ACCEPTED] -174 Optimistic Data for Tor: Server Side [OPEN] - - -Proposals by status: - - DRAFT: - 127 Relaying dirport requests to Tor download site / website - 132 A Tor Web Service For Verifying Correct Browser Configuration - 133 Incorporate Unreachable ORs into the Tor Network - 141 Download server descriptors on demand - 144 Increase the diversity of circuits by detecting nodes belonging the same provider - 169 Eliminate TLS renegotiation for the Tor connection handshake [for 0.2.2] - 170 Configuration options regarding circuit building - NEEDS-REVISION: - 131 Help users to verify they are using Tor - OPEN: - 143 Improvements of Distributed Storage for Tor Hidden Service Descriptors [for 0.2.1.x] - 145 Separate "suitable as a guard" from "suitable as a new guard" [for 0.2.1.x] - 146 Add new flag to reflect long-term stability [for 0.2.1.x] - 149 Using data from NETINFO cells [for 0.2.1.x] - 156 Tracking blocked ports on the client side [for 0.2.?] 
- 158 Clients download consensus + microdescriptors - 159 Exit Scanning - 162 Publish the consensus in multiple flavors [for 0.2.2] - 163 Detecting whether a connection comes from a client [for 0.2.2] - 164 Reporting the status of server votes [for 0.2.2] - 165 Easy migration for voting authority sets - 168 Reduce default circuit window [for 0.2.2] - 174 Optimistic Data for Tor: Server Side - ACCEPTED: - 110 Avoiding infinite length circuits [for 0.2.1.x] [in 0.2.1.3-alpha] - 117 IPv6 exits [for 0.2.1.x] - 118 Advertising multiple ORPorts at once [for 0.2.1.x] - 140 Provide diffs between consensuses [for 0.2.2.x] - 147 Eliminate the need for v2 directories in generating v3 directories [for 0.2.1.x] - 157 Make certificate downloads specific [for 0.2.1.x] - 166 Including Network Statistics in Extra-Info Documents [for 0.2.2] - 172 GETINFO controller option for circuit information - 173 GETINFO Option Expansion - META: - 000 Index of Tor Proposals - 001 The Tor Proposal Process - 098 Proposals that should be written - 099 Miscellaneous proposals - FINISHED: - 121 Hidden Service Authentication [in 0.2.1.x] - 151 Improving Tor Path Selection - 155 Four Improvements of Hidden Service Performance [in 0.2.1.x] - 160 Authorities vote for bandwidth offsets in consensus [for 0.2.2.x] - 161 Computing Bandwidth Adjustments [for 0.2.2.x] - CLOSED: - 101 Voting on the Tor Directory System [in 0.2.0.x] - 102 Dropping "opt" from the directory format [in 0.2.0.x] - 103 Splitting identity key from regularly used signing key [in 0.2.0.x] - 104 Long and Short Router Descriptors [in 0.2.0.x] - 105 Version negotiation for the Tor protocol [in 0.2.0.x] - 106 Checking fewer things during TLS handshakes [in 0.2.0.x] - 107 Uptime Sanity Checking [in 0.2.0.x] - 108 Base "Stable" Flag on Mean Time Between Failures [in 0.2.0.x] - 109 No more than one server per IP address [in 0.2.0.x] - 111 Prioritizing local traffic over relayed traffic [in 0.2.0.x] - 114 Distributed Storage for Tor Hidden 
Service Descriptors [in 0.2.0.x] - 119 New PROTOCOLINFO command for controllers [in 0.2.0.x] - 122 Network status entries need a new Unnamed flag [in 0.2.0.x] - 123 Naming authorities automatically create bindings [in 0.2.0.x] - 125 Behavior for bridge users, bridge relays, and bridge authorities [in 0.2.0.x] - 126 Getting GeoIP data and publishing usage summaries [in 0.2.0.x] - 129 Block Insecure Protocols by Default [in 0.2.0.x] - 130 Version 2 Tor connection protocol [in 0.2.0.x] - 135 Simplify Configuration of Private Tor Networks [for 0.2.1.x] [in 0.2.1.2-alpha] - 136 Mass authority migration with legacy keys [in 0.2.0.x] - 137 Keep controllers informed as Tor bootstraps [in 0.2.1.x] - 138 Remove routers that are not Running from consensus documents [in 0.2.1.2-alpha] - 139 Download consensus documents only when it will be trusted [in 0.2.1.x] - 148 Stream end reasons from the client side should be uniform [in 0.2.1.9-alpha] - 150 Exclude Exit Nodes from a circuit [in 0.2.1.3-alpha] - 152 Optionally allow exit from single-hop circuits [in 0.2.1.6-alpha] - 167 Vote on network parameters in consensus [in 0.2.2] - SUPERSEDED: - 112 Bring Back Pathlen Coin Weight - 113 Simplifying directory authority administration - 124 Blocking resistant TLS certificate usage - 153 Automatic software update protocol - 154 Automatic Software Update Protocol - DEAD: - 100 Tor Unreliable Datagram Extension Proposal - 115 Two Hop Paths - 116 Two hop paths from entry guards - 120 Shutdown descriptors when Tor servers stop - 128 Families of private bridges - 142 Combine Introduction and Rendezvous Points - REJECTED: - 134 More robust consensus voting with diverse authority sets diff --git a/doc/spec/proposals/001-process.txt b/doc/spec/proposals/001-process.txt deleted file mode 100644 index e2fe358fed..0000000000 --- a/doc/spec/proposals/001-process.txt +++ /dev/null @@ -1,184 +0,0 @@ -Filename: 001-process.txt -Title: The Tor Proposal Process -Author: Nick Mathewson -Created: 
30-Jan-2007 -Status: Meta - -Overview: - - This document describes how to change the Tor specifications, how Tor - proposals work, and the relationship between Tor proposals and the - specifications. - - This is an informational document. - -Motivation: - - Previously, our process for updating the Tor specifications was maximally - informal: we'd patch the specification (sometimes forking first, and - sometimes not), then discuss the patches, reach consensus, and implement - the changes. - - This had a few problems. - - First, even at its most efficient, the old process would often have the - spec out of sync with the code. The worst cases were those where - implementation was deferred: the spec and code could stay out of sync for - versions at a time. - - Second, it was hard to participate in discussion, since you had to know - which portions of the spec were a proposal, and which were already - implemented. - - Third, it littered the specifications with too many inline comments. - [This was a real problem -NM] - [Especially when it went to multiple levels! -NM] - [XXXX especially when they weren't signed and talked about that - thing that you can't remember after a year] - -How to change the specs now: - - First, somebody writes a proposal document. It should describe the change - that should be made in detail, and give some idea of how to implement it. - Once it's fleshed out enough, it becomes a proposal. - - Like an RFC, every proposal gets a number. Unlike RFCs, proposals can - change over time and keep the same number, until they are finally - accepted or rejected. The history for each proposal - will be stored in the Tor repository. - - Once a proposal is in the repository, we should discuss and improve it - until we've reached consensus that it's a good idea, and that it's - detailed enough to implement. When this happens, we implement the - proposal and incorporate it into the specifications. 
Thus, the specs - remain the canonical documentation for the Tor protocol: no proposal is - ever the canonical documentation for an implemented feature. - - (This process is pretty similar to the Python Enhancement Process, with - the major exception that Tor proposals get re-integrated into the specs - after implementation, whereas PEPs _become_ the new spec.) - - {It's still okay to make small changes directly to the spec if the code - can be - written more or less immediately, or cosmetic changes if no code change is - required. This document reflects the current developers' _intent_, not - a permanent promise to always use this process in the future: we reserve - the right to get really excited and run off and implement something in a - caffeine-or-m&m-fueled all-night hacking session.} - -How new proposals get added: - - Once an idea has been proposed on the development list, a properly formatted - (see below) draft exists, and rough consensus within the active development - community exists that this idea warrants consideration, the proposal editor - will officially add the proposal. - - To get your proposal in, send it to or-dev. - - The current proposal editor is Nick Mathewson. - -What should go in a proposal: - - Every proposal should have a header containing these fields: - Filename, Title, Author, Created, Status. - - These fields are optional but recommended: - Target, Implemented-In. - The Target field should describe which version the proposal is hoped to be - implemented in (if it's Open or Accepted). The Implemented-In field - should describe which version the proposal was implemented in (if it's - Finished or Closed). - - The body of the proposal should start with an Overview section explaining - what the proposal's about, what it does, and about what state it's in. - - After the Overview, the proposal becomes more free-form. 
Depending on its - length and complexity, the proposal can break into sections as - appropriate, or follow a short discursive format. Every proposal should - contain at least the following information before it is "ACCEPTED", - though the information does not need to be in sections with these names. - - Motivation: What problem is the proposal trying to solve? Why does - this problem matter? If several approaches are possible, why take this - one? - - Design: A high-level view of what the new or modified features are, how - the new or modified features work, how they interoperate with each - other, and how they interact with the rest of Tor. This is the main - body of the proposal. Some proposals will start out with only a - Motivation and a Design, and wait for a specification until the - Design seems approximately right. - - Security implications: What effects the proposed changes might have on - anonymity, how well understood these effects are, and so on. - - Specification: A detailed description of what needs to be added to the - Tor specifications in order to implement the proposal. This should - be in about as much detail as the specifications will eventually - contain: it should be possible for independent programmers to write - mutually compatible implementations of the proposal based on its - specifications. - - Compatibility: Will versions of Tor that follow the proposal be - compatible with versions that do not? If so, how will compatibility - be achieved? Generally, we try to not drop compatibility if at - all possible; we haven't made a "flag day" change since May 2004, - and we don't want to do another one. - - Implementation: If the proposal will be tricky to implement in Tor's - current architecture, the document can contain some discussion of how - to go about making it work. Actual patches should go on public git - branches, or be uploaded to trac. 
- - Performance and scalability notes: If the feature will have an effect - on performance (in RAM, CPU, bandwidth) or scalability, there should - be some analysis on how significant this effect will be, so that we - can avoid really expensive performance regressions, and so we can - avoid wasting time on insignificant gains. - -Proposal status: - - Open: A proposal under discussion. - - Accepted: The proposal is complete, and we intend to implement it. - After this point, substantive changes to the proposal should be - avoided, and regarded as a sign of the process having failed - somewhere. - - Finished: The proposal has been accepted and implemented. After this - point, the proposal should not be changed. - - Closed: The proposal has been accepted, implemented, and merged into the - main specification documents. The proposal should not be changed after - this point. - - Rejected: We're not going to implement the feature as described here, - though we might do some other version. See comments in the document - for details. The proposal should not be changed after this point; - to bring up some other version of the idea, write a new proposal. - - Draft: This isn't a complete proposal yet; there are definite missing - pieces. Please don't add any new proposals with this status; put them - in the "ideas" sub-directory instead. - - Needs-Revision: The idea for the proposal is a good one, but the proposal - as it stands has serious problems that keep it from being accepted. - See comments in the document for details. - - Dead: The proposal hasn't been touched in a long time, and it doesn't look - like anybody is going to complete it soon. It can become "Open" again - if it gets a new proponent. - - Needs-Research: There are research problems that need to be solved before - it's clear whether the proposal is a good idea. - - Meta: This is not a proposal, but a document about proposals. 
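The required header fields and the status values listed above could be checked mechanically. The following is a minimal Python sketch, assuming a header made of `Name: value` lines that ends at the first blank line. The function name `parse_proposal_header` and the sample text are hypothetical, and nothing here is meant to describe how reindex.py actually works.

```python
REQUIRED_FIELDS = ("Filename", "Title", "Author", "Created", "Status")
OPTIONAL_FIELDS = ("Target", "Implemented-In")

# Status names from the "Proposal status" list. "superseded" appears in
# the proposal index but not in that list, so including it is an assumption.
KNOWN_STATUSES = {
    "open", "accepted", "finished", "closed", "rejected", "draft",
    "needs-revision", "dead", "needs-research", "meta", "superseded",
}


def parse_proposal_header(text):
    """Return the header fields as a dict; raise ValueError on problems."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            break  # assumption: the header ends at the first blank line
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError("malformed header line: %r" % line)
        fields[name.strip()] = value.strip()
    missing = [f for f in REQUIRED_FIELDS if f not in fields]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    if fields["Status"].lower() not in KNOWN_STATUSES:
        raise ValueError("unknown status: " + fields["Status"])
    return fields


# Sample header copied from 001-process.txt.
SAMPLE = """Filename: 001-process.txt
Title: The Tor Proposal Process
Author: Nick Mathewson
Created: 30-Jan-2007
Status: Meta

Overview:
"""
```

Calling `parse_proposal_header(SAMPLE)` returns a dict with the five required fields; a header without a Status line raises `ValueError`.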
- - - The editor maintains the correct status of proposals, based on rough - consensus and his own discretion. - -Proposal numbering: - - Numbers 000-099 are reserved for special and meta-proposals. 100 and up - are used for actual proposals. Numbers aren't recycled. diff --git a/doc/spec/proposals/098-todo.txt b/doc/spec/proposals/098-todo.txt deleted file mode 100644 index a0bbbeb568..0000000000 --- a/doc/spec/proposals/098-todo.txt +++ /dev/null @@ -1,107 +0,0 @@ -Filename: 098-todo.txt -Title: Proposals that should be written -Author: Nick Mathewson, Roger Dingledine -Created: 26-Jan-2007 -Status: Meta - -Overview: - - This document lists ideas that various people have had for improving the - Tor protocol. These should be implemented and specified if they're - trivial, or written up as proposals if they're not. - - This is an active document, to be edited as proposals are written and as - we come up with new ideas for proposals. We should take stuff out as it - seems irrelevant. - - -For some later protocol version. - - - It would be great to get smarter about identity and linkability. - It's not crazy to say, "Never use the same circuit for my SSH - connections and my web browsing." How far can/should we take this? - See ideas/xxx-separate-streams-by-port.txt for a start. - - - Fix onionskin handshake scheme to be more mainstream, less nutty. - Can we just do - E(HMAC(g^x), g^x) rather than just E(g^x) ? - No, that has the same flaws as before. We should send - E(g^x, C) with random C and expect g^y, HMAC_C(K=g^xy). - Better ask Ian; probably Stephen too. - - - Length on CREATE and friends - - - Versioning on circuits and create cells, so we have a clear path - to improve the circuit protocol. - - - SHA1 is showing its age. We should get a design for upgrading our - hash once the AHS competition is done, or even sooner. - - - Not being able to upgrade ciphersuites or increase key lengths is - lame. 
- - Paul has some ideas about circuit creation; read his PET paper once it's - out. - -Any time: - - - Some ideas for revising the directory protocol: - - Extend the "r" line in network-status to give a set of buckets (say, - comma-separated) for that router. - - Buckets are deterministic based on IP address. - - Then clients can choose a bucket (or set of buckets) to - download and use. - - We need a way for the authorities to declare that nodes are in a - family. Also, it kinda sucks that family declarations use O(N^2) space - in the descriptors. - - REASON_CONNECTFAILED should include an IP. - - Spec should incorporate some prose from tor-design to be more readable. - - Spec when we should rotate which keys - - Spec how to publish descriptors less often - - Describe pros and cons of non-deterministic path lengths - - - We should use a variable-length path length by default -- 3 +/- some - distribution. Need to think harder about allowing values less than 3, - and there's a tradeoff between having a wide variance and performance. - - - Clients currently use certs during TLS. Is this wise? It does make it - easier for servers to tell which NATted client is which. We could use a - separate set of certs for each guard, I suppose, but generating so many - certs could get expensive. Omitting them entirely would make OP->OR - easier to tell from OR->OR. - -Things that should change... - -B.1. ... but which will require backward-incompatible change - - - Circuit IDs should be longer. - . IPv6 everywhere. - - Maybe, keys should be longer. - - Maybe, key-length should be adjustable. How to do this without - making anonymity suck? - - Drop backward compatibility. - - We should use a 128-bit subgroup of our DH prime. - - Handshake should use HMAC. - - Multiple cell lengths. - - Ability to split circuits across paths (If this is useful.) - - SENDME windows should be dynamic. - - - Directory - - Stop ever mentioning socks ports - -B.1. ... 
and that will require no changes - - - Advertised outbound IP? - - Migrate streams across circuits. - - Fix bug 469 by limiting the number of simultaneous connections per IP. - -B.2. ... and that we have no idea how to do. - - - UDP (as transport) - - UDP (as content) - - Use a better AES mode that has built-in integrity checking, - doesn't grow with the number of hops, is not patented, and - is implemented and maintained by smart people. - -Let onion keys be not just RSA but maybe DH too, for Paul's reply onion -design. - diff --git a/doc/spec/proposals/099-misc.txt b/doc/spec/proposals/099-misc.txt deleted file mode 100644 index a3621dd25f..0000000000 --- a/doc/spec/proposals/099-misc.txt +++ /dev/null @@ -1,28 +0,0 @@ -Filename: 099-misc.txt -Title: Miscellaneous proposals -Author: Various -Created: 26-Jan-2007 -Status: Meta - -Overview: - - This document is for small proposal ideas that are about one paragraph in - length. From here, ideas can be rejected outright, expanded into full - proposals, or specified and implemented as-is. - -Proposals - -1. Directory compression. - - Gzip would be easier to work with than zlib; bzip2 would result in smaller - data lengths. [Concretely, we're looking at about 10-15% space savings at - the expense of 3-5x longer compression time for using bzip2.] Doing - on-the-fly gzip requires zlib 1.2 or later; doing bzip2 requires bzlib. - Pre-compressing status documents in multiple formats would force us to use - more memory to hold them. 
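The size-versus-time tradeoff described in the directory compression note can be measured directly. The sketch below uses Python's standard `zlib` and `bz2` modules; the helper names and the synthetic input are made up for illustration, the input is not a real Tor status document, and the figures it produces will not necessarily match the 10-15% and 3-5x estimates quoted above.

```python
import bz2
import time
import zlib


def compare_compression(data):
    """Return {name: (compressed_size, seconds)} for zlib and bzip2."""
    candidates = (
        ("zlib", lambda d: zlib.compress(d, 9), zlib.decompress),
        ("bz2", lambda d: bz2.compress(d, 9), bz2.decompress),
    )
    results = {}
    for name, compress, decompress in candidates:
        start = time.perf_counter()
        packed = compress(data)
        elapsed = time.perf_counter() - start
        if decompress(packed) != data:  # round-trip sanity check
            raise RuntimeError("round trip failed for " + name)
        results[name] = (len(packed), elapsed)
    return results


def synthetic_status(n=2000):
    """Build a made-up stand-in for a status document (hypothetical format)."""
    lines = [
        "r relay%d 10.%d.%d.1 9001 9030\n" % (i, i // 256, i % 256)
        for i in range(n)
    ]
    return "".join(lines).encode("ascii")


if __name__ == "__main__":
    sample = synthetic_status()
    for name, (size, seconds) in compare_compression(sample).items():
        print("%-4s %7d bytes  %.4f s" % (name, size, seconds))
```

Running the comparison on real cached status documents rather than the synthetic input would give numbers that are actually relevant to the decision.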
- - Status: Open - - -- Nick Mathewson - - diff --git a/doc/spec/proposals/100-tor-spec-udp.txt b/doc/spec/proposals/100-tor-spec-udp.txt deleted file mode 100644 index 7f062222c5..0000000000 --- a/doc/spec/proposals/100-tor-spec-udp.txt +++ /dev/null @@ -1,422 +0,0 @@ -Filename: 100-tor-spec-udp.txt -Title: Tor Unreliable Datagram Extension Proposal -Author: Marc Liberatore -Created: 23 Feb 2006 -Status: Dead - -Overview: - - This is a modified version of the Tor specification written by Marc - Liberatore to add UDP support to Tor. For each TLS link, it adds a - corresponding DTLS link: control messages and TCP data flow over TLS, and - UDP data flows over DTLS. - - This proposal is not likely to be accepted as-is; see comments at the end - of the document. - - -Contents - -0. Introduction - - Tor is a distributed overlay network designed to anonymize low-latency - TCP-based applications. The current tor specification supports only - TCP-based traffic. This limitation prevents the use of tor to anonymize - other important applications, notably voice over IP software. This document - is a proposal to extend the tor specification to support UDP traffic. - - The basic design philosophy of this extension is to add support for - tunneling unreliable datagrams through tor with as few modifications to the - protocol as possible. As currently specified, tor cannot directly support - such tunneling, as connections between nodes are built using transport layer - security (TLS) atop TCP. The latency incurred by TCP is likely unacceptable - to the operation of most UDP-based application level protocols. - - Thus, we propose the addition of links between nodes using datagram - transport layer security (DTLS). These links allow packets to traverse a - route through tor quickly, but their unreliable nature requires minor - changes to the tor protocol. This proposal outlines the necessary - additions and changes to the tor specification to support UDP traffic. 
- - We note that a separate set of DTLS links between nodes creates a second - overlay, distinct from the one composed of TLS links. This separation and - resulting decrease in each anonymity set's size will make certain attacks - easier. However, it is our belief that VoIP support in tor will - dramatically increase its appeal, and correspondingly, the size of its user - base, number of deployed nodes, and total traffic relayed. These increases - should help offset the loss of anonymity that two distinct networks imply. - -1. Overview of Tor-UDP and its complications - - As described above, this proposal extends the Tor specification to support - UDP with as few changes as possible. Tor's overlay network is managed - through TLS-based connections; we will re-use this control plane to set up - and tear down circuits that relay UDP traffic. These circuits will be built atop - DTLS, in a fashion analogous to how Tor currently sends TCP traffic over - TLS. - - The unreliability of DTLS circuits creates problems for Tor at two levels: - - 1. Tor's encryption of the relay layer does not allow independent - decryption of individual records. If record N is not received, then - record N+1 will not decrypt correctly, as the counter for AES/CTR is - maintained implicitly. - - 2. Tor's end-to-end integrity checking works under the assumption that - all RELAY cells are delivered. This assumption is invalid when cells - are sent over DTLS. - - The fix for the first problem is straightforward: add an explicit sequence - number to each cell. To fix the second problem, we introduce a - system of nonces and hashes to RELAY packets. - - In the following sections, we mirror the layout of the Tor Protocol - Specification, presenting the necessary modifications to the Tor protocol as - a series of deltas. - -2. Connections - - Tor-UDP uses DTLS for encryption of some links. All DTLS links must have - corresponding TLS links, as all control messages are sent over TLS. 
All - implementations MUST support the DTLS ciphersuite "[TODO]". - - DTLS connections are formed using the same protocol as TLS connections. - This occurs upon request, following a CREATE_UDP or CREATE_FAST_UDP cell, - as detailed in section 4.6. - - Once a paired TLS/DTLS connection is established, the two sides send cells - to one another. All but two types of cells are sent over TLS links. RELAY - cells containing the commands RELAY_DATA_UDP and RELAY_DROP_UDP, specified - below, are sent over DTLS links. [Should all cells still be 512 bytes long? - Perhaps upon completion of a preliminary implementation, we should do a - performance evaluation for some class of UDP traffic, such as VoIP. - ML] - Cells may be sent embedded in TLS or DTLS records of any size or divided - across such records. The framing of these records MUST NOT leak any more - information than the above differentiation on the basis of cell type. [I am - uncomfortable with this leakage, but don't see any simple, elegant way - around it. -ML] - - As with TLS connections, DTLS connections are not permanent. - -3. 
Cell format - - Each cell contains the following fields: - - CircID [2 bytes] - Command [1 byte] - Sequence Number [2 bytes] - Payload (padded with 0 bytes) [507 bytes] - [Total size: 512 bytes] - - The 'Command' field holds one of the following values: - 0 -- PADDING (Padding) (See Sec 6.2) - 1 -- CREATE (Create a circuit) (See Sec 4) - 2 -- CREATED (Acknowledge create) (See Sec 4) - 3 -- RELAY (End-to-end data) (See Sec 5) - 4 -- DESTROY (Stop using a circuit) (See Sec 4) - 5 -- CREATE_FAST (Create a circuit, no PK) (See Sec 4) - 6 -- CREATED_FAST (Circuit created, no PK) (See Sec 4) - 7 -- CREATE_UDP (Create a UDP circuit) (See Sec 4) - 8 -- CREATED_UDP (Acknowledge UDP create) (See Sec 4) - 9 -- CREATE_FAST_UDP (Create a UDP circuit, no PK) (See Sec 4) - 10 -- CREATED_FAST_UDP(UDP circuit created, no PK) (See Sec 4) - - The sequence number allows for AES/CTR decryption of RELAY cells - independently of one another; this functionality is required to support - cells sent over DTLS. The sequence number is described in more detail in - section 4.5. - - [Should the sequence number only appear in RELAY packets? The overhead is - small, and I'm hesitant to force more code paths on the implementor. -ML] - [There's already a separate relay header that has other material in it, - so it wouldn't be the end of the world to move it there if it's - appropriate. -RD] - - [Having separate commands for UDP circuits seems necessary, unless we can - assume a flag day event for a large number of tor nodes. -ML] - -4. Circuit management - -4.2. Setting circuit keys - - Keys are set up for UDP circuits in the same fashion as for TCP circuits. - Each UDP circuit shares keys with its corresponding TCP circuit. - - [If the keys are used for both TCP and UDP connections, how does it - work to mix sequence-number-less cells with sequenced-numbered cells -- - how do you know you have the encryption order right? -RD] - -4.3. 
Creating circuits - - UDP circuits are created as TCP circuits, using the *_UDP cells as - appropriate. - -4.4. Tearing down circuits - - UDP circuits are torn down as TCP circuits, using the *_UDP cells as - appropriate. - -4.5. Routing relay cells - - When an OR receives a RELAY cell, it checks the cell's circID and - determines whether it has a corresponding circuit along that - connection. If not, the OR drops the RELAY cell. - - Otherwise, if the OR is not at the OP edge of the circuit (that is, - either an 'exit node' or a non-edge node), it de/encrypts the payload - with AES/CTR, as follows: - 'Forward' relay cell (same direction as CREATE): - Use Kf as key; decrypt, using sequence number to synchronize - ciphertext and keystream. - 'Back' relay cell (opposite direction from CREATE): - Use Kb as key; encrypt, using sequence number to synchronize - ciphertext and keystream. - Note that in counter mode, decrypt and encrypt are the same operation. - [Since the sequence number is only 2 bytes, what do you do when it - rolls over? -RD] - - Each stream encrypted by a Kf or Kb has a corresponding unique state, - captured by a sequence number; the originator of each such stream chooses - the initial sequence number randomly, and increments it only with RELAY - cells. [This counts cells; unlike, say, TCP, tor uses fixed-size cells, so - there's no need for counting bytes directly. Right? - ML] - [I believe this is true. You'll find out for sure when you try to - build it. ;) -RD] - - The OR then decides whether it recognizes the relay cell, by - inspecting the payload as described in section 5.1 below. If the OR - recognizes the cell, it processes the contents of the relay cell. - Otherwise, it passes the decrypted relay cell along the circuit if - the circuit continues. If the OR at the end of the circuit - encounters an unrecognized relay cell, an error has occurred: the OR - sends a DESTROY cell to tear down the circuit. 
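The routing rules above depend on every cell carrying an explicit sequence number, as laid out in the cell format of section 3. A minimal Python sketch of that layout follows. Big-endian byte order and wrapping the sequence number at 2**16 are assumptions: the text states neither, and the rollover behavior is raised as an open question. The function names are hypothetical.

```python
import struct

CELL_LEN = 512
PAYLOAD_LEN = 507

# CircID (2 bytes), Command (1 byte), Sequence Number (2 bytes),
# Payload (507 bytes). Big-endian order is an assumption.
CELL_FORMAT = ">HBH%ds" % PAYLOAD_LEN

# Two command values taken from the 'Command' table in section 3.
CMD_RELAY = 3
CMD_CREATE_UDP = 7


def pack_cell(circ_id, command, seq, payload=b""):
    """Build one fixed-size cell; short payloads are padded with 0 bytes."""
    if len(payload) > PAYLOAD_LEN:
        raise ValueError("payload longer than %d bytes" % PAYLOAD_LEN)
    # struct's "s" code pads a short bytes value with zero bytes.
    return struct.pack(CELL_FORMAT, circ_id, command, seq, payload)


def unpack_cell(cell):
    """Return (circ_id, command, seq, payload) for a 512-byte cell."""
    if len(cell) != CELL_LEN:
        raise ValueError("cell must be exactly %d bytes" % CELL_LEN)
    return struct.unpack(CELL_FORMAT, cell)


def next_seq(seq):
    """Advance the per-stream counter; the 16-bit wrap is an assumption."""
    return (seq + 1) & 0xFFFF
```

A receiver would read `seq` from each unpacked cell to position its AES/CTR keystream before decrypting; how the two-byte value maps onto the counter block is not specified in the text and is left out here.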
- - When a relay cell arrives at an OP, the OP decrypts the payload - with AES/CTR as follows: - OP receives data cell: - For I=N...1, - Decrypt with Kb_I, using the sequence number as above. If the - payload is recognized (see section 5.1), then stop and process - the payload. - - For more information, see section 5 below. - -4.6. CREATE_UDP and CREATED_UDP cells - - Users set up UDP circuits incrementally. The procedure is similar to that - for TCP circuits, as described in section 4.1. In addition to the TLS - connection to the first node, the OP also attempts to open a DTLS - connection. If this succeeds, the OP sends a CREATE_UDP cell, with a - payload in the same format as a CREATE cell. To extend a UDP circuit past - the first hop, the OP sends an EXTEND_UDP relay cell (see section 5) which - instructs the last node in the circuit to send a CREATE_UDP cell to extend - the circuit. - - The relay payload for an EXTEND_UDP relay cell consists of: - Address [4 bytes] - TCP port [2 bytes] - UDP port [2 bytes] - Onion skin [186 bytes] - Identity fingerprint [20 bytes] - - The address field and ports denote the IPv4 address and ports of the next OR - in the circuit. - - The payload for a CREATED_UDP cell or the relay payload for a - RELAY_EXTENDED_UDP cell is identical to that of the corresponding CREATED or - RELAY_EXTENDED cell. Both circuits are established using the same key. - - Note that the existence of a UDP circuit implies the - existence of a corresponding TCP circuit, sharing keys, sequence numbers, - and any other relevant state. - -4.6.1 CREATE_FAST_UDP/CREATED_FAST_UDP cells - - As above, the OP must successfully connect using DTLS before attempting to - send a CREATE_FAST_UDP cell. Otherwise, the procedure is the same as in - section 4.1.1. - -5. Application connections and stream management - -5.1. 
Relay cells - - Within a circuit, the OP and the exit node use the contents of RELAY cells - to tunnel end-to-end commands, TCP connections ("Streams"), and UDP packets - across circuits. End-to-end commands and UDP packets can be initiated by - either edge; streams are initiated by the OP. - - The payload of each unencrypted RELAY cell consists of: - Relay command [1 byte] - 'Recognized' [2 bytes] - StreamID [2 bytes] - Digest [4 bytes] - Length [2 bytes] - Data [498 bytes] - - The relay commands are: - 1 -- RELAY_BEGIN [forward] - 2 -- RELAY_DATA [forward or backward] - 3 -- RELAY_END [forward or backward] - 4 -- RELAY_CONNECTED [backward] - 5 -- RELAY_SENDME [forward or backward] - 6 -- RELAY_EXTEND [forward] - 7 -- RELAY_EXTENDED [backward] - 8 -- RELAY_TRUNCATE [forward] - 9 -- RELAY_TRUNCATED [backward] - 10 -- RELAY_DROP [forward or backward] - 11 -- RELAY_RESOLVE [forward] - 12 -- RELAY_RESOLVED [backward] - 13 -- RELAY_BEGIN_UDP [forward] - 14 -- RELAY_DATA_UDP [forward or backward] - 15 -- RELAY_EXTEND_UDP [forward] - 16 -- RELAY_EXTENDED_UDP [backward] - 17 -- RELAY_DROP_UDP [forward or backward] - - Commands labelled as "forward" must only be sent by the originator - of the circuit. Commands labelled as "backward" must only be sent by - other nodes in the circuit back to the originator. Commands marked - as either can be sent either by the originator or other nodes. - - The 'recognized' field in any unencrypted relay payload is always set to - zero. - - The 'digest' field can have two meanings. 
For all cells sent over TLS - connections (that is, all commands and all non-UDP RELAY data), it is - computed as the first four bytes of the running SHA-1 digest of all the - bytes that have been sent reliably and have been destined for this hop of - the circuit or originated from this hop of the circuit, seeded from Df or Db - respectively (obtained in section 4.2 above), and including this RELAY - cell's entire payload (taken with the digest field set to zero). Cells sent - over DTLS connections do not affect this running digest. Each cell sent - over DTLS (that is, RELAY_DATA_UDP and RELAY_DROP_UDP) has the digest field - set to the SHA-1 digest of the current RELAY cells' entire payload, with the - digest field set to zero. Coupled with a randomly-chosen streamID, this - provides per-cell integrity checking on UDP cells. - [If you drop malformed UDP relay cells but don't close the circuit, - then this 8 bytes of digest is not as strong as what we get in the - TCP-circuit side. Is this a problem? -RD] - - When the 'recognized' field of a RELAY cell is zero, and the digest - is correct, the cell is considered "recognized" for the purposes of - decryption (see section 4.5 above). - - (The digest does not include any bytes from relay cells that do - not start or end at this hop of the circuit. That is, it does not - include forwarded data. Therefore if 'recognized' is zero but the - digest does not match, the running digest at that node should - not be updated, and the cell should be forwarded on.) - - All RELAY cells pertaining to the same tunneled TCP stream have the - same streamID. Such streamIDs are chosen arbitrarily by the OP. RELAY - cells that affect the entire circuit rather than a particular - stream use a StreamID of zero. - - All RELAY cells pertaining to the same UDP tunnel have the same streamID. - This streamID is chosen randomly by the OP, but cannot be zero. 
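The relay payload layout and the per-cell UDP digest described above can be sketched as follows. This is a minimal illustration, not the draft's reference code: the function names are hypothetical, network (big-endian) byte order is assumed, and truncating the SHA-1 output to its first four bytes is an assumption made so that it fits the 4-byte Digest field (the draft text does not say how the 20-byte hash is shortened).

```python
import hashlib
import struct

DATA_LEN = 498                  # size of the Data field in a relay payload
RELAY_DATA_UDP = 14             # relay command number from the table in 5.1
HEADER_FMT = "!BHH4sH"          # command, recognized, streamID, digest, length
HEADER_LEN = struct.calcsize(HEADER_FMT)   # 11 bytes; big-endian is assumed


def pack_relay_payload(command, stream_id, data, digest=b"\x00" * 4):
    """Build one unencrypted 509-byte relay payload (hypothetical helper)."""
    if len(data) > DATA_LEN:
        raise ValueError("data does not fit in a single relay cell")
    if len(digest) != 4:
        raise ValueError("digest field is exactly 4 bytes")
    # 'Recognized' is always zero in an unencrypted payload.
    header = struct.pack(HEADER_FMT, command, 0, stream_id, digest, len(data))
    # The remainder of the payload is padded with NUL bytes.
    return header + data + b"\x00" * (DATA_LEN - len(data))


def build_udp_relay_payload(stream_id, data):
    """Per-cell digest: SHA-1 over the payload with the digest field zeroed."""
    if stream_id == 0:
        raise ValueError("UDP tunnels must use a non-zero streamID")
    zeroed = pack_relay_payload(RELAY_DATA_UDP, stream_id, data)
    digest = hashlib.sha1(zeroed).digest()[:4]   # truncation is an assumption
    return pack_relay_payload(RELAY_DATA_UDP, stream_id, data, digest)


def check_udp_relay_payload(payload):
    """Return True if 'recognized' is zero and the per-cell digest matches."""
    _cmd, recognized, _sid, digest, _length = struct.unpack(
        HEADER_FMT, payload[:HEADER_LEN])
    zeroed = payload[:5] + b"\x00" * 4 + payload[9:]   # digest sits at bytes 5..8
    return recognized == 0 and hashlib.sha1(zeroed).digest()[:4] == digest
```

Because the UDP digest covers only the current cell, a receiver can verify each cell independently of any cells that were lost or reordered, which the running digest used on TLS connections cannot tolerate.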
- - The 'Length' field of a relay cell contains the number of bytes in - the relay payload which contain real payload data. The remainder of - the payload is padded with NUL bytes. - - If the RELAY cell is recognized but the relay command is not - understood, the cell must be dropped and ignored. Its contents - still count with respect to the digests, though. [Before - 0.1.1.10, Tor closed circuits when it received an unknown relay - command. Perhaps this will be more forward-compatible. -RD] - -5.2.1. Opening UDP tunnels and transferring data - - To open a new anonymized UDP connection, the OP chooses an open - circuit to an exit that may be able to connect to the destination - address, selects a random streamID not yet used on that circuit, - and constructs a RELAY_BEGIN_UDP cell with a payload encoding the address - and port of the destination host. The payload format is: - - ADDRESS | ':' | PORT | [00] - - where ADDRESS can be a DNS hostname, or an IPv4 address in - dotted-quad format, or an IPv6 address surrounded by square brackets; - and where PORT is encoded in decimal. - - [What is the [00] for? -NM] - [It's so the payload is easy to parse out with string funcs -RD] - - Upon receiving this cell, the exit node resolves the address as necessary. - If the address cannot be resolved, the exit node replies with a RELAY_END - cell. (See 5.4 below.) Otherwise, the exit node replies with a - RELAY_CONNECTED cell, whose payload is in one of the following formats: - The IPv4 address to which the connection was made [4 octets] - A number of seconds (TTL) for which the address may be cached [4 octets] - or - Four zero-valued octets [4 octets] - An address type (6) [1 octet] - The IPv6 address to which the connection was made [16 octets] - A number of seconds (TTL) for which the address may be cached [4 octets] - [XXXX Versions of Tor before 0.1.1.6 ignore and do not generate the TTL - field. No version of Tor currently generates the IPv6 format.] 
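The RELAY_BEGIN_UDP and RELAY_CONNECTED payloads described above can be sketched as follows. The function names are hypothetical, the TTL is assumed to be a big-endian unsigned integer, and the input to parse_connected is assumed to be the payload already cut to the cell's Length field. The 4-byte case reflects the note that versions before 0.1.1.6 do not generate the TTL.

```python
import ipaddress
import struct


def encode_begin_udp(address, port):
    """ADDRESS ':' PORT followed by one NUL byte. IPv6 must be in brackets."""
    if not 0 < port < 65536:
        raise ValueError("port out of range")
    return f"{address}:{port}".encode("ascii") + b"\x00"


def decode_begin_udp(payload):
    """Split a RELAY_BEGIN_UDP payload into (address, port)."""
    text, nul, _rest = payload.partition(b"\x00")
    if not nul:
        raise ValueError("missing NUL terminator")
    # rpartition keeps the colons inside a bracketed IPv6 address intact.
    host, colon, port = text.decode("ascii").rpartition(":")
    if not colon or not host or not port.isdigit():
        raise ValueError("expected ADDRESS:PORT")
    return host, int(port)


def parse_connected(payload):
    """Return (address, ttl) from a RELAY_CONNECTED payload; ttl may be None."""
    if len(payload) == 4:                       # old format without a TTL
        return ipaddress.IPv4Address(payload), None
    if len(payload) == 8:                       # IPv4 address + TTL
        (ttl,) = struct.unpack("!I", payload[4:8])
        return ipaddress.IPv4Address(payload[:4]), ttl
    if len(payload) == 25 and payload[:4] == b"\x00" * 4 and payload[4] == 6:
        (ttl,) = struct.unpack("!I", payload[21:25])
        return ipaddress.IPv6Address(payload[5:21]), ttl
    raise ValueError("unrecognized RELAY_CONNECTED payload")
```

The trailing NUL lets a receiver treat the payload as a C string, which matches the explanation given in the bracketed comment above.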
- - The OP waits for a RELAY_CONNECTED cell before sending any data. - Once a connection has been established, the OP and exit node - package UDP data in RELAY_DATA_UDP cells, and upon receiving such - cells, echo their contents to the corresponding socket. - RELAY_DATA_UDP cells sent to unrecognized streams are dropped. - - Relay RELAY_DROP_UDP cells are long-range dummies; upon receiving such - a cell, the OR or OP must drop it. - -5.3. Closing streams - - UDP tunnels are closed in a fashion corresponding to TCP connections. - -6. Flow Control - - UDP streams are not subject to flow control. - -7.2. Router descriptor format. - -The items' formats are as follows: - "router" nickname address ORPort SocksPort DirPort UDPPort - - Indicates the beginning of a router descriptor. "address" must be - an IPv4 address in dotted-quad format. The last three numbers - indicate the TCP ports at which this OR exposes - functionality. ORPort is a port at which this OR accepts TLS - connections for the main OR protocol; SocksPort is deprecated and - should always be 0; DirPort is the port at which this OR accepts - directory-related HTTP connections; and UDPPort is a port at which - this OR accepts DTLS connections for UDP data. If any port is not - supported, the value 0 is given instead of a port number. - -Other sections: - -What changes need to happen to each node's exit policy to support this? -RD - -Switching to UDP means managing the queues of incoming packets better, -so we don't miss packets. How does this interact with doing large public -key operations (handshakes) in the same thread? -RD - -======================================================================== -COMMENTS -======================================================================== - -[16 May 2006] - -I don't favor this approach; it makes packet traffic partitioned from -stream traffic end-to-end. 
The architecture I'd like to see is: - - A *All* Tor-to-Tor traffic is UDP/DTLS, unless we need to fall back on - TCP/TLS for firewall penetration or something. (This also gives us an - upgrade path for routing through legacy servers.) - - B Stream traffic is handled with end-to-end per-stream acks/naks and - retries. On failure, the data is retransmitted in a new RELAY_DATA cell; - a cell isn't retransmitted. - -We'll need to do A anyway, to fix our behavior on packet-loss. Once we've -done so, B is more or less inevitable, and we can support end-to-end UDP -traffic "for free". - -(Also, there are some details that this draft spec doesn't address. For -example, what happens when a UDP packet doesn't fit in a single cell?) - --NM diff --git a/doc/spec/proposals/101-dir-voting.txt b/doc/spec/proposals/101-dir-voting.txt deleted file mode 100644 index 634d3f1948..0000000000 --- a/doc/spec/proposals/101-dir-voting.txt +++ /dev/null @@ -1,283 +0,0 @@ -Filename: 101-dir-voting.txt -Title: Voting on the Tor Directory System -Author: Nick Mathewson -Created: Nov 2006 -Status: Closed -Implemented-In: 0.2.0.x - -Overview - - This document describes a consensus voting scheme for Tor directories; - instead of publishing different network statuses, directories would vote on - and publish a single "consensus" network status document. - - This is an open proposal. - -Proposal: - -0. Scope and preliminaries - - This document describes a consensus voting scheme for Tor directories. - Once it's accepted, it should be merged with dir-spec.txt. Some - preliminaries for authority and caching support should be done during - the 0.1.2.x series; the main deployment should come during the 0.2.0.x - series. - -0.1. Goals and motivation: voting. - - The current directory system relies on clients downloading separate - network status statements from the caches signed by each directory. 
- Clients download a new statement every 30 minutes or so, choosing to - replace the oldest statement they currently have. - - This creates a partitioning problem: different clients have different - "most recent" networkstatus sources, and different versions of each - (since authorities change their statements often). - - It also creates a scaling problem: most of the downloaded networkstatus - are probably quite similar, and the redundancy grows as we add more - authorities. - - So if we have clients only download a single multiply signed consensus - network status statement, we can: - - Save bandwidth. - - Reduce client partitioning - - Reduce client-side and cache-side storage - - Simplify client-side voting code (by moving voting away from the - client) - - We should try to do this without: - - Assuming that client-side or cache-side clocks are more correct - than we assume now. - - Assuming that authority clocks are perfectly correct. - - Degrading badly if a few authorities die or are offline for a bit. - - We do not have to perform well if: - - No clique of more than half the authorities can agree about who - the authorities are. - -1. The idea. - - Instead of publishing a network status whenever something changes, - each authority instead publishes a fresh network status only once per - "period" (say, 60 minutes). Authorities either upload this network - status (or "vote") to every other authority, or download every other - authority's "vote" (see 3.1 below for discussion on push vs pull). - - After an authority has (or has become convinced that it won't be able to - get) every other authority's vote, it deterministically computes a - consensus networkstatus, and signs it. Authorities download (or are - uploaded; see 3.1) one another's signatures, and form a multiply signed - consensus. This multiply-signed consensus is what caches cache and what - clients download. 
- - If an authority is down, authorities vote based on what they *can* - download/get uploaded. - - If an authority is "a little" down and only some authorities can reach - it, authorities try to get its info from other authorities. - - If an authority computes the vote wrong, its signature isn't included on - the consensus. - - Clients use a consensus if it is "trusted": signed by more than half the - authorities they recognize. If clients can't find any such consensus, - they use the most recent trusted consensus they have. If they don't - have any trusted consensus, they warn the user and refuse to operate - (and if DirServers is not the default, beg the user to adapt the list - of authorities). - -2. Details. - -2.0. Versioning - - All documents generated here have version "3" given in their - network-status-version entries. - -2.1. Vote specifications - - Votes in v3 are similar to v2 network status documents. We add these - fields to the preamble: - - "vote-status" -- the word "vote". - - "valid-until" -- the time when this authority expects to publish its - next vote. - - "known-flags" -- a space-separated list of flags that will sometimes - be included on "s" lines later in the vote. - - "dir-source" -- as before, except the "hostname" part MUST be the - authority's nickname, which MUST be unique among authorities, and - MUST match the nickname in the "directory-signature" entry. - - Authorities SHOULD cache their most recently generated votes so they - can persist them across restarts. Authorities SHOULD NOT generate - another document until valid-until has passed. - - Router entries in the vote MUST be sorted in ascending order by router - identity digest. The flags in "s" lines MUST appear in alphabetical - order. - - Votes SHOULD be synchronized to half-hour publication intervals (one - hour? XXX say more; be more precise.) - - XXXX some way to request older networkstatus docs? - -2.2. 
Consensus directory specifications - - Consensuses are like v3 votes, except for the following fields: - - "vote-status" -- the word "consensus". - - "published" is the latest of all the published times on the votes. - - "valid-until" is the earliest of all the valid-until times on the - votes. - - "dir-source" and "fingerprint" and "dir-signing-key" and "contact" - are included for each authority that contributed to the vote. - - "vote-digest" for each authority that contributed to the vote, - calculated as for the digest in the signature on the vote. [XXX - re-English this sentence] - - "client-versions" and "server-versions" are sorted in ascending - order based on version-spec.txt. - - "dir-options" and "known-flags" are not included. -[XXX really? why not list the ones that are used in the consensus? -For example, right now BadExit is in use, but no servers would be -labelled BadExit, and it's still worth knowing that it was considered -by the authorities. -RD] - - The fields MUST occur in the following order: - "network-status-version" - "vote-status" - "published" - "valid-until" - For each authority, sorted in ascending order of nickname, case- - insensitively: - "dir-source", "fingerprint", "contact", "dir-signing-key", - "vote-digest". - "client-versions" - "server-versions" - - The signatures at the end of the document appear as multiple instances - of directory-signature, sorted in ascending order by nickname, - case-insensitively. - - A router entry should be included in the result if it is included by more - than half of the authorities (total authorities, not just those whose votes - we have). A router entry has a flag set if it is included by more than - half of the authorities who care about that flag. [XXXX this creates an - incentive for attackers to DOS authorities whose votes they don't like. - Can we remember what flags people set the last time we saw them? -NM] - [Which 'we' are we talking here? 
The end-users never learn which - authority sets which flags. So you're thinking the authorities - should record the last vote they saw from each authority and if it's - within a week or so, count all the flags that it advertised as 'no' - votes? Plausible. -RD] - - The signature hash covers from the "network-status-version" line through - the characters "directory-signature" in the first "directory-signature" - line. - - Consensus directories SHOULD be rejected if they are not signed by more - than half of the known authorities. - -2.2.1. Detached signatures - - Assuming full connectivity, every authority should compute and sign the - same consensus directory in each period. Therefore, it isn't necessary to - download the consensus computed by each authority; instead, the authorities - only push/fetch each others' signatures. A "detached signature" document - contains a single "consensus-digest" entry and one or more - directory-signature entries. [XXXX specify more.] - -2.3. URLs and timelines - -2.3.1. URLs and timeline used for agreement - - An authority SHOULD publish its vote immediately at the start of each voting - period. It does this by making it available at - http://<hostname>/tor/status-vote/current/authority.z - and sending it in an HTTP POST request to each other authority at the URL - http://<hostname>/tor/post/vote - - If, N minutes after the voting period has begun, an authority does not have - a current statement from another authority, the first authority retrieves - the other's statement. - - Once an authority has a vote from another authority, it makes it available - at - http://<hostname>/tor/status-vote/current/<fp>.z - where <fp> is the fingerprint of the other authority's identity key. 
- - The consensus network status, along with as many signatures as the server - currently knows, should be available at - http://<hostname>/tor/status-vote/current/consensus.z - All of the detached signatures it knows for consensus status should be - available at: - http://<hostname>/tor/status-vote/current/consensus-signatures.z - - Once an authority has computed and signed a consensus network status, it - should send its detached signature to each other authority in an HTTP POST - request to the URL: - http://<hostname>/tor/post/consensus-signature - - - [XXXX Store votes to disk.] - -2.3.2. Serving a consensus directory - - Once the authority is done getting signatures on the consensus directory, - it should serve it from: - http://<hostname>/tor/status/consensus.z - - Caches SHOULD download consensus directories from an authority and serve - them from the same URL. - -2.3.3. Timeline and synchronization - - [XXXX] - -2.4. Distributing routerdescs between authorities - - Consensus will be more meaningful if authorities take steps to make sure - that they all have the same set of descriptors _before_ the voting - starts. This is safe, since all descriptors are self-certified and - timestamped: it's always okay to replace a signed descriptor with a more - recent one signed by the same identity. - - In the long run, we might want some kind of sophisticated process here. - For now, since authorities already download one another's networkstatus - documents and use them to determine what descriptors to download from one - another, we can rely on this existing mechanism to keep authorities up to - date. - - [We should do a thorough read-through of dir-spec again to make sure - that the authorities converge on which descriptor to "prefer" for - each router. Right now the decision happens at the client, which is - no longer the right place for it. -RD] - -3. Questions and concerns - -3.1. Push or pull? 
- - The URLs above define a push mechanism for publishing votes and consensus - signatures via HTTP POST requests, and a pull mechanism for downloading - these documents via HTTP GET requests. As specified, every authority will - post to every other. The "download if no copy has been received" mechanism - exists only as a fallback. - -4. Migration - - * It would be cool if caches could get ready to download consensus - status docs, verify enough signatures, and serve them now. That way - once stuff works all we need to do is upgrade the authorities. Caches - don't need to verify the correctness of the format so long as it's - signed (or maybe multisigned?). We need to make sure that caches back - off very quickly from downloading consensus docs until they're - actually implemented. - diff --git a/doc/spec/proposals/102-drop-opt.txt b/doc/spec/proposals/102-drop-opt.txt deleted file mode 100644 index 490376bb53..0000000000 --- a/doc/spec/proposals/102-drop-opt.txt +++ /dev/null @@ -1,38 +0,0 @@ -Filename: 102-drop-opt.txt -Title: Dropping "opt" from the directory format -Author: Nick Mathewson -Created: Jan 2007 -Status: Closed -Implemented-In: 0.2.0.x - -Overview: - - This document proposes a change in the format used to transmit router and - directory information. - - This proposal has been accepted, implemented, and merged into dir-spec.txt. - -Proposal: - - The "opt" keyword in Tor's directory formats was originally intended to - mean, "it is okay to ignore this entry if you don't understand it"; the - default behavior has been "discard a routerdesc if it contains entries you - don't recognize." - - But so far, every new flag we have added has been marked 'opt'. It would - probably make sense to change the default behavior to "ignore unrecognized - fields", and add the statement that clients SHOULD ignore fields they don't - recognize. 
As a meta-principle, we should say that clients and servers - MUST NOT have to understand new fields in order to use directory documents - correctly. - - Of course, this will make it impossible to say, "The format has changed a - lot; discard this quietly if you don't understand it." We could do that by - adding a version field. - -Status: - - * We stopped requiring it as of 0.1.2.5-alpha. We'll stop generating it - once earlier formats are obsolete. - - diff --git a/doc/spec/proposals/103-multilevel-keys.txt b/doc/spec/proposals/103-multilevel-keys.txt deleted file mode 100644 index c8a7a6677b..0000000000 --- a/doc/spec/proposals/103-multilevel-keys.txt +++ /dev/null @@ -1,204 +0,0 @@ -Filename: 103-multilevel-keys.txt -Title: Splitting identity key from regularly used signing key. -Author: Nick Mathewson -Created: Jan 2007 -Status: Closed -Implemented-In: 0.2.0.x - -Overview: - - This document proposes a change in the way identity keys are used, so that - highly sensitive keys can be password-protected and seldom loaded into RAM. - - It presents options; it is not yet a complete proposal. - -Proposal: - - Replacing a directory authority's identity key in the event of a compromise - would be tremendously annoying. We'd need to tell every client to switch - their configuration, or update to a new version with an uploaded list. So - long as some weren't upgraded, they'd be at risk from whoever had - compromised the key. - - With this in mind, it's a shame that our current protocol forces us to - store identity keys unencrypted in RAM. We need some kind of signing key - stored unencrypted, since we need to generate new descriptors/directories - and rotate link and onion keys regularly. (And since, of course, we can't - ask server operators to be on-hand to enter a passphrase every time we - want to rotate keys or sign a descriptor.) 
- - The obvious solution seems to be to have a signing-only key that lives - indefinitely (months or longer) and signs descriptors and link keys, and a - separate identity key that's used to sign the signing key. Tor servers - could run in one of several modes: - 1. Identity key stored encrypted. You need to pick a passphrase when - you enable this mode, and re-enter this passphrase every time you - rotate the signing key. - 1'. Identity key stored separate. You save your identity key to a - floppy, and use the floppy when you need to rotate the signing key. - 2. All keys stored unencrypted. In this case, we might not want to even - *have* a separate signing key. (We'll need to support no-separate- - signing-key mode anyway to keep old servers working.) - 3. All keys stored encrypted. You need to enter a passphrase to start - Tor. - (Of course, we might not want to implement all of these.) - - Case 1 is probably most usable and secure, if we assume that people don't - forget their passphrases or lose their floppies. We could mitigate this a - bit by encouraging people to PGP-encrypt their passphrases to themselves, - or keep a cleartext copy of their secret key secret-split into a few - pieces, or something like that. - - Migration presents another difficulty, especially with the authorities. If - we use the current set of identity keys as the new identity keys, we're in - the position of having sensitive keys that have been stored on - media-of-dubious-encryption up to now. Also, we need to keep old clients - (who will expect descriptors to be signed by the identity keys they know - and love, and who will not understand signing keys) happy. - -A possible solution: - - One thing to consider is that router identity keys are not very sensitive: - if an OR disappears and reappears with a new key, the network treats it as - though an old router had disappeared and a new one had joined the network. - The Tor network continues unharmed; this isn't a disaster. 
- - Thus, the ideas above are mostly relevant for authorities. - - The most straightforward solution for the authorities is probably to take - advantage of the protocol transition that will come with proposal 101, and - introduce a new set of signing _and_ identity keys used only to sign votes - and consensus network-status documents. Signing and identity keys could be - delivered to users in a separate, rarely changing "keys" document, so that - the consensus network-status documents wouldn't need to include N signing - keys, N identity keys, and N certifications. - - Note also that there is no reason that the identity/signing keys used by - directory authorities would necessarily have to be the same as the identity - keys those authorities use in their capacity as routers. Decoupling these - keys would give directory authorities the following set of keys: - - Directory authority identity: - Highly confidential; stored encrypted and/or offline. Used to - identify directory authorities. Shipped with clients. Used to - sign Directory authority signing keys. - - Directory authority signing key: - Stored online, accessible to regular Tor process. Used to sign - votes and consensus directories. Downloaded as part of a "keys" - document. - - [Administrators SHOULD rotate their signing keys every month or - two, just to keep in practice and keep from forgetting the - password to the authority identity.] - - V1-V2 directory authority identity: - Stored online, never changed. Used to sign legacy network-status - and directory documents. - - Router identity: - Stored online, seldom changed. Used to sign server descriptors - for this authority in its role as a router. Implicitly certified - by being listed in network-status documents. - - Onion key, link key: - As in tor-spec.txt - - -Extensions to Proposal 101. - - Define a new document type, "Key certificate". It contains the - following fields, in order: - - "dir-key-certificate-version": As network-status-version. 
Must be - "3". - "fingerprint": Hex fingerprint, with spaces, based on the directory - authority's identity key. - "dir-identity-key": The long-term identity key for this authority. - "dir-key-published": The time when this directory's signing key was - last changed. - "dir-key-expires": A time after which this key is no longer valid. - "dir-signing-key": As in proposal 101. - "dir-key-certification": A signature of the above fields, in order. - The signed material extends from the beginning of - "dir-key-certificate-version" through the newline after - "dir-key-certification". The identity key is used to generate - this signature. - - These elements together constitute a "key certificate". These are - generated offline when starting a v3 authority. Private identity - keys SHOULD be stored offline, encrypted, or both. A running - authority only needs access to the signing key. - - Unlike other keys currently used by Tor, the authority identity - keys and directory signing keys MAY be longer than 1024 bits. - (They SHOULD be 2048 bits or longer; they MUST NOT be shorter than - 1024.) - - Vote documents change as follows: - - A key certificate MUST be included in-line in every vote document. With - the exception of "fingerprint", its elements MUST NOT appear in consensus - documents. - - Consensus network statuses change as follows: - - Remove dir-signing-key. - - Change "directory-signature" to take a fingerprint of the authority's - identity key and a fingerprint of the authority's current signing key - rather than the authority's nickname. - - Change "dir-source" to take a fingerprint of the authority's - identity key rather than the authority's nickname or hostname. - - Add a new document type: - - A "keys" document contains all currently known key certificates. - All authorities serve it at - - http://<hostname>/tor/status/keys.z - - Caches and clients download the keys document whenever they receive a - consensus vote that uses a key they do not recognize. 
Caches download - from authorities; clients download from caches. - - Processing votes: - - When receiving a vote, authorities check to see if the key - certificate for the voter is different from the one they have. If - the key certificate _is_ different, and its dir-key-published is - more recent than the most recently known one, and it is - well-formed and correctly signed with the correct identity key, - then authorities remember it as the new canonical key certificate - for that voter. - - A key certificate is invalid if any of the following hold: - * The version is unrecognized. - * The fingerprint does not match the identity key. - * The identity key or the signing key is ill-formed. - * The published date is very far in the past or future. - - * The signature is not a valid signature of the key certificate - generated with the identity key. - - When processing the signatures on consensus, clients and caches act as - follows: - - 1. Only consider the directory-signature entries whose identity - key hashes match trusted authorities. - - 2. If any such entries have signing key hashes that match unknown - signing keys, download a new keys document. - - 3. For every entry with a known (identity key,signing key) pair, - check the signature on the document. - - 4. If the document has been signed by more than half of the - authorities the client recognizes, treat the consensus as - correctly signed. - - If not, but the number of entries with known identity keys but - unknown signing keys might be enough to make the consensus - correctly signed, do not use the consensus, but do not discard - it until we have a new keys document. 
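The four signature-processing steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not code from Tor: DirSignature, classify_consensus, the verdict strings, and the caller-supplied verify callback are all hypothetical names, and counting at most one signature per authority identity is an assumption the proposal does not spell out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DirSignature:
    identity_fp: str    # fingerprint of the authority's identity key
    signing_fp: str     # fingerprint of the signing key that made the signature
    signature: bytes


def classify_consensus(signatures, trusted_identities, known_keys, verify):
    """Return (verdict, fetch_keys) for a consensus document.

    trusted_identities: set of identity fingerprints the client recognizes.
    known_keys: set of (identity_fp, signing_fp) pairs from key certificates.
    verify: caller-supplied callable(DirSignature) -> bool (hypothetical).
    """
    # Step 1: only consider entries from trusted authorities (one per identity).
    relevant = {}
    for sig in signatures:
        if sig.identity_fp in trusted_identities:
            relevant.setdefault(sig.identity_fp, sig)

    good = unknown = 0
    for sig in relevant.values():
        if (sig.identity_fp, sig.signing_fp) not in known_keys:
            unknown += 1            # step 2: signing key we have not seen
        elif verify(sig):
            good += 1               # step 3: known pair, signature checks out

    fetch_keys = unknown > 0        # step 2: download a new keys document
    total = len(trusted_identities)
    if good * 2 > total:            # step 4: more than half of recognized
        return "signed", fetch_keys
    if (good + unknown) * 2 > total:
        return "hold", fetch_keys   # keep it until the keys document arrives
    return "reject", fetch_keys
```

With three recognized authorities, two verified signatures give "signed". One verified signature plus two entries with unknown signing keys gives "hold", since a new keys document might still push the count over half.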
diff --git a/doc/spec/proposals/104-short-descriptors.txt b/doc/spec/proposals/104-short-descriptors.txt deleted file mode 100644 index 90e0764fe6..0000000000 --- a/doc/spec/proposals/104-short-descriptors.txt +++ /dev/null @@ -1,181 +0,0 @@ -Filename: 104-short-descriptors.txt -Title: Long and Short Router Descriptors -Author: Nick Mathewson -Created: Jan 2007 -Status: Closed -Implemented-In: 0.2.0.x - -Overview: - - This document proposes moving unused-by-clients information from regular - router descriptors into a new "extra info" router descriptor. - -Proposal: - - Some of the costliest fields in the current directory protocol are ones - that no client actually uses. In particular, the "read-history" and - "write-history" fields are used only by the authorities for monitoring the - status of the network. If we took them out, the size of a compressed list - of all the routers would fall by about 60%. (No other disposable field - would save much more than 2%.) - - We propose to remove these fields from descriptors, and have them - uploaded as a part of a separate signed "extra info" to the authorities. - This document will be signed. A hash of this document will be included in - the regular descriptors. - - (We considered another design, where routers would generate and upload a - short-form and a long-form descriptor. Only the short-form descriptor would - ever be used by anybody for routing. The long-form descriptor would be - used only for analytics and other tools. We decided against this because - well-behaved tools would need to download short-form descriptors too (as - these would be the only ones indexed), and hence get redundant info. Badly - behaved tools would download only long-form descriptors, and expose - themselves to partitioning attacks.) - -Other disposable fields: - - Clients don't need these fields, but removing them doesn't help bandwidth - enough to be worthwhile. 
- contact (save about 1%) - fingerprint (save about 3%) - - We could represent these fields more succinctly, but removing them would - only save 1%. (!) - reject - accept - (Apparently, exit policies are highly compressible.) - - [Does size-on-disk matter to anybody? Some clients and servers don't - have much disk, or have really slow disk (e.g. USB). And we don't - store caches compressed right now. -RD] - -Specification: - - 1. Extra Info Format. - - An "extra info" descriptor contains the following fields: - - "extra-info" Nickname Fingerprint - Identifies what router this is an extra info descriptor for. - Fingerprint is encoded in hex (using upper-case letters), with - no spaces. - - "published" As currently documented in dir-spec.txt. It MUST match the - "published" field of the descriptor published at the same time. - - "read-history" - "write-history" - As currently documented in dir-spec.txt. Optional. - - "router-signature" NL Signature NL - - A signature of the PKCS1-padded hash of the entire extra info - document, taken from the beginning of the "extra-info" line, through - the newline after the "router-signature" line. An extra info - document is not valid unless the signature is performed with the - identity key whose digest matches FINGERPRINT. - - The "extra-info" field is required and MUST appear first. The - router-signature field is required and MUST appear last. All others are - optional. As for other documents, unrecognized fields must be ignored. - - 2. Existing formats - - Implementations that use "read-history" and "write-history" SHOULD - continue accepting router descriptors that contain them. (Prior to - 0.2.0.x, this information was encoded in ordinary router descriptors; - in any case they have always been listed as opt, so they should be - accepted anyway.) 
- - Add these fields to router descriptors: - - "extra-info-digest" Digest - "Digest" is a hex-encoded digest (using upper-case characters) - of the router's extra-info document, as signed in the router's - extra-info. (If this field is absent, no extra-info-digest - exists.) - - "caches-extra-info" - Present if this router is a directory cache that provides - extra-info documents, or an authority that handles extra-info - documents. - - (Since implementations before 0.1.2.5-alpha required that the "opt" - keyword precede any unrecognized entry, these keys MUST be preceded - with "opt" until 0.1.2.5-alpha is obsolete.) - - 3. New communications rules - - Servers SHOULD generate and upload one extra-info document after each - descriptor they generate and upload; no more, no less. Servers MUST - upload the new descriptor before they upload the new extra-info. - - Authorities receiving an extra-info document SHOULD verify all of the - following: - * They have a router descriptor for some server with a matching - nickname and identity fingerprint. - * That server's identity key has been used to sign the extra-info - document. - * The extra-info-digest field in the router descriptor matches - the digest of the extra-info document. - * The published fields in the two documents match. - - Authorities SHOULD drop extra-info documents that do not meet these - criteria. - - Extra-info documents MAY be uploaded as part of the same HTTP post as - the router descriptor, or separately. Authorities MUST accept both - methods. - - Authorities SHOULD try to fetch extra-info documents from one another if - they do not have one matching the digest declared in a router - descriptor. - - Caches that are running locally with a tool that needs to use extra-info - documents MAY download and store extra-info documents. They should do - so when they notice that the recommended descriptor has an - extra-info-digest not matching any extra-info document they currently - have. 
(Caches not running on a host that needs to use extra-info - documents SHOULD NOT download or cache them.) - - 4. New URLs - - http://<hostname>/tor/extra/d/... - http://<hostname>/tor/extra/fp/... - http://<hostname>/tor/extra/all[.z] - (As for /tor/server/ URLs: supports fetching extra-info documents - by their digest, by the fingerprint of their servers, or all - at once. When serving by fingerprint, we serve the extra-info - that corresponds to the descriptor we would serve by that - fingerprint. Only directory authorities are guaranteed to support - these URLs.) - - http://<hostname>/tor/extra/authority[.z] - (The extra-info document for this router.) - - Extra-info documents are uploaded to the same URLs as regular - router descriptors. - -Migration: - - For extra info approach: - * First: - * Authorities should accept extra info, and support serving it. - * Routers should upload extra info once authorities accept it. - * Caches should support an option to download and cache it, once - authorities serve it. - * Tools should be updated to use locally cached information. - These tools include: - lefkada's exit.py script. - tor26's noreply script and general directory cache. - https://nighteffect.us/tns/ for its graphs - and check with or-talk for the rest, once it's time. - - * Set a cutoff time for including bandwidth in router descriptors, so - that tools that use bandwidth info know that they will need to fetch - extra info documents. - - * Once tools that want bandwidth info support fetching extra info: - * Have routers stop including bandwidth info in their router - descriptors. diff --git a/doc/spec/proposals/105-handshake-revision.txt b/doc/spec/proposals/105-handshake-revision.txt deleted file mode 100644 index 791a016c26..0000000000 --- a/doc/spec/proposals/105-handshake-revision.txt +++ /dev/null @@ -1,323 +0,0 @@ -Filename: 105-handshake-revision.txt -Title: Version negotiation for the Tor protocol. 
-Author: Nick Mathewson, Roger Dingledine -Created: Jan 2007 -Status: Closed -Implemented-In: 0.2.0.x - -Overview: - - This document was extracted from a modified version of tor-spec.txt that we - had written before the proposal system went into place. It adds two new - cell types to the Tor link connection setup handshake: one used for - version negotiation, and another to prevent MITM attacks. - - This proposal is partially implemented, and partially superseded by - proposal 130. - -Motivation: Tor versions - - Our *current* approach to versioning the Tor protocol(s) has been as - follows: - - All changes must be backward compatible. - - It's okay to add new cell types, if they would be ignored by previous - versions of Tor. - - It's okay to add new data elements to cells, if they would be - ignored by previous versions of Tor. - - For forward compatibility, Tor must ignore cell types it doesn't - recognize, and ignore data in those cells it doesn't expect. - - Clients can inspect the version of Tor declared in the platform line - of a router's descriptor, and use that to learn whether a server - supports a given feature. Servers, however, aren't assumed to all - know about each other, and so don't know the version of who they're - talking to. - - This system has these problems: - - It's very hard to change fundamental aspects of the protocol, like the - cell format, the link protocol, any of the various encryption schemes, - and so on. - - The router-to-router link protocol has remained more-or-less frozen - for a long time, since we can't easily have an OR use new features - unless it knows the other OR will understand them. - - We need to resolve these problems because: - - Our cipher suite is showing its age: SHA1/AES128/RSA1024/DH1024 will - not seem like the best idea for all time. - - There are many ideas circulating for multiple cell sizes; while it's - not obvious whether these are safe, we can't do them at all without a - mechanism to permit them. 
- - There are many ideas circulating for alternative circuit building and - cell relay rules: they don't work unless they can coexist in the - current network. - - If our protocol changes a lot, it's hard to describe any coherent - version of it: we need to say "the version that Tor versions W through - X use when talking to versions Y through Z". This makes analysis - harder. - -Motivation: Preventing MITM attacks - - TLS prevents a man-in-the-middle attacker from reading or changing the - contents of a communication. It does not, however, prevent such an - attacker from observing timing information. Since timing attacks are some - of the most effective against low-latency anonymity nets like Tor, we - should take more care to make sure that we're not only talking to who - we think we're talking to, but that we're using the network path we - believe we're using. - -Motivation: Signed clock information - - It's very useful for Tor instances to know how skewed they are relative - to one another. The only way to find out currently has been to download - directory information, and check the Date header--but this is not - authenticated, and hence subject to modification on the wire. Using - BEGIN_DIR to create an authenticated directory stream through an existing - circuit is better, but that's an extra step and it might be nicer to - learn the information in the course of the regular protocol. - -Proposal: - -1.0. Version numbers - - The node-to-node TLS-based "OR connection" protocol and the multi-hop - "circuit" protocol are versioned quasi-independently. - - Of course, some dependencies will continue to exist: Certain versions - of the circuit protocol may require a minimum version of the connection - protocol to be used. The connection protocol affects: - - Initial connection setup, link encryption, transport guarantees, - etc. - - The allowable set of cell commands - - Allowable formats for cells. 
- - The circuit protocol determines: - - How circuits are established and maintained - - How cells are decrypted and relayed - - How streams are established and maintained. - - Version numbers are incremented for backward-incompatible protocol changes - only. Backward-compatible changes are generally implemented by adding - additional fields to existing structures; implementations MUST ignore - fields they do not expect. Unused portions of cells MUST be set to zero. - - Though versioning the protocol will make it easier to maintain backward - compatibility with older versions of Tor, we will nevertheless continue to - periodically drop support for older protocols, - - to keep the implementation from growing without bound, - - to limit the maintenance burden of patching bugs in obsolete Tors, - - to limit the testing burden of verifying that many old protocol - versions continue to be implemented properly, and - - to limit the exposure of the network to protocol versions that are - expensive to support. - - The Tor protocol as implemented through the 0.1.2.x Tor series will be - called "version 1" in its link protocol and "version 1" in its relay - protocol. Versions of the Tor protocol so old as to be incompatible with - Tor 0.1.2.x can be considered to be version 0 of each, and are not - supported. - -2.1. VERSIONS cells - - When a Tor connection is established, both parties normally send a - VERSIONS cell before sending any other cells. (But see below.) - - VersionsLen [2 byte] - Versions [VersionsLen bytes] - - "Versions" is a sequence of VersionsLen bytes. Each value between 1 and - 127 inclusive represents a single version; current implementations MUST - ignore other bytes. Parties should list all of the versions which they - are able and willing to support. Parties can only communicate if they - have some connection protocol version in common. 
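The VERSIONS payload layout above can be illustrated with a short Python sketch. This is only a sketch under stated assumptions: the function names are hypothetical, the proposal does not state the byte order of VersionsLen (network order is assumed here), and the choice of the highest common version follows the rule this proposal gives in its discussion of VERSIONS cells.

```python
import struct


def parse_versions_payload(payload):
    """Return the set of versions listed in a VERSIONS cell payload."""
    if len(payload) < 2:
        raise ValueError("payload too short to hold VersionsLen")
    # Assumption: VersionsLen is a 2-byte big-endian (network order) integer.
    (versions_len,) = struct.unpack("!H", payload[:2])
    body = payload[2:2 + versions_len]
    if len(body) < versions_len:
        raise ValueError("payload shorter than VersionsLen")
    # Each value in 1..127 names one version; other bytes MUST be ignored.
    return {b for b in body if 1 <= b <= 127}


def negotiate(ours, theirs):
    """Pick the highest version common to both sides, or None if disjoint."""
    common = set(ours) & set(theirs)
    return max(common) if common else None
```

For example, a payload listing the bytes 1, 2, 0 and 200 yields the versions {1, 2}, because 0 and 200 fall outside the 1..127 range and are skipped. A result of None from negotiate() corresponds to the case where the parties have no connection protocol version in common and cannot communicate.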
- - Version 0.2.0.x-alpha and earlier don't understand VERSIONS cells, - and therefore don't support version negotiation. Thus, waiting until - the other side has sent a VERSIONS cell won't work for these servers: - if the other side sends no cells back, it is impossible to tell - whether they - have sent a VERSIONS cell that has been stalled, or whether they have - dropped our own VERSIONS cell as unrecognized. Therefore, we'll - change the TLS negotiation parameters so that old parties can still - negotiate, but new parties can recognize each other. Immediately - after a TLS connection has been established, the parties check - whether the other side negotiated the connection in an "old" way or a - "new" way. If either party negotiated in the "old" way, we assume a - v1 connection. Otherwise, both parties send VERSIONS cells listing - all their supported versions. Upon receiving the other party's - VERSIONS cell, the implementation begins using the highest-valued - version common to both cells. If the first cell from the other party - has a recognized command, and is _not_ a VERSIONS cell, we assume a - v1 protocol. - - (For more detail on the TLS protocol change, see forthcoming draft - proposals from Steven Murdoch.) - - Implementations MUST discard VERSIONS cells that are not the first - recognized cells sent on a connection. - - The VERSIONS cell must be sent as a v1 cell (2 bytes of circuitID, 1 - byte of command, 509 bytes of payload). - - [NOTE: The VERSIONS cell is assigned the command number 7.] - -2.2. MITM-prevention and time checking - - If we negotiate a v2 connection or higher, the second cell we send SHOULD - be a NETINFO cell. Implementations SHOULD NOT send NETINFO cells at other - times. - - A NETINFO cell contains: - Timestamp [4 bytes] - Other OR's address [variable] - Number of addresses [1 byte] - This OR's addresses [variable] - - Timestamp is the OR's current Unix time, in seconds since the epoch. 
If - an implementation receives time values from many ORs that - indicate that its clock is skewed, it SHOULD try to warn the - administrator. (We leave the definition of 'many' intentionally vague - for now.) - - Before believing the timestamp in a NETINFO cell, implementations - SHOULD compare the time at which they received the cell to the time - when they sent their VERSIONS cell. If the difference is very large, - it is likely that the cell was delayed long enough that its - contents are out of date. - - Each address contains Type/Length/Value as used in Section 6.4 of - tor-spec.txt. The first address is the one that the party sending - the NETINFO cell believes the other has -- it can be used to learn - what your IP address is if you have no other hints. - The rest of the addresses are the advertised addresses of the party - sending the NETINFO cell -- we include them - to block a man-in-the-middle attack on TLS that lets an attacker bounce - traffic through his own computers to enable timing and packet-counting - attacks. - - A Tor instance should use the other Tor's reported address - information as part of logic to decide whether to treat a given - connection as suitable for extending circuits to a given address/ID - combination. When we get an extend request, we use an - existing OR connection if the ID matches, and ANY of the following - conditions hold: - - The IP matches the requested IP. - - We know that the IP we're using is canonical because it was - listed in the NETINFO cell. - - We know that the IP we're using is canonical because it was - listed in the server descriptor. - - [NOTE: The NETINFO cell is assigned the command number 8.] - -Discussion: Versions versus feature lists - - Many protocols negotiate lists of available features instead of (or in - addition to) protocol versions. 
While it's possible that some amount of - feature negotiation could be supported in a later Tor, we should prefer to - use protocol versions whenever possible, for reasons discussed in - the "Anonymity Loves Company" paper. - -Discussion: Bytes per version, versions per cell - - This document provides for a one-byte count of how many versions a Tor - supports, and allows one byte per version. Thus, it can support only - 254 more versions of the protocol beyond the unallocated v0 and the - current v1. If we ever need to split the protocol into 255 incompatible - versions, we've probably screwed up badly somewhere. - - Nevertheless, here are two ways we could support more versions: - - Change the version count to a two-byte field that counts the number of - _bytes_ used, and use a UTF8-style encoding: versions 0 through 127 - take one byte to encode, versions 128 through 2047 take two bytes to - encode, and so on. We wouldn't need to parse any version higher than - 127 right now, since all bytes used to encode higher versions would - have their high bit set. - - We'd still have a limit of 380 simultaneous versions that could be - declared in any version. This is probably okay. - - - Decide that if we need to support more versions, we can add a - MOREVERSIONS cell that gets sent before the VERSIONS cell. The spec - above requires Tors to ignore unrecognized cell types that they get - before the first VERSIONS cell, and still allows version negotiation - to - succeed. - - [Resolution: Reserve the high bit and the v0 value for later use. If - we ever have more live versions than we can fit in a cell, we've made a - bad design decision somewhere along the line.] - -Discussion: Reducing round-trips - - It might be appealing to see if we can cram more information in the - initial VERSIONS cell. 
For example, the contents of NETINFO will pretty - soon be sent by everybody before any more information is exchanged, but - decoupling them from the version exchange increases round-trips. - - Instead, we could speculatively include handshaking information at - the end of a VERSIONS cell, wrapped in a marker to indicate, "if we wind - up speaking VERSION 2, here's the NETINFO I'll send. Otherwise, ignore - this." This could be extended to opportunistically reduce round trips - when possible for future versions when we guess the versions right. - - Of course, we'd need to be careful about using a feature like this: - - We don't want to include things that are expensive to compute, - like PK signatures or proof-of-work. - - We don't want to speculate as a mobile client: it may leak our - experience with the server in question. - -Discussion: Advertising versions in routerdescs and networkstatuses. - - In network-statuses: - - The networkstatus "v" line now has the format: - "v" IMPLEMENTATION IMPL-VERSION "Link" LINK-VERSION-LIST - "Circuit" CIRCUIT-VERSION-LIST NL - - LINK-VERSION-LIST and CIRCUIT-VERSION-LIST are comma-separated lists of - supported version numbers. IMPLEMENTATION is the name of the - implementation of the Tor protocol (e.g., "Tor"), and IMPL-VERSION is the - version of the implementation. - - Examples: - v Tor 0.2.5.1-alpha Link 1,2,3 Circuit 2,5 - - v OtherOR 2000+ Link 3 Circuit 5 - - Implementations that release independently of the Tor codebase SHOULD NOT - use "Tor" as the value of their IMPLEMENTATION. - - Additional fields on the "v" line MUST be ignored. - - In router descriptors: - - The router descriptor should contain a line of the form, - "protocols" "Link" LINK-VERSION-LIST "Circuit" CIRCUIT_VERSION_LIST - - Additional fields on the "protocols" line MUST be ignored. 
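The "protocols" line format above could be parsed roughly as follows. This is a minimal Python sketch, not the Tor implementation: the function name is hypothetical, the optional leading "opt" keyword is handled because the proposal asks for it while older Tors are still in use, and strict whitespace-separated tokens are an assumption.

```python
def parse_protocols_line(line):
    """Parse a router descriptor "protocols" line into two version lists."""
    tokens = line.split()
    # The proposal asks for an "opt" prefix while older Tors are in use.
    if tokens and tokens[0] == "opt":
        tokens = tokens[1:]
    if len(tokens) < 5 or tokens[0] != "protocols":
        raise ValueError("not a protocols line")
    if tokens[1] != "Link" or tokens[3] != "Circuit":
        raise ValueError("expected the Link and Circuit keywords")
    # Both lists are comma-separated supported version numbers.
    link_versions = [int(v) for v in tokens[2].split(",")]
    circuit_versions = [int(v) for v in tokens[4].split(",")]
    # Anything after the Circuit list is an additional field and is ignored.
    return link_versions, circuit_versions
```

Ignoring trailing tokens rather than rejecting the line is what lets later versions add fields to the line without breaking older parsers.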
- - [Versions of Tor before 0.1.2.5-alpha rejected router descriptors with - unrecognized items; the protocols line should be preceded with an "opt" - until these Tors are obsolete.] - -Security issues: - - Client partitioning is the big danger when we introduce new versions; if a - client supports some very unusual set of protocol versions, it will stand - out from others no matter where it goes. If a server supports an unusual - version, it will get a disproportionate amount of traffic from clients who - prefer that version. We can mitigate this somewhat as follows: - - - Do not have clients prefer any protocol version by default until that - version is widespread. (First introduce the new version to servers, - and have clients admit to using it only when configured to do so for - testing. Then, once many servers are running the new protocol - version, enable its use by default.) - - - Do not multiply protocol versions needlessly. - - - Encourage protocol implementors to implement the same protocol version - sets as some popular version of Tor. - - - Disrecommend very old/unpopular versions of Tor via the directory - authorities' RecommendedVersions mechanism, even if it is still - technically possible to use them. - diff --git a/doc/spec/proposals/106-less-tls-constraint.txt b/doc/spec/proposals/106-less-tls-constraint.txt deleted file mode 100644 index 7e7621df69..0000000000 --- a/doc/spec/proposals/106-less-tls-constraint.txt +++ /dev/null @@ -1,111 +0,0 @@ -Filename: 106-less-tls-constraint.txt -Title: Checking fewer things during TLS handshakes -Author: Nick Mathewson -Created: 9-Feb-2007 -Status: Closed -Implemented-In: 0.2.0.x - -Overview: - - This document proposes that we relax our requirements on the context of - X.509 certificates during initial TLS handshakes. - -Motivation: - - Later, we want to try harder to avoid protocol fingerprinting attacks. 
- This means that we'll need to make our connection handshake look closer - to a regular HTTPS connection: one certificate on the server side and - zero certificates on the client side. For now, about the best we - can do is to stop requiring things during handshake that we don't - actually use. - -What we check now, and where we check it: - - tor_tls_check_lifetime: - peer has certificate - notBefore <= now <= notAfter - - tor_tls_verify: - peer has at least one certificate - There is at least one certificate in the chain - At least one of the certificates in the chain is not the one used to - negotiate the connection. (The "identity cert".) - The certificate _not_ used to negotiate the connection has signed the - link cert - - tor_tls_get_peer_cert_nickname: - peer has a certificate. - certificate has a subjectName. - subjectName has a commonName. - commonName consists only of characters in LEGAL_NICKNAME_CHARACTERS. [2] - - tor_tls_peer_has_cert: - peer has a certificate. - - connection_or_check_valid_handshake: - tor_tls_peer_has_cert [1] - tor_tls_get_peer_cert_nickname [1] - tor_tls_verify [1] - If nickname in cert is a known, named router, then its identity digest - must be as expected. - If we initiated the connection, then we got the identity digest we - expected. - - USEFUL THINGS WE COULD DO: - - [1] We could just not force clients to have any certificate at all, let alone - an identity certificate. Internally to the code, we could assign the - identity_digest field of these or_connections to a random number, or even - not add them to the identity_digest->or_conn map. - [so if somebody connects with no certs, we let them. and mark them as - a client and don't treat them as a server. great. 
-rd] - - [2] Instead of using a restricted nickname character set that makes our - commonName structure look unlike typical SSL certificates, we could treat - the nickname as extending from the start of the commonName up to but not - including the first non-nickname character. - - Alternatively, we could stop checking commonNames entirely. We don't - actually _do_ anything based on the nickname in the certificate, so - there's really no harm in letting every router have any commonName it - wants. - [this is the better choice -rd] - [agreed. -nm] - -REMAINING WAYS TO RECOGNIZE CLIENT->SERVER CONNECTIONS: - - Assuming that we removed the above requirements, and then (in a later - release) had clients not send certificates and started - making our DNs a little less formulaic, client->server OR connections would - still be recognizable by: - having a two-certificate chain sent by the server - using a particular set of ciphersuites - traffic patterns - probing the server later - -OTHER IMPLICATIONS: - - If we stop verifying the above requirements: - - It will be slightly (but only slightly) more common to connect to a non-Tor - server running TLS, and believe that you're talking to a Tor server (until - you send the first cell). - - It will be far easier for non-Tor SSL clients to accidentally connect to - Tor servers and speak HTTPS or whatever to them. - - If, in a later release, we have clients not send certificates, and we make - DNs less recognizable: - - If clients don't send certs, servers don't need to verify them: win! - - If we remove these restrictions, it will be easier for people to write - clients to fuzz our protocol: sorta win! - - If clients don't send certs, they look slightly less like servers. - -OTHER SPEC CHANGES: - - When a client doesn't give us an identity, we should never extend any - circuits to it (duh), and we should allow it to set circuit ID however it - wants. 
diff --git a/doc/spec/proposals/107-uptime-sanity-checking.txt b/doc/spec/proposals/107-uptime-sanity-checking.txt deleted file mode 100644 index 922129b21d..0000000000 --- a/doc/spec/proposals/107-uptime-sanity-checking.txt +++ /dev/null @@ -1,54 +0,0 @@ -Filename: 107-uptime-sanity-checking.txt -Title: Uptime Sanity Checking -Author: Kevin Bauer & Damon McCoy -Created: 8-March-2007 -Status: Closed -Implemented-In: 0.2.0.x - -Overview: - - This document describes how to cap the uptime that is used when computing - which routers are marked as stable such that highly stable routers cannot - be displaced by malicious routers that report extremely high uptime - values. - - This is similar to how bandwidth is capped at 1.5MB/s. - -Motivation: - - It has been pointed out that an attacker can displace all stable nodes and - entry guard nodes by reporting high uptimes. This is an easy fix that will - prevent highly stable nodes from being displaced. - -Security implications: - - It should decrease the effectiveness of routing attacks that report high - uptimes while not impacting the normal routing algorithms. - -Specification: - - So we could patch Section 3.1 of dir-spec.txt to say: - - "Stable" -- A router is 'Stable' if it is running, valid, not - hibernating, and either its uptime is at least the median uptime for - known running, valid, non-hibernating routers, or its uptime is at - least 30 days. Routers are never called stable if they are running - a version of Tor known to drop circuits stupidly. (0.1.1.10-alpha - through 0.1.1.16-rc are stupid this way.) - -Compatibility: - - There should be no compatibility issues due to uptime capping. - -Implementation: - - Implemented and merged into dir-spec in 0.2.0.0-alpha-dev (r9788). - -Discussion: - - Initially, this proposal set the maximum at 60 days, not 30; the 30 day - limit and spec wording was suggested by Roger in an or-dev post on 9 March - 2007. 
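The "Stable" rule in the Specification above amounts to comparing each uptime against the smaller of the median and 30 days. The Python sketch below illustrates that reading; it is not the authority code. The function name and input shape are hypothetical, the input is assumed to contain only running, valid, non-hibernating routers, the version-based exclusion is left out, and `statistics.median` (which averages the two middle values for an even count) is an assumption about how the median is taken.

```python
from statistics import median

# The 30-day cap from the proposed dir-spec wording, in seconds.
UPTIME_CAP = 30 * 24 * 60 * 60


def stable_routers(uptimes):
    """Return the names of routers that qualify as Stable.

    `uptimes` maps router name -> declared uptime in seconds, and is
    assumed to hold only running, valid, non-hibernating routers.
    """
    if not uptimes:
        return set()
    # "uptime >= median OR uptime >= 30 days" is the same test as
    # "uptime >= min(median, 30 days)".
    cutoff = min(median(uptimes.values()), UPTIME_CAP)
    return {name for name, up in uptimes.items() if up >= cutoff}
```

With the cap in place, routers that declare implausibly large uptimes can raise the median but cannot push the cutoff above 30 days, so an honest router with 35 days of uptime keeps the flag.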
- - This proposal also led to 108-mtbf-based-stability.txt - diff --git a/doc/spec/proposals/108-mtbf-based-stability.txt b/doc/spec/proposals/108-mtbf-based-stability.txt deleted file mode 100644 index 294103760b..0000000000 --- a/doc/spec/proposals/108-mtbf-based-stability.txt +++ /dev/null @@ -1,88 +0,0 @@ -Filename: 108-mtbf-based-stability.txt -Title: Base "Stable" Flag on Mean Time Between Failures -Author: Nick Mathewson -Created: 10-Mar-2007 -Status: Closed -Implemented-In: 0.2.0.x - -Overview: - - This document proposes that we change how directory authorities set the - stability flag from inspection of a router's declared Uptime to the - authorities' perceived mean time between failures for the router. - -Motivation: - - Clients prefer nodes that the authorities call Stable. This flag is (as - of 0.2.0.0-alpha-dev) set entirely based on the node's declared value for - uptime. This creates an opportunity for malicious nodes to declare - falsely high uptimes in order to get more traffic. - -Spec changes: - - Replace the current rule for setting the Stable flag with: - - "Stable" -- A router is 'Stable' if it is active and its observed Stability - for the past month is at or above the median Stability for active routers. - Routers are never called stable if they are running a version of Tor - known to drop circuits stupidly. (0.1.1.10-alpha through 0.1.1.16-rc - are stupid this way.) - - Stability shall be defined as the weighted mean length of the runs - observed by a given directory authority. A run begins when an authority - decides that the server is Running, and ends when the authority decides - that the server is not Running. In-progress runs are counted when - measuring Stability. When calculating the mean, runs are weighted by - $\alpha ^ t$, where $t$ is time elapsed since the end of the run, and - $0 < \alpha < 1$. Time when an authority is down does not count toward the - length of the run. 
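The Stability definition above can be read as a weighted mean: the sum of alpha^t times run length over all runs, divided by the sum of the weights alpha^t. The Python sketch below works through that reading. It is a sketch under assumptions, not the authority implementation: the function name and the (start, end) representation are hypothetical, an in-progress run is given t = 0, the time unit is whatever the caller uses, the proposal fixes no value for alpha, and periods when the authority itself was down are not modelled.

```python
def stability(runs, now, alpha):
    """Weighted mean run length for one router.

    `runs` is a list of (start, end) pairs; `end` is None for the
    in-progress run. `now` uses the same time unit as the runs.
    """
    if not 0 < alpha < 1:
        raise ValueError("alpha must satisfy 0 < alpha < 1")
    weighted_sum = 0.0
    total_weight = 0.0
    for start, end in runs:
        if end is None:
            end = now  # in-progress runs count, with t = 0
        weight = alpha ** (now - end)  # t = time since the run ended
        weighted_sum += weight * (end - start)
        total_weight += weight
    return weighted_sum / total_weight if total_weight else 0.0
```

As a worked case with alpha = 0.5 and now = 20: a finished run from 0 to 10 has length 10 and weight 0.5^10 = 1/1024, and an in-progress run started at 12 has length 8 and weight 1, so Stability = (10/1024 + 8) / (1 + 1/1024) = 8202/1025. The old run barely moves the result, which is the intended effect of the decay.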
- -Rejected Alternative: - - "A router's Stability shall be defined as the sum of $\alpha ^ d$ for every - $d$ such that the router was considered reachable for the entire day - $d$ days ago. - - This allows a simpler implementation: every day, we multiply - yesterday's Stability by alpha, and if the router was observed to be - available every time we looked today, we add 1. - - Instead of "day", we could pick an arbitrary time unit. We should - pick alpha to be high enough that long-term stability counts, but low - enough that the distant past is eventually forgotten. Something - between .8 and .95 seems right. - - (By requiring that routers be up for an entire day to get their - stability increased, instead of counting fractions of a day, we - capture the notion that stability is more like "probability of - staying up for the next hour" than it is like "probability of being - up at some randomly chosen time over the next hour." The former - notion of stability is far more relevant for long-lived circuits.) - -Limitations: - - Authorities can have false positives and false negatives when trying to - tell whether a router is up or down. So long as these aren't terribly - wrong, and so long as they aren't significantly biased, we should be able - to use them to estimate stability pretty well. - - Probing approaches like the above could miss short incidents of - downtime. If we use the router's declared uptime, we could detect - these: but doing so would penalize routers who reported their uptime - accurately. - -Implementation: - - For now, the easiest way to store this information at authorities - would probably be in some kind of periodically flushed flat file. - Later, we could move to Berkeley db or something if we really had to. - - For each router, an authority will need to store: - The router ID. - Whether the router is up. - The time when the current run started, if the router is up. - The weighted sum length of all previous runs. 
- The time at which the weighted sum length was last weighted down. - - Authorities should probe at random intervals to test whether servers are - running. diff --git a/doc/spec/proposals/109-no-sharing-ips.txt b/doc/spec/proposals/109-no-sharing-ips.txt deleted file mode 100644 index 5438cf049a..0000000000 --- a/doc/spec/proposals/109-no-sharing-ips.txt +++ /dev/null @@ -1,90 +0,0 @@ -Filename: 109-no-sharing-ips.txt -Title: No more than one server per IP address. -Author: Kevin Bauer & Damon McCoy -Created: 9-March-2007 -Status: Closed -Implemented-In: 0.2.0.x - -Overview: - This document describes a solution to a Sybil attack vulnerability in the - directory servers. Currently, it is possible for a single IP address to - host an arbitrarily high number of Tor routers. We propose that the - directory servers limit the number of Tor routers that may be registered at - a particular IP address to some small (fixed) number, perhaps just one Tor - router per IP address. - - While Tor never uses more than one server from a given /16 in the same - circuit, an attacker with multiple servers in the same place is still - dangerous because he can get around the per-server bandwidth cap that is - designed to prevent a single server from attracting too much of the overall - traffic. - -Motivation: - Since it is possible for an attacker to register an arbitrarily large - number of Tor routers, it is possible for malicious parties to do this - as part of a traffic analysis attack. - -Security implications: - This countermeasure will increase the number of IP addresses that an - attacker must control in order to carry out traffic analysis. - -Specification: - - For each IP address, each directory authority tracks the number of routers - using that IP address, along with their total observed bandwidth. If there - are more than MAX_SERVERS_PER_IP servers at some IP, the authority should - "disable" all but MAX_SERVERS_PER_IP servers. 
When choosing which servers - to disable, the authority should first disable non-Running servers in - increasing order of observed bandwidth, and then should disable Running - servers in increasing order of bandwidth. - - [[ We don't actually do this part here. -NM - - If the total observed - bandwidth of the remaining non-"disabled" servers exceeds MAX_BW_PER_IP, - the authority should "disable" some of the remaining servers until only one - server remains, or until the remaining observed bandwidth of non-"disabled" - servers is under MAX_BW_PER_IP. - ]] - - Servers that are "disabled" MUST be marked as non-Valid and non-Running. - - MAX_SERVERS_PER_IP is 3. - - MAX_BW_PER_IP is 8 MB per s. - -Compatibility: - - Upon inspection of a directory server, we found that the following IP - addresses have more than one Tor router: - - Scruples 68.5.113.81 ip68-5-113-81.oc.oc.cox.net 443 - WiseUp 68.5.113.81 ip68-5-113-81.oc.oc.cox.net 9001 - Unnamed 62.1.196.71 pc01-megabyte-net-arkadiou.megabyte.gr 9001 - Unnamed 62.1.196.71 pc01-megabyte-net-arkadiou.megabyte.gr 9001 - Unnamed 62.1.196.71 pc01-megabyte-net-arkadiou.megabyte.gr 9001 - aurel 85.180.62.138 e180062138.adsl.alicedsl.de 9001 - sokrates 85.180.62.138 e180062138.adsl.alicedsl.de 9001 - moria1 18.244.0.188 moria.mit.edu 9001 - peacetime 18.244.0.188 moria.mit.edu 9100 - - There may exist compatibility issues with this proposed fix. Reasons why - more than one server would share an IP address include: - - * Testing. moria1, moria2, peacetime, and other morias all run on one - computer at MIT, because that way we get testing. Moria1 and moria2 are - run by Roger, and peacetime is run by Nick. - * NAT. If there are several servers but they port-forward through the same - IP address, ... we can hope that the operators coordinate with each - other. Also, we should recognize that while they help the network in - terms of increased capacity, they don't help as much as they could in - terms of location diversity. 
But our approach so far has been to take - what we can get. - * People who have more than 1.5MB/s and want to help out more. For - example, for a while Tonga was offering 10MB/s and its Tor server - would only make use of a bit of it. So Roger suggested that he run - two Tor servers, to use more. - -[Note Roger's tweak to this behavior, in -http://archives.seul.org/or/cvs/Oct-2007/msg00118.html] - diff --git a/doc/spec/proposals/110-avoid-infinite-circuits.txt b/doc/spec/proposals/110-avoid-infinite-circuits.txt deleted file mode 100644 index fffc41c25a..0000000000 --- a/doc/spec/proposals/110-avoid-infinite-circuits.txt +++ /dev/null @@ -1,120 +0,0 @@ -Filename: 110-avoid-infinite-circuits.txt -Title: Avoiding infinite length circuits -Author: Roger Dingledine -Created: 13-Mar-2007 -Status: Accepted -Target: 0.2.1.x -Implemented-In: 0.2.1.3-alpha - -History: - - Revised 28 July 2008 by nickm: set K. - Revised 3 July 2008 by nickm: rename from relay_extend to - relay_early. Revise to current migration plan. Allow K cells - over circuit lifetime, not just at start. - -Overview: - - Right now, an attacker can add load to the Tor network by extending a - circuit an arbitrary number of times. Every cell that goes down the - circuit then adds N times that amount of load in overall bandwidth - use. This vulnerability arises because servers don't know their position - on the path, so they can't tell how many nodes there are before them - on the path. - - We propose a new set of relay cells that are distinguishable by - intermediate hops as permitting extend cells. This approach will allow - us to put an upper bound on circuit length relative to the number of - colluding adversary nodes; but there are some downsides too. 
- -Motivation: - - The above attack can be used to generally increase load all across the - network, or it can be used to target specific servers: by building a - circuit back and forth between two victim servers, even a low-bandwidth - attacker can soak up all the bandwidth offered by the fastest Tor - servers. - - The general attacks could be used as a demonstration that Tor isn't - perfect (leading to yet more media articles about "breaking" Tor), and - the targetted attacks will come into play once we have a reputation - system -- it will be trivial to DoS a server so it can't pass its - reputation checks, in turn impacting security. - -Design: - - We should split RELAY cells into two types: RELAY and RELAY_EARLY. - - Only K (say, 10) Relay_early cells can be sent across a circuit, and - only relay_early cells are allowed to contain extend requests. We - still support obscuring the length of the circuit (if more research - shows us what to do), because Alice can choose how many of the K to - mark as relay_early. Note that relay_early cells *can* contain any - sort of data cell; so in effect it's actually the relay type cells - that are restricted. By default, she would just send the first K - data cells over the stream as relay_early cells, regardless of their - actual type. - - (Note that a circuit that is out of relay_early cells MUST NOT be - cannibalized later, since it can't extend. Note also that it's always okay - to use regular RELAY cells when sending non-EXTEND commands targetted at - the first hop of a circuit, since there is no intermediate hop to try to - learn the relay command type.) - - Each intermediate server would pass on the same type of cell that it - received (either relay or relay_early), and the cell's destination - will be able to learn whether it's allowed to contain an Extend request. 
- - If an intermediate server receives more than K relay_early cells, or - if it sees a relay cell that contains an extend request, then it - tears down the circuit (protocol violation). - -Security implications: - - The upside is that this limits the bandwidth amplification factor to - K: for an individual circuit to become arbitrary-length, the attacker - would need an adversary-controlled node every K hops, and at that - point the attack is no worse than if the attacker creates N/K separate - K-hop circuits. - - On the other hand, we want to pick a large enough value of K that we - don't mind the cap. - - If we ever want to take steps to hide the number of hops in the circuit - or a node's position in the circuit, this design probably makes that - more complex. - -Migration: - - In 0.2.0, servers speaking v2 or later of the link protocol accept - RELAY_EARLY cells, and pass them on. If the next OR in the circuit - is not speaking the v2 link protocol, the server relays the cell as - a RELAY cell. - - In 0.2.1.3-alpha, clients begin using RELAY_EARLY cells on v2 - connections. This functionality can be safely backported to - 0.2.0.x. Clients should pick a random number between (say) K and - K-2 to send. - - In 0.2.1.3-alpha, servers close any circuit in which more than K - relay_early cells are sent. - - Once all versions that do not send RELAY_EARLY cells are obsolete, - servers can begin to reject any EXTEND requests not sent in a - RELAY_EARLY cell. - -Parameters: - - Let K = 8, for no terribly good reason. - -Spec: - - [We can formalize this part once we think the design is a good one.] - -Acknowledgements: - - This design has been kicking around since Christian Grothoff and I came - up with it at PET 2004. (Nathan Evans, Christian Grothoff's student, - is working on implementing a fix based on this design in the summer - 2007 timeframe.)
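The RELAY_EARLY accounting from proposal 110's Design section can be sketched as below. This is only an illustration of the two teardown rules (more than K relay_early cells, or an extend request in a plain relay cell), not Tor's implementation. `CircuitState`, `check_cell`, and `ProtocolViolation` are hypothetical names, and the sketch assumes the checking hop can already tell whether a cell carries an extend request.

```python
from dataclasses import dataclass

K = 8  # cap on relay_early cells per circuit ("Let K = 8" in Parameters)

RELAY = "relay"
RELAY_EARLY = "relay_early"


class ProtocolViolation(Exception):
    """Signals that the circuit should be torn down."""


@dataclass
class CircuitState:
    """Hypothetical per-circuit counter kept by a relay."""
    relay_early_seen: int = 0


def check_cell(circ: CircuitState, cell_type: str, is_extend: bool) -> str:
    """Return the cell type to pass on, or raise ProtocolViolation.

    Rule 1: more than K relay_early cells on one circuit is a violation.
    Rule 2: an extend request inside a plain relay cell is a violation.
    """
    if cell_type == RELAY_EARLY:
        circ.relay_early_seen += 1
        if circ.relay_early_seen > K:
            raise ProtocolViolation("more than K relay_early cells")
    elif is_extend:
        raise ProtocolViolation("extend request in a plain relay cell")
    # A relay passes on the same cell type that it received.
    return cell_type
```

Because the counter lives on the circuit and never resets, the K cells may be spent at any point in the circuit's lifetime, which matches the 3 July 2008 revision note.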
- diff --git a/doc/spec/proposals/111-local-traffic-priority.txt b/doc/spec/proposals/111-local-traffic-priority.txt deleted file mode 100644 index 9411463c21..0000000000 --- a/doc/spec/proposals/111-local-traffic-priority.txt +++ /dev/null @@ -1,151 +0,0 @@ -Filename: 111-local-traffic-priority.txt -Title: Prioritizing local traffic over relayed traffic -Author: Roger Dingledine -Created: 14-Mar-2007 -Status: Closed -Implemented-In: 0.2.0.x - -Overview: - - We describe some ways to let Tor users operate as a relay and enforce - rate limiting for relayed traffic without impacting their locally - initiated traffic. - -Motivation: - - Right now we encourage people who use Tor as a client to configure it - as a relay too ("just click the button in Vidalia"). Most of these users - are on asymmetric links, meaning they have a lot more download capacity - than upload capacity. But if they enable rate limiting too, suddenly - they're limited to the same download capacity as upload capacity. And - they have to enable rate limiting, or their upstream pipe gets filled - up, starts dropping packets, and now their net connection doesn't work - even for non-Tor stuff. So they end up turning off the relaying part - so they can use Tor (and other applications) again. - - So far this hasn't mattered that much: most of our fast relays are - being operated only in relay mode, so the rate limiting makes sense - for them. But if we want to be able to attract many more relays in - the future, we need to let ordinary users act as relays too. - - Further, as we begin to deploy the blocking-resistance design and we - rely on ordinary users to click the "Tor for Freedom" button, this - limitation will become a serious stumbling block to getting volunteers - to act as bridges. - -The problem: - - Tor implements its rate limiting on the 'read' side by only reading - a certain number of bytes from the network in each second. 
If it has - emptied its token bucket, it doesn't read any more from the network; - eventually TCP notices and stalls until we resume reading. But if we - want to have two classes of service, we can't know what class a given - incoming cell will be until we look at it, at which point we've already - read it. - -Some options: - - Option 1: read when our token bucket is full enough, and if it turns - out that what we read was local traffic, then add the tokens back into - the token bucket. This will work when local traffic load alternates - with relayed traffic load; but it's a poor option in general, because - when we're receiving both local and relayed traffic, there are plenty - of cases where we'll end up with an empty token bucket, and then we're - back where we were before. - - More generally, notice that our problem is easy when a given TCP - connection either has entirely local circuits or entirely relayed - circuits. In fact, even if they are both present, if one class is - entirely idle (none of its circuits have sent or received in the past - N seconds), we can ignore that class until it wakes up again. So it - only gets complex when a single connection contains active circuits - of both classes. - - Next, notice that local traffic uses only the entry guards, whereas - relayed traffic likely doesn't. So if we're a bridge handling just - a few users, the expected number of overlapping connections would be - almost zero, and even if we're a full relay the number of overlapping - connections will be quite small. - - Option 2: build separate TCP connections for local traffic and for - relayed traffic. In practice this will actually only require a few - extra TCP connections: we would only need redundant TCP connections - to at most the number of entry guards in use. - - However, this approach has some drawbacks. First, if the remote side - wants to extend a circuit to you, how does it know which TCP connection - to send it on? 
We would need some extra scheme to label some connections - "client-only" during construction. Perhaps we could do this by seeing - whether any circuit was made via CREATE_FAST; but this still opens - up a race condition where the other side sends a create request - immediately. The only ways I can imagine to avoid the race entirely - are to specify our preference in the VERSIONS cell, or to add some - sort of "nope, not this connection, why don't you try another rather - than failing" response to create cells, or to forbid create cells on - connections that you didn't initiate and on which you haven't seen - any circuit creation requests yet -- this last one would lead to a bit - more connection bloat but doesn't seem so bad. And we already accept - this race for the case where directory authorities establish new TCP - connections periodically to check reachability, and then hope to hang - up on them soon after. (In any case this issue is moot for bridges, - since each destination will be one-way with respect to extend requests: - either receiving extend requests from bridge users or sending extend - requests to the Tor server, never both.) - - The second problem with option 2 is that using two TCP connections - reveals that there are two classes of traffic (and probably quickly - reveals which is which, based on throughput). Now, it's unclear whether - this information is already available to the other relay -- he would - easily be able to tell that some circuits are fast and some are rate - limited, after all -- but it would be nice to not add even more ways to - leak that information. Also, it's less clear that an external observer - already has this information if the circuits are all bundled together, - and for this case it's worth trying to protect it. - - Option 3: tell the other side about our rate limiting rules. When we - establish the TCP connection, specify the different policy classes we - have configured. 
Each time we extend a circuit, specify which policy - class that circuit should be part of. Then hope the other side obeys - our wishes. (If he doesn't, hang up on him.) Besides the design and - coordination hassles involved in this approach, there's a big problem: - our rate limiting classes apply to all our connections, not just - pairwise connections. How does one server we're connected to know how - much of our bucket has already been spent by another? I could imagine - a complex and inefficient "ok, now you can send me those two more cells - that you've got queued" protocol. I'm not sure how else we could do it. - - (Gosh. How could UDP designs possibly be compatible with rate limiting - with multiple bucket sizes?) - - Option 4: put both classes of circuits over a single connection, and - keep track of the last time we read or wrote a high-priority cell. If - it's been less than N seconds, give the whole connection high priority, - else give the whole connection low priority. - - Option 5: put both classes of circuits over a single connection, and - play a complex juggling game by periodically telling the remote side - what rate limits to set for that connection, so you end up giving - priority to the right connections but still stick to roughly your - intended bandwidthrate and relaybandwidthrate. - - Option 6: ? - -Prognosis: - - Nick really didn't like option 2 because of the partitioning questions. - - I've put option 4 into place as of Tor 0.2.0.3-alpha. - - In terms of implementation, it will be easy: just add a time_t to - or_connection_t that specifies client_used (used by the initiator - of the connection to rate limit it differently depending on how - recently the time_t was reset). We currently update client_used - in three places: - - command_process_relay_cell() when we receive a relay cell for - an origin circuit. - - relay_send_command_from_edge() when we send a relay cell for - an origin circuit. 
- - circuit_deliver_create_cell() when we send a create cell. - We could probably remove the third case and it would still work, - but hey. - diff --git a/doc/spec/proposals/112-bring-back-pathlencoinweight.txt b/doc/spec/proposals/112-bring-back-pathlencoinweight.txt deleted file mode 100644 index 3f6c3376f0..0000000000 --- a/doc/spec/proposals/112-bring-back-pathlencoinweight.txt +++ /dev/null @@ -1,163 +0,0 @@ -Filename: 112-bring-back-pathlencoinweight.txt -Title: Bring Back Pathlen Coin Weight -Author: Mike Perry -Created: -Status: Superseded -Superseded-By: 115 - - -Overview: - - The idea is that users should be able to choose a weight which - probabilistically chooses their path lengths to be 2 or 3 hops. This - weight will essentially be a biased coin that indicates an - additional hop (beyond 2) with probability P. The user should be - allowed to choose 0 for this weight to always get 2 hops and 1 to - always get 3. - - This value should be modifiable from the controller, and should be - available from Vidalia. - - -Motivation: - - The Tor network is slow and overloaded. Increasingly often I hear - stories about friends and friends of friends who are behind firewalls, - annoying censorware, or under surveillance that interferes with their - productivity and Internet usage, or chills their speech. These people - know about Tor, but they choose to put up with the censorship because - Tor is too slow to be usable for them. In fact, to download a fresh, - complete copy of levine-timing.pdf for the Anonymity Implications - section of this proposal over Tor took me 3 tries. - - There are many ways to improve the speed problem, and of course we - should and will implement as many as we can. Johannes's GSoC project - and my reputation system are longer term, higher-effort things that - will still provide benefit independent of this proposal.
- - However, reducing the path length to 2 for those who do not need the - (questionable) extra anonymity 3 hops provide not only improves - their Tor experience but also reduces their load on the Tor network by - 33%, and can be done in less than 10 lines of code. That's not just - Win-Win, it's Win-Win-Win. - - Furthermore, when blocking resistance measures insert an extra relay - hop into the equation, 4 hops will certainly be completely unusable - for these users, especially since it will be considerably more - difficult to balance the load across a dark relay net than balancing - the load on Tor itself (which today is still not without its flaws). - - -Anonymity Implications: - - It has long been established that timing attacks against mixed - networks are extremely effective, and that regardless of path - length, if the adversary has compromised your first and last - hop of your path, you can assume they have compromised your - identity for that connection. - - In [1], it is demonstrated that for all but the slowest, lossiest - networks, error rates for false positives and false negatives were - very near zero. Only for constant streams of traffic over slow and - (more importantly) extremely lossy network links did the error rate - hit 20%. For loss rates typical to the Internet, even the error rate - for slow nodes with constant traffic streams was 13%. - - When you take into account that most Tor streams are not constant, - but probably much more like their "HomeIP" dataset, which consists - mostly of web traffic that exists over finite intervals at specific - times, error rates drop to fractions of 1%, even for the "worst" - network nodes. - - Therefore, the user has little benefit from the extra hop, assuming - the adversary does timing correlation on their nodes. The real - protection is the probability of getting both the first and last hop, - and this is constant whether the client chooses 2 hops, 3 hops, or 42. 
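The two quantitative claims in proposal 112 up to this point (a 33% load reduction, and an end-to-end compromise probability that does not depend on path length) can be checked with a few lines of arithmetic. This sketch rests on a simplifying assumption that is not in the proposal: the first and last hops are picked independently, each adversary-controlled with probability `c`. Both function names are hypothetical.

```python
def relayed_load_reduction(old_hops: int, new_hops: int) -> float:
    """Fraction of relay work saved per cell when a path gets shorter.

    Each cell is relayed once per hop, so load scales with hop count.
    """
    return 1 - new_hops / old_hops


def both_ends_compromised(c: float, hops: int) -> float:
    """Chance that the first AND last hop both belong to the adversary.

    Assumption (not from the proposal): each end is adversary-controlled
    independently with probability c. The `hops` argument is deliberately
    unused, since middle hops do not enter the product.
    """
    return c * c


# Going from 3 hops to 2 saves one third of the relayed load.
print(round(relayed_load_reduction(3, 2) * 100))  # 33
```

Under this model `both_ends_compromised(c, 2)` and `both_ends_compromised(c, 42)` return the same value, which is the point the paragraph above makes.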
- - Partitioning attacks form another concern. Since Tor uses telescoping - to build circuits, it is possible to tell a user is constructing only - two hop paths at the entry node. It is questionable if this data is - actually worth anything though, especially if the majority of users - have easy access to this option, and do actually choose their path - lengths semi-randomly. - - Nick has postulated that exits may also be able to tell that you are - using only 2 hops by the amount of time between sending their - RELAY_CONNECTED cell and the first bit of RELAY_DATA traffic they - see from the OP. I doubt that they will be able to make much use - of this timing pattern, since it will likely vary widely depending - upon the type of node selected for that first hop, and the user's - connection rate to that first hop. It is also questionable if this - data is worth anything, especially if many users are using this - option (and I imagine many will). - - Perhaps most seriously, two hop paths do allow malicious guards - to easily fail circuits if they do not extend to their colluding peers - for the exit hop. Since guards can detect the number of hops in a - path, they could always fail the 3 hop circuits and focus on - selectively failing the two hop ones until a peer was chosen. - - I believe currently guards are rotated if circuits fail, which does - provide some protection, but this could be changed so that an entry - guard is completely abandoned after a certain ratio of extend or - general circuit failures with respect to non-failed circuits. This - could possibly be gamed to increase guard turnover, but such a game - would be much more noticeable than an individual guard failing circuits, - though, since it would affect all clients, not just those who chose - a particular guard. 
- - -Why not fix Pathlen=2?: - - The main reason I am not advocating that we always use 2 hops is that - in some situations, timing correlation evidence by itself may not be - considered as solid and convincing as an actual, uninterrupted, fully - traced path. Are these timing attacks as effective on a real network - as they are in simulation? Would an extralegal adversary or authoritarian - government even care? In the face of these situation-dependent unknowns, - it should be up to the user to decide if this is a concern for them or not. - - It should probably also be noted that even a false positive - rate of 1% for a 200k concurrent-user network could mean that for a - given node, a given stream could be confused with something like 10 - users, assuming ~200 nodes carry most of the traffic (ie 1000 users - each). Though of course to really know for sure, someone needs to do - an attack on a real network, unfortunately. - - -Implementation: - - new_route_len() can be modified directly with a check of the - PathlenCoinWeight option (converted to percent) and a call to - crypto_rand_int(0,100) for the weighted coin. - - The entry_guard_t structure could have num_circ_failed and - num_circ_succeeded members such that if it exceeds N% circuit - extend failure rate to a second hop, it is removed from the entry list. - N should be sufficiently high to avoid churn from normal Tor circuit - failure as determined by TorFlow scans. - - The Vidalia option should be presented as a boolean, to minimize confusion - for the user. Something like a radiobutton with: - - * "I use Tor for Censorship Resistance, not Anonymity. Speed is more - important to me than Anonymity." - * "I use Tor for Anonymity. I need extra protection at the cost of speed." - - and then some explanation in the help for exactly what this means, and - the risks involved with eliminating the adversary's need for timing attacks - wrt to false positives, etc. 
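The Implementation section above can be sketched as follows. This is an illustration in Python rather than Tor's C code. `new_route_len` borrows the name used in the proposal, while `guard_should_be_dropped` and its `max_fail_percent` parameter are hypothetical, since the proposal leaves N unspecified. `randrange(100)` stands in for the `crypto_rand_int` call, assumed here to draw uniformly from 0 to 99.

```python
import random
from typing import Optional


def new_route_len(pathlen_coin_weight: float,
                  rng: Optional[random.Random] = None) -> int:
    """Pick a path length of 2 or 3 hops with a weighted coin.

    A weight of 0 always gives 2 hops; a weight of 1 always gives 3.
    """
    if not 0.0 <= pathlen_coin_weight <= 1.0:
        raise ValueError("PathlenCoinWeight must be between 0 and 1")
    rng = rng or random.SystemRandom()
    percent = round(pathlen_coin_weight * 100)  # option converted to percent
    roll = rng.randrange(100)                   # uniform integer in 0..99
    return 3 if roll < percent else 2


def guard_should_be_dropped(num_circ_failed: int,
                            num_circ_succeeded: int,
                            max_fail_percent: float) -> bool:
    """True once a guard's circuit failure rate exceeds the threshold."""
    total = num_circ_failed + num_circ_succeeded
    if total == 0:
        return False  # no evidence yet, keep the guard
    return 100.0 * num_circ_failed / total > max_fail_percent
```

In practice a minimum number of observed circuits would probably be required before dropping a guard, so that one early failure does not evict it; the proposal hints at this when it asks for N to be high enough to avoid churn.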
- -Migration: - - Phase one: Experiment with the proper ratio of circuit failures - used to expire garbage or malicious guards via TorFlow. - - Phase two: Re-enable config and modify new_route_len() to add an - extra hop if coin comes up "heads". - - Phase three: Make radiobutton in Vidalia, along with help entry - that explains in layman's terms the risks involved. - - -[1] http://www.cs.umass.edu/~mwright/papers/levine-timing.pdf diff --git a/doc/spec/proposals/113-fast-authority-interface.txt b/doc/spec/proposals/113-fast-authority-interface.txt deleted file mode 100644 index 8912b53220..0000000000 --- a/doc/spec/proposals/113-fast-authority-interface.txt +++ /dev/null @@ -1,85 +0,0 @@ -Filename: 113-fast-authority-interface.txt -Title: Simplifying directory authority administration -Author: Nick Mathewson -Created: -Status: Superseded - -Overview - -The problem: - - Administering a directory authority is a pain: you need to go through - emails and manually add new nodes as "named". When bad things come up, - you need to mark nodes (or whole regions) as invalid, badexit, etc. - - This means that mostly, authority admins don't: only 2/4 current authority - admins actually bind names or list bad exits, and those two have often - complained about how annoying it is to do so. - - Worse, name binding is a common path, but it's a pain in the neck: nobody - has done it for a couple of months. - -Digression: who knows what? - - It's trivial for Tor to automatically keep track of all of the - following information about a server: - name, fingerprint, IP, last-seen time, first-seen time, declared - contact. - - All we need to have the administrator set is: - - Is this name/fingerprint pair bound? - - Is this fingerprint/IP a bad exit? - - Is this fingerprint/IP an invalid node? - - Is this fingerprint/IP to be rejected? - - The workflow for authority admins has two parts: - - Periodically, go through tor-ops and add new names. This doesn't - need to be done urgently. 
- - Less often, mark badly behaved servers as badly behaved. This is more - urgent. - -Possible solution #1: Web-interface for name binding. - - Deprecate use of the tor-ops mailing list; instead, have operators go to a - webform and enter their server info. This would put the information in a - standardized format, thus allowing quick, nearly-automated approval and - reply. - -Possible solution #2: Self-binding names. - - Peter Palfrader has proposed that names be assigned automatically to nodes - that have been up and running and valid for a while. - -Possible solution #3: Self-maintaining approved-routers file - - Mixminion alpha has a neat feature where whenever a new server is seen, - a stub line gets added to a configuration file. For Tor, it could look - something like this: - - ## First seen with this key on 2007-04-21 13:13:14 - ## Stayed up for at least 12 hours on IP 192.168.10.10 - #RouterName AAAABBBBCCCCDDDDEFEF - - (Note that the implementation needs to parse commented lines to make sure - that it doesn't add duplicates, but that's not so hard.) - - To add a router as named, administrators would only need to uncomment the - entry. This automatically maintained file could be kept separately from a - manually maintained one. - - This could be combined with solution #2, such that Tor would do the hard - work of uncommenting entries for routers that should get Named, but - operators could override its decisions. - -Possible solution #4: A separate mailing list for authority operators. - - Right now, the tor-ops list is very high volume. There should be another - list that's only for dealing with problems that need prompt action, like - marking a router as !badexit. - -Resolution: - - Solution #2 is described in "Proposal 123: Naming authorities - automatically create bindings", and that approach is implemented. - There are remaining issues in the problem statement above that need - their own solutions.
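Possible solution #3 in proposal 113 (the self-maintaining approved-routers file) can be sketched as below. The stub layout follows the three-line example in the proposal. Everything else is an assumption made for illustration: the function names, the in-memory list of lines, and the simplified "name fingerprint" entry syntax with an unbroken hex fingerprint.

```python
import re
from datetime import datetime
from typing import List, Set

# Matches "RouterName FINGERPRINT", optionally commented out with one "#".
_ENTRY = re.compile(r"^#?\s*(\S+)\s+([0-9A-Fa-f]+)\s*$")


def known_fingerprints(lines: List[str]) -> Set[str]:
    """Collect fingerprints from both active and commented-out entries."""
    known = set()
    for line in lines:
        if line.startswith("##"):
            continue  # informational comment, not an entry
        match = _ENTRY.match(line)
        if match:
            known.add(match.group(2).upper())
    return known


def add_stub(lines: List[str], name: str, fingerprint: str,
             first_seen: datetime, ip: str, hours_up: int = 12) -> bool:
    """Append a commented-out stub unless the fingerprint is already listed."""
    if fingerprint.upper() in known_fingerprints(lines):
        return False
    stamp = first_seen.strftime("%Y-%m-%d %H:%M:%S")
    lines.append(f"## First seen with this key on {stamp}")
    lines.append(f"## Stayed up for at least {hours_up} hours on IP {ip}")
    lines.append(f"#{name} {fingerprint}")
    return True
```

Parsing the commented lines is what prevents duplicates: a stub that an administrator has not yet uncommented still counts as known, and so does one that has been uncommented to bind the name.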
diff --git a/doc/spec/proposals/114-distributed-storage.txt b/doc/spec/proposals/114-distributed-storage.txt deleted file mode 100644 index 91a787d301..0000000000 --- a/doc/spec/proposals/114-distributed-storage.txt +++ /dev/null @@ -1,439 +0,0 @@ -Filename: 114-distributed-storage.txt -Title: Distributed Storage for Tor Hidden Service Descriptors -Author: Karsten Loesing -Created: 13-May-2007 -Status: Closed -Implemented-In: 0.2.0.x - -Change history: - - 13-May-2007 Initial proposal - 14-May-2007 Added changes suggested by Lasse Øverlier - 30-May-2007 Changed descriptor format, key length discussion, typos - 09-Jul-2007 Incorporated suggestions by Roger, added status of specification - and implementation for upcoming GSoC mid-term evaluation - 11-Aug-2007 Updated implementation statuses, included non-consecutive - replication to descriptor format - 20-Aug-2007 Renamed config option HSDir as HidServDirectoryV2 - 02-Dec-2007 Closed proposal - -Overview: - - The basic idea of this proposal is to distribute the tasks of storing and - serving hidden service descriptors from currently three authoritative - directory nodes among a large subset of all onion routers. The three - reasons to do this are better robustness (availability), better - scalability, and improved security properties. Further, - this proposal suggests changes to the hidden service descriptor format to - prevent new security threats coming from decentralization and to gain even - better security properties. - -Status: - - As of December 2007, the new hidden service descriptor format is implemented - and usable. However, servers and clients do not yet make use of descriptor - cookies, because there are open usability issues of this feature that might - be resolved in proposal 121. Further, hidden service directories do not - perform replication by themselves, because (unauthorized) replica fetch - requests would allow any attacker to fetch all hidden service descriptors in - the system. 
As neither issue is critical to the functioning of v2 - descriptors and their distribution, this proposal is considered as Closed. - -Motivation: - - The current design of hidden services exhibits the following performance and - security problems: - - First, the three hidden service authoritative directories constitute a - performance bottleneck in the system. The directory nodes are responsible for - storing and serving all hidden service descriptors. As of May 2007 there are - about 1000 descriptors at a time, but this number is assumed to increase in - the future. Further, there is no replication protocol for descriptors between - the three directory nodes, so that hidden services must ensure the - availability of their descriptors by manually publishing them on all - directory nodes. Whenever a fourth or fifth hidden service authoritative - directory is added, hidden services will need to maintain an equally - increasing number of replicas. These scalability issues have an impact on the - current usage of hidden services and put an even higher burden on the - development of new kinds of applications for hidden services that might - require storing even more descriptors. - - Second, besides posing a limitation to scalability, storing all hidden - service descriptors on three directory nodes also constitutes a security - risk. The directory node operators could easily analyze the publish and fetch - requests to derive information on service activity and usage and read the - descriptor contents to determine which onion routers work as introduction - points for a given hidden service and need to be attacked or threatened to - shut it down. Furthermore, the contents of a hidden service descriptor offer - only minimal security properties to the hidden service. Whoever gets aware of - the service ID can easily find out whether the service is active at the - moment and which introduction points it has. 
This applies to (former) - clients, (former) introduction points, and of course to the directory nodes. - It requires only to request the descriptor for the given service ID, which - can be performed by anyone anonymously. - - This proposal suggests two major changes to approach the described - performance and security problems: - - The first change affects the storage location for hidden service descriptors. - Descriptors are distributed among a large subset of all onion routers instead - of three fixed directory nodes. Each storing node is responsible for a subset - of descriptors for a limited time only. It is not able to choose which - descriptors it stores at a certain time, because this is determined by its - onion ID which is hard to change frequently and in time (only routers which - are stable for a given time are accepted as storing nodes). In order to - resist single node failures and untrustworthy nodes, descriptors are - replicated among a certain number of storing nodes. A first replication - protocol makes sure that descriptors don't get lost when the node population - changes; therefore, a storing node periodically requests the descriptors from - its siblings. A second replication protocol distributes descriptors among - non-consecutive nodes of the ID ring to prevent a group of adversaries from - generating new onion keys until they have consecutive IDs to create a 'black - hole' in the ring and make random services unavailable. Connections to - storing nodes are established by extending existing circuits by one hop to - the storing node. This also ensures that contents are encrypted. The effect - of this first change is that the probability that a single node operator - learns about a certain hidden service is very small and that it is very hard - to track a service over time, even when it collaborates with other node - operators. - - The second change concerns the content of hidden service descriptors. 
- Obviously, security problems cannot be solved only by decentralizing storage; - in fact, they could also get worse if done without caution. At first, a - descriptor ID needs to change periodically in order to be stored on changing - nodes over time. Next, the descriptor ID needs to be computable only for the - service's clients, but should be unpredictable for all other nodes. Further, - the storing node needs to be able to verify that the hidden service is the - true originator of the descriptor with the given ID even though it is not a - client. Finally, a storing node should learn as little information as - necessary by storing a descriptor, because it might not be as trustworthy as - a directory node; for example it does not need to know the list of - introduction points. Therefore, a second key is applied that is only known to - the hidden service provider and its clients and that is not included in the - descriptor. It is used to calculate descriptor IDs and to encrypt the - introduction points. This second key can either be given to all clients - together with the hidden service ID, or to a group or a single client as - an authentication token. In the future this second key could be the result of - some key agreement protocol between the hidden service and one or more - clients. A new text-based format is proposed for descriptors instead of an - extension of the existing binary format for reasons of future extensibility. - -Design: - - The proposed design is described by the required changes to the current - design. These requirements are grouped by content, rather than by affected - specification documents or code files, and numbered for reference below. 
- - Hidden service clients, servers, and directories: - - /1/ Create routing list - - All participants can filter the consensus status document received from the - directory authorities to one routing list containing only those servers - that store and serve hidden service descriptors and which are running for - at least 24 hours. A participant only trusts its own routing list and never - learns about routing information from other parties. - - /2/ Determine responsible hidden service directory - - All participants can determine the hidden service directory that is - responsible for storing and serving a given ID, as well as the hidden - service directories that replicate its content. Every hidden service - directory is responsible for the descriptor IDs in the interval from - its predecessor, exclusive, to its own ID, inclusive. Further, a hidden - service directory holds replicas for its n predecessors, where n denotes - the number of consecutive replicas. (requires /1/) - - [/3/ and /4/ were requirements to use BEGIN_DIR cells for directory - requests which have not been fulfilled in the course of the implementation - of this proposal, but elsewhere.] - - Hidden service directory nodes: - - /5/ Advertise hidden service directory functionality - - Every onion router that has its directory port open can decide whether it - wants to store and serve hidden service descriptors by setting a new config - option "HidServDirectoryV2" 0|1 to 1. An onion router with this config - option being set includes the flag "hidden-service-dir" in its router - descriptors that it sends to directory authorities. - - /6/ Accept v2 publish requests, parse and store v2 descriptors - - Hidden service directory nodes accept publish requests for hidden service - descriptors and store them to their local memory. 
(It is not necessary to - make descriptors persistent, because after disconnecting, the onion router - would not be accepted as storing node anyway, because it has not been - running for at least 24 hours.) All requests and replies are formatted as - HTTP messages. Requests are directed to the router's directory port and are - contained within BEGIN_DIR cells. A hidden service directory node stores a - descriptor only when it thinks that it is responsible for storing that - descriptor based on its own routing table. Every hidden service directory - node is responsible for the descriptor IDs in the interval of its n-th - predecessor in the ID circle up to its own ID (n denotes the number of - consecutive replicas). (requires /1/) - - /7/ Accept v2 fetch requests - - Same as /6/, but with fetch requests for hidden service descriptors. - (requires /2/) - - /8/ Replicate descriptors with neighbors - - A hidden service directory node replicates descriptors from its two - predecessors by downloading them once an hour. Further, it checks its - routing table periodically for changes. Whenever it realizes that a - predecessor has left the network, it establishes a connection to the new - n-th predecessor and requests its stored descriptors in the interval of its - (n+1)-th predecessor and the requested n-th predecessor. Whenever it - realizes that a new onion router has joined with an ID higher than its - former n-th predecessor, it adds it to its predecessors and discards all - descriptors in the interval of its (n+1)-th and its n-th predecessor. - (requires /1/) - - [Dec 02: This function has not been implemented, because arbitrary nodes - would have been able to download the entire set of v2 descriptors. An - authorized replication request would be necessary. For the moment, the - system runs without any directory-side replication. 
-KL] - - Authoritative directory nodes: - - /9/ Confirm a router's hidden service directory functionality - - Directory nodes include a new flag "HSDir" for routers that decided to - provide storage for hidden service descriptors and that are running for at - least 24 hours. The last requirement prevents a node from frequently - changing its onion key to become responsible for an identifier it wants to - target. - - Hidden service provider: - - /10/ Configure v2 hidden service - - Each hidden service provider that has set the config option - "PublishV2HidServDescriptors" 0|1 to 1 is configured to publish v2 - descriptors and conform to the v2 connection establishment protocol. When - configuring a hidden service, a hidden service provider checks if it has - already created a random secret_cookie and a hostname2 file; if not, it - creates both of them. (requires /2/) - - /11/ Establish introduction points with fresh key - - If configured to publish only v2 descriptors and no v0/v1 descriptors any - more, a hidden service provider that is setting up the hidden service at - introduction points does not pass its own public key, but the public key - of a freshly generated key pair. It also includes these fresh public keys - in the hidden service descriptor together with the other introduction point - information. The reason is that the introduction point does not need to and - therefore should not know for which hidden service it works, so as to - prevent it from tracking the hidden service's activity. (If a hidden - service provider supports both, v0/v1 and v2 descriptors, v0/v1 clients - rely on the fact that all introduction points accept the same public key, - so that this new feature cannot be used.) - - /12/ Encode v2 descriptors and send v2 publish requests - - If configured to publish v2 descriptors, a hidden service provider - publishes a new descriptor whenever its content changes or a new - publication period starts for this descriptor. 
If the current publication - period would only last for less than 60 minutes (= 2 x 30 minutes to allow - the server to be 30 minutes behind and the client 30 minutes ahead), the - hidden service provider publishes both a current descriptor and one for - the next period. Publication is performed by sending the descriptor to all - hidden service directories that are responsible for keeping replicas for - the descriptor ID. This includes two non-consecutive replicas that are - stored at 3 consecutive nodes each. (requires /1/ and /2/) - - Hidden service client: - - /13/ Send v2 fetch requests - - A hidden service client that has set the config option - "FetchV2HidServDescriptors" 0|1 to 1 handles SOCKS requests for v2 onion - addresses by requesting a v2 descriptor from a randomly chosen hidden - service directory that is responsible for keeping replica for the - descriptor ID. In total there are six replicas of which the first and the - last three are stored on consecutive nodes. The probability of picking one - of the three consecutive replicas is 1/6, 2/6, and 3/6 to incorporate the - fact that the availability will be the highest on the node with next higher - ID. A hidden service client relies on the hidden service provider to store - two sets of descriptors to compensate clock skew between service and - client. (requires /1/ and /2/) - - /14/ Process v2 fetch reply and parse v2 descriptors - - A hidden service client that has sent a request for a v2 descriptor can - parse it and store it to the local cache of rendezvous service descriptors. - - /15/ Establish connection to v2 hidden service - - A hidden service client can establish a connection to a hidden service - using a v2 descriptor. This includes using the secret cookie for decrypting - the introduction points contained in the descriptor. 
When contacting an - introduction point, the client does not use the public key of the hidden - service provider, but the freshly-generated public key that is included in - the hidden service descriptor. Whether or not a fresh key is used instead - of the key of the hidden service depends on the available protocol versions - that are included in the descriptor; by this, connection establishment is - to a certain extent decoupled from fetching the descriptor. - - Hidden service descriptor: - - (Requirements concerning the descriptor format are contained in /6/ and /7/.) - - The new v2 hidden service descriptor format looks like this: - - onion-address = h(public-key) + cookie - descriptor-id = h(h(public-key) + h(time-period + cookie + replica)) - descriptor-content = { - descriptor-id, - version, - public-key, - h(time-period + cookie + replica), - timestamp, - protocol-versions, - { introduction-points } encrypted with cookie - } signed with private-key - - The "descriptor-id" needs to change periodically in order for the - descriptor to be stored on changing nodes over time. It may only be - computable by a hidden service provider and all of his clients to prevent - unauthorized nodes from tracking the service activity by periodically - checking whether there is a descriptor for this service. Finally, the - hidden service directory needs to be able to verify that the hidden service - provider is the true originator of the descriptor with the given ID. - - Therefore, "descriptor-id" is derived from the "public-key" of the hidden - service provider, the current "time-period" which changes every 24 hours, - a secret "cookie" shared between hidden service provider and clients, and - a "replica" denoting the number of this non-consecutive replica. 
(The - "time-period" is constructed in a way that time periods do not change at - the same moment for all descriptors by deriving a value between 0:00 and - 23:59 hours from h(public-key) and making the descriptors of this hidden - service provider expire at that time of the day.) The "descriptor-id" is - defined to be 160 bits long. [extending the "descriptor-id" length - suggested by LØ] - - Only the hidden service provider and the clients are able to generate - future "descriptor-ID"s. Hence, the "onion-address" is extended from now - the hash value of "public-key" by the secret "cookie". The "public-key" is - determined to be 80 bits long, whereas the "cookie" is dimensioned to be - 120 bits long. This makes a total of 200 bits or 40 base32 chars, which is - quite a lot to handle for a human, but necessary to provide sufficient - protection against an adversary from generating a key pair with same - "public-key" hash or guessing the "cookie". - - A hidden service directory can verify that a descriptor was created by the - hidden service provider by checking if the "descriptor-id" corresponds to - the "public-key" and if the signature can be verified with the - "public-key". - - The "introduction-points" that are included in the descriptor are encrypted - using the same "cookie" that is shared between hidden service provider and - clients. [correction to use another key than h(time-period + cookie) as - encryption key for introduction points made by LØ] - - A new text-based format is proposed for descriptors instead of an extension - of the existing binary format for reasons of future extensibility. - -Security implications: - - The security implications of the proposed changes are grouped by the roles of - nodes that could perform attacks or on which attacks could be performed. 
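As an illustration of the "descriptor-id" construction given under "Hidden service descriptor" in the Design section, the following is a minimal Python sketch. It rests on assumptions that the text above does not fix: h is taken to be SHA-1 (picked only because the descriptor-id is stated to be 160 bits long), "+" is read as byte concatenation, and the byte encodings of "time-period" and "replica" are hypothetical.

```python
import hashlib
import struct


def h(data: bytes) -> bytes:
    # Assumption: h is SHA-1, which yields the 160 bits the text requires.
    return hashlib.sha1(data).digest()


def descriptor_id(public_key: bytes, time_period: int,
                  cookie: bytes, replica: int) -> bytes:
    # Assumption: "+" means concatenation; time-period is encoded as a
    # 4-byte big-endian integer and replica as a single byte.
    secret_part = h(struct.pack(">I", time_period)
                    + cookie
                    + struct.pack("B", replica))
    # descriptor-id = h(h(public-key) + h(time-period + cookie + replica))
    return h(h(public_key) + secret_part)
```

Because the cookie only enters through the inner hash, a directory that is handed "public-key" and h(time-period + cookie + replica) in the descriptor can recompute the outer hash and compare it with the descriptor-id without ever learning the cookie.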
- - Attacks by authoritative directory nodes - - Authoritative directory nodes are no longer the single places in the - network that know about a hidden service's activity and introduction - points. Thus, they cannot perform attacks using this information, e.g. - track a hidden service's activity or usage pattern or attack its - introduction points. Formerly, it would only require a single corrupted - authoritative directory operator to perform such an attack. - - Attacks by hidden service directory nodes - - A hidden service directory node could misuse a stored descriptor to track a - hidden service's activity and usage pattern by clients. Though there is no - countermeasure against this kind of attack, it is very expensive to track a - certain hidden service over time. An attacker would need to run a large - number of stable onion routers that work as hidden service directory nodes - to have a good probability to become responsible for its changing - descriptor IDs. For each period, the probability is: - - 1-(N-c choose r)/(N choose r) for N-c>=r and 1 otherwise, with N - as total - number of hidden service directories, c as compromised nodes, and r as - number of replicas - - The hidden service directory nodes could try to make a certain hidden - service unavailable to its clients. Therefore, they could discard all - stored descriptors for that hidden service and reply to clients that there - is no descriptor for the given ID or return an old or false descriptor - content. The client would detect a false descriptor, because it could not - contain a correct signature. But an old content or an empty reply could - confuse the client. Therefore, the countermeasure is to replicate - descriptors among a small number of hidden service directories, e.g. 5. 
- The probability of a group of collaborating nodes to make a hidden service - completely unavailable is in each period: - - (c choose r)/(N choose r) for c>=r and N>=r, and 0 otherwise, - with N as total - number of hidden service directories, c as compromised nodes, and r as - number of replicas - - A hidden service directory could try to find out which introduction points - are working on behalf of a hidden service. In contrast to the previous - design, this is not possible anymore, because this information is encrypted - to the clients of a hidden service. - - Attacks on hidden service directory nodes - - An anonymous attacker could try to swamp a hidden service directory with - false descriptors for a given descriptor ID. This is prevented by requiring - that descriptors are signed. - - Anonymous attackers could swamp a hidden service directory with correct - descriptors for non-existing hidden services. There is no countermeasure - against this attack. However, the creation of valid descriptors is more - expensive than verification and storage in local memory. This should make - this kind of attack unattractive. - - Attacks by introduction points - - Current or former introduction points could try to gain information on the - hidden service they serve. But due to the fresh key pair that is used by - the hidden service, this attack is not possible anymore. - - Attacks by clients - - Current or former clients could track a hidden service's activity, attack - its introduction points, or determine the responsible hidden service - directory nodes and attack them. There is nothing that could prevent them - from doing so, because honest clients need the full descriptor content to - establish a connection to the hidden service. At the moment, the only - countermeasure against dishonest clients is to change the secret cookie and - pass it only to the honest clients. 
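The two per-period probabilities stated in this section can be evaluated with a short Python sketch. The function names are hypothetical, and the formulas are taken as written above, i.e. they treat the r replicas as a uniformly random choice among the N hidden service directories.

```python
from math import comb


def p_track(n_dirs: int, compromised: int, replicas: int) -> float:
    """Probability that at least one replica is stored on a compromised
    directory: 1 - (N-c choose r)/(N choose r), and 1 if N-c < r."""
    if n_dirs - compromised < replicas:
        return 1.0
    return 1.0 - comb(n_dirs - compromised, replicas) / comb(n_dirs, replicas)


def p_blackhole(n_dirs: int, compromised: int, replicas: int) -> float:
    """Probability that all replicas are stored on compromised
    directories: (c choose r)/(N choose r), and 0 if c < r or N < r."""
    if compromised < replicas or n_dirs < replicas:
        return 0.0
    return comb(compromised, replicas) / comb(n_dirs, replicas)
```

For example, p_track(1000, 10, 5) and p_blackhole(1000, 10, 5) compare the two risks for r = 5 replicas; the values N = 1000 and c = 10 are illustrative only and do not come from the proposal.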
- -Compatibility: - - The proposed design is meant to replace the current design for hidden service - descriptors and their storage in the long run. - - There should be a first transition phase in which both, the current design - and the proposed design are served in parallel. Onion routers should start - serving as hidden service directories, and hidden service providers and - clients should make use of the new design if both sides support it. Hidden - service providers should be allowed to publish descriptors of the current - format in parallel, and authoritative directories should continue storing and - serving these descriptors. - - After the first transition phase, hidden service providers should stop - publishing descriptors on authoritative directories, and hidden service - clients should not try to fetch descriptors from the authoritative - directories. However, the authoritative directories should continue serving - hidden service descriptors for a second transition phase. As of this point, - all v2 config options should be set to a default value of 1. - - After the second transition phase, the authoritative directories should stop - serving hidden service descriptors. - diff --git a/doc/spec/proposals/115-two-hop-paths.txt b/doc/spec/proposals/115-two-hop-paths.txt deleted file mode 100644 index 9854c9ad55..0000000000 --- a/doc/spec/proposals/115-two-hop-paths.txt +++ /dev/null @@ -1,385 +0,0 @@ -Filename: 115-two-hop-paths.txt -Title: Two Hop Paths -Author: Mike Perry -Created: -Status: Dead -Supersedes: 112 - - -Overview: - - The idea is that users should be able to choose if they would like - to have either two or three hop paths through the tor network. - - Let us be clear: the users who would choose this option should be - those that are concerned with IP obfuscation only: ie they would not be - targets of a resource-intensive multi-node attack. It is sometimes said - that these users should find some other network to use other than Tor. 
- This is a foolish suggestion: more users improves security of everyone, - and the current small userbase size is a critical hindrance to - anonymity, as is discussed below and in [1]. - - This value should be modifiable from the controller, and should be - available from Vidalia. - - -Motivation: - - The Tor network is slow and overloaded. Increasingly often I hear - stories about friends and friends of friends who are behind firewalls, - annoying censorware, or under surveillance that interferes with their - productivity and Internet usage, or chills their speech. These people - know about Tor, but they choose to put up with the censorship because - Tor is too slow to be usable for them. In fact, to download a fresh, - complete copy of levine-timing.pdf for the Theoretical Argument - section of this proposal over Tor took me 3 tries. - - Furthermore, the biggest current problem with Tor's anonymity for - those who really need it is not someone attacking the network to - discover who they are. It's instead the extreme danger that so few - people use Tor because it's so slow, that those who do use it have - essentially no confusion set. - - The recent case where the professor and the rogue Tor user were the - only Tor users on campus, and thus suspected in an incident involving - Tor and that University underscores this point: "That was why the police - had come to see me. They told me that only two people on our campus were - using Tor: me and someone they suspected of engaging in an online scam. - The detectives wanted to know whether the other user was a former - student of mine, and why I was using Tor"[1]. - - Not only does Tor provide no anonymity if you use it to be anonymous - but are obviously from a certain institution, location or circumstance, - it is also dangerous to use Tor for risk of being accused of having - something significant enough to hide to be willing to put up with - the horrible performance as opposed to using some weaker alternative. 
- - There are many ways to improve the speed problem, and of course we - should and will implement as many as we can. Johannes's GSoC project - and my reputation system are longer term, higher-effort things that - will still provide benefit independent of this proposal. - - However, reducing the path length to 2 for those who do not need the - extra anonymity 3 hops provide not only improves their Tor experience - but also reduces their load on the Tor network by 33%, and should - increase adoption of Tor by a good deal. That's not just Win-Win, it's - Win-Win-Win. - - -Who will enable this option? - - This is the crux of the proposal. Admittedly, there is some anonymity - loss and some degree of decreased investment required on the part of - the adversary to attack 2 hop users versus 3 hop users, even if it is - minimal and limited mostly to up-front costs and false positives. - - The key questions are: - - 1. Are these users in a class such that their risk is significantly - less than the amount of this anonymity loss? - - 2. Are these users able to identify themselves? - - Many many users of Tor are not at risk for an adversary capturing c/n - nodes of the network just to see what they do. These users use Tor to - circumvent aggressive content filters, or simply to keep their IP out of - marketing and search engine databases. Most content filters have no - interest in running Tor nodes to catch violators, and marketers - certainly would never consider such a thing, both on a cost basis and a - legal one. - - In a sense, this represents an alternate threat model against these - users who are not at risk for Tor's normal threat model. - - It should be evident to these users that they fall into this class. All - that should be needed is a radio button - - * "I use Tor for local content filter circumvention and/or IP obfuscation, - not anonymity. Speed is more important to me than high anonymity. - No one will make considerable efforts to determine my real IP." 
- * "I use Tor for anonymity and/or national-level, legally enforced - censorship. It is possible effort will be taken to identify - me, including but not limited to network surveillance. I need more - protection." - - and then some explanation in the help for exactly what this means, and - the risks involved with eliminating the adversary's need for timing - attacks with respect to false positives. Ultimately, the decision is a - simple one that can be made without this information, however. The user - does not need Paul Syverson to instruct them on the deep magic of Onion - Routing to make this decision. They just need to know why they use Tor. - If they use it just to stay out of marketing databases and/or bypass a - local content filter, two hops is plenty. This is likely the vast - majority of Tor users, and many non-users we would like to bring on - board. - - So, having established this class of users, let us now go on to - examine theoretical and practical risks we place them at, and determine - if these risks violate the users' needs, or introduce additional risk - to node operators who may be subject to requests from law enforcement - to track users who need 3 hops, but use 2 because they enjoy the - thrill of Russian roulette. - - -Theoretical Argument: - - It has long been established that timing attacks against mixed - and onion networks are extremely effective, and that regardless - of path length, if the adversary has compromised your first and - last hop of your path, you can assume they have compromised your - identity for that connection. - - In fact, it was demonstrated that for all but the slowest, lossiest - networks, error rates for false positives and false negatives were - very near zero[2]. Only for constant streams of traffic over slow and - (more importantly) extremely lossy network links did the error rate - hit 20%. For loss rates typical to the Internet, even the error rate - for slow nodes with constant traffic streams was 13%. 
- - When you take into account that most Tor streams are not constant, - but probably much more like their "HomeIP" dataset, which consists - mostly of web traffic that exists over finite intervals at specific - times, error rates drop to fractions of 1%, even for the "worst" - network nodes. - - Therefore, the user has little benefit from the extra hop, assuming - the adversary does timing correlation on their nodes. Since timing - correlation is simply an implementation issue and is most likely - a single up-front cost (and one that is likely quite a bit cheaper - than the cost of the machines purchased to host the nodes to mount - an attack), the real protection is the low probability of getting - both the first and last hop of a client's stream. - - -Practical Issues: - - Theoretical issues aside, there are several practical issues with the - implementation of Tor that need to be addressed to ensure that - identity information is not leaked by the implementation. - - Exit policy issues: - - If a client chooses an exit with a very restrictive exit policy - (such as an IP or IP range), the first hop then knows a good deal - about the destination. For this reason, clients should not select - exits that match their destination IP with anything other than "*". - - Partitioning: - - Partitioning attacks form another concern. Since Tor uses telescoping - to build circuits, it is possible to tell a user is constructing only - two hop paths at the entry node and on the local network. An external - adversary can potentially differentiate 2 and 3 hop users, and decide - that all IP addresses connecting to Tor and using 3 hops have something - to hide, and should be scrutinized more closely or outright apprehended. - - One solution to this is to use the "leaky-circuit" method of attaching - streams: The user always creates 3-hop circuits, but if the option - is enabled, they always exit from their 2nd hop. 
The ideal solution - would be to create a RELAY_SHISHKABOB cell which contains onion - skins for every host along the path, but this requires protocol - changes at the nodes to support. - - Guard nodes: - - Since guard nodes can rotate due to client relocation, network - failure, node upgrades and other issues, if you amortize the risk a - mobile, dialup, or otherwise intermittently connected user is exposed to - over any reasonable duration of Tor usage (on the order of a year), it - is the same with or without guard nodes. Assuming an adversary has - c%/n% of network bandwidth, and guards rotate on average with period R, - statistically speaking, it's merely a question of if the user wishes - their risk to be concentrated with probability c/n over an expected - period of R*c, and probability 0 over an expected period of R*(n-c), - versus a continuous risk of (c/n)^2. So statistically speaking, guards - only create a time-tradeoff of risk over the long run for normal Tor - usage. Rotating guards do not reduce risk for normal client usage long - term.[3] - - On the other hand, assuming a more stable method of guard selection - and preservation is devised, or a more stable client side network than - my own is typical (which rotates guards frequently due to network issues - and moving about), guard nodes provide a tradeoff in the form of c/n% of - the users being "sacrificial users" who are exposed to high risk O(c/n) - of identification, while the rest of the network is exposed to zero - risk. - - The nature of Tor makes it likely an adversary will take a "shock and - awe" approach to suppressing Tor by rounding up a few users whose - browsing activity has been observed to be made into examples, in an - attempt to prove that Tor is not perfect. - - Since this "shock and awe" attack can be applied with or without guard - nodes, stable guard nodes do offer a measure of accountability of sorts. 
- If a user was using a small set of guard nodes and knows them well, and - then is suddenly apprehended as a result of Tor usage, having a fixed - set of entry points to suspect is a lot better than suspecting the whole - network. Conversely, it can also give non-apprehended users comfort - that they are likely to remain safe indefinitely with their set of (now - presumably trusted) guards. This is probably the most beneficial - property of reliable guards: they deter the adversary from mounting - "shock and awe" attacks because the surviving users will not be - intimidated, but instead made more confident. Of course, guards need to - be made much more stable and users need to be encouraged to know their - guards for this property to really take effect. - - This beneficial property of client vigilance also carries over to an - active adversary, except in this case instead of relying on the user - to remember their guard nodes and somehow communicate them after - apprehension, the code can alert them to the presence of an active - adversary before they are apprehended. But only if they use guard nodes. - - So let's consider the active adversary: Two hop paths allow malicious - guards to get considerably more benefit from failing circuits if they do - not extend to their colluding peers for the exit hop. Since guards can - detect the number of hops in a path via either timing or by statistical - analysis of the exit policy of the 2nd hop, they can perform this attack - predominantly against 2 hop users. - - This can be addressed by completely abandoning an entry guard after a - certain ratio of extend or general circuit failures with respect to - non-failed circuits. The proper value for this ratio can be determined - experimentally with TorFlow. There is the possibility that the local - network can abuse this feature to cause certain guards to be dropped, - but they can do that anyways with the current Tor by just making guards - they don't like unreachable. 
With this mechanism, Tor will complain - loudly if any guard failure rate exceeds the expected in any failure - case, local or remote. - - Eliminating guards entirely would actually not address this issue due - to the time-tradeoff nature of risk. In fact, it would just make it - worse. Without guard nodes, it becomes much more difficult for clients - to become alerted to Tor entry points that are failing circuits to make - sure that they only devote bandwidth to carry traffic for streams which - they observe both ends. Yet the rogue entry points are still able to - significantly increase their success rates by failing circuits. - - For this reason, guard nodes should remain enabled for 2 hop users, - at least until an IP-independent, undetectable guard scanner can - be created. TorFlow can scan for failing guards, but after a while, - its unique behavior gives away the fact that its IP is a scanner and - it can be given selective service. - - Consideration of risks for node operators: - - There is a serious risk for two hop users in the form of guard - profiling. If an adversary running an exit node notices that a - particular site is always visited from a fixed previous hop, it is - likely that this is a two hop user using a certain guard, which could be - monitored to determine their identity. Thus, for the protection of both - 2 hop users and node operators, 2 hop users should limit their guard - duration to a sufficient number of days to verify reliability of a node, - but not much more. This duration can be determined experimentally by - TorFlow. - - Considering a Tor client builds on average 144 circuits/day (10 - minutes per circuit), if the adversary owns c/n% of exits on the - network, they can expect to see 144*c/n circuits from this user, or - about 14 minutes of usage per day per percentage of network penetration. 
- Since it will take several occurrences of user-linkable exit content - from the same predecessor hop for the adversary to have any confidence - this is a 2 hop user, it is very unlikely that any sort of demands made - upon the predecessor node would be guaranteed to be effective (ie it - actually was a guard), let alone be executed in time to apprehend the - user before they rotated guards. - - The reverse risk also warrants consideration. If a malicious guard has - orders to surveil Mike Perry, it can determine Mike Perry is using two - hops by observing his tendency to choose a 2nd hop with a viable exit - policy. This can be done relatively quickly, unfortunately, and - indicates Mike Perry should spend some of his time building real 3 hop - circuits through the same guards, to require them to at least wait for - him to actually use Tor to determine his style of operation, rather than - collect this information from his passive building patterns. - - However, to actively determine where Mike Perry is going, the guard - will need to require logging ahead of time at multiple exit nodes that - he may use over the course of the few days while he is at that guard, - and correlate the usage times of the exit node with Mike Perry's - activity at that guard for the few days he uses it. At this point, the - adversary is mounting a scale and method of attack (widespread logging, - timing attacks) that works pretty much just as effectively against 3 - hops, so exit node operators are exposed to no additional danger than - they otherwise normally are. - - -Why not fix Pathlen=2?: - - The main reason I am not advocating that we always use 2 hops is that - in some situations, timing correlation evidence by itself may not be - considered as solid and convincing as an actual, uninterrupted, fully - traced path. Are these timing attacks as effective on a real network as - they are in simulation? Maybe the circuit multiplexing of Tor can serve - to frustrate them to a degree? 
Would an extralegal adversary or - authoritarian government even care? In the face of these situation - dependent unknowns, it should be up to the user to decide if this is - a concern for them or not. - - It should probably also be noted that even a false positive - rate of 1% for a 200k concurrent-user network could mean that for a - given node, a given stream could be confused with something like 10 - users, assuming ~200 nodes carry most of the traffic (ie 1000 users - each). Though of course to really know for sure, someone needs to do - an attack on a real network, unfortunately. - - Additionally, at some point cover traffic schemes may be implemented to - frustrate timing attacks on the first hop. It is possible some expert - users may do this ad-hoc already, and may wish to continue using 3 hops - for this reason. - - -Implementation: - - new_route_len() can be modified directly with a check of the - Pathlen option. However, circuit construction logic should be - altered so that both 2 hop and 3 hop users build the same types of - circuits, and the option should ultimately govern circuit selection, - not construction. This improves coverage against guard nodes being - able to passively profile users who aren't even using Tor. - PathlenCoinWeight, anyone? :) - - The exit policy hack is a bit more tricky. compare_addr_to_addr_policy - needs to return an alternate ADDR_POLICY_ACCEPTED_WILDCARD or - ADDR_POLICY_ACCEPTED_SPECIFIC return value for use in - circuit_is_acceptable. - - The leaky exit is trickier still.. handle_control_attachstream - does allow paths to exit at a given hop. Presumably something similar - can be done in connection_ap_handshake_process_socks, and elsewhere? - Circuit construction would also have to be performed such that the - 2nd hop's exit policy is what is considered, not the 3rd's. 
- - The entry_guard_t structure could have num_circ_failed and - num_circ_succeeded members such that if it exceeds F% circuit - extend failure rate to a second hop, it is removed from the entry list. - - F should be sufficiently high to avoid churn from normal Tor circuit - failure as determined by TorFlow scans. - - The Vidalia option should be presented as a radio button. - - -Migration: - - Phase 1: Adjust exit policy checks if Pathlen is set, implement leaky - circuit ability, and 2-3 hop circuit selection logic governed by - Pathlen. - - Phase 2: Experiment to determine the proper ratio of circuit - failures used to expire garbage or malicious guards via TorFlow - (pending Bug #440 backport+adoption). - - Phase 3: Implement guard expiration code to kick off failure-prone - guards and warn the user. Cap 2 hop guard duration to a proper number - of days determined sufficient to establish guard reliability (to be - determined by TorFlow). - - Phase 4: Make radiobutton in Vidalia, along with help entry - that explains in layman's terms the risks involved. - - Phase 5: Allow user to specify path length by HTTP URL suffix. - - -[1] http://p2pnet.net/story/11279 -[2] http://www.cs.umass.edu/~mwright/papers/levine-timing.pdf -[3] Proof available upon request ;) diff --git a/doc/spec/proposals/116-two-hop-paths-from-guard.txt b/doc/spec/proposals/116-two-hop-paths-from-guard.txt deleted file mode 100644 index f45625350b..0000000000 --- a/doc/spec/proposals/116-two-hop-paths-from-guard.txt +++ /dev/null @@ -1,118 +0,0 @@ -Filename: 116-two-hop-paths-from-guard.txt -Title: Two hop paths from entry guards -Author: Michael Lieberman -Created: 26-Jun-2007 -Status: Dead - -This proposal is related to (but different from) Mike Perry's proposal 115 -"Two Hop Paths." - -Overview: - -Volunteers who run entry guards should have the option of using only 2 -additional tor nodes when constructing their own tor circuits. 
- -While the option of two hop paths should perhaps be extended to every client -(as discussed in Mike Perry's thread), I believe the anonymity properties of -two hop paths are particularly well-suited to client computers that are also -serving as entry guards. - -First I will describe the details of the strategy, as well as possible -avenues of attack. Then I will list advantages and disadvantages. Then, I -will discuss some possibly safer variations of the strategy, and finally -some implementation issues. - -Details: - -Suppose Alice is an entry guard, and wants to construct a two hop circuit. -Alice chooses a middle node at random (not using the entry guard strategy), -and gains anonymity by having her traffic look just like traffic from -someone else using her as an entry guard. - -Can Alice's middle node figure out that she is initiator of the traffic? I -can think of four possible approaches for distinguishing traffic from Alice -with traffic through Alice: - -1) Notice that communication from Alice comes too fast: Experimentation is -needed to determine if traffic from Alice can be distinguished from traffic -from a computer with a decent link to Alice. - -2) Monitor Alice's network traffic to discover the lack of incoming packets -at the appropriate times. If an adversary has this ability, then Alice -already has problems in the current system, because the adversary can run a -standard timing attack on Alice's traffic. - -3) Notice that traffic from Alice is unique in some way such that if Alice -was just one of 3 entry guards for this traffic, then the traffic should be -coming from two other entry guards as well. An example of "unique traffic" -could be always sending 117 packets every 3 minutes to an exit node that -exits to port 4661. However, if such patterns existed with sufficient -precision, then it seems to me that Tor already has a problem. 
(This "unique -traffic" may not be a problem if clients often end up choosing a single -entry guard because their other two are down. Does anyone know if this is -the case?) - -4) First, control the middle node *and* some other part of the traffic, -using standard attacks on a two hop circuit without entry nodes (my recent -paper on Browser-Based Attacks would work well for this -http://petworkshop.org/2007/papers/PET2007_preproc_Browser_based.pdf). With -control of the circuit, we can now cause "unique traffic" as in 3). -Alternatively, if we know something about Alice independently, and we can -see what websites are being visited, we might be able to guess that she is -the kind of person that would visit those websites. - -Anonymity Advantages: - --Alice never has the problem of choosing a malicious entry guard. In some -sense, Alice acts as her own entry guard. - -Anonymity Disadvantages: - --If Alice's traffic is identified as originating from herself (see above for -how hard that might be), then she has the anonymity of a 2 hop circuit -without entry guards. - -Additional advantages: - --A discussion of the latency advantages of two hop circuits is going on in -Mike Perry's thread already. --Also, we can advertise this change as "Run an entry guard and decrease your -own Tor latency." This incentive has the potential to add nodes to the -network, improving the network as a whole. - -Safer variations: - -To solve the "unique traffic" problem, Alice could use two hop paths only -1/3 of the time, and choose 2 other entry guards for the other 2/3 of the -time. All the advantages are now 1/3 as useful (possibly more, if the other -2 entry guards are not always up). - -To solve the problem that Alice's responses are too fast, Alice could delay -her responses (ideally based on some real data of response time when Alice -is used an entry guard). 
This loses most of the speed advantages of the two -hop path, but if Alice is a fast entry guard, it doesn't lose everything. It -also still has the (arguable) anonymity advantage that Alice doesn't have to -worry about having a malicious entry guard. - -Implementation details: -For Alice to remain anonymous using this strategy, she has to actually be -acting as an entry guard for other nodes. This means the two hop option can -only be available to whatever high-performance threshold is currently set on -entry guards. Alice may need to somehow check her own current status as an -entry guard before choosing this two hop strategy. - -Another thing to consider: suppose Alice is also an exit node. If the -fraction of exit nodes in existence is too small, she may rarely or never be -chosen as an entry guard. It would be sad if we offered an incentive to run -an entry guard that didn't extend to exit nodes. I suppose clients of Exit -nodes could pull the same trick, and bypass using Tor altogether (zero hop -paths), though that has additional issues.* - -Mike Lieberman -MIT - -*Why we shouldn't recommend Exit nodes pull the same trick: -1) Exit nodes would suffer heavily from the problem of "unique traffic" -mentioned above. -2) It would give governments an incentive to confiscate exit nodes to see if -they are pulling this trick. diff --git a/doc/spec/proposals/117-ipv6-exits.txt b/doc/spec/proposals/117-ipv6-exits.txt deleted file mode 100644 index 00cd7cef10..0000000000 --- a/doc/spec/proposals/117-ipv6-exits.txt +++ /dev/null @@ -1,410 +0,0 @@ -Filename: 117-ipv6-exits.txt -Title: IPv6 exits -Author: coderman -Created: 10-Jul-2007 -Status: Accepted -Target: 0.2.1.x - -Overview - - Extend Tor for TCP exit via IPv6 transport and DNS resolution of IPv6 - addresses. This proposal does not imply any IPv6 support for OR - traffic, only exit and name resolution. - - -Contents - -0. 
Motivation - - As the IPv4 address space becomes more scarce there is increasing - effort to provide Internet services via the IPv6 protocol. Many - hosts are available at IPv6 endpoints which are currently - inaccessible for Tor users. - - Extending Tor to support IPv6 exit streams and IPv6 DNS name - resolution will allow users of the Tor network to access these hosts. - This capability would be present for those who do not currently have - IPv6 access, thus increasing the utility of Tor and furthering - adoption of IPv6. - - -1. Design - -1.1. General design overview - - There are three main components to this proposal. The first is a - method for routers to advertise their ability to exit IPv6 traffic. - The second is the manner in which routers resolve names to IPv6 - addresses. Last but not least is the method in which clients - communicate with Tor to resolve and connect to IPv6 endpoints - anonymously. - -1.2. Router IPv6 exit support - - In order to specify exit policies and IPv6 capability new directives - in the Tor configuration will be needed. If a router advertises IPv6 - exit policies in its descriptor this will signal the ability to - provide IPv6 exit. There are a number of additional default deny - rules associated with this new address space which are detailed in - the addendum. - - When Tor is started on a host it should check for the presence of a - global unicast IPv6 address and if present include the default IPv6 - exit policies and any user specified IPv6 exit policies. - - If a user provides IPv6 exit policies but no global unicast IPv6 - address is available Tor should generate a warning and not publish the - IPv6 policies in the router descriptor. - - It should be noted that IPv4 mapped IPv6 addresses are not valid exit - destinations. This mechanism is mainly used to interoperate with - both IPv4 and IPv6 clients on the same socket. 
Any attempts to use - an IPv4 mapped IPv6 address, perhaps to circumvent exit policy for - IPv4, must be refused. - -1.3. DNS name resolution of IPv6 addresses (AAAA records) - - In addition to exit support for IPv6 TCP connections, a method to - resolve domain names to their respective IPv6 addresses is also - needed. This is accomplished in the existing DNS system via AAAA - records. Routers will perform both A and AAAA requests when - resolving a name so that the client can utilize an IPv6 endpoint when - available or preferred. - - To avoid potential problems with caching DNS servers that behave - poorly all NXDOMAIN responses to AAAA requests should be ignored if a - successful response is received for an A request. This implies that - both AAAA and A requests will always be performed for each name - resolution. - - For reverse lookups on IPv6 addresses, like that used for - RESOLVE_PTR, Tor will perform the necessary PTR requests via - IP6.ARPA. - - All routers which perform DNS resolution on behalf of clients - (RELAY_RESOLVE) should perform and respond with both A and AAAA - resources. - - [NOTE: In a future version, when we extend the behavior of RESOLVE to - encapsulate more of real DNS, it will make sense to allow more - flexibility here. -nickm] - -1.4. Client interaction with IPv6 exit capability - -1.4.1. Usability goals - - There are a number of behaviors which Tor can provide when - interacting with clients that will improve the usability of IPv6 exit - capability. These behaviors are designed to make it simple for - clients to express a preference for IPv6 transport and utilize IPv6 - host services. - -1.4.2. SOCKSv5 IPv6 client behavior - - The SOCKS version 5 protocol supports IPv6 connections. When using - SOCKSv5 with hostnames it is difficult to determine if a client - wishes to use an IPv4 or IPv6 address to connect to the desired host - if it resolves to both address types. 
- - In order to make this more intuitive the SOCKSv5 protocol can be - supported on a local IPv6 endpoint, [::1] port 9050 for example. - When a client requests a connection to the desired host via an IPv6 - SOCKS connection Tor will prefer IPv6 addresses when resolving the - host name and connecting to the host. - - Likewise, RESOLVE and RESOLVE_PTR requests from an IPv6 SOCKS - connection will return IPv6 addresses when available, and fall back - to IPv4 addresses if not. - - [NOTE: This means that SocksListenAddress and DNSListenAddress should - support IPv6 addresses. Perhaps there should also be a general option - to have listeners that default to 127.0.0.1 and 0.0.0.0 listen - additionally or instead on ::1 and :: -nickm] - -1.4.3. MAPADDRESS behavior - - The MAPADDRESS capability supports clients that may not be able to - use the SOCKSv4a or SOCKSv5 hostname support to resolve names via - Tor. This ability should be extended to IPv6 addresses in SOCKSv5 as - well. - - When a client requests an address mapping from the wildcard IPv6 - address, [::0], the server will respond with a unique local IPv6 - address on success. It is important to note that there may be two - mappings for the same name if both an IPv4 and IPv6 address are - associated with the host. In this case a CONNECT to a mapped IPv6 - address should prefer IPv6 for the connection to the host, if - available, while CONNECT to a mapped IPv4 address will prefer IPv4. - - It should be noted that IPv6 does not provide the concept of a host - local subnet, like 127.0.0.0/8 in IPv4. For this reason integration - of Tor with IPv6 clients should consider a firewall or filter rule to - drop unique local addresses to or from the network when possible. - These packets should not be routed, however, keeping them off the - subnet entirely is worthwhile. - -1.4.3.1. 
Generating unique local IPv6 addresses - - The usual manner of generating a unique local IPv6 address is to - select a Global ID part randomly, along with a Subnet ID, and share - this prefix among the communicating parties who each have their own - distinct Interface ID. In this style a given Tor instance might - select a random Global and Subnet ID and provide MAPADDRESS - assignments with a random Interface ID as needed. This has the - potential to associate unique Global/Subnet identifiers with a given - Tor instance and may expose attacks against the anonymity of Tor - users. - - To avoid this potential problem entirely, MAPADDRESS must always - generate the Global, Subnet, and Interface IDs randomly for each - request. It is also highly suggested that explicitly specifying an - IPv6 source address instead of the wildcard address not be supported - to ensure that a good random address is used. - -1.4.4. DNSProxy IPv6 client behavior - - A new capability in recent Tor versions is the transparent DNS proxy. - This feature will need to return both A and AAAA resource records - when responding to client name resolution requests. - - The transparent DNS proxy should also support reverse lookups for - IPv6 addresses. It is suggested that any such requests to the - deprecated IP6.INT domain should be translated to IP6.ARPA instead. - This translation is not likely to be used and is of low priority. - - It would be nice to support DNS over IPv6 transport as well, however, - this is not likely to be used and is of low priority. - -1.4.5. TransPort IPv6 client behavior - - Tor also provides transparent TCP proxy support via the Trans* - directives in the configuration. The TransListenAddress directive - should accept an IPv6 address in addition to IPv4 so that IPv6 TCP - connections can be transparently proxied. - -1.5. Additional changes - - The RedirectExit option should be deprecated rather than extending - this feature to IPv6. - - -2. Spec changes - -2.1.
Tor specification - - In '6.2. Opening streams and transferring data' the following should - be changed to indicate IPv6 exit capability: - - "No version of Tor currently generates the IPv6 format." - - In '6.4. Remote hostname lookup' the following should be updated to - reflect use of ip6.arpa in addition to in-addr.arpa. - - "For a reverse lookup, the OP sends a RELAY_RESOLVE cell containing an - in-addr.arpa address." - - In 'A.1. Differences between spec and implementation' the following - should be updated to indicate IPv6 exit capability: - - "The current codebase has no IPv6 support at all." - - [NOTE: the EXITPOLICY end-cell reason says that it can hold an ipv4 or an - ipv6 address, but doesn't say how. We may want a separate EXITPOLICY2 - type that can hold an ipv6 address, since the way we encode ipv6 - addresses elsewhere ("0.0.0.0 indicates that the next 16 bytes are ipv6") - is a bit dumb. -nickm] - [Actually, the length field lets us distinguish EXITPOLICY. -nickm] - -2.2. Directory specification - - In '2.1. Router descriptor format' a new set of directives is needed - for IPv6 exit policy. The existing accept/reject directives should - be clarified to indicate IPv4 or wildcard address relevance. The new - IPv6 directives will be in the form of: - - "accept6" exitpattern NL - "reject6" exitpattern NL - - The section describing accept6/reject6 should explain that the - presence of accept6 or reject6 exit policies in a router descriptor - signals the ability of that router to exit IPv6 traffic (according to - IPv6 exit policies). - - The "[::]/0" notation is used to represent "all IPv6 addresses". - "[::0]/0" may also be used for this representation. - - If a user specifies a 'reject6 [::]/0:*' policy in the Tor - configuration this will be interpreted as forcing no IPv6 exit - support and no accept6/reject6 policies will be included in the - published descriptor. 
This will prevent IPv6 exit if the router host - has a global unicast IPv6 address present. - - It is important to note that a wildcard address in an accept or - reject policy applies to both IPv4 and IPv6 addresses. - -2.3. Control specification - - In '3.8. MAPADDRESS' the potential to have to addresses for a given - name should be explained. The method for generating unique local - addresses for IPv6 mappings needs explanation as described above. - - When IPv6 addresses are used in this document they should include the - brackets for consistency. For example, the null IPv6 address should - be written as "[::0]" and not "::0". The control commands will - expect the same syntax as well. - - In '3.9. GETINFO' the "address" command should return both public - IPv4 and IPv6 addresses if present. These addresses should be - separated via \r\n. - - -2.4. Tor SOCKS extensions - - In '2. Name lookup' a description of IPv6 address resolution is - needed for SOCKSv5 as described above. IPv6 addresses should be - supported in both the RESOLVE and RESOLVE_PTR extensions. - - A new section describing the ability to accept SOCKSv5 clients on a - local IPv6 address to indicate a preference for IPv6 transport as - described above is also needed. The behavior of Tor SOCKSv5 proxy - with an IPv6 preference should be explained, for example, preferring - IPv6 transport to a named host with both IPv4 and IPv6 addresses - available (A and AAAA records). - - -3. Questions and concerns - -3.1. DNS A6 records - - A6 is explicitly avoided in this document. There are potential - reasons for implementing this, however, the inherent complexity of - the protocol and resolvers make this unappealing. Is there a - compelling reason to consider A6 as part of IPv6 exit support? - - [IMO not till anybody needs it. -nickm] - -3.2. IPv4 and IPv6 preference - - The design above tries to infer a preference for IPv4 or IPv6 - transport based on client interactions with Tor. 
It might be useful - to provide more explicit control over this preference. For example, - an IPv4 SOCKSv5 client may want to use IPv6 transport to named hosts - in CONNECT requests while the current implementation would assume an - IPv4 preference. Should more explicit control be available, through - either configuration directives or control commands? - - Many applications support a inet6-only or prefer-family type option - that provides the user manual control over address preference. This - could be provided as a Tor configuration option. - - An explicit preference is still possible by resolving names and then - CONNECTing to an IPv4 or IPv6 address as desired, however, not all - client applications may have this option available. - -3.3. Support for IPv6 only transparent proxy clients - - It may be useful to support IPv6 only transparent proxy clients using - IPv4 mapped IPv6 like addresses. This would require transparent DNS - proxy using IPv6 transport and the ability to map A record responses - into IPv4 mapped IPv6 like addresses in the manner described in the - "NAT-PT" RFC for a traditional Basic-NAT-PT with DNS-ALG. The - transparent TCP proxy would thus need to detect these mapped addresses - and connect to the desired IPv4 host. - - The IPv6 prefix used for this purpose must not be the actual IPv4 - mapped IPv6 address prefix, though the manner in which IPv4 addresses - are embedded in IPv6 addresses would be the same. - - The lack of any IPv6 only hosts which would use this transparent proxy - method makes this a lot of work for very little gain. Is there a - compelling reason to support this NAT-PT like capability? - -3.4. IPv6 DNS and older Tor routers - - It is expected that many routers will continue to run with older - versions of Tor when the IPv6 exit capability is released. Clients - who wish to use IPv6 will need to route RELAY_RESOLVE requests to the - newer routers which will respond with both A and AAAA resource - records when possible. 
- - One way to do this is to route RELAY_RESOLVE requests to routers with - IPv6 exit policies published, however, this would not utilize current - routers that can resolve IPv6 addresses even if they can't exit such - traffic. - - There was also concern expressed about the ability of existing clients - to cope with new RELAY_RESOLVE responses that contain IPv6 addresses. - If this breaks backward compatibility, a new request type may be - necessary, like RELAY_RESOLVE6, or some other mechanism of indicating - the ability to parse IPv6 responses when making the request. - -3.5. IPv4 and IPv6 bindings in MAPADDRESS - - It may be troublesome to try and support two distinct address mappings - for the same name in the existing MAPADDRESS implementation. If this - cannot be accommodated then the behavior should replace existing - mappings with the new address regardless of family. A warning when - this occurs would be useful to assist clients who encounter problems - when both an IPv4 and IPv6 application are using MAPADDRESS for the - same names concurrently, causing lost connections for one of them. - -4. Addendum - -4.1. Sample IPv6 default exit policy - - reject 0.0.0.0/8 - reject 169.254.0.0/16 - reject 127.0.0.0/8 - reject 192.168.0.0/16 - reject 10.0.0.0/8 - reject 172.16.0.0/12 - reject6 [0000::]/8 - reject6 [0100::]/8 - reject6 [0200::]/7 - reject6 [0400::]/6 - reject6 [0800::]/5 - reject6 [1000::]/4 - reject6 [4000::]/3 - reject6 [6000::]/3 - reject6 [8000::]/3 - reject6 [A000::]/3 - reject6 [C000::]/3 - reject6 [E000::]/4 - reject6 [F000::]/5 - reject6 [F800::]/6 - reject6 [FC00::]/7 - reject6 [FE00::]/9 - reject6 [FE80::]/10 - reject6 [FEC0::]/10 - reject6 [FF00::]/8 - reject *:25 - reject *:119 - reject *:135-139 - reject *:445 - reject *:1214 - reject *:4661-4666 - reject *:6346-6429 - reject *:6699 - reject *:6881-6999 - accept *:* - # accept6 [2000::]/3:* is implied - -4.2. 
Additional resources - - 'DNS Extensions to Support IP Version 6' - http://www.ietf.org/rfc/rfc3596.txt - - 'DNS Extensions to Support IPv6 Address Aggregation and Renumbering' - http://www.ietf.org/rfc/rfc2874.txt - - 'SOCKS Protocol Version 5' - http://www.ietf.org/rfc/rfc1928.txt - - 'Unique Local IPv6 Unicast Addresses' - http://www.ietf.org/rfc/rfc4193.txt - - 'INTERNET PROTOCOL VERSION 6 ADDRESS SPACE' - http://www.iana.org/assignments/ipv6-address-space - - 'Network Address Translation - Protocol Translation (NAT-PT)' - http://www.ietf.org/rfc/rfc2766.txt diff --git a/doc/spec/proposals/118-multiple-orports.txt b/doc/spec/proposals/118-multiple-orports.txt deleted file mode 100644 index 2381ec7ca3..0000000000 --- a/doc/spec/proposals/118-multiple-orports.txt +++ /dev/null @@ -1,84 +0,0 @@ -Filename: 118-multiple-orports.txt -Title: Advertising multiple ORPorts at once -Author: Nick Mathewson -Created: 09-Jul-2007 -Status: Accepted -Target: 0.2.1.x - -Overview: - - This document is a proposal for servers to advertise multiple - address/port combinations for their ORPort. - -Motivation: - - Sometimes servers want to support multiple ports for incoming - connections, either in order to support multiple address families, to - better use multiple interfaces, or to support a variety of - FascistFirewallPorts settings. This is easy to set up now, but - there's no way to advertise it to clients. - -New descriptor syntax: - - We add a new line in the router descriptor, "or-address". This line - can occur zero, one, or multiple times. Its format is: - - or-address SP ADDRESS ":" PORTLIST NL - - ADDRESS = IP6ADDR / IP4ADDR - IPV6ADDR = an ipv6 address, surrounded by square brackets. - IPV4ADDR = an ipv4 address, represented as a dotted quad. - PORTLIST = PORTSPEC | PORTSPEC "," PORTLIST - PORTSPEC = PORT | PORT "-" PORT - - [This is the regular format for specifying sets of addresses and - ports in Tor.] 
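To make the "or-address" grammar above concrete, here is a minimal parsing sketch. It is not Tor's parser: the function name `parse_or_address` and the returned tuple shape are assumptions made for illustration, and the sketch reads the grammar's ADDRESS as "bracketed IPv6 or dotted-quad IPv4".

```python
import ipaddress


def parse_or_address(line):
    """Parse 'or-address SP ADDRESS ":" PORTLIST NL' (hypothetical helper).

    Returns (address, [(low_port, high_port), ...]).
    Raises ValueError on anything that does not fit the grammar.
    """
    keyword, sep, rest = line.rstrip("\n").partition(" ")
    if keyword != "or-address" or not sep:
        raise ValueError("not an or-address line")

    # The port list never contains ':', so split on the LAST colon;
    # this keeps the colons inside a bracketed IPv6 address intact.
    addr_text, colon, portlist = rest.rpartition(":")
    if not colon:
        raise ValueError("missing ':' before port list")

    if addr_text.startswith("[") and addr_text.endswith("]"):
        address = ipaddress.IPv6Address(addr_text[1:-1])
    else:
        address = ipaddress.IPv4Address(addr_text)

    ranges = []
    for spec in portlist.split(","):
        low_text, dash, high_text = spec.partition("-")
        if not low_text.isdigit() or (dash and not high_text.isdigit()):
            raise ValueError(f"bad port spec: {spec!r}")
        low = int(low_text)
        high = int(high_text) if dash else low
        if not 1 <= low <= high <= 65535:
            raise ValueError(f"port out of range: {spec!r}")
        ranges.append((low, high))
    return address, ranges


if __name__ == "__main__":
    print(parse_or_address("or-address [2001:db8::1]:9001,9030-9040\n"))
    print(parse_or_address("or-address 192.0.2.7:443\n"))
```

Returning each PORTSPEC as an inclusive (low, high) pair keeps single ports and ranges in one representation, so a caller can test reachability without special cases.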
- -New OR behavior: - - We add two more options to supplement ORListenAddress: - ORPublishedListenAddress, and ORPublishAddressSet. The former - listens on an address-port combination and publishes it in addition - to the regular address. The latter advertises a set of address-port - combinations, but does not listen on them. [To use this option, the - server operator should set up port forwarding to the regular ORPort, - as for example with firewall rules.] - - Servers should extend their testing to include advertised addresses - and ports. No address or port should be advertised until it's been - tested. [This might get expensive in practice.] - -New authority behavior: - - Authorities should spot-test descriptors, and reject any where a - substantial part of the addresses can't be reached. - -New client behavior: - - When connecting to another server, clients SHOULD pick an - address-port combination at random as supported by their - reachableaddresses. If a client has a connection to a server at one - address, it SHOULD use that address for any simultaneous connections - to that server. Clients SHOULD use the canonical address for any - server when generating extend cells. - -Not addressed here: - - * There's no reason to listen on multiple dirports; current Tors - mostly don't connect directly to the dirport anyway. - - * It could be advantageous to list something about extra addresses in - the network-status document. This would, however, eat space there. - More analysis is needed, particularly in light of proposal 141 - ("Download server descriptors on demand") - -Dependencies: - - Testing for canonical connections needs to be implemented before it's - safe to use this proposal. - - -Notes 3 July: - - Write up the simple version of this. No ranges needed yet. No - networkstatus changes yet.
- diff --git a/doc/spec/proposals/119-controlport-auth.txt b/doc/spec/proposals/119-controlport-auth.txt deleted file mode 100644 index 9ed1cc1cbe..0000000000 --- a/doc/spec/proposals/119-controlport-auth.txt +++ /dev/null @@ -1,140 +0,0 @@ -Filename: 119-controlport-auth.txt -Title: New PROTOCOLINFO command for controllers -Author: Roger Dingledine -Created: 14-Aug-2007 -Status: Closed -Implemented-In: 0.2.0.x - -Overview: - - Here we describe how to help controllers locate the cookie - authentication file when authenticating to Tor, so we can a) require - authentication by default for Tor controllers and b) still keep - things usable. Also, we propose an extensible, general-purpose mechanism - for controllers to learn about a Tor instance's protocol and - authentication requirements before authenticating. - -The Problem: - - When we first added the controller protocol, we wanted to make it - easy for people to play with it, so by default we didn't require any - authentication from controller programs. We allowed requests only from - localhost as a stopgap measure for security. - - Due to an increasing number of vulnerabilities based on this approach, - it's time to add authentication in default configurations. - - We have a number of goals: - - We want the default Vidalia bundles to transparently work. That - means we don't want the users to have to type in or know a password. - - We want to allow multiple controller applications to connect to the - control port. So if Vidalia is launching Tor, it can't just keep the - secrets to itself. - - Right now there are three authentication approaches supported - by the control protocol: NULL, CookieAuthentication, and - HashedControlPassword. See Sec 5.1 in control-spec.txt for details. - - There are a couple of challenges here. The first is: if the controller - launches Tor, how should we teach Tor what authentication approach - it should require, and the secret that goes along with it? 
Next is: - how should this work when the controller attaches to an existing Tor, - rather than launching Tor itself? - - Cookie authentication seems most amenable to letting multiple controller - applications interact with Tor. But that brings in yet another question: - how does the controller guess where to look for the cookie file, - without first knowing what DataDirectory Tor is using? - -Design: - - We should add a new controller command PROTOCOLINFO that can be sent - as a valid first command (the others being AUTHENTICATE and QUIT). If - PROTOCOLINFO is sent as the first command, the second command must be - either a successful AUTHENTICATE or a QUIT. - - If the initial command sequence is not valid, Tor closes the connection. - - -Spec: - - C: "PROTOCOLINFO" *(SP PIVERSION) CRLF - S: "250+PROTOCOLINFO" SP PIVERSION CRLF *InfoLine "250 OK" CRLF - - InfoLine = AuthLine / VersionLine / OtherLine - - AuthLine = "250-AUTH" SP "METHODS=" AuthMethod *(",")AuthMethod - *(SP "COOKIEFILE=" AuthCookieFile) CRLF - VersionLine = "250-VERSION" SP "Tor=" TorVersion [SP Arguments] CRLF - - AuthMethod = - "NULL" / ; No authentication is required - "HASHEDPASSWORD" / ; A controller must supply the original password - "COOKIE" / ; A controller must supply the contents of a cookie - - AuthCookieFile = QuotedString - TorVersion = QuotedString - - OtherLine = "250-" Keyword [SP Arguments] CRLF - - For example: - - C: PROTOCOLINFO CRLF - S: "250+PROTOCOLINFO 1" CRLF - S: "250-AUTH Methods=HASHEDPASSWORD,COOKIE COOKIEFILE="/tor/cookie"" CRLF - S: "250-VERSION Tor=0.2.0.5-alpha" CRLF - S: "250 OK" CRLF - - Tor MAY give its InfoLines in any order; controllers MUST ignore InfoLines - with keywords it does not recognize. Controllers MUST ignore extraneous - data on any InfoLine. - - PIVERSION is there in case we drastically change the syntax one day. For - now it should always be "1", for the controller protocol. 
Controllers MAY - provide a list of the protocol versions they support; Tor MAY select a - version that the controller does not support. - - Right now only two "topics" (AUTH and VERSION) are included, but more - may be included in the future. Controllers must accept lines with - unexpected topics. - - AuthCookieFile = QuotedString - - AuthMethod is used to specify one or more control authentication - methods that Tor currently accepts. - - AuthCookieFile specifies the absolute path and filename of the - authentication cookie that Tor is expecting and is provided iff - the METHODS field contains the method "COOKIE". Controllers MUST handle - escape sequences inside this string. - - The VERSION line contains the Tor version. - - [What else might we want to include that could be useful? -RD] - -Compatibility: - - Tor 0.1.2.16 and 0.2.0.4-alpha hang up after the first failed - command. Earlier Tors don't know about this command but don't hang - up. That means controllers will need a mechanism for distinguishing - whether they're talking to a Tor that speaks PROTOCOLINFO or not. - - I suggest that the controllers attempt a PROTOCOLINFO. Then: - - If it works, great. Authenticate as required. - - If they get hung up on, reconnect and do a NULL AUTHENTICATE. - - If it's unrecognized but they're not hung up on, do a NULL - AUTHENTICATE. - -Unsolved problems: - - If Torbutton wants to be a Tor controller one day... talking TCP is - bad enough, but reading from the filesystem is even harder. Is there - a way to let simple programs work with the controller port without - needing all the auth infrastructure? - - Once we put this approach in place, the next vulnerability we see will - involve an attacker somehow getting read access to the victim's files - --- and then we're back where we started. This means we still need - to think about how to demand password-based authentication without - bothering the user about it. 
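As a rough illustration of the PROTOCOLINFO reply grammar in proposal 119 above, the following Python sketch shows how a controller might pull the authentication methods, cookie path, and Tor version out of a reply. The function names are hypothetical and not part of Tor or any controller library. The quoted-string handling is a simplification that treats a backslash followed by any character as that literal character. The sketch accepts an unquoted Tor version because the proposal's own example shows one.

```python
import re

# Body of a QuotedString: anything except an unescaped double quote.
_QUOTED = r'"((?:[^"\\]|\\.)*)"'


def _unquote(body: str) -> str:
    # Simplification (assumption): "\x" stands for the literal character x.
    return re.sub(r"\\(.)", r"\1", body)


def parse_protocolinfo(reply: str) -> dict:
    """Parse a PROTOCOLINFO reply; unknown InfoLines are ignored."""
    info = {"piversion": None, "methods": [], "cookiefile": None, "tor_version": None}
    for line in reply.split("\r\n"):
        if line.startswith("250+PROTOCOLINFO "):
            info["piversion"] = line.split(" ")[1]
        elif line.startswith("250-AUTH "):
            # The proposal's example spells the key "Methods=", so match
            # it case-insensitively.
            m = re.search(r"METHODS=([A-Za-z,]+)", line, re.IGNORECASE)
            if m:
                info["methods"] = m.group(1).upper().split(",")
            m = re.search(r"COOKIEFILE=" + _QUOTED, line)
            if m:
                info["cookiefile"] = _unquote(m.group(1))
        elif line.startswith("250-VERSION "):
            m = re.search(r"Tor=(?:" + _QUOTED + r"|(\S+))", line)
            if m:
                quoted, bare = m.group(1), m.group(2)
                info["tor_version"] = _unquote(quoted) if quoted is not None else bare
        # Any other "250-" line has an unrecognized keyword and is skipped.
    return info
```

A controller following the compatibility advice in the proposal would call this only when the PROTOCOLINFO command succeeds, and fall back to a NULL AUTHENTICATE otherwise.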
- diff --git a/doc/spec/proposals/120-shutdown-descriptors.txt b/doc/spec/proposals/120-shutdown-descriptors.txt deleted file mode 100644 index 5cfe2b5bc6..0000000000 --- a/doc/spec/proposals/120-shutdown-descriptors.txt +++ /dev/null @@ -1,83 +0,0 @@ -Filename: 120-shutdown-descriptors.txt -Title: Shutdown descriptors when Tor servers stop -Author: Roger Dingledine -Created: 15-Aug-2007 -Status: Dead - -[Proposal dead as of 11 Jul 2008. The point of this proposal was to give -routers a good way to get out of the networkstatus early, but proposal -138 (already implemented) has achieved this.] - -Overview: - - Tor servers should publish a last descriptor whenever they shut down, - to let others know that they are no longer offering service. - -The Problem: - - The main reason for this is in reaction to Internet services that want - to treat connections from the Tor network differently. Right now, - if a user experiments with turning on the "relay" functionality, he - is punished by being locked out of some websites, some IRC networks, - etc --- and this lockout persists for several days even after he turns - the server off. - -Design: - - During the "slow shutdown" period if exiting, or shortly after the - user sets his ORPort back to 0 if not exiting, Tor should publish a - final descriptor with the following characteristics: - - 1) Exit policy is listed as "reject *:*" - 2) It includes a new entry called "opt shutdown 1" - - The first step is so current blacklists will no longer list this node - as exiting to whatever the service is. - - The second step is so directory authorities can avoid wasting time - doing reachability testing. Authorities should automatically not list - as Running any router whose latest descriptor says it shut down. - - [I originally had in mind a third step --- Advertised bandwidth capacity - is listed as "0" --- so current Tor clients will skip over this node - when building most circuits. 
But since clients won't fetch descriptors - from nodes not listed as Running, this step seems pointless. -RD] - -Spec: - - TBD but should be pretty straightforward. - -Security issues: - - Now external people can learn exactly when a node stopped offering - relay service. How bad is this? I can see a few minor attacks based - on this knowledge, but on the other hand as it is we don't really take - any steps to keep this information secret. - -Overhead issues: - - We are creating more descriptors that want to be remembered. However, - since the router won't be marked as Running, ordinary clients won't - fetch the shutdown descriptors. Caches will, though. I hope this is ok. - -Implementation: - - To make things easy, we should publish the shutdown descriptor only - on controlled shutdown (SIGINT as opposed to SIGTERM). That would - leave enough time for publishing that we probably wouldn't need any - extra synchronization code. - - If that turns out to be too unintuitive for users, I could imagine doing - it on SIGTERMs too, and just delaying exit until we had successfully - published to at least one authority, at which point we'd hope that it - propagated from there. - -Acknowledgements: - - tup suggested this idea. - -Comments: - - 2) Maybe add a rule "Don't do this for hibernation if we expect to wake - up before the next consensus is published"? 
- - NM 9 Oct 2007 diff --git a/doc/spec/proposals/121-hidden-service-authentication.txt b/doc/spec/proposals/121-hidden-service-authentication.txt deleted file mode 100644 index 0d92b53a8c..0000000000 --- a/doc/spec/proposals/121-hidden-service-authentication.txt +++ /dev/null @@ -1,776 +0,0 @@ -Filename: 121-hidden-service-authentication.txt -Title: Hidden Service Authentication -Author: Tobias Kamm, Thomas Lauterbach, Karsten Loesing, Ferdinand Rieger, - Christoph Weingarten -Created: 10-Sep-2007 -Status: Finished -Implemented-In: 0.2.1.x - -Change history: - - 26-Sep-2007 Initial proposal for or-dev - 08-Dec-2007 Incorporated comments by Nick posted to or-dev on 10-Oct-2007 - 15-Dec-2007 Rewrote complete proposal for better readability, modified - authentication protocol, merged in personal notes - 24-Dec-2007 Replaced misleading term "authentication" by "authorization" - and added some clarifications (comments by Sven Kaffille) - 28-Apr-2008 Updated most parts of the concrete authorization protocol - 04-Jul-2008 Add a simple algorithm to delay descriptor publication for - different clients of a hidden service - 19-Jul-2008 Added INTRODUCE1V cell type (1.2), improved replay - protection for INTRODUCE2 cells (1.3), described limitations - for auth protocols (1.6), improved hidden service protocol - without client authorization (2.1), added second, more - scalable authorization protocol (2.2), rewrote existing - authorization protocol (2.3); changes based on discussion - with Nick - 31-Jul-2008 Limit maximum descriptor size to 20 kilobytes to prevent - abuse. - 01-Aug-2008 Use first part of Diffie-Hellman handshake for replay - protection instead of rendezvous cookie. - 01-Aug-2008 Remove improved hidden service protocol without client - authorization (2.1). It might get implemented in proposal - 142. 
- -Overview: - - This proposal deals with a general infrastructure for performing - authorization (not necessarily implying authentication) of requests to - hidden services at three points: (1) when downloading and decrypting - parts of the hidden service descriptor, (2) at the introduction point, - and (3) at Bob's Tor client before contacting the rendezvous point. A - service provider will be able to restrict access to his service at these - three points to authorized clients only. Further, the proposal contains - specific authorization protocols as instances that implement the - presented authorization infrastructure. - - This proposal is based on v2 hidden service descriptors as described in - proposal 114 and introduced in version 0.2.0.10-alpha. - - The proposal is structured as follows: The next section motivates the - integration of authorization mechanisms in the hidden service protocol. - Then we describe a general infrastructure for authorization in hidden - services, followed by specific authorization protocols for this - infrastructure. At the end we discuss a number of attacks and non-attacks - as well as compatibility issues. - -Motivation: - - The major part of hidden services does not require client authorization - now and won't do so in the future. To the contrary, many clients would - not want to be (pseudonymously) identifiable by the service (though this - is unavoidable to some extent), but rather use the service - anonymously. These services are not addressed by this proposal. - - However, there may be certain services which are intended to be accessed - by a limited set of clients only. A possible application might be a - wiki or forum that should only be accessible for a closed user group. - Another, less intuitive example might be a real-time communication - service, where someone provides a presence and messaging service only to - his buddies. 
Finally, a possible application would be a personal home - server that should be remotely accessed by its owner. - - Performing authorization for a hidden service within the Tor network, as - proposed here, offers a range of advantages compared to allowing all - client connections in the first instance and deferring authorization to - the transported protocol: - - (1) Reduced traffic: Unauthorized requests would be rejected as early as - possible, thereby reducing the overall traffic in the network generated - by establishing circuits and sending cells. - - (2) Better protection of service location: Unauthorized clients could not - force Bob to create circuits to their rendezvous points, thus preventing - the attack described by Øverlier and Syverson in their paper "Locating - Hidden Servers" even without the need for guards. - - (3) Hiding activity: Apart from performing the actual authorization, a - service provider could also hide the mere presence of his service from - unauthorized clients when not providing hidden service descriptors to - them, rejecting unauthorized requests already at the introduction - point (ideally without leaking presence information at any of these - points), or not answering unauthorized introduction requests. - - (4) Better protection of introduction points: When providing hidden - service descriptors to authorized clients only and encrypting the - introduction points as described in proposal 114, the introduction points - would be unknown to unauthorized clients and thereby protected from DoS - attacks. - - (5) Protocol independence: Authorization could be performed for all - transported protocols, regardless of their own capabilities to do so. - - (6) Ease of administration: A service provider running multiple hidden - services would be able to configure access at a single place uniformly - instead of doing so for all services separately. 
- - (7) Optional QoS support: Bob could adapt his node selection algorithm - for building the circuit to Alice's rendezvous point depending on a - previously guaranteed QoS level, thus providing better latency or - bandwidth for selected clients. - - A disadvantage of performing authorization within the Tor network is - that a hidden service cannot make use of authorization data in - the transported protocol. Tor hidden services were designed to be - independent of the transported protocol. Therefore it's only possible to - either grant or deny access to the whole service, but not to specific - resources of the service. - - Authorization often implies authentication, i.e. proving one's identity. - However, when performing authorization within the Tor network, untrusted - points should not gain any useful information about the identities of - communicating parties, neither server nor client. A crucial challenge is - to remain anonymous towards directory servers and introduction points. - However, trying to hide identity from the hidden service is a futile - task, because a client would never know if he is the only authorized - client and therefore perfectly identifiable. Therefore, hiding client - identity from the hidden service is not an aim of this proposal. - - The current implementation of hidden services does not provide any kind - of authorization. The hidden service descriptor version 2, introduced by - proposal 114, was designed to use a descriptor cookie for downloading and - decrypting parts of the descriptor content, but this feature is not yet - in use. Further, most relevant cell formats specified in rend-spec - contain fields for authorization data, but those fields are neither - implemented nor do they suffice entirely. - -Details: - - 1. 
General infrastructure for authorization to hidden services - - We spotted three possible authorization points in the hidden service - protocol: - - (1) when downloading and decrypting parts of the hidden service - descriptor, - (2) at the introduction point, and - (3) at Bob's Tor client before contacting the rendezvous point. - - The general idea of this proposal is to allow service providers to - restrict access to some or all of these points to authorized clients - only. - - 1.1. Client authorization at directory - - Since the implementation of proposal 114 it is possible to combine a - hidden service descriptor with a so-called descriptor cookie. If done so, - the descriptor cookie becomes part of the descriptor ID, thus having an - effect on the storage location of the descriptor. Someone who has learned - about a service, but is not aware of the descriptor cookie, won't be able - to determine the descriptor ID and download the current hidden service - descriptor; he won't even know whether the service has uploaded a - descriptor recently. Descriptor IDs are calculated as follows (see - section 1.2 of rend-spec for the complete specification of v2 hidden - service descriptors): - - descriptor-id = - H(service-id | H(time-period | descriptor-cookie | replica)) - - Currently, service-id is equivalent to permanent-id which is calculated - as in the following formula. But in principle it could be any public - key. - - permanent-id = H(permanent-key)[:10] - - The second purpose of the descriptor cookie is to encrypt the list of - introduction points, including optional authorization data. Hence, the - hidden service directories won't learn any introduction information from - storing a hidden service descriptor. This feature is implemented but - unused at the moment. So this proposal will harness the advantages - of proposal 114. - - The descriptor cookie can be used for authorization by keeping it secret - from everyone but authorized clients. 
A service could then decide whether - to publish hidden service descriptors using that descriptor cookie later - on. An authorized client being aware of the descriptor cookie would be - able to download and decrypt the hidden service descriptor. - - The number of concurrently used descriptor cookies for one hidden service - is not restricted. A service could use a single descriptor cookie for all - users, a distinct cookie per user, or something in between, like one - cookie per group of users. It is up to the specific protocol and how it - is applied by a service provider. - - Two or more hidden service descriptors for different groups or users - should not be uploaded at the same time. A directory node could conclude - easily that the descriptors were issued by the same hidden service, thus - being able to link the two groups or users. Therefore, descriptors for - different users or clients that ought to be stored on the same directory - are delayed, so that only one descriptor is uploaded to a directory at a - time. The remaining descriptors are uploaded with a delay of up to - 30 seconds. - Further, descriptors for different groups or users that are to be stored - on different directories are delayed for a random time of up to 30 - seconds to hide relations from colluding directories. Certainly, this - does not prevent linking entirely, but it makes it somewhat harder. - There is a conflict between hiding links between clients and making a - service available in a timely manner. - - Although this part of the proposal is meant to describe a general - infrastructure for authorization, changing the way of using the - descriptor cookie to look up hidden service descriptors, e.g. applying - some sort of asymmetric crypto system, would require in-depth changes - that would be incompatible to v2 hidden service descriptors. On the - contrary, using another key for en-/decrypting the introduction point - part of a hidden service descriptor, e.g. 
a different symmetric key or - asymmetric encryption, would be easy to implement and compatible to v2 - hidden service descriptors as understood by hidden service directories - (clients and services would have to be upgraded anyway for using the new - features). - - An adversary could try to abuse the fact that introduction points can be - encrypted by storing arbitrary, unrelated data in the hidden service - directory. This abuse can be limited by setting a hard descriptor size - limit, forcing the adversary to split data into multiple chunks. There - are some limitations that make splitting data across multiple descriptors - unattractive: 1) The adversary would not be able to choose descriptor IDs - freely and would therefore have to implement his own indexing - structure. 2) Validity of descriptors is limited to at most 24 hours - after which descriptors need to be republished. - - The regular descriptor size in bytes is 745 + num_ipos * 837 + auth_data. - A large descriptor with 7 introduction points and 5 kilobytes of - authorization data would be 11724 bytes in size. The upper size limit of - descriptors should be set to 20 kilobytes, which limits the effect of - abuse while retaining enough flexibility in designing authorization - protocols. - - 1.2. Client authorization at introduction point - - The next possible authorization point after downloading and decrypting - a hidden service descriptor is the introduction point. It may be important - for authorization, because it bears the last chance of hiding presence - of a hidden service from unauthorized clients. Further, performing - authorization at the introduction point might reduce traffic in the - network, because unauthorized requests would not be passed to the - hidden service. 
This applies to those clients who are aware of a - descriptor cookie and thereby of the hidden service descriptor, but do - not have authorization data to pass the introduction point or access the - service (such a situation might occur when authorization data for - authorization at the directory is not issued on a per-user basis, but - authorization data for authorization at the introduction point is). - - It is important to note that the introduction point must be considered - untrustworthy, and therefore cannot replace authorization at the hidden - service itself. Nor should the introduction point learn any sensitive - identifiable information from either the service or the client. - - In order to perform authorization at the introduction point, three - message formats need to be modified: (1) v2 hidden service descriptors, - (2) ESTABLISH_INTRO cells, and (3) INTRODUCE1 cells. - - A v2 hidden service descriptor needs to contain authorization data that - is introduction-point-specific and sometimes also authorization data - that is introduction-point-independent. Therefore, v2 hidden service - descriptors as specified in section 1.2 of rend-spec already contain two - reserved fields "intro-authorization" and "service-authorization" - (originally, the names of these fields were "...-authentication") - containing an authorization type number and arbitrary authorization - data. We propose that authorization data consists of base64 encoded - objects of arbitrary length, surrounded by "-----BEGIN MESSAGE-----" and - "-----END MESSAGE-----". This will increase the size of hidden service - descriptors, but this is allowed since there is no strict upper limit. - - The current ESTABLISH_INTRO cells as described in section 1.3 of - rend-spec do not contain either authorization data or version - information. 
Therefore, we propose a new version 1 of the ESTABLISH_INTRO - cells adding these two issues as follows: - - V Format byte: set to 255 [1 octet] - V Version byte: set to 1 [1 octet] - KL Key length [2 octets] - PK Bob's public key [KL octets] - HS Hash of session info [20 octets] - AUTHT The auth type that is supported [1 octet] - AUTHL Length of auth data [2 octets] - AUTHD Auth data [variable] - SIG Signature of above information [variable] - - From the format it is possible to determine the maximum allowed size for - authorization data: given the fact that cells are 512 octets long, of - which 498 octets are usable (see section 6.1 of tor-spec), and assuming - 1024 bit = 128 octet long keys, there are 215 octets left for - authorization data. Hence, authorization protocols are bound to use no - more than these 215 octets, regardless of the number of clients that - shall be authenticated at the introduction point. Otherwise, one would - need to send multiple ESTABLISH_INTRO cells or split them up, which we do - not specify here. - - In order to understand a v1 ESTABLISH_INTRO cell, the implementation of - a relay must have a certain Tor version. Hidden services need to be able - to distinguish relays being capable of understanding the new v1 cell - formats and perform authorization. We propose to use the version number - that is contained in networkstatus documents to find capable - introduction points. - - The current INTRODUCE1 cell as described in section 1.8 of rend-spec is - not designed to carry authorization data and has no version number, too. - Unfortunately, unversioned INTRODUCE1 cells consist only of a fixed-size, - seemingly random PK_ID, followed by the encrypted INTRODUCE2 cell. This - makes it impossible to distinguish unversioned INTRODUCE1 cells from any - later format. In particular, it is not possible to introduce some kind of - format and version byte for newer versions of this cell. 
That's probably - where the comment "[XXX011 want to put intro-level auth info here, but no - version. crap. -RD]" that was part of rend-spec some time ago comes from. - - We propose that new versioned INTRODUCE1 cells use the new cell type 41 - RELAY_INTRODUCE1V (where V stands for versioned): - - Cleartext - V Version byte: set to 1 [1 octet] - PK_ID Identifier for Bob's PK [20 octets] - AUTHT The auth type that is included [1 octet] - AUTHL Length of auth data [2 octets] - AUTHD Auth data [variable] - Encrypted to Bob's PK: - (RELAY_INTRODUCE2 cell) - - The maximum length of contained authorization data depends on the length - of the contained INTRODUCE2 cell. A calculation follows below when - describing the INTRODUCE2 cell format we propose to use. - - 1.3. Client authorization at hidden service - - The time when a hidden service receives an INTRODUCE2 cell constitutes - the last possible authorization point during the hidden service - protocol. Performing authorization here is easier than at the other two - authorization points, because there are no possibly untrusted entities - involved. - - In general, a client that is successfully authorized at the introduction - point should be granted access at the hidden service, too. Otherwise, the - client would receive a positive INTRODUCE_ACK cell from the introduction - point and conclude that it may connect to the service, but the request - will be dropped without notice. This would appear as a failure to - clients. Therefore, the number of cases in which a client successfully - passes the introduction point but fails at the hidden service should be - zero. However, this does not lead to the conclusion that the - authorization data used at the introduction point and the hidden service - must be the same, but only that both authorization data should lead to - the same authorization result. 
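The RELAY_INTRODUCE1V cleartext layout proposed above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, AUTHL is assumed to be encoded in network byte order, and the sizes 498, 306, and 58 are the figures the proposal itself uses in its size calculation for v3 INTRODUCE2 cells.

```python
import struct

RELAY_PAYLOAD_LEN = 498    # usable octets per cell (figure quoted in the proposal)
INTRO1V_HEADER_LEN = 24    # V (1) + PK_ID (20) + AUTHT (1) + AUTHL (2)
INTRO2_V3_BASE_LEN = 306   # v3 INTRODUCE2 without auth data, per the proposal
HYBRID_OVERHEAD = 58       # hybrid public-key encryption overhead, per the proposal


def auth_budget() -> int:
    """Octets left for auth data, shared by introduction point and service."""
    return (RELAY_PAYLOAD_LEN - INTRO1V_HEADER_LEN
            - INTRO2_V3_BASE_LEN - HYBRID_OVERHEAD)


def pack_introduce1v(pk_id: bytes, auth_type: int, auth_data: bytes,
                     encrypted_intro2: bytes) -> bytes:
    """Build the payload: cleartext header, auth data, encrypted INTRODUCE2."""
    if len(pk_id) != 20:
        raise ValueError("PK_ID must be 20 octets")
    # Assumption: big-endian ("!") encoding for the 2-octet AUTHL field.
    header = struct.pack("!B20sBH", 1, pk_id, auth_type, len(auth_data))
    payload = header + auth_data + encrypted_intro2
    if len(payload) > RELAY_PAYLOAD_LEN:
        raise ValueError("payload exceeds the usable cell size")
    return payload
```

With these figures, `auth_budget()` comes to the 110 octets the proposal cites, which have to cover auth data for both the introduction point and the hidden service.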
- - Authorization data is transmitted from client to server via an - INTRODUCE2 cell that is forwarded by the introduction point. There are - versions 0 to 2 specified in section 1.8 of rend-spec, but none of these - contain fields for carrying authorization data. We propose a slightly - modified version of v3 INTRODUCE2 cells that is specified in section - 1.8.1 and which is not implemented as of December 2007. In contrast to - the specified v3 we avoid specifying (and implementing) IPv6 capabilities, - because Tor relays will be required to support IPv4 addresses for a long - time in the future, so that this seems unnecessary at the moment. The - proposed format of v3 INTRODUCE2 cells is as follows: - - VER Version byte: set to 3. [1 octet] - AUTHT The auth type that is used [1 octet] - AUTHL Length of auth data [2 octets] - AUTHD Auth data [variable] - TS Timestamp (seconds since 1-1-1970) [4 octets] - IP Rendezvous point's address [4 octets] - PORT Rendezvous point's OR port [2 octets] - ID Rendezvous point identity ID [20 octets] - KLEN Length of onion key [2 octets] - KEY Rendezvous point onion key [KLEN octets] - RC Rendezvous cookie [20 octets] - g^x Diffie-Hellman data, part 1 [128 octets] - - The maximum possible length of authorization data is related to the - enclosing INTRODUCE1V cell. A v3 INTRODUCE2 cell with - 1024 bit = 128 octets long public key without any authorization data - occupies 306 octets (AUTHL is only used when AUTHT has a value != 0), - plus 58 octets for hybrid public key encryption (see - section 5.1 of tor-spec on hybrid encryption of CREATE cells). The - surrounding INTRODUCE1V cell requires 24 octets. This leaves only 110 - of the 498 available octets free, which must be shared between - authorization data to the introduction point _and_ to the hidden - service. - - When receiving a v3 INTRODUCE2 cell, Bob checks whether a client has - provided valid authorization data to him. 
He also requires that the - timestamp is no more than 30 minutes in the past or future and that the - first part of the Diffie-Hellman handshake has not been used in the past - 60 minutes to prevent replay attacks by rogue introduction points. (The - reason for not using the rendezvous cookie to detect replays---even - though it is only sent once in the current design---is that it might be - desirable to re-use rendezvous cookies for multiple introduction requests - in the future.) If all checks pass, Bob builds a circuit to the provided - rendezvous point. Otherwise he drops the cell. - - 1.4. Summary of authorization data fields - - In summary, the proposed descriptor format and cell formats provide the - following fields for carrying authorization data: - - (1) The v2 hidden service descriptor contains: - - a descriptor cookie that is used for the lookup process, and - - an arbitrary encryption schema to ensure authorization to access - introduction information (currently symmetric encryption with the - descriptor cookie). - - (2) For performing authorization at the introduction point we can use: - - the fields intro-authorization and service-authorization in - hidden service descriptors, - - a maximum of 215 octets in the ESTABLISH_INTRO cell, and - - one part of 110 octets in the INTRODUCE1V cell. - - (3) For performing authorization at the hidden service we can use: - - the fields intro-authorization and service-authorization in - hidden service descriptors, - - the other part of 110 octets in the INTRODUCE2 cell. - - It will also still be possible to access a hidden service without any - authorization or only use a part of the authorization infrastructure. - However, this requires to consider all parts of the infrastructure. 
For - example, authorization at the introduction point relying on confidential - intro-authorization data transported in the hidden service descriptor - cannot be performed without using an encryption schema for introduction - information. - - 1.5. Managing authorization data at servers and clients - - In order to provide authorization data at the hidden service and the - authenticated clients, we propose to use files---either the Tor - configuration file or separate files. The exact format of these special - files depends on the authorization protocol used. - - Currently, rend-spec contains the proposition to encode client-side - authorization data in the URL, like in x.y.z.onion. This was never used - and is also a bad idea, because in case of HTTP the requested URL may be - contained in the Host and Referer fields. - - 1.6. Limitations for authorization protocols - - There are two limitations of the current hidden service protocol for - authorization protocols that shall be identified here. - - 1. The three cell types ESTABLISH_INTRO, INTRODUCE1V, and INTRODUCE2 - restricts the amount of data that can be used for authorization. - This forces authorization protocols that require per-user - authorization data at the introduction point to restrict the number - of authorized clients artificially. A possible solution could be to - split contents among multiple cells and reassemble them at the - introduction points. - - 2. The current hidden service protocol does not specify cell types to - perform interactive authorization between client and introduction - point or hidden service. If there should be an authorization - protocol that requires interaction, new cell types would have to be - defined and integrated into the hidden service protocol. - - - 2. Specific authorization protocol instances - - In the following we present two specific authorization protocols that - make use of (parts of) the new authorization infrastructure: - - 1. 
The first protocol allows a service provider to restrict access - to clients with a previously received secret key only, but does not - attempt to hide service activity from others. - - 2. The second protocol, albeit being feasible for a limited set of about - 16 clients, performs client authorization and hides service activity - from everyone but the authorized clients. - - These two protocol instances extend the existing hidden service protocol - version 2. Hidden services that perform client authorization may run in - parallel to other services running versions 0, 2, or both. - - 2.1. Service with large-scale client authorization - - The first client authorization protocol aims at performing access control - while consuming as few additional resources as possible. A service - provider should be able to permit access to a large number of clients - while denying access for everyone else. However, the price for - scalability is that the service won't be able to hide its activity from - unauthorized or formerly authorized clients. - - The main idea of this protocol is to encrypt the introduction-point part - in hidden service descriptors to authorized clients using symmetric keys. - This ensures that nobody else but authorized clients can learn which - introduction points a service currently uses, nor can someone send a - valid INTRODUCE1 message without knowing the introduction key. Therefore, - a subsequent authorization at the introduction point is not required. - - A service provider generates symmetric "descriptor cookies" for his - clients and distributes them outside of Tor. The suggested key size is - 128 bits, so that descriptor cookies can be encoded in 22 base64 chars - (which can hold up to 22 * 6 = 132 bits, leaving 4 bits to encode the - authorization type (here: "0") and allow a client to distinguish this - authorization protocol from others like the one proposed below). 
- Typically, the contact information for a hidden service using this - authorization protocol looks like this: - - v2cbb2l4lsnpio4q.onion Ll3X7Xgz9eHGKCCnlFH0uz - - When generating a hidden service descriptor, the service encrypts the - introduction-point part with a single randomly generated symmetric - 128-bit session key using AES-CTR as described for v2 hidden service - descriptors in rend-spec. Afterwards, the service encrypts the session - key to all descriptor cookies using AES. Authorized clients should be able - to efficiently find the session key that is encrypted for them, so - 4-octet client IDs are generated from the descriptor cookie - and initialization vector. Descriptors always contain a number of - encrypted session keys that is a multiple of 16, padded with fake entries. - Encrypted session keys are ordered by client IDs in order to conceal - addition or removal of authorized clients by the service provider. - - ATYPE Authorization type: set to 1. [1 octet] - ALEN Number of clients := 1 + ((clients - 1) div 16) [1 octet] - for each symmetric descriptor cookie: - ID Client ID: H(descriptor cookie | IV)[:4] [4 octets] - SKEY Session key encrypted with descriptor cookie [16 octets] - (end of client-specific part) - RND Random data [(15 - ((clients - 1) mod 16)) * 20 octets] - IV AES initialization vector [16 octets] - IPOS Intro points, encrypted with session key [remaining octets] - - An authorized client needs to configure Tor to use the descriptor cookie - when accessing the hidden service. Therefore, a user adds the contact - information that she received from the service provider to her torrc - file. Upon downloading a hidden service descriptor, Tor finds the - encrypted introduction-point part and attempts to decrypt it using the - configured descriptor cookie. (In the rare event of two or more client - IDs being equal, a client tries to decrypt all of them.) 
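The client-specific part of the authorization type 1 layout above can be sketched in Python as follows. The sketch rests on a few assumptions: H is taken to be SHA-1, the function names are hypothetical, and the AES encryption of the session key is left to the caller, who passes in the already-encrypted 16-octet values. It only demonstrates the client ID derivation, the ordering by client ID, and the padding arithmetic.

```python
import hashlib
import os


def client_id(descriptor_cookie: bytes, iv: bytes) -> bytes:
    # H(descriptor cookie | IV)[:4]; H is assumed to be SHA-1 here.
    return hashlib.sha1(descriptor_cookie + iv).digest()[:4]


def auth_block_layout(num_clients: int) -> tuple:
    """Return (ALEN, length of RND in octets) for a given client count."""
    if num_clients < 1:
        raise ValueError("at least one client is required")
    alen = 1 + (num_clients - 1) // 16
    rnd_len = (15 - (num_clients - 1) % 16) * 20
    return alen, rnd_len


def client_specific_part(encrypted_keys: dict, iv: bytes) -> bytes:
    """Build the ID/SKEY entries plus RND padding.

    encrypted_keys maps each descriptor cookie to the session key that the
    caller has already encrypted with that cookie (16 octets each).
    """
    entries = sorted(
        (client_id(cookie, iv), skey) for cookie, skey in encrypted_keys.items()
    )
    body = b"".join(cid + skey for cid, skey in entries)
    _, rnd_len = auth_block_layout(len(encrypted_keys))
    # Random padding makes the total a multiple of 16 entries of 20 octets.
    return body + os.urandom(rnd_len)
```

Because every entry is 20 octets (4 for ID, 16 for SKEY), the padded result is always a multiple of 320 octets, so the descriptor size reveals only the client count rounded up to the next multiple of 16.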
- - Upon sending the introduction, the client includes her descriptor cookie - as auth type "1" in the INTRODUCE2 cell that she sends to the service. - The hidden service checks whether the included descriptor cookie is - authorized to access the service and either responds to the introduction - request, or not. - - 2.2. Authorization for limited number of clients - - A second, more sophisticated client authorization protocol goes the extra - mile of hiding service activity from unauthorized clients. With all else - being equal to the preceding authorization protocol, the second protocol - publishes hidden service descriptors for each user separately and gets - along with encrypting the introduction-point part of descriptors to a - single client. This allows the service to stop publishing descriptors for - removed clients. As long as a removed client cannot link descriptors - issued for other clients to the service, it cannot derive service - activity any more. The downside of this approach is limited scalability. - Even though the distributed storage of descriptors (cf. proposal 114) - tackles the problem of limited scalability to a certain extent, this - protocol should not be used for services with more than 16 clients. (In - fact, Tor should refuse to advertise services for more than this number - of clients.) - - A hidden service generates an asymmetric "client key" and a symmetric - "descriptor cookie" for each client. The client key is used as - replacement for the service's permanent key, so that the service uses a - different identity for each of his clients. The descriptor cookie is used - to store descriptors at changing directory nodes that are unpredictable - for anyone but service and client, to encrypt the introduction-point - part, and to be included in INTRODUCE2 cells. Once the service has - created client key and descriptor cookie, he tells them to the client - outside of Tor. 
The contact information string looks similar to the one - used by the preceding authorization protocol (with the only difference - that it has "1" encoded as auth-type in the remaining 4 of 132 bits - instead of "0" as before). - - When creating a hidden service descriptor for an authorized client, the - hidden service uses the client key and descriptor cookie to compute - secret ID part and descriptor ID: - - secret-id-part = H(time-period | descriptor-cookie | replica) - - descriptor-id = H(client-key[:10] | secret-id-part) - - The hidden service also replaces permanent-key in the descriptor with - client-key and encrypts introduction-points with the descriptor cookie. - - ATYPE Authorization type: set to 2. [1 octet] - IV AES initialization vector [16 octets] - IPOS Intro points, encr. with descriptor cookie [remaining octets] - - When uploading descriptors, the hidden service needs to make sure that - descriptors for different clients are not uploaded at the same time (cf. - Section 1.1) which is also a limiting factor for the number of clients. - - When a client is requested to establish a connection to a hidden service - it looks up whether it has any authorization data configured for that - service. If the user has configured authorization data for authorization - protocol "2", the descriptor ID is determined as described in the last - paragraph. Upon receiving a descriptor, the client decrypts the - introduction-point part using its descriptor cookie. Further, the client - includes its descriptor cookie as auth-type "2" in INTRODUCE2 cells that - it sends to the service. - - 2.3. Hidden service configuration - - A hidden service that is meant to perform client authorization adds a - new option HiddenServiceAuthorizeClient to its hidden service - configuration. 
This option contains the authorization type which is - either "1" for the protocol described in 2.1 or "2" for the protocol in - 2.2 and a comma-separated list of human-readable client names, so that - Tor can create authorization data for these clients: - - HiddenServiceAuthorizeClient auth-type client-name,client-name,... - - If this option is configured, HiddenServiceVersion is automatically - reconfigured to contain only version numbers of 2 or higher. - - Tor stores all generated authorization data for the authorization - protocols described in Sections 2.1 and 2.2 in a new file using the - following file format: - - "client-name" human-readable client identifier NL - "descriptor-cookie" 128-bit key ^= 22 base64 chars NL - - If the authorization protocol of Section 2.2 is used, Tor also generates - and stores the following data: - - "client-key" NL a public key in PEM format - - 2.4. Client configuration - - Clients need to make their authorization data known to Tor using another - configuration option that contains a service name (mainly for the sake of - convenience), the service address, and the descriptor cookie that is - required to access a hidden service (the authorization protocol number is - encoded in the descriptor cookie): - - HidServAuth service-name service-address descriptor-cookie - -Security implications: - - In the following we want to discuss possible attacks by dishonest - entities in the presented infrastructure and specific protocol. These - security implications would have to be verified once more when adding - another protocol. The dishonest entities (theoretically) include the - hidden service itself, the authenticated clients, hidden service directory - nodes, introduction points, and rendezvous points. The relays that are - part of circuits used during protocol execution, but never learn about - the exchanged descriptors or cells by design, are not considered. - Obviously, this list makes no claim to be complete. 
The discussed attacks - are sorted by the difficulty to perform them, in ascending order, - starting with roles that everyone could attempt to take and ending with - partially trusted entities abusing the trust put in them. - - (1) A hidden service directory could attempt to conclude presence of a - service from the existence of a locally stored hidden service descriptor: - This passive attack is possible only for a single client-service - relation, because descriptors need to contain a publicly visible - signature of the service using the client key. - A possible protection would be to increase the number of hidden service - directories in the network. - - (2) A hidden service directory could try to break the descriptor cookies - of locally stored descriptors: This attack can be performed offline. The - only useful countermeasure against it might be using safe passwords that - are generated by Tor. - -[passwords? where did those come in? -RD] - - (3) An introduction point could try to identify the pseudonym of the - hidden service on behalf of which it operates: This is impossible by - design, because the service uses a fresh public key for every - establishment of an introduction point (see proposal 114) and the - introduction point receives a fresh introduction cookie, so that there is - no identifiable information about the service that the introduction point - could learn. The introduction point cannot even tell if client accesses - belong to the same client or not, nor can it know the total number of - authorized clients. The only information might be the pattern of - anonymous client accesses, but that is hardly enough to reliably identify - a specific service. - - (4) An introduction point could want to learn the identities of accessing - clients: This is also impossible by design, because all clients use the - same introduction cookie for authorization at the introduction point. 
- - (5) An introduction point could try to replay a correct INTRODUCE1 cell - to other introduction points of the same service, e.g. in order to force - the service to create a huge number of useless circuits: This attack is - not possible by design, because INTRODUCE1 cells are encrypted using a - freshly created introduction key that is only known to authorized - clients. - - (6) An introduction point could attempt to replay a correct INTRODUCE2 - cell to the hidden service, e.g. for the same reason as in the last - attack: This attack is stopped by the fact that a service will drop - INTRODUCE2 cells containing a DH handshake they have seen recently. - - (7) An introduction point could block client requests by sending either - positive or negative INTRODUCE_ACK cells back to the client, but without - forwarding INTRODUCE2 cells to the server: This attack is an annoyance - for clients, because they might wait for a timeout to elapse until trying - another introduction point. However, this attack is not introduced by - performing authorization and it cannot be targeted towards a specific - client. A countermeasure might be for the server to periodically perform - introduction requests to his own service to see if introduction points - are working correctly. - - (8) The rendezvous point could attempt to identify either server or - client: This remains impossible as it was before, because the - rendezvous cookie does not contain any identifiable information. - - (9) An authenticated client could swamp the server with valid INTRODUCE1 - and INTRODUCE2 cells, e.g. in order to force the service to create - useless circuits to rendezvous points; as opposed to an introduction - point replaying the same INTRODUCE2 cell, a client could include a new - rendezvous cookie for every request: The countermeasure for this attack - is the restriction to 10 connection establishments per client per hour. 
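
The countermeasure named in (9) can be sketched as a per-client counter. The proposal only states the limit of 10 connection establishments per client per hour; the sliding-window bookkeeping, the class name, and the use of a client name as key are assumptions made for this illustration.

```python
from collections import defaultdict, deque

MAX_PER_WINDOW = 10    # limit stated in the proposal
WINDOW_SECONDS = 3600  # one hour


class IntroRateLimiter:
    """Hypothetical helper: sliding-window limit on introductions per client."""

    def __init__(self):
        # client name -> timestamps of accepted introductions, oldest first
        self._accepted = defaultdict(deque)

    def allow(self, client_name: str, now: float) -> bool:
        stamps = self._accepted[client_name]
        # Forget introductions that fell out of the one-hour window.
        while stamps and now - stamps[0] >= WINDOW_SECONDS:
            stamps.popleft()
        if len(stamps) >= MAX_PER_WINDOW:
            return False  # drop the INTRODUCE2 cell, build no circuit
        stamps.append(now)
        return True
```

Because the limit is keyed on the authorized client, one misbehaving client cannot exhaust the allowance of another.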
- -Compatibility: - - An implementation of this proposal would require changes to hidden - services and clients to process authorization data and encode and - understand the new formats. However, both services and clients would - remain compatible to regular hidden services without authorization. - -Implementation: - - The implementation of this proposal can be divided into a number of - changes to hidden service and client side. There are no - changes necessary on directory, introduction, or rendezvous nodes. All - changes are marked with either [service] or [client] to denote on which - side they need to be made. - - /1/ Configure client authorization [service] - - - Parse configuration option HiddenServiceAuthorizeClient containing - authorized client names. - - Load previously created client keys and descriptor cookies. - - Generate missing client keys and descriptor cookies, add them to - client_keys file. - - Rewrite the hostname file. - - Keep client keys and descriptor cookies of authorized clients in - memory. - [- In case of reconfiguration, mark which client authorizations were - added and whether any were removed. This can be used later when - deciding whether to rebuild introduction points and publish new - hidden service descriptors. Not implemented yet.] - - /2/ Publish hidden service descriptors [service] - - - Create and upload hidden service descriptors for all authorized - clients. - [- See /1/ for the case of reconfiguration.] - - /3/ Configure permission for hidden services [client] - - - Parse configuration option HidServAuth containing service - authorization, store authorization data in memory. - - /5/ Fetch hidden service descriptors [client] - - - Look up client authorization upon receiving a hidden service request. - - Request hidden service descriptor ID including client key and - descriptor cookie. Only request v2 descriptors, no v0.
- - /6/ Process hidden service descriptor [client] - - - Decrypt introduction points with descriptor cookie. - - /7/ Create introduction request [client] - - - Include descriptor cookie in INTRODUCE2 cell to introduction point. - - Pass descriptor cookie around between involved connections and - circuits. - - /8/ Process introduction request [service] - - - Read descriptor cookie from INTRODUCE2 cell. - - Check whether descriptor cookie is authorized for access, including - checking access counters. - - Log access for accountability. - diff --git a/doc/spec/proposals/122-unnamed-flag.txt b/doc/spec/proposals/122-unnamed-flag.txt deleted file mode 100644 index 2ce7bb22b9..0000000000 --- a/doc/spec/proposals/122-unnamed-flag.txt +++ /dev/null @@ -1,136 +0,0 @@ -Filename: 122-unnamed-flag.txt -Title: Network status entries need a new Unnamed flag -Author: Roger Dingledine -Created: 04-Oct-2007 -Status: Closed -Implemented-In: 0.2.0.x - -1. Overview: - - Tor's directory authorities can give certain servers a "Named" flag - in the network-status entry, when they want to bind that nickname to - that identity key. This allows clients to specify a nickname rather - than an identity fingerprint and still be certain they're getting the - "right" server. As dir-spec.txt describes it, - - Name X is bound to identity Y if at least one binding directory lists - it, and no directory binds X to some other Y'. - - In practice, clients can refer to servers by nickname whether they are - Named or not; if they refer to nicknames that aren't Named, a complaint - shows up in the log asking them to use the identity key in the future - --- but it still works. - - The problem? Imagine a Tor server with nickname Bob. Bob and his - identity fingerprint are registered in tor26's approved-routers - file, but none of the other authorities registered him. Imagine - there are several other unregistered servers also with nickname Bob - ("the imposters"). 
- - While Bob is online, all is well: a) tor26 gives a Named flag to - the real one, and refuses to list the other ones; and b) the other - authorities list the imposters but don't give them a Named flag. Clients - who have all the network-statuses can compute which one is the real Bob. - - But when the real Bob disappears and his descriptor expires? tor26 - continues to refuse to list any of the imposters, and the other - authorities continue to list the imposters. Clients don't have any - idea that there exists a Named Bob, so they can ask for server Bob and - get one of the imposters. (A warning will also appear in their log, - but so what.) - -2. The stopgap solution: - - tor26 should start accepting and listing the imposters, but it should - assign them a new flag: "Unnamed". - - This would produce three cases in terms of assigning flags in the consensus - networkstatus: - - i) a router gets the Named flag in the v3 networkstatus if - a) it's the only router with that nickname that has the Named flag - out of all the votes, and - b) no vote lists it as Unnamed - else, - ii) a router gets the Unnamed flag if - a) some vote lists a different router with that nickname as Named, or - b) at least one vote lists it as Unnamed, or - c) there are other routers with the same nickname that are Unnamed - else, - iii) the router neither gets a Named nor an Unnamed flag. - - (This whole proposal is meant only for v3 dir flags; we shouldn't try - to backport it to the v2 dir world.) - - Then client behavior is: - - a) If there's a Bob with a Named flag, pick that one. - else b) If the Bobs don't have the Unnamed flag (notice that they should - either all have it, or none), pick one of them and warn. - else c) They all have the Unnamed flag -- no router found. - -3. Problems not solved by this stopgap: - - 3.1. Naming authorities can go offline. 
- - If tor26 is the only authority that provides a binding for Bob, when - tor26 goes offline we're back in our previous situation -- the imposters - can be referenced with a mere ignorable warning in the client's log. - - If some other authority Names a different Bob, and tor26 goes offline, - then that other Bob becomes the unique Named Bob. - - So be it. We should try to solve these one day, but there's no clear way - to do it that doesn't destroy usability in other ways, and if we want - to get the Unnamed flag into v3 network statuses we should add it soon. - - 3.2. V3 dir spec magnifies brief discrepancies. - - Another point to notice is if tor26 names Bob(1), doesn't know about - Bob(2), but moria lists Bob(2). Then Bob(2) doesn't get an Unnamed flag - even if it should (and Bob(1) is not around). - - Right now, in v2 dirs, the case where an authority doesn't know about - a server but the other authorities do know is rare. That's because - authorities periodically ask for other networkstatuses and then fetch - descriptors that are missing. - - With v3, if that window occurs at the wrong time, it is extended for the - entire period. We could solve this by making the voting more complex, - but that doesn't seem worth it. - - [3.3. Tor26 is only one tor26. - - We need more naming authorities, possibly with some kind of auto-naming - feature. This is out-of-scope for this proposal -NM] - -4. Changes to the v2 directory - - Previously, v2 authorities that had a binding for a server named Bob did - not list any other server named Bob. This will change too: - - Version 2 authorities will start listing all routers they know about, - whether they conflict with a name-binding or not: Servers for which - this authority has a binding will continue to be marked Named, - additionally all other servers of that nickname will be listed without the - Named flag (i.e. there will be no Unnamed flag in v2 status documents). 
- - Clients already should handle having a named Bob alongside unnamed - Bobs correctly, and having the unnamed Bobs in the status file even - without the named server is no worse than the current status quo where - clients learn about those servers from other authorities. - - The benefit of this is that an authority's opinion on a server like - Guard, Stable, Fast etc. can now be learned by clients even if that - specific authority has reserved that server's name for somebody else. - -5. Other benefits: - - This new flag will allow people to operate servers that happen to have - the same nickname as somebody who registered their server two years ago - and left soon after. Right now there are dozens of nicknames that are - registered on all three binding directory authorities, yet haven't been - running for years. While it's bad that these nicknames are effectively - blacklisted from the network, the really bad part is that this logic - is really unintuitive to prospective new server operators. - diff --git a/doc/spec/proposals/123-autonaming.txt b/doc/spec/proposals/123-autonaming.txt deleted file mode 100644 index 74c486985d..0000000000 --- a/doc/spec/proposals/123-autonaming.txt +++ /dev/null @@ -1,54 +0,0 @@ -Filename: 123-autonaming.txt -Title: Naming authorities automatically create bindings -Author: Peter Palfrader -Created: 2007-10-11 -Status: Closed -Implemented-In: 0.2.0.x - -Overview: - - Tor's directory authorities can give certain servers a "Named" flag - in the network-status entry, when they want to bind that nickname to - that identity key. This allows clients to specify a nickname rather - than an identity fingerprint and still be certain they're getting the - "right" server. - - Authority operators name a server by adding their nickname and - identity fingerprint to the 'approved-routers' file. 
Historically - being listed in the file was required for a router, at first for being - listed in the directory at all, and later in order to be used by - clients as a first or last hop of a circuit. - - Adding identities to the list of named routers so far has been a - manual, time consuming, and boring job. Given that and the fact that - the Tor network works just fine without named routers the last - authority to keep a current binding list stopped updating it well over - half a year ago. - - Naming, if it were done, would serve a useful purpose however in that - users can have a reasonable expectation that the exit server Bob they - are using in their http://www.google.com.bob.exit/ URL is the same - Bob every time. - -Proposal: - I propose that identity<->name binding be completely automated: - - New bindings should be added after the router has been around for a - bit and their name has not been used by other routers, similarly names - that have not appeared on the network for a long time should be freed - in case a new router wants to use it. - - The following rules are suggested: - i) If a named router has not been online for half a year, the - identity<->name binding for that name is removed. The nickname - is free to be taken by other routers now. - ii) If a router claims a certain nickname and - a) has been on the network for at least two weeks, and - b) that nickname is not yet linked to a different router, and - c) no other router has wanted that nickname in the last month, - a new binding should be created for this router and its desired - nickname. - - This automaton does not necessarily need to live in the Tor code, it - can do its job just as well when it's an external tool. 
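
Rules i) and ii) above could be evaluated by an external tool roughly as follows. This is a sketch under assumptions: "half a year" is taken as 182 days and "the last month" as 30 days, and the function and parameter names are invented for illustration.

```python
from datetime import datetime, timedelta
from typing import Dict, Optional

HALF_YEAR = timedelta(days=182)  # assumed length of "half a year"
TWO_WEEKS = timedelta(days=14)
ONE_MONTH = timedelta(days=30)   # assumed length of "the last month"


def binding_expired(last_online: datetime, now: datetime) -> bool:
    """Rule i): free the nickname if the named router was offline for half a year."""
    return now - last_online >= HALF_YEAR


def should_bind(identity: str, nickname: str, first_seen: datetime,
                bindings: Dict[str, str],
                last_other_claim: Optional[datetime],
                now: datetime) -> bool:
    """Rule ii): decide whether to create a binding nickname -> identity."""
    long_enough = now - first_seen >= TWO_WEEKS                   # ii.a
    bound_to = bindings.get(nickname)
    not_taken = bound_to is None or bound_to == identity          # ii.b
    uncontested = (last_other_claim is None
                   or now - last_other_claim > ONE_MONTH)         # ii.c
    return long_enough and not_taken and uncontested
```

Here `last_other_claim` stands for the most recent time any other router advertised the same nickname; how that is tracked is left open by the proposal.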
- diff --git a/doc/spec/proposals/124-tls-certificates.txt b/doc/spec/proposals/124-tls-certificates.txt deleted file mode 100644 index 9472d14af8..0000000000 --- a/doc/spec/proposals/124-tls-certificates.txt +++ /dev/null @@ -1,313 +0,0 @@ -Filename: 124-tls-certificates.txt -Title: Blocking resistant TLS certificate usage -Author: Steven J. Murdoch -Created: 2007-10-25 -Status: Superseded - -Overview: - - To be less distinguishable from HTTPS web browsing, only Tor servers should - present TLS certificates. This should be done whilst maintaining backwards - compatibility with Tor nodes which present and expect client certificates, and - while preserving existing security properties. This specification describes - the negotiation protocol, what certificates should be presented during the TLS - negotiation, and how to move the client authentication within the encrypted - tunnel. - -Motivation: - - In Tor's current TLS [1] handshake, both client and server present a - two-certificate chain. Since TLS performs authentication prior to establishing - the encrypted tunnel, the contents of these certificates are visible to an - eavesdropper. In contrast, during normal HTTPS web browsing, the server - presents a single certificate, signed by a root CA and the client presents no - certificate. Hence it is possible to distinguish Tor from HTTP by identifying - this pattern. - - To resist blocking based on traffic identification, Tor should behave as close - to HTTPS as possible, i.e. servers should offer a single certificate and not - request a client certificate; clients should present no certificate. This - presents two difficulties: clients are no longer authenticated and servers are - authenticated by the connection key, rather than identity key. The link - protocol must thus be modified to preserve the old security semantics. 
- - Finally, in order to maintain backwards compatibility, servers must correctly - identify whether the client supports the modified certificate handling. This - is achieved by modifying the cipher suites that clients advertise support - for. These cipher suites are selected to be similar to those chosen by web - browsers, in order to resist blocking based on client hello. - -Terminology: - - Initiator: OP or OR which initiates a TLS connection ("client" in TLS - terminology) - - Responder: OR which receives an incoming TLS connection ("server" in TLS - terminology) - -Version negotiation and cipher suite selection: - - In the modified TLS handshake, the responder does not request a certificate - from the initiator. This request would normally occur immediately after the - responder receives the client hello (the first message in a TLS handshake) and - so the responder must decide whether to request a certificate based only on - the information in the client hello. This is achieved by examining the cipher - suites in the client hello. - - List 1: cipher suites lists offered by version 0/1 Tor - - From src/common/tortls.c, revision 12086: - TLS1_TXT_DHE_RSA_WITH_AES_128_SHA - TLS1_TXT_DHE_RSA_WITH_AES_128_SHA : SSL3_TXT_EDH_RSA_DES_192_CBC3_SHA - SSL3_TXT_EDH_RSA_DES_192_CBC3_SHA - - Client hello sent by initiator: - - Initiators supporting version 2 of the Tor connection protocol MUST - offer a different cipher suite list from those sent by pre-version 2 - Tors, contained in List 1. To maintain compatibility with older Tor - versions and common browsers, the cipher suite list MUST include - support for: - - TLS_DHE_RSA_WITH_AES_256_CBC_SHA - TLS_DHE_RSA_WITH_AES_128_CBC_SHA - SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA - SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA - - Client hello received by responder/server hello sent by responder: - - Responders supporting version 2 of the Tor connection protocol should compare - the cipher suite list in the client hello with those in List 1. 
If it matches - any in the list then the responder should assume that the initiator supports - version 1, and thus should maintain the version 1 behavior, i.e. send a - two-certificate chain, request a client certificate and do not send or expect - a VERSIONS cell [2]. - - Otherwise, the responder should assume version 2 behavior and select a cipher - suite following TLS [1] behavior, i.e. select the first entry from the client - hello cipher list which is acceptable. Responders MUST NOT select any suite - that lacks ephemeral keys, or whose symmetric keys are less than KEY_LEN bits, - or whose digests are less than HASH_LEN bits. Implementations SHOULD NOT - allow other SSLv3 ciphersuites. - - Should no mutually acceptable cipher suite be found, the connection MUST be - closed. - - If the responder is implementing version 2 of the connection protocol it - SHOULD send a server certificate with random contents. The organizationName - field MUST NOT be "Tor", "TOR" or "t o r". - - Server certificate received by initiator: - - If the server certificate has an organizationName of "Tor", "TOR" or "t o r", - the initiator should assume that the responder does not support version 2 of - the connection protocol. In which case the initiator should respond following - version 1, i.e. send a two-certificate client chain and do not send or expect - a VERSIONS cell. - - [SJM: We could also use the fact that a client certificate request was sent] - - If the server hello contains a ciphersuite which does not comply with the key - length requirements above, even if it was one offered in the client hello, the - connection MUST be closed. This will only occur if the responder is not a Tor - server. - - Backward compatibility: - - v1 Initiator, v1 Responder: No change - v1 Initiator, v2 Responder: Responder detects v1 initiator by client hello - v2 Initiator, v1 Responder: Responder accepts v2 client hello.
Initiator - detects v1 server certificate and continues with v1 protocol - v2 Initiator, v2 Responder: Responder accepts v2 client hello. Initiator - detects v2 server certificate and continues with v2 protocol. - - Additional link authentication process: - - Following VERSION and NETINFO negotiation, both responder and - initiator MUST send a certification chain in a CERT cell. If one - party does not have a certificate, the CERT cell MUST still be sent, - but with a length of zero. - - A CERT cell is a variable length cell, of the format - CircID [2 bytes] - Command [1 byte] - Length [2 bytes] - Payload [<length> bytes] - - CircID MUST be set to 0x0000 - Command is [SJM: TODO] - Length is the length of the payload - Payload contains 0 or more certificates, each is of the format: - Cert_Length [2 bytes] - Certificate [<cert_length> bytes] - - Each certificate MUST sign the one preceding it. The initiator MUST - place its connection certificate first; the responder, having - already sent its connection certificate as part of the TLS handshake - MUST place its identity certificate first. - - Initiators who send a CERT cell MUST follow that with a LINK_AUTH - cell to prove that they possess the corresponding private key.
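
The CERT cell layout described above can be sketched as a pair of pack/unpack helpers. Two points are assumptions rather than part of the proposal: the command value is still marked TODO in the text, so it is passed in as a parameter, and multi-byte fields are encoded in network byte order. The function names are illustrative only.

```python
import struct
from typing import List, Tuple

HEADER = "!HBH"  # CircID [2 bytes], Command [1 byte], Length [2 bytes]; big-endian ASSUMED


def pack_cert_cell(command: int, certs: List[bytes]) -> bytes:
    """Build a CERT cell; an empty list yields the zero-length cell."""
    payload = b"".join(struct.pack("!H", len(c)) + c for c in certs)
    if len(payload) > 0xFFFF:
        raise ValueError("payload does not fit the 2-byte Length field")
    return struct.pack(HEADER, 0x0000, command, len(payload)) + payload


def unpack_cert_cell(cell: bytes) -> Tuple[int, int, List[bytes]]:
    """Parse a CERT cell into (CircID, Command, list of certificates)."""
    circ_id, command, length = struct.unpack(HEADER, cell[:5])
    payload = cell[5:5 + length]
    if len(payload) != length:
        raise ValueError("truncated CERT cell")
    certs, pos = [], 0
    while pos < length:
        if pos + 2 > length:
            raise ValueError("truncated Cert_Length field")
        (cert_len,) = struct.unpack("!H", payload[pos:pos + 2])
        pos += 2
        if pos + cert_len > length:
            raise ValueError("truncated certificate")
        certs.append(payload[pos:pos + cert_len])
        pos += cert_len
    return circ_id, command, certs
```

A party without a certificate would call `pack_cert_cell(command, [])`, which produces only the five header bytes with Length set to zero.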
- A LINK_AUTH cell is fixed-length, of the format: - CircID [2 bytes] - Command [1 byte] - Length [2 bytes] - Payload (padded with 0 bytes) [PAYLOAD_LEN - 2 bytes] - - CircID MUST be set to 0x0000 - Command is [SJM: TODO] - Length is the valid portion of the payload - Payload is of the format: - Signature version [1 byte] - Signature [<length> - 1 bytes] - Padding [PAYLOAD_LEN - <length> - 2 bytes] - - Signature version: Identifies the type of signature, currently 0x00 - Signature: Digital signature under the initiator's connection key of the - following item, in PKCS #1 block type 1 [3] format: - - HMAC-SHA1, using the TLS master secret as key, of the - following elements concatenated: - - The signature version (0x00) - - The NUL terminated ASCII string: "Tor initiator certificate verification" - - client_random, as sent in the Client Hello - - server_random, as sent in the Server Hello - - SHA-1 hash of the initiator connection certificate - - SHA-1 hash of the responder connection certificate - - Security checks: - - - Before sending a LINK_AUTH cell, a node MUST ensure that the TLS - connection is authenticated by the responder key.
- - For the handshake to have succeeded, the initiator MUST confirm: - - That the TLS handshake was authenticated by the - responder connection key - - That the responder connection key was signed by the first - certificate in the CERT cell - - That each certificate in the CERT cell was signed by the - following certificate, with the exception of the last - - That the last certificate in the CERT cell is the expected - identity certificate for the node being connected to - - For the handshake to have succeeded, the responder MUST confirm - either: - A) - A zero length CERT cell was sent and no LINK_AUTH cell was - sent - In which case the responder shall treat the identity of the - initiator as unknown - or - B) - That the LINK_AUTH MAC contains a signature by the first - certificate in the CERT cell - - That the MAC signed matches the expected value - - That each certificate in the CERT cell was signed by the - following certificate, with the exception of the last - In which case the responder shall treat the identity of the - initiator as that of the last certificate in the CERT cell - - Protocol summary: - - 1. I(nitiator) <-> R(esponder): TLS handshake, including responder - authentication under connection certificate R_c - 2. I <->: VERSION and NETINFO negotiation - 3. R -> I: CERT (Responder identity certificate R_i (which signs R_c)) - 4. I -> R: CERT (Initiator connection certificate I_c, - Initiator identity certificate I_i (which signs I_c) - 5. I -> R: LINK_AUTH (Signature, under I_c of HMAC-SHA1(master_secret, - "Tor initiator certificate verification" || - client_random || server_random || - I_c hash || R_c hash) - - Notes: I -> R doesn't need to wait for R_i before sending its own - messages (reduces round-trips). - Certificate hash is calculated like identity hash in CREATE cells. - Initiator signature is calculated in a similar way to Certificate - Verify messages in TLS 1.1 (RFC4346, Sections 7.4.8 and 4.7). 
- If I is an OP, a zero length certificate chain may be sent in step 4; - In which case, step 5 is not performed - - Rationale: - - - Version and netinfo negotiation before authentication: The version cell needs - to come before the rest of the protocol, since we may choose to alter - the rest at some later point, e.g. switch to a different MAC/signature scheme. - It is useful to keep the NETINFO and VERSION cells close to each other, since - the time between them is used to check if there is a delay-attack. Still, a - server might want to not act on NETINFO data from an initiator until the - authentication is complete. - -Appendix A: Cipher suite choices - - This specification intentionally does not put any constraints on the - TLS ciphersuite lists presented by clients, other than a minimum - required for compatibility. However, to maximize blocking - resistance, ciphersuite lists should be carefully selected. - - Recommended client ciphersuite list - - Source: http://lxr.mozilla.org/security/source/security/nss/lib/ssl/sslproto.h - - 0xc00a: TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA - 0xc014: TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA - 0x0039: TLS_DHE_RSA_WITH_AES_256_CBC_SHA - 0x0038: TLS_DHE_DSS_WITH_AES_256_CBC_SHA - 0xc00f: TLS_ECDH_RSA_WITH_AES_256_CBC_SHA - 0xc005: TLS_ECDH_ECDSA_WITH_AES_256_CBC_SHA - 0x0035: TLS_RSA_WITH_AES_256_CBC_SHA - 0xc007: TLS_ECDHE_ECDSA_WITH_RC4_128_SHA - 0xc009: TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA - 0xc011: TLS_ECDHE_RSA_WITH_RC4_128_SHA - 0xc013: TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA - 0x0033: TLS_DHE_RSA_WITH_AES_128_CBC_SHA - 0x0032: TLS_DHE_DSS_WITH_AES_128_CBC_SHA - 0xc00c: TLS_ECDH_RSA_WITH_RC4_128_SHA - 0xc00e: TLS_ECDH_RSA_WITH_AES_128_CBC_SHA - 0xc002: TLS_ECDH_ECDSA_WITH_RC4_128_SHA - 0xc004: TLS_ECDH_ECDSA_WITH_AES_128_CBC_SHA - 0x0004: SSL_RSA_WITH_RC4_128_MD5 - 0x0005: SSL_RSA_WITH_RC4_128_SHA - 0x002f: TLS_RSA_WITH_AES_128_CBC_SHA - 0xc008: TLS_ECDHE_ECDSA_WITH_3DES_EDE_CBC_SHA - 0xc012:
TLS_ECDHE_RSA_WITH_3DES_EDE_CBC_SHA - 0x0016: SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA - 0x0013: SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA - 0xc00d: TLS_ECDH_RSA_WITH_3DES_EDE_CBC_SHA - 0xc003: TLS_ECDH_ECDSA_WITH_3DES_EDE_CBC_SHA - 0xfeff: SSL_RSA_FIPS_WITH_3DES_EDE_CBC_SHA (168-bit Triple DES with RSA and a SHA1 MAC) - 0x000a: SSL_RSA_WITH_3DES_EDE_CBC_SHA - - Order specified in: - http://lxr.mozilla.org/security/source/security/nss/lib/ssl/sslenum.c#47 - - Recommended options: - 0x0000: Server Name Indication [4] - 0x000a: Supported Elliptic Curves [5] - 0x000b: Supported Point Formats [5] - - Recommended compression: - 0x00 - - Recommended server ciphersuite selection: - - The responder should select the first entry in this list which is - listed in the client hello: - - 0x0039: TLS_DHE_RSA_WITH_AES_256_CBC_SHA [ Common Firefox choice ] - 0x0033: TLS_DHE_RSA_WITH_AES_128_CBC_SHA [ Tor v1 default ] - 0x0016: SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA [ Tor v1 fallback ] - 0x0013: SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA [ Valid IE option ] - -References: - -[1] The Transport Layer Security (TLS) Protocol, Version 1.1, RFC4346, IETF - -[2] Version negotiation for the Tor protocol, Tor proposal 105 - -[3] B. Kaliski, "Public-Key Cryptography Standards (PKCS) #1: - RSA Cryptography Specifications Version 1.5", RFC 2313, - March 1998. - -[4] TLS Extensions, RFC 3546 - -[5] Elliptic Curve Cryptography (ECC) Cipher Suites for Transport Layer Security (TLS) - -% <!-- Local IspellDict: american --> diff --git a/doc/spec/proposals/125-bridges.txt b/doc/spec/proposals/125-bridges.txt deleted file mode 100644 index 9d95729d42..0000000000 --- a/doc/spec/proposals/125-bridges.txt +++ /dev/null @@ -1,291 +0,0 @@ -Filename: 125-bridges.txt -Title: Behavior for bridge users, bridge relays, and bridge authorities -Author: Roger Dingledine -Created: 11-Nov-2007 -Status: Closed -Implemented-In: 0.2.0.x - -0. 
Preface - - This document describes the design decisions around support for bridge - users, bridge relays, and bridge authorities. It acts as an overview - of the bridge design and deployment for developers, and it also tries - to point out limitations in the current design and implementation. - - For more details on what all of these mean, look at blocking.tex in - /doc/design-paper/ - -1. Bridge relays - - Bridge relays are just like normal Tor relays except they don't publish - their server descriptors to the main directory authorities. - -1.1. PublishServerDescriptor - - To configure your relay to be a bridge relay, just add - BridgeRelay 1 - PublishServerDescriptor bridge - to your torrc. This will cause your relay to publish its descriptor - to the bridge authorities rather than to the default authorities. - - Alternatively, you can say - BridgeRelay 1 - PublishServerDescriptor 0 - which will cause your relay to not publish anywhere. This could be - useful for private bridges. - -1.2. Exit policy - - Bridge relays should use an exit policy of "reject *:*". This is - because they only need to relay traffic between the bridge users - and the rest of the Tor network, so there's no need to let people - exit directly from them. - -1.3. RelayBandwidthRate / RelayBandwidthBurst - - We invented the RelayBandwidth* options for this situation: Tor clients - who want to allow relaying too. See proposal 111 for details. Relay - operators should feel free to rate-limit their relayed traffic. - -1.4. Helping the user with port forwarding, NAT, etc. - - Just as for operating normal relays, our documentation and hints for - how to make your ORPort reachable are inadequate for normal users. - - We need to work harder on this step, perhaps in 0.2.2.x. - -1.5. Vidalia integration - - Vidalia has turned its "Relay" settings page into a tri-state - "Don't relay" / "Relay for the Tor network" / "Help censored users". 
- - If you click the third choice, it forces your exit policy to reject *:*. - - If all the bridges end up on port 9001, that's not so good. On the - other hand, putting the bridges on a low-numbered port in the Unix - world requires jumping through extra hoops. The current compromise is - that Vidalia makes the ORPort default to 443 on Windows, and 9001 on - other platforms. - - At the bottom of the relay config settings window, Vidalia displays - the bridge identifier to the operator (see Section 3.1) so he can pass - it on to bridge users. - -1.6. What if the default ORPort is already used? - - If the user already has a webserver or some other application - bound to port 443, then Tor will fail to bind it and complain to the - user, probably in a cryptic way. Rather than just working on a better - error message (though we should do this), we should consider an - "ORPort auto" option that tells Tor to try to find something that's - bindable and reachable. This would also help us tolerate ISPs that - filter incoming connections on port 80 and port 443. But this should - be a different proposal, and can wait until 0.2.2.x. - -2. Bridge authorities. - - Bridge authorities are like normal directory authorities, except they - don't create their own network-status documents or votes. So if you - ask an authority for a network-status document or consensus, they - behave like a directory mirror: they give you one from one of the main - authorities. But if you ask the bridge authority for the descriptor - corresponding to a particular identity fingerprint, it will happily - give you the latest descriptor for that fingerprint. - - To become a bridge authority, add these lines to your torrc: - AuthoritativeDirectory 1 - BridgeAuthoritativeDir 1 - - Right now there's one bridge authority, running on the Tonga relay. - -2.1. Exporting bridge-purpose descriptors - - We've added a new purpose for server descriptors: the "bridge" - purpose. 
With the new router-descriptors file format that includes - annotations, it's easy to look through it and find the bridge-purpose - descriptors. - - Currently we export the bridge descriptors from Tonga to the - BridgeDB server, so it can give them out according to the policies - in blocking.pdf. - -2.2. Reachability/uptime testing - - Right now the bridge authorities do active reachability testing of - bridges, so we know which ones to recommend for users. - - But in the design document, we suggested that bridges should publish - anonymously (i.e. via Tor) to the bridge authority, so somebody watching - the bridge authority can't just enumerate all the bridges. But if we're - doing active measurement, the game is up. Perhaps we should back off on - this goal, or perhaps we should do our active measurement anonymously? - - Answering this issue is scheduled for 0.2.1.x. - -2.3. Migrating to multiple bridge authorities - - Having only one bridge authority is both a trust bottleneck (if you - break into one place you learn about every single bridge we've got) - and a robustness bottleneck (when it's down, bridge users become sad). - - Right now if we put up a second bridge authority, all the bridges would - publish to it, and (assuming the code works) bridge users would query - a random bridge authority. This resolves the robustness bottleneck, - but makes the trust bottleneck even worse. - - In 0.2.2.x and later we should think about better ways to have multiple - bridge authorities. - -3. Bridge users. - - Bridge users are like ordinary Tor users except they use encrypted - directory connections by default, and they use bridge relays as both - entry guards (their first hop) and directory guards (the source of - all their directory information). - - To become a bridge user, add the following line to your torrc: - - UseBridges 1 - - and then add at least one "Bridge" line to your torrc based on the - format below. - -3.1. Format of the bridge identifier. 
- - The canonical format for a bridge identifier contains an IP address, - an ORPort, and an identity fingerprint: - bridge 128.31.0.34:9009 4C17 FB53 2E20 B2A8 AC19 9441 ECD2 B017 7B39 E4B1 - - However, the identity fingerprint can be left out, in which case the - bridge user will connect to that relay and use it as a bridge regardless - of what identity key it presents: - bridge 128.31.0.34:9009 - This might be useful for cases where only short bridge identifiers - can be communicated to bridge users. - - In a future version we may also support bridge identifiers that are - only a key fingerprint: - bridge 4C17 FB53 2E20 B2A8 AC19 9441 ECD2 B017 7B39 E4B1 - and the bridge user can fetch the latest descriptor from the bridge - authority (see Section 3.4). - -3.2. Bridges as entry guards - - For now, bridge users add their bridge relays to their list of "entry - guards" (see path-spec.txt for background on entry guards). They are - managed by the entry guard algorithms exactly as if they were a normal - entry guard -- their keys and timing get cached in the "state" file, - etc. This means that when the Tor user starts up with "UseBridges" - disabled, he will skip past the bridge entries since they won't be - listed as up and usable in his networkstatus consensus. But to be clear, - the "entry_guards" list doesn't currently distinguish guards by purpose. - - Internally, each bridge user keeps a smartlist of "bridge_info_t" - that reflects the "bridge" lines from his torrc along with a download - schedule (see Section 3.5 below). When he starts Tor, he attempts - to fetch a descriptor for each configured bridge (see Section 3.4 - below). When he succeeds at getting a descriptor for one of the bridges - in his list, he adds it directly to the entry guard list using the - normal add_an_entry_guard() interface. 
Once a bridge descriptor has - been added, should_delay_dir_fetches() will stop delaying further - directory fetches, and the user begins to bootstrap his directory - information from that bridge (see Section 3.3). - - Currently bridge users cache their bridge descriptors to the - "cached-descriptors" file (annotated with purpose "bridge"), but - they don't make any attempt to reuse descriptors they find in this - file. The theory is that either the bridge is available now, in which - case you can get a fresh descriptor, or it's not, in which case an - old descriptor won't do you much good. - - We could disable writing out the bridge lines to the state file, if - we think this is a problem. - - As an exception, if we get an application request when we have one - or more bridge descriptors but we believe none of them are running, - we mark them all as running again. This is similar to the exception - already in place to help long-idle Tor clients realize they should - fetch fresh directory information rather than just refuse requests. - -3.3. Bridges as directory guards - - In addition to using bridges as the first hop in their circuits, bridge - users also use them to fetch directory updates. Other than initial - bootstrapping to find a working bridge descriptor (see Section 3.4 - below), all further non-anonymized directory fetches will be redirected - to the bridge. - - This means that bridge relays need to have cached answers for all - questions the bridge user might ask. This makes the upgrade path - tricky --- for example, if we migrate to a v4 directory design, the - bridge user would need to keep using v3 so long as his bridge relays - only knew how to answer v3 queries. - - In a future design, for cases where the user has enough information - to build circuits yet the chosen bridge doesn't know how to answer a - given query, we might teach bridge users to make an anonymized request - to a more suitable directory server. - -3.4. 
How bridge users get their bridge descriptor - - Bridge users can fetch bridge descriptors in two ways: by going directly - to the bridge and asking for "/tor/server/authority", or by going to - the bridge authority and asking for "/tor/server/fp/ID". By default, - they will only try the direct queries. If the user sets - UpdateBridgesFromAuthority 1 - in his config file, then he will try querying the bridge authority - first for bridges where he knows a digest (if he only knows an IP - address and ORPort, then his only option is a direct query). - - If the user has at least one working bridge, then he will do further - queries to the bridge authority through a full three-hop Tor circuit. - But when bootstrapping, he will make a direct begin_dir-style connection - to the bridge authority. - - As of Tor 0.2.0.10-alpha, if the user attempts to fetch a descriptor - from the bridge authority and it returns a 404 not found, the user - will automatically fall back to trying a direct query. Therefore it is - recommended that bridge users always set UpdateBridgesFromAuthority, - since at worst it will delay their fetches a little bit and notify - the bridge authority of the identity fingerprint (but not location) - of their intended bridges. - -3.5. Bridge descriptor retry schedule - - Bridge users try to fetch a descriptor for each bridge (using the - steps in Section 3.4 above) on startup. Whenever they receive a - bridge descriptor, they reschedule a new descriptor download for 1 - hour from then. - - If on the other hand it fails, they try again after 15 minutes for the - first attempt, after 15 minutes for the second attempt, and after 60 - minutes for subsequent attempts. - - In 0.2.2.x we should come up with some smarter retry schedules. - -3.6. Vidalia integration - - Vidalia 0.0.16 has a checkbox in its Network config window called - "My ISP blocks connections to the Tor network." 
Users who click that - box change their configuration to: - UseBridges 1 - UpdateBridgesFromAuthority 1 - and should specify at least one Bridge identifier. - -3.7. Do we need a second layer of entry guards? - - If the bridge user uses the bridge as its entry guard, then the - triangulation attacks from Lasse and Paul's Oakland paper work to - locate the user's bridge(s). - - Worse, this is another way to enumerate bridges: if the bridge users - keep rotating through second hops, then if you run a few fast servers - (and avoid getting considered an Exit or a Guard) you'll quickly get - a list of the bridges in active use. - - That's probably the strongest reason why bridge users will need to - pick second-layer guards. Would this mean bridge users should switch - to four-hop circuits? - - We should figure this out in the 0.2.1.x timeframe. - diff --git a/doc/spec/proposals/126-geoip-reporting.txt b/doc/spec/proposals/126-geoip-reporting.txt deleted file mode 100644 index 9f3b21c670..0000000000 --- a/doc/spec/proposals/126-geoip-reporting.txt +++ /dev/null @@ -1,410 +0,0 @@ -Filename: 126-geoip-reporting.txt -Title: Getting GeoIP data and publishing usage summaries -Author: Roger Dingledine -Created: 2007-11-24 -Status: Closed -Implemented-In: 0.2.0.x - -0. Status - - In 0.2.0.x, this proposal is implemented to the extent needed to - address its motivations. See notes below with the text "RESOLUTION" - for details. - -1. Background and motivation - - Right now we can keep a rough count of Tor users, both total and by - country, by watching connections to a single directory mirror. Being - able to get usage estimates is useful both for our funders (to - demonstrate progress) and for our own development (so we know how - quickly we're scaling and can design accordingly, and so we know which - countries and communities to focus on more). 
This need for information - is the only reason we haven't deployed "directory guards" (think of - them like entry guards but for directory information; in practice, - it would seem that Tor clients should simply use their entry guards - as their directory guards; see also proposal 125). - - With the move toward bridges, we will no longer be able to track Tor - clients that use bridges, since they use their bridges as directory - guards. Further, we need to be able to learn which bridges stop seeing - use from certain countries (and are thus likely blocked), so we can - avoid giving them out to other users in those countries. - - Right now we already do GeoIP lookups in Vidalia: Vidalia draws relays - and circuits on its 'network map', and it performs anonymized GeoIP - lookups to its central servers to know where to put the dots. Vidalia - caches answers it gets -- to reduce delay, to reduce overhead on - the network, and to reduce anonymity issues where users reveal their - knowledge about the network through which IP addresses they ask about. - - But with the advent of bridges, Tor clients are asking about IP - addresses that aren't in the main directory. In particular, bridge - users inform the central Vidalia servers about each bridge as they - discover it and their Vidalia tries to map it. - - Also, we wouldn't mind letting Vidalia do a GeoIP lookup on the client's - own IP address, so it can provide a more useful map. - - Finally, Vidalia's central servers leave users open to partitioning - attacks, even if they can't target specific users. Further, as we - start using GeoIP results for more operational or security-relevant - goals, such as avoiding or including particular countries in circuits, - it becomes more important that users can't be singled out in terms of - their IP-to-country mapping beliefs. - -2. 
The available GeoIP databases - - There are at least two classes of GeoIP database out there: "IP to - country", which tells us the country code for the IP address but - no more details, and "IP to city", which tells us the country code, - the name of the city, and some basic latitude/longitude guesses. - - A recent ip-to-country.csv is 3421362 bytes. Compressed, it is 564252 - bytes. A typical line is: - "205500992","208605279","US","USA","UNITED STATES" - http://ip-to-country.webhosting.info/node/view/5 - - Similarly, the maxmind GeoLite Country database is also about 500KB - compressed. - http://www.maxmind.com/app/geolitecountry - - The maxmind GeoLite City database gives more finegrained detail like - geo coordinates and city name. Vidalia currently makes use of this - information. On the other hand it's 16MB compressed. A typical line is: - 206.124.149.146,Bellevue,WA,US,47.6051,-122.1134 - http://www.maxmind.com/app/geolitecity - - There are other databases out there, like - http://www.hostip.info/faq.html - http://www.webconfs.com/ip-to-city.php - that want more attention, but for now let's assume that all the db's - are around this size. - -3. What we'd like to solve - - Goal #1a: Tor relays collect IP-to-country user stats and publish - sanitized versions. - Goal #1b: Tor bridges collect IP-to-country user stats and publish - sanitized versions. - - Goal #2a: Vidalia learns IP-to-city stats for Tor relays, for better - mapping. - Goal #2b: Vidalia learns IP-to-country stats for Tor relays, so the user - can pick countries for her paths. - - Goal #3: Vidalia doesn't do external lookups on bridge relay addresses. - - Goal #4: Vidalia resolves the Tor client's IP-to-country or IP-to-city - for better mapping. - - Goal #5: Reduce partitioning opportunities where Vidalia central - servers can give different (distinguishing) responses. - -4. 
Solution overview - - Our goal is to allow Tor relays, bridges, and clients to learn enough - GeoIP information so they can do local private queries. - -4.1. The IP-to-country db - - Directory authorities should publish a "geoip" file that contains - IP-to-country mappings. Directory caches will mirror it, and Tor clients - and relays (including bridge relays) will fetch it. Thus we can solve - goals 1a and 1b (publish sanitized usage info). Controllers could also - use this to solve goal 2b (choosing path by country attributes). It - also solves goal 4 (learning the Tor client's country), though for - huge countries like the US we'd still need to decide where the "middle" - should be when we're mapping that address. - - The IP-to-country details are described further in Sections 5 and - 6 below. - - [RESOLUTION: The geoip file in 0.2.0.x is not distributed through - Tor. Instead, it is shipped with the bundle.] - -4.2. The IP-to-city db - - In an ideal world, the IP-to-city db would be small enough that we - could distribute it in the above manner too. But for now, it is too - large. Here's where the design choice forks. - - Option A: Vidalia should continue doing its anonymized IP-to-city - queries. Thus we can achieve goals 2a and 2b. We would solve goal - 3 by only doing lookups on descriptors that are purpose "general" - (see Section 4.2.1 for how). We would leave goal 5 unsolved. - - Option B: Each directory authority should keep an IP-to-city db, - lookup the value for each router it lists, and include that line in - the router's network-status entry. The network-status consensus would - then use the line that appears in the majority of votes. This approach - also solves goals 2a and 2b, goal 3 (Vidalia doesn't do any lookups - at all now), and goal 5 (reduced partitioning risks). 
- - Option B has the advantage that Vidalia can simplify its operation, - and the advantage that this consensus IP-to-city data is available to - other controllers besides just Vidalia. But it has the disadvantage - that the networkstatus consensus becomes larger, even though most of - the GeoIP information won't change from one consensus to the next. Is - there another reasonable location for it that can provide similar - consensus security properties? - - [RESOLUTION: IP-to-city is not supported.] - -4.2.1. Controllers can query for router annotations - - Vidalia needs to stop doing queries on bridge relay IP addresses. - It could do that by only doing lookups on descriptors that are in - the networkstatus consensus, but that precludes designs like Blossom - that might want to map its relay locations. The best answer is that it - should learn the router annotations, with a new controller 'getinfo' - command: - "GETINFO desc-annotations/id/<OR identity>" - which would respond with something like - @downloaded-at 2007-11-29 08:06:38 - @source "128.31.0.34" - @purpose bridge - - [We could also make the answer include the digest for the router in - question, which would enable us to ask GETINFO router-annotations/all. - Is this worth it? -RD] - - Then Vidalia can avoid doing lookups on descriptors with purpose - "bridge". Even better would be to add a new annotation "@private true" - so Vidalia can know how to handle new purposes that we haven't created - yet. Vidalia could special-case "bridge" for now, for compatibility - with the current 0.2.0.x-alphas. - -4.3. Recommendation - - My overall recommendation is that we should implement 4.1 soon - (e.g. early in 0.2.1.x), and we can go with 4.2 option A for now, - with the hope that later we discover a better way to distribute the - IP-to-city info and can switch to 4.2 option B. - - Below we discuss more how to go about achieving 4.1. - -5. 
Publishing and caching the GeoIP (IP-to-country) database - - Each v3 directory authority should put a copy of the "geoip" file in - its datadirectory. Then its network-status votes should include a hash - of this file (Recommended-geoip-hash: %s), and the resulting consensus - directory should specify the consensus hash. - - There should be a new URL for fetching this geoip db (by "current.z" - for testing purposes, and by hash.z for typical downloads). Authorities - should fetch and serve the one listed in the consensus, even when they - vote for their own. This would argue for storing the cached version - in a better filename than "geoip". - - Directory mirrors should keep a copy of this file available via the - same URLs. - - We assume that the file would change at most a few times a month. Should - Tor ship with a bootstrap geoip file? An out-of-date geoip file may - open you up to partitioning attacks, but for the most part it won't - be that different. - - There should be a config option to disable updating the geoip file, - in case users want to use their own file (e.g. they have a proprietary - GeoIP file they prefer to use). In that case we leave it up to the - user to update his geoip file out-of-band. - - [XXX Should consider forward/backward compatibility, e.g. if we want - to move to a new geoip file format. -RD] - - [RESOLUTION: Not done over Tor.] - -6. Controllers use the IP-to-country db for mapping and for path building - - Down the road, Vidalia could use the IP-to-country mappings for placing - on its map: - - The location of the client - - The location of the bridges, or other relays not in the - networkstatus, on the map. - - Any relays that it doesn't yet have an IP-to-city answer for. - - Other controllers can also use it to set EntryNodes, ExitNodes, etc - in a per-country way. - - To support these features, we need to export the IP-to-country data - via the Tor controller protocol. - - Is it sufficient just to add a new GETINFO command? 
- GETINFO ip-to-country/128.31.0.34 - 250+ip-to-country/128.31.0.34="US","USA","UNITED STATES" - - [RESOLUTION: Not done now, except for the getinfo command.] - -6.1. Other interfaces - - Robert Hogan has also suggested a - - GETINFO relays-by-country/cn - - as well as torrc options for ExitCountryCodes, EntryCountryCodes, - ExcludeCountryCodes, etc. - - [RESOLUTION: Not implemented in 0.2.0.x. Fodder for a future proposal.] - -7. Relays and bridges use the IP-to-country db for usage summaries - - Once bridges have a GeoIP database locally, they can start to publish - sanitized summaries of client usage -- how many users they see and from - what countries. This might also be a more useful way for ordinary Tor - relays to convey the level of usage they see, which would allow us to - switch to using directory guards for all users by default. - - But how to safely summarize this information without opening too many - anonymity leaks? - -7.1 Attacks to think about - - First, note that we need to have a large enough time window that we're - not aiding correlation attacks much. I hope 24 hours is enough. So - that means no publishing stats until you've been up at least 24 hours. - And you can't publish follow-up stats more often than every 24 hours, - or people could look at the differential. - - Second, note that we need to be sufficiently vague about the IP - addresses we're reporting. We are hoping that just specifying the - country will be vague enough. But a) what about active attacks where - we convince a bridge to use a GeoIP db that labels each suspect IP - address as a unique country? We have to assume that the consensus GeoIP - db won't be malicious in this way. And b) could such singling-out - attacks occur naturally, for example because of countries that have - a very small IP space? We should investigate that. - -7.2. Granularity of users - - Do we only want to report countries that have a sufficient anonymity set - (that is, number of users) for the day? 
For example, we might avoid - listing any countries that have seen fewer than five addresses over - the 24 hour period. This approach would be helpful in reducing the - singling-out opportunities -- in the extreme case, we could imagine a - situation where one blogger from the Sudan used Tor on a given day, and - we can discover which entry guard she used. - - But I fear that especially for bridges, seeing only one hit from a - given country in a given day may be quite common. - - As a compromise, we should start out with an "Other" category in - the reported stats, which is the sum of unlisted countries; if that - category is consistently interesting, we can think harder about how - to get the right data from it safely. - - But note that bridge summaries will not be made public individually, - since doing so would help people enumerate bridges. Whereas summaries - from normal relays will be public. So perhaps that means we can afford - to be more specific in bridge summaries? In particular, I'm thinking the - "other" category should be used by public relays but not for bridges - (or if it is, used with a lower threshold). - - Even for countries that have many Tor users, we might not want to be - too specific about how many users we've seen. For example, we might - round down the number of users we report to the nearest multiple of 5. - My instinct for now is that this won't be that useful. - -7.3 Other issues - - Another note: we'll likely be overreporting in the case of users with - dynamic IP addresses: if they rotate to a new address over the course - of the day, we'll count them twice. So be it. - -7.4. Where to publish the summaries? - - We designed extrainfo documents for information like this. So they - should just be more entries in the extrainfo doc. - - But if we want to publish summaries every 24 hours (no more often, - no less often), aren't we tied to the router descriptor publishing - schedule? 
That is, if we publish a new router descriptor at the 18 - hour mark, and nothing much has changed at the 24 hour mark, won't - the new descriptor get dropped as being "cosmetically similar", and - then nobody will know to ask about the new extrainfo document? - - One solution would be to make and remember the 24 hour summary at the - 24 hour mark, but not actually publish it anywhere until we happen to - publish a new descriptor for other reasons. If we happen to go down - before publishing a new descriptor, then so be it, at least we tried. - -7.5. What if the relay is unreachable or goes to sleep? - - Even if you've been up for 24 hours, if you were hibernating for 18 - of them, then we're not getting as much fuzziness as we'd like. So - I guess that means that we need a 24-hour period of being "awake" - before we're willing to publish a summary. A similar attack works if - you've been awake but unreachable for the first 18 of the 24 hours. As - another example, a bridge that's on a laptop might be suspended for - some of each day. - - This implies that some relays and bridges will never publish summary - stats, because they're not ever reliably working for 24 hours in - a row. If a significant percentage of our reporters end up being in - this boat, we should investigate whether we can accumulate 24 hours of - "usefulness", even if there are holes in the middle, and publish based - on that. - - What other issues are like this? It seems that just moving to a new - IP address shouldn't be a reason to cancel stats publishing, assuming - we were usable at each address. - -7.6. IP addresses that aren't in the geoip db - - Some IP addresses aren't in the public geoip databases. In particular, - I've found that a lot of African countries are missing, but there - are also some common ones in the US that are missing, like parts of - Comcast. 
We could just lump unknown IP addresses into the "other" - category, but it might be useful to gather a general sense of how many - lookups are failing entirely, by adding a separate "Unknown" category. - - We could also contribute back to the geoip db, by letting bridges set - a config option to report the actual IP addresses that failed their - lookup. Then the bridge authority operators can manually make sure - the correct answer will be in later geoip files. This config option - should be disabled by default. - -7.7 Bringing it all together - - So here's the plan: - - 24 hours after starting up (modulo Section 7.5 above), bridges and - relays should construct a daily summary of client countries they've - seen, including the above "Unknown" category (Section 7.6) as well. - - Non-bridge relays lump all countries with less than K (e.g. K=5) users - into the "Other" category (see Sec 7.2 above), whereas bridge relays are - willing to list a country even when it has only one user for the day. - - Whenever we have a daily summary on record, we include it in our - extrainfo document whenever we publish one. The daily summary we - remember locally gets replaced with a newer one when another 24 - hours pass. - -7.8. Some forward secrecy - - How should we remember addresses locally? If we convert them into - country-codes immediately, we will count them again if we see them - again. On the other hand, we don't really want to keep a list hanging - around of all IP addresses we've seen in the past 24 hours. - - Step one is that we should never write this stuff to disk. Keeping it - only in ram will make things somewhat better. Step two is to avoid - keeping any timestamps associated with it: rather than a rolling - 24-hour window, which would require us to remember the various times - we've seen that address, we can instead just throw out the whole list - every 24 hours and start over. 
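Editorial illustration (not part of the deleted proposal text): the plan in Section 7.7, together with the two steps of Section 7.8 described so far, can be sketched roughly as below. This is a minimal sketch under stated assumptions, not Tor's implementation. The class name DailyClientSummary, the lookup_country callback, and the injectable clock are all hypothetical. Keeping "Unknown" out of the "Other" lumping is also an assumption; the proposal only says that it is a separate category.

```python
import time
from collections import Counter


class DailyClientSummary:
    """Hypothetical in-RAM tally of client countries (Sections 7.7 and 7.8)."""

    WINDOW_SECONDS = 24 * 60 * 60

    def __init__(self, lookup_country, is_bridge, min_users=5, now=time.time):
        # lookup_country is an assumed callback: address -> country code or None.
        self.lookup_country = lookup_country
        self.is_bridge = is_bridge
        self.min_users = min_users      # the "K" threshold from Section 7.7
        self.now = now                  # injectable clock, for testing only
        self.window_start = now()
        # Held only in memory, with no per-address timestamps (Section 7.8).
        self.seen = {}
        self.last_summary = None        # what the next extrainfo doc would carry

    def note_client(self, address):
        """Record one client address, counting each address once per window."""
        self._maybe_rotate()
        if address not in self.seen:
            self.seen[address] = self.lookup_country(address) or "Unknown"

    def _maybe_rotate(self):
        # Not a rolling window: after 24 hours, summarize and drop the whole list.
        if self.now() - self.window_start >= self.WINDOW_SECONDS:
            self.last_summary = self._summarize()
            self.seen = {}
            self.window_start = self.now()

    def _summarize(self):
        counts = Counter(self.seen.values())
        if self.is_bridge:
            # Bridges list a country even when it has a single user.
            return dict(counts)
        summary, other = {}, 0
        for country, users in counts.items():
            # Assumption: "Unknown" stays its own category and is never lumped.
            if country != "Unknown" and users < self.min_users:
                other += users
            else:
                summary[country] = users
        if other:
            summary["Other"] = other
        return summary
```

A caller would invoke note_client() for each incoming client connection and copy last_summary into the extrainfo document whenever one is published. The sketch ignores the "awake and reachable for 24 hours" requirement from Section 7.5.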
- - We could hash the addresses, and then compare hashes when deciding if - we've seen a given address before. We could even do keyed hashes. Or - Bloom filters. But if our goal is to defend against an adversary - who steals a copy of our ram while we're running and then does - guess-and-check on whatever blob we're keeping, we're in bad shape. - - We could drop the last octet of the IP address as soon as we see - it. That would cause us to undercount some users from cablemodem and - DSL networks that have a high density of Tor users. And it wouldn't - really help that much -- indeed, the extent to which it does help is - exactly the extent to which it makes our stats less useful. - - Other ideas? - diff --git a/doc/spec/proposals/127-dirport-mirrors-downloads.txt b/doc/spec/proposals/127-dirport-mirrors-downloads.txt deleted file mode 100644 index 72d6c0cb9f..0000000000 --- a/doc/spec/proposals/127-dirport-mirrors-downloads.txt +++ /dev/null @@ -1,155 +0,0 @@ -Filename: 127-dirport-mirrors-downloads.txt -Title: Relaying dirport requests to Tor download site / website -Author: Roger Dingledine -Created: 2007-12-02 -Status: Draft - -1. Overview - - Some countries and networks block connections to the Tor website. As - time goes by, this will remain a problem and it may even become worse. - - We have a big pile of mirrors (google for "Tor mirrors"), but few of - our users think to try a search like that. Also, many of these mirrors - might be automatically blocked since their pages contain words that - might cause them to get banned. And lastly, we can imagine a future - where the blockers are aware of the mirror list too. - - Here we describe a new set of URLs for Tor's DirPort that will relay - connections from users to the official Tor download site. 
Rather than - trying to cache a bunch of new Tor packages (which is a hassle in terms - of keeping them up to date, and a hassle in terms of drive space used), - we instead just proxy the requests directly to Tor's /dist page. - - Specifically, we should support - - GET /tor/dist/$1 - - and - - GET /tor/website/$1 - -2. Direct connections, one-hop circuits, or three-hop circuits? - - We could relay the connections directly to the download site -- but - this produces recognizable outgoing traffic on the bridge or cache's - network, which will probably surprise our nice volunteers. (Is this - a good enough reason to discard the direct connection idea?) - - Even if we don't do direct connections, should we do a one-hop - begindir-style connection to the mirror site (make a one-hop circuit - to it, then send a 'begindir' cell down the circuit), or should we do - a normal three-hop anonymized connection? - - If these mirrors are mainly bridges, doing either a direct or a one-hop - connection creates another way to enumerate bridges. That would argue - for three-hop. On the other hand, downloading a 10+ megabyte installer - through a normal Tor circuit can't be fun. But if you're already getting - throttled a lot because you're in the "relayed traffic" bucket, you're - going to have to accept a slow transfer anyway. So three-hop it is. - - Speaking of which, we would want to label this connection - as "relay" traffic for the purposes of rate limiting; see - connection_counts_as_relayed_traffic() and or_conn->client_used. This - will be a bit tricky though, because these connections will use the - bridge's guards. - -3. Scanning resistance - - One other goal we'd like to achieve, or at least not hinder, is making - it hard to scan large swaths of the Internet to look for responses - that indicate a bridge. - - In general this is a really hard problem, so we shouldn't demand to - solve it here. 
But we can note that some bridges should open their - DirPort (and offer this functionality), and others shouldn't. Then - some bridges provide a download mirror while others can remain - scanning-resistant. - -4. Integrity checking - - If we serve this stuff in plaintext from the bridge, anybody in between - the user and the bridge can intercept and modify it. The bridge can too. - - If we do an anonymized three-hop connection, the exit node can also - intercept and modify the exe it sends back. - - Are we setting ourselves up for rogue exit relays, or rogue bridges, - that trojan our users? - - Answer #1: Users need to do pgp signature checking. Not a very good - answer, a) because it's complex, and b) because they don't know the - right signing keys in the first place. - - Answer #2: The mirrors could exit from a specific Tor relay, using the - '.exit' notation. This would make connections a bit more brittle, but - would resolve the rogue exit relay issue. We could even round-robin - among several, and the list could be dynamic -- for example, all the - relays with an Authority flag that allow exits to the Tor website. - - Answer #3: The mirrors should connect to the main distribution site - via SSL. That way the exit relay can't influence anything. - - Answer #4: We could suggest that users only use trusted bridges for - fetching a copy of Tor. Hopefully they heard about the bridge from a - trusted source rather than from the adversary. - - Answer #5: What if the adversary is trawling for Tor downloads by - network signature -- either by looking for known bytes in the binary, - or by looking for "GET /tor/dist/"? It would be nice to encrypt the - connection from the bridge user to the bridge. And we can! The bridge - already supports TLS. Rather than initiating a TLS renegotiation after - connecting to the ORPort, the user should actually request a URL. 
Then - the ORPort can either pass the connection off as a linked conn to the - dirport, or renegotiate and become a Tor connection, depending on how - the client behaves. - -5. Linked connections: at what level should we proxy? - - Check out the connection_ap_make_link() function, as called from - directory.c. Tor clients use this to create a "fake" socks connection - back to themselves, and then they attach a directory request to it, - so they can launch directory fetches via Tor. We can piggyback on - this feature. - - We need to decide if we're going to be passing the bytes back and - forth between the web browser and the main distribution site, or if - we're going to be actually acting like a proxy (parsing out the file - they want, fetching that file, and serving it back). - - Advantages of proxying without looking inside: - - We don't need to build any sort of http support (including - continues, partial fetches, etc etc). - Disadvantages: - - If the browser thinks it's speaking http, are there easy ways - to pass the bytes to an https server and have everything work - correctly? At the least, it would seem that the browser would - complain about the cert. More generally, ssl wants to be negotiated - before the URL and headers are sent, yet we need to read the URL - and headers to know that this is a mirror request; so we have an - ordering problem here. - - Makes it harder to do caching later on, if we don't look at what - we're relaying. (It might be useful down the road to cache the - answers to popular requests, so we don't have to keep getting - them again.) - -6. Outstanding problems - - 1) HTTP proxies already exist. Why waste our time cloning one - badly? When we clone existing stuff, we usually regret it. - - 2) It's overbroad. We only seem to need a secure get-a-tor feature, - and instead we're contemplating building a locked-down HTTP proxy. - - 3) It's going to add a fair bit of complexity to our code. We do - not currently implement HTTPS. 
We'd need to refactor lots of the - low-level connection stuff so that "SSL" and "Cell-based" were no - longer synonymous. - - 4) It's still unclear how effective this proposal would be in - practice. You need to know that this feature exists, which means - somebody needs to tell you about a bridge (mirror) address and tell - you how to use it. And if they're doing that, they could (e.g.) tell - you about a gmail autoresponder address just as easily, and then you'd - get better authentication of the Tor program to boot. - diff --git a/doc/spec/proposals/128-bridge-families.txt b/doc/spec/proposals/128-bridge-families.txt deleted file mode 100644 index e5bdcf95cb..0000000000 --- a/doc/spec/proposals/128-bridge-families.txt +++ /dev/null @@ -1,64 +0,0 @@ -Filename: 128-bridge-families.txt -Title: Families of private bridges -Author: Roger Dingledine -Created: 2007-12-xx -Status: Dead - -1. Overview - - Proposal 125 introduced the basic notion of how bridge authorities, - bridge relays, and bridge users should behave. But it doesn't get into - the various mechanisms of how to distribute bridge relay addresses to - bridge users. - - One of the mechanisms we have in mind is called 'families of bridges'. - If a bridge user knows about only one private bridge, and that bridge - shuts off for the night or gets a new dynamic IP address, the bridge - user is out of luck and needs to re-bootstrap manually or wait and - hope it comes back. On the other hand, if the bridge user knows about - a family of bridges, then as long as one of those bridges is still - reachable his Tor client can automatically learn about where the - other bridges have gone. - - So in this design, a single volunteer could run multiple coordinated - bridges, or a group of volunteers could each run a bridge. We abstract - out the details of how these volunteers find each other and decide to - set up a family. - -2. Other notes. 
- - somebody needs to run a bridge authority - - it needs to have a torrc option to publish networkstatuses of its bridges - - it should also do reachability testing just of those bridges - - people ask for the bridge networkstatus by asking for a url that - contains a password. (it's safe to do this because of begin_dir.) - - so the bridge users need to know a) a password, and b) a bridge - authority line. - - the bridge users need to know the bridge authority line. - - the bridge authority needs to know the password. - -3. Current state - - I implemented a BridgePassword config option. Bridge authorities - should set it, and users who want to use those bridge authorities - should set it. - - Now there is a new directory URL "/tor/networkstatus-bridges" that - directory mirrors serve if BridgeAuthoritativeDir is set and it's a - begin_dir connection. It looks for the header - Authorization: Basic %s - where %s is the base-64 bridge password. - - I never got around to teaching clients how to set the header though, - so it may or may not, and may or may not do what we ultimate want. - - I've marked this proposal dead; it really never should have left the - ideas/ directory. Somebody should pick it up sometime and finish the - design and implementation. - diff --git a/doc/spec/proposals/129-reject-plaintext-ports.txt b/doc/spec/proposals/129-reject-plaintext-ports.txt deleted file mode 100644 index 8080ff5b75..0000000000 --- a/doc/spec/proposals/129-reject-plaintext-ports.txt +++ /dev/null @@ -1,114 +0,0 @@ -Filename: 129-reject-plaintext-ports.txt -Title: Block Insecure Protocols by Default -Author: Kevin Bauer & Damon McCoy -Created: 2008-01-15 -Status: Closed -Implemented-In: 0.2.0.x - -Overview: - - Below is a proposal to mitigate insecure protocol use over Tor. 
- - This document 1) demonstrates the extent to which insecure protocols are - currently used within the Tor network, and 2) proposes a simple solution - to prevent users from unknowingly using these insecure protocols. By - insecure, we consider protocols that explicitly leak sensitive user names - and/or passwords, such as POP, IMAP, Telnet, and FTP. - -Motivation: - - As part of a general study of Tor use in 2006/2007 [1], we attempted to - understand what types of protocols are used over Tor. While we observed a - enormous volume of Web and Peer-to-peer traffic, we were surprised by the - number of insecure protocols that were used over Tor. For example, over an - 8 day observation period, we observed the following number of connections - over insecure protocols: - - POP and IMAP:10,326 connections - Telnet: 8,401 connections - FTP: 3,788 connections - - Each of the above listed protocols exchange user name and password - information in plain-text. As an upper bound, we could have observed - 22,515 user names and passwords. This observation echos the reports of - a Tor router logging and posting e-mail passwords in August 2007 [2]. The - response from the Tor community has been to further educate users - about the dangers of using insecure protocols over Tor. However, we - recently repeated our Tor usage study from last year and noticed that the - trend in insecure protocol use has not declined. Therefore, we propose that - additional steps be taken to protect naive Tor users from inadvertently - exposing their identities (and even passwords) over Tor. - -Security Implications: - - This proposal is intended to improve Tor's security by limiting the - use of insecure protocols. - - Roger added: By adding these warnings for only some of the risky - behavior, users may do other risky behavior, not get a warning, and - believe that it is therefore safe. But overall, I think it's better - to warn for some of it than to warn for none of it. 
- -Specification: - - As an initial step towards mitigating the use of the above-mentioned - insecure protocols, we propose that the default ports for each respective - insecure service be blocked at the Tor client's socks proxy. These default - ports include: - - 23 - Telnet - 109 - POP2 - 110 - POP3 - 143 - IMAP - - Notice that FTP is not included in the proposed list of ports to block. This - is because FTP is often used anonymously, i.e., without any identifying - user name or password. - - This blocking scheme can be implemented as a set of flags in the client's - torrc configuration file: - - BlockInsecureProtocols 0|1 - WarnInsecureProtocols 0|1 - - When the warning flag is activated, a message should be displayed to - the user similar to the message given when Tor's socks proxy is given an IP - address rather than resolving a host name. - - We recommend that the default torrc configuration file block insecure - protocols and provide a warning to the user to explain the behavior. - - Finally, there are many popular web pages that do not offer secure - login features, such as MySpace, and it would be prudent to provide - additional rules to Privoxy to attempt to protect users from unknowingly - submitting their login credentials in plain-text. - -Compatibility: - - None, as the proposed changes are to be implemented in the client. - -References: - - [1] Shining Light in Dark Places: A Study of Anonymous Network Usage. - University of Colorado Technical Report CU-CS-1032-07. August 2007. - - [2] Rogue Nodes Turn Tor Anonymizer Into Eavesdropper's Paradise. - http://www.wired.com/politics/security/news/2007/09/embassy_hacks. - Wired. September 10, 2007. - -Implementation: - - Roger added this feature in - http://archives.seul.org/or/cvs/Jan-2008/msg00182.html - He also added a status event for Vidalia to recognize attempts to use - vulnerable-plaintext ports, so it can help the user understand what's - going on and how to fix it. 
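As a rough illustration of the Specification section, the check at the client's socks proxy could look like the sketch below. The function name and return shape are hypothetical; only the port list and the two flags (`BlockInsecureProtocols`, `WarnInsecureProtocols`) come from the proposal text, and the warning wording is invented for the example.

```python
# Default ports proposed for blocking; FTP is deliberately absent because
# it is often used anonymously.
INSECURE_PORTS = {23: "Telnet", 109: "POP2", 110: "POP3", 143: "IMAP"}


def check_socks_request(port, block_insecure=True, warn_insecure=True):
    """Return (allowed, warning) for a socks request to the given port.

    block_insecure and warn_insecure stand in for the proposed
    BlockInsecureProtocols and WarnInsecureProtocols torrc flags.
    """
    service = INSECURE_PORTS.get(port)
    if service is None:
        return True, None
    warning = None
    if warn_insecure:
        warning = (
            f"Port {port} ({service}) usually sends user names and "
            "passwords in plaintext."
        )
    return (not block_insecure), warning
```

With both flags on, a request to port 23 is refused with a warning, while a request to port 21 or 80 passes through untouched.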
- -Next steps: - - a) Vidalia should learn to recognize this controller status event, - so we don't leave users out in the cold when we enable this feature. - - b) We should decide which ports to reject by default. The current - consensus is 23,109,110,143 -- the same set that we warn for now. - diff --git a/doc/spec/proposals/130-v2-conn-protocol.txt b/doc/spec/proposals/130-v2-conn-protocol.txt deleted file mode 100644 index 60e742a622..0000000000 --- a/doc/spec/proposals/130-v2-conn-protocol.txt +++ /dev/null @@ -1,184 +0,0 @@ -Filename: 130-v2-conn-protocol.txt -Title: Version 2 Tor connection protocol -Author: Nick Mathewson -Created: 2007-10-25 -Status: Closed -Implemented-In: 0.2.0.x - -Overview: - - This proposal describes the significant changes to be made in the v2 - Tor connection protocol. - - This proposal relates to other proposals as follows: - - It refers to and supersedes: - Proposal 124: Blocking resistant TLS certificate usage - It refers to aspects of: - Proposal 105: Version negotiation for the Tor protocol - - - In summary, The Tor connection protocol has been in need of a redesign - for a while. This proposal describes how we can add to the Tor - protocol: - - - A new TLS handshake (to achieve blocking resistance without - breaking backward compatibility) - - Version negotiation (so that future connection protocol changes - can happen without breaking compatibility) - - The actual changes in the v2 Tor connection protocol. - -Motivation: - - For motivation, see proposal 124. - -Proposal: - -0. Terminology - - The version of the Tor connection protocol implemented up to now is - "version 1". This proposal describes "version 2". - - "Old" or "Older" versions of Tor are ones not aware that version 2 - of this protocol exists; - "New" or "Newer" versions are ones that are. - - The connection initiator is referred to below as the Client; the - connection responder is referred to below as the Server. - -1. The revised TLS handshake. 
- - For motivation, see proposal 124. This is a simplified version of the - handshake that uses TLS's renegotiation capability in order to avoid - some of the extraneous steps in proposal 124. - - The Client connects to the Server and, as in ordinary TLS, sends a - list of ciphers. Older versions of Tor will send only ciphers from - the list: - TLS_DHE_RSA_WITH_AES_256_CBC_SHA - TLS_DHE_RSA_WITH_AES_128_CBC_SHA - SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA - SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA - Clients that support the revised handshake will send the recommended - list of ciphers from proposal 124, in order to emulate the behavior of - a web browser. - - If the server notices that the list of ciphers contains only ciphers - from this list, it proceeds with Tor's version 1 TLS handshake as - documented in tor-spec.txt. - - (The server may also notice cipher lists used by other implementations - of the Tor protocol (in particular, the BouncyCastle default cipher - list as used by some Java-based implementations), and whitelist them.) - - On the other hand, if the server sees a list of ciphers that could not - have been sent from an older implementation (because it includes other - ciphers, and does not match any known-old list), the server sends a - reply containing a single connection certificate, constructed as for - the link certificate in the v1 Tor protocol. The subject names in - this certificate SHOULD NOT have any strings to identify them as - coming from a Tor server. The server does not ask the client for - certificates. - - Old Servers will (mostly) ignore the cipher list and respond as in the v1 - protocol, sending back a two-certificate chain. - - After the Client gets a response from the server, it checks for the - number of certificates it received. If there are two certificates, - the client assumes a V1 connection and proceeds as in tor-spec.txt. - But if there is only one certificate, the client assumes a V2 or later - protocol and continues. 
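The two decisions described above (the server inspecting the cipher list, the client counting certificates) can be sketched as below. This is a simplified model, not the OpenSSL-based implementation: the function names are hypothetical, cipher suites are represented as plain strings, and raising an error for an unexpected certificate count is an assumption, since the text only covers the one- and two-certificate cases.

```python
# Ciphers that older (v1-only) versions of Tor send, per the list above.
V1_CIPHERS = frozenset({
    "TLS_DHE_RSA_WITH_AES_256_CBC_SHA",
    "TLS_DHE_RSA_WITH_AES_128_CBC_SHA",
    "SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA",
    "SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA",
})


def server_handshake_version(client_ciphers, known_old_lists=()):
    """Server side: pick the v1 or v2 handshake from the offered ciphers."""
    offered = frozenset(client_ciphers)
    # Only ciphers from the old list: proceed with the v1 handshake.
    if offered <= V1_CIPHERS:
        return 1
    # Whitelisted lists from other old implementations also count as v1.
    if any(offered == frozenset(old) for old in known_old_lists):
        return 1
    # Anything else could not have come from an older implementation.
    return 2


def client_handshake_version(num_server_certs):
    """Client side: two certificates mean v1, one means v2 or later."""
    if num_server_certs == 2:
        return 1
    if num_server_certs == 1:
        return 2
    # Assumption: other counts are treated as an error in this sketch.
    raise ValueError(f"unexpected certificate count: {num_server_certs}")
```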
- - At this point, the client has established a TLS connection with the - server, but the parties have not been authenticated: the server hasn't - sent its identity certificate, and the client hasn't sent any - certificates at all. To fix this, the client begins a TLS session - renegotiation. This time, the server continues with two certificates - as usual, and asks for certificates so that the client will send - certificates of its own. Because the TLS connection has been - established, all of this is encrypted. (The certificate sent by the - server in the renegotiated connection need not be the same that - as sentin the original connection.) - - The server MUST NOT write any data until the client has renegotiated. - - Once the renegotiation is finished, the server and client check one - another's certificates as in V1. Now they are mutually authenticated. - -1.1. Revised TLS handshake: implementation notes. - - It isn't so easy to adjust server behavior based on the client's - ciphersuite list. Here's how we can do it using OpenSSL. This is a - bit of an abuse of the OpenSSL APIs, but it's the best we can do, and - we won't have to do it forever. - - We can use OpenSSL's SSL_set_info_callback() to register a function to - be called when the state changes. The type/state tuple of - SSL_CB_ACCEPT_LOOP/SSL3_ST_SW_SRVR_HELLO_A - happens when we have completely parsed the client hello, and are about - to send a response. From this callback, we can check the cipherlist - and act accordingly: - - * If the ciphersuite list indicates a v1 protocol, we set the - verify mode to SSL_VERIFY_NONE with a callback (so we get - certificates). - - * If the ciphersuite list indicates a v2 protocol, we set the - verify mode to SSL_VERIFY_NONE with no callback (so we get - no certificates) and set the SSL_MODE_NO_AUTO_CHAIN flag (so that - we send only 1 certificate in the response. 
- - Once the handshake is done, the server clears the - SSL_MODE_NO_AUTO_CHAIN flag and sets the callback as for the V1 - protocol. It then starts reading. - - The other problem to take care of is missing ciphers and OpenSSL's - cipher sorting algorithms. The two main issues are a) OpenSSL doesn't - support some of the default ciphers that Firefox advertises, and b) - OpenSSL sorts the list of ciphers it offers in a different way than - Firefox sorts them, so unless we fix that Tor will still look different - than Firefox. - [XXXX more on this.] - - -1.2. Compatibility for clients using libraries less hackable than OpenSSL. - - As discussed in proposal 105, servers advertise which protocol - versions they support in their router descriptors. Clients can simply - behave as v1 clients when connecting to servers that do not support - link version 2 or higher, and as v2 clients when connecting to servers - that do support link version 2 or higher. - - (Servers can't use this strategy because we do not assume that servers - know one another's capabilities when connecting.) - -2. Version negotiation. - - Version negotiation proceeds as described in proposal 105, except as - follows: - - * Version negotiation only happens if the TLS handshake as described - above completes. - - * The TLS renegotiation must be finished before the client sends a - VERSIONS cell; the server sends its VERSIONS cell in response. - - * The VERSIONS cell uses the following variable-width format: - Circuit [2 octets; set to 0] - Command [1 octet; set to 7 for VERSIONS] - Length [2 octets; big-endian] - Data [Length bytes] - - The Data in the cell is a series of big-endian two-byte integers. - - * It is not allowed to negotiate V1 conections once the v2 protocol - has been used. If this happens, Tor instances should close the - connection. - -3. 
The rest of the "v2" protocol - - Once a v2 protocol has been negotiated, NETINFO cells are exchanged - as in proposal 105, and communications begin as per tor-spec.txt. - Until NETINFO cells have been exchanged, the connection is not open. - - diff --git a/doc/spec/proposals/131-verify-tor-usage.txt b/doc/spec/proposals/131-verify-tor-usage.txt deleted file mode 100644 index d3c6efe75a..0000000000 --- a/doc/spec/proposals/131-verify-tor-usage.txt +++ /dev/null @@ -1,148 +0,0 @@ -Filename: 131-verify-tor-usage.txt -Title: Help users to verify they are using Tor -Author: Steven J. Murdoch -Created: 2008-01-25 -Status: Needs-Revision - -Overview: - - Websites for checking whether a user is accessing them via Tor are a - very helpful aid to configuring web browsers correctly. Existing - solutions have both false positives and false negatives when - checking if Tor is being used. This proposal will discuss how to - modify Tor so as to make testing more reliable. - -Motivation: - - Currently deployed websites for detecting Tor use work by comparing - the client IP address for a request with a list of known Tor nodes. - This approach is generally effective, but suffers from both false - positives and false negatives. - - If a user has a Tor exit node installed, or just happens to have - been allocated an IP address previously used by a Tor exit node, any - web requests will be incorrectly flagged as coming from Tor. If any - customer of an ISP which implements a transparent proxy runs an exit - node, all other users of the ISP will be flagged as Tor users. - - Conversely, if the exit node chosen by a Tor user has not yet been - recorded by the Tor checking website, requests will be incorrectly - flagged as not coming via Tor. - - The only reliable way to tell whether Tor is being used or not is for - the Tor client to flag this to the browser. 
- -Proposal: - - A DNS name should be registered and point to an IP address - controlled by the Tor project and likely to remain so for the - useful lifetime of a Tor client. A web server should be placed - at this IP address. - - Tor should be modified to treat requests to port 80, at the - specified DNS name or IP address specially. Instead of opening a - circuit, it should respond to a HTTP request with a helpful web - page: - - - If the request to open a connection was to the domain name, the web - page should state that Tor is working properly. - - If the request was to the IP address, the web page should state - that there is a DNS-leakage vulnerability. - - If the request goes through to the real web server, the page - should state that Tor has not been set up properly. - -Extensions: - - Identifying proxy server: - - If needed, other applications between the web browser and Tor (e.g. - Polipo and Privoxy) could piggyback on the same mechanism to flag - whether they are in use. All three possible web pages should include - a machine-readable placeholder, into which another program could - insert their own message. - - For example, the webpage returned by Tor to indicate a successful - configuration could include the following HTML: - <h2>Connection chain</h2> - <ul> - <li>Tor 0.1.2.14-alpha</li> - <!-- Tor Connectivity Check: success --> - </ul> - - When the proxy server observes this string, in response to a request - for the Tor connectivity check web page, it would prepend it's own - message, resulting in the following being returned to the web - browser: - <h2>Connection chain - <ul> - <li>Tor 0.1.2.14-alpha</li> - <li>Polipo version 1.0.4</li> - <!-- Tor Connectivity Check: success --> - </ul> - - Checking external connectivity: - - If Tor intercepts a request, and returns a response itself, the user - will not actually confirm whether Tor is able to build a successful - circuit. 
It may then be advantageous to include an image in the web - page which is loaded from a different domain. If this is able to be - loaded then the user will know that external connectivity through - Tor works. - - Automatic Firefox Notification: - - All forms of the website should return valid XHTML and have a - hidden link with an id attribute "TorCheckResult" and a target - property that can be queried to determine the result. For example, - a hidden link would convey success like this: - - <a id="TorCheckResult" target="success" href="/"></a> - - failure like this: - - <a id="TorCheckResult" target="failure" href="/"></a> - - and DNS leaks like this: - - <a id="TorCheckResult" target="dnsleak" href="/"></a> - - Firefox extensions such as Torbutton would then be able to - issue an XMLHttpRequest for the page and query the result - with resultXML.getElementById("TorCheckResult").target - to automatically report the Tor status to the user when - they first attempt to enable Tor activity, or whenever - they request a check from the extension preferences window. - - If the check website is to be themed with heavy graphics and/or - extensive documentation, the check result itself should be - contained in a seperate lightweight iframe that extensions can - request via an alternate url. - -Security and resiliency implications: - - What attacks are possible? - - If the IP address used for this feature moves there will be two - consequences: - - A new website at this IP address will remain inaccessible over - Tor - - Tor users who are leaking DNS will be informed that Tor is not - working, rather than that it is active but leaking DNS - We should thus attempt to find an IP address which we reasonably - believe can remain static. - -Open issues: - - If a Tor version which does not support this extra feature is used, - the webpage returned will indicate that Tor is not being used. Can - this be safely fixed? 
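For the "Automatic Firefox Notification" extension above, a consumer other than a browser extension could read the hidden link along these lines. This is a sketch under assumptions: the function name is hypothetical, and it relies on the check page being well-formed XHTML without undeclared entities, so that a plain XML parser accepts it.

```python
import xml.etree.ElementTree as ET


def tor_check_result(xhtml_text):
    """Return the target of the TorCheckResult link, or None if absent."""
    root = ET.fromstring(xhtml_text)
    # Attributes are not namespaced, so this works with or without the
    # XHTML namespace declared on the root element.
    for element in root.iter():
        if element.get("id") == "TorCheckResult":
            return element.get("target")
    return None
```

The returned string would be one of "success", "failure" or "dnsleak", matching the three hidden links shown in the proposal.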
- -Related work: - - The proposed mechanism is very similar to config.privoxy.org. The - most significant difference is that if the web browser is - misconfigured, Tor will only get an IP address. Even in this case, - Tor should be able to respond with a webpage to notify the user of how - to fix the problem. This also implies that Tor must be told of the - special IP address, and so must be effectively permanent. diff --git a/doc/spec/proposals/132-browser-check-tor-service.txt b/doc/spec/proposals/132-browser-check-tor-service.txt deleted file mode 100644 index 6132e5d060..0000000000 --- a/doc/spec/proposals/132-browser-check-tor-service.txt +++ /dev/null @@ -1,145 +0,0 @@ -Filename: 132-browser-check-tor-service.txt -Title: A Tor Web Service For Verifying Correct Browser Configuration -Author: Robert Hogan -Created: 2008-03-08 -Status: Draft - -Overview: - - Tor should operate a primitive web service on the loopback network device - that tests the operation of user's browser, privacy proxy and Tor client. - The tests are performed by serving unique, randomly generated elements in - image URLs embedded in static HTML. The images are only displayed if the DNS - and HTTP requests for them are routed through Tor, otherwise the 'alt' text - may be displayed. The proposal assumes that 'alt' text is not displayed on - all browsers so suggests that text and links should accompany each image - advising the user on next steps in case the test fails. - - The service is primarily for the use of controllers, since presumably users - aren't going to want to edit text files and then type something exotic like - 127.0.0.1:9999 into their address bar. In the main use case the controller - will have configured the actual port for the webservice so will know where - to direct the request. It would also be the responsibility of the controller - to ensure the webservice is available, and tor is running, before allowing - the user to access the page through their browser. 
- -Motivation: - - This is a complementary approach to proposal 131. It overcomes some of the - limitations of the approach described in proposal 131: reliance - on a permanent, real IP address and compatibility with older versions of - Tor. Unlike 131, it is not as useful to Tor users who are not running a - controller. - -Objective: - - Provide a reliable means of helping users to determine if their Tor - installation, privacy proxy and browser are properly configured for - anonymous browsing. - -Proposal: - - When configured to do so, Tor should run a basic web service available - on a configured port on 127.0.0.1. The purpose of this web service is to - serve a number of basic test images that will allow the user to determine - if their browser is properly configured and that Tor is working normally. - - The service can consist of a single web page with two columns. The left - column contains images, the right column contains advice on what the - display/non-display of the column means. - - The rest of this proposal assumes that the service is running on port - 9999. The port should be configurable, and configuring the port enables the - service. The service must run on 127.0.0.1. - - In all the examples below [uniquesessionid] refers to a random, base64 - encoded string that is unique to the URL it is contained in. Tor only ever - stores the most recently generated [uniquesessionid] for each URL, storing 3 - in total. Tor should generate a [uniquesessionid] for each of the test URLs - below every time a HTTP GET is received at 127.0.0.1:9999 for index.htm. - - The most suitable image for each test case is an implementation decision. - Tor will need to store and serve images for the first and second test - images, and possibly the third (see 'Open Issues'). - - 1. 
DNS Request Test Image - - This is a HTML element embedded in the page served by Tor at - http://127.0.0.1:9999: - - <IMG src="http://[uniquesessionid]:9999/torlogo.jpg" alt="If you can see - this text, your browser's DNS requests are not being routed through Tor." - width="200" height="200" align="middle" border="2"> - - If the browser's DNS request for [uniquesessionid] is routed through Tor, - Tor will intercept the request and return 127.0.0.1 as the resolved IP - address. This will shortly be followed by a HTTP request from the browser - for http://127.0.0.1:9999/torlogo.jpg. This request should be served with - the appropriate image. - - If the browser's DNS request for [uniquesessionid] is not routed through Tor - the browser may display the 'alt' text specified in the html element. The - HTML served by Tor should also contain text accompanying the image to advise - users what it means if they do not see an image. It should also provide a - link to click that provides information on how to remedy the problem. This - behaviour also applies to the images described in 2. and 3. below, so should - be assumed there as well. - - - 2. Proxy Configuration Test Image - - This is a HTML element embedded in the page served by Tor at - http://127.0.0.1:9999: - - <IMG src="http://torproject.org/[uniquesessionid].jpg" alt="If you can see - this text, your browser is not configured to work with Tor." width="200" - height="200" align="middle" border="2"> - - If the HTTP request for the resource [uniquesessionid].jpg is received by - Tor it will serve the appropriate image in response. It should serve this - image itself, without attempting to retrieve anything from the Internet. - - If Tor can identify the name of the proxy application requesting the - resource then it could store and serve an image identifying the proxy to the - user. - - 3. 
Tor Connectivity Test Image - - This is a HTML element embedded in the page served by Tor at - http://127.0.0.1:9999: - - <IMG src="http://torproject.org/[uniquesessionid]-torlogo.jpg" alt="If you - can see this text, your Tor installation cannot connect to the Internet." - width="200" height="200" align="middle" border="2"> - - The referenced image should actually exist on the Tor project website. If - Tor receives the request for the above resource it should remove the random - base64 encoded digest from the request (i.e. [uniquesessionid]-) and attempt - to retrieve the real image. - - Even on a fully operational Tor client this test may not always succeed. The - user should be advised that one or more attempts to retrieve this image may - be necessary to confirm a genuine problem. - -Open Issues: - - The final connectivity test relies on an externally maintained resource, if - this resource becomes unavailable the connectivity test will always fail. - Either the text accompanying the test should advise of this possibility or - Tor clients should be advised of the location of the test resource in the - main network directory listings. - - Any number of misconfigurations may make the web service unreachable, it is - the responsibility of the user's controller to recognize these and assist - the user in eliminating them. Tor can mitigate against the specific - misconfiguration of routing HTTP traffic to 127.0.0.1 to Tor itself by - serving such requests through the SOCKS port as well as the configured web - service report. - - Now Tor is inspecting the URLs requested on its SOCKS port and 'dropping' - them. It already inspects for raw IP addresses (to warn of DNS leaks) but - maybe the behaviour proposed here is qualitatively different. Maybe this is - an unwelcome precedent that can be used to beat the project over the head in - future. 
Or maybe it's not such a bad thing, Tor is merely attempting to make - normally invalid resource requests valid for a given purpose. - diff --git a/doc/spec/proposals/133-unreachable-ors.txt b/doc/spec/proposals/133-unreachable-ors.txt deleted file mode 100644 index a1c2dd8549..0000000000 --- a/doc/spec/proposals/133-unreachable-ors.txt +++ /dev/null @@ -1,128 +0,0 @@ -Filename: 133-unreachable-ors.txt -Title: Incorporate Unreachable ORs into the Tor Network -Author: Robert Hogan -Created: 2008-03-08 -Status: Draft - -Overview: - - Propose a scheme for harnessing the bandwidth of ORs who cannot currently - participate in the Tor network because they can only make outbound - TCP connections. - -Motivation: - - Restrictive local and remote firewalls are preventing many willing - candidates from becoming ORs on the Tor network. These - ORs have a casual interest in joining the network but their operators are not - sufficiently motivated or adept to complete the necessary router or firewall - configuration. The Tor network is losing out on their bandwidth. At the - moment we don't even know how many such 'candidate' ORs there are. - - -Objective: - - 1. Establish how many ORs are unable to qualify for publication because - they cannot establish that their ORPort is reachable. - - 2. Devise a method for making such ORs available to clients for circuit - building without prejudicing their anonymity. - -Proposal: - - ORs whose ORPort reachability testing fails a specified number of - consecutive times should: - 1. Enlist themselves with the authorities setting a 'Fallback' flag. This - flag indicates that the OR is up and running but cannot connect to - itself. - 2. Open an orconn with all ORs whose fingerprint begins with the same - byte as their own. The management of this orconn will be transferred - entirely to the OR at the other end. - 3. 
The fallback OR should update its router status to contain the - 'Running' flag if it has managed to open an orconn with 3/4 of the ORs - with an FP beginning with the same byte as its own. - - Tor ORs who are contacted by fallback ORs requesting an orconn should: - 1. Accept the orconn until they have reached a defined limit of orconn - connections with fallback ORs. - 2. Only accept such orconn requests from listed fallback ORs who - have an FP beginning with the same byte as their own. - - Tor clients can include fallback ORs in the network by doing the - following: - 1. When building a circuit, observe the fingerprint of each node they - wish to connect to. - 2. When randomly selecting a node from the set of all eligible nodes, - add all published, running fallback nodes to the set where the first - byte of the fingerprint matches the previous node in the circuit. - -Anonymity Implications: - - At least some, and possibly all, nodes on the network will have a set - of nodes that only they and a few others can build circuits on. - - 1. This means that fallback ORs might be unsuitable for use as middlemen - nodes, because if the exit node is the attacker it knows that the - number of nodes that could be the entry guard in the circuit is - reduced to roughly 1/256th of the network, or worse, 1/256th of all - nodes listed as Guards. For the same reason, fallback nodes would - appear to be unsuitable for two-hop circuits. - - 2. This is not a problem if fallback ORs are always exit nodes. If - the fallback OR is an attacker it will not be able to reduce the - set of possible nodes for the entry guard any further than a normal, - published OR. - -Possible Attacks/Open Issues: - - 1. Gaming Node Selection - Does running a fallback OR customized for a specific set of published ORs - improve an attacker's chances of seeing traffic from that set of published - ORs? 
Would such a strategy be any more effective than running published - ORs with other 'attractive' properties? - - 2. DoS Attack - An attacker could prevent all other legitimate fallback ORs with a - given byte-1 in their FP from functioning by running 20 or 30 fallback ORs - and monopolizing all available fallback slots on the published ORs. - This same attacker would then be in a position to monopolize all the - traffic of the fallback ORs on that byte-1 network segment. I'm not sure - what this would allow such an attacker to do. - - 3. Circuit-Sniffing - An observer watching exit traffic from a fallback server will know that the - previous node in the circuit is one of a very small, identifiable - subset of the total ORs in the network. To establish the full path of the - circuit they would only have to watch the exit traffic from the fallback - OR and all the traffic from the 20 or 30 ORs it is likely to be connected - to. This means it is substantially easier to establish all members of a - circuit which has a fallback OR as an exit (sniff and analyse 10-50 (i.e. - 1/256 varying) + 1 ORs) rather than a normal published OR (sniff all 2560 - or so ORs on the network). The same mechanism that allows the client to - expect a specific fallback OR to be available from a specific published OR - allows an attacker to prepare his ground. - - Mitigant: - In terms of the resources and access required to monitor 2000 to 3000 - nodes, the effort of the adversary is not significantly diminished when he - is only interested in 20 or 30. It is hard to see how an adversary who can - obtain access to a randomly selected portion of the Tor network would face - any new or qualitatively different obstacles in attempting to access much - of the rest of it. - - -Implementation Issues: - - The number of ORs this proposal would add to the Tor network is not known. - This is because there is no mechanism at present for recording unsuccessful - attempts to become an OR. 
If the proposal is considered promising it may be - worthwhile to issue an alpha series release where candidate ORs post a - primitive fallback descriptor to the authority directories. This fallback - descriptor would not contain any other flag that would make it eligible for - selection by clients. It would act solely as a means of sizing the number of - Tor instances that try and fail to become ORs. - - The upper limit on the number of orconns from fallback ORs a normal, - published OR should be willing to accept is an open question. Is one - hundred, mostly idle, such orconns too onerous? - diff --git a/doc/spec/proposals/134-robust-voting.txt b/doc/spec/proposals/134-robust-voting.txt deleted file mode 100644 index c5dfb3b47f..0000000000 --- a/doc/spec/proposals/134-robust-voting.txt +++ /dev/null @@ -1,123 +0,0 @@ -Filename: 134-robust-voting.txt -Title: More robust consensus voting with diverse authority sets -Author: Peter Palfrader -Created: 2008-04-01 -Status: Rejected - -History: - 2009 May 27: Added note on rejecting this proposal -- Nick - -Overview: - - A means to arrive at a valid directory consensus even when voters - disagree on who is an authority. - - -Motivation: - - Right now there are about five authoritative directory servers in the - Tor network, tho this number is expected to rise to about 15 eventually. - - Adding a new authority requires synchronized action from all operators of - directory authorities so that at any time during the update at least half of - all authorities are running and agree on who is an authority. The latter - requirement is there so that the authorities can arrive at a common - consensus: Each authority builds the consensus based on the votes from - all authorities it recognizes, and so a different set of recognized - authorities will lead to a different consensus document. 
- - -Objective: - - The modified voting procedure outlined in this proposal obsoletes the - requirement for most authorities to exactly agree on the list of - authorities. - - -Proposal: - - The vote document each authority generates contains a list of - authorities recognized by the generating authority. This will be - a list of authority identity fingerprints. - - Authorities will accept votes from and serve/mirror votes also for - authorities they do not recognize. (Votes contain the signing, - authority key, and the certificate linking them so they can be - verified even without knowing the authority beforehand.) - - Before building the consensus we will check which votes to use for - building: - - 1) We build a directed graph of which authority/vote recognizes - whom. - 2) (Parts of the graph that aren't reachable, directly or - indirectly, from any authorities we recognize can be discarded - immediately.) - 3) We find the largest fully connected subgraph. - (Should there be more than one subgraph of the same size there - needs to be some arbitrary ordering so we always pick the same. - E.g. pick the one who has the smaller (XOR of all votes' digests) - or something.) - 4) If we are part of that subgraph, great. This is the list of - votes we build our consensus with. - 5) If we are not part of that subgraph, remove all the nodes that - are part of it and go to 3. - - Using this procedure authorities that are updated to recognize a - new authority will continue voting with the old group until a - sufficient number has been updated to arrive at a consensus with - the recently added authority. - - In fact, the old set of authorities will probably be voting among - themselves until all but one has been updated to recognize the - new authority. Then which set of votes is used for consensus - building depends on which of the two equally large sets gets - ordered before the other in step (3) above. 
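As a rough illustration of steps 1) to 5) above, the vote selection could be sketched in Python as follows. This is not Tor code. The `recognizes` mapping (one entry per received vote, listing the authorities that vote recognizes), the function names, the brute-force clique search, and the lexicographic tie-break are all assumptions made for the example; the proposal itself only suggests "some arbitrary ordering" such as an XOR of vote digests.

```python
from itertools import combinations


def reachable_from(recognizes, start):
    """Step 2: collect every authority reachable, directly or indirectly."""
    seen, stack = set(), list(start)
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(recognizes.get(node, ()))
    return seen


def largest_clique(nodes, recognizes):
    """Step 3: largest set of votes whose authors all recognize each other.

    Brute force, which is acceptable for a handful of authorities.
    Ties are broken by taking the lexicographically smallest member tuple
    (a hypothetical stand-in for the ordering the proposal leaves open).
    """
    ordered = sorted(nodes)
    for size in range(len(ordered), 0, -1):
        for group in combinations(ordered, size):
            if all(b in recognizes.get(a, ())
                   for a in group for b in group if a != b):
                return set(group)
    return set()


def votes_to_use(me, recognizes):
    """Steps 3 to 5: repeat until we find a clique that contains us."""
    # Only authorities that actually sent a vote can take part.
    remaining = reachable_from(recognizes, {me}) & set(recognizes)
    while remaining:
        clique = largest_clique(remaining, recognizes)
        if me in clique:
            return clique          # step 4: build the consensus with these
        remaining -= clique        # step 5: drop that clique and retry
    return set()


# A has been updated to recognize the new authority D; B and C have not.
example = {
    "A": {"A", "B", "C", "D"},
    "B": {"A", "B", "C"},
    "C": {"A", "B", "C"},
    "D": {"A", "B", "C", "D"},
}
old_group = votes_to_use("A", example)
new_authority_group = votes_to_use("D", example)
```

In this example the updated authority A keeps voting with the old group, while D ends up alone, which matches the behaviour described in the paragraph above.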
- - It is necessary to continue with the process in (5) even if we - are not in the largest subgraph. Otherwise one rogue authority - could create a number of extra votes (by new authorities) so that - everybody stops at 5 and no consensus is built, even though it would - be trusted by all clients. - - -Anonymity Implications: - - The author does not believe this proposal to have anonymity - implications. - - -Possible Attacks/Open Issues/Some thinking required: - - Q: Can a number (less than or exactly half) of the authorities cause an honest - authority to vote for "their" consensus rather than the one that would - result were all authorities taken into account? - - - Q: Can a set of votes from external authorities, i.e. of whom we trust either - none or at least not all, cause us to change the set of consensus makers we - pick? - A: Yes, if other authorities decide they would rather build a consensus with them - then they'll be thrown out in step 3. But that's ok since those other - authorities will never vote with us anyway. - If we trust none of them then we throw them out even sooner, so no harm done. - - Q: Can this ever force us to build a consensus with authorities we do not - recognize? - A: No, we can never build a fully connected set with them in step 3. - ------------------------------- - -I'm rejecting this proposal as insecure. - -Suppose that we have a clique of size N, and M hostile members in the -clique. If these hostile members stop declaring trust for up to M-1 -good members of the clique, the clique with the hostile members -in it will be larger than the one without them. - -The M hostile members will constitute a majority of this new clique -when M > (N-(M-1)) / 2, or when M > (N + 1) / 3. This breaks our -requirement that an adversary must compromise a majority of authorities -in order to control the consensus. 
- --- Nick diff --git a/doc/spec/proposals/135-private-tor-networks.txt b/doc/spec/proposals/135-private-tor-networks.txt deleted file mode 100644 index 19ef68b7b1..0000000000 --- a/doc/spec/proposals/135-private-tor-networks.txt +++ /dev/null @@ -1,281 +0,0 @@ -Filename: 135-private-tor-networks.txt -Title: Simplify Configuration of Private Tor Networks -Author: Karsten Loesing -Created: 29-Apr-2008 -Status: Closed -Target: 0.2.1.x -Implemented-In: 0.2.1.2-alpha - -Change history: - - 29-Apr-2008 Initial proposal for or-dev - 19-May-2008 Included changes based on comments by Nick to or-dev and - added a section for test cases. - 18-Jun-2008 Changed testing-network-only configuration option names. - -Overview: - - Configuring a private Tor network has become a time-consuming and - error-prone task with the introduction of the v3 directory protocol. In - addition to that, operators of private Tor networks need to set an - increasing number of non-trivial configuration options, and it is hard - to keep FAQ entries describing this task up-to-date. In this proposal we - (1) suggest to (optionally) accelerate timing of the v3 directory voting - process and (2) introduce an umbrella config option specifically aimed at - creating private Tor networks. - -Design: - - 1. Accelerate Timing of v3 Directory Voting Process - - Tor has reasonable defaults for setting up a large, Internet-scale - network with comparably high latencies and possibly wrong server clocks. - However, those defaults are bad when it comes to quickly setting up a - private Tor network for testing, either on a single node or LAN (things - might be different when creating a test network on PlanetLab or - something). Some time constraints should be made configurable for private - networks. The general idea is to accelerate everything that has to do - with propagation of directory information, but nothing else, so that a - private network is available as soon as possible. 
(As a possible - safeguard, changing these configuration values could be made dependent on - the umbrella configuration option introduced in 2.) - - 1.1. Initial Voting Schedule - - When a v3 directory does not know any consensus, it assumes an initial, - hard-coded VotingInterval of 30 minutes, VoteDelay of 5 minutes, and - DistDelay of 5 minutes. This is important for multiple, simultaneously - restarted directory authorities to meet at a common time and create an - initial consensus. Unfortunately, this means that it may take up to half - an hour (or even more) for a private Tor network to bootstrap. - - We propose to make these three time constants configurable (note that - V3AuthVotingInterval, V3AuthVoteDelay, and V3AuthDistDelay do not have an - effect on the _initial_ voting schedule, but only on the schedule that a - directory authority votes for). This can be achieved by introducing three - new configuration options: TestingV3AuthInitialVotingInterval, - TestingV3AuthInitialVoteDelay, and TestingV3AuthInitialDistDelay. - - As first safeguards, Tor should only accept configuration values for - TestingV3AuthInitialVotingInterval that divide evenly into the default - value of 30 minutes. The effect is that even if people misconfigured - their directory authorities, they would meet at the default values at the - latest. The second safeguard is to allow configuration only when the - umbrella configuration option TestingTorNetwork is set. - - 1.2. Immediately Provide Reachability Information (Running flag) - - The default behavior of a directory authority is to provide the Running - flag only after the authority is available for at least 30 minutes. The - rationale is that before that time, an authority simply cannot deliver - useful information about other running nodes. But for private Tor - networks this may be different. 
This is currently implemented in the code - as: - - /** If we've been around for less than this amount of time, our - * reachability information is not accurate. */ - #define DIRSERV_TIME_TO_GET_REACHABILITY_INFO (30*60) - - There should be another configuration option - TestingAuthDirTimeToLearnReachability with a default value of 30 minutes - that can be changed when running testing Tor networks, e.g. to 0 minutes. - The configuration value would simply replace the quoted constant. Again, - changing this option could be safeguarded by requiring the umbrella - configuration option TestingTorNetwork to be set. - - 1.3. Reduce Estimated Descriptor Propagation Time - - Tor currently assumes that it takes up to 10 minutes until router - descriptors are propagated from the authorities to directory caches. - This is not very useful for private Tor networks, and we want to be able - to reduce this time, so that clients can download router descriptors in a - timely manner. - - /** Clients don't download any descriptor this recent, since it will - * probably not have propagated to enough caches. */ - #define ESTIMATED_PROPAGATION_TIME (10*60) - - We suggest to introduce a new config option - TestingEstimatedDescriptorPropagationTime which defaults to 10 minutes, - but that can be set to any lower non-negative value, e.g. 0 minutes. The - same safeguards as in 1.2 could be used here, too. - - 2. Umbrella Option for Setting Up Private Tor Networks - - Setting up a private Tor network requires a number of specific settings - that are not required or useful when running Tor in the public Tor - network. Instead of writing down these options in a FAQ entry, there - should be a single configuration option, e.g. TestingTorNetwork, that - changes all required settings at once. Newer Tor versions would keep the - set of configuration options up-to-date. It should still remain possible - to manually overwrite the settings that the umbrella configuration option - affects. 
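The interplay between the umbrella option, the per-option defaults, and manual overrides can be sketched as below. This is a minimal Python illustration, not Tor's configuration code: the function name `resolve_options`, the dictionaries, and the error text are assumptions, and only the two options introduced in sections 1.2 and 1.3 are modelled, with values in seconds.

```python
# Defaults for the public network, taken from the constants quoted above.
PUBLIC_DEFAULTS = {
    "TestingAuthDirTimeToLearnReachability": 30 * 60,
    "TestingEstimatedDescriptorPropagationTime": 10 * 60,
}

# Example defaults applied once the umbrella option is set (0 is the
# example value the proposal gives for both options).
TESTING_DEFAULTS = {
    "TestingAuthDirTimeToLearnReachability": 0,
    "TestingEstimatedDescriptorPropagationTime": 0,
}


def resolve_options(user_options):
    """Return the effective option values for a (hypothetical) config."""
    testing = user_options.get("TestingTorNetwork", 0) == 1

    # Start from the public defaults, then let the umbrella option
    # replace them with the testing-network defaults.
    resolved = dict(PUBLIC_DEFAULTS)
    if testing:
        resolved.update(TESTING_DEFAULTS)

    # Values the user set explicitly always win, but the guarded options
    # may only be changed when the umbrella option is set.
    for name, value in user_options.items():
        if name == "TestingTorNetwork":
            continue
        if name in PUBLIC_DEFAULTS and not testing:
            raise ValueError(
                f"{name} may only be changed in testing Tor networks")
        resolved[name] = value

    resolved["TestingTorNetwork"] = int(testing)
    return resolved


umbrella_only = resolve_options({"TestingTorNetwork": 1})
with_override = resolve_options({
    "TestingTorNetwork": 1,
    "TestingAuthDirTimeToLearnReachability": 5,
})
public = resolve_options({})
```

Keeping the defaults in two tables makes it straightforward for a newer version to extend the set of options the umbrella option affects without touching the override logic.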
- - The following configuration options are set by TestingTorNetwork: - - - ServerDNSAllowBrokenResolvConf 1 - Ignore the situation that private relays are not aware of any name - servers. - - - DirAllowPrivateAddresses 1 - Allow router descriptors containing private IP addresses. - - - EnforceDistinctSubnets 0 - Permit building circuits with relays in the same subnet. - - - AssumeReachable 1 - Omit self-testing for reachability. - - - AuthDirMaxServersPerAddr 0 - - AuthDirMaxServersPerAuthAddr 0 - Permit an unlimited number of nodes on the same IP address. - - - ClientDNSRejectInternalAddresses 0 - Believe in DNS responses resolving to private IP addresses. - - - ExitPolicyRejectPrivate 0 - Allow exiting to private IP addresses. (This one is a matter of - taste---it might be dangerous to make this a default in a private - network, although people setting up private Tor networks should know - what they are doing.) - - - V3AuthVotingInterval 5 minutes - - V3AuthVoteDelay 20 seconds - - V3AuthDistDelay 20 seconds - Accelerate voting schedule after first consensus has been reached. - - - TestingV3AuthInitialVotingInterval 5 minutes - - TestingV3AuthInitialVoteDelay 20 seconds - - TestingV3AuthInitialDistDelay 20 seconds - Accelerate initial voting schedule until first consensus is reached. - - - TestingAuthDirTimeToLearnReachability 0 minutes - Consider routers as Running from the start of running an authority. - - - TestingEstimatedDescriptorPropagationTime 0 minutes - Clients try downloading router descriptors from directory caches, - even when they are not 10 minutes old. - - In addition to changing the defaults for these configuration options, - TestingTorNetwork can only be set when a user has manually configured - DirServer lines. - -Test: - - The implementation of this proposal must pass the following tests: - - 1. Set TestingTorNetwork and see if dependent configuration options are - correctly changed. - - tor DataDirectory . 
ControlPort 9051 TestingTorNetwork 1 DirServer \ - "mydir 127.0.0.1:1234 0000000000000000000000000000000000000000" - telnet 127.0.0.1 9051 - AUTHENTICATE - GETCONF TestingTorNetwork TestingAuthDirTimeToLearnReachability - 250-TestingTorNetwork=1 - 250 TestingAuthDirTimeToLearnReachability=0 - QUIT - - 2. Set TestingTorNetwork and a dependent configuration value to see if - the provided value is used for the dependent option. - - tor DataDirectory . ControlPort 9051 TestingTorNetwork 1 DirServer \ - "mydir 127.0.0.1:1234 0000000000000000000000000000000000000000" \ - TestingAuthDirTimeToLearnReachability 5 - telnet 127.0.0.1 9051 - AUTHENTICATE - GETCONF TestingTorNetwork TestingAuthDirTimeToLearnReachability - 250-TestingTorNetwork=1 - 250 TestingAuthDirTimeToLearnReachability=5 - QUIT - - 3. Start with TestingTorNetwork set and change a dependent configuration - option later on. - - tor DataDirectory . ControlPort 9051 TestingTorNetwork 1 DirServer \ - "mydir 127.0.0.1:1234 0000000000000000000000000000000000000000" - telnet 127.0.0.1 9051 - AUTHENTICATE - SETCONF TestingAuthDirTimeToLearnReachability=5 - GETCONF TestingAuthDirTimeToLearnReachability - 250 TestingAuthDirTimeToLearnReachability=5 - QUIT - - 4. Start with TestingTorNetwork set and a dependent configuration value, - and reset that dependent configuration value. The result should be - the testing-network specific default value. - - tor DataDirectory . ControlPort 9051 TestingTorNetwork 1 DirServer \ - "mydir 127.0.0.1:1234 0000000000000000000000000000000000000000" \ - TestingAuthDirTimeToLearnReachability 5 - telnet 127.0.0.1 9051 - AUTHENTICATE - GETCONF TestingAuthDirTimeToLearnReachability - 250 TestingAuthDirTimeToLearnReachability=5 - RESETCONF TestingAuthDirTimeToLearnReachability - GETCONF TestingAuthDirTimeToLearnReachability - 250 TestingAuthDirTimeToLearnReachability=0 - QUIT - - 5. Leave TestingTorNetwork unset and check if dependent configuration - options are left unchanged. 
- - tor DataDirectory . ControlPort 9051 DirServer \ - "mydir 127.0.0.1:1234 0000000000000000000000000000000000000000" - telnet 127.0.0.1 9051 - AUTHENTICATE - GETCONF TestingTorNetwork TestingAuthDirTimeToLearnReachability - 250-TestingTorNetwork=0 - 250 TestingAuthDirTimeToLearnReachability=1800 - QUIT - - 6. Leave TestingTorNetwork unset, but set dependent configuration option - which should fail. - - tor DataDirectory . ControlPort 9051 DirServer \ - "mydir 127.0.0.1:1234 0000000000000000000000000000000000000000" \ - TestingAuthDirTimeToLearnReachability 0 - [warn] Failed to parse/validate config: - TestingAuthDirTimeToLearnReachability may only be changed in testing - Tor networks! - - 7. Start with TestingTorNetwork unset and change dependent configuration - option later on which should fail. - - tor DataDirectory . ControlPort 9051 DirServer \ - "mydir 127.0.0.1:1234 0000000000000000000000000000000000000000" - telnet 127.0.0.1 9051 - AUTHENTICATE - SETCONF TestingAuthDirTimeToLearnReachability=0 - 513 Unacceptable option value: TestingAuthDirTimeToLearnReachability - may only be changed in testing Tor networks! - - 8. Start with TestingTorNetwork unset and set it later on which should - fail. - - tor DataDirectory . ControlPort 9051 DirServer \ - "mydir 127.0.0.1:1234 0000000000000000000000000000000000000000" - telnet 127.0.0.1 9051 - AUTHENTICATE - SETCONF TestingTorNetwork=1 - 553 Transition not allowed: While Tor is running, changing - TestingTorNetwork is not allowed. - - 9. Start with TestingTorNetwork set and unset it later on which should - fail. - - tor DataDirectory . ControlPort 9051 TestingTorNetwork 1 DirServer \ - "mydir 127.0.0.1:1234 0000000000000000000000000000000000000000" - telnet 127.0.0.1 9051 - AUTHENTICATE - RESETCONF TestingTorNetwork - 513 Unacceptable option value: TestingV3AuthInitialVotingInterval may - only be changed in testing Tor networks! - - 10. 
Set TestingTorNetwork, but do not provide an alternate DirServer - which should fail. - - tor DataDirectory . ControlPort 9051 TestingTorNetwork 1 - [warn] Failed to parse/validate config: TestingTorNetwork may only be - configured in combination with a non-default set of DirServers. - diff --git a/doc/spec/proposals/136-legacy-keys.txt b/doc/spec/proposals/136-legacy-keys.txt deleted file mode 100644 index f2b1b5c7f9..0000000000 --- a/doc/spec/proposals/136-legacy-keys.txt +++ /dev/null @@ -1,100 +0,0 @@ -Filename: 136-legacy-keys.txt -Title: Mass authority migration with legacy keys -Author: Nick Mathewson -Created: 13-May-2008 -Status: Closed -Implemented-In: 0.2.0.x - -Overview: - - This document describes a mechanism to change the keys of more than - half of the directory servers at once without breaking old clients - and caches immediately. - -Motivation: - - If a single authority's identity key is believed to be compromised, - the solution is obvious: remove that authority from the list, - generate a new certificate, and treat the new cert as belonging to a - new authority. This approach works fine so long as less than 1/2 of - the authority identity keys are bad. - - Unfortunately, the mass-compromise case is possible if there is a - sufficiently bad bug in Tor or in any OS used by a majority of v3 - authorities. Let's be prepared for it! - - We could simply stop using the old keys and start using new ones, - and tell all clients running insecure versions to upgrade. - Unfortunately, this breaks our caching system pretty badly, since - caches won't cache a consensus that they don't believe in. It would - be nice to have everybody become secure the moment they upgrade to a - version listing the new authority keys, _without_ breaking upgraded - clients until the caches upgrade. - - So, let's come up with a way to provide a time window where the - consensuses are signed with the new keys and with the old. 
- -Design: - - We allow directory authorities to list a single "legacy key" - fingerprint in their votes. Each authority may add a single legacy - key. The format for this line is: - - legacy-dir-key FINGERPRINT - - We describe a new consensus method for generating directory - consensuses. This method is consensus method "3". - - When the authorities decide to use method "3" (as described in 3.4.1 - of dir-spec.txt), for every included vote with a legacy-dir-key line, - the consensus includes an extra dir-source line. The fingerprint in - this extra line is as in the legacy-dir-key line. The ports and - addresses are as in the dir-source line. The nickname is as in the - dir-source line, with the string "-legacy" appended. - - [We need to include this new dir-source line because the code - won't accept or preserve signatures from authorities not listed - as contributing to the consensus.] - - Authorities using legacy dir keys include two signatures on their - consensuses: one generated with a signing key signed with their real - signing key, and another generated with a signing key signed with - another signing key attested to by their identity key. These - signing keys MUST be different. Authorities MUST serve both - certificates if asked. - -Process: - - In the event of a mass key failure, we'll follow the following - (ugly) procedure: - - All affected authorities generate new certificates and identity - keys, and circulate their new dirserver lines. They copy their old - certificates and old broken keys, but put them in new "legacy - key files". - - At the earliest time that can be arranged, the authorities - replace their signing keys, identity keys, and certificates - with the new uncompromised versions, and update to the new list - of dirserver lines. - - They add a "V3DirAdvertiseLegacyKey 1" option to their torrc. - - Now, new consensuses will be generated using the new keys, but - the results will also be signed with the old keys. 
- - Clients and caches are told they need to upgrade, and given a - time window to do so. - - At the end of the time window, authorities remove the - V3DirAdvertiseLegacyKey option. - -Notes: - - It might be good to get caches to cache consensuses that they do not - believe in. I'm not sure the best way of how to do this. - - It's a superficially neat idea to have new signing keys and have - them signed by the new and by the old authority identity keys. This - breaks some code, though, and doesn't actually gain us anything, - since we'd still need to include each signature twice. - - It's also a superficially neat idea, if identity keys and signing - keys are compromised, to at least replace all the signing keys. - I don't think this achieves us anything either, though. - - diff --git a/doc/spec/proposals/137-bootstrap-phases.txt b/doc/spec/proposals/137-bootstrap-phases.txt deleted file mode 100644 index ebe044c707..0000000000 --- a/doc/spec/proposals/137-bootstrap-phases.txt +++ /dev/null @@ -1,235 +0,0 @@ -Filename: 137-bootstrap-phases.txt -Title: Keep controllers informed as Tor bootstraps -Author: Roger Dingledine -Created: 07-Jun-2008 -Status: Closed -Implemented-In: 0.2.1.x - -1. Overview. - - Tor has many steps to bootstrapping directory information and - initial circuits, but from the controller's perspective we just have - a coarse-grained "CIRCUIT_ESTABLISHED" status event. Tor users with - slow connections or with connectivity problems can wait a long time - staring at the yellow onion, wondering if it will ever change color. - - This proposal describes a new client status event so Tor can give - more details to the controller. Section 2 describes the changes to the - controller protocol; Section 3 describes Tor's internal bootstrapping - phases when everything is going correctly; Section 4 describes when - Tor detects a problem and issues a bootstrap warning; Section 5 covers - suggestions for how controllers should display the results. - -2. 
Controller event syntax. - - The generic status event is: - - "650" SP StatusType SP StatusSeverity SP StatusAction - [SP StatusArguments] CRLF - - So in this case we send - 650 STATUS_CLIENT NOTICE/WARN BOOTSTRAP \ - PROGRESS=num TAG=Keyword SUMMARY=String \ - [WARNING=String REASON=Keyword COUNT=num RECOMMENDATION=Keyword] - - The arguments MAY appear in any order. Controllers MUST accept unrecognized - arguments. - - "Progress" gives a number between 0 and 100 for how far through - the bootstrapping process we are. "Summary" is a string that can be - displayed to the user to describe the *next* task that Tor will tackle, - i.e., the task it is working on after sending the status event. "Tag" - is an optional string that controllers can use to recognize bootstrap - phases from Section 3, if they want to do something smarter than just - blindly displaying the summary string. - - The severity describes whether this is a normal bootstrap phase - (severity notice) or an indication of a bootstrapping problem - (severity warn). If severity warn, it should also include a "warning" - argument string with any hints Tor has to offer about why it's having - troubles bootstrapping, a "reason" string that lists one of the reasons - allowed in the ORConn event, a "count" number that tells how many - bootstrap problems there have been so far at this phase, and a - "recommendation" keyword to indicate how the controller ought to react. - -3. The bootstrap phases. - - This section describes the various phases currently reported by - Tor. Controllers should not assume that the percentages and tags listed - here will continue to match up, or even that the tags will stay in - the same order. Some phases might also be skipped (not reported) if the - associated bootstrap step is already complete, or if the phase no longer - is necessary. Only "starting" and "done" are guaranteed to exist in all - future versions. 
- - Current Tor versions enter these phases in order, monotonically; - future Tors MAY revisit earlier stages. - - Phase 0: - tag=starting summary="starting" - - Tor starts out in this phase. - - Phase 5: - tag=conn_dir summary="Connecting to directory mirror" - - Tor sends this event as soon as Tor has chosen a directory mirror --- - one of the authorities if bootstrapping for the first time or after - a long downtime, or one of the relays listed in its cached directory - information otherwise. - - Tor will stay at this phase until it has successfully established - a TCP connection with some directory mirror. Problems in this phase - generally happen because Tor doesn't have a network connection, or - because the local firewall is dropping SYN packets. - - Phase 10 - tag=handshake_dir summary="Finishing handshake with directory mirror" - - This event occurs when Tor establishes a TCP connection with a relay used - as a directory mirror (or its https proxy if it's using one). Tor remains - in this phase until the TLS handshake with the relay is finished. - - Problems in this phase generally happen because Tor's firewall is - doing more sophisticated MITM attacks on it, or doing packet-level - keyword recognition of Tor's handshake. - - Phase 15: - tag=onehop_create summary="Establishing one-hop circuit for dir info" - - Once TLS is finished with a relay, Tor will send a CREATE_FAST cell - to establish a one-hop circuit for retrieving directory information. - It will remain in this phase until it receives the CREATED_FAST cell - back, indicating that the circuit is ready. - - Phase 20: - tag=requesting_status summary="Asking for networkstatus consensus" - - Once we've finished our one-hop circuit, we will start a new stream - for fetching the networkstatus consensus. We'll stay in this phase - until we get the 'connected' relay cell back, indicating that we've - established a directory connection. 
- - Phase 25: - tag=loading_status summary="Loading networkstatus consensus" - - Once we've established a directory connection, we will start fetching - the networkstatus consensus document. This could take a while; this - phase is a good opportunity for using the "progress" keyword to indicate - partial progress. - - This phase could stall if the directory mirror we picked doesn't - have a copy of the networkstatus consensus so we have to ask another, - or it does give us a copy but we don't find it valid. - - Phase 40: - tag=loading_keys summary="Loading authority key certs" - - Sometimes when we've finished loading the networkstatus consensus, - we find that we don't have all the authority key certificates for the - keys that signed the consensus. At that point we put the consensus we - fetched on hold and fetch the keys so we can verify the signatures. - - Phase 45 - tag=requesting_descriptors summary="Asking for relay descriptors" - - Once we have a valid networkstatus consensus and we've checked all - its signatures, we start asking for relay descriptors. We stay in this - phase until we have received a 'connected' relay cell in response to - a request for descriptors. - - Phase 50: - tag=loading_descriptors summary="Loading relay descriptors" - - We will ask for relay descriptors from several different locations, - so this step will probably make up the bulk of the bootstrapping, - especially for users with slow connections. We stay in this phase until - we have descriptors for at least 1/4 of the usable relays listed in - the networkstatus consensus. This phase is also a good opportunity to - use the "progress" keyword to indicate partial steps. - - Phase 80: - tag=conn_or summary="Connecting to entry guard" - - Once we have a valid consensus and enough relay descriptors, we choose - some entry guards and start trying to build some circuits. This step - is similar to the "conn_dir" phase above; the only difference is - the context. 
- - If a Tor starts with enough recent cached directory information, - its first bootstrap status event will be for the conn_or phase. - - Phase 85: - tag=handshake_or summary="Finishing handshake with entry guard" - - This phase is similar to the "handshake_dir" phase, but it gets reached - if we finish a TCP connection to a Tor relay and we have already reached - the "conn_or" phase. We'll stay in this phase until we complete a TLS - handshake with a Tor relay. - - Phase 90: - tag=circuit_create summary="Establishing circuits" - - Once we've finished our TLS handshake with an entry guard, we will - set about trying to make some 3-hop circuits in case we need them soon. - - Phase 100: - tag=done summary="Done" - - A full 3-hop circuit has been established. Tor is ready to handle - application connections now. - -4. Bootstrap problem events. - - When an OR Conn fails, we send a "bootstrap problem" status event, which - is like the standard bootstrap status event except with severity warn. - We include the same progress, tag, and summary values as we would for - a normal bootstrap event, but we also include "warning", "reason", - "count", and "recommendation" key/value combos. - - The "reason" values are long-term-stable controller-facing tags to - identify particular issues in a bootstrapping step. The warning - strings, on the other hand, are human-readable. Controllers SHOULD - NOT rely on the format of any warning string. Currently the possible - values for "recommendation" are either "ignore" or "warn" -- if ignore, - the controller can accumulate the string in a pile of problems to show - the user if the user asks; if warn, the controller should alert the - user that Tor is pretty sure there's a bootstrapping problem. - - Currently Tor uses recommendation=ignore for the first nine bootstrap - problem reports for a given phase, and then uses recommendation=warn - for subsequent problems at that phase. 
Hopefully this is a good - balance between tolerating occasional errors and reporting serious - problems quickly. - -5. Suggested controller behavior. - - Controllers should start out with a yellow onion or the equivalent - ("starting"), and then watch for either a bootstrap status event - (meaning the Tor they're using is sufficiently new to produce them, - and they should load up the progress bar or whatever they plan to use - to indicate progress) or a circuit_established status event (meaning - bootstrapping is finished). - - In addition to a progress bar in the display, controllers should also - have some way to indicate progress even when no controller window is - open. For example, folks using Tor Browser Bundle in hostile Internet - cafes don't want a big splashy screen up. One way to let the user keep - informed of progress in a more subtle way is to change the task tray - icon and/or tooltip string as more bootstrap events come in. - - Controllers should also have some mechanism to alert their user when - bootstrapping problems are reported. Perhaps we should gather a set of - help texts and the controller can send the user to the right anchor in a - "bootstrapping problems" page in the controller's help subsystem? - -6. Getting up to speed when the controller connects. - - There's a new "GETINFO /status/bootstrap-phase" option, which returns - the most recent bootstrap phase status event sent. Specifically, - it returns a string starting with either "NOTICE BOOTSTRAP ..." or - "WARN BOOTSTRAP ...". - - Controllers should use this getinfo when they connect or attach to - Tor to learn its current state. 
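As an illustration of the event syntax from Section 2, a controller could
split a bootstrap status line into its keyword arguments roughly as shown
here. This is only a sketch in Python, not code from Tor or from any
controller library: the name parse_bootstrap_event and the sample line are
made up for the example, and it assumes that values containing spaces are
double-quoted. Unrecognized arguments are kept and the order of arguments
does not matter, as the proposal requires.

```python
import shlex


def parse_bootstrap_event(line):
    """Split a '650 STATUS_CLIENT <severity> BOOTSTRAP key=value ...' line.

    Returns (severity, args) or None if the line is not a bootstrap event.
    Hypothetical helper, for illustration only.
    """
    parts = shlex.split(line.strip())
    if len(parts) < 4 or parts[0] != "650" or parts[1] != "STATUS_CLIENT":
        return None
    severity, action = parts[2], parts[3]
    if action != "BOOTSTRAP":
        return None
    args = {}
    for token in parts[4:]:
        key, sep, value = token.partition("=")
        if sep:  # tokens without '=' are ignored
            args[key] = value
    return severity, args


# Made-up sample line following the syntax in Section 2.
sample = ('650 STATUS_CLIENT NOTICE BOOTSTRAP PROGRESS=5 TAG=conn_dir '
          'SUMMARY="Connecting to directory mirror"')
severity, args = parse_bootstrap_event(sample)
progress = int(args["PROGRESS"])
```

A controller would typically key its display logic on the severity and on
args.get("RECOMMENDATION"), and only show SUMMARY text to the user.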
- diff --git a/doc/spec/proposals/138-remove-down-routers-from-consensus.txt b/doc/spec/proposals/138-remove-down-routers-from-consensus.txt deleted file mode 100644 index 776911b5c9..0000000000 --- a/doc/spec/proposals/138-remove-down-routers-from-consensus.txt +++ /dev/null @@ -1,49 +0,0 @@ -Filename: 138-remove-down-routers-from-consensus.txt -Title: Remove routers that are not Running from consensus documents -Author: Peter Palfrader -Created: 11-Jun-2008 -Status: Closed -Implemented-In: 0.2.1.2-alpha - -1. Overview. - - Tor directory authorities hourly vote and agree on a consensus document - which lists all the routers on the network together with some of their - basic properties, like if a router is an exit node, whether it is - stable or whether it is a version 2 directory mirror. - - One of the properties given with each router is the 'Running' flag. - Clients do not use routers that are not listed as running. - - This proposal suggests that routers without the Running flag are not - listed at all. - -2. Current status - - At a typical bootstrap a client downloads a 140KB consensus, about - 10KB of certificates to verify that consensus, and about 1.6MB of - server descriptors, about 1/4 of which it requires before it will - start building circuits. - - Another proposal deals with how to get that huge 1.6MB fraction to - effectively zero (by downloading only individual descriptors, on - demand). Should that get successfully implemented that will leave the - 140KB compressed consensus as a large fraction of what a client needs - to get in order to work. - - About one third of the routers listed in a consensus are not running - and will therefore never be used by clients who use this consensus. - Not listing those routers will save about 30% to 40% in size. - -3. Proposed change - - Authority directory servers produce vote documents that include all - the servers they know about, running or not, like they currently - do. 
In addition these vote documents also state that the authority - supports a new consensus forming method (method number 4). - - If more than two thirds of votes that an authority has received claim - they support method 4 then this new method will be used: The - consensus document is formed like before but a new last step removes - all routers from the listing that are not marked as Running. - diff --git a/doc/spec/proposals/139-conditional-consensus-download.txt b/doc/spec/proposals/139-conditional-consensus-download.txt deleted file mode 100644 index 941f5ad6b0..0000000000 --- a/doc/spec/proposals/139-conditional-consensus-download.txt +++ /dev/null @@ -1,94 +0,0 @@ -Filename: 139-conditional-consensus-download.txt -Title: Download consensus documents only when it will be trusted -Author: Peter Palfrader -Created: 2008-04-13 -Status: Closed -Implemented-In: 0.2.1.x - -Overview: - - Servers only provide consensus documents to clients when it is known that - the client will trust it. - -Motivation: - - When clients[1] want a new network status consensus they request it - from a Tor server using the URL path /tor/status-vote/current/consensus. - Then after downloading the client checks if this consensus can be - trusted. Whether the client trusts the consensus depends on the - authorities that the client trusts and how many of those - authorities signed the consensus document. - - If the client cannot trust the consensus document it is disregarded - and a new download is tried at a later time. Several hundred - kilobytes of server bandwidth were wasted by this single client's - request. - - With hundreds of thousands of clients this will have undesirable - consequences when the list of authorities has changed so much that a - large number of established clients no longer can trust any consensus - document formed. - -Objective: - - The objective of this proposal is to make clients not download - consensuses they will not trust. 
- -Proposal: - - The list of authorities that are trusted by a client are encoded in - the URL they send to the directory server when requesting a consensus - document. - - The directory server then only sends back the consensus when more than - half of the authorities listed in the request have signed the - consensus. If it is known that the consensus will not be trusted - a 404 error code is sent back to the client. - - This proposal does not require directory caches to keep more than one - consensus document. This proposal also does not require authorities - to verify the signature on the consensus document of authorities they - do not recognize. - - The new URL scheme to download a consensus is - /tor/status-vote/current/consensus/<F> where F is a list of - fingerprints, sorted in ascending order, and concatenated using a + - sign. - - Fingerprints are uppercase hexadecimal encodings of the authority - identity key's digest. Servers should also accept requests that - use lower case or mixed case hexadecimal encodings. - - A .z URL for compressed versions of the consensus will be provided - similarly to existing resources and is the URL that usually should - be used by clients. - -Migration: - - The old location of the consensus should continue to work - indefinitely. Not only is it used by old clients, but it is a useful - resource for automated tools that do not particularly care which - authorities have signed the consensus. - - Authorities that are known to the client a priori by being shipped - with the Tor code are assumed to handle this format. - - When downloading a consensus document from caches that do not support this - new format they fall back to the old download location. - - Caches support the new format starting with Tor version 0.2.1.1-alpha. - -Anonymity Implications: - - By supplying the list of authorities a client trusts to the directory - server we leak information (like likely version of Tor client) to the - directory server. 
In the current system we also leak that we are - very old - by re-downloading the consensus over and over again, but - only when we are so old that we no longer can trust the consensus. - - - -Footnotes: - 1. For the purpose of this proposal a client can be any Tor instance - that downloads a consensus document. This includes relays, - directory caches as well as end users. diff --git a/doc/spec/proposals/140-consensus-diffs.txt b/doc/spec/proposals/140-consensus-diffs.txt deleted file mode 100644 index 8bc4070bfe..0000000000 --- a/doc/spec/proposals/140-consensus-diffs.txt +++ /dev/null @@ -1,156 +0,0 @@ -Filename: 140-consensus-diffs.txt -Title: Provide diffs between consensuses -Author: Peter Palfrader -Created: 13-Jun-2008 -Status: Accepted -Target: 0.2.2.x - -0. History - - 22-May-2009: Restricted the ed format even more strictly for ease of - implementation. -nickm - -1. Overview. - - Tor clients and servers need a list of which relays are on the - network. This list, the consensus, is created by authorities - hourly and clients fetch a copy of it, with some delay, hourly. - - This proposal suggests that clients download diffs of consensuses - once they have a consensus instead of hourly downloading a full - consensus. - -2. Numbers - - After implementing proposal 138 which removes nodes that are not - running from the list a consensus document is about 92 kilobytes - in size after compression. - - The diff between two consecutive consensus, in ed format, is on - average 13 kilobytes compressed. - -3. Proposal - -3.1 Clients - - If a client has a consensus that is recent enough it SHOULD - try to download a diff to get the latest consensus rather than - fetching a full one. - - [XXX: what is recent enough? 
- time delta in hours / size of compressed diff - 0 20 - 1 9650 - 2 17011 - 3 23150 - 4 29813 - 5 36079 - 6 39455 - 7 43903 - 8 48907 - 9 54549 - 10 60057 - 11 67810 - 12 71171 - 13 73863 - 14 76048 - 15 80031 - 16 84686 - 17 89862 - 18 94760 - 19 94868 - 20 94223 - 21 93921 - 22 92144 - 23 90228 - [ size of gzip compressed "diff -e" between the consensus on - 2008-06-01-00:00:00 and the following consensuses that day. - Consensuses have been modified to exclude down routers per - proposal 138. ] - - Data suggests that for the first few hours diffs are very useful, - saving about 60% for the first three hours, 30% for the first 10, - and almost nothing once we are past 16 hours. - ] - -3.2 Servers - - Directory authorities and servers need to keep up to X [XXX: depends - on how long clients try to download diffs per above] old consensus - documents so they can build diffs. They should offer a diff to the - most recent consensus at the URL - - http://tor.noreply.org/tor/status-vote/current/consensus/diff/<HASH>/<FPRLIST> - - where HASH is the full digest of the consensus the client currently - has, and FPRLIST is a list of (abbreviated) fingerprints of - authorities the client trusts. - - Servers will only return a consensus if more than half of the requested - authorities have signed the document, otherwise a 404 error will be sent - back. The fingerprints can be shortened to a length of any multiple of - two, using only the leftmost part of the encoded fingerprint. Tor uses - 3 bytes (6 hex characters) of the fingerprint. (This is just like the - conditional consensus downloads that Tor supports starting with - 0.2.1.1-alpha.) - - If a server cannot offer a diff from the consensus identified by the - hash but has a current consensus it MUST return the full consensus. - - [XXX: what should we do when the client already has the latest - consensus? 
I can think of the following options: - - send back 3xx not modified - - send back 200 ok and an empty diff - - send back 404 nothing newer here. - - I currently lean towards the empty diff.] - -4. Diff Format - - Diffs start with the token "network-status-diff-version" followed by a - space and the version number, currently "1". - - If a document does not start with network-status-diff it is assumed - to be a full consensus download and would therefore currently start - with "network-status-version 3". - - Following the network-status-diff header line is a diff, or patch, in - limited ed format. We choose this format because it is easy to create - and process with standard tools (patch, diff -e, ed). This will help - us in developing and testing this proposal and it should make future - debugging easier. - - [ If at one point in the future we decide that the space benefits from - a custom diff format outweighs these benefits we can always - introduce a new diff format and offer it at for instance - ../diff2/... ] - - We support the following ed commands, each on a line by itself: - - "<n1>d" Delete line n1 - - "<n1>,<n2>d" Delete lines n1 through n2, including - - "<n1>c" Replace line n1 with the following block - - "<n1>,<n2>c" Replace lines n1 through n2, including, with the - following block. - - "<n1>a" Append the following block after line n1. - - "a" Append the following block after the current line. - - "s/.//" Remove the first character in the current line. - - Note that line numbers always apply to the file after all previous - commands have already been applied. - - The commands MUST apply to the file from back to front, such that - lines are only ever referred to by their position in the original - file. 
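The restricted command set listed above is small enough to exercise with a
short applier. The Python below is a sketch under stated assumptions, not a
reference implementation of this format: the name apply_ed_diff is made up,
the "network-status-diff-version" header is assumed to have been stripped
already, and the code relies on the rules for text blocks (ended by a line
holding just ".") and for the "current line" that the following paragraphs
describe. It does not check that commands really run from back to front.

```python
import re

# "<n1>", optional ",<n2>", then one of the commands a, c, d.
_CMD = re.compile(r"^(?:(\d+)(?:,(\d+))?)?([acd])$")


def apply_ed_diff(lines, diff):
    """Apply a restricted ed-style diff (a list of lines) to a list of lines."""
    out = list(lines)
    cur = 1  # 1-based "current line"; initially the first line of the file
    i = 0
    while i < len(diff):
        cmd = diff[i]
        i += 1
        if cmd == "s/.//":
            # Remove the first character of the current line.
            out[cur - 1] = out[cur - 1][1:]
            continue
        m = _CMD.match(cmd)
        if m is None:
            raise ValueError("unsupported command: %r" % cmd)
        first, last, op = m.groups()
        # Only "a" may omit its line number, and "a" takes no range.
        if (first is None and op != "a") or (last is not None and op == "a"):
            raise ValueError("unsupported command: %r" % cmd)
        n1 = int(first) if first is not None else cur
        n2 = int(last) if last is not None else n1
        block = []
        if op in "ac":
            # The text block runs up to a line holding just ".".
            while diff[i] != ".":
                block.append(diff[i])
                i += 1
            i += 1  # skip the terminating "."
        if op == "a":
            out[n1:n1] = block
            cur = n1 + len(block)      # last line of the appended block
        elif op == "c":
            out[n1 - 1:n2] = block
            cur = n1 - 1 + len(block)  # last line of the replacement block
        else:
            del out[n1 - 1:n2]
            cur = min(n1, len(out))    # line after the deletion, or last line
    return out


# The ".." followed by s/.// inserts a line consisting of a single dot.
patched = apply_ed_diff(
    ["a", "b", "c", "d", "e"],
    ["5a", "f", ".", "2,3c", "X", "..", ".", "s/.//", "1d"],
)
```

Here the commands are ordered from the end of the file towards the start, so
each line number still refers to the position in the original file.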
- - The "current line" is either the first line of the file, if this is - the first command, the last line of a block we added in an append or - change command, or the line immediately following a set of lines we just - deleted (or the last line of the file if there are no lines after - that). - - The replace and append commands take blocks. These blocks are simply - appended to the diff after the line with the command. A line with - just a period (".") ends the block (and is not part of the lines - to add). Note that it is impossible to insert a line with just - a single dot. Recommended procedure is to insert a line with - two dots, then remove the first character of that line using s/.//. diff --git a/doc/spec/proposals/141-jit-sd-downloads.txt b/doc/spec/proposals/141-jit-sd-downloads.txt deleted file mode 100644 index 2ac7a086b7..0000000000 --- a/doc/spec/proposals/141-jit-sd-downloads.txt +++ /dev/null @@ -1,323 +0,0 @@ -Filename: 141-jit-sd-downloads.txt -Title: Download server descriptors on demand -Author: Peter Palfrader -Created: 15-Jun-2008 -Status: Draft - -1. Overview - - Downloading all server descriptors is the most expensive part - of bootstrapping a Tor client. These server descriptors currently - amount to about 1.5 Megabytes of data, and this size will grow - linearly with network size. - - Fetching all these server descriptors takes a long while for people - behind slow network connections. It is also a considerable load on - our network of directory mirrors. - - This document describes proposed changes to the Tor network and - directory protocol so that clients will no longer need to download - all server descriptors. - - These changes consist of moving load balancing information into - network status documents, implementing a means to download server - descriptors on demand in an anonymity-preserving way, and dealing - with exit node selection. - -2. 
What is in a server descriptor - - When a Tor client starts the first thing it will try to get is a - current network status document: a consensus signed by a majority - of directory authorities. This document is currently about 100 - Kilobytes in size, tho it will grow linearly with network size. - This document lists all servers currently running on the network. - The Tor client will then try to get a server descriptor for each - of the running servers. All server descriptors currently amount - to about 1.5 Megabytes of downloads. - - A Tor client learns several things about a server from its descriptor. - Some of these it already learned from the network status document - published by the authorities, but the server descriptor contains it - again in a single statement signed by the server itself, not just by - the directory authorities. - - Tor clients use the information from server descriptors for - different purposes, which are considered in the following sections. - - #three ways: One, to determine if a server will be able to handle - #this client's request; two, to actually communicate or use the server; - #three, for load balancing decisions. - # - #These three points are considered in the following subsections. - -2.1 Load balancing - - The Tor load balancing mechanism is quite complex in its details, but - it has a simple goal: The more traffic a server can handle the more - traffic it should get. That means the more traffic a server can - handle the more likely a client will use it. - - For this purpose each server descriptor has bandwidth information - which tries to convey a server's capacity to clients. - - Currently we weigh servers differently for different purposes. There - is a weight for when we use a server as a guard node (our entry to the - Tor network), there is one weight we assign servers for exit duties, - and a third for when we need intermediate (middle) nodes. 
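The goal stated in 2.1 (the more traffic a server can handle, the more
likely a client is to use it) amounts to a weighted random choice. The
Python below is a deliberately simplified sketch of that idea and not Tor's
actual path selection, which applies separate weights for guard, middle and
exit positions; the name pick_server and the sample bandwidth figures are
invented for the example.

```python
import random


def pick_server(bandwidths, rng=random):
    """Pick one server name with probability proportional to its bandwidth.

    bandwidths maps a server name to a non-negative weight; at least one
    weight must be positive. Hypothetical helper, for illustration only.
    """
    names = list(bandwidths)
    weights = [bandwidths[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


# Invented figures: "fast" should be chosen about ten times as often.
choice = pick_server({"slow": 50, "fast": 500})
```

Passing a seeded random.Random instance as rng makes the selection
reproducible for testing.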
- -2.2 Exit information - - When a Tor wants to exit to some resource on the internet it will - build a circuit to an exit node that allows access to that resource's - IP address and TCP Port. - - When building that circuit the client can make sure that the circuit - ends at a server that will be able to fulfill the request because the - client already learned of all the servers' exit policies from their - descriptors. - -2.3 Capability information - - Server descriptors contain information about the specific version of - the Tor protocol they understand [proposal 105]. - - Furthermore the server descriptor also contains the exact version of - the Tor software that the server is running and some decisions are - made based on the server version number (for instance a Tor client - will only make conditional consensus requests [proposal 139] when - talking to Tor servers version 0.2.1.1-alpha or later). - -2.4 Contact/key information - - A server descriptor lists a server's IP address and TCP ports on which - it accepts onion and directory connections. Furthermore it contains - the onion key (a short lived RSA key to which clients encrypt CREATE - cells). - -2.5 Identity information - - A Tor client learns the digest of a server's key from the network - status document. Once it has a server descriptor this descriptor - contains the full RSA identity key of the server. Clients verify - that 1) the digest of the identity key matches the expected digest - it got from the consensus, and 2) that the signature on the descriptor - from that key is valid. - - -3. No longer require clients to have copies of all SDs - -3.1 Load balancing info in consensus documents - - One of the reasons why clients download all server descriptors is for - doing proper load balancing as described in 2.1. In order for - clients to not require all server descriptors this information will - have to move into the network status document. 
- - Consensus documents will have a new line per router similar - to the "r", "s", and "v" lines that already exist. This line - will convey weight information to clients. - - "w Bandwidth=193" - - The bandwidth number is the lesser of observed bandwidth and bandwidth - rate limit from the server descriptor that the "r" line referenced by - digest (1st and 3rd field of the bandwidth line in the descriptor). - It is given in kilobytes per second so the byte value in the - descriptor has to be divided by 1024 (and is then truncated, i.e. - rounded down). - - Authorities will cap the bandwidth number at some arbitrary value, - currently 10MB/sec. If a router claims a larger bandwidth an - authority's vote will still only show Bandwidth=10240. - - The consensus value for bandwidth is the median of all bandwidth - numbers given in votes. In case of an even number of votes we use - the lower median. (Using this procedure allows us to change the - cap value more easily.) - - Clients should believe the bandwidth as presented in the consensus, - not capping it again. - -3.2 Fetching descriptors on demand - - As described in 2.4 a descriptor lists IP address, OR- and Dir-Port, - and the onion key for a server. - - A client already knows the IP address and the ports from the consensus - documents, but without the onion key it will not be able to send - CREATE/EXTEND cells for that server. Since the client needs the onion - key it needs the descriptor. - - If a client only downloaded a few descriptors in an observable manner - then that would leak which nodes it was going to use. - - This proposal suggests the following: - - 1) when connecting to a guard node for which the client does not - yet have a cached descriptor it requests the descriptor it - expects by hash. (The consensus document that the client holds - has a hash for the descriptor of this server. We want exactly - that descriptor, not a different one.) - - It does that by sending a RELAY_REQUEST_SD cell. 
- - A client MAY cache the descriptor of the guard node so that it does - not need to request it every single time it contacts the guard. - - 2) when a client wants to extend a circuit that currently ends in - server B to a new next server C, the client will send a - RELAY_REQUEST_SD cell to server B. This cell contains in its - payload the hash of a server descriptor the client would like - to obtain (C's server descriptor). The server sends back the - descriptor and the client can now form a valid EXTEND/CREATE cell - encrypted to C's onion key. - - Clients MUST NOT cache such descriptors. If they did they might - leak that they already extended to that server at least once - before. - - Replies to RELAY_REQUEST_SD requests need to be padded to some - constant upper limit in order to conceal a client's destination - from anybody who might be counting cells/bytes. - - RELAY_REQUEST_SD cells contain the following information: - - hash of the server descriptor requested - - hash of the identity digest of the server for which we want the SD - - IP address and OR-port of the server for which we want the SD - - padding factor - the number of cells we want the answer - padded to. - [XXX this just occurred to me and it might be smart. or it might - be stupid. clients would learn the padding factor they want - to use from the consensus document. This allows us to grow - the replies later on should SDs become larger.] - [XXX: figure out a decent padding size] - -3.3 Protocol versions - - Server descriptors contain optional information of supported - link-level and circuit-level protocols in the form of - "opt protocols Link 1 2 Circuit 1". These are not currently needed - and will probably eventually move into the "v" (version) line in - the consensus. This proposal does not deal with them. - - Similarly a server descriptor contains the version number of - a Tor node. This information is already present in the consensus - and is thus available to all clients immediately. 
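Returning briefly to the "w Bandwidth=" line of section 3.1: the arithmetic
described there (take the lesser of the rate limit and the observed
bandwidth, convert bytes to kilobytes rounding down, cap the result, and use
the lower median across votes) can be written out as follows. This is a
Python sketch for illustration only; the function names are hypothetical,
and the descriptor figures in the example are invented so that the result
matches the sample line "w Bandwidth=193".

```python
BANDWIDTH_CAP_KB = 10240  # the "currently 10MB/sec" cap from section 3.1


def vote_bandwidth(rate_bytes, observed_bytes, cap_kb=BANDWIDTH_CAP_KB):
    """Bandwidth value an authority would put in its vote, in KB/s."""
    kb = min(rate_bytes, observed_bytes) // 1024  # truncate, i.e. round down
    return min(kb, cap_kb)


def consensus_bandwidth(vote_values):
    """Lower median of the bandwidth values found in the votes."""
    ordered = sorted(vote_values)
    return ordered[(len(ordered) - 1) // 2]


# Invented descriptor: rate limit 198000 bytes/s, observed 500000 bytes/s.
example_vote = vote_bandwidth(198000, 500000)
example_consensus = consensus_bandwidth([100, 300, 200, 400])
```

With an even number of votes the index (n - 1) // 2 selects the lower of the
two middle values, which is the tie rule given in 3.1.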
- -3.4 Exit selection - - Currently finding an appropriate exit node for a user's request is - easy for a client because it has complete knowledge of all the exit - policies of all servers on the network. - - The consensus document will once again be extended to contain the - information required by clients. This information will be a summary - of each node's exit policy. The exit policy summary will only contain - the list of ports to which a node exits to most destination IP - addresses. - - A summary should claim a router exits to a specific TCP port if, - ignoring private IP addresses, the exit policy indicates that the - router would exit to this port to most IP addresses (that is, to all - but at most the equivalent of two /8 netblocks: either two /8 - netblocks, or one /8 and a couple of /12s or any other combination). - The exact algorithm used is this: Going through all exit policy items - - ignore any accept that is not for all IP addresses ("*"), - - ignore rejects for these netblocks (exactly, no subnetting): - 0.0.0.0/8, 169.254.0.0/16, 127.0.0.0/8, 192.168.0.0/16, 10.0.0.0/8, - and 172.16.0.0/12, - - for each reject count the number of IP addresses rejected against - the affected ports, - - once we hit an accept for all IP addresses ("*") add the ports in - that policy item to the list of accepted ports, if they don't have - more than 2^25 IP addresses (that's two /8 networks) counted - against them (i.e. if the router exits to a port to everywhere but - at most two /8 networks). - - An exit policy summary will be included in votes and consensus as a - new line attached to each exit node. The line will have the format - "p" <space> "accept"|"reject" <portlist> - where portlist is a comma separated list of single port numbers or - portranges (e.g. "22,80-88,1024-6000,6667"). - - Whether the summary shows the list of accepted ports or the list of - rejected ports depends on which list is shorter (has a shorter string - representation). In case of ties we choose the list of accepted - ports. 
As an exception to this rule an allow-all policy is - represented as "accept 1-65535" instead of "reject " and a reject-all - policy is similarly given as "reject 1-65535". - - Summary items are compressed, that is instead of "80-88,89-100" there - is only a single item of "80-100", similarly instead of "20,21" a - summary will say "20-21". - - Port lists are sorted in ascending order. - - The maximum allowed length of a policy summary (including the "accept " - or "reject ") is 1000 characters. If a summary exceeds that length we - use an accept-style summary and list as much of the port list as is - possible within these 1000 bytes. - -3.4.1 Consensus selection - - When building a consensus, authorities have to agree on a digest of - the server descriptor to list in the router line for each router. - This is documented in dir-spec section 3.4. - - All authorities that listed that agreed upon descriptor digest in - their vote should also list the same exit policy summary - or list - none at all if the authority has not been upgraded to list that - information in their vote. - - If we have votes with matching server descriptor digest of which at - least one of them has an exit policy summary then we distinguish - between two cases: - a) all authorities agree (or abstained) on the policy summary, and we - use the exit policy summary that they all listed in their vote, - b) something went wrong (or some authority is playing foul) and we - have different policy summaries. In that case we pick the one - that is most commonly listed in votes with the matching - descriptor. We break ties in favour of the lexicographically larger - vote. - - If none of the votes with a matching server descriptor digest has - an exit policy summary we use the most commonly listed one in all - votes, breaking ties like in case b above.
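The summary construction described in section 3.4 can be sketched roughly as follows. This is a minimal illustration, not code from Tor: the policy representation (a list of `(action, netblock, first_port, last_port)` tuples) and the names `accepted_ports`, `compress_ports` and `summary_line` are assumptions made for the example, and the 1000-character cap is not handled.

```python
import ipaddress

# Netblocks whose rejects are ignored (exact matches only, no subnetting).
IGNORED_REJECTS = {
    "0.0.0.0/8", "169.254.0.0/16", "127.0.0.0/8",
    "192.168.0.0/16", "10.0.0.0/8", "172.16.0.0/12",
}
LIMIT = 2 ** 25  # two /8 networks
ALL_PORTS = range(1, 65536)


def accepted_ports(policy):
    """policy: ordered list of (action, netblock, first_port, last_port).

    The tuple layout is a hypothetical representation for this sketch.
    """
    rejected = dict.fromkeys(ALL_PORTS, 0)  # addresses rejected so far, per port
    accepted = set()
    for action, net, first, last in policy:
        ports = range(first, last + 1)
        if action == "accept":
            if net == "*":  # accepts for anything narrower are ignored
                accepted.update(p for p in ports if rejected[p] <= LIMIT)
        elif net not in IGNORED_REJECTS:
            size = 2 ** 32 if net == "*" else ipaddress.ip_network(net).num_addresses
            for p in ports:
                rejected[p] += size
    return accepted


def compress_ports(ports):
    """Render ports as a sorted list with adjacent ports merged into ranges."""
    items = []
    start = prev = None
    for port in sorted(set(ports)):
        if prev is not None and port == prev + 1:
            prev = port  # extend the current run
            continue
        if start is not None:
            items.append(str(start) if start == prev else f"{start}-{prev}")
        start = prev = port
    if start is not None:
        items.append(str(start) if start == prev else f"{start}-{prev}")
    return ",".join(items)


def summary_line(policy):
    """Build the "p" line, preferring the shorter list (ties go to accept)."""
    accepted = accepted_ports(policy)
    acc = compress_ports(accepted)
    rej = compress_ports(set(ALL_PORTS) - accepted)
    if not rej:  # allow-all exception
        return "p accept 1-65535"
    if not acc:  # reject-all exception
        return "p reject 1-65535"
    return f"p accept {acc}" if len(acc) <= len(rej) else f"p reject {rej}"
```

For instance, a policy that rejects port 25 everywhere and accepts everything else would be summarised by this sketch as a reject-style line, because "25" is shorter than the corresponding accept list.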
- -3.4.2 Client behaviour - - When choosing an exit node for a specific request a Tor client will - choose from the list of nodes that exit to the requested port as given - by the consensus document. If a client has additional knowledge (like - cached full descriptors) that indicates the so chosen exit node will - reject the request then it MAY use that knowledge (or not include such - nodes in the selection to begin with). However, clients MUST NOT use - nodes that do not list the port as accepted in the summary (but for - which they know that the node would exit to that address from other - sources, like a cached descriptor). - - An exception to this is exit enclave behaviour: A client MAY use the - node at a specific IP address to exit to any port on the same address - even if that node is not listed as exiting to the port in the summary. - -4. Migration - -4.1 Consensus document changes. - - The consensus will need to include - - bandwidth information (see 3.1) - - exit policy summaries (3.4) - - A new consensus method (number TBD) will be chosen for this. - -5. Future possibilities - - This proposal still requires that all servers have the descriptors of - every other node in the network in order to answer RELAY_REQUEST_SD - cells. These cells are sent when a circuit is extended from ending at - node B to a new node C. In that case B would have to answer a - RELAY_REQUEST_SD cell that asks for C's server descriptor (by SD digest). - - In order to answer that request B obviously needs a copy of C's server - descriptor. The RELAY_REQUEST_SD cell already has all the info that - B needs to contact C so it can ask about the descriptor before passing it - back to the client. 
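The client-side rule from section 3.4.2 (only consider nodes whose summary lists the requested port as accepted) can be sketched as follows. This is a hedged illustration: the function names and the dictionary of nickname to "p" line are assumptions for the example, the line layout is taken from the format given in section 3.4, and the exit enclave exception is not modelled.

```python
def summary_allows(p_line, port):
    """Return True if a summary line such as "p accept 22,80-88" covers port."""
    fields = p_line.split()
    if len(fields) < 2 or fields[0] != "p" or fields[1] not in ("accept", "reject"):
        raise ValueError(f"not a policy summary line: {p_line!r}")
    listed = False
    if len(fields) > 2:
        for item in fields[2].split(","):
            low, _, high = item.partition("-")
            # A single port "80" is treated as the range 80-80.
            if int(low) <= port <= int(high or low):
                listed = True
                break
    # An accept list names allowed ports, a reject list names refused ports.
    return listed if fields[1] == "accept" else not listed


def candidate_exits(p_lines, port):
    """p_lines: hypothetical mapping of node nickname -> its "p" line."""
    return sorted(name for name, line in p_lines.items()
                  if summary_allows(line, port))
```

A client following this sketch would pick an exit only from the result of `candidate_exits`, and could still drop nodes from that list if a cached descriptor shows they would reject the request.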
- diff --git a/doc/spec/proposals/142-combine-intro-and-rend-points.txt b/doc/spec/proposals/142-combine-intro-and-rend-points.txt deleted file mode 100644 index 3abd5c863d..0000000000 --- a/doc/spec/proposals/142-combine-intro-and-rend-points.txt +++ /dev/null @@ -1,277 +0,0 @@ -Filename: 142-combine-intro-and-rend-points.txt -Title: Combine Introduction and Rendezvous Points -Author: Karsten Loesing, Christian Wilms -Created: 27-Jun-2008 -Status: Dead - -Change history: - - 27-Jun-2008 Initial proposal for or-dev - 04-Jul-2008 Give first security property the new name "Responsibility" - and change new cell formats according to rendezvous protocol - version 3 draft. - 19-Jul-2008 Added comment by Nick (but no solution, yet) that sharing of - circuits between multiple clients is not supported by Tor. - -Overview: - - Establishing a connection to a hidden service currently involves two Tor - relays, introduction and rendezvous point, and 10 more relays distributed - over four circuits to connect to them. The introduction point is - established in the mid-term by a hidden service to transfer introduction - requests from client to the hidden service. The rendezvous point is set - up by the client for a single hidden service request and actually - transfers end-to-end encrypted application data between client and hidden - service. - - There are some reasons for separating the two roles of introduction and - rendezvous point: (1) Responsibility: A relay shall not be made - responsible that it relays data for a certain hidden service; in the - original design as described in [1] an introduction point relays no - application data, and a rendezvous points neither knows the hidden - service nor can it decrypt the data. (2) Scalability: The hidden service - shall not have to maintain a number of open circuits proportional to the - expected number of client requests. 
(3) Attack resistance: The effect of - an attack on the only visible parts of a hidden service, its introduction - points, shall be as small as possible. - - However, elimination of a separate rendezvous connection as proposed by - Øverlier and Syverson [2] is the most promising approach to improve the - delay in connection establishment. From all substeps of connection - establishment extending a circuit by only a single hop is responsible for - a major part of delay. Reducing on-demand circuit extensions from two to - one results in a decrease of mean connection establishment times from 39 - to 29 seconds [3]. Particularly, eliminating the delay on hidden-service - side allows the client to better observe progress of connection - establishment, thus allowing it to use smaller timeouts. Proposal 114 - introduced new introduction keys for introduction points and provides for - user authorization data in hidden service descriptors; it will be shown - in this proposal that introduction keys in combination with new - introduction cookies provide for the first security property - responsibility. Further, eliminating the need for a separate introduction - connection benefits the overall network load by decreasing the number of - circuit extensions. After all, having only one connection between client - and hidden service reduces the overall protocol complexity. - -Design: - - 1. Hidden Service Configuration - - Hidden services should be able to choose whether they would like to use - this protocol. This might be opt-in for 0.2.1.x and opt-out for later - major releases. - - 2. Contact Point Establishment - - When preparing a hidden service, a Tor client selects a set of relays to - act as contact points instead of introduction points. The contact point - combines both roles of introduction and rendezvous point as proposed in - [2]. The only requirement for a relay to be picked as contact point is - its capability of performing this role. 
This can be determined from the - Tor version number that needs to be equal or higher than the first - version that implements this proposal. - - The easiest way to implement establishment of contact points is to - introduce v2 ESTABLISH_INTRO cells. By convention, the relay recognizes - version 2 ESTABLISH_INTRO cells as requests to establish a contact point - rather than an introduction point. - - V Format byte: set to 255 [1 octet] - V Version byte: set to 2 [1 octet] - KLEN Key length [2 octets] - PK Public introduction key [KLEN octets] - HS Hash of session info [20 octets] - SIG Signature of above information [variable] - - The hidden service does not create a fixed number of contact points, like - 3 in the current protocol. It uses a minimum of 3 contact points, but - increases this number depending on the history of client requests within - the last hour. The hidden service also increases this number depending on - the frequency of failing contact points in order to defend against - attacks on its contact points. When client authorization as described in - proposal 121 is used, a hidden service can also use the number of - authorized clients as first estimate for the required number of contact - points. - - 3. Hidden Service Descriptor Creation - - A hidden service needs to issue a fresh introduction cookie for each - established introduction point. By requiring clients to use this cookie - in a later connection establishment, an introduction point cannot access - the hidden service that it works for. Together with the fresh - introduction key that was introduced in proposal 114, this reduces - responsibility of a contact point for a specific hidden service. - - The v2 hidden service descriptor format contains an - "intro-authentication" field that may contain introduction-point specific - keys. The hidden service creates a random string, comparable to the - rendezvous cookie, and includes it in the descriptor as introduction - cookie for auth-type "1". 
By convention, clients recognize existence of - auth-type 1 as possibility to connect to a hidden service via a contact - point rather than an introduction point. Older clients that do not - understand this new protocol simply ignore that cookie. - - 4. Connection Establishment - - When establishing a connection to a hidden service a client learns about - the capability of using the new protocol from the hidden service - descriptor. It may choose whether to use this new protocol or not, - whereas older clients cannot understand the new capability and can only - use the current protocol. Client using version 0.2.1.x should be able to - opt-in for using the new protocol, which should change to opt-out for - later major releases. - - When using the new capability the client creates a v2 INTRODUCE1 cell - that extends an unversioned INTRODUCE1 cell by adding the content of an - ESTABLISH_RENDEZVOUS cell. Further, the client sends this cell using the - new cell type 41 RELAY_INTRODUCE1_VERSIONED to the introduction point, - because unversioned and versioned INTRODUCE1 cells are indistinguishable: - - Cleartext - V Version byte: set to 2 [1 octet] - PK_ID Identifier for Bob's PK [20 octets] - RC Rendezvous cookie [20 octets] - Encrypted to introduction key: - VER Version byte: set to 3. [1 octet] - AUTHT The auth type that is supported [1 octet] - AUTHL Length of auth data [2 octets] - AUTHD Auth data [variable] - RC Rendezvous cookie [20 octets] - g^x Diffie-Hellman data, part 1 [128 octets] - - The cleartext part contains the rendezvous cookie that the contact point - remembers just as a rendezvous point would do. - - The encrypted part contains the introduction cookie as auth data for the - auth type 1. The rendezvous cookie is contained as before, but there is - no further rendezvous point information, as there is no separate - rendezvous point. - - 5. 
Rendezvous Establishment - - The contact point recognizes a v2 INTRODUCE1 cell with auth type 1 as a - request to be used in the new protocol. It remembers the contained - rendezvous cookie, replies to the client with an INTRODUCE_ACK cell - (omitting the RENDEZVOUS_ESTABLISHED cell), and forwards the encrypted - part of the INTRODUCE1 cell as INTRODUCE2 cell to the hidden service. - - 6. Introduction at Hidden Service - - The hidden services recognizes an INTRODUCE2 cell containing an - introduction cookie as authorization data. In this case, it does not - extend a circuit to a rendezvous point, but sends a RENDEZVOUS1 cell - directly back to its contact point as usual. - - 7. Rendezvous at Contact Point - - The contact point processes a RENDEZVOUS1 cell just as a rendezvous point - does. The only difference is that the hidden-service-side circuit is not - exclusive for the client connection, but shared among multiple client - connections. - - [Tor does not allow sharing of a single circuit among multiple client - connections easily. We need to think about a smart and efficient way to - implement this. Comment by Nick. -KL] - -Security Implications: - - (1) Responsibility - - One of the original reasons for the separation of introduction and - rendezvous points is that a relay shall not be made responsible that it - relays data for a certain hidden service. In the current design an - introduction point relays no application data and a rendezvous points - neither knows the hidden service nor can it decrypt the data. - - This property is also fulfilled in this new design. A contact point only - learns a fresh introduction key instead of the hidden service key, so - that it cannot recognize a hidden service. Further, the introduction - cookie, which is unknown to the contact point, prevents it from accessing - the hidden service itself. 
The only way for a contact point to access a - hidden service is to look up whether it is contained in the descriptors - of known hidden services. A contact point cannot directly be made - responsible for which hidden service it is working. In addition to that, - it cannot learn the data that it transfers, because all communication - between client and hidden service are end-to-end encrypted. - - (2) Scalability - - Another goal of the existing hidden service protocol is that a hidden - service does not have to maintain a number of open circuits proportional - to the expected number of client requests. The rationale behind this is - better scalability. - - The new protocol eliminates the need for a hidden service to extend - circuits on demand, which has a positive effect on circuits establishment - times and overall network load. The solution presented here to establish - a number of contact points proportional to the history of connection - requests reduces the number of circuits to a minimum number that fits the - hidden service's needs. - - (3) Attack resistance - - The third goal of separating introduction and rendezvous points is to - limit the effect of an attack on the only visible parts of a hidden - service which are the contact points in this protocol. - - In theory, the new protocol is more vulnerable to this attack. An - attacker who can take down a contact point does not only eliminate an - access point to the hidden service, but also breaks current client - connections to the hidden service using that contact point. - - Øverlier and Syverson proposed the concept of valet nodes as additional - safeguard for introduction/contact points [4]. Unfortunately, this - increases hidden service protocol complexity conceptually and from an - implementation point of view. Therefore, it is not included in this - proposal. - - However, in practice attacking a contact point (or introduction point) is - not as rewarding as it might appear. 
The cost for a hidden service to set - up a new contact point and publish a new hidden service descriptor is - minimal compared to the efforts necessary for an attacker to take a Tor - relay down. As a countermeasure to further frustrate this attack, the - hidden service raises the number of contact points as a function of - previous contact point failures. - - Further, the probability of breaking client connections due to attacking - a contact point is minimal. It can be assumed that the probability of one - of the other five involved relays in a hidden service connection failing - or being shut down is higher than that of a successful attack on a - contact point. - - (4) Resistance against Locating Attacks - - Clients are no longer able to force a hidden service to create or extend - circuits. This further reduces an attacker's capabilities of locating a - hidden server as described by Øverlier and Syverson [5]. - -Compatibility: - - The presented protocol does not raise compatibility issues with current - Tor versions. New relay versions support both, the existing and the - proposed protocol as introduction/rendezvous/contact points. A contact - point acts as introduction point simultaneously. Hidden services and - clients can opt-in to use the new protocol which might change to opt-out - some time in the future. - -References: - - [1] Roger Dingledine, Nick Mathewson, and Paul Syverson, Tor: The - Second-Generation Onion Router. In the Proceedings of the 13th USENIX - Security Symposium, August 2004. - - [2] Lasse Øverlier and Paul Syverson, Improving Efficiency and Simplicity - of Tor Circuit Establishment and Hidden Services. In the Proceedings of - the Seventh Workshop on Privacy Enhancing Technologies (PET 2007), - Ottawa, Canada, June 2007. - - [3] Christian Wilms, Improving the Tor Hidden Service Protocol Aiming at - Better Performance, diploma thesis, June 2008, University of Bamberg. 
- - [4] Lasse Øverlier and Paul Syverson, Valet Services: Improving Hidden - Servers with a Personal Touch. In the Proceedings of the Sixth Workshop - on Privacy Enhancing Technologies (PET 2006), Cambridge, UK, June 2006. - - [5] Lasse Øverlier and Paul Syverson, Locating Hidden Servers. In the - Proceedings of the 2006 IEEE Symposium on Security and Privacy, May 2006. - diff --git a/doc/spec/proposals/143-distributed-storage-improvements.txt b/doc/spec/proposals/143-distributed-storage-improvements.txt deleted file mode 100644 index 0f7468f1dc..0000000000 --- a/doc/spec/proposals/143-distributed-storage-improvements.txt +++ /dev/null @@ -1,194 +0,0 @@ -Filename: 143-distributed-storage-improvements.txt -Title: Improvements of Distributed Storage for Tor Hidden Service Descriptors -Author: Karsten Loesing -Created: 28-Jun-2008 -Status: Open -Target: 0.2.1.x - -Change history: - - 28-Jun-2008 Initial proposal for or-dev - -Overview: - - An evaluation of the distributed storage for Tor hidden service - descriptors and subsequent discussions have brought up a few improvements - to proposal 114. All improvements are backwards compatible to the - implementation of proposal 114. - -Design: - - 1. Report Bad Directory Nodes - - Bad hidden service directory nodes could deny existence of previously - stored descriptors. A bad directory node that does this with all stored - descriptors causes harm to the distributed storage in general, but - replication will cope with this problem in most cases. However, an - adversary that attempts to make a specific hidden service unavailable by - running relays that become responsible for all of a service's - descriptors poses a more serious threat. The distributed storage needs to - defend against this attack by detecting and removing bad directory nodes. - - As a countermeasure hidden services try to download their descriptors - every hour at random times from the hidden service directories that are - responsible for storing it. 
If a directory node replies with 404 (Not - found), the hidden service reports the supposedly bad directory node to - a random selection of half of the directory authorities (with version - numbers equal to or higher than the first version that implements this - proposal). The hidden service posts a complaint message using HTTP 'POST' - to a URL "/tor/rendezvous/complain" with the following message format: - - "hidden-service-directory-complaint" identifier NL - - [At start, exactly once] - - The identifier of the hidden service directory node to be - investigated. - - "rendezvous-service-descriptor" descriptor NL - - [At end, exactly once] - - The hidden service descriptor that the supposedly bad directory node - does not serve. - - The directory authority checks whether the descriptor is valid and - whether the hidden service directory is responsible for storing it. It - waits for a random time - of up to 30 minutes before posting the descriptor to the hidden service - directory. If the publication is acknowledged, the directory authority - waits another random time of up to 30 minutes before attempting to - request the descriptor that it has posted. If the directory node replies - with 404 (Not found), it will be blacklisted from being a hidden service - directory node for the next 48 hours. - - A blacklisted hidden service directory is assigned the new flag BadHSDir - instead of the HSDir flag in the vote that a directory authority creates. - In a consensus a relay is only assigned a HSDir flag if the majority of - votes contain a HSDir flag and no more than one third of votes contain - a BadHSDir flag. As a result, clients do not have to learn about the - BadHSDir flag. A blacklisted directory node will simply not be assigned - the HSDir flag in the consensus. - - In order to prevent an attacker from setting up new nodes as replacement - for blacklisted directory nodes, all directory nodes in the same /24 - subnet are blacklisted, too.
Furthermore, if two or more directory nodes - are blacklisted in the same /16 subnet concurrently, all other directory - nodes in that /16 subnet are blacklisted, too. Blacklisting holds for at - most 48 hours. - - 2. Publish Fewer Replicas - - The evaluation has shown that the probability of a directory node to - serve a previously stored descriptor is 85.7% (more precisely, this is - the 0.001-quantile of the empirical distribution with the rationale that - it holds for 99.9% of all empirical cases). If descriptors are replicated - to x directory nodes, the probability of at least one of the replicas to - be available for clients is 1 - (1 - 85.7%) ^ x. In order to achieve an - overall availability of 99.9%, x = 3.55 replicas need to be stored. From - this follows that 4 replicas are sufficient, rather than the currently - stored 6 replicas. - - Further, the current design stores 2 sets of descriptors on 3 directory - nodes with consecutive identities. Originally, this was meant to - facilitate replication between directory nodes, which has not been and - will not be implemented (the selection criterion of 24 hours uptime does - not make it necessary). As a result, storing descriptors on directory - nodes with consecutive identities is not required. In fact it should be - avoided to enable an attacker to create "black holes" in the identifier - ring. - - Hidden services should store their descriptors on 4 non-consecutive - directory nodes, and clients should request descriptors from these - directory nodes only. For compatibility reasons, hidden services also - store their descriptors on 2 consecutive directory nodes. Hence, 0.2.0.x - clients will be able to retrieve 4 out of 6 descriptors, but will fail - for the remaining 2 descriptors, which is sufficient for reliability. As - soon as 0.2.0.x is deprecated, hidden services can stop publishing the - additional 2 replicas. - - 3. 
Change Default Value of Being Hidden Service Directory - - The requirements for becoming a hidden service directory node are an open - directory port and an uptime of at least 24 hours. The evaluation has - shown that there are 300 hidden service directory candidates in the mean, - but only 6 of them are configured to act as hidden service directories. - This is bad, because those 6 nodes need to serve a large share of all - hidden service descriptors. Optimally, there should be hundreds of hidden - service directories. Having a large number of 0.2.1.x directory nodes - also has a positive effect on 0.2.0.x hidden services and clients. - - Therefore, the new default of HidServDirectoryV2 should be 1, so that a - Tor relay that has an open directory port automatically accepts and - serves v2 hidden service descriptors. A relay operator can still opt-out - running a hidden service directory by changing HidServDirectoryV2 to 0. - The additional bandwidth requirements for running a hidden service - directory node in addition to being a directory cache are negligible. - - 4. Make Descriptors Persistent on Directory Nodes - - Hidden service directories that are restarted by their operators or after - a failure will not be selected as hidden service directories within the - next 24 hours. However, some clients might still think that these nodes - are responsible for certain descriptors, because they work on the basis - of network consensuses that are up to three hours old. The directory - nodes should be able to serve the previously received descriptors to - these clients. Therefore, directory nodes make all received descriptors - persistent and load previously received descriptors on startup. - - 5. Store and Serve Descriptors Regardless of Responsibility - - Currently, directory nodes only accept descriptors for which they think - they are responsible. 
This may lead to problems when a directory node - uses an older or newer network consensus than the hidden service or client - or when a directory node has been restarted recently. In fact, there are - no security issues in storing or serving descriptors for which a - directory node thinks it is not responsible. On the contrary, doing so - may improve reliability in edge cases. As a result, a directory node - does not pay attention to responsibility when receiving a publication or - fetch request, but stores or serves the requested descriptor. Likewise, - the directory node does not remove descriptors when it thinks it is not - responsible for them any more. - - 6. Avoid Periodic Descriptor Re-Publication - - In the current implementation a hidden service re-publishes its - descriptor either when its content changes or an hour elapses. However, - the evaluation has shown that failures of hidden service directory nodes, - i.e. of nodes that have not failed within the last 24 hours, are very - rare. Together with making descriptors persistent on directory nodes, - there is no necessity to re-publish descriptors hourly. - - The only two events leading to descriptor re-publication should be a - change of the descriptor content and a new directory node becoming - responsible for the descriptor. Hidden services should therefore consider - re-publication every time they learn about a new network consensus - instead of hourly. - - 7. Discard Expired Descriptors - - The current implementation lets directory nodes keep a descriptor for two - days before discarding it. However, with the v2 design, descriptors are - only valid for at most one day. Directory nodes should determine the - validity of stored descriptors and discard them one hour after they have - expired (to compensate for wrong clocks on clients). - - 8.
Shorten Client-Side Descriptor Fetch History - - When clients try to download a hidden service descriptor, they memorize - fetch requests to directory nodes for up to 15 minutes. This allows them - to request all replicas of a descriptor to avoid bad or failing directory - nodes, but without querying the same directory node twice. - - The downside is that a client that has requested a descriptor without - success, will not be able to find a hidden service that has been started - during the following 15 minutes after the client's last request. - - This can be improved by shortening the fetch history to only 5 minutes. - This time should be sufficient to complete requests for all replicas of a - descriptor, but without ending in an infinite request loop. - -Compatibility: - - All proposed improvements are compatible to the currently implemented - design as described in proposal 114. - diff --git a/doc/spec/proposals/144-enforce-distinct-providers.txt b/doc/spec/proposals/144-enforce-distinct-providers.txt deleted file mode 100644 index aa460482f1..0000000000 --- a/doc/spec/proposals/144-enforce-distinct-providers.txt +++ /dev/null @@ -1,165 +0,0 @@ -Filename: 144-enforce-distinct-providers.txt -Title: Increase the diversity of circuits by detecting nodes belonging the - same provider -Author: Mfr -Created: 2008-06-15 -Status: Draft - -Overview: - - Increase network security by reducing the capacity of the relay or - ISPs monitoring personally or requisition, a large part of traffic - Tor trying to break circuits privacy. A way to increase the - diversity of circuits without killing the network performance. - -Motivation: - - Since 2004, Roger an Nick publication about diversity [1], very fast - relays Tor running are focused among an half dozen of providers, - controlling traffic of some dozens of routers [2]. 
- - In the same way the generalization of VMs clonables paid by hour, - allowing starting in few minutes and for a small cost, a set of very - high-speed relay whose in a few hours can attract a big traffic that - can be analyzed, increasing the vulnerability of the network. - - Whether ISPs or domU providers, these usually have several groups of - IP Class B. Also the restriction in place EnforceDistinctSubnets - automatically excluding IP subnet class B is only partially - effective. By contrast a restriction at the class A will be too - restrictive. - - Therefore it seems necessary to consider another approach. - -Proposal: - - Add a provider control based on AS number added by the router on is - descriptor, controlled by Directories Authorities, and used like the - declarative family field for circuit creating. - -Design: - -Step 1 : - - Add to the router descriptor a provider information get request [4] - by the router itself. - - "provider" name NL - - 'names' is the AS number of the router formated like this: - 'ASxxxxxx' where AS is fixed and xxxxxx is the AS number, - left aligned ( ex: AS98304 , AS4096,AS1 ) or if AS number - is missing the network A class number is used like that: - 'ANxxx' where AN is fixed and xxx is the first 3 digits of - the IP (ex: for the IP 1.1.1.2 AN1) or an 'L' value is set - if it's a local network IP. - - If two ORs list one another in their "provider" entries, - then OPs should treat them as a single OR for the purpose - of path selection. - - For example, if node A's descriptor contains "provider B", - and node B's descriptor contains "provider A", then node A - and node B should never be used on the same circuit. - - Add the regarding config option in torrc - - EnforceDistinctProviders set to 1 by default. - Permit building circuits with relays in the same provider - if set to 0. - Regarding to proposal 135 if TestingTorNetwork is set - need to be EnforceDistinctProviders is unset. 
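The "provider" value described in Step 1 ('ASxxxxxx', 'ANxxx', or 'L') could be derived along these lines. This is only a sketch under stated assumptions: the function name `provider_name` is hypothetical, the AS number is passed in by the caller rather than looked up, "local network IP" is interpreted as a private, loopback or link-local address, and the local check is given precedence over the AS number, which the proposal does not spell out.

```python
import ipaddress


def provider_name(ip, as_number=None):
    """Return the provider string for an IPv4 address.

    as_number is the already-resolved AS number, or None if unknown.
    """
    addr = ipaddress.IPv4Address(ip)
    # Assumption: "local network IP" means private, loopback or link-local.
    if addr.is_private or addr.is_loopback or addr.is_link_local:
        return "L"
    if as_number is not None:
        return f"AS{as_number}"
    # No AS number: fall back to the class A network number (first octet).
    return f"AN{int(addr) >> 24}"
```

With the example from the text, `provider_name("1.1.1.2")` yields `AN1`, and supplying an AS number such as 98304 yields `AS98304`.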
- - Control by Authorities Directories of the AS numbers - - The Directories Authority control the AS numbers of the new node - descriptor uploaded. - - If an old version is operated by the node this test is - bypassed. - - If AS number get by request is different from the - description, router is flagged as non-Valid by the testing - Authority for the voting process. - -Step 2 When a ' significant number of nodes' of valid routers are -generating descriptor with provider information. - - Add missing provider information get by DNS request -functionality for the circuit user: - - During circuit building, computing, OP apply first - family check and EnforceDistinctSubnets directives for - performance, then if provider info is needed and - missing in router descriptor try to get AS provider - info by DNS request [4]. This information could be - DNS cached. AN ( class A number) is never generated - during this process to prevent DNS block problems. If - DNS request fails ignore and continue building - circuit. - -Step 3 When the 'whole majority' of valid Tor clients are providing -DNS request. - - Older versions are deprecated and mark as no-Valid. - - EnforceDistinctProviders replace EnforceDistinctSubnets functionnality. - - EnforceDistinctSubnets is removed. - - Functionalities deployed in step 2 are removed. - -Security implications: - - This providermeasure will increase the number of providers - addresses that an attacker must use in order to carry out - traffic analysis. - -Compatibility: - - The presented protocol does not raise compatibility issues - with current Tor versions. The compatibility is preserved by - implementing this functionality in 3 steps, giving time to - network users to upgrade clients and routers. - -Performance and scalability notes: - - Provider change for all routers could reduce a little - performance if the circuit to long. 
- - During step 2 Get missing provider information could increase - building path time and should have a time out. - -Possible Attacks/Open Issues/Some thinking required: - - These proposal seems be compatible with proposal 135 Simplify - Configuration of Private Tor Networks. - - This proposal does not resolve multiples AS owners and top - providers traffic monitoring attacks [5]. - - Unresolved AS number are treated as a Class A network. Perhaps - should be marked as invalid. But there's only fives items on - last check see [2]. - - Need to define what's a 'significant number of nodes' and - 'whole majority' ;-) - -References: -[1] Location Diversity in Anonymity Networks by Nick Feamster and Roger -Dingledine. -In the Proceedings of the Workshop on Privacy in the Electronic Society -(WPES 2004), Washington, DC, USA, October 2004 -http://freehaven.net/anonbib/#feamster:wpes2004 -[2] http://as4jtw5gc6efb267.onion/IPListbyAS.txt -[3] see Goodell Tor Exit Page -http://cassandra.eecs.harvard.edu/cgi-bin/exit.py -[4] see the great IP to ASN DNS Tool -http://www.team-cymru.org/Services/ip-to-asn.html -[5] Sampled Traffic Analysis by Internet-Exchange-Level Adversaries by -Steven J. Murdoch and Piotr Zielinski. -In the Proceedings of the Seventh Workshop on Privacy Enhancing Technologies - -(PET 2007), Ottawa, Canada, June 2007. -http://freehaven.net/anonbib/#murdoch-pet2007 -[5] http://bugs.noreply.org/flyspray/index.php?do=details&id=690 diff --git a/doc/spec/proposals/145-newguard-flag.txt b/doc/spec/proposals/145-newguard-flag.txt deleted file mode 100644 index 9e61e30be9..0000000000 --- a/doc/spec/proposals/145-newguard-flag.txt +++ /dev/null @@ -1,39 +0,0 @@ -Filename: 145-newguard-flag.txt -Title: Separate "suitable as a guard" from "suitable as a new guard" -Author: Nick Mathewson -Created: 1-Jul-2008 -Status: Open -Target: 0.2.1.x - -[This could be obsoleted by proposal 141, which could replace NewGuard -with a Guard weight.] 
- -Overview - - Right now, Tor has one flag that clients use both to tell which - nodes should be kept as guards, and which nodes should be picked - when choosing new guards. This proposal separates this flag into - two. - -Motivation - - Balancing clients amoung guards is not done well by our current - algorithm. When a new guard appears, it is chosen by clients - looking for a new guard with the same probability as all existing - guards... but new guards are likelier to be under capacity, whereas - old guards are likelier to be under more use. - -Implementation - - We add a new flag, NewGuard. Clients will change so that when they - are choosing new guards, they only consider nodes with the NewGuard - flag set. - - For now, authorities will always set NewGuard if they are setting - the Guard flag. Later, it will be easy to migrate authorities to - set NewGuard for underused guards. - -Alternatives - - We might instead have authorities list weights with which nodes - should be picked as guards. diff --git a/doc/spec/proposals/146-long-term-stability.txt b/doc/spec/proposals/146-long-term-stability.txt deleted file mode 100644 index 9af0017441..0000000000 --- a/doc/spec/proposals/146-long-term-stability.txt +++ /dev/null @@ -1,84 +0,0 @@ -Filename: 146-long-term-stability.txt -Title: Add new flag to reflect long-term stability -Author: Nick Mathewson -Created: 19-Jun-2008 -Status: Open -Target: 0.2.1.x - -Overview - - This document proposes a new flag to indicate that a router has - existed at the same address for a long time, describes how to - implement it, and explains what it's good for. - -Motivation - - Tor has had three notions of "stability" for servers. Older - directory protocols based a server's stability on its - (self-reported) uptime: a server that had been running for a day was - more stable than a server that had been running for five minutes, - regardless of their past history. 
Current directory protocols track - weighted mean time between failure (WMTBF) and weighted fractional - uptime (WFU). WFU is computed as the fraction of time for which the - server is running, with measurements weighted to exponentially - decay such that old days count less. WMTBF is computed as the - average length of intervals for which the server runs between - downtime, with old intervals weighted to count less. - - WMTBF is useful in answering the question: "If a server is running - now, how long is it likely to stay running?" This makes it a good - choice for picking servers for streams that need to be long-lived. - WFU is useful in answering the question: "If I try connecting to - this server at an arbitrary time, is it likely to be running?" This - makes it an important factor for picking guard nodes, since we want - guard nodes to be usually-up. - - There are other questions that clients want to answer, however, for - which the current flags aren't very useful. The one that this - proposal addresses is, - - "If I found this server in an old consensus, is it likely to - still be running at the same address?" - - This one is useful when we're trying to find directory mirrors in a - fallback-consensus file. This property is equivalent to, - - "If I find this server in a current consensus, how long is it - likely to exist on the network?" - - This one is useful if we're trying to pick introduction points or - something and care more about churn rate than about whether every IP - will be up all the time. - -Implementation: - - I propose we add a new flag, called "Longterm." Authorities should - set this flag for routers if their Longevity is in the upper - quartile of all routers. A router's Longevity is computed as the - total amount of days in the last year or so[*] for which the router has - been Running at least once at its current IP:orport pair. - - Clients should use directory servers from a fallback-consensus only - if they have the Longterm flag set. 
- - Authority ops should be able to mark particular routers as not - Longterm, regardless of history. (For instance, it makes sense to - remove the Longterm flag from a router whose op says that it will - need to shutdown in a month.) - - [*] This is deliberately vague, to permit efficient implementations. - -Compatibility and migration issues: - - The voting protocol already acts gracefully when new flags are - added, so no change to the voting protocol is needed. - - Tor won't have collected this data, however. It might be desirable - to bootstrap it from historical consensuses. Alternatively, we can - just let the algorithm run for a month or two. - -Issues and future possibilities: - - Longterm is a really awkward name. - - diff --git a/doc/spec/proposals/147-prevoting-opinions.txt b/doc/spec/proposals/147-prevoting-opinions.txt deleted file mode 100644 index 3d9659c984..0000000000 --- a/doc/spec/proposals/147-prevoting-opinions.txt +++ /dev/null @@ -1,58 +0,0 @@ -Filename: 147-prevoting-opinions.txt -Title: Eliminate the need for v2 directories in generating v3 directories -Author: Nick Mathewson -Created: 2-Jul-2008 -Status: Accepted -Target: 0.2.1.x - -Overview - - We propose a new v3 vote document type to replace the role of v2 - networkstatus information in generating v3 consensuses. - -Motivation - - When authorities vote on which descriptors are to be listed in the - next consensus, it helps if they all know about the same descriptors - as one another. But a hostile, confused, or out-of-date server may - upload a descriptor to only some authorities. In the current v3 - directory design, the authorities don't have a good way to tell one - another about the new descriptor until they exchange votes... but by - the time this happens, they are already committed to their votes, - and they can't add anybody they learn about from other authorities - until the next voting cycle. That's no good! 
- - The current Tor implementation avoids this problem by having - authorities also look at v2 networkstatus documents, but we'd like - in the long term to eliminate these, once 0.1.2.x is obsolete. - -Design: - - We add a new value for vote-status in v3 consensus documents in - addition to "consensus" and "vote": "opinion". Authorities generate - and sign an opinion document as if they were generating a vote, - except that they generate opinions earlier than they generate votes. - - Authorities don't need to generate more than one opinion document - per voting interval, but may. They should send it to the other - authorities they know about, at the regular vote upload URL, before - the authorities begin voting, so that enough time remains for the - authorities to fetch new descriptors. - - Additionally, authories make their opinions available at - http://<hostname>/tor/status-vote/next/opinion.z - and download opinions from authorities they haven't heard from in a - while. - - Authorities MAY generate opinions on demand. - - Upon receiving an opinion document, authorities scan it for any - descriptors that: - - They might accept. - - Are for routers they don't know about, or are published more - recently than any descriptor they have for that router. - Authorities then begin downloading such descriptors from authorities - that claim to have them. - - Authorities MAY cache opinion documents, but don't need to. 
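The descriptor scan that proposal 147 describes above can be sketched roughly as follows. This is an illustrative Python sketch only, not Tor's implementation: the DescriptorEntry type, the descriptors_to_fetch function, and the might_accept callback are hypothetical names introduced here, and the proposal itself does not define how an authority decides that it "might accept" a descriptor.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class DescriptorEntry:
    """One router entry as listed in an opinion document (hypothetical shape)."""
    identity: str   # router identity fingerprint
    digest: str     # digest of the descriptor the opinion refers to
    published: int  # publication time, seconds since the epoch


def descriptors_to_fetch(
    opinion_entries: Iterable[DescriptorEntry],
    newest_known: Dict[str, int],
    might_accept: Callable[[DescriptorEntry], bool],
) -> List[DescriptorEntry]:
    """Select the entries whose descriptors we should start downloading.

    newest_known maps a router identity to the publication time of the
    newest descriptor we already hold for that router.
    """
    wanted = []
    for entry in opinion_entries:
        # Condition 1: skip anything we would reject anyway.
        if not might_accept(entry):
            continue
        have = newest_known.get(entry.identity)
        # Condition 2: unknown router, or newer than what we hold.
        if have is None or entry.published > have:
            wanted.append(entry)
    return wanted
```

The selected entries would then be requested from the authorities whose opinions listed them; that download step is left out of the sketch.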
- diff --git a/doc/spec/proposals/148-uniform-client-end-reason.txt b/doc/spec/proposals/148-uniform-client-end-reason.txt deleted file mode 100644 index 1db3b3e596..0000000000 --- a/doc/spec/proposals/148-uniform-client-end-reason.txt +++ /dev/null @@ -1,57 +0,0 @@ -Filename: 148-uniform-client-end-reason.txt -Title: Stream end reasons from the client side should be uniform -Author: Roger Dingledine -Created: 2-Jul-2008 -Status: Closed -Implemented-In: 0.2.1.9-alpha - -Overview - - When a stream closes before it's finished, the end relay cell that's - sent includes an "end stream reason" to tell the other end why it - closed. It's useful for the exit relay to send a reason to the client, - so the client can choose a different circuit, inform the user, etc. But - there's no reason to include it from the client to the exit relay, - and in some cases it can even harm anonymity. - - We should pick a single reason for the client-to-exit-relay direction - and always just send that. - -Motivation - - Back when I first deployed the Tor network, it was useful to have - the Tor relays learn why a stream closed, so I could debug both ends - of the stream at once. Now that streams have worked for many years, - there's no need to continue telling the exit relay whether the client - gave up on a stream because of "timeout" or "misc" or what. - - Then in Tor 0.2.0.28-rc, I fixed this bug: - - Fix a bug where, when we were choosing the 'end stream reason' to - put in our relay end cell that we send to the exit relay, Tor - clients on Windows were sometimes sending the wrong 'reason'. The - anonymity problem is that exit relays may be able to guess whether - the client is running Windows, thus helping partition the anonymity - set. Down the road we should stop sending reasons to exit relays, - or otherwise prevent future versions of this bug. 
- - It turned out that non-Windows clients were choosing their reason - correctly, whereas Windows clients were potentially looking at errno - wrong and so always choosing 'misc'. - - I fixed that particular bug, but I think we should prevent future - versions of the bug too. - - (We already fixed it so *circuit* end reasons don't get sent from - the client to the exit relay. But we appear to be have skipped over - stream end reasons thus far.) - -Design: - - One option would be to no longer include any 'reason' field in end - relay cells. But that would introduce a partitioning attack ("users - running the old version" vs "users running the new version"). - - Instead I suggest that clients all switch to sending the "misc" reason, - like most of the Windows clients currently do and like the non-Windows - clients already do sometimes. - diff --git a/doc/spec/proposals/149-using-netinfo-data.txt b/doc/spec/proposals/149-using-netinfo-data.txt deleted file mode 100644 index 8bf8375d5d..0000000000 --- a/doc/spec/proposals/149-using-netinfo-data.txt +++ /dev/null @@ -1,42 +0,0 @@ -Filename: 149-using-netinfo-data.txt -Title: Using data from NETINFO cells -Author: Nick Mathewson -Created: 2-Jul-2008 -Status: Open -Target: 0.2.1.x - -Overview - - Current Tor versions send signed IP and timestamp information in - NETINFO cells, but don't use them to their fullest. This proposal - describes how they should start using this info in 0.2.1.x. - -Motivation - - Our directory system relies on clients and routers having - reasonably accurate clocks to detect replayed directory info, and - to set accurate timestamps on directory info they publish - themselves. NETINFO cells contain timestamps. - - Also, the directory system relies on routers having a reasonable - idea of their own IP addresses, so they can publish correct - descriptors. This is also in NETINFO cells. - -Learning the time and IP address - - We need to think about attackers here. 
Just because a router tells - us that we have a given IP or a given clock skew doesn't mean that - it's true. We believe this information only if we've heard it from - a majority of the routers we've connected to recently, including at - least 3 routers. Routers only believe this information if the - majority includes at least one authority. - -Avoiding MITM attacks - - Current Tors use the IP addresses published in the other router's - NETINFO cells to see whether the connection is "canonical". Right - now, we prefer to extend circuits over "canonical" connections. In - 0.2.1.x, we should refuse to extend circuits over non-canonical - connections without first trying to build a canonical one. - - diff --git a/doc/spec/proposals/150-exclude-exit-nodes.txt b/doc/spec/proposals/150-exclude-exit-nodes.txt deleted file mode 100644 index b497ae62c1..0000000000 --- a/doc/spec/proposals/150-exclude-exit-nodes.txt +++ /dev/null @@ -1,47 +0,0 @@ -Filename: 150-exclude-exit-nodes.txt -Title: Exclude Exit Nodes from a circuit -Author: Mfr -Created: 2008-06-15 -Status: Closed -Implemented-In: 0.2.1.3-alpha - -Overview - - Right now, Tor users can manually exclude a node from all positions - in their circuits created using the directive ExcludeNodes. - This proposal makes this exclusion less restrictive, allowing users to - exclude a node only from the exit part of a circuit. - -Motivation - - This feature would Help the integration into vidalia (tor exit - branch) or other tools, of features to exclude a country for exit - without reducing circuits possibilities, and privacy. This feature - could help people from a country were many sites are blocked to - exclude this country for browsing, giving them a more stable - navigation. It could also add the possibility for the user to - exclude a currently used exit node. - -Implementation - - ExcludeExitNodes is similar to ExcludeNodes except it's only - the exit node which is excluded for circuit build. 
- - Tor doesn't warn if node from this list is not an exit node. - -Security implications: - - Open also possibilities for a future user bad exit reporting - -Risks: - - Use of this option can make users partitionable under certain attack - assumptions. However, ExitNodes already creates this possibility, - so there isn't much increased risk in ExcludeExitNodes. - - We should still encourage people who exclude an exit node because - of bad behavior to report it instead of just adding it to their - ExcludeExit list. It would be unfortunate if we didn't find out - about broken exits because of this option. This issue can probably - be addressed sufficiently with documentation. - diff --git a/doc/spec/proposals/151-path-selection-improvements.txt b/doc/spec/proposals/151-path-selection-improvements.txt deleted file mode 100644 index af89f21193..0000000000 --- a/doc/spec/proposals/151-path-selection-improvements.txt +++ /dev/null @@ -1,148 +0,0 @@ -Filename: 151-path-selection-improvements.txt -Title: Improving Tor Path Selection -Author: Fallon Chen, Mike Perry -Created: 5-Jul-2008 -Status: Finished -In-Spec: path-spec.txt - -Overview - - The performance of paths selected can be improved by adjusting the - CircuitBuildTimeout and avoiding failing guard nodes. This proposal - describes a method of tracking buildtime statistics at the client, and - using those statistics to adjust the CircuitBuildTimeout. - -Motivation - - Tor's performance can be improved by excluding those circuits that - have long buildtimes (and by extension, high latency). For those Tor - users who require better performance and have lower requirements for - anonymity, this would be a very useful option to have. - -Implementation - - Gathering Build Times - - Circuit build times are stored in the circular array - 'circuit_build_times' consisting of uint32_t elements as milliseconds. 
- The total size of this array is based on the number of circuits - it takes to converge on a good fit of the long term distribution of - the circuit builds for a fixed link. We do not want this value to be - too large, because it will make it difficult for clients to adapt to - moving between different links. - - From our observations, the minimum value for a reasonable fit appears - to be on the order of 500 (MIN_CIRCUITS_TO_OBSERVE). However, to keep - a good fit over the long term, we store 5000 most recent circuits in - the array (NCIRCUITS_TO_OBSERVE). - - The Tor client will build test circuits at a rate of one per - minute (BUILD_TIMES_TEST_FREQUENCY) up to the point of - MIN_CIRCUITS_TO_OBSERVE. This allows a fresh Tor to have - a CircuitBuildTimeout estimated within 8 hours after install, - upgrade, or network change (see below). - - Long Term Storage - - The long-term storage representation is implemented by storing a - histogram with BUILDTIME_BIN_WIDTH millisecond buckets (default 50) when - writing out the statistics to disk. The format this takes in the - state file is 'CircuitBuildTime <bin-ms> <count>', with the total - specified as 'TotalBuildTimes <total>' - Example: - - TotalBuildTimes 100 - CircuitBuildTimeBin 25 50 - CircuitBuildTimeBin 75 25 - CircuitBuildTimeBin 125 13 - ... - - Reading the histogram in will entail inserting <count> values - into the circuit_build_times array each with the value of - <bin-ms> milliseconds. In order to evenly distribute the values - in the circular array, the Fisher-Yates shuffle will be performed - after reading values from the bins. - - Learning the CircuitBuildTimeout - - Based on studies of build times, we found that the distribution of - circuit buildtimes appears to be a Frechet distribution. However, - estimators and quantile functions of the Frechet distribution are - difficult to work with and slow to converge. 
So instead, since we - are only interested in the accuracy of the tail, we approximate - the tail of the distribution with a Pareto curve starting at - the mode of the circuit build time sample set. - - We will calculate the parameters for a Pareto distribution - fitting the data using the estimators at - http://en.wikipedia.org/wiki/Pareto_distribution#Parameter_estimation. - - The timeout itself is calculated by using the Quartile function (the - inverted CDF) to give us the value on the CDF such that - BUILDTIME_PERCENT_CUTOFF (80%) of the mass of the distribution is - below the timeout value. - - Thus, we expect that the Tor client will accept the fastest 80% of - the total number of paths on the network. - - Detecting Changing Network Conditions - - We attempt to detect both network connectivity loss and drastic - changes in the timeout characteristics. - - We assume that we've had network connectivity loss if 3 circuits - timeout and we've received no cells or TLS handshakes since those - circuits began. We then set the timeout to 60 seconds and stop - counting timeouts. - - If 3 more circuits timeout and the network still has not been - live within this new 60 second timeout window, we then discard - the previous timeouts during this period from our history. - - To detect changing network conditions, we keep a history of - the timeout or non-timeout status of the past RECENT_CIRCUITS (20) - that successfully completed at least one hop. If more than 75% - of these circuits timeout, we discard all buildtimes history, - reset the timeout to 60, and then begin recomputing the timeout. - - Testing - - After circuit build times, storage, and learning are implemented, - the resulting histogram should be checked for consistency by - verifying it persists across successive Tor invocations where - no circuits are built. 
In addition, we can also use the existing - buildtime scripts to record build times, and verify that the histogram - the python produces matches that which is output to the state file in Tor, - and verify that the Pareto parameters and cutoff points also match. - - We will also verify that there are no unexpected large deviations from - node selection, such as nodes from distant geographical locations being - completely excluded. - - Dealing with Timeouts - - Timeouts should be counted as the expectation of the region of - of the Pareto distribution beyond the cutoff. This is done by - generating a random sample for each timeout at points on the - curve beyond the current timeout cutoff. - - Future Work - - At some point, it may be desirable to change the cutoff from a - single hard cutoff that destroys the circuit to a soft cutoff and - a hard cutoff, where the soft cutoff merely triggers the building - of a new circuit, and the hard cutoff triggers destruction of the - circuit. - - It may also be beneficial to learn separate timeouts for each - guard node, as they will have slightly different distributions. - This will take longer to generate initial values though. - -Issues - - Impact on anonymity - - Since this follows a Pareto distribution, large reductions on the - timeout can be achieved without cutting off a great number of the - total paths. This will eliminate a great deal of the performance - variation of Tor usage. 
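The timeout learning step of proposal 151 above can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the code Tor ships: the function names are hypothetical, the mode is taken as the midpoint of the fullest histogram bin, and samples below the mode are clamped to it before fitting. The proposal does not spell out either of those two choices. The shape estimate is the standard maximum-likelihood form alpha = n / sum(ln(x_i / x_m)), and the timeout is the Pareto quantile x_m / (1 - q)^(1/alpha).

```python
import math
from collections import Counter

BUILDTIME_BIN_WIDTH = 50        # histogram bucket width in ms (from the proposal)
BUILDTIME_PERCENT_CUTOFF = 0.8  # fraction of the distribution below the timeout


def histogram_mode(build_times_ms, bin_width=BUILDTIME_BIN_WIDTH):
    """Return the midpoint (ms) of the most populated histogram bin."""
    bins = Counter(int(t // bin_width) for t in build_times_ms)
    # Assumption: ties are broken toward the lower bin.
    best_bin = min(bins, key=lambda b: (-bins[b], b))
    return best_bin * bin_width + bin_width / 2


def pareto_alpha(build_times_ms, x_m):
    """Maximum-likelihood Pareto shape: alpha = n / sum(ln(x_i / x_m))."""
    # Assumption: samples below x_m are clamped to x_m so each log term is >= 0.
    clamped = [max(t, x_m) for t in build_times_ms]
    log_sum = sum(math.log(t / x_m) for t in clamped)
    if log_sum == 0:
        raise ValueError("no samples above x_m; cannot fit a tail")
    return len(clamped) / log_sum


def pareto_quantile(q, x_m, alpha):
    """Inverse CDF of a Pareto distribution with scale x_m and shape alpha."""
    return x_m / (1.0 - q) ** (1.0 / alpha)


def estimate_timeout_ms(build_times_ms, cutoff=BUILDTIME_PERCENT_CUTOFF):
    """Estimate a build timeout (ms) from recorded circuit build times."""
    x_m = histogram_mode(build_times_ms)
    alpha = pareto_alpha(build_times_ms, x_m)
    return pareto_quantile(cutoff, x_m, alpha)
```

Calling estimate_timeout_ms on the contents of the circuit_build_times array would give a candidate CircuitBuildTimeout in milliseconds. The sketch does not model the minimum sample count (MIN_CIRCUITS_TO_OBSERVE) or the handling of timed-out circuits.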
diff --git a/doc/spec/proposals/152-single-hop-circuits.txt b/doc/spec/proposals/152-single-hop-circuits.txt deleted file mode 100644 index d0b28b1c72..0000000000 --- a/doc/spec/proposals/152-single-hop-circuits.txt +++ /dev/null @@ -1,62 +0,0 @@ -Filename: 152-single-hop-circuits.txt -Title: Optionally allow exit from single-hop circuits -Author: Geoff Goodell -Created: 13-Jul-2008 -Status: Closed -Implemented-In: 0.2.1.6-alpha - -Overview - - Provide a special configuration option that adds a line to descriptors - indicating that a router can be used as an exit for one-hop circuits, - and allow clients to attach streams to one-hop circuits provided - that the descriptor for the router in the circuit includes this - configuration option. - -Motivation - - At some point, code was added to restrict the attachment of streams - to one-hop circuits. - - The idea seems to be that we can use the cost of forking and - maintaining a patch as a lever to prevent people from writing - controllers that jeopardize the operational security of routers - and the anonymity properties of the Tor network by creating and - using one-hop circuits rather than the standard three-hop circuits. - It may be, for example, that some users do not actually seek true - anonymity but simply reachability through network perspectives - afforded by the Tor network, and since anonymity is stronger in - numbers, forcing users to contribute to anonymity and decrease the - risk to server operators by using full-length paths may be reasonable. - - As presently implemented, the sweeping restriction of one-hop circuits - for all routers limits the usefulness of Tor as a general-purpose - technology for building circuits. In particular, we should allow - for controllers, such as Blossom, that create and use single-hop - circuits involving routers that are not part of the Tor network. 
- -Design - - Introduce a configuration option for Tor servers that, when set, - indicates that a router is willing to provide exit from one-hop - circuits. Routers with this policy will not require that a circuit - has at least two hops when it is used as an exit. - - In addition, routers for which this configuration option - has been set will have a line in their descriptors, "opt - exit-from-single-hop-circuits". Clients will keep track of which - routers have this option and allow streams to be attached to - single-hop circuits that include such routers. - -Security Considerations - - This approach seems to eliminate the worry about operational router - security, since server operators will not set the configuraiton - option unless they are willing to take on such risk. - - To reduce the impact on anonymity of the network resulting - from including such "risky" routers in regular Tor path - selection, clients may systematically exclude routers with "opt - exit-from-single-hop-circuits" when choosing random paths through - the Tor network. - diff --git a/doc/spec/proposals/153-automatic-software-update-protocol.txt b/doc/spec/proposals/153-automatic-software-update-protocol.txt deleted file mode 100644 index c2979bb695..0000000000 --- a/doc/spec/proposals/153-automatic-software-update-protocol.txt +++ /dev/null @@ -1,175 +0,0 @@ -Filename: 153-automatic-software-update-protocol.txt -Title: Automatic software update protocol -Author: Jacob Appelbaum -Created: 14-July-2008 -Status: Superseded - -[Superseded by thandy-spec.txt] - - - Automatic Software Update Protocol Proposal - -0.0 Introduction - -The Tor project and its users require a robust method to update shipped -software bundles. The software bundles often includes Vidalia, Privoxy, Polipo, -Torbutton and of course Tor itself. It is not inconcievable that an update -could include all of the Tor Browser Bundle. 
It seems reasonable to make this -a standalone program that can be called in shell scripts, cronjobs or by -various Tor controllers. - -0.1 Minimal Tasks To Implement Automatic Updating - -At the most minimal, an update must be able to do the following: - - 0 - Detect the curent Tor version, note the working status of Tor. - 1 - Detect the latest Tor version. - 2 - Fetch the latest version in the form of a platform specific package(s). - 3 - Verify the itegrity of the downloaded package(s). - 4 - Install the verified package(s). - 5 - Test that the new package(s) works properly. - -0.2 Specific Enumeration Of Minimal Tasks - -To implement requirement 0, we need to detect the current Tor version of both -the updater and the current running Tor. The update program itself should be -versioned internally. This requirement should also test connecting through Tor -itself and note if such connections are possible. - -To implement requirement 1, we need to learn the concensus from the directory -authorities or fail back to a known good URL with cryptographically signed -content. - -To implement requirement 2, we need to download Tor - hopefully over Tor. - -To implement requirement 3, we need to verify the package signature. - -To implement requirement 4, we need to use a platform specific method of -installation. The Tor controller performing the update perform these platform -specific methods. - -To implement requirement 5, we need to be able to extend circuits and reach -the internet through Tor. - -0.x Implementation Goals - -The update system will be cross platform and rely on as little external code -as possible. If the update system uses it, it must be updated by the update -system itself. It will consist only of free software and will not rely on any -non-free components until the actual installation phase. If a package manager -is in use, it will be platform specific and thus only invoked by the update -system implementing the update protocol. 
- -The update system itself will attempt to perform update related network -activity over Tor. Possibly it will attempt to use a hidden service first. -It will attempt to use novel and not so novel caching -when possible, it will always verify cryptographic signatures before any -remotely fetched code is executed. In the event of an unusable Tor system, -it will be able to attempt to fetch updates without Tor. This should be user -configurable, some users will be unwilling to update without the protection of -using Tor - others will simply be unable because of blocking of the main Tor -website. - -The update system will track current version numbers of Tor and supporting -software. The update system will also track known working versions to assist -with automatic The update system itself will be a standalone library. It will be -strongly versioned internally to match the Tor bundle it was shiped with. The -update system will keep track of the given platform, cpu architecture, lsb_release, -package management functionality and any other platform specific metadata. - -We have referenced two popular automatic update systems, though neither fit -our needs, both are useful as an idea of what others are doing in the same -area. - -The first is sparkle[0] but it is sadly only available for Cocoa -environments and is written in Objective C. This doesn't meet our requirements -because it is directly tied into the private Apple framework. - -The second is the Mozilla Automatic Update System[1]. It is possibly useful -as an idea of how other free software projects automatically update. It is -however not useful in its currently documented form. - - - [0] http://sparkle.andymatuschak.org/documentation/ - [1] http://wiki.mozilla.org/AUS:Manual - -0.x Previous methods of Tor and related software update - -Previously, Tor users updated their Tor related software by hand. There has -been no fully automatic method for any user to update. 
In addition, there -hasn't been any specific way to find out the most current stable version of Tor -or related software as voted on by the directory authority concensus. - -0.x Changes to the directory specification - -We will want to supplement client-versions and server-versions in the -concensus voting with another version identifier known as -'auto-update-versions'. This will keep track of the current concensus of -specific versions that are best per platform and per architecture. It should -be noted that while the Mac OS X universal binary may be the best for x86 -processers with Tiger, it may not be the best for PPC users on Panther. This -goes for all of the package updates. We want to prevent updates that cause Tor -to break even if the updating program can recover gracefully. - -x.x Assumptions About Operating System Package Management - -It is assumed that users will use their package manager unless they are on -Microsoft Windows (any version) or Mac OS X (any version). Microsoft Windows -users will have integration with the normal "add/remove program" functionality -that said users would expect. - -x.x Package Update System Failure Modes - -The package update will try to ensure that a user always has a working Tor at -the very least. It will keep state to remember versions of Tor that were able -to bootstrap properly and reach the rest of the Tor network. It will also keep -note of which versions broke. It will select the best Tor that works for the -user. It will also allow for anonymized bug reporting on the packages -available and tested by the auto-update system. - -x.x Package Signature Verification - -The update system will be aware of replay attacks against the update signature -system itself. It will not allow package update signatures that are radically -out of date. It will be a multi-key system to prevent any single party from -forging an update. The key will be updated regularly. This is like authority -key (see proposal 103) usage. 
- -x.x Package Caching - -The update system will iterate over different update methods. Whichever method -is picked will have caching functionality. Each Tor server itself should be -able to serve cached update files. This will be an option that friendly server -administrators can turn on should they wish to support caching. In addition, -it is possible to cache the full contents of a package in an -authoratative DNS zone. Users can then query the DNS zone for their package. -If we wish to further distribute the update load, we can also offer packages -with encrypted bittorrent. Clients who wish to share the updates but do not -wish to be a server can help distribute Tor updates. This can be tied together -with the DNS caching[2][3] if needed. - - [2] http://www.netrogenic.com/dnstorrent/ - [3] http://www.doxpara.com/ozymandns_src_0.1.tgz - -x.x Helping Our Users Spread Tor - -There should be a way for a user to participate in the packaging caching as -described in section x.x. This option should be presented by the Tor -controller. - -x.x Simple HTTP Proxy To The Tor Project Website - -It has been suggested that we should provide a simple proxy that allows a user -to visit the main Tor website to download packages. This was part of a -previous proposal and has not been closely examined. - -x.x Package Installation - -Platform specific methods for proper package installation will be left to the -controller that is calling for an update. Each platform is different, the -installation options and user interface will be specific to the controller in -question. - -x.x Other Things - -Other things should be added to this proposal. What are they? 
diff --git a/doc/spec/proposals/154-automatic-updates.txt b/doc/spec/proposals/154-automatic-updates.txt deleted file mode 100644 index 4c2c6d3899..0000000000 --- a/doc/spec/proposals/154-automatic-updates.txt +++ /dev/null @@ -1,377 +0,0 @@ -Filename: 154-automatic-updates.txt -Title: Automatic Software Update Protocol -Author: Matt Edman -Created: 30-July-2008 -Status: Superseded -Target: 0.2.1.x - -Superseded by thandy-spec.txt - -Scope - - This proposal specifies the method by which an automatic update client can - determine the most recent recommended Tor installation package for the - user's platform, download the package, and then verify that the package was - downloaded successfully. While this proposal focuses on only the Tor - software, the protocol defined is sufficiently extensible such that other - components of the Tor bundles, like Vidalia, Polipo, and Torbutton, can be - managed and updated by the automatic update client as well. - - The initial target platform for the automatic update framework is Windows, - given that's the platform used by a majority of our users and that it lacks - a sane package management system that many Linux distributions already have. - Our second target platform will be Mac OS X, and so the protocol will be - designed with this near-future direction in mind. - - Other client-side aspects of the automatic update process, such as user - interaction, the interface presented, and actual package installation - procedure, are outside the scope of this proposal. - - -Motivation - - Tor releases new versions frequently, often with important security, - anonymity, and stability fixes. Thus, it is important for users to be able - to promptly recognize when new versions are available and to easily - download, authenticate, and install updated Tor and Tor-related software - packages. 
- - Tor's control protocol [2] provides a method by which controllers can - identify when the user's Tor software is obsolete or otherwise no longer - recommended. Currently, however, no mechanism exists for clients to - automatically download and install updated Tor and Tor-related software for - the user. - - -Design Overview - - The core of the automatic update framework is a well-defined file called a - "recommended-packages" file. The recommended-packages file is accessible via - HTTP[S] at one or more well-defined URLs. An example recommended-packages - URL may be: - - https://updates.torproject.org/recommended-packages - - The recommended-packages document is formatted according to Section 1.2 - below and specifies the most recent recommended installation package - versions for Tor or Tor-related software, as well as URLs at which the - packages and their signatures can be downloaded. - - An automatic update client process runs on the Tor user's computer and - periodically retrieves the recommended-packages file according to the method - described in Section 2.0. As described further in Section 1.2, the - recommended-packages file is signed and can be verified by the automatic - update client with one or more public keys included in the client software. - Since it is signed, the recommended-packages file can be mirrored by - multiple hosts (e.g., Tor directory authorities), whose URLs are included in - the automatic update client's configuration. - - After retrieving and verifying the recommended-packages file, the automatic - update client compares the versions of the recommended software packages - listed in the file with those currently installed on the end-user's - computer. If one or more of the installed packages is determined to be out - of date, an updated package and its signature will be downloaded from one of - the package URLs listed in the recommended-packages file as described in - Section 2.2. 
- - The automatic update system uses a multilevel signing key scheme for package - signatures. There are a small number of entities we call "packaging - authorities" that each have their own signing key. A packaging authority is - responsible for signing and publishing the recommended-packages file. - Additionally, each individual packager responsible for producing an - installation package for one or more platforms has their own signing key. - Every packager's signing key must be signed by at least one of the packaging - authority keys. - - -Specification - - 1. recommended-packages Specification - - In this section we formally specify the format of the published - recommended-packages file. - - 1.1. Document Meta-format - - The recommended-packages document follows the lightweight extensible - information format defined in Tor's directory protocol specification [1]. In - the interest of self-containment, we have reproduced the relevant portions - of that format's specification in this Section. (Credits to Nick Mathewson - for much of the original format definition language.) - - The highest level object is a Document, which consists of one or more - Items. Every Item begins with a KeywordLine, followed by zero or more - Objects. A KeywordLine begins with a Keyword, optionally followed by - whitespace and more non-newline characters, and ends with a newline. A - Keyword is a sequence of one or more characters in the set [A-Za-z0-9-]. - An Object is a block of encoded data in pseudo-Open-PGP-style - armor. (cf. RFC 2440) - - More formally: - - Document ::= (Item | NL)+ - Item ::= KeywordLine Object* - KeywordLine ::= Keyword NL | Keyword WS ArgumentChar+ NL - Keyword ::= KeywordChar+ - KeywordChar ::= 'A' ... 'Z' | 'a' ... 'z' | '0' ... '9' | '-' - ArgumentChar ::= any printing ASCII character except NL. 
- WS ::= (SP | TAB)+ - Object ::= BeginLine Base-64-encoded-data EndLine - BeginLine ::= "-----BEGIN " Keyword "-----" NL - EndLine ::= "-----END " Keyword "-----" NL - - The BeginLine and EndLine of an Object must use the same keyword. - - In our Document description below, we also tag Items with a multiplicity in - brackets. Possible tags are: - - "At start, exactly once": These items MUST occur in every instance of the - document type, and MUST appear exactly once, and MUST be the first item in - their documents. - - "Exactly once": These items MUST occur exactly one time in every - instance of the document type. - - "Once or more": These items MUST occur at least once in any instance - of the document type, and MAY occur more than once. - - "At end, exactly once": These items MUST occur in every instance of - the document type, and MUST appear exactly once, and MUST be the - last item in their documents. - - 1.2. recommended-packages Document Format - - When interpreting a recommended-packages Document, software MUST ignore - any KeywordLine that starts with a keyword it doesn't recognize; future - implementations MUST NOT require current automatic update clients to - understand any KeywordLine not currently described. - - In lines that take multiple arguments, extra arguments SHOULD be - accepted and ignored. - - The currently defined Items contained in a recommended-packages document - are: - - "recommended-packages-format" SP number NL - - [Exactly once] - - This Item specifies the version of the recommended-packages format that - is contained in the subsequent document. The version defined in this - proposal is version "1". Subsequent iterations of this protocol MUST - increment this value if they introduce incompatible changes to the - document format and MAY increment this value if they only introduce - additional Keywords. 
- - "published" SP YYYY-MM-DD SP HH:MM:SS NL - - [Exactly once] - - The time, in GMT, when this recommended-packages document was generated. - Automatic update clients SHOULD ignore Documents over 60 days old. - - "tor-stable-win32-version" SP TorVersion NL - - [Exactly once] - - This keyword specifies the latest recommended release of Tor's "stable" - branch for the Windows platform that has an installation package - available. Note that this version does not necessarily correspond to the - most recently tagged stable Tor version, since that version may not yet - have an installer package available, or may have known issues on - Windows. - - The TorVersion field is formatted according to Section 2 of Tor's - version specification [3]. - - "tor-stable-win32-package" SP Url NL - - [Once or more] - - This Item specifies the location from which the most recent - recommended Windows installation package for Tor's stable branch can be - downloaded. - - When this Item appears multiple times within the Document, automatic - update clients SHOULD select randomly from the available package - mirrors. - - "tor-dev-win32-version" SP TorVersion NL - - [Exactly once] - - This Item specifies the latest recommended release of Tor's - "development" branch for the Windows platform that has an installation - package available. The same caveats from the description of - "tor-stable-win32-version" also apply to this keyword. - - The TorVersion field is formatted according to Section 2 of Tor's - version specification [3]. - - "tor-dev-win32-package" SP Url NL - - [Once or more] - - This Item specifies the location from which the most recent recommended - Windows installation package and its signature for Tor's development - branch can be downloaded. - - When this Keyword appears multiple times within the Document, automatic - update clients SHOULD select randomly from the available package - mirrors. 
- - "signature" NL SIGNATURE NL - - [At end, exactly once] - - The "SIGNATURE" Object contains a PGP signature (using a packaging - authority signing key) of the entire document, taken from the beginning - of the "recommended-packages-format" keyword, through the newline after - the "signature" Keyword. - - - 2. Automatic Update Client Behavior - - The client-side component of the automatic update framework is an - application that runs on the end-user's machine. It is responsible for - fetching and verifying a recommended-packages document, as well as - downloading, verifying, and subsequently installing any necessary updated - software packages. - - 2.1. Download and verify a recommended-packages document - - The first step in the automatic update process is for the client to download - a copy of the recommended-packages file. The automatic update client - contains a (hardcoded and/or user-configurable) list of URLs from which it - will attempt to retrieve a recommended-packages file. - - Connections to each of the recommended-packages URLs SHOULD be attempted in - the following order: - - 1) HTTPS over Tor - 2) HTTP over Tor - 3) Direct HTTPS - 4) Direct HTTP - - If the client fails to retrieve a recommended-packages document via any of - the above connection methods from any of the configured URLs, the client - SHOULD retry its download attempts following an exponential back-off - algorithm. After the first failed attempt, the client SHOULD delay one hour - before attempting again, up to a maximum of 24 hours delay between retry - attempts. - - After successfully downloading a recommended-packages file, the automatic - update client will verify the signature using one of the public keys - distributed with the client software. If more than one recommended-packages - file is downloaded and verified, the file with the most recent "published" - date that is verified will be retained and the rest discarded. - - 2.2. 
Download and verify the updated packages - - The automatic update client next compares the latest recommended package - version from the recommended-packages document with the currently installed - Tor version. If the user currently has installed a Tor version from Tor's - "development" branch, then the version specified in the "tor-dev-*-version" Item - is used for comparison. Similarly, if the user currently has installed a Tor - version from Tor's "stable" branch, then the version specified in the - "tor-stable-*-version" Item is used for comparison. Version comparisons are - done according to Tor's version specification [3]. - - If the automatic update client determines an installation package newer than - the user's currently installed version is available, it will attempt to - download a package appropriate for the user's platform and Tor branch from a - URL specified by a "tor-[branch]-[platform]-package" Item. If more than one - mirror for the selected package is available, a mirror will be chosen at - random from all those available. - - The automatic update client must also download a ".asc" signature file for - the retrieved package. The URL for the package signature is the same as that - for the package itself, except with the extension ".asc" appended to the - package URL. - - Connections to download the updated package and its signature SHOULD be - attempted in the same order described in Section 2.1. - - After completing the steps described in Sections 2.1 and 2.2, the automatic - update client will have downloaded and verified a copy of the latest Tor - installation package. It can then take whatever subsequent platform-specific - steps are necessary to install the downloaded software updates. - - 2.3. 
Periodic checking for updates - - The automatic update client SHOULD maintain a local state file in which it - records (at a minimum) the timestamp at which it last retrieved a - recommended-packages file and the timestamp at which the client last - successfully downloaded and installed a software update. - - Automatic update clients SHOULD check for an updated recommended-packages - document at most once per day but at least once every 30 days. - - - 3. Future Extensions - - There are several possible areas for future extensions of this framework. - The extensions below are merely suggestions and should be the subject of - their own proposal before being implemented. - - 3.1. Additional Software Updates - - There are several software packages often included in Tor bundles besides - Tor, such as Vidalia, Privoxy or Polipo, and Torbutton. The versions and - download locations of updated installation packages for these bundle - components can be easily added to the recommended-packages document - specification above. - - 3.2. Including ChangeLog Information - - It may be useful for automatic update clients to be able to display for - users a summary of the changes made in the latest Tor or Tor-related - software release, before the user chooses to install the update. In the - future, we can add keywords to the specification in Section 1.2 that specify - the location of a ChangeLog file for the latest recommended package - versions. It may also be desirable to allow localized ChangeLog information, - so that the automatic update client can fetch release notes in the - end-user's preferred language. - - 3.3. Weighted Package Mirror Selection - - We defined in Section 1.2 a method by which automatic update clients can - select from multiple available package mirrors. We may want to add a Weight - argument to the "*-package" Items that allows the recommended-packages file - to suggest to clients the probability with which a package mirror should be - chosen. 
This will allow clients to more appropriately distribute package - downloads across available mirrors proportional to their approximate - bandwidth. - - -Implementation - - Implementation of this proposal will consist of two separate components. - - The first component is a small "au-publish" tool that takes as input a - configuration file specifying the information described in Section 1.2 and a - private key. The tool is run by a "packaging authority" (someone responsible - for publishing updated installation packages), who will be prompted to enter - the passphrase for the private key used to sign the recommended-packages - document. The output of the tool is a document formatted according to - Section 1.2, with a signature appended at the end. The resulting document - can then be published to any of the update mirrors. - - The second component is an "au-client" tool that is run on the end-user's - machine. It periodically checks for updated installation packages according - to Section 2 and fetches the packages if necessary. The public keys used - to sign the recommended-packages file and any of the published packages are - included in the "au-client" tool. 
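As a rough illustration of two rules the "au-client" tool would follow, the sketch below computes the retry delay from Section 2.1 and the 60-day staleness check from Section 1.2. It is a minimal sketch under stated assumptions, not part of the proposal: the doubling factor is assumed (the text only says "exponential back-off", starting at one hour and capped at 24 hours), and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

#define FIRST_RETRY_DELAY_HOURS 1   /* Section 2.1: one hour after first failure */
#define MAX_RETRY_DELAY_HOURS   24  /* Section 2.1: cap between retry attempts */
#define MAX_DOCUMENT_AGE_DAYS   60  /* Section 1.2: ignore older documents */

/* Hypothetical helper: hours to wait after `failed_attempts` consecutive
 * failures. Assumes the delay doubles each time, which the proposal does
 * not specify. */
unsigned
retry_delay_hours(unsigned failed_attempts)
{
  assert(failed_attempts >= 1);
  unsigned delay = FIRST_RETRY_DELAY_HOURS;
  for (unsigned i = 1; i < failed_attempts; i++) {
    delay *= 2;
    if (delay >= MAX_RETRY_DELAY_HOURS)
      return MAX_RETRY_DELAY_HOURS;   /* stop early; also avoids overflow */
  }
  return delay;
}

/* Hypothetical helper: true if a document whose "published" time is
 * `published` is more than 60 days old at `now`. */
bool
document_is_stale(time_t published, time_t now)
{
  const double max_age = MAX_DOCUMENT_AGE_DAYS * 24.0 * 60.0 * 60.0;
  return difftime(now, published) > max_age;
}
```

Under the doubling assumption the delays run 1, 2, 4, 8 and 16 hours, then stay at 24 hours for every later attempt.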
- - -References - - [1] Tor directory protocol (version 3), - https://tor-svn.freehaven.net/svn/tor/trunk/doc/spec/dir-spec.txt - - [2] Tor control protocol (version 2), - https://tor-svn.freehaven.net/svn/tor/trunk/doc/spec/control-spec.txt - - [3] Tor version specification, - https://tor-svn.freehaven.net/svn/tor/trunk/doc/spec/version-spec.txt - diff --git a/doc/spec/proposals/155-four-hidden-service-improvements.txt b/doc/spec/proposals/155-four-hidden-service-improvements.txt deleted file mode 100644 index e342bf1c39..0000000000 --- a/doc/spec/proposals/155-four-hidden-service-improvements.txt +++ /dev/null @@ -1,120 +0,0 @@ -Filename: 155-four-hidden-service-improvements.txt -Title: Four Improvements of Hidden Service Performance -Author: Karsten Loesing, Christian Wilms -Created: 25-Sep-2008 -Status: Finished -Implemented-In: 0.2.1.x - -Change history: - - 25-Sep-2008 Initial proposal for or-dev - -Overview: - - A performance analysis of hidden services [1] has brought up a few - possible design changes to reduce advertisement time of a hidden service - in the network as well as connection establishment time. Some of these - design changes have side-effects on anonymity or overall network load - which had to be weighed up against individual performance gains. A - discussion of seven possible design changes [2] has led to a selection - of four changes [3] that are proposed to be implemented here. - -Design: - - 1. Shorter Circuit Extension Timeout - - When establishing a connection to a hidden service a client cannibalizes - an existing circuit and extends it by one hop to one of the service's - introduction points. In most cases this can be accomplished within a few - seconds. Therefore, the current timeout of 60 seconds for extending a - circuit is far too high. - - Assuming that the timeout would be reduced to a lower value, for example - 30 seconds, a second (or third) attempt to cannibalize and extend would - be started earlier. 
With the current timeout of 60 seconds, 93.42% of all - circuits can be established, whereas this fraction would have been only - 0.87% smaller at 92.55% with a timeout of 30 seconds. - - For a timeout of 30 seconds the performance gain would be approximately 2 - seconds in the mean as opposed to the current timeout of 60 seconds. At - the same time a smaller timeout leads to discarding an increasing number - of circuits that might have been completed within the current timeout of - 60 seconds. - - Measurements with simulated low-bandwidth connectivity have shown that - there is no significant effect of client connectivity on circuit - extension times. The reason for this might be that extension messages are - small and thereby independent of the client bandwidth. Further, the - connection between client and entry node only constitutes a single hop of - a circuit, so that its influence on the whole circuit is limited. - - The exact value of the new timeout does not necessarily have to be 30 - seconds, but might also depend on the results of circuit build timeout - measurements as described in proposal 151. - - 2. Parallel Connections to Introduction Points - - An additional approach to accelerate extension of introduction circuits - is to extend a second circuit in parallel to a different introduction - point. Such parallel extension attempts should be started after a short - delay of, e.g., 15 seconds in order to prevent unnecessary circuit - extensions and thereby save network resources. Whichever circuit - extension succeeds first is used for introduction, while the other - attempt is aborted. - - An evaluation has been performed for the more resource-intensive approach - of starting two parallel circuits immediately instead of waiting for a - short delay. The result was a reduction of connection establishment times - from 27.4 seconds in the original protocol to 22.5 seconds. 
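The timing of the delayed variant can be modelled with simple arithmetic. The sketch below is only a model under stated assumptions: each attempt's extension time is taken as known and measured from that attempt's own start, and the function name is hypothetical. The 15-second delay used in the usage note is the example value from the text above.

```c
#include <assert.h>

/* Hypothetical model: seconds until the first introduction circuit is
 * ready when a second extension attempt is launched `delay` seconds after
 * the first. `first` and `second` are each attempt's extension time,
 * measured from its own start. */
double
parallel_extend_time(double first, double second, double delay)
{
  assert(first >= 0 && second >= 0 && delay >= 0);
  if (first <= delay)
    return first;                 /* second attempt is never launched */
  double second_done = delay + second;
  return first < second_done ? first : second_done;
}
```

With a 15-second delay, a first attempt that would take 40 seconds and a second that takes 5 seconds give 20 seconds in this model. A first attempt that finishes within 15 seconds never triggers the second circuit, so no extra network resources are spent.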
- - While the effect of the proposed approach of delayed parallelization on - mean connection establishment times is expected to be smaller, - variability of connection attempt times can be reduced significantly. - - 3. Increase Count of Internal Circuits - - Hidden services need to create or cannibalize and extend a circuit to a - rendezvous point for every client request. Really popular hidden services - require more than two internal circuits in the pool to answer multiple - client requests at the same time. This scenario was not yet analyzed, but - will probably exhibit worse performance than measured in the previous - analysis. The number of preemptively built internal circuits should be a - function of connection requests in the past to adapt to changing needs. - Furthermore, an increased number of internal circuits on client side - would allow clients to establish connections to more than one hidden - service at a time. - - Under the assumption that a popular hidden service cannot make use of - cannibalization for connecting to rendezvous points, the circuit creation - time needs to be added to the current results. In the mean, the - connection establishment time to a popular hidden service would increase - by 4.7 seconds. - - 4. Build More Introduction Circuits - - When establishing introduction points, a hidden service should launch 5 - instead of 3 introduction circuits at the same time and use only the - first 3 that could be established. The remaining two circuits could still - be used for other purposes afterwards. - - The effect has been simulated using previously measured data, too. - Therefore, circuit establishment times were derived from log files and - written to an array. Afterwards, a simulation with 10,000 runs was - performed picking 5 (4, 6) random values and using the 3 lowest values in - contrast to picking only 3 values at random. 
The result is that the mean - time of the 3-out-of-3 approach is 8.1 seconds, while the mean time of - the 3-out-of-5 approach is 4.4 seconds. - - The effect on network load is minimal, because the hidden service can - reuse the slower internal circuits for other purposes, e.g., rendezvous - circuits. The only change is that a hidden service starts establishing - more circuits at once instead of subsequently doing so. - -References: - - [1] http://freehaven.net/~karsten/hidserv/perfanalysis-2008-06-15.pdf - - [2] http://freehaven.net/~karsten/hidserv/discussion-2008-07-15.pdf - - [3] http://freehaven.net/~karsten/hidserv/design-2008-08-15.pdf - diff --git a/doc/spec/proposals/156-tracking-blocked-ports.txt b/doc/spec/proposals/156-tracking-blocked-ports.txt deleted file mode 100644 index 419de7e74c..0000000000 --- a/doc/spec/proposals/156-tracking-blocked-ports.txt +++ /dev/null @@ -1,527 +0,0 @@ -Filename: 156-tracking-blocked-ports.txt -Title: Tracking blocked ports on the client side -Author: Robert Hogan -Created: 14-Oct-2008 -Status: Open -Target: 0.2.? - -Motivation: -Tor clients that are behind extremely restrictive firewalls can end up -waiting a while for their first successful OR connection to a node on the -network. Worse, the more restrictive their firewall the more susceptible -they are to an attacker guessing their entry nodes. Tor routers that -are behind extremely restrictive firewalls can only offer a limited, -'partitioned' service to other routers and clients on the network. Exit -nodes behind extremely restrictive firewalls may advertise ports that they -are actually not able to connect to, wasting network resources in circuit -constructions that are doomed to fail at the last hop on first use. - -Proposal: - -When a client attempts to connect to an entry guard it should avoid -further attempts on ports that fail once until it has connected to at -least one entry guard successfully. 
(Maybe it should wait for more than -one failure to reduce the skew on the first node selection.) Thereafter -it should select entry guards regardless of port and warn the user if -it observes that connections to a given port have failed every multiple -of 5 times without success or since the last success. - -Tor should warn the operators of exit, middleman and entry nodes if it -observes that connections to a given port have failed a multiple of 5 -times without success or since the last success. If attempts on a port -fail 20 or more times without or since success, Tor should add the port -to a 'blocked-ports' entry in its descriptor's extra-info. Some thought -needs to be given to what the authorities might do with this information. - -Related TODO item: - "- Automatically determine what ports are reachable and start using - those, if circuits aren't working and it's a pattern we - recognize ("port 443 worked once and port 9001 keeps not - working")." - - -I've had a go at implementing all of this in the attached. - -Addendum: -Just a note on the patch, storing the digest of each router that uses the port -is a bit of a memory hog, and its only real purpose is to provide a count of -routers using that port when warning the user. That could be achieved when -warning the user by iterating through the routerlist instead. 
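The thresholds described in the proposal text can be sketched as below. This is a minimal illustration, not the attached patch: the function names are hypothetical. Note that the patch defines PROBABLY_BLOCKED as 10 and uses it as the cut-off for the 'blocked-ports' entry, whereas the proposal text says 20.

```c
#include <stdbool.h>

#define WARN_EVERY_N_FAILURES 5   /* warn at each multiple of 5 failures */
#define BLOCKED_THRESHOLD     20  /* "20 or more times" in the proposal text */

/* Hypothetical helper: should the user or operator be warned, given the
 * number of failures on a port without success or since the last success? */
bool
port_should_warn(unsigned failures)
{
  return failures > 0 && failures % WARN_EVERY_N_FAILURES == 0;
}

/* Hypothetical helper: should the port be listed in the 'blocked-ports'
 * entry of the descriptor's extra-info? */
bool
port_is_blocked(unsigned failures)
{
  return failures >= BLOCKED_THRESHOLD;
}
```

A port with 5, 10 or 15 failures triggers a warning only. At 20 failures it both triggers a warning and qualifies for 'blocked-ports'.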
- -Index: src/or/connection_or.c -=================================================================== ---- src/or/connection_or.c (revision 17104) -+++ src/or/connection_or.c (working copy) -@@ -502,6 +502,9 @@ - connection_or_connect_failed(or_connection_t *conn, - int reason, const char *msg) - { -+ if ((reason == END_OR_CONN_REASON_NO_ROUTE) || -+ (reason == END_OR_CONN_REASON_REFUSED)) -+ or_port_hist_failure(conn->identity_digest,TO_CONN(conn)->port); - control_event_or_conn_status(conn, OR_CONN_EVENT_FAILED, reason); - if (!authdir_mode_tests_reachability(get_options())) - control_event_bootstrap_problem(msg, reason); -@@ -580,6 +583,7 @@ - /* already marked for close */ - return NULL; - } -+ - return conn; - } - -@@ -909,6 +913,7 @@ - control_event_or_conn_status(conn, OR_CONN_EVENT_CONNECTED, 0); - - if (started_here) { -+ or_port_hist_success(TO_CONN(conn)->port); - rep_hist_note_connect_succeeded(conn->identity_digest, now); - if (entry_guard_register_connect_status(conn->identity_digest, - 1, now) < 0) { -Index: src/or/rephist.c -=================================================================== ---- src/or/rephist.c (revision 17104) -+++ src/or/rephist.c (working copy) -@@ -18,6 +18,7 @@ - static void bw_arrays_init(void); - static void predicted_ports_init(void); - static void hs_usage_init(void); -+static void or_port_hist_init(void); - - /** Total number of bytes currently allocated in fields used by rephist.c. */ - uint64_t rephist_total_alloc=0; -@@ -89,6 +90,25 @@ - digestmap_t *link_history_map; - } or_history_t; - -+/** or_port_hist_t contains our router/client's knowledge of -+ all OR ports offered on the network, and how many servers with each port we -+ have succeeded or failed to connect to. */ -+typedef struct { -+ /** The port this entry is tracking. */ -+ uint16_t or_port; -+ /** Have we ever connected to this port on another OR?. */ -+ unsigned int success:1; -+ /** The ORs using this port. 
*/ -+ digestmap_t *ids; -+ /** The ORs using this port we have failed to connect to. */ -+ digestmap_t *failure_ids; -+ /** Are we excluding ORs with this port during entry selection?*/ -+ unsigned int excluded; -+} or_port_hist_t; -+ -+static unsigned int still_searching = 0; -+static smartlist_t *or_port_hists; -+ - /** When did we last multiply all routers' weighted_run_length and - * total_run_weights by STABILITY_ALPHA? */ - static time_t stability_last_downrated = 0; -@@ -164,6 +184,16 @@ - tor_free(hist); - } - -+/** Helper: free storage held by a single OR port history entry. */ -+static void -+or_port_hist_free(or_port_hist_t *p) -+{ -+ tor_assert(p); -+ digestmap_free(p->ids,NULL); -+ digestmap_free(p->failure_ids,NULL); -+ tor_free(p); -+} -+ - /** Update an or_history_t object <b>hist</b> so that its uptime/downtime - * count is up-to-date as of <b>when</b>. - */ -@@ -1639,7 +1669,7 @@ - tmp_time = smartlist_get(predicted_ports_times, i); - if (*tmp_time + PREDICTED_CIRCS_RELEVANCE_TIME < now) { - tmp_port = smartlist_get(predicted_ports_list, i); -- log_debug(LD_CIRC, "Expiring predicted port %d", *tmp_port); -+ log_debug(LD_HIST, "Expiring predicted port %d", *tmp_port); - smartlist_del(predicted_ports_list, i); - smartlist_del(predicted_ports_times, i); - rephist_total_alloc -= sizeof(uint16_t)+sizeof(time_t); -@@ -1821,6 +1851,12 @@ - tor_free(last_stability_doc); - built_last_stability_doc_at = 0; - predicted_ports_free(); -+ if (or_port_hists) { -+ SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, p, -+ or_port_hist_free(p)); -+ smartlist_free(or_port_hists); -+ or_port_hists = NULL; -+ } - } - - /****************** hidden service usage statistics ******************/ -@@ -2356,3 +2392,225 @@ - tor_free(fname); - } - -+/** Create a new entry in the port tracking cache for the or_port in -+ * <b>ri</b>. 
*/ -+void -+or_port_hist_new(const routerinfo_t *ri) -+{ -+ or_port_hist_t *result; -+ const char *id=ri->cache_info.identity_digest; -+ -+ if (!or_port_hists) -+ or_port_hist_init(); -+ -+ SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp, -+ { -+ /* Cope with routers that change their advertised OR port or are -+ dropped from the networkstatus. We don't discard the failures of -+ dropped routers because they are still valid when counting -+ consecutive failures on a port.*/ -+ if (digestmap_get(tp->ids, id) && (tp->or_port != ri->or_port)) { -+ digestmap_remove(tp->ids, id); -+ } -+ if (tp->or_port == ri->or_port) { -+ if (!(digestmap_get(tp->ids, id))) -+ digestmap_set(tp->ids, id, (void*)1); -+ return; -+ } -+ }); -+ -+ result = tor_malloc_zero(sizeof(or_port_hist_t)); -+ result->or_port=ri->or_port; -+ result->success=0; -+ result->ids=digestmap_new(); -+ digestmap_set(result->ids, id, (void*)1); -+ result->failure_ids=digestmap_new(); -+ result->excluded=0; -+ smartlist_add(or_port_hists, result); -+} -+ -+/** Create the port tracking cache. 
*/ -+/*XXX: need to call this when we rebuild/update our network status */ -+static void -+or_port_hist_init(void) -+{ -+ routerlist_t *rl = router_get_routerlist(); -+ -+ if (!or_port_hists) -+ or_port_hists=smartlist_create(); -+ -+ if (rl && rl->routers) { -+ SMARTLIST_FOREACH(rl->routers, routerinfo_t *, ri, -+ { -+ or_port_hist_new(ri); -+ }); -+ } -+} -+ -+#define NOT_BLOCKED 0 -+#define FAILURES_OBSERVED 1 -+#define POSSIBLY_BLOCKED 5 -+#define PROBABLY_BLOCKED 10 -+/** Return the list of blocked ports for our router's extra-info.*/ -+char * -+or_port_hist_get_blocked_ports(void) -+{ -+ char blocked_ports[2048]; -+ char *bp; -+ -+ tor_snprintf(blocked_ports,sizeof(blocked_ports),"blocked-ports"); -+ SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp, -+ { -+ if (digestmap_size(tp->failure_ids) >= PROBABLY_BLOCKED) -+ tor_snprintf(blocked_ports+strlen(blocked_ports), -+ sizeof(blocked_ports)," %u,",tp->or_port); -+ }); -+ if (strlen(blocked_ports) == 13) -+ return NULL; -+ bp=tor_strdup(blocked_ports); -+ bp[strlen(bp)-1]='\n'; -+ bp[strlen(bp)]='\0'; -+ return bp; -+} -+ -+/** Revert to client-only mode if we have seen to many failures on a port or -+ * range of ports.*/ -+static void -+or_port_hist_report_block(unsigned int min_severity) -+{ -+ or_options_t *options=get_options(); -+ char failures_observed[2048],possibly_blocked[2048],probably_blocked[2048]; -+ char port[1024]; -+ -+ memset(failures_observed,0,sizeof(failures_observed)); -+ memset(possibly_blocked,0,sizeof(possibly_blocked)); -+ memset(probably_blocked,0,sizeof(probably_blocked)); -+ -+ SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp, -+ { -+ unsigned int failures = digestmap_size(tp->failure_ids); -+ if (failures >= min_severity) { -+ tor_snprintf(port, sizeof(port), " %u (%u failures %s out of %u on the" -+ " network)",tp->or_port,failures, -+ (!tp->success)?"and no successes": "since last success", -+ digestmap_size(tp->ids)); -+ if (failures >= PROBABLY_BLOCKED) { -+ 
strlcat(probably_blocked, port, sizeof(probably_blocked)); -+ } else if (failures >= POSSIBLY_BLOCKED) -+ strlcat(possibly_blocked, port, sizeof(possibly_blocked)); -+ else if (failures >= FAILURES_OBSERVED) -+ strlcat(failures_observed, port, sizeof(failures_observed)); -+ } -+ }); -+ -+ log_warn(LD_HIST,"%s%s%s%s%s%s%s%s", -+ server_mode(options) && -+ ((min_severity==FAILURES_OBSERVED) || strlen(probably_blocked))? -+ "You should consider disabling your Tor server.":"", -+ (min_severity==FAILURES_OBSERVED)? -+ "Tor appears to be blocked from connecting to a range of ports " -+ "with the result that it cannot connect to one tenth of the Tor " -+ "network. ":"", -+ strlen(failures_observed)? -+ "Tor has observed failures on the following ports: ":"", -+ failures_observed, -+ strlen(possibly_blocked)? -+ "Tor is possibly blocked on the following ports: ":"", -+ possibly_blocked, -+ strlen(probably_blocked)? -+ "Tor is almost certainly blocked on the following ports: ":"", -+ probably_blocked); -+ -+} -+ -+/** Record the success of our connection to <b>digest</b>'s -+ * OR port. */ -+void -+or_port_hist_success(uint16_t or_port) -+{ -+ SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp, -+ { -+ if (tp->or_port != or_port) -+ continue; -+ /*Reset our failure stats so we can notice if this port ever gets -+ blocked again.*/ -+ tp->success=1; -+ if (digestmap_size(tp->failure_ids)) { -+ digestmap_free(tp->failure_ids,NULL); -+ tp->failure_ids=digestmap_new(); -+ } -+ if (still_searching) { -+ still_searching=0; -+ SMARTLIST_FOREACH(or_port_hists,or_port_hist_t *,t,t->excluded=0;); -+ } -+ return; -+ }); -+} -+/** Record the failure of our connection to <b>digest</b>'s -+ * OR port. Warn, exclude the port from future entry guard selection, or -+ * add port to blocked-ports in our server's extra-info as appropriate. 
*/ -+void -+or_port_hist_failure(const char *digest, uint16_t or_port) -+{ -+ int total_failures=0, ports_excluded=0, report_block=0; -+ int total_routers=smartlist_len(router_get_routerlist()->routers); -+ -+ SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp, -+ { -+ ports_excluded += tp->excluded; -+ total_failures+=digestmap_size(tp->failure_ids); -+ if (tp->or_port != or_port) -+ continue; -+ /* We're only interested in unique failures */ -+ if (digestmap_get(tp->failure_ids, digest)) -+ return; -+ -+ total_failures++; -+ digestmap_set(tp->failure_ids, digest, (void*)1); -+ if (still_searching && !tp->success) { -+ tp->excluded=1; -+ ports_excluded++; -+ } -+ if ((digestmap_size(tp->ids) >= POSSIBLY_BLOCKED) && -+ !(digestmap_size(tp->failure_ids) % POSSIBLY_BLOCKED)) -+ report_block=POSSIBLY_BLOCKED; -+ }); -+ -+ if (total_failures >= (int)(total_routers/10)) -+ or_port_hist_report_block(FAILURES_OBSERVED); -+ else if (report_block) -+ or_port_hist_report_block(report_block); -+ -+ if (ports_excluded >= smartlist_len(or_port_hists)) { -+ log_warn(LD_HIST,"During entry node selection Tor tried every port " -+ "offered on the network on at least one server " -+ "and didn't manage a single " -+ "successful connection. This suggests you are behind an " -+ "extremely restrictive firewall. 
Tor will keep trying to find " -+ "a reachable entry node."); -+ SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp, tp->excluded=0;); -+ } -+} -+ -+/** Add any ports marked as excluded in or_port_hist_t to <b>rt</b> */ -+void -+or_port_hist_exclude(routerset_t *rt) -+{ -+ SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp, -+ { -+ char portpolicy[9]; -+ if (tp->excluded) { -+ tor_snprintf(portpolicy,sizeof(portpolicy),"*:%u", tp->or_port); -+ log_warn(LD_HIST,"Port %u may be blocked, excluding it temporarily " -+ "from entry guard selection.", tp->or_port); -+ routerset_parse(rt, portpolicy, "Ports"); -+ } -+ }); -+} -+ -+/** Allow the exclusion of ports during our search for an entry node. */ -+void -+or_port_hist_search_again(void) -+{ -+ still_searching=1; -+} -Index: src/or/or.h -=================================================================== ---- src/or/or.h (revision 17104) -+++ src/or/or.h (working copy) -@@ -3864,6 +3864,13 @@ - int any_predicted_circuits(time_t now); - int rep_hist_circbuilding_dormant(time_t now); - -+void or_port_hist_failure(const char *digest, uint16_t or_port); -+void or_port_hist_success(uint16_t or_port); -+void or_port_hist_new(const routerinfo_t *ri); -+void or_port_hist_exclude(routerset_t *rt); -+void or_port_hist_search_again(void); -+char *or_port_hist_get_blocked_ports(void); -+ - /** Possible public/private key operations in Tor: used to keep track of where - * we're spending our time. 
*/ - typedef enum { -Index: src/or/routerparse.c -=================================================================== ---- src/or/routerparse.c (revision 17104) -+++ src/or/routerparse.c (working copy) -@@ -1401,6 +1401,8 @@ - goto err; - } - -+ or_port_hist_new(router); -+ - if (!router->platform) { - router->platform = tor_strdup("<unknown>"); - } -Index: src/or/router.c -=================================================================== ---- src/or/router.c (revision 17104) -+++ src/or/router.c (working copy) -@@ -1818,6 +1818,7 @@ - char published[ISO_TIME_LEN+1]; - char digest[DIGEST_LEN]; - char *bandwidth_usage; -+ char *blocked_ports; - int result; - size_t len; - -@@ -1825,7 +1826,6 @@ - extrainfo->cache_info.identity_digest, DIGEST_LEN); - format_iso_time(published, extrainfo->cache_info.published_on); - bandwidth_usage = rep_hist_get_bandwidth_lines(1); -- - result = tor_snprintf(s, maxlen, - "extra-info %s %s\n" - "published %s\n%s", -@@ -1835,6 +1835,16 @@ - if (result<0) - return -1; - -+ blocked_ports = or_port_hist_get_blocked_ports(); -+ if (blocked_ports) { -+ result = tor_snprintf(s+strlen(s), maxlen-strlen(s), -+ "%s", -+ blocked_ports); -+ tor_free(blocked_ports); -+ if (result<0) -+ return -1; -+ } -+ - if (should_record_bridge_info(options)) { - static time_t last_purged_at = 0; - char *geoip_summary; -Index: src/or/circuitbuild.c -=================================================================== ---- src/or/circuitbuild.c (revision 17104) -+++ src/or/circuitbuild.c (working copy) -@@ -62,6 +62,7 @@ - - static void entry_guards_changed(void); - static time_t start_of_month(time_t when); -+static int num_live_entry_guards(void); - - /** Iterate over values of circ_id, starting from conn-\>next_circ_id, - * and with the high bit specified by conn-\>circ_id_type, until we get -@@ -1627,12 +1628,14 @@ - smartlist_t *excluded; - or_options_t *options = get_options(); - router_crn_flags_t flags = 0; -+ routerset_t *_ExcludeNodes; - - if (state 
&& options->UseEntryGuards && - (purpose != CIRCUIT_PURPOSE_TESTING || options->BridgeRelay)) { - return choose_random_entry(state); - } - -+ _ExcludeNodes = routerset_new(); - excluded = smartlist_create(); - - if (state && (r = build_state_get_exit_router(state))) { -@@ -1670,12 +1673,18 @@ - if (options->_AllowInvalid & ALLOW_INVALID_ENTRY) - flags |= CRN_ALLOW_INVALID; - -+ if (options->ExcludeNodes) -+ routerset_union(_ExcludeNodes,options->ExcludeNodes); -+ -+ or_port_hist_exclude(_ExcludeNodes); -+ - choice = router_choose_random_node( - NULL, - excluded, -- options->ExcludeNodes, -+ _ExcludeNodes, - flags); - smartlist_free(excluded); -+ routerset_free(_ExcludeNodes); - return choice; - } - -@@ -2727,6 +2736,7 @@ - entry_guards_update_state(or_state_t *state) - { - config_line_t **next, *line; -+ unsigned int have_reachable_entry=0; - if (! entry_guards_dirty) - return; - -@@ -2740,6 +2750,7 @@ - char dbuf[HEX_DIGEST_LEN+1]; - if (!e->made_contact) - continue; /* don't write this one to disk */ -+ have_reachable_entry=1; - *next = line = tor_malloc_zero(sizeof(config_line_t)); - line->key = tor_strdup("EntryGuard"); - line->value = tor_malloc(HEX_DIGEST_LEN+MAX_NICKNAME_LEN+2); -@@ -2785,6 +2796,11 @@ - if (!get_options()->AvoidDiskWrites) - or_state_mark_dirty(get_or_state(), 0); - entry_guards_dirty = 0; -+ -+ /* XXX: Is this the place to decide that we no longer have any reachable -+ guards? 
*/ -+ if (!have_reachable_entry) -+ or_port_hist_search_again(); - } - - /** If <b>question</b> is the string "entry-guards", then dump - diff --git a/doc/spec/proposals/157-specific-cert-download.txt b/doc/spec/proposals/157-specific-cert-download.txt deleted file mode 100644 index 204b20973a..0000000000 --- a/doc/spec/proposals/157-specific-cert-download.txt +++ /dev/null @@ -1,102 +0,0 @@ -Filename: 157-specific-cert-download.txt -Title: Make certificate downloads specific -Author: Nick Mathewson -Created: 2-Dec-2008 -Status: Accepted -Target: 0.2.1.x - -History: - - 2008 Dec 2, 22:34 - Changed name of cross certification field to match the other authority - certificate fields. - -Status: - - As of 0.2.1.9-alpha: - Cross-certification is implemented for new certificates, but not yet - required. Directories support the tor/keys/fp-sk urls. - -Overview: - - Tor's directory specification gives two ways to download a certificate: - by its identity fingerprint, or by the digest of its signing key. Both - are error-prone. We propose a new download mechanism to make sure that - clients get the certificates they want. - -Motivation: - - When a client wants a certificate to verify a consensus, it has two choices - currently: - - Download by identity key fingerprint. In this case, the client risks - getting a certificate for the same authority, but with a different - signing key than the one used to sign the consensus. - - - Download by signing key fingerprint. In this case, the client risks - getting a forged certificate that contains the right signing key - signed with the wrong identity key. (Since caches are willing to - cache certs from authorities they do not themselves recognize, the - attacker wouldn't need to compromise an authority's key to do this.) - -Current solution: - - Clients fetch by identity keys, and re-fetch with backoff if they don't get - certs with the signing key they want. 
- -Proposed solution: - - Phase 1: Add a URL type for clients to download certs by identity _and_ - signing key fingerprint. Unless both fields match, the client doesn't - accept the certificate(s). Clients begin using this method when their - randomly chosen directory cache supports it. - - Phase 1A: Simultaneously, add a cross-certification element to - certificates. - - Phase 2: Once many directory caches support phase 1, clients should prefer - to fetch certificates using that protocol when available. - - Phase 2A: Once all authorities are generating cross-certified certificates - as in phase 1A, require cross-certification. - -Specification additions: - - The key certificate whose identity key fingerprint is <F> and whose signing - key fingerprint is <S> should be available at: - - http://<hostname>/tor/keys/fp-sk/<F>-<S>.z - - As usual, clients may request multiple certificates using: - - http://<hostname>/tor/keys/fp-sk/<F1>-<S1>+<F2>-<S2>.z - - Clients SHOULD use this format whenever they know both key fingerprints for - a desired certificate. - - - Certificates SHOULD contain the following field (at most once): - - "dir-key-crosscert" NL CrossSignature NL - - where CrossSignature is a signature, made using the certificate's signing - key, of the digest of the PKCS1-padded hash of the certificate's identity - key. For backward compatibility with broken versions of the parser, we - wrap the base64-encoded signature in -----BEGIN ID SIGNATURE---- and - -----END ID SIGNATURE----- tags. (See bug 880.) Implementations MUST allow - the "ID " portion to be omitted, however. - - When encountering a certificate with a dir-key-crosscert entry, - implementations MUST verify that the signature is a correct signature of - the hash of the identity key using the signing key. - - (In a future version of this specification, dir-key-crosscert entries will - be required.) - -Why cross-certify too? 
- - Cross-certification protects clients who haven't updated yet, by reducing - the number of caches that are willing to hold and serve bogus certificates. - -References: - - This is related to part 2 of bug 854. diff --git a/doc/spec/proposals/158-microdescriptors.txt b/doc/spec/proposals/158-microdescriptors.txt deleted file mode 100644 index e6966c0cef..0000000000 --- a/doc/spec/proposals/158-microdescriptors.txt +++ /dev/null @@ -1,198 +0,0 @@ -Filename: 158-microdescriptors.txt -Title: Clients download consensus + microdescriptors -Author: Roger Dingledine -Created: 17-Jan-2009 -Status: Open - -0. History - - 15 May 2009: Substantially revised based on discussions on or-dev - from late January. Removed the notion of voting on how to choose - microdescriptors; made it just a function of the consensus method. - (This lets us avoid the possibility of "desynchronization.") - Added suggestion to use a new consensus flavor. Specified use of - SHA256 for new hashes. -nickm - - 15 June 2009: Cleaned up based on comments from Roger. -nickm - -1. Overview - - This proposal replaces section 3.2 of proposal 141, which was - called "Fetching descriptors on demand". Rather than modifying the - circuit-building protocol to fetch a server descriptor inline at each - circuit extend, we instead put all of the information that clients need - either into the consensus itself, or into a new set of data about each - relay called a microdescriptor. - - Descriptor elements that are small and frequently changing should go - in the consensus itself, and descriptor elements that are small and - relatively static should go in the microdescriptor. If we ever end up - with descriptor elements that aren't small yet clients need to know - them, we'll need to resume considering some design like the one in - proposal 141. 
- - Note also that any descriptor element which clients need to use to - decide which servers to fetch info about, or which servers to fetch - info from, needs to stay in the consensus. - -2. Motivation - - See - http://archives.seul.org/or/dev/Nov-2008/msg00000.html and - http://archives.seul.org/or/dev/Nov-2008/msg00001.html and especially - http://archives.seul.org/or/dev/Nov-2008/msg00007.html - for a discussion of the options and why this is currently the best - approach. - -3. Design - - There are three pieces to the proposal. First, authorities will list in - their votes (and thus in the consensus) the expected hash of - microdescriptor for each relay. Second, authorities will serve - microdescriptors, directory mirrors will cache and serve - them. Third, clients will ask for them and cache them. - -3.1. Consensus changes - - If the authorities choose a consensus method of a given version or - later, a microdescriptor format is implicit in that version. - A microdescriptor should in every case be a pure function of the - router descriptor and the consensus method. - - In votes, we need to include the hash of each expected microdescriptor - in the routerstatus section. I suggest a new "m" line for each stanza, - with the base64 of the SHA256 hash of the router's microdescriptor. - - For every consensus method that an authority supports, it includes a - separate "m" line in each router section of its vote, containing: - "m" SP methods 1*(SP AlgorithmName "=" digest) NL - where methods is a comma-separated list of the consensus methods - that the authority believes will produce "digest". - - (As with base64 encoding of SHA1 hashes in consensuses, let's - omit the trailing =s) - - The consensus microdescriptor-elements and "m" lines are then computed - as described in Section 3.1.2 below. - - (This means we need a new consensus-method that knows - how to compute the microdescriptor-elements and add "m" lines.) 
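
As a rough illustration of the vote "m" line grammar given above ("m" SP methods 1*(SP AlgorithmName "=" digest) NL), here is a minimal sketch in C. It assumes the caller already holds the base64 text of the SHA256 digest; the function name format_m_line and its parameters are hypothetical and are not part of Tor's source. It handles a single AlgorithmName=digest pair and drops trailing "=" padding as the proposal suggests.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not a Tor function): write one vote "m" line of the
 * form  "m" SP methods SP AlgorithmName "=" digest NL  into out.
 * digest_b64 is the base64 text of the microdescriptor hash; any trailing
 * '=' padding is omitted, as the proposal suggests.
 * Returns 0 on success, -1 if the line does not fit in out. */
int format_m_line(char *out, size_t outlen, const char *methods,
                  const char *alg, const char *digest_b64)
{
  size_t dlen;
  int n;

  assert(out && methods && alg && digest_b64);

  /* Find the digest length without base64 padding. */
  dlen = strlen(digest_b64);
  while (dlen > 0 && digest_b64[dlen - 1] == '=')
    dlen--;

  n = snprintf(out, outlen, "m %s %s=%.*s\n",
               methods, alg, (int)dlen, digest_b64);
  if (n < 0 || (size_t)n >= outlen)
    return -1; /* encoding error or truncated output */
  return 0;
}
```

A caller would pass the comma-separated consensus methods as a string, for example "8,9", together with "sha256" and the encoded digest. Lines in the consensus itself would omit the methods list, per section 3.1.2.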
- - The microdescriptor consensus uses the directory-signature format from - proposal 162, with the "sha256" algorithm. - - -3.1.1. Descriptor elements to include for now - - In the first version, the microdescriptor should contain the - onion-key element, and the family element from the router descriptor, - and the exit policy summary as currently specified in dir-spec.txt. - -3.1.2. Computing consensus for microdescriptor-elements and "m" lines - - When we are generating a consensus, we use whichever m line - unambiguously corresponds to the descriptor digest that will be - included in the consensus. - - (If different votes have different microdescriptor digests for a - single <descriptor-digest, consensus-method> pair, then at least one - of the authorities is broken. If this happens, the consensus should - contain whichever microdescriptor digest is most common. If there is - no winner, we break ties in the favor of the lexically earliest. - Either way, we should log a warning: there is definitely a bug.) - - The "m" lines in a consensus contain only the digest, not a list of - consensus methods. - -3.1.3. A new flavor of consensus - - Rather than inserting "m" lines in the current consensus format, - they should be included in a new consensus flavor (see proposal - 162). - - This flavor can safely omit descriptor digests. - - When we implement this voting method, we can remove the exit policy - summary from the current "ns" flavor of consensus, since no current - clients use them, and they take up about 5% of the compressed - consensus. - - This new consensus flavor should be signed with the sha256 signature - format as documented in proposal 162. - -3.2. Directory mirrors fetch, cache, and serve microdescriptors - - Directory mirrors should fetch, catch, and serve each microdescriptor - from the authorities. (They need to continue to serve normal relay - descriptors too, to handle old clients.) 
- - The microdescriptors with base64 hashes <D1>,<D2>,<D3> should be - available at: - http://<hostname>/tor/micro/d/<D1>-<D2>-<D3>.z - (We use base64 for size and for consistency with the consensus - format. We use -s instead of +s to separate these items, since - the + character is used in base64 encoding.) - - All the microdescriptors from the current consensus should also be - available at: - http://<hostname>/tor/micro/all.z - so a client that's bootstrapping doesn't need to send a 70KB URL just - to name every microdescriptor it's looking for. - - Microdescriptors have no header or footer. - The hash of the microdescriptor is simply the hash of the concatenated - elements. - - Directory mirrors should check to make sure that the microdescriptors - they're about to serve match the right hashes (either the hashes from - the fetch URL or the hashes from the consensus, respectively). - - We will probably want to consider some sort of smart data structure to - be able to quickly convert microdescriptor hashes into the appropriate - microdescriptor. Clients will want this anyway when they load their - microdescriptor cache and want to match it up with the consensus to - see what's missing. - -3.3. Clients fetch them and cache them - - When a client gets a new consensus, it looks to see if there are any - microdescriptors it needs to learn. If it needs to learn more than - some threshold of the microdescriptors (half?), it requests 'all', - else it requests only the missing ones. Clients MAY try to - determine whether the upload bandwidth for listing the - microdescriptors they want is more or less than the download - bandwidth for the microdescriptors they do not want. - - Clients maintain a cache of microdescriptors along with metadata like - when it was last referenced by a consensus, and which identity key - it corresponds to. They keep a microdescriptor - until it hasn't been mentioned in any consensus for a week. 
Future - clients might cache them for longer or shorter times. - -3.3.1. Information leaks from clients - - If a client asks you for a set of microdescs, then you know she didn't - have them cached before. How much does that leak? What about when - we're all using our entry guards as directory guards, and we've seen - that user make a bunch of circuits already? - - Fetching "all" when you need at least half is a good first order fix, - but might not be all there is to it. - - Another future option would be to fetch some of the microdescriptors - anonymously (via a Tor circuit). - - Another crazy option (Roger's phrasing) is to do decoy fetches as - well. - -4. Transition and deployment - - Phase one, the directory authorities should start voting on - microdescriptors, and putting them in the consensus. - - Phase two, directory mirrors should learn how to serve them, and learn - how to read the consensus to find out what they should be serving. - - Phase three, clients should start fetching and caching them instead - of normal descriptors. - diff --git a/doc/spec/proposals/159-exit-scanning.txt b/doc/spec/proposals/159-exit-scanning.txt deleted file mode 100644 index 7090f2ed08..0000000000 --- a/doc/spec/proposals/159-exit-scanning.txt +++ /dev/null @@ -1,142 +0,0 @@ -Filename: 159-exit-scanning.txt -Title: Exit Scanning -Author: Mike Perry -Created: 13-Feb-2009 -Status: Open - -Overview: - -This proposal describes the implementation and integration of an -automated exit node scanner for scanning the Tor network for malicious, -misconfigured, firewalled or filtered nodes. - -Motivation: - -Tor exit nodes can be run by anyone with an Internet connection. Often, -these users aren't fully aware of limitations of their networking -setup. 
Content filters, antivirus software, advertisements injected by -their service providers, malicious upstream providers, and the resource -limitations of their computer or networking equipment have all been -observed on the current Tor network. - -It is also possible that some nodes exist purely for malicious -purposes. In the past, there have been intermittent instances of -nodes spoofing SSH keys, as well as nodes being used for purposes of -plaintext surveillance. - -While it is not realistic to expect to catch extremely targeted or -completely passive malicious adversaries, the goal is to prevent -malicious adversaries from deploying dragnet attacks against large -segments of the Tor userbase. - - -Scanning methodology: - -The first scans to be implemented are HTTP, HTML, Javascript, and -SSL scans. - -The HTTP scan scrapes Google for common filetype urls such as exe, msi, -doc, dmg, etc. It then fetches these urls through Non-Tor and Tor, and -compares the SHA1 hashes of the resulting content. - -The SSL scan downloads certificates for all IPs a domain will locally -resolve to and compares these certificates to those seen over Tor. The -scanner notes if a domain had rotated certificates locally in the -results for each scan. - -The HTML scan checks HTML, Javascript, and plugin content for -modifications. Because of the dynamic nature of most of the web, the -scanner has a number of mechanisms built in to filter out false -positives that are used when a change is noticed between Tor and -Non-Tor. - -All tests also share a URL-based false positive filter that -automatically removes results retroactively if the number of failures -exceeds a certain percentage of nodes tested with the URL. - - -Deployment Stages: - -To avoid instances where bugs cause us to mark exit nodes as BadExit -improperly, it is proposed that we begin use of the scanner in stages. - -1. 
Manual Review: - - In the first stage, basic scans will be run by a small number of - people while we stabilize the scanner. The scanner has the ability - to resume crashed scans, and to rescan nodes that fail various - tests. - -2. Human Review: - - In the second stage, results will be automatically mailed to - an email list of interested parties for review. We will also begin - classifying failure types into three to four different severity - levels, based on both the reliability of the test and the nature of - the failure. - -3. Automatic BadExit Marking: - - In the final stage, the scanner will begin marking exits depending - on the failure severity level in one of three different ways: by - node idhex, by node IP, or by node IP mask. A potential fourth, less - severe category of results may still be delivered via email only for - review. - - BadExit markings will be delivered in batches upon completion - of whole-network scans, so that the final false positive - filter has an opportunity to filter out URLs that exhibit - dynamic content beyond what we can filter. - - -Specification of Exit Marking: - -Technically, BadExit could be marked via SETCONF AuthDirBadExit over -the control port, but this would allow full access to the directory -authority configuration and operation. - -The approved-routers file could also be used, but currently it only -supports fingerprints, and it also contains other data unrelated to -exit scanning that would be difficult to coordinate. - -Instead, we propose that a new badexit-routers file that has three -keywords: - - BadExitNet 1*[exitpattern from 2.3 in dir-spec.txt] - BadExitFP 1*[hexdigest from 2.3 in dir-spec.txt] - -BadExitNet lines would follow the codepaths used by AuthDirBadExit to -set authdir_badexit_policy, and BadExitFP would follow the codepaths -from approved-router's !badexit lines. - -The scanner would have exclusive ability to write, append, rewrite, -and modify this file. 
Prior to building a new consensus vote, a -participating Tor authority would read in a fresh copy. - - -Security Implications: - -Aside from evading the scanner's detection, there are two additional -high-level security considerations: - -1. Ensure nodes cannot be marked BadExit by an adversary at will - -It is possible individual website owners will be able to target certain -Tor nodes, but once they begin to attempt to fail more than the URL -filter percentage of the exits, their sites will be automatically -discarded. - -Failing specific nodes is possible, but scanned results are fully -reproducible, and BadExits should be rare enough that humans are never -fully removed from the loop. - -State (cookies, cache, etc) does not otherwise persist in the scanner -between exit nodes to enable one exit node to bias the results of a -later one. - -2. Ensure that scanner compromise does not yield authority compromise - -Having a separate file that is under the exclusive control of the -scanner allows us to heavily isolate the scanner from the Tor -authority, potentially even running them on separate machines. - diff --git a/doc/spec/proposals/160-bandwidth-offset.txt b/doc/spec/proposals/160-bandwidth-offset.txt deleted file mode 100644 index 96935ade7d..0000000000 --- a/doc/spec/proposals/160-bandwidth-offset.txt +++ /dev/null @@ -1,105 +0,0 @@ -Filename: 160-bandwidth-offset.txt -Title: Authorities vote for bandwidth offsets in consensus -Author: Roger Dingledine -Created: 4-May-2009 -Status: Finished -Target: 0.2.2.x - -1. Motivation - - As part of proposal 141, we moved the bandwidth value for each relay - into the consensus. Now clients can know how they should load balance - even before they've fetched the corresponding relay descriptors. - - Putting the bandwidth in the consensus also lets the directory - authorities choose more accurate numbers to advertise, if we come up - with a better algorithm for deciding weightings. 
- - Our original plan was to teach directory authorities how to measure - bandwidth themselves; then every authority would vote for the bandwidth - it prefers, and we'd take the median of votes as usual. - - The problem comes when we have 7 authorities, and only a few of them - have smarter bandwidth allocation algorithms. So long as the majority - of them are voting for the number in the relay descriptor, the minority - that have better numbers will be ignored. - -2. Options - - One fix would be to demand that every authority also run the - new bandwidth measurement algorithms: in that case, part of the - responsibility of being an authority operator is that you need to run - this code too. But in practice we can't really require all current - authority operators to do that; and if we want to expand the set of - authority operators even further, it will become even more impractical. - Also, bandwidth testing adds load to the network, so we don't really - want to require that the number of concurrent bandwidth tests match - the number of authorities we have. - - The better fix is to allow certain authorities to specify that they are - voting on bandwidth measurements: more accurate bandwidth values that - have actually been evaluated. In this way, authorities can vote on - the median measured value if sufficient measured votes exist for a router, - and otherwise fall back to the median value taken from the published router - descriptors. - -3. Security implications - - If only some authorities choose to vote on an offset, then a majority of - those voting authorities can arbitrarily change the bandwidth weighting - for the relay. At the extreme, if there's only one offset-voting - authority, then that authority can dictate which relays clients will - find attractive. - - This problem isn't entirely new: we already have the worry wrt - the subset of authorities that vote for BadExit. 
- - To make it not so bad, we should deploy at least three offset-voting - authorities. - - Also, authorities that know how to vote for offsets should vote for - an offset of zero for new nodes, rather than choosing not to vote on - any offset in those cases. - -4. Design - - First, we need a new consensus method to support this new calculation. - - Now v3 votes can have an additional value on the "w" line: - "w Bandwidth=X Measured=" INT. - - Once we're using the new consensus method, the new way to compute the - Bandwidth weight is by checking if there are at least 3 "Measured" - votes. If so, the median of these is taken. Otherwise, the median - of the "Bandwidth=" values are taken, as described in Proposal 141. - - Then the actual consensus looks just the same as it did before, - so clients never have to know that this additional calculation is - happening. - -5. Implementation - - The Measured values will be read from a file provided by the scanners - described in proposal 161. Files with a timestamp older than 3 days - will be ignored. - - The file will be read in from dirserv_generate_networkstatus_vote_obj() - in a location specified by a new config option "V3MeasuredBandwidths". - A helper function will be called to populate new 'measured' and - 'has_measured' fields of the routerstatus_t 'routerstatuses' list with - values read from this file. - - An additional for_vote flag will be passed to - routerstatus_format_entry() from format_networkstatus_vote(), which will - indicate that the "Measured=" string should be appended to the "w Bandwith=" - line with the measured value in the struct. - - routerstatus_parse_entry_from_string() will be modified to parse the - "Measured=" lines into routerstatus_t struct fields. - - Finally, networkstatus_compute_consensus() will set rs_out.bandwidth - to the median of the measured values if there are more than 3, otherwise - it will use the bandwidth value median as normal. 
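
The selection rule from section 4 of proposal 160 can be sketched in C as follows. This is only an illustration under stated assumptions: the names consensus_bandwidth, median_u32, MIN_MEASURED_VOTES and MAX_VOTES are hypothetical and do not come from Tor's source. The threshold follows the "at least 3" wording of section 4, and for an even number of votes the sketch takes the lower of the two middle values, which is an assumption rather than something the proposal states.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MIN_MEASURED_VOTES 3   /* "at least 3" per section 4 (assumed) */
#define MAX_VOTES 32           /* hypothetical upper bound on votes */

/* qsort comparator for uint32_t values, ascending. */
static int cmp_u32(const void *a, const void *b)
{
  uint32_t x = *(const uint32_t *)a;
  uint32_t y = *(const uint32_t *)b;
  return (x > y) - (x < y);
}

/* Median of n values, 0 < n <= MAX_VOTES. For even n this returns the
 * lower middle value (an assumption of this sketch). */
static uint32_t median_u32(const uint32_t *vals, size_t n)
{
  uint32_t tmp[MAX_VOTES];
  assert(n > 0 && n <= MAX_VOTES);
  memcpy(tmp, vals, n * sizeof(uint32_t));
  qsort(tmp, n, sizeof(uint32_t), cmp_u32);
  return tmp[(n - 1) / 2];
}

/* Choose the consensus bandwidth for one relay: the median of the
 * "Measured=" votes if enough of them exist, otherwise the median of the
 * "Bandwidth=" values. */
uint32_t consensus_bandwidth(const uint32_t *measured, size_t n_measured,
                             const uint32_t *advertised, size_t n_advertised)
{
  if (n_measured >= MIN_MEASURED_VOTES)
    return median_u32(measured, n_measured);
  return median_u32(advertised, n_advertised);
}
```

With three measured votes of 100, 300 and 200 the result is 200 regardless of the advertised values. With only two measured votes the function falls back to the median of the advertised values, so clients reading the consensus see a single number either way.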
- - - diff --git a/doc/spec/proposals/161-computing-bandwidth-adjustments.txt b/doc/spec/proposals/161-computing-bandwidth-adjustments.txt deleted file mode 100644 index d219826668..0000000000 --- a/doc/spec/proposals/161-computing-bandwidth-adjustments.txt +++ /dev/null @@ -1,174 +0,0 @@ -Title: Computing Bandwidth Adjustments -Filename: 161-computing-bandwidth-adjustments.txt -Author: Mike Perry -Created: 12-May-2009 -Target: 0.2.2.x -Status: Finished - - -1. Motivation - - There is high variance in the performance of the Tor network. Despite - our efforts to balance load evenly across the Tor nodes, some nodes are - significantly slower and more overloaded than others. - - Proposal 160 describes how we can augment the directory authorities to - vote on measured bandwidths for routers. This proposal describes what - goes into the measuring process. - - -2. Measurement Selection - - The general idea is to determine a load factor representing the ratio - of the capacity of measured nodes to the rest of the network. This load - factor could be computed from three potentially relevant statistics: - circuit failure rates, circuit extend times, or stream capacity. - - Circuit failure rates and circuit extend times appear to be - non-linearly proportional to node load. We've observed that the same - nodes when scanned at US nighttime hours (when load is presumably - lower) exhibit almost no circuit failure, and significantly faster - extend times than when scanned during the day. - - Stream capacity, however, is much more uniform, even during US - nighttime hours. Moreover, it is a more intuitive representation of - node capacity, and also less dependent upon distance and latency - if amortized over large stream fetches. - - -3. Average Stream Bandwidth Calculation - - The average stream bandwidths are obtained by dividing the network into - slices of 50 nodes each, grouped according to advertised node bandwidth. 
- - Two hop circuits are built using nodes from the same slice, and a large - file is downloaded via these circuits. The file sizes are set based - on node percentile rank as follows: - - 0-10: 2M - 10-20: 1M - 20-30: 512k - 30-50: 256k - 50-100: 128k - - These sizes are based on measurements performed during test scans. - - This process is repeated until each node has been chosen to participate - in at least 5 circuits. - - -4. Ratio Calculation - - The ratios are calculated by dividing each measured value by the - network-wide average. - - -5. Ratio Filtering - - After the base ratios are calculated, a second pass is performed - to remove any streams with nodes of ratios less than X=0.5 from - the results of other nodes. In addition, all outlying streams - with capacity of one standard deviation below a node's average - are also removed. - - The final ratio result will be greater of the unfiltered ratio - and the filtered ratio. - - -6. Pseudocode for Ratio Calculation Algorithm - - Here is the complete pseudocode for the ratio algorithm: - - Slices = {S | S is 50 nodes of similar consensus capacity} - for S in Slices: - while exists node N in S with circ_chosen(N) < 7: - fetch_slice_file(build_2hop_circuit(N, (exit in S))) - for N in S: - BW_measured(N) = MEAN(b | b is bandwidth of a stream through N) - Bw_stddev(N) = STDDEV(b | b is bandwidth of a stream through N) - Bw_avg(S) = MEAN(b | b = BW_measured(N) for all N in S) - for N in S: - Normal_Streams(N) = {stream via N | bandwidth >= BW_measured(N)} - BW_Norm_measured(N) = MEAN(b | b is a bandwidth of Normal_Streams(N)) - - Bw_net_avg(Slices) = MEAN(BW_measured(N) for all N in Slices) - Bw_Norm_net_avg(Slices) = MEAN(BW_Norm_measured(N) for all N in Slices) - - for N in all Slices: - Bw_net_ratio(N) = Bw_measured(N)/Bw_net_avg(Slices) - Bw_Norm_net_ratio(N) = BW_Norm_measured(N)/Bw_Norm_net_avg(Slices) - - ResultRatio(N) = MAX(Bw_net_ratio(N), Bw_Norm_net_ratio(N)) - - -7. 
Security implications - - The ratio filtering will deal with cases of sabotage by dropping - both very slow outliers in stream average calculations, as well - as dropping streams that used very slow nodes from the calculation - of other nodes. - - This scheme will not address nodes that try to game the system by - providing better service to scanners. The scanners can be detected - at the entry by IP address, and at the exit by the destination fetch - IP. - - Measures can be taken to obfuscate and separate the scanners' source - IP address from the directory authority IP address. For instance, - scans can happen offsite and the results can be rsynced into the - authorities. The destination server IP can also change. - - Neither of these methods are foolproof, but such nodes can already - lie about their bandwidth to attract more traffic, so this solution - does not set us back any in that regard. - - -8. Parallelization - - Because each slice takes as long as 6 hours to complete, we will want - to parallelize as much as possible. This will be done by concurrently - running multiple scanners from each authority to deal with different - segments of the network. Each scanner piece will continually loop - over a portion of the network, outputting files of the form: - - node_id=<idhex> SP strm_bw=<BW_measured(N)> SP - filt_bw=<BW_Norm_measured(N)> ns_bw=<CurrentConsensusBw(N)> NL - - The most recent file from each scanner will be periodically gathered - by another script that uses them to produce network-wide averages - and calculate ratios as per the algorithm in section 6. Because nodes - may shift in capacity, they may appear in more than one slice and/or - appear more than once in the file set. The most recently measured - line will be chosen in this case. - - -9. 
Integration with Proposal 160 - - The final results will be produced for the voting mechanism - described in Proposal 160 by multiplying the derived ratio by - the average published consensus bandwidth during the course of the - scan, and taking the weighted average with the previous consensus - bandwidth: - - Bw_new = Round((Bw_current * Alpha + Bw_scan_avg*Bw_ratio)/(Alpha + 1)) - - The Alpha parameter is a smoothing parameter intended to prevent - rapid oscillation between loaded and unloaded conditions. It is - currently fixed at 0.333. - - The Round() step consists of rounding to the 3 most significant figures - in base10, and then rounding that result to the nearest 1000, with - a minimum value of 1000. - - This will produce a new bandwidth value that will be output into a - file consisting of lines of the form: - - node_id=<idhex> SP bw=<Bw_new> NL - - The first line of the file will contain a timestamp in UNIX time() - seconds. This will be used by the authority to decide if the - measured values are too old to use. - - This file can be either copied or rsynced into a directory readable - by the directory authority. - diff --git a/doc/spec/proposals/162-consensus-flavors.txt b/doc/spec/proposals/162-consensus-flavors.txt deleted file mode 100644 index e3b697afee..0000000000 --- a/doc/spec/proposals/162-consensus-flavors.txt +++ /dev/null @@ -1,188 +0,0 @@ -Filename: 162-consensus-flavors.txt -Title: Publish the consensus in multiple flavors -Author: Nick Mathewson -Created: 14-May-2009 -Target: 0.2.2 -Status: Open - -Overview: - - This proposal describes a way to publish each consensus in - multiple simultaneous formats, or "flavors". This will reduce the - amount of time needed to deploy new consensus-like documents, and - reduce the size of consensus documents in the long term. - -Motivation: - - In the future, we will almost surely want different fields and - data in the network-status document. 
Examples include: - - Publishing hashes of microdescriptors instead of hashes of - full descriptors (Proposal 158). - - Including different digests of descriptors, instead of the - perhaps-soon-to-be-totally-broken SHA1. - - Note that in both cases, from the client's point of view, this - information _replaces_ older information. If we're using a - SHA256 hash, we don't need to see the SHA1. If clients only want - microdescriptors, they don't (necessarily) need to see hashes of - other things. - - Our past approach to cases like this has been to shovel all of - the data into the consensus document. But this is rather poor - for bandwidth. Adding a single SHA256 hash to a consensus for - each router increases the compressed consensus size by 47%. In - comparison, replacing a single SHA1 hash with a SHA256 hash for - each listed router increases the consensus size by only 18%. - -Design in brief: - - Let the voting process remain as it is, until a consensus is - generated. With future versions of the voting algorithm, instead - of just a single consensus being generated, multiple consensus - "flavors" are produced. - - Consensuses (all of them) include a list of which flavors are - being generated. Caches fetch and serve all flavors of consensus - that are listed, regardless of whether they can parse or validate - them, and serve them to clients. Thus, once this design is in - place, we won't need to deploy more cache changes in order to get - new flavors of consensus to be cached. - - Clients download only the consensus flavor they want. - -A note on hashes: - - Everything in this document is specified to use SHA256, and to be - upgradeable to use better hashes in the future. - -Spec modifications: - - 1. URLs and changes to the current consensus format. - - Every consensus flavor has a name consisting of a sequence of one - or more alphanumeric characters and dashes. For compatibility - current descriptor flavor is called "ns". 
- - The supported consensus flavors are defined as part of the - authorities' consensus method. - - For each supported flavor, every authority calculates another - consensus document of as-yet-unspecified format, and exchanges - detached signatures for these documents as in the current consensus - design. - - In addition to the consensus currently served at - /tor/status-vote/(current|next)/consensus.z and - /tor/status-vote/(current|next)/consensus/<FP1>+<FP2>+<FP3>+....z , - authorities serve another consensus of each flavor "F" from the - locations /tor/status-vote/(current|next)/consensus-F.z. and - /tor/status-vote/(current|next)/consensus-F/<FP1>+....z. - - When caches serve these documents, they do so from the same - locations. - - 2. Document format: generic consensus. - - The format of a flavored consensus is as-yet-unspecified, except - that the first line is: - "network-status-version" SP version SP flavor NL - - where version is 3 or higher, and the flavor is a string - consisting of alphanumeric characters and dashes, matching the - corresponding flavor listed in the unflavored consensus. - - 3. Document format: detached signatures. - - We amend the detached signature format to include more than one - consensus-digest line, and more than one set of signatures. - - After the consensus-digest line, we allow more lines of the form: - "additional-digest" SP flavor SP algname SP digest NL - - Before the directory-signature lines, we allow more entries of the form: - "additional-signature" SP flavor SP algname SP identity SP - signing-key-digest NL signature. - - [We do not use "consensus-digest" or "directory-signature" for flavored - consensuses, since this could confuse older Tors.] - - The consensus-signatures URL should contain the signatures - for _all_ flavors of consensus. - - 4. The consensus index: - - Authorities additionally generate and serve a consensus-index - document. 
Its format is: - - Header ValidAfter ValidUntil Documents Signatures - - Header = "consensus-index" SP version NL - ValidAfter = as in a consensus - ValidUntil = as in a consensus - Documents = Document* - Document = "document" SP flavor SP SignedLength - 1*(SP AlgorithmName "=" Digest) NL - Signatures = Signature* - Signature = "directory-signature" SP algname SP identity - SP signing-key-digest NL signature - - There must be one Document line for each generated consensus flavor. - Each Document line describes the length of the signed portion of - a consensus (the signatures themselves are not included), along - with one or more digests of that signed portion. Digests are - given in hex. The algorithm "sha256" MUST be included; others - are allowed. - - The algname part of a signature describes what algorithm was - used to hash the identity and signing keys, and to compute the - signature. The algorithm "sha256" MUST be recognized; - signatures with unrecognized algorithms MUST be ignored. - (See below). - - The consensus index is made available at - /tor/status-vote/(current|next)/consensus-index.z. - - Caches should fetch this document so they can check the - correctness of the different consensus documents they fetch. - They do not need to check anything about an unrecognized - consensus document beyond its digest and length. - - 4.1. The "sha256" signature format. - - The 'SHA256' signature format for directory objects is defined as - the RSA signature of the OAEP+-padded SHA256 digest of the item to - be signed. When checking signatures, the signature MUST be treated - as valid if the signature material begins with SHA256(document); - this allows us to add other data later. - -Considerations: - - - We should not create a new flavor of consensus when adding a - field instead wouldn't be too onerous. - - - We should not proliferate flavors lightly: clients will be - distinguishable based on which flavor they download. 
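The "document" line grammar of the consensus index in proposal 162 can be read with a short parser. The Python sketch below is an illustration under stated assumptions: the function name parse_document_line and the flavor name "microdesc" used in the usage note are hypothetical, and the code checks only what the grammar states (a flavor, a signed length, one or more AlgorithmName=Digest pairs, and a mandatory "sha256" digest).

```python
def parse_document_line(line):
    """Parse one consensus-index "document" line.

    Returns (flavor, signed_length, digests), where digests maps
    algorithm names to hex digests. Raises ValueError on bad input.
    """
    parts = line.rstrip("\n").split(" ")
    # Need the keyword, a flavor, a length, and at least one digest.
    if len(parts) < 4 or parts[0] != "document":
        raise ValueError("not a document line")
    flavor = parts[1]
    signed_length = int(parts[2])
    digests = {}
    for item in parts[3:]:
        alg, sep, digest = item.partition("=")
        if not sep or not alg or not digest:
            raise ValueError("malformed digest entry: %r" % item)
        digests[alg] = digest.lower()
    # The proposal says the "sha256" digest MUST be included.
    if "sha256" not in digests:
        raise ValueError("missing sha256 digest")
    return flavor, signed_length, digests
```

For example, parse_document_line("document microdesc 1234 sha256=ab12\n") yields the flavor, the signed length 1234, and a one-entry digest map.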
- -Migration: - - - Stage one: authorities begin generating and serving - consensus-index files. - - - Stage two: Caches begin downloading consensus-index files, - validating them, and using them to decide what flavors of - consensus documents to cache. They download all listed - documents, and compare them to the digests given in the - consensus. - - - Stage three: Once we want to make a significant change to the - consensus format, we deploy another flavor of consensus at the - authorities. This will immediately start getting cached by the - caches, and clients can start fetching the new flavor without - waiting a version or two for enough caches to begin supporting - it. - -Acknowledgements: - - Aspects of this design and its applications to hash migration were - heavily influenced by IRC conversations with Marian. - diff --git a/doc/spec/proposals/163-detecting-clients.txt b/doc/spec/proposals/163-detecting-clients.txt deleted file mode 100644 index d838b17063..0000000000 --- a/doc/spec/proposals/163-detecting-clients.txt +++ /dev/null @@ -1,115 +0,0 @@ -Filename: 163-detecting-clients.txt -Title: Detecting whether a connection comes from a client -Author: Nick Mathewson -Created: 22-May-2009 -Target: 0.2.2 -Status: Open - - -Overview: - - Some aspects of Tor's design require relays to distinguish - connections from clients from connections that come from relays. - The existing means for doing this is easy to spoof. We propose - a better approach. - -Motivation: - - There are at least two reasons for which Tor servers want to tell - which connections come from clients and which come from other - servers: - - 1) Some exits, proposal 152 notwithstanding, want to disallow - their use as single-hop proxies. - 2) Some performance-related proposals involve prioritizing - traffic from relays, or limiting traffic per client (but not - per relay). - - Right now, we detect client vs server status based on how the - client opens circuits. 
(Check out the code that implements the - AllowSingleHopExits option if you want all the details.) This - method is depressingly easy to fake, though. This document - proposes better means. - -Goals: - - To make grabbing relay privileges at least as difficult as just - running a relay. - - In the analysis below, "using server privileges" means taking any - action that only servers are supposed to do, like delivering a - BEGIN cell to an exit node that doesn't allow single hop exits, - or claiming server-like amounts of bandwidth. - -Passive detection: - - A connection is definitely a client connection if it takes one of - the TLS methods during setup that does not establish an identity - key. - - A circuit is definitely a client circuit if it is initiated with - a CREATE_FAST cell, though the node could be a client or a server. - - A node that's listed in a recent consensus is probably a server. - - A node to which we have successfully extended circuits from - multiple origins is probably a server. - -Active detection: - - If a node doesn't try to use server privileges at all, we never - need to care whether it's a server. - - When a node or circuit tries to use server privileges, if it is - "definitely a client" as per above, we can refuse it immediately. - - If it's "probably a server" as per above, we can accept it. - - Otherwise, we have either a client, or a server that is neither - listed in any consensus or used by any other clients -- in other - words, a new or private server. - - For these servers, we should attempt to build one or more test - circuits through them. If enough of the circuits succeed, the - node is a real relay. If not, it is probably a client. - - While we are waiting for the test circuits to succeed, we should - allow a short grace period in which server privileges are - permitted. When a test is done, we should remember its outcome - for a while, so we don't need to do it again. 
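The passive and active detection rules of proposal 163 can be combined into one decision, sketched here in Python. This is not Tor code: the PeerInfo fields, the Verdict names, and the function on_server_privilege_request are hypothetical, chosen only to mirror the rules stated in the two sections above.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    REFUSE = "refuse"  # definitely a client
    ACCEPT = "accept"  # probably a server
    TEST = "test"      # unknown: build test circuits, allow a grace period


@dataclass
class PeerInfo:
    tls_has_identity_key: bool            # TLS setup established an identity key
    used_create_fast: bool                # circuit was initiated with CREATE_FAST
    in_recent_consensus: bool             # node is listed in a recent consensus
    extended_from_multiple_origins: bool  # we extended to it from several origins


def on_server_privilege_request(peer):
    """Decide what to do when a peer tries to use server privileges."""
    # "Definitely a client" cases are refused immediately.
    if not peer.tls_has_identity_key or peer.used_create_fast:
        return Verdict.REFUSE
    # "Probably a server" cases are accepted.
    if peer.in_recent_consensus or peer.extended_from_multiple_origins:
        return Verdict.ACCEPT
    # Otherwise this is a client or a new/private server: test it.
    return Verdict.TEST
```

A caller would cache the outcome of the TEST branch for a while, as the text suggests, so that the test circuits are not rebuilt on every request.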
- -Why it's hard to do good testing: - - Doing a test circuit starting with an unlisted router requires - only that we have an open connection for it. Doing a test - circuit starting elsewhere _through_ an unlisted router--though - more reliable--would require that we have a known address, port, - identity key, and onion key for the router. Only the address and - identity key are easily available via the current Tor protocol in - all cases. - - We could fix this part by requiring that all servers support - BEGIN_DIR and support downloading at least a current descriptor - for themselves. - -Open questions: - - What are the thresholds for the needed numbers of circuits - for us to decide that a node is a relay? - - [Suggested answer: two circuits from two distinct hosts.] - - How do we pick grace periods? How long do we remember the - outcome of a test? - - [Suggested answer: 10 minute grace period; 48 hour memory of - test outcomes.] - - If we can build circuits starting at a suspect node, but we don't - have enough information to try extending circuits elsewhere - through the node, should we conclude that the node is - "server-like" or not? - - [Suggested answer: for now, just try making circuits through - the node. Extend this to extending circuits as needed.] - diff --git a/doc/spec/proposals/164-reporting-server-status.txt b/doc/spec/proposals/164-reporting-server-status.txt deleted file mode 100644 index 705f5f1a84..0000000000 --- a/doc/spec/proposals/164-reporting-server-status.txt +++ /dev/null @@ -1,91 +0,0 @@ -Filename: 164-reporting-server-status.txt -Title: Reporting the status of server votes -Author: Nick Mathewson -Created: 22-May-2009 -Target: 0.2.2 -Status: Open - - -Overview: - - When a given node isn't listed in the directory, it isn't always easy - to tell why. This proposal suggests a quick-and-dirty way for - authorities to export not only how they voted, but why, and a way to - collate the information. 
- -Motivation: - - Right now, if you want to know the reason why your server was listed - a certain way in the Tor directory, the following steps are - recommended: - - - Look through your log for reports of what the authority said - when you tried to upload. - - - Look at the consensus; see if you're listed. - - - Wait a while, see if things get better. - - - Download the votes from all the authorities, and see how they - voted. Try to figure out why. - - - If you think they'll listen to you, ask some authority - operators to look you up in their mtbf files and logs to see - why they voted as they did. - - This is far too hard. - -Solution: - - We should add a new vote-like information-only document that - authorities serve on request. Call it a "vote info". It is - generated at the same time as a vote, but used only for - determining why a server voted as it did. It is served from - /tor/status-vote-info/current/authority[.z] - - It differs from a vote in that: - - * Its vote-status field is 'vote-info'. - - * It includes routers that the authority would not include - in its vote. - - For these, it includes an "omitted" line with an English - message explaining why they were omitted. - - * For each router, it includes a line describing its WFU and - MTBF. The format is: - - "stability <mtbf> up-since='date'" - "uptime <wfu> down-since='date'" - - * It describes the WFU and MTBF thresholds it requires to - vote for a given router in various roles in the header. - The format is: - - "flag-requirement <flag-name> <field> <op> <value>" - - e.g. - - "flag-requirement Guard uptime > 80" - - * It includes info on routers all of whose descriptors that - were uploaded but rejected over the past few hours. The - "r" lines for these are the same as for regular routers. - The other lines are omitted for these routers, and are - replaced with a single "rejected" line, explaining (in - English) why the router was rejected. 
- - - A status site (like Torweather or Torstatus or another - tool) can poll these files when they are generated, collate - the data, and make it available to server operators. - -Risks: - - This document makes no provisions for caching these "vote - info" documents. If many people wind up fetching them - aggressively from the authorities, that would be bad. - - - diff --git a/doc/spec/proposals/165-simple-robust-voting.txt b/doc/spec/proposals/165-simple-robust-voting.txt deleted file mode 100644 index f813285a83..0000000000 --- a/doc/spec/proposals/165-simple-robust-voting.txt +++ /dev/null @@ -1,133 +0,0 @@ -Filename: 165-simple-robust-voting.txt -Title: Easy migration for voting authority sets -Author: Nick Mathewson -Created: 2009-05-28 -Status: Open - -Overview: - - This proposal describes any easy-to-implement, easy-to-verify way to - change the set of authorities without creating a "flag day" situation. - -Motivation: - - From proposal 134 ("More robust consensus voting with diverse - authority sets") by Peter Palfrader: - - Right now there are about five authoritative directory servers - in the Tor network, tho this number is expected to rise to about - 15 eventually. - - Adding a new authority requires synchronized action from all - operators of directory authorities so that at any time during the - update at least half of all authorities are running and agree on - who is an authority. The latter requirement is there so that the - authorities can arrive at a common consensus: Each authority - builds the consensus based on the votes from all authorities it - recognizes, and so a different set of recognized authorities will - lead to a different consensus document. - - In response to this problem, proposal 134 suggested that every - candidate authority list in its vote whom it believes to be an - authority. These A-says-B-is-an-authority relationships form a - directed graph. 
Each authority then iteratively finds the largest - clique in the graph and removes it, until it finds one containing - itself. It votes with this clique. - - Proposal 134 had some problems: - - - It had a security problem in that M hostile authorities in a - clique could effectively kick out M-1 honest authorities. This - could enable a minority of the original authorities to take over. - - - It was too complex in its implications to analyze well: it took us - over a year to realize that it was insecure. - - - It tried to solve a bigger problem: general fragmentation of - authority trust. Really, all we wanted to have was the ability to - add and remove authorities without forcing a flag day. - -Proposed protocol design: - - A "Voting Set" is a set of authorities. Each authority has a list of - the voting sets it considers acceptable. These sets are chosen - manually by the authority operators. They must always contain the - authority itself. Each authority lists all of these voting sets in - its votes. - - Authorities exchange votes with every other authority in any of their - voting sets. - - When it is time to calculate a consensus, an authority votes with - whichever voting set it lists that is listed by the most members of - that set. In other words, given two sets S1 and S2 that an authority - lists, that authority will prefer to vote with S1 over S2 whenever - the number of other authorities in S1 that themselves list S1 is - higher than the number of other authorities in S2 that themselves - list S2. - - For example, suppose authority A recognizes two sets, "A B C D" and - "A E F G H". Suppose that the first set is recognized by all of A, - B, C, and D, whereas the second set is recognized only by A, E, and - F. Because the first set is recognized by more of the authorities in - it than the other one, A will vote with the first set. - - Ties are broken in favor of some arbitrary function of the identity - keys of the authorities in the set. 
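The voting-set selection rule of proposal 165 can be sketched in Python as shown here. The function name choose_voting_set and the data layout are hypothetical. The proposal only says that ties are broken by "some arbitrary function" of the identity keys, so the tie-break used below (comparing the sorted member identifiers) is an assumption made for this example.

```python
def choose_voting_set(me, my_sets, listed_by):
    """Pick the voting set that authority `me` should vote with.

    my_sets:   the voting sets (frozensets of authority ids) that `me`
               lists; each one contains `me`.
    listed_by: maps an authority id to the collection of voting sets
               that authority lists in its own vote.
    """
    def support(voting_set):
        # Count the *other* members of the set that list this same set.
        return sum(
            1
            for authority in voting_set
            if authority != me and voting_set in listed_by.get(authority, ())
        )

    # Highest support wins; ties fall back to an arbitrary but
    # deterministic ordering of the member ids (an assumption).
    return max(my_sets, key=lambda s: (support(s), tuple(sorted(s))))


# The example from the text: A lists "A B C D" and "A E F G H".
s1 = frozenset("ABCD")
s2 = frozenset("AEFGH")
listed_by = {
    "B": {s1}, "C": {s1}, "D": {s1},  # all of B, C, D list the first set
    "E": {s2}, "F": {s2},             # only E and F list the second set
    "G": set(), "H": set(),
}
chosen = choose_voting_set("A", [s1, s2], listed_by)
```

Here the first set has three other supporters and the second has two, so A votes with "A B C D", matching the example in the text.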
- -How to migrate authority sets: - - In steady state, each authority operator should list only the current - actual voting set as accepted. - - When we want to add an authority, each authority operator configures - his or her server to list two voting sets: one containing all the old - authorities, and one containing the old authorities and the new - authority too. Once all authorities are listing the new set of - authorities, they will start voting with that set because of its - size. - - What if one or two authority operators are slow to list the new set? - Then the other operators can stop listing the old set once there are - enough authorities listing the new set to make its voting successful. - (Note that these authorities not listing the new set will still have - their votes counted, since they themselves will be members of the new - set. They will only fail to sign the consensus generated by the - other authorities who are using the new set.) - - When we want to remove an authority, the operators list two voting - sets: one containing all the authorities, and one omitting the - authority we want to remove. Once enough authorities list the new - set as acceptable, we start having authority operators stop listing - the old set. Once there are more listing the new set than the old - set, the new set will win. - -Data format changes: - - Add a new 'voting-set' line to the vote document format. Allow it to - occur any number of times. Its format is: - - voting-set SP 'fingerprint' SP 'fingerprint' ... NL - - where each fingerprint is the hex fingerprint of an identity key of - an authority. Sort fingerprints in ascending order. - - When the consensus method is at least 'X' (decide this when we - implement the proposal), add this line to the consensus format as - well, before the first dir-source line. 
[This information is not - redundant with the dir-source sections in the consensus: If an - authority is recognized but didn't vote, that authority will appear in - the voting-set line but not in the dir-source sections.] - - We don't need to list other information about authorities in our - vote. - -Migration issues: - - We should keep track somewhere of which Tor client versions - recognized which authorities. - -Acknowledgments: - - The design came out of an IRC conversation with Peter Palfrader. He - had the basic idea first. diff --git a/doc/spec/proposals/166-statistics-extra-info-docs.txt b/doc/spec/proposals/166-statistics-extra-info-docs.txt deleted file mode 100644 index ab2716a71c..0000000000 --- a/doc/spec/proposals/166-statistics-extra-info-docs.txt +++ /dev/null @@ -1,391 +0,0 @@ -Filename: 166-statistics-extra-info-docs.txt -Title: Including Network Statistics in Extra-Info Documents -Author: Karsten Loesing -Created: 21-Jul-2009 -Target: 0.2.2 -Status: Accepted - -Change history: - - 21-Jul-2009 Initial proposal for or-dev - - -Overview: - - The Tor network has grown to almost two thousand relays and millions - of casual users over the past few years. With growth has come - increasing performance problems and attempts by some countries to - block access to the Tor network. In order to address these problems, - we need to learn more about the Tor network. This proposal suggests to - measure additional statistics and include them in extra-info documents - to help us understand the Tor network better. - - -Introduction: - - As of May 2009, relays, bridges, and directories gather the following - data for statistical purposes: - - - Relays and bridges count the number of bytes that they have pushed - in 15-minute intervals over the past 24 hours. Relays and bridges - include these data in extra-info documents that they send to the - directory authorities whenever they publish their server descriptor. 
- - - Bridges further include a rough number of clients per country that - they have seen in the past 48 hours in their extra-info documents. - - - Directories can be configured to count the number of clients they - see per country in the past 24 hours and to write them to a local - file. - - Since then we extended the network statistics in Tor. These statistics - include: - - - Directories now gather more precise statistics about connecting - clients. Fixes include measuring in intervals of exactly 24 hours, - counting unsuccessful requests, measuring download times, etc. The - directories append their statistics to a local file every 24 hours. - - - Entry guards count the number of clients per country per day like - bridges do and write them to a local file every 24 hours. - - - Relays measure statistics of the number of cells in their circuit - queues and how much time these cells spend waiting there. Relays - write these statistics to a local file every 24 hours. - - - Exit nodes count the number of read and written bytes on exit - connections per port as well as the number of opened exit streams - per port in 24-hour intervals. Exit nodes write their statistics to - a local file. - - The following four sections contain descriptions for adding these - statistics to the relays' extra-info documents. - - -Directory request statistics: - - The first type of statistics aims at measuring directory requests sent - by clients to a directory mirror or directory authority. More - precisely, these statistics aim at requests for v2 and v3 network - statuses only. These directory requests are sent non-anonymously, - either via HTTP-like requests to a directory's Dir port or tunneled - over a 1-hop circuit. - - Measuring directory request statistics is useful for several reasons: - First, the number of locally seen directory requests can be used to - estimate the total number of clients in the Tor network. 
Second, the - country-wise classification of requests using a GeoIP database can - help counting the relative and absolute number of users per country. - Third, the download times can give hints on the available bandwidth - capacity at clients. - - Directory requests do not give any hints on the contents that clients - send or receive over the Tor network. Every client requests network - statuses from the directories, so that there are no anonymity-related - concerns to gather these statistics. It might be, though, that clients - wish to hide the fact that they are connecting to the Tor network. - Therefore, IP addresses are resolved to country codes in memory, - events are accumulated over 24 hours, and numbers are rounded up to - multiples of 4 or 8. - - "dirreq-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL - [At most once.] - - YYYY-MM-DD HH:MM:SS defines the end of the included measurement - interval of length NSEC seconds (86400 seconds by default). - - A "dirreq-stats-end" line, as well as any other "dirreq-*" line, - is only added when the relay has opened its Dir port and after 24 - hours of measuring directory requests. - - "dirreq-v2-ips" CC=N,CC=N,... NL - [At most once.] - "dirreq-v3-ips" CC=N,CC=N,... NL - [At most once.] - - List of mappings from two-letter country codes to the number of - unique IP addresses that have connected from that country to - request a v2/v3 network status, rounded up to the nearest multiple - of 8. Only those IP addresses are counted that the directory can - answer with a 200 OK status code. - - "dirreq-v2-reqs" CC=N,CC=N,... NL - [At most once.] - "dirreq-v3-reqs" CC=N,CC=N,... NL - [At most once.] - - List of mappings from two-letter country codes to the number of - requests for v2/v3 network statuses from that country, rounded up - to the nearest multiple of 8. Only those requests are counted that - the directory can answer with a 200 OK status code. - - "dirreq-v2-share" num% NL - [At most once.] 
- "dirreq-v3-share" num% NL - [At most once.] - - The share of v2/v3 network status requests that the directory - expects to receive from clients based on its advertised bandwidth - compared to the overall network bandwidth capacity. Shares are - formatted in percent with two decimal places. Shares are - calculated as means over the whole 24-hour interval. - - "dirreq-v2-resp" status=num,... NL - [At most once.] - "dirreq-v3-resp" status=num,... NL - [At most once.] - - List of mappings from response statuses to the number of requests - for v2/v3 network statuses that were answered with that response - status, rounded up to the nearest multiple of 4. Only response - statuses with at least 1 response are reported. New response - statuses can be added at any time. The current list of response - statuses is as follows: - - "ok": a network status request is answered; this number - corresponds to the sum of all requests as reported in - "dirreq-v2-reqs" or "dirreq-v3-reqs", respectively, before - rounding up. - "not-enough-sigs": a version 3 network status is not signed by a - sufficient number of requested authorities. - "unavailable": a requested network status object is unavailable. - "not-found": a requested network status is not found. - "not-modified": a network status has not been modified since the - If-Modified-Since time that is included in the request. - "busy": the directory is busy. - - "dirreq-v2-direct-dl" key=val,... NL - [At most once.] - "dirreq-v3-direct-dl" key=val,... NL - [At most once.] - "dirreq-v2-tunneled-dl" key=val,... NL - [At most once.] - "dirreq-v3-tunneled-dl" key=val,... NL - [At most once.] - - List of statistics about possible failures in the download process - of v2/v3 network statuses. Requests are either "direct" - HTTP-encoded requests over the relay's directory port, or - "tunneled" requests using a BEGIN_DIR cell over the relay's OR - port. 
The list of possible statistics can change, and statistics - can be left out from reporting. The current list of statistics is - as follows: - - Successful downloads and failures: - - "complete": a client has finished the download successfully. - "timeout": a download did not finish within 10 minutes after - starting to send the response. - "running": a download is still running at the end of the - measurement period for less than 10 minutes after starting to - send the response. - - Download times: - - "min", "max": smallest and largest measured bandwidth in B/s. - "d[1-4,6-9]": 1st to 4th and 6th to 9th decile of measured - bandwidth in B/s. For a given decile i, i/10 of all downloads - had a smaller bandwidth than di, and (10-i)/10 of all downloads - had a larger bandwidth than di. - "q[1,3]": 1st and 3rd quartile of measured bandwidth in B/s. One - fourth of all downloads had a smaller bandwidth than q1, one - fourth of all downloads had a larger bandwidth than q3, and the - remaining half of all downloads had a bandwidth between q1 and - q3. - "md": median of measured bandwidth in B/s. Half of the downloads - had a smaller bandwidth than md, the other half had a larger - bandwidth than md. - - -Entry guard statistics: - - Entry guard statistics include the number of clients per country and - per day that are connecting directly to an entry guard. - - Entry guard statistics are important to learn more about the - distribution of clients to countries. In the future, this knowledge - can be useful to detect if there are or start to be any restrictions - for clients connecting from specific countries. - - The information which client connects to a given entry guard is very - sensitive. This information must not be combined with the information - what contents are leaving the network at the exit nodes. Therefore, - entry guard statistics need to be aggregated to prevent them from - becoming useful for de-anonymization. 
Aggregation includes resolving - IP addresses to country codes, counting events over 24-hour intervals, - and rounding up numbers to the next multiple of 8. - - "entry-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL - [At most once.] - - YYYY-MM-DD HH:MM:SS defines the end of the included measurement - interval of length NSEC seconds (86400 seconds by default). - - An "entry-stats-end" line, as well as any other "entry-*" - line, is first added after the relay has been running for at least - 24 hours. - - "entry-ips" CC=N,CC=N,... NL - [At most once.] - - List of mappings from two-letter country codes to the number of - unique IP addresses that have connected from that country to the - relay and which are no known other relays, rounded up to the - nearest multiple of 8. - - -Cell statistics: - - The third type of statistics have to do with the time that cells spend - in circuit queues. In order to gather these statistics, the relay - memorizes when it puts a given cell in a circuit queue and when this - cell is flushed. The relay further notes the life time of the circuit. - These data are sufficient to determine the mean number of cells in a - queue over time and the mean time that cells spend in a queue. - - Cell statistics are necessary to learn more about possible reasons for - the poor network performance of the Tor network, especially high - latencies. The same statistics are also useful to determine the - effects of design changes by comparing today's data with future data. - - There are basically no privacy concerns from measuring cell - statistics, regardless of a node being an entry, middle, or exit node. - - "cell-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL - [At most once.] - - YYYY-MM-DD HH:MM:SS defines the end of the included measurement - interval of length NSEC seconds (86400 seconds by default). - - A "cell-stats-end" line, as well as any other "cell-*" line, - is first added after the relay has been running for at least 24 - hours. 
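The "*-stats-end" lines above all share one shape: a keyword, the end of the measurement interval, and the interval length in seconds. A minimal sketch of reading such a line follows. The function name parse_stats_end and the regular expression are illustrative assumptions, not part of Tor or of the proposal, and the timestamp is treated as a naive datetime.

```python
import re
from datetime import datetime, timedelta

# Hypothetical pattern (an assumption, not taken from Tor's code) for lines like
#   cell-stats-end 2010-03-01 12:00:00 (86400 s)
STATS_END_RE = re.compile(
    r'^(?P<keyword>[a-z]+-stats-end) '
    r'(?P<end>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) '
    r'\((?P<nsec>\d+) s\)$'
)


def parse_stats_end(line):
    """Return (keyword, interval_start, interval_end) for a *-stats-end line."""
    match = STATS_END_RE.match(line.strip())
    if match is None:
        raise ValueError('not a *-stats-end line: %r' % line)
    # The timestamp marks the END of the interval; NSEC is its length.
    end = datetime.strptime(match.group('end'), '%Y-%m-%d %H:%M:%S')
    nsec = int(match.group('nsec'))
    return match.group('keyword'), end - timedelta(seconds=nsec), end
```

With the default interval of 86400 seconds, the returned start is exactly one day before the reported end.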
- - "cell-processed-cells" num,...,num NL - [At most once.] - - Mean number of processed cells per circuit, subdivided into - deciles of circuits by the number of cells they have processed in - descending order from loudest to quietest circuits. - - "cell-queued-cells" num,...,num NL - [At most once.] - - Mean number of cells contained in queues by circuit decile. These - means are calculated by 1) determining the mean number of cells in - a single circuit between its creation and its termination and 2) - calculating the mean for all circuits in a given decile as - determined in "cell-processed-cells". Numbers have a precision of - two decimal places. - - "cell-time-in-queue" num,...,num NL - [At most once.] - - Mean time cells spend in circuit queues in milliseconds. Times are - calculated by 1) determining the mean time cells spend in the - queue of a single circuit and 2) calculating the mean for all - circuits in a given decile as determined in - "cell-processed-cells". - - "cell-circuits-per-decile" num NL - [At most once.] - - Mean number of circuits that are included in any of the deciles, - rounded up to the next integer. - - -Exit statistics: - - The last type of statistics affects exit nodes counting the number of - bytes written and read and the number of streams opened per port and - per 24 hours. Exit port statistics can be measured from looking at - headers of BEGIN and DATA cells. A BEGIN cell contains the exit port - that is required for the exit node to open a new exit stream. - Subsequent DATA cells coming from the client or being sent back to the - client contain a length field stating how many bytes of application - data are contained in the cell. - - Exit port statistics are important to measure in order to identify - possible load-balancing problems with respect to exit policies. Exit - nodes that permit more ports than others are very likely overloaded - with traffic for those ports plus traffic for other ports. 
Improving - load balancing in the Tor network improves the overall utilization of - bandwidth capacity. - - Exit traffic is one of the most sensitive parts of network data in the - Tor network. Even though these statistics do not require looking at - traffic contents, statistics are aggregated so that they are not - useful for de-anonymizing users. Only those ports are reported that - have seen at least 0.1% of exiting or incoming bytes, numbers of bytes - are rounded up to full kibibytes (KiB), and stream numbers are rounded - up to the next multiple of 4. - - "exit-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL - [At most once.] - - YYYY-MM-DD HH:MM:SS defines the end of the included measurement - interval of length NSEC seconds (86400 seconds by default). - - An "exit-stats-end" line, as well as any other "exit-*" line, is - first added after the relay has been running for at least 24 hours - and only if the relay permits exiting (where exiting to a single - port and IP address is sufficient). - - "exit-kibibytes-written" port=N,port=N,... NL - [At most once.] - "exit-kibibytes-read" port=N,port=N,... NL - [At most once.] - - List of mappings from ports to the number of kibibytes that the - relay has written to or read from exit connections to that port, - rounded up to the next full kibibyte. - - "exit-streams-opened" port=N,port=N,... NL - [At most once.] - - List of mappings from ports to the number of opened exit streams - to that port, rounded up to the nearest multiple of 4. - - -Implementation notes: - - Right now, relays that are configured accordingly write similar - statistics to those described in this proposal to disk every 24 hours. - With this proposal being implemented, relays include the contents of - these files in extra-info documents. - - The following steps are necessary to implement this proposal: - - 1. The current format of [dirreq|entry|buffer|exit]-stats files needs - to be adapted to the description in this proposal. 
This step - basically means renaming keywords. - - 2. The timing of writing the four *-stats files should be unified, so - that they are written exactly 24 hours after starting the - relay. Right now, the measurement intervals for dirreq, entry, and - exit stats start with the first observed request, and files are - written when observing the first request that occurs more than 24 - hours after the beginning of the measurement interval. With this - proposal, the measurement intervals should all start at the same - time, and files should be written exactly 24 hours later. - - 3. It is advantageous to cache statistics in local files in the data - directory until they are included in extra-info documents. The - reason is that the 24-hour measurement interval can be very - different from the 18-hour publication interval of extra-info - documents. When a relay crashes after finishing a measurement - interval, but before publishing the next extra-info document, - statistics would get lost. Therefore, statistics are written to - disk when finishing a measurement interval and read from disk when - generating an extra-info document. Only the statistics that were - appended to the *-stats files within the past 24 hours are included - in extra-info documents. Further, the contents of the *-stats files - need to be checked in the process of generating extra-info documents. - - 4. With the statistics patches being tested, the ./configure options - should be removed and the statistics code be compiled by default. - It is still required for relay operators to add configuration - options (DirReqStatistics, ExitPortStatistics, etc.) to enable - gathering statistics. However, in the near future, statistics shall - be gathered by all relays by default, where requiring a - ./configure option would be a barrier for many relay operators.
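The aggregation rule that recurs throughout the proposal above (round every count up to a multiple of 8 or 4 before reporting it) can be sketched as follows. This is a minimal illustration, not Tor's implementation: the function names are hypothetical, and sorting the entries by country code is an assumption, since the proposal does not fix an order.

```python
def round_up(count, multiple):
    """Round a non-negative count up to the next multiple (0 stays 0)."""
    # -(-a // b) is ceiling division for integers.
    return -(-count // multiple) * multiple


def format_country_line(keyword, counts, multiple=8):
    """Build a 'keyword CC=N,CC=N,...' line from a dict of raw counts.

    Sorting by country code is an assumption made for this sketch.
    """
    pairs = sorted(counts.items())
    body = ','.join('%s=%d' % (cc, round_up(n, multiple)) for cc, n in pairs)
    return '%s %s' % (keyword, body)
```

For stream counts and response statuses, which the proposal rounds to multiples of 4, the same helper would be called with multiple=4.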
diff --git a/doc/spec/proposals/167-params-in-consensus.txt b/doc/spec/proposals/167-params-in-consensus.txt deleted file mode 100644 index d23bc9c01e..0000000000 --- a/doc/spec/proposals/167-params-in-consensus.txt +++ /dev/null @@ -1,47 +0,0 @@ -Filename: 167-params-in-consensus.txt -Title: Vote on network parameters in consensus -Author: Roger Dingledine -Created: 18-Aug-2009 -Status: Closed -Implemented-In: 0.2.2 - -0. History - - -1. Overview - - Several of our new performance plans involve guessing how to tune - clients and relays, yet we won't be able to learn whether we guessed - the right tuning parameters until many people have upgraded. Instead, - we should have directory authorities vote on the parameters, and teach - Tors to read the currently recommended values out of the consensus. - -2. Design - - V3 votes should include a new "params" line after the known-flags - line. It contains key=value pairs, where value is an integer. - - Consensus documents that are generated with a sufficiently new consensus - method (7?) then include a params line that includes every key listed - in any vote, and the median value for that key (in case of ties, - we use the median closer to zero). - -2.1. Planned keys. - - The first planned parameter is "circwindow=101", which is the initial - circuit packaging window that clients and relays should use. Putting - it in the consensus will let us perform experiments with different - values once enough Tors have upgraded -- see proposal 168. - - Later parameters might include a weighting for how much to favor quiet - circuits over loud circuits in our round-robin algorithm; a weighting - for how much to prioritize relays over clients if we use an incentive - scheme like the gold-star design; and what fraction of circuits we - should throw out from proposal 151. - -2.2. What about non-integers? - - I'm not sure how we would do median on non-integer values. Further, - I don't have any non-integer values in mind yet. 
So I say we cross - that bridge when we get to it. - diff --git a/doc/spec/proposals/168-reduce-circwindow.txt b/doc/spec/proposals/168-reduce-circwindow.txt deleted file mode 100644 index c10cf41e2e..0000000000 --- a/doc/spec/proposals/168-reduce-circwindow.txt +++ /dev/null @@ -1,134 +0,0 @@ -Filename: 168-reduce-circwindow.txt -Title: Reduce default circuit window -Author: Roger Dingledine -Created: 12-Aug-2009 -Status: Open -Target: 0.2.2 - -0. History - - -1. Overview - - We should reduce the starting circuit "package window" from 1000 to - 101. The lower package window will mean that clients will only be able - to receive 101 cells (~50KB) on a circuit before they need to send a - 'sendme' acknowledgement cell to request 100 more. - - Starting with a lower package window on exit relays should save on - buffer sizes (and thus memory requirements for the exit relay), and - should save on queue sizes (and thus latency for users). - - Lowering the package window will induce an extra round-trip for every - additional 50298 bytes of the circuit. This extra step is clearly a - slow-down for large streams, but ultimately we hope that a) clients - fetching smaller streams will see better response, and b) slowing - down the large streams in this way will produce lower e2e latencies, - so the round-trips won't be so bad. - -2. Motivation - - Karsten's torperf graphs show that the median download time for a 50KB - file over Tor in mid 2009 is 7.7 seconds, whereas the median download - time for 1MB and 5MB are around 50s and 150s respectively. The 7.7 - second figure is way too high, whereas the 50s and 150s figures are - surprisingly low. - - The median round-trip latency appears to be around 2s, with 25% of - the data points taking more than 5s. That's a lot of variance. - - We designed Tor originally with the original goal of maximizing - throughput. We figured that would also optimize other network properties - like round-trip latency. Looks like we were wrong. - -3. 
Design - - Wherever we initialize the circuit package window, initialize it to - 101 rather than 1000. Reducing it should be safe even when interacting - with old Tors: the old Tors will receive the 101 cells and send back - a sendme ack cell. They'll still have much higher deliver windows, - but the rest of their deliver window will go unused. - - You can find the patch at arma/circwindow. It seems to work. - -3.1. Why not 100? - - Tor 0.0.0 through 0.2.1.19 have a bug where they only send the sendme - ack cell after 101 cells rather than the intended 100 cells. - - Once 0.2.1.19 is obsolete we can change it back to 100 if we like. But - hopefully we'll have moved to some datagram protocol long before - 0.2.1.19 becomes obsolete. - -3.2. What about stream packaging windows? - - Right now the stream packaging windows start at 500. The goal was to - set the stream window to half the circuit window, to provide a crude - load balancing between streams on the same circuit. Once we lower - the circuit packaging window, the stream packaging window basically - becomes redundant. - - We could leave it in -- it isn't hurting much in either case. Or we - could take it out -- people building other Tor clients would thank us - for that step. Alas, people building other Tor clients are going to - have to be compatible with current Tor clients, so in practice there's - no point taking out the stream packaging windows. - -3.3. What about variable circuit windows? - - Once upon a time we imagined adapting the circuit package window to - the network conditions. That is, we would start the window small, - and raise it based on the latency and throughput we see. - - In theory that crude imitation of TCP's windowing system would allow - us to adapt to fill the network better. In practice, I think we want - to stick with the small window and never raise it. The low cap reduces - the total throughput you can get from Tor for a given circuit. But - that's a feature, not a bug. - -4. 
Evaluation - - How do we know this change is actually smart? It seems intuitive that - it's helpful, and some smart systems people have agreed that it's - a good idea (or said another way, they were shocked at how big the - default package window was before). - - To get a more concrete sense of the benefit, though, Karsten has been - running torperf side-by-side on exit relays with the old package window - vs the new one. The results are mixed currently -- it is slightly faster - for fetching 40KB files, and slightly slower for fetching 50KB files. - - I think it's going to be tough to get a clear conclusion that this is - a good design just by comparing one exit relay running the patch. The - trouble is that the other hops in the circuits are still getting bogged - down by other clients introducing too much traffic into the network. - - Ultimately, we'll want to put the circwindow parameter into the - consensus so we can test a broader range of values once enough relays - have upgraded. - -5. Transition and deployment - - We should put the circwindow in the consensus (see proposal 167), - with an initial value of 101. Then as more exit relays upgrade, - clients should seamlessly get the better behavior. - - Note that upgrading the exit relay will only affect the "download" - package window. An old client that's uploading lots of bytes will - continue to use the old package window at the client side, and we - can't throttle that window at the exit side without breaking protocol. - - The real question then is what we should backport to 0.2.1. Assuming - this could be a big performance win, we can't afford to wait until - 0.2.2.x comes out before starting to see the changes here. So we have - two options as I see them: - a) once clients in 0.2.2.x know how to read the value out of the - consensus, and it's been tested for a bit, backport that part to - 0.2.1.x. - b) if it's too complex to backport, just pick a number, like 101, and - backport that number. 
- - Clearly choice (a) is the better one if the consensus parsing part - isn't very complex. Let's shoot for that, and fall back to (b) if the - patch turns out to be so big that we reconsider. - diff --git a/doc/spec/proposals/169-eliminating-renegotiation.txt b/doc/spec/proposals/169-eliminating-renegotiation.txt deleted file mode 100644 index 2c90f9c9e8..0000000000 --- a/doc/spec/proposals/169-eliminating-renegotiation.txt +++ /dev/null @@ -1,404 +0,0 @@ -Filename: 169-eliminating-renegotiation.txt -Title: Eliminate TLS renegotiation for the Tor connection handshake -Author: Nick Mathewson -Created: 27-Jan-2010 -Status: Draft -Target: 0.2.2 - -1. Overview - - I propose a backward-compatible change to the Tor connection - establishment protocol to avoid the use of TLS renegotiation. - - Rather than doing a TLS renegotiation to exchange certificates - and authenticate the original handshake, this proposal takes an - approach similar to Steven Murdoch's proposal 124, and uses Tor - cells to finish authenticating the parties' identities once the - initial TLS handshake is finished. - - Terminological note: I use "client" below to mean the Tor - instance (a client or a relay) that initiates a TLS connection, - and "server" to mean the Tor instance (a relay) that accepts it. - -2. Motivation and history - - In the original Tor TLS connection handshake protocol ("V1", or - "two-cert"), parties that wanted to authenticate provided a - two-cert chain of X.509 certificates during the handshake setup - phase. Every party that wanted to authenticate sent these - certificates. - - In the current Tor TLS connection handshake protocol ("V2", or - "renegotiating"), the parties begin with a single certificate - sent from the server (responder) to the client (initiator), and - then renegotiate to a two-certs-from-each-authenticating party. - We made this change to make Tor's handshake look like a browser - speaking SSL to a webserver. (See proposal 130, and - tor-spec.txt.) 
To tell whether to use the V1 or V2 handshake, - servers look at the list of ciphers sent by the client. (This is - ugly, but there's not much else in the ClientHello that they can - look at.) If the list contains any cipher not used by the V1 - protocol, the server sends back a single cert and expects a - renegotiation. If the client gets back a single cert, then it - withholds its own certificates until the TLS renegotiation phase. - - In other words, initiator behavior now looks like this: - - - Begin TLS negotiation with V2 cipher list; wait for - certificate(s). - - If we get a certificate chain: - - Then we are using the V1 handshake. Send our own - certificate chain as part of this initial TLS handshake - if we want to authenticate; otherwise, send no - certificates. When the handshake completes, check - certificates. We are now mutually authenticated. - - Otherwise, if we get just a single certificate: - - Then we are using the V2 handshake. Do not send any - certificates during this handshake. - - When the handshake is done, immediately start a TLS - renegotiation. During the renegotiation, expect - a certificate chain from the server; send a certificate - chain of our own if we want to authenticate ourselves. - - After the renegotiation, check the certificates. Then - send (and expect) a VERSIONS cell from the other side to - establish the link protocol version. - - And V2 responder behavior now looks like this: - - - When we get a TLS ClientHello request, look at the cipher - list. - - If the cipher list contains only the V1 ciphersuites: - - Then we're doing a V1 handshake. Send a certificate - chain. Expect a possible client certificate chain in - response. - Otherwise, if we get other ciphersuites: - - We're using the V2 handshake. Send back a single - certificate and let the handshake complete. - - Do not accept any data until the client has renegotiated. 
- - When the client is renegotiating, send a certificate - chain, and expect (possibly multiple) certificates in - reply. - - Check the certificates when the renegotiation is done. - Then exchange VERSIONS cells. - - Late in 2009, researchers found a flaw in most applications' use - of TLS renegotiation: Although TLS renegotiation does not - reauthenticate any information exchanged before the renegotiation - takes place, many applications were treating it as though it did, - and assuming that data sent _before_ the renegotiation was - authenticated with the credentials negotiated _during_ the - renegotiation. This problem was exacerbated by the fact that - most TLS libraries don't actually give you an obvious good way to - tell where the renegotiation occurred relative to the datastream. - Tor wasn't directly affected by this vulnerability, but its - aftermath hurts us in a few ways: - - 1) OpenSSL has disabled renegotiation by default, and created - a "yes we know what we're doing" option we need to set to - turn it back on. (Two options, actually: one for openssl - 0.9.8l and one for 0.9.8m and later.) - - 2) Some vendors have removed all renegotiation support from - their versions of OpenSSL entirely, forcing us to tell - users to either replace their versions of OpenSSL or to - link Tor against a hand-built one. - - 3) Because of 1 and 2, I'd expect TLS renegotiation to become - rarer and rarer in the wild, making our own use stand out - more. - -3. Design - -3.1. The view in the large - - Taking a cue from Steven Murdoch's proposal 124, I propose that - we move the work currently done by the TLS renegotiation step - (that is, authenticating the parties to one another) and do it - with Tor cells instead of with TLS. - - Using _yet another_ variant response from the responder (server), - we allow the client to learn that it doesn't need to rehandshake - and can instead use a cell-based authentication system. 
Once the - TLS handshake is done, the client and server exchange VERSIONS - cells to determine link protocol version (including - handshake version). If they're using the handshake version - specified here, the client and server arrive at link protocol - version 3 (or higher), and use cells to exchange further - authentication information. - -3.2. New TLS handshake variant - - We already used the list of ciphers from the clienthello to - indicate whether the client can speak the V2 ("renegotiating") - handshake or later, so we can't encode more information there. - - We can, however, change the DN in the certificate passed by the - server back to the client. Currently, all V2 certificates are - generated with CN values ending with ".net". I propose that we - have the ".net" commonName ending reserved to indicate the V2 - protocol, and use commonName values ending with ".com" to - indicate the V3 ("minimal") handshake described herein. - - Now, once the initial TLS handshake is done, the client can look - at the server's certificate(s). If there is a certificate chain, - the handshake is V1. If there is a single certificate whose - subject commonName ends in ".net", the handshake is V2 and the - client should try to renegotiate as it would currently. - Otherwise, the client should assume that the handshake is V3+. - [Servers should _only_ send ".com" addresses, to allow room for - more signaling in the future.] - -3.3. Authenticating inside Tor - - Once the TLS handshake is finished, if the client renegotiates, - then the server should go on as it does currently. - - If the client implements this proposal, however, and the server - has shown it can understand the V3+ handshake protocol, the - client immediately sends a VERSIONS cell to the server - and waits to receive a VERSIONS cell in return. We negotiate - the Tor link protocol version _before_ we proceed with the - negotiation, in case we need to change the authentication - protocol in the future.
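The client-side rule from section 3.2 for telling the three handshake variants apart can be sketched as below. This is only an illustration under stated assumptions: the function name is hypothetical, and the input is assumed to be the list of subject commonName strings already extracted from the certificates the server sent, which the proposal does not describe how to obtain.

```python
def classify_handshake(common_names):
    """Classify the TLS handshake variant from the server's certificates.

    common_names: subject commonName of each certificate the server sent,
    in order (an assumed, pre-extracted input for this sketch).
    """
    if not common_names:
        raise ValueError('server sent no certificate')
    if len(common_names) > 1:
        # A certificate chain means the original two-cert handshake.
        return 'V1'
    if common_names[0].endswith('.net'):
        # Single certificate ending in ".net": renegotiating handshake.
        return 'V2'
    # Anything else is treated as V3 or later, leaving room for signaling.
    return 'V3+'
```

Only the ".net" suffix is tested explicitly, which matches the proposal's rule that every other ending is assumed to mean V3 or later.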
- - Once either party has seen the VERSIONS cell from the other, it - knows which version they will pick (that is, the highest version - shared by both parties' VERSIONS cells). All Tor instances using - the handshake protocol described in 3.2 MUST support at least - link protocol version 3 as described here. - - On learning the link protocol, the server then sends the client a - CERT cell and a NETINFO cell. If the client wants to - authenticate to the server, it sends a CERT cell, an AUTHENTICATE - cell, and a NETINFO cell, or it may simply send a NETINFO cell if - it does not want to authenticate. - - The CERT cell describes the keys that a Tor instance is claiming - to have. It is a variable-length cell. Its payload format is: - - N: Number of certs in cell [1 octet] - N times: - CLEN [2 octets] - Certificate [CLEN octets] - - Any extra octets at the end of a CERT cell MUST be ignored. - - Each certificate has the form: - - CertType [1 octet] - CertPurpose [1 octet] - PublicKeyLen [2 octets] - PublicKey [PublicKeyLen octets] - NotBefore [4 octets] - NotAfter [4 octets] - SignerID [HASH256_LEN octets] - SignatureLen [2 octets] - Signature [SignatureLen octets] - - where CertType is 1 (meaning "RSA/SHA256") - CertPurpose is 1 (meaning "link certificate") - PublicKey is the DER encoding of the ASN.1 representation - of the RSA key of the subject of this certificate, - NotBefore is a time in HOURS since January 1, 1970, 00:00 - UTC before which this certificate should not be - considered valid. - NotAfter is a time in HOURS since January 1, 1970, 00:00 - UTC after which this certificate should not be - considered valid. - SignerID is the SHA-256 digest of the public key signing - this certificate - and Signature is the signature of the all other fields in - this certificate, using SHA256 as described in proposal - 158. - - While authenticating, a server need send only a self-signed - certificate for its identity key. 
(Its TLS certificate already - contains its link key signed by its identity key.) A client that - wants to authenticate MUST send two certificates: one containing - a public link key signed by its identity key, and one self-signed - cert for its identity. - - Tor instances MUST ignore any certificate with an unrecognized - CertType or CertPurpose, and MUST ignore extra bytes in the cert. - - The AUTHENTICATE cell proves to the server that the client with - whom it completed the initial TLS handshake is the one possessing - the link public key in its certificate. It is a variable-length - cell. Its contents are: - - SignatureType [2 octets] - SignatureLen [2 octets] - Signature [SignatureLen octets] - - where SignatureType is 1 (meaning "RSA-SHA256") and Signature is - an RSA-SHA256 signature of the HMAC-SHA256, using the TLS master - secret key as its key, of the following elements: - - - The SignatureType field (0x00 0x01) - - The NUL terminated ASCII string: "Tor certificate verification" - - client_random, as sent in the Client Hello - - server_random, as sent in the Server Hello - - Once the above handshake is complete, the client knows (from the - initial TLS handshake) that it has a secure connection to an - entity that controls a given link public key, and knows (from the - CERT cell) that the link public key is a valid public key for a - given Tor identity. - - If the client authenticates, the server learns from the CERT cell - that a given Tor identity has a given current public link key. - From the AUTHENTICATE cell, it knows that an entity with that - link key knows the master secret for the TLS connection, and - hence must be the party with whom it's talking, if TLS works. - -3.4. Security checks - - If the TLS handshake indicates a V2 or V3+ connection, the server - MUST reject any connection from the client that does not begin - with either a renegotiation attempt or a VERSIONS cell containing - at least link protocol version "3". 
If the TLS handshake - indicates a V3+ connection, the client MUST reject any connection - where the server sends anything before the client has sent a - VERSIONS cell, and any connection where the VERSIONS cell does - not contain at least link protocol version "3". - - If link protocol version 3 is chosen: - - Clients and servers MUST check that all digests and signatures - on the certificates in CERT cells they are given are as - described above. - - After the VERSIONS cell, clients and servers MUST close the - connection if anything besides a CERT or AUTH cell is sent - before the - - CERT or AUTHENTICATE cells anywhere after the first NETINFO - cell must be rejected. - - ... [write more here. What else?] ... - -3.5. Summary - - We now revisit the protocol outlines from section 2 to incorporate - our changes. New or modified steps are marked with a *. - - The new initiator behavior now looks like this: - - - Begin TLS negotiation with V2 cipher list; wait for - certificate(s). - - If we get a certificate chain: - - Then we are using the V1 handshake. Send our own - certificate chain as part of this initial TLS handshake - if we want to authenticate; otherwise, send no - certificates. When the handshake completes, check - certificates. We are now mutually authenticated. - Otherwise, if we get just a single certificate: - - Then we are using the V2 or the V3+ handshake. Do not - send any certificates during this handshake. - * When the handshake is done, look at the server's - certificate's subject commonName. - * If it ends with ".net", we're doing a V2 handshake: - - Immediately start a TLS renegotiation. During the - renegotiation, expect a certificate chain from the - server; send a certificate chain of our own if we - want to authenticate ourselves. - - After the renegotiation, check the certificates. Then - send (and expect) a VERSIONS cell from the other side - to establish the link protocol version. 
- * If it ends with anything else, assume a V3 or later - handshake: - * Send a VERSIONS cell, and wait for a VERSIONS cell - from the server. - * If we are authenticating, send CERT and AUTHENTICATE - cells. - * Send a NETINFO cell. Wait for a CERT and a NETINFO - cell from the server. - * If the CERT cell contains a valid self-identity cert, - and the identity key in the cert can be used to check - the signature on the x.509 certificate we got during - the TLS handshake, then we know we connected to the - server with that identity. If any of these checks - fail, or the identity key was not what we expected, - then we close the connection. - * Once the NETINFO cell arrives, continue as before. - - And V3+ responder behavior now looks like this: - - - When we get a TLS ClientHello request, look at the cipher - list. - - - If the cipher list contains only the V1 ciphersuites: - - Then we're doing a V1 handshake. Send a certificate - chain. Expect a possible client certificate chain in - response. - Otherwise, if we get other ciphersuites: - - We're using the V2 handshake. Send back a single - certificate whose subject commonName ends with ".com", - and let the handshake complete. - * If the client does anything besides renegotiate or send a - VERSIONS cell, drop the connection. - - If the client renegotiates immediately, it's a V2 - connection: - - When the client is renegotiating, send a certificate - chain, and expect (possibly multiple certificates in - reply). - - Check the certificates when the renegotiation is done. - Then exchange VERSIONS cells. - * Otherwise we got a VERSIONS cell and it's a V3 handshake. - * Send a VERSIONS cell, a CERT cell, an AUTHENTICATE - cell, and a NETINFO cell. - * Wait for the client to send cells in reply. If the - client sends a CERT and an AUTHENTICATE and a NETINFO, - use them to authenticate the client. If the client - sends a NETINFO, it is unauthenticated. If it sends - anything else before its NETINFO, it's rejected. - -4. 
Numbers to assign - - We need a version number for this link protocol. I've been - calling it "3". - - We need to reserve command numbers for CERT and AUTH cells. I - suggest that in link protocol 3 and higher, we reserve command - numbers 128..240 for variable-length cells. (241-255 we can hold - for future extensions.) - -5. Efficiency - - This protocol adds a round-trip step when the client sends a - VERSIONS cell to the server, and waits for the {VERSIONS, CERT, - NETINFO} response in turn. (The server then waits for the - client's {NETINFO} or {CERT, AUTHENTICATE, NETINFO} reply, - but it would have already been waiting for the client's NETINFO, - so that's not an additional wait.) - - This is actually fewer round-trip steps than required before for - TLS renegotiation, so that's a win. - -6. Open questions: - - - Should we use X.509 certificates instead of the certificate-ish - things we describe here? They are more standard, but more ugly. - - - May we cache which certificates we've already verified? It - might leak in timing whether we've connected with a given server - before, and how recently. - - - Is there a better secret than the master secret to use in the - AUTHENTICATE cell? Say, a portable one? Can we get at it for - other libraries besides OpenSSL? - - - Does using the client_random and server_random data in the - AUTHENTICATE message actually help us? How hard is it to pull - them out of the OpenSSL data structure? - - - Can we give some way for clients to signal "I want to use the - V3 protocol if possible, but I can't renegotiate, so don't give - me the V2"? Clients currently have a fair idea of server - versions, so they could potentially do the V3+ handshake with - servers that support it, and fall back to V1 otherwise. - - - What should servers that don't have TLS renegotiation do? For - now, I think they should just get it. Eventually we can - deprecate the V2 handshake as we did with the V1 handshake. 
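As a rough illustration of the command-number reservation suggested in section 4 above, the check might look like the following Python sketch. The function name is hypothetical and not part of Tor. It covers only the 128..240 range proposed here and ignores any cell types that are variable-length for other reasons.

```python
def is_variable_length_command(command: int, link_version: int) -> bool:
    """Return True if `command` is in the range this proposal reserves
    for variable-length cells (hypothetical helper, not Tor code)."""
    # The reservation only applies to link protocol 3 and higher.
    if link_version < 3:
        return False
    # 128..240 inclusive is the range suggested in section 4; commands
    # above 240 are held back for future extensions.
    return 128 <= command <= 240
```

A parser built on this would read a length field only when the function returns True, and treat every other command as a fixed-size cell.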
diff --git a/doc/spec/proposals/170-user-path-config.txt b/doc/spec/proposals/170-user-path-config.txt deleted file mode 100644 index fa74c76f73..0000000000 --- a/doc/spec/proposals/170-user-path-config.txt +++ /dev/null @@ -1,95 +0,0 @@ -Title: Configuration options regarding circuit building -Filename: 170-user-path-config.txt -Author: Sebastian Hahn -Created: 01-March-2010 -Status: Draft - -Overview: - - This document outlines how Tor handles the user configuration - options to influence the circuit building process. - -Motivation: - - Tor's treatment of the configuration *Nodes options was surprising - to many users, and quite a few conspiracy theories have crept up. We - should update our specification and code to better describe and - communicate what is going on during circuit building, and how we're - honoring configuration. So far, we've been tracking a bug report - about this behaviour ( - https://bugs.torproject.org/flyspray/index.php?do=details&id=1090 ) - and Nick replied in a thread on or-talk ( - http://archives.seul.org/or/talk/Feb-2010/msg00117.html ). - - This proposal tries to document our intention for those configuration - options. - -Design: - - Five configuration options are available to users to influence Tor's - circuit building. EntryNodes and ExitNodes define a list of nodes - that are for the Entry/Exit position in all circuits. ExcludeNodes - is a list of nodes that are used for no circuit, and - ExcludeExitNodes is a list of nodes that aren't used as the last - hop. StrictNodes defines Tor's behaviour in case of a conflict, for - example when a node that is excluded is the only available - introduction point. Setting StrictNodes to 1 breaks Tor's - functionality in that case, and it will refuse to build such a - circuit. - - Neither Nick's email nor bug 1090 have clear suggestions how we - should behave in each case, so I tried to come up with something - that made sense to me. 
- -Security implications: - - Deviating from normal circuit building can break one's anonymity, so - the documentation of the above options should contain a warning to - make users aware of the pitfalls. - -Specification: - - It is proposed that the "User configuration" part of path-spec - (section 2.2.2) be replaced with this: - - Users can alter the default behavior for path selection with - configuration options. In case of conflicts (excluding and requiring - the same node) the "StrictNodes" option is used to determine - behaviour. If a node is both excluded and required via a - configuration option, the exclusion takes precedence. - - - If "ExitNodes" is provided, then every request requires an exit - node on the ExitNodes list. If a request is supported by no nodes - on that list, and "StrictNodes" is false, then Tor treats that - request as if ExitNodes were not provided. - - - "EntryNodes" behaves analogously. - - - If "ExcludeNodes" is provided, then no circuit uses any of the - nodes listed. If a circuit requires an excluded node to be used, - and "StrictNodes" is false, then Tor uses the node in that - position while not using any other of the excluded nodes. - - - If "ExcludeExitNodes" is provided, then Tor will not use the nodes - listed for the exit position in a circuit. If a circuit requires - an excluded node to be used in the exit position and "StrictNodes" - is false, then Tor builds that circuit as if ExcludeExitNodes were - not provided. - - - If a user tries to connect to or resolve a hostname of the form - <target>.<servername>.exit and the "AllowDotExit" configuration - option is set to 1, the request is rewritten to a request for - <target>, and the request is only supported by the exit whose - nickname or fingerprint is <servername>. If "AllowDotExit" is set - to 0 (default), any request for <anything>.exit is denied. 
- - When any of the *Nodes settings are changed, all circuits are - expired immediately, to prevent a situation where a previously - built circuit is used even though some of its nodes are now - excluded. - - -Compatibility: - - The old Strict*Nodes options are deprecated, and the StrictNodes - option is new. Tor users may need to update their configuration file. diff --git a/doc/spec/proposals/172-circ-getinfo-option.txt b/doc/spec/proposals/172-circ-getinfo-option.txt deleted file mode 100644 index b7fd79c9a8..0000000000 --- a/doc/spec/proposals/172-circ-getinfo-option.txt +++ /dev/null @@ -1,138 +0,0 @@ -Filename: 172-circ-getinfo-option.txt -Title: GETINFO controller option for circuit information -Author: Damian Johnson -Created: 03-June-2010 -Status: Accepted - -Overview: - - This details an additional GETINFO option that would provide information - concerning a relay's current circuits. - -Motivation: - - The original proposal was for connection related information, but Jake made - the excellent point that any information retrieved from the control port - is... - - 1. completely ineffectual for auditing purposes since either (a) these - results can be fetched from netstat already or (b) the information would - only be provided via tor and can't be validated. - - 2. The more useful uses for connection information can be achieved with - much less (and safer) information. - - Hence the proposal is now for circuit based rather than connection based - information. This would strip the most controversial and sensitive data - entirely (ip addresses, ports, and connection based bandwidth breakdowns) - while still being useful for the following purposes: - - - Basic Relay Usage Questions - How is the bandwidth I'm contributing broken down? Is it being evenly - distributed or is someone hogging most of it? Do these circuits belong to - the hidden service I'm running or something else? 
Now that I'm using exit - policy X am I desirable as an exit, or are most people just using me as a - relay? - - - Debugging - Say a relay has a restrictive firewall policy for outbound connections, - with the ORPort whitelisted but doesn't realize that tor needs random high - ports. Tor would report success ("your orport is reachable - excellent") - yet the relay would be nonfunctional. This proposed information would - reveal numerous RELAY -> YOU -> UNESTABLISHED circuits, giving a good - indicator of what's wrong. - - - Visualization - A nice benefit of visualizing tor's behavior is that it becomes a helpful - tool in puzzling out how tor works. For instance, tor spawns numerous - client connections at startup (even if unused as a client). As a newcomer - to tor these asymmetric (outbound only) connections mystified me for quite - a while until Roger explained their use to me. The proposed - TYPE_FLAGS would let controllers clearly label them as being client - related, making their purpose a bit clearer. - - At the moment connection data can only be retrieved via commands like - netstat, ss, and lsof. However, providing an alternative via the control - port provides several advantages: - - - scrubbing for private data - Raw connection data has no notion of what's sensitive and what is - not. The relay's flags and cached consensus can be used to take - educated guesses concerning which connections could possibly belong - to client or exit traffic, but this is both difficult and inaccurate. - Anything provided via the control port can be scrubbed to make sure we - aren't providing anything we think relay operators should not see. - - - additional information - All connection querying commands strictly provide the ip address and - port of connections, and nothing else. However, for the uses listed - above the far more interesting attributes are the circuit's type, - bandwidth usage and uptime. 
- - - improved performance - Querying connection data is an expensive activity, especially for - busy relays or low end processors (such as mobile devices). Tor - already internally knows its circuits, allowing for vastly quicker - lookups. - - - cross platform capability - The connection querying utilities mentioned above not only aren't - available under Windows, but differ widely among different *nix - platforms. FreeBSD in particular takes a very unique approach, - dropping important options from netstat and assigning ss to a - spreadsheet application instead. A controller interface, however, - would provide a uniform means of retrieving this information. - -Security Implications: - - This is an open question. This proposal lacks the most controversial pieces - of information (ip addresses and ports) and insight into potential threats - this would pose would be very welcomed! - -Specification: - - The following addition would be made to the control-spec's GETINFO section: - - "rcirc/id/<Circuit identity>" -- Provides entry for the associated relay - circuit, formatted as: - CIRC_ID=<circuit ID> CREATED=<timestamp> UPDATED=<timestamp> TYPE=<flag> - READ=<bytes> WRITE=<bytes> - - none of the parameters contain whitespace, and additional results must be - ignored to allow for future expansion. Parameters are defined as follows: - CIRC_ID - Unique numeric identifier for the circuit this belongs to. - CREATED - Unix timestamp (as seconds since the Epoch) for when the - circuit was created. - UPDATED - Unix timestamp for when this information was last updated. 
- TYPE - Single character flags indicating attributes in the circuit: - (E)ntry : has a connection that doesn't belong to a known Tor server, - indicating that this is either the first hop or bridged - E(X)it : has been used for at least one exit stream - (R)elay : has been extended - Rende(Z)vous : is being used for a rendezvous point - (I)ntroduction : is being used for a hidden service introduction - (N)one of the above: none of the above have happened yet. - READ - Total bytes transmitted toward the exit over the circuit. - WRITE - Total bytes transmitted toward the client over the circuit. - - "rcirc/all" -- The 'rcirc/id/*' output for all current circuits, joined by - newlines. - - The following would be included for circ info update events. - -4.1.X. Relay circuit status changed - - The syntax is: - "650" SP "RCIRC" SP CircID SP Notice [SP Created SP Updated SP Type SP - Read SP Write] CRLF - - Notice = - "NEW" / ; first information being provided for this circuit - "UPDATE" / ; update for a previously reported circuit - "CLOSED" ; notice that the circuit no longer exists - - Notice indicating that queryable information on a relay related circuit has - changed. If the Notice parameter is either "NEW" or "UPDATE" then this - provides the same fields that would be given by calling "GETINFO rcirc/id/" - with the CircID. - diff --git a/doc/spec/proposals/173-getinfo-option-expansion.txt b/doc/spec/proposals/173-getinfo-option-expansion.txt deleted file mode 100644 index 03e18ef8d4..0000000000 --- a/doc/spec/proposals/173-getinfo-option-expansion.txt +++ /dev/null @@ -1,101 +0,0 @@ -Filename: 173-getinfo-option-expansion.txt -Title: GETINFO Option Expansion -Author: Damian Johnson -Created: 02-June-2010 -Status: Accepted - -Overview: - - Over the course of developing arm there have been numerous hacks and - workarounds to glean pieces of basic, desirable information about the tor - process. 
As per Roger's request I've compiled a list of these pain points - to try and improve the control protocol interface. - -Motivation: - - The purpose of this proposal is to expose additional process and relay - related information that is currently unavailable in a convenient, - dependable, and/or platform independent way. Examples of this are... - - - The relay's total contributed bandwidth. This is a highly requested - piece of information and, based on the following patch from pipe, looks - trivial to include. - http://www.mail-archive.com/or-talk@freehaven.net/msg13085.html - - - The process ID of the tor process. There is a high degree of guesswork - in obtaining this. Arm for instance uses pidof, netstat, and ps yet - still fails on some platforms, and Orbot recently got a ticket about - its own attempt to fetch it with ps: - https://trac.torproject.org/projects/tor/ticket/1388 - - This just includes the pieces of missing information I've noticed - (suggestions or questions of their usefulness are welcome!). - -Security Implications: - - None that I'm aware of. From a security standpoint this seems decently - innocuous. - -Specification: - - The following addition would be made to the control-spec's GETINFO section: - - "relay/bw-limit" -- Effective relayed bandwidth limit. - - "relay/burst-limit" -- Effective relayed burst limit. - - "relay/read-total" -- Total bytes relayed (download). - - "relay/write-total" -- Total bytes relayed (upload). - - "relay/flags" -- Space separated listing of flags currently held by the - relay as reported by the currently cached consensus. - - "process/user" -- Username under which the tor process is running, - providing an empty string if none exists. - - "process/pid" -- Process id belonging to the main tor process, -1 if none - exists for the platform. - - "process/uptime" -- Total uptime of the tor process (in seconds). - - "process/uptime-reset" -- Time since last reset (startup, sighup, or RELOAD - signal, in seconds). 
- - "process/descriptors-used" -- Count of file descriptors used. - - "process/descriptor-limit" -- File descriptor limit (getrlimit results). - - "ns/authority" -- Router status info (v2 directory style) for all - recognized directory authorities, joined by newlines. - - "state/names" -- A space-separated list of all the keys supported by this - version of Tor's state. - - "state/val/<key>" -- Provides the current state value belonging to the - given key. If undefined, this provides the key's default value. - - "status/ports-seen" -- A summary of which ports we've seen exiting - circuits connect to recently, formatted the same as the EXITS_SEEN status - event described in Section 4.1.XX. This GETINFO option is currently - available only for exit relays. - -4.1.XX. Per-port exit stats - - The syntax is: - "650" SP "EXITS_SEEN" SP TimeStarted SP PortSummary CRLF - - We just generated a new summary of which ports we've seen exiting circuits - connecting to recently. The controller could display this for the user, e.g. - in their "relay" configuration window, to give them a sense of how they're - being used (popularity of the various ports they exit to). Currently only - exit relays will receive this event. - - TimeStarted is a quoted string indicating when the reported summary - counts from (in GMT). - - The PortSummary keyword has as its argument a comma-separated, possibly - empty set of "port=count" pairs. 
For example (without linebreak), - 650-EXITS_SEEN TimeStarted="2008-12-25 23:50:43" - PortSummary=80=16,443=8 - diff --git a/doc/spec/proposals/174-optimistic-data-server.txt b/doc/spec/proposals/174-optimistic-data-server.txt deleted file mode 100644 index d97c45e909..0000000000 --- a/doc/spec/proposals/174-optimistic-data-server.txt +++ /dev/null @@ -1,242 +0,0 @@ -Filename: 174-optimistic-data-server.txt -Title: Optimistic Data for Tor: Server Side -Author: Ian Goldberg -Created: 2-Aug-2010 -Status: Open - -Overview: - -When a SOCKS client opens a TCP connection through Tor (for an HTTP -request, for example), the query latency is about 1.5x higher than it -needs to be. Simply, the problem is that the sequence of data flows -is this: - -1. The SOCKS client opens a TCP connection to the OP -2. The SOCKS client sends a SOCKS CONNECT command -3. The OP sends a BEGIN cell to the Exit -4. The Exit opens a TCP connection to the Server -5. The Exit returns a CONNECTED cell to the OP -6. The OP returns a SOCKS CONNECTED notification to the SOCKS client -7. The SOCKS client sends some data (the GET request, for example) -8. The OP sends a DATA cell to the Exit -9. The Exit sends the GET to the server -10. The Server returns the HTTP result to the Exit -11. The Exit sends the DATA cells to the OP -12. The OP returns the HTTP result to the SOCKS client - -Note that the Exit node knows that the connection to the Server was -successful at the end of step 4, but is unable to send the HTTP query to -the server until step 9. - -This proposal (as well as its upcoming sibling concerning the client -side) aims to reduce the latency by allowing: -1. SOCKS clients to optimistically send data before they are notified - that the SOCKS connection has completed successfully -2. OPs to optimistically send DATA cells on streams in the CONNECT_WAIT - state -3. 
Exit nodes to accept and queue DATA cells while in the - EXIT_CONN_STATE_CONNECTING state - -This particular proposal deals with #3. - -In this way, the flow would be as follows: - -1. The SOCKS client opens a TCP connection to the OP -2. The SOCKS client sends a SOCKS CONNECT command, followed immediately - by data (such as the GET request) -3. The OP sends a BEGIN cell to the Exit, followed immediately by DATA - cells -4. The Exit opens a TCP connection to the Server -5. The Exit returns a CONNECTED cell to the OP, and sends the queued GET - request to the Server -6. The OP returns a SOCKS CONNECTED notification to the SOCKS client, - and the Server returns the HTTP result to the Exit -7. The Exit sends the DATA cells to the OP -8. The OP returns the HTTP result to the SOCKS client - -Motivation: - -This change will save one OP<->Exit round trip (down to one from two). -There are still two SOCKS Client<->OP round trips (negligible time) and -two Exit<->Server round trips. Depending on the ratio of the -Exit<->Server (Internet) RTT to the OP<->Exit (Tor) RTT, this will -decrease the latency by 25 to 50 percent. Experiments validate these -predictions. [Goldberg, PETS 2010 rump session; see -https://thunk.cs.uwaterloo.ca/optimistic-data-pets2010-rump.pdf ] - -Design: - -The current code actually correctly handles queued data at the Exit; if -there is queued data in a EXIT_CONN_STATE_CONNECTING stream, that data -will be immediately sent when the connection succeeds. If the -connection fails, the data will be correctly ignored and freed. The -problem with the current server code is that the server currently -drops DATA cells on streams in the EXIT_CONN_STATE_CONNECTING state. -Also, if you try to queue data in the EXIT_CONN_STATE_RESOLVING state, -bad things happen because streams in that state don't yet have -conn->write_event set, and so some existing sanity checks (any stream -with queued data is at least potentially writable) are no longer sound. 
- -The solution is to simply not drop received DATA cells while in the -EXIT_CONN_STATE_CONNECTING state. Also do not send SENDME cells in this -state, so that the OP cannot send more than one window's worth of data -to be queued at the Exit. Finally, patch the sanity checks so that -streams in the EXIT_CONN_STATE_RESOLVING state that have buffered data -can pass. - -If no clients ever send such optimistic data, the new code will never be -executed, and the behaviour of Tor will not change. When clients begin -to send optimistic data, the performance of those clients' streams will -improve. - -After discussion with nickm, it seems best to just have the server -version number be the indicator of whether a particular Exit supports -optimistic data. (If a client sends optimistic data to an Exit which -does not support it, the data will be dropped, and the client's request -will fail to complete.) What do version numbers for hypothetical future -protocol-compatible implementations look like, though? - -Security implications: - -Servers (for sure the Exit, and possibly others, by watching the -pattern of packets) will be able to tell that a particular client -is using optimistic data. This will be discussed more in the sibling -proposal. - -On the Exit side, servers will be queueing a little bit extra data, but -no more than one window. Clients today can cause Exits to queue that -much data anyway, simply by establishing a Tor connection to a slow -machine, and sending one window of data. - -Specification: - -tor-spec section 6.2 currently says: - - The OP waits for a RELAY_CONNECTED cell before sending any data. - Once a connection has been established, the OP and exit node - package stream data in RELAY_DATA cells, and upon receiving such - cells, echo their contents to the corresponding TCP stream. - RELAY_DATA cells sent to unrecognized streams are dropped. 
- -It is not clear exactly what an "unrecognized" stream is, but this last -sentence would be changed to say that RELAY_DATA cells received on a -stream that has processed a RELAY_BEGIN cell and has not yet issued a -RELAY_END or a RELAY_CONNECTED cell are queued; that queue is processed -immediately after a RELAY_CONNECTED cell is issued for the stream, or -freed after a RELAY_END cell is issued for the stream. - -The earlier part of this section will be addressed in the sibling -proposal. - -Compatibility: - -There are compatibility issues, as mentioned above. OPs MUST NOT send -optimistic data to Exit nodes whose version numbers predate (something). -OPs MAY send optimistic data to Exit nodes whose version numbers match -or follow that value. (But see the question about independent server -reimplementations, above.) - -Implementation: - -Here is a simple patch. It seems to work with both regular streams and -hidden services, but there may be other corner cases I'm not aware of. -(Do streams used for directory fetches, hidden services, etc. take a -different code path?) - -diff --git a/src/or/connection.c b/src/or/connection.c -index 7b1493b..f80cd6e 100644 ---- a/src/or/connection.c -+++ b/src/or/connection.c -@@ -2845,7 +2845,13 @@ _connection_write_to_buf_impl(const char *string, size_t len, - return; - } - -- connection_start_writing(conn); -+ /* If we receive optimistic data in the EXIT_CONN_STATE_RESOLVING -+ * state, we don't want to try to write it right away, since -+ * conn->write_event won't be set yet. Otherwise, write data from -+ * this conn as the socket is available. 
*/ -+ if (conn->state != EXIT_CONN_STATE_RESOLVING) { -+ connection_start_writing(conn); -+ } - if (zlib) { - conn->outbuf_flushlen += buf_datalen(conn->outbuf) - old_datalen; - } else { -@@ -3382,7 +3388,11 @@ assert_connection_ok(connection_t *conn, time_t now) - tor_assert(conn->s < 0); - - if (conn->outbuf_flushlen > 0) { -- tor_assert(connection_is_writing(conn) || conn->write_blocked_on_bw || -+ /* With optimistic data, we may have queued data in -+ * EXIT_CONN_STATE_RESOLVING while the conn is not yet marked to writing. -+ * */ -+ tor_assert(conn->state == EXIT_CONN_STATE_RESOLVING || -+ connection_is_writing(conn) || conn->write_blocked_on_bw || - (CONN_IS_EDGE(conn) && TO_EDGE_CONN(conn)->edge_blocked_on_circ)); - } - -diff --git a/src/or/relay.c b/src/or/relay.c -index fab2d88..e45ff70 100644 ---- a/src/or/relay.c -+++ b/src/or/relay.c -@@ -1019,6 +1019,9 @@ connection_edge_process_relay_cell(cell_t *cell, circuit_t *circ, - relay_header_t rh; - unsigned domain = layer_hint?LD_APP:LD_EXIT; - int reason; -+ int optimistic_data = 0; /* Set to 1 if we receive data on a stream -+ that's in the EXIT_CONN_STATE_RESOLVING -+ or EXIT_CONN_STATE_CONNECTING states.*/ - - tor_assert(cell); - tor_assert(circ); -@@ -1038,9 +1041,20 @@ connection_edge_process_relay_cell(cell_t *cell, circuit_t *circ, - /* either conn is NULL, in which case we've got a control cell, or else - * conn points to the recognized stream. */ - -- if (conn && !connection_state_is_open(TO_CONN(conn))) -- return connection_edge_process_relay_cell_not_open( -- &rh, cell, circ, conn, layer_hint); -+ if (conn && !connection_state_is_open(TO_CONN(conn))) { -+ if ((conn->_base.state == EXIT_CONN_STATE_CONNECTING || -+ conn->_base.state == EXIT_CONN_STATE_RESOLVING) && -+ rh.command == RELAY_COMMAND_DATA) { -+ /* We're going to allow DATA cells to be delivered to an exit -+ * node in state EXIT_CONN_STATE_CONNECTING or -+ * EXIT_CONN_STATE_RESOLVING. This speeds up HTTP, for example. 
*/ -+ log_warn(domain, "Optimistic data received."); -+ optimistic_data = 1; -+ } else { -+ return connection_edge_process_relay_cell_not_open( -+ &rh, cell, circ, conn, layer_hint); -+ } -+ } - - switch (rh.command) { - case RELAY_COMMAND_DROP: -@@ -1090,7 +1104,9 @@ connection_edge_process_relay_cell(cell_t *cell, circuit_t *circ, - log_debug(domain,"circ deliver_window now %d.", layer_hint ? - layer_hint->deliver_window : circ->deliver_window); - -- circuit_consider_sending_sendme(circ, layer_hint); -+ if (!optimistic_data) { -+ circuit_consider_sending_sendme(circ, layer_hint); -+ } - - if (!conn) { - log_info(domain,"data cell dropped, unknown stream (streamid %d).", -@@ -1107,7 +1123,9 @@ connection_edge_process_relay_cell(cell_t *cell, circuit_t *circ, - stats_n_data_bytes_received += rh.length; - connection_write_to_buf(cell->payload + RELAY_HEADER_SIZE, - rh.length, TO_CONN(conn)); -- connection_edge_consider_sending_sendme(conn); -+ if (!optimistic_data) { -+ connection_edge_consider_sending_sendme(conn); -+ } - return 0; - case RELAY_COMMAND_END: - reason = rh.length > 0 ? - -Performance and scalability notes: - -There may be more RAM used at Exit nodes, as mentioned above, but it is -transient. diff --git a/doc/spec/proposals/ideas/xxx-auto-update.txt b/doc/spec/proposals/ideas/xxx-auto-update.txt deleted file mode 100644 index dc9a857c1e..0000000000 --- a/doc/spec/proposals/ideas/xxx-auto-update.txt +++ /dev/null @@ -1,39 +0,0 @@ - -Notes on an auto updater: - -steve wants a "latest" symlink so he can always just fetch that. - -roger worries that this will exacerbate the "what version are you -using?" "latest." problem. - -weasel suggests putting the latest recommended version in dns. then -we don't have to hit the website. it's got caching, it's lightweight, -it scales. just put it in a TXT record or something. - -but, no dnssec. 
- -roger suggests a file on the https website that lists the latest -recommended version (or filename or url or something like that). - -(steve seems to already be doing this with xerobank. he additionally -suggests a little blurb that can be displayed to the user to describe -what's new.) - -how to verify you're getting the right file? -a) it's https. -b) ship with a signing key, and use some openssl functions to verify. -c) both - -andrew reminds us that we have a "recommended versions" line in the -consensus directory already. - -if only we had some way to point out the "latest stable recommendation" -from this list. we could list it first, or something. - -the recommended versions line also doesn't take into account which -packages are available -- e.g. on Windows one version might be the best -available, and on OS X it might be a different one. - -aren't there existing solutions to this? surely there is a beautiful, -efficient, crypto-correct auto updater lib out there. even for windows. - diff --git a/doc/spec/proposals/ideas/xxx-bridge-disbursement.txt b/doc/spec/proposals/ideas/xxx-bridge-disbursement.txt deleted file mode 100644 index 6c9a3c71ed..0000000000 --- a/doc/spec/proposals/ideas/xxx-bridge-disbursement.txt +++ /dev/null @@ -1,174 +0,0 @@ - -How to hand out bridges. - -Divide bridges into 'strategies' as they come in. Do this uniformly -at random for now. - -For each strategy, we'll hand out bridges in a different way to -clients. This document describes two strategies: email-based and -IP-based. - -0. Notation: - - HMAC(k,v) : an HMAC of v using the key k. - - A|B: The string A concatenated with the string B. - - -1. Email-based. - - Goal: bootstrap based on one or more popular email service's sybil - prevention algorithms. - - - Parameters: - HMAC -- an HMAC function - P -- a time period - K -- the number of bridges to send in a period. - - Setup: Generate two nonces, N and M. 
- - As bridges arrive, put them into a ring according to HMAC(N,ID) - where ID is the bridge's identity digest. - - Divide time into divisions of length P. - - When we get an email: - - If it's not from a supported email service, reject it. - - If we already sent a response to that email address (normalized) - in this period, send _exactly_ the same response. - - If it is from a supported service, generate X = HMAC(M,PS|E) where E - is the lowercased normalized email address for the user, and - where PS is the start of the current period. Send - the first K bridges in the ring after point X. - - [If we want to make sure that repeat queries are given exactly the - same results, then we can't let the ring change during the - time period. For a long time period like a month, that's quite a - hassle. How about instead just keeping a replay cache of addresses - that have been answered, and sending them a "sorry, you already got - your addresses for the time period; perhaps you should try these - other fine distribution strategies while you wait?" response? This - approach would also resolve the "Make sure you can't construct a - distinct address to match an existing one" note below. -RD] - - [I think, if we get a replay, we need to send back the same - answer as we did the first time, not say "try again." - Otherwise we need to worry that an attacker can keep people - from getting bridges by preemptively asking for them, - or that an attacker may force them to prove they haven't - gotten any bridges by asking. -NM] - - [While we're at it, if we do the replay cache thing and don't need - repeatable answers, we could just pick K random answers from the - pool. Is it beneficial that a bridge user who knows about a clump of - nodes will be sharing them with other users who know about a similar - (overlapping) clump? 
One good aspect is against an adversary who - learns about a clump this way and watches those bridges to learn - other users and discover *their* bridges: he doesn't learn about - as many new bridges as he might if they were randomly distributed. - A drawback is against an adversary who happens to pick two email - addresses in P that include overlapping answers: he can measure - the difference in clumps and estimate how quickly the bridge pool - is growing. -RD] - - [Random is one more darn thing to implement; rings are already - there. -NM] - - [If we make the period P be mailbox-specific, and make it a random - value around some mean, then we make it harder for an attacker to - know when to try using his small army of gmail addresses to gather - another harvest. But we also make it harder for users to know when - they can try again. -RD] - - [Letting the users know about when they can try again seems - worthwhile. Otherwise users and attackers will all probe and - probe and probe until they get an answer. No additional - security will be achieved, but bandwidth will be lost. -NM] - - To normalize an email address: - Start with the RFC822 address. Consider only the mailbox {???} - portion of the address (username@domain). Put this into lowercase - ascii. - - Questions: - What to do with weird character encodings? Look up the RFC. - - Notes: - Make sure that you can't force a single email address to appear - in lots of different ways. IOW, if nickm@freehaven.net and - NICKM@freehaven.net aren't treated the same, then I can get lots - more bridges than I should. - - Make sure you can't construct a distinct address to match an - existing one. IOW, if we treat nickm@X and nickm@Y as the same - user, then anybody can register nickm@Z and use it to tell which - bridges nickm@X got (or would get). - - Make sure that we actually check headers so we can't be trivially - used to spam people. - - -2. IP-based. 
- - Goal: avoid handing out all the bridges to users in a similar IP - space and time. - - Parameters: - - T_Flush -- how long it should take a user on a single network to - see a whole cluster of bridges. - - N_C - - K -- the number of bridges we hand out in response to a single - request. - - Setup: using an AS map or a geoip map or some other flawed input - source, divide IP space into "areas" such that surveying a large - collection of "areas" is hard. For v0, use /24 address blocks. - - Group areas into N_C clusters. - - Generate secrets L, M, N. - - Set the period P such that P*(bridges-per-cluster/K) = T_flush. - Don't set P to greater than a week, or less than three hours. - - When we get a bridge: - - Based on HMAC(L,ID), assign the bridge to a cluster. Within each - cluster, keep the bridges in a ring based on HMAC(M,ID). - - [Should we re-sort the rings for each new time period, so the ring - for a given cluster is based on HMAC(M,PS|ID)? -RD] - - When we get a connection: - - If it's http, redirect it to https. - - Let area be the incoming IP network. Let PS be the current - period. Compute X = HMAC(N, PS|area). Return the next K bridges - in the ring after X. - - [Don't we want to compute C = HMAC(key, area) to learn what cluster - to answer from, and then X = HMAC(key, PS|area) to pick a point in - that ring? -RD] - - - Need to clarify that some HMACs are for rings, and some are for - partitions. How rings scale is clear. How do we grow the number of - partitions? Looking at successive bits from the HMAC output is one way. - -3. Open issues - - Denial of service attacks - A good view of network topology - -at some point we should learn some reliability stats on our bridges. when -we say above 'give out k bridges', we might give out 2 reliable ones and -k-2 others. we count around the ring the same way we do now, to find them. 
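The ring lookups described above (sort bridges by HMAC(M,ID), then return the next K bridges after X = HMAC(N, PS|area)) can be sketched as follows. This is only an illustration under stated assumptions: the text does not name a hash function, so HMAC-SHA256 is assumed here, and the byte encoding of "PS|area" as well as the names `BridgeRing`, `ring_position`, and `bridges_for_area` are hypothetical.

```python
import bisect
import hashlib
import hmac


def ring_position(key: bytes, data: bytes) -> bytes:
    """Return HMAC(key, data); SHA-256 is an assumption, not from the text."""
    return hmac.new(key, data, hashlib.sha256).digest()


class BridgeRing:
    """Bridges kept sorted by HMAC(key, ID); lookups walk forward from a point."""

    def __init__(self, key: bytes):
        self.key = key
        self.points = []  # sorted list of (position, bridge_id) tuples

    def add(self, bridge_id: bytes) -> None:
        entry = (ring_position(self.key, bridge_id), bridge_id)
        bisect.insort(self.points, entry)

    def after(self, point: bytes, k: int) -> list:
        """Return up to k bridge IDs following `point`, wrapping around the ring."""
        if not self.points:
            return []
        # The sentinel sorts after any real ID that shares the same position.
        start = bisect.bisect_right(self.points, (point, b"\xff" * 64))
        count = min(k, len(self.points))
        size = len(self.points)
        return [self.points[(start + i) % size][1] for i in range(count)]


def bridges_for_area(ring: BridgeRing, secret_n: bytes,
                     period_start: int, area: str, k: int) -> list:
    """Compute X = HMAC(N, PS|area) and return the next k bridges after X."""
    # Hypothetical encoding of "PS|area": decimal period start, "|", area string.
    x = ring_position(secret_n, str(period_start).encode() + b"|" + area.encode())
    return ring.after(x, k)
```

Because X depends only on the secret, the period start, and the area, repeated requests from the same /24 within one period return the same K bridges as long as the ring itself does not change, which is the property debated in the bracketed notes.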
- diff --git a/doc/spec/proposals/ideas/xxx-bwrate-algs.txt b/doc/spec/proposals/ideas/xxx-bwrate-algs.txt deleted file mode 100644 index 757f5bc55e..0000000000 --- a/doc/spec/proposals/ideas/xxx-bwrate-algs.txt +++ /dev/null @@ -1,106 +0,0 @@ -# The following two algorithms - - -# Algorithm 1 -# TODO: Burst and Relay/Regular differentiation - -BwRate = Bandwidth Rate in Bytes Per Second -GlobalWriteBucket = 0 -GlobalReadBucket = 0 -Epoch = Token Fill Rate in seconds: suggest 50ms=.050 -SecondCounter = 0 -MinWriteBytes = Minimum amount bytes per write - -Every Epoch Seconds: - UseMinWriteBytes = MinWriteBytes - WriteCnt = 0 - ReadCnt = 0 - BytesRead = 0 - - For Each Open OR Conn with pending write data: - WriteCnt++ - For Each Open OR Conn: - ReadCnt++ - - BytesToRead = (BwRate*Epoch + GlobalReadBucket)/ReadCnt - BytesToWrite = (BwRate*Epoch + GlobalWriteBucket)/WriteCnt - - if BwRate/WriteCnt < MinWriteBytes: - # If we aren't likely to accumulate enough bytes in a second to - # send a whole cell for our connections, send partials - Log(NOTICE, "Too many ORCons to write full blocks. Sending short packets.") - UseMinWriteBytes = 1 - # Other option: We could switch to plan 2 here - - # Service each writable ORConn. If there are any partial writes, - # return remaining bytes from this epoch to the global pool - For Each Open OR Conn with pending write data: - ORConn->write_bucket += BytesToWrite - if ORConn->write_bucket > UseMinWriteBytes: - w = write(ORConn, MIN(len(ORConn->write_data), ORConn->write_bucket)) - # possible that w < ORConn->write_data here due to TCP pushback. 
- # We should restore the rest of the write_bucket to the global - # buffer - GlobalWriteBucket += (ORConn->write_bucket - w) - ORConn->write_bucket = 0 - - For Each Open OR Conn: - r = read_nonblock(ORConn, BytesToRead) - BytesRead += r - - SecondCounter += Epoch - if SecondCounter < 1: - # Save unused bytes from this epoch to be used later in the second - GlobalReadBucket += (BwRate*Epoch - BytesRead) - else: - SecondCounter = 0 - GlobalReadBucket = 0 - GlobalWriteBucket = 0 - For Each ORConn: - ORConn->write_bucket = 0 - - - -# Alternate plan for Writing fairly. Reads would still be covered -# by plan 1 as there is no additional network overhead for short reads, -# so we don't need to try to avoid them. -# -# I think this is actually pretty similar to what we do now, but -# with the addition that the bytes accumulate up to the second mark -# and we try to keep track of our position in the write list here -# (unless libevent is doing that for us already and I just don't see it) -# -# TODO: Burst and Relay/Regular differentiation - -# XXX: The inability to send single cells will cause us to block -# on EXTEND cells for low-bandwidth node pairs.. -BwRate = Bandwidth Rate in Bytes Per Second -WriteBytes = Bytes per write -Epoch = MAX(MIN(WriteBytes/BwRate, .333s), .050s) - -SecondCounter = 0 -GlobalWriteBucket = 0 - -# New connections are inserted at Head-1 (the 'tail' of this circular list) -# This is not 100% fifo for all node data, but it is the best we can do -# without insane amounts of additional queueing complexity. 
-WriteConnList = List of Open OR Conns with pending write data > WriteBytes -WriteConnHead = 0 - -Every Epoch Seconds: - GlobalWriteBucket += BwRate*Epoch - WriteListEnd = WriteConnHead - - do - ORConn = WriteConnList[WriteConnHead] - w = write(ORConn, WriteBytes) - GlobalWriteBucket -= w - WriteConnHead = (WriteConnHead + 1) % len(WriteConnList) - while GlobalWriteBucket > 0 and WriteConnHead != WriteListEnd - - SecondCounter += Epoch - if SecondCounter >= 1: - SecondCounter = 0 - GlobalWriteBucket = 0 - - diff --git a/doc/spec/proposals/ideas/xxx-choosing-crypto-in-tor-protocol.txt b/doc/spec/proposals/ideas/xxx-choosing-crypto-in-tor-protocol.txt deleted file mode 100644 index e8489570f7..0000000000 --- a/doc/spec/proposals/ideas/xxx-choosing-crypto-in-tor-protocol.txt +++ /dev/null @@ -1,138 +0,0 @@ -Filename: xxx-choosing-crypto-in-tor-protocol.txt -Title: Picking cryptographic standards in the Tor wire protocol -Author: Marian -Created: 2009-05-16 -Status: Draft - -Motivation: - - SHA-1 is horribly outdated and not suited for security critical - purposes. SHA-2, RIPEMD-160, Whirlpool and Tiger are good options - for a short-term replacement, but in the long run, we will - probably want to upgrade to the winner or a semi-finalist of the - SHA-3 competition. - - For a 2006 comparison of different hash algorithms, read: - http://www.sane.nl/sane2006/program/final-papers/R10.pdf - - Other reading about SHA-1: - http://www.schneier.com/blog/archives/2005/02/sha1_broken.html - http://www.schneier.com/blog/archives/2005/08/new_cryptanalyt.html - http://www.schneier.com/paper-preimages.html - - Additionally, AES has been theoretically broken for years. While - the attack is still not efficient enough that the public sector - has been able to prove that it works, we should probably consider - the time between a theoretical attack and a practical attack as an - opportunity to figure out how to upgrade to a better algorithm, - such as Twofish. 
- - See: - http://schneier.com/crypto-gram-0209.html#1 - -Design: - - I suggest that nodes should publish in directories which - cryptographic standards, such as hash algorithms and ciphers, - they support. Clients communicating with nodes will then - pick whichever of those cryptographic standards they prefer - the most. In the case that the node does not publish which - cryptographic standards it supports, the client should assume - that the server supports the older standards, such as SHA-1 - and AES, until such time as we choose to desupport those - standards. - - Node to node communications could work similarly. However, in - case they both support a set of algorithms but have different - preferences, the disagreement would have to be resolved - somehow. Two possibilities include: - * the node requesting communications presents which - cryptographic standards it supports in the request. The - other node picks. - * both nodes send each other lists of what they support and - what version of Tor they are using. The newer node picks, - based on the assumption that the newer node has the most up - to date information about which hash algorithm is the best. - Of course, the node could lie about its version, but then - again, it could also maliciously choose only to support older - algorithms. - - Using this method, we could potentially add server side support - to hash algorithms and ciphers before we instruct clients to - begin preferring those hash algorithms and ciphers. In this way, - the clients could upgrade and the servers would already support - the newly preferred hash algorithms and ciphers, even if the - servers were still using older versions of Tor, so long as the - older versions of Tor were at least new enough to have server - side support. - - This would make quickly upgrading to new hash algorithms and - ciphers easier. This could be very useful when new attacks - are published. 
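The client-side selection described in this Design section could look roughly like the sketch below. It is not a specification: the algorithm names, the dictionary layout of a node's advertisement, and the names `LEGACY_DEFAULTS`, `CLIENT_PREFERENCES`, and `pick_algorithms` are all hypothetical. The only behavior taken from the proposal is that the client picks its most-preferred algorithm among those the node publishes, and assumes the older standards (SHA-1 and AES) when the node publishes nothing.

```python
# Assumed fallback when a node does not publish what it supports.
LEGACY_DEFAULTS = {"hash": ["sha1"], "cipher": ["aes"]}

# Hypothetical hardcoded client preferences, most preferred first.
CLIENT_PREFERENCES = {"hash": ["sha256", "sha1"], "cipher": ["twofish", "aes"]}


def pick_algorithms(advertised, preferences=CLIENT_PREFERENCES):
    """Pick one algorithm per kind: the client's top choice that the node supports.

    `advertised` maps a kind ("hash", "cipher") to the names a node published,
    or is None/empty when the node published nothing.
    """
    chosen = {}
    for kind, ranked in preferences.items():
        supported = (advertised or {}).get(kind) or LEGACY_DEFAULTS[kind]
        for name in ranked:
            if name in supported:
                chosen[kind] = name
                break
        else:
            raise ValueError("no mutually supported %s algorithm" % kind)
    return chosen
```

Keeping the preference list fixed in the client, rather than user-configurable, matches the mitigation for segmentation attacks discussed in this proposal.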
- - One concern is that client preferences could expose the client - to segmentation attacks. To mitigate this, we suggest hardcoding - preferences in the client, to prevent the client from choosing - to use a new hash algorithm or cipher that no one else is using - yet. While offering a preference might be useful in case a client - with an older version of Tor wants to start using the newer hash - algorithm or cipher that everyone else is using, if the client - cares enough, he or she can just upgrade Tor. - - We may also have to worry about nodes which, through laziness or - maliciousness, refuse to start supporting new hash algorithms or - ciphers. This must be balanced with the need to maintain - backward compatibility so the client will have a large selection - of nodes to pick from. Adding new hash algorithms and ciphers - long before we suggest nodes start using them can help mitigate - this. However, eventually, once sufficient nodes support new - standards, client side support for older standards should be - disabled, particularly if there are practical rather than merely - theoretical attacks. - - Server side support for older standards can be kept much longer - than client side support, since clients using older hashes and - ciphers are really only hurting themselves. - - If server side support for a hash algorithm or cipher is added - but never preferred before we decide we don't really want it, - support can be removed without having to worry about backward - compatibility. - -Security implications: - Improving cryptography will improve Tor's security. However, if - clients pick different cryptographic standards, they could be - partitioned based on their cryptographic preferences. We also - need to worry about nodes refusing to support new standards. - These issues are detailed above. - -Specification: - - Todo. Need better understanding of how Tor currently works or - help from someone who does. 
- -Compatibility: - - This idea is intended to allow easier upgrading of cryptographic - hash algorithms and ciphers while maintaining backwards - compatibility. However, at some point, backwards compatibility - with very old hashes and ciphers should be dropped for security - reasons. - -Implementation: - - Todo. - -Performance and scalability notes: - - Better hashes and ciphers are sometimes a little more CPU intensive - than weaker ones. For instance, on most computers AES is a little - faster than Twofish. However, in that example, I consider Twofish's - additional security worth the tradeoff. - -Acknowledgements: - - Discussed this on IRC with a few people, mostly Nick Mathewson. - Nick was particularly helpful in explaining how Tor works, - explaining goals, and providing various links to Tor - specifications. diff --git a/doc/spec/proposals/ideas/xxx-controllers-intercept-extends.txt b/doc/spec/proposals/ideas/xxx-controllers-intercept-extends.txt deleted file mode 100644 index 76ba5c84b5..0000000000 --- a/doc/spec/proposals/ideas/xxx-controllers-intercept-extends.txt +++ /dev/null @@ -1,44 +0,0 @@ -Author: Geoff Goodell -Title: Allow controller to manage circuit extensions -Date: 12 March 2006 - -History: - - This was once bug 268. Moving it into the proposal system for posterity. - -Test: - -Tor controllers should have a means of learning more about circuits built -through Tor routers. Specifically, if a Tor controller is connected to a Tor -router, it should be able to subscribe to a new class of events, perhaps -"onion" or "router" events. A Tor router SHOULD then ensure that the -controller is informed: - -(a) (NEW) when it receives a connection from some other location, in which -case it SHOULD indicate (1) a unique identifier for the circuit, and (2) a -ServerID in the event of an OR connection from another Tor router, and -Hostname otherwise. 
- -(b) (REQUEST) when it receives a request to extend an existing circuit to a -successive Tor router, in which case it SHOULD provide (1) the unique -identifier for the circuit, (2) a Hostname (or, if possible, ServerID) of the -previous Tor router in the circuit, and (3) a ServerID for the requested -successive Tor router in the circuit; - -(c) (EXTEND) Tor will attempt to extend the circuit to some other router, in -which case it SHOULD provide the same fields as provided for REQUEST. - -(d) (SUCCEEDED) The circuit has been successfully extended to some other -router, in which case it SHOULD provide the same fields as provided for -REQUEST. - -We also need a new configuration option analogous to _leavestreamsunattached, -specifying whether the controller is to manage circuit extensions or not. -Perhaps we can call it "_leavecircuitsunextended". When set to 0, Tor -manages everything as usual. When set to 1, a circuit received by the Tor -router cannot transition from "REQUEST" to "EXTEND" state without being -directed by a new controller command. The controller command probably does -not need any arguments, since circuits are extended per client source -routing, and all that the controller does is accept or reject the extension. - -This feature can be used as a basis for enforcing routing policy. diff --git a/doc/spec/proposals/ideas/xxx-encrypted-services.txt b/doc/spec/proposals/ideas/xxx-encrypted-services.txt deleted file mode 100644 index 3414f3c4fb..0000000000 --- a/doc/spec/proposals/ideas/xxx-encrypted-services.txt +++ /dev/null @@ -1,18 +0,0 @@ - -the basic idea might be to generate a keypair, and sign little statements -like "this key corresponds to this relay id", and publish them on karsten's -hs dht. - -so if you want to talk to it, you look it up, then go to that exit. 
-and by 'go to' i mean 'build a tor circuit like normal except you're sure -where to exit' - -connecting to it is slower than usual, but once you're connected, it's no -slower than normal tor. -and you get what wikileaks wants from its hidden service, which is really -just the UI piece. -indymedia also wants this. - -might be interesting to let an encrypted service list more than one relay, -too. - diff --git a/doc/spec/proposals/ideas/xxx-exit-scanning-outline.txt b/doc/spec/proposals/ideas/xxx-exit-scanning-outline.txt deleted file mode 100644 index d84094400a..0000000000 --- a/doc/spec/proposals/ideas/xxx-exit-scanning-outline.txt +++ /dev/null @@ -1,44 +0,0 @@ -1. Scanning process - A. Non-HTML/JS HTTP mime types compared via SHA1 hash - B. Dynamic HTTP content filtered at 4 levels: - 1. IP change+Tor cookie utilization - - Tor cookies replayed with new IP in case of changes - 2. HTML Tag+Attribute+JS comparison - - Comparisons made based only on "relevant" HTML tags - and attributes - 3. HTML Tag+Attribute+JS diffing - - Tags, attributes and JS AST nodes that change during - Non-Tor fetches pruned from comparison - 4. URLS with > N% of node failures removed - - results purged from filesystem at end of scan loop - C. SSL scanning handles some forms of dynamic certs - 1. Catalogs certs for all IPs resolved locally - by getaddrinfo over the duration of the scan. - - Updated each test. - 2. If the domain presents a new cert for each IP, this - is noted on the failure result for the node - 3. If the same IP presents two different certs locally, - the cert list is first refreshed, and if it happens - again, discarded - 4. A N% node failure filter also applies - D. Scanner can be restarted from any point in the event - of scanner or system crashes, or graceful shutdown. - - Results+scan state pickled to filesystem continuously -2. Cron job checks results periodically for reporting - A. 
Divide failures into three types of BadExit based on type - and frequency over time and incident rate - B. Write reject lines to approved-routers for those three types: - 1. ID Hex based (for misconfig/network problems easily fixed) - 2. IP based (for content modification) - 3. IP+mask based (for continuous/egregious content modification) - C. Emails results to tor-scanners@freehaven.net -3. Human Review and Appeal - A. ID Hex-based BadExit is meant to be easy to remove - without needing to beg us. - - Should this behavior be encouraged? - B. Optionally can reserve IP based badexits for human review - 1. Results are encapsulated fully on the filesystem and can be - reviewed without network access - 2. Soat has --rescan to rescan failed nodes from a data directory - - New set of URLs used - diff --git a/doc/spec/proposals/ideas/xxx-geoip-survey-plan.txt b/doc/spec/proposals/ideas/xxx-geoip-survey-plan.txt deleted file mode 100644 index 49c6615a66..0000000000 --- a/doc/spec/proposals/ideas/xxx-geoip-survey-plan.txt +++ /dev/null @@ -1,137 +0,0 @@ - - -Abstract - - This document explains how to estimate roughly how many Tor users there - are, and how many there are in each country. Statistics are - involved. - -Motivation - - There are a few reasons we need to keep track of which countries - Tor users (in aggregate) are coming from: - - - Resource allocation. Knowing about underserved countries with - lots of users can let us know about where we need to direct - translation and outreach efforts. - - - Anticensorship. Sudden drops in usage on a national basis can - indicate the arrival of a censorious firewall. - - - Sponsor outreach and self-evaluation. Many people and - organizations who are interested in funding The Tor Project's - work want to know that we're successfully serving parts of the - world they're interested in, and that efforts to expand our - userbase are actually succeeding. So do we. 
- -Goals - - We want to know approximately how many Tor users there are, and which - countries they're in, even in the presence of a hypothetical - "directory guard" feature. Some uncertainty is okay, but we'd like - to be able to put a bound on the uncertainty. - - We need to make sure this information isn't exposed in a way that - helps an adversary. - -Methods for current clients: - - Every client downloads network status documents. There are - currently three methods (one hypothetical) for clients to get them. - - 0.1.2.x clients (and earlier) fetch a v2 networkstatus - document about every NETWORKSTATUS_CLIENT_DL_INTERVAL [30 - minutes]. - - - 0.2.0.x clients fetch a v3 networkstatus consensus document - at a random interval between when their current document is no - longer freshest, and when their current document is about to - expire. - - [In both of the above cases, clients choose a running - directory cache at random with odds roughly proportional to - its bandwidth. If they're just starting, they know a XXXX FIXME -NM] - - - In some future version, clients will choose directory caches - to serve as their "directory guards" to avoid profiling - attacks, similarly to how clients currently start all their - circuits at guard nodes. - - We assume that a directory cache can tell which of these three - categories a client is in by the format of its status request. - - A directory cache can be made to count distinct client IP - addresses that make a certain request of it in a given timeframe, - and total requests made to it over that timeframe. For the first - two cases, a cache can get a picture of the overall - number and countries of users in the network by dividing the IP - count by the probability with which they (as a cache) would be - chosen. 
Assuming that our listed bandwidth is such that we expect - to be chosen with probability P for any given request, and we've - been counting IPs for long enough that we expect the average - client to have made N requests, they will have visited us at least - once with probability P' = 1-(1-P)^N, and so we divide the IP - counts we've seen by P' for our estimate. To estimate total - number of clients of a given type, determine how many requests a - client of that type will make over that time, and assume we'll - have seen P of them. - - Both of these numbers are useful: the IP counts will give the - total number of IPs connecting to the network, and the request - counts will give the total number of users on the network at any - given time. - - Notes: - - [Over H hours, the N for V2 clients is 2*H, and the N for V3 - clients is currently around H/2 or H/3.] - - - (We should only count requests that we actually intend to answer; - 503 requests shouldn't count.) - - - These measurements should also be taken at a directory - authority if possible: their picture of the network is skewed - by clients that fetch from them directly. These clients, - however, are all the clients that are just bootstrapping - (assuming that the fallback-consensus feature isn't yet used - much). - - - These measurements also overestimate the V2 download rate if - some downloads fail and clients retry them later after backing - off. - -Methods for directory guards: - - If directory guards are in use, directory guards get a picture of - all those users who chose them as a guard when they were listed - as a good choice for a guard, and who are also on the network - now. 
The cleanest data here will come from nodes that were listed - as good new-guard choices for a while, and have not been so for a - while longer (to study decay rates); nodes that have been listed - as good new-guard choices consistently for a long time (to get a - sample of the network); and nodes that have been listed as good - new-guard choices only recently (to get a sample of new users and - users whose guards have died out.) - - Since directory guards are currently unspecified, we'll need to - make some guesses about how they'll turn out to work. Here are - a couple of approaches that could work. - - We could have clients pick completely new directory guards on - a rolling basis every two months or so. This would ensure - that staying as a guard for a while would be sufficient to - see a sample of users. This is potentially advantageous for - load-balancing the network as well, though it might lose some - of the benefits of directory guards. We need to quantify the - impact of this; it might not actually make stuff worse in - practice, if most guards don't stay good guards for a month - or two. - - - We could try to collect statistics at several directory - guards and combine their statistics, but we would need to make - sure that for all time, at least one of the directory guards - had been recommended as a good choice for new guards. By - looking at new-IP rates for guards, we could get an idea of - user uptake; by looking at old-IP decay rates, we could get - an idea of turnover. This approach would entail significant - complexity, and we'd probably need to record more information - than we'd really like to. 
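The estimate given under "Methods for current clients" (divide the distinct-IP count by P' = 1-(1-P)^N) can be written out as a small sketch. The function names `visit_probability` and `estimate_total_clients` are illustrative only, and the sketch assumes, as the text does, that every client picks a cache independently with probability P on each of its N requests.

```python
def visit_probability(p: float, n: float) -> float:
    """Chance that a client making n requests picked this cache at least once.

    p is the probability that the cache is chosen for any single request.
    """
    return 1.0 - (1.0 - p) ** n


def estimate_total_clients(distinct_ips_seen: int, p: float, n: float) -> float:
    """Scale the distinct IPs seen at one cache up to a network-wide estimate."""
    p_visit = visit_probability(p, n)
    if p_visit <= 0.0:
        raise ValueError("cache is never chosen; cannot extrapolate")
    return distinct_ips_seen / p_visit
```

Per the notes in that section, N would be about 2*H for V2 clients and about H/2 or H/3 for V3 clients over H hours, so the two client types need separate estimates.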
- - diff --git a/doc/spec/proposals/ideas/xxx-grand-scaling-plan.txt b/doc/spec/proposals/ideas/xxx-grand-scaling-plan.txt deleted file mode 100644 index 336798cc0f..0000000000 --- a/doc/spec/proposals/ideas/xxx-grand-scaling-plan.txt +++ /dev/null @@ -1,97 +0,0 @@ - -Right now as I understand it, there are n big scaling problems heading -our way: - -1) Clients need to learn all the relay descriptors they could use. That's -a lot of bytes through a potentially small pipe. -2) Relays need to hold open TCP connections to most other relays. -3) Clients need to learn the whole networkstatus. Even using v3, as -the network grows that will become unwieldy. -4) Dir mirrors need to mirror all the relay descriptors; eventually this -will get big too. - -Here's my plan. - --------------------------------------------------------------------- - -Piece one: download O(1) descriptors rather than O(n) descriptors. - -We need to change our circuit extend protocol so it fetches a relay -descriptor at every 'extend' operation: - - Client fetches networkstatus, picks guards, connects to one. - - Client picks middle hop out of networkstatus, asks guard for - its descriptor, then extends to it. - - Client picks exit hop out of networkstatus, asks middle hop - for its descriptor, then extends to it. Done. - -The client needs to ask for the descriptor even if it already has a -copy, because otherwise we leak too much. Also, the descriptor needs to -be padded to some large (but not too large) size to prevent the middle -hops from guessing about it. - -The first step towards this is to instrument the current code to see -how much of a win this would actually be -- I am guessing it is already -a win even with the current number of descriptors. - -We also would need to assign the 'Exit' flag more usefully, and make -clients pay attention to it when picking their last hop, since they -don't actually know the exit policies of the relays they're choosing from. 
- -We also need to think harder about other implications -- for example, -a relay with a tiny exit policy won't get the Exit flag, and thus won't -ever get picked as an exit relay. Plus, our "enclave exit" model is out -the window unless we figure out a cool trick. - -More generally, we'll probably want to compress the descriptors that we -send back; maybe 8k is a good upper bound? I wonder if we could ask for -several descriptors, and bundle back all of the ones that fit in the 8k? - -We'd also want to put the load balancing weights into the networkstatus, -so clients can choose fast nodes more often without needing to see the -descriptors. This is a good opportunity for the authorities to be able -to put "more accurate" weights in if they learn to detect attacks. It -also means we should consider running automated audits to make sure the -authorities aren't trying to snooker everybody. - -I'm aiming to get Peter Palfrader to tackle this problem in mid 2008, -but I bet he could use some help. - --------------------------------------------------------------------- - -Piece two: inter-relay communication uses UDP - -If relays send packets to/from other relays via UDP, they don't need a -new descriptor for each such link. Thus we'll still need to keep state -for each link, but we won't max out on sockets. - -Clearly a lot more work needs to be done here. Ian Goldberg has a student -who has been working on it, and if all goes well we'll be chipping in -some funding to continue that. Also, Camilo Viecco has been doing his -PhD thesis on it. - --------------------------------------------------------------------- - -Piece three: networkstatus documents get partitioned - -While the authorities should be expected to be able to handle learning -about all the relays, there's no reason the clients or the mirrors need -to. Authorities should put a cap on the number of relays listed in a -single networkstatus, and split them when they get too big. 
- -We'd need a good way to have each authority come to the same conclusion -about which partition a given relay goes into. - -Directory mirrors would then mirror all the relay descriptors in their -partition. This is compatible with 'piece one' above, since clients in -a given partition will only ask about descriptors in that partition. - -More complex versions of this design would involve overlapping partitions, -but that would seem to start contradicting other parts of this proposal -right quick. - -Nobody is working on this piece yet. It's hard to say when we'll need -it, but it would be nice to have some more thought on it before the week -that we need it. - --------------------------------------------------------------------- - diff --git a/doc/spec/proposals/ideas/xxx-hide-platform.txt b/doc/spec/proposals/ideas/xxx-hide-platform.txt deleted file mode 100644 index ad19fb1fd4..0000000000 --- a/doc/spec/proposals/ideas/xxx-hide-platform.txt +++ /dev/null @@ -1,37 +0,0 @@ -Filename: xxx-hide-platform.txt -Title: Hide Tor Platform Information -Author: Jacob Appelbaum -Created: 24-July-2008 -Status: Draft - - - Hiding Tor Platform Information - -0.0 Introduction - -The current Tor program publishes its specific Tor version and related OS -platform information. This information could be misused by an attacker. - -0.1 Current Implementation - -Currently, the Tor binary sends data that looks like the following: - - Tor 0.2.0.26-rc (r14597) on Darwin Power Macintosh - Tor 0.1.2.19 on Windows XP Service Pack 3 [workstation] {terminal services, - single user} - -1.0 Suggested changes - -It would be useful to allow a user to configure the disclosure of such -information. Such a change would be an option in the torrc file like so: - - HidePlatform Yes - -1.1 Suggested default behavior in the future - -If a user would like to disclose this information, they could configure their -Tor to do so. 
- - HidePlatform No - - diff --git a/doc/spec/proposals/ideas/xxx-port-knocking.txt b/doc/spec/proposals/ideas/xxx-port-knocking.txt deleted file mode 100644 index 85c27ec52d..0000000000 --- a/doc/spec/proposals/ideas/xxx-port-knocking.txt +++ /dev/null @@ -1,91 +0,0 @@ -Filename: xxx-port-knocking.txt -Title: Port knocking for bridge scanning resistance -Author: Jacob Appelbaum -Created: 19-April-2009 -Status: Draft - - Port knocking for bridge scanning resistance - -0.0 Introduction - -This document is a collection of ideas relating to improving scanning -resistance for private bridge relays. This is intended to stop opportunistic -network scanning and subsequent discovery of private bridge relays. - - -0.1 Current Implementation - -Currently private bridges are only hidden by their obscurity. If you know -a bridge's IP address, the bridge can be detected trivially and added to a block -list. - -0.2 Configuring an external port knocking program to control the firewall - -It is currently possible for bridge operators to configure a port knocking -daemon that controls access to the incoming OR port. This is currently out of -scope for Tor and Tor configuration. This process requires the firewall to know -the current nodes in the Tor network. - -1.0 Suggested changes - -Private bridge operators should be able to configure a method of hiding their -relay. Only authorized users should be able to communicate with the private -bridge. This should be done with Tor and if possible without the help of the -firewall. It should be possible for a Tor user to enter a secret key into -Tor or optionally Vidalia on a per bridge basis. This secret key should be -used to authenticate the bridge user to the private bridge. - -1.x Issues with low ports and bind() for ORPort - -Tor opens low numbered ports during startup and then drops privileges. It is -no longer possible to rebind to those lower ports after they are closed. 
- -1.x Issues with OS level packet filtering - -Tor does not know about any OS level packet filtering. Currently there are no -packet filters that understand the Tor network in real time. - -1.x Possible partitioning of users by bridge operator - -Depending on implementation, it may be possible for bridge operators to -uniquely identify users. This appears to be a general bridge issue when a -bridge operator uniquely deploys bridges per user. - -2.0 Implementation ideas - -This is a suggested set of methods for port knocking. - -2.x Using SPA port knocking - -Single Packet Authentication port knocking encodes all required data into a -single UDP packet. Improperly formatted packets may be simply discarded. -Properly formatted packets should be processed and appropriate actions taken. - -2.x Using DNS as a transport for SPA - -It should be possible for Tor to bind to port 53 at startup and merely drop all -packets that are not valid. UDP does not require a response and invalid packets -will not trigger a response from Tor. With base32 encoding it should be -possible to encode SPA as valid DNS requests. This should allow use of the -public DNS infrastructure for authorization requests if desired. - -2.x Ghetto firewalling with opportunistic connection closing - -Until a user has authenticated with Tor, Tor only has a UDP listener. This -listener should never send data in response; it should only open an ORPort -when a user has successfully authenticated. After a user has authenticated -with Tor to open an ORPort, only users who have authenticated will be able -to use it. All other users as identified by their IP address will have their -connection closed before any data is sent or received. This should be -accomplished with an access policy. By default, the access policy should block -all access to the ORPort. - -2.x Timing and reset of access policies - -Access to the ORPort is sensitive. 
The bridge should remove any exceptions -to its access policy regularly when the ORPort is unused. Valid users should -reauthenticate if they do not use the ORPort within a given time frame. - -2.x Additional considerations - -There are many. A format of the packet and the crypto involved is a good start. diff --git a/doc/spec/proposals/ideas/xxx-rate-limit-exits.txt b/doc/spec/proposals/ideas/xxx-rate-limit-exits.txt deleted file mode 100644 index 81fed20af8..0000000000 --- a/doc/spec/proposals/ideas/xxx-rate-limit-exits.txt +++ /dev/null @@ -1,63 +0,0 @@ - -1. Overview - - We should rate limit the volume of stream creations at exits: - -2.1. Per-circuit limits - - If a given circuit opens more than N streams in X seconds, further - stream requests over the next Y seconds should fail with the reason - 'resourcelimit'. Clients will automatically notice this and switch to - a new circuit. - - The goal is to limit the effects of port scans on a given exit relay, - so the relay's ISP won't get hassled as much. - - First thoughts for parameters would be N=100 streams in X=5 seconds - causes 30 seconds of fails; and N=300 streams in X=30 seconds causes - 30 seconds of fails. - - We could simplify by, instead of having a "for 30 seconds" parameter, - just marking the circuit as forever failing new requests. (We don't want - to just close the circuit because it may still have open streams on it.) - -2.2. Per-destination limits - - If a given circuit opens more than N1 streams in X seconds to a single - IP address, or all the circuits combined open more than N2 streams, - then we should fail further attempts to reach that address for a while. - - The goal is to limit the abuse that Tor exit relays can dish out - to a single target either for socket DoS or for web crawling, in - the hopes of a) not triggering their automated defenses, and b) not - making them upset at Tor. 
Hopefully these self-imposed bans will be - much shorter-lived than bans or barriers put up by the websites. - -3. Issues - -3.1. Circuit-creation overload - - Making clients move to new circuits more often will cause more circuit - creation requests. - -3.2. How to pick the parameters? - - If we pick the numbers too low, then popular sites are effectively - cut out of Tor. If we pick them too high, we don't do much good. - - Worse, picking them wrong isn't easy to fix, since the deployed Tor - servers will ship with a certain set of numbers. - - We could put numbers (or "general settings") in the networkstatus - consensus, and Tor exits would adapt more dynamically. - - We could also have a local config option about how aggressive this - server should be with its parameters. - -4. Client-side limitations - - Perhaps the clients should have built-in rate limits too, so they avoid - harassing the servers by default? - - Tricky if we want to get Tor clients in use at large enclaves. - diff --git a/doc/spec/proposals/ideas/xxx-separate-streams-by-port.txt b/doc/spec/proposals/ideas/xxx-separate-streams-by-port.txt deleted file mode 100644 index f26c1e580f..0000000000 --- a/doc/spec/proposals/ideas/xxx-separate-streams-by-port.txt +++ /dev/null @@ -1,59 +0,0 @@ -Filename: xxx-separate-streams-by-port.txt -Title: Separate streams across circuits by destination port -Author: Robert Hogan -Created: 21-Oct-2008 -Status: Draft - -Here's a patch Robert Hogan wrote to use only one destination port per -circuit. It's based on a wishlist item Roger wrote, to never send AIM -usernames over the same circuit that we're hoping to browse anonymously -through. The remaining open question is: how many extra circuits does this -cause an ordinary user to create? My guess is not very many, but I'm wary -of putting this in until we have some better estimate. On the other hand, -not putting it in means that we have a known security flaw. Hm.
- -Index: src/or/or.h -=================================================================== ---- src/or/or.h (revision 17143) -+++ src/or/or.h (working copy) -@@ -1874,6 +1874,7 @@ - - uint8_t state; /**< Current status of this circuit. */ - uint8_t purpose; /**< Why are we creating this circuit? */ -+ uint16_t service; /**< Port conn must have to use this circuit. */ - - /** How many relay data cells can we package (read from edge streams) - * on this circuit before we receive a circuit-level sendme cell asking -Index: src/or/circuituse.c -=================================================================== ---- src/or/circuituse.c (revision 17143) -+++ src/or/circuituse.c (working copy) -@@ -62,10 +62,16 @@ - return 0; - } - -- if (purpose == CIRCUIT_PURPOSE_C_GENERAL) -+ if (purpose == CIRCUIT_PURPOSE_C_GENERAL) { - if (circ->timestamp_dirty && - circ->timestamp_dirty+get_options()->MaxCircuitDirtiness <= now) - return 0; -+ /* If the circuit is dirty and used for services on another port, -+ then it is not suitable. */ -+ if (circ->service && conn->socks_request->port && -+ (circ->service != conn->socks_request->port)) -+ return 0; -+ } - - /* decide if this circ is suitable for this conn */ - -@@ -1351,7 +1357,9 @@ - if (connection_ap_handshake_send_resolve(conn) < 0) - return -1; - } -- -+ if (conn->socks_request->port -+ && (TO_CIRCUIT(circ)->purpose == CIRCUIT_PURPOSE_C_GENERAL)) -+ TO_CIRCUIT(circ)->service = conn->socks_request->port; - return 1; - } - diff --git a/doc/spec/proposals/ideas/xxx-using-spdy.txt b/doc/spec/proposals/ideas/xxx-using-spdy.txt deleted file mode 100644 index d733a84b69..0000000000 --- a/doc/spec/proposals/ideas/xxx-using-spdy.txt +++ /dev/null @@ -1,143 +0,0 @@ -Filename: xxx-using-spdy.txt -Title: Using the SPDY protocol to improve Tor performance -Author: Steven J. Murdoch -Created: 03-Feb-2010 -Status: Draft -Target: - -1. 
Overview - - The SPDY protocol [1] is an alternative method for transferring - web content over TCP, designed to improve efficiency and - performance. A SPDY-aware browser can already communicate with - a SPDY-aware web server over Tor, because this only requires a TCP - stream to be set up. However, a SPDY-aware browser cannot - communicate with a non-SPDY-aware web server. This proposal - outlines how Tor could support this latter case, and why it - may be good for performance. - -2. Motivation - - About 90% of Tor traffic, by connection, is HTTP [2], but - users report subjective performance to be poor. It would - therefore be desirable to improve this situation. SPDY was - designed to offer better performance than HTTP, in - high-latency and/or low-bandwidth situations, and is therefore - an option worth examining. - - If a user wishes to access a SPDY-enabled web server over Tor, - all they need to do is to configure their SPDY-enabled browser - (e.g. Google Chrome) to use Tor. However, there are few - SPDY-enabled web servers, and even if there was high demand - from Tor users, there would be little motivation for server - operators to upgrade, for the benefit of only a small - proportion of their users. - - The motivation of this proposal is to allow only the user to - install a SPDY-enabled browser, and permit web servers to - remain unmodified. Essentially, Tor would incorporate a proxy - on the exit node, which communicates SPDY to the web browser - and normal HTTP to the web server. This proxy would translate - between the two transport protocols, and possibly perform - other optimizations. - - SPDY currently offers five optimizations: - - 1) Multiplexed streams: - An unlimited number of resources can be transferred - concurrently, over a single TCP connection. - - 2) Request prioritization: - The client can set a priority on each resource, to assist - the server in re-ordering responses. 
- - 3) Compression: - Both HTTP header and resource content can be compressed. - - 4) Server push: - The server can offer the client resources which have not - been requested, but which the server believes will be. - - 5) Server hint: - The server can suggest that the client request further - resources, before the main content is transferred. - - Tor currently effectively implements (1), by being able to put - multiple streams on one circuit. SPDY however requires fewer - round-trips to do the same. The other features are not - implemented by Tor. Therefore it is reasonable to expect that - a HTTP <-> SPDY proxy may improve Tor performance, by some - amount. - - The consequences on caching need to be considered carefully. - Most of the optimizations SPDY offers have no effect because - the existing HTTP cache control headers are transmitted without - modification. Server push is more problematic, because here - the server may push a resource that the client already has. - -3. Design outline - - One way to implement the SPDY proxy is for Tor exit nodes to - advertise this capability in their descriptor. The OP would - then preferentially select these nodes when routing streams - destined for port 80. - - Then, rather than sending the usual RELAY_BEGIN cell, the OP - would send a RELAY_BEGIN_TRANSFORMED cell, with a parameter to - indicate that the exit node should translate between SPDY and - HTTP. The rest of the connection process would operate as - usual. - - There would need to be some way of elegantly handling non-HTTP - traffic which goes over port 80. - -4. Implementation status - - SPDY is under active development and both the specification - and implementations are in a state of flux. Initial - experiments with Google Chrome in SPDY-mode and server - libraries indicate that more work is needed before they are - production-ready. 
There is no indication that browsers other - than Google Chrome will support SPDY (and no official - statement as to whether Google Chrome will eventually enable - SPDY by default). - - Implementing a full SPDY proxy would be non-trivial. Stream - multiplexing and compression are supported by existing - libraries and would be fairly simple to implement. Request - prioritization would require some form of caching on the - proxy-side. Server push and server hint would require content - parsing to identify resources which should be treated - specially. - -5. Security and policy implications - - A SPDY proxy would be a significant amount of code, and may - pull in external libraries. This code will process potentially - malicious data, both at the SPDY and HTTP sides. This proposal - therefore increases the risk that exit nodes will be - compromised by exploiting a bug in the proxy. - - This proposal would also be the first way in which Tor is - modifying TCP stream data. Arguably this is still meta-data - (HTTP headers), but there may be some concern that Tor should - not be doing this. - - Torbutton only works with Firefox, but SPDY only works with - Google Chrome. We should be careful not to recommend that - users adopt a browser which harms their privacy in other ways. - -6. Open questions: - - - How difficult would this be to implement? - - - How much performance improvement would it actually result in? - - - Is there some way to rapidly develop a prototype which would - answer the previous question? 
- -[1] SPDY: An experimental protocol for a faster web - http://dev.chromium.org/spdy/spdy-whitepaper -[2] Shining Light in Dark Places: Understanding the Tor Network Damon McCoy, - Kevin Bauer, Dirk Grunwald, Tadayoshi Kohno, Douglas Sicker - http://www.cs.washington.edu/homes/yoshi/papers/Tor/PETS2008_37.pdf diff --git a/doc/spec/proposals/ideas/xxx-what-uses-sha1.txt b/doc/spec/proposals/ideas/xxx-what-uses-sha1.txt deleted file mode 100644 index b3ca3eea5a..0000000000 --- a/doc/spec/proposals/ideas/xxx-what-uses-sha1.txt +++ /dev/null @@ -1,247 +0,0 @@ -Filename: xxx-what-uses-sha1.txt -Title: Where does Tor use SHA-1 today? -Authors: Nick Mathewson, Marian -Created: 30-Dec-2008 -Status: Meta - - -Introduction: - - Tor uses SHA-1 as a message digest. SHA-1 is showing its age: - theoretical attacks for finding collisions against it get better - every year or two, and it will likely be broken in practice before - too long. - - According to smart crypto people, the SHA-2 functions (SHA-256, etc) - share too much of SHA-1's structure to be very good. RIPEMD-160 is - also based on flawed past hashes. Some people think other hash - functions (e.g. Whirlpool and Tiger) are not as bad; most of these - have not seen enough analysis to be used yet. - - Here is a 2006 paper about hash algorithms. - http://www.sane.nl/sane2006/program/final-papers/R10.pdf - - (Todo: Ask smart crypto people.) - - By 2012, the NIST SHA-3 competition will be done, and with luck we'll - have something good to switch to. But it's probably a bad idea to - wait until 2012 to figure out _how_ to migrate to a new hash - function, for two reasons: - 1) It's not inconceivable we'll want to migrate in a hurry - some time before then. - 2) It's likely that migrating to a new hash function will - require protocol changes, and it's easiest to make protocol - changes backward compatible if we lay the groundwork in - advance.
It would suck to have to break compatibility with - a big hard-to-test "flag day" protocol change. - - This document attempts to list everything Tor uses SHA-1 for today. - This is the first step in getting all the design work done to switch - to something else. - - This document SHOULD NOT be a clearinghouse of what to do about our - use of SHA-1. That's better left for other individual proposals. - - -Why now? - - The recent publication of "MD5 considered harmful today: Creating a - rogue CA certificate" by Alexander Sotirov, Marc Stevens, Jacob - Appelbaum, Arjen Lenstra, David Molnar, Dag Arne Osvik, and Benne de - Weger has reminded me that: - - * You can't rely on theoretical attacks to stay theoretical. - * It's quite unpleasant when theoretical attacks become practical - and public on days you were planning to leave for vacation. - * Broken hash functions (which SHA-1 is not quite yet AFAIU) - should be dropped like hot potatoes. Failure to do so can make - one look silly. - - -Triage - - How severe are these problems? Let's divide them into these - categories, where H(x) is the SHA-1 hash of x: - PREIMAGE -- find any x such that a H(x) has a chosen value - -- A SHA-1 usage that only depends on preimage - resistance - * Also SECOND PREIMAGE. Given x, find a y not equal to - x such that H(x) = H(y) - COLLISION<role> -- A SHA-1 usage that depends on collision - resistance, but the only party who could mount a - collision-based attack is already in a trusted role - (like a distribution signer or a directory authority). - COLLISION -- find any x and y such that H(x) = H(y) -- A - SHA-1 usage that depends on collision resistance - and doesn't need the attacker to have any special keys. - - There is no need to put much effort into fixing PREIMAGE and SECOND - PREIMAGE usages in the near-term: while there have been some - theoretical results doing these attacks against SHA-1, they don't - seem to be close to practical yet. 
To fix COLLISION<code-signing> - usages is not too important either, since anyone who has the key to - sign the code can mount far worse attacks. It would be good to fix - COLLISION<authority> usages, since we try to resist bad authorities - to a limited extent. The COLLISION usages are the most important - to fix. - - Kelsey and Schneier published a theoretical second preimage attack - against SHA-1 in 2005, so it would be a good idea to fix PREIMAGE - and SECOND PREIMAGE usages after fixing COLLISION usages or where fixes - require minimal effort. - - http://www.schneier.com/paper-preimages.html - - Additionally, we need to consider the impact of a successful attack - in each of these cases. SHA-1 collisions are still expensive even - if recent results are verified, and anybody with the resources to - compute one also has the resources to mount a decent Sybil attack. - - Let's be pessimistic, and not assume that producing collisions of - a given format is actually any harder than producing collisions at - all. - - -What Tor uses hashes for today: - -1. Infrastructure. - - A. Our X.509 certificates are signed with SHA-1. - COLLISION - B. TLS uses SHA-1 (and MD5) internally to generate keys. - PREIMAGE? - * At least breaking SHA-1 and MD5 simultaneously is - much more difficult than breaking either - independently. - C. Some of the TLS ciphersuites we allow use SHA-1. - PREIMAGE? - D. When we sign our code with GPG, it might be using SHA-1. - COLLISION<code-signing> - * GPG 1.4 and up have writing support for SHA-2 hashes. - This blog has help for converting: - http://www.schwer.us/journal/2005/02/19/sha-1-broken-and-gnupg-gpg/ - E. Our GPG keys might be authenticated with SHA-1. - COLLISION<code-signing-key-signing> - F. OpenSSL's random number generator uses SHA-1, I believe. - PREIMAGE - -2. The Tor protocol - - A. Everything we sign, we sign using SHA-1-based OAEP-MGF1. - PREIMAGE? - B. Our CREATE cell format uses SHA-1 for: OAEP padding. - PREIMAGE? - C.
Our EXTEND cells use SHA-1 to hash the identity key of the - target server. - COLLISION - D. Our CREATED cells use SHA-1 to hash the derived key data. - ?? - E. The data we use in CREATE_FAST cells to generate a key is the - length of a SHA-1. - NONE - F. The data we send back in a CREATED/CREATED_FAST cell is the length - of a SHA-1. - NONE - G. We use SHA-1 to derive our circuit keys from the negotiated g^xy - value. - NONE - H. We use SHA-1 to derive the digest field of each RELAY cell, but that's - used more as a checksum than as a strong digest. - NONE - -3. Directory services - - [All are COLLISION or COLLISION<authority> ] - - A. All signatures are generated on the SHA-1 of their corresponding - documents, using PKCS1 padding. - * In dir-spec.txt, section 1.3, it states, - "SIGNATURE" Object contains a signature (using the signing key) - of the PKCS1-padded digest of the entire document, taken from - the beginning of the Initial item, through the newline after - the Signature Item's keyword and its arguments." - So our attacker, Malcom, could generate a collision for the hash - that is signed. Thus, a second pre-image attack is possible. - Vulnerable to regular collision attack only if key is stolen. - If the key is stolen, Malcom could distribute two different - copies of the document which have the same hash. Maybe useful - for a partitioning attack? - B. Router descriptors identify their corresponding extra-info documents - by their SHA-1 digest. - * A third party might use a second pre-image attack to generate a - false extra-info document that has the same hash. The router - itself might use a regular collision attack to generate multiple - extra-info documents with the same hash, which might be useful - for a partitioning attack. - C. Fingerprints in router descriptors are taken using SHA-1. - * The fingerprint must match the public key. Not sure what would - happen if two routers had different public keys but the same - fingerprint. 
There could perhaps be unpredictable behaviour. - D. In router descriptors, routers in the same "Family" may be listed - by server nicknames or hexdigests. - * Does not seem critical. - E. Fingerprints in authority certs are taken using SHA-1. - F. Fingerprints in dir-source lines of votes and consensuses are taken - using SHA-1. - G. Networkstatuses refer to routers' identity keys and descriptors by their - SHA-1 digests. - H. Directory-signature lines identify which key is doing the signing by - the SHA-1 digests of the authority's signing key and its identity key. - I. The following items are downloaded by the SHA-1 of their contents: - XXXX list them - J. The following items are downloaded by the SHA-1 of an identity key: - XXXX list them too. - -4. The rendezvous protocol - - A. Hidden servers use SHA-1 to establish introduction points on relays, - and relays use SHA-1 to check incoming introduction point - establishment requests. - B. Hidden servers use SHA-1 in multiple places when generating hidden - service descriptors. - * The permanent-id is the first 80 bits of the SHA-1 hash of the - public key - ** time-period performs calculations using the permanent-id - * The secret-id-part is the SHA-1 hash of the time period, the - descriptor-cookie, and replica. - * Hash of introduction point's identity key. - C. Hidden servers performing basic-type client authorization for their - services use SHA-1 when encrypting introduction points contained in - hidden service descriptors. - D. Hidden service directories use SHA-1 to check whether a given hidden - service descriptor may be published under a given descriptor - identifier or not. - E. Hidden servers use SHA-1 to derive .onion addresses of their - services. - * What's worse, it only uses the first 80 bits of the SHA-1 hash. - However, the rend-spec.txt says we aren't worried about arbitrary - collisions? - F.
Clients use SHA-1 to generate the current hidden service descriptor - identifiers for a given .onion address. - G. Hidden servers use SHA-1 to remember digests of the first parts of - Diffie-Hellman handshakes contained in introduction requests in order - to detect replays. See the RELAY_ESTABLISH_INTRO cell. We seem to be - taking a hash of a hash here. - H. Hidden servers use SHA-1 during the Diffie-Hellman key exchange with - a connecting client. - -5. The bridge protocol - - XXXX write me - - A. Client may attempt to query for bridges where he knows a digest - (probably SHA-1) before a direct query. - -6. The Tor user interface - - A. We log information about servers based on SHA-1 hashes of their - identity keys. - COLLISION - B. The controller identifies servers based on SHA-1 hashes of their - identity keys. - COLLISION - C. Nearly all of our configuration options that list servers allow SHA-1 - hashes of their identity keys. - COLLISION - E. The deprecated .exit notation uses SHA-1 hashes of identity keys - COLLISION diff --git a/doc/spec/proposals/reindex.py b/doc/spec/proposals/reindex.py deleted file mode 100755 index 980bc0659f..0000000000 --- a/doc/spec/proposals/reindex.py +++ /dev/null @@ -1,117 +0,0 @@ -#!/usr/bin/python - -import re, os -class Error(Exception): pass - -STATUSES = """DRAFT NEEDS-REVISION NEEDS-RESEARCH OPEN ACCEPTED META FINISHED - CLOSED SUPERSEDED DEAD REJECTED""".split() -REQUIRED_FIELDS = [ "Filename", "Status", "Title" ] -CONDITIONAL_FIELDS = { "OPEN" : [ "Target" ], - "ACCEPTED" : [ "Target "], - "CLOSED" : [ "Implemented-In" ], - "FINISHED" : [ "Implemented-In" ] } -FNAME_RE = re.compile(r'^(\d\d\d)-.*[^\~]$') -DIR = "." 
-OUTFILE = "000-index.txt" -TMPFILE = OUTFILE+".tmp" - -def indexed(seq): - n = 0 - for i in seq: - yield n, i - n += 1 - -def readProposal(fn): - fields = { } - f = open(fn, 'r') - lastField = None - try: - for lineno, line in indexed(f): - line = line.rstrip() - if not line: - return fields - if line[0].isspace(): - fields[lastField] += " %s"%(line.strip()) - else: - parts = line.split(":", 1) - if len(parts) != 2: - raise Error("%s:%s: Neither field nor continuation"% - (fn,lineno)) - else: - fields[parts[0]] = parts[1].strip() - lastField = parts[0] - - return fields - finally: - f.close() - -def checkProposal(fn, fields): - status = fields.get("Status") - need_fields = REQUIRED_FIELDS + CONDITIONAL_FIELDS.get(status, []) - for f in need_fields: - if not fields.has_key(f): - raise Error("%s has no %s field"%(fn, f)) - if fn != fields['Filename']: - print `fn`, `fields['Filename']` - raise Error("Mismatched Filename field in %s"%fn) - if fields['Title'][-1] == '.': - fields['Title'] = fields['Title'][:-1] - - status = fields['Status'] = status.upper() - if status not in STATUSES: - raise Error("I've never heard of status %s in %s"%(status,fn)) - if status in [ "SUPERSEDED", "DEAD" ]: - for f in [ 'Implemented-In', 'Target' ]: - if fields.has_key(f): del fields[f] - -def readProposals(): - res = [] - for fn in os.listdir(DIR): - m = FNAME_RE.match(fn) - if not m: continue - if not fn.endswith(".txt"): - raise Error("%s doesn't end with .txt"%fn) - num = m.group(1) - fields = readProposal(fn) - checkProposal(fn, fields) - fields['num'] = num - res.append(fields) - return res - -def writeIndexFile(proposals): - proposals.sort(key=lambda f:f['num']) - seenStatuses = set() - for p in proposals: - seenStatuses.add(p['Status']) - - out = open(TMPFILE, 'w') - inf = open(OUTFILE, 'r') - for line in inf: - out.write(line) - if line.startswith("====="): break - inf.close() - - out.write("Proposals by number:\n\n") - for prop in proposals: - out.write("%(num)s %(Title)s 
[%(Status)s]\n"%prop) - out.write("\n\nProposals by status:\n\n") - for s in STATUSES: - if s not in seenStatuses: continue - out.write(" %s:\n"%s) - for prop in proposals: - if s == prop['Status']: - out.write(" %(num)s %(Title)s"%prop) - if prop.has_key('Target'): - out.write(" [for %(Target)s]"%prop) - if prop.has_key('Implemented-In'): - out.write(" [in %(Implemented-In)s]"%prop) - out.write("\n") - out.close() - os.rename(TMPFILE, OUTFILE) - -try: - os.unlink(TMPFILE) -except OSError: - pass - -writeIndexFile(readProposals()) diff --git a/doc/spec/rend-spec.txt b/doc/spec/rend-spec.txt deleted file mode 100644 index 3c14ebc662..0000000000 --- a/doc/spec/rend-spec.txt +++ /dev/null @@ -1,966 +0,0 @@ - - Tor Rendezvous Specification - -0. Overview and preliminaries - - The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL - NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and - "OPTIONAL" in this document are to be interpreted as described in - RFC 2119. - - Read - https://svn.torproject.org/svn/projects/design-paper/tor-design.html#sec:rendezvous - before you read this specification. It will make more sense. - - Rendezvous points provide location-hidden services (server - anonymity) for the onion routing network. With rendezvous points, - Bob can offer a TCP service (say, a webserver) via the onion - routing network, without revealing the IP of that service. - - Bob does this by anonymously advertising a public key for his - service, along with a list of onion routers to act as "Introduction - Points" for his service. He creates forward circuits to those - introduction points, and tells them about his service. To - connect to Bob, Alice first builds a circuit to an OR to act as - her "Rendezvous Point." She then connects to one of Bob's chosen - introduction points, and asks it to tell him about her Rendezvous - Point (RP). If Bob chooses to answer, he builds a circuit to her - RP, and tells it to connect him to Alice. 
The RP joins their - circuits together, and begins relaying cells. Alice's 'BEGIN' - cells are received directly by Bob's OP, which passes data to - and from the local server implementing Bob's service. - - Below we describe a network-level specification of this service, - along with interfaces to make this process transparent to Alice - (so long as she is using an OP). - -0.1. Notation, conventions and prerequisites - - In the specifications below, we use the same notation and terminology - as in "tor-spec.txt". The service specified here also requires the - existence of an onion routing network as specified in that file. - - H(x) is a SHA1 digest of x. - PKSign(SK,x) is a PKCS.1-padded RSA signature of x with SK. - PKEncrypt(SK,x) is a PKCS.1-padded RSA encryption of x with SK. - Public keys are all RSA, and encoded in ASN.1. - All integers are stored in network (big-endian) order. - All symmetric encryption uses AES in counter mode, except where - otherwise noted. - - In all discussions, "Alice" will refer to a user connecting to a - location-hidden service, and "Bob" will refer to a user running a - location-hidden service. - - An OP is (as defined elsewhere) an "Onion Proxy" or Tor client. - - An OR is (as defined elsewhere) an "Onion Router" or Tor server. - - An "Introduction point" is a Tor server chosen to be Bob's medium-term - 'meeting place'. A "Rendezvous point" is a Tor server chosen by Alice to - be a short-term communication relay between her and Bob. All Tor servers - potentially act as introduction and rendezvous points. - -0.2. Protocol outline - - 1. Bob->Bob's OP: "Offer IP:Port as public-key-name:Port". [configuration] - (We do not specify this step; it is left to the implementor of - Bob's OP.) - - 2. Bob's OP generates a long-term keypair. - - 3. Bob's OP->Introduction point via Tor: [introduction setup] - "This public key is (currently) associated to me." - - 4. 
Bob's OP->directory service via Tor: publishes Bob's service descriptor - [advertisement] - "Meet public-key X at introduction point A, B, or C." (signed) - - 5. Out of band, Alice receives a z.onion:port address. - She opens a SOCKS connection to her OP, and requests z.onion:port. - - 6. Alice's OP retrieves Bob's descriptor via Tor. [descriptor lookup.] - - 7. Alice's OP chooses a rendezvous point, opens a circuit to that - rendezvous point, and establishes a rendezvous circuit. [rendezvous - setup.] - - 8. Alice connects to the Introduction point via Tor, and tells it about - her rendezvous point. (Encrypted to Bob.) [Introduction 1] - - 9. The Introduction point passes this on to Bob's OP via Tor, along the - introduction circuit. [Introduction 2] - - 10. Bob's OP decides whether to connect to Alice, and if so, creates a - circuit to Alice's RP via Tor. Establishes a shared circuit. - [Rendezvous 1] - - 11. The Rendezvous point forwards Bob's confirmation to Alice's OP. - [Rendezvous 2] - - 12. Alice's OP sends begin cells to Bob's OP. [Connection] - -0.3. Constants and new cell types - - Relay cell types - 32 -- RELAY_COMMAND_ESTABLISH_INTRO - 33 -- RELAY_COMMAND_ESTABLISH_RENDEZVOUS - 34 -- RELAY_COMMAND_INTRODUCE1 - 35 -- RELAY_COMMAND_INTRODUCE2 - 36 -- RELAY_COMMAND_RENDEZVOUS1 - 37 -- RELAY_COMMAND_RENDEZVOUS2 - 38 -- RELAY_COMMAND_INTRO_ESTABLISHED - 39 -- RELAY_COMMAND_RENDEZVOUS_ESTABLISHED - 40 -- RELAY_COMMAND_INTRODUCE_ACK - -0.4. Version overview - - There are several parts in the hidden service protocol that have - changed over time, each of them having its own version number, whereas - other parts remained the same. The following list of potentially - versioned protocol parts should help reduce some confusion: - - - Hidden service descriptor: the binary-based v0 was the default for a - long time, and an ASCII-based v2 has been added by proposal 114. The - v0 descriptor format has been deprecated in 0.2.2.1-alpha. See 1.3. 
- - - Hidden service descriptor propagation mechanism: currently related to - the hidden service descriptor version -- v0 publishes to the original - hs directory authorities, whereas v2 publishes to a rotating subset - of relays with the "HSDir" flag; see 1.4 and 1.6. - - - Introduction protocol for how to generate an introduction cell: - v0 specified a nickname for the rendezvous point and assumed the - relay would know about it, whereas v2 now specifies IP address, - port, and onion key so the relay doesn't need to already recognize - it. See 1.8. - -1. The Protocol - -1.1. Bob configures his local OP. - - We do not specify a format for the OP configuration file. However, - OPs SHOULD allow Bob to provide more than one advertised service - per OP, and MUST allow Bob to specify one or more virtual ports per - service. Bob provides a mapping from each of these virtual ports - to a local IP:Port pair. - -1.2. Bob's OP establishes his introduction points. - - The first time the OP provides an advertised service, it generates - a public/private keypair (stored locally). - - The OP chooses a small number of Tor servers as introduction points. - The OP establishes a new introduction circuit to each introduction - point. These circuits MUST NOT be used for anything but hidden service - introduction. To establish the introduction, Bob sends a - RELAY_COMMAND_ESTABLISH_INTRO cell, containing: - - KL Key length [2 octets] - PK Bob's public key or service key [KL octets] - HS Hash of session info [20 octets] - SIG Signature of above information [variable] - - KL is the length of PK, in octets. - - To prevent replay attacks, the HS field contains a SHA-1 hash based on the - shared secret KH between Bob's OP and the introduction point, as - follows: - HS = H(KH | "INTRODUCE") - That is: - HS = H(KH | [49 4E 54 52 4F 44 55 43 45]) - (KH, as specified in tor-spec.txt, is H(g^xy | [00]) .) 
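The HS computation above can be sketched as follows. This is a minimal Python illustration, not code from Tor: the function name establish_intro_hs and the all-zero sample KH are assumptions made for the example, and KH is taken to be an already-derived 20-octet value (the SHA-1 output H(g^xy | [00])).

```python
import hashlib


def establish_intro_hs(kh: bytes) -> bytes:
    """Return HS = H(KH | "INTRODUCE"), with H = SHA-1 and | = concatenation.

    Hypothetical helper for illustration; the name is not taken from Tor.
    """
    if len(kh) != 20:
        # KH is itself a SHA-1 digest, so it is always 20 octets long.
        raise ValueError("KH must be 20 octets")
    return hashlib.sha1(kh + b"INTRODUCE").digest()


# Placeholder KH for illustration only; a real KH comes out of the
# circuit handshake between Bob's OP and the introduction point.
example_kh = bytes(20)
hs = establish_intro_hs(example_kh)

# HS fills the 20-octet "Hash of session info" field of the cell.
assert len(hs) == 20

# The ASCII string equals the hex octets quoted in the text above.
assert b"INTRODUCE" == bytes.fromhex("494E54524F44554345")
```

Because KH is only known to the two ends of the circuit, an ESTABLISH_INTRO cell captured on one circuit would carry the wrong HS if replayed on another.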
- - Upon receiving such a cell, the OR first checks that the signature is - correct with the included public key. If so, it checks whether HS is - correct given the shared state between Bob's OP and the OR. If either - check fails, the OR discards the cell; otherwise, it associates the - circuit with Bob's public key, and dissociates any other circuits - currently associated with PK. On success, the OR sends Bob a - RELAY_COMMAND_INTRO_ESTABLISHED cell with an empty payload. - - Bob's OP uses either Bob's public key or a freshly generated, single-use - service key in the RELAY_COMMAND_ESTABLISH_INTRO cell, depending on the - configured hidden service descriptor version. The public key is used for - v0 descriptors, the service key for v2 descriptors. In the latter case, the - service keys of all introduction points are included in the v2 hidden - service descriptor together with the other introduction point information. - The reason is that the introduction point does not need to and therefore - should not know for which hidden service it works, so as to prevent it from - tracking the hidden service's activity. If the hidden service is configured - to publish both v0 and v2 descriptors, two separate sets of introduction - points are established. - -1.3. Bob's OP generates service descriptors. - - For versions before 0.2.2.1-alpha, Bob's OP periodically generates and - publishes a descriptor of type "V0". - - The "V0" descriptor contains: - - KL Key length [2 octets] - PK Bob's public key [KL octets] - TS A timestamp [4 octets] - NI Number of introduction points [2 octets] - Ipt A list of NUL-terminated ORs [variable] - SIG Signature of above fields [variable] - - TS is the number of seconds elapsed since Jan 1, 1970. - - The members of Ipt may be either (a) nicknames, or (b) identity key - digests, encoded in hex, and prefixed with a '$'. Clients must - accept both forms. Services must only generate the second form. 
- Once 0.0.9.x is obsoleted, we can drop the first form. - - [It's ok for Bob to advertise 0 introduction points. He might want - to do that if he previously advertised some introduction points, - and now he doesn't have any. -RD] - - Beginning with 0.2.0.10-alpha, Bob's OP encodes "V2" descriptors in - addition to (or instead of) "V0" descriptors. The format of a "V2" - descriptor is as follows: - - "rendezvous-service-descriptor" descriptor-id NL - - [At start, exactly once] - - Indicates the beginning of the descriptor. "descriptor-id" is a - periodically changing identifier of 160 bits formatted as 32 base32 - chars that is calculated by the hidden service and its clients. The - "descriptor-id" is calculated by performing the following operation: - - descriptor-id = - H(permanent-id | H(time-period | descriptor-cookie | replica)) - - "permanent-id" is the permanent identifier of the hidden service, - consisting of 80 bits. It can be calculated by computing the hash value - of the public hidden service key and truncating after the first 80 bits: - - permanent-id = H(public-key)[:10] - - Note: If Bob's OP has "stealth" authorization enabled (see Section 2.2), - it uses the client key in place of the public hidden service key. - - "H(time-period | descriptor-cookie | replica)" is the (possibly - secret) id part that is necessary to verify that the hidden service is - the true originator of this descriptor and that is therefore contained - in the descriptor, too. The descriptor ID can only be created by the - hidden service and its clients, but the "signature" below can only be - created by the service. - - "time-period" changes periodically as a function of time and - - "permanent-id". The current value for "time-period" can be calculated - using the following formula: - - time-period = (current-time + permanent-id-byte * 86400 / 256) - / 86400 - - "current-time" contains the current system time in seconds since - 1970-01-01 00:00, e.g. 1188241957. 
"permanent-id-byte" is the first - (unsigned) byte of the permanent identifier (which is in network - order), e.g. 143. Adding the product of "permanent-id-byte" and - 86400 (seconds per day), divided by 256, prevents "time-period" from - changing for all descriptors at the same time of the day. The result - of the overall operation is a (network-ordered) 32-bit integer, e.g. - 13753 or 0x000035B9 with the example values given above. - - "descriptor-cookie" is an optional secret password of 128 bits that - is shared between the hidden service provider and its clients. If the - descriptor-cookie is left out, the input to the hash function is 128 - bits shorter. - - "replica" denotes the number of the replica. A service publishes - multiple descriptors with different descriptor IDs in order to - distribute them to different places on the ring. - - "version" version-number NL - - [Exactly once] - - The version number of this descriptor's format. In this case: 2. - - "permanent-key" NL a public key in PEM format - - [Exactly once] - - The public key of the hidden service which is required to verify the - "descriptor-id" and the "signature". - - "secret-id-part" secret-id-part NL - - [Exactly once] - - The result of the following operation as explained above, formatted as - 32 base32 chars. Using this secret id part, everyone can verify that - the signed descriptor belongs to "descriptor-id". - - secret-id-part = H(time-period | descriptor-cookie | replica) - - "publication-time" YYYY-MM-DD HH:MM:SS NL - - [Exactly once] - - A timestamp when this descriptor has been created. - - "protocol-versions" version-string NL - - [Exactly once] - - A comma-separated list of recognized and permitted version numbers - for use in INTRODUCE cells; these versions are described in section - 1.8 below. - - "introduction-points" NL encrypted-string - - [At most once] - - A list of introduction points. 
If the optional "descriptor-cookie" is - used, this list is encrypted with AES in CTR mode with a random - initialization vector of 128 bits that is written to - the beginning of the encrypted string, and the "descriptor-cookie" as - secret key of 128 bits length. - - The string containing the introduction point data (either encrypted - or not) is encoded in base64, and surrounded with - "-----BEGIN MESSAGE-----" and "-----END MESSAGE-----". - - The unencrypted string may begin with: - - "service-authentication" auth-type auth-data NL - - [Any number] - - The service-specific authentication data can be used to perform - client authentication. This data is independent of the selected - introduction point as opposed to "intro-authentication" below. The - format of auth-data (base64-encoded or PEM format) depends on - auth-type. See section 2 of this document for details on auth - mechanisms. - - Subsequently, an arbitrary number of introduction point entries may - follow, each containing the following data: - - "introduction-point" identifier NL - - [At start, exactly once] - - The identifier of this introduction point: the base-32 encoded - hash of this introduction point's identity key. - - "ip-address" ip-address NL - - [Exactly once] - - The IP address of this introduction point. - - "onion-port" port NL - - [Exactly once] - - The TCP port on which the introduction point is listening for - incoming onion requests. - - "onion-key" NL a public key in PEM format - - [Exactly once] - - The public key that can be used to encrypt messages to this - introduction point. - - "service-key" NL a public key in PEM format - - [Exactly once] - - The public key that can be used to encrypt messages to the hidden - service. - - "intro-authentication" auth-type auth-data NL - - [Any number] - - The introduction-point-specific authentication data can be used - to perform client authentication. 
This data depends on the - selected introduction point as opposed to "service-authentication" - above. The format of auth-data (base64-encoded or PEM format) - depends on auth-type. See section 2 of this document for details - on auth mechanisms. - - (This ends the fields in the encrypted portion of the descriptor.) - - [It's ok for Bob to advertise 0 introduction points. He might want - to do that if he previously advertised some introduction points, - and now he doesn't have any. -RD] - - "signature" NL signature-string - - [At end, exactly once] - - A signature of all fields above with the private key of the hidden - service. - -1.3.1. Other descriptor formats we don't use. - - Support for the V0 descriptor format was dropped in 0.2.2.0-alpha-dev: - - KL Key length [2 octets] - PK Bob's public key [KL octets] - TS A timestamp [4 octets] - NI Number of introduction points [2 octets] - Ipt A list of NUL-terminated ORs [variable] - SIG Signature of above fields [variable] - - KL is the length of PK, in octets. - TS is the number of seconds elapsed since Jan 1, 1970. - - The members of Ipt may be either (a) nicknames, or (b) identity key - digests, encoded in hex, and prefixed with a '$'. 
- - The V1 descriptor format was understood and accepted from - 0.1.1.5-alpha-cvs to 0.2.0.6-alpha-dev, but no Tors generated it and - it was removed: - - V Format byte: set to 255 [1 octet] - V Version byte: set to 1 [1 octet] - KL Key length [2 octets] - PK Bob's public key [KL octets] - TS A timestamp [4 octets] - PROTO Protocol versions: bitmask [2 octets] - NI Number of introduction points [2 octets] - For each introduction point: (as in INTRODUCE2 cells) - IP Introduction point's address [4 octets] - PORT Introduction point's OR port [2 octets] - ID Introduction point identity ID [20 octets] - KLEN Length of onion key [2 octets] - KEY Introduction point onion key [KLEN octets] - SIG Signature of above fields [variable] - - A hypothetical "V1" descriptor, that has never been used but might - be useful for historical reasons, contains: - - V Format byte: set to 255 [1 octet] - V Version byte: set to 1 [1 octet] - KL Key length [2 octets] - PK Bob's public key [KL octets] - TS A timestamp [4 octets] - PROTO Rendezvous protocol versions: bitmask [2 octets] - NA Number of auth mechanisms accepted [1 octet] - For each auth mechanism: - AUTHT The auth type that is supported [2 octets] - AUTHL Length of auth data [1 octet] - AUTHD Auth data [variable] - NI Number of introduction points [2 octets] - For each introduction point: (as in INTRODUCE2 cells) - ATYPE An address type (typically 4) [1 octet] - ADDR Introduction point's IP address [4 or 16 octets] - PORT Introduction point's OR port [2 octets] - AUTHT The auth type that is supported [2 octets] - AUTHL Length of auth data [1 octet] - AUTHD Auth data [variable] - ID Introduction point identity ID [20 octets] - KLEN Length of onion key [2 octets] - KEY Introduction point onion key [KLEN octets] - SIG Signature of above fields [variable] - - AUTHT specifies which authentication/authorization mechanism is - required by the hidden service or the introduction point. 
AUTHD - is arbitrary data that can be associated with an auth approach. - Currently only AUTHT of [00 00] is supported, with an AUTHL of 0. - See section 2 of this document for details on auth mechanisms. - -1.4. Bob's OP advertises his service descriptor(s). - - Bob's OP advertises his service descriptor to a fixed set of v0 hidden - service directory servers and/or a changing subset of all v2 hidden service - directories. - - For versions before 0.2.2.1-alpha, Bob's OP opens a stream to each v0 - directory server's directory port via Tor. (He may re-use old circuits for - this.) Over this stream, Bob's OP makes an HTTP 'POST' request, to a URL - "/tor/rendezvous/publish" relative to the directory server's root, - containing as its body Bob's service descriptor. - - Upon receiving a descriptor, the directory server checks the signature, - and discards the descriptor if the signature does not match the enclosed - public key. Next, the directory server checks the timestamp. If the - timestamp is more than 24 hours in the past or more than 1 hour in the - future, or the directory server already has a newer descriptor with the - same public key, the server discards the descriptor. Otherwise, the - server discards any older descriptors with the same public key and - version format, and associates the new descriptor with the public key. - The directory server remembers this descriptor for at least 24 hours - after its timestamp. At least every 18 hours, Bob's OP uploads a - fresh descriptor. - - If Bob's OP is configured to publish v2 descriptors, it does so to a - changing subset of all v2 hidden service directories instead of the - authoritative directory servers. Therefore, Bob's OP opens a stream via - Tor to each responsible hidden service directory. (He may re-use old - circuits for this.) 
Over this stream, Bob's OP makes an HTTP 'POST' - request to a URL "/tor/rendezvous2/publish" relative to the hidden service - directory's root, containing as its body Bob's service descriptor. - - At any time, there are 6 hidden service directories responsible for - keeping replicas of a descriptor; they consist of 2 sets of 3 hidden - service directories with consecutive onion IDs. Bob's OP learns about - the complete list of hidden service directories by filtering the - consensus status document received from the directory authorities. A - hidden service directory is deemed responsible for all descriptor IDs in - the interval from its direct predecessor, exclusive, to its own ID, - inclusive; it further holds replicas for its 2 predecessors. A - participant only trusts its own routing list and never learns about - routing information from other parties. - - Bob's OP publishes a new v2 descriptor once an hour or whenever its - content changes. V2 descriptors can be found by clients within a given - time period of 24 hours, after which they change their ID as described - under 1.3. If a published descriptor would be valid for less than 60 - minutes (= 2 x 30 minutes to allow the server to be 30 minutes behind - and the client 30 minutes ahead), Bob's OP publishes the descriptor - under the IDs of both the current and the next publication period. - -1.5. Alice receives a z.onion address. - - When Alice receives a pointer to a location-hidden service, it is as a - hostname of the form "z.onion", where z is a base-32 encoding of a - 10-octet hash of Bob's service's public key, computed as follows: - - 1. Let H = H(PK). - 2. Let H' = the first 80 bits of H, considering each octet from - most significant bit to least significant bit. - 3. Generate a 16-character encoding of H', using base32 as defined - in RFC 3548. 
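The three steps above can be sketched as follows. This is a rough illustration under stated assumptions: H is SHA-1 (per the note on 160 bits), the public key is passed in as already-encoded bytes (the exact key encoding is not specified in this section and is left to the caller), and the result is lower-cased to match the example address given later in this document. The function name is hypothetical.

```python
import base64
import hashlib


def onion_address(public_key_bytes: bytes) -> str:
    """Derive a "z.onion" hostname from an encoded public key.

    Assumption: `public_key_bytes` is whatever byte encoding of PK the
    implementation hashes; this sketch does not define that encoding.
    """
    digest = hashlib.sha1(public_key_bytes).digest()   # step 1: H(PK)
    truncated = digest[:10]                            # step 2: first 80 bits
    # Step 3: 80 bits encode to exactly 16 base32 characters, no padding.
    label = base64.b32encode(truncated).decode("ascii").lower()
    return label + ".onion"


if __name__ == "__main__":
    # Placeholder input, not a real key.
    print(onion_address(b"not a real public key"))
```

Since 80 bits is a multiple of 5, the base32 output never carries "=" padding, which is why the label is always 16 characters long.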
- - (We only use 80 bits instead of the 160 bits from SHA1 because we - don't need to worry about arbitrary collisions, and because it will - make handling the URLs more convenient.) - - [Yes, numbers are allowed at the beginning. See RFC 1123. -NM] - -1.6. Alice's OP retrieves a service descriptor. - - Alice's OP fetches the service descriptor from the fixed set of v0 hidden - service directory servers and/or a changing subset of all v2 hidden service - directories. - - For versions before 0.2.2.1-alpha, Alice's OP opens a stream to a directory - server via Tor, and makes an HTTP GET request for the document - '/tor/rendezvous/<z>', where '<z>' is replaced with the encoding of Bob's - public key as described above. (She may re-use old circuits for this.) The - directory replies with a 404 HTTP response if it does not recognize <z>, - and otherwise returns Bob's most recently uploaded service descriptor. - - If Alice's OP receives a 404 response, it tries the other directory - servers, and only fails the lookup if none recognize the public key hash. - - Upon receiving a service descriptor, Alice verifies it with the same process - as the directory server uses, described above in section 1.4. - - The directory server gives a 400 response if it cannot understand Alice's - request. - - Alice should cache the descriptor locally, but should not use - descriptors that are more than 24 hours older than their timestamp. - [Caching may make her partitionable, but she fetched it anonymously, - and we can't very well *not* cache it. -RD] - - If Alice's OP is running 0.2.1.10-alpha or higher, it fetches v2 hidden - service descriptors. Versions before 0.2.2.1-alpha fetch both v0 and - v2 descriptors in parallel. Similar to the description in section 1.4, - Alice's OP fetches a v2 descriptor from a randomly chosen hidden service - directory out of the changing subset of 6 nodes. 
If the request is - unsuccessful, Alice retries the other remaining responsible hidden service - directories in a random order. Alice relies on Bob to care about a potential - clock skew between the two by possibly storing two sets of descriptors (see - end of section 1.4). - - Alice's OP opens a stream via Tor to the chosen v2 hidden service - directory. (She may re-use old circuits for this.) Over this stream, - Alice's OP makes an HTTP 'GET' request for the document - "/tor/rendezvous2/<z>", where z is replaced with the encoding of the - descriptor ID. The directory replies with a 404 HTTP response if it does - not recognize <z>, and otherwise returns Bob's most recently uploaded - service descriptor. - -1.7. Alice's OP establishes a rendezvous point. - - When Alice requests a connection to a given location-hidden service, - and Alice's OP does not have an established circuit to that service, - the OP builds a rendezvous circuit. It does this by establishing - a circuit to a randomly chosen OR, and sending a - RELAY_COMMAND_ESTABLISH_RENDEZVOUS cell to that OR. The body of that cell - contains: - - RC Rendezvous cookie [20 octets] - - The rendezvous cookie is an arbitrary 20-byte value, chosen randomly by - Alice's OP. Alice SHOULD choose a new rendezvous cookie for each new - connection attempt. - - Upon receiving a RELAY_COMMAND_ESTABLISH_RENDEZVOUS cell, the OR associates - the RC with the circuit that sent it. It replies to Alice with an empty - RELAY_COMMAND_RENDEZVOUS_ESTABLISHED cell to indicate success. - - Alice's OP MUST NOT use the circuit which sent the cell for any purpose - other than rendezvous with the given location-hidden service. - -1.8. 
Introduction: from Alice's OP to Introduction Point - - Alice builds a separate circuit to one of Bob's chosen introduction - points, and sends it a RELAY_COMMAND_INTRODUCE1 cell containing: - - Cleartext - PK_ID Identifier for Bob's PK [20 octets] - Encrypted to Bob's PK: (in the v0 intro protocol) - RP Rendezvous point's nickname [20 octets] - RC Rendezvous cookie [20 octets] - g^x Diffie-Hellman data, part 1 [128 octets] - OR (in the v1 intro protocol) - VER Version byte: set to 1. [1 octet] - RP Rendezvous point nick or ID [42 octets] - RC Rendezvous cookie [20 octets] - g^x Diffie-Hellman data, part 1 [128 octets] - OR (in the v2 intro protocol) - VER Version byte: set to 2. [1 octet] - IP Rendezvous point's address [4 octets] - PORT Rendezvous point's OR port [2 octets] - ID Rendezvous point identity ID [20 octets] - KLEN Length of onion key [2 octets] - KEY Rendezvous point onion key [KLEN octets] - RC Rendezvous cookie [20 octets] - g^x Diffie-Hellman data, part 1 [128 octets] - OR (in the v3 intro protocol) - VER Version byte: set to 3. [1 octet] - AUTHT The auth type that is used [1 octet] - AUTHL Length of auth data [2 octets] - AUTHD Auth data [variable] - TS A timestamp [4 octets] - IP Rendezvous point's address [4 octets] - PORT Rendezvous point's OR port [2 octets] - ID Rendezvous point identity ID [20 octets] - KLEN Length of onion key [2 octets] - KEY Rendezvous point onion key [KLEN octets] - RC Rendezvous cookie [20 octets] - g^x Diffie-Hellman data, part 1 [128 octets] - - PK_ID is the hash of Bob's public key or the service key, depending on the - hidden service descriptor version. In case of a v0 descriptor, Alice's OP - uses Bob's public key. If Alice has downloaded a v2 descriptor, she uses - the contained public key ("service-key"). - - RP is NUL-padded and terminated. In version 0 of the intro protocol, RP - must contain a nickname. 
In version 1, it must contain EITHER a nickname or - an identity key digest that is encoded in hex and prefixed with a '$'. - - The hybrid encryption to Bob's PK works just like the hybrid - encryption in CREATE cells (see tor-spec). Thus the payload of the - version 0 RELAY_COMMAND_INTRODUCE1 cell on the wire will contain - 20+42+16+20+20+128=246 bytes, and the version 1 and version 2 - introduction formats have other sizes. - - Through Tor 0.2.0.6-alpha, clients only generated the v0 introduction - format, whereas hidden services have understood and accepted v0, - v1, and v2 since 0.1.1.x. As of Tor 0.2.0.7-alpha and 0.1.2.18, - clients switched to using the v2 intro format. - -1.9. Introduction: From the Introduction Point to Bob's OP - - If the Introduction Point recognizes PK_ID as a public key which has - established a circuit for introductions as in 1.2 above, it sends the body - of the cell in a new RELAY_COMMAND_INTRODUCE2 cell down the corresponding - circuit. (If the PK_ID is unrecognized, the RELAY_COMMAND_INTRODUCE1 cell is - discarded.) - - After sending the RELAY_COMMAND_INTRODUCE2 cell to Bob, the OR replies to - Alice with an empty RELAY_COMMAND_INTRODUCE_ACK cell. If no - RELAY_COMMAND_INTRODUCE2 cell can be sent, the OR replies to Alice with a - non-empty cell to indicate an error. (The semantics of the cell body may be - determined later; the current implementation sends a single '1' byte on - failure.) - - When Bob's OP receives the RELAY_COMMAND_INTRODUCE2 cell, it decrypts it - with the private key for the corresponding hidden service, and extracts the - rendezvous point's nickname, the rendezvous cookie, and the value of g^x - chosen by Alice. - -1.10. 
Rendezvous - - Bob's OP builds a new Tor circuit ending at Alice's chosen rendezvous - point, and sends a RELAY_COMMAND_RENDEZVOUS1 cell along this circuit, - containing: - RC Rendezvous cookie [20 octets] - g^y Diffie-Hellman [128 octets] - KH Handshake digest [20 octets] - - (Bob's OP MUST NOT use this circuit for any other purpose.) - - If the RP recognizes RC, it relays the rest of the cell down the - corresponding circuit in a RELAY_COMMAND_RENDEZVOUS2 cell, containing: - - g^y Diffie-Hellman [128 octets] - KH Handshake digest [20 octets] - - (If the RP does not recognize the RC, it discards the cell and - tears down the circuit.) - - When Alice's OP receives a RELAY_COMMAND_RENDEZVOUS2 cell on a circuit which - has sent a RELAY_COMMAND_ESTABLISH_RENDEZVOUS cell but which has not yet - received a reply, it uses g^y and H(g^xy) to complete the handshake as in - the Tor circuit extend process: they establish a 60-octet string as - K = SHA1(g^xy | [00]) | SHA1(g^xy | [01]) | SHA1(g^xy | [02]) - and generate - KH = K[0..15] - Kf = K[16..31] - Kb = K[32..47] - - Subsequently, the rendezvous point passes relay cells, unchanged, from - each of the two circuits to the other. When Alice's OP sends - RELAY cells along the circuit, it first encrypts them with the - Kf, then with all of the keys for the ORs in Alice's side of the circuit; - and when Alice's OP receives RELAY cells from the circuit, it decrypts - them with the keys for the ORs in Alice's side of the circuit, then - decrypts them with Kb. Bob's OP does the same, with Kf and Kb - interchanged. - -1.11. Creating streams - - To open TCP connections to Bob's location-hidden service, Alice's OP sends - a RELAY_COMMAND_BEGIN cell along the established circuit, using the special - address "", and a chosen port. Bob's OP chooses a destination IP and - port, based on the configuration of the service connected to the circuit, - and opens a TCP stream. 
From then on, Bob's OP treats the stream as an - ordinary exit connection. - [ Except he doesn't include addr in the connected cell or the end - cell. -RD] - - Alice MAY send multiple RELAY_COMMAND_BEGIN cells along the circuit, to open - multiple streams to Bob. Alice SHOULD NOT send RELAY_COMMAND_BEGIN cells - for any other address along her circuit to Bob; if she does, Bob MUST reject - them. - -2. Authentication and authorization. - - The rendezvous protocol as described in Section 1 provides a few options - for implementing client-side authorization. There are two steps in the - rendezvous protocol that can be used for performing client authorization: - when downloading and decrypting parts of the hidden service descriptor and - at Bob's Tor client before contacting the rendezvous point. A service - provider can restrict access to his service at these two points to - authorized clients only. - - There are currently two authorization protocols specified that are - described in more detail below: - - 1. The first protocol allows a service provider to restrict access - to clients with a previously received secret key only, but does not - attempt to hide service activity from others. - - 2. The second protocol, albeit being feasible for a limited set of about - 16 clients, performs client authorization and hides service activity - from everyone but the authorized clients. - -2.1. Service with large-scale client authorization - - The first client authorization protocol aims at performing access control - while consuming as few additional resources as possible. This is the "basic" - authorization protocol. A service provider should be able to permit access - to a large number of clients while denying access for everyone else. - However, the price for scalability is that the service won't be able to hide - its activity from unauthorized or formerly authorized clients. 
- - The main idea of this protocol is to encrypt the introduction-point part - in hidden service descriptors to authorized clients using symmetric keys. - This ensures that nobody but authorized clients can learn which - introduction points a service currently uses, and that nobody can send a - valid INTRODUCE1 message without knowing the introduction key. Therefore, - a subsequent authorization at the introduction point is not required. - - A service provider generates symmetric "descriptor cookies" for his - clients and distributes them outside of Tor. The suggested key size is - 128 bits, so that descriptor cookies can be encoded in 22 base64 chars - (which can hold up to 22 * 6 = 132 bits, leaving 4 bits to encode the - authorization type (here: "0") and allow a client to distinguish this - authorization protocol from others like the one proposed below). - Typically, the contact information for a hidden service using this - authorization protocol looks like this: - - v2cbb2l4lsnpio4q.onion Ll3X7Xgz9eHGKCCnlFH0uz - - When generating a hidden service descriptor, the service encrypts the - introduction-point part with a single randomly generated symmetric - 128-bit session key using AES-CTR as described for v2 hidden service - descriptors in rend-spec. Afterwards, the service encrypts the session - key to all descriptor cookies using AES. An authorized client should be - able to efficiently find the session key that is encrypted for him/her, - so 4-octet client IDs are derived from the descriptor cookie and the - initialization vector. Fake entries are added so that descriptors always - contain a number of encrypted session keys that is a multiple of 16. - Encrypted session keys are ordered by client IDs in order to conceal - addition or removal of authorized clients by the service provider. - - ATYPE Authorization type: set to 1. 
[1 octet] - ALEN Number of clients := 1 + ((clients - 1) div 16) [1 octet] - for each symmetric descriptor cookie: - ID Client ID: H(descriptor cookie | IV)[:4] [4 octets] - SKEY Session key encrypted with descriptor cookie [16 octets] - (end of client-specific part) - RND Random data [(15 - ((clients - 1) mod 16)) * 20 octets] - IV AES initialization vector [16 octets] - IPOS Intro points, encrypted with session key [remaining octets] - - An authorized client needs to configure Tor to use the descriptor cookie - when accessing the hidden service. Therefore, a user adds the contact - information that she received from the service provider to her torrc - file. Upon downloading a hidden service descriptor, Tor finds the - encrypted introduction-point part and attempts to decrypt it using the - configured descriptor cookie. (In the rare event of two or more client - IDs being equal a client tries to decrypt all of them.) - - Upon sending the introduction, the client includes her descriptor cookie - as auth type "1" in the INTRODUCE2 cell that she sends to the service. - The hidden service checks whether the included descriptor cookie is - authorized to access the service and either responds to the introduction - request, or not. - -2.2. Authorization for limited number of clients - - A second, more sophisticated client authorization protocol goes the extra - mile of hiding service activity from unauthorized clients. This is the - "stealth" authorization protocol. With all else being equal to the preceding - authorization protocol, the second protocol publishes hidden service - descriptors for each user separately and gets along with encrypting the - introduction-point part of descriptors to a single client. This allows the - service to stop publishing descriptors for removed clients. As long as a - removed client cannot link descriptors issued for other clients to the - service, it cannot derive service activity any more. 
The downside of this - approach is limited scalability. Even though the distributed storage of - descriptors (cf. proposal 114) tackles the problem of limited scalability to - a certain extent, this protocol should not be used for services with more - than 16 clients. (In fact, Tor should refuse to advertise services for more - than this number of clients.) - - A hidden service generates an asymmetric "client key" and a symmetric - "descriptor cookie" for each client. The client key is used as - replacement for the service's permanent key, so that the service uses a - different identity for each of his clients. The descriptor cookie is used - to store descriptors at changing directory nodes that are unpredictable - for anyone but service and client, to encrypt the introduction-point - part, and to be included in INTRODUCE2 cells. Once the service has - created client key and descriptor cookie, he tells them to the client - outside of Tor. The contact information string looks similar to the one - used by the preceding authorization protocol (with the only difference - that it has "1" encoded as auth-type in the remaining 4 of 132 bits - instead of "0" as before). - - When creating a hidden service descriptor for an authorized client, the - hidden service uses the client key and descriptor cookie to compute - secret ID part and descriptor ID: - - secret-id-part = H(time-period | descriptor-cookie | replica) - - descriptor-id = H(client-key[:10] | secret-id-part) - - The hidden service also replaces permanent-key in the descriptor with - client-key and encrypts introduction-points with the descriptor cookie. - - ATYPE Authorization type: set to 2. [1 octet] - IV AES initialization vector [16 octets] - IPOS Intro points, encr. with descriptor cookie [remaining octets] - - When uploading descriptors, the hidden service needs to make sure that - descriptors for different clients are not uploaded at the same time (cf. 
- Section 1.1) which is also a limiting factor for the number of clients. - - When a client is requested to establish a connection to a hidden service - it looks up whether it has any authorization data configured for that - service. If the user has configured authorization data for authorization - protocol "2", the descriptor ID is determined as described in the last - paragraph. Upon receiving a descriptor, the client decrypts the - introduction-point part using its descriptor cookie. Further, the client - includes its descriptor cookie as auth-type "2" in INTRODUCE2 cells that - it sends to the service. - -2.3. Hidden service configuration - - A hidden service that is meant to perform client authorization adds a - new option HiddenServiceAuthorizeClient to its hidden service - configuration. This option contains the authorization type which is - either "basic" for the protocol described in 2.1 or "stealth" for the - protocol in 2.2 and a comma-separated list of human-readable client - names, so that Tor can create authorization data for these clients: - - HiddenServiceAuthorizeClient auth-type client-name,client-name,... - - If this option is configured, HiddenServiceVersion is automatically - reconfigured to contain only version numbers of 2 or higher. There is - a maximum of 512 client names for basic auth and a maximum of 16 for - stealth auth. - - Tor stores all generated authorization data for the authorization - protocols described in Sections 2.1 and 2.2 in a new file using the - following file format: - - "client-name" human-readable client identifier NL - "descriptor-cookie" 128-bit key ^= 22 base64 chars NL - - If the authorization protocol of Section 2.2 is used, Tor also generates - and stores the following data: - - "client-key" NL a public key in PEM format - -2.4. 
Client configuration - - Clients need to make their authorization data known to Tor using another - configuration option that contains a service name (mainly for the sake of - convenience), the service address, and the descriptor cookie that is - required to access a hidden service (the authorization protocol number is - encoded in the descriptor cookie): - - HidServAuth service-name service-address descriptor-cookie - -3. Hidden service directory operation - - This section has been introduced with the v2 hidden service descriptor - format. It describes all operations of the v2 hidden service descriptor - fetching and propagation mechanism that are required for the protocol - described in section 1 to succeed with v2 hidden service descriptors. - -3.1. Configuring as hidden service directory - - Every onion router that has its directory port open can decide whether it - wants to store and serve hidden service descriptors. An onion router which - is configured as such includes the "hidden-service-dir" flag in its router - descriptors that it sends to directory authorities. - - The directory authorities include a new flag "HSDir" for routers that - decided to provide storage for hidden service descriptors and that - have been running for at least 24 hours. - -3.2. Accepting publish requests - - Hidden service directory nodes accept publish requests for v2 hidden service - descriptors and store them to their local memory. (It is not necessary to - make descriptors persistent, because after restarting, the onion router - would not be accepted as a storing node anyway, because it has not been - running for at least 24 hours.) All requests and replies are formatted as - HTTP messages. Requests are initiated via BEGIN_DIR cells directed to - the router's directory port, and formatted as HTTP POST requests to the URL - "/tor/rendezvous2/publish" relative to the hidden service directory's root, - containing as its body a v2 service descriptor. 
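As a rough illustration only, a publish request of the kind described above might be assembled like this in Python. The POST method and the URL come from the text; the HTTP version, the Content-Length header, and the helper name build_publish_request are assumptions made for the sketch, not part of the specification.

```python
PUBLISH_PATH = "/tor/rendezvous2/publish"  # URL given in the text above


def build_publish_request(descriptor: bytes) -> bytes:
    """Return an HTTP POST request carrying one v2 descriptor as its body.

    Assumption: HTTP/1.0 with a Content-Length header. The text only says
    that requests are formatted as HTTP POST messages.
    """
    head = (
        f"POST {PUBLISH_PATH} HTTP/1.0\r\n"
        f"Content-Length: {len(descriptor)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + descriptor
```

The resulting bytes would be written to a stream opened with a BEGIN_DIR cell; that transport step is outside this sketch.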
- - A hidden service directory node parses every received descriptor and only - stores it when it thinks that it is responsible for storing that descriptor - based on its own routing table. See section 1.4 for more information on how - to determine responsibility for a certain descriptor ID. - -3.3. Processing fetch requests - - Hidden service directory nodes process fetch requests for hidden service - descriptors by looking them up in their local memory. (They do not need to - determine if they are responsible for the passed ID, because it does no harm - if they deliver a descriptor for which they are not (any more) responsible.) - All requests and replies are formatted as HTTP messages. Requests are - initiated via BEGIN_DIR cells directed to the router's directory port, - and formatted as HTTP GET requests for the document "/tor/rendezvous2/<z>", - where z is replaced with the encoding of the descriptor ID. - diff --git a/doc/spec/socks-extensions.txt b/doc/spec/socks-extensions.txt deleted file mode 100644 index 62d86acd9f..0000000000 --- a/doc/spec/socks-extensions.txt +++ /dev/null @@ -1,78 +0,0 @@ -Tor's extensions to the SOCKS protocol - -1. Overview - - The SOCKS protocol provides a generic interface for TCP proxies. Client - software connects to a SOCKS server via TCP, and requests a TCP connection - to another address and port. The SOCKS server establishes the connection, - and reports success or failure to the client. After the connection has - been established, the client application uses the TCP stream as usual. - - Tor supports SOCKS4 as defined in [1], SOCKS4A as defined in [2], and - SOCKS5 as defined in [3]. - - The stickiest issue for Tor in supporting clients, in practice, is forcing - DNS lookups to occur at the OR side: if clients do their own DNS lookup, - the DNS server can learn which addresses the client wants to reach. 
- SOCKS4 supports addressing by IPv4 address; SOCKS4A is a kludge on top of - SOCKS4 to allow addressing by hostname; SOCKS5 supports IPv4, IPv6, and - hostnames. - -1.1. Extent of support - - Tor supports the SOCKS4, SOCKS4A, and SOCKS5 standards, except as follows: - - BOTH: - - The BIND command is not supported. - - SOCKS4,4A: - - SOCKS4 usernames are ignored. - - SOCKS5: - - The (SOCKS5) "UDP ASSOCIATE" command is not supported. - - IPv6 is not supported in CONNECT commands. - - Only the "NO AUTHENTICATION" (SOCKS5) authentication method [00] is - supported. - -2. Name lookup - - As an extension to SOCKS4A and SOCKS5, Tor implements a new command value, - "RESOLVE" [F0]. When Tor receives a "RESOLVE" SOCKS command, it initiates - a remote lookup of the hostname provided as the target address in the SOCKS - request. The reply is either an error (if the address couldn't be - resolved) or a success response. In the case of success, the address is - stored in the portion of the SOCKS response reserved for remote IP address. - - (We support RESOLVE in SOCKS4 too, even though it is unnecessary.) - - For SOCKS5 only, we support reverse resolution with a new command value, - "RESOLVE_PTR" [F1]. In response to a "RESOLVE_PTR" SOCKS5 command with - an IPv4 address as its target, Tor attempts to find the canonical - hostname for that IPv4 record, and returns it in the "server bound - address" portion of the reply. - (This command was not supported before Tor 0.1.2.2-alpha.) - -3. Other command extensions. - - Tor 0.1.2.4-alpha added a new command value: "CONNECT_DIR" [F2]. - In this case, Tor will open an encrypted direct TCP connection to the - directory port of the Tor server specified by address:port (the port - specified should be the ORPort of the server). It uses a one-hop tunnel - and a "BEGIN_DIR" relay cell to accomplish this secure connection. - - The F2 command value was removed in Tor 0.2.0.10-alpha in favor of a - new use_begindir flag in edge_connection_t. 
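A minimal sketch of the "RESOLVE" request from section 2, assuming the standard SOCKS5 request layout of RFC 1928 with only the command byte changed to F0. The helper name build_resolve_request is hypothetical, and setting the port field to 0 is an assumption (a name lookup has no use for it). The request would be sent after the usual SOCKS5 method negotiation.

```python
import struct

SOCKS5_VERSION = 0x05
CMD_RESOLVE = 0xF0      # Tor extension command value from section 2
ATYP_DOMAINNAME = 0x03  # RFC 1928 address type for hostnames


def build_resolve_request(hostname: str) -> bytes:
    """Build a SOCKS5 request asking Tor to resolve `hostname` remotely."""
    name = hostname.encode("ascii")
    if not 0 < len(name) <= 255:
        raise ValueError("hostname must be 1..255 bytes")
    # VER, CMD, RSV, ATYP, then a length-prefixed name and a 2-byte port.
    # Assumption: the port is 0 because a lookup does not need one.
    return (
        struct.pack("!BBBB", SOCKS5_VERSION, CMD_RESOLVE, 0x00, ATYP_DOMAINNAME)
        + bytes([len(name)])
        + name
        + struct.pack("!H", 0)
    )
```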
- -4. HTTP-resistance - - Tor checks the first byte of each SOCKS request to see whether it looks - more like an HTTP request (that is, it starts with a "G", "H", or "P"). If - so, Tor returns a small webpage, telling the user that his/her browser is - misconfigured. This is helpful for the many users who mistakenly try to - use Tor as an HTTP proxy instead of a SOCKS proxy. - -References: - [1] http://archive.socks.permeo.com/protocol/socks4.protocol - [2] http://archive.socks.permeo.com/protocol/socks4a.protocol - [3] SOCKS5: RFC1928 - diff --git a/doc/spec/tor-spec.txt b/doc/spec/tor-spec.txt deleted file mode 100644 index 91ad561b8d..0000000000 --- a/doc/spec/tor-spec.txt +++ /dev/null @@ -1,1004 +0,0 @@ - - Tor Protocol Specification - - Roger Dingledine - Nick Mathewson - -Note: This document aims to specify Tor as implemented in 0.2.1.x. Future -versions of Tor may implement improved protocols, and compatibility is not -guaranteed. Compatibility notes are given for versions 0.1.1.15-rc and -later; earlier versions are not compatible with the Tor network as of this -writing. - -This specification is not a design document; most design criteria -are not examined. For more information on why Tor acts as it does, -see tor-design.pdf. - -0. Preliminaries - - The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL - NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and - "OPTIONAL" in this document are to be interpreted as described in - RFC 2119. - -0.1. Notation and encoding - - PK -- a public key. - SK -- a private key. - K -- a key for a symmetric cipher. - - a|b -- concatenation of 'a' and 'b'. - - [A0 B1 C2] -- a three-byte sequence, containing the bytes with - hexadecimal values A0, B1, and C2, in that order. - - All numeric values are encoded in network (big-endian) order. - - H(m) -- a cryptographic hash of m. - -0.2. Security parameters - - Tor uses a stream cipher, a public-key cipher, the Diffie-Hellman - protocol, and a hash function. 
- - KEY_LEN -- the length of the stream cipher's key, in bytes. - - PK_ENC_LEN -- the length of a public-key encrypted message, in bytes. - PK_PAD_LEN -- the number of bytes added in padding for public-key - encryption, in bytes. (The largest number of bytes that can be encrypted - in a single public-key operation is therefore PK_ENC_LEN-PK_PAD_LEN.) - - DH_LEN -- the number of bytes used to represent a member of the - Diffie-Hellman group. - DH_SEC_LEN -- the number of bytes used in a Diffie-Hellman private key (x). - - HASH_LEN -- the length of the hash function's output, in bytes. - - PAYLOAD_LEN -- The longest allowable cell payload, in bytes. (509) - - CELL_LEN -- The length of a Tor cell, in bytes. - -0.3. Ciphers - - For a stream cipher, we use 128-bit AES in counter mode, with an IV of all - 0 bytes. - - For a public-key cipher, we use RSA with 1024-bit keys and a fixed - exponent of 65537. We use OAEP-MGF1 padding, with SHA-1 as its digest - function. We leave the optional "Label" parameter unset. (For OAEP - padding, see ftp://ftp.rsasecurity.com/pub/pkcs/pkcs-1/pkcs-1v2-1.pdf) - - For Diffie-Hellman, we use a generator (g) of 2. For the modulus (p), we - use the 1024-bit safe prime from rfc2409 section 6.2 whose hex - representation is: - - "FFFFFFFFFFFFFFFFC90FDAA22168C234C4C6628B80DC1CD129024E08" - "8A67CC74020BBEA63B139B22514A08798E3404DDEF9519B3CD3A431B" - "302B0A6DF25F14374FE1356D6D51C245E485B576625E7EC6F44C42E9" - "A637ED6B0BFF5CB6F406B7EDEE386BFB5A899FA5AE9F24117C4B1FE6" - "49286651ECE65381FFFFFFFFFFFFFFFF" - - As an optimization, implementations SHOULD choose DH private keys (x) of - 320 bits. Implementations that do this MUST never use any DH key more - than once. - [May other implementations reuse their DH keys?? -RD] - [Probably not. Conceivably, you could get away with changing DH keys once - per second, but there are too many oddball attacks for me to be - comfortable that this is safe. -NM] - - For a hash function, we use SHA-1. 
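The Diffie-Hellman parameters above can be sketched in Python as follows. The modulus and generator are copied from the text; the function names are hypothetical, and the sketch leaves out the hybrid encryption and key checks that the rest of the handshake requires.

```python
import secrets

# 1024-bit safe prime quoted in the text (rfc2409 section 6.2).
DH_P = int(
    "FFFFFFFFFFFFFFFFC90FDAA22168C234C4C6628B80DC1CD129024E08"
    "8A67CC74020BBEA63B139B22514A08798E3404DDEF9519B3CD3A431B"
    "302B0A6DF25F14374FE1356D6D51C245E485B576625E7EC6F44C42E9"
    "A637ED6B0BFF5CB6F406B7EDEE386BFB5A899FA5AE9F24117C4B1FE6"
    "49286651ECE65381FFFFFFFFFFFFFFFF",
    16,
)
DH_G = 2
DH_SEC_BITS = 320  # recommended private key size from the text


def dh_keypair():
    """Return a fresh (x, g^x mod p) pair; never reuse x across handshakes."""
    x = 0
    while x == 0:  # reject the (astronomically unlikely) zero exponent
        x = secrets.randbits(DH_SEC_BITS)
    return x, pow(DH_G, x, DH_P)


def dh_shared(private_x: int, peer_public: int) -> int:
    """Compute the shared value g^xy mod p from our x and the peer's g^y."""
    return pow(peer_public, private_x, DH_P)
```

Both sides arrive at the same value: dh_shared(x, gy) equals dh_shared(y, gx) for keypairs (x, gx) and (y, gy).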
- - KEY_LEN=16. - DH_LEN=128; DH_SEC_LEN=40. - PK_ENC_LEN=128; PK_PAD_LEN=42. - HASH_LEN=20. - - When we refer to "the hash of a public key", we mean the SHA-1 hash of the - DER encoding of an ASN.1 RSA public key (as specified in PKCS.1). - - All "random" values should be generated with a cryptographically strong - random number generator, unless otherwise noted. - - The "hybrid encryption" of a byte sequence M with a public key PK is - computed as follows: - 1. If M is less than PK_ENC_LEN-PK_PAD_LEN, pad and encrypt M with PK. - 2. Otherwise, generate a KEY_LEN byte random key K. - Let M1 = the first PK_ENC_LEN-PK_PAD_LEN-KEY_LEN bytes of M, - and let M2 = the rest of M. - Pad and encrypt K|M1 with PK. Encrypt M2 with our stream cipher, - using the key K. Concatenate these encrypted values. - [XXX Note that this "hybrid encryption" approach does not prevent - an attacker from adding or removing bytes to the end of M. It also - allows attackers to modify the bytes not covered by the OAEP -- - see Goldberg's PET2006 paper for details. We will add a MAC to this - scheme one day. -RD] - -0.4. Other parameter values - - CELL_LEN=512 - -1. System overview - - Tor is a distributed overlay network designed to anonymize - low-latency TCP-based applications such as web browsing, secure shell, - and instant messaging. Clients choose a path through the network and - build a ``circuit'', in which each node (or ``onion router'' or ``OR'') - in the path knows its predecessor and successor, but no other nodes in - the circuit. Traffic flowing down the circuit is sent in fixed-size - ``cells'', which are unwrapped by a symmetric key at each node (like - the layers of an onion) and relayed downstream. - -1.1. Keys and names - - Every Tor server has multiple public/private keypairs: - - - A long-term signing-only "Identity key" used to sign documents and - certificates, and used to establish server identity. 
- - A medium-term "Onion key" used to decrypt onion skins when accepting - circuit extend attempts. (See 5.1.) Old keys MUST be accepted for at - least one week after they are no longer advertised. Because of this, - servers MUST retain old keys for a while after they're rotated. - - A short-term "Connection key" used to negotiate TLS connections. - Tor implementations MAY rotate this key as often as they like, and - SHOULD rotate this key at least once a day. - - Tor servers are also identified by "nicknames"; these are specified in - dir-spec.txt. - -2. Connections - - Connections between two Tor servers, or between a client and a server, - use TLS/SSLv3 for link authentication and encryption. All - implementations MUST support the SSLv3 ciphersuite - "SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA", and SHOULD support the TLS - ciphersuite "TLS_DHE_RSA_WITH_AES_128_CBC_SHA" if it is available. - - There are three acceptable ways to perform a TLS handshake when - connecting to a Tor server: "certificates up-front", "renegotiation", and - "backwards-compatible renegotiation". ("Backwards-compatible - renegotiation" is, as the name implies, compatible with both other - handshake types.) - - Before Tor 0.2.0.21, only "certificates up-front" was supported. In Tor - 0.2.0.21 or later, "backwards-compatible renegotiation" is used. - - In "certificates up-front", the connection initiator always sends a - two-certificate chain, consisting of an X.509 certificate using a - short-term connection public key and a second, self- signed X.509 - certificate containing its identity key. The other party sends a similar - certificate chain. The initiator's ClientHello MUST NOT include any - ciphersuites other than: - TLS_DHE_RSA_WITH_AES_256_CBC_SHA - TLS_DHE_RSA_WITH_AES_128_CBC_SHA - SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA - SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA - - In "renegotiation", the connection initiator sends no certificates, and - the responder sends a single connection certificate. 
Once the TLS - handshake is complete, the initiator renegotiates the handshake, with each - party sending a two-certificate chain as in "certificates up-front". - The initiator's ClientHello MUST include at least one ciphersuite not in - the list above. The responder SHOULD NOT select any ciphersuite besides - those in the list above. - [The above "should not" is because some of the ciphers that - clients list may be fake.] - - In "backwards-compatible renegotiation", the connection initiator's - ClientHello MUST include at least one ciphersuite other than those listed - above. The connection responder examines the initiator's ciphersuite list - to see whether it includes any ciphers other than those included in the - list above. If extra ciphers are included, the responder proceeds as in - "renegotiation": it sends a single certificate and does not request - client certificates. Otherwise (in the case that no extra ciphersuites - are included in the ClientHello) the responder proceeds as in - "certificates up-front": it requests client certificates, and sends a - two-certificate chain. In either case, once the responder has sent its - certificate or certificates, the initiator counts them. If two - certificates have been sent, it proceeds as in "certificates up-front"; - otherwise, it proceeds as in "renegotiation". - - All new implementations of the Tor server protocol MUST support - "backwards-compatible renegotiation"; clients SHOULD do this too. If - this is not possible, new client implementations MUST support both - "renegotiation" and "certificates up-front" and use the router's - published link protocols list (see dir-spec.txt on the "protocols" entry) - to decide which to use. - - In all of the above handshake variants, certificates sent in the clear - SHOULD NOT include any strings to identify the host as a Tor server. 
In - the "renegotiation" and "backwards-compatible renegotiation" steps, the - initiator SHOULD choose a list of ciphersuites and TLS extensions - to mimic one used by a popular web browser. - - Responders MUST NOT select any TLS ciphersuite that lacks ephemeral keys, - or whose symmetric keys are less than KEY_LEN bits, or whose digests are - less than HASH_LEN bits. Responders SHOULD NOT select any SSLv3 - ciphersuite other than those listed above. - - Even though the connection protocol is identical, we will think of the - initiator as either an onion router (OR) if it is willing to relay - traffic for other Tor users, or an onion proxy (OP) if it only handles - local requests. Onion proxies SHOULD NOT provide long-term-trackable - identifiers in their handshakes. - - In all handshake variants, once all certificates are exchanged, all - parties receiving certificates must confirm that the identity key is as - expected. (When initiating a connection, the expected identity key is - the one given in the directory; when creating a connection because of an - EXTEND cell, the expected identity key is the one given in the cell.) If - the key is not as expected, the party must close the connection. - - When connecting to an OR, all parties SHOULD reject the connection if that - OR has a malformed or missing certificate. When accepting an incoming - connection, an OR SHOULD NOT reject incoming connections from parties with - malformed or missing certificates. (However, an OR should not believe - that an incoming connection is from another OR unless the certificates - are present and well-formed.) - - [Before version 0.1.2.8-rc, ORs rejected incoming connections from ORs and - OPs alike if their certificates were missing or malformed.] - - Once a TLS connection is established, the two sides send cells - (specified below) to one another. Cells are sent serially. All - cells are CELL_LEN bytes long. 
Cells may be sent embedded in TLS - records of any size or divided across TLS records, but the framing - of TLS records MUST NOT leak information about the type or contents - of the cells. - - TLS connections are not permanent. Either side MAY close a connection - if there are no circuits running over it and an amount of time - (KeepalivePeriod, defaults to 5 minutes) has passed since the last time - any traffic was transmitted over the TLS connection. Clients SHOULD - also hold a TLS connection with no circuits open, if it is likely that a - circuit will be built soon using that connection. - - (As an exception, directory servers may try to stay connected to all of - the ORs -- though this will be phased out for the Tor 0.1.2.x release.) - - To avoid being trivially distinguished from servers, client-only Tor - instances are encouraged but not required to use a two-certificate chain - as well. Clients SHOULD NOT keep using the same certificates when - their IP address changes. Clients MAY send no certificates at all. - -3. Cell Packet format - - The basic unit of communication for onion routers and onion - proxies is a fixed-width "cell". - - On a version 1 connection, each cell contains the following - fields: - - CircID [2 bytes] - Command [1 byte] - Payload (padded with 0 bytes) [PAYLOAD_LEN bytes] - - On a version 2 connection, all cells are as in version 1 connections, - except for the initial VERSIONS cell, whose format is: - - Circuit [2 octets; set to 0] - Command [1 octet; set to 7 for VERSIONS] - Length [2 octets; big-endian integer] - Payload [Length bytes] - - The CircID field determines which circuit, if any, the cell is - associated with. 
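The two layouts above can be sketched with Python's struct module. The field sizes, the padding rule, and the VERSIONS command value come from the text; the function names are hypothetical.

```python
import struct

CELL_LEN = 512
PAYLOAD_LEN = 509
CMD_VERSIONS = 7


def pack_cell(circ_id: int, command: int, payload: bytes = b"") -> bytes:
    """Pack a fixed-width cell: CircID, Command, zero-padded Payload."""
    if len(payload) > PAYLOAD_LEN:
        raise ValueError("payload longer than PAYLOAD_LEN")
    header = struct.pack("!HB", circ_id, command)  # network byte order
    return header + payload.ljust(PAYLOAD_LEN, b"\x00")


def unpack_cell(cell: bytes):
    """Split a fixed-width cell into (circ_id, command, padded payload)."""
    if len(cell) != CELL_LEN:
        raise ValueError("cell must be exactly CELL_LEN bytes")
    circ_id, command = struct.unpack("!HB", cell[:3])
    return circ_id, command, cell[3:]


def pack_versions_cell(payload: bytes) -> bytes:
    """Pack the initial VERSIONS cell: Circuit 0, Command 7, Length, Payload."""
    return struct.pack("!HBH", 0, CMD_VERSIONS, len(payload)) + payload
```

Note that unpack_cell returns the payload with its zero padding intact, since only the cell's own contents say where the meaningful bytes end.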
- - The 'Command' field holds one of the following values: - 0 -- PADDING (Padding) (See Sec 7.2) - 1 -- CREATE (Create a circuit) (See Sec 5.1) - 2 -- CREATED (Acknowledge create) (See Sec 5.1) - 3 -- RELAY (End-to-end data) (See Sec 5.5 and 6) - 4 -- DESTROY (Stop using a circuit) (See Sec 5.4) - 5 -- CREATE_FAST (Create a circuit, no PK) (See Sec 5.1) - 6 -- CREATED_FAST (Circuit created, no PK) (See Sec 5.1) - 7 -- VERSIONS (Negotiate proto version) (See Sec 4) - 8 -- NETINFO (Time and address info) (See Sec 4) - 9 -- RELAY_EARLY (End-to-end data; limited)(See Sec 5.6) - - The interpretation of 'Payload' depends on the type of the cell. - PADDING: Payload is unused. - CREATE: Payload contains the handshake challenge. - CREATED: Payload contains the handshake response. - RELAY: Payload contains the relay header and relay body. - DESTROY: Payload contains a reason for closing the circuit. - (see 5.4) - Upon receiving any other value for the command field, an OR must - drop the cell. Since more cell types may be added in the future, ORs - should generally not warn when encountering unrecognized commands. - - The payload is padded with 0 bytes. - - PADDING cells are currently used to implement connection keepalive. - If there is no other traffic, ORs and OPs send one another a PADDING - cell every few minutes. - - CREATE, CREATED, and DESTROY cells are used to manage circuits; - see section 5 below. - - RELAY cells are used to send commands and data along a circuit; see - section 6 below. - - VERSIONS and NETINFO cells are used to set up connections. See section 4 - below. - -4. Negotiating and initializing connections - -4.1. Negotiating versions with VERSIONS cells - - There are multiple instances of the Tor link connection protocol. Any - connection negotiated using the "certificates up front" handshake (see - section 2 above) is "version 1". 
In any connection where both parties - have behaved as in the "renegotiation" handshake, the link protocol - version is 2 or higher. - - To determine the version, in any connection where the "renegotiation" - handshake was used (that is, where the server sent only one certificate - at first and where the client did not send any certificates until - renegotiation), both parties MUST send a VERSIONS cell immediately after - the renegotiation is finished, before any other cells are sent. Parties - MUST NOT send any other cells on a connection until they have received a - VERSIONS cell. - - The payload in a VERSIONS cell is a series of big-endian two-byte - integers. Both parties MUST select as the link protocol version the - highest number contained both in the VERSIONS cell they sent and in the - versions cell they received. If they have no such version in common, - they cannot communicate and MUST close the connection. - - Since the version 1 link protocol does not use the "renegotiation" - handshake, implementations MUST NOT list version 1 in their VERSIONS - cell. - -4.2. NETINFO cells - - If version 2 or higher is negotiated, each party sends the other a - NETINFO cell. The cell's payload is: - - Timestamp [4 bytes] - Other OR's address [variable] - Number of addresses [1 byte] - This OR's addresses [variable] - - The address format is a type/length/value sequence as given in section - 6.4 below. The timestamp is a big-endian unsigned integer number of - seconds since the Unix epoch. - - Implementations MAY use the timestamp value to help decide if their - clocks are skewed. Initiators MAY use "other OR's address" to help - learn which address their connections are originating from, if they do - not know it. Initiators SHOULD use "this OR's address" to make sure - that they have connected to another OR at its canonical address. - - [As of 0.2.0.23-rc, implementations use none of the above values.] - - -5. Circuit management - -5.1. 
CREATE and CREATED cells - - Users set up circuits incrementally, one hop at a time. To create a - new circuit, OPs send a CREATE cell to the first node, with the - first half of the DH handshake; that node responds with a CREATED - cell with the second half of the DH handshake plus the first 20 bytes - of derivative key data (see section 5.2). To extend a circuit past - the first hop, the OP sends an EXTEND relay cell (see section 5) - which instructs the last node in the circuit to send a CREATE cell - to extend the circuit. - - The payload for a CREATE cell is an 'onion skin', which consists - of the first step of the DH handshake data (also known as g^x). - This value is hybrid-encrypted (see 0.3) to Bob's onion key, giving - an onion-skin of: - PK-encrypted: - Padding [PK_PAD_LEN bytes] - Symmetric key [KEY_LEN bytes] - First part of g^x [PK_ENC_LEN-PK_PAD_LEN-KEY_LEN bytes] - Symmetrically encrypted: - Second part of g^x [DH_LEN-(PK_ENC_LEN-PK_PAD_LEN-KEY_LEN) - bytes] - - The relay payload for an EXTEND relay cell consists of: - Address [4 bytes] - Port [2 bytes] - Onion skin [DH_LEN+KEY_LEN+PK_PAD_LEN bytes] - Identity fingerprint [HASH_LEN bytes] - - The port and address fields denote the IPv4 address and port of the next - onion router in the circuit; the public key hash is the hash of the PKCS#1 - ASN1 encoding of the next onion router's identity (signing) key. (See 0.3 - above.) Including this hash allows the extending OR to verify that it is - indeed connected to the correct target OR, and prevents certain - man-in-the-middle attacks. - - The payload for a CREATED cell, or the relay payload for an - EXTENDED cell, contains: - DH data (g^y) [DH_LEN bytes] - Derivative key data (KH) [HASH_LEN bytes] <see 5.2 below> - - The CircID for a CREATE cell is an arbitrarily chosen 2-byte integer, - selected by the node (OP or OR) that sends the CREATE cell. 
To prevent - CircID collisions, when one node sends a CREATE cell to another, it chooses - from only one half of the possible values based on the ORs' public - identity keys: if the sending node has a lower key, it chooses a CircID with - an MSB of 0; otherwise, it chooses a CircID with an MSB of 1. - - (An OP with no public key MAY choose any CircID it wishes, since an OP - never needs to process a CREATE cell.) - - Public keys are compared numerically by modulus. - - As usual with DH, x and y MUST be generated randomly. - -5.1.1. CREATE_FAST/CREATED_FAST cells - - When initializing the first hop of a circuit, the OP has already - established the OR's identity and negotiated a secret key using TLS. - Because of this, it is not always necessary for the OP to perform the - public key operations to create a circuit. In this case, the - OP MAY send a CREATE_FAST cell instead of a CREATE cell for the first - hop only. The OR responds with a CREATED_FAST cell, and the circuit is - created. - - A CREATE_FAST cell contains: - - Key material (X) [HASH_LEN bytes] - - A CREATED_FAST cell contains: - - Key material (Y) [HASH_LEN bytes] - Derivative key data [HASH_LEN bytes] (See 5.2 below) - - The values of X and Y must be generated randomly. - - If an OR sees a circuit created with CREATE_FAST, the OR is sure to be the - first hop of a circuit. ORs SHOULD reject attempts to create streams with - RELAY_BEGIN exiting the circuit at the first hop: letting Tor be used as a - single hop proxy makes exit nodes a more attractive target for compromise. - -5.2. Setting circuit keys - - Once the handshake between the OP and an OR is completed, both can - now calculate g^xy with ordinary DH. Before computing g^xy, both client - and server MUST verify that the received g^x or g^y value is not degenerate; - that is, it must be strictly greater than 1 and strictly less than p-1 - where p is the DH modulus. Implementations MUST NOT complete a handshake - with degenerate keys. 
Implementations MUST NOT discard other "weak" - g^x values. - - (Discarding degenerate keys is critical for security; if bad keys - are not discarded, an attacker can substitute the server's CREATED - cell's g^y with 0 or 1, thus creating a known g^xy and impersonating - the server. Discarding other keys may allow attackers to learn bits of - the private key.) - - If CREATE or EXTEND is used to extend a circuit, the client and server - base their key material on K0=g^xy, represented as a big-endian unsigned - integer. - - If CREATE_FAST is used, the client and server base their key material on - K0=X|Y. - - From the base key material K0, they compute KEY_LEN*2+HASH_LEN*3 bytes of - derivative key data as - K = H(K0 | [00]) | H(K0 | [01]) | H(K0 | [02]) | ... - - The first HASH_LEN bytes of K form KH; the next HASH_LEN form the forward - digest Df; the next HASH_LEN form the backward digest Db; the next - KEY_LEN form Kf, and the final KEY_LEN form Kb. Excess bytes from K - are discarded. - - KH is used in the handshake response to demonstrate knowledge of the - computed shared key. Df is used to seed the integrity-checking hash - for the stream of data going from the OP to the OR, and Db seeds the - integrity-checking hash for the data stream from the OR to the OP. Kf - is used to encrypt the stream of data going from the OP to the OR, and - Kb is used to encrypt the stream of data going from the OR to the OP. - -5.3. Creating circuits - - When creating a circuit through the network, the circuit creator - (OP) performs the following steps: - - 1. Choose an onion router as an exit node (R_N), such that the onion - router's exit policy includes at least one pending stream that - needs a circuit (if there are any). - - 2. Choose a chain of (N-1) onion routers - (R_1...R_N-1) to constitute the path, such that no router - appears in the path twice. - - 3. If not already connected to the first router in the chain, - open a new connection to that router. - - 4. 
Choose a circID not already in use on the connection with the - first router in the chain; send a CREATE cell along the - connection, to be received by the first onion router. - - 5. Wait until a CREATED cell is received; finish the handshake - and extract the forward key Kf_1 and the backward key Kb_1. - - 6. For each subsequent onion router R (R_2 through R_N), extend - the circuit to R. - - To extend the circuit by a single onion router R_M, the OP performs - these steps: - - 1. Create an onion skin, encrypted to R_M's public onion key. - - 2. Send the onion skin in a relay EXTEND cell along - the circuit (see section 5). - - 3. When a relay EXTENDED cell is received, verify KH, and - calculate the shared keys. The circuit is now extended. - - When an onion router receives an EXTEND relay cell, it sends a CREATE - cell to the next onion router, with the enclosed onion skin as its - payload. As special cases, if the extend cell includes a digest of - all zeroes, or asks to extend back to the relay that sent the extend - cell, the circuit will fail and be torn down. The initiating onion - router chooses some circID not yet used on the connection between the - two onion routers. (But see section 5.1 above, concerning choosing - circIDs based on the ORs' public identity keys.) - - When an onion router receives a CREATE cell, if it already has a - circuit on the given connection with the given circID, it drops the - cell. Otherwise, after receiving the CREATE cell, it completes the - DH handshake, and replies with a CREATED cell. Upon receiving a - CREATED cell, an onion router packs its payload into an EXTENDED relay - cell (see section 5), and sends that cell up the circuit. Upon - receiving the EXTENDED relay cell, the OP can retrieve g^y. - - (As an optimization, OR implementations may delay processing onions - until a break in traffic allows time to do so without harming - network latency too greatly.) - -5.3.1. 
Canonical connections - - It is possible for an attacker to launch a man-in-the-middle attack - against a connection by telling OR Alice to extend to OR Bob at some - address X controlled by the attacker. The attacker cannot read the - encrypted traffic, but the attacker is now in a position to count all - bytes sent between Alice and Bob (assuming Alice was not already - connected to Bob). - - To prevent this, when an OR gets an extend request, it SHOULD use an - existing OR connection if the ID matches, and ANY of the following - conditions hold: - - The IP matches the requested IP. - - The OR knows that the IP of the connection it's using is canonical - because it was listed in the NETINFO cell. - - The OR knows that the IP of the connection it's using is canonical - because it was listed in the server descriptor. - - [This is not implemented in Tor 0.2.0.23-rc.] - -5.4. Tearing down circuits - - Circuits are torn down when an unrecoverable error occurs along - the circuit, or when all streams on a circuit are closed and the - circuit's intended lifetime is over. Circuits may be torn down - either completely or hop-by-hop. - - To tear down a circuit completely, an OR or OP sends a DESTROY - cell to the adjacent nodes on that circuit, using the appropriate - direction's circID. - - Upon receiving an outgoing DESTROY cell, an OR frees resources - associated with the corresponding circuit. If it's not the end of - the circuit, it sends a DESTROY cell for that circuit to the next OR - in the circuit. If the node is the end of the circuit, then it tears - down any associated edge connections (see section 6.1). - - After a DESTROY cell has been processed, an OR ignores all data or - destroy cells for the corresponding circuit. - - To tear down part of a circuit, the OP may send a RELAY_TRUNCATE cell - signaling a given OR (Stream ID zero). That OR sends a DESTROY - cell to the next node in the circuit, and replies to the OP with a - RELAY_TRUNCATED cell. 
- - [Note: If an OR receives a TRUNCATE cell and it has any RELAY cells - still queued on the circuit for the next node it will drop them - without sending them. This is not considered conformant behavior, - but it probably won't get fixed until a later version of Tor. Thus, - clients SHOULD NOT send a TRUNCATE cell to a node running any current - version of Tor if a) they have sent relay cells through that node, - and b) they aren't sure whether those cells have been sent on yes.] - - When an unrecoverable error occurs along one connection in a - circuit, the nodes on either side of the connection should, if they - are able, act as follows: the node closer to the OP should send a - RELAY_TRUNCATED cell towards the OP; the node farther from the OP - should send a DESTROY cell down the circuit. - - The payload of a RELAY_TRUNCATED or DESTROY cell contains a single octet, - describing why the circuit is being closed or truncated. When sending a - TRUNCATED or DESTROY cell because of another TRUNCATED or DESTROY cell, - the error code should be propagated. The origin of a circuit always sets - this error code to 0, to avoid leaking its version. - - The error codes are: - 0 -- NONE (No reason given.) - 1 -- PROTOCOL (Tor protocol violation.) - 2 -- INTERNAL (Internal error.) - 3 -- REQUESTED (A client sent a TRUNCATE command.) - 4 -- HIBERNATING (Not currently operating; trying to save bandwidth.) - 5 -- RESOURCELIMIT (Out of memory, sockets, or circuit IDs.) - 6 -- CONNECTFAILED (Unable to reach server.) - 7 -- OR_IDENTITY (Connected to server, but its OR identity was not - as expected.) - 8 -- OR_CONN_CLOSED (The OR connection that was carrying this circuit - died.) - 9 -- FINISHED (The circuit has expired for being dirty or old.) - 10 -- TIMEOUT (Circuit construction took too long) - 11 -- DESTROYED (The circuit was destroyed w/o client TRUNCATE) - 12 -- NOSUCHSERVICE (Request for unknown hidden service) - -5.5. 
Routing relay cells - - When an OR receives a RELAY or RELAY_EARLY cell, it checks the cell's - circID and determines whether it has a corresponding circuit along that - connection. If not, the OR drops the cell. - - Otherwise, if the OR is not at the OP edge of the circuit (that is, - either an 'exit node' or a non-edge node), it de/encrypts the payload - with the stream cipher, as follows: - 'Forward' relay cell (same direction as CREATE): - Use Kf as key; decrypt. - 'Back' relay cell (opposite direction from CREATE): - Use Kb as key; encrypt. - Note that in counter mode, decrypt and encrypt are the same operation. - - The OR then decides whether it recognizes the relay cell, by - inspecting the payload as described in section 6.1 below. If the OR - recognizes the cell, it processes the contents of the relay cell. - Otherwise, it passes the decrypted relay cell along the circuit if - the circuit continues. If the OR at the end of the circuit - encounters an unrecognized relay cell, an error has occurred: the OR - sends a DESTROY cell to tear down the circuit. - - When a relay cell arrives at an OP, the OP decrypts the payload - with the stream cipher as follows: - OP receives data cell: - For I=N...1, - Decrypt with Kb_I. If the payload is recognized (see - section 6..1), then stop and process the payload. - - For more information, see section 6 below. - -5.6. Handling relay_early cells - - A RELAY_EARLY cell is designed to limit the length any circuit can reach. - When an OR receives a RELAY_EARLY cell, and the next node in the circuit - is speaking v2 of the link protocol or later, the OR relays the cell as a - RELAY_EARLY cell. Otherwise, it relays it as a RELAY cell. - - If a node ever receives more than 8 RELAY_EARLY cells on a given - outbound circuit, it SHOULD close the circuit. (For historical reasons, - we don't limit the number of inbound RELAY_EARLY cells; they should - be harmless anyway because clients won't accept extend requests. See - bug 1038.) 
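The RELAY_EARLY rules just described (relay as RELAY_EARLY only when the next hop speaks link protocol v2 or later, and close the circuit after more than 8 such cells on an outbound circuit) can be sketched as below. This is an illustration only: the class, method and field names are hypothetical and are not taken from the Tor source.

```python
MAX_RELAY_EARLY = 8  # limit quoted in the removed spec text


class OutboundCircuit:
    """Hypothetical per-circuit state; not an actual Tor data structure."""

    def __init__(self):
        self.relay_early_seen = 0
        self.closed = False

    def on_relay_early(self, next_hop_link_version):
        """Return the cell command to forward, or None once the circuit is closed."""
        if self.closed:
            return None
        self.relay_early_seen += 1
        if self.relay_early_seen > MAX_RELAY_EARLY:
            # More than 8 RELAY_EARLY cells: the node SHOULD close the circuit.
            self.closed = True
            return None
        # Pass the cell on as RELAY_EARLY only if the next hop speaks v2 or later.
        return "RELAY_EARLY" if next_hop_link_version >= 2 else "RELAY"
```

The counter is kept per circuit rather than per connection, matching the "on a given outbound circuit" wording; inbound cells are deliberately not counted.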
- - When speaking v2 of the link protocol or later, clients MUST only send - EXTEND cells inside RELAY_EARLY cells. Clients SHOULD send the first ~8 - RELAY cells that are not targeted at the first hop of any circuit as - RELAY_EARLY cells too, in order to partially conceal the circuit length. - - [In a future version of Tor, servers will reject any EXTEND cell not - received in a RELAY_EARLY cell. See proposal 110.] - -6. Application connections and stream management - -6.1. Relay cells - - Within a circuit, the OP and the exit node use the contents of - RELAY packets to tunnel end-to-end commands and TCP connections - ("Streams") across circuits. End-to-end commands can be initiated - by either edge; streams are initiated by the OP. - - The payload of each unencrypted RELAY cell consists of: - Relay command [1 byte] - 'Recognized' [2 bytes] - StreamID [2 bytes] - Digest [4 bytes] - Length [2 bytes] - Data [CELL_LEN-14 bytes] - - The relay commands are: - 1 -- RELAY_BEGIN [forward] - 2 -- RELAY_DATA [forward or backward] - 3 -- RELAY_END [forward or backward] - 4 -- RELAY_CONNECTED [backward] - 5 -- RELAY_SENDME [forward or backward] [sometimes control] - 6 -- RELAY_EXTEND [forward] [control] - 7 -- RELAY_EXTENDED [backward] [control] - 8 -- RELAY_TRUNCATE [forward] [control] - 9 -- RELAY_TRUNCATED [backward] [control] - 10 -- RELAY_DROP [forward or backward] [control] - 11 -- RELAY_RESOLVE [forward] - 12 -- RELAY_RESOLVED [backward] - 13 -- RELAY_BEGIN_DIR [forward] - - 32..40 -- Used for hidden services; see rend-spec.txt. - - Commands labelled as "forward" must only be sent by the originator - of the circuit. Commands labelled as "backward" must only be sent by - other nodes in the circuit back to the originator. Commands marked - as either can be sent either by the originator or other nodes. 
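As a rough illustration of the relay payload layout and command table above, the sketch below unpacks the 11-byte relay header from an already decrypted payload. It assumes network (big-endian) byte order and CELL_LEN = 512, neither of which is stated in this excerpt; parse_relay_payload and the dictionary it returns are hypothetical names, not part of Tor.

```python
import struct

CELL_LEN = 512                    # assumed fixed cell size (not stated in this excerpt)
RELAY_DATA_LEN = CELL_LEN - 14    # size of the Data field, per the layout above

# Relay command (1), 'Recognized' (2), StreamID (2), Digest (4), Length (2)
RELAY_HEADER = struct.Struct(">BHH4sH")

RELAY_COMMANDS = {
    1: "RELAY_BEGIN", 2: "RELAY_DATA", 3: "RELAY_END", 4: "RELAY_CONNECTED",
    5: "RELAY_SENDME", 6: "RELAY_EXTEND", 7: "RELAY_EXTENDED",
    8: "RELAY_TRUNCATE", 9: "RELAY_TRUNCATED", 10: "RELAY_DROP",
    11: "RELAY_RESOLVE", 12: "RELAY_RESOLVED", 13: "RELAY_BEGIN_DIR",
}


def parse_relay_payload(payload):
    """Split an unencrypted relay payload into its header fields and data."""
    if len(payload) != RELAY_HEADER.size + RELAY_DATA_LEN:
        raise ValueError("unexpected relay payload size")
    command, recognized, stream_id, digest, length = RELAY_HEADER.unpack_from(payload)
    if length > RELAY_DATA_LEN:
        raise ValueError("Length field exceeds the Data field")
    start = RELAY_HEADER.size
    return {
        "command": RELAY_COMMANDS.get(command, "UNKNOWN"),
        "recognized": recognized,
        "stream_id": stream_id,
        "digest": digest,
        "data": payload[start:start + length],  # bytes past Length are padding
    }
```

The sketch does not check the digest or decide whether the cell is "recognized"; it only shows how the fixed-width fields line up.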
- - The 'recognized' field in any unencrypted relay payload is always set - to zero; the 'digest' field is computed as the first four bytes of - the running digest of all the bytes that have been destined for - this hop of the circuit or originated from this hop of the circuit, - seeded from Df or Db respectively (obtained in section 5.2 above), - and including this RELAY cell's entire payload (taken with the digest - field set to zero). - - When the 'recognized' field of a RELAY cell is zero, and the digest - is correct, the cell is considered "recognized" for the purposes of - decryption (see section 5.5 above). - - (The digest does not include any bytes from relay cells that do - not start or end at this hop of the circuit. That is, it does not - include forwarded data. Therefore if 'recognized' is zero but the - digest does not match, the running digest at that node should - not be updated, and the cell should be forwarded on.) - - All RELAY cells pertaining to the same tunneled stream have the - same stream ID. StreamIDs are chosen arbitrarily by the OP. RELAY - cells that affect the entire circuit rather than a particular - stream use a StreamID of zero -- they are marked in the table above - as "[control]" style cells. (Sendme cells are marked as "sometimes - control" because they can take include a StreamID or not depending - on their purpose -- see Section 7.) - - The 'Length' field of a relay cell contains the number of bytes in - the relay payload which contain real payload data. The remainder of - the payload is padded with NUL bytes. - - If the RELAY cell is recognized but the relay command is not - understood, the cell must be dropped and ignored. Its contents - still count with respect to the digests, though. - -6.2. 
Opening streams and transferring data - - To open a new anonymized TCP connection, the OP chooses an open - circuit to an exit that may be able to connect to the destination - address, selects an arbitrary StreamID not yet used on that circuit, - and constructs a RELAY_BEGIN cell with a payload encoding the address - and port of the destination host. The payload format is: - - ADDRESS | ':' | PORT | [00] - - where ADDRESS can be a DNS hostname, or an IPv4 address in - dotted-quad format, or an IPv6 address surrounded by square brackets; - and where PORT is a decimal integer between 1 and 65535, inclusive. - - [What is the [00] for? -NM] - [It's so the payload is easy to parse out with string funcs -RD] - - Upon receiving this cell, the exit node resolves the address as - necessary, and opens a new TCP connection to the target port. If the - address cannot be resolved, or a connection can't be established, the - exit node replies with a RELAY_END cell. (See 6.4 below.) - Otherwise, the exit node replies with a RELAY_CONNECTED cell, whose - payload is in one of the following formats: - The IPv4 address to which the connection was made [4 octets] - A number of seconds (TTL) for which the address may be cached [4 octets] - or - Four zero-valued octets [4 octets] - An address type (6) [1 octet] - The IPv6 address to which the connection was made [16 octets] - A number of seconds (TTL) for which the address may be cached [4 octets] - [XXXX No version of Tor currently generates the IPv6 format.] - - [Tor servers before 0.1.2.0 set the TTL field to a fixed value. Later - versions set the TTL to the last value seen from a DNS server, and expire - their own cached entries after a fixed interval. This prevents certain - attacks.] - - The OP waits for a RELAY_CONNECTED cell before sending any data. 
- Once a connection has been established, the OP and exit node - package stream data in RELAY_DATA cells, and upon receiving such - cells, echo their contents to the corresponding TCP stream. - RELAY_DATA cells sent to unrecognized streams are dropped. - - Relay RELAY_DROP cells are long-range dummies; upon receiving such - a cell, the OR or OP must drop it. - -6.2.1. Opening a directory stream - - If a Tor server is a directory server, it should respond to a - RELAY_BEGIN_DIR cell as if it had received a BEGIN cell requesting a - connection to its directory port. RELAY_BEGIN_DIR cells ignore exit - policy, since the stream is local to the Tor process. - - If the Tor server is not running a directory service, it should respond - with a REASON_NOTDIRECTORY RELAY_END cell. - - Clients MUST generate an all-zero payload for RELAY_BEGIN_DIR cells, - and servers MUST ignore the payload. - - [RELAY_BEGIN_DIR was not supported before Tor 0.1.2.2-alpha; clients - SHOULD NOT send it to routers running earlier versions of Tor.] - -6.3. Closing streams - - When an anonymized TCP connection is closed, or an edge node - encounters error on any stream, it sends a 'RELAY_END' cell along the - circuit (if possible) and closes the TCP connection immediately. If - an edge node receives a 'RELAY_END' cell for any stream, it closes - the TCP connection completely, and sends nothing more along the - circuit for that stream. - - The payload of a RELAY_END cell begins with a single 'reason' byte to - describe why the stream is closing, plus optional data (depending on - the reason.) 
The values are: - - 1 -- REASON_MISC (catch-all for unlisted reasons) - 2 -- REASON_RESOLVEFAILED (couldn't look up hostname) - 3 -- REASON_CONNECTREFUSED (remote host refused connection) [*] - 4 -- REASON_EXITPOLICY (OR refuses to connect to host or port) - 5 -- REASON_DESTROY (Circuit is being destroyed) - 6 -- REASON_DONE (Anonymized TCP connection was closed) - 7 -- REASON_TIMEOUT (Connection timed out, or OR timed out - while connecting) - 8 -- REASON_NOROUTE (Routing error while attempting to - contact destination) - 9 -- REASON_HIBERNATING (OR is temporarily hibernating) - 10 -- REASON_INTERNAL (Internal error at the OR) - 11 -- REASON_RESOURCELIMIT (OR has no resources to fulfill request) - 12 -- REASON_CONNRESET (Connection was unexpectedly reset) - 13 -- REASON_TORPROTOCOL (Sent when closing connection because of - Tor protocol violations.) - 14 -- REASON_NOTDIRECTORY (Client sent RELAY_BEGIN_DIR to a - non-directory server.) - - (With REASON_EXITPOLICY, the 4-byte IPv4 address or 16-byte IPv6 address - forms the optional data, along with a 4-byte TTL; no other reason - currently has extra data.) - - OPs and ORs MUST accept reasons not on the above list, since future - versions of Tor may provide more fine-grained reasons. - - Tors SHOULD NOT send any reason except REASON_MISC for a stream that they - have originated. - - [*] Older versions of Tor also send this reason when connections are - reset. - - --- [The rest of this section describes unimplemented functionality.] - - Because TCP connections can be half-open, we follow an equivalent - to TCP's FIN/FIN-ACK/ACK protocol to close streams. - - An exit connection can have a TCP stream in one of three states: - 'OPEN', 'DONE_PACKAGING', and 'DONE_DELIVERING'. For the purposes - of modeling transitions, we treat 'CLOSED' as a fourth state, - although connections in this state are not, in fact, tracked by the - onion router. - - A stream begins in the 'OPEN' state. 
Upon receiving a 'FIN' from - the corresponding TCP connection, the edge node sends a 'RELAY_FIN' - cell along the circuit and changes its state to 'DONE_PACKAGING'. - Upon receiving a 'RELAY_FIN' cell, an edge node sends a 'FIN' to - the corresponding TCP connection (e.g., by calling - shutdown(SHUT_WR)) and changing its state to 'DONE_DELIVERING'. - - When a stream in already in 'DONE_DELIVERING' receives a 'FIN', it - also sends a 'RELAY_FIN' along the circuit, and changes its state - to 'CLOSED'. When a stream already in 'DONE_PACKAGING' receives a - 'RELAY_FIN' cell, it sends a 'FIN' and changes its state to - 'CLOSED'. - - If an edge node encounters an error on any stream, it sends a - 'RELAY_END' cell (if possible) and closes the stream immediately. - -6.4. Remote hostname lookup - - To find the address associated with a hostname, the OP sends a - RELAY_RESOLVE cell containing the hostname to be resolved with a NUL - terminating byte. (For a reverse lookup, the OP sends a RELAY_RESOLVE - cell containing an in-addr.arpa address.) The OR replies with a - RELAY_RESOLVED cell containing a status byte, and any number of - answers. Each answer is of the form: - Type (1 octet) - Length (1 octet) - Value (variable-width) - TTL (4 octets) - "Length" is the length of the Value field. - "Type" is one of: - 0x00 -- Hostname - 0x04 -- IPv4 address - 0x06 -- IPv6 address - 0xF0 -- Error, transient - 0xF1 -- Error, nontransient - - If any answer has a type of 'Error', then no other answer may be given. - - The RELAY_RESOLVE cell must use a nonzero, distinct streamID; the - corresponding RELAY_RESOLVED cell must use the same streamID. No stream - is actually created by the OR when resolving the name. - -7. Flow control - -7.1. Link throttling - - Each client or relay should do appropriate bandwidth throttling to - keep its user happy. - - Communicants rely on TCP's default flow control to push back when they - stop reading. 
- - The mainline Tor implementation uses token buckets (one for reads, - one for writes) for the rate limiting. - - Since 0.2.0.x, Tor has let the user specify an additional pair of - token buckets for "relayed" traffic, so people can deploy a Tor relay - with strict rate limiting, but also use the same Tor as a client. To - avoid partitioning concerns we combine both classes of traffic over a - given OR connection, and keep track of the last time we read or wrote - a high-priority (non-relayed) cell. If it's been less than N seconds - (currently N=30), we give the whole connection high priority, else we - give the whole connection low priority. We also give low priority - to reads and writes for connections that are serving directory - information. See proposal 111 for details. - -7.2. Link padding - - Link padding can be created by sending PADDING cells along the - connection; relay cells of type "DROP" can be used for long-range - padding. - - Currently nodes are not required to do any sort of link padding or - dummy traffic. Because strong attacks exist even with link padding, - and because link padding greatly increases the bandwidth requirements - for running a node, we plan to leave out link padding until this - tradeoff is better understood. - -7.3. Circuit-level flow control - - To control a circuit's bandwidth usage, each OR keeps track of two - 'windows', consisting of how many RELAY_DATA cells it is allowed to - originate (package for transmission), and how many RELAY_DATA cells - it is willing to consume (receive for local streams). These limits - do not apply to cells that the OR receives from one host and relays - to another. - - Each 'window' value is initially set to 1000 data cells - in each direction (cells that are not data cells do not affect - the window). When an OR is willing to deliver more cells, it sends a - RELAY_SENDME cell towards the OP, with Stream ID zero. 
When an OR - receives a RELAY_SENDME cell with stream ID zero, it increments its - packaging window. - - Each of these cells increments the corresponding window by 100. - - The OP behaves identically, except that it must track a packaging - window and a delivery window for every OR in the circuit. - - An OR or OP sends cells to increment its delivery window when the - corresponding window value falls under some threshold (900). - - If a packaging window reaches 0, the OR or OP stops reading from - TCP connections for all streams on the corresponding circuit, and - sends no more RELAY_DATA cells until receiving a RELAY_SENDME cell. -[this stuff is badly worded; copy in the tor-design section -RD] - -7.4. Stream-level flow control - - Edge nodes use RELAY_SENDME cells to implement end-to-end flow - control for individual connections across circuits. Similarly to - circuit-level flow control, edge nodes begin with a window of cells - (500) per stream, and increment the window by a fixed value (50) - upon receiving a RELAY_SENDME cell. Edge nodes initiate RELAY_SENDME - cells when both a) the window is <= 450, and b) there are less than - ten cell payloads remaining to be flushed at that edge. - -A.1. Differences between spec and implementation - -- The current specification requires all ORs to have IPv4 addresses, but - allows servers to exit and resolve to IPv6 addresses, and to declare IPv6 - addresses in their exit policies. The current codebase has no IPv6 - support at all. - diff --git a/doc/spec/version-spec.txt b/doc/spec/version-spec.txt deleted file mode 100644 index 265717f409..0000000000 --- a/doc/spec/version-spec.txt +++ /dev/null @@ -1,44 +0,0 @@ - - HOW TOR VERSION NUMBERS WORK - -1. The Old Way - - Before 0.1.0, versions were of the format: - MAJOR.MINOR.MICRO(status(PATCHLEVEL))?(-cvs)? - where MAJOR, MINOR, MICRO, and PATCHLEVEL are numbers, status is one - of "pre" (for an alpha release), "rc" (for a release candidate), or - "." for a release. 
As a special case, "a.b.c" was equivalent to - "a.b.c.0". We compare the elements in order (major, minor, micro, - status, patchlevel, cvs), with "cvs" preceding non-cvs. - - We would start each development branch with a final version in mind: - say, "0.0.8". Our first pre-release would be "0.0.8pre1", followed by - (for example) "0.0.8pre2-cvs", "0.0.8pre2", "0.0.8pre3-cvs", - "0.0.8rc1", "0.0.8rc2-cvs", and "0.0.8rc2". Finally, we'd release - 0.0.8. The stable CVS branch would then be versioned "0.0.8.1-cvs", - and any eventual bugfix release would be "0.0.8.1". - -2. The New Way - - After 0.1.0, versions are of the format: - MAJOR.MINOR.MICRO(.PATCHLEVEL)(-status_tag) - The stuff in parentheses is optional. As before, MAJOR, MINOR, MICRO, - and PATCHLEVEL are numbers, with an absent number equivalent to 0. - All versions should be distinguishable purely by those four - numbers. The status tag is purely informational, and lets you know how - stable we think the release is: "alpha" is pretty unstable; "rc" is a - release candidate; and no tag at all means that we have a final - release. If the tag ends with "-cvs" or "-dev", you're looking at a - development snapshot that came after a given release. If we *do* - encounter two versions that differ only by status tag, we compare them - lexically. - - Now, we start each development branch with (say) 0.1.1.1-alpha. The - patchlevel increments consistently as the status tag changes, for - example, as in: 0.1.1.2-alpha, 0.1.1.3-alpha, 0.1.1.4-rc, 0.1.1.5-rc. - Eventually, we release 0.1.1.6. The next patch release is 0.1.1.7. - - Between these releases, CVS is versioned with a -cvs tag: after - 0.1.1.1-alpha comes 0.1.1.1-alpha-cvs, and so on. But starting with - 0.1.2.1-alpha-dev, we switched to SVN and started using the "-dev" - suffix instead of the "-cvs" suffix. 
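The "new way" format from the removed version-spec.txt can be parsed and ordered with a short sketch like the one below. It follows only what the text states: four numeric fields with an absent number treated as 0, and a plain lexical comparison of status tags when the numbers are equal. parse_version is a hypothetical helper, not a function from the Tor codebase, and the old pre-0.1.0 format is not handled.

```python
import re

# MAJOR.MINOR.MICRO(.PATCHLEVEL)(-status_tag)
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:\.(\d+))?(?:-(.+))?$")


def parse_version(text):
    """Return a tuple that sorts in the order described for the new format."""
    match = _VERSION_RE.match(text)
    if match is None:
        raise ValueError("not a new-style version: %r" % (text,))
    major, minor, micro, patchlevel, tag = match.groups()
    # An absent PATCHLEVEL counts as 0; an absent status tag becomes "".
    return (int(major), int(minor), int(micro), int(patchlevel or 0), tag or "")
```

For example, `sorted(names, key=parse_version)` orders a list of version strings by the four numbers first. Because the tag comparison is purely lexical, a tagless release sorts before a tagged one with identical numbers, which is why the text says versions should be distinguishable by the numbers alone.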
diff --git a/doc/tor.1.txt b/doc/tor.1.txt index a4ab0d921f..033f0a28e1 100644 --- a/doc/tor.1.txt +++ b/doc/tor.1.txt @@ -499,7 +499,7 @@ The following options are useful only for clients (that is, if list. **EntryNodes** __node__,__node__,__...__:: - A list of identity fingerprints, nicknames, country codes and address + A list of identity fingerprints, nicknames and address patterns of nodes to use for the first hop in normal circuits. These are treated only as preferences unless StrictNodes (see below) is also set. @@ -682,8 +682,9 @@ The following options are useful only for clients (that is, if can leak your location to attackers. (Default: 1) **VirtualAddrNetwork** __Address__/__bits__:: - When a controller asks for a virtual (unused) address with the MAPADDRESS - command, Tor picks an unassigned address from this range. (Default: + When Tor needs to assign a virtual (unused) address because of a MAPADDRESS + command from the controller or the AutomapHostsOnResolve feature, Tor + picks an unassigned address from this range. (Default: 127.192.0.0/10) + + When providing proxy server service to a network of computers using a tool @@ -759,6 +760,12 @@ The following options are useful only for clients (that is, if 192.168.0.1). This option prevents certain browser-based attacks; don't turn it off unless you know what you're doing. (Default: 1). +**ClientRejectInternalAddresses** **0**|**1**:: + If true, Tor does not try to fulfill requests to connect to an internal + address (like 127.0.0.1 or 192.168.0.1) __unless a exit node is + specifically requested__ (for example, via a .exit hostname, or a + controller request). (Default: 1). + **DownloadExtraInfo** **0**|**1**:: If true, Tor downloads and caches "extra-info" documents. These documents contain information about servers other than the information in their @@ -903,9 +910,9 @@ is non-zero): specified in ORPort. 
(Default: 0.0.0.0) This directive can be specified multiple times to bind to multiple addresses/ports. -**PublishServerDescriptor** **0**|**1**|**v1**|**v2**|**v3**|**bridge**|**hidserv**,**...**:: +**PublishServerDescriptor** **0**|**1**|**v1**|**v2**|**v3**|**bridge**,**...**:: This option specifies which descriptors Tor will publish when acting as - a relay or hidden service. You can + a relay. You can choose multiple arguments, separated by commas. + If this option is set to 0, Tor will not publish its @@ -913,7 +920,7 @@ is non-zero): out your server, or if you're using a Tor controller that handles directory publishing for you.) Otherwise, Tor will publish its descriptors of all type(s) specified. The default is "1", - which means "if running as a server or a hidden service, publish the + which means "if running as a server, publish the appropriate descriptors to the authorities". **ShutdownWaitLength** __NUM__:: @@ -928,7 +935,9 @@ is non-zero): period, or receive more than that number in the period. For example, with AccountingMax set to 1 GB, a server could send 900 MB and receive 800 MB and continue running. It will only hibernate once one of the two reaches 1 - GB. When the number of bytes is exhausted, Tor will hibernate until some + GB. When the number of bytes gets low, Tor will stop accepting new + connections and circuits. When the number of bytes + is exhausted, Tor will hibernate until some time in the next accounting period. To prevent all servers from waking at the same time, Tor will also wait until a random point in each period before waking up. If you have bandwidth cost issues, enabling hibernation @@ -1088,7 +1097,8 @@ if DirPort is non-zero): **HSAuthoritativeDir** **0**|**1**:: When this option is set in addition to **AuthoritativeDirectory**, Tor also - accepts and serves hidden service descriptors. (Default: 0) + accepts and serves v0 hidden service descriptors, + which are produced and used by Tor 0.2.1.x and older. 
(Default: 0) **HidServDirectoryV2** **0**|**1**:: When this option is set, Tor accepts and serves v2 hidden service @@ -1295,6 +1305,7 @@ The following options are used for running a testing Tor network. AuthDirMaxServersPerAddr 0 AuthDirMaxServersPerAuthAddr 0 ClientDNSRejectInternalAddresses 0 + ClientRejectInternalAddresses 0 ExitPolicyRejectPrivate 0 V3AuthVotingInterval 5 minutes V3AuthVoteDelay 20 seconds |