author | Nick Mathewson <nickm@torproject.org> | 2011-02-21 16:10:31 -0500 |
---|---|---|
committer | Nick Mathewson <nickm@torproject.org> | 2011-02-21 16:10:31 -0500 |
commit | b99a8d54271dfd26c5c3a2d3086226974b32373e (patch) | |
tree | d3faab523a3f23925c654005f927945b79ce95fc | |
parent | 90f9caf4233c3db87f818d10c6c2b93e4fe398c9 (diff) | |
parent | d673479ebaa29b2dc8f227c342785112c945ec18 (diff) | |
Merge remote branch 'origin/maint-0.2.2'
Conflicts:
doc/spec/Makefile.am
doc/spec/control-spec.txt
doc/spec/dir-spec.txt
doc/spec/proposals/000-index.txt
doc/spec/proposals/001-process.txt
doc/spec/proposals/ideas/xxx-encrypted-services.txt
117 files changed, 11 insertions, 27211 deletions
diff --git a/doc/Makefile.am b/doc/Makefile.am
index c8bffc9310..6cc0ea99fb 100644
--- a/doc/Makefile.am
+++ b/doc/Makefile.am
@@ -1,4 +1,3 @@
-
 # We use a two-step process to generate documentation from asciidoc files.
 #
 # First, we use asciidoc/a2x to process the asciidoc files into .1.in and
@@ -36,16 +35,12 @@ endif
 EXTRA_DIST = HACKING asciidoc-helper.sh \
              $(html_in) $(man_in) $(txt_in) \
              tor-rpm-creation.txt \
-             tor-win32-mingw-creation.txt
+             tor-win32-mingw-creation.txt spec/README
 
 docdir = @docdir@
 
 asciidoc_product = $(nodist_man_MANS) $(doc_DATA)
 
-SUBDIRS = spec
-
-DIST_SUBDIRS = spec
-
 # Generate the html documentation from asciidoc, but don't do
 # machine-specific replacements yet
 $(html_in) :
diff --git a/doc/spec/Makefile.am b/doc/spec/Makefile.am
deleted file mode 100644
index a4fba780ee..0000000000
--- a/doc/spec/Makefile.am
+++ /dev/null
@@ -1,12 +0,0 @@
-
-EXTRA_DIST = \
-  address-spec.txt \
-  bridges-spec.txt \
-  control-spec.txt \
-  dir-spec.txt \
-  path-spec.txt \
-  rend-spec.txt \
-  socks-extensions.txt \
-  tor-spec.txt \
-  version-spec.txt
-
diff --git a/doc/spec/README b/doc/spec/README
new file mode 100644
index 0000000000..a7fa170020
--- /dev/null
+++ b/doc/spec/README
@@ -0,0 +1,10 @@
+The Tor specifications and proposals have moved to a new repository.
+
+To browse the specifications, go to
+  https://gitweb.torproject.org/torspec.git/tree
+
+To check out the specification repository, run
+  git clone git://git.torproject.org/torspec.git
+
+For other information on the repository, see
+  http://gitweb.torproject.org/torspec.git
diff --git a/doc/spec/address-spec.txt b/doc/spec/address-spec.txt
deleted file mode 100644
index ce6d2b65e7..0000000000
--- a/doc/spec/address-spec.txt
+++ /dev/null
@@ -1,58 +0,0 @@
-
-                          Special Hostnames in Tor
-                               Nick Mathewson
-
-1. Overview
-
-   Most of the time, Tor treats user-specified hostnames as opaque: When
-   the user connects to www.torproject.org, Tor picks an exit node and uses
-   that node to connect to "www.torproject.org". Some hostnames, however,
-   can be used to override Tor's default behavior and circuit-building
-   rules.
-
-   These hostnames can be passed to Tor as the address part of a SOCKS4a or
-   SOCKS5 request. If the application is connected to Tor using an IP-only
-   method (such as SOCKS4, TransPort, or NATDPort), these hostnames can be
-   substituted for certain IP addresses using the MapAddress configuration
-   option or the MAPADDRESS control command.
-
-2. .exit
-
-   SYNTAX: [hostname].[name-or-digest].exit
-           [name-or-digest].exit
-
-   Hostname is a valid hostname; [name-or-digest] is either the nickname of a
-   Tor node or the hex-encoded digest of that node's public key.
-
-   When Tor sees an address in this format, it uses the specified hostname as
-   the exit node. If no "hostname" component is given, Tor defaults to the
-   published IPv4 address of the exit node.
-
-   It is valid to try to resolve hostnames, and in fact upon success Tor
-   will cache an internal mapaddress of the form
-   "www.google.com.foo.exit=64.233.161.99.foo.exit" to speed subsequent
-   lookups.
-
-   The .exit notation is disabled by default as of Tor 0.2.2.1-alpha, due
-   to potential application-level attacks.
-
-   EXAMPLES:
-      www.example.com.exampletornode.exit
-
-         Connect to www.example.com from the node called "exampletornode".
-
-      exampletornode.exit
-
-         Connect to the published IP address of "exampletornode" using
-         "exampletornode" as the exit.
-
-3. .onion
-
-   SYNTAX: [digest].onion
-
-   The digest is the first eighty bits of a SHA1 hash of the identity key for
-   a hidden service, encoded in base32.
-
-   When Tor sees an address in this format, it tries to look up and connect to
-   the specified hidden service. See rend-spec.txt for full details.
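Reviewer note on the deleted address-spec.txt text above: the two address forms it describes can be sketched in a few lines of Python. This is only an illustration, not code from Tor. The function names `onion_label` and `split_exit_address` are hypothetical, the caller is assumed to already hold the serialized identity key bytes (the exact serialization is left to rend-spec.txt), and lowercasing the base32 output is an assumption about presentation.

```python
import base64
import hashlib


def onion_label(identity_key_bytes: bytes) -> str:
    """Return the [digest] part of a [digest].onion address.

    Hypothetical helper: takes the serialized identity key, hashes it
    with SHA1, keeps the first eighty bits, and base32-encodes them.
    """
    digest = hashlib.sha1(identity_key_bytes).digest()
    # Eighty bits are ten bytes, which encode to exactly sixteen base32
    # characters, so no '=' padding appears.
    return base64.b32encode(digest[:10]).decode("ascii").lower()


def split_exit_address(address: str):
    """Split '[hostname].[name-or-digest].exit' into its two parts.

    Returns (hostname, exit_node); hostname is None for the short form
    '[name-or-digest].exit'.
    """
    labels = address.split(".")
    if len(labels) < 2 or labels[-1] != "exit" or not labels[-2]:
        raise ValueError("not a .exit address: %r" % address)
    exit_node = labels[-2]
    hostname = ".".join(labels[:-2]) or None
    return hostname, exit_node
```

With the spec's own examples, `split_exit_address("www.example.com.exampletornode.exit")` yields the hostname `www.example.com` and the exit `exampletornode`, while `split_exit_address("exampletornode.exit")` yields no hostname at all.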
- diff --git a/doc/spec/bridges-spec.txt b/doc/spec/bridges-spec.txt deleted file mode 100644 index 647118815c..0000000000 --- a/doc/spec/bridges-spec.txt +++ /dev/null @@ -1,249 +0,0 @@ - - Tor bridges specification - -0. Preface - - This document describes the design decisions around support for bridge - users, bridge relays, and bridge authorities. It acts as an overview - of the bridge design and deployment for developers, and it also tries - to point out limitations in the current design and implementation. - - For more details on what all of these mean, look at blocking.tex in - /doc/design-paper/ - -1. Bridge relays - - Bridge relays are just like normal Tor relays except they don't publish - their server descriptors to the main directory authorities. - -1.1. PublishServerDescriptor - - To configure your relay to be a bridge relay, just add - BridgeRelay 1 - PublishServerDescriptor bridge - to your torrc. This will cause your relay to publish its descriptor - to the bridge authorities rather than to the default authorities. - - Alternatively, you can say - BridgeRelay 1 - PublishServerDescriptor 0 - which will cause your relay to not publish anywhere. This could be - useful for private bridges. - -1.2. Recommendations. - - Bridge relays should use an exit policy of "reject *:*". This is - because they only need to relay traffic between the bridge users - and the rest of the Tor network, so there's no need to let people - exit directly from them. - - We invented the RelayBandwidth* options for this situation: Tor clients - who want to allow relaying too. See proposal 111 for details. Relay - operators should feel free to rate-limit their relayed traffic. - -1.3. Implementation note. - - Vidalia 0.0.15 has turned its "Relay" settings page into a tri-state - "Don't relay" / "Relay for the Tor network" / "Help censored users". - - If you click the third choice, it forces your exit policy to reject *:*. 
- - If all the bridges end up on port 9001, that's not so good. On the - other hand, putting the bridges on a low-numbered port in the Unix - world requires jumping through extra hoops. The current compromise is - that Vidalia makes the ORPort default to 443 on Windows, and 9001 on - other platforms. - - At the bottom of the relay config settings window, Vidalia displays - the bridge identifier to the operator (see Section 3.1) so he can pass - it on to bridge users. - -2. Bridge authorities. - - Bridge authorities are like normal v3 directory authorities, except - they don't create their own network-status documents or votes. So if - you ask a bridge authority for a network-status document or consensus, - they behave like a directory mirror: they give you one from one of - the main authorities. But if you ask the bridge authority for the - descriptor corresponding to a particular identity fingerprint, it will - happily give you the latest descriptor for that fingerprint. - - To become a bridge authority, add these lines to your torrc: - AuthoritativeDirectory 1 - BridgeAuthoritativeDir 1 - - Right now there's one bridge authority, running on the Tonga relay. - -2.1. Exporting bridge-purpose descriptors - - We've added a new purpose for server descriptors: the "bridge" - purpose. With the new router-descriptors file format that includes - annotations, it's easy to look through it and find the bridge-purpose - descriptors. - - Currently we export the bridge descriptors from Tonga to the - BridgeDB server, so it can give them out according to the policies - in blocking.pdf. - -2.2. Reachability/uptime testing - - Right now the bridge authorities do active reachability testing of - bridges, so we know which ones to recommend for users. - - But in the design document, we suggested that bridges should publish - anonymously (i.e. via Tor) to the bridge authority, so somebody watching - the bridge authority can't just enumerate all the bridges. 
But if we're - doing active measurement, the game is up. Perhaps we should back off on - this goal, or perhaps we should do our active measurement anonymously? - - Answering this issue is scheduled for 0.2.1.x. - -2.3. Future work: migrating to multiple bridge authorities - - Having only one bridge authority is both a trust bottleneck (if you - break into one place you learn about every single bridge we've got) - and a robustness bottleneck (when it's down, bridge users become sad). - - Right now if we put up a second bridge authority, all the bridges would - publish to it, and (assuming the code works) bridge users would query - a random bridge authority. This resolves the robustness bottleneck, - but makes the trust bottleneck even worse. - - In 0.2.2.x and later we should think about better ways to have multiple - bridge authorities. - -3. Bridge users. - - Bridge users are like ordinary Tor users except they use encrypted - directory connections by default, and they use bridge relays as both - entry guards (their first hop) and directory guards (the source of - all their directory information). - - To become a bridge user, add the following line to your torrc: - UseBridges 1 - - and then add at least one "Bridge" line to your torrc based on the - format below. - -3.1. Format of the bridge identifier. - - The canonical format for a bridge identifier contains an IP address, - an ORPort, and an identity fingerprint: - bridge 128.31.0.34:9009 4C17 FB53 2E20 B2A8 AC19 9441 ECD2 B017 7B39 E4B1 - - However, the identity fingerprint can be left out, in which case the - bridge user will connect to that relay and use it as a bridge regardless - of what identity key it presents: - bridge 128.31.0.34:9009 - This might be useful for cases where only short bridge identifiers - can be communicated to bridge users. 
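Reviewer note on the bridge identifier format quoted above: a minimal parsing sketch in Python, assuming the two forms shown (address and ORPort, with an optional space-grouped fingerprint). The function name `parse_bridge_line` is hypothetical, and matching the "bridge" keyword case-insensitively is an assumption, not something this text states.

```python
def parse_bridge_line(line: str):
    """Parse 'bridge IP:ORPort [fingerprint]' into (host, port, fingerprint).

    The fingerprint is returned as 40 uppercase hex digits with the
    spaces removed, or None when the line omits it.
    """
    parts = line.split()
    if len(parts) < 2 or parts[0].lower() != "bridge":
        raise ValueError("not a bridge line: %r" % line)

    host, sep, port_text = parts[1].rpartition(":")
    if not sep or not host:
        raise ValueError("expected IP:ORPort, got %r" % parts[1])
    port = int(port_text)  # raises ValueError on a non-numeric port
    if not 0 < port < 65536:
        raise ValueError("port out of range: %d" % port)

    # The fingerprint is written in groups of four hex digits.
    fingerprint = "".join(parts[2:]).upper() or None
    if fingerprint is not None:
        is_hex = all(c in "0123456789ABCDEF" for c in fingerprint)
        if len(fingerprint) != 40 or not is_hex:
            raise ValueError("malformed fingerprint")
    return host, port, fingerprint
```

A caller that gets `None` for the fingerprint knows it is in the second case described above, where the bridge is used regardless of the identity key it presents.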
- - In a future version we may also support bridge identifiers that are - only a key fingerprint: - bridge 4C17 FB53 2E20 B2A8 AC19 9441 ECD2 B017 7B39 E4B1 - and the bridge user can fetch the latest descriptor from the bridge - authority (see Section 3.4). - -3.2. Bridges as entry guards - - For now, bridge users add their bridge relays to their list of "entry - guards" (see path-spec.txt for background on entry guards). They are - managed by the entry guard algorithms exactly as if they were a normal - entry guard -- their keys and timing get cached in the "state" file, - etc. This means that when the Tor user starts up with "UseBridges" - disabled, he will skip past the bridge entries since they won't be - listed as up and usable in his networkstatus consensus. But to be clear, - the "entry_guards" list doesn't currently distinguish guards by purpose. - - Internally, each bridge user keeps a smartlist of "bridge_info_t" - that reflects the "bridge" lines from his torrc along with a download - schedule (see Section 3.5 below). When he starts Tor, he attempts - to fetch a descriptor for each configured bridge (see Section 3.4 - below). When he succeeds at getting a descriptor for one of the bridges - in his list, he adds it directly to the entry guard list using the - normal add_an_entry_guard() interface. Once a bridge descriptor has - been added, should_delay_dir_fetches() will stop delaying further - directory fetches, and the user begins to bootstrap his directory - information from that bridge (see Section 3.3). - - Currently bridge users cache their bridge descriptors to the - "cached-descriptors" file (annotated with purpose "bridge"), but - they don't make any attempt to reuse descriptors they find in this - file. The theory is that either the bridge is available now, in which - case you can get a fresh descriptor, or it's not, in which case an - old descriptor won't do you much good. 
- - We could disable writing out the bridge lines to the state file, if - we think this is a problem. - - As an exception, if we get an application request when we have one - or more bridge descriptors but we believe none of them are running, - we mark them all as running again. This is similar to the exception - already in place to help long-idle Tor clients realize they should - fetch fresh directory information rather than just refuse requests. - -3.3. Bridges as directory guards - - In addition to using bridges as the first hop in their circuits, bridge - users also use them to fetch directory updates. Other than initial - bootstrapping to find a working bridge descriptor (see Section 3.4 - below), all further non-anonymized directory fetches will be redirected - to the bridge. - - This means that bridge relays need to have cached answers for all - questions the bridge user might ask. This makes the upgrade path - tricky --- for example, if we migrate to a v4 directory design, the - bridge user would need to keep using v3 so long as his bridge relays - only knew how to answer v3 queries. - - In a future design, for cases where the user has enough information - to build circuits yet the chosen bridge doesn't know how to answer a - given query, we might teach bridge users to make an anonymized request - to a more suitable directory server. - -3.4. How bridge users get their bridge descriptor - - Bridge users can fetch bridge descriptors in two ways: by going directly - to the bridge and asking for "/tor/server/authority", or by going to - the bridge authority and asking for "/tor/server/fp/ID". By default, - they will only try the direct queries. If the user sets - UpdateBridgesFromAuthority 1 - in his config file, then he will try querying the bridge authority - first for bridges where he knows a digest (if he only knows an IP - address and ORPort, then his only option is a direct query). 
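Reviewer note on the fetch rules in the paragraph above: the choice of where to ask first can be sketched as a small Python function. This is an illustration under assumptions, not Tor's implementation. The name `descriptor_sources` and the "bridge"/"authority" labels are hypothetical, and the sketch assumes that "ID" in "/tor/server/fp/ID" is the bridge's identity digest.

```python
def descriptor_sources(update_from_authority: bool, fingerprint=None):
    """Return the places to ask for a bridge descriptor, in order.

    Each entry is (where, path). 'where' is either "authority" (the
    bridge authority) or "bridge" (the bridge itself).
    """
    direct = ("bridge", "/tor/server/authority")
    if update_from_authority and fingerprint is not None:
        # With UpdateBridgesFromAuthority set and a known digest, the
        # bridge authority is tried first, then the bridge itself.
        return [("authority", "/tor/server/fp/" + fingerprint), direct]
    # Default behaviour, and the only option when just an IP address
    # and ORPort are known: query the bridge directly.
    return [direct]
```

Returning an ordered list keeps the policy separate from the code that performs the requests.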
- - If the user has at least one working bridge, then he will do further - queries to the bridge authority through a full three-hop Tor circuit. - But when bootstrapping, he will make a direct begin_dir-style connection - to the bridge authority. - - As of Tor 0.2.0.10-alpha, if the user attempts to fetch a descriptor - from the bridge authority and it returns a 404 not found, the user - will automatically fall back to trying a direct query. Therefore it is - recommended that bridge users always set UpdateBridgesFromAuthority, - since at worst it will delay their fetches a little bit and notify - the bridge authority of the identity fingerprint (but not location) - of their intended bridges. - -3.5. Bridge descriptor retry schedule - - Bridge users try to fetch a descriptor for each bridge (using the - steps in Section 3.4 above) on startup. Whenever they receive a - bridge descriptor, they reschedule a new descriptor download for 1 - hour from then. - - If on the other hand it fails, they try again after 15 minutes for the - first attempt, after 15 minutes for the second attempt, and after 60 - minutes for subsequent attempts. - - In 0.2.2.x we should come up with some smarter retry schedules. - -3.6. Implementation note. - - Vidalia 0.1.0 has a new checkbox in its Network config window called - "My ISP blocks connections to the Tor network." Users who click that - box change their configuration to: - UseBridges 1 - UpdateBridgesFromAuthority 1 - and should add at least one bridge identifier. - diff --git a/doc/spec/control-spec-v0.txt b/doc/spec/control-spec-v0.txt deleted file mode 100644 index 3515d395a6..0000000000 --- a/doc/spec/control-spec-v0.txt +++ /dev/null @@ -1,498 +0,0 @@ - - TC: A Tor control protocol (Version 0) - --1. Deprecation - -THIS PROTOCOL IS DEPRECATED. It is still documented here because Tor -0.1.1.x happens to support much of it; but the support for v0 is not -maintained, so you should expect it to rot in unpredictable ways. 
Support -for v0 will be removed some time after Tor 0.1.2. - -0. Scope - -This document describes an implementation-specific protocol that is used -for other programs (such as frontend user-interfaces) to communicate -with a locally running Tor process. It is not part of the Tor onion -routing protocol. - -We're trying to be pretty extensible here, but not infinitely -forward-compatible. - -1. Protocol outline - -TC is a bidirectional message-based protocol. It assumes an underlying -stream for communication between a controlling process (the "client") and -a Tor process (the "server"). The stream may be implemented via TCP, -TLS-over-TCP, a Unix-domain socket, or so on, but it must provide -reliable in-order delivery. For security, the stream should not be -accessible by untrusted parties. - -In TC, the client and server send typed variable-length messages to each -other over the underlying stream. By default, all messages from the server -are in response to messages from the client. Some client requests, however, -will cause the server to send messages to the client indefinitely far into -the future. - -Servers respond to messages in the order they're received. - -2. Message format - -The messages take the following format: - - Length [2 octets; big-endian] - Type [2 octets; big-endian] - Body [Length octets] - -Upon encountering a recognized Type, implementations behave as described in -section 3 below. If the type is not recognized, servers respond with an -"ERROR" message (code UNRECOGNIZED; see 3.1 below), and clients simply ignore -the message. - -2.1. Types and encodings - - All numbers are given in big-endian (network) order. - - OR identities are given in hexadecimal, in the same format as identity key - fingerprints, but without spaces; see tor-spec.txt for more information. - -3. Message types - - Message types are drawn from the following ranges: - - 0x0000-0xEFFF : Reserved for use by official versions of this spec. 
- 0xF000-0xFFFF : Unallocated; usable by unofficial extensions. - -3.1. ERROR (Type 0x0000) - - Sent in response to a message that could not be processed as requested. - - The body of the message begins with a 2-byte error code. The following - values are defined: - - 0x0000 Unspecified error - [] - - 0x0001 Internal error - [Something went wrong inside Tor, so that the client's - request couldn't be fulfilled.] - - 0x0002 Unrecognized message type - [The client sent a message type we don't understand.] - - 0x0003 Syntax error - [The client sent a message body in a format we can't parse.] - - 0x0004 Unrecognized configuration key - [The client tried to get or set a configuration option we don't - recognize.] - - 0x0005 Invalid configuration value - [The client tried to set a configuration option to an - incorrect, ill-formed, or impossible value.] - - 0x0006 Unrecognized byte code - [The client tried to set a byte code (in the body) that - we don't recognize.] - - 0x0007 Unauthorized. - [The client tried to send a command that requires - authorization, but it hasn't sent a valid AUTHENTICATE - message.] - - 0x0008 Failed authentication attempt - [The client sent a well-formed authorization message.] - - 0x0009 Resource exhausted - [The server didn't have enough of a given resource to - fulfill a given request.] - - 0x000A No such stream - - 0x000B No such circuit - - 0x000C No such OR - - The rest of the body should be a human-readable description of the error. - - In general, new error codes should only be added when they don't fall under - one of the existing error codes. - -3.2. DONE (Type 0x0001) - - Sent from server to client in response to a request that was successfully - completed, with no more information needed. The body is usually empty but - may contain a message. - -3.3. SETCONF (Type 0x0002) - - Change the value of a configuration variable. The body contains a list of - newline-terminated key-value configuration lines. 
An individual key-value - configuration line consists of the key, followed by a space, followed by - the value. The server behaves as though it had just read the key-value pair - in its configuration file. - - The server responds with a DONE message on success, or an ERROR message on - failure. - - When a configuration options takes multiple values, or when multiple - configuration keys form a context-sensitive group (see below), then - setting _any_ of the options in a SETCONF command is taken to reset all of - the others. For example, if two ORBindAddress values are configured, - and a SETCONF command arrives containing a single ORBindAddress value, the - new command's value replaces the two old values. - - To _remove_ all settings for a given option entirely (and go back to its - default value), send a single line containing the key and no value. - -3.4. GETCONF (Type 0x0003) - - Request the value of a configuration variable. The body contains one or - more NL-terminated strings for configuration keys. The server replies - with a CONFVALUE message. - - If an option appears multiple times in the configuration, all of its - key-value pairs are returned in order. - - Some options are context-sensitive, and depend on other options with - different keywords. These cannot be fetched directly. Currently there - is only one such option: clients should use the "HiddenServiceOptions" - virtual keyword to get all HiddenServiceDir, HiddenServicePort, - HiddenServiceNodes, and HiddenServiceExcludeNodes option settings. - -3.5. CONFVALUE (Type 0x0004) - - Sent in response to a GETCONF message; contains a list of "Key Value\n" - (A non-whitespace keyword, a single space, a non-NL value, a NL) - strings. - -3.6. SETEVENTS (Type 0x0005) - - Request the server to inform the client about interesting events. - The body contains a list of 2-byte event codes (see "event" below). 
- Any events *not* listed in the SETEVENTS body are turned off; thus, sending - SETEVENTS with an empty body turns off all event reporting. - - The server responds with a DONE message on success, and an ERROR message - if one of the event codes isn't recognized. (On error, the list of active - event codes isn't changed.) - -3.7. EVENT (Type 0x0006) - - Sent from the server to the client when an event has occurred and the - client has requested that kind of event. The body contains a 2-byte - event code followed by additional event-dependent information. Event - codes are: - 0x0001 -- Circuit status changed - - Status [1 octet] - 0x00 Launched - circuit ID assigned to new circuit - 0x01 Built - all hops finished, can now accept streams - 0x02 Extended - one more hop has been completed - 0x03 Failed - circuit closed (was not built) - 0x04 Closed - circuit closed (was built) - Circuit ID [4 octets] - (Must be unique to Tor process/time) - Path [NUL-terminated comma-separated string] - (For extended/failed, is the portion of the path that is - built) - - 0x0002 -- Stream status changed - - Status [1 octet] - (Sent connect=0,sent resolve=1,succeeded=2,failed=3, - closed=4, new connection=5, new resolve request=6, - stream detached from circuit and still retriable=7) - Stream ID [4 octets] - (Must be unique to Tor process/time) - Target (NUL-terminated address-port string] - - 0x0003 -- OR Connection status changed - - Status [1 octet] - (Launched=0,connected=1,failed=2,closed=3) - OR nickname/identity [NUL-terminated] - - 0x0004 -- Bandwidth used in the last second - - Bytes read [4 octets] - Bytes written [4 octets] - - 0x0005 -- Notice/warning/error occurred - - Message [NUL-terminated] - - <obsolete: use 0x0007-0x000B instead.> - - 0x0006 -- New descriptors available - - OR List [NUL-terminated, comma-delimited list of - OR identity] - - 0x0007 -- Debug message occurred - 0x0008 -- Info message occurred - 0x0009 -- Notice message occurred - 0x000A -- Warning message 
occurred - 0x000B -- Error message occurred - - Message [NUL-terminated] - -3.8. AUTHENTICATE (Type 0x0007) - - Sent from the client to the server. Contains a 'magic cookie' to prove - that client is really allowed to control this Tor process. The server - responds with DONE or ERROR. - - The format of the 'cookie' is implementation-dependent; see 4.1 below for - information on how the standard Tor implementation handles it. - -3.9. SAVECONF (Type 0x0008) - - Sent from the client to the server. Instructs the server to write out - its config options into its torrc. Server returns DONE if successful, or - ERROR if it can't write the file or some other error occurs. - -3.10. SIGNAL (Type 0x0009) - - Sent from the client to the server. The body contains one byte that - indicates the action the client wishes the server to take. - - 1 (0x01) -- Reload: reload config items, refetch directory. - 2 (0x02) -- Controlled shutdown: if server is an OP, exit immediately. - If it's an OR, close listeners and exit after 30 seconds. - 10 (0x0A) -- Dump stats: log information about open connections and - circuits. - 12 (0x0C) -- Debug: switch all open logs to loglevel debug. - 15 (0x0F) -- Immediate shutdown: clean up and exit now. - - The server responds with DONE if the signal is recognized (or simply - closes the socket if it was asked to close immediately), else ERROR. - -3.11. MAPADDRESS (Type 0x000A) - - Sent from the client to the server. The body contains a sequence of - address mappings, each consisting of the address to be mapped, a single - space, the replacement address, and a NL character. - - Addresses may be IPv4 addresses, IPv6 addresses, or hostnames. - - The client sends this message to the server in order to tell it that future - SOCKS requests for connections to the original address should be replaced - with connections to the specified replacement address. 
If the addresses - are well-formed, and the server is able to fulfill the request, the server - replies with a single DONE message containing the source and destination - addresses. If request is malformed, the server replies with a syntax error - message. The server can't fulfill the request, it replies with an internal - ERROR message. - - The client may decline to provide a body for the original address, and - instead send a special null address ("0.0.0.0" for IPv4, "::0" for IPv6, or - "." for hostname), signifying that the server should choose the original - address itself, and return that address in the DONE message. The server - should ensure that it returns an element of address space that is unlikely - to be in actual use. If there is already an address mapped to the - destination address, the server may reuse that mapping. - - If the original address is already mapped to a different address, the old - mapping is removed. If the original address and the destination address - are the same, the server removes any mapping in place for the original - address. - - {Note: This feature is designed to be used to help Tor-ify applications - that need to use SOCKS4 or hostname-less SOCKS5. There are three - approaches to doing this: - 1. Somehow make them use SOCKS4a or SOCKS5-with-hostnames instead. - 2. Use tor-resolve (or another interface to Tor's resolve-over-SOCKS - feature) to resolve the hostname remotely. This doesn't work - with special addresses like x.onion or x.y.exit. - 3. Use MAPADDRESS to map an IP address to the desired hostname, and then - arrange to fool the application into thinking that the hostname - has resolved to that IP. - This functionality is designed to help implement the 3rd approach.} - - [XXXX When, if ever, can mappings expire? Should they expire?] - [XXXX What addresses, if any, are safe to use?] - -3.12 GETINFO (Type 0x000B) - - Sent from the client to the server. 
The message body is as for GETCONF: - one or more NL-terminated strings. The server replies with an INFOVALUE - message. - - Unlike GETCONF, this message is used for data that are not stored in the - Tor configuration file, but instead. - - Recognized key and their values include: - - "version" -- The version of the server's software, including the name - of the software. (example: "Tor 0.0.9.4") - - "desc/id/<OR identity>" or "desc/name/<OR nickname>" -- the latest server - descriptor for a given OR, NUL-terminated. If no such OR is known, the - corresponding value is an empty string. - - "network-status" -- a space-separated list of all known OR identities. - This is in the same format as the router-status line in directories; - see tor-spec.txt for details. - - "addr-mappings/all" - "addr-mappings/config" - "addr-mappings/cache" - "addr-mappings/control" -- a NL-terminated list of address mappings, each - in the form of "from-address" SP "to-address". The 'config' key - returns those address mappings set in the configuration; the 'cache' - key returns the mappings in the client-side DNS cache; the 'control' - key returns the mappings set via the control interface; the 'all' - target returns the mappings set through any mechanism. - -3.13 INFOVALUE (Type 0x000C) - - Sent from the server to the client in response to a GETINFO message. - Contains one or more items of the format: - - Key [(NUL-terminated string)] - Value [(NUL-terminated string)] - - The keys match those given in the GETINFO message. - -3.14 EXTENDCIRCUIT (Type 0x000D) - - Sent from the client to the server. 
The message body contains two fields: - Circuit ID [4 octets] - Path [NUL-terminated, comma-delimited string of OR nickname/identity] - - This request takes one of two forms: either the Circuit ID is zero, in - which case it is a request for the server to build a new circuit according - to the specified path, or the Circuit ID is nonzero, in which case it is a - request for the server to extend an existing circuit with that ID according - to the specified path. - - If the request is successful, the server sends a DONE message containing - a message body consisting of the four-octet Circuit ID of the newly created - circuit. - -3.15 ATTACHSTREAM (Type 0x000E) - - Sent from the client to the server. The message body contains two fields: - Stream ID [4 octets] - Circuit ID [4 octets] - - This message informs the server that the specified stream should be - associated with the specified circuit. Each stream may be associated with - at most one circuit, and multiple streams may share the same circuit. - Streams can only be attached to completed circuits (that is, circuits that - have sent a circuit status 'built' event). - - If the circuit ID is 0, responsibility for attaching the given stream is - returned to Tor. - - {Implementation note: By default, Tor automatically attaches streams to - circuits itself, unless the configuration variable - "__LeaveStreamsUnattached" is set to "1". Attempting to attach streams - via TC when "__LeaveStreamsUnattached" is false may cause a race between - Tor and the controller, as both attempt to attach streams to circuits.} - -3.16 POSTDESCRIPTOR (Type 0x000F) - - Sent from the client to the server. The message body contains one field: - Descriptor [NUL-terminated string] - - This message informs the server about a new descriptor. - - The descriptor, when parsed, must contain a number of well-specified - fields, including fields for its nickname and identity. 
- - If there is an error in parsing the descriptor, the server must send an - appropriate error message. If the descriptor is well-formed but the server - chooses not to add it, it must reply with a DONE message whose body - explains why the server was not added. - -3.17 FRAGMENTHEADER (Type 0x0010) - - Sent in either direction. Used to encapsulate messages longer than 65535 - bytes in length. - - Underlying type [2 bytes] - Total Length [4 bytes] - Data [Rest of message] - - A FRAGMENTHEADER message MUST be followed immediately by a number of - FRAGMENT messages, such that lengths of the "Data" fields of the - FRAGMENTHEADER and FRAGMENT messages add to the "Total Length" field of the - FRAGMENTHEADER message. - - Implementations MUST NOT fragment messages of length less than 65536 bytes. - Implementations MUST be able to process fragmented messages that not - optimally packed. - -3.18 FRAGMENT (Type 0x0011) - - Data [Entire message] - - See FRAGMENTHEADER for more information - -3.19 REDIRECTSTREAM (Type 0x0012) - - Sent from the client to the server. The message body contains two fields: - Stream ID [4 octets] - Address [variable-length, NUL-terminated.] - - Tells the server to change the exit address on the specified stream. No - remapping is performed on the new provided address. - - To be sure that the modified address will be used, this event must be sent - after a new stream event is received, and before attaching this stream to - a circuit. - -3.20 CLOSESTREAM (Type 0x0013) - - Sent from the client to the server. The message body contains three - fields: - Stream ID [4 octets] - Reason [1 octet] - Flags [1 octet] - - Tells the server to close the specified stream. The reason should be - one of the Tor RELAY_END reasons given in tor-spec.txt. Flags is not - used currently. Tor may hold the stream open for a while to flush - any data that is pending. - -3.21 CLOSECIRCUIT (Type 0x0014) - - Sent from the client to the server. 
The message body contains two - fields: - Circuit ID [4 octets] - Flags [1 octet] - - Tells the server to close the specified circuit. If the LSB of the flags - field is nonzero, do not close the circuit unless it is unused. - -4. Implementation notes - -4.1. Authentication - - By default, the current Tor implementation trusts all local users. - - If the 'CookieAuthentication' option is true, Tor writes a "magic cookie" - file named "control_auth_cookie" into its data directory. To authenticate, - the controller must send the contents of this file. - - If the 'HashedControlPassword' option is set, it must contain the salted - hash of a secret password. The salted hash is computed according to the - S2K algorithm in RFC 2440 (OpenPGP), and prefixed with the s2k specifier. - This is then encoded in hexadecimal, prefixed by the indicator sequence - "16:". Thus, for example, the password 'foo' could encode to: - 16:660537E3E1CD49996044A3BF558097A981F539FEA2F9DA662B4626C1C2 - ++++++++++++++++**^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - salt hashed value - indicator - You can generate the salt of a password by calling - 'tor --hash-password <password>' - or by using the example code in the Python and Java controller libraries. - To authenticate under this scheme, the controller sends Tor the original - secret that was used to generate the password. - -4.2. Don't let the buffer get too big. - - If you ask for lots of events, and 16MB of them queue up on the buffer, - the Tor process will close the socket. - diff --git a/doc/spec/control-spec.txt b/doc/spec/control-spec.txt deleted file mode 100644 index f86f94ba6f..0000000000 --- a/doc/spec/control-spec.txt +++ /dev/null @@ -1,2001 +0,0 @@ - - TC: A Tor control protocol (Version 1) - -0. Scope - - This document describes an implementation-specific protocol that is used - for other programs (such as frontend user-interfaces) to communicate with a - locally running Tor process. 
It is not part of the Tor onion routing - protocol. - - This protocol replaces version 0 of TC, which is now deprecated. For - reference, TC is described in "control-spec-v0.txt". Implementors are - recommended to avoid using TC directly, but instead to use a library that - can easily be updated to use the newer protocol. (Version 0 is used by Tor - versions 0.1.0.x; the protocol in this document only works with Tor - versions in the 0.1.1.x series and later.) - - The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL - NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and - "OPTIONAL" in this document are to be interpreted as described in - RFC 2119. - -1. Protocol outline - - TC is a bidirectional message-based protocol. It assumes an underlying - stream for communication between a controlling process (the "client" - or "controller") and a Tor process (or "server"). The stream may be - implemented via TCP, TLS-over-TCP, a Unix-domain socket, or so on, - but it must provide reliable in-order delivery. For security, the - stream should not be accessible by untrusted parties. - - In TC, the client and server send typed messages to each other over the - underlying stream. The client sends "commands" and the server sends - "replies". - - By default, all messages from the server are in response to messages from - the client. Some client requests, however, will cause the server to send - messages to the client indefinitely far into the future. Such - "asynchronous" replies are marked as such. - - Servers respond to messages in the order messages are received. - -2. Message format - -2.1. Description format - - The message formats listed below use ABNF as described in RFC 2234. - The protocol itself is loosely based on SMTP (see RFC 2821). - - We use the following nonterminals from RFC 2822: atom, qcontent - - We define the following general-use nonterminals: - - String = DQUOTE *qcontent DQUOTE - - There are explicitly no limits on line length. 
All 8-bit characters are - permitted unless explicitly disallowed. - - Wherever CRLF is specified to be accepted from the controller, Tor MAY also - accept LF. Tor, however, MUST NOT generate LF instead of CRLF. - Controllers SHOULD always send CRLF. - -2.2. Commands from controller to Tor - - Command = Keyword Arguments CRLF / "+" Keyword Arguments CRLF Data - Keyword = 1*ALPHA - Arguments = *(SP / VCHAR) - - Specific commands and their arguments are described below in section 3. - -2.3. Replies from Tor to the controller - - Reply = SyncReply / AsyncReply - SyncReply = *(MidReplyLine / DataReplyLine) EndReplyLine - AsyncReply = *(MidReplyLine / DataReplyLine) EndReplyLine - - MidReplyLine = StatusCode "-" ReplyLine - DataReplyLine = StatusCode "+" ReplyLine Data - EndReplyLine = StatusCode SP ReplyLine - ReplyLine = [ReplyText] CRLF - ReplyText = XXXX - StatusCode = 3DIGIT - - Specific replies are mentioned below in section 3, and described more fully - in section 4. - - [Compatibility note: versions of Tor before 0.2.0.3-alpha sometimes - generate AsyncReplies of the form "*(MidReplyLine / DataReplyLine)". - This is incorrect, but controllers that need to work with these - versions of Tor should be prepared to get multi-line AsyncReplies with - the final line (usually "650 OK") omitted.] - -2.4. General-use tokens - - ; CRLF means, "the ASCII Carriage Return character (decimal value 13) - ; followed by the ASCII Linefeed character (decimal value 10)." - CRLF = CR LF - - ; How a controller tells Tor about a particular OR. There are four - ; possible formats: - ; $Fingerprint -- The router whose identity key hashes to the fingerprint. - ; This is the preferred way to refer to an OR. - ; $Fingerprint~Nickname -- The router whose identity key hashes to the - ; given fingerprint, but only if the router has the given nickname. 
- ; $Fingerprint=Nickname -- The router whose identity key hashes to the - ; given fingerprint, but only if the router is Named and has the given - ; nickname. - ; Nickname -- The Named router with the given nickname, or, if no such - ; router exists, any router whose nickname matches the one given. - ; This is not a safe way to refer to routers, since Named status - ; could under some circumstances change over time. - ; - ; The tokens that implement the above follow: - - ServerSpec = LongName / Nickname - LongName = Fingerprint [ ( "=" / "~" ) Nickname ] - - Fingerprint = "$" 40*HEXDIG - NicknameChar = "a"-"z" / "A"-"Z" / "0" - "9" - Nickname = 1*19 NicknameChar - - ; What follows is an outdated way to refer to ORs. - ; Feature VERBOSE_NAMES replaces ServerID with LongName in events and - ; GETINFO results. VERBOSE_NAMES can be enabled starting in Tor version - ; 0.1.2.2-alpha and it is always-on in 0.2.2.1-alpha and later. - ServerID = Nickname / Fingerprint - - - ; Unique identifiers for streams or circuits. Currently, Tor only - ; uses digits, but this may change - StreamID = 1*16 IDChar - CircuitID = 1*16 IDChar - IDChar = ALPHA / DIGIT - - Address = ip4-address / ip6-address / hostname (XXXX Define these) - - ; A "Data" section is a sequence of octets concluded by the terminating - ; sequence CRLF "." CRLF. The terminating sequence may not appear in the - ; body of the data. Leading periods on lines in the data are escaped with - ; an additional leading period as in RFC 2821 section 4.5.2. - Data = *DataLine "." CRLF - DataLine = CRLF / "." 1*LineItem CRLF / NonDotItem *LineItem CRLF - LineItem = NonCR / 1*CR NonCRLF - NonDotItem = NonDotCR / 1*CR NonCRLF - -3. Commands - - All commands are case-insensitive, but most keywords are case-sensitive. - -3.1. SETCONF - - Change the value of one or more configuration variables. 
The syntax is: - - "SETCONF" 1*(SP keyword ["=" value]) CRLF - value = String / QuotedString - - Tor behaves as though it had just read each of the key-value pairs - from its configuration file. Keywords with no corresponding values have - their configuration values reset to 0 or NULL (use RESETCONF if you want - to set it back to its default). SETCONF is all-or-nothing: if there - is an error in any of the configuration settings, Tor sets none of them. - - Tor responds with a "250 configuration values set" reply on success. - If some of the listed keywords can't be found, Tor replies with a - "552 Unrecognized option" message. Otherwise, Tor responds with a - "513 syntax error in configuration values" reply on syntax error, or a - "553 impossible configuration setting" reply on a semantic error. - - When a configuration option takes multiple values, or when multiple - configuration keys form a context-sensitive group (see GETCONF below), then - setting _any_ of the options in a SETCONF command is taken to reset all of - the others. For example, if two ORBindAddress values are configured, and a - SETCONF command arrives containing a single ORBindAddress value, the new - command's value replaces the two old values. - - Sometimes it is not possible to change configuration options solely by - issuing a series of SETCONF commands, because the value of one of the - configuration options depends on the value of another which has not yet - been set. Such situations can be overcome by setting multiple configuration - options with a single SETCONF command (e.g. SETCONF ORPort=443 - ORListenAddress=9001). - -3.2. RESETCONF - - Remove all settings for a given configuration option entirely, assign - its default value (if any), and then assign the String provided. - Typically the String is left empty, to simply set an option back to - its default. The syntax is: - - "RESETCONF" 1*(SP keyword ["=" String]) CRLF - - Otherwise it behaves like SETCONF above. - -3.3. 
GETCONF - - Request the value of a configuration variable. The syntax is: - - "GETCONF" 1*(SP keyword) CRLF - - If all of the listed keywords exist in the Tor configuration, Tor replies - with a series of reply lines of the form: - 250 keyword=value - If any option is set to a 'default' value semantically different from an - empty string, Tor may reply with a reply line of the form: - 250 keyword - - Value may be a raw value or a quoted string. Tor will try to use - unquoted values except when the value could be misinterpreted through - not being quoted. - - If some of the listed keywords can't be found, Tor replies with a - "552 unknown configuration keyword" message. - - If an option appears multiple times in the configuration, all of its - key-value pairs are returned in order. - - Some options are context-sensitive, and depend on other options with - different keywords. These cannot be fetched directly. Currently there - is only one such option: clients should use the "HiddenServiceOptions" - virtual keyword to get all HiddenServiceDir, HiddenServicePort, - HiddenServiceNodes, and HiddenServiceExcludeNodes option settings. - -3.4. SETEVENTS - - Request the server to inform the client about interesting events. The - syntax is: - - "SETEVENTS" [SP "EXTENDED"] *(SP EventCode) CRLF - - EventCode = "CIRC" / "STREAM" / "ORCONN" / "BW" / "DEBUG" / - "INFO" / "NOTICE" / "WARN" / "ERR" / "NEWDESC" / "ADDRMAP" / - "AUTHDIR_NEWDESCS" / "DESCCHANGED" / "STATUS_GENERAL" / - "STATUS_CLIENT" / "STATUS_SERVER" / "GUARD" / "NS" / "STREAM_BW" / - "CLIENTS_SEEN" / "NEWCONSENSUS" / "BUILDTIMEOUT_SET" / "SIGNAL" - - Any events *not* listed in the SETEVENTS line are turned off; thus, sending - SETEVENTS with an empty body turns off all event reporting. - - The server responds with a "250 OK" reply on success, and a "552 - Unrecognized event" reply if one of the event codes isn't recognized. (On - error, the list of active event codes isn't changed.) 
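The SETEVENTS syntax above can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper name `build_setevents` and the `KNOWN_EVENTS` set are hypothetical and not part of Tor or of any controller library; the event names are copied from the EventCode list above.

```python
# Event codes copied from the EventCode production in the text above.
KNOWN_EVENTS = {
    "CIRC", "STREAM", "ORCONN", "BW", "DEBUG", "INFO", "NOTICE", "WARN",
    "ERR", "NEWDESC", "ADDRMAP", "AUTHDIR_NEWDESCS", "DESCCHANGED",
    "STATUS_GENERAL", "STATUS_CLIENT", "STATUS_SERVER", "GUARD", "NS",
    "STREAM_BW", "CLIENTS_SEEN", "NEWCONSENSUS", "BUILDTIMEOUT_SET",
    "SIGNAL",
}


def build_setevents(events, extended=False):
    """Return a SETEVENTS command line (hypothetical helper).

    Rejects unknown event codes locally, since the server would answer
    with a 552 reply and leave the active event list unchanged.
    """
    for event in events:
        if event not in KNOWN_EVENTS:
            raise ValueError("unrecognized event code: %r" % (event,))
    parts = ["SETEVENTS"]
    if extended:
        parts.append("EXTENDED")
    parts.extend(events)
    # Controllers SHOULD always terminate commands with CRLF.
    return " ".join(parts) + "\r\n"
```

Calling `build_setevents([])` yields a bare SETEVENTS line, which per the text above turns off all event reporting.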
- - If the flag string "EXTENDED" is provided, Tor may provide extra - information with events for this connection; see 4.1 for more information. - NOTE: All events on a given connection will be provided in extended format, - or none. - NOTE: "EXTENDED" is only supported in Tor 0.1.1.9-alpha or later. - - Each event is described in more detail in Section 4.1. - -3.5. AUTHENTICATE - - Sent from the client to the server. The syntax is: - "AUTHENTICATE" [ SP 1*HEXDIG / QuotedString ] CRLF - - The server responds with "250 OK" on success or "515 Bad authentication" if - the authentication cookie is incorrect. Tor closes the connection on an - authentication failure. - - The format of the 'cookie' is implementation-dependent; see 5.1 below for - information on how the standard Tor implementation handles it. - - Before the client has authenticated, no command other than PROTOCOLINFO, - AUTHENTICATE, or QUIT is valid. If the controller sends any other command, - or sends a malformed command, or sends an unsuccessful AUTHENTICATE - command, or sends PROTOCOLINFO more than once, Tor sends an error reply and - closes the connection. - - To prevent some cross-protocol attacks, the AUTHENTICATE command is still - required even if all authentication methods in Tor are disabled. In this - case, the controller should just send "AUTHENTICATE" CRLF. - - (Versions of Tor before 0.1.2.16 and 0.2.0.4-alpha did not close the - connection after an authentication failure.) - -3.6. SAVECONF - - Sent from the client to the server. The syntax is: - "SAVECONF" CRLF - - Instructs the server to write out its config options into its torrc. Server - returns "250 OK" if successful, or "551 Unable to write configuration - to disk" if it can't write the file or some other error occurs. - - See also the "getinfo config-text" command, if the controller wants - to write the torrc file itself. - -3.7. SIGNAL - - Sent from the client to the server. 
The syntax is: - - "SIGNAL" SP Signal CRLF - - Signal = "RELOAD" / "SHUTDOWN" / "DUMP" / "DEBUG" / "HALT" / - "HUP" / "INT" / "USR1" / "USR2" / "TERM" / "NEWNYM" / - "CLEARDNSCACHE" - - The meanings of the signals are: - - RELOAD -- Reload: reload config items, refetch directory. (like HUP) - SHUTDOWN -- Controlled shutdown: if server is an OP, exit immediately. - If it's an OR, close listeners and exit after 30 seconds. - (like INT) - DUMP -- Dump stats: log information about open connections and - circuits. (like USR1) - DEBUG -- Debug: switch all open logs to loglevel debug. (like USR2) - HALT -- Immediate shutdown: clean up and exit now. (like TERM) - CLEARDNSCACHE -- Forget the client-side cached IPs for all hostnames. - NEWNYM -- Switch to clean circuits, so new application requests - don't share any circuits with old ones. Also clears - the client-side DNS cache. (Tor MAY rate-limit its - response to this signal.) - - The server responds with "250 OK" if the signal is recognized (or simply - closes the socket if it was asked to close immediately), or "552 - Unrecognized signal" if the signal is unrecognized. - -3.8. MAPADDRESS - - Sent from the client to the server. The syntax is: - - "MAPADDRESS" 1*(Address "=" Address SP) CRLF - - The first address in each pair is an "original" address; the second is a - "replacement" address. The client sends this message to the server in - order to tell it that future SOCKS requests for connections to the original - address should be replaced with connections to the specified replacement - address. If the addresses are well-formed, and the server is able to - fulfill the request, the server replies with a 250 message: - 250-OldAddress1=NewAddress1 - 250 OldAddress2=NewAddress2 - - containing the source and destination addresses. If the request is - malformed, the server replies with "512 syntax error in command - argument". If the server can't fulfill the request, it replies with - "451 resource exhausted".
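The MAPADDRESS exchange described above can be sketched in Python as follows. This is a rough illustration, not Tor's or any library's API: the function names `build_mapaddress` and `parse_mapaddress_reply` are hypothetical, and the parser assumes every reply line carries a three-digit status code followed by "-" or a space, as in the 250 reply shown above.

```python
def build_mapaddress(pairs):
    """Build a MAPADDRESS command from (original, replacement) pairs."""
    if not pairs:
        raise ValueError("MAPADDRESS needs at least one address pair")
    body = " ".join("%s=%s" % (orig, repl) for orig, repl in pairs)
    return "MAPADDRESS " + body + "\r\n"


def parse_mapaddress_reply(lines):
    """Turn 250 reply lines into a {original: replacement} dict.

    Raises ValueError on any other status code, such as the 512 and 451
    replies described above.
    """
    mappings = {}
    for raw in lines:
        line = raw.rstrip("\r\n")
        code, sep, text = line[:3], line[3:4], line[4:]
        if code != "250" or sep not in ("-", " "):
            raise ValueError("MAPADDRESS failed: %s" % line)
        original, _, replacement = text.partition("=")
        mappings[original] = replacement
    return mappings
```

Because a null original address asks the server to choose one, a controller would read the chosen address from the keys of the returned dict rather than from the command it sent.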
- - The client may decline to provide a body for the original address, and - instead send a special null address ("0.0.0.0" for IPv4, "::0" for IPv6, or - "." for hostname), signifying that the server should choose the original - address itself, and return that address in the reply. The server - should ensure that it returns an element of address space that is unlikely - to be in actual use. If there is already an address mapped to the - destination address, the server may reuse that mapping. - - If the original address is already mapped to a different address, the old - mapping is removed. If the original address and the destination address - are the same, the server removes any mapping in place for the original - address. - - Example: - C: MAPADDRESS 0.0.0.0=torproject.org 1.2.3.4=tor.freehaven.net - S: 250-127.192.10.10=torproject.org - S: 250 1.2.3.4=tor.freehaven.net - - {Note: This feature is designed to be used to help Tor-ify applications - that need to use SOCKS4 or hostname-less SOCKS5. There are three - approaches to doing this: - 1. Somehow make them use SOCKS4a or SOCKS5-with-hostnames instead. - 2. Use tor-resolve (or another interface to Tor's resolve-over-SOCKS - feature) to resolve the hostname remotely. This doesn't work - with special addresses like x.onion or x.y.exit. - 3. Use MAPADDRESS to map an IP address to the desired hostname, and then - arrange to fool the application into thinking that the hostname - has resolved to that IP. - This functionality is designed to help implement the 3rd approach.} - - Mappings set by the controller last until the Tor process exits: - they never expire. If the controller wants the mapping to last only - a certain time, then it must explicitly un-map the address when that - time has elapsed. - -3.9. GETINFO - - Sent from the client to the server. The syntax is as for GETCONF: - "GETINFO" 1*(SP keyword) CRLF - one or more NL-terminated strings. 
The server replies with an INFOVALUE - message, or a 551 or 552 error. - - Unlike GETCONF, this message is used for data that are not stored in the Tor - configuration file, and that may be longer than a single line. On success, - one ReplyLine is sent for each requested value, followed by a final 250 OK - ReplyLine. If a value fits on a single line, the format is: - 250-keyword=value - If a value must be split over multiple lines, the format is: - 250+keyword= - value - . - Recognized keys and their values include: - - "version" -- The version of the server's software, including the name - of the software. (example: "Tor 0.0.9.4") - - "config-file" -- The location of Tor's configuration file ("torrc"). - - "config-text" -- The contents that Tor would write if you send it - a SAVECONF command, so the controller can write the file to - disk itself. [First implemented in 0.2.2.7-alpha.] - - ["exit-policy/prepend" -- The default exit policy lines that Tor will - *prepend* to the ExitPolicy config option. - -- Never implemented. Useful?] - - "exit-policy/default" -- The default exit policy lines that Tor will - *append* to the ExitPolicy config option. - - "desc/id/<OR identity>" or "desc/name/<OR nickname>" -- the latest - server descriptor for a given OR, NUL-terminated. - - "desc-annotations/id/<OR identity>" -- outputs the annotations string - (source, timestamp of arrival, purpose, etc) for the corresponding - descriptor. [First implemented in 0.2.0.13-alpha.] - - "extra-info/digest/<digest>" -- the extrainfo document whose digest (in - hex) is <digest>. Only available if we're downloading extra-info - documents. - - "ns/id/<OR identity>" or "ns/name/<OR nickname>" -- the latest router - status info (v2 directory style) for a given OR. Router status - info is as given in - dir-spec.txt, and reflects the current beliefs of this Tor about the - router in question. Like directory clients, controllers MUST - tolerate unrecognized flags and lines. 
The published date and - descriptor digest are those believed to be best by this Tor, - not necessarily those for a descriptor that Tor currently has. - [First implemented in 0.1.2.3-alpha.] - - "ns/all" -- Router status info (v2 directory style) for all ORs we - have an opinion about, joined by newlines. [First implemented - in 0.1.2.3-alpha.] - - "ns/purpose/<purpose>" -- Router status info (v2 directory style) - for all ORs of this purpose. Mostly designed for /ns/purpose/bridge - queries. [First implemented in 0.2.0.13-alpha.] - - "desc/all-recent" -- the latest server descriptor for every router that - Tor knows about. - - "network-status" -- a space-separated list (v1 directory style) - of all known OR identities. This is in the same format as the - router-status line in v1 directories; see dir-spec-v1.txt section - 3 for details. (If VERBOSE_NAMES is enabled, the output will - not conform to dir-spec-v1.txt; instead, the result will be a - space-separated list of LongName, each preceded by a "!" if it is - believed to be not running.) This option is deprecated; use - "ns/all" instead. - - "address-mappings/all" - "address-mappings/config" - "address-mappings/cache" - "address-mappings/control" -- a \r\n-separated list of address - mappings, each in the form of "from-address to-address expiry". - The 'config' key returns those address mappings set in the - configuration; the 'cache' key returns the mappings in the - client-side DNS cache; the 'control' key returns the mappings set - via the control interface; the 'all' target returns the mappings - set through any mechanism. - Expiry is formatted as with ADDRMAP events, except that "expiry" is - always a time in GMT or the string "NEVER"; see section 4.1.7. - First introduced in 0.2.0.3-alpha. - - "addr-mappings/*" -- as for address-mappings/*, but without the - expiry portion of the value. Use of this value is deprecated - since 0.2.0.3-alpha; use address-mappings instead. 
- - "address" -- the best guess at our external IP address. If we - have no guess, return a 551 error. (Added in 0.1.2.2-alpha) - - "fingerprint" -- the contents of the fingerprint file that Tor - writes as a server, or a 551 if we're not a server currently. - (Added in 0.1.2.3-alpha) - - "circuit-status" - A series of lines as for a circuit status event. Each line is of - the form: - CircuitID SP CircStatus [SP Path] CRLF - - "stream-status" - A series of lines as for a stream status event. Each is of the form: - StreamID SP StreamStatus SP CircID SP Target CRLF - - "orconn-status" - A series of lines as for an OR connection status event. In Tor - 0.1.2.2-alpha with feature VERBOSE_NAMES enabled and in Tor - 0.2.2.1-alpha and later by default, each line is of the form: - LongName SP ORStatus CRLF - - In Tor versions 0.1.2.2-alpha through 0.2.2.1-alpha with feature - VERBOSE_NAMES turned off and before version 0.1.2.2-alpha, each line - is of the form: - ServerID SP ORStatus CRLF - - "entry-guards" - A series of lines listing the currently chosen entry guards, if any. - In Tor 0.1.2.2-alpha with feature VERBOSE_NAMES enabled and in Tor - 0.2.2.1-alpha and later by default, each line is of the form: - LongName SP Status [SP ISOTime] CRLF - - In Tor versions 0.1.2.2-alpha through 0.2.2.1-alpha with feature - VERBOSE_NAMES turned off and before version 0.1.2.2-alpha, each line - is of the form: - ServerID2 SP Status [SP ISOTime] CRLF - ServerID2 = Nickname / 40*HEXDIG - - The definition of Status is the same for both: - Status = "up" / "never-connected" / "down" / - "unusable" / "unlisted" - - [From 0.1.1.4-alpha to 0.1.1.10-alpha, entry-guards was called - "helper-nodes". Tor still supports calling "helper-nodes", but it - is deprecated and should not be used.] - - [Older versions of Tor (before 0.1.2.x-final) generated 'down' instead - of unlisted/unusable. Current Tors never generate 'down'.] 
- - [XXXX ServerID2 differs from ServerID in not prefixing fingerprints - with a $. This is an implementation error. It would be nice to add - the $ back in if we can do so without breaking compatibility.] - - "traffic/read" -- Total bytes read (downloaded). - - "traffic/written" -- Total bytes written (uploaded). - - "accounting/enabled" - "accounting/hibernating" - "accounting/bytes" - "accounting/bytes-left" - "accounting/interval-start" - "accounting/interval-wake" - "accounting/interval-end" - Information about accounting status. If accounting is enabled, - "enabled" is 1; otherwise it is 0. The "hibernating" field is "hard" - if we are accepting no data; "soft" if we're accepting no new - connections, and "awake" if we're not hibernating at all. The "bytes" - and "bytes-left" fields contain (read-bytes SP write-bytes), for the - start and the rest of the interval respectively. The 'interval-start' - and 'interval-end' fields are the borders of the current interval; the - 'interval-wake' field is the time within the current interval (if any) - where we plan[ned] to start being active. The times are GMT. - - "config/names" - A series of lines listing the available configuration options. Each is - of the form: - OptionName SP OptionType [ SP Documentation ] CRLF - OptionName = Keyword - OptionType = "Integer" / "TimeInterval" / "TimeMsecInterval" / - "DataSize" / "Float" / "Boolean" / "Time" / "CommaList" / - "Dependant" / "Virtual" / "String" / "LineList" - Documentation = Text - - "info/names" - A series of lines listing the available GETINFO options. Each is of - one of these forms: - OptionName SP Documentation CRLF - OptionPrefix SP Documentation CRLF - OptionPrefix = OptionName "/*" - - "events/names" - A space-separated list of all the events supported by this version of - Tor's SETEVENTS. - - "features/names" - A space-separated list of all the features supported by this version of - Tor's USEFEATURE.
- - "ip-to-country/*" - Maps IP addresses to 2-letter country codes. For example, - "GETINFO ip-to-country/18.0.0.1" should give "US". - - "next-circuit/IP:port" - XXX todo. - - "process/pid" -- Process id belonging to the main tor process. - "process/uid" -- User id running the tor process, -1 if unknown (this is - unimplemented on Windows, returning -1). - "process/user" -- Username under which the tor process is running, - providing an empty string if none exists (this is unimplemented on - Windows, returning an empty string). - "process/descriptor-limit" -- Upper bound on the file descriptor limit, -1 - if unknown. - - "dir/status-vote/current/consensus" [added in Tor 0.2.1.6-alpha] - "dir/status/authority" - "dir/status/fp/<F>" - "dir/status/fp/<F1>+<F2>+<F3>" - "dir/status/all" - "dir/server/fp/<F>" - "dir/server/fp/<F1>+<F2>+<F3>" - "dir/server/d/<D>" - "dir/server/d/<D1>+<D2>+<D3>" - "dir/server/authority" - "dir/server/all" - A series of lines listing directory contents, provided according to the - specification for the URLs listed in Section 4.4 of dir-spec.txt. Note - that Tor MUST NOT provide private information, such as descriptors for - routers not marked as general-purpose. When asked for 'authority' - information for which this Tor is not authoritative, Tor replies with - an empty string. - - "status/circuit-established" - "status/enough-dir-info" - "status/good-server-descriptor" - "status/accepted-server-descriptor" - "status/..." - These provide the current internal Tor values for various Tor - states. See Section 4.1.10 for explanations. (Only a few of the - status events are available as getinfo's currently. Let us know if - you want more exposed.) - "status/reachability-succeeded/or" - 0 or 1, depending on whether we've found our ORPort reachable. - "status/reachability-succeeded/dir" - 0 or 1, depending on whether we've found our DirPort reachable. 
- "status/reachability-succeeded" - "OR=" ("0"/"1") SP "DIR=" ("0"/"1") - Combines status/reachability-succeeded/*; controllers MUST ignore - unrecognized elements in this entry. - "status/bootstrap-phase" - Returns the most recent bootstrap phase status event - sent. Specifically, it returns a string starting with either - "NOTICE BOOTSTRAP ..." or "WARN BOOTSTRAP ...". Controllers should - use this getinfo when they connect or attach to Tor to learn its - current bootstrap state. - "status/version/recommended" - List of currently recommended versions. - "status/version/current" - Status of the current version. One of: new, old, unrecommended, - recommended, new in series, obsolete, unknown. - "status/clients-seen" - A summary of which countries we've seen clients from recently, - formatted the same as the CLIENTS_SEEN status event described in - Section 4.1.14. This GETINFO option is currently available only - for bridge relays. - - Examples: - C: GETINFO version desc/name/moria1 - S: 250+desc/name/moria= - S: [Descriptor for moria] - S: . - S: 250-version=Tor 0.1.1.0-alpha-cvs - S: 250 OK - -3.10. EXTENDCIRCUIT - - Sent from the client to the server. The format is: - "EXTENDCIRCUIT" SP CircuitID - [SP ServerSpec *("," ServerSpec) - SP "purpose=" Purpose] CRLF - - This request takes one of two forms: either the CircuitID is zero, in - which case it is a request for the server to build a new circuit, - or the CircuitID is nonzero, in which case it is a request for the - server to extend an existing circuit with that ID according to the - specified path. - - If the CircuitID is 0, the controller has the option of providing - a path for Tor to use to build the circuit. If it does not provide - a path, Tor will select one automatically from high capacity nodes - according to path-spec.txt. - - If CircuitID is 0 and "purpose=" is specified, then the circuit's - purpose is set. Two choices are recognized: "general" and - "controller". 
If not specified, circuits are created as "general". - - If the request is successful, the server sends a reply containing a - message body consisting of the CircuitID of the (maybe newly created) - circuit. The syntax is "250" SP "EXTENDED" SP CircuitID CRLF. - -3.11. SETCIRCUITPURPOSE - - Sent from the client to the server. The format is: - "SETCIRCUITPURPOSE" SP CircuitID SP Purpose CRLF - - This changes the circuit's purpose. See EXTENDCIRCUIT above for details. - -3.12. SETROUTERPURPOSE - - Sent from the client to the server. The format is: - "SETROUTERPURPOSE" SP NicknameOrKey SP Purpose CRLF - - This changes the descriptor's purpose. See +POSTDESCRIPTOR below - for details. - - NOTE: This command was disabled and made obsolete as of Tor - 0.2.0.8-alpha. It doesn't exist anymore, and is listed here only for - historical interest. - -3.13. ATTACHSTREAM - - Sent from the client to the server. The syntax is: - "ATTACHSTREAM" SP StreamID SP CircuitID [SP "HOP=" HopNum] CRLF - - This message informs the server that the specified stream should be - associated with the specified circuit. Each stream may be associated with - at most one circuit, and multiple streams may share the same circuit. - Streams can only be attached to completed circuits (that is, circuits that - have sent a circuit status 'BUILT' event or are listed as built in a - GETINFO circuit-status request). - - If the circuit ID is 0, responsibility for attaching the given stream is - returned to Tor. - - If HOP=HopNum is specified, Tor will choose the HopNumth hop in the - circuit as the exit node, rather than the last node in the circuit. - Hops are 1-indexed; generally, it is not permitted to attach to hop 1. - - Tor responds with "250 OK" if it can attach the stream, 552 if the circuit - or stream didn't exist, or 551 if the stream couldn't be attached for - another reason. - - {Implementation note: Tor will close unattached streams by itself, - roughly two minutes after they are born. 
Let the developers know if - that turns out to be a problem.} - - {Implementation note: By default, Tor automatically attaches streams to - circuits itself, unless the configuration variable - "__LeaveStreamsUnattached" is set to "1". Attempting to attach streams - via TC when "__LeaveStreamsUnattached" is false may cause a race between - Tor and the controller, as both attempt to attach streams to circuits.} - - {Implementation note: You can try to attachstream to a stream that - has already sent a connect or resolve request but hasn't succeeded - yet, in which case Tor will detach the stream from its current circuit - before proceeding with the new attach request.} - -3.14. POSTDESCRIPTOR - - Sent from the client to the server. The syntax is: - "+POSTDESCRIPTOR" [SP "purpose=" Purpose] [SP "cache=" Cache] - CRLF Descriptor CRLF "." CRLF - - This message informs the server about a new descriptor. If Purpose is - specified, it must be either "general", "controller", or "bridge", - else we return a 552 error. The default is "general". - - If Cache is specified, it must be either "no" or "yes", else we - return a 552 error. If Cache is not specified, Tor will decide for - itself whether it wants to cache the descriptor, and controllers - must not rely on its choice. - - The descriptor, when parsed, must contain a number of well-specified - fields, including fields for its nickname and identity. - - If there is an error in parsing the descriptor, the server must send a - "554 Invalid descriptor" reply. If the descriptor is well-formed but - the server chooses not to add it, it must reply with a 251 message - whose body explains why the server was not added. If the descriptor - is added, Tor replies with "250 OK". - -3.15. REDIRECTSTREAM - - Sent from the client to the server. The syntax is: - "REDIRECTSTREAM" SP StreamID SP Address [SP Port] CRLF - - Tells the server to change the exit address on the specified stream. 
If - Port is specified, changes the destination port as well. No remapping - is performed on the new provided address. - - To be sure that the modified address will be used, this event must be sent - after a new stream event is received, and before attaching this stream to - a circuit. - - Tor replies with "250 OK" on success. - -3.16. CLOSESTREAM - - Sent from the client to the server. The syntax is: - - "CLOSESTREAM" SP StreamID SP Reason *(SP Flag) CRLF - - Tells the server to close the specified stream. The reason should be one - of the Tor RELAY_END reasons given in tor-spec.txt, as a decimal. Flags is - not used currently; Tor servers SHOULD ignore unrecognized flags. Tor may - hold the stream open for a while to flush any data that is pending. - - Tor replies with "250 OK" on success, or a 512 if there aren't enough - arguments, or a 552 if it doesn't recognize the StreamID or reason. - -3.17. CLOSECIRCUIT - - The syntax is: - CLOSECIRCUIT SP CircuitID *(SP Flag) CRLF - Flag = "IfUnused" - - Tells the server to close the specified circuit. If "IfUnused" is - provided, do not close the circuit unless it is unused. - - Other flags may be defined in the future; Tor SHOULD ignore unrecognized - flags. - - Tor replies with "250 OK" on success, or a 512 if there aren't enough - arguments, or a 552 if it doesn't recognize the CircuitID. - -3.18. QUIT - - Tells the server to hang up on this controller connection. This command - can be used before authenticating. - -3.19. USEFEATURE - - Adding additional features to the control protocol sometimes will break - backwards compatibility. Initially such features are added into Tor and - disabled by default. USEFEATURE can enable these additional features. - - The syntax is: - - "USEFEATURE" *(SP FeatureName) CRLF - FeatureName = 1*(ALPHA / DIGIT / "_" / "-") - - Feature names are case-insensitive. - - Once enabled, a feature stays enabled for the duration of the connection - to the controller. 
A new connection to the controller must be opened to - disable an enabled feature. - - Features are a forward-compatibility mechanism; each feature will eventually - become a standard part of the control protocol. Once a feature becomes part - of the protocol, it is always-on. Each feature documents the version it was - introduced as a feature and the version in which it became part of the - protocol. - - Tor will ignore a request to use any feature that is always-on. Tor will give - a 552 error in response to an unrecognized feature. - - EXTENDED_EVENTS - - Same as passing 'EXTENDED' to SETEVENTS; this is the preferred way to - request the extended event syntax. - - This feature was first introduced in 0.1.2.3-alpha. It is always-on - and part of the protocol in Tor 0.2.2.1-alpha and later. - - VERBOSE_NAMES - - Replaces ServerID with LongName in events and GETINFO results. LongName - provides a Fingerprint for all routers, an indication of Named status, - and a Nickname if one is known. LongName is strictly more informative - than ServerID, which only provides either a Fingerprint or a Nickname. - - This feature was first introduced in 0.1.2.2-alpha. It is always-on and - part of the protocol in Tor 0.2.2.1-alpha and later. - -3.20. RESOLVE - - The syntax is - "RESOLVE" *Option *Address CRLF - Option = "mode=reverse" - Address = a hostname or IPv4 address - - This command launches a remote hostname lookup request for every specified - request (or reverse lookup if "mode=reverse" is specified). Note that the - request is done in the background: to see the answers, your controller will - need to listen for ADDRMAP events; see 4.1.7 below. - - [Added in Tor 0.2.0.3-alpha] - -3.21. 
PROTOCOLINFO - - The syntax is: - "PROTOCOLINFO" *(SP PIVERSION) CRLF - - The server reply format is: - "250-PROTOCOLINFO" SP PIVERSION CRLF *InfoLine "250 OK" CRLF - - InfoLine = AuthLine / VersionLine / OtherLine - - AuthLine = "250-AUTH" SP "METHODS=" AuthMethod *(",")AuthMethod - *(SP "COOKIEFILE=" AuthCookieFile) CRLF - VersionLine = "250-VERSION" SP "Tor=" TorVersion [SP Arguments] CRLF - - AuthMethod = - "NULL" / ; No authentication is required - "HASHEDPASSWORD" / ; A controller must supply the original password - "COOKIE" / ; A controller must supply the contents of a cookie - - AuthCookieFile = QuotedString - TorVersion = QuotedString - - OtherLine = "250-" Keyword [SP Arguments] CRLF - - PIVERSION: 1*DIGIT - - Tor MAY give its InfoLines in any order; controllers MUST ignore InfoLines - with keywords they do not recognize. Controllers MUST ignore extraneous - data on any InfoLine. - - PIVERSION is there in case we drastically change the syntax one day. For - now it should always be "1". Controllers MAY provide a list of the - protocolinfo versions they support; Tor MAY select a version that the - controller does not support. - - AuthMethod is used to specify one or more control authentication - methods that Tor currently accepts. - - AuthCookieFile specifies the absolute path and filename of the - authentication cookie that Tor is expecting and is provided iff - the METHODS field contains the method "COOKIE". Controllers MUST handle - escape sequences inside this string. - - The VERSION line contains the Tor version. - - [Unlike other commands besides AUTHENTICATE, PROTOCOLINFO may be used (but - only once!) before AUTHENTICATE.] - - [PROTOCOLINFO was not supported before Tor 0.2.0.5-alpha.] - -4. Replies - - Reply codes follow the same 3-character format as used by SMTP, with the - first character defining a status, the second character defining a - subsystem, and the third designating fine-grained information. 
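As a rough illustration of the three-character reply format, the Python sketch below splits the code at the start of a reply line into its status, subsystem, and detail characters. The status and subsystem meanings are the ones listed in this section. The helper name `classify_reply` and the two lookup tables are hypothetical and are not part of Tor or of any controller library.

```python
# Meanings copied from section 4 ("Replies") of this spec.
STATUS_CLASSES = {
    "2": "Positive Completion Reply",
    "4": "Temporary Negative Completion Reply",
    "5": "Permanent Negative Completion Reply",
    "6": "Asynchronous Reply",
}

SUBSYSTEMS = {
    "0": "Syntax",
    "1": "Protocol",
    "5": "Tor",
}


def classify_reply(line):
    """Return (status class, subsystem, detail digit) for one reply line.

    Hypothetical helper: unknown first or second characters are reported
    as "unknown" rather than rejected, so newer codes do not break it.
    """
    code = line[:3]
    if len(code) != 3 or not (code.isascii() and code.isdigit()):
        raise ValueError("reply line does not start with a 3-digit code")
    status = STATUS_CLASSES.get(code[0], "unknown")
    subsystem = SUBSYSTEMS.get(code[1], "unknown")
    return status, subsystem, code[2]
```

A controller would normally branch on the first character only (success, retry later, give up, asynchronous event) and use the full code for finer-grained handling.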
- - The TC protocol currently uses the following first characters: - - 2yz Positive Completion Reply - The command was successful; a new request can be started. - - 4yz Temporary Negative Completion reply - The command was unsuccessful but might be reattempted later. - - 5yz Permanent Negative Completion Reply - The command was unsuccessful; the client should not try exactly - that sequence of commands again. - - 6yz Asynchronous Reply - Sent out-of-order in response to an earlier SETEVENTS command. - - The following second characters are used: - - x0z Syntax - Sent in response to ill-formed or nonsensical commands. - - x1z Protocol - Refers to operations of the Tor Control protocol. - - x5z Tor - Refers to actual operations of Tor system. - - The following codes are defined: - - 250 OK - 251 Operation was unnecessary - [Tor has declined to perform the operation, but no harm was done.] - - 451 Resource exhausted - - 500 Syntax error: protocol - - 510 Unrecognized command - 511 Unimplemented command - 512 Syntax error in command argument - 513 Unrecognized command argument - 514 Authentication required - 515 Bad authentication - - 550 Unspecified Tor error - - 551 Internal error - [Something went wrong inside Tor, so that the client's - request couldn't be fulfilled.] - - 552 Unrecognized entity - [A configuration key, a stream ID, circuit ID, event, - mentioned in the command did not actually exist.] - - 553 Invalid configuration value - [The client tried to set a configuration option to an - incorrect, ill-formed, or impossible value.] - - 554 Invalid descriptor - - 555 Unmanaged entity - - 650 Asynchronous event notification - - Unless specified to have specific contents, the human-readable messages - in error replies should not be relied upon to match those in this document. - -4.1. Asynchronous events - - These replies can be sent after a corresponding SETEVENTS command has been - received. 
They will not be interleaved with other Reply elements, but they - can appear between a command and its corresponding reply. For example, - this sequence is possible: - - C: SETEVENTS CIRC - S: 250 OK - C: GETCONF SOCKSPORT ORPORT - S: 650 CIRC 1000 EXTENDED moria1,moria2 - S: 250-SOCKSPORT=9050 - S: 250 ORPORT=0 - - But this sequence is disallowed: - C: SETEVENTS CIRC - S: 250 OK - C: GETCONF SOCKSPORT ORPORT - S: 250-SOCKSPORT=9050 - S: 650 CIRC 1000 EXTENDED moria1,moria2 - S: 250 ORPORT=0 - - Clients MUST tolerate more arguments in an asynchronous reply than - expected, and MUST tolerate more lines in an asynchronous reply than - expected. For instance, a client that expects a CIRC message like: - 650 CIRC 1000 EXTENDED moria1,moria2 - must tolerate: - 650-CIRC 1000 EXTENDED moria1,moria2 0xBEEF - 650-EXTRAMAGIC=99 - 650 ANONYMITY=high - - If clients ask for extended events, then each event line as specified below - will be followed by additional extensions. Additional lines will be of the - form - "650" ("-"/" ") KEYWORD ["=" ARGUMENTS] CRLF - Additional arguments will be of the form - SP KEYWORD ["=" ( QuotedString / * NonSpDquote ) ] - Such clients MUST tolerate lines with keywords they do not recognize. - -4.1.1. 
Circuit status changed - - The syntax is: - - "650" SP "CIRC" SP CircuitID SP CircStatus [SP Path] - [SP "REASON=" Reason [SP "REMOTE_REASON=" Reason]] CRLF - - CircStatus = - "LAUNCHED" / ; circuit ID assigned to new circuit - "BUILT" / ; all hops finished, can now accept streams - "EXTENDED" / ; one more hop has been completed - "FAILED" / ; circuit closed (was not built) - "CLOSED" ; circuit closed (was built) - - Path = LongName *("," LongName) - ; In Tor versions 0.1.2.2-alpha through 0.2.2.1-alpha with feature - ; VERBOSE_NAMES turned off and before version 0.1.2.2-alpha, Path - ; is as follows: - Path = ServerID *("," ServerID) - - Reason = "NONE" / "TORPROTOCOL" / "INTERNAL" / "REQUESTED" / - "HIBERNATING" / "RESOURCELIMIT" / "CONNECTFAILED" / - "OR_IDENTITY" / "OR_CONN_CLOSED" / "TIMEOUT" / - "FINISHED" / "DESTROYED" / "NOPATH" / "NOSUCHSERVICE" / - "MEASUREMENT_EXPIRED" - - The path is provided only when the circuit has been extended at least one - hop. - - The "REASON" field is provided only for FAILED and CLOSED events, and only - if extended events are enabled (see 3.19). Clients MUST accept reasons - not listed above. Reasons are as given in tor-spec.txt, except for: - - NOPATH (Not enough nodes to make circuit) - - The "REMOTE_REASON" field is provided only when we receive a DESTROY or - TRUNCATE cell, and only if extended events are enabled. It contains the - actual reason given by the remote OR for closing the circuit. Clients MUST - accept reasons not listed above. Reasons are as listed in tor-spec.txt. - -4.1.2. 
Stream status changed - - The syntax is: - - "650" SP "STREAM" SP StreamID SP StreamStatus SP CircID SP Target - [SP "REASON=" Reason [ SP "REMOTE_REASON=" Reason ]] - [SP "SOURCE=" Source] [ SP "SOURCE_ADDR=" Address ":" Port ] - [SP "PURPOSE=" Purpose] - CRLF - - StreamStatus = - "NEW" / ; New request to connect - "NEWRESOLVE" / ; New request to resolve an address - "REMAP" / ; Address re-mapped to another - "SENTCONNECT" / ; Sent a connect cell along a circuit - "SENTRESOLVE" / ; Sent a resolve cell along a circuit - "SUCCEEDED" / ; Received a reply; stream established - "FAILED" / ; Stream failed and not retriable - "CLOSED" / ; Stream closed - "DETACHED" ; Detached from circuit; still retriable - - Target = Address ":" Port - - The circuit ID designates which circuit this stream is attached to. If - the stream is unattached, the circuit ID "0" is given. - - Reason = "MISC" / "RESOLVEFAILED" / "CONNECTREFUSED" / - "EXITPOLICY" / "DESTROY" / "DONE" / "TIMEOUT" / - "NOROUTE" / "HIBERNATING" / "INTERNAL"/ "RESOURCELIMIT" / - "CONNRESET" / "TORPROTOCOL" / "NOTDIRECTORY" / "END" / - "PRIVATE_ADDR" - - The "REASON" field is provided only for FAILED, CLOSED, and DETACHED - events, and only if extended events are enabled (see 3.19). Clients MUST - accept reasons not listed above. Reasons are as given in tor-spec.txt, - except for: - - END (We received a RELAY_END cell from the other side of this - stream.) - PRIVATE_ADDR (The client tried to connect to a private address like - 127.0.0.1 or 10.0.0.1 over Tor.) - [XXXX document more. -NM] - - - The "REMOTE_REASON" field is provided only when we receive a RELAY_END - cell, and only if extended events are enabled. It contains the actual - reason given by the remote OR for closing the stream. Clients MUST accept - reasons not listed above. Reasons are as listed in tor-spec.txt. - - "REMAP" events include a Source if extended events are enabled: - Source = "CACHE" / "EXIT" - Clients MUST accept sources not listed above. 
"CACHE" is given if - the Tor client decided to remap the address because of a cached - answer, and "EXIT" is given if the remote node we queried gave us - the new address as a response. - - The "SOURCE_ADDR" field is included with NEW and NEWRESOLVE events if - extended events are enabled. It indicates the address and port - that requested the connection, and can be (e.g.) used to look up the - requesting program. - - Purpose = "DIR_FETCH" / "UPLOAD_DESC" / "DNS_REQUEST" / - "USER" / "DIRPORT_TEST" - - The "PURPOSE" field is provided only for NEW and NEWRESOLVE events, and - only if extended events are enabled (see 3.19). Clients MUST accept - purposes not listed above. - -4.1.3. OR Connection status changed - - The syntax is: - - "650" SP "ORCONN" SP (LongName / Target) SP ORStatus [ SP "REASON=" - Reason ] [ SP "NCIRCS=" NumCircuits ] CRLF - - ORStatus = "NEW" / "LAUNCHED" / "CONNECTED" / "FAILED" / "CLOSED" - - ; In Tor versions 0.1.2.2-alpha through 0.2.2.1-alpha with feature - ; VERBOSE_NAMES turned off and before version 0.1.2.2-alpha, OR - ; Connection is as follows: - "650" SP "ORCONN" SP (ServerID / Target) SP ORStatus [ SP "REASON=" - Reason ] [ SP "NCIRCS=" NumCircuits ] CRLF - - NEW is for incoming connections, and LAUNCHED is for outgoing - connections. CONNECTED means the TLS handshake has finished (in - either direction). FAILED means a connection is being closed that - hasn't finished its handshake, and CLOSED is for connections that - have handshaked. - - A LongName or ServerID is specified unless it's a NEW connection, in - which case we don't know what server it is yet, so we use Address:Port. - - If extended events are enabled (see 3.19), optional reason and - circuit counting information is provided for CLOSED and FAILED - events. - - Reason = "MISC" / "DONE" / "CONNECTREFUSED" / - "IDENTITY" / "CONNECTRESET" / "TIMEOUT" / "NOROUTE" / - "IOERROR" / "RESOURCELIMIT" - - NumCircuits counts both established and pending circuits. - -4.1.4. 
Bandwidth used in the last second - - The syntax is: - "650" SP "BW" SP BytesRead SP BytesWritten *(SP Type "=" Num) CRLF - BytesRead = 1*DIGIT - BytesWritten = 1*DIGIT - Type = "DIR" / "OR" / "EXIT" / "APP" / ... - Num = 1*DIGIT - - BytesRead and BytesWritten are the totals. [In a future Tor version, - we may also include a breakdown of the connection types that used - bandwidth this second (not implemented yet).] - -4.1.5. Log messages - - The syntax is: - "650" SP Severity SP ReplyText CRLF - or - "650+" Severity CRLF Data 650 SP "OK" CRLF - - Severity = "DEBUG" / "INFO" / "NOTICE" / "WARN"/ "ERR" - -4.1.6. New descriptors available - - Syntax: - "650" SP "NEWDESC" 1*(SP LongName) CRLF - ; In Tor versions 0.1.2.2-alpha through 0.2.2.1-alpha with feature - ; VERBOSE_NAMES turned off and before version 0.1.2.2-alpha, it - ; is as follows: - "650" SP "NEWDESC" 1*(SP ServerID) CRLF - -4.1.7. New Address mapping - - Syntax: - "650" SP "ADDRMAP" SP Address SP NewAddress SP Expiry - [SP Error] SP GMTExpiry CRLF - - NewAddress = Address / "<error>" - Expiry = DQUOTE ISOTime DQUOTE / "NEVER" - - Error = "error=" ErrorCode - ErrorCode = XXXX - GMTExpiry = "EXPIRES=" DQUOTE IsoTime DQUOTE - - Error and GMTExpiry are only provided if extended events are enabled. - - Expiry is expressed as the local time (rather than GMT). This is a bug, - left in for backward compatibility; new code should look at GMTExpiry - instead. - - These events are generated when a new address mapping is entered in the - cache, or when the answer for a RESOLVE command is found. - -4.1.8. Descriptors uploaded to us in our role as authoritative dirserver - - Syntax: - "650" "+" "AUTHDIR_NEWDESCS" CRLF Action CRLF Message CRLF - Descriptor CRLF "." CRLF "650" SP "OK" CRLF - Action = "ACCEPTED" / "DROPPED" / "REJECTED" - Message = Text - -4.1.9. Our descriptor changed - - Syntax: - "650" SP "DESCCHANGED" CRLF - - [First added in 0.1.2.2-alpha.] - -4.1.10. 
Status events - - Status events (STATUS_GENERAL, STATUS_CLIENT, and STATUS_SERVER) are sent - based on occurrences in the Tor process pertaining to the general state of - the program. Generally, they correspond to log messages of severity Notice - or higher. They differ from log messages in that their format is a - specified interface. - - Syntax: - "650" SP StatusType SP StatusSeverity SP StatusAction - [SP StatusArguments] CRLF - - StatusType = "STATUS_GENERAL" / "STATUS_CLIENT" / "STATUS_SERVER" - StatusSeverity = "NOTICE" / "WARN" / "ERR" - StatusAction = 1*ALPHA - StatusArguments = StatusArgument *(SP StatusArgument) - StatusArgument = StatusKeyword '=' StatusValue - StatusKeyword = 1*(ALNUM / "_") - StatusValue = 1*(ALNUM / '_') / QuotedString - - Action is a string, and Arguments is a series of keyword=value - pairs on the same line. Values may be space-terminated strings, - or quoted strings. - - These events are always produced with EXTENDED_EVENTS and - VERBOSE_NAMES; see the explanations in the USEFEATURE section - for details. - - Controllers MUST tolerate unrecognized actions, MUST tolerate - unrecognized arguments, MUST tolerate missing arguments, and MUST - tolerate arguments that arrive in any order. - - Each event description below is accompanied by a recommendation for - controllers. These recommendations are suggestions only; no controller - is required to implement them. - - Compatibility note: versions of Tor before 0.2.0.22-rc incorrectly - generated "STATUS_SERVER" as "STATUS_SEVER". To be compatible with those - versions, tools should accept both. - - Actions for STATUS_GENERAL events can be as follows: - - CLOCK_JUMPED - "TIME=NUM" - Tor spent enough time without CPU cycles that it has closed all - its circuits and will establish them anew. This typically - happens when a laptop goes to sleep and then wakes up again. It - also happens when the system is swapping so heavily that Tor is - starving. 
The "time" argument specifies the number of seconds Tor - thinks it was unconscious for (or alternatively, the number of - seconds it went back in time). - - This status event is sent as NOTICE severity normally, but WARN - severity if Tor is acting as a server currently. - - {Recommendation for controller: ignore it, since we don't really - know what the user should do anyway. Hm.} - - DANGEROUS_VERSION - "CURRENT=version" - "REASON=NEW/OBSOLETE/UNRECOMMENDED" - "RECOMMENDED=\"version, version, ...\"" - Tor has found that directory servers don't recommend its version of - the Tor software. RECOMMENDED is a comma-and-space-separated string - of Tor versions that are recommended. REASON is NEW if this version - of Tor is newer than any recommended version, OBSOLETE if - this version of Tor is older than any recommended version, and - UNRECOMMENDED if some recommended versions of Tor are newer and - some are older than this version. (The "OBSOLETE" reason was called - "OLD" from Tor 0.1.2.3-alpha up to and including 0.2.0.12-alpha.) - - {Controllers may want to suggest that the user upgrade OLD or - UNRECOMMENDED versions. NEW versions may be known-insecure, or may - simply be development versions.} - - TOO_MANY_CONNECTIONS - "CURRENT=NUM" - Tor has reached its ulimit -n or whatever the native limit is on file - descriptors or sockets. CURRENT is the number of sockets Tor - currently has open. The user should really do something about - this. The "current" argument shows the number of connections currently - open. - - {Controllers may recommend that the user increase the limit, or - increase it for them. Recommendations should be phrased in an - OS-appropriate way and automated when possible.} - - BUG - "REASON=STRING" - Tor has encountered a situation that its developers never expected, - and the developers would like to learn that it happened. Perhaps - the controller can explain this to the user and encourage her to - file a bug report? 
- - {Controllers should log bugs, but shouldn't annoy the user in case a - bug appears frequently.} - - CLOCK_SKEW - SKEW="+" / "-" SECONDS - MIN_SKEW="+" / "-" SECONDS - SOURCE="DIRSERV:" IP ":" Port / - "NETWORKSTATUS:" IP ":" Port / - "OR:" IP ":" Port / - "CONSENSUS" - If "SKEW" is present, it's an estimate of how far we are from the - time declared in the source. (In other words, if we're an hour in - the past, the value is -3600.) If "MIN_SKEW" is present, it's a lower - bound. If the source is a DIRSERV, we got the current time from a - connection to a dirserver. If the source is a NETWORKSTATUS, we - decided we're skewed because we got a v2 networkstatus from far in - the future. If the source is OR, the skew comes from a NETINFO - cell from a connection to another relay. If the source is - CONSENSUS, we decided we're skewed because we got a networkstatus - consensus from the future. - - {Tor should send this message to controllers when it thinks the - skew is so high that it will interfere with proper Tor operation. - Controllers shouldn't blindly adjust the clock, since the more - accurate source of skew info (DIRSERV) is currently - unauthenticated.} - - BAD_LIBEVENT - "METHOD=" libevent method - "VERSION=" libevent version - "BADNESS=" "BROKEN" / "BUGGY" / "SLOW" - "RECOVERED=" "NO" / "YES" - Tor knows about bugs in using the configured event method in this - version of libevent. "BROKEN" libevents won't work at all; - "BUGGY" libevents might work okay; "SLOW" libevents will work - fine, but not quickly. If "RECOVERED" is YES, Tor managed to - switch to a more reliable (but probably slower!) libevent method. 
- - {Controllers may want to warn the user if this event occurs, though - generally it's the fault of whoever built the Tor binary and there's - not much the user can do besides upgrade libevent or upgrade the - binary.} - - DIR_ALL_UNREACHABLE - Tor believes that none of the known directory servers are - reachable -- this is most likely because the local network is - down or otherwise not working, and might help to explain for the - user why Tor appears to be broken. - - {Controllers may want to warn the user if this event occurs; further - action is generally not possible.} - - CONSENSUS_ARRIVED - Tor has received and validated a new consensus networkstatus. - (This event can be delayed a little while after the consensus - is received, if Tor needs to fetch certificates.) - - Actions for STATUS_CLIENT events can be as follows: - - BOOTSTRAP - "PROGRESS=" num - "TAG=" Keyword - "SUMMARY=" String - ["WARNING=" String - "REASON=" Keyword - "COUNT=" num - "RECOMMENDATION=" Keyword - ] - - Tor has made some progress at establishing a connection to the - Tor network, fetching directory information, or making its first - circuit; or it has encountered a problem while bootstrapping. This - status event is especially useful for users with slow connections - or with connectivity problems. - - "Progress" gives a number between 0 and 100 for how far through - the bootstrapping process we are. "Summary" is a string that can - be displayed to the user to describe the *next* task that Tor - will tackle, i.e., the task it is working on after sending the - status event. "Tag" is a string that controllers can use to - recognize bootstrap phases, if they want to do something smarter - than just blindly displaying the summary string; see Section 5 - for the current tags that Tor issues. - - The StatusSeverity describes whether this is a normal bootstrap - phase (severity notice) or an indication of a bootstrapping - problem (severity warn). 
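The status-event syntax and the BOOTSTRAP arguments described above can be sketched as a small Python parser. This is only a minimal sketch under stated assumptions: `parse_status_event` and `unquote` are hypothetical helper names, quoted-string handling is simplified to dropping the backslash from escaped characters, and the sample line in the usage note (including its TAG and SUMMARY values) is made up for illustration rather than captured from Tor.

```python
import re

# keyword=value pairs; a value is either a quoted string or a run of
# non-space characters (simplified from the StatusArgument grammar).
ARG_RE = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S*)')


def unquote(value):
    """Strip surrounding quotes and drop backslashes before escaped chars.

    Assumption: this is a simplification of QuotedString handling.
    """
    if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
        return re.sub(r'\\(.)', r'\1', value[1:-1])
    return value


def parse_status_event(line):
    """Parse '650 StatusType StatusSeverity StatusAction [args]'.

    Unrecognized, missing, or reordered arguments are all tolerated:
    whatever keyword=value pairs are present end up in the dict.
    """
    parts = line.rstrip("\r\n").split(" ", 4)
    if len(parts) < 4 or parts[0] != "650":
        raise ValueError("not a single-line 650 status event")
    status_type, severity, action = parts[1:4]
    # Tor before 0.2.0.22-rc misspelled STATUS_SERVER; accept both.
    if status_type == "STATUS_SEVER":
        status_type = "STATUS_SERVER"
    rest = parts[4] if len(parts) > 4 else ""
    args = {key: unquote(val) for key, val in ARG_RE.findall(rest)}
    return status_type, severity, action, args
```

For a line such as `650 STATUS_CLIENT NOTICE BOOTSTRAP PROGRESS=80 TAG=conn_or SUMMARY="Connecting to the Tor network"`, a controller could read `args.get("PROGRESS")` for a progress bar and treat a severity of `WARN` as a bootstrapping problem. Using `dict.get` rather than indexing keeps the controller tolerant of missing arguments.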
- - For bootstrap problems, we include the same progress, tag, and - summary values as we would for a normal bootstrap event, but we - also include "warning", "reason", "count", and "recommendation" - key/value combos. The "count" number tells how many bootstrap - problems there have been so far at this phase. The "reason" - string lists one of the reasons allowed in the ORCONN event. The - "warning" argument is a string with any hints Tor has to offer about - why it's having trouble bootstrapping. - - The "reason" values are long-term-stable controller-facing tags to - identify particular issues in a bootstrapping step. The warning - strings, on the other hand, are human-readable. Controllers - SHOULD NOT rely on the format of any warning string. Currently - the possible values for "recommendation" are either "ignore" or - "warn" -- if ignore, the controller can accumulate the string in - a pile of problems to show the user if the user asks; if warn, - the controller should alert the user that Tor is pretty sure - there's a bootstrapping problem. - - Currently Tor uses recommendation=ignore for the first - nine bootstrap problem reports for a given phase, and then - uses recommendation=warn for subsequent problems at that - phase. Hopefully this is a good balance between tolerating - occasional errors and reporting serious problems quickly. - - ENOUGH_DIR_INFO - Tor now knows enough network-status documents and enough server - descriptors that it's going to start trying to build circuits now. - - {Controllers may want to use this event to decide when to indicate - progress to their users, but should not interrupt the user's browsing - to tell them so.} - - NOT_ENOUGH_DIR_INFO - We discarded expired statuses and router descriptors to fall - below the desired threshold of directory information. We won't - try to build any circuits until ENOUGH_DIR_INFO occurs again. 
- - {Controllers may want to use this event to decide when to indicate - progress to their users, but should not interrupt the user's browsing - to tell them so.} - - CIRCUIT_ESTABLISHED - Tor is able to establish circuits for client use. This event will - only be sent if we just built a circuit that changed our mind -- - that is, prior to this event we didn't know whether we could - establish circuits. - - {Suggested use: controllers can notify their users that Tor is - ready for use as a client once they see this status event. [Perhaps - controllers should also have a timeout if too much time passes and - this event hasn't arrived, to give tips on how to troubleshoot. - On the other hand, hopefully Tor will send further status events - if it can identify the problem.]} - - CIRCUIT_NOT_ESTABLISHED - "REASON=" "EXTERNAL_ADDRESS" / "DIR_ALL_UNREACHABLE" / "CLOCK_JUMPED" - We are no longer confident that we can build circuits. The "reason" - keyword provides an explanation: which other status event type caused - our lack of confidence. - - {Controllers may want to use this event to decide when to indicate - progress to their users, but should not interrupt the user's browsing - to do so.} - [Note: only REASON=CLOCK_JUMPED is implemented currently.] - - DANGEROUS_PORT - "PORT=" port - "RESULT=" "REJECT" / "WARN" - A stream was initiated to a port that's commonly used for - vulnerable-plaintext protocols. If the Result is "reject", we - refused the connection; whereas if it's "warn", we allowed it. - - {Controllers should warn their users when this occurs, unless they - happen to know that the application using Tor is in fact doing so - correctly (e.g., because it is part of a distributed bundle). 
They - might also want some sort of interface to let the user configure - their RejectPlaintextPorts and WarnPlaintextPorts config options.} - - DANGEROUS_SOCKS - "PROTOCOL=" "SOCKS4" / "SOCKS5" - "ADDRESS=" IP:port - A connection was made to Tor's SOCKS port using one of the SOCKS - approaches that doesn't support hostnames -- only raw IP addresses. - If the client application got this address from gethostbyname(), - it may be leaking target addresses via DNS. - - {Controllers should warn their users when this occurs, unless they - happen to know that the application using Tor is in fact doing so - correctly (e.g., because it is part of a distributed bundle).} - - SOCKS_UNKNOWN_PROTOCOL - "DATA=string" - A connection was made to Tor's SOCKS port that tried to use it - for something other than the SOCKS protocol. Perhaps the user is - using Tor as an HTTP proxy? The DATA is the first few characters - sent to Tor on the SOCKS port. - - {Controllers may want to warn their users when this occurs: it - indicates a misconfigured application.} - - SOCKS_BAD_HOSTNAME - "HOSTNAME=QuotedString" - Some application gave us a funny-looking hostname. Perhaps - it is broken? In any case it won't work with Tor and the user - should know. - - {Controllers may want to warn their users when this occurs: it - usually indicates a misconfigured application.} - - Actions for STATUS_SERVER can be as follows: - - EXTERNAL_ADDRESS - "ADDRESS=IP" - "HOSTNAME=NAME" - "METHOD=CONFIGURED/DIRSERV/RESOLVED/INTERFACE/GETHOSTNAME" - Our best idea for our externally visible IP has changed to 'IP'. - If 'HOSTNAME' is present, we got the new IP by resolving 'NAME'. If the - method is 'CONFIGURED', the IP was given verbatim as a configuration - option. If the method is 'RESOLVED', we resolved the Address - configuration option to get the IP. If the method is 'GETHOSTNAME', - we resolved our hostname to get the IP. 
If the method is 'INTERFACE', - we got the address of one of our network interfaces to get the IP. If - the method is 'DIRSERV', a directory server told us a guess for what - our IP might be. - - {Controllers may want to record this info and display it to the user.} - - CHECKING_REACHABILITY - "ORADDRESS=IP:port" - "DIRADDRESS=IP:port" - We're going to start testing the reachability of our external OR port - or directory port. - - {This event could affect the controller's idea of server status, but - the controller should not interrupt the user to tell them so.} - - REACHABILITY_SUCCEEDED - "ORADDRESS=IP:port" - "DIRADDRESS=IP:port" - We successfully verified the reachability of our external OR port or - directory port (depending on which of ORADDRESS or DIRADDRESS is - given.) - - {This event could affect the controller's idea of server status, but - the controller should not interrupt the user to tell them so.} - - GOOD_SERVER_DESCRIPTOR - We successfully uploaded our server descriptor to at least one - of the directory authorities, with no complaints. - - {Originally, the goal of this event was to declare "every authority - has accepted the descriptor, so there will be no complaints - about it." But since some authorities might be offline, it's - harder to get certainty than we had thought. As such, this event - is equivalent to ACCEPTED_SERVER_DESCRIPTOR below. Controllers - should just look at ACCEPTED_SERVER_DESCRIPTOR and should ignore - this event for now.} - - SERVER_DESCRIPTOR_STATUS - "STATUS=" "LISTED" / "UNLISTED" - We just got a new networkstatus consensus, and whether we're in - it or not in it has changed. Specifically, status is "listed" - if we're listed in it but previous to this point we didn't know - we were listed in a consensus; and status is "unlisted" if we - thought we should have been listed in it (e.g. we were listed in - the last one), but we're not. - - {Moving from listed to unlisted is not necessarily cause for - alarm. 
The relay might have failed a few reachability tests, - or the Internet might have had some routing problems. So this - feature is mainly to let relay operators know when their relay - has successfully been listed in the consensus.} - - [Not implemented yet. We should do this in 0.2.2.x. -RD] - - NAMESERVER_STATUS - "NS=addr" - "STATUS=" "UP" / "DOWN" - "ERR=" message - One of our nameservers has changed status. - - {This event could affect the controller's idea of server status, but - the controller should not interrupt the user to tell them so.} - - NAMESERVER_ALL_DOWN - All of our nameservers have gone down. - - {This is a problem; if it happens often without the nameservers - coming up again, the user needs to configure more or better - nameservers.} - - DNS_HIJACKED - Our DNS provider is providing an address when it should be saying - "NOTFOUND"; Tor will treat the address as a synonym for "NOTFOUND". - - {This is an annoyance; controllers may want to tell admins that their - DNS provider is not to be trusted.} - - DNS_USELESS - Our DNS provider is giving a hijacked address instead of well-known - websites; Tor will not try to be an exit node. - - {Controllers could warn the admin if the server is running as an - exit server: the admin needs to configure a good DNS server. - Alternatively, this happens a lot in some restrictive environments - (hotels, universities, coffeeshops) when the user hasn't registered.} - - BAD_SERVER_DESCRIPTOR - "DIRAUTH=addr:port" - "REASON=string" - A directory authority rejected our descriptor. Possible reasons - include malformed descriptors, incorrect keys, highly skewed clocks, - and so on. - - {Controllers should warn the admin, and try to cope if they can.} - - ACCEPTED_SERVER_DESCRIPTOR - "DIRAUTH=addr:port" - A single directory authority accepted our descriptor. 
- // actually notice - - {This event could affect the controller's idea of server status, but - the controller should not interrupt the user to tell them so.} - - REACHABILITY_FAILED - "ORADDRESS=IP:port" - "DIRADDRESS=IP:port" - We failed to connect to our external OR port or directory port - successfully. - - {This event could affect the controller's idea of server status. The - controller should warn the admin and suggest reasonable steps to take.} - -4.1.11. Our set of guard nodes has changed - - Syntax: - "650" SP "GUARD" SP Type SP Name SP Status ... CRLF - Type = "ENTRY" - Name = The (possibly verbose) nickname of the guard affected. - Status = "NEW" | "UP" | "DOWN" | "BAD" | "GOOD" | "DROPPED" - - [explain states. XXX] - -4.1.12. Network status has changed - - Syntax: - "650" "+" "NS" CRLF 1*NetworkStatus "." CRLF "650" SP "OK" CRLF - - The event is used whenever our local view of a relay status changes. - This happens when we get a new v3 consensus (in which case the entries - we see are a duplicate of what we see in the NEWCONSENSUS event, - below), but it also happens when we decide to mark a relay as up or - down in our local status, for example based on connection attempts. - - [First added in 0.1.2.3-alpha] - -4.1.13. Bandwidth used on an application stream - - The syntax is: - "650" SP "STREAM_BW" SP StreamID SP BytesWritten SP BytesRead CRLF - BytesWritten = 1*DIGIT - BytesRead = 1*DIGIT - - BytesWritten and BytesRead are the number of bytes written and read - by the application since the last STREAM_BW event on this stream. - - Note that from Tor's perspective, *reading* a byte on a stream means - that the application *wrote* the byte. That's why the order of "written" - vs "read" is opposite for stream_bw events compared to bw events. - - These events are generated about once per second per stream; no events - are generated for streams that have not written or read. 
These events - apply only to streams entering Tor (such as on a SOCKSPort, TransPort, - or so on). They are not generated for exiting streams. - -4.1.14. Per-country client stats - - The syntax is: - "650" SP "CLIENTS_SEEN" SP TimeStarted SP CountrySummary CRLF - - We just generated a new summary of which countries we've seen clients - from recently. The controller could display this for the user, e.g. - in their "relay" configuration window, to give them a sense that they - are actually being useful. - - Currently only bridge relays will receive this event, but once we figure - out how to sufficiently aggregate and sanitize the client counts on - main relays, we might start sending these events in other cases too. - - TimeStarted is a quoted string indicating when the reported summary - counts from (in GMT). - - The CountrySummary keyword has as its argument a comma-separated, - possibly empty set of "countrycode=count" pairs. For example (without - linebreak), - 650-CLIENTS_SEEN TimeStarted="2008-12-25 23:50:43" - CountrySummary=us=16,de=8,uk=8 - -4.1.15. New consensus networkstatus has arrived. - - The syntax is: - "650" "+" "NEWCONSENSUS" CRLF 1*NetworkStatus "." CRLF "650" SP - "OK" CRLF - - A new consensus networkstatus has arrived. We include NS-style lines for - every relay in the consensus. NEWCONSENSUS is a separate event from the - NS event, because the list here represents every usable relay: so any - relay *not* mentioned in this list is implicitly no longer recommended. - - [First added in 0.2.1.13-alpha] - -4.1.16. New circuit buildtime has been set. 
- - The syntax is: - "650" SP "BUILDTIMEOUT_SET" SP Type SP "TOTAL_TIMES=" Total SP - "TIMEOUT_MS=" Timeout SP "XM=" Xm SP "ALPHA=" Alpha SP - "CUTOFF_QUANTILE=" Quantile SP "TIMEOUT_RATE=" TimeoutRate SP - "CLOSE_MS=" CloseTimeout SP "CLOSE_RATE=" CloseRate - CRLF - Type = "COMPUTED" / "RESET" / "SUSPENDED" / "DISCARD" / "RESUME" - Total = Integer count of timeouts stored - Timeout = Integer timeout in milliseconds - Xm = Estimated integer Pareto parameter Xm in milliseconds - Alpha = Estimated floating point Paredo paremter alpha - Quantile = Floating point CDF quantile cutoff point for this timeout - TimeoutRate = Floating point ratio of circuits that timeout - CloseTimeout = How long to keep measurement circs in milliseconds - CloseRate = Floating point ratio of measurement circuits that are closed - - A new circuit build timeout time has been set. If Type is "COMPUTED", - Tor has computed the value based on historical data. If Type is "RESET", - initialization or drastic network changes have caused Tor to reset - the timeout back to the default, to relearn again. If Type is - "SUSPENDED", Tor has detected a loss of network connectivity and has - temporarily changed the timeout value to the default until the network - recovers. If type is "DISCARD", Tor has decided to discard timeout - values that likely happened while the network was down. If type is - "RESUME", Tor has decided to resume timeout calculation. - - The Total value is the count of circuit build times Tor used in - computing this value. It is capped internally at the maximum number - of build times Tor stores (NCIRCUITS_TO_OBSERVE). - - The Timeout itself is provided in milliseconds. Internally, Tor rounds - this value to the nearest second before using it. - - [First added in 0.2.2.7-alpha] - -4.1.17. Signal received - - The syntax is: - "650" SP "SIGNAL" SP Signal CRLF - - Signal = "RELOAD" / "DUMP" / "DEBUG" / "NEWNYM" / "CLEARDNSCACHE" - - A signal has been received and actions taken by Tor. 
The meaning of each - signal, and the mapping to Unix signals, is as defined in section 3.7. - Future versions of Tor MAY generate signals other than those listed here; - controllers MUST be able to accept them. - - If Tor chose to ignore a signal (such as NEWNYM), this event will not be - sent. Note that some options (like ReloadTorrcOnSIGHUP) may affect the - semantics of the signals here. - - Note that the HALT (SIGTERM) and SHUTDOWN (SIGINT) signals do not currently - generate any event. - - [First added in 0.2.3.1-alpha] - -5. Implementation notes - -5.1. Authentication - - If the control port is open and no authentication operation is enabled, Tor - trusts any local user that connects to the control port. This is generally - a poor idea. - - If the 'CookieAuthentication' option is true, Tor writes a "magic cookie" - file named "control_auth_cookie" into its data directory. To authenticate, - the controller must send the contents of this file, encoded in hexadecimal. - - If the 'HashedControlPassword' option is set, it must contain the salted - hash of a secret password. The salted hash is computed according to the - S2K algorithm in RFC 2440 (OpenPGP), and prefixed with the s2k specifier. - This is then encoded in hexadecimal, prefixed by the indicator sequence - "16:". Thus, for example, the password 'foo' could encode to: - 16:660537E3E1CD49996044A3BF558097A981F539FEA2F9DA662B4626C1C2 - ++++++++++++++++**^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - salt hashed value - indicator - You can generate the salt of a password by calling - 'tor --hash-password <password>' - or by using the example code in the Python and Java controller libraries. - To authenticate under this scheme, the controller sends Tor the original - secret that was used to generate the password, either as a quoted string - or encoded in hexadecimal. - -5.2. Don't let the buffer get too big. 
- - If you ask for lots of events, and 16MB of them queue up on the buffer, - the Tor process will close the socket. - -5.3. Backward compatibility with v0 control protocol. - - The 'version 0' control protocol was replaced in Tor 0.1.1.x. Support - was removed in Tor 0.2.0.x. Every non-obsolete version of Tor now - supports the version 1 control protocol. - - For backward compatibility with the "version 0" control protocol, - Tor used to check whether the third octet of the first command is zero. - (If it was, Tor assumed that version 0 is in use.) - - This compatibility was removed in Tor 0.1.2.16 and 0.2.0.4-alpha. - -5.4. Tor config options for use by controllers - - Tor provides a few special configuration options for use by controllers. - These options can be set and examined by the SETCONF and GETCONF commands, - but are not saved to disk by SAVECONF. - - Generally, these options make Tor unusable by disabling a portion of Tor's - normal operations. Unless a controller provides replacement functionality - to fill this gap, Tor will not correctly handle user requests. - - __AllDirOptionsPrivate - - If true, Tor will try to launch all directory operations through - anonymous connections. (Ordinarily, Tor only tries to anonymize - requests related to hidden services.) This option will slow down - directory access, and may stop Tor from working entirely if it does not - yet have enough directory information to build circuits. - - (Boolean. Default: "0".) - - __DisablePredictedCircuits - - If true, Tor will not launch preemptive "general-purpose" circuits for - streams to attach to. (It will still launch circuits for testing and - for hidden services.) - - (Boolean. Default: "0".) - - __LeaveStreamsUnattached - - If true, Tor will not automatically attach new streams to circuits; - instead, the controller must attach them with ATTACHSTREAM. If the - controller does not attach the streams, their data will never be routed. - - (Boolean. Default: "0".) 
- - __HashedControlSessionPassword - - As HashedControlPassword, but is not saved to the torrc file by - SAVECONF. Added in Tor 0.2.0.20-rc. - - __ReloadTorrcOnSIGHUP - - If this option is true (the default), we reload the torrc from disk - every time we get a SIGHUP (from the controller or via a signal). - Otherwise, we don't. This option exists so that controllers can keep - their options from getting overwritten when a user sends Tor a HUP for - some other reason (for example, to rotate the logs). - - (Boolean. Default: "1") - -5.5. Phases from the Bootstrap status event. - - This section describes the various bootstrap phases currently reported - by Tor. Controllers should not assume that the percentages and tags - listed here will continue to match up, or even that the tags will stay - in the same order. Some phases might also be skipped (not reported) - if the associated bootstrap step is already complete, or if the phase - no longer is necessary. Only "starting" and "done" are guaranteed to - exist in all future versions. - - Current Tor versions enter these phases in order, monotonically. - Future Tors MAY revisit earlier stages. - - Phase 0: - tag=starting summary="Starting" - - Tor starts out in this phase. - - Phase 5: - tag=conn_dir summary="Connecting to directory mirror" - - Tor sends this event as soon as Tor has chosen a directory mirror -- - e.g. one of the authorities if bootstrapping for the first time or - after a long downtime, or one of the relays listed in its cached - directory information otherwise. - - Tor will stay at this phase until it has successfully established - a TCP connection with some directory mirror. Problems in this phase - generally happen because Tor doesn't have a network connection, or - because the local firewall is dropping SYN packets. 
- - Phase 10: - tag=handshake_dir summary="Finishing handshake with directory mirror" - - This event occurs when Tor establishes a TCP connection with a relay used - as a directory mirror (or its https proxy if it's using one). Tor remains - in this phase until the TLS handshake with the relay is finished. - - Problems in this phase generally happen because Tor's firewall is - doing more sophisticated MITM attacks on it, or doing packet-level - keyword recognition of Tor's handshake. - - Phase 15: - tag=onehop_create summary="Establishing one-hop circuit for dir info" - - Once TLS is finished with a relay, Tor will send a CREATE_FAST cell - to establish a one-hop circuit for retrieving directory information. - It will remain in this phase until it receives the CREATED_FAST cell - back, indicating that the circuit is ready. - - Phase 20: - tag=requesting_status summary="Asking for networkstatus consensus" - - Once we've finished our one-hop circuit, we will start a new stream - for fetching the networkstatus consensus. We'll stay in this phase - until we get the 'connected' relay cell back, indicating that we've - established a directory connection. - - Phase 25: - tag=loading_status summary="Loading networkstatus consensus" - - Once we've established a directory connection, we will start fetching - the networkstatus consensus document. This could take a while; this - phase is a good opportunity for using the "progress" keyword to indicate - partial progress. - - This phase could stall if the directory mirror we picked doesn't - have a copy of the networkstatus consensus so we have to ask another, - or it does give us a copy but we don't find it valid. - - Phase 40: - tag=loading_keys summary="Loading authority key certs" - - Sometimes when we've finished loading the networkstatus consensus, - we find that we don't have all the authority key certificates for the - keys that signed the consensus. 
At that point we put the consensus we - fetched on hold and fetch the keys so we can verify the signatures. - - Phase 45 - tag=requesting_descriptors summary="Asking for relay descriptors" - - Once we have a valid networkstatus consensus and we've checked all - its signatures, we start asking for relay descriptors. We stay in this - phase until we have received a 'connected' relay cell in response to - a request for descriptors. - - Phase 50: - tag=loading_descriptors summary="Loading relay descriptors" - - We will ask for relay descriptors from several different locations, - so this step will probably make up the bulk of the bootstrapping, - especially for users with slow connections. We stay in this phase until - we have descriptors for at least 1/4 of the usable relays listed in - the networkstatus consensus. This phase is also a good opportunity to - use the "progress" keyword to indicate partial steps. - - Phase 80: - tag=conn_or summary="Connecting to entry guard" - - Once we have a valid consensus and enough relay descriptors, we choose - some entry guards and start trying to build some circuits. This step - is similar to the "conn_dir" phase above; the only difference is - the context. - - If a Tor starts with enough recent cached directory information, - its first bootstrap status event will be for the conn_or phase. - - Phase 85: - tag=handshake_or summary="Finishing handshake with entry guard" - - This phase is similar to the "handshake_dir" phase, but it gets reached - if we finish a TCP connection to a Tor relay and we have already reached - the "conn_or" phase. We'll stay in this phase until we complete a TLS - handshake with a Tor relay. - - Phase 90: - tag=circuit_create summary="Establishing circuits" - - Once we've finished our TLS handshake with an entry guard, we will - set about trying to make some 3-hop circuits in case we need them soon. - - Phase 100: - tag=done summary="Done" - - A full 3-hop exit circuit has been established. 
Tor is ready to handle - application connections now. - diff --git a/doc/spec/dir-spec-v1.txt b/doc/spec/dir-spec-v1.txt deleted file mode 100644 index a92fc7999a..0000000000 --- a/doc/spec/dir-spec-v1.txt +++ /dev/null @@ -1,314 +0,0 @@ - - Tor Protocol Specification - - Roger Dingledine - Nick Mathewson - -0. Preliminaries - - THIS SPECIFICATION IS OBSOLETE. - - This document specifies the Tor directory protocol as used in version - 0.1.0.x and earlier. See dir-spec.txt for a current version. - -1. Basic operation - - There is a small number of directory authorities, and a larger number of - caches. Client and servers know public keys for the directory authorities. - Tor servers periodically upload self-signed "router descriptors" to the - directory authorities. Each authority publishes a self-signed "directory" - (containing all the router descriptors it knows, and a statement on which - are running) and a self-signed "running routers" document containing only - the statement on which routers are running. - - All Tors periodically download these documents, downloading the directory - less frequently than they do the "running routers" document. Clients - preferentially download from caches rather than authorities. - -1.1. Document format - - Router descriptors, directories, and running-routers documents all obey the - following lightweight extensible information format. - - The highest level object is a Document, which consists of one or more - Items. Every Item begins with a KeywordLine, followed by one or more - Objects. A KeywordLine begins with a Keyword, optionally followed by - whitespace and more non-newline characters, and ends with a newline. A - Keyword is a sequence of one or more characters in the set [A-Za-z0-9-]. - An Object is a block of encoded data in pseudo-Open-PGP-style - armor. (cf. 
RFC 2440) - - More formally: - - Document ::= (Item | NL)+ - Item ::= KeywordLine Object* - KeywordLine ::= Keyword NL | Keyword WS ArgumentsChar+ NL - Keyword = KeywordChar+ - KeywordChar ::= 'A' ... 'Z' | 'a' ... 'z' | '0' ... '9' | '-' - ArgumentChar ::= any printing ASCII character except NL. - WS = (SP | TAB)+ - Object ::= BeginLine Base-64-encoded-data EndLine - BeginLine ::= "-----BEGIN " Keyword "-----" NL - EndLine ::= "-----END " Keyword "-----" NL - - The BeginLine and EndLine of an Object must use the same keyword. - - When interpreting a Document, software MUST reject any document containing a - KeywordLine that starts with a keyword it doesn't recognize. - - The "opt" keyword is reserved for non-critical future extensions. All - implementations MUST ignore any item of the form "opt keyword ....." when - they would not recognize "keyword ....."; and MUST treat "opt keyword ....." - as synonymous with "keyword ......" when keyword is recognized. - -2. Router descriptor format. - - Every router descriptor MUST start with a "router" Item; MUST end with a - "router-signature" Item and an extra NL; and MUST contain exactly one - instance of each of the following Items: "published" "onion-key" "link-key" - "signing-key" "bandwidth". Additionally, a router descriptor MAY contain - any number of "accept", "reject", "fingerprint", "uptime", and "opt" Items. - Other than "router" and "router-signature", the items may appear in any - order. - - The items' formats are as follows: - "router" nickname address ORPort SocksPort DirPort - - Indicates the beginning of a router descriptor. "address" - must be an IPv4 address in dotted-quad format. The last - three numbers indicate the TCP ports at which this OR exposes - functionality. ORPort is a port at which this OR accepts TLS - connections for the main OR protocol; SocksPort is deprecated and - should always be 0; and DirPort is the port at which this OR accepts - directory-related HTTP connections. 
If any port is not supported, - the value 0 is given instead of a port number. - - "bandwidth" bandwidth-avg bandwidth-burst bandwidth-observed - - Estimated bandwidth for this router, in bytes per second. The - "average" bandwidth is the volume per second that the OR is willing - to sustain over long periods; the "burst" bandwidth is the volume - that the OR is willing to sustain in very short intervals. The - "observed" value is an estimate of the capacity this server can - handle. The server remembers the max bandwidth sustained output - over any ten second period in the past day, and another sustained - input. The "observed" value is the lesser of these two numbers. - - "platform" string - - A human-readable string describing the system on which this OR is - running. This MAY include the operating system, and SHOULD include - the name and version of the software implementing the Tor protocol. - - "published" YYYY-MM-DD HH:MM:SS - - The time, in GMT, when this descriptor was generated. - - "fingerprint" - - A fingerprint (a HASH_LEN-byte of asn1 encoded public key, encoded - in hex, with a single space after every 4 characters) for this router's - identity key. A descriptor is considered invalid (and MUST be - rejected) if the fingerprint line does not match the public key. - - [We didn't start parsing this line until Tor 0.1.0.6-rc; it should - be marked with "opt" until earlier versions of Tor are obsolete.] - - "hibernating" 0|1 - - If the value is 1, then the Tor server was hibernating when the - descriptor was published, and shouldn't be used to build circuits. - - [We didn't start parsing this line until Tor 0.1.0.6-rc; it should - be marked with "opt" until earlier versions of Tor are obsolete.] - - "uptime" - - The number of seconds that this OR process has been running. - - "onion-key" NL a public key in PEM format - - This key is used to encrypt EXTEND cells for this OR. 
The key MUST - be accepted for at least XXXX hours after any new key is published in - a subsequent descriptor. - - "signing-key" NL a public key in PEM format - - The OR's long-term identity key. - - "accept" exitpattern - "reject" exitpattern - - These lines, in order, describe the rules that an OR follows when - deciding whether to allow a new stream to a given address. The - 'exitpattern' syntax is described below. - - "router-signature" NL Signature NL - - The "SIGNATURE" object contains a signature of the PKCS1-padded - hash of the entire router descriptor, taken from the beginning of the - "router" line, through the newline after the "router-signature" line. - The router descriptor is invalid unless the signature is performed - with the router's identity key. - - "contact" info NL - - Describes a way to contact the server's administrator, preferably - including an email address and a PGP key fingerprint. - - "family" names NL - - 'Names' is a whitespace-separated list of server nicknames. If two ORs - list one another in their "family" entries, then OPs should treat them - as a single OR for the purpose of path selection. - - For example, if node A's descriptor contains "family B", and node B's - descriptor contains "family A", then node A and node B should never - be used on the same circuit. - - "read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL - "write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL - - Declare how much bandwidth the OR has used recently. Usage is divided - into intervals of NSEC seconds. The YYYY-MM-DD HH:MM:SS field defines - the end of the most recent interval. The numbers are the number of - bytes used in the most recent intervals, ordered from oldest to newest. - - [We didn't start parsing these lines until Tor 0.1.0.6-rc; they should - be marked with "opt" until earlier versions of Tor are obsolete.] - -2.1. 
Nonterminals in routerdescriptors - - nickname ::= between 1 and 19 alphanumeric characters, case-insensitive. - - exitpattern ::= addrspec ":" portspec - portspec ::= "*" | port | port "-" port - port ::= an integer between 1 and 65535, inclusive. - addrspec ::= "*" | ip4spec | ip6spec - ipv4spec ::= ip4 | ip4 "/" num_ip4_bits | ip4 "/" ip4mask - ip4 ::= an IPv4 address in dotted-quad format - ip4mask ::= an IPv4 mask in dotted-quad format - num_ip4_bits ::= an integer between 0 and 32 - ip6spec ::= ip6 | ip6 "/" num_ip6_bits - ip6 ::= an IPv6 address, surrounded by square brackets. - num_ip6_bits ::= an integer between 0 and 128 - - Ports are required; if they are not included in the router - line, they must appear in the "ports" lines. - -3. Directory format - - A Directory begins with a "signed-directory" item, followed by one each of - the following, in any order: "recommended-software", "published", - "router-status", "dir-signing-key". It may include any number of "opt" - items. After these items, a directory includes any number of router - descriptors, and a single "directory-signature" item. - - "signed-directory" - - Indicates the start of a directory. - - "published" YYYY-MM-DD HH:MM:SS - - The time at which this directory was generated and signed, in GMT. - - "dir-signing-key" - - The key used to sign this directory; see "signing-key" for format. - - "recommended-software" comma-separated-version-list - - A list of which versions of which implementations are currently - believed to be secure and compatible with the network. - - "running-routers" whitespace-separated-list - - A description of which routers are currently believed to be up or - down. Every entry consists of an optional "!", followed by either an - OR's nickname, or "$" followed by a hexadecimal encoding of the hash - of an OR's identity key. If the "!" is included, the router is - believed not to be running; otherwise, it is believed to be running. 
- If a router's nickname is given, exactly one router of that nickname - will appear in the directory, and that router is "approved" by the - directory server. If a hashed identity key is given, that OR is not - "approved". [XXXX The 'running-routers' line is only provided for - backward compatibility. New code should parse 'router-status' - instead.] - - "router-status" whitespace-separated-list - - A description of which routers are currently believed to be up or - down, and which are verified or unverified. Contains one entry for - every router that the directory server knows. Each entry is of the - format: - - !name=$digest [Verified router, currently not live.] - name=$digest [Verified router, currently live.] - !$digest [Unverified router, currently not live.] - or $digest [Unverified router, currently live.] - - (where 'name' is the router's nickname and 'digest' is a hexadecimal - encoding of the hash of the routers' identity key). - - When parsing this line, clients should only mark a router as - 'verified' if its nickname AND digest match the one provided. - - "directory-signature" nickname-of-dirserver NL Signature - - The signature is computed by computing the digest of the - directory, from the characters "signed-directory", through the newline - after "directory-signature". This digest is then padded with PKCS.1, - and signed with the directory server's signing key. - - If software encounters an unrecognized keyword in a single router descriptor, - it MUST reject only that router descriptor, and continue using the - others. Because this mechanism is used to add 'critical' extensions to - future versions of the router descriptor format, implementation should treat - it as a normal occurrence and not, for example, report it to the user as an - error. [Versions of Tor prior to 0.1.1 did this.] - - If software encounters an unrecognized keyword in the directory header, - it SHOULD reject the entire directory. - -4. 
Network-status descriptor - - A "network-status" (a.k.a "running-routers") document is a truncated - directory that contains only the current status of a list of nodes, not - their actual descriptors. It contains exactly one of each of the following - entries. - - "network-status" - - Must appear first. - - "published" YYYY-MM-DD HH:MM:SS - - (see section 3 above) - - "router-status" list - - (see section 3 above) - - "directory-signature" NL signature - - (see section 3 above) - -5. Behavior of a directory server - - lists nodes that are connected currently - speaks HTTP on a socket, spits out directory on request - - Directory servers listen on a certain port (the DirPort), and speak a - limited version of HTTP 1.0. Clients send either GET or POST commands. - The basic interactions are: - "%s %s HTTP/1.0\r\nContent-Length: %lu\r\nHost: %s\r\n\r\n", - command, url, content-length, host. - Get "/tor/" to fetch a full directory. - Get "/tor/dir.z" to fetch a compressed full directory. - Get "/tor/running-routers" to fetch a network-status descriptor. - Post "/tor/" to post a server descriptor, with the body of the - request containing the descriptor. - - "host" is used to specify the address:port of the dirserver, so - the request can survive going through HTTP proxies. - diff --git a/doc/spec/dir-spec-v2.txt b/doc/spec/dir-spec-v2.txt deleted file mode 100644 index d1be27f3db..0000000000 --- a/doc/spec/dir-spec-v2.txt +++ /dev/null @@ -1,896 +0,0 @@ - - Tor directory protocol, version 2 - -0. Scope and preliminaries - - This directory protocol is used by Tor version 0.1.1.x and 0.1.2.x. See - dir-spec-v1.txt for information on earlier versions, and dir-spec.txt - for information on later versions. - -0.1. Goals and motivation - - There were several problems with the way Tor handles directory information - in version 0.1.0.x and earlier. Here are the problems we try to fix with - this new design, already implemented in 0.1.1.x: - 1. 
Directories were very large and use up a lot of bandwidth: clients - downloaded descriptors for all router several times an hour. - 2. Every directory authority was a trust bottleneck: if a single - directory authority lied, it could make clients believe for a time an - arbitrarily distorted view of the Tor network. - 3. Our current "verified server" system is kind of nonsensical. - - 4. Getting more directory authorities would add more points of failure - and worsen possible partitioning attacks. - - There are two problems that remain unaddressed by this design. - 5. Requiring every client to know about every router won't scale. - 6. Requiring every directory cache to know every router won't scale. - - We attempt to fix 1-4 here, and to build a solution that will work when we - figure out an answer for 5. We haven't thought at all about what to do - about 6. - -1. Outline - - There is a small set (say, around 10) of semi-trusted directory - authorities. A default list of authorities is shipped with the Tor - software. Users can change this list, but are encouraged not to do so, in - order to avoid partitioning attacks. - - Routers periodically upload signed "descriptors" to the directory - authorities describing their keys, capabilities, and other information. - Routers may act as directory mirrors (also called "caches"), to reduce - load on the directory authorities. They announce this in their - descriptors. - - Each directory authority periodically generates and signs a compact - "network status" document that lists that authority's view of the current - descriptors and status for known routers, but which does not include the - descriptors themselves. - - Directory mirrors download, cache, and re-serve network-status documents - to clients. - - Clients, directory mirrors, and directory authorities all use - network-status documents to find out when their list of routers is - out-of-date. If it is, they download any missing router descriptors. 
- Clients download missing descriptors from mirrors; mirrors and authorities - download from authorities. Descriptors are downloaded by the hash of the - descriptor, not by the server's identity key: this prevents servers from - attacking clients by giving them descriptors nobody else uses. - - All directory information is uploaded and downloaded with HTTP. - - Coordination among directory authorities is done client-side: clients - compute a vote-like algorithm among the network-status documents they - have, and base their decisions on the result. - -1.1. What's different from 0.1.0.x? - - Clients used to download a signed concatenated set of router descriptors - (called a "directory") from directory mirrors, regardless of which - descriptors had changed. - - Between downloading directories, clients would download "network-status" - documents that would list which servers were supposed to running. - - Clients would always believe the most recently published network-status - document they were served. - - Routers used to upload fresh descriptors all the time, whether their keys - and other information had changed or not. - -1.2. Document meta-format - - Router descriptors, directories, and running-routers documents all obey the - following lightweight extensible information format. - - The highest level object is a Document, which consists of one or more - Items. Every Item begins with a KeywordLine, followed by one or more - Objects. A KeywordLine begins with a Keyword, optionally followed by - whitespace and more non-newline characters, and ends with a newline. A - Keyword is a sequence of one or more characters in the set [A-Za-z0-9-]. - An Object is a block of encoded data in pseudo-Open-PGP-style - armor. (cf. RFC 2440) - - More formally: - - Document ::= (Item | NL)+ - Item ::= KeywordLine Object* - KeywordLine ::= Keyword NL | Keyword WS ArgumentsChar+ NL - Keyword = KeywordChar+ - KeywordChar ::= 'A' ... 'Z' | 'a' ... 'z' | '0' ... 
'9' | '-' - ArgumentChar ::= any printing ASCII character except NL. - WS = (SP | TAB)+ - Object ::= BeginLine Base-64-encoded-data EndLine - BeginLine ::= "-----BEGIN " Keyword "-----" NL - EndLine ::= "-----END " Keyword "-----" NL - - The BeginLine and EndLine of an Object must use the same keyword. - - When interpreting a Document, software MUST ignore any KeywordLine that - starts with a keyword it doesn't recognize; future implementations MUST NOT - require current clients to understand any KeywordLine not currently - described. - - The "opt" keyword was used until Tor 0.1.2.5-alpha for non-critical future - extensions. All implementations MUST ignore any item of the form "opt - keyword ....." when they would not recognize "keyword ....."; and MUST - treat "opt keyword ....." as synonymous with "keyword ......" when keyword - is recognized. - - Implementations before 0.1.2.5-alpha rejected any document with a - KeywordLine that started with a keyword that they didn't recognize. - Implementations MUST prefix items not recognized by older versions of Tor - with an "opt" until those versions of Tor are obsolete. - - Other implementations that want to extend Tor's directory format MAY - introduce their own items. The keywords for extension items SHOULD start - with the characters "x-" or "X-", to guarantee that they will not conflict - with keywords used by future versions of Tor. - -2. Router operation - - ORs SHOULD generate a new router descriptor whenever any of the - following events have occurred: - - - A period of time (18 hrs by default) has passed since the last - time a descriptor was generated. - - - A descriptor field other than bandwidth or uptime has changed. - - - Bandwidth has changed by at least a factor of 2 from the last time a - descriptor was generated, and at least a given interval of time - (20 mins by default) has passed since then. - - - Its uptime has been reset (by restarting). 
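The regeneration conditions listed in section 2 can be sketched in Python as follows. This is only an illustration, not Tor's implementation: the function and constant names are hypothetical, all times are assumed to be in seconds, and the "factor of 2" test is assumed to apply in both directions (doubling or halving).

```python
# Hypothetical names; the defaults come from the list in section 2.
DESCRIPTOR_MAX_AGE = 18 * 60 * 60       # 18 hours
BANDWIDTH_MIN_INTERVAL = 20 * 60        # 20 minutes


def bandwidth_changed_by_factor_of_two(old_bw, new_bw):
    """True if bandwidth at least doubled or at least halved (assumed reading)."""
    larger, smaller = max(old_bw, new_bw), min(old_bw, new_bw)
    return larger > 0 and larger >= 2 * smaller


def should_regenerate(now, last_generated, other_field_changed,
                      old_bw, new_bw, restarted):
    """Decide whether an OR should generate a new router descriptor."""
    age = now - last_generated
    if age >= DESCRIPTOR_MAX_AGE:
        return True                      # periodic refresh
    if other_field_changed:
        return True                      # a field other than bandwidth/uptime changed
    if restarted:
        return True                      # uptime was reset
    # Bandwidth changes only count once the minimum interval has passed.
    return (age >= BANDWIDTH_MIN_INTERVAL
            and bandwidth_changed_by_factor_of_two(old_bw, new_bw))
```

For example, a bandwidth change from 1000 to 2500 bytes per second triggers a new descriptor only once 20 minutes have elapsed since the previous one.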
- - After generating a descriptor, ORs upload it to every directory - authority they know, by posting it to the URL - - http://<hostname:port>/tor/ - -2.1. Router descriptor format - - Every router descriptor MUST start with a "router" Item; MUST end with a - "router-signature" Item and an extra NL; and MUST contain exactly one - instance of each of the following Items: "published" "onion-key" - "signing-key" "bandwidth". - - A router descriptor MAY have zero or one of each of the following Items, - but MUST NOT have more than one: "contact", "uptime", "fingerprint", - "hibernating", "read-history", "write-history", "eventdns", "platform", - "family". - - Additionally, a router descriptor MAY contain any number of "accept", - "reject", and "opt" Items. Other than "router" and "router-signature", - the items may appear in any order. - - The items' formats are as follows: - "router" nickname address ORPort SocksPort DirPort - - Indicates the beginning of a router descriptor. "address" must be an - IPv4 address in dotted-quad format. The last three numbers indicate - the TCP ports at which this OR exposes functionality. ORPort is a port - at which this OR accepts TLS connections for the main OR protocol; - SocksPort is deprecated and should always be 0; and DirPort is the - port at which this OR accepts directory-related HTTP connections. If - any port is not supported, the value 0 is given instead of a port - number. - - "bandwidth" bandwidth-avg bandwidth-burst bandwidth-observed - - Estimated bandwidth for this router, in bytes per second. The - "average" bandwidth is the volume per second that the OR is willing to - sustain over long periods; the "burst" bandwidth is the volume that - the OR is willing to sustain in very short intervals. The "observed" - value is an estimate of the capacity this server can handle. The - server remembers the max bandwidth sustained output over any ten - second period in the past day, and another sustained input. 
The - "observed" value is the lesser of these two numbers. - - "platform" string - - A human-readable string describing the system on which this OR is - running. This MAY include the operating system, and SHOULD include - the name and version of the software implementing the Tor protocol. - - "published" YYYY-MM-DD HH:MM:SS - - The time, in GMT, when this descriptor was generated. - - "fingerprint" - - A fingerprint (a HASH_LEN-byte of asn1 encoded public key, encoded in - hex, with a single space after every 4 characters) for this router's - identity key. A descriptor is considered invalid (and MUST be - rejected) if the fingerprint line does not match the public key. - - [We didn't start parsing this line until Tor 0.1.0.6-rc; it should - be marked with "opt" until earlier versions of Tor are obsolete.] - - "hibernating" 0|1 - - If the value is 1, then the Tor server was hibernating when the - descriptor was published, and shouldn't be used to build circuits. - - [We didn't start parsing this line until Tor 0.1.0.6-rc; it should be - marked with "opt" until earlier versions of Tor are obsolete.] - - "uptime" - - The number of seconds that this OR process has been running. - - "onion-key" NL a public key in PEM format - - This key is used to encrypt EXTEND cells for this OR. The key MUST be - accepted for at least 1 week after any new key is published in a - subsequent descriptor. - - "signing-key" NL a public key in PEM format - - The OR's long-term identity key. - - "accept" exitpattern - "reject" exitpattern - - These lines describe the rules that an OR follows when - deciding whether to allow a new stream to a given address. The - 'exitpattern' syntax is described below. The rules are considered in - order; if no rule matches, the address will be accepted. For clarity, - the last such entry SHOULD be accept *:* or reject *:*. 
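The first-match evaluation of "accept" and "reject" lines can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the function names are hypothetical, only IPv4 addrspecs from the exitpattern grammar in section 2.2 are handled (no bracketed IPv6), and the policy is assumed to be already parsed into (action, exitpattern) pairs in descriptor order.

```python
import ipaddress


def pattern_matches(pattern, addr, port):
    """Check one exitpattern ("addrspec:portspec") against an IPv4 address and port."""
    addrspec, _, portspec = pattern.rpartition(":")
    if addrspec != "*":
        # ip_network accepts "a.b.c.d", "a.b.c.d/bits" and "a.b.c.d/mask".
        network = ipaddress.ip_network(addrspec, strict=False)
        if ipaddress.ip_address(addr) not in network:
            return False
    if portspec == "*":
        return True
    low, _, high = portspec.partition("-")
    return int(low) <= port <= int(high or low)


def exit_allowed(policy, addr, port):
    """Apply the rules in order; the first matching rule decides."""
    for action, pattern in policy:
        if pattern_matches(pattern, addr, port):
            return action == "accept"
    return True  # no rule matched, so the address is accepted


policy = [
    ("reject", "10.0.0.0/8:*"),
    ("accept", "*:80"),
    ("reject", "*:*"),
]
```

With this policy, `exit_allowed(policy, "8.8.8.8", 80)` is accepted by the second rule, while any address in 10.0.0.0/8 is rejected by the first rule before the port is considered.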
- - "router-signature" NL Signature NL - - The "SIGNATURE" object contains a signature of the PKCS1-padded - hash of the entire router descriptor, taken from the beginning of the - "router" line, through the newline after the "router-signature" line. - The router descriptor is invalid unless the signature is performed - with the router's identity key. - - "contact" info NL - - Describes a way to contact the server's administrator, preferably - including an email address and a PGP key fingerprint. - - "family" names NL - - 'Names' is a space-separated list of server nicknames or - hexdigests. If two ORs list one another in their "family" entries, - then OPs should treat them as a single OR for the purpose of path - selection. - - For example, if node A's descriptor contains "family B", and node B's - descriptor contains "family A", then node A and node B should never - be used on the same circuit. - - "read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL - "write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL - - Declare how much bandwidth the OR has used recently. Usage is divided - into intervals of NSEC seconds. The YYYY-MM-DD HH:MM:SS field - defines the end of the most recent interval. The numbers are the - number of bytes used in the most recent intervals, ordered from - oldest to newest. - - [We didn't start parsing these lines until Tor 0.1.0.6-rc; they should - be marked with "opt" until earlier versions of Tor are obsolete.] - - "eventdns" bool NL - - Declare whether this version of Tor is using the newer enhanced - dns logic. Versions of Tor without eventdns SHOULD NOT be used for - reverse hostname lookups. - - [All versions of Tor before 0.1.2.2-alpha should be assumed to have - this option set to 0 if it is not present. All Tor versions at - 0.1.2.2-alpha or later should be assumed to have this option set to - 1 if it is not present. 
Until 0.1.2.1-alpha-dev, this option was - not generated, even when eventdns was in use. Versions of Tor - before 0.1.2.1-alpha-dev did not parse this option, so it should be - marked "opt". With 0.2.0.1-alpha, the old 'dnsworker' logic has - been removed, rendering this option of historical interest only.] - -2.2. Nonterminals in router descriptors - - nickname ::= between 1 and 19 alphanumeric characters, case-insensitive. - hexdigest ::= a '$', followed by 20 hexadecimal characters. - [Represents a server by the digest of its identity key.] - - exitpattern ::= addrspec ":" portspec - portspec ::= "*" | port | port "-" port - port ::= an integer between 1 and 65535, inclusive. - [Some implementations incorrectly generate ports with value 0. - Implementations SHOULD accept this, and SHOULD NOT generate it.] - - addrspec ::= "*" | ip4spec | ip6spec - ip4spec ::= ip4 | ip4 "/" num_ip4_bits | ip4 "/" ip4mask - ip4 ::= an IPv4 address in dotted-quad format - ip4mask ::= an IPv4 mask in dotted-quad format - num_ip4_bits ::= an integer between 0 and 32 - ip6spec ::= ip6 | ip6 "/" num_ip6_bits - ip6 ::= an IPv6 address, surrounded by square brackets. - num_ip6_bits ::= an integer between 0 and 128 - - bool ::= "0" | "1" - - Ports are required; if they are not included in the router - line, they must appear in the "ports" lines. - -3. Network status format - - Directory authorities generate, sign, and compress network-status - documents. Directory servers SHOULD generate a fresh network-status - document when the contents of such a document would be different from the - last one generated, and some time (at least one second, possibly longer) - has passed since the last one was generated. - - The network status document contains a preamble, a set of router status - entries, and a signature, in that order. - - We use the same meta-format as used for directories and router descriptors - in "tor-spec.txt". 
Implementations MAY insert blank lines - for clarity between sections; these blank lines are ignored. - Implementations MUST NOT depend on blank lines in any particular location. - - As used here, "whitespace" is a sequence of 1 or more tab or space - characters. - - The preamble contains: - - "network-status-version" -- A document format version. For this - specification, the version is "2". - "dir-source" -- The authority's hostname, current IP address, and - directory port, all separated by whitespace. - "fingerprint" -- A base16-encoded hash of the signing key's - fingerprint, with no additional spaces added. - "contact" -- An arbitrary string describing how to contact the - directory server's administrator. Administrators should include at - least an email address and a PGP fingerprint. - "dir-signing-key" -- The directory server's public signing key. - "client-versions" -- A comma-separated list of recommended client - versions. - "server-versions" -- A comma-separated list of recommended server - versions. - "published" -- The publication time for this network-status object. - "dir-options" -- A set of flags, in any order, separated by whitespace: - "Names" if this directory authority performs name bindings. - "Versions" if this directory authority recommends software versions. - "BadExits" if the directory authority flags nodes that it believes - are performing incorrectly as exit nodes. - "BadDirectories" if the directory authority flags nodes that it - believes are performing incorrectly as directory caches. - - The dir-options entry is optional. The "-versions" entries are required if - the "Versions" flag is present. The other entries are required and must - appear exactly once. The "network-status-version" entry must appear first; - the others may appear in any order. Implementations MUST ignore - additional arguments to the items above, and MUST ignore unrecognized - flags. 
- - For each router, the router entry contains: (This format is designed for - conciseness.) - - "r" -- followed by the following elements, in order, separated by - whitespace: - - The OR's nickname, - - A hash of its identity key, encoded in base64, with trailing = - signs removed. - - A hash of its most recent descriptor, encoded in base64, with - trailing = signs removed. (The hash is calculated as for - computing the signature of a descriptor.) - - The publication time of its most recent descriptor, in the form - YYYY-MM-DD HH:MM:SS, in GMT. - - An IP address - - An OR port - - A directory port (or "0" for none) - "s" -- A series of whitespace-separated status flags, in any order: - "Authority" if the router is a directory authority. - "BadExit" if the router is believed to be useless as an exit node - (because its ISP censors it, because it is behind a restrictive - proxy, or for some similar reason). - "BadDirectory" if the router is believed to be useless as a - directory cache (because its directory port isn't working, - its bandwidth is always throttled, or for some similar - reason). - "Exit" if the router is useful for building general-purpose exit - circuits. - "Fast" if the router is suitable for high-bandwidth circuits. - "Guard" if the router is suitable for use as an entry guard. - "Named" if the router's identity-nickname mapping is canonical, - and this authority binds names. - "Stable" if the router is suitable for long-lived circuits. - "Running" if the router is currently usable. - "Valid" if the router has been 'validated'. - "V2Dir" if the router implements this protocol. - "v" -- The version of the Tor protocol that this server is running. If - the value begins with "Tor" SP, the rest of the string is a Tor - version number, and the protocol is "The Tor protocol as supported - by the given version of Tor." 
Otherwise, if the value begins with - some other string, Tor has upgraded to a more sophisticated - protocol versioning system, and the protocol is "a version of the - Tor protocol more recent than any we recognize." - - The "r" entry for each router must appear first and is required. The - "s" entry is optional (see Section 3.1 below for how the flags are - decided). Unrecognized flags on the "s" line and extra elements - on the "r" line must be ignored. The "v" line is optional; it was not - supported until 0.1.2.5-alpha, and it must be preceded with an "opt" - until all earlier versions of Tor are obsolete. - - The signature section contains: - - "directory-signature" nickname-of-dirserver NL Signature - - Signature is a signature of this network-status document - (the document up until the signature, including the line - "directory-signature <nick>\n"), using the directory authority's - signing key. - - We compress the network status list with zlib before transmitting it. - -3.1. Establishing server status - - (This section describes how directory authorities choose which status - flags to apply to routers, as of Tor 0.1.1.18-rc. Later directory - authorities MAY do things differently, so long as clients keep working - well. Clients MUST NOT depend on the exact behaviors in this section.) - - In the below definitions, a router is considered "active" if it is - running, valid, and not hibernating. - - "Valid" -- a router is 'Valid' if it is running a version of Tor not - known to be broken, and the directory authority has not blacklisted - it as suspicious. - - "Named" -- Directory authority administrators may decide to support name - binding. If they do, then they must maintain a file of - nickname-to-identity-key mappings, and try to keep this file consistent - with other directory authorities. 
If they don't, they act as clients, and - report bindings made by other directory authorities (name X is bound to - identity Y if at least one binding directory lists it, and no directory - binds X to some other Y'.) A router is called 'Named' if the router - believes the given name should be bound to the given key. - - "Running" -- A router is 'Running' if the authority managed to connect to - it successfully within the last 30 minutes. - - "Stable" -- A router is 'Stable' if it is active, and either its - uptime is at least the median uptime for known active routers, or - its uptime is at least 30 days. Routers are never called stable if - they are running a version of Tor known to drop circuits stupidly. - (0.1.1.10-alpha through 0.1.1.16-rc are stupid this way.) - - "Fast" -- A router is 'Fast' if it is active, and its bandwidth is - in the top 7/8ths for known active routers. - - "Guard" -- A router is a possible 'Guard' if it is 'Stable' and its - bandwidth is above median for known active routers. If the total - bandwidth of active non-BadExit Exit servers is less than one third - of the total bandwidth of all active servers, no Exit is listed as - a Guard. - - "Authority" -- A router is called an 'Authority' if the authority - generating the network-status document believes it is an authority. - - "V2Dir" -- A router supports the v2 directory protocol if it has an open - directory port, and it is running a version of the directory protocol that - supports the functionality clients need. (Currently, this is - 0.1.1.9-alpha or later.) - - Directory server administrators may label some servers or IPs as - blacklisted, and elect not to include them in their network-status lists. - - Authorities SHOULD 'disable' any servers in excess of 3 on any single IP. - When there are more than 3 to choose from, authorities should first prefer - authorities to non-authorities, then prefer Running to non-Running, and - then prefer high-bandwidth to low-bandwidth. 
To 'disable' a server, the - authority *should* advertise it without the Running or Valid flag. - - Thus, the network-status list includes all non-blacklisted, - non-expired, non-superseded descriptors. - -4. Directory server operation - - All directory authorities and directory mirrors ("directory servers") - implement this section, except as noted. - -4.1. Accepting uploads (authorities only) - - When a router posts a signed descriptor to a directory authority, the - authority first checks whether it is well-formed and correctly - self-signed. If it is, the authority next verifies that the nickname - in question is not already assigned to a router with a different - public key. - Finally, the authority MAY check that the router is not blacklisted - because of its key, IP, or another reason. - - If the descriptor passes these tests, and the authority does not already - have a descriptor for a router with this public key, it accepts the - descriptor and remembers it. - - If the authority _does_ have a descriptor with the same public key, the - newly uploaded descriptor is remembered if its publication time is more - recent than the most recent old descriptor for that router, and either: - - There are non-cosmetic differences between the old descriptor and the - new one. - - Enough time has passed between the descriptors' publication times. - (Currently, 12 hours.) - - Differences between router descriptors are "non-cosmetic" if they would be - sufficient to force an upload as described in section 2 above. - - Note that the "cosmetic difference" test only applies to uploaded - descriptors, not to descriptors that the authority downloads from other - authorities. - -4.2. Downloading network-status documents (authorities and caches) - - All directory servers (authorities and mirrors) try to keep a fresh - set of network-status documents from every authority. 
To do so, - every 5 minutes, each authority asks every other authority for its - most recent network-status document. Every 15 minutes, each mirror - picks a random authority and asks it for the most recent network-status - documents for all the authorities the authority knows about (including - the chosen authority itself). - - Directory servers and mirrors remember and serve the most recent - network-status document they have from each authority. Other - network-status documents don't need to be stored. If the most recent - network-status document is over 10 days old, it is discarded anyway. - Mirrors SHOULD store and serve network-status documents from authorities - they don't recognize, but SHOULD NOT use such documents for any other - purpose. Mirrors SHOULD discard network-status documents older than 48 - hours. - -4.3. Downloading and storing router descriptors (authorities and caches) - - Periodically (currently, every 10 seconds), directory servers check - whether there are any specific descriptors (as identified by descriptor - hash in a network-status document) that they do not have and that they - are not currently trying to download. - - If so, the directory server launches requests to the authorities for these - descriptors, such that each authority is only asked for descriptors listed - in its most recent network-status. When more than one authority lists the - descriptor, we choose which to ask at random. - - If one of these downloads fails, we do not try to download that descriptor - from the authority that failed to serve it again unless we receive a newer - network-status from that authority that lists the same descriptor. - - Directory servers must potentially cache multiple descriptors for each - router. Servers must not discard any descriptor listed by any current - network-status document from any authority. 
If there is enough space to - store additional descriptors, servers SHOULD try to hold those which - clients are likely to download the most. (Currently, this is judged - based on the interval for which each descriptor seemed newest.) - - Authorities SHOULD NOT download descriptors for routers that they would - immediately reject for reasons listed in 3.1. - -4.4. HTTP URLs - - "Fingerprints" in these URLs are base-16-encoded SHA1 hashes. - - The authoritative network-status published by a host should be available at: - http://<hostname>/tor/status/authority.z - - The network-status published by a host with fingerprint - <F> should be available at: - http://<hostname>/tor/status/fp/<F>.z - - The network-status documents published by hosts with fingerprints - <F1>,<F2>,<F3> should be available at: - http://<hostname>/tor/status/fp/<F1>+<F2>+<F3>.z - - The most recent network-status documents from all known authorities, - concatenated, should be available at: - http://<hostname>/tor/status/all.z - - The most recent descriptor for a server whose identity key has a - fingerprint of <F> should be available at: - http://<hostname>/tor/server/fp/<F>.z - - The most recent descriptors for servers with identity fingerprints - <F1>,<F2>,<F3> should be available at: - http://<hostname>/tor/server/fp/<F1>+<F2>+<F3>.z - - (NOTE: Implementations SHOULD NOT download descriptors by identity key - fingerprint. This allows a corrupted server (in collusion with a cache) to - provide a unique descriptor to a client, and thereby partition that client - from the rest of the network.) 
- - The server descriptor with (descriptor) digest <D> (in hex) should be - available at: - http://<hostname>/tor/server/d/<D>.z - - The most recent descriptors with digests <D1>,<D2>,<D3> should be - available at: - http://<hostname>/tor/server/d/<D1>+<D2>+<D3>.z - - The most recent descriptor for this server should be at: - http://<hostname>/tor/server/authority.z - [Nothing in the Tor protocol uses this resource yet, but it is useful - for debugging purposes. Also, the official Tor implementations - (starting at 0.1.1.x) use this resource to test whether a server's - own DirPort is reachable.] - - A concatenated set of the most recent descriptors for all known servers - should be available at: - http://<hostname>/tor/server/all.z - - For debugging, directories SHOULD expose non-compressed objects at URLs like - the above, but without the final ".z". - Clients MUST handle compressed concatenated information in two forms: - - A concatenated list of zlib-compressed objects. - - A zlib-compressed concatenated list of objects. - Directory servers MAY generate either format: the former requires less - CPU, but the latter requires less bandwidth. - - Clients SHOULD use upper case letters (A-F) when base16-encoding - fingerprints. Servers MUST accept both upper and lower case fingerprints - in requests. - -5. Client operation: downloading information - - Every Tor that is not a directory server (that is, those that do - not have a DirPort set) implements this section. - -5.1. Downloading network-status documents - - Each client maintains an ordered list of directory authorities. - Insofar as possible, clients SHOULD all use the same ordered list. - - For each network-status document a client has, it keeps track of its - publication time *and* the time when the client retrieved it. Clients - consider a network-status document "live" if it was published within the - last 24 hours. 
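The two compressed forms that section 4.4 requires clients to handle (a concatenation of zlib-compressed objects, and a single zlib-compressed concatenation) can both be decoded by one loop. The sketch below uses Python's standard `zlib` module; the function name is hypothetical and this is not the code Tor uses.

```python
import zlib


def decompress_concatenated(data):
    """Decode one or more back-to-back zlib streams into a single byte string.

    A single stream holding concatenated objects and several streams holding
    one object each both yield the same concatenated output.
    """
    pieces = []
    while data:
        stream = zlib.decompressobj()
        pieces.append(stream.decompress(data) + stream.flush())
        if not stream.eof:
            raise ValueError("truncated zlib stream")
        # Whatever follows the end of this stream is the next stream, if any.
        data = stream.unused_data
    return b"".join(pieces)


first = b"network-status-version 2\n"
second = b"router example 192.0.2.1 9001 0 9030\n"
form_a = zlib.compress(first) + zlib.compress(second)   # list of compressed objects
form_b = zlib.compress(first + second)                  # compressed list of objects
```

Because both forms decode to the same bytes, the caller can split the result into individual documents afterwards without knowing which form the server chose.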
- - Clients try to have a live network-status document from *every* - authority, and try to periodically get new network-status documents from - each authority in rotation as follows: - - If a client is missing a live network-status document for any - authority, it tries to fetch it from a directory cache. On failure, - the client waits briefly, then tries that network-status document - again from another cache. The client does not build circuits until it - has live network-status documents from more than half the authorities - it trusts, and it has descriptors for more than 1/4 of the routers - that it believes are running. - - If the most recently _retrieved_ network-status document is over 30 - minutes old, the client attempts to download a network-status document. - When choosing which documents to download, clients treat their list of - directory authorities as a circular ring, and begin with the authority - appearing immediately after the authority for their most recently - retrieved network-status document. If this attempt fails (either it - fails to download at all, or the one it gets is not as good as the - one it has), the client retries at other caches several times, before - moving on to the next network-status document in sequence. - - Clients discard all network-status documents over 24 hours old. - - If enough mirrors (currently 4) claim not to have a given network status, - we stop trying to download that authority's network-status, until we - download a new network-status that makes us believe that the authority in - question is running. Clients should wait a little longer after each - failure. - - Clients SHOULD try to batch as many network-status requests as possible - into each HTTP GET. - - (Note: clients can and should pick caches based on the network-status - information they have: once they have first fetched network-status info - from an authority, they should not need to go to the authority directly - again.) - -5.2. 
Downloading and storing router descriptors - - Clients try to have the best descriptor for each router. A descriptor is - "best" if: - * It is the most recently published descriptor listed for that router - by at least two network-status documents. - OR, - * No descriptor for that router is listed by two or more - network-status documents, and it is the most recently published - descriptor listed by any network-status document. - - Periodically (currently every 10 seconds) clients check whether there are - any "downloadable" descriptors. A descriptor is downloadable if: - - It is the "best" descriptor for some router. - - The descriptor was published at least 10 minutes in the past. - (This prevents clients from trying to fetch descriptors that the - mirrors have probably not yet retrieved and cached.) - - The client does not currently have it. - - The client is not currently trying to download it. - - The client would not discard it immediately upon receiving it. - - The client thinks it is running and valid (see 6.1 below). - - If at least 16 known routers have downloadable descriptors, or if - enough time (currently 10 minutes) has passed since the last time the - client tried to download descriptors, it launches requests for all - downloadable descriptors, as described in 5.3 below. - - When a descriptor download fails, the client notes it, and does not - consider the descriptor downloadable again until a certain amount of time - has passed. (Currently 0 seconds for the first failure, 60 seconds for the - second, 5 minutes for the third, 10 minutes for the fourth, and 1 day - thereafter.) Periodically (currently once an hour) clients reset the - failure count. - - No descriptors are downloaded until the client has downloaded more than - half of the network-status documents. 
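The "best" descriptor rule and the retry schedule from section 5.2 can be sketched in Python as follows. The names are hypothetical, publication times are assumed to be comparable integers (for example Unix seconds), and each input pair is assumed to come from a different network-status document that lists the router.

```python
from collections import Counter

# Delay in seconds after the 1st, 2nd, 3rd and 4th failure; 1 day thereafter.
RETRY_DELAYS = [0, 60, 5 * 60, 10 * 60]
ONE_DAY = 24 * 60 * 60


def best_descriptor(listings):
    """Pick the best descriptor digest for one router.

    listings: (digest, published) pairs, one per network-status document.
    """
    if not listings:
        return None
    counts = Counter(digest for digest, _ in listings)
    published = dict(listings)
    # Prefer descriptors that at least two network-status documents list.
    agreed = [digest for digest, n in counts.items() if n >= 2]
    candidates = agreed or list(counts)
    return max(candidates, key=lambda digest: published[digest])


def retry_delay(failures):
    """Seconds to wait before a descriptor is downloadable again (failures >= 1)."""
    if failures <= len(RETRY_DELAYS):
        return RETRY_DELAYS[failures - 1]
    return ONE_DAY
```

Note that a newer descriptor listed by only one network-status document loses to an older one listed by two, which limits the influence any single authority has on what clients fetch.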
- - Clients retain the most recent descriptor they have downloaded for each - router so long as it is not too old (currently, 48 hours), OR so long as - it is recommended by at least one networkstatus AND no "better" - descriptor has been downloaded. [Versions of Tor before 0.1.2.3-alpha - would discard descriptors simply for being published too far in the past.] - [The code seems to discard descriptors in all cases after they're 5 - days old. True? -RD] - -5.3. Managing downloads - - When a client has no live network-status documents, it downloads - network-status documents from a randomly chosen authority. In all other - cases, the client downloads from mirrors randomly chosen from among those - believed to be V2 directory servers. (This information comes from the - network-status documents; see 6 below.) - - When downloading multiple router descriptors, the client chooses multiple - mirrors so that: - - At least 3 different mirrors are used, except when this would result - in more than one request for under 4 descriptors. - - No more than 128 descriptors are requested from a single mirror. - - Otherwise, as few mirrors as possible are used. - After choosing mirrors, the client divides the descriptors among them - randomly. - - After receiving any response client MUST discard any network-status - documents and descriptors that it did not request. - -6. Using directory information - - Everyone besides directory authorities uses the approaches in this section - to decide which servers to use and what their keys are likely to be. - (Directory authorities just believe their own opinions, as in 3.1 above.) - -6.1. Choosing routers for circuits. - - Tor implementations only pay attention to "live" network-status documents. - A network status is "live" if it is the most recently downloaded network - status document for a given directory server, and the server is a - directory server trusted by the client, and the network-status document is - no more than 1 day old. 
- - For time-sensitive information, Tor implementations focus on "recent" - network-status documents. A network status is "recent" if it is live, and - if it was published in the last 60 minutes. (If there are fewer - than 3 such documents, the most recently published 3 are "recent." If - there are fewer than 3 in all, all are "recent.") - - Circuits SHOULD NOT be built until the client has enough directory - information: network-statuses (or failed attempts to download - network-statuses) for all authorities, network-statuses for more than - half of the authorities, and descriptors for at least 1/4 of the servers - believed to be running. - - A server is "listed" if it is included by more than half of the live - network status documents. Clients SHOULD NOT use unlisted servers. - - Clients believe the flags "Valid", "Exit", "Fast", "Guard", "Stable", and - "V2Dir" about a given router when they are asserted by more than half of - the live network-status documents. Clients believe the flag "Running" if - it is listed by more than half of the recent network-status documents. - - These flags are used as follows: - - - Clients SHOULD NOT use non-'Valid' or non-'Running' routers unless - requested to do so. - - - Clients SHOULD NOT use non-'Fast' routers for any purpose other than - very-low-bandwidth circuits (such as introduction circuits). - - - Clients SHOULD NOT use non-'Stable' routers for circuits that are - likely to need to be open for a very long time (such as those used for - IRC or SSH connections). - - - Clients SHOULD NOT choose non-'Guard' nodes when picking entry guard - nodes. - - - Clients SHOULD NOT download directory information from non-'V2Dir' - caches. - -6.2. Managing naming - - In order to provide human-memorable names for individual server - identities, some directory servers bind names to IDs. 
Clients handle - names in two ways: - - When a client encounters a name it has not mapped before: - - If all the live "Naming" network-status documents the client has - claim that the name binds to some identity ID, and the client has at - least three live network-status documents, the client maps the name to - ID. - - When a user tries to refer to a router with a name that does not have a - mapping under the above rules, the implementation SHOULD warn the user. - After giving the warning, the implementation MAY use a router that at - least one Naming authority maps the name to, so long as no other naming - authority maps that name to a different router. If no Naming authority - maps the name to a router, the implementation MAY use any router that - advertises the name. - - Not every router needs a nickname. When a router doesn't configure a - nickname, it publishes with the default nickname "Unnamed". Authorities - SHOULD NOT ever mark a router with this nickname as Named; client software - SHOULD NOT ever use a router in response to a user request for a router - called "Unnamed". - -6.3. Software versions - - An implementation of Tor SHOULD warn when it has fetched (or has - attempted to fetch and failed four consecutive times) a network-status - for each authority, and it is running a software version - not listed on more than half of the live "Versioning" network-status - documents. - -6.4. Warning about a router's status. - - If a router tries to publish its descriptor to a Naming authority - that has its nickname mapped to another key, the router SHOULD - warn the operator that it is either using the wrong key or is using - an already claimed nickname. 
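The name-binding rule of section 6.2 above (all live "Naming" documents agree, and at least three live documents are held) might be sketched as follows. The dict layout and function name are hypothetical, and returning no mapping when the client holds no Naming documents at all is an assumption the text does not settle.

```python
MIN_LIVE_STATUSES = 3  # "at least three live network-status documents"


def map_name(name, live_statuses):
    """Return the identity to bind `name` to, or None if no mapping applies.

    Each entry of `live_statuses` is a hypothetical dict with a boolean
    "naming" key and a "bindings" dict from nickname to identity.
    """
    if len(live_statuses) < MIN_LIVE_STATUSES:
        return None
    naming = [s for s in live_statuses if s["naming"]]
    if not naming:
        return None  # assumption: no Naming documents means no mapping
    identities = {s["bindings"].get(name) for s in naming}
    # Every Naming document must bind the name, and all to the same identity.
    if len(identities) == 1 and None not in identities:
        return identities.pop()
    return None
```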
- - If a router has fetched (or attempted to fetch and failed four - consecutive times) a network-status for every authority, and at - least one of the authorities is "Naming", and no live "Naming" - authorities publish a binding for the router's nickname, the - router MAY remind the operator that the chosen nickname is not - bound to this key at the authorities, and suggest contacting the - authority operators. - - ... - -6.5. Router protocol versions - - A client should believe that a router supports a given feature if that - feature is supported by the router or protocol versions in more than half - of the live networkstatus's "v" entries for that router. In other words, - if the "v" entries for some router are: - v Tor 0.0.8pre1 (from authority 1) - v Tor 0.1.2.11 (from authority 2) - v FutureProtocolDescription 99 (from authority 3) - then the client should believe that the router supports any feature - supported by 0.1.2.11. - - This is currently equivalent to believing the median declared version for - a router in all live networkstatuses. - -7. Standards compliance - - All clients and servers MUST support HTTP 1.0. - -7.1. HTTP headers - - Servers MAY set the Content-Length: header. Servers SHOULD set - Content-Encoding to "deflate" or "identity". - - Servers MAY include an X-Your-Address-Is: header, whose value is the - apparent IP address of the client connecting to them (as a dotted quad). - For directory connections tunneled over a BEGIN_DIR stream, servers SHOULD - report the IP from which the circuit carrying the BEGIN_DIR stream reached - them. [Servers before version 0.1.2.5-alpha reported 127.0.0.1 for all - BEGIN_DIR-tunneled connections.] - - Servers SHOULD disable caching of multiple network statuses or multiple - router descriptors. Servers MAY enable caching of single descriptors, - single network statuses, the list of all router descriptors, a v1 - directory, or a v1 running routers document. XXX mention times. - -7.2. 
HTTP status codes - - XXX We should write down what return codes dirservers send in what situations. - diff --git a/doc/spec/dir-spec.txt b/doc/spec/dir-spec.txt deleted file mode 100644 index 49b64e8a92..0000000000 --- a/doc/spec/dir-spec.txt +++ /dev/null @@ -1,2440 +0,0 @@ - - Tor directory protocol, version 3 - -0. Scope and preliminaries - - This directory protocol is used by Tor version 0.2.0.x-alpha and later. - See dir-spec-v1.txt for information on the protocol used up to the - 0.1.0.x series, and dir-spec-v2.txt for information on the protocol - used by the 0.1.1.x and 0.1.2.x series. - - Caches and authorities must still support older versions of the - directory protocols, until the versions of Tor that require them are - finally out of commission. - - This document merges and supersedes the following proposals: - - 101 Voting on the Tor Directory System - 103 Splitting identity key from regularly used signing key - 104 Long and Short Router Descriptors - - XXX when to download certificates. - XXX timeline - XXX fill in XXXXs - - The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL - NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and - "OPTIONAL" in this document are to be interpreted as described in - RFC 2119. - -0.1. History - - The earliest versions of Onion Routing shipped with a list of known - routers and their keys. When the set of routers changed, users needed to - fetch a new list. - - The Version 1 Directory protocol - -------------------------------- - - Early versions of Tor (0.0.2) introduced "Directory authorities": servers - that served signed "directory" documents containing a list of signed - "router descriptors", along with a short summary of the status of each - router. Thus, clients could get up-to-date information on the state of - the network automatically, and be certain that the list they were getting - was attested by a trusted directory authority.
- - Later versions (0.0.8) added directory caches, which download - directories from the authorities and serve them to clients. Non-caches - fetch from the caches in preference to fetching from the authorities, thus - distributing bandwidth requirements. - - Also added during the version 1 directory protocol were "router status" - documents: short documents that listed only the up/down status of the - routers on the network, rather than a complete list of all the - descriptors. Clients and caches would fetch these documents far more - frequently than they would fetch full directories. - - The Version 2 Directory Protocol - -------------------------------- - - During the Tor 0.1.1.x series, Tor revised its handling of directory - documents in order to address two major problems: - - * Directories had grown quite large (over 1MB), and most directory - downloads consisted mainly of router descriptors that clients - already had. - - * Every directory authority was a trust bottleneck: if a single - directory authority lied, it could make clients believe for a time - an arbitrarily distorted view of the Tor network. (Clients - trusted the most recent signed document they downloaded.) Thus, - adding more authorities would make the system less secure, not - more. - - To address these, we extended the directory protocol so that - authorities now published signed "network status" documents. Each - network status listed, for every router in the network: a hash of its - identity key, a hash of its most recent descriptor, and a summary of - what the authority believed about its status. Clients would download - the authorities' network status documents in turn, and believe - statements about routers iff they were attested to by more than half of - the authorities. - - Instead of downloading all router descriptors at once, clients - downloaded only the descriptors that they did not have. 
Descriptors - were indexed by their digests, in order to prevent malicious caches - from giving different versions of a router descriptor to different - clients. - - Routers began working harder to upload new descriptors only when their - contents were substantially changed. - - -0.2. Goals of the version 3 protocol - - Version 3 of the Tor directory protocol tries to solve the following - issues: - - * A great deal of bandwidth used to transmit router descriptors was - used by two fields that are not actually used by Tor routers - (namely read-history and write-history). We save about 60% by - moving them into a separate document that most clients do not - fetch or use. - - * It was possible under certain perverse circumstances for clients - to download an unusual set of network status documents, thus - partitioning themselves from clients who have a more recent and/or - typical set of documents. Even under the best of circumstances, - clients were sensitive to the ages of the network status documents - they downloaded. Therefore, instead of having the clients - correlate multiple network status documents, we have the - authorities collectively vote on a single consensus network status - document. - - * The most sensitive data in the entire network (the identity keys - of the directory authorities) needed to be stored unencrypted so - that the authorities can sign network-status documents on the fly. - Now, the authorities' identity keys are stored offline, and used - to certify medium-term signing keys that can be rotated. - -0.3. Some Remaining questions - - Things we could solve on a v3 timeframe: - - The SHA-1 hash is showing its age. We should do something about our - dependency on it. We could probably future-proof ourselves here in - this revision, at least so far as documents from the authorities are - concerned. - - Too many things about the authorities are hardcoded by IP. - - Perhaps we should start accepting longer identity keys for routers - too. 
- - Things to solve eventually: - - Requiring every client to know about every router won't scale forever. - - Requiring every directory cache to know every router won't scale - forever. - - -1. Outline - - There is a small set (say, around 5-10) of semi-trusted directory - authorities. A default list of authorities is shipped with the Tor - software. Users can change this list, but are encouraged not to do so, - in order to avoid partitioning attacks. - - Every authority has a very-secret, long-term "Authority Identity Key". - This is stored encrypted and/or offline, and is used to sign "key - certificate" documents. Every key certificate contains a medium-term - (3-12 months) "authority signing key", that is used by the authority to - sign other directory information. (Note that the authority identity - key is distinct from the router identity key that the authority uses - in its role as an ordinary router.) - - Routers periodically upload signed "router descriptors" to the - directory authorities describing their keys, capabilities, and other - information. Routers may also upload signed "extra info documents" - containing information that is not required for the Tor protocol. - Directory authorities serve router descriptors indexed by router - identity, or by hash of the descriptor. - - Routers may act as directory caches to reduce load on the directory - authorities. They announce this in their descriptors. - - Periodically, each directory authority generates a view of - the current descriptors and status for known routers. They send a - signed summary of this view (a "status vote") to the other - authorities. The authorities compute the result of this vote, and sign - a "consensus status" document containing the result of the vote. - - Directory caches download, cache, and re-serve consensus documents. - - Clients, directory caches, and directory authorities all use consensus - documents to find out when their list of routers is out-of-date.
- (Directory authorities also use vote statuses.) If it is, they download - any missing router descriptors. Clients download missing descriptors - from caches; caches and authorities download from authorities. - Descriptors are downloaded by the hash of the descriptor, not by the - server's identity key: this prevents servers from attacking clients by - giving them descriptors nobody else uses. - - All directory information is uploaded and downloaded with HTTP. - - [Authorities also generate and caches also cache documents produced and - used by earlier versions of this protocol; see dir-spec-v1.txt and - dir-spec-v2.txt for notes on those versions.] - -1.1. What's different from version 2? - - Clients used to download multiple network status documents, - corresponding roughly to "status votes" above. They would compute the - result of the vote on the client side. - - Authorities used to sign documents using the same private keys they used - for their roles as routers. This forced them to keep these extremely - sensitive keys in memory unencrypted. - - All of the information in extra-info documents used to be kept in the - main descriptors. - -1.2. Document meta-format - - Router descriptors, directories, and running-routers documents all obey the - following lightweight extensible information format. - - The highest level object is a Document, which consists of one or more - Items. Every Item begins with a KeywordLine, followed by zero or more - Objects. A KeywordLine begins with a Keyword, optionally followed by - whitespace and more non-newline characters, and ends with a newline. A - Keyword is a sequence of one or more characters in the set [A-Za-z0-9-]. - An Object is a block of encoded data in pseudo-Open-PGP-style - armor. (cf. RFC 2440) - - More formally: - - NL = The ascii LF character (hex value 0x0a). 
- Document ::= (Item | NL)+ - Item ::= KeywordLine Object* - KeywordLine ::= Keyword NL | Keyword WS ArgumentChar+ NL - Keyword = KeywordChar+ - KeywordChar ::= 'A' ... 'Z' | 'a' ... 'z' | '0' ... '9' | '-' - ArgumentChar ::= any printing ASCII character except NL. - WS = (SP | TAB)+ - Object ::= BeginLine Base-64-encoded-data EndLine - BeginLine ::= "-----BEGIN " Keyword "-----" NL - EndLine ::= "-----END " Keyword "-----" NL - - The BeginLine and EndLine of an Object must use the same keyword. - - When interpreting a Document, software MUST ignore any KeywordLine that - starts with a keyword it doesn't recognize; future implementations MUST NOT - require current clients to understand any KeywordLine not currently - described. - - The "opt" keyword was used until Tor 0.1.2.5-alpha for non-critical future - extensions. All implementations MUST ignore any item of the form "opt - keyword ....." when they would not recognize "keyword ....."; and MUST - treat "opt keyword ....." as synonymous with "keyword ......" when keyword - is recognized. - - Implementations before 0.1.2.5-alpha rejected any document with a - KeywordLine that started with a keyword that they didn't recognize. - When generating documents that need to be read by older versions of Tor, - implementations MUST prefix items not recognized by older versions of - Tor with an "opt" until those versions of Tor are obsolete. [Note that - key certificates, status vote documents, extra info documents, and - status consensus documents will never be read by older versions of Tor.] - - Other implementations that want to extend Tor's directory format MAY - introduce their own items. The keywords for extension items SHOULD start - with the characters "x-" or "X-", to guarantee that they will not conflict - with keywords used by future versions of Tor. - - In our document descriptions below, we tag Items with a multiplicity in - brackets. 
Possible tags are: - - "At start, exactly once": These items MUST occur in every instance of - the document type, and MUST appear exactly once, and MUST be the - first item in their documents. - - "Exactly once": These items MUST occur exactly one time in every - instance of the document type. - - "At end, exactly once": These items MUST occur in every instance of - the document type, and MUST appear exactly once, and MUST be the - last item in their documents. - - "At most once": These items MAY occur zero or one times in any - instance of the document type, but MUST NOT occur more than once. - - "Any number": These items MAY occur zero, one, or more times in any - instance of the document type. - - "Once or more": These items MUST occur at least once in any instance - of the document type, and MAY occur more. - -1.3. Signing documents - - Every signable document below is signed in a similar manner, using a - given "Initial Item", a final "Signature Item", a digest algorithm, and - a signing key. - - The Initial Item must be the first item in the document. - - The Signature Item has the following format: - - <signature item keyword> [arguments] NL SIGNATURE NL - - The "SIGNATURE" Object contains a signature (using the signing key) of - the PKCS1-padded digest of the entire document, taken from the - beginning of the Initial Item, through the newline after the Signature - Item's keyword and its arguments. - - Unless otherwise specified, the digest algorithm is SHA-1. - - All documents are invalid unless signed with the correct signing key. - - The "Digest" of a document, unless stated otherwise, is its digest *as - signed by this signature scheme*. - -1.4. Voting timeline - - Every consensus document has a "valid-after" (VA) time, a "fresh-until" - (FU) time and a "valid-until" (VU) time. VA MUST precede FU, which MUST - in turn precede VU. Times are chosen so that every consensus will be - "fresh" until the next consensus becomes valid, and "valid" for a while - after.
At least 3 consensuses should be valid at any given time. - - The timeline for a given consensus is as follows: - - VA-DistSeconds-VoteSeconds: The authorities exchange votes. - - VA-DistSeconds-VoteSeconds/2: The authorities try to download any - votes they don't have. - - VA-DistSeconds: The authorities calculate the consensus and exchange - signatures. - - VA-DistSeconds/2: The authorities try to download any signatures - they don't have. - - VA: All authorities have a multiply signed consensus. - - VA ... FU: Caches download the consensus. (Note that since caches have - no way of telling what VA and FU are until they have downloaded - the consensus, they assume that the present consensus's VA is - equal to the previous one's FU, and that its FU is one interval after - that.) - - FU: The consensus is no longer the freshest consensus. - - FU ... (the current consensus's VU): Clients download the consensus. - (See note above: clients guess that the next consensus's FU will be - two intervals after the current VA.) - - VU: The consensus is no longer valid. - - VoteSeconds and DistSeconds MUST each be at least 20 seconds; FU-VA and - VU-FU MUST each be at least 5 minutes. - -2. Router operation and formats - - ORs SHOULD generate a new router descriptor and a new extra-info - document whenever any of the following events have occurred: - - - A period of time (18 hrs by default) has passed since the last - time a descriptor was generated. - - - A descriptor field other than bandwidth or uptime has changed. - - - Bandwidth has changed by a factor of 2 from the last time a - descriptor was generated, and at least a given interval of time - (20 mins by default) has passed since then. - - - Its uptime has been reset (by restarting). 
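The conditions listed above for generating a new descriptor could be combined roughly as follows. This is a sketch under stated assumptions: the function and parameter names are hypothetical, and "changed by a factor of 2" is read here as "at least doubled or at least halved".

```python
MAX_DESCRIPTOR_AGE = 18 * 60 * 60  # 18 hrs by default
MIN_BANDWIDTH_INTERVAL = 20 * 60   # 20 mins by default


def bandwidth_changed_by_factor_2(old_bw, new_bw):
    """Assumed reading: the larger value is at least twice the smaller."""
    low, high = sorted((old_bw, new_bw))
    return high > 0 and high >= 2 * low


def should_regenerate(seconds_since_last, other_field_changed,
                      old_bw, new_bw, uptime_reset):
    """Return True if any of the listed events has occurred."""
    if seconds_since_last >= MAX_DESCRIPTOR_AGE:
        return True
    if other_field_changed:  # a field other than bandwidth or uptime
        return True
    if uptime_reset:         # the OR was restarted
        return True
    return (seconds_since_last >= MIN_BANDWIDTH_INTERVAL
            and bandwidth_changed_by_factor_2(old_bw, new_bw))
```

The source itself notes that its list of events is incomplete, so a real implementation would check more than this sketch does.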
- - [XXX this list is incomplete; see router_differences_are_cosmetic() - in routerlist.c for others] - - ORs SHOULD NOT publish a new router descriptor or extra-info document - if none of the above events have occurred and not much time has passed - (12 hours by default). - - After generating a descriptor, ORs upload them to every directory - authority they know, by posting them (in order) to the URL - - http://<hostname:port>/tor/ - -2.1. Router descriptor format - - Router descriptors consist of the following items. For backward - compatibility, there should be an extra NL at the end of each router - descriptor. - - In lines that take multiple arguments, extra arguments SHOULD be - accepted and ignored. Many of the nonterminals below are defined in - section 2.3. - - "router" nickname address ORPort SOCKSPort DirPort NL - - [At start, exactly once.] - - Indicates the beginning of a router descriptor. "nickname" must be a - valid router nickname as specified in 2.3. "address" must be an IPv4 - address in dotted-quad format. The last three numbers indicate the - TCP ports at which this OR exposes functionality. ORPort is a port at - which this OR accepts TLS connections for the main OR protocol; - SOCKSPort is deprecated and should always be 0; and DirPort is the - port at which this OR accepts directory-related HTTP connections. If - any port is not supported, the value 0 is given instead of a port - number. (At least one of DirPort and ORPort SHOULD be set; - authorities MAY reject any descriptor with both DirPort and ORPort of - 0.) - - "bandwidth" bandwidth-avg bandwidth-burst bandwidth-observed NL - - [Exactly once] - - Estimated bandwidth for this router, in bytes per second. The - "average" bandwidth is the volume per second that the OR is willing to - sustain over long periods; the "burst" bandwidth is the volume that - the OR is willing to sustain in very short intervals. The "observed" - value is an estimate of the capacity this server can handle. 
The - server remembers the maximum sustained output bandwidth over any ten - second period in the past day, and the maximum sustained input bandwidth - likewise. The "observed" value is the lesser of these two numbers. - - "platform" string NL - - [At most once] - - A human-readable string describing the system on which this OR is - running. This MAY include the operating system, and SHOULD include - the name and version of the software implementing the Tor protocol. - - "published" YYYY-MM-DD HH:MM:SS NL - - [Exactly once] - - The time, in GMT, when this descriptor (and its corresponding - extra-info document if any) was generated. - - "fingerprint" fingerprint NL - - [At most once] - - A fingerprint (a HASH_LEN-byte hash of the ASN.1-encoded public key, - encoded in hex, with a single space after every 4 characters) for this - router's identity key. A descriptor is considered invalid (and MUST be - rejected) if the fingerprint line does not match the public key. - - [We didn't start parsing this line until Tor 0.1.0.6-rc; it should - be marked with "opt" until earlier versions of Tor are obsolete.] - - "hibernating" bool NL - - [At most once] - - If the value is 1, then the Tor server was hibernating when the - descriptor was published, and shouldn't be used to build circuits. - - [We didn't start parsing this line until Tor 0.1.0.6-rc; it should be - marked with "opt" until earlier versions of Tor are obsolete.] - - "uptime" number NL - - [At most once] - - The number of seconds that this OR process has been running. - - "onion-key" NL a public key in PEM format - - [Exactly once] - - This key is used to encrypt EXTEND cells for this OR. The key MUST be - accepted for at least 1 week after any new key is published in a - subsequent descriptor. It MUST be 1024 bits. - - "signing-key" NL a public key in PEM format - - [Exactly once] - - The OR's long-term identity key. It MUST be 1024 bits.
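The "fingerprint" layout described above (hex digits with a single space after every 4 characters) can be illustrated as below. The sketch assumes the hash is SHA-1 (so HASH_LEN is 20 bytes), upper-case hex, and no trailing space; the function names are hypothetical, and the caller is assumed to already hold the ASN.1 (DER) encoding of the identity key.

```python
import hashlib


def format_fingerprint(digest):
    """Hex-encode `digest` in groups of 4 characters separated by spaces."""
    hexed = digest.hex().upper()  # upper case is an assumption
    return " ".join(hexed[i:i + 4] for i in range(0, len(hexed), 4))


def fingerprint_of_key(der_encoded_key):
    """Fingerprint of an already ASN.1-encoded public key (SHA-1 assumed)."""
    return format_fingerprint(hashlib.sha1(der_encoded_key).digest())
```

A verifier would compute this value from the "signing-key" item and compare it with the "fingerprint" line, rejecting the descriptor on mismatch.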
- - "accept" exitpattern NL - "reject" exitpattern NL - - [Any number] - - These lines describe an "exit policy": the rules that an OR follows - when deciding whether to allow a new stream to a given address. The - 'exitpattern' syntax is described below. There MUST be at least one - such entry. The rules are considered in order; if no rule matches, - the address will be accepted. For clarity, the last such entry SHOULD - be accept *:* or reject *:*. - - "router-signature" NL Signature NL - - [At end, exactly once] - - The "SIGNATURE" object contains a signature of the PKCS1-padded - hash of the entire router descriptor, taken from the beginning of the - "router" line, through the newline after the "router-signature" line. - The router descriptor is invalid unless the signature is performed - with the router's identity key. - - "contact" info NL - - [At most once] - - Describes a way to contact the server's administrator, preferably - including an email address and a PGP key fingerprint. - - "family" names NL - - [At most once] - - 'Names' is a space-separated list of server nicknames or - hexdigests. If two ORs list one another in their "family" entries, - then OPs should treat them as a single OR for the purpose of path - selection. - - For example, if node A's descriptor contains "family B", and node B's - descriptor contains "family A", then node A and node B should never - be used on the same circuit. - - "read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL - [At most once] - "write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL - [At most once] - - Declare how much bandwidth the OR has used recently. Usage is divided - into intervals of NSEC seconds. The YYYY-MM-DD HH:MM:SS field - defines the end of the most recent interval. The numbers are the - number of bytes used in the most recent intervals, ordered from - oldest to newest. 
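A "read-history" or "write-history" line as described above could be parsed along these lines. This is a minimal sketch: the regular expression, the function name, and the returned layout are assumptions for illustration, and an optional "opt" prefix is accepted.

```python
import re
from datetime import datetime, timedelta

HISTORY_RE = re.compile(
    r"^(?:opt )?(read-history|write-history) "
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"\((\d+) s\) "
    r"(\d+(?:,\d+)*)$"
)


def parse_history(line):
    """Return (keyword, nsec, [(interval_end, byte_count), ...])."""
    match = HISTORY_RE.match(line.strip())
    if match is None:
        raise ValueError("not a bandwidth history line")
    keyword, end_text, nsec_text, numbers = match.groups()
    end = datetime.strptime(end_text, "%Y-%m-%d %H:%M:%S")
    nsec = int(nsec_text)
    values = [int(n) for n in numbers.split(",")]
    # Values run oldest to newest; the newest interval ends at `end`.
    count = len(values)
    intervals = [
        (end - timedelta(seconds=nsec * (count - 1 - i)), value)
        for i, value in enumerate(values)
    ]
    return keyword, nsec, intervals
```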
- - [We didn't start parsing these lines until Tor 0.1.0.6-rc; they should - be marked with "opt" until earlier versions of Tor are obsolete.] - - [See also migration notes in section 2.2.1.] - - "eventdns" bool NL - - [At most once] - - Declare whether this version of Tor is using the newer enhanced - dns logic. Versions of Tor with this field set to false SHOULD NOT - be used for reverse hostname lookups. - - [This option is obsolete. All current Tor servers should be presumed - to have the evdns backend.] - - "caches-extra-info" NL - - [At most once.] - - Present only if this router is a directory cache that provides - extra-info documents. - - [Versions before 0.2.0.1-alpha don't recognize this, and versions - before 0.1.2.5-alpha will reject descriptors containing it unless - it is prefixed with "opt"; it should be so prefixed until these - versions are obsolete.] - - "extra-info-digest" digest NL - - [At most once] - - "Digest" is a hex-encoded digest (using upper-case characters) of the - router's extra-info document, as signed in the router's extra-info - (that is, not including the signature). (If this field is absent, the - router is not uploading a corresponding extra-info document.) - - [Versions before 0.2.0.1-alpha don't recognize this, and versions - before 0.1.2.5-alpha will reject descriptors containing it unless - it is prefixed with "opt"; it should be so prefixed until these - versions are obsolete.] - - "hidden-service-dir" *(SP VersionNum) NL - - [At most once.] - - Present only if this router stores and serves hidden service - descriptors. If any VersionNum(s) are specified, this router - supports those descriptor versions. If none are specified, it - defaults to version 2 descriptors. - - [Versions of Tor before 0.1.2.5-alpha rejected router descriptors - with unrecognized items; the protocols line should be preceded with - an "opt" until these Tors are obsolete.]
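The defaulting rule for "hidden-service-dir" (no VersionNum means version 2) is small enough to show directly. The function name is hypothetical, and stripping a leading "opt" follows the compatibility note above; this is a sketch, not the parser Tor uses.

```python
DEFAULT_HS_DESCRIPTOR_VERSIONS = [2]  # used when no VersionNum is given


def hidden_service_dir_versions(line):
    """Return supported descriptor versions, or None for any other item."""
    words = line.split()
    if words and words[0] == "opt":
        words = words[1:]  # "opt hidden-service-dir" is synonymous
    if not words or words[0] != "hidden-service-dir":
        return None
    versions = [int(word) for word in words[1:]]
    return versions or list(DEFAULT_HS_DESCRIPTOR_VERSIONS)
```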
- - "protocols" SP "Link" SP LINK-VERSION-LIST SP "Circuit" SP - CIRCUIT-VERSION-LIST NL - - [At most once.] - - Both lists are space-separated sequences of numbers, to indicate which - protocols the server supports. As of 30 Mar 2008, specified - protocols are "Link 1 2 Circuit 1". See section 4.1 of tor-spec.txt - for more information about link protocol versions. - - [Versions of Tor before 0.1.2.5-alpha rejected router descriptors - with unrecognized items; the protocols line should be preceded with - an "opt" until these Tors are obsolete.] - - "allow-single-hop-exits" NL - - [At most once.] - - Present only if the router allows single-hop circuits to make exit - connections. Most Tor servers do not support this: this is - included for specialized controllers designed to support perspective - access and such. - - -2.2. Extra-info documents - - Extra-info documents consist of the following items: - - "extra-info" Nickname Fingerprint NL - [At start, exactly once.] - - Identifies what router this is an extra info descriptor for. - Fingerprint is encoded in hex (using upper-case letters), with - no spaces. - - "published" YYYY-MM-DD HH:MM:SS NL - - [Exactly once.] - - The time, in GMT, when this document (and its corresponding router - descriptor if any) was generated. It MUST match the published time - in the corresponding router descriptor. - - "read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL - [At most once.] - "write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL - [At most once.] - - As documented in 2.1 above. See migration notes in section 2.2.1. - - "geoip-db-digest" Digest NL - [At most once.] - - SHA1 digest of the GeoIP database file that is used to resolve IP - addresses to country codes. - - ("geoip-start" YYYY-MM-DD HH:MM:SS NL) - ("geoip-client-origins" CC=N,CC=N,... NL) - - Only generated by bridge routers (see blocking.pdf), and only - when they have been configured with a geoip database. 
- Non-bridges SHOULD NOT generate these fields. Contains a list - of mappings from two-letter country codes (CC) to the number - of clients that have connected to that bridge from that - country (approximate, and rounded up to the nearest multiple of 8 - in order to hamper traffic analysis). A country is included - only if it has at least one address. The time in - "geoip-start" is the time at which we began collecting geoip - statistics. - - "geoip-start" and "geoip-client-origins" have been replaced by - "bridge-stats-end" and "bridge-stats-ips" in 0.2.2.4-alpha. The - reason is that the measurement interval with "geoip-stats" as - determined by subtracting "geoip-start" from "published" could - have had a variable length, whereas the measurement interval in - 0.2.2.4-alpha and later is set to be exactly 24 hours long. In - order to clearly distinguish the new measurement intervals from - the old ones, the new keywords have been introduced. - - "bridge-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL - [At most once.] - - YYYY-MM-DD HH:MM:SS defines the end of the included measurement - interval of length NSEC seconds (86400 seconds by default). - - A "bridge-stats-end" line, as well as any other "bridge-*" line, - is only added when the relay has been running as a bridge for at - least 24 hours. - - "bridge-ips" CC=N,CC=N,... NL - [At most once.] - - List of mappings from two-letter country codes to the number of - unique IP addresses that have connected from that country to the - bridge and which are no known relays, rounded up to the nearest - multiple of 8. - - "dirreq-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL - [At most once.] - - YYYY-MM-DD HH:MM:SS defines the end of the included measurement - interval of length NSEC seconds (86400 seconds by default). - - A "dirreq-stats-end" line, as well as any other "dirreq-*" line, - is only added when the relay has opened its Dir port and after 24 - hours of measuring directory requests. - - "dirreq-v2-ips" CC=N,CC=N,... 
NL - [At most once.] - "dirreq-v3-ips" CC=N,CC=N,... NL - [At most once.] - - List of mappings from two-letter country codes to the number of - unique IP addresses that have connected from that country to - request a v2/v3 network status, rounded up to the nearest multiple - of 8. Only those IP addresses are counted that the directory can - answer with a 200 OK status code. - - "dirreq-v2-reqs" CC=N,CC=N,... NL - [At most once.] - "dirreq-v3-reqs" CC=N,CC=N,... NL - [At most once.] - - List of mappings from two-letter country codes to the number of - requests for v2/v3 network statuses from that country, rounded up - to the nearest multiple of 8. Only those requests are counted that - the directory can answer with a 200 OK status code. - - "dirreq-v2-share" num% NL - [At most once.] - "dirreq-v3-share" num% NL - [At most once.] - - The share of v2/v3 network status requests that the directory - expects to receive from clients based on its advertised bandwidth - compared to the overall network bandwidth capacity. Shares are - formatted in percent with two decimal places. Shares are - calculated as means over the whole 24-hour interval. - - "dirreq-v2-resp" status=num,... NL - [At most once.] - "dirreq-v3-resp" status=num,... NL - [At most once.] - - List of mappings from response statuses to the number of requests - for v2/v3 network statuses that were answered with that response - status, rounded up to the nearest multiple of 4. Only response - statuses with at least 1 response are reported. New response - statuses can be added at any time. The current list of response - statuses is as follows: - - "ok": a network status request is answered; this number - corresponds to the sum of all requests as reported in - "dirreq-v2-reqs" or "dirreq-v3-reqs", respectively, before - rounding up. - "not-enough-sigs": a version 3 network status is not signed by a - sufficient number of requested authorities. - "unavailable": a requested network status object is unavailable.
- "not-found": a requested network status is not found. - "not-modified": a network status has not been modified since the - If-Modified-Since time that is included in the request. - "busy": the directory is busy. - - "dirreq-v2-direct-dl" key=val,... NL - [At most once.] - "dirreq-v3-direct-dl" key=val,... NL - [At most once.] - "dirreq-v2-tunneled-dl" key=val,... NL - [At most once.] - "dirreq-v3-tunneled-dl" key=val,... NL - [At most once.] - - List of statistics about possible failures in the download process - of v2/v3 network statuses. Requests are either "direct" - HTTP-encoded requests over the relay's directory port, or - "tunneled" requests using a BEGIN_DIR cell over the relay's OR - port. The list of possible statistics can change, and statistics - can be left out from reporting. The current list of statistics is - as follows: - - Successful downloads and failures: - - "complete": a client has finished the download successfully. - "timeout": a download did not finish within 10 minutes after - starting to send the response. - "running": a download is still running at the end of the - measurement period for less than 10 minutes after starting to - send the response. - - Download times: - - "min", "max": smallest and largest measured bandwidth in B/s. - "d[1-4,6-9]": 1st to 4th and 6th to 9th decile of measured - bandwidth in B/s. For a given decile i, i/10 of all downloads - had a smaller bandwidth than di, and (10-i)/10 of all downloads - had a larger bandwidth than di. - "q[1,3]": 1st and 3rd quartile of measured bandwidth in B/s. One - fourth of all downloads had a smaller bandwidth than q1, one - fourth of all downloads had a larger bandwidth than q3, and the - remaining half of all downloads had a bandwidth between q1 and - q3. - "md": median of measured bandwidth in B/s. Half of the downloads - had a smaller bandwidth than md, the other half had a larger - bandwidth than md. - - "dirreq-read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM... 
NL - [At most once] - "dirreq-write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM... NL - [At most once] - - Declare how much bandwidth the OR has spent on answering directory - requests. Usage is divided into intervals of NSEC seconds. The - YYYY-MM-DD HH:MM:SS field defines the end of the most recent - interval. The numbers are the number of bytes used in the most - recent intervals, ordered from oldest to newest. - - "entry-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL - [At most once.] - - YYYY-MM-DD HH:MM:SS defines the end of the included measurement - interval of length NSEC seconds (86400 seconds by default). - - An "entry-stats-end" line, as well as any other "entry-*" - line, is first added after the relay has been running for at least - 24 hours. - - "entry-ips" CC=N,CC=N,... NL - [At most once.] - - List of mappings from two-letter country codes to the number of - unique IP addresses that have connected from that country to the - relay and which are no known other relays, rounded up to the - nearest multiple of 8. - - "cell-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL - [At most once.] - - YYYY-MM-DD HH:MM:SS defines the end of the included measurement - interval of length NSEC seconds (86400 seconds by default). - - A "cell-stats-end" line, as well as any other "cell-*" line, - is first added after the relay has been running for at least 24 - hours. - - "cell-processed-cells" num,...,num NL - [At most once.] - - Mean number of processed cells per circuit, subdivided into - deciles of circuits by the number of cells they have processed in - descending order from loudest to quietest circuits. - - "cell-queued-cells" num,...,num NL - [At most once.] - - Mean number of cells contained in queues by circuit decile. These - means are calculated by 1) determining the mean number of cells in - a single circuit between its creation and its termination and 2) - calculating the mean for all circuits in a given decile as - determined in "cell-processed-cells". 
Numbers have a precision of - two decimal places. - - "cell-time-in-queue" num,...,num NL - [At most once.] - - Mean time cells spend in circuit queues in milliseconds. Times are - calculated by 1) determining the mean time cells spend in the - queue of a single circuit and 2) calculating the mean for all - circuits in a given decile as determined in - "cell-processed-cells". - - "cell-circuits-per-decile" num NL - [At most once.] - - Mean number of circuits that are included in any of the deciles, - rounded up to the next integer. - - "conn-bi-direct" YYYY-MM-DD HH:MM:SS (NSEC s) BELOW,READ,WRITE,BOTH NL - [At most once] - - Number of connections, split into 10-second intervals, that are - used uni-directionally or bi-directionally as observed in the NSEC - seconds (usually 86400 seconds) before YYYY-MM-DD HH:MM:SS. Every - 10 seconds, we determine for every connection whether we read and - wrote less than a threshold of 20 KiB (BELOW), read at least 10 - times more than we wrote (READ), wrote at least 10 times more than - we read (WRITE), or read and wrote more than the threshold, but - not 10 times more in either direction (BOTH). After classifying a - connection, read and write counters are reset for the next - 10-second interval. - - "exit-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL - [At most once.] - - YYYY-MM-DD HH:MM:SS defines the end of the included measurement - interval of length NSEC seconds (86400 seconds by default). - - An "exit-stats-end" line, as well as any other "exit-*" line, is - first added after the relay has been running for at least 24 hours - and only if the relay permits exiting (where exiting to a single - port and IP address is sufficient). - - "exit-kibibytes-written" port=N,port=N,... NL - [At most once.] - "exit-kibibytes-read" port=N,port=N,... NL - [At most once.] 
- - List of mappings from ports to the number of kibibytes that the - relay has written to or read from exit connections to that port, - rounded up to the next full kibibyte. - - "exit-streams-opened" port=N,port=N,... NL - [At most once.] - - List of mappings from ports to the number of opened exit streams - to that port, rounded up to the nearest multiple of 4. - - "router-signature" NL Signature NL - [At end, exactly once.] - - A document signature as documented in section 1.3, using the - initial item "extra-info" and the final item "router-signature", - signed with the router's identity key. - -2.2.1. Moving history fields to extra-info documents. - - Tools that want to use the read-history and write-history values SHOULD - download extra-info documents as well as router descriptors. Such - tools SHOULD accept history values from both sources; if they appear in - both documents, the values in the extra-info documents are authoritative. - - New versions of Tor no longer generate router descriptors - containing read-history or write-history. Tools should continue to - accept read-history and write-history values in router descriptors - produced by older versions of Tor until all Tor versions earlier - than 0.2.0.x are obsolete. - -2.3. Nonterminals in router descriptors - - nickname ::= between 1 and 19 alphanumeric characters ([A-Za-z0-9]), - case-insensitive. - hexdigest ::= a '$', followed by 40 hexadecimal characters - ([A-Fa-f0-9]). [Represents a server by the digest of its identity - key.] - - exitpattern ::= addrspec ":" portspec - portspec ::= "*" | port | port "-" port - port ::= an integer between 1 and 65535, inclusive. - - [Some implementations incorrectly generate ports with value 0. - Implementations SHOULD accept this, and SHOULD NOT generate it. - Connections to port 0 are never permitted.] 
- - addrspec ::= "*" | ip4spec | ip6spec - ipv4spec ::= ip4 | ip4 "/" num_ip4_bits | ip4 "/" ip4mask - ip4 ::= an IPv4 address in dotted-quad format - ip4mask ::= an IPv4 mask in dotted-quad format - num_ip4_bits ::= an integer between 0 and 32 - ip6spec ::= ip6 | ip6 "/" num_ip6_bits - ip6 ::= an IPv6 address, surrounded by square brackets. - num_ip6_bits ::= an integer between 0 and 128 - - bool ::= "0" | "1" - -3. Formats produced by directory authorities. - - Every authority has two keys used in this protocol: a signing key, and - an authority identity key. (Authorities also have a router identity - key used in their role as a router and by earlier versions of the - directory protocol.) The identity key is used from time to time to - sign new key certificates using new signing keys; it is very sensitive. - The signing key is used to sign key certificates and status documents. - - There are three kinds of documents generated by directory authorities: - - Key certificates - Status votes - Status consensuses - - Each is discussed below. - -3.1. Key certificates - - Key certificates consist of the following items: - - "dir-key-certificate-version" version NL - - [At start, exactly once.] - - Determines the version of the key certificate. MUST be "3" for - the protocol described in this document. Implementations MUST - reject formats they don't understand. - - "dir-address" IPPort NL - [At most once] - - An IP:Port for this authority's directory port. - - "fingerprint" fingerprint NL - - [Exactly once.] - - Hexadecimal encoding without spaces based on the authority's - identity key. - - "dir-identity-key" NL a public key in PEM format - - [Exactly once.] - - The long-term authority identity key for this authority. This key - SHOULD be at least 2048 bits long; it MUST NOT be shorter than - 1024 bits. - - "dir-key-published" YYYY-MM-DD HH:MM:SS NL - - [Exactly once.] - - The time (in GMT) when this document and corresponding key were - last generated. 
- - "dir-key-expires" YYYY-MM-DD HH:MM:SS NL - - [Exactly once.] - - A time (in GMT) after which this key is no longer valid. - - "dir-signing-key" NL a key in PEM format - - [Exactly once.] - - The directory server's public signing key. This key MUST be at - least 1024 bits, and MAY be longer. - - "dir-key-crosscert" NL CrossSignature NL - - [At most once.] - - NOTE: Authorities MUST include this field in all newly generated - certificates. A future version of this specification will make - the field required. - - CrossSignature is a signature, made using the certificate's signing - key, of the digest of the PKCS1-padded hash of the certificate's - identity key. For backward compatibility with broken versions of the - parser, we wrap the base64-encoded signature in -----BEGIN ID - SIGNATURE---- and -----END ID SIGNATURE----- tags. Implementations - MUST allow the "ID " portion to be omitted, however. - - When encountering a certificate with a dir-key-crosscert entry, - implementations MUST verify that the signature is a correct signature - of the hash of the identity key using the signing key. - - "dir-key-certification" NL Signature NL - - [At end, exactly once.] - - A document signature as documented in section 1.3, using the - initial item "dir-key-certificate-version" and the final item - "dir-key-certification", signed with the authority identity key. - - Authorities MUST generate a new signing key and corresponding - certificate before the key expires. - -3.2. Vote and consensus status documents - - Votes and consensuses are more strictly formatted then other documents - in this specification, since different authorities must be able to - generate exactly the same consensus given the same set of votes. - - The procedure for deciding when to generate vote and consensus status - documents are described in section 1.4 on the voting timeline. 
- - Status documents contain a preamble, an authority section, a list of - router status entries, and one or more footer signature, in that order. - - Unlike other formats described above, a SP in these documents must be a - single space character (hex 20). - - Some items appear only in votes, and some items appear only in - consensuses. Unless specified, items occur in both. - - The preamble contains the following items. They MUST occur in the - order given here: - - "network-status-version" SP version NL. - - [At start, exactly once.] - - A document format version. For this specification, the version is - "3". - - "vote-status" SP type NL - - [Exactly once.] - - The status MUST be "vote" or "consensus", depending on the type of - the document. - - "consensus-methods" SP IntegerList NL - - [Exactly once for votes; does not occur in consensuses.] - - A space-separated list of supported methods for generating - consensuses from votes. See section 3.4.1 for details. Method "1" - MUST be included. - - "consensus-method" SP Integer NL - - [Exactly once for consensuses; does not occur in votes.] - - See section 3.4.1 for details. - - (Only included when the vote is generated with consensus-method 2 or - later.) - - "published" SP YYYY-MM-DD SP HH:MM:SS NL - - [Exactly once for votes; does not occur in consensuses.] - - The publication time for this status document (if a vote). - - "valid-after" SP YYYY-MM-DD SP HH:MM:SS NL - - [Exactly once.] - - The start of the Interval for this vote. Before this time, the - consensus document produced from this vote should not be used. - See 1.4 for voting timeline information. - - "fresh-until" SP YYYY-MM-DD SP HH:MM:SS NL - - [Exactly once.] - - The time at which the next consensus should be produced; before this - time, there is no point in downloading another consensus, since there - won't be a new one. See 1.4 for voting timeline information. - - "valid-until" SP YYYY-MM-DD SP HH:MM:SS NL - - [Exactly once.] 
- - The end of the Interval for this vote. After this time, the - consensus produced by this vote should not be used. See 1.4 for - voting timeline information. - - "voting-delay" SP VoteSeconds SP DistSeconds NL - - [Exactly once.] - - VoteSeconds is the number of seconds that we will allow to collect - votes from all authorities; DistSeconds is the number of seconds - we'll allow to collect signatures from all authorities. See 1.4 for - voting timeline information. - - "client-versions" SP VersionList NL - - [At most once.] - - A comma-separated list of recommended client versions, in - ascending order. If absent, no opinion is held about client - versions. - - "server-versions" SP VersionList NL - - [At most once.] - - A comma-separated list of recommended server versions, in - ascending order. If absent, no opinion is held about server - versions. - - "known-flags" SP FlagList NL - - [Exactly once.] - - A space-separated list of all of the flags that this document - might contain. A flag is "known" either because the authority - knows about them and might set them (if in a vote), or because - enough votes were counted for the consensus for an authoritative - opinion to have been formed about their status. - - "params" SP [Parameters] NL - - [At most once] - - Parameter ::= Keyword '=' Int32 - Int32 ::= A decimal integer between -2147483648 and 2147483647. - Parameters ::= Parameter | Parameters SP Parameter - - The parameters list, if present, contains a space-separated list of - case-sensitive key-value pairs, sorted in lexical order by - their keyword. Each parameter has its own meaning. - - (Only included when the vote is generated with consensus-method 7 or - later.) - - Commonly used "param" arguments at this point include: - - "circwindow" -- the default package window that circuits should - be established with. 
It started out at 1000 cells, but some - research indicates that a lower value would mean fewer cells in - transit in the network at any given time. Obeyed by Tor 0.2.1.20 - and later. - Min: 100, Max: 1000 - - "CircuitPriorityHalflifeMsec" -- the halflife parameter used when - weighting which circuit will send the next cell. Obeyed by Tor - 0.2.2.10-alpha and later. (Versions of Tor between 0.2.2.7-alpha - and 0.2.2.10-alpha recognized a "CircPriorityHalflifeMsec" parameter, - but mishandled it badly.) - Min: -1, Max: 2147483647 (INT32_MAX) - - "perconnbwrate" and "perconnbwburst" -- if set, each relay sets - up a separate token bucket for every client OR connection, - and rate limits that connection indepedently. Typically left - unset, except when used for performance experiments around trac - entry 1750. Only honored by relays running Tor 0.2.2.16-alpha - and later. (Note that relays running 0.2.2.7-alpha through - 0.2.2.14-alpha looked for bwconnrate and bwconnburst, but then - did the wrong thing with them; see bug 1830 for details.) - Min: 1, Max: 2147483647 (INT32_MAX) - - "refuseunknownexits" -- if set to one, exit relays look at - the previous hop of circuits that ask to open an exit stream, - and refuse to exit if they don't recognize it as a relay. The - goal is to make it harder for people to use them as one-hop - proxies. See trac entry 1751 for details. - Min: 0, Max: 1 - - "cbtdisabled", "cbtnummodes", "cbtrecentcount", "cbtmaxtimeouts", - "cbtmincircs", "cbtquantile", "cbtclosequantile", "cbttestfreq", - "cbtmintimeout", and "cbtinitialtimeout" -- see "2.4.5. Consensus - parameters governing behavior" in path-spec.txt for a series of - circuit build time related consensus params. - - The authority section of a vote contains the following items, followed - in turn by the authority's current key certificate: - - "dir-source" SP nickname SP identity SP address SP IP SP dirport SP - orport NL - - [Exactly once, at start] - - Describes this authority. 
The nickname is a convenient identifier - for the authority. The identity is an uppercase hex fingerprint of - the authority's current (v3 authority) identity key. The address is - the server's hostname. The IP is the server's current IP address, - and dirport is its current directory port. XXXXorport - - "contact" SP string NL - - [At most once.] - - An arbitrary string describing how to contact the directory - server's administrator. Administrators should include at least an - email address and a PGP fingerprint. - - "legacy-key" SP FINGERPRINT NL - - [At most once] - - Lists a fingerprint for an obsolete _identity_ key still used - by this authority to keep older clients working. This option - is used to keep key around for a little while in case the - authorities need to migrate many identity keys at once. - (Generally, this would only happen because of a security - vulnerability that affected multiple authorities, like the - Debian OpenSSL RNG bug of May 2008.) - - The authority section of a consensus contains groups the following items, - in the order given, with one group for each authority that contributed to - the consensus, with groups sorted by authority identity digest: - - "dir-source" SP nickname SP identity SP address SP IP SP dirport SP - orport NL - - [Exactly once, at start] - - As in the authority section of a vote. - - "contact" SP string NL - - [At most once.] - - As in the authority section of a vote. - - "vote-digest" SP digest NL - - [Exactly once.] - - A digest of the vote from the authority that contributed to this - consensus, as signed (that is, not including the signature). - (Hex, upper-case.) - - Each router status entry contains the following items. Router status - entries are sorted in ascending order by identity digest. - - "r" SP nickname SP identity SP digest SP publication SP IP SP ORPort - SP DirPort NL - - [At start, exactly once.] - - "Nickname" is the OR's nickname. 
"Identity" is a hash of its - identity key, encoded in base64, with trailing equals sign(s) - removed. "Digest" is a hash of its most recent descriptor as - signed (that is, not including the signature), encoded in base64. - "Publication" is the - publication time of its most recent descriptor, in the form - YYYY-MM-DD HH:MM:SS, in GMT. "IP" is its current IP address; - ORPort is its current OR port, "DirPort" is it's current directory - port, or "0" for "none". - - "s" SP Flags NL - - [At most once.] - - A series of space-separated status flags, in alphabetical order. - Currently documented flags are: - - "Authority" if the router is a directory authority. - "BadExit" if the router is believed to be useless as an exit node - (because its ISP censors it, because it is behind a restrictive - proxy, or for some similar reason). - "BadDirectory" if the router is believed to be useless as a - directory cache (because its directory port isn't working, - its bandwidth is always throttled, or for some similar - reason). - "Exit" if the router is more useful for building - general-purpose exit circuits than for relay circuits. The - path building algorithm uses this flag; see path-spec.txt. - "Fast" if the router is suitable for high-bandwidth circuits. - "Guard" if the router is suitable for use as an entry guard. - "HSDir" if the router is considered a v2 hidden service directory. - "Named" if the router's identity-nickname mapping is canonical, - and this authority binds names. - "Stable" if the router is suitable for long-lived circuits. - "Running" if the router is currently usable. - "Unnamed" if another router has bound the name used by this - router, and this authority binds names. - "Valid" if the router has been 'validated'. - "V2Dir" if the router implements the v2 directory protocol. - "V3Dir" if the router implements this protocol. - - "v" SP version NL - - [At most once.] - - The version of the Tor protocol that this server is running. 
If - the value begins with "Tor" SP, the rest of the string is a Tor - version number, and the protocol is "The Tor protocol as supported - by the given version of Tor." Otherwise, if the value begins with - some other string, Tor has upgraded to a more sophisticated - protocol versioning system, and the protocol is "a version of the - Tor protocol more recent than any we recognize." - - Directory authorities SHOULD omit version strings they receive from - descriptors if they would cause "v" lines to be over 128 characters - long. - - "w" SP "Bandwidth=" INT [SP "Measured=" INT] NL - - [At most once.] - - An estimate of the bandwidth of this server, in an arbitrary - unit (currently kilobytes per second). Used to weight router - selection. - - Additionally, the Measured= keyword is present in votes by - participating bandwidth measurement authorities to indicate - a measured bandwidth currently produced by measuring stream - capacities. - - Other weighting keywords may be added later. - Clients MUST ignore keywords they do not recognize. - - "p" SP ("accept" / "reject") SP PortList NL - - [At most once.] - - PortList = PortOrRange - PortList = PortList "," PortOrRange - PortOrRange = INT "-" INT / INT - - A list of those ports that this router supports (if 'accept') - or does not support (if 'reject') for exit to "most - addresses". - - The footer section is delineated in all votes and consensuses supporting - consensus method 9 and above with the following: - - "directory-footer" NL - - It contains two subsections, a bandwidths-weights line and a - directory-signature. - - The bandwidths-weights line appears At Most Once for a consensus. It does - not appear in votes. 
- - "bandwidth-weights" SP - "Wbd=" INT SP "Wbe=" INT SP "Wbg=" INT SP "Wbm=" INT SP - "Wdb=" INT SP - "Web=" INT SP "Wed=" INT SP "Wee=" INT SP "Weg=" INT SP "Wem=" INT SP - "Wgb=" INT SP "Wgd=" INT SP "Wgg=" INT SP "Wgm=" INT SP - "Wmb=" INT SP "Wmd=" INT SP "Wme=" INT SP "Wmg=" INT SP "Wmm=" INT NL - - These values represent the weights to apply to router bandwidths during - path selection. They are sorted in alphabetical order in the list. The - integer values are divided by BW_WEIGHT_SCALE=10000 or the consensus - param "bwweightscale". They are: - - Wgg - Weight for Guard-flagged nodes in the guard position - Wgm - Weight for non-flagged nodes in the guard Position - Wgd - Weight for Guard+Exit-flagged nodes in the guard Position - - Wmg - Weight for Guard-flagged nodes in the middle Position - Wmm - Weight for non-flagged nodes in the middle Position - Wme - Weight for Exit-flagged nodes in the middle Position - Wmd - Weight for Guard+Exit flagged nodes in the middle Position - - Weg - Weight for Guard flagged nodes in the exit Position - Wem - Weight for non-flagged nodes in the exit Position - Wee - Weight for Exit-flagged nodes in the exit Position - Wed - Weight for Guard+Exit-flagged nodes in the exit Position - - Wgb - Weight for BEGIN_DIR-supporting Guard-flagged nodes - Wmb - Weight for BEGIN_DIR-supporting non-flagged nodes - Web - Weight for BEGIN_DIR-supporting Exit-flagged nodes - Wdb - Weight for BEGIN_DIR-supporting Guard+Exit-flagged nodes - - Wbg - Weight for Guard flagged nodes for BEGIN_DIR requests - Wbm - Weight for non-flagged nodes for BEGIN_DIR requests - Wbe - Weight for Exit-flagged nodes for BEGIN_DIR requests - Wbd - Weight for Guard+Exit-flagged nodes for BEGIN_DIR requests - - These values are calculated as specified in Section 3.4.3. - - The signature contains the following item, which appears Exactly Once - for a vote, and At Least Once for a consensus. 
- - "directory-signature" SP identity SP signing-key-digest NL Signature - - This is a signature of the status document, with the initial item - "network-status-version", and the signature item - "directory-signature", using the signing key. (In this case, we take - the hash through the _space_ after directory-signature, not the - newline: this ensures that all authorities sign the same thing.) - "identity" is the hex-encoded digest of the authority identity key of - the signing authority, and "signing-key-digest" is the hex-encoded - digest of the current authority signing key of the signing authority. - -3.3. Assigning flags in a vote - - (This section describes how directory authorities choose which status - flags to apply to routers, as of Tor 0.2.0.0-alpha-dev. Later directory - authorities MAY do things differently, so long as clients keep working - well. Clients MUST NOT depend on the exact behaviors in this section.) - - In the below definitions, a router is considered "active" if it is - running, valid, and not hibernating. - - "Valid" -- a router is 'Valid' if it is running a version of Tor not - known to be broken, and the directory authority has not blacklisted - it as suspicious. - - "Named" -- Directory authority administrators may decide to support name - binding. If they do, then they must maintain a file of - nickname-to-identity-key mappings, and try to keep this file consistent - with other directory authorities. If they don't, they act as clients, and - report bindings made by other directory authorities (name X is bound to - identity Y if at least one binding directory lists it, and no directory - binds X to some other Y'.) A router is called 'Named' if the router - believes the given name should be bound to the given key. - - Two strategies exist on the current network for deciding on - values for the Named flag. 
In the original version, server - operators were asked to send nickname-identity pairs to a - mailing list of Naming directory authorities operators. The - operators were then supposed to add the pairs to their - mapping files; in practice, they didn't get to this often. - - Newer Naming authorities run a script that registers routers - in their mapping files once the routers have been online at - least two weeks, no other router has that nickname, and no - other router has wanted the nickname for a month. If a router - has not been online for six months, the router is removed. - - "Unnamed" -- Directory authorities that support naming should vote for a - router to be 'Unnamed' if its given nickname is mapped to a different - identity. - - "Running" -- A router is 'Running' if the authority managed to connect to - it successfully within the last 30 minutes. - - "Stable" -- A router is 'Stable' if it is active, and either its Weighted - MTBF is at least the median for known active routers or its Weighted MTBF - corresponds to at least 7 days. Routers are never called Stable if they are - running a version of Tor known to drop circuits stupidly. (0.1.1.10-alpha - through 0.1.1.16-rc are stupid this way.) - - To calculate weighted MTBF, compute the weighted mean of the lengths - of all intervals when the router was observed to be up, weighting - intervals by $\alpha^n$, where $n$ is the amount of time that has - passed since the interval ended, and $\alpha$ is chosen so that - measurements over approximately one month old no longer influence the - weighted MTBF much. - - [XXXX what happens when we have less than 4 days of MTBF info.] - - "Exit" -- A router is called an 'Exit' iff it allows exits to at - least two of the ports 80, 443, and 6667 and allows exits to at - least one /8 address space. - - "Fast" -- A router is 'Fast' if it is active, and its bandwidth is - either in the top 7/8ths for known active routers or at least 20KB/s. 
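As a rough illustration of the weighted MTBF rule given for the Stable flag above, the following Python sketch computes the weighted mean of uptime interval lengths. The text does not fix a value for alpha or a unit for n, so the `alpha` and `unit_seconds` defaults here are assumptions, and the function names are hypothetical. The median is passed in by the caller because the text does not say how it is computed for flag assignment.

```python
SEVEN_DAYS = 7 * 24 * 3600  # seconds


def weighted_mtbf(up_intervals, now, alpha=0.95, unit_seconds=12 * 3600):
    """Weighted mean length of the intervals during which a router was up.

    up_intervals: list of (start, end) timestamps in seconds.
    Each interval is weighted by alpha ** n, where n is the time since the
    interval ended, measured in units of unit_seconds (an assumed unit).
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for start, end in up_intervals:
        n = (now - end) / unit_seconds
        weight = alpha ** n
        weighted_sum += weight * (end - start)
        total_weight += weight
    if total_weight == 0:
        return 0.0
    return weighted_sum / total_weight


def is_stable(active, mtbf, median_active_mtbf):
    """Stable: active, and MTBF at least the median or at least 7 days."""
    return active and (mtbf >= median_active_mtbf or mtbf >= SEVEN_DAYS)
```

With `alpha=1` every interval counts equally and the result is the plain mean; smaller values of alpha make old intervals fade faster.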
- - "Guard" -- A router is a possible 'Guard' if its Weighted Fractional - Uptime is at least the median for "familiar" active routers, and if - its bandwidth is at least median or at least 250KB/s. - - To calculate weighted fractional uptime, compute the fraction - of time that the router is up in any given day, weighting so that - downtime and uptime in the past counts less. - - A node is 'familiar' if 1/8 of all active nodes have appeared more - recently than it, OR it has been around for a few weeks. - - "Authority" -- A router is called an 'Authority' if the authority - generating the network-status document believes it is an authority. - - "V2Dir" -- A router supports the v2 directory protocol if it has an open - directory port, and it is running a version of the directory protocol that - supports the functionality clients need. (Currently, this is - 0.1.1.9-alpha or later.) - - "V3Dir" -- A router supports the v3 directory protocol if it has an open - directory port, and it is running a version of the directory protocol that - supports the functionality clients need. (Currently, this is - 0.2.0.?????-alpha or later.) - - "HSDir" -- A router is a v2 hidden service directory if it stores and - serves v2 hidden service descriptors and the authority managed to connect - to it successfully within the last 24 hours. - - Directory server administrators may label some servers or IPs as - blacklisted, and elect not to include them in their network-status lists. - - Authorities SHOULD 'disable' any servers in excess of 3 on any single IP. - When there are more than 3 to choose from, authorities should first prefer - authorities to non-authorities, then prefer Running to non-Running, and - then prefer high-bandwidth to low-bandwidth. To 'disable' a server, the - authority *should* advertise it without the Running or Valid flag. - - Thus, the network-status vote includes all non-blacklisted, - non-expired, non-superseded descriptors. 
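The per-IP limit described above can be sketched as a ranking problem: group servers by IP, order each group by the stated preferences, and disable everything past the third entry. This is only a sketch of that rule. The dictionary keys (`id`, `ip`, `is_authority`, `running`, `bandwidth`) are hypothetical names chosen for the example, not fields defined by the specification.

```python
from collections import defaultdict

MAX_SERVERS_PER_IP = 3


def servers_to_disable(servers):
    """Return the ids of servers in excess of 3 on any single IP.

    Preference order within one IP: authorities before non-authorities,
    then Running before non-Running, then higher bandwidth first.
    """
    by_ip = defaultdict(list)
    for server in servers:
        by_ip[server["ip"]].append(server)

    disabled = set()
    for group in by_ip.values():
        ranked = sorted(
            group,
            key=lambda s: (s["is_authority"], s["running"], s["bandwidth"]),
            reverse=True,
        )
        # Everything after the first three is advertised without
        # the Running or Valid flag.
        for server in ranked[MAX_SERVERS_PER_IP:]:
            disabled.add(server["id"])
    return disabled
```

Sorting on a tuple keeps the three preferences in strict priority order, so bandwidth only matters between servers that agree on authority and Running status.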
- - The bandwidth in a "w" line should be taken as the best estimate - of the router's actual capacity that the authority has. For now, - this should be the lesser of the observed bandwidth and bandwidth - rate limit from the router descriptor. It is given in kilobytes - per second, and capped at some arbitrary value (currently 10 MB/s). - - The Measured= keyword on a "w" line vote is currently computed - by multiplying the previous published consensus bandwidth by the - ratio of the measured average node stream capacity to the network - average. If 3 or more authorities provide a Measured= keyword for - a router, the authorities produce a consensus containing a "w" - Bandwidth= keyword equal to the median of the Measured= votes. - - The ports listed in a "p" line should be taken as those ports for - which the router's exit policy permits 'most' addresses, ignoring any - accept not for all addresses, ignoring all rejects for private - netblocks. "Most" addresses are permitted if no more than 2^25 - IPv4 addresses (two /8 networks) were blocked. The list is encoded - as described in 3.4.2. - -3.4. Computing a consensus from a set of votes - - Given a set of votes, authorities compute the contents of the consensus - document as follows: - - The "valid-after", "valid-until", and "fresh-until" times are taken as - the median of the respective values from all the votes. - - The times in the "voting-delay" line are taken as the median of the - VoteSeconds and DistSeconds times in the votes. - - Known-flags is the union of all flags known by any voter. - - Entries are given on the "params" line for every keyword on which any - authority voted. The values given are the low-median of all votes on - that keyword. - - "client-versions" and "server-versions" are sorted in ascending - order; A version is recommended in the consensus if it is recommended - by more than half of the voting authorities that included a - client-versions or server-versions lines in their votes. 
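A minimal Python sketch of the "params" and version rules above follows. It assumes each vote has already been parsed, which is not shown, and the function names are hypothetical. For an even number of votes, the low-median is taken to be the lower of the two middle values. Sorting the resulting versions in ascending order depends on the version comparison rules in version-spec.txt and is left out here.

```python
from collections import Counter


def low_median(values):
    """Median of values; for an even count, the lower middle value."""
    ordered = sorted(values)
    return ordered[(len(ordered) - 1) // 2]


def consensus_params(param_votes):
    """param_votes: one dict (keyword -> int) per voting authority.

    Every keyword voted on by any authority gets an entry, whose value is
    the low-median over the votes that mention that keyword.
    """
    keywords = sorted({k for vote in param_votes for k in vote})
    return {
        k: low_median([vote[k] for vote in param_votes if k in vote])
        for k in keywords
    }


def recommended_versions(version_lists):
    """version_lists: one list per authority that included the line.

    A version is kept if more than half of those authorities list it.
    """
    counts = Counter(v for versions in version_lists for v in set(versions))
    return {v for v, c in counts.items() if c > len(version_lists) / 2}
```

Authorities that omitted the versions line are simply not passed in, so they do not count toward the "more than half" threshold.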
- - The authority item groups (dir-source, contact, fingerprint, - vote-digest) are taken from the votes of the voting - authorities. These groups are sorted by the digests of the - authorities' identity keys, in ascending order. If the consensus - method is 3 or later, a dir-source line must be included for - every vote with a legacy-key entry, using the legacy-key's - fingerprint, the voter's ordinary nickname with the string - "-legacy" appended, and all other fields as from the original - vote's dir-source line. - - A router status entry: - * is included in the result if some router status entry with the same - identity is included by more than half of the authorities (total - authorities, not just those whose votes we have). - - * For any given identity, we include at most one router status entry. - - * A router entry has a flag set if that flag is included by more than half - of the authorities who care about that flag. - - * Two router entries are "the same" if they have the same - <descriptor digest, published time, nickname, IP, ports> tuple. - We choose the tuple for a given router as whichever tuple appears - for that router in the most votes. We break ties first in favor of - the more recently published, then in favor of smaller server - descriptor digest. - - * The Named flag appears if it is included for this routerstatus by - _any_ authority, and if all authorities that list it list the same - nickname. However, if consensus-method 2 or later is in use, and - any authority calls this identity/nickname pair Unnamed, then - this routerstatus does not get the Named flag. - - * If consensus-method 2 or later is in use, the Unnamed flag is - set for a routerstatus if any authorities have voted for different - identities to be Named with that nickname, or if any authority - lists that nickname/ID pair as Unnamed. - - (With consensus-method 1, Unnamed is set like any other flag.) 
- - * The version is given as whichever version is listed by the most - voters, with ties decided in favor of more recent versions. - - * If consensus-method 4 or later is in use, then routers that - do not have the Running flag are not listed at all. - - * If consensus-method 5 or later is in use, then the "w" line - is generated using a low-median of the bandwidth values from - the votes that included "w" lines for this router. - - * If consensus-method 5 or later is in use, then the "p" line - is taken from the votes that have the same policy summary - for the descriptor we are listing. (They should all be the - same. If they are not, we pick the most commonly listed - one, breaking ties in favor of the lexicographically larger - vote.) The port list is encoded as specified in 3.4.2. - - * If consensus-method 6 or later is in use and if 3 or more - authorities provide a Measured= keyword in their votes for - a router, the authorities produce a consensus containing a - Bandwidth= keyword equal to the median of the Measured= votes. - - * If consensus-method 7 or later is in use, the params line is - included in the output. - - * If the consensus method is under 11, bad exits are considered as - possible exits when computing bandwidth weights. Otherwise, if - method 11 or later is in use, any router that is determined to get - the BadExit flag doesn't count when we're calculating weights. - - The signatures at the end of a consensus document are sorted in - ascending order by identity digest. - - All ties in computing medians are broken in favor of the smaller or - earlier item. - -3.4.1. Forward compatibility - - Future versions of Tor will need to include new information in the - consensus documents, but it is important that all authorities (or at least - half) generate and sign the same signed consensus. - - To achieve this, authorities list in their votes their supported methods - for generating consensuses from votes. 
Later methods will be assigned - higher numbers. Currently recognized methods: - "1" -- The first implemented version. - "2" -- Added support for the Unnamed flag. - "3" -- Added legacy ID key support to aid in authority ID key rollovers - "4" -- No longer list routers that are not running in the consensus - "5" -- adds support for "w" and "p" lines. - "6" -- Prefers measured bandwidth values rather than advertised - "7" -- Provides keyword=integer pairs of consensus parameters - "8" -- Provides microdescriptor summaries - "9" -- Provides weights for selecting flagged routers in paths - "10" -- Fixes edge case bugs in router flag selection weights - - Before generating a consensus, an authority must decide which consensus - method to use. To do this, it looks for the highest version number - supported by more than 2/3 of the authorities voting. If it supports this - method, then it uses it. Otherwise, it falls back to method 1. - - (The consensuses generated by new methods must be parsable by - implementations that only understand the old methods, and must not cause - those implementations to compromise their anonymity. This is a means for - making changes in the contents of consensus; not for making - backward-incompatible changes in their format.) - -3.4.2. Encoding port lists - - Whether the summary shows the list of accepted ports or the list of - rejected ports depends on which list is shorter (has a shorter string - representation). In case of ties we choose the list of accepted - ports. As an exception to this rule an allow-all policy is - represented as "accept 1-65535" instead of "reject " and a reject-all - policy is similarly given as "reject 1-65535". - - Summary items are compressed, that is instead of "80-88,89-100" there - only is a single item of "80-100", similarly instead of "20,21" a - summary will say "20-21". - - Port lists are sorted in ascending order. 
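The encoding rules above can be sketched as follows. This is a minimal illustration, assuming the policy has already been reduced to a set of accepted ports; the function names are hypothetical, and no length limit is applied to the result.

```python
def compress_ports(ports):
    """Collapse port numbers into a sorted, comma-separated list of ranges."""
    items = []
    start = prev = None
    for port in sorted(set(ports)):
        if prev is not None and port == prev + 1:
            prev = port  # extend the current run
            continue
        if start is not None:
            items.append((start, prev))
        start = prev = port
    if start is not None:
        items.append((start, prev))
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in items)


def summarize_policy(accepted):
    """Build a policy summary from the set of accepted ports (1..65535)."""
    accepted = set(accepted)
    rejected = set(range(1, 65536)) - accepted
    if not rejected:
        return "accept 1-65535"  # allow-all exception
    if not accepted:
        return "reject 1-65535"  # reject-all exception
    acc = compress_ports(accepted)
    rej = compress_ports(rejected)
    # The shorter string representation wins; ties go to the accept list.
    return f"accept {acc}" if len(acc) <= len(rej) else f"reject {rej}"
```

For instance, compress_ports([20, 21]) yields "20-21", matching the compression example in the text.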
- - The maximum allowed length of a policy summary (including the "accept " - or "reject ") is 1000 characters. If a summary exceeds that length we - use an accept-style summary and list as much of the port list as is - possible within these 1000 bytes. [XXXX be more specific.] - -3.4.3. Computing Bandwidth Weights - - Let weight_scale = 10000 - - Let G be the total bandwidth for Guard-flagged nodes. - Let M be the total bandwidth for non-flagged nodes. - Let E be the total bandwidth for Exit-flagged nodes. - Let D be the total bandwidth for Guard+Exit-flagged nodes. - Let T = G+M+E+D - - Let Wgd be the weight for choosing a Guard+Exit for the guard position. - Let Wmd be the weight for choosing a Guard+Exit for the middle position. - Let Wed be the weight for choosing a Guard+Exit for the exit position. - - Let Wme be the weight for choosing an Exit for the middle position. - Let Wmg be the weight for choosing a Guard for the middle position. - - Let Wgg be the weight for choosing a Guard for the guard position. - Let Wee be the weight for choosing an Exit for the exit position. - - Balanced network conditions then arise from solutions to the following - system of equations: - - Wgg*G + Wgd*D == M + Wmd*D + Wme*E + Wmg*G (guard bw = middle bw) - Wgg*G + Wgd*D == Wee*E + Wed*D (guard bw = exit bw) - Wed*D + Wmd*D + Wgd*D == D (aka: Wed+Wmd+Wgd = 1) - Wmg*G + Wgg*G == G (aka: Wgg = 1-Wmg) - Wme*E + Wee*E == E (aka: Wee = 1-Wme) - - We are short 2 constraints with the above set. The remaining constraints - come from examining different cases of network load. The following - constraints are used in consensus method 10 and above. There is another, - incorrect and obsolete, set of constraints used for these same cases in - consensus method 9. For those, see dir-spec.txt in Tor 0.2.2.10-alpha - to 0.2.2.16-alpha. - - Case 1: E >= T/3 && G >= T/3 (Neither Exit nor Guard Scarce) - - In this case, the additional two constraints are: Wmg == Wmd, - Wed == 1/3. 
- - This leads to the solution: - Wgd = weight_scale/3 - Wed = weight_scale/3 - Wmd = weight_scale/3 - Wee = (weight_scale*(E+G+M))/(3*E) - Wme = weight_scale - Wee - Wmg = (weight_scale*(2*G-E-M))/(3*G) - Wgg = weight_scale - Wmg - - Case 2: E < T/3 && G < T/3 (Both are scarce) - - Let R denote the more scarce class (Rare) between Guard vs Exit. - Let S denote the less scarce class. - - Subcase a: R+D < S - - In this subcase, we simply devote all of D bandwidth to the - scarce class. - - Wgg = Wee = weight_scale - Wmg = Wme = Wmd = 0; - if E < G: - Wed = weight_scale - Wgd = 0 - else: - Wed = 0 - Wgd = weight_scale - - Subcase b: R+D >= S - - In this case, if M <= T/3, we have enough bandwidth to try to achieve - a balancing condition. - - Add constraints Wgg = 1, Wmd == Wgd to maximize bandwidth in the guard - position while still allowing exits to be used as middle nodes: - - Wee = (weight_scale*(E - G + M))/E - Wed = (weight_scale*(D - 2*E + 4*G - 2*M))/(3*D) - Wme = (weight_scale*(G-M))/E - Wmg = 0 - Wgg = weight_scale - Wmd = (weight_scale - Wed)/2 - Wgd = (weight_scale - Wed)/2 - - If this system ends up with any values out of range (i.e., negative, or - above weight_scale), use the constraints Wgg == 1 and Wee == 1, since - both those positions are scarce: - - Wgg = weight_scale - Wee = weight_scale - Wed = (weight_scale*(D - 2*E + G + M))/(3*D) - Wmd = (weight_scale*(D - 2*M + G + E))/(3*D) - Wme = 0 - Wmg = 0 - Wgd = weight_scale - Wed - Wmd - - If M > T/3, then the Wmd weight above will become negative. Set it to 0 - in this case: - Wmd = 0 - Wgd = weight_scale - Wed - - Case 3: One of E < T/3 or G < T/3 - - Let S be the scarce class (of E or G). - - Subcase a: (S+D) < T/3: - if G=S: - Wgg = Wgd = weight_scale; - Wmd = Wed = Wmg = 0; - // Minor subcase, if E is more scarce than M, - // keep its bandwidth in place. 
- if (E < M) Wme = 0; - else Wme = (weight_scale*(E-M))/(2*E); - Wee = weight_scale-Wme; - if E=S: - Wee = Wed = weight_scale; - Wmd = Wgd = Wme = 0; - // Minor subcase, if G is more scarce than M, - // keep its bandwidth in place. - if (G < M) Wmg = 0; - else Wmg = (weight_scale*(G-M))/(2*G); - Wgg = weight_scale-Wmg; - - Subcase b: (S+D) >= T/3 - if G=S: - Add constraints Wgg = 1, Wmd == Wed to maximize bandwidth - in the guard position, while still allowing exits to be - used as middle nodes: - Wgg = weight_scale - Wgd = (weight_scale*(D - 2*G + E + M))/(3*D) - Wmg = 0 - Wee = (weight_scale*(E+M))/(2*E) - Wme = weight_scale - Wee - Wmd = (weight_scale - Wgd)/2 - Wed = (weight_scale - Wgd)/2 - if E=S: - Add constraints Wee == 1, Wmd == Wgd to maximize bandwidth - in the exit position: - Wee = weight_scale; - Wed = (weight_scale*(D - 2*E + G + M))/(3*D); - Wme = 0; - Wgg = (weight_scale*(G+M))/(2*G); - Wmg = weight_scale - Wgg; - Wmd = (weight_scale - Wed)/2; - Wgd = (weight_scale - Wed)/2; - - To ensure consensus, all calculations are performed using integer math - with a fixed precision determined by the bwweightscale consensus - parameter (defaults at 10000, Min: 1, Max: INT32_MAX). - - For future balancing improvements, Tor clients support 11 additional weights - for directory requests and middle weighting. These weights are currently - set at weight_scale, with the exception of the following groups of - assignments: - - Directory requests use middle weights: - Wbd=Wmd, Wbg=Wmg, Wbe=Wme, Wbm=Wmm - - Handle bridges and strange exit policies: - Wgm=Wgg, Wem=Wee, Weg=Wed - -3.5. Detached signatures - - Assuming full connectivity, every authority should compute and sign the - same consensus directory in each period. Therefore, it isn't necessary to - download the consensus computed by each authority; instead, the - authorities only push/fetch each others' signatures. 
A "detached - signature" document contains items as follows: - - "consensus-digest" SP Digest NL - - [At start, at most once.] - - The digest of the consensus being signed. - - "valid-after" SP YYYY-MM-DD SP HH:MM:SS NL - "fresh-until" SP YYYY-MM-DD SP HH:MM:SS NL - "valid-until" SP YYYY-MM-DD SP HH:MM:SS NL - - [As in the consensus] - - "directory-signature" - - [As in the consensus; the signature object is the same as in the - consensus document.] - - -4. Directory server operation - - All directory authorities and directory caches ("directory servers") - implement this section, except as noted. - -4.1. Accepting uploads (authorities only) - - When a router posts a signed descriptor to a directory authority, the - authority first checks whether it is well-formed and correctly - self-signed. If it is, the authority next verifies that the nickname - in question is not already assigned to a router with a different - public key. - Finally, the authority MAY check that the router is not blacklisted - because of its key, IP, or another reason. - - If the descriptor passes these tests, and the authority does not already - have a descriptor for a router with this public key, it accepts the - descriptor and remembers it. - - If the authority _does_ have a descriptor with the same public key, the - newly uploaded descriptor is remembered if its publication time is more - recent than the most recent old descriptor for that router, and either: - - There are non-cosmetic differences between the old descriptor and the - new one. - - Enough time has passed between the descriptors' publication times. - (Currently, 12 hours.) - - Differences between router descriptors are "non-cosmetic" if they would be - sufficient to force an upload as described in section 2 above. - - Note that the "cosmetic difference" test only applies to uploaded - descriptors, not to descriptors that the authority downloads from other - authorities. 
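The rule for replacing a stored descriptor can be sketched as follows. This is a minimal illustration that assumes the well-formedness, signature, nickname, and blacklist checks have already passed; the function and parameter names are hypothetical, and treating exactly 12 hours as "enough time" is an assumption.

```python
from datetime import datetime, timedelta

MIN_REPUBLISH_GAP = timedelta(hours=12)  # "Currently, 12 hours."


def should_remember(new_published, old_published, non_cosmetic_change):
    """Decide whether an uploaded descriptor is remembered.

    old_published is None when no descriptor is stored for this public key.
    """
    if old_published is None:
        return True  # first descriptor seen for this key
    if new_published <= old_published:
        return False  # must be more recent than the stored one
    # Newer descriptor: keep it if it differs meaningfully or is old enough.
    return non_cosmetic_change or (
        new_published - old_published >= MIN_REPUBLISH_GAP
    )
```

Whether a difference counts as non-cosmetic is decided elsewhere (section 2) and is passed in here as a plain boolean.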
- - When a router posts a signed extra-info document to a directory authority, - the authority again checks it for well-formedness and correct signature, - and checks that its digest matches the extra-info-digest in some router - descriptor that it believes is currently useful. If so, it accepts it and - stores it and serves it as requested. If not, it drops it. - -4.2. Voting (authorities only) - - Authorities divide time into Intervals. Authority administrators SHOULD - try to all pick the same interval length, and SHOULD pick intervals that - are commonly used divisions of time (e.g., 5 minutes, 15 minutes, 30 - minutes, 60 minutes, 90 minutes). Voting intervals SHOULD be chosen to - divide evenly into a 24-hour day. - - Authorities SHOULD act according to the interval and delays in the - latest consensus. Lacking a latest consensus, they SHOULD default to a - 30-minute Interval, a 5-minute VotingDelay, and a 5-minute DistDelay. - - Authorities MUST take pains to ensure that their clocks remain accurate - within a few seconds. (Running NTP is usually sufficient.) - - The first voting period of each day begins at 00:00 (midnight) GMT. If - the last period of the day would be truncated by one-half or more, it is - merged with the second-to-last period. - - An authority SHOULD publish its vote immediately at the start of each voting - period (minus VoteSeconds+DistSeconds). It does this by making it - available at - http://<hostname>/tor/status-vote/next/authority.z - and sending it in an HTTP POST request to each other authority at the URL - http://<hostname>/tor/post/vote - - If, at the start of the voting period, minus DistSeconds, an authority - does not have a current statement from another authority, the first - authority downloads the other's statement. - - Once an authority has a vote from another authority, it makes it available - at - http://<hostname>/tor/status-vote/next/<fp>.z - where <fp> is the fingerprint of the other authority's identity key. 
- And at - http://<hostname>/tor/status-vote/next/d/<d>.z - where <d> is the digest of the vote document. - - The consensus status, along with as many signatures as the server - currently knows, should be available at - http://<hostname>/tor/status-vote/next/consensus.z - All of the detached signatures it knows for consensus status should be - available at: - http://<hostname>/tor/status-vote/next/consensus-signatures.z - - Once there are enough signatures, or once the voting period starts, - these documents are available at - http://<hostname>/tor/status-vote/current/consensus.z - and - http://<hostname>/tor/status-vote/current/consensus-signatures.z - [XXX current/consensus-signatures is not currently implemented, as it - is not used in the voting protocol.] - - The other vote documents are analogously made available under - http://<hostname>/tor/status-vote/current/authority.z - http://<hostname>/tor/status-vote/current/<fp>.z - http://<hostname>/tor/status-vote/current/d/<d>.z - once the consensus is complete. - - Once an authority has computed and signed a consensus network status, it - should send its detached signature to each other authority in an HTTP POST - request to the URL: - http://<hostname>/tor/post/consensus-signature - - [XXX Note why we support push-and-then-pull.] - - [XXX possible future features include support for downloading old - consensuses.] - -4.3. Downloading consensus status documents (caches only) - - All directory servers (authorities and caches) try to keep a recent - network-status consensus document to serve to clients. A cache ALWAYS - downloads a network-status consensus if any of the following are true: - - The cache has no consensus document. - - The cache's consensus document is no longer valid. - Otherwise, the cache downloads a new consensus document at a randomly - chosen time in the first half-interval after its current consensus - stops being fresh. 
(This time is chosen at random to avoid swarming - the authorities at the start of each period. The interval size is - inferred from the difference between the valid-after time and the - fresh-until time on the consensus.) - - [For example, if a cache has a consensus that became valid at 1:00, - and is fresh until 2:00, that cache will fetch a new consensus at - a random time between 2:00 and 2:30.] - -4.4. Downloading and storing router descriptors (authorities and caches) - - Periodically (currently, every 10 seconds), directory servers check - whether there are any specific descriptors that they do not have and that - they are not currently trying to download. Caches identify these - descriptors by hash in the recent network-status consensus documents; - authorities identify them by hash in vote (if publication date is more - recent than the descriptor we currently have). - - [XXXX need a way to fetch descriptors ahead of the vote? v2 status docs can - do that for now.] - - If so, the directory server launches requests to the authorities for these - descriptors, such that each authority is only asked for descriptors listed - in its most recent vote (if the requester is an authority) or in the - consensus (if the requester is a cache). If we're an authority, and more - than one authority lists the descriptor, we choose which to ask at random. - - If one of these downloads fails, we do not try to download that descriptor - from the authority that failed to serve it again unless we receive a newer - network-status (consensus or vote) from that authority that lists the same - descriptor. - - Directory servers must potentially cache multiple descriptors for each - router. Servers must not discard any descriptor listed by any recent - consensus. If there is enough space to store additional descriptors, - servers SHOULD try to hold those which clients are likely to download the - most. 
(Currently, this is judged based on the interval for which each - descriptor seemed newest.) -[XXXX define recent] - - Authorities SHOULD NOT download descriptors for routers that they would - immediately reject for reasons listed in 3.1. - -4.5. Downloading and storing extra-info documents - - All authorities, and any cache that chooses to cache extra-info documents, - and any client that uses extra-info documents, should implement this - section. - - Note that generally, clients don't need extra-info documents. - - Periodically, the Tor instance checks whether it is missing any extra-info - documents: in other words, if it has any router descriptors with an - extra-info-digest field that does not match any of the extra-info - documents currently held. If so, it downloads whatever extra-info - documents are missing. Caches download from authorities; non-caches try - to download from caches. We follow the same splitting and back-off rules - as in 4.4 (if a cache) or 5.3 (if a client). - -4.6. General-use HTTP URLs - - "Fingerprints" in these URLs are base-16-encoded SHA1 hashes. - - The most recent v3 consensus should be available at: - http://<hostname>/tor/status-vote/current/consensus.z - - Starting with Tor version 0.2.1.1-alpha, it is also available at: - http://<hostname>/tor/status-vote/current/consensus/<F1>+<F2>+<F3>.z - - Where F1, F2, etc. are authority identity fingerprints the client trusts. - Servers will only return a consensus if more than half of the requested - authorities have signed the document; otherwise a 404 error will be sent - back. The fingerprints can be shortened to a length of any multiple of - two, using only the leftmost part of the encoded fingerprint. Tor uses - 3 bytes (6 hex characters) of the fingerprint. - - Clients SHOULD sort the fingerprints in ascending order. Servers MUST - accept any order. 
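The fingerprint-qualified consensus URL can be assembled roughly as follows. This is a sketch, not Tor's own code: the function name and the prefix_bytes parameter are hypothetical, duplicates are dropped as a convenience, and upper-casing follows the recommendation in 4.6 for base16-encoded fingerprints.

```python
def consensus_url(hostname, fingerprints, prefix_bytes=3):
    """Build the consensus URL for a set of trusted authority fingerprints.

    fingerprints: iterable of hex-encoded authority identity fingerprints.
    prefix_bytes: number of leading bytes of each fingerprint to keep.
    """
    hex_len = 2 * prefix_bytes  # shortened length stays a multiple of two
    # Keep the leftmost part of each fingerprint, then sort ascending.
    short = sorted({fp.upper()[:hex_len] for fp in fingerprints})
    return (
        f"http://{hostname}/tor/status-vote/current/consensus/"
        + "+".join(short)
        + ".z"
    )
```

With the default of 3 bytes, each fingerprint contributes 6 hex characters to the URL, the length the text says Tor uses.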
- - Clients SHOULD use this format when requesting consensus documents from - directory authority servers and from caches running a version of Tor - that is known to support this URL format. - - A concatenated set of all the current key certificates should be available - at: - http://<hostname>/tor/keys/all.z - - The key certificate for this server (if it is an authority) should be - available at: - http://<hostname>/tor/keys/authority.z - - The key certificate for an authority whose authority identity fingerprint - is <F> should be available at: - http://<hostname>/tor/keys/fp/<F>.z - - The key certificate whose signing key fingerprint is <F> should be - available at: - http://<hostname>/tor/keys/sk/<F>.z - - The key certificate whose identity key fingerprint is <F> and whose signing - key fingerprint is <S> should be available at: - - http://<hostname>/tor/keys/fp-sk/<F>-<S>.z - - (As usual, clients may request multiple certificates using: - http://<hostname>/tor/keys/fp-sk/<F1>-<S1>+<F2>-<S2>.z ) - [The above fp-sk format was not supported before Tor 0.2.1.9-alpha.] - - The most recent descriptor for a server whose identity key has a - fingerprint of <F> should be available at: - http://<hostname>/tor/server/fp/<F>.z - - The most recent descriptors for servers with identity fingerprints - <F1>,<F2>,<F3> should be available at: - http://<hostname>/tor/server/fp/<F1>+<F2>+<F3>.z - - (NOTE: Implementations SHOULD NOT download descriptors by identity key - fingerprint. This allows a corrupted server (in collusion with a cache) to - provide a unique descriptor to a client, and thereby partition that client - from the rest of the network.) 
- - The server descriptor with (descriptor) digest <D> (in hex) should be - available at: - http://<hostname>/tor/server/d/<D>.z - - The most recent descriptors with digests <D1>,<D2>,<D3> should be - available at: - http://<hostname>/tor/server/d/<D1>+<D2>+<D3>.z - - The most recent descriptor for this server should be at: - http://<hostname>/tor/server/authority.z - [Nothing in the Tor protocol uses this resource yet, but it is useful - for debugging purposes. Also, the official Tor implementations - (starting at 0.1.1.x) use this resource to test whether a server's - own DirPort is reachable.] - - A concatenated set of the most recent descriptors for all known servers - should be available at: - http://<hostname>/tor/server/all.z - - Extra-info documents are available at the URLS - http://<hostname>/tor/extra/d/... - http://<hostname>/tor/extra/fp/... - http://<hostname>/tor/extra/all[.z] - http://<hostname>/tor/extra/authority[.z] - (As for /tor/server/ URLs: supports fetching extra-info - documents by their digest, by the fingerprint of their servers, - or all at once. When serving by fingerprint, we serve the - extra-info that corresponds to the descriptor we would serve by - that fingerprint. Only directory authorities of version - 0.2.0.1-alpha or later are guaranteed to support the first - three classes of URLs. Caches may support them, and MUST - support them if they have advertised "caches-extra-info".) - - For debugging, directories SHOULD expose non-compressed objects at URLs like - the above, but without the final ".z". - Clients MUST handle compressed concatenated information in two forms: - - A concatenated list of zlib-compressed objects. - - A zlib-compressed concatenated list of objects. - Directory servers MAY generate either format: the former requires less - CPU, but the latter requires less bandwidth. - - Clients SHOULD use upper case letters (A-F) when base16-encoding - fingerprints. 
Servers MUST accept both upper and lower case fingerprints - in requests. - -5. Client operation: downloading information - - Every Tor that is not a directory server (that is, those that do - not have a DirPort set) implements this section. - -5.1. Downloading network-status documents - - Each client maintains a list of directory authorities. Insofar as - possible, clients SHOULD all use the same list. - - Clients try to have a live consensus network-status document at all times. - A network-status document is "live" if the time in its valid-until field - has not passed. - - If a client is missing a live network-status document, it tries to fetch - it from a directory cache (or from an authority if it knows no caches). - On failure, the client waits briefly, then tries that network-status - document again from another cache. The client does not build circuits - until it has a live network-status consensus document, and it has - descriptors for more than 1/4 of the routers that it believes are running. - - (Note: clients can and should pick caches based on the network-status - information they have: once they have first fetched network-status info - from an authority, they should not need to go to the authority directly - again.) - - To avoid swarming the caches whenever a consensus expires, the - clients download new consensuses at a randomly chosen time after the - caches are expected to have a fresh consensus, but before their - consensus will expire. (This time is chosen uniformly at random from - the interval between the time 3/4 into the first interval after the - consensus is no longer fresh, and 7/8 of the time remaining after - that before the consensus is invalid.) 
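The client download window described above can be computed as in this sketch. Times are plain seconds; the function names are hypothetical, and integer division is an assumption, since the text does not say how fractions of a second are rounded.

```python
import random


def client_fetch_window(valid_after, fresh_until, valid_until):
    """Return (earliest, latest) fetch times; all values are in seconds."""
    interval = fresh_until - valid_after
    # Start 3/4 of the way into the first interval after freshness ends.
    earliest = fresh_until + (interval * 3) // 4
    # End after 7/8 of the time remaining until the consensus is invalid.
    remaining = valid_until - earliest
    latest = earliest + (remaining * 7) // 8
    return earliest, latest


def pick_fetch_time(valid_after, fresh_until, valid_until, rng=random):
    """Choose a fetch time uniformly at random within the window."""
    earliest, latest = client_fetch_window(valid_after, fresh_until, valid_until)
    return rng.uniform(earliest, latest)
```

For a consensus valid from 1:00, fresh until 2:00, and expiring at 4:00, the window opens at 2:45 and closes shortly after 3:50.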
- - [For example, if a cache has a consensus that became valid at 1:00, - and is fresh until 2:00, and expires at 4:00, that cache will fetch - a new consensus at a random time between 2:45 and 3:50, since 3/4 - of the one-hour interval is 45 minutes, and 7/8 of the remaining 75 - minutes is 65 minutes.] - -5.2. Downloading and storing router descriptors - - Clients try to have the best descriptor for each router. A descriptor is - "best" if: - * It is listed in the consensus network-status document. - - Periodically (currently every 10 seconds) clients check whether there are - any "downloadable" descriptors. A descriptor is downloadable if: - - It is the "best" descriptor for some router. - - The descriptor was published at least 10 minutes in the past. - (This prevents clients from trying to fetch descriptors that the - mirrors have probably not yet retrieved and cached.) - - The client does not currently have it. - - The client is not currently trying to download it. - - The client would not discard it immediately upon receiving it. - - The client thinks it is running and valid (see 6.1 below). - - If at least 16 known routers have downloadable descriptors, or if - enough time (currently 10 minutes) has passed since the last time the - client tried to download descriptors, it launches requests for all - downloadable descriptors, as described in 5.3 below. - - When a descriptor download fails, the client notes it, and does not - consider the descriptor downloadable again until a certain amount of time - has passed. (Currently 0 seconds for the first failure, 60 seconds for the - second, 5 minutes for the third, 10 minutes for the fourth, and 1 day - thereafter.) Periodically (currently once an hour) clients reset the - failure count. - - Clients retain the most recent descriptor they have downloaded for each - router so long as it is not too old (currently, 48 hours), OR so long as - no better descriptor has been downloaded for the same router. 
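The retry schedule for failed descriptor downloads, described above, can be written as a small lookup. This is a sketch with hypothetical names; it covers only the delay per failure, not the hourly reset of the failure count.

```python
# Delays after the first, second, third, and fourth failure, in seconds.
BACKOFF_SECONDS = [0, 60, 5 * 60, 10 * 60]
ONE_DAY = 24 * 60 * 60


def retry_delay(failure_count):
    """Seconds to wait after the Nth failure (N >= 1) before retrying."""
    if failure_count < 1:
        raise ValueError("failure_count must be at least 1")
    if failure_count <= len(BACKOFF_SECONDS):
        return BACKOFF_SECONDS[failure_count - 1]
    return ONE_DAY  # every failure after the fourth
```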
- - [Versions of Tor before 0.1.2.3-alpha would discard descriptors simply for - being published too far in the past.] [The code seems to discard - descriptors in all cases after they're 5 days old. True? -RD] - -5.3. Managing downloads - - When a client has no consensus network-status document, it downloads it - from a randomly chosen authority. In all other cases, the client - downloads from caches randomly chosen from among those believed to be V2 - directory servers. (This information comes from the network-status - documents; see 6 below.) - - When downloading multiple router descriptors, the client chooses multiple - mirrors so that: - - At least 3 different mirrors are used, except when this would result - in more than one request for under 4 descriptors. - - No more than 128 descriptors are requested from a single mirror. - - Otherwise, as few mirrors as possible are used. - After choosing mirrors, the client divides the descriptors among them - randomly. - - After receiving any response, the client MUST discard any network-status - documents and descriptors that it did not request. - -6. Using directory information - - Everyone besides directory authorities uses the approaches in this section - to decide which servers to use and what their keys are likely to be. - (Directory authorities just believe their own opinions, as in 3.1 above.) - -6.1. Choosing routers for circuits. - - Circuits SHOULD NOT be built until the client has enough directory - information: a live consensus network status [XXXX fallback?] and - descriptors for at least 1/4 of the servers believed to be running. - - A server is "listed" if it is included by the consensus network-status - document. Clients SHOULD NOT use unlisted servers. - - These flags are used as follows: - - - Clients SHOULD NOT use non-'Valid' or non-'Running' routers unless - requested to do so. 
- - - Clients SHOULD NOT use non-'Fast' routers for any purpose other than - very-low-bandwidth circuits (such as introduction circuits). - - - Clients SHOULD NOT use non-'Stable' routers for circuits that are - likely to need to be open for a very long time (such as those used for - IRC or SSH connections). - - - Clients SHOULD NOT choose non-'Guard' nodes when picking entry guard - nodes. - - - Clients SHOULD NOT download directory information from non-'V2Dir' - caches. - - See the "path-spec.txt" document for more details. - -6.2. Managing naming - - In order to provide human-memorable names for individual server - identities, some directory servers bind names to IDs. Clients handle - names in two ways: - - When a client encounters a name it has not mapped before: - - If the consensus lists any router with that name as "Named", or if - consensus-method 2 or later is in use and the consensus lists any - router with that name as having the "Unnamed" flag, then the name is - bound. (It's bound to the ID listed in the entry with the Named, - or to an unknown ID if no name is found.) - - When the user refers to a bound name, the implementation SHOULD provide - only the router with ID bound to that name, and no other router, even - if the router with the right ID can't be found. - - When a user tries to refer to a non-bound name, the implementation SHOULD - warn the user. After warning the user, the implementation MAY use any - router that advertises the name. - - Not every router needs a nickname. When a router doesn't configure a - nickname, it publishes with the default nickname "Unnamed". Authorities - SHOULD NOT ever mark a router with this nickname as Named; client software - SHOULD NOT ever use a router in response to a user request for a router - called "Unnamed". - -6.3. Software versions - - An implementation of Tor SHOULD warn when it has fetched a consensus - network-status, and it is running a software version not listed. - -6.4. 
Warning about a router's status. - - If a router tries to publish its descriptor to a Naming authority - that has its nickname mapped to another key, the router SHOULD - warn the operator that it is either using the wrong key or is using - an already claimed nickname. - - If a router has fetched a consensus document,, and the - authorities do not publish a binding for the router's nickname, the - router MAY remind the operator that the chosen nickname is not - bound to this key at the authorities, and suggest contacting the - authority operators. - - ... - -6.5. Router protocol versions - - A client should believe that a router supports a given feature if that - feature is supported by the router or protocol versions in more than half - of the live networkstatuses' "v" entries for that router. In other words, - if the "v" entries for some router are: - v Tor 0.0.8pre1 (from authority 1) - v Tor 0.1.2.11 (from authority 2) - v FutureProtocolDescription 99 (from authority 3) - then the client should believe that the router supports any feature - supported by 0.1.2.11. - - This is currently equivalent to believing the median declared version for - a router in all live networkstatuses. - -7. Standards compliance - - All clients and servers MUST support HTTP 1.0. Clients and servers MAY - support later versions of HTTP as well. - -7.1. HTTP headers - - Servers MAY set the Content-Length: header. Servers SHOULD set - Content-Encoding to "deflate" or "identity". - - Servers MAY include an X-Your-Address-Is: header, whose value is the - apparent IP address of the client connecting to them (as a dotted quad). - For directory connections tunneled over a BEGIN_DIR stream, servers SHOULD - report the IP from which the circuit carrying the BEGIN_DIR stream reached - them. [Servers before version 0.1.2.5-alpha reported 127.0.0.1 for all - BEGIN_DIR-tunneled connections.] - - Servers SHOULD disable caching of multiple network statuses or multiple - router descriptors. 
Servers MAY enable caching of single descriptors, - single network statuses, the list of all router descriptors, a v1 - directory, or a v1 running routers document. XXX mention times. - -7.2. HTTP status codes - - Tor delivers the following status codes. Some were chosen without much - thought; other code SHOULD NOT rely on specific status codes yet. - - 200 -- the operation completed successfully - -- the user requested statuses or serverdescs, and none of the ones we - requested were found (0.2.0.4-alpha and earlier). - - 304 -- the client specified an if-modified-since time, and none of the - requested resources have changed since that time. - - 400 -- the request is malformed, or - -- the URL is for a malformed variation of one of the URLs we support, - or - -- the client tried to post to a non-authority, or - -- the authority rejected a malformed posted document, or - - 404 -- the requested document was not found. - -- the user requested statuses or serverdescs, and none of the ones - requested were found (0.2.0.5-alpha and later). - - 503 -- we are declining the request in order to save bandwidth - -- user requested some items that we ordinarily generate or store, - but we do not have any available. - -9. Backward compatibility and migration plans - - Until Tor versions before 0.1.1.x are completely obsolete, directory - authorities should generate, and mirrors should download and cache, v1 - directories and running-routers lists, and allow old clients to download - them. These documents and the rules for retrieving, serving, and caching - them are described in dir-spec-v1.txt. - - Until Tor versions before 0.2.0.x are completely obsolete, directory - authorities should generate, mirrors should download and cache, v2 - network-status documents, and allow old clients to download them. - Additionally, all directory servers and caches should download, store, and - serve any router descriptor that is required because of v2 network-status - documents. 
These documents and the rules for retrieving, serving, and - caching them are described in dir-spec-v1.txt. - -A. Consensus-negotiation timeline. - - Period begins: this is the Published time. - Everybody sends votes - Reconciliation: everybody tries to fetch missing votes. - consensus may exist at this point. - End of voting period: - everyone swaps signatures. - Now it's okay for caches to download - Now it's okay for clients to download. - - Valid-after/valid-until switchover - diff --git a/doc/spec/path-spec.txt b/doc/spec/path-spec.txt deleted file mode 100644 index 7c313f8ab0..0000000000 --- a/doc/spec/path-spec.txt +++ /dev/null @@ -1,657 +0,0 @@ - - Tor Path Specification - - Roger Dingledine - Nick Mathewson - -Note: This is an attempt to specify Tor as currently implemented. Future -versions of Tor will implement improved algorithms. - -This document tries to cover how Tor chooses to build circuits and assign -streams to circuits. Other implementations MAY take other approaches, but -implementors should be aware of the anonymity and load-balancing implications -of their choices. - - THIS SPEC ISN'T DONE YET. - - The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL - NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and - "OPTIONAL" in this document are to be interpreted as described in - RFC 2119. - -1. General operation - - Tor begins building circuits as soon as it has enough directory - information to do so (see section 5 of dir-spec.txt). Some circuits are - built preemptively because we expect to need them later (for user - traffic), and some are built because of immediate need (for user traffic - that no current circuit can handle, for testing the network or our - reachability, and so on). - - When a client application creates a new stream (by opening a SOCKS - connection or launching a resolve request), we attach it to an appropriate - open circuit if one exists, or wait if an appropriate circuit is - in-progress. 
We launch a new circuit only - if no current circuit can handle the request. We rotate circuits over - time to avoid some profiling attacks. - - To build a circuit, we choose all the nodes we want to use, and then - construct the circuit. Sometimes, when we want a circuit that ends at a - given hop, and we have an appropriate unused circuit, we "cannibalize" the - existing circuit and extend it to the new terminus. - - These processes are described in more detail below. - - This document describes Tor's automatic path selection logic only; path - selection can be overridden by a controller (with the EXTENDCIRCUIT and - ATTACHSTREAM commands). Paths constructed through these means may - violate some constraints given below. - -1.1. Terminology - - A "path" is an ordered sequence of nodes, not yet built as a circuit. - - A "clean" circuit is one that has not yet been used for any traffic. - - A "fast" or "stable" or "valid" node is one that has the 'Fast' or - 'Stable' or 'Valid' flag - set respectively, based on our current directory information. A "fast" - or "stable" circuit is one consisting only of "fast" or "stable" nodes. - - In an "exit" circuit, the final node is chosen based on waiting stream - requests if any, and in any case it avoids nodes with exit policy of - "reject *:*". An "internal" circuit, on the other hand, is one where - the final node is chosen just like a middle node (ignoring its exit - policy). - - A "request" is a client-side stream or DNS resolve that needs to be - served by a circuit. - - A "pending" circuit is one that we have started to build, but which has - not yet completed. - - A circuit or path "supports" a request if it is okay to use the - circuit/path to fulfill the request, according to the rules given below. - A circuit or path "might support" a request if some aspect of the request - is unknown (usually its target IP), but we believe the path probably - supports the request according to the rules given below. - -1.1. 
A server's bandwidth - - Old versions of Tor did not report bandwidths in network status - documents, so clients had to learn them from the routers' advertised - server descriptors. - - For versions of Tor prior to 0.2.1.17-rc, everywhere below where we - refer to a server's "bandwidth", we mean its clipped advertised - bandwidth, computed by taking the smaller of the 'rate' and - 'observed' arguments to the "bandwidth" element in the server's - descriptor. If a router's advertised bandwidth is greater than - MAX_BELIEVABLE_BANDWIDTH (currently 10 MB/s), we clipped to that - value. - - For more recent versions of Tor, we take the bandwidth value declared - in the consensus, and fall back to the clipped advertised bandwidth - only if the consensus does not have bandwidths listed. - -2. Building circuits - -2.1. When we build - -2.1.1. Clients build circuits preemptively - - When running as a client, Tor tries to maintain at least a certain - number of clean circuits, so that new streams can be handled - quickly. To increase the likelihood of success, Tor tries to - predict what circuits will be useful by choosing from among nodes - that support the ports we have used in the recent past (by default - one hour). Specifically, on startup Tor tries to maintain one clean - fast exit circuit that allows connections to port 80, and at least - two fast clean stable internal circuits in case we get a resolve - request or hidden service request (at least three if we _run_ a - hidden service). - - After that, Tor will adapt the circuits that it preemptively builds - based on the requests it sees from the user: it tries to have two fast - clean exit circuits available for every port seen within the past hour - (each circuit can be adequate for many predicted ports -- it doesn't - need two separate circuits for each port), and it tries to have the - above internal circuits available if we've seen resolves or hidden - service activity within the past hour. 
If there are 12 or more clean - circuits open, it doesn't open more even if it has more predictions. - - Only stable circuits can "cover" a port that is listed in the - LongLivedPorts config option. Similarly, hidden service requests - to ports listed in LongLivedPorts make us create stable internal - circuits. - - Note that if there are no requests from the user for an hour, Tor - will predict no use and build no preemptive circuits. - - The Tor client SHOULD NOT store its list of predicted requests to a - persistent medium. - -2.1.2. Clients build circuits on demand - - Additionally, when a client request exists that no circuit (built or - pending) might support, we create a new circuit to support the request. - For exit connections, we pick an exit node that will handle the - most pending requests (choosing arbitrarily among ties), launch a - circuit to end there, and repeat until every unattached request - might be supported by a pending or built circuit. For internal - circuits, we pick an arbitrary acceptable path, repeating as needed. - - In some cases we can reuse an already established circuit if it's - clean; see Section 2.3 (cannibalizing circuits) for details. - -2.1.3. Servers build circuits for testing reachability and bandwidth - - Tor servers test reachability of their ORPort once they have - successfully built a circuit (on start and whenever their IP address - changes). They build an ordinary fast internal circuit with themselves - as the last hop. As soon as any testing circuit succeeds, the Tor - server decides it's reachable and is willing to publish a descriptor. - - We launch multiple testing circuits (one at a time), until we - have NUM_PARALLEL_TESTING_CIRC (4) such circuits open. Then we - do a "bandwidth test" by sending a certain number of relay drop - cells down each circuit: BandwidthRate * 10 / CELL_NETWORK_SIZE - total cells divided across the four circuits, but never more than - CIRCWINDOW_START (1000) cells total. 
This exercises both outgoing and - incoming bandwidth, and helps to jumpstart the observed bandwidth - (see dir-spec.txt). - - Tor servers also test reachability of their DirPort once they have - established a circuit, but they use an ordinary exit circuit for - this purpose. - -2.1.4. Hidden-service circuits - - See section 4 below. - -2.1.5. Rate limiting of failed circuits - - If we fail to build a circuit N times in a X second period (see Section - 2.3 for how this works), we stop building circuits until the X seconds - have elapsed. - XXXX - -2.1.6. When to tear down circuits - - XXXX - - -2.2. Path selection and constraints - - We choose the path for each new circuit before we build it. We choose the - exit node first, followed by the other nodes in the circuit. All paths - we generate obey the following constraints: - - We do not choose the same router twice for the same path. - - We do not choose any router in the same family as another in the same - path. - - We do not choose more than one router in a given /16 subnet - (unless EnforceDistinctSubnets is 0). - - We don't choose any non-running or non-valid router unless we have - been configured to do so. By default, we are configured to allow - non-valid routers in "middle" and "rendezvous" positions. - - If we're using Guard nodes, the first node must be a Guard (see 5 - below) - - XXXX Choosing the length - - For "fast" circuits, we only choose nodes with the Fast flag. For - non-"fast" circuits, all nodes are eligible. - - For all circuits, we weight node selection according to router bandwidth. - - We also weight the bandwidth of Exit and Guard flagged nodes depending on - the fraction of total bandwidth that they make up and depending upon the - position they are being selected for. - - These weights are published in the consensus, and are computed as described - in Section 3.4.3 of dir-spec.txt. 
They are: - - Wgg - Weight for Guard-flagged nodes in the guard position - Wgm - Weight for non-flagged nodes in the guard Position - Wgd - Weight for Guard+Exit-flagged nodes in the guard Position - - Wmg - Weight for Guard-flagged nodes in the middle Position - Wmm - Weight for non-flagged nodes in the middle Position - Wme - Weight for Exit-flagged nodes in the middle Position - Wmd - Weight for Guard+Exit flagged nodes in the middle Position - - Weg - Weight for Guard flagged nodes in the exit Position - Wem - Weight for non-flagged nodes in the exit Position - Wee - Weight for Exit-flagged nodes in the exit Position - Wed - Weight for Guard+Exit-flagged nodes in the exit Position - - Wgb - Weight for BEGIN_DIR-supporting Guard-flagged nodes - Wmb - Weight for BEGIN_DIR-supporting non-flagged nodes - Web - Weight for BEGIN_DIR-supporting Exit-flagged nodes - Wdb - Weight for BEGIN_DIR-supporting Guard+Exit-flagged nodes - - Wbg - Weight for Guard+Exit-flagged nodes for BEGIN_DIR requests - Wbm - Weight for Guard+Exit-flagged nodes for BEGIN_DIR requests - Wbe - Weight for Guard+Exit-flagged nodes for BEGIN_DIR requests - Wbd - Weight for Guard+Exit-flagged nodes for BEGIN_DIR requests - - Additionally, we may be building circuits with one or more requests in - mind. Each kind of request puts certain constraints on paths: - - - All service-side introduction circuits and all rendezvous paths - should be Stable. - - All connection requests for connections that we think will need to - stay open a long time require Stable circuits. Currently, Tor decides - this by examining the request's target port, and comparing it to a - list of "long-lived" ports. (Default: 21, 22, 706, 1863, 5050, - 5190, 5222, 5223, 6667, 6697, 8300.) - - DNS resolves require an exit node whose exit policy is not equivalent - to "reject *:*". - - Reverse DNS resolves require a version of Tor with advertised eventdns - support (available in Tor 0.1.2.1-alpha-dev and later). 
- - All connection requests require an exit node whose exit policy - supports their target address and port (if known), or which "might - support it" (if the address isn't known). See 2.2.1. - - Rules for Fast? XXXXX - -2.2.1. Choosing an exit - - If we know what IP address we want to connect to or resolve, we can - trivially tell whether a given router will support it by simulating - its declared exit policy. - - Because we often connect to addresses of the form hostname:port, we do not - always know the target IP address when we select an exit node. In these - cases, we need to pick an exit node that "might support" connections to a - given address port with an unknown address. An exit node "might support" - such a connection if any clause that accepts any connections to that port - precedes all clauses (if any) that reject all connections to that port. - - Unless requested to do so by the user, we never choose an exit server - flagged as "BadExit" by more than half of the authorities who advertise - themselves as listing bad exits. - -2.2.2. User configuration - - Users can alter the default behavior for path selection with configuration - options. - - - If "ExitNodes" is provided, then every request requires an exit node on - the ExitNodes list. (If a request is supported by no nodes on that list, - and StrictExitNodes is false, then Tor treats that request as if - ExitNodes were not provided.) - - - "EntryNodes" and "StrictEntryNodes" behave analogously. - - - If a user tries to connect to or resolve a hostname of the form - <target>.<servername>.exit, the request is rewritten to a request for - <target>, and the request is only supported by the exit whose nickname - or fingerprint is <servername>. - -2.3. Cannibalizing circuits - - If we need a circuit and have a clean one already established, in - some cases we can adapt the clean circuit for our new - purpose. 
Specifically, - - For hidden service interactions, we can "cannibalize" a clean internal - circuit if one is available, so we don't need to build those circuits - from scratch on demand. - - We can also cannibalize clean circuits when the client asks to exit - at a given node -- either via the ".exit" notation or because the - destination is running at the same location as an exit node. - -2.4. Learning when to give up ("timeout") on circuit construction - - Since version 0.2.2.8-alpha, Tor attempts to learn when to give up on - circuits based on network conditions. - -2.4.1 Distribution choice and parameter estimation - - Based on studies of build times, we found that the distribution of - circuit build times appears to be a Frechet distribution. However, - estimators and quantile functions of the Frechet distribution are - difficult to work with and slow to converge. So instead, since we - are only interested in the accuracy of the tail, we approximate - the tail of the distribution with a Pareto curve. - - We calculate the parameters for a Pareto distribution fitting the data - using the estimators in equation 4 from: - http://portal.acm.org/citation.cfm?id=1647962.1648139 - - This is: - - alpha_m = s/(ln(U(X)/Xm^n)) - - where s is the total number of completed circuits we have seen, and - - U(X) = x_max^u * Prod_s{x_i} - - with x_i as our i-th completed circuit time, x_max as the longest - completed circuit build time we have yet observed, u as the - number of unobserved timeouts that have no exact value recorded, - and n as u+s, the total number of circuits that either timeout or - complete. 
- - Using log laws, we compute this as the sum of logs to avoid - overflow and ln(1.0+epsilon) precision issues: - - alpha_m = s/(u*ln(x_max) + Sum_s{ln(x_i)} - n*ln(Xm)) - - This estimator is closely related to the parameters present in: - http://en.wikipedia.org/wiki/Pareto_distribution#Parameter_estimation - except they are adjusted to handle the fact that our samples are - right-censored at the timeout cutoff. - - Additionally, because this is not a true Pareto distribution, we alter - how Xm is computed. The Xm parameter is computed as the midpoint of the most - frequently occurring 50ms histogram bin, until the point where 1000 - circuits are recorded. After this point, the weighted average of the top - 'cbtnummodes' (default: 3) midpoint modes is used as Xm. All times below - this value are counted as having the midpoint value of this weighted average bin. - - The timeout itself is calculated by using the Pareto Quantile function (the - inverted CDF) to give us the value on the CDF such that 80% of the mass - of the distribution is below the timeout value. - - Thus, we expect that the Tor client will accept the fastest 80% of - the total number of paths on the network. - -2.4.2. How much data to record - - From our observations, the minimum number of circuit build times for a - reasonable fit appears to be on the order of 100. However, to keep a - good fit over the long term, we store 1000 most recent circuit build times - in a circular array. - - The Tor client should build test circuits at a rate of one per - minute up until 100 circuits are built. This allows a fresh Tor to have - a CircuitBuildTimeout estimated within 1.5 hours after install, - upgrade, or network change (see below). - - Timeouts are stored on disk in a histogram of 50ms bin width, the same - width used to calculate the Xm value above. This histogram must be shuffled - after being read from disk, to preserve a proper expiration of old values - after restart. - -2.4.3. 
How to record timeouts - - Circuits that pass the timeout threshold should be allowed to continue - building until a time corresponding to the point 'cbtclosequantile' - (default 95) on the Pareto curve, or 60 seconds, whichever is greater. - - The actual completion times for these circuits should be recorded. - Implementations should completely abandon a circuit and record a value - as an 'unknown' timeout if the total build time exceeds this threshold. - - The reason for this is that right-censored pareto estimators begin to lose - their accuracy if more than approximately 5% of the values are censored. - Since we wish to set the cutoff at 20%, we must allow circuits to continue - building past this cutoff point up to the 95th percentile. - -2.4.4. Detecting Changing Network Conditions - - We attempt to detect both network connectivity loss and drastic - changes in the timeout characteristics. - - We assume that we've had network connectivity loss if 3 circuits - timeout and we've received no cells or TLS handshakes since those - circuits began. We then temporarily set the timeout to 60 seconds - and stop counting timeouts. - - If 3 more circuits timeout and the network still has not been - live within this new 60 second timeout window, we then discard - the previous timeouts during this period from our history. - - To detect changing network conditions, we keep a history of - the timeout or non-timeout status of the past 20 circuits that - successfully completed at least one hop. If more than 90% of - these circuits timeout, we discard all buildtimes history, reset - the timeout to 60, and then begin recomputing the timeout. - - If the timeout was already 60 or higher, we double the timeout. - -2.4.5. 
Consensus parameters governing behavior - - Clients that implement circuit build timeout learning should obey the - following consensus parameters that govern behavior, in order to allow - us to handle bugs or other emergent behaviors due to client circuit - construction. If these parameters are not present in the consensus, - the listed default values should be used instead. - - cbtdisabled - Default: 0 - Min: 0 - Max: 1 - Effect: If 1, all CircuitBuildTime learning code should be - disabled and history should be discarded. For use in - emergency situations only. - - cbtnummodes - Default: 3 - Min: 1 - Max: 20 - Effect: This value governs how many modes to use in the weighted - average calculation of Pareto parameter Xm. A value of 3 introduces - some bias (2-5% of CDF) under ideal conditions, but allows for better - performance in the event that a client chooses guard nodes of radically - different performance characteristics. - - cbtrecentcount - Default: 20 - Min: 3 - Max: 1000 - Effect: This is the number of circuit build times to keep track of - for the following option. - - cbtmaxtimeouts - Default: 18 - Min: 3 - Max: 10000 - Effect: When this many timeouts happen in the last 'cbtrecentcount' - circuit attempts, the client should discard all of its - history and begin learning a fresh timeout value. - - cbtmincircs - Default: 100 - Min: 1 - Max: 10000 - Effect: This is the minimum number of circuits to build before - computing a timeout. - - cbtquantile - Default: 80 - Min: 10 - Max: 99 - Effect: This is the position on the quantile curve to use to set the - timeout value. It is a percent (10-99). - - cbtclosequantile - Default: 95 - Min: Value of cbtquantile parameter - Max: 99 - Effect: This is the position on the quantile curve to use to set the - timeout value to use to actually close circuits. It is a percent - (0-99). 
- - cbttestfreq - Default: 60 - Min: 1 - Max: 2147483647 (INT32_MAX) - Effect: Describes how often in seconds to build a test circuit to - gather timeout values. Only applies if less than 'cbtmincircs' - have been recorded. - - cbtmintimeout - Default: 2000 - Min: 500 - Max: 2147483647 (INT32_MAX) - Effect: This is the minimum allowed timeout value in milliseconds. - The minimum is to prevent rounding to 0 (we only check once - per second). - - cbtinitialtimeout - Default: 60000 - Min: Value of cbtmintimeout - Max: 2147483647 (INT32_MAX) - Effect: This is the timeout value to use before computing a timeout, - in milliseconds. - - -2.5. Handling failure - - If an attempt to extend a circuit fails (either because the first create - failed or a subsequent extend failed) then the circuit is torn down and is - no longer pending. (XXXX really?) Requests that might have been - supported by the pending circuit thus become unsupported, and a new - circuit needs to be constructed. - - If a stream "begin" attempt fails with an EXITPOLICY error, we - decide that the exit node's exit policy is not correctly advertised, - so we treat the exit node as if it were a non-exit until we retrieve - a fresh descriptor for it. - - XXXX - -3. Attaching streams to circuits - - When a circuit that might support a request is built, Tor tries to attach - the request's stream to the circuit and sends a BEGIN, BEGIN_DIR, - or RESOLVE relay - cell as appropriate. If the request completes unsuccessfully, Tor - considers the reason given in the CLOSE relay cell. [XXX yes, and?] - - - After a request has remained unattached for SocksTimeout (2 minutes - by default), Tor abandons the attempt and signals an error to the - client as appropriate (e.g., by closing the SOCKS connection). - - XXX Timeouts and when Tor auto-retries. - * What stream-end-reasons are appropriate for retrying. - - If no reply to BEGIN/RESOLVE, then the stream will timeout and fail. - -4. 
Hidden-service related circuits - - XXX Tracking expected hidden service use (client-side and hidserv-side) - -5. Guard nodes - - We use Guard nodes (also called "helper nodes" in the literature) to - prevent certain profiling attacks. Here's the risk: if we choose entry and - exit nodes at random, and an attacker controls C out of N servers - (ignoring bandwidth), then the - attacker will control the entry and exit node of any given circuit with - probability (C/N)^2. But as we make many different circuits over time, - then the probability that the attacker will see a sample of about (C/N)^2 - of our traffic goes to 1. Since statistical sampling works, the attacker - can be sure of learning a profile of our behavior. - - If, on the other hand, we picked an entry node and held it fixed, we would - have probability C/N of choosing a bad entry and being profiled, and - probability (N-C)/N of choosing a good entry and not being profiled. - - When guard nodes are enabled, Tor maintains an ordered list of entry nodes - as our chosen guards, and stores this list persistently to disk. If a Guard - node becomes unusable, rather than replacing it, Tor adds new guards to the - end of the list. When choosing the first hop of a circuit, Tor - chooses at - random from among the first NumEntryGuards (default 3) usable guards on the - list. If there are not at least 2 usable guards on the list, Tor adds - routers until there are, or until there are no more usable routers to add. - - A guard is unusable if any of the following hold: - - it is not marked as a Guard by the networkstatuses, - - it is not marked Valid (and the user hasn't set AllowInvalid entry) - - it is not marked Running - - Tor couldn't reach it the last time it tried to connect - - A guard is unusable for a particular circuit if any of the rules for path - selection in 2.2 are not met. 
In particular, if the circuit is "fast" - and the guard is not Fast, or if the circuit is "stable" and the guard is - not Stable, or if the guard has already been chosen as the exit node in - that circuit, Tor can't use it as a guard node for that circuit. - - If the guard is excluded because of its status in the networkstatuses for - over 30 days, Tor removes it from the list entirely, preserving order. - - If Tor fails to connect to an otherwise usable guard, it retries - periodically: every hour for six hours, every 4 hours for 3 days, every - 18 hours for a week, and every 36 hours thereafter. Additionally, Tor - retries unreachable guards the first time it adds a new guard to the list, - since it is possible that the old guards were only marked as unreachable - because the network was unreachable or down. - - Tor does not add a guard persistently to the list until the first time we - have connected to it successfully. - -6. Router descriptor purposes - - There are currently three "purposes" supported for router descriptors: - general, controller, and bridge. Most descriptors are of type general - -- these are the ones listed in the consensus, and the ones fetched - and used in normal cases. - - Controller-purpose descriptors are those delivered by the controller - and labelled as such: they will be kept around (and expire like - normal descriptors), and they can be used by the controller in its - CIRCUITEXTEND commands. Otherwise they are ignored by Tor when it - chooses paths. - - Bridge-purpose descriptors are for routers that are used as bridges. See - doc/design-paper/blocking.pdf for more design explanation, or proposal - 125 for specific details. Currently bridge descriptors are used in place - of normal entry guards, for Tor clients that have UseBridges enabled. - - -X. Old notes - -X.1. Do we actually do this? - -How to deal with network down. 
- - While all helpers are down/unreachable and there are no established - or on-the-way testing circuits, launch a testing circuit. (Do this - periodically in the same way we try to establish normal circuits - when things are working normally.) - (Testing circuits are a special type of circuit, that streams won't - attach to by accident.) - - When a testing circuit succeeds, mark all helpers up and hold - the testing circuit open. - - If a connection to a helper succeeds, close all testing circuits. - Else mark that helper down and try another. - - If the last helper is marked down and we already have a testing - circuit established, then add the first hop of that testing circuit - to the end of our helper node list, close that testing circuit, - and go back to square one. (Actually, rather than closing the - testing circuit, can we get away with converting it to a normal - circuit and beginning to use it immediately?) - - [Do we actually do any of the above? If so, let's spec it. If not, let's - remove it. -NM] - -X.2. A thing we could do to deal with reachability. - -And as a bonus, it leads to an answer to Nick's attack ("If I pick -my helper nodes all on 18.0.0.0:*, then I move, you'll know where I -bootstrapped") -- the answer is to pick your original three helper nodes -without regard for reachability. Then the above algorithm will add some -more that are reachable for you, and if you move somewhere, it's more -likely (though not certain) that some of the originals will become useful. -Is that smart or just complex? - -X.3. Some stuff that worries me about entry guards. 2006 Jun, Nickm. - - It is unlikely for two users to have the same set of entry guards. - Observing a user is sufficient to learn its entry guards. So, as we move - around, entry guards make us linkable. If we want to change guards when - our location (IP? subnet?) changes, we have two bad options. We could - - Drop the old guards. 
But if we go back to our old location, - we'll not use our old guards. For a laptop that sometimes gets used - from work and sometimes from home, this is pretty fatal. - - Remember the old guards as associated with the old location, and use - them again if we ever go back to the old location. This would be - nasty, since it would force us to record where we've been. - - [Do we do any of this now? If not, this should move into 099-misc or - 098-todo. -NM] - diff --git a/doc/spec/proposals/000-index.txt b/doc/spec/proposals/000-index.txt deleted file mode 100644 index 580ce36fa7..0000000000 --- a/doc/spec/proposals/000-index.txt +++ /dev/null @@ -1,196 +0,0 @@ -Filename: 000-index.txt -Title: Index of Tor Proposals -Author: Nick Mathewson -Created: 26-Jan-2007 -Status: Meta - -Overview: - - This document provides an index to Tor proposals. - - This is an informational document. - - Everything in this document below the line of '=' signs is automatically - generated by reindex.py; do not edit by hand. 
- -============================================================ -Proposals by number: - -000 Index of Tor Proposals [META] -001 The Tor Proposal Process [META] -098 Proposals that should be written [META] -099 Miscellaneous proposals [META] -100 Tor Unreliable Datagram Extension Proposal [DEAD] -101 Voting on the Tor Directory System [CLOSED] -102 Dropping "opt" from the directory format [CLOSED] -103 Splitting identity key from regularly used signing key [CLOSED] -104 Long and Short Router Descriptors [CLOSED] -105 Version negotiation for the Tor protocol [CLOSED] -106 Checking fewer things during TLS handshakes [CLOSED] -107 Uptime Sanity Checking [CLOSED] -108 Base "Stable" Flag on Mean Time Between Failures [CLOSED] -109 No more than one server per IP address [CLOSED] -110 Avoiding infinite length circuits [ACCEPTED] -111 Prioritizing local traffic over relayed traffic [CLOSED] -112 Bring Back Pathlen Coin Weight [SUPERSEDED] -113 Simplifying directory authority administration [SUPERSEDED] -114 Distributed Storage for Tor Hidden Service Descriptors [CLOSED] -115 Two Hop Paths [DEAD] -116 Two hop paths from entry guards [DEAD] -117 IPv6 exits [ACCEPTED] -118 Advertising multiple ORPorts at once [ACCEPTED] -119 New PROTOCOLINFO command for controllers [CLOSED] -120 Shutdown descriptors when Tor servers stop [DEAD] -121 Hidden Service Authentication [FINISHED] -122 Network status entries need a new Unnamed flag [CLOSED] -123 Naming authorities automatically create bindings [CLOSED] -124 Blocking resistant TLS certificate usage [SUPERSEDED] -125 Behavior for bridge users, bridge relays, and bridge authorities [CLOSED] -126 Getting GeoIP data and publishing usage summaries [CLOSED] -127 Relaying dirport requests to Tor download site / website [DRAFT] -128 Families of private bridges [DEAD] -129 Block Insecure Protocols by Default [CLOSED] -130 Version 2 Tor connection protocol [CLOSED] -131 Help users to verify they are using Tor [NEEDS-REVISION] -132 A Tor Web 
Service For Verifying Correct Browser Configuration [DRAFT] -133 Incorporate Unreachable ORs into the Tor Network [DRAFT] -134 More robust consensus voting with diverse authority sets [REJECTED] -135 Simplify Configuration of Private Tor Networks [CLOSED] -136 Mass authority migration with legacy keys [CLOSED] -137 Keep controllers informed as Tor bootstraps [CLOSED] -138 Remove routers that are not Running from consensus documents [CLOSED] -139 Download consensus documents only when it will be trusted [CLOSED] -140 Provide diffs between consensuses [ACCEPTED] -141 Download server descriptors on demand [DRAFT] -142 Combine Introduction and Rendezvous Points [DEAD] -143 Improvements of Distributed Storage for Tor Hidden Service Descriptors [OPEN] -144 Increase the diversity of circuits by detecting nodes belonging the same provider [DRAFT] -145 Separate "suitable as a guard" from "suitable as a new guard" [OPEN] -146 Add new flag to reflect long-term stability [OPEN] -147 Eliminate the need for v2 directories in generating v3 directories [ACCEPTED] -148 Stream end reasons from the client side should be uniform [CLOSED] -149 Using data from NETINFO cells [OPEN] -150 Exclude Exit Nodes from a circuit [CLOSED] -151 Improving Tor Path Selection [FINISHED] -152 Optionally allow exit from single-hop circuits [CLOSED] -153 Automatic software update protocol [SUPERSEDED] -154 Automatic Software Update Protocol [SUPERSEDED] -155 Four Improvements of Hidden Service Performance [FINISHED] -156 Tracking blocked ports on the client side [OPEN] -157 Make certificate downloads specific [ACCEPTED] -158 Clients download consensus + microdescriptors [OPEN] -159 Exit Scanning [OPEN] -160 Authorities vote for bandwidth offsets in consensus [FINISHED] -161 Computing Bandwidth Adjustments [FINISHED] -162 Publish the consensus in multiple flavors [OPEN] -163 Detecting whether a connection comes from a client [OPEN] -164 Reporting the status of server votes [OPEN] -165 Easy migration for 
voting authority sets [OPEN] -166 Including Network Statistics in Extra-Info Documents [ACCEPTED] -167 Vote on network parameters in consensus [CLOSED] -168 Reduce default circuit window [OPEN] -169 Eliminate TLS renegotiation for the Tor connection handshake [DRAFT] -170 Configuration options regarding circuit building [DRAFT] -171 Separate streams across circuits by connection metadata [OPEN] -172 GETINFO controller option for circuit information [ACCEPTED] -173 GETINFO Option Expansion [ACCEPTED] -174 Optimistic Data for Tor: Server Side [OPEN] -175 Automatically promoting Tor clients to nodes [DRAFT] -176 Proposed version-3 link handshake for Tor [DRAFT] -177 Abstaining from votes on individual flags [DRAFT] - - -Proposals by status: - - DRAFT: - 127 Relaying dirport requests to Tor download site / website - 132 A Tor Web Service For Verifying Correct Browser Configuration - 133 Incorporate Unreachable ORs into the Tor Network - 141 Download server descriptors on demand - 144 Increase the diversity of circuits by detecting nodes belonging the same provider - 169 Eliminate TLS renegotiation for the Tor connection handshake [for 0.2.2] - 170 Configuration options regarding circuit building - 175 Automatically promoting Tor clients to nodes - 176 Proposed version-3 link handshake for Tor [for 0.2.3] - 177 Abstaining from votes on individual flags - NEEDS-REVISION: - 131 Help users to verify they are using Tor - OPEN: - 143 Improvements of Distributed Storage for Tor Hidden Service Descriptors [for 0.2.1.x] - 145 Separate "suitable as a guard" from "suitable as a new guard" [for 0.2.1.x] - 146 Add new flag to reflect long-term stability [for 0.2.1.x] - 149 Using data from NETINFO cells [for 0.2.1.x] - 156 Tracking blocked ports on the client side [for 0.2.?] 
- 158 Clients download consensus + microdescriptors - 159 Exit Scanning - 162 Publish the consensus in multiple flavors [for 0.2.2] - 163 Detecting whether a connection comes from a client [for 0.2.2] - 164 Reporting the status of server votes [for 0.2.2] - 165 Easy migration for voting authority sets - 168 Reduce default circuit window [for 0.2.2] - 171 Separate streams across circuits by connection metadata - 174 Optimistic Data for Tor: Server Side - ACCEPTED: - 110 Avoiding infinite length circuits [for 0.2.1.x] [in 0.2.1.3-alpha] - 117 IPv6 exits [for 0.2.1.x] - 118 Advertising multiple ORPorts at once [for 0.2.1.x] - 140 Provide diffs between consensuses [for 0.2.2.x] - 147 Eliminate the need for v2 directories in generating v3 directories [for 0.2.1.x] - 157 Make certificate downloads specific [for 0.2.1.x] - 166 Including Network Statistics in Extra-Info Documents [for 0.2.2] - 172 GETINFO controller option for circuit information - 173 GETINFO Option Expansion - META: - 000 Index of Tor Proposals - 001 The Tor Proposal Process - 098 Proposals that should be written - 099 Miscellaneous proposals - FINISHED: - 121 Hidden Service Authentication [in 0.2.1.x] - 151 Improving Tor Path Selection - 155 Four Improvements of Hidden Service Performance [in 0.2.1.x] - 160 Authorities vote for bandwidth offsets in consensus [for 0.2.2.x] - 161 Computing Bandwidth Adjustments [for 0.2.2.x] - CLOSED: - 101 Voting on the Tor Directory System [in 0.2.0.x] - 102 Dropping "opt" from the directory format [in 0.2.0.x] - 103 Splitting identity key from regularly used signing key [in 0.2.0.x] - 104 Long and Short Router Descriptors [in 0.2.0.x] - 105 Version negotiation for the Tor protocol [in 0.2.0.x] - 106 Checking fewer things during TLS handshakes [in 0.2.0.x] - 107 Uptime Sanity Checking [in 0.2.0.x] - 108 Base "Stable" Flag on Mean Time Between Failures [in 0.2.0.x] - 109 No more than one server per IP address [in 0.2.0.x] - 111 Prioritizing local traffic over relayed 
traffic [in 0.2.0.x] - 114 Distributed Storage for Tor Hidden Service Descriptors [in 0.2.0.x] - 119 New PROTOCOLINFO command for controllers [in 0.2.0.x] - 122 Network status entries need a new Unnamed flag [in 0.2.0.x] - 123 Naming authorities automatically create bindings [in 0.2.0.x] - 125 Behavior for bridge users, bridge relays, and bridge authorities [in 0.2.0.x] - 126 Getting GeoIP data and publishing usage summaries [in 0.2.0.x] - 129 Block Insecure Protocols by Default [in 0.2.0.x] - 130 Version 2 Tor connection protocol [in 0.2.0.x] - 135 Simplify Configuration of Private Tor Networks [for 0.2.1.x] [in 0.2.1.2-alpha] - 136 Mass authority migration with legacy keys [in 0.2.0.x] - 137 Keep controllers informed as Tor bootstraps [in 0.2.1.x] - 138 Remove routers that are not Running from consensus documents [in 0.2.1.2-alpha] - 139 Download consensus documents only when it will be trusted [in 0.2.1.x] - 148 Stream end reasons from the client side should be uniform [in 0.2.1.9-alpha] - 150 Exclude Exit Nodes from a circuit [in 0.2.1.3-alpha] - 152 Optionally allow exit from single-hop circuits [in 0.2.1.6-alpha] - 167 Vote on network parameters in consensus [in 0.2.2] - SUPERSEDED: - 112 Bring Back Pathlen Coin Weight - 113 Simplifying directory authority administration - 124 Blocking resistant TLS certificate usage - 153 Automatic software update protocol - 154 Automatic Software Update Protocol - DEAD: - 100 Tor Unreliable Datagram Extension Proposal - 115 Two Hop Paths - 116 Two hop paths from entry guards - 120 Shutdown descriptors when Tor servers stop - 128 Families of private bridges - 142 Combine Introduction and Rendezvous Points - REJECTED: - 134 More robust consensus voting with diverse authority sets diff --git a/doc/spec/proposals/001-process.txt b/doc/spec/proposals/001-process.txt deleted file mode 100644 index 53ad32ba12..0000000000 --- a/doc/spec/proposals/001-process.txt +++ /dev/null @@ -1,184 +0,0 @@ -Filename: 001-process.txt -Title: The 
Tor Proposal Process -Author: Nick Mathewson -Created: 30-Jan-2007 -Status: Meta - -Overview: - - This document describes how to change the Tor specifications, how Tor - proposals work, and the relationship between Tor proposals and the - specifications. - - This is an informational document. - -Motivation: - - Previously, our process for updating the Tor specifications was maximally - informal: we'd patch the specification (sometimes forking first, and - sometimes not), then discuss the patches, reach consensus, and implement - the changes. - - This had a few problems. - - First, even at its most efficient, the old process would often have the - spec out of sync with the code. The worst cases were those where - implementation was deferred: the spec and code could stay out of sync for - versions at a time. - - Second, it was hard to participate in discussion, since you had to know - which portions of the spec were a proposal, and which were already - implemented. - - Third, it littered the specifications with too many inline comments. - [This was a real problem -NM] - [Especially when it went to multiple levels! -NM] - [XXXX especially when they weren't signed and talked about that - thing that you can't remember after a year] - -How to change the specs now: - - First, somebody writes a proposal document. It should describe the change - that should be made in detail, and give some idea of how to implement it. - Once it's fleshed out enough, it becomes a proposal. - - Like an RFC, every proposal gets a number. Unlike RFCs, proposals can - change over time and keep the same number, until they are finally - accepted or rejected. The history for each proposal - will be stored in the Tor repository. - - Once a proposal is in the repository, we should discuss and improve it - until we've reached consensus that it's a good idea, and that it's - detailed enough to implement. When this happens, we implement the - proposal and incorporate it into the specifications. 
Thus, the specs - remain the canonical documentation for the Tor protocol: no proposal is - ever the canonical documentation for an implemented feature. - - (This process is pretty similar to the Python Enhancement Process, with - the major exception that Tor proposals get re-integrated into the specs - after implementation, whereas PEPs _become_ the new spec.) - - {It's still okay to make small changes directly to the spec if the code - can be - written more or less immediately, or cosmetic changes if no code change is - required. This document reflects the current developers' _intent_, not - a permanent promise to always use this process in the future: we reserve - the right to get really excited and run off and implement something in a - caffeine-or-m&m-fueled all-night hacking session.} - -How new proposals get added: - - Once an idea has been proposed on the development list, a properly formatted - (see below) draft exists, and rough consensus within the active development - community exists that this idea warrants consideration, the proposal editor - will officially add the proposal. - - To get your proposal in, send it to or-dev. - - The current proposal editors are Nick Mathewson and Jacob Appelbaum. - -What should go in a proposal: - - Every proposal should have a header containing these fields: - Filename, Title, Author, Created, Status. - - These fields are optional but recommended: - Target, Implemented-In. - The Target field should describe which version the proposal is hoped to be - implemented in (if it's Open or Accepted). The Implemented-In field - should describe which version the proposal was implemented in (if it's - Finished or Closed). - - The body of the proposal should start with an Overview section explaining - what the proposal's about, what it does, and about what state it's in. - - After the Overview, the proposal becomes more free-form. 
Depending on its - length and complexity, the proposal can break into sections as - appropriate, or follow a short discursive format. Every proposal should - contain at least the following information before it is "ACCEPTED", - though the information does not need to be in sections with these names. - - Motivation: What problem is the proposal trying to solve? Why does - this problem matter? If several approaches are possible, why take this - one? - - Design: A high-level view of what the new or modified features are, how - the new or modified features work, how they interoperate with each - other, and how they interact with the rest of Tor. This is the main - body of the proposal. Some proposals will start out with only a - Motivation and a Design, and wait for a specification until the - Design seems approximately right. - - Security implications: What effects the proposed changes might have on - anonymity, how well understood these effects are, and so on. - - Specification: A detailed description of what needs to be added to the - Tor specifications in order to implement the proposal. This should - be in about as much detail as the specifications will eventually - contain: it should be possible for independent programmers to write - mutually compatible implementations of the proposal based on its - specifications. - - Compatibility: Will versions of Tor that follow the proposal be - compatible with versions that do not? If so, how will compatibility - be achieved? Generally, we try to not drop compatibility if at - all possible; we haven't made a "flag day" change since May 2004, - and we don't want to do another one. - - Implementation: If the proposal will be tricky to implement in Tor's - current architecture, the document can contain some discussion of how - to go about making it work. Actual patches should go on public git - branches, or be uploaded to trac. 
- - Performance and scalability notes: If the feature will have an effect - on performance (in RAM, CPU, bandwidth) or scalability, there should - be some analysis on how significant this effect will be, so that we - can avoid really expensive performance regressions, and so we can - avoid wasting time on insignificant gains. - -Proposal status: - - Open: A proposal under discussion. - - Accepted: The proposal is complete, and we intend to implement it. - After this point, substantive changes to the proposal should be - avoided, and regarded as a sign of the process having failed - somewhere. - - Finished: The proposal has been accepted and implemented. After this - point, the proposal should not be changed. - - Closed: The proposal has been accepted, implemented, and merged into the - main specification documents. The proposal should not be changed after - this point. - - Rejected: We're not going to implement the feature as described here, - though we might do some other version. See comments in the document - for details. The proposal should not be changed after this point; - to bring up some other version of the idea, write a new proposal. - - Draft: This isn't a complete proposal yet; there are definite missing - pieces. Please don't add any new proposals with this status; put them - in the "ideas" sub-directory instead. - - Needs-Revision: The idea for the proposal is a good one, but the proposal - as it stands has serious problems that keep it from being accepted. - See comments in the document for details. - - Dead: The proposal hasn't been touched in a long time, and it doesn't look - like anybody is going to complete it soon. It can become "Open" again - if it gets a new proponent. - - Needs-Research: There are research problems that need to be solved before - it's clear whether the proposal is a good idea. - - Meta: This is not a proposal, but a document about proposals. 
- - - The editor maintains the correct status of proposals, based on rough - consensus and his own discretion. - -Proposal numbering: - - Numbers 000-099 are reserved for special and meta-proposals. 100 and up - are used for actual proposals. Numbers aren't recycled. diff --git a/doc/spec/proposals/098-todo.txt b/doc/spec/proposals/098-todo.txt deleted file mode 100644 index a0bbbeb568..0000000000 --- a/doc/spec/proposals/098-todo.txt +++ /dev/null @@ -1,107 +0,0 @@ -Filename: 098-todo.txt -Title: Proposals that should be written -Author: Nick Mathewson, Roger Dingledine -Created: 26-Jan-2007 -Status: Meta - -Overview: - - This document lists ideas that various people have had for improving the - Tor protocol. These should be implemented and specified if they're - trivial, or written up as proposals if they're not. - - This is an active document, to be edited as proposals are written and as - we come up with new ideas for proposals. We should take stuff out as it - seems irrelevant. - - -For some later protocol version. - - - It would be great to get smarter about identity and linkability. - It's not crazy to say, "Never use the same circuit for my SSH - connections and my web browsing." How far can/should we take this? - See ideas/xxx-separate-streams-by-port.txt for a start. - - - Fix onionskin handshake scheme to be more mainstream, less nutty. - Can we just do - E(HMAC(g^x), g^x) rather than just E(g^x) ? - No, that has the same flaws as before. We should send - E(g^x, C) with random C and expect g^y, HMAC_C(K=g^xy). - Better ask Ian; probably Stephen too. - - - Length on CREATE and friends - - - Versioning on circuits and create cells, so we have a clear path - to improve the circuit protocol. - - - SHA1 is showing its age. We should get a design for upgrading our - hash once the AHS competition is done, or even sooner. - - - Not being able to upgrade ciphersuites or increase key lengths is - lame. 
- - Paul has some ideas about circuit creation; read his PET paper once it's - out. - -Any time: - - - Some ideas for revising the directory protocol: - - Extend the "r" line in network-status to give a set of buckets (say, - comma-separated) for that router. - - Buckets are deterministic based on IP address. - - Then clients can choose a bucket (or set of buckets) to - download and use. - - We need a way for the authorities to declare that nodes are in a - family. Also, it kinda sucks that family declarations use O(N^2) space - in the descriptors. - - REASON_CONNECTFAILED should include an IP. - - Spec should incorporate some prose from tor-design to be more readable. - - Spec when we should rotate which keys - - Spec how to publish descriptors less often - - Describe pros and cons of non-deterministic path lengths - - - We should use a variable-length path length by default -- 3 +/- some - distribution. Need to think harder about allowing values less than 3, - and there's a tradeoff between having a wide variance and performance. - - - Clients currently use certs during TLS. Is this wise? It does make it - easier for servers to tell which NATted client is which. We could use a - separate set of certs for each guard, I suppose, but generating so many - certs could get expensive. Omitting them entirely would make OP->OR - easier to tell from OR->OR. - -Things that should change... - -B.1. ... but which will require backward-incompatible change - - - Circuit IDs should be longer. - . IPv6 everywhere. - - Maybe, keys should be longer. - - Maybe, key-length should be adjustable. How to do this without - making anonymity suck? - - Drop backward compatibility. - - We should use a 128-bit subgroup of our DH prime. - - Handshake should use HMAC. - - Multiple cell lengths. - - Ability to split circuits across paths (If this is useful.) - - SENDME windows should be dynamic. - - - Directory - - Stop ever mentioning socks ports - -B.1. ... 
and that will require no changes - - - Advertised outbound IP? - - Migrate streams across circuits. - - Fix bug 469 by limiting the number of simultaneous connections per IP. - -B.2. ... and that we have no idea how to do. - - - UDP (as transport) - - UDP (as content) - - Use a better AES mode that has built-in integrity checking, - doesn't grow with the number of hops, is not patented, and - is implemented and maintained by smart people. - -Let onion keys be not just RSA but maybe DH too, for Paul's reply onion -design. - diff --git a/doc/spec/proposals/099-misc.txt b/doc/spec/proposals/099-misc.txt deleted file mode 100644 index a3621dd25f..0000000000 --- a/doc/spec/proposals/099-misc.txt +++ /dev/null @@ -1,28 +0,0 @@ -Filename: 099-misc.txt -Title: Miscellaneous proposals -Author: Various -Created: 26-Jan-2007 -Status: Meta - -Overview: - - This document is for small proposal ideas that are about one paragraph in - length. From here, ideas can be rejected outright, expanded into full - proposals, or specified and implemented as-is. - -Proposals - -1. Directory compression. - - Gzip would be easier to work with than zlib; bzip2 would result in smaller - data lengths. [Concretely, we're looking at about 10-15% space savings at - the expense of 3-5x longer compression time for using bzip2.] Doing - on-the-fly gzip requires zlib 1.2 or later; doing bzip2 requires bzlib. - Pre-compressing status documents in multiple formats would force us to use - more memory to hold them. 
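The size-versus-time tradeoff described above can be examined with a short experiment. The following is a minimal sketch, assuming Python's standard zlib and bz2 modules as stand-ins for zlib and bzlib; the sample document and the function name compare_compression are hypothetical, and real status documents may behave differently.

```python
import bz2
import time
import zlib


def compare_compression(data: bytes) -> dict:
    """Compress `data` with zlib and bz2; return size and elapsed time for each."""
    results = {}
    for name, compress, decompress in (
        ("zlib", lambda d: zlib.compress(d, 9), zlib.decompress),
        ("bz2", lambda d: bz2.compress(d, 9), bz2.decompress),
    ):
        start = time.perf_counter()
        packed = compress(data)
        elapsed = time.perf_counter() - start
        # Round-trip check: the compressed form must decode back to the input.
        assert decompress(packed) == data
        results[name] = {"bytes": len(packed), "seconds": elapsed}
    return results


# Hypothetical stand-in for a network-status document (not real Tor data).
sample = b"".join(
    b"r router%04d AAAA%04d 2011-02-21 16:10:31 10.0.%d.%d 9001 9030\n"
    b"s Fast Running Valid\n" % (i, i, i // 256, i % 256)
    for i in range(2000)
)

if __name__ == "__main__":
    for name, stats in compare_compression(sample).items():
        print(f"{name}: {stats['bytes']} bytes in {stats['seconds']:.4f}s")
```

Running this against a cached status document instead of the synthetic sample would show whether the space savings and slowdown quoted above hold for a given input.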
- - Status: Open - - -- Nick Mathewson - - diff --git a/doc/spec/proposals/100-tor-spec-udp.txt b/doc/spec/proposals/100-tor-spec-udp.txt deleted file mode 100644 index 7f062222c5..0000000000 --- a/doc/spec/proposals/100-tor-spec-udp.txt +++ /dev/null @@ -1,422 +0,0 @@ -Filename: 100-tor-spec-udp.txt -Title: Tor Unreliable Datagram Extension Proposal -Author: Marc Liberatore -Created: 23 Feb 2006 -Status: Dead - -Overview: - - This is a modified version of the Tor specification written by Marc - Liberatore to add UDP support to Tor. For each TLS link, it adds a - corresponding DTLS link: control messages and TCP data flow over TLS, and - UDP data flows over DTLS. - - This proposal is not likely to be accepted as-is; see comments at the end - of the document. - - -Contents - -0. Introduction - - Tor is a distributed overlay network designed to anonymize low-latency - TCP-based applications. The current tor specification supports only - TCP-based traffic. This limitation prevents the use of tor to anonymize - other important applications, notably voice over IP software. This document - is a proposal to extend the tor specification to support UDP traffic. - - The basic design philosophy of this extension is to add support for - tunneling unreliable datagrams through tor with as few modifications to the - protocol as possible. As currently specified, tor cannot directly support - such tunneling, as connections between nodes are built using transport layer - security (TLS) atop TCP. The latency incurred by TCP is likely unacceptable - to the operation of most UDP-based application level protocols. - - Thus, we propose the addition of links between nodes using datagram - transport layer security (DTLS). These links allow packets to traverse a - route through tor quickly, but their unreliable nature requires minor - changes to the tor protocol. This proposal outlines the necessary - additions and changes to the tor specification to support UDP traffic. 
- - We note that a separate set of DTLS links between nodes creates a second - overlay, distinct from that composed of TLS links. This separation and - resulting decrease in each anonymity set's size will make certain attacks - easier. However, it is our belief that VoIP support in tor will - dramatically increase its appeal, and correspondingly, the size of its user - base, number of deployed nodes, and total traffic relayed. These increases - should help offset the loss of anonymity that two distinct networks imply. - -1. Overview of Tor-UDP and its complications - - As described above, this proposal extends the Tor specification to support - UDP with as few changes as possible. Tor's overlay network is managed - through TLS-based connections; we will re-use this control plane to set up - and tear down circuits that relay UDP traffic. These circuits will be built atop - DTLS, in a fashion analogous to how Tor currently sends TCP traffic over - TLS. - - The unreliability of DTLS circuits creates problems for Tor at two levels: - - 1. Tor's encryption of the relay layer does not allow independent - decryption of individual records. If record N is not received, then - record N+1 will not decrypt correctly, as the counter for AES/CTR is - maintained implicitly. - - 2. Tor's end-to-end integrity checking works under the assumption that - all RELAY cells are delivered. This assumption is invalid when cells - are sent over DTLS. - - The fix for the first problem is straightforward: add an explicit sequence - number to each cell. To fix the second problem, we introduce a - system of nonces and hashes to RELAY packets. - - In the following sections, we mirror the layout of the Tor Protocol - Specification, presenting the necessary modifications to the Tor protocol as - a series of deltas. - -2. Connections - - Tor-UDP uses DTLS for encryption of some links. All DTLS links must have - corresponding TLS links, as all control messages are sent over TLS. 
All - implementations MUST support the DTLS ciphersuite "[TODO]". - - DTLS connections are formed using the same protocol as TLS connections. - This occurs upon request, following a CREATE_UDP or CREATE_FAST_UDP cell, - as detailed in section 4.6. - - Once a paired TLS/DTLS connection is established, the two sides send cells - to one another. All but two types of cells are sent over TLS links. RELAY - cells containing the commands RELAY_UDP_DATA and RELAY_UDP_DROP, specified - below, are sent over DTLS links. [Should all cells still be 512 bytes long? - Perhaps upon completion of a preliminary implementation, we should do a - performance evaluation for some class of UDP traffic, such as VoIP. - ML] - Cells may be sent embedded in TLS or DTLS records of any size or divided - across such records. The framing of these records MUST NOT leak any more - information than the above differentiation on the basis of cell type. [I am - uncomfortable with this leakage, but don't see any simple, elegant way - around it. -ML] - - As with TLS connections, DTLS connections are not permanent. - -3. 
Cell format - - Each cell contains the following fields: - - CircID [2 bytes] - Command [1 byte] - Sequence Number [2 bytes] - Payload (padded with 0 bytes) [507 bytes] - [Total size: 512 bytes] - - The 'Command' field holds one of the following values: - 0 -- PADDING (Padding) (See Sec 6.2) - 1 -- CREATE (Create a circuit) (See Sec 4) - 2 -- CREATED (Acknowledge create) (See Sec 4) - 3 -- RELAY (End-to-end data) (See Sec 5) - 4 -- DESTROY (Stop using a circuit) (See Sec 4) - 5 -- CREATE_FAST (Create a circuit, no PK) (See Sec 4) - 6 -- CREATED_FAST (Circuit created, no PK) (See Sec 4) - 7 -- CREATE_UDP (Create a UDP circuit) (See Sec 4) - 8 -- CREATED_UDP (Acknowledge UDP create) (See Sec 4) - 9 -- CREATE_FAST_UDP (Create a UDP circuit, no PK) (See Sec 4) - 10 -- CREATED_FAST_UDP(UDP circuit created, no PK) (See Sec 4) - - The sequence number allows for AES/CTR decryption of RELAY cells - independently of one another; this functionality is required to support - cells sent over DTLS. The sequence number is described in more detail in - section 4.5. - - [Should the sequence number only appear in RELAY packets? The overhead is - small, and I'm hesitant to force more code paths on the implementor. -ML] - [There's already a separate relay header that has other material in it, - so it wouldn't be the end of the world to move it there if it's - appropriate. -RD] - - [Having separate commands for UDP circuits seems necessary, unless we can - assume a flag day event for a large number of tor nodes. -ML] - -4. Circuit management - -4.2. Setting circuit keys - - Keys are set up for UDP circuits in the same fashion as for TCP circuits. - Each UDP circuit shares keys with its corresponding TCP circuit. - - [If the keys are used for both TCP and UDP connections, how does it - work to mix sequence-number-less cells with sequenced-numbered cells -- - how do you know you have the encryption order right? -RD] - -4.3. 
Creating circuits - - UDP circuits are created as TCP circuits, using the *_UDP cells as - appropriate. - -4.4. Tearing down circuits - - UDP circuits are torn down as TCP circuits, using the *_UDP cells as - appropriate. - -4.5. Routing relay cells - - When an OR receives a RELAY cell, it checks the cell's circID and - determines whether it has a corresponding circuit along that - connection. If not, the OR drops the RELAY cell. - - Otherwise, if the OR is not at the OP edge of the circuit (that is, - either an 'exit node' or a non-edge node), it de/encrypts the payload - with AES/CTR, as follows: - 'Forward' relay cell (same direction as CREATE): - Use Kf as key; decrypt, using sequence number to synchronize - ciphertext and keystream. - 'Back' relay cell (opposite direction from CREATE): - Use Kb as key; encrypt, using sequence number to synchronize - ciphertext and keystream. - Note that in counter mode, decrypt and encrypt are the same operation. - [Since the sequence number is only 2 bytes, what do you do when it - rolls over? -RD] - - Each stream encrypted by a Kf or Kb has a corresponding unique state, - captured by a sequence number; the originator of each such stream chooses - the initial sequence number randomly, and increments it only with RELAY - cells. [This counts cells; unlike, say, TCP, tor uses fixed-size cells, so - there's no need for counting bytes directly. Right? - ML] - [I believe this is true. You'll find out for sure when you try to - build it. ;) -RD] - - The OR then decides whether it recognizes the relay cell, by - inspecting the payload as described in section 5.1 below. If the OR - recognizes the cell, it processes the contents of the relay cell. - Otherwise, it passes the decrypted relay cell along the circuit if - the circuit continues. If the OR at the end of the circuit - encounters an unrecognized relay cell, an error has occurred: the OR - sends a DESTROY cell to tear down the circuit. 
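The role of the explicit sequence number in the routing rules above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the proposal's construction: Python's standard library has no AES, so a SHA-256-based keystream stands in for AES/CTR, and the way the sequence number seeds the keystream is an assumption, since the proposal does not specify it. The names keystream, crypt_payload, seal_cell, and open_cell are hypothetical, and only a single encryption layer is shown.

```python
import hashlib
import struct

PAYLOAD_LEN = 507  # payload size in the proposed 512-byte cell layout


def keystream(key: bytes, seq: int, length: int) -> bytes:
    """Toy keystream (NOT AES/CTR): SHA-256 over key, sequence number, block index."""
    out = bytearray()
    block = 0
    while len(out) < length:
        out += hashlib.sha256(key + struct.pack(">HI", seq, block)).digest()
        block += 1
    return bytes(out[:length])


def crypt_payload(key: bytes, seq: int, payload: bytes) -> bytes:
    """XOR with the keystream for `seq`; encrypting and decrypting are the same operation."""
    ks = keystream(key, seq, len(payload))
    return bytes(a ^ b for a, b in zip(payload, ks))


def seal_cell(key: bytes, circ_id: int, command: int, seq: int, data: bytes) -> bytes:
    """Build CircID (2 bytes), Command (1), Sequence Number (2), encrypted payload (507)."""
    padded = data.ljust(PAYLOAD_LEN, b"\x00")  # pad before encrypting
    return struct.pack(">HBH", circ_id, command, seq) + crypt_payload(key, seq, padded)


def open_cell(key: bytes, cell: bytes):
    """Read the sequence number from the header and decrypt this cell on its own."""
    circ_id, command, seq = struct.unpack(">HBH", cell[:5])
    return seq, crypt_payload(key, seq, cell[5:])


if __name__ == "__main__":
    kf = b"hypothetical forward key"
    cells = [seal_cell(kf, 7, 3, seq, b"datagram %d" % seq) for seq in (100, 101, 102)]
    # Simulate loss of the cell with sequence number 101.
    for cell in (cells[0], cells[2]):
        seq, plain = open_cell(kf, cell)
        print(seq, plain.rstrip(b"\x00"))
```

Because each cell's keystream depends only on the key and the sequence number carried in that cell, a lost cell does not disturb decryption of later ones. In this sketch a 2-byte sequence number repeats after 65536 cells, which would reuse keystream; that is the rollover concern raised in the comments above.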
- - When a relay cell arrives at an OP, the OP decrypts the payload - with AES/CTR as follows: - OP receives data cell: - For I=N...1, - Decrypt with Kb_I, using the sequence number as above. If the - payload is recognized (see section 5.1), then stop and process - the payload. - - For more information, see section 5 below. - -4.6. CREATE_UDP and CREATED_UDP cells - - Users set up UDP circuits incrementally. The procedure is similar to that - for TCP circuits, as described in section 4.1. In addition to the TLS - connection to the first node, the OP also attempts to open a DTLS - connection. If this succeeds, the OP sends a CREATE_UDP cell, with a - payload in the same format as a CREATE cell. To extend a UDP circuit past - the first hop, the OP sends an EXTEND_UDP relay cell (see section 5) which - instructs the last node in the circuit to send a CREATE_UDP cell to extend - the circuit. - - The relay payload for an EXTEND_UDP relay cell consists of: - Address [4 bytes] - TCP port [2 bytes] - UDP port [2 bytes] - Onion skin [186 bytes] - Identity fingerprint [20 bytes] - - The address field and ports denote the IPV4 address and ports of the next OR - in the circuit. - - The payload for a CREATED_UDP cell or the relay payload for an - RELAY_EXTENDED_UDP cell is identical to that of the corresponding CREATED or - RELAY_EXTENDED cell. Both circuits are established using the same key. - - Note that the existence of a UDP circuit implies the - existence of a corresponding TCP circuit, sharing keys, sequence numbers, - and any other relevant state. - -4.6.1 CREATE_FAST_UDP/CREATED_FAST_UDP cells - - As above, the OP must successfully connect using DTLS before attempting to - send a CREATE_FAST_UDP cell. Otherwise, the procedure is the same as in - section 4.1.1. - -5. Application connections and stream management - -5.1. 
Relay cells - - Within a circuit, the OP and the exit node use the contents of RELAY cells - to tunnel end-to-end commands, TCP connections ("Streams"), and UDP packets - across circuits. End-to-end commands and UDP packets can be initiated by - either edge; streams are initiated by the OP. - - The payload of each unencrypted RELAY cell consists of: - Relay command [1 byte] - 'Recognized' [2 bytes] - StreamID [2 bytes] - Digest [4 bytes] - Length [2 bytes] - Data [498 bytes] - - The relay commands are: - 1 -- RELAY_BEGIN [forward] - 2 -- RELAY_DATA [forward or backward] - 3 -- RELAY_END [forward or backward] - 4 -- RELAY_CONNECTED [backward] - 5 -- RELAY_SENDME [forward or backward] - 6 -- RELAY_EXTEND [forward] - 7 -- RELAY_EXTENDED [backward] - 8 -- RELAY_TRUNCATE [forward] - 9 -- RELAY_TRUNCATED [backward] - 10 -- RELAY_DROP [forward or backward] - 11 -- RELAY_RESOLVE [forward] - 12 -- RELAY_RESOLVED [backward] - 13 -- RELAY_BEGIN_UDP [forward] - 14 -- RELAY_DATA_UDP [forward or backward] - 15 -- RELAY_EXTEND_UDP [forward] - 16 -- RELAY_EXTENDED_UDP [backward] - 17 -- RELAY_DROP_UDP [forward or backward] - - Commands labelled as "forward" must only be sent by the originator - of the circuit. Commands labelled as "backward" must only be sent by - other nodes in the circuit back to the originator. Commands marked - as either can be sent either by the originator or other nodes. - - The 'recognized' field in any unencrypted relay payload is always set to - zero. - - The 'digest' field can have two meanings. 
For all cells sent over TLS - connections (that is, all commands and all non-UDP RELAY data), it is - computed as the first four bytes of the running SHA-1 digest of all the - bytes that have been sent reliably and have been destined for this hop of - the circuit or originated from this hop of the circuit, seeded from Df or Db - respectively (obtained in section 4.2 above), and including this RELAY - cell's entire payload (taken with the digest field set to zero). Cells sent - over DTLS connections do not affect this running digest. Each cell sent - over DTLS (that is, RELAY_DATA_UDP and RELAY_DROP_UDP) has the digest field - set to the SHA-1 digest of the current RELAY cells' entire payload, with the - digest field set to zero. Coupled with a randomly-chosen streamID, this - provides per-cell integrity checking on UDP cells. - [If you drop malformed UDP relay cells but don't close the circuit, - then this 8 bytes of digest is not as strong as what we get in the - TCP-circuit side. Is this a problem? -RD] - - When the 'recognized' field of a RELAY cell is zero, and the digest - is correct, the cell is considered "recognized" for the purposes of - decryption (see section 4.5 above). - - (The digest does not include any bytes from relay cells that do - not start or end at this hop of the circuit. That is, it does not - include forwarded data. Therefore if 'recognized' is zero but the - digest does not match, the running digest at that node should - not be updated, and the cell should be forwarded on.) - - All RELAY cells pertaining to the same tunneled TCP stream have the - same streamID. Such streamIDs are chosen arbitrarily by the OP. RELAY - cells that affect the entire circuit rather than a particular - stream use a StreamID of zero. - - All RELAY cells pertaining to the same UDP tunnel have the same streamID. - This streamID is chosen randomly by the OP, but cannot be zero. 
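[Editor's sketch, not part of the proposal. The per-cell digest described above for cells sent over DTLS could be computed roughly as follows. The field sizes are the ones listed in section 5.1. Truncating the SHA-1 output to its first four bytes is an assumption made to fit the 4-byte digest field, and the function names are hypothetical.]

```python
import hashlib
import struct

# Field sizes as listed in section 5.1: command, 'recognized', streamID,
# digest, length, data.
RELAY_FORMAT = ">BHH4sH498s"
RELAY_DATA_UDP = 14


def pack_udp_relay(command, stream_id, data):
    """Build an unencrypted relay payload carrying a per-cell digest."""
    if stream_id == 0:
        raise ValueError("UDP tunnels cannot use streamID zero")
    if len(data) > 498:
        raise ValueError("data does not fit in one cell")
    # struct pads the 498-byte data field with NUL bytes.
    zeroed = struct.pack(RELAY_FORMAT, command, 0, stream_id,
                         b"\0" * 4, len(data), data)
    digest = hashlib.sha1(zeroed).digest()[:4]  # assumption: first 4 bytes
    return zeroed[:5] + digest + zeroed[9:]


def check_udp_relay(payload):
    """Return True if 'recognized' is zero and the per-cell digest matches."""
    _, recognized, _, digest, _, _ = struct.unpack(RELAY_FORMAT, payload)
    zeroed = payload[:5] + b"\0" * 4 + payload[9:]
    return recognized == 0 and hashlib.sha1(zeroed).digest()[:4] == digest
```

[Unlike the running digest used on TLS connections, this check keeps no state between cells, so a lost or reordered cell does not invalidate the cells that follow it.]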
- - The 'Length' field of a relay cell contains the number of bytes in - the relay payload which contain real payload data. The remainder of - the payload is padded with NUL bytes. - - If the RELAY cell is recognized but the relay command is not - understood, the cell must be dropped and ignored. Its contents - still count with respect to the digests, though. [Before - 0.1.1.10, Tor closed circuits when it received an unknown relay - command. Perhaps this will be more forward-compatible. -RD] - -5.2.1. Opening UDP tunnels and transferring data - - To open a new anonymized UDP connection, the OP chooses an open - circuit to an exit that may be able to connect to the destination - address, selects a random streamID not yet used on that circuit, - and constructs a RELAY_BEGIN_UDP cell with a payload encoding the address - and port of the destination host. The payload format is: - - ADDRESS | ':' | PORT | [00] - - where ADDRESS can be a DNS hostname, or an IPv4 address in - dotted-quad format, or an IPv6 address surrounded by square brackets; - and where PORT is encoded in decimal. - - [What is the [00] for? -NM] - [It's so the payload is easy to parse out with string funcs -RD] - - Upon receiving this cell, the exit node resolves the address as necessary. - If the address cannot be resolved, the exit node replies with a RELAY_END - cell. (See 5.4 below.) Otherwise, the exit node replies with a - RELAY_CONNECTED cell, whose payload is in one of the following formats: - The IPv4 address to which the connection was made [4 octets] - A number of seconds (TTL) for which the address may be cached [4 octets] - or - Four zero-valued octets [4 octets] - An address type (6) [1 octet] - The IPv6 address to which the connection was made [16 octets] - A number of seconds (TTL) for which the address may be cached [4 octets] - [XXXX Versions of Tor before 0.1.1.6 ignore and do not generate the TTL - field. No version of Tor currently generates the IPv6 format.] 
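[Editor's sketch, not part of the proposal. The RELAY_BEGIN_UDP payload and the two RELAY_CONNECTED formats above could be handled along these lines. Network byte order for the TTL is an assumption, the function names are hypothetical, and the caller is assumed to have already trimmed the relay data to the number of bytes given in the 'Length' field.]

```python
import ipaddress
import struct


def encode_begin_udp(address, port):
    """Encode ADDRESS ':' PORT [00]; IPv6 literals get square brackets."""
    try:
        if ipaddress.ip_address(address).version == 6:
            address = "[" + address + "]"
    except ValueError:
        pass  # not an IP literal, so treat it as a DNS hostname
    return ("%s:%d" % (address, port)).encode("ascii") + b"\x00"


def parse_connected(payload):
    """Return (address, ttl_seconds) from either RELAY_CONNECTED format."""
    if len(payload) == 8:
        addr, ttl = struct.unpack(">4sI", payload)
        return str(ipaddress.IPv4Address(addr)), ttl
    if len(payload) == 25 and payload[:4] == b"\0" * 4 and payload[4] == 6:
        addr, ttl = struct.unpack(">16sI", payload[5:])
        return str(ipaddress.IPv6Address(addr)), ttl
    raise ValueError("unrecognized RELAY_CONNECTED payload")
```

[The two reply formats are told apart here by length alone; the four zero octets and the type octet are then checked for the IPv6 form.]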
- - The OP waits for a RELAY_CONNECTED cell before sending any data. - Once a connection has been established, the OP and exit node - package UDP data in RELAY_DATA_UDP cells, and upon receiving such - cells, echo their contents to the corresponding socket. - RELAY_DATA_UDP cells sent to unrecognized streams are dropped. - - Relay RELAY_DROP_UDP cells are long-range dummies; upon receiving such - a cell, the OR or OP must drop it. - -5.3. Closing streams - - UDP tunnels are closed in a fashion corresponding to TCP connections. - -6. Flow Control - - UDP streams are not subject to flow control. - -7.2. Router descriptor format. - -The items' formats are as follows: - "router" nickname address ORPort SocksPort DirPort UDPPort - - Indicates the beginning of a router descriptor. "address" must be - an IPv4 address in dotted-quad format. The last three numbers - indicate the TCP ports at which this OR exposes - functionality. ORPort is a port at which this OR accepts TLS - connections for the main OR protocol; SocksPort is deprecated and - should always be 0; DirPort is the port at which this OR accepts - directory-related HTTP connections; and UDPPort is a port at which - this OR accepts DTLS connections for UDP data. If any port is not - supported, the value 0 is given instead of a port number. - -Other sections: - -What changes need to happen to each node's exit policy to support this? -RD - -Switching to UDP means managing the queues of incoming packets better, -so we don't miss packets. How does this interact with doing large public -key operations (handshakes) in the same thread? -RD - -======================================================================== -COMMENTS -======================================================================== - -[16 May 2006] - -I don't favor this approach; it makes packet traffic partitioned from -stream traffic end-to-end. 
The architecture I'd like to see is: - - A *All* Tor-to-Tor traffic is UDP/DTLS, unless we need to fall back on - TCP/TLS for firewall penetration or something. (This also gives us an - upgrade path for routing through legacy servers.) - - B Stream traffic is handled with end-to-end per-stream acks/naks and - retries. On failure, the data is retransmitted in a new RELAY_DATA cell; - a cell isn't retransmitted. - -We'll need to do A anyway, to fix our behavior on packet-loss. Once we've -done so, B is more or less inevitable, and we can support end-to-end UDP -traffic "for free". - -(Also, there are some details that this draft spec doesn't address. For -example, what happens when a UDP packet doesn't fit in a single cell?) - --NM diff --git a/doc/spec/proposals/101-dir-voting.txt b/doc/spec/proposals/101-dir-voting.txt deleted file mode 100644 index 634d3f1948..0000000000 --- a/doc/spec/proposals/101-dir-voting.txt +++ /dev/null @@ -1,283 +0,0 @@ -Filename: 101-dir-voting.txt -Title: Voting on the Tor Directory System -Author: Nick Mathewson -Created: Nov 2006 -Status: Closed -Implemented-In: 0.2.0.x - -Overview - - This document describes a consensus voting scheme for Tor directories; - instead of publishing different network statuses, directories would vote on - and publish a single "consensus" network status document. - - This is an open proposal. - -Proposal: - -0. Scope and preliminaries - - This document describes a consensus voting scheme for Tor directories. - Once it's accepted, it should be merged with dir-spec.txt. Some - preliminaries for authority and caching support should be done during - the 0.1.2.x series; the main deployment should come during the 0.2.0.x - series. - -0.1. Goals and motivation: voting. - - The current directory system relies on clients downloading separate - network status statements from the caches signed by each directory. 
- Clients download a new statement every 30 minutes or so, choosing to - replace the oldest statement they currently have. - - This creates a partitioning problem: different clients have different - "most recent" networkstatus sources, and different versions of each - (since authorities change their statements often). - - It also creates a scaling problem: most of the downloaded networkstatus - are probably quite similar, and the redundancy grows as we add more - authorities. - - So if we have clients only download a single multiply signed consensus - network status statement, we can: - - Save bandwidth. - - Reduce client partitioning - - Reduce client-side and cache-side storage - - Simplify client-side voting code (by moving voting away from the - client) - - We should try to do this without: - - Assuming that client-side or cache-side clocks are more correct - than we assume now. - - Assuming that authority clocks are perfectly correct. - - Degrading badly if a few authorities die or are offline for a bit. - - We do not have to perform well if: - - No clique of more than half the authorities can agree about who - the authorities are. - -1. The idea. - - Instead of publishing a network status whenever something changes, - each authority instead publishes a fresh network status only once per - "period" (say, 60 minutes). Authorities either upload this network - status (or "vote") to every other authority, or download every other - authority's "vote" (see 3.1 below for discussion on push vs pull). - - After an authority has (or has become convinced that it won't be able to - get) every other authority's vote, it deterministically computes a - consensus networkstatus, and signs it. Authorities download (or are - uploaded; see 3.1) one another's signatures, and form a multiply signed - consensus. This multiply-signed consensus is what caches cache and what - clients download. 
- - If an authority is down, authorities vote based on what they *can* - download/get uploaded. - - If an authority is "a little" down and only some authorities can reach - it, authorities try to get its info from other authorities. - - If an authority computes the vote wrong, its signature isn't included on - the consensus. - - Clients use a consensus if it is "trusted": signed by more than half the - authorities they recognize. If clients can't find any such consensus, - they use the most recent trusted consensus they have. If they don't - have any trusted consensus, they warn the user and refuse to operate - (and if DirServers is not the default, beg the user to adapt the list - of authorities). - -2. Details. - -2.0. Versioning - - All documents generated here have version "3" given in their - network-status-version entries. - -2.1. Vote specifications - - Votes in v3 are similar to v2 network status documents. We add these - fields to the preamble: - - "vote-status" -- the word "vote". - - "valid-until" -- the time when this authority expects to publish its - next vote. - - "known-flags" -- a space-separated list of flags that will sometimes - be included on "s" lines later in the vote. - - "dir-source" -- as before, except the "hostname" part MUST be the - authority's nickname, which MUST be unique among authorities, and - MUST match the nickname in the "directory-signature" entry. - - Authorities SHOULD cache their most recently generated votes so they - can persist them across restarts. Authorities SHOULD NOT generate - another document until valid-until has passed. - - Router entries in the vote MUST be sorted in ascending order by router - identity digest. The flags in "s" lines MUST appear in alphabetical - order. - - Votes SHOULD be synchronized to half-hour publication intervals (one - hour? XXX say more; be more precise.) - - XXXX some way to request older networkstatus docs? - -2.2. 
Consensus directory specifications - - Consensuses are like v3 votes, except for the following fields: - - "vote-status" -- the word "consensus". - - "published" is the latest of all the published times on the votes. - - "valid-until" is the earliest of all the valid-until times on the - votes. - - "dir-source" and "fingerprint" and "dir-signing-key" and "contact" - are included for each authority that contributed to the vote. - - "vote-digest" for each authority that contributed to the vote, - calculated as for the digest in the signature on the vote. [XXX - re-English this sentence] - - "client-versions" and "server-versions" are sorted in ascending - order based on version-spec.txt. - - "dir-options" and "known-flags" are not included. -[XXX really? why not list the ones that are used in the consensus? -For example, right now BadExit is in use, but no servers would be -labelled BadExit, and it's still worth knowing that it was considered -by the authorities. -RD] - - The fields MUST occur in the following order: - "network-status-version" - "vote-status" - "published" - "valid-until" - For each authority, sorted in ascending order of nickname, case- - insensitively: - "dir-source", "fingerprint", "contact", "dir-signing-key", - "vote-digest". - "client-versions" - "server-versions" - - The signatures at the end of the document appear as multiple instances - of directory-signature, sorted in ascending order by nickname, - case-insensitively. - - A router entry should be included in the result if it is included by more - than half of the authorities (total authorities, not just those whose votes - we have). A router entry has a flag set if it is included by more than - half of the authorities who care about that flag. [XXXX this creates an - incentive for attackers to DOS authorities whose votes they don't like. - Can we remember what flags people set the last time we saw them? -NM] - [Which 'we' are we talking here? 
The end-users never learn which - authority sets which flags. So you're thinking the authorities - should record the last vote they saw from each authority and if it's - within a week or so, count all the flags that it advertised as 'no' - votes? Plausible. -RD] - - The signature hash covers from the "network-status-version" line through - the characters "directory-signature" in the first "directory-signature" - line. - - Consensus directories SHOULD be rejected if they are not signed by more - than half of the known authorities. - -2.2.1. Detached signatures - - Assuming full connectivity, every authority should compute and sign the - same consensus directory in each period. Therefore, it isn't necessary to - download the consensus computed by each authority; instead, the authorities - only push/fetch each others' signatures. A "detached signature" document - contains a single "consensus-digest" entry and one or more - directory-signature entries. [XXXX specify more.] - -2.3. URLs and timelines - -2.3.1. URLs and timeline used for agreement - - An authority SHOULD publish its vote immediately at the start of each voting - period. It does this by making it available at - http://<hostname>/tor/status-vote/current/authority.z - and sending it in an HTTP POST request to each other authority at the URL - http://<hostname>/tor/post/vote - - If, N minutes after the voting period has begun, an authority does not have - a current statement from another authority, the first authority retrieves - the other's statement. - - Once an authority has a vote from another authority, it makes it available - at - http://<hostname>/tor/status-vote/current/<fp>.z - where <fp> is the fingerprint of the other authority's identity key. 
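[Editor's sketch, not part of the proposal. The inclusion rule from section 2.2 (a router is listed if more than half of all authorities list it; a flag is set if more than half of the authorities who care about that flag set it) might look like this. Reading "authorities who care about a flag" as "authorities whose vote we hold and whose known-flags line names the flag" is an assumption, and the data layout and function name are hypothetical.]

```python
def compute_consensus(votes, known_flags, total_authorities):
    """Apply the majority rules of section 2.2.

    votes: {authority: {router_id: set_of_flags}}
    known_flags: {authority: set of flags that authority votes on}
    Returns {router_id: sorted list of flags}, built in ascending
    router_id order.
    """
    listings = {}
    for routers in votes.values():
        for router_id in routers:
            listings[router_id] = listings.get(router_id, 0) + 1

    all_flags = set().union(*known_flags.values())
    consensus = {}
    for router_id in sorted(listings):
        if 2 * listings[router_id] <= total_authorities:
            continue  # not listed by more than half of all authorities
        flags = []
        for flag in sorted(all_flags):
            carers = [a for a in votes if flag in known_flags.get(a, ())]
            yes = sum(1 for a in carers if flag in votes[a].get(router_id, ()))
            if 2 * yes > len(carers):
                flags.append(flag)
        consensus[router_id] = flags
    return consensus
```

[Note that the router threshold counts all authorities, including those whose votes are missing, while the flag threshold counts only the votes actually held. That asymmetry is what the bracketed discussion in section 2.2 is concerned with.]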
- - The consensus network status, along with as many signatures as the server - currently knows, should be available at - http://<hostname>/tor/status-vote/current/consensus.z - All of the detached signatures it knows for consensus status should be - available at: - http://<hostname>/tor/status-vote/current/consensus-signatures.z - - Once an authority has computed and signed a consensus network status, it - should send its detached signature to each other authority in an HTTP POST - request to the URL: - http://<hostname>/tor/post/consensus-signature - - - [XXXX Store votes to disk.] - -2.3.2. Serving a consensus directory - - Once the authority is done getting signatures on the consensus directory, - it should serve it from: - http://<hostname>/tor/status/consensus.z - - Caches SHOULD download consensus directories from an authority and serve - them from the same URL. - -2.3.3. Timeline and synchronization - - [XXXX] - -2.4. Distributing routerdescs between authorities - - Consensus will be more meaningful if authorities take steps to make sure - that they all have the same set of descriptors _before_ the voting - starts. This is safe, since all descriptors are self-certified and - timestamped: it's always okay to replace a signed descriptor with a more - recent one signed by the same identity. - - In the long run, we might want some kind of sophisticated process here. - For now, since authorities already download one another's networkstatus - documents and use them to determine what descriptors to download from one - another, we can rely on this existing mechanism to keep authorities up to - date. - - [We should do a thorough read-through of dir-spec again to make sure - that the authorities converge on which descriptor to "prefer" for - each router. Right now the decision happens at the client, which is - no longer the right place for it. -RD] - -3. Questions and concerns - -3.1. Push or pull? 
- - The URLs above define a push mechanism for publishing votes and consensus - signatures via HTTP POST requests, and a pull mechanism for downloading - these documents via HTTP GET requests. As specified, every authority will - post to every other. The "download if no copy has been received" mechanism - exists only as a fallback. - -4. Migration - - * It would be cool if caches could get ready to download consensus - status docs, verify enough signatures, and serve them now. That way - once stuff works all we need to do is upgrade the authorities. Caches - don't need to verify the correctness of the format so long as it's - signed (or maybe multisigned?). We need to make sure that caches back - off very quickly from downloading consensus docs until they're - actually implemented. - diff --git a/doc/spec/proposals/102-drop-opt.txt b/doc/spec/proposals/102-drop-opt.txt deleted file mode 100644 index 490376bb53..0000000000 --- a/doc/spec/proposals/102-drop-opt.txt +++ /dev/null @@ -1,38 +0,0 @@ -Filename: 102-drop-opt.txt -Title: Dropping "opt" from the directory format -Author: Nick Mathewson -Created: Jan 2007 -Status: Closed -Implemented-In: 0.2.0.x - -Overview: - - This document proposes a change in the format used to transmit router and - directory information. - - This proposal has been accepted, implemented, and merged into dir-spec.txt. - -Proposal: - - The "opt" keyword in Tor's directory formats was originally intended to - mean, "it is okay to ignore this entry if you don't understand it"; the - default behavior has been "discard a routerdesc if it contains entries you - don't recognize." - - But so far, every new flag we have added has been marked 'opt'. It would - probably make sense to change the default behavior to "ignore unrecognized - fields", and add the statement that clients SHOULD ignore fields they don't - recognize. 
As a meta-principle, we should say that clients and servers - MUST NOT have to understand new fields in order to use directory documents - correctly. - - Of course, this will make it impossible to say, "The format has changed a - lot; discard this quietly if you don't understand it." We could do that by - adding a version field. - -Status: - - * We stopped requiring it as of 0.1.2.5-alpha. We'll stop generating it - once earlier formats are obsolete. - - diff --git a/doc/spec/proposals/103-multilevel-keys.txt b/doc/spec/proposals/103-multilevel-keys.txt deleted file mode 100644 index c8a7a6677b..0000000000 --- a/doc/spec/proposals/103-multilevel-keys.txt +++ /dev/null @@ -1,204 +0,0 @@ -Filename: 103-multilevel-keys.txt -Title: Splitting identity key from regularly used signing key. -Author: Nick Mathewson -Created: Jan 2007 -Status: Closed -Implemented-In: 0.2.0.x - -Overview: - - This document proposes a change in the way identity keys are used, so that - highly sensitive keys can be password-protected and seldom loaded into RAM. - - It presents options; it is not yet a complete proposal. - -Proposal: - - Replacing a directory authority's identity key in the event of a compromise - would be tremendously annoying. We'd need to tell every client to switch - their configuration, or update to a new version with an uploaded list. So - long as some weren't upgraded, they'd be at risk from whoever had - compromised the key. - - With this in mind, it's a shame that our current protocol forces us to - store identity keys unencrypted in RAM. We need some kind of signing key - stored unencrypted, since we need to generate new descriptors/directories - and rotate link and onion keys regularly. (And since, of course, we can't - ask server operators to be on-hand to enter a passphrase every time we - want to rotate keys or sign a descriptor.) 
- - The obvious solution seems to be to have a signing-only key that lives - indefinitely (months or longer) and signs descriptors and link keys, and a - separate identity key that's used to sign the signing key. Tor servers - could run in one of several modes: - 1. Identity key stored encrypted. You need to pick a passphrase when - you enable this mode, and re-enter this passphrase every time you - rotate the signing key. - 1'. Identity key stored separate. You save your identity key to a - floppy, and use the floppy when you need to rotate the signing key. - 2. All keys stored unencrypted. In this case, we might not want to even - *have* a separate signing key. (We'll need to support no-separate- - signing-key mode anyway to keep old servers working.) - 3. All keys stored encrypted. You need to enter a passphrase to start - Tor. - (Of course, we might not want to implement all of these.) - - Case 1 is probably most usable and secure, if we assume that people don't - forget their passphrases or lose their floppies. We could mitigate this a - bit by encouraging people to PGP-encrypt their passphrases to themselves, - or keep a cleartext copy of their secret key secret-split into a few - pieces, or something like that. - - Migration presents another difficulty, especially with the authorities. If - we use the current set of identity keys as the new identity keys, we're in - the position of having sensitive keys that have been stored on - media-of-dubious-encryption up to now. Also, we need to keep old clients - (who will expect descriptors to be signed by the identity keys they know - and love, and who will not understand signing keys) happy. - -A possible solution: - - One thing to consider is that router identity keys are not very sensitive: - if an OR disappears and reappears with a new key, the network treats it as - though an old router had disappeared and a new one had joined the network. - The Tor network continues unharmed; this isn't a disaster. 
- - Thus, the ideas above are mostly relevant for authorities. - - The most straightforward solution for the authorities is probably to take - advantage of the protocol transition that will come with proposal 101, and - introduce a new set of signing _and_ identity keys used only to sign votes - and consensus network-status documents. Signing and identity keys could be - delivered to users in a separate, rarely changing "keys" document, so that - the consensus network-status documents wouldn't need to include N signing - keys, N identity keys, and N certifications. - - Note also that there is no reason that the identity/signing keys used by - directory authorities would necessarily have to be the same as the identity - keys those authorities use in their capacity as routers. Decoupling these - keys would give directory authorities the following set of keys: - - Directory authority identity: - Highly confidential; stored encrypted and/or offline. Used to - identify directory authorities. Shipped with clients. Used to - sign Directory authority signing keys. - - Directory authority signing key: - Stored online, accessible to regular Tor process. Used to sign - votes and consensus directories. Downloaded as part of a "keys" - document. - - [Administrators SHOULD rotate their signing keys every month or - two, just to keep in practice and keep from forgetting the - password to the authority identity.] - - V1-V2 directory authority identity: - Stored online, never changed. Used to sign legacy network-status - and directory documents. - - Router identity: - Stored online, seldom changed. Used to sign server descriptors - for this authority in its role as a router. Implicitly certified - by being listed in network-status documents. - - Onion key, link key: - As in tor-spec.txt - - -Extensions to Proposal 101. - - Define a new document type, "Key certificate". It contains the - following fields, in order: - - "dir-key-certificate-version": As network-status-version. 
Must be - "3". - "fingerprint": Hex fingerprint, with spaces, based on the directory - authority's identity key. - "dir-identity-key": The long-term identity key for this authority. - "dir-key-published": The time when this directory's signing key was - last changed. - "dir-key-expires": A time after which this key is no longer valid. - "dir-signing-key": As in proposal 101. - "dir-key-certification": A signature of the above fields, in order. - The signed material extends from the beginning of - "dir-key-certificate-version" through the newline after - "dir-key-certification". The identity key is used to generate - this signature. - - These elements together constitute a "key certificate". These are - generated offline when starting a v3 authority. Private identity - keys SHOULD be stored offline, encrypted, or both. A running - authority only needs access to the signing key. - - Unlike other keys currently used by Tor, the authority identity - keys and directory signing keys MAY be longer than 1024 bits. - (They SHOULD be 2048 bits or longer; they MUST NOT be shorter than - 1024.) - - Vote documents change as follows: - - A key certificate MUST be included in-line in every vote document. With - the exception of "fingerprint", its elements MUST NOT appear in consensus - documents. - - Consensus network statuses change as follows: - - Remove dir-signing-key. - - Change "directory-signature" to take a fingerprint of the authority's - identity key and a fingerprint of the authority's current signing key - rather than the authority's nickname. - - Change "dir-source" to take a fingerprint of the authority's - identity key rather than the authority's nickname or hostname. - - Add a new document type: - - A "keys" document contains all currently known key certificates. - All authorities serve it at - - http://<hostname>/tor/status/keys.z - - Caches and clients download the keys document whenever they receive a - consensus vote that uses a key they do not recognize. 
Caches download - from authorities; clients download from caches. - - Processing votes: - - When receiving a vote, authorities check to see if the key - certificate for the voter is different from the one they have. If - the key certificate _is_ different, and its dir-key-published is - more recent than the most recently known one, and it is - well-formed and correctly signed with the correct identity key, - then authorities remember it as the new canonical key certificate - for that voter. - - A key certificate is invalid if any of the following hold: - * The version is unrecognized. - * The fingerprint does not match the identity key. - * The identity key or the signing key is ill-formed. - * The published date is very far in the past or future. - - * The signature is not a valid signature of the key certificate - generated with the identity key. - - When processing the signatures on consensus, clients and caches act as - follows: - - 1. Only consider the directory-signature entries whose identity - key hashes match trusted authorities. - - 2. If any such entries have signing key hashes that match unknown - signing keys, download a new keys document. - - 3. For every entry with a known (identity key, signing key) pair, - check the signature on the document. - - 4. If the document has been signed by more than half of the - authorities the client recognizes, treat the consensus as - correctly signed. - - If not, but the number of entries with known identity keys but - unknown signing keys might be enough to make the consensus - correctly signed, do not use the consensus, but do not discard - it until we have a new keys document. 
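[Editor's sketch, not part of the proposal. Steps 1 to 4 above, together with the "keep it until we have a new keys document" case, could be arranged as follows. Signature verification is abstracted behind a caller-supplied `verify` callable. That callable, the tuple layout, the function name, and the three result strings are all hypothetical.]

```python
def classify_consensus(signatures, trusted, known_keys, verify):
    """Decide how to treat a consensus based on its signatures.

    signatures: list of (identity_fp, signing_fp, signature) entries.
    trusted: set of identity fingerprints the client recognizes.
    known_keys: set of (identity_fp, signing_fp) pairs from key certificates.
    verify: callable(signing_fp, signature) -> bool; hypothetical stand-in.
    Returns "signed", "need-keys", or "unsigned".
    """
    good = set()
    unknown = set()
    for identity_fp, signing_fp, signature in signatures:
        if identity_fp not in trusted:
            continue                       # step 1: ignore untrusted entries
        if (identity_fp, signing_fp) not in known_keys:
            unknown.add(identity_fp)       # step 2: a keys download is needed
        elif verify(signing_fp, signature):
            good.add(identity_fp)          # step 3: signature checked
    needed = len(trusted) // 2 + 1         # "more than half"
    if len(good) >= needed:
        return "signed"                    # step 4
    if len(good | unknown) >= needed:
        return "need-keys"                 # keep, but do not use yet
    return "unsigned"
```

[Counting authorities by identity fingerprint rather than by signature entry means that a single authority cannot be counted twice, even if it appears in several directory-signature entries.]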
diff --git a/doc/spec/proposals/104-short-descriptors.txt b/doc/spec/proposals/104-short-descriptors.txt deleted file mode 100644 index 90e0764fe6..0000000000 --- a/doc/spec/proposals/104-short-descriptors.txt +++ /dev/null @@ -1,181 +0,0 @@ -Filename: 104-short-descriptors.txt -Title: Long and Short Router Descriptors -Author: Nick Mathewson -Created: Jan 2007 -Status: Closed -Implemented-In: 0.2.0.x - -Overview: - - This document proposes moving unused-by-clients information from regular - router descriptors into a new "extra info" router descriptor. - -Proposal: - - Some of the costliest fields in the current directory protocol are ones - that no client actually uses. In particular, the "read-history" and - "write-history" fields are used only by the authorities for monitoring the - status of the network. If we took them out, the size of a compressed list - of all the routers would fall by about 60%. (No other disposable field - would save much more than 2%.) - - We propose to remove these fields from descriptors, and have them - uploaded as a part of a separate signed "extra info" to the authorities. - This document will be signed. A hash of this document will be included in - the regular descriptors. - - (We considered another design, where routers would generate and upload a - short-form and a long-form descriptor. Only the short-form descriptor would - ever be used by anybody for routing. The long-form descriptor would be - used only for analytics and other tools. We decided against this because - well-behaved tools would need to download short-form descriptors too (as - these would be the only ones indexed), and hence get redundant info. Badly - behaved tools would download only long-form descriptors, and expose - themselves to partitioning attacks.) - -Other disposable fields: - - Clients don't need these fields, but removing them doesn't help bandwidth - enough to be worthwhile. 
- contact (save about 1%) - fingerprint (save about 3%) - - We could represent these fields more succinctly, but removing them would - only save 1%. (!) - reject - accept - (Apparently, exit policies are highly compressible.) - - [Does size-on-disk matter to anybody? Some clients and servers don't - have much disk, or have really slow disk (e.g. USB). And we don't - store caches compressed right now. -RD] - -Specification: - - 1. Extra Info Format. - - An "extra info" descriptor contains the following fields: - - "extra-info" Nickname Fingerprint - Identifies what router this is an extra info descriptor for. - Fingerprint is encoded in hex (using upper-case letters), with - no spaces. - - "published" As currently documented in dir-spec.txt. It MUST match the - "published" field of the descriptor published at the same time. - - "read-history" - "write-history" - As currently documented in dir-spec.txt. Optional. - - "router-signature" NL Signature NL - - A signature of the PKCS1-padded hash of the entire extra info - document, taken from the beginning of the "extra-info" line, through - the newline after the "router-signature" line. An extra info - document is not valid unless the signature is performed with the - identity key whose digest matches FINGERPRINT. - - The "extra-info" field is required and MUST appear first. The - router-signature field is required and MUST appear last. All others are - optional. As for other documents, unrecognized fields must be ignored. - - 2. Existing formats - - Implementations that use "read-history" and "write-history" SHOULD - continue accepting router descriptors that contain them. (Prior to - 0.2.0.x, this information was encoded in ordinary router descriptors; - in any case they have always been listed as opt, so they should be - accepted anyway.)
- - Add these fields to router descriptors: - - "extra-info-digest" Digest - "Digest" is a hex-encoded digest (using upper-case characters) - of the router's extra-info document, as signed in the router's - extra-info. (If this field is absent, no extra-info-digest - exists.) - - "caches-extra-info" - Present if this router is a directory cache that provides - extra-info documents, or an authority that handles extra-info - documents. - - (Since implementations before 0.1.2.5-alpha required that the "opt" - keyword precede any unrecognized entry, these keys MUST be preceded - with "opt" until 0.1.2.5-alpha is obsolete.) - - 3. New communications rules - - Servers SHOULD generate and upload one extra-info document after each - descriptor they generate and upload; no more, no less. Servers MUST - upload the new descriptor before they upload the new extra-info. - - Authorities receiving an extra-info document SHOULD verify all of the - following: - * They have a router descriptor for some server with a matching - nickname and identity fingerprint. - * That server's identity key has been used to sign the extra-info - document. - * The extra-info-digest field in the router descriptor matches - the digest of the extra-info document. - * The published fields in the two documents match. - - Authorities SHOULD drop extra-info documents that do not meet these - criteria. - - Extra-info documents MAY be uploaded as part of the same HTTP post as - the router descriptor, or separately. Authorities MUST accept both - methods. - - Authorities SHOULD try to fetch extra-info documents from one another if - they do not have one matching the digest declared in a router - descriptor. - - Caches that are running locally with a tool that needs to use extra-info - documents MAY download and store extra-info documents. They should do - so when they notice that the recommended descriptor has an - extra-info-digest not matching any extra-info document they currently - have. 
(Caches not running on a host that needs to use extra-info - documents SHOULD NOT download or cache them.) - - 4. New URLs - - http://<hostname>/tor/extra/d/... - http://<hostname>/tor/extra/fp/... - http://<hostname>/tor/extra/all[.z] - (As for /tor/server/ URLs: supports fetching extra-info documents - by their digest, by the fingerprint of their servers, or all - at once. When serving by fingerprint, we serve the extra-info - that corresponds to the descriptor we would serve by that - fingerprint. Only directory authorities are guaranteed to support - these URLs.) - - http://<hostname>/tor/extra/authority[.z] - (The extra-info document for this router.) - - Extra-info documents are uploaded to the same URLs as regular - router descriptors. - -Migration: - - For extra info approach: - * First: - * Authorities should accept extra info, and support serving it. - * Routers should upload extra info once authorities accept it. - * Caches should support an option to download and cache it, once - authorities serve it. - * Tools should be updated to use locally cached information. - These tools include: - lefkada's exit.py script. - tor26's noreply script and general directory cache. - https://nighteffect.us/tns/ for its graphs - and check with or-talk for the rest, once it's time. - - * Set a cutoff time for including bandwidth in router descriptors, so - that tools that use bandwidth info know that they will need to fetch - extra info documents. - - * Once tools that want bandwidth info support fetching extra info: - * Have routers stop including bandwidth info in their router - descriptors. diff --git a/doc/spec/proposals/105-handshake-revision.txt b/doc/spec/proposals/105-handshake-revision.txt deleted file mode 100644 index 791a016c26..0000000000 --- a/doc/spec/proposals/105-handshake-revision.txt +++ /dev/null @@ -1,323 +0,0 @@ -Filename: 105-handshake-revision.txt -Title: Version negotiation for the Tor protocol. 
-Author: Nick Mathewson, Roger Dingledine -Created: Jan 2007 -Status: Closed -Implemented-In: 0.2.0.x - -Overview: - - This document was extracted from a modified version of tor-spec.txt that we - had written before the proposal system went into place. It adds two new - cell types to the Tor link connection setup handshake: one used for - version negotiation, and another to prevent MITM attacks. - - This proposal is partially implemented, and partially superseded by - proposal 130. - -Motivation: Tor versions - - Our *current* approach to versioning the Tor protocol(s) has been as - follows: - - All changes must be backward compatible. - - It's okay to add new cell types, if they would be ignored by previous - versions of Tor. - - It's okay to add new data elements to cells, if they would be - ignored by previous versions of Tor. - - For forward compatibility, Tor must ignore cell types it doesn't - recognize, and ignore data in those cells it doesn't expect. - - Clients can inspect the version of Tor declared in the platform line - of a router's descriptor, and use that to learn whether a server - supports a given feature. Servers, however, aren't assumed to all - know about each other, and so don't know the version of who they're - talking to. - - This system has these problems: - - It's very hard to change fundamental aspects of the protocol, like the - cell format, the link protocol, any of the various encryption schemes, - and so on. - - The router-to-router link protocol has remained more-or-less frozen - for a long time, since we can't easily have an OR use new features - unless it knows the other OR will understand them. - - We need to resolve these problems because: - - Our cipher suite is showing its age: SHA1/AES128/RSA1024/DH1024 will - not seem like the best idea for all time. - - There are many ideas circulating for multiple cell sizes; while it's - not obvious whether these are safe, we can't do them at all without a - mechanism to permit them.
- - There are many ideas circulating for alternative circuit building and - cell relay rules: they don't work unless they can coexist in the - current network. - - If our protocol changes a lot, it's hard to describe any coherent - version of it: we need to say "the version that Tor versions W through - X use when talking to versions Y through Z". This makes analysis - harder. - -Motivation: Preventing MITM attacks - - TLS prevents a man-in-the-middle attacker from reading or changing the - contents of a communication. It does not, however, prevent such an - attacker from observing timing information. Since timing attacks are some - of the most effective against low-latency anonymity nets like Tor, we - should take more care to make sure that we're not only talking to who - we think we're talking to, but that we're using the network path we - believe we're using. - -Motivation: Signed clock information - - It's very useful for Tor instances to know how skewed they are relative - to one another. The only way to find out currently has been to download - directory information, and check the Date header--but this is not - authenticated, and hence subject to modification on the wire. Using - BEGIN_DIR to create an authenticated directory stream through an existing - circuit is better, but that's an extra step and it might be nicer to - learn the information in the course of the regular protocol. - -Proposal: - -1.0. Version numbers - - The node-to-node TLS-based "OR connection" protocol and the multi-hop - "circuit" protocol are versioned quasi-independently. - - Of course, some dependencies will continue to exist: Certain versions - of the circuit protocol may require a minimum version of the connection - protocol to be used. The connection protocol affects: - - Initial connection setup, link encryption, transport guarantees, - etc. - - The allowable set of cell commands - - Allowable formats for cells. 
- - The circuit protocol determines: - - How circuits are established and maintained - - How cells are decrypted and relayed - - How streams are established and maintained. - - Version numbers are incremented for backward-incompatible protocol changes - only. Backward-compatible changes are generally implemented by adding - additional fields to existing structures; implementations MUST ignore - fields they do not expect. Unused portions of cells MUST be set to zero. - - Though versioning the protocol will make it easier to maintain backward - compatibility with older versions of Tor, we will nevertheless continue to - periodically drop support for older protocols, - - to keep the implementation from growing without bound, - - to limit the maintenance burden of patching bugs in obsolete Tors, - - to limit the testing burden of verifying that many old protocol - versions continue to be implemented properly, and - - to limit the exposure of the network to protocol versions that are - expensive to support. - - The Tor protocol as implemented through the 0.1.2.x Tor series will be - called "version 1" in its link protocol and "version 1" in its relay - protocol. Versions of the Tor protocol so old as to be incompatible with - Tor 0.1.2.x can be considered to be version 0 of each, and are not - supported. - -2.1. VERSIONS cells - - When a Tor connection is established, both parties normally send a - VERSIONS cell before sending any other cells. (But see below.) - - VersionsLen [2 byte] - Versions [VersionsLen bytes] - - "Versions" is a sequence of VersionsLen bytes. Each value between 1 and - 127 inclusive represents a single version; current implementations MUST - ignore other bytes. Parties should list all of the versions which they - are able and willing to support. Parties can only communicate if they - have some connection protocol version in common. 
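As a rough illustration of the VERSIONS payload layout just described, the sketch below encodes and parses the VersionsLen/Versions fields and finds the versions two parties share. The function names are hypothetical, and the big-endian (network) byte order for VersionsLen is an assumption, since the text above does not state it. Picking the highest shared value follows the negotiation rule given in the text that comes next.

```python
import struct


def encode_versions(versions):
    """Build VersionsLen (2 bytes, assumed big-endian) followed by one byte per version."""
    unique = sorted(set(versions))
    for v in unique:
        if not 1 <= v <= 127:
            raise ValueError("version out of range 1..127: %r" % (v,))
    body = bytes(unique)
    return struct.pack("!H", len(body)) + body


def parse_versions(payload):
    """Return the set of versions listed in a VERSIONS payload.

    Bytes outside 1..127 are ignored, as the proposal requires.
    Trailing cell padding after VersionsLen bytes is not examined.
    """
    (length,) = struct.unpack_from("!H", payload, 0)
    body = payload[2:2 + length]
    if len(body) != length:
        raise ValueError("truncated VERSIONS payload")
    return {b for b in body if 1 <= b <= 127}


def negotiate(ours, theirs):
    """Highest version present in both sets, or None if there is none."""
    common = set(ours) & set(theirs)
    return max(common) if common else None
```

A result of None from negotiate corresponds to the case where the parties have no connection protocol version in common and so cannot communicate.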
- - Version 0.2.0.x-alpha and earlier don't understand VERSIONS cells, - and therefore don't support version negotiation. Thus, waiting until - the other side has sent a VERSIONS cell won't work for these servers: - if the other side sends no cells back, it is impossible to tell - whether they - have sent a VERSIONS cell that has been stalled, or whether they have - dropped our own VERSIONS cell as unrecognized. Therefore, we'll - change the TLS negotiation parameters so that old parties can still - negotiate, but new parties can recognize each other. Immediately - after a TLS connection has been established, the parties check - whether the other side negotiated the connection in an "old" way or a - "new" way. If either party negotiated in the "old" way, we assume a - v1 connection. Otherwise, both parties send VERSIONS cells listing - all their supported versions. Upon receiving the other party's - VERSIONS cell, the implementation begins using the highest-valued - version common to both cells. If the first cell from the other party - has a recognized command, and is _not_ a VERSIONS cell, we assume a - v1 protocol. - - (For more detail on the TLS protocol change, see forthcoming draft - proposals from Steven Murdoch.) - - Implementations MUST discard VERSIONS cells that are not the first - recognized cells sent on a connection. - - The VERSIONS cell must be sent as a v1 cell (2 bytes of circuitID, 1 - byte of command, 509 bytes of payload). - - [NOTE: The VERSIONS cell is assigned the command number 7.] - -2.2. MITM-prevention and time checking - - If we negotiate a v2 connection or higher, the second cell we send SHOULD - be a NETINFO cell. Implementations SHOULD NOT send NETINFO cells at other - times. - - A NETINFO cell contains: - Timestamp [4 bytes] - Other OR's address [variable] - Number of addresses [1 byte] - This OR's addresses [variable] - - Timestamp is the OR's current Unix time, in seconds since the epoch. 
If - an implementation receives time values from many ORs that - indicate that its clock is skewed, it SHOULD try to warn the - administrator. (We leave the definition of 'many' intentionally vague - for now.) - - Before believing the timestamp in a NETINFO cell, implementations - SHOULD compare the time at which they received the cell to the time - when they sent their VERSIONS cell. If the difference is very large, - it is likely that the cell was delayed long enough that its - contents are out of date. - - Each address contains Type/Length/Value as used in Section 6.4 of - tor-spec.txt. The first address is the one that the party sending - the NETINFO cell believes the other has -- it can be used to learn - what your IP address is if you have no other hints. - The rest of the addresses are the advertised addresses of the party - sending the NETINFO cell -- we include them - to block a man-in-the-middle attack on TLS that lets an attacker bounce - traffic through his own computers to enable timing and packet-counting - attacks. - - A Tor instance should use the other Tor's reported address - information as part of logic to decide whether to treat a given - connection as suitable for extending circuits to a given address/ID - combination. When we get an extend request, we use an - existing OR connection if the ID matches, and ANY of the following - conditions hold: - - The IP matches the requested IP. - - We know that the IP we're using is canonical because it was - listed in the NETINFO cell. - - We know that the IP we're using is canonical because it was - listed in the server descriptor. - - [NOTE: The NETINFO cell is assigned the command number 8.] - -Discussion: Versions versus feature lists - - Many protocols negotiate lists of available features instead of (or in - addition to) protocol versions. 
While it's possible that some amount of - feature negotiation could be supported in a later Tor, we should prefer to - use protocol versions whenever possible, for reasons discussed in - the "Anonymity Loves Company" paper. - -Discussion: Bytes per version, versions per cell - - This document provides for a one-byte count of how many versions a Tor - supports, and allows one byte per version. Thus, it can support only - 254 more versions of the protocol beyond the unallocated v0 and the - current v1. If we ever need to split the protocol into 255 incompatible - versions, we've probably screwed up badly somewhere. - - Nevertheless, here are two ways we could support more versions: - - Change the version count to a two-byte field that counts the number of - _bytes_ used, and use a UTF8-style encoding: versions 0 through 127 - take one byte to encode, versions 128 through 2047 take two bytes to - encode, and so on. We wouldn't need to parse any version higher than - 127 right now, since all bytes used to encode higher versions would - have their high bit set. - - We'd still have a limit of 380 simultaneous versions that could be - declared in any version. This is probably okay. - - - Decide that if we need to support more versions, we can add a - MOREVERSIONS cell that gets sent before the VERSIONS cell. The spec - above requires Tors to ignore unrecognized cell types that they get - before the first VERSIONS cell, and still allows version negotiation - to - succeed. - - [Resolution: Reserve the high bit and the v0 value for later use. If - we ever have more live versions than we can fit in a cell, we've made a - bad design decision somewhere along the line.] - -Discussion: Reducing round-trips - - It might be appealing to see if we can cram more information in the - initial VERSIONS cell.
For example, the contents of NETINFO will pretty - soon be sent by everybody before any more information is exchanged, but - decoupling them from the version exchange increases round-trips. - - Instead, we could speculatively include handshaking information at - the end of a VERSIONS cell, wrapped in a marker to indicate, "if we wind - up speaking VERSION 2, here's the NETINFO I'll send. Otherwise, ignore - this." This could be extended to opportunistically reduce round trips - when possible for future versions when we guess the versions right. - - Of course, we'd need to be careful about using a feature like this: - - We don't want to include things that are expensive to compute, - like PK signatures or proof-of-work. - - We don't want to speculate as a mobile client: it may leak our - experience with the server in question. - -Discussion: Advertising versions in routerdescs and networkstatuses. - - In network-statuses: - - The networkstatus "v" line now has the format: - "v" IMPLEMENTATION IMPL-VERSION "Link" LINK-VERSION-LIST - "Circuit" CIRCUIT-VERSION-LIST NL - - LINK-VERSION-LIST and CIRCUIT-VERSION-LIST are comma-separated lists of - supported version numbers. IMPLEMENTATION is the name of the - implementation of the Tor protocol (e.g., "Tor"), and IMPL-VERSION is the - version of the implementation. - - Examples: - v Tor 0.2.5.1-alpha Link 1,2,3 Circuit 2,5 - - v OtherOR 2000+ Link 3 Circuit 5 - - Implementations that release independently of the Tor codebase SHOULD NOT - use "Tor" as the value of their IMPLEMENTATION. - - Additional fields on the "v" line MUST be ignored. - - In router descriptors: - - The router descriptor should contain a line of the form, - "protocols" "Link" LINK-VERSION-LIST "Circuit" CIRCUIT_VERSION_LIST - - Additional fields on the "protocols" line MUST be ignored. 
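The networkstatus "v" line format above could be parsed along these lines. This is a minimal sketch, not the parser Tor uses; parse_version_list and parse_v_line are hypothetical names, and the sketch assumes fields are separated by whitespace and appear in exactly the order shown in the format.

```python
def parse_version_list(text):
    """Turn a comma-separated list such as '1,2,3' into [1, 2, 3]."""
    return [int(item) for item in text.split(",")]


def parse_v_line(line):
    """Parse: "v" IMPLEMENTATION IMPL-VERSION "Link" LIST "Circuit" LIST.

    Anything after the Circuit list is ignored, since additional fields
    on the "v" line MUST be ignored.
    """
    tokens = line.split()
    if (len(tokens) < 7 or tokens[0] != "v"
            or tokens[3] != "Link" or tokens[5] != "Circuit"):
        raise ValueError("not a recognized v line: %r" % (line,))
    return {
        "implementation": tokens[1],
        "impl_version": tokens[2],
        "link": parse_version_list(tokens[4]),
        "circuit": parse_version_list(tokens[6]),
    }
```

Applied to the first example line, this yields link versions 1, 2, 3 and circuit versions 2, 5.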
- - [Versions of Tor before 0.1.2.5-alpha rejected router descriptors with - unrecognized items; the protocols line should be preceded with an "opt" - until these Tors are obsolete.] - -Security issues: - - Client partitioning is the big danger when we introduce new versions; if a - client supports some very unusual set of protocol versions, it will stand - out from others no matter where it goes. If a server supports an unusual - version, it will get a disproportionate amount of traffic from clients who - prefer that version. We can mitigate this somewhat as follows: - - - Do not have clients prefer any protocol version by default until that - version is widespread. (First introduce the new version to servers, - and have clients admit to using it only when configured to do so for - testing. Then, once many servers are running the new protocol - version, enable its use by default.) - - - Do not multiply protocol versions needlessly. - - - Encourage protocol implementors to implement the same protocol version - sets as some popular version of Tor. - - - Disrecommend very old/unpopular versions of Tor via the directory - authorities' RecommendedVersions mechanism, even if it is still - technically possible to use them. - diff --git a/doc/spec/proposals/106-less-tls-constraint.txt b/doc/spec/proposals/106-less-tls-constraint.txt deleted file mode 100644 index 7e7621df69..0000000000 --- a/doc/spec/proposals/106-less-tls-constraint.txt +++ /dev/null @@ -1,111 +0,0 @@ -Filename: 106-less-tls-constraint.txt -Title: Checking fewer things during TLS handshakes -Author: Nick Mathewson -Created: 9-Feb-2007 -Status: Closed -Implemented-In: 0.2.0.x - -Overview: - - This document proposes that we relax our requirements on the context of - X.509 certificates during initial TLS handshakes. - -Motivation: - - Later, we want to try harder to avoid protocol fingerprinting attacks.
- This means that we'll need to make our connection handshake look closer - to a regular HTTPS connection: one certificate on the server side and - zero certificates on the client side. For now, about the best we - can do is to stop requiring things during handshake that we don't - actually use. - -What we check now, and where we check it: - - tor_tls_check_lifetime: - peer has certificate - notBefore <= now <= notAfter - - tor_tls_verify: - peer has at least one certificate - There is at least one certificate in the chain - At least one of the certificates in the chain is not the one used to - negotiate the connection. (The "identity cert".) - The certificate _not_ used to negotiate the connection has signed the - link cert - - tor_tls_get_peer_cert_nickname: - peer has a certificate. - certificate has a subjectName. - subjectName has a commonName. - commonName consists only of characters in LEGAL_NICKNAME_CHARACTERS. [2] - - tor_tls_peer_has_cert: - peer has a certificate. - - connection_or_check_valid_handshake: - tor_tls_peer_has_cert [1] - tor_tls_get_peer_cert_nickname [1] - tor_tls_verify [1] - If nickname in cert is a known, named router, then its identity digest - must be as expected. - If we initiated the connection, then we got the identity digest we - expected. - - USEFUL THINGS WE COULD DO: - - [1] We could just not force clients to have any certificate at all, let alone - an identity certificate. Internally to the code, we could assign the - identity_digest field of these or_connections to a random number, or even - not add them to the identity_digest->or_conn map. - [so if somebody connects with no certs, we let them. and mark them as - a client and don't treat them as a server. great. 
-rd] - - [2] Instead of using a restricted nickname character set that makes our - commonName structure look unlike typical SSL certificates, we could treat - the nickname as extending from the start of the commonName up to but not - including the first non-nickname character. - - Alternatively, we could stop checking commonNames entirely. We don't - actually _do_ anything based on the nickname in the certificate, so - there's really no harm in letting every router have any commonName it - wants. - [this is the better choice -rd] - [agreed. -nm] - -REMAINING WAYS TO RECOGNIZE CLIENT->SERVER CONNECTIONS: - - Assuming that we removed the above requirements, and that we then (in a - later release) had clients not send certificates and started - making our DNs a little less formulaic, client->server OR connections would - still be recognizable by: - having a two-certificate chain sent by the server - using a particular set of ciphersuites - traffic patterns - probing the server later - -OTHER IMPLICATIONS: - - If we stop verifying the above requirements: - - It will be slightly (but only slightly) more common to connect to a non-Tor - server running TLS, and believe that you're talking to a Tor server (until - you send the first cell). - - It will be far easier for non-Tor SSL clients to accidentally connect to - Tor servers and speak HTTPS or whatever to them. - - If, in a later release, we have clients not send certificates, and we make - DNs less recognizable: - - If clients don't send certs, servers don't need to verify them: win! - - If we remove these restrictions, it will be easier for people to write - clients to fuzz our protocol: sorta win! - - If clients don't send certs, they look slightly less like servers. - -OTHER SPEC CHANGES: - - When a client doesn't give us an identity, we should never extend any - circuits to it (duh), and we should allow it to set circuit ID however it - wants.
diff --git a/doc/spec/proposals/107-uptime-sanity-checking.txt b/doc/spec/proposals/107-uptime-sanity-checking.txt deleted file mode 100644 index 922129b21d..0000000000 --- a/doc/spec/proposals/107-uptime-sanity-checking.txt +++ /dev/null @@ -1,54 +0,0 @@ -Filename: 107-uptime-sanity-checking.txt -Title: Uptime Sanity Checking -Author: Kevin Bauer & Damon McCoy -Created: 8-March-2007 -Status: Closed -Implemented-In: 0.2.0.x - -Overview: - - This document describes how to cap the uptime that is used when computing - which routers are marked as stable such that highly stable routers cannot - be displaced by malicious routers that report extremely high uptime - values. - - This is similar to how bandwidth is capped at 1.5MB/s. - -Motivation: - - It has been pointed out that an attacker can displace all stable nodes and - entry guard nodes by reporting high uptimes. This is an easy fix that will - prevent highly stable nodes from being displaced. - -Security implications: - - It should decrease the effectiveness of routing attacks that report high - uptimes while not impacting the normal routing algorithms. - -Specification: - - So we could patch Section 3.1 of dir-spec.txt to say: - - "Stable" -- A router is 'Stable' if it is running, valid, not - hibernating, and either its uptime is at least the median uptime for - known running, valid, non-hibernating routers, or its uptime is at - least 30 days. Routers are never called stable if they are running - a version of Tor known to drop circuits stupidly. (0.1.1.10-alpha - through 0.1.1.16-rc are stupid this way.) - -Compatibility: - - There should be no compatibility issues due to uptime capping. - -Implementation: - - Implemented and merged into dir-spec in 0.2.0.0-alpha-dev (r9788). - -Discussion: - - Initially, this proposal set the maximum at 60 days, not 30; the 30 day - limit and spec wording was suggested by Roger in an or-dev post on 9 March - 2007. 
- - This proposal also led to 108-mtbf-based-stability.txt - diff --git a/doc/spec/proposals/108-mtbf-based-stability.txt b/doc/spec/proposals/108-mtbf-based-stability.txt deleted file mode 100644 index 294103760b..0000000000 --- a/doc/spec/proposals/108-mtbf-based-stability.txt +++ /dev/null @@ -1,88 +0,0 @@ -Filename: 108-mtbf-based-stability.txt -Title: Base "Stable" Flag on Mean Time Between Failures -Author: Nick Mathewson -Created: 10-Mar-2007 -Status: Closed -Implemented-In: 0.2.0.x - -Overview: - - This document proposes that we change how directory authorities set the - stability flag from inspection of a router's declared Uptime to the - authorities' perceived mean time between failure for the router. - -Motivation: - - Clients prefer nodes that the authorities call Stable. This flag is (as - of 0.2.0.0-alpha-dev) set entirely based on the node's declared value for - uptime. This creates an opportunity for malicious nodes to declare - falsely high uptimes in order to get more traffic. - -Spec changes: - - Replace the current rule for setting the Stable flag with: - - "Stable" -- A router is 'Stable' if it is active and its observed Stability - for the past month is at or above the median Stability for active routers. - Routers are never called stable if they are running a version of Tor - known to drop circuits stupidly. (0.1.1.10-alpha through 0.1.1.16-rc - are stupid this way.) - - Stability shall be defined as the weighted mean length of the runs - observed by a given directory authority. A run begins when an authority - decides that the server is Running, and ends when the authority decides - that the server is not Running. In-progress runs are counted when - measuring Stability. When calculating the mean, runs are weighted by - $\alpha ^ t$, where $t$ is time elapsed since the end of the run, and - $0 < \alpha < 1$. Time when an authority is down does not count toward the - length of the run.
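The Stability definition above amounts to a weighted mean of run lengths. The sketch below is one way to compute it, under several assumptions that the text does not settle: the function name and the (start, end) tuple representation are hypothetical, an in-progress run is given t = 0 (weight 1), the time unit for t is whatever unit the timestamps use, and authority downtime is not modeled at all.

```python
def stability(runs, now, alpha):
    """Weighted mean run length, with each run weighted by alpha ** t.

    runs:  list of (start, end) times; end is None for an in-progress run.
    now:   current time, in the same unit as the run timestamps.
    alpha: decay factor, 0 < alpha < 1.
    """
    if not 0 < alpha < 1:
        raise ValueError("alpha must be strictly between 0 and 1")
    weighted_sum = 0.0
    total_weight = 0.0
    for start, end in runs:
        if end is None:
            length = now - start  # in-progress runs are counted
            elapsed = 0           # assumption: no decay for a live run
        else:
            length = end - start
            elapsed = now - end   # t: time since the run ended
        weight = alpha ** elapsed
        weighted_sum += weight * length
        total_weight += weight
    return weighted_sum / total_weight if total_weight else 0.0
```

For instance, with alpha = 0.5, a finished run of length 4 that ended 4 time units ago gets weight 0.0625, so a live run of length 2 dominates the mean.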
- -Rejected Alternative: - - "A router's Stability shall be defined as the sum of $\alpha ^ d$ for every - $d$ such that the router was considered reachable for the entire day - $d$ days ago. - - This allows a simpler implementation: every day, we multiply - yesterday's Stability by alpha, and if the router was observed to be - available every time we looked today, we add 1. - - Instead of "day", we could pick an arbitrary time unit. We should - pick alpha to be high enough that long-term stability counts, but low - enough that the distant past is eventually forgotten. Something - between .8 and .95 seems right. - - (By requiring that routers be up for an entire day to get their - stability increased, instead of counting fractions of a day, we - capture the notion that stability is more like "probability of - staying up for the next hour" than it is like "probability of being - up at some randomly chosen time over the next hour." The former - notion of stability is far more relevant for long-lived circuits.) - -Limitations: - - Authorities can have false positives and false negatives when trying to - tell whether a router is up or down. So long as these aren't terribly - wrong, and so long as they aren't significantly biased, we should be able - to use them to estimate stability pretty well. - - Probing approaches like the above could miss short incidents of - downtime. If we use the router's declared uptime, we could detect - these: but doing so would penalize routers who reported their uptime - accurately. - -Implementation: - - For now, the easiest way to store this information at authorities - would probably be in some kind of periodically flushed flat file. - Later, we could move to Berkeley db or something if we really had to. - - For each router, an authority will need to store: - The router ID. - Whether the router is up. - The time when the current run started, if the router is up. - The weighted sum length of all previous runs. 
- The time at which the weighted sum length was last weighted down. - - Servers should probe at random intervals to test whether servers are - running. diff --git a/doc/spec/proposals/109-no-sharing-ips.txt b/doc/spec/proposals/109-no-sharing-ips.txt deleted file mode 100644 index 5438cf049a..0000000000 --- a/doc/spec/proposals/109-no-sharing-ips.txt +++ /dev/null @@ -1,90 +0,0 @@ -Filename: 109-no-sharing-ips.txt -Title: No more than one server per IP address. -Author: Kevin Bauer & Damon McCoy -Created: 9-March-2007 -Status: Closed -Implemented-In: 0.2.0.x - -Overview: - This document describes a solution to a Sybil attack vulnerability in the - directory servers. Currently, it is possible for a single IP address to - host an arbitrarily high number of Tor routers. We propose that the - directory servers limit the number of Tor routers that may be registered at - a particular IP address to some small (fixed) number, perhaps just one Tor - router per IP address. - - While Tor never uses more than one server from a given /16 in the same - circuit, an attacker with multiple servers in the same place is still - dangerous because he can get around the per-server bandwidth cap that is - designed to prevent a single server from attracting too much of the overall - traffic. - -Motivation: - Since it is possible for an attacker to register an arbitrarily large - number of Tor routers, it is possible for malicious parties to do this - as part of a traffic analysis attack. - -Security implications: - This countermeasure will increase the number of IP addresses that an - attacker must control in order to carry out traffic analysis. - -Specification: - - For each IP address, each directory authority tracks the number of routers - using that IP address, along with their total observed bandwidth. If there - are more than MAX_SERVERS_PER_IP servers at some IP, the authority should - "disable" all but MAX_SERVERS_PER_IP servers. 
When choosing which servers - to disable, the authority should first disable non-Running servers in - increasing order of observed bandwidth, and then should disable Running - servers in increasing order of bandwidth. - - [[ We don't actually do this part here. -NM - - If the total observed - bandwidth of the remaining non-"disabled" servers exceeds MAX_BW_PER_IP, - the authority should "disable" some of the remaining servers until only one - server remains, or until the remaining observed bandwidth of non-"disabled" - servers is under MAX_BW_PER_IP. - ]] - - Servers that are "disabled" MUST be marked as non-Valid and non-Running. - - MAX_SERVERS_PER_IP is 3. - - MAX_BW_PER_IP is 8 MB per s. - -Compatibility: - - Upon inspection of a directory server, we found that the following IP - addresses have more than one Tor router: - - Scruples 68.5.113.81 ip68-5-113-81.oc.oc.cox.net 443 - WiseUp 68.5.113.81 ip68-5-113-81.oc.oc.cox.net 9001 - Unnamed 62.1.196.71 pc01-megabyte-net-arkadiou.megabyte.gr 9001 - Unnamed 62.1.196.71 pc01-megabyte-net-arkadiou.megabyte.gr 9001 - Unnamed 62.1.196.71 pc01-megabyte-net-arkadiou.megabyte.gr 9001 - aurel 85.180.62.138 e180062138.adsl.alicedsl.de 9001 - sokrates 85.180.62.138 e180062138.adsl.alicedsl.de 9001 - moria1 18.244.0.188 moria.mit.edu 9001 - peacetime 18.244.0.188 moria.mit.edu 9100 - - There may exist compatibility issues with this proposed fix. Reasons why - more than one server would share an IP address include: - - * Testing. moria1, moria2, peacetime, and other morias all run on one - computer at MIT, because that way we get testing. Moria1 and moria2 are - run by Roger, and peacetime is run by Nick. - * NAT. If there are several servers but they port-forward through the same - IP address, ... we can hope that the operators coordinate with each - other. Also, we should recognize that while they help the network in - terms of increased capacity, they don't help as much as they could in - terms of location diversity. 
But our approach so far has been to take - what we can get. - * People who have more than 1.5MB/s and want to help out more. For - example, for a while Tonga was offering 10MB/s and its Tor server - would only make use of a bit of it. So Roger suggested that he run - two Tor servers, to use more. - -[Note Roger's tweak to this behavior, in -http://archives.seul.org/or/cvs/Oct-2007/msg00118.html] - diff --git a/doc/spec/proposals/110-avoid-infinite-circuits.txt b/doc/spec/proposals/110-avoid-infinite-circuits.txt deleted file mode 100644 index fffc41c25a..0000000000 --- a/doc/spec/proposals/110-avoid-infinite-circuits.txt +++ /dev/null @@ -1,120 +0,0 @@ -Filename: 110-avoid-infinite-circuits.txt -Title: Avoiding infinite length circuits -Author: Roger Dingledine -Created: 13-Mar-2007 -Status: Accepted -Target: 0.2.1.x -Implemented-In: 0.2.1.3-alpha - -History: - - Revised 28 July 2008 by nickm: set K. - Revised 3 July 2008 by nickm: rename from relay_extend to - relay_early. Revise to current migration plan. Allow K cells - over circuit lifetime, not just at start. - -Overview: - - Right now, an attacker can add load to the Tor network by extending a - circuit an arbitrary number of times. Every cell that goes down the - circuit then adds N times that amount of load in overall bandwidth - use. This vulnerability arises because servers don't know their position - on the path, so they can't tell how many nodes there are before them - on the path. - - We propose a new set of relay cells that are distinguishable by - intermediate hops as permitting extend cells. This approach will allow - us to put an upper bound on circuit length relative to the number of - colluding adversary nodes; but there are some downsides too. 
- -Motivation: - - The above attack can be used to generally increase load all across the - network, or it can be used to target specific servers: by building a - circuit back and forth between two victim servers, even a low-bandwidth - attacker can soak up all the bandwidth offered by the fastest Tor - servers. - - The general attacks could be used as a demonstration that Tor isn't - perfect (leading to yet more media articles about "breaking" Tor), and - the targeted attacks will come into play once we have a reputation - system -- it will be trivial to DoS a server so it can't pass its - reputation checks, in turn impacting security. - -Design: - - We should split RELAY cells into two types: RELAY and RELAY_EARLY. - - Only K (say, 10) Relay_early cells can be sent across a circuit, and - only relay_early cells are allowed to contain extend requests. We - still support obscuring the length of the circuit (if more research - shows us what to do), because Alice can choose how many of the K to - mark as relay_early. Note that relay_early cells *can* contain any - sort of data cell; so in effect it's actually the relay type cells - that are restricted. By default, she would just send the first K - data cells over the stream as relay_early cells, regardless of their - actual type. - - (Note that a circuit that is out of relay_early cells MUST NOT be - cannibalized later, since it can't extend. Note also that it's always okay - to use regular RELAY cells when sending non-EXTEND commands targeted at - the first hop of a circuit, since there is no intermediate hop to try to - learn the relay command type.) - - Each intermediate server would pass on the same type of cell that it - received (either relay or relay_early), and the cell's destination - will be able to learn whether it's allowed to contain an Extend request.
- - If an intermediate server receives more than K relay_early cells, or - if it sees a relay cell that contains an extend request, then it - tears down the circuit (protocol violation). - -Security implications: - - The upside is that this limits the bandwidth amplification factor to - K: for an individual circuit to become arbitrary-length, the attacker - would need an adversary-controlled node every K hops, and at that - point the attack is no worse than if the attacker creates N/K separate - K-hop circuits. - - On the other hand, we want to pick a large enough value of K that we - don't mind the cap. - - If we ever want to take steps to hide the number of hops in the circuit - or a node's position in the circuit, this design probably makes that - more complex. - -Migration: - - In 0.2.0, servers speaking v2 or later of the link protocol accept - RELAY_EARLY cells, and pass them on. If the next OR in the circuit - is not speaking the v2 link protocol, the server relays the cell as - a RELAY cell. - - In 0.2.1.3-alpha, clients begin using RELAY_EARLY cells on v2 - connections. This functionality can be safely backported to - 0.2.0.x. Clients should pick a random number between (say) K and - K-2 to send. - - In 0.2.1.3-alpha, servers close any circuit in which more than K - relay_early cells are sent. - - Once all versions that do not send RELAY_EARLY cells are obsolete, - servers can begin to reject any EXTEND requests not sent in a - RELAY_EARLY cell. - -Parameters: - - Let K = 8, for no terribly good reason. - -Spec: - - [We can formalize this part once we think the design is a good one.] - -Acknowledgements: - - This design has been kicking around since Christian Grothoff and I came - up with it at PET 2004. (Nathan Evans, Christian Grothoff's student, - is working on implementing a fix based on this design in the summer - 2007 timeframe.)
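The teardown rule from the Design section of proposal 110 can be sketched as a small per-circuit check. This is only an illustration under stated assumptions, not Tor's implementation: `CircuitState`, `process_cell`, and the `is_extend_for_us` flag are hypothetical names, and K = 8 is taken from the Parameters section. The flag models the fact that only the hop a cell is addressed to can see whether it carries an extend request.

```python
from dataclasses import dataclass

K = 8  # "Let K = 8" in the Parameters section of proposal 110

RELAY = "RELAY"
RELAY_EARLY = "RELAY_EARLY"


@dataclass
class CircuitState:
    """Hypothetical per-circuit bookkeeping kept by one server."""
    relay_early_seen: int = 0


def process_cell(circ, cell_type, is_extend_for_us=False):
    """Return the cell type to pass on, or None to tear down the circuit.

    cell_type        -- RELAY or RELAY_EARLY, as received.
    is_extend_for_us -- True only if this hop is the cell's destination
                        and the payload is an extend request (assumed to
                        be known to the caller after decryption).
    """
    if cell_type == RELAY_EARLY:
        circ.relay_early_seen += 1
        if circ.relay_early_seen > K:
            return None  # more than K relay_early cells: protocol violation
    elif is_extend_for_us:
        return None  # extend request inside a plain relay cell: violation
    # Otherwise pass on the same type of cell that was received.
    return cell_type
```

A circuit therefore survives its first K relay_early cells and is torn down on the next one, while plain relay cells are always forwarded unless they carry an extend request for this hop.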
- diff --git a/doc/spec/proposals/111-local-traffic-priority.txt b/doc/spec/proposals/111-local-traffic-priority.txt deleted file mode 100644 index 9411463c21..0000000000 --- a/doc/spec/proposals/111-local-traffic-priority.txt +++ /dev/null @@ -1,151 +0,0 @@ -Filename: 111-local-traffic-priority.txt -Title: Prioritizing local traffic over relayed traffic -Author: Roger Dingledine -Created: 14-Mar-2007 -Status: Closed -Implemented-In: 0.2.0.x - -Overview: - - We describe some ways to let Tor users operate as a relay and enforce - rate limiting for relayed traffic without impacting their locally - initiated traffic. - -Motivation: - - Right now we encourage people who use Tor as a client to configure it - as a relay too ("just click the button in Vidalia"). Most of these users - are on asymmetric links, meaning they have a lot more download capacity - than upload capacity. But if they enable rate limiting too, suddenly - they're limited to the same download capacity as upload capacity. And - they have to enable rate limiting, or their upstream pipe gets filled - up, starts dropping packets, and now their net connection doesn't work - even for non-Tor stuff. So they end up turning off the relaying part - so they can use Tor (and other applications) again. - - So far this hasn't mattered that much: most of our fast relays are - being operated only in relay mode, so the rate limiting makes sense - for them. But if we want to be able to attract many more relays in - the future, we need to let ordinary users act as relays too. - - Further, as we begin to deploy the blocking-resistance design and we - rely on ordinary users to click the "Tor for Freedom" button, this - limitation will become a serious stumbling block to getting volunteers - to act as bridges. - -The problem: - - Tor implements its rate limiting on the 'read' side by only reading - a certain number of bytes from the network in each second. 
If it has - emptied its token bucket, it doesn't read any more from the network; - eventually TCP notices and stalls until we resume reading. But if we - want to have two classes of service, we can't know what class a given - incoming cell will be until we look at it, at which point we've already - read it. - -Some options: - - Option 1: read when our token bucket is full enough, and if it turns - out that what we read was local traffic, then add the tokens back into - the token bucket. This will work when local traffic load alternates - with relayed traffic load; but it's a poor option in general, because - when we're receiving both local and relayed traffic, there are plenty - of cases where we'll end up with an empty token bucket, and then we're - back where we were before. - - More generally, notice that our problem is easy when a given TCP - connection either has entirely local circuits or entirely relayed - circuits. In fact, even if they are both present, if one class is - entirely idle (none of its circuits have sent or received in the past - N seconds), we can ignore that class until it wakes up again. So it - only gets complex when a single connection contains active circuits - of both classes. - - Next, notice that local traffic uses only the entry guards, whereas - relayed traffic likely doesn't. So if we're a bridge handling just - a few users, the expected number of overlapping connections would be - almost zero, and even if we're a full relay the number of overlapping - connections will be quite small. - - Option 2: build separate TCP connections for local traffic and for - relayed traffic. In practice this will actually only require a few - extra TCP connections: we would only need redundant TCP connections - to at most the number of entry guards in use. - - However, this approach has some drawbacks. First, if the remote side - wants to extend a circuit to you, how does it know which TCP connection - to send it on? 
We would need some extra scheme to label some connections - "client-only" during construction. Perhaps we could do this by seeing - whether any circuit was made via CREATE_FAST; but this still opens - up a race condition where the other side sends a create request - immediately. The only ways I can imagine to avoid the race entirely - are to specify our preference in the VERSIONS cell, or to add some - sort of "nope, not this connection, why don't you try another rather - than failing" response to create cells, or to forbid create cells on - connections that you didn't initiate and on which you haven't seen - any circuit creation requests yet -- this last one would lead to a bit - more connection bloat but doesn't seem so bad. And we already accept - this race for the case where directory authorities establish new TCP - connections periodically to check reachability, and then hope to hang - up on them soon after. (In any case this issue is moot for bridges, - since each destination will be one-way with respect to extend requests: - either receiving extend requests from bridge users or sending extend - requests to the Tor server, never both.) - - The second problem with option 2 is that using two TCP connections - reveals that there are two classes of traffic (and probably quickly - reveals which is which, based on throughput). Now, it's unclear whether - this information is already available to the other relay -- he would - easily be able to tell that some circuits are fast and some are rate - limited, after all -- but it would be nice to not add even more ways to - leak that information. Also, it's less clear that an external observer - already has this information if the circuits are all bundled together, - and for this case it's worth trying to protect it. - - Option 3: tell the other side about our rate limiting rules. When we - establish the TCP connection, specify the different policy classes we - have configured. 
Each time we extend a circuit, specify which policy - class that circuit should be part of. Then hope the other side obeys - our wishes. (If he doesn't, hang up on him.) Besides the design and - coordination hassles involved in this approach, there's a big problem: - our rate limiting classes apply to all our connections, not just - pairwise connections. How does one server we're connected to know how - much of our bucket has already been spent by another? I could imagine - a complex and inefficient "ok, now you can send me those two more cells - that you've got queued" protocol. I'm not sure how else we could do it. - - (Gosh. How could UDP designs possibly be compatible with rate limiting - with multiple bucket sizes?) - - Option 4: put both classes of circuits over a single connection, and - keep track of the last time we read or wrote a high-priority cell. If - it's been less than N seconds, give the whole connection high priority, - else give the whole connection low priority. - - Option 5: put both classes of circuits over a single connection, and - play a complex juggling game by periodically telling the remote side - what rate limits to set for that connection, so you end up giving - priority to the right connections but still stick to roughly your - intended bandwidthrate and relaybandwidthrate. - - Option 6: ? - -Prognosis: - - Nick really didn't like option 2 because of the partitioning questions. - - I've put option 4 into place as of Tor 0.2.0.3-alpha. - - In terms of implementation, it will be easy: just add a time_t to - or_connection_t that specifies client_used (used by the initiator - of the connection to rate limit it differently depending on how - recently the time_t was reset). We currently update client_used - in three places: - - command_process_relay_cell() when we receive a relay cell for - an origin circuit. - - relay_send_command_from_edge() when we send a relay cell for - an origin circuit. 
- - circuit_deliver_create_cell() when we send a create cell. - We could probably remove the third case and it would still work, - but hey. - diff --git a/doc/spec/proposals/112-bring-back-pathlencoinweight.txt b/doc/spec/proposals/112-bring-back-pathlencoinweight.txt deleted file mode 100644 index 3f6c3376f0..0000000000 --- a/doc/spec/proposals/112-bring-back-pathlencoinweight.txt +++ /dev/null @@ -1,163 +0,0 @@ -Filename: 112-bring-back-pathlencoinweight.txt -Title: Bring Back Pathlen Coin Weight -Author: Mike Perry -Created: -Status: Superseded -Superseded-By: 115 - - -Overview: - - The idea is that users should be able to choose a weight which - probabilistically chooses their path lengths to be 2 or 3 hops. This - weight will essentially be a biased coin that indicates an - additional hop (beyond 2) with probability P. The user should be - allowed to choose 0 for this weight to always get 2 hops and 1 to - always get 3. - - This value should be modifiable from the controller, and should be - available from Vidalia. - - -Motivation: - - The Tor network is slow and overloaded. Increasingly often I hear - stories about friends and friends of friends who are behind firewalls, - annoying censorware, or under surveillance that interferes with their - productivity and Internet usage, or chills their speech. These people - know about Tor, but they choose to put up with the censorship because - Tor is too slow to be usable for them. In fact, to download a fresh, - complete copy of levine-timing.pdf for the Anonymity Implications - section of this proposal over Tor took me 3 tries. - - There are many ways to improve the speed problem, and of course we - should and will implement as many as we can. Johannes's GSoC project - and my reputation system are longer term, higher-effort things that - will still provide benefit independent of this proposal.
- - However, reducing the path length to 2 for those who do not need the - (questionable) extra anonymity 3 hops provide not only improves - their Tor experience but also reduces their load on the Tor network by - 33%, and can be done in less than 10 lines of code. That's not just - Win-Win, it's Win-Win-Win. - - Furthermore, when blocking resistance measures insert an extra relay - hop into the equation, 4 hops will certainly be completely unusable - for these users, especially since it will be considerably more - difficult to balance the load across a dark relay net than balancing - the load on Tor itself (which today is still not without its flaws). - - -Anonymity Implications: - - It has long been established that timing attacks against mixed - networks are extremely effective, and that regardless of path - length, if the adversary has compromised your first and last - hop of your path, you can assume they have compromised your - identity for that connection. - - In [1], it is demonstrated that for all but the slowest, lossiest - networks, error rates for false positives and false negatives were - very near zero. Only for constant streams of traffic over slow and - (more importantly) extremely lossy network links did the error rate - hit 20%. For loss rates typical to the Internet, even the error rate - for slow nodes with constant traffic streams was 13%. - - When you take into account that most Tor streams are not constant, - but probably much more like their "HomeIP" dataset, which consists - mostly of web traffic that exists over finite intervals at specific - times, error rates drop to fractions of 1%, even for the "worst" - network nodes. - - Therefore, the user has little benefit from the extra hop, assuming - the adversary does timing correlation on their nodes. The real - protection is the probability of getting both the first and last hop, - and this is constant whether the client chooses 2 hops, 3 hops, or 42. 
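The argument that path length does not change exposure to end-to-end timing correlation can be written out under a deliberately simplified model. Assume, purely for illustration and not as part of the proposal, that the adversary controls a fraction $c$ of the relays a client might pick and that the first and last hop are chosen independently; entry guards and bandwidth weighting are ignored. For a path of $n \ge 2$ hops:

```latex
\Pr[\text{first and last hop both compromised}] \;=\; c \cdot c \;=\; c^{2},
\qquad \text{independent of } n .
```

With $c = 0.1$, for instance, about 1% of circuits are exposed whether $n$ is 2, 3, or 42; under this model the middle hops do not enter the expression at all.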
- - Partitioning attacks form another concern. Since Tor uses telescoping - to build circuits, it is possible to tell a user is constructing only - two hop paths at the entry node. It is questionable if this data is - actually worth anything though, especially if the majority of users - have easy access to this option, and do actually choose their path - lengths semi-randomly. - - Nick has postulated that exits may also be able to tell that you are - using only 2 hops by the amount of time between sending their - RELAY_CONNECTED cell and the first bit of RELAY_DATA traffic they - see from the OP. I doubt that they will be able to make much use - of this timing pattern, since it will likely vary widely depending - upon the type of node selected for that first hop, and the user's - connection rate to that first hop. It is also questionable if this - data is worth anything, especially if many users are using this - option (and I imagine many will). - - Perhaps most seriously, two hop paths do allow malicious guards - to easily fail circuits if they do not extend to their colluding peers - for the exit hop. Since guards can detect the number of hops in a - path, they could always fail the 3 hop circuits and focus on - selectively failing the two hop ones until a peer was chosen. - - I believe currently guards are rotated if circuits fail, which does - provide some protection, but this could be changed so that an entry - guard is completely abandoned after a certain ratio of extend or - general circuit failures with respect to non-failed circuits. This - could possibly be gamed to increase guard turnover, but such a game - would be much more noticeable than an individual guard failing circuits, - though, since it would affect all clients, not just those who chose - a particular guard. 
- - -Why not fix Pathlen=2?: - - The main reason I am not advocating that we always use 2 hops is that - in some situations, timing correlation evidence by itself may not be - considered as solid and convincing as an actual, uninterrupted, fully - traced path. Are these timing attacks as effective on a real network - as they are in simulation? Would an extralegal adversary or authoritarian - government even care? In the face of these situation-dependent unknowns, - it should be up to the user to decide if this is a concern for them or not. - - It should probably also be noted that even a false positive - rate of 1% for a 200k concurrent-user network could mean that for a - given node, a given stream could be confused with something like 10 - users, assuming ~200 nodes carry most of the traffic (i.e. 1000 users - each). Though of course to really know for sure, someone needs to do - an attack on a real network, unfortunately. - - -Implementation: - - new_route_len() can be modified directly with a check of the - PathlenCoinWeight option (converted to percent) and a call to - crypto_rand_int(0,100) for the weighted coin. - - The entry_guard_t structure could have num_circ_failed and - num_circ_succeeded members such that if it exceeds N% circuit - extend failure rate to a second hop, it is removed from the entry list. - N should be sufficiently high to avoid churn from normal Tor circuit - failure as determined by TorFlow scans. - - The Vidalia option should be presented as a boolean, to minimize confusion - for the user. Something like a radiobutton with: - - * "I use Tor for Censorship Resistance, not Anonymity. Speed is more - important to me than Anonymity." - * "I use Tor for Anonymity. I need extra protection at the cost of speed." - - and then some explanation in the help for exactly what this means, and - the risks involved with eliminating the adversary's need for timing attacks - with respect to false positives, etc.
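The weighted coin described in the Implementation section of proposal 112 can be sketched as follows. This is a minimal Python illustration, not the C change itself: `new_route_len` here only borrows the name from the proposal, `random.randrange(100)` stands in for the proposal's `crypto_rand_int` call (a real client would want a cryptographically strong source), and `relative_load` is a hypothetical helper that assumes network load is proportional to the number of hops.

```python
import random


def new_route_len(pathlen_coin_weight, rng=random):
    """Return 3 with probability pathlen_coin_weight, otherwise 2.

    pathlen_coin_weight -- the proposal's weight P, between 0 and 1.
    rng                 -- any object with randrange(); a stand-in for
                           the random source used by the real code.
    """
    if not 0.0 <= pathlen_coin_weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    percent = int(round(pathlen_coin_weight * 100))  # "converted to percent"
    roll = rng.randrange(100)  # uniform integer in 0..99
    # "Heads" adds one hop beyond the baseline of 2.
    return 3 if roll < percent else 2


def relative_load(pathlen_coin_weight):
    """Expected hops (2 + P) relative to a fixed 3-hop path.

    Assumes load scales linearly with path length (an illustration,
    not a measurement).
    """
    return (2.0 + pathlen_coin_weight) / 3.0
```

A weight of 0 always yields 2 hops and a weight of 1 always yields 3, matching the Overview. Under the linear-load assumption, `relative_load(0.0)` is 2/3, which is where the "reduces their load on the Tor network by 33%" figure comes from.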
- -Migration: - - Phase one: Experiment with the proper ratio of circuit failures - used to expire garbage or malicious guards via TorFlow. - - Phase two: Re-enable config and modify new_route_len() to add an - extra hop if coin comes up "heads". - - Phase three: Make radiobutton in Vidalia, along with help entry - that explains in layman's terms the risks involved. - - -[1] http://www.cs.umass.edu/~mwright/papers/levine-timing.pdf diff --git a/doc/spec/proposals/113-fast-authority-interface.txt b/doc/spec/proposals/113-fast-authority-interface.txt deleted file mode 100644 index 8912b53220..0000000000 --- a/doc/spec/proposals/113-fast-authority-interface.txt +++ /dev/null @@ -1,85 +0,0 @@ -Filename: 113-fast-authority-interface.txt -Title: Simplifying directory authority administration -Author: Nick Mathewson -Created: -Status: Superseded - -Overview - -The problem: - - Administering a directory authority is a pain: you need to go through - emails and manually add new nodes as "named". When bad things come up, - you need to mark nodes (or whole regions) as invalid, badexit, etc. - - This means that mostly, authority admins don't: only 2/4 current authority - admins actually bind names or list bad exits, and those two have often - complained about how annoying it is to do so. - - Worse, name binding is a common path, but it's a pain in the neck: nobody - has done it for a couple of months. - -Digression: who knows what? - - It's trivial for Tor to automatically keep track of all of the - following information about a server: - name, fingerprint, IP, last-seen time, first-seen time, declared - contact. - - All we need to have the administrator set is: - - Is this name/fingerprint pair bound? - - Is this fingerprint/IP a bad exit? - - Is this fingerprint/IP an invalid node? - - Is this fingerprint/IP to be rejected? - - The workflow for authority admins has two parts: - - Periodically, go through tor-ops and add new names. This doesn't - need to be done urgently. 
- - Less often, mark badly behaved servers as badly behaved. This is more - urgent. - -Possible solution #1: Web-interface for name binding. - - Deprecate use of the tor-ops mailing list; instead, have operators go to a - webform and enter their server info. This would put the information in a - standardized format, thus allowing quick, nearly-automated approval and - reply. - -Possible solution #2: Self-binding names. - - Peter Palfrader has proposed that names be assigned automatically to nodes - that have been up and running and valid for a while. - -Possible solution #3: Self-maintaining approved-routers file - - Mixminion alpha has a neat feature where whenever a new server is seen, - a stub line gets added to a configuration file. For Tor, it could look - something like this: - - ## First seen with this key on 2007-04-21 13:13:14 - ## Stayed up for at least 12 hours on IP 192.168.10.10 - #RouterName AAAABBBBCCCCDDDDEFEF - - (Note that the implementation needs to parse commented lines to make sure - that it doesn't add duplicates, but that's not so hard.) - - To add a router as named, administrators would only need to uncomment the - entry. This automatically maintained file could be kept separately from a - manually maintained one. - - This could be combined with solution #2, such that Tor would do the hard - work of uncommenting entries for routers that should get Named, but - operators could override its decisions. - -Possible solution #4: A separate mailing list for authority operators. - - Right now, the tor-ops list is very high volume. There should be another - list that's only for dealing with problems that need prompt action, like - marking a router as !badexit. - -Resolution: - - Solution #2 is described in "Proposal 123: Naming authorities - automatically create bindings", and that approach is implemented. - There are remaining issues in the problem statement above that need - their own solutions.
diff --git a/doc/spec/proposals/114-distributed-storage.txt b/doc/spec/proposals/114-distributed-storage.txt deleted file mode 100644 index 91a787d301..0000000000 --- a/doc/spec/proposals/114-distributed-storage.txt +++ /dev/null @@ -1,439 +0,0 @@ -Filename: 114-distributed-storage.txt -Title: Distributed Storage for Tor Hidden Service Descriptors -Author: Karsten Loesing -Created: 13-May-2007 -Status: Closed -Implemented-In: 0.2.0.x - -Change history: - - 13-May-2007 Initial proposal - 14-May-2007 Added changes suggested by Lasse Øverlier - 30-May-2007 Changed descriptor format, key length discussion, typos - 09-Jul-2007 Incorporated suggestions by Roger, added status of specification - and implementation for upcoming GSoC mid-term evaluation - 11-Aug-2007 Updated implementation statuses, included non-consecutive - replication to descriptor format - 20-Aug-2007 Renamed config option HSDir as HidServDirectoryV2 - 02-Dec-2007 Closed proposal - -Overview: - - The basic idea of this proposal is to distribute the tasks of storing and - serving hidden service descriptors from currently three authoritative - directory nodes among a large subset of all onion routers. The three - reasons to do this are better robustness (availability), better - scalability, and improved security properties. Further, - this proposal suggests changes to the hidden service descriptor format to - prevent new security threats coming from decentralization and to gain even - better security properties. - -Status: - - As of December 2007, the new hidden service descriptor format is implemented - and usable. However, servers and clients do not yet make use of descriptor - cookies, because there are open usability issues of this feature that might - be resolved in proposal 121. Further, hidden service directories do not - perform replication by themselves, because (unauthorized) replica fetch - requests would allow any attacker to fetch all hidden service descriptors in - the system. 
As neither issue is critical to the functioning of v2 - descriptors and their distribution, this proposal is considered as Closed. - -Motivation: - - The current design of hidden services exhibits the following performance and - security problems: - - First, the three hidden service authoritative directories constitute a - performance bottleneck in the system. The directory nodes are responsible for - storing and serving all hidden service descriptors. As of May 2007 there are - about 1000 descriptors at a time, but this number is assumed to increase in - the future. Further, there is no replication protocol for descriptors between - the three directory nodes, so that hidden services must ensure the - availability of their descriptors by manually publishing them on all - directory nodes. Whenever a fourth or fifth hidden service authoritative - directory is added, hidden services will need to maintain an equally - increasing number of replicas. These scalability issues have an impact on the - current usage of hidden services and put an even higher burden on the - development of new kinds of applications for hidden services that might - require storing even more descriptors. - - Second, besides posing a limitation to scalability, storing all hidden - service descriptors on three directory nodes also constitutes a security - risk. The directory node operators could easily analyze the publish and fetch - requests to derive information on service activity and usage and read the - descriptor contents to determine which onion routers work as introduction - points for a given hidden service and need to be attacked or threatened to - shut it down. Furthermore, the contents of a hidden service descriptor offer - only minimal security properties to the hidden service. Whoever gets aware of - the service ID can easily find out whether the service is active at the - moment and which introduction points it has. 
This applies to (former) - clients, (former) introduction points, and of course to the directory nodes. - It requires only to request the descriptor for the given service ID, which - can be performed by anyone anonymously. - - This proposal suggests two major changes to approach the described - performance and security problems: - - The first change affects the storage location for hidden service descriptors. - Descriptors are distributed among a large subset of all onion routers instead - of three fixed directory nodes. Each storing node is responsible for a subset - of descriptors for a limited time only. It is not able to choose which - descriptors it stores at a certain time, because this is determined by its - onion ID which is hard to change frequently and in time (only routers which - are stable for a given time are accepted as storing nodes). In order to - resist single node failures and untrustworthy nodes, descriptors are - replicated among a certain number of storing nodes. A first replication - protocol makes sure that descriptors don't get lost when the node population - changes; therefore, a storing node periodically requests the descriptors from - its siblings. A second replication protocol distributes descriptors among - non-consecutive nodes of the ID ring to prevent a group of adversaries from - generating new onion keys until they have consecutive IDs to create a 'black - hole' in the ring and make random services unavailable. Connections to - storing nodes are established by extending existing circuits by one hop to - the storing node. This also ensures that contents are encrypted. The effect - of this first change is that the probability that a single node operator - learns about a certain hidden service is very small and that it is very hard - to track a service over time, even when it collaborates with other node - operators. - - The second change concerns the content of hidden service descriptors. 
- Obviously, security problems cannot be solved only by decentralizing storage; - in fact, they could also get worse if done without caution. At first, a - descriptor ID needs to change periodically in order to be stored on changing - nodes over time. Next, the descriptor ID needs to be computable only for the - service's clients, but should be unpredictable for all other nodes. Further, - the storing node needs to be able to verify that the hidden service is the - true originator of the descriptor with the given ID even though it is not a - client. Finally, a storing node should learn as little information as - necessary by storing a descriptor, because it might not be as trustworthy as - a directory node; for example it does not need to know the list of - introduction points. Therefore, a second key is applied that is only known to - the hidden service provider and its clients and that is not included in the - descriptor. It is used to calculate descriptor IDs and to encrypt the - introduction points. This second key can either be given to all clients - together with the hidden service ID, or to a group or a single client as - an authentication token. In the future this second key could be the result of - some key agreement protocol between the hidden service and one or more - clients. A new text-based format is proposed for descriptors instead of an - extension of the existing binary format for reasons of future extensibility. - -Design: - - The proposed design is described by the required changes to the current - design. These requirements are grouped by content, rather than by affected - specification documents or code files, and numbered for reference below. 
- - Hidden service clients, servers, and directories: - - /1/ Create routing list - - All participants can filter the consensus status document received from the - directory authorities to one routing list containing only those servers - that store and serve hidden service descriptors and which are running for - at least 24 hours. A participant only trusts its own routing list and never - learns about routing information from other parties. - - /2/ Determine responsible hidden service directory - - All participants can determine the hidden service directory that is - responsible for storing and serving a given ID, as well as the hidden - service directories that replicate its content. Every hidden service - directory is responsible for the descriptor IDs in the interval from - its predecessor, exclusive, to its own ID, inclusive. Further, a hidden - service directory holds replicas for its n predecessors, where n denotes - the number of consecutive replicas. (requires /1/) - - [/3/ and /4/ were requirements to use BEGIN_DIR cells for directory - requests which have not been fulfilled in the course of the implementation - of this proposal, but elsewhere.] - - Hidden service directory nodes: - - /5/ Advertise hidden service directory functionality - - Every onion router that has its directory port open can decide whether it - wants to store and serve hidden service descriptors by setting a new config - option "HidServDirectoryV2" 0|1 to 1. An onion router with this config - option being set includes the flag "hidden-service-dir" in its router - descriptors that it sends to directory authorities. - - /6/ Accept v2 publish requests, parse and store v2 descriptors - - Hidden service directory nodes accept publish requests for hidden service - descriptors and store them to their local memory. 
(It is not necessary to - make descriptors persistent, because after disconnecting, the onion router - would not be accepted as storing node anyway, because it has not been - running for at least 24 hours.) All requests and replies are formatted as - HTTP messages. Requests are directed to the router's directory port and are - contained within BEGIN_DIR cells. A hidden service directory node stores a - descriptor only when it thinks that it is responsible for storing that - descriptor based on its own routing table. Every hidden service directory - node is responsible for the descriptor IDs in the interval of its n-th - predecessor in the ID circle up to its own ID (n denotes the number of - consecutive replicas). (requires /1/) - - /7/ Accept v2 fetch requests - - Same as /6/, but with fetch requests for hidden service descriptors. - (requires /2/) - - /8/ Replicate descriptors with neighbors - - A hidden service directory node replicates descriptors from its two - predecessors by downloading them once an hour. Further, it checks its - routing table periodically for changes. Whenever it realizes that a - predecessor has left the network, it establishes a connection to the new - n-th predecessor and requests its stored descriptors in the interval of its - (n+1)-th predecessor and the requested n-th predecessor. Whenever it - realizes that a new onion router has joined with an ID higher than its - former n-th predecessor, it adds it to its predecessors and discards all - descriptors in the interval of its (n+1)-th and its n-th predecessor. - (requires /1/) - - [Dec 02: This function has not been implemented, because arbitrary nodes - would have been able to download the entire set of v2 descriptors. An - authorized replication request would be necessary. For the moment, the - system runs without any directory-side replication.
-KL] - - Authoritative directory nodes: - - /9/ Confirm a router's hidden service directory functionality - - Directory nodes include a new flag "HSDir" for routers that decided to - provide storage for hidden service descriptors and that are running for at - least 24 hours. The last requirement prevents a node from frequently - changing its onion key to become responsible for an identifier it wants to - target. - - Hidden service provider: - - /10/ Configure v2 hidden service - - Each hidden service provider that has set the config option - "PublishV2HidServDescriptors" 0|1 to 1 is configured to publish v2 - descriptors and conform to the v2 connection establishment protocol. When - configuring a hidden service, a hidden service provider checks if it has - already created a random secret_cookie and a hostname2 file; if not, it - creates both of them. (requires /2/) - - /11/ Establish introduction points with fresh key - - If configured to publish only v2 descriptors and no v0/v1 descriptors any - more, a hidden service provider that is setting up the hidden service at - introduction points does not pass its own public key, but the public key - of a freshly generated key pair. It also includes these fresh public keys - in the hidden service descriptor together with the other introduction point - information. The reason is that the introduction point does not need to and - therefore should not know for which hidden service it works, so as to - prevent it from tracking the hidden service's activity. (If a hidden - service provider supports both, v0/v1 and v2 descriptors, v0/v1 clients - rely on the fact that all introduction points accept the same public key, - so that this new feature cannot be used.) - - /12/ Encode v2 descriptors and send v2 publish requests - - If configured to publish v2 descriptors, a hidden service provider - publishes a new descriptor whenever its content changes or a new - publication period starts for this descriptor. 
If the current publication - period would only last for less than 60 minutes (= 2 x 30 minutes to allow - the server to be 30 minutes behind and the client 30 minutes ahead), the - hidden service provider publishes both a current descriptor and one for - the next period. Publication is performed by sending the descriptor to all - hidden service directories that are responsible for keeping replicas for - the descriptor ID. This includes two non-consecutive replicas that are - stored at 3 consecutive nodes each. (requires /1/ and /2/) - - Hidden service client: - - /13/ Send v2 fetch requests - - A hidden service client that has set the config option - "FetchV2HidServDescriptors" 0|1 to 1 handles SOCKS requests for v2 onion - addresses by requesting a v2 descriptor from a randomly chosen hidden - service directory that is responsible for keeping replica for the - descriptor ID. In total there are six replicas of which the first and the - last three are stored on consecutive nodes. The probability of picking one - of the three consecutive replicas is 1/6, 2/6, and 3/6 to incorporate the - fact that the availability will be the highest on the node with next higher - ID. A hidden service client relies on the hidden service provider to store - two sets of descriptors to compensate clock skew between service and - client. (requires /1/ and /2/) - - /14/ Process v2 fetch reply and parse v2 descriptors - - A hidden service client that has sent a request for a v2 descriptor can - parse it and store it to the local cache of rendezvous service descriptors. - - /15/ Establish connection to v2 hidden service - - A hidden service client can establish a connection to a hidden service - using a v2 descriptor. This includes using the secret cookie for decrypting - the introduction points contained in the descriptor. 
When contacting an - introduction point, the client does not use the public key of the hidden - service provider, but the freshly-generated public key that is included in - the hidden service descriptor. Whether or not a fresh key is used instead - of the key of the hidden service depends on the available protocol versions - that are included in the descriptor; by this, connection establishment is - to a certain extent decoupled from fetching the descriptor. - - Hidden service descriptor: - - (Requirements concerning the descriptor format are contained in /6/ and /7/.) - - The new v2 hidden service descriptor format looks like this: - - onion-address = h(public-key) + cookie - descriptor-id = h(h(public-key) + h(time-period + cookie + replica)) - descriptor-content = { - descriptor-id, - version, - public-key, - h(time-period + cookie + replica), - timestamp, - protocol-versions, - { introduction-points } encrypted with cookie - } signed with private-key - - The "descriptor-id" needs to change periodically in order for the - descriptor to be stored on changing nodes over time. It may only be - computable by a hidden service provider and all of his clients to prevent - unauthorized nodes from tracking the service activity by periodically - checking whether there is a descriptor for this service. Finally, the - hidden service directory needs to be able to verify that the hidden service - provider is the true originator of the descriptor with the given ID. - - Therefore, "descriptor-id" is derived from the "public-key" of the hidden - service provider, the current "time-period" which changes every 24 hours, - a secret "cookie" shared between hidden service provider and clients, and - a "replica" denoting the number of this non-consecutive replica.
(The - "time-period" is constructed in a way that time periods do not change at - the same moment for all descriptors by deriving a value between 0:00 and - 23:59 hours from h(public-key) and making the descriptors of this hidden - service provider expire at that time of the day.) The "descriptor-id" is - defined to be 160 bits long. [extending the "descriptor-id" length - suggested by LØ] - - Only the hidden service provider and the clients are able to generate - future "descriptor-ID"s. Hence, the "onion-address" is extended from now - the hash value of "public-key" by the secret "cookie". The "public-key" is - determined to be 80 bits long, whereas the "cookie" is dimensioned to be - 120 bits long. This makes a total of 200 bits or 40 base32 chars, which is - quite a lot to handle for a human, but necessary to provide sufficient - protection against an adversary from generating a key pair with same - "public-key" hash or guessing the "cookie". - - A hidden service directory can verify that a descriptor was created by the - hidden service provider by checking if the "descriptor-id" corresponds to - the "public-key" and if the signature can be verified with the - "public-key". - - The "introduction-points" that are included in the descriptor are encrypted - using the same "cookie" that is shared between hidden service provider and - clients. [correction to use another key than h(time-period + cookie) as - encryption key for introduction points made by LØ] - - A new text-based format is proposed for descriptors instead of an extension - of the existing binary format for reasons of future extensibility. - -Security implications: - - The security implications of the proposed changes are grouped by the roles of - nodes that could perform attacks or on which attacks could be performed. 
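Several of the arguments below depend on the descriptor-id being computable only by parties that hold the cookie. The following is a minimal Python sketch of the derivation given in the descriptor format above. It is an illustration under stated assumptions, not the implemented algorithm: SHA-1 as h (chosen only because the ID is stated to be 160 bits long), plain byte concatenation for "+", the 4-byte and 1-byte field encodings, and the rule used to shift the daily rollover in time_period() are all hypothetical choices that the proposal does not specify.

```python
import hashlib

SECONDS_PER_DAY = 86400


def h(data: bytes) -> bytes:
    # Assumption: SHA-1, because the proposal only fixes the output at 160 bits.
    return hashlib.sha1(data).digest()


def time_period(now: int, public_key: bytes) -> int:
    # Hypothetical rule: derive a per-service offset within the day from
    # h(public-key) so that not all descriptors roll over at the same moment.
    offset = int.from_bytes(h(public_key)[:4], "big") % SECONDS_PER_DAY
    return (now + offset) // SECONDS_PER_DAY


def descriptor_id(public_key: bytes, cookie: bytes, replica: int, now: int) -> bytes:
    # descriptor-id = h(h(public-key) + h(time-period + cookie + replica))
    period = time_period(now, public_key).to_bytes(4, "big")
    secret_part = h(period + cookie + bytes([replica]))
    return h(h(public_key) + secret_part)


if __name__ == "__main__":
    pk = b"example-public-key"  # placeholder, not a real key encoding
    for replica in (0, 1):
        print(replica, descriptor_id(pk, b"example-cookie", replica, 1_200_000_000).hex())
```

In this sketch a party that knows the public key but not the cookie cannot compute the inner hash, so it cannot predict the ID for any period or replica, which is the property the text relies on.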
- - Attacks by authoritative directory nodes - - Authoritative directory nodes are no longer the single places in the - network that know about a hidden service's activity and introduction - points. Thus, they cannot perform attacks using this information, e.g. - track a hidden service's activity or usage pattern or attack its - introduction points. Formerly, it would only require a single corrupted - authoritative directory operator to perform such an attack. - - Attacks by hidden service directory nodes - - A hidden service directory node could misuse a stored descriptor to track a - hidden service's activity and usage pattern by clients. Though there is no - countermeasure against this kind of attack, it is very expensive to track a - certain hidden service over time. An attacker would need to run a large - number of stable onion routers that work as hidden service directory nodes - to have a good probability to become responsible for its changing - descriptor IDs. For each period, the probability is: - - 1-(N-c choose r)/(N choose r) for N-c>=r and 1 otherwise, with N - as total - number of hidden service directories, c as compromised nodes, and r as - number of replicas - - The hidden service directory nodes could try to make a certain hidden - service unavailable to its clients. Therefore, they could discard all - stored descriptors for that hidden service and reply to clients that there - is no descriptor for the given ID or return an old or false descriptor - content. The client would detect a false descriptor, because it could not - contain a correct signature. But an old content or an empty reply could - confuse the client. Therefore, the countermeasure is to replicate - descriptors among a small number of hidden service directories, e.g. 5. 
- The probability of a group of collaborating nodes to make a hidden service - completely unavailable is in each period: - - (c choose r)/(N choose r) for c>=r and N>=r, and 0 otherwise, - with N as total - number of hidden service directories, c as compromised nodes, and r as - number of replicas - - A hidden service directory could try to find out which introduction points - are working on behalf of a hidden service. In contrast to the previous - design, this is not possible anymore, because this information is encrypted - to the clients of a hidden service. - - Attacks on hidden service directory nodes - - An anonymous attacker could try to swamp a hidden service directory with - false descriptors for a given descriptor ID. This is prevented by requiring - that descriptors are signed. - - Anonymous attackers could swamp a hidden service directory with correct - descriptors for non-existing hidden services. There is no countermeasure - against this attack. However, the creation of valid descriptors is more - expensive than verification and storage in local memory. This should make - this kind of attack unattractive. - - Attacks by introduction points - - Current or former introduction points could try to gain information on the - hidden service they serve. But due to the fresh key pair that is used by - the hidden service, this attack is not possible anymore. - - Attacks by clients - - Current or former clients could track a hidden service's activity, attack - its introduction points, or determine the responsible hidden service - directory nodes and attack them. There is nothing that could prevent them - from doing so, because honest clients need the full descriptor content to - establish a connection to the hidden service. At the moment, the only - countermeasure against dishonest clients is to change the secret cookie and - pass it only to the honest clients. 
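The two per-period probabilities quoted above for attacks by hidden service directory nodes can be evaluated directly. The sketch below takes the formulas as given in the text, with N total hidden service directories, c compromised nodes and r replicas; the function names are made up for illustration, and the example numbers in the usage block are arbitrary, not measurements.

```python
from math import comb


def p_track(N: int, c: int, r: int) -> float:
    # Probability that at least one of the r replicas lands on a compromised
    # node: 1 - (N-c choose r)/(N choose r) for N-c >= r, and 1 otherwise.
    if N - c < r:
        return 1.0
    return 1.0 - comb(N - c, r) / comb(N, r)


def p_blackout(N: int, c: int, r: int) -> float:
    # Probability that all r replicas land on compromised nodes:
    # (c choose r)/(N choose r) for c >= r and N >= r, and 0 otherwise.
    if c < r or N < r:
        return 0.0
    return comb(c, r) / comb(N, r)


if __name__ == "__main__":
    # Arbitrary example values, chosen only to show the shape of the result.
    N, c, r = 1000, 10, 5
    print(f"tracking:  {p_track(N, c, r):.6f}")
    print(f"blackout:  {p_blackout(N, c, r):.3e}")
```

Increasing r raises the tracking probability while lowering the blackout probability, which is the trade-off behind replicating descriptors on only a small number of directories.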
- -Compatibility: - - The proposed design is meant to replace the current design for hidden service - descriptors and their storage in the long run. - - There should be a first transition phase in which both, the current design - and the proposed design are served in parallel. Onion routers should start - serving as hidden service directories, and hidden service providers and - clients should make use of the new design if both sides support it. Hidden - service providers should be allowed to publish descriptors of the current - format in parallel, and authoritative directories should continue storing and - serving these descriptors. - - After the first transition phase, hidden service providers should stop - publishing descriptors on authoritative directories, and hidden service - clients should not try to fetch descriptors from the authoritative - directories. However, the authoritative directories should continue serving - hidden service descriptors for a second transition phase. As of this point, - all v2 config options should be set to a default value of 1. - - After the second transition phase, the authoritative directories should stop - serving hidden service descriptors. - diff --git a/doc/spec/proposals/115-two-hop-paths.txt b/doc/spec/proposals/115-two-hop-paths.txt deleted file mode 100644 index 9854c9ad55..0000000000 --- a/doc/spec/proposals/115-two-hop-paths.txt +++ /dev/null @@ -1,385 +0,0 @@ -Filename: 115-two-hop-paths.txt -Title: Two Hop Paths -Author: Mike Perry -Created: -Status: Dead -Supersedes: 112 - - -Overview: - - The idea is that users should be able to choose if they would like - to have either two or three hop paths through the tor network. - - Let us be clear: the users who would choose this option should be - those that are concerned with IP obfuscation only: ie they would not be - targets of a resource-intensive multi-node attack. It is sometimes said - that these users should find some other network to use other than Tor. 
- This is a foolish suggestion: more users improves security of everyone, - and the current small userbase size is a critical hindrance to - anonymity, as is discussed below and in [1]. - - This value should be modifiable from the controller, and should be - available from Vidalia. - - -Motivation: - - The Tor network is slow and overloaded. Increasingly often I hear - stories about friends and friends of friends who are behind firewalls, - annoying censorware, or under surveillance that interferes with their - productivity and Internet usage, or chills their speech. These people - know about Tor, but they choose to put up with the censorship because - Tor is too slow to be usable for them. In fact, to download a fresh, - complete copy of levine-timing.pdf for the Theoretical Argument - section of this proposal over Tor took me 3 tries. - - Furthermore, the biggest current problem with Tor's anonymity for - those who really need it is not someone attacking the network to - discover who they are. It's instead the extreme danger that so few - people use Tor because it's so slow, that those who do use it have - essentially no confusion set. - - The recent case where the professor and the rogue Tor user were the - only Tor users on campus, and thus suspected in an incident involving - Tor and that University underscores this point: "That was why the police - had come to see me. They told me that only two people on our campus were - using Tor: me and someone they suspected of engaging in an online scam. - The detectives wanted to know whether the other user was a former - student of mine, and why I was using Tor"[1]. - - Not only does Tor provide no anonymity if you use it to be anonymous - but are obviously from a certain institution, location or circumstance, - it is also dangerous to use Tor for risk of being accused of having - something significant enough to hide to be willing to put up with - the horrible performance as opposed to using some weaker alternative. 
- - There are many ways to improve the speed problem, and of course we - should and will implement as many as we can. Johannes's GSoC project - and my reputation system are longer term, higher-effort things that - will still provide benefit independent of this proposal. - - However, reducing the path length to 2 for those who do not need the - extra anonymity 3 hops provide not only improves their Tor experience - but also reduces their load on the Tor network by 33%, and should - increase adoption of Tor by a good deal. That's not just Win-Win, it's - Win-Win-Win. - - -Who will enable this option? - - This is the crux of the proposal. Admittedly, there is some anonymity - loss and some degree of decreased investment required on the part of - the adversary to attack 2 hop users versus 3 hop users, even if it is - minimal and limited mostly to up-front costs and false positives. - - The key questions are: - - 1. Are these users in a class such that their risk is significantly - less than the amount of this anonymity loss? - - 2. Are these users able to identify themselves? - - Many many users of Tor are not at risk for an adversary capturing c/n - nodes of the network just to see what they do. These users use Tor to - circumvent aggressive content filters, or simply to keep their IP out of - marketing and search engine databases. Most content filters have no - interest in running Tor nodes to catch violators, and marketers - certainly would never consider such a thing, both on a cost basis and a - legal one. - - In a sense, this represents an alternate threat model against these - users who are not at risk for Tor's normal threat model. - - It should be evident to these users that they fall into this class. All - that should be needed is a radio button - - * "I use Tor for local content filter circumvention and/or IP obfuscation, - not anonymity. Speed is more important to me than high anonymity. - No one will make considerable efforts to determine my real IP." 
- * "I use Tor for anonymity and/or national-level, legally enforced - censorship. It is possible effort will be taken to identify - me, including but not limited to network surveillance. I need more - protection." - - and then some explanation in the help for exactly what this means, and - the risks involved with eliminating the adversary's need for timing - attacks with respect to false positives. Ultimately, the decision is a - simple one that can be made without this information, however. The user - does not need Paul Syverson to instruct them on the deep magic of Onion - Routing to make this decision. They just need to know why they use Tor. - If they use it just to stay out of marketing databases and/or bypass a - local content filter, two hops is plenty. This is likely the vast - majority of Tor users, and many non-users we would like to bring on - board. - - So, having established this class of users, let us now go on to - examine theoretical and practical risks we place them at, and determine - if these risks violate the users needs, or introduce additional risk - to node operators who may be subject to requests from law enforcement - to track users who need 3 hops, but use 2 because they enjoy the - thrill of russian roulette. - - -Theoretical Argument: - - It has long been established that timing attacks against mixed - and onion networks are extremely effective, and that regardless - of path length, if the adversary has compromised your first and - last hop of your path, you can assume they have compromised your - identity for that connection. - - In fact, it was demonstrated that for all but the slowest, lossiest - networks, error rates for false positives and false negatives were - very near zero[2]. Only for constant streams of traffic over slow and - (more importantly) extremely lossy network links did the error rate - hit 20%. For loss rates typical to the Internet, even the error rate - for slow nodes with constant traffic streams was 13%. 
- - When you take into account that most Tor streams are not constant, - but probably much more like their "HomeIP" dataset, which consists - mostly of web traffic that exists over finite intervals at specific - times, error rates drop to fractions of 1%, even for the "worst" - network nodes. - - Therefore, the user has little benefit from the extra hop, assuming - the adversary does timing correlation on their nodes. Since timing - correlation is simply an implementation issue and is most likely - a single up-front cost (and one that is likely quite a bit cheaper - than the cost of the machines purchased to host the nodes to mount - an attack), the real protection is the low probability of getting - both the first and last hop of a client's stream. - - -Practical Issues: - - Theoretical issues aside, there are several practical issues with the - implementation of Tor that need to be addressed to ensure that - identity information is not leaked by the implementation. - - Exit policy issues: - - If a client chooses an exit with a very restrictive exit policy - (such as an IP or IP range), the first hop then knows a good deal - about the destination. For this reason, clients should not select - exits that match their destination IP with anything other than "*". - - Partitioning: - - Partitioning attacks form another concern. Since Tor uses telescoping - to build circuits, it is possible to tell a user is constructing only - two hop paths at the entry node and on the local network. An external - adversary can potentially differentiate 2 and 3 hop users, and decide - that all IP addresses connecting to Tor and using 3 hops have something - to hide, and should be scrutinized more closely or outright apprehended. - - One solution to this is to use the "leaky-circuit" method of attaching - streams: The user always creates 3-hop circuits, but if the option - is enabled, they always exit from their 2nd hop.
The ideal solution - would be to create a RELAY_SHISHKABOB cell which contains onion - skins for every host along the path, but this requires protocol - changes at the nodes to support. - - Guard nodes: - - Since guard nodes can rotate due to client relocation, network - failure, node upgrades and other issues, if you amortize the risk a - mobile, dialup, or otherwise intermittently connected user is exposed to - over any reasonable duration of Tor usage (on the order of a year), it - is the same with or without guard nodes. Assuming an adversary has - c%/n% of network bandwidth, and guards rotate on average with period R, - statistically speaking, it's merely a question of if the user wishes - their risk to be concentrated with probability c/n over an expected - period of R*c, and probability 0 over an expected period of R*(n-c), - versus a continuous risk of (c/n)^2. So statistically speaking, guards - only create a time-tradeoff of risk over the long run for normal Tor - usage. Rotating guards do not reduce risk for normal client usage long - term.[3] - - On the other hand, assuming a more stable method of guard selection - and preservation is devised, or a more stable client side network than - my own is typical (which rotates guards frequently due to network issues - and moving about), guard nodes provide a tradeoff in the form of c/n% of - the users being "sacrificial users" who are exposed to high risk O(c/n) - of identification, while the rest of the network is exposed to zero - risk. - - The nature of Tor makes it likely an adversary will take a "shock and - awe" approach to suppressing Tor by rounding up a few users whose - browsing activity has been observed to be made into examples, in an - attempt to prove that Tor is not perfect. - - Since this "shock and awe" attack can be applied with or without guard - nodes, stable guard nodes do offer a measure of accountability of sorts.
- If a user was using a small set of guard nodes and knows them well, and - then is suddenly apprehended as a result of Tor usage, having a fixed - set of entry points to suspect is a lot better than suspecting the whole - network. Conversely, it can also give non-apprehended users comfort - that they are likely to remain safe indefinitely with their set of (now - presumably trusted) guards. This is probably the most beneficial - property of reliable guards: they deter the adversary from mounting - "shock and awe" attacks because the surviving users will not be - intimidated, but instead made more confident. Of course, guards need to - be made much more stable and users need to be encouraged to know their - guards for this property to really take effect. - - This beneficial property of client vigilance also carries over to an - active adversary, except in this case instead of relying on the user - to remember their guard nodes and somehow communicate them after - apprehension, the code can alert them to the presence of an active - adversary before they are apprehended. But only if they use guard nodes. - - So let's consider the active adversary: Two hop paths allow malicious - guards to get considerably more benefit from failing circuits if they do - not extend to their colluding peers for the exit hop. Since guards can - detect the number of hops in a path via either timing or by statistical - analysis of the exit policy of the 2nd hop, they can perform this attack - predominantly against 2 hop users. - - This can be addressed by completely abandoning an entry guard after a - certain ratio of extend or general circuit failures with respect to - non-failed circuits. The proper value for this ratio can be determined - experimentally with TorFlow. There is the possibility that the local - network can abuse this feature to cause certain guards to be dropped, - but they can do that anyways with the current Tor by just making guards - they don't like unreachable.
With this mechanism, Tor will complain - loudly if any guard failure rate exceeds the expected in any failure - case, local or remote. - - Eliminating guards entirely would actually not address this issue due - to the time-tradeoff nature of risk. In fact, it would just make it - worse. Without guard nodes, it becomes much more difficult for clients - to become alerted to Tor entry points that are failing circuits to make - sure that they only devote bandwidth to carry traffic for streams which - they observe both ends. Yet the rogue entry points are still able to - significantly increase their success rates by failing circuits. - - For this reason, guard nodes should remain enabled for 2 hop users, - at least until an IP-independent, undetectable guard scanner can - be created. TorFlow can scan for failing guards, but after a while, - its unique behavior gives away the fact that its IP is a scanner and - it can be given selective service. - - Consideration of risks for node operators: - - There is a serious risk for two hop users in the form of guard - profiling. If an adversary running an exit node notices that a - particular site is always visited from a fixed previous hop, it is - likely that this is a two hop user using a certain guard, which could be - monitored to determine their identity. Thus, for the protection of both - 2 hop users and node operators, 2 hop users should limit their guard - duration to a sufficient number of days to verify reliability of a node, - but not much more. This duration can be determined experimentally by - TorFlow. - - Considering a Tor client builds on average 144 circuits/day (10 - minutes per circuit), if the adversary owns c/n% of exits on the - network, they can expect to see 144*c/n circuits from this user, or - about 14 minutes of usage per day per percentage of network penetration. 
- Since it will take several occurrences of user-linkable exit content - from the same predecessor hop for the adversary to have any confidence - this is a 2 hop user, it is very unlikely that any sort of demands made - upon the predecessor node would be guaranteed to be effective (i.e. it - actually was a guard), let alone be executed in time to apprehend the - user before they rotated guards. - - The reverse risk also warrants consideration. If a malicious guard has - orders to surveil Mike Perry, it can determine Mike Perry is using two - hops by observing his tendency to choose a 2nd hop with a viable exit - policy. This can be done relatively quickly, unfortunately, and - indicates Mike Perry should spend some of his time building real 3 hop - circuits through the same guards, to require them to at least wait for - him to actually use Tor to determine his style of operation, rather than - collect this information from his passive building patterns. - - However, to actively determine where Mike Perry is going, the guard - will need to require logging ahead of time at multiple exit nodes that - he may use over the course of the few days while he is at that guard, - and correlate the usage times of the exit node with Mike Perry's - activity at that guard for the few days he uses it. At this point, the - adversary is mounting a scale and method of attack (widespread logging, - timing attacks) that works pretty much just as effectively against 3 - hops, so exit node operators are exposed to no more danger than - they otherwise normally are. - - -Why not fix Pathlen=2?: - - The main reason I am not advocating that we always use 2 hops is that - in some situations, timing correlation evidence by itself may not be - considered as solid and convincing as an actual, uninterrupted, fully - traced path. Are these timing attacks as effective on a real network as - they are in simulation? Maybe the circuit multiplexing of Tor can serve - to frustrate them to a degree? 
Would an extralegal adversary or - authoritarian government even care? In the face of these situation - dependent unknowns, it should be up to the user to decide if this is - a concern for them or not. - - It should probably also be noted that even a false positive - rate of 1% for a 200k concurrent-user network could mean that for a - given node, a given stream could be confused with something like 10 - users, assuming ~200 nodes carry most of the traffic (ie 1000 users - each). Though of course to really know for sure, someone needs to do - an attack on a real network, unfortunately. - - Additionally, at some point cover traffic schemes may be implemented to - frustrate timing attacks on the first hop. It is possible some expert - users may do this ad-hoc already, and may wish to continue using 3 hops - for this reason. - - -Implementation: - - new_route_len() can be modified directly with a check of the - Pathlen option. However, circuit construction logic should be - altered so that both 2 hop and 3 hop users build the same types of - circuits, and the option should ultimately govern circuit selection, - not construction. This improves coverage against guard nodes being - able to passively profile users who aren't even using Tor. - PathlenCoinWeight, anyone? :) - - The exit policy hack is a bit more tricky. compare_addr_to_addr_policy - needs to return an alternate ADDR_POLICY_ACCEPTED_WILDCARD or - ADDR_POLICY_ACCEPTED_SPECIFIC return value for use in - circuit_is_acceptable. - - The leaky exit is trickier still.. handle_control_attachstream - does allow paths to exit at a given hop. Presumably something similar - can be done in connection_ap_handshake_process_socks, and elsewhere? - Circuit construction would also have to be performed such that the - 2nd hop's exit policy is what is considered, not the 3rd's. 
- - The entry_guard_t structure could have num_circ_failed and - num_circ_succeeded members such that if it exceeds F% circuit - extend failure rate to a second hop, it is removed from the entry list. - - F should be sufficiently high to avoid churn from normal Tor circuit - failure as determined by TorFlow scans. - - The Vidalia option should be presented as a radio button. - - -Migration: - - Phase 1: Adjust exit policy checks if Pathlen is set, implement leaky - circuit ability, and 2-3 hop circuit selection logic governed by - Pathlen. - - Phase 2: Experiment to determine the proper ratio of circuit - failures used to expire garbage or malicious guards via TorFlow - (pending Bug #440 backport+adoption). - - Phase 3: Implement guard expiration code to kick off failure-prone - guards and warn the user. Cap 2 hop guard duration to a proper number - of days determined sufficient to establish guard reliability (to be - determined by TorFlow). - - Phase 4: Make radiobutton in Vidalia, along with help entry - that explains in layman's terms the risks involved. - - Phase 5: Allow user to specify path length by HTTP URL suffix. - - -[1] http://p2pnet.net/story/11279 -[2] http://www.cs.umass.edu/~mwright/papers/levine-timing.pdf -[3] Proof available upon request ;) diff --git a/doc/spec/proposals/116-two-hop-paths-from-guard.txt b/doc/spec/proposals/116-two-hop-paths-from-guard.txt deleted file mode 100644 index f45625350b..0000000000 --- a/doc/spec/proposals/116-two-hop-paths-from-guard.txt +++ /dev/null @@ -1,118 +0,0 @@ -Filename: 116-two-hop-paths-from-guard.txt -Title: Two hop paths from entry guards -Author: Michael Lieberman -Created: 26-Jun-2007 -Status: Dead - -This proposal is related to (but different from) Mike Perry's proposal 115 -"Two Hop Paths." - -Overview: - -Volunteers who run entry guards should have the option of using only 2 -additional tor nodes when constructing their own tor circuits. 
- -While the option of two hop paths should perhaps be extended to every client -(as discussed in Mike Perry's thread), I believe the anonymity properties of -two hop paths are particularly well-suited to client computers that are also -serving as entry guards. - -First I will describe the details of the strategy, as well as possible -avenues of attack. Then I will list advantages and disadvantages. Then, I -will discuss some possibly safer variations of the strategy, and finally -some implementation issues. - -Details: - -Suppose Alice is an entry guard, and wants to construct a two hop circuit. -Alice chooses a middle node at random (not using the entry guard strategy), -and gains anonymity by having her traffic look just like traffic from -someone else using her as an entry guard. - -Can Alice's middle node figure out that she is initiator of the traffic? I -can think of four possible approaches for distinguishing traffic from Alice -with traffic through Alice: - -1) Notice that communication from Alice comes too fast: Experimentation is -needed to determine if traffic from Alice can be distinguished from traffic -from a computer with a decent link to Alice. - -2) Monitor Alice's network traffic to discover the lack of incoming packets -at the appropriate times. If an adversary has this ability, then Alice -already has problems in the current system, because the adversary can run a -standard timing attack on Alice's traffic. - -3) Notice that traffic from Alice is unique in some way such that if Alice -was just one of 3 entry guards for this traffic, then the traffic should be -coming from two other entry guards as well. An example of "unique traffic" -could be always sending 117 packets every 3 minutes to an exit node that -exits to port 4661. However, if such patterns existed with sufficient -precision, then it seems to me that Tor already has a problem. 
(This "unique -traffic" may not be a problem if clients often end up choosing a single -entry guard because their other two are down. Does anyone know if this is -the case?) - -4) First, control the middle node *and* some other part of the traffic, -using standard attacks on a two hop circuit without entry nodes (my recent -paper on Browser-Based Attacks would work well for this -http://petworkshop.org/2007/papers/PET2007_preproc_Browser_based.pdf). With -control of the circuit, we can now cause "unique traffic" as in 3). -Alternatively, if we know something about Alice independently, and we can -see what websites are being visited, we might be able to guess that she is -the kind of person that would visit those websites. - -Anonymity Advantages: - --Alice never has the problem of choosing a malicious entry guard. In some -sense, Alice acts as her own entry guard. - -Anonymity Disadvantages: - --If Alice's traffic is identified as originating from herself (see above for -how hard that might be), then she has the anonymity of a 2 hop circuit -without entry guards. - -Additional advantages: - --A discussion of the latency advantages of two hop circuits is going on in -Mike Perry's thread already. --Also, we can advertise this change as "Run an entry guard and decrease your -own Tor latency." This incentive has the potential to add nodes to the -network, improving the network as a whole. - -Safer variations: - -To solve the "unique traffic" problem, Alice could use two hop paths only -1/3 of the time, and choose 2 other entry guards for the other 2/3 of the -time. All the advantages are now 1/3 as useful (possibly more, if the other -2 entry guards are not always up). - -To solve the problem that Alice's responses are too fast, Alice could delay -her responses (ideally based on some real data of response time when Alice -is used as an entry guard). 
This loses most of the speed advantages of the two -hop path, but if Alice is a fast entry guard, it doesn't lose everything. It -also still has the (arguable) anonymity advantage that Alice doesn't have to -worry about having a malicious entry guard. - -Implementation details: -For Alice to remain anonymous using this strategy, she has to actually be -acting as an entry guard for other nodes. This means the two hop option can -only be available to whatever high-performance threshold is currently set on -entry guards. Alice may need to somehow check her own current status as an -entry guard before choosing this two hop strategy. - -Another thing to consider: suppose Alice is also an exit node. If the -fraction of exit nodes in existence is too small, she may rarely or never be -chosen as an entry guard. It would be sad if we offered an incentive to run -an entry guard that didn't extend to exit nodes. I suppose clients of Exit -nodes could pull the same trick, and bypass using Tor altogether (zero hop -paths), though that has additional issues.* - -Mike Lieberman -MIT - -*Why we shouldn't recommend Exit nodes pull the same trick: -1) Exit nodes would suffer heavily from the problem of "unique traffic" -mentioned above. -2) It would give governments an incentive to confiscate exit nodes to see if -they are pulling this trick. diff --git a/doc/spec/proposals/117-ipv6-exits.txt b/doc/spec/proposals/117-ipv6-exits.txt deleted file mode 100644 index 00cd7cef10..0000000000 --- a/doc/spec/proposals/117-ipv6-exits.txt +++ /dev/null @@ -1,410 +0,0 @@ -Filename: 117-ipv6-exits.txt -Title: IPv6 exits -Author: coderman -Created: 10-Jul-2007 -Status: Accepted -Target: 0.2.1.x - -Overview - - Extend Tor for TCP exit via IPv6 transport and DNS resolution of IPv6 - addresses. This proposal does not imply any IPv6 support for OR - traffic, only exit and name resolution. - - -Contents - -0. 
Motivation - - As the IPv4 address space becomes more scarce there is increasing - effort to provide Internet services via the IPv6 protocol. Many - hosts are available at IPv6 endpoints which are currently - inaccessible for Tor users. - - Extending Tor to support IPv6 exit streams and IPv6 DNS name - resolution will allow users of the Tor network to access these hosts. - This capability would be present for those who do not currently have - IPv6 access, thus increasing the utility of Tor and furthering - adoption of IPv6. - - -1. Design - -1.1. General design overview - - There are three main components to this proposal. The first is a - method for routers to advertise their ability to exit IPv6 traffic. - The second is the manner in which routers resolve names to IPv6 - addresses. Last but not least is the method in which clients - communicate with Tor to resolve and connect to IPv6 endpoints - anonymously. - -1.2. Router IPv6 exit support - - In order to specify exit policies and IPv6 capability new directives - in the Tor configuration will be needed. If a router advertises IPv6 - exit policies in its descriptor this will signal the ability to - provide IPv6 exit. There are a number of additional default deny - rules associated with this new address space which are detailed in - the addendum. - - When Tor is started on a host it should check for the presence of a - global unicast IPv6 address and if present include the default IPv6 - exit policies and any user specified IPv6 exit policies. - - If a user provides IPv6 exit policies but no global unicast IPv6 - address is available Tor should generate a warning and not publish the - IPv6 policies in the router descriptor. - - It should be noted that IPv4 mapped IPv6 addresses are not valid exit - destinations. This mechanism is mainly used to interoperate with - both IPv4 and IPv6 clients on the same socket. 
Any attempts to use - an IPv4 mapped IPv6 address, perhaps to circumvent exit policy for - IPv4, must be refused. - -1.3. DNS name resolution of IPv6 addresses (AAAA records) - - In addition to exit support for IPv6 TCP connections, a method to - resolve domain names to their respective IPv6 addresses is also - needed. This is accomplished in the existing DNS system via AAAA - records. Routers will perform both A and AAAA requests when - resolving a name so that the client can utilize an IPv6 endpoint when - available or preferred. - - To avoid potential problems with caching DNS servers that behave - poorly all NXDOMAIN responses to AAAA requests should be ignored if a - successful response is received for an A request. This implies that - both AAAA and A requests will always be performed for each name - resolution. - - For reverse lookups on IPv6 addresses, like that used for - RESOLVE_PTR, Tor will perform the necessary PTR requests via - IP6.ARPA. - - All routers which perform DNS resolution on behalf of clients - (RELAY_RESOLVE) should perform and respond with both A and AAAA - resources. - - [NOTE: In a future version, when we extend the behavior of RESOLVE to - encapsulate more of real DNS, it will make sense to allow more - flexibility here. -nickm] - -1.4. Client interaction with IPv6 exit capability - -1.4.1. Usability goals - - There are a number of behaviors which Tor can provide when - interacting with clients that will improve the usability of IPv6 exit - capability. These behaviors are designed to make it simple for - clients to express a preference for IPv6 transport and utilize IPv6 - host services. - -1.4.2. SOCKSv5 IPv6 client behavior - - The SOCKS version 5 protocol supports IPv6 connections. When using - SOCKSv5 with hostnames it is difficult to determine if a client - wishes to use an IPv4 or IPv6 address to connect to the desired host - if it resolves to both address types. 
- - In order to make this more intuitive the SOCKSv5 protocol can be - supported on a local IPv6 endpoint, [::1] port 9050 for example. - When a client requests a connection to the desired host via an IPv6 - SOCKS connection Tor will prefer IPv6 addresses when resolving the - host name and connecting to the host. - - Likewise, RESOLVE and RESOLVE_PTR requests from an IPv6 SOCKS - connection will return IPv6 addresses when available, and fall back - to IPv4 addresses if not. - - [NOTE: This means that SocksListenAddress and DNSListenAddress should - support IPv6 addresses. Perhaps there should also be a general option - to have listeners that default to 127.0.0.1 and 0.0.0.0 listen - additionally or instead on ::1 and :: -nickm] - -1.4.3. MAPADDRESS behavior - - The MAPADDRESS capability supports clients that may not be able to - use the SOCKSv4a or SOCKSv5 hostname support to resolve names via - Tor. This ability should be extended to IPv6 addresses in SOCKSv5 as - well. - - When a client requests an address mapping from the wildcard IPv6 - address, [::0], the server will respond with a unique local IPv6 - address on success. It is important to note that there may be two - mappings for the same name if both an IPv4 and IPv6 address are - associated with the host. In this case a CONNECT to a mapped IPv6 - address should prefer IPv6 for the connection to the host, if - available, while CONNECT to a mapped IPv4 address will prefer IPv4. - - It should be noted that IPv6 does not provide the concept of a host - local subnet, like 127.0.0.0/8 in IPv4. For this reason integration - of Tor with IPv6 clients should consider a firewall or filter rule to - drop unique local addresses to or from the network when possible. - These packets should not be routed, however, keeping them off the - subnet entirely is worthwhile. - -1.4.3.1. 
Generating unique local IPv6 addresses - - The usual manner of generating a unique local IPv6 address is to - select a Global ID part randomly, along with a Subnet ID, and share - this prefix among the communicating parties who each have their own - distinct Interface ID. In this style a given Tor instance might - select a random Global and Subnet ID and provide MAPADDRESS - assignments with a random Interface ID as needed. This has the - potential to associate unique Global/Subnet identifiers with a given - Tor instance and may expose attacks against the anonymity of Tor - users. - - To avoid this potential problem entirely, MAPADDRESS must always - generate the Global, Subnet, and Interface IDs randomly for each - request. It is also highly suggested that explicitly specifying an - IPv6 source address instead of the wildcard address not be supported - to ensure that a good random address is used. - -1.4.4. DNSProxy IPv6 client behavior - - A new capability in recent Tor versions is the transparent DNS proxy. - This feature will need to return both A and AAAA resource records - when responding to client name resolution requests. - - The transparent DNS proxy should also support reverse lookups for - IPv6 addresses. It is suggested that any such requests to the - deprecated IP6.INT domain should be translated to IP6.ARPA instead. - This translation is not likely to be used and is of low priority. - - It would be nice to support DNS over IPv6 transport as well; however, - this is not likely to be used and is of low priority. - -1.4.5. TransPort IPv6 client behavior - - Tor also provides transparent TCP proxy support via the Trans* - directives in the configuration. The TransListenAddress directive - should accept an IPv6 address in addition to IPv4 so that IPv6 TCP - connections can be transparently proxied. - -1.5. Additional changes - - The RedirectExit option should be deprecated rather than extending - this feature to IPv6. - - -2. Spec changes - -2.1. 
Tor specification - - In '6.2. Opening streams and transferring data' the following should - be changed to indicate IPv6 exit capability: - - "No version of Tor currently generates the IPv6 format." - - In '6.4. Remote hostname lookup' the following should be updated to - reflect use of ip6.arpa in addition to in-addr.arpa. - - "For a reverse lookup, the OP sends a RELAY_RESOLVE cell containing an - in-addr.arpa address." - - In 'A.1. Differences between spec and implementation' the following - should be updated to indicate IPv6 exit capability: - - "The current codebase has no IPv6 support at all." - - [NOTE: the EXITPOLICY end-cell reason says that it can hold an ipv4 or an - ipv6 address, but doesn't say how. We may want a separate EXITPOLICY2 - type that can hold an ipv6 address, since the way we encode ipv6 - addresses elsewhere ("0.0.0.0 indicates that the next 16 bytes are ipv6") - is a bit dumb. -nickm] - [Actually, the length field lets us distinguish EXITPOLICY. -nickm] - -2.2. Directory specification - - In '2.1. Router descriptor format' a new set of directives is needed - for IPv6 exit policy. The existing accept/reject directives should - be clarified to indicate IPv4 or wildcard address relevance. The new - IPv6 directives will be in the form of: - - "accept6" exitpattern NL - "reject6" exitpattern NL - - The section describing accept6/reject6 should explain that the - presence of accept6 or reject6 exit policies in a router descriptor - signals the ability of that router to exit IPv6 traffic (according to - IPv6 exit policies). - - The "[::]/0" notation is used to represent "all IPv6 addresses". - "[::0]/0" may also be used for this representation. - - If a user specifies a 'reject6 [::]/0:*' policy in the Tor - configuration this will be interpreted as forcing no IPv6 exit - support and no accept6/reject6 policies will be included in the - published descriptor. 
This will prevent IPv6 exit even if the router host - has a global unicast IPv6 address present. - - It is important to note that a wildcard address in an accept or - reject policy applies to both IPv4 and IPv6 addresses. - -2.3. Control specification - - In '3.8. MAPADDRESS' the potential to have two addresses for a given - name should be explained. The method for generating unique local - addresses for IPv6 mappings needs explanation as described above. - - When IPv6 addresses are used in this document they should include the - brackets for consistency. For example, the null IPv6 address should - be written as "[::0]" and not "::0". The control commands will - expect the same syntax as well. - - In '3.9. GETINFO' the "address" command should return both public - IPv4 and IPv6 addresses if present. These addresses should be - separated via \r\n. - - -2.4. Tor SOCKS extensions - - In '2. Name lookup' a description of IPv6 address resolution is - needed for SOCKSv5 as described above. IPv6 addresses should be - supported in both the RESOLVE and RESOLVE_PTR extensions. - - A new section describing the ability to accept SOCKSv5 clients on a - local IPv6 address to indicate a preference for IPv6 transport as - described above is also needed. The behavior of Tor SOCKSv5 proxy - with an IPv6 preference should be explained, for example, preferring - IPv6 transport to a named host with both IPv4 and IPv6 addresses - available (A and AAAA records). - - -3. Questions and concerns - -3.1. DNS A6 records - - A6 is explicitly avoided in this document. There are potential - reasons for implementing this; however, the inherent complexity of - the protocol and resolvers makes this unappealing. Is there a - compelling reason to consider A6 as part of IPv6 exit support? - - [IMO not till anybody needs it. -nickm] - -3.2. IPv4 and IPv6 preference - - The design above tries to infer a preference for IPv4 or IPv6 - transport based on client interactions with Tor. 
It might be useful - to provide more explicit control over this preference. For example, - an IPv4 SOCKSv5 client may want to use IPv6 transport to named hosts - in CONNECT requests while the current implementation would assume an - IPv4 preference. Should more explicit control be available, through - either configuration directives or control commands? - - Many applications support a inet6-only or prefer-family type option - that provides the user manual control over address preference. This - could be provided as a Tor configuration option. - - An explicit preference is still possible by resolving names and then - CONNECTing to an IPv4 or IPv6 address as desired, however, not all - client applications may have this option available. - -3.3. Support for IPv6 only transparent proxy clients - - It may be useful to support IPv6 only transparent proxy clients using - IPv4 mapped IPv6 like addresses. This would require transparent DNS - proxy using IPv6 transport and the ability to map A record responses - into IPv4 mapped IPv6 like addresses in the manner described in the - "NAT-PT" RFC for a traditional Basic-NAT-PT with DNS-ALG. The - transparent TCP proxy would thus need to detect these mapped addresses - and connect to the desired IPv4 host. - - The IPv6 prefix used for this purpose must not be the actual IPv4 - mapped IPv6 address prefix, though the manner in which IPv4 addresses - are embedded in IPv6 addresses would be the same. - - The lack of any IPv6 only hosts which would use this transparent proxy - method makes this a lot of work for very little gain. Is there a - compelling reason to support this NAT-PT like capability? - -3.4. IPv6 DNS and older Tor routers - - It is expected that many routers will continue to run with older - versions of Tor when the IPv6 exit capability is released. Clients - who wish to use IPv6 will need to route RELAY_RESOLVE requests to the - newer routers which will respond with both A and AAAA resource - records when possible. 
- - One way to do this is to route RELAY_RESOLVE requests to routers with - IPv6 exit policies published, however, this would not utilize current - routers that can resolve IPv6 addresses even if they can't exit such - traffic. - - There was also concern expressed about the ability of existing clients - to cope with new RELAY_RESOLVE responses that contain IPv6 addresses. - If this breaks backward compatibility, a new request type may be - necessary, like RELAY_RESOLVE6, or some other mechanism of indicating - the ability to parse IPv6 responses when making the request. - -3.5. IPv4 and IPv6 bindings in MAPADDRESS - - It may be troublesome to try and support two distinct address mappings - for the same name in the existing MAPADDRESS implementation. If this - cannot be accommodated then the behavior should replace existing - mappings with the new address regardless of family. A warning when - this occurs would be useful to assist clients who encounter problems - when both an IPv4 and IPv6 application are using MAPADDRESS for the - same names concurrently, causing lost connections for one of them. - -4. Addendum - -4.1. Sample IPv6 default exit policy - - reject 0.0.0.0/8 - reject 169.254.0.0/16 - reject 127.0.0.0/8 - reject 192.168.0.0/16 - reject 10.0.0.0/8 - reject 172.16.0.0/12 - reject6 [0000::]/8 - reject6 [0100::]/8 - reject6 [0200::]/7 - reject6 [0400::]/6 - reject6 [0800::]/5 - reject6 [1000::]/4 - reject6 [4000::]/3 - reject6 [6000::]/3 - reject6 [8000::]/3 - reject6 [A000::]/3 - reject6 [C000::]/3 - reject6 [E000::]/4 - reject6 [F000::]/5 - reject6 [F800::]/6 - reject6 [FC00::]/7 - reject6 [FE00::]/9 - reject6 [FE80::]/10 - reject6 [FEC0::]/10 - reject6 [FF00::]/8 - reject *:25 - reject *:119 - reject *:135-139 - reject *:445 - reject *:1214 - reject *:4661-4666 - reject *:6346-6429 - reject *:6699 - reject *:6881-6999 - accept *:* - # accept6 [2000::]/3:* is implied - -4.2. 
Additional resources - - 'DNS Extensions to Support IP Version 6' - http://www.ietf.org/rfc/rfc3596.txt - - 'DNS Extensions to Support IPv6 Address Aggregation and Renumbering' - http://www.ietf.org/rfc/rfc2874.txt - - 'SOCKS Protocol Version 5' - http://www.ietf.org/rfc/rfc1928.txt - - 'Unique Local IPv6 Unicast Addresses' - http://www.ietf.org/rfc/rfc4193.txt - - 'INTERNET PROTOCOL VERSION 6 ADDRESS SPACE' - http://www.iana.org/assignments/ipv6-address-space - - 'Network Address Translation - Protocol Translation (NAT-PT)' - http://www.ietf.org/rfc/rfc2766.txt diff --git a/doc/spec/proposals/118-multiple-orports.txt b/doc/spec/proposals/118-multiple-orports.txt deleted file mode 100644 index 2381ec7ca3..0000000000 --- a/doc/spec/proposals/118-multiple-orports.txt +++ /dev/null @@ -1,84 +0,0 @@ -Filename: 118-multiple-orports.txt -Title: Advertising multiple ORPorts at once -Author: Nick Mathewson -Created: 09-Jul-2007 -Status: Accepted -Target: 0.2.1.x - -Overview: - - This document is a proposal for servers to advertise multiple - address/port combinations for their ORPort. - -Motivation: - - Sometimes servers want to support multiple ports for incoming - connections, either in order to support multiple address families, to - better use multiple interfaces, or to support a variety of - FascistFirewallPorts settings. This is easy to set up now, but - there's no way to advertise it to clients. - -New descriptor syntax: - - We add a new line in the router descriptor, "or-address". This line - can occur zero, one, or multiple times. Its format is: - - or-address SP ADDRESS ":" PORTLIST NL - - ADDRESS = IPV6ADDR / IPV4ADDR - IPV6ADDR = an ipv6 address, surrounded by square brackets. - IPV4ADDR = an ipv4 address, represented as a dotted quad. - PORTLIST = PORTSPEC | PORTSPEC "," PORTLIST - PORTSPEC = PORT | PORT "-" PORT - - [This is the regular format for specifying sets of addresses and - ports in Tor.] 
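As an illustration of the or-address syntax above, here is a minimal Python sketch of a parser for one such line. It is not part of the proposal or of Tor itself: the function name parse_or_address, the returned (address, list of port ranges) shape, and the 1-65535 port bound are assumptions made for this example.

```python
import ipaddress


def parse_or_address(line):
    """Parse one 'or-address' line into (address, [(low_port, high_port), ...])."""
    keyword, _, rest = line.strip().partition(" ")
    if keyword != "or-address" or not rest:
        raise ValueError("not an or-address line")

    # Split on the LAST ':' so colons inside a bracketed IPv6 address survive.
    addr_text, sep, portlist = rest.rpartition(":")
    if not sep or not portlist:
        raise ValueError("missing port list")

    if addr_text.startswith("[") and addr_text.endswith("]"):
        address = ipaddress.IPv6Address(addr_text[1:-1])   # IPV6ADDR
    else:
        address = ipaddress.IPv4Address(addr_text)         # IPV4ADDR

    ranges = []
    for spec in portlist.split(","):                        # PORTLIST
        low_text, dash, high_text = spec.partition("-")     # PORTSPEC
        low = int(low_text)
        high = int(high_text) if dash else low
        if not (1 <= low <= high <= 65535):
            raise ValueError("bad port spec: %r" % spec)
        ranges.append((low, high))
    return address, ranges
```

For instance, parse_or_address("or-address [2001:db8::1]:9001,9030-9040") yields the IPv6 address 2001:db8::1 with the ranges (9001, 9001) and (9030, 9040). A single port is stored as a one-element range so that callers only have to handle one shape.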
- -New OR behavior: - - We add two more options to supplement ORListenAddress: - ORPublishedListenAddress, and ORPublishAddressSet. The former - listens on an address-port combination and publishes it in addition - to the regular address. The latter advertises a set of address-port - combinations, but does not listen on them. [To use this option, the - server operator should set up port forwarding to the regular ORPort, - as for example with firewall rules.] - - Servers should extend their testing to include advertised addresses - and ports. No address or port should be advertised until it's been - tested. [This might get expensive in practice.] - -New authority behavior: - - Authorities should spot-test descriptors, and reject any where a - substantial part of the addresses can't be reached. - -New client behavior: - - When connecting to another server, clients SHOULD pick an - address-port combination at random as supported by their - reachableaddresses. If a client has a connection to a server at one - address, it SHOULD use that address for any simultaneous connections - to that server. Clients SHOULD use the canonical address for any - server when generating extend cells. - -Not addressed here: - - * There's no reason to listen on multiple dirports; current Tors - mostly don't connect directly to the dirport anyway. - - * It could be advantageous to list something about extra addresses in - the network-status document. This would, however, eat space there. - More analysis is needed, particularly in light of proposal 141 - ("Download server descriptors on demand") - -Dependencies: - - Testing for canonical connections needs to be implemented before it's - safe to use this proposal. - - -Notes 3 July: - - Write up the simple version of this. No ranges needed yet. No - networkstatus changes yet. 
- diff --git a/doc/spec/proposals/119-controlport-auth.txt b/doc/spec/proposals/119-controlport-auth.txt deleted file mode 100644 index 9ed1cc1cbe..0000000000 --- a/doc/spec/proposals/119-controlport-auth.txt +++ /dev/null @@ -1,140 +0,0 @@ -Filename: 119-controlport-auth.txt -Title: New PROTOCOLINFO command for controllers -Author: Roger Dingledine -Created: 14-Aug-2007 -Status: Closed -Implemented-In: 0.2.0.x - -Overview: - - Here we describe how to help controllers locate the cookie - authentication file when authenticating to Tor, so we can a) require - authentication by default for Tor controllers and b) still keep - things usable. Also, we propose an extensible, general-purpose mechanism - for controllers to learn about a Tor instance's protocol and - authentication requirements before authenticating. - -The Problem: - - When we first added the controller protocol, we wanted to make it - easy for people to play with it, so by default we didn't require any - authentication from controller programs. We allowed requests only from - localhost as a stopgap measure for security. - - Due to an increasing number of vulnerabilities based on this approach, - it's time to add authentication in default configurations. - - We have a number of goals: - - We want the default Vidalia bundles to transparently work. That - means we don't want the users to have to type in or know a password. - - We want to allow multiple controller applications to connect to the - control port. So if Vidalia is launching Tor, it can't just keep the - secrets to itself. - - Right now there are three authentication approaches supported - by the control protocol: NULL, CookieAuthentication, and - HashedControlPassword. See Sec 5.1 in control-spec.txt for details. - - There are a couple of challenges here. The first is: if the controller - launches Tor, how should we teach Tor what authentication approach - it should require, and the secret that goes along with it? 
Next is: - how should this work when the controller attaches to an existing Tor, - rather than launching Tor itself? - - Cookie authentication seems most amenable to letting multiple controller - applications interact with Tor. But that brings in yet another question: - how does the controller guess where to look for the cookie file, - without first knowing what DataDirectory Tor is using? - -Design: - - We should add a new controller command PROTOCOLINFO that can be sent - as a valid first command (the others being AUTHENTICATE and QUIT). If - PROTOCOLINFO is sent as the first command, the second command must be - either a successful AUTHENTICATE or a QUIT. - - If the initial command sequence is not valid, Tor closes the connection. - - -Spec: - - C: "PROTOCOLINFO" *(SP PIVERSION) CRLF - S: "250+PROTOCOLINFO" SP PIVERSION CRLF *InfoLine "250 OK" CRLF - - InfoLine = AuthLine / VersionLine / OtherLine - - AuthLine = "250-AUTH" SP "METHODS=" AuthMethod *(",")AuthMethod - *(SP "COOKIEFILE=" AuthCookieFile) CRLF - VersionLine = "250-VERSION" SP "Tor=" TorVersion [SP Arguments] CRLF - - AuthMethod = - "NULL" / ; No authentication is required - "HASHEDPASSWORD" / ; A controller must supply the original password - "COOKIE" / ; A controller must supply the contents of a cookie - - AuthCookieFile = QuotedString - TorVersion = QuotedString - - OtherLine = "250-" Keyword [SP Arguments] CRLF - - For example: - - C: PROTOCOLINFO CRLF - S: "250+PROTOCOLINFO 1" CRLF - S: "250-AUTH Methods=HASHEDPASSWORD,COOKIE COOKIEFILE="/tor/cookie"" CRLF - S: "250-VERSION Tor=0.2.0.5-alpha" CRLF - S: "250 OK" CRLF - - Tor MAY give its InfoLines in any order; controllers MUST ignore InfoLines - with keywords it does not recognize. Controllers MUST ignore extraneous - data on any InfoLine. - - PIVERSION is there in case we drastically change the syntax one day. For - now it should always be "1", for the controller protocol. 
Controllers MAY - provide a list of the protocol versions they support; Tor MAY select a - version that the controller does not support. - - Right now only two "topics" (AUTH and VERSION) are included, but more - may be included in the future. Controllers must accept lines with - unexpected topics. - - AuthCookieFile = QuotedString - - AuthMethod is used to specify one or more control authentication - methods that Tor currently accepts. - - AuthCookieFile specifies the absolute path and filename of the - authentication cookie that Tor is expecting and is provided iff - the METHODS field contains the method "COOKIE". Controllers MUST handle - escape sequences inside this string. - - The VERSION line contains the Tor version. - - [What else might we want to include that could be useful? -RD] - -Compatibility: - - Tor 0.1.2.16 and 0.2.0.4-alpha hang up after the first failed - command. Earlier Tors don't know about this command but don't hang - up. That means controllers will need a mechanism for distinguishing - whether they're talking to a Tor that speaks PROTOCOLINFO or not. - - I suggest that the controllers attempt a PROTOCOLINFO. Then: - - If it works, great. Authenticate as required. - - If they get hung up on, reconnect and do a NULL AUTHENTICATE. - - If it's unrecognized but they're not hung up on, do a NULL - AUTHENTICATE. - -Unsolved problems: - - If Torbutton wants to be a Tor controller one day... talking TCP is - bad enough, but reading from the filesystem is even harder. Is there - a way to let simple programs work with the controller port without - needing all the auth infrastructure? - - Once we put this approach in place, the next vulnerability we see will - involve an attacker somehow getting read access to the victim's files - --- and then we're back where we started. This means we still need - to think about how to demand password-based authentication without - bothering the user about it. 
- diff --git a/doc/spec/proposals/120-shutdown-descriptors.txt b/doc/spec/proposals/120-shutdown-descriptors.txt deleted file mode 100644 index 5cfe2b5bc6..0000000000 --- a/doc/spec/proposals/120-shutdown-descriptors.txt +++ /dev/null @@ -1,83 +0,0 @@ -Filename: 120-shutdown-descriptors.txt -Title: Shutdown descriptors when Tor servers stop -Author: Roger Dingledine -Created: 15-Aug-2007 -Status: Dead - -[Proposal dead as of 11 Jul 2008. The point of this proposal was to give -routers a good way to get out of the networkstatus early, but proposal -138 (already implemented) has achieved this.] - -Overview: - - Tor servers should publish a last descriptor whenever they shut down, - to let others know that they are no longer offering service. - -The Problem: - - The main reason for this is in reaction to Internet services that want - to treat connections from the Tor network differently. Right now, - if a user experiments with turning on the "relay" functionality, he - is punished by being locked out of some websites, some IRC networks, - etc --- and this lockout persists for several days even after he turns - the server off. - -Design: - - During the "slow shutdown" period if exiting, or shortly after the - user sets his ORPort back to 0 if not exiting, Tor should publish a - final descriptor with the following characteristics: - - 1) Exit policy is listed as "reject *:*" - 2) It includes a new entry called "opt shutdown 1" - - The first step is so current blacklists will no longer list this node - as exiting to whatever the service is. - - The second step is so directory authorities can avoid wasting time - doing reachability testing. Authorities should automatically not list - as Running any router whose latest descriptor says it shut down. - - [I originally had in mind a third step --- Advertised bandwidth capacity - is listed as "0" --- so current Tor clients will skip over this node - when building most circuits. 
But since clients won't fetch descriptors - from nodes not listed as Running, this step seems pointless. -RD] - -Spec: - - TBD but should be pretty straightforward. - -Security issues: - - Now external people can learn exactly when a node stopped offering - relay service. How bad is this? I can see a few minor attacks based - on this knowledge, but on the other hand as it is we don't really take - any steps to keep this information secret. - -Overhead issues: - - We are creating more descriptors that want to be remembered. However, - since the router won't be marked as Running, ordinary clients won't - fetch the shutdown descriptors. Caches will, though. I hope this is ok. - -Implementation: - - To make things easy, we should publish the shutdown descriptor only - on controlled shutdown (SIGINT as opposed to SIGTERM). That would - leave enough time for publishing that we probably wouldn't need any - extra synchronization code. - - If that turns out to be too unintuitive for users, I could imagine doing - it on SIGTERMs too, and just delaying exit until we had successfully - published to at least one authority, at which point we'd hope that it - propagated from there. - -Acknowledgements: - - tup suggested this idea. - -Comments: - - 2) Maybe add a rule "Don't do this for hibernation if we expect to wake - up before the next consensus is published"? 
- - NM 9 Oct 2007 diff --git a/doc/spec/proposals/121-hidden-service-authentication.txt b/doc/spec/proposals/121-hidden-service-authentication.txt deleted file mode 100644 index 0d92b53a8c..0000000000 --- a/doc/spec/proposals/121-hidden-service-authentication.txt +++ /dev/null @@ -1,776 +0,0 @@ -Filename: 121-hidden-service-authentication.txt -Title: Hidden Service Authentication -Author: Tobias Kamm, Thomas Lauterbach, Karsten Loesing, Ferdinand Rieger, - Christoph Weingarten -Created: 10-Sep-2007 -Status: Finished -Implemented-In: 0.2.1.x - -Change history: - - 26-Sep-2007 Initial proposal for or-dev - 08-Dec-2007 Incorporated comments by Nick posted to or-dev on 10-Oct-2007 - 15-Dec-2007 Rewrote complete proposal for better readability, modified - authentication protocol, merged in personal notes - 24-Dec-2007 Replaced misleading term "authentication" by "authorization" - and added some clarifications (comments by Sven Kaffille) - 28-Apr-2008 Updated most parts of the concrete authorization protocol - 04-Jul-2008 Add a simple algorithm to delay descriptor publication for - different clients of a hidden service - 19-Jul-2008 Added INTRODUCE1V cell type (1.2), improved replay - protection for INTRODUCE2 cells (1.3), described limitations - for auth protocols (1.6), improved hidden service protocol - without client authorization (2.1), added second, more - scalable authorization protocol (2.2), rewrote existing - authorization protocol (2.3); changes based on discussion - with Nick - 31-Jul-2008 Limit maximum descriptor size to 20 kilobytes to prevent - abuse. - 01-Aug-2008 Use first part of Diffie-Hellman handshake for replay - protection instead of rendezvous cookie. - 01-Aug-2008 Remove improved hidden service protocol without client - authorization (2.1). It might get implemented in proposal - 142. 
- -Overview: - - This proposal deals with a general infrastructure for performing - authorization (not necessarily implying authentication) of requests to - hidden services at three points: (1) when downloading and decrypting - parts of the hidden service descriptor, (2) at the introduction point, - and (3) at Bob's Tor client before contacting the rendezvous point. A - service provider will be able to restrict access to his service at these - three points to authorized clients only. Further, the proposal contains - specific authorization protocols as instances that implement the - presented authorization infrastructure. - - This proposal is based on v2 hidden service descriptors as described in - proposal 114 and introduced in version 0.2.0.10-alpha. - - The proposal is structured as follows: The next section motivates the - integration of authorization mechanisms in the hidden service protocol. - Then we describe a general infrastructure for authorization in hidden - services, followed by specific authorization protocols for this - infrastructure. At the end we discuss a number of attacks and non-attacks - as well as compatibility issues. - -Motivation: - - The major part of hidden services does not require client authorization - now and won't do so in the future. To the contrary, many clients would - not want to be (pseudonymously) identifiable by the service (though this - is unavoidable to some extent), but rather use the service - anonymously. These services are not addressed by this proposal. - - However, there may be certain services which are intended to be accessed - by a limited set of clients only. A possible application might be a - wiki or forum that should only be accessible for a closed user group. - Another, less intuitive example might be a real-time communication - service, where someone provides a presence and messaging service only to - his buddies. 
Finally, a possible application would be a personal home - server that should be remotely accessed by its owner. - - Performing authorization for a hidden service within the Tor network, as - proposed here, offers a range of advantages compared to allowing all - client connections in the first instance and deferring authorization to - the transported protocol: - - (1) Reduced traffic: Unauthorized requests would be rejected as early as - possible, thereby reducing the overall traffic in the network generated - by establishing circuits and sending cells. - - (2) Better protection of service location: Unauthorized clients could not - force Bob to create circuits to their rendezvous points, thus preventing - the attack described by Øverlier and Syverson in their paper "Locating - Hidden Servers" even without the need for guards. - - (3) Hiding activity: Apart from performing the actual authorization, a - service provider could also hide the mere presence of his service from - unauthorized clients when not providing hidden service descriptors to - them, rejecting unauthorized requests already at the introduction - point (ideally without leaking presence information at any of these - points), or not answering unauthorized introduction requests. - - (4) Better protection of introduction points: When providing hidden - service descriptors to authorized clients only and encrypting the - introduction points as described in proposal 114, the introduction points - would be unknown to unauthorized clients and thereby protected from DoS - attacks. - - (5) Protocol independence: Authorization could be performed for all - transported protocols, regardless of their own capabilities to do so. - - (6) Ease of administration: A service provider running multiple hidden - services would be able to configure access at a single place uniformly - instead of doing so for all services separately. 
- - (7) Optional QoS support: Bob could adapt his node selection algorithm - for building the circuit to Alice's rendezvous point depending on a - previously guaranteed QoS level, thus providing better latency or - bandwidth for selected clients. - - A disadvantage of performing authorization within the Tor network is - that a hidden service cannot make use of authorization data in - the transported protocol. Tor hidden services were designed to be - independent of the transported protocol. Therefore it's only possible to - either grant or deny access to the whole service, but not to specific - resources of the service. - - Authorization often implies authentication, i.e. proving one's identity. - However, when performing authorization within the Tor network, untrusted - points should not gain any useful information about the identities of - communicating parties, neither server nor client. A crucial challenge is - to remain anonymous towards directory servers and introduction points. - However, trying to hide identity from the hidden service is a futile - task, because a client would never know if he is the only authorized - client and therefore perfectly identifiable. Therefore, hiding client - identity from the hidden service is not an aim of this proposal. - - The current implementation of hidden services does not provide any kind - of authorization. The hidden service descriptor version 2, introduced by - proposal 114, was designed to use a descriptor cookie for downloading and - decrypting parts of the descriptor content, but this feature is not yet - in use. Further, most relevant cell formats specified in rend-spec - contain fields for authorization data, but those fields are neither - implemented nor do they suffice entirely. - -Details: - - 1. 
General infrastructure for authorization to hidden services - - We spotted three possible authorization points in the hidden service - protocol: - - (1) when downloading and decrypting parts of the hidden service - descriptor, - (2) at the introduction point, and - (3) at Bob's Tor client before contacting the rendezvous point. - - The general idea of this proposal is to allow service providers to - restrict access to some or all of these points to authorized clients - only. - - 1.1. Client authorization at directory - - Since the implementation of proposal 114 it is possible to combine a - hidden service descriptor with a so-called descriptor cookie. If done so, - the descriptor cookie becomes part of the descriptor ID, thus having an - effect on the storage location of the descriptor. Someone who has learned - about a service, but is not aware of the descriptor cookie, won't be able - to determine the descriptor ID and download the current hidden service - descriptor; he won't even know whether the service has uploaded a - descriptor recently. Descriptor IDs are calculated as follows (see - section 1.2 of rend-spec for the complete specification of v2 hidden - service descriptors): - - descriptor-id = - H(service-id | H(time-period | descriptor-cookie | replica)) - - Currently, service-id is equivalent to permanent-id which is calculated - as in the following formula. But in principle it could be any public - key. - - permanent-id = H(permanent-key)[:10] - - The second purpose of the descriptor cookie is to encrypt the list of - introduction points, including optional authorization data. Hence, the - hidden service directories won't learn any introduction information from - storing a hidden service descriptor. This feature is implemented but - unused at the moment. So this proposal will harness the advantages - of proposal 114. - - The descriptor cookie can be used for authorization by keeping it secret - from everyone but authorized clients. 
A service could then decide whether - to publish hidden service descriptors using that descriptor cookie later - on. An authorized client being aware of the descriptor cookie would be - able to download and decrypt the hidden service descriptor. - - The number of concurrently used descriptor cookies for one hidden service - is not restricted. A service could use a single descriptor cookie for all - users, a distinct cookie per user, or something in between, like one - cookie per group of users. It is up to the specific protocol and how it - is applied by a service provider. - - Two or more hidden service descriptors for different groups or users - should not be uploaded at the same time. A directory node could conclude - easily that the descriptors were issued by the same hidden service, thus - being able to link the two groups or users. Therefore, descriptors for - different users or clients that ought to be stored on the same directory - are delayed, so that only one descriptor is uploaded to a directory at a - time. The remaining descriptors are uploaded with a delay of up to - 30 seconds. - Further, descriptors for different groups or users that are to be stored - on different directories are delayed for a random time of up to 30 - seconds to hide relations from colluding directories. Certainly, this - does not prevent linking entirely, but it makes it somewhat harder. - There is a conflict between hiding links between clients and making a - service available in a timely manner. - - Although this part of the proposal is meant to describe a general - infrastructure for authorization, changing the way of using the - descriptor cookie to look up hidden service descriptors, e.g. applying - some sort of asymmetric crypto system, would require in-depth changes - that would be incompatible to v2 hidden service descriptors. On the - contrary, using another key for en-/decrypting the introduction point - part of a hidden service descriptor, e.g. 
a different symmetric key or - asymmetric encryption, would be easy to implement and compatible to v2 - hidden service descriptors as understood by hidden service directories - (clients and services would have to be upgraded anyway for using the new - features). - - An adversary could try to abuse the fact that introduction points can be - encrypted by storing arbitrary, unrelated data in the hidden service - directory. This abuse can be limited by setting a hard descriptor size - limit, forcing the adversary to split data into multiple chunks. There - are some limitations that make splitting data across multiple descriptors - unattractive: 1) The adversary would not be able to choose descriptor IDs - freely and would therefore have to implement his own indexing - structure. 2) Validity of descriptors is limited to at most 24 hours - after which descriptors need to be republished. - - The regular descriptor size in bytes is 745 + num_ipos * 837 + auth_data. - A large descriptor with 7 introduction points and 5 kilobytes of - authorization data would be 11724 bytes in size. The upper size limit of - descriptors should be set to 20 kilobytes, which limits the effect of - abuse while retaining enough flexibility in designing authorization - protocols. - - 1.2. Client authorization at introduction point - - The next possible authorization point after downloading and decrypting - a hidden service descriptor is the introduction point. It may be important - for authorization, because it bears the last chance of hiding presence - of a hidden service from unauthorized clients. Further, performing - authorization at the introduction point might reduce traffic in the - network, because unauthorized requests would not be passed to the - hidden service. 
This applies to those clients who are aware of a - descriptor cookie and thereby of the hidden service descriptor, but do - not have authorization data to pass the introduction point or access the - service (such a situation might occur when authorization data for - authorization at the directory is not issued on a per-user basis, but - authorization data for authorization at the introduction point is). - - It is important to note that the introduction point must be considered - untrustworthy, and therefore cannot replace authorization at the hidden - service itself. Nor should the introduction point learn any sensitive - identifiable information from either the service or the client. - - In order to perform authorization at the introduction point, three - message formats need to be modified: (1) v2 hidden service descriptors, - (2) ESTABLISH_INTRO cells, and (3) INTRODUCE1 cells. - - A v2 hidden service descriptor needs to contain authorization data that - is introduction-point-specific and sometimes also authorization data - that is introduction-point-independent. Therefore, v2 hidden service - descriptors as specified in section 1.2 of rend-spec already contain two - reserved fields "intro-authorization" and "service-authorization" - (originally, the names of these fields were "...-authentication") - containing an authorization type number and arbitrary authorization - data. We propose that authorization data consists of base64 encoded - objects of arbitrary length, surrounded by "-----BEGIN MESSAGE-----" and - "-----END MESSAGE-----". This will increase the size of hidden service - descriptors, but this is allowed since there is no strict upper limit. - - The current ESTABLISH_INTRO cells as described in section 1.3 of - rend-spec do not contain either authorization data or version - information. 
Therefore, we propose a new version 1 of the ESTABLISH_INTRO - cells adding these two issues as follows: - - V Format byte: set to 255 [1 octet] - V Version byte: set to 1 [1 octet] - KL Key length [2 octets] - PK Bob's public key [KL octets] - HS Hash of session info [20 octets] - AUTHT The auth type that is supported [1 octet] - AUTHL Length of auth data [2 octets] - AUTHD Auth data [variable] - SIG Signature of above information [variable] - - From the format it is possible to determine the maximum allowed size for - authorization data: given the fact that cells are 512 octets long, of - which 498 octets are usable (see section 6.1 of tor-spec), and assuming - 1024 bit = 128 octet long keys, there are 215 octets left for - authorization data. Hence, authorization protocols are bound to use no - more than these 215 octets, regardless of the number of clients that - shall be authenticated at the introduction point. Otherwise, one would - need to send multiple ESTABLISH_INTRO cells or split them up, which we do - not specify here. - - In order to understand a v1 ESTABLISH_INTRO cell, the implementation of - a relay must have a certain Tor version. Hidden services need to be able - to distinguish relays being capable of understanding the new v1 cell - formats and perform authorization. We propose to use the version number - that is contained in networkstatus documents to find capable - introduction points. - - The current INTRODUCE1 cell as described in section 1.8 of rend-spec is - not designed to carry authorization data and has no version number, too. - Unfortunately, unversioned INTRODUCE1 cells consist only of a fixed-size, - seemingly random PK_ID, followed by the encrypted INTRODUCE2 cell. This - makes it impossible to distinguish unversioned INTRODUCE1 cells from any - later format. In particular, it is not possible to introduce some kind of - format and version byte for newer versions of this cell. 
That's probably - where the comment "[XXX011 want to put intro-level auth info here, but no - version. crap. -RD]" that was part of rend-spec some time ago comes from. - - We propose that new versioned INTRODUCE1 cells use the new cell type 41 - RELAY_INTRODUCE1V (where V stands for versioned): - - Cleartext - V Version byte: set to 1 [1 octet] - PK_ID Identifier for Bob's PK [20 octets] - AUTHT The auth type that is included [1 octet] - AUTHL Length of auth data [2 octets] - AUTHD Auth data [variable] - Encrypted to Bob's PK: - (RELAY_INTRODUCE2 cell) - - The maximum length of contained authorization data depends on the length - of the contained INTRODUCE2 cell. A calculation follows below when - describing the INTRODUCE2 cell format we propose to use. - - 1.3. Client authorization at hidden service - - The time when a hidden service receives an INTRODUCE2 cell constitutes - the last possible authorization point during the hidden service - protocol. Performing authorization here is easier than at the other two - authorization points, because there are no possibly untrusted entities - involved. - - In general, a client that is successfully authorized at the introduction - point should be granted access at the hidden service, too. Otherwise, the - client would receive a positive INTRODUCE_ACK cell from the introduction - point and conclude that it may connect to the service, but the request - will be dropped without notice. This would appear as a failure to - clients. Therefore, the number of cases in which a client successfully - passes the introduction point but fails at the hidden service should be - zero. However, this does not lead to the conclusion that the - authorization data used at the introduction point and the hidden service - must be the same, but only that both authorization data should lead to - the same authorization result. 
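The size figures quoted in sections 1.1 and 1.2 can be re-derived with a few lines of arithmetic. The sketch below is only a check of the numbers in the text, under two stated assumptions: a kilobyte is taken as 1024 bytes, and the SIG field of a v1 ESTABLISH_INTRO cell is taken to be 128 octets (a signature made with a 1024-bit key), which the proposal lists only as "variable". All function and constant names are hypothetical.

```python
USABLE_CELL_OCTETS = 498   # usable octets per cell, as stated in the text
KEY_OCTETS = 128           # 1024-bit public key
SIG_OCTETS = 128           # assumption: signature made with a 1024-bit key
MAX_DESCRIPTOR = 20 * 1024 # proposed 20 kilobyte limit (1 KB = 1024 assumed)


def descriptor_size(num_ipos, auth_data):
    """Regular descriptor size in bytes: 745 + num_ipos * 837 + auth_data."""
    return 745 + num_ipos * 837 + auth_data


def establish_intro_auth_room():
    """Octets left for AUTHD in a v1 ESTABLISH_INTRO cell."""
    # Format byte, version byte, KL, PK, HS, AUTHT, AUTHL
    fixed = 1 + 1 + 2 + KEY_OCTETS + 20 + 1 + 2
    return USABLE_CELL_OCTETS - fixed - SIG_OCTETS


def introduce1v_header():
    """Cleartext octets of INTRODUCE1V before AUTHD: V, PK_ID, AUTHT, AUTHL."""
    return 1 + 20 + 1 + 2


if __name__ == "__main__":
    print(descriptor_size(7, 5 * 1024))   # large descriptor from section 1.1
    print(establish_intro_auth_room())    # room for auth data in section 1.2
    print(introduce1v_header())
```

Under these assumptions the large example descriptor stays well below the proposed limit, and the ESTABLISH_INTRO budget matches the 215 octets given in the text.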
- - Authorization data is transmitted from client to server via an - INTRODUCE2 cell that is forwarded by the introduction point. There are - versions 0 to 2 specified in section 1.8 of rend-spec, but none of these - contain fields for carrying authorization data. We propose a slightly - modified version of v3 INTRODUCE2 cells that is specified in section - 1.8.1 and which is not implemented as of December 2007. In contrast to - the specified v3 we avoid specifying (and implementing) IPv6 capabilities, - because Tor relays will be required to support IPv4 addresses for a long - time in the future, so that this seems unnecessary at the moment. The - proposed format of v3 INTRODUCE2 cells is as follows: - - VER Version byte: set to 3. [1 octet] - AUTHT The auth type that is used [1 octet] - AUTHL Length of auth data [2 octets] - AUTHD Auth data [variable] - TS Timestamp (seconds since 1-1-1970) [4 octets] - IP Rendezvous point's address [4 octets] - PORT Rendezvous point's OR port [2 octets] - ID Rendezvous point identity ID [20 octets] - KLEN Length of onion key [2 octets] - KEY Rendezvous point onion key [KLEN octets] - RC Rendezvous cookie [20 octets] - g^x Diffie-Hellman data, part 1 [128 octets] - - The maximum possible length of authorization data is related to the - enclosing INTRODUCE1V cell. A v3 INTRODUCE2 cell with - 1024 bit = 128 octets long public key without any authorization data - occupies 306 octets (AUTHL is only used when AUTHT has a value != 0), - plus 58 octets for hybrid public key encryption (see - section 5.1 of tor-spec on hybrid encryption of CREATE cells). The - surrounding INTRODUCE1V cell requires 24 octets. This leaves only 110 - of the 498 available octets free, which must be shared between - authorization data to the introduction point _and_ to the hidden - service. - - When receiving a v3 INTRODUCE2 cell, Bob checks whether a client has - provided valid authorization data to him. 
He also requires that the - timestamp is no more than 30 minutes in the past or future and that the - first part of the Diffie-Hellman handshake has not been used in the past - 60 minutes to prevent replay attacks by rogue introduction points. (The - reason for not using the rendezvous cookie to detect replays---even - though it is only sent once in the current design---is that it might be - desirable to re-use rendezvous cookies for multiple introduction requests - in the future.) If all checks pass, Bob builds a circuit to the provided - rendezvous point. Otherwise he drops the cell. - - 1.4. Summary of authorization data fields - - In summary, the proposed descriptor format and cell formats provide the - following fields for carrying authorization data: - - (1) The v2 hidden service descriptor contains: - - a descriptor cookie that is used for the lookup process, and - - an arbitrary encryption schema to ensure authorization to access - introduction information (currently symmetric encryption with the - descriptor cookie). - - (2) For performing authorization at the introduction point we can use: - - the fields intro-authorization and service-authorization in - hidden service descriptors, - - a maximum of 215 octets in the ESTABLISH_INTRO cell, and - - one part of 110 octets in the INTRODUCE1V cell. - - (3) For performing authorization at the hidden service we can use: - - the fields intro-authorization and service-authorization in - hidden service descriptors, - - the other part of 110 octets in the INTRODUCE2 cell. - - It will also still be possible to access a hidden service without any - authorization or only use a part of the authorization infrastructure. - However, this requires to consider all parts of the infrastructure. 
For - example, authorization at the introduction point relying on confidential - intro-authorization data transported in the hidden service descriptor - cannot be performed without using an encryption schema for introduction - information. - - 1.5. Managing authorization data at servers and clients - - In order to provide authorization data at the hidden service and the - authenticated clients, we propose to use files---either the Tor - configuration file or separate files. The exact format of these special - files depends on the authorization protocol used. - - Currently, rend-spec contains the proposition to encode client-side - authorization data in the URL, like in x.y.z.onion. This was never used - and is also a bad idea, because in case of HTTP the requested URL may be - contained in the Host and Referer fields. - - 1.6. Limitations for authorization protocols - - There are two limitations of the current hidden service protocol for - authorization protocols that shall be identified here. - - 1. The three cell types ESTABLISH_INTRO, INTRODUCE1V, and INTRODUCE2 - restrict the amount of data that can be used for authorization. - This forces authorization protocols that require per-user - authorization data at the introduction point to restrict the number - of authorized clients artificially. A possible solution could be to - split contents among multiple cells and reassemble them at the - introduction points. - - 2. The current hidden service protocol does not specify cell types to - perform interactive authorization between client and introduction - point or hidden service. If there should be an authorization - protocol that requires interaction, new cell types would have to be - defined and integrated into the hidden service protocol. - - - 2. Specific authorization protocol instances - - In the following we present two specific authorization protocols that - make use of (parts of) the new authorization infrastructure: - - 1.
The first protocol allows a service provider to restrict access - to clients with a previously received secret key only, but does not - attempt to hide service activity from others. - - 2. The second protocol, although feasible only for a limited set of about - 16 clients, performs client authorization and hides service activity - from everyone but the authorized clients. - - These two protocol instances extend the existing hidden service protocol - version 2. Hidden services that perform client authorization may run in - parallel to other services running versions 0, 2, or both. - - 2.1. Service with large-scale client authorization - - The first client authorization protocol aims at performing access control - while consuming as few additional resources as possible. A service - provider should be able to permit access to a large number of clients - while denying access for everyone else. However, the price for - scalability is that the service won't be able to hide its activity from - unauthorized or formerly authorized clients. - - The main idea of this protocol is to encrypt the introduction-point part - in hidden service descriptors to authorized clients using symmetric keys. - This ensures that nobody else but authorized clients can learn which - introduction points a service currently uses, nor can someone send a - valid INTRODUCE1 message without knowing the introduction key. Therefore, - a subsequent authorization at the introduction point is not required. - - A service provider generates symmetric "descriptor cookies" for his - clients and distributes them outside of Tor. The suggested key size is - 128 bits, so that descriptor cookies can be encoded in 22 base64 chars - (which can hold up to 22 * 6 = 132 bits, leaving 4 bits to encode the - authorization type (here: "0") and allow a client to distinguish this - authorization protocol from others like the one proposed below).
- Typically, the contact information for a hidden service using this - authorization protocol looks like this: - - v2cbb2l4lsnpio4q.onion Ll3X7Xgz9eHGKCCnlFH0uz - - When generating a hidden service descriptor, the service encrypts the - introduction-point part with a single randomly generated symmetric - 128-bit session key using AES-CTR as described for v2 hidden service - descriptors in rend-spec. Afterwards, the service encrypts the session - key to all descriptor cookies using AES. Each authorized client should be - able to efficiently find the session key that is encrypted for him/her, - so 4-octet client IDs are derived from the descriptor cookie - and the initialization vector. Descriptors always contain a number of - encrypted session keys that is a multiple of 16 by adding fake entries. - Encrypted session keys are ordered by client IDs in order to conceal - addition or removal of authorized clients by the service provider. - - ATYPE Authorization type: set to 1. [1 octet] - ALEN Number of clients := 1 + ((clients - 1) div 16) [1 octet] - for each symmetric descriptor cookie: - ID Client ID: H(descriptor cookie | IV)[:4] [4 octets] - SKEY Session key encrypted with descriptor cookie [16 octets] - (end of client-specific part) - RND Random data [(15 - ((clients - 1) mod 16)) * 20 octets] - IV AES initialization vector [16 octets] - IPOS Intro points, encrypted with session key [remaining octets] - - An authorized client needs to configure Tor to use the descriptor cookie - when accessing the hidden service. Therefore, a user adds the contact - information that she received from the service provider to her torrc - file. Upon downloading a hidden service descriptor, Tor finds the - encrypted introduction-point part and attempts to decrypt it using the - configured descriptor cookie. (In the rare event of two or more client - IDs being equal, a client tries to decrypt all of them.) 
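The field sizes in the layout above follow directly from the number of authorized clients. The following is a minimal sketch only: the helper names are hypothetical, and H is assumed to be SHA-1, which this section does not state explicitly.

```python
import hashlib

# One client entry is ID (4 octets) + SKEY (16 octets).
ENTRY_LEN = 4 + 16


def alen(clients: int) -> int:
    """ALEN field: number of 16-entry blocks needed for `clients` cookies."""
    return 1 + ((clients - 1) // 16)


def rnd_len(clients: int) -> int:
    """Length in octets of the RND padding that fills up the last block."""
    return (15 - ((clients - 1) % 16)) * ENTRY_LEN


def client_part_len(clients: int) -> int:
    """Real entries plus padding; always a multiple of 16 entries."""
    return clients * ENTRY_LEN + rnd_len(clients)


def client_id(descriptor_cookie: bytes, iv: bytes) -> bytes:
    """ID = H(descriptor cookie | IV)[:4]. SHA-1 as H is an assumption."""
    return hashlib.sha1(descriptor_cookie + iv).digest()[:4]
```

With these definitions, the client-specific part always occupies `alen(clients) * 16 * 20` octets, so the descriptor size reveals only the number of 16-client blocks rather than the exact client count.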
- - Upon sending the introduction, the client includes her descriptor cookie - as auth type "1" in the INTRODUCE2 cell that she sends to the service. - The hidden service checks whether the included descriptor cookie is - authorized to access the service and either responds to the introduction - request, or not. - - 2.2. Authorization for limited number of clients - - A second, more sophisticated client authorization protocol goes the extra - mile of hiding service activity from unauthorized clients. With all else - being equal to the preceding authorization protocol, the second protocol - publishes hidden service descriptors for each user separately and gets - along with encrypting the introduction-point part of descriptors to a - single client. This allows the service to stop publishing descriptors for - removed clients. As long as a removed client cannot link descriptors - issued for other clients to the service, it cannot derive service - activity any more. The downside of this approach is limited scalability. - Even though the distributed storage of descriptors (cf. proposal 114) - tackles the problem of limited scalability to a certain extent, this - protocol should not be used for services with more than 16 clients. (In - fact, Tor should refuse to advertise services for more than this number - of clients.) - - A hidden service generates an asymmetric "client key" and a symmetric - "descriptor cookie" for each client. The client key is used as - replacement for the service's permanent key, so that the service uses a - different identity for each of his clients. The descriptor cookie is used - to store descriptors at changing directory nodes that are unpredictable - for anyone but service and client, to encrypt the introduction-point - part, and to be included in INTRODUCE2 cells. Once the service has - created client key and descriptor cookie, he tells them to the client - outside of Tor. 
The contact information string looks similar to the one - used by the preceding authorization protocol (with the only difference - that it has "1" encoded as auth-type in the remaining 4 of 132 bits - instead of "0" as before). - - When creating a hidden service descriptor for an authorized client, the - hidden service uses the client key and descriptor cookie to compute - secret ID part and descriptor ID: - - secret-id-part = H(time-period | descriptor-cookie | replica) - - descriptor-id = H(client-key[:10] | secret-id-part) - - The hidden service also replaces permanent-key in the descriptor with - client-key and encrypts introduction-points with the descriptor cookie. - - ATYPE Authorization type: set to 2. [1 octet] - IV AES initialization vector [16 octets] - IPOS Intro points, encr. with descriptor cookie [remaining octets] - - When uploading descriptors, the hidden service needs to make sure that - descriptors for different clients are not uploaded at the same time (cf. - Section 1.1) which is also a limiting factor for the number of clients. - - When a client is requested to establish a connection to a hidden service - it looks up whether it has any authorization data configured for that - service. If the user has configured authorization data for authorization - protocol "2", the descriptor ID is determined as described in the last - paragraph. Upon receiving a descriptor, the client decrypts the - introduction-point part using its descriptor cookie. Further, the client - includes its descriptor cookie as auth-type "2" in INTRODUCE2 cells that - it sends to the service. - - 2.3. Hidden service configuration - - A hidden service that is meant to perform client authorization adds a - new option HiddenServiceAuthorizeClient to its hidden service - configuration. 
This option contains the authorization type which is - either "1" for the protocol described in 2.1 or "2" for the protocol in - 2.2 and a comma-separated list of human-readable client names, so that - Tor can create authorization data for these clients: - - HiddenServiceAuthorizeClient auth-type client-name,client-name,... - - If this option is configured, HiddenServiceVersion is automatically - reconfigured to contain only version numbers of 2 or higher. - - Tor stores all generated authorization data for the authorization - protocols described in Sections 2.1 and 2.2 in a new file using the - following file format: - - "client-name" human-readable client identifier NL - "descriptor-cookie" 128-bit key ^= 22 base64 chars NL - - If the authorization protocol of Section 2.2 is used, Tor also generates - and stores the following data: - - "client-key" NL a public key in PEM format - - 2.4. Client configuration - - Clients need to make their authorization data known to Tor using another - configuration option that contains a service name (mainly for the sake of - convenience), the service address, and the descriptor cookie that is - required to access a hidden service (the authorization protocol number is - encoded in the descriptor cookie): - - HidServAuth service-name service-address descriptor-cookie - -Security implications: - - In the following we want to discuss possible attacks by dishonest - entities in the presented infrastructure and specific protocol. These - security implications would have to be verified once more when adding - another protocol. The dishonest entities (theoretically) include the - hidden service itself, the authenticated clients, hidden service directory - nodes, introduction points, and rendezvous points. The relays that are - part of circuits used during protocol execution, but never learn about - the exchanged descriptors or cells by design, are not considered. - Obviously, this list makes no claim to be complete. 
The discussed attacks - are sorted by the difficulty to perform them, in ascending order, - starting with roles that everyone could attempt to take and ending with - partially trusted entities abusing the trust put in them. - - (1) A hidden service directory could attempt to conclude presence of a - service from the existence of a locally stored hidden service descriptor: - This passive attack is possible only for a single client-service - relation, because descriptors need to contain a publicly visible - signature of the service using the client key. - A possible protection would be to increase the number of hidden service - directories in the network. - - (2) A hidden service directory could try to break the descriptor cookies - of locally stored descriptors: This attack can be performed offline. The - only useful countermeasure against it might be using safe passwords that - are generated by Tor. - -[passwords? where did those come in? -RD] - - (3) An introduction point could try to identify the pseudonym of the - hidden service on behalf of which it operates: This is impossible by - design, because the service uses a fresh public key for every - establishment of an introduction point (see proposal 114) and the - introduction point receives a fresh introduction cookie, so that there is - no identifiable information about the service that the introduction point - could learn. The introduction point cannot even tell if client accesses - belong to the same client or not, nor can it know the total number of - authorized clients. The only information might be the pattern of - anonymous client accesses, but that is hardly enough to reliably identify - a specific service. - - (4) An introduction point could want to learn the identities of accessing - clients: This is also impossible by design, because all clients use the - same introduction cookie for authorization at the introduction point. 
- - (5) An introduction point could try to replay a correct INTRODUCE1 cell - to other introduction points of the same service, e.g. in order to force - the service to create a huge number of useless circuits: This attack is - not possible by design, because INTRODUCE1 cells are encrypted using a - freshly created introduction key that is only known to authorized - clients. - - (6) An introduction point could attempt to replay a correct INTRODUCE2 - cell to the hidden service, e.g. for the same reason as in the last - attack: This attack is stopped by the fact that a service will drop - INTRODUCE2 cells containing a DH handshake they have seen recently. - - (7) An introduction point could block client requests by sending either - positive or negative INTRODUCE_ACK cells back to the client, but without - forwarding INTRODUCE2 cells to the server: This attack is an annoyance - for clients, because they might wait for a timeout to elapse until trying - another introduction point. However, this attack is not introduced by - performing authorization and it cannot be targeted towards a specific - client. A countermeasure might be for the server to periodically perform - introduction requests to his own service to see if introduction points - are working correctly. - - (8) The rendezvous point could attempt to identify either server or - client: This remains impossible as it was before, because the - rendezvous cookie does not contain any identifiable information. - - (9) An authenticated client could swamp the server with valid INTRODUCE1 - and INTRODUCE2 cells, e.g. in order to force the service to create - useless circuits to rendezvous points; as opposed to an introduction - point replaying the same INTRODUCE2 cell, a client could include a new - rendezvous cookie for every request: The countermeasure for this attack - is the restriction to 10 connection establishments per client per hour. 
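The limit named in (9) could be enforced with a per-client counter. The sketch below is an illustration only: the proposal states the limit of 10 per hour but not how it is tracked, so the sliding window, the class name, and the use of the client name as key are all assumptions.

```python
import time
from collections import defaultdict, deque

MAX_PER_HOUR = 10      # limit stated in the proposal
WINDOW_SECONDS = 3600  # one hour; a sliding window is an assumption


class IntroRateLimiter:
    """Hypothetical per-client limiter for accepted INTRODUCE2 requests."""

    def __init__(self):
        # client name -> timestamps of recently accepted requests
        self._accepted = defaultdict(deque)

    def allow(self, client_name, now=None):
        now = time.time() if now is None else now
        recent = self._accepted[client_name]
        # Drop timestamps that have left the one-hour window.
        while recent and now - recent[0] >= WINDOW_SECONDS:
            recent.popleft()
        if len(recent) >= MAX_PER_HOUR:
            return False
        recent.append(now)
        return True
```

Keying the counter by client only works because the service can tell authorized clients apart by their descriptor cookies; an unauthenticated service has no comparable handle.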
- -Compatibility: - - An implementation of this proposal would require changes to hidden - services and clients to process authorization data and encode and - understand the new formats. However, both services and clients would - remain compatible with regular hidden services without authorization. - -Implementation: - - The implementation of this proposal can be divided into a number of - changes to hidden service and client side. There are no - changes necessary on directory, introduction, or rendezvous nodes. All - changes are marked with either [service] or [client] to denote on which - side they need to be made. - - /1/ Configure client authorization [service] - - - Parse configuration option HiddenServiceAuthorizeClient containing - authorized client names. - - Load previously created client keys and descriptor cookies. - - Generate missing client keys and descriptor cookies, add them to - client_keys file. - - Rewrite the hostname file. - - Keep client keys and descriptor cookies of authorized clients in - memory. - [- In case of reconfiguration, mark which client authorizations were - added and whether any were removed. This can be used later when - deciding whether to rebuild introduction points and publish new - hidden service descriptors. Not implemented yet.] - - /2/ Publish hidden service descriptors [service] - - - Create and upload hidden service descriptors for all authorized - clients. - [- See /1/ for the case of reconfiguration.] - - /3/ Configure permission for hidden services [client] - - - Parse configuration option HidServAuth containing service - authorization, store authorization data in memory. - - /5/ Fetch hidden service descriptors [client] - - - Look up client authorization upon receiving a hidden service request. - - Request hidden service descriptor ID including client key and - descriptor cookie. Only request v2 descriptors, no v0. 
- - /6/ Process hidden service descriptor [client] - - - Decrypt introduction points with descriptor cookie. - - /7/ Create introduction request [client] - - - Include descriptor cookie in INTRODUCE2 cell to introduction point. - - Pass descriptor cookie around between involved connections and - circuits. - - /8/ Process introduction request [service] - - - Read descriptor cookie from INTRODUCE2 cell. - - Check whether descriptor cookie is authorized for access, including - checking access counters. - - Log access for accountability. - diff --git a/doc/spec/proposals/122-unnamed-flag.txt b/doc/spec/proposals/122-unnamed-flag.txt deleted file mode 100644 index 2ce7bb22b9..0000000000 --- a/doc/spec/proposals/122-unnamed-flag.txt +++ /dev/null @@ -1,136 +0,0 @@ -Filename: 122-unnamed-flag.txt -Title: Network status entries need a new Unnamed flag -Author: Roger Dingledine -Created: 04-Oct-2007 -Status: Closed -Implemented-In: 0.2.0.x - -1. Overview: - - Tor's directory authorities can give certain servers a "Named" flag - in the network-status entry, when they want to bind that nickname to - that identity key. This allows clients to specify a nickname rather - than an identity fingerprint and still be certain they're getting the - "right" server. As dir-spec.txt describes it, - - Name X is bound to identity Y if at least one binding directory lists - it, and no directory binds X to some other Y'. - - In practice, clients can refer to servers by nickname whether they are - Named or not; if they refer to nicknames that aren't Named, a complaint - shows up in the log asking them to use the identity key in the future - --- but it still works. - - The problem? Imagine a Tor server with nickname Bob. Bob and his - identity fingerprint are registered in tor26's approved-routers - file, but none of the other authorities registered him. Imagine - there are several other unregistered servers also with nickname Bob - ("the imposters"). 
- - While Bob is online, all is well: a) tor26 gives a Named flag to - the real one, and refuses to list the other ones; and b) the other - authorities list the imposters but don't give them a Named flag. Clients - who have all the network-statuses can compute which one is the real Bob. - - But when the real Bob disappears and his descriptor expires? tor26 - continues to refuse to list any of the imposters, and the other - authorities continue to list the imposters. Clients don't have any - idea that there exists a Named Bob, so they can ask for server Bob and - get one of the imposters. (A warning will also appear in their log, - but so what.) - -2. The stopgap solution: - - tor26 should start accepting and listing the imposters, but it should - assign them a new flag: "Unnamed". - - This would produce three cases in terms of assigning flags in the consensus - networkstatus: - - i) a router gets the Named flag in the v3 networkstatus if - a) it's the only router with that nickname that has the Named flag - out of all the votes, and - b) no vote lists it as Unnamed - else, - ii) a router gets the Unnamed flag if - a) some vote lists a different router with that nickname as Named, or - b) at least one vote lists it as Unnamed, or - c) there are other routers with the same nickname that are Unnamed - else, - iii) the router neither gets a Named nor an Unnamed flag. - - (This whole proposal is meant only for v3 dir flags; we shouldn't try - to backport it to the v2 dir world.) - - Then client behavior is: - - a) If there's a Bob with a Named flag, pick that one. - else b) If the Bobs don't have the Unnamed flag (notice that they should - either all have it, or none), pick one of them and warn. - else c) They all have the Unnamed flag -- no router found. - -3. Problems not solved by this stopgap: - - 3.1. Naming authorities can go offline. 
- - If tor26 is the only authority that provides a binding for Bob, when - tor26 goes offline we're back in our previous situation -- the imposters - can be referenced with a mere ignorable warning in the client's log. - - If some other authority Names a different Bob, and tor26 goes offline, - then that other Bob becomes the unique Named Bob. - - So be it. We should try to solve these one day, but there's no clear way - to do it that doesn't destroy usability in other ways, and if we want - to get the Unnamed flag into v3 network statuses we should add it soon. - - 3.2. V3 dir spec magnifies brief discrepancies. - - Another point to notice is if tor26 names Bob(1), doesn't know about - Bob(2), but moria lists Bob(2). Then Bob(2) doesn't get an Unnamed flag - even if it should (and Bob(1) is not around). - - Right now, in v2 dirs, the case where an authority doesn't know about - a server but the other authorities do know is rare. That's because - authorities periodically ask for other networkstatuses and then fetch - descriptors that are missing. - - With v3, if that window occurs at the wrong time, it is extended for the - entire period. We could solve this by making the voting more complex, - but that doesn't seem worth it. - - [3.3. Tor26 is only one tor26. - - We need more naming authorities, possibly with some kind of auto-naming - feature. This is out-of-scope for this proposal -NM] - -4. Changes to the v2 directory - - Previously, v2 authorities that had a binding for a server named Bob did - not list any other server named Bob. This will change too: - - Version 2 authorities will start listing all routers they know about, - whether they conflict with a name-binding or not: Servers for which - this authority has a binding will continue to be marked Named, - additionally all other servers of that nickname will be listed without the - Named flag (i.e. there will be no Unnamed flag in v2 status documents). 
- - Clients already should handle having a named Bob alongside unnamed - Bobs correctly, and having the unnamed Bobs in the status file even - without the named server is no worse than the current status quo where - clients learn about those servers from other authorities. - - The benefit of this is that an authority's opinion on a server like - Guard, Stable, Fast etc. can now be learned by clients even if that - specific authority has reserved that server's name for somebody else. - -5. Other benefits: - - This new flag will allow people to operate servers that happen to have - the same nickname as somebody who registered their server two years ago - and left soon after. Right now there are dozens of nicknames that are - registered on all three binding directory authorities, yet haven't been - running for years. While it's bad that these nicknames are effectively - blacklisted from the network, the really bad part is that this logic - is really unintuitive to prospective new server operators. - diff --git a/doc/spec/proposals/123-autonaming.txt b/doc/spec/proposals/123-autonaming.txt deleted file mode 100644 index 74c486985d..0000000000 --- a/doc/spec/proposals/123-autonaming.txt +++ /dev/null @@ -1,54 +0,0 @@ -Filename: 123-autonaming.txt -Title: Naming authorities automatically create bindings -Author: Peter Palfrader -Created: 2007-10-11 -Status: Closed -Implemented-In: 0.2.0.x - -Overview: - - Tor's directory authorities can give certain servers a "Named" flag - in the network-status entry, when they want to bind that nickname to - that identity key. This allows clients to specify a nickname rather - than an identity fingerprint and still be certain they're getting the - "right" server. - - Authority operators name a server by adding their nickname and - identity fingerprint to the 'approved-routers' file. 
Historically - being listed in the file was required for a router, at first for being - listed in the directory at all, and later in order to be used by - clients as a first or last hop of a circuit. - - Adding identities to the list of named routers so far has been a - manual, time consuming, and boring job. Given that and the fact that - the Tor network works just fine without named routers the last - authority to keep a current binding list stopped updating it well over - half a year ago. - - Naming, if it were done, would serve a useful purpose however in that - users can have a reasonable expectation that the exit server Bob they - are using in their http://www.google.com.bob.exit/ URL is the same - Bob every time. - -Proposal: - I propose that identity<->name binding be completely automated: - - New bindings should be added after the router has been around for a - bit and their name has not been used by other routers, similarly names - that have not appeared on the network for a long time should be freed - in case a new router wants to use it. - - The following rules are suggested: - i) If a named router has not been online for half a year, the - identity<->name binding for that name is removed. The nickname - is free to be taken by other routers now. - ii) If a router claims a certain nickname and - a) has been on the network for at least two weeks, and - b) that nickname is not yet linked to a different router, and - c) no other router has wanted that nickname in the last month, - a new binding should be created for this router and its desired - nickname. - - This automaton does not necessarily need to live in the Tor code, it - can do its job just as well when it's an external tool. 
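The suggested rules can be sketched as a small external tool. This is a sketch under stated assumptions: the names are hypothetical, "half a year" and "the last month" are approximated as 182 and 30 days, and an expired binding is treated as already removed when rule ii) is evaluated.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

EXPIRE_AFTER = timedelta(days=182)  # "half a year", approximated
MIN_UPTIME = timedelta(days=14)     # "at least two weeks"
QUIET_PERIOD = timedelta(days=30)   # "the last month", approximated


@dataclass
class Binding:
    identity: str        # identity fingerprint bound to the nickname
    last_seen: datetime  # last time that router was online


def binding_expired(binding: Binding, now: datetime) -> bool:
    """Rule i): the binding is removed after half a year offline."""
    return now - binding.last_seen >= EXPIRE_AFTER


def may_bind(identity: str, first_seen: datetime,
             current: Optional[Binding],
             last_rival_claim: Optional[datetime],
             now: datetime) -> bool:
    """Rule ii): should `identity` get a new binding for its nickname?"""
    # ii.a) the router has been on the network for at least two weeks
    if now - first_seen < MIN_UPTIME:
        return False
    # ii.b) the nickname is not linked to a different, live binding
    if (current is not None and current.identity != identity
            and not binding_expired(current, now)):
        return False
    # ii.c) no other router wanted the nickname in the last month
    if last_rival_claim is not None and now - last_rival_claim < QUIET_PERIOD:
        return False
    return True
```

Passing `now` explicitly keeps the decision deterministic, which makes it straightforward to replay against archived network data.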
- diff --git a/doc/spec/proposals/124-tls-certificates.txt b/doc/spec/proposals/124-tls-certificates.txt deleted file mode 100644 index 9472d14af8..0000000000 --- a/doc/spec/proposals/124-tls-certificates.txt +++ /dev/null @@ -1,313 +0,0 @@ -Filename: 124-tls-certificates.txt -Title: Blocking resistant TLS certificate usage -Author: Steven J. Murdoch -Created: 2007-10-25 -Status: Superseded - -Overview: - - To be less distinguishable from HTTPS web browsing, only Tor servers should - present TLS certificates. This should be done whilst maintaining backwards - compatibility with Tor nodes which present and expect client certificates, and - while preserving existing security properties. This specification describes - the negotiation protocol, what certificates should be presented during the TLS - negotiation, and how to move the client authentication within the encrypted - tunnel. - -Motivation: - - In Tor's current TLS [1] handshake, both client and server present a - two-certificate chain. Since TLS performs authentication prior to establishing - the encrypted tunnel, the contents of these certificates are visible to an - eavesdropper. In contrast, during normal HTTPS web browsing, the server - presents a single certificate, signed by a root CA and the client presents no - certificate. Hence it is possible to distinguish Tor from HTTP by identifying - this pattern. - - To resist blocking based on traffic identification, Tor should behave as close - to HTTPS as possible, i.e. servers should offer a single certificate and not - request a client certificate; clients should present no certificate. This - presents two difficulties: clients are no longer authenticated and servers are - authenticated by the connection key, rather than identity key. The link - protocol must thus be modified to preserve the old security semantics. 
- - Finally, in order to maintain backwards compatibility, servers must correctly - identify whether the client supports the modified certificate handling. This - is achieved by modifying the cipher suites that clients advertise support - for. These cipher suites are selected to be similar to those chosen by web - browsers, in order to resist blocking based on client hello. - -Terminology: - - Initiator: OP or OR which initiates a TLS connection ("client" in TLS - terminology) - - Responder: OR which receives an incoming TLS connection ("server" in TLS - terminology) - -Version negotiation and cipher suite selection: - - In the modified TLS handshake, the responder does not request a certificate - from the initiator. This request would normally occur immediately after the - responder receives the client hello (the first message in a TLS handshake) and - so the responder must decide whether to request a certificate based only on - the information in the client hello. This is achieved by examining the cipher - suites in the client hello. - - List 1: cipher suites lists offered by version 0/1 Tor - - From src/common/tortls.c, revision 12086: - TLS1_TXT_DHE_RSA_WITH_AES_128_SHA - TLS1_TXT_DHE_RSA_WITH_AES_128_SHA : SSL3_TXT_EDH_RSA_DES_192_CBC3_SHA - SSL3_TXT_EDH_RSA_DES_192_CBC3_SHA - - Client hello sent by initiator: - - Initiators supporting version 2 of the Tor connection protocol MUST - offer a different cipher suite list from those sent by pre-version 2 - Tors, contained in List 1. To maintain compatibility with older Tor - versions and common browsers, the cipher suite list MUST include - support for: - - TLS_DHE_RSA_WITH_AES_256_CBC_SHA - TLS_DHE_RSA_WITH_AES_128_CBC_SHA - SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA - SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA - - Client hello received by responder/server hello sent by responder: - - Responders supporting version 2 of the Tor connection protocol should compare - the cipher suite list in the client hello with those in List 1. 
If it matches - any in the list then the responder should assume that the initiator supports - version 1, and thus should maintain the version 1 behavior, i.e. send a - two-certificate chain, request a client certificate and do not send or expect - a VERSIONS cell [2]. - - Otherwise, the responder should assume version 2 behavior and select a cipher - suite following TLS [1] behavior, i.e. select the first entry from the client - hello cipher list which is acceptable. Responders MUST NOT select any suite - that lacks ephemeral keys, or whose symmetric keys are less than KEY_LEN bits, - or whose digests are less than HASH_LEN bits. Implementations SHOULD NOT - allow other SSLv3 ciphersuites. - - Should no mutually acceptable cipher suite be found, the connection MUST be - closed. - - If the responder is implementing version 2 of the connection protocol it - SHOULD send a server certificate with random contents. The organizationName - field MUST NOT be "Tor", "TOR" or "t o r". - - Server certificate received by initiator: - - If the server certificate has an organizationName of "Tor", "TOR" or "t o r", - the initiator should assume that the responder does not support version 2 of - the connection protocol. In that case, the initiator should respond following - version 1, i.e. send a two-certificate client chain and do not send or expect - a VERSIONS cell. - - [SJM: We could also use the fact that a client certificate request was sent] - - If the server hello contains a ciphersuite which does not comply with the key - length requirements above, even if it was one offered in the client hello, the - connection MUST be closed. This will only occur if the responder is not a Tor - server. - - Backward compatibility: - - v1 Initiator, v1 Responder: No change - v1 Initiator, v2 Responder: Responder detects v1 initiator by client hello - v2 Initiator, v1 Responder: Responder accepts v2 client hello. 
Initiator - detects v1 server certificate and continues with v1 protocol - v2 Initiator, v2 Responder: Responder accepts v2 client hello. Initiator - detects v2 server certificate and continues with v2 protocol. - - Additional link authentication process: - - Following VERSION and NETINFO negotiation, both responder and - initiator MUST send a certification chain in a CERT cell. If one - party does not have a certificate, the CERT cell MUST still be sent, - but with a length of zero. - - A CERT cell is a variable length cell, of the format - CircID [2 bytes] - Command [1 byte] - Length [2 bytes] - Payload [<length> bytes] - - CircID MUST be set to 0x0000 - Command is [SJM: TODO] - Length is the length of the payload - Payload contains 0 or more certificates, each is of the format: - Cert_Length [2 bytes] - Certificate [<cert_length> bytes] - - Each certificate MUST sign the one preceding it. The initiator MUST - place its connection certificate first; the responder, having - already sent its connection certificate as part of the TLS handshake - MUST place its identity certificate first. - - Initiators who send a CERT cell MUST follow that with a LINK_AUTH - cell to prove that they possess the corresponding private key. 
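The CERT cell layout described above can be sketched as an encoder and decoder. This is an illustration only: the proposal leaves the Command value as a TODO, so `CERT_COMMAND` below is a hypothetical placeholder, and network (big-endian) byte order is assumed because the text does not state it.

```python
import struct

# Hypothetical placeholder: the proposal does not assign a command value.
CERT_COMMAND = 0xFF

HEADER = struct.Struct("!HBH")  # CircID, Command, Length (big-endian assumed)


def encode_cert_cell(certs, command=CERT_COMMAND):
    """Serialize a list of DER certificates (bytes) into one CERT cell.

    An empty list yields the zero-length cell that a party without a
    certificate must still send.
    """
    payload = b"".join(struct.pack("!H", len(c)) + c for c in certs)
    if len(payload) > 0xFFFF:
        raise ValueError("certificate chain too long for a 2-byte Length")
    return HEADER.pack(0x0000, command, len(payload)) + payload


def decode_cert_cell(cell):
    """Parse a CERT cell and return its certificates in order."""
    circ_id, _command, length = HEADER.unpack(cell[:HEADER.size])
    if circ_id != 0x0000:
        raise ValueError("CircID must be 0x0000")
    payload = cell[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated payload")
    certs, pos = [], 0
    while pos < length:
        if pos + 2 > length:
            raise ValueError("truncated Cert_Length")
        (cert_len,) = struct.unpack("!H", payload[pos:pos + 2])
        pos += 2
        if pos + cert_len > length:
            raise ValueError("truncated certificate")
        certs.append(payload[pos:pos + cert_len])
        pos += cert_len
    return certs
```

The decoder only checks framing; verifying that each certificate signs the one preceding it is a separate step and is not shown here.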
- - A LINK_AUTH cell is fixed-length, of the format: - CircID [2 bytes] - Command [1 byte] - Length [2 bytes] - Payload (padded with 0 bytes) [PAYLOAD_LEN - 2 bytes] - - CircID MUST be set to 0x0000 - Command is [SJM: TODO] - Length is the length of the valid portion of the payload - Payload is of the format: - Signature version [1 byte] - Signature [<length> - 1 bytes] - Padding [PAYLOAD_LEN - <length> - 2 bytes] - - Signature version: Identifies the type of signature, currently 0x00 - Signature: Digital signature under the initiator's connection key of the - following item, in PKCS #1 block type 1 [3] format: - - HMAC-SHA1, using the TLS master secret as key, of the - following elements concatenated: - - The signature version (0x00) - - The NUL terminated ASCII string: "Tor initiator certificate verification" - - client_random, as sent in the Client Hello - - server_random, as sent in the Server Hello - - SHA-1 hash of the initiator connection certificate - - SHA-1 hash of the responder connection certificate - - Security checks: - - - Before sending a LINK_AUTH cell, a node MUST ensure that the TLS - connection is authenticated by the responder key. 
- - For the handshake to have succeeded, the initiator MUST confirm: - - That the TLS handshake was authenticated by the - responder connection key - - That the responder connection key was signed by the first - certificate in the CERT cell - - That each certificate in the CERT cell was signed by the - following certificate, with the exception of the last - - That the last certificate in the CERT cell is the expected - identity certificate for the node being connected to - - For the handshake to have succeeded, the responder MUST confirm - either: - A) - A zero length CERT cell was sent and no LINK_AUTH cell was - sent - In which case the responder shall treat the identity of the - initiator as unknown - or - B) - That the LINK_AUTH MAC contains a signature by the first - certificate in the CERT cell - - That the MAC signed matches the expected value - - That each certificate in the CERT cell was signed by the - following certificate, with the exception of the last - In which case the responder shall treat the identity of the - initiator as that of the last certificate in the CERT cell - - Protocol summary: - - 1. I(nitiator) <-> R(esponder): TLS handshake, including responder - authentication under connection certificate R_c - 2. I <-> R: VERSION and NETINFO negotiation - 3. R -> I: CERT (Responder identity certificate R_i (which signs R_c)) - 4. I -> R: CERT (Initiator connection certificate I_c, - Initiator identity certificate I_i (which signs I_c)) - 5. I -> R: LINK_AUTH (Signature, under I_c of HMAC-SHA1(master_secret, - "Tor initiator certificate verification" || - client_random || server_random || - I_c hash || R_c hash)) - - Notes: I -> R doesn't need to wait for R_i before sending its own - messages (reduces round-trips). - Certificate hash is calculated like identity hash in CREATE cells. - Initiator signature is calculated in a similar way to Certificate - Verify messages in TLS 1.1 (RFC4346, Sections 7.4.8 and 4.7). 
- If I is an OP, a zero length certificate chain may be sent in step 4; - in which case, step 5 is not performed - - Rationale: - - - Version and netinfo negotiation before authentication: The version cell needs - to come before the rest of the protocol, since we may choose to alter - the rest at some later point, e.g. switch to a different MAC/signature scheme. - It is useful to keep the NETINFO and VERSION cells close to each other, since - the time between them is used to check if there is a delay-attack. Still, a - server might want to not act on NETINFO data from an initiator until the - authentication is complete. - -Appendix A: Cipher suite choices - - This specification intentionally does not put any constraints on the - TLS ciphersuite lists presented by clients, other than a minimum - required for compatibility. However, to maximize blocking - resistance, ciphersuite lists should be carefully selected. - - Recommended client ciphersuite list - - Source: http://lxr.mozilla.org/security/source/security/nss/lib/ssl/sslproto.h - - 0xc00a: TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA - 0xc014: TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA - 0x0039: TLS_DHE_RSA_WITH_AES_256_CBC_SHA - 0x0038: TLS_DHE_DSS_WITH_AES_256_CBC_SHA - 0xc00f: TLS_ECDH_RSA_WITH_AES_256_CBC_SHA - 0xc005: TLS_ECDH_ECDSA_WITH_AES_256_CBC_SHA - 0x0035: TLS_RSA_WITH_AES_256_CBC_SHA - 0xc007: TLS_ECDHE_ECDSA_WITH_RC4_128_SHA - 0xc009: TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA - 0xc011: TLS_ECDHE_RSA_WITH_RC4_128_SHA - 0xc013: TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA - 0x0033: TLS_DHE_RSA_WITH_AES_128_CBC_SHA - 0x0032: TLS_DHE_DSS_WITH_AES_128_CBC_SHA - 0xc00c: TLS_ECDH_RSA_WITH_RC4_128_SHA - 0xc00e: TLS_ECDH_RSA_WITH_AES_128_CBC_SHA - 0xc002: TLS_ECDH_ECDSA_WITH_RC4_128_SHA - 0xc004: TLS_ECDH_ECDSA_WITH_AES_128_CBC_SHA - 0x0004: SSL_RSA_WITH_RC4_128_MD5 - 0x0005: SSL_RSA_WITH_RC4_128_SHA - 0x002f: TLS_RSA_WITH_AES_128_CBC_SHA - 0xc008: TLS_ECDHE_ECDSA_WITH_3DES_EDE_CBC_SHA - 0xc012: 
TLS_ECDHE_RSA_WITH_3DES_EDE_CBC_SHA - 0x0016: SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA - 0x0013: SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA - 0xc00d: TLS_ECDH_RSA_WITH_3DES_EDE_CBC_SHA - 0xc003: TLS_ECDH_ECDSA_WITH_3DES_EDE_CBC_SHA - 0xfeff: SSL_RSA_FIPS_WITH_3DES_EDE_CBC_SHA (168-bit Triple DES with RSA and a SHA1 MAC) - 0x000a: SSL_RSA_WITH_3DES_EDE_CBC_SHA - - Order specified in: - http://lxr.mozilla.org/security/source/security/nss/lib/ssl/sslenum.c#47 - - Recommended options: - 0x0000: Server Name Indication [4] - 0x000a: Supported Elliptic Curves [5] - 0x000b: Supported Point Formats [5] - - Recommended compression: - 0x00 - - Recommended server ciphersuite selection: - - The responder should select the first entry in this list which is - listed in the client hello: - - 0x0039: TLS_DHE_RSA_WITH_AES_256_CBC_SHA [ Common Firefox choice ] - 0x0033: TLS_DHE_RSA_WITH_AES_128_CBC_SHA [ Tor v1 default ] - 0x0016: SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA [ Tor v1 fallback ] - 0x0013: SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA [ Valid IE option ] - -References: - -[1] The Transport Layer Security (TLS) Protocol, Version 1.1, RFC4346, IETF - -[2] Version negotiation for the Tor protocol, Tor proposal 105 - -[3] B. Kaliski, "Public-Key Cryptography Standards (PKCS) #1: - RSA Cryptography Specifications Version 1.5", RFC 2313, - March 1998. - -[4] TLS Extensions, RFC 3546 - -[5] Elliptic Curve Cryptography (ECC) Cipher Suites for Transport Layer Security (TLS) - -% <!-- Local IspellDict: american --> diff --git a/doc/spec/proposals/125-bridges.txt b/doc/spec/proposals/125-bridges.txt deleted file mode 100644 index 9d95729d42..0000000000 --- a/doc/spec/proposals/125-bridges.txt +++ /dev/null @@ -1,291 +0,0 @@ -Filename: 125-bridges.txt -Title: Behavior for bridge users, bridge relays, and bridge authorities -Author: Roger Dingledine -Created: 11-Nov-2007 -Status: Closed -Implemented-In: 0.2.0.x - -0. 
Preface - - This document describes the design decisions around support for bridge - users, bridge relays, and bridge authorities. It acts as an overview - of the bridge design and deployment for developers, and it also tries - to point out limitations in the current design and implementation. - - For more details on what all of these mean, look at blocking.tex in - /doc/design-paper/ - -1. Bridge relays - - Bridge relays are just like normal Tor relays except they don't publish - their server descriptors to the main directory authorities. - -1.1. PublishServerDescriptor - - To configure your relay to be a bridge relay, just add - BridgeRelay 1 - PublishServerDescriptor bridge - to your torrc. This will cause your relay to publish its descriptor - to the bridge authorities rather than to the default authorities. - - Alternatively, you can say - BridgeRelay 1 - PublishServerDescriptor 0 - which will cause your relay to not publish anywhere. This could be - useful for private bridges. - -1.2. Exit policy - - Bridge relays should use an exit policy of "reject *:*". This is - because they only need to relay traffic between the bridge users - and the rest of the Tor network, so there's no need to let people - exit directly from them. - -1.3. RelayBandwidthRate / RelayBandwidthBurst - - We invented the RelayBandwidth* options for this situation: Tor clients - who want to allow relaying too. See proposal 111 for details. Relay - operators should feel free to rate-limit their relayed traffic. - -1.4. Helping the user with port forwarding, NAT, etc. - - Just as for operating normal relays, our documentation and hints for - how to make your ORPort reachable are inadequate for normal users. - - We need to work harder on this step, perhaps in 0.2.2.x. - -1.5. Vidalia integration - - Vidalia has turned its "Relay" settings page into a tri-state - "Don't relay" / "Relay for the Tor network" / "Help censored users". 
- - If you click the third choice, it forces your exit policy to reject *:*. - - If all the bridges end up on port 9001, that's not so good. On the - other hand, putting the bridges on a low-numbered port in the Unix - world requires jumping through extra hoops. The current compromise is - that Vidalia makes the ORPort default to 443 on Windows, and 9001 on - other platforms. - - At the bottom of the relay config settings window, Vidalia displays - the bridge identifier to the operator (see Section 3.1) so he can pass - it on to bridge users. - -1.6. What if the default ORPort is already used? - - If the user already has a webserver or some other application - bound to port 443, then Tor will fail to bind it and complain to the - user, probably in a cryptic way. Rather than just working on a better - error message (though we should do this), we should consider an - "ORPort auto" option that tells Tor to try to find something that's - bindable and reachable. This would also help us tolerate ISPs that - filter incoming connections on port 80 and port 443. But this should - be a different proposal, and can wait until 0.2.2.x. - -2. Bridge authorities. - - Bridge authorities are like normal directory authorities, except they - don't create their own network-status documents or votes. So if you - ask an authority for a network-status document or consensus, they - behave like a directory mirror: they give you one from one of the main - authorities. But if you ask the bridge authority for the descriptor - corresponding to a particular identity fingerprint, it will happily - give you the latest descriptor for that fingerprint. - - To become a bridge authority, add these lines to your torrc: - AuthoritativeDirectory 1 - BridgeAuthoritativeDir 1 - - Right now there's one bridge authority, running on the Tonga relay. - -2.1. Exporting bridge-purpose descriptors - - We've added a new purpose for server descriptors: the "bridge" - purpose. 
With the new router-descriptors file format that includes - annotations, it's easy to look through it and find the bridge-purpose - descriptors. - - Currently we export the bridge descriptors from Tonga to the - BridgeDB server, so it can give them out according to the policies - in blocking.pdf. - -2.2. Reachability/uptime testing - - Right now the bridge authorities do active reachability testing of - bridges, so we know which ones to recommend for users. - - But in the design document, we suggested that bridges should publish - anonymously (i.e. via Tor) to the bridge authority, so somebody watching - the bridge authority can't just enumerate all the bridges. But if we're - doing active measurement, the game is up. Perhaps we should back off on - this goal, or perhaps we should do our active measurement anonymously? - - Answering this issue is scheduled for 0.2.1.x. - -2.3. Migrating to multiple bridge authorities - - Having only one bridge authority is both a trust bottleneck (if you - break into one place you learn about every single bridge we've got) - and a robustness bottleneck (when it's down, bridge users become sad). - - Right now if we put up a second bridge authority, all the bridges would - publish to it, and (assuming the code works) bridge users would query - a random bridge authority. This resolves the robustness bottleneck, - but makes the trust bottleneck even worse. - - In 0.2.2.x and later we should think about better ways to have multiple - bridge authorities. - -3. Bridge users. - - Bridge users are like ordinary Tor users except they use encrypted - directory connections by default, and they use bridge relays as both - entry guards (their first hop) and directory guards (the source of - all their directory information). - - To become a bridge user, add the following line to your torrc: - - UseBridges 1 - - and then add at least one "Bridge" line to your torrc based on the - format below. - -3.1. Format of the bridge identifier. 
- - The canonical format for a bridge identifier contains an IP address, - an ORPort, and an identity fingerprint: - bridge 128.31.0.34:9009 4C17 FB53 2E20 B2A8 AC19 9441 ECD2 B017 7B39 E4B1 - - However, the identity fingerprint can be left out, in which case the - bridge user will connect to that relay and use it as a bridge regardless - of what identity key it presents: - bridge 128.31.0.34:9009 - This might be useful for cases where only short bridge identifiers - can be communicated to bridge users. - - In a future version we may also support bridge identifiers that are - only a key fingerprint: - bridge 4C17 FB53 2E20 B2A8 AC19 9441 ECD2 B017 7B39 E4B1 - and the bridge user can fetch the latest descriptor from the bridge - authority (see Section 3.4). - -3.2. Bridges as entry guards - - For now, bridge users add their bridge relays to their list of "entry - guards" (see path-spec.txt for background on entry guards). They are - managed by the entry guard algorithms exactly as if they were a normal - entry guard -- their keys and timing get cached in the "state" file, - etc. This means that when the Tor user starts up with "UseBridges" - disabled, he will skip past the bridge entries since they won't be - listed as up and usable in his networkstatus consensus. But to be clear, - the "entry_guards" list doesn't currently distinguish guards by purpose. - - Internally, each bridge user keeps a smartlist of "bridge_info_t" - that reflects the "bridge" lines from his torrc along with a download - schedule (see Section 3.5 below). When he starts Tor, he attempts - to fetch a descriptor for each configured bridge (see Section 3.4 - below). When he succeeds at getting a descriptor for one of the bridges - in his list, he adds it directly to the entry guard list using the - normal add_an_entry_guard() interface. 
Once a bridge descriptor has - been added, should_delay_dir_fetches() will stop delaying further - directory fetches, and the user begins to bootstrap his directory - information from that bridge (see Section 3.3). - - Currently bridge users cache their bridge descriptors to the - "cached-descriptors" file (annotated with purpose "bridge"), but - they don't make any attempt to reuse descriptors they find in this - file. The theory is that either the bridge is available now, in which - case you can get a fresh descriptor, or it's not, in which case an - old descriptor won't do you much good. - - We could disable writing out the bridge lines to the state file, if - we think this is a problem. - - As an exception, if we get an application request when we have one - or more bridge descriptors but we believe none of them are running, - we mark them all as running again. This is similar to the exception - already in place to help long-idle Tor clients realize they should - fetch fresh directory information rather than just refuse requests. - -3.3. Bridges as directory guards - - In addition to using bridges as the first hop in their circuits, bridge - users also use them to fetch directory updates. Other than initial - bootstrapping to find a working bridge descriptor (see Section 3.4 - below), all further non-anonymized directory fetches will be redirected - to the bridge. - - This means that bridge relays need to have cached answers for all - questions the bridge user might ask. This makes the upgrade path - tricky --- for example, if we migrate to a v4 directory design, the - bridge user would need to keep using v3 so long as his bridge relays - only knew how to answer v3 queries. - - In a future design, for cases where the user has enough information - to build circuits yet the chosen bridge doesn't know how to answer a - given query, we might teach bridge users to make an anonymized request - to a more suitable directory server. - -3.4. 
How bridge users get their bridge descriptor - - Bridge users can fetch bridge descriptors in two ways: by going directly - to the bridge and asking for "/tor/server/authority", or by going to - the bridge authority and asking for "/tor/server/fp/ID". By default, - they will only try the direct queries. If the user sets - UpdateBridgesFromAuthority 1 - in his config file, then he will try querying the bridge authority - first for bridges where he knows a digest (if he only knows an IP - address and ORPort, then his only option is a direct query). - - If the user has at least one working bridge, then he will do further - queries to the bridge authority through a full three-hop Tor circuit. - But when bootstrapping, he will make a direct begin_dir-style connection - to the bridge authority. - - As of Tor 0.2.0.10-alpha, if the user attempts to fetch a descriptor - from the bridge authority and it returns a 404 not found, the user - will automatically fall back to trying a direct query. Therefore it is - recommended that bridge users always set UpdateBridgesFromAuthority, - since at worst it will delay their fetches a little bit and notify - the bridge authority of the identity fingerprint (but not location) - of their intended bridges. - -3.5. Bridge descriptor retry schedule - - Bridge users try to fetch a descriptor for each bridge (using the - steps in Section 3.4 above) on startup. Whenever they receive a - bridge descriptor, they reschedule a new descriptor download for 1 - hour from then. - - If on the other hand it fails, they try again after 15 minutes for the - first attempt, after 15 minutes for the second attempt, and after 60 - minutes for subsequent attempts. - - In 0.2.2.x we should come up with some smarter retry schedules. - -3.6. Vidalia integration - - Vidalia 0.0.16 has a checkbox in its Network config window called - "My ISP blocks connections to the Tor network." 
Users who click that - box change their configuration to: - UseBridges 1 - UpdateBridgesFromAuthority 1 - and should specify at least one Bridge identifier. - -3.7. Do we need a second layer of entry guards? - - If the bridge user uses the bridge as its entry guard, then the - triangulation attacks from Lasse and Paul's Oakland paper work to - locate the user's bridge(s). - - Worse, this is another way to enumerate bridges: if the bridge users - keep rotating through second hops, then if you run a few fast servers - (and avoid getting considered an Exit or a Guard) you'll quickly get - a list of the bridges in active use. - - That's probably the strongest reason why bridge users will need to - pick second-layer guards. Would this mean bridge users should switch - to four-hop circuits? - - We should figure this out in the 0.2.1.x timeframe. - diff --git a/doc/spec/proposals/126-geoip-reporting.txt b/doc/spec/proposals/126-geoip-reporting.txt deleted file mode 100644 index 9f3b21c670..0000000000 --- a/doc/spec/proposals/126-geoip-reporting.txt +++ /dev/null @@ -1,410 +0,0 @@ -Filename: 126-geoip-reporting.txt -Title: Getting GeoIP data and publishing usage summaries -Author: Roger Dingledine -Created: 2007-11-24 -Status: Closed -Implemented-In: 0.2.0.x - -0. Status - - In 0.2.0.x, this proposal is implemented to the extent needed to - address its motivations. See notes below with the text "RESOLUTION" - for details. - -1. Background and motivation - - Right now we can keep a rough count of Tor users, both total and by - country, by watching connections to a single directory mirror. Being - able to get usage estimates is useful both for our funders (to - demonstrate progress) and for our own development (so we know how - quickly we're scaling and can design accordingly, and so we know which - countries and communities to focus on more). 
This need for information - is the only reason we haven't deployed "directory guards" (think of - them like entry guards but for directory information; in practice, - it would seem that Tor clients should simply use their entry guards - as their directory guards; see also proposal 125). - - With the move toward bridges, we will no longer be able to track Tor - clients that use bridges, since they use their bridges as directory - guards. Further, we need to be able to learn which bridges stop seeing - use from certain countries (and are thus likely blocked), so we can - avoid giving them out to other users in those countries. - - Right now we already do GeoIP lookups in Vidalia: Vidalia draws relays - and circuits on its 'network map', and it performs anonymized GeoIP - lookups to its central servers to know where to put the dots. Vidalia - caches answers it gets -- to reduce delay, to reduce overhead on - the network, and to reduce anonymity issues where users reveal their - knowledge about the network through which IP addresses they ask about. - - But with the advent of bridges, Tor clients are asking about IP - addresses that aren't in the main directory. In particular, bridge - users inform the central Vidalia servers about each bridge as they - discover it and their Vidalia tries to map it. - - Also, we wouldn't mind letting Vidalia do a GeoIP lookup on the client's - own IP address, so it can provide a more useful map. - - Finally, Vidalia's central servers leave users open to partitioning - attacks, even if they can't target specific users. Further, as we - start using GeoIP results for more operational or security-relevant - goals, such as avoiding or including particular countries in circuits, - it becomes more important that users can't be singled out in terms of - their IP-to-country mapping beliefs. - -2. 
The available GeoIP databases - - There are at least two classes of GeoIP database out there: "IP to - country", which tells us the country code for the IP address but - no more details, and "IP to city", which tells us the country code, - the name of the city, and some basic latitude/longitude guesses. - - A recent ip-to-country.csv is 3421362 bytes. Compressed, it is 564252 - bytes. A typical line is: - "205500992","208605279","US","USA","UNITED STATES" - http://ip-to-country.webhosting.info/node/view/5 - - Similarly, the maxmind GeoLite Country database is also about 500KB - compressed. - http://www.maxmind.com/app/geolitecountry - - The maxmind GeoLite City database gives more finegrained detail like - geo coordinates and city name. Vidalia currently makes use of this - information. On the other hand it's 16MB compressed. A typical line is: - 206.124.149.146,Bellevue,WA,US,47.6051,-122.1134 - http://www.maxmind.com/app/geolitecity - - There are other databases out there, like - http://www.hostip.info/faq.html - http://www.webconfs.com/ip-to-city.php - that want more attention, but for now let's assume that all the db's - are around this size. - -3. What we'd like to solve - - Goal #1a: Tor relays collect IP-to-country user stats and publish - sanitized versions. - Goal #1b: Tor bridges collect IP-to-country user stats and publish - sanitized versions. - - Goal #2a: Vidalia learns IP-to-city stats for Tor relays, for better - mapping. - Goal #2b: Vidalia learns IP-to-country stats for Tor relays, so the user - can pick countries for her paths. - - Goal #3: Vidalia doesn't do external lookups on bridge relay addresses. - - Goal #4: Vidalia resolves the Tor client's IP-to-country or IP-to-city - for better mapping. - - Goal #5: Reduce partitioning opportunities where Vidalia central - servers can give different (distinguishing) responses. - -4. 
Solution overview - - Our goal is to allow Tor relays, bridges, and clients to learn enough - GeoIP information so they can do local private queries. - -4.1. The IP-to-country db - - Directory authorities should publish a "geoip" file that contains - IP-to-country mappings. Directory caches will mirror it, and Tor clients - and relays (including bridge relays) will fetch it. Thus we can solve - goals 1a and 1b (publish sanitized usage info). Controllers could also - use this to solve goal 2b (choosing path by country attributes). It - also solves goal 4 (learning the Tor client's country), though for - huge countries like the US we'd still need to decide where the "middle" - should be when we're mapping that address. - - The IP-to-country details are described further in Sections 5 and - 6 below. - - [RESOLUTION: The geoip file in 0.2.0.x is not distributed through - Tor. Instead, it is shipped with the bundle.] - -4.2. The IP-to-city db - - In an ideal world, the IP-to-city db would be small enough that we - could distribute it in the above manner too. But for now, it is too - large. Here's where the design choice forks. - - Option A: Vidalia should continue doing its anonymized IP-to-city - queries. Thus we can achieve goals 2a and 2b. We would solve goal - 3 by only doing lookups on descriptors that are purpose "general" - (see Section 4.2.1 for how). We would leave goal 5 unsolved. - - Option B: Each directory authority should keep an IP-to-city db, - lookup the value for each router it lists, and include that line in - the router's network-status entry. The network-status consensus would - then use the line that appears in the majority of votes. This approach - also solves goals 2a and 2b, goal 3 (Vidalia doesn't do any lookups - at all now), and goal 5 (reduced partitioning risks). 
- - Option B has the advantage that Vidalia can simplify its operation, - and the advantage that this consensus IP-to-city data is available to - other controllers besides just Vidalia. But it has the disadvantage - that the networkstatus consensus becomes larger, even though most of - the GeoIP information won't change from one consensus to the next. Is - there another reasonable location for it that can provide similar - consensus security properties? - - [RESOLUTION: IP-to-city is not supported.] - -4.2.1. Controllers can query for router annotations - - Vidalia needs to stop doing queries on bridge relay IP addresses. - It could do that by only doing lookups on descriptors that are in - the networkstatus consensus, but that precludes designs like Blossom - that might want to map its relay locations. The best answer is that it - should learn the router annotations, with a new controller 'getinfo' - command: - "GETINFO desc-annotations/id/<OR identity>" - which would respond with something like - @downloaded-at 2007-11-29 08:06:38 - @source "128.31.0.34" - @purpose bridge - - [We could also make the answer include the digest for the router in - question, which would enable us to ask GETINFO router-annotations/all. - Is this worth it? -RD] - - Then Vidalia can avoid doing lookups on descriptors with purpose - "bridge". Even better would be to add a new annotation "@private true" - so Vidalia can know how to handle new purposes that we haven't created - yet. Vidalia could special-case "bridge" for now, for compatibility - with the current 0.2.0.x-alphas. - -4.3. Recommendation - - My overall recommendation is that we should implement 4.1 soon - (e.g. early in 0.2.1.x), and we can go with 4.2 option A for now, - with the hope that later we discover a better way to distribute the - IP-to-city info and can switch to 4.2 option B. - - Below we discuss more how to go about achieving 4.1. - -5. 
Publishing and caching the GeoIP (IP-to-country) database - - Each v3 directory authority should put a copy of the "geoip" file in - its datadirectory. Then its network-status votes should include a hash - of this file (Recommended-geoip-hash: %s), and the resulting consensus - directory should specify the consensus hash. - - There should be a new URL for fetching this geoip db (by "current.z" - for testing purposes, and by hash.z for typical downloads). Authorities - should fetch and serve the one listed in the consensus, even when they - vote for their own. This would argue for storing the cached version - in a better filename than "geoip". - - Directory mirrors should keep a copy of this file available via the - same URLs. - - We assume that the file would change at most a few times a month. Should - Tor ship with a bootstrap geoip file? An out-of-date geoip file may - open you up to partitioning attacks, but for the most part it won't - be that different. - - There should be a config option to disable updating the geoip file, - in case users want to use their own file (e.g. they have a proprietary - GeoIP file they prefer to use). In that case we leave it up to the - user to update his geoip file out-of-band. - - [XXX Should consider forward/backward compatibility, e.g. if we want - to move to a new geoip file format. -RD] - - [RESOLUTION: Not done over Tor.] - -6. Controllers use the IP-to-country db for mapping and for path building - - Down the road, Vidalia could use the IP-to-country mappings for placing - on its map: - - The location of the client - - The location of the bridges, or other relays not in the - networkstatus, on the map. - - Any relays that it doesn't yet have an IP-to-city answer for. - - Other controllers can also use it to set EntryNodes, ExitNodes, etc - in a per-country way. - - To support these features, we need to export the IP-to-country data - via the Tor controller protocol. - - Is it sufficient just to add a new GETINFO command? 
- GETINFO ip-to-country/128.31.0.34 - 250+ip-to-country/128.31.0.34="US","USA","UNITED STATES" - - [RESOLUTION: Not done now, except for the getinfo command.] - -6.1. Other interfaces - - Robert Hogan has also suggested a - - GETINFO relays-by-country/cn - - as well as torrc options for ExitCountryCodes, EntryCountryCodes, - ExcludeCountryCodes, etc. - - [RESOLUTION: Not implemented in 0.2.0.x. Fodder for a future proposal.] - -7. Relays and bridges use the IP-to-country db for usage summaries - - Once bridges have a GeoIP database locally, they can start to publish - sanitized summaries of client usage -- how many users they see and from - what countries. This might also be a more useful way for ordinary Tor - relays to convey the level of usage they see, which would allow us to - switch to using directory guards for all users by default. - - But how to safely summarize this information without opening too many - anonymity leaks? - -7.1 Attacks to think about - - First, note that we need to have a large enough time window that we're - not aiding correlation attacks much. I hope 24 hours is enough. So - that means no publishing stats until you've been up at least 24 hours. - And you can't publish follow-up stats more often than every 24 hours, - or people could look at the differential. - - Second, note that we need to be sufficiently vague about the IP - addresses we're reporting. We are hoping that just specifying the - country will be vague enough. But a) what about active attacks where - we convince a bridge to use a GeoIP db that labels each suspect IP - address as a unique country? We have to assume that the consensus GeoIP - db won't be malicious in this way. And b) could such singling-out - attacks occur naturally, for example because of countries that have - a very small IP space? We should investigate that. - -7.2. Granularity of users - - Do we only want to report countries that have a sufficient anonymity set - (that is, number of users) for the day? 
For example, we might avoid - listing any countries that have seen fewer than five addresses over - the 24 hour period. This approach would be helpful in reducing the - singling-out opportunities -- in the extreme case, we could imagine a - situation where one blogger from the Sudan used Tor on a given day, and - we can discover which entry guard she used. - - But I fear that especially for bridges, seeing only one hit from a - given country in a given day may be quite common. - - As a compromise, we should start out with an "Other" category in - the reported stats, which is the sum of unlisted countries; if that - category is consistently interesting, we can think harder about how - to get the right data from it safely. - - But note that bridge summaries will not be made public individually, - since doing so would help people enumerate bridges. Whereas summaries - from normal relays will be public. So perhaps that means we can afford - to be more specific in bridge summaries? In particular, I'm thinking the - "other" category should be used by public relays but not for bridges - (or if it is, used with a lower threshold). - - Even for countries that have many Tor users, we might not want to be - too specific about how many users we've seen. For example, we might - round down the number of users we report to the nearest multiple of 5. - My instinct for now is that this won't be that useful. - -7.3 Other issues - - Another note: we'll likely be overreporting in the case of users with - dynamic IP addresses: if they rotate to a new address over the course - of the day, we'll count them twice. So be it. - -7.4. Where to publish the summaries? - - We designed extrainfo documents for information like this. So they - should just be more entries in the extrainfo doc. - - But if we want to publish summaries every 24 hours (no more often, - no less often), aren't we tied to the router descriptor publishing - schedule? 
That is, if we publish a new router descriptor at the 18 - hour mark, and nothing much has changed at the 24 hour mark, won't - the new descriptor get dropped as being "cosmetically similar", and - then nobody will know to ask about the new extrainfo document? - - One solution would be to make and remember the 24 hour summary at the - 24 hour mark, but not actually publish it anywhere until we happen to - publish a new descriptor for other reasons. If we happen to go down - before publishing a new descriptor, then so be it, at least we tried. - -7.5. What if the relay is unreachable or goes to sleep? - - Even if you've been up for 24 hours, if you were hibernating for 18 - of them, then we're not getting as much fuzziness as we'd like. So - I guess that means that we need a 24-hour period of being "awake" - before we're willing to publish a summary. A similar attack works if - you've been awake but unreachable for the first 18 of the 24 hours. As - another example, a bridge that's on a laptop might be suspended for - some of each day. - - This implies that some relays and bridges will never publish summary - stats, because they're not ever reliably working for 24 hours in - a row. If a significant percentage of our reporters end up being in - this boat, we should investigate whether we can accumulate 24 hours of - "usefulness", even if there are holes in the middle, and publish based - on that. - - What other issues are like this? It seems that just moving to a new - IP address shouldn't be a reason to cancel stats publishing, assuming - we were usable at each address. - -7.6. IP addresses that aren't in the geoip db - - Some IP addresses aren't in the public geoip databases. In particular, - I've found that a lot of African countries are missing, but there - are also some common ones in the US that are missing, like parts of - Comcast. 
We could just lump unknown IP addresses into the "other" - category, but it might be useful to gather a general sense of how many - lookups are failing entirely, by adding a separate "Unknown" category. - - We could also contribute back to the geoip db, by letting bridges set - a config option to report the actual IP addresses that failed their - lookup. Then the bridge authority operators can manually make sure - the correct answer will be in later geoip files. This config option - should be disabled by default. - -7.7 Bringing it all together - - So here's the plan: - - 24 hours after starting up (modulo Section 7.5 above), bridges and - relays should construct a daily summary of client countries they've - seen, including the above "Unknown" category (Section 7.6) as well. - - Non-bridge relays lump all countries with less than K (e.g. K=5) users - into the "Other" category (see Sec 7.2 above), whereas bridge relays are - willing to list a country even when it has only one user for the day. - - Whenever we have a daily summary on record, we include it in our - extrainfo document whenever we publish one. The daily summary we - remember locally gets replaced with a newer one when another 24 - hours pass. - -7.8. Some forward secrecy - - How should we remember addresses locally? If we convert them into - country-codes immediately, we will count them again if we see them - again. On the other hand, we don't really want to keep a list hanging - around of all IP addresses we've seen in the past 24 hours. - - Step one is that we should never write this stuff to disk. Keeping it - only in ram will make things somewhat better. Step two is to avoid - keeping any timestamps associated with it: rather than a rolling - 24-hour window, which would require us to remember the various times - we've seen that address, we can instead just throw out the whole list - every 24 hours and start over. 
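
The plan in Sections 7.6 through 7.8 can be sketched in a few lines. This is only an illustration, not Tor's implementation: the class name, the `lookup_country` callback, and the choice to keep "Unknown" out of the "Other" bucket are assumptions made for the example. It keeps addresses in RAM only, stores no timestamps, and discards the whole set when a summary is produced.

```python
from collections import Counter


class DailyCountrySummary:
    """Hypothetical in-memory tally of unique client addresses per country."""

    def __init__(self, lookup_country, threshold=5, is_bridge=False):
        # lookup_country: callable mapping an IP string to a country code,
        # or None when the address is missing from the geoip db (assumption).
        self.lookup_country = lookup_country
        self.threshold = threshold      # K from Section 7.7
        self.is_bridge = is_bridge
        self.seen = set()               # RAM only; no timestamps kept

    def note_client(self, ip):
        # A set makes repeat visits from the same address count once.
        self.seen.add(ip)

    def summarize_and_reset(self):
        counts = Counter()
        for ip in self.seen:
            counts[self.lookup_country(ip) or "Unknown"] += 1
        # Throw out the whole list instead of keeping a rolling window.
        self.seen = set()
        if self.is_bridge:
            # Bridges list a country even with a single user.
            return dict(counts)
        summary, other = {}, 0
        for country, n in counts.items():
            if country != "Unknown" and n < self.threshold:
                other += n              # lump small countries into "Other"
            else:
                summary[country] = n
        if other:
            summary["Other"] = other
        return summary
```

A caller would invoke `summarize_and_reset()` once per 24 hours of "awake" time and attach the result to the next extrainfo document it publishes.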
- - We could hash the addresses, and then compare hashes when deciding if - we've seen a given address before. We could even do keyed hashes. Or - Bloom filters. But if our goal is to defend against an adversary - who steals a copy of our ram while we're running and then does - guess-and-check on whatever blob we're keeping, we're in bad shape. - - We could drop the last octet of the IP address as soon as we see - it. That would cause us to undercount some users from cablemodem and - DSL networks that have a high density of Tor users. And it wouldn't - really help that much -- indeed, the extent to which it does help is - exactly the extent to which it makes our stats less useful. - - Other ideas? - diff --git a/doc/spec/proposals/127-dirport-mirrors-downloads.txt b/doc/spec/proposals/127-dirport-mirrors-downloads.txt deleted file mode 100644 index 72d6c0cb9f..0000000000 --- a/doc/spec/proposals/127-dirport-mirrors-downloads.txt +++ /dev/null @@ -1,155 +0,0 @@ -Filename: 127-dirport-mirrors-downloads.txt -Title: Relaying dirport requests to Tor download site / website -Author: Roger Dingledine -Created: 2007-12-02 -Status: Draft - -1. Overview - - Some countries and networks block connections to the Tor website. As - time goes by, this will remain a problem and it may even become worse. - - We have a big pile of mirrors (google for "Tor mirrors"), but few of - our users think to try a search like that. Also, many of these mirrors - might be automatically blocked since their pages contain words that - might cause them to get banned. And lastly, we can imagine a future - where the blockers are aware of the mirror list too. - - Here we describe a new set of URLs for Tor's DirPort that will relay - connections from users to the official Tor download site. 
Rather than - trying to cache a bunch of new Tor packages (which is a hassle in terms - of keeping them up to date, and a hassle in terms of drive space used), - we instead just proxy the requests directly to Tor's /dist page. - - Specifically, we should support - - GET /tor/dist/$1 - - and - - GET /tor/website/$1 - -2. Direct connections, one-hop circuits, or three-hop circuits? - - We could relay the connections directly to the download site -- but - this produces recognizable outgoing traffic on the bridge or cache's - network, which will probably surprise our nice volunteers. (Is this - a good enough reason to discard the direct connection idea?) - - Even if we don't do direct connections, should we do a one-hop - begindir-style connection to the mirror site (make a one-hop circuit - to it, then send a 'begindir' cell down the circuit), or should we do - a normal three-hop anonymized connection? - - If these mirrors are mainly bridges, doing either a direct or a one-hop - connection creates another way to enumerate bridges. That would argue - for three-hop. On the other hand, downloading a 10+ megabyte installer - through a normal Tor circuit can't be fun. But if you're already getting - throttled a lot because you're in the "relayed traffic" bucket, you're - going to have to accept a slow transfer anyway. So three-hop it is. - - Speaking of which, we would want to label this connection - as "relay" traffic for the purposes of rate limiting; see - connection_counts_as_relayed_traffic() and or_conn->client_used. This - will be a bit tricky though, because these connections will use the - bridge's guards. - -3. Scanning resistance - - One other goal we'd like to achieve, or at least not hinder, is making - it hard to scan large swaths of the Internet to look for responses - that indicate a bridge. - - In general this is a really hard problem, so we shouldn't demand to - solve it here. 
But we can note that some bridges should open their - DirPort (and offer this functionality), and others shouldn't. Then - some bridges provide a download mirror while others can remain - scanning-resistant. - -4. Integrity checking - - If we serve this stuff in plaintext from the bridge, anybody in between - the user and the bridge can intercept and modify it. The bridge can too. - - If we do an anonymized three-hop connection, the exit node can also - intercept and modify the exe it sends back. - - Are we setting ourselves up for rogue exit relays, or rogue bridges, - that trojan our users? - - Answer #1: Users need to do pgp signature checking. Not a very good - answer, a) because it's complex, and b) because they don't know the - right signing keys in the first place. - - Answer #2: The mirrors could exit from a specific Tor relay, using the - '.exit' notation. This would make connections a bit more brittle, but - would resolve the rogue exit relay issue. We could even round-robin - among several, and the list could be dynamic -- for example, all the - relays with an Authority flag that allow exits to the Tor website. - - Answer #3: The mirrors should connect to the main distribution site - via SSL. That way the exit relay can't influence anything. - - Answer #4: We could suggest that users only use trusted bridges for - fetching a copy of Tor. Hopefully they heard about the bridge from a - trusted source rather than from the adversary. - - Answer #5: What if the adversary is trawling for Tor downloads by - network signature -- either by looking for known bytes in the binary, - or by looking for "GET /tor/dist/"? It would be nice to encrypt the - connection from the bridge user to the bridge. And we can! The bridge - already supports TLS. Rather than initiating a TLS renegotiation after - connecting to the ORPort, the user should actually request a URL. 
Then - the ORPort can either pass the connection off as a linked conn to the - dirport, or renegotiate and become a Tor connection, depending on how - the client behaves. - -5. Linked connections: at what level should we proxy? - - Check out the connection_ap_make_link() function, as called from - directory.c. Tor clients use this to create a "fake" socks connection - back to themselves, and then they attach a directory request to it, - so they can launch directory fetches via Tor. We can piggyback on - this feature. - - We need to decide if we're going to be passing the bytes back and - forth between the web browser and the main distribution site, or if - we're going to be actually acting like a proxy (parsing out the file - they want, fetching that file, and serving it back). - - Advantages of proxying without looking inside: - - We don't need to build any sort of http support (including - continues, partial fetches, etc etc). - Disadvantages: - - If the browser thinks it's speaking http, are there easy ways - to pass the bytes to an https server and have everything work - correctly? At the least, it would seem that the browser would - complain about the cert. More generally, ssl wants to be negotiated - before the URL and headers are sent, yet we need to read the URL - and headers to know that this is a mirror request; so we have an - ordering problem here. - - Makes it harder to do caching later on, if we don't look at what - we're relaying. (It might be useful down the road to cache the - answers to popular requests, so we don't have to keep getting - them again.) - -6. Outstanding problems - - 1) HTTP proxies already exist. Why waste our time cloning one - badly? When we clone existing stuff, we usually regret it. - - 2) It's overbroad. We only seem to need a secure get-a-tor feature, - and instead we're contemplating building a locked-down HTTP proxy. - - 3) It's going to add a fair bit of complexity to our code. We do - not currently implement HTTPS. 
We'd need to refactor lots of the - low-level connection stuff so that "SSL" and "Cell-based" were no - longer synonymous. - - 4) It's still unclear how effective this proposal would be in - practice. You need to know that this feature exists, which means - somebody needs to tell you about a bridge (mirror) address and tell - you how to use it. And if they're doing that, they could (e.g.) tell - you about a gmail autoresponder address just as easily, and then you'd - get better authentication of the Tor program to boot. - diff --git a/doc/spec/proposals/128-bridge-families.txt b/doc/spec/proposals/128-bridge-families.txt deleted file mode 100644 index e5bdcf95cb..0000000000 --- a/doc/spec/proposals/128-bridge-families.txt +++ /dev/null @@ -1,64 +0,0 @@ -Filename: 128-bridge-families.txt -Title: Families of private bridges -Author: Roger Dingledine -Created: 2007-12-xx -Status: Dead - -1. Overview - - Proposal 125 introduced the basic notion of how bridge authorities, - bridge relays, and bridge users should behave. But it doesn't get into - the various mechanisms of how to distribute bridge relay addresses to - bridge users. - - One of the mechanisms we have in mind is called 'families of bridges'. - If a bridge user knows about only one private bridge, and that bridge - shuts off for the night or gets a new dynamic IP address, the bridge - user is out of luck and needs to re-bootstrap manually or wait and - hope it comes back. On the other hand, if the bridge user knows about - a family of bridges, then as long as one of those bridges is still - reachable his Tor client can automatically learn about where the - other bridges have gone. - - So in this design, a single volunteer could run multiple coordinated - bridges, or a group of volunteers could each run a bridge. We abstract - out the details of how these volunteers find each other and decide to - set up a family. - -2. Other notes. 
- - somebody needs to run a bridge authority - - it needs to have a torrc option to publish networkstatuses of its bridges - - it should also do reachability testing just of those bridges - - people ask for the bridge networkstatus by asking for a url that - contains a password. (it's safe to do this because of begin_dir.) - - so the bridge users need to know a) a password, and b) a bridge - authority line. - - the bridge users need to know the bridge authority line. - - the bridge authority needs to know the password. - -3. Current state - - I implemented a BridgePassword config option. Bridge authorities - should set it, and users who want to use those bridge authorities - should set it. - - Now there is a new directory URL "/tor/networkstatus-bridges" that - directory mirrors serve if BridgeAuthoritativeDir is set and it's a - begin_dir connection. It looks for the header - Authorization: Basic %s - where %s is the base-64 bridge password. - - I never got around to teaching clients how to set the header though, - so it may or may not work, and may or may not do what we ultimately want. - - I've marked this proposal dead; it really never should have left the - ideas/ directory. Somebody should pick it up sometime and finish the - design and implementation. - diff --git a/doc/spec/proposals/129-reject-plaintext-ports.txt b/doc/spec/proposals/129-reject-plaintext-ports.txt deleted file mode 100644 index 8080ff5b75..0000000000 --- a/doc/spec/proposals/129-reject-plaintext-ports.txt +++ /dev/null @@ -1,114 +0,0 @@ -Filename: 129-reject-plaintext-ports.txt -Title: Block Insecure Protocols by Default -Author: Kevin Bauer & Damon McCoy -Created: 2008-01-15 -Status: Closed -Implemented-In: 0.2.0.x - -Overview: - - Below is a proposal to mitigate insecure protocol use over Tor. 
- - This document 1) demonstrates the extent to which insecure protocols are - currently used within the Tor network, and 2) proposes a simple solution - to prevent users from unknowingly using these insecure protocols. By - insecure, we consider protocols that explicitly leak sensitive user names - and/or passwords, such as POP, IMAP, Telnet, and FTP. - -Motivation: - - As part of a general study of Tor use in 2006/2007 [1], we attempted to - understand what types of protocols are used over Tor. While we observed an - enormous volume of Web and Peer-to-peer traffic, we were surprised by the - number of insecure protocols that were used over Tor. For example, over an - 8 day observation period, we observed the following number of connections - over insecure protocols: - - POP and IMAP: 10,326 connections - Telnet: 8,401 connections - FTP: 3,788 connections - - Each of the above listed protocols exchanges user name and password - information in plain-text. As an upper bound, we could have observed - 22,515 user names and passwords. This observation echoes the reports of - a Tor router logging and posting e-mail passwords in August 2007 [2]. The - response from the Tor community has been to further educate users - about the dangers of using insecure protocols over Tor. However, we - recently repeated our Tor usage study from last year and noticed that the - trend in insecure protocol use has not declined. Therefore, we propose that - additional steps be taken to protect naive Tor users from inadvertently - exposing their identities (and even passwords) over Tor. - -Security Implications: - - This proposal is intended to improve Tor's security by limiting the - use of insecure protocols. - - Roger added: By adding these warnings for only some of the risky - behavior, users may do other risky behavior, not get a warning, and - believe that it is therefore safe. But overall, I think it's better - to warn for some of it than to warn for none of it. 
- -Specification: - - As an initial step towards mitigating the use of the above-mentioned - insecure protocols, we propose that the default ports for each respective - insecure service be blocked at the Tor client's socks proxy. These default - ports include: - - 23 - Telnet - 109 - POP2 - 110 - POP3 - 143 - IMAP - - Notice that FTP is not included in the proposed list of ports to block. This - is because FTP is often used anonymously, i.e., without any identifying - user name or password. - - This blocking scheme can be implemented as a set of flags in the client's - torrc configuration file: - - BlockInsecureProtocols 0|1 - WarnInsecureProtocols 0|1 - - When the warning flag is activated, a message should be displayed to - the user similar to the message given when Tor's socks proxy is given an IP - address rather than resolving a host name. - - We recommend that the default torrc configuration file block insecure - protocols and provide a warning to the user to explain the behavior. - - Finally, there are many popular web pages that do not offer secure - login features, such as MySpace, and it would be prudent to provide - additional rules to Privoxy to attempt to protect users from unknowingly - submitting their login credentials in plain-text. - -Compatibility: - - None, as the proposed changes are to be implemented in the client. - -References: - - [1] Shining Light in Dark Places: A Study of Anonymous Network Usage. - University of Colorado Technical Report CU-CS-1032-07. August 2007. - - [2] Rogue Nodes Turn Tor Anonymizer Into Eavesdropper's Paradise. - http://www.wired.com/politics/security/news/2007/09/embassy_hacks. - Wired. September 10, 2007. - -Implementation: - - Roger added this feature in - http://archives.seul.org/or/cvs/Jan-2008/msg00182.html - He also added a status event for Vidalia to recognize attempts to use - vulnerable-plaintext ports, so it can help the user understand what's - going on and how to fix it. 
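
The port check described in the Specification can be sketched as follows. This is a minimal illustration, not the code Roger committed: the function name and return shape are hypothetical, and the two keyword arguments simply mirror the proposed BlockInsecureProtocols and WarnInsecureProtocols flags.

```python
# Default ports from the Specification: Telnet, POP2, POP3, IMAP.
# FTP (21) is deliberately absent because it is often used anonymously.
PLAINTEXT_PORTS = frozenset({23, 109, 110, 143})


def check_socks_request(port, block_insecure=True, warn_insecure=True):
    """Return (allowed, warning) for a SOCKS request to the given port.

    block_insecure and warn_insecure stand in for the proposed
    BlockInsecureProtocols and WarnInsecureProtocols torrc flags.
    """
    if port not in PLAINTEXT_PORTS:
        return True, None
    warning = None
    if warn_insecure:
        warning = (
            "Port %d is commonly used by a protocol that sends "
            "user names and passwords in plain text." % port
        )
    # The request is refused only when blocking is enabled.
    return (not block_insecure), warning
```

With both flags at their recommended defaults, a request to port 110 is refused and yields a warning, while a request to port 80 or 21 passes through untouched.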
- -Next steps: - - a) Vidalia should learn to recognize this controller status event, - so we don't leave users out in the cold when we enable this feature. - - b) We should decide which ports to reject by default. The current - consensus is 23,109,110,143 -- the same set that we warn for now. - diff --git a/doc/spec/proposals/130-v2-conn-protocol.txt b/doc/spec/proposals/130-v2-conn-protocol.txt deleted file mode 100644 index 60e742a622..0000000000 --- a/doc/spec/proposals/130-v2-conn-protocol.txt +++ /dev/null @@ -1,184 +0,0 @@ -Filename: 130-v2-conn-protocol.txt -Title: Version 2 Tor connection protocol -Author: Nick Mathewson -Created: 2007-10-25 -Status: Closed -Implemented-In: 0.2.0.x - -Overview: - - This proposal describes the significant changes to be made in the v2 - Tor connection protocol. - - This proposal relates to other proposals as follows: - - It refers to and supersedes: - Proposal 124: Blocking resistant TLS certificate usage - It refers to aspects of: - Proposal 105: Version negotiation for the Tor protocol - - - In summary, The Tor connection protocol has been in need of a redesign - for a while. This proposal describes how we can add to the Tor - protocol: - - - A new TLS handshake (to achieve blocking resistance without - breaking backward compatibility) - - Version negotiation (so that future connection protocol changes - can happen without breaking compatibility) - - The actual changes in the v2 Tor connection protocol. - -Motivation: - - For motivation, see proposal 124. - -Proposal: - -0. Terminology - - The version of the Tor connection protocol implemented up to now is - "version 1". This proposal describes "version 2". - - "Old" or "Older" versions of Tor are ones not aware that version 2 - of this protocol exists; - "New" or "Newer" versions are ones that are. - - The connection initiator is referred to below as the Client; the - connection responder is referred to below as the Server. - -1. The revised TLS handshake. 
- - For motivation, see proposal 124. This is a simplified version of the - handshake that uses TLS's renegotiation capability in order to avoid - some of the extraneous steps in proposal 124. - - The Client connects to the Server and, as in ordinary TLS, sends a - list of ciphers. Older versions of Tor will send only ciphers from - the list: - TLS_DHE_RSA_WITH_AES_256_CBC_SHA - TLS_DHE_RSA_WITH_AES_128_CBC_SHA - SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA - SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA - Clients that support the revised handshake will send the recommended - list of ciphers from proposal 124, in order to emulate the behavior of - a web browser. - - If the server notices that the list of ciphers contains only ciphers - from this list, it proceeds with Tor's version 1 TLS handshake as - documented in tor-spec.txt. - - (The server may also notice cipher lists used by other implementations - of the Tor protocol (in particular, the BouncyCastle default cipher - list as used by some Java-based implementations), and whitelist them.) - - On the other hand, if the server sees a list of ciphers that could not - have been sent from an older implementation (because it includes other - ciphers, and does not match any known-old list), the server sends a - reply containing a single connection certificate, constructed as for - the link certificate in the v1 Tor protocol. The subject names in - this certificate SHOULD NOT have any strings to identify them as - coming from a Tor server. The server does not ask the client for - certificates. - - Old Servers will (mostly) ignore the cipher list and respond as in the v1 - protocol, sending back a two-certificate chain. - - After the Client gets a response from the server, it checks for the - number of certificates it received. If there are two certificates, - the client assumes a V1 connection and proceeds as in tor-spec.txt. - But if there is only one certificate, the client assumes a V2 or later - protocol and continues. 
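
The two decisions described above (the server inspecting the cipher list, the client counting certificates) can be sketched like this. The function names are hypothetical, and the sketch ignores the optional whitelisting of other known-old cipher lists such as the BouncyCastle default.

```python
# Cipher names that older (v1-only) Tor clients send, per the list above.
V1_CIPHERS = frozenset({
    "TLS_DHE_RSA_WITH_AES_256_CBC_SHA",
    "TLS_DHE_RSA_WITH_AES_128_CBC_SHA",
    "SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA",
    "SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA",
})


def server_handshake_version(client_ciphers):
    """Server side: pick the handshake based on the client's cipher list."""
    # Only ciphers from the old list: behave as a v1 server.
    if set(client_ciphers) <= V1_CIPHERS:
        return 1
    # Anything else could not have come from an old client.
    return 2


def client_handshake_version(num_certificates):
    """Client side: infer the protocol from the number of certificates."""
    if num_certificates == 2:
        return 1    # two-certificate chain: proceed as in tor-spec.txt
    if num_certificates == 1:
        return 2    # single certificate: v2 or later, renegotiate next
    raise ValueError("unexpected certificate count: %d" % num_certificates)
```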
- - At this point, the client has established a TLS connection with the - server, but the parties have not been authenticated: the server hasn't - sent its identity certificate, and the client hasn't sent any - certificates at all. To fix this, the client begins a TLS session - renegotiation. This time, the server continues with two certificates - as usual, and asks for certificates so that the client will send - certificates of its own. Because the TLS connection has been - established, all of this is encrypted. (The certificate sent by the - server in the renegotiated connection need not be the same as that - sent in the original connection.) - - The server MUST NOT write any data until the client has renegotiated. - - Once the renegotiation is finished, the server and client check one - another's certificates as in V1. Now they are mutually authenticated. - -1.1. Revised TLS handshake: implementation notes. - - It isn't so easy to adjust server behavior based on the client's - ciphersuite list. Here's how we can do it using OpenSSL. This is a - bit of an abuse of the OpenSSL APIs, but it's the best we can do, and - we won't have to do it forever. - - We can use OpenSSL's SSL_set_info_callback() to register a function to - be called when the state changes. The type/state tuple of - SSL_CB_ACCEPT_LOOP/SSL3_ST_SW_SRVR_HELLO_A - happens when we have completely parsed the client hello, and are about - to send a response. From this callback, we can check the cipherlist - and act accordingly: - - * If the ciphersuite list indicates a v1 protocol, we set the - verify mode to SSL_VERIFY_NONE with a callback (so we get - certificates). - - * If the ciphersuite list indicates a v2 protocol, we set the - verify mode to SSL_VERIFY_NONE with no callback (so we get - no certificates) and set the SSL_MODE_NO_AUTO_CHAIN flag (so that - we send only 1 certificate in the response). 
- - Once the handshake is done, the server clears the - SSL_MODE_NO_AUTO_CHAIN flag and sets the callback as for the V1 - protocol. It then starts reading. - - The other problem to take care of is missing ciphers and OpenSSL's - cipher sorting algorithms. The two main issues are a) OpenSSL doesn't - support some of the default ciphers that Firefox advertises, and b) - OpenSSL sorts the list of ciphers it offers in a different way than - Firefox sorts them, so unless we fix that Tor will still look different - than Firefox. - [XXXX more on this.] - - -1.2. Compatibility for clients using libraries less hackable than OpenSSL. - - As discussed in proposal 105, servers advertise which protocol - versions they support in their router descriptors. Clients can simply - behave as v1 clients when connecting to servers that do not support - link version 2 or higher, and as v2 clients when connecting to servers - that do support link version 2 or higher. - - (Servers can't use this strategy because we do not assume that servers - know one another's capabilities when connecting.) - -2. Version negotiation. - - Version negotiation proceeds as described in proposal 105, except as - follows: - - * Version negotiation only happens if the TLS handshake as described - above completes. - - * The TLS renegotiation must be finished before the client sends a - VERSIONS cell; the server sends its VERSIONS cell in response. - - * The VERSIONS cell uses the following variable-width format: - Circuit [2 octets; set to 0] - Command [1 octet; set to 7 for VERSIONS] - Length [2 octets; big-endian] - Data [Length bytes] - - The Data in the cell is a series of big-endian two-byte integers. - - * It is not allowed to negotiate V1 connections once the v2 protocol - has been used. If this happens, Tor instances should close the - connection. - -3. 
The rest of the "v2" protocol - - Once a v2 protocol has been negotiated, NETINFO cells are exchanged - as in proposal 105, and communications begin as per tor-spec.txt. - Until NETINFO cells have been exchanged, the connection is not open. - - diff --git a/doc/spec/proposals/131-verify-tor-usage.txt b/doc/spec/proposals/131-verify-tor-usage.txt deleted file mode 100644 index d3c6efe75a..0000000000 --- a/doc/spec/proposals/131-verify-tor-usage.txt +++ /dev/null @@ -1,148 +0,0 @@ -Filename: 131-verify-tor-usage.txt -Title: Help users to verify they are using Tor -Author: Steven J. Murdoch -Created: 2008-01-25 -Status: Needs-Revision - -Overview: - - Websites for checking whether a user is accessing them via Tor are a - very helpful aid to configuring web browsers correctly. Existing - solutions have both false positives and false negatives when - checking if Tor is being used. This proposal will discuss how to - modify Tor so as to make testing more reliable. - -Motivation: - - Currently deployed websites for detecting Tor use work by comparing - the client IP address for a request with a list of known Tor nodes. - This approach is generally effective, but suffers from both false - positives and false negatives. - - If a user has a Tor exit node installed, or just happens to have - been allocated an IP address previously used by a Tor exit node, any - web requests will be incorrectly flagged as coming from Tor. If any - customer of an ISP which implements a transparent proxy runs an exit - node, all other users of the ISP will be flagged as Tor users. - - Conversely, if the exit node chosen by a Tor user has not yet been - recorded by the Tor checking website, requests will be incorrectly - flagged as not coming via Tor. - - The only reliable way to tell whether Tor is being used or not is for - the Tor client to flag this to the browser. 
- -Proposal: - - A DNS name should be registered and point to an IP address - controlled by the Tor project and likely to remain so for the - useful lifetime of a Tor client. A web server should be placed - at this IP address. - - Tor should be modified to treat requests to port 80, at the - specified DNS name or IP address specially. Instead of opening a - circuit, it should respond to an HTTP request with a helpful web - page: - - - If the request to open a connection was to the domain name, the web - page should state that Tor is working properly. - - If the request was to the IP address, the web page should state - that there is a DNS-leakage vulnerability. - - If the request goes through to the real web server, the page - should state that Tor has not been set up properly. - -Extensions: - - Identifying proxy server: - - If needed, other applications between the web browser and Tor (e.g. - Polipo and Privoxy) could piggyback on the same mechanism to flag - whether they are in use. All three possible web pages should include - a machine-readable placeholder, into which another program could - insert their own message. - - For example, the webpage returned by Tor to indicate a successful - configuration could include the following HTML: - <h2>Connection chain</h2> - <ul> - <li>Tor 0.1.2.14-alpha</li> - <!-- Tor Connectivity Check: success --> - </ul> - - When the proxy server observes this string, in response to a request - for the Tor connectivity check web page, it would prepend its own - message, resulting in the following being returned to the web - browser: - <h2>Connection chain</h2> - <ul> - <li>Tor 0.1.2.14-alpha</li> - <li>Polipo version 1.0.4</li> - <!-- Tor Connectivity Check: success --> - </ul> - - Checking external connectivity: - - If Tor intercepts a request, and returns a response itself, the user - will not actually confirm whether Tor is able to build a successful - circuit. 
It may then be advantageous to include an image in the web - page which is loaded from a different domain. If this is able to be - loaded then the user will know that external connectivity through - Tor works. - - Automatic Firefox Notification: - - All forms of the website should return valid XHTML and have a - hidden link with an id attribute "TorCheckResult" and a target - property that can be queried to determine the result. For example, - a hidden link would convey success like this: - - <a id="TorCheckResult" target="success" href="/"></a> - - failure like this: - - <a id="TorCheckResult" target="failure" href="/"></a> - - and DNS leaks like this: - - <a id="TorCheckResult" target="dnsleak" href="/"></a> - - Firefox extensions such as Torbutton would then be able to - issue an XMLHttpRequest for the page and query the result - with resultXML.getElementById("TorCheckResult").target - to automatically report the Tor status to the user when - they first attempt to enable Tor activity, or whenever - they request a check from the extension preferences window. - - If the check website is to be themed with heavy graphics and/or - extensive documentation, the check result itself should be - contained in a separate lightweight iframe that extensions can - request via an alternate url. - -Security and resiliency implications: - - What attacks are possible? - - If the IP address used for this feature moves there will be two - consequences: - - A new website at this IP address will remain inaccessible over - Tor - - Tor users who are leaking DNS will be informed that Tor is not - working, rather than that it is active but leaking DNS - We should thus attempt to find an IP address which we reasonably - believe can remain static. - -Open issues: - - If a Tor version which does not support this extra feature is used, - the webpage returned will indicate that Tor is not being used. Can - this be safely fixed? 
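
The hidden-link convention from the "Automatic Firefox Notification" extension can be read by any client that parses the returned XHTML, not only by a browser extension. The sketch below is an assumption-laden illustration: the function name is hypothetical, and it treats any target value other than the three listed in the proposal as missing.

```python
import xml.etree.ElementTree as ET

# The three result values named in the proposal.
KNOWN_RESULTS = {"success", "failure", "dnsleak"}


def tor_check_result(xhtml_text):
    """Return the TorCheckResult target from a check page, or None."""
    root = ET.fromstring(xhtml_text)
    # Attributes are not namespaced in XHTML, so matching on "id"
    # works whether or not the page declares the XHTML namespace.
    for element in root.iter():
        if element.get("id") == "TorCheckResult":
            target = element.get("target")
            return target if target in KNOWN_RESULTS else None
    return None
```

Because the page must be valid XHTML for this to parse, the sketch also illustrates why the proposal asks for well-formed markup on all three page variants.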
- -Related work: - - The proposed mechanism is very similar to config.privoxy.org. The - most significant difference is that if the web browser is - misconfigured, Tor will only get an IP address. Even in this case, - Tor should be able to respond with a webpage to notify the user of how - to fix the problem. This also implies that Tor must be told of the - special IP address, and so must be effectively permanent. diff --git a/doc/spec/proposals/132-browser-check-tor-service.txt b/doc/spec/proposals/132-browser-check-tor-service.txt deleted file mode 100644 index 6132e5d060..0000000000 --- a/doc/spec/proposals/132-browser-check-tor-service.txt +++ /dev/null @@ -1,145 +0,0 @@ -Filename: 132-browser-check-tor-service.txt -Title: A Tor Web Service For Verifying Correct Browser Configuration -Author: Robert Hogan -Created: 2008-03-08 -Status: Draft - -Overview: - - Tor should operate a primitive web service on the loopback network device - that tests the operation of user's browser, privacy proxy and Tor client. - The tests are performed by serving unique, randomly generated elements in - image URLs embedded in static HTML. The images are only displayed if the DNS - and HTTP requests for them are routed through Tor, otherwise the 'alt' text - may be displayed. The proposal assumes that 'alt' text is not displayed on - all browsers so suggests that text and links should accompany each image - advising the user on next steps in case the test fails. - - The service is primarily for the use of controllers, since presumably users - aren't going to want to edit text files and then type something exotic like - 127.0.0.1:9999 into their address bar. In the main use case the controller - will have configured the actual port for the webservice so will know where - to direct the request. It would also be the responsibility of the controller - to ensure the webservice is available, and tor is running, before allowing - the user to access the page through their browser. 
- -Motivation: - - This is a complementary approach to proposal 131. It overcomes some of the - limitations of the approach described in proposal 131: reliance - on a permanent, real IP address and compatibility with older versions of - Tor. Unlike 131, it is not as useful to Tor users who are not running a - controller. - -Objective: - - Provide a reliable means of helping users to determine if their Tor - installation, privacy proxy and browser are properly configured for - anonymous browsing. - -Proposal: - - When configured to do so, Tor should run a basic web service available - on a configured port on 127.0.0.1. The purpose of this web service is to - serve a number of basic test images that will allow the user to determine - if their browser is properly configured and that Tor is working normally. - - The service can consist of a single web page with two columns. The left - column contains images, the right column contains advice on what the - display/non-display of the column means. - - The rest of this proposal assumes that the service is running on port - 9999. The port should be configurable, and configuring the port enables the - service. The service must run on 127.0.0.1. - - In all the examples below [uniquesessionid] refers to a random, base64 - encoded string that is unique to the URL it is contained in. Tor only ever - stores the most recently generated [uniquesessionid] for each URL, storing 3 - in total. Tor should generate a [uniquesessionid] for each of the test URLs - below every time a HTTP GET is received at 127.0.0.1:9999 for index.htm. - - The most suitable image for each test case is an implementation decision. - Tor will need to store and serve images for the first and second test - images, and possibly the third (see 'Open Issues'). - - 1. 
DNS Request Test Image - - This is a HTML element embedded in the page served by Tor at - http://127.0.0.1:9999: - - <IMG src="http://[uniquesessionid]:9999/torlogo.jpg" alt="If you can see - this text, your browser's DNS requests are not being routed through Tor." - width="200" height="200" align="middle" border="2"> - - If the browser's DNS request for [uniquesessionid] is routed through Tor, - Tor will intercept the request and return 127.0.0.1 as the resolved IP - address. This will shortly be followed by a HTTP request from the browser - for http://127.0.0.1:9999/torlogo.jpg. This request should be served with - the appropriate image. - - If the browser's DNS request for [uniquesessionid] is not routed through Tor - the browser may display the 'alt' text specified in the html element. The - HTML served by Tor should also contain text accompanying the image to advise - users what it means if they do not see an image. It should also provide a - link to click that provides information on how to remedy the problem. This - behaviour also applies to the images described in 2. and 3. below, so should - be assumed there as well. - - - 2. Proxy Configuration Test Image - - This is a HTML element embedded in the page served by Tor at - http://127.0.0.1:9999: - - <IMG src="http://torproject.org/[uniquesessionid].jpg" alt="If you can see - this text, your browser is not configured to work with Tor." width="200" - height="200" align="middle" border="2"> - - If the HTTP request for the resource [uniquesessionid].jpg is received by - Tor it will serve the appropriate image in response. It should serve this - image itself, without attempting to retrieve anything from the Internet. - - If Tor can identify the name of the proxy application requesting the - resource then it could store and serve an image identifying the proxy to the - user. - - 3. 
Tor Connectivity Test Image - - This is a HTML element embedded in the page served by Tor at - http://127.0.0.1:9999: - - <IMG src="http://torproject.org/[uniquesessionid]-torlogo.jpg" alt="If you - can see this text, your Tor installation cannot connect to the Internet." - width="200" height="200" align="middle" border="2"> - - The referenced image should actually exist on the Tor project website. If - Tor receives the request for the above resource it should remove the random - base64 encoded digest from the request (i.e. [uniquesessionid]-) and attempt - to retrieve the real image. - - Even on a fully operational Tor client this test may not always succeed. The - user should be advised that one or more attempts to retrieve this image may - be necessary to confirm a genuine problem. - -Open Issues: - - The final connectivity test relies on an externally maintained resource, if - this resource becomes unavailable the connectivity test will always fail. - Either the text accompanying the test should advise of this possibility or - Tor clients should be advised of the location of the test resource in the - main network directory listings. - - Any number of misconfigurations may make the web service unreachable, it is - the responsibility of the user's controller to recognize these and assist - the user in eliminating them. Tor can mitigate against the specific - misconfiguration of routing HTTP traffic to 127.0.0.1 to Tor itself by - serving such requests through the SOCKS port as well as the configured web - service report. - - Now Tor is inspecting the URLs requested on its SOCKS port and 'dropping' - them. It already inspects for raw IP addresses (to warn of DNS leaks) but - maybe the behaviour proposed here is qualitatively different. Maybe this is - an unwelcome precedent that can be used to beat the project over the head in - future. 
Or maybe it's not such a bad thing, Tor is merely attempting to make - normally invalid resource requests valid for a given purpose. - diff --git a/doc/spec/proposals/133-unreachable-ors.txt b/doc/spec/proposals/133-unreachable-ors.txt deleted file mode 100644 index a1c2dd8549..0000000000 --- a/doc/spec/proposals/133-unreachable-ors.txt +++ /dev/null @@ -1,128 +0,0 @@ -Filename: 133-unreachable-ors.txt -Title: Incorporate Unreachable ORs into the Tor Network -Author: Robert Hogan -Created: 2008-03-08 -Status: Draft - -Overview: - - Propose a scheme for harnessing the bandwidth of ORs who cannot currently - participate in the Tor network because they can only make outbound - TCP connections. - -Motivation: - - Restrictive local and remote firewalls are preventing many willing - candidates from becoming ORs on the Tor network.These - ORs have a casual interest in joining the network but their operator is not - sufficiently motivated or adept to complete the necessary router or firewall - configuration. The Tor network is losing out on their bandwidth. At the - moment we don't even know how many such 'candidate' ORs there are. - - -Objective: - - 1. Establish how many ORs are unable to qualify for publication because - they cannot establish that their ORPort is reachable. - - 2. Devise a method for making such ORs available to clients for circuit - building without prejudicing their anonymity. - -Proposal: - - ORs whose ORPort reachability testing fails a specified number of - consecutive times should: - 1. Enlist themselves with the authorities setting a 'Fallback' flag. This - flag indicates that the OR is up and running but cannot connect to - itself. - 2. Open an orconn with all ORs whose fingerprint begins with the same - byte as their own. The management of this orconn will be transferred - entirely to the OR at the other end. - 2. 
The fallback OR should update its router status to contain the - 'Running' flag if it has managed to open an orconn with 3/4 of the ORs - with an FP beginning with the same byte as its own. - - Tor ORs who are contacted by fallback ORs requesting an orconn should: - 1. Accept the orconn until they have reached a defined limit of orconn - connections with fallback ORs. - 2. Only accept such orconn requests from listed fallback ORs who - have an FP beginning with the same byte as their own. - - Tor clients can include fallback ORs in the network by doing the - following: - 1. When building a circuit, observe the fingerprint of each node they - wish to connect to. - 2. When randomly selecting a node from the set of all eligible nodes, - add all published, running fallback nodes to the set where the first - byte of the fingerprint matches the previous node in the circuit. - -Anonymity Implications: - - At least some, and possibly all, nodes on the network will have a set - of nodes that only they and a few others can build circuits on. - - 1. This means that fallback ORs might be unsuitable for use as middlemen - nodes, because if the exit node is the attacker it knows that the - number of nodes that could be the entry guard in the circuit is - reduced to roughly 1/256th of the network, or worse 1/256th of all - nodes listed as Guards. For the same reason, fallback nodes would - appear to be unsuitable for two-hop circuits. - - 2. This is not a problem if fallback ORs are always exit nodes. If - the fallback OR is an attacker it will not be able to reduce the - set of possible nodes for the entry guard any further than a normal, - published OR. - -Possible Attacks/Open Issues: - - 1. Gaming Node Selection - Does running a fallback OR customized for a specific set of published ORs - improve an attacker's chances of seeing traffic from that set of published - ORs?
Would such a strategy be any more effective than running published - ORs with other 'attractive' properties? - - 2. DOS Attack - An attacker could prevent all other legitimate fallback ORs with a - given byte-1 in their FP from functioning by running 20 or 30 fallback ORs - and monopolizing all available fallback slots on the published ORs. - This same attacker would then be in a position to monopolize all the - traffic of the fallback ORs on that byte-1 network segment. I'm not sure - what this would allow such an attacker to do. - - 4. Circuit-Sniffing - An observer watching exit traffic from a fallback server will know that the - previous node in the circuit is one of a very small, identifiable - subset of the total ORs in the network. To establish the full path of the - circuit they would only have to watch the exit traffic from the fallback - OR and all the traffic from the 20 or 30 ORs it is likely to be connected - to. This means it is substantially easier to establish all members of a - circuit which has a fallback OR as an exit (sniff and analyse 10-50 (i.e. - 1/256 varying) + 1 ORs) rather than a normal published OR (sniff all 2560 - or so ORs on the network). The same mechanism that allows the client to - expect a specific fallback OR to be available from a specific published OR - allows an attacker to prepare his ground. - - Mitigant: - In terms of the resources and access required to monitor 2000 to 3000 - nodes, the effort of the adversary is not significantly diminished when he - is only interested in 20 or 30. It is hard to see how an adversary who can - obtain access to a randomly selected portion of the Tor network would face - any new or qualitatively different obstacles in attempting to access much - of the rest of it. - - -Implementation Issues: - - The number of ORs this proposal would add to the Tor network is not known. - This is because there is no mechanism at present for recording unsuccessful - attempts to become an OR. 
If the proposal is considered promising it may be - worthwhile to issue an alpha series release where candidate ORs post a - primitive fallback descriptor to the authority directories. This fallback - descriptor would not contain any other flag that would make it eligible for - selection by clients. It would act solely as a means of sizing the number of - Tor instances that try and fail to become ORs. - - The upper limit on the number of orconns from fallback ORs a normal, - published OR should be willing to accept is an open question. Is one - hundred, mostly idle, such orconns too onerous? - diff --git a/doc/spec/proposals/134-robust-voting.txt b/doc/spec/proposals/134-robust-voting.txt deleted file mode 100644 index c5dfb3b47f..0000000000 --- a/doc/spec/proposals/134-robust-voting.txt +++ /dev/null @@ -1,123 +0,0 @@ -Filename: 134-robust-voting.txt -Title: More robust consensus voting with diverse authority sets -Author: Peter Palfrader -Created: 2008-04-01 -Status: Rejected - -History: - 2009 May 27: Added note on rejecting this proposal -- Nick - -Overview: - - A means to arrive at a valid directory consensus even when voters - disagree on who is an authority. - - -Motivation: - - Right now there are about five authoritative directory servers in the - Tor network, tho this number is expected to rise to about 15 eventually. - - Adding a new authority requires synchronized action from all operators of - directory authorities so that at any time during the update at least half of - all authorities are running and agree on who is an authority. The latter - requirement is there so that the authorities can arrive at a common - consensus: Each authority builds the consensus based on the votes from - all authorities it recognizes, and so a different set of recognized - authorities will lead to a different consensus document. 
- - -Objective: - - The modified voting procedure outlined in this proposal obsoletes the - requirement for most authorities to exactly agree on the list of - authorities. - - -Proposal: - - The vote document each authority generates contains a list of - authorities recognized by the generating authority. This will be - a list of authority identity fingerprints. - - Authorities will accept votes from and serve/mirror votes also for - authorities they do not recognize. (Votes contain the signing, - authority key, and the certificate linking them so they can be - verified even without knowing the authority beforehand.) - - Before building the consensus we will check which votes to use for - building: - - 1) We build a directed graph of which authority/vote recognizes - whom. - 2) (Parts of the graph that aren't reachable, directly or - indirectly, from any authorities we recognize can be discarded - immediately.) - 3) We find the largest fully connected subgraph. - (Should there be more than one subgraph of the same size there - needs to be some arbitrary ordering so we always pick the same. - E.g. pick the one who has the smaller (XOR of all votes' digests) - or something.) - 4) If we are part of that subgraph, great. This is the list of - votes we build our consensus with. - 5) If we are not part of that subgraph, remove all the nodes that - are part of it and go to 3. - - Using this procedure authorities that are updated to recognize a - new authority will continue voting with the old group until a - sufficient number has been updated to arrive at a consensus with - the recently added authority. - - In fact, the old set of authorities will probably be voting among - themselves until all but one has been updated to recognize the - new authority. Then which set of votes is used for consensus - building depends on which of the two equally large sets gets - ordered before the other in step (3) above. 
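The vote-selection procedure in steps 1 to 5 above can be sketched as follows. This is an illustration under stated assumptions, not the proposal's implementation: `recognizes` is a hypothetical mapping from each authority that cast a vote to the set of authorities its vote lists, the clique search is brute force (adequate for a few dozen authorities at most), and ties are broken by lexicographic order of authority names rather than by the XOR of vote digests that the proposal suggests.

```python
from itertools import combinations


def reachable_from(start, recognizes):
    """Step 2: every voting authority reachable by following recognition edges."""
    seen = set()
    stack = list(start)
    while stack:
        node = stack.pop()
        if node in seen or node not in recognizes:
            continue  # already visited, or this authority cast no vote
        seen.add(node)
        stack.extend(recognizes[node])
    return seen


def is_clique(group, recognizes):
    """True if every pair in `group` recognizes each other."""
    return all(b in recognizes[a] and a in recognizes[b]
               for a, b in combinations(group, 2))


def largest_clique(nodes, recognizes):
    """Step 3: largest fully connected subgraph, with a deterministic tie-break."""
    ordered = sorted(nodes)
    for size in range(len(ordered), 0, -1):
        for group in combinations(ordered, size):
            if is_clique(group, recognizes):
                return set(group)
    return set()


def votes_for_consensus(me, recognizes):
    """Steps 3 to 5: peel off the largest clique until we find one containing us."""
    candidates = reachable_from(recognizes[me] | {me}, recognizes)
    while candidates:
        clique = largest_clique(candidates, recognizes)
        if me in clique:
            return clique
        candidates -= clique
    return set()
```

With four old authorities A to D, of which only A and B have been updated to recognize a new authority E, votes_for_consensus("A", ...) still returns the old group of four, which matches the behaviour described in the paragraph above.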
 - - It is necessary to continue with the process in (5) even if we - are not in the largest subgraph. Otherwise one rogue authority - could create a number of extra votes (by new authorities) so that - everybody stops at 5 and no consensus is built, even though it would - be trusted by all clients. - - -Anonymity Implications: - - The author does not believe this proposal to have anonymity - implications. - - -Possible Attacks/Open Issues/Some thinking required: - - Q: Can a number (less than or exactly half) of the authorities cause an honest - authority to vote for "their" consensus rather than the one that would - result were all authorities taken into account? - - - Q: Can a set of votes from external authorities, i.e. of whom we trust either - none or at least not all, cause us to change the set of consensus makers we - pick? - A: Yes, if other authorities decide they would rather build a consensus with them - then they'll be thrown out in step 3. But that's ok since those other - authorities will never vote with us anyway. - If we trust none of them then we throw them out even sooner, so no harm done. - - Q: Can this ever force us to build a consensus with authorities we do not - recognize? - A: No, we can never build a fully connected set with them in step 3. - ------------------------------- - -I'm rejecting this proposal as insecure. - -Suppose that we have a clique of size N, and M hostile members in the -clique. If these hostile members stop declaring trust for up to M-1 -good members of the clique, the clique with the hostile members -in it will be larger than the one without them. - -The M hostile members will constitute a majority of this new clique -when M > (N-(M-1)) / 2, or when M > (N + 1) / 3. This breaks our -requirement that an adversary must compromise a majority of authorities -in order to control the consensus.
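The step from the first inequality to the second in the rejection note is plain algebra: M > (N-(M-1)) / 2 is equivalent to 2M > N - M + 1, that is 3M > N + 1, or M > (N + 1) / 3. The short Python check below confirms this numerically; the function name is hypothetical and the code only restates the arithmetic of the note.

```python
def min_hostile_for_majority(n):
    """Smallest M such that M hostile members form a majority of the clique
    that remains after they stop trusting M-1 honest members of a size-n clique."""
    for m in range(1, n + 1):
        new_clique_size = n - (m - 1)
        if 2 * m > new_clique_size:
            return m
    return None


for n in (7, 15):
    print(n, min_hostile_for_majority(n), n // 2 + 1)
```

For a clique of 15 authorities the attack needs 6 hostile members, whereas a simple majority would require 8.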
- --- Nick diff --git a/doc/spec/proposals/135-private-tor-networks.txt b/doc/spec/proposals/135-private-tor-networks.txt deleted file mode 100644 index 19ef68b7b1..0000000000 --- a/doc/spec/proposals/135-private-tor-networks.txt +++ /dev/null @@ -1,281 +0,0 @@ -Filename: 135-private-tor-networks.txt -Title: Simplify Configuration of Private Tor Networks -Author: Karsten Loesing -Created: 29-Apr-2008 -Status: Closed -Target: 0.2.1.x -Implemented-In: 0.2.1.2-alpha - -Change history: - - 29-Apr-2008 Initial proposal for or-dev - 19-May-2008 Included changes based on comments by Nick to or-dev and - added a section for test cases. - 18-Jun-2008 Changed testing-network-only configuration option names. - -Overview: - - Configuring a private Tor network has become a time-consuming and - error-prone task with the introduction of the v3 directory protocol. In - addition to that, operators of private Tor networks need to set an - increasing number of non-trivial configuration options, and it is hard - to keep FAQ entries describing this task up-to-date. In this proposal we - (1) suggest to (optionally) accelerate timing of the v3 directory voting - process and (2) introduce an umbrella config option specifically aimed at - creating private Tor networks. - -Design: - - 1. Accelerate Timing of v3 Directory Voting Process - - Tor has reasonable defaults for setting up a large, Internet-scale - network with comparably high latencies and possibly wrong server clocks. - However, those defaults are bad when it comes to quickly setting up a - private Tor network for testing, either on a single node or LAN (things - might be different when creating a test network on PlanetLab or - something). Some time constraints should be made configurable for private - networks. The general idea is to accelerate everything that has to do - with propagation of directory information, but nothing else, so that a - private network is available as soon as possible. 
(As a possible - safeguard, changing these configuration values could be made dependent on - the umbrella configuration option introduced in 2.) - - 1.1. Initial Voting Schedule - - When a v3 directory does not know any consensus, it assumes an initial, - hard-coded VotingInterval of 30 minutes, VoteDelay of 5 minutes, and - DistDelay of 5 minutes. This is important for multiple, simultaneously - restarted directory authorities to meet at a common time and create an - initial consensus. Unfortunately, this means that it may take up to half - an hour (or even more) for a private Tor network to bootstrap. - - We propose to make these three time constants configurable (note that - V3AuthVotingInterval, V3AuthVoteDelay, and V3AuthDistDelay do not have an - effect on the _initial_ voting schedule, but only on the schedule that a - directory authority votes for). This can be achieved by introducing three - new configuration options: TestingV3AuthInitialVotingInterval, - TestingV3AuthInitialVoteDelay, and TestingV3AuthInitialDistDelay. - - As first safeguards, Tor should only accept configuration values for - TestingV3AuthInitialVotingInterval that divide evenly into the default - value of 30 minutes. The effect is that even if people misconfigured - their directory authorities, they would meet at the default values at the - latest. The second safeguard is to allow configuration only when the - umbrella configuration option TestingTorNetwork is set. - - 1.2. Immediately Provide Reachability Information (Running flag) - - The default behavior of a directory authority is to provide the Running - flag only after the authority is available for at least 30 minutes. The - rationale is that before that time, an authority simply cannot deliver - useful information about other running nodes. But for private Tor - networks this may be different. 
This is currently implemented in the code - as: - - /** If we've been around for less than this amount of time, our - * reachability information is not accurate. */ - #define DIRSERV_TIME_TO_GET_REACHABILITY_INFO (30*60) - - There should be another configuration option - TestingAuthDirTimeToLearnReachability with a default value of 30 minutes - that can be changed when running testing Tor networks, e.g. to 0 minutes. - The configuration value would simply replace the quoted constant. Again, - changing this option could be safeguarded by requiring the umbrella - configuration option TestingTorNetwork to be set. - - 1.3. Reduce Estimated Descriptor Propagation Time - - Tor currently assumes that it takes up to 10 minutes until router - descriptors are propagated from the authorities to directory caches. - This is not very useful for private Tor networks, and we want to be able - to reduce this time, so that clients can download router descriptors in a - timely manner. - - /** Clients don't download any descriptor this recent, since it will - * probably not have propagated to enough caches. */ - #define ESTIMATED_PROPAGATION_TIME (10*60) - - We suggest to introduce a new config option - TestingEstimatedDescriptorPropagationTime which defaults to 10 minutes, - but that can be set to any lower non-negative value, e.g. 0 minutes. The - same safeguards as in 1.2 could be used here, too. - - 2. Umbrella Option for Setting Up Private Tor Networks - - Setting up a private Tor network requires a number of specific settings - that are not required or useful when running Tor in the public Tor - network. Instead of writing down these options in a FAQ entry, there - should be a single configuration option, e.g. TestingTorNetwork, that - changes all required settings at once. Newer Tor versions would keep the - set of configuration options up-to-date. It should still remain possible - to manually overwrite the settings that the umbrella configuration option - affects. 
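The interaction between the umbrella option and the options it governs can be sketched as follows. This is a simplified model, not Tor's configuration code: it covers only the two options from sections 1.2 and 1.3, takes the example testing value of 0 for both, and the function name effective_options is hypothetical.

```python
# Defaults in seconds, taken from the constants quoted in sections 1.2 and 1.3.
NORMAL_DEFAULTS = {
    "TestingAuthDirTimeToLearnReachability": 30 * 60,
    "TestingEstimatedDescriptorPropagationTime": 10 * 60,
}
# Assumed testing-network defaults (the "e.g. 0 minutes" values from the text).
TESTING_DEFAULTS = {
    "TestingAuthDirTimeToLearnReachability": 0,
    "TestingEstimatedDescriptorPropagationTime": 0,
}


def effective_options(user_options):
    """Apply umbrella defaults first, then explicit user settings on top."""
    testing = user_options.get("TestingTorNetwork", 0) == 1
    result = dict(TESTING_DEFAULTS if testing else NORMAL_DEFAULTS)
    for name, value in user_options.items():
        if name == "TestingTorNetwork":
            continue
        if name in NORMAL_DEFAULTS and not testing:
            # Safeguard: testing-only options need the umbrella option.
            raise ValueError(name + " may only be changed in testing Tor networks")
        result[name] = value
    result["TestingTorNetwork"] = 1 if testing else 0
    return result
```

Explicit settings win over the umbrella defaults, and the safeguard refuses testing-only options when TestingTorNetwork is unset.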
- - The following configuration options are set by TestingTorNetwork: - - - ServerDNSAllowBrokenResolvConf 1 - Ignore the situation that private relays are not aware of any name - servers. - - - DirAllowPrivateAddresses 1 - Allow router descriptors containing private IP addresses. - - - EnforceDistinctSubnets 0 - Permit building circuits with relays in the same subnet. - - - AssumeReachable 1 - Omit self-testing for reachability. - - - AuthDirMaxServersPerAddr 0 - - AuthDirMaxServersPerAuthAddr 0 - Permit an unlimited number of nodes on the same IP address. - - - ClientDNSRejectInternalAddresses 0 - Believe in DNS responses resolving to private IP addresses. - - - ExitPolicyRejectPrivate 0 - Allow exiting to private IP addresses. (This one is a matter of - taste---it might be dangerous to make this a default in a private - network, although people setting up private Tor networks should know - what they are doing.) - - - V3AuthVotingInterval 5 minutes - - V3AuthVoteDelay 20 seconds - - V3AuthDistDelay 20 seconds - Accelerate voting schedule after first consensus has been reached. - - - TestingV3AuthInitialVotingInterval 5 minutes - - TestingV3AuthInitialVoteDelay 20 seconds - - TestingV3AuthInitialDistDelay 20 seconds - Accelerate initial voting schedule until first consensus is reached. - - - TestingAuthDirTimeToLearnReachability 0 minutes - Consider routers as Running from the start of running an authority. - - - TestingEstimatedDescriptorPropagationTime 0 minutes - Clients try downloading router descriptors from directory caches, - even when they are not 10 minutes old. - - In addition to changing the defaults for these configuration options, - TestingTorNetwork can only be set when a user has manually configured - DirServer lines. - -Test: - - The implementation of this proposal must pass the following tests: - - 1. Set TestingTorNetwork and see if dependent configuration options are - correctly changed. - - tor DataDirectory . 
ControlPort 9051 TestingTorNetwork 1 DirServer \ - "mydir 127.0.0.1:1234 0000000000000000000000000000000000000000" - telnet 127.0.0.1 9051 - AUTHENTICATE - GETCONF TestingTorNetwork TestingAuthDirTimeToLearnReachability - 250-TestingTorNetwork=1 - 250 TestingAuthDirTimeToLearnReachability=0 - QUIT - - 2. Set TestingTorNetwork and a dependent configuration value to see if - the provided value is used for the dependent option. - - tor DataDirectory . ControlPort 9051 TestingTorNetwork 1 DirServer \ - "mydir 127.0.0.1:1234 0000000000000000000000000000000000000000" \ - TestingAuthDirTimeToLearnReachability 5 - telnet 127.0.0.1 9051 - AUTHENTICATE - GETCONF TestingTorNetwork TestingAuthDirTimeToLearnReachability - 250-TestingTorNetwork=1 - 250 TestingAuthDirTimeToLearnReachability=5 - QUIT - - 3. Start with TestingTorNetwork set and change a dependent configuration - option later on. - - tor DataDirectory . ControlPort 9051 TestingTorNetwork 1 DirServer \ - "mydir 127.0.0.1:1234 0000000000000000000000000000000000000000" - telnet 127.0.0.1 9051 - AUTHENTICATE - SETCONF TestingAuthDirTimeToLearnReachability=5 - GETCONF TestingAuthDirTimeToLearnReachability - 250 TestingAuthDirTimeToLearnReachability=5 - QUIT - - 4. Start with TestingTorNetwork set and a dependent configuration value, - and reset that dependent configuration value. The result should be - the testing-network specific default value. - - tor DataDirectory . ControlPort 9051 TestingTorNetwork 1 DirServer \ - "mydir 127.0.0.1:1234 0000000000000000000000000000000000000000" \ - TestingAuthDirTimeToLearnReachability 5 - telnet 127.0.0.1 9051 - AUTHENTICATE - GETCONF TestingAuthDirTimeToLearnReachability - 250 TestingAuthDirTimeToLearnReachability=5 - RESETCONF TestingAuthDirTimeToLearnReachability - GETCONF TestingAuthDirTimeToLearnReachability - 250 TestingAuthDirTimeToLearnReachability=0 - QUIT - - 5. Leave TestingTorNetwork unset and check if dependent configuration - options are left unchanged. 
- - tor DataDirectory . ControlPort 9051 DirServer \ - "mydir 127.0.0.1:1234 0000000000000000000000000000000000000000" - telnet 127.0.0.1 9051 - AUTHENTICATE - GETCONF TestingTorNetwork TestingAuthDirTimeToLearnReachability - 250-TestingTorNetwork=0 - 250 TestingAuthDirTimeToLearnReachability=1800 - QUIT - - 6. Leave TestingTorNetwork unset, but set dependent configuration option - which should fail. - - tor DataDirectory . ControlPort 9051 DirServer \ - "mydir 127.0.0.1:1234 0000000000000000000000000000000000000000" \ - TestingAuthDirTimeToLearnReachability 0 - [warn] Failed to parse/validate config: - TestingAuthDirTimeToLearnReachability may only be changed in testing - Tor networks! - - 7. Start with TestingTorNetwork unset and change dependent configuration - option later on which should fail. - - tor DataDirectory . ControlPort 9051 DirServer \ - "mydir 127.0.0.1:1234 0000000000000000000000000000000000000000" - telnet 127.0.0.1 9051 - AUTHENTICATE - SETCONF TestingAuthDirTimeToLearnReachability=0 - 513 Unacceptable option value: TestingAuthDirTimeToLearnReachability - may only be changed in testing Tor networks! - - 8. Start with TestingTorNetwork unset and set it later on which should - fail. - - tor DataDirectory . ControlPort 9051 DirServer \ - "mydir 127.0.0.1:1234 0000000000000000000000000000000000000000" - telnet 127.0.0.1 9051 - AUTHENTICATE - SETCONF TestingTorNetwork=1 - 553 Transition not allowed: While Tor is running, changing - TestingTorNetwork is not allowed. - - 9. Start with TestingTorNetwork set and unset it later on which should - fail. - - tor DataDirectory . ControlPort 9051 TestingTorNetwork 1 DirServer \ - "mydir 127.0.0.1:1234 0000000000000000000000000000000000000000" - telnet 127.0.0.1 9051 - AUTHENTICATE - RESETCONF TestingTorNetwork - 513 Unacceptable option value: TestingV3AuthInitialVotingInterval may - only be changed in testing Tor networks! - - 10. 
Set TestingTorNetwork, but do not provide an alternate DirServer - which should fail. - - tor DataDirectory . ControlPort 9051 TestingTorNetwork 1 - [warn] Failed to parse/validate config: TestingTorNetwork may only be - configured in combination with a non-default set of DirServers. - diff --git a/doc/spec/proposals/136-legacy-keys.txt b/doc/spec/proposals/136-legacy-keys.txt deleted file mode 100644 index f2b1b5c7f9..0000000000 --- a/doc/spec/proposals/136-legacy-keys.txt +++ /dev/null @@ -1,100 +0,0 @@ -Filename: 136-legacy-keys.txt -Title: Mass authority migration with legacy keys -Author: Nick Mathewson -Created: 13-May-2008 -Status: Closed -Implemented-In: 0.2.0.x - -Overview: - - This document describes a mechanism to change the keys of more than - half of the directory servers at once without breaking old clients - and caches immediately. - -Motivation: - - If a single authority's identity key is believed to be compromised, - the solution is obvious: remove that authority from the list, - generate a new certificate, and treat the new cert as belonging to a - new authority. This approach works fine so long as less than 1/2 of - the authority identity keys are bad. - - Unfortunately, the mass-compromise case is possible if there is a - sufficiently bad bug in Tor or in any OS used by a majority of v3 - authorities. Let's be prepared for it! - - We could simply stop using the old keys and start using new ones, - and tell all clients running insecure versions to upgrade. - Unfortunately, this breaks our cacheing system pretty badly, since - caches won't cache a consensus that they don't believe in. It would - be nice to have everybody become secure the moment they upgrade to a - version listing the new authority keys, _without_ breaking upgraded - clients until the caches upgrade. - - So, let's come up with a way to provide a time window where the - consensuses are signed with the new keys and with the old. 
- -Design: - - We allow directory authorities to list a single "legacy key" - fingerprint in their votes. Each authority may add a single legacy - key. The format for this line is: - - legacy-dir-key FINGERPRINT - - We describe a new consensus method for generating directory - consensuses. This method is consensus method "3". - - When the authorities decide to use method "3" (as described in 3.4.1 - of dir-spec.txt), for every included vote with a legacy-dir-key line, - the consensus includes an extra dir-source line. The fingerprint in - this extra line is as in the legacy-dir-key line. The ports and - addresses are in the dir-source line. The nickname is as in the - dir-source line, with the string "-legacy" appended. - - [We need to include this new dir-source line because the code - won't accept or preserve signatures from authorities not listed - as contributing to the consensus.] - - Authorities using legacy dir keys include two signatures on their - consensuses: one generated with a signing key signed with their real - signing key, and another generated with a signing key signed with - another signing key attested to by their identity key. These - signing keys MUST be different. Authorities MUST serve both - certificates if asked. - -Process: - - In the event of a mass key failure, we'll follow the following - (ugly) procedure: - - All affected authorities generate new certificates and identity - keys, and circulate their new dirserver lines. They copy their old - certificates and old broken keys, but put them in new "legacy - key files". - - At the earliest time that can be arranged, the authorities - replace their signing keys, identity keys, and certificates - with the new uncompromised versions, and update to the new list - of dirserer lines. - - They add an "V3DirAdvertiseLegacyKey 1" option to their torrc. - - Now, new consensuses will be generated using the new keys, but - the results will also be signed with the old keys. 
- - Clients and caches are told they need to upgrade, and given a - time window to do so. - - At the end of the time window, authorities remove the - V3DirAdvertiseLegacyKey option. - -Notes: - - It might be good to get caches to cache consensuses that they do not - believe in. I'm not sure the best way of how to do this. - - It's a superficially neat idea to have new signing keys and have - them signed by the new and by the old authority identity keys. This - breaks some code, though, and doesn't actually gain us anything, - since we'd still need to include each signature twice. - - It's also a superficially neat idea, if identity keys and signing - keys are compromised, to at least replace all the signing keys. - I don't think this achieves us anything either, though. - - diff --git a/doc/spec/proposals/137-bootstrap-phases.txt b/doc/spec/proposals/137-bootstrap-phases.txt deleted file mode 100644 index ebe044c707..0000000000 --- a/doc/spec/proposals/137-bootstrap-phases.txt +++ /dev/null @@ -1,235 +0,0 @@ -Filename: 137-bootstrap-phases.txt -Title: Keep controllers informed as Tor bootstraps -Author: Roger Dingledine -Created: 07-Jun-2008 -Status: Closed -Implemented-In: 0.2.1.x - -1. Overview. - - Tor has many steps to bootstrapping directory information and - initial circuits, but from the controller's perspective we just have - a coarse-grained "CIRCUIT_ESTABLISHED" status event. Tor users with - slow connections or with connectivity problems can wait a long time - staring at the yellow onion, wondering if it will ever change color. - - This proposal describes a new client status event so Tor can give - more details to the controller. Section 2 describes the changes to the - controller protocol; Section 3 describes Tor's internal bootstrapping - phases when everything is going correctly; Section 4 describes when - Tor detects a problem and issues a bootstrap warning; Section 5 covers - suggestions for how controllers should display the results. - -2. 
Controller event syntax. - - The generic status event is: - - "650" SP StatusType SP StatusSeverity SP StatusAction - [SP StatusArguments] CRLF - - So in this case we send - 650 STATUS_CLIENT NOTICE/WARN BOOTSTRAP \ - PROGRESS=num TAG=Keyword SUMMARY=String \ - [WARNING=String REASON=Keyword COUNT=num RECOMMENDATION=Keyword] - - The arguments MAY appear in any order. Controllers MUST accept unrecognized - arguments. - - "Progress" gives a number between 0 and 100 for how far through - the bootstrapping process we are. "Summary" is a string that can be - displayed to the user to describe the *next* task that Tor will tackle, - i.e., the task it is working on after sending the status event. "Tag" - is an optional string that controllers can use to recognize bootstrap - phases from Section 3, if they want to do something smarter than just - blindly displaying the summary string. - - The severity describes whether this is a normal bootstrap phase - (severity notice) or an indication of a bootstrapping problem - (severity warn). If severity warn, it should also include a "warning" - argument string with any hints Tor has to offer about why it's having - troubles bootstrapping, a "reason" string that lists one of the reasons - allowed in the ORConn event, a "count" number that tells how many - bootstrap problems there have been so far at this phase, and a - "recommendation" keyword to indicate how the controller ought to react. - -3. The bootstrap phases. - - This section describes the various phases currently reported by - Tor. Controllers should not assume that the percentages and tags listed - here will continue to match up, or even that the tags will stay in - the same order. Some phases might also be skipped (not reported) if the - associated bootstrap step is already complete, or if the phase no longer - is necessary. Only "starting" and "done" are guaranteed to exist in all - future versions. 
- - Current Tor versions enter these phases in order, monotonically; - future Tors MAY revisit earlier stages. - - Phase 0: - tag=starting summary="starting" - - Tor starts out in this phase. - - Phase 5: - tag=conn_dir summary="Connecting to directory mirror" - - Tor sends this event as soon as Tor has chosen a directory mirror --- - one of the authorities if bootstrapping for the first time or after - a long downtime, or one of the relays listed in its cached directory - information otherwise. - - Tor will stay at this phase until it has successfully established - a TCP connection with some directory mirror. Problems in this phase - generally happen because Tor doesn't have a network connection, or - because the local firewall is dropping SYN packets. - - Phase 10 - tag=handshake_dir summary="Finishing handshake with directory mirror" - - This event occurs when Tor establishes a TCP connection with a relay used - as a directory mirror (or its https proxy if it's using one). Tor remains - in this phase until the TLS handshake with the relay is finished. - - Problems in this phase generally happen because Tor's firewall is - doing more sophisticated MITM attacks on it, or doing packet-level - keyword recognition of Tor's handshake. - - Phase 15: - tag=onehop_create summary="Establishing one-hop circuit for dir info" - - Once TLS is finished with a relay, Tor will send a CREATE_FAST cell - to establish a one-hop circuit for retrieving directory information. - It will remain in this phase until it receives the CREATED_FAST cell - back, indicating that the circuit is ready. - - Phase 20: - tag=requesting_status summary="Asking for networkstatus consensus" - - Once we've finished our one-hop circuit, we will start a new stream - for fetching the networkstatus consensus. We'll stay in this phase - until we get the 'connected' relay cell back, indicating that we've - established a directory connection. 
- - Phase 25: - tag=loading_status summary="Loading networkstatus consensus" - - Once we've established a directory connection, we will start fetching - the networkstatus consensus document. This could take a while; this - phase is a good opportunity for using the "progress" keyword to indicate - partial progress. - - This phase could stall if the directory mirror we picked doesn't - have a copy of the networkstatus consensus so we have to ask another, - or it does give us a copy but we don't find it valid. - - Phase 40: - tag=loading_keys summary="Loading authority key certs" - - Sometimes when we've finished loading the networkstatus consensus, - we find that we don't have all the authority key certificates for the - keys that signed the consensus. At that point we put the consensus we - fetched on hold and fetch the keys so we can verify the signatures. - - Phase 45 - tag=requesting_descriptors summary="Asking for relay descriptors" - - Once we have a valid networkstatus consensus and we've checked all - its signatures, we start asking for relay descriptors. We stay in this - phase until we have received a 'connected' relay cell in response to - a request for descriptors. - - Phase 50: - tag=loading_descriptors summary="Loading relay descriptors" - - We will ask for relay descriptors from several different locations, - so this step will probably make up the bulk of the bootstrapping, - especially for users with slow connections. We stay in this phase until - we have descriptors for at least 1/4 of the usable relays listed in - the networkstatus consensus. This phase is also a good opportunity to - use the "progress" keyword to indicate partial steps. - - Phase 80: - tag=conn_or summary="Connecting to entry guard" - - Once we have a valid consensus and enough relay descriptors, we choose - some entry guards and start trying to build some circuits. This step - is similar to the "conn_dir" phase above; the only difference is - the context. 
- - If Tor starts with enough recent cached directory information, - its first bootstrap status event will be for the conn_or phase. - - Phase 85: - tag=handshake_or summary="Finishing handshake with entry guard" - - This phase is similar to the "handshake_dir" phase, but it gets reached - if we finish a TCP connection to a Tor relay and we have already reached - the "conn_or" phase. We'll stay in this phase until we complete a TLS - handshake with a Tor relay. - - Phase 90: - tag=circuit_create summary="Establishing circuits" - - Once we've finished our TLS handshake with an entry guard, we will - set about trying to make some 3-hop circuits in case we need them soon. - - Phase 100: - tag=done summary="Done" - - A full 3-hop circuit has been established. Tor is ready to handle - application connections now. - -4. Bootstrap problem events. - - When an OR Conn fails, we send a "bootstrap problem" status event, which - is like the standard bootstrap status event except with severity warn. - We include the same progress, tag, and summary values as we would for - a normal bootstrap event, but we also include "warning", "reason", - "count", and "recommendation" key/value combos. - - The "reason" values are long-term-stable controller-facing tags to - identify particular issues in a bootstrapping step. The warning - strings, on the other hand, are human-readable. Controllers SHOULD - NOT rely on the format of any warning string. Currently the possible - values for "recommendation" are either "ignore" or "warn" -- if ignore, - the controller can accumulate the string in a pile of problems to show - the user if the user asks; if warn, the controller should alert the - user that Tor is pretty sure there's a bootstrapping problem. - - Currently Tor uses recommendation=ignore for the first nine bootstrap - problem reports for a given phase, and then uses recommendation=warn - for subsequent problems at that phase.
Hopefully this is a good - balance between tolerating occasional errors and reporting serious - problems quickly. - -5. Suggested controller behavior. - - Controllers should start out with a yellow onion or the equivalent - ("starting"), and then watch for either a bootstrap status event - (meaning the Tor they're using is sufficiently new to produce them, - and they should load up the progress bar or whatever they plan to use - to indicate progress) or a circuit_established status event (meaning - bootstrapping is finished). - - In addition to a progress bar in the display, controllers should also - have some way to indicate progress even when no controller window is - open. For example, folks using Tor Browser Bundle in hostile Internet - cafes don't want a big splashy screen up. One way to let the user keep - informed of progress in a more subtle way is to change the task tray - icon and/or tooltip string as more bootstrap events come in. - - Controllers should also have some mechanism to alert their user when - bootstrapping problems are reported. Perhaps we should gather a set of - help texts and the controller can send the user to the right anchor in a - "bootstrapping problems" page in the controller's help subsystem? - -6. Getting up to speed when the controller connects. - - There's a new "GETINFO /status/bootstrap-phase" option, which returns - the most recent bootstrap phase status event sent. Specifically, - it returns a string starting with either "NOTICE BOOTSTRAP ..." or - "WARN BOOTSTRAP ...". - - Controllers should use this getinfo when they connect or attach to - Tor to learn its current state. 
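A minimal sketch of how a controller might decode the bootstrap status events described in proposal 137 above. The function names are hypothetical, and using Python's shlex to split the quoted arguments is an assumption made for brevity; the control protocol's own quoting rules may differ from shell quoting in edge cases. Unrecognized arguments are kept rather than rejected, since the proposal says controllers MUST accept them.

```python
import shlex


def parse_bootstrap_event(line):
    """Parse one '650 STATUS_CLIENT <severity> BOOTSTRAP key=value ...' line.

    Returns a dict of the arguments plus a 'severity' entry, or None if the
    line is not a bootstrap status event.
    """
    parts = shlex.split(line)
    if len(parts) < 4:
        return None
    if parts[0] != "650" or parts[1] != "STATUS_CLIENT" or parts[3] != "BOOTSTRAP":
        return None
    event = {"severity": parts[2]}
    # Arguments may appear in any order; unknown keys are simply stored.
    for arg in parts[4:]:
        key, sep, value = arg.partition("=")
        if sep:
            event[key] = value
    # PROGRESS and COUNT are numeric in the proposal's syntax.
    for numeric in ("PROGRESS", "COUNT"):
        if numeric in event:
            event[numeric] = int(event[numeric])
    return event


def should_alert_user(event):
    """True when Tor itself recommends warning the user (section 4)."""
    return event["severity"] == "WARN" and event.get("RECOMMENDATION") == "warn"
```

A controller following section 5 could feed the PROGRESS value to its progress bar and use TAG, rather than the SUMMARY text, when it wants to recognize a particular phase.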
- diff --git a/doc/spec/proposals/138-remove-down-routers-from-consensus.txt b/doc/spec/proposals/138-remove-down-routers-from-consensus.txt deleted file mode 100644 index 776911b5c9..0000000000 --- a/doc/spec/proposals/138-remove-down-routers-from-consensus.txt +++ /dev/null @@ -1,49 +0,0 @@ -Filename: 138-remove-down-routers-from-consensus.txt -Title: Remove routers that are not Running from consensus documents -Author: Peter Palfrader -Created: 11-Jun-2008 -Status: Closed -Implemented-In: 0.2.1.2-alpha - -1. Overview. - - Tor directory authorities hourly vote and agree on a consensus document - which lists all the routers on the network together with some of their - basic properties, like if a router is an exit node, whether it is - stable or whether it is a version 2 directory mirror. - - One of the properties given with each router is the 'Running' flag. - Clients do not use routers that are not listed as running. - - This proposal suggests that routers without the Running flag are not - listed at all. - -2. Current status - - At a typical bootstrap a client downloads a 140KB consensus, about - 10KB of certificates to verify that consensus, and about 1.6MB of - server descriptors, about 1/4 of which it requires before it will - start building circuits. - - Another proposal deals with how to get that huge 1.6MB fraction to - effectively zero (by downloading only individual descriptors, on - demand). Should that get successfully implemented that will leave the - 140KB compressed consensus as a large fraction of what a client needs - to get in order to work. - - About one third of the routers listed in a consensus are not running - and will therefore never be used by clients who use this consensus. - Not listing those routers will save about 30% to 40% in size. - -3. Proposed change - - Authority directory servers produce vote documents that include all - the servers they know about, running or not, like they currently - do. 
In addition these vote documents also state that the authority - supports a new consensus forming method (method number 4). - - If more than two thirds of votes that an authority has received claim - they support method 4 then this new method will be used: The - consensus document is formed like before but a new last step removes - all routers from the listing that are not marked as Running. - diff --git a/doc/spec/proposals/139-conditional-consensus-download.txt b/doc/spec/proposals/139-conditional-consensus-download.txt deleted file mode 100644 index 941f5ad6b0..0000000000 --- a/doc/spec/proposals/139-conditional-consensus-download.txt +++ /dev/null @@ -1,94 +0,0 @@ -Filename: 139-conditional-consensus-download.txt -Title: Download consensus documents only when it will be trusted -Author: Peter Palfrader -Created: 2008-04-13 -Status: Closed -Implemented-In: 0.2.1.x - -Overview: - - Servers only provide consensus documents to clients when it is known that - the client will trust it. - -Motivation: - - When clients[1] want a new network status consensus they request it - from a Tor server using the URL path /tor/status-vote/current/consensus. - Then after downloading the client checks if this consensus can be - trusted. Whether the client trusts the consensus depends on the - authorities that the client trusts and how many of those - authorities signed the consensus document. - - If the client cannot trust the consensus document it is disregarded - and a new download is tried at a later time. Several hundred - kilobytes of server bandwidth were wasted by this single client's - request. - - With hundreds of thousands of clients this will have undesirable - consequences when the list of authorities has changed so much that a - large number of established clients no longer can trust any consensus - document formed. - -Objective: - - The objective of this proposal is to make clients not download - consensuses they will not trust. 
- -Proposal: - - The list of authorities that are trusted by a client are encoded in - the URL they send to the directory server when requesting a consensus - document. - - The directory server then only sends back the consensus when more than - half of the authorities listed in the request have signed the - consensus. If it is known that the consensus will not be trusted - a 404 error code is sent back to the client. - - This proposal does not require directory caches to keep more than one - consensus document. This proposal also does not require authorities - to verify the signature on the consensus document of authorities they - do not recognize. - - The new URL scheme to download a consensus is - /tor/status-vote/current/consensus/<F> where F is a list of - fingerprints, sorted in ascending order, and concatenated using a + - sign. - - Fingerprints are uppercase hexadecimal encodings of the authority - identity key's digest. Servers should also accept requests that - use lower case or mixed case hexadecimal encodings. - - A .z URL for compressed versions of the consensus will be provided - similarly to existing resources and is the URL that usually should - be used by clients. - -Migration: - - The old location of the consensus should continue to work - indefinitely. Not only is it used by old clients, but it is a useful - resource for automated tools that do not particularly care which - authorities have signed the consensus. - - Authorities that are known to the client a priori by being shipped - with the Tor code are assumed to handle this format. - - When downloading a consensus document from caches that do not support this - new format they fall back to the old download location. - - Caches support the new format starting with Tor version 0.2.1.1-alpha. - -Anonymity Implications: - - By supplying the list of authorities a client trusts to the directory - server we leak information (like likely version of Tor client) to the - directory server. 
In the current system we also leak that we are - very old - by re-downloading the consensus over and over again, but - only when we are so old that we no longer can trust the consensus. - - - -Footnotes: - 1. For the purpose of this proposal a client can be any Tor instance - that downloads a consensus document. This includes relays, - directory caches as well as end users. diff --git a/doc/spec/proposals/140-consensus-diffs.txt b/doc/spec/proposals/140-consensus-diffs.txt deleted file mode 100644 index 8bc4070bfe..0000000000 --- a/doc/spec/proposals/140-consensus-diffs.txt +++ /dev/null @@ -1,156 +0,0 @@ -Filename: 140-consensus-diffs.txt -Title: Provide diffs between consensuses -Author: Peter Palfrader -Created: 13-Jun-2008 -Status: Accepted -Target: 0.2.2.x - -0. History - - 22-May-2009: Restricted the ed format even more strictly for ease of - implementation. -nickm - -1. Overview. - - Tor clients and servers need a list of which relays are on the - network. This list, the consensus, is created by authorities - hourly and clients fetch a copy of it, with some delay, hourly. - - This proposal suggests that clients download diffs of consensuses - once they have a consensus instead of hourly downloading a full - consensus. - -2. Numbers - - After implementing proposal 138 which removes nodes that are not - running from the list a consensus document is about 92 kilobytes - in size after compression. - - The diff between two consecutive consensus, in ed format, is on - average 13 kilobytes compressed. - -3. Proposal - -3.1 Clients - - If a client has a consensus that is recent enough it SHOULD - try to download a diff to get the latest consensus rather than - fetching a full one. - - [XXX: what is recent enough? 
- time delta in hours / size of compressed diff - 0 20 - 1 9650 - 2 17011 - 3 23150 - 4 29813 - 5 36079 - 6 39455 - 7 43903 - 8 48907 - 9 54549 - 10 60057 - 11 67810 - 12 71171 - 13 73863 - 14 76048 - 15 80031 - 16 84686 - 17 89862 - 18 94760 - 19 94868 - 20 94223 - 21 93921 - 22 92144 - 23 90228 - [ size of gzip compressed "diff -e" between the consensus on - 2008-06-01-00:00:00 and the following consensuses that day. - Consensuses have been modified to exclude down routers per - proposal 138. ] - - Data suggests that for the first few hours diffs are very useful, - saving about 60% for the first three hours, 30% for the first 10, - and almost nothing once we are past 16 hours. - ] - -3.2 Servers - - Directory authorities and servers need to keep up to X [XXX: depends - on how long clients try to download diffs per above] old consensus - documents so they can build diffs. They should offer a diff to the - most recent consensus at the URL - - http://tor.noreply.org/tor/status-vote/current/consensus/diff/<HASH>/<FPRLIST> - - where HASH is the full digest of the consensus the client currently - has, and FPRLIST is a list of (abbreviated) fingerprints of - authorities the client trusts. - - Servers will only return a consensus if more than half of the requested - authorities have signed the document, otherwise a 404 error will be sent - back. The fingerprints can be shortened to a length of any multiple of - two, using only the leftmost part of the encoded fingerprint. Tor uses - 3 bytes (6 hex characters) of the fingerprint. (This is just like the - conditional consensus downloads that Tor supports starting with - 0.2.1.1-alpha.) - - If a server cannot offer a diff from the consensus identified by the - hash but has a current consensus it MUST return the full consensus. - - [XXX: what should we do when the client already has the latest - consensus?
I can think of the following options: - - send back 3xx not modified - - send back 200 ok and an empty diff - - send back 404 nothing newer here. - - I currently lean towards the empty diff.] - -4. Diff Format - - Diffs start with the token "network-status-diff-version" followed by a - space and the version number, currently "1". - - If a document does not start with network-status-diff it is assumed - to be a full consensus download and would therefore currently start - with "network-status-version 3". - - Following the network-status-diff header line is a diff, or patch, in - limited ed format. We choose this format because it is easy to create - and process with standard tools (patch, diff -e, ed). This will help - us in developing and testing this proposal and it should make future - debugging easier. - - [ If at one point in the future we decide that the space benefits from - a custom diff format outweighs these benefits we can always - introduce a new diff format and offer it at for instance - ../diff2/... ] - - We support the following ed commands, each on a line by itself: - - "<n1>d" Delete line n1 - - "<n1>,<n2>d" Delete lines n1 through n2, including - - "<n1>c" Replace line n1 with the following block - - "<n1>,<n2>c" Replace lines n1 through n2, including, with the - following block. - - "<n1>a" Append the following block after line n1. - - "a" Append the following block after the current line. - - "s/.//" Remove the first character in the current line. - - Note that line numbers always apply to the file after all previous - commands have already been applied. - - The commands MUST apply to the file from back to front, such that - lines are only ever referred to by their position in the original - file. 
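A minimal sketch of applying the numbered commands listed above ("d", "c" and "a" with explicit line numbers). The function name is hypothetical. It assumes, as in standard ed, that a line containing only "." ends each block, and it deliberately leaves out the address-less "a" and the "s/.//" command, which depend on tracking the current line.

```python
import re

# Matches "<n1>d", "<n1>,<n2>d", "<n1>c", "<n1>,<n2>c" and "<n1>a".
_COMMAND = re.compile(r"^(\d+)(?:,(\d+))?([acd])$")


def apply_limited_ed(original_lines, diff_lines):
    """Apply a back-to-front list of ed commands to a list of lines."""
    lines = list(original_lines)
    i = 0
    while i < len(diff_lines):
        match = _COMMAND.match(diff_lines[i])
        if match is None:
            raise ValueError("unsupported command: %r" % diff_lines[i])
        first = int(match.group(1))
        last = int(match.group(2)) if match.group(2) else first
        op = match.group(3)
        i += 1
        if op == "a" and match.group(2):
            raise ValueError("append takes a single line number")
        block = []
        if op in "ac":
            # Collect the block up to (not including) the lone "." line.
            while i < len(diff_lines) and diff_lines[i] != ".":
                block.append(diff_lines[i])
                i += 1
            if i == len(diff_lines):
                raise ValueError("block is missing its terminating '.'")
            i += 1
        if op == "a":
            if not 0 <= first <= len(lines):
                raise ValueError("line number out of range")
            lines[first:first] = block          # insert after line n1
        else:
            if not 1 <= first <= last <= len(lines):
                raise ValueError("line range out of range")
            lines[first - 1:last] = block       # empty block for "d"
    return lines
```

Because the commands are ordered from the end of the file toward the start, each line number still refers to a position in the original file at the moment its command runs.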
- - The "current line" is either the first line of the file, if this is - the first command, the last line of a block we added in an append or - change command, or the line immediate following a set of lines we just - deleted (or the last line of the file if there are no lines after - that). - - The replace and append command take blocks. These blocks are simply - appended to the diff after the line with the command. A line with - just a period (".") ends the block (and is not part of the lines - to add). Note that it is impossible to insert a line with just - a single dot. Recommended procedure is to insert a line with - two dots, then remove the first character of that line using s/.//. diff --git a/doc/spec/proposals/141-jit-sd-downloads.txt b/doc/spec/proposals/141-jit-sd-downloads.txt deleted file mode 100644 index 2ac7a086b7..0000000000 --- a/doc/spec/proposals/141-jit-sd-downloads.txt +++ /dev/null @@ -1,323 +0,0 @@ -Filename: 141-jit-sd-downloads.txt -Title: Download server descriptors on demand -Author: Peter Palfrader -Created: 15-Jun-2008 -Status: Draft - -1. Overview - - Downloading all server descriptors is the most expensive part - of bootstrapping a Tor client. These server descriptors currently - amount to about 1.5 Megabytes of data, and this size will grow - linearly with network size. - - Fetching all these server descriptors takes a long while for people - behind slow network connections. It is also a considerable load on - our network of directory mirrors. - - This document describes proposed changes to the Tor network and - directory protocol so that clients will no longer need to download - all server descriptors. - - These changes consist of moving load balancing information into - network status documents, implementing a means to download server - descriptors on demand in an anonymity-preserving way, and dealing - with exit node selection. - -2. 
What is in a server descriptor - - When a Tor client starts the first thing it will try to get is a - current network status document: a consensus signed by a majority - of directory authorities. This document is currently about 100 - Kilobytes in size, tho it will grow linearly with network size. - This document lists all servers currently running on the network. - The Tor client will then try to get a server descriptor for each - of the running servers. All server descriptors currently amount - to about 1.5 Megabytes of downloads. - - A Tor client learns several things about a server from its descriptor. - Some of these it already learned from the network status document - published by the authorities, but the server descriptor contains it - again in a single statement signed by the server itself, not just by - the directory authorities. - - Tor clients use the information from server descriptors for - different purposes, which are considered in the following sections. - - #three ways: One, to determine if a server will be able to handle - #this client's request; two, to actually communicate or use the server; - #three, for load balancing decisions. - # - #These three points are considered in the following subsections. - -2.1 Load balancing - - The Tor load balancing mechanism is quite complex in its details, but - it has a simple goal: The more traffic a server can handle the more - traffic it should get. That means the more traffic a server can - handle the more likely a client will use it. - - For this purpose each server descriptor has bandwidth information - which tries to convey a server's capacity to clients. - - Currently we weigh servers differently for different purposes. There - is a weight for when we use a server as a guard node (our entry to the - Tor network), there is one weight we assign servers for exit duties, - and a third for when we need intermediate (middle) nodes. 
- -2.2 Exit information - - When a Tor wants to exit to some resource on the internet it will - build a circuit to an exit node that allows access to that resource's - IP address and TCP Port. - - When building that circuit the client can make sure that the circuit - ends at a server that will be able to fulfill the request because the - client already learned of all the servers' exit policies from their - descriptors. - -2.3 Capability information - - Server descriptors contain information about the specific version of - the Tor protocol they understand [proposal 105]. - - Furthermore the server descriptor also contains the exact version of - the Tor software that the server is running and some decisions are - made based on the server version number (for instance a Tor client - will only make conditional consensus requests [proposal 139] when - talking to Tor servers version 0.2.1.1-alpha or later). - -2.4 Contact/key information - - A server descriptor lists a server's IP address and TCP ports on which - it accepts onion and directory connections. Furthermore it contains - the onion key (a short lived RSA key to which clients encrypt CREATE - cells). - -2.5 Identity information - - A Tor client learns the digest of a server's key from the network - status document. Once it has a server descriptor this descriptor - contains the full RSA identity key of the server. Clients verify - that 1) the digest of the identity key matches the expected digest - it got from the consensus, and 2) that the signature on the descriptor - from that key is valid. - - -3. No longer require clients to have copies of all SDs - -3.1 Load balancing info in consensus documents - - One of the reasons why clients download all server descriptors is for - doing load proper load balancing as described in 2.1. In order for - clients to not require all server descriptors this information will - have to move into the network status document. 
- - Consensus documents will have a new line per router similar - to the "r", "s", and "v" lines that already exist. This line - will convey weight information to clients. - - "w Bandwidth=193" - - The bandwidth number is the lesser of observed bandwidth and bandwidth - rate limit from the server descriptor that the "r" line referenced by - digest (1st and 3rd field of the bandwidth line in the descriptor). - It is given in kilobytes per second so the byte value in the - descriptor has to be divided by 1024 (and is then truncated, i.e. - rounded down). - - Authorities will cap the bandwidth number at some arbitrary value, - currently 10MB/sec. If a router claims a larger bandwidth an - authority's vote will still only show Bandwidth=10240. - - The consensus value for bandwidth is the median of all bandwidth - numbers given in votes. In case of an even number of votes we use - the lower median. (Using this procedure allows us to change the - cap value more easily.) - - Clients should believe the bandwidth as presented in the consensus, - not capping it again. - -3.2 Fetching descriptors on demand - - As described in 2.4 a descriptor lists IP address, OR- and Dir-Port, - and the onion key for a server. - - A client already knows the IP address and the ports from the consensus - documents, but without the onion key it will not be able to send - CREATE/EXTEND cells for that server. Since the client needs the onion - key it needs the descriptor. - - If a client only downloaded a few descriptors in an observable manner - then that would leak which nodes it was going to use. - - This proposal suggests the following: - - 1) when connecting to a guard node for which the client does not - yet have a cached descriptor it requests the descriptor it - expects by hash. (The consensus document that the client holds - has a hash for the descriptor of this server. We want exactly - that descriptor, not a different one.) - - It does that by sending a RELAY_REQUEST_SD cell. 
- - A client MAY cache the descriptor of the guard node so that it does - not need to request it every single time it contacts the guard. - - 2) when a client wants to extend a circuit that currently ends in - server B to a new next server C, the client will send a - RELAY_REQUEST_SD cell to server B. This cell contains in its - payload the hash of a server descriptor the client would like - to obtain (C's server descriptor). The server sends back the - descriptor and the client can now form a valid EXTEND/CREATE cell - encrypted to C's onion key. - - Clients MUST NOT cache such descriptors. If they did they might - leak that they already extended to that server at least once - before. - - Replies to RELAY_REQUEST_SD requests need to be padded to some - constant upper limit in order to conceal a client's destination - from anybody who might be counting cells/bytes. - - RELAY_REQUEST_SD cells contain the following information: - - hash of the server descriptor requested - - hash of the identity digest of the server for which we want the SD - - IP address and OR-port of the server for which we want the SD - - padding factor - the number of cells we want the answer - padded to. - [XXX this just occurred to me and it might be smart. or it might - be stupid. clients would learn the padding factor they want - to use from the consensus document. This allows us to grow - the replies later on should SDs become larger.] - [XXX: figure out a decent padding size] - -3.3 Protocol versions - - Server descriptors contain optional information of supported - link-level and circuit-level protocols in the form of - "opt protocols Link 1 2 Circuit 1". These are not currently needed - and will probably eventually move into the "v" (version) line in - the consensus. This proposal does not deal with them. - - Similarly a server descriptor contains the version number of - a Tor node. This information is already present in the consensus - and is thus available to all clients immediately.
- -3.4 Exit selection - - Currently finding an appropriate exit node for a user's request is - easy for a client because it has complete knowledge of all the exit - policies of all servers on the network. - - The consensus document will once again be extended to contain the - information required by clients. This information will be a summary - of each node's exit policy. The exit policy summary will only contain - the list of ports to which a node exits to most destination IP - addresses. - - A summary should claim a router exits to a specific TCP port if, - ignoring private IP addresses, the exit policy indicates that the - router would exit to this port to most IP addresses (that is, to - everywhere except at most two /8 netblocks, or one /8 and a couple of - /12s, or any other combination of that size). - The exact algorithm used is this: Going through all exit policy items - - ignore any accept that is not for all IP addresses ("*"), - - ignore rejects for these netblocks (exactly, no subnetting): - 0.0.0.0/8, 169.254.0.0/16, 127.0.0.0/8, 192.168.0.0/16, 10.0.0.0/8, - and 172.16.0.0/12, - - for each reject count the number of IP addresses rejected against - the affected ports, - - once we hit an accept for all IP addresses ("*") add the ports in - that policy item to the list of accepted ports, if they don't have - more than 2^25 IP addresses (that's two /8 networks) counted - against them (i.e. if the router exits to a port to everywhere but - at most two /8 networks). - - An exit policy summary will be included in votes and consensus as a - new line attached to each exit node. The line will have the format - "p" <space> "accept"|"reject" <portlist> - where portlist is a comma separated list of single port numbers or - portranges (e.g. "22,80-88,1024-6000,6667"). - - Whether the summary shows the list of accepted ports or the list of - rejected ports depends on which list is shorter (has a shorter string - representation). In case of ties we choose the list of accepted - ports.
As an exception to this rule an allow-all policy is - represented as "accept 1-65535" instead of "reject " and a reject-all - policy is similarly given as "reject 1-65535". - - Summary items are compressed, that is instead of "80-88,89-100" there - is only a single item of "80-100", similarly instead of "20,21" a - summary will say "20-21". - - Port lists are sorted in ascending order. - - The maximum allowed length of a policy summary (including the "accept " - or "reject ") is 1000 characters. If a summary exceeds that length we - use an accept-style summary and list as much of the port list as is - possible within these 1000 bytes. - -3.4.1 Consensus selection - - When building a consensus, authorities have to agree on a digest of - the server descriptor to list in the router line for each router. - This is documented in dir-spec section 3.4. - - All authorities that listed that agreed upon descriptor digest in - their vote should also list the same exit policy summary - or list - none at all if the authority has not been upgraded to list that - information in their vote. - - If we have votes with matching server descriptor digest of which at - least one of them has an exit policy then we distinguish between two cases: - a) all authorities agree (or abstained) on the policy summary, and we - use the exit policy summary that they all listed in their vote, - b) something went wrong (or some authority is playing foul) and we - have different policy summaries. In that case we pick the one - that is most commonly listed in votes with the matching - descriptor. We break ties in favour of the lexicographically larger - vote. - - If none of the votes with a matching server descriptor digest has - an exit policy summary we use the most commonly listed one in all - votes, breaking ties like in case b above.
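The formatting rules for the "p" line in section 3.4 (ascending order, range compression, choosing the shorter of the accept and reject lists with ties going to accept, and the allow-all/reject-all exception) can be sketched roughly as follows. This is only an illustration: the function names `compress_ports` and `policy_summary` are hypothetical, the input is assumed to be the already-computed set of accepted ports (the 2^25 address counting is not shown), and the 1000-character truncation rule is left out.

```python
def compress_ports(ports):
    """Return a sorted, range-compressed port list such as "22,80-100"."""
    ranges = []
    for port in sorted(set(ports)):
        if ranges and port == ranges[-1][1] + 1:
            ranges[-1][1] = port          # extend the current run
        else:
            ranges.append([port, port])   # start a new run
    return ",".join(str(lo) if lo == hi else f"{lo}-{hi}" for lo, hi in ranges)


def policy_summary(accepted_ports):
    """Build the text after "p " from the set of accepted ports (assumed input)."""
    all_ports = set(range(1, 65536))
    accepted = set(accepted_ports) & all_ports
    rejected = all_ports - accepted
    if not rejected:
        return "accept 1-65535"   # allow-all exception
    if not accepted:
        return "reject 1-65535"   # reject-all exception
    accept_str = compress_ports(accepted)
    reject_str = compress_ports(rejected)
    # The shorter string representation wins; ties go to the accept list.
    if len(accept_str) <= len(reject_str):
        return "accept " + accept_str
    return "reject " + reject_str
```

For a router that exits only to ports 22 and 80 this would give an accept-style summary, while a router that exits everywhere except port 25 would get the shorter reject-style summary.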
- -3.4.2 Client behaviour - - When choosing an exit node for a specific request a Tor client will - choose from the list of nodes that exit to the requested port as given - by the consensus document. If a client has additional knowledge (like - cached full descriptors) that indicates the so chosen exit node will - reject the request then it MAY use that knowledge (or not include such - nodes in the selection to begin with). However, clients MUST NOT use - nodes that do not list the port as accepted in the summary (but for - which they know that the node would exit to that address from other - sources, like a cached descriptor). - - An exception to this is exit enclave behaviour: A client MAY use the - node at a specific IP address to exit to any port on the same address - even if that node is not listed as exiting to the port in the summary. - -4. Migration - -4.1 Consensus document changes. - - The consensus will need to include - - bandwidth information (see 3.1) - - exit policy summaries (3.4) - - A new consensus method (number TBD) will be chosen for this. - -5. Future possibilities - - This proposal still requires that all servers have the descriptors of - every other node in the network in order to answer RELAY_REQUEST_SD - cells. These cells are sent when a circuit is extended from ending at - node B to a new node C. In that case B would have to answer a - RELAY_REQUEST_SD cell that asks for C's server descriptor (by SD digest). - - In order to answer that request B obviously needs a copy of C's server - descriptor. The RELAY_REQUEST_SD cell already has all the info that - B needs to contact C so it can ask about the descriptor before passing it - back to the client. 
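The bandwidth rules from section 3.1 of this proposal (take the lesser of the rate limit and the observed bandwidth, convert bytes to kilobytes with truncation, cap the value in each vote, and use the lower median across votes) can be sketched as below. The function and constant names are hypothetical, and the cap of 10240 is simply the "currently 10MB/sec" value quoted in the text.

```python
BANDWIDTH_CAP_KB = 10240  # the "currently 10MB/sec" cap mentioned in section 3.1


def vote_bandwidth(rate_bytes, observed_bytes, cap_kb=BANDWIDTH_CAP_KB):
    """Bandwidth value an authority would put in its vote, in kilobytes/sec."""
    kb = min(rate_bytes, observed_bytes) // 1024  # divide by 1024, round down
    return min(kb, cap_kb)


def consensus_bandwidth(vote_values):
    """Median of the voted values; the lower median for an even count."""
    ordered = sorted(vote_values)
    return ordered[(len(ordered) - 1) // 2]
```

Because the cap is applied per vote and the consensus takes a median, a change of the cap only takes effect once a majority of authorities use the new value, which is presumably what the remark about changing the cap "more easily" refers to.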
- diff --git a/doc/spec/proposals/142-combine-intro-and-rend-points.txt b/doc/spec/proposals/142-combine-intro-and-rend-points.txt deleted file mode 100644 index 3abd5c863d..0000000000 --- a/doc/spec/proposals/142-combine-intro-and-rend-points.txt +++ /dev/null @@ -1,277 +0,0 @@ -Filename: 142-combine-intro-and-rend-points.txt -Title: Combine Introduction and Rendezvous Points -Author: Karsten Loesing, Christian Wilms -Created: 27-Jun-2008 -Status: Dead - -Change history: - - 27-Jun-2008 Initial proposal for or-dev - 04-Jul-2008 Give first security property the new name "Responsibility" - and change new cell formats according to rendezvous protocol - version 3 draft. - 19-Jul-2008 Added comment by Nick (but no solution, yet) that sharing of - circuits between multiple clients is not supported by Tor. - -Overview: - - Establishing a connection to a hidden service currently involves two Tor - relays, introduction and rendezvous point, and 10 more relays distributed - over four circuits to connect to them. The introduction point is - established in the mid-term by a hidden service to transfer introduction - requests from client to the hidden service. The rendezvous point is set - up by the client for a single hidden service request and actually - transfers end-to-end encrypted application data between client and hidden - service. - - There are some reasons for separating the two roles of introduction and - rendezvous point: (1) Responsibility: A relay shall not be made - responsible that it relays data for a certain hidden service; in the - original design as described in [1] an introduction point relays no - application data, and a rendezvous points neither knows the hidden - service nor can it decrypt the data. (2) Scalability: The hidden service - shall not have to maintain a number of open circuits proportional to the - expected number of client requests. 
(3) Attack resistance: The effect of - an attack on the only visible parts of a hidden service, its introduction - points, shall be as small as possible. - - However, elimination of a separate rendezvous connection as proposed by - Øverlier and Syverson [2] is the most promising approach to improve the - delay in connection establishment. From all substeps of connection - establishment extending a circuit by only a single hop is responsible for - a major part of delay. Reducing on-demand circuit extensions from two to - one results in a decrease of mean connection establishment times from 39 - to 29 seconds [3]. Particularly, eliminating the delay on hidden-service - side allows the client to better observe progress of connection - establishment, thus allowing it to use smaller timeouts. Proposal 114 - introduced new introduction keys for introduction points and provides for - user authorization data in hidden service descriptors; it will be shown - in this proposal that introduction keys in combination with new - introduction cookies provide for the first security property - responsibility. Further, eliminating the need for a separate introduction - connection benefits the overall network load by decreasing the number of - circuit extensions. After all, having only one connection between client - and hidden service reduces the overall protocol complexity. - -Design: - - 1. Hidden Service Configuration - - Hidden services should be able to choose whether they would like to use - this protocol. This might be opt-in for 0.2.1.x and opt-out for later - major releases. - - 2. Contact Point Establishment - - When preparing a hidden service, a Tor client selects a set of relays to - act as contact points instead of introduction points. The contact point - combines both roles of introduction and rendezvous point as proposed in - [2]. The only requirement for a relay to be picked as contact point is - its capability of performing this role. 
This can be determined from the - Tor version number that needs to be equal or higher than the first - version that implements this proposal. - - The easiest way to implement establishment of contact points is to - introduce v2 ESTABLISH_INTRO cells. By convention, the relay recognizes - version 2 ESTABLISH_INTRO cells as requests to establish a contact point - rather than an introduction point. - - V Format byte: set to 255 [1 octet] - V Version byte: set to 2 [1 octet] - KLEN Key length [2 octets] - PK Public introduction key [KLEN octets] - HS Hash of session info [20 octets] - SIG Signature of above information [variable] - - The hidden service does not create a fixed number of contact points, like - 3 in the current protocol. It uses a minimum of 3 contact points, but - increases this number depending on the history of client requests within - the last hour. The hidden service also increases this number depending on - the frequency of failing contact points in order to defend against - attacks on its contact points. When client authorization as described in - proposal 121 is used, a hidden service can also use the number of - authorized clients as first estimate for the required number of contact - points. - - 3. Hidden Service Descriptor Creation - - A hidden service needs to issue a fresh introduction cookie for each - established introduction point. By requiring clients to use this cookie - in a later connection establishment, an introduction point cannot access - the hidden service that it works for. Together with the fresh - introduction key that was introduced in proposal 114, this reduces - responsibility of a contact point for a specific hidden service. - - The v2 hidden service descriptor format contains an - "intro-authentication" field that may contain introduction-point specific - keys. The hidden service creates a random string, comparable to the - rendezvous cookie, and includes it in the descriptor as introduction - cookie for auth-type "1". 
By convention, clients recognize existence of - auth-type 1 as possibility to connect to a hidden service via a contact - point rather than an introduction point. Older clients that do not - understand this new protocol simply ignore that cookie. - - 4. Connection Establishment - - When establishing a connection to a hidden service a client learns about - the capability of using the new protocol from the hidden service - descriptor. It may choose whether to use this new protocol or not, - whereas older clients cannot understand the new capability and can only - use the current protocol. Client using version 0.2.1.x should be able to - opt-in for using the new protocol, which should change to opt-out for - later major releases. - - When using the new capability the client creates a v2 INTRODUCE1 cell - that extends an unversioned INTRODUCE1 cell by adding the content of an - ESTABLISH_RENDEZVOUS cell. Further, the client sends this cell using the - new cell type 41 RELAY_INTRODUCE1_VERSIONED to the introduction point, - because unversioned and versioned INTRODUCE1 cells are indistinguishable: - - Cleartext - V Version byte: set to 2 [1 octet] - PK_ID Identifier for Bob's PK [20 octets] - RC Rendezvous cookie [20 octets] - Encrypted to introduction key: - VER Version byte: set to 3. [1 octet] - AUTHT The auth type that is supported [1 octet] - AUTHL Length of auth data [2 octets] - AUTHD Auth data [variable] - RC Rendezvous cookie [20 octets] - g^x Diffie-Hellman data, part 1 [128 octets] - - The cleartext part contains the rendezvous cookie that the contact point - remembers just as a rendezvous point would do. - - The encrypted part contains the introduction cookie as auth data for the - auth type 1. The rendezvous cookie is contained as before, but there is - no further rendezvous point information, as there is no separate - rendezvous point. - - 5. 
Rendezvous Establishment - - The contact point recognizes a v2 INTRODUCE1 cell with auth type 1 as a - request to be used in the new protocol. It remembers the contained - rendezvous cookie, replies to the client with an INTRODUCE_ACK cell - (omitting the RENDEZVOUS_ESTABLISHED cell), and forwards the encrypted - part of the INTRODUCE1 cell as INTRODUCE2 cell to the hidden service. - - 6. Introduction at Hidden Service - - The hidden services recognizes an INTRODUCE2 cell containing an - introduction cookie as authorization data. In this case, it does not - extend a circuit to a rendezvous point, but sends a RENDEZVOUS1 cell - directly back to its contact point as usual. - - 7. Rendezvous at Contact Point - - The contact point processes a RENDEZVOUS1 cell just as a rendezvous point - does. The only difference is that the hidden-service-side circuit is not - exclusive for the client connection, but shared among multiple client - connections. - - [Tor does not allow sharing of a single circuit among multiple client - connections easily. We need to think about a smart and efficient way to - implement this. Comment by Nick. -KL] - -Security Implications: - - (1) Responsibility - - One of the original reasons for the separation of introduction and - rendezvous points is that a relay shall not be made responsible that it - relays data for a certain hidden service. In the current design an - introduction point relays no application data and a rendezvous points - neither knows the hidden service nor can it decrypt the data. - - This property is also fulfilled in this new design. A contact point only - learns a fresh introduction key instead of the hidden service key, so - that it cannot recognize a hidden service. Further, the introduction - cookie, which is unknown to the contact point, prevents it from accessing - the hidden service itself. 
The only way for a contact point to access a - hidden service is to look up whether it is contained in the descriptors - of known hidden services. A contact point cannot directly be made - responsible for which hidden service it is working. In addition to that, - it cannot learn the data that it transfers, because all communication - between client and hidden service are end-to-end encrypted. - - (2) Scalability - - Another goal of the existing hidden service protocol is that a hidden - service does not have to maintain a number of open circuits proportional - to the expected number of client requests. The rationale behind this is - better scalability. - - The new protocol eliminates the need for a hidden service to extend - circuits on demand, which has a positive effect on circuits establishment - times and overall network load. The solution presented here to establish - a number of contact points proportional to the history of connection - requests reduces the number of circuits to a minimum number that fits the - hidden service's needs. - - (3) Attack resistance - - The third goal of separating introduction and rendezvous points is to - limit the effect of an attack on the only visible parts of a hidden - service which are the contact points in this protocol. - - In theory, the new protocol is more vulnerable to this attack. An - attacker who can take down a contact point does not only eliminate an - access point to the hidden service, but also breaks current client - connections to the hidden service using that contact point. - - Øverlier and Syverson proposed the concept of valet nodes as additional - safeguard for introduction/contact points [4]. Unfortunately, this - increases hidden service protocol complexity conceptually and from an - implementation point of view. Therefore, it is not included in this - proposal. - - However, in practice attacking a contact point (or introduction point) is - not as rewarding as it might appear. 
The cost for a hidden service to set - up a new contact point and publish a new hidden service descriptor is - minimal compared to the efforts necessary for an attacker to take a Tor - relay down. As a countermeasure to further frustrate this attack, the - hidden service raises the number of contact points as a function of - previous contact point failures. - - Further, the probability of breaking client connections due to attacking - a contact point is minimal. It can be assumed that the probability of one - of the other five involved relays in a hidden service connection failing - or being shut down is higher than that of a successful attack on a - contact point. - - (4) Resistance against Locating Attacks - - Clients are no longer able to force a hidden service to create or extend - circuits. This further reduces an attacker's capabilities of locating a - hidden server as described by Øverlier and Syverson [5]. - -Compatibility: - - The presented protocol does not raise compatibility issues with current - Tor versions. New relay versions support both, the existing and the - proposed protocol as introduction/rendezvous/contact points. A contact - point acts as introduction point simultaneously. Hidden services and - clients can opt-in to use the new protocol which might change to opt-out - some time in the future. - -References: - - [1] Roger Dingledine, Nick Mathewson, and Paul Syverson, Tor: The - Second-Generation Onion Router. In the Proceedings of the 13th USENIX - Security Symposium, August 2004. - - [2] Lasse Øverlier and Paul Syverson, Improving Efficiency and Simplicity - of Tor Circuit Establishment and Hidden Services. In the Proceedings of - the Seventh Workshop on Privacy Enhancing Technologies (PET 2007), - Ottawa, Canada, June 2007. - - [3] Christian Wilms, Improving the Tor Hidden Service Protocol Aiming at - Better Performance, diploma thesis, June 2008, University of Bamberg. 
- - [4] Lasse Øverlier and Paul Syverson, Valet Services: Improving Hidden - Servers with a Personal Touch. In the Proceedings of the Sixth Workshop - on Privacy Enhancing Technologies (PET 2006), Cambridge, UK, June 2006. - - [5] Lasse Øverlier and Paul Syverson, Locating Hidden Servers. In the - Proceedings of the 2006 IEEE Symposium on Security and Privacy, May 2006. - diff --git a/doc/spec/proposals/143-distributed-storage-improvements.txt b/doc/spec/proposals/143-distributed-storage-improvements.txt deleted file mode 100644 index 0f7468f1dc..0000000000 --- a/doc/spec/proposals/143-distributed-storage-improvements.txt +++ /dev/null @@ -1,194 +0,0 @@ -Filename: 143-distributed-storage-improvements.txt -Title: Improvements of Distributed Storage for Tor Hidden Service Descriptors -Author: Karsten Loesing -Created: 28-Jun-2008 -Status: Open -Target: 0.2.1.x - -Change history: - - 28-Jun-2008 Initial proposal for or-dev - -Overview: - - An evaluation of the distributed storage for Tor hidden service - descriptors and subsequent discussions have brought up a few improvements - to proposal 114. All improvements are backwards compatible to the - implementation of proposal 114. - -Design: - - 1. Report Bad Directory Nodes - - Bad hidden service directory nodes could deny existence of previously - stored descriptors. A bad directory node that does this with all stored - descriptors causes harm to the distributed storage in general, but - replication will cope with this problem in most cases. However, an - adversary that attempts to make a specific hidden service unavailable by - running relays that become responsible for all of a service's - descriptors poses a more serious threat. The distributed storage needs to - defend against this attack by detecting and removing bad directory nodes. - - As a countermeasure hidden services try to download their descriptors - every hour at random times from the hidden service directories that are - responsible for storing it. 
If a directory node replies with 404 (Not - found), the hidden service reports the supposedly bad directory node to - a random selection of half of the directory authorities (with version - numbers equal to or higher than the first version that implements this - proposal). The hidden service posts a complaint message using HTTP 'POST' - to a URL "/tor/rendezvous/complain" with the following message format: - - "hidden-service-directory-complaint" identifier NL - - [At start, exactly once] - - The identifier of the hidden service directory node to be - investigated. - - "rendezvous-service-descriptor" descriptor NL - - [At end, exactly once] - - The hidden service descriptor that the supposedly bad directory node - does not serve. - - The directory authority checks whether the descriptor is valid and whether - the hidden service directory is responsible for storing it. It waits for a random time - of up to 30 minutes before posting the descriptor to the hidden service - directory. If the publication is acknowledged, the directory authority - waits another random time of up to 30 minutes before attempting to - request the descriptor that it has posted. If the directory node replies - with 404 (Not found), it will be blacklisted for being a hidden service - directory node for the next 48 hours. - - A blacklisted hidden service directory is assigned the new flag BadHSDir - instead of the HSDir flag in the vote that a directory authority creates. - In a consensus a relay is only assigned a HSDir flag if the majority of - votes contains a HSDir flag and no more than one third of votes contains - a BadHSDir flag. As a result, clients do not have to learn about the - BadHSDir flag. A blacklisted directory node will simply not be assigned - the HSDir flag in the consensus. - - In order to prevent an attacker from setting up new nodes as replacement - for blacklisted directory nodes, all directory nodes in the same /24 - subnet are blacklisted, too.
Furthermore, if two or more directory nodes - are blacklisted in the same /16 subnet concurrently, all other directory - nodes in that /16 subnet are blacklisted, too. Blacklisting holds for at - most 48 hours. - - 2. Publish Fewer Replicas - - The evaluation has shown that the probability of a directory node to - serve a previously stored descriptor is 85.7% (more precisely, this is - the 0.001-quantile of the empirical distribution with the rationale that - it holds for 99.9% of all empirical cases). If descriptors are replicated - to x directory nodes, the probability of at least one of the replicas to - be available for clients is 1 - (1 - 85.7%) ^ x. In order to achieve an - overall availability of 99.9%, x = 3.55 replicas need to be stored. From - this follows that 4 replicas are sufficient, rather than the currently - stored 6 replicas. - - Further, the current design stores 2 sets of descriptors on 3 directory - nodes with consecutive identities. Originally, this was meant to - facilitate replication between directory nodes, which has not been and - will not be implemented (the selection criterion of 24 hours uptime does - not make it necessary). As a result, storing descriptors on directory - nodes with consecutive identities is not required. In fact it should be - avoided to enable an attacker to create "black holes" in the identifier - ring. - - Hidden services should store their descriptors on 4 non-consecutive - directory nodes, and clients should request descriptors from these - directory nodes only. For compatibility reasons, hidden services also - store their descriptors on 2 consecutive directory nodes. Hence, 0.2.0.x - clients will be able to retrieve 4 out of 6 descriptors, but will fail - for the remaining 2 descriptors, which is sufficient for reliability. As - soon as 0.2.0.x is deprecated, hidden services can stop publishing the - additional 2 replicas. - - 3. 
Change Default Value of Being Hidden Service Directory - - The requirements for becoming a hidden service directory node are an open - directory port and an uptime of at least 24 hours. The evaluation has - shown that there are 300 hidden service directory candidates in the mean, - but only 6 of them are configured to act as hidden service directories. - This is bad, because those 6 nodes need to serve a large share of all - hidden service descriptors. Optimally, there should be hundreds of hidden - service directories. Having a large number of 0.2.1.x directory nodes - also has a positive effect on 0.2.0.x hidden services and clients. - - Therefore, the new default of HidServDirectoryV2 should be 1, so that a - Tor relay that has an open directory port automatically accepts and - serves v2 hidden service descriptors. A relay operator can still opt-out - running a hidden service directory by changing HidServDirectoryV2 to 0. - The additional bandwidth requirements for running a hidden service - directory node in addition to being a directory cache are negligible. - - 4. Make Descriptors Persistent on Directory Nodes - - Hidden service directories that are restarted by their operators or after - a failure will not be selected as hidden service directories within the - next 24 hours. However, some clients might still think that these nodes - are responsible for certain descriptors, because they work on the basis - of network consensuses that are up to three hours old. The directory - nodes should be able to serve the previously received descriptors to - these clients. Therefore, directory nodes make all received descriptors - persistent and load previously received descriptors on startup. - - 5. Store and Serve Descriptors Regardless of Responsibility - - Currently, directory nodes only accept descriptors for which they think - they are responsible. 
This may lead to problems when a directory node - uses an older or newer network consensus than the hidden service or client - or when a directory node has been restarted recently. In fact, there are - no security issues in storing or serving descriptors for which a - directory node thinks it is not responsible. To the contrary, doing so - may improve reliability in border cases. As a result, a directory node - does not pay attention to responsibility when receiving a publication or - fetch request, but stores or serves the requested descriptor. Likewise, - the directory node does not remove descriptors when it thinks it is not - responsible for them any more. - - 6. Avoid Periodic Descriptor Re-Publication - - In the current implementation a hidden service re-publishes its - descriptor either when its content changes or an hour elapses. However, - the evaluation has shown that failures of hidden service directory nodes, - i.e. of nodes that have not failed within the last 24 hours, are very - rare. Together with making descriptors persistent on directory nodes, - there is no necessity to re-publish descriptors hourly. - - The only two events leading to descriptor re-publication should be a - change of the descriptor content and a new directory node becoming - responsible for the descriptor. Hidden services should therefore consider - re-publication every time they learn about a new network consensus - instead of hourly. - - 7. Discard Expired Descriptors - - The current implementation lets directory nodes keep a descriptor for two - days before discarding it. However, with the v2 design, descriptors are - only valid for at most one day. Directory nodes should determine the - validity of stored descriptors and discard them one hour after they have - expired (to compensate for wrong clocks on clients). - - 8.
Shorten Client-Side Descriptor Fetch History - - When clients try to download a hidden service descriptor, they memorize - fetch requests to directory nodes for up to 15 minutes. This allows them - to request all replicas of a descriptor to avoid bad or failing directory - nodes, but without querying the same directory node twice. - - The downside is that a client that has requested a descriptor without - success, will not be able to find a hidden service that has been started - during the following 15 minutes after the client's last request. - - This can be improved by shortening the fetch history to only 5 minutes. - This time should be sufficient to complete requests for all replicas of a - descriptor, but without ending in an infinite request loop. - -Compatibility: - - All proposed improvements are compatible to the currently implemented - design as described in proposal 114. - diff --git a/doc/spec/proposals/144-enforce-distinct-providers.txt b/doc/spec/proposals/144-enforce-distinct-providers.txt deleted file mode 100644 index aa460482f1..0000000000 --- a/doc/spec/proposals/144-enforce-distinct-providers.txt +++ /dev/null @@ -1,165 +0,0 @@ -Filename: 144-enforce-distinct-providers.txt -Title: Increase the diversity of circuits by detecting nodes belonging the - same provider -Author: Mfr -Created: 2008-06-15 -Status: Draft - -Overview: - - Increase network security by reducing the capacity of the relay or - ISPs monitoring personally or requisition, a large part of traffic - Tor trying to break circuits privacy. A way to increase the - diversity of circuits without killing the network performance. - -Motivation: - - Since 2004, Roger an Nick publication about diversity [1], very fast - relays Tor running are focused among an half dozen of providers, - controlling traffic of some dozens of routers [2]. 
- - In the same way the generalization of VMs clonables paid by hour, - allowing starting in few minutes and for a small cost, a set of very - high-speed relay whose in a few hours can attract a big traffic that - can be analyzed, increasing the vulnerability of the network. - - Whether ISPs or domU providers, these usually have several groups of - IP Class B. Also the restriction in place EnforceDistinctSubnets - automatically excluding IP subnet class B is only partially - effective. By contrast a restriction at the class A will be too - restrictive. - - Therefore it seems necessary to consider another approach. - -Proposal: - - Add a provider control based on AS number added by the router on is - descriptor, controlled by Directories Authorities, and used like the - declarative family field for circuit creating. - -Design: - -Step 1 : - - Add to the router descriptor a provider information get request [4] - by the router itself. - - "provider" name NL - - 'names' is the AS number of the router formated like this: - 'ASxxxxxx' where AS is fixed and xxxxxx is the AS number, - left aligned ( ex: AS98304 , AS4096,AS1 ) or if AS number - is missing the network A class number is used like that: - 'ANxxx' where AN is fixed and xxx is the first 3 digits of - the IP (ex: for the IP 1.1.1.2 AN1) or an 'L' value is set - if it's a local network IP. - - If two ORs list one another in their "provider" entries, - then OPs should treat them as a single OR for the purpose - of path selection. - - For example, if node A's descriptor contains "provider B", - and node B's descriptor contains "provider A", then node A - and node B should never be used on the same circuit. - - Add the regarding config option in torrc - - EnforceDistinctProviders set to 1 by default. - Permit building circuits with relays in the same provider - if set to 0. - Regarding to proposal 135 if TestingTorNetwork is set - need to be EnforceDistinctProviders is unset. 
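The format of the "provider" value described in Step 1 can be sketched roughly as follows. This is only an illustration under assumptions: the function name `provider_name` is hypothetical, the AS number is assumed to have been looked up elsewhere (None when it could not be resolved), "local network IP" is interpreted via the private, loopback and link-local checks of Python's `ipaddress` module, and the local check is assumed to take precedence over the others.

```python
import ipaddress


def provider_name(ip, asn=None):
    """Return "L", "AS<number>" or "AN<first octet>" for an IPv4 address."""
    addr = ipaddress.IPv4Address(ip)
    # Assumption: "local network IP" means private, loopback or link-local.
    if addr.is_private or addr.is_loopback or addr.is_link_local:
        return "L"
    if asn is not None:
        return f"AS{asn}"              # e.g. AS4096
    return f"AN{int(addr) >> 24}"      # class A number, e.g. AN1 for 1.1.1.2
```

With EnforceDistinctProviders set, a client would then presumably refuse to place two relays with the same provider value on one circuit.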
- - Control by Authorities Directories of the AS numbers - - The Directories Authority control the AS numbers of the new node - descriptor uploaded. - - If an old version is operated by the node this test is - bypassed. - - If AS number get by request is different from the - description, router is flagged as non-Valid by the testing - Authority for the voting process. - -Step 2 When a ' significant number of nodes' of valid routers are -generating descriptor with provider information. - - Add missing provider information get by DNS request -functionality for the circuit user: - - During circuit building, computing, OP apply first - family check and EnforceDistinctSubnets directives for - performance, then if provider info is needed and - missing in router descriptor try to get AS provider - info by DNS request [4]. This information could be - DNS cached. AN ( class A number) is never generated - during this process to prevent DNS block problems. If - DNS request fails ignore and continue building - circuit. - -Step 3 When the 'whole majority' of valid Tor clients are providing -DNS request. - - Older versions are deprecated and mark as no-Valid. - - EnforceDistinctProviders replace EnforceDistinctSubnets functionnality. - - EnforceDistinctSubnets is removed. - - Functionalities deployed in step 2 are removed. - -Security implications: - - This providermeasure will increase the number of providers - addresses that an attacker must use in order to carry out - traffic analysis. - -Compatibility: - - The presented protocol does not raise compatibility issues - with current Tor versions. The compatibility is preserved by - implementing this functionality in 3 steps, giving time to - network users to upgrade clients and routers. - -Performance and scalability notes: - - Provider change for all routers could reduce a little - performance if the circuit to long. 
- - During step 2 Get missing provider information could increase - building path time and should have a time out. - -Possible Attacks/Open Issues/Some thinking required: - - These proposal seems be compatible with proposal 135 Simplify - Configuration of Private Tor Networks. - - This proposal does not resolve multiples AS owners and top - providers traffic monitoring attacks [5]. - - Unresolved AS number are treated as a Class A network. Perhaps - should be marked as invalid. But there's only fives items on - last check see [2]. - - Need to define what's a 'significant number of nodes' and - 'whole majority' ;-) - -References: -[1] Location Diversity in Anonymity Networks by Nick Feamster and Roger -Dingledine. -In the Proceedings of the Workshop on Privacy in the Electronic Society -(WPES 2004), Washington, DC, USA, October 2004 -http://freehaven.net/anonbib/#feamster:wpes2004 -[2] http://as4jtw5gc6efb267.onion/IPListbyAS.txt -[3] see Goodell Tor Exit Page -http://cassandra.eecs.harvard.edu/cgi-bin/exit.py -[4] see the great IP to ASN DNS Tool -http://www.team-cymru.org/Services/ip-to-asn.html -[5] Sampled Traffic Analysis by Internet-Exchange-Level Adversaries by -Steven J. Murdoch and Piotr Zielinski. -In the Proceedings of the Seventh Workshop on Privacy Enhancing Technologies - -(PET 2007), Ottawa, Canada, June 2007. -http://freehaven.net/anonbib/#murdoch-pet2007 -[5] http://bugs.noreply.org/flyspray/index.php?do=details&id=690 diff --git a/doc/spec/proposals/145-newguard-flag.txt b/doc/spec/proposals/145-newguard-flag.txt deleted file mode 100644 index 9e61e30be9..0000000000 --- a/doc/spec/proposals/145-newguard-flag.txt +++ /dev/null @@ -1,39 +0,0 @@ -Filename: 145-newguard-flag.txt -Title: Separate "suitable as a guard" from "suitable as a new guard" -Author: Nick Mathewson -Created: 1-Jul-2008 -Status: Open -Target: 0.2.1.x - -[This could be obsoleted by proposal 141, which could replace NewGuard -with a Guard weight.] 
- -Overview - - Right now, Tor has one flag that clients use both to tell which - nodes should be kept as guards, and which nodes should be picked - when choosing new guards. This proposal separates this flag into - two. - -Motivation - - Balancing clients among guards is not done well by our current - algorithm. When a new guard appears, it is chosen by clients - looking for a new guard with the same probability as all existing - guards... but new guards are likelier to be under capacity, whereas - old guards are likelier to be under more use. - -Implementation - - We add a new flag, NewGuard. Clients will change so that when they - are choosing new guards, they only consider nodes with the NewGuard - flag set. - - For now, authorities will always set NewGuard if they are setting - the Guard flag. Later, it will be easy to migrate authorities to - set NewGuard for underused guards. - -Alternatives - - We might instead have authorities list weights with which nodes - should be picked as guards. diff --git a/doc/spec/proposals/146-long-term-stability.txt b/doc/spec/proposals/146-long-term-stability.txt deleted file mode 100644 index 9af0017441..0000000000 --- a/doc/spec/proposals/146-long-term-stability.txt +++ /dev/null @@ -1,84 +0,0 @@ -Filename: 146-long-term-stability.txt -Title: Add new flag to reflect long-term stability -Author: Nick Mathewson -Created: 19-Jun-2008 -Status: Open -Target: 0.2.1.x - -Overview - - This document proposes a new flag to indicate that a router has - existed at the same address for a long time, describes how to - implement it, and explains what it's good for. - -Motivation - - Tor has had three notions of "stability" for servers. Older - directory protocols based a server's stability on its - (self-reported) uptime: a server that had been running for a day was - more stable than a server that had been running for five minutes, - regardless of their past history.
Current directory protocols track - weighted mean time between failure (WMTBF) and weighted fractional - uptime (WFU). WFU is computed as the fraction of time for which the - server is running, with measurements weighted to exponentially - decay such that old days count less. WMTBF is computed as the - average length of intervals for which the server runs between - downtime, with old intervals weighted to count less. - - WMTBF is useful in answering the question: "If a server is running - now, how long is it likely to stay running?" This makes it a good - choice for picking servers for streams that need to be long-lived. - WFU is useful in answering the question: "If I try connecting to - this server at an arbitrary time, is it likely to be running?" This - makes it an important factor for picking guard nodes, since we want - guard nodes to be usually-up. - - There are other questions that clients want to answer, however, for - which the current flags aren't very useful. The one that this - proposal addresses is, - - "If I found this server in an old consensus, is it likely to - still be running at the same address?" - - This one is useful when we're trying to find directory mirrors in a - fallback-consensus file. This property is equivalent to, - - "If I find this server in a current consensus, how long is it - likely to exist on the network?" - - This one is useful if we're trying to pick introduction points or - something and care more about churn rate than about whether every IP - will be up all the time. - -Implementation: - - I propose we add a new flag, called "Longterm." Authorities should - set this flag for routers if their Longevity is in the upper - quartile of all routers. A router's Longevity is computed as the - total amount of days in the last year or so[*] for which the router has - been Running at least once at its current IP:orport pair. - - Clients should use directory servers from a fallback-consensus only - if they have the Longterm flag set. 
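The Longterm rule above can be sketched in a few lines. This is only an illustration under stated assumptions: the proposal leaves "the last year or so" deliberately vague, so the 365-day window, the `history` mapping (router name to the set of day numbers on which it was seen Running at its current IP:orport), and the function names are all hypothetical, not part of any Tor implementation.

```python
WINDOW_DAYS = 365  # assumption: the proposal only says "the last year or so"


def longevity(running_days, today, window=WINDOW_DAYS):
    """Count the days inside the window on which the router was seen Running."""
    return sum(1 for day in running_days if today - window < day <= today)


def longterm_routers(history, today):
    """Return the routers whose Longevity falls in the upper quartile.

    `history` maps a router name to the set of day numbers on which that
    router was Running at least once at its current IP:orport pair.
    Routers tied with the cutoff score are all flagged, so slightly more
    than a quarter of the routers may receive the flag.
    """
    scores = {router: longevity(days, today) for router, days in history.items()}
    if not scores:
        return set()
    ranked = sorted(scores.values())
    # The score at the 75% rank position is the entry bar for the top quartile.
    cutoff = ranked[min(int(0.75 * len(ranked)), len(ranked) - 1)]
    return {router for router, score in scores.items() if score >= cutoff}
```

An authority operator's manual override (removing Longterm regardless of history) would be applied to the returned set afterwards; it is left out here to keep the sketch short.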
- - Authority ops should be able to mark particular routers as not - Longterm, regardless of history. (For instance, it makes sense to - remove the Longterm flag from a router whose op says that it will - need to shutdown in a month.) - - [*] This is deliberately vague, to permit efficient implementations. - -Compatibility and migration issues: - - The voting protocol already acts gracefully when new flags are - added, so no change to the voting protocol is needed. - - Tor won't have collected this data, however. It might be desirable - to bootstrap it from historical consensuses. Alternatively, we can - just let the algorithm run for a month or two. - -Issues and future possibilities: - - Longterm is a really awkward name. - - diff --git a/doc/spec/proposals/147-prevoting-opinions.txt b/doc/spec/proposals/147-prevoting-opinions.txt deleted file mode 100644 index 3d9659c984..0000000000 --- a/doc/spec/proposals/147-prevoting-opinions.txt +++ /dev/null @@ -1,58 +0,0 @@ -Filename: 147-prevoting-opinions.txt -Title: Eliminate the need for v2 directories in generating v3 directories -Author: Nick Mathewson -Created: 2-Jul-2008 -Status: Accepted -Target: 0.2.1.x - -Overview - - We propose a new v3 vote document type to replace the role of v2 - networkstatus information in generating v3 consensuses. - -Motivation - - When authorities vote on which descriptors are to be listed in the - next consensus, it helps if they all know about the same descriptors - as one another. But a hostile, confused, or out-of-date server may - upload a descriptor to only some authorities. In the current v3 - directory design, the authorities don't have a good way to tell one - another about the new descriptor until they exchange votes... but by - the time this happens, they are already committed to their votes, - and they can't add anybody they learn about from other authorities - until the next voting cycle. That's no good! 
- - The current Tor implementation avoids this problem by having - authorities also look at v2 networkstatus documents, but we'd like - in the long term to eliminate these, once 0.1.2.x is obsolete. - -Design: - - We add a new value for vote-status in v3 consensus documents in - addition to "consensus" and "vote": "opinion". Authorities generate - and sign an opinion document as if they were generating a vote, - except that they generate opinions earlier than they generate votes. - - Authorities don't need to generate more than one opinion document - per voting interval, but may. They should send it to the other - authorities they know about, at the regular vote upload URL, before - the authorities begin voting, so that enough time remains for the - authorities to fetch new descriptors. - - Additionally, authories make their opinions available at - http://<hostname>/tor/status-vote/next/opinion.z - and download opinions from authorities they haven't heard from in a - while. - - Authorities MAY generate opinions on demand. - - Upon receiving an opinion document, authorities scan it for any - descriptors that: - - They might accept. - - Are for routers they don't know about, or are published more - recently than any descriptor they have for that router. - Authorities then begin downloading such descriptors from authorities - that claim to have them. - - Authorities MAY cache opinion documents, but don't need to. 
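The scan that an authority performs on a received opinion document can be sketched as follows. This is a minimal illustration, not the Tor implementation: the `OpinionEntry` fields, the `known_published` mapping, and the `might_accept` callback are hypothetical names chosen here to stand in for the two conditions listed in the proposal.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class OpinionEntry:
    """One router entry from an opinion document (hypothetical layout)."""
    identity: str           # router identity fingerprint
    descriptor_digest: str  # digest of the descriptor the sender holds
    published: int          # descriptor publication time, seconds since epoch


def descriptors_to_fetch(
    opinion: Iterable[OpinionEntry],
    known_published: Dict[str, int],
    might_accept: Callable[[OpinionEntry], bool],
) -> List[str]:
    """Return digests of descriptors worth downloading from the sender."""
    wanted = []
    for entry in opinion:
        # Condition 1: skip descriptors we would reject anyway.
        if not might_accept(entry):
            continue
        # Condition 2: unknown router, or newer than anything we hold.
        ours = known_published.get(entry.identity)
        if ours is None or entry.published > ours:
            wanted.append(entry.descriptor_digest)
    return wanted
```

The returned digests would then be requested from the authority that sent the opinion, early enough that the descriptors arrive before voting begins.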
- diff --git a/doc/spec/proposals/148-uniform-client-end-reason.txt b/doc/spec/proposals/148-uniform-client-end-reason.txt deleted file mode 100644 index 1db3b3e596..0000000000 --- a/doc/spec/proposals/148-uniform-client-end-reason.txt +++ /dev/null @@ -1,57 +0,0 @@ -Filename: 148-uniform-client-end-reason.txt -Title: Stream end reasons from the client side should be uniform -Author: Roger Dingledine -Created: 2-Jul-2008 -Status: Closed -Implemented-In: 0.2.1.9-alpha - -Overview - - When a stream closes before it's finished, the end relay cell that's - sent includes an "end stream reason" to tell the other end why it - closed. It's useful for the exit relay to send a reason to the client, - so the client can choose a different circuit, inform the user, etc. But - there's no reason to include it from the client to the exit relay, - and in some cases it can even harm anonymity. - - We should pick a single reason for the client-to-exit-relay direction - and always just send that. - -Motivation - - Back when I first deployed the Tor network, it was useful to have - the Tor relays learn why a stream closed, so I could debug both ends - of the stream at once. Now that streams have worked for many years, - there's no need to continue telling the exit relay whether the client - gave up on a stream because of "timeout" or "misc" or what. - - Then in Tor 0.2.0.28-rc, I fixed this bug: - - Fix a bug where, when we were choosing the 'end stream reason' to - put in our relay end cell that we send to the exit relay, Tor - clients on Windows were sometimes sending the wrong 'reason'. The - anonymity problem is that exit relays may be able to guess whether - the client is running Windows, thus helping partition the anonymity - set. Down the road we should stop sending reasons to exit relays, - or otherwise prevent future versions of this bug. 
- - It turned out that non-Windows clients were choosing their reason - correctly, whereas Windows clients were potentially looking at errno - wrong and so always choosing 'misc'. - - I fixed that particular bug, but I think we should prevent future - versions of the bug too. - - (We already fixed it so *circuit* end reasons don't get sent from - the client to the exit relay. But we appear to have skipped over - stream end reasons thus far.) - -Design: - - One option would be to no longer include any 'reason' field in end - relay cells. But that would introduce a partitioning attack ("users - running the old version" vs "users running the new version"). - - Instead I suggest that clients all switch to sending the "misc" reason, - like most of the Windows clients currently do and like the non-Windows - clients already do sometimes. - diff --git a/doc/spec/proposals/149-using-netinfo-data.txt b/doc/spec/proposals/149-using-netinfo-data.txt deleted file mode 100644 index 8bf8375d5d..0000000000 --- a/doc/spec/proposals/149-using-netinfo-data.txt +++ /dev/null @@ -1,42 +0,0 @@ -Filename: 149-using-netinfo-data.txt -Title: Using data from NETINFO cells -Author: Nick Mathewson -Created: 2-Jul-2008 -Status: Open -Target: 0.2.1.x - -Overview - - Current Tor versions send signed IP and timestamp information in - NETINFO cells, but don't use them to their fullest. This proposal - describes how they should start using this info in 0.2.1.x. - -Motivation - - Our directory system relies on clients and routers having - reasonably accurate clocks to detect replayed directory info, and - to set accurate timestamps on directory info they publish - themselves. NETINFO cells contain timestamps. - - Also, the directory system relies on routers having a reasonable - idea of their own IP addresses, so they can publish correct - descriptors. This is also in NETINFO cells. - -Learning the time and IP address - - We need to think about attackers here.
Just because a router tells - us that we have a given IP or a given clock skew doesn't mean that - it's true. We believe this information only if we've heard it from - a majority of the routers we've connected to recently, including at - least 3 routers. Routers only believe this information if the - majority includes at least one authority. - -Avoiding MITM attacks - - Current Tors use the IP addresses published in the other router's - NETINFO cells to see whether the connection is "canonical". Right - now, we prefer to extend circuits over "canonical" connections. In - 0.2.1.x, we should refuse to extend circuits over non-canonical - connections without first trying to build a canonical one. - - diff --git a/doc/spec/proposals/150-exclude-exit-nodes.txt b/doc/spec/proposals/150-exclude-exit-nodes.txt deleted file mode 100644 index b497ae62c1..0000000000 --- a/doc/spec/proposals/150-exclude-exit-nodes.txt +++ /dev/null @@ -1,47 +0,0 @@ -Filename: 150-exclude-exit-nodes.txt -Title: Exclude Exit Nodes from a circuit -Author: Mfr -Created: 2008-06-15 -Status: Closed -Implemented-In: 0.2.1.3-alpha - -Overview - - Right now, Tor users can manually exclude a node from all positions - in their circuits created using the directive ExcludeNodes. - This proposal makes this exclusion less restrictive, allowing users to - exclude a node only from the exit part of a circuit. - -Motivation - - This feature would Help the integration into vidalia (tor exit - branch) or other tools, of features to exclude a country for exit - without reducing circuits possibilities, and privacy. This feature - could help people from a country were many sites are blocked to - exclude this country for browsing, giving them a more stable - navigation. It could also add the possibility for the user to - exclude a currently used exit node. - -Implementation - - ExcludeExitNodes is similar to ExcludeNodes except it's only - the exit node which is excluded for circuit build. 
- - Tor doesn't warn if node from this list is not an exit node. - -Security implications: - - Open also possibilities for a future user bad exit reporting - -Risks: - - Use of this option can make users partitionable under certain attack - assumptions. However, ExitNodes already creates this possibility, - so there isn't much increased risk in ExcludeExitNodes. - - We should still encourage people who exclude an exit node because - of bad behavior to report it instead of just adding it to their - ExcludeExit list. It would be unfortunate if we didn't find out - about broken exits because of this option. This issue can probably - be addressed sufficiently with documentation. - diff --git a/doc/spec/proposals/151-path-selection-improvements.txt b/doc/spec/proposals/151-path-selection-improvements.txt deleted file mode 100644 index af89f21193..0000000000 --- a/doc/spec/proposals/151-path-selection-improvements.txt +++ /dev/null @@ -1,148 +0,0 @@ -Filename: 151-path-selection-improvements.txt -Title: Improving Tor Path Selection -Author: Fallon Chen, Mike Perry -Created: 5-Jul-2008 -Status: Finished -In-Spec: path-spec.txt - -Overview - - The performance of paths selected can be improved by adjusting the - CircuitBuildTimeout and avoiding failing guard nodes. This proposal - describes a method of tracking buildtime statistics at the client, and - using those statistics to adjust the CircuitBuildTimeout. - -Motivation - - Tor's performance can be improved by excluding those circuits that - have long buildtimes (and by extension, high latency). For those Tor - users who require better performance and have lower requirements for - anonymity, this would be a very useful option to have. - -Implementation - - Gathering Build Times - - Circuit build times are stored in the circular array - 'circuit_build_times' consisting of uint32_t elements as milliseconds. 
- The total size of this array is based on the number of circuits - it takes to converge on a good fit of the long term distribution of - the circuit builds for a fixed link. We do not want this value to be - too large, because it will make it difficult for clients to adapt to - moving between different links. - - From our observations, the minimum value for a reasonable fit appears - to be on the order of 500 (MIN_CIRCUITS_TO_OBSERVE). However, to keep - a good fit over the long term, we store 5000 most recent circuits in - the array (NCIRCUITS_TO_OBSERVE). - - The Tor client will build test circuits at a rate of one per - minute (BUILD_TIMES_TEST_FREQUENCY) up to the point of - MIN_CIRCUITS_TO_OBSERVE. This allows a fresh Tor to have - a CircuitBuildTimeout estimated within 8 hours after install, - upgrade, or network change (see below). - - Long Term Storage - - The long-term storage representation is implemented by storing a - histogram with BUILDTIME_BIN_WIDTH millisecond buckets (default 50) when - writing out the statistics to disk. The format this takes in the - state file is 'CircuitBuildTime <bin-ms> <count>', with the total - specified as 'TotalBuildTimes <total>' - Example: - - TotalBuildTimes 100 - CircuitBuildTimeBin 25 50 - CircuitBuildTimeBin 75 25 - CircuitBuildTimeBin 125 13 - ... - - Reading the histogram in will entail inserting <count> values - into the circuit_build_times array each with the value of - <bin-ms> milliseconds. In order to evenly distribute the values - in the circular array, the Fisher-Yates shuffle will be performed - after reading values from the bins. - - Learning the CircuitBuildTimeout - - Based on studies of build times, we found that the distribution of - circuit buildtimes appears to be a Frechet distribution. However, - estimators and quantile functions of the Frechet distribution are - difficult to work with and slow to converge. 
So instead, since we - are only interested in the accuracy of the tail, we approximate - the tail of the distribution with a Pareto curve starting at - the mode of the circuit build time sample set. - - We will calculate the parameters for a Pareto distribution - fitting the data using the estimators at - http://en.wikipedia.org/wiki/Pareto_distribution#Parameter_estimation. - - The timeout itself is calculated by using the Quartile function (the - inverted CDF) to give us the value on the CDF such that - BUILDTIME_PERCENT_CUTOFF (80%) of the mass of the distribution is - below the timeout value. - - Thus, we expect that the Tor client will accept the fastest 80% of - the total number of paths on the network. - - Detecting Changing Network Conditions - - We attempt to detect both network connectivity loss and drastic - changes in the timeout characteristics. - - We assume that we've had network connectivity loss if 3 circuits - timeout and we've received no cells or TLS handshakes since those - circuits began. We then set the timeout to 60 seconds and stop - counting timeouts. - - If 3 more circuits timeout and the network still has not been - live within this new 60 second timeout window, we then discard - the previous timeouts during this period from our history. - - To detect changing network conditions, we keep a history of - the timeout or non-timeout status of the past RECENT_CIRCUITS (20) - that successfully completed at least one hop. If more than 75% - of these circuits timeout, we discard all buildtimes history, - reset the timeout to 60, and then begin recomputing the timeout. - - Testing - - After circuit build times, storage, and learning are implemented, - the resulting histogram should be checked for consistency by - verifying it persists across successive Tor invocations where - no circuits are built. 
In addition, we can also use the existing - buildtime scripts to record build times, and verify that the histogram - the python produces matches that which is output to the state file in Tor, - and verify that the Pareto parameters and cutoff points also match. - - We will also verify that there are no unexpected large deviations from - node selection, such as nodes from distant geographical locations being - completely excluded. - - Dealing with Timeouts - - Timeouts should be counted as the expectation of the region of - of the Pareto distribution beyond the cutoff. This is done by - generating a random sample for each timeout at points on the - curve beyond the current timeout cutoff. - - Future Work - - At some point, it may be desirable to change the cutoff from a - single hard cutoff that destroys the circuit to a soft cutoff and - a hard cutoff, where the soft cutoff merely triggers the building - of a new circuit, and the hard cutoff triggers destruction of the - circuit. - - It may also be beneficial to learn separate timeouts for each - guard node, as they will have slightly different distributions. - This will take longer to generate initial values though. - -Issues - - Impact on anonymity - - Since this follows a Pareto distribution, large reductions on the - timeout can be achieved without cutting off a great number of the - total paths. This will eliminate a great deal of the performance - variation of Tor usage. 
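The timeout-learning steps of proposal 151 can be sketched as follows: expand the stored histogram, shuffle it, fit a Pareto tail that starts at the mode, and take the 80% quantile as the timeout. This is a hedged illustration, not the code that shipped in Tor. The function names are hypothetical, and two details are assumptions made here: samples below the mode are clamped to the mode when estimating the shape parameter, and the mode is taken as the midpoint of the fullest histogram bin.

```python
import math
import random
from collections import Counter

BUILDTIME_BIN_WIDTH = 50         # milliseconds per histogram bin (from the proposal)
BUILDTIME_PERCENT_CUTOFF = 0.80  # fraction of the distribution kept under the timeout


def expand_histogram(bins, rng=random):
    """Turn (bin_ms, count) pairs into a shuffled list of build times.

    random.shuffle performs a Fisher-Yates shuffle, matching the proposal's
    requirement to spread the values evenly through the circular array.
    """
    times = [bin_ms for bin_ms, count in bins for _ in range(count)]
    rng.shuffle(times)
    return times


def histogram_mode(build_times_ms, bin_width=BUILDTIME_BIN_WIDTH):
    """Return the midpoint of the most populated bin (lowest bin wins ties)."""
    counts = Counter(int(t // bin_width) for t in build_times_ms)
    best = min(counts, key=lambda b: (-counts[b], b))
    return best * bin_width + bin_width / 2


def pareto_alpha(build_times_ms, xm):
    """Maximum-likelihood shape estimate: alpha = n / sum(ln(x_i / xm)).

    Assumption: samples below xm are clamped to xm so they add zero.
    """
    log_sum = sum(math.log(max(t, xm) / xm) for t in build_times_ms)
    if log_sum == 0:
        raise ValueError("no samples beyond the mode; cannot fit a tail")
    return len(build_times_ms) / log_sum


def pareto_quantile(q, xm, alpha):
    """Inverse CDF of the Pareto distribution: xm / (1 - q)^(1 / alpha)."""
    return xm / (1.0 - q) ** (1.0 / alpha)


def circuit_build_timeout(build_times_ms, cutoff=BUILDTIME_PERCENT_CUTOFF):
    """Estimate the timeout (ms) below which `cutoff` of the builds should fall."""
    xm = histogram_mode(build_times_ms)
    alpha = pareto_alpha(build_times_ms, xm)
    return pareto_quantile(cutoff, xm, alpha)


# Usage with the three bins shown in the proposal's state-file example.
times = expand_histogram([(25, 50), (75, 25), (125, 13)])
timeout_ms = circuit_build_timeout(times)
```

Because the fit only uses the mode and the log-ratios of the samples, the shuffle does not affect the resulting timeout; it only matters for which entries the circular array later overwrites.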
diff --git a/doc/spec/proposals/152-single-hop-circuits.txt b/doc/spec/proposals/152-single-hop-circuits.txt deleted file mode 100644 index d0b28b1c72..0000000000 --- a/doc/spec/proposals/152-single-hop-circuits.txt +++ /dev/null @@ -1,62 +0,0 @@ -Filename: 152-single-hop-circuits.txt -Title: Optionally allow exit from single-hop circuits -Author: Geoff Goodell -Created: 13-Jul-2008 -Status: Closed -Implemented-In: 0.2.1.6-alpha - -Overview - - Provide a special configuration option that adds a line to descriptors - indicating that a router can be used as an exit for one-hop circuits, - and allow clients to attach streams to one-hop circuits provided - that the descriptor for the router in the circuit includes this - configuration option. - -Motivation - - At some point, code was added to restrict the attachment of streams - to one-hop circuits. - - The idea seems to be that we can use the cost of forking and - maintaining a patch as a lever to prevent people from writing - controllers that jeopardize the operational security of routers - and the anonymity properties of the Tor network by creating and - using one-hop circuits rather than the standard three-hop circuits. - It may be, for example, that some users do not actually seek true - anonymity but simply reachability through network perspectives - afforded by the Tor network, and since anonymity is stronger in - numbers, forcing users to contribute to anonymity and decrease the - risk to server operators by using full-length paths may be reasonable. - - As presently implemented, the sweeping restriction of one-hop circuits - for all routers limits the usefulness of Tor as a general-purpose - technology for building circuits. In particular, we should allow - for controllers, such as Blossom, that create and use single-hop - circuits involving routers that are not part of the Tor network. 
- -Design - - Introduce a configuration option for Tor servers that, when set, - indicates that a router is willing to provide exit from one-hop - circuits. Routers with this policy will not require that a circuit - has at least two hops when it is used as an exit. - - In addition, routers for which this configuration option - has been set will have a line in their descriptors, "opt - exit-from-single-hop-circuits". Clients will keep track of which - routers have this option and allow streams to be attached to - single-hop circuits that include such routers. - -Security Considerations - - This approach seems to eliminate the worry about operational router - security, since server operators will not set the configuration - option unless they are willing to take on such risk. - - To reduce the impact on anonymity of the network resulting - from including such "risky" routers in regular Tor path - selection, clients may systematically exclude routers with "opt - exit-from-single-hop-circuits" when choosing random paths through - the Tor network. - diff --git a/doc/spec/proposals/153-automatic-software-update-protocol.txt b/doc/spec/proposals/153-automatic-software-update-protocol.txt deleted file mode 100644 index c2979bb695..0000000000 --- a/doc/spec/proposals/153-automatic-software-update-protocol.txt +++ /dev/null @@ -1,175 +0,0 @@ -Filename: 153-automatic-software-update-protocol.txt -Title: Automatic software update protocol -Author: Jacob Appelbaum -Created: 14-July-2008 -Status: Superseded - -[Superseded by thandy-spec.txt] - - - Automatic Software Update Protocol Proposal - -0.0 Introduction - -The Tor project and its users require a robust method to update shipped -software bundles. The software bundles often include Vidalia, Privoxy, Polipo, -Torbutton and of course Tor itself. It is not inconceivable that an update -could include all of the Tor Browser Bundle.
It seems reasonable to make this -a standalone program that can be called in shell scripts, cronjobs or by -various Tor controllers. - -0.1 Minimal Tasks To Implement Automatic Updating - -At the most minimal, an update must be able to do the following: - - 0 - Detect the current Tor version, note the working status of Tor. - 1 - Detect the latest Tor version. - 2 - Fetch the latest version in the form of a platform specific package(s). - 3 - Verify the integrity of the downloaded package(s). - 4 - Install the verified package(s). - 5 - Test that the new package(s) work properly. - -0.2 Specific Enumeration Of Minimal Tasks - -To implement requirement 0, we need to detect the current Tor version of both -the updater and the current running Tor. The update program itself should be -versioned internally. This requirement should also test connecting through Tor -itself and note if such connections are possible. - -To implement requirement 1, we need to learn the consensus from the directory -authorities or fall back to a known good URL with cryptographically signed -content. - -To implement requirement 2, we need to download Tor - hopefully over Tor. - -To implement requirement 3, we need to verify the package signature. - -To implement requirement 4, we need to use a platform specific method of -installation. The Tor controller performing the update performs these platform -specific methods. - -To implement requirement 5, we need to be able to extend circuits and reach -the internet through Tor. - -0.x Implementation Goals - -The update system will be cross platform and rely on as little external code -as possible. If the update system uses it, it must be updated by the update -system itself. It will consist only of free software and will not rely on any -non-free components until the actual installation phase. If a package manager -is in use, it will be platform specific and thus only invoked by the update -system implementing the update protocol.
- -The update system itself will attempt to perform update related network -activity over Tor. Possibly it will attempt to use a hidden service first. -It will attempt to use novel and not so novel caching -when possible, it will always verify cryptographic signatures before any -remotely fetched code is executed. In the event of an unusable Tor system, -it will be able to attempt to fetch updates without Tor. This should be user -configurable, some users will be unwilling to update without the protection of -using Tor - others will simply be unable because of blocking of the main Tor -website. - -The update system will track current version numbers of Tor and supporting -software. The update system will also track known working versions to assist -with automatic The update system itself will be a standalone library. It will be -strongly versioned internally to match the Tor bundle it was shiped with. The -update system will keep track of the given platform, cpu architecture, lsb_release, -package management functionality and any other platform specific metadata. - -We have referenced two popular automatic update systems, though neither fit -our needs, both are useful as an idea of what others are doing in the same -area. - -The first is sparkle[0] but it is sadly only available for Cocoa -environments and is written in Objective C. This doesn't meet our requirements -because it is directly tied into the private Apple framework. - -The second is the Mozilla Automatic Update System[1]. It is possibly useful -as an idea of how other free software projects automatically update. It is -however not useful in its currently documented form. - - - [0] http://sparkle.andymatuschak.org/documentation/ - [1] http://wiki.mozilla.org/AUS:Manual - -0.x Previous methods of Tor and related software update - -Previously, Tor users updated their Tor related software by hand. There has -been no fully automatic method for any user to update. 
In addition, there -hasn't been any specific way to find out the most current stable version of Tor -or related software as voted on by the directory authority consensus. - -0.x Changes to the directory specification - -We will want to supplement client-versions and server-versions in the -consensus voting with another version identifier known as -'auto-update-versions'. This will keep track of the current consensus of -specific versions that are best per platform and per architecture. It should -be noted that while the Mac OS X universal binary may be the best for x86 -processors with Tiger, it may not be the best for PPC users on Panther. This -goes for all of the package updates. We want to prevent updates that cause Tor -to break even if the updating program can recover gracefully. - -x.x Assumptions About Operating System Package Management - -It is assumed that users will use their package manager unless they are on -Microsoft Windows (any version) or Mac OS X (any version). Microsoft Windows -users will have integration with the normal "add/remove program" functionality -that said users would expect. - -x.x Package Update System Failure Modes - -The package update will try to ensure that a user always has a working Tor at -the very least. It will keep state to remember versions of Tor that were able -to bootstrap properly and reach the rest of the Tor network. It will also keep -note of which versions broke. It will select the best Tor that works for the -user. It will also allow for anonymized bug reporting on the packages -available and tested by the auto-update system. - -x.x Package Signature Verification - -The update system will be aware of replay attacks against the update signature -system itself. It will not allow package update signatures that are radically -out of date. It will be a multi-key system to prevent any single party from -forging an update. The key will be updated regularly. This is like authority -key (see proposal 103) usage.
- -x.x Package Caching - -The update system will iterate over different update methods. Whichever method -is picked will have caching functionality. Each Tor server itself should be -able to serve cached update files. This will be an option that friendly server -administrators can turn on should they wish to support caching. In addition, -it is possible to cache the full contents of a package in an -authoratative DNS zone. Users can then query the DNS zone for their package. -If we wish to further distribute the update load, we can also offer packages -with encrypted bittorrent. Clients who wish to share the updates but do not -wish to be a server can help distribute Tor updates. This can be tied together -with the DNS caching[2][3] if needed. - - [2] http://www.netrogenic.com/dnstorrent/ - [3] http://www.doxpara.com/ozymandns_src_0.1.tgz - -x.x Helping Our Users Spread Tor - -There should be a way for a user to participate in the packaging caching as -described in section x.x. This option should be presented by the Tor -controller. - -x.x Simple HTTP Proxy To The Tor Project Website - -It has been suggested that we should provide a simple proxy that allows a user -to visit the main Tor website to download packages. This was part of a -previous proposal and has not been closely examined. - -x.x Package Installation - -Platform specific methods for proper package installation will be left to the -controller that is calling for an update. Each platform is different, the -installation options and user interface will be specific to the controller in -question. - -x.x Other Things - -Other things should be added to this proposal. What are they? 
diff --git a/doc/spec/proposals/154-automatic-updates.txt b/doc/spec/proposals/154-automatic-updates.txt deleted file mode 100644 index 4c2c6d3899..0000000000 --- a/doc/spec/proposals/154-automatic-updates.txt +++ /dev/null @@ -1,377 +0,0 @@ -Filename: 154-automatic-updates.txt -Title: Automatic Software Update Protocol -Author: Matt Edman -Created: 30-July-2008 -Status: Superseded -Target: 0.2.1.x - -Superseded by thandy-spec.txt - -Scope - - This proposal specifies the method by which an automatic update client can - determine the most recent recommended Tor installation package for the - user's platform, download the package, and then verify that the package was - downloaded successfully. While this proposal focuses on only the Tor - software, the protocol defined is sufficiently extensible such that other - components of the Tor bundles, like Vidalia, Polipo, and Torbutton, can be - managed and updated by the automatic update client as well. - - The initial target platform for the automatic update framework is Windows, - given that's the platform used by a majority of our users and that it lacks - a sane package management system that many Linux distributions already have. - Our second target platform will be Mac OS X, and so the protocol will be - designed with this near-future direction in mind. - - Other client-side aspects of the automatic update process, such as user - interaction, the interface presented, and actual package installation - procedure, are outside the scope of this proposal. - - -Motivation - - Tor releases new versions frequently, often with important security, - anonymity, and stability fixes. Thus, it is important for users to be able - to promptly recognize when new versions are available and to easily - download, authenticate, and install updated Tor and Tor-related software - packages. 
- - Tor's control protocol [2] provides a method by which controllers can - identify when the user's Tor software is obsolete or otherwise no longer - recommended. Currently, however, no mechanism exists for clients to - automatically download and install updated Tor and Tor-related software for - the user. - - -Design Overview - - The core of the automatic update framework is a well-defined file called a - "recommended-packages" file. The recommended-packages file is accessible via - HTTP[S] at one or more well-defined URLs. An example recommended-packages - URL may be: - - https://updates.torproject.org/recommended-packages - - The recommended-packages document is formatted according to Section 1.2 - below and specifies the most recent recommended installation package - versions for Tor or Tor-related software, as well as URLs at which the - packages and their signatures can be downloaded. - - An automatic update client process runs on the Tor user's computer and - periodically retrieves the recommended-packages file according to the method - described in Section 2.0. As described further in Section 1.2, the - recommended-packages file is signed and can be verified by the automatic - update client with one or more public keys included in the client software. - Since it is signed, the recommended-packages file can be mirrored by - multiple hosts (e.g., Tor directory authorities), whose URLs are included in - the automatic update client's configuration. - - After retrieving and verifying the recommended-packages file, the automatic - update client compares the versions of the recommended software packages - listed in the file with those currently installed on the end-user's - computer. If one or more of the installed packages is determined to be out - of date, an updated package and its signature will be downloaded from one of - the package URLs listed in the recommended-packages file as described in - Section 2.2. 
- - The automatic update system uses a multilevel signing key scheme for package - signatures. There are a small number of entities we call "packaging - authorities" that each have their own signing key. A packaging authority is - responsible for signing and publishing the recommended-packages file. - Additionally, each individual packager responsible for producing an - installation package for one or more platforms has their own signing key. - Every packager's signing key must be signed by at least one of the packaging - authority keys. - - -Specification - - 1. recommended-packages Specification - - In this section we formally specify the format of the published - recommended-packages file. - - 1.1. Document Meta-format - - The recommended-packages document follows the lightweight extensible - information format defined in Tor's directory protocol specification [1]. In - the interest of self-containment, we have reproduced the relevant portions - of that format's specification in this Section. (Credits to Nick Mathewson - for much of the original format definition language.) - - The highest level object is a Document, which consists of one or more - Items. Every Item begins with a KeywordLine, followed by zero or more - Objects. A KeywordLine begins with a Keyword, optionally followed by - whitespace and more non-newline characters, and ends with a newline. A - Keyword is a sequence of one or more characters in the set [A-Za-z0-9-]. - An Object is a block of encoded data in pseudo-Open-PGP-style - armor. (cf. RFC 2440) - - More formally: - - Document ::= (Item | NL)+ - Item ::= KeywordLine Object* - KeywordLine ::= Keyword NL | Keyword WS ArgumentChar+ NL - Keyword ::= KeywordChar+ - KeywordChar ::= 'A' ... 'Z' | 'a' ... 'z' | '0' ... '9' | '-' - ArgumentChar ::= any printing ASCII character except NL. 
- WS ::= (SP | TAB)+ - Object ::= BeginLine Base-64-encoded-data EndLine - BeginLine ::= "-----BEGIN " Keyword "-----" NL - EndLine ::= "-----END " Keyword "-----" NL - - The BeginLine and EndLine of an Object must use the same keyword. - - In our Document description below, we also tag Items with a multiplicity in - brackets. Possible tags are: - - "At start, exactly once": These items MUST occur in every instance of the - document type, and MUST appear exactly once, and MUST be the first item in - their documents. - - "Exactly once": These items MUST occur exactly one time in every - instance of the document type. - - "Once or more": These items MUST occur at least once in any instance - of the document type, and MAY occur more than once. - - "At end, exactly once": These items MUST occur in every instance of - the document type, and MUST appear exactly once, and MUST be the - last item in their documents. - - 1.2. recommended-packages Document Format - - When interpreting a recommended-packages Document, software MUST ignore - any KeywordLine that starts with a keyword it doesn't recognize; future - implementations MUST NOT require current automatic update clients to - understand any KeywordLine not currently described. - - In lines that take multiple arguments, extra arguments SHOULD be - accepted and ignored. - - The currently defined Items contained in a recommended-packages document - are: - - "recommended-packages-format" SP number NL - - [Exactly once] - - This Item specifies the version of the recommended-packages format that - is contained in the subsequent document. The version defined in this - proposal is version "1". Subsequent iterations of this protocol MUST - increment this value if they introduce incompatible changes to the - document format and MAY increment this value if they only introduce - additional Keywords. 
- - "published" SP YYYY-MM-DD SP HH:MM:SS NL - - [Exactly once] - - The time, in GMT, when this recommended-packages document was generated. - Automatic update clients SHOULD ignore Documents over 60 days old. - - "tor-stable-win32-version" SP TorVersion NL - - [Exactly once] - - This keyword specifies the latest recommended release of Tor's "stable" - branch for the Windows platform that has an installation package - available. Note that this version does not necessarily correspond to the - most recently tagged stable Tor version, since that version may not yet - have an installer package available, or may have known issues on - Windows. - - The TorVersion field is formatted according to Section 2 of Tor's - version specification [3]. - - "tor-stable-win32-package" SP Url NL - - [Once or more] - - This Item specifies the location from which the most recent - recommended Windows installation package for Tor's stable branch can be - downloaded. - - When this Item appears multiple times within the Document, automatic - update clients SHOULD select randomly from the available package - mirrors. - - "tor-dev-win32-version" SP TorVersion NL - - [Exactly once] - - This Item specifies the latest recommended release of Tor's - "development" branch for the Windows platform that has an installation - package available. The same caveats from the description of - "tor-stable-win32-version" also apply to this keyword. - - The TorVersion field is formatted according to Section 2 of Tor's - version specification [3]. - - "tor-dev-win32-package" SP Url NL - - [Once or more] - - This Item specifies the location from which the most recent recommended - Windows installation package and its signature for Tor's development - branch can be downloaded. - - When this Keyword appears multiple times within the Document, automatic - update clients SHOULD select randomly from the available package - mirrors. 
- - "signature" NL SIGNATURE NL - - [At end, exactly once] - - The "SIGNATURE" Object contains a PGP signature (using a packaging - authority signing key) of the entire document, taken from the beginning - of the "recommended-packages-format" keyword, through the newline after - the "signature" Keyword. - - - 2. Automatic Update Client Behavior - - The client-side component of the automatic update framework is an - application that runs on the end-user's machine. It is responsible for - fetching and verifying a recommended-packages document, as well as - downloading, verifying, and subsequently installing any necessary updated - software packages. - - 2.1. Download and verify a recommended-packages document - - The first step in the automatic update process is for the client to download - a copy of the recommended-packages file. The automatic update client - contains a (hardcoded and/or user-configurable) list of URLs from which it - will attempt to retrieve a recommended-packages file. - - Connections to each of the recommended-packages URLs SHOULD be attempted in - the following order: - - 1) HTTPS over Tor - 2) HTTP over Tor - 3) Direct HTTPS - 4) Direct HTTP - - If the client fails to retrieve a recommended-packages document via any of - the above connection methods from any of the configured URLs, the client - SHOULD retry its download attempts following an exponential back-off - algorithm. After the first failed attempt, the client SHOULD delay one hour - before attempting again, up to a maximum of 24 hours delay between retry - attempts. - - After successfully downloading a recommended-packages file, the automatic - update client will verify the signature using one of the public keys - distributed with the client software. If more than one recommended-packages - file is downloaded and verified, the file with the most recent "published" - date that is verified will be retained and the rest discarded. - - 2.2. 
Download and verify the updated packages - - The automatic update client next compares the latest recommended package - version from the recommended-packages document with the currently installed - Tor version. If the user currently has installed a Tor version from Tor's - "development" branch, then the version specified in "tor-dev-*-version" Item - is used for comparison. Similarly, if the user currently has installed a Tor - version from Tor's "stable" branch, then the version specified in the - "tor-stable-*version" Item is used for comparison. Version comparisons are - done according to Tor's version specification [3]. - - If the automatic update client determines an installation package newer than - the user's currently installed version is available, it will attempt to - download a package appropriate for the user's platform and Tor branch from a - URL specified by a "tor-[branch]-[platform]-package" Item. If more than one - mirror for the selected package is available, a mirror will be chosen at - random from all those available. - - The automatic update client must also download a ".asc" signature file for - the retrieved package. The URL for the package signature is the same as that - for the package itself, except with the extension ".asc" appended to the - package URL. - - Connections to download the updated package and its signature SHOULD be - attempted in the same order described in Section 2.1. - - After completing the steps described in Sections 2.1 and 2.2, the automatic - update client will have downloaded and verified a copy of the latest Tor - installation package. It can then take whatever subsequent platform-specific - steps are necessary to install the downloaded software updates. - - 2.3. 
Periodic checking for updates - - The automatic update client SHOULD maintain a local state file in which it - records (at a minimum) the timestamp at which it last retrieved a - recommended-packages file and the timestamp at which the client last - successfully downloaded and installed a software update. - - Automatic update clients SHOULD check for an updated recommended-packages - document at most once per day but at least once every 30 days. - - - 3. Future Extensions - - There are several possible areas for future extensions of this framework. - The extensions below are merely suggestions and should be the subject of - their own proposal before being implemented. - - 3.1. Additional Software Updates - - There are several software packages often included in Tor bundles besides - Tor, such as Vidalia, Privoxy or Polipo, and Torbutton. The versions and - download locations of updated installation packages for these bundle - components can be easily added to the recommended-packages document - specification above. - - 3.2. Including ChangeLog Information - - It may be useful for automatic update clients to be able to display for - users a summary of the changes made in the latest Tor or Tor-related - software release, before the user chooses to install the update. In the - future, we can add keywords to the specification in Section 1.2 that specify - the location of a ChangeLog file for the latest recommended package - versions. It may also be desirable to allow localized ChangeLog information, - so that the automatic update client can fetch release notes in the - end-user's preferred language. - - 3.3. Weighted Package Mirror Selection - - We defined in Section 1.2 a method by which automatic update clients can - select from multiple available package mirrors. We may want to add a Weight - argument to the "*-package" Items that allows the recommended-packages file - to suggest to clients the probability with which a package mirror should be - chosen. 
This will allow clients to more appropriately distribute package - downloads across available mirrors proportional to their approximate - bandwidth. - - -Implementation - - Implementation of this proposal will consist of two separate components. - - The first component is a small "au-publish" tool that takes as input a - configuration file specifying the information described in Section 1.2 and a - private key. The tool is run by a "packaging authority" (someone responsible - for publishing updated installation packages), who will be prompted to enter - the passphrase for the private key used to sign the recommended-packages - document. The output of the tool is a document formatted according to - Section 1.2, with a signature appended at the end. The resulting document - can then be published to any of the update mirrors. - - The second component is an "au-client" tool that is run on the end-user's - machine. It periodically checks for updated installation packages according - to Section 2 and fetches the packages if necessary. The public keys used - to sign the recommended-packages file and any of the published packages are - included in the "au-client" tool. 
- - -References - - [1] Tor directory protocol (version 3), - https://tor-svn.freehaven.net/svn/tor/trunk/doc/spec/dir-spec.txt - - [2] Tor control protocol (version 2), - https://tor-svn.freehaven.net/svn/tor/trunk/doc/spec/control-spec.txt - - [3] Tor version specification, - https://tor-svn.freehaven.net/svn/tor/trunk/doc/spec/version-spec.txt - diff --git a/doc/spec/proposals/155-four-hidden-service-improvements.txt b/doc/spec/proposals/155-four-hidden-service-improvements.txt deleted file mode 100644 index e342bf1c39..0000000000 --- a/doc/spec/proposals/155-four-hidden-service-improvements.txt +++ /dev/null @@ -1,120 +0,0 @@ -Filename: 155-four-hidden-service-improvements.txt -Title: Four Improvements of Hidden Service Performance -Author: Karsten Loesing, Christian Wilms -Created: 25-Sep-2008 -Status: Finished -Implemented-In: 0.2.1.x - -Change history: - - 25-Sep-2008 Initial proposal for or-dev - -Overview: - - A performance analysis of hidden services [1] has brought up a few - possible design changes to reduce advertisement time of a hidden service - in the network as well as connection establishment time. Some of these - design changes have side-effects on anonymity or overall network load - which had to be weighed up against individual performance gains. A - discussion of seven possible design changes [2] has led to a selection - of four changes [3] that are proposed to be implemented here. - -Design: - - 1. Shorter Circuit Extension Timeout - - When establishing a connection to a hidden service a client cannibalizes - an existing circuit and extends it by one hop to one of the service's - introduction points. In most cases this can be accomplished within a few - seconds. Therefore, the current timeout of 60 seconds for extending a - circuit is far too high. - - Assuming that the timeout would be reduced to a lower value, for example - 30 seconds, a second (or third) attempt to cannibalize and extend would - be started earlier. 
With the current timeout of 60 seconds, 93.42% of all - circuits can be established, whereas this fraction would have been only - 0.87% smaller at 92.55% with a timeout of 30 seconds. - - For a timeout of 30 seconds the performance gain would be approximately 2 - seconds in the mean as opposed to the current timeout of 60 seconds. At - the same time a smaller timeout leads to discarding an increasing number - of circuits that might have been completed within the current timeout of - 60 seconds. - - Measurements with simulated low-bandwidth connectivity have shown that - there is no significant effect of client connectivity on circuit - extension times. The reason for this might be that extension messages are - small and thereby independent of the client bandwidth. Further, the - connection between client and entry node only constitutes a single hop of - a circuit, so that its influence on the whole circuit is limited. - - The exact value of the new timeout does not necessarily have to be 30 - seconds, but might also depend on the results of circuit build timeout - measurements as described in proposal 151. - - 2. Parallel Connections to Introduction Points - - An additional approach to accelerate extension of introduction circuits - is to extend a second circuit in parallel to a different introduction - point. Such parallel extension attempts should be started after a short - delay of, e.g., 15 seconds in order to prevent unnecessary circuit - extensions and thereby save network resources. Whichever circuit - extension succeeds first is used for introduction, while the other - attempt is aborted. - - An evaluation has been performed for the more resource-intensive approach - of starting two parallel circuits immediately instead of waiting for a - short delay. The result was a reduction of connection establishment times - from 27.4 seconds in the original protocol to 22.5 seconds. 
- - While the effect of the proposed approach of delayed parallelization on - mean connection establishment times is expected to be smaller, - variability of connection attempt times can be reduced significantly. - - 3. Increase Count of Internal Circuits - - Hidden services need to create or cannibalize and extend a circuit to a - rendezvous point for every client request. Really popular hidden services - require more than two internal circuits in the pool to answer multiple - client requests at the same time. This scenario was not yet analyzed, but - will probably exhibit worse performance than measured in the previous - analysis. The number of preemptively built internal circuits should be a - function of connection requests in the past to adapt to changing needs. - Furthermore, an increased number of internal circuits on client side - would allow clients to establish connections to more than one hidden - service at a time. - - Under the assumption that a popular hidden service cannot make use of - cannibalization for connecting to rendezvous points, the circuit creation - time needs to be added to the current results. In the mean, the - connection establishment time to a popular hidden service would increase - by 4.7 seconds. - - 4. Build More Introduction Circuits - - When establishing introduction points, a hidden service should launch 5 - instead of 3 introduction circuits at the same time and use only the - first 3 that could be established. The remaining two circuits could still - be used for other purposes afterwards. - - The effect has been simulated using previously measured data, too. - Therefore, circuit establishment times were derived from log files and - written to an array. Afterwards, a simulation with 10,000 runs was - performed picking 5 (4, 6) random values and using the 3 lowest values in - contrast to picking only 3 values at random. 
The result is that the mean - time of the 3-out-of-3 approach is 8.1 seconds, while the mean time of - the 3-out-of-5 approach is 4.4 seconds. - - The effect on network load is minimal, because the hidden service can - reuse the slower internal circuits for other purposes, e.g., rendezvous - circuits. The only change is that a hidden service starts establishing - more circuits at once instead of subsequently doing so. - -References: - - [1] http://freehaven.net/~karsten/hidserv/perfanalysis-2008-06-15.pdf - - [2] http://freehaven.net/~karsten/hidserv/discussion-2008-07-15.pdf - - [3] http://freehaven.net/~karsten/hidserv/design-2008-08-15.pdf - diff --git a/doc/spec/proposals/156-tracking-blocked-ports.txt b/doc/spec/proposals/156-tracking-blocked-ports.txt deleted file mode 100644 index 419de7e74c..0000000000 --- a/doc/spec/proposals/156-tracking-blocked-ports.txt +++ /dev/null @@ -1,527 +0,0 @@ -Filename: 156-tracking-blocked-ports.txt -Title: Tracking blocked ports on the client side -Author: Robert Hogan -Created: 14-Oct-2008 -Status: Open -Target: 0.2.? - -Motivation: -Tor clients that are behind extremely restrictive firewalls can end up -waiting a while for their first successful OR connection to a node on the -network. Worse, the more restrictive their firewall the more susceptible -they are to an attacker guessing their entry nodes. Tor routers that -are behind extremely restrictive firewalls can only offer a limited, -'partitioned' service to other routers and clients on the network. Exit -nodes behind extremely restrictive firewalls may advertise ports that they -are actually not able to connect to, wasting network resources in circuit -constructions that are doomed to fail at the last hop on first use. - -Proposal: - -When a client attempts to connect to an entry guard it should avoid -further attempts on ports that fail once until it has connected to at -least one entry guard successfully. 
(Maybe it should wait for more than -one failure to reduce the skew on the first node selection.) Thereafter -it should select entry guards regardless of port and warn the user if -it observes that connections to a given port have failed every multiple -of 5 times without success or since the last success. - -Tor should warn the operators of exit, middleman and entry nodes if it -observes that connections to a given port have failed a multiple of 5 -times without success or since the last success. If attempts on a port -fail 20 or more times without or since success, Tor should add the port -to a 'blocked-ports' entry in its descriptor's extra-info. Some thought -needs to be given to what the authorities might do with this information. - -Related TODO item: - "- Automatically determine what ports are reachable and start using - those, if circuits aren't working and it's a pattern we - recognize ("port 443 worked once and port 9001 keeps not - working")." - - -I've had a go at implementing all of this in the attached. - -Addendum: -Just a note on the patch, storing the digest of each router that uses the port -is a bit of a memory hog, and its only real purpose is to provide a count of -routers using that port when warning the user. That could be achieved when -warning the user by iterating through the routerlist instead. 
- -Index: src/or/connection_or.c -=================================================================== ---- src/or/connection_or.c (revision 17104) -+++ src/or/connection_or.c (working copy) -@@ -502,6 +502,9 @@ - connection_or_connect_failed(or_connection_t *conn, - int reason, const char *msg) - { -+ if ((reason == END_OR_CONN_REASON_NO_ROUTE) || -+ (reason == END_OR_CONN_REASON_REFUSED)) -+ or_port_hist_failure(conn->identity_digest,TO_CONN(conn)->port); - control_event_or_conn_status(conn, OR_CONN_EVENT_FAILED, reason); - if (!authdir_mode_tests_reachability(get_options())) - control_event_bootstrap_problem(msg, reason); -@@ -580,6 +583,7 @@ - /* already marked for close */ - return NULL; - } -+ - return conn; - } - -@@ -909,6 +913,7 @@ - control_event_or_conn_status(conn, OR_CONN_EVENT_CONNECTED, 0); - - if (started_here) { -+ or_port_hist_success(TO_CONN(conn)->port); - rep_hist_note_connect_succeeded(conn->identity_digest, now); - if (entry_guard_register_connect_status(conn->identity_digest, - 1, now) < 0) { -Index: src/or/rephist.c -=================================================================== ---- src/or/rephist.c (revision 17104) -+++ src/or/rephist.c (working copy) -@@ -18,6 +18,7 @@ - static void bw_arrays_init(void); - static void predicted_ports_init(void); - static void hs_usage_init(void); -+static void or_port_hist_init(void); - - /** Total number of bytes currently allocated in fields used by rephist.c. */ - uint64_t rephist_total_alloc=0; -@@ -89,6 +90,25 @@ - digestmap_t *link_history_map; - } or_history_t; - -+/** or_port_hist_t contains our router/client's knowledge of -+ all OR ports offered on the network, and how many servers with each port we -+ have succeeded or failed to connect to. */ -+typedef struct { -+ /** The port this entry is tracking. */ -+ uint16_t or_port; -+ /** Have we ever connected to this port on another OR?. */ -+ unsigned int success:1; -+ /** The ORs using this port. 
*/ -+ digestmap_t *ids; -+ /** The ORs using this port we have failed to connect to. */ -+ digestmap_t *failure_ids; -+ /** Are we excluding ORs with this port during entry selection?*/ -+ unsigned int excluded; -+} or_port_hist_t; -+ -+static unsigned int still_searching = 0; -+static smartlist_t *or_port_hists; -+ - /** When did we last multiply all routers' weighted_run_length and - * total_run_weights by STABILITY_ALPHA? */ - static time_t stability_last_downrated = 0; -@@ -164,6 +184,16 @@ - tor_free(hist); - } - -+/** Helper: free storage held by a single OR port history entry. */ -+static void -+or_port_hist_free(or_port_hist_t *p) -+{ -+ tor_assert(p); -+ digestmap_free(p->ids,NULL); -+ digestmap_free(p->failure_ids,NULL); -+ tor_free(p); -+} -+ - /** Update an or_history_t object <b>hist</b> so that its uptime/downtime - * count is up-to-date as of <b>when</b>. - */ -@@ -1639,7 +1669,7 @@ - tmp_time = smartlist_get(predicted_ports_times, i); - if (*tmp_time + PREDICTED_CIRCS_RELEVANCE_TIME < now) { - tmp_port = smartlist_get(predicted_ports_list, i); -- log_debug(LD_CIRC, "Expiring predicted port %d", *tmp_port); -+ log_debug(LD_HIST, "Expiring predicted port %d", *tmp_port); - smartlist_del(predicted_ports_list, i); - smartlist_del(predicted_ports_times, i); - rephist_total_alloc -= sizeof(uint16_t)+sizeof(time_t); -@@ -1821,6 +1851,12 @@ - tor_free(last_stability_doc); - built_last_stability_doc_at = 0; - predicted_ports_free(); -+ if (or_port_hists) { -+ SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, p, -+ or_port_hist_free(p)); -+ smartlist_free(or_port_hists); -+ or_port_hists = NULL; -+ } - } - - /****************** hidden service usage statistics ******************/ -@@ -2356,3 +2392,225 @@ - tor_free(fname); - } - -+/** Create a new entry in the port tracking cache for the or_port in -+ * <b>ri</b>. 
*/ -+void -+or_port_hist_new(const routerinfo_t *ri) -+{ -+ or_port_hist_t *result; -+ const char *id=ri->cache_info.identity_digest; -+ -+ if (!or_port_hists) -+ or_port_hist_init(); -+ -+ SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp, -+ { -+ /* Cope with routers that change their advertised OR port or are -+ dropped from the networkstatus. We don't discard the failures of -+ dropped routers because they are still valid when counting -+ consecutive failures on a port.*/ -+ if (digestmap_get(tp->ids, id) && (tp->or_port != ri->or_port)) { -+ digestmap_remove(tp->ids, id); -+ } -+ if (tp->or_port == ri->or_port) { -+ if (!(digestmap_get(tp->ids, id))) -+ digestmap_set(tp->ids, id, (void*)1); -+ return; -+ } -+ }); -+ -+ result = tor_malloc_zero(sizeof(or_port_hist_t)); -+ result->or_port=ri->or_port; -+ result->success=0; -+ result->ids=digestmap_new(); -+ digestmap_set(result->ids, id, (void*)1); -+ result->failure_ids=digestmap_new(); -+ result->excluded=0; -+ smartlist_add(or_port_hists, result); -+} -+ -+/** Create the port tracking cache. 
*/ -+/*XXX: need to call this when we rebuild/update our network status */ -+static void -+or_port_hist_init(void) -+{ -+ routerlist_t *rl = router_get_routerlist(); -+ -+ if (!or_port_hists) -+ or_port_hists=smartlist_create(); -+ -+ if (rl && rl->routers) { -+ SMARTLIST_FOREACH(rl->routers, routerinfo_t *, ri, -+ { -+ or_port_hist_new(ri); -+ }); -+ } -+} -+ -+#define NOT_BLOCKED 0 -+#define FAILURES_OBSERVED 1 -+#define POSSIBLY_BLOCKED 5 -+#define PROBABLY_BLOCKED 10 -+/** Return the list of blocked ports for our router's extra-info.*/ -+char * -+or_port_hist_get_blocked_ports(void) -+{ -+ char blocked_ports[2048]; -+ char *bp; -+ -+ tor_snprintf(blocked_ports,sizeof(blocked_ports),"blocked-ports"); -+ SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp, -+ { -+ if (digestmap_size(tp->failure_ids) >= PROBABLY_BLOCKED) -+ tor_snprintf(blocked_ports+strlen(blocked_ports), -+ sizeof(blocked_ports)," %u,",tp->or_port); -+ }); -+ if (strlen(blocked_ports) == 13) -+ return NULL; -+ bp=tor_strdup(blocked_ports); -+ bp[strlen(bp)-1]='\n'; -+ bp[strlen(bp)]='\0'; -+ return bp; -+} -+ -+/** Revert to client-only mode if we have seen to many failures on a port or -+ * range of ports.*/ -+static void -+or_port_hist_report_block(unsigned int min_severity) -+{ -+ or_options_t *options=get_options(); -+ char failures_observed[2048],possibly_blocked[2048],probably_blocked[2048]; -+ char port[1024]; -+ -+ memset(failures_observed,0,sizeof(failures_observed)); -+ memset(possibly_blocked,0,sizeof(possibly_blocked)); -+ memset(probably_blocked,0,sizeof(probably_blocked)); -+ -+ SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp, -+ { -+ unsigned int failures = digestmap_size(tp->failure_ids); -+ if (failures >= min_severity) { -+ tor_snprintf(port, sizeof(port), " %u (%u failures %s out of %u on the" -+ " network)",tp->or_port,failures, -+ (!tp->success)?"and no successes": "since last success", -+ digestmap_size(tp->ids)); -+ if (failures >= PROBABLY_BLOCKED) { -+ 
strlcat(probably_blocked, port, sizeof(probably_blocked)); -+ } else if (failures >= POSSIBLY_BLOCKED) -+ strlcat(possibly_blocked, port, sizeof(possibly_blocked)); -+ else if (failures >= FAILURES_OBSERVED) -+ strlcat(failures_observed, port, sizeof(failures_observed)); -+ } -+ }); -+ -+ log_warn(LD_HIST,"%s%s%s%s%s%s%s%s", -+ server_mode(options) && -+ ((min_severity==FAILURES_OBSERVED) || strlen(probably_blocked))? -+ "You should consider disabling your Tor server.":"", -+ (min_severity==FAILURES_OBSERVED)? -+ "Tor appears to be blocked from connecting to a range of ports " -+ "with the result that it cannot connect to one tenth of the Tor " -+ "network. ":"", -+ strlen(failures_observed)? -+ "Tor has observed failures on the following ports: ":"", -+ failures_observed, -+ strlen(possibly_blocked)? -+ "Tor is possibly blocked on the following ports: ":"", -+ possibly_blocked, -+ strlen(probably_blocked)? -+ "Tor is almost certainly blocked on the following ports: ":"", -+ probably_blocked); -+ -+} -+ -+/** Record the success of our connection to <b>digest</b>'s -+ * OR port. */ -+void -+or_port_hist_success(uint16_t or_port) -+{ -+ SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp, -+ { -+ if (tp->or_port != or_port) -+ continue; -+ /*Reset our failure stats so we can notice if this port ever gets -+ blocked again.*/ -+ tp->success=1; -+ if (digestmap_size(tp->failure_ids)) { -+ digestmap_free(tp->failure_ids,NULL); -+ tp->failure_ids=digestmap_new(); -+ } -+ if (still_searching) { -+ still_searching=0; -+ SMARTLIST_FOREACH(or_port_hists,or_port_hist_t *,t,t->excluded=0;); -+ } -+ return; -+ }); -+} -+/** Record the failure of our connection to <b>digest</b>'s -+ * OR port. Warn, exclude the port from future entry guard selection, or -+ * add port to blocked-ports in our server's extra-info as appropriate. 
*/ -+void -+or_port_hist_failure(const char *digest, uint16_t or_port) -+{ -+ int total_failures=0, ports_excluded=0, report_block=0; -+ int total_routers=smartlist_len(router_get_routerlist()->routers); -+ -+ SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp, -+ { -+ ports_excluded += tp->excluded; -+ total_failures+=digestmap_size(tp->failure_ids); -+ if (tp->or_port != or_port) -+ continue; -+ /* We're only interested in unique failures */ -+ if (digestmap_get(tp->failure_ids, digest)) -+ return; -+ -+ total_failures++; -+ digestmap_set(tp->failure_ids, digest, (void*)1); -+ if (still_searching && !tp->success) { -+ tp->excluded=1; -+ ports_excluded++; -+ } -+ if ((digestmap_size(tp->ids) >= POSSIBLY_BLOCKED) && -+ !(digestmap_size(tp->failure_ids) % POSSIBLY_BLOCKED)) -+ report_block=POSSIBLY_BLOCKED; -+ }); -+ -+ if (total_failures >= (int)(total_routers/10)) -+ or_port_hist_report_block(FAILURES_OBSERVED); -+ else if (report_block) -+ or_port_hist_report_block(report_block); -+ -+ if (ports_excluded >= smartlist_len(or_port_hists)) { -+ log_warn(LD_HIST,"During entry node selection Tor tried every port " -+ "offered on the network on at least one server " -+ "and didn't manage a single " -+ "successful connection. This suggests you are behind an " -+ "extremely restrictive firewall. 
Tor will keep trying to find " -+ "a reachable entry node."); -+ SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp, tp->excluded=0;); -+ } -+} -+ -+/** Add any ports marked as excluded in or_port_hist_t to <b>rt</b> */ -+void -+or_port_hist_exclude(routerset_t *rt) -+{ -+ SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp, -+ { -+ char portpolicy[9]; -+ if (tp->excluded) { -+ tor_snprintf(portpolicy,sizeof(portpolicy),"*:%u", tp->or_port); -+ log_warn(LD_HIST,"Port %u may be blocked, excluding it temporarily " -+ "from entry guard selection.", tp->or_port); -+ routerset_parse(rt, portpolicy, "Ports"); -+ } -+ }); -+} -+ -+/** Allow the exclusion of ports during our search for an entry node. */ -+void -+or_port_hist_search_again(void) -+{ -+ still_searching=1; -+} -Index: src/or/or.h -=================================================================== ---- src/or/or.h (revision 17104) -+++ src/or/or.h (working copy) -@@ -3864,6 +3864,13 @@ - int any_predicted_circuits(time_t now); - int rep_hist_circbuilding_dormant(time_t now); - -+void or_port_hist_failure(const char *digest, uint16_t or_port); -+void or_port_hist_success(uint16_t or_port); -+void or_port_hist_new(const routerinfo_t *ri); -+void or_port_hist_exclude(routerset_t *rt); -+void or_port_hist_search_again(void); -+char *or_port_hist_get_blocked_ports(void); -+ - /** Possible public/private key operations in Tor: used to keep track of where - * we're spending our time. 
*/ - typedef enum { -Index: src/or/routerparse.c -=================================================================== ---- src/or/routerparse.c (revision 17104) -+++ src/or/routerparse.c (working copy) -@@ -1401,6 +1401,8 @@ - goto err; - } - -+ or_port_hist_new(router); -+ - if (!router->platform) { - router->platform = tor_strdup("<unknown>"); - } -Index: src/or/router.c -=================================================================== ---- src/or/router.c (revision 17104) -+++ src/or/router.c (working copy) -@@ -1818,6 +1818,7 @@ - char published[ISO_TIME_LEN+1]; - char digest[DIGEST_LEN]; - char *bandwidth_usage; -+ char *blocked_ports; - int result; - size_t len; - -@@ -1825,7 +1826,6 @@ - extrainfo->cache_info.identity_digest, DIGEST_LEN); - format_iso_time(published, extrainfo->cache_info.published_on); - bandwidth_usage = rep_hist_get_bandwidth_lines(1); -- - result = tor_snprintf(s, maxlen, - "extra-info %s %s\n" - "published %s\n%s", -@@ -1835,6 +1835,16 @@ - if (result<0) - return -1; - -+ blocked_ports = or_port_hist_get_blocked_ports(); -+ if (blocked_ports) { -+ result = tor_snprintf(s+strlen(s), maxlen-strlen(s), -+ "%s", -+ blocked_ports); -+ tor_free(blocked_ports); -+ if (result<0) -+ return -1; -+ } -+ - if (should_record_bridge_info(options)) { - static time_t last_purged_at = 0; - char *geoip_summary; -Index: src/or/circuitbuild.c -=================================================================== ---- src/or/circuitbuild.c (revision 17104) -+++ src/or/circuitbuild.c (working copy) -@@ -62,6 +62,7 @@ - - static void entry_guards_changed(void); - static time_t start_of_month(time_t when); -+static int num_live_entry_guards(void); - - /** Iterate over values of circ_id, starting from conn-\>next_circ_id, - * and with the high bit specified by conn-\>circ_id_type, until we get -@@ -1627,12 +1628,14 @@ - smartlist_t *excluded; - or_options_t *options = get_options(); - router_crn_flags_t flags = 0; -+ routerset_t *_ExcludeNodes; - - if (state 
&& options->UseEntryGuards && - (purpose != CIRCUIT_PURPOSE_TESTING || options->BridgeRelay)) { - return choose_random_entry(state); - } - -+ _ExcludeNodes = routerset_new(); - excluded = smartlist_create(); - - if (state && (r = build_state_get_exit_router(state))) { -@@ -1670,12 +1673,18 @@ - if (options->_AllowInvalid & ALLOW_INVALID_ENTRY) - flags |= CRN_ALLOW_INVALID; - -+ if (options->ExcludeNodes) -+ routerset_union(_ExcludeNodes,options->ExcludeNodes); -+ -+ or_port_hist_exclude(_ExcludeNodes); -+ - choice = router_choose_random_node( - NULL, - excluded, -- options->ExcludeNodes, -+ _ExcludeNodes, - flags); - smartlist_free(excluded); -+ routerset_free(_ExcludeNodes); - return choice; - } - -@@ -2727,6 +2736,7 @@ - entry_guards_update_state(or_state_t *state) - { - config_line_t **next, *line; -+ unsigned int have_reachable_entry=0; - if (! entry_guards_dirty) - return; - -@@ -2740,6 +2750,7 @@ - char dbuf[HEX_DIGEST_LEN+1]; - if (!e->made_contact) - continue; /* don't write this one to disk */ -+ have_reachable_entry=1; - *next = line = tor_malloc_zero(sizeof(config_line_t)); - line->key = tor_strdup("EntryGuard"); - line->value = tor_malloc(HEX_DIGEST_LEN+MAX_NICKNAME_LEN+2); -@@ -2785,6 +2796,11 @@ - if (!get_options()->AvoidDiskWrites) - or_state_mark_dirty(get_or_state(), 0); - entry_guards_dirty = 0; -+ -+ /* XXX: Is this the place to decide that we no longer have any reachable -+ guards? 
*/ -+ if (!have_reachable_entry) -+ or_port_hist_search_again(); - } - - /** If <b>question</b> is the string "entry-guards", then dump - diff --git a/doc/spec/proposals/157-specific-cert-download.txt b/doc/spec/proposals/157-specific-cert-download.txt deleted file mode 100644 index 204b20973a..0000000000 --- a/doc/spec/proposals/157-specific-cert-download.txt +++ /dev/null @@ -1,102 +0,0 @@ -Filename: 157-specific-cert-download.txt -Title: Make certificate downloads specific -Author: Nick Mathewson -Created: 2-Dec-2008 -Status: Accepted -Target: 0.2.1.x - -History: - - 2008 Dec 2, 22:34 - Changed name of cross certification field to match the other authority - certificate fields. - -Status: - - As of 0.2.1.9-alpha: - Cross-certification is implemented for new certificates, but not yet - required. Directories support the tor/keys/fp-sk urls. - -Overview: - - Tor's directory specification gives two ways to download a certificate: - by its identity fingerprint, or by the digest of its signing key. Both - are error-prone. We propose a new download mechanism to make sure that - clients get the certificates they want. - -Motivation: - - When a client wants a certificate to verify a consensus, it has two choices - currently: - - Download by identity key fingerprint. In this case, the client risks - getting a certificate for the same authority, but with a different - signing key than the one used to sign the consensus. - - - Download by signing key fingerprint. In this case, the client risks - getting a forged certificate that contains the right signing key - signed with the wrong identity key. (Since caches are willing to - cache certs from authorities they do not themselves recognize, the - attacker wouldn't need to compromise an authority's key to do this.) - -Current solution: - - Clients fetch by identity keys, and re-fetch with backoff if they don't get - certs with the signing key they want. 
- -Proposed solution: - - Phase 1: Add a URL type for clients to download certs by identity _and_ - signing key fingerprint. Unless both fields match, the client doesn't - accept the certificate(s). Clients begin using this method when their - randomly chosen directory cache supports it. - - Phase 1A: Simultaneously, add a cross-certification element to - certificates. - - Phase 2: Once many directory caches support phase 1, clients should prefer - to fetch certificates using that protocol when available. - - Phase 2A: Once all authorities are generating cross-certified certificates - as in phase 1A, require cross-certification. - -Specification additions: - - The key certificate whose identity key fingerprint is <F> and whose signing - key fingerprint is <S> should be available at: - - http://<hostname>/tor/keys/fp-sk/<F>-<S>.z - - As usual, clients may request multiple certificates using: - - http://<hostname>/tor/keys/fp-sk/<F1>-<S1>+<F2>-<S2>.z - - Clients SHOULD use this format whenever they know both key fingerprints for - a desired certificate. - - - Certificates SHOULD contain the following field (at most once): - - "dir-key-crosscert" NL CrossSignature NL - - where CrossSignature is a signature, made using the certificate's signing - key, of the digest of the PKCS1-padded hash of the certificate's identity - key. For backward compatibility with broken versions of the parser, we - wrap the base64-encoded signature in -----BEGIN ID SIGNATURE---- and - -----END ID SIGNATURE----- tags. (See bug 880.) Implementations MUST allow - the "ID " portion to be omitted, however. - - When encountering a certificate with a dir-key-crosscert entry, - implementations MUST verify that the signature is a correct signature of - the hash of the identity key using the signing key. - - (In a future version of this specification, dir-key-crosscert entries will - be required.) - -Why cross-certify too? 
- - Cross-certification protects clients who haven't updated yet, by reducing - the number of caches that are willing to hold and serve bogus certificates. - -References: - - This is related to part 2 of bug 854. diff --git a/doc/spec/proposals/158-microdescriptors.txt b/doc/spec/proposals/158-microdescriptors.txt deleted file mode 100644 index e6966c0cef..0000000000 --- a/doc/spec/proposals/158-microdescriptors.txt +++ /dev/null @@ -1,198 +0,0 @@ -Filename: 158-microdescriptors.txt -Title: Clients download consensus + microdescriptors -Author: Roger Dingledine -Created: 17-Jan-2009 -Status: Open - -0. History - - 15 May 2009: Substantially revised based on discussions on or-dev - from late January. Removed the notion of voting on how to choose - microdescriptors; made it just a function of the consensus method. - (This lets us avoid the possibility of "desynchronization.") - Added suggestion to use a new consensus flavor. Specified use of - SHA256 for new hashes. -nickm - - 15 June 2009: Cleaned up based on comments from Roger. -nickm - -1. Overview - - This proposal replaces section 3.2 of proposal 141, which was - called "Fetching descriptors on demand". Rather than modifying the - circuit-building protocol to fetch a server descriptor inline at each - circuit extend, we instead put all of the information that clients need - either into the consensus itself, or into a new set of data about each - relay called a microdescriptor. - - Descriptor elements that are small and frequently changing should go - in the consensus itself, and descriptor elements that are small and - relatively static should go in the microdescriptor. If we ever end up - with descriptor elements that aren't small yet clients need to know - them, we'll need to resume considering some design like the one in - proposal 141. 
- - Note also that any descriptor element which clients need to use to - decide which servers to fetch info about, or which servers to fetch - info from, needs to stay in the consensus. - -2. Motivation - - See - http://archives.seul.org/or/dev/Nov-2008/msg00000.html and - http://archives.seul.org/or/dev/Nov-2008/msg00001.html and especially - http://archives.seul.org/or/dev/Nov-2008/msg00007.html - for a discussion of the options and why this is currently the best - approach. - -3. Design - - There are three pieces to the proposal. First, authorities will list in - their votes (and thus in the consensus) the expected hash of - microdescriptor for each relay. Second, authorities will serve - microdescriptors, directory mirrors will cache and serve - them. Third, clients will ask for them and cache them. - -3.1. Consensus changes - - If the authorities choose a consensus method of a given version or - later, a microdescriptor format is implicit in that version. - A microdescriptor should in every case be a pure function of the - router descriptor and the consensus method. - - In votes, we need to include the hash of each expected microdescriptor - in the routerstatus section. I suggest a new "m" line for each stanza, - with the base64 of the SHA256 hash of the router's microdescriptor. - - For every consensus method that an authority supports, it includes a - separate "m" line in each router section of its vote, containing: - "m" SP methods 1*(SP AlgorithmName "=" digest) NL - where methods is a comma-separated list of the consensus methods - that the authority believes will produce "digest". - - (As with base64 encoding of SHA1 hashes in consensuses, let's - omit the trailing =s) - - The consensus microdescriptor-elements and "m" lines are then computed - as described in Section 3.1.2 below. - - (This means we need a new consensus-method that knows - how to compute the microdescriptor-elements and add "m" lines.) 
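A minimal sketch of how a vote's "m" line could be assembled from the grammar above. It assumes the AlgorithmName is "sha256" and uses a made-up microdescriptor body; the helper names `microdesc_digest` and `format_vote_m_line` are hypothetical and do not come from Tor's source.

```python
import base64
import hashlib


def microdesc_digest(microdescriptor: bytes) -> str:
    """Return the base64 SHA256 digest with trailing '=' padding omitted."""
    raw = hashlib.sha256(microdescriptor).digest()
    return base64.b64encode(raw).decode("ascii").rstrip("=")


def format_vote_m_line(methods, microdescriptor: bytes) -> str:
    """Build: "m" SP methods 1*(SP AlgorithmName "=" digest) NL.

    `methods` is an iterable of consensus method numbers that are believed
    to produce this digest. The "sha256" algorithm name is an assumption.
    """
    method_list = ",".join(str(m) for m in sorted(methods))
    return "m {} sha256={}\n".format(method_list, microdesc_digest(microdescriptor))


if __name__ == "__main__":
    # Hypothetical microdescriptor body, for illustration only.
    print(format_vote_m_line([8, 9], b"onion-key\n"), end="")
```

A SHA256 digest is 32 bytes, so its base64 form carries exactly one padding character; stripping it leaves 43 characters per digest.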
- - The microdescriptor consensus uses the directory-signature format from - proposal 162, with the "sha256" algorithm. - - -3.1.1. Descriptor elements to include for now - - In the first version, the microdescriptor should contain the - onion-key element, and the family element from the router descriptor, - and the exit policy summary as currently specified in dir-spec.txt. - -3.1.2. Computing consensus for microdescriptor-elements and "m" lines - - When we are generating a consensus, we use whichever m line - unambiguously corresponds to the descriptor digest that will be - included in the consensus. - - (If different votes have different microdescriptor digests for a - single <descriptor-digest, consensus-method> pair, then at least one - of the authorities is broken. If this happens, the consensus should - contain whichever microdescriptor digest is most common. If there is - no winner, we break ties in favor of the lexically earliest. - Either way, we should log a warning: there is definitely a bug.) - - The "m" lines in a consensus contain only the digest, not a list of - consensus methods. - -3.1.3. A new flavor of consensus - - Rather than inserting "m" lines in the current consensus format, - they should be included in a new consensus flavor (see proposal - 162). - - This flavor can safely omit descriptor digests. - - When we implement this voting method, we can remove the exit policy - summary from the current "ns" flavor of consensus, since no current - clients use them, and they take up about 5% of the compressed - consensus. - - This new consensus flavor should be signed with the sha256 signature - format as documented in proposal 162. - -3.2. Directory mirrors fetch, cache, and serve microdescriptors - - Directory mirrors should fetch, cache, and serve each microdescriptor - from the authorities. (They need to continue to serve normal relay - descriptors too, to handle old clients.)
- - The microdescriptors with base64 hashes <D1>,<D2>,<D3> should be - available at: - http://<hostname>/tor/micro/d/<D1>-<D2>-<D3>.z - (We use base64 for size and for consistency with the consensus - format. We use -s instead of +s to separate these items, since - the + character is used in base64 encoding.) - - All the microdescriptors from the current consensus should also be - available at: - http://<hostname>/tor/micro/all.z - so a client that's bootstrapping doesn't need to send a 70KB URL just - to name every microdescriptor it's looking for. - - Microdescriptors have no header or footer. - The hash of the microdescriptor is simply the hash of the concatenated - elements. - - Directory mirrors should check to make sure that the microdescriptors - they're about to serve match the right hashes (either the hashes from - the fetch URL or the hashes from the consensus, respectively). - - We will probably want to consider some sort of smart data structure to - be able to quickly convert microdescriptor hashes into the appropriate - microdescriptor. Clients will want this anyway when they load their - microdescriptor cache and want to match it up with the consensus to - see what's missing. - -3.3. Clients fetch them and cache them - - When a client gets a new consensus, it looks to see if there are any - microdescriptors it needs to learn. If it needs to learn more than - some threshold of the microdescriptors (half?), it requests 'all', - else it requests only the missing ones. Clients MAY try to - determine whether the upload bandwidth for listing the - microdescriptors they want is more or less than the download - bandwidth for the microdescriptors they do not want. - - Clients maintain a cache of microdescriptors along with metadata like - when it was last referenced by a consensus, and which identity key - it corresponds to. They keep a microdescriptor - until it hasn't been mentioned in any consensus for a week. 
Future - clients might cache them for longer or shorter times. - -3.3.1. Information leaks from clients - - If a client asks you for a set of microdescs, then you know she didn't - have them cached before. How much does that leak? What about when - we're all using our entry guards as directory guards, and we've seen - that user make a bunch of circuits already? - - Fetching "all" when you need at least half is a good first order fix, - but might not be all there is to it. - - Another future option would be to fetch some of the microdescriptors - anonymously (via a Tor circuit). - - Another crazy option (Roger's phrasing) is to do decoy fetches as - well. - -4. Transition and deployment - - Phase one, the directory authorities should start voting on - microdescriptors, and putting them in the consensus. - - Phase two, directory mirrors should learn how to serve them, and learn - how to read the consensus to find out what they should be serving. - - Phase three, clients should start fetching and caching them instead - of normal descriptors. - diff --git a/doc/spec/proposals/159-exit-scanning.txt b/doc/spec/proposals/159-exit-scanning.txt deleted file mode 100644 index 7090f2ed08..0000000000 --- a/doc/spec/proposals/159-exit-scanning.txt +++ /dev/null @@ -1,142 +0,0 @@ -Filename: 159-exit-scanning.txt -Title: Exit Scanning -Author: Mike Perry -Created: 13-Feb-2009 -Status: Open - -Overview: - -This proposal describes the implementation and integration of an -automated exit node scanner for scanning the Tor network for malicious, -misconfigured, firewalled or filtered nodes. - -Motivation: - -Tor exit nodes can be run by anyone with an Internet connection. Often, -these users aren't fully aware of limitations of their networking -setup. 
Content filters, antivirus software, advertisements injected by -their service providers, malicious upstream providers, and the resource -limitations of their computer or networking equipment have all been -observed on the current Tor network. - -It is also possible that some nodes exist purely for malicious -purposes. In the past, there have been intermittent instances of -nodes spoofing SSH keys, as well as nodes being used for purposes of -plaintext surveillance. - -While it is not realistic to expect to catch extremely targeted or -completely passive malicious adversaries, the goal is to prevent -malicious adversaries from deploying dragnet attacks against large -segments of the Tor userbase. - - -Scanning methodology: - -The first scans to be implemented are HTTP, HTML, Javascript, and -SSL scans. - -The HTTP scan scrapes Google for common filetype urls such as exe, msi, -doc, dmg, etc. It then fetches these urls through Non-Tor and Tor, and -compares the SHA1 hashes of the resulting content. - -The SSL scan downloads certificates for all IPs a domain will locally -resolve to and compares these certificates to those seen over Tor. The -scanner notes if a domain had rotated certificates locally in the -results for each scan. - -The HTML scan checks HTML, Javascript, and plugin content for -modifications. Because of the dynamic nature of most of the web, the -scanner has a number of mechanisms built in to filter out false -positives that are used when a change is noticed between Tor and -Non-Tor. - -All tests also share a URL-based false positive filter that -automatically removes results retroactively if the number of failures -exceeds a certain percentage of nodes tested with the URL. - - -Deployment Stages: - -To avoid instances where bugs cause us to mark exit nodes as BadExit -improperly, it is proposed that we begin use of the scanner in stages. - -1. 
Manual Review: - - In the first stage, basic scans will be run by a small number of - people while we stabilize the scanner. The scanner has the ability - to resume crashed scans, and to rescan nodes that fail various - tests. - -2. Human Review: - - In the second stage, results will be automatically mailed to - an email list of interested parties for review. We will also begin - classifying failure types into three to four different severity - levels, based on both the reliability of the test and the nature of - the failure. - -3. Automatic BadExit Marking: - - In the final stage, the scanner will begin marking exits depending - on the failure severity level in one of three different ways: by - node idhex, by node IP, or by node IP mask. A potential fourth, less - severe category of results may still be delivered via email only for - review. - - BadExit markings will be delivered in batches upon completion - of whole-network scans, so that the final false positive - filter has an opportunity to filter out URLs that exhibit - dynamic content beyond what we can filter. - - -Specification of Exit Marking: - -Technically, BadExit could be marked via SETCONF AuthDirBadExit over -the control port, but this would allow full access to the directory -authority configuration and operation. - -The approved-routers file could also be used, but currently it only -supports fingerprints, and it also contains other data unrelated to -exit scanning that would be difficult to coordinate. - -Instead, we propose a new badexit-routers file that has the following -keywords: - - BadExitNet 1*[exitpattern from 2.3 in dir-spec.txt] - BadExitFP 1*[hexdigest from 2.3 in dir-spec.txt] - -BadExitNet lines would follow the codepaths used by AuthDirBadExit to -set authdir_badexit_policy, and BadExitFP would follow the codepaths -from approved-router's !badexit lines. - -The scanner would have exclusive ability to write, append, rewrite, -and modify this file.
Prior to building a new consensus vote, a -participating Tor authority would read in a fresh copy. - - -Security Implications: - -Aside from evading the scanner's detection, there are two additional -high-level security considerations: - -1. Ensure nodes cannot be marked BadExit by an adversary at will - -It is possible individual website owners will be able to target certain -Tor nodes, but once they begin to attempt to fail more than the URL -filter percentage of the exits, their sites will be automatically -discarded. - -Failing specific nodes is possible, but scanned results are fully -reproducible, and BadExits should be rare enough that humans are never -fully removed from the loop. - -State (cookies, cache, etc) does not otherwise persist in the scanner -between exit nodes to enable one exit node to bias the results of a -later one. - -2. Ensure that scanner compromise does not yield authority compromise - -Having a separate file that is under the exclusive control of the -scanner allows us to heavily isolate the scanner from the Tor -authority, potentially even running them on separate machines. - diff --git a/doc/spec/proposals/160-bandwidth-offset.txt b/doc/spec/proposals/160-bandwidth-offset.txt deleted file mode 100644 index 96935ade7d..0000000000 --- a/doc/spec/proposals/160-bandwidth-offset.txt +++ /dev/null @@ -1,105 +0,0 @@ -Filename: 160-bandwidth-offset.txt -Title: Authorities vote for bandwidth offsets in consensus -Author: Roger Dingledine -Created: 4-May-2009 -Status: Finished -Target: 0.2.2.x - -1. Motivation - - As part of proposal 141, we moved the bandwidth value for each relay - into the consensus. Now clients can know how they should load balance - even before they've fetched the corresponding relay descriptors. - - Putting the bandwidth in the consensus also lets the directory - authorities choose more accurate numbers to advertise, if we come up - with a better algorithm for deciding weightings. 
- - Our original plan was to teach directory authorities how to measure - bandwidth themselves; then every authority would vote for the bandwidth - it prefers, and we'd take the median of votes as usual. - - The problem comes when we have 7 authorities, and only a few of them - have smarter bandwidth allocation algorithms. So long as the majority - of them are voting for the number in the relay descriptor, the minority - that have better numbers will be ignored. - -2. Options - - One fix would be to demand that every authority also run the - new bandwidth measurement algorithms: in that case, part of the - responsibility of being an authority operator is that you need to run - this code too. But in practice we can't really require all current - authority operators to do that; and if we want to expand the set of - authority operators even further, it will become even more impractical. - Also, bandwidth testing adds load to the network, so we don't really - want to require that the number of concurrent bandwidth tests match - the number of authorities we have. - - The better fix is to allow certain authorities to specify that they are - voting on bandwidth measurements: more accurate bandwidth values that - have actually been evaluated. In this way, authorities can vote on - the median measured value if sufficient measured votes exist for a router, - and otherwise fall back to the median value taken from the published router - descriptors. - -3. Security implications - - If only some authorities choose to vote on an offset, then a majority of - those voting authorities can arbitrarily change the bandwidth weighting - for the relay. At the extreme, if there's only one offset-voting - authority, then that authority can dictate which relays clients will - find attractive. - - This problem isn't entirely new: we already have the worry wrt - the subset of authorities that vote for BadExit. 
- - To make it not so bad, we should deploy at least three offset-voting - authorities. - - Also, authorities that know how to vote for offsets should vote for - an offset of zero for new nodes, rather than choosing not to vote on - any offset in those cases. - -4. Design - - First, we need a new consensus method to support this new calculation. - - Now v3 votes can have an additional value on the "w" line: - "w Bandwidth=X Measured=" INT. - - Once we're using the new consensus method, the new way to compute the - Bandwidth weight is by checking if there are at least 3 "Measured" - votes. If so, the median of these is taken. Otherwise, the median - of the "Bandwidth=" values is taken, as described in Proposal 141. - - Then the actual consensus looks just the same as it did before, - so clients never have to know that this additional calculation is - happening. - -5. Implementation - - The Measured values will be read from a file provided by the scanners - described in proposal 161. Files with a timestamp older than 3 days - will be ignored. - - The file will be read in from dirserv_generate_networkstatus_vote_obj() - in a location specified by a new config option "V3MeasuredBandwidths". - A helper function will be called to populate new 'measured' and - 'has_measured' fields of the routerstatus_t 'routerstatuses' list with - values read from this file. - - An additional for_vote flag will be passed to - routerstatus_format_entry() from format_networkstatus_vote(), which will - indicate that the "Measured=" string should be appended to the "w Bandwidth=" - line with the measured value in the struct. - - routerstatus_parse_entry_from_string() will be modified to parse the - "Measured=" lines into routerstatus_t struct fields. - - Finally, networkstatus_compute_consensus() will set rs_out.bandwidth - to the median of the measured values if there are at least 3, otherwise - it will use the bandwidth value median as normal.
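The selection rule from Section 4 can be sketched as follows. This is an illustration under stated assumptions, not the code in networkstatus_compute_consensus(): the function name `consensus_bandwidth` and the tuple layout are hypothetical, the threshold follows the "at least 3" wording of Section 4, and taking the low median for even-sized vote sets is an assumption.

```python
from statistics import median_low

# Threshold taken from Section 4 ("at least 3 Measured votes").
MIN_MEASURED_VOTES = 3


def consensus_bandwidth(votes):
    """Pick the consensus bandwidth for one relay.

    `votes` is a non-empty list of (bandwidth, measured) pairs, one per
    authority; `measured` is None when that authority voted no
    "Measured=" value for the relay.
    """
    measured = [m for _, m in votes if m is not None]
    if len(measured) >= MIN_MEASURED_VOTES:
        # Enough authorities measured this relay: use their median.
        return median_low(measured)
    # Otherwise fall back to the median of the "Bandwidth=" values.
    return median_low([bw for bw, _ in votes])


if __name__ == "__main__":
    print(consensus_bandwidth([(100, 50), (100, 70), (100, 60), (100, None)]))
    print(consensus_bandwidth([(100, 50), (200, None), (300, None)]))
```

With three measured votes the measured median wins; with only one, the result falls back to the descriptor-derived values, which is the behavior that keeps a lone measuring authority from dictating weights.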
- - - diff --git a/doc/spec/proposals/161-computing-bandwidth-adjustments.txt b/doc/spec/proposals/161-computing-bandwidth-adjustments.txt deleted file mode 100644 index d219826668..0000000000 --- a/doc/spec/proposals/161-computing-bandwidth-adjustments.txt +++ /dev/null @@ -1,174 +0,0 @@ -Title: Computing Bandwidth Adjustments -Filename: 161-computing-bandwidth-adjustments.txt -Author: Mike Perry -Created: 12-May-2009 -Target: 0.2.2.x -Status: Finished - - -1. Motivation - - There is high variance in the performance of the Tor network. Despite - our efforts to balance load evenly across the Tor nodes, some nodes are - significantly slower and more overloaded than others. - - Proposal 160 describes how we can augment the directory authorities to - vote on measured bandwidths for routers. This proposal describes what - goes into the measuring process. - - -2. Measurement Selection - - The general idea is to determine a load factor representing the ratio - of the capacity of measured nodes to the rest of the network. This load - factor could be computed from three potentially relevant statistics: - circuit failure rates, circuit extend times, or stream capacity. - - Circuit failure rates and circuit extend times appear to be - non-linearly proportional to node load. We've observed that the same - nodes when scanned at US nighttime hours (when load is presumably - lower) exhibit almost no circuit failure, and significantly faster - extend times than when scanned during the day. - - Stream capacity, however, is much more uniform, even during US - nighttime hours. Moreover, it is a more intuitive representation of - node capacity, and also less dependent upon distance and latency - if amortized over large stream fetches. - - -3. Average Stream Bandwidth Calculation - - The average stream bandwidths are obtained by dividing the network into - slices of 50 nodes each, grouped according to advertised node bandwidth. 
- - Two hop circuits are built using nodes from the same slice, and a large - file is downloaded via these circuits. The file sizes are set based - on node percentile rank as follows: - - 0-10: 2M - 10-20: 1M - 20-30: 512k - 30-50: 256k - 50-100: 128k - - These sizes are based on measurements performed during test scans. - - This process is repeated until each node has been chosen to participate - in at least 5 circuits. - - -4. Ratio Calculation - - The ratios are calculated by dividing each measured value by the - network-wide average. - - -5. Ratio Filtering - - After the base ratios are calculated, a second pass is performed - to remove any streams with nodes of ratios less than X=0.5 from - the results of other nodes. In addition, all outlying streams - with capacity of one standard deviation below a node's average - are also removed. - - The final ratio result will be greater of the unfiltered ratio - and the filtered ratio. - - -6. Pseudocode for Ratio Calculation Algorithm - - Here is the complete pseudocode for the ratio algorithm: - - Slices = {S | S is 50 nodes of similar consensus capacity} - for S in Slices: - while exists node N in S with circ_chosen(N) < 7: - fetch_slice_file(build_2hop_circuit(N, (exit in S))) - for N in S: - BW_measured(N) = MEAN(b | b is bandwidth of a stream through N) - Bw_stddev(N) = STDDEV(b | b is bandwidth of a stream through N) - Bw_avg(S) = MEAN(b | b = BW_measured(N) for all N in S) - for N in S: - Normal_Streams(N) = {stream via N | bandwidth >= BW_measured(N)} - BW_Norm_measured(N) = MEAN(b | b is a bandwidth of Normal_Streams(N)) - - Bw_net_avg(Slices) = MEAN(BW_measured(N) for all N in Slices) - Bw_Norm_net_avg(Slices) = MEAN(BW_Norm_measured(N) for all N in Slices) - - for N in all Slices: - Bw_net_ratio(N) = Bw_measured(N)/Bw_net_avg(Slices) - Bw_Norm_net_ratio(N) = BW_Norm_measured(N)/Bw_Norm_net_avg(Slices) - - ResultRatio(N) = MAX(Bw_net_ratio(N), Bw_Norm_net_ratio(N)) - - -7. 
Security implications - - The ratio filtering will deal with cases of sabotage by dropping - both very slow outliers in stream average calculations, as well - as dropping streams that used very slow nodes from the calculation - of other nodes. - - This scheme will not address nodes that try to game the system by - providing better service to scanners. The scanners can be detected - at the entry by IP address, and at the exit by the destination fetch - IP. - - Measures can be taken to obfuscate and separate the scanners' source - IP address from the directory authority IP address. For instance, - scans can happen offsite and the results can be rsynced into the - authorities. The destination server IP can also change. - - Neither of these methods are foolproof, but such nodes can already - lie about their bandwidth to attract more traffic, so this solution - does not set us back any in that regard. - - -8. Parallelization - - Because each slice takes as long as 6 hours to complete, we will want - to parallelize as much as possible. This will be done by concurrently - running multiple scanners from each authority to deal with different - segments of the network. Each scanner piece will continually loop - over a portion of the network, outputting files of the form: - - node_id=<idhex> SP strm_bw=<BW_measured(N)> SP - filt_bw=<BW_Norm_measured(N)> ns_bw=<CurrentConsensusBw(N)> NL - - The most recent file from each scanner will be periodically gathered - by another script that uses them to produce network-wide averages - and calculate ratios as per the algorithm in section 6. Because nodes - may shift in capacity, they may appear in more than one slice and/or - appear more than once in the file set. The most recently measured - line will be chosen in this case. - - -9. 
Integration with Proposal 160 - - The final results will be produced for the voting mechanism - described in Proposal 160 by multiplying the derived ratio by - the average published consensus bandwidth during the course of the - scan, and taking the weighted average with the previous consensus - bandwidth: - - Bw_new = Round((Bw_current * Alpha + Bw_scan_avg*Bw_ratio)/(Alpha + 1)) - - The Alpha parameter is a smoothing parameter intended to prevent - rapid oscillation between loaded and unloaded conditions. It is - currently fixed at 0.333. - - The Round() step consists of rounding to the 3 most significant figures - in base10, and then rounding that result to the nearest 1000, with - a minimum value of 1000. - - This will produce a new bandwidth value that will be output into a - file consisting of lines of the form: - - node_id=<idhex> SP bw=<Bw_new> NL - - The first line of the file will contain a timestamp in UNIX time() - seconds. This will be used by the authority to decide if the - measured values are too old to use. - - This file can be either copied or rsynced into a directory readable - by the directory authority. - diff --git a/doc/spec/proposals/162-consensus-flavors.txt b/doc/spec/proposals/162-consensus-flavors.txt deleted file mode 100644 index e3b697afee..0000000000 --- a/doc/spec/proposals/162-consensus-flavors.txt +++ /dev/null @@ -1,188 +0,0 @@ -Filename: 162-consensus-flavors.txt -Title: Publish the consensus in multiple flavors -Author: Nick Mathewson -Created: 14-May-2009 -Target: 0.2.2 -Status: Open - -Overview: - - This proposal describes a way to publish each consensus in - multiple simultaneous formats, or "flavors". This will reduce the - amount of time needed to deploy new consensus-like documents, and - reduce the size of consensus documents in the long term. - -Motivation: - - In the future, we will almost surely want different fields and - data in the network-status document. 
Examples include: - - Publishing hashes of microdescriptors instead of hashes of - full descriptors (Proposal 158). - - Including different digests of descriptors, instead of the - perhaps-soon-to-be-totally-broken SHA1. - - Note that in both cases, from the client's point of view, this - information _replaces_ older information. If we're using a - SHA256 hash, we don't need to see the SHA1. If clients only want - microdescriptors, they don't (necessarily) need to see hashes of - other things. - - Our past approach to cases like this has been to shovel all of - the data into the consensus document. But this is rather poor - for bandwidth. Adding a single SHA256 hash to a consensus for - each router increases the compressed consensus size by 47%. In - comparison, replacing a single SHA1 hash with a SHA256 hash for - each listed router increases the consensus size by only 18%. - -Design in brief: - - Let the voting process remain as it is, until a consensus is - generated. With future versions of the voting algorithm, instead - of just a single consensus being generated, multiple consensus - "flavors" are produced. - - Consensuses (all of them) include a list of which flavors are - being generated. Caches fetch and serve all flavors of consensus - that are listed, regardless of whether they can parse or validate - them, and serve them to clients. Thus, once this design is in - place, we won't need to deploy more cache changes in order to get - new flavors of consensus to be cached. - - Clients download only the consensus flavor they want. - -A note on hashes: - - Everything in this document is specified to use SHA256, and to be - upgradeable to use better hashes in the future. - -Spec modifications: - - 1. URLs and changes to the current consensus format. - - Every consensus flavor has a name consisting of a sequence of one - or more alphanumeric characters and dashes. For compatibility - current descriptor flavor is called "ns". 
- - The supported consensus flavors are defined as part of the - authorities' consensus method. - - For each supported flavor, every authority calculates another - consensus document of as-yet-unspecified format, and exchanges - detached signatures for these documents as in the current consensus - design. - - In addition to the consensus currently served at - /tor/status-vote/(current|next)/consensus.z and - /tor/status-vote/(current|next)/consensus/<FP1>+<FP2>+<FP3>+....z , - authorities serve another consensus of each flavor "F" from the - locations /tor/status-vote/(current|next)/consensus-F.z. and - /tor/status-vote/(current|next)/consensus-F/<FP1>+....z. - - When caches serve these documents, they do so from the same - locations. - - 2. Document format: generic consensus. - - The format of a flavored consensus is as-yet-unspecified, except - that the first line is: - "network-status-version" SP version SP flavor NL - - where version is 3 or higher, and the flavor is a string - consisting of alphanumeric characters and dashes, matching the - corresponding flavor listed in the unflavored consensus. - - 3. Document format: detached signatures. - - We amend the detached signature format to include more than one - consensus-digest line, and more than one set of signatures. - - After the consensus-digest line, we allow more lines of the form: - "additional-digest" SP flavor SP algname SP digest NL - - Before the directory-signature lines, we allow more entries of the form: - "additional-signature" SP flavor SP algname SP identity SP - signing-key-digest NL signature. - - [We do not use "consensus-digest" or "directory-signature" for flavored - consensuses, since this could confuse older Tors.] - - The consensus-signatures URL should contain the signatures - for _all_ flavors of consensus. - - 4. The consensus index: - - Authorities additionally generate and serve a consensus-index - document. 
Its format is: - - Header ValidAfter ValidUntil Documents Signatures - - Header = "consensus-index" SP version NL - ValidAfter = as in a consensus - ValidUntil = as in a consensus - Documents = Document* - Document = "document" SP flavor SP SignedLength - 1*(SP AlgorithmName "=" Digest) NL - Signatures = Signature* - Signature = "directory-signature" SP algname SP identity - SP signing-key-digest NL signature - - There must be one Document line for each generated consensus flavor. - Each Document line describes the length of the signed portion of - a consensus (the signatures themselves are not included), along - with one or more digests of that signed portion. Digests are - given in hex. The algorithm "sha256" MUST be included; others - are allowed. - - The algname part of a signature describes what algorithm was - used to hash the identity and signing keys, and to compute the - signature. The algorithm "sha256" MUST be recognized; - signatures with unrecognized algorithms MUST be ignored. - (See below). - - The consensus index is made available at - /tor/status-vote/(current|next)/consensus-index.z. - - Caches should fetch this document so they can check the - correctness of the different consensus documents they fetch. - They do not need to check anything about an unrecognized - consensus document beyond its digest and length. - - 4.1. The "sha256" signature format. - - The 'SHA256' signature format for directory objects is defined as - the RSA signature of the OAEP+-padded SHA256 digest of the item to - be signed. When checking signatures, the signature MUST be treated - as valid if the signature material begins with SHA256(document); - this allows us to add other data later. - -Considerations: - - - We should not create a new flavor of consensus when adding a - field instead wouldn't be too onerous. - - - We should not proliferate flavors lightly: clients will be - distinguishable based on which flavor they download. 
- -Migration: - - - Stage one: authorities begin generating and serving - consensus-index files. - - - Stage two: Caches begin downloading consensus-index files, - validating them, and using them to decide what flavors of - consensus documents to cache. They download all listed - documents, and compare them to the digests given in the - consensus. - - - Stage three: Once we want to make a significant change to the - consensus format, we deploy another flavor of consensus at the - authorities. This will immediately start getting cached by the - caches, and clients can start fetching the new flavor without - waiting a version or two for enough caches to begin supporting - it. - -Acknowledgements: - - Aspects of this design and its applications to hash migration were - heavily influenced by IRC conversations with Marian. - diff --git a/doc/spec/proposals/163-detecting-clients.txt b/doc/spec/proposals/163-detecting-clients.txt deleted file mode 100644 index d838b17063..0000000000 --- a/doc/spec/proposals/163-detecting-clients.txt +++ /dev/null @@ -1,115 +0,0 @@ -Filename: 163-detecting-clients.txt -Title: Detecting whether a connection comes from a client -Author: Nick Mathewson -Created: 22-May-2009 -Target: 0.2.2 -Status: Open - - -Overview: - - Some aspects of Tor's design require relays to distinguish - connections from clients from connections that come from relays. - The existing means for doing this is easy to spoof. We propose - a better approach. - -Motivation: - - There are at least two reasons for which Tor servers want to tell - which connections come from clients and which come from other - servers: - - 1) Some exits, proposal 152 notwithstanding, want to disallow - their use as single-hop proxies. - 2) Some performance-related proposals involve prioritizing - traffic from relays, or limiting traffic per client (but not - per relay). - - Right now, we detect client vs server status based on how the - client opens circuits. 
(Check out the code that implements the - AllowSingleHopExits option if you want all the details.) This - method is depressingly easy to fake, though. This document - proposes better means. - -Goals: - - To make grabbing relay privileges at least as difficult as just - running a relay. - - In the analysis below, "using server privileges" means taking any - action that only servers are supposed to do, like delivering a - BEGIN cell to an exit node that doesn't allow single hop exits, - or claiming server-like amounts of bandwidth. - -Passive detection: - - A connection is definitely a client connection if it takes one of - the TLS methods during setup that does not establish an identity - key. - - A circuit is definitely a client circuit if it is initiated with - a CREATE_FAST cell, though the node could be a client or a server. - - A node that's listed in a recent consensus is probably a server. - - A node to which we have successfully extended circuits from - multiple origins is probably a server. - -Active detection: - - If a node doesn't try to use server privileges at all, we never - need to care whether it's a server. - - When a node or circuit tries to use server privileges, if it is - "definitely a client" as per above, we can refuse it immediately. - - If it's "probably a server" as per above, we can accept it. - - Otherwise, we have either a client, or a server that is neither - listed in any consensus or used by any other clients -- in other - words, a new or private server. - - For these servers, we should attempt to build one or more test - circuits through them. If enough of the circuits succeed, the - node is a real relay. If not, it is probably a client. - - While we are waiting for the test circuits to succeed, we should - allow a short grace period in which server privileges are - permitted. When a test is done, we should remember its outcome - for a while, so we don't need to do it again. 
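The passive and active detection rules of proposal 163 can be summarized as a small decision function. This is a hedged sketch, not Tor code: the `PeerInfo` fields, the `Verdict` names, and the `classify` function are hypothetical, and the sketch only covers the moment a peer first tries to use server privileges (the grace period and cached test outcomes are left out).

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    REFUSE = "definitely a client: refuse server privileges"
    ACCEPT = "probably a server: accept"
    TEST = "unknown: build test circuits through the node"


@dataclass
class PeerInfo:
    # All field names are assumptions made for this illustration.
    tls_has_identity_key: bool            # TLS setup established an identity key
    used_create_fast: bool                # circuit was started with CREATE_FAST
    in_recent_consensus: bool             # node is listed in a recent consensus
    extended_from_multiple_origins: bool  # we extended to it from several origins


def classify(peer: PeerInfo) -> Verdict:
    """Decide what to do when a peer tries to use server privileges."""
    # Passive signals that mark a connection or circuit as a client's.
    if not peer.tls_has_identity_key or peer.used_create_fast:
        return Verdict.REFUSE
    # Passive signals that make the node probably a server.
    if peer.in_recent_consensus or peer.extended_from_multiple_origins:
        return Verdict.ACCEPT
    # New or private server, or a client: needs active testing.
    return Verdict.TEST
```

A caller handling the `TEST` case would grant privileges for the grace period while the test circuits are built, then remember the outcome for a while.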
- -Why it's hard to do good testing: - - Doing a test circuit starting with an unlisted router requires - only that we have an open connection for it. Doing a test - circuit starting elsewhere _through_ an unlisted router--though - more reliable-- would require that we have a known address, port, - identity key, and onion key for the router. Only the address and - identity key are easily available via the current Tor protocol in - all cases. - - We could fix this part by requiring that all servers support - BEGIN_DIR and support downloading at least a current descriptor - for themselves. - -Open questions: - - What are the thresholds for the needed numbers of circuits - for us to decide that a node is a relay? - - [Suggested answer: two circuits from two distinct hosts.] - - How do we pick grace periods? How long do we remember the - outcome of a test? - - [Suggested answer: 10 minute grace period; 48 hour memory of - test outcomes.] - - If we can build circuits starting at a suspect node, but we don't - have enough information to try extending circuits elsewhere - through the node, should we conclude that the node is - "server-like" or not? - - [Suggested answer: for now, just try making circuits through - the node. Extend this to extending circuits as needed.] - diff --git a/doc/spec/proposals/164-reporting-server-status.txt b/doc/spec/proposals/164-reporting-server-status.txt deleted file mode 100644 index 705f5f1a84..0000000000 --- a/doc/spec/proposals/164-reporting-server-status.txt +++ /dev/null @@ -1,91 +0,0 @@ -Filename: 164-reporting-server-status.txt -Title: Reporting the status of server votes -Author: Nick Mathewson -Created: 22-May-2009 -Target: 0.2.2 -Status: Open - - -Overview: - - When a given node isn't listed in the directory, it isn't always easy - to tell why. This proposal suggests a quick-and-dirty way for - authorities to export not only how they voted, but why, and a way to - collate the information. 
- -Motivation: - - Right now, if you want to know the reason why your server was listed - a certain way in the Tor directory, the following steps are - recommended: - - - Look through your log for reports of what the authority said - when you tried to upload. - - - Look at the consensus; see if you're listed. - - - Wait a while, see if things get better. - - - Download the votes from all the authorities, and see how they - voted. Try to figure out why. - - - If you think they'll listen to you, ask some authority - operators to look you up in their mtbf files and logs to see - why they voted as they did. - - This is far too hard. - -Solution: - - We should add a new vote-like information-only document that - authorities serve on request. Call it a "vote info". It is - generated at the same time as a vote, but used only for - determining why a server voted as it did. It is served from - /tor/status-vote-info/current/authority[.z] - - It differs from a vote in that: - - * Its vote-status field is 'vote-info'. - - * It includes routers that the authority would not include - in its vote. - - For these, it includes an "omitted" line with an English - message explaining why they were omitted. - - * For each router, it includes a line describing its WFU and - MTBF. The format is: - - "stability <mtbf> up-since='date'" - "uptime <wfu> down-since='date'" - - * It describes the WFU and MTBF thresholds it requires to - vote for a given router in various roles in the header. - The format is: - - "flag-requirement <flag-name> <field> <op> <value>" - - e.g. - - "flag-requirement Guard uptime > 80" - - * It includes info on routers all of whose descriptors - were uploaded but rejected over the past few hours. The - "r" lines for these are the same as for regular routers. - The other lines are omitted for these routers, and are - replaced with a single "rejected" line, explaining (in - English) why the router was rejected. 
- - - A status site (like Torweather or Torstatus or another - tool) can poll these files when they are generated, collate - the data, and make it available to server operators. - -Risks: - - This document makes no provisions for caching these "vote - info" documents. If many people wind up fetching them - aggressively from the authorities, that would be bad. - - - diff --git a/doc/spec/proposals/165-simple-robust-voting.txt b/doc/spec/proposals/165-simple-robust-voting.txt deleted file mode 100644 index f813285a83..0000000000 --- a/doc/spec/proposals/165-simple-robust-voting.txt +++ /dev/null @@ -1,133 +0,0 @@ -Filename: 165-simple-robust-voting.txt -Title: Easy migration for voting authority sets -Author: Nick Mathewson -Created: 2009-05-28 -Status: Open - -Overview: - - This proposal describes an easy-to-implement, easy-to-verify way to - change the set of authorities without creating a "flag day" situation. - -Motivation: - - From proposal 134 ("More robust consensus voting with diverse - authority sets") by Peter Palfrader: - - Right now there are about five authoritative directory servers - in the Tor network, tho this number is expected to rise to about - 15 eventually. - - Adding a new authority requires synchronized action from all - operators of directory authorities so that at any time during the - update at least half of all authorities are running and agree on - who is an authority. The latter requirement is there so that the - authorities can arrive at a common consensus: Each authority - builds the consensus based on the votes from all authorities it - recognizes, and so a different set of recognized authorities will - lead to a different consensus document. - - In response to this problem, proposal 134 suggested that every - candidate authority list in its vote whom it believes to be an - authority. These A-says-B-is-an-authority relationships form a - directed graph. 
Each authority then iteratively finds the largest - clique in the graph and removes it, until they find one containing - them. They vote with this clique. - - Proposal 134 had some problems: - - - It had a security problem in that M hostile authorities in a - clique could effectively kick out M-1 honest authorities. This - could enable a minority of the original authorities to take over. - - - It was too complex in its implications to analyze well: it took us - over a year to realize that it was insecure. - - - It tried to solve a bigger problem: general fragmentation of - authority trust. Really, all we wanted to have was the ability to - add and remove authorities without forcing a flag day. - -Proposed protocol design: - - A "Voting Set" is a set of authorities. Each authority has a list of - the voting sets it considers acceptable. These sets are chosen - manually by the authority operators. They must always contain the - authority itself. Each authority lists all of these voting sets in - its votes. - - Authorities exchange votes with every other authority in any of their - voting sets. - - When it is time to calculate a consensus, an authority votes with - whichever voting set it lists that is listed by the most members of - that set. In other words, given two sets S1 and S2 that an authority - lists, that authority will prefer to vote with S1 over S2 whenever - the number of other authorities in S1 that themselves list S1 is - higher than the number of other authorities in S2 that themselves - list S2. - - For example, suppose authority A recognizes two sets, "A B C D" and - "A E F G H". Suppose that the first set is recognized by all of A, - B, C, and D, whereas the second set is recognized only by A, E, and - F. Because the first set is recognized by more of the authorities in - it than the other one, A will vote with the first set. - - Ties are broken in favor of some arbitrary function of the identity - keys of the authorities in the set. 
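The voting-set preference rule above can be sketched as follows. This is a minimal illustration under stated assumptions: `choose_voting_set`, its arguments, and the single-letter "fingerprints" are hypothetical, and the tie-break (comparing the sorted fingerprints of each set) is just one possible "arbitrary function of the identity keys", not something the proposal specifies.

```python
def choose_voting_set(me, my_sets, listed_by):
    """Return the voting set that authority `me` should vote with.

    me        -- this authority's fingerprint
    my_sets   -- the voting sets (frozensets of fingerprints) that `me` lists
    listed_by -- maps each other authority's fingerprint to the collection
                 of voting sets that authority lists in its vote
    """
    def support(voting_set):
        # Count the *other* members of the set that themselves list it.
        return sum(
            1
            for auth in voting_set
            if auth != me and voting_set in listed_by.get(auth, ())
        )

    # Prefer the most-supported set; break ties deterministically using
    # the sorted fingerprints (an assumption made for this sketch).
    return max(my_sets, key=lambda vs: (support(vs), tuple(sorted(vs))))


# The example from the text: A lists "A B C D" and "A E F G H".
s1 = frozenset("ABCD")
s2 = frozenset("AEFGH")
votes = {"B": {s1}, "C": {s1}, "D": {s1}, "E": {s2}, "F": {s2}}
chosen = choose_voting_set("A", [s1, s2], votes)
```

Here three other members (B, C, D) list the first set while only two (E, F) list the second, so `chosen` is the first set, matching the example in the proposal.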
- -How to migrate authority sets: - - In steady state, each authority operator should list only the current - actual voting set as accepted. - - When we want to add an authority, each authority operator configures - his or her server to list two voting sets: one containing all the old - authorities, and one containing the old authorities and the new - authority too. Once all authorities are listing the new set of - authorities, they will start voting with that set because of its - size. - - What if one or two authority operators are slow to list the new set? - Then the other operators can stop listing the old set once there are - enough authorities listing the new set to make its voting successful. - (Note that these authorities not listing the new set will still have - their votes counted, since they themselves will be members of the new - set. They will only fail to sign the consensus generated by the - other authorities who are using the new set.) - - When we want to remove an authority, the operators list two voting - sets: one containing all the authorities, and one omitting the - authority we want to remove. Once enough authorities list the new - set as acceptable, we start having authority operators stop listing - the old set. Once there are more listing the new set than the old - set, the new set will win. - -Data format changes: - - Add a new 'voting-set' line to the vote document format. Allow it to - occur any number of times. Its format is: - - voting-set SP 'fingerprint' SP 'fingerprint' ... NL - - where each fingerprint is the hex fingerprint of an identity key of - an authority. Sort fingerprints in ascending order. - - When the consensus method is at least 'X' (decide this when we - implement the proposal), add this line to the consensus format as - well, before the first dir-source line. 
[This information is not - redundant with the dir-source sections in the consensus: If an - authority is recognized but didn't vote, that authority will appear in - the voting-set line but not in the dir-source sections.] - - We don't need to list other information about authorities in our - vote. - -Migration issues: - - We should keep track somewhere of which Tor client versions - recognized which authorities. - -Acknowledgments: - - The design came out of an IRC conversation with Peter Palfrader. He - had the basic idea first. diff --git a/doc/spec/proposals/166-statistics-extra-info-docs.txt b/doc/spec/proposals/166-statistics-extra-info-docs.txt deleted file mode 100644 index ab2716a71c..0000000000 --- a/doc/spec/proposals/166-statistics-extra-info-docs.txt +++ /dev/null @@ -1,391 +0,0 @@ -Filename: 166-statistics-extra-info-docs.txt -Title: Including Network Statistics in Extra-Info Documents -Author: Karsten Loesing -Created: 21-Jul-2009 -Target: 0.2.2 -Status: Accepted - -Change history: - - 21-Jul-2009 Initial proposal for or-dev - - -Overview: - - The Tor network has grown to almost two thousand relays and millions - of casual users over the past few years. With growth has come - increasing performance problems and attempts by some countries to - block access to the Tor network. In order to address these problems, - we need to learn more about the Tor network. This proposal suggests to - measure additional statistics and include them in extra-info documents - to help us understand the Tor network better. - - -Introduction: - - As of May 2009, relays, bridges, and directories gather the following - data for statistical purposes: - - - Relays and bridges count the number of bytes that they have pushed - in 15-minute intervals over the past 24 hours. Relays and bridges - include these data in extra-info documents that they send to the - directory authorities whenever they publish their server descriptor. 
- - - Bridges further include a rough number of clients per country that - they have seen in the past 48 hours in their extra-info documents. - - - Directories can be configured to count the number of clients they - see per country in the past 24 hours and to write them to a local - file. - - Since then we extended the network statistics in Tor. These statistics - include: - - - Directories now gather more precise statistics about connecting - clients. Fixes include measuring in intervals of exactly 24 hours, - counting unsuccessful requests, measuring download times, etc. The - directories append their statistics to a local file every 24 hours. - - - Entry guards count the number of clients per country per day like - bridges do and write them to a local file every 24 hours. - - - Relays measure statistics of the number of cells in their circuit - queues and how much time these cells spend waiting there. Relays - write these statistics to a local file every 24 hours. - - - Exit nodes count the number of read and written bytes on exit - connections per port as well as the number of opened exit streams - per port in 24-hour intervals. Exit nodes write their statistics to - a local file. - - The following four sections contain descriptions for adding these - statistics to the relays' extra-info documents. - - -Directory request statistics: - - The first type of statistics aims at measuring directory requests sent - by clients to a directory mirror or directory authority. More - precisely, these statistics aim at requests for v2 and v3 network - statuses only. These directory requests are sent non-anonymously, - either via HTTP-like requests to a directory's Dir port or tunneled - over a 1-hop circuit. - - Measuring directory request statistics is useful for several reasons: - First, the number of locally seen directory requests can be used to - estimate the total number of clients in the Tor network. 
Second, the - country-wise classification of requests using a GeoIP database can - help counting the relative and absolute number of users per country. - Third, the download times can give hints on the available bandwidth - capacity at clients. - - Directory requests do not give any hints on the contents that clients - send or receive over the Tor network. Every client requests network - statuses from the directories, so that there are no anonymity-related - concerns to gather these statistics. It might be, though, that clients - wish to hide the fact that they are connecting to the Tor network. - Therefore, IP addresses are resolved to country codes in memory, - events are accumulated over 24 hours, and numbers are rounded up to - multiples of 4 or 8. - - "dirreq-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL - [At most once.] - - YYYY-MM-DD HH:MM:SS defines the end of the included measurement - interval of length NSEC seconds (86400 seconds by default). - - A "dirreq-stats-end" line, as well as any other "dirreq-*" line, - is only added when the relay has opened its Dir port and after 24 - hours of measuring directory requests. - - "dirreq-v2-ips" CC=N,CC=N,... NL - [At most once.] - "dirreq-v3-ips" CC=N,CC=N,... NL - [At most once.] - - List of mappings from two-letter country codes to the number of - unique IP addresses that have connected from that country to - request a v2/v3 network status, rounded up to the nearest multiple - of 8. Only those IP addresses are counted that the directory can - answer with a 200 OK status code. - - "dirreq-v2-reqs" CC=N,CC=N,... NL - [At most once.] - "dirreq-v3-reqs" CC=N,CC=N,... NL - [At most once.] - - List of mappings from two-letter country codes to the number of - requests for v2/v3 network statuses from that country, rounded up - to the nearest multiple of 8. Only those requests are counted that - the directory can answer with a 200 OK status code. - - "dirreq-v2-share" num% NL - [At most once.] 
- "dirreq-v3-share" num% NL - [At most once.] - - The share of v2/v3 network status requests that the directory - expects to receive from clients based on its advertised bandwidth - compared to the overall network bandwidth capacity. Shares are - formatted in percent with two decimal places. Shares are - calculated as means over the whole 24-hour interval. - - "dirreq-v2-resp" status=num,... NL - [At most once.] - "dirreq-v3-resp" status=num,... NL - [At most once.] - - List of mappings from response statuses to the number of requests - for v2/v3 network statuses that were answered with that response - status, rounded up to the nearest multiple of 4. Only response - statuses with at least 1 response are reported. New response - statuses can be added at any time. The current list of response - statuses is as follows: - - "ok": a network status request is answered; this number - corresponds to the sum of all requests as reported in - "dirreq-v2-reqs" or "dirreq-v3-reqs", respectively, before - rounding up. - "not-enough-sigs": a version 3 network status is not signed by a - sufficient number of requested authorities. - "unavailable": a requested network status object is unavailable. - "not-found": a requested network status is not found. - "not-modified": a network status has not been modified since the - If-Modified-Since time that is included in the request. - "busy": the directory is busy. - - "dirreq-v2-direct-dl" key=val,... NL - [At most once.] - "dirreq-v3-direct-dl" key=val,... NL - [At most once.] - "dirreq-v2-tunneled-dl" key=val,... NL - [At most once.] - "dirreq-v3-tunneled-dl" key=val,... NL - [At most once.] - - List of statistics about possible failures in the download process - of v2/v3 network statuses. Requests are either "direct" - HTTP-encoded requests over the relay's directory port, or - "tunneled" requests using a BEGIN_DIR cell over the relay's OR - port. 
The list of possible statistics can change, and statistics - can be left out from reporting. The current list of statistics is - as follows: - - Successful downloads and failures: - - "complete": a client has finished the download successfully. - "timeout": a download did not finish within 10 minutes after - starting to send the response. - "running": a download is still running at the end of the - measurement period for less than 10 minutes after starting to - send the response. - - Download times: - - "min", "max": smallest and largest measured bandwidth in B/s. - "d[1-4,6-9]": 1st to 4th and 6th to 9th decile of measured - bandwidth in B/s. For a given decile i, i/10 of all downloads - had a smaller bandwidth than di, and (10-i)/10 of all downloads - had a larger bandwidth than di. - "q[1,3]": 1st and 3rd quartile of measured bandwidth in B/s. One - fourth of all downloads had a smaller bandwidth than q1, one - fourth of all downloads had a larger bandwidth than q3, and the - remaining half of all downloads had a bandwidth between q1 and - q3. - "md": median of measured bandwidth in B/s. Half of the downloads - had a smaller bandwidth than md, the other half had a larger - bandwidth than md. - - -Entry guard statistics: - - Entry guard statistics include the number of clients per country and - per day that are connecting directly to an entry guard. - - Entry guard statistics are important to learn more about the - distribution of clients to countries. In the future, this knowledge - can be useful to detect if there are or start to be any restrictions - for clients connecting from specific countries. - - The information which client connects to a given entry guard is very - sensitive. This information must not be combined with the information - what contents are leaving the network at the exit nodes. Therefore, - entry guard statistics need to be aggregated to prevent them from - becoming useful for de-anonymization. 
Aggregation includes resolving - IP addresses to country codes, counting events over 24-hour intervals, - and rounding up numbers to the next multiple of 8. - - "entry-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL - [At most once.] - - YYYY-MM-DD HH:MM:SS defines the end of the included measurement - interval of length NSEC seconds (86400 seconds by default). - - An "entry-stats-end" line, as well as any other "entry-*" - line, is first added after the relay has been running for at least - 24 hours. - - "entry-ips" CC=N,CC=N,... NL - [At most once.] - - List of mappings from two-letter country codes to the number of - unique IP addresses that have connected from that country to the - relay and which are no known other relays, rounded up to the - nearest multiple of 8. - - -Cell statistics: - - The third type of statistics have to do with the time that cells spend - in circuit queues. In order to gather these statistics, the relay - memorizes when it puts a given cell in a circuit queue and when this - cell is flushed. The relay further notes the life time of the circuit. - These data are sufficient to determine the mean number of cells in a - queue over time and the mean time that cells spend in a queue. - - Cell statistics are necessary to learn more about possible reasons for - the poor network performance of the Tor network, especially high - latencies. The same statistics are also useful to determine the - effects of design changes by comparing today's data with future data. - - There are basically no privacy concerns from measuring cell - statistics, regardless of a node being an entry, middle, or exit node. - - "cell-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL - [At most once.] - - YYYY-MM-DD HH:MM:SS defines the end of the included measurement - interval of length NSEC seconds (86400 seconds by default). - - A "cell-stats-end" line, as well as any other "cell-*" line, - is first added after the relay has been running for at least 24 - hours. 
- - "cell-processed-cells" num,...,num NL - [At most once.] - - Mean number of processed cells per circuit, subdivided into - deciles of circuits by the number of cells they have processed in - descending order from loudest to quietest circuits. - - "cell-queued-cells" num,...,num NL - [At most once.] - - Mean number of cells contained in queues by circuit decile. These - means are calculated by 1) determining the mean number of cells in - a single circuit between its creation and its termination and 2) - calculating the mean for all circuits in a given decile as - determined in "cell-processed-cells". Numbers have a precision of - two decimal places. - - "cell-time-in-queue" num,...,num NL - [At most once.] - - Mean time cells spend in circuit queues in milliseconds. Times are - calculated by 1) determining the mean time cells spend in the - queue of a single circuit and 2) calculating the mean for all - circuits in a given decile as determined in - "cell-processed-cells". - - "cell-circuits-per-decile" num NL - [At most once.] - - Mean number of circuits that are included in any of the deciles, - rounded up to the next integer. - - -Exit statistics: - - The last type of statistics affects exit nodes counting the number of - bytes written and read and the number of streams opened per port and - per 24 hours. Exit port statistics can be measured from looking at - headers of BEGIN and DATA cells. A BEGIN cell contains the exit port - that is required for the exit node to open a new exit stream. - Subsequent DATA cells coming from the client or being sent back to the - client contain a length field stating how many bytes of application - data are contained in the cell. - - Exit port statistics are important to measure in order to identify - possible load-balancing problems with respect to exit policies. Exit - nodes that permit more ports than others are very likely overloaded - with traffic for those ports plus traffic for other ports. 
Improving - load balancing in the Tor network improves the overall utilization of - bandwidth capacity. - - Exit traffic is one of the most sensitive parts of network data in the - Tor network. Even though these statistics do not require looking at - traffic contents, statistics are aggregated so that they are not - useful for de-anonymizing users. Only those ports are reported that - have seen at least 0.1% of exiting or incoming bytes, numbers of bytes - are rounded up to full kibibytes (KiB), and stream numbers are rounded - up to the next multiple of 4. - - "exit-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL - [At most once.] - - YYYY-MM-DD HH:MM:SS defines the end of the included measurement - interval of length NSEC seconds (86400 seconds by default). - - An "exit-stats-end" line, as well as any other "exit-*" line, is - first added after the relay has been running for at least 24 hours - and only if the relay permits exiting (where exiting to a single - port and IP address is sufficient). - - "exit-kibibytes-written" port=N,port=N,... NL - [At most once.] - "exit-kibibytes-read" port=N,port=N,... NL - [At most once.] - - List of mappings from ports to the number of kibibytes that the - relay has written to or read from exit connections to that port, - rounded up to the next full kibibyte. - - "exit-streams-opened" port=N,port=N,... NL - [At most once.] - - List of mappings from ports to the number of opened exit streams - to that port, rounded up to the nearest multiple of 4. - - -Implementation notes: - - Right now, relays that are configured accordingly write similar - statistics to those described in this proposal to disk every 24 hours. - With this proposal being implemented, relays include the contents of - these files in extra-info documents. - - The following steps are necessary to implement this proposal: - - 1. The current format of [dirreq|entry|buffer|exit]-stats files needs - to be adapted to the description in this proposal. 
This step - basically means renaming keywords. - - 2. The timing of writing the four *-stats files should be unified, so - that they are written exactly 24 hours after starting the - relay. Right now, the measurement intervals for dirreq, entry, and - exit stats starts with the first observed request, and files are - written when observing the first request that occurs more than 24 - hours after the beginning of the measurement interval. With this - proposal, the measurement intervals should all start at the same - time, and files should be written exactly 24 hours later. - - 3. It is advantageous to cache statistics in local files in the data - directory until they are included in extra-info documents. The - reason is that the 24-hour measurement interval can be very - different from the 18-hour publication interval of extra-info - documents. When a relay crashes after finishing a measurement - interval, but before publishing the next extra-info document, - statistics would get lost. Therefore, statistics are written to - disk when finishing a measurement interval and read from disk when - generating an extra-info document. Only the statistics that were - appended to the *-stats files within the past 24 hours are included - in extra-info documents. Further, the contents of the *-stats files - need to be checked in the process of generating extra-info documents. - - 4. With the statistics patches being tested, the ./configure options - should be removed and the statistics code be compiled by default. - It is still required for relay operators to add configuration - options (DirReqStatistics, ExitPortStatistics, etc.) to enable - gathering statistics. However, in the near future, statistics shall - be enabled gathered by all relays by default, where requiring a - ./configure option would be a barrier for many relay operators. 
diff --git a/doc/spec/proposals/167-params-in-consensus.txt b/doc/spec/proposals/167-params-in-consensus.txt deleted file mode 100644 index d23bc9c01e..0000000000 --- a/doc/spec/proposals/167-params-in-consensus.txt +++ /dev/null @@ -1,47 +0,0 @@ -Filename: 167-params-in-consensus.txt -Title: Vote on network parameters in consensus -Author: Roger Dingledine -Created: 18-Aug-2009 -Status: Closed -Implemented-In: 0.2.2 - -0. History - - -1. Overview - - Several of our new performance plans involve guessing how to tune - clients and relays, yet we won't be able to learn whether we guessed - the right tuning parameters until many people have upgraded. Instead, - we should have directory authorities vote on the parameters, and teach - Tors to read the currently recommended values out of the consensus. - -2. Design - - V3 votes should include a new "params" line after the known-flags - line. It contains key=value pairs, where value is an integer. - - Consensus documents that are generated with a sufficiently new consensus - method (7?) then include a params line that includes every key listed - in any vote, and the median value for that key (in case of ties, - we use the median closer to zero). - -2.1. Planned keys. - - The first planned parameter is "circwindow=101", which is the initial - circuit packaging window that clients and relays should use. Putting - it in the consensus will let us perform experiments with different - values once enough Tors have upgraded -- see proposal 168. - - Later parameters might include a weighting for how much to favor quiet - circuits over loud circuits in our round-robin algorithm; a weighting - for how much to prioritize relays over clients if we use an incentive - scheme like the gold-star design; and what fraction of circuits we - should throw out from proposal 151. - -2.2. What about non-integers? - - I'm not sure how we would do median on non-integer values. Further, - I don't have any non-integer values in mind yet. 
So I say we cross - that bridge when we get to it. - diff --git a/doc/spec/proposals/168-reduce-circwindow.txt b/doc/spec/proposals/168-reduce-circwindow.txt deleted file mode 100644 index c10cf41e2e..0000000000 --- a/doc/spec/proposals/168-reduce-circwindow.txt +++ /dev/null @@ -1,134 +0,0 @@ -Filename: 168-reduce-circwindow.txt -Title: Reduce default circuit window -Author: Roger Dingledine -Created: 12-Aug-2009 -Status: Open -Target: 0.2.2 - -0. History - - -1. Overview - - We should reduce the starting circuit "package window" from 1000 to - 101. The lower package window will mean that clients will only be able - to receive 101 cells (~50KB) on a circuit before they need to send a - 'sendme' acknowledgement cell to request 100 more. - - Starting with a lower package window on exit relays should save on - buffer sizes (and thus memory requirements for the exit relay), and - should save on queue sizes (and thus latency for users). - - Lowering the package window will induce an extra round-trip for every - additional 50298 bytes of the circuit. This extra step is clearly a - slow-down for large streams, but ultimately we hope that a) clients - fetching smaller streams will see better response, and b) slowing - down the large streams in this way will produce lower e2e latencies, - so the round-trips won't be so bad. - -2. Motivation - - Karsten's torperf graphs show that the median download time for a 50KB - file over Tor in mid 2009 is 7.7 seconds, whereas the median download - time for 1MB and 5MB are around 50s and 150s respectively. The 7.7 - second figure is way too high, whereas the 50s and 150s figures are - surprisingly low. - - The median round-trip latency appears to be around 2s, with 25% of - the data points taking more than 5s. That's a lot of variance. - - We designed Tor originally with the original goal of maximizing - throughput. We figured that would also optimize other network properties - like round-trip latency. Looks like we were wrong. - -3. 
Design - - Wherever we initialize the circuit package window, initialize it to - 101 rather than 1000. Reducing it should be safe even when interacting - with old Tors: the old Tors will receive the 101 cells and send back - a sendme ack cell. They'll still have much higher deliver windows, - but the rest of their deliver window will go unused. - - You can find the patch at arma/circwindow. It seems to work. - -3.1. Why not 100? - - Tor 0.0.0 through 0.2.1.19 have a bug where they only send the sendme - ack cell after 101 cells rather than the intended 100 cells. - - Once 0.2.1.19 is obsolete we can change it back to 100 if we like. But - hopefully we'll have moved to some datagram protocol long before - 0.2.1.19 becomes obsolete. - -3.2. What about stream packaging windows? - - Right now the stream packaging windows start at 500. The goal was to - set the stream window to half the circuit window, to provide a crude - load balancing between streams on the same circuit. Once we lower - the circuit packaging window, the stream packaging window basically - becomes redundant. - - We could leave it in -- it isn't hurting much in either case. Or we - could take it out -- people building other Tor clients would thank us - for that step. Alas, people building other Tor clients are going to - have to be compatible with current Tor clients, so in practice there's - no point taking out the stream packaging windows. - -3.3. What about variable circuit windows? - - Once upon a time we imagined adapting the circuit package window to - the network conditions. That is, we would start the window small, - and raise it based on the latency and throughput we see. - - In theory that crude imitation of TCP's windowing system would allow - us to adapt to fill the network better. In practice, I think we want - to stick with the small window and never raise it. The low cap reduces - the total throughput you can get from Tor for a given circuit. But - that's a feature, not a bug. - -4. 
Evaluation - - How do we know this change is actually smart? It seems intuitive that - it's helpful, and some smart systems people have agreed that it's - a good idea (or said another way, they were shocked at how big the - default package window was before). - - To get a more concrete sense of the benefit, though, Karsten has been - running torperf side-by-side on exit relays with the old package window - vs the new one. The results are mixed currently -- it is slightly faster - for fetching 40KB files, and slightly slower for fetching 50KB files. - - I think it's going to be tough to get a clear conclusion that this is - a good design just by comparing one exit relay running the patch. The - trouble is that the other hops in the circuits are still getting bogged - down by other clients introducing too much traffic into the network. - - Ultimately, we'll want to put the circwindow parameter into the - consensus so we can test a broader range of values once enough relays - have upgraded. - -5. Transition and deployment - - We should put the circwindow in the consensus (see proposal 167), - with an initial value of 101. Then as more exit relays upgrade, - clients should seamlessly get the better behavior. - - Note that upgrading the exit relay will only affect the "download" - package window. An old client that's uploading lots of bytes will - continue to use the old package window at the client side, and we - can't throttle that window at the exit side without breaking protocol. - - The real question then is what we should backport to 0.2.1. Assuming - this could be a big performance win, we can't afford to wait until - 0.2.2.x comes out before starting to see the changes here. So we have - two options as I see them: - a) once clients in 0.2.2.x know how to read the value out of the - consensus, and it's been tested for a bit, backport that part to - 0.2.1.x. - b) if it's too complex to backport, just pick a number, like 101, and - backport that number. 
- - Clearly choice (a) is the better one if the consensus parsing part - isn't very complex. Let's shoot for that, and fall back to (b) if the - patch turns out to be so big that we reconsider. - diff --git a/doc/spec/proposals/169-eliminating-renegotiation.txt b/doc/spec/proposals/169-eliminating-renegotiation.txt deleted file mode 100644 index 2c90f9c9e8..0000000000 --- a/doc/spec/proposals/169-eliminating-renegotiation.txt +++ /dev/null @@ -1,404 +0,0 @@ -Filename: 169-eliminating-renegotiation.txt -Title: Eliminate TLS renegotiation for the Tor connection handshake -Author: Nick Mathewson -Created: 27-Jan-2010 -Status: Draft -Target: 0.2.2 - -1. Overview - - I propose a backward-compatible change to the Tor connection - establishment protocol to avoid the use of TLS renegotiation. - - Rather than doing a TLS renegotiation to exchange certificates - and authenticate the original handshake, this proposal takes an - approach similar to Steven Murdoch's proposal 124, and uses Tor - cells to finish authenticating the parties' identities once the - initial TLS handshake is finished. - - Terminological note: I use "client" below to mean the Tor - instance (a client or a relay) that initiates a TLS connection, - and "server" to mean the Tor instance (a relay) that accepts it. - -2. Motivation and history - - In the original Tor TLS connection handshake protocol ("V1", or - "two-cert"), parties that wanted to authenticate provided a - two-cert chain of X.509 certificates during the handshake setup - phase. Every party that wanted to authenticate sent these - certificates. - - In the current Tor TLS connection handshake protocol ("V2", or - "renegotiating"), the parties begin with a single certificate - sent from the server (responder) to the client (initiator), and - then renegotiate to a two-certs-from-each-authenticating party. - We made this change to make Tor's handshake look like a browser - speaking SSL to a webserver. (See proposal 130, and - tor-spec.txt.) 
To tell whether to use the V1 or V2 handshake, - servers look at the list of ciphers sent by the client. (This is - ugly, but there's not much else in the ClientHello that they can - look at.) If the list contains any cipher not used by the V1 - protocol, the server sends back a single cert and expects a - renegotiation. If the client gets back a single cert, then it - withholds its own certificates until the TLS renegotiation phase. - - In other words, initiator behavior now looks like this: - - - Begin TLS negotiation with V2 cipher list; wait for - certificate(s). - - If we get a certificate chain: - - Then we are using the V1 handshake. Send our own - certificate chain as part of this initial TLS handshake - if we want to authenticate; otherwise, send no - certificates. When the handshake completes, check - certificates. We are now mutually authenticated. - - Otherwise, if we get just a single certificate: - - Then we are using the V2 handshake. Do not send any - certificates during this handshake. - - When the handshake is done, immediately start a TLS - renegotiation. During the renegotiation, expect - a certificate chain from the server; send a certificate - chain of our own if we want to authenticate ourselves. - - After the renegotiation, check the certificates. Then - send (and expect) a VERSIONS cell from the other side to - establish the link protocol version. - - And V2 responder behavior now looks like this: - - - When we get a TLS ClientHello request, look at the cipher - list. - - If the cipher list contains only the V1 ciphersuites: - - Then we're doing a V1 handshake. Send a certificate - chain. Expect a possible client certificate chain in - response. - Otherwise, if we get other ciphersuites: - - We're using the V2 handshake. Send back a single - certificate and let the handshake complete. - - Do not accept any data until the client has renegotiated. 
- - When the client is renegotiating, send a certificate - chain, and expect (possibly multiple) certificates in - reply. - - Check the certificates when the renegotiation is done. - Then exchange VERSIONS cells. - - Late in 2009, researchers found a flaw in most applications' use - of TLS renegotiation: Although TLS renegotiation does not - reauthenticate any information exchanged before the renegotiation - takes place, many applications were treating it as though it did, - and assuming that data sent _before_ the renegotiation was - authenticated with the credentials negotiated _during_ the - renegotiation. This problem was exacerbated by the fact that - most TLS libraries don't actually give you an obvious good way to - tell where the renegotiation occurred relative to the datastream. - Tor wasn't directly affected by this vulnerability, but its - aftermath hurts us in a few ways: - - 1) OpenSSL has disabled renegotiation by default, and created - a "yes we know what we're doing" option we need to set to - turn it back on. (Two options, actually: one for openssl - 0.9.8l and one for 0.9.8m and later.) - - 2) Some vendors have removed all renegotiation support from - their versions of OpenSSL entirely, forcing us to tell - users to either replace their versions of OpenSSL or to - link Tor against a hand-built one. - - 3) Because of 1 and 2, I'd expect TLS renegotiation to become - rarer and rarer in the wild, making our own use stand out - more. - -3. Design - -3.1. The view in the large - - Taking a cue from Steven Murdoch's proposal 124, I propose that - we move the work currently done by the TLS renegotiation step - (that is, authenticating the parties to one another) and do it - with Tor cells instead of with TLS. - - Using _yet another_ variant response from the responder (server), - we allow the client to learn that it doesn't need to rehandshake - and can instead use a cell-based authentication system. 
Once the - TLS handshake is done, the client and server exchange VERSIONS - cells to determine link protocol version (including - handshake version). If they're using the handshake version - specified here, the client and server arrive at link protocol - version 3 (or higher), and use cells to exchange further - authentication information. - -3.2. New TLS handshake variant - - We already used the list of ciphers from the clienthello to - indicate whether the client can speak the V2 ("renegotiating") - handshake or later, so we can't encode more information there. - - We can, however, change the DN in the certificate passed by the - server back to the client. Currently, all V2 certificates are - generated with CN values ending with ".net". I propose that we - have the ".net" commonName ending reserved to indicate the V2 - protocol, and use commonName values ending with ".com" to - indicate the V3 ("minimal") handshake described herein. - - Now, once the initial TLS handshake is done, the client can look - at the server's certificate(s). If there is a certificate chain, - the handshake is V1. If there is a single certificate whose - subject commonName ends in ".net", the handshake is V2 and the - client should try to renegotiate as it would currently. - Otherwise, the client should assume that the handshake is V3+. - [Servers should _only_ send ".com" addresses, to allow room for - more signaling in the future.] - -3.3. Authenticating inside Tor - - Once the TLS handshake is finished, if the client renegotiates, - then the server should go on as it does currently. - - If the client implements this proposal, however, and the server - has shown it can understand the V3+ handshake protocol, the - client immediately sends a VERSIONS cell to the server - and waits to receive a VERSIONS cell in return. We negotiate - the Tor link protocol version _before_ we proceed with the - negotiation, in case we need to change the authentication - protocol in the future.
- - Once either party has seen the VERSIONS cell from the other, it - knows which version they will pick (that is, the highest version - shared by both parties' VERSIONS cells). All Tor instances using - the handshake protocol described in 3.2 MUST support at least - link protocol version 3 as described here. - - On learning the link protocol, the server then sends the client a - CERT cell and a NETINFO cell. If the client wants to - authenticate to the server, it sends a CERT cell, an AUTHENTICATE - cell, and a NETINFO cell, or it may simply send a NETINFO cell if - it does not want to authenticate. - - The CERT cell describes the keys that a Tor instance is claiming - to have. It is a variable-length cell. Its payload format is: - - N: Number of certs in cell [1 octet] - N times: - CLEN [2 octets] - Certificate [CLEN octets] - - Any extra octets at the end of a CERT cell MUST be ignored. - - Each certificate has the form: - - CertType [1 octet] - CertPurpose [1 octet] - PublicKeyLen [2 octets] - PublicKey [PublicKeyLen octets] - NotBefore [4 octets] - NotAfter [4 octets] - SignerID [HASH256_LEN octets] - SignatureLen [2 octets] - Signature [SignatureLen octets] - - where CertType is 1 (meaning "RSA/SHA256") - CertPurpose is 1 (meaning "link certificate") - PublicKey is the DER encoding of the ASN.1 representation - of the RSA key of the subject of this certificate, - NotBefore is a time in HOURS since January 1, 1970, 00:00 - UTC before which this certificate should not be - considered valid. - NotAfter is a time in HOURS since January 1, 1970, 00:00 - UTC after which this certificate should not be - considered valid. - SignerID is the SHA-256 digest of the public key signing - this certificate - and Signature is the signature of all the other fields in - this certificate, using SHA256 as described in proposal - 158. - - While authenticating, a server need send only a self-signed - certificate for its identity key.
(Its TLS certificate already - contains its link key signed by its identity key.) A client that - wants to authenticate MUST send two certificates: one containing - a public link key signed by its identity key, and one self-signed - cert for its identity. - - Tor instances MUST ignore any certificate with an unrecognized - CertType or CertPurpose, and MUST ignore extra bytes in the cert. - - The AUTHENTICATE cell proves to the server that the client with - whom it completed the initial TLS handshake is the one possessing - the link public key in its certificate. It is a variable-length - cell. Its contents are: - - SignatureType [2 octets] - SignatureLen [2 octets] - Signature [SignatureLen octets] - - where SignatureType is 1 (meaning "RSA-SHA256") and Signature is - an RSA-SHA256 signature of the HMAC-SHA256, using the TLS master - secret key as its key, of the following elements: - - - The SignatureType field (0x00 0x01) - - The NUL terminated ASCII string: "Tor certificate verification" - - client_random, as sent in the Client Hello - - server_random, as sent in the Server Hello - - Once the above handshake is complete, the client knows (from the - initial TLS handshake) that it has a secure connection to an - entity that controls a given link public key, and knows (from the - CERT cell) that the link public key is a valid public key for a - given Tor identity. - - If the client authenticates, the server learns from the CERT cell - that a given Tor identity has a given current public link key. - From the AUTHENTICATE cell, it knows that an entity with that - link key knows the master secret for the TLS connection, and - hence must be the party with whom it's talking, if TLS works. - -3.4. Security checks - - If the TLS handshake indicates a V2 or V3+ connection, the server - MUST reject any connection from the client that does not begin - with either a renegotiation attempt or a VERSIONS cell containing - at least link protocol version "3". 
If the TLS handshake - indicates a V3+ connection, the client MUST reject any connection - where the server sends anything before the client has sent a - VERSIONS cell, and any connection where the VERSIONS cell does - not contain at least link protocol version "3". - - If link protocol version 3 is chosen: - - Clients and servers MUST check that all digests and signatures - on the certificates in CERT cells they are given are as - described above. - - After the VERSIONS cell, clients and servers MUST close the - connection if anything besides a CERT or AUTH cell is sent - before the - - CERT or AUTHENTICATE cells anywhere after the first NETINFO - cell must be rejected. - - ... [write more here. What else?] ... - -3.5. Summary - - We now revisit the protocol outlines from section 2 to incorporate - our changes. New or modified steps are marked with a *. - - The new initiator behavior now looks like this: - - - Begin TLS negotiation with V2 cipher list; wait for - certificate(s). - - If we get a certificate chain: - - Then we are using the V1 handshake. Send our own - certificate chain as part of this initial TLS handshake - if we want to authenticate; otherwise, send no - certificates. When the handshake completes, check - certificates. We are now mutually authenticated. - Otherwise, if we get just a single certificate: - - Then we are using the V2 or the V3+ handshake. Do not - send any certificates during this handshake. - * When the handshake is done, look at the server's - certificate's subject commonName. - * If it ends with ".net", we're doing a V2 handshake: - - Immediately start a TLS renegotiation. During the - renegotiation, expect a certificate chain from the - server; send a certificate chain of our own if we - want to authenticate ourselves. - - After the renegotiation, check the certificates. Then - send (and expect) a VERSIONS cell from the other side - to establish the link protocol version. 
- * If it ends with anything else, assume a V3 or later - handshake: - * Send a VERSIONS cell, and wait for a VERSIONS cell - from the server. - * If we are authenticating, send CERT and AUTHENTICATE - cells. - * Send a NETINFO cell. Wait for a CERT and a NETINFO - cell from the server. - * If the CERT cell contains a valid self-identity cert, - and the identity key in the cert can be used to check - the signature on the x.509 certificate we got during - the TLS handshake, then we know we connected to the - server with that identity. If any of these checks - fail, or the identity key was not what we expected, - then we close the connection. - * Once the NETINFO cell arrives, continue as before. - - And V3+ responder behavior now looks like this: - - - When we get a TLS ClientHello request, look at the cipher - list. - - - If the cipher list contains only the V1 ciphersuites: - - Then we're doing a V1 handshake. Send a certificate - chain. Expect a possible client certificate chain in - response. - Otherwise, if we get other ciphersuites: - - We're using the V2 handshake. Send back a single - certificate whose subject commonName ends with ".com", - and let the handshake complete. - * If the client does anything besides renegotiate or send a - VERSIONS cell, drop the connection. - - If the client renegotiates immediately, it's a V2 - connection: - - When the client is renegotiating, send a certificate - chain, and expect (possibly multiple certificates in - reply). - - Check the certificates when the renegotiation is done. - Then exchange VERSIONS cells. - * Otherwise we got a VERSIONS cell and it's a V3 handshake. - * Send a VERSIONS cell, a CERT cell, an AUTHENTICATE - cell, and a NETINFO cell. - * Wait for the client to send cells in reply. If the - client sends a CERT and an AUTHENTICATE and a NETINFO, - use them to authenticate the client. If the client - sends a NETINFO, it is unauthenticated. If it sends - anything else before its NETINFO, it's rejected. - -4. 
Numbers to assign - - We need a version number for this link protocol. I've been - calling it "3". - - We need to reserve command numbers for CERT and AUTH cells. I - suggest that in link protocol 3 and higher, we reserve command - numbers 128..240 for variable-length cells. (241-256 we can hold - for future extensions.) - -5. Efficiency - - This protocol adds a round-trip step when the client sends a - VERSIONS cell to the server, and waits for the {VERSIONS, CERT, - NETINFO} response in turn. (The server then waits for the - client's {NETINFO} or {CERT, AUTHENTICATE, NETINFO} reply, - but it would have already been waiting for the client's NETINFO, - so that's not an additional wait.) - - This is actually fewer round-trip steps than required before for - TLS renegotiation, so that's a win. - -6. Open questions: - - - Should we use X.509 certificates instead of the certificate-ish - things we describe here? They are more standard, but more ugly. - - - May we cache which certificates we've already verified? It - might leak in timing whether we've connected with a given server - before, and how recently. - - - Is there a better secret than the master secret to use in the - AUTHENTICATE cell? Say, a portable one? Can we get at it for - other libraries besides OpenSSL? - - - Does using the client_random and server_random data in the - AUTHENTICATE message actually help us? How hard is it to pull - them out of the OpenSSL data structure? - - - Can we give some way for clients to signal "I want to use the - V3 protocol if possible, but I can't renegotiate, so don't give - me the V2"? Clients currently have a fair idea of server - versions, so they could potentially do the V3+ handshake with - servers that support it, and fall back to V1 otherwise. - - - What should servers that don't have TLS renegotiation do? For - now, I think they should just get it. Eventually we can - deprecate the V2 handshake as we did with the V1 handshake. 
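The V3+ responder behavior described just before section 4 can be sketched as three small decision functions. This is only an illustration under stated assumptions, not Tor's implementation: the names Handshake, classify_client_hello, classify_first_action and classify_client_auth are hypothetical, the set of V1 ciphersuites is passed in as a parameter rather than assumed, cells are represented only by their names, and the strict CERT, AUTHENTICATE, NETINFO ordering is an assumption that the text does not spell out.

```python
from enum import Enum
from typing import Iterable, Optional, Sequence


class Handshake(Enum):
    V1 = 1
    V2 = 2
    V3 = 3


def classify_client_hello(ciphers: Iterable[str],
                          v1_ciphers: Iterable[str]) -> Optional[Handshake]:
    """Return V1 if the ClientHello lists only V1 ciphersuites.

    Otherwise return None: the handshake is V2 or V3, and the responder
    cannot tell which until the client acts after the TLS handshake.
    An empty cipher list is treated as "not V1" (an assumption).
    """
    offered = set(ciphers)
    if offered and offered <= set(v1_ciphers):
        return Handshake.V1
    return None


def classify_first_action(action: str) -> Optional[Handshake]:
    """Decide between V2 and V3 from the client's first post-TLS action.

    `action` is "renegotiate" or the name of the first cell received.
    None means the connection should be dropped.
    """
    if action == "renegotiate":
        return Handshake.V2
    if action == "VERSIONS":
        return Handshake.V3
    return None


def classify_client_auth(cells: Sequence[str]) -> str:
    """Classify the cells a V3 client sends, up to and including NETINFO."""
    if list(cells) == ["CERT", "AUTHENTICATE", "NETINFO"]:
        return "authenticated"
    if list(cells) == ["NETINFO"]:
        return "unauthenticated"
    # Anything else before NETINFO is rejected.
    return "rejected"
```

The first function runs when the ClientHello arrives; the second runs once the TLS handshake has completed, since V2 and V3 look identical until then.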
diff --git a/doc/spec/proposals/170-user-path-config.txt b/doc/spec/proposals/170-user-path-config.txt deleted file mode 100644 index fa74c76f73..0000000000 --- a/doc/spec/proposals/170-user-path-config.txt +++ /dev/null @@ -1,95 +0,0 @@ -Title: Configuration options regarding circuit building -Filename: 170-user-path-config.txt -Author: Sebastian Hahn -Created: 01-March-2010 -Status: Draft - -Overview: - - This document outlines how Tor handles the user configuration - options to influence the circuit building process. - -Motivation: - - Tor's treatment of the configuration *Nodes options was surprising - to many users, and quite a few conspiracy theories have crept up. We - should update our specification and code to better describe and - communicate what is going on during circuit building, and how we're - honoring configuration. So far, we've been tracking a bug report - about this behaviour ( - https://bugs.torproject.org/flyspray/index.php?do=details&id=1090 ) - and Nick replied in a thread on or-talk ( - http://archives.seul.org/or/talk/Feb-2010/msg00117.html ). - - This proposal tries to document our intention for those configuration - options. - -Design: - - Five configuration options are available to users to influence Tor's - circuit building. EntryNodes and ExitNodes define a list of nodes - that are for the Entry/Exit position in all circuits. ExcludeNodes - is a list of nodes that are used for no circuit, and - ExcludeExitNodes is a list of nodes that aren't used as the last - hop. StrictNodes defines Tor's behaviour in case of a conflict, for - example when a node that is excluded is the only available - introduction point. Setting StrictNodes to 1 breaks Tor's - functionality in that case, and it will refuse to build such a - circuit. - - Neither Nick's email nor bug 1090 has clear suggestions for how we - should behave in each case, so I tried to come up with something - that made sense to me. 
- -Security implications: - - Deviating from normal circuit building can break one's anonymity, so - the documentation of the above options should contain a warning to - make users aware of the pitfalls. - -Specification: - - It is proposed that the "User configuration" part of path-spec - (section 2.2.2) be replaced with this: - - Users can alter the default behavior for path selection with - configuration options. In case of conflicts (excluding and requiring - the same node) the "StrictNodes" option is used to determine - behaviour. If a node is both excluded and required via a - configuration option, the exclusion takes precedence. - - - If "ExitNodes" is provided, then every request requires an exit - node on the ExitNodes list. If a request is supported by no nodes - on that list, and "StrictNodes" is false, then Tor treats that - request as if ExitNodes were not provided. - - - "EntryNodes" behaves analogously. - - - If "ExcludeNodes" is provided, then no circuit uses any of the - nodes listed. If a circuit requires an excluded node to be used, - and "StrictNodes" is false, then Tor uses the node in that - position while not using any other of the excluded nodes. - - - If "ExcludeExitNodes" is provided, then Tor will not use the nodes - listed for the exit position in a circuit. If a circuit requires - an excluded node to be used in the exit position and "StrictNodes" - is false, then Tor builds that circuit as if ExcludeExitNodes were - not provided. - - - If a user tries to connect to or resolve a hostname of the form - <target>.<servername>.exit and the "AllowDotExit" configuration - option is set to 1, the request is rewritten to a request for - <target>, and the request is only supported by the exit whose - nickname or fingerprint is <servername>. If "AllowDotExit" is set - to 0 (default), any request for <anything>.exit is denied. 
- - When any of the *Nodes settings are changed, all circuits are - expired immediately, to prevent a situation where a previously - built circuit is used even though some of its nodes are now - excluded. - - -Compatibility: - - The old Strict*Nodes options are deprecated, and the StrictNodes - option is new. Tor users may need to update their configuration file. diff --git a/doc/spec/proposals/171-separate-streams.txt b/doc/spec/proposals/171-separate-streams.txt deleted file mode 100644 index 9842265db1..0000000000 --- a/doc/spec/proposals/171-separate-streams.txt +++ /dev/null @@ -1,357 +0,0 @@ -Filename: 171-separate-streams.txt -Title: Separate streams across circuits by connection metadata -Author: Robert Hogan, Jacob Appelbaum, Damon McCoy, Nick Mathewson -Created: 21-Oct-2008 -Modified: 7-Dec-2010 -Status: Open - -Summary: - - We propose a new set of options to isolate unrelated streams from one - another, putting them on separate circuits so that semantically - unrelated traffic is not inadvertently made linkable. - -Motivation: - - Currently, Tor attaches regular streams (that is, ones not carrying - rendezvous or directory traffic) to circuits based only on whether the - circuit's current exit node supports the destination, and whether the - circuit has been dirty (that is, in use) for too long. - - This means that traffic that would otherwise be unrelated sometimes - gets sent over the same circuit, allowing the exit node to link such - streams with certainty, and allowing other parties to link such - streams probabilistically. - - Older versions of onion routing tried to address this problem by - sending every stream over a separate circuit; performance issues made - this infeasible. Moreover, in the presence of a localized adversary, - separating streams by circuits increases the odds that, for any given - linked set of streams, at least one will go over a compromised - circuit. 
- - Therefore we ought to look for ways to allow streams that ought to be - linked to travel over a single circuit, while keeping streams that - ought not be linked isolated to separate circuits. - -Discussion: - - Let's call a series of inherently-linked streams (like a set of - streams downloading objects from the same webpage, or a browsing - session where the user requests several related webpages) a "Session". - - "Sessions" are necessarily a fuzzy concept. While users typically - consider some activities as wholly unrelated to each other ("My IM - session has nothing to do with my web browsing!"), the boundaries - between activities are sometimes hard to determine. If I'm reading - lolcats in one browser tab and reading about treatments for an - embarrassing disease in another, those are probably separate sessions. - If I search for a forum, log in, read it for a while, and post a few - messages on unrelated topics, that's probably all the same session. - - So with the proviso that no automated process can identify sessions - 100% accurately, let's see which options we have available. - - Generally, all the streams in a session come from a single - application. Unfortunately, isolating streams by application - automatically isn't feasible, given the lack of any nice - cross-platform way to tell which local process originated a given - connection. (Yes, lsof works. But a quick review of the lsof code - should be sufficient to scare you away from thinking there is a - portable option, much less a portable O(1) option.) So instead, we'll - have to use some other aspect of a Tor request as a proxy for the - application. - - Generally, traffic from separate applications is not in the same - session. - - With some applications (IRC, for example), each stream is a session. - - Some applications (most notably web browsing) can't be meaningfully - split into sessions without inspecting the traffic itself and - maintaining a lot of state. 
- - How well do ports correspond to sessions? Early versions of this - proposal focused on using destination ports as a proxy for - application, since a connection to port 22 for SSH is probably not in - the same session as one to port 80. This works with some - applications better than others, though: while SSH users typically - know when they're on port 22 and when they aren't, a web browser can - be coaxed (through img URLs or any number of related tricks) into - connecting to any port at all. Moreover, when Tor gets a DNS lookup - request, it doesn't know in advance which port the resulting address - will be used to connect to. - - So in summary, each kind of traffic wants to follow different rules, - and assuming the existence of a web browser and a hostile web page or - exit node, we can't tell one kind of traffic from another by simply - looking at the destination:port of the traffic. - - Fortunately, we're not doomed. - -Design: - - When a stream arrives at Tor, we have the following data to examine: - 1) The destination address - 2) The destination port (unless this is a DNS lookup) - 3) The protocol used by the application to send the stream to Tor: - SOCKS4, SOCKS4A, SOCKS5, or whatever local "transparent proxy" - mechanism the kernel gives us. - 4) The port used by the application to send the stream to Tor -- - that is, the SOCKSListenAddress or TransListenAddress that the - application used, if we have more than one. - 5) The SOCKS username and password, if any. - 6) The source address and port for the application. - - We propose to use 3, 4, and 5 as a backchannel for applications to - tell Tor about different sessions. Rather than running only one - SOCKSPort, a Tor user who would prefer better session isolation should - run multiple SOCKSPorts/TransPorts, and configure different - applications to use separate ports. Applications that support SOCKS - authentication can further be separated on a single port by their - choice of username/password. 
Streams sent to separate ports or using - different authentication information should never be sent over the - same circuit. We allow each port to have its own settings for - isolation based on destination port, destination address, or both. - - Handling DNS can be a challenge. We can get hostnames by one of three - means: - - A) A SOCKS4a request, or a SOCKS5 request with a hostname. This - case is handled trivially using the rules above. - B) A RESOLVE request on a SOCKSPort. This case is handled using the - rules above, except that port isolation can't work to isolate - RESOLVE requests into a proper session, since we don't know which - port will eventually be used when we connect to the returned - address. - C) A request on a DNSPort. We have no way of knowing which - address/port will be used to connect to the requested address. - - When B or C is required but problematic, we could favor the use of - AutomapHostsOnResolve. - -Interface: - - We propose that {SOCKS,Natd,Trans,DNS}ListenAddr be deprecated in - favor of an expanded {SOCKS,Natd,Trans,DNS}Port syntax: - - ClientPortLine = OptionName SP (Addr ":")? Port (SP Options?) - OptionName = "SOCKSPort" / "NatdPort" / "TransPort" / "DNSPort" - Addr = An IPv4 address / an IPv6 address surrounded by brackets. - If optional, we default to 127.0.0.1 - Port = An integer from 1 through 65535 inclusive - Options = Option - Options = Options SP Option - Option = IsolateOption / GroupOption - GroupOption = "SessionGroup=" UINT - IsolateOption = OptNo ("IsolateDestPort" / "IsolateDestAddr" / - "IsolateSOCKSUser"/ "IsolateClientProtocol" / - "IsolateClientAddr") OptPlural - OptNo = "No" ? - OptPlural = "s" ? - SP = " " - UINT = An unsigned integer - - All options are case-insensitive. - - The "IsolateSOCKSUser" and "IsolateClientAddr" options are on by - default; "NoIsolateSOCKSUser" and "NoIsolateClientAddr" respectively - turn them off. 
The IsolateDestPort and - IsolateDestAddr and - IsolateClientProtocol options are off by default. NoIsolateDestPort and - NoIsolateDestAddr and NoIsolateClientProtocol have no effect. - - Given a set of ClientPortLines, streams must NOT be placed on the same - circuit if ANY of the following hold: - - * They were sent to two different client ports, unless the two - client ports both specify a "SessionGroup" option with the same - integer value. - * At least one was sent to a client port with IsolateDestPort - active, and they have different destination ports. - * At least one was sent to a client port with IsolateDestAddr - active, and they have different destination addresses. - * At least one was sent to a client port with IsolateClientProtocol - active, and they use different protocols (where SOCKS4, SOCKS4a, - SOCKS5, TransPort, NatdPort, and DNS are the protocols in question) - * At least one was sent to a client port with IsolateSOCKSUser - active, and they have different SOCKS username/password - values. (For the purposes of this option, the - username/password pair of ""/"" is distinct from SOCKS without - authentication, and both are distinct from any non-SOCKS client's - non-authentication.) - * At least one was sent to a client port with IsolateClientAddr - active, and they came from different client addresses. (For the - purpose of this option, any local interface counts as the same - address. So if the host is configured with addresses 10.0.0.1, - 192.0.32.10, and 127.0.0.1, then traffic from those addresses can - leave on the same circuit, but traffic from 10.0.0.2 (for - example) could not share a circuit with any of them.) - - These rules apply regardless of whether the streams are active at the - same time. In other words, if the rules say that streams A and B must - not be on the same circuit, and stream A is attached to circuit X, - then stream B must never be attached to circuit X, even if stream A is - closed first. 
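The isolation rules above can be sketched as a single pairwise predicate. This is a minimal sketch, not the Tor implementation: the ClientPort and Stream classes, their field names, and the functions auth_key and must_isolate are all hypothetical. It assumes that each client port is identified by a unique name, and that the caller has already mapped every local interface address to one common client_addr value.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ClientPort:
    """One ClientPortLine; flag defaults follow the proposal text."""
    name: str                                # e.g. "SOCKSPort 9050" (assumed unique)
    session_group: Optional[int] = None
    isolate_dest_port: bool = False
    isolate_dest_addr: bool = False
    isolate_client_protocol: bool = False
    isolate_socks_user: bool = True
    isolate_client_addr: bool = True


@dataclass(frozen=True)
class Stream:
    port: ClientPort
    dest_addr: str
    dest_port: Optional[int]                 # None for a DNS lookup
    protocol: str                            # "SOCKS4", "SOCKS4a", "SOCKS5", "TransPort", ...
    socks_auth: Optional[Tuple[str, str]]    # None means no authentication was sent
    client_addr: str                         # local interfaces already normalized


def auth_key(s: Stream) -> tuple:
    """Keep ""/"" credentials, SOCKS without auth, and non-SOCKS clients distinct."""
    if not s.protocol.startswith("SOCKS"):
        return ("non-socks",)
    if s.socks_auth is None:
        return ("socks-noauth",)
    return ("socks-auth",) + s.socks_auth


def must_isolate(a: Stream, b: Stream) -> bool:
    """Return True if ANY rule forbids placing a and b on the same circuit."""
    pa, pb = a.port, b.port

    def either(flag: str) -> bool:
        # A rule is active if at least one of the two client ports enables it.
        return getattr(pa, flag) or getattr(pb, flag)

    same_group = (pa.session_group is not None
                  and pa.session_group == pb.session_group)
    if pa.name != pb.name and not same_group:
        return True
    if either("isolate_dest_port") and a.dest_port != b.dest_port:
        return True
    if either("isolate_dest_addr") and a.dest_addr != b.dest_addr:
        return True
    if either("isolate_client_protocol") and a.protocol != b.protocol:
        return True
    if either("isolate_socks_user") and auth_key(a) != auth_key(b):
        return True
    if either("isolate_client_addr") and a.client_addr != b.client_addr:
        return True
    return False
```

Because the rules apply even after a stream has closed, a circuit built on this predicate would need to remember every stream ever attached to it and check a new stream against all of them, not only against the streams that are currently open.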
- -Alternative Interface: - - We're cramming a lot onto one line in the design above. Perhaps - instead it would be a better idea to have grouped lines of the form: - - StreamGroup 1 - SOCKSPort 9050 - TransPort 9051 - IsolateDestPort 1 - IsolateClientProtocol 0 - EndStreamGroup - - StreamGroup 2 - SOCKSPort 9052 - DNSPort 9053 - IsolateDestAddr 1 - EndStreamGroup - - This would be equivalent to: - SOCKSPort 9050 SessionGroup=1 IsolateDestPort NoIsolateClientProtocol - TransPort 9051 SessionGroup=1 IsolateDestPort NoIsolateClientProtocol - SOCKSPort 9052 SessionGroup=2 IsolateDestAddr - DNSPort 9053 SessionGroup=2 IsolateDestAddr - - But it would let us extend the range of allowed options later without - having client port lines grow without bound. For example, we might - give different circuit building parameters to different session - groups. - -Example of use: - - Suppose that we want to use a web browser, an IRC client, and an SSH - client all at the same time. Let's assume that we want web traffic to - be isolated from all other traffic, even if the browser makes - connections to ports usually used for IRC or SSH. Let's also assume - that IRC and SSH are both used for relatively long-lived connections, - and we want to keep all IRC/SSH sessions separate from one another. - - In this case, we could say: - - SOCKSPort 9050 - SOCKSPort 9051 IsolateDestAddr IsolateDestPort - - We would then configure our browser to use 9050 and our IRC/SSH - clients to use 9051. - -Advanced example of use, #1: - - Suppose that we have a bunch of applications, and we launch them all - using torsocks, and we want to keep each application isolated from - one another. We just create a shell script, "torlaunch": - #!/bin/bash - export TORSOCKS_USERNAME="$1" - exec torsocks "$@" - And we configure our SOCKSPort with IsolateSOCKSUser. 
- - Or if we're on Linux and we want to isolate by application invocation, - we would change the TORSOCKS_USERNAME line to: - - export TORSOCKS_USERNAME="`cat /proc/sys/kernel/random/uuid`" - -Advanced example of use, #2: - - Now suppose that we want to achieve the benefits of the first example - of use, but we are stuck using transparent proxies. Let's suppose - this is Linux. - - TransPort 9090 - TransPort 9091 IsolateDestAddr IsolateDestPort - DNSPort 5353 - AutomapHostsOnResolve 1 - - Here we use the iptables --cmd-owner filter to distinguish which - command is originating the packets, directing traffic from our irc - client and our SSH client to port 9091, and directing other traffic to - 9090. Using AutomapHostsOnResolve will confuse ssh in its default - configuration; we'll need to find a way around that. - -Security Risks: - - Disabling IsolateClientAddr is a pretty bad idea. - - Setting up a set of applications to use this system effectively is a - big problem. It's likely that lots of people who try to do this will - mess it up. We should try to see which setups are sensible, and see - if we can provide good feedback to explain which streams are isolated - how. - -Performance Risks: - - This proposal will result in clients building many more circuits than - they do today. To avoid accidentally hammering the network, we should - have in-process limits on the maximum circuit creation rate and the - total maximum client circuits. - -Specification: - - The Tor client circuit selection process is not entirely specified. - Any client circuit specification must take these changes into account. - -Implementation notes: - - The more obvious ways to implement the "find a good circuit to attach - to" part of this proposal involve doing an O(n_circuits) operation - every time we have a stream to attach. We already do such an - operation, so it's not as if we need to hunt for fancy ways to make it - O(1). 
What will be harder is implementing the "launch circuits as - needed" part of the proposal. Still, it should come down to "a simple - matter of programming." - - The SOCKS4 spec has the client provide authentication info when it - connects; accepting such info is no problem. But the SOCKS5 spec has - the client send a list of known auth methods, then has the server send - back the authentication method it chooses. We'll need to update the - SOCKS5 implementation so it can accept user/password authentication if - it's offered. - - If we use the second syntax for describing these options, we'll want - to add a new "section-based" entry type for the configuration parser. - Not a huge deal; we already have kludged up something similar for - hidden service configurations. - - Opening circuits for predicted ports has the potential to get a little - more complicated; we can probably get away with the existing - algorithm, though, to see where its weak points are and look for - better ones. - - Perhaps we can get our next-gen HTTP proxy to communicate browser tab - or session info to Tor via authentication, or have torbutton do it - directly. More design is needed here, though. - -Alternative designs: - - The implementation of this option may want to consider cases where the - same exit node is shared by two or more circuits and - IsolateStreamsByPort is in force. Since one possible use of the option - is to reduce the opportunity of Exit Nodes to attack traffic from the - same source on multiple ports, the implementation may need to ensure - that circuits reserved for the exclusive use of given ports do not - share the same exit node. On the other hand, if our goal is only that - streams should be unlinkable, deliberately shunting them to different - exit nodes is unnecessary and slightly counterproductive. 
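The SOCKS5 change mentioned in the implementation notes above (accept user/password authentication if the client offers it) comes down to the method-selection step of the SOCKS5 greeting. The sketch below is hypothetical and is not Tor's code: the function name choose_socks5_method is invented, and preferring user/password over "no authentication" when both are offered is an assumption. The method numbers are the standard SOCKS5 ones.

```python
NO_AUTH = 0x00        # "no authentication required"
USER_PASS = 0x02      # username/password authentication
NO_ACCEPTABLE = 0xFF  # "no acceptable methods"


def choose_socks5_method(greeting: bytes) -> bytes:
    """Parse a SOCKS5 client greeting and build the method-selection reply.

    The greeting is VER (0x05), NMETHODS, then NMETHODS method bytes.
    The reply is VER followed by the single method the server chose.
    """
    if len(greeting) < 2 or greeting[0] != 0x05:
        raise ValueError("not a SOCKS5 greeting")
    nmethods = greeting[1]
    methods = greeting[2:2 + nmethods]
    if len(methods) != nmethods:
        raise ValueError("truncated method list")

    if USER_PASS in methods:
        # Assumption: take credentials whenever offered, so that they can
        # be used to tell sessions apart.
        chosen = USER_PASS
    elif NO_AUTH in methods:
        chosen = NO_AUTH
    else:
        chosen = NO_ACCEPTABLE
    return bytes([0x05, chosen])
```

Choosing user/password whenever it is offered is what would let an option like IsolateSOCKSUser see the credentials at all; a server that always answers "no authentication" never receives them.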
- - Earlier versions of this design included a mechanism to isolate - _particular_ destination ports and addresses, so that traffic sent to, - say, port 22 would never share a circuit with any traffic *not* sent to - port 22. You can achieve this here by having all applications that - send traffic to one of these ports use a separate SOCKSPort, and - then setting IsolateDestPorts on that SOCKSPort. - -Future work: - - Nikita Borisov suggests that different session profiles -- so long as - there aren't too many of them -- could well get different guard node - allocations in order to prevent guard profiling. This can be done - orthogonally to the rest of this proposal. - -Lingering questions: - - I suspect there are issues remaining with DNS and TransPort users, and - that my "just use AutomapHostsOnResolve" suggestion may be - insufficient. diff --git a/doc/spec/proposals/172-circ-getinfo-option.txt b/doc/spec/proposals/172-circ-getinfo-option.txt deleted file mode 100644 index b7fd79c9a8..0000000000 --- a/doc/spec/proposals/172-circ-getinfo-option.txt +++ /dev/null @@ -1,138 +0,0 @@ -Filename: 172-circ-getinfo-option.txt -Title: GETINFO controller option for circuit information -Author: Damian Johnson -Created: 03-June-2010 -Status: Accepted - -Overview: - - This details an additional GETINFO option that would provide information - concerning a relay's current circuits. - -Motivation: - - The original proposal was for connection related information, but Jake made - the excellent point that any information retrieved from the control port - is... - - 1. completely ineffectual for auditing purposes since either (a) these - results can be fetched from netstat already or (b) the information would - only be provided via tor and can't be validated. - - 2. The more useful uses for connection information can be achieved with - much less (and safer) information. - - Hence the proposal is now for circuit based rather than connection based - information. 
This would strip the most controversial and sensitive data - entirely (ip addresses, ports, and connection based bandwidth breakdowns) - while still being useful for the following purposes: - - - Basic Relay Usage Questions - How is the bandwidth I'm contributing broken down? Is it being evenly - distributed or is someone hogging most of it? Do these circuits belong to - the hidden service I'm running or something else? Now that I'm using exit - policy X am I desirable as an exit, or are most people just using me as a - relay? - - - Debugging - Say a relay has a restrictive firewall policy for outbound connections, - with the ORPort whitelisted but doesn't realize that tor needs random high - ports. Tor would report success ("your orport is reachable - excellent") - yet the relay would be nonfunctional. This proposed information would - reveal numerous RELAY -> YOU -> UNESTABLISHED circuits, giving a good - indicator of what's wrong. - - - Visualization - A nice benefit of visualizing tor's behavior is that it becomes a helpful - tool in puzzling out how tor works. For instance, tor spawns numerous - client connections at startup (even if unused as a client). As a newcomer - to tor these asymmetric (outbound only) connections mystified me for quite - a while until Roger explained their use to me. The proposed - TYPE_FLAGS would let controllers clearly label them as being client - related, making their purpose a bit clearer. - - At the moment connection data can only be retrieved via commands like - netstat, ss, and lsof. However, providing an alternative via the control - port provides several advantages: - - - scrubbing for private data - Raw connection data has no notion of what's sensitive and what is - not. The relay's flags and cached consensus can be used to take - educated guesses concerning which connections could possibly belong - to client or exit traffic, but this is both difficult and inaccurate. 
- Anything provided via the control port can be scrubbed to make sure we - aren't providing anything we think relay operators should not see. - - - additional information - All connection querying commands strictly provide the ip address and - port of connections, and nothing else. However, for the uses listed - above the far more interesting attributes are the circuit's type, - bandwidth usage and uptime. - - - improved performance - Querying connection data is an expensive activity, especially for - busy relays or low end processors (such as mobile devices). Tor - already internally knows its circuits, allowing for vastly quicker - lookups. - - - cross platform capability - The connection querying utilities mentioned above not only aren't - available under Windows, but differ widely among different *nix - platforms. FreeBSD in particular takes a very unique approach, - dropping important options from netstat and assigning ss to a - spreadsheet application instead. A controller interface, however, - would provide a uniform means of retrieving this information. - -Security Implications: - - This is an open question. This proposal lacks the most controversial pieces - of information (ip addresses and ports) and insight into potential threats - this would pose would be very welcome! - -Specification: - - The following addition would be made to the control-spec's GETINFO section: - - "rcirc/id/<Circuit identity>" -- Provides entry for the associated relay - circuit, formatted as: - CIRC_ID=<circuit ID> CREATED=<timestamp> UPDATED=<timestamp> TYPE=<flag> - READ=<bytes> WRITE=<bytes> - - None of the parameters contain whitespace, and additional results must be - ignored to allow for future expansion. Parameters are defined as follows: - CIRC_ID - Unique numeric identifier for the circuit this belongs to. - CREATED - Unix timestamp (as seconds since the Epoch) for when the - circuit was created. - UPDATED - Unix timestamp for when this information was last updated. 
- TYPE - Single character flags indicating attributes in the circuit: - (E)ntry : has a connection that doesn't belong to a known Tor server, - indicating that this is either the first hop or bridged - E(X)it : has been used for at least one exit stream - (R)elay : has been extended - Rende(Z)vous : is being used for a rendezvous point - (I)ntroduction : is being used for a hidden service introduction - (N)one of the above: none of the above have happened yet. - READ - Total bytes transmitted toward the exit over the circuit. - WRITE - Total bytes transmitted toward the client over the circuit. - - "rcirc/all" -- The 'rcirc/id/*' output for all current circuits, joined by - newlines. - - The following would be included for circ info update events. - -4.1.X. Relay circuit status changed - - The syntax is: - "650" SP "RCIRC" SP CircID SP Notice [SP Created SP Updated SP Type SP - Read SP Write] CRLF - - Notice = - "NEW" / ; first information being provided for this circuit - "UPDATE" / ; update for a previously reported circuit - "CLOSED" ; notice that the circuit no longer exists - - Notice indicating that queryable information on a relay related circuit has - changed. If the Notice parameter is either "NEW" or "UPDATE" then this - provides the same fields that would be given by calling "GETINFO rcirc/id/" - with the CircID. - diff --git a/doc/spec/proposals/173-getinfo-option-expansion.txt b/doc/spec/proposals/173-getinfo-option-expansion.txt deleted file mode 100644 index 03e18ef8d4..0000000000 --- a/doc/spec/proposals/173-getinfo-option-expansion.txt +++ /dev/null @@ -1,101 +0,0 @@ -Filename: 173-getinfo-option-expansion.txt -Title: GETINFO Option Expansion -Author: Damian Johnson -Created: 02-June-2010 -Status: Accepted - -Overview: - - Over the course of developing arm there have been numerous hacks and - workarounds to glean pieces of basic, desirable information about the tor - process. 
As per Roger's request I've compiled a list of these pain points - to try and improve the control protocol interface. - -Motivation: - - The purpose of this proposal is to expose additional process and relay - related information that is currently unavailable in a convenient, - dependable, and/or platform independent way. Examples of this are... - - - The relay's total contributed bandwidth. This is a highly requested - piece of information and, based on the following patch from pipe, looks - trivial to include. - http://www.mail-archive.com/or-talk@freehaven.net/msg13085.html - - - The process ID of the tor process. There is a high degree of guesswork - in obtaining this. Arm for instance uses pidof, netstat, and ps yet - still fails on some platforms, and Orbot recently got a ticket about - its own attempt to fetch it with ps: - https://trac.torproject.org/projects/tor/ticket/1388 - - This just includes the pieces of missing information I've noticed - (suggestions or questions of their usefulness are welcome!). - -Security Implications: - - None that I'm aware of. From a security standpoint this seems decently - innocuous. - -Specification: - - The following addition would be made to the control-spec's GETINFO section: - - "relay/bw-limit" -- Effective relayed bandwidth limit. - - "relay/burst-limit" -- Effective relayed burst limit. - - "relay/read-total" -- Total bytes relayed (download). - - "relay/write-total" -- Total bytes relayed (upload). - - "relay/flags" -- Space separated listing of flags currently held by the - relay as reported by the currently cached consensus. - - "process/user" -- Username under which the tor process is running, - providing an empty string if none exists. - - "process/pid" -- Process id belonging to the main tor process, -1 if none - exists for the platform. - - "process/uptime" -- Total uptime of the tor process (in seconds). - - "process/uptime-reset" -- Time since last reset (startup, sighup, or RELOAD - signal, in seconds). 
- - "process/descriptors-used" -- Count of file descriptors used. - - "process/descriptor-limit" -- File descriptor limit (getrlimit results). - - "ns/authority" -- Router status info (v2 directory style) for all - recognized directory authorities, joined by newlines. - - "state/names" -- A space-separated list of all the keys supported by this - version of Tor's state. - - "state/val/<key>" -- Provides the current state value belonging to the - given key. If undefined, this provides the key's default value. - - "status/ports-seen" -- A summary of which ports we've seen - circuits connect to recently, formatted the same as the EXITS_SEEN status - event described in Section 4.1.XX. This GETINFO option is currently - available only for exit relays. - -4.1.XX. Per-port exit stats - - The syntax is: - "650" SP "EXITS_SEEN" SP TimeStarted SP PortSummary CRLF - - We just generated a new summary of which ports we've seen exiting circuits - connecting to recently. The controller could display this for the user, e.g. - in their "relay" configuration window, to give them a sense of how they're - being used (popularity of the various ports they exit to). Currently only - exit relays will receive this event. - - TimeStarted is a quoted string indicating when the reported summary - counts from (in GMT). - - The PortSummary keyword has as its argument a comma-separated, possibly - empty set of "port=count" pairs. 
For example (without linebreak), - 650-EXITS_SEEN TimeStarted="2008-12-25 23:50:43" - PortSummary=80=16,443=8 - diff --git a/doc/spec/proposals/174-optimistic-data-server.txt b/doc/spec/proposals/174-optimistic-data-server.txt deleted file mode 100644 index d97c45e909..0000000000 --- a/doc/spec/proposals/174-optimistic-data-server.txt +++ /dev/null @@ -1,242 +0,0 @@ -Filename: 174-optimistic-data-server.txt -Title: Optimistic Data for Tor: Server Side -Author: Ian Goldberg -Created: 2-Aug-2010 -Status: Open - -Overview: - -When a SOCKS client opens a TCP connection through Tor (for an HTTP -request, for example), the query latency is about 1.5x higher than it -needs to be. Simply, the problem is that the sequence of data flows -is this: - -1. The SOCKS client opens a TCP connection to the OP -2. The SOCKS client sends a SOCKS CONNECT command -3. The OP sends a BEGIN cell to the Exit -4. The Exit opens a TCP connection to the Server -5. The Exit returns a CONNECTED cell to the OP -6. The OP returns a SOCKS CONNECTED notification to the SOCKS client -7. The SOCKS client sends some data (the GET request, for example) -8. The OP sends a DATA cell to the Exit -9. The Exit sends the GET to the server -10. The Server returns the HTTP result to the Exit -11. The Exit sends the DATA cells to the OP -12. The OP returns the HTTP result to the SOCKS client - -Note that the Exit node knows that the connection to the Server was -successful at the end of step 4, but is unable to send the HTTP query to -the server until step 9. - -This proposal (as well as its upcoming sibling concerning the client -side) aims to reduce the latency by allowing: -1. SOCKS clients to optimistically send data before they are notified - that the SOCKS connection has completed successfully -2. OPs to optimistically send DATA cells on streams in the CONNECT_WAIT - state -3. 
Exit nodes to accept and queue DATA cells while in the - EXIT_CONN_STATE_CONNECTING state - -This particular proposal deals with #3. - -In this way, the flow would be as follows: - -1. The SOCKS client opens a TCP connection to the OP -2. The SOCKS client sends a SOCKS CONNECT command, followed immediately - by data (such as the GET request) -3. The OP sends a BEGIN cell to the Exit, followed immediately by DATA - cells -4. The Exit opens a TCP connection to the Server -5. The Exit returns a CONNECTED cell to the OP, and sends the queued GET - request to the Server -6. The OP returns a SOCKS CONNECTED notification to the SOCKS client, - and the Server returns the HTTP result to the Exit -7. The Exit sends the DATA cells to the OP -8. The OP returns the HTTP result to the SOCKS client - -Motivation: - -This change will save one OP<->Exit round trip (down to one from two). -There are still two SOCKS Client<->OP round trips (negligible time) and -two Exit<->Server round trips. Depending on the ratio of the -Exit<->Server (Internet) RTT to the OP<->Exit (Tor) RTT, this will -decrease the latency by 25 to 50 percent. Experiments validate these -predictions. [Goldberg, PETS 2010 rump session; see -https://thunk.cs.uwaterloo.ca/optimistic-data-pets2010-rump.pdf ] - -Design: - -The current code actually correctly handles queued data at the Exit; if -there is queued data in a EXIT_CONN_STATE_CONNECTING stream, that data -will be immediately sent when the connection succeeds. If the -connection fails, the data will be correctly ignored and freed. The -problem with the current server code is that the server currently -drops DATA cells on streams in the EXIT_CONN_STATE_CONNECTING state. -Also, if you try to queue data in the EXIT_CONN_STATE_RESOLVING state, -bad things happen because streams in that state don't yet have -conn->write_event set, and so some existing sanity checks (any stream -with queued data is at least potentially writable) are no longer sound. 
- -The solution is to simply not drop received DATA cells while in the -EXIT_CONN_STATE_CONNECTING state. Also do not send SENDME cells in this -state, so that the OP cannot send more than one window's worth of data -to be queued at the Exit. Finally, patch the sanity checks so that -streams in the EXIT_CONN_STATE_RESOLVING state that have buffered data -can pass. - -If no clients ever send such optimistic data, the new code will never be -executed, and the behaviour of Tor will not change. When clients begin -to send optimistic data, the performance of those clients' streams will -improve. - -After discussion with nickm, it seems best to just have the server -version number be the indicator of whether a particular Exit supports -optimistic data. (If a client sends optimistic data to an Exit which -does not support it, the data will be dropped, and the client's request -will fail to complete.) What do version numbers for hypothetical future -protocol-compatible implementations look like, though? - -Security implications: - -Servers (for sure the Exit, and possibly others, by watching the -pattern of packets) will be able to tell that a particular client -is using optimistic data. This will be discussed more in the sibling -proposal. - -On the Exit side, servers will be queueing a little bit extra data, but -no more than one window. Clients today can cause Exits to queue that -much data anyway, simply by establishing a Tor connection to a slow -machine, and sending one window of data. - -Specification: - -tor-spec section 6.2 currently says: - - The OP waits for a RELAY_CONNECTED cell before sending any data. - Once a connection has been established, the OP and exit node - package stream data in RELAY_DATA cells, and upon receiving such - cells, echo their contents to the corresponding TCP stream. - RELAY_DATA cells sent to unrecognized streams are dropped. 
- -It is not clear exactly what an "unrecognized" stream is, but this last -sentence would be changed to say that RELAY_DATA cells received on a -stream that has processed a RELAY_BEGIN cell and has not yet issued a -RELAY_END or a RELAY_CONNECTED cell are queued; that queue is processed -immediately after a RELAY_CONNECTED cell is issued for the stream, or -freed after a RELAY_END cell is issued for the stream. - -The earlier part of this section will be addressed in the sibling -proposal. - -Compatibility: - -There are compatibility issues, as mentioned above. OPs MUST NOT send -optimistic data to Exit nodes whose version numbers predate (something). -OPs MAY send optimistic data to Exit nodes whose version numbers match -or follow that value. (But see the question about independent server -reimplementations, above.) - -Implementation: - -Here is a simple patch. It seems to work with both regular streams and -hidden services, but there may be other corner cases I'm not aware of. -(Do streams used for directory fetches, hidden services, etc. take a -different code path?) - -diff --git a/src/or/connection.c b/src/or/connection.c -index 7b1493b..f80cd6e 100644 ---- a/src/or/connection.c -+++ b/src/or/connection.c -@@ -2845,7 +2845,13 @@ _connection_write_to_buf_impl(const char *string, size_t len, - return; - } - -- connection_start_writing(conn); -+ /* If we receive optimistic data in the EXIT_CONN_STATE_RESOLVING -+ * state, we don't want to try to write it right away, since -+ * conn->write_event won't be set yet. Otherwise, write data from -+ * this conn as the socket is available. 
*/ -+ if (conn->state != EXIT_CONN_STATE_RESOLVING) { -+ connection_start_writing(conn); -+ } - if (zlib) { - conn->outbuf_flushlen += buf_datalen(conn->outbuf) - old_datalen; - } else { -@@ -3382,7 +3388,11 @@ assert_connection_ok(connection_t *conn, time_t now) - tor_assert(conn->s < 0); - - if (conn->outbuf_flushlen > 0) { -- tor_assert(connection_is_writing(conn) || conn->write_blocked_on_bw || -+ /* With optimistic data, we may have queued data in -+ * EXIT_CONN_STATE_RESOLVING while the conn is not yet marked to writing. -+ * */ -+ tor_assert(conn->state == EXIT_CONN_STATE_RESOLVING || -+ connection_is_writing(conn) || conn->write_blocked_on_bw || - (CONN_IS_EDGE(conn) && TO_EDGE_CONN(conn)->edge_blocked_on_circ)); - } - -diff --git a/src/or/relay.c b/src/or/relay.c -index fab2d88..e45ff70 100644 ---- a/src/or/relay.c -+++ b/src/or/relay.c -@@ -1019,6 +1019,9 @@ connection_edge_process_relay_cell(cell_t *cell, circuit_t *circ, - relay_header_t rh; - unsigned domain = layer_hint?LD_APP:LD_EXIT; - int reason; -+ int optimistic_data = 0; /* Set to 1 if we receive data on a stream -+ that's in the EXIT_CONN_STATE_RESOLVING -+ or EXIT_CONN_STATE_CONNECTING states.*/ - - tor_assert(cell); - tor_assert(circ); -@@ -1038,9 +1041,20 @@ connection_edge_process_relay_cell(cell_t *cell, circuit_t *circ, - /* either conn is NULL, in which case we've got a control cell, or else - * conn points to the recognized stream. */ - -- if (conn && !connection_state_is_open(TO_CONN(conn))) -- return connection_edge_process_relay_cell_not_open( -- &rh, cell, circ, conn, layer_hint); -+ if (conn && !connection_state_is_open(TO_CONN(conn))) { -+ if ((conn->_base.state == EXIT_CONN_STATE_CONNECTING || -+ conn->_base.state == EXIT_CONN_STATE_RESOLVING) && -+ rh.command == RELAY_COMMAND_DATA) { -+ /* We're going to allow DATA cells to be delivered to an exit -+ * node in state EXIT_CONN_STATE_CONNECTING or -+ * EXIT_CONN_STATE_RESOLVING. This speeds up HTTP, for example. 
*/ -+ log_warn(domain, "Optimistic data received."); -+ optimistic_data = 1; -+ } else { -+ return connection_edge_process_relay_cell_not_open( -+ &rh, cell, circ, conn, layer_hint); -+ } -+ } - - switch (rh.command) { - case RELAY_COMMAND_DROP: -@@ -1090,7 +1104,9 @@ connection_edge_process_relay_cell(cell_t *cell, circuit_t *circ, - log_debug(domain,"circ deliver_window now %d.", layer_hint ? - layer_hint->deliver_window : circ->deliver_window); - -- circuit_consider_sending_sendme(circ, layer_hint); -+ if (!optimistic_data) { -+ circuit_consider_sending_sendme(circ, layer_hint); -+ } - - if (!conn) { - log_info(domain,"data cell dropped, unknown stream (streamid %d).", -@@ -1107,7 +1123,9 @@ connection_edge_process_relay_cell(cell_t *cell, circuit_t *circ, - stats_n_data_bytes_received += rh.length; - connection_write_to_buf(cell->payload + RELAY_HEADER_SIZE, - rh.length, TO_CONN(conn)); -- connection_edge_consider_sending_sendme(conn); -+ if (!optimistic_data) { -+ connection_edge_consider_sending_sendme(conn); -+ } - return 0; - case RELAY_COMMAND_END: - reason = rh.length > 0 ? - -Performance and scalability notes: - -There may be more RAM used at Exit nodes, as mentioned above, but it is -transient. diff --git a/doc/spec/proposals/175-automatic-node-promotion.txt b/doc/spec/proposals/175-automatic-node-promotion.txt deleted file mode 100644 index c990b3f060..0000000000 --- a/doc/spec/proposals/175-automatic-node-promotion.txt +++ /dev/null @@ -1,238 +0,0 @@ -Filename: 175-automatic-node-promotion.txt -Title: Automatically promoting Tor clients to nodes -Author: Steven Murdoch -Created: 12-Mar-2010 -Status: Draft - -1. Overview - - This proposal describes how Tor clients could determine when they - have sufficient bandwidth capacity and are sufficiently reliable to - become either bridges or Tor relays. When they meet this - criteria, they will automatically promote themselves, based on user - preferences. 
The proposal also defines the new controller messages - and options which will control this process. - - Note that for the moment, only transitions between client and - bridge are being considered. Transitions to public relay will - be considered at a future date, but will use the same - infrastructure for measuring capacity and reliability. - -2. Motivation and history - - Tor has a growing user-base and one of the major impediments to the - quality of service offered is the lack of network capacity. This is - particularly the case for bridges, because these are gradually - being blocked, and thus no longer of use to people within some - countries. By automatically promoting Tor clients to bridges, and - perhaps also to full public relays, this proposal aims to solve - these problems. - - Only Tor clients which are sufficiently useful should be promoted, - and the process of determining usefulness should be performed - without reporting the existence of the client to the central - authorities. The criteria used for determining usefulness will be - in terms of bandwidth capacity and uptime, but parameters should be - specified in the directory consensus. State stored at the client - should be in no more detail than necessary, to prevent sensitive - information being recorded. - -3. 
Design - -3.x Opt-in state model - - Tor can be in one of five node-promotion states: - - - off (O): Currently a client, and will stay as such - - auto (A): Currently a client, but will consider promotion - - bridge (B): Currently a bridge, and will stay as such - - auto-bridge (AB): Currently a bridge, but will consider promotion - - relay (R): Currently a public relay, and will stay as such - - The state can be fully controlled from the configuration file or - controller, but the normal state transitions are as follows: - - Any state -> off: User has opted out of node promotion - Off -> any state: Only permitted with user consent - - Auto -> auto-bridge: Tor has detected that it is sufficiently - reliable to be a *bridge* - Auto -> bridge: Tor has detected that it is sufficiently reliable - to be a *relay*, but the user has chosen to remain a *bridge* - Auto -> relay: Tor has detected that it is sufficiently reliable - to be *relay*, and will skip being a *bridge* - Auto-bridge -> relay: Tor has detected that it is sufficiently - reliable to be a *relay* - - Note that this model does not support automatic demotion. If this - is desirable, there should be some memory as to whether the - previous state was relay, bridge, or auto-bridge. Otherwise the - user may be prompted to become a relay, although he has opted to - only be a bridge. - -3.x User interaction policy - - There are a variety of options in how to involve the user into the - decision as to whether and when to perform node promotion. The - choice also may be different when Tor is running from Vidalia (and - thus can readily prompt the user for information), and standalone - (where Tor can only log messages, which may or may not be read). - - The option requiring minimal user interaction is to automatically - promote nodes according to reliability, and allow the user to opt - out, by changing settings in the configuration file or Vidalia user - interface. 
- - Alternatively, if a user interface is available, Tor could prompt - the user when it detects that a transition is available, and allow - the user to choose which of the available options to select. If - Vidalia is not available, it still may be possible to solicit an - email address on install, and contact the operator to ask whether - a transition to bridge or relay is permitted. - - Finally, Tor could by default not make any transition, and the user - would need to opt in by stating the maximum level (bridge or - relay) to which the node may automatically promote itself. - -3.x Performance monitoring model - - To prevent a large number of clients activating as relays, but - being too unreliable to be useful, clients should measure their - performance. If this performance meets a parameterized acceptance - criteria, a client should consider promotion. To measure - reliability, this proposal adopts a simple user model: - - - A user decides to use Tor at times which follow a Poisson - distribution - - At each time, the user will be happy if the bridge chosen has - adequate bandwidth and is reachable - - If the chosen bridge is down or slow too many times, the user - will consider Tor to be bad - - If we additionally assume that the recent history of relay - performance matches the current performance, we can measure - reliability by simulating this simple user. 
- - The following parameters are distributed to clients in the - directory consensus: - - - min_bandwidth: Minimum self-measured bandwidth for a node to be - considered useful, in bytes per second - - check_period: How long, in seconds, to wait between checking - reachability and bandwidth (on average) - - num_samples: Number of recent samples to keep - - num_useful: Minimum number of recent samples where the node was - reachable and had at least min_bandwidth capacity, for a client - to consider promoting to a bridge - - A different set of parameters may be used for considering when to - promote a bridge to a full relay, but this will be the subject of a - future revision of the proposal. - -3.x Performance monitoring algorithm - - The simulation described above can be implemented as follows: - - Every 60 seconds: - 1. Tor generates a random floating point number x in - the interval [0, 1). - 2. If x > (1 / (check_period / 60)) GOTO end; otherwise: - 3. Tor sets the value last_check to the current_time (in seconds) - 4. Tor measures reachability - 5. If the client is reachable, Tor measures its bandwidth - 6. If the client is reachable and the bandwidth is >= - min_bandwidth, the test has succeeded, otherwise it has failed. - 7. Tor adds the test result to the end of a ring-buffer containing - the last num_samples results: measurement_results - 8. Tor saves last_check and measurements_results to disk - 9. If the length of measurements_results == num_samples and - the number of successes >= num_useful, Tor should consider - promotion to a bridge - end. - - When Tor starts, it must fill in the samples for which it was not - running. This can only happen once the consensus has downloaded, - because the value of check_period is needed. - - 1. Tor generates a random number y from the Poisson distribution [1] - with lambda = (current_time - last_check) * (1 / check_period) - 2. Tor sets the value last_check to the current_time (in seconds) - 3. 
Add y test failures to the ring buffer measurements_results - 4. Tor saves last_check and measurements_results to disk - - In this way, a Tor client will measure its bandwidth and - reachability every check_period seconds, on average. Provided - check_period is sufficiently greater than a minute (say, at least an - hour), the times of check will follow a Poisson distribution. [2] - - While this does require that Tor does record the state of a client - over time, this does not leak much information. Only a binary - reachable/non-reachable is stored, and the timing of samples becomes - increasingly fuzzy as the data becomes less recent. - - On IP address changes, Tor should clear the ring-buffer, because - from the perspective of users with the old IP address, this node - might as well be a new one with no history. This policy may change - once we start allowing the bridge authority to hand out new IP - addresses given the fingerprint. - -3.x Bandwidth measurement - - Tor needs to measure its bandwidth to test the usefulness as a - bridge. A non-intrusive way to do this would be to passively measure - the peak data transfer rate since the last reachability test. Once - this exceeds min_bandwidth, Tor can set a flag that this node - currently has sufficient bandwidth to pass the bandwidth component - of the upcoming performance measurement. - - For the first version we may simply skip the bandwidth test, - because the existing reachability test sends 500 kB over several - circuits, and checks whether the node can transfer at least 50 - kB/s. This is probably good enough for a bridge, so this test - might be sufficient to record a success in the ring buffer. - -3.x New options - -3.x New controller message - -4. Migration plan - - We should start by setting a high bandwidth and uptime requirement - in the consensus, so as to avoid overloading the bridge authority - with too many bridges. 
Once we are confident our systems can scale, - the criteria can be gradually shifted down to gain more bridges. - -5. Related proposals - -6. Open questions: - - - What user interaction policy should we take? - - - When (if ever) should we turn a relay into an exit relay? - - - What should the rate limits be for auto-promoted bridges/relays? - Should we prompt the user for this? - - - Perhaps the bridge authority should tell potential bridges - whether to enable themselves, by taking into account whether - their IP address is blocked - - - How do we explain the possible risks of running a bridge/relay? - * Use of bandwidth/congestion - * Publication of IP address - * Blocking from IRC (even for non-exit relays) - - - What feedback should we give to bridge relays, to encourage them, - e.g. number of recent users (what about reserve bridges)? - - - Can clients back-off from doing these tests (yes, we should do - this) - -[1] For algorithms to generate random numbers from the Poisson - distribution, see: http://en.wikipedia.org/wiki/Poisson_distribution#Generating_Poisson-distributed_random_variables -[2] "The sample size n should be equal to or larger than 20 and the - probability of a single success, p, should be smaller than or equal to - .05. If n >= 100, the approximation is excellent if np is also <= 10." - http://www.itl.nist.gov/div898/handbook/pmc/section3/pmc331.htm (e-Handbook of Statistical Methods) - -% vim: spell ai et: diff --git a/doc/spec/proposals/176-revising-handshake.txt b/doc/spec/proposals/176-revising-handshake.txt deleted file mode 100644 index db7ea4a663..0000000000 --- a/doc/spec/proposals/176-revising-handshake.txt +++ /dev/null @@ -1,623 +0,0 @@ -Filename: 176-revising-handshake.txt -Title: Proposed version-3 link handshake for Tor -Author: Nick Mathewson -Created: 31-Jan-2011 -Status: Draft -Target: 0.2.3 -Supersedes: 169 - -1.
Overview - - I propose a (mostly) backward-compatible change to the Tor - connection establishment protocol to avoid the use of TLS - renegotiation, to avoid certain protocol fingerprinting attacks, - and to make it easier to write Tor clients and servers. - - Rather than doing a TLS renegotiation to exchange certificates - and authenticate the original handshake, this proposal takes an - approach similar to Steven Murdoch's proposal 124 and my old - proposal 169, and uses Tor cells to finish authenticating the - parties' identities once the initial TLS handshake is finished. - - I discuss some alternative design choices and why I didn't make - them in section 7; please have a quick look there before - telling me that something is pointless or makes no sense. - - Terminological note: I use "client" below to mean the Tor - instance (a client or a bridge or a relay) that initiates a TLS - connection, and "server" to mean the Tor instance (a bridge or a - relay) that accepts it. - -2. History and Motivation - - The _goals_ of the Tor link handshake have remained basically uniform - since our earliest versions. They are: - - * Provide data confidentiality, data integrity - * Provide forward secrecy - * Allow responder authentication or bidirectional authentication. - * Try to look like some popular too-important-to-block-at-whim - encryption protocol, to avoid fingerprinting and censorship. - * Try to be implementable -- on the client side at least! -- - by as many TLS implementations as possible. - - When we added the v2 handshake, we added another goal: - - * Remain compatible with older versions of the handshake - protocol. - - In the original Tor TLS connection handshake protocol ("V1", or - "two-cert"), parties that wanted to authenticate provided a - two-cert chain of X.509 certificates during the handshake setup - phase. Every party that wanted to authenticate sent these - certificates.
The security properties of this protocol are just - fine; the problem was that our behavior of sending - two-certificate chains made Tor easy to identify. - - In the current Tor TLS connection handshake protocol ("V2", or - "renegotiating"), the parties begin with a single certificate - sent from the server (responder) to the client (initiator), and - then renegotiate to a two-certs-from-each-authenticating party. - We made this change to make Tor's handshake look like a browser - speaking SSL to a webserver. (See proposal 130, and - tor-spec.txt.) So from an observer's point of view, two parties - performing the V2 handshake begin by making a regular TLS - handshake with a single certificate, then renegotiate - immediately. - - To tell whether to use the V1 or V2 handshake, the servers look - at the list of ciphers sent by the client. (This is ugly, but - there's not much else in the ClientHello that they can look at.) - If the list contains any cipher not used by the V1 protocol, the - server sends back a single cert and expects a renegotiation. If - the client gets back a single cert, then it withholds its own - certificates until the TLS renegotiation phase. - - In other words, V2-supporting initiator behavior currently looks - like this: - - - Begin TLS negotiation with V2 cipher list; wait for - certificate(s). - - If we get a certificate chain: - - Then we are using the V1 handshake. Send our own - certificate chain as part of this initial TLS handshake - if we want to authenticate; otherwise, send no - certificates. When the handshake completes, check - certificates. We are now mutually authenticated. - - Otherwise, if we get just a single certificate: - - Then we are using the V2 handshake. Do not send any - certificates during this handshake. - - When the handshake is done, immediately start a TLS - renegotiation. 
During the renegotiation, expect - a certificate chain from the server; send a certificate - chain of our own if we want to authenticate ourselves. - - After the renegotiation, check the certificates. Then - send (and expect) a VERSIONS cell from the other side to - establish the link protocol version. - - And V2-supporting responder behavior now looks like this: - - - When we get a TLS ClientHello request, look at the cipher - list. - - If the cipher list contains only the V1 ciphersuites: - - Then we're doing a V1 handshake. Send a certificate - chain. Expect a possible client certificate chain in - response. - Otherwise, if we get other ciphersuites: - - We're using the V2 handshake. Send back a single - certificate and let the handshake complete. - - Do not accept any data until the client has renegotiated. - - When the client is renegotiating, send a certificate - chain, and expect (possibly multiple) certificates in - reply. - - Check the certificates when the renegotiation is done. - Then exchange VERSIONS cells. - - Late in 2009, researchers found a flaw in most applications' use - of TLS renegotiation: Although TLS renegotiation does not - reauthenticate any information exchanged before the renegotiation - takes place, many applications were treating it as though it did, - and assuming that data sent _before_ the renegotiation was - authenticated with the credentials negotiated _during_ the - renegotiation. This problem was exacerbated by the fact that - most TLS libraries don't actually give you an obvious good way to - tell where the renegotiation occurred relative to the datastream. - Tor wasn't directly affected by this vulnerability, but the - aftermath hurts us in a few ways: - - 1) OpenSSL has disabled renegotiation by default, and created - a "yes we know what we're doing" option we need to set to - turn it back on. (Two options, actually: one for openssl - 0.9.8l and one for 0.9.8m and later.) 
- - 2) Some vendors have removed all renegotiation support from - their versions of OpenSSL entirely, forcing us to tell - users to either replace their versions of OpenSSL or to - link Tor against a hand-built one. - - 3) Because of 1 and 2, I'd expect TLS renegotiation to become - rarer and rarer in the wild, making our own use stand out - more. - - Furthermore, there are other issues related to TLS and - fingerprinting that we want to fix in any revised handshake: - - 1) We should make it easier to use self-signed certs, or maybe - even existing HTTPS certificates, for the server side - handshake, since most non-Tor SSL handshakes use either - self-signed certificates or - - 2) We should make it harder to probe for a Tor server. Right - now, you can just do a handshake with a server, - renegotiate, then see if it gives you a VERSIONS cell. - That's no good. - - 3) We should allow other changes in our use of TLS and in our - certificates so as to resist fingerprinting based on how - our certificates look. - -3. Design - -3.1. The view in the large - - Taking a cue from Steven Murdoch's proposal 124 and my old - proposal 169, I propose that we move the work currently done by - the TLS renegotiation step (that is, authenticating the parties - to one another) and do it with Tor cells instead of with TLS - alone. - - This section outlines the protocol; we go into more detail below. - - To tell the client that it can use the new cell-based - authentication system, the server sends a "V3 certificate" during - the initial TLS handshake. (More on what makes a certificate - "v3" below.) If the client recognizes the format of the - certificate and decides to pursue the V3 handshake, then instead - of renegotiating immediately on completion of the initial TLS - handshake, the client instead sends a VERSIONS cell (and the - negotiation begins). - - So the flowchart on the server side is: - - Wait for a ClientHello. 
- IF the client sends a ClientHello that indicates V1: - - Send a certificate chain. - - When the TLS handshake is done, if the client sent us a - certificate chain, then check it. - If the client sends a ClientHello that indicates V2 or V3: - - Send a self-signed certificate or a CA-signed certificate - - When the TLS handshake is done, wait for renegotiation or data. - - If renegotiation occurs, the client is V2: send a - certificate chain and maybe receive one. Check the - certificate chain as in V1. - - If the client sends data without renegotiating, it is - starting the V3 handshake. Proceed with the V3 - handshake as below. - - And the client-side flowchart is: - - - Send a ClientHello with a set of ciphers that indicates V2/V3. - - After the handshake is done: - - If the server sent us a certificate chain, check it: we - are using the V1 handshake. - - If the server sent us a single "V2 certificate", we are - using the v2 handshake: the client begins to renegotiate - and proceeds as before. - - Finally, if the server sent us a "v3 certificate", we are - doing the V3 handshake below. - - And the cell-based part of the V3 handshake, in summary, is: - - C<->S: TLS handshake where S sends a "v3 certificate" - - In TLS: - - C->S: VERSIONS cell - S->C: VERSIONS cell, CERT cell, AUTH_CHALLENGE cell, NETINFO cell - - C->S: Optionally: CERT cell, AUTHENTICATE cell - - A "CERTS" cell contains a set of certificates; an "AUTHENTICATE" - cell authenticates the client to the server. More on these - later. - -3.2. Distinguishing V2 and V3 certificates - - In the protocol outline above, we require that the client can - distinguish between v2 certificates (that is, those sent by - current servers) and a v3 certificates. We further require that - existing clients will accept v3 certificates as they currently - accept v2 certificates. - - Fortunately, current certificates have a few characteristics that - make them fairly mannered as it is. 
We say that a certificate - indicates a V2-only server if ALL of the following hold: - * The certificate is not self-signed. - * There is no DN field set in the certificate's issuer or - subject other than "commonName". - * The commonNames of the issuer and subject both end with - ".net" - * The public modulus is at most 1024 bits long. - - Otherwise, the client should assume that the server supports the - V3 handshake. - - To the best of my knowledge, current clients will behave properly - on receiving non-v2 certs during the initial TLS handshake so - long as they eventually get the correct V2 cert chain during the - renegotiation. - - The v3 requirements are easy to meet: any certificate designed to - resist fingerprinting will likely be self-signed, or if it's - signed by a CA, then the issuer will surely have more DN fields - set. Certificates that aren't trying to resist fingerprinting - can trivially become v3 by using a CN that doesn't end with .net, - or using a key longer than 1024 bits. - - -3.3. Authenticating via Tor cells: server authentication - - Once the TLS handshake is finished, if the client renegotiates, - then the server should go on as it does currently. - - If the client implements this proposal, however, and the server - has shown it can understand the V3+ handshake protocol, the - client immediately sends a VERSIONS cell to the server - and waits to receive a VERSIONS cell in return. We negotiate - the Tor link protocol version _before_ we proceed with the - negotiation, in case we need to change the authentication - protocol in the future. - - Once either party has seen the VERSIONS cell from the other, it - knows which version they will pick (that is, the highest version - shared by both parties' VERSIONS cells). All Tor instances using - the handshake protocol described in 3.2 MUST support at least - link protocol version 3 as described here.
If a version lower - than 3 is negotiated with the V3 handshake in place, a Tor - instance MUST close the connection. - - On learning the link protocol, the server then sends the client a - CERT cell and a NETINFO cell. If the client wants to - authenticate to the server, it sends a CERT cell, an AUTHENTICATE - cell, and a NETINFO cell, or it may simply send a NETINFO cell if - it does not want to authenticate. - - The CERT cell describes the keys that a Tor instance is claiming - to have. It is a variable-length cell. Its payload format is: - - N: Number of certs in cell [1 octet] - N times: - CertType [1 octet] - CLEN [2 octets] - Certificate [CLEN octets] - - Any extra octets at the end of a CERT cell MUST be ignored. - - CertType values are: - 1: Link key certificate from RSA1024 identity - 2: RSA1024 Identity certificate - 3: RSA1024 AUTHENTICATE cell link certificate - - The certificate format is X509. - - To authenticate the server, the client MUST check the following: - * The CERTS cell contains exactly one CertType 1 "Link" certificate. - * The CERTS cell contains exactly one CertType 2 "ID" - certificate. - * Both certificates have validAfter and validUntil dates that - are not expired. - * The certified key in the Link certificate matches the - link key that was used to negotiate the TLS connection. - * The certified key in the ID certificate is a 1024-bit RSA key. - * The certified key in the ID certificate was used to sign both - certificates. - * The link certificate is correctly signed with the key in the - ID certificate - * The ID certificate is correctly self-signed. - - If all of these conditions hold, then the client knows that it is - connected to the server whose identity key is certified in the ID - certificate. If any condition does not hold, the client closes - the connection. If the client wanted to connect to a server with - a different identity key, the client closes the connection. 
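The CERT cell payload layout and the first two server-authentication checks above can be sketched as follows. This is a minimal illustration, not the Tor implementation: the function names are hypothetical, and big-endian (network) byte order for CLEN is an assumption that the text above does not state. Validity dates, key matching, and signature checks are not covered here.

```python
import struct


def parse_cert_cell(payload):
    """Split a CERT cell payload into (CertType, certificate bytes) pairs.

    Layout, as described above: N [1 octet], then N times
    CertType [1 octet], CLEN [2 octets], Certificate [CLEN octets].
    Assumption: CLEN is big-endian.
    """
    if len(payload) < 1:
        raise ValueError("empty CERT cell payload")
    n_certs = payload[0]
    pos = 1
    certs = []
    for _ in range(n_certs):
        if pos + 3 > len(payload):
            raise ValueError("truncated certificate header")
        cert_type, clen = struct.unpack_from(">BH", payload, pos)
        pos += 3
        if pos + clen > len(payload):
            raise ValueError("truncated certificate body")
        certs.append((cert_type, payload[pos:pos + clen]))
        pos += clen
    # Any octets left after the last certificate are deliberately ignored.
    return certs


def has_one_link_and_one_id(certs):
    """Check for exactly one CertType 1 (Link) and one CertType 2 (ID) cert."""
    types = [cert_type for cert_type, _ in certs]
    return types.count(1) == 1 and types.count(2) == 1
```

A caller would still have to run the remaining checks in the list (dates, key match against the TLS link key, signatures) with an X509 library before trusting the peer's identity.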
 - - - An AUTH_CHALLENGE cell is a variable-length cell with the following - fields: - Challenge [32 octets] - It is sent from the server to the client. Clients MUST ignore - unexpected bytes at the end of the cell. Servers MUST generate - every challenge using a strong RNG or PRNG. - -3.4. Authenticating via Tor cells: Client authentication - - A client does not need to authenticate to the server. If it - does not wish to, it responds to the server's valid CERT cell by - sending a NETINFO cell: once it has gotten a valid NETINFO cell - back, the client should consider the connection open, and the - server should consider the connection as opened by an - unauthenticated client. - - If a client wants to authenticate, it responds to the - AUTH_CHALLENGE cell with a CERT cell and an AUTHENTICATE cell. - The CERT cell is as a server would send, except that instead of - sending a CertType 1 cert for an arbitrary link certificate, the - client sends a CertType 3 cert for an RSA AUTHENTICATE key. - (This difference is because we allow any link key type on a TLS - link, but the protocol described here will only work for 1024-bit - RSA keys. A later protocol version should extend the protocol - here to work with non-1024-bit, non-RSA keys.) - - AuthType [2 octets] - AuthLen [2 octets] - Authentication [AuthLen octets] - - - Servers MUST ignore extra bytes at the end of an AUTHENTICATE - cell. If AuthType is 1 (meaning "RSA-SHA256-TLSSecret"), then the - Authentication contains the following: - - Type: The characters "AUTH0001" [8 octets] - CID: A SHA256 hash of the client's RSA1024 identity key [32 octets] - SID: A SHA256 hash of the server's RSA1024 identity key [32 octets] - SLOG: A SHA256 hash of all bytes sent from the server to the client - as part of the negotiation up to and including the - AUTH_CHALLENGE cell; that is, the VERSIONS cell, - the CERT cell, and the AUTH_CHALLENGE cell.
[32 octets] - CLOG: A SHA256 hash of all bytes sent from the client to the - server as part of the negotiation so far; that is, the - VERSIONS cell and the CERT cell. [32 octets] - SCERT: A SHA256 hash of the server's TLS link - certificate. [32 octets] - TLSSECRETS: Either 32 zero octets, or a SHA256 HMAC, using - the TLS master secret as the secret key, of the following: - - client_random, as sent in the TLS Client Hello - - server_random, as sent in the TLS Server Hello - - the NUL terminated ASCII string: - "Tor V3 handshake TLS cross-certification" - [32 octets] - TIME: The time of day in seconds since the POSIX epoch. [8 octets] - NONCE: A 16 byte value, randomly chosen by the client [16 octets] - SIG: A signature of a SHA256 hash of all the previous fields - using the client's "Authenticate" key as presented. (As - always in Tor, we use OAEP-MGF1 padding; see tor-spec.txt - section 0.3.) - [variable length] - - To check the AUTHENTICATE cell, a server checks that all fields - containing a hash contain the correct value, then verifies the - signature. The server MUST ignore any extra bytes after - the SHA256 hash. - - When possible (that is, when implemented using C TLS API), - implementations SHOULD include and verify the TLSSECRETS field. - -3.5. Responding to extra cells, and other security checks. - - If the handshake is a V3+ TLS handshake, both parties MUST reject - any negotiated link version less than 3. Both parties MUST check - this and close the connection if it is violated. - - If the handshake is not a V3+ TLS handshake, both parties MUST - still advertise all link protocols they support in their versions - cell. Both parties MUST close the link if it turns out they both - would have supported version 3 or higher, but they somehow wound - up using a v2 or v1 handshake. (More on this in section 6.4.) - - A server SHOULD NOT send any sequence of cells when starting a v3 - negotiation other than "VERSIONS, CERT, AUTH_CHALLENGE, - NETINFO". 
A client SHOULD drop a CERT, AUTH_CHALLENGE, or - NETINFO cell that appears at any other time or out of sequence. - - A client SHOULD NOT begin a v3 negotiation with any sequence - other than "VERSIONS, NETINFO" or "VERSIONS, CERT, AUTHENTICATE, - NETINFO". A server SHOULD drop a CERT, AUTH_CHALLENGE, or - NETINFO cell that appears at any other time or out of sequence. - -4. Numbers to assign - - We need a version number for this link protocol. I've been - calling it "3". - - We need to reserve command numbers for CERT, AUTH_CHALLENGE, and - AUTHENTICATE. I suggest that in link protocol 3 and higher, we - reserve a separate range of commands for variable-length cells. - -5. Efficiency - - This protocol adds a round-trip step when the client sends a - VERSIONS cell to the server, and waits for the {VERSIONS, CERT, - NETINFO} response in turn. (The server then waits for the - client's {NETINFO} or {CERT, AUTHENTICATE, NETINFO} reply, - but it would have already been waiting for the client's NETINFO, - so that's not an additional wait.) - - This is actually fewer round-trip steps than required before for - TLS renegotiation, so that's a win over v2. - -6. Security argument - - These aren't crypto proofs, since I don't write those. They are - meant to be reasonably convincing. - -6.1. The server is authenticated - - TLS guarantees that if the TLS handshake completes successfully, - the client knows that it is speaking to somebody who knows the - private key corresponding to the public link key that was used in - the TLS handshake. - - Because this public link key is signed by the server's identity - key in the CERT cell, the client knows that somebody who holds - the server's private identity key says that the server's public - link key corresponds to the server's public identity key. - - Therefore, if the crypto works, and if TLS works, and if the keys - aren't compromised, then the client is talking to somebody who - holds the server's private identity key. - -6.2.
The client is authenticated - - Once the server has checked the client's certificates, the server - knows that somebody who knows the client's private identity key - says that he is the one holding the private key corresponding to - the client's presented link-authentication public key. - - Once the server has checked the signature in the AUTHENTICATE - cell, the server knows that somebody holding the client's - link-authentication private key signed the data in question. By - the standard certification argument above, the server knows that - somebody holding the client's private identity key signed the - data in question. - - So the server's remaining question is: am I really talking to - somebody holding the client's identity key, or am I getting a - replayed or MITM'd AUTHENTICATE cell that was previously sent by - the client? - - If the client included a non-zero TLSSECRET component, and the - server is able to verify it, then the answer is easy: the server - knows for certain that it is talking to the party with whom it - did the TLS handshake, since if somebody else generated a correct - TLSSECRET, they would have to know the master secret of the TLS - connection, which would require them to have broken TLS. - - If the client was not able to include a non-zero TLSSECRET - component, or the server can't check it, the answer is a little - trickier. The server knows that it is not getting a replayed - AUTHENTICATE cell, since the cell authenticates (among other - stuff) the server's AUTH_CHALLENGE cell, which it has never used - before. The server knows that it is not getting a MITM'd - AUTHENTICATE cell, since the cell includes a hash of the server's - link certificate, which nobody else should have been able to use - in a successful TLS negotiation. - -6.3. 
MITM attacks won't work any better than they do against TLS - - TLS guarantees that a man-in-the-middle attacker can't read the - content of a successfully negotiated encrypted connection, nor - alter the content in any way other than truncating it, unless he - compromises the session keys or one of the key-exchange secret - keys used to establish that connection. Let's make sure we do at - least that well. - - Suppose that a client Alice connects to an MITM attacker Mallory, - thinking that she is connecting to some server Bob. Let's assume - that the TLS handshake between Alice and Mallory finishes - successfully and the v3 protocol is chosen. [If the v1 or v2 - protocol is chosen, those already resist MITM. If the TLS - handshake doesn't complete, then Alice isn't connected to anybody.] - - During the v3 handshake, Mallory can't convince Alice that she is - talking to Bob, since Mallory should not be able to produce a CERT - cell containing a certificate chain signed by Bob's identity key - and used to authenticate the link key that Mallory used during - TLS. (If Mallory used her own link key for the TLS handshake, it - won't match anything Bob signed unless Bob is compromised. - Mallory can't use any key that Bob _did_ produce a certificate - for, since she doesn't know the private key.) - - Even if Alice fails to check the certificates from Bob, Mallory - still can't convince Bob that she is really Alice. Assuming that - Alice's keys aren't compromised, Mallory can't send a CERT cell - with a cert chain from Alice's identity key to a key that Mallory - controls, so if Mallory wants to impersonate Alice's identity - key, she can only do so by sending an AUTHENTICATE cell really - generated by Alice. Because Bob will check that the random bytes - in the AUTH_CHALLENGE cell will influence the SLOG hash, Mallory - needs to send Bob's challenge to Alice, and can't use any other - AUTHENTICATE cell that Alice generated before.
But because the - AUTHENTICATE cell Alice will generate will include in the SCERT - field a hash of the link certificate used by Mallory, Bob will - reject it as not being valid to connect to him. - -6.4. Protocol downgrade attacks won't work. - - Assuming that Alice checks the certificates from Bob, she knows - that Bob really sent her the VERSIONS cell that she received. - - Because the AUTHENTICATE cell from Alice includes signed hashes - of the VERSIONS cells from Alice and Bob, Bob knows that Alice - got the VERSIONS cell he sent and sent the VERSIONS cell that he - received. - - But what about attempts to downgrade the protocol earlier in the - handshake? Here TLS comes to the rescue: because the TLS - Finished handshake message includes an authenticated digest of - everything previously said during the handshake, an attacker - can't replace the client's ciphersuite list (to trigger a - downgrade to the v1 protocol) or the server's certificate [chain] - (to trigger a downgrade to the v1 or v2 protocol). - -7. Design considerations - - I previously considered adding our own certificate format in - order to avoid the pain associated with X509, but decided instead - to simply use X509 since a correct Tor implementation will - already need to have X509 code to handle the other handshake - versions and to use TLS. - - The trickiest part of the design here is deciding what to stick - in the AUTHENTICATE cell. Some of it is strictly necessary, and - some of it is left there for security margin in case my other - security arguments fail. Because of the CID and SID elements, - you can't use an AUTHENTICATE cell for anything other than - authenticating a client ID to a server with an appropriate - server ID. The SLOG and CLOG elements are there mostly to - authenticate the VERSIONS cells and resist downgrade attacks - once there are two versions of this.
The presence of the - AUTH_CHALLENGE field in the stuff authenticated in SLOG - prevents replays and ensures that the AUTHENTICATE cell was - really generated by somebody who is reading what the server is - sending over the TLS connection. The SCERT element is meant to - prevent MITM attacks. When the TLSSECRET field is - used, it should prevent the use of the AUTHENTICATE cell for - anything other than the TLS connection the client had in mind. - - A signature of the TLSSECRET element on its own should be - sufficient to prevent the attacks we care about, but because we - don't necessarily have access to the TLS master secret when using - a non-C TLS library, we can't depend on it. I added it anyway - so that, if there is some problem with the rest of the protocol, - clients and servers that _are_ written in C (that is, the official - Tor implementation) can still be secure. - - If the client checks the server's certificates and matches them - to the TLS connection link key before proceeding with the - handshake, then signing the contents of the AUTH_CHALLENGE cell - would be sufficient to authenticate the client. But implementers - of allegedly compatible Tor clients have in the past skipped - certificate verification steps, and I didn't want a client's - failure to verify certificates to mean that a server couldn't - trust that it was really talking to the client. To prevent this, - I added the TLS link certificate to the authenticated data: even - if the Tor client code doesn't check any certificates, the TLS - library code will still check that the certificate used in the - handshake contains a link key that matches the one used in the - handshake. - -8. Open questions: - - - May we cache which certificates we've already verified? It - might leak in timing whether we've connected with a given server - before, and how recently. - - - With which TLS libraries is it feasible to yoink client_random, - server_random, and the master secret?
If the answer is "All - free C TLS libraries", great. If the answer is "OpenSSL only", - not so great. - - - Should we do anything to check the timestamp in the AUTHENTICATE - cell? - - - Can we give some way for clients to signal "I want to use the - V3 protocol if possible, but I can't renegotiate, so don't give - me the V2"? Clients currently have a fair idea of server - versions, so they could potentially do the V3+ handshake with - servers that support it, and fall back to V1 otherwise. - - - What should servers that don't have TLS renegotiation do? For - now, I think they should just stick with V1. Eventually we can - deprecate the V2 handshake as we did with the V1 handshake. - When that happens, servers can be V3-only. diff --git a/doc/spec/proposals/177-flag-abstention.txt b/doc/spec/proposals/177-flag-abstention.txt deleted file mode 100644 index 0b4a9babbb..0000000000 --- a/doc/spec/proposals/177-flag-abstention.txt +++ /dev/null @@ -1,104 +0,0 @@ -Filename: 177-flag-abstention.txt -Title: Abstaining from votes on individual flags -Author: Nick Mathewson -Created: 14 Feb 2011 -Status: Draft - -Overview: - - We should have a way for authorities to vote on flags in - particular instances, without having to vote on that flag for all - servers. - -Motivation: - - Suppose that the status of some router becomes controversial, and - an authority wants to vote for or against the BadExit status of - that router. Suppose also that the authority is not currently - voting on the BadExit flag. If the authority wants to say that - the router is or is not "BadExit", it cannot currently do so - without voting yea or nay on the BadExit status of all other - routers. - - Suppose that an authority wants to vote "Valid" or "Invalid" on a - large number of routers, but does not have an opinion on some of - them. Currently, it cannot do so: if it votes for the Valid flag - anywhere, it votes for it everywhere. 
 - -Design: - - We add a new line "extra-flags" in directory votes, to appear - after "known-flags". It lists zero or more flags that an - authority has occasional opinions on, but for which the authority - will usually abstain. No flag may appear in both extra-flags and - known-flags. - - In the router-status section for each directory vote, we allow an - optional "s2" line to appear after the "s" line. It contains - zero or more flag votes. A flag vote consists of one of - "+", "-", or "/" followed by the name of a flag. "+" denotes a - yea vote, and "-" denotes a nay vote, and "/" denotes an - abstention. Authorities may omit most abstentions, except as - noted below. No flag may appear in an s2 line unless it appears - in the known-flags or extra-flags line. We retain the rule that no - flag may appear in an s line unless it appears in the known-flags - line. - - When using an appropriate consensus method to vote, we use these - new rules to determine flags: - - A flag is listed in the consensus if it is in the known-flags - section of at least one voter, and in the known-flags or - extra-flags section of at least three voters (or half the - authorities, whichever set is smaller). - - A single authority's vote for a given flag on a given router is - interpreted as follows: - - - If the authority votes +Flag or -Flag or /Flag in the s2 line for - that router, the vote is "yea" or "nay" or "abstain" respectively. - - Otherwise, if the flag is listed on the "s" line for the - router, then the vote is "yea". - - Otherwise, if the flag is listed in the known-flags line, - then the vote is "nay". - - Otherwise, the vote is "abstain". - - A router is assigned a flag in the consensus iff the total "yeas" - outnumber the total "nays". - - As an exception, this proposal does not affect the behavior of - the "Named" and "Unnamed" flags; these are still treated as - before.
(An authority can already abstain from a single naming - decision by not voting Named on any router with a given name.) - -Examples: - - Suppose that it becomes important to know which Tor servers are - operated by burrowing marsupials. Some authority operators - diligently research this question; others want to vote about - individual routers on an ad hoc basis when they learn about a - particular router's being e.g. located underground in New South - Wales. - - If an authority usually has no opinions on the RunByWombats flag, - it should list it in the "extra-flags" of its votes. If it - occasionally wants to vote that a router is (or is not) run by - wombats, it should list "s2 +RunByWombats" or "s2 -RunByWombats" - for the routers in question. Otherwise it can omit the flag from - its s and s2 lines entirely. - - If an authority usually has an opinion on the RunByWombats flag, - but wants to abstain in some cases, it should list "RunByWombats" - in the "known-flags" part of its votes, and include - "RunByWombats" in the s line for every router that it believes is - run by wombats. When it wants to vote that a router is not run - by wombats, it should list the RunByWombats flag in neither the s - nor the s2 line. When it wants to abstain, it should list "s2 - /RunByWombats". - - In both cases, when the new consensus method is used, a router - will get listed as "RunByWombats" if there are more authorities - that say it is run by wombats than there are authorities saying - it is not run by wombats. (As now, "no" votes win ties.) - - diff --git a/doc/spec/proposals/ideas/xxx-auto-update.txt b/doc/spec/proposals/ideas/xxx-auto-update.txt deleted file mode 100644 index dc9a857c1e..0000000000 --- a/doc/spec/proposals/ideas/xxx-auto-update.txt +++ /dev/null @@ -1,39 +0,0 @@ - -Notes on an auto updater: - -steve wants a "latest" symlink so he can always just fetch that. - -roger worries that this will exacerbate the "what version are you -using?" "latest." problem. 
- -weasel suggests putting the latest recommended version in dns. then -we don't have to hit the website. it's got caching, it's lightweight, -it scales. just put it in a TXT record or something. - -but, no dnssec. - -roger suggests a file on the https website that lists the latest -recommended version (or filename or url or something like that). - -(steve seems to already be doing this with xerobank. he additionally -suggests a little blurb that can be displayed to the user to describe -what's new.) - -how to verify you're getting the right file? -a) it's https. -b) ship with a signing key, and use some openssl functions to verify. -c) both - -andrew reminds us that we have a "recommended versions" line in the -consensus directory already. - -if only we had some way to point out the "latest stable recommendation" -from this list. we could list it first, or something. - -the recommended versions line also doesn't take into account which -packages are available -- e.g. on Windows one version might be the best -available, and on OS X it might be a different one. - -aren't there existing solutions to this? surely there is a beautiful, -efficient, crypto-correct auto updater lib out there. even for windows. - diff --git a/doc/spec/proposals/ideas/xxx-bridge-disbursement.txt b/doc/spec/proposals/ideas/xxx-bridge-disbursement.txt deleted file mode 100644 index 6c9a3c71ed..0000000000 --- a/doc/spec/proposals/ideas/xxx-bridge-disbursement.txt +++ /dev/null @@ -1,174 +0,0 @@ - -How to hand out bridges. - -Divide bridges into 'strategies' as they come in. Do this uniformly -at random for now. - -For each strategy, we'll hand out bridges in a different way to -clients. This document describes two strategies: email-based and -IP-based. - -0. Notation: - - HMAC(k,v) : an HMAC of v using the key k. - - A|B: The string A concatenated with the string B. - - -1. Email-based. - - Goal: bootstrap based on one or more popular email service's sybil - prevention algorithms. 
 - - - Parameters: - HMAC -- an HMAC function - P -- a time period - K -- the number of bridges to send in a period. - - Setup: Generate two nonces, N and M. - - As bridges arrive, put them into a ring according to HMAC(N,ID) - where ID is the bridge's identity digest. - - Divide time into divisions of length P. - - When we get an email: - - If it's not from a supported email service, reject it. - - If we already sent a response to that email address (normalized) - in this period, send _exactly_ the same response. - - If it is from a supported service, generate X = HMAC(M,PS|E) where E - is the lowercased normalized email address for the user, and - where PS is the start of the current period. Send - the first K bridges in the ring after point X. - - [If we want to make sure that repeat queries are given exactly the - same results, then we can't let the ring change during the - time period. For a long time period like a month, that's quite a - hassle. How about instead just keeping a replay cache of addresses - that have been answered, and sending them a "sorry, you already got - your addresses for the time period; perhaps you should try these - other fine distribution strategies while you wait?" response? This - approach would also resolve the "Make sure you can't construct a - distinct address to match an existing one" note below. -RD] - - [I think, if we get a replay, we need to send back the same - answer as we did the first time, not say "try again." - Otherwise we need to worry that an attacker can keep people - from getting bridges by preemptively asking for them, - or that an attacker may force them to prove they haven't - gotten any bridges by asking. -NM] - - [While we're at it, if we do the replay cache thing and don't need - repeatable answers, we could just pick K random answers from the - pool. Is it beneficial that a bridge user who knows about a clump of - nodes will be sharing them with other users who know about a similar - (overlapping) clump?
One good aspect is against an adversary who - learns about a clump this way and watches those bridges to learn - other users and discover *their* bridges: he doesn't learn about - as many new bridges as he might if they were randomly distributed. - A drawback is against an adversary who happens to pick two email - addresses in P that include overlapping answers: he can measure - the difference in clumps and estimate how quickly the bridge pool - is growing. -RD] - - [Random is one more darn thing to implement; rings are already - there. -NM] - - [If we make the period P be mailbox-specific, and make it a random - value around some mean, then we make it harder for an attacker to - know when to try using his small army of gmail addresses to gather - another harvest. But we also make it harder for users to know when - they can try again. -RD] - - [Letting the users know about when they can try again seems - worthwhile. Otherwise users and attackers will all probe and - probe and probe until they get an answer. No additional - security will be achieved, but bandwidth will be lost. -NM] - - To normalize an email address: - Start with the RFC822 address. Consider only the mailbox {???} - portion of the address (username@domain). Put this into lowercase - ascii. - - Questions: - What to do with weird character encodings? Look up the RFC. - - Notes: - Make sure that you can't force a single email address to appear - in lots of different ways. IOW, if nickm@freehaven.net and - NICKM@freehaven.net aren't treated the same, then I can get lots - more bridges than I should. - - Make sure you can't construct a distinct address to match an - existing one. IOW, if we treat nickm@X and nickm@Y as the same - user, then anybody can register nickm@Z and use it to tell which - bridges nickm@X got (or would get). - - Make sure that we actually check headers so we can't be trivially - used to spam people. - - -2. IP-based. 
- - Goal: avoid handing out all the bridges to users in a similar IP - space and time. - - Parameters: - - T_Flush -- how long it should take a user on a single network to - see a whole cluster of bridges. - - N_C - - K -- the number of bridges we hand out in response to a single - request. - - Setup: using an AS map or a geoip map or some other flawed input - source, divide IP space into "areas" such that surveying a large - collection of "areas" is hard. For v0, use /24 address blocks. - - Group areas into N_C clusters. - - Generate secrets L, M, N. - - Set the period P such that P*(bridges-per-cluster/K) = T_flush. - Don't set P to greater than a week, or less than three hours. - - When we get a bridge: - - Based on HMAC(L,ID), assign the bridge to a cluster. Within each - cluster, keep the bridges in a ring based on HMAC(M,ID). - - [Should we re-sort the rings for each new time period, so the ring - for a given cluster is based on HMAC(M,PS|ID)? -RD] - - When we get a connection: - - If it's http, redirect it to https. - - Let area be the incoming IP network. Let PS be the current - period. Compute X = HMAC(N, PS|area). Return the next K bridges - in the ring after X. - - [Don't we want to compute C = HMAC(key, area) to learn what cluster - to answer from, and then X = HMAC(key, PS|area) to pick a point in - that ring? -RD] - - - Need to clarify that some HMACs are for rings, and some are for - partitions. How rings scale is clear. How do we grow the number of - partitions? Looking at successive bits from the HMAC output is one way. - -3. Open issues - - Denial of service attacks - A good view of network topology - -at some point we should learn some reliability stats on our bridges. when -we say above 'give out k bridges', we might give out 2 reliable ones and -k-2 others. we count around the ring the same way we do now, to find them. 
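The ring lookups used by both strategies in this document (placing each bridge at HMAC(key, ID), then returning the first K bridges after a query point X) can be sketched as follows. This is an illustration under stated assumptions rather than the deployed bridge distributor: the class and function names are hypothetical, HMAC-SHA256 stands in for the unspecified HMAC function, PS is encoded as a decimal ASCII string, and email normalization is reduced to trimming and lowercasing.

```python
import bisect
import hashlib
import hmac


def hmac_point(key, value):
    """Map a value to a point on the ring. Assumption: HMAC-SHA256."""
    return hmac.new(key, value, hashlib.sha256).digest()


class BridgeRing:
    """Bridges ordered by HMAC(ring_key, ID), with wraparound lookups."""

    def __init__(self, ring_key):
        self.ring_key = ring_key
        self.entries = []  # sorted list of (ring position, bridge ID)

    def add(self, bridge_id):
        position = hmac_point(self.ring_key, bridge_id)
        bisect.insort(self.entries, (position, bridge_id))

    def first_k_after(self, x, k):
        """Return up to k bridge IDs following point x, wrapping at the end."""
        if not self.entries:
            return []
        positions = [pos for pos, _ in self.entries]
        start = bisect.bisect_right(positions, x)
        count = min(k, len(self.entries))
        size = len(self.entries)
        return [self.entries[(start + i) % size][1] for i in range(count)]


def bridges_for_email(ring, query_key, period_start, email, k):
    """Email strategy: X = HMAC(M, PS|E), then the first K bridges after X."""
    # Simplified normalization; full RFC 822 handling is not attempted here.
    normalized = email.strip().lower().encode("ascii")
    x = hmac_point(query_key, str(period_start).encode("ascii") + normalized)
    return ring.first_k_after(x, k)
```

As long as the ring does not change within a period, repeated queries from the same normalized address return the same bridges, which is the property the -NM note in the email section asks for. The IP-based strategy would use the same lookup with the area in place of the email address.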
- diff --git a/doc/spec/proposals/ideas/xxx-bwrate-algs.txt b/doc/spec/proposals/ideas/xxx-bwrate-algs.txt deleted file mode 100644 index 757f5bc55e..0000000000 --- a/doc/spec/proposals/ideas/xxx-bwrate-algs.txt +++ /dev/null @@ -1,106 +0,0 @@ -# The following two algorithms - - -# Algorithm 1 -# TODO: Burst and Relay/Regular differentiation - -BwRate = Bandwidth Rate in Bytes Per Second -GlobalWriteBucket = 0 -GlobalReadBucket = 0 -Epoch = Token Fill Rate in seconds: suggest 50ms=.050 -SecondCounter = 0 -MinWriteBytes = Minimum amount bytes per write - -Every Epoch Seconds: - UseMinWriteBytes = MinWriteBytes - WriteCnt = 0 - ReadCnt = 0 - BytesRead = 0 - - For Each Open OR Conn with pending write data: - WriteCnt++ - For Each Open OR Conn: - ReadCnt++ - - BytesToRead = (BwRate*Epoch + GlobalReadBucket)/ReadCnt - BytesToWrite = (BwRate*Epoch + GlobalWriteBucket)/WriteCnt - - if BwRate/WriteCnt < MinWriteBytes: - # If we aren't likely to accumulate enough bytes in a second to - # send a whole cell for our connections, send partials - Log(NOTICE, "Too many ORCons to write full blocks. Sending short packets.") - UseMinWriteBytes = 1 - # Other option: We could switch to plan 2 here - - # Service each writable ORConn. If there are any partial writes, - # return remaining bytes from this epoch to the global pool - For Each Open OR Conn with pending write data: - ORConn->write_bucket += BytesToWrite - if ORConn->write_bucket > UseMinWriteBytes: - w = write(ORConn, MIN(len(ORConn->write_data), ORConn->write_bucket)) - # possible that w < ORConn->write_data here due to TCP pushback. 
- # We should restore the rest of the write_bucket to the global - # buffer - GlobalWriteBucket += (ORConn->write_bucket - w) - ORConn->write_bucket = 0 - - For Each Open OR Conn: - r = read_nonblock(ORConn, BytesToRead) - BytesRead += r - - SecondCounter += Epoch - if SecondCounter < 1: - # Save unused bytes from this epoch to be used later in the second - GlobalReadBucket += (BwRate*Epoch - BytesRead) - else: - SecondCounter = 0 - GlobalReadBucket = 0 - GlobalWriteBucket = 0 - For Each ORConn: - ORConn->write_bucket = 0 - - - -# Alternate plan for Writing fairly. Reads would still be covered -# by plan 1 as there is no additional network overhead for short reads, -# so we don't need to try to avoid them. -# -# I think this is actually pretty similar to what we do now, but -# with the addition that the bytes accumulate up to the second mark -# and we try to keep track of our position in the write list here -# (unless libevent is doing that for us already and I just don't see it) -# -# TODO: Burst and Relay/Regular differentiation - -# XXX: The inability to send single cells will cause us to block -# on EXTEND cells for low-bandwidth node pairs.. -BwRate = Bandwidth Rate in Bytes Per Second -WriteBytes = Bytes per write -Epoch = MAX(MIN(WriteBytes/BwRate, .333s), .050s) - -SecondCounter = 0 -GlobalWriteBucket = 0 - -# New connections are inserted at Head-1 (the 'tail' of this circular list) -# This is not 100% fifo for all node data, but it is the best we can do -# without insane amounts of additional queueing complexity. 
-WriteConnList = List of Open OR Conns with pending write data > WriteBytes -WriteConnHead = 0 - -Every Epoch Seconds: - GlobalWriteBucket += BwRate*Epoch - WriteListEnd = WriteConnHead - - do - ORConn = WriteConnList[WriteConnHead] - w = write(ORConn, WriteBytes) - GlobalWriteBucket -= w - WriteConnHead += 1 - while GlobalWriteBucket > 0 and WriteConnHead != WriteListEnd - - SecondCounter += Epoch - if SecondCounter >= 1: - SecondCounter = 0 - GlobalWriteBucket = 0 - - diff --git a/doc/spec/proposals/ideas/xxx-choosing-crypto-in-tor-protocol.txt b/doc/spec/proposals/ideas/xxx-choosing-crypto-in-tor-protocol.txt deleted file mode 100644 index e8489570f7..0000000000 --- a/doc/spec/proposals/ideas/xxx-choosing-crypto-in-tor-protocol.txt +++ /dev/null @@ -1,138 +0,0 @@ -Filename: xxx-choosing-crypto-in-tor-protocol.txt -Title: Picking cryptographic standards in the Tor wire protocol -Author: Marian -Created: 2009-05-16 -Status: Draft - -Motivation: - - SHA-1 is horribly outdated and not suited for security critical - purposes. SHA-2, RIPEMD-160, Whirlpool and Tiger are good options - for a short-term replacement, but in the long run, we will - probably want to upgrade to the winner or a semi-finalist of the - SHA-3 competition. - - For a 2006 comparison of different hash algorithms, read: - http://www.sane.nl/sane2006/program/final-papers/R10.pdf - - Other reading about SHA-1: - http://www.schneier.com/blog/archives/2005/02/sha1_broken.html - http://www.schneier.com/blog/archives/2005/08/new_cryptanalyt.html - http://www.schneier.com/paper-preimages.html - - Additionally, AES has been theoretically broken for years. While - the attack is still not efficient enough that the public sector - has been able to prove that it works, we should probably consider - the time between a theoretical attack and a practical attack as an - opportunity to figure out how to upgrade to a better algorithm, - such as Twofish. 
- - See: - http://schneier.com/crypto-gram-0209.html#1 - -Design: - - I suggest that nodes should publish in directories which - cryptographic standards, such as hash algorithms and ciphers, - they support. Clients communicating with nodes will then - pick whichever of those cryptographic standards they prefer - the most. In the case that the node does not publish which - cryptographic standards it supports, the client should assume - that the server supports the older standards, such as SHA-1 - and AES, until such time as we choose to desupport those - standards. - - Node to node communications could work similarly. However, in - case they both support a set of algorithms but have different - preferences, the disagreement would have to be resolved - somehow. Two possibilities include: - * the node requesting communications presents which - cryptographic standards it supports in the request. The - other node picks. - * both nodes send each other lists of what they support and - what version of Tor they are using. The newer node picks, - based on the assumption that the newer node has the most up - to date information about which hash algorithm is the best. - Of course, the node could lie about its version, but then - again, it could also maliciously choose only to support older - algorithms. - - Using this method, we could potentially add server side support - to hash algorithms and ciphers before we instruct clients to - begin preferring those hash algorithms and ciphers. In this way, - the clients could upgrade and the servers would already support - the newly preferred hash algorithms and ciphers, even if the - servers were still using older versions of Tor, so long as the - older versions of Tor were at least new enough to have server - side support. - - This would make quickly upgrading to new hash algorithms and - ciphers easier. This could be very useful when new attacks - are published. 
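The selection rule described above (the client picks its most-preferred algorithm among those a node advertises, and assumes the older standards when the node advertises nothing) could be sketched roughly as follows. This is only an illustration: the function name, the algorithm labels, and the list-based representation are assumptions, not part of the proposal or of Tor.

```python
# Hypothetical sketch of the client-side choice; no name here comes from Tor.

def pick_algorithm(client_prefs, node_supported, legacy):
    """Return the client's most-preferred algorithm that the node supports.

    client_prefs   -- algorithms in the client's (hardcoded) preference order
    node_supported -- algorithms the node published, or None if it published none
    legacy         -- what to assume for nodes that publish nothing
    """
    offered = node_supported if node_supported else legacy
    for alg in client_prefs:
        if alg in offered:
            return alg
    return None  # no mutually acceptable algorithm


client_hash_prefs = ["sha256", "sha1"]  # illustrative labels only
legacy_hashes = ["sha1"]

# Node advertises both: the client's first preference wins.
print(pick_algorithm(client_hash_prefs, ["sha1", "sha256"], legacy_hashes))
# Node advertises nothing: fall back to the older standard.
print(pick_algorithm(client_hash_prefs, None, legacy_hashes))
```

Keeping the preference order fixed in the client, rather than user-configurable, matches the partitioning concern the proposal raises next.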
- - One concern is that client preferences could expose the client - to segmentation attacks. To mitigate this, we suggest hardcoding - preferences in the client, to prevent the client from choosing - to use a new hash algorithm or cipher that no one else is using - yet. While offering a preference might be useful in case a client - with an older version of Tor wants to start using the newer hash - algorithm or cipher that everyone else is using, if the client - cares enough, he or she can just upgrade Tor. - - We may also have to worry about nodes which, through laziness or - maliciousness, refuse to start supporting new hash algorithms or - ciphers. This must be balanced with the need to maintain - backward compatibility so the client will have a large selection - of nodes to pick from. Adding new hash algorithms and ciphers - long before we suggest nodes start using them can help mitigate - this. However, eventually, once sufficient nodes support new - standards, client side support for older standards should be - disabled, particularly if there are practical rather than merely - theoretical attacks. - - Server side support for older standards can be kept much longer - than client side support, since clients using older hashes and - ciphers are really only hurting themselves. - - If server side support for a hash algorithm or cipher is added - but never preferred before we decide we don't really want it, - support can be removed without having to worry about backward - compatibility. - -Security implications: - Improving cryptography will improve Tor's security. However, if - clients pick different cryptographic standards, they could be - partitioned based on their cryptographic preferences. We also - need to worry about nodes refusing to support new standards. - These issues are detailed above. - -Specification: - - Todo. Need better understanding of how Tor currently works or - help from someone who does. 
- -Compatibility: - - This idea is intended to allow easier upgrading of cryptographic - hash algorithms and ciphers while maintaining backwards - compatibility. However, at some point, backwards compatibility - with very old hashes and ciphers should be dropped for security - reasons. - -Implementation: - - Todo. - -Performance and scalability notes: - - Better hashes and ciphers are sometimes a little more CPU intensive - than weaker ones. For instance, on most computers AES is a little - faster than Twofish. However, in that example, I consider Twofish's - additional security worth the tradeoff. - -Acknowledgements: - - Discussed this on IRC with a few people, mostly Nick Mathewson. - Nick was particularly helpful in explaining how Tor works, - explaining goals, and providing various links to Tor - specifications. diff --git a/doc/spec/proposals/ideas/xxx-controllers-intercept-extends.txt b/doc/spec/proposals/ideas/xxx-controllers-intercept-extends.txt deleted file mode 100644 index 76ba5c84b5..0000000000 --- a/doc/spec/proposals/ideas/xxx-controllers-intercept-extends.txt +++ /dev/null @@ -1,44 +0,0 @@ -Author: Geoff Goodell -Title: Allow controller to manage circuit extensions -Date: 12 March 2006 - -History: - - This was once bug 268. Moving it into the proposal system for posterity. - -Test: - -Tor controllers should have a means of learning more about circuits built -through Tor routers. Specifically, if a Tor controller is connected to a Tor -router, it should be able to subscribe to a new class of events, perhaps -"onion" or "router" events. A Tor router SHOULD then ensure that the -controller is informed: - -(a) (NEW) when it receives a connection from some other location, in which -case it SHOULD indicate (1) a unique identifier for the circuit, and (2) a -ServerID in the event of an OR connection from another Tor router, and -Hostname otherwise. 
- -(b) (REQUEST) when it receives a request to extend an existing circuit to a -successive Tor router, in which case it SHOULD provide (1) the unique -identifier for the circuit, (2) a Hostname (or, if possible, ServerID) of the -previous Tor router in the circuit, and (3) a ServerID for the requested -successive Tor router in the circuit; - -(c) (EXTEND) Tor will attempt to extend the circuit to some other router, in -which case it SHOULD provide the same fields as provided for REQUEST. - -(d) (SUCCEEDED) The circuit has been successfully extended to some other -router, in which case it SHOULD provide the same fields as provided for -REQUEST. - -We also need a new configuration option analogous to _leavestreamsunattached, -specifying whether the controller is to manage circuit extensions or not. -Perhaps we can call it "_leavecircuitsunextended". When set to 0, Tor -manages everything as usual. When set to 1, a circuit received by the Tor -router cannot transition from "REQUEST" to "EXTEND" state without being -directed by a new controller command. The controller command probably does -not need any arguments, since circuits are extended per client source -routing, and all that the controller does is accept or reject the extension. - -This feature can be used as a basis for enforcing routing policy. diff --git a/doc/spec/proposals/ideas/xxx-crypto-migration.txt b/doc/spec/proposals/ideas/xxx-crypto-migration.txt deleted file mode 100644 index 1c734229b8..0000000000 --- a/doc/spec/proposals/ideas/xxx-crypto-migration.txt +++ /dev/null @@ -1,384 +0,0 @@ - -Title: Initial thoughts on migrating Tor to new cryptography -Author: Nick Mathewson -Created: 12 December 2010 - -1. Introduction - - Tor currently uses AES-128, RSA-1024, and SHA1. 
Even though these - ciphers were a decent choice back in 2003, and even though attacking - these algorithms is by no means the best way for a well-funded - adversary to attack users (correlation attacks are still cheaper, even - with pessimistic assumptions about the security of each cipher), we - will want to move to better algorithms in the future. Indeed, if - migrating to a new ciphersuite were simple, we would probably have - already moved to RSA-1024/AES-128/SHA256 or something like that. - - So it's a good idea to start figuring out how we can move to better - ciphers. Unfortunately, this is a bit nontrivial, so before we start - doing the design work here, we should start by examining the issues - involved. Robert Ransom and I both decided to spend this weekend - writing up documents of this type so that we can see how much two - people working independently agree on. I know more Tor than Robert; - Robert knows far more cryptography than I do. With luck we'll - complement each other's work nicely. - - A note on scope: This document WILL NOT attempt to pick a new cipher - or set of ciphers. Instead, it's about how to migrate to new ciphers - in general. Any algorithms mentioned other than those we use today - are just for illustration. - - Also, I don't much consider the importance of updating each particular - usage; only the methods that you'd use to do it. - - Also, this isn't a complete proposal. - -2. General principles and tricks - - Before I get started, let's talk about some general design issues. - -2.1. Many algorithms or few? - - Protocols like TLS and OpenPGP allow a wide choice of cryptographic - algorithms; so long as the sender and receiver (or the responder and - initiator) have at least one mutually acceptable algorithm, they can - converge upon it and send each other messages. - - This isn't the best choice for anonymity designs. If two clients - support a different set of algorithms, then an attacker can tell them - apart. 
A protocol with N ciphersuites would in principle split - clients into 2**N-1 sets. (In practice, nearly all users will use the - default, and most users who choose _not_ to use the default will do so - without considering the loss of anonymity. See "Anonymity Loves - Company: Usability and the Network Effect".) - - On the other hand, building only one ciphersuite into Tor has a flaw - of its own: it has proven difficult to migrate to another one. So - perhaps instead of specifying only a single new ciphersuite, we should - specify more than one, with plans to switch over (based on a flag in - the consensus or some other secure signal) once the first choice of - algorithms starts looking iffy. This switch-based approach would seem - especially easy for parameterizable stuff like key sizes. - -2.2. Waiting for old clients and servers to upgrade - - The easiest way to implement a shift in algorithms would be to declare - a "flag day": once we have the new versions of the protocols - implemented, pick a day by which everybody must upgrade to the new - software. Before this day, the software would have the old behavior; - after this day, it would use the improved behavior. - - Tor tries to avoid flag days whenever possible; they have well-known - issues. First, since a number of our users don't automatically - update, it can take a while for people to upgrade to new versions of - our software. Second and more worryingly, it's hard to get adequate - testing for new behavior that is off-by-default. Flag days in other - systems have been known to leave whole networks more or less - inoperable for months; we should not trust in our skill to avoid - similar problems. - - So if we're avoiding flag days, what can we do? - - * We can add _support_ for new behavior early, and have clients use it - where it's available. (Clients know the advertised versions of the - Tor servers they use-- but see 2.3 below for a danger here, and 2.4 - for a bigger danger.) 
- - * We can remove misfeatures that _prevent_ deployment of new - behavior. For instance, if a certain key length has an arbitrary - 1024-bit limit, we can remove that arbitrary limitation. - - * Once an optional new behavior is ubiquitous enough, the authorities - can stop accepting descriptors from servers that do not have it - until they upgrade. - - It is far easier to remove arbitrary limitations than to make other - changes; such changes are generally safe to back-port to older stable - release series. But in general, it's much better to avoid any plans - that require waiting for any version of Tor to no longer be in common - use: a stable release can take on the order of 2.5 years to start - dropping off the radar. Thandy might fix that, but even if a perfect - Thandy release comes out tomorrow, we'll still have lots of older - clients and relays not using it. - - We'll have to approach the migration problem on a case-by-case basis - as we consider the algorithms used by Tor and how to change them. - -2.3. Early adopters and other partitioning dangers - - It's pretty much unavoidable that clients running software that speaks - the new version of any protocol will be distinguishable from those - that cannot speak the new version. This is inevitable, though we - could try to minimize the number of such partitioning sets by having - features turned on in the same release rather than one-at-a-time. - - Another option here is to have new protocols controlled by a - configuration tri-state with values "on", "off", and "auto". The - "auto" value means to look at the consensus to decide whether to use - the feature; the other two values are self-explanatory. We'd ship - clients with the feature set to "auto" by default, with people only - using "on" for testing. - - If we're worried about early client-side implementations of a protocol - turning out to be broken, we can have the consensus value say _which_ - versions should turn on the protocol. - -2.4. 
Avoid whole-circuit switches - - One risky kind of protocol migration is a feature that gets used only - when all the routers in a circuit support it. If such a feature is - implemented by few relays, then each relay learns a lot about the rest - of the path by seeing it used. On the other hand, if the feature is - implemented by most relays, then a relay learns a lot about the rest of - the path when the feature is *not* used. - - It's okay to have a feature that can be only used if two consecutive - routers in the path support it: each router knows the ones adjacent - to it, after all, so knowing what version of Tor they're running is no - big deal. - -2.5. The Second System Effect rears its ugly head - - Any attempt at improving Tor's crypto is likely to involve changes - throughout the Tor protocol. We should be aware of the risks of - falling into what Fred Brooks called the "Second System Effect": when - redesigning a fielded system, it's always tempting to try to shovel in - every possible change that one ever wanted to make to it. - - This is a fine time to make parts of our protocol that weren't - previously versionable into ones that are easier to upgrade in the - future. This probably _isn't_ time to redesign every aspect of the - Tor protocol that anybody finds problematic. - -2.6. Low-hanging fruit and well-lit areas - - Not all parts of Tor are tightly covered. If it's possible to upgrade - different parts of the system at different rates from one another, we - should consider doing the stuff we can do easier, earlier. - - But remember the story of the policeman who finds a drunk under a - streetlamp, staring at the ground? The cop asks, "What are you - doing?" The drunk says, "I'm looking for my keys!" "Oh, did you drop - them around here?" says the policeman. "No," says the drunk, "But the - light is so much better here!" - - Or less proverbially: Simply because a change is easiest, does not - mean it is the best use of our time. 
We should avoid getting bogged - down solving the _easy_ aspects of our system unless they happen also - to be _important_. - -2.7. Nice safe boring codes - - Let's avoid, to the extent that we can: - - being the primary user of any cryptographic construction or - protocol. - - anything that hasn't gotten much attention in the literature. - - anything we would have to implement from scratch - - anything without a nice BSD-licensed C implementation - - Sometimes we'll have the choice of a more efficient algorithm or a - more boring & well-analyzed one. We should not even consider trading - conservative design for efficiency unless we are firmly in the - critical path. - -2.8. Key restrictions - - Our spec says that RSA exponents should be 65537, but our code never - checks for that. If we want to bolster resistance against collision - attacks, we could check this requirement. To the best of my - knowledge, nothing violates it except for tools like "shallot" that - generate cute memorable .onion names. If we want to be nice to - shallot users, we could check the requirement for everything *except* - hidden service identity keys. - -3. Aspects of Tor's cryptography, and thoughts on how to upgrade them all - -3.1. Link cryptography - - Tor uses TLS for its link cryptography; it is easy to add more - ciphersuites to the acceptable list, or increase the length of - link-crypto public keys, or increase the length of the DH parameter, - or sign the X509 certificates with any digest algorithm that OpenSSL - clients will support. Current Tor versions do not check any of these - against expected values. - - The identity key used to sign the second certificate in the current - handshake protocol, however, is harder to change, since it needs to - match up with what we see in the router descriptor for the router - we're connecting to. See notes on router identity below. 
So long as - the certificate chain is ultimately authenticated by a RSA-1024 key, - it's not clear whether making the link RSA key longer on its own - really improves matters or not. - - Recall also that for anti-fingerprinting reasons, we're thinking of - revising the protocol handshake sometime in the 0.2.3.x timeframe. - If we do that, that might be a good time to make sure that we aren't - limited by the old identity key size. - -3.2. Circuit-extend crypto - - Currently, our code requires RSA onion keys to be 1024 bits long. - Additionally, current nodes will not deliver an EXTEND cell unless it - is the right length. - - For this, we might add a second, longer onion-key to router - descriptors, and a second CREATE2 cell to open new circuits - using this key type. It should contain not only the onionskin, but - also information on onionskin version and ciphersuite. Onionskins - generated for CREATE2 cells should use a larger DH group as well, and - keys should be derived from DH results using a better digest algorithm. - - We should remove the length limit on EXTEND cells, backported to all - supported stable versions; call these "EXTEND2" cells. Call these - "lightly patched". Clients could use the new EXTEND2/CREATE2 format - whenever using a lightly patched or new server to extend to a new - server, and the old EXTEND/CREATE format otherwise. - - The new onion skin format should try to avoid the design oddities of - our old one. Instead of its current iffy hybrid encryption scheme, it - should probably do something more like a BEAR/LIONESS operation with a - fixed key on the g^x value, followed by a public key encryption on the - start of the encrypted data. (Robert reminded me about this - construction.) - - The current EXTEND cell format ends with a router identity - fingerprint, which is used by the extended-from router to authenticate - the extended-to router when it connects. 
Changes to this will - interact with changes to how long an identity key can be and to the - link protocol; see notes on the link protocol above and about router - identity below. - -3.2.1. Circuit-extend crypto: fast case - - When we do unauthenticated circuit extends with CREATE/CREATED_FAST, - the two input values are combined with SHA1. I believe that's okay; - using any entropy here at all is overkill. - -3.3. Relay crypto - - Upon receiving relay cells, a router transforms the payload portion of - the cell with the appropriate key, sees if it - recognizes the cell (the recognized field is zero, the digest field is - correct, the cell is outbound), and passes them on if not. It is - possible for each hop in the circuit to handle the relay crypto - differently; nobody but the client and the hop in question need to - coordinate their operations. - - It's not clear, though, whether updating the relay crypto algorithms - would help anything, unless we changed the whole relay cell processing - format too. The stream cipher is good enough, and the use of 4 bytes - of digest does not have enough bits to provide cryptographic strength, - no matter what cipher we use. - - This is the likeliest area for the second-system effect to strike; - there are lots of opportunities to try to be more clever than we are - now. - -3.4. Router identity - - This is one of the hardest things to change. Right now, routers are - identified by a "fingerprint" equal to the SHA1 hash of their 1024-bit - identity key as given in their router descriptor. No existing Tor - will accept any other size of identity key, or any other hash - algorithm. 
The identity key itself is used: - - To sign the router descriptors - - To sign link-key certificates - - To determine the least significant bits of circuit IDs used on a - Tor instance's links (see tor-spec §5.1) - - The fingerprint is used: - - To identify a router identity key in EXTEND cells - - To identify a router identity key in bridge lines - - Throughout the controller interface - - To fetch bridge descriptors for a bridge - - To identify a particular router throughout the codebase - - In the .exit notation. - - By the controller to identify nodes - - To identify servers in the logs - - Probably other places too - - To begin to allow other key types, key lengths, and hash functions, we - would either need to wait till all current Tors are obsolete, or allow - routers to have more than one identity for a while. - - To allow routers to have more than one identity, we need to - cross-certify identity keys. We can do this trivially, in theory, by - listing both keys in the router descriptor and having both identities - sign the descriptor. In practice, we will need to analyze this pretty - carefully to avoid attacks where one key is completely fake aimed to - trick old clients somehow. - - Upgrading the hash algorithm once would be easy: just say that all - new-type keys get hashed using the new hash algorithm. Remaining - future-proof could be tricky. - - This is one of the hardest areas to update; "SHA1 of identity key" is - assumed in so many places throughout Tor that we'll probably need a - lot of design work to work with something else. - -3.5. Directory objects - - Fortunately, the problem is not so bad for consensuses themselves, - because: - - Authority identity keys are allowed to be RSA keys of any length; - in practice I think they are all 3072 bits. - - Authority signing keys are also allowed to be of any length. - AFAIK the code works with longer signing keys just fine. 
- - Currently, votes are hashed with both sha1 and sha256; adding - more hash algorithms isn't so hard. - - Microdescriptor consensuses are all signed using sha256. While - regular consensuses are signed using sha1, exploitable collisions - are hard to come up with, since once you had a collision, you - would need to get a majority of other authorities to agree to - generate it. - - Router descriptors are currently identified by SHA1 digests of their - identity keys and descriptor digests in regular consensuses, and by - SHA1 digests of identity keys and SHA256 digests of microdescriptors - in microdesc consensuses. The consensus-flavors design allows us to - generate new flavors of consensus that identify routers by new hashes - of their identity keys. Alternatively, existing consensuses could be - expanded to contain more hashes, though that would have some space - concerns. - - Router descriptors themselves are signed using RSA-1024 identity keys - and SHA1. For information on updating identity keys, see above. - - Router descriptors and extra-info documents cross-certify one another - using SHA1. - - Microdescriptors are currently specified to contain exactly one - onion key, of length 1024 bits. - -3.6. The directory protocol - - Most objects are indexed by SHA1 hash of an identity key or a - descriptor object. Adding more hash types wouldn't be a huge problem - at the directory cache level. - -3.7. The hidden service protocol - - Hidden services self-identify by a 1024-bit RSA key. Other key - lengths are not supported. This key is turned into an 80 bit half - SHA-1 hash for hidden service names. - - The most simple change here would be to set an interface for putting - the whole ugly SHA1 hash in the hidden service name. Remember that - this needs to coexist with the authentication system which also uses - .onion hostnames; that hostnames top out around 255 characters and - their components top out at 63. 
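To make the size argument above concrete, the following rough sketch compares the 80-bit name with a name carrying the whole SHA1 hash. It assumes the name is the lowercase base32 encoding of the (truncated) SHA1 digest of the encoded public key; the function name and the placeholder key bytes are illustrative only and are not taken from Tor's code.

```python
import base64
import hashlib


def onion_label(encoded_public_key: bytes, full_hash: bool = False) -> str:
    """Return a base32 label derived from the SHA1 of the given key bytes.

    With full_hash=False only the first 80 bits (10 bytes) are kept,
    mirroring the "80 bit half SHA-1 hash" described in the text.
    """
    digest = hashlib.sha1(encoded_public_key).digest()  # 20 bytes
    if not full_hash:
        digest = digest[:10]
    return base64.b32encode(digest).decode("ascii").lower()


# Placeholder input; a real service would hash its encoded RSA public key.
key = b"placeholder bytes standing in for an encoded RSA key"

short_label = onion_label(key)
full_label = onion_label(key, full_hash=True)

print(len(short_label), len(full_label))  # 16 32
```

Under these assumptions even the full hash yields a 32-character label, well under the 63-character limit on hostname components mentioned above.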
- - Currently, ESTABLISH_INTRO cells take a key length parameter, so in - theory they allow longer keys. The rest of the protocol assumes that - this will be hashed into a 20-byte SHA1 identifier. Changing that - would require changes at the introduction point as well as the hidden - service. - - The parsing code for hidden service descriptors currently enforces a - 1024-bit identity key, though this does not seem to be described in - the specification. Changing that would be at least as hard as doing - it for regular identity keys. - - Fortunately, hidden services are nearly completely orthogonal to - everything else. - diff --git a/doc/spec/proposals/ideas/xxx-crypto-requirements.txt b/doc/spec/proposals/ideas/xxx-crypto-requirements.txt deleted file mode 100644 index 8a8943a42f..0000000000 --- a/doc/spec/proposals/ideas/xxx-crypto-requirements.txt +++ /dev/null @@ -1,72 +0,0 @@ -Title: Requirements for Tor's circuit cryptography -Author: Robert Ransom -Created: 12 December 2010 - -Overview - - This draft is intended to specify the meaning of 'secure' for a Tor - circuit protocol, hopefully in enough detail that - mathematically-inclined cryptographers can use this definition to - prove that a Tor circuit protocol (or component thereof) is secure - under reasonably well-accepted assumptions. - - Tor's current circuit protocol consists of the CREATE, CREATED, RELAY, - DESTROY, CREATE_FAST, CREATED_FAST, and RELAY_EARLY cells (including - all subtypes of RELAY and RELAY_EARLY cells). Tor currently has two - circuit-extension handshake protocols: one consists of the CREATE and - CREATED cells; the other, used only over the TLS connection to the - first node in a circuit, consists of the CREATE_FAST and CREATED_FAST - cells. - -Requirements - - 1. 
Every circuit-extension handshake protocol must provide forward - secrecy -- the protocol must allow both the client and the relay to - destroy, immediately after a circuit is closed, enough key material - that no attacker who can eavesdrop on all handshake and circuit cells - and who can seize and inspect the client and relay after the circuit - is closed will be able to decrypt any non-handshake data sent along - the circuit. - - In particular, the protocol must not require that a key which can be - used to decrypt non-handshake data be stored for a predetermined - period of time, as such a key must be written to persistent storage. - - 2. Every circuit-extension handshake protocol must specify what key - material must be used only once in order to allow unlinkability of - circuit-extension handshakes. - - 3. Every circuit-extension handshake protocol must authenticate the relay - to the client -- an attacker who can eavesdrop on all handshake and - circuit cells and who can participate in handshakes with the client - must not be able to determine a symmetric session key that a circuit - will use without either knowing a secret key corresponding to a - handshake-authentication public key published by the relay or breaking - a cryptosystem for which the relay published a - handshake-authentication public key. - - 4. Every circuit-extension handshake protocol must ensure that neither - the client nor the relay can cause the handshake to result in a - predetermined symmetric session key. - - 5. 
Every circuit-extension handshake protocol should ensure that an - attacker who can predict the relay's ephemeral secret input to the - handshake and can eavesdrop on all handshake and circuit cells, but - does not know a secret key corresponding to the - handshake-authentication public key used in the handshake, cannot - break the handshake-authentication public key's cryptosystem, and - cannot predict the client's ephemeral secret input to the handshake, - cannot predict the symmetric session keys used for the resulting - circuit. - - 6. The circuit protocol must specify an end-to-end flow-control - mechanism, and must allow for the addition of new mechanisms. - - 7. The circuit protocol should specify the statistics to be exchanged - between circuit endpoints in order to support end-to-end flow control, - and should specify how such statistics can be verified. - - - 8. The circuit protocol should allow an endpoint to verify that the other - endpoint is participating in an end-to-end flow-control protocol - honestly. diff --git a/doc/spec/proposals/ideas/xxx-draft-spec-for-TLS-normalization.txt b/doc/spec/proposals/ideas/xxx-draft-spec-for-TLS-normalization.txt deleted file mode 100644 index 16484e6375..0000000000 --- a/doc/spec/proposals/ideas/xxx-draft-spec-for-TLS-normalization.txt +++ /dev/null @@ -1,360 +0,0 @@ -Filename: xxx-draft-spec-for-TLS-normalization.txt -Title: Draft spec for TLS certificate and handshake normalization -Author: Jacob Appelbaum, Gladys Shufflebottom -Created: 16-Feb-2011 -Status: Draft - - - Draft spec for TLS certificate and handshake normalization - - - Overview - -Scope - -This is a document that proposes improvements to problems with Tor's -current TLS (Transport Layer Security) certificates and handshake that will -reduce the distinguishability of Tor traffic from other encrypted traffic that -uses TLS. It also addresses some of the possible fingerprinting attacks -possible against the current Tor TLS protocol setup process. 
- -Motivation and history - -Censorship is an arms race and this is a step forward in the defense -of Tor. This proposal outlines ideas to make it more difficult to -fingerprint and block Tor traffic. - -Goals - -This proposal intends to normalize or remove easy-to-predict or static -values in the Tor TLS certificates and in the Tor TLS setup process. -These values can be used as criteria for the automated classification of -encrypted traffic as Tor traffic. Network observers should not be able -to trivially detect Tor merely by receiving or observing the certificate -used or advertised by a Tor relay. I also propose the creation of -a hard-to-detect covert channel through which a server can signal that it -supports the third version ("V3") of the Tor handshake protocol. - -Non-Goals - -This document is not intended to solve all of the possible active or passive -Tor fingerprinting problems. This document focuses on removing distinctive -and predictable features of TLS protocol negotiation; we do not attempt to -make guarantees about resisting other kinds of fingerprinting of Tor -traffic, such as fingerprinting techniques related to timing or volume of -transmitted data. - - Implementation details - - -Certificate Issues - -The CN or commonName ASN1 field - -Tor generates certificates with a predictable commonName field; the -field is within a given range of values that is specific to Tor. -Additionally, the generated host names have other undesirable properties. -The host names typically do not resolve in the DNS because the domain -names referred to are generated at random. Although they are syntactically -valid, they usually refer to domains that have never been registered by -any domain name registrar. 
- -An example of the current commonName field: CN=www.s4ku5skci.net - -An example of OpenSSL's asn1parse over a typical Tor certificate: - - 0:d=0 hl=4 l= 438 cons: SEQUENCE - 4:d=1 hl=4 l= 287 cons: SEQUENCE - 8:d=2 hl=2 l= 3 cons: cont [ 0 ] - 10:d=3 hl=2 l= 1 prim: INTEGER :02 - 13:d=2 hl=2 l= 4 prim: INTEGER :4D3C763A - 19:d=2 hl=2 l= 13 cons: SEQUENCE - 21:d=3 hl=2 l= 9 prim: OBJECT :sha1WithRSAEncryption - 32:d=3 hl=2 l= 0 prim: NULL - 34:d=2 hl=2 l= 35 cons: SEQUENCE - 36:d=3 hl=2 l= 33 cons: SET - 38:d=4 hl=2 l= 31 cons: SEQUENCE - 40:d=5 hl=2 l= 3 prim: OBJECT :commonName - 45:d=5 hl=2 l= 24 prim: PRINTABLESTRING :www.vsbsvwu5b4soh4wg.net - 71:d=2 hl=2 l= 30 cons: SEQUENCE - 73:d=3 hl=2 l= 13 prim: UTCTIME :110123184058Z - 88:d=3 hl=2 l= 13 prim: UTCTIME :110123204058Z - 103:d=2 hl=2 l= 28 cons: SEQUENCE - 105:d=3 hl=2 l= 26 cons: SET - 107:d=4 hl=2 l= 24 cons: SEQUENCE - 109:d=5 hl=2 l= 3 prim: OBJECT :commonName - 114:d=5 hl=2 l= 17 prim: PRINTABLESTRING :www.s4ku5skci.net - 133:d=2 hl=3 l= 159 cons: SEQUENCE - 136:d=3 hl=2 l= 13 cons: SEQUENCE - 138:d=4 hl=2 l= 9 prim: OBJECT :rsaEncryption - 149:d=4 hl=2 l= 0 prim: NULL - 151:d=3 hl=3 l= 141 prim: BIT STRING - 295:d=1 hl=2 l= 13 cons: SEQUENCE - 297:d=2 hl=2 l= 9 prim: OBJECT :sha1WithRSAEncryption - 308:d=2 hl=2 l= 0 prim: NULL - 310:d=1 hl=3 l= 129 prim: BIT STRING - -I propose that we match OpenSSL's default self-signed certificates. I hypothesise -that they are the most common self-signed certificates. If this turns out not -to be the case, then we should use whatever the most common turns out to be. - -Certificate serial numbers - -Currently our generated certificate serial number is set to the number of -seconds since the epoch at the time of the certificate's creation. I propose -that we should ensure that our serial numbers are unrelated to the epoch, -since the generation methods are potentially recognizable as Tor-related. 
- -Instead, I propose that we use a randomly generated number that is -subsequently hashed with SHA-512 and then truncate the data to eight bytes[1]. - -Random sixteen byte values appear to be the high bound for serial number as -issued by Verisign and DigiCert. RapidSSL appears to be three bytes in length. -Others common byte lengths appear to be between one and four bytes. The default -OpenSSL certificates are eight bytes and we should use this length with our -self-signed certificates. - -This randomly generated serial number field may now serve as a covert channel -that signals to the client that the OR will not support TLS renegotiation; this -means that the client can expect to perform a V3 TLS handshake setup. -Otherwise, if the serial number is a reasonable time since the epoch, we should -assume the OR is using an earlier protocol version and hence that it expects -renegotiation. - -We also have a need to signal properties with our certificates for a possible -v3 handshake in the future. Therefore I propose that we match OpenSSL default -self-signed certificates (a 64-bit random number), but reserve the two least- -significant bits for signaling. For the moment, these two bits will be zero. - -This means that an attacker may be able to identify Tor certificates from default -OpenSSL certificates with a 75% probability. - -As a security note, care must be taken to ensure that supporting this -covert channel will not lead to an attacker having a method to downgrade client -behavior. This shouldn't be a risk because the TLS Finished message hashes over -all the bytes of the handshake, including the certificates. 
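The serial-number scheme proposed above could be sketched roughly as follows. This is only an illustration under stated assumptions, not Tor's implementation: `make_serial` is a hypothetical name, `os.urandom` stands in for Tor's own random source, and the proposal does not say how many random bytes to hash (64 are assumed here) or which eight bytes of the digest to keep (the first eight are assumed).

```python
import hashlib
import os

# The two least-significant bits are reserved for signaling (zero for now).
SIGNAL_BITS_MASK = 0b11


def make_serial(signal_bits: int = 0) -> int:
    """Return a 64-bit serial number built as the proposal describes.

    Random bytes are hashed with SHA-512, the digest is truncated to
    eight bytes, and the two low bits are overwritten with signal_bits.
    """
    if signal_bits & ~SIGNAL_BITS_MASK:
        raise ValueError("signal_bits must fit in two bits")
    seed = os.urandom(64)  # assumption: seed length is not given in the text
    digest = hashlib.sha512(seed).digest()
    serial = int.from_bytes(digest[:8], "big")  # assumption: keep first 8 bytes
    return (serial & ~SIGNAL_BITS_MASK) | signal_bits
```

With both reserved bits fixed at zero, a uniformly random 64-bit serial would share that pattern only one time in four, which appears to be where the 75% figure above comes from.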
- -Certificate fingerprinting issues expressed as base64 encoding - -It appears that all deployed Tor certificates have the following strings in -common: - -MIIB -CCA -gAwIBAgIETU -ANBgkqhkiG9w0BAQUFADA -YDVQQDEx -3d3cu - -As expected these values correspond to specific ASN.1 OBJECT IDENTIFIER (OID) -properties (sha1WithRSAEncryption, commonName, etc) of how we generate our -certificates. - -As an illustrated example of the common bytes of all certificates used within -the Tor network within a single one hour window, I have replaced the actual -value with a wild card ('.') character here: - ------BEGIN CERTIFICATE----- -MIIB..CCA..gAwIBAgIETU....ANBgkqhkiG9w0BAQUFADA.M..w..YDVQQDEx.3 -d3cu............................................................ -................................................................ -................................................................ -................................................................ -................................................................ -................................................................ -................................................................ -................................................................ -........................... <--- Variable length and padding ------END CERTIFICATE----- - -This fine ascii art only illustrates the bytes that absolutely match in all -cases. In many cases, it's likely that there is a high probability for a given -byte to be only a small subset of choices. - -Using the above strings, the EFF's certificate observatory may trivially -discover all known relays, known bridges and unknown bridges in a single SQL -query. I propose that we ensure that we test our certificates to ensure that -they do not have these kinds of statistical similarities without ensuring -overlap with a very large cross section of the internet's certificates. 
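As a rough sketch of the kind of self-test proposed above, the following checks whether a PEM certificate body contains the common strings listed earlier, in the order they appear in the wildcard illustration. The function names are hypothetical, and the check is deliberately simplistic: it ignores exact offsets, and it does not attempt to reproduce the observatory's schema or query.

```python
# Substrings that the text above reports as common to deployed Tor certificates.
TOR_CERT_MARKERS = (
    "MIIB",
    "CCA",
    "gAwIBAgIETU",
    "ANBgkqhkiG9w0BAQUFADA",
    "YDVQQDEx",
    "3d3cu",
)


def pem_body(pem: str) -> str:
    """Join the base64 lines found between the BEGIN/END marker lines."""
    lines = [line.strip() for line in pem.strip().splitlines()]
    return "".join(l for l in lines if l and not l.startswith("-----"))


def looks_like_tor_cert(pem: str) -> bool:
    """Return True if every marker occurs, in order, in the base64 body."""
    body = pem_body(pem)
    pos = 0
    for marker in TOR_CERT_MARKERS:
        pos = body.find(marker, pos)
        if pos < 0:
            return False
        pos += len(marker)
    return True
```

A newly generated certificate for which `looks_like_tor_cert` returns True would still carry the old fingerprint; a False result says nothing about other statistical similarities.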
- -Certificate dating and validity issues - -TLS certificates found in the wild are generally found to be long-lived; -they are frequently old and often even expired. The current Tor certificate -validity time is a very small time window starting at generation time and -ending shortly thereafter, as defined in or.h by MAX_SSL_KEY_LIFETIME -(2*60*60). - -I propose that the certificate validity time length is extended to a period of -twelve Earth months, possibly with a small random skew to be determined by the -implementer. Tor should randomly set the start date in the past or some -currently unspecified window of time before the current date. This would -more closely track the typical distribution of non-Tor TLS certificate -expiration times. - -The certificate values, such as expiration, should not be used for anything -relating to security; for example, if the OR presents an expired TLS -certificate, this does not imply that the client should terminate the -connection (as would be appropriate for an ordinary TLS implementation). -Rather, I propose we use a TOFU style expiration policy - the certificate -should never be trusted for more than a two hour window from first sighting. - -This policy should have two major impacts. The first is that an adversary will -have to perform a differential analysis of all certificates for a given IP -address rather than a single check. The second is that the server expiration -time is enforced by the client and confirmed by keys rotating in the consensus. - -The expiration time should not be a fixed time that is simple to calculate by -any Deep Packet Inspection device or it will become a new Tor TLS setup -fingerprint. - -Proposed certificate form - -The following output from openssl asn1parse results from the proposed -certificate generation algorithm. 
It matches the results of generating a -default self-signed certificate: - - 0:d=0 hl=4 l= 513 cons: SEQUENCE - 4:d=1 hl=4 l= 362 cons: SEQUENCE - 8:d=2 hl=2 l= 9 prim: INTEGER :DBF6B3B864FF7478 - 19:d=2 hl=2 l= 13 cons: SEQUENCE - 21:d=3 hl=2 l= 9 prim: OBJECT :sha1WithRSAEncryption - 32:d=3 hl=2 l= 0 prim: NULL - 34:d=2 hl=2 l= 69 cons: SEQUENCE - 36:d=3 hl=2 l= 11 cons: SET - 38:d=4 hl=2 l= 9 cons: SEQUENCE - 40:d=5 hl=2 l= 3 prim: OBJECT :countryName - 45:d=5 hl=2 l= 2 prim: PRINTABLESTRING :AU - 49:d=3 hl=2 l= 19 cons: SET - 51:d=4 hl=2 l= 17 cons: SEQUENCE - 53:d=5 hl=2 l= 3 prim: OBJECT :stateOrProvinceName - 58:d=5 hl=2 l= 10 prim: PRINTABLESTRING :Some-State - 70:d=3 hl=2 l= 33 cons: SET - 72:d=4 hl=2 l= 31 cons: SEQUENCE - 74:d=5 hl=2 l= 3 prim: OBJECT :organizationName - 79:d=5 hl=2 l= 24 prim: PRINTABLESTRING :Internet Widgits Pty Ltd - 105:d=2 hl=2 l= 30 cons: SEQUENCE - 107:d=3 hl=2 l= 13 prim: UTCTIME :110217011237Z - 122:d=3 hl=2 l= 13 prim: UTCTIME :120217011237Z - 137:d=2 hl=2 l= 69 cons: SEQUENCE - 139:d=3 hl=2 l= 11 cons: SET - 141:d=4 hl=2 l= 9 cons: SEQUENCE - 143:d=5 hl=2 l= 3 prim: OBJECT :countryName - 148:d=5 hl=2 l= 2 prim: PRINTABLESTRING :AU - 152:d=3 hl=2 l= 19 cons: SET - 154:d=4 hl=2 l= 17 cons: SEQUENCE - 156:d=5 hl=2 l= 3 prim: OBJECT :stateOrProvinceName - 161:d=5 hl=2 l= 10 prim: PRINTABLESTRING :Some-State - 173:d=3 hl=2 l= 33 cons: SET - 175:d=4 hl=2 l= 31 cons: SEQUENCE - 177:d=5 hl=2 l= 3 prim: OBJECT :organizationName - 182:d=5 hl=2 l= 24 prim: PRINTABLESTRING :Internet Widgits Pty Ltd - 208:d=2 hl=3 l= 159 cons: SEQUENCE - 211:d=3 hl=2 l= 13 cons: SEQUENCE - 213:d=4 hl=2 l= 9 prim: OBJECT :rsaEncryption - 224:d=4 hl=2 l= 0 prim: NULL - 226:d=3 hl=3 l= 141 prim: BIT STRING - 370:d=1 hl=2 l= 13 cons: SEQUENCE - 372:d=2 hl=2 l= 9 prim: OBJECT :sha1WithRSAEncryption - 383:d=2 hl=2 l= 0 prim: NULL - 385:d=1 hl=3 l= 129 prim: BIT STRING - - -Custom Certificates - -It should be possible for a Tor relay operator to use a 
specifically supplied -certificate and secret key. This will allow a relay or bridge operator to use a -certificate signed by any member of any geographically relevant certificate -authority racket; it will also allow for any other user-supplied certificate. -This may be desirable in some kinds of filtered networks or when attempting to -avoid attracting suspicion by blending in with the TLS web server certificate -crowd. - -Problematic Diffie–Hellman parameters - -We currently send a static Diffie–Hellman parameter, prime p (or “prime p -outlaw”) as specified in RFC2409 as part of the TLS Server Hello response. - -The use of this prime in TLS negotiations may, as a result, be filtered and -effectively banned by certain networks. We do not have to use this particular -prime in all cases. - -While amusing to have the power to make specific prime numbers into a new class -of numbers (cf. imaginary, irrational, illegal [3]) - our new friend prime p -outlaw is not required. - -The use of this prime in TLS negotiations may, as a result, be filtered and -effectively banned by certain networks. We do not have to use this particular -prime in all cases. - -I propose that the function to initialize and generate DH parameters be -split into two functions. - -First, init_dh_param() should be used only for OR-to-OR DH setup and -communication. Second, it is proposed that we create a new function -init_tls_dh_param() that will have a two-stage development process. - -The first stage init_tls_dh_param() will use the same prime that -Apache2.x [4] sends (or “dh1024_apache_p”), and this change should be -made immediately. This is a known good and safe prime number (p-1 / 2 -is also prime) that is currently not known to be blocked. - -The second stage init_tls_dh_param() should randomly generate a new prime on a -regular basis; this is designed to make the prime difficult to outlaw or -filter. Call this a shape-shifting or "Rakshasa" prime. 
This should be added -to the 0.2.3.x branch of Tor. This prime can be generated at setup or execution -time and probably does not need to be stored on disk. Rakshasa primes only -need to be generated by Tor relays as Tor clients will never send them. Such -a prime should absolutely not be shared between different Tor relays nor -should it ever be static after the 0.2.3.x release. - -As a security precaution, care must be taken to ensure that we do not generate -weak primes or known filtered primes. Both weak and filtered primes will -undermine the TLS connection security properties. OpenSSH solves this issue -dynamically in RFC 4419 [5] and may provide a solution that works reasonably -well for Tor. More research in this area including the applicability of -Miller-Rabin or AKS primality tests[6] will need to be analyzed and probably -added to Tor. - -Practical key size - -Currently we use a 1024 bit long RSA modulus. I propose that we increase the -RSA key size to 2048 as an additional channel to signal support for the V3 -handshake setup. 2048 appears to be the most common key size[0] above 1024. -Additionally, the increase in modulus size provides a reasonable security boost -with regard to key security properties. - -The implementer should increase the 1024 bit RSA modulus to 2048 bits. - -Possible future filtering nightmares - -At some point it may be cost effective or politically feasible for a network -filter to simply block all signed or self-signed certificates without a known -valid CA trust chain. This will break many applications on the internet and -hopefully, our option for custom certificates will ensure that this step is -simply avoided by the censors. - -The Rakshasa prime approach may cause censors to specifically allow only -certain known and accepted DH parameters. - - -Appendix: Other issues - -What other obvious TLS certificate issues exist? What other static values are -present in the Tor TLS setup process? 
- -[0] http://archives.seul.org/or/dev/Jan-2011/msg00051.html -[1] http://archives.seul.org/or/dev/Feb-2011/msg00016.html -[2] http://archives.seul.org/or/dev/Feb-2011/msg00039.html -[3] To be fair this is hardly a new class of numbers. History is rife with - similar examples of inane authoritarian attempts at mathematical secrecy. - Probably the most dramatic example is the story of the pupil Hipassus of - Metapontum, pupil of the famous Pythagoras, who, legend goes, proved the - fact that Root2 cannot be expressed as a fraction of whole numbers (now - called an irrational number) and was assassinated for revealing this - secret. Further reading on the subject may be found on the Wikipedia: - http://en.wikipedia.org/wiki/Hippasus - -[4] httpd-2.2.17/modules/ss/ssl_engine_dh.c -[5] http://tools.ietf.org/html/rfc4419 -[6] http://archives.seul.org/or/dev/Jan-2011/msg00037.html diff --git a/doc/spec/proposals/ideas/xxx-encrypted-services.txt b/doc/spec/proposals/ideas/xxx-encrypted-services.txt deleted file mode 100644 index 3c2ac67fa4..0000000000 --- a/doc/spec/proposals/ideas/xxx-encrypted-services.txt +++ /dev/null @@ -1,66 +0,0 @@ -Filename: xxx-encrypted-services.txt -Title: Encrypted services as a replacement to exit enclaving -Author: Roger Dingledine -Created: 2011-01-12 -Status: Draft - -We should offer a way to run a Tor hidden service where the server-side -rendezvous circuits are just one hop. - -1. Motivation - - There are three Tor use cases that this idea addresses: - - 1) Indymedia wants to run an exit enclave that provides end-to-end - authentication and encryption. They tried running an exit relay that - just exits to themselves: - https://trac.torproject.org/projects/tor/ticket/800 - but a) it handles lots of other traffic too since it's a relay, and - b) exit enclaves don't actually work consistently, because the first - connection from the user doesn't realize it should use the exit enclave. 
- - 2) Wikileaks uses Tor hidden services not to hide their service, - but because the hidden service address provides a type of usability - we didn't think much about: unlike a more normal address, a Tor - hidden service address either works (meaning you get your end-to-end - authentication and encryption) or it fails hard. With a hidden service - address there's no way a user could accidentally submit their documents - to Wikileaks without using Tor, but with normal Tor it's possible. - - 3) The Freenode IRC network wants to provide end-to-end encryption and - authentication to its users, a) to handle the fact that the IRC protocol - doesn't really provide much of that by default, and b) to funnel all - their Tor users into a single location so they can handle the anonymous - users better. They don't mind the fact that their service is hidden, but - they'd rather have better performance for their users given the choice. - -2. Design - - It seems that the main changes required would be to a) make - circuit_launch_by_extend_info() know to use 1 hop rather than the - default, and know not to try to cannibalize a general 3-hop circ for - these circuits, and b) add a way in the torrc file to specify that this - service wants to be an encrypted service rather than a hidden service. - - I had originally pondered some sort of even more efficient "signed - document saying this service is running at this Tor relay", which - would be more efficient because it would cut out the rendezvous step. - But by reusing the hidden service rendezvous infrastructure, we a) - blend in with hidden services (and hidden service descriptors) and - don't need to teach users (or their Tor clients) a new interface, - and b) can offer the encrypted service on a non-relay. - - One design question to ponder: should we continue to use three-hop - circuits for our introduction points, and for publishing our encrypted - service descriptor? Probably. - -3. 
Security implications - - There's a possible second-order effect here since both encrypted - services and hidden services will have foo.onion addresses and it's - not clear based on the address whether the service will be hidden -- - if *some* .onion addresses are easy to track down, are we encouraging - adversaries to attack all rendezvous points just in case? - -... - diff --git a/doc/spec/proposals/ideas/xxx-exit-scanning-outline.txt b/doc/spec/proposals/ideas/xxx-exit-scanning-outline.txt deleted file mode 100644 index d84094400a..0000000000 --- a/doc/spec/proposals/ideas/xxx-exit-scanning-outline.txt +++ /dev/null @@ -1,44 +0,0 @@ -1. Scanning process - A. Non-HTML/JS HTTP mime types compared via SHA1 hash - B. Dynamic HTTP content filtered at 4 levels: - 1. IP change+Tor cookie utilization - - Tor cookies replayed with new IP in case of changes - 2. HTML Tag+Attribute+JS comparison - - Comparisons made based only on "relevant" HTML tags - and attributes - 3. HTML Tag+Attribute+JS diffing - - Tags, attributes and JS AST nodes that change during - Non-Tor fetches pruned from comparison - 4. URLS with > N% of node failures removed - - results purged from filesystem at end of scan loop - C. SSL scanning handles some forms of dynamic certs - 1. Catalogs certs for all IPs resolved locally - by getaddrinfo over the duration of the scan. - - Updated each test. - 2. If the domain presents a new cert for each IP, this - is noted on the failure result for the node - 3. If the same IP presents two different certs locally, - the cert list is first refreshed, and if it happens - again, discarded - 4. A N% node failure filter also applies - D. Scanner can be restarted from any point in the event - of scanner or system crashes, or graceful shutdown. - - Results+scan state pickled to filesystem continuously -2. Cron job checks results periodically for reporting - A. Divide failures into three types of BadExit based on type - and frequency over time and incident rate - B. 
write reject lines to approved-routers for those three types: - 1. ID Hex based (for misconfig/network problems easily fixed) - 2. IP based (for content modification) - 3. IP+mask based (for continuous/egregious content modification) - C. Emails results to tor-scanners@freehaven.net -3. Human Review and Appeal - A. ID Hex-based BadExit is meant to be possible to remove easily - without needing to beg us. - - Should this behavior be encouraged? - B. Optionally can reserve IP based badexits for human review - 1. Results are encapsulated fully on the filesystem and can be - reviewed without network access - 2. Soat has --rescan to rescan failed nodes from a data directory - - New set of URLs used - diff --git a/doc/spec/proposals/ideas/xxx-geoip-survey-plan.txt b/doc/spec/proposals/ideas/xxx-geoip-survey-plan.txt deleted file mode 100644 index 49c6615a66..0000000000 --- a/doc/spec/proposals/ideas/xxx-geoip-survey-plan.txt +++ /dev/null @@ -1,137 +0,0 @@ - - -Abstract - - This document explains how to tell about how many Tor users there - are, and how many there are in which country. Statistics are - involved. - -Motivation - - There are a few reasons we need to keep track of which countries - Tor users (in aggregate) are coming from: - - - Resource allocation. Knowing about underserved countries with - lots of users can let us know about where we need to direct - translation and outreach efforts. - - - Anticensorship. Sudden drops in usage on a national basis can - indicate the arrival of a censorious firewall. - - - Sponsor outreach and self-evaluation. Many people and - organizations who are interested in funding The Tor Project's - work want to know that we're successfully serving parts of the - world they're interested in, and that efforts to expand our - userbase are actually succeeding. So do we. 
- -Goals - - We want to know approximately how many Tor users there are, and which - countries they're in, even in the presence of a hypothetical - "directory guard" feature. Some uncertainty is okay, but we'd like - to be able to put a bound on the uncertainty. - - We need to make sure this information isn't exposed in a way that - helps an adversary. - -Methods for current clients: - - Every client downloads network status documents. There are - currently three methods (one hypothetical) for clients to get them. - - 0.1.2.x clients (and earlier) fetch a v2 networkstatus - document about every NETWORKSTATUS_CLIENT_DL_INTERVAL [30 - minutes]. - - - 0.2.0.x clients fetch a v3 networkstatus consensus document - at a random interval between when their current document is no - longer freshest, and when their current document is about to - expire. - - [In both of the above cases, clients choose a running - directory cache at random with odds roughly proportional to - its bandwidth. If they're just starting, they know a XXXX FIXME -NM] - - - In some future version, clients will choose directory caches - to serve as their "directory guards" to avoid profiling - attacks, similarly to how clients currently start all their - circuits at guard nodes. - - We assume that a directory cache can tell which of these three - categories a client is in by the format of its status request. - - A directory cache can be made to count distinct client IP - addresses that make a certain request of it in a given timeframe, - and total requests made to it over that timeframe. For the first - two cases, a cache can get a picture of the overall - number and countries of users in the network by dividing the IP - count by the probability with which they (as a cache) would be - chosen. 
Assuming that our listed bandwidth is such that we expect - to be chosen with probability P for any given request, and we've - been counting IPs for long enough that we expect the average - client to have made N requests, they will have visited us at least - once with probability P' = 1-(1-P)^N, and so we divide the IP - counts we've seen by P' for our estimate. To estimate total - number of clients of a given type, determine how many requests a - client of that type will make over that time, and assume we'll - have seen P of them. - - Both of these numbers are useful: the IP counts will give the - total number of IPs connecting to the network, and the request - counts will give the total number of users on the network at any - given time. - - Notes: - - [Over H hours, the N for V2 clients is 2*H, and the N for V3 - clients is currently around H/2 or H/3.] - - - (We should only count requests that we actually intend to answer; - 503 requests shouldn't count.) - - - These measurements should also be taken at a directory - authority if possible: their picture of the network is skewed - by clients that fetch from them directly. These clients, - however, are all the clients that are just bootstrapping - (assuming that the fallback-consensus feature isn't yet used - much). - - - These measurements also overestimate the V2 download rate if - some downloads fail and clients retry them later after backing - off. - -Methods for directory guards: - - If directory guards are in use, directory guards get a picture of - all those users who chose them as a guard when they were listed - as a good choice for a guard, and who are also on the network - now. 
The cleanest data here will come from nodes that were listed - as good new-guards choices for a while, and have not been so for a - while longer (to study decay rates); nodes that have been listed - as good new-guard choices consistently for a long time (to get a - sample of the network); and nodes that have been listed as good - new-guard choices only recently (to get a sample of new users and - users whose guards have died out.) - - Since directory guards are currently unspecified, we'll need to - make some guesses about how they'll turn out to work. Here are - a couple of approaches that could work. - - We could have clients pick completely new directory guards on - a rolling basis every two months or so. This would ensure - that staying as a guard for a while would be sufficient to - see a sample of users. This is potentially advantageous for - load-balancing the network as well, though it might lose some - of the benefits of directory guard. We need to quantify the - impact of this; it might not actually make stuff worse in - practice, if most guards don't stay good guards for a month - or two. - - - We could try to collect statistics at several directory - guards and combine their statistics, but we would need to make - sure that for all time, at least one of the directory guards - had been recommended as a good choice for new guards. By - looking at new-IP rates for guards, we could get an idea of - user uptake; by looking at old-IP decay rates, we could get - an idea of turnover. This approach would entail significant - complexity, and we'd probably need to record more information - than we'd really like to. 
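The estimate described under "Methods for current clients" can be written out as a short calculation. This is a sketch under stated assumptions: the function names are hypothetical, each request is assumed to pick this cache independently with probability P, and the request-based estimate is only one reading of "assume we'll have seen P of them".

```python
def visit_probability(p: float, n: int) -> float:
    """P' = 1 - (1 - P)^N: chance a client hit this cache at least once.

    p is the probability that any single request goes to this cache;
    n is the number of requests an average client made while counting.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1.0 - (1.0 - p) ** n


def estimate_clients_from_ips(distinct_ips: int, p: float, n: int) -> float:
    """Scale the distinct-IP count seen at one cache up by 1/P'."""
    return distinct_ips / visit_probability(p, n)


def estimate_clients_from_requests(requests: int, p: float, n: int) -> float:
    """Assumed reading: the cache saw a fraction P of all requests, and
    each client made N of them, so clients = requests / (P * N)."""
    if not 0.0 < p <= 1.0 or n < 1:
        raise ValueError("need 0 < p <= 1 and n >= 1")
    return requests / (p * n)
```

Per the notes above, N would be about 2*H for V2 clients counted over H hours, and roughly H/2 or H/3 for V3 clients.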
- - diff --git a/doc/spec/proposals/ideas/xxx-grand-scaling-plan.txt b/doc/spec/proposals/ideas/xxx-grand-scaling-plan.txt deleted file mode 100644 index 336798cc0f..0000000000 --- a/doc/spec/proposals/ideas/xxx-grand-scaling-plan.txt +++ /dev/null @@ -1,97 +0,0 @@ - -Right now as I understand it, there are n big scaling problems heading -our way: - -1) Clients need to learn all the relay descriptors they could use. That's -a lot of bytes through a potentially small pipe. -2) Relays need to hold open TCP connections to most other relays. -3) Clients need to learn the whole networkstatus. Even using v3, as -the network grows that will become unwieldy. -4) Dir mirrors need to mirror all the relay descriptors; eventually this -will get big too. - -Here's my plan. - --------------------------------------------------------------------- - -Piece one: download O(1) descriptors rather than O(n) descriptors. - -We need to change our circuit extend protocol so it fetches a relay -descriptor at every 'extend' operation: - - Client fetches networkstatus, picks guards, connects to one. - - Client picks middle hop out of networkstatus, asks guard for - its descriptor, then extends to it. - - Clients picks exit hop out of networkstatus, asks middle hop - for its descriptor, then extends to it. Done. - -The client needs to ask for the descriptor even if it already has a -copy, because otherwise we leak too much. Also, the descriptor needs to -be padded to some large (but not too large) size to prevent the middle -hops from guessing about it. - -The first step towards this is to instrument the current code to see -how much of a win this would actually be -- I am guessing it is already -a win even with the current number of descriptors. - -We also would need to assign the 'Exit' flag more usefully, and make -clients pay attention to it when picking their last hop, since they -don't actually know the exit policies of the relays they're choosing from. 
- -We also need to think harder about other implications -- for example, -a relay with a tiny exit policy won't get the Exit flag, and thus won't -ever get picked as an exit relay. Plus, our "enclave exit" model is out -the window unless we figure out a cool trick. - -More generally, we'll probably want to compress the descriptors that we -send back; maybe 8k is a good upper bound? I wonder if we could ask for -several descriptors, and bundle back all of the ones that fit in the 8k? - -We'd also want to put the load balancing weights into the networkstatus, -so clients can choose fast nodes more often without needing to see the -descriptors. This is a good opportunity for the authorities to be able -to put "more accurate" weights in if they learn to detect attacks. It -also means we should consider running automated audits to make sure the -authorities aren't trying to snooker everybody. - -I'm aiming to get Peter Palfrader to tackle this problem in mid 2008, -but I bet he could use some help. - --------------------------------------------------------------------- - -Piece two: inter-relay communication uses UDP - -If relays send packets to/from other relays via UDP, they don't need a -new descriptor for each such link. Thus we'll still need to keep state -for each link, but we won't max out on sockets. - -Clearly a lot more work needs to be done here. Ian Goldberg has a student -who has been working on it, and if all goes well we'll be chipping in -some funding to continue that. Also, Camilo Viecco has been doing his -PhD thesis on it. - --------------------------------------------------------------------- - -Piece three: networkstatus documents get partitioned - -While the authorities should be expected to be able to handle learning -about all the relays, there's no reason the clients or the mirrors need -to. Authorities should put a cap on the number of relays listed in a -single networkstatus, and split them when they get too big. 
- -We'd need a good way to have each authority come to the same conclusion -about which partition a given relay goes into. - -Directory mirrors would then mirror all the relay descriptors in their -partition. This is compatible with 'piece one' above, since clients in -a given partition will only ask about descriptors in that partition. - -More complex versions of this design would involve overlapping partitions, -but that would seem to start contradicting other parts of this proposal -right quick. - -Nobody is working on this piece yet. It's hard to say when we'll need -it, but it would be nice to have some more thought on it before the week -that we need it. - --------------------------------------------------------------------- - diff --git a/doc/spec/proposals/ideas/xxx-hide-platform.txt b/doc/spec/proposals/ideas/xxx-hide-platform.txt deleted file mode 100644 index ad19fb1fd4..0000000000 --- a/doc/spec/proposals/ideas/xxx-hide-platform.txt +++ /dev/null @@ -1,37 +0,0 @@ -Filename: xxx-hide-platform.txt -Title: Hide Tor Platform Information -Author: Jacob Appelbaum -Created: 24-July-2008 -Status: Draft - - - Hiding Tor Platform Information - -0.0 Introduction - -The current Tor program publishes its specific Tor version and related OS -platform information. This information could be misused by an attacker. - -0.1 Current Implementation - -Currently, the Tor binary sends data that looks like the following: - - Tor 0.2.0.26-rc (r14597) on Darwin Power Macintosh - Tor 0.1.2.19 on Windows XP Service Pack 3 [workstation] {terminal services, - single user} - -1.0 Suggested changes - -It would be useful to allow a user to configure the disclosure of such -information. Such a change would be an option in the torrc file like so: - - HidePlatform Yes - -1.1 Suggested default behavior in the future - -If a user would like to disclose this information, they could configure their -Tor to do so. 
- - HidePlatform No - - diff --git a/doc/spec/proposals/ideas/xxx-pluggable-transport.txt b/doc/spec/proposals/ideas/xxx-pluggable-transport.txt deleted file mode 100644 index 53ba9c630b..0000000000 --- a/doc/spec/proposals/ideas/xxx-pluggable-transport.txt +++ /dev/null @@ -1,312 +0,0 @@ -Filename: xxx-pluggable-transport.txt -Title: Pluggable transports for circumvention -Author: Jacob Appelbaum, Nick Mathewson -Created: 15-Oct-2010 -Status: Draft - -Overview - - This proposal describes a way to decouple protocol-level obfuscation - from the core Tor protocol in order to better resist client-bridge - censorship. Our approach is to specify a means to add pluggable - transport implementations to Tor clients and bridges so that they can - negotiate a superencipherment for the Tor protocol. - -Scope - - This is a document about transport plugins; it does not cover - discovery improvements, or bridgedb improvements. While these - requirements might be solved by a program that also functions as a - transport plugin, this proposal only covers the requirements and - operation of transport plugins. - -Motivation - - Frequently, people want to try a novel circumvention method to help - users connect to Tor bridges. Some of these methods are already - pretty easy to deploy: if the user knows an unblocked VPN or open - SOCKS proxy, they can just use that with the Tor client today. - - Less easy to deploy are methods that require participation by both the - client and the bridge. In order of increasing sophistication, we - might want to support: - - 1. A protocol obfuscation tool that transforms the output of a TLS - connection into something that looks like HTTP as it leaves the - client, and back to TLS as it arrives at the bridge. - 2. An additional authentication step that a client would need to - perform for a given bridge before being allowed to connect. - 3. 
An information passing system that uses a side-channel in some - existing protocol to convey traffic between a client and a bridge - without the two of them ever communicating directly. - 4. A set of clients to tunnel client->bridge traffic over an existing - large p2p network, such that the bridge is known by an identifier - in that network rather than by an IP address. - - We could in theory support these almost fine with Tor as it stands - today: every Tor client can take a SOCKS proxy to use for its outgoing - traffic, so a suitable client proxy could handle the client's traffic - and connections on its behalf, while a corresponding program on the - bridge side could handle the bridge's side of the protocol - transformation. Nevertheless, there are some reasons to add support - for transportation plugins to Tor itself: - - 1. It would be good for bridges to have a standard way to advertise - which transports they support, so that clients can have multiple - local transport proxies, and automatically use the right one for - the right bridge. - - 2. There are some changes to our architecture that we'll need for a - system like this to work. For testing purposes, if a bridge blocks - off its regular ORPort and instead has an obfuscated ORPort, the - bridge authority has no way to test it. Also, unless the bridge - has some way to tell that the bridge-side proxy at 127.0.0.1 is not - the origin of all the connections it is relaying, it might decide - that there are too many connections from 127.0.0.1, and start - paring them down to avoid a DoS. - - 3. Censorship and anticensorship techniques often evolve faster than - the typical Tor release cycle. As such, it's a good idea to - provide ways to test out new anticensorship mechanisms on a more - rapid basis. - - 4. Transport obfuscation is a relatively distinct problem - from the other privacy problems that Tor tries to solve, and it - requires a fairly distinct skill-set from hacking the rest of Tor. 
- By decoupling transport obfuscation from the Tor core, we hope to - encourage people working on transport obfuscation who would - otherwise not be interested in hacking Tor. - - 5. Finally, we hope that defining a generic transport obfuscation plugin - mechanism will be useful to other anticensorship projects. - -Non-Goals - - We're not going to talk about automatic verification of plugin - correctness and safety via sandboxing, proof-carrying code, or - whatever. - - We need to do more with discovery and distribution, but that's not - what this proposal is about. We're pretty convinced that the problems - are sufficiently orthogonal that we should be fine so long as we don't - preclude a single program from implementing both transport and - discovery extensions. - - This proposal is not about what transport plugins are the best ones - for people to write. We do, however, make some general - recommendations for plugin authors in an appendix. - - We've considered issues involved with completely replacing Tor's TLS - with another encryption layer, rather than layering it inside the - obfuscation layer. We describe how to do this in an appendix to the - current proposal, though we are not currently sure whether it's a good - idea to implement. - - We deliberately reject any design that would involve linking more code - into Tor's process space. - -Design overview - - To write a new transport protocol, an implementer must provide two - pieces: a "Client Proxy" to run at the initiator side, and a "Server - Proxy" to run at the server side. These two pieces may or may not be - implemented by the same program. - - Each client may run any number of Client Proxies. Each one acts like - a SOCKS proxy that accepts connections on localhost. Each one - runs on a different port, and implements one or more transport - methods. If the protocol has any parameters, they are passed from Tor - inside the regular username/password parts of the SOCKS protocol.
- - Bridges (and maybe relays) may run any number of Server Proxies: these - programs provide an interface like stunnel-server (or whatever the - option is): they get connections from the network (typically by - listening for connections on the network) and relay them to the - Bridge's real ORPort. - - To configure one of these programs, it should be sufficient simply to - list it in your torrc. The program tells Tor which transports it - provides. The Tor consensus should carry a new approved version number that - is specific for pluggable transport; this will allow Tor to know when a - particular transport is known to be unsafe or non-functional. - - Bridges (and maybe relays) report in their descriptors which transport - protocols they support. This information can be copied into bridge - lines. Bridges using a transport protocol may have multiple bridge - lines. - - Any methods that are wildly successful, we can bake into Tor. - -Specifications: Client behavior - - Bridge lines can now follow the extended format "bridge method - address:port [[keyid=]id-fingerprint] [k=v] [k=v] [k=v]". To connect - to such a bridge, a client must open a local connection to the SOCKS - proxy for "method", and ask it to connect to address:port. If - [id-fingerprint] is provided, it should expect the public identity key - on the TLS connection to match the digest provided in - [id-fingerprint]. If any [k=v] items are provided, they are - configuration parameters for the proxy: Tor should separate them with - semicolons and put them in the user and password fields of the request, - splitting them across the fields as necessary. If a key or value - must contain a semicolon or a backslash, it is escaped with a - backslash. - - The "id-fingerprint" field is always provided in a field named - "keyid", if it was given. Method names must be C identifiers.
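The parameter encoding described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not part of the proposal: the function names are hypothetical, the 255-byte limit per field is taken from the one-byte length fields of SOCKS5 username/password authentication, and the choice to fill the username first and overflow into the password is an assumption, since the proposal only says to split "as necessary".

```python
SOCKS5_FIELD_MAX = 255  # one-byte length field in SOCKS5 user/pass auth


def escape(text: str) -> str:
    """Backslash-escape backslashes and semicolons, as the proposal requires."""
    # Escape backslashes first so the ones added for ';' are not doubled.
    return text.replace("\\", "\\\\").replace(";", "\\;")


def encode_bridge_args(args: dict, field_max: int = SOCKS5_FIELD_MAX):
    """Join k=v items with semicolons and split them over two SOCKS fields.

    Assumption: the username is filled first and any overflow goes into
    the password; the proposal does not specify the split.
    """
    joined = ";".join(f"{escape(k)}={escape(v)}" for k, v in args.items())
    data = joined.encode("utf-8")
    if len(data) > 2 * field_max:
        raise ValueError("parameters do not fit in the two SOCKS fields")
    return data[:field_max], data[field_max:]
```

For the bridge line parameters "rocks=20 height=5.6m", this yields the username "rocks=20;height=5.6m" and an empty password. How a proxy should treat an empty field is not addressed by the proposal.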
- - Example: if the bridge line is "bridge trebuchet www.example.com:3333 - rocks=20 height=5.6m" AND if the Tor client knows that the - 'trebuchet' method is provided by a SOCKS5 proxy on - 127.0.0.1:19999, the client should connect to that proxy, ask it to - connect to www.example.com:3333, and provide the string - "rocks=20;height=5.6m" as the username, the password, or split - across the username and password. - - There are two ways to tell Tor clients about protocol proxies: - external proxies and managed proxies. An external proxy is configured - with "ClientTransportPlugin trebuchet socks5 127.0.0.1:9999". This - tells Tor that another program is already running to handle - 'trebuchet' connections, and Tor doesn't need to worry about it. A - managed proxy is configured with "ClientTransportPlugin trebuchet - exec /usr/libexec/tor-proxies/trebuchet [options]", and tells Tor to launch - an external program on-demand to provide a socks proxy for 'trebuchet' - connections. The Tor client only launches one instance of each - external program, even if the same executable is listed for more than - one method. - - The same program can implement a managed or an external proxy: it just - needs to take an argument saying which one to be. - -Client proxy behavior - - When launched from the command-line by a Tor client, a transport - proxy needs to tell Tor which methods and ports it supports. It does - this by printing one or more CMETHOD: lines to its stdout. These look - like - - CMETHOD: trebuchet SOCKS5 127.0.0.1:19999 ARGS:rocks,height \ - OPT-ARGS:tensile-strength - - The ARGS field lists mandatory parameters that must appear in every - bridge line for this method. The OPT-ARGS field lists optional - parameters. If no ARGS or OPT-ARGS field is provided, Tor should not - check the parameters in bridge lines for this method. - - The proxy should print a single "METHODS: DONE" line after it is - finished telling Tor about the methods it provides.
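As a rough illustration of the startup handshake described above, the following Python sketch reads a managed proxy's stdout lines until "METHODS: DONE" and collects the CMETHOD entries. The class and function names are hypothetical, the field layout is inferred from the single example line in the proposal, and the draft leaves versioning open, so this should be read as a sketch and not as a reference parser.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class ClientMethod:
    name: str
    socks_version: str
    host: str
    port: int
    args: Optional[List[str]] = None      # None means "do not check"
    opt_args: Optional[List[str]] = None


def parse_proxy_output(lines: Iterable[str]) -> List[ClientMethod]:
    """Collect CMETHOD lines until the proxy prints 'METHODS: DONE'."""
    methods = []
    for raw in lines:
        line = raw.strip()
        if line == "METHODS: DONE":
            return methods
        keyword, sep, rest = line.partition(":")
        if not sep or keyword != "CMETHOD":
            continue  # unrecognized "KEYWORD:" lines are ignored
        fields = rest.split()
        if len(fields) < 3:
            raise ValueError(f"malformed CMETHOD line: {line!r}")
        name, version, addrport = fields[:3]
        host, _, port = addrport.rpartition(":")
        method = ClientMethod(name, version, host, int(port))
        for extra in fields[3:]:
            if extra.startswith("ARGS:"):
                method.args = extra[len("ARGS:"):].split(",")
            elif extra.startswith("OPT-ARGS:"):
                method.opt_args = extra[len("OPT-ARGS:"):].split(",")
        methods.append(method)
    raise ValueError("proxy output ended before METHODS: DONE")
```

Leaving `args` and `opt_args` as `None` when neither field is present mirrors the rule that Tor should then not check bridge-line parameters for that method.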
- - The transport proxy MUST exit cleanly when it receives a SIGTERM from - Tor. - - The Tor client MUST ignore lines beginning with a keyword and a colon - if it does not recognize the keyword. - - In the future, if we need a control mechanism, we can use the - stdin/stdout from Tor to the transport proxy. - - A transport proxy MUST handle SOCKS connect requests using the SOCKS - version it advertises. - - Tor clients SHOULD NOT use any method from a client proxy unless it - is both listed as a possible method for that proxy in torrc, and it - is listed by the proxy as a method it supports. - - [XXXX say something about versioning.] - -Server behavior - - Server proxies are configured similarly to client proxies. - - - -Server proxy behavior - - - - [so, we can have this work like client proxies, where the bridge - launches some programs, and they tell the bridge, "I am giving you - method X with parameters Y"? Do you have to take all the methods? If - not, which do you specify?] - - [Do we allow programs that get started independently?] - - [We'll need to figure out how this works with port forwarding. Is - port forwarding the bridge's problem, the proxy's problem, or some - combination of the two?] - - [If we're using the bridge authority/bridgedb system for distributing - bridge info, the right place to advertise bridge lines is probably - the extrainfo document. We also need a way to tell the bridge - authority "don't give out a default bridge line for me"] - -Server behavior - -Bridge authority behavior - -Implementation plan - - Turn this into a draft proposal - - Circulate and discuss on or-dev. - - We should ship a couple of null plugin implementations in one or two - popular, portable languages so that people get an idea of how to - write the stuff. - - 1. We should have one that's just a proof of concept that does - nothing but transfer bytes back and forth. - - 1. We should not do a rot13 one. - - 2. 
We should implement a basic proxy that does not transform the bytes at all - - 1. We should implement DNS or HTTP using other software (as goodell - did years ago with DNS) as an example of wrapping existing code into - our plugin model. - - 2. The obfuscated-ssh superencipherment is pretty trivial and pretty - useful. It makes the protocol stringwise unfingerprintable. - - 1. Nick needs to be told firmly not to bikeshed the obfuscated-ssh - superencipherment too badly - - 1. Go ahead, bikeshed my day - - 1. If we do a raw-traffic proxy, openssh tunnels would be the logical choice. - -Appendix: recommendations for transports - - Be free/open-source software. Also, if you think your code might - someday do so well at circumvention that it should be implemented - inside Tor, it should use the same license as Tor. - - Use libraries that Tor already requires. (You can rely on openssl and - libevent being present if current Tor is present.) - - Be portable: most Tor users are on Windows, and most Tor developers - are not, so designing your code for just one of these platforms will - make it either get a small userbase, or poor auditing. - - Think secure: if your code is in a C-like language, and it's hard to - read it and become convinced it's safe then, it's probably not safe. - - Think small: we want to minimize the bytes that a Windows user needs - to download for a transport client. - - Specify: if you can't come up with a good explanation - - Avoid security-through-obscurity if possible. Specify. - - Resist trivial fingerprinting: There should be no good string or regex - to search for to distinguish your protocol from protocols permitted by - censors. - - Imitate a real profile: There are many ways to implement most - protocols -- and in many cases, most possible variants of a given - protocol won't actually exist in the wild. - -Appendix: Raw-traffic transports - - This section describes an optional extension to the proposal above. 
- We are not sure whether it is a good idea. diff --git a/doc/spec/proposals/ideas/xxx-port-knocking.txt b/doc/spec/proposals/ideas/xxx-port-knocking.txt deleted file mode 100644 index 85c27ec52d..0000000000 --- a/doc/spec/proposals/ideas/xxx-port-knocking.txt +++ /dev/null @@ -1,91 +0,0 @@ -Filename: xxx-port-knocking.txt -Title: Port knocking for bridge scanning resistance -Author: Jacob Appelbaum -Created: 19-April-2009 -Status: Draft - - Port knocking for bridge scanning resistance - -0.0 Introduction - -This document is a collection of ideas relating to improving scanning -resistance for private bridge relays. This is intented to stop opportunistic -network scanning and subsequent discovery of private bridge relays. - - -0.1 Current Implementation - -Currently private bridges are only hidden by their obscurity. If you know -a bridge ip address, the bridge can be detected trivially and added to a block -list. - -0.2 Configuring an external port knocking program to control the firewall - -It is currently possible for bridge operators to configure a port knocking -daemon that controls access to the incoming OR port. This is currently out of -scope for Tor and Tor configuration. This process requires the firewall to know -the current nodes in the Tor network. - -1.0 Suggested changes - -Private bridge operators should be able to configure a method of hiding their -relay. Only authorized users should be able to communicate with the private -bridge. This should be done with Tor and if possible without the help of the -firewall. It should be possible for a Tor user to enter a secret key into -Tor or optionally Vidalia on a per bridge basis. This secret key should be -used to authenticate the bridge user to the private bridge. - -1.x Issues with low ports and bind() for ORPort - -Tor opens low numbered ports during startup and then drops privileges. It is -no longer possible to rebind to those lower ports after they are closed. 
- -1.x Issues with OS level packet filtering - -Tor does not know about any OS level packet filtering. Currently there are no -packet filters that understand the Tor network in real time. - -1.x Possible partitioning of users by bridge operator - -Depending on implementation, it may be possible for bridge operators to -uniquely identify users. This appears to be a general bridge issue when a -bridge operator uniquely deploys bridges per user. - -2.0 Implementation ideas - -This is a suggested set of methods for port knocking. - -2.x Using SPA port knocking - -Single Packet Authentication port knocking encodes all required data into a -single UDP packet. Improperly formatted packets may be simply discarded. -Properly formatted packets should be processed and appropriate actions taken. - -2.x Using DNS as a transport for SPA - -It should be possible for Tor to bind to port 53 at startup and merely drop all -packets that are not valid. UDP does not require a response and invalid packets -will not trigger a response from Tor. With base32 encoding it should be -possible to encode SPA as valid DNS requests. This should allow use of the -public DNS infrastructure for authorization requests if desired. - -2.x Ghetto firewalling with opportunistic connection closing - -Until a user has authenticated with Tor, Tor only has a UDP listener. This -listener should never send data in response; it should only open an ORPort -when a user has successfully authenticated. After a user has authenticated -with Tor to open an ORPort, only users who have authenticated will be able -to use it. All other users as identified by their IP address will have their -connection closed before any data is sent or received. This should be -accomplished with an access policy. By default, the access policy should block -all access to the ORPort. - -2.x Timing and reset of access policies - -Access to the ORPort is sensitive.
The bridge should remove any exceptions -to its access policy regularly when the ORPort is unused. Valid users should -reauthenticate if they do not use the ORPort within a given time frame. - -2.x Additional considerations - -There are many. A format of the packet and the crypto involved is a good start. diff --git a/doc/spec/proposals/ideas/xxx-rate-limit-exits.txt b/doc/spec/proposals/ideas/xxx-rate-limit-exits.txt deleted file mode 100644 index 81fed20af8..0000000000 --- a/doc/spec/proposals/ideas/xxx-rate-limit-exits.txt +++ /dev/null @@ -1,63 +0,0 @@ - -1. Overview - - We should rate limit the volume of stream creations at exits: - -2.1. Per-circuit limits - - If a given circuit opens more than N streams in X seconds, further - stream requests over the next Y seconds should fail with the reason - 'resourcelimit'. Clients will automatically notice this and switch to - a new circuit. - - The goal is to limit the effects of port scans on a given exit relay, - so the relay's ISP won't get hassled as much. - - First thoughts for parameters would be N=100 streams in X=5 seconds - causes 30 seconds of fails; and N=300 streams in X=30 seconds causes - 30 seconds of fails. - - We could simplify by, instead of having a "for 30 seconds" parameter, - just marking the circuit as forever failing new requests. (We don't want - to just close the circuit because it may still have open streams on it.) - -2.2. Per-destination limits - - If a given circuit opens more than N1 streams in X seconds to a single - IP address, or all the circuits combined open more than N2 streams, - then we should fail further attempts to reach that address for a while. - - The goal is to limit the abuse that Tor exit relays can dish out - to a single target either for socket DoS or for web crawling, in - the hopes of a) not triggering their automated defenses, and b) not - making them upset at Tor. 
Hopefully these self-imposed bans will be - much shorter-lived than bans or barriers put up by the websites. - -3. Issues - -3.1. Circuit-creation overload - - Making clients move to new circuits more often will cause more circuit - creation requests. - -3.2. How to pick the parameters? - - If we pick the numbers too low, then popular sites are effectively - cut out of Tor. If we pick them too high, we don't do much good. - - Worse, picking them wrong isn't easy to fix, since the deployed Tor - servers will ship with a certain set of numbers. - - We could put numbers (or "general settings") in the networkstatus - consensus, and Tor exits would adapt more dynamically. - - We could also have a local config option about how aggressive this - server should be with its parameters. - -4. Client-side limitations - - Perhaps the clients should have built-in rate limits too, so they avoid - harrassing the servers by default? - - Tricky if we want to get Tor clients in use at large enclaves. - diff --git a/doc/spec/proposals/ideas/xxx-using-spdy.txt b/doc/spec/proposals/ideas/xxx-using-spdy.txt deleted file mode 100644 index d733a84b69..0000000000 --- a/doc/spec/proposals/ideas/xxx-using-spdy.txt +++ /dev/null @@ -1,143 +0,0 @@ -Filename: xxx-using-spdy.txt -Title: Using the SPDY protocol to improve Tor performance -Author: Steven J. Murdoch -Created: 03-Feb-2010 -Status: Draft -Target: - -1. Overview - - The SPDY protocol [1] is an alternative method for transferring - web content over TCP, designed to improve efficiency and - performance. A SPDY-aware browser can already communicate with - a SPDY-aware web server over Tor, because this only requires a TCP - stream to be set up. However, a SPDY-aware browser cannot - communicate with a non-SPDY-aware web server. This proposal - outlines how Tor could support this latter case, and why it - may be good for performance. - -2. 
Motivation - - About 90% of Tor traffic, by connection, is HTTP [2], but - users report subjective performance to be poor. It would - therefore be desirable to improve this situation. SPDY was - designed to offer better performance than HTTP, in - high-latency and/or low-bandwidth situations, and is therefore - an option worth examining. - - If a user wishes to access a SPDY-enabled web server over Tor, - all they need to do is to configure their SPDY-enabled browser - (e.g. Google Chrome) to use Tor. However, there are few - SPDY-enabled web servers, and even if there was high demand - from Tor users, there would be little motivation for server - operators to upgrade, for the benefit of only a small - proportion of their users. - - The motivation of this proposal is to allow only the user to - install a SPDY-enabled browser, and permit web servers to - remain unmodified. Essentially, Tor would incorporate a proxy - on the exit node, which communicates SPDY to the web browser - and normal HTTP to the web server. This proxy would translate - between the two transport protocols, and possibly perform - other optimizations. - - SPDY currently offers five optimizations: - - 1) Multiplexed streams: - An unlimited number of resources can be transferred - concurrently, over a single TCP connection. - - 2) Request prioritization: - The client can set a priority on each resource, to assist - the server in re-ordering responses. - - 3) Compression: - Both HTTP header and resource content can be compressed. - - 4) Server push: - The server can offer the client resources which have not - been requested, but which the server believes will be. - - 5) Server hint: - The server can suggest that the client request further - resources, before the main content is transferred. - - Tor currently effectively implements (1), by being able to put - multiple streams on one circuit. SPDY however requires fewer - round-trips to do the same. The other features are not - implemented by Tor. 
Therefore it is reasonable to expect that - a HTTP <-> SPDY proxy may improve Tor performance, by some - amount. - - The consequences on caching need to be considered carefully. - Most of the optimizations SPDY offers have no effect because - the existing HTTP cache control headers are transmitted without - modification. Server push is more problematic, because here - the server may push a resource that the client already has. - -3. Design outline - - One way to implement the SPDY proxy is for Tor exit nodes to - advertise this capability in their descriptor. The OP would - then preferentially select these nodes when routing streams - destined for port 80. - - Then, rather than sending the usual RELAY_BEGIN cell, the OP - would send a RELAY_BEGIN_TRANSFORMED cell, with a parameter to - indicate that the exit node should translate between SPDY and - HTTP. The rest of the connection process would operate as - usual. - - There would need to be some way of elegantly handling non-HTTP - traffic which goes over port 80. - -4. Implementation status - - SPDY is under active development and both the specification - and implementations are in a state of flux. Initial - experiments with Google Chrome in SPDY-mode and server - libraries indicate that more work is needed before they are - production-ready. There is no indication that browsers other - than Google Chrome will support SPDY (and no official - statement as to whether Google Chrome will eventually enable - SPDY by default). - - Implementing a full SPDY proxy would be non-trivial. Stream - multiplexing and compression are supported by existing - libraries and would be fairly simple to implement. Request - prioritization would require some form of caching on the - proxy-side. Server push and server hint would require content - parsing to identify resources which should be treated - specially. - -5. 
Security and policy implications - - A SPDY proxy would be a significant amount of code, and may - pull in external libraries. This code will process potentially - malicious data, both at the SPDY and HTTP sides. This proposal - therefore increases the risk that exit nodes will be - compromised by exploiting a bug in the proxy. - - This proposal would also be the first way in which Tor is - modifying TCP stream data. Arguably this is still meta-data - (HTTP headers), but there may be some concern that Tor should - not be doing this. - - Torbutton only works with Firefox, but SPDY only works with - Google Chrome. We should be careful not to recommend that - users adopt a browser which harms their privacy in other ways. - -6. Open questions: - - - How difficult would this be to implement? - - - How much performance improvement would it actually result in? - - - Is there some way to rapidly develop a prototype which would - answer the previous question? - -[1] SPDY: An experimental protocol for a faster web - http://dev.chromium.org/spdy/spdy-whitepaper -[2] Shining Light in Dark Places: Understanding the Tor Network Damon McCoy, - Kevin Bauer, Dirk Grunwald, Tadayoshi Kohno, Douglas Sicker - http://www.cs.washington.edu/homes/yoshi/papers/Tor/PETS2008_37.pdf diff --git a/doc/spec/proposals/ideas/xxx-what-uses-sha1.txt b/doc/spec/proposals/ideas/xxx-what-uses-sha1.txt deleted file mode 100644 index b3ca3eea5a..0000000000 --- a/doc/spec/proposals/ideas/xxx-what-uses-sha1.txt +++ /dev/null @@ -1,247 +0,0 @@ -Filename: xxx-what-uses-sha1.txt -Title: Where does Tor use SHA-1 today? -Authors: Nick Mathewson, Marian -Created: 30-Dec-2008 -Status: Meta - - -Introduction: - - Tor uses SHA-1 as a message digest. SHA-1 is showing its age: - theoretical attacks for finding collisions against it get better - every year or two, and it will likely be broken in practice before - too long. 
- - According to smart crypto people, the SHA-2 functions (SHA-256, etc) - share too much of SHA-1's structure to be very good. RIPEMD-160 is - also based on flawed past hashes. Some people think other hash - functions (e.g. Whirlpool and Tiger) are not as bad; most of these - have not seen enough analysis to be used yet. - - Here is a 2006 paper about hash algorithms. - http://www.sane.nl/sane2006/program/final-papers/R10.pdf - - (Todo: Ask smart crypto people.) - - By 2012, the NIST SHA-3 competition will be done, and with luck we'll - have something good to switch too. But it's probably a bad idea to - wait until 2012 to figure out _how_ to migrate to a new hash - function, for two reasons: - 1) It's not inconceivable we'll want to migrate in a hurry - some time before then. - 2) It's likely that migrating to a new hash function will - require protocol changes, and it's easiest to make protocol - changes backward compatible if we lay the groundwork in - advance. It would suck to have to break compatibility with - a big hard-to-test "flag day" protocol change. - - This document attempts to list everything Tor uses SHA-1 for today. - This is the first step in getting all the design work done to switch - to something else. - - This document SHOULD NOT be a clearinghouse of what to do about our - use of SHA-1. That's better left for other individual proposals. - - -Why now? - - The recent publication of "MD5 considered harmful today: Creating a - rogue CA certificate" by Alexander Sotirov, Marc Stevens, Jacob - Appelbaum, Arjen Lenstra, David Molnar, Dag Arne Osvik, and Benne de - Weger has reminded me that: - - * You can't rely on theoretical attacks to stay theoretical. - * It's quite unpleasant when theoretical attacks become practical - and public on days you were planning to leave for vacation. - * Broken hash functions (which SHA-1 is not quite yet AFAIU) - should be dropped like hot potatoes. Failure to do so can make - one look silly. 
- - -Triage - - How severe are these problems? Let's divide them into these - categories, where H(x) is the SHA-1 hash of x: - PREIMAGE -- find any x such that a H(x) has a chosen value - -- A SHA-1 usage that only depends on preimage - resistance - * Also SECOND PREIMAGE. Given x, find a y not equal to - x such that H(x) = H(y) - COLLISION<role> -- A SHA-1 usage that depends on collision - resistance, but the only party who could mount a - collision-based attack is already in a trusted role - (like a distribution signer or a directory authority). - COLLISION -- find any x and y such that H(x) = H(y) -- A - SHA-1 usage that depends on collision resistance - and doesn't need the attacker to have any special keys. - - There is no need to put much effort into fixing PREIMAGE and SECOND - PREIMAGE usages in the near-term: while there have been some - theoretical results doing these attacks against SHA-1, they don't - seem to be close to practical yet. To fix COLLISION<code-signing> - usages is not too important either, since anyone who has the key to - sign the code can mount far worse attacks. It would be good to fix - COLLISION<authority> usages, since we try to resist bad authorities - to a limited extent. The COLLISION usages are the most important - to fix. - - Kelsey and Schneier published a theoretical second preimage attack - against SHA-1 in 2005, so it would be a good idea to fix PREIMAGE - and SECOND PREIMAGE usages after fixing COLLISION usages or where fixes - require minimal effort. - - http://www.schneier.com/paper-preimages.html - - Additionally, we need to consider the impact of a successful attack - in each of these cases. SHA-1 collisions are still expensive even - if recent results are verified, and anybody with the resources to - compute one also has the resources to mount a decent Sybil attack. - - Let's be pessimistic, and not assume that producing collisions of - a given format is actually any harder than producing collisions at - all. 
- - -What Tor uses hashes for today: - -1. Infrastructure. - - A. Our X.509 certificates are signed with SHA-1. - COLLISION - B. TLS uses SHA-1 (and MD5) internally to generate keys. - PREIMAGE? - * At least breaking SHA-1 and MD5 simultaneously is - much more difficult than breaking either - independently. - C. Some of the TLS ciphersuites we allow use SHA-1. - PREIMAGE? - D. When we sign our code with GPG, it might be using SHA-1. - COLLISION<code-signing> - * GPG 1.4 and up have writing support for SHA-2 hashes. - This blog has help for converting: - http://www.schwer.us/journal/2005/02/19/sha-1-broken-and-gnupg-gpg/ - E. Our GPG keys might be authenticated with SHA-1. - COLLISION<code-signing-key-signing> - F. OpenSSL's random number generator uses SHA-1, I believe. - PREIMAGE - -2. The Tor protocol - - A. Everything we sign, we sign using SHA-1-based OAEP-MGF1. - PREIMAGE? - B. Our CREATE cell format uses SHA-1 for: OAEP padding. - PREIMAGE? - C. Our EXTEND cells use SHA-1 to hash the identity key of the - target server. - COLLISION - D. Our CREATED cells use SHA-1 to hash the derived key data. - ?? - E. The data we use in CREATE_FAST cells to generate a key is the - length of a SHA-1. - NONE - F. The data we send back in a CREATED/CREATED_FAST cell is the length - of a SHA-1. - NONE - G. We use SHA-1 to derive our circuit keys from the negotiated g^xy - value. - NONE - H. We use SHA-1 to derive the digest field of each RELAY cell, but that's - used more as a checksum than as a strong digest. - NONE - -3. Directory services - - [All are COLLISION or COLLISION<authority> ] - - A. All signatures are generated on the SHA-1 of their corresponding - documents, using PKCS1 padding. - * In dir-spec.txt, section 1.3, it states, - "SIGNATURE" Object contains a signature (using the signing key) - of the PKCS1-padded digest of the entire document, taken from - the beginning of the Initial item, through the newline after - the Signature Item's keyword and its arguments."
- So our attacker, Malcom, could generate a collision for the hash - that is signed. Thus, a second pre-image attack is possible. - Vulnerable to regular collision attack only if key is stolen. - If the key is stolen, Malcom could distribute two different - copies of the document which have the same hash. Maybe useful - for a partitioning attack? - B. Router descriptors identify their corresponding extra-info documents - by their SHA-1 digest. - * A third party might use a second pre-image attack to generate a - false extra-info document that has the same hash. The router - itself might use a regular collision attack to generate multiple - extra-info documents with the same hash, which might be useful - for a partitioning attack. - C. Fingerprints in router descriptors are taken using SHA-1. - * The fingerprint must match the public key. Not sure what would - happen if two routers had different public keys but the same - fingerprint. There could perhaps be unpredictable behaviour. - D. In router descriptors, routers in the same "Family" may be listed - by server nicknames or hexdigests. - * Does not seem critical. - E. Fingerprints in authority certs are taken using SHA-1. - F. Fingerprints in dir-source lines of votes and consensuses are taken - using SHA-1. - G. Networkstatuses refer to routers identity keys and descriptors by their - SHA-1 digests. - H. Directory-signature lines identify which key is doing the signing by - the SHA-1 digests of the authority's signing key and its identity key. - I. The following items are downloaded by the SHA-1 of their contents: - XXXX list them - J. The following items are downloaded by the SHA-1 of an identity key: - XXXX list them too. - -4. The rendezvous protocol - - A. Hidden servers use SHA-1 to establish introduction points on relays, - and relays use SHA-1 to check incoming introduction point - establishment requests. - B. Hidden servers use SHA-1 in multiple places when generating hidden - service descriptors. 
- * The permanent-id is the first 80 bits of the SHA-1 hash of the - public key - ** time-period performs calculations using the permanent-id - * The secret-id-part is the SHA-1 hash of the time period, the - descriptor-cookie, and replica. - * Hash of introduction point's identity key. - C. Hidden servers performing basic-type client authorization for their - services use SHA-1 when encrypting introduction points contained in - hidden service descriptors. - D. Hidden service directories use SHA-1 to check whether a given hidden - service descriptor may be published under a given descriptor - identifier or not. - E. Hidden servers use SHA-1 to derive .onion addresses of their - services. - * What's worse, it only uses the first 80 bits of the SHA-1 hash. - However, the rend-spec.txt says we aren't worried about arbitrary - collisions? - F. Clients use SHA-1 to generate the current hidden service descriptor - identifiers for a given .onion address. - G. Hidden servers use SHA-1 to remember digests of the first parts of - Diffie-Hellman handshakes contained in introduction requests in order - to detect replays. See the RELAY_ESTABLISH_INTRO cell. We seem to be - taking a hash of a hash here. - H. Hidden servers use SHA-1 during the Diffie-Hellman key exchange with - a connecting client. - -5. The bridge protocol - - XXXX write me - - A. Client may attempt to query for bridges where he knows a digest - (probably SHA-1) before a direct query. - -6. The Tor user interface - - A. We log information about servers based on SHA-1 hashes of their - identity keys. - COLLISION - B. The controller identifies servers based on SHA-1 hashes of their - identity keys. - COLLISION - C. Nearly all of our configuration options that list servers allow SHA-1 - hashes of their identity keys. - COLLISION - E.
The deprecated .exit notation uses SHA-1 hashes of identity keys - COLLISION diff --git a/doc/spec/proposals/reindex.py b/doc/spec/proposals/reindex.py deleted file mode 100755 index 980bc0659f..0000000000 --- a/doc/spec/proposals/reindex.py +++ /dev/null @@ -1,117 +0,0 @@ -#!/usr/bin/python - -import re, os -class Error(Exception): pass - -STATUSES = """DRAFT NEEDS-REVISION NEEDS-RESEARCH OPEN ACCEPTED META FINISHED - CLOSED SUPERSEDED DEAD REJECTED""".split() -REQUIRED_FIELDS = [ "Filename", "Status", "Title" ] -CONDITIONAL_FIELDS = { "OPEN" : [ "Target" ], - "ACCEPTED" : [ "Target "], - "CLOSED" : [ "Implemented-In" ], - "FINISHED" : [ "Implemented-In" ] } -FNAME_RE = re.compile(r'^(\d\d\d)-.*[^\~]$') -DIR = "." -OUTFILE = "000-index.txt" -TMPFILE = OUTFILE+".tmp" - -def indexed(seq): - n = 0 - for i in seq: - yield n, i - n += 1 - -def readProposal(fn): - fields = { } - f = open(fn, 'r') - lastField = None - try: - for lineno, line in indexed(f): - line = line.rstrip() - if not line: - return fields - if line[0].isspace(): - fields[lastField] += " %s"%(line.strip()) - else: - parts = line.split(":", 1) - if len(parts) != 2: - raise Error("%s:%s: Neither field nor continuation"% - (fn,lineno)) - else: - fields[parts[0]] = parts[1].strip() - lastField = parts[0] - - return fields - finally: - f.close() - -def checkProposal(fn, fields): - status = fields.get("Status") - need_fields = REQUIRED_FIELDS + CONDITIONAL_FIELDS.get(status, []) - for f in need_fields: - if not fields.has_key(f): - raise Error("%s has no %s field"%(fn, f)) - if fn != fields['Filename']: - print `fn`, `fields['Filename']` - raise Error("Mismatched Filename field in %s"%fn) - if fields['Title'][-1] == '.': - fields['Title'] = fields['Title'][:-1] - - status = fields['Status'] = status.upper() - if status not in STATUSES: - raise Error("I've never heard of status %s in %s"%(status,fn)) - if status in [ "SUPERSEDED", "DEAD" ]: - for f in [ 'Implemented-In', 'Target' ]: - if fields.has_key(f): 
del fields[f] - -def readProposals(): - res = [] - for fn in os.listdir(DIR): - m = FNAME_RE.match(fn) - if not m: continue - if not fn.endswith(".txt"): - raise Error("%s doesn't end with .txt"%fn) - num = m.group(1) - fields = readProposal(fn) - checkProposal(fn, fields) - fields['num'] = num - res.append(fields) - return res - -def writeIndexFile(proposals): - proposals.sort(key=lambda f:f['num']) - seenStatuses = set() - for p in proposals: - seenStatuses.add(p['Status']) - - out = open(TMPFILE, 'w') - inf = open(OUTFILE, 'r') - for line in inf: - out.write(line) - if line.startswith("====="): break - inf.close() - - out.write("Proposals by number:\n\n") - for prop in proposals: - out.write("%(num)s %(Title)s [%(Status)s]\n"%prop) - out.write("\n\nProposals by status:\n\n") - for s in STATUSES: - if s not in seenStatuses: continue - out.write(" %s:\n"%s) - for prop in proposals: - if s == prop['Status']: - out.write(" %(num)s %(Title)s"%prop) - if prop.has_key('Target'): - out.write(" [for %(Target)s]"%prop) - if prop.has_key('Implemented-In'): - out.write(" [in %(Implemented-In)s]"%prop) - out.write("\n") - out.close() - os.rename(TMPFILE, OUTFILE) - -try: - os.unlink(TMPFILE) -except OSError: - pass - -writeIndexFile(readProposals()) diff --git a/doc/spec/rend-spec.txt b/doc/spec/rend-spec.txt deleted file mode 100644 index 3c14ebc662..0000000000 --- a/doc/spec/rend-spec.txt +++ /dev/null @@ -1,966 +0,0 @@ - - Tor Rendezvous Specification - -0. Overview and preliminaries - - The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL - NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and - "OPTIONAL" in this document are to be interpreted as described in - RFC 2119. - - Read - https://svn.torproject.org/svn/projects/design-paper/tor-design.html#sec:rendezvous - before you read this specification. It will make more sense. - - Rendezvous points provide location-hidden services (server - anonymity) for the onion routing network. 
With rendezvous points, - Bob can offer a TCP service (say, a webserver) via the onion - routing network, without revealing the IP of that service. - - Bob does this by anonymously advertising a public key for his - service, along with a list of onion routers to act as "Introduction - Points" for his service. He creates forward circuits to those - introduction points, and tells them about his service. To - connect to Bob, Alice first builds a circuit to an OR to act as - her "Rendezvous Point." She then connects to one of Bob's chosen - introduction points, and asks it to tell him about her Rendezvous - Point (RP). If Bob chooses to answer, he builds a circuit to her - RP, and tells it to connect him to Alice. The RP joins their - circuits together, and begins relaying cells. Alice's 'BEGIN' - cells are received directly by Bob's OP, which passes data to - and from the local server implementing Bob's service. - - Below we describe a network-level specification of this service, - along with interfaces to make this process transparent to Alice - (so long as she is using an OP). - -0.1. Notation, conventions and prerequisites - - In the specifications below, we use the same notation and terminology - as in "tor-spec.txt". The service specified here also requires the - existence of an onion routing network as specified in that file. - - H(x) is a SHA1 digest of x. - PKSign(SK,x) is a PKCS.1-padded RSA signature of x with SK. - PKEncrypt(SK,x) is a PKCS.1-padded RSA encryption of x with SK. - Public keys are all RSA, and encoded in ASN.1. - All integers are stored in network (big-endian) order. - All symmetric encryption uses AES in counter mode, except where - otherwise noted. - - In all discussions, "Alice" will refer to a user connecting to a - location-hidden service, and "Bob" will refer to a user running a - location-hidden service. - - An OP is (as defined elsewhere) an "Onion Proxy" or Tor client. 
- - An OR is (as defined elsewhere) an "Onion Router" or Tor server. - - An "Introduction point" is a Tor server chosen to be Bob's medium-term - 'meeting place'. A "Rendezvous point" is a Tor server chosen by Alice to - be a short-term communication relay between her and Bob. All Tor servers - potentially act as introduction and rendezvous points. - -0.2. Protocol outline - - 1. Bob->Bob's OP: "Offer IP:Port as public-key-name:Port". [configuration] - (We do not specify this step; it is left to the implementor of - Bob's OP.) - - 2. Bob's OP generates a long-term keypair. - - 3. Bob's OP->Introduction point via Tor: [introduction setup] - "This public key is (currently) associated to me." - - 4. Bob's OP->directory service via Tor: publishes Bob's service descriptor - [advertisement] - "Meet public-key X at introduction point A, B, or C." (signed) - - 5. Out of band, Alice receives a z.onion:port address. - She opens a SOCKS connection to her OP, and requests z.onion:port. - - 6. Alice's OP retrieves Bob's descriptor via Tor. [descriptor lookup.] - - 7. Alice's OP chooses a rendezvous point, opens a circuit to that - rendezvous point, and establishes a rendezvous circuit. [rendezvous - setup.] - - 8. Alice connects to the Introduction point via Tor, and tells it about - her rendezvous point. (Encrypted to Bob.) [Introduction 1] - - 9. The Introduction point passes this on to Bob's OP via Tor, along the - introduction circuit. [Introduction 2] - - 10. Bob's OP decides whether to connect to Alice, and if so, creates a - circuit to Alice's RP via Tor. Establishes a shared circuit. - [Rendezvous 1] - - 11. The Rendezvous point forwards Bob's confirmation to Alice's OP. - [Rendezvous 2] - - 12. Alice's OP sends begin cells to Bob's OP. [Connection] - -0.3. 
Constants and new cell types - - Relay cell types - 32 -- RELAY_COMMAND_ESTABLISH_INTRO - 33 -- RELAY_COMMAND_ESTABLISH_RENDEZVOUS - 34 -- RELAY_COMMAND_INTRODUCE1 - 35 -- RELAY_COMMAND_INTRODUCE2 - 36 -- RELAY_COMMAND_RENDEZVOUS1 - 37 -- RELAY_COMMAND_RENDEZVOUS2 - 38 -- RELAY_COMMAND_INTRO_ESTABLISHED - 39 -- RELAY_COMMAND_RENDEZVOUS_ESTABLISHED - 40 -- RELAY_COMMAND_INTRODUCE_ACK - -0.4. Version overview - - There are several parts in the hidden service protocol that have - changed over time, each of them having its own version number, whereas - other parts remained the same. The following list of potentially - versioned protocol parts should help reduce some confusion: - - - Hidden service descriptor: the binary-based v0 was the default for a - long time, and an ASCII-based v2 has been added by proposal 114. The - v0 descriptor format has been deprecated in 0.2.2.1-alpha. See 1.3. - - - Hidden service descriptor propagation mechanism: currently related to - the hidden service descriptor version -- v0 publishes to the original - hs directory authorities, whereas v2 publishes to a rotating subset - of relays with the "HSDir" flag; see 1.4 and 1.6. - - - Introduction protocol for how to generate an introduction cell: - v0 specified a nickname for the rendezvous point and assumed the - relay would know about it, whereas v2 now specifies IP address, - port, and onion key so the relay doesn't need to already recognize - it. See 1.8. - -1. The Protocol - -1.1. Bob configures his local OP. - - We do not specify a format for the OP configuration file. However, - OPs SHOULD allow Bob to provide more than one advertised service - per OP, and MUST allow Bob to specify one or more virtual ports per - service. Bob provides a mapping from each of these virtual ports - to a local IP:Port pair. - -1.2. Bob's OP establishes his introduction points. - - The first time the OP provides an advertised service, it generates - a public/private keypair (stored locally). 
- - The OP chooses a small number of Tor servers as introduction points. - The OP establishes a new introduction circuit to each introduction - point. These circuits MUST NOT be used for anything but hidden service - introduction. To establish the introduction, Bob sends a - RELAY_COMMAND_ESTABLISH_INTRO cell, containing: - - KL Key length [2 octets] - PK Bob's public key or service key [KL octets] - HS Hash of session info [20 octets] - SIG Signature of above information [variable] - - KL is the length of PK, in octets. - - To prevent replay attacks, the HS field contains a SHA-1 hash based on the - shared secret KH between Bob's OP and the introduction point, as - follows: - HS = H(KH | "INTRODUCE") - That is: - HS = H(KH | [49 4E 54 52 4F 44 55 43 45]) - (KH, as specified in tor-spec.txt, is H(g^xy | [00]) .) - - Upon receiving such a cell, the OR first checks that the signature is - correct with the included public key. If so, it checks whether HS is - correct given the shared state between Bob's OP and the OR. If either - check fails, the OR discards the cell; otherwise, it associates the - circuit with Bob's public key, and dissociates any other circuits - currently associated with PK. On success, the OR sends Bob a - RELAY_COMMAND_INTRO_ESTABLISHED cell with an empty payload. - - Bob's OP uses either Bob's public key or a freshly generated, single-use - service key in the RELAY_COMMAND_ESTABLISH_INTRO cell, depending on the - configured hidden service descriptor version. The public key is used for - v0 descriptors, the service key for v2 descriptors. In the latter case, the - service keys of all introduction points are included in the v2 hidden - service descriptor together with the other introduction point information. - The reason is that the introduction point does not need to and therefore - should not know for which hidden service it works, so as to prevent it from - tracking the hidden service's activity.
If the hidden service is configured - to publish both v0 and v2 descriptors, two separate sets of introduction - points are established. - -1.3. Bob's OP generates service descriptors. - - For versions before 0.2.2.1-alpha, Bob's OP periodically generates and - publishes a descriptor of type "V0". - - The "V0" descriptor contains: - - KL Key length [2 octets] - PK Bob's public key [KL octets] - TS A timestamp [4 octets] - NI Number of introduction points [2 octets] - Ipt A list of NUL-terminated ORs [variable] - SIG Signature of above fields [variable] - - TS is the number of seconds elapsed since Jan 1, 1970. - - The members of Ipt may be either (a) nicknames, or (b) identity key - digests, encoded in hex, and prefixed with a '$'. Clients must - accept both forms. Services must only generate the second form. - Once 0.0.9.x is obsoleted, we can drop the first form. - - [It's ok for Bob to advertise 0 introduction points. He might want - to do that if he previously advertised some introduction points, - and now he doesn't have any. -RD] - - Beginning with 0.2.0.10-alpha, Bob's OP encodes "V2" descriptors in - addition to (or instead of) "V0" descriptors. The format of a "V2" - descriptor is as follows: - - "rendezvous-service-descriptor" descriptor-id NL - - [At start, exactly once] - - Indicates the beginning of the descriptor. "descriptor-id" is a - periodically changing identifier of 160 bits formatted as 32 base32 - chars that is calculated by the hidden service and its clients. The - "descriptor-id" is calculated by performing the following operation: - - descriptor-id = - H(permanent-id | H(time-period | descriptor-cookie | replica)) - - "permanent-id" is the permanent identifier of the hidden service, - consisting of 80 bits. 
It can be calculated by computing the hash value - of the public hidden service key and truncating after the first 80 bits: - - permanent-id = H(public-key)[:10] - - Note: If Bob's OP has "stealth" authorization enabled (see Section 2.2), - it uses the client key in place of the public hidden service key. - - "H(time-period | descriptor-cookie | replica)" is the (possibly - secret) id part that is necessary to verify that the hidden service is - the true originator of this descriptor and that is therefore contained - in the descriptor, too. The descriptor ID can only be created by the - hidden service and its clients, but the "signature" below can only be - created by the service. - - "time-period" changes periodically as a function of time and - - "permanent-id". The current value for "time-period" can be calculated - using the following formula: - - time-period = (current-time + permanent-id-byte * 86400 / 256) - / 86400 - - "current-time" contains the current system time in seconds since - 1970-01-01 00:00, e.g. 1188241957. "permanent-id-byte" is the first - (unsigned) byte of the permanent identifier (which is in network - order), e.g. 143. Adding the product of "permanent-id-byte" and - 86400 (seconds per day), divided by 256, prevents "time-period" from - changing for all descriptors at the same time of the day. The result - of the overall operation is a (network-ordered) 32-bit integer, e.g. - 13753 or 0x000035B9 with the example values given above. - - "descriptor-cookie" is an optional secret password of 128 bits that - is shared between the hidden service provider and its clients. If the - descriptor-cookie is left out, the input to the hash function is 128 - bits shorter. - - "replica" denotes the number of the replica. A service publishes - multiple descriptors with different descriptor IDs in order to - distribute them to different places on the ring. - - "version" version-number NL - - [Exactly once] - - The version number of this descriptor's format. 
In this case: 2. - - "permanent-key" NL a public key in PEM format - - [Exactly once] - - The public key of the hidden service which is required to verify the - "descriptor-id" and the "signature". - - "secret-id-part" secret-id-part NL - - [Exactly once] - - The result of the following operation as explained above, formatted as - 32 base32 chars. Using this secret id part, everyone can verify that - the signed descriptor belongs to "descriptor-id". - - secret-id-part = H(time-period | descriptor-cookie | replica) - - "publication-time" YYYY-MM-DD HH:MM:SS NL - - [Exactly once] - - A timestamp when this descriptor has been created. - - "protocol-versions" version-string NL - - [Exactly once] - - A comma-separated list of recognized and permitted version numbers - for use in INTRODUCE cells; these versions are described in section - 1.8 below. - - "introduction-points" NL encrypted-string - - [At most once] - - A list of introduction points. If the optional "descriptor-cookie" is - used, this list is encrypted with AES in CTR mode with a random - initialization vector of 128 bits that is written to - the beginning of the encrypted string, and the "descriptor-cookie" as - secret key of 128 bits length. - - The string containing the introduction point data (either encrypted - or not) is encoded in base64, and surrounded with - "-----BEGIN MESSAGE-----" and "-----END MESSAGE-----". - - The unencrypted string may begin with: - - "service-authentication" auth-type auth-data NL - - [Any number] - - The service-specific authentication data can be used to perform - client authentication. This data is independent of the selected - introduction point as opposed to "intro-authentication" below. The - format of auth-data (base64-encoded or PEM format) depends on - auth-type. See section 2 of this document for details on auth - mechanisms. 
- - Subsequently, an arbitrary number of introduction point entries may - follow, each containing the following data: - - "introduction-point" identifier NL - - [At start, exactly once] - - The identifier of this introduction point: the base-32 encoded - hash of this introduction point's identity key. - - "ip-address" ip-address NL - - [Exactly once] - - The IP address of this introduction point. - - "onion-port" port NL - - [Exactly once] - - The TCP port on which the introduction point is listening for - incoming onion requests. - - "onion-key" NL a public key in PEM format - - [Exactly once] - - The public key that can be used to encrypt messages to this - introduction point. - - "service-key" NL a public key in PEM format - - [Exactly once] - - The public key that can be used to encrypt messages to the hidden - service. - - "intro-authentication" auth-type auth-data NL - - [Any number] - - The introduction-point-specific authentication data can be used - to perform client authentication. This data depends on the - selected introduction point as opposed to "service-authentication" - above. The format of auth-data (base64-encoded or PEM format) - depends on auth-type. See section 2 of this document for details - on auth mechanisms. - - (This ends the fields in the encrypted portion of the descriptor.) - - [It's ok for Bob to advertise 0 introduction points. He might want - to do that if he previously advertised some introduction points, - and now he doesn't have any. -RD] - - "signature" NL signature-string - - [At end, exactly once] - - A signature of all fields above with the private key of the hidden - service. - -1.3.1. Other descriptor formats we don't use. 
- - Support for the V0 descriptor format was dropped in 0.2.2.0-alpha-dev: - - KL Key length [2 octets] - PK Bob's public key [KL octets] - TS A timestamp [4 octets] - NI Number of introduction points [2 octets] - Ipt A list of NUL-terminated ORs [variable] - SIG Signature of above fields [variable] - - KL is the length of PK, in octets. - TS is the number of seconds elapsed since Jan 1, 1970. - - The members of Ipt may be either (a) nicknames, or (b) identity key - digests, encoded in hex, and prefixed with a '$'. - - The V1 descriptor format was understood and accepted from - 0.1.1.5-alpha-cvs to 0.2.0.6-alpha-dev, but no Tors generated it and - it was removed: - - V Format byte: set to 255 [1 octet] - V Version byte: set to 1 [1 octet] - KL Key length [2 octets] - PK Bob's public key [KL octets] - TS A timestamp [4 octets] - PROTO Protocol versions: bitmask [2 octets] - NI Number of introduction points [2 octets] - For each introduction point: (as in INTRODUCE2 cells) - IP Introduction point's address [4 octets] - PORT Introduction point's OR port [2 octets] - ID Introduction point identity ID [20 octets] - KLEN Length of onion key [2 octets] - KEY Introduction point onion key [KLEN octets] - SIG Signature of above fields [variable] - - A hypothetical "V1" descriptor, that has never been used but might - be useful for historical reasons, contains: - - V Format byte: set to 255 [1 octet] - V Version byte: set to 1 [1 octet] - KL Key length [2 octets] - PK Bob's public key [KL octets] - TS A timestamp [4 octets] - PROTO Rendezvous protocol versions: bitmask [2 octets] - NA Number of auth mechanisms accepted [1 octet] - For each auth mechanism: - AUTHT The auth type that is supported [2 octets] - AUTHL Length of auth data [1 octet] - AUTHD Auth data [variable] - NI Number of introduction points [2 octets] - For each introduction point: (as in INTRODUCE2 cells) - ATYPE An address type (typically 4) [1 octet] - ADDR Introduction point's IP address [4 or 16 octets] - 
PORT Introduction point's OR port [2 octets] - AUTHT The auth type that is supported [2 octets] - AUTHL Length of auth data [1 octet] - AUTHD Auth data [variable] - ID Introduction point identity ID [20 octets] - KLEN Length of onion key [2 octets] - KEY Introduction point onion key [KLEN octets] - SIG Signature of above fields [variable] - - AUTHT specifies which authentication/authorization mechanism is - required by the hidden service or the introduction point. AUTHD - is arbitrary data that can be associated with an auth approach. - Currently only AUTHT of [00 00] is supported, with an AUTHL of 0. - See section 2 of this document for details on auth mechanisms. - -1.4. Bob's OP advertises his service descriptor(s). - - Bob's OP advertises his service descriptor to a fixed set of v0 hidden - service directory servers and/or a changing subset of all v2 hidden service - directories. - - For versions before 0.2.2.1-alpha, Bob's OP opens a stream to each v0 - directory server's directory port via Tor. (He may re-use old circuits for - this.) Over this stream, Bob's OP makes an HTTP 'POST' request, to a URL - "/tor/rendezvous/publish" relative to the directory server's root, - containing as its body Bob's service descriptor. - - Upon receiving a descriptor, the directory server checks the signature, - and discards the descriptor if the signature does not match the enclosed - public key. Next, the directory server checks the timestamp. If the - timestamp is more than 24 hours in the past or more than 1 hour in the - future, or the directory server already has a newer descriptor with the - same public key, the server discards the descriptor. Otherwise, the - server discards any older descriptors with the same public key and - version format, and associates the new descriptor with the public key. - The directory server remembers this descriptor for at least 24 hours - after its timestamp. At least every 18 hours, Bob's OP uploads a - fresh descriptor. 
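The timestamp rules that a v0 directory server applies to an uploaded descriptor can be sketched as follows. This is only an illustration under stated assumptions, not Tor's implementation: the function name, the `store` dictionary, and the `pk_digest` key are hypothetical, and signature checking is assumed to have succeeded before the call.

```python
MAX_AGE = 24 * 60 * 60   # reject timestamps more than 24 hours in the past
MAX_FUTURE = 60 * 60     # reject timestamps more than 1 hour in the future


def accept_descriptor(store, pk_digest, timestamp, now):
    """Apply the timestamp rules described above to one upload.

    `store` is a hypothetical dict mapping a public-key digest to the
    timestamp of the newest descriptor held for that key. The signature
    is assumed to have been verified already.
    """
    # Too old or too far in the future: discard.
    if timestamp < now - MAX_AGE or timestamp > now + MAX_FUTURE:
        return False
    # A newer descriptor for the same key is already stored: discard.
    newest = store.get(pk_digest)
    if newest is not None and newest > timestamp:
        return False
    # Otherwise the new descriptor replaces any older one for this key.
    store[pk_digest] = timestamp
    return True
```

A real server would also keep the descriptor body and expire entries no sooner than 24 hours after their timestamp; the sketch tracks only the timestamp to keep the comparison logic visible.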
- - If Bob's OP is configured to publish v2 descriptors, it does so to a - changing subset of all v2 hidden service directories instead of the - authoritative directory servers. Therefore, Bob's OP opens a stream via - Tor to each responsible hidden service directory. (He may re-use old - circuits for this.) Over this stream, Bob's OP makes an HTTP 'POST' - request to a URL "/tor/rendezvous2/publish" relative to the hidden service - directory's root, containing as its body Bob's service descriptor. - - At any time, there are 6 hidden service directories responsible for - keeping replicas of a descriptor; they consist of 2 sets of 3 hidden - service directories with consecutive onion IDs. Bob's OP learns about - the complete list of hidden service directories by filtering the - consensus status document received from the directory authorities. A - hidden service directory is deemed responsible for all descriptor IDs in - the interval from its direct predecessor, exclusive, to its own ID, - inclusive; it further holds replicas for its 2 predecessors. A - participant only trusts its own routing list and never learns about - routing information from other parties. - - Bob's OP publishes a new v2 descriptor once an hour or whenever its - content changes. V2 descriptors can be found by clients within a given - time period of 24 hours, after which they change their ID as described - under 1.3. If a published descriptor would be valid for less than 60 - minutes (= 2 x 30 minutes to allow the server to be 30 minutes behind - and the client 30 minutes ahead), Bob's OP publishes the descriptor - under the ID of both, the current and the next publication period. - -1.5. Alice receives a z.onion address. - - When Alice receives a pointer to a location-hidden service, it is as a - hostname of the form "z.onion", where z is a base-32 encoding of a - 10-octet hash of Bob's service's public key, computed as follows: - - 1. Let H = H(PK). - 2. 
Let H' = the first 80 bits of H, considering each octet from - most significant bit to least significant bit. - 3. Generate a 16-character encoding of H', using base32 as defined - in RFC 3548. - - (We only use 80 bits instead of the 160 bits from SHA1 because we - don't need to worry about arbitrary collisions, and because it will - make handling the url's more convenient.) - - [Yes, numbers are allowed at the beginning. See RFC 1123. -NM] - -1.6. Alice's OP retrieves a service descriptor. - - Alice's OP fetches the service descriptor from the fixed set of v0 hidden - service directory servers and/or a changing subset of all v2 hidden service - directories. - - For versions before 0.2.2.1-alpha, Alice's OP opens a stream to a directory - server via Tor, and makes an HTTP GET request for the document - '/tor/rendezvous/<z>', where '<z>' is replaced with the encoding of Bob's - public key as described above. (She may re-use old circuits for this.) The - directory replies with a 404 HTTP response if it does not recognize <z>, - and otherwise returns Bob's most recently uploaded service descriptor. - - If Alice's OP receives a 404 response, it tries the other directory - servers, and only fails the lookup if none recognize the public key hash. - - Upon receiving a service descriptor, Alice verifies with the same process - as the directory server uses, described above in section 1.4. - - The directory server gives a 400 response if it cannot understand Alice's - request. - - Alice should cache the descriptor locally, but should not use - descriptors that are more than 24 hours older than their timestamp. - [Caching may make her partitionable, but she fetched it anonymously, - and we can't very well *not* cache it. -RD] - - If Alice's OP is running 0.2.1.10-alpha or higher, it fetches v2 hidden - service descriptors. Versions before 0.2.2.1-alpha are fetching both v0 and - v2 descriptors in parallel. 
Similar to the description in section 1.4, - Alice's OP fetches a v2 descriptor from a randomly chosen hidden service - directory out of the changing subset of 6 nodes. If the request is - unsuccessful, Alice retries the other remaining responsible hidden service - directories in a random order. Alice relies on Bob to care about a potential - clock skew between the two by possibly storing two sets of descriptors (see - end of section 1.4). - - Alice's OP opens a stream via Tor to the chosen v2 hidden service - directory. (She may re-use old circuits for this.) Over this stream, - Alice's OP makes an HTTP 'GET' request for the document - "/tor/rendezvous2/<z>", where z is replaced with the encoding of the - descriptor ID. The directory replies with a 404 HTTP response if it does - not recognize <z>, and otherwise returns Bob's most recently uploaded - service descriptor. - -1.7. Alice's OP establishes a rendezvous point. - - When Alice requests a connection to a given location-hidden service, - and Alice's OP does not have an established circuit to that service, - the OP builds a rendezvous circuit. It does this by establishing - a circuit to a randomly chosen OR, and sending a - RELAY_COMMAND_ESTABLISH_RENDEZVOUS cell to that OR. The body of that cell - contains: - - RC Rendezvous cookie [20 octets] - - The rendezvous cookie is an arbitrary 20-byte value, chosen randomly by - Alice's OP. Alice SHOULD choose a new rendezvous cookie for each new - connection attempt. - - Upon receiving a RELAY_COMMAND_ESTABLISH_RENDEZVOUS cell, the OR associates - the RC with the circuit that sent it. It replies to Alice with an empty - RELAY_COMMAND_RENDEZVOUS_ESTABLISHED cell to indicate success. - - Alice's OP MUST NOT use the circuit which sent the cell for any purpose - other than rendezvous with the given location-hidden service. - -1.8. 
Introduction: from Alice's OP to Introduction Point - - Alice builds a separate circuit to one of Bob's chosen introduction - points, and sends it a RELAY_COMMAND_INTRODUCE1 cell containing: - - Cleartext - PK_ID Identifier for Bob's PK [20 octets] - Encrypted to Bob's PK: (in the v0 intro protocol) - RP Rendezvous point's nickname [20 octets] - RC Rendezvous cookie [20 octets] - g^x Diffie-Hellman data, part 1 [128 octets] - OR (in the v1 intro protocol) - VER Version byte: set to 1. [1 octet] - RP Rendezvous point nick or ID [42 octets] - RC Rendezvous cookie [20 octets] - g^x Diffie-Hellman data, part 1 [128 octets] - OR (in the v2 intro protocol) - VER Version byte: set to 2. [1 octet] - IP Rendezvous point's address [4 octets] - PORT Rendezvous point's OR port [2 octets] - ID Rendezvous point identity ID [20 octets] - KLEN Length of onion key [2 octets] - KEY Rendezvous point onion key [KLEN octets] - RC Rendezvous cookie [20 octets] - g^x Diffie-Hellman data, part 1 [128 octets] - OR (in the v3 intro protocol) - VER Version byte: set to 3. [1 octet] - AUTHT The auth type that is used [1 octet] - AUTHL Length of auth data [2 octets] - AUTHD Auth data [variable] - TS A timestamp [4 octets] - IP Rendezvous point's address [4 octets] - PORT Rendezvous point's OR port [2 octets] - ID Rendezvous point identity ID [20 octets] - KLEN Length of onion key [2 octets] - KEY Rendezvous point onion key [KLEN octets] - RC Rendezvous cookie [20 octets] - g^x Diffie-Hellman data, part 1 [128 octets] - - PK_ID is the hash of Bob's public key or the service key, depending on the - hidden service descriptor version. In case of a v0 descriptor, Alice's OP - uses Bob's public key. If Alice has downloaded a v2 descriptor, she uses - the contained public key ("service-key"). - - RP is NUL-padded and terminated. In version 0 of the intro protocol, RP - must contain a nickname. 
In version 1, it must contain EITHER a nickname or - an identity key digest that is encoded in hex and prefixed with a '$'. - - The hybrid encryption to Bob's PK works just like the hybrid - encryption in CREATE cells (see tor-spec). Thus the payload of the - version 0 RELAY_COMMAND_INTRODUCE1 cell on the wire will contain - 20+42+16+20+20+128=246 bytes, and the version 1 and version 2 - introduction formats have other sizes. - - Through Tor 0.2.0.6-alpha, clients only generated the v0 introduction - format, whereas hidden services have understood and accepted v0, - v1, and v2 since 0.1.1.x. As of Tor 0.2.0.7-alpha and 0.1.2.18, - clients switched to using the v2 intro format. - -1.9. Introduction: From the Introduction Point to Bob's OP - - If the Introduction Point recognizes PK_ID as a public key which has - established a circuit for introductions as in 1.2 above, it sends the body - of the cell in a new RELAY_COMMAND_INTRODUCE2 cell down the corresponding - circuit. (If the PK_ID is unrecognized, the RELAY_COMMAND_INTRODUCE1 cell is - discarded.) - - After sending the RELAY_COMMAND_INTRODUCE2 cell to Bob, the OR replies to - Alice with an empty RELAY_COMMAND_INTRODUCE_ACK cell. If no - RELAY_COMMAND_INTRODUCE2 cell can be sent, the OR replies to Alice with a - non-empty cell to indicate an error. (The semantics of the cell body may be - determined later; the current implementation sends a single '1' byte on - failure.) - - When Bob's OP receives the RELAY_COMMAND_INTRODUCE2 cell, it decrypts it - with the private key for the corresponding hidden service, and extracts the - rendezvous point's nickname, the rendezvous cookie, and the value of g^x - chosen by Alice. - -1.10. 
Rendezvous - - Bob's OP builds a new Tor circuit ending at Alice's chosen rendezvous - point, and sends a RELAY_COMMAND_RENDEZVOUS1 cell along this circuit, - containing: - RC Rendezvous cookie [20 octets] - g^y Diffie-Hellman [128 octets] - KH Handshake digest [20 octets] - - (Bob's OP MUST NOT use this circuit for any other purpose.) - - If the RP recognizes RC, it relays the rest of the cell down the - corresponding circuit in a RELAY_COMMAND_RENDEZVOUS2 cell, containing: - - g^y Diffie-Hellman [128 octets] - KH Handshake digest [20 octets] - - (If the RP does not recognize the RC, it discards the cell and - tears down the circuit.) - - When Alice's OP receives a RELAY_COMMAND_RENDEZVOUS2 cell on a circuit which - has sent a RELAY_COMMAND_ESTABLISH_RENDEZVOUS cell but which has not yet - received a reply, it uses g^y and H(g^xy) to complete the handshake as in - the Tor circuit extend process: they establish a 60-octet string as - K = SHA1(g^xy | [00]) | SHA1(g^xy | [01]) | SHA1(g^xy | [02]) - and generate - KH = K[0..15] - Kf = K[16..31] - Kb = K[32..47] - - Subsequently, the rendezvous point passes relay cells, unchanged, from - each of the two circuits to the other. When Alice's OP sends - RELAY cells along the circuit, it first encrypts them with the - Kf, then with all of the keys for the ORs in Alice's side of the circuit; - and when Alice's OP receives RELAY cells from the circuit, it decrypts - them with the keys for the ORs in Alice's side of the circuit, then - decrypts them with Kb. Bob's OP does the same, with Kf and Kb - interchanged. - -1.11. Creating streams - - To open TCP connections to Bob's location-hidden service, Alice's OP sends - a RELAY_COMMAND_BEGIN cell along the established circuit, using the special - address "", and a chosen port. Bob's OP chooses a destination IP and - port, based on the configuration of the service connected to the circuit, - and opens a TCP stream. 
From then on, Bob's OP treats the stream as an - ordinary exit connection. - [ Except he doesn't include addr in the connected cell or the end - cell. -RD] - - Alice MAY send multiple RELAY_COMMAND_BEGIN cells along the circuit, to open - multiple streams to Bob. Alice SHOULD NOT send RELAY_COMMAND_BEGIN cells - for any other address along her circuit to Bob; if she does, Bob MUST reject - them. - -2. Authentication and authorization. - - The rendezvous protocol as described in Section 1 provides a few options - for implementing client-side authorization. There are two steps in the - rendezvous protocol that can be used for performing client authorization: - when downloading and decrypting parts of the hidden service descriptor and - at Bob's Tor client before contacting the rendezvous point. A service - provider can restrict access to his service at these two points to - authorized clients only. - - There are currently two authorization protocols specified that are - described in more detail below: - - 1. The first protocol allows a service provider to restrict access - to clients with a previously received secret key only, but does not - attempt to hide service activity from others. - - 2. The second protocol, albeit being feasible for a limited set of about - 16 clients, performs client authorization and hides service activity - from everyone but the authorized clients. - -2.1. Service with large-scale client authorization - - The first client authorization protocol aims at performing access control - while consuming as few additional resources as possible. This is the "basic" - authorization protocol. A service provider should be able to permit access - to a large number of clients while denying access for everyone else. - However, the price for scalability is that the service won't be able to hide - its activity from unauthorized or formerly authorized clients. 
- - The main idea of this protocol is to encrypt the introduction-point part - in hidden service descriptors to authorized clients using symmetric keys. - This ensures that nobody else but authorized clients can learn which - introduction points a service currently uses, nor can someone send a - valid INTRODUCE1 message without knowing the introduction key. Therefore, - a subsequent authorization at the introduction point is not required. - - A service provider generates symmetric "descriptor cookies" for his - clients and distributes them outside of Tor. The suggested key size is - 128 bits, so that descriptor cookies can be encoded in 22 base64 chars - (which can hold up to 22 * 5 = 132 bits, leaving 4 bits to encode the - authorization type (here: "0") and allow a client to distinguish this - authorization protocol from others like the one proposed below). - Typically, the contact information for a hidden service using this - authorization protocol looks like this: - - v2cbb2l4lsnpio4q.onion Ll3X7Xgz9eHGKCCnlFH0uz - - When generating a hidden service descriptor, the service encrypts the - introduction-point part with a single randomly generated symmetric - 128-bit session key using AES-CTR as described for v2 hidden service - descriptors in rend-spec. Afterwards, the service encrypts the session - key to all descriptor cookies using AES. Authorized client should be able - to efficiently find the session key that is encrypted for him/her, so - that 4 octet long client ID are generated consisting of descriptor cookie - and initialization vector. Descriptors always contain a number of - encrypted session keys that is a multiple of 16 by adding fake entries. - Encrypted session keys are ordered by client IDs in order to conceal - addition or removal of authorized clients by the service provider. - - ATYPE Authorization type: set to 1. 
[1 octet] - ALEN Number of clients := 1 + ((clients - 1) div 16) [1 octet] - for each symmetric descriptor cookie: - ID Client ID: H(descriptor cookie | IV)[:4] [4 octets] - SKEY Session key encrypted with descriptor cookie [16 octets] - (end of client-specific part) - RND Random data [(15 - ((clients - 1) mod 16)) * 20 octets] - IV AES initialization vector [16 octets] - IPOS Intro points, encrypted with session key [remaining octets] - - An authorized client needs to configure Tor to use the descriptor cookie - when accessing the hidden service. Therefore, a user adds the contact - information that she received from the service provider to her torrc - file. Upon downloading a hidden service descriptor, Tor finds the - encrypted introduction-point part and attempts to decrypt it using the - configured descriptor cookie. (In the rare event of two or more client - IDs being equal a client tries to decrypt all of them.) - - Upon sending the introduction, the client includes her descriptor cookie - as auth type "1" in the INTRODUCE2 cell that she sends to the service. - The hidden service checks whether the included descriptor cookie is - authorized to access the service and either responds to the introduction - request, or not. - -2.2. Authorization for limited number of clients - - A second, more sophisticated client authorization protocol goes the extra - mile of hiding service activity from unauthorized clients. This is the - "stealth" authorization protocol. With all else being equal to the preceding - authorization protocol, the second protocol publishes hidden service - descriptors for each user separately and gets along with encrypting the - introduction-point part of descriptors to a single client. This allows the - service to stop publishing descriptors for removed clients. As long as a - removed client cannot link descriptors issued for other clients to the - service, it cannot derive service activity any more. 
The downside of this - approach is limited scalability. Even though the distributed storage of - descriptors (cf. proposal 114) tackles the problem of limited scalability to - a certain extent, this protocol should not be used for services with more - than 16 clients. (In fact, Tor should refuse to advertise services for more - than this number of clients.) - - A hidden service generates an asymmetric "client key" and a symmetric - "descriptor cookie" for each client. The client key is used as - replacement for the service's permanent key, so that the service uses a - different identity for each of his clients. The descriptor cookie is used - to store descriptors at changing directory nodes that are unpredictable - for anyone but service and client, to encrypt the introduction-point - part, and to be included in INTRODUCE2 cells. Once the service has - created client key and descriptor cookie, he tells them to the client - outside of Tor. The contact information string looks similar to the one - used by the preceding authorization protocol (with the only difference - that it has "1" encoded as auth-type in the remaining 4 of 132 bits - instead of "0" as before). - - When creating a hidden service descriptor for an authorized client, the - hidden service uses the client key and descriptor cookie to compute - secret ID part and descriptor ID: - - secret-id-part = H(time-period | descriptor-cookie | replica) - - descriptor-id = H(client-key[:10] | secret-id-part) - - The hidden service also replaces permanent-key in the descriptor with - client-key and encrypts introduction-points with the descriptor cookie. - - ATYPE Authorization type: set to 2. [1 octet] - IV AES initialization vector [16 octets] - IPOS Intro points, encr. with descriptor cookie [remaining octets] - - When uploading descriptors, the hidden service needs to make sure that - descriptors for different clients are not uploaded at the same time (cf. 
- Section 1.1) which is also a limiting factor for the number of clients. - - When a client is requested to establish a connection to a hidden service - it looks up whether it has any authorization data configured for that - service. If the user has configured authorization data for authorization - protocol "2", the descriptor ID is determined as described in the last - paragraph. Upon receiving a descriptor, the client decrypts the - introduction-point part using its descriptor cookie. Further, the client - includes its descriptor cookie as auth-type "2" in INTRODUCE2 cells that - it sends to the service. - -2.3. Hidden service configuration - - A hidden service that is meant to perform client authorization adds a - new option HiddenServiceAuthorizeClient to its hidden service - configuration. This option contains the authorization type which is - either "basic" for the protocol described in 2.1 or "stealth" for the - protocol in 2.2 and a comma-separated list of human-readable client - names, so that Tor can create authorization data for these clients: - - HiddenServiceAuthorizeClient auth-type client-name,client-name,... - - If this option is configured, HiddenServiceVersion is automatically - reconfigured to contain only version numbers of 2 or higher. There is - a maximum of 512 client names for basic auth and a maximum of 16 for - stealth auth. - - Tor stores all generated authorization data for the authorization - protocols described in Sections 2.1 and 2.2 in a new file using the - following file format: - - "client-name" human-readable client identifier NL - "descriptor-cookie" 128-bit key ^= 22 base64 chars NL - - If the authorization protocol of Section 2.2 is used, Tor also generates - and stores the following data: - - "client-key" NL a public key in PEM format - -2.4. 
Client configuration - - Clients need to make their authorization data known to Tor using another - configuration option that contains a service name (mainly for the sake of - convenience), the service address, and the descriptor cookie that is - required to access a hidden service (the authorization protocol number is - encoded in the descriptor cookie): - - HidServAuth service-name service-address descriptor-cookie - -3. Hidden service directory operation - - This section has been introduced with the v2 hidden service descriptor - format. It describes all operations of the v2 hidden service descriptor - fetching and propagation mechanism that are required for the protocol - described in section 1 to succeed with v2 hidden service descriptors. - -3.1. Configuring as hidden service directory - - Every onion router that has its directory port open can decide whether it - wants to store and serve hidden service descriptors. An onion router which - is configured as such includes the "hidden-service-dir" flag in its router - descriptors that it sends to directory authorities. - - The directory authorities include a new flag "HSDir" for routers that - decided to provide storage for hidden service descriptors and that - have been running for at least 24 hours. - -3.2. Accepting publish requests - - Hidden service directory nodes accept publish requests for v2 hidden service - descriptors and store them to their local memory. (It is not necessary to - make descriptors persistent, because after restarting, the onion router - would not be accepted as a storing node anyway, because it has not been - running for at least 24 hours.) All requests and replies are formatted as - HTTP messages. Requests are initiated via BEGIN_DIR cells directed to - the router's directory port, and formatted as HTTP POST requests to the URL - "/tor/rendezvous2/publish" relative to the hidden service directory's root, - containing as its body a v2 service descriptor. 
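The publish request from section 3.2 can be sketched as follows. This is a minimal illustration, not Tor's implementation: the function name `build_publish_request`, the HTTP/1.0 request line, and the `Host` and `Content-Length` headers are assumptions, since the text above only fixes the method (POST), the URL, and the body. Opening the BEGIN_DIR stream that carries the request is not shown.

```python
def build_publish_request(descriptor: bytes, host: str = "hsdir.example") -> bytes:
    """Build an HTTP POST whose body is a v2 service descriptor.

    The URL comes from the text above; the HTTP version, the Host
    header, and the Content-Length header are illustrative assumptions.
    """
    head = (
        "POST /tor/rendezvous2/publish HTTP/1.0\r\n"
        f"Host: {host}\r\n"
        f"Content-Length: {len(descriptor)}\r\n"
        "\r\n"
    )
    # Header block first, then the descriptor bytes unchanged as the body.
    return head.encode("ascii") + descriptor
```

A caller would write the returned bytes to the stream opened towards the hidden service directory and then read the HTTP reply.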
- - A hidden service directory node parses every received descriptor and only - stores it when it thinks that it is responsible for storing that descriptor - based on its own routing table. See section 1.4 for more information on how - to determine responsibility for a certain descriptor ID. - -3.3. Processing fetch requests - - Hidden service directory nodes process fetch requests for hidden service - descriptors by looking them up in their local memory. (They do not need to - determine if they are responsible for the passed ID, because it does no harm - if they deliver a descriptor for which they are not (any more) responsible.) - All requests and replies are formatted as HTTP messages. Requests are - initiated via BEGIN_DIR cells directed to the router's directory port, - and formatted as HTTP GET requests for the document "/tor/rendezvous2/<z>", - where z is replaced with the encoding of the descriptor ID. - diff --git a/doc/spec/socks-extensions.txt b/doc/spec/socks-extensions.txt deleted file mode 100644 index 62d86acd9f..0000000000 --- a/doc/spec/socks-extensions.txt +++ /dev/null @@ -1,78 +0,0 @@ -Tor's extensions to the SOCKS protocol - -1. Overview - - The SOCKS protocol provides a generic interface for TCP proxies. Client - software connects to a SOCKS server via TCP, and requests a TCP connection - to another address and port. The SOCKS server establishes the connection, - and reports success or failure to the client. After the connection has - been established, the client application uses the TCP stream as usual. - - Tor supports SOCKS4 as defined in [1], SOCKS4A as defined in [2], and - SOCKS5 as defined in [3]. - - The stickiest issue for Tor in supporting clients, in practice, is forcing - DNS lookups to occur at the OR side: if clients do their own DNS lookup, - the DNS server can learn which addresses the client wants to reach. 
- SOCKS4 supports addressing by IPv4 address; SOCKS4A is a kludge on top of - SOCKS4 to allow addressing by hostname; SOCKS5 supports IPv4, IPv6, and - hostnames. - -1.1. Extent of support - - Tor supports the SOCKS4, SOCKS4A, and SOCKS5 standards, except as follows: - - BOTH: - - The BIND command is not supported. - - SOCKS4,4A: - - SOCKS4 usernames are ignored. - - SOCKS5: - - The (SOCKS5) "UDP ASSOCIATE" command is not supported. - - IPv6 is not supported in CONNECT commands. - - Only the "NO AUTHENTICATION" (SOCKS5) authentication method [00] is - supported. - -2. Name lookup - - As an extension to SOCKS4A and SOCKS5, Tor implements a new command value, - "RESOLVE" [F0]. When Tor receives a "RESOLVE" SOCKS command, it initiates - a remote lookup of the hostname provided as the target address in the SOCKS - request. The reply is either an error (if the address couldn't be - resolved) or a success response. In the case of success, the address is - stored in the portion of the SOCKS response reserved for remote IP address. - - (We support RESOLVE in SOCKS4 too, even though it is unnecessary.) - - For SOCKS5 only, we support reverse resolution with a new command value, - "RESOLVE_PTR" [F1]. In response to a "RESOLVE_PTR" SOCKS5 command with - an IPv4 address as its target, Tor attempts to find the canonical - hostname for that IPv4 record, and returns it in the "server bound - address" portion of the reply. - (This command was not supported before Tor 0.1.2.2-alpha.) - -3. Other command extensions. - - Tor 0.1.2.4-alpha added a new command value: "CONNECT_DIR" [F2]. - In this case, Tor will open an encrypted direct TCP connection to the - directory port of the Tor server specified by address:port (the port - specified should be the ORPort of the server). It uses a one-hop tunnel - and a "BEGIN_DIR" relay cell to accomplish this secure connection. - - The F2 command value was removed in Tor 0.2.0.10-alpha in favor of a - new use_begindir flag in edge_connection_t. 
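The "RESOLVE" [F0] extension from section 2 can be sketched for SOCKS5 as follows. This is a hedged example rather than a reference client: the request layout (VER, CMD, RSV, ATYP, address, port) is the standard SOCKS5 layout from RFC 1928 with the command byte replaced by F0, the helper name `build_resolve_request` is hypothetical, and sending a port of 0 is an assumption, because the text above does not say what the port field should hold for a lookup. The greeting offers only method [00], the one authentication method listed as supported in section 1.1.

```python
import struct

SOCKS5_VERSION = 0x05
CMD_RESOLVE = 0xF0        # Tor's "RESOLVE" command value (section 2)
ATYP_DOMAINNAME = 0x03    # RFC 1928 address type for a hostname

# Method negotiation sent before the request: version 5, one method,
# "NO AUTHENTICATION" [00].
GREETING = bytes([SOCKS5_VERSION, 0x01, 0x00])


def build_resolve_request(hostname: str) -> bytes:
    """Build a SOCKS5 request asking Tor to resolve `hostname` remotely."""
    name = hostname.encode("ascii")
    if not 0 < len(name) <= 255:
        raise ValueError("hostname must be 1..255 bytes long")
    header = struct.pack(
        "!BBBBB", SOCKS5_VERSION, CMD_RESOLVE, 0x00, ATYP_DOMAINNAME, len(name)
    )
    # Port 0 is an assumption; a lookup has no meaningful port.
    return header + name + struct.pack("!H", 0)
```

For example, `build_resolve_request("example.com")` yields the bytes `05 F0 00 03 0B`, the eleven ASCII bytes of the name, and `00 00`. On success the resolved address arrives in the address field of the SOCKS reply.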
- -4. HTTP-resistance - - Tor checks the first byte of each SOCKS request to see whether it looks - more like an HTTP request (that is, it starts with a "G", "H", or "P"). If - so, Tor returns a small webpage, telling the user that his/her browser is - misconfigured. This is helpful for the many users who mistakenly try to - use Tor as an HTTP proxy instead of a SOCKS proxy. - -References: - [1] http://archive.socks.permeo.com/protocol/socks4.protocol - [2] http://archive.socks.permeo.com/protocol/socks4a.protocol - [3] SOCKS5: RFC1928 - diff --git a/doc/spec/tor-fw-helper-spec.txt b/doc/spec/tor-fw-helper-spec.txt deleted file mode 100644 index 0068b26556..0000000000 --- a/doc/spec/tor-fw-helper-spec.txt +++ /dev/null @@ -1,57 +0,0 @@ - - Tor's (little) Firewall Helper specification - Jacob Appelbaum - -0. Preface - - This document describes issues faced by Tor users who are behind NAT devices - and wish to share their resources with the rest of the Tor network. It also - explains a possible solution for some NAT devices. - -1. Overview - - Tor users often wish to relay traffic for the Tor network and their upstream - firewall thwarts their attempted generosity. Automatic port forwarding - configuration for many consumer NAT devices is often available with two common - protocols NAT-PMP[0] and UPnP[1]. - -2. Implementation - - tor-fw-helper is a program that implements basic port forwarding requests; it - may be used alone or called from Tor itself. - -2.1 Output format - - When tor-fw-helper has completed the requested action successfully, it will - report the following message to standard output: - - tor-fw-helper: SUCCESS - - If tor-fw-helper was unable to complete the requested action successfully, it - will report the following message to standard error: - - tor-fw-helper: FAILURE - - All informational messages are printed to standard output; all error messages - are printed to standard error. 
Messages other than SUCCESS and FAILURE - may be printed by any compliant tor-fw-helper. - -2.2 Output format stability - - The above SUCCESS and FAILURE messages are the only stable output formats - provided by this specification. tor-fw-helper-spec compliant implementations - must return SUCCESS or FAILURE as defined above. - -3. Security Concerns - - It is probably best to hand configure port forwarding and in the process, we - suggest disabling NAT-PMP and/or UPnP. This is of course absolutely confusing - to users and so we support automatic, non-authenticated NAT port mapping - protocols with compliant tor-fw-helper applications. - - NAT should not be considered a security boundary. NAT-PMP and UPnP are hacks - to deal with the shortcomings of user education about TCP/IP, IPv4 shortages, - and of course, NAT devices that suffer from horrible user interface design. - -[0] http://en.wikipedia.org/wiki/NAT_Port_Mapping_Protocol -[1] http://en.wikipedia.org/wiki/Universal_Plug_and_Play diff --git a/doc/spec/tor-spec.txt b/doc/spec/tor-spec.txt deleted file mode 100644 index 91ad561b8d..0000000000 --- a/doc/spec/tor-spec.txt +++ /dev/null @@ -1,1004 +0,0 @@ - - Tor Protocol Specification - - Roger Dingledine - Nick Mathewson - -Note: This document aims to specify Tor as implemented in 0.2.1.x. Future -versions of Tor may implement improved protocols, and compatibility is not -guaranteed. Compatibility notes are given for versions 0.1.1.15-rc and -later; earlier versions are not compatible with the Tor network as of this -writing. - -This specification is not a design document; most design criteria -are not examined. For more information on why Tor acts as it does, -see tor-design.pdf. - -0. Preliminaries - - The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL - NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and - "OPTIONAL" in this document are to be interpreted as described in - RFC 2119. - -0.1. Notation and encoding - - PK -- a public key. 
- SK -- a private key. - K -- a key for a symmetric cipher. - - a|b -- concatenation of 'a' and 'b'. - - [A0 B1 C2] -- a three-byte sequence, containing the bytes with - hexadecimal values A0, B1, and C2, in that order. - - All numeric values are encoded in network (big-endian) order. - - H(m) -- a cryptographic hash of m. - -0.2. Security parameters - - Tor uses a stream cipher, a public-key cipher, the Diffie-Hellman - protocol, and a hash function. - - KEY_LEN -- the length of the stream cipher's key, in bytes. - - PK_ENC_LEN -- the length of a public-key encrypted message, in bytes. - PK_PAD_LEN -- the number of bytes added in padding for public-key - encryption, in bytes. (The largest number of bytes that can be encrypted - in a single public-key operation is therefore PK_ENC_LEN-PK_PAD_LEN.) - - DH_LEN -- the number of bytes used to represent a member of the - Diffie-Hellman group. - DH_SEC_LEN -- the number of bytes used in a Diffie-Hellman private key (x). - - HASH_LEN -- the length of the hash function's output, in bytes. - - PAYLOAD_LEN -- The longest allowable cell payload, in bytes. (509) - - CELL_LEN -- The length of a Tor cell, in bytes. - -0.3. Ciphers - - For a stream cipher, we use 128-bit AES in counter mode, with an IV of all - 0 bytes. - - For a public-key cipher, we use RSA with 1024-bit keys and a fixed - exponent of 65537. We use OAEP-MGF1 padding, with SHA-1 as its digest - function. We leave the optional "Label" parameter unset. (For OAEP - padding, see ftp://ftp.rsasecurity.com/pub/pkcs/pkcs-1/pkcs-1v2-1.pdf) - - For Diffie-Hellman, we use a generator (g) of 2. 
For the modulus (p), we - use the 1024-bit safe prime from rfc2409 section 6.2 whose hex - representation is: - - "FFFFFFFFFFFFFFFFC90FDAA22168C234C4C6628B80DC1CD129024E08" - "8A67CC74020BBEA63B139B22514A08798E3404DDEF9519B3CD3A431B" - "302B0A6DF25F14374FE1356D6D51C245E485B576625E7EC6F44C42E9" - "A637ED6B0BFF5CB6F406B7EDEE386BFB5A899FA5AE9F24117C4B1FE6" - "49286651ECE65381FFFFFFFFFFFFFFFF" - - As an optimization, implementations SHOULD choose DH private keys (x) of - 320 bits. Implementations that do this MUST never use any DH key more - than once. - [May other implementations reuse their DH keys?? -RD] - [Probably not. Conceivably, you could get away with changing DH keys once - per second, but there are too many oddball attacks for me to be - comfortable that this is safe. -NM] - - For a hash function, we use SHA-1. - - KEY_LEN=16. - DH_LEN=128; DH_SEC_LEN=40. - PK_ENC_LEN=128; PK_PAD_LEN=42. - HASH_LEN=20. - - When we refer to "the hash of a public key", we mean the SHA-1 hash of the - DER encoding of an ASN.1 RSA public key (as specified in PKCS.1). - - All "random" values should be generated with a cryptographically strong - random number generator, unless otherwise noted. - - The "hybrid encryption" of a byte sequence M with a public key PK is - computed as follows: - 1. If M is less than PK_ENC_LEN-PK_PAD_LEN, pad and encrypt M with PK. - 2. Otherwise, generate a KEY_LEN byte random key K. - Let M1 = the first PK_ENC_LEN-PK_PAD_LEN-KEY_LEN bytes of M, - and let M2 = the rest of M. - Pad and encrypt K|M1 with PK. Encrypt M2 with our stream cipher, - using the key K. Concatenate these encrypted values. - [XXX Note that this "hybrid encryption" approach does not prevent - an attacker from adding or removing bytes to the end of M. It also - allows attackers to modify the bytes not covered by the OAEP -- - see Goldberg's PET2006 paper for details. We will add a MAC to this - scheme one day. -RD] - -0.4. Other parameter values - - CELL_LEN=512 - -1. 
System overview - - Tor is a distributed overlay network designed to anonymize - low-latency TCP-based applications such as web browsing, secure shell, - and instant messaging. Clients choose a path through the network and - build a ``circuit'', in which each node (or ``onion router'' or ``OR'') - in the path knows its predecessor and successor, but no other nodes in - the circuit. Traffic flowing down the circuit is sent in fixed-size - ``cells'', which are unwrapped by a symmetric key at each node (like - the layers of an onion) and relayed downstream. - -1.1. Keys and names - - Every Tor server has multiple public/private keypairs: - - - A long-term signing-only "Identity key" used to sign documents and - certificates, and used to establish server identity. - - A medium-term "Onion key" used to decrypt onion skins when accepting - circuit extend attempts. (See 5.1.) Old keys MUST be accepted for at - least one week after they are no longer advertised. Because of this, - servers MUST retain old keys for a while after they're rotated. - - A short-term "Connection key" used to negotiate TLS connections. - Tor implementations MAY rotate this key as often as they like, and - SHOULD rotate this key at least once a day. - - Tor servers are also identified by "nicknames"; these are specified in - dir-spec.txt. - -2. Connections - - Connections between two Tor servers, or between a client and a server, - use TLS/SSLv3 for link authentication and encryption. All - implementations MUST support the SSLv3 ciphersuite - "SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA", and SHOULD support the TLS - ciphersuite "TLS_DHE_RSA_WITH_AES_128_CBC_SHA" if it is available. - - There are three acceptable ways to perform a TLS handshake when - connecting to a Tor server: "certificates up-front", "renegotiation", and - "backwards-compatible renegotiation". ("Backwards-compatible - renegotiation" is, as the name implies, compatible with both other - handshake types.) 
- - Before Tor 0.2.0.21, only "certificates up-front" was supported. In Tor - 0.2.0.21 or later, "backwards-compatible renegotiation" is used. - - In "certificates up-front", the connection initiator always sends a - two-certificate chain, consisting of an X.509 certificate using a - short-term connection public key and a second, self- signed X.509 - certificate containing its identity key. The other party sends a similar - certificate chain. The initiator's ClientHello MUST NOT include any - ciphersuites other than: - TLS_DHE_RSA_WITH_AES_256_CBC_SHA - TLS_DHE_RSA_WITH_AES_128_CBC_SHA - SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA - SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA - - In "renegotiation", the connection initiator sends no certificates, and - the responder sends a single connection certificate. Once the TLS - handshake is complete, the initiator renegotiates the handshake, with each - party sending a two-certificate chain as in "certificates up-front". - The initiator's ClientHello MUST include at least one ciphersuite not in - the list above. The responder SHOULD NOT select any ciphersuite besides - those in the list above. - [The above "should not" is because some of the ciphers that - clients list may be fake.] - - In "backwards-compatible renegotiation", the connection initiator's - ClientHello MUST include at least one ciphersuite other than those listed - above. The connection responder examines the initiator's ciphersuite list - to see whether it includes any ciphers other than those included in the - list above. If extra ciphers are included, the responder proceeds as in - "renegotiation": it sends a single certificate and does not request - client certificates. Otherwise (in the case that no extra ciphersuites - are included in the ClientHello) the responder proceeds as in - "certificates up-front": it requests client certificates, and sends a - two-certificate chain. 
In either case, once the responder has sent its - certificate or certificates, the initiator counts them. If two - certificates have been sent, it proceeds as in "certificates up-front"; - otherwise, it proceeds as in "renegotiation". - - All new implementations of the Tor server protocol MUST support - "backwards-compatible renegotiation"; clients SHOULD do this too. If - this is not possible, new client implementations MUST support both - "renegotiation" and "certificates up-front" and use the router's - published link protocols list (see dir-spec.txt on the "protocols" entry) - to decide which to use. - - In all of the above handshake variants, certificates sent in the clear - SHOULD NOT include any strings to identify the host as a Tor server. In - the "renegotiation" and "backwards-compatible renegotiation" steps, the - initiator SHOULD choose a list of ciphersuites and TLS extensions - to mimic one used by a popular web browser. - - Responders MUST NOT select any TLS ciphersuite that lacks ephemeral keys, - or whose symmetric keys are less than KEY_LEN bits, or whose digests are - less than HASH_LEN bits. Responders SHOULD NOT select any SSLv3 - ciphersuite other than those listed above. - - Even though the connection protocol is identical, we will think of the - initiator as either an onion router (OR) if it is willing to relay - traffic for other Tor users, or an onion proxy (OP) if it only handles - local requests. Onion proxies SHOULD NOT provide long-term-trackable - identifiers in their handshakes. - - In all handshake variants, once all certificates are exchanged, all - parties receiving certificates must confirm that the identity key is as - expected. (When initiating a connection, the expected identity key is - the one given in the directory; when creating a connection because of an - EXTEND cell, the expected identity key is the one given in the cell.) If - the key is not as expected, the party must close the connection. 
- - When connecting to an OR, all parties SHOULD reject the connection if that - OR has a malformed or missing certificate. When accepting an incoming - connection, an OR SHOULD NOT reject incoming connections from parties with - malformed or missing certificates. (However, an OR should not believe - that an incoming connection is from another OR unless the certificates - are present and well-formed.) - - [Before version 0.1.2.8-rc, ORs rejected incoming connections from ORs and - OPs alike if their certificates were missing or malformed.] - - Once a TLS connection is established, the two sides send cells - (specified below) to one another. Cells are sent serially. All - cells are CELL_LEN bytes long. Cells may be sent embedded in TLS - records of any size or divided across TLS records, but the framing - of TLS records MUST NOT leak information about the type or contents - of the cells. - - TLS connections are not permanent. Either side MAY close a connection - if there are no circuits running over it and an amount of time - (KeepalivePeriod, defaults to 5 minutes) has passed since the last time - any traffic was transmitted over the TLS connection. Clients SHOULD - also hold a TLS connection with no circuits open, if it is likely that a - circuit will be built soon using that connection. - - (As an exception, directory servers may try to stay connected to all of - the ORs -- though this will be phased out for the Tor 0.1.2.x release.) - - To avoid being trivially distinguished from servers, client-only Tor - instances are encouraged but not required to use a two-certificate chain - as well. Clients SHOULD NOT keep using the same certificates when - their IP address changes. Clients MAY send no certificates at all. - -3. Cell Packet format - - The basic unit of communication for onion routers and onion - proxies is a fixed-width "cell". 
- - On a version 1 connection, each cell contains the following - fields: - - CircID [2 bytes] - Command [1 byte] - Payload (padded with 0 bytes) [PAYLOAD_LEN bytes] - - On a version 2 connection, all cells are as in version 1 connections, - except for the initial VERSIONS cell, whose format is: - - Circuit [2 octets; set to 0] - Command [1 octet; set to 7 for VERSIONS] - Length [2 octets; big-endian integer] - Payload [Length bytes] - - The CircID field determines which circuit, if any, the cell is - associated with. - - The 'Command' field holds one of the following values: - 0 -- PADDING (Padding) (See Sec 7.2) - 1 -- CREATE (Create a circuit) (See Sec 5.1) - 2 -- CREATED (Acknowledge create) (See Sec 5.1) - 3 -- RELAY (End-to-end data) (See Sec 5.5 and 6) - 4 -- DESTROY (Stop using a circuit) (See Sec 5.4) - 5 -- CREATE_FAST (Create a circuit, no PK) (See Sec 5.1) - 6 -- CREATED_FAST (Circuit created, no PK) (See Sec 5.1) - 7 -- VERSIONS (Negotiate proto version) (See Sec 4) - 8 -- NETINFO (Time and address info) (See Sec 4) - 9 -- RELAY_EARLY (End-to-end data; limited)(See Sec 5.6) - - The interpretation of 'Payload' depends on the type of the cell. - PADDING: Payload is unused. - CREATE: Payload contains the handshake challenge. - CREATED: Payload contains the handshake response. - RELAY: Payload contains the relay header and relay body. - DESTROY: Payload contains a reason for closing the circuit. - (see 5.4) - Upon receiving any other value for the command field, an OR must - drop the cell. Since more cell types may be added in the future, ORs - should generally not warn when encountering unrecognized commands. - - The payload is padded with 0 bytes. - - PADDING cells are currently used to implement connection keepalive. - If there is no other traffic, ORs and OPs send one another a PADDING - cell every few minutes. - - CREATE, CREATED, and DESTROY cells are used to manage circuits; - see section 5 below. 
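The fixed-width cell layout and the variable-length VERSIONS cell described above can be sketched in Python as follows. This is only an illustration, not the Tor implementation: CELL_LEN = 512 is taken from the spec's preliminaries (it is not stated in this section), big-endian field order is assumed, and the function names pack_cell, unpack_cell and pack_versions_cell are hypothetical.

```python
import struct

# Assumption: CELL_LEN is 512 bytes, as defined in the spec's preliminaries
# (not restated in this section). The 3-byte header is CircID + Command.
CELL_LEN = 512
PAYLOAD_LEN = CELL_LEN - 3

CMD_VERSIONS = 7  # from the command table above


def pack_cell(circ_id, command, payload=b""):
    """Build a fixed-width cell: CircID [2], Command [1], zero-padded payload."""
    if len(payload) > PAYLOAD_LEN:
        raise ValueError("payload does not fit in one cell")
    # Big-endian byte order is assumed here.
    header = struct.pack(">HB", circ_id, command)
    return header + payload.ljust(PAYLOAD_LEN, b"\x00")


def unpack_cell(cell):
    """Split a fixed-width cell into (circ_id, command, padded_payload)."""
    if len(cell) != CELL_LEN:
        raise ValueError("cell must be exactly CELL_LEN bytes")
    circ_id, command = struct.unpack(">HB", cell[:3])
    return circ_id, command, cell[3:]


def pack_versions_cell(versions):
    """Build the variable-length VERSIONS cell: Circuit=0, Command=7, Length, Payload."""
    payload = b"".join(struct.pack(">H", v) for v in versions)
    return struct.pack(">HBH", 0, CMD_VERSIONS, len(payload)) + payload
```

Note that unpack_cell returns the payload with its zero padding still attached; how much of it is meaningful depends on the cell type.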
- - RELAY cells are used to send commands and data along a circuit; see - section 6 below. - - VERSIONS and NETINFO cells are used to set up connections. See section 4 - below. - -4. Negotiating and initializing connections - -4.1. Negotiating versions with VERSIONS cells - - There are multiple instances of the Tor link connection protocol. Any - connection negotiated using the "certificates up front" handshake (see - section 2 above) is "version 1". In any connection where both parties - have behaved as in the "renegotiation" handshake, the link protocol - version is 2 or higher. - - To determine the version, in any connection where the "renegotiation" - handshake was used (that is, where the server sent only one certificate - at first and where the client did not send any certificates until - renegotiation), both parties MUST send a VERSIONS cell immediately after - the renegotiation is finished, before any other cells are sent. Parties - MUST NOT send any other cells on a connection until they have received a - VERSIONS cell. - - The payload in a VERSIONS cell is a series of big-endian two-byte - integers. Both parties MUST select as the link protocol version the - highest number contained both in the VERSIONS cell they sent and in the - versions cell they received. If they have no such version in common, - they cannot communicate and MUST close the connection. - - Since the version 1 link protocol does not use the "renegotiation" - handshake, implementations MUST NOT list version 1 in their VERSIONS - cell. - -4.2. NETINFO cells - - If version 2 or higher is negotiated, each party sends the other a - NETINFO cell. The cell's payload is: - - Timestamp [4 bytes] - Other OR's address [variable] - Number of addresses [1 byte] - This OR's addresses [variable] - - The address format is a type/length/value sequence as given in section - 6.4 below. The timestamp is a big-endian unsigned integer number of - seconds since the Unix epoch. 
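The version selection rule of section 4.1 (take the highest version listed in both VERSIONS cells, or close the connection if there is none) can be sketched as below. This is a minimal illustration with hypothetical function names; rejecting an odd-length payload is an assumption, since the text does not say how to treat one.

```python
import struct


def parse_versions_payload(payload):
    """Decode a VERSIONS payload: a series of big-endian two-byte integers."""
    if len(payload) % 2 != 0:
        # Assumption: an odd-length payload is treated as malformed.
        raise ValueError("VERSIONS payload must be a multiple of 2 bytes")
    return [v for (v,) in struct.iter_unpack(">H", payload)]


def negotiate_link_version(sent, received):
    """Return the highest version present in both lists, or None.

    None means the parties share no version and the caller should
    close the connection.
    """
    common = set(sent) & set(received)
    return max(common) if common else None
```

For example, a party that sent versions 2 and 3 and received versions 1 and 2 would settle on version 2.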
- - Implementations MAY use the timestamp value to help decide if their - clocks are skewed. Initiators MAY use "other OR's address" to help - learn which address their connections are originating from, if they do - not know it. Initiators SHOULD use "this OR's address" to make sure - that they have connected to another OR at its canonical address. - - [As of 0.2.0.23-rc, implementations use none of the above values.] - - -5. Circuit management - -5.1. CREATE and CREATED cells - - Users set up circuits incrementally, one hop at a time. To create a - new circuit, OPs send a CREATE cell to the first node, with the - first half of the DH handshake; that node responds with a CREATED - cell with the second half of the DH handshake plus the first 20 bytes - of derivative key data (see section 5.2). To extend a circuit past - the first hop, the OP sends an EXTEND relay cell (see section 5) - which instructs the last node in the circuit to send a CREATE cell - to extend the circuit. - - The payload for a CREATE cell is an 'onion skin', which consists - of the first step of the DH handshake data (also known as g^x). - This value is hybrid-encrypted (see 0.3) to Bob's onion key, giving - an onion-skin of: - PK-encrypted: - Padding [PK_PAD_LEN bytes] - Symmetric key [KEY_LEN bytes] - First part of g^x [PK_ENC_LEN-PK_PAD_LEN-KEY_LEN bytes] - Symmetrically encrypted: - Second part of g^x [DH_LEN-(PK_ENC_LEN-PK_PAD_LEN-KEY_LEN) - bytes] - - The relay payload for an EXTEND relay cell consists of: - Address [4 bytes] - Port [2 bytes] - Onion skin [DH_LEN+KEY_LEN+PK_PAD_LEN bytes] - Identity fingerprint [HASH_LEN bytes] - - The port and address field denote the IPv4 address and port of the next - onion router in the circuit; the public key hash is the hash of the PKCS#1 - ASN1 encoding of the next onion router's identity (signing) key. (See 0.3 - above.) 
Including this hash allows the extending OR to verify that it is - indeed connected to the correct target OR, and prevents certain - man-in-the-middle attacks. - - The payload for a CREATED cell, or the relay payload for an - EXTENDED cell, contains: - DH data (g^y) [DH_LEN bytes] - Derivative key data (KH) [HASH_LEN bytes] <see 5.2 below> - - The CircID for a CREATE cell is an arbitrarily chosen 2-byte integer, - selected by the node (OP or OR) that sends the CREATE cell. To prevent - CircID collisions, when one node sends a CREATE cell to another, it chooses - from only one half of the possible values based on the ORs' public - identity keys: if the sending node has a lower key, it chooses a CircID with - an MSB of 0; otherwise, it chooses a CircID with an MSB of 1. - - (An OP with no public key MAY choose any CircID it wishes, since an OP - never needs to process a CREATE cell.) - - Public keys are compared numerically by modulus. - - As usual with DH, x and y MUST be generated randomly. - -5.1.1. CREATE_FAST/CREATED_FAST cells - - When initializing the first hop of a circuit, the OP has already - established the OR's identity and negotiated a secret key using TLS. - Because of this, it is not always necessary for the OP to perform the - public key operations to create a circuit. In this case, the - OP MAY send a CREATE_FAST cell instead of a CREATE cell for the first - hop only. The OR responds with a CREATED_FAST cell, and the circuit is - created. - - A CREATE_FAST cell contains: - - Key material (X) [HASH_LEN bytes] - - A CREATED_FAST cell contains: - - Key material (Y) [HASH_LEN bytes] - Derivative key data [HASH_LEN bytes] (See 5.2 below) - - The values of X and Y must be generated randomly. - - If an OR sees a circuit created with CREATE_FAST, the OR is sure to be the - first hop of a circuit. 
ORs SHOULD reject attempts to create streams with - RELAY_BEGIN exiting the circuit at the first hop: letting Tor be used as a - single hop proxy makes exit nodes a more attractive target for compromise. - -5.2. Setting circuit keys - - Once the handshake between the OP and an OR is completed, both can - now calculate g^xy with ordinary DH. Before computing g^xy, both client - and server MUST verify that the received g^x or g^y value is not degenerate; - that is, it must be strictly greater than 1 and strictly less than p-1 - where p is the DH modulus. Implementations MUST NOT complete a handshake - with degenerate keys. Implementations MUST NOT discard other "weak" - g^x values. - - (Discarding degenerate keys is critical for security; if bad keys - are not discarded, an attacker can substitute the server's CREATED - cell's g^y with 0 or 1, thus creating a known g^xy and impersonating - the server. Discarding other keys may allow attacks to learn bits of - the private key.) - - If CREATE or EXTEND is used to extend a circuit, the client and server - base their key material on K0=g^xy, represented as a big-endian unsigned - integer. - - If CREATE_FAST is used, the client and server base their key material on - K0=X|Y. - - From the base key material K0, they compute KEY_LEN*2+HASH_LEN*3 bytes of - derivative key data as - K = H(K0 | [00]) | H(K0 | [01]) | H(K0 | [02]) | ... - - The first HASH_LEN bytes of K form KH; the next HASH_LEN form the forward - digest Df; the next HASH_LEN form the backward digest Db; the next - KEY_LEN form Kf, and the final KEY_LEN form Kb. Excess bytes from K - are discarded. - - KH is used in the handshake response to demonstrate knowledge of the - computed shared key. Df is used to seed the integrity-checking hash - for the stream of data going from the OP to the OR, and Db seeds the - integrity-checking hash for the data stream from the OR to the OP. 
Kf - is used to encrypt the stream of data going from the OP to the OR, and - Kb is used to encrypt the stream of data going from the OR to the OP. - -5.3. Creating circuits - - When creating a circuit through the network, the circuit creator - (OP) performs the following steps: - - 1. Choose an onion router as an exit node (R_N), such that the onion - router's exit policy includes at least one pending stream that - needs a circuit (if there are any). - - 2. Choose a chain of (N-1) onion routers - (R_1...R_N-1) to constitute the path, such that no router - appears in the path twice. - - 3. If not already connected to the first router in the chain, - open a new connection to that router. - - 4. Choose a circID not already in use on the connection with the - first router in the chain; send a CREATE cell along the - connection, to be received by the first onion router. - - 5. Wait until a CREATED cell is received; finish the handshake - and extract the forward key Kf_1 and the backward key Kb_1. - - 6. For each subsequent onion router R (R_2 through R_N), extend - the circuit to R. - - To extend the circuit by a single onion router R_M, the OP performs - these steps: - - 1. Create an onion skin, encrypted to R_M's public onion key. - - 2. Send the onion skin in a relay EXTEND cell along - the circuit (see section 5). - - 3. When a relay EXTENDED cell is received, verify KH, and - calculate the shared keys. The circuit is now extended. - - When an onion router receives an EXTEND relay cell, it sends a CREATE - cell to the next onion router, with the enclosed onion skin as its - payload. As special cases, if the extend cell includes a digest of - all zeroes, or asks to extend back to the relay that sent the extend - cell, the circuit will fail and be torn down. The initiating onion - router chooses some circID not yet used on the connection between the - two onion routers. (But see section 5.1. 
above, concerning choosing - circIDs based on lexicographic order of nicknames.) - - When an onion router receives a CREATE cell, if it already has a - circuit on the given connection with the given circID, it drops the - cell. Otherwise, after receiving the CREATE cell, it completes the - DH handshake, and replies with a CREATED cell. Upon receiving a - CREATED cell, an onion router packs its payload into an EXTENDED relay - cell (see section 5), and sends that cell up the circuit. Upon - receiving the EXTENDED relay cell, the OP can retrieve g^y. - - (As an optimization, OR implementations may delay processing onions - until a break in traffic allows time to do so without harming - network latency too greatly.) - -5.3.1. Canonical connections - - It is possible for an attacker to launch a man-in-the-middle attack - against a connection by telling OR Alice to extend to OR Bob at some - address X controlled by the attacker. The attacker cannot read the - encrypted traffic, but the attacker is now in a position to count all - bytes sent between Alice and Bob (assuming Alice was not already - connected to Bob.) - - To prevent this, when an OR gets an extend request, it SHOULD use an - existing OR connection if the ID matches, and ANY of the following - conditions hold: - - The IP matches the requested IP. - - The OR knows that the IP of the connection it's using is canonical - because it was listed in the NETINFO cell. - - The OR knows that the IP of the connection it's using is canonical - because it was listed in the server descriptor. - - [This is not implemented in Tor 0.2.0.23-rc.] - -5.4. Tearing down circuits - - Circuits are torn down when an unrecoverable error occurs along - the circuit, or when all streams on a circuit are closed and the - circuit's intended lifetime is over. Circuits may be torn down - either completely or hop-by-hop. 
- - To tear down a circuit completely, an OR or OP sends a DESTROY - cell to the adjacent nodes on that circuit, using the appropriate - direction's circID. - - Upon receiving an outgoing DESTROY cell, an OR frees resources - associated with the corresponding circuit. If it's not the end of - the circuit, it sends a DESTROY cell for that circuit to the next OR - in the circuit. If the node is the end of the circuit, then it tears - down any associated edge connections (see section 6.1). - - After a DESTROY cell has been processed, an OR ignores all data or - destroy cells for the corresponding circuit. - - To tear down part of a circuit, the OP may send a RELAY_TRUNCATE cell - signaling a given OR (Stream ID zero). That OR sends a DESTROY - cell to the next node in the circuit, and replies to the OP with a - RELAY_TRUNCATED cell. - - [Note: If an OR receives a TRUNCATE cell and it has any RELAY cells - still queued on the circuit for the next node it will drop them - without sending them. This is not considered conformant behavior, - but it probably won't get fixed until a later version of Tor. Thus, - clients SHOULD NOT send a TRUNCATE cell to a node running any current - version of Tor if a) they have sent relay cells through that node, - and b) they aren't sure whether those cells have been sent on yet.] - - When an unrecoverable error occurs along one connection in a - circuit, the nodes on either side of the connection should, if they - are able, act as follows: the node closer to the OP should send a - RELAY_TRUNCATED cell towards the OP; the node farther from the OP - should send a DESTROY cell down the circuit. - - The payload of a RELAY_TRUNCATED or DESTROY cell contains a single octet, - describing why the circuit is being closed or truncated. When sending a - TRUNCATED or DESTROY cell because of another TRUNCATED or DESTROY cell, - the error code should be propagated. 
The origin of a circuit always sets - this error code to 0, to avoid leaking its version. - - The error codes are: - 0 -- NONE (No reason given.) - 1 -- PROTOCOL (Tor protocol violation.) - 2 -- INTERNAL (Internal error.) - 3 -- REQUESTED (A client sent a TRUNCATE command.) - 4 -- HIBERNATING (Not currently operating; trying to save bandwidth.) - 5 -- RESOURCELIMIT (Out of memory, sockets, or circuit IDs.) - 6 -- CONNECTFAILED (Unable to reach server.) - 7 -- OR_IDENTITY (Connected to server, but its OR identity was not - as expected.) - 8 -- OR_CONN_CLOSED (The OR connection that was carrying this circuit - died.) - 9 -- FINISHED (The circuit has expired for being dirty or old.) - 10 -- TIMEOUT (Circuit construction took too long) - 11 -- DESTROYED (The circuit was destroyed w/o client TRUNCATE) - 12 -- NOSUCHSERVICE (Request for unknown hidden service) - -5.5. Routing relay cells - - When an OR receives a RELAY or RELAY_EARLY cell, it checks the cell's - circID and determines whether it has a corresponding circuit along that - connection. If not, the OR drops the cell. - - Otherwise, if the OR is not at the OP edge of the circuit (that is, - either an 'exit node' or a non-edge node), it de/encrypts the payload - with the stream cipher, as follows: - 'Forward' relay cell (same direction as CREATE): - Use Kf as key; decrypt. - 'Back' relay cell (opposite direction from CREATE): - Use Kb as key; encrypt. - Note that in counter mode, decrypt and encrypt are the same operation. - - The OR then decides whether it recognizes the relay cell, by - inspecting the payload as described in section 6.1 below. If the OR - recognizes the cell, it processes the contents of the relay cell. - Otherwise, it passes the decrypted relay cell along the circuit if - the circuit continues. If the OR at the end of the circuit - encounters an unrecognized relay cell, an error has occurred: the OR - sends a DESTROY cell to tear down the circuit. 
- - When a relay cell arrives at an OP, the OP decrypts the payload - with the stream cipher as follows: - OP receives data cell: - For I=N...1, - Decrypt with Kb_I. If the payload is recognized (see - section 6.1), then stop and process the payload. - - For more information, see section 6 below. - -5.6. Handling relay_early cells - - A RELAY_EARLY cell is designed to limit the length any circuit can reach. - When an OR receives a RELAY_EARLY cell, and the next node in the circuit - is speaking v2 of the link protocol or later, the OR relays the cell as a - RELAY_EARLY cell. Otherwise, it relays it as a RELAY cell. - - If a node ever receives more than 8 RELAY_EARLY cells on a given - outbound circuit, it SHOULD close the circuit. (For historical reasons, - we don't limit the number of inbound RELAY_EARLY cells; they should - be harmless anyway because clients won't accept extend requests. See - bug 1038.) - - When speaking v2 of the link protocol or later, clients MUST only send - EXTEND cells inside RELAY_EARLY cells. Clients SHOULD send the first ~8 - RELAY cells that are not targeted at the first hop of any circuit as - RELAY_EARLY cells too, in order to partially conceal the circuit length. - - [In a future version of Tor, servers will reject any EXTEND cell not - received in a RELAY_EARLY cell. See proposal 110.] - -6. Application connections and stream management - -6.1. Relay cells - - Within a circuit, the OP and the exit node use the contents of - RELAY packets to tunnel end-to-end commands and TCP connections - ("Streams") across circuits. End-to-end commands can be initiated - by either edge; streams are initiated by the OP. 
- - The payload of each unencrypted RELAY cell consists of: - Relay command [1 byte] - 'Recognized' [2 bytes] - StreamID [2 bytes] - Digest [4 bytes] - Length [2 bytes] - Data [CELL_LEN-14 bytes] - - The relay commands are: - 1 -- RELAY_BEGIN [forward] - 2 -- RELAY_DATA [forward or backward] - 3 -- RELAY_END [forward or backward] - 4 -- RELAY_CONNECTED [backward] - 5 -- RELAY_SENDME [forward or backward] [sometimes control] - 6 -- RELAY_EXTEND [forward] [control] - 7 -- RELAY_EXTENDED [backward] [control] - 8 -- RELAY_TRUNCATE [forward] [control] - 9 -- RELAY_TRUNCATED [backward] [control] - 10 -- RELAY_DROP [forward or backward] [control] - 11 -- RELAY_RESOLVE [forward] - 12 -- RELAY_RESOLVED [backward] - 13 -- RELAY_BEGIN_DIR [forward] - - 32..40 -- Used for hidden services; see rend-spec.txt. - - Commands labelled as "forward" must only be sent by the originator - of the circuit. Commands labelled as "backward" must only be sent by - other nodes in the circuit back to the originator. Commands marked - as either can be sent either by the originator or other nodes. - - The 'recognized' field in any unencrypted relay payload is always set - to zero; the 'digest' field is computed as the first four bytes of - the running digest of all the bytes that have been destined for - this hop of the circuit or originated from this hop of the circuit, - seeded from Df or Db respectively (obtained in section 5.2 above), - and including this RELAY cell's entire payload (taken with the digest - field set to zero). - - When the 'recognized' field of a RELAY cell is zero, and the digest - is correct, the cell is considered "recognized" for the purposes of - decryption (see section 5.5 above). - - (The digest does not include any bytes from relay cells that do - not start or end at this hop of the circuit. That is, it does not - include forwarded data. 
Therefore if 'recognized' is zero but the - digest does not match, the running digest at that node should - not be updated, and the cell should be forwarded on.) - - All RELAY cells pertaining to the same tunneled stream have the - same stream ID. StreamIDs are chosen arbitrarily by the OP. RELAY - cells that affect the entire circuit rather than a particular - stream use a StreamID of zero -- they are marked in the table above - as "[control]" style cells. (Sendme cells are marked as "sometimes - control" because they can include a StreamID or not depending - on their purpose -- see Section 7.) - - The 'Length' field of a relay cell contains the number of bytes in - the relay payload which contain real payload data. The remainder of - the payload is padded with NUL bytes. - - If the RELAY cell is recognized but the relay command is not - understood, the cell must be dropped and ignored. Its contents - still count with respect to the digests, though. - -6.2. Opening streams and transferring data - - To open a new anonymized TCP connection, the OP chooses an open - circuit to an exit that may be able to connect to the destination - address, selects an arbitrary StreamID not yet used on that circuit, - and constructs a RELAY_BEGIN cell with a payload encoding the address - and port of the destination host. The payload format is: - - ADDRESS | ':' | PORT | [00] - - where ADDRESS can be a DNS hostname, or an IPv4 address in - dotted-quad format, or an IPv6 address surrounded by square brackets; - and where PORT is a decimal integer between 1 and 65535, inclusive. - - [What is the [00] for? -NM] - [It's so the payload is easy to parse out with string funcs -RD] - - Upon receiving this cell, the exit node resolves the address as - necessary, and opens a new TCP connection to the target port. If the - address cannot be resolved, or a connection can't be established, the - exit node replies with a RELAY_END cell. (See 6.4 below.) 
- Otherwise, the exit node replies with a RELAY_CONNECTED cell, whose - payload is in one of the following formats: - The IPv4 address to which the connection was made [4 octets] - A number of seconds (TTL) for which the address may be cached [4 octets] - or - Four zero-valued octets [4 octets] - An address type (6) [1 octet] - The IPv6 address to which the connection was made [16 octets] - A number of seconds (TTL) for which the address may be cached [4 octets] - [XXXX No version of Tor currently generates the IPv6 format.] - - [Tor servers before 0.1.2.0 set the TTL field to a fixed value. Later - versions set the TTL to the last value seen from a DNS server, and expire - their own cached entries after a fixed interval. This prevents certain - attacks.] - - The OP waits for a RELAY_CONNECTED cell before sending any data. - Once a connection has been established, the OP and exit node - package stream data in RELAY_DATA cells, and upon receiving such - cells, echo their contents to the corresponding TCP stream. - RELAY_DATA cells sent to unrecognized streams are dropped. - - Relay RELAY_DROP cells are long-range dummies; upon receiving such - a cell, the OR or OP must drop it. - -6.2.1. Opening a directory stream - - If a Tor server is a directory server, it should respond to a - RELAY_BEGIN_DIR cell as if it had received a BEGIN cell requesting a - connection to its directory port. RELAY_BEGIN_DIR cells ignore exit - policy, since the stream is local to the Tor process. - - If the Tor server is not running a directory service, it should respond - with a REASON_NOTDIRECTORY RELAY_END cell. - - Clients MUST generate an all-zero payload for RELAY_BEGIN_DIR cells, - and servers MUST ignore the payload. - - [RELAY_BEGIN_DIR was not supported before Tor 0.1.2.2-alpha; clients - SHOULD NOT send it to routers running earlier versions of Tor.] - -6.3. 
Closing streams - - When an anonymized TCP connection is closed, or an edge node - encounters error on any stream, it sends a 'RELAY_END' cell along the - circuit (if possible) and closes the TCP connection immediately. If - an edge node receives a 'RELAY_END' cell for any stream, it closes - the TCP connection completely, and sends nothing more along the - circuit for that stream. - - The payload of a RELAY_END cell begins with a single 'reason' byte to - describe why the stream is closing, plus optional data (depending on - the reason.) The values are: - - 1 -- REASON_MISC (catch-all for unlisted reasons) - 2 -- REASON_RESOLVEFAILED (couldn't look up hostname) - 3 -- REASON_CONNECTREFUSED (remote host refused connection) [*] - 4 -- REASON_EXITPOLICY (OR refuses to connect to host or port) - 5 -- REASON_DESTROY (Circuit is being destroyed) - 6 -- REASON_DONE (Anonymized TCP connection was closed) - 7 -- REASON_TIMEOUT (Connection timed out, or OR timed out - while connecting) - 8 -- REASON_NOROUTE (Routing error while attempting to - contact destination) - 9 -- REASON_HIBERNATING (OR is temporarily hibernating) - 10 -- REASON_INTERNAL (Internal error at the OR) - 11 -- REASON_RESOURCELIMIT (OR has no resources to fulfill request) - 12 -- REASON_CONNRESET (Connection was unexpectedly reset) - 13 -- REASON_TORPROTOCOL (Sent when closing connection because of - Tor protocol violations.) - 14 -- REASON_NOTDIRECTORY (Client sent RELAY_BEGIN_DIR to a - non-directory server.) - - (With REASON_EXITPOLICY, the 4-byte IPv4 address or 16-byte IPv6 address - forms the optional data, along with a 4-byte TTL; no other reason - currently has extra data.) - - OPs and ORs MUST accept reasons not on the above list, since future - versions of Tor may provide more fine-grained reasons. - - Tors SHOULD NOT send any reason except REASON_MISC for a stream that they - have originated. - - [*] Older versions of Tor also send this reason when connections are - reset. 
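The RELAY_END payload described above (one reason byte, plus an address and TTL for REASON_EXITPOLICY) can be decoded roughly as follows. This is a hedged sketch, not Tor's parser: the function name parse_relay_end is hypothetical, the input is assumed to be only the meaningful bytes of the relay payload (the first 'Length' bytes), and telling IPv4 from IPv6 by the remaining length is an assumption of this example.

```python
import ipaddress
import struct

# Reason codes copied from the table above.
END_REASONS = {
    1: "REASON_MISC", 2: "REASON_RESOLVEFAILED", 3: "REASON_CONNECTREFUSED",
    4: "REASON_EXITPOLICY", 5: "REASON_DESTROY", 6: "REASON_DONE",
    7: "REASON_TIMEOUT", 8: "REASON_NOROUTE", 9: "REASON_HIBERNATING",
    10: "REASON_INTERNAL", 11: "REASON_RESOURCELIMIT", 12: "REASON_CONNRESET",
    13: "REASON_TORPROTOCOL", 14: "REASON_NOTDIRECTORY",
}
REASON_EXITPOLICY = 4


def parse_relay_end(data):
    """Return (reason_name, address_or_None, ttl_or_None) for a RELAY_END payload."""
    if not data:
        raise ValueError("RELAY_END payload needs at least a reason byte")
    reason = data[0]
    # Unlisted reasons MUST be accepted, so map them to a placeholder name.
    name = END_REASONS.get(reason, "UNKNOWN")
    addr = ttl = None
    if reason == REASON_EXITPOLICY:
        rest = data[1:]
        # Assumption: the address family is inferred from the remaining length.
        if len(rest) == 8:       # 4-byte IPv4 address + 4-byte TTL
            addr = ipaddress.IPv4Address(rest[:4])
            (ttl,) = struct.unpack(">I", rest[4:8])
        elif len(rest) == 20:    # 16-byte IPv6 address + 4-byte TTL
            addr = ipaddress.IPv6Address(rest[:16])
            (ttl,) = struct.unpack(">I", rest[16:20])
    return name, addr, ttl
```

Mapping unknown codes to a placeholder instead of raising reflects the requirement that OPs and ORs accept reasons not on the list.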
- - --- [The rest of this section describes unimplemented functionality.] - - Because TCP connections can be half-open, we follow an equivalent - to TCP's FIN/FIN-ACK/ACK protocol to close streams. - - An exit connection can have a TCP stream in one of three states: - 'OPEN', 'DONE_PACKAGING', and 'DONE_DELIVERING'. For the purposes - of modeling transitions, we treat 'CLOSED' as a fourth state, - although connections in this state are not, in fact, tracked by the - onion router. - - A stream begins in the 'OPEN' state. Upon receiving a 'FIN' from - the corresponding TCP connection, the edge node sends a 'RELAY_FIN' - cell along the circuit and changes its state to 'DONE_PACKAGING'. - Upon receiving a 'RELAY_FIN' cell, an edge node sends a 'FIN' to - the corresponding TCP connection (e.g., by calling - shutdown(SHUT_WR)) and changes its state to 'DONE_DELIVERING'. - - When a stream already in 'DONE_DELIVERING' receives a 'FIN', it - also sends a 'RELAY_FIN' along the circuit, and changes its state - to 'CLOSED'. When a stream already in 'DONE_PACKAGING' receives a - 'RELAY_FIN' cell, it sends a 'FIN' and changes its state to - 'CLOSED'. - - If an edge node encounters an error on any stream, it sends a - 'RELAY_END' cell (if possible) and closes the stream immediately. - -6.4. Remote hostname lookup - - To find the address associated with a hostname, the OP sends a - RELAY_RESOLVE cell containing the hostname to be resolved with a NUL - terminating byte. (For a reverse lookup, the OP sends a RELAY_RESOLVE - cell containing an in-addr.arpa address.) The OR replies with a - RELAY_RESOLVED cell containing a status byte, and any number of - answers. Each answer is of the form: - Type (1 octet) - Length (1 octet) - Value (variable-width) - TTL (4 octets) - "Length" is the length of the Value field. 
- "Type" is one of: - 0x00 -- Hostname - 0x04 -- IPv4 address - 0x06 -- IPv6 address - 0xF0 -- Error, transient - 0xF1 -- Error, nontransient - - If any answer has a type of 'Error', then no other answer may be given. - - The RELAY_RESOLVE cell must use a nonzero, distinct streamID; the - corresponding RELAY_RESOLVED cell must use the same streamID. No stream - is actually created by the OR when resolving the name. - -7. Flow control - -7.1. Link throttling - - Each client or relay should do appropriate bandwidth throttling to - keep its user happy. - - Communicants rely on TCP's default flow control to push back when they - stop reading. - - The mainline Tor implementation uses token buckets (one for reads, - one for writes) for the rate limiting. - - Since 0.2.0.x, Tor has let the user specify an additional pair of - token buckets for "relayed" traffic, so people can deploy a Tor relay - with strict rate limiting, but also use the same Tor as a client. To - avoid partitioning concerns we combine both classes of traffic over a - given OR connection, and keep track of the last time we read or wrote - a high-priority (non-relayed) cell. If it's been less than N seconds - (currently N=30), we give the whole connection high priority, else we - give the whole connection low priority. We also give low priority - to reads and writes for connections that are serving directory - information. See proposal 111 for details. - -7.2. Link padding - - Link padding can be created by sending PADDING cells along the - connection; relay cells of type "DROP" can be used for long-range - padding. - - Currently nodes are not required to do any sort of link padding or - dummy traffic. Because strong attacks exist even with link padding, - and because link padding greatly increases the bandwidth requirements - for running a node, we plan to leave out link padding until this - tradeoff is better understood. - -7.3. 
Circuit-level flow control - - To control a circuit's bandwidth usage, each OR keeps track of two - 'windows', consisting of how many RELAY_DATA cells it is allowed to - originate (package for transmission), and how many RELAY_DATA cells - it is willing to consume (receive for local streams). These limits - do not apply to cells that the OR receives from one host and relays - to another. - - Each 'window' value is initially set to 1000 data cells - in each direction (cells that are not data cells do not affect - the window). When an OR is willing to deliver more cells, it sends a - RELAY_SENDME cell towards the OP, with Stream ID zero. When an OR - receives a RELAY_SENDME cell with stream ID zero, it increments its - packaging window. - - Each of these cells increments the corresponding window by 100. - - The OP behaves identically, except that it must track a packaging - window and a delivery window for every OR in the circuit. - - An OR or OP sends cells to increment its delivery window when the - corresponding window value falls under some threshold (900). - - If a packaging window reaches 0, the OR or OP stops reading from - TCP connections for all streams on the corresponding circuit, and - sends no more RELAY_DATA cells until receiving a RELAY_SENDME cell. -[this stuff is badly worded; copy in the tor-design section -RD] - -7.4. Stream-level flow control - - Edge nodes use RELAY_SENDME cells to implement end-to-end flow - control for individual connections across circuits. Similarly to - circuit-level flow control, edge nodes begin with a window of cells - (500) per stream, and increment the window by a fixed value (50) - upon receiving a RELAY_SENDME cell. Edge nodes initiate RELAY_SENDME - cells when both a) the window is <= 450, and b) there are fewer than - ten cell payloads remaining to be flushed at that edge. - -A.1.
Differences between spec and implementation - -- The current specification requires all ORs to have IPv4 addresses, but - allows servers to exit and resolve to IPv6 addresses, and to declare IPv6 - addresses in their exit policies. The current codebase has no IPv6 - support at all. - diff --git a/doc/spec/version-spec.txt b/doc/spec/version-spec.txt deleted file mode 100644 index 265717f409..0000000000 --- a/doc/spec/version-spec.txt +++ /dev/null @@ -1,44 +0,0 @@ - - HOW TOR VERSION NUMBERS WORK - -1. The Old Way - - Before 0.1.0, versions were of the format: - MAJOR.MINOR.MICRO(status(PATCHLEVEL))?(-cvs)? - where MAJOR, MINOR, MICRO, and PATCHLEVEL are numbers, status is one - of "pre" (for an alpha release), "rc" (for a release candidate), or - "." for a release. As a special case, "a.b.c" was equivalent to - "a.b.c.0". We compare the elements in order (major, minor, micro, - status, patchlevel, cvs), with "cvs" preceding non-cvs. - - We would start each development branch with a final version in mind: - say, "0.0.8". Our first pre-release would be "0.0.8pre1", followed by - (for example) "0.0.8pre2-cvs", "0.0.8pre2", "0.0.8pre3-cvs", - "0.0.8rc1", "0.0.8rc2-cvs", and "0.0.8rc2". Finally, we'd release - 0.0.8. The stable CVS branch would then be versioned "0.0.8.1-cvs", - and any eventual bugfix release would be "0.0.8.1". - -2. The New Way - - After 0.1.0, versions are of the format: - MAJOR.MINOR.MICRO(.PATCHLEVEL)(-status_tag) - The stuff in parentheses is optional. As before, MAJOR, MINOR, MICRO, - and PATCHLEVEL are numbers, with an absent number equivalent to 0. - All versions should be distinguishable purely by those four - numbers. The status tag is purely informational, and lets you know how - stable we think the release is: "alpha" is pretty unstable; "rc" is a - release candidate; and no tag at all means that we have a final - release. If the tag ends with "-cvs" or "-dev", you're looking at a - development snapshot that came after a given release. 
If we *do* - encounter two versions that differ only by status tag, we compare them - lexically. - - Now, we start each development branch with (say) 0.1.1.1-alpha. The - patchlevel increments consistently as the status tag changes, for - example, as in: 0.1.1.2-alpha, 0.1.1.3-alpha, 0.1.1.4-rc, 0.1.1.5-rc. - Eventually, we release 0.1.1.6. The next patch release is 0.1.1.7. - - Between these releases, CVS is versioned with a -cvs tag: after - 0.1.1.1-alpha comes 0.1.1.1-alpha-cvs, and so on. But starting with - 0.1.2.1-alpha-dev, we switched to SVN and started using the "-dev" - suffix instead of the "-cvs" suffix. |
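The "New Way" format above lends itself to a short parsing and ordering sketch in Python. It is illustrative only and is not taken from the Tor codebase: the name `parse_version` and the regular expression are assumptions made for the example. It covers only the post-0.1.0 format, treats an absent PATCHLEVEL as 0, and falls back to a lexical comparison of the status tag when the four numbers are equal.

```python
import re

# MAJOR.MINOR.MICRO(.PATCHLEVEL)(-status_tag); the parts in parentheses
# are optional. This pattern is an assumption made for the sketch.
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:\.(\d+))?(?:-(.+))?$")


def parse_version(text: str) -> tuple:
    """Return (major, minor, micro, patchlevel, status_tag) for sorting.

    Hypothetical helper: an absent PATCHLEVEL counts as 0 and an absent
    status tag is represented by the empty string.
    """
    match = _VERSION_RE.match(text)
    if match is None:
        raise ValueError(f"not a new-style version: {text!r}")
    major, minor, micro, patchlevel, tag = match.groups()
    return (int(major), int(minor), int(micro), int(patchlevel or 0), tag or "")


versions = ["0.1.1.6", "0.1.1.1-alpha-cvs", "0.1.1.4-rc", "0.1.1.1-alpha"]
# Tuples compare element by element: the four numbers first, then the
# status tag lexically, so "alpha" sorts before "alpha-cvs".
ordered = sorted(versions, key=parse_version)
```

One consequence of the plain lexical fallback is that an untagged version sorts before a tagged version with the same four numbers. The text above says such collisions should not occur, so the sketch leaves that case as it falls.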