From e4e0d93d56ee8c1aec4c2efaa7046b651f0fe55c Mon Sep 17 00:00:00 2001
From: Nick Mathewson
Date: Thu, 12 Oct 2023 12:27:58 -0400
Subject: Move all text-only specifications into the OLD_TXT directory.

---
 address-spec.txt                           |   94 -
 attic/text_formats/README.md               |    2 +
 attic/text_formats/address-spec.txt        |   94 +
 attic/text_formats/bandwidth-file-spec.txt | 1315 +++++++++
 attic/text_formats/bridgedb-spec.txt       |  409 +++
 attic/text_formats/cert-spec.txt           |  198 ++
 attic/text_formats/control-spec.txt        | 4418 ++++++++++++++++++++++++++++
 attic/text_formats/dir-list-spec.txt       |  529 ++++
 attic/text_formats/dir-spec.txt            | 4299 +++++++++++++++++++++++++++
 attic/text_formats/ext-orport-spec.txt     |  226 ++
 attic/text_formats/gettor-spec.txt         |   88 +
 attic/text_formats/glossary.txt            |  198 ++
 attic/text_formats/guard-spec.txt          |  972 ++++++
 attic/text_formats/padding-spec.txt        |  625 ++++
 attic/text_formats/param-spec.txt          |  517 ++++
 attic/text_formats/path-spec.txt           | 1051 +++++++
 attic/text_formats/pt-spec.txt             |  828 ++++++
 attic/text_formats/rend-spec-v3.txt        | 2869 ++++++++++++++++++
 attic/text_formats/socks-extensions.txt    |  175 ++
 attic/text_formats/srv-spec.txt            |  653 ++++
 attic/text_formats/tor-spec.txt            | 2735 +++++++++++++++++
 attic/text_formats/version-spec.txt        |   86 +
 bandwidth-file-spec.txt                    | 1315 ---------
 bridgedb-spec.txt                          |  409 ---
 cert-spec.txt                              |  198 --
 control-spec.txt                           | 4418 ----------------------------
 dir-list-spec.txt                          |  529 ----
 dir-spec.txt                               | 4299 ---------------------------
 ext-orport-spec.txt                        |  226 --
 gettor-spec.txt                            |   88 -
 glossary.txt                               |  198 --
 guard-spec.txt                             |  972 ------
 padding-spec.txt                           |  625 ----
 param-spec.txt                             |  517 ----
 path-spec.txt                              | 1051 -------
 pt-spec.txt                                |  828 ------
 rend-spec-v3.txt                           | 2869 ------------------
 socks-extensions.txt                       |  175 --
 srv-spec.txt                               |  653 ----
 tor-spec.txt                               | 2735 -----------------
 version-spec.txt                           |   86 -
 41 files changed, 22287 insertions(+), 22285 deletions(-)
 delete mode 100644 address-spec.txt
 create mode 100644 attic/text_formats/README.md
 create 
mode 100644 attic/text_formats/address-spec.txt create mode 100644 attic/text_formats/bandwidth-file-spec.txt create mode 100644 attic/text_formats/bridgedb-spec.txt create mode 100644 attic/text_formats/cert-spec.txt create mode 100644 attic/text_formats/control-spec.txt create mode 100644 attic/text_formats/dir-list-spec.txt create mode 100644 attic/text_formats/dir-spec.txt create mode 100644 attic/text_formats/ext-orport-spec.txt create mode 100644 attic/text_formats/gettor-spec.txt create mode 100644 attic/text_formats/glossary.txt create mode 100644 attic/text_formats/guard-spec.txt create mode 100644 attic/text_formats/padding-spec.txt create mode 100644 attic/text_formats/param-spec.txt create mode 100644 attic/text_formats/path-spec.txt create mode 100644 attic/text_formats/pt-spec.txt create mode 100644 attic/text_formats/rend-spec-v3.txt create mode 100644 attic/text_formats/socks-extensions.txt create mode 100644 attic/text_formats/srv-spec.txt create mode 100644 attic/text_formats/tor-spec.txt create mode 100644 attic/text_formats/version-spec.txt delete mode 100644 bandwidth-file-spec.txt delete mode 100644 bridgedb-spec.txt delete mode 100644 cert-spec.txt delete mode 100644 control-spec.txt delete mode 100644 dir-list-spec.txt delete mode 100644 dir-spec.txt delete mode 100644 ext-orport-spec.txt delete mode 100644 gettor-spec.txt delete mode 100644 glossary.txt delete mode 100644 guard-spec.txt delete mode 100644 padding-spec.txt delete mode 100644 param-spec.txt delete mode 100644 path-spec.txt delete mode 100644 pt-spec.txt delete mode 100644 rend-spec-v3.txt delete mode 100644 socks-extensions.txt delete mode 100644 srv-spec.txt delete mode 100644 tor-spec.txt delete mode 100644 version-spec.txt diff --git a/address-spec.txt b/address-spec.txt deleted file mode 100644 index 1e90e6e..0000000 --- a/address-spec.txt +++ /dev/null @@ -1,94 +0,0 @@ - Special Hostnames in Tor - Nick Mathewson - -Table of Contents - - 1. Overview - 2. .exit - 3. 
.onion - 4. .noconnect - -1. Overview - - Most of the time, Tor treats user-specified hostnames as opaque: When - the user connects to www.torproject.org, Tor picks an exit node and uses - that node to connect to "www.torproject.org". Some hostnames, however, - can be used to override Tor's default behavior and circuit-building - rules. - - These hostnames can be passed to Tor as the address part of a SOCKS4a or - SOCKS5 request. If the application is connected to Tor using an IP-only - method (such as SOCKS4, TransPort, or NATDPort), these hostnames can be - substituted for certain IP addresses using the MapAddress configuration - option or the MAPADDRESS control command. - -2. .exit - - SYNTAX: [hostname].[name-or-digest].exit - [name-or-digest].exit - - Hostname is a valid hostname; [name-or-digest] is either the nickname of a - Tor node or the hex-encoded digest of that node's public key. - - When Tor sees an address in this format, it uses the specified hostname as - the exit node. If no "hostname" component is given, Tor defaults to the - published IPv4 address of the exit node. - - It is valid to try to resolve hostnames, and in fact upon success Tor - will cache an internal mapaddress of the form - "www.google.com.foo.exit=64.233.161.99.foo.exit" to speed subsequent - lookups. - - The .exit notation is disabled by default as of Tor 0.2.2.1-alpha, due - to potential application-level attacks. - - EXAMPLES: - www.example.com.exampletornode.exit - - Connect to www.example.com from the node called "exampletornode". - - exampletornode.exit - - Connect to the published IP address of "exampletornode" using - "exampletornode" as the exit. - -3. .onion - - SYNTAX: [digest].onion - [ignored].[digest].onion - - Version 2 addresses (deprecated since 0.4.6.1-alpha), the digest is the first - eighty bits of a SHA1 hash of the identity key for a hidden service, encoded - in base32. 
- - Version 3 addresses, the digest is defined as: - - onion_address = base32(PUBKEY | CHECKSUM | VERSION) - CHECKSUM = H(".onion checksum" | PUBKEY | VERSION)[:2] - - where: - - PUBKEY is the 32 bytes ed25519 master pubkey of the onion service. - - VERSION is a one byte version field (default value '\x03') - - ".onion checksum" is a constant string - - H is SHA3-256 - - CHECKSUM is truncated to two bytes before inserting it in onion_address - - When Tor sees an address in this format, it tries to look up and connect to - the specified onion service. See rend-spec-v3.txt for full details. - - The "ignored" portion of the address is intended for use in vhosting, and - is supported in Tor 0.2.4.10-alpha and later. - -4. .noconnect - - SYNTAX: [string].noconnect - - When Tor sees an address in this format, it immediately closes the - connection without attaching it to any circuit. This is useful for - controllers that want to test whether a given application is indeed - using the same instance of Tor that they're controlling. - - This feature was added in Tor 0.1.2.4-alpha, and taken out in Tor - 0.2.2.1-alpha over fears that it provided another avenue for detecting - Tor users via application-level web tricks. - diff --git a/attic/text_formats/README.md b/attic/text_formats/README.md new file mode 100644 index 0000000..b68d031 --- /dev/null +++ b/attic/text_formats/README.md @@ -0,0 +1,2 @@ +This directory contains our specifications before our conversion +to mdbook on Thu Oct 12 12:27:58 PM EDT 2023. diff --git a/attic/text_formats/address-spec.txt b/attic/text_formats/address-spec.txt new file mode 100644 index 0000000..1e90e6e --- /dev/null +++ b/attic/text_formats/address-spec.txt @@ -0,0 +1,94 @@ + Special Hostnames in Tor + Nick Mathewson + +Table of Contents + + 1. Overview + 2. .exit + 3. .onion + 4. .noconnect + +1. 
Overview + + Most of the time, Tor treats user-specified hostnames as opaque: When + the user connects to www.torproject.org, Tor picks an exit node and uses + that node to connect to "www.torproject.org". Some hostnames, however, + can be used to override Tor's default behavior and circuit-building + rules. + + These hostnames can be passed to Tor as the address part of a SOCKS4a or + SOCKS5 request. If the application is connected to Tor using an IP-only + method (such as SOCKS4, TransPort, or NATDPort), these hostnames can be + substituted for certain IP addresses using the MapAddress configuration + option or the MAPADDRESS control command. + +2. .exit + + SYNTAX: [hostname].[name-or-digest].exit + [name-or-digest].exit + + Hostname is a valid hostname; [name-or-digest] is either the nickname of a + Tor node or the hex-encoded digest of that node's public key. + + When Tor sees an address in this format, it uses the specified hostname as + the exit node. If no "hostname" component is given, Tor defaults to the + published IPv4 address of the exit node. + + It is valid to try to resolve hostnames, and in fact upon success Tor + will cache an internal mapaddress of the form + "www.google.com.foo.exit=64.233.161.99.foo.exit" to speed subsequent + lookups. + + The .exit notation is disabled by default as of Tor 0.2.2.1-alpha, due + to potential application-level attacks. + + EXAMPLES: + www.example.com.exampletornode.exit + + Connect to www.example.com from the node called "exampletornode". + + exampletornode.exit + + Connect to the published IP address of "exampletornode" using + "exampletornode" as the exit. + +3. .onion + + SYNTAX: [digest].onion + [ignored].[digest].onion + + Version 2 addresses (deprecated since 0.4.6.1-alpha), the digest is the first + eighty bits of a SHA1 hash of the identity key for a hidden service, encoded + in base32. 
+ + Version 3 addresses, the digest is defined as: + + onion_address = base32(PUBKEY | CHECKSUM | VERSION) + CHECKSUM = H(".onion checksum" | PUBKEY | VERSION)[:2] + + where: + - PUBKEY is the 32 bytes ed25519 master pubkey of the onion service. + - VERSION is a one byte version field (default value '\x03') + - ".onion checksum" is a constant string + - H is SHA3-256 + - CHECKSUM is truncated to two bytes before inserting it in onion_address + + When Tor sees an address in this format, it tries to look up and connect to + the specified onion service. See rend-spec-v3.txt for full details. + + The "ignored" portion of the address is intended for use in vhosting, and + is supported in Tor 0.2.4.10-alpha and later. + +4. .noconnect + + SYNTAX: [string].noconnect + + When Tor sees an address in this format, it immediately closes the + connection without attaching it to any circuit. This is useful for + controllers that want to test whether a given application is indeed + using the same instance of Tor that they're controlling. + + This feature was added in Tor 0.1.2.4-alpha, and taken out in Tor + 0.2.2.1-alpha over fears that it provided another avenue for detecting + Tor users via application-level web tricks. + diff --git a/attic/text_formats/bandwidth-file-spec.txt b/attic/text_formats/bandwidth-file-spec.txt new file mode 100644 index 0000000..bad13f6 --- /dev/null +++ b/attic/text_formats/bandwidth-file-spec.txt @@ -0,0 +1,1315 @@ + + Tor Bandwidth File Format + juga + teor + +Table of Contents + + 1. Scope and preliminaries + 1.2. Acknowledgements + 1.3. Outline + 1.4. Format Versions + 2. Format details + 2.1. Definitions + 2.2. Header List format + 2.3. Relay Line format + 2.4. Implementation details + 2.4.1. Writing bandwidth files atomically + 2.4.2. Additional KeyValue pair definitions + 2.4.2.1. Simple Bandwidth Scanner + 2.4.2.2. Torflow + A. Sample data + A.1. Generated by Torflow + A.2. Generated by sbws version 0.1.0 + A.3. 
Generated by sbws version 1.0.3 + A.4. Headers generated by sbws version 1.0.4 + A.5 Generated by sbws version 1.1.0 + B. Scaling bandwidths + B.1. Scaling requirements + B.2. A linear scaling method + B.3. Quota changes + B.4. Torflow aggregation + +1. Scope and preliminaries + + This document describes the format of Tor's Bandwidth File, version + 1.0.0 and later. + + It is a new specification for the existing bandwidth file format, + which we call version 1.0.0. It also specifies new format versions + 1.1.0 and later, which are backwards compatible with 1.0.0 parsers. + + Since Tor version 0.2.4.12-alpha, the directory authorities use + the Bandwidth File file called "V3BandwidthsFile" generated by + Torflow [1]. The details of this format are described in Torflow's + README.spec.txt. We also summarise the format in this specification. + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL + NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and + "OPTIONAL" in this document are to be interpreted as described in + RFC 2119. + +1.2. Acknowledgements + + The original bandwidth generator (Torflow) and format was + created by mike. Teor suggested to write this specification while + contributing on pastly's new bandwidth generator implementation. + + This specification was revised after feedback from: + + Nick Mathewson (nickm) + Iain Learmonth (irl) + +1.3. Outline + + The Tor directory protocol (dir-spec.txt [3]) sections 3.4.1 + and 3.4.2, use the term bandwidth measurements, to refer to what + here is called Bandwidth File. + + A Bandwidth File contains information on relays' bandwidth + capacities and is produced by bandwidth generators, previously known + as bandwidth scanners. + +1.4. Format Versions + + 1.0.0 - The legacy Bandwidth File format + + 1.1.0 - Adds a header containing information about the bandwidth + file. Document the sbws and Torflow relay line keys. 
+ + 1.2.0 - If there are not enough eligible relays, the bandwidth file + SHOULD contain a header, but no relays. (To match Torflow's + existing behaviour.) + + Adds scanner and destination countries to the header. + Adds new KeyValue Lines to the Header List section with + statistics about the number of relays included in the file. + Adds new KeyValues to Relay Bandwidth Lines, with different + bandwidth values (averages and descriptor bandwidths). + + 1.4.0 - Adds monitoring KeyValues to the header and relay lines. + + RelayLines for excluded relays MAY be present in the bandwidth + file for diagnostic reasons. Similarly, if there are not enough + eligible relays, the bandwidth file MAY contain all known relays. + + Diagnostic relay lines SHOULD be marked with vote=0, and + Tor SHOULD NOT use their bandwidths in its votes. + + Also adds Tor version. + 1.5.0 - Removes "recent_measurement_attempt_count" KeyValue. + 1.6.0 - Adds congestion control stream events KeyValues. + 1.7.0 - Adds ratios KeyValues to the relay lines and network averages + KeyValues to the header. + + All Tor versions can consume format version 1.0.0. + + All Tor versions can consume format version 1.1.0 and later, + but Tor versions earlier than 0.3.5.1-alpha warn if the header + contains any KeyValue lines after the Timestamp. + + Tor versions 0.4.0.3-alpha, 0.3.5.8, 0.3.4.11, and earlier do not + understand "vote=0". Instead, they will vote for the actual bandwidths + that sbws puts in diagnostic relay lines: + * 1 for relays with "unmeasured=1", and + * the relay's measured and scaled bandwidth when "under_min_report=1". + +2. Format details + + The Bandwidth File MUST contain the following sections: + - Header List (exactly once), which is a partially ordered list of + - Header Lines (one or more times), then + - Relay Lines (zero or more times), in an arbitrary order. + If it does not contain these sections, parsers SHOULD ignore the file. + +2.1. 
Definitions + + The following nonterminals are defined in Tor directory protocol + sections 1.2., 2.1.1., 2.1.3.: + + bool + Int + SP (space) + NL (newline) + KeywordChar + ArgumentChar + nickname + hexdigest (a '$', followed by 40 hexadecimal characters + ([A-Fa-f0-9])) + + Nonterminal defined section 2 of version-spec.txt [4]: + + version_number + + We define the following nonterminals: + + Line ::= ArgumentChar* NL + RelayLine ::= KeyValue (SP KeyValue)* NL + HeaderLine ::= KeyValue NL + KeyValue ::= Key "=" Value + Key ::= (KeywordChar | "_")+ + Value ::= ArgumentCharValue+ + ArgumentCharValue ::= any printing ASCII character except NL and SP. + Terminator ::= "=====" or "====" + Generators SHOULD use a 5-character terminator. + Timestamp ::= Int + Bandwidth ::= Int + MasterKey ::= a base64-encoded Ed25519 public key, with + padding characters omitted. + DateTime ::= "YYYY-MM-DDTHH:MM:SS", as in ISO 8601 + CountryCode ::= Two capital ASCII letters ([A-Z]{2}), as defined in + ISO 3166-1 alpha-2 plus "ZZ" to denote unknown country + (eg the destination is in a Content Delivery Network). + CountryCodeList ::= One or more CountryCode(s) separated by a comma + ([A-Z]{2}(,[A-Z]{2})*). + + Note that key_value and value are defined in Tor directory protocol + with different formats to KeyValue and Value here. + + Tor versions earlier than 0.3.5.1-alpha require all lines in the file + to be 510 characters or less. The previous limit was 254 characters in + Tor 0.2.6.2-alpha and earlier. Parsers MAY ignore longer Lines. + + Note that directory authorities are only supported on the two most + recent stable Tor versions, so we expect that line limits will be + removed after Tor 0.4.0 is released in 2019. + +2.2. Header List format + + It consists of a Timestamp line and zero or more HeaderLines. + + All the header lines MUST conform to the HeaderLine format, except + the first Timestamp line. 
+ + The Timestamp line is not a HeaderLine to keep compatibility with + the legacy Bandwidth File format. + + Some header Lines MUST appear in specific positions, as documented + below. All other Lines can appear in any order. + + If a parser does not recognize any extra material in a header Line, + the Line MUST be ignored. + + If a header Line does not conform to this format, the Line SHOULD be + ignored by parsers. + + It consists of: + + Timestamp NL + + [At start, exactly once.] + + The Unix Epoch time in seconds of the most recent generator bandwidth + result. + + If the generator implementation has multiple threads or + subprocesses which can fail independently, it SHOULD take the most + recent timestamp from each thread and use the oldest value. This + ensures all the threads continue running. + + If there are threads that do not run continuously, they SHOULD be + excluded from the timestamp calculation. + + If there are no recent results, the generator MUST NOT generate a new + file. + + It does not follow the KeyValue format for backwards compatibility + with version 1.0.0. + + "version" version_number NL + + [In second position, zero or one time.] + + The specification document format version. + It uses semantic versioning [5]. + + This Line was added in version 1.1.0 of this specification. + + Version 1.0.0 documents do not contain this Line, and the + version_number is considered to be "1.0.0". + + "software" Value NL + + [Zero or one time.] + + The name of the software that created the document. + + This Line was added in version 1.1.0 of this specification. + + Version 1.0.0 documents do not contain this Line, and the software + is considered to be "torflow". + + "software_version" Value NL + + [Zero or one time.] + + The version of the software that created the document. + The version may be a version_number, a git commit, or some other + version scheme. + + This Line was added in version 1.1.0 of this specification. 
+ + "file_created" DateTime NL + + [Zero or one time.] + + The date and time timestamp in ISO 8601 format and UTC time zone + when the file was created. + + This Line was added in version 1.1.0 of this specification. + + "generator_started" DateTime NL + + [Zero or one time.] + + The date and time timestamp in ISO 8601 format and UTC time zone + when the generator started. + + This Line was added in version 1.1.0 of this specification. + + "earliest_bandwidth" DateTime NL + + [Zero or one time.] + + The date and time timestamp in ISO 8601 format and UTC time zone + when the first relay bandwidth was obtained. + + This Line was added in version 1.1.0 of this specification. + + "latest_bandwidth" DateTime NL + + [Zero or one time.] + + The date and time timestamp in ISO 8601 format and UTC time zone + of the most recent generator bandwidth result. + + This time MUST be identical to the initial Timestamp line. + + This duplicate value is included to make the format easier for people + to read. + + This Line was added in version 1.1.0 of this specification. + + "number_eligible_relays" Int NL + + [Zero or one time.] + + The number of relays that have enough measurements to be + included in the bandwidth file. + + This Line was added in version 1.2.0 of this specification. + + "minimum_percent_eligible_relays" Int NL + + [Zero or one time.] + + The percentage of relays in the consensus that SHOULD be + included in every generated bandwidth file. + + If this threshold is not reached, format versions 1.3.0 and earlier + SHOULD NOT contain any relays. (Bandwidth files always include a + header.) + + Format versions 1.4.0 and later SHOULD include all the relays for + diagnostic purposes, even if this threshold is not reached. But these + relays SHOULD be marked so that Tor does not vote on them. + See section 1.4 for details. + + The minimum percentage is 60% in Torflow, so sbws uses + 60% as the default. + + This Line was added in version 1.2.0 of this specification. 
+ + "number_consensus_relays" Int NL + + [Zero or one time.] + + The number of relays in the consensus. + + This Line was added in version 1.2.0 of this specification. + + "percent_eligible_relays" Int NL + + [Zero or one time.] + + The number of eligible relays, as a percentage of the number + of relays in the consensus. + + This line SHOULD be equal to: + (number_eligible_relays * 100.0) / number_consensus_relays + to the number of relays in the consensus to include in this file. + + This Line was added in version 1.2.0 of this specification. + + "minimum_number_eligible_relays" Int NL + + [Zero or one time.] + + The minimum number of relays that SHOULD be included in the bandwidth + file. See minimum_percent_eligible_relays for details. + + This line SHOULD be equal to: + number_consensus_relays * (minimum_percent_eligible_relays / 100.0) + + This Line was added in version 1.2.0 of this specification. + + "scanner_country" CountryCode NL + + [Zero or one time.] + + The country, as in political geolocation, where the generator is run. + + This Line was added in version 1.2.0 of this specification. + + "destinations_countries" CountryCodeList NL + + [Zero or one time.] + + The country, as in political geolocation, or countries where the + destination Web server(s) are located. + The destination Web Servers serve the data that the generator retrieves + to measure the bandwidth. + + This Line was added in version 1.2.0 of this specification. + + "recent_consensus_count" Int NL + + [Zero or one time.]. + + The number of the different consensuses seen in the last data_period + days. (data_period is 5 by default.) + + Assuming that Tor clients fetch a consensus every 1-2 hours, + and that the data_period is 5 days, the Value of this Key SHOULD be + between: + data_period * 24 / 2 = 60 + data_period * 24 = 120 + + This Line was added in version 1.4.0 of this specification. + + "recent_priority_list_count" Int NL + + [Zero or one time.] 
+ + The number of times that a list with a subset of relays prioritized + to be measured has been created in the last data_period days. + (data_period is 5 by default.) + + In 2019, with 7000 relays in the network, the Value of this Key SHOULD be + approximately: + data_period * 24 / 1.5 = 80 + Being 1.5 the approximate number of hours it takes to measure a + priority list of 7000 * 0.05 (350) relays, when the fraction of relays + in a priority list is the 5% (0.05). + + This Line was added in version 1.4.0 of this specification. + + "recent_priority_relay_count" Int NL + + [Zero or one time.] + + The number of relays that has been in in the list of relays prioritized + to be measured in the last data_period days. (data_period is 5 by + default.) + + In 2019, with 7000 relays in the network, the Value of this Key SHOULD be + approximately: + 80 * (7000 * 0.05) = 28000 + Being 0.05 (5%) the fraction of relays in a priority list and 80 + the approximate number of priority lists (see + "recent_priority_list_count"). + + This Line was added in version 1.4.0 of this specification. + + "recent_measurement_attempt_count" Int NL + + [Zero or one time.] + + The number of times that any relay has been queued to be measured + in the last data_period days. (data_period is 5 by default.) + + In 2019, with 7000 relays in the network, the Value of this Key SHOULD be + approximately the same as "recent_priority_relay_count", + assuming that there is one attempt to measure a relay for each relay that + has been prioritized unless there are system, network or implementation + issues. + + This Line was added in version 1.4.0 of this specification and removed + in version 1.5.0. + + "recent_measurement_failure_count" Int NL + + [Zero or one time.] + + The number of times that the scanner attempted to measure a relay in + the last data_period days (5 by default), but the relay has not been + measured because of system, network or implementation issues. 
+ + This Line was added in version 1.4.0 of this specification. + + "recent_measurements_excluded_error_count" Int NL + + [Zero or one time.] + + The number of relays that have no successful measurements in the last + data_period days (5 by default). + + (See the note in section 1.4, version 1.4.0, about excluded relays.) + + This Line was added in version 1.4.0 of this specification. + + "recent_measurements_excluded_near_count" Int NL + + [Zero or one time.] + + The number of relays that have some successful measurements in the last + data_period days (5 by default), but all those measurements were + performed in a period of time that was too short (by default 1 day). + + (See the note in section 1.4, version 1.4.0, about excluded relays.) + + This Line was added in version 1.4.0 of this specification. + + "recent_measurements_excluded_old_count" Int NL + + [Zero or one time.] + + The number of relays that have some successful measurements, but all + those measurements are too old (more than 5 days, by default). + + Excludes relays that are already counted in + recent_measurements_excluded_near_count. + + (See the note in section 1.4, version 1.4.0, about excluded relays.) + + This Line was added in version 1.4.0 of this specification. + + "recent_measurements_excluded_few_count" Int NL + + [Zero or one time.] + + The number of relays that don't have enough recent successful + measurements. (Fewer than 2 measurements in the last 5 days, by + default). + + Excludes relays that are already counted in + recent_measurements_excluded_near_count and + recent_measurements_excluded_old_count. + + (See the note in section 1.4, version 1.4.0, about excluded relays.) + + This Line was added in version 1.4.0 of this specification. + + "time_to_report_half_network" Int NL + + [Zero or one time.] 
+ + The time in seconds that it would take to report measurements about the + half of the network, given the number of eligible relays and the time + it took in the last days (5 days, by default). + + (See the note in section 1.4, version 1.4.0, about excluded relays.) + + This Line was added in version 1.4.0 of this specification. + + "tor_version" version_number NL + + [Zero or one time.] + + The Tor version of the Tor process controlled by the generator. + + This Line was added in version 1.4.0 of this specification. + + "mu" Int NL + + [Zero or one time.] + + The network stream bandwidth average calculated as explained in B4.2. + + This Line was added in version 1.7.0 of this specification. + + "muf" Int NL + + [Zero or one time.] + + The network stream bandwidth average filtered calculated as explained in + B4.2. + + This Line was added in version 1.7.0 of this specification. + + KeyValue NL + + [Zero or more times.] + + There MUST NOT be multiple KeyValue header Lines with the same key. + If there are, the parser SHOULD choose an arbitrary Line. + + If a parser does not recognize a Keyword in a KeyValue Line, it + MUST be ignored. + + Future format versions may include additional KeyValue header Lines. + Additional header Lines will be accompanied by a minor version + increment. + + Implementations MAY add additional header Lines as needed. This + specification SHOULD be updated to avoid conflicting meanings for + the same header keys. + + Parsers MUST NOT rely on the order of these additional Lines. + + Additional header Lines MUST NOT use any keywords specified in the + relay measurements format. + If there are, the parser MAY ignore conflicting keywords. + + Terminator NL + + [Zero or one time.] + + The Header List section ends with a Terminator. + + In version 1.0.0, Header List ends when the first relay bandwidth + is found conforming to the next section. + + Implementations of version 1.1.0 and later SHOULD use a 5-character + terminator. 
+ + Tor 0.4.0.1-alpha and later look for a 5-character terminator, + or the first relay bandwidth line. sbws versions 0.1.0 to 1.0.2 + used a 4-character terminator, this bug was fixed in 1.0.3. + +2.3. Relay Line format + + It consists of zero or more RelayLines containing relay ids and + bandwidths. The relays and their KeyValues are in arbitrary order. + + There MUST NOT be multiple KeyValue pairs with the same key in the same + RelayLine. If there are, the parser SHOULD choose an arbitrary Value. + + There MUST NOT be multiple RelayLines per relay identity (node_id or + master_key_ed25519). If there are, parsers SHOULD issue a warning. + Parers MAY reject the file, choose an arbitrary RelayLine, or ignore + both RelayLines. + + If a parser does not recognize any extra material in a RelayLine, + the extra material MUST be ignored. + + Each RelayLine includes the following KeyValue pairs: + + "node_id" hexdigest + + [Exactly once.] + + The fingerprint for the relay's RSA identity key. + + Note: In bandwidth files read by Tor versions earlier than + 0.3.4.1-alpha, node_id MUST NOT be at the end of the Line. + These authority versions are no longer supported. + + Current Tor versions ignore master_key_ed25519, so node_id MUST be + present in each relay Line. + + Implementations of version 1.1.0 and later SHOULD include both node_id + and master_key_ed25519. Parsers SHOULD accept Lines that contain at + least one of them. + + "master_key_ed25519" MasterKey + + [Zero or one time.] + + The relays's master Ed25519 key, base64 encoded, + without trailing "="s, to avoid ambiguity with KeyValue "=" + character. + + This KeyValue pair SHOULD be present, see the note under node_id. + + This KeyValue was added in version 1.1.0 of this specification. + + "bw" Bandwidth + + [Exactly once.] + + The bandwidth of this relay in kilobytes per second. + + No Zero Bandwidths: + Tor accepts zero bandwidths, but they trigger bugs in older Tor + implementations. 
Therefore, implementations SHOULD NOT produce zero + bandwidths. Instead, they SHOULD use one as their minimum bandwidth. + If there are zero bandwidths, the parser MAY ignore them. + + Bandwidth Aggregation: + Multiple measurements can be aggregated using an averaging scheme, + such as a mean, median, or decaying average. + + Bandwidth Scaling: + Torflow scales bandwidths to kilobytes per second. Other + implementations SHOULD use kilobytes per second for their initial + bandwidth scaling. + + If different implementations or configurations are used in votes for + the same network, their measurements MAY need further scaling. See + Appendix B for information about scaling, and one possible scaling + method. + + MaxAdvertisedBandwidth: + Bandwidth generators MUST limit the relays' measured bandwidth based + on the MaxAdvertisedBadwidth. + A relay's MaxAdvertisedBandwidth limits the bandwidth-avg in its + descriptor. bandwidth-avg is the minimum of MaxAdvertisedBandwidth, + BandwidthRate, RelayBandwidthRate, BandwidthBurst, and + RelayBandwidthBurst. + Therefore, generators MUST limit a relay's measured bandwidth to its + descriptor's bandwidth-avg. This limit needs to be implemented in the + generator, because generators may scale consensus weights before + sending them to Tor. + Generators SHOULD NOT limit measured bandwidths based on descriptors' + bandwidth-observed, because that penalises new relays. + + sbws limits the relay's measured bandwidth to the bandwidth-avg + advertised. + + Torflow partitions relays based on their bandwidth. For unmeasured + relays, Torflow uses the minimum of all descriptor bandwidths, + including bandwidth-avg (MaxAdvertisedBandwidth) and + bandwidth-observed. Then Torflow measures the relays in each partition + against each other, which implicitly limits a relay's measured + bandwidth to the bandwidths of similar relays. 
+ + Torflow also generates consensus weights based on the ratio between the + measured bandwidth and the minimum of all descriptor bandwidths (at the + time of the measurement). So when an operator reduces the + MaxAdvertisedBandwidth for a relay, Torflow reduces that relay's + measured bandwidth. + + KeyValue + + [Zero or more times.] + + Future format versions may include additional KeyValue pairs on a + RelayLine. + Additional KeyValue pairs will be accompanied by a minor version + increment. + + Implementations MAY add additional relay KeyValue pairs as needed. + This specification SHOULD be updated to avoid conflicting meanings + for the same Keywords. + + Parsers MUST NOT rely on the order of these additional KeyValue + pairs. + + Additional KeyValue pairs MUST NOT use any keywords specified in the + header format. + If there are, the parser MAY ignore conflicting keywords. + +2.4. Implementation details + +2.4.1. Writing bandwidth files atomically + + To avoid inconsistent reads, implementations SHOULD write bandwidth files + atomically. If the file is transferred from another host, it SHOULD be + written to a temporary path, then renamed to the V3BandwidthsFile path. + + sbws versions 0.7.0 and later write the bandwidth file to an archival + location, create a temporary symlink to that location, then atomically rename + the symlink + to the configured V3BandwidthsFile path. + + Torflow does not write bandwidth files atomically. + +2.4.2. Additional KeyValue pair definitions + + KeyValue pairs in RelayLines that current implementations generate. + +2.4.2.1. Simple Bandwidth Scanner + + sbws RelayLines contain these keys: + + "node_id" hexdigest + + As above. + + "bw" Bandwidth + + As above. + + "nick" nickname + + [Exactly once.] + + The relay nickname. + + Torflow also has a "nick" KeyValue. + + "rtt" Int + + [Zero or one time.] + + The Round Trip Time in milliseconds to obtain 1 byte of data. 
+ + This KeyValue was added in version 1.1.0 of this specification. + It became optional in version 1.3.0 or 1.4.0 of this specification. + + "time" DateTime + + [Exactly once.] + + The date and time timestamp in ISO 8601 format and UTC time zone + when the last bandwidth was obtained. + + This KeyValue was added in version 1.1.0 of this specification. + The Torflow equivalent is "measured_at". + + "success" Int + + [Zero or one time.] + + The number of times that the bandwidth measurements for this relay were + successful. + + This KeyValue was added in version 1.1.0 of this specification. + + "error_circ" Int + + [Zero or one time.] + + The number of times that the bandwidth measurements for this relay + failed because of circuit failures. + + This KeyValue was added in version 1.1.0 of this specification. + The Torflow equivalent is "circ_fail". + + "error_stream" Int + + [Zero or one time.] + + The number of times that the bandwidth measurements for this relay + failed because of stream failures. + + This KeyValue was added in version 1.1.0 of this specification. + + "error_destination" Int + + [Zero or one time.] + + The number of times that the bandwidth measurements for this relay + failed because the destination Web server was not available. + + This KeyValue was added in version 1.4.0 of this specification. + + "error_second_relay" Int + + [Zero or one time.] + + The number of times that the bandwidth measurements for this relay + failed because sbws could not find a second relay for the test circuit. + + This KeyValue was added in version 1.4.0 of this specification. + + "error_misc" Int + + [Zero or one time.] + + The number of times that the bandwidth measurements for this relay + failed because of other reasons. + + This KeyValue was added in version 1.1.0 of this specification. + + "bw_mean" Int + + [Zero or one time.] + + The measured bandwidth mean for this relay in bytes per second. 
+ + This KeyValue was added in version 1.2.0 of this specification. + + "bw_median" Int + + [Zero or one time.] + + The measured bandwidth median for this relay in bytes per second. + + This KeyValue was added in version 1.2.0 of this specification. + + "desc_bw_avg" Int + + [Zero or one time.] + + The descriptor average bandwidth for this relay in bytes per second. + + This KeyValue was added in version 1.2.0 of this specification. + + "desc_bw_obs_last" Int + + [Zero or one time.] + + The last descriptor observed bandwidth for this relay in bytes per + second. + + This KeyValue was added in version 1.2.0 of this specification. + + "desc_bw_obs_mean" Int + + [Zero or one time.] + + The descriptor observed bandwidth mean for this relay in bytes per + second. + + This KeyValue was added in version 1.2.0 of this specification. + + "desc_bw_bur" Int + + [Zero or one time.] + + The descriptor burst bandwidth for this relay in bytes per + second. + + This KeyValue was added in version 1.2.0 of this specification. + + "consensus_bandwidth" Int + + [Zero or one time.] + + The consensus bandwidth for this relay in bytes per second. + + This KeyValue was added in version 1.2.0 of this specification. + + "consensus_bandwidth_is_unmeasured" Bool + + [Zero or one time.] + + This KeyValue is True if the consensus bandwidth for this relay was + not obtained from three or more bandwidth authorities, and False + otherwise. + + This KeyValue was added in version 1.2.0 of this specification. + + "relay_in_recent_consensus_count" Int + + [Zero or one time.] + + The number of times this relay was found in a consensus in the + last data_period days. (Unless otherwise stated, data_period is + 5 by default.) + + This KeyValue was added in version 1.4.0 of this specification. + + "relay_recent_priority_list_count" Int + + [Zero or one time.] + + The number of times this relay has been prioritized to be measured + in the last data_period days. 
+ + This KeyValue was added in version 1.4.0 of this specification. + + "relay_recent_measurement_attempt_count" Int + + [Zero or one time.] + + The number of times this relay was tried to be measured in the + last data_period days. + + This KeyValue was added in version 1.4.0 of this specification. + + "relay_recent_measurement_failure_count" Int + + [Zero or one time.] + + The number of times this relay was tried to be measured in the + last data_period days, but it was not possible to obtain a + measurement. + + This KeyValue was added in version 1.4.0 of this specification. + + "relay_recent_measurements_excluded_error_count" Int + + [Zero or one time.] + + The number of recent relay measurement attempts that failed. + Measurements are recent if they are in the last data_period days + (5 by default). + + (See the note in section 1.4, version 1.4.0, about excluded relays.) + + This KeyValue was added in version 1.4.0 of this specification. + + "relay_recent_measurements_excluded_near_count" Int + + [Zero or one time.] + + When all of a relay's recent successful measurements were performed in + a period of time that was too short (by default 1 day), the relay is + excluded. This KeyValue contains the number of recent successful + measurements for the relay that were ignored for this reason. + + (See the note in section 1.4, version 1.4.0, about excluded relays.) + + This KeyValue was added in version 1.4.0 of this specification. + + "relay_recent_measurements_excluded_old_count" Int + + [Zero or one time.] + + The number of successful measurements for this relay that are too old + (more than data_period days, 5 by default). + + Excludes measurements that are already counted in + relay_recent_measurements_excluded_near_count. + + (See the note in section 1.4, version 1.4.0, about excluded relays.) + + This KeyValue was added in version 1.4.0 of this specification. + + "relay_recent_measurements_excluded_few_count" Int + + [Zero or one time.] 
+ + The number of successful measurements for this relay that were ignored + because the relay did not have enough successful measurements (fewer + than 2, by default). + + Excludes measurements that are already counted in + relay_recent_measurements_excluded_near_count or + relay_recent_measurements_excluded_old_count. + + (See the note in section 1.4, version 1.4.0, about excluded relays.) + + This KeyValue was added in version 1.4.0 of this specification. + + "under_min_report" bool + + [Zero or one time.] + + If the value is 1, there are not enough eligible relays in the + bandwidth file, and Tor bandwidth authorities MAY NOT vote on this + relay. (Current Tor versions do not change their behaviour based on + the "under_min_report" key.) + + If the value is 0 or the KeyValue is not present, there are enough + relays in the bandwidth file. + + Because Tor versions released before April 2019 (see section 1.4. for + the full list of versions) ignore "vote=0", generator implementations + MUST NOT change the bandwidths for under_min_report relays. Using the + same bw value makes authorities that do not understand "vote=0" + or "under_min_report=1" produce votes that don't change relay weights + too much. It also avoids flapping when the reporting threshold is + reached. + + This KeyValue was added in version 1.4.0 of this specification. + + "unmeasured" bool + + [Zero or one time.] + + If the value is 1, this relay was not successfully measured and + Tor bandwidth authorities MAY NOT vote on this relay. + (Current Tor versions do not change their behaviour based on + the "unmeasured" key.) + + If the value is 0 or the KeyValue is not present, this relay + was successfully measured. + + Because Tor versions released before April 2019 (see section 1.4. for + the full list of versions) ignore "vote=0", generator implementations + MUST set "bw=1" for unmeasured relays. 
Using the minimum bw value + makes authorities that do not understand "vote=0" or "unmeasured=1" + produce votes that don't change relay weights too much. + + This KeyValue was added in version 1.4.0 of this specification. + + "vote" bool + + [Zero or one time.] + + If the value is 0, Tor directory authorities SHOULD ignore the relay's + entry in the bandwidth file. They SHOULD vote for the relay the same + way they would vote for a relay that is not present in the file. + + This MAY be the case when this relay was not successfully measured but + it is included in the Bandwidth File, to diagnose why they were not + measured. + + If the value is 1 or the KeyValue is not present, Tor directory + authorities MUST use the relay's bw value in any votes for that relay. + + Implementations MUST also set "bw=1" for unmeasured relays. + But they MUST NOT change the bw for under_min_report relays. + (See the explanations under "unmeasured" and "under_min_report" + for more details.) + + This KeyValue was added in version 1.4.0 of this specification. + + "xoff_recv" Int + + [Zero or one time.] + + The number of times this relay received `XOFF_RECV` stream events while + being measured in the last data_period days. + + This KeyValue was added in version 1.6.0 of this specification. + + "xoff_sent" Int + + [Zero or one time.] + + The number of times this relay received `XOFF_SENT` stream events while + being measured in the last data_period days. + + This KeyValue was added in version 1.6.0 of this specification. + + "r_strm" Float + + [Zero or one time.] + + The stream ratio of this relay calculated as explained in B4.3. + + This KeyValue was added in version 1.7.0 of this specification. + + "r_strm_filt" Float + + [Zero or one time.] + + The filtered stream ratio of this relay calculated as explained in B4.3. + + This KeyValue was added in version 1.7.0 of this specification. + + +2.4.2.2. Torflow + + Torflow RelayLines include node_id and bw, and other KeyValue pairs [2]. 
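The RelayLine rules in sections 2.3 and 2.4.2 can be sketched as a small Python parser. This is only an illustrative reading of the text, not code from Tor, sbws, or Torflow. The function name is hypothetical, and keeping the first of several duplicate keys is one arbitrary choice among those the specification permits.

```python
def parse_relay_line(line: str) -> dict:
    """Parse one RelayLine into a dict of KeyValue pairs (illustrative only)."""
    pairs = {}
    for token in line.split():
        # Split on the first "=" only; base64 keys carry no trailing "=".
        key, sep, value = token.partition("=")
        if not sep:
            # Unrecognized extra material MUST be ignored.
            continue
        # Duplicate keys MUST NOT appear; if they do, keep the first value
        # (an arbitrary choice, as the specification allows).
        pairs.setdefault(key, value)
    # Parsers SHOULD accept Lines with at least one relay identity.
    if "node_id" not in pairs and "master_key_ed25519" not in pairs:
        raise ValueError("RelayLine has no relay identity")
    # "bw" appears exactly once and is an integer.
    if "bw" not in pairs:
        raise ValueError("RelayLine has no bw")
    pairs["bw"] = int(pairs["bw"])
    return pairs
```

The order of KeyValue pairs does not matter to this sketch, which matches the requirement that parsers not rely on it.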
+ +References: + +1. https://gitweb.torproject.org/torflow.git +2. https://gitweb.torproject.org/torflow.git/tree/NetworkScanners/BwAuthority/README.spec.txt#n332 + The Torflow specification is outdated, and does not match the current + implementation. See section A.1. for the format produced by Torflow. +3. https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt +4. https://gitweb.torproject.org/torspec.git/tree/version-spec.txt +5. https://semver.org/ + +A. Sample data + +The following has not been obtained from any real measurement. + +A.1. Generated by Torflow + +This is an example version 1.0.0 document: + +1523911758 +node_id=$68A483E05A2ABDCA6DA5A3EF8DB5177638A27F80 bw=760 nick=Test measured_at=1523911725 updated_at=1523911725 pid_error=4.11374090719 pid_error_sum=4.11374090719 pid_bw=57136645 pid_delta=2.12168374577 circ_fail=0.2 scanner=/filepath +node_id=$96C15995F30895689291F455587BD94CA427B6FC bw=189 nick=Test2 measured_at=1523911623 updated_at=1523911623 pid_error=3.96703337994 pid_error_sum=3.96703337994 pid_bw=47422125 pid_delta=2.65469736988 circ_fail=0.0 scanner=/filepath + +A.2. Generated by sbws version 0.1.0 + +1523911758 +version=1.1.0 +software=sbws +software_version=0.1.0 +latest_bandwidth=2018-04-16T20:49:18 +file_created=2018-04-16T21:49:18 +generator_started=2018-04-16T15:13:25 +earliest_bandwidth=2018-04-16T15:13:26 +==== +bw=380 error_circ=0 error_misc=0 error_stream=1 master_key_ed25519=YaqV4vbvPYKucElk297eVdNArDz9HtIwUoIeo0+cVIpQ nick=Test node_id=$68A483E05A2ABDCA6DA5A3EF8DB5177638A27F80 rtt=380 success=1 time=2018-05-08T16:13:26 +bw=189 error_circ=0 error_misc=0 error_stream=0 master_key_ed25519=a6a+dZadrQBtfSbmQkP7j2ardCmLnm5NJ4ZzkvDxbo0I nick=Test2 node_id=$96C15995F30895689291F455587BD94CA427B6FC rtt=378 success=1 time=2018-05-08T16:13:36 + +A.3. 
Generated by sbws version 1.0.3 + +1523911758 +version=1.2.0 +latest_bandwidth=2018-04-16T20:49:18 +file_created=2018-04-16T21:49:18 +generator_started=2018-04-16T15:13:25 +earliest_bandwidth=2018-04-16T15:13:26 +minimum_number_eligible_relays=3862 +minimum_percent_eligible_relays=60 +number_consensus_relays=6436 +number_eligible_relays=6000 +percent_eligible_relays=93 +software=sbws +software_version=1.0.3 +===== +bw=38000 bw_mean=1127824 bw_median=1180062 desc_bw_avg=1073741824 desc_bw_obs_last=17230879 desc_bw_obs_mean=14732306 error_circ=0 error_misc=0 error_stream=1 master_key_ed25519=YaqV4vbvPYKucElk297eVdNArDz9HtIwUoIeo0+cVIpQ nick=Test node_id=$68A483E05A2ABDCA6DA5A3EF8DB5177638A27F80 rtt=380 success=1 time=2018-05-08T16:13:26 +bw=1 bw_mean=199162 bw_median=185675 desc_bw_avg=409600 desc_bw_obs_last=836165 desc_bw_obs_mean=858030 error_circ=0 error_misc=0 error_stream=0 master_key_ed25519=a6a+dZadrQBtfSbmQkP7j2ardCmLnm5NJ4ZzkvDxbo0I nick=Test2 node_id=$96C15995F30895689291F455587BD94CA427B6FC rtt=378 success=1 time=2018-05-08T16:13:36 + +A.3.1. When there are not enough eligible measured relays: + +1540496079 +version=1.2.0 +earliest_bandwidth=2018-10-20T19:35:52 +file_created=2018-10-25T19:35:03 +generator_started=2018-10-25T11:42:56 +latest_bandwidth=2018-10-25T19:34:39 +minimum_number_eligible_relays=3862 +minimum_percent_eligible_relays=60 +number_consensus_relays=6436 +number_eligible_relays=2960 +percent_eligible_relays=46 +software=sbws +software_version=1.0.3 +===== + +A.4. 
Headers generated by sbws version 1.0.4 + +1523911758 +version=1.2.0 +latest_bandwidth=2018-04-16T20:49:18 +destinations_countries=TH,ZZ +file_created=2018-04-16T21:49:18 +generator_started=2018-04-16T15:13:25 +earliest_bandwidth=2018-04-16T15:13:26 +minimum_number_eligible_relays=3862 +minimum_percent_eligible_relays=60 +number_consensus_relays=6436 +number_eligible_relays=6000 +percent_eligible_relays=93 +scanner_country=SN +software=sbws +software_version=1.0.4 +===== + +A.5 Generated by sbws version 1.1.0 + +1523911758 +version=1.4.0 +latest_bandwidth=2018-04-16T20:49:18 +destinations_countries=TH,ZZ +file_created=2018-04-16T21:49:18 +generator_started=2018-04-16T15:13:25 +earliest_bandwidth=2018-04-16T15:13:26 +minimum_number_eligible_relays=3862 +minimum_percent_eligible_relays=60 +number_consensus_relays=6436 +number_eligible_relays=6000 +percent_eligible_relays=93 +recent_measurement_attempt_count=6243 +recent_measurement_failure_count=732 +recent_measurements_excluded_error_count=969 +recent_measurements_excluded_few_count=3946 +recent_measurements_excluded_near_count=90 +recent_measurements_excluded_old_count=0 +recent_priority_list_count=20 +recent_priority_relay_count=6243 +scanner_country=SN +software=sbws +software_version=1.1.0 +time_to_report_half_network=57273 +===== +bw=1 error_circ=1 error_destination=0 error_misc=0 error_second_relay=0 error_stream=0 master_key_ed25519=J3HQ24kOQWac3L1xlFLp7gY91qkb5NuKxjj1BhDi+m8 nick=snap269 node_id=$DC4D609F95A52614D1E69C752168AF1FCAE0B05F relay_recent_measurement_attempt_count=3 relay_recent_measurements_excluded_error_count=1 relay_recent_measurements_excluded_near_count=3 relay_recent_consensus_count=3 relay_recent_priority_list_count=3 success=3 time=2019-03-16T18:20:57 unmeasured=1 vote=0 +bw=1 error_circ=0 error_destination=0 error_misc=0 error_second_relay=0 error_stream=2 master_key_ed25519=h6ZB1E1yBFWIMloUm9IWwjgaPXEpL5cUbuoQDgdSDKg nick=relay node_id=$C4544F9E209A9A9B99591D548B3E2822236C0503 
relay_recent_measurement_attempt_count=3 relay_recent_measurements_excluded_error_count=2 relay_recent_measurements_excluded_few_count=1 relay_recent_consensus_count=3 relay_recent_priority_list_count=3 success=1 time=2019-03-17T06:50:58 unmeasured=1 vote=0 + +B. Scaling bandwidths + +B.1. Scaling requirements + + Tor accepts zero bandwidths, but they trigger bugs in older Tor + implementations. Therefore, scaling methods SHOULD perform the + following checks: + * If the total bandwidth is zero, all relays should be given equal + bandwidths. + * If the scaled bandwidth is zero, it should be rounded up to one. + + Initial experiments indicate that scaling may not be needed for + torflow and sbws, because their measured bandwidths are similar + enough already. + +B.2. A linear scaling method + + If scaling is required, here is a simple linear bandwidth scaling + method, which ensures that all bandwidth votes contain approximately + the same total bandwidth: + + 1. Calculate the relay quota by dividing the total measured bandwidth + in all votes, by the number of relays with measured bandwidth + votes. In the public tor network, this is approximately 7500 as of + April 2018. The quota should be a consensus parameter, so it can be + adjusted for all generators on the network. + + 2. Calculate a vote quota by multiplying the relay quota by the number + of relays this bandwidth authority has measured + bandwidths for. + + 3. Calculate a scaling factor by dividing the vote quota by the + total unscaled measured bandwidth in this bandwidth + authority's upcoming vote. + + 4. Multiply each unscaled measured bandwidth by the scaling + factor. + + Now, the total scaled bandwidth in the upcoming vote is + approximately equal to the quota. + +B.3. Quota changes + + If all generators are using scaling, the quota can be gradually + reduced or increased as needed. 
Smaller quotas decrease the size + of uncompressed consensuses, and may decrease the size of + consensus diffs and compressed consensuses. But if the relay + quota is too small, some relays may be over- or under-weighted. + +B.4. Torflow aggregation + + Torflow implements two methods to compute the bandwidth values from the + (stream) bandwidth measurements: with and without PID control feedback. + The method described here is without PID control (see Torflow + specification, section 2.2). + + In the following sections, the relays' measured bandwidths refer to the + ones that this bandwidth authority has measured for the relays that + would be included in the next bandwidth authority's upcoming vote. + + 1. Calculate the filtered bandwidth for each relay: + - choose the relay's measurements (`bw_j`) that are equal to or greater + than the mean of the measurements for this relay + - calculate the mean of those measurements + + In pseudocode: + + bw_filt_i = mean(max(mean(bw_j), bw_j)) + + 2. Calculate network averages: + - calculate the filtered average by dividing the sum of all the + relays' filtered bandwidth by the number of relays that have been + measured (`n`), i.e., calculate the mean average of the relays' + filtered bandwidth. + - calculate the stream average by dividing the sum of all the + relays' measured bandwidth by the number of relays that have been + measured (`n`), i.e., calculate the mean average of the relays' + measured bandwidth. + + In pseudocode: + + bw_avg_filt = sum(bw_filt_i) / n + bw_avg_strm = sum(bw_i) / n + + 3. Calculate ratios for each relay: + - calculate the filtered ratio by dividing each relay filtered + bandwidth by the filtered average + - calculate the stream ratio by dividing each relay measured + bandwidth by the stream average + + In pseudocode: + + r_filt_i = bw_filt_i / bw_avg_filt + r_strm_i = bw_i / bw_avg_strm + + 4. 
Calculate the final ratio for each relay: + The final ratio is the larger between the filtered bandwidth's and the + stream bandwidth's ratio. + + In pseudocode: + + r_i = max(r_filt_i, r_strm_i) + + 5. Calculate the scaled bandwidth for each relay: + The most recent descriptor observed bandwidth (`bw_obs_i`) is + multiplied by the ratio + + In pseudocode: + + bw_new_i = r_i * bw_obs_i + + <> diff --git a/attic/text_formats/bridgedb-spec.txt b/attic/text_formats/bridgedb-spec.txt new file mode 100644 index 0000000..51f6e5d --- /dev/null +++ b/attic/text_formats/bridgedb-spec.txt @@ -0,0 +1,409 @@ + + BridgeDB specification + + Karsten Loesing + Nick Mathewson + +Table of Contents + + 0. Preliminaries + 1. Importing bridge network statuses and bridge descriptors + 1.1. Parsing bridge network statuses + 1.2. Parsing bridge descriptors + 1.3. Parsing extra-info documents + 2. Assigning bridges to distributors + 3. Giving out bridges upon requests + 4. Selecting bridges to be given out based on IP addresses + 5. Selecting bridges to be given out based on email addresses + 6. Selecting unallocated bridges to be stored in file buckets + 7. Displaying Bridge Information + 8. Writing bridge assignments for statistics + +0. Preliminaries + + This document specifies how BridgeDB processes bridge descriptor files + to learn about new bridges, maintains persistent assignments of bridges + to distributors, and decides which bridges to give out upon user + requests. + + Some of the decisions here may be suboptimal: this document is meant to + specify current behavior as of August 2013, not to specify ideal + behavior. + +1. Importing bridge network statuses and bridge descriptors + + BridgeDB learns about bridges by parsing bridge network statuses, + bridge descriptors, and extra info documents as specified in Tor's + directory protocol. 
BridgeDB parses one bridge network status file + first and at least one bridge descriptor file and potentially one extra + info file afterwards. + + BridgeDB scans its files on sighup. + + BridgeDB does not validate signatures on descriptors or networkstatus + files: the operator needs to make sure that these documents have come + from a Tor instance that did the validation for us. + +1.1. Parsing bridge network statuses + + Bridge network status documents contain the information of which bridges + are known to the bridge authority and which flags the bridge authority + assigns to them. + We expect bridge network statuses to contain at least the following two + lines for every bridge in the given order (format fully specified in Tor's + directory protocol): + + "r" SP nickname SP identity SP digest SP publication SP IP SP ORPort + SP DirPort NL + "a" SP address ":" port NL (no more than 8 instances) + "s" SP Flags NL + + BridgeDB parses the identity and the publication timestamp from the "r" + line, the OR address(es) and ORPort(s) from the "a" line(s), and the + assigned flags from the "s" line, specifically checking the assignment + of the "Running" and "Stable" flags. + BridgeDB memorizes all bridges that have the Running flag as the set of + running bridges that can be given out to bridge users. + BridgeDB memorizes assigned flags if it wants to ensure that sets of + bridges given out should contain at least a given number of bridges + with these flags. + +1.2. Parsing bridge descriptors + + BridgeDB learns about a bridge's most recent IP address and OR port + from parsing bridge descriptors. + In theory, both IP address and OR port of a bridge are also contained + in the "r" line of the bridge network status, so there is no mandatory + reason for parsing bridge descriptors. But the functionality described + in this section is still implemented in case we need data from the + bridge descriptor in the future. 
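The network status handling from section 1.1 can be sketched in Python as follows. This is not BridgeDB's code: the function names and the returned dictionary layout are hypothetical. It also assumes that the publication field in the "r" line is a date and a time separated by a space, so a well-formed "r" line splits into nine fields.

```python
def parse_bridge_network_status(text: str) -> dict:
    """Map each bridge identity to its publication time, OR addresses and flags."""
    bridges = {}
    current = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        keyword = fields[0]
        if keyword == "r":
            if len(fields) != 9:
                # Malformed entry: do not attach later lines to an older bridge.
                current = None
                continue
            # Only the identity and publication timestamp are taken from "r".
            identity, date, time = fields[2], fields[4], fields[5]
            current = {"published": f"{date} {time}", "addresses": [], "flags": set()}
            bridges[identity] = current
        elif keyword == "a" and current is not None and len(fields) == 2:
            # rpartition keeps bracketed IPv6 addresses intact.
            address, _, port = fields[1].rpartition(":")
            current["addresses"].append((address, int(port)))
        elif keyword == "s" and current is not None:
            current["flags"] = set(fields[1:])
    return bridges


def running_bridges(bridges: dict) -> set:
    """Return the identities of bridges that have the Running flag."""
    return {ident for ident, entry in bridges.items() if "Running" in entry["flags"]}
```

Only the identities returned by `running_bridges` would be candidates for being given out to users.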
+ + Bridge descriptor files may contain one or more bridge descriptors. + We expect a bridge descriptor to contain at least the following lines in + the stated order: + + "@purpose" SP purpose NL + "router" SP nickname SP IP SP ORPort SP SOCKSPort SP DirPort NL + "published" SP timestamp + ["opt" SP] "fingerprint" SP fingerprint NL + "router-signature" NL Signature NL + + BridgeDB parses the purpose, IP, ORPort, nickname, and fingerprint + from these lines. + BridgeDB skips bridge descriptors if the fingerprint is not contained + in the bridge network status parsed earlier or if the bridge does not + have the Running flag. + BridgeDB discards bridge descriptors which have a different purpose + than "bridge". BridgeDB can be configured to only accept descriptors + with another purpose or not discard descriptors based on purpose at + all. + BridgeDB memorizes the IP addresses and OR ports of the remaining + bridges. + If there is more than one bridge descriptor with the same fingerprint, + BridgeDB memorizes the IP address and OR port of the most recently + parsed bridge descriptor. + If BridgeDB does not find a bridge descriptor for a bridge contained in + the bridge network status parsed before, it does not add that bridge + to the set of bridges to be given out to bridge users. + +1.3. Parsing extra-info documents + + BridgeDB learns if a bridge supports a pluggable transport by parsing + extra-info documents. + Extra-info documents contain the name of the bridge (but only if it is + named), the bridge's fingerprint, the type of pluggable transport(s) it + supports, and the IP address and port number on which each transport + listens, respectively. + + Extra-info documents may contain zero or more entries per bridge. 
We expect + an extra-info entry to contain the following lines in the stated order: + + "extra-info" SP name SP fingerprint NL + "transport" SP transport SP IP ":" PORT ARGS NL + + BridgeDB parses the fingerprint, transport type, IP address, port and any + arguments that are specified on these lines. BridgeDB skips the name. If + the fingerprint is invalid, BridgeDB skips the entry. BridgeDB memorizes + the transport type, IP address, port number, and any arguments that are + provided and then it assigns them to the corresponding bridge based on the + fingerprint. Arguments are comma-separated and are of the form k=v,k=v. + Bridges that do not have an associated extra-info entry are not invalid. + +2. Assigning bridges to distributors + + A "distributor" is a mechanism by which bridges are given (or not + given) to clients. The current distributors are "email", "https", + and "unallocated". + + BridgeDB assigns bridges to distributors based on an HMAC hash of the + bridge's ID and a secret and makes these assignments persistent. + Persistence is achieved by using a database to map node ID to + distributor. + Each bridge is assigned to exactly one distributor (including + the "unallocated" distributor). + BridgeDB may be configured to support only a non-empty subset of the + distributors specified in this document. + BridgeDB may be configured to use different probabilities for assigning + new bridges to distributors. + BridgeDB does not change existing assignments of bridges to + distributors, even if probabilities for assigning bridges to + distributors change or distributors are disabled entirely. + +3. Giving out bridges upon requests + + Upon receiving a client request, a BridgeDB distributor provides a + subset of the bridges assigned to it. + BridgeDB only gives out bridges that are contained in the most recently + parsed bridge network status and that have the Running flag set (see + Section 1). 
+ BridgeDB may be configured to give out a different number of bridges + (typically 4) depending on the distributor. + BridgeDB may define an arbitrary number of rules. These rules may + specify the criteria by which a bridge is selected. Specifically, + the available rules restrict the IP address version, OR port number, + transport type, bridge relay flag, or country in which the bridge + should not be blocked. + +4. Selecting bridges to be given out based on IP addresses + + BridgeDB may be configured to support one or more distributors which + give out bridges based on the requestor's IP address. Currently, this + is how the HTTPS distributor works. + The goal is to avoid handing out all the bridges to users in a similar + IP space and time. +# Someone else should look at proposals/ideas/old/xxx-bridge-disbursement +# to see if this section is missing relevant pieces from it. -KL + + BridgeDB fixes the set of bridges to be returned for a defined time + period. + BridgeDB considers all IP addresses coming from the same /24 network + as the same IP address and returns the same set of bridges. From here on, + this non-unique address will be referred to as the IP address's 'area'. + BridgeDB divides the IP address space equally into a small number of +# Note, changed term from "areas" to "disjoint clusters" -MF + disjoint clusters (typically 4) and returns different results for requests + coming from addresses that are placed into different clusters. +# I found that BridgeDB is not strict in returning only bridges for a +# given area. If a ring is empty, it considers the next one. Is this +# expected behavior? -KL +# +# This does not appear to be the case, anymore. If a ring is empty, then +# BridgeDB simply returns an empty set of bridges. -MF +# +# I also found that BridgeDB does not make the assignment to areas +# persistent in the database. So, if we change the number of rings, it +# will assign bridges to other rings. I assume this is okay? 
-KL + BridgeDB maintains a list of proxy IP addresses and returns the same + set of bridges to requests coming from these IP addresses. + The bridges returned to proxy IP addresses do not come from the same + set as those for the general IP address space. + + BridgeDB can be configured to include bridge fingerprints in replies + along with bridge IP addresses and OR ports. + BridgeDB can be configured to display a CAPTCHA which the user must solve + prior to returning the requested bridges. + + The current algorithm is as follows. An IP-based distributor splits + the bridges uniformly into a set of "rings" based on an HMAC of their + ID. Some of these rings are "area" rings for parts of IP space; some + are "category" rings for categories of IPs (like proxies). When a + client makes a request from an IP, the distributor first sees whether + the IP is in one of the categories it knows. If so, the distributor + returns an IP from the category rings. If not, the distributor + maps the IP into an "area" (that is, a /24), and then uses an HMAC to + map the area to one of the area rings. + + When the IP-based distributor determines from which area ring it is handing + out bridges, it identifies which rules it will use to choose appropriate + bridges. Using this information, it searches its cache of rings for one + that already adheres to the criteria specified in this request. If one + exists, then BridgeDB maps the current "epoch" (N-hour period) and the + IP's area (/24) to a point on the ring based on HMAC, and hands out + bridges at that point. If a ring does not already exist which satisfies this + request, then a new ring is created and filled with bridges that fulfill + the requirements. This ring is then used to select bridges as described. + + "Mapping X to Y based on an HMAC" above means one of the following: + + - We keep all of the elements of Y in some order, with a mapping + from all 160-bit strings to positions in Y. 
+ - We take an HMAC of X using some fixed string as a key to get a + 160-bit value. We then map that value to the next position of Y. + + When giving out bridges based on a position in a ring, BridgeDB first + looks at flag requirements and port requirements. For example, + BridgeDB may be configured to "Give out at least L bridges with port + 443, and at least M bridges with Stable, and at most N bridges + total." To do this, BridgeDB combines the following results: + + - The first L bridges in the ring after the position that have the + port 443, and + - The first M bridges in the ring after the position that have the + flag Stable and that it has not already decided to give out, and + - The first N-L-M bridges in the ring after the position that it + has not already decided to give out. + + After BridgeDB selects appropriate bridges to return to the requestor, it + then prioritises the ordering of them in a list so that as many criteria + are fulfilled as possible within the first few bridges. This list is then + truncated to N bridges, if possible. N is currently defined as a + piecewise function of the number of bridges in the ring such that: + + / + | 1, if len(ring) < 20 + | + N = | 2, if 20 <= len(ring) < 100 + | + | 3, if 100 <= len(ring) + \ + + The bridges in this sublist, containing no more than N bridges, are the + bridges returned to the requestor. + +5. Selecting bridges to be given out based on email addresses + + BridgeDB can be configured to support one or more distributors that + give out bridges based on the requestor's email address. Currently, + this is how the email distributor works. + The goal is to bootstrap based on one or more popular email services' + sybil prevention algorithms. +# Someone else should look at proposals/ideas/old/xxx-bridge-disbursement +# to see if this section is missing relevant pieces from it. -KL + + BridgeDB rejects email addresses containing characters other than the + ones that RFC2822 allows. 
+ BridgeDB may be configured to reject email addresses containing other + characters it might not process correctly. +# I don't think we do this, is it worthwhile? -MF + BridgeDB rejects email addresses coming from domains other than a + configured set of permitted domains. + BridgeDB normalizes email addresses by removing "." characters from the + local part and by removing everything from the first "+" character to the + end of the local part. + BridgeDB can be configured to discard requests that do not have the + value "pass" in their X-DKIM-Authentication-Result header or that do not + have this header. The X-DKIM-Authentication-Result header is set by + the incoming mail stack that needs to check DKIM authentication. + + BridgeDB does not return a new set of bridges to the same email address + until a given time period (typically a few hours) has passed. +# Why don't we fix the bridges we give out for a global 3-hour time period +# like we do for IP addresses? This way we could avoid storing email +# addresses. -KL +# The 3-hour value is probably much too short anyway. If we take longer +# time values, then people get new bridges when bridges show up, as +# opposed to when we decide to reset the bridges we give them. (Yes, this +# problem exists for the IP distributor). -NM +# I'm afraid I don't fully understand what you mean here. Can you +# elaborate? -KL +# +# Assuming an average churn rate, if we use short time periods, then a +# requestor will receive new bridges based on rate-limiting and will (likely) +# eventually work their way around the ring; eventually exhausting all bridges +# available to them from this distributor. If we use a longer time period, +# then each time the period expires there will be more bridges in the ring +# thus reducing the likelihood of all bridges being blocked and increasing +# the time and effort required to enumerate all bridges. 
(This is my +# understanding, not from Nick) -MF +# Also, we presently need the cache to prevent replays and because if a user +# sent multiple requests with different criteria in each then we would leak +# additional bridges otherwise. -MF + BridgeDB can be configured to include bridge fingerprints in replies + along with bridge IP addresses and OR ports. + BridgeDB can be configured to sign all replies using a PGP signing key. + BridgeDB periodically discards old email-address-to-bridge mappings. + BridgeDB rejects too frequent email requests coming from the same + normalized address. + + To map previously unseen email addresses to a set of bridges, BridgeDB + proceeds as follows: + + - It normalizes the email address as above, by stripping out dots, + removing all of the localpart after the +, and putting it all + in lowercase. (Example: "John.Doe+bridges@example.COM" becomes + "johndoe@example.com".) + - It maps an HMAC of the normalized address to a position on its ring + of bridges. + - It hands out bridges starting at that position, based on the + port/flag requirements, as specified at the end of section 4. + + See section 4 for the details of how bridges are selected from the ring + and returned to the requestor. + +6. Selecting unallocated bridges to be stored in file buckets + +# Kaner should have a look at this section. -NM + + BridgeDB can be configured to reserve a subset of bridges and not give + them out via one of the distributors. + BridgeDB assigns reserved bridges to one or more file buckets of fixed + sizes and writes these file buckets to disk for manual distribution. + BridgeDB ensures that a file bucket always contains the requested + number of running bridges. + If the requested number of bridges in a file bucket is reduced or the + file bucket is not required anymore, the unassigned bridges are + returned to the reserved set of bridges. + If a bridge stops running, BridgeDB replaces it with another bridge + from the reserved set of bridges. 
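The file-bucket rules in section 6 can be sketched as a single refill step. This is only an illustration under stated assumptions, not BridgeDB's actual code: the function name `refill_bucket`, its arguments, and the choice to report non-running bridges separately (the spec does not say where they go) are all hypothetical.

```python
def refill_bucket(bucket, reserved, is_running, size):
    """Bring one file bucket to `size` running bridges.

    Hypothetical helper, not taken from BridgeDB. Returns a tuple
    (bucket, reserved, dropped), where `dropped` holds bridges that
    were removed from the bucket because they stopped running.
    """
    # Keep only bridges that are still running.
    kept = [b for b in bucket if is_running(b)]
    dropped = [b for b in bucket if not is_running(b)]
    pool = list(reserved)

    # If the bucket was reduced in size, return the surplus to the
    # reserved set.
    while len(kept) > size:
        pool.append(kept.pop())

    # Replace missing bridges with running bridges from the reserved set.
    remaining = []
    for bridge in pool:
        if len(kept) < size and is_running(bridge):
            kept.append(bridge)
        else:
            remaining.append(bridge)
    return kept, remaining, dropped
```

If the reserved set holds too few running bridges, the bucket simply stays short in this sketch; what should happen in that case is left open by the text.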
+# I'm not sure if there's a design bug in file buckets. What happens if +# we add a bridge X to file bucket A, and X goes offline? We would add +# another bridge Y to file bucket A. OK, but what if X comes back? We +# cannot put it back in file bucket A, because it's full. Are we going to +# add it to a different file bucket? Doesn't that mean that most bridges +# will be contained in most file buckets over time? -KL +# +# This should be handled the same as if the file bucket is reduced in size. +# If X returns, then it should be added to the appropriate distributor. -MF + +7. Displaying Bridge Information + + After bridges are selected using one of the methods described in + Sections 4 - 6, they are output in one of two formats. Bridges are + formatted as: + + NL + + Pluggable transports are formatted as: + + SP [SP arglist] NL + + where arglist is an optional space-separated list of key-value pairs in + the form of k=v. + + Previously, each line was prepended with the "bridge" keyword, such as + + "bridge" SP NL + + "bridge" SP SP [SP arglist] NL + +# We don't do this anymore because Vidalia and TorLauncher don't expect it. +# See the commit message for b70347a9c5fd769c6d5d0c0eb5171ace2999a736. + +8. Writing bridge assignments for statistics + + BridgeDB can be configured to write bridge assignments to disk for + statistical analysis. + The start of a bridge assignment is marked by the following line: + + "bridge-pool-assignment" SP YYYY-MM-DD HH:MM:SS NL + + YYYY-MM-DD HH:MM:SS is the time, in UTC, when BridgeDB has completed + loading new bridges and assigning them to distributors. + + For every running bridge there is a line with the following format: + + fingerprint SP distributor (SP key "=" value)* NL + + The distributor is one out of "email", "https", or "unallocated". + + Both "email" and "https" distributors support adding keys for "port", + "flag" and "transport". Respectively, the port number, flag name, and + transport types are the values. 
These are used to indicate that + a bridge matches certain port, flag, transport criteria of requests. + + The "https" distributor also allows the key "ring" with a number as + value to indicate to which IP address area the bridge is returned. + + The "unallocated" distributor allows the key "bucket" with the file + bucket name as value to indicate which file bucket a bridge is assigned + to. + diff --git a/attic/text_formats/cert-spec.txt b/attic/text_formats/cert-spec.txt new file mode 100644 index 0000000..a70e100 --- /dev/null +++ b/attic/text_formats/cert-spec.txt @@ -0,0 +1,198 @@ + + Ed25519 certificates in Tor + +Table of Contents + + 1. Scope and Preliminaries + 1.1. Signing + 1.2. Integer encoding + 2. Document formats + 2.1. Ed25519 Certificates + 2.2. Basic extensions + 2.2.1. Signed-with-ed25519-key extension [type 04] + 2.3. RSA->Ed25519 cross-certificate + A.1. List of certificate types (CERT_TYPE field) + A.2. List of extension types + A.3. List of signature prefixes + A.4. List of certified key types (CERT_KEY_TYPE field) + +1. Scope and Preliminaries + + This document describes a certificate format that Tor uses for + its Ed25519 internal certificates. It is not the only + certificate format that Tor uses. For the certificates that + authorities use for their signing keys, see dir-spec.txt. + Additionally, Tor uses TLS, which depends on X.509 certificates; + see tor-spec.txt for details. + + The certificates in this document were first introduced in + proposal 220, and were first supported by Tor in Tor version + 0.2.7.2-alpha. + +1.1. Signing + + All signatures here, unless otherwise specified, are computed + using an Ed25519 key. + + In order to future-proof the format, before signing anything, the + signed document is prefixed with a personalization string, which + will be different in each case. + +1.2. 
Integer encoding + + Network byte order (big-endian) is used to encode all integer values + in Ed25519 certificates unless explicitly specified otherwise. + +2. Document formats + +2.1. Ed25519 Certificates + + When generating a signing key, we also generate a certificate for it. + Unlike the certificates for authorities' signing keys, these + certificates need to be sent around frequently, in significant + numbers. So we'll choose a compact representation. + + VERSION [1 Byte] + CERT_TYPE [1 Byte] + EXPIRATION_DATE [4 Bytes] + CERT_KEY_TYPE [1 byte] + CERTIFIED_KEY [32 Bytes] + N_EXTENSIONS [1 byte] + EXTENSIONS [N_EXTENSIONS times] + SIGNATURE [64 Bytes] + + The "VERSION" field holds the value [01]. The "CERT_TYPE" field + holds a value depending on the type of certificate. (See appendix + A.1.) The CERTIFIED_KEY field is an Ed25519 public key if + CERT_KEY_TYPE is [01], or a digest of some other key type + depending on the value of CERT_KEY_TYPE. (See appendix A.4.) + The EXPIRATION_DATE is a date, given in HOURS since the epoch, + after which this certificate isn't valid. (A four-byte count of + hours will not overflow for roughly 490,000 years.) + + The EXTENSIONS field contains zero or more extensions, each of + the format: + + ExtLength [2 bytes] + ExtType [1 byte] + ExtFlags [1 byte] + ExtData [ExtLength bytes] + + The meaning of the ExtData field in an extension is type-dependent. + + The ExtFlags field holds flags; this flag is currently defined: + + 1 -- AFFECTS_VALIDATION. If this flag is present, then the + extension affects whether the certificate is valid; clients + must not accept the certificate as valid unless they + understand the extension. + + It is an error for an extension to be truncated; such a + certificate is invalid. + + Before processing any certificate, parties SHOULD know which + identity key it is supposed to be signed by, and then check the + signature. 
The signature is created by signing all the fields in + the certificate up until "SIGNATURE" (that is, signing + sizeof(ed25519_cert) - 64 bytes). + +2.2. Basic extensions + +2.2.1. Signed-with-ed25519-key extension [type 04] + + In several places, it's desirable to bundle the key signing a + certificate along with the certificate. We do so with this + extension. + + ExtLength = 32 + ExtData = + An ed25519 key [32 bytes] + + When this extension is present, it MUST match the key used to + sign the certificate. + +2.3. RSA->Ed25519 cross-certificate + + Certificate type [07] (Cross-certification of Ed25519 identity + with RSA key) contains the following data: + + ED25519_KEY [32 bytes] + EXPIRATION_DATE [4 bytes] + SIGLEN [1 byte] + SIGNATURE [SIGLEN bytes] + + Here, the Ed25519 identity key is signed with the router's RSA + identity key, to indicate that authenticating with a key + certified by the Ed25519 key counts as certifying with the RSA + identity key. (The signature is computed on the SHA256 hash of + the non-signature parts of the certificate, prefixed with the + string "Tor TLS RSA/Ed25519 cross-certificate".) + + Just like with the Ed25519 certificates above, the EXPIRATION_DATE + operates in HOURS after the epoch. + + This certificate type is used to mean, "This Ed25519 identity key + acts with the authority of the RSA key that signed this + certificate." + +A.1. List of certificate types (CERT_TYPE field) + + The values marked with asterisks are not types corresponding to + the certificate format of section 2.1. Instead, they are + reserved for RSA-signed certificates to avoid conflicts between + the certificate type enumeration of the CERTS cell and the + certificate type enumeration of our Ed25519 certificates. + + + **[00],[01],[02],[03] - Reserved to avoid conflict with types used + in CERTS cells. 
+ + [04] - Ed25519 signing key with an identity key + (see prop220 section 4.2) + + [05] - TLS link certificate signed with ed25519 signing key + (see prop220 section 4.2) + + [06] - Ed25519 authentication key signed with ed25519 signing key + (see prop220 section 4.2) + + **[07] - Reserved for RSA identity cross-certification; + (see section 2.3 above, and tor-spec.txt section 4.2) + + [08] - Onion service: short-term descriptor signing key, signed + with blinded public key. + (See rend-spec-v3.txt, section [DESC_OUTER]) + + [09] - Onion service: intro point authentication key, cross-certifying the + descriptor signing key. + (See rend-spec-v3.txt, description of "auth-key") + + [0A] - ntor onion key cross-certifying ed25519 identity key + (see dir-spec.txt, description of "ntor-onion-key-crosscert") + + [0B] - Onion service: ntor-extra encryption key, cross-certifying + descriptor signing key. + (see rend-spec-v3.txt, description of "enc-key-cert") + +A.2. List of extension types + + [04] - signed-with-ed25519-key (section 2.2.1) + +A.3. List of signature prefixes + + We describe various documents as being signed with a prefix. Here + are those prefixes: + + "Tor router descriptor signature v1" (see dir-spec.txt) + +A.4. List of certified key types (CERT_KEY_TYPE field) + + [01] ed25519 key + [02] SHA256 hash of an RSA key. (Not currently used.) + [03] SHA256 hash of an X.509 certificate. (Used with certificate + type 5.) + + (NOTE: Up till 0.4.5.1-alpha, all versions of Tor have incorrectly used + "01" for all types of certified key. Implementations SHOULD + allow "01" in this position, and infer the actual key type from + the CERT_TYPE field.) diff --git a/attic/text_formats/control-spec.txt b/attic/text_formats/control-spec.txt new file mode 100644 index 0000000..52e11a0 --- /dev/null +++ b/attic/text_formats/control-spec.txt @@ -0,0 +1,4418 @@ + + TC: A Tor control protocol (Version 1) + +Table of Contents + + 0. Scope + 1. Protocol outline + 1.1. 
Forward-compatibility + 2. Message format + 2.1. Description format + 2.1.1. Notes on an escaping bug + 2.2. Commands from controller to Tor + 2.3. Replies from Tor to the controller + 2.4. General-use tokens + 3. Commands + 3.1. SETCONF + 3.2. RESETCONF + 3.3. GETCONF + 3.4. SETEVENTS + 3.5. AUTHENTICATE + 3.6. SAVECONF + 3.7. SIGNAL + 3.8. MAPADDRESS + 3.9. GETINFO + 3.10. EXTENDCIRCUIT + 3.11. SETCIRCUITPURPOSE + 3.12. SETROUTERPURPOSE + 3.13. ATTACHSTREAM + 3.14. POSTDESCRIPTOR + 3.15. REDIRECTSTREAM + 3.16. CLOSESTREAM + 3.17. CLOSECIRCUIT + 3.18. QUIT + 3.19. USEFEATURE + 3.20. RESOLVE + 3.21. PROTOCOLINFO + 3.22. LOADCONF + 3.23. TAKEOWNERSHIP + 3.24. AUTHCHALLENGE + 3.25. DROPGUARDS + 3.26. HSFETCH + 3.27. ADD_ONION + 3.28. DEL_ONION + 3.29. HSPOST + 3.30. ONION_CLIENT_AUTH_ADD + 3.31. ONION_CLIENT_AUTH_REMOVE + 3.32. ONION_CLIENT_AUTH_VIEW + 3.33. DROPOWNERSHIP + 3.34. DROPTIMEOUTS + 4. Replies + 4.1. Asynchronous events + 4.1.1. Circuit status changed + 4.1.2. Stream status changed + 4.1.3. OR Connection status changed + 4.1.4. Bandwidth used in the last second + 4.1.5. Log messages + 4.1.6. New descriptors available + 4.1.7. New Address mapping + 4.1.8. Descriptors uploaded to us in our role as authoritative dirserver + 4.1.9. Our descriptor changed + 4.1.10. Status events + 4.1.11. Our set of guard nodes has changed + 4.1.12. Network status has changed + 4.1.13. Bandwidth used on an application stream + 4.1.14. Per-country client stats + 4.1.15. New consensus networkstatus has arrived + 4.1.16. New circuit buildtime has been set + 4.1.17. Signal received + 4.1.18. Configuration changed + 4.1.19. Circuit status changed slightly + 4.1.20. Pluggable transport launched + 4.1.21. Bandwidth used on an OR or DIR or EXIT connection + 4.1.22. Bandwidth used by all streams attached to a circuit + 4.1.23. Per-circuit cell stats + 4.1.24. Token buckets refilled + 4.1.25. HiddenService descriptors + 4.1.26. HiddenService descriptors content + 4.1.27. 
Network liveness has changed + 4.1.28. Pluggable Transport Logs + 4.1.29. Pluggable Transport Status + 5. Implementation notes + 5.1. Authentication + 5.2. Don't let the buffer get too big + 5.3. Backward compatibility with v0 control protocol + 5.4. Tor config options for use by controllers + 5.5. Phases from the Bootstrap status event + 5.5.1. Overview of Bootstrap reporting. + 5.5.2. Phases in Bootstrap Stage 1 + 5.5.3. Phases in Bootstrap Stage 2 + 5.5.4. Phases in Bootstrap Stage 3 + 5.6 Bootstrap phases reported by older versions of Tor + +0. Scope + + This document describes an implementation-specific protocol that is used + for other programs (such as frontend user-interfaces) to communicate with a + locally running Tor process. It is not part of the Tor onion routing + protocol. + + This protocol replaces version 0 of TC, which is now deprecated. For + reference, TC is described in "control-spec-v0.txt". Implementors are + recommended to avoid using TC directly, but instead to use a library that + can easily be updated to use the newer protocol. (Version 0 is used by Tor + versions 0.1.0.x; the protocol in this document only works with Tor + versions in the 0.1.1.x series and later.) + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL + NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and + "OPTIONAL" in this document are to be interpreted as described in + RFC 2119. + +1. Protocol outline + + TC is a bidirectional message-based protocol. It assumes an underlying + stream for communication between a controlling process (the "client" + or "controller") and a Tor process (or "server"). The stream may be + implemented via TCP, TLS-over-TCP, a Unix-domain socket, or so on, + but it must provide reliable in-order delivery. For security, the + stream should not be accessible by untrusted parties. + + In TC, the client and server send typed messages to each other over the + underlying stream. 
The client sends "commands" and the server sends + "replies". + + By default, all messages from the server are in response to messages from + the client. Some client requests, however, will cause the server to send + messages to the client indefinitely far into the future. Such + "asynchronous" replies are marked as such. + + Servers respond to messages in the order messages are received. + +1.1. Forward-compatibility + + This is an evolving protocol; new client and server behavior will be + allowed in future versions. To allow new backward-compatible behavior + on behalf of the client, we may add new commands and allow existing + commands to take new arguments in future versions. To allow new + backward-compatible server behavior, we note various places below + where servers speaking a future version of this protocol may insert + new data, and note that clients should/must "tolerate" unexpected + elements in these places. There are two ways that we do this: + + * Adding a new field to a message: + + For example, we might say "This message has three space-separated + fields; clients MUST tolerate more fields." This means that a + client MUST NOT crash or otherwise fail to parse the message or + other subsequent messages when there are more than three fields, and + that it SHOULD function at least as well when more fields are + provided as it does when it only gets the fields it accepts. The + most obvious way to do this is by ignoring additional fields; the + next-most-obvious way is to report additional fields verbatim to the + user, perhaps as part of an expert UI. + + * Adding a new possible value to a list of alternatives: + + For example, we might say "This field will be OPEN, CLOSED, or + CONNECTED. Clients MUST tolerate unexpected values." 
This means + that a client MUST NOT crash or otherwise fail to parse the message + or other subsequent messages when there are unexpected values, and + that it SHOULD try to handle the rest of the message as well as it + can. The most obvious way to do this is by pretending that each + list of alternatives has an additional "unrecognized value" element, + and mapping any unrecognized values to that element; the + next-most-obvious way is to create a separate "unrecognized value" + element for each unrecognized value. + + Clients SHOULD NOT "tolerate" unrecognized alternatives by + pretending that the message containing them is absent. For example, + a stream closed for an unrecognized reason is nevertheless closed, + and should be reported as such. + + (If some list of alternatives is given, and there isn't an explicit + statement that clients must tolerate unexpected values, clients still + must tolerate unexpected values. The only exception would be if there + were an explicit statement that no future values will ever be added.) + +2. Message format + +2.1. Description format + + The message formats listed below use ABNF as described in RFC 2234. + The protocol itself is loosely based on SMTP (see RFC 2821). + + We use the following nonterminals from RFC 2822: atom, qcontent + + We define the following general-use nonterminals: + + QuotedString = DQUOTE *qcontent DQUOTE + + There are explicitly no limits on line length. All 8-bit characters + are permitted unless explicitly disallowed. In QuotedStrings, + backslashes and quotes must be escaped; other characters need not be + escaped. + + Wherever CRLF is specified to be accepted from the controller, Tor MAY also + accept LF. Tor, however, MUST NOT generate LF instead of CRLF. + Controllers SHOULD always send CRLF. + +2.1.1. Notes on an escaping bug + + CString = DQUOTE *qcontent DQUOTE + + Note that although these nonterminals have the same grammar, they + are interpreted differently. 
In a QuotedString, a backslash + followed by any character represents that character. But + in a CString, the escapes "\n", "\t", "\r", and the octal escapes + "\0" ... "\377" represent newline, tab, carriage return, and the + 256 possible octet values respectively. + + The use of CString in this document reflects a bug in Tor; + they should have been QuotedString instead. In the future, they + may migrate to use QuotedString instead. If they do, the + QuotedString implementation will never place a backslash before a + "n", "t", "r", or digit, to ensure that old controllers don't get + confused. + + For future-proofing, controller implementors MAY use the following + rules to be compatible with buggy Tor implementations and with + future ones that implement the spec as intended: + + Read \n \t \r and \0 ... \377 as C escapes. + Treat a backslash followed by any other character as that character. + + Currently, many of the QuotedString instances below that Tor + outputs are in fact CStrings. We intend to fix this in future + versions of Tor, and document which ones were broken. (See + bugtracker ticket #14555 for a bit more information.) + + Note that this bug exists only in strings generated by Tor for the + Tor controller; Tor should parse input QuotedStrings from the + controller correctly. + + +2.2. Commands from controller to Tor + + Command = Keyword OptArguments CRLF / "+" Keyword OptArguments CRLF CmdData + Keyword = 1*ALPHA + OptArguments = [ SP *(SP / VCHAR) ] + + A command is either a single line containing a Keyword and arguments, or a + multiline command whose initial keyword begins with +, and whose data + section ends with a single "." on a line of its own. (We use a special + character to distinguish multiline commands so that Tor can correctly parse + multi-line commands that it does not recognize.) Specific commands and + their arguments are described below in section 3. + +2.3. 
Replies from Tor to the controller + + Reply = SyncReply / AsyncReply + SyncReply = *(MidReplyLine / DataReplyLine) EndReplyLine + AsyncReply = *(MidReplyLine / DataReplyLine) EndReplyLine + + MidReplyLine = StatusCode "-" ReplyLine + DataReplyLine = StatusCode "+" ReplyLine CmdData + EndReplyLine = StatusCode SP ReplyLine + ReplyLine = [ReplyText] CRLF + ReplyText = XXXX + StatusCode = 3DIGIT + + Unless specified otherwise, multiple lines in a single reply from + Tor to the controller are guaranteed to share the same status + code. Specific replies are mentioned below in section 3, and + described more fully in section 4. + + [Compatibility note: versions of Tor before 0.2.0.3-alpha sometimes + generate AsyncReplies of the form "*(MidReplyLine / DataReplyLine)". + This is incorrect, but controllers that need to work with these + versions of Tor should be prepared to get multi-line AsyncReplies with + the final line (usually "650 OK") omitted.] + +2.4. General-use tokens + + ; CRLF means, "the ASCII Carriage Return character (decimal value 13) + ; followed by the ASCII Linefeed character (decimal value 10)." + CRLF = CR LF + + ; How a controller tells Tor about a particular OR. There are four + ; possible formats: + ; $Fingerprint -- The router whose identity key hashes to the fingerprint. + ; This is the preferred way to refer to an OR. + ; $Fingerprint~Nickname -- The router whose identity key hashes to the + ; given fingerprint, but only if the router has the given nickname. + ; $Fingerprint=Nickname -- The router whose identity key hashes to the + ; given fingerprint, but only if the router is Named and has the given + ; nickname. + ; Nickname -- The Named router with the given nickname, or, if no such + ; router exists, any router whose nickname matches the one given. + ; This is not a safe way to refer to routers, since Named status + ; could under some circumstances change over time. 
+ ; + ; The tokens that implement the above follow: + + ServerSpec = LongName / Nickname + LongName = Fingerprint [ "~" Nickname ] + + ; For tors older than 0.3.1.3-alpha, LongName may have included an equal + ; sign ("=") in lieu of a tilde ("~"). The presence of an equal sign + ; denoted that the OR possessed the "Named" flag: + + LongName = Fingerprint [ ( "=" / "~" ) Nickname ] + + Fingerprint = "$" 40*HEXDIG + NicknameChar = "a"-"z" / "A"-"Z" / "0" - "9" + Nickname = 1*19 NicknameChar + + ; What follows is an outdated way to refer to ORs. + ; Feature VERBOSE_NAMES replaces ServerID with LongName in events and + ; GETINFO results. VERBOSE_NAMES can be enabled starting in Tor version + ; 0.1.2.2-alpha and it is always-on in 0.2.2.1-alpha and later. + ServerID = Nickname / Fingerprint + + + ; Unique identifiers for streams or circuits. Currently, Tor only + ; uses digits, but this may change + StreamID = 1*16 IDChar + CircuitID = 1*16 IDChar + ConnID = 1*16 IDChar + QueueID = 1*16 IDChar + IDChar = ALPHA / DIGIT + + Address = ip4-address / ip6-address / hostname (XXXX Define these) + + ; A "CmdData" section is a sequence of octets concluded by the terminating + ; sequence CRLF "." CRLF. The terminating sequence may not appear in the + ; body of the data. Leading periods on lines in the data are escaped with + ; an additional leading period as in RFC 2821 section 4.5.2. + CmdData = *DataLine "." CRLF + DataLine = CRLF / "." 1*LineItem CRLF / NonDotItem *LineItem CRLF + LineItem = NonCR / 1*CR NonCRLF + NonDotItem = NonDotCR / 1*CR NonCRLF + + ; ISOTime, ISOTime2, and ISOTime2Frac are time formats as specified in + ; ISO8601. 
+ ; example ISOTime: "2012-01-11 12:15:33" + ; example ISOTime2: "2012-01-11T12:15:33" + ; example ISOTime2Frac: "2012-01-11T12:15:33.51" + IsoDatePart = 4*DIGIT "-" 2*DIGIT "-" 2*DIGIT + IsoTimePart = 2*DIGIT ":" 2*DIGIT ":" 2*DIGIT + ISOTime = IsoDatePart " " IsoTimePart + ISOTime2 = IsoDatePart "T" IsoTimePart + ISOTime2Frac = IsoTime2 [ "." 1*DIGIT ] + + ; Numbers + LeadingDigit = "1" - "9" + UInt = LeadingDigit *Digit + +3. Commands + + All commands are case-insensitive, but most keywords are case-sensitive. + +3.1. SETCONF + + Change the value of one or more configuration variables. The syntax is: + + "SETCONF" 1*(SP keyword ["=" value]) CRLF + value = String / QuotedString + + Tor behaves as though it had just read each of the key-value pairs + from its configuration file. Keywords with no corresponding values have + their configuration values reset to 0 or NULL (use RESETCONF if you want + to set it back to its default). SETCONF is all-or-nothing: if there + is an error in any of the configuration settings, Tor sets none of them. + + Tor responds with a "250 OK" reply on success. + If some of the listed keywords can't be found, Tor replies with a + "552 Unrecognized option" message. Otherwise, Tor responds with a + "513 syntax error in configuration values" reply on syntax error, or a + "553 impossible configuration setting" reply on a semantic error. + + Some configuration options (e.g. "Bridge") take multiple values. Also, + some configuration keys (e.g. for hidden services and for entry + guard lists) form a context-sensitive group where order matters (see + GETCONF below). In these cases, setting _any_ of the options in a + SETCONF command is taken to reset all of the others. For example, + if two ORListenAddress values are configured, and a SETCONF command + arrives containing a single ORListenAddress value, the new command's + value replaces the two old values. 
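Because setting any value of a multi-valued option replaces all of its previous values, a controller normally sends every value it wants to keep in one SETCONF line. The following is a minimal sketch of building such a line. The helper names `quote` and `setconf_line` are hypothetical and not part of Tor or of any controller library; the quoting follows the QuotedString rule from section 2.1, and the sketch assumes values contain no CR or LF.

```python
def quote(value):
    """Render value as a QuotedString: escape backslashes and quotes."""
    if "\r" in value or "\n" in value:
        # Assumption of this sketch: line breaks are simply refused.
        raise ValueError("value must not contain CR or LF")
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def setconf_line(settings):
    """Build one SETCONF command from (keyword, value) pairs.

    A value of None emits the bare keyword, which resets the option
    to 0 or NULL. Repeating a keyword supplies several values for a
    multi-valued option in the same command.
    """
    parts = ["SETCONF"]
    for keyword, value in settings:
        if value is None:
            parts.append(keyword)
        else:
            parts.append(keyword + "=" + quote(value))
    return " ".join(parts) + "\r\n"
```

Sending all pairs in a single line also keeps the all-or-nothing behavior: if any setting is rejected, none of them takes effect.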
+ + Sometimes it is not possible to change configuration options solely by + issuing a series of SETCONF commands, because the value of one of the + configuration options depends on the value of another which has not yet + been set. Such situations can be overcome by setting multiple configuration + options with a single SETCONF command (e.g. SETCONF ORPort=443 + ORListenAddress=9001). + +3.2. RESETCONF + + Remove all settings for a given configuration option entirely, assign + its default value (if any), and then assign the String provided. + Typically the String is left empty, to simply set an option back to + its default. The syntax is: + + "RESETCONF" 1*(SP keyword ["=" String]) CRLF + + Otherwise it behaves like SETCONF above. + +3.3. GETCONF + + Request the value of zero or more configuration variable(s). + The syntax is: + + "GETCONF" *(SP keyword) CRLF + + If all of the listed keywords exist in the Tor configuration, Tor replies + with a series of reply lines of the form: + + 250 keyword=value + + If any option is set to a 'default' value semantically different from an + empty string, Tor may reply with a reply line of the form: + + 250 keyword + + Value may be a raw value or a quoted string. Tor will try to use unquoted + values except when the value could be misinterpreted through not being + quoted. (Right now, Tor supports no such misinterpretable values for + configuration options.) + + If some of the listed keywords can't be found, Tor replies with a + "552 unknown configuration keyword" message. + + If an option appears multiple times in the configuration, all of its + key-value pairs are returned in order. + + If no keywords were provided, Tor responds with a "250 OK" message. + + Some options are context-sensitive, and depend on other options with + different keywords. These cannot be fetched directly. 
Currently there + is only one such option: clients should use the "HiddenServiceOptions" + virtual keyword to get all HiddenServiceDir, HiddenServicePort, + HiddenServiceVersion, and HiddenserviceAuthorizeClient option settings. + +3.4. SETEVENTS + + Request the server to inform the client about interesting events. The + syntax is: + + "SETEVENTS" [SP "EXTENDED"] *(SP EventCode) CRLF + + EventCode = 1*(ALPHA / "_") (see section 4.1.x for event types) + + Any events *not* listed in the SETEVENTS line are turned off; thus, sending + SETEVENTS with an empty body turns off all event reporting. + + The server responds with a "250 OK" reply on success, and a "552 + Unrecognized event" reply if one of the event codes isn't recognized. (On + error, the list of active event codes isn't changed.) + + If the flag string "EXTENDED" is provided, Tor may provide extra + information with events for this connection; see 4.1 for more information. + NOTE: All events on a given connection will be provided in extended format, + or none. + NOTE: "EXTENDED" was first supported in Tor 0.1.1.9-alpha; it is + always-on in Tor 0.2.2.1-alpha and later. + + Each event is described in more detail in Section 4.1. + +3.5. AUTHENTICATE + + Sent from the client to the server. The syntax is: + + "AUTHENTICATE" [ SP 1*HEXDIG / QuotedString ] CRLF + + This command is used to authenticate to the server. The provided string is + one of the following: + + * (For the HASHEDPASSWORD authentication method; see 3.21) + The original password represented as a QuotedString. + + * (For the COOKIE authentication method; see 3.21) + The contents of the cookie file, formatted in hexadecimal. + + * (For the SAFECOOKIE authentication method; see 3.21) + The HMAC based on the AUTHCHALLENGE message, in hexadecimal. + + The server responds with "250 OK" on success or "515 Bad authentication" if + the authentication cookie is incorrect. Tor closes the connection on an + authentication failure. 
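   The first two forms of the AUTHENTICATE argument listed above can be
   sketched in Python as follows. The function names are hypothetical and
   not part of Tor, and the escaping applied to the password (a backslash
   before each backslash and double quote) is an assumption about
   QuotedString made for illustration. The SAFECOOKIE form is omitted
   because it depends on the AUTHCHALLENGE exchange.

```python
# Hypothetical helpers for illustration; not part of Tor or any library.


def authenticate_with_password(password):
    """AUTHENTICATE line carrying the original password as a QuotedString.

    Assumption: backslash and double quote are escaped with a backslash.
    """
    escaped = password.replace("\\", "\\\\").replace('"', '\\"')
    return 'AUTHENTICATE "' + escaped + '"\r\n'


def authenticate_with_cookie(cookie):
    """AUTHENTICATE line carrying the cookie file contents (bytes) in hex."""
    return "AUTHENTICATE " + cookie.hex() + "\r\n"


def authentication_succeeded(reply_line):
    """True for a "250 OK" reply; after a failure Tor closes the connection."""
    return reply_line.startswith("250 ")
```

   A controller using this sketch would read the cookie file in binary mode
   and pass its bytes to authenticate_with_cookie unchanged.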
+ + The authentication token can be specified as either a quoted ASCII string, + or as an unquoted hexadecimal encoding of that same string (to avoid escaping + issues). + + For information on how the implementation securely stores authentication + information on disk, see section 5.1. + + Before the client has authenticated, no command other than + PROTOCOLINFO, AUTHCHALLENGE, AUTHENTICATE, or QUIT is valid. If the + controller sends any other command, or sends a malformed command, or + sends an unsuccessful AUTHENTICATE command, or sends PROTOCOLINFO or + AUTHCHALLENGE more than once, Tor sends an error reply and closes + the connection. + + To prevent some cross-protocol attacks, the AUTHENTICATE command is still + required even if all authentication methods in Tor are disabled. In this + case, the controller should just send "AUTHENTICATE" CRLF. + + (Versions of Tor before 0.1.2.16 and 0.2.0.4-alpha did not close the + connection after an authentication failure.) + +3.6. SAVECONF + + Sent from the client to the server. The syntax is: + + "SAVECONF" [SP "FORCE"] CRLF + + Instructs the server to write out its config options into its torrc. Server + returns "250 OK" if successful, or "551 Unable to write configuration + to disk" if it can't write the file or some other error occurs. + + If the %include option is used on torrc, SAVECONF will not write the + configuration to disk. If the flag string "FORCE" is provided, the + configuration will be overwritten even if %include is used. Using %include + on defaults-torrc does not affect SAVECONF. (Introduced in 0.3.1.1-alpha.) + + See also the "getinfo config-text" command, if the controller wants + to write the torrc file itself. + + See also the "getinfo config-can-saveconf" command, to tell if the FORCE + flag will be required. (Also introduced in 0.3.1.1-alpha.) + +3.7. SIGNAL + + Sent from the client to the server. 
The syntax is: + + "SIGNAL" SP Signal CRLF + + Signal = "RELOAD" / "SHUTDOWN" / "DUMP" / "DEBUG" / "HALT" / + "HUP" / "INT" / "USR1" / "USR2" / "TERM" / "NEWNYM" / + "CLEARDNSCACHE" / "HEARTBEAT" / "ACTIVE" / "DORMANT" + + The meanings of the signals are: + + RELOAD -- Reload: reload config items. + SHUTDOWN -- Controlled shutdown: if server is an OP, exit immediately. + If it's an OR, close listeners and exit after + ShutdownWaitLength seconds. + DUMP -- Dump stats: log information about open connections and + circuits. + DEBUG -- Debug: switch all open logs to loglevel debug. + HALT -- Immediate shutdown: clean up and exit now. + CLEARDNSCACHE -- Forget the client-side cached IPs for all hostnames. + NEWNYM -- Switch to clean circuits, so new application requests + don't share any circuits with old ones. Also clears + the client-side DNS cache. (Tor MAY rate-limit its + response to this signal.) + HEARTBEAT -- Make Tor dump an unscheduled Heartbeat message to log. + DORMANT -- Tell Tor to become "dormant". A dormant Tor will + try to avoid CPU and network usage until it receives + a user-initiated network request. (Don't use this + on relays or hidden services yet!) + ACTIVE -- Tell Tor to stop being "dormant", as if it had received + a user-initiated network request. + + The server responds with "250 OK" if the signal is recognized (or simply + closes the socket if it was asked to close immediately), or "552 + Unrecognized signal" if the signal is unrecognized. + + Note that not all of these signals have POSIX signal equivalents. The + ones that do are as below. You may also use these POSIX names for the + signals that have them. + + RELOAD: HUP + SHUTDOWN: INT + HALT: TERM + DUMP: USR1 + DEBUG: USR2 + + [SIGNAL DORMANT and SIGNAL ACTIVE were added in 0.4.0.1-alpha.] + +3.8. MAPADDRESS + + Sent from the client to the server. 
The syntax is: + + "MAPADDRESS" 1*(Address "=" Address SP) CRLF + + The first address in each pair is an "original" address; the second is a + "replacement" address. The client sends this message to the server in + order to tell it that future SOCKS requests for connections to the original + address should be replaced with connections to the specified replacement + address. If the addresses are well-formed, and the server is able to + fulfill the request, the server replies with a 250 message: + + 250-OldAddress1=NewAddress1 + 250 OldAddress2=NewAddress2 + + containing the source and destination addresses. If the request is + malformed, the server replies with "512 syntax error in command + argument". If the server can't fulfill the request, it replies with + "451 resource exhausted". + + The client may decline to provide a body for the original address, and + instead send a special null address ("0.0.0.0" for IPv4, "::0" for IPv6, or + "." for hostname), signifying that the server should choose the original + address itself, and return that address in the reply. The server + should ensure that it returns an element of address space that is unlikely + to be in actual use. If there is already an address mapped to the + destination address, the server may reuse that mapping. + + If the original address is already mapped to a different address, the old + mapping is removed. If the original address and the destination address + are the same, the server removes any mapping in place for the original + address. + + Example: + + C: MAPADDRESS 1.2.3.4=torproject.org + S: 250 1.2.3.4=torproject.org + + C: GETINFO address-mappings/control + S: 250-address-mappings/control=1.2.3.4 torproject.org NEVER + S: 250 OK + + C: MAPADDRESS 1.2.3.4=1.2.3.4 + S: 250 1.2.3.4=1.2.3.4 + + C: GETINFO address-mappings/control + S: 250-address-mappings/control= + S: 250 OK + + {Note: This feature is designed to be used to help Tor-ify applications + that need to use SOCKS4 or hostname-less SOCKS5. 
There are three + approaches to doing this: + + 1. Somehow make them use SOCKS4a or SOCKS5-with-hostnames instead. + 2. Use tor-resolve (or another interface to Tor's resolve-over-SOCKS + feature) to resolve the hostname remotely. This doesn't work + with special addresses like x.onion or x.y.exit. + 3. Use MAPADDRESS to map an IP address to the desired hostname, and then + arrange to fool the application into thinking that the hostname + has resolved to that IP. + + This functionality is designed to help implement the 3rd approach.} + + Mappings set by the controller last until the Tor process exits: + they never expire. If the controller wants the mapping to last only + a certain time, then it must explicitly un-map the address when that + time has elapsed. + + MapAddress replies MAY contain mixed status codes. + + Example: + + C: MAPADDRESS xxx=@@@ 0.0.0.0=bogus1.google.com + S: 512-syntax error: invalid address '@@@' + S: 250 127.199.80.246=bogus1.google.com + +3.9. GETINFO + + Sent from the client to the server. The syntax is as for GETCONF: + + "GETINFO" 1*(SP keyword) CRLF + + Unlike GETCONF, this message is used for data that are not stored in the Tor + configuration file, and that may be longer than a single line. On success, + one ReplyLine is sent for each requested value, followed by a final 250 OK + ReplyLine. If a value fits on a single line, the format is: + + 250-keyword=value + If a value must be split over multiple lines, the format is: + + 250+keyword= + value + . + The server sends a 551 or 552 error on failure. + + Recognized keys and their values include: + + "version" -- The version of the server's software, which MAY include the + name of the software, such as "Tor 0.0.9.4". The name of the software, + if absent, is assumed to be "Tor". + + "config-file" -- The location of Tor's configuration file ("torrc"). + + "config-defaults-file" -- The location of Tor's configuration + defaults file ("torrc.defaults"). 
This file gets parsed before + torrc, and is typically used to replace Tor's default + configuration values. [First implemented in 0.2.3.9-alpha.] + + "config-text" -- The contents that Tor would write if you send it + a SAVECONF command, so the controller can write the file to + disk itself. [First implemented in 0.2.2.7-alpha.] + + "exit-policy/default" -- The default exit policy lines that Tor will + *append* to the ExitPolicy config option. + + "exit-policy/reject-private/default" -- The default exit policy lines + that Tor will *prepend* to the ExitPolicy config option when + ExitPolicyRejectPrivate is 1. + + "exit-policy/reject-private/relay" -- The relay-specific exit policy + lines that Tor will *prepend* to the ExitPolicy config option based + on the current values of ExitPolicyRejectPrivate and + ExitPolicyRejectLocalInterfaces. These lines are based on the public + addresses configured in the torrc and present on the relay's + interfaces. Will send 552 error if the server is not running as + onion router. Will send 551 on internal error which may be transient. + + "exit-policy/ipv4" + "exit-policy/ipv6" + "exit-policy/full" -- This OR's exit policy, in IPv4-only, IPv6-only, or + all-entries flavors. Handles errors in the same way as "exit-policy/ + reject-private/relay" does. + + "desc/id/" or "desc/name/" -- the latest + server descriptor for a given OR. (Note that modern Tor clients + do not download server descriptors by default, but download + microdescriptors instead. If microdescriptors are enabled, you'll + need to use "md" instead.) + + "md/all" -- all known microdescriptors for the entire Tor network. + Each microdescriptor is terminated by a newline. + [First implemented in 0.3.5.1-alpha] + + "md/id/" or "md/name/" -- the latest + microdescriptor for a given OR. Empty if we have no microdescriptor for + that OR (because we haven't downloaded one, or it isn't in the + consensus). [First implemented in 0.2.3.8-alpha.] 
+ + "desc/download-enabled" -- "1" if we try to download router descriptors; + "0" otherwise. [First implemented in 0.3.2.1-alpha] + + "md/download-enabled" -- "1" if we try to download microdescriptors; + "0" otherwise. [First implemented in 0.3.2.1-alpha] + + "dormant" -- A nonnegative integer: zero if Tor is currently active and + building circuits, and nonzero if Tor has gone idle due to lack of use + or some similar reason. [First implemented in 0.2.3.16-alpha] + + "desc-annotations/id/" -- outputs the annotations string + (source, timestamp of arrival, purpose, etc) for the corresponding + descriptor. [First implemented in 0.2.0.13-alpha.] + + "extra-info/digest/" -- the extrainfo document whose digest (in + hex) is . Only available if we're downloading extra-info + documents. + + "ns/id/" or "ns/name/" -- the latest router + status info (v3 directory style) for a given OR. Router status + info is as given in dir-spec.txt, and reflects the latest + consensus opinion about the + router in question. Like directory clients, controllers MUST + tolerate unrecognized flags and lines. The published date and + descriptor digest are those believed to be best by this Tor, + not necessarily those for a descriptor that Tor currently has. + [First implemented in 0.1.2.3-alpha.] + [In 0.2.0.9-alpha this switched from v2 directory style to v3] + + "ns/all" -- Router status info (v3 directory style) for all ORs + that the consensus has an opinion about, joined by newlines. + [First implemented in 0.1.2.3-alpha.] + [In 0.2.0.9-alpha this switched from v2 directory style to v3] + + "ns/purpose/" -- Router status info (v3 directory style) + for all ORs of this purpose. Mostly designed for /ns/purpose/bridge + queries. + [First implemented in 0.2.0.13-alpha.] 
+ [In 0.2.0.9-alpha this switched from v2 directory style to v3] + [In versions before 0.4.1.1-alpha we set the Running flag on + bridges when /ns/purpose/bridge is accessed] + [In 0.4.1.1-alpha we set the Running flag on bridges when the + bridge networkstatus file is written to disk] + + "desc/all-recent" -- the latest server descriptor for every router that + Tor knows about. (See md note about "desc/id" and "desc/name" above.) + + "network-status" -- [Deprecated in 0.3.1.1-alpha, removed + in 0.4.5.1-alpha.] + + "address-mappings/all" + "address-mappings/config" + "address-mappings/cache" + "address-mappings/control" -- a \r\n-separated list of address + mappings, each in the form of "from-address to-address expiry". + The 'config' key returns those address mappings set in the + configuration; the 'cache' key returns the mappings in the + client-side DNS cache; the 'control' key returns the mappings set + via the control interface; the 'all' target returns the mappings + set through any mechanism. + Expiry is formatted as with ADDRMAP events, except that "expiry" is + always a time in UTC or the string "NEVER"; see section 4.1.7. + First introduced in 0.2.0.3-alpha. + + "addr-mappings/*" -- as for address-mappings/*, but without the + expiry portion of the value. Use of this value is deprecated + since 0.2.0.3-alpha; use address-mappings instead. + + "address" -- the best guess at our external IP address. If we + have no guess, return a 551 error. (Added in 0.1.2.2-alpha) + + "address/v4" + "address/v6" + the best guess at our respective external IPv4 or IPv6 address. + If we have no guess, return a 551 error. (Added in 0.4.5.1-alpha) + + "fingerprint" -- the contents of the fingerprint file that Tor + writes as a relay, or a 551 if we're not a relay currently. + (Added in 0.1.2.3-alpha) + + "circuit-status" + A series of lines as for a circuit status event. Each line is of + the form described in section 4.1.1, omitting the initial + "650 CIRC ". 
Note that clients must be ready to accept additional + arguments as described in section 4.1. + + "stream-status" + A series of lines as for a stream status event. Each is of the form: + StreamID SP StreamStatus SP CircuitID SP Target CRLF + + "orconn-status" + A series of lines as for an OR connection status event. In Tor + 0.1.2.2-alpha with feature VERBOSE_NAMES enabled and in Tor + 0.2.2.1-alpha and later by default, each line is of the form: + LongName SP ORStatus CRLF + + In Tor versions 0.1.2.2-alpha through 0.2.2.1-alpha with feature + VERBOSE_NAMES turned off and before version 0.1.2.2-alpha, each line + is of the form: + ServerID SP ORStatus CRLF + + "entry-guards" + A series of lines listing the currently chosen entry guards, if any. + In Tor 0.1.2.2-alpha with feature VERBOSE_NAMES enabled and in Tor + 0.2.2.1-alpha and later by default, each line is of the form: + LongName SP Status [SP ISOTime] CRLF + + In Tor versions 0.1.2.2-alpha through 0.2.2.1-alpha with feature + VERBOSE_NAMES turned off and before version 0.1.2.2-alpha, each line + is of the form: + ServerID2 SP Status [SP ISOTime] CRLF + ServerID2 = Nickname / 40*HEXDIG + + The definition of Status is the same for both: + Status = "up" / "never-connected" / "down" / + "unusable" / "unlisted" + + [From 0.1.1.4-alpha to 0.1.1.10-alpha, entry-guards was called + "helper-nodes". Tor still supports calling "helper-nodes", but it + is deprecated and should not be used.] + + [Older versions of Tor (before 0.1.2.x-final) generated 'down' instead + of unlisted/unusable. Between 0.1.2.x-final and 0.2.6.3-alpha, + 'down' was never generated.] + + [XXXX ServerID2 differs from ServerID in not prefixing fingerprints + with a $. This is an implementation error. It would be nice to add + the $ back in if we can do so without breaking compatibility.] + + "traffic/read" -- Total bytes read (downloaded). + + "traffic/written" -- Total bytes written (uploaded). 
+ + "uptime" -- Uptime of the Tor daemon (in seconds). Added in + 0.3.5.1-alpha. + + "accounting/enabled" + "accounting/hibernating" + "accounting/bytes" + "accounting/bytes-left" + "accounting/interval-start" + "accounting/interval-wake" + "accounting/interval-end" + Information about accounting status. If accounting is enabled, + "enabled" is 1; otherwise it is 0. The "hibernating" field is "hard" + if we are accepting no data; "soft" if we're accepting no new + connections, and "awake" if we're not hibernating at all. The "bytes" + and "bytes-left" fields contain (read-bytes SP write-bytes), for the + start and the rest of the interval respectively. The 'interval-start' + and 'interval-end' fields are the borders of the current interval; the + 'interval-wake' field is the time within the current interval (if any) + where we plan[ned] to start being active. The times are UTC. + + "config/names" + A series of lines listing the available configuration options. Each is + of the form: + OptionName SP OptionType [ SP Documentation ] CRLF + OptionName = Keyword + OptionType = "Integer" / "TimeInterval" / "TimeMsecInterval" / + "DataSize" / "Float" / "Boolean" / "Time" / "CommaList" / + "Dependent" / "Virtual" / "String" / "LineList" + Documentation = Text + Note: The incorrect spelling "Dependant" was used from the time this key + was introduced in Tor 0.1.1.4-alpha until it was corrected in Tor + 0.3.0.2-alpha. It is recommended that clients accept both spellings. + + "config/defaults" + A series of lines listing default values for each configuration + option. Options which don't have a valid default don't show up + in the list. Introduced in Tor 0.2.4.1-alpha. + OptionName SP OptionValue CRLF + OptionName = Keyword + OptionValue = Text + + "info/names" + A series of lines listing the available GETINFO options. 
Each is of + one of these forms: + OptionName SP Documentation CRLF + OptionPrefix SP Documentation CRLF + OptionPrefix = OptionName "/*" + The OptionPrefix form indicates a number of options beginning with the + prefix. So if "config/*" is listed, other options beginning with + "config/" will work, but "config/*" itself is not an option. + + "events/names" + A space-separated list of all the events supported by this version of + Tor's SETEVENTS. + + "features/names" + A space-separated list of all the features supported by this version + of Tor's USEFEATURE. + + "signal/names" + A space-separated list of all the values supported by the SIGNAL + command. + + "ip-to-country/ipv4-available" + "ip-to-country/ipv6-available" + "1" if the relevant geoip or geoip6 database is present; "0" otherwise. + This field was added in Tor 0.3.2.1-alpha. + + "ip-to-country/*" + Maps IP addresses to 2-letter country codes. For example, + "GETINFO ip-to-country/18.0.0.1" should give "US". + + "process/pid" -- Process id belonging to the main tor process. + "process/uid" -- User id running the tor process, -1 if unknown (this is + unimplemented on Windows, returning -1). + "process/user" -- Username under which the tor process is running, + providing an empty string if none exists (this is unimplemented on + Windows, returning an empty string). + "process/descriptor-limit" -- Upper bound on the file descriptor limit, -1 + if unknown + + "dir/status-vote/current/consensus" [added in Tor 0.2.1.6-alpha] + "dir/status-vote/current/consensus-microdesc" [added in Tor 0.4.3.1-alpha] + "dir/status/authority" + "dir/status/fp/" + "dir/status/fp/++" + "dir/status/all" + "dir/server/fp/" + "dir/server/fp/++" + "dir/server/d/" + "dir/server/d/++" + "dir/server/authority" + "dir/server/all" + A series of lines listing directory contents, provided according to the + specification for the URLs listed in Section 4.4 of dir-spec.txt. 
Note + that Tor MUST NOT provide private information, such as descriptors for + routers not marked as general-purpose. When asked for 'authority' + information for which this Tor is not authoritative, Tor replies with + an empty string. + + Note that, as of Tor 0.2.3.3-alpha, Tor clients don't download server + descriptors anymore, but microdescriptors. So, a "551 Servers + unavailable" reply to all "GETINFO dir/server/*" requests is actually + correct. If you have an old program which absolutely requires server + descriptors to work, try setting UseMicrodescriptors 0 or + FetchUselessDescriptors 1 in your client's torrc. + + "status/circuit-established" + "status/enough-dir-info" + "status/good-server-descriptor" + "status/accepted-server-descriptor" + "status/..." + These provide the current internal Tor values for various Tor + states. See Section 4.1.10 for explanations. (Only a few of the + status events are available as getinfo's currently. Let us know if + you want more exposed.) + "status/reachability-succeeded/or" + 0 or 1, depending on whether we've found our ORPort reachable. + "status/reachability-succeeded/dir" + 0 or 1, depending on whether we've found our DirPort reachable. + 1 if there is no DirPort, and therefore no need for a reachability + check. + "status/reachability-succeeded" + "OR=" ("0"/"1") SP "DIR=" ("0"/"1") + Combines status/reachability-succeeded/*; controllers MUST ignore + unrecognized elements in this entry. + "status/bootstrap-phase" + Returns the most recent bootstrap phase status event + sent. Specifically, it returns a string starting with either + "NOTICE BOOTSTRAP ..." or "WARN BOOTSTRAP ...". Controllers should + use this getinfo when they connect or attach to Tor to learn its + current bootstrap state. + "status/version/recommended" + List of currently recommended versions. + "status/version/current" + Status of the current version. One of: new, old, unrecommended, + recommended, new in series, obsolete, unknown. 
+ "status/clients-seen" + A summary of which countries we've seen clients from recently, + formatted the same as the CLIENTS_SEEN status event described in + Section 4.1.14. This GETINFO option is currently available only + for bridge relays. + "status/fresh-relay-descs" + Provides fresh server and extra-info descriptors for our relay. Note + this is *not* the latest descriptors we've published, but rather what we + would generate if we needed to make a new descriptor right now. + + "net/listeners/*" + + A quoted, space-separated list of the locations where Tor is listening + for connections of the specified type. These can contain IPv4 + network address... + + "127.0.0.1:9050" "127.0.0.1:9051" + + ... or local unix sockets... + + "unix:/home/my_user/.tor/socket" + + ... or IPv6 network addresses: + + "[2001:0db8:7000:0000:0000:dead:beef:1234]:9050" + + [New in Tor 0.2.2.26-beta.] + + "net/listeners/or" + + Listeners for OR connections. Talks Tor protocol as described in + tor-spec.txt. + + "net/listeners/dir" + + Listeners for Tor directory protocol, as described in dir-spec.txt. + + "net/listeners/socks" + + Listeners for onion proxy connections that talk SOCKS4/4a/5 protocol. + + "net/listeners/trans" + + Listeners for transparent connections redirected by firewall, such as + pf or netfilter. + + "net/listeners/natd" + + Listeners for transparent connections redirected by natd. + + "net/listeners/dns" + + Listeners for a subset of DNS protocol that Tor network supports. + + "net/listeners/control" + + Listeners for Tor control protocol, described herein. + + "net/listeners/extor" + + Listeners corresponding to Extended ORPorts for integration with + pluggable transports. See proposals 180 and 196. + + "net/listeners/httptunnel" + + Listeners for onion proxy connections that leverage HTTP CONNECT + tunnelling. + + [The extor and httptunnel lists were added in 0.3.2.12, 0.3.3.10, and + 0.3.4.6-rc.] 
+ + "dir-usage" + A newline-separated list of how many bytes we've served to answer + each type of directory request. The format of each line is: + Keyword 1*SP Integer 1*SP Integer + where the first integer is the number of bytes written, and the second + is the number of requests answered. + + [This feature was added in Tor 0.2.2.1-alpha, and removed in + Tor 0.2.9.1-alpha. Even when it existed, it only provided + useful output when the Tor client was built with either the + INSTRUMENT_DOWNLOADS or RUNNING_DOXYGEN compile-time options.] + + "bw-event-cache" + A space-separated summary of recent BW events in chronological order + from oldest to newest. Each event is represented by a comma-separated + tuple of "R,W", R is the number of bytes read, and W is the number of + bytes written. These entries each represent about one second's worth + of traffic. + [New in Tor 0.2.6.3-alpha] + + "consensus/valid-after" + "consensus/fresh-until" + "consensus/valid-until" + Each of these produces an ISOTime describing part of the lifetime of + the current (valid, accepted) consensus that Tor has. + [New in Tor 0.2.6.3-alpha] + + "hs/client/desc/id/" + Prints the content of the hidden service descriptor corresponding to + the given which is an onion address without the ".onion" part. + The client's cache is queried to find the descriptor. The format of + the descriptor is described in section 1.3 of the rend-spec.txt + document. + + If is unrecognized or if not found in the cache, a 551 error is + returned. + + [New in Tor 0.2.7.1-alpha] + [HS v3 support added 0.3.3.1-alpha] + + "hs/service/desc/id/" + Prints the content of the hidden service descriptor corresponding to + the given which is an onion address without the ".onion" part. + The service's local descriptor cache is queried to find the descriptor. + The format of the descriptor is described in section 1.3 of the + rend-spec.txt document. + + If is unrecognized or if not found in the cache, a 551 error is + returned. 
+ + [New in Tor 0.2.7.2-alpha] + [HS v3 support added 0.3.3.1-alpha] + + "onions/current" + "onions/detached" + A newline-separated list of the Onion ("Hidden") Services created + via the "ADD_ONION" command. The 'current' key returns Onion Services + belonging to the current control connection. The 'detached' key + returns Onion Services detached from the parent control connection + (as in, belonging to no control connection). + The format of each line is: + HSAddress + [New in Tor 0.2.7.1-alpha.] + [HS v3 support added 0.3.3.1-alpha] + + "network-liveness" + The string "up" or "down", indicating whether we currently believe the + network is reachable. + + "downloads/" + The keys under downloads/ are used to query download statuses; they all + return either a sequence of newline-terminated hex encoded digests, or + a "serialized download status" as follows: + + SerializedDownloadStatus = + -- when do we plan to next attempt to download this object? + "next-attempt-at" SP ISOTime CRLF + -- how many times have we failed since the last success? + "n-download-failures" SP UInt CRLF + -- how many times have we tried to download this? + "n-download-attempts" SP UInt CRLF + -- according to which schedule rule will we download this? + "schedule" SP DownloadSchedule CRLF + -- do we want to fetch this from an authority, or will any cache do? + "want-authority" SP DownloadWantAuthority CRLF + -- do we increase our download delay whenever we fail to fetch this, + -- or whenever we attempt fetching this? + "increment-on" SP DownloadIncrementOn CRLF + -- do we increase the download schedule deterministically, or at + -- random? + "backoff" SP DownloadBackoff CRLF + [ + -- with an exponential backoff, where are we in the schedule? + "last-backoff-position" Uint CRLF + -- with an exponential backoff, what was our last delay? 
+ "last-delay-used" SP UInt CRLF + ] + + where + + DownloadSchedule = + "DL_SCHED_GENERIC" / "DL_SCHED_CONSENSUS" / "DL_SCHED_BRIDGE" + DownloadWantAuthority = + "DL_WANT_ANY_DIRSERVER" / "DL_WANT_AUTHORITY" + DownloadIncrementOn = + "DL_SCHED_INCREMENT_FAILURE" / "DL_SCHED_INCREMENT_ATTEMPT" + DownloadBackoff = + "DL_SCHED_DETERMINISTIC" / "DL_SCHED_RANDOM_EXPONENTIAL" + + The optional last two lines must be present if DownloadBackoff is + "DL_SCHED_RANDOM_EXPONENTIAL" and must be absent if DownloadBackoff + is "DL_SCHED_DETERMINISTIC". + + In detail, the keys supported are: + + "downloads/networkstatus/ns" + The SerializedDownloadStatus for the NS-flavored consensus for + whichever bootstrap state Tor is currently in. + + "downloads/networkstatus/ns/bootstrap" + The SerializedDownloadStatus for the NS-flavored consensus at + bootstrap time, regardless of whether we are currently bootstrapping. + + "downloads/networkstatus/ns/running" + + The SerializedDownloadStatus for the NS-flavored consensus when + running, regardless of whether we are currently bootstrapping. + + "downloads/networkstatus/microdesc" + The SerializedDownloadStatus for the microdesc-flavored consensus for + whichever bootstrap state Tor is currently in. + + "downloads/networkstatus/microdesc/bootstrap" + The SerializedDownloadStatus for the microdesc-flavored consensus at + bootstrap time, regardless of whether we are currently bootstrapping. + + "downloads/networkstatus/microdesc/running" + The SerializedDownloadStatus for the microdesc-flavored consensus when + running, regardless of whether we are currently bootstrapping. + + "downloads/cert/fps" + + A newline-separated list of hex-encoded digests for authority + certificates for which we have download status available. + + "downloads/cert/fp/" + A SerializedDownloadStatus for the default certificate for the + identity digest returned by the downloads/cert/fps key. 
+ + "downloads/cert/fp//sks" + A newline-separated list of hex-encoded signing key digests for the + authority identity digest returned by the + downloads/cert/fps key. + + "downloads/cert/fp//" + A SerializedDownloadStatus for the certificate for the identity + digest returned by the downloads/cert/fps key and signing + key digest returned by the downloads/cert/fp// + sks key. + + "downloads/desc/descs" + A newline-separated list of hex-encoded router descriptor digests + [note, not identity digests - the Tor process may not have seen them + yet while downloading router descriptors]. If the Tor process is not + using a NS-flavored consensus, a 551 error is returned. + + "downloads/desc/" + A SerializedDownloadStatus for the router descriptor with digest + as returned by the downloads/desc/descs key. If the Tor + process is not using a NS-flavored consensus, a 551 error is returned. + + "downloads/bridge/bridges" + A newline-separated list of hex-encoded bridge identity digests. If + the Tor process is not using bridges, a 551 error is returned. + + "downloads/bridge/" + A SerializedDownloadStatus for the bridge descriptor with identity + digest as returned by the downloads/bridge/bridges key. If + the Tor process is not using bridges, a 551 error is returned. + + "sr/current" + "sr/previous" + The current or previous shared random value, as received in the + consensus, base-64 encoded. An empty value means that either + the consensus has no shared random value, or Tor has no consensus. + + "current-time/local" + "current-time/utc" + The current system or UTC time, as returned by the system, in ISOTime2 + format. (Introduced in 0.3.4.1-alpha.) + + "stats/ntor/requested" + "stats/ntor/assigned" + The NTor circuit onion handshake rephist values which are requested or + assigned. (Introduced in 0.4.5.1-alpha) + + "stats/tap/requested" + "stats/tap/assigned" + The TAP circuit onion handshake rephist values which are requested or + assigned. 
(Introduced in 0.4.5.1-alpha) + + "config-can-saveconf" + 0 or 1, depending on whether it is possible to use SAVECONF without the + FORCE flag. (Introduced in 0.3.1.1-alpha.) + + "limits/max-mem-in-queues" + The amount of memory that Tor's out-of-memory checker will allow + Tor to allocate (in places it can see) before it starts freeing memory + and killing circuits. See the MaxMemInQueues option for more + details. Unlike the option, this value reflects Tor's actual limit, and + may be adjusted depending on the available system memory rather than on + the MaxMemInQueues option. (Introduced in 0.2.5.4-alpha) + + Examples: + + C: GETINFO version desc/name/moria1 + S: 250+desc/name/moria= + S: [Descriptor for moria] + S: . + S: 250-version=Tor 0.1.1.0-alpha-cvs + S: 250 OK + +3.10. EXTENDCIRCUIT + + Sent from the client to the server. The format is: + + "EXTENDCIRCUIT" SP CircuitID + [SP ServerSpec *("," ServerSpec)] + [SP "purpose=" Purpose] CRLF + + This request takes one of two forms: either the CircuitID is zero, in + which case it is a request for the server to build a new circuit, + or the CircuitID is nonzero, in which case it is a request for the + server to extend an existing circuit with that ID according to the + specified path. + + If the CircuitID is 0, the controller has the option of providing + a path for Tor to use to build the circuit. If it does not provide + a path, Tor will select one automatically from high capacity nodes + according to path-spec.txt. + + If CircuitID is 0 and "purpose=" is specified, then the circuit's + purpose is set. Two choices are recognized: "general" and + "controller". If not specified, circuits are created as "general". + + If the request is successful, the server sends a reply containing a + message body consisting of the CircuitID of the (maybe newly created) + circuit. The syntax is "250" SP "EXTENDED" SP CircuitID CRLF. + +3.11. SETCIRCUITPURPOSE + + Sent from the client to the server. 
The format is: + + "SETCIRCUITPURPOSE" SP CircuitID SP "purpose=" Purpose CRLF + + This changes the circuit's purpose. See EXTENDCIRCUIT above for details. + +3.12. SETROUTERPURPOSE + + Sent from the client to the server. The format is: + + "SETROUTERPURPOSE" SP NicknameOrKey SP Purpose CRLF + + This changes the descriptor's purpose. See +POSTDESCRIPTOR below + for details. + + NOTE: This command was disabled and made obsolete as of Tor + 0.2.0.8-alpha. It doesn't exist anymore, and is listed here only for + historical interest. + +3.13. ATTACHSTREAM + + Sent from the client to the server. The syntax is: + + "ATTACHSTREAM" SP StreamID SP CircuitID [SP "HOP=" HopNum] CRLF + + This message informs the server that the specified stream should be + associated with the specified circuit. Each stream may be associated with + at most one circuit, and multiple streams may share the same circuit. + Streams can only be attached to completed circuits (that is, circuits that + have sent a circuit status 'BUILT' event or are listed as built in a + GETINFO circuit-status request). + + If the circuit ID is 0, responsibility for attaching the given stream is + returned to Tor. + + If HOP=HopNum is specified, Tor will choose the HopNumth hop in the + circuit as the exit node, rather than the last node in the circuit. + Hops are 1-indexed; generally, it is not permitted to attach to hop 1. + + Tor responds with "250 OK" if it can attach the stream, 552 if the + circuit or stream didn't exist, 555 if the stream isn't in an + appropriate state to be attached (e.g. it's already open), or 551 if + the stream couldn't be attached for another reason. + + {Implementation note: Tor will close unattached streams by itself, + roughly two minutes after they are born. 
Let the developers know if + that turns out to be a problem.} + + {Implementation note: By default, Tor automatically attaches streams to + circuits itself, unless the configuration variable + "__LeaveStreamsUnattached" is set to "1". Attempting to attach streams + via TC when "__LeaveStreamsUnattached" is false may cause a race between + Tor and the controller, as both attempt to attach streams to circuits.} + + {Implementation note: You can try to attachstream to a stream that + has already sent a connect or resolve request but hasn't succeeded + yet, in which case Tor will detach the stream from its current circuit + before proceeding with the new attach request.} + +3.14. POSTDESCRIPTOR + + Sent from the client to the server. The syntax is: + + "+POSTDESCRIPTOR" [SP "purpose=" Purpose] [SP "cache=" Cache] + CRLF Descriptor CRLF "." CRLF + + This message informs the server about a new descriptor. If Purpose is + specified, it must be either "general", "controller", or "bridge", + else we return a 552 error. The default is "general". + + If Cache is specified, it must be either "no" or "yes", else we + return a 552 error. If Cache is not specified, Tor will decide for + itself whether it wants to cache the descriptor, and controllers + must not rely on its choice. + + The descriptor, when parsed, must contain a number of well-specified + fields, including fields for its nickname and identity. + + If there is an error in parsing the descriptor, the server must send a + "554 Invalid descriptor" reply. If the descriptor is well-formed but + the server chooses not to add it, it must reply with a 251 message + whose body explains why the server was not added. If the descriptor + is added, Tor replies with "250 OK". + +3.15. REDIRECTSTREAM + + Sent from the client to the server. The syntax is: + + "REDIRECTSTREAM" SP StreamID SP Address [SP Port] CRLF + + Tells the server to change the exit address on the specified stream. 
If + Port is specified, changes the destination port as well. No remapping + is performed on the new provided address. + + To be sure that the modified address will be used, this event must be sent + after a new stream event is received, and before attaching this stream to + a circuit. + + Tor replies with "250 OK" on success. + +3.16. CLOSESTREAM + + Sent from the client to the server. The syntax is: + + "CLOSESTREAM" SP StreamID SP Reason *(SP Flag) CRLF + + Tells the server to close the specified stream. The reason should be one + of the Tor RELAY_END reasons given in tor-spec.txt, as a decimal. Flags is + not used currently; Tor servers SHOULD ignore unrecognized flags. Tor may + hold the stream open for a while to flush any data that is pending. + + Tor replies with "250 OK" on success, or a 512 if there aren't enough + arguments, or a 552 if it doesn't recognize the StreamID or reason. + +3.17. CLOSECIRCUIT + + The syntax is: + + "CLOSECIRCUIT" SP CircuitID *(SP Flag) CRLF + Flag = "IfUnused" + + Tells the server to close the specified circuit. If "IfUnused" is + provided, do not close the circuit unless it is unused. + + Other flags may be defined in the future; Tor SHOULD ignore unrecognized + flags. + + Tor replies with "250 OK" on success, or a 512 if there aren't enough + arguments, or a 552 if it doesn't recognize the CircuitID. + +3.18. QUIT + + Tells the server to hang up on this controller connection. This command + can be used before authenticating. + +3.19. USEFEATURE + + Adding additional features to the control protocol sometimes will break + backwards compatibility. Initially such features are added into Tor and + disabled by default. USEFEATURE can enable these additional features. + + The syntax is: + + "USEFEATURE" *(SP FeatureName) CRLF + FeatureName = 1*(ALPHA / DIGIT / "_" / "-") + + Feature names are case-insensitive. + + Once enabled, a feature stays enabled for the duration of the connection + to the controller. 
A new connection to the controller must be opened to + disable an enabled feature. + + Features are a forward-compatibility mechanism; each feature will eventually + become a standard part of the control protocol. Once a feature becomes part + of the protocol, it is always-on. Each feature documents the version it was + introduced as a feature and the version in which it became part of the + protocol. + + Tor will ignore a request to use any feature that is always-on. Tor will give + a 552 error in response to an unrecognized feature. + + EXTENDED_EVENTS + + Same as passing 'EXTENDED' to SETEVENTS; this is the preferred way to + request the extended event syntax. + + This feature was first introduced in 0.1.2.3-alpha. It is always-on + and part of the protocol in Tor 0.2.2.1-alpha and later. + + VERBOSE_NAMES + + Replaces ServerID with LongName in events and GETINFO results. LongName + provides a Fingerprint for all routers, an indication of Named status, + and a Nickname if one is known. LongName is strictly more informative + than ServerID, which only provides either a Fingerprint or a Nickname. + + This feature was first introduced in 0.1.2.2-alpha. It is always-on and + part of the protocol in Tor 0.2.2.1-alpha and later. + +3.20. RESOLVE + + The syntax is + + "RESOLVE" *Option *Address CRLF + Option = "mode=reverse" + Address = a hostname or IPv4 address + + This command launches a remote hostname lookup request for every specified + request (or reverse lookup if "mode=reverse" is specified). Note that the + request is done in the background: to see the answers, your controller will + need to listen for ADDRMAP events; see 4.1.7 below. + + [Added in Tor 0.2.0.3-alpha] + +3.21. 
PROTOCOLINFO + + The syntax is: + + "PROTOCOLINFO" *(SP PIVERSION) CRLF + + The server reply format is: + + "250-PROTOCOLINFO" SP PIVERSION CRLF *InfoLine "250 OK" CRLF + + InfoLine = AuthLine / VersionLine / OtherLine + + AuthLine = "250-AUTH" SP "METHODS=" AuthMethod *("," AuthMethod) + *(SP "COOKIEFILE=" AuthCookieFile) CRLF + VersionLine = "250-VERSION" SP "Tor=" TorVersion OptArguments CRLF + + AuthMethod = + "NULL" / ; No authentication is required + "HASHEDPASSWORD" / ; A controller must supply the original password + "COOKIE" / ; ... or supply the contents of a cookie file + "SAFECOOKIE" ; ... or prove knowledge of a cookie file's contents + + AuthCookieFile = QuotedString + TorVersion = QuotedString + + OtherLine = "250-" Keyword OptArguments CRLF + + PIVERSION: 1*DIGIT + + This command tells the controller what kinds of authentication are + supported. + + Tor MAY give its InfoLines in any order; controllers MUST ignore InfoLines + with keywords they do not recognize. Controllers MUST ignore extraneous + data on any InfoLine. + + PIVERSION is there in case we drastically change the syntax one day. For + now it should always be "1". Controllers MAY provide a list of the + protocolinfo versions they support; Tor MAY select a version that the + controller does not support. + + AuthMethod is used to specify one or more control authentication + methods that Tor currently accepts. + + AuthCookieFile specifies the absolute path and filename of the + authentication cookie that Tor is expecting and is provided iff the + METHODS field contains the method "COOKIE" and/or "SAFECOOKIE". + Controllers MUST handle escape sequences inside this string. + + All authentication cookies are 32 bytes long. Controllers MUST NOT + use the contents of a non-32-byte-long file as an authentication + cookie. + + If the METHODS field contains the method "SAFECOOKIE", every + AuthCookieFile must contain the same authentication cookie. 
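The AuthLine syntax and the 32-byte cookie rule above can be sketched in Python as follows. This is a minimal illustration, not Tor's own code: the function names are hypothetical, and the escape handling is deliberately simplified (it covers backslash followed by a single character, and does not decode octal escapes).

```python
import re

COOKIE_LEN = 32  # "All authentication cookies are 32 bytes long."


def unescape_quoted(body: str) -> str:
    """Undo backslash escapes inside a QuotedString body (simplified:
    octal escapes are NOT handled in this sketch)."""
    simple = {"n": "\n", "t": "\t", "r": "\r"}
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\" and i + 1 < len(body):
            nxt = body[i + 1]
            out.append(simple.get(nxt, nxt))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def parse_auth_line(line: str):
    """Return (methods, cookie_files) from a '250-AUTH ...' InfoLine."""
    line = line.rstrip("\r\n")
    prefix = "250-AUTH "
    if not line.startswith(prefix):
        raise ValueError("not an AUTH line")
    rest = line[len(prefix):]
    m = re.match(r"METHODS=([^ ]+)", rest)
    if m is None:
        raise ValueError("AUTH line has no METHODS field")
    methods = m.group(1).split(",")
    # Zero or more COOKIEFILE="..." arguments; anything else is ignored,
    # since controllers must ignore extraneous data on an InfoLine.
    raw_files = re.findall(r'COOKIEFILE="((?:[^"\\]|\\.)*)"', rest)
    return methods, [unescape_quoted(f) for f in raw_files]


def is_usable_cookie(data: bytes) -> bool:
    """A controller must not use a file that is not exactly 32 bytes."""
    return len(data) == COOKIE_LEN
```

A controller would typically call parse_auth_line on each InfoLine that starts with "250-AUTH", then read each listed file and discard any whose contents fail is_usable_cookie.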
+ + The COOKIE authentication method exposes the user running a + controller to an unintended information disclosure attack whenever + the controller has greater filesystem read access than the process + that it has connected to. (Note that a controller may connect to a + process other than Tor.) It is almost never safe to use, even if + the controller's user has explicitly specified which filename to + read an authentication cookie from. For this reason, the COOKIE + authentication method has been deprecated and will be removed from + a future version of Tor. + + The VERSION line contains the Tor version. + + [Unlike other commands besides AUTHENTICATE, PROTOCOLINFO may be used (but + only once!) before AUTHENTICATE.] + + [PROTOCOLINFO was not supported before Tor 0.2.0.5-alpha.] + +3.22. LOADCONF + + The syntax is: + + "+LOADCONF" CRLF ConfigText CRLF "." CRLF + + This command allows a controller to upload the text of a config file + to Tor over the control port. This config file is then loaded as if + it had been read from disk. + + [LOADCONF was added in Tor 0.2.1.1-alpha.] + +3.23. TAKEOWNERSHIP + + The syntax is: + + "TAKEOWNERSHIP" CRLF + + This command instructs Tor to shut down when this control + connection is closed. This command affects each control connection + that sends it independently; if multiple control connections send + the TAKEOWNERSHIP command to a Tor instance, Tor will shut down when + any of those connections closes. + + (As of Tor 0.2.5.2-alpha, Tor does not wait a while for circuits to + close when shutting down because of an exiting controller. If you + want to ensure a clean shutdown--and you should!--then send "SIGNAL + SHUTDOWN" and wait for the Tor process to close.) + + This command is intended to be used with the + __OwningControllerProcess configuration option. 
A controller that + starts a Tor process which the user cannot easily control or stop + should 'own' that Tor process: + + * When starting Tor, the controller should specify its PID in an + __OwningControllerProcess on Tor's command line. This will + cause Tor to poll for the existence of a process with that PID, + and exit if it does not find such a process. (This is not a + completely reliable way to detect whether the 'owning + controller' is still running, but it should work well enough in + most cases.) + + * Once the controller has connected to Tor's control port, it + should send the TAKEOWNERSHIP command along its control + connection. At this point, *both* the TAKEOWNERSHIP command and + the __OwningControllerProcess option are in effect: Tor will + exit when the control connection ends *and* Tor will exit if it + detects that there is no process with the PID specified in the + __OwningControllerProcess option. + + * After the controller has sent the TAKEOWNERSHIP command, it + should send "RESETCONF __OwningControllerProcess" along its + control connection. This will cause Tor to stop polling for the + existence of a process with its owning controller's PID; Tor + will still exit when the control connection ends. + + [TAKEOWNERSHIP was added in Tor 0.2.2.28-beta.] + +3.24. AUTHCHALLENGE + + The syntax is: + + "AUTHCHALLENGE" SP "SAFECOOKIE" + SP ClientNonce + CRLF + + ClientNonce = 2*HEXDIG / QuotedString + + This command is used to begin the authentication routine for the + SAFECOOKIE method of authentication. + + If the server accepts the command, the server reply format is: + + "250 AUTHCHALLENGE" + SP "SERVERHASH=" ServerHash + SP "SERVERNONCE=" ServerNonce + CRLF + + ServerHash = 64*64HEXDIG + ServerNonce = 64*64HEXDIG + + The ClientNonce, ServerHash, and ServerNonce values are + encoded/decoded in the same way as the argument passed to the + AUTHENTICATE command. ServerNonce MUST be 32 bytes long. 
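The AUTHCHALLENGE request and reply syntax above can be sketched from the controller's side as follows. This is an illustration under stated assumptions: the function names are hypothetical, the client nonce is sent in the hex form (the grammar also allows a QuotedString), and the 32-byte client nonce length is a choice made for this sketch rather than something the text above requires.

```python
import os
import re

CLIENT_NONCE_LEN = 32  # assumption for this sketch; not mandated above

# ServerHash and ServerNonce are each exactly 64 hex digits (32 bytes).
_REPLY_RE = re.compile(
    r"^250 AUTHCHALLENGE"
    r" SERVERHASH=([0-9A-Fa-f]{64})"
    r" SERVERNONCE=([0-9A-Fa-f]{64})$"
)


def new_client_nonce() -> bytes:
    """Generate a fresh random nonce for one AUTHCHALLENGE exchange."""
    return os.urandom(CLIENT_NONCE_LEN)


def build_authchallenge(client_nonce: bytes) -> bytes:
    """Build the command line, encoding the nonce as hex digits."""
    if len(client_nonce) < 1:
        raise ValueError("ClientNonce needs at least two hex digits")
    hex_nonce = client_nonce.hex().upper().encode("ascii")
    return b"AUTHCHALLENGE SAFECOOKIE " + hex_nonce + b"\r\n"


def parse_authchallenge_reply(line: str):
    """Return (server_hash, server_nonce) as raw 32-byte values."""
    m = _REPLY_RE.match(line.rstrip("\r\n"))
    if m is None:
        raise ValueError("malformed AUTHCHALLENGE reply")
    return bytes.fromhex(m.group(1)), bytes.fromhex(m.group(2))
```

The controller should keep the nonce it sent, since it needs it together with the two values returned here to finish the SAFECOOKIE exchange.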
+ + ServerHash is computed as: + + HMAC-SHA256("Tor safe cookie authentication server-to-controller hash", + CookieString | ClientNonce | ServerNonce) + + (with the HMAC key as its first argument) + + After a controller sends a successful AUTHCHALLENGE command, the + next command sent on the connection must be an AUTHENTICATE command, + and the only authentication string which that AUTHENTICATE command + will accept is: + + HMAC-SHA256("Tor safe cookie authentication controller-to-server hash", + CookieString | ClientNonce | ServerNonce) + + [Unlike other commands besides AUTHENTICATE, AUTHCHALLENGE may be + used (but only once!) before AUTHENTICATE.] + + [AUTHCHALLENGE was added in Tor 0.2.3.13-alpha.] + +3.25. DROPGUARDS + + The syntax is: + + "DROPGUARDS" CRLF + + Tells the server to drop all guard nodes. Do not invoke this command + lightly; it can increase vulnerability to tracking attacks over time. + + Tor replies with "250 OK" on success. + + [DROPGUARDS was added in Tor 0.2.5.2-alpha.] + +3.26. HSFETCH + + The syntax is: + + "HSFETCH" SP (HSAddress / "v" Version "-" DescId) + *[SP "SERVER=" Server] CRLF + + HSAddress = 16*Base32Character / 56*Base32Character + Version = "2" / "3" + DescId = 32*Base32Character + Server = LongName + + This command launches hidden service descriptor fetch(es) for the given + HSAddress or DescId. + + HSAddress can be version 2 or version 3 addresses. DescIDs can only be + version 2 IDs. Version 2 addresses consist of 16*Base32Character and + version 3 addresses consist of 56*Base32Character. + + If a DescId is specified, at least one Server MUST also be provided, + otherwise a 512 error is returned. If no DescId and Server(s) are specified, + it behaves like a normal Tor client descriptor fetch. If one or more + Server are given, they are used instead triggering a fetch on each of them + in parallel. + + The caching behavior when fetching a descriptor using this command is + identical to normal Tor client behavior. 
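The HSFETCH argument rules above can be sketched as a small command builder. This is an illustration only: build_hsfetch is a hypothetical helper, Base32Character is assumed to be the lowercase alphabet a-z and 2-7, Server values are passed through as opaque strings, and the sketch follows the prose in accepting only version 2 DescIds.

```python
import re

# Assumption: Base32Character is lowercase a-z plus 2-7.
_ADDRESS_RE = re.compile(r"^(?:[a-z2-7]{16}|[a-z2-7]{56})$")  # v2 or v3
_DESC_ID_RE = re.compile(r"^v2-[a-z2-7]{32}$")  # DescIds are v2 only


def build_hsfetch(target: str, servers=()) -> str:
    """Build an HSFETCH line for an HSAddress or a 'v2-<DescId>' target."""
    servers = list(servers)
    if _DESC_ID_RE.match(target):
        # A DescId without a Server would be rejected with a 512.
        if not servers:
            raise ValueError("a DescId requires at least one SERVER")
    elif not _ADDRESS_RE.match(target):
        raise ValueError("target is neither an HSAddress nor a DescId")
    parts = ["HSFETCH", target]
    parts.extend("SERVER=" + s for s in servers)
    return " ".join(parts) + "\r\n"
```

Checking these rules on the controller side avoids a round trip that would only end in a 512 or 513 reply.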
+ + Details on how to compute a descriptor id (DescId) can be found in + rend-spec.txt section 1.3. + + If any values are unrecognized, a 513 error is returned and the command is + stopped. On success, Tor replies "250 OK" then Tor MUST eventually follow + this with both a HS_DESC and HS_DESC_CONTENT events with the results. If + SERVER is specified then events are emitted for each location. + + Examples are: + + C: HSFETCH v2-gezdgnbvgy3tqolbmjrwizlgm5ugs2tl + SERVER=9695DFC35FFEB861329B9F1AB04C46397020CE31 + S: 250 OK + + C: HSFETCH ajkhdsfuygaesfaa + S: 250 OK + + C: HSFETCH vww6ybal4bd7szmgncyruucpgfkqahzddi37ktceo3ah7ngmcopnpyyd + S: 250 OK + + [HSFETCH was added in Tor 0.2.7.1-alpha] + [HS v3 support added 0.4.1.1-alpha] + +3.27. ADD_ONION + + The syntax is: + + "ADD_ONION" SP KeyType ":" KeyBlob + [SP "Flags=" Flag *("," Flag)] + [SP "MaxStreams=" NumStreams] + 1*(SP "Port=" VirtPort ["," Target]) + *(SP "ClientAuth=" ClientName [":" ClientBlob]) CRLF + *(SP "ClientAuthV3=" V3Key) CRLF + + KeyType = + "NEW" / ; The server should generate a key of algorithm KeyBlob + "RSA1024" / ; The server should use the 1024 bit RSA key provided + in as KeyBlob (v2). + "ED25519-V3"; The server should use the ed25519 v3 key provided in as + KeyBlob (v3). + + KeyBlob = + "BEST" / ; The server should generate a key using the "best" + supported algorithm (KeyType == "NEW"). + [As of 0.4.2.3-alpha, ED25519-V3 is used] + "RSA1024" / ; The server should generate a 1024 bit RSA key + (KeyType == "NEW") (v2). + "ED25519-V3"; The server should generate an ed25519 private key + (KeyType == "NEW") (v3). + String ; A serialized private key (without whitespace) + + Flag = + "DiscardPK" / ; The server should not include the newly generated + private key as part of the response. + "Detach" / ; Do not associate the newly created Onion Service + to the current control connection. + "BasicAuth" / ; Client authorization is required using the "basic" + method (v2 only). 
+ "V3Auth" / ; Version 3 client authorization is required (v3 only). + + "NonAnonymous" /; Add a non-anonymous Single Onion Service. Tor + checks that this flag matches its configured hidden + service anonymity mode. + "MaxStreamsCloseCircuit"; Close the circuit if the maximum streams + allowed is reached. + + NumStreams = A value between 0 and 65535 which is used as the maximum + streams that can be attached on a rendezvous circuit. Setting + it to 0 means unlimited, which is also the default behavior. + + VirtPort = The virtual TCP Port for the Onion Service (As in the + HiddenServicePort "VIRTPORT" argument). + + Target = The (optional) target for the given VirtPort (As in the + optional HiddenServicePort "TARGET" argument). + + ClientName = An identifier 1 to 16 characters long, using only + characters in A-Za-z0-9+-_ (no spaces) (v2 only). + + ClientBlob = Authorization data for the client, in an opaque format + specific to the authorization method (v2 only). + + V3Key = The client's base32-encoded x25519 public key, using only the key + part of rend-spec-v3.txt section G.1.2 (v3 only). + + The server reply format is: + + "250-ServiceID=" ServiceID CRLF + ["250-PrivateKey=" KeyType ":" KeyBlob CRLF] + *("250-ClientAuth=" ClientName ":" ClientBlob CRLF) + "250 OK" CRLF + + ServiceID = The Onion Service address without the trailing ".onion" + suffix + + Tells the server to create a new Onion ("Hidden") Service, with the + specified private key and algorithm. If a KeyType of "NEW" is selected, + the server will generate a new keypair using the selected algorithm. + The "Port" argument's VirtPort and Target values have identical + semantics to the corresponding HiddenServicePort configuration values. + + The server response will only include a private key if the server was + requested to generate a new keypair, and also the "DiscardPK" flag was + not specified.
(Note that if "DiscardPK" flag is specified, there is no + way to recreate the generated keypair and the corresponding Onion + Service at a later date). + + If client authorization is enabled using the "BasicAuth" flag (which is v2 + only), the service will not be accessible to clients without valid + authorization data (configured with the "HidServAuth" option). The list of + authorized clients is specified with one or more "ClientAuth" parameters. + If "ClientBlob" is not specified for a client, a new credential will be + randomly generated and returned. + + Tor instances can either be in anonymous hidden service mode, or + non-anonymous single onion service mode. All hidden services on the same + tor instance have the same anonymity. To guard against unexpected loss + of anonymity, Tor checks that the ADD_ONION "NonAnonymous" flag matches + the current hidden service anonymity mode. The hidden service anonymity + mode is configured using the Tor options HiddenServiceSingleHopMode and + HiddenServiceNonAnonymousMode. If both these options are 1, the + "NonAnonymous" flag must be provided to ADD_ONION. If both these options + are 0 (the Tor default), the flag must NOT be provided. + + Once created the new Onion Service will remain active until either the + Onion Service is removed via "DEL_ONION", the server terminates, or the + control connection that originated the "ADD_ONION" command is closed. + It is possible to override disabling the Onion Service on control + connection close by specifying the "Detach" flag. + + It is the Onion Service server application's responsibility to close + existing client connections if desired after the Onion Service is + removed. + + (The KeyBlob format is left intentionally opaque, however for "RSA1024" + keys it is currently the Base64 encoded DER representation of a PKCS#1 + RSAPrivateKey, with all newlines removed. 
For an "ED25519-V3" key it is + the Base64 encoding of the concatenation of the 32-byte ed25519 secret + scalar in little-endian and the 32-byte ed25519 PRF secret.) + + [Note: The ED25519-V3 format is not the same as, e.g., SUPERCOP + ed25519/ref, which stores the 32-byte ed25519 + hash seed concatenated with the 32-byte public key, and which derives + the secret scalar and PRF secret by expanding the hash seed with + SHA-512. Our key blinding scheme is incompatible with storing + private keys as seeds, so we store the secret scalar alongside the + PRF secret, and just pay the cost of recomputing the public key when + importing an ED25519-V3 key.] + + Examples: + + C: ADD_ONION NEW:BEST Flags=DiscardPK Port=80 + S: 250-ServiceID=exampleoniont2pqglbny66wpovyvao3ylc23eileodtevc4b75ikpad + S: 250 OK + + C: ADD_ONION RSA1024:[Blob Redacted] Port=80,192.168.1.1:8080 + S: 250-ServiceID=sampleonion12456 + S: 250 OK + + C: ADD_ONION NEW:BEST Port=22 Port=80,8080 + S: 250-ServiceID=sampleonion4t2pqglbny66wpovyvao3ylc23eileodtevc4b75ikpad + S: 250-PrivateKey=ED25519-V3:[Blob Redacted] + S: 250 OK + + C: ADD_ONION NEW:RSA1024 Flags=DiscardPK,BasicAuth Port=22 + ClientAuth=alice:[Blob Redacted] ClientAuth=bob + S: 250-ServiceID=testonion1234567 + S: 250-ClientAuth=bob:[Blob Redacted] + S: 250 OK + + C: ADD_ONION NEW:ED25519-V3 ClientAuthV3=[Blob Redacted] Port=22 + S: 250-ServiceID=n35etu3yjxrqjpntmfziom5sjwspoydchmelc4xleoy4jk2u4lziz2yd + S: 250-ClientAuthV3=[Blob Redacted] + S: 250 OK + + Examples with Tor in anonymous onion service mode: + + C: ADD_ONION NEW:BEST Flags=DiscardPK Port=22 + S: 250-ServiceID=exampleoniont2pqglbny66wpovyvao3ylc23eileodtevc4b75ikpad + S: 250 OK + + C: ADD_ONION NEW:BEST Flags=DiscardPK,NonAnonymous Port=22 + S: 512 Tor is in anonymous hidden service mode + + Examples with Tor in non-anonymous onion service mode: + + C: ADD_ONION NEW:BEST Flags=DiscardPK Port=22 + S: 512 Tor is in non-anonymous hidden service mode + + C:
ADD_ONION NEW:BEST Flags=DiscardPK,NonAnonymous Port=22 + S: 250-ServiceID=exampleoniont2pqglbny66wpovyvao3ylc23eileodtevc4b75ikpad + S: 250 OK + + [ADD_ONION was added in Tor 0.2.7.1-alpha.] + [MaxStreams and MaxStreamsCloseCircuit were added in Tor 0.2.7.2-alpha] + [ClientAuth was added in Tor 0.2.9.1-alpha. It is v2 only.] + [NonAnonymous was added in Tor 0.2.9.3-alpha.] + [HS v3 support added 0.3.3.1-alpha] + [ClientV3Auth support added 0.4.6.1-alpha] + +3.28. DEL_ONION + + The syntax is: + + "DEL_ONION" SP ServiceID CRLF + + ServiceID = The Onion Service address without the trailing ".onion" + suffix + + Tells the server to remove an Onion ("Hidden") Service, that was + previously created via an "ADD_ONION" command. It is only possible to + remove Onion Services that were created on the same control connection + as the "DEL_ONION" command, and those that belong to no control + connection in particular (The "Detach" flag was specified at creation). + + If the ServiceID is invalid, or is neither owned by the current control + connection nor a detached Onion Service, the server will return a 552. + + It is the Onion Service server application's responsibility to close + existing client connections if desired after the Onion Service has been + removed via "DEL_ONION". + + Tor replies with "250 OK" on success, or a 512 if there are an invalid + number of arguments, or a 552 if it doesn't recognize the ServiceID. + + [DEL_ONION was added in Tor 0.2.7.1-alpha.] + [HS v3 support added 0.3.3.1-alpha] + +3.29. HSPOST + + The syntax is: + + "+HSPOST" *[SP "SERVER=" Server] [SP "HSADDRESS=" HSAddress] + CRLF Descriptor CRLF "." CRLF + + Server = LongName + HSAddress = 56*Base32Character + Descriptor = The text of the descriptor formatted as specified + in rend-spec.txt section 1.3. + + The "HSAddress" key is optional and only applies for v3 descriptors. A 513 + error is returned if used with v2. 
+ + This command launches a hidden service descriptor upload to the specified + HSDirs. If one or more Server arguments are provided, an upload is triggered + on each of them in parallel. If no Server options are provided, it behaves + like a normal HS descriptor upload and will upload to the set of responsible + HS directories. + + If any value is unrecognized, a 552 error is returned and the command is + stopped. If there is an error in parsing the descriptor, the server + must send a "554 Invalid descriptor" reply. + + On success, Tor replies "250 OK" then Tor MUST eventually follow + this with a HS_DESC event with the result for each upload location. + + Examples are: + + C: +HSPOST SERVER=9695DFC35FFEB861329B9F1AB04C46397020CE31 + [DESCRIPTOR] + . + S: 250 OK + + [HSPOST was added in Tor 0.2.7.1-alpha] + +3.30. ONION_CLIENT_AUTH_ADD + + The syntax is: + + "ONION_CLIENT_AUTH_ADD" SP HSAddress + SP KeyType ":" PrivateKeyBlob + [SP "ClientName=" Nickname] + [SP "Flags=" TYPE] CRLF + + HSAddress = 56*Base32Character + KeyType = "x25519" is the only one supported right now + PrivateKeyBlob = base64 encoding of x25519 key + + Tells the connected Tor to add client-side v3 client auth credentials for the + onion service with "HSAddress". The "PrivateKeyBlob" is the x25519 private + key that should be used for this client, and "Nickname" is an optional + nickname for the client. + + FLAGS is a comma-separated tuple of flags for this new client. For now, the + currently supported flags are: + + "Permanent" - This client's credentials should be stored in the filesystem. + If this is not set, the client's credentials are ephemeral + and stored in memory. + + If client auth credentials already existed for this service, replace them + with the new ones. + + If Tor has cached onion service descriptors that it has been unable to + decrypt in the past (due to lack of client auth credentials), attempt to + decrypt those descriptors as soon as this command succeeds. 
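The ONION_CLIENT_AUTH_ADD syntax above can be sketched as a command builder. This is a hedged illustration: build_onion_client_auth_add is a hypothetical helper, the base32 alphabet is assumed to be lowercase a-z and 2-7, an x25519 private key is taken to be 32 raw bytes, and standard padded Base64 is used because the text above does not say whether padding is expected.

```python
import base64
import re

# Assumption: Base32Character is lowercase a-z plus 2-7.
_V3_ADDRESS_RE = re.compile(r"^[a-z2-7]{56}$")
X25519_KEY_LEN = 32  # size of a raw x25519 private key in bytes


def build_onion_client_auth_add(hs_address, private_key, nickname=None,
                                permanent=False):
    """Build an ONION_CLIENT_AUTH_ADD line for a v3 onion service."""
    if not _V3_ADDRESS_RE.match(hs_address):
        raise ValueError("HSAddress must be 56 base32 characters")
    if len(private_key) != X25519_KEY_LEN:
        raise ValueError("x25519 private key must be 32 bytes")
    blob = base64.b64encode(private_key).decode("ascii")
    parts = ["ONION_CLIENT_AUTH_ADD", hs_address, "x25519:" + blob]
    if nickname is not None:
        # Arguments are space-separated, so whitespace cannot appear here.
        if not nickname or any(c.isspace() for c in nickname):
            raise ValueError("nickname must be non-empty without spaces")
        parts.append("ClientName=" + nickname)
    if permanent:
        # "Permanent" asks Tor to store the credentials in the filesystem.
        parts.append("Flags=Permanent")
    return " ".join(parts) + "\r\n"
```

Leaving permanent at its default keeps the credentials ephemeral, which matches the behavior described above when the flag is not set.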
+ + On success, "250 OK" is returned. Otherwise, the following error codes exist: + + 251 - Client auth credentials for this onion service already existed and were replaced. + 252 - Added client auth credentials and successfully decrypted a cached descriptor. + 451 - We reached authorized client capacity + 512 - Syntax error in "HSAddress", or "PrivateKeyBlob" or "Nickname" + 551 - Client with this "Nickname" already exists + 552 - Unrecognized KeyType + + [ONION_CLIENT_AUTH_ADD was added in Tor 0.4.3.1-alpha] + +3.31. ONION_CLIENT_AUTH_REMOVE + + The syntax is: + + "ONION_CLIENT_AUTH_REMOVE" SP HSAddress + + KeyType = "x25519" is the only one supported right now + + Tells the connected Tor to remove the client-side v3 client auth credentials + for the onion service with "HSAddress". + + On success "250 OK" is returned. Otherwise, the following error codes exist: + + 512 - Syntax error in "HSAddress". + 251 - Client credentials for "HSAddress" did not exist. + + [ONION_CLIENT_AUTH_REMOVE was added in Tor 0.4.3.1-alpha] + +3.32. ONION_CLIENT_AUTH_VIEW + + The syntax is: + + "ONION_CLIENT_AUTH_VIEW" [SP HSAddress] CRLF + + Tells the connected Tor to list all the stored client-side v3 client auth + credentials for "HSAddress". If no "HSAddress" is provided, list all the + stored client-side v3 client auth credentials. + + The server reply format is: + + "250-ONION_CLIENT_AUTH_VIEW" [SP HSAddress] CRLF + *("250-CLIENT" SP HSAddress SP KeyType ":" PrivateKeyBlob + [SP "ClientName=" Nickname] + [SP "Flags=" FLAGS] CRLF) + "250 OK" CRLF + + HSAddress = The onion address under which this credential is stored + KeyType = "x25519" is the only one supported right now + PrivateKeyBlob = base64 encoding of x25519 key + + "Nickname" is an optional nickname for this client, which can be set either + through the ONION_CLIENT_AUTH_ADD command, or it's the filename of this + client if the credentials are stored in the filesystem.
+ + FLAGS is a comma-separated field of flags for this client, the currently + supported flags are: + + "Permanent" - This client's credentials are stored in the filesystem. + + On success "250 OK" is returned. Otherwise, the following error codes exist: + + 512 - Syntax error in "HSAddress". + + [ONION_CLIENT_AUTH_VIEW was added in Tor 0.4.3.1-alpha] + +3.33. DROPOWNERSHIP + + The syntax is: + + "DROPOWNERSHIP" CRLF + + This command instructs Tor to relinquish ownership of its control + connection. As such tor will not shut down when this control + connection is closed. + + This method is idempotent. If the control connection does not + already have ownership this method returns successfully, and + does nothing. + + The controller can call TAKEOWNERSHIP again to re-establish + ownership. + + [DROPOWNERSHIP was added in Tor 0.4.0.0-alpha] + +3.34. DROPTIMEOUTS + + The syntax is: + "DROPTIMEOUTS" CRLF + + Tells the server to drop all circuit build times. Do not invoke this command + lightly; it can increase vulnerability to tracking attacks over time. + + Tor replies with "250 OK" on success. Tor also emits the BUILDTIMEOUT_SET + RESET event right after this "250 OK". + + [DROPTIMEOUTS was added in Tor 0.4.5.0-alpha.] + +4. Replies + + Reply codes follow the same 3-character format as used by SMTP, with the + first character defining a status, the second character defining a + subsystem, and the third designating fine-grained information. + + The TC protocol currently uses the following first characters: + + 2yz Positive Completion Reply + The command was successful; a new request can be started. + + 4yz Temporary Negative Completion reply + The command was unsuccessful but might be reattempted later. + + 5yz Permanent Negative Completion Reply + The command was unsuccessful; the client should not try exactly + that sequence of commands again. + + 6yz Asynchronous Reply + Sent out-of-order in response to an earlier SETEVENTS command. 
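The first-character classes listed above can be sketched as a small reply-line classifier. This is an illustration, not part of the protocol: classify_reply_line is a hypothetical helper, and it relies on the separator after the code being "-" or "+" for lines that continue a reply and a space for the final line, as in the examples elsewhere in this document.

```python
# Meaning of the first digit of a reply code, as listed above.
FIRST_DIGIT_CLASS = {
    "2": "positive completion",
    "4": "temporary negative completion",
    "5": "permanent negative completion",
    "6": "asynchronous",
}


def classify_reply_line(line: str):
    """Return (code, reply_class, is_final_line) for one reply line."""
    line = line.rstrip("\r\n")
    if len(line) < 4 or not line[:3].isdigit() or line[3] not in " -+":
        raise ValueError("not a well-formed reply line")
    code = line[:3]
    reply_class = FIRST_DIGIT_CLASS.get(code[0], "unknown")
    # A space after the code marks the last line of a reply.
    return code, reply_class, line[3] == " "
```

A controller's read loop can use the asynchronous class to route 6yz lines to event handlers while it waits for the final line of a command's reply.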
+ + The following second characters are used: + + x0z Syntax + Sent in response to ill-formed or nonsensical commands. + + x1z Protocol + Refers to operations of the Tor Control protocol. + + x5z Tor + Refers to actual operations of Tor system. + + The following codes are defined: + + 250 OK + 251 Operation was unnecessary + [Tor has declined to perform the operation, but no harm was done.] + + 451 Resource exhausted + + 500 Syntax error: protocol + + 510 Unrecognized command + 511 Unimplemented command + 512 Syntax error in command argument + 513 Unrecognized command argument + 514 Authentication required + 515 Bad authentication + + 550 Unspecified Tor error + + 551 Internal error + [Something went wrong inside Tor, so that the client's + request couldn't be fulfilled.] + + 552 Unrecognized entity + [A configuration key, a stream ID, circuit ID, event, + mentioned in the command did not actually exist.] + + 553 Invalid configuration value + [The client tried to set a configuration option to an + incorrect, ill-formed, or impossible value.] + + 554 Invalid descriptor + + 555 Unmanaged entity + + 650 Asynchronous event notification + + Unless specified to have specific contents, the human-readable messages + in error replies should not be relied upon to match those in this document. + +4.1. Asynchronous events + + These replies can be sent after a corresponding SETEVENTS command has been + received. They will not be interleaved with other Reply elements, but they + can appear between a command and its corresponding reply. 
For example, + this sequence is possible: + + C: SETEVENTS CIRC + S: 250 OK + C: GETCONF SOCKSPORT ORPORT + S: 650 CIRC 1000 EXTENDED moria1,moria2 + S: 250-SOCKSPORT=9050 + S: 250 ORPORT=0 + + But this sequence is disallowed: + + C: SETEVENTS CIRC + S: 250 OK + C: GETCONF SOCKSPORT ORPORT + S: 250-SOCKSPORT=9050 + S: 650 CIRC 1000 EXTENDED moria1,moria2 + S: 250 ORPORT=0 + + Clients MUST tolerate more arguments in an asynchronous reply than + expected, and MUST tolerate more lines in an asynchronous reply than + expected. For instance, a client that expects a CIRC message like: + + 650 CIRC 1000 EXTENDED moria1,moria2 + + must tolerate: + + 650-CIRC 1000 EXTENDED moria1,moria2 0xBEEF + 650-EXTRAMAGIC=99 + 650 ANONYMITY=high + + If a client receives extended events (selected by USEFEATURE + EXTENDED_EVENTS in Tor 0.1.2.2-alpha..Tor-0.2.1.x, and always-on in + Tor 0.2.2.x and later), then each event line as specified below may be + followed by additional arguments and additional lines. Additional + lines will be of the form: + + "650" ("-"/" ") KEYWORD ["=" ARGUMENTS] CRLF + + Additional arguments will be of the form: + + SP KEYWORD ["=" ( QuotedString / * NonSpDquote ) ] + + Clients MUST tolerate events with arguments and keywords they do not + recognize, and SHOULD process those events as if any unrecognized + arguments and keywords were not present. + + Clients SHOULD NOT depend on the order of keyword=value arguments, + and SHOULD NOT depend on there being no new keyword=value arguments + appearing between existing keyword=value arguments, though as of this + writing (Jun 2011) some do. Thus, extensions to this protocol should + add new keywords only after the existing keywords, until all + controllers have been fixed. At some point this "SHOULD NOT" might + become a "MUST NOT". + +4.1.1.
Circuit status changed + + The syntax is: + + "650" SP "CIRC" SP CircuitID SP CircStatus [SP Path] + [SP "BUILD_FLAGS=" BuildFlags] [SP "PURPOSE=" Purpose] + [SP "HS_STATE=" HSState] [SP "REND_QUERY=" HSAddress] + [SP "TIME_CREATED=" TimeCreated] + [SP "REASON=" Reason [SP "REMOTE_REASON=" Reason]] + [SP "SOCKS_USERNAME=" EscapedUsername] + [SP "SOCKS_PASSWORD=" EscapedPassword] + [SP "HS_POW=" HSPoW ] + CRLF + + CircStatus = + "LAUNCHED" / ; circuit ID assigned to new circuit + "BUILT" / ; all hops finished, can now accept streams + "GUARD_WAIT" / ; all hops finished, waiting to see if a + ; circuit with a better guard will be usable. + "EXTENDED" / ; one more hop has been completed + "FAILED" / ; circuit closed (was not built) + "CLOSED" ; circuit closed (was built) + + Path = LongName *("," LongName) + ; In Tor versions 0.1.2.2-alpha through 0.2.2.1-alpha with feature + ; VERBOSE_NAMES turned off and before version 0.1.2.2-alpha, Path + ; is as follows: + ; Path = ServerID *("," ServerID) + + BuildFlags = BuildFlag *("," BuildFlag) + BuildFlag = "ONEHOP_TUNNEL" / "IS_INTERNAL" / + "NEED_CAPACITY" / "NEED_UPTIME" + + Purpose = "GENERAL" / "HS_CLIENT_INTRO" / "HS_CLIENT_REND" / + "HS_SERVICE_INTRO" / "HS_SERVICE_REND" / "TESTING" / + "CONTROLLER" / "MEASURE_TIMEOUT" / + "HS_VANGUARDS" / "PATH_BIAS_TESTING" / + "CIRCUIT_PADDING" + + HSState = "HSCI_CONNECTING" / "HSCI_INTRO_SENT" / "HSCI_DONE" / + "HSCR_CONNECTING" / "HSCR_ESTABLISHED_IDLE" / + "HSCR_ESTABLISHED_WAITING" / "HSCR_JOINED" / + "HSSI_CONNECTING" / "HSSI_ESTABLISHED" / + "HSSR_CONNECTING" / "HSSR_JOINED" + + HSPoWType = "v1" + HSPoWEffort = 1*DIGIT + HSPoW = HSPoWType "," HSPoWEffort + + EscapedUsername = QuotedString + EscapedPassword = QuotedString + + HSAddress = 16*Base32Character / 56*Base32Character + Base32Character = ALPHA / "2" / "3" / "4" / "5" / "6" / "7" + + TimeCreated = ISOTime2Frac + Seconds = 1*DIGIT + Microseconds = 1*DIGIT + + Reason = "NONE" / "TORPROTOCOL" / "INTERNAL" / "REQUESTED" 
/ + "HIBERNATING" / "RESOURCELIMIT" / "CONNECTFAILED" / + "OR_IDENTITY" / "OR_CONN_CLOSED" / "TIMEOUT" / + "FINISHED" / "DESTROYED" / "NOPATH" / "NOSUCHSERVICE" / + "MEASUREMENT_EXPIRED" + + The path is provided only when the circuit has been extended at least one + hop. + + The "BUILD_FLAGS" field is provided only in versions 0.2.3.11-alpha + and later. Clients MUST accept build flags not listed above. + Build flags are defined as follows: + + ONEHOP_TUNNEL (one-hop circuit, used for tunneled directory conns) + IS_INTERNAL (internal circuit, not to be used for exiting streams) + NEED_CAPACITY (this circuit must use only high-capacity nodes) + NEED_UPTIME (this circuit must use only high-uptime nodes) + + The "PURPOSE" field is provided only in versions 0.2.1.6-alpha and + later, and only if extended events are enabled (see 3.19). Clients + MUST accept purposes not listed above. Purposes are defined as + follows: + + GENERAL (circuit for AP and/or directory request streams) + HS_CLIENT_INTRO (HS client-side introduction-point circuit) + HS_CLIENT_REND (HS client-side rendezvous circuit; carries AP streams) + HS_SERVICE_INTRO (HS service-side introduction-point circuit) + HS_SERVICE_REND (HS service-side rendezvous circuit) + TESTING (reachability-testing circuit; carries no traffic) + CONTROLLER (circuit built by a controller) + MEASURE_TIMEOUT (circuit being kept around to see how long it takes) + HS_VANGUARDS (circuit created ahead of time when using + HS vanguards, and later repurposed as needed) + PATH_BIAS_TESTING (circuit used to probe whether our circuits are + being deliberately closed by an attacker) + CIRCUIT_PADDING (circuit that is being held open to disguise its + true close time) + + The "HS_STATE" field is provided only for hidden-service circuits, + and only in versions 0.2.3.11-alpha and later. Clients MUST accept + hidden-service circuit states not listed above. 
Hidden-service + circuit states are defined as follows: + + HSCI_* (client-side introduction-point circuit states) + HSCI_CONNECTING (connecting to intro point) + HSCI_INTRO_SENT (sent INTRODUCE1; waiting for reply from IP) + HSCI_DONE (received reply from IP relay; closing) + + HSCR_* (client-side rendezvous-point circuit states) + HSCR_CONNECTING (connecting to or waiting for reply from RP) + HSCR_ESTABLISHED_IDLE (established RP; waiting for introduction) + HSCR_ESTABLISHED_WAITING (introduction sent to HS; waiting for rend) + HSCR_JOINED (connected to HS) + + HSSI_* (service-side introduction-point circuit states) + HSSI_CONNECTING (connecting to intro point) + HSSI_ESTABLISHED (established intro point) + + HSSR_* (service-side rendezvous-point circuit states) + HSSR_CONNECTING (connecting to client's rend point) + HSSR_JOINED (connected to client's RP circuit) + + The "SOCKS_USERNAME" and "SOCKS_PASSWORD" fields indicate the credentials + that were used by a SOCKS client to connect to Tor's SOCKS port and + initiate this circuit. (Streams for SOCKS clients connected with different + usernames and/or passwords are isolated on separate circuits if the + IsolateSOCKSAuth flag is active; see Proposal 171.) [Added in Tor + 0.4.3.1-alpha.] + + The "REND_QUERY" field is provided only for hidden-service-related + circuits, and only in versions 0.2.3.11-alpha and later. Clients + MUST accept hidden service addresses in formats other than that + specified above. [Added in Tor 0.4.3.1-alpha.] + + The "TIME_CREATED" field is provided only in versions 0.2.3.11-alpha and + later. TIME_CREATED is the time at which the circuit was created or + cannibalized. [Added in Tor 0.4.3.1-alpha.] + + The "REASON" field is provided only for FAILED and CLOSED events, and only + if extended events are enabled (see 3.19). Clients MUST accept reasons + not listed above. [Added in Tor 0.4.3.1-alpha.] 
Reasons are as given in + tor-spec.txt, except for: + + NOPATH (Not enough nodes to make circuit) + MEASUREMENT_EXPIRED (As "TIMEOUT", except that we had left the circuit + open for measurement purposes to see how long it + would take to finish.) + IP_NOW_REDUNDANT (Closing a circuit to an introduction point that + has become redundant, since some other circuit + opened in parallel with it has succeeded.) + + The "REMOTE_REASON" field is provided only when we receive a DESTROY or + TRUNCATE cell, and only if extended events are enabled. It contains the + actual reason given by the remote OR for closing the circuit. Clients MUST + accept reasons not listed above. Reasons are as listed in tor-spec.txt. + [Added in Tor 0.4.3.1-alpha.] + +4.1.2. Stream status changed + + The syntax is: + + "650" SP "STREAM" SP StreamID SP StreamStatus SP CircuitID SP Target + [SP "REASON=" Reason [ SP "REMOTE_REASON=" Reason ]] + [SP "SOURCE=" Source] [ SP "SOURCE_ADDR=" Address ":" Port ] + [SP "PURPOSE=" Purpose] [SP "SOCKS_USERNAME=" EscapedUsername] + [SP "SOCKS_PASSWORD=" EscapedPassword] + [SP "CLIENT_PROTOCOL=" ClientProtocol] [SP "NYM_EPOCH=" NymEpoch] + [SP "SESSION_GROUP=" SessionGroup] [SP "ISO_FIELDS=" IsoFields] + CRLF + + StreamStatus = + "NEW" / ; New request to connect + "NEWRESOLVE" / ; New request to resolve an address + "REMAP" / ; Address re-mapped to another + "SENTCONNECT" / ; Sent a connect cell along a circuit + "SENTRESOLVE" / ; Sent a resolve cell along a circuit + "SUCCEEDED" / ; Received a reply; stream established + "FAILED" / ; Stream failed and not retriable + "CLOSED" / ; Stream closed + "DETACHED" / ; Detached from circuit; still retriable + "CONTROLLER_WAIT" ; Waiting for controller to use ATTACHSTREAM + ; (new in 0.4.5.1-alpha) + "XOFF_SENT" ; XOFF has been sent for this stream + ; (new in 0.4.7.5-alpha) + "XOFF_RECV" ; XOFF has been received for this stream + ; (new in 0.4.7.5-alpha) + "XON_SENT" ; XON has been sent for this stream + ; (new in 
0.4.7.5-alpha) + "XON_RECV" ; XON has been received for this stream + ; (new in 0.4.7.5-alpha) + + Target = TargetAddress ":" Port + Port = an integer from 0 to 65535 inclusive + TargetAddress = Address / "(Tor_internal)" + + EscapedUsername = QuotedString + EscapedPassword = QuotedString + + ClientProtocol = + "SOCKS4" / + "SOCKS5" / + "TRANS" / + "NATD" / + "DNS" / + "HTTPCONNECT" / + "UNKNOWN" + + NymEpoch = a nonnegative integer + SessionGroup = an integer + + IsoFields = a comma-separated list of IsoField values + + IsoField = + "CLIENTADDR" / + "CLIENTPORT" / + "DESTADDR" / + "DESTPORT" / + the name of a field that is valid for STREAM events + + The circuit ID designates which circuit this stream is attached to. If + the stream is unattached, the circuit ID "0" is given. The target + indicates the address which the stream is meant to resolve or connect to; + it can be "(Tor_internal)" for a virtual stream created by the Tor program + to talk to itself. + + Reason = "MISC" / "RESOLVEFAILED" / "CONNECTREFUSED" / + "EXITPOLICY" / "DESTROY" / "DONE" / "TIMEOUT" / + "NOROUTE" / "HIBERNATING" / "INTERNAL"/ "RESOURCELIMIT" / + "CONNRESET" / "TORPROTOCOL" / "NOTDIRECTORY" / "END" / + "PRIVATE_ADDR" + + The "REASON" field is provided only for FAILED, CLOSED, and DETACHED + events, and only if extended events are enabled (see 3.19). Clients MUST + accept reasons not listed above. Reasons are as given in tor-spec.txt, + except for: + + END (We received a RELAY_END cell from the other side of this + stream.) + PRIVATE_ADDR (The client tried to connect to a private address like + 127.0.0.1 or 10.0.0.1 over Tor.) + [XXXX document more. -NM] + + The "REMOTE_REASON" field is provided only when we receive a RELAY_END + cell, and only if extended events are enabled. It contains the actual + reason given by the remote OR for closing the stream. Clients MUST accept + reasons not listed above. Reasons are as listed in tor-spec.txt. 
+ + "REMAP" events include a Source if extended events are enabled: + + Source = "CACHE" / "EXIT" + + Clients MUST accept sources not listed above. "CACHE" is given if + the Tor client decided to remap the address because of a cached + answer, and "EXIT" is given if the remote node we queried gave us + the new address as a response. + + The "SOURCE_ADDR" field is included with NEW and NEWRESOLVE events if + extended events are enabled. It indicates the address and port + that requested the connection, and can be (e.g.) used to look up the + requesting program. + + Purpose = "DIR_FETCH" / "DIR_UPLOAD" / "DNS_REQUEST" / + "USER" / "DIRPORT_TEST" + + The "PURPOSE" field is provided only for NEW and NEWRESOLVE events, and + only if extended events are enabled (see 3.19). Clients MUST accept + purposes not listed above. The purposes above are defined as: + + "DIR_FETCH" -- This stream is generated internally to Tor for + fetching directory information. + "DIR_UPLOAD" -- An internal stream for uploading information to + a directory authority. + "DIRPORT_TEST" -- A stream we're using to test our own directory + port to make sure it's reachable. + "DNS_REQUEST" -- A user-initiated DNS request. + "USER" -- This stream is handling user traffic, OR it's internal + to Tor, but it doesn't match one of the purposes above. + + The "SOCKS_USERNAME" and "SOCKS_PASSWORD" fields indicate the credentials + that were used by a SOCKS client to connect to Tor's SOCKS port and + initiate this stream. (Streams for SOCKS clients connected with different + usernames and/or passwords are isolated on separate circuits if the + IsolateSOCKSAuth flag is active; see Proposal 171.) + + The "CLIENT_PROTOCOL" field indicates the protocol that was used by a client + to initiate this stream. (Streams for clients connected with different + protocols are isolated on separate circuits if the IsolateClientProtocol + flag is active.) Controllers MUST tolerate unrecognized client protocols. 
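The optional STREAM fields described above are keyword=value arguments that may appear in any order and may be joined by fields a controller does not recognize. The sketch below shows one way a controller could read a single-line STREAM event without depending on argument order. The function name parse_stream_event, the dictionary layout, and the sample line (including the FUTURE_FIELD keyword) are made up for illustration; quoted values are kept in raw form, with their quotes, and are not unescaped.

```python
import re

# A token is either KEYWORD="quoted string" (which may contain spaces and
# backslash escapes) or any run of non-space characters.
_TOKEN = re.compile(r'[^\s"]+="(?:[^"\\]|\\.)*"|\S+')


def parse_stream_event(line: str) -> dict:
    """Split a single-line "650 STREAM ..." event into its fields."""
    tokens = _TOKEN.findall(line.rstrip("\r\n"))
    if len(tokens) < 6 or tokens[0] != "650" or tokens[1] != "STREAM":
        raise ValueError("not a single-line STREAM event")
    event = {
        "stream_id": tokens[2],
        "status": tokens[3],
        "circuit_id": tokens[4],
        "target": tokens[5],
        "args": {},
    }
    # Keyword arguments are collected by name, so their order does not
    # matter and unrecognized keywords are simply carried along.
    for token in tokens[6:]:
        key, sep, value = token.partition("=")
        if sep:
            event["args"][key] = value
    return event


# Made-up sample line; FUTURE_FIELD stands in for an unknown keyword.
sample = ('650 STREAM 42 NEW 0 example.com:80 SOURCE_ADDR=127.0.0.1:5555 '
          'PURPOSE=USER SOCKS_USERNAME="alice bob" FUTURE_FIELD=1')
event = parse_stream_event(sample)
```

Collecting arguments into a dictionary keyed by keyword is what lets the caller ignore both ordering and unknown fields; a controller would then look up only the keys it understands.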
+ + The "NYM_EPOCH" field indicates the nym epoch that was active when a client + initiated this stream. The epoch increments when the NEWNYM signal is + received. (Streams with different nym epochs are isolated on separate + circuits.) + + The "SESSION_GROUP" field indicates the session group of the listener port + that a client used to initiate this stream. By default, the session group is + different for each listener port, but this can be overridden for a listener + via the "SessionGroup" option in torrc. (Streams with different session + groups are isolated on separate circuits.) + + The "ISO_FIELDS" field indicates the set of STREAM event fields for which + stream isolation is enabled for the listener port that a client used to + initiate this stream. The special values "CLIENTADDR", "CLIENTPORT", + "DESTADDR", and "DESTPORT", if their correspondingly named fields are not + present, refer to the Address and Port components of the "SOURCE_ADDR" and + Target fields. + +4.1.3. OR Connection status changed + + The syntax is: + + "650" SP "ORCONN" SP (LongName / Target) SP ORStatus [ SP "REASON=" + Reason ] [ SP "NCIRCS=" NumCircuits ] [ SP "ID=" ConnID ] CRLF + + ORStatus = "NEW" / "LAUNCHED" / "CONNECTED" / "FAILED" / "CLOSED" + + ; In Tor versions 0.1.2.2-alpha through 0.2.2.1-alpha with feature + ; VERBOSE_NAMES turned off and before version 0.1.2.2-alpha, OR + ; Connection is as follows: + "650" SP "ORCONN" SP (ServerID / Target) SP ORStatus [ SP "REASON=" + Reason ] [ SP "NCIRCS=" NumCircuits ] CRLF + + NEW is for incoming connections, and LAUNCHED is for outgoing + connections. CONNECTED means the TLS handshake has finished (in + either direction). FAILED means a connection is being closed that + hasn't finished its handshake, and CLOSED is for connections that + have handshaked. + + A LongName or ServerID is specified unless it's a NEW connection, in + which case we don't know what server it is yet, so we use Address:Port. 
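Since the first ORCONN argument is either a relay name or an Address:Port pair, a controller has to tell the two apart. The sketch below uses a simple heuristic: a token that ends in ":" followed by digits is treated as Address:Port, and anything else as a relay name. This relies on the assumption that relay names (LongName or ServerID) never contain a colon; the function name classify_orconn_endpoint and the returned tuples are hypothetical.

```python
def classify_orconn_endpoint(token: str) -> tuple:
    """Return ("target", address, port) or ("relay", token).

    Heuristic only: assumes relay names never contain ":".
    """
    address, sep, port = token.rpartition(":")
    if sep and address and port.isascii() and port.isdigit():
        return ("target", address, int(port))
    return ("relay", token)
```

Splitting on the last colon keeps the sketch usable for addresses that themselves contain colons, since only the final component is interpreted as the port.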
+ + If extended events are enabled (see 3.19), optional reason and + circuit counting information is provided for CLOSED and FAILED + events. + + Reason = "MISC" / "DONE" / "CONNECTREFUSED" / + "IDENTITY" / "CONNECTRESET" / "TIMEOUT" / "NOROUTE" / + "IOERROR" / "RESOURCELIMIT" / "PT_MISSING" + + NumCircuits counts both established and pending circuits. + + The ORStatus values are as follows: + + NEW -- We have received a new incoming OR connection, and are starting + the server-side handshake. + LAUNCHED -- We have launched a new outgoing OR connection, and are + starting the client-side handshake. + CONNECTED -- The OR connection has been connected and the handshake is + done. + FAILED -- Our attempt to open the OR connection failed. + CLOSED -- The OR connection closed in an unremarkable way. + + The Reason values for closed/failed OR connections are: + + DONE -- The OR connection has shut down cleanly. + CONNECTREFUSED -- We got an ECONNREFUSED while connecting to the target + OR. + IDENTITY -- We connected to the OR, but found that its identity was + not what we expected. + CONNECTRESET -- We got an ECONNRESET or similar IO error from the + connection with the OR. + TIMEOUT -- We got an ETIMEOUT or similar IO error from the connection + with the OR, or we're closing the connection for being idle for too + long. + NOROUTE -- We got an ENOTCONN, ENETUNREACH, ENETDOWN, EHOSTUNREACH, or + similar error while connecting to the OR. + IOERROR -- We got some other IO error on our connection to the OR. + RESOURCELIMIT -- We don't have enough operating system resources (file + descriptors, buffers, etc) to connect to the OR. + PT_MISSING -- No pluggable transport was available. + MISC -- The OR connection closed for some other reason. + + [First added ID parameter in 0.2.5.2-alpha] + +4.1.4. 
Bandwidth used in the last second + + The syntax is: + + "650" SP "BW" SP BytesRead SP BytesWritten *(SP Type "=" Num) CRLF + BytesRead = 1*DIGIT + BytesWritten = 1*DIGIT + Type = "DIR" / "OR" / "EXIT" / "APP" / ... + Num = 1*DIGIT + + BytesRead and BytesWritten are the totals. [In a future Tor version, + we may also include a breakdown of the connection types that used + bandwidth this second (not implemented yet).] + +4.1.5. Log messages + + The syntax is: + + "650" SP Severity SP ReplyText CRLF + + or + + "650+" Severity CRLF Data 650 SP "OK" CRLF + + Severity = "DEBUG" / "INFO" / "NOTICE" / "WARN"/ "ERR" + + Some low-level logs may be sent from signal handlers, so their destination + logs must be signal-safe. These low-level logs include backtraces, + logging function errors, and errors in code called by logging functions. + Signal-safe logs are never sent as control port log events. + + Control port message trace debug logs are never sent as control port log + events, to avoid modifying control output when debugging. + +4.1.6. New descriptors available + + This event is generated when new router descriptors (not microdescs or + extrainfos or anything else) are received. + + Syntax: + + "650" SP "NEWDESC" 1*(SP LongName) CRLF + ; In Tor versions 0.1.2.2-alpha through 0.2.2.1-alpha with feature + ; VERBOSE_NAMES turned off and before version 0.1.2.2-alpha, it + ; is as follows: + "650" SP "NEWDESC" 1*(SP ServerID) CRLF + +4.1.7. New Address mapping + + These events are generated when a new address mapping is entered in + Tor's address map cache, or when the answer for a RESOLVE command is + found. Entries can be created by a successful or failed DNS lookup, + a successful or failed connection attempt, a RESOLVE command, + a MAPADDRESS command, the AutomapHostsOnResolve feature, or the + TrackHostExits feature. 
+ + Syntax: + + "650" SP "ADDRMAP" SP Address SP NewAddress SP Expiry + [SP "error=" ErrorCode] [SP "EXPIRES=" UTCExpiry] [SP "CACHED=" Cached] + [SP "STREAMID=" StreamId] CRLF + + NewAddress = Address / "" + Expiry = DQUOTE ISOTime DQUOTE / "NEVER" + + ErrorCode = "yes" / "internal" / "Unable to launch resolve request" + UTCExpiry = DQUOTE IsoTime DQUOTE + + Cached = DQUOTE "YES" DQUOTE / DQUOTE "NO" DQUOTE + StreamId = DQUOTE StreamId DQUOTE + + Error and UTCExpiry are only provided if extended events are enabled. + The values for Error are mostly useless. Future values will be + chosen to match 1*(ALNUM / "_"); the "Unable to launch resolve request" + value is a bug in Tor before 0.2.4.7-alpha. + + Expiry is expressed as the local time (rather than UTC). This is a bug, + left in for backward compatibility; new code should look at UTCExpiry + instead. (If Expiry is "NEVER", UTCExpiry is omitted.) + + Cached indicates whether the mapping will be stored until it expires, or if + it is just a notification in response to a RESOLVE command. + + StreamId is the global stream identifier of the stream or circuit from which + the address was resolved. + +4.1.8. Descriptors uploaded to us in our role as authoritative dirserver + + [NOTE: This feature was removed in Tor 0.3.2.1-alpha.] + + Tor generates this event when it's a directory authority, and + somebody has just uploaded a server descriptor. + + Syntax: + + "650" "+" "AUTHDIR_NEWDESCS" CRLF Action CRLF Message CRLF + Descriptor CRLF "." CRLF "650" SP "OK" CRLF + Action = "ACCEPTED" / "DROPPED" / "REJECTED" + Message = Text + + The Descriptor field is the text of the server descriptor; the Action + field is "ACCEPTED" if we're accepting the descriptor as the new + best valid descriptor for its router, "REJECTED" if we aren't taking + the descriptor and we're complaining to the uploading relay about + it, and "DROPPED" if we decide to drop the descriptor without + complaining. 
The Message field is a human-readable string + explaining why we chose the Action. (It doesn't contain newlines.) + +4.1.9. Our descriptor changed + + Syntax: + + "650" SP "DESCCHANGED" CRLF + + [First added in 0.1.2.2-alpha.] + +4.1.10. Status events + + Status events (STATUS_GENERAL, STATUS_CLIENT, and STATUS_SERVER) are sent + based on occurrences in the Tor process pertaining to the general state of + the program. Generally, they correspond to log messages of severity Notice + or higher. They differ from log messages in that their format is a + specified interface. + + Syntax: + + "650" SP StatusType SP StatusSeverity SP StatusAction + [SP StatusArguments] CRLF + + StatusType = "STATUS_GENERAL" / "STATUS_CLIENT" / "STATUS_SERVER" + StatusSeverity = "NOTICE" / "WARN" / "ERR" + StatusAction = 1*ALPHA + StatusArguments = StatusArgument *(SP StatusArgument) + StatusArgument = StatusKeyword '=' StatusValue + StatusKeyword = 1*(ALNUM / "_") + StatusValue = 1*(ALNUM / '_') / QuotedString + + StatusAction is a string, and StatusArguments is a series of + keyword=value pairs on the same line. Values may be space-terminated + strings, or quoted strings. + + These events are always produced with EXTENDED_EVENTS and + VERBOSE_NAMES; see the explanations in the USEFEATURE section + for details. + + Controllers MUST tolerate unrecognized actions, MUST tolerate + unrecognized arguments, MUST tolerate missing arguments, and MUST + tolerate arguments that arrive in any order. + + Each event description below is accompanied by a recommendation for + controllers. These recommendations are suggestions only; no controller + is required to implement them. + + Compatibility note: versions of Tor before 0.2.0.22-rc incorrectly + generated "STATUS_SERVER" as "STATUS_SEVER". To be compatible with those + versions, tools should accept both. 
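The tolerance rules above (unknown actions, unknown or missing arguments, arbitrary argument order, and the old "STATUS_SEVER" spelling) can be sketched as follows. The function name parse_status_event, the dictionary layout, and the sample line with TAG=example_tag are made up for illustration. The value pattern is deliberately looser than the StatusValue grammar, on the assumption that real values may contain characters such as "." or ":"; quoted values are returned raw, with their quotes.

```python
import re

# KEYWORD=value, where value is a quoted string or a run of non-space
# characters (looser than the grammar above; an assumption of this sketch).
_STATUS_ARG = re.compile(r'([A-Za-z0-9_]+)=("(?:[^"\\]|\\.)*"|\S*)')

_STATUS_TYPES = ("STATUS_GENERAL", "STATUS_CLIENT", "STATUS_SERVER")


def parse_status_event(line: str) -> dict:
    """Parse a single-line status event into type, severity, action, args."""
    parts = line.rstrip("\r\n").split(" ", 4)
    if len(parts) < 4 or parts[0] != "650":
        raise ValueError("not a single-line status event")
    status_type = parts[1]
    if status_type == "STATUS_SEVER":
        # Misspelling generated by Tor versions before 0.2.0.22-rc.
        status_type = "STATUS_SERVER"
    if status_type not in _STATUS_TYPES:
        raise ValueError(f"not a status event: {parts[1]!r}")
    # Arguments are optional; collect them by keyword so that order and
    # unrecognized keywords do not matter.
    args = dict(_STATUS_ARG.findall(parts[4])) if len(parts) > 4 else {}
    return {
        "type": status_type,
        "severity": parts[2],
        "action": parts[3],
        "args": args,
    }


# Made-up sample line for illustration.
sample = ('650 STATUS_CLIENT NOTICE BOOTSTRAP PROGRESS=80 TAG=example_tag '
          'SUMMARY="Connecting to the Tor network"')
status = parse_status_event(sample)
```

Callers would then use something like status["args"].get("PROGRESS") rather than indexing, so that a missing argument yields None instead of an error.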
+ + Actions for STATUS_GENERAL events can be as follows: + + CLOCK_JUMPED + "TIME=NUM" + Tor spent enough time without CPU cycles that it has closed all + its circuits and will establish them anew. This typically + happens when a laptop goes to sleep and then wakes up again. It + also happens when the system is swapping so heavily that Tor is + starving. The "time" argument specifies the number of seconds Tor + thinks it was unconscious for (or alternatively, the number of + seconds it went back in time). + + This status event is sent as NOTICE severity normally, but WARN + severity if Tor is acting as a server currently. + + {Recommendation for controller: ignore it, since we don't really + know what the user should do anyway. Hm.} + + DANGEROUS_VERSION + "CURRENT=version" + "REASON=NEW/OBSOLETE/UNRECOMMENDED" + "RECOMMENDED=\"version, version, ...\"" + Tor has found that directory servers don't recommend its version of + the Tor software. RECOMMENDED is a comma-and-space-separated string + of Tor versions that are recommended. REASON is NEW if this version + of Tor is newer than any recommended version, OBSOLETE if + this version of Tor is older than any recommended version, and + UNRECOMMENDED if some recommended versions of Tor are newer and + some are older than this version. (The "OBSOLETE" reason was called + "OLD" from Tor 0.1.2.3-alpha up to and including 0.2.0.12-alpha.) + + {Controllers may want to suggest that the user upgrade OBSOLETE or + UNRECOMMENDED versions. NEW versions may be known-insecure, or may + simply be development versions.} + + TOO_MANY_CONNECTIONS + "CURRENT=NUM" + Tor has reached its ulimit -n or whatever the native limit is on file + descriptors or sockets. CURRENT is the number of sockets Tor + currently has open. The user should really do something about + this. + + {Controllers may recommend that the user increase the limit, or + increase it for them.
Recommendations should be phrased in an + OS-appropriate way and automated when possible.} + + BUG + "REASON=STRING" + Tor has encountered a situation that its developers never expected, + and the developers would like to learn that it happened. Perhaps + the controller can explain this to the user and encourage her to + file a bug report? + + {Controllers should log bugs, but shouldn't annoy the user in case a + bug appears frequently.} + + CLOCK_SKEW + SKEW="+" / "-" SECONDS + MIN_SKEW="+" / "-" SECONDS + SOURCE="DIRSERV:" IP ":" Port / + "NETWORKSTATUS:" IP ":" Port / + "OR:" IP ":" Port / + "CONSENSUS" + If "SKEW" is present, it's an estimate of how far we are from the + time declared in the source. (In other words, if we're an hour in + the past, the value is -3600.) If "MIN_SKEW" is present, it's a lower + bound. If the source is a DIRSERV, we got the current time from a + connection to a dirserver. If the source is a NETWORKSTATUS, we + decided we're skewed because we got a v2 networkstatus from far in + the future. If the source is OR, the skew comes from a NETINFO + cell from a connection to another relay. If the source is + CONSENSUS, we decided we're skewed because we got a networkstatus + consensus from the future. + + {Tor should send this message to controllers when it thinks the + skew is so high that it will interfere with proper Tor operation. + Controllers shouldn't blindly adjust the clock, since the more + accurate source of skew info (DIRSERV) is currently + unauthenticated.} + + BAD_LIBEVENT + "METHOD=" libevent method + "VERSION=" libevent version + "BADNESS=" "BROKEN" / "BUGGY" / "SLOW" + "RECOVERED=" "NO" / "YES" + Tor knows about bugs in using the configured event method in this + version of libevent. "BROKEN" libevents won't work at all; + "BUGGY" libevents might work okay; "SLOW" libevents will work + fine, but not quickly. If "RECOVERED" is YES, Tor managed to + switch to a more reliable (but probably slower!) libevent method.
+ + {Controllers may want to warn the user if this event occurs, though + generally it's the fault of whoever built the Tor binary and there's + not much the user can do besides upgrade libevent or upgrade the + binary.} + + DIR_ALL_UNREACHABLE + Tor believes that none of the known directory servers are + reachable -- this is most likely because the local network is + down or otherwise not working, and might help to explain for the + user why Tor appears to be broken. + + {Controllers may want to warn the user if this event occurs; further + action is generally not possible.} + + Actions for STATUS_CLIENT events can be as follows: + + BOOTSTRAP + "PROGRESS=" num + "TAG=" Keyword + "SUMMARY=" String + ["WARNING=" String] + ["REASON=" Keyword] + ["COUNT=" num] + ["RECOMMENDATION=" Keyword] + ["HOST=" QuotedString] + ["HOSTADDR=" QuotedString] + + Tor has made some progress at establishing a connection to the + Tor network, fetching directory information, or making its first + circuit; or it has encountered a problem while bootstrapping. This + status event is especially useful for users with slow connections + or with connectivity problems. + + "Progress" gives a number between 0 and 100 for how far through + the bootstrapping process we are. "Summary" is a string that can + be displayed to the user to describe the *next* task that Tor + will tackle, i.e., the task it is working on after sending the + status event. "Tag" is a string that controllers can use to + recognize bootstrap phases, if they want to do something smarter + than just blindly displaying the summary string; see Section 5 + for the current tags that Tor issues. + + The StatusSeverity describes whether this is a normal bootstrap + phase (severity notice) or an indication of a bootstrapping + problem (severity warn). 
+ + For bootstrap problems, we include the same progress, tag, and + summary values as we would for a normal bootstrap event, but we + also include "warning", "reason", "count", and "recommendation" + key/value combos. The "count" number tells how many bootstrap + problems there have been so far at this phase. The "reason" + string lists one of the reasons allowed in the ORCONN event. The + "warning" argument is a string with any hints Tor has to offer about + why it's having trouble bootstrapping. + + The "reason" values are long-term-stable controller-facing tags to + identify particular issues in a bootstrapping step. The warning + strings, on the other hand, are human-readable. Controllers + SHOULD NOT rely on the format of any warning string. Currently + the possible values for "recommendation" are either "ignore" or + "warn" -- if ignore, the controller can accumulate the string in + a pile of problems to show the user if the user asks; if warn, + the controller should alert the user that Tor is pretty sure + there's a bootstrapping problem. + + The "host" value is the identity digest (in hex) of the node we're + trying to connect to; the "hostaddr" is an address:port combination, + where 'address' is an IPv4 or IPv6 address. + + Currently Tor uses recommendation=ignore for the first + nine bootstrap problem reports for a given phase, and then + uses recommendation=warn for subsequent problems at that + phase. Hopefully this is a good balance between tolerating + occasional errors and reporting serious problems quickly. + + ENOUGH_DIR_INFO + Tor now knows enough network-status documents and enough server + descriptors that it's going to start trying to build circuits now. + [Newer versions of Tor (0.2.6.2-alpha and later): + If the consensus contains Exits (the typical case), Tor will build + both exit and internal circuits. If not, Tor will only build internal + circuits.]
+ + {Controllers may want to use this event to decide when to indicate + progress to their users, but should not interrupt the user's browsing + to tell them so.} + + NOT_ENOUGH_DIR_INFO + We discarded expired statuses and server descriptors to fall + below the desired threshold of directory information. We won't + try to build any circuits until ENOUGH_DIR_INFO occurs again. + + {Controllers may want to use this event to decide when to indicate + progress to their users, but should not interrupt the user's browsing + to tell them so.} + + CIRCUIT_ESTABLISHED + Tor is able to establish circuits for client use. This event will + only be sent if we just built a circuit that changed our mind -- + that is, prior to this event we didn't know whether we could + establish circuits. + + {Suggested use: controllers can notify their users that Tor is + ready for use as a client once they see this status event. [Perhaps + controllers should also have a timeout if too much time passes and + this event hasn't arrived, to give tips on how to troubleshoot. + On the other hand, hopefully Tor will send further status events + if it can identify the problem.]} + + CIRCUIT_NOT_ESTABLISHED + "REASON=" "EXTERNAL_ADDRESS" / "DIR_ALL_UNREACHABLE" / "CLOCK_JUMPED" + We are no longer confident that we can build circuits. The "reason" + keyword provides an explanation: which other status event type caused + our lack of confidence. + + {Controllers may want to use this event to decide when to indicate + progress to their users, but should not interrupt the user's browsing + to do so.} + [Note: only REASON=CLOCK_JUMPED is implemented currently.] + + CONSENSUS_ARRIVED + Tor has received and validated a new consensus networkstatus. + (This event can be delayed a little while after the consensus + is received, if Tor needs to fetch certificates.) 
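The four readiness-related actions above (ENOUGH_DIR_INFO, NOT_ENOUGH_DIR_INFO, CIRCUIT_ESTABLISHED, CIRCUIT_NOT_ESTABLISHED) can be folded into a small piece of controller state. The sketch below is one possible reading, not something the specification mandates: the class and attribute names are hypothetical, and treating "ready" as "the last circuit event was CIRCUIT_ESTABLISHED" is a design assumption.

```python
class ClientReadiness:
    """Tracks what the STATUS_CLIENT actions say about client usability."""

    def __init__(self):
        self.enough_dir_info = False      # Tor is (or is not) trying to build circuits
        self.circuit_established = False  # Tor believes it can build client circuits
        self.last_reason = None           # REASON from CIRCUIT_NOT_ESTABLISHED, if any

    def on_action(self, action, args=None):
        """Update state from one status action; returns the new ready flag."""
        args = args or {}
        if action == "ENOUGH_DIR_INFO":
            self.enough_dir_info = True
        elif action == "NOT_ENOUGH_DIR_INFO":
            self.enough_dir_info = False
        elif action == "CIRCUIT_ESTABLISHED":
            self.circuit_established = True
            self.last_reason = None
        elif action == "CIRCUIT_NOT_ESTABLISHED":
            self.circuit_established = False
            self.last_reason = args.get("REASON")
        # Unrecognized actions are deliberately ignored.
        return self.ready

    @property
    def ready(self):
        return self.circuit_established


if __name__ == "__main__":
    state = ClientReadiness()
    for action, args in [("ENOUGH_DIR_INFO", {}),
                         ("CIRCUIT_ESTABLISHED", {}),
                         ("CIRCUIT_NOT_ESTABLISHED", {"REASON": "CLOCK_JUMPED"})]:
        print(action, "->", state.on_action(action, args))
```

Consistent with the controller notes above, a user interface built on this state would update a status indicator quietly rather than interrupt the user on each transition.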
+ + DANGEROUS_PORT + "PORT=" port + "RESULT=" "REJECT" / "WARN" + A stream was initiated to a port that's commonly used for + vulnerable-plaintext protocols. If the Result is "reject", we + refused the connection; whereas if it's "warn", we allowed it. + + {Controllers should warn their users when this occurs, unless they + happen to know that the application using Tor is in fact doing so + correctly (e.g., because it is part of a distributed bundle). They + might also want some sort of interface to let the user configure + their RejectPlaintextPorts and WarnPlaintextPorts config options.} + + DANGEROUS_SOCKS + "PROTOCOL=" "SOCKS4" / "SOCKS5" + "ADDRESS=" IP:port + A connection was made to Tor's SOCKS port using one of the SOCKS + approaches that doesn't support hostnames -- only raw IP addresses. + If the client application got this address from gethostbyname(), + it may be leaking target addresses via DNS. + + {Controllers should warn their users when this occurs, unless they + happen to know that the application using Tor is in fact doing so + correctly (e.g., because it is part of a distributed bundle).} + + SOCKS_UNKNOWN_PROTOCOL + "DATA=string" + A connection was made to Tor's SOCKS port that tried to use it + for something other than the SOCKS protocol. Perhaps the user is + using Tor as an HTTP proxy? The DATA is the first few characters + sent to Tor on the SOCKS port. + + {Controllers may want to warn their users when this occurs: it + indicates a misconfigured application.} + + SOCKS_BAD_HOSTNAME + "HOSTNAME=QuotedString" + Some application gave us a funny-looking hostname. Perhaps + it is broken? In any case it won't work with Tor and the user + should know. 
+ + {Controllers may want to warn their users when this occurs: it + usually indicates a misconfigured application.} + + Actions for STATUS_SERVER can be as follows: + + EXTERNAL_ADDRESS + "ADDRESS=IP" + "HOSTNAME=NAME" + "METHOD=CONFIGURED/CONFIGURED_ORPORT/DIRSERV/RESOLVED/ + INTERFACE/GETHOSTNAME" + Our best idea for our externally visible IP has changed to 'IP'. If + 'HOSTNAME' is present, we got the new IP by resolving 'NAME'. If the + method is 'CONFIGURED', the IP was given verbatim as the Address + configuration option. If the method is 'CONFIGURED_ORPORT', the IP was + given verbatim in the ORPort configuration option. If the method is + 'RESOLVED', we resolved the Address configuration option to get the IP. + If the method is 'GETHOSTNAME', we resolved our hostname to get the IP. + If the method is 'INTERFACE', we got the address of one of our network + interfaces to get the IP. If the method is 'DIRSERV', a directory + server told us a guess for what our IP might be. + + {Controllers may want to record this info and display it to the user.} + + CHECKING_REACHABILITY + "ORADDRESS=IP:port" + "DIRADDRESS=IP:port" + We're going to start testing the reachability of our external OR port + or directory port. + + {This event could affect the controller's idea of server status, but + the controller should not interrupt the user to tell them so.} + + REACHABILITY_SUCCEEDED + "ORADDRESS=IP:port" + "DIRADDRESS=IP:port" + We successfully verified the reachability of our external OR port or + directory port (depending on which of ORADDRESS or DIRADDRESS is + given.) + + {This event could affect the controller's idea of server status, but + the controller should not interrupt the user to tell them so.} + + GOOD_SERVER_DESCRIPTOR + We successfully uploaded our server descriptor to at least one + of the directory authorities, with no complaints. 
+ + {Originally, the goal of this event was to declare "every authority + has accepted the descriptor, so there will be no complaints + about it." But since some authorities might be offline, it's + harder to get certainty than we had thought. As such, this event + is equivalent to ACCEPTED_SERVER_DESCRIPTOR below. Controllers + should just look at ACCEPTED_SERVER_DESCRIPTOR and should ignore + this event for now.} + + SERVER_DESCRIPTOR_STATUS + "STATUS=" "LISTED" / "UNLISTED" + We just got a new networkstatus consensus, and whether we're in + it or not in it has changed. Specifically, status is "listed" + if we're listed in it but previous to this point we didn't know + we were listed in a consensus; and status is "unlisted" if we + thought we should have been listed in it (e.g. we were listed in + the last one), but we're not. + + {Moving from listed to unlisted is not necessarily cause for + alarm. The relay might have failed a few reachability tests, + or the Internet might have had some routing problems. So this + feature is mainly to let relay operators know when their relay + has successfully been listed in the consensus.} + + [Not implemented yet. We should do this in 0.2.2.x. -RD] + + NAMESERVER_STATUS + "NS=addr" + "STATUS=" "UP" / "DOWN" + "ERR=" message + One of our nameservers has changed status. + + {This event could affect the controller's idea of server status, but + the controller should not interrupt the user to tell them so.} + + NAMESERVER_ALL_DOWN + All of our nameservers have gone down. + + {This is a problem; if it happens often without the nameservers + coming up again, the user needs to configure more or better + nameservers.} + + DNS_HIJACKED + Our DNS provider is providing an address when it should be saying + "NOTFOUND"; Tor will treat the address as a synonym for "NOTFOUND". 
+ + {This is an annoyance; controllers may want to tell admins that their + DNS provider is not to be trusted.} + + DNS_USELESS + Our DNS provider is giving a hijacked address instead of well-known + websites; Tor will not try to be an exit node. + + {Controllers could warn the admin if the relay is running as an + exit node: the admin needs to configure a good DNS server. + Alternatively, this happens a lot in some restrictive environments + (hotels, universities, coffeeshops) when the user hasn't registered.} + + BAD_SERVER_DESCRIPTOR + "DIRAUTH=addr:port" + "REASON=string" + A directory authority rejected our descriptor. Possible reasons + include malformed descriptors, incorrect keys, highly skewed clocks, + and so on. + + {Controllers should warn the admin, and try to cope if they can.} + + ACCEPTED_SERVER_DESCRIPTOR + "DIRAUTH=addr:port" + A single directory authority accepted our descriptor. + // actually notice + + {This event could affect the controller's idea of server status, but + the controller should not interrupt the user to tell them so.} + + REACHABILITY_FAILED + "ORADDRESS=IP:port" + "DIRADDRESS=IP:port" + We failed to connect to our external OR port or directory port + successfully. + + {This event could affect the controller's idea of server status. The + controller should warn the admin and suggest reasonable steps to take.} + + HIBERNATION_STATUS + "STATUS=" "AWAKE" | "SOFT" | "HARD" + Our bandwidth based accounting status has changed, and we are now + relaying traffic/rejecting new connections/hibernating. + + {This event could affect the controller's idea of server status. The + controller MAY inform the admin, though presumably the accounting was + explicitly enabled for a reason.} + + [This event was added in tor 0.2.9.0-alpha.] + +4.1.11. Our set of guard nodes has changed + + Syntax: + + "650" SP "GUARD" SP Type SP Name SP Status ... 
CRLF + Type = "ENTRY" + Name = ServerSpec + (Identifies the guard affected) + Status = "NEW" | "UP" | "DOWN" | "BAD" | "BAD_L2" | "GOOD" | "DROPPED" + + The ENTRY type indicates a guard used for connections to the Tor + network. + + The Status values are: + + "NEW" -- This node was not previously used as a guard; now we have + picked it as one. + "DROPPED" -- This node is one we previously picked as a guard; we + no longer consider it to be a member of our guard list. + "UP" -- The guard now seems to be reachable. + "DOWN" -- The guard now seems to be unreachable. + "BAD" -- Because of flags set in the consensus and/or values in the + configuration, this node is now unusable as a guard. + "BAD_L2" -- This layer2 guard has expired or was removed from the + consensus. This node is removed from the layer2 guard set. + "GOOD" -- Because of flags set in the consensus and/or values in the + configuration, this node is now usable as a guard. + + Controllers must accept unrecognized types and unrecognized statuses. + +4.1.12. Network status has changed + + Syntax: + + "650" "+" "NS" CRLF 1*NetworkStatus "." CRLF "650" SP "OK" CRLF + + The event is used whenever our local view of a relay's status changes. + This happens when we get a new v3 consensus (in which case the entries + we see are a duplicate of what we see in the NEWCONSENSUS event, + below), but it also happens when we decide to mark a relay as up or + down in our local status, for example based on connection attempts. + + [First added in 0.1.2.3-alpha] + +4.1.13. Bandwidth used on an application stream + + The syntax is: + + "650" SP "STREAM_BW" SP StreamID SP BytesWritten SP BytesRead SP + Time CRLF + BytesWritten = 1*DIGIT + BytesRead = 1*DIGIT + Time = ISOTime2Frac + + BytesWritten and BytesRead are the number of bytes written and read + by the application since the last STREAM_BW event on this stream. + + Note that from Tor's perspective, *reading* a byte on a stream means + that the application *wrote* the byte.
That's why the order of "written" + vs "read" is opposite for STREAM_BW events compared to BW events. + + The Time field is provided only in versions 0.3.2.1-alpha and later. It + records when Tor created the bandwidth event. + + These events are generated about once per second per stream; no events + are generated for streams that have not written or read. These events + apply only to streams entering Tor (such as on a SOCKSPort, TransPort, + and so on). They are not generated for exiting streams. + +4.1.14. Per-country client stats + + The syntax is: + + "650" SP "CLIENTS_SEEN" SP TimeStarted SP CountrySummary SP + IPVersions CRLF + + We just generated a new summary of which countries we've seen clients + from recently. The controller could display this for the user, e.g. + in their "relay" configuration window, to give them a sense that they + are actually being useful. + + Currently only bridge relays will receive this event, but once we figure + out how to sufficiently aggregate and sanitize the client counts on + main relays, we might start sending these events in other cases too. + + TimeStarted is a quoted string indicating when the reported summary + counts start (in UTC). + + The CountrySummary keyword has as its argument a comma-separated, + possibly empty set of "countrycode=count" pairs. For example (without + linebreak), + 650-CLIENTS_SEEN TimeStarted="2008-12-25 23:50:43" + CountrySummary=us=16,de=8,uk=8 + + The IPVersions keyword has as its argument a comma-separated set of + "protocol-family=count" pairs. For example, + IPVersions=v4=16,v6=40 + + Note that these values are rounded, not exact. The rounding + algorithm is specified in the description of "geoip-client-origins" + in dir-spec.txt. + +4.1.15. New consensus networkstatus has arrived + + The syntax is: + + "650" "+" "NEWCONSENSUS" CRLF 1*NetworkStatus "." CRLF "650" SP + "OK" CRLF + + A new consensus networkstatus has arrived. We include NS-style lines for + every relay in the consensus.
NEWCONSENSUS is a separate event from the + NS event, because the list here represents every usable relay: so any + relay *not* mentioned in this list is implicitly no longer recommended. + + [First added in 0.2.1.13-alpha] + +4.1.16. New circuit buildtime has been set + + The syntax is: + + "650" SP "BUILDTIMEOUT_SET" SP Type SP "TOTAL_TIMES=" Total SP + "TIMEOUT_MS=" Timeout SP "XM=" Xm SP "ALPHA=" Alpha SP + "CUTOFF_QUANTILE=" Quantile SP "TIMEOUT_RATE=" TimeoutRate SP + "CLOSE_MS=" CloseTimeout SP "CLOSE_RATE=" CloseRate + CRLF + Type = "COMPUTED" / "RESET" / "SUSPENDED" / "DISCARD" / "RESUME" + Total = Integer count of timeouts stored + Timeout = Integer timeout in milliseconds + Xm = Estimated integer Pareto parameter Xm in milliseconds + Alpha = Estimated floating point Pareto parameter alpha + Quantile = Floating point CDF quantile cutoff point for this timeout + TimeoutRate = Floating point ratio of circuits that timeout + CloseTimeout = How long to keep measurement circs in milliseconds + CloseRate = Floating point ratio of measurement circuits that are closed + + A new circuit build timeout time has been set. If Type is "COMPUTED", + Tor has computed the value based on historical data. If Type is "RESET", + initialization or drastic network changes have caused Tor to reset + the timeout back to the default and relearn it. If Type is + "SUSPENDED", Tor has detected a loss of network connectivity and has + temporarily changed the timeout value to the default until the network + recovers. If Type is "DISCARD", Tor has decided to discard timeout + values that likely happened while the network was down. If Type is + "RESUME", Tor has decided to resume timeout calculation. + + The Total value is the count of circuit build times Tor used in + computing this value. It is capped internally at the maximum number + of build times Tor stores (NCIRCUITS_TO_OBSERVE). + + The Timeout itself is provided in milliseconds.
Internally, Tor rounds + this value to the nearest second before using it. + + [First added in 0.2.2.7-alpha] + +4.1.17. Signal received + + The syntax is: + + "650" SP "SIGNAL" SP Signal CRLF + + Signal = "RELOAD" / "DUMP" / "DEBUG" / "NEWNYM" / "CLEARDNSCACHE" + + A signal has been received and actions taken by Tor. The meaning of each + signal, and the mapping to Unix signals, is as defined in section 3.7. + Future versions of Tor MAY generate signals other than those listed here; + controllers MUST be able to accept them. + + If Tor chose to ignore a signal (such as NEWNYM), this event will not be + sent. Note that some options (like ReloadTorrcOnSIGHUP) may affect the + semantics of the signals here. + + Note that the HALT (SIGTERM) and SHUTDOWN (SIGINT) signals do not currently + generate any event. + + [First added in 0.2.3.1-alpha] + +4.1.18. Configuration changed + + The syntax is: + + StartReplyLine *(MidReplyLine) EndReplyLine + + StartReplyLine = "650-CONF_CHANGED" CRLF + MidReplyLine = "650-" KEYWORD ["=" VALUE] CRLF + EndReplyLine = "650 OK" + + Tor configuration options have changed (such as via a SETCONF or RELOAD + signal). KEYWORD and VALUE specify the configuration option that was changed. + Undefined configuration options contain only the KEYWORD. + +4.1.19. Circuit status changed slightly + + The syntax is: + + "650" SP "CIRC_MINOR" SP CircuitID SP CircEvent [SP Path] + [SP "BUILD_FLAGS=" BuildFlags] [SP "PURPOSE=" Purpose] + [SP "HS_STATE=" HSState] [SP "REND_QUERY=" HSAddress] + [SP "TIME_CREATED=" TimeCreated] + [SP "OLD_PURPOSE=" Purpose [SP "OLD_HS_STATE=" HSState]] CRLF + + CircEvent = + "PURPOSE_CHANGED" / ; circuit purpose or HS-related state changed + "CANNIBALIZED" ; circuit cannibalized + + Clients MUST accept circuit events not listed above. + + The "OLD_PURPOSE" field is provided for both PURPOSE_CHANGED and + CANNIBALIZED events. 
The "OLD_HS_STATE" field is provided whenever + the "OLD_PURPOSE" field is provided and is a hidden-service-related + purpose. + + Other fields are as specified in section 4.1.1 above. + + [First added in 0.2.3.11-alpha] + +4.1.20. Pluggable transport launched + + The syntax is: + + "650" SP "TRANSPORT_LAUNCHED" SP Type SP Name SP TransportAddress SP Port + Type = "server" | "client" + Name = The name of the pluggable transport + TransportAddress = An IPv4 or IPv6 address on which the pluggable + transport is listening for connections + Port = The TCP port on which it is listening for connections. + + A pluggable transport called 'Name' of type 'Type' was launched + successfully and is now listening for connections on + 'TransportAddress':'Port'. + +4.1.21. Bandwidth used on an OR or DIR or EXIT connection + + The syntax is: + + "650" SP "CONN_BW" SP "ID=" ConnID SP "TYPE=" ConnType + SP "READ=" BytesRead SP "WRITTEN=" BytesWritten CRLF + + ConnType = "OR" / ; Carrying traffic within the tor network. This can + either be our own (client) traffic or traffic we're + relaying within the network. + "DIR" / ; Fetching tor descriptor data, or transmitting + descriptors we're mirroring. + "EXIT" ; Carrying traffic between the tor network and an + external destination. + + BytesRead = 1*DIGIT + BytesWritten = 1*DIGIT + + Controllers MUST tolerate unrecognized connection types. + + BytesWritten and BytesRead are the number of bytes written and read + by Tor since the last CONN_BW event on this connection. + + These events are generated about once per second per connection; no + events are generated for connections that have not read or written. + These events are only generated if TestingTorNetwork is set. + + [First added in 0.2.5.2-alpha] + +4.1.22.
Bandwidth used by all streams attached to a circuit + + The syntax is: + + "650" SP "CIRC_BW" SP "ID=" CircuitID SP "READ=" BytesRead SP + "WRITTEN=" BytesWritten SP "TIME=" Time SP + "DELIVERED_READ=" DeliveredBytesRead SP + "OVERHEAD_READ=" OverheadBytesRead SP + "DELIVERED_WRITTEN=" DeliveredBytesWritten SP + "OVERHEAD_WRITTEN=" OverheadBytesWritten SP + "SS=" SlowStartState SP + "CWND=" CWNDCells SP + "RTT=" RTTMilliseconds SP + "MIN_RTT=" RTTMilliseconds CRLF + BytesRead = 1*DIGIT + BytesWritten = 1*DIGIT + OverheadBytesRead = 1*DIGIT + OverheadBytesWritten = 1*DIGIT + DeliveredBytesRead = 1*DIGIT + DeliveredBytesWritten = 1*DIGIT + SlowStartState = 0 or 1 + CWNDCells = 1*DIGIT + RTTMilliseconds= 1*DIGIT + Time = ISOTime2Frac + + BytesRead and BytesWritten are the number of bytes read and written + on this circuit since the last CIRC_BW event. These bytes have not + necessarily been validated by Tor, and can include invalid cells, + dropped cells, and ignored cells (such as padding cells). These + values include the relay headers, but not circuit headers. + + Circuit data that has been validated and processed by Tor is further + broken down into two categories: delivered payloads and overhead. + DeliveredBytesRead and DeliveredBytesWritten are the total relay cell + payloads transmitted since the last CIRC_BW event, not counting relay + cell headers or circuit headers. OverheadBytesRead and + OverheadBytesWritten are the extra unused bytes at the end of each + cell in order for it to be the fixed CELL_LEN bytes long. + + The sum of DeliveredBytesRead and OverheadBytesRead MUST be less than + BytesRead, and the same is true for their written counterparts. This + sum represents the total relay cell bytes on the circuit that + have been validated by Tor, not counting relay headers and cell headers. 
+ Subtracting this sum (plus relay cell headers) from the BytesRead + (or BytesWritten) value gives the byte count that Tor has decided to + reject due to protocol errors, or has otherwise decided to ignore. + + The Time field is provided only in versions 0.3.2.1-alpha and later. It + records when Tor created the bandwidth event. + + The SS, CWND, RTT, and MIN_RTT fields are present only if the circuit + has negotiated congestion control to an onion service or Exit hop (any + intermediate leaky pipe congestion control hops are not examined here). + SS provides an indication if the circuit is in slow start (1), or not (0). + CWND is the size of the congestion window in terms of number of cells. + RTT is the N_EWMA smoothed current RTT value, and MIN_RTT is the minimum + RTT value of the circuit. The SS and CWND fields apply only to the + upstream direction of the circuit. The slow start state and CWND values + of the other endpoint may be different. + + These events are generated about once per second per circuit; no events + are generated for circuits that had no attached stream writing or + reading. + + [First added in 0.2.5.2-alpha] + + [DELIVERED_READ, OVERHEAD_READ, DELIVERED_WRITTEN, and OVERHEAD_WRITTEN + were added in Tor 0.3.4.0-alpha] + + [SS, CWND, RTT, and MIN_RTT were added in Tor 0.4.7.5-alpha] + +4.1.23. 
Per-circuit cell stats + + The syntax is: + + "650" SP "CELL_STATS" + [ SP "ID=" CircuitID ] + [ SP "InboundQueue=" QueueID SP "InboundConn=" ConnID ] + [ SP "InboundAdded=" CellsByType ] + [ SP "InboundRemoved=" CellsByType SP + "InboundTime=" MsecByType ] + [ SP "OutboundQueue=" QueueID SP "OutboundConn=" ConnID ] + [ SP "OutboundAdded=" CellsByType ] + [ SP "OutboundRemoved=" CellsByType SP + "OutboundTime=" MsecByType ] CRLF + CellsByType, MsecByType = CellType ":" 1*DIGIT + 0*( "," CellType ":" 1*DIGIT ) + CellType = 1*( "a" - "z" / "0" - "9" / "_" ) + + Examples are: + + 650 CELL_STATS ID=14 OutboundQueue=19403 OutboundConn=15 + OutboundAdded=create_fast:1,relay_early:2 + OutboundRemoved=create_fast:1,relay_early:2 + OutboundTime=create_fast:0,relay_early:0 + 650 CELL_STATS InboundQueue=19403 InboundConn=32 + InboundAdded=relay:1,created_fast:1 + InboundRemoved=relay:1,created_fast:1 + InboundTime=relay:0,created_fast:0 + OutboundQueue=6710 OutboundConn=18 + OutboundAdded=create:1,relay_early:1 + OutboundRemoved=create:1,relay_early:1 + OutboundTime=create:0,relay_early:0 + + ID is the locally unique circuit identifier that is only included if the + circuit originates at this node. + + Inbound and outbound refer to the direction of cell flow through the + circuit which is either to origin (inbound) or from origin (outbound). + + InboundQueue and OutboundQueue are identifiers of the inbound and + outbound circuit queues of this circuit. These identifiers are only + unique per OR connection. OutboundQueue is chosen by this node and + matches InboundQueue of the next node in the circuit. + + InboundConn and OutboundConn are locally unique IDs of inbound and + outbound OR connection. OutboundConn does not necessarily match + InboundConn of the next node in the circuit. + + InboundQueue and InboundConn are not present if the circuit originates + at this node. OutboundQueue and OutboundConn are not present if the + circuit (currently) ends at this node. 
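The CELL_STATS syntax and examples above can be turned into a dictionary along the following lines. This is a rough sketch with hypothetical function names. It assumes the event has been joined onto a single line, treats any key ending in "Added", "Removed", or "Time" as a CellsByType or MsecByType list, and leaves all other values as strings.

```python
BY_TYPE_SUFFIXES = ("Added", "Removed", "Time")


def parse_by_type(value):
    """Parse 'create_fast:1,relay_early:2' into {'create_fast': 1, ...}."""
    counts = {}
    for item in value.split(","):
        cell_type, _, number = item.partition(":")
        counts[cell_type] = int(number)
    return counts


def parse_cell_stats(line):
    """Parse a one-line 650 CELL_STATS event into a dict keyed by field name."""
    tokens = line.split()
    if tokens[:2] != ["650", "CELL_STATS"]:
        raise ValueError("not a CELL_STATS event: %r" % line)
    fields = {}
    for token in tokens[2:]:
        key, sep, value = token.partition("=")
        if not sep:
            continue  # tolerate tokens that are not key=value pairs
        if key.endswith(BY_TYPE_SUFFIXES):
            fields[key] = parse_by_type(value)
        else:
            fields[key] = value  # ID, *Queue and *Conn stay opaque strings
    return fields


if __name__ == "__main__":
    sample = ("650 CELL_STATS ID=14 OutboundQueue=19403 OutboundConn=15 "
              "OutboundAdded=create_fast:1,relay_early:2 "
              "OutboundRemoved=create_fast:1,relay_early:2 "
              "OutboundTime=create_fast:0,relay_early:0")
    print(parse_cell_stats(sample))
```

Because every field in the syntax is optional, callers would typically use dict.get() rather than indexing when reading the result.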
+ + InboundAdded and OutboundAdded are total number of cells by cell type + added to inbound and outbound queues. Only present if at least one cell + was added to a queue. + + InboundRemoved and OutboundRemoved are total number of cells by + cell type processed from inbound and outbound queues. InboundTime and + OutboundTime are total waiting times in milliseconds of all processed + cells by cell type. Only present if at least one cell was removed from + a queue. + + These events are generated about once per second per circuit; no + events are generated for circuits that have not added or processed any + cell. These events are only generated if TestingTorNetwork is set. + + [First added in 0.2.5.2-alpha] + +4.1.24. Token buckets refilled + + The syntax is: + + "650" SP "TB_EMPTY" SP BucketName [ SP "ID=" ConnID ] SP + "READ=" ReadBucketEmpty SP "WRITTEN=" WriteBucketEmpty SP + "LAST=" LastRefill CRLF + + BucketName = "GLOBAL" / "RELAY" / "ORCONN" + ReadBucketEmpty = 1*DIGIT + WriteBucketEmpty = 1*DIGIT + LastRefill = 1*DIGIT + + Examples are: + + 650 TB_EMPTY ORCONN ID=16 READ=0 WRITTEN=0 LAST=100 + 650 TB_EMPTY GLOBAL READ=93 WRITTEN=93 LAST=100 + 650 TB_EMPTY RELAY READ=93 WRITTEN=93 LAST=100 + + This event is generated when refilling a previously empty token + bucket. The BucketName keywords "GLOBAL" and "RELAY" are used for the + global and relay token buckets; "ORCONN" is used for the + token buckets of an OR connection. Controllers MUST tolerate + unrecognized bucket names. + + ConnID is only included if the BucketName is "ORCONN". + + If both global and relay buckets and/or the buckets of one or more OR + connections run out of tokens at the same time, multiple separate + events are generated. + + ReadBucketEmpty (WriteBucketEmpty) is the time in milliseconds that the + read (write) bucket was empty since the last refill. LastRefill is the + time in milliseconds since the last refill.
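For illustration, the TB_EMPTY examples above could be parsed as in the sketch below. The function name and the result keys (bucket, conn_id, read_ms, written_ms, last_ms) are invented for this example. Unrecognized bucket names are passed through unchanged, since controllers must tolerate them.

```python
def parse_tb_empty(line):
    """Parse a one-line 650 TB_EMPTY event into a dict."""
    tokens = line.split()
    if len(tokens) < 3 or tokens[:2] != ["650", "TB_EMPTY"]:
        raise ValueError("not a TB_EMPTY event: %r" % line)
    # conn_id stays None unless an ID= field is present (ORCONN buckets only).
    event = {"bucket": tokens[2], "conn_id": None}
    for token in tokens[3:]:
        key, sep, value = token.partition("=")
        if not sep:
            continue  # skip anything that is not key=value
        if key == "ID":
            event["conn_id"] = value
        elif key in ("READ", "WRITTEN", "LAST"):
            # All three values are durations in milliseconds.
            event[key.lower() + "_ms"] = int(value)
    return event


if __name__ == "__main__":
    for sample in ("650 TB_EMPTY ORCONN ID=16 READ=0 WRITTEN=0 LAST=100",
                   "650 TB_EMPTY GLOBAL READ=93 WRITTEN=93 LAST=100"):
        print(parse_tb_empty(sample))
```

Dividing read_ms or written_ms by last_ms gives the share of the refill interval during which the bucket was empty, which may be a more convenient figure to display than the raw durations.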
+ + If a bucket went negative and if refilling tokens didn't make it go + positive again, there will be multiple consecutive TB_EMPTY events for + each refill interval during which the bucket contained zero tokens or + less. In such a case, ReadBucketEmpty or WriteBucketEmpty are capped + at LastRefill in order not to report empty times more than once. + + These events are only generated if TestingTorNetwork is set. + + [First added in 0.2.5.2-alpha] + +4.1.25. HiddenService descriptors + + The syntax is: + + "650" SP "HS_DESC" SP Action SP HSAddress SP AuthType SP HsDir + [SP DescriptorID] [SP "REASON=" Reason] [SP "REPLICA=" Replica] + [SP "HSDIR_INDEX=" HSDirIndex] + + Action = "REQUESTED" / "UPLOAD" / "RECEIVED" / "UPLOADED" / "IGNORE" / + "FAILED" / "CREATED" + HSAddress = 16*Base32Character / 56*Base32Character / "UNKNOWN" + AuthType = "NO_AUTH" / "BASIC_AUTH" / "STEALTH_AUTH" / "UNKNOWN" + HsDir = LongName / Fingerprint / "UNKNOWN" + DescriptorID = 32*Base32Character / 43*Base64Character + Reason = "BAD_DESC" / "QUERY_REJECTED" / "UPLOAD_REJECTED" / "NOT_FOUND" / + "UNEXPECTED" / "QUERY_NO_HSDIR" / "QUERY_RATE_LIMITED" + Replica = 1*DIGIT + HSDirIndex = 64*HEXDIG + + These events are triggered when a required HiddenService descriptor is + not found in the cache and a fetch from or upload to the network is + performed. + + If the fetch was triggered with only a DescriptorID (using the HSFETCH + command for instance), the HSAddress only appears in the Action=RECEIVED + event; for other actions there is no way to know the HSAddress from the + DescriptorID, so the value will be "UNKNOWN". + + If we already had the v0 descriptor, the newly fetched v2 descriptor + will be ignored and a "HS_DESC" event with "IGNORE" action will be + generated. + + For HsDir, LongName is always preferred. If HsDir cannot be found in the + node list at the time the event is sent, Fingerprint will be used instead. + + If Action is "FAILED", Tor SHOULD send Reason field as well.
Possible + values of Reason are: + - "BAD_DESC" - descriptor was retrieved, but found to be unparsable. + - "QUERY_REJECTED" - query was rejected by HS directory. + - "UPLOAD_REJECTED" - descriptor was rejected by HS directory. + - "NOT_FOUND" - HS descriptor with given identifier was not found. + - "UNEXPECTED" - nature of failure is unknown. + - "QUERY_NO_HSDIR" - No suitable HSDir were found for the query. + - "QUERY_RATE_LIMITED" - query for this service is rate-limited + + For "QUERY_NO_HSDIR" or "QUERY_RATE_LIMITED", the HsDir will be set to + "UNKNOWN" which was introduced in tor 0.3.1.0-alpha and 0.4.1.0-alpha + respectively. + + If Action is "CREATED", Tor SHOULD send Replica field as well. The Replica + field contains the replica number of the generated descriptor. The Replica + number is specified in rend-spec.txt section 1.3 and determines the + descriptor ID of the descriptor. + + For hidden service v3, the following applies: + + The "HSDIR_INDEX=" is an optional field that is only for version 3 + which contains the computed index of the HsDir the descriptor was + uploaded to or fetched from. + + The "DescriptorID" key is the descriptor blinded key used for the index + value at the "HsDir". + + The "REPLICA=" field is not used for the "CREATED" event because v3 + doesn't use the replica number in the descriptor ID computation. + + Because client authentication is not yet implemented, the "AuthType" + field is always "NO_AUTH". + + [HS v3 support added 0.3.3.1-alpha] + +4.1.26. HiddenService descriptors content + + The syntax is: + + "650" "+" "HS_DESC_CONTENT" SP HSAddress SP DescId SP HsDir CRLF + Descriptor CRLF "." 
CRLF "650" SP "OK" CRLF + + HSAddress = 16*Base32Character / 56*Base32Character / "UNKNOWN" + DescId = 32*Base32Character / 32*Base64Character + HsDir = LongName / "UNKNOWN" + Descriptor = The text of the descriptor formatted as specified in + rend-spec.txt section 1.3 (v2) or rend-spec-v3.txt + section 2.4 (v3) or empty string on failure. + + This event is triggered when a successfully fetched HS descriptor is + received. The text of that descriptor is then sent in the event. If the + HS_DESC event is enabled, this event is sent just after the RECEIVED + action. + + If a fetch fails, the Descriptor is an empty string and HSAddress is set + to "UNKNOWN". The HS_DESC event should be used to get more information on + the failed request. + + If the fetch fails for the QUERY_NO_HSDIR or QUERY_RATE_LIMITED reason from + the HS_DESC event, the HsDir is set to "UNKNOWN". This was introduced in + 0.3.1.0-alpha and 0.4.1.0-alpha respectively. + + The event is expected to arrive relatively quickly: it takes as long as + fetching something over the Tor network, which can be anywhere from a + couple of seconds up to 60 seconds (not a hard limit). In any case, + this event will carry either the descriptor's content or an empty one. + + [HS_DESC_CONTENT was added in Tor 0.2.7.1-alpha] + [HS v3 support added 0.3.3.1-alpha] + +4.1.27. Network liveness has changed + + Syntax: + + "650" SP "NETWORK_LIVENESS" SP Status CRLF + Status = "UP" / ; The network now seems to be reachable. + "DOWN" / ; The network now seems to be unreachable. + + Controllers MUST tolerate unrecognized status types. + + [NETWORK_LIVENESS was added in Tor 0.2.7.2-alpha] + +4.1.28. Pluggable Transport Logs + + Syntax: + + "650" SP "PT_LOG" SP PT=Program SP Message + + Program = The program path as defined in the *TransportPlugin + configuration option. Tor accepts relative and full path. + Message = The log message that the PT sends back to the tor parent + process minus the "LOG" string prefix.
Formatted as + specified in pt-spec.txt section "3.3.4. Pluggable + Transport Log Message". + + This event is triggered when tor receives a log message from the PT. + + Example: + + PT (obfs4): LOG SEVERITY=debug MESSAGE="Connected to bridge A" + + the resulting control port event would be: + + Tor: 650 PT_LOG PT=/usr/bin/obfs4proxy SEVERITY=debug MESSAGE="Connected to bridge A" + + [PT_LOG was added in Tor 0.4.0.1-alpha] + +4.1.29. Pluggable Transport Status + + Syntax: + + "650" SP "PT_STATUS" SP PT=Program SP TRANSPORT=Transport SP Message + + Program = The program path as defined in the *TransportPlugin + configuration option. Tor accepts relative and full path. + Transport = This value indicates a hint on what the PT is such as the + name or the protocol used for instance. + Message = The status message that the PT sends back to the tor parent + process minus the "STATUS" string prefix. Formatted as + specified in pt-spec.txt section "3.3.5 Pluggable + Transport Status Message". + + This event is triggered when tor receives a status message from the PT. + + Example: + + PT (obfs4): STATUS TRANSPORT=obfs4 CONNECT=Success + + the resulting control port event would be: + + Tor: 650 PT_STATUS PT=/usr/bin/obfs4proxy TRANSPORT=obfs4 CONNECT=Success + + [PT_STATUS was added in Tor 0.4.0.1-alpha] + +5. Implementation notes + +5.1. Authentication + + If the control port is open and no authentication operation is enabled, Tor + trusts any local user that connects to the control port. This is generally + a poor idea. + + If the 'CookieAuthentication' option is true, Tor writes a "magic + cookie" file named "control_auth_cookie" into its data directory (or + to another file specified in the 'CookieAuthFile' option).
To + authenticate, the controller must demonstrate that it can read the + contents of the cookie file: + + * Current versions of Tor support cookie authentication + + using the "COOKIE" authentication method: the controller sends the + contents of the cookie file, encoded in hexadecimal. This + authentication method exposes the user running a controller to an + unintended information disclosure attack whenever the controller + has greater filesystem read access than the process that it has + connected to. (Note that a controller may connect to a process + other than Tor.) It is almost never safe to use, even if the + controller's user has explicitly specified which filename to read + an authentication cookie from. For this reason, the COOKIE + authentication method has been deprecated and will be removed in + some future version of Tor. + + * 0.2.2.x versions of Tor starting with 0.2.2.36, and all versions of + + Tor after 0.2.3.12-alpha, support cookie authentication using the + "SAFECOOKIE" authentication method, which discloses much less + information about the contents of the cookie file. + + If the 'HashedControlPassword' option is set, it must contain the salted + hash of a secret password. The salted hash is computed according to the + S2K algorithm in RFC 2440 (OpenPGP), and prefixed with the s2k specifier. + This is then encoded in hexadecimal, prefixed by the indicator sequence + "16:". Thus, for example, the password 'foo' could encode to: + + 16:660537E3E1CD49996044A3BF558097A981F539FEA2F9DA662B4626C1C2 + ++++++++++++++++**^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + salt hashed value + indicator + + You can generate the salted hash of a password by calling + + 'tor --hash-password ' + + or by using the example code in the Python and Java controller libraries. + To authenticate under this scheme, the controller sends Tor the original + secret that was used to generate the salted hash, either as a quoted string + or encoded in hexadecimal. + +5.2. 
Don't let the buffer get too big. + + With old versions of Tor (before 0.2.0.16-alpha), if you ask for + lots of events, and 16MB of them queue up on the buffer, the Tor + process will close the socket. + + Newer Tor versions do not have this 16 MB buffer limit. However, + if you leave huge numbers of events unread, Tor may still run out + of memory, so you should still be careful about buffer size. + +5.3. Backward compatibility with v0 control protocol. + + The 'version 0' control protocol was replaced in Tor 0.1.1.x. Support + was removed in Tor 0.2.0.x. Every non-obsolete version of Tor now + supports the version 1 control protocol. + + For backward compatibility with the "version 0" control protocol, + Tor used to check whether the third octet of the first command is zero. + (If it was, Tor assumed that version 0 is in use.) + + This compatibility was removed in Tor 0.1.2.16 and 0.2.0.4-alpha. + +5.4. Tor config options for use by controllers + + Tor provides a few special configuration options for use by controllers. + These options are not saved to disk by SAVECONF. Most can be set and + examined by the SETCONF and GETCONF commands, but some (noted below) can + only be given in a torrc file or on the command line. + + Generally, these options make Tor unusable by disabling a portion of Tor's + normal operations. Unless a controller provides replacement functionality + to fill this gap, Tor will not correctly handle user requests. + + __AllDirActionsPrivate + + If true, Tor will try to launch all directory operations through + anonymous connections. (Ordinarily, Tor only tries to anonymize + requests related to hidden services.) This option will slow down + directory access, and may stop Tor from working entirely if it does not + yet have enough directory information to build circuits. + + (Boolean. Default: "0".) + + __DisablePredictedCircuits + + If true, Tor will not launch preemptive "general-purpose" circuits for + streams to attach to. 
(It will still launch circuits for testing and + for hidden services.) + + (Boolean. Default: "0".) + + __LeaveStreamsUnattached + + If true, Tor will not automatically attach new streams to circuits; + instead, the controller must attach them with ATTACHSTREAM. If the + controller does not attach the streams, their data will never be routed. + + (Boolean. Default: "0".) + + __HashedControlSessionPassword + + As HashedControlPassword, but is not saved to the torrc file by + SAVECONF. Added in Tor 0.2.0.20-rc. + + __ReloadTorrcOnSIGHUP + + If this option is true (the default), we reload the torrc from disk + every time we get a SIGHUP (from the controller or via a signal). + Otherwise, we don't. This option exists so that controllers can keep + their options from getting overwritten when a user sends Tor a HUP for + some other reason (for example, to rotate the logs). + + (Boolean. Default: "1") + + __OwningControllerProcess + + If this option is set to a process ID, Tor will periodically check + whether a process with the specified PID exists, and exit if one + does not. Added in Tor 0.2.2.28-beta. This option's intended use + is documented in section 3.23 with the related TAKEOWNERSHIP + command. + + Note that this option can only specify a single process ID, unlike + the TAKEOWNERSHIP command which can be sent along multiple control + connections. + + (String. Default: unset.) + + __OwningControllerFD + + If this option is a valid socket, Tor will start with an open control + connection on this socket. Added in Tor 0.3.3.1-alpha. + + This socket will be an owning controller, as if it had already called + TAKEOWNERSHIP. It will be automatically authenticated. This option + should only be used by other programs that are starting Tor. + + This option cannot be changed via SETCONF; it must be set in a torrc or + via the command line. + + (Integer. Default: -1.) 
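A minimal sketch of how a program that starts Tor might use the __OwningControllerProcess option described above. It only builds torrc text; the function name, the default port, and the choice to enable CookieAuthentication are assumptions made for illustration and are not taken from this specification.

```python
import os


def owning_controller_torrc(controller_pid=None, control_port=9051):
    """Return torrc text that ties a tor process to a controller's PID.

    Hypothetical helper: the name and the default port are illustrative.
    """
    if controller_pid is None:
        # Default to the PID of the process that is about to launch tor.
        controller_pid = os.getpid()
    lines = [
        f"ControlPort {control_port}",
        "CookieAuthentication 1",
        # Tor periodically checks that this PID exists and exits if it does not.
        f"__OwningControllerProcess {controller_pid}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(owning_controller_torrc(), end="")
```

A launcher would typically write this text to the torrc it hands to tor, then connect to the control port and send TAKEOWNERSHIP, since the option alone can only name a single process ID.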
+ + __DisableSignalHandlers + + If this option is set to true during startup, then Tor will not install + any signal handlers to watch for POSIX signals. The SIGNAL controller + command will still work. + + This option is meant for embedding Tor inside another process, when + the controlling process would rather handle signals on its own. + + This option cannot be changed via SETCONF; it must be set in a torrc or + via the command line. + + (Boolean. Default: 0.) + +5.5. Phases from the Bootstrap status event. + + [For the bootstrap phases reported by Tor prior to 0.4.0.x, see + Section 5.6.] + + This section describes the various bootstrap phases currently reported + by Tor. Controllers should not assume that the percentages and tags + listed here will continue to match up, or even that the tags will stay + in the same order. Some phases might also be skipped (not reported) + if the associated bootstrap step is already complete, or if the phase + no longer is necessary. Only "starting" and "done" are guaranteed to + exist in all future versions. + + Current Tor versions enter these phases in order, monotonically. + Future Tors MAY revisit earlier phases, for example, if the network + fails. + +5.5.1. Overview of Bootstrap reporting. + + Bootstrap phases can be viewed as belonging to one of three stages: + + 1. Initial connection to a Tor relay or bridge + 2. Obtaining directory information + 3. Building an application circuit + + Tor doesn't specifically enter Stage 1; that is a side effect of + other actions that Tor is taking. Tor could be making a connection + to a fallback directory server, or it could be making a connection + to a guard candidate. Either one counts as Stage 1 for the purposes + of bootstrap reporting. + + Stage 2 might involve Tor contacting directory servers, or it might + involve reading cached directory information from a previous + session. 
Large parts of Stage 2 might be skipped if there is already + enough cached directory information to build circuits. Tor will + defer reporting progress in Stage 2 until Stage 1 is complete. + + Tor defers this reporting because Tor can already have enough + directory information to build circuits, yet not be able to connect + to a relay. Without that deferral, a user might misleadingly see Tor + stuck at a large amount of progress when something as fundamental as + making a TCP connection to any relay is failing. + + Tor also doesn't specifically enter Stage 3; that is a side effect + of Tor building circuits for some purpose or other. In a typical + client, Tor builds predicted circuits to provide lower latency for + application connection requests. In Stage 3, Tor might make new + connections to relays or bridges that it did not connect to in Stage + 1. + +5.5.2. Phases in Bootstrap Stage 1. + + Phase 0: + tag=starting summary="Starting" + + Tor starts out in this phase. + + Phase 1: + tag=conn_pt summary="Connecting to pluggable transport" + [This phase is new in 0.4.0.x] + + Tor is making a TCP connection to the transport plugin for a + pluggable transport. Tor will use this pluggable transport to make + its first connection to a bridge. + + Phase 2: + tag=conn_done_pt summary="Connected to pluggable transport" + [New in 0.4.0.x] + + Tor has completed its TCP connection to the transport plugin for the + pluggable transport. + + Phase 3: + tag=conn_proxy summary="Connecting to proxy" + [New in 0.4.0.x] + + Tor is making a TCP connection to a proxy to make its first + connection to a relay or bridge. + + Phase 4: + tag=conn_done_proxy summary="Connected to proxy" + [New in 0.4.0.x] + + Tor has completed its TCP connection to a proxy to make its first + connection to a relay or bridge. 
+ + Phase 5: + tag=conn summary="Connecting to a relay" + [New in 0.4.0.x; prior versions of Tor had a "conn_dir" phase that + sometimes but not always corresponded to connecting to a directory server] + + Tor is making its first connection to a relay. This might be through + a pluggable transport or proxy connection that Tor has already + established. + + Phase 10: + tag=conn_done summary="Connected to a relay" + [New in 0.4.0.x] + + Tor has completed its first connection to a relay. + + Phase 14: + tag=handshake summary="Handshaking with a relay" + [New in 0.4.0.x; prior versions of Tor had a "handshake_dir" phase] + + Tor is in the process of doing a TLS handshake with a relay. + + Phase 15: + tag=handshake_done summary="Handshake with a relay done" + [New in 0.4.0.x] + + Tor has completed its TLS handshake with a relay. + +5.5.3. Phases in Bootstrap Stage 2. + + Phase 20: + tag=onehop_create summary="Establishing an encrypted directory connection" + [prior to 0.4.0.x, this was numbered 15] + + Once TLS is finished with a relay, Tor will send a CREATE_FAST cell + to establish a one-hop circuit for retrieving directory information. + It will remain in this phase until it receives the CREATED_FAST cell + back, indicating that the circuit is ready. + + Phase 25: + tag=requesting_status summary="Asking for networkstatus consensus" + [prior to 0.4.0.x, this was numbered 20] + + Once we've finished our one-hop circuit, we will start a new stream + for fetching the networkstatus consensus. We'll stay in this phase + until we get the 'connected' relay cell back, indicating that we've + established a directory connection. + + Phase 30: + tag=loading_status summary="Loading networkstatus consensus" + [prior to 0.4.0.x, this was numbered 25] + + Once we've established a directory connection, we will start fetching + the networkstatus consensus document. 
This could take a while; this + phase is a good opportunity for using the "progress" keyword to indicate + partial progress. + + This phase could stall if the directory server we picked doesn't + have a copy of the networkstatus consensus so we have to ask another, + or it does give us a copy but we don't find it valid. + + Phase 40: + tag=loading_keys summary="Loading authority key certs" + + Sometimes when we've finished loading the networkstatus consensus, + we find that we don't have all the authority key certificates for the + keys that signed the consensus. At that point we put the consensus we + fetched on hold and fetch the keys so we can verify the signatures. + + Phase 45 + tag=requesting_descriptors summary="Asking for relay descriptors" + + Once we have a valid networkstatus consensus and we've checked all + its signatures, we start asking for relay descriptors. We stay in this + phase until we have received a 'connected' relay cell in response to + a request for descriptors. + + [Some versions of Tor (starting with 0.2.6.2-alpha but before + 0.4.0.x): Tor could report having internal paths only; see Section + 5.6] + + Phase 50: + tag=loading_descriptors summary="Loading relay descriptors" + + We will ask for relay descriptors from several different locations, + so this step will probably make up the bulk of the bootstrapping, + especially for users with slow connections. We stay in this phase until + we have descriptors for a significant fraction of the usable relays + listed in the networkstatus consensus (this can be between 25% and 95% + depending on Tor's configuration and network consensus parameters). + This phase is also a good opportunity to use the "progress" keyword to + indicate partial steps. 
+ + [Some versions of Tor (starting with 0.2.6.2-alpha but before + 0.4.0.x): Tor could report having internal paths only; see Section + 5.6] + + Phase 75: + tag=enough_dirinfo summary="Loaded enough directory info to build + circuits" + [New in 0.4.0.x; previously, Tor would misleadingly report the + "conn_or" tag once it had enough directory info.] + +5.5.4. Phases in Bootstrap Stage 3. + + Phase 76: + tag=ap_conn_pt summary="Connecting to pluggable transport to build + circuits" + [New in 0.4.0.x] + + This is similar to conn_pt, except for making connections to + additional relays or bridges that Tor needs to use to build + application circuits. + + Phase 77: + tag=ap_conn_done_pt summary="Connected to pluggable transport to build circuits" + [New in 0.4.0.x] + + This is similar to conn_done_pt, except for making connections to + additional relays or bridges that Tor needs to use to build + application circuits. + + Phase 78: + tag=ap_conn_proxy summary="Connecting to proxy to build circuits" + [New in 0.4.0.x] + + This is similar to conn_proxy, except for making connections to + additional relays or bridges that Tor needs to use to build + application circuits. + + Phase 79: + tag=ap_conn_done_proxy summary="Connected to proxy to build circuits" + [New in 0.4.0.x] + + This is similar to conn_done_proxy, except for making connections to + additional relays or bridges that Tor needs to use to build + application circuits. + + Phase 80: + tag=ap_conn summary="Connecting to a relay to build circuits" + [New in 0.4.0.x] + + This is similar to conn, except for making connections to additional + relays or bridges that Tor needs to use to build application + circuits. + + Phase 85: + tag=ap_conn_done summary="Connected to a relay to build circuits" + [New in 0.4.0.x] + + This is similar to conn_done, except for making connections to + additional relays or bridges that Tor needs to use to build + application circuits. 
+ + Phase 89: + tag=ap_handshake summary="Finishing handshake with a relay to build circuits" + [New in 0.4.0.x] + + This is similar to handshake, except for making connections to + additional relays or bridges that Tor needs to use to build + application circuits. + + Phase 90: + tag=ap_handshake_done summary="Handshake finished with a relay to build circuits" + [New in 0.4.0.x] + + This is similar to handshake_done, except for making connections to + additional relays or bridges that Tor needs to use to build + application circuits. + + Phase 95: + tag=circuit_create summary="Establishing a[n internal] Tor circuit" + [prior to 0.4.0.x, this was numbered 90] + + Once we've finished our TLS handshake with the first hop of a circuit, + we will set about trying to make some 3-hop circuits in case we need them + soon. + + [Some versions of Tor (starting with 0.2.6.2-alpha but before + 0.4.0.x): Tor could report having internal paths only; see Section + 5.6] + + Phase 100: + tag=done summary="Done" + + A full 3-hop circuit has been established. Tor is ready to handle + application connections now. + + [Some versions of Tor (starting with 0.2.6.2-alpha but before + 0.4.0.x): Tor could report having internal paths only; see Section + 5.6] + +5.6. Bootstrap phases reported by older versions of Tor + + These phases were reported by Tor older than 0.4.0.x. For newer + versions of Tor, see Section 5.5. + + [Newer versions of Tor (0.2.6.2-alpha and later): + If the consensus contains Exits (the typical case), Tor will build both + exit and internal circuits. When bootstrap completes, Tor will be ready + to handle an application requesting an exit circuit to services like the + World Wide Web. + + If the consensus does not contain Exits, Tor will only build internal + circuits. In this case, earlier statuses will have included "internal" + as indicated above. 
When bootstrap completes, Tor will be ready to handle + an application requesting an internal circuit to hidden services at + ".onion" addresses. + + If a future consensus contains Exits, exit circuits may become available.] + + Phase 0: + tag=starting summary="Starting" + + Tor starts out in this phase. + + Phase 5: + tag=conn_dir summary="Connecting to directory server" + + Tor sends this event as soon as Tor has chosen a directory server -- + e.g. one of the authorities if bootstrapping for the first time or + after a long downtime, or one of the relays listed in its cached + directory information otherwise. + + Tor will stay at this phase until it has successfully established + a TCP connection with some directory server. Problems in this phase + generally happen because Tor doesn't have a network connection, or + because the local firewall is dropping SYN packets. + + Phase 10: + tag=handshake_dir summary="Finishing handshake with directory server" + + This event occurs when Tor establishes a TCP connection with a relay or + authority used as a directory server (or its https proxy if it's using + one). Tor remains in this phase until the TLS handshake with the relay + or authority is finished. + + Problems in this phase generally happen because Tor's firewall is + doing more sophisticated MITM attacks on it, or doing packet-level + keyword recognition of Tor's handshake. + + Phase 15: + tag=onehop_create summary="Establishing an encrypted directory connection" + + Once TLS is finished with a relay, Tor will send a CREATE_FAST cell + to establish a one-hop circuit for retrieving directory information. + It will remain in this phase until it receives the CREATED_FAST cell + back, indicating that the circuit is ready. + + Phase 20: + tag=requesting_status summary="Asking for networkstatus consensus" + + Once we've finished our one-hop circuit, we will start a new stream + for fetching the networkstatus consensus. 
We'll stay in this phase + until we get the 'connected' relay cell back, indicating that we've + established a directory connection. + + Phase 25: + tag=loading_status summary="Loading networkstatus consensus" + + Once we've established a directory connection, we will start fetching + the networkstatus consensus document. This could take a while; this + phase is a good opportunity for using the "progress" keyword to indicate + partial progress. + + This phase could stall if the directory server we picked doesn't + have a copy of the networkstatus consensus so we have to ask another, + or it does give us a copy but we don't find it valid. + + Phase 40: + tag=loading_keys summary="Loading authority key certs" + + Sometimes when we've finished loading the networkstatus consensus, + we find that we don't have all the authority key certificates for the + keys that signed the consensus. At that point we put the consensus we + fetched on hold and fetch the keys so we can verify the signatures. + + Phase 45 + tag=requesting_descriptors summary="Asking for relay descriptors + [ for internal paths]" + + Once we have a valid networkstatus consensus and we've checked all + its signatures, we start asking for relay descriptors. We stay in this + phase until we have received a 'connected' relay cell in response to + a request for descriptors. + + [Newer versions of Tor (0.2.6.2-alpha and later): + If the consensus contains Exits (the typical case), Tor will ask for + descriptors for both exit and internal paths. If not, Tor will only ask + for descriptors for internal paths. In this case, this status will + include "internal" as indicated above.] + + Phase 50: + tag=loading_descriptors summary="Loading relay descriptors[ for internal + paths]" + + We will ask for relay descriptors from several different locations, + so this step will probably make up the bulk of the bootstrapping, + especially for users with slow connections. 
We stay in this phase until + we have descriptors for a significant fraction of the usable relays + listed in the networkstatus consensus (this can be between 25% and 95% + depending on Tor's configuration and network consensus parameters). + This phase is also a good opportunity to use the "progress" keyword to + indicate partial steps. + + [Newer versions of Tor (0.2.6.2-alpha and later): + If the consensus contains Exits (the typical case), Tor will download + descriptors for both exit and internal paths. If not, Tor will only + download descriptors for internal paths. In this case, this status will + include "internal" as indicated above.] + + Phase 80: + tag=conn_or summary="Connecting to the Tor network[ internally]" + + Once we have a valid consensus and enough relay descriptors, we choose + entry guard(s) and start trying to build some circuits. This step + is similar to the "conn_dir" phase above; the only difference is + the context. + + If Tor starts with enough recent cached directory information, + its first bootstrap status event will be for the conn_or phase. + + [Newer versions of Tor (0.2.6.2-alpha and later): + If the consensus contains Exits (the typical case), Tor will build both + exit and internal circuits. If not, Tor will only build internal circuits. + In this case, this status will include "internal(ly)" as indicated above.] + + Phase 85: + tag=handshake_or summary="Finishing handshake with first hop[ of internal + circuit]" + + This phase is similar to the "handshake_dir" phase, but it gets reached + if we finish a TCP connection to a Tor relay and we have already reached + the "conn_or" phase. We'll stay in this phase until we complete a TLS + handshake with a Tor relay. + + [Newer versions of Tor (0.2.6.2-alpha and later): + If the consensus contains Exits (the typical case), Tor may be finishing + a handshake with the first hop of either an exit or internal circuit. In + this case, it won't specify which type. 
If the consensus contains no Exits, + Tor will only build internal circuits. In this case, this status will + include "internal" as indicated above.] + + Phase 90: + tag=circuit_create summary="Establishing a[n internal] Tor circuit" + + Once we've finished our TLS handshake with the first hop of a circuit, + we will set about trying to make some 3-hop circuits in case we need them + soon. + + [Newer versions of Tor (0.2.6.2-alpha and later): + If the consensus contains Exits (the typical case), Tor will build both + exit and internal circuits. If not, Tor will only build internal circuits. + In this case, this status will include "internal" as indicated above.] + + Phase 100: + tag=done summary="Done" + + A full 3-hop circuit has been established. Tor is ready to handle + application connections now. + + [Newer versions of Tor (0.2.6.2-alpha and later): + If the consensus contains Exits (the typical case), Tor will build both + exit and internal circuits. At this stage, Tor will be ready to handle + an application requesting an exit circuit to services like the World + Wide Web. + + If the consensus does not contain Exits, Tor will only build internal + circuits. In this case, earlier statuses will have included "internal" + as indicated above. At this stage, Tor will be ready to handle an + application requesting an internal circuit to hidden services at ".onion" + addresses. + + If a future consensus contains Exits, exit circuits may become available.] diff --git a/attic/text_formats/dir-list-spec.txt b/attic/text_formats/dir-list-spec.txt new file mode 100644 index 0000000..65af536 --- /dev/null +++ b/attic/text_formats/dir-list-spec.txt @@ -0,0 +1,529 @@ + + Tor Directory List Format + Tim Wilson-Brown (teor) + +Table of Contents + + 1. Scope and Preliminaries + 1.1. Format Overview + 1.2. Acknowledgements + 1.3. Format Versions + 1.4. Future Plans + 2. Format Details + 2.1. Nonterminals + 2.2. List Header + 2.2.1. List Header Format + 2.3. 
List Generation + 2.3.1. List Generation Format + 2.4. Directory Entry + 2.4.1. Directory Entry Format + 3. Usage Considerations + 3.1. Caching + 3.2. Retrieving Directory Information + 3.3. Fallback Reliability + A.1. Sample Data + A.1.1. Sample Fallback List Header + A.1.2. Sample Fallback List Generation + A.1.3. Sample Fallback Entries + +1. Scope and Preliminaries + + This document describes the format of Tor's directory lists, which are + compiled and hard-coded into the tor binary. There is currently one + list: the fallback directory mirrors. This list is also parsed by other + libraries, like stem and metrics-lib. Alternate Tor implementations can + use this list to bootstrap from the latest public Tor directory + information. + + The FallbackDir feature was introduced by proposal 210, and was first + supported by Tor in Tor version 0.2.4.7-alpha. The first hard-coded + list was shipped in 0.2.8.1-alpha. + + The hard-coded fallback directory list is located in the tor source + repository at: + + src/app/config/fallback_dirs.inc + + In Tor 0.3.4 and earlier, the list is located at: + + src/or/fallback_dirs.inc + + This document describes version 2.0.0 and later of the directory list + format. + + Legacy, semi-structured versions of the fallback list were released with + Tor 0.2.8.1-alpha through Tor 0.3.1.9. We call this format version 1. + Stem and Relay Search have parsers for this legacy format. + +1.1. Format Overview + + A directory list is a C code fragment containing an array of C string + constants. Each double-quoted C string constant is a valid torrc + FallbackDir entry. Each entry contains various data fields. + + Directory lists do not include the C array's declaration, or the array's + terminating NULL. Entries in directory lists do not include the + FallbackDir torrc option. These are handled by the including C code. + + Directory lists also include C-style comments and whitespace. 
The + presence of whitespace may be significant, but the amount of whitespace + is never significant. The type of whitespace is not significant to the + C compiler or Tor C string parser. However, other parsers MAY rely on + the distinction between newlines and spaces. (They MAY also rely on + newlines and spaces being the only whitespace characters in the list.) + + The directory entry C string constants are split over multiple lines for + readability. Structured C-style comments are used to provide additional + data fields. This information is not used by Tor, but may be of interest + to other libraries. + + The order of directory entries and data fields is not significant, + except where noted below. + +1.2. Acknowledgements + + The original fallback directory script and format were created by + weasel. The current script uses code written by gsathya & karsten. + + This specification was revised after feedback from: + + Damian Johnson ("atagar") + Iain R. Learmonth ("irl") + +1.3. Format Versions + + The directory list format uses semantic versioning: https://semver.org + + In particular: + * major versions are used for incompatible changes, like + removing non-optional fields + * minor versions are used for compatible changes, like adding + fields + * patch versions are for bug fixes, like fixing an + incorrectly-formatted Summary item + + 1.0.0 - The legacy fallback directory list format + + 2.0.0 - Adds name and extrainfo structured comments, and section separator + comments to make the list easier to parse. Also adds a source list + comment to the header. + + 3.0.0 - Modifies the format of the source list comment. + +1.4. Future Plans + + Tor also has an auth_dirs.inc file, but it is not yet in this format. + Tor uses slightly different formats for authorities and fallback + directory mirrors, so we will need to make some changes to tor so that + it parses this format. (We will also need to add authority-specific + information to this format.) See #24818 for details. 
+ + We want to add a torrc option so operators can opt-in their relays as + fallback directory mirrors. This gives us a signed opt-in confirmation. + (We can also continue to accept whitelist entries, and do other checks.) + We need to write a short proposal, and make some changes to tor and the + fallback update script. See #24839 for details. + +2. Format Details + + Directory lists contain the following sections: + + - List Header (exactly once) + - List Generation (exactly once, may be empty) + - Directory Entry (zero or more times) + + Each section (or entry) ends with a separator. + +2.1. Nonterminals + + The following nonterminals are defined in the Onionoo details document + specification: + + dir_address + fingerprint + nickname + + See https://metrics.torproject.org/onionoo.html#details + + The following nonterminals are defined in the "Tor directory protocol" + specification in dir-spec.txt: + + Keyword + ArgumentChar + NL (newline) + SP (space) + bool (must not be confused with Onionoo's JSON "boolean") + + We derive the following nonterminals from Onionoo and dir-spec.txt: + + ipv4_or_port ::= port from an IPv4 or_addresses item + + The ipv4_or_port is the port part of an IPv4 address from the + Onionoo or_addresses list. + + ipv6_or_address ::= an IPv6 or_addresses item + + The ipv6_or_address is an IPv6 address and port from the Onionoo + or_addresses list. The address MAY be in the canonical RFC 5952 + IPv6 address format. + + A key-value pair: + + value ::= Zero or more ArgumentChar, excluding the following strings: + * a double quotation mark (DQUOTE), and + * the C comment terminators ("/*" and "*/"). + + Note that the C++ comment ("//") and equals sign ("=") are + not excluded, because they are reserved for future use in + base64 values. 
+ + key_value ::= Keyword "=" value + + We also define these additional nonterminals: + + number ::= An optional negative sign ("-"), followed by one or more + numeric characters ([0-9]), with an optional decimal part + (".", followed by one or more numeric characters). + + separator ::= "/*" SP+ "=====" SP+ "*/" + +2.2. List Header + + The list header consists of a number of key-value pairs, embedded in + C-style comments. + +2.2.1. List Header Format + + "/*" SP+ "type=" Keyword SP+ "*/" SP* NL + + [At start, exactly once.] + + The type of directory entries in the list. Parsers SHOULD exit with + an error if this is not the first line of the list, or if the value + is anything other than "fallback". + + "/*" SP+ "version=" version_number SP+ "*/" SP* NL + + [In second position, exactly once.] + + The version of the directory list format. + + version_number is a semantic version, see the "Format Versions" + section for details. + + Version 1.0.0 represents the undocumented, legacy fallback list + format(s). Version 2.0.0 and later are documented by this + specification. + + "/*" SP+ "timestamp=" number SP+ "*/" SP* NL + + [Exactly once.] + + A positive integer that indicates when this directory list was + generated. This timestamp is guaranteed to increase for every + version 2.0.0 and later directory list. + + The current timestamp format is YYYYMMDDHHMMSS, as an integer. + + "/*" SP+ "source=" Keyword ("," Keyword)* SP+ "*/" SP* NL + + [Zero or one time.] + + A list of the sources of the directory entries in the list. + + As of version 3.0.0, the possible sources are: + * "offer-list" - the fallback_offer_list file in the fallback-scripts + repository. + * "descriptor" - one or more signed descriptors, each containing an + "offer-fallback-dir" line. This feature will be + implemented in ticket #24839. + * "fallback" - a fallback_dirs.inc file from a tor repository. + Used in check_existing mode. + + Before #24839 is implemented, the default is "offer-list". 
During the + transition to signed offers, it will be "descriptor,offer-list". + Afterwards, it will be "descriptor". + + In version 2.0.0, only one source name was allowed after "source=", + and the deprecated "whitelist" source name was used instead of + "offer-list". + + This line was added in version 2.0.0 of this specification. The format + of this line was modified in version 3.0.0 of this specification. + + "/*" SP+ key_value SP+ "*/" SP* NL + + [Zero or more times.] + + Future releases may include additional header fields. Parsers MUST NOT + rely on the order of these additional fields. Additional header fields + will be accompanied by a minor version increment. + + separator SP* NL + + The list header ends with the section separator. + +2.3. List Generation + + The list generation information consists of human-readable prose + describing the content and origin of this directory list. It is contained + in zero or more C-style comments, and may contain multi-line comments and + uncommented C code. + + In particular, this section may contain C-style comments that contain + an equals ("=") character. It may also be entirely empty. + + Future releases may arbitrarily change the content of this section. + Parsers MUST NOT rely on a version increment when the format changes. + +2.3.1. List Generation Format + + In general, parsers MUST NOT rely on the format of this section. + + Parsers MAY rely on the following details: + + The list generation section MUST NOT be a valid directory entry. + + The list generation summary MUST end with a section separator: + + separator SP* NL + + There MUST NOT be any section separators in the list generation + section, other than the terminating section separator. + +2.4. Directory Entry + + A directory entry consists of a C string constant, and one or more + C-style comments. The C string constant is a valid argument to the + DirAuthority or FallbackDir torrc option. 
The section also contains + additional key-value fields in C-style comments. + + The list of fallback entries does not include the directory + authorities: they are in a separate list. (The Tor implementation combines + these lists after parsing them, and applies the DirAuthorityFallbackRate + to their weights.) + +2.4.1. Directory Entry Format + + If a directory entry does not conform to this format, the entry SHOULD + be ignored by parsers. + + DQUOTE dir_address SP+ "orport=" ipv4_or_port SP+ + "id=" fingerprint DQUOTE SP* NL + + [At start, exactly once, on a single line.] + + This line consists of the following fields: + + dir_address + + An IPv4 address and DirPort for this directory, as defined by + Onionoo. In this format version, all IPv4 addresses and DirPorts + are guaranteed to be non-zero. (For IPv4 addresses, this means + that they are not equal to "0.0.0.0".) + + ipv4_or_port + + An IPv4 ORPort for this directory, derived from Onionoo. In this + format version, all IPv4 ORPorts are guaranteed to be non-zero. + + fingerprint + + The relay fingerprint of this directory, as defined by Onionoo. + All relay fingerprints are guaranteed to have one or more non-zero + digits. + + Note: + + Each double-quoted C string line that occurs after the first line, + starts with space inside the quotes. This is a requirement of the + Tor implementation. + + DQUOTE SP+ "ipv6=" ipv6_or_address DQUOTE SP* NL + + [Zero or one time.] + + The IPv6 address and ORPort for this directory, as defined by + Onionoo. If present, IPv6 addresses and ORPorts are guaranteed to be + non-zero. (For IPv6 addresses, this means that they are not equal to + "[::]".) + + DQUOTE SP+ "weight=" number DQUOTE SP* NL + + [Zero or one time.] + + A non-negative, real-numbered weight for this directory. + The default fallback weight is 1.0, and the default + DirAuthorityFallbackRate is 1.0 in legacy Tor versions, and 0.1 in + recent Tor versions. 
+ + weight was removed in version 2.0.0, but is documented because it + may be of interest to libraries implementing Tor's fallback + behaviour. + + DQUOTE SP+ key_value DQUOTE SP* NL + + [Zero or more times.] + + Future releases may include additional data fields in double-quoted + C string constants. Parsers MUST NOT rely on the order of these + additional fields. Additional data fields will be accompanied by a + minor version increment. + + "/*" SP+ "nickname=" nickname* SP+ "*/" SP* NL + + [Exactly once.] + + The nickname for this directory, as defined by Onionoo. An + empty nickname indicates that the nickname is unknown. + + The first fallback list in the 2.0.0 format had nickname lines, but + they were all empty. + + "/*" SP+ "extrainfo=" bool SP+ "*/" SP* NL + + [Exactly once.] + + An integer flag that indicates whether this directory caches + extra-info documents. Set to 1 if the directory claimed that it + cached extra-info documents in its descriptor when the list was + created. 0 indicates that it did not, or its descriptor was not + available. + + The first fallback list in the 2.0.0 format had extrainfo lines, but + they were all zero. + + "/*" SP+ key_value SP+ "*/" SP* NL + + [Zero or more times.] + + Future releases may include additional data fields in C-style + comments. Parsers MUST NOT rely on the order of these additional + fields. Additional data fields will be accompanied by a minor version + increment. + + separator SP* NL + + [Exactly once.] + + Each directory entry ends with the section separator. + + "," SP* NL + + [Exactly once.] + + The comma terminates the C string constant. (Multiple C string + constants separated by whitespace or comments are coalesced by + the C compiler.) + +3. Usage Considerations + + This section contains recommended library behaviours. It does not affect + the format of directory lists. + +3.1. Caching + + The fallback list typically changes once every 6-12 months. 
The data in + the list represents the state of the fallback directory entries when the + list was created. Fallbacks can and do change their details over time. + + Libraries SHOULD parse and cache the most recent version of these lists + during their build or release processes. Libraries MUST NOT retrieve the + lists by default every time they are deployed or executed. + + The latest fallback list can be retrieved from: + + https://gitweb.torproject.org/tor.git/plain/src/or/fallback_dirs.inc + + Libraries MUST NOT rely on the availability of the server that hosts + these lists. + + The list can also be retrieved using: + + git clone https://git.torproject.org/tor.git + + If you just want the latest list, you may wish to perform a shallow + clone. + +3.2. Retrieving Directory Information + + Some libraries retrieve directory documents directly from the Tor + Directory Authorities. The directory authorities are designed to support + Tor relay and client bootstrap, and MAY choose to rate-limit library + access. Libraries MAY provide a user-agent in their requests, if they + are not intended to support anonymous operation. (User agents are a + fingerprinting vector.) + + Libraries SHOULD consider the potential load on the authorities, and + whether other sources can meet their needs. + + Libraries that require high-uptime availability of Tor directory + information should investigate the following options: + + * OnionOO: https://metrics.torproject.org/onionoo.html + * Third-party OnionOO mirrors are also available + * CollecTor: https://collector.torproject.org/ + * Fallback Directory Mirrors + + Onionoo and CollecTor are typically updated every hour on a regular + schedule. Fallbacks update their own directory information at random + intervals, see dir-spec for details. + +3.3. Fallback Reliability + + The fallback list is typically regenerated when the fallback failure + rate exceeds 25%. 
Libraries SHOULD NOT rely on any particular fallback + being available, or some proportion of fallbacks being available. + + Libraries that use fallbacks MAY wish to query an authority after a + few fallback queries fail. For example, Tor clients try 3-4 fallbacks + before trying an authority. + +A.1. Sample Data + + A sample version 2.0.0 fallback list is available here: + + https://trac.torproject.org/projects/tor/raw-attachment/ticket/22759/fallback_dirs_new_format_version.4.inc + + A sample transitional version 2.0.0 fallback list is available here: + + https://raw.githubusercontent.com/teor2345/tor/fallback-format-2-v4/src/or/fallback_dirs.inc + +A.1.1. Sample Fallback List Header + +/* type=fallback */ +/* version=2.0.0 */ +/* ===== */ + +A.1.2. Sample Fallback List Generation + +/* Whitelist & blacklist excluded 1326 of 1513 candidates. */ +/* Checked IPv4 DirPorts served a consensus within 15.0s. */ +/* +Final Count: 151 (Eligible 187, Target 392 (1963 * 0.20), Max 200) +Excluded: 36 (Same Operator 27, Failed/Skipped Download 9, Excess 0) +Bandwidth Range: 1.3 - 40.0 MByte/s +*/ +/* +Onionoo Source: details Date: 2017-05-16 07:00:00 Version: 4.0 +URL: https:onionoo.torproject.orgdetails?fields=fingerprint%2Cnickname%2Ccontact%2Clast_changed_address_or_port%2Cconsensus_weight%2Cadvertised_bandwidth%2Cor_addresses%2Cdir_address%2Crecommended_version%2Cflags%2Ceffective_family%2Cplatform&flag=V2Dir&type=relay&last_seen_days=-0&first_seen_days=30- +*/ +/* +Onionoo Source: uptime Date: 2017-05-16 07:00:00 Version: 4.0 +URL: https:onionoo.torproject.orguptime?first_seen_days=30-&flag=V2Dir&type=relay&last_seen_days=-0 +*/ +/* ===== */ + +A.1.3. 
Sample Fallback Entries + +"176.10.104.240:80 orport=443 id=0111BA9B604669E636FFD5B503F382A4B7AD6E80" +/* nickname=foo */ +/* extrainfo=1 */ +/* ===== */ +, +"5.9.110.236:9030 orport=9001 id=0756B7CD4DFC8182BE23143FAC0642F515182CEB" +" ipv6=[2a01:4f8:162:51e2::2]:9001" +/* nickname= */ +/* extrainfo=0 */ +/* ===== */ +, diff --git a/attic/text_formats/dir-spec.txt b/attic/text_formats/dir-spec.txt new file mode 100644 index 0000000..f133c39 --- /dev/null +++ b/attic/text_formats/dir-spec.txt @@ -0,0 +1,4299 @@ + + Tor directory protocol, version 3 + +Table of Contents + + 0. Scope and preliminaries + 0.1. History + 0.2. Goals of the version 3 protocol + 0.3. Some Remaining questions + 1. Outline + 1.1. What's different from version 2? + 1.2. Document meta-format + 1.3. Signing documents + 1.4. Voting timeline + 2. Router operation and formats + 2.1. Uploading server descriptors and extra-info documents + 2.1.1. Server descriptor format + 2.1.2. Extra-info document format + 2.1.3. Nonterminals in server descriptors + 3. Directory authority operation and formats + 3.1. Creating key certificates + 3.2. Accepting server descriptor and extra-info document uploads + 3.3. Computing microdescriptors + 3.4. Exchanging votes + 3.4.1. Vote and consensus status document formats + 3.4.2. Assigning flags in a vote + 3.4.3. Serving bandwidth list files + 3.5. Downloading missing certificates from other directory authorities + 3.6. Downloading server descriptors from other directory authorities + 3.7. Downloading extra-info documents from other directory authorities + 3.8. Computing a consensus from a set of votes + 3.8.0.1. Deciding which Ids to include. + 3.8.0.2. Deciding which descriptors to include + 3.8.1. Forward compatibility + 3.8.2. Encoding port lists + 3.8.3. Computing Bandwidth Weights + 3.9. Computing consensus flavors + 3.9.1. ns consensus + 3.9.2. Microdescriptor consensus + 3.10. Exchanging detached signatures + 3.11. Publishing the signed consensus + 4.
Directory cache operation + 4.1. Downloading consensus status documents from directory authorities + 4.2. Downloading server descriptors from directory authorities + 4.3. Downloading microdescriptors from directory authorities + 4.4. Downloading extra-info documents from directory authorities + 4.5. Consensus diffs + 4.5.1. Consensus diff format + 4.5.2. Serving and requesting diff + 4.6 Retrying failed downloads + 5. Client operation + 5.1. Downloading network-status documents + 5.2. Downloading server descriptors or microdescriptors + 5.3. Downloading extra-info documents + 5.4. Using directory information + 5.4.1. Choosing routers for circuits. + 5.4.2. Managing naming + 5.4.3. Software versions + 5.4.4. Warning about a router's status. + 5.5. Retrying failed downloads + 6. Standards compliance + 6.1. HTTP headers + 6.2. HTTP status codes + A. Consensus-negotiation timeline. + B. General-use HTTP URLs + C. Converting a curve25519 public key to an ed25519 public key + D. Inferring missing proto lines. + E. Limited ed diff format + +0. Scope and preliminaries + + This directory protocol is used by Tor version 0.2.0.x-alpha and later. + See dir-spec-v1.txt for information on the protocol used up to the + 0.1.0.x series, and dir-spec-v2.txt for information on the protocol + used by the 0.1.1.x and 0.1.2.x series. + + This document merges and supersedes the following proposals: + + 101 Voting on the Tor Directory System + 103 Splitting identity key from regularly used signing key + 104 Long and Short Router Descriptors + + XXX timeline + XXX fill in XXXXs + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL + NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and + "OPTIONAL" in this document are to be interpreted as described in + RFC 2119. + +0.1. History + + The earliest versions of Onion Routing shipped with a list of known + routers and their keys. When the set of routers changed, users needed to + fetch a new list. 
+ + The Version 1 Directory protocol + -------------------------------- + + Early versions of Tor (0.0.2) introduced "Directory authorities": servers + that served signed "directory" documents containing a list of signed + "server descriptors", along with a short summary of the status of each + router. Thus, clients could get up-to-date information on the state of + the network automatically, and be certain that the list they were getting + was attested by a trusted directory authority. + + Later versions (0.0.8) added directory caches, which download + directories from the authorities and serve them to clients. Non-caches + fetch from the caches in preference to fetching from the authorities, thus + distributing bandwidth requirements. + + Also added during the version 1 directory protocol were "router status" + documents: short documents that listed only the up/down status of the + routers on the network, rather than a complete list of all the + descriptors. Clients and caches would fetch these documents far more + frequently than they would fetch full directories. + + The Version 2 Directory Protocol + -------------------------------- + + During the Tor 0.1.1.x series, Tor revised its handling of directory + documents in order to address two major problems: + + * Directories had grown quite large (over 1MB), and most directory + downloads consisted mainly of server descriptors that clients + already had. + + * Every directory authority was a trust bottleneck: if a single + directory authority lied, it could make clients believe for a time + an arbitrarily distorted view of the Tor network. (Clients + trusted the most recent signed document they downloaded.) Thus, + adding more authorities would make the system less secure, not + more. + + To address these, we extended the directory protocol so that + authorities now published signed "network status" documents.
Each + network status listed, for every router in the network: a hash of its + identity key, a hash of its most recent descriptor, and a summary of + what the authority believed about its status. Clients would download + the authorities' network status documents in turn, and believe + statements about routers iff they were attested to by more than half of + the authorities. + + Instead of downloading all server descriptors at once, clients + downloaded only the descriptors that they did not have. Descriptors + were indexed by their digests, in order to prevent malicious caches + from giving different versions of a server descriptor to different + clients. + + Routers began working harder to upload new descriptors only when their + contents were substantially changed. + + +0.2. Goals of the version 3 protocol + + Version 3 of the Tor directory protocol tries to solve the following + issues: + + * A great deal of bandwidth used to transmit server descriptors was + used by two fields that are not actually used by Tor routers + (namely read-history and write-history). We save about 60% by + moving them into a separate document that most clients do not + fetch or use. + + * It was possible under certain perverse circumstances for clients + to download an unusual set of network status documents, thus + partitioning themselves from clients who have a more recent and/or + typical set of documents. Even under the best of circumstances, + clients were sensitive to the ages of the network status documents + they downloaded. Therefore, instead of having the clients + correlate multiple network status documents, we have the + authorities collectively vote on a single consensus network status + document. + + * The most sensitive data in the entire network (the identity keys + of the directory authorities) needed to be stored unencrypted so + that the authorities can sign network-status documents on the fly. 
+ Now, the authorities' identity keys are stored offline, and used + to certify medium-term signing keys that can be rotated. + +0.3. Some Remaining questions + + Things we could solve on a v3 timeframe: + + The SHA-1 hash is showing its age. We should do something about our + dependency on it. We could probably future-proof ourselves here in + this revision, at least so far as documents from the authorities are + concerned. + + Too many things about the authorities are hardcoded by IP. + + Perhaps we should start accepting longer identity keys for routers + too. + + Things to solve eventually: + + Requiring every client to know about every router won't scale forever. + + Requiring every directory cache to know every router won't scale + forever. + + +1. Outline + + There is a small set (say, around 5-10) of semi-trusted directory + authorities. A default list of authorities is shipped with the Tor + software. Users can change this list, but are encouraged not to do so, + in order to avoid partitioning attacks. + + Every authority has a very-secret, long-term "Authority Identity Key". + This is stored encrypted and/or offline, and is used to sign "key + certificate" documents. Every key certificate contains a medium-term + (3-12 months) "authority signing key", that is used by the authority to + sign other directory information. (Note that the authority identity + key is distinct from the router identity key that the authority uses + in its role as an ordinary router.) + + Routers periodically upload signed "router descriptors" to the + directory authorities describing their keys, capabilities, and other + information. Routers may also upload signed "extra-info documents" + containing information that is not required for the Tor protocol. + Directory authorities serve server descriptors indexed by router + identity, or by hash of the descriptor. + + Routers may act as directory caches to reduce load on the directory + authorities.
They announce this in their descriptors. + + Periodically, each directory authority generates a view of + the current descriptors and status for known routers. They send a + signed summary of this view (a "status vote") to the other + authorities. The authorities compute the result of this vote, and sign + a "consensus status" document containing the result of the vote. + + Directory caches download, cache, and re-serve consensus documents. + + Clients, directory caches, and directory authorities all use consensus + documents to find out when their list of routers is out-of-date. + (Directory authorities also use vote statuses.) If it is, they download + any missing server descriptors. Clients download missing descriptors + from caches; caches and authorities download from authorities. + Descriptors are downloaded by the hash of the descriptor, not by the + relay's identity key: this prevents directory servers from attacking + clients by giving them descriptors nobody else uses. + + All directory information is uploaded and downloaded with HTTP. + +1.1. What's different from version 2? + + Clients used to download multiple network status documents, + corresponding roughly to "status votes" above. They would compute the + result of the vote on the client side. + + Authorities used to sign documents using the same private keys they used + for their roles as routers. This forced them to keep these extremely + sensitive keys in memory unencrypted. + + All of the information in extra-info documents used to be kept in the + main descriptors. + +1.2. Document meta-format + + Server descriptors, directories, and running-routers documents all obey the + following lightweight extensible information format. + + The highest level object is a Document, which consists of one or more + Items. Every Item begins with a KeywordLine, followed by zero or more + Objects. 
A KeywordLine begins with a Keyword, optionally followed by + whitespace and more non-newline characters, and ends with a newline. A + Keyword is a sequence of one or more characters in the set [A-Za-z0-9-], + but may not start with -. + An Object is a block of encoded data in pseudo-Privacy-Enhanced-Mail (PEM) + style format: that is, lines of encoded data MAY be wrapped by inserting + an ASCII linefeed ("LF", also called newline, or "NL" here) character + (cf. RFC 4648 §3.1). When line wrapping, implementations MUST wrap lines + at 64 characters. Upon decoding, implementations MUST ignore and discard + all linefeed characters. + + More formally: + + NL ::= The ASCII LF character (hex value 0x0a). + Document ::= (Item | NL)+ + Item ::= KeywordLine Object? + KeywordLine ::= Keyword (WS Argument)* NL + Keyword ::= KeywordStart KeywordChar* + KeywordStart ::= 'A' ... 'Z' | 'a' ... 'z' | '0' ... '9' + KeywordChar ::= KeywordStart | '-' + Argument ::= ArgumentChar+ + ArgumentChar ::= any graphical printing ASCII character. + WS ::= (SP | TAB)+ + Object ::= BeginLine Base64-encoded-data EndLine + BeginLine ::= "-----BEGIN " Keyword (" " Keyword)* "-----" NL + EndLine ::= "-----END " Keyword (" " Keyword)* "-----" NL + + A Keyword may not be "-----BEGIN". + + The BeginLine and EndLine of an Object must use the same keyword. + + When interpreting a Document, software MUST ignore any KeywordLine that + starts with a keyword it doesn't recognize; future implementations MUST NOT + require current clients to understand any KeywordLine not currently + described. + + Other implementations that want to extend Tor's directory format MAY + introduce their own items. The keywords for extension items SHOULD start + with the characters "x-" or "X-", to guarantee that they will not conflict + with keywords used by future versions of Tor. + + In our document descriptions below, we tag Items with a multiplicity in + brackets.
Possible tags are: + + "At start, exactly once": These items MUST occur in every instance of + the document type, and MUST appear exactly once, and MUST be the + first item in their documents. + + "Exactly once": These items MUST occur exactly one time in every + instance of the document type. + + "At end, exactly once": These items MUST occur in every instance of + the document type, and MUST appear exactly once, and MUST be the + last item in their documents. + + "At most once": These items MAY occur zero or one times in any + instance of the document type, but MUST NOT occur more than once. + + "Any number": These items MAY occur zero, one, or more times in any + instance of the document type. + + "Once or more": These items MUST occur at least once in any instance + of the document type, and MAY occur more. + + For forward compatibility, each item MUST allow extra arguments at the + end of the line unless otherwise noted. So if an item's description below + is given as: + + "thing" int int int NL + + then implementations SHOULD accept this string as well: + + "thing 5 9 11 13 16 12" NL + + but not this string: + + "thing 5" NL + + and not this string: + + "thing 5 10 thing" NL + . + + Whenever an item DOES NOT allow extra arguments, we will tag it with + "no extra arguments". + +1.3. Signing documents + + Every signable document below is signed in a similar manner, using a + given "Initial Item", a final "Signature Item", a digest algorithm, and + a signing key. + + The Initial Item must be the first item in the document. + + The Signature Item has the following format: + + <signature item keyword> [arguments] NL SIGNATURE NL + + The "SIGNATURE" Object contains a signature (using the signing key) of + the PKCS#1 1.5 padded digest of the entire document, taken from the + beginning of the Initial item, through the newline after the Signature + Item's keyword and its arguments. + + The signature does not include the algorithmIdentifier specified in PKCS #1.
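The signed portion described above can be sketched in Python as follows. The sketch assumes SHA-1, the default digest for these documents, and uses hypothetical function names (signed_span, document_digest). It only locates and hashes the signed bytes; it does not apply PKCS#1 padding or verify any signature.

```python
import hashlib


def signed_span(document: str, initial_keyword: str, signature_keyword: str) -> str:
    """Return the text from the Initial Item through the newline that ends
    the Signature Item's keyword line (hypothetical helper)."""
    # The Initial Item must be the first item in the document.
    if not document.startswith((initial_keyword + " ", initial_keyword + "\n")):
        raise ValueError("document does not begin with the Initial Item")
    marker = "\n" + signature_keyword
    pos = document.find(marker)
    while pos != -1:
        after = pos + len(marker)
        # Require a keyword boundary so a longer keyword is not matched.
        if document[after:after + 1] in (" ", "\t", "\n"):
            return document[: document.index("\n", after) + 1]
        pos = document.find(marker, pos + 1)
    raise ValueError("Signature Item not found")


def document_digest(document: str, initial_keyword: str, signature_keyword: str) -> bytes:
    """SHA-1 digest of the signed span (assumed default algorithm)."""
    span = signed_span(document, initial_keyword, signature_keyword)
    return hashlib.sha1(span.encode("ascii")).digest()
```

For a server descriptor, for example, the span would run from the "router" line through the end of the "router-signature" line, leaving out the SIGNATURE Object itself.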
+ + Unless specified otherwise, the digest algorithm is SHA-1. + + All documents are invalid unless signed with the correct signing key. + + The "Digest" of a document, unless stated otherwise, is its digest *as + signed by this signature scheme*. + +1.4. Voting timeline + + Every consensus document has a "valid-after" (VA) time, a "fresh-until" + (FU) time and a "valid-until" (VU) time. VA MUST precede FU, which MUST + in turn precede VU. Times are chosen so that every consensus will be + "fresh" until the next consensus becomes valid, and "valid" for a while + after. At least 3 consensuses should be valid at any given time. + + The timeline for a given consensus is as follows: + + VA-DistSeconds-VoteSeconds: The authorities exchange votes. Each authority + uploads its vote to all other authorities. + + VA-DistSeconds-VoteSeconds/2: The authorities try to download any + votes they don't have. + + Authorities SHOULD also reject any votes that other authorities try to + upload after this time. (0.4.4.1-alpha was the first version to reject votes + in this way.) + + Note: Refusing late uploaded votes minimizes the chance of a consensus + split, particularly when authorities are under bandwidth pressure. If an + authority is struggling to upload its vote, and finally uploads to a + fraction of authorities after this period, they will compute a consensus + different from the others. By refusing uploaded votes after this time, + we increase the likelihood that most authorities will use the same vote + set. + + Rejecting late uploaded votes does not fix the problem entirely. If + some authorities are able to download a specific vote, but others fail + to do so, then there may still be a consensus split. However, this + change does remove one common cause of consensus splits. + + VA-DistSeconds: The authorities calculate the consensus and exchange + signatures. (This is the earliest point at which anybody can + possibly get a given consensus if they ask for it.)
+ + VA-DistSeconds/2: The authorities try to download any signatures + they don't have. + + VA: All authorities have a multiply signed consensus. + + VA ... FU: Caches download the consensus. (Note that since caches have + no way of telling what VA and FU are until they have downloaded + the consensus, they assume that the present consensus's VA is + equal to the previous one's FU, and that its FU is one interval after + that.) + + FU: The consensus is no longer the freshest consensus. + + FU ... (the current consensus's VU): Clients download the consensus. + (See note above: clients guess that the next consensus's FU will be + two intervals after the current VA.) + + VU: The consensus is no longer valid; clients should continue to try to + download a new consensus if they have not done so already. + + VU + 24 hours: Clients will no longer use the consensus at all. + + VoteSeconds and DistSeconds MUST each be at least 20 seconds; FU-VA and + VU-FU MUST each be at least 5 minutes. + +2. Router operation and formats + +2.1. Uploading server descriptors and extra-info documents + + ORs SHOULD generate a new server descriptor and a new extra-info + document whenever any of the following events have occurred: + + - A period of time (18 hrs by default) has passed since the last + time a descriptor was generated. + + - A descriptor field other than bandwidth or uptime has changed. + + - Its uptime is less than 24h and bandwidth has changed by a factor of 2 + from the last time a descriptor was generated, and at least a given + interval of time (3 hours by default) has passed since then. + + - Its uptime has been reset (by restarting). + + - It receives a networkstatus consensus in which it is not listed. + + - It receives a networkstatus consensus in which it is listed + with the StaleDesc flag. 
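The regeneration conditions listed above can be sketched as a single predicate. This Python sketch is illustrative only: the RelayState fields and function names are hypothetical, "changed by a factor of 2" is read here as "at least doubled or at most halved", and Tor's actual logic lives in its C sources and covers more cases than this list.

```python
from dataclasses import dataclass

FORCE_INTERVAL = 18 * 3600          # regenerate after 18 hours by default
BANDWIDTH_INTERVAL = 3 * 3600       # minimum gap for a bandwidth-triggered update
BANDWIDTH_UPTIME_LIMIT = 24 * 3600  # bandwidth changes only count below 24h uptime


@dataclass
class RelayState:
    """Hypothetical snapshot of the inputs to the decision."""
    seconds_since_descriptor: int
    other_field_changed: bool      # any field other than bandwidth or uptime
    uptime: int                    # seconds
    uptime_reset: bool             # the relay restarted
    old_bandwidth: int
    new_bandwidth: int
    missing_from_consensus: bool   # received consensus does not list this relay
    listed_as_stale: bool          # received consensus lists it with StaleDesc


def bandwidth_changed_by_factor_of_two(old: int, new: int) -> bool:
    # Assumed reading: the value at least doubled or at most halved.
    return old != new and (new >= 2 * old or 2 * new <= old)


def should_generate_descriptor(s: RelayState) -> bool:
    if s.seconds_since_descriptor >= FORCE_INTERVAL:
        return True
    if s.other_field_changed or s.uptime_reset:
        return True
    if (s.uptime < BANDWIDTH_UPTIME_LIMIT
            and bandwidth_changed_by_factor_of_two(s.old_bandwidth, s.new_bandwidth)
            and s.seconds_since_descriptor >= BANDWIDTH_INTERVAL):
        return True
    return s.missing_from_consensus or s.listed_as_stale
```

Keeping each condition as a separate early return mirrors the bullet list, so a new trigger can be added without touching the others.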
+ + [XXX this list is incomplete; see router_differences_are_cosmetic() + in routerlist.c for others] + + ORs SHOULD NOT publish a new server descriptor or extra-info document + if none of the above events have occurred and not much time has passed + (12 hours by default). + + Tor versions older than 0.3.5.1-alpha ignore uptime when checking for + bandwidth changes. + + After generating a descriptor, ORs upload them to every directory + authority they know, by posting them (in order) to the URL + + http://<hostname:port>/tor/ + + Server descriptors may not exceed 20,000 bytes in length; extra-info + documents may not exceed 50,000 bytes in length. If they do, the + authorities SHOULD reject them. + +2.1.1. Server descriptor format + + Server descriptors consist of the following items. + + In lines that take multiple arguments, extra arguments SHOULD be + accepted and ignored. Many of the nonterminals below are defined in + section 2.1.3. + + Note that many versions of Tor will generate an extra newline at the + end of their descriptors. Implementations MUST tolerate one or + more blank lines at the end of a single descriptor or a list of + concatenated descriptors. New implementations SHOULD NOT generate + such blank lines. + + "router" nickname address ORPort SOCKSPort DirPort NL + + [At start, exactly once.] + + Indicates the beginning of a server descriptor. "nickname" must be a + valid router nickname as specified in section 2.1.3. "address" must + be an IPv4 + address in dotted-quad format. The last three numbers indicate the + TCP ports at which this OR exposes functionality. ORPort is a port at + which this OR accepts TLS connections for the main OR protocol; + SOCKSPort is deprecated and should always be 0; and DirPort is the + port at which this OR accepts directory-related HTTP connections. If + any port is not supported, the value 0 is given instead of a port + number.
(At least one of DirPort and ORPort SHOULD be set; + authorities MAY reject any descriptor with both DirPort and ORPort of + 0.) + + "identity-ed25519" NL "-----BEGIN ED25519 CERT-----" NL certificate + "-----END ED25519 CERT-----" NL + + [Exactly once, in second position in document.] + [No extra arguments] + + The certificate is a base64-encoded Ed25519 certificate (see + cert-spec.txt) with terminating =s removed. When this element + is present, it MUST appear as the first or second element in + the router descriptor. + + The certificate has CERT_TYPE of [04]. It must include a + signed-with-ed25519-key extension (see cert-spec.txt, + section 2.2.1), so that we can extract the master identity key. + + [Before Tor 0.4.5.1-alpha, this field was optional.] + + "master-key-ed25519" SP MasterKey NL + + [Exactly once] + + Contains the base-64 encoded ed25519 master key as a single + argument. If it is present, it MUST match the identity key + in the identity-ed25519 entry. + + [Before Tor 0.4.5.1-alpha, this field was optional.] + + "bandwidth" bandwidth-avg bandwidth-burst bandwidth-observed NL + + [Exactly once] + + Estimated bandwidth for this router, in bytes per second. The + "average" bandwidth is the volume per second that the OR is willing to + sustain over long periods; the "burst" bandwidth is the volume that + the OR is willing to sustain in very short intervals. The "observed" + value is an estimate of the capacity this relay can handle. The + relay remembers the maximum output bandwidth it sustained over any + ten-second period in the past 5 days, and the corresponding maximum + for input. The + "observed" value is the lesser of these two numbers. + + Tor versions released before 2018 only kept bandwidth-observed for one + day. These versions are no longer supported or recommended. + + "platform" string NL + + [At most once] + + A human-readable string describing the system on which this OR is + running.
This MAY include the operating system, and SHOULD include + the name and version of the software implementing the Tor protocol. + + "published" YYYY-MM-DD HH:MM:SS NL + + [Exactly once] + + The time, in UTC, when this descriptor (and its corresponding + extra-info document if any) was generated. + + "fingerprint" fingerprint NL + + [At most once] + + A fingerprint (a HASH_LEN-byte digest of the ASN.1-encoded public key, encoded in + hex, with a single space after every 4 characters) for this router's + identity key. A descriptor is considered invalid (and MUST be + rejected) if the fingerprint line does not match the public key. + + [We didn't start parsing this line until Tor 0.1.0.6-rc; it should + be marked with "opt" until earlier versions of Tor are obsolete.] + + "hibernating" bool NL + + [At most once] + + If the value is 1, then the Tor relay was hibernating when the + descriptor was published, and shouldn't be used to build circuits. + + [We didn't start parsing this line until Tor 0.1.0.6-rc; it should be + marked with "opt" until earlier versions of Tor are obsolete.] + + "uptime" number NL + + [At most once] + + The number of seconds that this OR process has been running. + + "onion-key" NL a public key in PEM format + + [Exactly once] + [No extra arguments] + + This key is used to encrypt CREATE cells for this OR. The key MUST be + accepted for at least 1 week after any new key is published in a + subsequent descriptor. It MUST be 1024 bits. + + The key encoding is the encoding of the key as a PKCS#1 RSAPublicKey + structure, encoded in base64, and wrapped in "-----BEGIN RSA PUBLIC + KEY-----" and "-----END RSA PUBLIC KEY-----". + + "onion-key-crosscert" NL an RSA signature in PEM format. + + [Exactly once] + [No extra arguments] + + This element contains an RSA signature, generated using the + onion-key, of the following: + + A SHA1 hash of the RSA identity key, + i.e. RSA key from "signing-key" (see below) [20 bytes] + The Ed25519 identity key, + i.e.
Ed25519 key from "master-key-ed25519" [32 bytes] + + If there is no Ed25519 identity key, or if in some future version + there is no RSA identity key, the corresponding field must be + zero-filled. + + Parties verifying this signature MUST allow additional data + beyond the 52 bytes listed above. + + This signature proves that the party creating the descriptor + had control over the secret key corresponding to the + onion-key. + + [Before Tor 0.4.5.1-alpha, this field was optional whenever + identity-ed25519 was absent.] + + "ntor-onion-key" base-64-encoded-key + + [Exactly once] + + A curve25519 public key used for the ntor circuit extended + handshake. It's the standard encoding of the OR's curve25519 + public key, encoded in base 64. The trailing '=' sign MAY be + omitted from the base64 encoding. The key MUST be accepted + for at least 1 week after any new key is published in a + subsequent descriptor. + + [Before Tor 0.4.5.1-alpha, this field was optional.] + + "ntor-onion-key-crosscert" SP Bit NL + "-----BEGIN ED25519 CERT-----" NL certificate + "-----END ED25519 CERT-----" NL + + [Exactly once] + [No extra arguments] + + A signature created with the ntor-onion-key, using the + certificate format documented in cert-spec.txt, with type + [0a]. The signed key here is the master identity key. + + Bit must be "0" or "1". It indicates the sign of the ed25519 + public key corresponding to the ntor onion key. If Bit is "0", + then implementations MUST guarantee that the x-coordinate of + the resulting ed25519 public key is positive. Otherwise, if + Bit is "1", then the sign of the x-coordinate MUST be negative. + + To compute the ed25519 public key corresponding to a curve25519 + key, and for further explanation on key formats, see appendix C. + + This signature proves that the party creating the descriptor + had control over the secret key corresponding to the + ntor-onion-key. 
+ + [Before Tor 0.4.5.1-alpha, this field was optional whenever + identity-ed25519 was absent.] + + "signing-key" NL a public key in PEM format + + [Exactly once] + [No extra arguments] + + The OR's long-term RSA identity key. It MUST be 1024 bits. + + The encoding is as for "onion-key" above. + + "accept" exitpattern NL + "reject" exitpattern NL + + [Any number] + + These lines describe an "exit policy": the rules that an OR follows + when deciding whether to allow a new stream to a given address. The + 'exitpattern' syntax is described below. There MUST be at least one + such entry. The rules are considered in order; if no rule matches, + the address will be accepted. For clarity, the last such entry SHOULD + be accept *:* or reject *:*. + + "ipv6-policy" SP ("accept" / "reject") SP PortList NL + + [At most once.] + + An exit-policy summary as specified in sections 3.4.1 and 3.8.2, + summarizing + the router's rules for connecting to IPv6 addresses. A missing + "ipv6-policy" line is equivalent to "ipv6-policy reject + 1-65535". + + "overload-general" SP version SP YYYY-MM-DD HH:MM:SS NL + + [At most once.] + + Indicates that a relay has reached an "overloaded state" which can be + one or many of the following load metrics: + + - Any OOM invocation due to memory pressure + - Any ntor onionskins are dropped + - TCP port exhaustion + + The timestamp is when at least one of these metrics was detected. It should always + be at the hour and thus, as an example, "2020-01-10 13:00:00" is an + expected timestamp. Because this is a binary state, if the line is + present, we consider that it was hit at the very least once somewhere + between the provided timestamp and the "published" timestamp of the + document which is when the document was generated. + + The overload-general line should remain in place for 72 hours after it was last + triggered. If the limits are reached again in this period, the timestamp + is updated, and this 72 hour period restarts.
+ + The 'version' field is set to '1' for now. + + (Introduced in tor-0.4.6.1-alpha, but moved from extra-info to general + descriptor in tor-0.4.6.2-alpha) + + "router-sig-ed25519" SP Signature NL + + [Exactly once.] + + It MUST be the next-to-last element in the descriptor, appearing + immediately before the RSA signature. It MUST contain an Ed25519 + signature of a SHA256 digest of the entire document. This digest is + taken from the first character up to and including the first space + after the "router-sig-ed25519" string. Before computing the digest, + the string "Tor router descriptor signature v1" is prefixed to the + document. + + The signature is encoded in Base64, with terminating =s removed. + + The signing key in the identity-ed25519 certificate MUST + be the one used to sign the document. + + [Before Tor 0.4.5.1-alpha, this field was optional whenever + identity-ed25519 was absent.] + + "router-signature" NL Signature NL + + [At end, exactly once] + [No extra arguments] + + The "SIGNATURE" object contains a signature of the PKCS1-padded + hash of the entire server descriptor, taken from the beginning of the + "router" line, through the newline after the "router-signature" line. + The server descriptor is invalid unless the signature is performed + with the router's identity key. + + "contact" info NL + + [At most once] + + Describes a way to contact the relay's administrator, preferably + including an email address and a PGP key fingerprint. + + "bridge-distribution-request" SP Method NL + + [At most once, bridges only.] + + The "Method" describes how a Bridge address is distributed by + BridgeDB. Recognized methods are: "none", "any", "https", "email", + "moat". If set to "none", BridgeDB will avoid distributing your bridge + address. If set to "any", BridgeDB will choose how to distribute your + bridge address. 
Choosing any of the other methods will tell BridgeDB to + distribute your bridge via a specific method: + + - "https" specifies distribution via the web interface at + https://bridges.torproject.org; + - "email" specifies distribution via the email autoresponder at + bridges@torproject.org; and + - "moat" specifies distribution via an interactive menu inside Tor + Browser. + + Potential future "Method" specifiers must be as follows: + Method = (KeywordChar | "_") + + + All bridges SHOULD include this line. Non-bridges MUST NOT include + it. + + BridgeDB SHOULD treat unrecognized Method values as if they were + "none". + + (Default: "any") + + [This line was introduced in 0.3.2.3-alpha, with a minimal backport + to 0.2.5.16, 0.2.8.17, 0.2.9.14, 0.3.0.13, 0.3.1.9, and later.] + + "family" names NL + + [At most once] + + 'Names' is a space-separated list of relay nicknames or + hexdigests. If two ORs list one another in their "family" entries, + then OPs should treat them as a single OR for the purpose of path + selection. + + For example, if node A's descriptor contains "family B", and node B's + descriptor contains "family A", then node A and node B should never + be used on the same circuit. + + "read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL + [At most once] + "write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL + [At most once] + + (These fields once appeared in router descriptors, but have + appeared in extra-info descriptors since 0.2.0.x.) + + "eventdns" bool NL + + [At most once] + + Declare whether this version of Tor is using the newer enhanced + dns logic. Versions of Tor with this field set to false SHOULD NOT + be used for reverse hostname lookups. + + [This option is obsolete. All current Tor relays should be presumed + to have the evdns backend.] + + "caches-extra-info" NL + + [At most once.] + [No extra arguments] + + Present only if this router is a directory cache that provides + extra-info documents.
+ + [Versions before 0.2.0.1-alpha don't recognize this] + + "extra-info-digest" SP sha1-digest [SP sha256-digest] NL + + [At most once] + + "sha1-digest" is a hex-encoded SHA1 digest (using upper-case characters) + of the router's extra-info document, as signed in the router's + extra-info (that is, not including the signature). (If this field is + absent, the router is not uploading a corresponding extra-info + document.) + + "sha256-digest" is a base64-encoded SHA256 digest of the extra-info + document. Unlike the "sha1-digest", this digest is calculated over the + entire document, including the signature. This difference is due to + a long-lived bug in the tor implementation that it would be difficult + to roll out an incremental fix for, not a design choice. Future digest + algorithms specified should not include the signature in the data used + to compute the digest. + + [Versions before 0.2.7.2-alpha did not include a SHA256 digest.] + [Versions before 0.2.0.1-alpha don't recognize this field at all.] + + "hidden-service-dir" NL + + [At most once.] + + Present only if this router stores and serves hidden service + descriptors. This router supports the descriptor versions declared + in the HSDir "proto" entry. If there is no "proto" entry, this + router supports version 2 descriptors. + + "protocols" SP "Link" SP LINK-VERSION-LIST SP "Circuit" SP + CIRCUIT-VERSION-LIST NL + + [At most once.] + + An obsolete list of protocol versions, superseded by the "proto" + entry. This list was never parsed, and has not been emitted + since Tor 0.2.9.4-alpha. New code should neither generate nor + parse this line. + + "allow-single-hop-exits" NL + + [At most once.] + [No extra arguments] + + Present only if the router allows single-hop circuits to make exit + connections. Most Tor relays do not support this: this is + included for specialized controllers designed to support perspective + access and such. This is obsolete in tor version >= 0.3.1.0-alpha. 
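The two digests carried by the "extra-info-digest" line described above cover different byte ranges of the extra-info document, which is easy to get wrong. The following Python sketch illustrates the difference. It is not taken from the tor implementation: the function names are hypothetical, the signed portion is assumed to run from the start of the document through the newline after the "router-signature" line, and the comparison tolerates missing base64 padding as an assumption.

```python
import base64
import hashlib
from typing import Tuple


def extra_info_digests(document: bytes) -> Tuple[str, str]:
    """Return (sha1_hex_upper, sha256_base64) for an extra-info document.

    Assumption: the signed portion ends with the newline that follows
    the "router-signature" keyword line.
    """
    marker = b"\nrouter-signature\n"
    signed_end = document.index(marker) + len(marker)
    # SHA1 covers only the signed portion (signature object excluded).
    sha1_hex = hashlib.sha1(document[:signed_end]).hexdigest().upper()
    # SHA256 covers the entire document, signature included.
    sha256_b64 = base64.b64encode(hashlib.sha256(document).digest()).decode("ascii")
    return sha1_hex, sha256_b64


def digest_line_matches(line: str, document: bytes) -> bool:
    """Check an "extra-info-digest" line against a document.

    Extra arguments after the two digests are accepted and ignored.
    """
    fields = line.split()
    if len(fields) < 2 or fields[0] != "extra-info-digest":
        return False
    sha1_hex, sha256_b64 = extra_info_digests(document)
    if fields[1].upper() != sha1_hex:
        return False
    # The SHA256 digest is optional; compare without '=' padding (assumption).
    if len(fields) >= 3 and fields[2].rstrip("=") != sha256_b64.rstrip("="):
        return False
    return True
```

A caller would pass the raw bytes of the extra-info document together with the line taken from the matching server descriptor.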
+ + "or-address" SP ADDRESS ":" PORT NL + + [Any number] + + ADDRESS = IPV6ADDR | IPV4ADDR + IPV6ADDR = an ipv6 address, surrounded by square brackets. + IPV4ADDR = an ipv4 address, represented as a dotted quad. + PORT = a number between 1 and 65535 inclusive. + + An alternative for the address and ORPort of the "router" line, but with + two added capabilities: + + * or-address can be either an IPv4 or IPv6 address + * or-address allows for multiple ORPorts and addresses + + A descriptor SHOULD NOT include an or-address line that does nothing but + duplicate the address:port pair from its "router" line. + + The ordering of or-address lines and their PORT entries matters because + Tor MAY accept a limited number of address/port pairs. As of + Tor 0.2.3.x only the first address/port pair is advertised and used. + + "tunnelled-dir-server" NL + + [At most once.] + [No extra arguments] + + Present if the router accepts "tunneled" directory requests using a + BEGIN_DIR cell over the router's OR port. + (Added in 0.2.8.1-alpha. Before this, Tor relays accepted + tunneled directory requests only if they had a DirPort open, + or if they were bridges.) + + "proto" SP Entries NL + + [Exactly once.] + + Entries = + Entries = Entry + Entries = Entry SP Entries + + Entry = Keyword "=" Values + + Values = + Values = Value + Values = Value "," Values + + Value = Int + Value = Int "-" Int + + Int = NON_ZERO_DIGIT + Int = Int DIGIT + + Each 'Entry' in the "proto" line indicates that the Tor relay supports + one or more versions of the protocol in question. Entries should be + sorted by keyword. Values should be numerically ascending within each + entry. (This implies that there should be no overlapping ranges.) + Ranges should be represented as compactly as possible. Ints must be no + larger than 63. + + This field was first added in Tor 0.2.9.x. + + [Before Tor 0.4.5.1-alpha, this field was optional.] + + +2.1.2.
Extra-info document format + + Extra-info documents consist of the following items: + + "extra-info" Nickname Fingerprint NL + [At start, exactly once.] + + Identifies what router this is an extra-info descriptor for. + Fingerprint is encoded in hex (using upper-case letters), with + no spaces. + + "identity-ed25519" + [As in router descriptors] + + "published" YYYY-MM-DD HH:MM:SS NL + + [Exactly once.] + + The time, in UTC, when this document (and its corresponding router + descriptor if any) was generated. It MUST match the published time + in the corresponding server descriptor. + + "read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL + [At most once.] + "write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL + [At most once.] + + Declare how much bandwidth the OR has used recently. Usage is divided + into intervals of NSEC seconds. The YYYY-MM-DD HH:MM:SS field + defines the end of the most recent interval. The numbers are the + number of bytes used in the most recent intervals, ordered from + oldest to newest. + + These fields include both IPv4 and IPv6 traffic. + + "ipv6-read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM... NL + [At most once] + "ipv6-write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM... NL + [At most once] + + Declare how much bandwidth the OR has used recently, on IPv6 + connections. See "read-history" and "write-history" for full details. + + "geoip-db-digest" Digest NL + [At most once.] + + SHA1 digest of the IPv4 GeoIP database file that is used to + resolve IPv4 addresses to country codes. + + "geoip6-db-digest" Digest NL + [At most once.] + + SHA1 digest of the IPv6 GeoIP database file that is used to + resolve IPv6 addresses to country codes. + + ("geoip-start-time" YYYY-MM-DD HH:MM:SS NL) + ("geoip-client-origins" CC=NUM,CC=NUM,... NL) + + Only generated by bridge routers (see blocking.pdf), and only + when they have been configured with a geoip database. 
+ Non-bridges SHOULD NOT generate these fields. Contains a list + of mappings from two-letter country codes (CC) to the number + of clients that have connected to that bridge from that + country (approximate, and rounded up to the nearest multiple of 8 + in order to hamper traffic analysis). A country is included + only if it has at least one address. The time in + "geoip-start-time" is the time at which we began collecting geoip + statistics. + + "geoip-start-time" and "geoip-client-origins" have been replaced by + "bridge-stats-end" and "bridge-ips" in 0.2.2.4-alpha. The + reason is that the measurement interval with "geoip-stats" as + determined by subtracting "geoip-start-time" from "published" could + have had a variable length, whereas the measurement interval in + 0.2.2.4-alpha and later is set to be exactly 24 hours long. In + order to clearly distinguish the new measurement intervals from + the old ones, the new keywords have been introduced. + + "bridge-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL + [At most once.] + + YYYY-MM-DD HH:MM:SS defines the end of the included measurement + interval of length NSEC seconds (86400 seconds by default). + + A "bridge-stats-end" line, as well as any other "bridge-*" line, + is only added when the relay has been running as a bridge for at + least 24 hours. + + "bridge-ips" CC=NUM,CC=NUM,... NL + [At most once.] + + List of mappings from two-letter country codes to the number of + unique IP addresses that have connected from that country to the + bridge and which are not known relays, rounded up to the nearest + multiple of 8. + + "bridge-ip-versions" FAM=NUM,FAM=NUM,... NL + [At most once.] + + List of unique IP addresses that have connected to the bridge + per protocol family. + + "bridge-ip-transports" PT=NUM,PT=NUM,... NL + [At most once.] + + List of mappings from pluggable transport names to the number + of unique IP addresses that have connected using that + pluggable transport.
Unobfuscated connections are counted + using the reserved pluggable transport name "<OR>" (without + quotes). If we received a connection from a transport proxy + but we couldn't figure out the name of the pluggable + transport, we use the reserved pluggable transport name + "<??>". + + ("<OR>" and "<??>" are reserved because normal pluggable + transport names MUST match the following regular expression: + "[a-zA-Z_][a-zA-Z0-9_]*" ) + + The pluggable transport name list is sorted into lexically + ascending order. + + If no clients have connected to the bridge yet, we only write + "bridge-ip-transports" to the stats file. + + "dirreq-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL + [At most once.] + + YYYY-MM-DD HH:MM:SS defines the end of the included measurement + interval of length NSEC seconds (86400 seconds by default). + + A "dirreq-stats-end" line, as well as any other "dirreq-*" line, + is only added when the relay has opened its Dir port and after 24 + hours of measuring directory requests. + + "dirreq-v2-ips" CC=NUM,CC=NUM,... NL + [At most once.] + "dirreq-v3-ips" CC=NUM,CC=NUM,... NL + [At most once.] + + List of mappings from two-letter country codes to the number of + unique IP addresses that have connected from that country to + request a v2/v3 network status, rounded up to the nearest multiple + of 8. Only those IP addresses are counted that the directory can + answer with a 200 OK status code. (Note here and below: current Tor + versions, as of 0.2.5.2-alpha, no longer cache or serve v2 + networkstatus documents.) + + "dirreq-v2-reqs" CC=NUM,CC=NUM,... NL + [At most once.] + "dirreq-v3-reqs" CC=NUM,CC=NUM,... NL + [At most once.] + + List of mappings from two-letter country codes to the number of + requests for v2/v3 network statuses from that country, rounded up + to the nearest multiple of 8. Only those requests are counted that + the directory can answer with a 200 OK status code. + + "dirreq-v2-share" NUM% NL + [At most once.]
+ "dirreq-v3-share" NUM% NL + [At most once.] + + The share of v2/v3 network status requests that the directory + expects to receive from clients based on its advertised bandwidth + compared to the overall network bandwidth capacity. Shares are + formatted in percent with two decimal places. Shares are + calculated as means over the whole 24-hour interval. + + "dirreq-v2-resp" status=NUM,... NL + [At most once.] + "dirreq-v3-resp" status=NUM,... NL + [At most once.] + + List of mappings from response statuses to the number of requests + for v2/v3 network statuses that were answered with that response + status, rounded up to the nearest multiple of 4. Only response + statuses with at least 1 response are reported. New response + statuses can be added at any time. The current list of response + statuses is as follows: + + "ok": a network status request is answered; this number + corresponds to the sum of all requests as reported in + "dirreq-v2-reqs" or "dirreq-v3-reqs", respectively, before + rounding up. + "not-enough-sigs": a version 3 network status is not signed by a + sufficient number of requested authorities. + "unavailable": a requested network status object is unavailable. + "not-found": a requested network status is not found. + "not-modified": a network status has not been modified since the + If-Modified-Since time that is included in the request. + "busy": the directory is busy. + + "dirreq-v2-direct-dl" key=NUM,... NL + [At most once.] + "dirreq-v3-direct-dl" key=NUM,... NL + [At most once.] + "dirreq-v2-tunneled-dl" key=NUM,... NL + [At most once.] + "dirreq-v3-tunneled-dl" key=NUM,... NL + [At most once.] + + List of statistics about possible failures in the download process + of v2/v3 network statuses. Requests are either "direct" + HTTP-encoded requests over the relay's directory port, or + "tunneled" requests using a BEGIN_DIR cell over the relay's OR + port.
The list of possible statistics can change, and statistics + can be left out from reporting. The current list of statistics is + as follows: + + Successful downloads and failures: + + "complete": a client has finished the download successfully. + "timeout": a download did not finish within 10 minutes after + starting to send the response. + "running": a download is still running at the end of the + measurement period for less than 10 minutes after starting to + send the response. + + Download times: + + "min", "max": smallest and largest measured bandwidth in B/s. + "d[1-4,6-9]": 1st to 4th and 6th to 9th decile of measured + bandwidth in B/s. For a given decile i, i/10 of all downloads + had a smaller bandwidth than di, and (10-i)/10 of all downloads + had a larger bandwidth than di. + "q[1,3]": 1st and 3rd quartile of measured bandwidth in B/s. One + fourth of all downloads had a smaller bandwidth than q1, one + fourth of all downloads had a larger bandwidth than q3, and the + remaining half of all downloads had a bandwidth between q1 and + q3. + "md": median of measured bandwidth in B/s. Half of the downloads + had a smaller bandwidth than md, the other half had a larger + bandwidth than md. + + "dirreq-read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM... NL + [At most once] + "dirreq-write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM... NL + [At most once] + + Declare how much bandwidth the OR has spent on answering directory + requests. Usage is divided into intervals of NSEC seconds. The + YYYY-MM-DD HH:MM:SS field defines the end of the most recent + interval. The numbers are the number of bytes used in the most + recent intervals, ordered from oldest to newest. + + "entry-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL + [At most once.] + + YYYY-MM-DD HH:MM:SS defines the end of the included measurement + interval of length NSEC seconds (86400 seconds by default). 
+ + An "entry-stats-end" line, as well as any other "entry-*" + line, is first added after the relay has been running for at least + 24 hours. + + "entry-ips" CC=NUM,CC=NUM,... NL + [At most once.] + + List of mappings from two-letter country codes to the number of + unique IP addresses that have connected from that country to the + relay and which are not other known relays, rounded up to the + nearest multiple of 8. + + "cell-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL + [At most once.] + + YYYY-MM-DD HH:MM:SS defines the end of the included measurement + interval of length NSEC seconds (86400 seconds by default). + + A "cell-stats-end" line, as well as any other "cell-*" line, + is first added after the relay has been running for at least 24 + hours. + + "cell-processed-cells" NUM,...,NUM NL + [At most once.] + + Mean number of processed cells per circuit, subdivided into + deciles of circuits by the number of cells they have processed in + descending order from loudest to quietest circuits. + + "cell-queued-cells" NUM,...,NUM NL + [At most once.] + + Mean number of cells contained in queues by circuit decile. These + means are calculated by 1) determining the mean number of cells in + a single circuit between its creation and its termination and 2) + calculating the mean for all circuits in a given decile as + determined in "cell-processed-cells". Numbers have a precision of + two decimal places. + + Note that this statistic can be inaccurate for circuits that had + queued cells at the start or end of the measurement interval. + + "cell-time-in-queue" NUM,...,NUM NL + [At most once.] + + Mean time cells spend in circuit queues in milliseconds. Times are + calculated by 1) determining the mean time cells spend in the + queue of a single circuit and 2) calculating the mean for all + circuits in a given decile as determined in + "cell-processed-cells".
+ + Note that this statistic can be inaccurate for circuits that had + queued cells at the start or end of the measurement interval. + + "cell-circuits-per-decile" NUM NL + [At most once.] + + Mean number of circuits that are included in any of the deciles, + rounded up to the next integer. + + "conn-bi-direct" YYYY-MM-DD HH:MM:SS (NSEC s) BELOW,READ,WRITE,BOTH NL + [At most once] + + Number of connections, split into 10-second intervals, that are + used uni-directionally or bi-directionally as observed in the NSEC + seconds (usually 86400 seconds) before YYYY-MM-DD HH:MM:SS. Every + 10 seconds, we determine for every connection whether we read and + wrote less than a threshold of 20 KiB (BELOW), read at least 10 + times more than we wrote (READ), wrote at least 10 times more than + we read (WRITE), or read and wrote more than the threshold, but + not 10 times more in either direction (BOTH). After classifying a + connection, read and write counters are reset for the next + 10-second interval. + + This measurement includes both IPv4 and IPv6 connections. + + "ipv6-conn-bi-direct" YYYY-MM-DD HH:MM:SS (NSEC s) BELOW,READ,WRITE,BOTH NL + [At most once] + + Number of IPv6 connections that are used uni-directionally or + bi-directionally. See "conn-bi-direct" for more details. + + "exit-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL + [At most once.] + + YYYY-MM-DD HH:MM:SS defines the end of the included measurement + interval of length NSEC seconds (86400 seconds by default). + + An "exit-stats-end" line, as well as any other "exit-*" line, is + first added after the relay has been running for at least 24 hours + and only if the relay permits exiting (where exiting to a single + port and IP address is sufficient). + + "exit-kibibytes-written" port=N,port=N,... NL + [At most once.] + "exit-kibibytes-read" port=N,port=N,... NL + [At most once.] 
+ + List of mappings from ports to the number of kibibytes that the + relay has written to or read from exit connections to that port, + rounded up to the next full kibibyte. Relays may limit the + number of listed ports and subsume any remaining kibibytes under + port "other". + + "exit-streams-opened" port=N,port=N,... NL + [At most once.] + + List of mappings from ports to the number of opened exit streams + to that port, rounded up to the nearest multiple of 4. Relays may + limit the number of listed ports and subsume any remaining opened + streams under port "other". + + "hidserv-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL + [At most once.] + "hidserv-v3-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL + [At most once.] + + YYYY-MM-DD HH:MM:SS defines the end of the included measurement + interval of length NSEC seconds (86400 seconds by default). + + A "hidserv-stats-end" line, as well as any other "hidserv-*" line, + is first added after the relay has been running for at least 24 + hours. + + (Introduced in tor-0.4.6.1-alpha) + + "hidserv-rend-relayed-cells" SP NUM SP key=val SP key=val ... NL + [At most once.] + "hidserv-rend-v3-relayed-cells" SP NUM SP key=val SP key=val ... NL + [At most once.] + + Approximate number of RELAY cells seen in either direction on a + circuit after receiving and successfully processing a RENDEZVOUS1 + cell. + + The original measurement value is obfuscated in several steps: + first, it is rounded up to the nearest multiple of 'bin_size' + which is reported in the key=val part of this line; second, a + (possibly negative) noise value is added to the result of the + first step by randomly sampling from a Laplace distribution with + mu = 0 and b = (delta_f / epsilon) with 'delta_f' and 'epsilon' + being reported in the key=val part, too; third, the result of the + previous obfuscation steps is truncated to the next smaller + integer and included as 'NUM'. Note that the overall reported + value can be negative. 
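The three obfuscation steps described above for "hidserv-rend-relayed-cells" (bin, add Laplace noise, truncate downward) can be sketched in Python as follows. This is a minimal illustration, not the tor implementation: the function name is hypothetical, and the noise is drawn as the difference of two exponential variates, which is one standard way to sample a Laplace distribution with mu = 0 and scale b; the text above does not say how the sampling is actually done.

```python
import math
import random


def obfuscate_count(value, bin_size, delta_f, epsilon, rng=random):
    """Obfuscate a non-negative integer count in three steps."""
    # Step 1: round up to the nearest multiple of bin_size
    # (integer arithmetic avoids floating-point rounding surprises).
    binned = -(-value // bin_size) * bin_size

    # Step 2: add Laplace noise with mu = 0 and b = delta_f / epsilon.
    # The difference of two independent exponential variates with
    # rate 1/b follows exactly that distribution.
    b = delta_f / epsilon
    noise = rng.expovariate(1.0 / b) - rng.expovariate(1.0 / b)

    # Step 3: truncate to the next smaller integer. The result may be
    # negative when the noise outweighs the binned value.
    return math.floor(binned + noise)
```

For example, obfuscate_count(1001, 1024, 2048.0, 0.3) first bins the value to 1024 and then returns a noisy integer near it; the parameter values here are illustrative only, and the values actually used are the ones reported in the key=val part of the line.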
+ + (Introduced in tor-0.4.6.1-alpha) + + "hidserv-dir-onions-seen" SP NUM SP key=val SP key=val ... NL + [At most once.] + "hidserv-dir-v3-onions-seen" SP NUM SP key=val SP key=val ... NL + [At most once.] + + Approximate number of unique hidden-service identities seen in + descriptors published to and accepted by this hidden-service + directory. + + The original measurement value is obfuscated in the same way as + the 'NUM' value reported in "hidserv-rend-relayed-cells", but + possibly with different parameters as reported in the key=val part + of this line. Note that the overall reported value can be + negative. + + (Introduced in tor-0.4.6.1-alpha) + + "transport" transportname address:port [arglist] NL + [Any number.] + + Signals that the router supports the 'transportname' pluggable + transport in IP address 'address' and TCP port 'port'. A single + descriptor MUST NOT have more than one transport line with the + same 'transportname'. + + Pluggable transports are only relevant to bridges, but these entries + can appear in non-bridge relays as well. + + "padding-counts" YYYY-MM-DD HH:MM:SS (NSEC s) key=NUM key=NUM ... NL + [At most once.] + + YYYY-MM-DD HH:MM:SS defines the end of the included measurement + interval of length NSEC seconds (86400 seconds by default). Counts + are reset to 0 at the end of this interval.
+ + The keyword list is currently as follows: + + bin-size + - The current rounding value for cell count fields (10000 by + default) + write-drop + - The number of RELAY_DROP cells this relay sent + write-pad + - The number of CELL_PADDING cells this relay sent + write-total + - The total number of cells this relay sent + read-drop + - The number of RELAY_DROP cells this relay received + read-pad + - The number of CELL_PADDING cells this relay received + read-total + - The total number of cells this relay received + enabled-read-pad + - The number of CELL_PADDING cells this relay received on + connections that support padding + enabled-read-total + - The total number of cells this relay received on connections + that support padding + enabled-write-pad + - The number of CELL_PADDING cells this relay sent on + connections that support padding + enabled-write-total + - The total number of cells sent by this relay on connections + that support padding + max-chanpad-timers + - The maximum number of timers that this relay scheduled for + padding in the previous NSEC interval + + "overload-ratelimits" SP version SP YYYY-MM-DD SP HH:MM:SS + SP rate-limit SP burst-limit + SP read-overload-count SP write-overload-count NL + [At most once.] + + Indicates that a bandwidth limit was exhausted for this relay. + + The "rate-limit" and "burst-limit" are the raw values from the + BandwidthRate and BandwidthBurst found in the torrc configuration file. + + The "{read|write}-overload-count" are the counts of how many times the + reported limits of burst/rate were exhausted and thus the maximum + between the read and write count occurrences. To make the counter more + meaningful and to avoid multiple connections saturating the counter + when a relay is overloaded, we only increment it once a minute. + + The 'version' field is set to '1' for now. + + (Introduced in tor-0.4.6.1-alpha) + + "overload-fd-exhausted" SP version SP YYYY-MM-DD HH:MM:SS NL + [At most once.]
+ + Indicates that a file descriptor exhaustion was experienced by this + relay. + + The timestamp indicates that the maximum was reached between the + timestamp and the "published" timestamp of the document. + + This overload field should remain in place for 72 hours since last + triggered. If the limits are reached again in this period, the + timestamp is updated, and this 72 hour period restarts. + + The 'version' field is set to '1' for the initial implementation which + detects fd exhaustion only when a socket open fails. + + (Introduced in tor-0.4.6.1-alpha) + + "router-sig-ed25519" + [As in router descriptors] + + "router-signature" NL Signature NL + [At end, exactly once.] + [No extra arguments] + + A document signature as documented in section 1.3, using the + initial item "extra-info" and the final item "router-signature", + signed with the router's identity key. + +2.1.3. Nonterminals in server descriptors + + nickname ::= between 1 and 19 alphanumeric characters ([A-Za-z0-9]), + case-insensitive. + hexdigest ::= a '$', followed by 40 hexadecimal characters + ([A-Fa-f0-9]). [Represents a relay by the digest of its identity + key.] + + exitpattern ::= addrspec ":" portspec + portspec ::= "*" | port | port "-" port + port ::= an integer between 1 and 65535, inclusive. + + [Some implementations incorrectly generate ports with value 0. + Implementations SHOULD accept this, and SHOULD NOT generate it. + Connections to port 0 are never permitted.] + + addrspec ::= "*" | ip4spec | ip6spec + ip4spec ::= ip4 | ip4 "/" num_ip4_bits | ip4 "/" ip4mask + ip4 ::= an IPv4 address in dotted-quad format + ip4mask ::= an IPv4 mask in dotted-quad format + num_ip4_bits ::= an integer between 0 and 32 + ip6spec ::= ip6 | ip6 "/" num_ip6_bits + ip6 ::= an IPv6 address, surrounded by square brackets. + num_ip6_bits ::= an integer between 0 and 128 + + bool ::= "0" | "1" + +3.
Directory authority operation and formats + + Every authority has two keys used in this protocol: a signing key, and + an authority identity key. (Authorities also have a router identity + key used in their role as a router and by earlier versions of the + directory protocol.) The identity key is used from time to time to + sign new key certificates using new signing keys; it is very sensitive. + The signing key is used to sign key certificates and status documents. + +3.1. Creating key certificates + + Key certificates consist of the following items: + + "dir-key-certificate-version" version NL + + [At start, exactly once.] + + Determines the version of the key certificate. MUST be "3" for + the protocol described in this document. Implementations MUST + reject formats they don't understand. + + "dir-address" IPPort NL + [At most once] + + An IP:Port for this authority's directory port. + + "fingerprint" fingerprint NL + + [Exactly once.] + + Hexadecimal encoding without spaces based on the authority's + identity key. + + "dir-identity-key" NL a public key in PEM format + + [Exactly once.] + [No extra arguments] + + The long-term authority identity key for this authority. This key + SHOULD be at least 2048 bits long; it MUST NOT be shorter than + 1024 bits. + + "dir-key-published" YYYY-MM-DD HH:MM:SS NL + + [Exactly once.] + + The time (in UTC) when this document and corresponding key were + last generated. + + Implementations SHOULD reject certificates that are published + too far in the future, though they MAY tolerate some clock skew. + + "dir-key-expires" YYYY-MM-DD HH:MM:SS NL + + [Exactly once.] + + A time (in UTC) after which this key is no longer valid. + + Implementations SHOULD reject expired certificates, though they + MAY tolerate some clock skew. + + "dir-signing-key" NL a key in PEM format + + [Exactly once.] + [No extra arguments] + + The directory server's public signing key. This key MUST be at + least 1024 bits, and MAY be longer. 
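As a rough illustration of the "dir-key-published" and "dir-key-expires" checks described above, the following Python sketch parses the two timestamps as UTC and applies a clock-skew allowance. The names `parse_utc`, `cert_times_acceptable`, and the one-hour `MAX_SKEW` are assumptions made for this example; the specification says only that implementations MAY tolerate some clock skew and does not give a value.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical skew allowance: the spec does not fix a number.
MAX_SKEW = timedelta(hours=1)


def parse_utc(stamp: str) -> datetime:
    """Parse a 'YYYY-MM-DD HH:MM:SS' timestamp and mark it as UTC."""
    parsed = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
    return parsed.replace(tzinfo=timezone.utc)


def cert_times_acceptable(published: str, expires: str, now: datetime) -> bool:
    """Return True if the certificate is neither far-future nor expired."""
    if parse_utc(published) > now + MAX_SKEW:
        return False  # published too far in the future
    if parse_utc(expires) < now - MAX_SKEW:
        return False  # expired, even after allowing for skew
    return True
```

A caller would pass the current time as an aware UTC `datetime`, for example `datetime.now(timezone.utc)`.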
+ + "dir-key-crosscert" NL CrossSignature NL + + [Exactly once.] + [No extra arguments] + + CrossSignature is a signature, made using the certificate's signing + key, of the digest of the PKCS1-padded hash of the certificate's + identity key. For backward compatibility with broken versions of the + parser, we wrap the base64-encoded signature in -----BEGIN ID + SIGNATURE----- and -----END ID SIGNATURE----- tags. Implementations + MUST allow the "ID " portion to be omitted, however. + + Implementations MUST verify that the signature is a correct signature + of the hash of the identity key using the signing key. + + "dir-key-certification" NL Signature NL + + [At end, exactly once.] + [No extra arguments] + + A document signature as documented in section 1.3, using the + initial item "dir-key-certificate-version" and the final item + "dir-key-certification", signed with the authority identity key. + + Authorities MUST generate a new signing key and corresponding + certificate before the key expires. + +3.2. Accepting server descriptor and extra-info document uploads + + When a router posts a signed descriptor to a directory authority, the + authority first checks whether it is well-formed and correctly + self-signed. If it is, the authority next verifies that the nickname + in question is not already assigned to a router with a different + public key. + Finally, the authority MAY check that the router is not blacklisted + because of its key, IP, or another reason. + + An authority also keeps a record of all the Ed25519/RSA1024 + identity key pairs that it has seen before. It rejects any + descriptor that has a known Ed/RSA identity key that it has + already seen accompanied by a different RSA/Ed identity key + in an older descriptor. + + At a future date, authorities will begin rejecting all + descriptors whose RSA key was previously accompanied by an + Ed25519 key, if the descriptor does not list an Ed25519 key.
+ + At a future date, authorities will begin rejecting all descriptors + that do not list an Ed25519 key. + + If the descriptor passes these tests, and the authority does not already + have a descriptor for a router with this public key, it accepts the + descriptor and remembers it. + + If the authority _does_ have a descriptor with the same public key, the + newly uploaded descriptor is remembered if its publication time is more + recent than the most recent old descriptor for that router, and either: + + - There are non-cosmetic differences between the old descriptor and the + new one. + - Enough time has passed between the descriptors' publication times. + (Currently, 2 hours.) + + Differences between server descriptors are "non-cosmetic" if they would be + sufficient to force an upload as described in section 2.1 above. + + Note that the "cosmetic difference" test only applies to uploaded + descriptors, not to descriptors that the authority downloads from other + authorities. + + When a router posts a signed extra-info document to a directory authority, + the authority again checks it for well-formedness and correct signature, + and checks that its digest matches the extra-info-digest in some router + descriptor that it believes is currently useful. If so, it accepts it and + stores it and serves it as requested. If not, it drops it. + + +3.3. Computing microdescriptors + + Microdescriptors are a stripped-down version of server descriptors + generated by the directory authorities which may additionally contain + authority-generated information. Microdescriptors contain only the + most relevant parts that clients care about. Microdescriptors are + expected to be relatively static and only change about once per week. + Microdescriptors do not contain any information that clients need to + use to decide which servers to fetch information about, or which + servers to fetch information from.
+ + Microdescriptors are a straight transform from the server descriptor + and the consensus method. Microdescriptors have no header or footer. + Each microdescriptor is identified by the hash of its concatenated + elements without a signature by the router. Microdescriptors do not + contain any version information, because their version is determined + by the consensus method. + + Starting with consensus method 8, microdescriptors contain the + following elements taken from or based on the server descriptor. Order + matters here, because different directory authorities must be able to + transform a given server descriptor and consensus method into the exact + same microdescriptor. + + "onion-key" NL a public key in PEM format + + [Exactly once, at start] + [No extra arguments] + + The "onion-key" element as specified in section 2.1.1. + + When generating microdescriptors for consensus method 30 or later, + the trailing = sign must be absent. For consensus method 29 or + earlier, the trailing = sign must be present. + + "ntor-onion-key" SP base-64-encoded-key NL + + [Exactly once] + + The "ntor-onion-key" element as specified in section 2.1.1. + + (Only included when generating microdescriptors for + consensus-method 16 or later.) + + [Before Tor 0.4.5.1-alpha, this field was optional.] + + "a" SP address ":" port NL + + [Any number] + + Additional advertised addresses for the OR. + + Present currently only if the OR advertises at least one IPv6 + address; currently, the first address is included and all others are + omitted. Any other IPv4 or IPv6 addresses should be ignored. + + Address and port are as for "or-address" as specified in + section 2.1.1. + + (Only included when generating microdescriptors for + consensus-methods 14 to 27.) + + "family" names NL + + [At most once] + + The "family" element as specified in section 2.1.1.
+ + When generating microdescriptors for consensus method 29 or later, + the following canonicalization algorithm is applied to improve + compression: + + For all entries of the form $hexid=name or $hexid~name, + remove the =name or ~name portion. + + Remove all entries of the form $hexid, where hexid is not + 40 hexadecimal characters long. + + If an entry is a valid nickname, put it into lower case. + + If an entry is a valid $hexid, put it into upper case. + + If there are any entries, add a single $hexid entry for + the relay in question, so that it is a member of its own + family. + + Sort all entries in lexical order. + + Remove duplicate entries. + + (Note that if an entry is not of the form "nickname", "$hexid", + "$hexid=nickname" or "$hexid~nickname", then it will be unchanged: + this is what makes the algorithm forward-compatible.) + + "p" SP ("accept" / "reject") SP PortList NL + + [Exactly once.] + + The exit-policy summary as specified in sections 3.4.1 and 3.8.2. + + [With microdescriptors, clients don't learn exact exit policies: + clients can only guess whether a relay accepts their request, try the + BEGIN request, and might get end-reason-exit-policy if they guessed + wrong, in which case they'll have to try elsewhere.] + + [In consensus methods before 5, this line was omitted.] + + "p6" SP ("accept" / "reject") SP PortList NL + + [At most once] + + The IPv6 exit policy summary as specified in sections 3.4.1 and + 3.8.2. A missing "p6" line is equivalent to "p6 reject 1-65535". + + (Only included when generating microdescriptors for + consensus-method 15 or later.) + + "id" SP "rsa1024" SP base64-encoded-identity-digest NL + + [At most once] + + The node identity digest (as described in tor-spec.txt), base64 + encoded, without trailing =s. This line is included to prevent + collisions between microdescriptors. + + Implementations SHOULD ignore these lines: they are + added to microdescriptors only to prevent collisions. 
+ + (Only included when generating microdescriptors for + consensus-method 18 or later.) + + "id" SP "ed25519" SP base64-encoded-ed25519-identity NL + + [At most once] + + The node's master Ed25519 identity key, base64 encoded, + without trailing =s. + + All implementations MUST ignore this key for any microdescriptor + whose corresponding entry in the consensus includes the + 'NoEdConsensus' flag. + + (Only included when generating microdescriptors for + consensus-method 21 or later.) + + "id" SP keytype ... NL + + [At most once per distinct keytype.] + + Implementations MUST ignore "id" lines with unrecognized + key-types in place of "rsa1024" or "ed25519" + + "pr" SP Entries NL + + [Exactly once.] + + The "proto" element as specified in section 2.1.1. + + [Before Tor 0.4.5.1-alpha, this field was optional.] + + (Note that with microdescriptors, clients do not learn the RSA identity of + their routers: they only learn a hash of the RSA identity key. This is + all they need to confirm the actual identity key when doing a TLS + handshake, and all they need to put the identity key digest in their + CREATE cells.) + +3.4. Exchanging votes + + Authorities divide time into Intervals. Authority administrators SHOULD + try to all pick the same interval length, and SHOULD pick intervals that + are commonly used divisions of time (e.g., 5 minutes, 15 minutes, 30 + minutes, 60 minutes, 90 minutes). Voting intervals SHOULD be chosen to + divide evenly into a 24-hour day. + + Authorities SHOULD act according to interval and delays in the + latest consensus. Lacking a latest consensus, they SHOULD default to a + 30-minute Interval, a 5 minute VotingDelay, and a 5 minute DistDelay. + + Authorities MUST take pains to ensure that their clocks remain accurate + within a few seconds. (Running NTP is usually sufficient.) + + The first voting period of each day begins at 00:00 (midnight) UTC. 
If + the last period of the day would be truncated by one-half or more, it is + merged with the second-to-last period. + + An authority SHOULD publish its vote immediately at the start of each voting + period (minus VoteSeconds+DistSeconds). It does this by making it + available at + + http://<hostname>/tor/status-vote/next/authority.z + + and sending it in an HTTP POST request to each other authority at the URL + + http://<hostname>/tor/post/vote + + If, at the start of the voting period, minus DistSeconds, an authority + does not have a current statement from another authority, the first + authority downloads the other's statement. + + Once an authority has a vote from another authority, it makes it available + at + + http://<hostname>/tor/status-vote/next/<fp>.z + + where <fp> is the fingerprint of the other authority's identity key. + And at + + http://<hostname>/tor/status-vote/next/d/<d>.z + + where <d> is the digest of the vote document. + + Also, once an authority receives a vote from another authority, it + examines it for new descriptors and fetches them from that authority. + This may be the only way for an authority to hear about relays that didn't + publish their descriptor to all authorities, and, while it's too late + for the authority to include relays in its current vote, it can include + them in its next vote. See section 3.6 below for details. + +3.4.1. Vote and consensus status document formats + + Votes and consensuses are more strictly formatted than other documents + in this specification, since different authorities must be able to + generate exactly the same consensus given the same set of votes. + + The procedure for deciding when to generate vote and consensus status + documents is described in section 1.4 on the voting timeline. + + Status documents contain a preamble, an authority section, a list of + router status entries, and one or more footer signatures, in that order. + + Unlike other formats described above, a SP in these documents must be a + single space character (hex 20).
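To illustrate the single-SP rule just stated, here is a minimal Python sketch of a strict tokenizer for one item line of a vote or consensus. The function name `split_status_line` is hypothetical, and the sketch assumes a line whose arguments are plain space-separated tokens (such as "vote-status" or "r"); it is not suitable for free-form arguments such as the string on a "contact" line.

```python
def split_status_line(line: str) -> list[str]:
    """Split one status-document item line on single SP (0x20) characters.

    Rejects lines where two separators are adjacent, or where the line
    starts or ends with a separator, since each SP must stand alone.
    """
    if not line.endswith("\n"):
        raise ValueError("item line must end with NL")
    # Split on the space character only; str.split() with no argument
    # would silently accept tabs and runs of spaces.
    tokens = line[:-1].split(" ")
    if any(token == "" for token in tokens):
        raise ValueError("doubled, leading, or trailing SP")
    return tokens
```

Splitting on the literal `" "` rather than on general whitespace is the design point here: two authorities that tokenize leniently could otherwise accept documents that differ byte-for-byte.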
+ + Some items appear only in votes, and some items appear only in + consensuses. Unless specified, items occur in both. + + The preamble contains the following items. They SHOULD occur in the + order given here: + + "network-status-version" SP version NL + + [At start, exactly once.] + + A document format version. For this specification, the version is + "3". + + "vote-status" SP type NL + + [Exactly once.] + + The status MUST be "vote" or "consensus", depending on the type of + the document. + + "consensus-methods" SP IntegerList NL + + [At most once for votes; does not occur in consensuses.] + + A space-separated list of supported methods for generating + consensuses from votes. See section 3.8.1 for details. Absence of + the line means that only method "1" is supported. + + "consensus-method" SP Integer NL + + [At most once for consensuses; does not occur in votes.] + [No extra arguments] + + See section 3.8.1 for details. + + (Only included when the vote is generated with consensus-method 2 or + later.) + + "published" SP YYYY-MM-DD SP HH:MM:SS NL + + [Exactly once for votes; does not occur in consensuses.] + + The publication time for this status document (if a vote). + + "valid-after" SP YYYY-MM-DD SP HH:MM:SS NL + + [Exactly once.] + + The start of the Interval for this vote. Before this time, the + consensus document produced from this vote is not officially in + use. + + (Note that because of propagation delays, clients and relays + may see consensus documents that are up to `DistSeconds` + earlier than this time, and should not warn about them.) + + See section 1.4 for voting timeline information. + + "fresh-until" SP YYYY-MM-DD SP HH:MM:SS NL + + [Exactly once.] + + The time at which the next consensus should be produced; before this + time, there is no point in downloading another consensus, since there + won't be a new one. See section 1.4 for voting timeline information. + + "valid-until" SP YYYY-MM-DD SP HH:MM:SS NL + + [Exactly once.] 
+ + The end of the Interval for this vote. After this time, all + clients should try to find a more recent consensus. See section 1.4 + for voting timeline information. + + In practice, clients continue to use the consensus for up to 24 hours + after it is no longer valid, if no more recent consensus can be + downloaded. + + "voting-delay" SP VoteSeconds SP DistSeconds NL + + [Exactly once.] + + VoteSeconds is the number of seconds that we will allow to collect + votes from all authorities; DistSeconds is the number of seconds + we'll allow to collect signatures from all authorities. See + section 1.4 for voting timeline information. + + "client-versions" SP VersionList NL + + [At most once.] + + A comma-separated list of recommended Tor versions for client + usage, in ascending order. The versions are given as defined by + version-spec.txt. If absent, no opinion is held about client + versions. + + "server-versions" SP VersionList NL + + [At most once.] + + A comma-separated list of recommended Tor versions for relay + usage, in ascending order. The versions are given as defined by + version-spec.txt. If absent, no opinion is held about server + versions. + + "package" SP PackageName SP Version SP URL SP DIGESTS NL + + [Any number of times.] + + For this element: + + PACKAGENAME = NONSPACE + VERSION = NONSPACE + URL = NONSPACE + DIGESTS = DIGEST | DIGESTS SP DIGEST + DIGEST = DIGESTTYPE "=" DIGESTVAL + NONSPACE = one or more non-space printing characters + DIGESTVAL = DIGESTTYPE = one or more non-space printing characters + other than "=". + + Indicates that a package called "package" of version VERSION may be + found at URL, and its digest as computed with DIGESTTYPE is equal to + DIGESTVAL. In consensuses, these lines are sorted lexically by + "PACKAGENAME VERSION" pairs, and DIGESTTYPES must appear in ascending + order. A consensus must not contain the same "PACKAGENAME VERSION" + more than once. 
If a vote contains the same "PACKAGENAME VERSION" + more than once, all but the last is ignored. + + Included in consensuses only for methods 19-33. Earlier methods + did not include this; method 34 removed it. + + "known-flags" SP FlagList NL + + [Exactly once.] + + A space-separated list of all of the flags that this document + might contain. A flag is "known" either because the authority + knows about them and might set them (if in a vote), or because + enough votes were counted for the consensus for an authoritative + opinion to have been formed about their status. + + "flag-thresholds" SP Thresholds NL + + [At most once for votes; does not occur in consensuses.] + + A space-separated list of the internal performance thresholds + that the directory authority had at the moment it was forming + a vote. + + The metaformat is: + Thresholds = Threshold | Threshold SP Thresholds + Threshold = ThresholdKey '=' ThresholdVal + ThresholdKey = (KeywordChar | "_") + + ThresholdVal = [0-9]+("."[0-9]+)? "%"? + + Commonly used Thresholds at this point include: + + "stable-uptime" -- Uptime (in seconds) required for a relay + to be marked as stable. + + "stable-mtbf" -- MTBF (in seconds) required for a relay to be + marked as stable. + + "enough-mtbf" -- Whether we have measured enough MTBF to look + at stable-mtbf instead of stable-uptime. + + "fast-speed" -- Bandwidth (in bytes per second) required for + a relay to be marked as fast. + + "guard-wfu" -- WFU (in seconds) required for a relay to be + marked as guard. + + "guard-tk" -- Weighted Time Known (in seconds) required for a + relay to be marked as guard. + + "guard-bw-inc-exits" -- If exits can be guards, then all guards + must have a bandwidth this high. + + "guard-bw-exc-exits" -- If exits can't be guards, then all guards + must have a bandwidth this high. 
+ + "ignoring-advertised-bws" -- 1 if we have enough measured bandwidths + that we'll ignore the advertised bandwidth + claims of routers without measured bandwidth. + + "recommended-client-protocols" SP Entries NL + "recommended-relay-protocols" SP Entries NL + "required-client-protocols" SP Entries NL + "required-relay-protocols" SP Entries NL + + [At most once for each.] + + The "proto" element as specified in section 2.1.1. + + To vote on these entries, a protocol/version combination is included + only if it is listed by a majority of the voters. + + These lines should be voted on. A majority of votes is sufficient to + make a protocol un-supported. A supermajority of authorities (2/3) + are needed to make a protocol required. The required protocols + should not be torrc-configurable, but rather should be hardwired in + the Tor code. + + The tor-spec.txt section 9 details how a relay and a client should + behave when they encounter these lines in the consensus. + + "params" SP [Parameters] NL + + [At most once] + + Parameter ::= Keyword '=' Int32 + Int32 ::= A decimal integer between -2147483648 and 2147483647. + Parameters ::= Parameter | Parameters SP Parameter + + The parameters list, if present, contains a space-separated list of + case-sensitive key-value pairs, sorted in lexical order by their + keyword (as ASCII byte strings). Each parameter has its own meaning. + + (Only included when the vote is generated with consensus-method 7 or + later.) + + See param-spec.txt for a list of parameters and their meanings. + + "shared-rand-previous-value" SP NumReveals SP Value NL + + [At most once] + + NumReveals ::= An integer greater or equal to 0. + Value ::= Base64-encoded-data + + The shared_random_value that was generated during the second-to-last + shared randomness protocol run. For example, if this document was + created on the 5th of November, this field carries the shared random + value generated during the protocol run of the 3rd of November. 
+ + See section [SRCALC] of srv-spec.txt for instructions on how to compute + this value, and see section [CONS] for why we include old shared random + values in votes and consensus. + + Value is the actual shared random value encoded in base64. It will + be exactly 256 bits long. NumReveals is the number of commits used + to generate this SRV. + + "shared-rand-current-value" SP NumReveals SP Value NL + + [At most once] + + NumReveals ::= An integer greater or equal to 0. + Value ::= Base64-encoded-data + + The shared_random_value that was generated during the latest shared + randomness protocol run. For example, if this document was created on + the 5th of November, this field carries the shared random value + generated during the protocol run of the 4th of November + + See section [SRCALC] of srv-spec.txt for instructions on how to compute + this value given the active commits. + + Value is the actual shared random value encoded in base64. It will + be exactly 256 bits long. NumReveals is the number of commits used to + generate this SRV. + + "bandwidth-file-headers" SP KeyValues NL + + [At most once for votes; does not occur in consensuses.] + + KeyValues ::= "" | KeyValue | KeyValues SP KeyValue + KeyValue ::= Keyword '=' Value + Value ::= ArgumentCharValue+ + ArgumentCharValue ::= any printing ASCII character except NL and SP. + + The headers from the bandwidth file used to generate this vote. + The bandwidth file headers are described in bandwidth-file-spec.txt. + + If an authority is not configured with a V3BandwidthsFile, this line + SHOULD NOT appear in its vote. + + If an authority is configured with a V3BandwidthsFile, but parsing + fails, this line SHOULD appear in its vote, but without any headers. + + First-appeared: Tor 0.3.5.1-alpha. + + "bandwidth-file-digest" 1*(SP algorithm "=" digest) NL + + [At most once for votes; does not occur in consensuses.] + + A digest of the bandwidth file used to generate this vote. 
+ "algorithm" is the name of the hash algorithm producing "digest", + which can be "sha256" or another algorithm. "digest" is the + base64 encoding of the hash of the bandwidth file, with trailing =s + omitted. + + If an authority is not configured with a V3BandwidthsFile, this line + SHOULD NOT appear in its vote. + + If an authority is configured with a V3BandwidthsFile, but parsing + fails, this line SHOULD appear in its vote, with the digest(s) of the + unparseable file. + + First-appeared: Tor 0.4.0.4-alpha + + The authority section of a vote contains the following items, followed + in turn by the authority's current key certificate: + + "dir-source" SP nickname SP identity SP address SP IP SP dirport SP + orport NL + + [Exactly once, at start] + + Describes this authority. The nickname is a convenient identifier + for the authority. The identity is an uppercase hex fingerprint of + the authority's current (v3 authority) identity key. The address is + the server's hostname. The IP is the server's current IP address, + and dirport is its current directory port. The orport is the + port at that address where the authority listens for OR + connections. + + "contact" SP string NL + + [Exactly once] + + An arbitrary string describing how to contact the directory + server's administrator. Administrators should include at least an + email address and a PGP fingerprint. + + "legacy-dir-key" SP FINGERPRINT NL + + [At most once] + + Lists a fingerprint for an obsolete _identity_ key still used + by this authority to keep older clients working. This option + is used to keep key around for a little while in case the + authorities need to migrate many identity keys at once. + (Generally, this would only happen because of a security + vulnerability that affected multiple authorities, like the + Debian OpenSSL RNG bug of May 2008.) 
+ + "shared-rand-participate" NL + + [At most once] + + Denotes that the directory authority supports and can participate in the + shared random protocol. + + "shared-rand-commit" SP Version SP AlgName SP Identity SP Commit [SP Reveal] NL + + [Any number of times] + + Version ::= An integer greater or equal to 0. + AlgName ::= 1*(ALPHA / DIGIT / "_" / "-") + Identity ::= 40 * HEXDIG + Commit ::= Base64-encoded-data + Reveal ::= Base64-encoded-data + + Denotes a directory authority commit for the shared randomness + protocol, containing the commitment value and potentially also the + reveal value. See sections [COMMITREVEAL] and [VALIDATEVALUES] of + srv-spec.txt on how to generate and validate these values. + + Version is the current shared randomness protocol version. AlgName is + the hash algorithm that is used (e.g. "sha3-256") and Identity is the + authority's SHA1 v3 identity fingerprint. Commit is the encoded + commitment value in base64. Reveal is optional and if it's set, it + contains the reveal value in base64. + + If a vote contains multiple commits from the same authority, the + receiver MUST only consider the first commit listed. + + "shared-rand-previous-value" SP NumReveals SP Value NL + + [At most once] + + See shared-rand-previous-value description above. + + "shared-rand-current-value" SP NumReveals SP Value NL + + [At most once] + + See shared-rand-current-value description above. + + The authority section of a consensus contains groups of the following items, + in the order given, with one group for each authority that contributed to + the consensus, with groups sorted by authority identity digest: + + "dir-source" SP nickname SP identity SP address SP IP SP dirport SP + orport NL + + [Exactly once, at start] + + As in the authority section of a vote. + + "contact" SP string NL + + [Exactly once.] + + As in the authority section of a vote. + + "vote-digest" SP digest NL + + [Exactly once.] 
+ + A digest of the vote from the authority that contributed to this + consensus, as signed (that is, not including the signature). + (Hex, upper-case.) + + For each "legacy-dir-key" in the vote, there is an additional "dir-source" + line containing that legacy key's fingerprint, the authority's nickname + with "-legacy" appended, and all other fields as in the main "dir-source" + line for that authority. These "dir-source" lines do not have + corresponding "contact" or "vote-digest" entries. + + Each router status entry contains the following items. Router status + entries are sorted in ascending order by identity digest. + + "r" SP nickname SP identity SP digest SP publication SP IP SP ORPort + SP DirPort NL + + [At start, exactly once.] + + "Nickname" is the OR's nickname. "Identity" is a hash of its + identity key, encoded in base64, with trailing equals sign(s) + removed. "Digest" is a hash of its most recent descriptor as + signed (that is, not including the signature) by the RSA identity + key (see section 1.3.), encoded in base64. + + "Publication" was once the publication time of the router's most + recent descriptor, in the form YYYY-MM-DD HH:MM:SS, in UTC. Now + it is only used in votes, and may be set to a fixed value in + consensus documents. Implementations SHOULD ignore this value + in non-vote documents. + + "IP" is its current IP address; ORPort is its current OR port, + "DirPort" is its current directory port, or "0" for "none". + + "a" SP address ":" port NL + + [Any number] + + The first advertised IPv6 address for the OR, if it is reachable. + + Present only if the OR advertises at least one IPv6 address, and the + authority believes that the first advertised address is reachable. + Any other IPv4 or IPv6 addresses should be ignored. + + Address and port are as for "or-address" as specified in + section 2.1.1. + + (Only included when the vote or consensus is generated with + consensus-method 14 or later.) + + "s" SP Flags NL + + [Exactly once.] 
+ + A series of space-separated status flags, in lexical order (as ASCII + byte strings). Currently documented flags are: + + "Authority" if the router is a directory authority. + "BadExit" if the router is believed to be useless as an exit node + (because its ISP censors it, because it is behind a restrictive + proxy, or for some similar reason). + "Exit" if the router is more useful for building + general-purpose exit circuits than for relay circuits. The + path building algorithm uses this flag; see path-spec.txt. + "Fast" if the router is suitable for high-bandwidth circuits. + "Guard" if the router is suitable for use as an entry guard. + "HSDir" if the router is considered a v2 hidden service directory. + "MiddleOnly" if the router is considered unsuitable for + usage other than as a middle relay. Clients do not need + to handle this option, since when it is present, the authorities + will automatically vote against flags that would make the router + usable in other positions. (Since 0.4.7.2-alpha.) + "NoEdConsensus" if any Ed25519 key in the router's descriptor or + microdescriptor does not reflect authority consensus. + "Stable" if the router is suitable for long-lived circuits. + "StaleDesc" if the router should upload a new descriptor because + the old one is too old. + "Running" if the router is currently usable over all its published + ORPorts. (Authorities ignore IPv6 ORPorts unless configured to + check IPv6 reachability.) Relays without this flag are omitted + from the consensus, and current clients (since 0.2.9.4-alpha) + assume that every listed relay has this flag. + "Valid" if the router has been 'validated'. Clients before + 0.2.9.4-alpha would not use routers without this flag by + default. Currently, relays without this flag are omitted + from the consensus, and current (post-0.2.9.4-alpha) clients + assume that every listed relay has this flag. + "V2Dir" if the router implements the v2 directory protocol or + higher. 
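As a small sketch of the ordering rule for the "s" line above, the Python function below splits the flags and checks that they appear in lexical order as ASCII byte strings. The name `parse_flags` is hypothetical. Unknown flags are passed through unchanged, which is an assumption of this example rather than a rule stated here.

```python
def parse_flags(line: str) -> list[str]:
    """Parse an 's' line and verify that its flags are in ASCII order."""
    parts = line.rstrip("\n").split(" ")
    if parts[0] != "s":
        raise ValueError("not an 's' line")
    flags = parts[1:]
    # For ASCII-only strings, Python's default string ordering is the
    # same as byte-wise ordering, so sorted() gives the required order.
    if flags != sorted(flags):
        raise ValueError("flags are not in lexical order")
    return flags
```

Note that byte-wise ordering places "Stable" before "StaleDesc" and "V2Dir" before "Valid", because uppercase letters and digits sort before lowercase letters in ASCII.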
+ + "v" SP version NL + + [At most once.] + + The version of the Tor protocol that this relay is running. If + the value begins with "Tor" SP, the rest of the string is a Tor + version number, and the protocol is "The Tor protocol as supported + by the given version of Tor." Otherwise, if the value begins with + some other string, Tor has upgraded to a more sophisticated + protocol versioning system, and the protocol is "a version of the + Tor protocol more recent than any we recognize." + + Directory authorities SHOULD omit version strings they receive from + descriptors if they would cause "v" lines to be over 128 characters + long. + + "pr" SP Entries NL + + [At most once.] + + The "proto" family element as specified in section 2.1.1. + + During voting, authorities copy these lines immediately below the "v" + lines. When a descriptor does not contain a "proto" entry, the + authorities should reconstruct it using the approach described below + in section D. They are included in the consensus using the same rules + as currently used for "v" lines, if a sufficiently late consensus + method is in use. + + "w" SP "Bandwidth=" INT [SP "Measured=" INT] [SP "Unmeasured=1"] NL + + [At most once.] + + An estimate of the bandwidth of this relay, in an arbitrary + unit (currently kilobytes per second). Used to weight router + selection. See section 3.4.2 for details on how the value of + Bandwidth is determined in a consensus. + + Additionally, the Measured= keyword is present in votes by + participating bandwidth measurement authorities to indicate + a measured bandwidth currently produced by measuring stream + capacities. It does not occur in consensuses. + + 'Bandwidth=' and 'Measured=' values must be between 0 and + 2^32 - 1 inclusive. + + The "Unmeasured=1" value is included in consensuses generated + with method 17 or later when the 'Bandwidth=' value is not + based on a threshold of 3 or more measurements for this relay. 
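As an illustration of the "w" line grammar and range limits, here is a minimal parsing sketch. The name `parse_w_line` and the returned dictionary layout are hypothetical; the sketch skips keywords it does not recognize and treats a missing 'Bandwidth=' as an error, which is an assumption.

```python
MAX_BANDWIDTH = 2**32 - 1  # upper bound for 'Bandwidth=' and 'Measured='

def parse_w_line(line: str) -> dict[str, int]:
    """Parse a "w" line into a dict of the recognized keywords (sketch)."""
    parts = line.rstrip("\n").split(" ")
    if parts[0] != "w":
        raise ValueError("not a 'w' line")
    result: dict[str, int] = {}
    for item in parts[1:]:
        key, sep, value = item.partition("=")
        if not sep or key not in ("Bandwidth", "Measured", "Unmeasured"):
            continue  # unrecognized weighting keywords are skipped
        if not (value.isascii() and value.isdigit()):
            raise ValueError(f"bad value for {key}")
        number = int(value)
        if key == "Unmeasured":
            if number != 1:
                raise ValueError("Unmeasured must be 1")
        elif number > MAX_BANDWIDTH:
            raise ValueError(f"{key} out of range")
        result[key] = number
    if "Bandwidth" not in result:
        # Assumption of this sketch: a "w" line without Bandwidth= is invalid.
        raise ValueError("missing Bandwidth=")
    return result
```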
+ + Other weighting keywords may be added later. + Clients MUST ignore keywords they do not recognize. + + "p" SP ("accept" / "reject") SP PortList NL + + [At most once.] + + PortList = PortOrRange + PortList = PortList "," PortOrRange + PortOrRange = INT "-" INT / INT + + A list of those ports that this router supports (if 'accept') + or does not support (if 'reject') for exit to "most + addresses". + + "m" SP methods 1*(SP algorithm "=" digest) NL + + [Any number, only in votes.] + + Microdescriptor hashes for all consensus methods that an authority + supports and that use the same microdescriptor format. "methods" + is a comma-separated list of the consensus methods that the + authority believes will produce "digest". "algorithm" is the name + of the hash algorithm producing "digest", which can be "sha256" or + something else, depending on the consensus "methods" supporting + this algorithm. "digest" is the base64 encoding of the hash of + the router's microdescriptor with trailing =s omitted. + + "id" SP "ed25519" SP ed25519-identity NL + "id" SP "ed25519" SP "none" NL + [vote only, at most once] + + "stats" SP [KeyValues] NL + + [At most once. Vote only] + + KeyValue ::= Keyword '=' Number + Number ::= [0-9]+("."[0-9]+)? + KeyValues ::= KeyValue | KeyValues SP KeyValue + + Line containing various statistics that an authority has computed for + this relay. Each statistic is represented as a Keyword=Number pair. + Reported keys are: + + "wfu" - Weighted Fractional Uptime + "tk" - Weighted Time Known + "mtbf" - Mean Time Between Failures (stability) + + (As of tor-0.4.6.1-alpha) + + The footer section is delineated in all votes and consensuses supporting + consensus method 9 and above with the following: + + "directory-footer" NL + [No extra arguments] + + It contains two subsections, a bandwidth-weights line and a + directory-signature. (Prior to consensus method 9, footers only contained + directory-signatures without a 'directory-footer' line or + bandwidth-weights.)
+ + The bandwidth-weights line appears At Most Once for a consensus. It does + not appear in votes. + + "bandwidth-weights" [SP Weights] NL + + Weight ::= Keyword '=' Int32 + Int32 ::= A decimal integer between -2147483648 and 2147483647. + Weights ::= Weight | Weights SP Weight + + List of optional weights to apply to router bandwidths during path + selection. They are sorted in lexical order (as ASCII byte strings) and + values are divided by the consensus' "bwweightscale" param. The known + entries are: + + Wgg - Weight for Guard-flagged nodes in the guard position + Wgm - Weight for non-flagged nodes in the guard position + Wgd - Weight for Guard+Exit-flagged nodes in the guard position + + Wmg - Weight for Guard-flagged nodes in the middle position + Wmm - Weight for non-flagged nodes in the middle position + Wme - Weight for Exit-flagged nodes in the middle position + Wmd - Weight for Guard+Exit-flagged nodes in the middle position + + Weg - Weight for Guard-flagged nodes in the exit position + Wem - Weight for non-flagged nodes in the exit position + Wee - Weight for Exit-flagged nodes in the exit position + Wed - Weight for Guard+Exit-flagged nodes in the exit position + + Wgb - Weight for BEGIN_DIR-supporting Guard-flagged nodes + Wmb - Weight for BEGIN_DIR-supporting non-flagged nodes + Web - Weight for BEGIN_DIR-supporting Exit-flagged nodes + Wdb - Weight for BEGIN_DIR-supporting Guard+Exit-flagged nodes + + Wbg - Weight for Guard-flagged nodes for BEGIN_DIR requests + Wbm - Weight for non-flagged nodes for BEGIN_DIR requests + Wbe - Weight for Exit-flagged nodes for BEGIN_DIR requests + Wbd - Weight for Guard+Exit-flagged nodes for BEGIN_DIR requests + + These values are calculated as specified in section 3.8.3. + + The signature contains the following item, which appears Exactly Once + for a vote, and At Least Once for a consensus.
+ + "directory-signature" [SP Algorithm] SP identity SP signing-key-digest + NL Signature + + This is a signature of the status document, with the initial item + "network-status-version", and the signature item + "directory-signature", using the signing key. (In this case, we take + the hash through the _space_ after directory-signature, not the + newline: this ensures that all authorities sign the same thing.) + "identity" is the hex-encoded digest of the authority identity key of + the signing authority, and "signing-key-digest" is the hex-encoded + digest of the current authority signing key of the signing authority. + + The Algorithm is one of "sha1" or "sha256" if it is present; + implementations MUST ignore directory-signature entries with an + unrecognized Algorithm. "sha1" is the default, if no Algorithm is + given. The algorithm describes how to compute the hash of the + document before signing it. + + "ns"-flavored consensus documents must contain only sha1 signatures. + Votes and microdescriptor documents may contain other signature + types. Note that only one signature from each authority should be + "counted" as meaning that the authority has signed the consensus. + + (Tor clients before 0.2.3.x did not understand the 'algorithm' + field.) + +3.4.2. Assigning flags in a vote + + (This section describes how directory authorities choose which status + flags to apply to routers. Later directory authorities MAY do things + differently, so long as clients keep working well. Clients MUST NOT + depend on the exact behaviors in this section.) + + In the below definitions, a router is considered "active" if it is + running, valid, and not hibernating. + + When we speak of a router's bandwidth in this section, we mean either + its measured bandwidth, or its advertised bandwidth. 
If a sufficient + threshold (configurable with MinMeasuredBWsForAuthToIgnoreAdvertised, + 500 by default) of routers have measured bandwidth values, then the + authority bases flags on _measured_ bandwidths, and treats nodes with + non-measured bandwidths as if their bandwidths were zero. Otherwise, + it uses measured bandwidths for nodes that have them, and advertised + bandwidths for other nodes. + + When computing thresholds based on percentiles of nodes, an authority + only considers nodes that are active, that have not been + omitted as a sybil (see below), and whose bandwidth is at least + 4 KB. Nodes that don't meet these criteria do not influence any + threshold calculations (including calculation of stability and uptime + and bandwidth thresholds) and also do not have their Exit status + change. + + "Valid" -- a router is 'Valid' if it is running a version of Tor not + known to be broken, and the directory authority has not blacklisted + it as suspicious. + + "Named" -- + "Unnamed" -- Directory authorities no longer assign these flags. + They were once used to determine whether a relay's nickname was + canonically linked to its public key. + + "Running" -- A router is 'Running' if the authority managed to connect to + it successfully within the last 45 minutes on all its published ORPorts. + Authorities check reachability on: + + * the IPv4 ORPort in the "r" line, and + * the IPv6 ORPort considered for the "a" line, if: + * the router advertises at least one IPv6 ORPort, and + * AuthDirHasIPv6Connectivity 1 is set on the authority. + + A minority of voting authorities that set AuthDirHasIPv6Connectivity will + drop unreachable IPv6 ORPorts from the full consensus. Consensus method 27 + in 0.3.3.x puts IPv6 ORPorts in the microdesc consensus, so that + authorities can drop unreachable IPv6 ORPorts from all consensus flavors. + Consensus method 28 removes IPv6 ORPorts from microdescriptors. 
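The rule given earlier in this section for choosing between measured and advertised bandwidth when assigning flags can be sketched as follows. The function name `flag_bandwidths` and the input layout are hypothetical, and whether the threshold comparison is inclusive is an assumption of this sketch.

```python
from typing import Optional

# Default of MinMeasuredBWsForAuthToIgnoreAdvertised, as stated in the text.
DEFAULT_THRESHOLD = 500

def flag_bandwidths(
    routers: dict[str, tuple[Optional[int], int]],
    threshold: int = DEFAULT_THRESHOLD,
) -> dict[str, int]:
    """Map each router to the bandwidth used for flag decisions.

    `routers` maps a router name to (measured, advertised), where
    measured is None if the authority has no measurement.
    """
    measured_count = sum(1 for measured, _ in routers.values() if measured is not None)
    # Assumption: "a sufficient threshold" is treated as count >= threshold.
    only_measured = measured_count >= threshold
    result = {}
    for name, (measured, advertised) in routers.items():
        if measured is not None:
            result[name] = measured
        elif only_measured:
            result[name] = 0  # unmeasured nodes count as zero bandwidth
        else:
            result[name] = advertised
    return result
```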
+ + "Stable" -- A router is 'Stable' if it is active, and either its Weighted + MTBF is at least the median for known active routers or its Weighted MTBF + corresponds to at least 7 days. Routers are never called Stable if they are + running a version of Tor known to drop circuits stupidly. (0.1.1.10-alpha + through 0.1.1.16-rc are stupid this way.) + + To calculate weighted MTBF, compute the weighted mean of the lengths + of all intervals when the router was observed to be up, weighting + intervals by $\alpha^n$, where $n$ is the amount of time that has + passed since the interval ended, and $\alpha$ is chosen so that + measurements over approximately one month old no longer influence the + weighted MTBF much. + + [XXXX what happens when we have less than 4 days of MTBF info.] + + "Exit" -- A router is called an 'Exit' iff it allows exits to at + least one /8 address space on each of ports 80 and 443. (Up until + Tor version 0.3.2, the flag was assigned if relays exit to at least + two of the ports 80, 443, and 6667.) + + "Fast" -- A router is 'Fast' if it is active, and its bandwidth is either in + the top 7/8ths for known active routers or at least 100KB/s. + + "Guard" -- A router is a possible Guard if all of the following apply: + + - It is Fast, + - It is Stable, + - Its Weighted Fractional Uptime is at least the median for "familiar" + active routers, + - It is "familiar", + - Its bandwidth is at least AuthDirGuardBWGuarantee (if set, 2 MB by + default), OR its bandwidth is among the 25% fastest relays, + - It qualifies for the V2Dir flag as described below (this + constraint was added in 0.3.3.x, because in 0.3.0.x clients + started avoiding guards that didn't also have the V2Dir flag). + + To calculate weighted fractional uptime, compute the fraction + of time that the router is up in any given day, weighting so that + downtime and uptime in the past counts less. 
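The weighted MTBF calculation described above for the "Stable" flag can be sketched as below. The values of `alpha` and `step` are illustrative assumptions only (the text requires just that measurements older than about a month carry little weight), and the name `weighted_mtbf` is hypothetical.

```python
def weighted_mtbf(
    intervals: list[tuple[float, float]],
    now: float,
    alpha: float = 0.95,        # assumed decay factor, not from the spec
    step: float = 12 * 3600.0,  # assumed length of one decay period, in seconds
) -> float:
    """Weighted mean of uptime interval lengths, in seconds.

    Each (start, end) interval is weighted by alpha**n, where n is the
    number of decay periods that have passed since the interval ended.
    """
    total = 0.0
    weight_sum = 0.0
    for start, end in intervals:
        n = (now - end) / step
        weight = alpha ** n
        total += weight * (end - start)
        weight_sum += weight
    return total / weight_sum if weight_sum else 0.0
```

With these assumed values, an interval that ended 30 days ago has weight 0.95^60, roughly 0.05, so it influences the result only slightly.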
+ + A node is 'familiar' if 1/8 of all active nodes have appeared more + recently than it, OR it has been around for a few weeks. + + "Authority" -- A router is called an 'Authority' if the authority + generating the network-status document believes it is an authority. + + "V2Dir" -- A router supports the v2 directory protocol or higher if it has + an open directory port OR a tunnelled-dir-server line in its router + descriptor, and it is running a version of the directory + protocol that supports the functionality clients need. (Currently, every + supported version of Tor supports the functionality that clients need, + but some relays might set "DirCache 0" or set very low rate limits, + making them unqualified to be a directory mirror, i.e. they will omit + the tunnelled-dir-server line from their descriptor.) + + "HSDir" -- A router is a v2 hidden service directory if it stores and + serves v2 hidden service descriptors, has the Stable and Fast flags, and the + authority believes that it's been up for at least 96 hours (or the current + value of MinUptimeHidServDirectoryV2). + + "MiddleOnly" -- An authority should vote for this flag if it believes + that a relay is unsuitable for use except as a middle relay. When + voting for this flag, the authority should also vote against "Exit", + "Guard", "HSDir", and "V2Dir". When voting for this flag, if the + authority votes on the "BadExit" flag, the authority should vote in + favor of "BadExit". (This flag was added in 0.4.7.2-alpha.) + + "NoEdConsensus" -- authorities should not vote on this flag; it is + produced as part of the consensus for consensus method 22 or later. + + "StaleDesc" -- authorities should vote to assign this flag if the + published time on the descriptor is over 18 hours in the past. (This flag + was added in 0.4.0.1-alpha.) + + "Sybil" -- authorities SHOULD NOT accept more than 2 relays on a single IP.
+ If this happens, the authority *should* vote for the excess relays, but + should omit the Running or Valid flags and instead should assign the "Sybil" + flag. When there are more than 2 (or AuthDirMaxServersPerAddr) relays to + choose from, authorities should first prefer authorities to non-authorities, + then prefer Running to non-Running, and then prefer high-bandwidth to + low-bandwidth relays. In this comparison, measured bandwidth is used unless + it is not present for a router, in which case advertised bandwidth is used. + + Thus, the network-status vote includes all non-blacklisted, + non-expired, non-superseded descriptors. + + The bandwidth in a "w" line should be taken as the best estimate + of the router's actual capacity that the authority has. For now, + this should be the lesser of the observed bandwidth and bandwidth + rate limit from the server descriptor. It is given in kilobytes + per second, and capped at some arbitrary value (currently 10 MB/s). + + The Measured= keyword on a "w" line vote is currently computed + by multiplying the previous published consensus bandwidth by the + ratio of the measured average node stream capacity to the network + average. If 3 or more authorities provide a Measured= keyword for + a router, the authorities produce a consensus containing a "w" + Bandwidth= keyword equal to the median of the Measured= votes. + + As a special case, if the "w" line in a vote is about a relay with the + Authority flag, it should not include a Measured= keyword. The goal is + to leave such relays marked as Unmeasured, so they can reserve their + attention for authority-specific activities. "w" lines for votes about + authorities may include the bandwidth authority's measurement using + a different keyword, e.g. MeasuredButAuthority=, so it can still be + reported and recorded for posterity. 
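The rule above for turning Measured= votes into a consensus Bandwidth= value can be sketched as follows. The names `low_median` and `consensus_bandwidth` are hypothetical. Using the low-median (ties broken toward the smaller value) follows the tie-breaking rule in section 3.8; the fallback to the low-median of the Bandwidth= values follows the consensus-method 5 rule, and the cap that later methods apply to unmeasured relays is left out.

```python
def low_median(values: list[int]) -> int:
    """Median with ties broken in favor of the smaller item."""
    ordered = sorted(values)
    return ordered[(len(ordered) - 1) // 2]

def consensus_bandwidth(votes: list[dict[str, int]]) -> tuple[int, bool]:
    """Return (Bandwidth value, unmeasured?) for one router.

    Each vote is a dict of the keywords on that vote's "w" line,
    e.g. {"Bandwidth": 120, "Measured": 95}.
    """
    if not votes:
        raise ValueError("no votes with a 'w' line for this router")
    measured = [vote["Measured"] for vote in votes if "Measured" in vote]
    if len(measured) >= 3:
        return low_median(measured), False
    # Fewer than 3 measurements: fall back to the voted Bandwidth= values.
    return low_median([vote["Bandwidth"] for vote in votes]), True
```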
+ + The ports listed in a "p" line should be taken as those ports for + which the router's exit policy permits 'most' addresses, ignoring any + accept that is not for all addresses and ignoring all rejects for private + netblocks. "Most" addresses are permitted if no more than 2^25 + IPv4 addresses (two /8 networks) are blocked. The list is encoded + as described in section 3.8.2. + +3.4.3. Serving bandwidth list files + + If an authority has used a bandwidth list file to generate a vote + document, it SHOULD make it available at + + http:///tor/status-vote/next/bandwidth.z + + at the start of each voting period. + + It MUST NOT attempt to send its bandwidth list file in an HTTP POST to + other authorities and it SHOULD NOT make bandwidth list files from other + authorities available. + + If an authority makes this file available, it MUST be the bandwidth file + used to create the vote document available at + + http:///tor/status-vote/next/authority.z + + To avoid inconsistent reads, authorities SHOULD only read the bandwidth + file once per voting period. Further processing and serving SHOULD use a + cached copy. + + The bandwidth list format is described in bandwidth-file-spec.txt. + + The standard URLs for bandwidth list files first appeared in + Tor 0.4.0.4-alpha. + +3.5. Downloading missing certificates from other directory authorities + + XXX when to download certificates. + +3.6. Downloading server descriptors from other directory authorities + + Periodically (currently, every 10 seconds), directory authorities check + whether there are any specific descriptors that they do not have and that + they are not currently trying to download. + Authorities identify them by hash in vote (if publication date is more + recent than the descriptor we currently have). + + [XXXX need a way to fetch descriptors ahead of the vote? v2 status docs can + do that for now.]
+ + If so, the directory authority launches requests to the authorities for these + descriptors, such that each authority is only asked for descriptors listed + in its most recent vote. If more + than one authority lists the descriptor, we choose which to ask at random. + + If one of these downloads fails, we do not try to download that descriptor + from the authority that failed to serve it again unless we receive a newer + network-status (consensus or vote) from that authority that lists the same + descriptor. + + Directory authorities must potentially cache multiple descriptors for each + router. Authorities must not discard any descriptor listed by any recent + consensus. If there is enough space to store additional descriptors, + authorities SHOULD try to hold those which clients are likely to download the + most. (Currently, this is judged based on the interval for which each + descriptor seemed newest.) +[XXXX define recent] + + Authorities SHOULD NOT download descriptors for routers that they would + immediately reject for reasons listed in section 3.2. + +3.7. Downloading extra-info documents from other directory authorities + + Periodically, an authority checks whether it is missing any extra-info + documents: in other words, if it has any server descriptors with an + extra-info-digest field that does not match any of the extra-info + documents currently held. If so, it downloads whatever extra-info + documents are missing. We follow the same splitting and back-off rules + as in section 3.6. + +3.8. Computing a consensus from a set of votes + + Given a set of votes, authorities compute the contents of the consensus. 
+ + The consensus status, along with as many signatures as the server + currently knows (see section 3.10 below), should be available at + + http:///tor/status-vote/next/consensus.z + + The contents of the consensus document are as follows: + + The "valid-after", "valid-until", and "fresh-until" times are taken as + the median of the respective values from all the votes. + + The times in the "voting-delay" line are taken as the median of the + VoteSeconds and DistSeconds times in the votes. + + Known-flags is the union of all flags known by any voter. + + Entries are given on the "params" line for every keyword on which a + majority of authorities (total authorities, not just those + participating in this vote) voted, or on which at least three + authorities voted. The values given are the + low-median of all votes on that keyword. + + (In consensus methods 7 to 11 inclusive, entries were given on + the "params" line for every keyword on which *any* authority voted, + the value given being the low-median of all votes on that keyword.) + + "client-versions" and "server-versions" are sorted in ascending + order; a version is recommended in the consensus if it is recommended + by more than half of the voting authorities that included a + client-versions or server-versions line in their votes. + + With consensus methods 19 through 33, a package line is generated for a + given PACKAGENAME/VERSION pair if at least three authorities list such a + package in their votes. (Call these lines the "input" lines for + PACKAGENAME.) The consensus will contain every "package" line that is + listed verbatim by more than half of the authorities listing a line for + the PACKAGENAME/VERSION pair, and no others. + + The authority item groups (dir-source, contact, fingerprint, + vote-digest) are taken from the votes of the voting + authorities. These groups are sorted by the digests of the + authorities' identity keys, in ascending order.
If the consensus + method is 3 or later, a dir-source line must be included for + every vote with a legacy-key entry, using the legacy-key's + fingerprint, the voter's ordinary nickname with the string + "-legacy" appended, and all other fields as from the original + vote's dir-source line. + + A router status entry: + * is included in the result if some router status entry with the same + identity is included by more than half of the authorities (total + authorities, not just those whose votes we have). + (Consensus method earlier than 21) + + * is included according to the rules in sections 3.8.0.1 and + 3.8.0.2 below. (Consensus method 22 or later) + + * For any given RSA identity digest, we include at most + one router status entry. + + * For any given Ed25519 identity, we include at most one router + status entry. + + * A router entry has a flag set if that flag is included by more than half + of the authorities who care about that flag. + + * Two router entries are "the same" if they have the same + tuple. + We choose the tuple for a given router as whichever tuple appears + for that router in the most votes. We break ties first in favor of + the more recently published, then in favor of the smaller server + descriptor digest. + + [ + * The Named flag appears if it is included for this routerstatus by + _any_ authority, and if all authorities that list it list the same + nickname. However, if consensus-method 2 or later is in use, and + any authority calls this identity/nickname pair Unnamed, then + this routerstatus does not get the Named flag. + + * If consensus-method 2 or later is in use, the Unnamed flag is + set for a routerstatus if any authorities have voted for a different + identity to be Named with that nickname, or if any authority + lists that nickname/ID pair as Unnamed. + + (With consensus-method 1, Unnamed is set like any other flag.) + + [But note that authorities no longer vote for the Named flag, + and the above two bullet points are now irrelevant.]
+ ] + + * The version is given as whichever version is listed by the most + voters, with ties decided in favor of more recent versions. + + * If consensus-method 4 or later is in use, then routers that + do not have the Running flag are not listed at all. + + * If consensus-method 5 or later is in use, then the "w" line + is generated using a low-median of the bandwidth values from + the votes that included "w" lines for this router. + + * If consensus-method 5 or later is in use, then the "p" line + is taken from the votes that have the same policy summary + for the descriptor we are listing. (They should all be the + same. If they are not, we pick the most commonly listed + one, breaking ties in favor of the lexicographically larger + vote.) The port list is encoded as specified in section 3.8.2. + + * If consensus-method 6 or later is in use and if 3 or more + authorities provide a Measured= keyword in their votes for + a router, the authorities produce a consensus containing a + Bandwidth= keyword equal to the median of the Measured= votes. + + * If consensus-method 7 or later is in use, the params line is + included in the output. + + * If the consensus method is under 11, bad exits are considered as + possible exits when computing bandwidth weights. Otherwise, if + method 11 or later is in use, any router that is determined to get + the BadExit flag doesn't count when we're calculating weights. + + * If consensus method 12 or later is used, only consensus + parameters that more than half of the total number of + authorities voted for are included in the consensus. + + [ As of 0.2.6.1-alpha, authorities no longer advertise or negotiate + any consensus methods lower than 13. ] + + * If consensus method 13 or later is used, microdesc consensuses + omit any router for which no microdesc was agreed upon. + + * If consensus method 14 or later is used, the ns consensus and + microdescriptors may include an "a" line for each router, listing + an IPv6 OR port. 
+ + * If consensus method 15 or later is used, microdescriptors + include "p6" lines containing IPv6 exit policies. + + * If consensus method 16 or later is used, ntor-onion-key + lines are included in microdescriptors. + + * If consensus method 17 or later is used, authorities impose a + maximum on the Bandwidth= values that they'll put on a 'w' + line for any router that doesn't have at least 3 measured + bandwidth values in votes. They also add an "Unmeasured=1" + flag to such 'w' lines. + + * If consensus method 18 or later is used, authorities include + "id" lines in microdescriptors. This method adds RSA ids. + + * If consensus method 19 or later is used, authorities may include + "package" lines in consensuses. + + * If consensus method 20 or later is used, authorities may include + GuardFraction information in microdescriptors. + + * If consensus method 21 or later is used, authorities may include + an "id" line for ed25519 identities in microdescriptors. + + [ As of 0.2.8.2-alpha, authorities no longer advertise or negotiate + consensus method 21, because it contains bugs. ] + + * If consensus method 22 or later is used, and the votes do not + produce a majority consensus about a relay's Ed25519 key (see + 3.8.0.1 below), the consensus must include a NoEdConsensus flag on + the "s" line for every relay whose listed Ed key does not reflect + consensus. + + * If consensus method 23 or later is used, authorities include + shared randomness protocol data in their votes and consensuses. + + * If consensus-method 24 or later is in use, then routers that + do not have the Valid flag are not listed at all. + + [ As of 0.3.4.1-alpha, authorities no longer advertise or negotiate + any consensus methods lower than 25. ] + + * If consensus-method 25 or later is in use, then we vote + on recommended-protocols and required-protocols lines in the + consensus. We also include protocols lines in routerstatus + entries.
+ + * If consensus-method 26 or later is in use, then we initialize + bandwidth weights to 1 in our calculations, to avoid + division-by-zero errors on unusual networks. + + * If consensus method 27 or later is used, the microdesc consensus + may include an "a" line for each router, listing an IPv6 OR port. + + [ As of 0.4.3.1-alpha, authorities no longer advertise or negotiate + any consensus methods lower than 28. ] + + * If consensus method 28 or later is used, microdescriptors no longer + include "a" lines. + + * If consensus method 29 or later is used, microdescriptor "family" + lines are canonicalized to improve compression. + + * If consensus method 30 or later is used, the base64 encoded + ntor-onion-key does not include the trailing = sign. + + * If consensus method 31 or later is used, authorities parse the + "bwweightscale" and "maxunmeasuredbw" parameters correctly when + computing votes. + + * If consensus method 32 or later is used, authorities handle the + "MiddleOnly" flag specially when computing a consensus. When the + voters agree to include "MiddleOnly" in a routerstatus, they + automatically remove "Exit", "Guard", "V2Dir", and "HSDir". If + the BadExit flag is included in the consensus, they automatically + add it to the routerstatus. + + * If consensus method 33 or later is used, and the consensus + flavor is "microdesc", then the "Publication" field in the "r" + line is set to "2038-01-01 00:00:00". + + * If consensus method 34 or later is used, the consensus + does not include any "package" lines. + + The signatures at the end of a consensus document are sorted in + ascending order by identity digest. + + All ties in computing medians are broken in favor of the smaller or + earlier item. + +3.8.0.1. Deciding which Ids to include. + + This sorting algorithm is used for consensus-method 22 and later. 
+ + First, consider each listing by tuple of identities, where 'Ed' + may be "None" if the voter included "id ed25519 none" to indicate that + the authority knows what ed25519 identities are, and thinks that the RSA + key doesn't have one. + + For each such tuple that is listed by more than half of the + total authorities (not just total votes), include it. (It is not + possible for any other to have as many votes.) If more + than half of the authorities list a single pair of this type, we + consider that Ed key to be "consensus"; see description of the + NoEdConsensus flag. + + Log any other id-RSA values corresponding to an id-Ed we included, and any + other id-Ed values corresponding to an id-RSA we included. + + For each that is not yet included, if it is listed by more than + half of the total authorities, and we do not already have it listed with + some , include it, but do not consider its Ed identity canonical. + +3.8.0.2. Deciding which descriptors to include + + Deciding which descriptors to include. + + A tuple belongs to an identity if it is a new tuple that + matches both ID parts, or if it is an old tuple (one with no Ed opinion) + that matches the RSA part. A tuple belongs to an identity if its + RSA identity matches. + + A tuple matches another tuple if all the fields that are present in both + tuples are the same. + + For every included identity, consider the tuples belonging to that + identity. Group them into sets of matching tuples. Include the tuple + that matches the largest set, breaking ties in favor of the most recently + published, and then in favor of the smaller server descriptor digest. + +3.8.1. Forward compatibility + + Future versions of Tor will need to include new information in the + consensus documents, but it is important that all authorities (or at least + half) generate and sign the same signed consensus. + + To achieve this, authorities list in their votes their supported methods + for generating consensuses from votes. 
Later methods will be assigned + higher numbers. Currently specified methods: + + "1" -- The first implemented version. + "2" -- Added support for the Unnamed flag. + "3" -- Added legacy ID key support to aid in authority ID key rollovers + "4" -- No longer list routers that are not running in the consensus + "5" -- adds support for "w" and "p" lines. + "6" -- Prefers measured bandwidth values rather than advertised + "7" -- Provides keyword=integer pairs of consensus parameters + "8" -- Provides microdescriptor summaries + "9" -- Provides weights for selecting flagged routers in paths + "10" -- Fixes edge case bugs in router flag selection weights + "11" -- Don't consider BadExits when calculating bandwidth weights + "12" -- Params are only included if enough auths voted for them + "13" -- Omit router entries with missing microdescriptors. + "14" -- Adds support for "a" lines in ns consensuses and microdescriptors. + "15" -- Adds support for "p6" lines. + "16" -- Adds ntor keys to microdescriptors + "17" -- Adds "Unmeasured=1" flags to "w" lines + "18" -- Adds 'id' to microdescriptors. + "19" -- Adds "package" lines to consensuses + "20" -- Adds GuardFraction information to microdescriptors. + "21" -- Adds Ed25519 keys to microdescriptors. + "22" -- Instantiates Ed25519 voting algorithm correctly. + "23" -- Adds shared randomness protocol data. + "24" -- No longer lists routers that are not Valid in the consensus. + "25" -- Vote on recommended-protocols and required-protocols. + "26" -- Initialize bandwidth weights to 1 to avoid division-by-zero. + "27" -- Adds support for "a" lines in microdescriptor consensuses. + "28" -- Removes "a" lines from microdescriptors. + "29" -- Canonicalizes families in microdescriptors. + "30" -- Removes padding from ntor-onion-key. + "31" -- Uses correct parsing for bwweightscale and maxunmeasuredbw + when computing weights + "32" -- Adds special handling for MiddleOnly flag. 
+ "33" -- Sets "publication" field in microdesc consensus "r" lines + to a meaningless value. + "34" -- Removes "package" lines from consensus. + + Before generating a consensus, an authority must decide which consensus + method to use. To do this, it looks for the highest version number + supported by more than 2/3 of the authorities voting. If it supports this + method, then it uses it. Otherwise, it falls back to the newest consensus + method that it supports (which will probably not result in a sufficiently + signed consensus). + + All authorities MUST support method 25; authorities SHOULD support + more recent methods as well. Authorities SHOULD NOT support or + advertise support for any method before 25. Clients MAY assume that + they will never see a current valid signed consensus for any method + before method 25. + + (The consensuses generated by new methods must be parsable by + implementations that only understand the old methods, and must not cause + those implementations to compromise their anonymity. This is a means for + making changes in the contents of consensus; not for making + backward-incompatible changes in their format.) + + The following methods have incorrect implementations; authorities SHOULD + NOT advertise support for them: + + "21" -- Did not correctly enable support for ed25519 key collation. + +3.8.2. Encoding port lists + + Whether the summary shows the list of accepted ports or the list of + rejected ports depends on which list is shorter (has a shorter string + representation). In case of ties we choose the list of accepted + ports. As an exception to this rule an allow-all policy is + represented as "accept 1-65535" instead of "reject " and a reject-all + policy is similarly given as "reject 1-65535". + + Summary items are compressed, that is instead of "80-88,89-100" there + only is a single item of "80-100", similarly instead of "20,21" a + summary will say "20-21". + + Port lists are sorted in ascending order. 
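The port-list rules above (ascending order, adjacent ranges merged, the shorter of the accept and reject lists chosen, ties going to accept) can be sketched as follows. This is a minimal illustration, not the reference implementation: the function names `compress_ports` and `summarize` are hypothetical, and the 1000-character limit is deliberately not handled.

```python
MAX_PORT = 65535


def _format_run(low, high):
    """Render one run of consecutive ports as "N" or "N-M"."""
    return str(low) if low == high else f"{low}-{high}"


def compress_ports(ports):
    """Return a sorted, range-compressed port list such as "20-21,80-100"."""
    items = []
    run_start = prev = None
    for port in sorted(set(ports)):
        if prev is not None and port == prev + 1:
            prev = port          # extend the current run
            continue
        if prev is not None:
            items.append(_format_run(run_start, prev))
        run_start = prev = port  # start a new run
    if prev is not None:
        items.append(_format_run(run_start, prev))
    return ",".join(items)


def summarize(accepted_ports):
    """Build a policy summary from the set of accepted ports (hypothetical helper)."""
    accepted = set(accepted_ports)
    rejected = set(range(1, MAX_PORT + 1)) - accepted
    if not rejected:             # allow-all exception
        return "accept 1-65535"
    if not accepted:             # reject-all exception
        return "reject 1-65535"
    acc = compress_ports(accepted)
    rej = compress_ports(rejected)
    # The shorter string representation wins; ties go to the accept list.
    return "accept " + acc if len(acc) <= len(rej) else "reject " + rej
```

For instance, `compress_ports` merges the ranges 80-88 and 89-100 into the single item "80-100", matching the example in the text.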
+ + The maximum allowed length of a policy summary (including the "accept " + or "reject ") is 1000 characters. If a summary exceeds that length we + use an accept-style summary and list as much of the port list as is + possible within these 1000 bytes. [XXXX be more specific.] + +3.8.3. Computing Bandwidth Weights + + Let weight_scale = 10000, or the value of the "bwweightscale" parameter. + (Before consensus method 31 there was a bug in parsing bwweightscale, so + that if there were any consensus parameters after it alphabetically, it + would always be treated as 10000. A similar bug existed for + "maxunmeasuredbw".) + + Starting with consensus method 26, G, M, E, and D are initialized to 1 and + T to 4. Prior consensus methods initialize them all to 0. With this change, + test tor networks that are small or new are much more likely to produce + bandwidth-weights in their consensus. The extra bandwidth has a negligible + impact on the bandwidth weights in the public tor network. + + Let G be the total bandwidth for Guard-flagged nodes. + Let M be the total bandwidth for non-flagged nodes. + Let E be the total bandwidth for Exit-flagged nodes. + Let D be the total bandwidth for Guard+Exit-flagged nodes. + Let T = G+M+E+D + + Let Wgd be the weight for choosing a Guard+Exit for the guard position. + Let Wmd be the weight for choosing a Guard+Exit for the middle position. + Let Wed be the weight for choosing a Guard+Exit for the exit position. + + Let Wme be the weight for choosing an Exit for the middle position. + Let Wmg be the weight for choosing a Guard for the middle position. + + Let Wgg be the weight for choosing a Guard for the guard position. + Let Wee be the weight for choosing an Exit for the exit position. 
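The totals G, M, E, D and T defined above can be sketched as below. This is an illustration under stated assumptions: the function name `bandwidth_totals` and the `(bandwidth, is_guard, is_exit)` tuple layout are hypothetical, and the sketch assumes the four classes are disjoint (a Guard+Exit relay counts only toward D). The initial value of 1 follows the consensus method 26 rule described above.

```python
def bandwidth_totals(relays):
    """Sum bandwidth per class.

    relays: iterable of (bandwidth, is_guard, is_exit) tuples
            (hypothetical layout, chosen for this sketch).
    Returns (G, M, E, D, T).
    """
    # Consensus method 26 and later: each class starts at 1, so T starts at 4.
    G = M = E = D = 1
    for bandwidth, is_guard, is_exit in relays:
        if is_guard and is_exit:
            D += bandwidth   # Guard+Exit-flagged
        elif is_guard:
            G += bandwidth   # Guard-flagged only (assumption: classes are disjoint)
        elif is_exit:
            E += bandwidth   # Exit-flagged only
        else:
            M += bandwidth   # non-flagged
    T = G + M + E + D
    return G, M, E, D, T
```

With an empty relay list the result is `(1, 1, 1, 1, 4)`, which is why small test networks can still produce bandwidth weights.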
+ + Balanced network conditions then arise from solutions to the following + system of equations: + + Wgg*G + Wgd*D == M + Wmd*D + Wme*E + Wmg*G (guard bw = middle bw) + Wgg*G + Wgd*D == Wee*E + Wed*D (guard bw = exit bw) + Wed*D + Wmd*D + Wgd*D == D (aka: Wed+Wmd+Wgd = weight_scale) + Wmg*G + Wgg*G == G (aka: Wgg = weight_scale-Wmg) + Wme*E + Wee*E == E (aka: Wee = weight_scale-Wme) + + We are short 2 constraints with the above set. The remaining constraints + come from examining different cases of network load. The following + constraints are used in consensus method 10 and above. There is another + incorrect and obsolete set of constraints used for these same cases in + consensus method 9. For those, see dir-spec.txt in Tor 0.2.2.10-alpha + to 0.2.2.16-alpha. + + Case 1: E >= T/3 && G >= T/3 (Neither Exit nor Guard Scarce) + + In this case, the additional two constraints are: Wmg == Wmd, + Wed == 1/3. + + This leads to the solution: + Wgd = weight_scale/3 + Wed = weight_scale/3 + Wmd = weight_scale/3 + Wee = (weight_scale*(E+G+M))/(3*E) + Wme = weight_scale - Wee + Wmg = (weight_scale*(2*G-E-M))/(3*G) + Wgg = weight_scale - Wmg + + Case 2: E < T/3 && G < T/3 (Both are scarce) + + Let R denote the more scarce class (Rare) between Guard vs Exit. + Let S denote the less scarce class. + + Subcase a: R+D < S + + In this subcase, we simply devote all of D bandwidth to the + scarce class. + + Wgg = Wee = weight_scale + Wmg = Wme = Wmd = 0; + if E < G: + Wed = weight_scale + Wgd = 0 + else: + Wed = 0 + Wgd = weight_scale + + Subcase b: R+D >= S + + In this case, if M <= T/3, we have enough bandwidth to try to achieve + a balancing condition.
+ + Add constraints Wgg = weight_scale, Wmd == Wgd to maximize bandwidth in + the guard position while still allowing exits to be used as middle nodes: + + Wee = (weight_scale*(E - G + M))/E + Wed = (weight_scale*(D - 2*E + 4*G - 2*M))/(3*D) + Wme = (weight_scale*(G-M))/E + Wmg = 0 + Wgg = weight_scale + Wmd = (weight_scale - Wed)/2 + Wgd = (weight_scale - Wed)/2 + + If this system ends up with any values out of range (i.e., negative, or + above weight_scale), use the constraints Wgg == weight_scale and Wee == + weight_scale, since both those positions are scarce: + + Wgg = weight_scale + Wee = weight_scale + Wed = (weight_scale*(D - 2*E + G + M))/(3*D) + Wmd = (weight_scale*(D - 2*M + G + E))/(3*D) + Wme = 0 + Wmg = 0 + Wgd = weight_scale - Wed - Wmd + + If M > T/3, then the Wmd weight above will become negative. Set it to 0 + in this case: + Wmd = 0 + Wgd = weight_scale - Wed + + Case 3: One of E < T/3 or G < T/3 + + Let S be the scarce class (of E or G). + + Subcase a: (S+D) < T/3: + if G=S: + Wgg = Wgd = weight_scale; + Wmd = Wed = Wmg = 0; + // Minor subcase, if E is more scarce than M, + // keep its bandwidth in place. + if (E < M) Wme = 0; + else Wme = (weight_scale*(E-M))/(2*E); + Wee = weight_scale-Wme; + if E=S: + Wee = Wed = weight_scale; + Wmd = Wgd = Wme = 0; + // Minor subcase, if G is more scarce than M, + // keep its bandwidth in place.
+ if (G < M) Wmg = 0; + else Wmg = (weight_scale*(G-M))/(2*G); + Wgg = weight_scale-Wmg; + + Subcase b: (S+D) >= T/3 + if G=S: + Add constraints Wgg = weight_scale, Wmd == Wed to maximize bandwidth + in the guard position, while still allowing exits to be + used as middle nodes: + Wgg = weight_scale + Wgd = (weight_scale*(D - 2*G + E + M))/(3*D) + Wmg = 0 + Wee = (weight_scale*(E+M))/(2*E) + Wme = weight_scale - Wee + Wmd = (weight_scale - Wgd)/2 + Wed = (weight_scale - Wgd)/2 + if E=S: + Add constraints Wee == weight_scale, Wmd == Wgd to maximize bandwidth + in the exit position: + Wee = weight_scale; + Wed = (weight_scale*(D - 2*E + G + M))/(3*D); + Wme = 0; + Wgg = (weight_scale*(G+M))/(2*G); + Wmg = weight_scale - Wgg; + Wmd = (weight_scale - Wed)/2; + Wgd = (weight_scale - Wed)/2; + + To ensure consensus, all calculations are performed using integer math + with a fixed precision determined by the bwweightscale consensus + parameter (defaults to 10000, Min: 1, Max: INT32_MAX). (See the note above + about the parsing bug in bwweightscale before consensus method 31.) + + For future balancing improvements, Tor clients support 11 additional weights + for directory requests and middle weighting. These weights are currently + set at weight_scale, with the exception of the following groups of + assignments: + + Directory requests use middle weights: + + Wbd=Wmd, Wbg=Wmg, Wbe=Wme, Wbm=Wmm + + Handle bridges and strange exit policies: + + Wgm=Wgg, Wem=Wee, Weg=Wed + +3.9. Computing consensus flavors + + Consensus flavors are variants of the consensus that clients can choose + to download and use instead of the unflavored consensus. The purpose + of a consensus flavor is to remove or replace information in the + unflavored consensus without forcing clients to download information + they would not use anyway. + + Directory authorities can produce and serve an arbitrary number of + flavors of the same consensus.
A downside of creating too many new + flavors is that clients will be distinguishable based on which flavor + they download. A new flavor should not be created when adding a field + instead wouldn't be too onerous. + + Examples for consensus flavors include: + + - Publishing hashes of microdescriptors instead of hashes of + full descriptors (see section 3.9.2). + - Including different digests of descriptors, instead of the + perhaps-soon-to-be-totally-broken SHA1. + + Consensus flavors are derived from the unflavored consensus once the + voting process is complete. This is to avoid consensus synchronization + problems. + + Every consensus flavor has a name consisting of a sequence of one + or more alphanumeric characters and dashes. For compatibility, + the original (unflavored) consensus type is called "ns". + + The supported consensus flavors are defined as part of the + authorities' consensus method. + + All consensus flavors have in common that their first line is + "network-status-version" where version is 3 or higher, and the flavor + is a string consisting of alphanumeric characters and dashes: + + "network-status-version" SP version [SP flavor] NL + +3.9.1. ns consensus + + The ns consensus flavor is equivalent to the unflavored consensus. + When the flavor is omitted from the "network-status-version" line, + it should be assumed to be "ns". Some implementations may explicitly + state that the flavor is "ns" when generating consensuses, but should + accept consensuses where the flavor is omitted. + +3.9.2. Microdescriptor consensus + + The microdescriptor consensus is a consensus flavor that contains + microdescriptor hashes instead of descriptor hashes and that omits + exit-policy summaries which are contained in microdescriptors. The + microdescriptor consensus was designed to contain elements that are + small and frequently changing. 
Clients use the information in the + microdescriptor consensus to decide which servers to fetch information + about and which servers to fetch information from. + + The microdescriptor consensus is based on the unflavored consensus with + the exceptions as follows: + + "network-status-version" SP version SP "microdesc" NL + + [At start, exactly once.] + + The flavor name of a microdescriptor consensus is "microdesc". + + Changes to router status entries are as follows: + + "r" SP nickname SP identity SP publication SP IP SP ORPort + SP DirPort NL + + [At start, exactly once.] + + Similar to "r" lines in section 3.4.1, but without the digest element. + + "a" SP address ":" port NL + + [Any number] + + Identical to the "a" lines in section 3.4.1. + + (Only included when the vote is generated with consensus-method 14 + or later, and the consensus is generated with consensus-method 27 or + later.) + + "p" ... NL + + [At most once] + + Not currently generated. + + Exit policy summaries are contained in microdescriptors and + therefore omitted in the microdescriptor consensus. + + "m" SP digest NL + + [Exactly once.*] + + "digest" is the base64 of the SHA256 hash of the router's + microdescriptor with trailing =s omitted. For a given router + descriptor digest and consensus method there should only be a + single microdescriptor digest in the "m" lines of all votes. + If different votes have different microdescriptor digests for + the same descriptor digest and consensus method, at least one + of the authorities is broken. If this happens, the microdesc + consensus should contain whichever microdescriptor digest is + most common. If there is no winner, we break ties in favor + of the lexically earliest. + + [*Before consensus method 13, this field was sometimes erroneously + omitted.] + + Additionally, a microdescriptor consensus SHOULD use the sha256 digest + algorithm for its signatures. + +3.10.
Exchanging detached signatures + + Once an authority has computed and signed a consensus network status, it + should send its detached signature to each other authority in an HTTP POST + request to the URL: + + http:///tor/post/consensus-signature + + [XXX Note why we support push-and-then-pull.] + + All of the detached signatures it knows for consensus status should be + available at: + + http:///tor/status-vote/next/consensus-signatures.z + + Assuming full connectivity, every authority should compute and sign the + same consensus including any flavors in each period. Therefore, it + isn't necessary to download the consensus or any flavors of it computed + by each authority; instead, the authorities only push/fetch each + others' signatures. A "detached signature" document contains items as + follows: + + "consensus-digest" SP Digest NL + + [At start, at most once.] + + The digest of the consensus being signed. + + "valid-after" SP YYYY-MM-DD SP HH:MM:SS NL + "fresh-until" SP YYYY-MM-DD SP HH:MM:SS NL + "valid-until" SP YYYY-MM-DD SP HH:MM:SS NL + + [As in the consensus] + + "additional-digest" SP flavor SP algname SP digest NL + + [Any number.] + + For each supported consensus flavor, every directory authority + adds one or more "additional-digest" lines. "flavor" is the name + of the consensus flavor, "algname" is the name of the hash + algorithm that is used to generate the digest, and "digest" is the + hex-encoded digest. + + The hash algorithm for the microdescriptor consensus flavor is + defined as SHA256 with algname "sha256". + + "additional-signature" SP flavor SP algname SP identity SP + signing-key-digest NL signature. + + [Any number.] + + For each supported consensus flavor and defined digest algorithm, + every directory authority adds an "additional-signature" line. + "flavor" is the name of the consensus flavor. "algname" is the + name of the algorithm that was used to hash the identity and + signing keys, and to compute the signature. 
"identity" is the + hex-encoded digest of the authority identity key of the signing + authority, and "signing-key-digest" is the hex-encoded digest of + the current authority signing key of the signing authority. + + The "sha256" signature format is defined as the RSA signature of + the OAEP+-padded SHA256 digest of the item to be signed. When + checking signatures, the signature MUST be treated as valid if the + signature material begins with SHA256(document), so that other + data can get added later. + [To be honest, I didn't fully understand the previous paragraph + and only copied it from the proposals. Review carefully. -KL] + + "directory-signature" + + [As in the consensus; the signature object is the same as in the + consensus document.] + +3.11. Publishing the signed consensus + + The voting period ends at the valid-after time. If the consensus has + been signed by a majority of authorities, these documents are made + available at + + http:///tor/status-vote/current/consensus.z + + and + + http:///tor/status-vote/current/consensus-signatures.z + + [XXX current/consensus-signatures is not currently implemented, as it + is not used in the voting protocol.] + + [XXX possible future features include support for downloading old + consensuses.] + + The other vote documents are analogously made available under + + http:///tor/status-vote/current/authority.z + http:///tor/status-vote/current/.z + http:///tor/status-vote/current/d/.z + http:///tor/status-vote/current/bandwidth.z + + once the voting period ends, regardless of the number of signatures. + + The authorities serve another consensus of each flavor "F" from the + locations + + /tor/status-vote/(current|next)/consensus-F.z. and + /tor/status-vote/(current|next)/consensus-F/+....z. + + The standard URLs for bandwidth list files first-appeared in Tor 0.3.5. + +4. Directory cache operation + + All directory caches implement this section, except as noted. + +4.1. 
Downloading consensus status documents from directory authorities + + All directory caches try to keep a recent + network-status consensus document to serve to clients. A cache ALWAYS + downloads a network-status consensus if any of the following are true: + + - The cache has no consensus document. + - The cache's consensus document is no longer valid. + + Otherwise, the cache downloads a new consensus document at a randomly + chosen time in the first half-interval after its current consensus + stops being fresh. (This time is chosen at random to avoid swarming + the authorities at the start of each period. The interval size is + inferred from the difference between the valid-after time and the + fresh-until time on the consensus.) + + [For example, if a cache has a consensus that became valid at 1:00, + and is fresh until 2:00, that cache will fetch a new consensus at + a random time between 2:00 and 2:30.] + + Directory caches also fetch consensus flavors from the authorities. + Caches check the correctness of consensus flavors, but do not check + anything about an unrecognized consensus document beyond its digest and + length. Caches serve all consensus flavors from the same locations as + the directory authorities. + +4.2. Downloading server descriptors from directory authorities + + Periodically (currently, every 10 seconds), directory caches check + whether there are any specific descriptors that they do not have and that + they are not currently trying to download. Caches identify these + descriptors by hash in the recent network-status consensus documents. + + If so, the directory cache launches requests to the authorities for these + descriptors. + + If one of these downloads fails, we do not try to download that descriptor + from the authority that failed to serve it again unless we receive a newer + network-status consensus that lists the same descriptor. + + Directory caches must potentially cache multiple descriptors for each + router. 
Caches must not discard any descriptor listed by any recent + consensus. If there is enough space to store additional descriptors, + caches SHOULD try to hold those which clients are likely to download the + most. (Currently, this is judged based on the interval for which each + descriptor seemed newest.) + + [XXXX define recent] + +4.3. Downloading microdescriptors from directory authorities + + Directory mirrors should fetch, cache, and serve each microdescriptor + from the authorities. + + The microdescriptors with base64 hashes ,, are available + at: + + http:///tor/micro/d/--[.z] + + are base64 encoded with trailing =s omitted for size and for + consistency with the microdescriptor consensus format. -s are used + instead of +s to separate items, since the + character is used in + base64 encoding. + + Directory mirrors should check to make sure that the microdescriptors + they're about to serve match the right hashes (either the hashes from + the fetch URL or the hashes from the consensus, respectively). + + (NOTE: Due to squid proxy url limitations at most 92 microdescriptor hashes + can be retrieved in a single request.) + +4.4. Downloading extra-info documents from directory authorities + + Any cache that chooses to cache extra-info documents should implement this + section. + + Periodically, the Tor instance checks whether it is missing any extra-info + documents: in other words, if it has any server descriptors with an + extra-info-digest field that does not match any of the extra-info + documents currently held. If so, it downloads whatever extra-info + documents are missing. Caches download from authorities. We follow the + same splitting and back-off rules as in section 4.2. + +4.5. Consensus diffs + + Instead of downloading an entire consensus, clients may download + a "diff" document containing an ed-style diff from a previous + consensus document. Caches (and authorities) make these diffs as + they learn about new consensuses. 
To do so, they must store a + record of older consensuses. + + (Support for consensus diffs was added in 0.3.1.1-alpha, and is + advertised with the DirCache protocol version "2" or later.) + +4.5.1. Consensus diff format + + Consensus diffs are formatted as follows: + + The first line is "network-status-diff-version 1" NL + + The second line is + + "hash" SP FromDigest SP ToDigest NL + + where FromDigest is the hex-encoded SHA3-256 digest of the _signed + part_ of the consensus that the diff should be applied to, and + ToDigest is the hex-encoded SHA3-256 digest of the _entire_ + consensus resulting from applying the diff. (See 3.4.1 for + information on what part of a consensus is signed.) + + The third and subsequent lines encode the diff from FromDigest to + ToDigest in a limited subset of the ed diff format, as specified + in appendix E. + +4.5.2. Serving and requesting diffs. + + When downloading the current consensus, a client may include an + HTTP header of the form + + X-Or-Diff-From-Consensus: HASH1, HASH2, ... + + where the HASH values are hex-encoded SHA3-256 digests of the + _signed part_ of one or more consensuses that the client knows + about. + + If a cache knows a consensus diff from one of those consensuses + to the most recent consensus of the requested flavor, it may + send that diff instead of the specified consensus. + + Caches also serve diffs from the URIs: + + /tor/status-vote/current/consensus/diff//.z + /tor/status-vote/current/consensus-/diff//.z + + where FLAVOR is the consensus flavor, defaulting to "ns", and + FPRLIST is a +-separated list of recognized authority identity + fingerprints as in appendix B. + +4.6. Retrying failed downloads + + See section 5.5 below; it applies to caches as well as clients. + +5. Client operation + + Every Tor that is not a directory server (that is, those that do + not have a DirPort set) implements this section. + +5.1.
Downloading network-status documents + + Each client maintains a list of directory authorities. Insofar as + possible, clients SHOULD all use the same list. + + [Newer versions of Tor (0.2.8.1-alpha and later): + Each client also maintains a list of default fallback directory mirrors + (fallbacks). Each released version of Tor MAY have a different list, + depending on the mirrors that satisfy the fallback directory criteria at + release time.] + + Clients try to have a live consensus network-status document at all times. + A network-status document is "live" if the time in its valid-after field + has passed, and the time in its valid-until field has not passed. + + When a client has no consensus network-status document, it downloads it + from a randomly chosen fallback directory mirror or authority. Clients + prefer fallbacks to authorities, trying them earlier and more frequently. + In all other cases, the client downloads from caches randomly chosen from + among those believed to be V3 directory servers. (This information comes + from the network-status documents.) + + After receiving any response the client MUST discard any network-status + documents that it did not request. + + On failure, the client waits briefly, then tries that network-status + document again from another cache. The client does not build circuits + until it has a live network-status consensus document, and it has + descriptors for a significant proportion of the routers that it believes + are running (this is configurable using torrc options and consensus + parameters). + + [Newer versions of Tor (0.2.6.2-alpha and later): + If the consensus contains Exits (the typical case), Tor will build both + exit and internal circuits. When bootstrap completes, Tor will be ready + to handle an application requesting an exit circuit to services like the + World Wide Web. + + If the consensus does not contain Exits, Tor will only build internal + circuits.
In this case, earlier statuses will have included "internal" + as indicated above. When bootstrap completes, Tor will be ready to handle + an application requesting an internal circuit to hidden services at + ".onion" addresses. + + If a future consensus contains Exits, exit circuits may become available.] + + (Note: clients can and should pick caches based on the network-status + information they have: once they have first fetched network-status info + from an authority or fallback, they should not need to go to the authority + directly again, and should only choose the fallback at random, based on its + consensus weight in the current consensus.) + + To avoid swarming the caches whenever a consensus expires, the + clients download new consensuses at a randomly chosen time after the + caches are expected to have a fresh consensus, but before their + consensus will expire. (This time is chosen uniformly at random from + the interval between the time 3/4 into the first interval after the + consensus is no longer fresh, and 7/8 of the time remaining after + that before the consensus is invalid.) + + [For example, if a client has a consensus that became valid at 1:00, + and is fresh until 2:00, and expires at 4:00, that client will fetch + a new consensus at a random time between 2:45 and 3:50, since 3/4 + of the one-hour interval is 45 minutes, and 7/8 of the remaining 75 + minutes is 65 minutes.] + + Clients may choose to download the microdescriptor consensus instead + of the general network status consensus. In that case they should use + the same update strategy as for the normal consensus. They should not + download more than one consensus flavor. + + When a client does not have a live consensus, it will generally use the + most recent consensus it has if that consensus is "reasonably live". A + "reasonably live" consensus is one that expired less than 24 hours ago. + +5.2. 
Downloading server descriptors or microdescriptors + + Clients try to have the best descriptor for each router. A descriptor is + "best" if: + + * It is listed in the consensus network-status document. + + Periodically (currently every 10 seconds) clients check whether there are + any "downloadable" descriptors. A descriptor is downloadable if: + + - It is the "best" descriptor for some router. + - The descriptor was published at least 10 minutes in the past. + (This prevents clients from trying to fetch descriptors that the + mirrors have probably not yet retrieved and cached.) + - The client does not currently have it. + - The client is not currently trying to download it. + - The client would not discard it immediately upon receiving it. + - The client thinks it is running and valid (see section 5.4.1 below). + + If at least 16 known routers have downloadable descriptors, or if + enough time (currently 10 minutes) has passed since the last time the + client tried to download descriptors, it launches requests for all + downloadable descriptors. + + When downloading multiple server descriptors, the client chooses multiple + mirrors so that: + + - At least 3 different mirrors are used, except when this would result + in more than one request for under 4 descriptors. + - No more than 128 descriptors are requested from a single mirror. + - Otherwise, as few mirrors as possible are used. + After choosing mirrors, the client divides the descriptors among them + randomly. + + After receiving any response the client MUST discard any descriptors that + it did not request. + + When a descriptor download fails, the client notes it, and does not + consider the descriptor downloadable again until a certain amount of time + has passed. (Currently 0 seconds for the first failure, 60 seconds for the + second, 5 minutes for the third, 10 minutes for the fourth, and 1 day + thereafter.) Periodically (currently once an hour) clients reset the + failure count. 
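The retry schedule for failed descriptor downloads described above can be sketched as a simple lookup. The names `FAILURE_DELAYS`, `ONE_DAY` and `retry_delay` are hypothetical and chosen only for this illustration; the delay values are the "current" ones quoted in the text. The hourly reset of the failure count is not modeled.

```python
# Delays, in seconds, after the 1st, 2nd, 3rd and 4th failure.
FAILURE_DELAYS = [0, 60, 5 * 60, 10 * 60]
ONE_DAY = 24 * 60 * 60  # delay after the 5th and every later failure


def retry_delay(failure_count):
    """Seconds before a descriptor becomes downloadable again.

    failure_count is 1 for the first failed download, 2 for the second, etc.
    """
    if failure_count < 1:
        raise ValueError("failure_count must be at least 1")
    if failure_count <= len(FAILURE_DELAYS):
        return FAILURE_DELAYS[failure_count - 1]
    return ONE_DAY
```

Under this sketch a descriptor that has failed three times is retried after 300 seconds, and one that has failed five or more times waits a full day unless the failure count is reset first.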
+ + Clients retain the most recent descriptor they have downloaded for each + router so long as it is listed in the consensus. If it is not listed, + they keep it so long as it is not too old (currently, ROUTER_MAX_AGE=48 + hours) and no better router descriptor has been downloaded for the same + relay. Caches retain descriptors until they are at least + OLD_ROUTER_DESC_MAX_AGE=5 days old. + + Clients that choose to download the microdescriptor consensus instead + of the general consensus must download the referenced microdescriptors + instead of server descriptors. Clients fetch and cache + microdescriptors preemptively from dir mirrors when starting up, like + they currently fetch descriptors. After bootstrapping, clients only + need to fetch the microdescriptors that have changed. + + When a client gets a new microdescriptor consensus, it looks to see if + there are any microdescriptors it needs to learn, and launches a request + for them. + + Clients maintain a cache of microdescriptors along with metadata such as + when each was last referenced by a consensus, and which identity key + it corresponds to. They keep a microdescriptor until it hasn't been + mentioned in any consensus for a week. Future clients might cache them + for longer or shorter times. + +5.3. Downloading extra-info documents + + Any client that uses extra-info documents should implement this + section. + + Note that generally, clients don't need extra-info documents. + + Periodically, the Tor instance checks whether it is missing any extra-info + documents: in other words, if it has any server descriptors with an + extra-info-digest field that does not match any of the extra-info + documents currently held. If so, it downloads whatever extra-info + documents are missing. Clients try to download from caches. + We follow the same splitting and back-off rules as in section 5.2. + +5.4. Using directory information + + [XXX This subsection really belongs in path-spec.txt, not here.
-KL] + + Everyone besides directory authorities uses the approaches in this section + to decide which relays to use and what their keys are likely to be. + (Directory authorities just believe their own opinions, as in section 3.4.2 + above.) + +5.4.1. Choosing routers for circuits. + + Circuits SHOULD NOT be built until the client has enough directory + information: a live consensus network status [XXXX fallback?] and + descriptors for at least 1/4 of the relays believed to be running. + + A relay is "listed" if it is included by the consensus network-status + document. Clients SHOULD NOT use unlisted relays. + + These flags are used as follows: + + - Clients SHOULD NOT use non-'Valid' or non-'Running' routers unless + requested to do so. + + - Clients SHOULD NOT use non-'Fast' routers for any purpose other than + very-low-bandwidth circuits (such as introduction circuits). + + - Clients SHOULD NOT use non-'Stable' routers for circuits that are + likely to need to be open for a very long time (such as those used for + IRC or SSH connections). + + - Clients SHOULD NOT choose non-'Guard' nodes when picking entry guard + nodes. + + See the "path-spec.txt" document for more details. + +5.4.2. Managing naming + + (This section is removed; authorities no longer assign the 'Named' flag.) + +5.4.3. Software versions + + An implementation of Tor SHOULD warn when it has fetched a consensus + network-status, and it is running a software version not listed. + +5.4.4. Warning about a router's status. + + (This section is removed; authorities no longer assign the 'Named' flag.) + +5.5. Retrying failed downloads + + This section applies to caches as well as to clients. + + When a client fails to download a resource (a consensus, a router + descriptor, a microdescriptor, etc) it waits for a certain amount of + time before retrying the download. To determine the amount of time + to wait, clients use a randomized exponential backoff algorithm. 
+ (Specifically, they use a variation of the "decorrelated jitter" + algorithm from + https://aws.amazon.com/blogs/architecture/exponential-backoff-and-jitter/ .) + + The specific formula used to compute the 'i+1'th delay is: + + Delay_{i+1} = MIN(cap, random_between(lower_bound, upper_bound)) + where upper_bound = MAX(lower_bound+1, Delay_i * 3) + lower_bound = MAX(1, base_delay). + + The value of 'cap' is set to INT_MAX; the value of 'base_delay' + depends on what is being downloaded, whether the client is fully + bootstrapped, how the client is configured, and where it is + downloading from. Current base_delay values are: + + Consensus objects, as a non-bridge cache: + 0 (TestingServerConsensusDownloadInitialDelay) + + Consensus objects, as a client or bridge that has bootstrapped: + 0 (TestingClientConsensusDownloadInitialDelay) + + Consensus objects, as a client or bridge that is bootstrapping, + when connecting to an authority because no "fallback" caches are + known: + 0 (ClientBootstrapConsensusAuthorityOnlyDownloadInitialDelay) + + Consensus objects, as a client or bridge that is bootstrapping, + when "fallback" caches are known but connecting to an authority + anyway: + 6 (ClientBootstrapConsensusAuthorityDownloadInitialDelay) + + Consensus objects, as a client or bridge that is bootstrapping, + when downloading from a "fallback" cache: + 0 (ClientBootstrapConsensusFallbackDownloadInitialDelay) + + Bridge descriptors, as a bridge-using client when at least one bridge + is usable: + 10800 (TestingBridgeDownloadInitialDelay) + + Bridge descriptors, otherwise: + 0 (TestingBridgeBootstrapDownloadInitialDelay) + + Other objects, as cache or authority: + 0 (TestingServerDownloadInitialDelay) + + Other objects, as client: + 0 (TestingClientDownloadInitialDelay) + + +6. Standards compliance + + All clients and servers MUST support HTTP 1.0. Clients and servers MAY + support later versions of HTTP as well. + +6.1.
HTTP headers + + Servers SHOULD set Content-Encoding to the algorithm used to compress the + document(s) being served. Recognized algorithms are: + + - "identity" -- RFC2616 section 3.5 + - "deflate" -- RFC2616 section 3.5 + - "gzip" -- RFC2616 section 3.5 + - "x-zstd" -- The zstandard compression algorithm (www.zstd.net) + - "x-tor-lzma" -- The lzma compression algorithm, with a "preset" + value no higher than 6. + + Clients SHOULD use Accept-Encoding on most directory requests to indicate + which of the above compression algorithms they support. If they omit it + (as Tor clients did before 0.3.1.1-alpha), then the server should serve + only "deflate" or "identity" encoded documents, based on the presence or + absence of the ".z" suffix on the requested URL. + + Note that for anonymous directory requests (that is, requests made over + multi-hop circuits, like those for onion service lookups) implementations + SHOULD NOT advertise any Accept-Encoding values other than deflate. To do + so would be to create a fingerprinting opportunity. + + When receiving multiple documents, clients MUST accept compressed + concatenated documents and concatenated compressed documents as + equivalent. + + Servers MAY set the Content-Length: header. When they do, it should + match the number of compressed bytes that they are sending. + + Servers MAY include an X-Your-Address-Is: header, whose value is the + apparent IP address of the client connecting to them (as a dotted quad). + For directory connections tunneled over a BEGIN_DIR stream, servers SHOULD + report the IP from which the circuit carrying the BEGIN_DIR stream reached + them. + + Servers SHOULD disable caching of multiple network statuses or multiple + server descriptors. Servers MAY enable caching of single descriptors, + single network statuses, the list of all server descriptors, a v1 + directory, or a v1 running routers document. XXX mention times. + +6.2. HTTP status codes + + Tor delivers the following status codes. 
Some were chosen without much + thought; other code SHOULD NOT rely on specific status codes yet. + + 200 -- the operation completed successfully + -- the user requested statuses or serverdescs, and none of the ones + requested were found (0.2.0.4-alpha and earlier). + + 304 -- the client specified an if-modified-since time, and none of the + requested resources have changed since that time. + + 400 -- the request is malformed, or + -- the URL is for a malformed variation of one of the URLs we support, + or + -- the client tried to post to a non-authority, or + -- the authority rejected a malformed posted document. + + 404 -- the requested document was not found. + -- the user requested statuses or serverdescs, and none of the ones + requested were found (0.2.0.5-alpha and later). + + 503 -- we are declining the request in order to save bandwidth + -- user requested some items that we ordinarily generate or store, + but we do not have any available. + +A. Consensus-negotiation timeline. + + Period begins: this is the Published time. + Everybody sends votes + Reconciliation: everybody tries to fetch missing votes. + consensus may exist at this point. + End of voting period: + everyone swaps signatures. + Now it's okay for caches to download + Now it's okay for clients to download. + + Valid-after/valid-until switchover + +B. General-use HTTP URLs + + "Fingerprints" in these URLs are base16-encoded SHA1 hashes. + + The most recent v3 consensus should be available at: + + http:///tor/status-vote/current/consensus.z + + Similarly, the v3 microdescriptor consensus should be available at: + + http:///tor/status-vote/current/consensus-microdesc.z + + Starting with Tor version 0.2.1.1-alpha, the consensus is also available at: + + http:///tor/status-vote/current/consensus/F1+F2+F3.z + + (NOTE: Due to squid proxy url limitations at most 96 fingerprints can be + retrieved in a single request.) + + Where F1, F2, etc. are authority identity fingerprints the client trusts.
+ Servers will only return a consensus if more than half of the requested + authorities have signed the document, otherwise a 404 error will be sent + back. The fingerprints can be shortened to a length of any multiple of + two, using only the leftmost part of the encoded fingerprint. Tor uses + 3 bytes (6 hex characters) of the fingerprint. + + Clients SHOULD sort the fingerprints in ascending order. Server MUST + accept any order. + + Clients SHOULD use this format when requesting consensus documents from + directory authority servers and from caches running a version of Tor + that is known to support this URL format. + + A concatenated set of all the current key certificates should be available + at: + + http:///tor/keys/all.z + + The key certificate for this server should be available at: + + http:///tor/keys/authority.z + + The key certificate for an authority whose authority identity fingerprint + is should be available at: + + http:///tor/keys/fp/.z + + The key certificate whose signing key fingerprint is should be + available at: + + http:///tor/keys/sk/.z + + The key certificate whose identity key fingerprint is and whose signing + key fingerprint is should be available at: + + http:///tor/keys/fp-sk/-.z + + (As usual, clients may request multiple certificates using: + + http:///tor/keys/fp-sk/-+-.z ) + + [The above fp-sk format was not supported before Tor 0.2.1.9-alpha.] + + The most recent descriptor for a server whose identity key has a + fingerprint of should be available at: + + http:///tor/server/fp/.z + + The most recent descriptors for servers with identity fingerprints + ,, should be available at: + + http:///tor/server/fp/++.z + + (NOTE: Due to squid proxy url limitations at most 96 fingerprints can be + retrieved in a single request. + + Implementations SHOULD NOT download descriptors by identity key + fingerprint. 
This allows a corrupted server (in collusion with a cache) to + provide a unique descriptor to a client, and thereby partition that client + from the rest of the network.) + + The server descriptor with (descriptor) digest (in hex) should be + available at: + + http:///tor/server/d/.z + + The most recent descriptors with digests ,, should be + available at: + + http:///tor/server/d/++.z + + The most recent descriptor for this server should be at: + + http:///tor/server/authority.z + + This is used for authorities, and also if a server is configured + as a bridge. The official Tor implementations (starting at + 0.1.1.x) use this resource to test whether a server's own DirPort + is reachable. It is also useful for debugging purposes. + + A concatenated set of the most recent descriptors for all known servers + should be available at: + + http:///tor/server/all.z + + Extra-info documents are available at the URLs + + http:///tor/extra/d/... + http:///tor/extra/fp/... + http:///tor/extra/all[.z] + http:///tor/extra/authority[.z] + (As for /tor/server/ URLs: supports fetching extra-info + documents by their digest, by the fingerprint of their servers, + or all at once. When serving by fingerprint, we serve the + extra-info that corresponds to the descriptor we would serve by + that fingerprint. Only directory authorities of version + 0.2.0.1-alpha or later are guaranteed to support the first + three classes of URLs. Caches may support them, and MUST + support them if they have advertised "caches-extra-info".) + + For debugging, directories SHOULD expose non-compressed objects at + URLs like the above, but without the final ".z". If the client uses + the Accept-Encoding header, it should override the presence or absence + of the ".z" (see section 6.1). + + Clients SHOULD use upper case letters (A-F) when base16-encoding + fingerprints. Servers MUST accept both upper and lower case fingerprints + in requests. + +C.
Converting a curve25519 public key to an ed25519 public key + + Given an X25519 key, that is, an affine point (u,v) on the + Montgomery curve defined by + + bv^2 = u(u^2 + au +1) + + where + + a = 486662 + b = 1 + + and comprised of the compressed form (i.e. consisting of only the + u-coordinate), we can retrieve the y-coordinate of the affine point + (x,y) on the twisted Edwards form of the curve defined by + + -x^2 + y^2 = 1 + d x^2 y^2 + + where + + d = - 121665/121666 + + by computing + + y = (u-1)/(u+1). + + and then we can apply the usual curve25519 twisted Edwards point + decompression algorithm to find _an_ x-coordinate of an affine + twisted Edwards point to check signatures with. Signing keys for + ed25519 are compressed curve points in twisted Edwards form (so a + y-coordinate and the sign of the x-coordinate), and X25519 keys are + compressed curve points in Montgomery form (i.e. a u-coordinate). + + However, note that a compressed point in Montgomery form neglects to + encode what the sign of the corresponding twisted Edwards + x-coordinate would be. Thus, we need the sign of the x-coordinate + to do this operation; otherwise, we'll have two possible + x-coordinates that might correspond to the ed25519 public key. + + To get the sign, the easiest way is to take the corresponding + private key, feed it to the ed25519 public key generation + algorithm, and see what the sign is. + + [Recomputing the sign bit from the private key every time sounds + rather strange and inefficient to me… —isis] + + Note that in addition to its coordinates, an expanded Ed25519 private key + also has a 32-byte random value, "prefix", used to compute internal `r` + values in the signature. For security, this prefix value should be + derived deterministically from the curve25519 key.
The Tor + implementation derives it as SHA512(private_key | STR)[0..32], where + STR is the nul-terminated string: + + "Derive high part of ed25519 key from curve25519 key\0" + + + On the client side, where there is no access to the curve25519 private + keys, one may use the curve25519 public key's Montgomery u-coordinate to + recover the Montgomery v-coordinate by computing the right-hand side of + the Montgomery curve equation: + + bv^2 = u(u^2 + au +1) + + where + + a = 486662 + b = 1 + + Then, knowing the intended sign of the Edwards x-coordinate, one + may recover said x-coordinate by computing: + + x = (u/v) * sqrt(-a - 2) + +D. Inferring missing proto lines. + + The directory authorities no longer allow versions of Tor before + 0.2.4.18-rc. But right now, there is no version of Tor in the consensus + before 0.2.4.19. Therefore, we should disallow versions of Tor earlier + than 0.2.4.19, so that we can have the protocol list for all current Tor + versions include: + + Cons=1-2 Desc=1-2 DirCache=1 HSDir=1 HSIntro=3 HSRend=1-2 Link=1-4 + LinkAuth=1 Microdesc=1-2 Relay=1-2 + + For Desc, Microdesc and Cons, Tor versions before 0.2.7.stable should be + taken to only support version 1. + +E. Limited ed diff format + + We support the following format for consensus diffs. It's a + subset of the ed diff format, but clients MUST NOT accept other + ed commands. + + We support the following ed commands, each on a line by itself: + + - "n1d" Delete line n1 + - "n1,n2d" Delete lines n1 through n2, inclusive + - "n1,$d" Delete lines n1 through the end of the file, inclusive. + - "n1c" Replace line n1 with the following block + - "n1,n2c" Replace lines n1 through n2, inclusive, with the + following block. + - "n1a" Append the following block after line n1. + + Note that line numbers always apply to the file after all previous + commands have already been applied. Note also that line numbers + are 1-indexed.
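The command set of the limited ed diff format is small enough to sketch directly. The following is a minimal illustration in Python, not the Tor implementation: the name apply_ed_diff is hypothetical, each command is assumed to be written as its line number(s) followed by the command letter (as in standard ed), and blocks are assumed to end at a line holding a single period. The bounds checks are likewise an assumption about reasonable strictness.

```python
import re

# Hypothetical helper for illustration; not taken from the Tor source.
# Commands are assumed to look like "N", "N,M" or "N,$" followed by one
# of the letters a, c, d.
_COMMAND = re.compile(r"^(\d+)(?:,(\d+|\$))?([acd])$")


def apply_ed_diff(original, diff_lines):
    """Apply a limited ed diff to a list of lines (no trailing newlines)."""
    result = list(original)
    commands = iter(diff_lines)
    for line in commands:
        match = _COMMAND.match(line)
        if match is None:
            # Anything outside the supported subset is rejected.
            raise ValueError("unsupported ed command: %r" % line)
        first = int(match.group(1))
        last_token, op = match.group(2), match.group(3)
        if op == "a" and last_token is not None:
            raise ValueError("append takes a single line number")
        if last_token == "$" and op != "d":
            raise ValueError("'$' is only allowed with the delete command")
        if last_token is None:
            last = first
        elif last_token == "$":
            last = len(result)
        else:
            last = int(last_token)
        # Line numbers are 1-indexed and must refer to existing lines.
        if not 1 <= first <= last <= len(result):
            raise ValueError("line range out of bounds: %r" % line)

        # "a" and "c" are followed by a block that ends at a lone ".".
        block = []
        if op in ("a", "c"):
            for text in commands:
                if text == ".":
                    break
                block.append(text)
            else:
                raise ValueError("unterminated block")

        if op == "a":
            result[first:first] = block      # insert after line `first`
        else:
            result[first - 1:last] = block   # delete, or replace with block
    return result


old = ["a", "b", "c", "d", "e"]
diff = ["4,$d", "2c", "B", ".", "1a", "x", "."]
print(apply_ed_diff(old, diff))  # ['a', 'x', 'B', 'c']
```

Because the commands in the example run from the back of the file to the front, every line number still refers to a position in the original file when its command is applied.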
+ + The commands MUST apply to the file from back to front, such that + lines are only ever referred to by their position in the original + file. + + If there are any directory signatures on the original document, the + first command MUST be a ",$d" form to remove all of the directory + signatures. Using this format ensures that the client will + successfully apply the diff even if they have an unusual encoding for + the signatures. + + The replace and append command take blocks. These blocks are simply + appended to the diff after the line with the command. A line with + just a period (".") ends the block (and is not part of the lines + to add). Note that it is impossible to insert a line with just + a single dot. diff --git a/attic/text_formats/ext-orport-spec.txt b/attic/text_formats/ext-orport-spec.txt new file mode 100644 index 0000000..6b8f8e1 --- /dev/null +++ b/attic/text_formats/ext-orport-spec.txt @@ -0,0 +1,226 @@ + Extended ORPort for pluggable transports + George Kadianakis, Nick Mathewson + +Table of Contents + + 1. Overview + 2. Establishing a connection and authenticating. + 2.1. Authentication type: SAFE_COOKIE + 2.1.2. Cookie-file format + 2.1.3. SAFE_COOKIE Protocol specification + 3. The extended ORPort protocol + 3.1. Protocol + 3.2. Command descriptions + 3.2.1. USERADDR + 3.2.2. TRANSPORT + 4. Security Considerations + +1. Overview + + This document describes the "Extended ORPort" protocol, a wrapper + around Tor's ordinary ORPort protocol for use by bridges that + support pluggable transports. It provides a way for server-side PTs + and bridges to exchange additional information before beginning + the actual OR connection. + + See `tor-spec.txt` for information on the regular OR protocol, and + `pt-spec.txt` for information on pluggable transports. + + This protocol was originally proposed in proposal 196, and + extended with authentication in proposal 217. + +2. Establishing a connection and authenticating. 
+ + When a client (that is to say, a server-side pluggable transport) + connects to an Extended ORPort, the server sends: + + AuthTypes [variable] + EndAuthTypes [1 octet] + + Where, + + + AuthTypes are the authentication schemes that the server supports + for this session. They are multiple concatenated 1-octet values that + take values from 1 to 255. + + EndAuthTypes is the special value 0. + + The client reads the list of supported authentication schemes, + chooses one, and sends it back: + + AuthType [1 octet] + + Where, + + + AuthType is the authentication scheme that the client wants to use + for this session. A valid authentication type takes values from 1 to + 255. A value of 0 means that the client did not like the + authentication types offered by the server. + + If the client sent an AuthType of value 0, or an AuthType that the + server does not support, the server MUST close the connection. + +2.1. Authentication type: SAFE_COOKIE + + We define one authentication type: SAFE_COOKIE. Its AuthType + value is 1. It is based on the client proving to the bridge that + it can access a given "cookie" file on disk. The purpose of + authentication is to defend against cross-protocol attacks. + + If the Extended ORPort is enabled, Tor should regenerate the cookie + file on startup and store it in + $DataDirectory/extended_orport_auth_cookie. + + The location of the cookie can be overridden by using the + configuration file parameter ExtORPortCookieAuthFile, which is + defined as: + + ExtORPortCookieAuthFile + + where is a filesystem path. + +2.1.2. Cookie-file format + + The format of the cookie-file is: + + StaticHeader [32 octets] + Cookie [32 octets] + + Where, + + StaticHeader is the following string: + "! Extended ORPort Auth Cookie !\x0a" + + Cookie is the shared-secret. During the SAFE_COOKIE protocol, the + cookie is called CookieString. 
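The cookie-file layout above can be sketched as a small parser. This is an illustration only: the function name parse_cookie_file is hypothetical, and rejecting files whose length is not exactly 64 octets is an assumption rather than something this document requires.

```python
# Hypothetical parser for the Extended ORPort cookie file; the names used
# here are illustrative and do not come from the Tor source.
STATIC_HEADER = b"! Extended ORPort Auth Cookie !\x0a"  # 32 octets
COOKIE_LEN = 32


def parse_cookie_file(data):
    """Return the 32-octet cookie (CookieString) from raw file contents."""
    # Assumption: anything other than header + cookie is treated as invalid.
    if len(data) != len(STATIC_HEADER) + COOKIE_LEN:
        raise ValueError("unexpected cookie file length")
    if data[:len(STATIC_HEADER)] != STATIC_HEADER:
        raise ValueError("StaticHeader missing from cookie file")
    return data[len(STATIC_HEADER):]
```

A caller would read the whole file in binary mode and pass its contents to this function before starting the authentication protocol.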
+ + Extended ORPort clients MUST make sure that the StaticHeader is + present in the cookie file, before proceeding with the + authentication protocol. + +2.1.3. SAFE_COOKIE Protocol specification + + + A client that performs the SAFE_COOKIE handshake begins by sending: + + ClientNonce [32 octets] + + Where, + + ClientNonce is 32 octets of random data. + + Then, the server replies with: + + ServerHash [32 octets] + ServerNonce [32 octets] + + Where, + + ServerHash is computed as: + HMAC-SHA256(CookieString, + "ExtORPort authentication server-to-client hash" | ClientNonce | ServerNonce) + + ServerNonce is 32 random octets. + + Upon receiving that data, the client computes ServerHash, and + validates it against the ServerHash provided by the server. + + If the server-provided ServerHash is invalid, the client MUST + terminate the connection. + + Otherwise the client replies with: + + ClientHash [32 octets] + + Where, + + ClientHash is computed as: + HMAC-SHA256(CookieString, + "ExtORPort authentication client-to-server hash" | ClientNonce | ServerNonce) + + Upon receiving that data, the server computes ClientHash, and + validates it against the ClientHash provided by the client. + + Finally, the server replies with: + + Status [1 octet] + + Where, + + Status is 1 if the authentication was successful. If the + authentication failed, Status is 0. + +3. The extended ORPort protocol + + Once a connection is established and authenticated, the parties + communicate with the protocol described here. + +3.1. Protocol + + The extended server port protocol is as follows: + + COMMAND [2 bytes, big-endian] + BODYLEN [2 bytes, big-endian] + BODY [BODYLEN bytes] + + Commands sent from the transport proxy to the bridge are: + + [0x0000] DONE: There is no more information to give. The next + bytes sent by the transport will be those tunneled over it. + (body ignored) + + [0x0001] USERADDR: an address:port string that represents the + client's address. 
+ + [0x0002] TRANSPORT: a string of the name of the pluggable + transport currently in effect on the connection. + + Replies sent from tor to the proxy are: + + [0x1000] OKAY: Send the user's traffic. (body ignored) + + [0x1001] DENY: Tor would prefer not to get more traffic from + this address for a while. (body ignored) + + [0x1002] CONTROL: (Not used) + + Parties MUST ignore command codes that they do not understand. + + If the server receives a recognized command that does not parse, it + MUST close the connection to the client. + +3.2. Command descriptions + +3.2.1. USERADDR + + An ASCII string holding the TCP/IP address of the client of the + pluggable transport proxy. A Tor bridge SHOULD use that address to + collect statistics about its clients. Recognized formats are: + 1.2.3.4:5678 + [1:2::3:4]:5678 + + (Current Tor versions may accept other formats, but this is a bug: + transports MUST NOT send them.) + + The string MUST NOT be NUL-terminated. + +3.2.2. TRANSPORT + + An ASCII string holding the name of the pluggable transport used by + the client of the pluggable transport proxy. A Tor bridge that + supports multiple transports SHOULD use that information to collect + statistics about the popularity of individual pluggable transports. + + The string MUST NOT be NUL-terminated. + + Pluggable transport names are C-identifiers and Tor MUST check them + for correctness. + +4. Security Considerations + + Extended ORPort or TransportControlPort do _not_ provide link + confidentiality, authentication or integrity. Sensitive data, like + cryptographic material, should not be transferred through them. + + An attacker with superuser access is able to sniff network traffic, + and capture TransportControlPort identifiers and any data passed + through those ports. + + Tor SHOULD issue a warning if the bridge operator tries to bind + Extended ORPort to a non-localhost address.
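One way an implementation might decide when to emit such a warning is sketched below. The helper name is_localhost_address is hypothetical, the input is assumed to be a bare host without port or brackets, and treating the literal name "localhost" as local is an assumption of this sketch.

```python
import ipaddress


def is_localhost_address(host):
    """Return True if `host` is a loopback IP address or the name 'localhost'."""
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal; only the conventional name is accepted here.
        return host.lower() == "localhost"


def check_ext_orport_bind(host):
    """Return a warning string for a non-localhost bind address, else None."""
    if is_localhost_address(host):
        return None
    return "Extended ORPort is bound to non-localhost address %s" % host
```

The same check could be reused on the pluggable transport side before connecting to a configured Extended ORPort.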
+ + Pluggable transport proxies SHOULD issue a warning if they are + instructed to connect to a non-localhost Extended ORPort. + diff --git a/attic/text_formats/gettor-spec.txt b/attic/text_formats/gettor-spec.txt new file mode 100644 index 0000000..a4959b4 --- /dev/null +++ b/attic/text_formats/gettor-spec.txt @@ -0,0 +1,88 @@ + + GetTor specification + Jacob Appelbaum + +Table of Contents + + 0. Preface + 1. Overview + 2. Implementation + 2.1. Reference implementation + 3. SMTP transport + 3.1. SMTP transport security considerations + 3.2. SMTP transport privacy considerations + 4. Other transports + 5. Implementation suggestions + +0. Preface + + This document describes GetTor and how to properly implement GetTor. + +1. Overview + + GetTor was created to resolve direct and indirect censorship of Tor's + software. In many countries and networks Tor's main website is blocked and + would-be Tor users are unable to download even the source code to the Tor + program. Other software hosted by the Tor Project is similarly censored. The + filtering of the possible download sites is sometimes easy to bypass by using + our TLS enabled website. In other cases the website and all of the mirrors are + entirely blocked; this is a situation where a user seems to actually need Tor + to fetch Tor. We discovered that it is feasible to use alternate transport + methods such as SMTP between a non-trusted third party or with IRC and XDCC. + +2. Implementation + + Any compliant GetTor implementation will implement at least a single transport + to meet the needs of a certain class of users. It should be i18n and l10n + compliant for all user facing interactions; users should be able to manually + set their language and this should serve as their preference for localization + of any software delivered.
The implementation must be free software and it + should be freely available by request from the implementation that they + interface with to download any of the other software available from that + GetTor instance. Security and privacy considerations should be described on a + per transport basis. + +2.1. Reference implementation + + We have implemented[0] a compliant GetTor that supports SMTP as a transport. + +3. SMTP transport + + The SMTP transport for GetTor should allow users to send any RFC822 compliant + message in any known human language; GetTor should respond in whatever + language is detected with supplementary translations in the same email. + GetTor shall offer a list of all available software in the body of the email - + it should offer the software as a list of packages and their subsequent + descriptions. + +3.1. SMTP transport security considerations + + Any GetTor instance that offers SMTP as a transport should optionally + implement the checking of DKIM signatures to ensure that email is not forged. + Optionally GetTor should take an OpenPGP key from the user and encrypt the + response with a blinded message. + +3.2. SMTP transport privacy considerations + + Any GetTor instance that offers SMTP as a transport must at least store the + requester's address for the time that it takes to process a response. This + should not be written to any permanent storage medium; GetTor should function + without any long term storage excepting a cache of files that it will send to + any user who requests it. + + GetTor may optionally collect anonymized usage statistics to better understand + how GetTor[1] is in use. This must not include any personally identifying + information about any of the requesters beyond language selection. + +4. Other transports + + At this time no other transports have been specified. IRC XDCC is a likely + useful system as is XMPP/Jabber with the newest OTR file sharing transport. + +5.
Implementation suggestions + + It is suggested that any compliant GetTor instance should be written in a so + called "safe" language such as Python. + +[0] https://gitweb.torproject.org/gettor.git +[1] https://metrics.torproject.org/packages.html diff --git a/attic/text_formats/glossary.txt b/attic/text_formats/glossary.txt new file mode 100644 index 0000000..68de376 --- /dev/null +++ b/attic/text_formats/glossary.txt @@ -0,0 +1,198 @@ + + Glossary + + The Tor Project + +This document aims to specify terms, notations, and phrases related +to Tor, as used in the Tor specification documents and other documentation. + +This glossary is not a design document; it is only a reference. + +This glossary is a work-in-progress; double-check its definitions before +citing them authoritatively. ;) + +Table of Contents + + 0. Preliminaries + 1.0. Commonly used Tor configuration terms + 2.0. Tor network components + 2.1. Relays, aka OR (onion router) + 2.1.1. Specific roles + 2.2. Client, aka OP (onion proxy) + 2.3. Authorities + 2.4. Hidden Service + 2.5. Circuit + 2.6. Edge connection + 2.7. Consensus + 2.8. Descriptor + 3.0. Tor network protocols + 3.1. Link handshake + 3.2. Circuit handshake + 3.3. Hidden Service Protocol + 3.4. Directory Protocol + 4.0. General network definitions + +0. Preliminaries + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL + NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and + "OPTIONAL" in this document are to be interpreted as described in + RFC 2119. + +1.0. Commonly used Tor configuration terms + + ORPort - Onion Router Port + DirPort - Directory Port + +2.0. Tor network components + +2.1. Relays, aka OR (onion router) + + [Style guide: prefer the term "Relay"] + +2.1.1. Specific roles + + Exit relay: The final hop in an exit circuit before traffic leaves + the Tor network to connect to external servers. + + Non-exit relay: Relays that send and receive traffic only to + other Tor relays and Tor clients. 
+ + Entry relay: The first hop in a Tor circuit. Can be either a guard + relay or a bridge, depending on the client's configuration. + + Guard relay: A relay that a client uses as its entry for a longer + period of time. Guard relays are rotated more slowly to prevent + attacks that can come from being exposed to too many guards. + + Bridge: A relay intentionally not listed in the public Tor + consensus, with the purpose of circumventing entities (such as + governments or ISPs) seeking to block clients from using Tor. + Currently, bridges are used only as entry relays. + + Directory cache: A relay that downloads cached directory information + from the directory authorities and serves it to clients on demand. + Any relay will act as a directory cache, if its bandwidth is high enough. + + Rendezvous point: A relay connecting a client to a hidden service. + Each party builds a three-hop circuit, meeting at the + rendezvous point. + +2.2. Client, aka OP (onion proxy) + + [Style: the "OP" and "onion proxy" terms are deprecated.] + +2.3. Authorities: + + Directory Authority: Nine total in the Tor network, operated by + trusted individuals. Directory authorities define and serve the + consensus document, defining the "state of the network." This document + contains a "router status" section for every relay currently + in the network. Directory authorities also serve router descriptors, + extra info documents, microdescriptors, and the microdescriptor consensus. + + Bridge Authority: One total. Similar in responsibility to directory + authorities, but for bridges. + + Fallback directory mirror: One of a list of directory caches distributed + with the Tor software. (When a client first connects to the network, and + has no directory information, it asks a fallback directory. From then on, + the client can ask any directory cache that's listed in the directory + information it has.) + +2.4. 
Hidden Service: + + A hidden service is a server that will only accept incoming + connections via the hidden service protocol. Connection + initiators will not be able to learn the IP address of the hidden + service, allowing the hidden service to receive incoming connections, + serve content, etc, while preserving its location anonymity. + +2.5. Circuit: + + An established path through the network, where cryptographic keys + are negotiated using the ntor protocol or TAP (Tor Authentication + Protocol (deprecated)) with each hop. Circuits can differ in length + depending on their purpose. See also Leaky Pipe Topology. + + Origin Circuit - + + Exit Circuit: A circuit which connects clients to destinations + outside the Tor network. For example, if a client wanted to visit + duckduckgo.com, this connection would require an exit circuit. + + Internal Circuit: A circuit whose traffic never leaves the Tor + network. For example, a client could connect to a hidden service via + an internal circuit. + +2.6. Edge connection: + +2.7. Consensus: The state of the Tor network, published every hour, + decided by a vote from the network's directory authorities. Clients + fetch the consensus from directory authorities, fallback + directories, or directory caches. + +2.8. Descriptor: Each descriptor represents information about one + relay in the Tor network. The descriptor includes the relay's IP + address, public keys, and other data. Relays send + descriptors to directory authorities, who vote and publish a + summary of them in the network consensus. + +3.0. Tor network protocols + +3.1. Link handshake + + The link handshake establishes the TLS connection over which two + Tor participants will send Tor cells. This handshake also + authenticates the participants to each other, possibly using Tor + cells. + +3.2. Circuit handshake + + Circuit handshakes establish the hop-by-hop onion encryption + that clients use to tunnel their application traffic. 
The + client does a pairwise key establishment handshake with each + individual relay in the circuit. For every hop except the + first, these handshakes tunnel through existing hops in the + circuit. Each cell type in this protocol also has a newer + version (with a "2" suffix), e.g., CREATE2. + + CREATE cell: First part of a handshake, sent by the initiator. + + CREATED cell: Second part of a handshake, sent by the responder. + + EXTEND cell: (also known as a RELAY_EXTEND cell) First part of a + handshake, tunneled through an existing circuit. The last relay + in the circuit so far will decrypt this cell and send the + payload in a CREATE cell to the chosen next hop relay. + + EXTENDED cell: (also known as a RELAY_EXTENDED cell) Second part + of a handshake, tunneled through an existing circuit. The last + relay in the circuit so far receives the CREATED cell from the + new last hop relay and encrypts the payload in an EXTENDED cell + to tunnel back to the client. + + Onion skin: A CREATE/CREATE2 or EXTEND/EXTEND2 payload that + contains the first part of the TAP or ntor key establishment + handshake. + +3.3. Hidden Service Protocol + +3.4. Directory Protocol + + +4.0. General network definitions + + Leaky Pipe Topology: The ability for the origin of a circuit to address + relay cells to any hop in the path of a circuit. In Tor, + the destination hop is determined by using the 'recognized' field of relay + cells. + + Stream: A single application-level connection or request, multiplexed over + a Tor circuit. A 'Stream' can currently carry the contents of a TCP + connection, a DNS request, or a Tor directory request. + + Channel: A pairwise connection between two Tor relays, or between a + client and a relay. Circuits are multiplexed over Channels. All + channels are currently implemented as TLS connections.
+ diff --git a/attic/text_formats/guard-spec.txt b/attic/text_formats/guard-spec.txt new file mode 100644 index 0000000..154edae --- /dev/null +++ b/attic/text_formats/guard-spec.txt @@ -0,0 +1,972 @@ + + Tor Guard Specification + + Isis Lovecruft + George Kadianakis + Ola Bini + Nick Mathewson + +Table of Contents + + 1. Introduction and motivation + 2. State instances + 3. Circuit Creation, Entry Guard Selection (1000 foot view) + 3.1 Path selection + 3.1.1 Managing entry guards + 3.1.2 Middle and exit node selection + 3.2 Circuit Building + 4. The algorithm. + 4.0. The guards listed in the current consensus. [Section:GUARDS] + 4.1. The Sampled Guard Set. [Section:SAMPLED] + 4.2. The Usable Sample [Section:FILTERED] + 4.3. The confirmed-guard list. [Section:CONFIRMED] + 4.4. The Primary guards [Section:PRIMARY] + 4.5. Retrying guards. [Section:RETRYING] + 4.6. Selecting guards for circuits. [Section:SELECTING] + 4.7. When a circuit fails. [Section:ON_FAIL] + 4.8. When a circuit succeeds [Section:ON_SUCCESS] + 4.9. Updating the list of waiting circuits [Section:UPDATE_WAITING] + 4.10. Whenever we get a new consensus. [Section:ON_CONSENSUS] + 4.11. Deciding whether to generate a new circuit. + 4.12. When we are missing descriptors. + A. Appendices + A.0. Acknowledgements + A.1. Parameters with suggested values. [Section:PARAM_VALS] + A.2. Random values [Section:RANDOM] + A.3. Why not a sliding scale of primaryness? [Section:CVP] + A.4. Controller changes + A.5. Persistent state format + +1. Introduction and motivation + + Tor uses entry guards to prevent an attacker who controls some + fraction of the network from observing a fraction of every user's + traffic. If users chose their entries and exits uniformly at + random from the list of servers every time they build a circuit, + then an adversary who had (k/N) of the network would deanonymize + F=(k/N)^2 of all circuits... 
and after a given user had built C + circuits, the attacker would see them at least once with + probability 1-(1-F)^C. With large C, the attacker would get a + sample of every user's traffic with probability 1. + + To prevent this from happening, Tor clients choose a small number + of guard nodes (e.g. 3). These guard nodes are the only + nodes that the client will connect to directly. If they are not + compromised, the user's paths are not compromised. + + This specification outlines Tor's guard housekeeping algorithm, + which tries to meet the following goals: + + - Heuristics and algorithms for determining how and which guards + are chosen should be kept as simple and easy to understand as + possible. + + - Clients in censored regions or who are behind a fascist + firewall who connect to the Tor network should not experience + any significant disadvantage in terms of reachability or + usability. + + - Tor should make a best attempt at discovering the most + appropriate behavior, with as little user input and + configuration as possible. + + - Tor clients should discover usable guards without too much + delay. + + - Tor clients should resist (to the extent possible) attacks + that try to force them onto compromised guards. + + - Should maintain the load-balancing offered by the path selection + algorithm + +2. State instances + + In the algorithm below, we describe a set of persistent and + non-persistent state variables. These variables should be + treated as an object, of which multiple instances can exist. + + In particular, we specify the use of three particular instances: + + A. UseBridges + + If UseBridges is set, then we replace the {GUARDS} set in + [Sec:GUARDS] below with the list of configured + bridges. We maintain a separate persistent instance of + {SAMPLED_GUARDS} and {CONFIRMED_GUARDS} and other derived + values for the UseBridges case. + + In this case, we impose no upper limit on the sample size. + + B. 
EntryNodes / ExcludeNodes / Reachable*Addresses / + FascistFirewall / ClientUseIPv4=0 + + If one of the above options is set, and UseBridges is not, + then we compute the number of usable guards in the consensus + as a fraction of the total number of guards in the consensus. + + If this fraction is less than {MEANINGFUL_RESTRICTION_FRAC}, + we use a separate instance of the state. + + (While Tor is running, we do not change back and forth between + the separate instance of the state and the default instance + unless the fraction of usable guards is 5% higher than, or 5% + lower than, {MEANINGFUL_RESTRICTION_FRAC}. This prevents us + from flapping back and forth between instances if we happen to + hit {MEANINGFUL_RESTRICTION_FRAC} exactly.) + + If this fraction is less than {EXTREME_RESTRICTION_FRAC}, we use a + separate instance of the state, and warn the user. + + [TODO: should we have a different instance for each set of heavily + restricted options?] + + C. Default + + If neither of the above variant-state instances is used, + we use a default instance. + +3. Circuit Creation, Entry Guard Selection (1000 foot view) + + A circuit in Tor is a path through the network connecting a client to + its destination. At a high level, a three-hop exit circuit will look + like this: + + Client <-> Entry Guard <-> Middle Node <-> Exit Node <-> Destination + + Entry guards are the only nodes which a client will connect to + directly. Exit relays are the nodes by which traffic exits the + Tor network in order to connect to an external destination. + + 3.1 Path selection + + For any multi-hop circuit, at least one entry guard and middle node(s) are + required. An exit node is required if traffic will exit the Tor + network. Depending on its configuration, a relay listed in a + consensus could be used for any of these roles. However, this + specification defines how entry guards specifically should be selected and + managed, as opposed to middle or exit nodes. 
+ + 3.1.1 Managing entry guards + + At a high level, a relay listed in a consensus will move through the + following states in the process from initial selection to eventual + usage as an entry guard: + + relays listed in consensus + | + sampled + | | + confirmed filtered + | | | + primary usable_filtered + + Relays listed in the latest consensus can be sampled for guard usage + if they have the "Guard" flag. Sampling is random but weighted by + a measured bandwidth multiplied by bandwidth-weights (Wgg if guard only, + Wgd if guard+exit flagged). + + Once a path is built and a circuit established using this guard, it + is marked as confirmed. Until this point, guards are first sampled + and then filtered based on information such as our current + configuration (see SAMPLED and FILTERED sections) and later marked as + usable_filtered if the guard is not primary but can be reached. + + It is always preferable to use a primary guard when building a new + circuit in order to reduce guard churn; only on failure to connect to + existing primary guards will new guards be used. + + 3.1.2 Middle and exit node selection + + Middle nodes are selected at random from relays listed in the latest + consensus, weighted by bandwidth and bandwidth-weights. Exit nodes are + chosen similarly but restricted to relays with a sufficiently permissive + exit policy. + + 3.2 Circuit Building + + Once a path is chosen, Tor will use this path to build a new circuit. + + If the circuit is built successfully, Tor will either use it + immediately, or Tor will wait for a circuit with a more preferred + guard if there's a good chance that it will be able to make one. + + If the circuit fails in a way that makes us conclude that a guard + is not reachable, the guard is marked as unreachable, the circuit is + closed, and waiting circuits are updated. + +4. The algorithm. + +4.0. The guards listed in the current consensus. 
[Section:GUARDS] + + By {set:GUARDS} we mean the set of all guards in the current + consensus that are usable for all circuits and directory + requests. (They must have the flags: Stable, Fast, V2Dir, Guard.) + + **Rationale** + + We require all guards to have the flags that we potentially need + from any guard, so that all guards are usable for all circuits. + +4.1. The Sampled Guard Set. [Section:SAMPLED] + + We maintain a set, {set:SAMPLED_GUARDS}, that persists across + invocations of Tor. It is a subset of the nodes ordered by a sample idx that + we have seen listed as a guard in the consensus at some point. + For each such guard, we record persistently: + + - {pvar:ADDED_ON_DATE}: The date on which it was added to + sampled_guards. + + We set this value to a point in the past, using + RAND(now, {GUARD_LIFETIME}/10). See + Appendix [RANDOM] below. + + - {pvar:ADDED_BY_VERSION}: The version of Tor that added it to + sampled_guards. + + - {pvar:IS_LISTED}: Whether it was listed as a usable Guard in + the _most recent_ consensus we have seen. + + - {pvar:FIRST_UNLISTED_AT}: If IS_LISTED is false, the publication date + of the earliest consensus in which this guard was listed such that we + have not seen it listed in any later consensus. Otherwise "None." + We randomize this to a point in the past, based on + RAND(added_at_time, {REMOVE_UNLISTED_GUARDS_AFTER} / 5) + + For each guard in {SAMPLED_GUARDS}, we also record this data, + non-persistently: + + - {tvar:last_tried_connect}: A 'last tried to connect at' + time. Default 'never'. + + - {tvar:is_reachable}: an "is reachable" tristate, with + possible values { <state:yes>, <state:no>, <state:maybe> }. + Default '<state:maybe>.' + + [Note: "yes" is not strictly necessary, but I'm + making it distinct from "maybe" anyway, to make our + logic clearer. A guard is "maybe" reachable if it's + worth trying. A guard is "yes" reachable if we tried + it and succeeded.] + + - {tvar:failing_since}: The first time when we failed to + connect to this guard. 
Defaults to "never". Reset to + "never" when we successfully connect to this guard. + + - {tvar:is_pending} A "pending" flag. This indicates that we + are trying to build an exploratory circuit through the + guard, and we don't know whether it will succeed. + + - {tvar:pending_since}: A timestamp. Set whenever we set + {tvar:is_pending} to true; cleared whenever we set + {tvar:is_pending} to false. NOTE + + We require that {SAMPLED_GUARDS} contain at least + {MIN_FILTERED_SAMPLE} guards from the consensus (if possible), + but not more than {MAX_SAMPLE_THRESHOLD} of the number of guards + in the consensus, and not more than {MAX_SAMPLE_SIZE} in total. + (But if the maximum would be smaller than {MIN_FILTERED_SAMPLE}, we + set the maximum at {MIN_FILTERED_SAMPLE}.) + + To add a new guard to {SAMPLED_GUARDS}, pick an entry at random from + ({GUARDS} - {SAMPLED_GUARDS}), according to the path selection rules. + + We remove an entry from {SAMPLED_GUARDS} if: + + * We have a live consensus, and {IS_LISTED} is false, and + {FIRST_UNLISTED_AT} is over {REMOVE_UNLISTED_GUARDS_AFTER} + days in the past. + + OR + + * We have a live consensus, and {ADDED_ON_DATE} is over + {GUARD_LIFETIME} ago, *and* {CONFIRMED_ON_DATE} is either + "never", or over {GUARD_CONFIRMED_MIN_LIFETIME} ago. + + Note that {SAMPLED_GUARDS} does not depend on our configuration. + It is possible that we can't actually connect to any of these + guards. + + **Rationale** + + The {SAMPLED_GUARDS} set is meant to limit the total number of + guards that a client will connect to in a given period. The + upper limit on its size prevents us from considering too many + guards. + + The first expiration mechanism is there so that our + {SAMPLED_GUARDS} list does not accumulate so many dead + guards that we cannot add new ones. + + The second expiration mechanism makes us rotate our guards slowly + over time. 
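
The two expiration rules above, together with the RAND(now, INTERVAL) helper used for {ADDED_ON_DATE}, can be sketched as follows. This is a minimal illustration under stated assumptions, not the logic of any Tor implementation: the names `SampledGuard`, `rand_past`, and `should_remove` are hypothetical, "never" is modelled as `None`, and the constants are the suggested values from [Section:PARAM_VALS].

```python
import random
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Suggested values from [Section:PARAM_VALS]; real deployments may differ.
GUARD_LIFETIME = timedelta(days=120)
REMOVE_UNLISTED_GUARDS_AFTER = timedelta(days=20)
GUARD_CONFIRMED_MIN_LIFETIME = timedelta(days=60)


def rand_past(now: datetime, interval: timedelta, rng=random) -> datetime:
    """RAND(now, INTERVAL): a uniformly random time between now and INTERVAL ago."""
    offset = rng.uniform(0.0, interval.total_seconds())
    return now - timedelta(seconds=offset)


@dataclass
class SampledGuard:  # hypothetical container for the persistent fields
    added_on_date: datetime
    is_listed: bool = True
    first_unlisted_at: Optional[datetime] = None
    confirmed_on_date: Optional[datetime] = None  # None stands for "never"


def should_remove(guard: SampledGuard, now: datetime,
                  have_live_consensus: bool) -> bool:
    """Apply the two removal rules for {SAMPLED_GUARDS}."""
    if not have_live_consensus:
        return False  # both rules require a live consensus
    # Rule 1: unlisted for longer than REMOVE_UNLISTED_GUARDS_AFTER.
    unlisted_too_long = (
        not guard.is_listed
        and guard.first_unlisted_at is not None
        and now - guard.first_unlisted_at > REMOVE_UNLISTED_GUARDS_AFTER
    )
    # Rule 2: sampled long ago, and either never confirmed or confirmed
    # more than GUARD_CONFIRMED_MIN_LIFETIME ago.
    sampled_too_long = now - guard.added_on_date > GUARD_LIFETIME
    confirmation_expired = (
        guard.confirmed_on_date is None
        or now - guard.confirmed_on_date > GUARD_CONFIRMED_MIN_LIFETIME
    )
    return unlisted_too_long or (sampled_too_long and confirmation_expired)
```

A new guard would be created with `added_on_date=rand_past(now, GUARD_LIFETIME / 10)`, so that only the fixed lifetime needs to be remembered.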
+ + Ordering the {SAMPLED_GUARDS} set in the order in which we sampled those + guards and picking guards from that set according to this ordering improves + load-balancing. It comes closer to the expected usage of the guard nodes + under the path selection rules. + + The ordering also improves on another objective of this proposal: trying to + resist an adversary pushing clients onto compromised guards, since the + adversary would need the clients to exhaust their entire initial + {SAMPLED_GUARDS} set before having a chance to use a newly deployed + adversary node. + + +4.2. The Usable Sample [Section:FILTERED] + + We maintain another set, {set:FILTERED_GUARDS}, that does not + persist. It is derived from: + + - {SAMPLED_GUARDS} + - our current configuration, + - the path bias information. + + A guard is a member of {set:FILTERED_GUARDS} if and only if all + of the following are true: + + - It is a member of {SAMPLED_GUARDS}, with {IS_LISTED} set to + true. + - It is not disabled because of path bias issues. + - It is not disabled because of ReachableAddresses policy, + the ClientUseIPv4 setting, the ClientUseIPv6 setting, + the FascistFirewall setting, or some other + option that prevents using some addresses. + - It is not disabled because of ExcludeNodes. + - It is a bridge if UseBridges is true; or it is not a + bridge if UseBridges is false. + - It is included in EntryNodes if EntryNodes is set and + UseBridges is not. (But see 2.B above). + + We have an additional subset, {set:USABLE_FILTERED_GUARDS}, which + is defined to be the subset of {FILTERED_GUARDS} where + {is_reachable} is <yes> or <maybe>. 
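
The membership test for {FILTERED_GUARDS} and its usable subset can be sketched as two list filters that preserve sample order. This is an illustrative sketch only: the `Guard` and `Config` classes, the `address_allowed` callback (standing in for ReachableAddresses, ClientUseIPv4, ClientUseIPv6, FascistFirewall and similar options), and the use of fingerprints as set members are all assumptions made for the example. It also assumes that "usable" means "not known to be unreachable".

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List, Optional


class Reachable(Enum):
    NO = "no"
    MAYBE = "maybe"
    YES = "yes"


@dataclass
class Guard:  # hypothetical view of one sampled guard
    fingerprint: str
    is_listed: bool = True
    is_bridge: bool = False
    path_bias_disabled: bool = False
    is_reachable: Reachable = Reachable.MAYBE


@dataclass
class Config:  # hypothetical subset of the relevant options
    use_bridges: bool = False
    entry_nodes: frozenset = frozenset()    # empty means "EntryNodes not set"
    exclude_nodes: frozenset = frozenset()


def filtered_guards(sampled: List[Guard], cfg: Config,
                    address_allowed: Optional[Callable[[Guard], bool]] = None
                    ) -> List[Guard]:
    """Return the members of FILTERED_GUARDS, in sample order."""
    result = []
    for g in sampled:
        if not g.is_listed or g.path_bias_disabled:
            continue
        if address_allowed is not None and not address_allowed(g):
            continue  # blocked by an address-related option
        if g.fingerprint in cfg.exclude_nodes:
            continue
        if g.is_bridge != cfg.use_bridges:
            continue
        if (cfg.entry_nodes and not cfg.use_bridges
                and g.fingerprint not in cfg.entry_nodes):
            continue
        result.append(g)
    return result


def usable_filtered_guards(sampled: List[Guard], cfg: Config,
                           address_allowed=None) -> List[Guard]:
    """FILTERED_GUARDS restricted to guards not known to be unreachable."""
    return [g for g in filtered_guards(sampled, cfg, address_allowed)
            if g.is_reachable is not Reachable.NO]
```

Because both sets are recomputed from {SAMPLED_GUARDS} on demand, neither needs to be persisted.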
+ + We try to maintain a requirement that {USABLE_FILTERED_GUARDS} + contain at least {MIN_FILTERED_SAMPLE} elements: + + Whenever we are going to sample from {USABLE_FILTERED_GUARDS}, + and it contains fewer than {MIN_FILTERED_SAMPLE} elements, we + add new elements to {SAMPLED_GUARDS} until one of the following + is true: + + * {USABLE_FILTERED_GUARDS} is large enough, + OR + * {SAMPLED_GUARDS} is at its maximum size. + + + ** Rationale ** + + These filters are applied _after_ sampling: if we applied them + before the sampling, then our sample would reflect the set of + filtering restrictions that we had in the past. + +4.3. The confirmed-guard list. [Section:CONFIRMED] + + [formerly USED_GUARDS] + + We maintain a persistent ordered list, {list:CONFIRMED_GUARDS}. + It contains guards that we have used before, in our preference + order of using them. It is a subset of {SAMPLED_GUARDS}. For + each guard in this list, we store persistently: + + - {pvar:IDENTITY} Its fingerprint. + + - {pvar:CONFIRMED_ON_DATE} When we added this guard to + {CONFIRMED_GUARDS}. + + Randomized to a point in the past as RAND(now, {GUARD_LIFETIME}/10). + + We append new members to {CONFIRMED_GUARDS} when we mark a circuit + built through a guard as "for user traffic." + + Whenever we remove a member from {SAMPLED_GUARDS}, we also remove + it from {CONFIRMED_GUARDS}. + + [Note: You can also regard the {CONFIRMED_GUARDS} list as a + total ordering defined over a subset of {SAMPLED_GUARDS}.] + + Definition: we call Guard A "higher priority" than another Guard B + if, when A and B are both reachable, we would rather use A. We + define priority as follows: + + * Every guard in {CONFIRMED_GUARDS} has a higher priority + than every guard not in {CONFIRMED_GUARDS}. + + * Among guards in {CONFIRMED_GUARDS}, the one appearing earlier + on the {CONFIRMED_GUARDS} list has a higher priority. + + * Among guards that do not appear in {CONFIRMED_GUARDS}, + {is_pending}==true guards have higher priority. 
+ + * Among those, the guard with earlier {last_tried_connect} time + has higher priority. + + * Finally, among guards that do not appear in + {CONFIRMED_GUARDS} with {is_pending==false}, all have equal + priority. + + ** Rationale ** + + We add elements to this ordering when we have actually used them + for building a usable circuit. We could mark them at some other + time (such as when we attempt to connect to them, or when we + actually connect to them), but this approach keeps us from + committing to a guard before we actually use it for sensitive + traffic. + +4.4. The Primary guards [Section:PRIMARY] + + We keep a run-time non-persistent ordered list of + {list:PRIMARY_GUARDS}. It is a subset of {FILTERED_GUARDS}. It + contains {N_PRIMARY_GUARDS} elements. + + To compute primary guards, take the ordered intersection of + {CONFIRMED_GUARDS} and {FILTERED_GUARDS}, and take the first + {N_PRIMARY_GUARDS} elements. If there are fewer than + {N_PRIMARY_GUARDS} elements, append additional elements to + PRIMARY_GUARDS chosen from ({FILTERED_GUARDS} - {CONFIRMED_GUARDS}), + ordered in "sample order" (that is, by {ADDED_ON_DATE}). + + Once an element has been added to {PRIMARY_GUARDS}, we do not remove it + until it is replaced by some element from {CONFIRMED_GUARDS}. + That is: if a non-primary guard becomes confirmed and not every primary + guard is confirmed, then the list of primary guards is regenerated, + first from the confirmed guards (as before), and then from any + non-confirmed primary guards. + + Note that {PRIMARY_GUARDS} do not have to be in + {USABLE_FILTERED_GUARDS}: they might be unreachable. + + ** Rationale ** + + These guards are treated differently from other guards. If one of + them is usable, then we use it right away. For other guards in + {FILTERED_GUARDS}, if it's usable, then before using it we might + first double-check whether perhaps one of the primary guards is + usable after all. + +4.5. Retrying guards. 
[Section:RETRYING] + + (We run this process as frequently as needed. It can be done once + a second, or just-in-time.) + + If a primary sampled guard's {is_reachable} status is <no>, then + we decide whether to update its {is_reachable} status to <maybe> + based on its {last_tried_connect} time, its {failing_since} time, + and the {PRIMARY_GUARDS_RETRY_SCHED} schedule. + + If a non-primary sampled guard's {is_reachable} status is <no>, then + we decide whether to update its {is_reachable} status to <maybe> + based on its {last_tried_connect} time, its {failing_since} time, + and the {GUARDS_RETRY_SCHED} schedule. + + ** Rationale ** + + An observation that a guard has been 'unreachable' only lasts for + a given amount of time, since we can't infer that it's unreachable + now from the fact that it was unreachable a few minutes ago. + +4.6. Selecting guards for circuits. [Section:SELECTING] + + Every origin circuit is now in one of these states: + + <state:usable_on_completion>, + <state:usable_if_no_better_guard>, + <state:waiting_for_better_guard>, or + <state:complete>. + + You may only attach streams to <complete> circuits. + (Additionally, you may only send RENDEZVOUS cells, ESTABLISH_INTRO + cells, and INTRODUCE cells on <complete> circuits.) + + The per-circuit state machine is: + + New circuits are <usable_on_completion> or + <usable_if_no_better_guard>. + + A <usable_on_completion> circuit may become <complete>, or may + fail. + + A <usable_if_no_better_guard> circuit may become + <usable_on_completion>; may become <waiting_for_better_guard>; or may + fail. + + A <waiting_for_better_guard> circuit will become <complete>, or will + be closed, or will fail. + + A <complete> circuit remains <complete> until it fails or is + closed. + + Each of these transitions is described below. + + We keep, as global transient state: + + * {tvar:last_time_on_internet} -- the last time at which we + successfully used a circuit or connected to a guard. At + startup we set this to "infinitely far in the past." + + When we want to build a circuit, and we need to pick a guard: + + * If any entry in PRIMARY_GUARDS has {is_reachable} status of + <maybe> or <yes>, return one of the first + {NUM_USABLE_PRIMARY_GUARDS} or + {NUM_USABLE_PRIMARY_DIRECTORY_GUARDS} such guards, chosen + uniformly at random. The circuit is <usable_on_completion>. 
+ + [Note: We do not use {is_pending} on primary guards, since we + are willing to try to build multiple circuits through them + before we know for sure whether they work, and since we will + not use any non-primary guards until we are sure that the + primary guards are all down. (XX is this good?)] + + * Otherwise, if the ordered intersection of {CONFIRMED_GUARDS} + and {USABLE_FILTERED_GUARDS} is nonempty, return the first + entry in that intersection that has {is_pending} set to + false. Set its value of {is_pending} to true, + and set its {pending_since} to the current time. + The circuit + is now <usable_if_no_better_guard>. (If all entries have + {is_pending} true, pick the first one.) + + * Otherwise, if there is no such entry, select a member from + {USABLE_FILTERED_GUARDS} in sample order. Set its {is_pending} field to + true, and set its {pending_since} to the current time. + The circuit is <usable_if_no_better_guard>. + + * Otherwise, if USABLE_FILTERED_GUARDS is empty, we have exhausted + all the sampled guards. In this case we proceed by marking all guards + as <maybe> reachable so that we can keep on trying circuits. + + Whenever we select a guard for a new circuit attempt, we update the + {last_tried_connect} time for the guard to 'now.' + + In some cases (for example, when we need a certain directory feature, + or when we need to avoid using a certain exit as a guard), we need to + restrict the guards that we use for a single circuit. When this happens, we + remember the restrictions that applied when choosing the guard for + that circuit, since we will need them later (see [UPDATE_WAITING].). + + ** Rationale ** + + We're getting to the core of the algorithm here. Our main goals are to + make sure that + + 1. If it's possible to use a primary guard, we do. + 2. We probably use the first primary guard. + + So we only try non-primary guards if we're pretty sure that all + the primary guards are down, and we only try a given primary guard + if the earlier primary guards seem down. 
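
The selection steps in this section can be sketched as a single function. This is a simplified illustration, not the code of any Tor implementation: the `Guard` class and `pick_guard` are hypothetical, per-circuit restrictions are ignored, times are plain floats, and the two returned labels are illustrative names for "usable as soon as it is built" and "usable only if no better guard turns up". When the function returns `None`, the caller is expected to mark all guards as possibly reachable and try again.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Guard:  # hypothetical transient per-guard state
    fingerprint: str
    is_reachable: str = "maybe"          # "no", "maybe", or "yes"
    is_pending: bool = False
    pending_since: Optional[float] = None
    last_tried_connect: Optional[float] = None


def pick_guard(primary: List[Guard], confirmed: List[Guard],
               usable_filtered: List[Guard], now: float,
               num_usable_primary: int = 1,
               rng=random) -> Optional[Tuple[Guard, str]]:
    """Pick a guard for a new circuit; return (guard, circuit_state)."""
    # Step 1: prefer primary guards that are not known to be down.
    live_primary = [g for g in primary if g.is_reachable in ("maybe", "yes")]
    if live_primary:
        guard = rng.choice(live_primary[:num_usable_primary])
        state = "usable_on_completion"
    else:
        usable_ids = {g.fingerprint for g in usable_filtered}
        # Step 2: confirmed guards that are usable, in confirmed order.
        candidates = [g for g in confirmed if g.fingerprint in usable_ids]
        if candidates:
            # First non-pending entry, or the first entry if all are pending.
            guard = next((g for g in candidates if not g.is_pending),
                         candidates[0])
        elif usable_filtered:
            # Step 3: fall back to the usable sample, in sample order.
            guard = usable_filtered[0]
        else:
            # Step 4: sample exhausted; the caller must reset reachability.
            return None
        guard.is_pending = True
        guard.pending_since = now
        state = "usable_if_no_better_guard"
    guard.last_tried_connect = now
    return guard, state
```

Note that primary guards never get {is_pending} set in this sketch, matching the note above about building several circuits through them in parallel.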
+ + When we _do_ try non-primary guards, however, we only build one + circuit through each, to give it a chance to succeed or fail. If + ever such a circuit succeeds, we don't use it until we're pretty + sure that it's the best guard we're getting. (see below). + + [XXX timeout.] + +4.7. When a circuit fails. [Section:ON_FAIL] + + When a circuit fails in a way that makes us conclude that a guard + is not reachable, we take the following steps: + + * Set the guard's {is_reachable} status to <no>. If it had + {is_pending} set to true, we make it non-pending and clear + {pending_since}. + + * Close the circuit, of course. (This removes it from + consideration by the algorithm in [UPDATE_WAITING].) + + * Update the list of waiting circuits. (See [UPDATE_WAITING] + below.) + + [Note: the existing Tor logic will cause us to create more + circuits in response to some of these steps; and also see + [ON_CONSENSUS].] + + ** Rationale ** + + See [SELECTING] above for rationale. + +4.8. When a circuit succeeds [Section:ON_SUCCESS] + + When a circuit succeeds in a way that makes us conclude that a + guard _was_ reachable, we take these steps: + + * We set its {is_reachable} status to <yes>. + * We set its {failing_since} to "never". + * If the guard was {is_pending}, we clear the {is_pending} flag + and clear {pending_since}. + * If the guard was not a member of {CONFIRMED_GUARDS}, we add + it to the end of {CONFIRMED_GUARDS}. + + * If this circuit was <usable_on_completion>, this circuit is + now <complete>. You may attach streams to this circuit, + and use it for hidden services. + + * If this circuit was <usable_if_no_better_guard>, it is now + <waiting_for_better_guard>. You may not yet attach streams to it. + Then check whether the {last_time_on_internet} is more than + {INTERNET_LIKELY_DOWN_INTERVAL} seconds ago: + + * If it is, then mark all {PRIMARY_GUARDS} as "maybe" + reachable. + + * If it is not, update the list of waiting circuits. 
(See + [UPDATE_WAITING] below) + + [Note: the existing Tor logic will cause us to create more + circuits in response to some of these steps; and see + [ON_CONSENSUS].] + + ** Rationale ** + + See [SELECTING] above for rationale. + +4.9. Updating the list of waiting circuits [Section:UPDATE_WAITING] + + We run this procedure whenever it's possible that a + <waiting_for_better_guard> circuit might be ready to be called + <complete>. + + * If any circuit C1 is <waiting_for_better_guard>, AND: + * All primary guards have reachable status of <no>. + * There is no circuit C2 that "blocks" C1. + Then, upgrade C1 to <complete>. + + Definition: In the algorithm above, C2 "blocks" C1 if: + * C2 obeys all the restrictions that C1 had to obey, AND + * C2 has higher priority than C1, AND + * Either C2 is <complete>, or C2 is <waiting_for_better_guard>, + or C2 has been <usable_if_no_better_guard> for no more than + {NONPRIMARY_GUARD_CONNECT_TIMEOUT} seconds. + + We run this procedure periodically: + + * If any circuit stays in <waiting_for_better_guard> + for more than {NONPRIMARY_GUARD_IDLE_TIMEOUT} seconds, + time it out. + + **Rationale** + + If we open a connection to a guard, we might want to use it + immediately (if we're sure that it's the best we can do), or we + might want to wait a little while to see if some other circuit + which we like better will finish. + + + When we mark a circuit <complete>, we don't close the + lower-priority circuits immediately: we might decide to use + them after all if the <complete> circuit goes down before + {NONPRIMARY_GUARD_IDLE_TIMEOUT} seconds. + +4.9.1. Without a list of waiting circuits [Section:NO_CIRCLIST] + + As an alternative to the section [SECTION:UPDATE_WAITING] above, + this section presents a new way to maintain guard status + independently of tracking individual circuit status. This + formulation gives a result equivalent or similar to the approach + above, but simplifies the necessary communications between the + guard and circuit subsystems. + + As before, when all primary guards are Unreachable, we need to + try non-primary guards. 
We select the first such guard (in + preference order) that is neither Unreachable nor Pending. + Whenever we give out such a guard, if the guard's status is + Unknown, then we call that guard "Pending" with its {is_pending} + flag, until the attempt to use it succeeds or fails. We remember + when the guard became Pending with the {pending_since} variable. + + After completing a circuit, the implementation must check whether + its guard is usable. A guard's usability status may be "usable", + "unusable", or "unknown". A guard is usable according to + these rules: + + 1. Primary guards are always usable. + + 2. Non-primary guards are usable _for a given circuit_ if every + guard earlier in the preference list is either unsuitable for + that circuit (e.g. because of family restrictions), or marked as + Unreachable, or has been pending for at least + `{NONPRIMARY_GUARD_CONNECT_TIMEOUT}`. + + Non-primary guards are unusable _for a given circuit_ if some + guard earlier in the preference list is suitable for the circuit + _and_ Reachable. + + Non-primary guards are unusable if they have not become + usable after `{NONPRIMARY_GUARD_IDLE_TIMEOUT}` seconds. + + 3. If a circuit's guard is neither usable nor unusable immediately, the + circuit is not discarded; instead, it is kept (but not used) until the + guard becomes usable or unusable. + + +4.10. Whenever we get a new consensus. [Section:ON_CONSENSUS] + + We update {GUARDS}. + + For every guard in {SAMPLED_GUARDS}, we update {IS_LISTED} and + {FIRST_UNLISTED_AT}. + + [**] We remove entries from {SAMPLED_GUARDS} if appropriate, + according to the sampled-guards expiration rules. If they were + in {CONFIRMED_GUARDS}, we also remove them from + {CONFIRMED_GUARDS}. + + We recompute {FILTERED_GUARDS}, and everything that derives from + it, including {USABLE_FILTERED_GUARDS}, and {PRIMARY_GUARDS}. 
+ + (Whenever one of the configuration options that affects the + filter is updated, we repeat the process above, starting at the + [**] line.) + +4.11. Deciding whether to generate a new circuit. + [Section:NEW_CIRCUIT_NEEDED] + + We generate a new circuit when we don't have + enough circuits either built or in-progress to handle a given + stream, or an expected stream. + + For the purpose of this rule, we say that <waiting_for_better_guard> + circuits are neither built nor in-progress; that <complete> + circuits are built; and that the other states are in-progress. + +4.12. When we are missing descriptors. + [Section:MISSING_DESCRIPTORS] + + We need either a router descriptor or a microdescriptor in order + to build a circuit through a guard. If we do not have such a + descriptor for a guard, we can still use the guard for one-hop + directory fetches, but not for longer circuits. + + (Also, when we are missing descriptors for our first + {NUM_USABLE_PRIMARY_GUARDS} primary guards, we don't build + circuits at all until we have fetched them.) + +A. Appendices + +A.0. Acknowledgements + + This research was supported in part by NSF grants CNS-1111539, + CNS-1314637, CNS-1526306, CNS-1619454, and CNS-1640548. + +A.1. Parameters with suggested values. [Section:PARAM_VALS] + + (All suggested values chosen arbitrarily) + + {param:MAX_SAMPLE_THRESHOLD} -- 20% + + {param:MAX_SAMPLE_SIZE} -- 60 + + {param:GUARD_LIFETIME} -- 120 days + + {param:REMOVE_UNLISTED_GUARDS_AFTER} -- 20 days + [previously ENTRY_GUARD_REMOVE_AFTER] + + {param:MIN_FILTERED_SAMPLE} -- 20 + + {param:N_PRIMARY_GUARDS} -- 3 + + {param:PRIMARY_GUARDS_RETRY_SCHED} + + We recommend the following schedule, which is the one + used in Arti: + + -- Use the "decorrelated-jitter" algorithm from "dir-spec.txt" + section 5.5 where `base_delay` is 30 seconds and `cap` + is 6 hours. 
+ + This legacy schedule is the one used in C tor: + + -- every 10 minutes for the first six hours, + -- every 90 minutes for the next 90 hours, + -- every 4 hours for the next 3 days, + -- every 9 hours thereafter. + + {param:GUARDS_RETRY_SCHED} -- + + We recommend the following schedule, which is the one + used in Arti: + + -- Use the "decorrelated-jitter" algorithm from "dir-spec.txt" + section 5.5 where `base_delay` is 10 minutes and `cap` + is 36 hours. + + This legacy schedule is the one used in C tor: + + -- every hour for the first six hours, + -- every 4 hours for the next 90 hours, + -- every 18 hours for the next 3 days, + -- every 36 hours thereafter. + + {param:INTERNET_LIKELY_DOWN_INTERVAL} -- 10 minutes + + {param:NONPRIMARY_GUARD_CONNECT_TIMEOUT} -- 15 seconds + + {param:NONPRIMARY_GUARD_IDLE_TIMEOUT} -- 10 minutes + + {param:MEANINGFUL_RESTRICTION_FRAC} -- .2 + + {param:EXTREME_RESTRICTION_FRAC} -- .01 + + {param:GUARD_CONFIRMED_MIN_LIFETIME} -- 60 days + + {param:NUM_USABLE_PRIMARY_GUARDS} -- 1 + + {param:NUM_USABLE_PRIMARY_DIRECTORY_GUARDS} -- 3 + +A.2. Random values [Section:RANDOM] + + Frequently, we want to randomize the expiration time of something + so that it's not easy for an observer to match it to its start + time. We do this by randomizing its start date a little, so that + we only need to remember a fixed expiration interval. + + By RAND(now, INTERVAL) we mean a time between now and INTERVAL in + the past, chosen uniformly at random. + + +A.3. Why not a sliding scale of primaryness? [Section:CVP] + + At one meeting, I floated the idea of having "primaryness" be a + continuous variable rather than a boolean. + + I'm no longer sure this is a great idea, but I'll try to outline + how it might work. + + To begin with: being "primary" gives it a few different traits: + + 1) We retry primary guards more frequently. 
[Section:RETRYING] + + 2) We don't even _try_ building circuits through + lower-priority guards until we're pretty sure that the + higher-priority primary guards are down. (With non-primary + guards, on the other hand, we launch exploratory circuits + which we plan not to use if higher-priority guards + succeed.) [Section:SELECTING] + + 3) We retry them all one more time if a circuit succeeds after + the net has been down for a while. [Section:ON_SUCCESS] + + We could make each of the above traits continuous: + + 1) We could make the interval at which a guard is retried + depend continuously on its position in CONFIRMED_GUARDS. + + 2) We could change the number of guards we test in parallel + based on their position in CONFIRMED_GUARDS. + + 3) We could change the rule for how long the higher-priority + guards need to have been down before we call a + <usable_if_no_better_guard> circuit <complete> based on a + possible network-down condition. For example, we could + retry the first guard if we tried it more than 10 seconds + ago, the second if we tried it more than 20 seconds ago, + etc. + + I am pretty sure, however, that if these are worth doing, they + need more analysis! Here's why: + + * They all have the potential to leak more information about a + guard's exact position on the list. Is that safe? Is there + any way to exploit that? I don't think we know. + + * They all seem like changes which it would be relatively + simple to make to the code after we implement the simpler + version of the algorithm described above. + +A.4. Controller changes + + We will add to control-spec.txt a new possible circuit state, GUARD_WAIT, + that can be given as part of circuit events and GETINFO responses about + circuits. A circuit is in the GUARD_WAIT state when it is fully built, + but we will not use it because a circuit with a better guard might + become built too. + +A.5. 
Persistent state format + + The persistent state format doesn't need to be part of this + specification, since different implementations can do it + differently. Nonetheless, here's the one Tor uses: + + The "state" file contains one Guard entry for each sampled guard + in each instance of the guard state (see section 2). The value + of this Guard entry is a set of space-separated K=V entries, + where K contains any nonspace character except =, and V contains + any nonspace characters. + + Implementations must retain any unrecognized K=V entries for a + sampled guard when they regenerate the state file. + + Implementations must not depend on the order of K=V entries. + + Recognized fields (values of K) are: + + "in" -- the name of the guard state instance that this + sampled guard is in. If a sampled guard is in two guard + state instances, it appears twice, with a different "in" + field each time. Required. + + "rsa_id" -- the RSA id digest for this guard, encoded in + hex. Required. + + "bridge_addr" -- If the guard is a bridge, its configured address and + port (this can be the ORPort or a pluggable transport port). Optional. + + "nickname" -- the guard's nickname, if any. Optional. + + "sampled_on" -- the date when the guard was sampled. Required. + + "sampled_by" -- the Tor version that sampled this guard. + Optional. + + "unlisted_since" -- the date since which the guard has been + unlisted. Optional. + + "listed" -- 0 if the guard is not listed; 1 if it is. Required. + + "confirmed_on" -- date when the guard was + confirmed. Optional. + + "confirmed_idx" -- position of the guard in the confirmed + list. Optional. + + "pb_use_attempts", "pb_use_successes", "pb_circ_attempts", + "pb_circ_successes", "pb_successful_circuits_closed", + "pb_collapsed_circuits", "pb_unusable_circuits", + "pb_timeouts" -- state for the circuit path bias algorithm, + given in decimal fractions. Optional.
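The K=V rules above can be illustrated with a minimal Python sketch. It is not the parser used by C tor or Arti; the function names `parse_guard_entry` and `encode_guard_entry` are hypothetical, and only the field names and the "Required" markers come from the list above. The sketch splits on whitespace, splits each item at its first "=", and keeps unrecognized keys so they can be written back unchanged.

```python
# Field names copied from the list of recognized fields; the helper
# functions themselves are hypothetical and only illustrate the rules.
RECOGNIZED = {
    "in", "rsa_id", "bridge_addr", "nickname", "sampled_on", "sampled_by",
    "unlisted_since", "listed", "confirmed_on", "confirmed_idx",
    "pb_use_attempts", "pb_use_successes", "pb_circ_attempts",
    "pb_circ_successes", "pb_successful_circuits_closed",
    "pb_collapsed_circuits", "pb_unusable_circuits", "pb_timeouts",
}
REQUIRED = {"in", "rsa_id", "sampled_on", "listed"}


def parse_guard_entry(value):
    """Split the value of one Guard entry into (recognized, extra) dicts."""
    recognized, extra = {}, {}
    for item in value.split():
        # K cannot contain "=", so the first "=" separates K from V.
        key, sep, val = item.partition("=")
        if not sep or not key:
            raise ValueError(f"malformed K=V entry: {item!r}")
        target = recognized if key in RECOGNIZED else extra
        target[key] = val
    missing = REQUIRED - recognized.keys()
    if missing:
        raise ValueError(f"missing required fields: {sorted(missing)}")
    return recognized, extra


def encode_guard_entry(recognized, extra):
    """Re-emit an entry, retaining the unrecognized K=V pairs."""
    merged = {**recognized, **extra}
    return " ".join(f"{key}={val}" for key, val in merged.items())


line = ("in=default rsa_id=" + "0" * 40 +
        " sampled_on=2016-11-29T19:39:31 listed=1 future_key=a=b")
fields, unknown = parse_guard_entry(line)
print(fields["in"], unknown)
```

Keeping the unrecognized pairs in a separate dictionary makes it hard to drop them by accident when the state file is regenerated.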
+ + All dates here are given as a (spaceless) ISO8601 combined date + and time in UTC (e.g., 2016-11-29T19:39:31). + + +TODO. Still non-addressed issues [Section:TODO] + + Simulate to answer: Will this work in a dystopic world? + + Simulate actual behavior. + + For all lifetimes: instead of storing the "this began at" time, + store the "remove this at" time, slightly randomized. + + Clarify that when you get a circuit, you might need to + relaunch circuits through that same guard immediately, if they + are circuits that have to be independent. + + + Fix all items marked XX or TODO. + + "Directory guards" -- do they matter? + + Suggestion: require that all guards support downloads via BEGINDIR. + We don't need to worry about directory guards for relays, since we + aren't trying to prevent relay enumeration. + + IP version preferences via ClientPreferIPv6ORPort + + Suggestion: Treat it as a preference when adding to + {CONFIRMED_GUARDS}, but not otherwise. + diff --git a/attic/text_formats/padding-spec.txt b/attic/text_formats/padding-spec.txt new file mode 100644 index 0000000..206a7f1 --- /dev/null +++ b/attic/text_formats/padding-spec.txt @@ -0,0 +1,625 @@ + + Tor Padding Specification + + Mike Perry, George Kadianakis + +Note: This is an attempt to specify Tor as currently implemented. Future +versions of Tor will implement improved algorithms. + +This document tries to cover how Tor chooses to use cover traffic to obscure +various traffic patterns from external and internal observers. Other +implementations MAY take other approaches, but implementors should be aware of +the anonymity and load-balancing implications of their choices. + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL + NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and + "OPTIONAL" in this document are to be interpreted as described in + RFC 2119. + +Table of Contents + + 1. Overview + 2. Connection-level padding + 2.1. Background + 2.2. Implementation + 2.3. 
Padding Cell Timeout Distribution Statistics + 2.4. Maximum overhead bounds + 2.5. Reducing or Disabling Padding via Negotiation + 2.6. Consensus Parameters Governing Behavior + 3. Circuit-level padding + 3.1. Circuit Padding Negotiation + 3.2. Circuit Padding Machine Message Management + 3.3. Obfuscating client-side onion service circuit setup + 3.3.1. Common general circuit construction sequences + 3.3.2. Client-side onion service introduction circuit obfuscation + 3.3.3. Client-side rendezvous circuit hiding + 3.3.4. Circuit setup machine overhead + 3.4. Circuit padding consensus parameters + A. Acknowledgments + +1. Overview + + Tor supports two classes of cover traffic: connection-level padding, and + circuit-level padding. + + Connection-level padding uses the CELL_PADDING cell command for cover + traffic, whereas circuit-level padding uses the RELAY_COMMAND_DROP relay + command. CELL_PADDING is single-hop only and can be differentiated from + normal traffic by Tor relays ("internal" observers), but not by entities + monitoring Tor OR connections ("external" observers). + + RELAY_COMMAND_DROP is multi-hop, and is not visible to intermediate Tor + relays, because the relay command field is covered by circuit layer + encryption. Moreover, Tor's 'recognized' field allows RELAY_COMMAND_DROP + padding to be sent to any intermediate node in a circuit (as per Section + 6.1 of tor-spec.txt). + + Tor uses both connection level and circuit level padding. Connection + level padding is described in section 2. Circuit level padding is + described in section 3. + + The circuit-level padding system is completely orthogonal to the + connection-level padding. The connection-level padding system regards + circuit-level padding as normal data traffic, and hence the connection-level + padding system will not add any additional overhead while the circuit-level + padding system is actively padding. + + +2. Connection-level padding + +2.1.
Background + + Tor clients and relays make use of CELL_PADDING to reduce the resolution of + connection-level metadata retention by ISPs and surveillance infrastructure. + + Such metadata retention is implemented by Internet routers in the form of + Netflow, jFlow, Netstream, or IPFIX records. These records are emitted by + gateway routers in a raw form and then exported (often over plaintext) to a + "collector" that either records them verbatim, or reduces their granularity + further[1]. + + Netflow records and the associated data collection and retention tools are + very configurable, and have many modes of operation, especially when + configured to handle high throughput. However, at ISP scale, per-flow records + are very likely to be employed, since they are the default, and also provide + very high resolution in terms of endpoint activity, second only to full packet + and/or header capture. + + Per-flow records record the endpoint connection 5-tuple, as well as the + total number of bytes sent and received by that 5-tuple during a particular + time period. They can store additional fields as well, but it is primarily + timing and bytecount information that concern us. + + When configured to provide per-flow data, routers emit these raw flow + records periodically for all active connections passing through them + based on two parameters: the "active flow timeout" and the "inactive + flow timeout". + + The "active flow timeout" causes the router to emit a new record + periodically for every active TCP session that continuously sends data. The + default active flow timeout for most routers is 30 minutes, meaning that a + new record is created for every TCP session at least every 30 minutes, no + matter what. This value can be configured from 1 minute to 60 minutes on + major routers. + + The "inactive flow timeout" is used by routers to create a new record if a + TCP session is inactive for some number of seconds. 
It allows routers to + avoid the need to track a large number of idle connections in memory, and + instead emit a separate record only when there is activity. This value + ranges from 10 seconds to 600 seconds on common routers. It appears as + though no routers support a value lower than 10 seconds. + + For reference, here are default values and ranges (in parentheses when + known) for common routers, along with citations to their manuals. + + Some routers speak other collection protocols than Netflow, and in the + case of Juniper, use different timeouts for these protocols. Where this + is known to happen, it has been noted. +
+
+                            Inactive Timeout    Active Timeout
+    Cisco IOS[3]            15s (10-600s)       30min (1-60min)
+    Cisco Catalyst[4]       5min                32min
+    Juniper (jFlow)[5]      15s (10-600s)       30min (1-60min)
+    Juniper (Netflow)[6,7]  60s (10-600s)       30min (1-30min)
+    H3C (Netstream)[8]      60s (60-600s)       30min (1-60min)
+    Fortinet[9]             15s                 30min
+    MikroTik[10]            15s                 30min
+    nProbe[14]              30s                 120s
+    Alcatel-Lucent[2]       15s (10-600s)       30min (1-600min)
+
+ The combination of the active and inactive netflow record timeouts allows us + to devise a low-cost padding defense that causes what would otherwise be + split records to "collapse" at the router even before they are exported to + the collector for storage. So long as a connection transmits data before the + "inactive flow timeout" expires, then the router will continue to count the + total bytes on that flow before finally emitting a record at the "active + flow timeout". + + This means that for a minimal amount of padding that prevents the "inactive + flow timeout" from expiring, it is possible to reduce the resolution of raw + per-flow netflow data to the total amount of bytes sent and received in a 30 + minute window. This is a vast reduction in resolution for HTTP, IRC, XMPP, + SSH, and other intermittent interactive traffic, especially when all + user traffic in that time period is multiplexed over a single connection + (as it is with Tor).
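The record "collapse" described above can be illustrated with a small Python sketch. It uses a deliberately simplified model of a router, not the behavior of any particular vendor: a new record starts whenever the gap since the last packet reaches the inactive timeout, or the current record has lasted for the active timeout. The function name `count_flow_records` and the example packet times are assumptions made for illustration.

```python
def count_flow_records(send_times, inactive_timeout, active_timeout):
    """Count the records emitted for one flow under a simplified model.

    send_times is a sorted list of packet times in seconds.
    """
    if not send_times:
        return 0
    records = 1
    record_start = last_seen = send_times[0]
    for t in send_times[1:]:
        idle_too_long = t - last_seen >= inactive_timeout
        record_too_old = t - record_start >= active_timeout
        if idle_too_long or record_too_old:
            records += 1          # the router exports and starts a new record
            record_start = t
        last_seen = t
    return records


# Three bursts of user activity, 40 and 60 seconds apart.
bursty = [0, 40, 100]
# The same flow with a padding packet at least every 9 seconds.
padded = sorted(set(bursty) | set(range(0, 101, 9)))

print(count_flow_records(bursty, inactive_timeout=15, active_timeout=1800))
print(count_flow_records(padded, inactive_timeout=15, active_timeout=1800))
```

In this model the unpadded flow is split into one record per burst, while the padded flow stays in a single record until the active timeout.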
+ + Though flow measurement in principle can be bidirectional (counting cells + sent in both directions between a pair of IPs) or unidirectional (counting + only cells sent from one IP to another), we assume for safety that all + measurement is unidirectional, and so traffic must be sent by both parties + in order to prevent record splitting. + +2.2. Implementation + + Tor clients currently maintain one TLS connection to their Guard node to + carry actual application traffic, and make up to 3 additional connections to + other nodes to retrieve directory information. + + We pad only the client's connection to the Guard node, and not any other + connection. We treat Bridge node connections to the Tor network as client + connections, and pad them, but otherwise do not pad between normal relays. + + Both clients and Guards will maintain a timer for all application (i.e., + non-directory) TLS connections. Every time a padding packet is sent by an + endpoint, that endpoint will sample a timeout value from + the max(X,X) distribution described in Section 2.3. The default + range is from 1.5 seconds to 9.5 seconds, subject to consensus + parameters as specified in Section 2.6. + + (The timing is randomized to avoid making it obvious which cells are + padding.) + + If another cell is sent for any reason before this timer expires, the timer + is reset to a new random value. + + If the connection remains inactive until the timer expires, a + single CELL_PADDING cell will be sent on that connection (which will + also start a new timer). + + In this way, the connection will only be padded in a given direction in + the event that it is idle in that direction, and will always transmit a + packet before the minimum 10 second inactive timeout. + + (In practice, an implementation may not be able to determine when, + exactly, a cell is sent on a given channel.
For example, even though the + cell has been given to the kernel via a call to `send(2)`, the kernel may + still be buffering that cell. In cases such as these, implementations + should use a reasonable proxy for the time at which a cell is sent: for + example, when the cell is queued. If this strategy is used, + implementations should try to observe the innermost (closest to the wire) + queue that they practically can, and if this queue is already nonempty, + padding should not be scheduled until after the queue does become empty.) + +2.3. Padding Cell Timeout Distribution Statistics + + To limit the amount of padding sent, instead of sampling each endpoint + timeout uniformly, we instead sample it from max(X,X), where X is + uniformly distributed. + + If X is a random variable uniform from 0..R-1 (where R=high-low), then the + random variable Y = max(X,X) has Prob(Y == i) = (2.0*i + 1)/(R*R). + + Then, when both sides apply timeouts sampled from Y, the resulting + bidirectional padding packet rate is now a third random variable: + Z = min(Y,Y). + + The distribution of Z is slightly bell-shaped, but mostly flat around the + mean. It also turns out that Exp[Z] ~= Exp[X]. Here's a table of average + values for each random variable: +
+
+     R       Exp[X]    Exp[Z]    Exp[min(X,X)]   Exp[Y=max(X,X)]
+     2000     999.5    1066        666.2           1332.8
+     3000    1499.5    1599.5      999.5           1999.5
+     5000    2499.5    2666       1666.2           3332.8
+     6000    2999.5    3199.5     1999.5           3999.5
+     7000    3499.5    3732.8     2332.8           4666.2
+     8000    3999.5    4266.2     2666.2           5332.8
+     10000   4999.5    5328       3332.8           6666.2
+     15000   7499.5    7995       4999.5           9999.5
+     20000   9999.5    10661      6666.2           13332.8
+
+
+2.4. Maximum overhead bounds + + With the default parameters and the above distribution, we expect a + padded connection to send one padding cell every 5.5 seconds. This + averages to 103 bytes per second full duplex (~52 bytes/sec in each + direction), assuming a 512 byte cell and 55 bytes of TLS+TCP+IP headers.
+ For a client connection that remains otherwise idle for its expected + ~50 minute lifespan (governed by the circuit available timeout plus a + small additional connection timeout), this is about 154.5KB of overhead + in each direction (309KB total). + + With 2.5M completely idle clients connected simultaneously, 52 bytes per + second amounts to 130MB/second in each direction network-wide, which is + roughly the current amount of Tor directory traffic[11]. Of course, our + 2.5M daily users will neither be connected simultaneously, nor entirely + idle, so we expect the actual overhead to be much lower than this. + +2.5. Reducing or Disabling Padding via Negotiation + + To allow mobile clients to either disable or reduce their padding overhead, + the CELL_PADDING_NEGOTIATE cell (tor-spec.txt section 7.2) may be sent from + clients to relays. This cell is used to instruct relays to cease sending + padding. + + If the client has opted to use reduced padding, it continues to send + padding cells sampled from the range [9000,14000] milliseconds (subject to + consensus parameter alteration as per Section 2.6), still using the + Y=max(X,X) distribution. Since the padding is now unidirectional, the + expected frequency of padding cells is now governed by the Y distribution + above as opposed to Z. For a range of 5000ms, we can see that we expect to + send a padding packet every 9000+3332.8 = 12332.8ms. We also halve the + circuit available timeout from ~50min down to ~25min, which causes the + client's OR connections to be closed shortly thereafter when it is idle, + thus reducing overhead. + + These two changes cause the padding overhead to go from 309KB per one-time-use + Tor connection down to 69KB per one-time-use Tor connection. For continual + usage, the maximum overhead goes from 103 bytes/sec down to 46 bytes/sec.
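The overhead figures in Sections 2.4 and 2.5 can be reproduced with a short Python calculation. This is only a worked check of the arithmetic: the 5.5 second default interval, the 512 byte cell, the 55 bytes of headers, and the ~50 and ~25 minute lifespans are taken from the text, while the function and variable names are illustrative. Rates are rounded to whole bytes per second before computing totals, which appears to be how the quoted totals were derived.

```python
CELL_BYTES = 512
HEADER_BYTES = 55   # TLS+TCP+IP headers, as assumed in Section 2.4


def expected_max_uniform(r):
    """E[max(X,X)] for X uniform on 0..r-1, using Prob(Y == i) = (2i+1)/r^2."""
    return sum(i * (2 * i + 1) for i in range(r)) / (r * r)


def bytes_per_second(interval_s):
    """Average padding overhead when one cell is sent every interval_s seconds."""
    return (CELL_BYTES + HEADER_BYTES) / interval_s


# Default padding: one cell every 5.5 seconds, connection idle for ~50 minutes.
default_rate = bytes_per_second(5.5)
default_total_kb = round(default_rate) * 50 * 60 / 1000

# Reduced padding: range [9000, 14000] ms, so R = 5000, and ~25 minutes.
reduced_interval_ms = 9000 + expected_max_uniform(5000)
reduced_rate = bytes_per_second(reduced_interval_ms / 1000)
reduced_total_kb = round(reduced_rate) * 25 * 60 / 1000

print(round(default_rate), default_total_kb)
print(round(reduced_interval_ms, 1), round(reduced_rate), reduced_total_kb)
```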
+ + If a client opts to completely disable padding, it sends a + CELL_PADDING_NEGOTIATE to instruct the relay not to pad, and then does not + send any further padding itself. + + Currently, clients negotiate padding only when a channel is created, + immediately after sending their NETINFO cell. Recipients SHOULD, however, + accept padding negotiation messages at any time. + + If a client that previously negotiated reduced or disabled padding + wishes to re-enable default padding (i.e., padding according to the consensus + parameters), it SHOULD send CELL_PADDING_NEGOTIATE START with zero in the + ito_low_ms and ito_high_ms fields. (It therefore SHOULD NOT copy the values + from its own established consensus into the CELL_PADDING_NEGOTIATE cell.) + This avoids the client needing to send updated padding negotiations if the + consensus parameters should change. The recipient's clamping of the timing + parameters will cause the recipient to use its notion of the consensus + parameters. + + Clients and bridges MUST reject padding negotiation messages from relays, + and close the channel if they receive one. + +2.6. Consensus Parameters Governing Behavior + + Connection-level padding is controlled by the following consensus parameters: + + * nf_ito_low + - The low end of the range to send padding when inactive, in ms. + - Default: 1500 + + * nf_ito_high + - The high end of the range to send padding, in ms. + - Default: 9500 + - If nf_ito_low == nf_ito_high == 0, padding will be disabled. + + * nf_ito_low_reduced + - For reduced padding clients: the low end of the range to send padding + when inactive, in ms. + - Default: 9000 + + * nf_ito_high_reduced + - For reduced padding clients: the high end of the range to send padding, + in ms. + - Default: 14000 + + * nf_conntimeout_clients + - The number of seconds to keep never-used circuits opened and + available for clients to use.
Note that the actual client timeout is + randomized uniformly from this value to twice this value. + - The number of seconds to keep idle (not currently used) canonical + channels open and available. (We do this to ensure a sufficient + time duration of padding, which is the ultimate goal.) + - This value is also used to determine how long, after a port has been + used, we should attempt to keep building predicted circuits for that + port. (See path-spec.txt section 2.1.1.) This behavior was + originally added to work around implementation limitations, but it + serves as a reasonable default regardless of implementation. + - For all use cases, reduced padding clients use half the consensus + value. + - Implementations MAY mark circuits held open past the reduced padding + quantity (half the consensus value) as "not to be used for streams", + to prevent their use from becoming a distinguisher. + - Default: 1800 + + * nf_pad_before_usage + - If set to 1, OR connections are padded before the client uses them + for any application traffic. If 0, OR connections are not padded + until application data begins. + - Default: 1 + + * nf_pad_relays + - If set to 1, we also pad inactive relay-to-relay connections. + - Default: 0 + + * nf_conntimeout_relays + - The number of seconds that idle relay-to-relay connections are kept + open. + - Default: 3600 + + +3. Circuit-level padding + + The circuit padding system in Tor is an extension of the WTF-PAD + event-driven state machine design[15]. At a high level, this design places + one or more padding state machines at the client, and one or more padding + state machines at a relay, on each circuit. + + State transition and histogram generation have been generalized to be fully + programmable, and probability distribution support was added to support more + compact representations like APE[16]. Additionally, packet count limits, + rate limiting, and circuit application conditions have been added.
+ + At present, Tor uses this system to deploy two pairs of circuit padding + machines, to obscure differences between the setup phase of client-side + onion service circuits, up to the first 10 cells. + + This specification covers only the resulting behavior of these padding + machines, and thus does not cover the state machine implementation details or + operation. For full details on using the circuit padding system to develop + future padding defenses, see the research developer documentation[17]. + +3.1. Circuit Padding Negotiation + + Circuit padding machines are advertised as "Padding" subprotocol versions + (see tor-spec.txt Section 9). The onion service circuit padding machines are + advertised as "Padding=2". + + Because circuit padding machines only become active at certain points in + circuit lifetime, and because more than one padding machine may be active at + any given point in circuit lifetime, there is also a padding negotiation + cell and a negotiated response. These are relay commands 41 and 42, with + relay headers as per section 6.1 of tor-spec.txt. + + The fields of the relay cell Data payload of a negotiate request are + as follows: + + const CIRCPAD_COMMAND_STOP = 1; + const CIRCPAD_COMMAND_START = 2; + + const CIRCPAD_RESPONSE_OK = 1; + const CIRCPAD_RESPONSE_ERR = 2; + + const CIRCPAD_MACHINE_CIRC_SETUP = 1; + + struct circpad_negotiate { + u8 version IN [0]; + u8 command IN [CIRCPAD_COMMAND_START, CIRCPAD_COMMAND_STOP]; + + u8 machine_type IN [CIRCPAD_MACHINE_CIRC_SETUP]; + + u8 unused; // Formerly echo_request + + u32 machine_ctr; + }; + + When a client wants to start a circuit padding machine, it first checks that + the desired destination hop advertises the appropriate subprotocol version for + that machine. It then sends a circpad_negotiate cell to that hop with + command=CIRCPAD_COMMAND_START, and machine_type=CIRCPAD_MACHINE_CIRC_SETUP (for + the circ setup machine, the destination hop is the second hop in the + circuit). 
The machine_ctr is the count of which machine instance this is on + the circuit. It is used to disambiguate shutdown requests. + + When a relay receives a circpad_negotiate cell, it checks that it supports + the requested machine, and sends a circpad_negotiated cell, which is formatted + in the data payload of a relay cell with command number 42 (see tor-spec.txt + section 6.1), as follows: + + struct circpad_negotiated { + u8 version IN [0]; + u8 command IN [CIRCPAD_COMMAND_START, CIRCPAD_COMMAND_STOP]; + u8 response IN [CIRCPAD_RESPONSE_OK, CIRCPAD_RESPONSE_ERR]; + + u8 machine_type IN [CIRCPAD_MACHINE_CIRC_SETUP]; + + u32 machine_ctr; + }; + + If the machine is supported, the response field will contain + CIRCPAD_RESPONSE_OK. If it is not, it will contain CIRCPAD_RESPONSE_ERR. + + Either side may send a CIRCPAD_COMMAND_STOP to shut down the padding machines + (clients MUST only send circpad_negotiate, and relays MUST only send + circpad_negotiated for this purpose). + + If the machine_ctr does not match the current machine instance count + on the circuit, the command is ignored. + +3.2. Circuit Padding Machine Message Management + + Clients MAY send padding cells towards the relay before receiving the + circpad_negotiated response, to allow for outbound cover traffic before + negotiation completes. + + Clients MAY send another circpad_negotiate cell before receiving the + circpad_negotiated response, to allow for rapid machine changes. + + Relays MUST NOT send padding cells or circpad_negotiated cells, unless a + padding machine is active. Any padding-related cells that arrive at the client + from unexpected relay sources are protocol violations, and clients MAY + immediately tear down such circuits to avoid side channel risk. + +3.3. Obfuscating client-side onion service circuit setup + + The circuit padding currently deployed in Tor attempts to hide client-side + onion service circuit setup. 
Service-side setup is not covered, because doing + so would involve significantly more overhead, and/or require interaction with + the application layer. + + The approach taken aims to make client-side introduction and rendezvous + circuits match the cell direction sequence and cell count of 3 hop general + circuits used for normal web traffic, for the first 10 cells only. The + lifespan of introduction circuits is also made to match the lifespan + of general circuits. + + Note that inter-arrival timing is not obfuscated by this defense. + +3.3.1. Common general circuit construction sequences + + Most general Tor circuits used to surf the web or download directory + information start with the following 6-cell relay cell sequence (cells + surrounded in [brackets] are outgoing, the others are incoming): + + [EXTEND2] -> EXTENDED2 -> [EXTEND2] -> EXTENDED2 -> [BEGIN] -> CONNECTED + + When this is done, the client has established a 3-hop circuit and also opened + a stream to the other end. Usually after this comes a series of DATA cells that + either fetch pages, establish an SSL connection, or fetch directory + information: + + [DATA] -> [DATA] -> DATA -> DATA...(inbound cells continue) + + The above stream of 10 relay cells defines the vast majority of general + circuits that come out of Tor browser during our testing, and it's what we use + to make introduction and rendezvous circuits blend in. + + Please note that in this section we only investigate relay cells and not + connection-level cells like CREATE/CREATED or AUTHENTICATE/etc. that are used + during the link-layer handshake. The rationale is that connection-level cells + depend on the type of guard used and are not an effective fingerprint for a + network/guard-level adversary. + +3.3.2. Client-side onion service introduction circuit obfuscation + + Two circuit padding machines work to hide client-side introduction circuits: + one machine at the origin, and one machine at the second hop of the circuit.
+ Each machine sends padding towards the other. The padding from the origin-side + machine terminates at the second hop and does not get forwarded to the actual + introduction point. + + From Section 3.3.1 above, most general circuits have the following initial + relay cell sequence (outgoing cells marked in [brackets]): + + [EXTEND2] -> EXTENDED2 -> [EXTEND2] -> EXTENDED2 -> [BEGIN] -> CONNECTED + -> [DATA] -> [DATA] -> DATA -> DATA...(inbound data cells continue) + + Whereas normal introduction circuits usually look like: + + [EXTEND2] -> EXTENDED2 -> [EXTEND2] -> EXTENDED2 -> [EXTEND2] -> EXTENDED2 + -> [INTRO1] -> INTRODUCE_ACK + + This means that up to the sixth cell (first line of each sequence above), + both general and intro circuits have identical cell sequences. After that + we want to mimic the second line sequence of + + -> [DATA] -> [DATA] -> DATA -> DATA...(inbound data cells continue) + + We achieve this by starting padding after INTRODUCE1 has been sent. With padding + negotiation cells, in the common case the second line looks like: + + -> [INTRO1] -> [PADDING_NEGOTIATE] -> PADDING_NEGOTIATED -> INTRO_ACK + + Then, the middle node will send between INTRO_MACHINE_MINIMUM_PADDING (7) and + INTRO_MACHINE_MAXIMUM_PADDING (10) cells, to match the "...(inbound data cells + continue)" portion of the trace (aka the rest of an HTTPS response body). + + We also set a special flag which keeps the circuit open even after the + introduction is performed. With this feature the circuit will stay alive for + the same duration as normal web circuits before they expire (usually 10 + minutes). + +3.3.3.
Client-side rendezvous circuit hiding + + Following a similar argument as for intro circuits, we are aiming for padded + rendezvous circuits to blend in with the initial cell sequence of general + circuits which usually look like this: + + [EXTEND2] -> EXTENDED2 -> [EXTEND2] -> EXTENDED2 -> [BEGIN] -> CONNECTED + -> [DATA] -> [DATA] -> DATA -> DATA...(incoming cells continue) + + Whereas normal rendezvous circuits usually look like: + + [EXTEND2] -> EXTENDED2 -> [EXTEND2] -> EXTENDED2 -> [EST_REND] -> REND_EST + -> REND2 -> [BEGIN] + + This means that up to the sixth cell (the first line), both general and + rend circuits have identical cell sequences. + + After that we want to mimic a [DATA] -> [DATA] -> DATA -> DATA sequence. + + With padding negotiation right after the REND_ESTABLISHED, the sequence + becomes: + + [EXTEND2] -> EXTENDED2 -> [EXTEND2] -> EXTENDED2 -> [EST_REND] -> REND_EST + -> [PADDING_NEGOTIATE] -> [DROP] -> PADDING_NEGOTIATED -> DROP... + + After which normal application DATA cells continue on the circuit. + + Hence this way we make rendezvous circuits look like general circuits up + until the end of the circuit setup. + + After that our machine gets deactivated, and we let the actual rendezvous + circuit shape the traffic flow. Since rendezvous circuits usually imitate + general circuits (their purpose is to surf the web), we can expect that they + will look alike. + +3.3.4. Circuit setup machine overhead + + For the intro circuit case, we see that the origin-side machine just sends a + single [PADDING_NEGOTIATE] cell, whereas the relay-side machine sends a + PADDING_NEGOTIATED cell and between 7 and 10 DROP cells. This means that the + average overhead of this machine is 11 padding cells per introduction circuit. + + For the rend circuit case, this machine is quite light. Both sides send 2 + padding cells, for a total of 4 padding cells. + +3.4.
Circuit padding consensus parameters + + The circuit padding system has a handful of consensus parameters that can + either disable circuit padding entirely, or rate limit the total overhead + at relays and clients. + + * circpad_padding_disabled + - If set to 1, no circuit padding machines will negotiate, and all + current padding machines will cease padding immediately. + - Default: 0 + + * circpad_padding_reduced + - If set to 1, only circuit padding machines marked as "reduced"/"low + overhead" will be used. (Currently no such machines are marked + as "reduced overhead"). + - Default: 0 + + * circpad_global_allowed_cells + - This is the number of padding cells that must be sent before + the 'circpad_global_max_padding_percent' parameter is applied. + - Default: 0 + + * circpad_global_max_padding_percent + - This is the maximum ratio of padding cells to total cells, specified + as a percent. If the global ratio of padding cells to total cells + across all circuits exceeds this percent value, no more padding is sent + until the ratio becomes lower. 0 means no limit. + - Default: 0 + + * circpad_max_circ_queued_cells + - This is the maximum number of cells that can be in the circuitmux queue + before padding stops being sent on that circuit. + - Default: CIRCWINDOW_START_MAX (1000) + + +A. Acknowledgments + + This research was supported in part by NSF grants CNS-1111539, + CNS-1314637, CNS-1526306, CNS-1619454, and CNS-1640548. + +1. https://en.wikipedia.org/wiki/NetFlow +2. http://infodoc.alcatel-lucent.com/html/0_add-h-f/93-0073-10-01/7750_SR_OS_Router_Configuration_Guide/Cflowd-CLI.html +3. http://www.cisco.com/en/US/docs/ios/12_3t/netflow/command/reference/nfl_a1gt_ps5207_TSD_Products_Command_Reference_Chapter.html#wp1185203 +4. http://www.cisco.com/c/en/us/support/docs/switches/catalyst-6500-series-switches/70974-netflow-catalyst6500.html#opconf +5. 
https://www.juniper.net/techpubs/software/erx/junose60/swconfig-routing-vol1/html/ip-jflow-stats-config4.html#560916 +6. http://www.jnpr.net/techpubs/en_US/junos15.1/topics/reference/configuration-statement/flow-active-timeout-edit-forwarding-options-po.html +7. http://www.jnpr.net/techpubs/en_US/junos15.1/topics/reference/configuration-statement/flow-active-timeout-edit-forwarding-options-po.html +8. http://www.h3c.com/portal/Technical_Support___Documents/Technical_Documents/Switches/H3C_S9500_Series_Switches/Command/Command/H3C_S9500_CM-Release1648%5Bv1.24%5D-System_Volume/200901/624854_1285_0.htm#_Toc217704193 +9. http://docs-legacy.fortinet.com/fgt/handbook/cli52_html/FortiOS%205.2%20CLI/config_system.23.046.html +10. http://wiki.mikrotik.com/wiki/Manual:IP/Traffic_Flow +11. https://metrics.torproject.org/dirbytes.html +12. http://freehaven.net/anonbib/cache/murdoch-pet2007.pdf +13. https://gitweb.torproject.org/torspec.git/tree/proposals/188-bridge-guards.txt +14. http://www.ntop.org/wp-content/uploads/2013/03/nProbe_UserGuide.pdf +15. http://arxiv.org/pdf/1512.00524 +16. https://www.cs.kau.se/pulls/hot/thebasketcase-ape/ +17. https://github.com/torproject/tor/tree/master/doc/HACKING/CircuitPaddingDevelopment.md +18. https://www.usenix.org/node/190967 + https://blog.torproject.org/technical-summary-usenix-fingerprinting-paper + diff --git a/attic/text_formats/param-spec.txt b/attic/text_formats/param-spec.txt new file mode 100644 index 0000000..d8ea80b --- /dev/null +++ b/attic/text_formats/param-spec.txt @@ -0,0 +1,517 @@ + + Tor network parameters + +This file lists the recognized parameters that can appear on the "params" +line of a directory consensus. + +Table of Contents + + 1. Network protocol parameters + 2. Performance-tuning parameters + 3. Voting-related parameters + 4. Circuit-build-timeout parameters + 5. Directory-related parameters + 6. Pathbias parameters + 7. Relay behavior + 8. V3 onion service parameters + 9. 
Denial-of-service parameters + 10. Padding-related parameters + 11. Guard-related parameters + X. Obsolete parameters + +1. Network protocol parameters + + "circwindow" -- the default package window that circuits should be + established with. It started out at 1000 cells, but some research + indicates that a lower value would mean fewer cells in transit in the + network at any given time. + Min: 100, Max: 1000, Default: 1000 + First-appeared: Tor 0.2.1.20 + + "UseOptimisticData" -- If set to zero, clients by default shouldn't try + to send optimistic data to servers until they have received a + RELAY_CONNECTED cell. + Min: 0, Max: 1, Default: 1 + First-appeared: 0.2.3.3-alpha + Default was 0 before: 0.2.9.1-alpha + Removed in 0.4.5.1-alpha; now always on. + + "usecreatefast" -- Used to control whether clients use the CREATE_FAST + handshake on the first hop of their circuits. + Min: 0, Max: 1. Default: 1. + First-appeared: 0.2.4.23, 0.2.5.2-alpha + Removed in 0.4.5.1-alpha; now always off. + + "min_paths_for_circs_pct" -- A percentage threshold that determines + whether clients believe they have enough directory information to + build circuits. This value applies to the total fraction of + bandwidth-weighted paths that the client could build; see + path-spec.txt for more information. + Min: 25, Max: 95, Default: 60 + First-appeared: 0.2.4 + + "ExtendByEd25519ID" -- If true, clients should include Ed25519 + identities for relays when generating EXTEND2 cells. + Min: 0. Max: 1. Default: 0. + First-appeared: 0.3.0 + + "sendme_emit_min_version" -- Minimum SENDME version that can be sent. + Min: 0. Max: 255. Default 0. + First appeared: 0.4.1.1-alpha. + + "sendme_accept_min_version" -- Minimum SENDME version that is accepted. + Min: 0. Max: 255. Default 0. + First appeared: 0.4.1.1-alpha. + + "allow-network-reentry" -- If true, the Exit relays allow connections that + are exiting the network to re-enter. 
If false, any exit connection going + to a relay ORPort, or to an authority ORPort or DirPort, is denied and the + stream is terminated. + Min: 0. Max: 1. Default: 0 + First appeared: 0.4.5.1-alpha. + +2. Performance-tuning parameters + + "CircuitPriorityHalflifeMsec" -- the halflife parameter used when + weighting which circuit will send the next cell. Obeyed by Tor + 0.2.2.10-alpha and later. (Versions of Tor between 0.2.2.7-alpha and + 0.2.2.10-alpha recognized a "CircPriorityHalflifeMsec" parameter, but + mishandled it badly.) + Min: 1, Max: 2147483647 (INT32_MAX), Default: 30000. + First-appeared: Tor 0.2.2.11-alpha + + "perconnbwrate" and "perconnbwburst" -- if set, each relay sets up a + separate token bucket for every client OR connection, and rate limits + that connection independently. Typically left unset, except when used for + performance experiments around trac entry 1750. Only honored by relays + running Tor 0.2.2.16-alpha and later. (Note that relays running + 0.2.2.7-alpha through 0.2.2.14-alpha looked for bwconnrate and + bwconnburst, but then did the wrong thing with them; see bug 1830 for + details.) + Min: 1, Max: 2147483647 (INT32_MAX), Default: (user setting of + BandwidthRate/BandwidthBurst). + First-appeared: 0.2.2.7-alpha + Removed-in: 0.2.2.16-alpha + + "NumNTorsPerTAP" -- When balancing ntor and TAP cells at relays, + how many ntor handshakes should we perform for each TAP handshake? + Min: 1. Max: 100000. Default: 10. + First-appeared: 0.2.4.17-rc + + "circ_max_cell_queue_size" -- This parameter determines the maximum + number of cells allowed per circuit queue. + Min: 1000. Max: 2147483647 (INT32_MAX). Default: 50000. + First-appeared: 0.3.3.6-rc. + + "KISTSchedRunInterval" -- How frequently should the "KIST" scheduler + run in order to decide which data to write to the network? Value in + units of milliseconds. + Min: 2. Max: 100. 
Default: 2 + First appeared: 0.3.2 + + "KISTSchedRunIntervalClient" -- How frequently should the "KIST" scheduler + run in order to decide which data to write to the network, on clients? Value + in units of milliseconds. The client value needs to be much lower than + the relay value. + Min: 2. Max: 100. Default: 2. + First appeared: 0.4.8.2 + +3. Voting-related parameters + + "bwweightscale" -- Value that bandwidth-weights are divided by. If not + present then this defaults to 10000. + Min: 1 + First-appeared: 0.2.2.10-alpha + + "maxunmeasuredbw" -- Used by authorities during voting with method 17 or + later. The maximum value to give for any Bandwidth= entry for a router + that isn't based on at least three measurements. + + (Note: starting in version 0.4.6.1-alpha + there was a bug where Tor authorities would instead look at + a parameter called "maxunmeasurdbw", without the "e". + This bug was fixed in 0.4.9.1-alpha and in 0.4.8.8. + Until all relays are running a fixed version, then either this parameter + must not be set, or it must be set to the same value for both + spellings.) + + First-appeared: 0.2.4.11-alpha + + "FastFlagMinThreshold", "FastFlagMaxThreshold" -- lowest and highest + allowable values for the cutoff for routers that should get the Fast + flag. This is used during voting to prevent the threshold for getting + the Fast flag from being too low or too high. + FastFlagMinThreshold: Min: 4. Max: INT32_MAX: Default: 4. + FastFlagMaxThreshold: Min: -. Max: INT32_MAX: Default: INT32_MAX + First-appeared: 0.2.3.11-alpha + + "AuthDirNumSRVAgreements" -- Minimum number of agreeing directory + authority votes required for a fresh shared random value to be written in + the consensus (this rule only applies on the first commit round of the + shared randomness protocol). + Min: 1. Max: INT32_MAX. Default: 2/3 of the total number of + dirauth. + +4. 
Circuit-build-timeout parameters + + "cbtdisabled", "cbtnummodes", "cbtrecentcount", "cbtmaxtimeouts", + "cbtmincircs", "cbtquantile", "cbtclosequantile", "cbttestfreq", + "cbtmintimeout", "cbtlearntimeout", "cbtmaxopencircs", and + "cbtinitialtimeout" -- see "2.4.5. Consensus parameters governing + behavior" in path-spec.txt for a series of circuit build time related + consensus parameters. + + +5. Directory-related parameters + + "max-consensus-age-to-cache-for-diff" -- Determines how much + consensus history (in hours) relays should try to cache in order to + serve diffs. (min 0, max 8192, default 72) + + "try-diff-for-consensus-newer-than" -- This parameter determines how + old a consensus can be (in hours) before a client should no longer + try to find a diff for it. (min 0, max 8192, default 72) + +6. Pathbias parameters + + "pb_mincircs", "pb_noticepct", "pb_warnpct", "pb_extremepct", + "pb_dropguards", "pb_scalecircs", "pb_scalefactor", + "pb_multfactor", "pb_minuse", "pb_noticeusepct", + "pb_extremeusepct", "pb_scaleuse" -- DOCDOC + +7. Relay behavior + + "refuseunknownexits" -- if set to one, exit relays look at the previous + hop of circuits that ask to open an exit stream, and refuse to exit if + they don't recognize it as a relay. The goal is to make it harder for + people to use them as one-hop proxies. See trac entry 1751 for details. + Min: 0, Max: 1 + First-appeared: 0.2.2.17-alpha + + "onion-key-rotation-days" -- (min 1, max 90, default 28) + + "onion-key-grace-period-days" -- (min 1, max + onion-key-rotation-days, default 7) + + Every relay should list each onion key it generates for + onion-key-rotation-days days after generating it, and then + replace it. Relays should continue to accept their most recent + previous onion key for an additional onion-key-grace-period-days + days after it is replaced. (Introduced in 0.3.1.1-alpha; + prior versions of tor hardcoded both of these values to 7 days.) 
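The onion key timeline described above can be sketched as follows. This is a minimal illustration, not Tor's implementation: the function name `onion_key_schedule` is hypothetical, and clamping out-of-range values to the documented limits is an assumption.

```python
from datetime import datetime, timedelta, timezone


def onion_key_schedule(generated, rotation_days=28, grace_days=7):
    """Return (replaced_at, accepted_until) for an onion key.

    Hypothetical helper. Clamping to the documented ranges is an
    assumption about how out-of-range values are treated.
    """
    # onion-key-rotation-days: min 1, max 90, default 28.
    rotation_days = max(1, min(90, rotation_days))
    # onion-key-grace-period-days: min 1, max onion-key-rotation-days.
    grace_days = max(1, min(rotation_days, grace_days))
    # The key is listed for rotation_days, then replaced.
    replaced_at = generated + timedelta(days=rotation_days)
    # The previous key is still accepted for grace_days after replacement.
    accepted_until = replaced_at + timedelta(days=grace_days)
    return replaced_at, accepted_until


if __name__ == "__main__":
    start = datetime(2023, 1, 1, tzinfo=timezone.utc)
    replaced, accepted = onion_key_schedule(start)
    print("replace key at:", replaced.date())
    print("accept old key until:", accepted.date())
```

With the default values, a key is accepted for at most 35 days after it was generated.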
+ + "AllowNonearlyExtend" -- If true, permit EXTEND cells that are not inside + RELAY_EARLY cells. + Min: 0. Max: 1. Default: 0. + First-appeared: 0.2.3.11-alpha + + "overload_dns_timeout_scale_percent" -- This value is a percentage of how + many DNS timeouts over N seconds we accept before reporting the overload + general state. It is scaled by a factor of 1000 in order to be able to + represent a decimal point. As an example, a value of 1000 means 1%. + Min: 0. Max: 100000. Default: 1000. + First-appeared: 0.4.6.8 + Deprecated: 0.4.7.3-alpha-dev + + "overload_dns_timeout_period_secs" -- This value is the period in seconds + of the DNS timeout measurements (the N in the + "overload_dns_timeout_scale_percent" parameter). For this number of + seconds, we will gather DNS statistics and at the end, we'll do an + assessment on the overload general signal with regard to DNS timeouts. + Min: 0. Max: 2147483647. Default: 600 + First-appeared: 0.4.6.8 + Deprecated: 0.4.7.3-alpha-dev + + "overload_onionskin_ntor_scale_percent" -- This value is a percentage of + how many onionskin ntor drops over N seconds we accept before reporting the + overload general state. It is scaled by a factor of 1000 in order to be + able to represent a decimal point. As an example, a value of 1000 means 1%. + Min: 0. Max: 100000. Default: 1000. + First-appeared: 0.4.7.5-alpha + + "overload_onionskin_ntor_period_secs" -- This value is the period in + seconds of the onionskin ntor overload measurements (the N in the + "overload_onionskin_ntor_scale_percent" parameter). For this number of + seconds, we will gather onionskin ntor statistics and at the end, we'll do + an assessment on the overload general signal. + Min: 0. Max: 2147483647. Default: 21600 (6 hours) + First-appeared: 0.4.7.5-alpha + + "assume-reachable" -- If true, relays should publish descriptors + even when they cannot make a connection to their IPv4 ORPort. + Min: 0. Max: 1. Default: 0. + First appeared: 0.4.5.1-alpha. 
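For the "overload_*_scale_percent" parameters earlier in this section, the factor-of-1000 scaling can be illustrated with integer arithmetic. This is only a sketch: the function name is hypothetical, and both the choice of denominator (all events seen during the period) and the strict ">" comparison are assumptions, since the text above does not state them.

```python
def exceeds_scaled_percent(events, total, scale_percent=1000):
    """Return True if events/total is above the scaled threshold.

    scale_percent is a percentage multiplied by 1000, so 1000 means 1%
    and 500 means 0.5%. Hypothetical helper; the denominator and the
    strict comparison are assumptions.
    """
    if total <= 0:
        return False
    # events / total > (scale_percent / 1000) / 100, without floats.
    return events * 100_000 > total * scale_percent


if __name__ == "__main__":
    # 2 drops out of 100 is 2%, above the default 1% threshold.
    print(exceeds_scaled_percent(2, 100))
    # 1 drop out of 100 is exactly 1%, not above it.
    print(exceeds_scaled_percent(1, 100))
```

Keeping the comparison in integers avoids rounding surprises near the threshold.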
+ + "assume-reachable-ipv6" -- If true, relays should publish + descriptors even when they cannot make a connection to their IPv6 + ORPort. + Min: 0. Max: 1. Default: 0. + First appeared: 0.4.5.1-alpha. + + "exit_dns_timeout" -- The time in milliseconds an Exit sets libevent to + wait before it considers the DNS query timed out. The corresponding libevent + option is "timeout:". + Min: 1. Max: 120000. Default: 1000 (1 sec) + First appeared: 0.4.7.5-alpha. + + "exit_dns_num_attempts" -- How many attempts _after the first_ should an + Exit make at a timing-out DNS query before calling it hopeless? (Each of + these attempts will wait for "exit_dns_timeout" independently). The + corresponding libevent option is "attempts:". + Min: 0. Max: 255. Default: 2 + First appeared: 0.4.7.5-alpha. + +8. V3 onion service parameters + + "hs_intro_min_introduce2", "hs_intro_max_introduce2" -- + Minimum/maximum number of INTRODUCE2 cells allowed per circuit + before rotation (actual number picked at random between these two + values). + Min: 0. Max: INT32_MAX. Defaults: 16384, 32768. + + "hs_intro_min_lifetime", "hs_intro_max_lifetime" -- Minimum/maximum + lifetime in seconds that a service should keep an intro point for + (actual lifetime picked at random between these two values). + Min: 0. Max: INT32_MAX. Defaults: 18 hours, 24 hours. + + "hs_intro_num_extra" -- Number of extra intro points a service is + allowed to open. This concept comes from proposal #155. + Min: 0. Max: 128. Default: 2. + + "hsdir_interval" -- The length of a time period, _in minutes_. See + rend-spec-v3.txt section [TIME-PERIODS]. + Min: 30. Max: 14400. Default: 1440. + + "hsdir_n_replicas" -- Number of HS descriptor replicas. + Min: 1. Max: 16. Default: 2. + + "hsdir_spread_fetch" -- Total number of HSDirs per replica a tor + client should select to try to fetch a descriptor. + Min: 1. Max: 128. Default: 3. 
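As an illustration of how "hsdir_n_replicas" and "hsdir_spread_fetch" combine, the sketch below computes an upper bound on the number of HSDirs a client may select for a fetch. The helper names are hypothetical, and treating an out-of-range consensus value by clamping it to the documented Min/Max is an assumption made for this example. The count is an upper bound because the same HSDir could be selected for more than one replica.

```python
def clamp_param(value, minimum, maximum, default):
    """Return a consensus parameter limited to its documented range.

    Hypothetical helper: a missing value yields the default, and an
    out-of-range value is clamped (an assumption for this sketch).
    """
    if value is None:
        return default
    return max(minimum, min(maximum, value))


def max_hsdirs_for_fetch(params):
    """Upper bound on HSDirs a client selects to fetch one descriptor."""
    n_replicas = clamp_param(params.get("hsdir_n_replicas"), 1, 16, 2)
    spread_fetch = clamp_param(params.get("hsdir_spread_fetch"), 1, 128, 3)
    # spread_fetch HSDirs are selected for each replica.
    return n_replicas * spread_fetch


if __name__ == "__main__":
    # With no values in the consensus, the defaults give 2 * 3.
    print(max_hsdirs_for_fetch({}))
```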
+ + "hsdir_spread_store" -- Total number of HSDirs per replica a service + will upload its descriptor to. + Min: 1. Max: 128. Default: 4 + + "HSV3MaxDescriptorSize" -- Maximum descriptor size (in bytes). + Min: 1. Max: INT32_MAX. Default: 50000 + + "hs_service_max_rdv_failures" -- This parameter determines the + maximum number of rendezvous attempts an HS service can make per + introduction. + Min: 1. Max: 10. Default: 2. + First-appeared: 0.3.3.0-alpha. + + "HiddenServiceEnableIntroDoSDefense" -- This parameter makes tor + start using the introduction point DoS defense if the introduction point supports it + (for protover HSIntro=5). + Min: 0. Max: 1. Default: 0. + First appeared: 0.4.2.1-alpha. + + "HiddenServiceEnableIntroDoSBurstPerSec" -- Maximum burst to be used + for the token bucket for the introduction point rate-limiting. + Min: 0. Max: INT32_MAX. Default: 200 + First appeared: 0.4.2.1-alpha. + + "HiddenServiceEnableIntroDoSRatePerSec" -- Refill rate to be used + for the token bucket for the introduction point rate-limiting. + Min: 0. Max: INT32_MAX. Default: 25 + First appeared: 0.4.2.1-alpha. + +9. Denial-of-service parameters + + Denial of Service mitigation parameters. Introduced in 0.3.3.2-alpha: + + "DoSCircuitCreationEnabled" -- Enable the circuit creation DoS + mitigation. + + "DoSCircuitCreationMinConnections" -- Minimum threshold of + concurrent connections before a client address can be flagged as + executing a circuit creation DoS. + + "DoSCircuitCreationRate" -- Allowed circuit creation rate per second + per client IP address once the minimum concurrent connection + threshold is reached. + + "DoSCircuitCreationBurst" -- The allowed circuit creation burst per + client IP address once the minimum concurrent connection threshold + is reached. + + "DoSCircuitCreationDefenseType" -- Defense type applied to a + detected client address for the circuit creation mitigation. + 1: No defense. + 2: Refuse circuit creation for the length of + "DoSCircuitCreationDefenseTimePeriod". 
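The rate and burst parameters above suggest a per-address token bucket. The following is a minimal sketch under stated assumptions, not Tor's actual code: the class and function names are hypothetical, the numeric values are illustrative rather than real defaults, and the rule that every creation attempt consumes a token (with the address flagged only when the bucket is empty and the connection threshold is met) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class CircuitCreationBucket:
    """Hypothetical per-client-address token bucket."""
    rate: float              # tokens added per second (DoSCircuitCreationRate)
    burst: float             # bucket capacity (DoSCircuitCreationBurst)
    tokens: float = 0.0
    last_refill: float = 0.0

    def __post_init__(self):
        # Start with a full bucket so an initial burst is allowed.
        self.tokens = self.burst

    def take(self, now):
        """Refill for elapsed time, then try to consume one token."""
        elapsed = max(0.0, now - self.last_refill)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last_refill = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def should_flag(bucket, concurrent_conns, min_conns, now):
    """Flag the address when the bucket is empty and the minimum
    concurrent connection threshold has been reached (assumed rule)."""
    within_limit = bucket.take(now)
    return (not within_limit) and concurrent_conns >= min_conns


if __name__ == "__main__":
    bucket = CircuitCreationBucket(rate=2.0, burst=3.0)  # illustrative values
    for i in range(4):
        print(i, should_flag(bucket, concurrent_conns=5, min_conns=3, now=0.0))
```

A flagged address would then be handled according to "DoSCircuitCreationDefenseType".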
+ + + "DoSCircuitCreationDefenseTimePeriod" -- The base time period that + the DoS defense is activated for. + + "DoSConnectionEnabled" -- Enable the connection DoS mitigation. + + "DoSConnectionMaxConcurrentCount" -- The maximum threshold of + concurrent connections from a client IP address. + + "DoSConnectionDefenseType" -- Defense type applied to a detected + client address for the connection mitigation. Possible values are: + 1: No defense. + 2: Immediately close new connections. + + "DoSRefuseSingleHopClientRendezvous" -- Refuse establishment of + rendezvous points for single-hop clients. + +10. Padding-related parameters + + "circpad_max_circ_queued_cells" -- The circuitpadding module will + stop sending more padding cells if more than this many cells are in + the circuit queue of a given circuit. + Min: 0. Max: 50000. Default: 1000. + First appeared: 0.4.0.3-alpha. + + "circpad_global_allowed_cells" -- DOCDOC + + "circpad_global_max_padding_pct" -- DOCDOC + + "circpad_padding_disabled" -- DOCDOC + + "circpad_padding_reduced" -- DOCDOC + + "nf_conntimeout_clients" -- DOCDOC + + "nf_conntimeout_relays" -- DOCDOC + + "nf_ito_high_reduced" -- DOCDOC + + "nf_ito_low" -- DOCDOC + + "nf_ito_low_reduced" -- DOCDOC + + "nf_pad_before_usage" -- DOCDOC + + "nf_pad_relays" -- DOCDOC + + "nf_pad_single_onion" -- DOCDOC + +11. Guard-related parameters + + (See guard-spec.txt for more information on the vocabulary used here.) + + "UseGuardFraction" -- If true, clients use `GuardFraction` + information from the consensus in order to decide how to weight + guards when picking them. + Min: 0. Max: 1. Default: 0. + First appeared: 0.2.6 + + "guard-lifetime-days" -- Controls guard lifetime. If an unconfirmed + guard was sampled more than this many days ago, it should be + removed from the guard sample. + Min: 1. Max: 3650. Default: 120. 
+ First appeared: 0.3.0 + + "guard-confirmed-min-lifetime-days" -- Controls confirmed guard + lifetime: if a guard was confirmed more than this many days ago, it + should be removed from the guard sample. + Min: 1. Max: 3650. Default: 60. + First appeared: 0.3.0 + + "guard-internet-likely-down-interval" -- If Tor has been unable to + build a circuit for this long (in seconds), assume that the internet + connection is down, and treat guard failures as unproven. + Min: 1. Max: INT32_MAX. Default: 600. + First appeared: 0.3.0 + + "guard-max-sample-size" -- Largest number of guards that clients + should try to collect in their sample. + Min: 1. Max: INT32_MAX. Default: 60. + First appeared: 0.3.0 + + "guard-max-sample-threshold-percent" -- Largest bandwidth-weighted + fraction of guards that clients should try to collect in their + sample. + Min: 1. Max: 100. Default: 20. + First appeared: 0.3.0 + + "guard-meaningful-restriction-percent" -- If the client has + configured tor to exclude so many guards that the available guard + bandwidth is less than this percentage of the total, treat the guard + sample as "restricted", and keep it in a separate sample. + Min: 1. Max: 100. Default: 20. + First appeared: 0.3.0 + + "guard-extreme-restriction-percent" -- Warn the user if they have + configured tor to exclude so many guards that the available guard + bandwidth is less than this percentage of the total. + Min: 1. Max: 100. Default: 1. + First appeared: 0.3.0. MAX was INT32_MAX, which would have no meaningful + effect. MAX lowered to 100 in 0.4.7. + + "guard-min-filtered-sample-size" -- If fewer than this number of + guards is available in the sample after filtering out unusable + guards, the client should try to add more guards to the sample (if + allowed). + Min: 1. Max: INT32_MAX. Default: 20. + First appeared: 0.3.0 + + "guard-n-primary-guards" -- The number of confirmed guards that the + client should treat as "primary guards". + Min: 1. Max: INT32_MAX. Default: 3. 
+ First appeared: 0.3.0 + + "guard-n-primary-guards-to-use", "guard-n-primary-dir-guards-to-use" + -- number of primary guards and primary directory guards that the + client should be willing to use in parallel. Other primary guards + won't get used unless the earlier ones are down. + "guard-n-primary-guards-to-use": + Min 1, Max INT32_MAX: Default: 1. + "guard-n-primary-dir-guards-to-use" + Min 1, Max INT32_MAX: Default: 3. + First appeared: 0.3.0 + + "guard-nonprimary-guard-connect-timeout" -- When trying to confirm + nonprimary guards, if a guard doesn't answer for more than this long + in seconds, treat lower-priority guards as usable. + Min: 1. Max: INT32_MAX. Default: 15 + First appeared: 0.3.0 + + "guard-nonprimary-guard-idle-timeout" -- When trying to confirm + nonprimary guards, if a guard doesn't answer for more than this long + in seconds, treat it as down. + Min: 1. Max: INT32_MAX. Default: 600 + First appeared: 0.3.0 + + "guard-remove-unlisted-guards-after-days" -- If a guard has been + unlisted in the consensus for at least this many days, remove it + from the sample. + Min: 1. Max: 3650. Default: 20. + First appeared: 0.3.0 + +X. Obsolete parameters + + "NumDirectoryGuards", "NumEntryGuards" -- Number of guard nodes + clients should use by default. If NumDirectoryGuards is 0, we + default to NumEntryGuards. + NumDirectoryGuards: Min: 0. Max: 10. Default: 0 + NumEntryGuards: Min: 1. Max: 10. Default: 3 + First-appeared: 0.2.4.23, 0.2.5.6-alpha + Removed in: 0.3.0 + + "GuardLifetime" -- Duration for which clients should choose guard + nodes, in seconds. + Min: 30 days. Max: 1826 days. Default: 60 days. + First-appeared: 0.2.4.12-alpha + Removed in: 0.3.0. + + "UseNTorHandshake" -- If true, then versions of Tor that support + NTor will prefer to use it by default. + Min: 0, Max: 1. Default: 1. + First-appeared: 0.2.4.8-alpha + Removed in: 0.2.9. 
+ + "Support022HiddenServices" -- Used to implement a mass switch-over + from sending timestamps to hidden services by default to sending no + timestamps at all. If this option is absent, or is set to 1, + clients with the default configuration send timestamps; otherwise, + they do not. + Min: 0, Max: 1. Default: 1. + First-appeared: 0.2.4.18-rc + Removed in: 0.2.6 diff --git a/attic/text_formats/path-spec.txt b/attic/text_formats/path-spec.txt new file mode 100644 index 0000000..33d50e5 --- /dev/null +++ b/attic/text_formats/path-spec.txt @@ -0,0 +1,1051 @@ + + Tor Path Specification + + Roger Dingledine + Nick Mathewson + +Note: This is an attempt to specify Tor as currently implemented. Future +versions of Tor will implement improved algorithms. + +This document tries to cover how Tor chooses to build circuits and assign +streams to circuits. Other implementations MAY take other approaches, but +implementors should be aware of the anonymity and load-balancing implications +of their choices. + + THIS SPEC ISN'T DONE YET. + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL + NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and + "OPTIONAL" in this document are to be interpreted as described in + RFC 2119. + +Tables of Contents + + 1. General operation + 1.1. Terminology + 1.2. A relay's bandwidth + 2. Building circuits + 2.1. When we build + 2.1.0. We don't build circuits until we have enough directory info + 2.1.1. Clients build circuits preemptively + 2.1.2. Clients build circuits on demand + 2.1.3. Relays build circuits for testing reachability and bandwidth + 2.1.4. Hidden-service circuits + 2.1.5. Rate limiting of failed circuits + 2.1.6. When to tear down circuits + 2.2. Path selection and constraints + 2.2.1. Choosing an exit + 2.2.2. User configuration + 2.3. Cannibalizing circuits + 2.4. Learning when to give up ("timeout") on circuit construction + 2.4.1 Distribution choice and parameter estimation + 2.4.2. 
How much data to record + 2.4.3. How to record timeouts + 2.4.4. Detecting Changing Network Conditions + 2.4.5. Consensus parameters governing behavior + 2.4.6. Consensus parameters governing behavior + 2.5. Handling failure + 3. Attaching streams to circuits + 4. Hidden-service related circuits + 5. Guard nodes + 5.1. How consensus bandwidth weights factor into entry guard selection + 6. Server descriptor purposes + 7. Detecting route manipulation by Guard nodes (Path Bias) + 7.1. Measuring path construction success rates + 7.2. Measuring path usage success rates + 7.3. Scaling success counts + 7.4. Parametrization + 7.5. Known barriers to enforcement + X. Old notes + X.1. Do we actually do this? + X.2. A thing we could do to deal with reachability. + X.3. Some stuff that worries me about entry guards. 2006 Jun, Nickm. + +1. General operation + + Tor begins building circuits as soon as it has enough directory + information to do so (see section 5 of dir-spec.txt). Some circuits are + built preemptively because we expect to need them later (for user + traffic), and some are built because of immediate need (for user traffic + that no current circuit can handle, for testing the network or our + reachability, and so on). + + [Newer versions of Tor (0.2.6.2-alpha and later): + If the consensus contains Exits (the typical case), Tor will build both + exit and internal circuits. When bootstrap completes, Tor will be ready + to handle an application requesting an exit circuit to services like the + World Wide Web. + + If the consensus does not contain Exits, Tor will only build internal + circuits. In this case, earlier statuses will have included "internal" + as indicated above. When bootstrap completes, Tor will be ready to handle + an application requesting an internal circuit to hidden services at + ".onion" addresses. + + If a future consensus contains Exits, exit circuits may become available.] 
+ + When a client application creates a new stream (by opening a SOCKS + connection or launching a resolve request), we attach it to an appropriate + open circuit if one exists, or wait if an appropriate circuit is + in-progress. We launch a new circuit only + if no current circuit can handle the request. We rotate circuits over + time to avoid some profiling attacks. + + To build a circuit, we choose all the nodes we want to use, and then + construct the circuit. Sometimes, when we want a circuit that ends at a + given hop, and we have an appropriate unused circuit, we "cannibalize" the + existing circuit and extend it to the new terminus. + + These processes are described in more detail below. + + This document describes Tor's automatic path selection logic only; path + selection can be overridden by a controller (with the EXTENDCIRCUIT and + ATTACHSTREAM commands). Paths constructed through these means may + violate some constraints given below. + +1.1. Terminology + + A "path" is an ordered sequence of nodes, not yet built as a circuit. + + A "clean" circuit is one that has not yet been used for any traffic. + + A "fast" or "stable" or "valid" node is one that has the 'Fast' or + 'Stable' or 'Valid' flag + set respectively, based on our current directory information. A "fast" + or "stable" circuit is one consisting only of "fast" or "stable" nodes. + + In an "exit" circuit, the final node is chosen based on waiting stream + requests if any, and in any case it avoids nodes with exit policy of + "reject *:*". An "internal" circuit, on the other hand, is one where + the final node is chosen just like a middle node (ignoring its exit + policy). + + A "request" is a client-side stream or DNS resolve that needs to be + served by a circuit. + + A "pending" circuit is one that we have started to build, but which has + not yet completed. 
+ + A circuit or path "supports" a request if it is okay to use the + circuit/path to fulfill the request, according to the rules given below. + A circuit or path "might support" a request if some aspect of the request + is unknown (usually its target IP), but we believe the path probably + supports the request according to the rules given below. + +1.2. A relay's bandwidth + + Old versions of Tor did not report bandwidths in network status + documents, so clients had to learn them from the routers' advertised + relay descriptors. + + For versions of Tor prior to 0.2.1.17-rc, everywhere below where we + refer to a relay's "bandwidth", we mean its clipped advertised + bandwidth, computed by taking the smaller of the 'rate' and + 'observed' arguments to the "bandwidth" element in the relay's + descriptor. If a router's advertised bandwidth is greater than + MAX_BELIEVABLE_BANDWIDTH (currently 10 MB/s), we clipped to that + value. + + For more recent versions of Tor, we take the bandwidth value declared + in the consensus, and fall back to the clipped advertised bandwidth + only if the consensus does not have bandwidths listed. + +2. Building circuits + +2.1. When we build + +2.1.0. We don't build circuits until we have enough directory info + + There's a class of possible attacks where our directory servers + only give us information about the relays that they would like us + to use. To prevent this attack, we don't build multi-hop + circuits for real traffic (like those in 2.1.1, 2.1.2, 2.1.4 + below) until we have enough directory information to be + reasonably confident this attack isn't being done to us. + + Here, "enough" directory information is defined as: + + * Having a consensus that's been valid at some point in the + last REASONABLY_LIVE_TIME interval (24 hours). + + * Having enough descriptors that we could build at least some + fraction F of all bandwidth-weighted paths, without taking + ExitNodes/EntryNodes/etc into account. 
+ + (F is set by the PathsNeededToBuildCircuits option, + defaulting to the 'min_paths_for_circs_pct' consensus + parameter, with a final default value of 60%.) + + * Having enough descriptors that we could build at least some + fraction F of all bandwidth-weighted paths, _while_ taking + ExitNodes/EntryNodes/etc into account. + + (F is as above.) + + * Having a descriptor for every one of the first + NUM_USABLE_PRIMARY_GUARDS guards among our primary guards. (see + guard-spec.txt) + + We define the "fraction of bandwidth-weighted paths" as the product of + these three fractions. + + * The fraction of descriptors that we have for nodes with the Guard + flag, weighted by their bandwidth for the guard position. + * The fraction of descriptors that we have for all nodes, + weighted by their bandwidth for the middle position. + * The fraction of descriptors that we have for nodes with the Exit + flag, weighted by their bandwidth for the exit position. + + If the consensus has zero weighted bandwidth for a given kind of + relay (Guard, Middle, or Exit), Tor instead uses the fraction of relays + for which it has the descriptor (not weighted by bandwidth at all). + + If the consensus lists zero exit-flagged relays, Tor instead uses the + fraction of middle relays. + + +2.1.1. Clients build circuits preemptively + + When running as a client, Tor tries to maintain at least a certain + number of clean circuits, so that new streams can be handled + quickly. To increase the likelihood of success, Tor tries to + predict what circuits will be useful by choosing from among nodes + that support the ports we have used in the recent past (by default + one hour). Specifically, on startup Tor tries to maintain one clean + fast exit circuit that allows connections to port 80, and at least + two fast clean stable internal circuits in case we get a resolve + request or hidden service request (at least three if we _run_ a + hidden service). 
+ + After that, Tor will adapt the circuits that it preemptively builds + based on the requests it sees from the user: it tries to have two fast + clean exit circuits available for every port seen within the past hour + (each circuit can be adequate for many predicted ports -- it doesn't + need two separate circuits for each port), and it tries to have the + above internal circuits available if we've seen resolves or hidden + service activity within the past hour. If there are 12 or more clean + circuits open, it doesn't open more even if it has more predictions. + + Only stable circuits can "cover" a port that is listed in the + LongLivedPorts config option. Similarly, hidden service requests + to ports listed in LongLivedPorts make us create stable internal + circuits. + + Note that if there are no requests from the user for an hour, Tor + will predict no use and build no preemptive circuits. + + The Tor client SHOULD NOT store its list of predicted requests to a + persistent medium. + +2.1.2. Clients build circuits on demand + + Additionally, when a client request exists that no circuit (built or + pending) might support, we create a new circuit to support the request. + For exit connections, we pick an exit node that will handle the + most pending requests (choosing arbitrarily among ties), launch a + circuit to end there, and repeat until every unattached request + might be supported by a pending or built circuit. For internal + circuits, we pick an arbitrary acceptable path, repeating as needed. + + Clients consider a circuit to become "dirty" as soon as a stream is + attached to it, or some other request is performed over the circuit. + If a circuit has been "dirty" for at least MaxCircuitDirtiness seconds, + new streams may not be attached to it. + + In some cases we can reuse an already established circuit if it's + clean; see Section 2.3 (cannibalizing circuits) for details. + +2.1.3. 
Relays build circuits for testing reachability and bandwidth + + Tor relays test reachability of their ORPort once they have + successfully built a circuit (on startup and whenever their IP address + changes). They build an ordinary fast internal circuit with themselves + as the last hop. As soon as any testing circuit succeeds, the Tor + relay decides it's reachable and is willing to publish a descriptor. + + We launch multiple testing circuits (one at a time), until we + have NUM_PARALLEL_TESTING_CIRC (4) such circuits open. Then we + do a "bandwidth test" by sending a certain number of relay drop + cells down each circuit: BandwidthRate * 10 / CELL_NETWORK_SIZE + total cells divided across the four circuits, but never more than + CIRCWINDOW_START (1000) cells total. This exercises both outgoing and + incoming bandwidth, and helps to jumpstart the observed bandwidth + (see dir-spec.txt). + + Tor relays also test reachability of their DirPort once they have + established a circuit, but they use an ordinary exit circuit for + this purpose. + +2.1.4. Hidden-service circuits + + See section 4 below. + +2.1.5. Rate limiting of failed circuits + + If we fail to build a circuit N times in an X-second period (see Section + 2.3 for how this works), we stop building circuits until the X seconds + have elapsed. + XXXX + +2.1.6. When to tear down circuits + + Clients should tear down circuits (in general) only when those circuits + have no streams on them. Additionally, clients should tear down + stream-less circuits only under one of the following conditions: + + - The circuit has never had a stream attached, and it was created too + long in the past (based on CircuitsAvailableTimeout or + cbtlearntimeout, depending on timeout estimate status). + + - The circuit is dirty (has had a stream attached), and it has been + dirty for at least MaxCircuitDirtiness. + +2.2. Path selection and constraints + + We choose the path for each new circuit before we build it. 
We choose the + exit node first, followed by the other nodes in the circuit, front to + back. (In other words, for a 3-hop circuit, we first pick hop 3, + then hop 1, then hop 2.) All paths we generate obey the following + constraints: + + - We do not choose the same router twice for the same path. + - We do not choose any router in the same family as another in the same + path. (Two routers are in the same family if each one lists the other + in the "family" entries of its descriptor.) + - We do not choose more than one router in a given /16 subnet + (unless EnforceDistinctSubnets is 0). + - We don't choose any non-running or non-valid router unless we have + been configured to do so. By default, we are configured to allow + non-valid routers in "middle" and "rendezvous" positions. + - If we're using Guard nodes, the first node must be a Guard (see 5 + below) + - XXXX Choosing the length + + For "fast" circuits, we only choose nodes with the Fast flag. For + non-"fast" circuits, all nodes are eligible. + + For all circuits, we weight node selection according to router bandwidth. + + We also weight the bandwidth of Exit and Guard flagged nodes depending on + the fraction of total bandwidth that they make up and depending upon the + position they are being selected for. + + These weights are published in the consensus, and are computed as described + in Section "Computing Bandwidth Weights" of dir-spec.txt. 
They are: + + Wgg - Weight for Guard-flagged nodes in the guard position + Wgm - Weight for non-flagged nodes in the guard Position + Wgd - Weight for Guard+Exit-flagged nodes in the guard Position + + Wmg - Weight for Guard-flagged nodes in the middle Position + Wmm - Weight for non-flagged nodes in the middle Position + Wme - Weight for Exit-flagged nodes in the middle Position + Wmd - Weight for Guard+Exit flagged nodes in the middle Position + + Weg - Weight for Guard flagged nodes in the exit Position + Wem - Weight for non-flagged nodes in the exit Position + Wee - Weight for Exit-flagged nodes in the exit Position + Wed - Weight for Guard+Exit-flagged nodes in the exit Position + + Wgb - Weight for BEGIN_DIR-supporting Guard-flagged nodes + Wmb - Weight for BEGIN_DIR-supporting non-flagged nodes + Web - Weight for BEGIN_DIR-supporting Exit-flagged nodes + Wdb - Weight for BEGIN_DIR-supporting Guard+Exit-flagged nodes + + Wbg - Weight for Guard-flagged nodes for BEGIN_DIR requests + Wbm - Weight for non-flagged nodes for BEGIN_DIR requests + Wbe - Weight for Exit-flagged nodes for BEGIN_DIR requests + Wbd - Weight for Guard+Exit-flagged nodes for BEGIN_DIR requests + + If any of those weights is malformed or not present in a consensus, + clients proceed with the regular path selection algorithm setting + the weights to the default value of 10000. + + Additionally, we may be building circuits with one or more requests in + mind. Each kind of request puts certain constraints on paths: + + - All service-side introduction circuits and all rendezvous paths + should be Stable. + - All connection requests for connections that we think will need to + stay open a long time require Stable circuits. Currently, Tor decides + this by examining the request's target port, and comparing it to a + list of "long-lived" ports. (Default: 21, 22, 706, 1863, 5050, + 5190, 5222, 5223, 6667, 6697, 8300.)
+ - DNS resolves require an exit node whose exit policy is not equivalent + to "reject *:*". + - Reverse DNS resolves require a version of Tor with advertised eventdns + support (available in Tor 0.1.2.1-alpha-dev and later). + - All connection requests require an exit node whose exit policy + supports their target address and port (if known), or which "might + support it" (if the address isn't known). See 2.2.1. + - Rules for Fast? XXXXX + +2.2.1. Choosing an exit + + If we know what IP address we want to connect to or resolve, we can + trivially tell whether a given router will support it by simulating + its declared exit policy. + + Because we often connect to addresses of the form hostname:port, we do not + always know the target IP address when we select an exit node. In these + cases, we need to pick an exit node that "might support" connections to a + given port with an unknown address. An exit node "might support" + such a connection if any clause that accepts any connections to that port + precedes all clauses (if any) that reject all connections to that port. + + Unless requested to do so by the user, we never choose an exit node + flagged as "BadExit" by more than half of the authorities who advertise + themselves as listing bad exits. + +2.2.2. User configuration + + Users can alter the default behavior for path selection with configuration + options. + + - If "ExitNodes" is provided, then every request requires an exit node on + the ExitNodes list. (If a request is supported by no nodes on that list, + and StrictExitNodes is false, then Tor treats that request as if + ExitNodes were not provided.) + + - "EntryNodes" and "StrictEntryNodes" behave analogously. + + - If a user tries to connect to or resolve a hostname of the form + <target>.<name>.exit, the request is rewritten to a request for + <target>, and the request is only supported by the exit whose nickname + or fingerprint is <name>.
+ + - When set, "HSLayer2Nodes" and "HSLayer3Nodes" relax Tor's path + restrictions to allow nodes in the same /16 and node family to reappear + in the path. They also allow the guard node to be chosen as the RP, IP, + and HSDIR, and as the hop before those positions. + +2.3. Cannibalizing circuits + + If we need a circuit and have a clean one already established, in + some cases we can adapt the clean circuit for our new + purpose. Specifically, + + For hidden service interactions, we can "cannibalize" a clean internal + circuit if one is available, so we don't need to build those circuits + from scratch on demand. + + We can also cannibalize clean circuits when the client asks to exit + at a given node -- either via the ".exit" notation or because the + destination is running at the same location as an exit node. + +2.4. Learning when to give up ("timeout") on circuit construction + + Since version 0.2.2.8-alpha, Tor clients attempt to learn when to give + up on circuits based on network conditions. + +2.4.1. Distribution choice + + Based on studies of build times, we found that the distribution of + circuit build times appears to be a Frechet distribution (and a multi-modal + Frechet distribution, if more than one guard or bridge is used). However, + estimators and quantile functions of the Frechet distribution are difficult + to work with and slow to converge. So instead, since we are only interested + in the accuracy of the tail, clients approximate the tail of the multi-modal + distribution with a single Pareto curve. + +2.4.2. How much data to record + + From our observations, the minimum number of circuit build times for a + reasonable fit appears to be on the order of 100. However, to keep a + good fit over the long term, clients store 1000 most recent circuit build + times in a circular array. 
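As a non-normative illustration, the bounded history of recent build times described above could be kept in a structure like the one below. This is a minimal sketch: the class and method names are hypothetical, and it uses Python's collections.deque with maxlen as a stand-in for the circular array.

```python
from collections import deque


class BuildTimeHistory:
    """Keeps only the most recent `capacity` circuit build times (in ms)."""

    def __init__(self, capacity=1000):
        # A deque with maxlen silently discards the oldest entry when full,
        # which matches the behaviour of a circular array.
        self._times = deque(maxlen=capacity)

    def record(self, build_time_ms):
        """Store one completed build time, evicting the oldest if needed."""
        self._times.append(build_time_ms)

    def times(self):
        """Return the stored build times, oldest first."""
        return list(self._times)

    def __len__(self):
        return len(self._times)
```

With a capacity of 1000, the 1001st recorded build time overwrites the oldest one, so the fit tracks recent network conditions.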
+ + These build times only include the times required to build three-hop + circuits, and the times required to build the first three hops of circuits + with more than three hops. Circuits of fewer than three hops are not + recorded, and hops past the third are not recorded. + + The Tor client should build test circuits at a rate of one every 'cbttestfreq' + (10 seconds) until 'cbtmincircs' (100 circuits) are built, with a maximum of + 'cbtmaxopencircs' (default: 10) circuits open at once. This allows a fresh + Tor to have a CircuitBuildTimeout estimated within 30 minutes after install + or network change (see section 2.4.5 below). + + Timeouts are stored on disk in a histogram of 10ms bin width, the same + width used to calculate the Xm value above. The timeouts recorded in the + histogram must be shuffled after being read from disk, to preserve a + proper expiration of old values after restart. + + Thus, some build time resolution is lost during restart. Implementations may + choose a different persistence mechanism than this histogram, but be aware + that build time binning is still needed for parameter estimation. + +2.4.3. Parameter estimation + + Once 'cbtmincircs' build times are recorded, Tor clients update the + distribution parameters and recompute the timeout every circuit completion + (though see section 2.4.5 for when to pause and reset timeout due to + too many circuits timing out). + + Tor clients calculate the parameters for a Pareto distribution fitting the + data using the maximum likelihood estimator. For derivation, see: + https://en.wikipedia.org/wiki/Pareto_distribution#Estimation_of_parameters + + Because build times are not a true Pareto distribution, we alter how Xm is + computed. In a max likelihood estimator, the mode of the distribution is + used directly as Xm. 
+ + Instead of using the mode of discrete build times directly, Tor clients + compute the Xm parameter using the weighted average of the midpoints + of the 'cbtnummodes' (10) most frequently occurring 10ms histogram bins. + Ties are broken in favor of earlier bins (that is, in favor of bins + corresponding to shorter build times). + + (The use of 10 modes was found to minimize error from the selected + cbtquantile, with 10ms bins for quantiles 60-80, compared to many other + heuristics). + + To avoid ln(1.0+epsilon) precision issues, use log laws to rewrite the + estimator for 'alpha' as the sum of logs followed by subtraction, rather + than multiplication and division: + + alpha = n/(Sum_n{ln(MAX(Xm, x_i))} - n*ln(Xm)) + + In this, n is the total number of build times that have completed, x_i is + the ith recorded build time, and Xm is the weighted average of the modes + of x_i, computed as above. + + All times below Xm are counted as having the Xm value via the MAX(), + because in Pareto estimators, Xm is supposed to be the lowest value. + However, since clients use mode averaging to estimate Xm, there can be + values below our Xm. Effectively, the Pareto estimator then treats + everything smaller than Xm as if it happened at Xm. One can also see that if + clients did not do this, alpha could underflow to become negative, which + results in an exponential curve, not a Pareto probability distribution. + + The timeout itself is calculated by using the Pareto Quantile function (the + inverted CDF) to give us the value on the CDF such that 80% of the mass + of the distribution is below the timeout value (parameter 'cbtquantile'). + + The Pareto Quantile Function (inverse CDF) is: + + F(q) = Xm/((1.0-q)^(1.0/alpha)) + + Thus, clients obtain the circuit build timeout for 3-hop circuits by + computing: + + timeout_ms = F(0.8) # 'cbtquantile' == 0.8 + + With this, we expect that the Tor client will accept the fastest 80% of the + total number of paths on the network.
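As a non-normative illustration, the Xm estimate, the alpha estimator, and the quantile function described above could be sketched in Python as follows. The function names are hypothetical. Weighting each bin midpoint by its count is an assumption about what "weighted average" means here, and raising an error when no build time exceeds Xm is a choice made for this sketch only.

```python
import math
from collections import Counter


def estimate_xm(build_times_ms, num_modes=10, bin_width_ms=10):
    """Weighted average of the midpoints of the most populated bins."""
    bins = Counter(int(t // bin_width_ms) for t in build_times_ms)
    # Most frequent bins first; ties go to the earlier (shorter-time) bin.
    top = sorted(bins.items(), key=lambda kv: (-kv[1], kv[0]))[:num_modes]
    total = sum(count for _, count in top)
    # Assumption: each midpoint is weighted by the number of times in its bin.
    return sum((b * bin_width_ms + bin_width_ms / 2) * count
               for b, count in top) / total


def pareto_alpha(build_times_ms, xm):
    """alpha = n / (sum(ln(max(Xm, x_i))) - n*ln(Xm)), in sum-of-logs form."""
    n = len(build_times_ms)
    log_sum = sum(math.log(max(xm, x)) for x in build_times_ms)
    denom = log_sum - n * math.log(xm)
    if denom <= 0:
        # Sketch-only guard: every time was clamped to Xm, so alpha is undefined.
        raise ValueError("no build time exceeds Xm")
    return n / denom


def pareto_quantile(q, xm, alpha):
    """F(q) = Xm / ((1 - q) ** (1 / alpha)), the inverse Pareto CDF."""
    return xm / ((1.0 - q) ** (1.0 / alpha))


def build_timeout_ms(build_times_ms, quantile=0.8, num_modes=10):
    """Timeout for 3-hop circuits at the given quantile ('cbtquantile')."""
    xm = estimate_xm(build_times_ms, num_modes)
    return pareto_quantile(quantile, xm, pareto_alpha(build_times_ms, xm))
```

Calling pareto_quantile with q = 0.99 instead of 0.8 gives the corresponding value for 'cbtclosequantile'.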
+ + Clients obtain the circuit close time to completely abandon circuits as: + + close_ms = F(0.99) # 'cbtclosequantile' == 0.99 + + To avoid waiting an unreasonably long period of time for circuits that + simply have relays that are down, Tor clients cap timeout_ms at the max + build time actually observed so far, and cap close_ms at twice this max, + but keep close_ms at least 60 seconds: + + timeout_ms = MIN(timeout_ms, max_observed_timeout) + close_ms = MAX(MIN(close_ms, 2*max_observed_timeout), 'cbtinitialtimeout') + +2.4.3. Calculating timeout thresholds for circuits of different lengths + + The timeout_ms and close_ms estimates above are good only for 3-hop + circuits, since only 3-hop circuits are recorded in the list of build + times. + + To calculate the appropriate timeouts and close timeouts for circuits of + other lengths, the client multiplies the timeout_ms and close_ms values + by a scaling factor determined by the number of communication hops + needed to build their circuits: + + timeout_ms[hops=N] = timeout_ms * Actions(N) / Actions(3) + + close_ms[hops=N] = close_ms * Actions(N) / Actions(3) + + where Actions(N) = N * (N + 1) / 2. + + To calculate timeouts for operations other than circuit building, + the client should add X to Actions(N) for every round-trip communication + required with the Xth hop. + +2.4.4. How to record timeouts + + Pareto estimators begin to lose their accuracy if the tail is omitted. + Hence, Tor clients actually calculate two timeouts: a usage timeout, and a + close timeout. + + Circuits that pass the usage timeout are marked as measurement circuits, + and are allowed to continue to build until the close timeout corresponding + to the point 'cbtclosequantile' (default 99) on the Pareto curve, or 60 + seconds, whichever is greater. + + The actual completion times for these measurement circuits should be + recorded.
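As a non-normative illustration, the capping rule and the per-length scaling rule from section 2.4.3 could be sketched in Python as follows. The function names are hypothetical, and the 60000 ms default simply mirrors the 60 second 'cbtinitialtimeout' value mentioned in the text.

```python
def actions(n_hops):
    """Actions(N) = N * (N + 1) / 2, the hop-traversals needed to build N hops."""
    return n_hops * (n_hops + 1) // 2


def scale_for_hops(value_ms, n_hops):
    """Scale a 3-hop timeout (or close time) to a circuit of n_hops hops."""
    return value_ms * actions(n_hops) / actions(3)


def cap_timeouts(timeout_ms, close_ms, max_observed_ms, initial_timeout_ms=60000):
    """Apply the MIN/MAX caps to the 3-hop estimates.

    timeout_ms is capped at the largest build time observed so far;
    close_ms is capped at twice that value but never drops below
    initial_timeout_ms.
    """
    capped_timeout = min(timeout_ms, max_observed_ms)
    capped_close = max(min(close_ms, 2 * max_observed_ms), initial_timeout_ms)
    return capped_timeout, capped_close
```

For example, a 4-hop circuit needs Actions(4) = 10 hop-traversals against Actions(3) = 6 for a 3-hop circuit, so a 1500 ms 3-hop timeout scales to 2500 ms.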
+ + Implementations should completely abandon a circuit + if its total build time exceeds the close threshold. Such closed circuits + should be ignored, as this typically means one of the relays in the path is + offline. + +2.4.5. Detecting Changing Network Conditions + + Tor clients attempt to detect both network connectivity loss and drastic + changes in the timeout characteristics. + + To detect changing network conditions, clients keep a history of + the timeout or non-timeout status of the past 'cbtrecentcount' circuits + (20 circuits) that successfully completed at least one hop. If more than + 90% of these circuits time out, the client discards all buildtimes history, + resets the timeout to 'cbtinitialtimeout' (60 seconds), and then begins + recomputing the timeout. + + If the timeout was already at least 'cbtinitialtimeout', + the client doubles the timeout. + + The records here (of how many circuits succeeded or failed among the most + recent 'cbtrecentcount') are not stored as persistent state. On reload, + we start with a new, empty state. + +2.4.6. Consensus parameters governing behavior + + Clients that implement circuit build timeout learning should obey the + following consensus parameters that govern behavior, in order to allow + us to handle bugs or other emergent behaviors due to client circuit + construction. If these parameters are not present in the consensus, + the listed default values should be used instead. + + cbtdisabled + Default: 0 + Min: 0 + Max: 1 + Effect: If 1, all CircuitBuildTime learning code should be + disabled and history should be discarded. For use in + emergency situations only. + + cbtnummodes + Default: 10 + Min: 1 + Max: 20 + Effect: This value governs how many modes to use in the weighted + average calculation of Pareto parameter Xm. Selecting Xm as the + average of multiple modes improves accuracy of the Pareto tail + for quantile cutoffs from 60-80% (see cbtquantile).
+ + cbtrecentcount + Default: 20 + Min: 3 + Max: 1000 + Effect: This is the number of circuit build outcomes (success vs + timeout) to keep track of for the following option. + + cbtmaxtimeouts + Default: 18 + Min: 3 + Max: 10000 + Effect: When this many timeouts happen in the last 'cbtrecentcount' + circuit attempts, the client should discard all of its + history and begin learning a fresh timeout value. + + Note that if this parameter's value is greater than the value + of 'cbtrecentcount', then the history will never be + discarded because of this feature. + + cbtmincircs + Default: 100 + Min: 1 + Max: 10000 + Effect: This is the minimum number of circuits to build before + computing a timeout. + + Note that if this parameter's value is higher than 1000 (the + number of time observations that a client keeps in its + circular buffer), circuit build timeout calculation is + effectively disabled, and the default timeouts are used + indefinitely. + + cbtquantile + Default: 80 + Min: 10 + Max: 99 + Effect: This is the position on the quantile curve to use to set the + timeout value. It is a percent (10-99). + + cbtclosequantile + Default: 99 + Min: Value of cbtquantile parameter + Max: 99 + Effect: This is the position on the quantile curve to use to set the + timeout value to use to actually close circuits. It is a + percent (0-99). + + cbttestfreq + Default: 10 + Min: 1 + Max: 2147483647 (INT32_MAX) + Effect: Describes how often in seconds to build a test circuit to + gather timeout values. Only applies if less than 'cbtmincircs' + have been recorded. + + cbtmintimeout + Default: 10 + Min: 10 + Max: 2147483647 (INT32_MAX) + Effect: This is the minimum allowed timeout value in milliseconds. + + cbtinitialtimeout + Default: 60000 + Min: Value of cbtmintimeout + Max: 2147483647 (INT32_MAX) + Effect: This is the timeout value to use before we have enough data + to compute a timeout, in milliseconds. 
If we do not have + enough data to compute a timeout estimate (see cbtmincircs), + then we use this interval both for the close timeout and the + abandon timeout. + + cbtlearntimeout + Default: 180 + Min: 10 + Max: 60000 + Effect: This is how long idle circuits will be kept open while cbt is + learning a new timeout value. + + cbtmaxopencircs + Default: 10 + Min: 0 + Max: 14 + Effect: This is the maximum number of circuits that can be open at + the same time during the circuit build time learning phase. + +2.5. Handling failure + + If an attempt to extend a circuit fails (either because the first create + failed or a subsequent extend failed) then the circuit is torn down and is + no longer pending. (XXXX really?) Requests that might have been + supported by the pending circuit thus become unsupported, and a new + circuit needs to be constructed. + + If a stream "begin" attempt fails with an EXITPOLICY error, we + decide that the exit node's exit policy is not correctly advertised, + so we treat the exit node as if it were a non-exit until we retrieve + a fresh descriptor for it. + + Excessive amounts of either type of failure can indicate an + attack on anonymity. See section 7 for how excessive failure is handled. + +3. Attaching streams to circuits + + When a circuit that might support a request is built, Tor tries to attach + the request's stream to the circuit and sends a BEGIN, BEGIN_DIR, + or RESOLVE relay + cell as appropriate. If the request completes unsuccessfully, Tor + considers the reason given in the END relay cell. [XXX yes, and?] + + + After a request has remained unattached for SocksTimeout (2 minutes + by default), Tor abandons the attempt and signals an error to the + client as appropriate (e.g., by closing the SOCKS connection). + + XXX Timeouts and when Tor auto-retries. + + * What stream-end-reasons are appropriate for retrying. + + If there is no reply to BEGIN/RESOLVE, then the stream will time out and fail. + +4.
Hidden-service related circuits + + XXX Tracking expected hidden service use (client-side and hidserv-side) + +5. Guard nodes + + We use Guard nodes (also called "helper nodes" in the research + literature) to prevent certain profiling attacks. For an overview of + our Guard selection algorithm -- which has grown rather complex -- see + guard-spec.txt. + +5.1. How consensus bandwidth weights factor into entry guard selection + + When weighting a list of routers for choosing an entry guard, the following + consensus parameters (from the "bandwidth-weights" line) apply: + + Wgg - Weight for Guard-flagged nodes in the guard position + Wgm - Weight for non-flagged nodes in the guard Position + Wgd - Weight for Guard+Exit-flagged nodes in the guard Position + Wgb - Weight for BEGIN_DIR-supporting Guard-flagged nodes + Wmb - Weight for BEGIN_DIR-supporting non-flagged nodes + Web - Weight for BEGIN_DIR-supporting Exit-flagged nodes + Wdb - Weight for BEGIN_DIR-supporting Guard+Exit-flagged nodes + + Please see "bandwidth-weights" in §3.4.1 of dir-spec.txt for more in depth + descriptions of these parameters. + + If a router has been marked as both an entry guard and an exit, then we + prefer to use it more, with our preference for doing so (roughly) linearly + increasing w.r.t. the router's non-guard bandwidth and bandwidth weight + (calculated without taking the guard flag into account). From proposal + #236: + + | + | Let Wpf denote the weight from the 'bandwidth-weights' line a + | client would apply to N for position p if it had the guard + | flag, Wpn the weight if it did not have the guard flag, and B the + | measured bandwidth of N in the consensus. Then instead of choosing + | N for position p proportionally to Wpf*B or Wpn*B, clients should + | choose N proportionally to F*Wpf*B + (1-F)*Wpn*B. + + where F is the weight as calculated using the above parameters. + +6. 
Server descriptor purposes + + There are currently three "purposes" supported for server descriptors: + general, controller, and bridge. Most descriptors are of type general + -- these are the ones listed in the consensus, and the ones fetched + and used in normal cases. + + Controller-purpose descriptors are those delivered by the controller + and labelled as such: they will be kept around (and expire like + normal descriptors), and they can be used by the controller in its + CIRCUITEXTEND commands. Otherwise they are ignored by Tor when it + chooses paths. + + Bridge-purpose descriptors are for routers that are used as bridges. See + doc/design-paper/blocking.pdf for more design explanation, or proposal + 125 for specific details. Currently bridge descriptors are used in place + of normal entry guards, for Tor clients that have UseBridges enabled. + +7. Detecting route manipulation by Guard nodes (Path Bias) + + The Path Bias defense is designed to defend against a type of route + capture where malicious Guard nodes deliberately fail or choke circuits + that extend to non-colluding Exit nodes to maximize their network + utilization in favor of carrying only compromised traffic. + + In the extreme, the attack allows an adversary that carries c/n + of the network capacity to deanonymize c/n of the network + connections, breaking the O((c/n)^2) property of Tor's original + threat model. It also allows targeted attacks aimed at monitoring + the activity of specific users, bridges, or Guard nodes. + + There are two points where path selection can be manipulated: + during construction, and during usage. Circuit construction + can be manipulated by inducing circuit failures during circuit + extend steps, which causes the Tor client to transparently retry + the circuit construction with a new path. 
Circuit usage can be + manipulated by abusing the stream retry features of Tor (for + example by withholding stream attempt responses from the client + until the stream timeout has expired), at which point the tor client + will also transparently retry the stream on a new path. + + The defense as deployed therefore makes two independent sets of + measurements of successful path use: one during circuit construction, + and one during circuit usage. + + The intended behavior is for clients to ultimately disable the use + of Guards responsible for excessive circuit failure of either type + (see section 7.4); however known issues with the Tor network currently + restrict the defense to being informational only at this stage (see + section 7.5). + +7.1. Measuring path construction success rates + + Clients maintain two counts for each of their guards: a count of the + number of times a circuit was extended to at least two hops through that + guard, and a count of the number of circuits that successfully complete + through that guard. The ratio of these two numbers is used to determine + a circuit success rate for that Guard. + + Circuit build timeouts are counted as construction failures if the + circuit fails to complete before the 95% "right-censored" timeout + interval, not the 80% timeout condition (see section 2.4). + + If a circuit closes prematurely after construction but before being + requested to close by the client, this is counted as a failure. + +7.2. Measuring path usage success rates + + Clients maintain two usage counts for each of their guards: a count + of the number of usage attempts, and a count of the number of + successful usages. + + A usage attempt means any attempt to attach a stream to a circuit. + + Usage success status is temporarily recorded by state flags on circuits. + Guard usage success counts are not incremented until circuit close. 
A + circuit is marked as successfully used if we receive a properly + recognized RELAY cell on that circuit that was expected for the current + circuit purpose. + + If subsequent stream attachments fail or time out, the successfully used + state of the circuit is cleared, causing it once again to be regarded + as a usage attempt only. + + Upon close by the client, all circuits that are still marked as usage + attempts are probed using a RELAY_BEGIN cell constructed with a + destination of the form 0.a.b.c:25, where a.b.c is a 24 bit random + nonce. If we get a RELAY_COMMAND_END in response matching our nonce, + the circuit is counted as successfully used. + + If any unrecognized RELAY cells arrive after the probe has been sent, + the circuit is counted as a usage failure. + + If the stream failure reason codes DESTROY, TORPROTOCOL, or INTERNAL + are received in response to any stream attempt, such circuits are not + probed and are declared usage failures. + + Prematurely closed circuits are not probed, and are counted as usage + failures. + +7.3. Scaling success counts + + To provide a moving average of recent Guard activity while + still preserving the ability to verify correctness, we periodically + "scale" the success counts by multiplying them by a scale factor + between 0 and 1.0. + + Scaling is performed when either usage or construction attempt counts + exceed a parametrized value. + + To avoid error due to scaling during circuit construction and use, + currently open circuits are subtracted from the usage counts before + scaling, and added back after scaling. + +7.4. Parametrization + + The following consensus parameters tune various aspects of the + defense. + + pb_mincircs + Default: 150 + Min: 5 + Effect: This is the minimum number of circuits that must complete + at least 2 hops before we begin evaluating construction rates. 
+ + + pb_noticepct + Default: 70 + Min: 0 + Max: 100 + Effect: If the circuit success rate falls below this percentage, + we emit a notice log message. + + pb_warnpct + Default: 50 + Min: 0 + Max: 100 + Effect: If the circuit success rate falls below this percentage, + we emit a warn log message. + + pb_extremepct + Default: 30 + Min: 0 + Max: 100 + Effect: If the circuit success rate falls below this percentage, + we emit a more alarmist warning log message. If + pb_dropguards is set to 1, we also disable the use of the + guard. + + pb_dropguards + Default: 0 + Min: 0 + Max: 1 + Effect: If the circuit success rate falls below pb_extremepct, + when pb_dropguards is set to 1, we disable use of that + guard. + + pb_scalecircs + Default: 300 + Min: 10 + Effect: After this many circuits have completed at least two hops, + Tor performs the scaling described in Section 7.3. + + pb_multfactor and pb_scalefactor + Default: 1/2 + Min: 0.0 + Max: 1.0 + Effect: The double-precision result obtained from + pb_multfactor/pb_scalefactor is multiplied by our current + counts to scale them. + + pb_minuse + Default: 20 + Min: 3 + Effect: This is the minimum number of circuits that we must attempt to + use before we begin evaluating usage rates. + + pb_noticeusepct + Default: 80 + Min: 3 + Effect: If the circuit usage success rate falls below this percentage, + we emit a notice log message. + + pb_extremeusepct + Default: 60 + Min: 3 + Effect: If the circuit usage success rate falls below this percentage, + we emit a warning log message. We also disable the use of the + guard if pb_dropguards is set. + + pb_scaleuse + Default: 100 + Min: 10 + Effect: After we have attempted to use this many circuits, + Tor performs the scaling described in Section 7.3. + +7.5. Known barriers to enforcement + + Due to intermittent CPU overload at relays, the normal rate of + successful circuit completion is highly variable.
The Guard-dropping + version of the defense is unlikely to be deployed until the ntor + circuit handshake is enabled, or the nature of CPU overload induced + failure is better understood. + + + +X. Old notes + +X.1. Do we actually do this? + +How to deal with network down. + - While all helpers are down/unreachable and there are no established + or on-the-way testing circuits, launch a testing circuit. (Do this + periodically in the same way we try to establish normal circuits + when things are working normally.) + (Testing circuits are a special type of circuit, that streams won't + attach to by accident.) + - When a testing circuit succeeds, mark all helpers up and hold + the testing circuit open. + - If a connection to a helper succeeds, close all testing circuits. + Else mark that helper down and try another. + - If the last helper is marked down and we already have a testing + circuit established, then add the first hop of that testing circuit + to the end of our helper node list, close that testing circuit, + and go back to square one. (Actually, rather than closing the + testing circuit, can we get away with converting it to a normal + circuit and beginning to use it immediately?) + + [Do we actually do any of the above? If so, let's spec it. If not, let's + remove it. -NM] + +X.2. A thing we could do to deal with reachability. + +And as a bonus, it leads to an answer to Nick's attack ("If I pick +my helper nodes all on 18.0.0.0:*, then I move, you'll know where I +bootstrapped") -- the answer is to pick your original three helper nodes +without regard for reachability. Then the above algorithm will add some +more that are reachable for you, and if you move somewhere, it's more +likely (though not certain) that some of the originals will become useful. +Is that smart or just complex? + +X.3. Some stuff that worries me about entry guards. 2006 Jun, Nickm. + + It is unlikely for two users to have the same set of entry guards. 
+ Observing a user is sufficient to learn its entry guards. So, as we move + around, entry guards make us linkable. If we want to change guards when + our location (IP? subnet?) changes, we have two bad options. We could + + - Drop the old guards. But if we go back to our old location, + we'll not use our old guards. For a laptop that sometimes gets used + from work and sometimes from home, this is pretty fatal. + - Remember the old guards as associated with the old location, and use + them again if we ever go back to the old location. This would be + nasty, since it would force us to record where we've been. + + [Do we do any of this now? If not, this should move into 099-misc or + 098-todo. -NM] diff --git a/attic/text_formats/pt-spec.txt b/attic/text_formats/pt-spec.txt new file mode 100644 index 0000000..45b4c31 --- /dev/null +++ b/attic/text_formats/pt-spec.txt @@ -0,0 +1,828 @@ + Pluggable Transport Specification (Version 1) + +Abstract + + Pluggable Transports (PTs) are a generic mechanism for the rapid + development and deployment of censorship circumvention, + based around the idea of modular sub-processes that transform + traffic to defeat censors. + + This document specifies the sub-process startup, shutdown, + and inter-process communication mechanisms required to utilize + PTs. + +Table of Contents + + 1. Introduction + 1.1. Requirements Notation + 2. Architecture Overview + 3. Specification + 3.1. Pluggable Transport Naming + 3.2. Pluggable Transport Configuration Environment Variables + 3.2.1. Common Environment Variables + 3.2.2. Pluggable Transport Client Environment Variables + 3.2.3. Pluggable Transport Server Environment Variables + 3.3. Pluggable Transport To Parent Process Communication + 3.3.1. Common Messages + 3.3.2. Pluggable Transport Client Messages + 3.3.3. Pluggable Transport Server Messages + 3.4. Pluggable Transport Shutdown + 3.5. Pluggable Transport Client Per-Connection Arguments + 4. Anonymity Considerations + 5. References + 6.
Acknowledgments + Appendix A. Example Client Pluggable Transport Session + Appendix B. Example Server Pluggable Transport Session + +1. Introduction + + This specification describes a way to decouple protocol-level + obfuscation from an application's client/server code, in a manner + that promotes rapid development of obfuscation/circumvention + tools and promotes reuse beyond the scope of the Tor Project's + efforts in that area. + + This is accomplished by utilizing helper sub-processes that + implement the necessary forward/reverse proxy servers that handle + the censorship circumvention, with a well defined and + standardized configuration and management interface. + + Any application code that implements the interfaces as specified + in this document will be able to use all spec compliant Pluggable + Transports. + +1.1. Requirements Notation + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL + NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and + "OPTIONAL" in this document are to be interpreted as described in + [RFC2119]. + +2. Architecture Overview + + +------------+ +---------------------------+ + | Client App +-- Local Loopback --+ PT Client (SOCKS Proxy) +--+ + +------------+ +---------------------------+ | + | + Public Internet (Obfuscated/Transformed traffic) ==> | + | + +------------+ +---------------------------+ | + | Server App +-- Local Loopback --+ PT Server (Reverse Proxy) +--+ + +------------+ +---------------------------+ + + On the client's host, the PT Client software exposes a SOCKS proxy + [RFC1928] to the client application, and obfuscates or otherwise + transforms traffic before forwarding it to the server's host. + + On the server's host, the PT Server software exposes a reverse proxy + that accepts connections from PT Clients, and handles reversing the + obfuscation/transformation applied to traffic, before forwarding it + to the actual server software. 
An optional lightweight protocol + exists to facilitate communicating connection meta-data that would + otherwise be lost, such as the source IP address and port + [EXTORPORT]. + + All PT instances are configured by the respective parent process via + a set of standardized environment variables (3.2) that are set at + launch time, and report status information back to the parent via + writing output in a standardized format to stdout (3.3). + + Each invocation of a PT MUST be either a client OR a server. + + All PT client forward proxies MUST support either SOCKS 4 or SOCKS 5, + and SHOULD prefer SOCKS 5 over SOCKS 4. + +3. Specification + + Pluggable Transport proxies follow this workflow + throughout their lifespan. + + 1) Parent process sets the required environment values (3.2) + and launches the PT proxy as a sub-process (fork()/exec()). + + 2) The PT Proxy determines the versions of the PT specification + supported by the parent via "TOR_PT_MANAGED_TRANSPORT_VER" (3.2.1). + + 2.1) If there are no compatible versions, the PT proxy + writes a "VERSION-ERROR" message (3.3.1) to stdout and + terminates. + + 2.2) If there is a compatible version, the PT proxy writes + a "VERSION" message (3.3.1) to stdout. + + 3) The PT Proxy parses the rest of the environment values. + + 3.1) If the environment values are malformed, or otherwise + invalid, the PT proxy writes an "ENV-ERROR" message + (3.3.1) to stdout and terminates. + + 3.2) The PT proxy determines whether it is a client side forward + proxy or a server side reverse proxy by examining + the "TOR_PT_CLIENT_TRANSPORTS" and "TOR_PT_SERVER_TRANSPORTS" + environment variables. + + 4) (Client only) If there is an upstream proxy specified via + "TOR_PT_PROXY" (3.2.2), the PT proxy validates the URI + provided. + + 4.1) If the upstream proxy is unusable, the PT proxy writes + a "PROXY-ERROR" message (3.3.2) to stdout and + terminates. 
+ + 4.2) If there is a supported and well-formed upstream proxy, + the PT proxy writes a "PROXY DONE" message (3.3.2) to + stdout. + + 5) The PT Proxy initializes the transports and reports the + status via stdout (3.3.2, 3.3.3). + + 6) The PT Proxy forwards and transforms traffic as appropriate. + + 7) Upon being signaled to terminate by the parent process (3.4), + the PT Proxy gracefully shuts down. + +3.1. Pluggable Transport Naming + + Pluggable Transport names serve as unique identifiers, and every + PT MUST have a unique name. + + PT names MUST be valid C identifiers. PT names MUST begin with + a letter or underscore, and the remaining characters MUST be + ASCII letters, numbers or underscores. No length limit is + imposed. + + PT names MUST satisfy the regular expression "[a-zA-Z_][a-zA-Z0-9_]*". + +3.2. Pluggable Transport Configuration Environment Variables + + All Pluggable Transport proxy instances are configured by their + parent process at launch time via a set of well defined + environment variables. + + The "TOR_PT_" prefix is used for namespacing reasons and does not + indicate any relation to Tor, except for the origins of this + specification. + +3.2.1. Common Environment Variables + + When launching either a client or server Pluggable Transport proxy, + the following common environment variables MUST be set. + + "TOR_PT_MANAGED_TRANSPORT_VER" + + Specifies the versions of the Pluggable Transport specification + the parent process supports, delimited by commas. All PTs MUST + accept any well-formed list, as long as a compatible version is + present. + + Valid versions MUST consist entirely of non-whitespace, + non-comma printable ASCII characters. + + The version of the Pluggable Transport specification as of this + document is "1". 
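The PT side of this version negotiation can be sketched in Python as follows. This is a minimal illustration, assuming a PT that implements only version "1"; the names SUPPORTED_VERSIONS and version_reply are hypothetical and not part of this specification. The function returns the line the PT would write to stdout (3.3.1).

```python
# Versions of the PT specification this hypothetical PT implements.
SUPPORTED_VERSIONS = ("1",)


def version_reply(offered: str) -> str:
    """Return the stdout line answering TOR_PT_MANAGED_TRANSPORT_VER.

    `offered` is the raw, comma-delimited value set by the parent process.
    """
    candidates = offered.split(",")
    for version in SUPPORTED_VERSIONS:
        if version in candidates:
            return f"VERSION {version}"
    # No compatible version: the PT reports this and then terminates.
    return "VERSION-ERROR no-version"
```

Unknown entries in the list are simply skipped, which is one way to accept any well-formed list as long as a compatible version is present.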
+ + Example: + + TOR_PT_MANAGED_TRANSPORT_VER=1,1a,2b,this_is_a_valid_ver + + "TOR_PT_STATE_LOCATION" + + Specifies an absolute path to a directory where the PT is + allowed to store state that will be persisted across + invocations. The directory is not required to exist when + the PT is launched, however PT implementations SHOULD be + able to create it as required. + + PTs MUST only store files in the path provided, and MUST NOT + create or modify files elsewhere on the system. + + Example: + + TOR_PT_STATE_LOCATION=/var/lib/tor/pt_state/ + + "TOR_PT_EXIT_ON_STDIN_CLOSE" + + Specifies that the parent process will close the PT proxy's + standard input (stdin) stream to indicate that the PT proxy + should gracefully exit. + + PTs MUST NOT treat a closed stdin as a signal to terminate + unless this environment variable is set to "1". + + PTs SHOULD treat stdin being closed as a signal to gracefully + terminate if this environment variable is set to "1". + + Example: + + TOR_PT_EXIT_ON_STDIN_CLOSE=1 + + "TOR_PT_OUTBOUND_BIND_ADDRESS_V4" + + Specifies an IPv4 IP address that the PT proxy SHOULD use as source address for + outgoing IPv4 IP packets. This feature allows people with multiple network + interfaces to specify explicitly which interface they prefer the PT proxy to + use. + + If this value is unset or empty, the PT proxy MUST use the default source + address for outgoing connections. + + This setting MUST be ignored for connections to + loopback addresses (127.0.0.0/8). + + Example: + + TOR_PT_OUTBOUND_BIND_ADDRESS_V4=203.0.113.4 + + "TOR_PT_OUTBOUND_BIND_ADDRESS_V6" + + Specifies an IPv6 IP address that the PT proxy SHOULD use as source address for + outgoing IPv6 IP packets. This feature allows people with multiple network + interfaces to specify explicitly which interface they prefer the PT proxy to + use. + + If this value is unset or empty, the PT proxy MUST use the default source + address for outgoing connections. 
+ + This setting MUST be ignored for connections to the loopback address ([::1]). + + IPv6 addresses MUST always be wrapped in square brackets. + + Example: + + TOR_PT_OUTBOUND_BIND_ADDRESS_V6=[2001:db8::4] + +3.2.2. Pluggable Transport Client Environment Variables + + Client-side Pluggable Transport forward proxies are configured + via the following environment variables. + + "TOR_PT_CLIENT_TRANSPORTS" + + Specifies the PT protocols the client proxy should initialize, + as a comma separated list of PT names. + + PTs SHOULD ignore PT names that they do not recognize. + + Parent processes MUST set this environment variable when + launching a client-side PT proxy instance. + + Example: + + TOR_PT_CLIENT_TRANSPORTS=obfs2,obfs3,obfs4 + + "TOR_PT_PROXY" + + Specifies an upstream proxy that the PT MUST use when making + outgoing network connections. It is a URI [RFC3986] of the + format: + + <proxy_type>://[<user_name>[:<password>]@]<ip>:<port>. + + The "TOR_PT_PROXY" environment variable is OPTIONAL and + MUST be omitted if there is no need to connect via an + upstream proxy. + + Examples: + + TOR_PT_PROXY=socks5://tor:test1234@198.51.100.1:8000 + TOR_PT_PROXY=socks4a://198.51.100.2:8001 + TOR_PT_PROXY=http://198.51.100.3:443 + +3.2.3. Pluggable Transport Server Environment Variables + + Server-side Pluggable Transport reverse proxies are configured + via the following environment variables. + + "TOR_PT_SERVER_TRANSPORTS" + + Specifies the PT protocols the server proxy should initialize, + as a comma separated list of PT names. + + PTs SHOULD ignore PT names that they do not recognize. + + Parent processes MUST set this environment variable when + launching a server-side PT reverse proxy instance. + + Example: + + TOR_PT_SERVER_TRANSPORTS=obfs3,scramblesuit + + "TOR_PT_SERVER_TRANSPORT_OPTIONS" + + Specifies per-PT protocol configuration directives, as a + semicolon-separated list of <key>:<value> pairs, where <key> + is a PT name and <value> is a k=v string value with options + that are to be passed to the transport. 
+ + Colons, semicolons, and backslashes MUST be + escaped with a backslash. + + If there are no arguments that need to be passed to any of + PT transport protocols, "TOR_PT_SERVER_TRANSPORT_OPTIONS" + MAY be omitted. + + Example: + + TOR_PT_SERVER_TRANSPORT_OPTIONS=scramblesuit:key=banana;automata:rule=110;automata:depth=3 + + Will pass to 'scramblesuit' the parameter 'key=banana' and to + 'automata' the arguments 'rule=110' and 'depth=3'. + + "TOR_PT_SERVER_BINDADDR" + + A comma separated list of <key>-<value> pairs, where <key> is + a PT name and <value> is the <address>:<port> on which it + should listen for incoming client connections. + + The keys holding transport names MUST be in the same order as + they appear in "TOR_PT_SERVER_TRANSPORTS". + + The <address> MAY be a locally scoped address as long as port + forwarding is done externally. + + The <address>:<port> combination MUST be an IP address + supported by `bind()`, and MUST NOT be a host name. + + Applications MUST NOT set more than one <address>:<port> pair + per PT name. + + If there is no specific <address>:<port> combination to be + configured for any transports, "TOR_PT_SERVER_BINDADDR" MAY + be omitted. + + Example: + + TOR_PT_SERVER_BINDADDR=obfs3-198.51.100.1:1984,scramblesuit-127.0.0.1:4891 + + "TOR_PT_ORPORT" + + Specifies the destination that the PT reverse proxy should forward + traffic to after transforming it as appropriate, as an + <address>:<port>. + + Connections to the destination specified via "TOR_PT_ORPORT" + MUST only contain application payload. If the parent process + requires the actual source IP address of client connections + (or other metadata), it should set "TOR_PT_EXTENDED_SERVER_PORT" + instead. + + Example: + + TOR_PT_ORPORT=127.0.0.1:9001 + + "TOR_PT_EXTENDED_SERVER_PORT" + + Specifies the destination that the PT reverse proxy should + forward traffic to, via the Extended ORPort protocol [EXTORPORT] + as an <address>:<port>. + + The Extended ORPort protocol allows the PT reverse proxy to + communicate per-connection metadata such as the PT name and + client IP address/port to the parent process. + + If the parent process does not support the ExtORPort protocol, + it MUST set "TOR_PT_EXTENDED_SERVER_PORT" to an empty string. + + Example: + + TOR_PT_EXTENDED_SERVER_PORT=127.0.0.1:4200 + + "TOR_PT_AUTH_COOKIE_FILE" + + Specifies an absolute filesystem path to the Extended ORPort + authentication cookie, required to communicate with the + Extended ORPort specified via "TOR_PT_EXTENDED_SERVER_PORT". + + If the parent process is not using the ExtORPort protocol for + incoming traffic, "TOR_PT_AUTH_COOKIE_FILE" MUST be omitted. + + Example: + + TOR_PT_AUTH_COOKIE_FILE=/var/lib/tor/extended_orport_auth_cookie + +3.3. Pluggable Transport To Parent Process Communication + + All Pluggable Transport Proxies communicate to the parent process + via writing NL-terminated lines to stdout. The line metaformat is: + + <Line> ::= <Keyword> <OptArgs> <NL> + <Keyword> ::= <KeywordChar> | <Keyword> <KeywordChar> + <KeywordChar> ::= <any US-ASCII alphanumeric, dash, and underscore> + <OptArgs> ::= <Args>* + <Args> ::= <SP> <ArgChar> | <Args> <ArgChar> + <ArgChar> ::= <any US-ASCII character but NUL or NL> + <SP> ::= <US-ASCII whitespace symbol (32)> + <NL> ::= <US-ASCII newline (line feed) character (10)> + + The parent process MUST ignore lines received from PT proxies with + unknown keywords. + +3.3.1. Common Messages + + When a PT proxy first starts up, it must determine which version + of the Pluggable Transports Specification to use to configure + itself. + + It does this via the "TOR_PT_MANAGED_TRANSPORT_VER" (3.2.1) + environment variable which contains all of the versions supported + by the application. + + Upon determining the version to use, or lack thereof, the PT + proxy responds with one of two messages. + + VERSION-ERROR <ErrorMessage> + + The "VERSION-ERROR" message is used to signal that there was + no compatible Pluggable Transport Specification version + present in the "TOR_PT_MANAGED_TRANSPORT_VER" list. + + The <ErrorMessage> SHOULD be set to "no-version" for + historical reasons but MAY be set to a useful error message + instead. + + PT proxies MUST terminate after outputting a "VERSION-ERROR" + message. 
+ + Example: + + VERSION-ERROR no-version + + VERSION <ProtocolVersion> + + The "VERSION" message is used to signal the Pluggable Transport + Specification version (as in "TOR_PT_MANAGED_TRANSPORT_VER") + that the PT proxy will use to configure its transports and + communicate with the parent process. + + The version for the environment values and reply messages + specified by this document is "1". + + PT proxies MUST either report an error and terminate, or output + a "VERSION" message before moving on to client/server proxy + initialization and configuration. + + Example: + + VERSION 1 + + After version negotiation has been completed the PT proxy must + then validate that all of the required environment variables are + provided, and that all of the configuration values supplied are + well formed. + + At any point, if the PT proxy encounters an error related to + configuration supplied via the environment variables, it MAY + respond with an error message and terminate. + + ENV-ERROR <ErrorMessage> + + The "ENV-ERROR" message is used to signal the PT proxy's + failure to parse the configuration environment variables (3.2). + + The <ErrorMessage> SHOULD consist of a useful error message + that can be used to diagnose and correct the root cause of + the failure. + + PT proxies MUST terminate after outputting an "ENV-ERROR" + message. + + Example: + + ENV-ERROR No TOR_PT_AUTH_COOKIE_FILE when TOR_PT_EXTENDED_SERVER_PORT set + +3.3.2. Pluggable Transport Client Messages + + After negotiating the Pluggable Transport Specification version, + PT client proxies MUST first validate "TOR_PT_PROXY" (3.2.2) if + it is set, before initializing any transports. + + Assuming that an upstream proxy is provided, PT client proxies + MUST respond with a message indicating that the proxy is valid, + supported, and will be used OR a failure message. + + PROXY DONE + + The "PROXY DONE" message is used to signal the PT proxy's + acceptance of the upstream proxy specified by "TOR_PT_PROXY". 
+ + PROXY-ERROR <ErrorMessage> + + The "PROXY-ERROR" message is used to signal that the upstream + proxy is malformed/unsupported or otherwise unusable. + + PT proxies MUST terminate immediately after outputting a + "PROXY-ERROR" message. + + Example: + + PROXY-ERROR SOCKS 4 upstream proxies unsupported. + + After the upstream proxy (if any) is configured, PT clients then + iterate over the requested transports in "TOR_PT_CLIENT_TRANSPORTS" + and initialize the listeners. + + For each transport initialized, the PT proxy reports the listener + status back to the parent via messages to stdout. + + CMETHOD <transport> <'socks4','socks5'> <address:port> + + The "CMETHOD" message is used to signal that a requested + PT transport has been launched, the protocol which the parent + should use to make outgoing connections, and the IP address + and port that the PT transport's forward proxy is listening on. + + Example: + + CMETHOD trebuchet socks5 127.0.0.1:19999 + + CMETHOD-ERROR <transport> <ErrorMessage> + + The "CMETHOD-ERROR" message is used to signal that a + requested PT transport could not be launched. + + Example: + + CMETHOD-ERROR trebuchet no rocks available + + Once all PT transports have been initialized (or have failed), the + PT proxy MUST send a final message indicating that it has finished + initializing. + + CMETHODS DONE + + The "CMETHODS DONE" message signals that the PT proxy has + finished initializing all of the transports that it is capable + of handling. + + Upon sending the "CMETHODS DONE" message, the PT proxy + initialization is complete. + + Notes: + + - Unknown transports in "TOR_PT_CLIENT_TRANSPORTS" are ignored + entirely, and MUST NOT result in a "CMETHOD-ERROR" message. + Thus it is entirely possible for a given PT proxy to + immediately output "CMETHODS DONE". + + - Parent processes MUST handle "CMETHOD"/"CMETHOD-ERROR" + messages in any order, regardless of ordering in + "TOR_PT_CLIENT_TRANSPORTS". + +3.3.3. 
Pluggable Transport Server Messages + + PT server reverse proxies iterate over the requested transports + in "TOR_PT_SERVER_TRANSPORTS" and initialize the listeners. + + For each transport initialized, the PT proxy reports the listener + status back to the parent via messages to stdout. + + SMETHOD <transport> <address:port> [options] + + The "SMETHOD" message is used to signal that a requested + PT transport has been launched, the protocol which will be + used to handle incoming connections, and the IP address and + port that clients should use to reach the reverse-proxy. + + If there is a specific <address:port> provided for a given + PT transport via "TOR_PT_SERVER_BINDADDR", the transport + MUST be initialized using that as the server address. + + The OPTIONAL 'options' field is used to pass additional + per-transport information back to the parent process. + + The currently recognized 'options' are: + + ARGS:[<Key>=<Value>,]+[<Key>=<Value>] + + The "ARGS" option is used to pass additional key/value + formatted information that clients will require to use + the reverse proxy. + + Equal signs and commas MUST be escaped with a backslash. + + Tor: The ARGS are included in the transport line of the + Bridge's extra-info document. + + Examples: + + SMETHOD trebuchet 198.51.100.1:19999 + SMETHOD rot_by_N 198.51.100.1:2323 ARGS:N=13 + + SMETHOD-ERROR <transport> <ErrorMessage> + + The "SMETHOD-ERROR" message is used to signal that a + requested PT transport reverse proxy could not be + launched. + + Example: + + SMETHOD-ERROR trebuchet no cows available + + Once all PT transports have been initialized (or have failed), the + PT proxy MUST send a final message indicating that it has finished + initializing. + + SMETHODS DONE + + The "SMETHODS DONE" message signals that the PT proxy has + finished initializing all of the transports that it is capable + of handling. + + Upon sending the "SMETHODS DONE" message, the PT proxy + initialization is complete. + +3.3.4. 
Pluggable Transport Log Message + + This message lets a client or server PT send log messages back to the + parent process via stdout or stderr. + + A log message can be any kind of human readable message that the PT + sends back so the parent process can gather information about what is going + on in the child process. It is not intended for the parent process to parse + and act on, but rather is used for plain logging. + + For example, the tor daemon logs those messages at the Severity level and + sends them onto the control port using the PT_LOG (see control-spec.txt) + event so any third party can pick them up for debugging. + + The format of the message: + + LOG SEVERITY=Severity MESSAGE=Message + + The SEVERITY value indicates at which logging level the message applies. + The accepted values for <Severity> are: error, warning, notice, info, debug. + + The MESSAGE value is a human readable string formatted by the PT. The + <Message> contains the log message which can be a String or CString (see + section 2 in control-spec.txt). + + Example: + + LOG SEVERITY=debug MESSAGE="Connected to bridge A" + +3.3.5. Pluggable Transport Status Message + + This message lets a client or server PT send status messages back to the + parent process via stdout or stderr. + + The format of the message: + + STATUS TRANSPORT=Transport <K_1>=<V_1> [<K_2>=<V_2> ...] + + The TRANSPORT value is a hint about what the PT is, such as its name or + the protocol used. As an example, obfs4proxy would use + "obfs4". Thus, the Transport value can be anything the PT itself defines + and it can be a String or CString (see section 2 in control-spec.txt). + + The <K_n>=<V_n> values are specific to the PT and there has to be at least + one. They are messages that reflect the status that the PT wants to + report. <V_n> can be a String or CString. 
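How a PT might format such a STATUS line can be sketched in Python as follows. The helper names are hypothetical, and the quoting is a deliberate simplification of the String/CString rules in control-spec.txt: a value is written bare when possible, and is otherwise wrapped in double quotes with backslash escapes. Consult control-spec.txt for the authoritative encoding.

```python
def _encode_value(value: str) -> str:
    """Emit a bare value when possible, otherwise a quoted, escaped one.

    Assumption: this only approximates the CString rules of control-spec.txt.
    """
    needs_quotes = value == "" or any(c.isspace() or c in '"\\' for c in value)
    if not needs_quotes:
        return value
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
        .replace("\t", "\\t")
    )
    return f'"{escaped}"'


def format_status(transport: str, **fields: str) -> str:
    """Build a STATUS line; at least one key=value pair is required."""
    if not fields:
        raise ValueError("STATUS needs at least one key=value pair")
    parts = [f"TRANSPORT={_encode_value(transport)}"]
    parts += [f"{key}={_encode_value(val)}" for key, val in fields.items()]
    return "STATUS " + " ".join(parts)
```

Keyword arguments keep their insertion order in Python 3.7 and later, so the pairs appear on the line in the order the caller supplied them.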
+ + Examples (fictional): + + STATUS TRANSPORT=obfs4 ADDRESS=198.51.100.123:1234 CONNECT=Success + STATUS TRANSPORT=obfs4 ADDRESS=198.51.100.222:2222 CONNECT=Failed FINGERPRINT=<Fingerprint> ERRSTR="Connection refused" + STATUS TRANSPORT=trebuchet ADDRESS=198.51.100.15:443 PERCENT=42 + +3.4. Pluggable Transport Shutdown + + The recommended way for applications that use Pluggable Transports, + and for Pluggable Transports themselves, to handle graceful shutdown is as follows. + + - (Parent) Set "TOR_PT_EXIT_ON_STDIN_CLOSE" (3.2.1) when + launching the PT proxy, to indicate that stdin will be used + for graceful shutdown notification. + + - (Parent) When the time comes to terminate the PT proxy: + + 1. Close the PT proxy's stdin. + 2. Wait for a "reasonable" amount of time for the PT to exit. + 3. Attempt to use OS specific mechanisms to cause graceful + PT shutdown (eg: 'SIGTERM'). + 4. Use OS specific mechanisms to force terminate the PT + (eg: 'SIGKILL', 'TerminateProcess()'). + + - PT proxies SHOULD monitor stdin, and exit gracefully when + it is closed, if the parent supports that behavior. + + - PT proxies SHOULD handle OS specific mechanisms to gracefully + terminate (eg: Install a signal handler on 'SIGTERM' that + causes cleanup and a graceful shutdown if able). + + - PT proxies SHOULD attempt to detect when the parent has + terminated (eg: via detecting that its parent process ID has + changed on U*IX systems), and gracefully terminate. + +3.5. Pluggable Transport Client Per-Connection Arguments + + Certain PT transport protocols require that the client provides + per-connection arguments when making outgoing connections. On + the server side, this is handled by the "ARGS" optional argument + as part of the "SMETHOD" message. + + On the client side, arguments are passed via the authentication + fields that are part of the SOCKS protocol. + + First the "<Key>=<Value>" formatted arguments MUST be escaped, + such that all backslash, equal sign, and semicolon characters + are escaped with a backslash. 
+ + Second, all of the escaped arguments are concatenated together, + separated by semicolons. + + Example: + + shared-secret=rahasia;secrets-file=/tmp/blob + + Lastly, the arguments are transmitted when making the outgoing + connection using the authentication mechanism specific to the + SOCKS protocol version. + + - In the case of SOCKS 4, the concatenated argument list is + transmitted in the "USERID" field of the "CONNECT" request. + + - In the case of SOCKS 5, the parent process must negotiate + "Username/Password" authentication [RFC1929], and transmit + the arguments encoded in the "UNAME" and "PASSWD" fields. + + If the encoded argument list is less than 255 bytes in + length, the "PLEN" field must be set to "1" and the "PASSWD" + field must contain a single NUL character. + +4. Anonymity Considerations + + When designing and implementing a Pluggable Transport, care + should be taken to preserve the privacy of clients and to avoid + leaking personally identifying information. + + Examples of client related considerations are: + + - Not logging client IP addresses to disk. + + - Not leaking DNS addresses except when necessary. + + - Ensuring that "TOR_PT_PROXY"'s "fail closed" behavior is + implemented correctly. + + Additionally, certain obfuscation mechanisms rely on information + such as the server IP address/port being confidential, so clients + also need to take care to keep server side information + confidential when applicable. + +5. References + + [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate + Requirement Levels", BCP 14, RFC 2119, March 1997. + + [RFC1928] Leech, M., Ganis, M., Lee, Y., Kuris, R., + Koblas, D., Jones, L., "SOCKS Protocol Version 5", + RFC 1928, March 1996. + + [EXTORPORT] Kadianakis, G., Mathewson, N., "Extended ORPort and + TransportControlPort", Tor Proposal 196, March 2012. + + [RFC3986] Berners-Lee, T., Fielding, R., Masinter, L., "Uniform + Resource Identifier (URI): Generic Syntax", RFC 3986, + January 2005. 
+ + [RFC1929] Leech, M., "Username/Password Authentication for + SOCKS V5", RFC 1929, March 1996. + +6. Acknowledgments + + This specification draws heavily from prior versions done by Jacob + Appelbaum, Nick Mathewson, and George Kadianakis. + +Appendix A. Example Client Pluggable Transport Session + + Environment variables: + + TOR_PT_MANAGED_TRANSPORT_VER=1 + TOR_PT_STATE_LOCATION=/var/lib/tor/pt_state/ + TOR_PT_EXIT_ON_STDIN_CLOSE=1 + TOR_PT_PROXY=socks5://127.0.0.1:8001 + TOR_PT_CLIENT_TRANSPORTS=obfs3,obfs4 + + Messages the PT Proxy writes to stdout: + + VERSION 1 + PROXY DONE + CMETHOD obfs3 socks5 127.0.0.1:32525 + CMETHOD obfs4 socks5 127.0.0.1:37347 + CMETHODS DONE + +Appendix B. Example Server Pluggable Transport Session + + Environment variables: + + TOR_PT_MANAGED_TRANSPORT_VER=1 + TOR_PT_STATE_LOCATION=/var/lib/tor/pt_state + TOR_PT_EXIT_ON_STDIN_CLOSE=1 + TOR_PT_SERVER_TRANSPORTS=obfs3,obfs4 + TOR_PT_SERVER_BINDADDR=obfs3-198.51.100.1:1984 + + Messages the PT Proxy writes to stdout: + + VERSION 1 + SMETHOD obfs3 198.51.100.1:1984 + SMETHOD obfs4 198.51.100.1:43734 ARGS:cert=HszPy3vWfjsESCEOo9ZBkRv6zQ/1mGHzc8arF0y2SpwFr3WhsMu8rK0zyaoyERfbz3ddFw,iat-mode=0 + SMETHODS DONE diff --git a/attic/text_formats/rend-spec-v3.txt b/attic/text_formats/rend-spec-v3.txt new file mode 100644 index 0000000..d836d23 --- /dev/null +++ b/attic/text_formats/rend-spec-v3.txt @@ -0,0 +1,2869 @@ + + Tor Rendezvous Specification - Version 3 + +This document specifies how the hidden service version 3 protocol works. This +text used to be proposal 224-rend-spec-ng.txt. + + +Table of contents: + + 0. Hidden services: overview and preliminaries. + 0.1. Improvements over previous versions. + 0.2. Notation and vocabulary + 0.3. Cryptographic building blocks + 0.4. Protocol building blocks [BUILDING-BLOCKS] + 0.5. Assigned relay cell types + 0.6. Acknowledgments + 1. Protocol overview + 1.1. View from 10,000 feet + 1.2. In more detail: naming hidden services [NAMING] + 1.3. 
In more detail: Access control [IMD:AC] + 1.4. In more detail: Distributing hidden service descriptors. [IMD:DIST] + 1.5. In more detail: Scaling to multiple hosts + 1.6. In more detail: Backward compatibility with older hidden service + 1.7. In more detail: Keeping crypto keys offline + 1.8. In more detail: Encryption Keys And Replay Resistance + 1.9. In more detail: A menagerie of keys + 1.9.1. In even more detail: Client authorization [CLIENT-AUTH] + 2. Generating and publishing hidden service descriptors [HSDIR] + 2.1. Deriving blinded keys and subcredentials [SUBCRED] + 2.2. Locating, uploading, and downloading hidden service descriptors + 2.2.1. Dividing time into periods [TIME-PERIODS] + 2.2.2. When to publish a hidden service descriptor [WHEN-HSDESC] + 2.2.3. Where to publish a hidden service descriptor [WHERE-HSDESC] + 2.2.4. Using time periods and SRVs to fetch/upload HS descriptors + 2.2.5. Expiring hidden service descriptors [EXPIRE-DESC] + 2.2.6. URLs for anonymous uploading and downloading + 2.3. Publishing shared random values [PUB-SHAREDRANDOM] + 2.3.1. Client behavior in the absence of shared random values + 2.3.2. Hidden services and changing shared random values + 2.4. Hidden service descriptors: outer wrapper [DESC-OUTER] + 2.5. Hidden service descriptors: encryption format [HS-DESC-ENC] + 2.5.1. First layer of encryption [HS-DESC-FIRST-LAYER] + 2.5.1.1. First layer encryption logic + 2.5.1.2. First layer plaintext format + 2.5.1.3. Client behavior + 2.5.1.4. Obfuscating the number of authorized clients + 2.5.2. Second layer of encryption [HS-DESC-SECOND-LAYER] + 2.5.2.1. Second layer encryption keys + 2.5.2.2. Second layer plaintext format + 2.5.3. Deriving hidden service descriptor encryption keys [HS-DESC-ENCRYPTION-KEYS] + 3. The introduction protocol [INTRO-PROTOCOL] + 3.1. Registering an introduction point [REG_INTRO_POINT] + 3.1.1. Extensible ESTABLISH_INTRO protocol. [EST_INTRO] + 3.1.1.1. Denial-of-Service Defense Extension. 
[EST_INTRO_DOS_EXT] + 3.1.2. Registering an introduction point on a legacy Tor node [LEGACY_EST_INTRO] + 3.1.3. Acknowledging establishment of introduction point [INTRO_ESTABLISHED] + 3.2. Sending an INTRODUCE1 cell to the introduction point. [SEND_INTRO1] + 3.2.1. INTRODUCE1 cell format [FMT_INTRO1] + 3.2.2. INTRODUCE_ACK cell format. [INTRO_ACK] + 3.3. Processing an INTRODUCE2 cell at the hidden service. [PROCESS_INTRO2] + 3.3.1. Introduction handshake encryption requirements [INTRO-HANDSHAKE-REQS] + 3.3.2. Example encryption handshake: ntor with extra data [NTOR-WITH-EXTRA-DATA] + 3.4. Authentication during the introduction phase. [INTRO-AUTH] + 3.4.1. Ed25519-based authentication. + 4. The rendezvous protocol + 4.1. Establishing a rendezvous point [EST_REND_POINT] + 4.2. Joining to a rendezvous point [JOIN_REND] + 4.2.1. Key expansion + 4.3. Using legacy hosts as rendezvous points + 5. Encrypting data between client and host + 6. Encoding onion addresses [ONIONADDRESS] + 7. Open Questions: + +-1. Draft notes + + This document describes a proposed design and specification for + hidden services in Tor version 0.2.5.x or later. It's a replacement + for the current rend-spec.txt, rewritten for clarity and for improved + design. + + Look for the string "TODO" below: it describes gaps or uncertainties + in the design. + + Change history: + + 2013-11-29: Proposal first numbered. Some TODO and XXX items remain. + + 2014-01-04: Clarify some unclear sections. + + 2014-01-21: Fix a typo. + + 2014-02-20: Move more things to the revised certificate format in the + new updated proposal 220. + + 2015-05-26: Fix two typos. + + +0. Hidden services: overview and preliminaries. + + Hidden services aim to provide responder anonymity for bidirectional + stream-based communication on the Tor network. Unlike regular Tor + connections, where the connection initiator receives anonymity but + the responder does not, hidden services attempt to provide + bidirectional anonymity. 
+ + Participants: + + Operator -- A person running a hidden service + + Host, "Server" -- The Tor software run by the operator to provide + a hidden service. + + User -- A person contacting a hidden service. + + Client -- The Tor software running on the User's computer + + Hidden Service Directory (HSDir) -- A Tor node that hosts signed + statements from hidden service hosts so that users can make + contact with them. + + Introduction Point -- A Tor node that accepts connection requests + for hidden services and anonymously relays those requests to the + hidden service. + + Rendezvous Point -- A Tor node to which clients and servers + connect and which relays traffic between them. + +0.1. Improvements over previous versions. + + Here is a list of improvements of this proposal over the legacy hidden + services: + + a) Better crypto (replaced SHA1/DH/RSA1024 with SHA3/ed25519/curve25519) + b) Improved directory protocol leaking less to directory servers. + c) Improved directory protocol with smaller surface for targeted attacks. + d) Better onion address security against impersonation. + e) More extensible introduction/rendezvous protocol. + f) Offline keys for onion services + g) Advanced client authorization + +0.2. Notation and vocabulary + + Unless specified otherwise, all multi-octet integers are big-endian. + + We write sequences of bytes in two ways: + + 1. A sequence of two-digit hexadecimal values in square brackets, + as in [AB AD 1D EA]. + + 2. A string of characters enclosed in quotes, as in "Hello". The + characters in these strings are encoded in their ascii + representations; strings are NOT nul-terminated unless + explicitly described as NUL terminated. + + We use the words "byte" and "octet" interchangeably. + + We use the vertical bar | to denote concatenation. + + We use INT_N(val) to denote the network (big-endian) encoding of the + unsigned integer "val" in N bytes. For example, INT_4(1337) is [00 00 + 05 39]. 
Values are truncated like so: val % (2 ^ (N * 8)). For example, + INT_4(42) is 42 % 4294967296 (32 bit). + +0.3. Cryptographic building blocks + + This specification uses the following cryptographic building blocks: + + * A pseudorandom number generator backed by a strong entropy source. + The output of the PRNG should always be hashed before being posted on + the network to avoid leaking raw PRNG bytes to the network + (see [PRNG-REFS]). + + * A stream cipher STREAM(iv, k) where iv is a nonce of length + S_IV_LEN bytes and k is a key of length S_KEY_LEN bytes. + + * A public key signature system SIGN_KEYGEN()->seckey, pubkey; + SIGN_SIGN(seckey,msg)->sig; and SIGN_CHECK(pubkey, sig, msg) -> + { "OK", "BAD" }; where secret keys are of length SIGN_SECKEY_LEN + bytes, public keys are of length SIGN_PUBKEY_LEN bytes, and + signatures are of length SIGN_SIG_LEN bytes. + + This signature system must also support key blinding operations + as discussed in appendix [KEYBLIND] and in section [SUBCRED]: + SIGN_BLIND_SECKEY(seckey, blind)->seckey2 and + SIGN_BLIND_PUBKEY(pubkey, blind)->pubkey2 . + + * A public key agreement system "PK", providing + PK_KEYGEN()->seckey, pubkey; PK_VALID(pubkey) -> {"OK", "BAD"}; + and PK_HANDSHAKE(seckey, pubkey)->output; where secret keys are + of length PK_SECKEY_LEN bytes, public keys are of length + PK_PUBKEY_LEN bytes, and the handshake produces outputs of + length PK_OUTPUT_LEN bytes. + + * A cryptographic hash function H(d), which should be preimage and + collision resistant. It produces hashes of length HASH_LEN + bytes. + + * A cryptographic message authentication code MAC(key,msg) that + produces outputs of length MAC_LEN bytes. + + * A key derivation function KDF(message, n) that outputs n bytes. + + As a first pass, I suggest: + + * Instantiate STREAM with AES256-CTR. + + * Instantiate SIGN with Ed25519 and the blinding protocol in + [KEYBLIND]. + + * Instantiate PK with Curve25519. + + * Instantiate H with SHA3-256. 
+ + * Instantiate KDF with SHAKE-256. + + * Instantiate MAC(key=k, message=m) with H(k_len | k | m), + where k_len is htonll(len(k)). + + When we need a particular MAC key length below, we choose + MAC_KEY_LEN=32 (256 bits). + + For legacy purposes, we specify compatibility with older versions of + the Tor introduction point and rendezvous point protocols. These used + RSA1024, DH1024, AES128, and SHA1, as discussed in + rend-spec.txt. + + As in [proposal 220], all signatures are generated not over strings + themselves, but over those strings prefixed with a distinguishing + value. + +0.4. Protocol building blocks [BUILDING-BLOCKS] + + In sections below, we need to transmit the locations and identities + of Tor nodes. We do so in the link identification format used by + EXTEND2 cells in the Tor protocol. + + NSPEC (Number of link specifiers) [1 byte] + NSPEC times: + LSTYPE (Link specifier type) [1 byte] + LSLEN (Link specifier length) [1 byte] + LSPEC (Link specifier) [LSLEN bytes] + + Link specifier types are as described in tor-spec.txt. Every set of + link specifiers SHOULD include at minimum specifiers of type [00] + (TLS-over-TCP, IPv4), [02] (legacy node identity) and [03] (ed25519 + identity key). Sets of link specifiers without these three types + SHOULD be rejected. + + As of 0.4.1.1-alpha, Tor includes both IPv4 and IPv6 link specifiers + in v3 onion service protocol link specifier lists. All available + addresses SHOULD be included as link specifiers, regardless of the + address that Tor actually used to connect/extend to the remote relay. + + We also incorporate Tor's circuit extension handshakes, as used in + the CREATE2 and CREATED2 cells described in tor-spec.txt. In these + handshakes, a client who knows a public key for a server sends a + message and receives a message from that server. 
Once the exchange is + done, the two parties have a shared set of forward-secure key + material, and the client knows that nobody else shares that key + material unless they control the secret key corresponding to the + server's public key. + +0.5. Assigned relay cell types + + These relay cell types are reserved for use in the hidden service + protocol. + + 32 -- RELAY_COMMAND_ESTABLISH_INTRO + + Sent from hidden service host to introduction point; + establishes introduction point. Discussed in + [REG_INTRO_POINT]. + + 33 -- RELAY_COMMAND_ESTABLISH_RENDEZVOUS + + Sent from client to rendezvous point; creates rendezvous + point. Discussed in [EST_REND_POINT]. + + 34 -- RELAY_COMMAND_INTRODUCE1 + + Sent from client to introduction point; requests + introduction. Discussed in [SEND_INTRO1] + + 35 -- RELAY_COMMAND_INTRODUCE2 + + Sent from introduction point to hidden service host; requests + introduction. Same format as INTRODUCE1. Discussed in + [FMT_INTRO1] and [PROCESS_INTRO2] + + 36 -- RELAY_COMMAND_RENDEZVOUS1 + + Sent from hidden service host to rendezvous point; + attempts to join host's circuit to + client's circuit. Discussed in [JOIN_REND] + + 37 -- RELAY_COMMAND_RENDEZVOUS2 + + Sent from rendezvous point to client; + reports join of host's circuit to + client's circuit. Discussed in [JOIN_REND] + + 38 -- RELAY_COMMAND_INTRO_ESTABLISHED + + Sent from introduction point to hidden service host; + reports status of attempt to establish introduction + point. Discussed in [INTRO_ESTABLISHED] + + 39 -- RELAY_COMMAND_RENDEZVOUS_ESTABLISHED + + Sent from rendezvous point to client; acknowledges + receipt of ESTABLISH_RENDEZVOUS cell. Discussed in + [EST_REND_POINT] + + 40 -- RELAY_COMMAND_INTRODUCE_ACK + + Sent from introduction point to client; acknowledges + receipt of INTRODUCE1 cell and reports success/failure. + Discussed in [INTRO_ACK] + +0.6. Acknowledgments + + This design includes ideas from many people, including + + Christopher Baines, + Daniel J. 
Bernstein, + Matthew Finkel, + Ian Goldberg, + George Kadianakis, + Aniket Kate, + Tanja Lange, + Robert Ransom, + Roger Dingledine, + Aaron Johnson, + Tim Wilson-Brown ("teor"), + special (John Brooks), + s7r + + It's based on Tor's original hidden service design by Roger + Dingledine, Nick Mathewson, and Paul Syverson, and on improvements to + that design over the years by people including + + Tobias Kamm, + Thomas Lauterbach, + Karsten Loesing, + Alessandro Preite Martinez, + Robert Ransom, + Ferdinand Rieger, + Christoph Weingarten, + Christian Wilms, + + We wouldn't be able to do any of this work without good attack + designs from researchers including + + Alex Biryukov, + Lasse Øverlier, + Ivan Pustogarov, + Paul Syverson, + Ralf-Philipp Weinmann, + + See [ATTACK-REFS] for their papers. + + Several of these ideas have come from conversations with + + Christian Grothoff, + Brian Warner, + Zooko Wilcox-O'Hearn, + + And if this document makes any sense at all, it's thanks to + editing help from + + Matthew Finkel, + George Kadianakis, + Peter Palfrader, + Tim Wilson-Brown ("teor"), + + + [XXX Acknowledge the huge bunch of people working on 8106.] + [XXX Acknowledge the huge bunch of people working on 8244.] + + + Please forgive me if I've missed you; please forgive me if I've + misunderstood your best ideas here too. + + +1. Protocol overview + + In this section, we outline the hidden service protocol. This section + omits some details in the name of simplicity; those are given more + fully below, when we specify the protocol in more detail. + +1.1. View from 10,000 feet + + A hidden service host prepares to offer a hidden service by choosing + several Tor nodes to serve as its introduction points. It builds + circuits to those nodes, and tells them to forward introduction + requests to it using those circuits. 
+ + Once introduction points have been picked, the host builds a set of + documents called "hidden service descriptors" (or just "descriptors" + for short) and uploads them to a set of HSDir nodes. These documents + list the hidden service's current introduction points and describe + how to make contact with the hidden service. + + When a client wants to connect to a hidden service, it first chooses + a Tor node at random to be its "rendezvous point" and builds a + circuit to that rendezvous point. If the client does not have an + up-to-date descriptor for the service, it contacts an appropriate + HSDir and requests such a descriptor. + + The client then builds an anonymous circuit to one of the hidden + service's introduction points listed in its descriptor, and gives the + introduction point an introduction request to pass to the hidden + service. This introduction request includes the target rendezvous + point and the first part of a cryptographic handshake. + + Upon receiving the introduction request, the hidden service host + makes an anonymous circuit to the rendezvous point and completes the + cryptographic handshake. The rendezvous point connects the two + circuits, and the cryptographic handshake gives the two parties a + shared key and proves to the client that it is indeed talking to the + hidden service. + + Once the two circuits are joined, the client can send Tor RELAY cells + to the server. RELAY_BEGIN cells open streams to an external process + or processes configured by the server; RELAY_DATA cells are used to + communicate data on those streams, and so forth. + +1.2. In more detail: naming hidden services [NAMING] + + A hidden service's name is its long term master identity key. This is + encoded as a hostname by encoding the entire key in Base 32, including a + version byte and a checksum, and then appending the string ".onion" at the + end. The result is a 56-character domain name. 
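As a rough illustration of where the 56 characters come from, here is a minimal Python sketch. It assumes the layout given in [ONIONADDRESS] (a 32-byte public key, followed by a 2-byte checksum and a 1-byte version, with the checksum taken from H(".onion checksum" | PUBKEY | VERSION)) and the SHA3-256 instantiation of H suggested in section 0.3. The function name onion_address is hypothetical, and [ONIONADDRESS] remains the authoritative description.

```python
import base64
import hashlib


def onion_address(pubkey: bytes, version: int = 3) -> str:
    """Encode a 32-byte master identity public key as a ".onion" hostname.

    Assumed layout (see [ONIONADDRESS]): base32(PUBKEY | CHECKSUM | VERSION),
    where CHECKSUM is the first 2 bytes of
    H(".onion checksum" | PUBKEY | VERSION) and H is SHA3-256.
    """
    if len(pubkey) != 32:
        raise ValueError("expected a 32-byte public key")
    ver = bytes([version])
    checksum = hashlib.sha3_256(b".onion checksum" + pubkey + ver).digest()[:2]
    # 32 + 2 + 1 = 35 bytes = 280 bits = 56 base32 characters, with no padding.
    label = base64.b32encode(pubkey + checksum + ver).decode("ascii").lower()
    return label + ".onion"


# A dummy all-zero key, used only to show the shape of the result.
example = onion_address(bytes(32))
print(len(example) - len(".onion"))  # 56
```

Because the version byte comes last and 35 bytes divide evenly into 5-bit groups, the final base32 character depends only on the version.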
+ + (This is a change from older versions of the hidden service protocol, + where we used an 80-bit truncated SHA1 hash of a 1024 bit RSA key.) + + The names in this format are distinct from earlier names because of + their length. An older name might look like: + + unlikelynamefora.onion + yyhws9optuwiwsns.onion + + And a new name following this specification might look like: + + l5satjgud6gucryazcyvyvhuxhr74u6ygigiuyixe3a6ysis67ororad.onion + + Please see section [ONIONADDRESS] for the encoding specification. + +1.3. In more detail: Access control [IMD:AC] + + Access control for a hidden service is imposed at multiple points through + the process above. Furthermore, there is also the option to impose + additional client authorization access control using pre-shared secrets + exchanged out-of-band between the hidden service and its clients. + + The first stage of access control happens when downloading HS descriptors. + Specifically, in order to download a descriptor, clients must know which + blinded signing key was used to sign it. (See the next section for more info + on key blinding.) + + To learn the introduction points, clients must decrypt the body of the + hidden service descriptor. To do so, clients must know the _unblinded_ + public key of the service, which makes the descriptor unusable by entities + without that knowledge (e.g. HSDirs that don't know the onion address). + + Also, if optional client authorization is enabled, hidden service + descriptors are superencrypted using each authorized user's identity x25519 + key, to further ensure that unauthorized entities cannot decrypt it. + + In order to make the introduction point send a rendezvous request to the + service, the client needs to use the per-introduction-point authentication + key found in the hidden service descriptor. 
+ + The final level of access control happens at the server itself, which may + decide to respond or not respond to the client's request depending on the + contents of the request. The protocol is extensible at this point: at a + minimum, the server requires that the client demonstrate knowledge of the + contents of the encrypted portion of the hidden service descriptor. If + optional client authorization is enabled, the service may additionally + require the client to prove knowledge of a pre-shared private key. + +1.4. In more detail: Distributing hidden service descriptors. [IMD:DIST] + + Periodically, hidden service descriptors become stored at different + locations to prevent a single directory or small set of directories + from becoming a good DoS target for removing a hidden service. + + For each period, the Tor directory authorities agree upon a + collaboratively generated random value. (See section 2.3 for a + description of how to incorporate this value into the voting + practice; generating the value is described in other proposals, + including [SHAREDRANDOM-REFS].) That value, combined with hidden service + directories' public identity keys, determines each HSDir's position + in the hash ring for descriptors made in that period. + + Each hidden service's descriptors are placed into the ring in + positions based on the key that was used to sign them. Note that + hidden service descriptors are not signed with the services' public + keys directly. Instead, we use a key-blinding system [KEYBLIND] to + create a new key-of-the-day for each hidden service. Any client that + knows the hidden service's public identity key can derive these blinded + signing keys for a given period. It should be impossible to derive + the blinded signing key lacking that knowledge. + + This is achieved using two nonces: + + * A "credential", derived from the public identity key KP_hs_id. + N_hs_cred. 
+ + * A "subcredential", derived from the credential N_hs_cred + and information which varies with the current time period. + N_hs_subcred. + + The body of each descriptor is also encrypted with a key derived from + the public signing key. + + To avoid a "thundering herd" problem where every service generates + and uploads a new descriptor at the start of each period, each + descriptor comes online at a time during the period that depends on + its blinded signing key. The keys for the last period remain valid + until the new keys come online. + +1.5. In more detail: Scaling to multiple hosts + + This design is compatible with our current approaches for scaling hidden + services. Specifically, hidden service operators can use onionbalance to + achieve high availability between multiple nodes on the HSDir + layer. Furthermore, operators can use proposal 255 to load balance their + hidden services on the introduction layer. See [SCALING-REFS] for further + discussions on this topic and alternative designs. + +1.6. In more detail: Backward compatibility with older hidden service + protocols + + This design is incompatible with the clients, server, and hsdir node + protocols from older versions of the hidden service protocol as + described in rend-spec.txt. On the other hand, it is designed to + enable the use of older Tor nodes as rendezvous points and + introduction points. + +1.7. In more detail: Keeping crypto keys offline + + In this design, a hidden service's secret identity key may be + stored offline. It's used only to generate blinded signing keys, + which are used to sign descriptor signing keys. + + In order to operate a hidden service, the operator can generate in + advance a number of blinded signing keys and descriptor signing + keys (and their credentials; see [DESC-OUTER] and [HS-DESC-ENC] + below), and their corresponding descriptor encryption keys, and + export those to the hidden service hosts. 
+ + As a result, in the scenario where the Hidden Service gets + compromised, the adversary can only impersonate it for a limited + period of time (depending on how many signing keys were generated + in advance). + + It's important to not send the private part of the blinded signing + key to the Hidden Service since an attacker can derive from it the + secret master identity key. The secret blinded signing key should + only be used to create credentials for the descriptor signing keys. + + (NOTE: although the protocol allows them, offline keys are not + implemented as of 0.3.2.1-alpha.) + +1.8. In more detail: Encryption Keys And Replay Resistance + + To avoid replays of an introduction request by an introduction point, + a hidden service host must never accept the same request + twice. Earlier versions of the hidden service design used an + authenticated timestamp here, but including a view of the current + time can create a problematic fingerprint. (See proposal 222 for more + discussion.) + +1.9. In more detail: A menagerie of keys + + [In the text below, an "encryption keypair" is roughly "a keypair you + can do Diffie-Hellman with" and a "signing keypair" is roughly "a + keypair you can do ECDSA with."] + + Public/private keypairs defined in this document: + + Master (hidden service) identity key -- A master signing keypair + used as the identity for a hidden service. This key is long + term and not used on its own to sign anything; it is only used + to generate blinded signing keys as described in [KEYBLIND] + and [SUBCRED]. The public key is encoded in the ".onion" + address according to [NAMING]. + KP_hs_id, KS_hs_id. + + Blinded signing key -- A keypair derived from the identity key, + used to sign descriptor signing keys. It changes periodically for + each service. Clients who know a 'credential' consisting of the + service's public identity key and an optional secret can derive + the public blinded identity key for a service. 
This key is used + as an index in the DHT-like structure of the directory system + (see [SUBCRED]). + KP_hs_blind_id, KS_hs_blind_id. + + Descriptor signing key -- A key used to sign hidden service + descriptors. This is signed by blinded signing keys. Unlike + blinded signing keys and master identity keys, the secret part + of this key must be stored online by hidden service hosts. The + public part of this key is included in the unencrypted section + of HS descriptors (see [DESC-OUTER]). + KP_hs_desc_sign, KS_hs_desc_sign. + + Introduction point authentication key -- A short-term signing + keypair used to identify a hidden service's session at a given + introduction point. The service makes a fresh keypair for each + introduction point; these are used to sign the request that a + hidden service host makes when establishing an introduction + point, so that clients who know the public component of this key + can get their introduction requests sent to the right + service. No keypair is ever used with more than one introduction + point. (previously called a "service key" in rend-spec.txt) + KP_hs_ipt_sid, KS_hs_ipt_sid + ("hidden service introduction point session id"). + + Introduction point encryption key -- A short-term encryption + keypair used when establishing connections via an introduction + point. Plays a role analogous to Tor nodes' onion keys. The service + makes a fresh keypair for each introduction point. + KP_hss_ntor, KS_hss_ntor. + + Ephemeral descriptor encryption key -- A short-lived encryption + keypair made by the service, and used to encrypt the inner layer + of hidden service descriptors when client authentication is in + use. + KP_hss_desc_enc, KS_hss_desc_enc + + Nonces defined in this document: + + N_hs_desc_enc -- a nonce used to derive keys to decrypt the inner + encryption layer of hidden service descriptors. This is + sometimes also called a "descriptor cookie". 
+ + Public/private keypairs defined elsewhere: + + Onion key -- Short-term encryption keypair (KS_ntor, KP_ntor). + + (Node) identity key (KP_relayid). + + Symmetric key-like things defined elsewhere: + + KH from circuit handshake -- An unpredictable value derived as + part of the Tor circuit extension handshake, used to tie a request + to a particular circuit. + +1.9.1. In even more detail: Client authorization keys [CLIENT-AUTH] + + When client authorization is enabled, each authorized client of a hidden + service has two more asymmetric keypairs which are shared with the hidden + service. An entity without those keys is not able to use the hidden + service. Throughout this document, we assume that these pre-shared keys are + exchanged between the hidden service and its clients in a secure out-of-band + fashion. + + Specifically, each authorized client possesses: + + - An x25519 keypair used to compute decryption keys that allow the client to + decrypt the hidden service descriptor. See [HS-DESC-ENC]. This is + the client's counterpart to KP_hss_desc_enc. + KP_hsc_desc_enc, KS_hsc_desc_enc. + + - An ed25519 keypair which allows the client to compute signatures which + prove to the hidden service that the client is authorized. These + signatures are inserted into the INTRODUCE1 cell, and without them the + introduction to the hidden service cannot be completed. See [INTRO-AUTH]. + KP_hsc_intro_auth, KS_hsc_intro_auth. + + The right way to exchange these keys is to have the client generate keys and + send the corresponding public keys to the hidden service out-of-band. An + easier but less secure way of doing this exchange would be to have the + hidden service generate the keypairs and pass the corresponding private keys + to its clients. See section [CLIENT-AUTH-MGMT] for more details on how these + keys should be managed. + + [TODO: Also specify stealth client authorization.] + + (NOTE: client authorization is implemented as of 0.3.5.1-alpha.) + +2. 
Generating and publishing hidden service descriptors [HSDIR] + + Hidden service descriptors follow the same metaformat as other Tor + directory objects. They are published anonymously to Tor servers with the + HSDir flag, HSDir=2 protocol version and tor version >= 0.3.0.8 (because a + bug was fixed in this version). + +2.1. Deriving blinded keys and subcredentials [SUBCRED] + + In each time period (see [TIME-PERIODS] for a definition of time + periods), a hidden service host uses a different blinded private key + to sign its directory information, and clients use a different + blinded public key as the index for fetching that information. + + For a candidate for a key derivation method, see Appendix [KEYBLIND]. + + Additionally, clients and hosts derive a subcredential for each + period. Knowledge of the subcredential is needed to decrypt hidden + service descriptors for each period and to authenticate with the + hidden service host in the introduction process. Unlike the + credential, it changes each period. Knowing the subcredential, even + in combination with the blinded private key, does not enable the + hidden service host to derive the main credential--therefore, it is + safe to put the subcredential on the hidden service host while + leaving the hidden service's private key offline. + + The subcredential for a period is derived as: + + N_hs_subcred = H("subcredential" | N_hs_cred | blinded-public-key). + + In the above formula, credential corresponds to: + + N_hs_cred = H("credential" | public-identity-key) + + where public-identity-key is the public identity master key of the hidden + service. + +2.2. Locating, uploading, and downloading hidden service descriptors + [HASHRING] + + To avoid attacks where a hidden service's descriptor is easily + targeted for censorship, we store them at different directories over + time, and use shared random values to prevent those directories from + being predictable far in advance. 
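The credential and subcredential derivations given in [SUBCRED] can be sketched in a few lines of Python. This is only an illustration: it assumes H is instantiated with SHA3-256 as suggested in section 0.3, treats the quoted prefixes as plain ASCII strings, and takes the blinded public key as an input (its derivation is the subject of [KEYBLIND]). The function names are hypothetical.

```python
import hashlib


def H(data: bytes) -> bytes:
    """The hash function H, assumed here to be SHA3-256 (section 0.3)."""
    return hashlib.sha3_256(data).digest()


def credential(public_identity_key: bytes) -> bytes:
    """N_hs_cred = H("credential" | public-identity-key)."""
    return H(b"credential" + public_identity_key)


def subcredential(n_hs_cred: bytes, blinded_public_key: bytes) -> bytes:
    """N_hs_subcred = H("subcredential" | N_hs_cred | blinded-public-key)."""
    return H(b"subcredential" + n_hs_cred + blinded_public_key)


# Dummy 32-byte values stand in for real keys; they are not valid Ed25519 keys.
cred = credential(bytes(32))
sub_a = subcredential(cred, b"\x01" * 32)
sub_b = subcredential(cred, b"\x02" * 32)
print(len(cred), len(sub_a), sub_a != sub_b)
```

Since the blinded public key changes each time period, the subcredential changes with it while the credential stays fixed.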
+ + Which Tor servers host a hidden service's descriptor depends on: + + * the current time period, + * the daily subcredential, + * the hidden service directories' public keys, + * a shared random value that changes in each time period, + shared_random_value, + * a set of network-wide networkstatus consensus parameters. + (Consensus parameters are integer values voted on by authorities + and published in the consensus documents, described in + dir-spec.txt, section 3.3.) + + Below we explain in more detail. + +2.2.1. Dividing time into periods [TIME-PERIODS] + + To prevent a single set of hidden service directories from becoming a + target for adversaries looking to permanently censor a hidden service, + hidden service descriptors are uploaded to different locations that + change over time. + + The length of a "time period" is controlled by the consensus + parameter 'hsdir-interval', and is a number of minutes between 30 and + 14400 (10 days). The default time period length is 1440 (one day). + + Time periods start at the Unix epoch (Jan 1, 1970), and are computed by + taking the number of minutes since the epoch and dividing by the time + period. However, we want our time periods to start at a regular offset + from the SRV voting schedule, so we subtract a "rotation time offset" + of 12 voting periods from the number of minutes since the epoch, before + dividing by the time period (effectively making "our" epoch start at Jan + 1, 1970 12:00UTC when the voting period is 1 hour.) + + Example: If the current time is 2016-04-13 11:15:01 UTC, then the number + of seconds since the epoch is 1460546101, and the number of minutes since + the epoch is 24342435. We then subtract the "rotation time offset" of 12*60 + minutes from the minutes since the epoch, to get 24341715. If the current + time period length is 1440 minutes, by doing the division we see that we + are currently in time period number 16903. 
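The arithmetic in the example above can be reproduced with a short Python sketch. The function name and its parameters are hypothetical. The defaults assume a time period length of 1440 minutes and a voting period of 60 minutes, as in the example, and the input is meant to be a consensus valid-after time expressed in seconds since the Unix epoch.

```python
from datetime import datetime, timezone


def time_period_num(unix_seconds: int,
                    period_length_min: int = 1440,
                    voting_interval_min: int = 60) -> int:
    """Return the time period number containing the given time."""
    minutes_since_epoch = unix_seconds // 60
    # The "rotation time offset" is 12 voting periods.
    rotation_offset_min = 12 * voting_interval_min
    return (minutes_since_epoch - rotation_offset_min) // period_length_min


now = int(datetime(2016, 4, 13, 11, 15, 1, tzinfo=timezone.utc).timestamp())
print(now)                   # 1460546101
print(time_period_num(now))  # 16903
```

Integer (floor) division is used throughout, which matches the truncation to whole minutes and whole periods in the worked example.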
+ + Specifically, time period #16903 began 16903*1440*60 + (12*60*60) seconds + after the epoch, at 2016-04-12 12:00 UTC, and ended at 16904*1440*60 + + (12*60*60) seconds after the epoch, at 2016-04-13 12:00 UTC. + +2.2.2. When to publish a hidden service descriptor [WHEN-HSDESC] + + Hidden services periodically publish their descriptor to the responsible + HSDirs. The set of responsible HSDirs is determined as specified in + [WHERE-HSDESC]. + + Specifically, every time a hidden service publishes its descriptor, it also + sets up a timer for a random time between 60 minutes and 120 minutes in the + future. When the timer triggers, the hidden service needs to publish its + descriptor again to the responsible HSDirs for that time period. + [TODO: Control republish period using a consensus parameter?] + +2.2.2.1. Overlapping descriptors + + Hidden services need to upload multiple descriptors so that they can be + reachable to clients with older or newer consensuses than them. Services + need to upload their descriptors to the HSDirs _before_ the beginning of + each upcoming time period, so that they are readily available for clients to + fetch them. Furthermore, services should keep uploading their old descriptor + even after the end of a time period, so that they can be reachable by + clients that still have consensuses from the previous time period. + + Hence, services maintain two active descriptors at every point. Clients on + the other hand, don't have a notion of overlapping descriptors, and instead + always download the descriptor for the current time period and shared random + value. It's the job of the service to ensure that descriptors will be + available for all clients. See section [FETCHUPLOADDESC] for how this is + achieved. + + [TODO: What to do when we run multiple hidden services in a single host?] + +2.2.3. Where to publish a hidden service descriptor [WHERE-HSDESC] + + This section specifies how the HSDir hash ring is formed at any given + time. 
Whenever a time value is needed (e.g. to get the current time period + number), we assume that clients and services use the valid-after time from + their latest live consensus. + + The following consensus parameters control where a hidden service + descriptor is stored; + + hsdir_n_replicas = an integer in range [1,16] with default value 2. + hsdir_spread_fetch = an integer in range [1,128] with default value 3. + hsdir_spread_store = an integer in range [1,128] with default value 4. + (Until 0.3.2.8-rc, the default was 3.) + + To determine where a given hidden service descriptor will be stored + in a given period, after the blinded public key for that period is + derived, the uploading or downloading party calculates: + + for replicanum in 1...hsdir_n_replicas: + hs_service_index(replicanum) = H("store-at-idx" | + blinded_public_key | + INT_8(replicanum) | + INT_8(period_length) | + INT_8(period_num) ) + + where blinded_public_key is specified in section [KEYBLIND], period_length + is the length of the time period in minutes, and period_num is calculated + using the current consensus "valid-after" as specified in section + [TIME-PERIODS]. + + Then, for each node listed in the current consensus with the HSDir flag, + we compute a directory index for that node as: + + hs_relay_index(node) = H("node-idx" | node_identity | + shared_random_value | + INT_8(period_num) | + INT_8(period_length) ) + + where shared_random_value is the shared value generated by the authorities + in section [PUB-SHAREDRANDOM], and node_identity is the ed25519 identity + key of the node. + + Finally, for replicanum in 1...hsdir_n_replicas, the hidden service + host uploads descriptors to the first hsdir_spread_store nodes whose + indices immediately follow hs_service_index(replicanum). If any of those + nodes have already been selected for a lower-numbered replica of the + service, any nodes already chosen are disregarded (i.e. 
skipped over) + when choosing a replica's hsdir_spread_store nodes. + + When choosing an HSDir to download from, clients choose randomly from + among the first hsdir_spread_fetch nodes after the indices. (Note + that, in order to make the system better tolerate disappearing + HSDirs, hsdir_spread_fetch may be less than hsdir_spread_store.) + Again, nodes from lower-numbered replicas are disregarded when + choosing the spread for a replica. + +2.2.4. Using time periods and SRVs to fetch/upload HS descriptors [FETCHUPLOADDESC] + + Hidden services and clients need to make correct use of time periods (TP) + and shared random values (SRVs) to successfully fetch and upload + descriptors. Furthermore, to avoid problems with skewed clocks, both clients + and services use the 'valid-after' time of a live consensus as a way to take + decisions with regards to uploading and fetching descriptors. By using the + consensus times as the ground truth here, we minimize the desynchronization + of clients and services due to system clock. Whenever time-based decisions + are taken in this section, assume that they are consensus times and not + system times. + + As [PUB-SHAREDRANDOM] specifies, consensuses contain two shared random + values (the current one and the previous one). Hidden services and clients + are asked to match these shared random values with descriptor time periods + and use the right SRV when fetching/uploading descriptors. This section + attempts to precisely specify how this works. + + Let's start with an illustration of the system: + + +------------------------------------------------------------------+ + | | + | 00:00 12:00 00:00 12:00 00:00 12:00 | + | SRV#1 TP#1 SRV#2 TP#2 SRV#3 TP#3 | + | | + | $==========|-----------$===========|-----------$===========| | + | | + | | + +------------------------------------------------------------------+ + + Legend: [TP#1 = Time Period #1] + [SRV#1 = Shared Random Value #1] + ["$" = descriptor rotation moment] + +2.2.4.1. 
Client behavior for fetching descriptors [CLIENTFETCH] + + And here is how clients use TPs and SRVs to fetch descriptors: + + Clients always aim to synchronize their TP with SRV, so they always want to + use TP#N with SRV#N: To achieve this wrt time periods, clients always use + the current time period when fetching descriptors. Now wrt SRVs, if a client + is in the time segment between a new time period and a new SRV (i.e. the + segments drawn with "-") it uses the current SRV, else if the client is in a + time segment between a new SRV and a new time period (i.e. the segments + drawn with "="), it uses the previous SRV. + + Example: + + +------------------------------------------------------------------+ + | | + | 00:00 12:00 00:00 12:00 00:00 12:00 | + | SRV#1 TP#1 SRV#2 TP#2 SRV#3 TP#3 | + | | + | $==========|-----------$===========|-----------$===========| | + | ^ ^ | + | C1 C2 | + +------------------------------------------------------------------+ + + If a client (C1) is at 13:00 right after TP#1, then it will use TP#1 and + SRV#1 for fetching descriptors. Also, if a client (C2) is at 01:00 right + after SRV#2, it will still use TP#1 and SRV#1. + +2.2.4.2. Service behavior for uploading descriptors [SERVICEUPLOAD] + + As discussed above, services maintain two active descriptors at any time. We + call these the "first" and "second" service descriptors. Services rotate + their descriptor every time they receive a consensus with a valid_after time + past the next SRV calculation time. They rotate their descriptors by + discarding their first descriptor, pushing the second descriptor to the + first, and rebuilding their second descriptor with the latest data. + + Services like clients also employ a different logic for picking SRV and TP + values based on their position in the graph above. Here is the logic: + +2.2.4.2.1. 
First descriptor upload logic [FIRSTDESCUPLOAD] + + Here is the service logic for uploading its first descriptor: + + When a service is in the time segment between a new time period and a new SRV + (i.e. the segments drawn with "-"), it uses the previous time period and + previous SRV for uploading its first descriptor: that's meant to cover + for clients that have a consensus that is still in the previous time period. + + Example: Consider in the above illustration that the service is at 13:00 + right after TP#1. It will upload its first descriptor using TP#0 and SRV#0. + So if a client still has an 11:00 consensus it will be able to access it + based on the client logic above. + + Now if a service is in the time segment between a new SRV and a new time + period (i.e. the segments drawn with "=") it uses the current time period + and the previous SRV for its first descriptor: that's meant to cover clients + with an up-to-date consensus in the same time period as the service. + + Example: + + +------------------------------------------------------------------+ + | | + | 00:00 12:00 00:00 12:00 00:00 12:00 | + | SRV#1 TP#1 SRV#2 TP#2 SRV#3 TP#3 | + | | + | $==========|-----------$===========|-----------$===========| | + | ^ | + | S | + +------------------------------------------------------------------+ + + Consider that the service is at 01:00 right after SRV#2: it will upload its + first descriptor using TP#1 and SRV#1. + +2.2.4.2.2. Second descriptor upload logic [SECONDDESCUPLOAD] + + Here is the service logic for uploading its second descriptor: + + When a service is in the time segment between a new time period and a new SRV + (i.e. the segments drawn with "-"), it uses the current time period and + current SRV for uploading its second descriptor: that's meant to cover for + clients that have an up-to-date consensus on the same TP as the service. 
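The selection rules in [CLIENTFETCH], [FIRSTDESCUPLOAD] and [SECONDDESCUPLOAD] can be summarized in a short sketch. It is illustrative only and not taken from any implementation. It assumes the rotation times drawn in the illustrations (new time period at 12:00, new SRV at 00:00), takes the hour from the consensus valid-after time, and numbers TPs and SRVs as the illustrations do. The function names are hypothetical. The "=" case of the second descriptor follows the rule stated in the remainder of [SECONDDESCUPLOAD].

```python
from typing import Tuple


def in_dash_segment(valid_after_hour: int) -> bool:
    """True in a "-" segment: after a new time period (12:00, as drawn in
    the illustrations) and before the next SRV (00:00). Assumed schedule."""
    return valid_after_hour >= 12


def client_fetch(cur_tp: int, cur_srv: int, hour: int) -> Tuple[int, int]:
    """(TP, SRV) numbers a client uses for fetching."""
    if in_dash_segment(hour):
        return (cur_tp, cur_srv)          # current TP, current SRV
    return (cur_tp, cur_srv - 1)          # current TP, previous SRV


def first_descriptor(cur_tp: int, cur_srv: int, hour: int) -> Tuple[int, int]:
    """(TP, SRV) numbers a service uses for its first descriptor."""
    if in_dash_segment(hour):
        return (cur_tp - 1, cur_srv - 1)  # previous TP, previous SRV
    return (cur_tp, cur_srv - 1)          # current TP, previous SRV


def second_descriptor(cur_tp: int, cur_srv: int, hour: int) -> Tuple[int, int]:
    """(TP, SRV) numbers a service uses for its second descriptor."""
    if in_dash_segment(hour):
        return (cur_tp, cur_srv)          # current TP, current SRV
    return (cur_tp + 1, cur_srv)          # next TP, current SRV
```

With the numbering of the illustrations, a party at 13:00 right after TP#1 has current TP 1 and current SRV 1, while a party at 01:00 right after SRV#2 has current TP 1 and current SRV 2. In both cases the client result matches one of the two descriptors the service has uploaded.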
+ + Example: Consider in the above illustration that the service is at 13:00 + right after TP#1: it will upload its second descriptor using TP#1 and SRV#1. + + Now if a service is in the time segment between a new SRV and a new time + period (i.e. the segments drawn with "=") it uses the next time period and + the current SRV for its second descriptor: that's meant to cover clients + with a newer consensus than the service (in the next time period). + + Example: + + +------------------------------------------------------------------+ + | | + | 00:00 12:00 00:00 12:00 00:00 12:00 | + | SRV#1 TP#1 SRV#2 TP#2 SRV#3 TP#3 | + | | + | $==========|-----------$===========|-----------$===========| | + | ^ | + | S | + +------------------------------------------------------------------+ + + Consider that the service is at 01:00 right after SRV#2: it will upload its + second descriptor using TP#2 and SRV#2. + +2.2.4.3. Directory behavior for handling descriptor uploads [DIRUPLOAD] + + Upon receiving a hidden service descriptor publish request, directories MUST + check the following: + + * The outer wrapper of the descriptor can be parsed according to + [DESC-OUTER] + * The version-number of the descriptor is "3" + * If the directory has already cached a descriptor for this hidden service, + the revision-counter of the uploaded descriptor must be greater than the + revision-counter of the cached one + * The descriptor signature is valid + + If any of these basic validity checks fails, the directory MUST reject the + descriptor upload. + + NOTE: Even if the descriptor passes the checks above, its first and second + layers could still be invalid: directories cannot validate the encrypted + layers of the descriptor, as they do not have access to the public key of the + service (required for decrypting the first layer of encryption), or the + necessary client credentials (for decrypting the second layer). + +2.2.5. 
Expiring hidden service descriptors [EXPIRE-DESC] + + Hidden services set their descriptor's "descriptor-lifetime" field to 180 + minutes (3 hours). Hidden services ensure that their descriptor will remain + valid in the HSDir caches, by republishing their descriptors periodically as + specified in [WHEN-HSDESC]. + + Hidden services MUST also keep their introduction circuits alive for as long + as descriptors including those intro points are valid (even if that's after + the time period has changed). + +2.2.6. URLs for anonymous uploading and downloading + + Hidden service descriptors conforming to this specification are uploaded + with an HTTP POST request to the URL /tor/hs//publish relative to + the hidden service directory's root, and downloaded with an HTTP GET + request for the URL /tor/hs// where is a base64 encoding of + the hidden service's blinded public key and is the protocol + version which is "3" in this case. + + These requests must be made anonymously, on circuits not used for + anything else. + +2.2.7. Client-side validation of onion addresses + + When a Tor client receives a prop224 onion address from the user, it + MUST first validate the onion address before attempting to connect or + fetch its descriptor. If the validation fails, the client MUST + refuse to connect. + + As part of the address validation, Tor clients should check that the + underlying ed25519 key does not have a torsion component. If Tor accepted + ed25519 keys with torsion components, attackers could create multiple + equivalent onion addresses for a single ed25519 key, which would map to the + same service. We want to avoid that because it could lead to phishing + attacks and surprising behaviors (e.g. imagine a browser plugin that blocks + onion addresses, but could be bypassed using an equivalent onion address + with a torsion component). 
+ + The right way for clients to detect such fraudulent addresses (which should + only occur malevolently and never naturally) is to extract the ed25519 + public key from the onion address and multiply it by the ed25519 group order + and ensure that the result is the ed25519 identity element. For more + details, please see [TORSION-REFS]. + +2.3. Publishing shared random values [PUB-SHAREDRANDOM] + + Our design for limiting the predictability of HSDir upload locations + relies on a shared random value (SRV) that isn't predictable in advance or + too influenceable by an attacker. The authorities must run a protocol + to generate such a value at least once per hsdir period. Here we + describe how they publish these values; the procedure they use to + generate them can change independently of the rest of this + specification. For more information see [SHAREDRANDOM-REFS]. + + According to proposal 250, we add two new lines in consensuses: + + "shared-rand-previous-value" SP NUM_REVEALS SP VALUE NL + "shared-rand-current-value" SP NUM_REVEALS SP VALUE NL + +2.3.1. Client behavior in the absence of shared random values + + If the previous or current shared random value cannot be found in a + consensus, then Tor clients and services need to generate their own random + value for use when choosing HSDirs. + + To do so, Tor clients and services use: + + SRV = H("shared-random-disaster" | INT_8(period_length) | INT_8(period_num)) + + where period_length is the length of a time period in minutes, + rounded down; period_num is calculated as specified in + [TIME-PERIODS] for the wanted shared random value that could not be + found originally. + +2.3.2. Hidden services and changing shared random values + + It's theoretically possible that the consensus shared random values will + change or disappear in the middle of a time period because of directory + authorities dropping offline or misbehaving. 
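The fallback value defined in "Client behavior in the absence of shared random values" above can be sketched as follows. This is a minimal illustration under two assumptions that come from definitions outside this section: H is taken to be SHA3-256, and INT_8(x) is taken to be the 8-byte big-endian encoding of x. The function names are hypothetical.

```python
import hashlib


def int_8(value: int) -> bytes:
    """Assumed meaning of INT_8: unsigned integer as 8 big-endian bytes."""
    return value.to_bytes(8, "big")


def disaster_srv(period_length: int, period_num: int) -> bytes:
    """SRV = H("shared-random-disaster" | INT_8(period_length) | INT_8(period_num)).

    period_length is in minutes; period_num is the time period number of
    the shared random value that could not be found in the consensus.
    H is assumed to be SHA3-256.
    """
    h = hashlib.sha3_256()
    h.update(b"shared-random-disaster")
    h.update(int_8(period_length))
    h.update(int_8(period_num))
    return h.digest()
```

Because the inputs are public, every client and service that lacks the same SRV derives the same fallback value and therefore selects the same HSDirs.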
+ + To avoid client reachability issues in this rare event, hidden services + should use the new shared random values to find the new responsible HSDirs + and upload their descriptors there. + + XXX How long should they upload descriptors there for? + +2.4. Hidden service descriptors: outer wrapper [DESC-OUTER] + + The format for a hidden service descriptor is as follows, using the + meta-format from dir-spec.txt. + + "hs-descriptor" SP version-number NL + + [At start, exactly once.] + + The version-number is a 32 bit unsigned integer indicating the version + of the descriptor. Current version is "3". + + "descriptor-lifetime" SP LifetimeMinutes NL + + [Exactly once] + + The lifetime of a descriptor in minutes. An HSDir SHOULD expire the + hidden service descriptor at least LifetimeMinutes after it was + uploaded. + + The LifetimeMinutes field can take values between 30 and 720 (12 + hours). + + "descriptor-signing-key-cert" NL certificate NL + + [Exactly once.] + + The 'certificate' field contains a certificate in the format from + proposal 220, wrapped with "-----BEGIN ED25519 CERT-----". The + certificate cross-certifies the short-term descriptor signing key with + the blinded public key. The certificate type must be [08], and the + blinded public key must be present as the signing-key extension. + + "revision-counter" SP Integer NL + + [Exactly once.] + + The revision number of the descriptor. If an HSDir receives a + second descriptor for a key that it already has a descriptor for, + it should retain and serve the descriptor with the higher + revision-counter. + + (Checking for monotonically increasing revision-counter values + prevents an attacker from replacing a newer descriptor signed by + a given key with a copy of an older version.) + + Implementations MUST be able to parse 64-bit values for these + counters. + + "superencrypted" NL encrypted-string + + [Exactly once.] + + An encrypted blob, whose format is discussed in [HS-DESC-ENC] below. 
The + blob is base64 encoded and enclosed in -----BEGIN MESSAGE----- and + -----END MESSAGE----- wrappers. (The resulting document does not end with + a newline character.) + + "signature" SP signature NL + + [exactly once, at end.] + + A signature of all previous fields, using the signing key in the + descriptor-signing-key-cert line, prefixed by the string "Tor onion + service descriptor sig v3". We use a separate key for signing, so that + the hidden service host does not need to have its private blinded key + online. + + HSDirs accept hidden service descriptors of up to 50k bytes (a consensus + parameter should also be introduced to control this value). + +2.5. Hidden service descriptors: encryption format [HS-DESC-ENC] + + Hidden service descriptors are protected by two layers of encryption. + Clients need to decrypt both layers to connect to the hidden service. + + The first layer of encryption provides confidentiality against entities who + don't know the public key of the hidden service (e.g. HSDirs), while the + second layer of encryption is only useful when client authorization is enabled + and protects against entities that do not possess valid client credentials. + +2.5.1. First layer of encryption [HS-DESC-FIRST-LAYER] + + The first layer of HS descriptor encryption is designed to protect + descriptor confidentiality against entities who don't know the public + identity key of the hidden service. + +2.5.1.1. First layer encryption logic + + The encryption keys and format for the first layer of encryption are + generated as specified in [HS-DESC-ENCRYPTION-KEYS] with customization + parameters: + + SECRET_DATA = blinded-public-key + STRING_CONSTANT = "hsdir-superencrypted-data" + + The encryption scheme in [HS-DESC-ENCRYPTION-KEYS] uses the service + credential which is derived from the public identity key (see [SUBCRED]) to + ensure that only entities who know the public identity key can decrypt the + first descriptor layer. 
+ + The ciphertext is placed on the "superencrypted" field of the descriptor. + + Before encryption the plaintext is padded with NUL bytes to the nearest + multiple of 10k bytes. + +2.5.1.2. First layer plaintext format + + After clients decrypt the first layer of encryption, they need to parse the + plaintext to get to the second layer ciphertext which is contained in the + "encrypted" field. + + If client auth is enabled, the hidden service generates a fresh + descriptor_cookie key (`N_hs_desc_enc`, 32 random bytes) and encrypts + it using each authorized client's identity x25519 key. Authorized + clients can use the descriptor cookie (`N_hs_desc_enc`) to decrypt + the second (inner) layer of encryption. Our encryption scheme + requires the hidden service to also generate an ephemeral x25519 + keypair for each new descriptor. + + If client auth is disabled, fake data is placed in each of the fields below + to obfuscate whether client authorization is enabled. + + Here are all the supported fields: + + "desc-auth-type" SP type NL + + [Exactly once] + + This field contains the type of authorization used to protect the + descriptor. The only recognized type is "x25519" and specifies the + encryption scheme described in this section. + + If client authorization is disabled, the value here should be "x25519". + + "desc-auth-ephemeral-key" SP KP_hs_desc_ephem NL + + [Exactly once] + + This field contains `KP_hss_desc_enc`, an ephemeral x25519 public + key generated by the hidden service and encoded in base64. The key + is used by the encryption scheme below. + + If client authorization is disabled, the value here should be a fresh + x25519 pubkey that will remain unused. + + "auth-client" SP client-id SP iv SP encrypted-cookie + + [At least once] + + When client authorization is enabled, the hidden service inserts an + "auth-client" line for each of its authorized clients. 
If client + authorization is disabled, the fields here can be populated with random + data of the right size (that's 8 bytes for 'client-id', 16 bytes for 'iv' + and 16 bytes for 'encrypted-cookie' all encoded with base64). + + When client authorization is enabled, each "auth-client" line + contains the descriptor cookie `N_hs_desc_enc` encrypted to each + individual client. We assume that each authorized client possesses + a pre-shared x25519 keypair (`KP_hsc_desc_enc`) which is used to + decrypt the descriptor cookie. + + We now describe the descriptor cookie encryption scheme. Here is what + the hidden service computes: + + SECRET_SEED = x25519(KS_hs_desc_ephem, KP_hsc_desc_enc) + KEYS = KDF(N_hs_subcred | SECRET_SEED, 40) + CLIENT-ID = first 8 bytes of KEYS + COOKIE-KEY = last 32 bytes of KEYS + + Here is a description of the fields in the "auth-client" line: + + - The "client-id" field is CLIENT-ID from above encoded in base64. + + - The "iv" field is 16 random bytes encoded in base64. + + - The "encrypted-cookie" field contains the descriptor cookie ciphertext + as follows and is encoded in base64: + encrypted-cookie = STREAM(iv, COOKIE-KEY) XOR N_hs_desc_enc. + + See section [FIRST-LAYER-CLIENT-BEHAVIOR] for the client-side logic of + how to decrypt the descriptor cookie. + + "encrypted" NL encrypted-string + + [Exactly once] + + An encrypted blob containing the second layer ciphertext, whose format is + discussed in [HS-DESC-SECOND-LAYER] below. The blob is base64 encoded + and enclosed in -----BEGIN MESSAGE----- and -----END MESSAGE----- wrappers. + + Compatibility note: The C Tor implementation does not include a final + newline when generating this first-layer-plaintext section; other + implementations MUST accept this section even if it is missing its final + newline. Other implementations MAY generate this section without a final + newline themselves, to avoid being distinguishable from C tor. + +2.5.1.3. 
Client behavior [FIRST-LAYER-CLIENT-BEHAVIOR] + + The goal of clients at this stage is to decrypt the "encrypted" field as + described in [HS-DESC-SECOND-LAYER]. + + If client authorization is enabled, authorized clients need to extract the + descriptor cookie to proceed with decryption of the second layer as + follows: + + An authorized client parsing the first layer of an encrypted descriptor, + extracts the ephemeral key from "desc-auth-ephemeral-key" and calculates + CLIENT-ID and COOKIE-KEY as described in the section above using their + x25519 private key. The client then uses CLIENT-ID to find the right + "auth-client" field which contains the ciphertext of the descriptor + cookie. The client then uses COOKIE-KEY and the iv to decrypt the + descriptor_cookie, which is used to decrypt the second layer of descriptor + encryption as described in [HS-DESC-SECOND-LAYER]. + +2.5.1.4. Hiding client authorization data + + Hidden services should avoid leaking whether client authorization is + enabled or how many authorized clients there are. + + Hence even when client authorization is disabled, the hidden service adds + fake "desc-auth-type", "desc-auth-ephemeral-key" and "auth-client" lines to + the descriptor, as described in [HS-DESC-FIRST-LAYER]. + + The hidden service also avoids leaking the number of authorized clients by + adding fake "auth-client" entries to its descriptor. Specifically, + descriptors always contain a number of authorized clients that is a + multiple of 16 by adding fake "auth-client" entries if needed. + [XXX consider randomization of the value 16] + + Clients MUST accept descriptors with any number of "auth-client" lines as + long as the total descriptor size is within the max limit of 50k (also + controlled with a consensus parameter). + +2.5.2. Second layer of encryption [HS-DESC-SECOND-LAYER] + + The second layer of descriptor encryption is designed to protect descriptor + confidentiality against unauthorized clients. 
If client authorization is + enabled, it's encrypted using the descriptor_cookie, and contains needed + information for connecting to the hidden service, like the list of its + introduction points. + + If client authorization is disabled, then the second layer of HS encryption + does not offer any additional security, but is still used. + +2.5.2.1. Second layer encryption keys + + The encryption keys and format for the second layer of encryption are + generated as specified in [HS-DESC-ENCRYPTION-KEYS] with customization + parameters as follows: + + SECRET_DATA = blinded-public-key | descriptor_cookie + STRING_CONSTANT = "hsdir-encrypted-data" + + If client authorization is disabled the 'descriptor_cookie' field is left blank. + + The ciphertext is placed on the "encrypted" field of the descriptor. + +2.5.2.2. Second layer plaintext format + + After decrypting the second layer ciphertext, clients can finally learn the + list of intro points etc. The plaintext has the following format: + + "create2-formats" SP formats NL + + [Exactly once] + + A space-separated list of integers denoting CREATE2 cell HTYPEs + (handshake types) that the server recognizes. Must include at least + ntor as described in tor-spec.txt. See tor-spec section 5.1 for a list + of recognized handshake types. + + "intro-auth-required" SP types NL + + [At most once] + + A space-separated list of introduction-layer authentication types; see + section [INTRO-AUTH] for more info. A client that does not support at + least one of these authentication types will not be able to contact the + host. Recognized types are: 'ed25519'. + + "single-onion-service" + + [None or at most once] + + If present, this line indicates that the service is a Single Onion + Service (see prop260 for more details about that type of service). This + field was introduced in 0.3.0, meaning 0.2.9 services don't include + this. 
+ + Followed by zero or more introduction points as follows (see section + [NUM_INTRO_POINT] below for accepted values): + + "introduction-point" SP link-specifiers NL + + [Exactly once per introduction point at start of introduction + point section] + + The link-specifiers is a base64 encoding of a link specifier + block in the format described in [BUILDING-BLOCKS] above. + + As of 0.4.1.1-alpha, services include both IPv4 and IPv6 link + specifiers in descriptors. All available addresses SHOULD be + included in the descriptor, regardless of the address that the + onion service actually used to connect/extend to the intro + point. + + The client SHOULD NOT reject any LSTYPE fields which it doesn't + recognize; instead, it should use them verbatim in its EXTEND + request to the introduction point. + + The client SHOULD perform the basic validity checks on the link + specifiers in the descriptor, described in `tor-spec.txt` + section 5.1.2. These checks SHOULD NOT leak + detailed information about the client's version, configuration, + or consensus. (See 3.3 for service link specifier handling.) + + When connecting to the introduction point, the client SHOULD send + this list of link specifiers verbatim, in the same order as given + here. + + The client MAY reject the list of link specifiers if it is + inconsistent with relay information from the directory, but SHOULD + NOT modify it. + + "onion-key" SP "ntor" SP key NL + + [Exactly once per introduction point] + + The key is a base64 encoded curve25519 public key which is the onion + key of the introduction point Tor node used for the ntor handshake + when a client extends to it. + + "onion-key" SP KeyType SP key.. NL + + [Any number of times] + + Implementations should accept other types of onion keys using this + syntax (where "KeyType" is some string other than "ntor"); + unrecognized key types should be ignored. 
+ + "auth-key" NL certificate NL + + [Exactly once per introduction point] + + The certificate is a proposal 220 certificate wrapped in + "-----BEGIN ED25519 CERT-----". It contains the introduction + point authentication key (`KP_hs_ipt_sid`), signed by + the descriptor signing key (`KP_hs_desc_sign`). The + certificate type must be [09], and the signing key extension + is mandatory. + + NOTE: This certificate was originally intended to be + constructed the other way around: the signing and signed keys + are meant to be reversed. However, C tor implemented it + backwards, and other implementations now need to do the same + in order to conform. (Since this section is inside the + descriptor, which is _already_ signed by `KP_hs_desc_sign`, + the verification aspect of this certificate serves no point in + its current form.) + + "enc-key" SP "ntor" SP key NL + + [Exactly once per introduction point] + + The key is a base64 encoded curve25519 public key used to encrypt + the introduction request to service. (`KP_hss_ntor`) + + "enc-key" SP KeyType SP key.. NL + + [Any number of times] + + Implementations should accept other types of onion keys using this + syntax (where "KeyType" is some string other than "ntor"); + unrecognized key types should be ignored. + + "enc-key-cert" NL certificate NL + + [Exactly once per introduction point] + + Cross-certification of the encryption key using the descriptor + signing key. + + For "ntor" keys, certificate is a proposal 220 certificate + wrapped in "-----BEGIN ED25519 CERT-----" armor. The subject + key is the ed25519 equivalent of a curve25519 public + encryption key (`KP_hss_ntor`), with the ed25519 key + derived using the process in proposal 228 appendix A. The + signing key is the descriptor signing key (`KP_hs_desc_sign`). + The certificate type must be [0B], and the signing-key + extension is mandatory. + + NOTE: As with "auth-key", this certificate was intended to be + constructed the other way around. 
However, for compatibility + with C tor, implementations need to construct it this way. It + serves even less point than "auth-key", however, since the + encryption key `KP_hss_ntor` is already available from + the `enc-key` entry. + + "legacy-key" NL key NL + + [None or at most once per introduction point] + [This field is obsolete and should never be generated; it + is included for historical reasons only.] + + The key is an ASN.1 encoded RSA public key in PEM format used for a + legacy introduction point as described in [LEGACY_EST_INTRO]. + + This field is only present if the introduction point only supports + legacy protocol (v2) that is <= 0.2.9 or the protocol version value + "HSIntro 3". + + "legacy-key-cert" NL certificate NL + + [None or at most once per introduction point] + [This field is obsolete and should never be generated; it + is included for historical reasons only.] + + MUST be present if "legacy-key" is present. + + The certificate is a proposal 220 RSA->Ed cross-certificate wrapped + in "-----BEGIN CROSSCERT-----" armor, cross-certifying the RSA + public key found in "legacy-key" using the descriptor signing key. + + To remain compatible with future revisions to the descriptor format, + clients should ignore unrecognized lines in the descriptor. + Other encryption and authentication key formats are allowed; clients + should ignore ones they do not recognize. + + Clients who manage to extract the introduction points of the hidden service + can proceed with the introduction protocol as specified in [INTRO-PROTOCOL]. + + Compatibility note: At least some versions of OnionBalance do not include + a final newline when generating this inner plaintext section; other + implementations MUST accept this section even if it is missing its final + newline. + +2.5.3. Deriving hidden service descriptor encryption keys [HS-DESC-ENCRYPTION-KEYS] + + In this section we present the generic encryption format for hidden service + descriptors. 
We use the same encryption format in both encryption layers, + hence we introduce two customization parameters SECRET_DATA and + STRING_CONSTANT which vary between the layers. + + The SECRET_DATA parameter specifies the secret data that are used during + encryption key generation, while STRING_CONSTANT is merely a string constant + that is used as part of the KDF. + + Here is the key generation logic: + + SALT = 16 bytes from H(random), changes each time we rebuild the + descriptor even if the content of the descriptor hasn't changed. + (So that we don't leak whether the intro point list etc. changed) + + secret_input = SECRET_DATA | N_hs_subcred | INT_8(revision_counter) + + keys = KDF(secret_input | salt | STRING_CONSTANT, S_KEY_LEN + S_IV_LEN + MAC_KEY_LEN) + + SECRET_KEY = first S_KEY_LEN bytes of keys + SECRET_IV = next S_IV_LEN bytes of keys + MAC_KEY = last MAC_KEY_LEN bytes of keys + + The encrypted data has the format: + + SALT hashed random bytes from above [16 bytes] + ENCRYPTED The ciphertext [variable] + MAC D_MAC of both above fields [32 bytes] + + The final encryption format is ENCRYPTED = STREAM(SECRET_IV,SECRET_KEY) XOR Plaintext. + + Where D_MAC = H(mac_key_len | MAC_KEY | salt_len | SALT | ENCRYPTED) + and + mac_key_len = htonll(len(MAC_KEY)) + and + salt_len = htonll(len(SALT)). + +2.5.4. Number of introduction points [NUM_INTRO_POINT] + + This section defines how many introduction points a hidden service + descriptor can have at minimum, by default, and at maximum: + + Minimum: 0 - Default: 3 - Maximum: 20 + + A value of 0 means that the service is still alive but doesn't want + to be reached by any client at the moment. Note that the descriptor size + increases considerably as more introduction points are added. 
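The key generation logic and D_MAC from [HS-DESC-ENCRYPTION-KEYS] above can be sketched as follows. This is a non-authoritative illustration. It assumes values that are defined outside this section: KDF is taken to be SHAKE-256, H to be SHA3-256, S_KEY_LEN = 32, S_IV_LEN = 16, MAC_KEY_LEN = 32, and both INT_8() and htonll() to produce 8 big-endian bytes. The STREAM step is omitted because it needs a cipher that the Python standard library does not provide. The function names are hypothetical.

```python
import hashlib
import struct
from typing import Tuple

# Assumed lengths; they are defined elsewhere in the specification.
S_KEY_LEN, S_IV_LEN, MAC_KEY_LEN = 32, 16, 32


def derive_keys(secret_data: bytes, subcredential: bytes,
                revision_counter: int, salt: bytes,
                string_constant: bytes) -> Tuple[bytes, bytes, bytes]:
    """Return (SECRET_KEY, SECRET_IV, MAC_KEY) for one descriptor layer."""
    # secret_input = SECRET_DATA | N_hs_subcred | INT_8(revision_counter)
    secret_input = secret_data + subcredential + struct.pack(">Q", revision_counter)
    total = S_KEY_LEN + S_IV_LEN + MAC_KEY_LEN
    # KDF assumed to be SHAKE-256 with the requested output length.
    keys = hashlib.shake_256(secret_input + salt + string_constant).digest(total)
    secret_key = keys[:S_KEY_LEN]
    secret_iv = keys[S_KEY_LEN:S_KEY_LEN + S_IV_LEN]
    mac_key = keys[S_KEY_LEN + S_IV_LEN:]
    return secret_key, secret_iv, mac_key


def d_mac(mac_key: bytes, salt: bytes, encrypted: bytes) -> bytes:
    """D_MAC = H(mac_key_len | MAC_KEY | salt_len | SALT | ENCRYPTED)."""
    h = hashlib.sha3_256()  # H assumed to be SHA3-256
    h.update(struct.pack(">Q", len(mac_key)))  # htonll(len(MAC_KEY))
    h.update(mac_key)
    h.update(struct.pack(">Q", len(salt)))     # htonll(len(SALT))
    h.update(salt)
    h.update(encrypted)
    return h.digest()
```

For the first layer, secret_data would be the blinded public key and string_constant would be b"hsdir-superencrypted-data"; for the second layer, secret_data would be the blinded public key followed by the descriptor cookie and string_constant would be b"hsdir-encrypted-data".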
+ + The reason for a maximum value of 20 is to give enough scalability to tools + like OnionBalance to be able to load balance up to 120 servers (20 x 6 + HSDirs), and also to keep the descriptor size from overwhelming hidden + service directories with user-defined values that could be gigantic. + +3. The introduction protocol [INTRO-PROTOCOL] + + The introduction protocol proceeds in three steps. + + First, a hidden service host builds an anonymous circuit to a Tor + node and registers that circuit as an introduction point. + + Single Onion Services attempt to build a non-anonymous single-hop circuit, + but use an anonymous 3-hop circuit if: + + * the intro point is on an address that is configured as unreachable via + a direct connection, or + * the initial attempt to connect to the intro point over a single-hop + circuit fails, and they are retrying the intro point connection. + + [After 'First' and before 'Second', the hidden service publishes its + introduction points and associated keys, and the client fetches + them as described in section [HSDIR] above.] + + Second, a client builds an anonymous circuit to the introduction + point, and sends an introduction request. + + Third, the introduction point relays the introduction request along + the introduction circuit to the hidden service host, and acknowledges + the introduction request to the client. + +3.1. Registering an introduction point [REG_INTRO_POINT] + +3.1.1. Extensible ESTABLISH_INTRO protocol. 
[EST_INTRO] + + When a hidden service is establishing a new introduction point, it + sends an ESTABLISH_INTRO cell with the following contents: + + AUTH_KEY_TYPE [1 byte] + AUTH_KEY_LEN [2 bytes] + AUTH_KEY [AUTH_KEY_LEN bytes] + N_EXTENSIONS [1 byte] + N_EXTENSIONS times: + EXT_FIELD_TYPE [1 byte] + EXT_FIELD_LEN [1 byte] + EXT_FIELD [EXT_FIELD_LEN bytes] + HANDSHAKE_AUTH [MAC_LEN bytes] + SIG_LEN [2 bytes] + SIG [SIG_LEN bytes] + + The AUTH_KEY_TYPE field indicates the type of the introduction point + authentication key and the type of the MAC to use in + HANDSHAKE_AUTH. Recognized types are: + + [00, 01] -- Reserved for legacy introduction cells; see + [LEGACY_EST_INTRO] below + [02] -- Ed25519; SHA3-256. + + The AUTH_KEY_LEN field determines the length of the AUTH_KEY + field. The AUTH_KEY field contains the public introduction point + authentication key, KP_hs_ipt_sid. + + The EXT_FIELD_TYPE, EXT_FIELD_LEN, EXT_FIELD entries are reserved for + extensions to the introduction protocol. Extensions with + unrecognized EXT_FIELD_TYPE values must be ignored. + (`EXT_FIELD_LEN` may be zero, in which case EXT_FIELD is absent.) + + Unless otherwise specified in the documentation for an extension type: + * Each extension type SHOULD be sent only once in a message. + * Parties MUST ignore all occurrences of an extension + with a given type after the first such occurrence. + * Extensions SHOULD be sent in numerically ascending order by type. + (The above extension sorting and multiplicity rules are only defaults; + they may be overridden in the descriptions of individual extensions.) + + The HANDSHAKE_AUTH field contains the MAC of all earlier fields in + the cell using as its key the shared per-circuit material ("KH") + generated during the circuit extension protocol; see tor-spec.txt + section 5.2, "Setting circuit keys". It prevents replays of + ESTABLISH_INTRO cells. + + SIG_LEN is the length of the signature. 
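The layout of the fields that precede HANDSHAKE_AUTH can be illustrated with a small encoder and decoder. This is a sketch and not taken from any implementation. It assumes that multi-byte integers are encoded in network (big-endian) byte order, and it does no bounds checking. The function names are hypothetical.

```python
import struct
from typing import List, Tuple

Extension = Tuple[int, bytes]  # (EXT_FIELD_TYPE, EXT_FIELD)


def pack_establish_intro_prefix(auth_key: bytes,
                                extensions: List[Extension]) -> bytes:
    """Encode AUTH_KEY_TYPE through the last extension (big-endian assumed)."""
    out = struct.pack(">BH", 0x02, len(auth_key)) + auth_key  # [02] = Ed25519
    out += struct.pack(">B", len(extensions))                 # N_EXTENSIONS
    for ext_type, ext_body in extensions:
        out += struct.pack(">BB", ext_type, len(ext_body)) + ext_body
    return out


def parse_establish_intro_prefix(data: bytes):
    """Decode the same fields; returns (type, key, extensions, bytes consumed)."""
    auth_key_type, auth_key_len = struct.unpack_from(">BH", data, 0)
    pos = 3
    auth_key = data[pos:pos + auth_key_len]
    pos += auth_key_len
    n_extensions = data[pos]
    pos += 1
    extensions = []
    for _ in range(n_extensions):
        ext_type, ext_len = struct.unpack_from(">BB", data, pos)
        pos += 2
        extensions.append((ext_type, data[pos:pos + ext_len]))
        pos += ext_len
    return auth_key_type, auth_key, extensions, pos
```

The returned offset marks where HANDSHAKE_AUTH begins, so data[:pos] is the input that the MAC covers.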
+ + SIG is a signature, using AUTH_KEY, of all contents of the cell, up + to but not including SIG_LEN and SIG. These contents are prefixed + with the string "Tor establish-intro cell v1". + + Upon receiving an ESTABLISH_INTRO cell, a Tor node first decodes the + key and the signature, and checks the signature. The node must reject + the ESTABLISH_INTRO cell and destroy the circuit in these cases: + + * If the key type is unrecognized + * If the key is ill-formatted + * If the signature is incorrect + * If the HANDSHAKE_AUTH value is incorrect + + * If the circuit is already a rendezvous circuit. + * If the circuit is already an introduction circuit. + [TODO: some scalability designs fail there.] + * If the key is already in use by another circuit. + + Otherwise, the node must associate the key with the circuit, for use + later in INTRODUCE1 cells. + +3.1.1.1. Denial-of-Service Defense Extension. [EST_INTRO_DOS_EXT] + + This extension can be used to send Denial-of-Service (DoS) parameters to + the introduction point in order for it to apply them for the introduction + circuit. + + If used, it needs to be encoded within the N_EXTENSIONS field of the + ESTABLISH_INTRO cell defined in the previous section. The content is + defined as follows: + + EXT_FIELD_TYPE: + + [01] -- Denial-of-Service Parameters. + + If this flag is set, the extension should be used by the introduction + point to learn what values the denial of service subsystem should be + using. + + EXT_FIELD content format is: + + N_PARAMS [1 byte] + N_PARAMS times: + PARAM_TYPE [1 byte] + PARAM_VALUE [8 bytes] + + The PARAM_TYPE possible values are: + + [01] -- DOS_INTRODUCE2_RATE_PER_SEC + The rate per second of INTRODUCE2 cells relayed to the + service. + + [02] -- DOS_INTRODUCE2_BURST_PER_SEC + The burst per second of INTRODUCE2 cells relayed to the + service. + + The PARAM_VALUE size is 8 bytes in order to accommodate 64-bit values. 
+ It MUST match the specified limit for the following PARAM_TYPE: + + [01] -- Min: 0, Max: 2147483647 + [02] -- Min: 0, Max: 2147483647 + + A value of 0 means the defense is disabled. If the rate per second is + set to 0 (param 0x01) then the burst value should be ignored. And + vice-versa, if the burst value is 0 (param 0x02), then the rate value + should be ignored. In other words, setting one single parameter to 0 + disables the defense. + + The burst can NOT be smaller than the rate. If it is, the parameters + should be ignored by the introduction point. + + Any valid value takes precedence over the network wide consensus + parameter. + + Using this extension extends the payload of the ESTABLISH_INTRO cell by 21 + bytes (a 2-byte extension header plus 19 bytes of content), bringing it + from 134 bytes to 155 bytes. + + This extension can only be used with relays supporting the protocol version + "HSIntro=5". + + Introduced in tor-0.4.2.1-alpha. + +3.1.2. Registering an introduction point on a legacy Tor node + [LEGACY_EST_INTRO] + + [This section is obsolete and refers to a workaround for now-obsolete Tor + relay versions. It is included for historical reasons.] + + Tor nodes should also support an older version of the ESTABLISH_INTRO + cell, first documented in rend-spec.txt. New hidden service hosts + must use this format when establishing introduction points at older + Tor nodes that do not support the format above in [EST_INTRO]. + + In this older protocol, an ESTABLISH_INTRO cell contains: + + KEY_LEN [2 bytes] + KEY [KEY_LEN bytes] + HANDSHAKE_AUTH [20 bytes] + SIG [variable, up to end of relay payload] + + The KEY_LEN variable determines the length of the KEY field. + + The KEY field is the ASN1-encoded legacy RSA public key that was also + included in the hidden service descriptor. + + The HANDSHAKE_AUTH field contains the SHA1 digest of (KH | "INTRODUCE"). + + The SIG field contains an RSA signature, using PKCS1 padding, of all + earlier fields.
+ + Older versions of Tor always use a 1024-bit RSA key for these introduction + authentication keys. + +3.1.3. Acknowledging establishment of introduction point [INTRO_ESTABLISHED] + + After setting up an introduction circuit, the introduction point reports its + status back to the hidden service host with an INTRO_ESTABLISHED cell. + + The INTRO_ESTABLISHED cell has the following contents: + + N_EXTENSIONS [1 byte] + N_EXTENSIONS times: + EXT_FIELD_TYPE [1 byte] + EXT_FIELD_LEN [1 byte] + EXT_FIELD [EXT_FIELD_LEN bytes] + + Older versions of Tor send back an empty INTRO_ESTABLISHED cell instead. + Services must accept an empty INTRO_ESTABLISHED cell from a legacy relay. + [The above paragraph is obsolete and refers to a workaround for + now-obsolete Tor relay versions. It is included for historical reasons.] + + The same rules for multiplicity, ordering, and handling unknown types + apply to the extension fields here as described [EST_INTRO] above. + + +3.2. Sending an INTRODUCE1 cell to the introduction point. [SEND_INTRO1] + + In order to participate in the introduction protocol, a client must + know the following: + + * An introduction point for a service. + * The introduction authentication key for that introduction point. + * The introduction encryption key for that introduction point. + + The client sends an INTRODUCE1 cell to the introduction point, + containing an identifier for the service, an identifier for the + encryption key that the client intends to use, and an opaque blob to + be relayed to the hidden service host. + + In reply, the introduction point sends an INTRODUCE_ACK cell back to + the client, either informing it that its request has been delivered, + or that its request will not succeed. + + [TODO: specify what tor should do when receiving a malformed cell. Drop it? + Kill circuit? This goes for all possible cells.] + +3.2.1. 
INTRODUCE1 cell format [FMT_INTRO1] + + When a client is connecting to an introduction point, INTRODUCE1 cells + should be of the form: + + LEGACY_KEY_ID [20 bytes] + AUTH_KEY_TYPE [1 byte] + AUTH_KEY_LEN [2 bytes] + AUTH_KEY [AUTH_KEY_LEN bytes] + N_EXTENSIONS [1 byte] + N_EXTENSIONS times: + EXT_FIELD_TYPE [1 byte] + EXT_FIELD_LEN [1 byte] + EXT_FIELD [EXT_FIELD_LEN bytes] + ENCRYPTED [Up to end of relay payload] + + AUTH_KEY_TYPE is defined as in [EST_INTRO]. Currently, the only value of + AUTH_KEY_TYPE for this cell is an Ed25519 public key [02]. + + The LEGACY_KEY_ID field is used to distinguish between legacy and new style + INTRODUCE1 cells. In new style INTRODUCE1 cells, LEGACY_KEY_ID is 20 zero + bytes. Upon receiving an INTRODUCE1 cell, the introduction point checks the + LEGACY_KEY_ID field. If LEGACY_KEY_ID is non-zero, the INTRODUCE1 cell + should be handled as a legacy INTRODUCE1 cell by the intro point. + + Upon receiving an INTRODUCE1 cell, the introduction point checks + whether AUTH_KEY matches the introduction point authentication key for an + active introduction circuit. If so, the introduction point sends an + INTRODUCE2 cell with exactly the same contents to the service, and sends an + INTRODUCE_ACK response to the client. + + (Note that the introduction point does not "clean up" the + INTRODUCE1 cells that it retransmits. Specifically, it does not + change the order or multiplicity of the extensions sent by the + client.) + + The same rules for multiplicity, ordering, and handling unknown types + apply to the extension fields here as described [EST_INTRO] above. + + +3.2.2. INTRODUCE_ACK cell format. [INTRO_ACK] + + An INTRODUCE_ACK cell has the following fields: + + STATUS [2 bytes] + N_EXTENSIONS [1 byte] + N_EXTENSIONS times: + EXT_FIELD_TYPE [1 byte] + EXT_FIELD_LEN [1 byte] + EXT_FIELD [EXT_FIELD_LEN bytes] + + Recognized status values are: + + [00 00] -- Success: cell relayed to hidden service host.
+ [00 01] -- Failure: service ID not recognized + [00 02] -- Bad message format + [00 03] -- Can't relay cell to service + + The same rules for multiplicity, ordering, and handling unknown types + apply to the extension fields here as described [EST_INTRO] above. + + +3.3. Processing an INTRODUCE2 cell at the hidden service. [PROCESS_INTRO2] + + Upon receiving an INTRODUCE2 cell, the hidden service host checks whether + the AUTH_KEY or LEGACY_KEY_ID field matches the keys for this + introduction circuit. + + The service host then checks whether it has received a cell with these + contents or rendezvous cookie before. If it has, it silently drops it as a + replay. (It must maintain a replay cache for as long as it accepts cells + with the same encryption key. Note that the encryption format below should + be non-malleable.) + + If the cell is not a replay, it decrypts the ENCRYPTED field, + establishes a shared key with the client, and authenticates the whole + contents of the cell as having been unmodified since they left the + client. There may be multiple ways of decrypting the ENCRYPTED field, + depending on the chosen type of the encryption key. Requirements for + an introduction handshake protocol are described in + [INTRO-HANDSHAKE-REQS]. We specify one below in section + [NTOR-WITH-EXTRA-DATA]. 
+ + The decrypted plaintext must have the form: + + RENDEZVOUS_COOKIE [20 bytes] + N_EXTENSIONS [1 byte] + N_EXTENSIONS times: + EXT_FIELD_TYPE [1 byte] + EXT_FIELD_LEN [1 byte] + EXT_FIELD [EXT_FIELD_LEN bytes] + ONION_KEY_TYPE [1 bytes] + ONION_KEY_LEN [2 bytes] + ONION_KEY [ONION_KEY_LEN bytes] + NSPEC (Number of link specifiers) [1 byte] + NSPEC times: + LSTYPE (Link specifier type) [1 byte] + LSLEN (Link specifier length) [1 byte] + LSPEC (Link specifier) [LSLEN bytes] + PAD (optional padding) [up to end of plaintext] + + Upon processing this plaintext, the hidden service makes sure that + any required authentication is present in the extension fields, and + then extends a rendezvous circuit to the node described in the LSPEC + fields, using the ONION_KEY to complete the extension. As mentioned + in [BUILDING-BLOCKS], the "TLS-over-TCP, IPv4" and "Legacy node + identity" specifiers must be present. + + As of 0.4.1.1-alpha, clients include both IPv4 and IPv6 link specifiers + in INTRODUCE1 cells. All available addresses SHOULD be included in the + cell, regardless of the address that the client actually used to extend + to the rendezvous point. + + The hidden service should handle invalid or unrecognised link specifiers + the same way as clients do in section 2.5.2.2. In particular, services + SHOULD perform basic validity checks on link specifiers, and SHOULD NOT + reject unrecognised link specifiers, to avoid information leaks. + The list of link specifiers received here SHOULD either be rejected, or + sent verbatim when extending to the rendezvous point, in the same order + received. + + The service MAY reject the list of link specifiers if it is + inconsistent with relay information from the directory, but SHOULD + NOT modify it. + + The ONION_KEY_TYPE field is: + + [01] NTOR: ONION_KEY is 32 bytes long. + + The ONION_KEY field describes the onion key that must be used when + extending to the rendezvous point. 
It must be of a type listed as + supported in the hidden service descriptor. + + The PAD field should be filled with zeros; its size should be chosen + so that the INTRODUCE2 message occupies a fixed maximum size, in + order to hide the length of the encrypted data. (This maximum size is + 490, since we assume that a future Tor implementations will implement + proposal 340 and thus lower the number of bytes that can be contained + in a single relay message.) Note also that current versions of Tor + only pad the INTRODUCE2 message up to 246 bytes. + + Upon receiving a well-formed INTRODUCE2 cell, the hidden service host + will have: + + * The information needed to connect to the client's chosen + rendezvous point. + * The second half of a handshake to authenticate and establish a + shared key with the hidden service client. + * A set of shared keys to use for end-to-end encryption. + + The same rules for multiplicity, ordering, and handling unknown types + apply to the extension fields here as described [EST_INTRO] above. + + +3.3.1. Introduction handshake encryption requirements [INTRO-HANDSHAKE-REQS] + + When decoding the encrypted information in an INTRODUCE2 cell, a + hidden service host must be able to: + + * Decrypt additional information included in the INTRODUCE2 cell, + to include the rendezvous token and the information needed to + extend to the rendezvous point. + + * Establish a set of shared keys for use with the client. + + * Authenticate that the cell has not been modified since the client + generated it. + + Note that the old TAP-derived protocol of the previous hidden service + design achieved the first two requirements, but not the third. + +3.3.2. 
Example encryption handshake: ntor with extra data + [NTOR-WITH-EXTRA-DATA] + + [TODO: relocate this] + + This is a variant of the ntor handshake (see tor-spec.txt, section + 5.1.4; see proposal 216; and see "Anonymity and one-way + authentication in key-exchange protocols" by Goldberg, Stebila, and + Ustaoglu). + + It behaves the same as the ntor handshake, except that, in addition + to negotiating forward secure keys, it also provides a means for + encrypting non-forward-secure data to the server (in this case, to + the hidden service host) as part of the handshake. + + Notation here is as in section 5.1.4 of tor-spec.txt, which defines + the ntor handshake. + + The PROTOID for this variant is "tor-hs-ntor-curve25519-sha3-256-1". + We also use the following tweak values: + + t_hsenc = PROTOID | ":hs_key_extract" + t_hsverify = PROTOID | ":hs_verify" + t_hsmac = PROTOID | ":hs_mac" + m_hsexpand = PROTOID | ":hs_key_expand" + + To make an INTRODUCE1 cell, the client must know a public encryption + key B for the hidden service on this introduction circuit. 
The client + generates a single-use keypair: + + x,X = KEYGEN() + + and computes: + + intro_secret_hs_input = EXP(B,x) | AUTH_KEY | X | B | PROTOID + info = m_hsexpand | N_hs_subcred + hs_keys = KDF(intro_secret_hs_input | t_hsenc | info, S_KEY_LEN+MAC_LEN) + ENC_KEY = hs_keys[0:S_KEY_LEN] + MAC_KEY = hs_keys[S_KEY_LEN:S_KEY_LEN+MAC_KEY_LEN] + + and sends, as the ENCRYPTED part of the INTRODUCE1 cell: + + CLIENT_PK [PK_PUBKEY_LEN bytes] + ENCRYPTED_DATA [Padded to length of plaintext] + MAC [MAC_LEN bytes] + + + Substituting those fields into the INTRODUCE1 cell body format + described in [FMT_INTRO1] above, we have + + LEGACY_KEY_ID [20 bytes] + AUTH_KEY_TYPE [1 byte] + AUTH_KEY_LEN [2 bytes] + AUTH_KEY [AUTH_KEY_LEN bytes] + N_EXTENSIONS [1 bytes] + N_EXTENSIONS times: + EXT_FIELD_TYPE [1 byte] + EXT_FIELD_LEN [1 byte] + EXT_FIELD [EXT_FIELD_LEN bytes] + ENCRYPTED: + CLIENT_PK [PK_PUBKEY_LEN bytes] + ENCRYPTED_DATA [Padded to length of plaintext] + MAC [MAC_LEN bytes] + + + (This format is as documented in [FMT_INTRO1] above, except that here + we describe how to build the ENCRYPTED portion.) + + Here, the encryption key plays the role of B in the regular ntor + handshake, and the AUTH_KEY field plays the role of the node ID. + The CLIENT_PK field is the public key X. The ENCRYPTED_DATA field is + the message plaintext, encrypted with the symmetric key ENC_KEY. The + MAC field is a MAC of all of the cell from the AUTH_KEY through the + end of ENCRYPTED_DATA, using the MAC_KEY value as its key. + + To process this format, the hidden service checks PK_VALID(CLIENT_PK) + as necessary, and then computes ENC_KEY and MAC_KEY as the client did + above, except using EXP(CLIENT_PK,b) in the calculation of + intro_secret_hs_input. The service host then checks whether the MAC is + correct. If it is invalid, it drops the cell. Otherwise, it computes + the plaintext by decrypting ENCRYPTED_DATA. 
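
  As an illustration only, the ENC_KEY / MAC_KEY derivation above can be
  sketched in Python. This sketch assumes the instantiations given in the
  cryptographic building blocks of this specification (outside this
  section): KDF is SHAKE-256, H is SHA3-256, S_KEY_LEN and MAC_LEN are both
  32, and MAC(k, m) is H(len(k) as an 8-byte big-endian integer | k | m).
  The function names are hypothetical, and the Curve25519 result
  EXP(B,x) (or EXP(X,b) on the service side) is taken as an input rather
  than computed here.

```python
import hashlib

PROTOID = b"tor-hs-ntor-curve25519-sha3-256-1"
T_HSENC = PROTOID + b":hs_key_extract"
M_HSEXPAND = PROTOID + b":hs_key_expand"
S_KEY_LEN = 32  # assumption: symmetric key length in bytes
MAC_LEN = 32    # assumption: MAC output length in bytes


def kdf(data: bytes, n: int) -> bytes:
    """Assumed KDF instantiation: SHAKE-256 producing n output bytes."""
    return hashlib.shake_256(data).digest(n)


def mac(key: bytes, msg: bytes) -> bytes:
    """Assumed MAC: SHA3-256(len(key) as 8-byte big-endian | key | msg)."""
    return hashlib.sha3_256(len(key).to_bytes(8, "big") + key + msg).digest()


def derive_intro_keys(exp_result: bytes, auth_key: bytes, client_pk: bytes,
                      enc_pk: bytes, subcredential: bytes):
    """Return (ENC_KEY, MAC_KEY) from the shared Diffie-Hellman result.

    exp_result is EXP(B,x) for the client or EXP(X,b) for the service;
    both sides obtain the same bytes, so both derive the same keys.
    """
    # intro_secret_hs_input = EXP(...) | AUTH_KEY | X | B | PROTOID
    secret_input = exp_result + auth_key + client_pk + enc_pk + PROTOID
    # info = m_hsexpand | N_hs_subcred
    info = M_HSEXPAND + subcredential
    hs_keys = kdf(secret_input + T_HSENC + info, S_KEY_LEN + MAC_LEN)
    return hs_keys[:S_KEY_LEN], hs_keys[S_KEY_LEN:S_KEY_LEN + MAC_LEN]
```

  The MAC field of the ENCRYPTED portion would then be computed with
  mac(MAC_KEY, ...) over the cell bytes from AUTH_KEY through the end of
  ENCRYPTED_DATA, and the service would compare it before decrypting.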
+ + The hidden service host now completes the service side of the + extended ntor handshake, as described in tor-spec.txt section 5.1.4, + with the modified PROTOID as given above. To be explicit, the hidden + service host generates a keypair of y,Y = KEYGEN(), and uses its + introduction point encryption key 'b' to compute: + + intro_secret_hs_input = EXP(X,b) | AUTH_KEY | X | B | PROTOID + info = m_hsexpand | N_hs_subcred + hs_keys = KDF(intro_secret_hs_input | t_hsenc | info, S_KEY_LEN+MAC_LEN) + HS_DEC_KEY = hs_keys[0:S_KEY_LEN] + HS_MAC_KEY = hs_keys[S_KEY_LEN:S_KEY_LEN+MAC_KEY_LEN] + + (The above are used to check the MAC and then decrypt the + encrypted data.) + + rend_secret_hs_input = EXP(X,y) | EXP(X,b) | AUTH_KEY | B | X | Y | PROTOID + NTOR_KEY_SEED = MAC(rend_secret_hs_input, t_hsenc) + verify = MAC(rend_secret_hs_input, t_hsverify) + auth_input = verify | AUTH_KEY | B | Y | X | PROTOID | "Server" + AUTH_INPUT_MAC = MAC(auth_input, t_hsmac) + + (The above are used to finish the ntor handshake.) + + The server's handshake reply is: + + SERVER_PK Y [PK_PUBKEY_LEN bytes] + AUTH AUTH_INPUT_MAC [MAC_LEN bytes] + + These fields will be sent to the client in a RENDEZVOUS1 cell using the + HANDSHAKE_INFO element (see [JOIN_REND]). + + The hidden service host now also knows the keys generated by the + handshake, which it will use to encrypt and authenticate data + end-to-end between the client and the server. These keys are as + computed in tor-spec.txt section 5.1.4, except that instead of using + AES-128 and SHA1 for this hop, we use AES-256 and SHA3-256. + +3.4. Authentication during the introduction phase. [INTRO-AUTH] + + Hidden services may restrict access only to authorized users. + One mechanism to do so is the credential mechanism, where only users who + know the credential for a hidden service may connect at all. + + There is one defined authentication type: `ed25519`. + + +3.4.1. Ed25519-based authentication `ed25519`. 
+ + (NOTE: This section is not implemented by Tor. It is likely + that we would want to change its design substantially before + deploying any implementation. At the very least, we would + want to bind these extensions to a single onion service, to + prevent replays. We might also want to look for ways to limit + the number of keys a user needs to have.) + + To authenticate with an Ed25519 private key, the user must include an + extension field in the encrypted part of the INTRODUCE1 cell with an + EXT_FIELD_TYPE type of [02] and the contents: + + Nonce [16 bytes] + Pubkey [32 bytes] + Signature [64 bytes] + + Nonce is a random value. Pubkey is the public key that will be used + to authenticate. [TODO: should this be an identifier for the public + key instead?] Signature is the signature, using Ed25519, of: + + "hidserv-userauth-ed25519" + Nonce (same as above) + Pubkey (same as above) + AUTH_KEY (As in the INTRODUCE1 cell) + + The hidden service host checks this by seeing whether it recognizes + and would accept a signature from the provided public key. If it + would, then it checks whether the signature is correct. If it is, + then the correct user has authenticated. + + Replay prevention on the whole cell is sufficient to prevent replays + on the authentication. + + Users SHOULD NOT use the same public key with multiple hidden + services. + +4. The rendezvous protocol + + Before connecting to a hidden service, the client first builds a + circuit to an arbitrarily chosen Tor node (known as the rendezvous + point), and sends an ESTABLISH_RENDEZVOUS cell. The hidden service + later connects to the same node and sends a RENDEZVOUS cell. Once + this has occurred, the relay forwards the contents of the RENDEZVOUS + cell to the client, and joins the two circuits together. 
+ + Single Onion Services attempt to build a non-anonymous single-hop circuit, + but use an anonymous 3-hop circuit if: + + * the rend point is on an address that is configured as unreachable via + a direct connection, or + * the initial attempt to connect to the rend point over a single-hop + circuit fails, and they are retrying the rend point connection. + +4.1. Establishing a rendezvous point [EST_REND_POINT] + + The client sends the rendezvous point a RELAY_COMMAND_ESTABLISH_RENDEZVOUS + cell containing a 20-byte value. + + RENDEZVOUS_COOKIE [20 bytes] + + Rendezvous points MUST ignore any extra bytes in an + ESTABLISH_RENDEZVOUS cell. (Older versions of Tor did not.) + + The rendezvous cookie is an arbitrary 20-byte value, chosen randomly + by the client. The client SHOULD choose a new rendezvous cookie for + each new connection attempt. If the rendezvous cookie is already in + use on an existing circuit, the rendezvous point should reject it and + destroy the circuit. + + Upon receiving an ESTABLISH_RENDEZVOUS cell, the rendezvous point associates + the cookie with the circuit on which it was sent. It replies to the client + with an empty RENDEZVOUS_ESTABLISHED cell to indicate success. Clients MUST + ignore any extra bytes in a RENDEZVOUS_ESTABLISHED cell. + + The client MUST NOT use the circuit which sent the cell for any + purpose other than rendezvous with the given location-hidden service. + + The client should establish a rendezvous point BEFORE trying to + connect to a hidden service. + +4.2. Joining to a rendezvous point [JOIN_REND] + + To complete a rendezvous, the hidden service host builds a circuit to + the rendezvous point and sends a RENDEZVOUS1 cell containing: + + RENDEZVOUS_COOKIE [20 bytes] + HANDSHAKE_INFO [variable; depends on handshake type + used.] + + where RENDEZVOUS_COOKIE is the cookie suggested by the client during the + introduction (see [PROCESS_INTRO2]) and HANDSHAKE_INFO is defined in + [NTOR-WITH-EXTRA-DATA]. 
+ + If the cookie matches the rendezvous cookie set on any + not-yet-connected circuit on the rendezvous point, the rendezvous + point connects the two circuits, and sends a RENDEZVOUS2 cell to the + client containing the HANDSHAKE_INFO field of the RENDEZVOUS1 cell. + + Upon receiving the RENDEZVOUS2 cell, the client verifies that HANDSHAKE_INFO + correctly completes a handshake. To do so, the client parses SERVER_PK from + HANDSHAKE_INFO and reverses the final operations of section + [NTOR-WITH-EXTRA-DATA] as shown here: + + rend_secret_hs_input = EXP(Y,x) | EXP(B,x) | AUTH_KEY | B | X | Y | PROTOID + NTOR_KEY_SEED = MAC(rend_secret_hs_input, t_hsenc) + verify = MAC(rend_secret_hs_input, t_hsverify) + auth_input = verify | AUTH_KEY | B | Y | X | PROTOID | "Server" + AUTH_INPUT_MAC = MAC(auth_input, t_hsmac) + + Finally the client verifies that the received AUTH field of HANDSHAKE_INFO + is equal to the computed AUTH_INPUT_MAC. + + Now both parties use the handshake output to derive shared keys for use on + the circuit as specified in the section below: + +4.2.1. Key expansion + + The hidden service and its client need to derive crypto keys from the + NTOR_KEY_SEED part of the handshake output. To do so, they use the KDF + construction as follows: + + K = KDF(NTOR_KEY_SEED | m_hsexpand, HASH_LEN * 2 + S_KEY_LEN * 2) + + The first HASH_LEN bytes of K form the forward digest Df; the next HASH_LEN + bytes form the backward digest Db; the next S_KEY_LEN bytes form Kf, and the + final S_KEY_LEN bytes form Kb. Excess bytes from K are discarded. + + Subsequently, the rendezvous point passes relay cells, unchanged, from each + of the two circuits to the other.
When Alice's OP sends RELAY cells along + the circuit, it authenticates with Df, and encrypts them with the Kf, then + with all of the keys for the ORs in Alice's side of the circuit; and when + Alice's OP receives RELAY cells from the circuit, it decrypts them with the + keys for the ORs in Alice's side of the circuit, then decrypts them with Kb, + and checks integrity with Db. Bob's OP does the same, with Kf and Kb + interchanged. + + [TODO: Should we encrypt HANDSHAKE_INFO as we did INTRODUCE2 + contents? It's not necessary, but it could be wise. Similarly, we + should make it extensible.] + +4.3. Using legacy hosts as rendezvous points + + [This section is obsolete and refers to a workaround for now-obsolete Tor + relay versions. It is included for historical reasons.] + + The behavior of ESTABLISH_RENDEZVOUS is unchanged from older versions + of this protocol, except that relays should now ignore unexpected + bytes at the end. + + Old versions of Tor required that RENDEZVOUS cell payloads be exactly + 168 bytes long. All shorter rendezvous payloads should be padded to + this length with random bytes, to make them difficult to distinguish from + older protocols at the rendezvous point. + + Relays older than 0.2.9.1 should not be used for rendezvous points by next + generation onion services because they enforce too-strict length checks to + rendezvous cells. Hence the "HSRend" protocol from proposal#264 should be + used to select relays for rendezvous points. + +5. Encrypting data between client and host + + A successfully completed handshake, as embedded in the + INTRODUCE/RENDEZVOUS cells, gives the client and hidden service host + a shared set of keys Kf, Kb, Df, Db, which they use for sending + end-to-end traffic encryption and authentication as in the regular + Tor relay encryption protocol, applying encryption with these keys + before other encryption, and decrypting with these keys before other + decryption. 
The client encrypts with Kf and decrypts with Kb; the + service host does the opposite. + +6. Encoding onion addresses [ONIONADDRESS] + + The onion address of a hidden service includes its identity public key, a + version field and a basic checksum. All this information is then base32 + encoded as shown below: + + onion_address = base32(PUBKEY | CHECKSUM | VERSION) + ".onion" + CHECKSUM = H(".onion checksum" | PUBKEY | VERSION)[:2] + + where: + - PUBKEY is the 32 bytes ed25519 master pubkey of the hidden service. + - VERSION is a one byte version field (default value '\x03') + - ".onion checksum" is a constant string + - CHECKSUM is truncated to two bytes before inserting it in onion_address + + Here are a few example addresses: + + pg6mmjiyjmcrsslvykfwnntlaru7p5svn6y2ymmju6nubxndf4pscryd.onion + sp3k262uwy4r2k3ycr5awluarykdpag6a7y33jxop4cs2lu5uz5sseqd.onion + xa4r2iadxm55fbnqgwwi5mymqdcofiu3w6rpbtqn7b2dyn7mgwj64jyd.onion + + For more information about this encoding, please see our discussion thread + at [ONIONADDRESS-REFS]. + +7. Open Questions: + + Scaling hidden services is hard. There are on-going discussions that + you might be able to help with. See [SCALING-REFS]. + + How can we improve the HSDir unpredictability design proposed in + [SHAREDRANDOM]? See [SHAREDRANDOM-REFS] for discussion. + + How can hidden service addresses become memorable while retaining + their self-authenticating and decentralized nature? See + [HUMANE-HSADDRESSES-REFS] for some proposals; many more are possible. + + Hidden Services are pretty slow. Both because of the lengthy setup + procedure and because the final circuit has 6 hops. How can we make + the Hidden Service protocol faster? See [PERFORMANCE-REFS] for some + suggestions. 
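
  As an illustration of the address encoding in section 6 [ONIONADDRESS],
  here is a minimal Python sketch. It assumes that H is SHA3-256 (as
  instantiated elsewhere in this specification) and that base32 means the
  RFC 4648 alphabet, written in lower case. The function names are
  hypothetical.

```python
import base64
import hashlib

VERSION = b"\x03"


def onion_checksum(pubkey: bytes, version: bytes = VERSION) -> bytes:
    """CHECKSUM = H(".onion checksum" | PUBKEY | VERSION)[:2], H assumed SHA3-256."""
    return hashlib.sha3_256(b".onion checksum" + pubkey + version).digest()[:2]


def encode_onion_address(pubkey: bytes) -> str:
    """onion_address = base32(PUBKEY | CHECKSUM | VERSION) + ".onion"."""
    if len(pubkey) != 32:
        raise ValueError("PUBKEY must be 32 bytes")
    raw = pubkey + onion_checksum(pubkey) + VERSION
    # 35 bytes encode to exactly 56 base32 characters, so there is no padding.
    return base64.b32encode(raw).decode("ascii").lower() + ".onion"


def decode_onion_address(address: str) -> bytes:
    """Return PUBKEY, or raise ValueError if the address is malformed."""
    if not address.endswith(".onion"):
        raise ValueError("missing .onion suffix")
    label = address[: -len(".onion")]
    if len(label) != 56:
        raise ValueError("unexpected address length")
    try:
        raw = base64.b32decode(label.upper())
    except Exception as exc:
        raise ValueError("invalid base32") from exc
    pubkey, checksum, version = raw[:32], raw[32:34], raw[34:]
    if version != VERSION:
        raise ValueError("unsupported version")
    if checksum != onion_checksum(pubkey, version):
        raise ValueError("bad checksum")
    return pubkey
```

  Because VERSION is the final byte and has the value 3, every address
  produced this way ends in "d.onion", which matches the example
  addresses in section 6.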
+ +References: + +[KEYBLIND-REFS]: + https://trac.torproject.org/projects/tor/ticket/8106 + https://lists.torproject.org/pipermail/tor-dev/2012-September/004026.html + +[KEYBLIND-PROOF]: + https://lists.torproject.org/pipermail/tor-dev/2013-December/005943.html + +[SHAREDRANDOM-REFS]: + https://gitweb.torproject.org/torspec.git/tree/proposals/250-commit-reveal-consensus.txt + https://trac.torproject.org/projects/tor/ticket/8244 + +[SCALING-REFS]: + https://lists.torproject.org/pipermail/tor-dev/2013-October/005556.html + +[HUMANE-HSADDRESSES-REFS]: + https://gitweb.torproject.org/torspec.git/blob/HEAD:/proposals/ideas/xxx-onion-nyms.txt + http://archives.seul.org/or/dev/Dec-2011/msg00034.html + +[PERFORMANCE-REFS]: + "Improving Efficiency and Simplicity of Tor circuit + establishment and hidden services" by Overlier, L., and + P. Syverson + + [TODO: Need more here! Do we have any? :( ] + +[ATTACK-REFS]: + "Trawling for Tor Hidden Services: Detection, Measurement, + Deanonymization" by Alex Biryukov, Ivan Pustogarov, + Ralf-Philipp Weinmann + + "Locating Hidden Servers" by Lasse Øverlier and Paul + Syverson + +[ED25519-REFS]: + "High-speed high-security signatures" by Daniel + J. Bernstein, Niels Duif, Tanja Lange, Peter Schwabe, and + Bo-Yin Yang. http://cr.yp.to/papers.html#ed25519 + +[ED25519-B-REF]: + https://tools.ietf.org/html/draft-josefsson-eddsa-ed25519-03#section-5: + +[PRNG-REFS]: + http://projectbullrun.org/dual-ec/ext-rand.html + https://lists.torproject.org/pipermail/tor-dev/2015-November/009954.html + +[SRV-TP-REFS]: + https://lists.torproject.org/pipermail/tor-dev/2016-April/010759.html + +[VANITY-REFS]: + https://github.com/Yawning/horse25519 + +[ONIONADDRESS-REFS]: + https://lists.torproject.org/pipermail/tor-dev/2017-January/011816.html + +[TORSION-REFS]: + https://lists.torproject.org/pipermail/tor-dev/2017-April/012164.html + https://getmonero.org/2017/05/17/disclosure-of-a-major-bug-in-cryptonote-based-currencies.html + +Appendix A. 
Signature scheme with key blinding [KEYBLIND] + +A.1. Key derivation overview + + As described in [IMD:DIST] and [SUBCRED] above, we require a "key + blinding" system that works (roughly) as follows: + + There is a master keypair (sk, pk). + + Given the keypair and a nonce n, there is a derivation function + that gives a new blinded keypair (sk_n, pk_n). This keypair can + be used for signing. + + Given only the public key and the nonce, there is a function + that gives pk_n. + + Without knowing pk, it is not possible to derive pk_n; without + knowing sk, it is not possible to derive sk_n. + + It's possible to check that a signature was made with sk_n while + knowing only pk_n. + + Someone who sees a large number of blinded public keys and + signatures made using those public keys can't tell which + signatures and which blinded keys were derived from the same + master keypair. + + You can't forge signatures. + + [TODO: Insert a more rigorous definition and better references.] + +A.2. Tor's key derivation scheme + + We propose the following scheme for key blinding, based on Ed25519. + + (This is an ECC group, so remember that scalar multiplication is the + trapdoor function, and it's defined in terms of iterated point + addition. See the Ed25519 paper [Reference ED25519-REFS] for a fairly + clear writeup.) + + Let B be the ed25519 basepoint as found in section 5 of [ED25519-B-REF]: + + B = (15112221349535400772501151409588531511454012693041857206046113283949847762202, + 46316835694926478169428394003475163141307993866256225615783033603165251855960) + + Assume B has prime order l, so lB=0. Let a master keypair be written as + (a,A), where a is the private key and A is the public key (A=aB). 
+ + To derive the key for a nonce N and an optional secret s, compute the + blinding factor like this: + + h = H(BLIND_STRING | A | s | B | N) + BLIND_STRING = "Derive temporary signing key" | INT_1(0) + N = "key-blind" | INT_8(period-number) | INT_8(period_length) + B = "(1511[...]2202, 4631[...]5960)" + + then clamp the blinding factor 'h' according to the ed25519 spec: + + h[0] &= 248; + h[31] &= 63; + h[31] |= 64; + + and do the key derivation as follows: + + private key for the period: + + a' = h a mod l + RH' = SHA-512(RH_BLIND_STRING | RH)[:32] + RH_BLIND_STRING = "Derive temporary signing key hash input" + + public key for the period: + + A' = h A = (ha)B + + Generating a signature of M: given a deterministic random-looking r + (see EdDSA paper), take R=rB, S=r+hash(R,A',M)ah mod l. Send signature + (R,S) and public key A'. + + Verifying the signature: Check whether SB = R+hash(R,A',M)A'. + + (If the signature is valid, + SB = (r + hash(R,A',M)ah)B + = rB + (hash(R,A',M)ah)B + = R + hash(R,A',M)A' ) + + This boils down to regular Ed25519 with key pair (a', A'). + + See [KEYBLIND-REFS] for an extensive discussion on this scheme and + possible alternatives. Also, see [KEYBLIND-PROOF] for a security + proof of this scheme. + +Appendix B. Selecting nodes [PICKNODES] + + Picking introduction points + Picking rendezvous points + Building paths + Reusing circuits + + (TODO: This needs a writeup) + +Appendix C. Recommendations for searching for vanity .onions [VANITY] + + EDITORIAL NOTE: The author thinks that it's silly to brute-force the + keyspace for a key that, when base-32 encoded, spells out the name of + your website. It also feels a bit dangerous to me. 
If you train your + users to connect to + + llamanymityx4fi3l6x2gyzmtmgxjyqyorj9qsb5r543izcwymle.onion + + I worry that you're making it easier for somebody to trick them into + connecting to + + llamanymityb4sqi0ta0tsw6uovyhwlezkcrmczeuzdvfauuemle.onion + + Nevertheless, people are probably going to try to do this, so here's a + decent algorithm to use. + + To search for a public key with some criterion X: + + Generate a random (sk,pk) pair. + + While pk does not satisfy X: + + Add the number 8 to sk + Add the point 8*B to pk + + Return sk, pk. + + We add 8 and 8*B, rather than 1 and B, so that sk is always a valid + Curve25519 private key, with the lowest 3 bits equal to 0. + + This algorithm is safe [source: djb, personal communication] [TODO: + Make sure I understood correctly!] so long as only the final (sk,pk) + pair is used, and all previous values are discarded. + + To parallelize this algorithm, start with an independent (sk,pk) pair + generated for each independent thread, and let each search proceed + independently. + + See [VANITY-REFS] for a reference implementation of this vanity .onion + search scheme. + +Appendix D. Numeric values reserved in this document + + [TODO: collect all the lists of commands and values mentioned above] + +Appendix E. Reserved numbers + + We reserve these certificate type values for Ed25519 certificates: + + [08] short-term descriptor signing key, signed with blinded + public key. (Section 2.4) + [09] intro point authentication key, cross-certifying the descriptor + signing key. (Section 2.5) + [0B] ed25519 key derived from the curve25519 intro point encryption key, + cross-certifying the descriptor signing key. (Section 2.5) + + Note: The value "0A" is skipped because it's reserved for the onion key + cross-certifying ntor identity key from proposal 228. + +Appendix F. 
Hidden service directory format [HIDSERVDIR-FORMAT] + + This appendix section specifies the contents of the HiddenServiceDir directory: + + - "hostname" [FILE] + + This file contains the onion address of the onion service. + + - "private_key_ed25519" [FILE] + + This file contains the private master ed25519 key of the onion service. + [TODO: Offline keys] + + - "./authorized_clients/" [DIRECTORY] + "./authorized_clients/alice.auth" [FILE] + "./authorized_clients/bob.auth" [FILE] + "./authorized_clients/charlie.auth" [FILE] + + If client authorization is enabled, this directory MUST contain a ".auth" + file for each authorized client. Each such file contains the public key of + the respective client. The files are transmitted to the service operator by + the client. + + See section [CLIENT-AUTH-MGMT] for more details and the format of the client file. + + (NOTE: client authorization is implemented as of 0.3.5.1-alpha.) + +Appendix G. Managing authorized client data [CLIENT-AUTH-MGMT] + + Hidden services and clients can configure their authorized client data either + using the torrc, or using the control port. This section presents a suggested + scheme for configuring client authorization. Please see appendix + [HIDSERVDIR-FORMAT] for more information about relevant hidden service files. + + (NOTE: client authorization is implemented as of 0.3.5.1-alpha.) + + G.1. Configuring client authorization using torrc + + G.1.1. Hidden Service side configuration + + A hidden service that wants to enable client authorization, needs to + populate the "authorized_clients/" directory of its HiddenServiceDir + directory with the ".auth" files of its authorized clients. + + When Tor starts up with a configured onion service, Tor checks its + /authorized_clients/ directory for ".auth" files, and if + any recognized and parseable such files are found, then client + authorization becomes activated for that service. + + G.1.2. 
Service-side bookkeeping + + This section contains more details on how onion services should be keeping + track of their client ".auth" files. + + For the "descriptor" authentication type, the ".auth" file MUST contain + the x25519 public key of that client. Here is a suggested file format: + + <auth-type>:<key-type>:<base32-encoded-public-key> + + Here is an example: + + descriptor:x25519:OM7TGIVRYMY6PFX6GAC6ATRTA5U6WW6U7A4ZNHQDI6OVL52XVV2Q + + Tor SHOULD ignore lines it does not recognize. + Tor SHOULD ignore files that don't use the ".auth" suffix. + + G.1.3. Client side configuration + + A client who wants to register client authorization data for onion + services needs to add the following line to their torrc to indicate the + directory which hosts ".auth_private" files containing client-side + credentials for onion services: + + ClientOnionAuthDir <DIR> + + The <DIR> contains a file with the suffix ".auth_private" for each onion + service the client is authorized with. Tor should scan the directory for + ".auth_private" files to find which onion services require client + authorization from this client. + + For the "descriptor" auth-type, a ".auth_private" file contains the + private x25519 key: + + <56-char-onion-addr-without-.onion-part>:descriptor:x25519:<x25519 private key in base32 encoding> + + The keypair used for client authorization is created by a third party tool + for which the public key needs to be transferred to the service operator + in a secure out-of-band way. The third party tool SHOULD add appropriate + headers to the private key file to ensure that users won't accidentally + give out their private key. + + G.2. Configuring client authorization using the control port + + G.2.1. Service side + + A hidden service also has the option to configure authorized clients + using the control port. The idea is that hidden service operators can use + controller utilities that manage their access control instead of using + the filesystem to register client keys.
+ + Specifically, we require a new control port command ADD_ONION_CLIENT_AUTH + which is able to register x25519/ed25519 public keys tied to a specific + authorized client. + [XXX figure out control port command format] + + Hidden services that use the control port interface for client auth need + to perform their own key management. + + G.2.2. Client side + + There should also be a control port interface for clients to register + authorization data for hidden services without having to use the + torrc. It should allow both generation of client authorization private + keys and import of client authorization data provided by a hidden + service. + + This way, Tor Browser can present "Generate client auth keys" and "Import + client auth keys" dialogs to users when they try to visit a hidden service + that is protected by client authorization. + + Specifically, we require two new control port commands: + IMPORT_ONION_CLIENT_AUTH_DATA + GENERATE_ONION_CLIENT_AUTH_DATA + which import and generate client authorization data respectively. + + [XXX how does key management work here?] + [XXX what happens when people use both the control port interface and the + filesystem interface?] + +Appendix F. Two methods for managing revision counters. + + Implementations MAY generate revision counters in any way they please, + so long as they are monotonically increasing over the lifetime of each + blinded public key. But to avoid fingerprinting, implementors SHOULD + choose a strategy also used by other Tor implementations. Here we + describe two, and additionally list some strategies that implementors + should NOT use. + + F.1. Increment-on-generation + + This is the simplest strategy, and the one used by Tor through at + least version 0.3.4.0-alpha. + + Whenever using a new blinded key, the service records the + highest revision counter it has used with that key.
When generating + a descriptor, the service uses the smallest non-negative number + higher than any number it has already used. + + In other words, the revision counters under this system start fresh + with each blinded key as 0, 1, 2, 3, and so on. + + F.2. Encrypted time in period + + This scheme is what we recommend for situations when multiple + service instances need to coordinate their revision counters, + without an actual coordination mechanism. + + Let T be the number of seconds that have elapsed since the descriptor + became valid, plus 1. (T must be at least 1.) Implementations can use the + number of seconds since the start time of the shared random protocol run + that corresponds to this descriptor. + + Let S be a secret that all the service providers share. For + example, it could be the private signing key corresponding to the + current blinded key. + + Let K be an AES-256 key, generated as + K = H("rev-counter-generation" | S) + + Use K, and AES in counter mode with IV=0, to generate a stream of T + * 2 bytes. Consider these bytes as a sequence of T 16-bit + little-endian words. Add these words. + + Let the sum of these words be the revision counter. + + + Cryptowiki attributes roughly this scheme to G. Bebek in: + + G. Bebek. Anti-tamper database research: Inference control + techniques. Technical Report EECS 433 Final Report, Case + Western Reserve University, November 2002. + + Although we believe it is suitable for use in this application, it + is not a perfect order-preserving encryption algorithm (and all + order-preserving encryption has weaknesses). Please think twice + before using it for anything else. + + (This scheme can be optimized pretty easily by caching the encryption of + X*1, X*2, X*3, etc for some well chosen X.) + + For a slow reference implementation, see src/test/ope_ref.py in the + Tor source repository. [XXXX for now, see the same file in Nick's + "ope_hax" branch -- it isn't merged yet.] 
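The summation step of the "encrypted time in period" scheme can be sketched as follows. This is an illustration only, under stated assumptions: H is taken to be SHA3-256, and because Python's standard library has no AES, the hypothetical helper keystream() uses SHAKE-256 as a STAND-IN for "AES-256 in counter mode with IV=0". A real implementation would substitute AES-CTR there; the values this sketch produces will not match any Tor implementation.

```python
import hashlib
import struct


def keystream(key: bytes, nbytes: int) -> bytes:
    # STAND-IN (assumption): the text specifies AES-256 in counter mode
    # with IV=0. SHAKE-256 is used here only so the sketch runs with the
    # standard library; like a CTR keystream, a longer output extends a
    # shorter one.
    return hashlib.shake_256(key).digest(nbytes)


def revision_counter(secret: bytes, t: int) -> int:
    """Sum T 16-bit little-endian words from a stream keyed by K."""
    if t < 1:
        raise ValueError("T must be at least 1")
    # K = H("rev-counter-generation" | S), with H assumed to be SHA3-256.
    k = hashlib.sha3_256(b"rev-counter-generation" + secret).digest()
    stream = keystream(k, 2 * t)
    words = struct.unpack("<%dH" % t, stream)
    return sum(words)
```

Because each word is non-negative and the stream for T+1 extends the stream for T, the counter never decreases as T grows. That is the ordering property the scheme relies on.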
+ + This scheme is not currently implemented in Tor. + + F.X. Some revision-counter strategies to avoid + + Though it might be tempting, implementations SHOULD NOT use the + current time or the current time within the period directly as their + revision counter -- doing so leaks their view of the current time, + which can be used to link the onion service to other services run on + the same host. + + Similarly, implementations SHOULD NOT let the revision counter + increase forever without resetting it -- doing so links the service + across changes in the blinded public key. + +Appendix G. Test vectors + + G.1. Test vectors for hs-ntor / NTOR-WITH-EXTRA-DATA + + Here is a set of test values for the hs-ntor handshake, called + [NTOR-WITH-EXTRA-DATA] in this document. They were generated by + instrumenting Tor's code to dump the values for an INTRODUCE/RENDEZVOUS + handshake, and then by running that code on a Chutney network. + + We assume an onion service with: + + KP_hs_ipd_sid = 34E171E4358E501BFF21ED907E96AC6B + FEF697C779D040BBAF49ACC30FC5D21F + KP_hss_ntor = 8E5127A40E83AABF6493E41F142B6EE3 + 604B85A3961CD7E38D247239AFF71979 + KS_hss_ntor = A0ED5DBF94EEB2EDB3B514E4CF6ABFF6 + 022051CC5F103391F1970A3FCD15296A + N_hs_subcred = 0085D26A9DEBA252263BF0231AEAC59B + 17CA11BAD8A218238AD6487CBAD68B57 + + The client wants to make an INTRODUCE request. It generates + the following header (everything before the ENCRYPTED portion) + of its INTRODUCE1 cell: + + H = 000000000000000000000000000000000000000002002034E171E4358E501BFF + 21ED907E96AC6BFEF697C779D040BBAF49ACC30FC5D21F00 + + It generates the following plaintext body to encrypt. (This + is the "decrypted plaintext body" from [PROCESS_INTRO2].)
+ + P = 6BD364C12638DD5C3BE23D76ACA05B04E6CE932C0101000100200DE6130E4FCA + C4EDDA24E21220CC3EADAE403EF6B7D11C8273AC71908DE565450300067F0000 + 0113890214F823C4F8CC085C792E0AEE0283FE00AD7520B37D0320728D5DF39B + 7B7077A0118A900FF4456C382F0041300ACF9C58E51C392795EF870000000000 + 0000000000000000000000000000000000000000000000000000000000000000 + 000000000000000000000000000000000000000000000000000000000000 + + (Note! This should in fact be padded to be longer; when these + test vectors were generated, the target INTRODUCE1 length in C + Tor was needlessly short.) + + The client now begins the hs-ntor handshake. It generates + a curve25519 keypair: + + x = 60B4D6BF5234DCF87A4E9D7487BDF3F4 + A69B6729835E825CA29089CFDDA1E341 + X = BF04348B46D09AED726F1D66C618FDEA + 1DE58E8CB8B89738D7356A0C59111D5D + + Then it calculates: + + ENC_KEY = 9B8917BA3D05F3130DACCE5300C3DC27 + F6D012912F1C733036F822D0ED238706 + MAC_KEY = FC4058DA59D4DF61E7B40985D122F502 + FD59336BC21C30CAF5E7F0D4A2C38FD5 + + With these, it encrypts the plaintext body P with ENC_KEY, getting + an encrypted value C. It computes MAC(MAC_KEY, H | X | C), + getting a MAC value M. It then assembles the final INTRODUCE1 + body as H | X | C | M: + + 000000000000000000000000000000000000000002002034E171E4358E501BFF + 21ED907E96AC6BFEF697C779D040BBAF49ACC30FC5D21F00BF04348B46D09AED + 726F1D66C618FDEA1DE58E8CB8B89738D7356A0C59111D5DADBECCCB38E37830 + 4DCC179D3D9E437B452AF5702CED2CCFEC085BC02C4C175FA446525C1B9D5530 + 563C362FDFFB802DAB8CD9EBC7A5EE17DA62E37DEEB0EB187FBB48C63298B0E8 + 3F391B7566F42ADC97C46BA7588278273A44CE96BC68FFDAE31EF5F0913B9A9C + 7E0F173DBC0BDDCD4ACB4C4600980A7DDD9EAEC6E7F3FA3FC37CD95E5B8BFB3E + 35717012B78B4930569F895CB349A07538E42309C993223AEA77EF8AEA64F25D + DEE97DA623F1AEC0A47F150002150455845C385E5606E41A9A199E7111D54EF2 + D1A51B7554D8B3692D85AC587FB9E69DF990EFB776D8 + + Later the service receives that body in an INTRODUCE2 cell. 
It + processes it according to the hs-ntor handshake, and recovers + the client's plaintext P. To continue the hs-ntor handshake, + the service chooses a curve25519 keypair: + + y = 68CB5188CA0CD7924250404FAB54EE13 + 92D3D2B9C049A2E446513875952F8F55 + Y = 8FBE0DB4D4A9C7FF46701E3E0EE7FD05 + CD28BE4F302460ADDEEC9E93354EE700 + + From this and the client's input, it computes: + + AUTH_INPUT_MAC = 4A92E8437B8424D5E5EC279245D5C72B + 25A0327ACF6DAF902079FCB643D8B208 + NTOR_KEY_SEED = 4D0C72FE8AFF35559D95ECC18EB5A368 + 83402B28CDFD48C8A530A5A3D7D578DB + + The service sends back Y | AUTH_INPUT_MAC in its RENDEZVOUS1 cell + body. From these, the client finishes the handshake, validates + AUTH_INPUT_MAC, and computes the same NTOR_KEY_SEED. + + Now that both parties have the same NTOR_KEY_SEED, they can derive + the shared key material they will use for their circuit. diff --git a/attic/text_formats/socks-extensions.txt b/attic/text_formats/socks-extensions.txt new file mode 100644 index 0000000..c35069d --- /dev/null +++ b/attic/text_formats/socks-extensions.txt @@ -0,0 +1,175 @@ + + Tor's extensions to the SOCKS protocol + +Table of Contents + + 1. Overview + 1.1. Extent of support + 2. Name lookup + 3. Other command extensions. + 4. HTTP-resistance + 5. Optimistic data + 6. Extended error codes + +1. Overview + + The SOCKS protocol provides a generic interface for TCP proxies. Client + software connects to a SOCKS server via TCP, and requests a TCP connection + to another address and port. The SOCKS server establishes the connection, + and reports success or failure to the client. After the connection has + been established, the client application uses the TCP stream as usual. + + Tor supports SOCKS4 as defined in [1], SOCKS4A as defined in [2], and + SOCKS5 as defined in [3] and [4]. 
+ + The stickiest issue for Tor in supporting clients, in practice, is forcing + DNS lookups to occur at the OR side: if clients do their own DNS lookup, + the DNS server can learn which addresses the client wants to reach. + SOCKS4 supports addressing by IPv4 address; SOCKS4A is a kludge on top of + SOCKS4 to allow addressing by hostname; SOCKS5 supports IPv4, IPv6, and + hostnames. + +1.1. Extent of support + + Tor supports the SOCKS4, SOCKS4A, and SOCKS5 standards, except as follows: + + BOTH: + - The BIND command is not supported. + + SOCKS4,4A: + - SOCKS4 usernames are used to implement stream isolation. + + SOCKS5: + - The (SOCKS5) "UDP ASSOCIATE" command is not supported. + - SOCKS5 BIND command is not supported. + - IPv6 is not supported in CONNECT commands. + - SOCKS5 GSSAPI subnegotiation is not supported. + - The "NO AUTHENTICATION REQUIRED" (SOCKS5) authentication method [00] is + supported; and as of Tor 0.2.3.2-alpha, the "USERNAME/PASSWORD" (SOCKS5) + authentication method [02] is supported too, and used as a method to + implement stream isolation. As an extension to support some broken clients, + we allow clients to pass a "USERNAME/PASSWORD" authentication message to us + even if no authentication was selected. Furthermore, we allow the + username/password fields of this message to be empty. This technically + violates RFC1929 [4], but ensures interoperability with somewhat broken + SOCKS5 client implementations. + - Custom reply error code. The "REP" field, as per the RFC [3], has + unassigned values which are used to describe Tor internal errors. See + ExtendedErrors in the tor.1 man page for more details. It is only sent + back if this SocksPort flag is set. + + (For more information on stream isolation, see IsolateSOCKSAuth on the Tor + manpage.) + +2. Name lookup + + As an extension to SOCKS4A and SOCKS5, Tor implements a new command value, + "RESOLVE" [F0].
When Tor receives a "RESOLVE" SOCKS command, it initiates + a remote lookup of the hostname provided as the target address in the SOCKS + request. The reply is either an error (if the address couldn't be + resolved) or a success response. In the case of success, the address is + stored in the portion of the SOCKS response reserved for remote IP address. + + (We support RESOLVE in SOCKS4 too, even though it is unnecessary.) + + For SOCKS5 only, we support reverse resolution with a new command value, + "RESOLVE_PTR" [F1]. In response to a "RESOLVE_PTR" SOCKS5 command with + an IPv4 address as its target, Tor attempts to find the canonical + hostname for that IPv4 record, and returns it in the "server bound + address" portion of the reply. + (This command was not supported before Tor 0.1.2.2-alpha.) + +3. Other command extensions. + + Tor 0.1.2.4-alpha added a new command value: "CONNECT_DIR" [F2]. + In this case, Tor will open an encrypted direct TCP connection to the + directory port of the Tor server specified by address:port (the port + specified should be the ORPort of the server). It uses a one-hop tunnel + and a "BEGIN_DIR" relay cell to accomplish this secure connection. + + The F2 command value was removed in Tor 0.2.0.10-alpha in favor of a + new use_begindir flag in edge_connection_t. + +4. HTTP-resistance + + Tor checks the first byte of each SOCKS request to see whether it looks + more like an HTTP request (that is, it starts with a "G", "H", or "P"). If + so, Tor returns a small webpage, telling the user that his/her browser is + misconfigured. This is helpful for the many users who mistakenly try to + use Tor as an HTTP proxy instead of a SOCKS proxy. + +5. Optimistic data + + Tor allows SOCKS clients to send connection data before Tor has sent a + SOCKS response. When using an exit node that supports "optimistic data", + Tor will send such data to the server without waiting to see whether the + connection attempt succeeds. 
This behavior can save a single round-trip + time when starting connections with a protocol where the client speaks + first (like HTTP). Clients that do this must be ready to hear that + their connection has succeeded or failed _after_ they have sent the + data. + +6. Extended error codes + + We define a set of additional extension error codes that can be returned + by our SOCKS implementation in response to failed onion service + connections. + + (In the C Tor implementation, these error codes can be disabled + via the ExtendedErrors flag. In Arti, these error codes are enabled + whenever onion services are.) + + * X'F0' Onion Service Descriptor Can Not be Found + + The requested onion service descriptor can't be found on the hashring + and thus the service is not reachable by the client. + + * X'F1' Onion Service Descriptor Is Invalid + + The requested onion service descriptor can't be parsed or signature + validation failed. + + * X'F2' Onion Service Introduction Failed + + Client failed to introduce to the service, meaning the descriptor was + found but the service is no longer at the introduction points. The + service has likely changed its descriptor or is not running. + + * X'F3' Onion Service Rendezvous Failed + + Client failed to rendezvous with the service, which means that the client + is unable to finalize the connection. + + * X'F4' Onion Service Missing Client Authorization + + Tor was able to download the requested onion service descriptor but is + unable to decrypt its content because it is missing client authorization + information for it. + + * X'F5' Onion Service Wrong Client Authorization + + Tor was able to download the requested onion service descriptor but is + unable to decrypt its content using the client authorization information + it has. This means the client's access was revoked. + + * X'F6' Onion Service Invalid Address + + The given .onion address is invalid.
In one of these cases this + error is returned: address checksum doesn't match, ed25519 public + key is invalid or the encoding is invalid. + + * X'F7' Onion Service Introduction Timed Out + + Similar to X'F2' code but in this case, all introduction attempts + have failed due to a time out. + + (Note that not all of the above error codes are currently returned + by Arti as of August 2023.) + + +References: + [1] http://en.wikipedia.org/wiki/SOCKS#SOCKS4 + [2] http://en.wikipedia.org/wiki/SOCKS#SOCKS4a + [3] SOCKS5: RFC 1928 https://www.ietf.org/rfc/rfc1928.txt + [4] RFC 1929: https://www.ietf.org/rfc/rfc1929.txt + diff --git a/attic/text_formats/srv-spec.txt b/attic/text_formats/srv-spec.txt new file mode 100644 index 0000000..f768b73 --- /dev/null +++ b/attic/text_formats/srv-spec.txt @@ -0,0 +1,653 @@ + + Tor Shared Random Subsystem Specification + +This document specifies how the commit-and-reveal shared random subsystem of +Tor works. This text used to be proposal 250-commit-reveal-consensus.txt. + + Table Of Contents: + + 1. Introduction + 1.1. Motivation + 1.2. Previous work + 2. Overview + 2.1. Introduction to our commit-and-reveal protocol + 2.2. Ten thousand feet view of the protocol + 2.3. How we use the consensus [CONS] + 2.3.1. Inserting Shared Random Values in the consensus + 2.4. Persistent State of the Protocol [STATE] + 2.5. Protocol Illustration + 3. Protocol + 3.1 Commitment Phase [COMMITMENTPHASE] + 3.1.1. Voting During Commitment Phase + 3.1.2. Persistent State During Commitment Phase [STATECOMMIT] + 3.2 Reveal Phase + 3.2.1. Voting During Reveal Phase + 3.2.2. Persistent State During Reveal Phase [STATEREVEAL] + 3.3. Shared Random Value Calculation At 00:00UTC + 3.3.1. Shared Randomness Calculation [SRCALC] + 3.4. Bootstrapping Procedure + 3.5. Rebooting Directory Authorities [REBOOT] + 4. Specification [SPEC] + 4.1. Voting + 4.1.1. Computing commitments and reveals [COMMITREVEAL] + 4.1.2. 
Validating commitments and reveals [VALIDATEVALUES] + 4.1.4. Encoding commit/reveal values in votes [COMMITVOTE] + 4.1.5. Shared Random Value [SRVOTE] + 4.2. Encoding Shared Random Values in the consensus [SRCONSENSUS] + 4.3. Persistent state format [STATEFORMAT] + 5. Security Analysis + 5.1. Security of commit-and-reveal and future directions + 5.2. Predicting the shared random value during reveal phase + 5.3. Partition attacks + 5.3.1. Partition attacks during commit phase + 5.3.2. Partition attacks during reveal phase + 6. Discussion + 6.1. Why the added complexity from proposal 225? + 6.2. Why do you do a commit-and-reveal protocol in 24 rounds? + 6.3. Why can't we recover if the 00:00UTC consensus fails? + 7. Acknowledgements + + +1. Introduction + +1.1. Motivation + + For the next generation hidden services project, we need the Tor network to + produce a fresh random value every day in such a way that it cannot be + predicted in advance or influenced by an attacker. + + Currently we need this random value to make the HSDir hash ring + unpredictable (#8244), which should resolve a wide class of hidden service + DoS attacks and should make it harder for people to gauge the popularity + and activity of target hidden services. Furthermore this random value can + be used by other systems in need of fresh global randomness like + Tor-related protocols (e.g. OnioNS) or even non-Tor-related (e.g. warrant + canaries). + +1.2. Previous work + + Proposal 225 specifies a commit-and-reveal protocol that can be run as an + external script and have the results be fed to the directory authorities. + However, directory authority operators feel unsafe running a third-party + script that opens TCP ports and accepts connections from the Internet. + Hence, this proposal aims to embed the commit-and-reveal idea in the Tor + voting process which should make it smoother to deploy and maintain. + +2. 
Overview + + This proposal alters the Tor consensus protocol such that a random number is + generated every midnight by the directory authorities during the regular voting + process. The distributed random generator scheme is based on the + commit-and-reveal technique. + + The proposal also specifies how the final shared random value is embedded + in consensus documents so that clients who need it can get it. + +2.1. Introduction to our commit-and-reveal protocol + + Every day, before voting for the consensus at 00:00UTC, each authority + generates a new random value and keeps it for the whole day. The authority + cryptographically hashes the random value and calls the output its + "commitment" value. The original random value is called the "reveal" value. + + The idea is that given a reveal value you can cryptographically confirm that + it corresponds to a given commitment value (by hashing it). However, given a + commitment value you should not be able to derive the underlying reveal + value. The construction of these values is specified in section [COMMITREVEAL]. + +2.2. Ten thousand feet view of the protocol + + Our commit-and-reveal protocol aims to produce a fresh shared random value + (denoted shared_random_value here and elsewhere) every day at 00:00UTC. The + final fresh random value is embedded in the consensus document at that + time. + + Our protocol has two phases and uses the hourly voting procedure of Tor. + Each phase lasts 12 hours, which means that 12 voting rounds happen in + between. In short, the protocol works as follows: + + Commit phase: + + Starting at 00:00UTC and for a period of 12 hours, authorities every + hour include their commitment in their votes. They also include any + received commitments from other authorities, if available. + + Reveal phase: + + At 12:00UTC, the reveal phase starts and lasts till the end of the + protocol at 00:00UTC. In this stage, authorities must reveal the value + they committed to in the previous phase.
The commitment and revealed + values from other authorities, when available, are also added to the + vote. + + Shared Randomness Calculation: + + At 00:00UTC, the shared random value is computed from the agreed + revealed values and added to the consensus. + + This concludes the commit-and-reveal protocol every day at 00:00UTC. + +2.3. How we use the consensus [CONS] + + The produced shared random values need to be readily available to + clients. For this reason we include them in the consensus documents. + + Every hour the consensus documents need to include the shared random value + of the day, as well as the shared random value of the previous day. That's + because either of these values might be needed at a given time for a Tor + client to access a hidden service according to section [TIME-OVERLAP] of + proposal 224. This means that both of these two values need to be included + in votes as well. + + Hence, consensuses need to include: + + (a) The shared random value of the current time period. + (b) The shared random value of the previous time period. + + For this, a new SR consensus method will be needed to indicate which + authorities support this new protocol. + +2.3.1. Inserting Shared Random Values in the consensus + + After voting happens, we need to be careful on how we pick which shared + random values (SRV) to put in the consensus, to avoid breaking the consensus + because of authorities having different views of the commit-and-reveal + protocol (because maybe they missed some rounds of the protocol). + + For this reason, authorities look at the received votes before creating a + consensus and employ the following logic: + + - First of all, they make sure that the agreed upon consensus method is + above the SR consensus method. + + - Authorities include an SRV in the consensus if and only if the SRV has + been voted by at least the majority of authorities. 
+ + - For the consensus at 00:00UTC, authorities include an SRV in the consensus + if and only if the SRV has been voted by at least AuthDirNumAgreements + authorities (where AuthDirNumAgreements is a newly introduced consensus + parameter). + + Authorities include in the consensus the most popular SRV that also + satisfies the above constraints. Otherwise, no SRV should be included. + + The above logic is used to make it harder to break the consensus by natural + partitioning causes. + + We use the AuthDirNumAgreements consensus parameter to enforce that a + _supermajority_ of dirauths supports the SR protocol during SRV creation, so + that even if a few of those dirauths drop offline in the middle of the run + the SR protocol does not get disturbed. We go to extra lengths to ensure + this because changing SRVs in the middle of the day has terrible + reachability consequences for hidden service clients. + +2.4. Persistent State of the Protocol [STATE] + + A directory authority needs to keep a persistent state on disk of the + ongoing protocol run. This allows an authority to join the protocol seamlessly + in the case of a reboot. + + During the commitment phase, it is populated with the commitments of all + authorities. Then during the reveal phase, the reveal values are also + stored in the state. + + As discussed previously, the shared random values from the current and + previous time period must also be present in the state at all times if they + are available. + +2.5. Protocol Illustration + + An illustration for better understanding the protocol can be found here: + + https://people.torproject.org/~asn/hs_notes/shared_rand.jpg + + It reads left-to-right. + + The illustration displays what the authorities (A_1, A_2, A_3) put in their + votes. A chain 'A_1 -> c_1 -> r_1' denotes that authority A_1 committed to + the value c_1 which corresponds to the reveal value r_1. + + The illustration depicts only a few rounds of the whole protocol.
It starts + with the first three rounds of the commit phase, then it jumps to the last + round of the commit phase. It continues with the first two rounds of the + reveal phase and then it jumps to the final round of the protocol run. It + finally shows the first round of the commit phase of the next protocol run + (00:00UTC) where the final Shared Random Value is computed. In our fictional + example, the SRV was computed with 3 authority contributions and its value + is "a56fg39h". + + We advise you to revisit this after you have read the whole document. + +3. Protocol + + In this section we give a detailed specification of the protocol. We + describe the protocol participants' logic and the messages they send. The + encoding of the messages is specified in the next section ([SPEC]). + + Now we go through the phases of the protocol: + +3.1. Commitment Phase [COMMITMENTPHASE] + + The commit phase lasts from 00:00UTC to 12:00UTC. + + During this phase, an authority commits a value in its vote and + saves it to the permanent state as well. + + Authorities also save any received authoritative commits by other authorities + in their permanent state. We call a commit by Alice "authoritative" if it was + included in Alice's vote. + +3.1.1. Voting During Commitment Phase + + During the commit phase, each authority includes in its votes: + + - The commitment value for this protocol run. + - Any authoritative commitments received from other authorities. + - The two previous shared random values produced by the protocol (if any). + + The commit phase lasts for 12 hours, so authorities have multiple chances to + commit their values. An authority MUST NOT commit a second value during a + subsequent round of the commit phase. + + If an authority publishes a second commitment value in the same commit + phase, only the first commitment should be taken into account by other + authorities. Any subsequent commitments MUST be ignored. + +3.1.2.
Persistent State During Commitment Phase [STATECOMMIT] + + During the commitment phase, authorities save in their persistent state the + authoritative commits they have received from each authority. Only one commit + per authority must be considered trusted and active at a given time. + +3.2. Reveal Phase + + The reveal phase lasts from 12:00UTC to 00:00UTC. + + Now that the commitments have been agreed on, it's time for authorities to + reveal their random values. + +3.2.1. Voting During Reveal Phase + + During the reveal phase, each authority includes in its votes: + + - Its reveal value that was previously committed in the commit phase. + - All the commitments and reveals received from other authorities. + - The two previous shared random values produced by the protocol (if any). + + The set of commitments has been decided during the commitment + phase and must remain the same. If an authority tries to change its + commitment during the reveal phase or introduce a new commitment, + the new commitment MUST be ignored. + +3.2.2. Persistent State During Reveal Phase [STATEREVEAL] + + During the reveal phase, authorities keep the authoritative commits from the + commit phase in their persistent state. They also save any received reveals + that correspond to authoritative commits and are valid (as specified in + [VALIDATEVALUES]). + + An authority that just received a reveal value from another authority's vote + MUST wait till the next voting round before including that reveal value in + its votes. + +3.3. Shared Random Value Calculation At 00:00UTC + + Finally, at 00:00UTC every day, authorities compute a fresh shared random + value and this value must be added to the consensus so clients can use it. + + Authorities calculate the shared random value using the reveal values in + their state as specified in subsection [SRCALC]. + + Authorities at 00:00UTC start including this new shared random value in + their votes, replacing the one from two protocol runs ago.
Authorities also + start including this new shared random value in the consensus as well. + + Apart from that, authorities at 00:00UTC proceed voting normally as they + would in the first round of the commitment phase (section [COMMITMENTPHASE]). + +3.3.1. Shared Randomness Calculation [SRCALC] + + An authority that wants to derive the shared random value SRV, should use + the appropriate reveal values for that time period and calculate SRV as + follows. + + HASHED_REVEALS = H(ID_a | R_a | ID_b | R_b | ..) + + SRV = SHA3-256("shared-random" | INT_8(REVEAL_NUM) | INT_4(VERSION) | + HASHED_REVEALS | PREVIOUS_SRV) + + where the ID_a value is the identity key fingerprint of authority 'a' and R_a + is the corresponding reveal value of that authority for the current period. + + Also, REVEAL_NUM is the number of revealed values in this construction, + VERSION is the protocol version number and PREVIOUS_SRV is the previous + shared random value. If no previous shared random value is known, then + PREVIOUS_SRV is set to 32 NUL (\x00) bytes. + + To maintain consistent ordering in HASHED_REVEALS, all the ID_a | R_a pairs + are ordered based on the R_a value in ascending order. + +3.4. Bootstrapping Procedure + + As described in [CONS], two shared random values are required for the HSDir + overlay periods to work properly as specified in proposal 224. Hence + clients MUST NOT use the randomness of this system till it has bootstrapped + completely; that is, until two shared random values are included in a + consensus. This should happen after three 00:00UTC consensuses have been + produced, which takes 48 hours. + +3.5. Rebooting Directory Authorities [REBOOT] + + The shared randomness protocol must be able to support directory + authorities who leave or join in the middle of the protocol execution. 
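As an illustration of the calculation in [SRCALC] above, here is a minimal Python sketch. It is not taken from the Tor source. It assumes that H is SHA3-256, that INT_8 and INT_4 are big-endian unsigned integers of 8 and 4 bytes, and that identities and reveal values are passed in as already-encoded byte strings; the function name compute_srv and the tie-break on identity are illustrative choices only.

```python
import hashlib
import struct


def compute_srv(reveals, previous_srv=None, version=1):
    """Sketch of the SRV calculation from [SRCALC].

    reveals: dict mapping an authority identity (bytes) to its reveal
             value (bytes). The byte encoding of both is an assumption.
    previous_srv: 32 bytes, or None if no previous value is known.
    """
    if previous_srv is None:
        # No previous SRV known: use 32 NUL bytes, as the text specifies.
        previous_srv = b"\x00" * 32

    # Order the ID | R pairs by reveal value, ascending. Breaking ties on
    # the identity is an addition of this sketch, to keep it deterministic.
    pairs = sorted(reveals.items(), key=lambda item: (item[1], item[0]))

    # HASHED_REVEALS = H(ID_a | R_a | ID_b | R_b | ..), assuming H = SHA3-256.
    hashed_reveals = hashlib.sha3_256(
        b"".join(identity + reveal for identity, reveal in pairs)
    ).digest()

    # SRV = SHA3-256("shared-random" | INT_8(REVEAL_NUM) | INT_4(VERSION) |
    #                HASHED_REVEALS | PREVIOUS_SRV)
    message = (
        b"shared-random"
        + struct.pack(">Q", len(pairs))   # INT_8, assumed big-endian
        + struct.pack(">I", version)      # INT_4, assumed big-endian
        + hashed_reveals
        + previous_srv
    )
    return hashlib.sha3_256(message).digest()
```

Because the pairs are sorted before hashing, two authorities holding the same set of reveals obtain the same value regardless of the order in which they learned them.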
+ + An authority that commits in the Commitment Phase and then leaves MUST have + stored its reveal value on disk so that it continues participating in the + protocol if it returns before or during the Reveal Phase. The reveal value + MUST be stored timestamped to avoid sending it on wrong protocol runs. + + An authority that misses the Commitment Phase cannot commit anymore, so it's + unable to participate in the protocol for that run. Same goes for an + authority that misses the Reveal phase. Authorities who do not participate in + the protocol SHOULD still carry commits and reveals of others in their vote. + + Finally, authorities MUST implement their persistent state in such a way that they + will never commit two different values in the same protocol run, even if they + have to reboot in the middle (assuming that their persistent state file is + kept). A suggested way to structure the persistent state is found at [STATEFORMAT]. + +4. Specification [SPEC] + +4.1. Voting + + This section describes how commitments, reveals and SR values are encoded in + votes. We describe how to encode both the authority's own + commitments/reveals and also the commitments/reveals received from the other + authorities. Commitments and reveals share the same line, but reveals are + optional. + + Participating authorities need to include the line: + + "shared-rand-participate" + + in their votes to announce that they take part in the protocol. + +4.1.1. Computing commitments and reveals [COMMITREVEAL] + + A directory authority that wants to participate in this protocol needs to + create a new pair of commitment/reveal values for every protocol + run. Authorities SHOULD generate a fresh pair of such values right before the + first commitment phase of the day (at 00:00UTC). + + The value REVEAL is computed as follows: + + REVEAL = base64-encode( TIMESTAMP || H(RN) ) + + where RN is the SHA3 hashed value of a 256-bit random value. 
We hash the + random value to avoid exposing raw bytes from our PRNG to the network (see + [RANDOM-REFS]). + + TIMESTAMP is an 8-byte network-endian time_t value. Authorities SHOULD + set TIMESTAMP to the valid-after time of the vote document they first plan + to publish their commit into (so usually at 00:00UTC, except if they start + up in a later commit round). + + The value COMMIT is computed as follows: + + COMMIT = base64-encode( TIMESTAMP || H(REVEAL) ) + +4.1.2. Validating commitments and reveals [VALIDATEVALUES] + + Given a COMMIT message and a REVEAL message it should be possible to verify + that they indeed correspond. To do so, the client extracts the random value + H(RN) from the REVEAL message, hashes it, and compares it with the H(H(RN)) + from the COMMIT message. We say that the COMMIT and REVEAL messages + correspond if the comparison was successful. + + Participants MUST also check that corresponding COMMIT and REVEAL values + have the same timestamp value. + + Authorities should ignore reveal values during the Reveal Phase that don't + correspond to commit values published during the Commitment Phase. + +4.1.4. Encoding commit/reveal values in votes [COMMITVOTE] + + An authority puts in its vote the commitments and reveals it has produced and + seen from the other authorities. To do so, it includes the following in its + votes: + + "shared-rand-commit" SP VERSION SP ALGNAME SP IDENTITY SP COMMIT [SP REVEAL] NL + + where VERSION is the version of the protocol the commit was created with. + IDENTITY is the authority's SHA1 identity fingerprint and COMMIT is the + encoded commit [COMMITREVEAL]. Authorities during the reveal phase can + also optionally include an encoded reveal value REVEAL. There MUST be only + one line per authority; otherwise the vote is considered invalid. Finally, the + ALGNAME is the hash algorithm that should be used to compute COMMIT and + REVEAL, which is "sha3-256" for version 1. + +4.1.5. 
Shared Random Value [SRVOTE] + + Authorities include a shared random value (SRV) in their votes using the + following encoding for the previous and current value respectively: + + "shared-rand-previous-value" SP NUM_REVEALS SP VALUE NL + "shared-rand-current-value" SP NUM_REVEALS SP VALUE NL + + where VALUE is the actual shared random value encoded in hex (computed as + specified in section [SRCALC]). NUM_REVEALS is the number of reveal values + used to generate this SRV. + + To maintain consistent ordering, the shared random values of the previous + period should be listed before the values of the current period. + +4.2. Encoding Shared Random Values in the consensus [SRCONSENSUS] + + Authorities insert the two active shared random values in the consensus + following the same encoding format as in [SRVOTE]. + +4.3. Persistent state format [STATEFORMAT] + + As a way to keep ground truth state in this protocol, an authority MUST + keep a persistent state of the protocol. The next sub-section suggests a + format for this state which is the same as the current state file format. + + It contains a preamble, a commitment and reveal section and a list of + shared random values. + + The preamble (or header) contains the following items. They MUST occur in + the order given here: + + "Version" SP version NL + + [At start, exactly once.] + + A document format version. For this specification, version is "1". + + "ValidUntil" SP YYYY-MM-DD SP HH:MM:SS NL + + [Exactly once] + + After this time, this state is expired and should not be used or + trusted. The validity time period is till the end of the current + protocol run (the upcoming noon). + + The following details the commitment and reveal section. They are encoded + the same as in the vote. This makes it easier for implementation purposes. + + "Commit" SP version SP algname SP identity SP commit [SP reveal] NL + + [Exactly once per authority] + + The values are the same as detailed in section [COMMITVOTE]. 
+ + This line is also used by an authority to store its own value. + + Finally, there is the shared random value section. + + "SharedRandPreviousValue" SP num_reveals SP value NL + + [At most once] + + This is the previous shared random value agreed on at the previous + period. The fields are the same as in section [SRVOTE]. + + "SharedRandCurrentValue" SP num_reveals SP value NL + + [At most once] + + This is the latest shared random value. The fields are the same as in + section [SRVOTE]. + +5. Security Analysis + +5.1. Security of commit-and-reveal and future directions + + The security of commit-and-reveal protocols is well understood, and has + certain flaws. Basically, the protocol is insecure to the extent that an + adversary who controls b of the authorities gets to choose among 2^b + outcomes for the result of the protocol. However, an attacker who is not a + dirauth should not be able to influence the outcome at all. + + We believe that this system offers sufficient security especially compared + to the current situation. More secure solutions require much more advanced + crypto and more complex protocols so this seems like an acceptable solution + for now. + + Here are some examples of possible future directions: + - Schemes based on threshold signatures (e.g. see [HOPPER]) + - Unicorn scheme by Lenstra et al. [UNICORN] + - Schemes based on Verifiable Delay Functions [VDFS] + + For more alternative approaches on collaborative random number generation + also see the discussion at [RNGMESSAGING]. + +5.2. Predicting the shared random value during reveal phase + + The reveal phase lasts 12 hours, and most authorities will send their + reveal value on the first round of the reveal phase. This means that an + attacker can predict the final shared random value about 12 hours before + it's generated. + + This does not pose a problem for the HSDir hash ring, since we impose a + higher uptime restriction on HSDir nodes, so 12 hours of predictability is not + an issue. 
+ + Any other protocols using the shared random value from this system should + be aware of this property. + +5.3. Partition attacks + + This design is not immune to certain partition attacks. We believe they + don't offer much gain to an attacker as they are very easy to detect and + difficult to pull off since an attacker would need to compromise a directory + authority at the very least. Also, because of the Byzantine generals problem, + it's very hard (even impossible in some cases) to protect against all such + attacks. Nevertheless, this section describes all possible partition attacks + and how to detect them. + +5.3.1. Partition attacks during commit phase + + A malicious directory authority could send its commit to only one + authority which results in that authority having an extra commit value for + the shared random calculation that the others don't have. Since the + consensus needs a majority, this won't affect the final SRV value. However, + the attacker, using this attack, could remove a single directory authority + from the consensus decision at 24:00 when the SRV is computed. + + An attacker could also partition the authorities by sending two different + commitment values to different authorities during the commit phase. + + All of the above is fairly easy to detect. Commitment values in the vote + coming from an authority should NEVER be different between authorities. If + so, this means an attack is ongoing or a very bad bug (highly unlikely). + +5.3.2. Partition attacks during reveal phase + + Let's consider Alice, a malicious directory authority. Alice could wait + until the last reveal round, and reveal its value to half of the + authorities. That would partition the authorities into two sets: the ones + who think that the shared random value should contain this new reveal, and + the rest who don't know about it. This would result in a tie and two + different shared random values. + + A similar attack is possible. 
For example, two rounds before the end of the + reveal phase, Alice could advertise her reveal value to only half of the + dirauths. This way, in the last reveal phase round, half of the dirauths + will include that reveal value in their votes and the others will not. In + the end of the reveal phase, half of the dirauths will calculate a + different shared randomness value than the others. + + We claim that this attack is not particularly fruitful: Alice ends up + having two shared random values to choose from which is a fundamental + problem of commit-and-reveal protocols as well (since the last person can + always abort or reveal). The attacker can also sabotage the consensus, but + there are other ways this can be done with the current voting system. + + Furthermore, we claim that such an attack is very noisy and detectable. + First of all, it requires the authority to sabotage two consensuses which + will cause quite some noise. Furthermore, the authority needs to send + different votes to different auths which is detectable. Like the commit + phase attack, the detection here is to make sure that the commitment values + in a vote coming from an authority are always the same for each authority. + +6. Discussion + +6.1. Why the added complexity from proposal 225? + + The complexity difference between this proposal and prop225 is in part + because prop225 doesn't specify how the shared random value gets to the + clients. This proposal spends lots of effort specifying how the two shared + random values can always be readily accessible to clients. + +6.2. Why do you do a commit-and-reveal protocol in 24 rounds? + + The reader might be wondering why we span the protocol over the course of a + whole day (24 hours), when only 3 rounds would be sufficient to generate a + shared random value. + + We decided to do it this way, because we piggyback on the Tor voting + protocol which also happens every hour. 
+ + We could instead only do the shared randomness protocol from 21:00 to 00:00 + every day. Or to do it multiple times a day. + + However, we decided that since the shared random value needs to be in every + consensus anyway, carrying the commitments/reveals as well will not be a + big problem. Also, this way we give more chances for a failing dirauth to + recover and rejoin the protocol. + +6.3. Why can't we recover if the 00:00UTC consensus fails? + + If the 00:00UTC consensus fails, there will be no shared random value for + the whole day. In theory, we could recover by calculating the shared + randomness of the day at 01:00UTC instead. However, the engineering issues + with adding such recovery logic are too great. For example, it's not easy + for an authority who just booted to learn whether a specific consensus + failed to be created. + +7. Acknowledgements + + Thanks to everyone who has contributed to this design with feedback and + discussion. + + Thanks go to arma, ioerror, kernelcorn, nickm, s7r, Sebastian, teor, weasel + and everyone else! + +References: + +[RANDOM-REFS]: + http://projectbullrun.org/dual-ec/ext-rand.html + https://lists.torproject.org/pipermail/tor-dev/2015-November/009954.html + +[RNGMESSAGING]: + https://moderncrypto.org/mail-archive/messaging/2015/002032.html + +[HOPPER]: + https://lists.torproject.org/pipermail/tor-dev/2014-January/006053.html + +[UNICORN]: + https://eprint.iacr.org/2015/366.pdf + +[VDFS]: + https://eprint.iacr.org/2018/601.pdf diff --git a/attic/text_formats/tor-spec.txt b/attic/text_formats/tor-spec.txt new file mode 100644 index 0000000..4d21c9a --- /dev/null +++ b/attic/text_formats/tor-spec.txt @@ -0,0 +1,2735 @@ + + Tor Protocol Specification + + Roger Dingledine + Nick Mathewson + +Table of Contents + + 0. Preliminaries + 0.1. Notation and encoding + 0.2. Security parameters + 0.3. Ciphers + 0.4. A bad hybrid encryption algorithm, for legacy purposes + 1. System overview + 1.1. Keys and names + 2. 
Connections + 2.1. Picking TLS ciphersuites + 2.2. TLS security considerations + 3. Cell Packet format + 4. Negotiating and initializing connections + 4.1. Negotiating versions with VERSIONS cells + 4.2. CERTS cells + 4.3. AUTH_CHALLENGE cells + 4.4. AUTHENTICATE cells + 4.4.1. Link authentication type 1: RSA-SHA256-TLSSecret + 4.4.2. Link authentication type 3: Ed25519-SHA256-RFC5705 + 4.5. NETINFO cells + 5. Circuit management + 5.1. CREATE and CREATED cells + 5.1.1. Choosing circuit IDs in create cells + 5.1.2. EXTEND and EXTENDED cells + 5.1.3. The "TAP" handshake + 5.1.4. The "ntor" handshake + 5.1.4.1. The "ntor-v3" handshake. + 5.1.5. CREATE_FAST/CREATED_FAST cells + 5.1.6. Additional data in CREATE/CREATED cells + 5.2. Setting circuit keys + 5.2.1. KDF-TOR + 5.2.2. KDF-RFC5869 + 5.3. Creating circuits + 5.3.1. Canonical connections + 5.4. Tearing down circuits + 5.5. Routing relay cells + 5.5.1. Circuit ID Checks + 5.5.2. Forward Direction + 5.5.2.1. Routing from the Origin + 5.5.2.2. Relaying Forward at Onion Routers + 5.5.3. Backward Direction + 5.5.3.1. Relaying Backward at Onion Routers + 5.5.4. Routing to the Origin + 5.6. Handling relay_early cells + 6. Application connections and stream management + 6.1. Relay cells + 6.1.1. Calculating the 'Digest' field + 6.2. Opening streams and transferring data + 6.2.1. Opening a directory stream + 6.3. Closing streams + 6.4. Remote hostname lookup + 7. Flow control + 7.1. Link throttling + 7.2. Link padding + 7.3. Circuit-level flow control + 7.3.1. SENDME Cell Format + 7.4. Stream-level flow control + 8. Handling resource exhaustion + 8.1. Memory exhaustion + 9. Subprotocol versioning + 9.1. "Link" + 9.2. "LinkAuth" + 9.3. "Relay" + 9.4. "HSIntro" + 9.5. "HSRend" + 9.6. "HSDir" + 9.7. "DirCache" + 9.8. "Desc" + 9.9. "Microdesc" + 9.10. "Cons" + 9.11. "Padding" + 9.12. "FlowCtrl" + +Note: This document aims to specify Tor as currently implemented, though it +may take it a little time to become fully up to date. 
Future versions of Tor +may implement improved protocols, and compatibility is not guaranteed. +We may or may not remove compatibility notes for other obsolete versions of +Tor as they become obsolete. + +This specification is not a design document; most design criteria +are not examined. For more information on why Tor acts as it does, +see tor-design.pdf. + +0. Preliminaries + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL + NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and + "OPTIONAL" in this document are to be interpreted as described in + RFC 2119. + +0.1. Notation and encoding + + KP -- a public key for an asymmetric cipher. + KS -- a private key for an asymmetric cipher. + K -- a key for a symmetric cipher. + N -- a "nonce", a random value, usually deterministically chosen + from other inputs using hashing. + + a|b -- concatenation of 'a' and 'b'. + + [A0 B1 C2] -- a three-byte sequence, containing the bytes with + hexadecimal values A0, B1, and C2, in that order. + + H(m) -- a cryptographic hash of m. + + We use "byte" and "octet" interchangeably. Possibly we shouldn't. + + Some specs mention "base32". This means RFC4648, without "=" padding. + +0.1.1. Encoding integers + + Unless we explicitly say otherwise below, all numeric values in the + Tor protocol are encoded in network (big-endian) order. So a "32-bit + integer" means a big-endian 32-bit integer; a "2-byte" integer means + a big-endian 16-bit integer, and so forth. + +0.2. Security parameters + + Tor uses a stream cipher, a public-key cipher, the Diffie-Hellman + protocol, and a hash function. + + KEY_LEN -- the length of the stream cipher's key, in bytes. + + KP_ENC_LEN -- the length of a public-key encrypted message, in bytes. + KP_PAD_LEN -- the number of bytes added in padding for public-key + encryption, in bytes. (The largest number of bytes that can be encrypted + in a single public-key operation is therefore KP_ENC_LEN-KP_PAD_LEN.) 
+ + DH_LEN -- the number of bytes used to represent a member of the + Diffie-Hellman group. + DH_SEC_LEN -- the number of bytes used in a Diffie-Hellman private key (x). + + HASH_LEN -- the length of the hash function's output, in bytes. + + PAYLOAD_LEN -- The longest allowable cell payload, in bytes. (509) + + CELL_LEN(v) -- The length of a Tor cell, in bytes, for link protocol + version v. + CELL_LEN(v) = 512 if v is less than 4; + = 514 otherwise. + +0.3. Ciphers + + These are the ciphers we use _unless otherwise specified_. Several of + them are deprecated for new use. + + For a stream cipher, unless otherwise specified, we use 128-bit AES in + counter mode, with an IV of all 0 bytes. (We also require AES256.) + + For a public-key cipher, unless otherwise specified, we use RSA with + 1024-bit keys and a fixed exponent of 65537. We use OAEP-MGF1 + padding, with SHA-1 as its digest function. We leave the optional + "Label" parameter unset. (For OAEP padding, see + ftp://ftp.rsasecurity.com/pub/pkcs/pkcs-1/pkcs-1v2-1.pdf) + + We also use the Curve25519 group and the Ed25519 signature format in + several places. + + For Diffie-Hellman, unless otherwise specified, we use a generator + (g) of 2. For the modulus (p), we use the 1024-bit safe prime from + rfc2409 section 6.2 whose hex representation is: + + "FFFFFFFFFFFFFFFFC90FDAA22168C234C4C6628B80DC1CD129024E08" + "8A67CC74020BBEA63B139B22514A08798E3404DDEF9519B3CD3A431B" + "302B0A6DF25F14374FE1356D6D51C245E485B576625E7EC6F44C42E9" + "A637ED6B0BFF5CB6F406B7EDEE386BFB5A899FA5AE9F24117C4B1FE6" + "49286651ECE65381FFFFFFFFFFFFFFFF" + + As an optimization, implementations SHOULD choose DH private keys (x) of + 320 bits. Implementations that do this MUST never use any DH key more + than once. + [May other implementations reuse their DH keys?? -RD] + [Probably not. Conceivably, you could get away with changing DH keys once + per second, but there are too many oddball attacks for me to be + comfortable that this is safe. 
-NM] + + For a hash function, unless otherwise specified, we use SHA-1. + + KEY_LEN=16. + DH_LEN=128; DH_SEC_LEN=40. + KP_ENC_LEN=128; KP_PAD_LEN=42. + HASH_LEN=20. + + We also use SHA256 and SHA3-256 in some places. + + When we refer to "the hash of a public key", unless otherwise + specified, we mean the SHA-1 hash of the DER encoding of an ASN.1 RSA + public key (as specified in PKCS.1). + + All "random" values MUST be generated with a cryptographically + strong pseudorandom number generator seeded from a strong entropy + source, unless otherwise noted. + +0.4. A bad hybrid encryption algorithm, for legacy purposes. + + Some specifications will refer to the "legacy hybrid encryption" of a + byte sequence M with a public key KP. It is computed as follows: + + 1. If the length of M is no more than KP_ENC_LEN-KP_PAD_LEN, + pad and encrypt M with KP. + 2. Otherwise, generate a KEY_LEN byte random key K. + Let M1 = the first KP_ENC_LEN-KP_PAD_LEN-KEY_LEN bytes of M, + and let M2 = the rest of M. + Pad and encrypt K|M1 with KP. Encrypt M2 with our stream cipher, + using the key K. Concatenate these encrypted values. + + Note that this "hybrid encryption" approach does not prevent + an attacker from adding or removing bytes to the end of M. It also + allows attackers to modify the bytes not covered by the OAEP -- + see Goldberg's PET2006 paper for details. Do not use it as the basis + for new protocols! Also note that as used in Tor's protocols, case 1 + never occurs. + +1. System overview + + Tor is a distributed overlay network designed to anonymize + low-latency TCP-based applications such as web browsing, secure shell, + and instant messaging. Clients choose a path through the network and + build a ``circuit'', in which each node (or ``onion router'' or ``OR'') + in the path knows its predecessor and successor, but no other nodes in + the circuit. 
Traffic flowing down the circuit is sent in fixed-size + ``cells'', which are unwrapped by a symmetric key at each node (like + the layers of an onion) and relayed downstream. + +1.1. Keys and names + + Every Tor relay has multiple public/private keypairs: + + These are 1024-bit RSA keys: + + - A long-term signing-only "Identity key" used to sign documents and + certificates, and used to establish relay identity. + KP_relayid_rsa, KS_relayid_rsa. + - A medium-term TAP "Onion key" used to decrypt onion skins when accepting + circuit extend attempts. (See 5.1.) Old keys MUST be accepted for a + while after they are no longer advertised. Because of this, + relays MUST retain old keys for a while after they're rotated. (See + "onion key lifetime parameters" in dir-spec.txt.) + KP_onion_tap, KS_onion_tap. + - A short-term "Connection key" used to negotiate TLS connections. + Tor implementations MAY rotate this key as often as they like, and + SHOULD rotate this key at least once a day. + KP_conn_tls, KS_conn_tls. + + This is a Curve25519 key: + + - A medium-term ntor "Onion key" used to handle onion key handshakes when + accepting incoming circuit extend requests. As with TAP onion keys, + old ntor keys MUST be accepted for at least one week after they are no + longer advertised. Because of this, relays MUST retain old keys for a + while after they're rotated. (See "onion key lifetime parameters" in + dir-spec.txt.) + KP_ntor, KS_ntor. + + These are Ed25519 keys: + + - A long-term "master identity" key. This key never + changes; it is used only to sign the "signing" key below. It may be + kept offline. + KP_relayid_ed, KS_relayid_ed. + - A medium-term "signing" key. This key is signed by the master identity + key, and must be kept online. A new one should be generated + periodically. It signs nearly everything else. + KP_relaysign_ed, KS_relaysign_ed. + - A short-term "link authentication" key, used to authenticate + the link handshake: see section 4 below. 
This key is signed + by the "signing" key, and should be regenerated frequently. + KP_link_ed, KS_link_ed. + + KP_relayid_* together identify a router uniquely. Once a router + has used a KP_relayid_ed (an Ed25519 master identity key) + together with a given KP_relayid_rsa (RSA identity key), neither of + those keys may ever be used with a different key. + + We write KP_relayid to refer to a key which is either + KP_relayid_rsa or KP_relayid_ed. + + The same key or keypair should never be used for separate roles within + the Tor protocol suite, unless specifically stated. For example, + a relay's identity keys K_relayid should not also be used as the + identity keypair for a hidden service K_hs_id (see rend-spec-v3.txt). + +2. Connections + + Connections between two Tor relays, or between a client and a relay, + use TLS/SSLv3 for link authentication and encryption. All + implementations MUST support the SSLv3 ciphersuite + "TLS_DHE_RSA_WITH_AES_128_CBC_SHA" if it is available. They SHOULD + support better ciphersuites if available. + + There are three ways to perform TLS handshakes with a Tor server. In + the first way, "certificates-up-front", both the initiator and + responder send a two-certificate chain as part of their initial + handshake. (This is supported in all Tor versions.) In the second + way, "renegotiation", the responder provides a single certificate, + and the initiator immediately performs a TLS renegotiation. (This is + supported in Tor 0.2.0.21 and later.) And in the third way, + "in-protocol", the initial TLS negotiation completes, and the + parties bootstrap themselves to mutual authentication via use of the + Tor protocol without further TLS handshaking. (This is supported in + 0.2.3.6-alpha and later.) + + Each of these options provides a way for the parties to learn it is + available: a client does not need to know the version of the Tor + server in order to connect to it properly. 
+ + In "certificates up-front" (a.k.a "the v1 handshake"), + the connection initiator always sends a + two-certificate chain, consisting of an X.509 certificate using a + short-term connection public key and a second, self-signed X.509 + certificate containing its identity key. The other party sends a similar + certificate chain. The initiator's ClientHello MUST NOT include any + ciphersuites other than: + + TLS_DHE_RSA_WITH_AES_256_CBC_SHA + TLS_DHE_RSA_WITH_AES_128_CBC_SHA + SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA + + In "renegotiation" (a.k.a. "the v2 handshake"), + the connection initiator sends no certificates, and + the responder sends a single connection certificate. Once the TLS + handshake is complete, the initiator renegotiates the handshake, with each + party sending a two-certificate chain as in "certificates up-front". + The initiator's ClientHello MUST include at least one ciphersuite not in + the list above -- that's how the initiator indicates that it can + handle this handshake. For other considerations on the initiator's + ClientHello, see section 2.1 below. + + In "in-protocol" (a.k.a. "the v3 handshake"), the initiator sends no + certificates, and the + responder sends a single connection certificate. The choice of + ciphersuites must be as in a "renegotiation" handshake. There are + additionally a set of constraints on the connection certificate, + which the initiator can use to learn that the in-protocol handshake + is in use. Specifically, at least one of these properties must be + true of the certificate: + + * The certificate is self-signed + * Some component other than "commonName" is set in the subject or + issuer DN of the certificate. + * The commonName of the subject or issuer of the certificate ends + with a suffix other than ".net". + * The certificate's public key modulus is longer than 1024 bits. 
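The four certificate properties listed above can be checked with a small predicate. The following Python sketch is illustrative only: CertInfo is a hypothetical, already-parsed summary of the responder's connection certificate (not a type from any TLS library), and treating a DN without a commonName as not matching the ".net" rule is an assumption of this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class CertInfo:
    """Hypothetical parsed view of an X.509 certificate."""
    self_signed: bool
    modulus_bits: int
    # DN components as {attribute name: value}, e.g. {"commonName": "www.x.net"}
    subject_dn: dict = field(default_factory=dict)
    issuer_dn: dict = field(default_factory=dict)


def indicates_in_protocol(cert):
    """Return True if at least one of the listed properties holds."""
    # The certificate is self-signed.
    if cert.self_signed:
        return True
    for dn in (cert.subject_dn, cert.issuer_dn):
        # Some component other than "commonName" is set in the DN.
        if any(name != "commonName" for name in dn):
            return True
        # The commonName ends with a suffix other than ".net".
        common_name = dn.get("commonName")
        if common_name is not None and not common_name.endswith(".net"):
            return True
    # The public key modulus is longer than 1024 bits.
    return cert.modulus_bits > 1024
```

A certificate for which this returns False is one that matches none of the properties above, so the initiator would proceed as in "renegotiation".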
+ + The initiator then sends a VERSIONS cell to the responder, which then + replies with a VERSIONS cell; they have then negotiated a Tor + protocol version. Assuming that the version they negotiate is 3 or higher + (the only ones specified for use with this handshake right now), the + responder sends a CERTS cell, an AUTH_CHALLENGE cell, and a NETINFO + cell to the initiator, which may send either CERTS, AUTHENTICATE, + NETINFO if it wants to authenticate, or just NETINFO if it does not. + + For backward compatibility between later handshakes and "certificates + up-front", the ClientHello of an initiator that supports a later + handshake MUST include at least one ciphersuite other than those listed + above. The connection responder examines the initiator's ciphersuite list + to see whether it includes any ciphers other than those included in the + list above. If extra ciphers are included, the responder proceeds as in + "renegotiation" and "in-protocol": it sends a single certificate and + does not request + client certificates. Otherwise (in the case that no extra ciphersuites + are included in the ClientHello) the responder proceeds as in + "certificates up-front": it requests client certificates, and sends a + two-certificate chain. In either case, once the responder has sent its + certificate or certificates, the initiator counts them. If two + certificates have been sent, it proceeds as in "certificates up-front"; + otherwise, it proceeds as in "renegotiation" or "in-protocol". + + To decide whether to do "renegotiation" or "in-protocol", the + initiator checks whether the responder's initial certificate matches + the criteria listed above. + + All new relay implementations of the Tor protocol MUST support + backwards-compatible renegotiation; clients SHOULD do this too. 
If + this is not possible, new client implementations MUST support both + "renegotiation" and "in-protocol" and use the router's + published link protocols list (see dir-spec.txt on the "protocols" entry) + to decide which to use. + + In all of the above handshake variants, certificates sent in the clear + SHOULD NOT include any strings to identify the host as a Tor relay. In + the "renegotiation" and "backwards-compatible renegotiation" steps, the + initiator SHOULD choose a list of ciphersuites and TLS extensions + to mimic one used by a popular web browser. + + Even though the connection protocol is identical, we will think of the + initiator as either an onion router (OR) if it is willing to relay + traffic for other Tor users, or an onion proxy (OP) if it only handles + local requests. Onion proxies SHOULD NOT provide long-term-trackable + identifiers in their handshakes. + + In all handshake variants, once all certificates are exchanged, all + parties receiving certificates must confirm that the identity key is as + expected. If the key is not as expected, the party must close the + connection. + + (When initiating a connection, if a reasonably live consensus is + available, then the expected identity key is taken from that + consensus. But when initiating a connection otherwise, the expected + identity key is the one given in the hard-coded authority or + fallback list. Finally, when creating a connection because of an + EXTEND/EXTEND2 cell, the expected identity key is the one given in + the cell.) + + When connecting to an OR, all parties SHOULD reject the connection if that + OR has a malformed or missing certificate. When accepting an incoming + connection, an OR SHOULD NOT reject incoming connections from parties with + malformed or missing certificates. (However, an OR should not believe + that an incoming connection is from another OR unless the certificates + are present and well-formed.) 
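The handshake-selection rules given earlier in this section (the responder inspects the ClientHello ciphersuite list; the initiator counts the certificates it receives) can be sketched as follows. This is a simplified Python illustration, not the Tor implementation: the function names are hypothetical, ciphersuites are represented by their names as strings, and the sketch ignores any ciphersuites that are meant to be excluded from the check.

```python
# The only ciphersuites allowed in a "certificates up-front" ClientHello.
V1_CIPHERSUITES = frozenset({
    "TLS_DHE_RSA_WITH_AES_256_CBC_SHA",
    "TLS_DHE_RSA_WITH_AES_128_CBC_SHA",
    "SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA",
})


def responder_uses_certificates_up_front(client_ciphersuites):
    """Responder side: True if no ciphersuite outside the v1 list is offered.

    In that case the responder requests client certificates and sends a
    two-certificate chain; otherwise it sends a single certificate.
    """
    return all(suite in V1_CIPHERSUITES for suite in client_ciphersuites)


def initiator_handshake(num_responder_certs, cert_indicates_in_protocol):
    """Initiator side: pick the handshake from what the responder sent.

    cert_indicates_in_protocol is the result of checking the responder's
    initial certificate against the criteria listed for "in-protocol".
    """
    if num_responder_certs == 2:
        return "certificates up-front"
    if cert_indicates_in_protocol:
        return "in-protocol"
    return "renegotiation"
```

For example, a ClientHello that offers only the three suites above leads the responder down the "certificates up-front" path, while adding any other suite selects one of the later handshakes.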
+ + [Before version 0.1.2.8-rc, ORs rejected incoming connections from ORs and + OPs alike if their certificates were missing or malformed.] + + Once a TLS connection is established, the two sides send cells + (specified below) to one another. Cells are sent serially. Standard + cells are CELL_LEN(link_proto) bytes long, but variable-length cells + also exist; see Section 3. Cells may be sent embedded in TLS records + of any size or divided across TLS records, but the framing of TLS + records MUST NOT leak information about the type or contents of the + cells. + + TLS connections are not permanent. Either side MAY close a connection + if there are no circuits running over it and an amount of time + (KeepalivePeriod, defaults to 5 minutes) has passed since the last time + any traffic was transmitted over the TLS connection. Clients SHOULD + also hold a TLS connection with no circuits open, if it is likely that a + circuit will be built soon using that connection. + + Client-only Tor instances are encouraged to avoid using handshake + variants that include certificates, if those certificates provide + any persistent tags to the relays they contact. If clients do use + certificates, they SHOULD NOT keep using the same certificates when + their IP address changes. Clients MAY send certificates using any + of the above handshake variants. + +2.1. Picking TLS ciphersuites + + Clients SHOULD send a ciphersuite list chosen to emulate some popular + web browser or other program common on the internet. Clients may send + the "Fixed Ciphersuite List" below. If they do not, they MUST NOT + advertise any ciphersuite that they cannot actually support, unless that + cipher is one not supported by OpenSSL 1.0.1. 
+ + The fixed ciphersuite list is: + + TLS1_ECDHE_ECDSA_WITH_AES_256_CBC_SHA + TLS1_ECDHE_RSA_WITH_AES_256_CBC_SHA + TLS1_DHE_RSA_WITH_AES_256_SHA + TLS1_DHE_DSS_WITH_AES_256_SHA + TLS1_ECDH_RSA_WITH_AES_256_CBC_SHA + TLS1_ECDH_ECDSA_WITH_AES_256_CBC_SHA + TLS1_RSA_WITH_AES_256_SHA + TLS1_ECDHE_ECDSA_WITH_RC4_128_SHA + TLS1_ECDHE_ECDSA_WITH_AES_128_CBC_SHA + TLS1_ECDHE_RSA_WITH_RC4_128_SHA + TLS1_ECDHE_RSA_WITH_AES_128_CBC_SHA + TLS1_DHE_RSA_WITH_AES_128_SHA + TLS1_DHE_DSS_WITH_AES_128_SHA + TLS1_ECDH_RSA_WITH_RC4_128_SHA + TLS1_ECDH_RSA_WITH_AES_128_CBC_SHA + TLS1_ECDH_ECDSA_WITH_RC4_128_SHA + TLS1_ECDH_ECDSA_WITH_AES_128_CBC_SHA + SSL3_RSA_RC4_128_MD5 + SSL3_RSA_RC4_128_SHA + TLS1_RSA_WITH_AES_128_SHA + TLS1_ECDHE_ECDSA_WITH_DES_192_CBC3_SHA + TLS1_ECDHE_RSA_WITH_DES_192_CBC3_SHA + SSL3_EDH_RSA_DES_192_CBC3_SHA + SSL3_EDH_DSS_DES_192_CBC3_SHA + TLS1_ECDH_RSA_WITH_DES_192_CBC3_SHA + TLS1_ECDH_ECDSA_WITH_DES_192_CBC3_SHA + SSL3_RSA_FIPS_WITH_3DES_EDE_CBC_SHA + SSL3_RSA_DES_192_CBC3_SHA + [*] The "extended renegotiation is supported" ciphersuite, 0x00ff, is + not counted when checking the list of ciphersuites. + + If the client sends the Fixed Ciphersuite List, the responder MUST NOT + select any ciphersuite besides TLS_DHE_RSA_WITH_AES_256_CBC_SHA, + TLS_DHE_RSA_WITH_AES_128_CBC_SHA, and SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA: + such ciphers might not actually be supported by the client. + + If the client sends a v2+ ClientHello with a list of ciphers other than + the Fixed Ciphersuite List, the responder can trust that the client + supports every cipher advertised in that list, so long as that ciphersuite + is also supported by OpenSSL 1.0.1. + + Responders MUST NOT select any TLS ciphersuite that lacks ephemeral keys, + or whose symmetric keys are less than KEY_LEN bits, or whose digests are + less than HASH_LEN bits. Responders SHOULD NOT select any SSLv3 + ciphersuite other than the DHE+3DES suites listed above. + +2.2. 
TLS security considerations + + Implementations MUST NOT allow TLS session resumption -- it can + exacerbate some attacks (e.g. the "Triple Handshake" attack from + Feb 2013), and it plays havoc with forward secrecy guarantees. + + Implementations SHOULD NOT allow TLS compression -- although we don't + know a way to apply a CRIME-style attack to current Tor directly, + it's a waste of resources. + +3. Cell Packet format + + The basic unit of communication for onion routers and onion + proxies is a fixed-width "cell". + + On a version 1 connection, each cell contains the following + fields: + + CircID [CIRCID_LEN bytes] + Command [1 byte] + Payload (padded with padding bytes) [PAYLOAD_LEN bytes] + + On a version 2 or higher connection, all cells are as in version 1 + connections, except for variable-length cells, whose format is: + + CircID [CIRCID_LEN octets] + Command [1 octet] + Length [2 octets; big-endian integer] + Payload (some commands MAY pad) [Length bytes] + + Most variable-length cells MAY be padded with padding bytes, except + for VERSIONS cells, which MUST NOT contain any additional bytes. + (The payload of VPADDING cells consists of padding bytes.) + + On a version 2 connection, variable-length cells are indicated by a + command byte equal to 7 ("VERSIONS"). On a version 3 or + higher connection, variable-length cells are indicated by a command + byte equal to 7 ("VERSIONS"), or greater than or equal to 128. + + CIRCID_LEN is 2 for link protocol versions 1, 2, and 3. CIRCID_LEN + is 4 for link protocol version 4 or higher. The first VERSIONS cell, + and any cells sent before the first VERSIONS cell, always have + CIRCID_LEN == 2 for backward compatibility. + + The CircID field determines which circuit, if any, the cell is + associated with. 
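The framing rules above can be sketched as a small parser. This is a minimal illustration and not the reference implementation: the helper names (circid_len, is_variable_length, parse_cell) are hypothetical, PAYLOAD_LEN is taken to be 509 as defined earlier in this spec, and CircID is assumed to be encoded big-endian like the Length field. Before a link protocol has been negotiated, a caller would presumably pass link_proto=3, which yields CIRCID_LEN == 2 and accepts both VERSIONS and commands of 128 or higher as variable-length.

```python
import struct

PAYLOAD_LEN = 509  # assumption: value defined earlier in this spec


def circid_len(link_proto):
    """CIRCID_LEN is 2 for link protocols 1-3 and 4 for protocol 4 or higher."""
    return 4 if link_proto >= 4 else 2


def is_variable_length(command, link_proto):
    """Apply the command-byte rules for variable-length cells."""
    if link_proto >= 2 and command == 7:       # VERSIONS
        return True
    return link_proto >= 3 and command >= 128


def parse_cell(buf, link_proto):
    """Return (circ_id, command, payload, rest), or None if buf is incomplete."""
    clen = circid_len(link_proto)
    if len(buf) < clen + 1:
        return None
    circ_id = int.from_bytes(buf[:clen], "big")  # assumed big-endian
    command = buf[clen]
    if is_variable_length(command, link_proto):
        if len(buf) < clen + 3:
            return None
        (length,) = struct.unpack_from(">H", buf, clen + 1)
        start = clen + 3
        end = start + length
    else:
        start = clen + 1
        end = start + PAYLOAD_LEN
    if len(buf) < end:
        return None
    return circ_id, command, buf[start:end], buf[end:]
```

Returning the unconsumed remainder lets a caller loop over a receive buffer that holds several cells or a partial one.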
+ + The 'Command' field of a fixed-length cell holds one of the following + values: + + 0 -- PADDING (Padding) (See Sec 7.2) + 1 -- CREATE (Create a circuit) (See Sec 5.1) + 2 -- CREATED (Acknowledge create) (See Sec 5.1) + 3 -- RELAY (End-to-end data) (See Sec 5.5 and 6) + 4 -- DESTROY (Stop using a circuit) (See Sec 5.4) + 5 -- CREATE_FAST (Create a circuit, no KP) (See Sec 5.1) + 6 -- CREATED_FAST (Circuit created, no KP) (See Sec 5.1) + 8 -- NETINFO (Time and address info) (See Sec 4.5) + 9 -- RELAY_EARLY (End-to-end data; limited)(See Sec 5.6) + 10 -- CREATE2 (Extended CREATE cell) (See Sec 5.1) + 11 -- CREATED2 (Extended CREATED cell) (See Sec 5.1) + 12 -- PADDING_NEGOTIATE (Padding negotiation) (See Sec 7.2) + + Variable-length command values are: + + 7 -- VERSIONS (Negotiate proto version) (See Sec 4) + 128 -- VPADDING (Variable-length padding) (See Sec 7.2) + 129 -- CERTS (Certificates) (See Sec 4.2) + 130 -- AUTH_CHALLENGE (Challenge value) (See Sec 4.3) + 131 -- AUTHENTICATE (Client authentication)(See Sec 4.4) + 132 -- AUTHORIZE (Client authorization) (Not yet used) + + The interpretation of 'Payload' depends on the type of the cell. + + VPADDING/PADDING: + Payload contains padding bytes. + CREATE/CREATE2: Payload contains the handshake challenge. + CREATED/CREATED2: Payload contains the handshake response. + RELAY/RELAY_EARLY: Payload contains the relay header and relay body. + DESTROY: Payload contains a reason for closing the circuit. + (see 5.4) + + Upon receiving any other value for the command field, an OR must + drop the cell. Since more cell types may be added in the future, ORs + should generally not warn when encountering unrecognized commands. + + The cell is padded up to the cell length with padding bytes. + + Senders set padding bytes depending on the cell's command: + + VERSIONS: Payload MUST NOT contain padding bytes. + AUTHORIZE: Payload is unspecified and reserved for future use. 
+ Other variable-length cells: + Payload MAY contain padding bytes at the end of the cell. + Padding bytes SHOULD be set to NUL. + RELAY/RELAY_EARLY: Payload MUST be padded to PAYLOAD_LEN with padding + bytes. Padding bytes SHOULD be set to random values. + Other fixed-length cells: + Payload MUST be padded to PAYLOAD_LEN with padding bytes. + Padding bytes SHOULD be set to NUL. + + We recommend random padding in RELAY/RELAY_EARLY cells, so that the cell + content is unpredictable. See the format of relay cells in section 6.1 + for detail. + + For other cells, TLS authenticates cell content, so randomized padding + bytes are redundant. + + Receivers MUST ignore padding bytes. + + PADDING cells are currently used to implement connection keepalive. + If there is no other traffic, ORs and OPs send one another a PADDING + cell every few minutes. + + CREATE, CREATE2, CREATED, CREATED2, and DESTROY cells are used to + manage circuits; see section 5 below. + + RELAY cells are used to send commands and data along a circuit; see + section 6 below. + + VERSIONS and NETINFO cells are used to set up connections in link + protocols v2 and higher; in link protocol v3 and higher, CERTS, + AUTH_CHALLENGE, and AUTHENTICATE may also be used. See section 4 + below. + +4. Negotiating and initializing connections + + After Tor instances negotiate handshake with either the "renegotiation" or + "in-protocol" handshakes, they must exchange a set of cells to set up + the Tor connection and make it "open" and usable for circuits. + + When the renegotiation handshake is used, both parties immediately + send a VERSIONS cell (4.1 below), and after negotiating a link + protocol version (which will be 2), each send a NETINFO cell (4.5 + below) to confirm their addresses and timestamps. No other intervening + cell types are allowed. + + When the in-protocol handshake is used, the initiator sends a + VERSIONS cell to indicate that it will not be renegotiating. 
The + responder sends a VERSIONS cell, a CERTS cell (4.2 below) to give the + initiator the certificates it needs to learn the responder's + identity, an AUTH_CHALLENGE cell (4.3) that the initiator must include + as part of its answer if it chooses to authenticate, and a NETINFO + cell (4.5). As soon as it gets the CERTS cell, the initiator knows + whether the responder is correctly authenticated. At this point the + initiator behaves differently depending on whether it wants to + authenticate or not. If it does not want to authenticate, it MUST + send a NETINFO cell. If it does want to authenticate, it MUST send a + CERTS cell, an AUTHENTICATE cell (4.4), and a NETINFO. When this + handshake is in use, the first cell must be VERSIONS, VPADDING, or + AUTHORIZE, and no other cell type is allowed to intervene besides + those specified, except for VPADDING cells. + + The AUTHORIZE cell type is reserved for future use by scanning-resistance + designs. + + [Tor versions before 0.2.3.11-alpha did not recognize the AUTHORIZE cell, + and did not permit any command other than VERSIONS as the first cell of + the in-protocol handshake.] + +4.1. Negotiating versions with VERSIONS cells + + There are multiple instances of the Tor link connection protocol. Any + connection negotiated using the "certificates up front" handshake (see + section 2 above) is "version 1". In any connection where both parties + have behaved as in the "renegotiation" handshake, the link protocol + version must be 2. In any connection where both parties have behaved + as in the "in-protocol" handshake, the link protocol must be 3 or higher. + + To determine the version, in any connection where the "renegotiation" + or "in-protocol" handshake was used (that is, where the responder + sent only one certificate at first and where the initiator did not + send any certificates in the first negotiation), both parties MUST + send a VERSIONS cell. 
In "renegotiation", they send a VERSIONS cell + right after the renegotiation is finished, before any other cells are + sent. In "in-protocol", the initiator sends a VERSIONS cell + immediately after the initial TLS handshake, and the responder + replies immediately with a VERSIONS cell. (As an exception to this rule, + if both sides support the "in-protocol" handshake, either side may send + VPADDING cells at any time.) + + The payload in a VERSIONS cell is a series of big-endian two-byte + integers. Both parties MUST select as the link protocol version the + highest number contained both in the VERSIONS cell they sent and in the + VERSIONS cell they received. If they have no such version in common, + they cannot communicate and MUST close the connection. Either party MUST + close the connection if the VERSIONS cell is not well-formed (for example, + if the payload contains an odd number of bytes). + + Any VERSIONS cells sent after the first VERSIONS cell MUST be ignored. + (To be interpreted correctly, later VERSIONS cells MUST have a CIRCID_LEN + matching the version negotiated with the first VERSIONS cell.) + + Since the version 1 link protocol does not use the "renegotiation" + handshake, implementations MUST NOT list version 1 in their VERSIONS + cell. When the "renegotiation" handshake is used, implementations + MUST list only the version 2. When the "in-protocol" handshake is + used, implementations MUST NOT list any version before 3, and SHOULD + list at least version 3. + + Link protocol differences are: + + 1 -- The "certs up front" handshake. + 2 -- Uses the renegotiation-based handshake. Introduces + variable-length cells. + 3 -- Uses the in-protocol handshake. + 4 -- Increases circuit ID width to 4 bytes. + 5 -- Adds support for link padding and negotiation (padding-spec.txt). + + +4.2. CERTS cells + + The CERTS cell describes the keys that a Tor instance is claiming + to have. It is a variable-length cell. 
Its payload format is: + + N: Number of certs in cell [1 octet] + N times: + CertType [1 octet] + CLEN [2 octets] + Certificate [CLEN octets] + + Any extra octets at the end of a CERTS cell MUST be ignored. + + Relevant certType values are: + 1: Link key certificate certified by RSA1024 identity + 2: RSA1024 Identity certificate, self-signed. + 3: RSA1024 AUTHENTICATE cell link certificate, signed with RSA1024 key. + 4: Ed25519 signing key, signed with identity key. + 5: TLS link certificate, signed with ed25519 signing key. + 6: Ed25519 AUTHENTICATE cell key, signed with ed25519 signing key. + 7: Ed25519 identity, signed with RSA identity. + + The certificate format for certificate types 1-3 is DER encoded + X509. For others, the format is as documented in cert-spec.txt. + Note that type 7 uses a different format from types 4-6. + + A CERTS cell may have no more than one certificate of each CertType. + + + To authenticate the responder as having a given Ed25519,RSA identity key + combination, the initiator MUST check the following. + + * The CERTS cell contains exactly one CertType 2 "ID" certificate. + * The CERTS cell contains exactly one CertType 4 Ed25519 + "Id->Signing" cert. + * The CERTS cell contains exactly one CertType 5 Ed25519 + "Signing->link" certificate. + * The CERTS cell contains exactly one CertType 7 "RSA->Ed25519" + cross-certificate. + * All X.509 certificates above have validAfter and validUntil dates; + no X.509 or Ed25519 certificates are expired. + * All certificates are correctly signed. + * The certified key in the Signing->Link certificate matches the + SHA256 digest of the certificate that was used to + authenticate the TLS connection. + * The identity key listed in the ID->Signing cert was used to + sign the ID->Signing Cert. + * The Signing->Link cert was signed with the Signing key listed + in the ID->Signing cert. 
+ * The RSA->Ed25519 cross-certificate certifies the Ed25519 + identity, and is signed with the RSA identity listed in the + "ID" certificate. + * The certified key in the ID certificate is a 1024-bit RSA key. + * The RSA ID certificate is correctly self-signed. + + To authenticate the responder as having a given RSA identity only, + the initiator MUST check the following: + + * The CERTS cell contains exactly one CertType 1 "Link" certificate. + * The CERTS cell contains exactly one CertType 2 "ID" certificate. + * Both certificates have validAfter and validUntil dates that + are not expired. + * The certified key in the Link certificate matches the + link key that was used to negotiate the TLS connection. + * The certified key in the ID certificate is a 1024-bit RSA key. + * The certified key in the ID certificate was used to sign both + certificates. + * The link certificate is correctly signed with the key in the + ID certificate + * The ID certificate is correctly self-signed. + + In both cases above, checking these conditions is sufficient to + authenticate that the initiator is talking to the Tor node with the + expected identity, as certified in the ID certificate(s). + + + To authenticate the initiator as having a given Ed25519,RSA + identity key combination, the responder MUST check the following: + + * The CERTS cell contains exactly one CertType 2 "ID" certificate. + * The CERTS cell contains exactly one CertType 4 Ed25519 + "Id->Signing" certificate. + * The CERTS cell contains exactly one CertType 6 Ed25519 + "Signing->auth" certificate. + * The CERTS cell contains exactly one CertType 7 "RSA->Ed25519" + cross-certificate. + * All X.509 certificates above have validAfter and validUntil dates; + no X.509 or Ed25519 certificates are expired. + * All certificates are correctly signed. + * The identity key listed in the ID->Signing cert was used to + sign the ID->Signing Cert. 
+ * The Signing->AUTH cert was signed with the Signing key listed + in the ID->Signing cert. + * The RSA->Ed25519 cross-certificate certifies the Ed25519 + identity, and is signed with the RSA identity listed in the + "ID" certificate. + * The certified key in the ID certificate is a 1024-bit RSA key. + * The RSA ID certificate is correctly self-signed. + + + To authenticate the initiator as having an RSA identity key only, + the responder MUST check the following: + + * The CERTS cell contains exactly one CertType 3 "AUTH" certificate. + * The CERTS cell contains exactly one CertType 2 "ID" certificate. + * Both certificates have validAfter and validUntil dates that + are not expired. + * The certified key in the AUTH certificate is a 1024-bit RSA key. + * The certified key in the ID certificate is a 1024-bit RSA key. + * The certified key in the ID certificate was used to sign both + certificates. + * The auth certificate is correctly signed with the key in the + ID certificate. + * The ID certificate is correctly self-signed. + + Checking these conditions is NOT sufficient to authenticate that the + initiator has the ID it claims; to do so, the cells in 4.3 and 4.4 + below must be exchanged. + + +4.3. AUTH_CHALLENGE cells + + An AUTH_CHALLENGE cell is a variable-length cell with the following + fields: + + Challenge [32 octets] + N_Methods [2 octets] + Methods [2 * N_Methods octets] + + It is sent from the responder to the initiator. Initiators MUST + ignore unexpected bytes at the end of the cell. Responders MUST + generate every challenge independently using a strong RNG or PRNG. + + The Challenge field is a randomly generated string that the + initiator must sign (a hash of) as part of authenticating. The + methods are the authentication methods that the responder will + accept. Only two authentication methods are defined right now: + see 4.4.1 and 4.4.2 below. + +4.4. 
AUTHENTICATE cells + + If an initiator wants to authenticate, it responds to the + AUTH_CHALLENGE cell with a CERTS cell and an AUTHENTICATE cell. + The CERTS cell is as a server would send, except that instead of + sending a CertType 1 (and possibly CertType 5) certs for arbitrary link + certificates, the initiator sends a CertType 3 (and possibly + CertType 6) cert for an RSA/Ed25519 AUTHENTICATE key. + + This difference is because we allow any link key type on a TLS + link, but the protocol described here will only work for specific key + types as described in 4.4.1 and 4.4.2 below. + + An AUTHENTICATE cell contains the following: + + AuthType [2 octets] + AuthLen [2 octets] + Authentication [AuthLen octets] + + Responders MUST ignore extra bytes at the end of an AUTHENTICATE + cell. Recognized AuthTypes are 1 and 3, described in the next + two sections. + + Initiators MUST NOT send an AUTHENTICATE cell before they have + verified the certificates presented in the responder's CERTS + cell, and authenticated the responder. + +4.4.1. Link authentication type 1: RSA-SHA256-TLSSecret + + If AuthType is 1 (meaning "RSA-SHA256-TLSSecret"), then the + Authentication field of the AUTHENTICATE cell contains the following: + + TYPE: The characters "AUTH0001" [8 octets] + CID: A SHA256 hash of the initiator's RSA1024 identity key [32 octets] + SID: A SHA256 hash of the responder's RSA1024 identity key [32 octets] + SLOG: A SHA256 hash of all bytes sent from the responder to the + initiator as part of the negotiation up to and including the + AUTH_CHALLENGE cell; that is, the VERSIONS cell, the CERTS cell, + the AUTH_CHALLENGE cell, and any padding cells. [32 octets] + CLOG: A SHA256 hash of all bytes sent from the initiator to the + responder as part of the negotiation so far; that is, the + VERSIONS cell and the CERTS cell and any padding cells. [32 + octets] + SCERT: A SHA256 hash of the responder's TLS link certificate. 
[32 + octets] + TLSSECRETS: A SHA256 HMAC, using the TLS master secret as the + secret key, of the following: + - client_random, as sent in the TLS Client Hello + - server_random, as sent in the TLS Server Hello + - the NUL terminated ASCII string: + "Tor V3 handshake TLS cross-certification" + [32 octets] + RAND: A 24 byte value, randomly chosen by the initiator. (In an + imitation of SSL3's gmt_unix_time field, older versions of Tor + sent an 8-byte timestamp as the first 8 bytes of this field; + new implementations should not do that.) [24 octets] + SIG: A signature of a SHA256 hash of all the previous fields + using the initiator's "Authenticate" key as presented. (As + always in Tor, we use OAEP-MGF1 padding; see tor-spec.txt + section 0.3.) + [variable length] + + To check the AUTHENTICATE cell, a responder checks that all fields + from TYPE through TLSSECRETS contain their unique + correct values as described above, and then verifies the signature. + The server MUST ignore any extra bytes in the signed data after + the RAND field. + + Responders MUST NOT accept this AuthType if the initiator has + claimed to have an Ed25519 identity. + + (There is no AuthType 2: It was reserved but never implemented.) + +4.4.2. Link authentication type 3: Ed25519-SHA256-RFC5705. + + If AuthType is 3, meaning "Ed25519-SHA256-RFC5705", the + Authentication field of the AUTHENTICATE cell is as below: + + Modified values and new fields below are marked with asterisks. + + TYPE: The characters "AUTH0003" [8 octets] + CID: A SHA256 hash of the initiator's RSA1024 identity key [32 octets] + SID: A SHA256 hash of the responder's RSA1024 identity key [32 octets] + CID_ED: The initiator's Ed25519 identity key [32 octets] + SID_ED: The responder's Ed25519 identity key, or all-zero. 
[32 octets] + SLOG: A SHA256 hash of all bytes sent from the responder to the + initiator as part of the negotiation up to and including the + AUTH_CHALLENGE cell; that is, the VERSIONS cell, the CERTS cell, + the AUTH_CHALLENGE cell, and any padding cells. [32 octets] + CLOG: A SHA256 hash of all bytes sent from the initiator to the + responder as part of the negotiation so far; that is, the + VERSIONS cell and the CERTS cell and any padding cells. [32 + octets] + SCERT: A SHA256 hash of the responder's TLS link certificate. [32 + octets] + TLSSECRETS: The output of an RFC5705 Exporter function on the + TLS session, using as its inputs: + - The label string "EXPORTER FOR TOR TLS CLIENT BINDING AUTH0003" + - The context value equal to the initiator's Ed25519 identity key. + - The length 32. + [32 octets] + RAND: A 24 byte value, randomly chosen by the initiator. [24 octets] + SIG: A signature of all previous fields using the initiator's + Ed25519 authentication key (as in the cert with CertType 6). + [variable length] + + To check the AUTHENTICATE cell, a responder checks that all fields + from TYPE through TLSSECRETS contain their unique + correct values as described above, and then verifies the signature. + The server MUST ignore any extra bytes in the signed data after + the RAND field. + +4.5. NETINFO cells + + If version 2 or higher is negotiated, each party sends the other a + NETINFO cell. The cell's payload is: + + TIME (Timestamp) [4 bytes] + OTHERADDR (Other OR's address) [variable] + ATYPE (Address type) [1 byte] + ALEN (Address length) [1 byte] + AVAL (Address value in NBO) [ALEN bytes] + NMYADDR (Number of this OR's addresses) [1 byte] + NMYADDR times: + ATYPE (Address type) [1 byte] + ALEN (Address length) [1 byte] + AVAL (Address value in NBO) [ALEN bytes] + + Recognized address types (ATYPE) are: + + [04] IPv4. + [06] IPv6. + + ALEN MUST be 4 when ATYPE is 0x04 (IPv4) and 16 when ATYPE is 0x06 + (IPv6). 
If the ALEN value is wrong for the given ATYPE value, then + the provided address should be ignored. + + The timestamp is a big-endian unsigned integer number of seconds + since the Unix epoch. Implementations MUST ignore unexpected bytes + at the end of the cell. Clients SHOULD send "0" as their timestamp, to + avoid fingerprinting. + + Implementations MAY use the timestamp value to help decide if their + clocks are skewed. Initiators MAY use "other OR's address" to help + learn which address their connections may be originating from, if they do + not know it; and to learn whether the peer will treat the current + connection as canonical. Implementations SHOULD NOT trust these + values unconditionally, especially when they come from non-authorities, + since the other party can lie about the time or IP addresses it sees. + + Initiators SHOULD use "this OR's address" to make sure + that they have connected to another OR at its canonical address. + (See 5.3.1 below.) + +5. Circuit management + +5.1. CREATE and CREATED cells + + Users set up circuits incrementally, one hop at a time. To create a + new circuit, OPs send a CREATE/CREATE2 cell to the first node, with + the first half of an authenticated handshake; that node responds with + a CREATED/CREATED2 cell with the second half of the handshake. To + extend a circuit past the first hop, the OP sends an EXTEND/EXTEND2 + relay cell (see section 5.1.2) which instructs the last node in the + circuit to send a CREATE/CREATE2 cell to extend the circuit. + + There are two kinds of CREATE and CREATED cells: The older + "CREATE/CREATED" format, and the newer "CREATE2/CREATED2" format. The + newer format is extensible by design; the older one is not. 
+ + A CREATE2 cell contains: + + HTYPE (Client Handshake Type) [2 bytes] + HLEN (Client Handshake Data Len) [2 bytes] + HDATA (Client Handshake Data) [HLEN bytes] + + A CREATED2 cell contains: + + HLEN (Server Handshake Data Len) [2 bytes] + HDATA (Server Handshake Data) [HLEN bytes] + + Recognized HTYPEs (handshake types) are: + + 0x0000 TAP -- the original Tor handshake; see 5.1.3 + 0x0001 reserved + 0x0002 ntor -- the ntor+curve25519+sha256 handshake; see 5.1.4 + 0x0003 ntor-v3 -- ntor extended with extra data; see 5.1.4.1 + + The format of a CREATE cell is one of the following: + + HDATA (Client Handshake Data) [TAP_C_HANDSHAKE_LEN bytes] + + or + + HTAG (Client Handshake Type Tag) [16 bytes] + HDATA (Client Handshake Data) [TAP_C_HANDSHAKE_LEN-16 bytes] + + The first format is equivalent to a CREATE2 cell with HTYPE of 'tap' + and length of TAP_C_HANDSHAKE_LEN. The second format is a way to + encapsulate new handshake types into the old CREATE cell format for + migration. See 5.1.2 below. Recognized HTAG values are: + + ntor -- 'ntorNTORntorNTOR' + + The format of a CREATED cell is: + + HDATA (Server Handshake Data) [TAP_S_HANDSHAKE_LEN bytes] + + (It's equivalent to a CREATED2 cell with length of TAP_S_HANDSHAKE_LEN.) + + As usual with DH, x and y MUST be generated randomly. + + In general, clients SHOULD use CREATE whenever they are using the TAP + handshake, and CREATE2 otherwise. Clients SHOULD NOT send the + second format of CREATE cells (the one with the handshake type tag) + to a server directly. + + Servers always reply to a successful CREATE with a CREATED, and to a + successful CREATE2 with a CREATED2. On failure, a server sends a + DESTROY cell to tear down the circuit. + + [CREATE2 is handled by Tor 0.2.4.7-alpha and later.] + +5.1.1. Choosing circuit IDs in create cells + + The CircID for a CREATE/CREATE2 cell is a nonzero integer, selected + by the node (OP or OR) that sends the CREATE/CREATE2 cell. 
+ Depending on the link protocol version, there are certain rules for + choosing the value of CircID which MUST be obeyed, as implementations + MAY decide to refuse in case of a violation. In link protocol 3 or + lower, CircIDs are 2 bytes long; in protocol 4 or higher, CircIDs are + 4 bytes long. + + In link protocol version 3 or lower, the nodes choose from only one + half of the possible values based on the ORs' public identity keys, + in order to avoid collisions. If the sending node has a lower key, + it chooses a CircID with an MSB of 0; otherwise, it chooses a CircID + with an MSB of 1. (Public keys are compared numerically by modulus.) + A client with no public key MAY choose any CircID it wishes, since + clients never need to process CREATE/CREATE2 cells. + + In link protocol version 4 or higher, whichever node initiated the + connection MUST set its MSB to 1, and whichever node didn't initiate + the connection MUST set its MSB to 0. + + The CircID value 0 is specifically reserved for cells that do not + belong to any circuit: CircID 0 MUST NOT be used for circuits. No + other CircID value, including 0x8000 or 0x80000000, is reserved. + + Existing Tor implementations choose their CircID values at random from + among the available unused values. To avoid distinguishability, new + implementations should do the same. Implementations MAY give up and stop + attempting to build new circuits on a channel, if a certain number of + randomly chosen CircID values are all in use (today's Tor stops after 64). + +5.1.2. EXTEND and EXTENDED cells + + To extend an existing circuit, the client sends an EXTEND or EXTEND2 + RELAY_EARLY cell to the last node in the circuit. 
+ + An EXTEND2 cell's relay payload contains: + + NSPEC (Number of link specifiers) [1 byte] + NSPEC times: + LSTYPE (Link specifier type) [1 byte] + LSLEN (Link specifier length) [1 byte] + LSPEC (Link specifier) [LSLEN bytes] + HTYPE (Client Handshake Type) [2 bytes] + HLEN (Client Handshake Data Len) [2 bytes] + HDATA (Client Handshake Data) [HLEN bytes] + + Link specifiers describe the next node in the circuit and how to + connect to it. Recognized specifiers are: + + [00] TLS-over-TCP, IPv4 address + A four-byte IPv4 address plus two-byte ORPort + [01] TLS-over-TCP, IPv6 address + A sixteen-byte IPv6 address plus two-byte ORPort + [02] Legacy identity + A 20-byte SHA1 identity fingerprint. At most one may be listed. + [03] Ed25519 identity + A 32-byte Ed25519 identity fingerprint. At most one may + be listed. + + Nodes MUST ignore unrecognized specifiers, and MUST accept multiple + instances of specifiers other than 'legacy identity' and + 'Ed25519 identity'. (Nodes SHOULD reject link specifier lists + that include multiple instances of either one of those specifiers.) + + For purposes of indistinguishability, implementations SHOULD send + these link specifiers, if using them, in this order: [00], [02], [03], + [01]. + + The relay payload for an EXTEND relay cell consists of: + + Address [4 bytes] + Port [2 bytes] + Onion skin [TAP_C_HANDSHAKE_LEN bytes] + Identity fingerprint [HASH_LEN bytes] + + The "legacy identity" and "identity fingerprint" fields are the + SHA1 hash of the PKCS#1 ASN1 encoding of the next onion router's + identity (signing) key. (See 0.3 above.) The "Ed25519 identity" + field is the Ed25519 identity key of the target node. Including + this key information allows the extending OR to verify that it is + indeed connected to the correct target OR, and prevents certain + man-in-the-middle attacks. 
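The EXTEND2 relay payload described above can be sketched as an encoder. This is an illustration under stated assumptions rather than Tor's own code: the function names and argument shapes are hypothetical, the ORPort and the HTYPE/HLEN fields are assumed to be big-endian, and the specifiers are emitted in the recommended order [00], [02], [03], [01].

```python
import ipaddress
import struct


def link_specifier(lstype, body):
    """Encode one LSTYPE | LSLEN | LSPEC entry."""
    return bytes([lstype, len(body)]) + body


def encode_extend2(htype, hdata, ipv4=None, legacy_id=None,
                   ed25519_id=None, ipv6=None):
    """Build an EXTEND2 relay payload.

    ipv4 and ipv6 are hypothetical (address_string, orport) tuples;
    legacy_id is a 20-byte digest and ed25519_id a 32-byte key.
    """
    specs = []
    if ipv4 is not None:
        addr, port = ipv4
        body = ipaddress.IPv4Address(addr).packed + struct.pack(">H", port)
        specs.append(link_specifier(0x00, body))
    if legacy_id is not None:
        if len(legacy_id) != 20:
            raise ValueError("legacy identity must be 20 bytes")
        specs.append(link_specifier(0x02, legacy_id))
    if ed25519_id is not None:
        if len(ed25519_id) != 32:
            raise ValueError("Ed25519 identity must be 32 bytes")
        specs.append(link_specifier(0x03, ed25519_id))
    if ipv6 is not None:
        addr, port = ipv6
        body = ipaddress.IPv6Address(addr).packed + struct.pack(">H", port)
        specs.append(link_specifier(0x01, body))
    header = bytes([len(specs)]) + b"".join(specs)
    return header + struct.pack(">HH", htype, len(hdata)) + hdata
```

Taking each identity as an optional single argument means the encoder cannot emit more than one 'legacy identity' or 'Ed25519 identity' specifier, which matches the "at most one" rule.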
+ + Extending ORs MUST check _all_ provided identity keys (if they + recognize the format), and MUST NOT extend the circuit if the + target OR did not prove its ownership of any such identity key. + If only one identity key is provided, but the extending OR knows + the other (from directory information), then the OR SHOULD also + enforce the key in the directory. + + If an extending OR has a channel with a given Ed25519 ID and RSA + identity, and receives a request for that Ed25519 ID and a + different RSA identity, it SHOULD NOT attempt to make another + connection: it should just fail and DESTROY the circuit. + + The client MAY include multiple IPv4 or IPv6 link specifiers in an + EXTEND cell; current OR implementations only consider the first + of each type. + + After checking relay identities, extending ORs generate a + CREATE/CREATE2 cell from the contents of the EXTEND/EXTEND2 cell. + See section 5.3 for details. + + The payload of an EXTENDED cell is the same as the payload of a + CREATED cell. + + The payload of an EXTENDED2 cell is the same as the payload of a + CREATED2 cell. + + [Support for EXTEND2/EXTENDED2 was added in Tor 0.2.4.8-alpha.] + + Clients SHOULD use the EXTEND format whenever sending a TAP + handshake, and MUST use it whenever the EXTEND cell will be handled + by a node running a version of Tor too old to support EXTEND2. In + other cases, clients SHOULD use EXTEND2. + + When generating an EXTEND2 cell, clients SHOULD include the target's + Ed25519 identity whenever the target has one, and whenever the + target supports LinkAuth subprotocol version "3". (See section 9.2.) + + When encoding a non-TAP handshake in an EXTEND cell, clients SHOULD + use the format with 'client handshake type tag'. + +5.1.3. 
The "TAP" handshake + + This handshake uses Diffie-Hellman in Z_p and RSA to compute a set of + shared keys which the client knows are shared only with a particular + server, and the server knows are shared with whomever sent the + original handshake (or with nobody at all). It's not very fast and + not very good. (See Goldberg's "On the Security of the Tor + Authentication Protocol".) + + Define TAP_C_HANDSHAKE_LEN as DH_LEN+KEY_LEN+KP_PAD_LEN. + Define TAP_S_HANDSHAKE_LEN as DH_LEN+HASH_LEN. + + The payload for a CREATE cell is an 'onion skin', which consists of + the first step of the DH handshake data (also known as g^x). This + value is encrypted using the "legacy hybrid encryption" algorithm + (see 0.4 above) to the server's onion key, giving a client handshake: + + KP-encrypted: + Padding [KP_PAD_LEN bytes] + Symmetric key [KEY_LEN bytes] + First part of g^x [KP_ENC_LEN-KP_PAD_LEN-KEY_LEN bytes] + Symmetrically encrypted: + Second part of g^x [DH_LEN-(KP_ENC_LEN-KP_PAD_LEN-KEY_LEN) + bytes] + + The payload for a CREATED cell, or the relay payload for an + EXTENDED cell, contains: + + DH data (g^y) [DH_LEN bytes] + Derivative key data (KH) [HASH_LEN bytes] + + Once the handshake between the OP and an OR is completed, both can + now calculate g^xy with ordinary DH. Before computing g^xy, both parties + MUST verify that the received g^x or g^y value is not degenerate; + that is, it must be strictly greater than 1 and strictly less than p-1 + where p is the DH modulus. Implementations MUST NOT complete a handshake + with degenerate keys. Implementations MUST NOT discard other "weak" + g^x values. + + (Discarding degenerate keys is critical for security; if bad keys + are not discarded, an attacker can substitute the OR's CREATED + cell's g^y with 0 or 1, thus creating a known g^xy and impersonating + the OR. Discarding other keys may allow attacks to learn bits of + the private key.) 
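The degenerate-key rule above amounts to a single range check. The sketch below is illustrative only: the helper names are hypothetical, the received value is assumed to be a big-endian unsigned integer, and the modulus `p` is passed in by the caller rather than hard-coded here.

```python
def dh_public_is_acceptable(value: int, p: int) -> bool:
    """True if a received g^x or g^y is strictly between 1 and p-1."""
    return 1 < value < p - 1


def decode_and_check(raw: bytes, p: int) -> int:
    """Decode a big-endian DH public value, refusing degenerate ones."""
    value = int.from_bytes(raw, "big")
    if not dh_public_is_acceptable(value, p):
        raise ValueError("degenerate DH public value")
    return value
```

Note that the check rejects only 0, 1, p-1 and out-of-range values; per the text above, other "weak" values are deliberately not filtered.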
+ + Once both parties have g^xy, they derive their shared circuit keys + and 'derivative key data' value via the KDF-TOR function in 5.2.1. + +5.1.4. The "ntor" handshake + + This handshake uses a set of DH handshakes to compute a set of + shared keys which the client knows are shared only with a particular + server, and the server knows are shared with whomever sent the + original handshake (or with nobody at all). Here we use the + "curve25519" group and representation as specified in "Curve25519: + new Diffie-Hellman speed records" by D. J. Bernstein. + + [The ntor handshake was added in Tor 0.2.4.8-alpha.] + + In this section, define: + + H(x,t) as HMAC_SHA256 with message x and key t. + H_LENGTH = 32. + ID_LENGTH = 20. + G_LENGTH = 32 + PROTOID = "ntor-curve25519-sha256-1" + t_mac = PROTOID | ":mac" + t_key = PROTOID | ":key_extract" + t_verify = PROTOID | ":verify" + G = The preferred base point for curve25519 ([9]) + KEYGEN() = The curve25519 key generation algorithm, returning + a private/public keypair. + m_expand = PROTOID | ":key_expand" + KEYID(A) = A + EXP(a, b) = The ECDH algorithm for establishing a shared secret. + + To perform the handshake, the client needs to know an identity key + digest for the server, and an ntor onion key (a curve25519 public + key) for that server. Call the ntor onion key "B". 
The client + generates a temporary keypair: + + x,X = KEYGEN() + + and generates a client-side handshake with contents: + + NODEID Server identity digest [ID_LENGTH bytes] + KEYID KEYID(B) [H_LENGTH bytes] + CLIENT_KP X [G_LENGTH bytes] + + The server generates a keypair of y,Y = KEYGEN(), and uses its ntor + private key 'b' to compute: + + secret_input = EXP(X,y) | EXP(X,b) | ID | B | X | Y | PROTOID + KEY_SEED = H(secret_input, t_key) + verify = H(secret_input, t_verify) + auth_input = verify | ID | B | Y | X | PROTOID | "Server" + + The server's handshake reply is: + + SERVER_KP Y [G_LENGTH bytes] + AUTH H(auth_input, t_mac) [H_LENGTH bytes] + + The client then checks Y is in G^* [see NOTE below], and computes + + secret_input = EXP(Y,x) | EXP(B,x) | ID | B | X | Y | PROTOID + KEY_SEED = H(secret_input, t_key) + verify = H(secret_input, t_verify) + auth_input = verify | ID | B | Y | X | PROTOID | "Server" + + The client verifies that AUTH == H(auth_input, t_mac). + + Both parties check that none of the EXP() operations produced the + point at infinity. [NOTE: This is an adequate replacement for + checking Y for group membership, if the group is curve25519.] + + Both parties now have a shared value for KEY_SEED. They expand this + into the keys needed for the Tor relay protocol, using the KDF + described in 5.2.2 and the tag m_expand. + +5.1.4.1. The "ntor-v3" handshake + + This handshake extends the ntor handshake to include support + for extra data transmitted as part of the handshake. Both + the client and the server can transmit extra data; in both cases, + the extra data is encrypted, but only server data receives + forward secrecy. + + To advertise support for this handshake, servers advertise the + "Relay=4" subprotocol version. To select it, clients use the + 'ntor-v3' HTYPE value in their CREATE2 cells. 
+ + In this handshake, we define: + + PROTOID = "ntor3-curve25519-sha3_256-1" + t_msgkdf = PROTOID | ":kdf_phase1" + t_msgmac = PROTOID | ":msg_mac" + t_key_seed = PROTOID | ":key_seed" + t_verify = PROTOID | ":verify" + t_final = PROTOID | ":kdf_final" + t_auth = PROTOID | ":auth_final" + + `ENCAP(s)` -- an encapsulation function. We define this + as `htonll(len(s)) | s`. (Note that `len(ENCAP(s)) = len(s) + 8`). + + `PARTITION(s, n1, n2, n3, ...)` -- a function that partitions a + bytestring `s` into chunks of length `n1`, `n2`, `n3`, and so + on. Extra data is put into a final chunk. If `s` is not long + enough, the function fails. + + H(s, t) = SHA3_256(ENCAP(t) | s) + MAC(k, msg, t) = SHA3_256(ENCAP(t) | ENCAP(k) | msg) + KDF(s, t) = SHAKE_256(ENCAP(t) | s) + ENC(k, m) = AES_256_CTR(k, m) + + EXP(pk,sk), KEYGEN: defined as in curve25519 + + DIGEST_LEN = MAC_LEN = MAC_KEY_LEN = ENC_KEY_LEN = PUB_KEY_LEN = 32 + + ID_LEN = 32 (representing an ed25519 identity key) + + For any tag "t_foo": + H_foo(s) = H(s, t_foo) + MAC_foo(k, msg) = MAC(k, msg, t_foo) + KDF_foo(s) = KDF(s, t_foo) + + Other notation is as in the ntor description in 5.1.4 above. + + The client begins by knowing: + + B, ID -- The curve25519 onion key and Ed25519 ID of the server that it + wants to use. + CM -- A message it wants to send as part of its handshake. + VER -- An optional shared verification string. + + The client computes: + + x,X = KEYGEN() + Bx = EXP(B,x) + secret_input_phase1 = Bx | ID | X | B | PROTOID | ENCAP(VER) + phase1_keys = KDF_msgkdf(secret_input_phase1) + (ENC_K1, MAC_K1) = PARTITION(phase1_keys, ENC_KEY_LEN, MAC_KEY_LEN) + encrypted_msg = ENC(ENC_K1, CM) + msg_mac = MAC_msgmac(MAC_K1, ID | B | X | encrypted_msg) + + The client then sends, as its CREATE handshake: + + NODEID ID [ID_LEN bytes] + KEYID B [PUB_KEY_LEN bytes] + CLIENT_PK X [PUB_KEY_LEN bytes] + MSG encrypted_msg [len(CM) bytes] + MAC msg_mac [MAC_LEN bytes] + + The client remembers x, X, B, ID, Bx, and msg_mac. 
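The hash-based helpers defined for ntor-v3 map directly onto Python's standard `hashlib`. The following is a sketch under stated assumptions, not an implementation of the handshake: the function names are hypothetical, the MAC is taken to hash the `msg` argument as its final input, the KDF takes an explicit output length `n` because SHAKE-256 is an extendable-output function, and the curve25519 result `Bx` is supplied by the caller since the standard library has no curve25519. The AES-256-CTR step is omitted for the same reason.

```python
import hashlib
import struct

PROTOID = b"ntor3-curve25519-sha3_256-1"
T_MSGKDF = PROTOID + b":kdf_phase1"
T_MSGMAC = PROTOID + b":msg_mac"
ENC_KEY_LEN = MAC_KEY_LEN = 32


def encap(s: bytes) -> bytes:
    """ENCAP(s): 8-byte big-endian length, followed by s."""
    return struct.pack(">Q", len(s)) + s


def h(s: bytes, t: bytes) -> bytes:
    return hashlib.sha3_256(encap(t) + s).digest()


def mac(k: bytes, msg: bytes, t: bytes) -> bytes:
    return hashlib.sha3_256(encap(t) + encap(k) + msg).digest()


def kdf(s: bytes, t: bytes, n: int) -> bytes:
    """First n bytes of SHAKE_256(ENCAP(t) | s)."""
    return hashlib.shake_256(encap(t) + s).digest(n)


def partition(s: bytes, *lengths: int):
    """Split s into chunks of the given lengths plus a final remainder chunk."""
    if len(s) < sum(lengths):
        raise ValueError("input too short to partition")
    chunks, pos = [], 0
    for n in lengths:
        chunks.append(s[pos:pos + n])
        pos += n
    chunks.append(s[pos:])
    return chunks


def phase1_keys(bx: bytes, node_id: bytes, x_pub: bytes, b_pub: bytes,
                ver: bytes = b""):
    """Derive (ENC_K1, MAC_K1) from the phase-1 secret input."""
    secret_input = bx + node_id + x_pub + b_pub + PROTOID + encap(ver)
    keys = kdf(secret_input, T_MSGKDF, ENC_KEY_LEN + MAC_KEY_LEN)
    enc_k1, mac_k1, _rest = partition(keys, ENC_KEY_LEN, MAC_KEY_LEN)
    return enc_k1, mac_k1
```

Both sides can call `phase1_keys` with their own view of the shared point (Bx on the client, Xb on the relay) and should obtain the same key pair.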
+ + When the server receives this handshake, it checks whether NODEID is as + expected, and looks up the (b,B) keypair corresponding to KEYID. If the + keypair is missing or the NODEID is wrong, the handshake fails. + + Now the relay uses `X=CLIENT_PK` to compute: + + Xb = EXP(X,b) + secret_input_phase1 = Xb | ID | X | B | PROTOID | ENCAP(VER) + phase1_keys = KDF_msgkdf(secret_input_phase1) + (ENC_K1, MAC_K1) = PARTITION(phase1_keys, ENC_KEY_LEN, MAC_KEY_LEN) + + expected_mac = MAC_msgmac(MAC_K1, ID | B | X | MSG) + + If `expected_mac` is not `MAC`, the handshake fails. Otherwise + the relay computes `CM` as: + + CM = DEC(ENC_K1, MSG) + + The relay then checks whether `CM` is well-formed, and in response + composes `SM`, the reply that it wants to send as part of the + handshake. It then generates a new ephemeral keypair: + + y,Y = KEYGEN() + + and computes the rest of the handshake: + + Xy = EXP(X,y) + secret_input = Xy | Xb | ID | B | X | Y | PROTOID | ENCAP(VER) + ntor_key_seed = H_key_seed(secret_input) + verify = H_verify(secret_input) + + RAW_KEYSTREAM = KDF_final(ntor_key_seed) + (ENC_KEY, KEYSTREAM) = PARTITION(RAW_KEYSTREAM, ENC_KEY_LEN, ...) + + encrypted_msg = ENC(ENC_KEY, SM) + + auth_input = verify | ID | B | Y | X | MAC | ENCAP(encrypted_msg) | + PROTOID | "Server" + AUTH = H_auth(auth_input) + + The relay then sends as its CREATED handshake: + + Y Y [PUB_KEY_LEN bytes] + AUTH AUTH [DIGEST_LEN bytes] + MSG encrypted_msg [len(SM) bytes, up to end of the message] + + Upon receiving this handshake, the client computes: + + Yx = EXP(Y, x) + secret_input = Yx | Bx | ID | B | X | Y | PROTOID | ENCAP(VER) + ntor_key_seed = H_key_seed(secret_input) + verify = H_verify(secret_input) + + auth_input = verify | ID | B | Y | X | MAC | ENCAP(MSG) | + PROTOID | "Server" + AUTH_expected = H_auth(auth_input) + + If AUTH_expected is equal to AUTH, then the handshake has + succeeded. 
The client can then calculate: + + RAW_KEYSTREAM = KDF_final(ntor_key_seed) + (ENC_KEY, KEYSTREAM) = PARTITION(RAW_KEYSTREAM, ENC_KEY_LEN, ...) + + SM = DEC(ENC_KEY, MSG) + + SM is the message from the relay, and the client uses KEYSTREAM to + generate the shared secrets for the newly created circuit. + + Now both parties share the same KEYSTREAM, and can use it to generate + their circuit keys. + +5.1.5. CREATE_FAST/CREATED_FAST cells + + When initializing the first hop of a circuit, the OP has already + established the OR's identity and negotiated a secret key using TLS. + Because of this, it is not always necessary for the OP to perform the + public key operations to create a circuit. In this case, the + OP MAY send a CREATE_FAST cell instead of a CREATE cell for the first + hop only. The OR responds with a CREATED_FAST cell, and the circuit is + created. + + A CREATE_FAST cell contains: + + Key material (X) [HASH_LEN bytes] + + A CREATED_FAST cell contains: + + Key material (Y) [HASH_LEN bytes] + Derivative key data [HASH_LEN bytes] (See 5.2.1 below) + + The values of X and Y must be generated randomly. + + Once both parties have X and Y, they derive their shared circuit keys + and 'derivative key data' value via the KDF-TOR function in 5.2.1. + + The CREATE_FAST handshake is currently deprecated whenever it is not + necessary; the migration is controlled by the "usecreatefast" + networkstatus parameter as described in dir-spec.txt. + + [Tor 0.3.1.1-alpha and later disable CREATE_FAST by default.] + +5.1.6. Additional data in CREATE/CREATED cells + + Some handshakes (currently ntor-v3 defined above) allow the client or the + relay to send additional data as part of the handshake. 
When used in a + CREATE/CREATED handshake, this additional data must have the following + format: + + N_EXTENSIONS [one byte] + N_EXTENSIONS times: + EXT_FIELD_TYPE [one byte] + EXT_FIELD_LEN [one byte] + EXT_FIELD [EXT_FIELD_LEN bytes] + + (`EXT_FIELD_LEN` may be zero, in which case EXT_FIELD is absent.) + + All parties MUST reject messages that are not well-formed per the + rules above. + + We do not specify specific TYPE semantics here; we leave those for + other proposals and specifications. + + Parties MUST ignore extensions with `EXT_FIELD_TYPE` bodies they do not + recognize. + + Unless otherwise specified in the documentation for an extension type: + * Each extension type SHOULD be sent only once in a message. + * Parties MUST ignore all occurrences of an extension + with a given type after the first such occurrence. + * Extensions SHOULD be sent in numerically ascending order by type. + + (The above extension sorting and multiplicity rules are only defaults; + they may be overridden in the description of individual extensions.) + + Currently supported extensions are: + + 1 -- CC_FIELD_REQUEST [Client to server] + + Contains an empty payload. Signifies that the client + wants to use the extended congestion control described + in proposal 324. + + 2 -- CC_FIELD_RESPONSE [Server to client] + + Indicates that the relay will use the congestion control + of proposal 324, as requested by the client. One byte + in length: + + sendme_inc [1 byte] + +5.2. Setting circuit keys + +5.2.1. KDF-TOR + + This key derivation function is used by the TAP and CREATE_FAST + handshakes, and in the current hidden service protocol. It shouldn't + be used for new functionality. + + If the TAP handshake is used to extend a circuit, both parties + base their key material on K0=g^xy, represented as a big-endian unsigned + integer. + + If CREATE_FAST is used, both parties base their key material on + K0=X|Y. 
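KDF-TOR as a whole, including the expansion and key-splitting steps described in the rest of this section, can be sketched in a few lines. This is an illustration only. It assumes that H is SHA-1, HASH_LEN is 20 and KEY_LEN is 16, as set out in the preliminaries of this specification, and that the counter appended to K0 is a single byte. The name `kdf_tor` is hypothetical.

```python
import hashlib

HASH_LEN = 20  # assumed: SHA-1 output length
KEY_LEN = 16   # assumed: stream cipher key length


def kdf_tor(k0: bytes):
    """Expand K0 and split the result into (KH, Df, Db, Kf, Kb)."""
    needed = KEY_LEN * 2 + HASH_LEN * 3
    k = b""
    counter = 0
    while len(k) < needed:
        # K = H(K0 | [00]) | H(K0 | [01]) | H(K0 | [02]) | ...
        k += hashlib.sha1(k0 + bytes([counter])).digest()
        counter += 1
    pos = 0
    parts = []
    for length in (HASH_LEN, HASH_LEN, HASH_LEN, KEY_LEN, KEY_LEN):
        parts.append(k[pos:pos + length])
        pos += length
    # Bytes of k beyond `needed` are discarded.
    return tuple(parts)
```

For CREATE_FAST, the caller would pass `k0 = X + Y`; for TAP, the big-endian encoding of g^xy.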
+ + From the base key material K0, they compute KEY_LEN*2+HASH_LEN*3 bytes of + derivative key data as + + K = H(K0 | [00]) | H(K0 | [01]) | H(K0 | [02]) | ... + + The first HASH_LEN bytes of K form KH; the next HASH_LEN form the forward + digest Df; the next HASH_LEN form the backward digest Db; the next + KEY_LEN form Kf, and the final KEY_LEN form Kb. Excess bytes from K + are discarded. + + KH is used in the handshake response to demonstrate knowledge of the + computed shared key. Df is used to seed the integrity-checking hash + for the stream of data going from the OP to the OR, and Db seeds the + integrity-checking hash for the data stream from the OR to the OP. Kf + is used to encrypt the stream of data going from the OP to the OR, and + Kb is used to encrypt the stream of data going from the OR to the OP. + +5.2.2. KDF-RFC5869 + + For newer KDF needs, Tor uses the key derivation function HKDF from + RFC5869, instantiated with SHA256. (This is due to a construction + from Krawczyk.) The generated key material is: + + K = K_1 | K_2 | K_3 | ... + + Where H(x,t) is HMAC_SHA256 with value x and key t + and K_1 = H(m_expand | INT8(1) , KEY_SEED ) + and K_(i+1) = H(K_i | m_expand | INT8(i+1) , KEY_SEED ) + and m_expand is an arbitrarily chosen value, + and INT8(i) is an octet with the value "i". + + In RFC5869's vocabulary, this is HKDF-SHA256 with info == m_expand, + salt == t_key, and IKM == secret_input. + + When used in the ntor handshake, the first HASH_LEN bytes form the + forward digest Df; the next HASH_LEN form the backward digest Db; the + next KEY_LEN form Kf, the next KEY_LEN form Kb, and the final + DIGEST_LEN bytes are taken as a nonce to use in the place of KH in the + hidden service protocol. Excess bytes from K are discarded. + +5.3. Creating circuits + + When creating a circuit through the network, the circuit creator + (OP) performs the following steps: + + 1. 
Choose an onion router as an end node (R_N): + * N MAY be 1 for non-anonymous directory mirror, introduction point, + or service rendezvous connections. + * N SHOULD be 3 or more for anonymous connections. + Some end nodes accept streams (see 6.1), others are introduction + or rendezvous points (see rend-spec-{v2,v3}.txt). + + 2. Choose a chain of (N-1) onion routers (R_1...R_N-1) to constitute + the path, such that no router appears in the path twice. + + 3. If not already connected to the first router in the chain, + open a new connection to that router. + + 4. Choose a circID not already in use on the connection with the + first router in the chain; send a CREATE/CREATE2 cell along + the connection, to be received by the first onion router. + + 5. Wait until a CREATED/CREATED2 cell is received; finish the + handshake and extract the forward key Kf_1 and the backward + key Kb_1. + + 6. For each subsequent onion router R (R_2 through R_N), extend + the circuit to R. + + To extend the circuit by a single onion router R_M, the OP performs + these steps: + + 1. Create an onion skin, encrypted to R_M's public onion key. + + 2. Send the onion skin in a relay EXTEND/EXTEND2 cell along + the circuit (see sections 5.1.2 and 5.5). + + 3. When a relay EXTENDED/EXTENDED2 cell is received, verify KH, + and calculate the shared keys. The circuit is now extended. + + When an onion router receives an EXTEND relay cell, it sends a CREATE + cell to the next onion router, with the enclosed onion skin as its + payload. + + When an onion router receives an EXTEND2 relay cell, it sends a CREATE2 + cell to the next onion router, with the enclosed HLEN, HTYPE, and HDATA + as its payload. The initiating onion router chooses some circID not yet + used on the connection between the two onion routers. (But see section + 5.1.1 above, concerning choosing circIDs.) 
+ + As special cases, if the EXTEND/EXTEND2 cell includes a legacy identity, or + identity fingerprint of all zeroes, or asks to extend back to the relay + that sent the extend cell, the circuit will fail and be torn down. + + Ed25519 identity keys are not required in EXTEND2 cells, so all zero + keys SHOULD be accepted. If the extending relay knows the ed25519 key from + the consensus, it SHOULD also check that key. (See section 5.1.2.) + + If an EXTEND2 cell contains the ed25519 key of the relay that sent the + extend cell, the circuit will fail and be torn down. + + When an onion router receives a CREATE/CREATE2 cell, if it already has a + circuit on the given connection with the given circID, it drops the + cell. Otherwise, after receiving the CREATE/CREATE2 cell, it completes + the specified handshake, and replies with a CREATED/CREATED2 cell. + + Upon receiving a CREATED/CREATED2 cell, an onion router packs its payload + into an EXTENDED/EXTENDED2 relay cell (see section 5.1.2), and sends + that cell up the circuit. Upon receiving the EXTENDED/EXTENDED2 relay + cell, the OP can retrieve the handshake material. + + (As an optimization, OR implementations may delay processing onions + until a break in traffic allows time to do so without harming + network latency too greatly.) + +5.3.1. Canonical connections + + It is possible for an attacker to launch a man-in-the-middle attack + against a connection by telling OR Alice to extend to OR Bob at some + address X controlled by the attacker. The attacker cannot read the + encrypted traffic, but the attacker is now in a position to count all + bytes sent between Alice and Bob (assuming Alice was not already + connected to Bob.) + + To prevent this, when an OR gets an extend request, it SHOULD use an + existing OR connection if the ID matches, and ANY of the following + conditions hold: + + - The IP matches the requested IP. 
+ - The OR knows that the IP of the connection it's using is canonical + because it was listed in the NETINFO cell. + + ORs SHOULD NOT check the IPs that are listed in the server descriptor. + Trusting server IPs makes it easier to covertly impersonate a relay, after + stealing its keys. + +5.4. Tearing down circuits + + Circuits are torn down when an unrecoverable error occurs along + the circuit, or when all streams on a circuit are closed and the + circuit's intended lifetime is over. + + ORs SHOULD also tear down circuits which attempt to create: + + * streams with RELAY_BEGIN, or + * rendezvous points with ESTABLISH_RENDEZVOUS, + ending at the first hop. Letting Tor be used as a single hop proxy makes + exit and rendezvous nodes a more attractive target for compromise. + + ORs MAY use multiple methods to check if they are the first hop: + + * If an OR sees a circuit created with CREATE_FAST, the OR is sure to be + the first hop of a circuit. + * If an OR is the responder, and the initiator: + * did not authenticate the link, or + * authenticated with a key that is not in the consensus, + then the OR is probably the first hop of a circuit (or the second hop of + a circuit via a bridge relay). + + Circuits may be torn down either completely or hop-by-hop. + + To tear down a circuit completely, an OR or OP sends a DESTROY + cell to the adjacent nodes on that circuit, using the appropriate + direction's circID. + + Upon receiving an outgoing DESTROY cell, an OR frees resources + associated with the corresponding circuit. If it's not the end of + the circuit, it sends a DESTROY cell for that circuit to the next OR + in the circuit. If the node is the end of the circuit, then it tears + down any associated edge connections (see section 6.1). + + After a DESTROY cell has been processed, an OR ignores all data or + destroy cells for the corresponding circuit. 
+ + To tear down part of a circuit, the OP may send a RELAY_TRUNCATE cell + signaling a given OR (Stream ID zero). That OR sends a DESTROY + cell to the next node in the circuit, and replies to the OP with a + RELAY_TRUNCATED cell. + + [Note: If an OR receives a TRUNCATE cell and it has any RELAY cells + still queued on the circuit for the next node it will drop them + without sending them. This is not considered conformant behavior, + but it probably won't get fixed until a later version of Tor. Thus, + clients SHOULD NOT send a TRUNCATE cell to a node running any current + version of Tor if a) they have sent relay cells through that node, + and b) they aren't sure whether those cells have been sent on yet.] + + When an unrecoverable error occurs along a circuit, the nodes + must report it as follows: + * If possible, send a DESTROY cell to ORs _away_ from the client. + * If possible, send *either* a DESTROY cell towards the client, or + a RELAY_TRUNCATED cell towards the client. + + Current versions of Tor do not reuse truncated RELAY_TRUNCATED + circuits: An OP, upon receiving a RELAY_TRUNCATED, will send + forward a DESTROY cell in order to entirely tear down the circuit. + Because of this, we recommend that relays should send DESTROY + towards the client, not RELAY_TRUNCATED. + + NOTE: + In tor versions before 0.4.5.13, 0.4.6.11 and 0.4.7.9, relays would + handle an inbound DESTROY by sending the client a RELAY_TRUNCATED + message. Beginning with those versions, relays now propagate + DESTROY cells in either direction, in order to tell every + intermediary OR to stop queuing data on the circuit. The earlier + behavior created queuing pressure on the intermediary ORs. + + The payload of a DESTROY and RELAY_TRUNCATED cell contains a single + octet, describing the reason that the circuit was + closed. RELAY_TRUNCATED cells, and DESTROY cells sent _towards_ the + client, should contain the actual reason from the list of error codes + below. 
Reasons in a DESTROY cell SHOULD NOT be propagated downward or + upward, due to potential side channel risk: An OR receiving a DESTROY + command should use the DESTROYED reason for its next cell. An OP + should always use the NONE reason for its own DESTROY cells. + + The error codes are: + + 0 -- NONE (No reason given.) + 1 -- PROTOCOL (Tor protocol violation.) + 2 -- INTERNAL (Internal error.) + 3 -- REQUESTED (A client sent a TRUNCATE command.) + 4 -- HIBERNATING (Not currently operating; trying to save bandwidth.) + 5 -- RESOURCELIMIT (Out of memory, sockets, or circuit IDs.) + 6 -- CONNECTFAILED (Unable to reach relay.) + 7 -- OR_IDENTITY (Connected to relay, but its OR identity was not + as expected.) + 8 -- CHANNEL_CLOSED (The OR connection that was carrying this circuit + died.) + 9 -- FINISHED (The circuit has expired for being dirty or old.) + 10 -- TIMEOUT (Circuit construction took too long) + 11 -- DESTROYED (The circuit was destroyed w/o client TRUNCATE) + 12 -- NOSUCHSERVICE (Request for unknown hidden service) + +5.5. Routing relay cells + +5.5.1. Circuit ID Checks + + When a node wants to send a RELAY or RELAY_EARLY cell, it checks the cell's + circID and determines whether the corresponding circuit along that + connection is still open. If not, the node drops the cell. + + When a node receives a RELAY or RELAY_EARLY cell, it checks the cell's + circID and determines whether it has a corresponding circuit along + that connection. If not, the node drops the cell. + +5.5.2. Forward Direction + + The forward direction is the direction that CREATE/CREATE2 cells + are sent. + +5.5.2.1. Routing from the Origin + + When a relay cell is sent from an OP, the OP encrypts the payload + with the stream cipher as follows: + + OP sends relay cell: + For I=N...1, where N is the destination node: + Encrypt with Kf_I. + Transmit the encrypted cell to node 1. + +5.5.2.2. 
Relaying Forward at Onion Routers + + When a forward relay cell is received by an OR, it decrypts the payload + with the stream cipher, as follows: + + 'Forward' relay cell: + Use Kf as key; decrypt. + + The OR then decides whether it recognizes the relay cell, by + inspecting the payload as described in section 6.1 below. If the OR + recognizes the cell, it processes the contents of the relay cell. + Otherwise, it passes the decrypted relay cell along the circuit if + the circuit continues. If the OR at the end of the circuit + encounters an unrecognized relay cell, an error has occurred: the OR + sends a DESTROY cell to tear down the circuit. + + For more information, see section 6 below. + +5.5.3. Backward Direction + + The backward direction is the opposite direction from + CREATE/CREATE2 cells. + +5.5.3.1. Relaying Backward at Onion Routers + + When a backward relay cell is received by an OR, it encrypts the payload + with the stream cipher, as follows: + + 'Backward' relay cell: + Use Kb as key; encrypt. + +5.5.3.2. Routing to the Origin + + When a relay cell arrives at an OP, the OP decrypts the payload + with the stream cipher as follows: + + OP receives relay cell from node 1: + For I=1...N, where N is the final node on the circuit: + Decrypt with Kb_I. + If the payload is recognized (see section 6.1), then: + The sending node is I. + Stop and process the payload. + +5.6. Handling relay_early cells + + A RELAY_EARLY cell is designed to limit the length any circuit can reach. + When an OR receives a RELAY_EARLY cell, and the next node in the circuit + is speaking v2 of the link protocol or later, the OR relays the cell as a + RELAY_EARLY cell. Otherwise, older Tors will relay it as a RELAY cell. + + If a node ever receives more than 8 RELAY_EARLY cells on a given + outbound circuit, it SHOULD close the circuit. If it receives any + inbound RELAY_EARLY cells, it MUST close the circuit immediately. 
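The RELAY_EARLY limits above can be tracked with one counter per circuit. The class below is a hypothetical sketch, not taken from any implementation; it only decides whether the circuit may stay open and leaves the actual teardown to the caller.

```python
MAX_OUTBOUND_RELAY_EARLY = 8


class RelayEarlyPolicy:
    """Per-circuit bookkeeping for received RELAY_EARLY cells."""

    def __init__(self) -> None:
        self.outbound_seen = 0

    def on_relay_early(self, outbound: bool) -> bool:
        """Record one received RELAY_EARLY cell.

        Returns True if the circuit may stay open, False if it should be
        closed (too many outbound cells, or any inbound cell).
        """
        if not outbound:
            return False
        self.outbound_seen += 1
        return self.outbound_seen <= MAX_OUTBOUND_RELAY_EARLY
```

Since each EXTEND/EXTEND2 must travel in a RELAY_EARLY cell, this cap bounds how many times a circuit can be extended through the node.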
+ + When speaking v2 of the link protocol or later, clients MUST only send + EXTEND/EXTEND2 cells inside RELAY_EARLY cells. Clients SHOULD send the first + ~8 RELAY cells that are not targeted at the first hop of any circuit as + RELAY_EARLY cells too, in order to partially conceal the circuit length. + + [Starting with Tor 0.2.3.11-alpha, relays should reject any + EXTEND/EXTEND2 cell not received in a RELAY_EARLY cell.] + +6. Application connections and stream management + +6.1. Relay cells + + Within a circuit, the OP and the end node use the contents of + RELAY packets to tunnel end-to-end commands and TCP connections + ("Streams") across circuits. End-to-end commands can be initiated + by either edge; streams are initiated by the OP. + + End nodes that accept streams may be: + * exit relays (RELAY_BEGIN, anonymous), + * directory servers (RELAY_BEGIN_DIR, anonymous or non-anonymous), + * onion services (RELAY_BEGIN, anonymous via a rendezvous point). + + The payload of each unencrypted RELAY cell consists of: + + Relay command [1 byte] + 'Recognized' [2 bytes] + StreamID [2 bytes] + Digest [4 bytes] + Length [2 bytes] + Data [Length bytes] + Padding [PAYLOAD_LEN - 11 - Length bytes] + + The relay commands are: + + 1 -- RELAY_BEGIN [forward] + 2 -- RELAY_DATA [forward or backward] + 3 -- RELAY_END [forward or backward] + 4 -- RELAY_CONNECTED [backward] + 5 -- RELAY_SENDME [forward or backward] [sometimes control] + 6 -- RELAY_EXTEND [forward] [control] + 7 -- RELAY_EXTENDED [backward] [control] + 8 -- RELAY_TRUNCATE [forward] [control] + 9 -- RELAY_TRUNCATED [backward] [control] + 10 -- RELAY_DROP [forward or backward] [control] + 11 -- RELAY_RESOLVE [forward] + 12 -- RELAY_RESOLVED [backward] + 13 -- RELAY_BEGIN_DIR [forward] + 14 -- RELAY_EXTEND2 [forward] [control] + 15 -- RELAY_EXTENDED2 [backward] [control] + + 16..18 -- Reserved for UDP; Not yet in use, see prop339. + + 19..22 -- Reserved for Conflux, see prop329. 
+ + 32..40 -- Used for hidden services; see rend-spec-{v2,v3}.txt. + + 41..42 -- Used for circuit padding; see Section 3 of padding-spec.txt. + + Used for flow control; see Section 4 of prop324. + 43 -- XON [forward or backward] + 44 -- XOFF [forward or backward] + + Commands labelled as "forward" must only be sent by the originator + of the circuit. Commands labelled as "backward" must only be sent by + other nodes in the circuit back to the originator. Commands marked + as either can be sent either by the originator or other nodes. + + The 'recognized' field is used as a simple indication that the cell + is still encrypted. It is an optimization to avoid calculating + expensive digests for every cell. When sending cells, the unencrypted + 'recognized' MUST be set to zero. + + When receiving and decrypting cells the 'recognized' will always be + zero if we're the endpoint that the cell is destined for. For cells + that we should relay, the 'recognized' field will usually be nonzero, + but will accidentally be zero with P=2^-16. + + When handling a relay cell, if the 'recognized' field in a + decrypted relay payload is zero, the 'digest' field is computed as + the first four bytes of the running digest of all the bytes that have + been destined for this hop of the circuit or originated from this hop + of the circuit, seeded from Df or Db respectively (obtained in + section 5.2 above), and including this RELAY cell's entire payload + (taken with the digest field set to zero). Note that these digests + _do_ include the padding bytes at the end of the cell, not only those up + to "Length". If the digest is correct, the cell is considered "recognized" + for the purposes of decryption (see section 5.5 above). + + (The digest does not include any bytes from relay cells that do + not start or end at this hop of the circuit. That is, it does not + include forwarded data. 
Therefore if 'recognized' is zero but the + digest does not match, the running digest at that node should + not be updated, and the cell should be forwarded on.) + + All RELAY cells pertaining to the same tunneled stream have the same + stream ID. StreamIDs are chosen arbitrarily by the OP. No stream + may have a StreamID of zero. Rather, RELAY cells that affect the + entire circuit rather than a particular stream use a StreamID of zero + -- they are marked in the table above as "[control]" style + cells. (Sendme cells are marked as "sometimes control" because they + can include a StreamID or not depending on their purpose -- see + Section 7.) + + The 'Length' field of a relay cell contains the number of bytes in + the relay payload which contain real payload data. The remainder of + the unencrypted payload is padded with padding bytes. Implementations + handle padding bytes of unencrypted relay cells as they do padding + bytes for other cell types; see Section 3. + + The 'Padding' field is used to make relay cell contents unpredictable, to + avoid certain attacks (see proposal 289 for rationale). Implementations + SHOULD fill this field with four zero-valued bytes, followed by as many + random bytes as will fit. (If there are fewer than 4 bytes for padding, + then they should all be filled with zero.) + + Implementations MUST NOT rely on the contents of the 'Padding' field. + + If the RELAY cell is recognized but the relay command is not + understood, the cell must be dropped and ignored. Its contents + still count with respect to the digests and flow control windows, though. + +6.1.1. Calculating the 'Digest' field + + The 'Digest' field itself serves the purpose to check if a cell has been + fully decrypted, that is, all onion layers have been removed. Having a + single field, namely 'Recognized' is not sufficient, as outlined above. 
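The relay payload layout from 6.1 and the running-digest rule can be exercised together in a small runnable sketch. Several things here are assumptions made for illustration: the hash is taken to be SHA-1 seeded by feeding it Df (or Db), PAYLOAD_LEN is taken to be 509, the padding is all zeroes rather than partly random so that the output is reproducible, and the onion encryption layers are left out entirely. The function names are hypothetical.

```python
import hashlib
import struct

PAYLOAD_LEN = 509   # assumed relay payload size
HEADER_LEN = 11     # command, recognized, stream ID, digest, length


def pack_relay(cmd: int, stream_id: int, data: bytes,
               digest: bytes = b"\x00" * 4) -> bytes:
    """Encode an unencrypted relay payload with 'recognized' set to zero."""
    if len(data) > PAYLOAD_LEN - HEADER_LEN:
        raise ValueError("data too long for one relay cell")
    header = struct.pack(">BHH4sH", cmd, 0, stream_id, digest, len(data))
    padding = bytes(PAYLOAD_LEN - HEADER_LEN - len(data))
    return header + data + padding


def seal(running, cmd: int, stream_id: int, data: bytes) -> bytes:
    """Sender side: update the running digest, then fill in the digest field."""
    running.update(pack_relay(cmd, stream_id, data))
    return pack_relay(cmd, stream_id, data, running.digest()[:4])


def check(running, payload: bytes) -> bool:
    """Receiver side: True if the fully decrypted payload is recognized.

    The running digest is only advanced when the digest matches.
    """
    if payload[1:3] != b"\x00\x00":
        return False
    zeroed = payload[:5] + b"\x00" * 4 + payload[9:]
    trial = running.copy()
    trial.update(zeroed)
    if trial.digest()[:4] != payload[5:9]:
        return False
    running.update(zeroed)
    return True
```

Both endpoints would start from `hashlib.sha1(Df)` for the forward direction. Working on a copy of the hash state in `check` keeps a failed match from disturbing the running digest, which is what allows the cell to be forwarded on unchanged.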
+ + When ENCRYPTING a RELAY cell, an implementation does the following: + + # Encode the cell in binary (recognized and digest set to zero) + tmp = cmd + [0, 0] + stream_id + [0, 0, 0, 0] + length + data + padding + + # Update the digest with the encoded data + digest_state = hash_update(digest_state, tmp) + digest = hash_calculate(digest_state) + + # The encoded data is the same as above with the digest field not being + # zero anymore + encoded = cmd + [0, 0] + stream_id + digest[0..4] + length + data + + padding + + # Now we can encrypt the cell by adding the onion layers ... + + When DECRYPTING a RELAY cell, an implementation does the following: + + decrypted = decrypt(cell) + + # Replace the digest field in decrypted by zeros + tmp = decrypted[0..5] + [0, 0, 0, 0] + decrypted[9..] + + # Update the digest field with the decrypted data and its digest field + # set to zero + digest_state = hash_update(digest_state, tmp) + digest = hash_calculate(digest_state) + + if digest[0..4] == decrypted[5..9] + # The cell has been fully decrypted ... + + The caveat itself is that only the binary data with the digest bytes set to + zero are being taken into account when calculating the running digest. The + final plain-text cells (with the digest field set to its actual value) are + not taken into the running digest. + +6.2. Opening streams and transferring data + + To open a new anonymized TCP connection, the OP chooses an open + circuit to an exit that may be able to connect to the destination + address, selects an arbitrary StreamID not yet used on that circuit, + and constructs a RELAY_BEGIN cell with a payload encoding the address + and port of the destination host. 
The payload format is: + + ADDRPORT [nul-terminated string] + FLAGS [4 bytes] + + ADDRPORT is made of ADDRESS | ':' | PORT | [00] + + where ADDRESS can be a DNS hostname, or an IPv4 address in + dotted-quad format, or an IPv6 address surrounded by square brackets; + and where PORT is a decimal integer between 1 and 65535, inclusive. + + The ADDRPORT string SHOULD be sent in lower case, to avoid + fingerprinting. Implementations MUST accept strings in any case. + + The FLAGS value has one or more of the following bits set, where + "bit 1" is the LSB of the 32-bit value, and "bit 32" is the MSB. + (Remember that all values in Tor are big-endian (see 0.1.1 above), so + the MSB of a 4-byte value is the MSB of the first byte, and the LSB + of a 4-byte value is the LSB of its last byte.) + + bit meaning + 1 -- IPv6 okay. We support learning about IPv6 addresses and + connecting to IPv6 addresses. + 2 -- IPv4 not okay. We don't want to learn about IPv4 addresses + or connect to them. + 3 -- IPv6 preferred. If there are both IPv4 and IPv6 addresses, + we want to connect to the IPv6 one. (By default, we connect + to the IPv4 address.) + 4..32 -- Reserved. Current clients MUST NOT set these. Servers + MUST ignore them. + + Upon receiving this cell, the exit node resolves the address as + necessary, and opens a new TCP connection to the target port. If the + address cannot be resolved, or a connection can't be established, the + exit node replies with a RELAY_END cell. (See 6.3 below.) 
+ Otherwise, the exit node replies with a RELAY_CONNECTED cell, whose + payload is in one of the following formats: + + The IPv4 address to which the connection was made [4 octets] + A number of seconds (TTL) for which the address may be cached [4 octets] + + or + + Four zero-valued octets [4 octets] + An address type (6) [1 octet] + The IPv6 address to which the connection was made [16 octets] + A number of seconds (TTL) for which the address may be cached [4 octets] + + [Tor exit nodes before 0.1.2.0 set the TTL field to a fixed value. Later + versions set the TTL to the last value seen from a DNS server, and expire + their own cached entries after a fixed interval. This prevents certain + attacks.] + + Once a connection has been established, the OP and exit node + package stream data in RELAY_DATA cells, and upon receiving such + cells, echo their contents to the corresponding TCP stream. + + If the exit node does not support optimistic data (i.e. its + version number is before 0.2.3.1-alpha), then the OP MUST wait + for a RELAY_CONNECTED cell before sending any data. If the exit + node supports optimistic data (i.e. its version number is + 0.2.3.1-alpha or later), then the OP MAY send RELAY_DATA cells + immediately after sending the RELAY_BEGIN cell (and before + receiving either a RELAY_CONNECTED or RELAY_END cell). + + RELAY_DATA cells sent to unrecognized streams are dropped. If + the exit node supports optimistic data, then RELAY_DATA cells it + receives on streams which have seen RELAY_BEGIN but have not yet + been replied to with a RELAY_CONNECTED or RELAY_END are queued. + If the stream creation succeeds with a RELAY_CONNECTED, the queue + is processed immediately afterwards; if the stream creation fails + with a RELAY_END, the contents of the queue are deleted. + + Relay RELAY_DROP cells are long-range dummies; upon receiving such + a cell, the OR or OP must drop it. + +6.2.1. 
Opening a directory stream + + If a Tor relay is a directory server, it should respond to a + RELAY_BEGIN_DIR cell as if it had received a BEGIN cell requesting a + connection to its directory port. RELAY_BEGIN_DIR cells ignore exit + policy, since the stream is local to the Tor process. + + Directory servers may be: + * authoritative directories (RELAY_BEGIN_DIR, usually non-anonymous), + * bridge authoritative directories (RELAY_BEGIN_DIR, anonymous), + * directory mirrors (RELAY_BEGIN_DIR, usually non-anonymous), + * onion service directories (RELAY_BEGIN_DIR, anonymous). + + If the Tor relay is not running a directory service, it should respond + with a REASON_NOTDIRECTORY RELAY_END cell. + + Clients MUST generate an all-zero payload for RELAY_BEGIN_DIR cells, + and relays MUST ignore the payload. + + In response to a RELAY_BEGIN_DIR cell, relays respond either with a + RELAY_CONNECTED cell on success, or a RELAY_END cell on failure. They + MUST send a RELAY_CONNECTED cell all-zero payload, and clients MUST ignore + the payload. + + [RELAY_BEGIN_DIR was not supported before Tor 0.1.2.2-alpha; clients + SHOULD NOT send it to routers running earlier versions of Tor.] + +6.3. Closing streams + + When an anonymized TCP connection is closed, or an edge node + encounters error on any stream, it sends a 'RELAY_END' cell along the + circuit (if possible) and closes the TCP connection immediately. If + an edge node receives a 'RELAY_END' cell for any stream, it closes + the TCP connection completely, and sends nothing more along the + circuit for that stream. + + The payload of a RELAY_END cell begins with a single 'reason' byte to + describe why the stream is closing. For some reasons, it contains + additional data (depending on the reason.) 
The values are: + + 1 -- REASON_MISC (catch-all for unlisted reasons) + 2 -- REASON_RESOLVEFAILED (couldn't look up hostname) + 3 -- REASON_CONNECTREFUSED (remote host refused connection) [*] + 4 -- REASON_EXITPOLICY (OR refuses to connect to host or port) + 5 -- REASON_DESTROY (Circuit is being destroyed) + 6 -- REASON_DONE (Anonymized TCP connection was closed) + 7 -- REASON_TIMEOUT (Connection timed out, or OR timed out + while connecting) + 8 -- REASON_NOROUTE (Routing error while attempting to + contact destination) + 9 -- REASON_HIBERNATING (OR is temporarily hibernating) + 10 -- REASON_INTERNAL (Internal error at the OR) + 11 -- REASON_RESOURCELIMIT (OR has no resources to fulfill request) + 12 -- REASON_CONNRESET (Connection was unexpectedly reset) + 13 -- REASON_TORPROTOCOL (Sent when closing connection because of + Tor protocol violations.) + 14 -- REASON_NOTDIRECTORY (Client sent RELAY_BEGIN_DIR to a + non-directory relay.) + + [*] Older versions of Tor also send this reason when connections are + reset. + + OPs and ORs MUST accept reasons not on the above list, since future + versions of Tor may provide more fine-grained reasons. + + For most reasons, the format of RELAY_END is: + + Reason [1 byte] + + For REASON_EXITPOLICY, the format of RELAY_END is: + + Reason [1 byte] + IPv4 or IPv6 address [4 bytes or 16 bytes] + TTL [4 bytes] + + (If the TTL is absent, it should be treated as if it were 0xffffffff. + If the address is absent or is the wrong length, the RELAY_END message + should be processed anyway.) + + Tors SHOULD NOT send any reason except REASON_MISC for a stream that they + have originated. + + Implementations SHOULD accept empty RELAY_END messages, and treat them + as if they specified REASON_MISC. + + Upon receiving a RELAY_END cell, the recipient may be sure that no further + cells will arrive on that stream, and can treat such cells as a protocol + violation. 
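As an illustration of the RELAY_END payload rules above, the following Python sketch decodes a payload into a reason, an optional address, and a TTL. The function name `parse_relay_end`, the constant names, and the returned tuple shape are assumptions made for this example; they are not part of the specification or of any Tor implementation.

```python
import ipaddress
import struct

REASON_MISC = 1
REASON_EXITPOLICY = 4
DEFAULT_TTL = 0xFFFFFFFF  # value assumed when the TTL field is absent


def parse_relay_end(payload: bytes):
    """Decode a RELAY_END payload into (reason, address, ttl).

    Hypothetical helper: the name and return shape are illustrative only.
    """
    # An empty RELAY_END is treated as if it specified REASON_MISC.
    if len(payload) == 0:
        return REASON_MISC, None, None

    reason = payload[0]
    rest = payload[1:]

    # Only REASON_EXITPOLICY carries extra data. Reasons that are not on
    # the list are accepted and returned unchanged.
    if reason != REASON_EXITPOLICY:
        return reason, None, None

    address = None
    ttl = DEFAULT_TTL
    # The address is 4 bytes (IPv4) or 16 bytes (IPv6), optionally
    # followed by a 4-byte big-endian TTL.
    for addr_len in (4, 16):
        if len(rest) in (addr_len, addr_len + 4):
            address = ipaddress.ip_address(rest[:addr_len])
            if len(rest) == addr_len + 4:
                (ttl,) = struct.unpack("!I", rest[addr_len:])
            break

    # A missing or wrong-length address is not an error: the message is
    # still processed, just without an address.
    return reason, address, ttl
```

Returning `None` for the address instead of raising keeps the "process the message anyway" rule visible at the call site.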
+ + After sending a RELAY_END cell, the sender needs to give the recipient + time to receive that cell. In the meantime, the sender SHOULD remember + how many cells of which types (CONNECTED, SENDME, DATA) that it would have + accepted on that stream, and SHOULD kill the circuit if it receives more + than permitted. + + --- [The rest of this section describes unimplemented functionality.] + + Because TCP connections can be half-open, we follow an equivalent + to TCP's FIN/FIN-ACK/ACK protocol to close streams. + + An exit (or onion service) connection can have a TCP stream in one of + three states: 'OPEN', 'DONE_PACKAGING', and 'DONE_DELIVERING'. For the + purposes of modeling transitions, we treat 'CLOSED' as a fourth state, + although connections in this state are not, in fact, tracked by the + onion router. + + A stream begins in the 'OPEN' state. Upon receiving a 'FIN' from + the corresponding TCP connection, the edge node sends a 'RELAY_FIN' + cell along the circuit and changes its state to 'DONE_PACKAGING'. + Upon receiving a 'RELAY_FIN' cell, an edge node sends a 'FIN' to + the corresponding TCP connection (e.g., by calling + shutdown(SHUT_WR)) and changes its state to 'DONE_DELIVERING'. + + When a stream is already in 'DONE_DELIVERING' and receives a 'FIN', it + also sends a 'RELAY_FIN' along the circuit, and changes its state + to 'CLOSED'. When a stream already in 'DONE_PACKAGING' receives a + 'RELAY_FIN' cell, it sends a 'FIN' and changes its state to + 'CLOSED'. + + If an edge node encounters an error on any stream, it sends a + 'RELAY_END' cell (if possible) and closes the stream immediately. + +6.4. Remote hostname lookup + + To find the address associated with a hostname, the OP sends a + RELAY_RESOLVE cell containing the hostname to be resolved with a NUL + terminating byte. (For a reverse lookup, the OP sends a RELAY_RESOLVE + cell containing an in-addr.arpa address.) The OR replies with a + RELAY_RESOLVED cell containing any number of answers.
Each answer is + of the form: + + Type (1 octet) + Length (1 octet) + Value (variable-width) + TTL (4 octets) + "Length" is the length of the Value field. + "Type" is one of: + + 0x00 -- Hostname + 0x04 -- IPv4 address + 0x06 -- IPv6 address + 0xF0 -- Error, transient + 0xF1 -- Error, nontransient + + If any answer has a type of 'Error', then no other answer may be + given. + + The 'Value' field encodes the answer: + IP addresses are given in network order. + Hostnames are given in standard DNS order ("www.example.com") + and not NUL-terminated. + The content of Errors is currently ignored. Relays currently + set it to the string "Error resolving hostname" with no + terminating NUL. Implementations MUST ignore this value. + + For backward compatibility, if there are any IPv4 answers, one of those + must be given as the first answer. + + The RELAY_RESOLVE cell must use a nonzero, distinct streamID; the + corresponding RELAY_RESOLVED cell must use the same streamID. No stream + is actually created by the OR when resolving the name. + +7. Flow control + +7.1. Link throttling + + Each client or relay should do appropriate bandwidth throttling to + keep its user happy. + + Communicants rely on TCP's default flow control to push back when they + stop reading. + + The mainline Tor implementation uses token buckets (one for reads, + one for writes) for the rate limiting. + + Since 0.2.0.x, Tor has let the user specify an additional pair of + token buckets for "relayed" traffic, so people can deploy a Tor relay + with strict rate limiting, but also use the same Tor as a client. To + avoid partitioning concerns we combine both classes of traffic over a + given OR connection, and keep track of the last time we read or wrote + a high-priority (non-relayed) cell. If it's been less than N seconds + (currently N=30), we give the whole connection high priority, else we + give the whole connection low priority. 
We also give low priority + to reads and writes for connections that are serving directory + information. See proposal 111 for details. + +7.2. Link padding + + Link padding can be created by sending PADDING or VPADDING cells + along the connection; relay cells of type "DROP" can be used for + long-range padding. The payloads of PADDING, VPADDING, or DROP + cells are filled with padding bytes. See Section 3. + + If the link protocol is version 5 or higher, link level padding is + enabled as per padding-spec.txt. On these connections, clients may + negotiate the use of padding with a CELL_PADDING_NEGOTIATE command + whose format is as follows: + + Version [1 byte] + Command [1 byte] + ito_low_ms [2 bytes] + ito_high_ms [2 bytes] + + Currently, only version 0 of this cell is defined. In it, the command + field is either 1 (stop padding) or 2 (start padding). For the start + padding command, a pair of timeout values specifying a low and a high + range bounds for randomized padding timeouts may be specified as unsigned + integer values in milliseconds. The ito_low_ms field should not be lower + than the current consensus parameter value for nf_ito_low (default: + 1500). The ito_high_ms field should not be lower than ito_low_ms. + (If any party receives an out-of-range value, they clamp it so + that it is in-range.) + + For the stop padding command, the timeout fields should be sent as + zero (to avoid client distinguishability) and ignored by the recipient. + + For more details on padding behavior, see padding-spec.txt. + +7.3. Circuit-level flow control + + To control a circuit's bandwidth usage, each OR keeps track of two + 'windows', consisting of how many RELAY_DATA cells it is allowed to + originate or willing to consume. + + These two windows are respectively named: the package window (packaged for + transmission) and the deliver window (delivered for local streams). 
+ + Because of our leaky-pipe topology, every relay on the circuit has a pair + of windows, and the OP has a pair of windows for every relay on the + circuit. These windows do not apply to relayed cells, however, and a relay + that is never used for streams will never decrement its window or cause the + client to decrement a window. + + Each 'window' value is initially set based on the consensus parameter + 'circwindow' in the directory (see dir-spec.txt), or to 1000 data cells if + no 'circwindow' value is given. In each direction, cells that are not + RELAY_DATA cells do not affect the window. + + An OR or OP (depending on the stream direction) sends a RELAY_SENDME cell + to indicate that it is willing to receive more cells when its deliver + window goes down below a full increment (100). For example, if the window + started at 1000, it should send a RELAY_SENDME when it reaches 900. + + When an OR or OP receives a RELAY_SENDME, it increments its package window + by a value of 100 (circuit window increment) and proceeds to sending the + remaining RELAY_DATA cells. + + If a package window reaches 0, the OR or OP stops reading from TCP + connections for all streams on the corresponding circuit, and sends no more + RELAY_DATA cells until receiving a RELAY_SENDME cell. + + If a deliver window goes below 0, the circuit should be torn down. + + Starting with tor-0.4.1.1-alpha, authenticated SENDMEs are supported + (version 1, see below). This means that both the OR and OP need to remember + the rolling digest of the cell that precedes (triggers) a RELAY_SENDME. + This can be known if the package window gets to a multiple of the circuit + window increment (100). + + When the RELAY_SENDME version 1 arrives, it will contain a digest that MUST + match the one remembered. This represents a proof that the end point of the + circuit saw the sent cells. On failure to match, the circuit should be torn + down. 
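The window accounting described above can be sketched in Python as follows. This is a minimal model under stated assumptions, not the behaviour of any Tor implementation: the class and method names are hypothetical, digest tracking for authenticated SENDMEs is omitted, and the sketch assumes that the deliver window is replenished by one increment at the moment a RELAY_SENDME is sent.

```python
CIRCWINDOW_DEFAULT = 1000    # used when no 'circwindow' parameter is given
CIRCWINDOW_INCREMENT = 100   # circuit window increment


class CircuitTornDown(Exception):
    """Raised when the deliver window goes below zero."""


class CircuitWindows:
    """Hypothetical per-hop window pair for one circuit."""

    def __init__(self, circwindow=CIRCWINDOW_DEFAULT):
        self.initial = circwindow
        self.package_window = circwindow
        self.deliver_window = circwindow

    def can_package(self):
        # At zero, stop reading from streams until a SENDME arrives.
        return self.package_window > 0

    def on_data_packaged(self):
        # Called once per RELAY_DATA cell originated on this circuit.
        if not self.can_package():
            raise RuntimeError("package window is empty")
        self.package_window -= 1

    def on_sendme_received(self):
        # A RELAY_SENDME lets the sender package another increment.
        self.package_window += CIRCWINDOW_INCREMENT

    def on_data_delivered(self):
        """Account for one received RELAY_DATA cell.

        Returns True when a RELAY_SENDME should be sent.
        """
        self.deliver_window -= 1
        if self.deliver_window < 0:
            raise CircuitTornDown("deliver window went below zero")
        if self.deliver_window <= self.initial - CIRCWINDOW_INCREMENT:
            # Assumption: sending the SENDME restores one increment.
            self.deliver_window += CIRCWINDOW_INCREMENT
            return True
        return False
```

With the default window of 1000, `on_data_delivered` first returns True on the 100th cell, when the window reaches 900, which matches the example in the text.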
+ + To ensure unpredictability, random bytes should be added to at least one + RELAY_DATA cell within one increment window. In other words, every 100 cells + (increment), random bytes should be introduced in at least one cell. + +7.3.1. SENDME Cell Format + + A circuit-level RELAY_SENDME cell always has its StreamID=0. + + An OR or OP must obey these two consensus parameters in order to know which + version to emit and accept. + + 'sendme_emit_min_version': Minimum version to emit. + 'sendme_accept_min_version': Minimum version to accept. + + If a RELAY_SENDME version is received that is below the minimum accepted + version, the circuit should be closed. + + The RELAY_SENDME payload contains the following: + + VERSION [1 byte] + DATA_LEN [2 bytes] + DATA [DATA_LEN bytes] + + The VERSION tells us what is expected in the DATA section of length + DATA_LEN and how to handle it. The recognized values are: + + 0x00: The rest of the payload should be ignored. + + 0x01: Authenticated SENDME. The DATA section MUST contain: + + DIGEST [20 bytes] + + If the DATA_LEN value is less than 20 bytes, the cell should be + dropped and the circuit closed. If the value is more than 20 bytes, + then the first 20 bytes should be read to get the DIGEST value. + + The DIGEST is the rolling digest value from the RELAY_DATA cell that + immediately preceded (triggered) this RELAY_SENDME. This value is + matched on the other side from the previous cell sent that the OR/OP + must remember. + + (Note that if the digest in use has an output length greater than 20 + bytes -- as is the case for the hop of an onion service rendezvous + circuit created by the hs_ntor handshake -- we truncate the digest + to 20 bytes here.) + + If the VERSION is unrecognized or below the minimum accepted version (taken + from the consensus), the circuit should be torn down. + +7.4.
Stream-level flow control + + Edge nodes use RELAY_SENDME cells to implement end-to-end flow + control for individual connections across circuits. Similarly to + circuit-level flow control, edge nodes begin with a window of cells + (500) per stream, and increment the window by a fixed value (50) + upon receiving a RELAY_SENDME cell. Edge nodes initiate RELAY_SENDME + cells when both a) the window is <= 450, and b) there are less than + ten cell payloads remaining to be flushed at that edge. + + Stream-level RELAY_SENDME cells are distinguished by having nonzero + StreamID. They are still empty; the body still SHOULD be ignored. + + +8. Handling resource exhaustion + + +8.1. Memory exhaustion. + + (See also dos-spec.md.) + + If RAM becomes low, an OR should begin destroying circuits until + more memory is free again. We recommend the following algorithm: + + - Set a threshold amount of RAM to recover at 10% of the total RAM. + + - Sort the circuits by their 'staleness', defined as the age of the + oldest data queued on the circuit. This data can be: + + * Bytes that are waiting to flush to or from a stream on that + circuit. + + * Bytes that are waiting to flush from a connection created with + BEGIN_DIR. + + * Cells that are waiting to flush or be processed. + + - While we have not yet recovered enough RAM: + + * Free all memory held by the most stale circuit, and send DESTROY + cells in both directions on that circuit. Count the amount of + memory we recovered towards the total. + +9. Subprotocol versioning + + This section specifies the Tor subprotocol versioning. They are broken down + into different types with their current version numbers. Any new version + number should be added to this section. + + The dir-spec.txt details how those versions are encoded. 
See the + "proto"/"pr" line in a descriptor and the "recommended-relay-protocols", + "required-relay-protocols", "recommended-client-protocols" and + "required-client-protocols" lines in the vote/consensus format. + + Here are the rules a relay and client should follow when encountering a + protocol list in the consensus: + + - When a relay lacks a protocol listed in recommended-relay-protocols, + it should warn its operator that the relay is obsolete. + + - When a relay lacks a protocol listed in required-relay-protocols, it + should warn its operator as above. If the consensus is newer than the + date when the software was released or scheduled for release, it must + not attempt to join the network. + + - When a client lacks a protocol listed in recommended-client-protocols, + it should warn the user that the client is obsolete. + + - When a client lacks a protocol listed in required-client-protocols, + it should warn the user as above. If the consensus is newer than the + date when the software was released, it must not connect to the + network. This implements a "safe forward shutdown" mechanism for + zombie clients. + + - If a client or relay has a cached consensus telling it that a given + protocol is required, and it does not implement that protocol, it + SHOULD NOT try to fetch a newer consensus. + + Software release dates SHOULD be automatically updated as part of the + release process, to prevent forgetting to move them forward. Software + release dates MAY be manually adjusted by maintainers if necessary. + + Starting in version 0.2.9.4-alpha, the initial required protocols for + clients that we will Recommend and Require are: + + Cons=1-2 Desc=1-2 DirCache=1 HSDir=1 HSIntro=3 HSRend=1 Link=4 + LinkAuth=1 Microdesc=1-2 Relay=2 + + For relays we will Require: + + Cons=1 Desc=1 DirCache=1 HSDir=1 HSIntro=3 HSRend=1 Link=3-4 + LinkAuth=1 Microdesc=1 Relay=1-2 + + For relays, we will additionally Recommend all protocols which we + recommend for clients. 
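To make the rules above concrete, here is a small Python sketch that parses a protocol list such as the ones shown and reports which required versions a given implementation lacks. The function names are hypothetical, and the sketch assumes the entry syntax of Name=versions with comma-separated single versions or low-high ranges; dir-spec.txt remains the authoritative definition of the encoding.

```python
def parse_protocols(line):
    """Parse e.g. 'Link=3-4 Relay=1-2' into {'Link': {3, 4}, 'Relay': {1, 2}}.

    Hypothetical helper. It performs no validation and puts no bound on
    the size of a range, which real parsers would need.
    """
    protos = {}
    for entry in line.split():
        name, _, versions = entry.partition("=")
        supported = protos.setdefault(name, set())
        for part in versions.split(","):
            if not part:
                continue
            low, _, high = part.partition("-")
            # A single version 'N' is the range N-N.
            supported.update(range(int(low), int(high or low) + 1))
    return protos


def missing_protocols(required, supported):
    """Return the required versions that are absent from 'supported'."""
    missing = {}
    for name, versions in required.items():
        lacking = versions - supported.get(name, set())
        if lacking:
            missing[name] = lacking
    return missing


# Example: a relay that implements the client list above, checked against
# the list required of relays.
required = parse_protocols(
    "Cons=1 Desc=1 DirCache=1 HSDir=1 HSIntro=3 HSRend=1 Link=3-4 "
    "LinkAuth=1 Microdesc=1 Relay=1-2"
)
mine = parse_protocols(
    "Cons=1-2 Desc=1-2 DirCache=1 HSDir=1 HSIntro=3 HSRend=1 Link=4 "
    "LinkAuth=1 Microdesc=1-2 Relay=2"
)
lacking = missing_protocols(required, mine)
```

A non-empty result for a required list is the case in which the software warns its operator and, if the consensus is newer than its release date, stays off the network.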
+ +9.1. "Link" + + The "link" protocols are those used by clients and relays to initiate and + receive OR connections and to handle cells on OR connections. The "link" + protocol versions correspond 1:1 to those versions. + + Two Tor instances can make a connection to each other only if they have at + least one link protocol in common. + + The current "link" versions are: "1" through "5". See section 4.1 for more + information. All current Tor versions support "1-3"; versions from + 0.2.4.11-alpha and on support "1-4"; versions from 0.3.1.1-alpha and on + support "1-5". Eventually we will drop "1" and "2". + +9.2. "LinkAuth" + + LinkAuth protocols correspond to varieties of Authenticate cells used for + the v3+ link protocols. + + Current versions are: + + "1" is the RSA link authentication described in section 4.4.1 above. + + "2" is unused, and reserved by proposal 244. + + "3" is the ed25519 link authentication described in 4.4.2 above. + + +9.3. "Relay" + + The "relay" protocols are those used to handle CREATE/CREATE2 + cells, and those that handle the various RELAY cell types received + after a CREATE/CREATE2 cell. (Except, relay cells used to manage + introduction and rendezvous points are managed with the "HSIntro" + and "HSRend" protocols respectively.) + + Current versions are: + + "1" -- supports the TAP key exchange, with all features in Tor 0.2.3. + Support for CREATE and CREATED and CREATE_FAST and CREATED_FAST + and EXTEND and EXTENDED. + + "2" -- supports the ntor key exchange, and all features in Tor + 0.2.4.19. Includes support for CREATE2 and CREATED2 and + EXTEND2 and EXTENDED2. + + Relay=2 has limited IPv6 support: + * Clients might not include IPv6 ORPorts in EXTEND2 cells. + * Relays (and bridges) might not initiate IPv6 connections in + response to EXTEND2 cells containing IPv6 ORPorts, even if they + are configured with an IPv6 ORPort. + + However, relays support accepting inbound connections to their IPv6 + ORPorts. 
And they might extend circuits via authenticated IPv6 + connections to other relays. + + "3" -- relays support extending over IPv6 connections in response to an + EXTEND2 cell containing an IPv6 ORPort. + + Bridges might not extend over IPv6, because they try to imitate + client behaviour. + + A successful IPv6 extend requires: + * Relay subprotocol version 3 (or later) on the extending relay, + * an IPv6 ORPort on the extending relay, + * an IPv6 ORPort for the accepting relay in the EXTEND2 cell, and + * an IPv6 ORPort on the accepting relay. + (Because different tor instances can have different views of the + network, these checks should be done when the path is selected. + Extending relays should only check local IPv6 information, before + attempting the extend.) + + When relays receive an EXTEND2 cell containing both an IPv4 and an + IPv6 ORPort, and there is no existing authenticated connection with + the target relay, the extending relay may choose between IPv4 and + IPv6 at random. The extending relay might not try the other address, + if the first connection fails. + + As is the case with other subprotocol versions, tor advertises, + recommends, or requires support for this protocol version, regardless + of its current configuration. + + In particular: + * relays without an IPv6 ORPort, and + * tor instances that are not relays, + have the following behaviour, regardless of their configuration: + * advertise support for "Relay=3" in their descriptor + (if they are a relay, bridge, or directory authority), and + * react to consensuses recommending or requiring support for + "Relay=3". + + This subprotocol version is described in proposal 311, and + implemented in Tor 0.4.5.1-alpha. + + "4" -- support the ntorv3 (version 3) key exchange and all features in + 0.4.7.3-alpha. This adds a new CREATE2 cell type. See proposal 332 + and section 5.1.4.1 above for more details. + +9.4. "HSIntro" + + The "HSIntro" protocol handles introduction points. 
+ + "3" -- supports authentication as of proposal 121 in Tor + 0.2.1.6-alpha. + + "4" -- support ed25519 authentication keys which is defined by the HS v3 + protocol as part of proposal 224 in Tor 0.3.0.4-alpha. + + "5" -- support ESTABLISH_INTRO cell DoS parameters extension for onion + service version 3 only in Tor 0.4.2.1-alpha. + +9.5. "HSRend" + + The "HSRend" protocol handles rendezvous points. + + "1" -- supports all features in Tor 0.0.6. + + "2" -- supports RENDEZVOUS2 cells of arbitrary length as long as they + have 20 bytes of cookie in Tor 0.2.9.1-alpha. + +9.6. "HSDir" + + The "HSDir" protocols are the set of hidden service document types that can + be uploaded to, understood by, and downloaded from a tor relay, and the set + of URLs available to fetch them. + + "1" -- supports all features in Tor 0.2.0.10-alpha. + + "2" -- support ed25519 blinded keys request which is defined by the HS v3 + protocol as part of proposal 224 in Tor 0.3.0.4-alpha. + +9.7. "DirCache" + + The "DirCache" protocols are the set of documents available for download + from a directory cache via BEGIN_DIR, and the set of URLs available to + fetch them. (This excludes URLs for hidden service objects.) + + "1" -- supports all features in Tor 0.2.4.19. + + "2" -- adds support for consensus diffs in Tor 0.3.1.1-alpha. + +9.8. "Desc" + + Describes features present or absent in descriptors. + + Most features in descriptors don't require a "Desc" update -- only those + that need to someday be required. For example, someday clients will need + to understand ed25519 identities. + + "1" -- supports all features in Tor 0.2.4.19. + + "2" -- cross-signing with onion-keys, signing with ed25519 + identities. + +9.9. "Microdesc" + + Describes features present or absent in microdescriptors. + + Most features in descriptors don't require a "MicroDesc" update -- only + those that need to someday be required. These correspond more or less with + consensus methods. 
+ + "1" -- consensus methods 9 through 20. + + "2" -- consensus method 21 (adds ed25519 keys to microdescs). + +9.10. "Cons" + + Describes features present or absent in consensus documents. + + Most features in consensus documents don't require a "Cons" update -- only + those that need to someday be required. + + These correspond more or less with consensus methods. + + "1" -- consensus methods 9 through 20. + + "2" -- consensus method 21 (adds ed25519 keys to microdescs). + +9.11. "Padding" + + Describes the padding capabilities of the relay. + + "1" -- [DEFUNCT] Relay supports circuit-level padding. This version MUST NOT + be used as it was also enabled in relays that don't actually support + circuit-level padding. Advertised by Tor versions from + tor-0.4.0.1-alpha and only up to and including tor-0.4.1.4-rc. + + "2" -- Relay supports the HS circuit setup padding machines (proposal 302). + Advertised by Tor versions from tor-0.4.1.5 and onwards. + +9.12. "FlowCtrl" + + Describes the flow control protocol at the circuit and stream level. If + there is no FlowCtrl advertised, tor supports the unauthenticated flow + control features (version 0). + + "1" -- supports authenticated circuit level SENDMEs as of proposal 289 in + Tor 0.4.1.1-alpha. + + "2" -- supports congestion control by the Exits which implies a new SENDME + format and algorithm. See proposal 324 for more details. Advertised + in tor 0.4.7.3-alpha. + +9.13. "Datagram" + + Describes the UDP protocol capabilities of a relay. + + "1" -- [RESERVED] supports UDP by an Exit as in the relay command + CONNECT_UDP, CONNECTED_UDP and DATAGRAM. See proposal + 339 for more details. (Not yet advertised, reserved) diff --git a/attic/text_formats/version-spec.txt b/attic/text_formats/version-spec.txt new file mode 100644 index 0000000..615f6f2 --- /dev/null +++ b/attic/text_formats/version-spec.txt @@ -0,0 +1,86 @@ + + HOW TOR VERSION NUMBERS WORK + +Table of Contents + + 1. The Old Way + 2. The New Way + 3. 
Version status. + +1. The Old Way + + Before 0.1.0, versions were of the format: + + MAJOR.MINOR.MICRO(status(PATCHLEVEL))?(-cvs)? + + where MAJOR, MINOR, MICRO, and PATCHLEVEL are numbers, status is one + of "pre" (for an alpha release), "rc" (for a release candidate), or + "." for a release. As a special case, "a.b.c" was equivalent to + "a.b.c.0". We compare the elements in order (major, minor, micro, + status, patchlevel, cvs), with "cvs" preceding non-cvs. + + We would start each development branch with a final version in mind: + say, "0.0.8". Our first pre-release would be "0.0.8pre1", followed by + (for example) "0.0.8pre2-cvs", "0.0.8pre2", "0.0.8pre3-cvs", + "0.0.8rc1", "0.0.8rc2-cvs", and "0.0.8rc2". Finally, we'd release + 0.0.8. The stable CVS branch would then be versioned "0.0.8.1-cvs", + and any eventual bugfix release would be "0.0.8.1". + +2. The New Way + + Starting at 0.1.0.1-rc, versions are of the format: + + MAJOR.MINOR.MICRO[.PATCHLEVEL][-STATUS_TAG][ (EXTRA_INFO)]* + + The stuff in parentheses is optional. As before, MAJOR, MINOR, MICRO, + and PATCHLEVEL are numbers, with an absent number equivalent to 0. + All versions should be distinguishable purely by those four + numbers. + + The STATUS_TAG is purely informational, and lets you know how + stable we think the release is: "alpha" is pretty unstable; "rc" is a + release candidate; and no tag at all means that we have a final + release. If the tag ends with "-cvs" or "-dev", you're looking at a + development snapshot that came after a given release. If we *do* + encounter two versions that differ only by status tag, we compare them + lexically. The STATUS_TAG can't contain whitespace. + + The EXTRA_INFO is also purely informational, often containing information + about the SCM commit this version came from. It is surrounded by parentheses + and can't contain whitespace. Unlike the STATUS_TAG this never impacts the way + that versions should be compared. 
EXTRA_INFO may appear any number of + times. Tools should generally not parse EXTRA_INFO entries. + + Now, we start each development branch with (say) 0.1.1.1-alpha. The + patchlevel increments consistently as the status tag changes, for + example, as in: 0.1.1.2-alpha, 0.1.1.3-alpha, 0.1.1.4-rc, 0.1.1.5-rc. + Eventually, we release 0.1.1.6. The next patch release is 0.1.1.7. + + Between these releases, CVS is versioned with a -cvs tag: after + 0.1.1.1-alpha comes 0.1.1.1-alpha-cvs, and so on. But starting with + 0.1.2.1-alpha-dev, we switched to SVN and started using the "-dev" + suffix instead of the "-cvs" suffix. + +3. Version status. + + Sometimes we need to determine whether a Tor version is obsolete, + experimental, or neither, based on a list of recommended versions. The + logic is as follows: + + * If a version is listed on the recommended list, then it is + "recommended". + + * If a version is newer than every recommended version, that version + is "experimental" or "new". + + * If a version is older than every recommended version, it is + "obsolete" or "old". + + * The first three components (major,minor,micro) of a version number + are its "release series". If a version has other recommended + versions with the same release series, and the version is newer + than all such recommended versions, but it is not newer than + _every_ recommended version, then the version is "new in series". + + * Finally, if none of the above conditions hold, then the version is + "un-recommended." diff --git a/bandwidth-file-spec.txt b/bandwidth-file-spec.txt deleted file mode 100644 index bad13f6..0000000 --- a/bandwidth-file-spec.txt +++ /dev/null @@ -1,1315 +0,0 @@ - - Tor Bandwidth File Format - juga - teor - -Table of Contents - - 1. Scope and preliminaries - 1.2. Acknowledgements - 1.3. Outline - 1.4. Format Versions - 2. Format details - 2.1. Definitions - 2.2. Header List format - 2.3. Relay Line format - 2.4. Implementation details - 2.4.1. 
Writing bandwidth files atomically - 2.4.2. Additional KeyValue pair definitions - 2.4.2.1. Simple Bandwidth Scanner - 2.4.2.2. Torflow - A. Sample data - A.1. Generated by Torflow - A.2. Generated by sbws version 0.1.0 - A.3. Generated by sbws version 1.0.3 - A.4. Headers generated by sbws version 1.0.4 - A.5. Generated by sbws version 1.1.0 - B. Scaling bandwidths - B.1. Scaling requirements - B.2. A linear scaling method - B.3. Quota changes - B.4. Torflow aggregation - -1. Scope and preliminaries - - This document describes the format of Tor's Bandwidth File, version - 1.0.0 and later. - - It is a new specification for the existing bandwidth file format, - which we call version 1.0.0. It also specifies new format versions - 1.1.0 and later, which are backwards compatible with 1.0.0 parsers. - - Since Tor version 0.2.4.12-alpha, the directory authorities use - the Bandwidth File called "V3BandwidthsFile" generated by - Torflow [1]. The details of this format are described in Torflow's - README.spec.txt. We also summarise the format in this specification. - - The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL - NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and - "OPTIONAL" in this document are to be interpreted as described in - RFC 2119. - -1.2. Acknowledgements - - The original bandwidth generator (Torflow) and format were - created by mike. Teor suggested writing this specification while - contributing to pastly's new bandwidth generator implementation. - - This specification was revised after feedback from: - - Nick Mathewson (nickm) - Iain Learmonth (irl) - -1.3. Outline - - The Tor directory protocol (dir-spec.txt [3]) sections 3.4.1 - and 3.4.2 use the term "bandwidth measurements" to refer to what - here is called Bandwidth File. - - A Bandwidth File contains information on relays' bandwidth - capacities and is produced by bandwidth generators, previously known - as bandwidth scanners. - -1.4.
Format Versions - - 1.0.0 - The legacy Bandwidth File format - - 1.1.0 - Adds a header containing information about the bandwidth - file. Document the sbws and Torflow relay line keys. - - 1.2.0 - If there are not enough eligible relays, the bandwidth file - SHOULD contain a header, but no relays. (To match Torflow's - existing behaviour.) - - Adds scanner and destination countries to the header. - Adds new KeyValue Lines to the Header List section with - statistics about the number of relays included in the file. - Adds new KeyValues to Relay Bandwidth Lines, with different - bandwidth values (averages and descriptor bandwidths). - - 1.4.0 - Adds monitoring KeyValues to the header and relay lines. - - RelayLines for excluded relays MAY be present in the bandwidth - file for diagnostic reasons. Similarly, if there are not enough - eligible relays, the bandwidth file MAY contain all known relays. - - Diagnostic relay lines SHOULD be marked with vote=0, and - Tor SHOULD NOT use their bandwidths in its votes. - - Also adds Tor version. - 1.5.0 - Removes "recent_measurement_attempt_count" KeyValue. - 1.6.0 - Adds congestion control stream events KeyValues. - 1.7.0 - Adds ratios KeyValues to the relay lines and network averages - KeyValues to the header. - - All Tor versions can consume format version 1.0.0. - - All Tor versions can consume format version 1.1.0 and later, - but Tor versions earlier than 0.3.5.1-alpha warn if the header - contains any KeyValue lines after the Timestamp. - - Tor versions 0.4.0.3-alpha, 0.3.5.8, 0.3.4.11, and earlier do not - understand "vote=0". Instead, they will vote for the actual bandwidths - that sbws puts in diagnostic relay lines: - * 1 for relays with "unmeasured=1", and - * the relay's measured and scaled bandwidth when "under_min_report=1". - -2. 
Format details - - The Bandwidth File MUST contain the following sections: - - Header List (exactly once), which is a partially ordered list of - - Header Lines (one or more times), then - - Relay Lines (zero or more times), in an arbitrary order. - If it does not contain these sections, parsers SHOULD ignore the file. - -2.1. Definitions - - The following nonterminals are defined in Tor directory protocol - sections 1.2., 2.1.1., 2.1.3.: - - bool - Int - SP (space) - NL (newline) - KeywordChar - ArgumentChar - nickname - hexdigest (a '$', followed by 40 hexadecimal characters - ([A-Fa-f0-9])) - - Nonterminal defined section 2 of version-spec.txt [4]: - - version_number - - We define the following nonterminals: - - Line ::= ArgumentChar* NL - RelayLine ::= KeyValue (SP KeyValue)* NL - HeaderLine ::= KeyValue NL - KeyValue ::= Key "=" Value - Key ::= (KeywordChar | "_")+ - Value ::= ArgumentCharValue+ - ArgumentCharValue ::= any printing ASCII character except NL and SP. - Terminator ::= "=====" or "====" - Generators SHOULD use a 5-character terminator. - Timestamp ::= Int - Bandwidth ::= Int - MasterKey ::= a base64-encoded Ed25519 public key, with - padding characters omitted. - DateTime ::= "YYYY-MM-DDTHH:MM:SS", as in ISO 8601 - CountryCode ::= Two capital ASCII letters ([A-Z]{2}), as defined in - ISO 3166-1 alpha-2 plus "ZZ" to denote unknown country - (eg the destination is in a Content Delivery Network). - CountryCodeList ::= One or more CountryCode(s) separated by a comma - ([A-Z]{2}(,[A-Z]{2})*). - - Note that key_value and value are defined in Tor directory protocol - with different formats to KeyValue and Value here. - - Tor versions earlier than 0.3.5.1-alpha require all lines in the file - to be 510 characters or less. The previous limit was 254 characters in - Tor 0.2.6.2-alpha and earlier. Parsers MAY ignore longer Lines. 
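As a rough illustration of the grammar above, the following Python sketch splits one line into KeyValue pairs and recognises the Terminator. The names `parse_key_values` and `is_terminator`, and the regular expressions, are assumptions made for this example; they are not taken from Tor or sbws. KeywordChar is assumed here to be ASCII letters, digits, and "-".

```python
import re

# Key ::= (KeywordChar | "_")+ ; KeywordChar is assumed to be [A-Za-z0-9-].
KEY_RE = re.compile(r"[A-Za-z0-9_-]+")
# Value ::= one or more printing ASCII characters other than SP and NL.
VALUE_RE = re.compile(r"[\x21-\x7e]+")
# Terminator ::= "=====" or "====" (generators SHOULD emit five characters).
TERMINATORS = ("=====", "====")


def is_terminator(line):
    """Return True if the line is a 5- or 4-character Terminator."""
    return line.rstrip("\n") in TERMINATORS


def parse_key_values(line):
    """Split a HeaderLine or RelayLine into a list of (key, value) tuples.

    Returns None if any space-separated item does not match Key "=" Value,
    which is also the case for the initial Timestamp line.
    """
    pairs = []
    for item in line.rstrip("\n").split(" "):
        # Split on the first "=" only, so a Value may itself contain "=".
        key, sep, value = item.partition("=")
        if not sep or not KEY_RE.fullmatch(key) or not VALUE_RE.fullmatch(value):
            return None
        pairs.append((key, value))
    return pairs
```

A caller would typically read the Timestamp line separately, then feed each following line to `is_terminator` and `parse_key_values` in turn.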
- - Note that directory authorities are only supported on the two most - recent stable Tor versions, so we expect that line limits will be - removed after Tor 0.4.0 is released in 2019. - -2.2. Header List format - - It consists of a Timestamp line and zero or more HeaderLines. - - All the header lines MUST conform to the HeaderLine format, except - the first Timestamp line. - - The Timestamp line is not a HeaderLine to keep compatibility with - the legacy Bandwidth File format. - - Some header Lines MUST appear in specific positions, as documented - below. All other Lines can appear in any order. - - If a parser does not recognize any extra material in a header Line, - the Line MUST be ignored. - - If a header Line does not conform to this format, the Line SHOULD be - ignored by parsers. - - It consists of: - - Timestamp NL - - [At start, exactly once.] - - The Unix Epoch time in seconds of the most recent generator bandwidth - result. - - If the generator implementation has multiple threads or - subprocesses which can fail independently, it SHOULD take the most - recent timestamp from each thread and use the oldest value. This - ensures all the threads continue running. - - If there are threads that do not run continuously, they SHOULD be - excluded from the timestamp calculation. - - If there are no recent results, the generator MUST NOT generate a new - file. - - It does not follow the KeyValue format for backwards compatibility - with version 1.0.0. - - "version" version_number NL - - [In second position, zero or one time.] - - The specification document format version. - It uses semantic versioning [5]. - - This Line was added in version 1.1.0 of this specification. - - Version 1.0.0 documents do not contain this Line, and the - version_number is considered to be "1.0.0". - - "software" Value NL - - [Zero or one time.] - - The name of the software that created the document. - - This Line was added in version 1.1.0 of this specification. 
- - Version 1.0.0 documents do not contain this Line, and the software - is considered to be "torflow". - - "software_version" Value NL - - [Zero or one time.] - - The version of the software that created the document. - The version may be a version_number, a git commit, or some other - version scheme. - - This Line was added in version 1.1.0 of this specification. - - "file_created" DateTime NL - - [Zero or one time.] - - The date and time timestamp in ISO 8601 format and UTC time zone - when the file was created. - - This Line was added in version 1.1.0 of this specification. - - "generator_started" DateTime NL - - [Zero or one time.] - - The date and time timestamp in ISO 8601 format and UTC time zone - when the generator started. - - This Line was added in version 1.1.0 of this specification. - - "earliest_bandwidth" DateTime NL - - [Zero or one time.] - - The date and time timestamp in ISO 8601 format and UTC time zone - when the first relay bandwidth was obtained. - - This Line was added in version 1.1.0 of this specification. - - "latest_bandwidth" DateTime NL - - [Zero or one time.] - - The date and time timestamp in ISO 8601 format and UTC time zone - of the most recent generator bandwidth result. - - This time MUST be identical to the initial Timestamp line. - - This duplicate value is included to make the format easier for people - to read. - - This Line was added in version 1.1.0 of this specification. - - "number_eligible_relays" Int NL - - [Zero or one time.] - - The number of relays that have enough measurements to be - included in the bandwidth file. - - This Line was added in version 1.2.0 of this specification. - - "minimum_percent_eligible_relays" Int NL - - [Zero or one time.] - - The percentage of relays in the consensus that SHOULD be - included in every generated bandwidth file. - - If this threshold is not reached, format versions 1.3.0 and earlier - SHOULD NOT contain any relays. (Bandwidth files always include a - header.) 
- - Format versions 1.4.0 and later SHOULD include all the relays for - diagnostic purposes, even if this threshold is not reached. But these - relays SHOULD be marked so that Tor does not vote on them. - See section 1.4 for details. - - The minimum percentage is 60% in Torflow, so sbws uses - 60% as the default. - - This Line was added in version 1.2.0 of this specification. - - "number_consensus_relays" Int NL - - [Zero or one time.] - - The number of relays in the consensus. - - This Line was added in version 1.2.0 of this specification. - - "percent_eligible_relays" Int NL - - [Zero or one time.] - - The number of eligible relays, as a percentage of the number - of relays in the consensus. - - This Line SHOULD be equal to: - (number_eligible_relays * 100.0) / number_consensus_relays - - This Line was added in version 1.2.0 of this specification. - - "minimum_number_eligible_relays" Int NL - - [Zero or one time.] - - The minimum number of relays that SHOULD be included in the bandwidth - file. See minimum_percent_eligible_relays for details. - - This Line SHOULD be equal to: - number_consensus_relays * (minimum_percent_eligible_relays / 100.0) - - This Line was added in version 1.2.0 of this specification. - - "scanner_country" CountryCode NL - - [Zero or one time.] - - The country, as in political geolocation, where the generator is run. - - This Line was added in version 1.2.0 of this specification. - - "destinations_countries" CountryCodeList NL - - [Zero or one time.] - - The country, as in political geolocation, or countries where the - destination Web server(s) are located. - The destination Web servers serve the data that the generator retrieves - to measure the bandwidth. - - This Line was added in version 1.2.0 of this specification. - - "recent_consensus_count" Int NL - - [Zero or one time.] - - The number of different consensuses seen in the last data_period - days.
(data_period is 5 by default.) - - Assuming that Tor clients fetch a consensus every 1-2 hours, - and that the data_period is 5 days, the Value of this Key SHOULD be - between: - data_period * 24 / 2 = 60 - data_period * 24 = 120 - - This Line was added in version 1.4.0 of this specification. - - "recent_priority_list_count" Int NL - - [Zero or one time.] - - The number of times that a list with a subset of relays prioritized - to be measured has been created in the last data_period days. - (data_period is 5 by default.) - - In 2019, with 7000 relays in the network, the Value of this Key SHOULD be - approximately: - data_period * 24 / 1.5 = 80 - where 1.5 is the approximate number of hours it takes to measure a - priority list of 7000 * 0.05 (350) relays, when the fraction of relays - in a priority list is 5% (0.05). - - This Line was added in version 1.4.0 of this specification. - - "recent_priority_relay_count" Int NL - - [Zero or one time.] - - The number of relays that have been in the list of relays prioritized - to be measured in the last data_period days. (data_period is 5 by - default.) - - In 2019, with 7000 relays in the network, the Value of this Key SHOULD be - approximately: - 80 * (7000 * 0.05) = 28000 - where 0.05 (5%) is the fraction of relays in a priority list and 80 - is the approximate number of priority lists (see - "recent_priority_list_count"). - - This Line was added in version 1.4.0 of this specification. - - "recent_measurement_attempt_count" Int NL - - [Zero or one time.] - - The number of times that any relay has been queued to be measured - in the last data_period days. (data_period is 5 by default.) - - In 2019, with 7000 relays in the network, the Value of this Key SHOULD be - approximately the same as "recent_priority_relay_count", - assuming that there is one attempt to measure a relay for each relay that - has been prioritized unless there are system, network or implementation - issues.
- - This Line was added in version 1.4.0 of this specification and removed - in version 1.5.0. - - "recent_measurement_failure_count" Int NL - - [Zero or one time.] - - The number of times that the scanner attempted to measure a relay in - the last data_period days (5 by default), but the relay has not been - measured because of system, network or implementation issues. - - This Line was added in version 1.4.0 of this specification. - - "recent_measurements_excluded_error_count" Int NL - - [Zero or one time.] - - The number of relays that have no successful measurements in the last - data_period days (5 by default). - - (See the note in section 1.4, version 1.4.0, about excluded relays.) - - This Line was added in version 1.4.0 of this specification. - - "recent_measurements_excluded_near_count" Int NL - - [Zero or one time.] - - The number of relays that have some successful measurements in the last - data_period days (5 by default), but all those measurements were - performed in a period of time that was too short (by default 1 day). - - (See the note in section 1.4, version 1.4.0, about excluded relays.) - - This Line was added in version 1.4.0 of this specification. - - "recent_measurements_excluded_old_count" Int NL - - [Zero or one time.] - - The number of relays that have some successful measurements, but all - those measurements are too old (more than 5 days, by default). - - Excludes relays that are already counted in - recent_measurements_excluded_near_count. - - (See the note in section 1.4, version 1.4.0, about excluded relays.) - - This Line was added in version 1.4.0 of this specification. - - "recent_measurements_excluded_few_count" Int NL - - [Zero or one time.] - - The number of relays that don't have enough recent successful - measurements. (Fewer than 2 measurements in the last 5 days, by - default). - - Excludes relays that are already counted in - recent_measurements_excluded_near_count and - recent_measurements_excluded_old_count. 
- - (See the note in section 1.4, version 1.4.0, about excluded relays.) - - This Line was added in version 1.4.0 of this specification. - - "time_to_report_half_network" Int NL - - [Zero or one time.] - - The time in seconds that it would take to report measurements about the - half of the network, given the number of eligible relays and the time - it took in the last days (5 days, by default). - - (See the note in section 1.4, version 1.4.0, about excluded relays.) - - This Line was added in version 1.4.0 of this specification. - - "tor_version" version_number NL - - [Zero or one time.] - - The Tor version of the Tor process controlled by the generator. - - This Line was added in version 1.4.0 of this specification. - - "mu" Int NL - - [Zero or one time.] - - The network stream bandwidth average calculated as explained in B4.2. - - This Line was added in version 1.7.0 of this specification. - - "muf" Int NL - - [Zero or one time.] - - The network stream bandwidth average filtered calculated as explained in - B4.2. - - This Line was added in version 1.7.0 of this specification. - - KeyValue NL - - [Zero or more times.] - - There MUST NOT be multiple KeyValue header Lines with the same key. - If there are, the parser SHOULD choose an arbitrary Line. - - If a parser does not recognize a Keyword in a KeyValue Line, it - MUST be ignored. - - Future format versions may include additional KeyValue header Lines. - Additional header Lines will be accompanied by a minor version - increment. - - Implementations MAY add additional header Lines as needed. This - specification SHOULD be updated to avoid conflicting meanings for - the same header keys. - - Parsers MUST NOT rely on the order of these additional Lines. - - Additional header Lines MUST NOT use any keywords specified in the - relay measurements format. - If there are, the parser MAY ignore conflicting keywords. - - Terminator NL - - [Zero or one time.] - - The Header List section ends with a Terminator. 
- - In version 1.0.0, the Header List ends when the first relay bandwidth - line conforming to the next section is found. - - Implementations of version 1.1.0 and later SHOULD use a 5-character - terminator. - - Tor 0.4.0.1-alpha and later look for a 5-character terminator, - or the first relay bandwidth line. sbws versions 0.1.0 to 1.0.2 - used a 4-character terminator; this bug was fixed in 1.0.3. - -2.3. Relay Line format - - It consists of zero or more RelayLines containing relay ids and - bandwidths. The relays and their KeyValues are in arbitrary order. - - There MUST NOT be multiple KeyValue pairs with the same key in the same - RelayLine. If there are, the parser SHOULD choose an arbitrary Value. - - There MUST NOT be multiple RelayLines per relay identity (node_id or - master_key_ed25519). If there are, parsers SHOULD issue a warning. - Parsers MAY reject the file, choose an arbitrary RelayLine, or ignore - both RelayLines. - - If a parser does not recognize any extra material in a RelayLine, - the extra material MUST be ignored. - - Each RelayLine includes the following KeyValue pairs: - - "node_id" hexdigest - - [Exactly once.] - - The fingerprint for the relay's RSA identity key. - - Note: In bandwidth files read by Tor versions earlier than - 0.3.4.1-alpha, node_id MUST NOT be at the end of the Line. - These authority versions are no longer supported. - - Current Tor versions ignore master_key_ed25519, so node_id MUST be - present in each relay Line. - - Implementations of version 1.1.0 and later SHOULD include both node_id - and master_key_ed25519. Parsers SHOULD accept Lines that contain at - least one of them. - - "master_key_ed25519" MasterKey - - [Zero or one time.] - - The relay's master Ed25519 key, base64 encoded, - without trailing "="s, to avoid ambiguity with the KeyValue "=" - character. - - This KeyValue pair SHOULD be present; see the note under node_id. - - This KeyValue was added in version 1.1.0 of this specification.
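The RelayLine rules above (repeated keys, repeated relay identities, and unrecognised material) could be handled roughly as in the following Python sketch. The function names and the particular choices made here (keep the first Value for a repeated key, keep the first RelayLine for a repeated node_id, and report later ones so the caller can warn) are assumptions for illustration; the text above allows other choices, and this is not how Tor itself is implemented.

```python
import re

# hexdigest: a '$' followed by 40 hexadecimal characters.
HEXDIGEST_RE = re.compile(r"\$[A-Fa-f0-9]{40}")


def relay_line_to_dict(line):
    """Turn one RelayLine into a dict of key -> value.

    Items that are not Key "=" Value are skipped as extra material.
    For a repeated key, the first Value is kept (an arbitrary choice).
    """
    relay = {}
    for item in line.rstrip("\n").split(" "):
        key, sep, value = item.partition("=")
        if sep and key and value:
            relay.setdefault(key, value)
    return relay


def collect_relays(lines):
    """Index RelayLines by node_id.

    Returns (relays, duplicates): relays maps node_id to the first
    RelayLine seen for it, and duplicates lists each node_id that
    appeared again, so that the caller can issue a warning.
    """
    relays = {}
    duplicates = []
    for line in lines:
        relay = relay_line_to_dict(line)
        node_id = relay.get("node_id", "")
        if not HEXDIGEST_RE.fullmatch(node_id):
            # This sketch keys on node_id only, so such lines are skipped.
            continue
        if node_id in relays:
            duplicates.append(node_id)
            continue
        relays[node_id] = relay
    return relays, duplicates
```

Keying on node_id alone follows the note that current Tor versions ignore master_key_ed25519; a parser that also accepts Lines identified only by master_key_ed25519 would need a second index.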
- - "bw" Bandwidth - - [Exactly once.] - - The bandwidth of this relay in kilobytes per second. - - No Zero Bandwidths: - Tor accepts zero bandwidths, but they trigger bugs in older Tor - implementations. Therefore, implementations SHOULD NOT produce zero - bandwidths. Instead, they SHOULD use one as their minimum bandwidth. - If there are zero bandwidths, the parser MAY ignore them. - - Bandwidth Aggregation: - Multiple measurements can be aggregated using an averaging scheme, - such as a mean, median, or decaying average. - - Bandwidth Scaling: - Torflow scales bandwidths to kilobytes per second. Other - implementations SHOULD use kilobytes per second for their initial - bandwidth scaling. - - If different implementations or configurations are used in votes for - the same network, their measurements MAY need further scaling. See - Appendix B for information about scaling, and one possible scaling - method. - - MaxAdvertisedBandwidth: - Bandwidth generators MUST limit the relays' measured bandwidth based - on the MaxAdvertisedBandwidth. - A relay's MaxAdvertisedBandwidth limits the bandwidth-avg in its - descriptor. bandwidth-avg is the minimum of MaxAdvertisedBandwidth, - BandwidthRate, RelayBandwidthRate, BandwidthBurst, and - RelayBandwidthBurst. - Therefore, generators MUST limit a relay's measured bandwidth to its - descriptor's bandwidth-avg. This limit needs to be implemented in the - generator, because generators may scale consensus weights before - sending them to Tor. - Generators SHOULD NOT limit measured bandwidths based on descriptors' - bandwidth-observed, because that penalises new relays. - - sbws limits the relay's measured bandwidth to the bandwidth-avg - advertised. - - Torflow partitions relays based on their bandwidth. For unmeasured - relays, Torflow uses the minimum of all descriptor bandwidths, - including bandwidth-avg (MaxAdvertisedBandwidth) and - bandwidth-observed.
Then Torflow measures the relays in each partition - against each other, which implicitly limits a relay's measured - bandwidth to the bandwidths of similar relays. - - Torflow also generates consensus weights based on the ratio between the - measured bandwidth and the minimum of all descriptor bandwidths (at the - time of the measurement). So when an operator reduces the - MaxAdvertisedBandwidth for a relay, Torflow reduces that relay's - measured bandwidth. - - KeyValue - - [Zero or more times.] - - Future format versions may include additional KeyValue pairs on a - RelayLine. - Additional KeyValue pairs will be accompanied by a minor version - increment. - - Implementations MAY add additional relay KeyValue pairs as needed. - This specification SHOULD be updated to avoid conflicting meanings - for the same Keywords. - - Parsers MUST NOT rely on the order of these additional KeyValue - pairs. - - Additional KeyValue pairs MUST NOT use any keywords specified in the - header format. - If there are, the parser MAY ignore conflicting keywords. - -2.4. Implementation details - -2.4.1. Writing bandwidth files atomically - - To avoid inconsistent reads, implementations SHOULD write bandwidth files - atomically. If the file is transferred from another host, it SHOULD be - written to a temporary path, then renamed to the V3BandwidthsFile path. - - sbws versions 0.7.0 and later write the bandwidth file to an archival - location, create a temporary symlink to that location, then atomically rename - the symlink - to the configured V3BandwidthsFile path. - - Torflow does not write bandwidth files atomically. - -2.4.2. Additional KeyValue pair definitions - - KeyValue pairs in RelayLines that current implementations generate. - -2.4.2.1. Simple Bandwidth Scanner - - sbws RelayLines contain these keys: - - "node_id" hexdigest - - As above. - - "bw" Bandwidth - - As above. - - "nick" nickname - - [Exactly once.] - - The relay nickname. - - Torflow also has a "nick" KeyValue. 
- - "rtt" Int - - [Zero or one time.] - - The Round Trip Time in milliseconds to obtain 1 byte of data. - - This KeyValue was added in version 1.1.0 of this specification. - It became optional in version 1.3.0 or 1.4.0 of this specification. - - "time" DateTime - - [Exactly once.] - - The date and time timestamp in ISO 8601 format and UTC time zone - when the last bandwidth was obtained. - - This KeyValue was added in version 1.1.0 of this specification. - The Torflow equivalent is "measured_at". - - "success" Int - - [Zero or one time.] - - The number of times that the bandwidth measurements for this relay were - successful. - - This KeyValue was added in version 1.1.0 of this specification. - - "error_circ" Int - - [Zero or one time.] - - The number of times that the bandwidth measurements for this relay - failed because of circuit failures. - - This KeyValue was added in version 1.1.0 of this specification. - The Torflow equivalent is "circ_fail". - - "error_stream" Int - - [Zero or one time.] - - The number of times that the bandwidth measurements for this relay - failed because of stream failures. - - This KeyValue was added in version 1.1.0 of this specification. - - "error_destination" Int - - [Zero or one time.] - - The number of times that the bandwidth measurements for this relay - failed because the destination Web server was not available. - - This KeyValue was added in version 1.4.0 of this specification. - - "error_second_relay" Int - - [Zero or one time.] - - The number of times that the bandwidth measurements for this relay - failed because sbws could not find a second relay for the test circuit. - - This KeyValue was added in version 1.4.0 of this specification. - - "error_misc" Int - - [Zero or one time.] - - The number of times that the bandwidth measurements for this relay - failed because of other reasons. - - This KeyValue was added in version 1.1.0 of this specification. - - "bw_mean" Int - - [Zero or one time.] 
- - The measured bandwidth mean for this relay in bytes per second. - - This KeyValue was added in version 1.2.0 of this specification. - - "bw_median" Int - - [Zero or one time.] - - The measured bandwidth median for this relay in bytes per second. - - This KeyValue was added in version 1.2.0 of this specification. - - "desc_bw_avg" Int - - [Zero or one time.] - - The descriptor average bandwidth for this relay in bytes per second. - - This KeyValue was added in version 1.2.0 of this specification. - - "desc_bw_obs_last" Int - - [Zero or one time.] - - The last descriptor observed bandwidth for this relay in bytes per - second. - - This KeyValue was added in version 1.2.0 of this specification. - - "desc_bw_obs_mean" Int - - [Zero or one time.] - - The descriptor observed bandwidth mean for this relay in bytes per - second. - - This KeyValue was added in version 1.2.0 of this specification. - - "desc_bw_bur" Int - - [Zero or one time.] - - The descriptor burst bandwidth for this relay in bytes per - second. - - This KeyValue was added in version 1.2.0 of this specification. - - "consensus_bandwidth" Int - - [Zero or one time.] - - The consensus bandwidth for this relay in bytes per second. - - This KeyValue was added in version 1.2.0 of this specification. - - "consensus_bandwidth_is_unmeasured" Bool - - [Zero or one time.] - - If the consensus bandwidth for this relay was not obtained from - three or more bandwidth authorities, this KeyValue is True or - False otherwise. - - This KeyValue was added in version 1.2.0 of this specification. - - "relay_in_recent_consensus_count" Int - - [Zero or one time.] - - The number of times this relay was found in a consensus in the - last data_period days. (Unless otherwise stated, data_period is - 5 by default.) - - This KeyValue was added in version 1.4.0 of this specification. - - "relay_recent_priority_list_count" Int - - [Zero or one time.] 
- - The number of times this relay has been prioritized to be measured - in the last data_period days. - - This KeyValue was added in version 1.4.0 of this specification. - - "relay_recent_measurement_attempt_count" Int - - [Zero or one time.] - - The number of times this relay was tried to be measured in the - last data_period days. - - This KeyValue was added in version 1.4.0 of this specification. - - "relay_recent_measurement_failure_count" Int - - [Zero or one time.] - - The number of times this relay was tried to be measured in the - last data_period days, but it was not possible to obtain a - measurement. - - This KeyValue was added in version 1.4.0 of this specification. - - "relay_recent_measurements_excluded_error_count" Int - - [Zero or one time.] - - The number of recent relay measurement attempts that failed. - Measurements are recent if they are in the last data_period days - (5 by default). - - (See the note in section 1.4, version 1.4.0, about excluded relays.) - - This KeyValue was added in version 1.4.0 of this specification. - - "relay_recent_measurements_excluded_near_count" Int - - [Zero or one time.] - - When all of a relay's recent successful measurements were performed in - a period of time that was too short (by default 1 day), the relay is - excluded. This KeyValue contains the number of recent successful - measurements for the relay that were ignored for this reason. - - (See the note in section 1.4, version 1.4.0, about excluded relays.) - - This KeyValue was added in version 1.4.0 of this specification. - - "relay_recent_measurements_excluded_old_count" Int - - [Zero or one time.] - - The number of successful measurements for this relay that are too old - (more than data_period days, 5 by default). - - Excludes measurements that are already counted in - relay_recent_measurements_excluded_near_count. - - (See the note in section 1.4, version 1.4.0, about excluded relays.) 
- - This KeyValue was added in version 1.4.0 of this specification. - - "relay_recent_measurements_excluded_few_count" Int - - [Zero or one time.] - - The number of successful measurements for this relay that were ignored - because the relay did not have enough successful measurements (fewer - than 2, by default). - - Excludes measurements that are already counted in - relay_recent_measurements_excluded_near_count or - relay_recent_measurements_excluded_old_count. - - (See the note in section 1.4, version 1.4.0, about excluded relays.) - - This KeyValue was added in version 1.4.0 of this specification. - - "under_min_report" bool - - [Zero or one time.] - - If the value is 1, there are not enough eligible relays in the - bandwidth file, and Tor bandwidth authorities MAY NOT vote on this - relay. (Current Tor versions do not change their behaviour based on - the "under_min_report" key.) - - If the value is 0 or the KeyValue is not present, there are enough - relays in the bandwidth file. - - Because Tor versions released before April 2019 (see section 1.4. for - the full list of versions) ignore "vote=0", generator implementations - MUST NOT change the bandwidths for under_min_report relays. Using the - same bw value makes authorities that do not understand "vote=0" - or "under_min_report=1" produce votes that don't change relay weights - too much. It also avoids flapping when the reporting threshold is - reached. - - This KeyValue was added in version 1.4.0 of this specification. - - "unmeasured" bool - - [Zero or one time.] - - If the value is 1, this relay was not successfully measured and - Tor bandwidth authorities MAY NOT vote on this relay. - (Current Tor versions do not change their behaviour based on - the "unmeasured" key.) - - If the value is 0 or the KeyValue is not present, this relay - was successfully measured. - - Because Tor versions released before April 2019 (see section 1.4. 
for - the full list of versions) ignore "vote=0", generator implementations - MUST set "bw=1" for unmeasured relays. Using the minimum bw value - makes authorities that do not understand "vote=0" or "unmeasured=1" - produce votes that don't change relay weights too much. - - This KeyValue was added in version 1.4.0 of this specification. - - "vote" bool - - [Zero or one time.] - - If the value is 0, Tor directory authorities SHOULD ignore the relay's - entry in the bandwidth file. They SHOULD vote for the relay the same - way they would vote for a relay that is not present in the file. - - This MAY be the case when this relay was not successfully measured but - it is included in the Bandwidth File, to diagnose why they were not - measured. - - If the value is 1 or the KeyValue is not present, Tor directory - authorities MUST use the relay's bw value in any votes for that relay. - - Implementations MUST also set "bw=1" for unmeasured relays. - But they MUST NOT change the bw for under_min_report relays. - (See the explanations under "unmeasured" and "under_min_report" - for more details.) - - This KeyValue was added in version 1.4.0 of this specification. - - "xoff_recv" Int - - [Zero or one time.] - - The number of times this relay received `XOFF_RECV` stream events while - being measured in the last data_period days. - - This KeyValue was added in version 1.6.0 of this specification. - - "xoff_sent" Int - - [Zero or one time.] - - The number of times this relay received `XOFF_SENT` stream events while - being measured in the last data_period days. - - This KeyValue was added in version 1.6.0 of this specification. - - "r_strm" Float - - [Zero or one time.] - - The stream ratio of this relay calculated as explained in B4.3. - - This KeyValue was added in version 1.7.0 of this specification. - - "r_strm_filt" Float - - [Zero or one time.] - - The filtered stream ratio of this relay calculated as explained in B4.3. 
- - This KeyValue was added in version 1.7.0 of this specification. - - -2.4.2.2. Torflow - - Torflow RelayLines include node_id and bw, and other KeyValue pairs [2]. - -References: - -1. https://gitweb.torproject.org/torflow.git -2. https://gitweb.torproject.org/torflow.git/tree/NetworkScanners/BwAuthority/README.spec.txt#n332 - The Torflow specification is outdated, and does not match the current - implementation. See section A.1. for the format produced by Torflow. -3. https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt -4. https://gitweb.torproject.org/torspec.git/tree/version-spec.txt -5. https://semver.org/ - -A. Sample data - -The following has not been obtained from any real measurement. - -A.1. Generated by Torflow - -This is an example version 1.0.0 document: - -1523911758 -node_id=$68A483E05A2ABDCA6DA5A3EF8DB5177638A27F80 bw=760 nick=Test measured_at=1523911725 updated_at=1523911725 pid_error=4.11374090719 pid_error_sum=4.11374090719 pid_bw=57136645 pid_delta=2.12168374577 circ_fail=0.2 scanner=/filepath -node_id=$96C15995F30895689291F455587BD94CA427B6FC bw=189 nick=Test2 measured_at=1523911623 updated_at=1523911623 pid_error=3.96703337994 pid_error_sum=3.96703337994 pid_bw=47422125 pid_delta=2.65469736988 circ_fail=0.0 scanner=/filepath - -A.2. Generated by sbws version 0.1.0 - -1523911758 -version=1.1.0 -software=sbws -software_version=0.1.0 -latest_bandwidth=2018-04-16T20:49:18 -file_created=2018-04-16T21:49:18 -generator_started=2018-04-16T15:13:25 -earliest_bandwidth=2018-04-16T15:13:26 -==== -bw=380 error_circ=0 error_misc=0 error_stream=1 master_key_ed25519=YaqV4vbvPYKucElk297eVdNArDz9HtIwUoIeo0+cVIpQ nick=Test node_id=$68A483E05A2ABDCA6DA5A3EF8DB5177638A27F80 rtt=380 success=1 time=2018-05-08T16:13:26 -bw=189 error_circ=0 error_misc=0 error_stream=0 master_key_ed25519=a6a+dZadrQBtfSbmQkP7j2ardCmLnm5NJ4ZzkvDxbo0I nick=Test2 node_id=$96C15995F30895689291F455587BD94CA427B6FC rtt=378 success=1 time=2018-05-08T16:13:36 - -A.3.
Generated by sbws version 1.0.3 - -1523911758 -version=1.2.0 -latest_bandwidth=2018-04-16T20:49:18 -file_created=2018-04-16T21:49:18 -generator_started=2018-04-16T15:13:25 -earliest_bandwidth=2018-04-16T15:13:26 -minimum_number_eligible_relays=3862 -minimum_percent_eligible_relays=60 -number_consensus_relays=6436 -number_eligible_relays=6000 -percent_eligible_relays=93 -software=sbws -software_version=1.0.3 -===== -bw=38000 bw_mean=1127824 bw_median=1180062 desc_bw_avg=1073741824 desc_bw_obs_last=17230879 desc_bw_obs_mean=14732306 error_circ=0 error_misc=0 error_stream=1 master_key_ed25519=YaqV4vbvPYKucElk297eVdNArDz9HtIwUoIeo0+cVIpQ nick=Test node_id=$68A483E05A2ABDCA6DA5A3EF8DB5177638A27F80 rtt=380 success=1 time=2018-05-08T16:13:26 -bw=1 bw_mean=199162 bw_median=185675 desc_bw_avg=409600 desc_bw_obs_last=836165 desc_bw_obs_mean=858030 error_circ=0 error_misc=0 error_stream=0 master_key_ed25519=a6a+dZadrQBtfSbmQkP7j2ardCmLnm5NJ4ZzkvDxbo0I nick=Test2 node_id=$96C15995F30895689291F455587BD94CA427B6FC rtt=378 success=1 time=2018-05-08T16:13:36 - -A.3.1. When there are not enough eligible measured relays: - -1540496079 -version=1.2.0 -earliest_bandwidth=2018-10-20T19:35:52 -file_created=2018-10-25T19:35:03 -generator_started=2018-10-25T11:42:56 -latest_bandwidth=2018-10-25T19:34:39 -minimum_number_eligible_relays=3862 -minimum_percent_eligible_relays=60 -number_consensus_relays=6436 -number_eligible_relays=2960 -percent_eligible_relays=46 -software=sbws -software_version=1.0.3 -===== - -A.4. 
Headers generated by sbws version 1.0.4 - -1523911758 -version=1.2.0 -latest_bandwidth=2018-04-16T20:49:18 -destinations_countries=TH,ZZ -file_created=2018-04-16T21:49:18 -generator_started=2018-04-16T15:13:25 -earliest_bandwidth=2018-04-16T15:13:26 -minimum_number_eligible_relays=3862 -minimum_percent_eligible_relays=60 -number_consensus_relays=6436 -number_eligible_relays=6000 -percent_eligible_relays=93 -scanner_country=SN -software=sbws -software_version=1.0.4 -===== - -A.5 Generated by sbws version 1.1.0 - -1523911758 -version=1.4.0 -latest_bandwidth=2018-04-16T20:49:18 -destinations_countries=TH,ZZ -file_created=2018-04-16T21:49:18 -generator_started=2018-04-16T15:13:25 -earliest_bandwidth=2018-04-16T15:13:26 -minimum_number_eligible_relays=3862 -minimum_percent_eligible_relays=60 -number_consensus_relays=6436 -number_eligible_relays=6000 -percent_eligible_relays=93 -recent_measurement_attempt_count=6243 -recent_measurement_failure_count=732 -recent_measurements_excluded_error_count=969 -recent_measurements_excluded_few_count=3946 -recent_measurements_excluded_near_count=90 -recent_measurements_excluded_old_count=0 -recent_priority_list_count=20 -recent_priority_relay_count=6243 -scanner_country=SN -software=sbws -software_version=1.1.0 -time_to_report_half_network=57273 -===== -bw=1 error_circ=1 error_destination=0 error_misc=0 error_second_relay=0 error_stream=0 master_key_ed25519=J3HQ24kOQWac3L1xlFLp7gY91qkb5NuKxjj1BhDi+m8 nick=snap269 node_id=$DC4D609F95A52614D1E69C752168AF1FCAE0B05F relay_recent_measurement_attempt_count=3 relay_recent_measurements_excluded_error_count=1 relay_recent_measurements_excluded_near_count=3 relay_recent_consensus_count=3 relay_recent_priority_list_count=3 success=3 time=2019-03-16T18:20:57 unmeasured=1 vote=0 -bw=1 error_circ=0 error_destination=0 error_misc=0 error_second_relay=0 error_stream=2 master_key_ed25519=h6ZB1E1yBFWIMloUm9IWwjgaPXEpL5cUbuoQDgdSDKg nick=relay node_id=$C4544F9E209A9A9B99591D548B3E2822236C0503 
relay_recent_measurement_attempt_count=3 relay_recent_measurements_excluded_error_count=2 relay_recent_measurements_excluded_few_count=1 relay_recent_consensus_count=3 relay_recent_priority_list_count=3 success=1 time=2019-03-17T06:50:58 unmeasured=1 vote=0 - -B. Scaling bandwidths - -B.1. Scaling requirements - - Tor accepts zero bandwidths, but they trigger bugs in older Tor - implementations. Therefore, scaling methods SHOULD perform the - following checks: - * If the total bandwidth is zero, all relays should be given equal - bandwidths. - * If the scaled bandwidth is zero, it should be rounded up to one. - - Initial experiments indicate that scaling may not be needed for - torflow and sbws, because their measured bandwidths are similar - enough already. - -B.2. A linear scaling method - - If scaling is required, here is a simple linear bandwidth scaling - method, which ensures that all bandwidth votes contain approximately - the same total bandwidth: - - 1. Calculate the relay quota by dividing the total measured bandwidth - in all votes, by the number of relays with measured bandwidth - votes. In the public tor network, this is approximately 7500 as of - April 2018. The quota should be a consensus parameter, so it can be - adjusted for all generators on the network. - - 2. Calculate a vote quota by multiplying the relay quota by the number - of relays this bandwidth authority has measured - bandwidths for. - - 3. Calculate a scaling factor by dividing the vote quota by the - total unscaled measured bandwidth in this bandwidth - authority's upcoming vote. - - 4. Multiply each unscaled measured bandwidth by the scaling - factor. - - Now, the total scaled bandwidth in the upcoming vote is - approximately equal to the quota. - -B.3. Quota changes - - If all generators are using scaling, the quota can be gradually - reduced or increased as needed. 
Smaller quotas decrease the size - of uncompressed consensuses, and may decrease the size of - consensus diffs and compressed consensuses. But if the relay - quota is too small, some relays may be over- or under-weighted. - -B.4. Torflow aggregation - - Torflow implements two methods to compute the bandwidth values from the - (stream) bandwidth measurements: with and without PID control feedback. - The method described here is without PID control (see Torflow - specification, section 2.2). - - In the following sections, the relays' measured bandwidths refer to the - ones that this bandwidth authority has measured for the relays that - would be included in the next bandwidth authority's upcoming vote. - - 1. Calculate the filtered bandwidth for each relay: - - choose the relay's measurements (`bw_j`) that are equal to or greater - than the mean of the measurements for this relay - - calculate the mean of those measurements - - In pseudocode: - - bw_filt_i = mean(bw_j for each bw_j >= mean(bw_j)) - - 2. Calculate network averages: - - calculate the filtered average by dividing the sum of all the - relays' filtered bandwidths by the number of relays that have been - measured (`n`), i.e., calculate the mean of the relays' - filtered bandwidths. - - calculate the stream average by dividing the sum of all the - relays' measured bandwidths by the number of relays that have been - measured (`n`), i.e., calculate the mean of the relays' - measured bandwidths. - - In pseudocode: - - bw_avg_filt = sum(bw_filt_i) / n - bw_avg_strm = sum(bw_i) / n - - 3. Calculate ratios for each relay: - - calculate the filtered ratio by dividing each relay's filtered - bandwidth by the filtered average - - calculate the stream ratio by dividing each relay's measured - bandwidth by the stream average - - In pseudocode: - - r_filt_i = bw_filt_i / bw_avg_filt - r_strm_i = bw_i / bw_avg_strm - - 4.
Calculate the final ratio for each relay: - The final ratio is the larger between the filtered bandwidth's and the - stream bandwidth's ratio. - - In pseudocode: - - r_i = max(r_filt_i, r_strm_i) - - 5. Calculate the scaled bandwidth for each relay: - The most recent descriptor observed bandwidth (`bw_obs_i`) is - multiplied by the ratio - - In pseudocode: - - bw_new_i = r_i * bw_obs_i - - <> diff --git a/bridgedb-spec.txt b/bridgedb-spec.txt deleted file mode 100644 index 51f6e5d..0000000 --- a/bridgedb-spec.txt +++ /dev/null @@ -1,409 +0,0 @@ - - BridgeDB specification - - Karsten Loesing - Nick Mathewson - -Table of Contents - - 0. Preliminaries - 1. Importing bridge network statuses and bridge descriptors - 1.1. Parsing bridge network statuses - 1.2. Parsing bridge descriptors - 1.3. Parsing extra-info documents - 2. Assigning bridges to distributors - 3. Giving out bridges upon requests - 4. Selecting bridges to be given out based on IP addresses - 5. Selecting bridges to be given out based on email addresses - 6. Selecting unallocated bridges to be stored in file buckets - 7. Displaying Bridge Information - 8. Writing bridge assignments for statistics - -0. Preliminaries - - This document specifies how BridgeDB processes bridge descriptor files - to learn about new bridges, maintains persistent assignments of bridges - to distributors, and decides which bridges to give out upon user - requests. - - Some of the decisions here may be suboptimal: this document is meant to - specify current behavior as of August 2013, not to specify ideal - behavior. - -1. Importing bridge network statuses and bridge descriptors - - BridgeDB learns about bridges by parsing bridge network statuses, - bridge descriptors, and extra info documents as specified in Tor's - directory protocol. BridgeDB parses one bridge network status file - first and at least one bridge descriptor file and potentially one extra - info file afterwards. - - BridgeDB scans its files on sighup. 
- - BridgeDB does not validate signatures on descriptors or networkstatus - files: the operator needs to make sure that these documents have come - from a Tor instance that did the validation for us. - -1.1. Parsing bridge network statuses - - Bridge network status documents contain the information of which bridges - are known to the bridge authority and which flags the bridge authority - assigns to them. - We expect bridge network statuses to contain at least the following two - lines for every bridge in the given order (format fully specified in Tor's - directory protocol): - - "r" SP nickname SP identity SP digest SP publication SP IP SP ORPort - SP DirPort NL - "a" SP address ":" port NL (no more than 8 instances) - "s" SP Flags NL - - BridgeDB parses the identity and the publication timestamp from the "r" - line, the OR address(es) and ORPort(s) from the "a" line(s), and the - assigned flags from the "s" line, specifically checking the assignment - of the "Running" and "Stable" flags. - BridgeDB memorizes all bridges that have the Running flag as the set of - running bridges that can be given out to bridge users. - BridgeDB memorizes assigned flags if it wants to ensure that sets of - bridges given out should contain at least a given number of bridges - with these flags. - -1.2. Parsing bridge descriptors - - BridgeDB learns about a bridge's most recent IP address and OR port - from parsing bridge descriptors. - In theory, both IP address and OR port of a bridge are also contained - in the "r" line of the bridge network status, so there is no mandatory - reason for parsing bridge descriptors. But the functionality described - in this section is still implemented in case we need data from the - bridge descriptor in the future. - - Bridge descriptor files may contain one or more bridge descriptors. 
- We expect a bridge descriptor to contain at least the following lines in - the stated order: - - "@purpose" SP purpose NL - "router" SP nickname SP IP SP ORPort SP SOCKSPort SP DirPort NL - "published" SP timestamp - ["opt" SP] "fingerprint" SP fingerprint NL - "router-signature" NL Signature NL - - BridgeDB parses the purpose, IP, ORPort, nickname, and fingerprint - from these lines. - BridgeDB skips bridge descriptors if the fingerprint is not contained - in the bridge network status parsed earlier or if the bridge does not - have the Running flag. - BridgeDB discards bridge descriptors which have a different purpose - than "bridge". BridgeDB can be configured to only accept descriptors - with another purpose or not discard descriptors based on purpose at - all. - BridgeDB memorizes the IP addresses and OR ports of the remaining - bridges. - If there is more than one bridge descriptor with the same fingerprint, - BridgeDB memorizes the IP address and OR port of the most recently - parsed bridge descriptor. - If BridgeDB does not find a bridge descriptor for a bridge contained in - the bridge network status parsed before, it does not add that bridge - to the set of bridges to be given out to bridge users. - -1.3. Parsing extra-info documents - - BridgeDB learns if a bridge supports a pluggable transport by parsing - extra-info documents. - Extra-info documents contain the name of the bridge (but only if it is - named), the bridge's fingerprint, the type of pluggable transport(s) it - supports, and the IP address and port number on which each transport - listens, respectively. - - Extra-info documents may contain zero or more entries per bridge. We expect - an extra-info entry to contain the following lines in the stated order: - - "extra-info" SP name SP fingerprint NL - "transport" SP transport SP IP ":" PORT ARGS NL - - BridgeDB parses the fingerprint, transport type, IP address, port and any - arguments that are specified on these lines. 
BridgeDB skips the name. If - the fingerprint is invalid, BridgeDB skips the entry. BridgeDB memorizes - the transport type, IP address, port number, and any arguments that are - provided and then it assigns them to the corresponding bridge based on the - fingerprint. Arguments are comma-separated and are of the form k=v,k=v. - Bridges that do not have an associated extra-info entry are still valid. - -2. Assigning bridges to distributors - - A "distributor" is a mechanism by which bridges are given (or not - given) to clients. The current distributors are "email", "https", - and "unallocated". - - BridgeDB assigns bridges to distributors based on an HMAC hash of the - bridge's ID and a secret and makes these assignments persistent. - Persistence is achieved by using a database to map node ID to - distributor. - Each bridge is assigned to exactly one distributor (including - the "unallocated" distributor). - BridgeDB may be configured to support only a non-empty subset of the - distributors specified in this document. - BridgeDB may be configured to use different probabilities for assigning - new bridges to distributors. - BridgeDB does not change existing assignments of bridges to - distributors, even if probabilities for assigning bridges to - distributors change or distributors are disabled entirely. - -3. Giving out bridges upon requests - - Upon receiving a client request, a BridgeDB distributor provides a - subset of the bridges assigned to it. - BridgeDB only gives out bridges that are contained in the most recently - parsed bridge network status and that have the Running flag set (see - Section 1). - BridgeDB may be configured to give out a different number of bridges - (typically 4) depending on the distributor. - BridgeDB may define an arbitrary number of rules. These rules may - specify the criteria by which a bridge is selected.
Specifically, - the available rules restrict the IP address version, OR port number, - transport type, bridge relay flag, or country in which the bridge - should not be blocked. - -4. Selecting bridges to be given out based on IP addresses - - BridgeDB may be configured to support one or more distributors which - give out bridges based on the requestor's IP address. Currently, this - is how the HTTPS distributor works. - The goal is to avoid handing out all the bridges to users in a similar - IP space and time. -# Someone else should look at proposals/ideas/old/xxx-bridge-disbursement -# to see if this section is missing relevant pieces from it. -KL - - BridgeDB fixes the set of bridges to be returned for a defined time - period. - BridgeDB considers all IP addresses coming from the same /24 network - as the same IP address and returns the same set of bridges. From here on, - this non-unique address will be referred to as the IP address's 'area'. - BridgeDB divides the IP address space equally into a small number of -# Note, changed term from "areas" to "disjoint clusters" -MF - disjoint clusters (typically 4) and returns different results for requests - coming from addresses that are placed into different clusters. -# I found that BridgeDB is not strict in returning only bridges for a -# given area. If a ring is empty, it considers the next one. Is this -# expected behavior? -KL -# -# This does not appear to be the case, anymore. If a ring is empty, then -# BridgeDB simply returns an empty set of bridges. -MF -# -# I also found that BridgeDB does not make the assignment to areas -# persistent in the database. So, if we change the number of rings, it -# will assign bridges to other rings. I assume this is okay? -KL - BridgeDB maintains a list of proxy IP addresses and returns the same - set of bridges to requests coming from these IP addresses. - The bridges returned to proxy IP addresses do not come from the same - set as those for the general IP address space.
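
The /24 "area" and cluster idea above can be illustrated with a short Python sketch. This is not BridgeDB's code: the function names, the choice of HMAC-SHA1, the key, and the modulo reduction to a cluster index are all assumptions made for illustration, and only IPv4 is handled.

```python
import hashlib
import hmac
import ipaddress


def ip_area(addr: str) -> str:
    """Return the /24 network ("area") containing an IPv4 address."""
    ip = ipaddress.ip_address(addr)
    if ip.version != 4:
        # IPv6 handling is not covered by this sketch.
        raise ValueError("this sketch only handles IPv4 addresses")
    # strict=False masks off the host bits instead of raising an error.
    return str(ipaddress.ip_network(f"{ip}/24", strict=False))


def area_cluster(area: str, key: bytes, n_clusters: int = 4) -> int:
    """Map an area to one of n_clusters disjoint clusters.

    Assumption: HMAC-SHA1 keyed with a secret, reduced modulo the
    number of clusters. The real mapping may differ.
    """
    digest = hmac.new(key, area.encode("ascii"), hashlib.sha1).digest()
    return int.from_bytes(digest, "big") % n_clusters
```

Two addresses in the same /24 always yield the same area and therefore the same cluster, which is the property the text relies on when it says they receive the same set of bridges.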
- - BridgeDB can be configured to include bridge fingerprints in replies - along with bridge IP addresses and OR ports. - BridgeDB can be configured to display a CAPTCHA which the user must solve - prior to returning the requested bridges. - - The current algorithm is as follows. An IP-based distributor splits - the bridges uniformly into a set of "rings" based on an HMAC of their - ID. Some of these rings are "area" rings for parts of IP space; some - are "category" rings for categories of IPs (like proxies). When a - client makes a request from an IP, the distributor first sees whether - the IP is in one of the categories it knows. If so, the distributor - returns an IP from the category rings. If not, the distributor - maps the IP into an "area" (that is, a /24), and then uses an HMAC to - map the area to one of the area rings. - - When the IP-based distributor determines from which area ring it is handing - out bridges, it identifies which rules it will use to choose appropriate - bridges. Using this information, it searches its cache of rings for one - that already adheres to the criteria specified in this request. If one - exists, then BridgeDB maps the current "epoch" (N-hour period) and the - IP's area (/24) to a point on the ring based on HMAC, and hands out - bridges at that point. If a ring does not already exist which satisfies this - request, then a new ring is created and filled with bridges that fulfill - the requirements. This ring is then used to select bridges as described. - - "Mapping X to Y based on an HMAC" above means one of the following: - - - We keep all of the elements of Y in some order, with a mapping - from all 160-bit strings to positions in Y. - - We take an HMAC of X using some fixed string as a key to get a - 160-bit value. We then map that value to the next position of Y. - - When giving out bridges based on a position in a ring, BridgeDB first - looks at flag requirements and port requirements. 
For example, - BridgeDB may be configured to "Give out at least L bridges with port - 443, and at least M bridges with Stable, and at most N bridges - total." To do this, BridgeDB combines the following results: - - - The first L bridges in the ring after the position that have the - port 443, and - - The first M bridges in the ring after the position that have the - flag Stable and that it has not already decided to give out, and - - The first N-L-M bridges in the ring after the position that it - has not already decided to give out. - - After BridgeDB selects appropriate bridges to return to the requestor, it - then orders them in a list so that as many criteria - are fulfilled as possible within the first few bridges. This list is then - truncated to N bridges, if possible. N is currently defined as a - piecewise function of the number of bridges in the ring such that: - - / - | 1, if len(ring) < 20 - | - N = | 2, if 20 <= len(ring) <= 100 - | - | 3, if 100 <= len(ring) - \ - - The bridges in this sublist, containing no more than N bridges, are the - bridges returned to the requestor. - -5. Selecting bridges to be given out based on email addresses - - BridgeDB can be configured to support one or more distributors that - give out bridges based on the requestor's email address. Currently, - this is how the email distributor works. - The goal is to bootstrap based on one or more popular email services' - Sybil prevention algorithms. -# Someone else should look at proposals/ideas/old/xxx-bridge-disbursement -# to see if this section is missing relevant pieces from it. -KL - - BridgeDB rejects email addresses containing characters other than the - ones that RFC2822 allows. - BridgeDB may be configured to reject email addresses containing other - characters it might not process correctly. -# I don't think we do this, is it worthwhile? -MF - BridgeDB rejects email addresses coming from domains other than a - configured set of permitted domains.
- BridgeDB normalizes email addresses by removing "." characters and by - removing the part of the localpart from the first "+" character onward. - BridgeDB can be configured to discard requests that do not have the - value "pass" in their X-DKIM-Authentication-Result header or that do not - have this header. The X-DKIM-Authentication-Result header is set by - the incoming mail stack that needs to check DKIM authentication. - - BridgeDB does not return a new set of bridges to the same email address - until a given time period (typically a few hours) has passed. -# Why don't we fix the bridges we give out for a global 3-hour time period -# like we do for IP addresses? This way we could avoid storing email -# addresses. -KL -# The 3-hour value is probably much too short anyway. If we take longer -# time values, then people get new bridges when bridges show up, as -# opposed to when we decide to reset the bridges we give them. (Yes, this -# problem exists for the IP distributor). -NM -# I'm afraid I don't fully understand what you mean here. Can you -# elaborate? -KL -# -# Assuming an average churn rate, if we use short time periods, then a -# requestor will receive new bridges based on rate-limiting and will (likely) -# eventually work their way around the ring; eventually exhausting all bridges -# available to them from this distributor. If we use a longer time period, -# then each time the period expires there will be more bridges in the ring -# thus reducing the likelihood of all bridges being blocked and increasing -# the time and effort required to enumerate all bridges. (This is my -# understanding, not from Nick) -MF -# Also, we presently need the cache to prevent replays and because if a user -# sent multiple requests with different criteria in each then we would leak -# additional bridges otherwise. -MF - BridgeDB can be configured to include bridge fingerprints in replies - along with bridge IP addresses and OR ports.
- BridgeDB can be configured to sign all replies using a PGP signing key. - BridgeDB periodically discards old email-address-to-bridge mappings. - BridgeDB rejects email requests that arrive too frequently from the same - normalized address. - - To map previously unseen email addresses to a set of bridges, BridgeDB - proceeds as follows: - - - It normalizes the email address as above, by stripping out dots, - removing all of the localpart after the +, and putting it all - in lowercase. (Example: "John.Doe+bridges@example.COM" becomes - "johndoe@example.com".) - - It maps an HMAC of the normalized address to a position on its ring - of bridges. - - It hands out bridges starting at that position, based on the - port/flag requirements, as specified at the end of section 4. - - See section 4 for the details of how bridges are selected from the ring - and returned to the requestor. - -6. Selecting unallocated bridges to be stored in file buckets - -# Kaner should have a look at this section. -NM - - BridgeDB can be configured to reserve a subset of bridges and not give - them out via one of the distributors. - BridgeDB assigns reserved bridges to one or more file buckets of fixed - sizes and writes these file buckets to disk for manual distribution. - BridgeDB ensures that a file bucket always contains the requested - number of running bridges. - If the requested number of bridges in a file bucket is reduced or the - file bucket is not required anymore, the unassigned bridges are - returned to the reserved set of bridges. - If a bridge stops running, BridgeDB replaces it with another bridge - from the reserved set of bridges. -# I'm not sure if there's a design bug in file buckets. What happens if -# we add a bridge X to file bucket A, and X goes offline? We would add -# another bridge Y to file bucket A. OK, but what if X comes back? We -# cannot put it back in file bucket A, because it's full. Are we going to -# add it to a different file bucket?
Doesn't that mean that most bridges -# will be contained in most file buckets over time? -KL -# -# This should be handled the same as if the file bucket is reduced in size. -# If X returns, then it should be added to the appropriate distributor. -MF - -7. Displaying Bridge Information - - After bridges are selected using one of the methods described in - Sections 4 - 6, they are output in one of two formats. Bridges are - formatted as: - - NL - - Pluggable transports are formatted as: - - SP [SP arglist] NL - - where arglist is an optional space-separated list of key-value pairs in - the form of k=v. - - Previously, each line was prepended with the "bridge" keyword, such as - - "bridge" SP NL - - "bridge" SP SP [SP arglist] NL - -# We don't do this anymore because Vidalia and TorLauncher don't expect it. -# See the commit message for b70347a9c5fd769c6d5d0c0eb5171ace2999a736. - -8. Writing bridge assignments for statistics - - BridgeDB can be configured to write bridge assignments to disk for - statistical analysis. - The start of a bridge assignment is marked by the following line: - - "bridge-pool-assignment" SP YYYY-MM-DD HH:MM:SS NL - - YYYY-MM-DD HH:MM:SS is the time, in UTC, when BridgeDB has completed - loading new bridges and assigning them to distributors. - - For every running bridge there is a line with the following format: - - fingerprint SP distributor (SP key "=" value)* NL - - The distributor is one out of "email", "https", or "unallocated". - - Both "email" and "https" distributors support adding keys for "port", - "flag" and "transport". Respectively, the port number, flag name, and - transport types are the values. These are used to indicate that - a bridge matches certain port, flag, transport criteria of requests. - - The "https" distributor also allows the key "ring" with a number as - value to indicate to which IP address area the bridge is returned. 
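
A minimal Python sketch of parsing one assignment line in the `fingerprint SP distributor (SP key "=" value)*` format described above. The function name, the error handling, and the sample fingerprint are hypothetical; they are not taken from BridgeDB.

```python
from typing import Dict, Tuple

# Distributor names listed in this section.
DISTRIBUTORS = {"email", "https", "unallocated"}


def parse_assignment_line(line: str) -> Tuple[str, str, Dict[str, str]]:
    """Split an assignment line into (fingerprint, distributor, params)."""
    parts = line.rstrip("\n").split(" ")
    if len(parts) < 2:
        raise ValueError("expected at least a fingerprint and a distributor")
    fingerprint, distributor = parts[0], parts[1]
    if distributor not in DISTRIBUTORS:
        raise ValueError(f"unknown distributor: {distributor!r}")
    params: Dict[str, str] = {}
    for item in parts[2:]:
        # Each remaining field must look like key=value.
        key, sep, value = item.partition("=")
        if not sep or not key:
            raise ValueError(f"malformed key=value pair: {item!r}")
        params[key] = value
    return fingerprint, distributor, params
```

For example, a line naming the "https" distributor with `ring=3 port=443` yields a params dictionary with the keys "ring" and "port".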
- - The "unallocated" distributor allows the key "bucket" with the file - bucket name as value to indicate which file bucket a bridge is assigned - to. - diff --git a/cert-spec.txt b/cert-spec.txt deleted file mode 100644 index a70e100..0000000 --- a/cert-spec.txt +++ /dev/null @@ -1,198 +0,0 @@ - - Ed25519 certificates in Tor - -Table of Contents - - 1. Scope and Preliminaries - 1.1. Signing - 1.2. Integer encoding - 2. Document formats - 2.1. Ed25519 Certificates - 2.2. Basic extensions - 2.2.1. Signed-with-ed25519-key extension [type 04] - 2.3. RSA->Ed25519 cross-certificate - A.1. List of certificate types (CERT_TYPE field) - A.2. List of extension types - A.3. List of signature prefixes - A.4. List of certified key types (CERT_KEY_TYPE field) - -1. Scope and Preliminaries - - This document describes a certificate format that Tor uses for - its Ed25519 internal certificates. It is not the only - certificate format that Tor uses. For the certificates that - authorities use for their signing keys, see dir-spec.txt. - Additionally, Tor uses TLS, which depends on X.509 certificates; - see tor-spec.txt for details. - - The certificates in this document were first introduced in - proposal 220, and were first supported by Tor in Tor version - 0.2.7.2-alpha. - -1.1. Signing - - All signatures here, unless otherwise specified, are computed - using an Ed25519 key. - - In order to future-proof the format, before signing anything, the - signed document is prefixed with a personalization string, which - will be different in each case. - -1.2. Integer encoding - - Network byte order (big-endian) is used to encode all integer values - in Ed25519 certificates unless explicitly specified otherwise. - -2. Document formats - -2.1. Ed25519 Certificates - - When generating a signing key, we also generate a certificate for it. - Unlike the certificates for authorities' signing keys, these - certificates need to be sent around frequently, in significant - numbers. 
So we'll choose a compact representation. - - VERSION [1 byte] - CERT_TYPE [1 byte] - EXPIRATION_DATE [4 bytes] - CERT_KEY_TYPE [1 byte] - CERTIFIED_KEY [32 bytes] - N_EXTENSIONS [1 byte] - EXTENSIONS [N_EXTENSIONS times] - SIGNATURE [64 bytes] - - The "VERSION" field holds the value [01]. The "CERT_TYPE" field - holds a value depending on the type of certificate. (See appendix - A.1.) The CERTIFIED_KEY field is an Ed25519 public key if - CERT_KEY_TYPE is [01], or a digest of some other key type - depending on the value of CERT_KEY_TYPE. (See appendix A.4.) - The EXPIRATION_DATE is a date, given in HOURS since the epoch, - after which this certificate isn't valid. (A four-byte field here - can count 2^32 hours, which is roughly 490,000 years.) - - The EXTENSIONS field contains zero or more extensions, each of - the format: - - ExtLength [2 bytes] - ExtType [1 byte] - ExtFlags [1 byte] - ExtData [ExtLength bytes] - - The meaning of the ExtData field in an extension is type-dependent. - - The ExtFlags field holds flags; this flag is currently defined: - - 1 -- AFFECTS_VALIDATION. If this flag is present, then the - extension affects whether the certificate is valid; clients - must not accept the certificate as valid unless they - understand the extension. - - It is an error for an extension to be truncated; such a - certificate is invalid. - - Before processing any certificate, parties SHOULD know which - identity key it is supposed to be signed by, and then check the - signature. The signature is created by signing all the fields in - the certificate up until "SIGNATURE" (that is, signing - sizeof(ed25519_cert) - 64 bytes). - -2.2. Basic extensions - -2.2.1. Signed-with-ed25519-key extension [type 04] - - In several places, it's desirable to bundle the key signing a - certificate along with the certificate. We do so with this - extension. - - ExtLength = 32 - ExtData = - An ed25519 key [32 bytes] - - When this extension is present, it MUST match the key used to - sign the certificate.
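
The certificate layout in section 2.1 can be illustrated with a small Python parser. This is a sketch, not Tor's implementation: all names are hypothetical, the signature is not verified, and rejecting leftover bytes before the signature is an assumption that the text does not state.

```python
import struct
from dataclasses import dataclass
from typing import List

HEADER_LEN = 40            # VERSION through N_EXTENSIONS
SIGNATURE_LEN = 64
AFFECTS_VALIDATION = 0x01
KNOWN_EXT_TYPES = {0x04}   # signed-with-ed25519-key


@dataclass
class CertExtension:
    ext_type: int
    flags: int
    data: bytes


@dataclass
class Ed25519Cert:
    cert_type: int
    expiration_hours: int   # hours since the epoch
    cert_key_type: int
    certified_key: bytes
    extensions: List[CertExtension]
    signed_part: bytes      # everything before SIGNATURE
    signature: bytes


def parse_ed25519_cert(raw: bytes) -> Ed25519Cert:
    """Parse the fields of a certificate; does NOT check the signature."""
    if len(raw) < HEADER_LEN + SIGNATURE_LEN:
        raise ValueError("certificate too short")
    body_end = len(raw) - SIGNATURE_LEN
    # Integers are big-endian (section 1.2); ">" also disables padding.
    version, cert_type, expiration, key_type = struct.unpack_from(">BBIB", raw, 0)
    if version != 1:
        raise ValueError("unsupported certificate version")
    certified_key = raw[7:39]
    n_extensions = raw[39]
    offset = HEADER_LEN
    extensions: List[CertExtension] = []
    for _ in range(n_extensions):
        if offset + 4 > body_end:
            raise ValueError("truncated extension header")
        ext_len, ext_type, ext_flags = struct.unpack_from(">HBB", raw, offset)
        offset += 4
        if offset + ext_len > body_end:
            raise ValueError("truncated extension data")
        if ext_type not in KNOWN_EXT_TYPES and ext_flags & AFFECTS_VALIDATION:
            raise ValueError("unknown extension affects validation")
        if ext_type == 0x04 and ext_len != 32:
            raise ValueError("signed-with-ed25519-key must be 32 bytes")
        extensions.append(CertExtension(ext_type, ext_flags, raw[offset:offset + ext_len]))
        offset += ext_len
    if offset != body_end:
        # Assumption: no unparsed bytes are allowed before the signature.
        raise ValueError("unexpected bytes before signature")
    return Ed25519Cert(cert_type, expiration, key_type, certified_key,
                       extensions, raw[:body_end], raw[body_end:])
```

A caller would still need to verify `signature` over `signed_part` with the expected identity key, and to compare any signed-with-ed25519-key extension against that key, before trusting the result.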
- -2.3. RSA->Ed25519 cross-certificate - - Certificate type [07] (Cross-certification of Ed25519 identity - with RSA key) contains the following data: - - ED25519_KEY [32 bytes] - EXPIRATION_DATE [4 bytes] - SIGLEN [1 byte] - SIGNATURE [SIGLEN bytes] - - Here, the Ed25519 identity key is signed with the router's RSA - identity key, to indicate that authenticating with a key - certified by the Ed25519 key counts as certifying with the RSA - identity key. (The signature is computed on the SHA256 hash of - the non-signature parts of the certificate, prefixed with the - string "Tor TLS RSA/Ed25519 cross-certificate".) - - Just like with the Ed25519 certificates above, the EXPIRATION_DATE - operates in HOURS after the epoch. - - This certificate type is used to mean, "This Ed25519 identity key - acts with the authority of the RSA key that signed this - certificate." - -A.1. List of certificate types (CERT_TYPE field) - - The values marked with asterisks are not types corresponding to - the certificate format of section 2.1. Instead, they are - reserved for RSA-signed certificates to avoid conflicts between - the certificate type enumeration of the CERTS cell and the - certificate type enumeration in our Ed25519 certificates. - - - **[00],[01],[02],[03] - Reserved to avoid conflict with types used - in CERTS cells. - - [04] - Ed25519 signing key with an identity key - (see prop220 section 4.2) - - [05] - TLS link certificate signed with ed25519 signing key - (see prop220 section 4.2) - - [06] - Ed25519 authentication key signed with ed25519 signing key - (see prop220 section 4.2) - - **[07] - Reserved for RSA identity cross-certification; - (see section 2.3 above, and tor-spec.txt section 4.2) - - [08] - Onion service: short-term descriptor signing key, signed - with blinded public key. - (See rend-spec-v3.txt, section [DESC_OUTER]) - - [09] - Onion service: intro point authentication key, cross-certifying the - descriptor signing key.
- (See rend-spec-v3.txt, description of "auth-key") - - [0A] - ntor onion key cross-certifying ed25519 identity key - (see dir-spec.txt, description of "ntor-onion-key-crosscert") - - [0B] - Onion service: ntor-extra encryption key, cross-certifying - descriptor signing key. - (see rend-spec-v3.txt, description of "enc-key-cert") - -A.2. List of extension types - - [04] - signed-with-ed25519-key (section 2.2.1) - -A.3. List of signature prefixes - - We describe various documents as being signed with a prefix. Here - are those prefixes: - - "Tor router descriptor signature v1" (see dir-spec.txt) - -A.4. List of certified key types (CERT_KEY_TYPE field) - - [01] ed25519 key - [02] SHA256 hash of an RSA key. (Not currently used.) - [03] SHA256 hash of an X.509 certificate. (Used with certificate - type 5.) - - (NOTE: Up till 0.4.5.1-alpha, all versions of Tor have incorrectly used - "01" for all types of certified key. Implementations SHOULD - allow "01" in this position, and infer the actual key type from - the CERT_TYPE field.) diff --git a/control-spec.txt b/control-spec.txt deleted file mode 100644 index 52e11a0..0000000 --- a/control-spec.txt +++ /dev/null @@ -1,4418 +0,0 @@ - - TC: A Tor control protocol (Version 1) - -Table of Contents - - 0. Scope - 1. Protocol outline - 1.1. Forward-compatibility - 2. Message format - 2.1. Description format - 2.1.1. Notes on an escaping bug - 2.2. Commands from controller to Tor - 2.3. Replies from Tor to the controller - 2.4. General-use tokens - 3. Commands - 3.1. SETCONF - 3.2. RESETCONF - 3.3. GETCONF - 3.4. SETEVENTS - 3.5. AUTHENTICATE - 3.6. SAVECONF - 3.7. SIGNAL - 3.8. MAPADDRESS - 3.9. GETINFO - 3.10. EXTENDCIRCUIT - 3.11. SETCIRCUITPURPOSE - 3.12. SETROUTERPURPOSE - 3.13. ATTACHSTREAM - 3.14. POSTDESCRIPTOR - 3.15. REDIRECTSTREAM - 3.16. CLOSESTREAM - 3.17. CLOSECIRCUIT - 3.18. QUIT - 3.19. USEFEATURE - 3.20. RESOLVE - 3.21. PROTOCOLINFO - 3.22. LOADCONF - 3.23. TAKEOWNERSHIP - 3.24. AUTHCHALLENGE - 3.25. 
DROPGUARDS - 3.26. HSFETCH - 3.27. ADD_ONION - 3.28. DEL_ONION - 3.29. HSPOST - 3.30. ONION_CLIENT_AUTH_ADD - 3.31. ONION_CLIENT_AUTH_REMOVE - 3.32. ONION_CLIENT_AUTH_VIEW - 3.33. DROPOWNERSHIP - 3.34. DROPTIMEOUTS - 4. Replies - 4.1. Asynchronous events - 4.1.1. Circuit status changed - 4.1.2. Stream status changed - 4.1.3. OR Connection status changed - 4.1.4. Bandwidth used in the last second - 4.1.5. Log messages - 4.1.6. New descriptors available - 4.1.7. New Address mapping - 4.1.8. Descriptors uploaded to us in our role as authoritative dirserver - 4.1.9. Our descriptor changed - 4.1.10. Status events - 4.1.11. Our set of guard nodes has changed - 4.1.12. Network status has changed - 4.1.13. Bandwidth used on an application stream - 4.1.14. Per-country client stats - 4.1.15. New consensus networkstatus has arrived - 4.1.16. New circuit buildtime has been set - 4.1.17. Signal received - 4.1.18. Configuration changed - 4.1.19. Circuit status changed slightly - 4.1.20. Pluggable transport launched - 4.1.21. Bandwidth used on an OR or DIR or EXIT connection - 4.1.22. Bandwidth used by all streams attached to a circuit - 4.1.23. Per-circuit cell stats - 4.1.24. Token buckets refilled - 4.1.25. HiddenService descriptors - 4.1.26. HiddenService descriptors content - 4.1.27. Network liveness has changed - 4.1.28. Pluggable Transport Logs - 4.1.29. Pluggable Transport Status - 5. Implementation notes - 5.1. Authentication - 5.2. Don't let the buffer get too big - 5.3. Backward compatibility with v0 control protocol - 5.4. Tor config options for use by controllers - 5.5. Phases from the Bootstrap status event - 5.5.1. Overview of Bootstrap reporting. - 5.5.2. Phases in Bootstrap Stage 1 - 5.5.3. Phases in Bootstrap Stage 2 - 5.5.4. Phases in Bootstrap Stage 3 - 5.6 Bootstrap phases reported by older versions of Tor - -0. 
Scope - - This document describes an implementation-specific protocol that is used - for other programs (such as frontend user-interfaces) to communicate with a - locally running Tor process. It is not part of the Tor onion routing - protocol. - - This protocol replaces version 0 of TC, which is now deprecated. For - reference, TC is described in "control-spec-v0.txt". Implementors are - recommended to avoid using TC directly, but instead to use a library that - can easily be updated to use the newer protocol. (Version 0 is used by Tor - versions 0.1.0.x; the protocol in this document only works with Tor - versions in the 0.1.1.x series and later.) - - The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL - NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and - "OPTIONAL" in this document are to be interpreted as described in - RFC 2119. - -1. Protocol outline - - TC is a bidirectional message-based protocol. It assumes an underlying - stream for communication between a controlling process (the "client" - or "controller") and a Tor process (or "server"). The stream may be - implemented via TCP, TLS-over-TCP, a Unix-domain socket, or so on, - but it must provide reliable in-order delivery. For security, the - stream should not be accessible by untrusted parties. - - In TC, the client and server send typed messages to each other over the - underlying stream. The client sends "commands" and the server sends - "replies". - - By default, all messages from the server are in response to messages from - the client. Some client requests, however, will cause the server to send - messages to the client indefinitely far into the future. Such - "asynchronous" replies are marked as such. - - Servers respond to messages in the order messages are received. - -1.1. Forward-compatibility - - This is an evolving protocol; new client and server behavior will be - allowed in future versions. 
To allow new backward-compatible behavior - on behalf of the client, we may add new commands and allow existing - commands to take new arguments in future versions. To allow new - backward-compatible server behavior, we note various places below - where servers speaking a future version of this protocol may insert - new data, and note that clients should/must "tolerate" unexpected - elements in these places. There are two ways that we do this: - - * Adding a new field to a message: - - For example, we might say "This message has three space-separated - fields; clients MUST tolerate more fields." This means that a - client MUST NOT crash or otherwise fail to parse the message or - other subsequent messages when there are more than three fields, and - that it SHOULD function at least as well when more fields are - provided as it does when it only gets the fields it accepts. The - most obvious way to do this is by ignoring additional fields; the - next-most-obvious way is to report additional fields verbatim to the - user, perhaps as part of an expert UI. - - * Adding a new possible value to a list of alternatives: - - For example, we might say "This field will be OPEN, CLOSED, or - CONNECTED. Clients MUST tolerate unexpected values." This means - that a client MUST NOT crash or otherwise fail to parse the message - or other subsequent messages when there are unexpected values, and - that it SHOULD try to handle the rest of the message as well as it - can. The most obvious way to do this is by pretending that each - list of alternatives has an additional "unrecognized value" element, - and mapping any unrecognized values to that element; the - next-most-obvious way is to create a separate "unrecognized value" - element for each unrecognized value. - - Clients SHOULD NOT "tolerate" unrecognized alternatives by - pretending that the message containing them is absent. 
For example, - a stream closed for an unrecognized reason is nevertheless closed, - and should be reported as such. - - (If some list of alternatives is given, and there isn't an explicit - statement that clients must tolerate unexpected values, clients still - must tolerate unexpected values. The only exception would be if there - were an explicit statement that no future values will ever be added.) - -2. Message format - -2.1. Description format - - The message formats listed below use ABNF as described in RFC 2234. - The protocol itself is loosely based on SMTP (see RFC 2821). - - We use the following nonterminals from RFC 2822: atom, qcontent - - We define the following general-use nonterminals: - - QuotedString = DQUOTE *qcontent DQUOTE - - There are explicitly no limits on line length. All 8-bit characters - are permitted unless explicitly disallowed. In QuotedStrings, - backslashes and quotes must be escaped; other characters need not be - escaped. - - Wherever CRLF is specified to be accepted from the controller, Tor MAY also - accept LF. Tor, however, MUST NOT generate LF instead of CRLF. - Controllers SHOULD always send CRLF. - -2.1.1. Notes on an escaping bug - - CString = DQUOTE *qcontent DQUOTE - - Note that although these nonterminals have the same grammar, they - are interpreted differently. In a QuotedString, a backslash - followed by any character represents that character. But - in a CString, the escapes "\n", "\t", "\r", and the octal escapes - "\0" ... "\377" represent newline, tab, carriage return, and the - 256 possible octet values respectively. - - The use of CString in this document reflects a bug in Tor; - they should have been QuotedString instead. In the future, they - may migrate to use QuotedString instead. If they do, the - QuotedString implementation will never place a backslash before a - "n", "t", "r", or digit, to ensure that old controllers don't get - confused. 
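
The difference between the two interpretations can be sketched in Python. This is an illustration only: `decode_quoted` is a hypothetical helper, octets produced by octal escapes are mapped to code points 0 to 255, and stopping an octal escape before it would exceed \377 is an assumption of this sketch rather than something the text above specifies.

```python
def decode_quoted(token: str, cstring: bool = False) -> str:
    """Decode a double-quoted token as a QuotedString or, if cstring, a CString."""
    if len(token) < 2 or token[0] != '"' or token[-1] != '"':
        raise ValueError("token is not enclosed in double quotes")
    body = token[1:-1]
    simple = {"n": "\n", "t": "\t", "r": "\r"}
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == '"':
            raise ValueError("unescaped quote inside string")
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        i += 1  # skip the backslash
        if i >= len(body):
            raise ValueError("dangling backslash")
        ch = body[i]
        if cstring and ch in simple:
            out.append(simple[ch])
            i += 1
        elif cstring and ch in "01234567":
            # Up to three octal digits, never exceeding \377 (sketch assumption).
            value = 0
            j = i
            while j < len(body) and j - i < 3 and body[j] in "01234567":
                candidate = value * 8 + int(body[j])
                if candidate > 0o377:
                    break
                value = candidate
                j += 1
            out.append(chr(value))
            i = j
        else:
            # QuotedString rule: backslash plus any character is that character.
            out.append(ch)
            i += 1
    return "".join(out)
```

Calling the helper with `cstring=True` applies the C-style escapes first and falls back to the plain QuotedString rule for every other escaped character.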
- - For future-proofing, controller implementors MAY use the following - rules to be compatible with buggy Tor implementations and with - future ones that implement the spec as intended: - - Read \n \t \r and \0 ... \377 as C escapes. - Treat a backslash followed by any other character as that character. - - Currently, many of the QuotedString instances below that Tor - outputs are in fact CStrings. We intend to fix this in future - versions of Tor, and document which ones were broken. (See - bugtracker ticket #14555 for a bit more information.) - - Note that this bug exists only in strings generated by Tor for the - Tor controller; Tor should parse input QuotedStrings from the - controller correctly. - - -2.2. Commands from controller to Tor - - Command = Keyword OptArguments CRLF / "+" Keyword OptArguments CRLF CmdData - Keyword = 1*ALPHA - OptArguments = [ SP *(SP / VCHAR) ] - - A command is either a single line containing a Keyword and arguments, or a - multiline command whose initial keyword begins with +, and whose data - section ends with a single "." on a line of its own. (We use a special - character to distinguish multiline commands so that Tor can correctly parse - multi-line commands that it does not recognize.) Specific commands and - their arguments are described below in section 3. - -2.3. Replies from Tor to the controller - - Reply = SyncReply / AsyncReply - SyncReply = *(MidReplyLine / DataReplyLine) EndReplyLine - AsyncReply = *(MidReplyLine / DataReplyLine) EndReplyLine - - MidReplyLine = StatusCode "-" ReplyLine - DataReplyLine = StatusCode "+" ReplyLine CmdData - EndReplyLine = StatusCode SP ReplyLine - ReplyLine = [ReplyText] CRLF - ReplyText = XXXX - StatusCode = 3DIGIT - - Unless specified otherwise, multiple lines in a single reply from - Tor to the controller are guaranteed to share the same status - code. Specific replies are mentioned below in section 3, and - described more fully in section 4. 
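
The reply grammar above can be sketched as a small Python parser. It is a sketch under stated assumptions: `parse_reply` is a hypothetical name, the input is assumed to be already decoded text with CRLF line endings, and the CmdData handling (a lone "." terminator and a doubled leading period) follows the CmdData definition in section 2.4.

```python
from typing import List, Optional, Tuple

ReplyEntry = Tuple[str, str, Optional[str]]  # (status code, reply text, data or None)


def parse_reply(raw: str) -> Tuple[List[ReplyEntry], str]:
    """Parse one reply from raw; return its lines and the unparsed remainder."""
    lines = raw.split("\r\n")
    entries: List[ReplyEntry] = []
    i = 0
    while i < len(lines):
        line = lines[i]
        i += 1
        code, sep, text = line[:3], line[3:4], line[4:]
        if len(code) != 3 or not (code.isascii() and code.isdigit()):
            raise ValueError("missing three-digit status code")
        if sep not in ("-", "+", " "):
            raise ValueError("bad separator after status code")
        data = None
        if sep == "+":
            # DataReplyLine: collect CmdData up to the terminating "." line.
            data_lines = []
            while True:
                if i >= len(lines):
                    raise ValueError("unterminated data section")
                data_line = lines[i]
                i += 1
                if data_line == ".":
                    break
                if data_line.startswith("."):
                    data_line = data_line[1:]  # undo the leading-period escape
                data_lines.append(data_line)
            data = "\r\n".join(data_lines)
        entries.append((code, text, data))
        if sep == " ":
            # EndReplyLine: the reply is complete.
            return entries, "\r\n".join(lines[i:])
    raise ValueError("reply has no EndReplyLine")
```

The remainder is returned so that a caller reading from a stream can keep any bytes belonging to the next reply.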
- - [Compatibility note: versions of Tor before 0.2.0.3-alpha sometimes - generate AsyncReplies of the form "*(MidReplyLine / DataReplyLine)". - This is incorrect, but controllers that need to work with these - versions of Tor should be prepared to get multi-line AsyncReplies with - the final line (usually "650 OK") omitted.] - -2.4. General-use tokens - - ; CRLF means, "the ASCII Carriage Return character (decimal value 13) - ; followed by the ASCII Linefeed character (decimal value 10)." - CRLF = CR LF - - ; How a controller tells Tor about a particular OR. There are four - ; possible formats: - ; $Fingerprint -- The router whose identity key hashes to the fingerprint. - ; This is the preferred way to refer to an OR. - ; $Fingerprint~Nickname -- The router whose identity key hashes to the - ; given fingerprint, but only if the router has the given nickname. - ; $Fingerprint=Nickname -- The router whose identity key hashes to the - ; given fingerprint, but only if the router is Named and has the given - ; nickname. - ; Nickname -- The Named router with the given nickname, or, if no such - ; router exists, any router whose nickname matches the one given. - ; This is not a safe way to refer to routers, since Named status - ; could under some circumstances change over time. - ; - ; The tokens that implement the above follow: - - ServerSpec = LongName / Nickname - LongName = Fingerprint [ "~" Nickname ] - - ; For tors older than 0.3.1.3-alpha, LongName may have included an equal - ; sign ("=") in lieu of a tilde ("~"). The presence of an equal sign - ; denoted that the OR possessed the "Named" flag: - - LongName = Fingerprint [ ( "=" / "~" ) Nickname ] - - Fingerprint = "$" 40*HEXDIG - NicknameChar = "a"-"z" / "A"-"Z" / "0" - "9" - Nickname = 1*19 NicknameChar - - ; What follows is an outdated way to refer to ORs. - ; Feature VERBOSE_NAMES replaces ServerID with LongName in events and - ; GETINFO results. 
VERBOSE_NAMES can be enabled starting in Tor version - ; 0.1.2.2-alpha and it is always-on in 0.2.2.1-alpha and later. - ServerID = Nickname / Fingerprint - - - ; Unique identifiers for streams or circuits. Currently, Tor only - ; uses digits, but this may change - StreamID = 1*16 IDChar - CircuitID = 1*16 IDChar - ConnID = 1*16 IDChar - QueueID = 1*16 IDChar - IDChar = ALPHA / DIGIT - - Address = ip4-address / ip6-address / hostname (XXXX Define these) - - ; A "CmdData" section is a sequence of octets concluded by the terminating - ; sequence CRLF "." CRLF. The terminating sequence may not appear in the - ; body of the data. Leading periods on lines in the data are escaped with - ; an additional leading period as in RFC 2821 section 4.5.2. - CmdData = *DataLine "." CRLF - DataLine = CRLF / "." 1*LineItem CRLF / NonDotItem *LineItem CRLF - LineItem = NonCR / 1*CR NonCRLF - NonDotItem = NonDotCR / 1*CR NonCRLF - - ; ISOTime, ISOTime2, and ISOTime2Frac are time formats as specified in - ; ISO8601. - ; example ISOTime: "2012-01-11 12:15:33" - ; example ISOTime2: "2012-01-11T12:15:33" - ; example ISOTime2Frac: "2012-01-11T12:15:33.51" - IsoDatePart = 4*DIGIT "-" 2*DIGIT "-" 2*DIGIT - IsoTimePart = 2*DIGIT ":" 2*DIGIT ":" 2*DIGIT - ISOTime = IsoDatePart " " IsoTimePart - ISOTime2 = IsoDatePart "T" IsoTimePart - ISOTime2Frac = IsoTime2 [ "." 1*DIGIT ] - - ; Numbers - LeadingDigit = "1" - "9" - UInt = LeadingDigit *Digit - -3. Commands - - All commands are case-insensitive, but most keywords are case-sensitive. - -3.1. SETCONF - - Change the value of one or more configuration variables. The syntax is: - - "SETCONF" 1*(SP keyword ["=" value]) CRLF - value = String / QuotedString - - Tor behaves as though it had just read each of the key-value pairs - from its configuration file. Keywords with no corresponding values have - their configuration values reset to 0 or NULL (use RESETCONF if you want - to set it back to its default). 
SETCONF is all-or-nothing: if there - is an error in any of the configuration settings, Tor sets none of them. - - Tor responds with a "250 OK" reply on success. - If some of the listed keywords can't be found, Tor replies with a - "552 Unrecognized option" message. Otherwise, Tor responds with a - "513 syntax error in configuration values" reply on syntax error, or a - "553 impossible configuration setting" reply on a semantic error. - - Some configuration options (e.g. "Bridge") take multiple values. Also, - some configuration keys (e.g. for hidden services and for entry - guard lists) form a context-sensitive group where order matters (see - GETCONF below). In these cases, setting _any_ of the options in a - SETCONF command is taken to reset all of the others. For example, - if two ORListenAddress values are configured, and a SETCONF command - arrives containing a single ORListenAddress value, the new command's - value replaces the two old values. - - Sometimes it is not possible to change configuration options solely by - issuing a series of SETCONF commands, because the value of one of the - configuration options depends on the value of another which has not yet - been set. Such situations can be overcome by setting multiple configuration - options with a single SETCONF command (e.g. SETCONF ORPort=443 - ORListenAddress=9001). - -3.2. RESETCONF - - Remove all settings for a given configuration option entirely, assign - its default value (if any), and then assign the String provided. - Typically the String is left empty, to simply set an option back to - its default. The syntax is: - - "RESETCONF" 1*(SP keyword ["=" String]) CRLF - - Otherwise it behaves like SETCONF above. - -3.3. GETCONF - - Request the value of zero or more configuration variable(s). 
- The syntax is: - - "GETCONF" *(SP keyword) CRLF - - If all of the listed keywords exist in the Tor configuration, Tor replies - with a series of reply lines of the form: - - 250 keyword=value - - If any option is set to a 'default' value semantically different from an - empty string, Tor may reply with a reply line of the form: - - 250 keyword - - Value may be a raw value or a quoted string. Tor will try to use unquoted - values except when the value could be misinterpreted through not being - quoted. (Right now, Tor supports no such misinterpretable values for - configuration options.) - - If some of the listed keywords can't be found, Tor replies with a - "552 unknown configuration keyword" message. - - If an option appears multiple times in the configuration, all of its - key-value pairs are returned in order. - - If no keywords were provided, Tor responds with "250 OK" message. - - Some options are context-sensitive, and depend on other options with - different keywords. These cannot be fetched directly. Currently there - is only one such option: clients should use the "HiddenServiceOptions" - virtual keyword to get all HiddenServiceDir, HiddenServicePort, - HiddenServiceVersion, and HiddenserviceAuthorizeClient option settings. - -3.4. SETEVENTS - - Request the server to inform the client about interesting events. The - syntax is: - - "SETEVENTS" [SP "EXTENDED"] *(SP EventCode) CRLF - - EventCode = 1*(ALPHA / "_") (see section 4.1.x for event types) - - Any events *not* listed in the SETEVENTS line are turned off; thus, sending - SETEVENTS with an empty body turns off all event reporting. - - The server responds with a "250 OK" reply on success, and a "552 - Unrecognized event" reply if one of the event codes isn't recognized. (On - error, the list of active event codes isn't changed.) - - If the flag string "EXTENDED" is provided, Tor may provide extra - information with events for this connection; see 4.1 for more information. 
- NOTE: All events on a given connection will be provided in extended format, - or none. - NOTE: "EXTENDED" was first supported in Tor 0.1.1.9-alpha; it is - always-on in Tor 0.2.2.1-alpha and later. - - Each event is described in more detail in Section 4.1. - -3.5. AUTHENTICATE - - Sent from the client to the server. The syntax is: - - "AUTHENTICATE" [ SP 1*HEXDIG / QuotedString ] CRLF - - This command is used to authenticate to the server. The provided string is - one of the following: - - * (For the HASHEDPASSWORD authentication method; see 3.21) - The original password represented as a QuotedString. - - * (For the COOKIE authentication method; see 3.21) - The contents of the cookie file, formatted in hexadecimal. - - * (For the SAFECOOKIE authentication method; see 3.21) - The HMAC based on the AUTHCHALLENGE message, in hexadecimal. - - The server responds with "250 OK" on success or "515 Bad authentication" if - the authentication cookie is incorrect. Tor closes the connection on an - authentication failure. - - The authentication token can be specified as either a quoted ASCII string, - or as an unquoted hexadecimal encoding of that same string (to avoid escaping - issues). - - For information on how the implementation securely stores authentication - information on disk, see section 5.1. - - Before the client has authenticated, no command other than - PROTOCOLINFO, AUTHCHALLENGE, AUTHENTICATE, or QUIT is valid. If the - controller sends any other command, or sends a malformed command, or - sends an unsuccessful AUTHENTICATE command, or sends PROTOCOLINFO or - AUTHCHALLENGE more than once, Tor sends an error reply and closes - the connection. - - To prevent some cross-protocol attacks, the AUTHENTICATE command is still - required even if all authentication methods in Tor are disabled. In this - case, the controller should just send "AUTHENTICATE" CRLF.
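
The three argument forms of AUTHENTICATE can be sketched as follows. This is an illustrative Python sketch: `authenticate_command` and `quote_string` are hypothetical helpers, and the `token` parameter stands for either the cookie file contents or an already computed SAFECOOKIE HMAC (computing that HMAC is not shown here).

```python
from typing import Optional


def quote_string(value: str) -> str:
    """Render value as a QuotedString, escaping backslashes and double quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def authenticate_command(password: Optional[str] = None,
                         token: Optional[bytes] = None) -> bytes:
    """Build the AUTHENTICATE line for one of the methods in section 3.5."""
    if password is not None and token is not None:
        raise ValueError("give a password or a binary token, not both")
    if password is not None:
        # HASHEDPASSWORD: the original password as a QuotedString.
        argument = " " + quote_string(password)
    elif token is not None:
        # COOKIE or SAFECOOKIE: the raw bytes written in hexadecimal.
        argument = " " + token.hex().upper()
    else:
        # No authentication method enabled: a bare AUTHENTICATE is still required.
        argument = ""
    # Non-ASCII passwords raise UnicodeEncodeError in this sketch.
    return ("AUTHENTICATE" + argument + "\r\n").encode("ascii")
```

For example, `authenticate_command(token=open(path, "rb").read())` would produce the COOKIE form, where `path` is whatever cookie file location the controller has learned; a password could equally be sent in hexadecimal by passing its bytes as `token`.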
- - (Versions of Tor before 0.1.2.16 and 0.2.0.4-alpha did not close the - connection after an authentication failure.) - -3.6. SAVECONF - - Sent from the client to the server. The syntax is: - - "SAVECONF" [SP "FORCE"] CRLF - - Instructs the server to write out its config options into its torrc. The server - returns "250 OK" if successful, or "551 Unable to write configuration - to disk" if it can't write the file or some other error occurs. - - If the %include option is used on torrc, SAVECONF will not write the - configuration to disk. If the flag string "FORCE" is provided, the - configuration will be overwritten even if %include is used. Using %include - on defaults-torrc does not affect SAVECONF. (Introduced in 0.3.1.1-alpha.) - - See also the "getinfo config-text" command, if the controller wants - to write the torrc file itself. - - See also the "getinfo config-can-saveconf" command, to tell if the FORCE - flag will be required. (Also introduced in 0.3.1.1-alpha.) - -3.7. SIGNAL - - Sent from the client to the server. The syntax is: - - "SIGNAL" SP Signal CRLF - - Signal = "RELOAD" / "SHUTDOWN" / "DUMP" / "DEBUG" / "HALT" / - "HUP" / "INT" / "USR1" / "USR2" / "TERM" / "NEWNYM" / - "CLEARDNSCACHE" / "HEARTBEAT" / "ACTIVE" / "DORMANT" - - The meanings of the signals are: - - RELOAD -- Reload: reload config items. - SHUTDOWN -- Controlled shutdown: if the server is an OP, exit immediately. - If it's an OR, close listeners and exit after - ShutdownWaitLength seconds. - DUMP -- Dump stats: log information about open connections and - circuits. - DEBUG -- Debug: switch all open logs to loglevel debug. - HALT -- Immediate shutdown: clean up and exit now. - CLEARDNSCACHE -- Forget the client-side cached IPs for all hostnames. - NEWNYM -- Switch to clean circuits, so new application requests - don't share any circuits with old ones. Also clears - the client-side DNS cache. (Tor MAY rate-limit its - response to this signal.)
- HEARTBEAT -- Make Tor dump an unscheduled Heartbeat message to the log. - DORMANT -- Tell Tor to become "dormant". A dormant Tor will - try to avoid CPU and network usage until it receives a - user-initiated network request. (Don't use this - on relays or hidden services yet!) - ACTIVE -- Tell Tor to stop being "dormant", as if it had received - a user-initiated network request. - - The server responds with "250 OK" if the signal is recognized (or simply - closes the socket if it was asked to close immediately), or "552 - Unrecognized signal" if the signal is unrecognized. - - Note that not all of these signals have POSIX signal equivalents. The - ones that do are as below. You may also use these POSIX names for the - signals that have them. - - RELOAD: HUP - SHUTDOWN: INT - HALT: TERM - DUMP: USR1 - DEBUG: USR2 - - [SIGNAL DORMANT and SIGNAL ACTIVE were added in 0.4.0.1-alpha.] - -3.8. MAPADDRESS - - Sent from the client to the server. The syntax is: - - "MAPADDRESS" 1*(Address "=" Address SP) CRLF - - The first address in each pair is an "original" address; the second is a - "replacement" address. The client sends this message to the server in - order to tell it that future SOCKS requests for connections to the original - address should be replaced with connections to the specified replacement - address. If the addresses are well-formed, and the server is able to - fulfill the request, the server replies with a 250 message: - - 250-OldAddress1=NewAddress1 - 250 OldAddress2=NewAddress2 - - containing the source and destination addresses. If the request is - malformed, the server replies with "512 syntax error in command - argument". If the server can't fulfill the request, it replies with - "451 resource exhausted". - - The client may decline to provide a body for the original address, and - instead send a special null address ("0.0.0.0" for IPv4, "::0" for IPv6, or - "."
for hostname), signifying that the server should choose the original - address itself, and return that address in the reply. The server - should ensure that it returns an element of address space that is unlikely - to be in actual use. If there is already an address mapped to the - destination address, the server may reuse that mapping. - - If the original address is already mapped to a different address, the old - mapping is removed. If the original address and the destination address - are the same, the server removes any mapping in place for the original - address. - - Example: - - C: MAPADDRESS 1.2.3.4=torproject.org - S: 250 1.2.3.4=torproject.org - - C: GETINFO address-mappings/control - S: 250-address-mappings/control=1.2.3.4 torproject.org NEVER - S: 250 OK - - C: MAPADDRESS 1.2.3.4=1.2.3.4 - S: 250 1.2.3.4=1.2.3.4 - - C: GETINFO address-mappings/control - S: 250-address-mappings/control= - S: 250 OK - - {Note: This feature is designed to be used to help Tor-ify applications - that need to use SOCKS4 or hostname-less SOCKS5. There are three - approaches to doing this: - - 1. Somehow make them use SOCKS4a or SOCKS5-with-hostnames instead. - 2. Use tor-resolve (or another interface to Tor's resolve-over-SOCKS - feature) to resolve the hostname remotely. This doesn't work - with special addresses like x.onion or x.y.exit. - 3. Use MAPADDRESS to map an IP address to the desired hostname, and then - arrange to fool the application into thinking that the hostname - has resolved to that IP. - - This functionality is designed to help implement the 3rd approach.} - - Mappings set by the controller last until the Tor process exits: - they never expire. If the controller wants the mapping to last only - a certain time, then it must explicitly un-map the address when that - time has elapsed. - - MapAddress replies MAY contain mixed status codes. 
- - Example: - - C: MAPADDRESS xxx=@@@ 0.0.0.0=bogus1.google.com - S: 512-syntax error: invalid address '@@@' - S: 250 127.199.80.246=bogus1.google.com - -3.9. GETINFO - - Sent from the client to the server. The syntax is as for GETCONF: - - "GETINFO" 1*(SP keyword) CRLF - - Unlike GETCONF, this message is used for data that are not stored in the Tor - configuration file, and that may be longer than a single line. On success, - one ReplyLine is sent for each requested value, followed by a final 250 OK - ReplyLine. If a value fits on a single line, the format is: - - 250-keyword=value - If a value must be split over multiple lines, the format is: - - 250+keyword= - value - . - The server sends a 551 or 552 error on failure. - - Recognized keys and their values include: - - "version" -- The version of the server's software, which MAY include the - name of the software, such as "Tor 0.0.9.4". The name of the software, - if absent, is assumed to be "Tor". - - "config-file" -- The location of Tor's configuration file ("torrc"). - - "config-defaults-file" -- The location of Tor's configuration - defaults file ("torrc.defaults"). This file gets parsed before - torrc, and is typically used to replace Tor's default - configuration values. [First implemented in 0.2.3.9-alpha.] - - "config-text" -- The contents that Tor would write if you send it - a SAVECONF command, so the controller can write the file to - disk itself. [First implemented in 0.2.2.7-alpha.] - - "exit-policy/default" -- The default exit policy lines that Tor will - *append* to the ExitPolicy config option. - - "exit-policy/reject-private/default" -- The default exit policy lines - that Tor will *prepend* to the ExitPolicy config option when - ExitPolicyRejectPrivate is 1. - - "exit-policy/reject-private/relay" -- The relay-specific exit policy - lines that Tor will *prepend* to the ExitPolicy config option based - on the current values of ExitPolicyRejectPrivate and - ExitPolicyRejectLocalInterfaces. 
These lines are based on the public - addresses configured in the torrc and present on the relay's - interfaces. Will send a 552 error if the server is not running as - an onion router. Will send a 551 error on an internal error, which may be transient. - - "exit-policy/ipv4" - "exit-policy/ipv6" - "exit-policy/full" -- This OR's exit policy, in IPv4-only, IPv6-only, or - all-entries flavors. Handles errors in the same way as - "exit-policy/reject-private/relay" does. - - "desc/id/" or "desc/name/" -- the latest - server descriptor for a given OR. (Note that modern Tor clients - do not download server descriptors by default, but download - microdescriptors instead. If microdescriptors are enabled, you'll - need to use "md" instead.) - - "md/all" -- all known microdescriptors for the entire Tor network. - Each microdescriptor is terminated by a newline. - [First implemented in 0.3.5.1-alpha] - - "md/id/" or "md/name/" -- the latest - microdescriptor for a given OR. Empty if we have no microdescriptor for - that OR (because we haven't downloaded one, or it isn't in the - consensus). [First implemented in 0.2.3.8-alpha.] - - "desc/download-enabled" -- "1" if we try to download router descriptors; - "0" otherwise. [First implemented in 0.3.2.1-alpha] - - "md/download-enabled" -- "1" if we try to download microdescriptors; - "0" otherwise. [First implemented in 0.3.2.1-alpha] - - "dormant" -- A nonnegative integer: zero if Tor is currently active and - building circuits, and nonzero if Tor has gone idle due to lack of use - or some similar reason. [First implemented in 0.2.3.16-alpha] - - "desc-annotations/id/" -- outputs the annotations string - (source, timestamp of arrival, purpose, etc) for the corresponding - descriptor. [First implemented in 0.2.0.13-alpha.] - - "extra-info/digest/" -- the extrainfo document whose digest (in - hex) is . Only available if we're downloading extra-info - documents.
- - "ns/id/" or "ns/name/" -- the latest router - status info (v3 directory style) for a given OR. Router status - info is as given in dir-spec.txt, and reflects the latest - consensus opinion about the - router in question. Like directory clients, controllers MUST - tolerate unrecognized flags and lines. The published date and - descriptor digest are those believed to be best by this Tor, - not necessarily those for a descriptor that Tor currently has. - [First implemented in 0.1.2.3-alpha.] - [In 0.2.0.9-alpha this switched from v2 directory style to v3] - - "ns/all" -- Router status info (v3 directory style) for all ORs - that the consensus has an opinion about, joined by newlines. - [First implemented in 0.1.2.3-alpha.] - [In 0.2.0.9-alpha this switched from v2 directory style to v3] - - "ns/purpose/" -- Router status info (v3 directory style) - for all ORs of this purpose. Mostly designed for /ns/purpose/bridge - queries. - [First implemented in 0.2.0.13-alpha.] - [In 0.2.0.9-alpha this switched from v2 directory style to v3] - [In versions before 0.4.1.1-alpha we set the Running flag on - bridges when /ns/purpose/bridge is accessed] - [In 0.4.1.1-alpha we set the Running flag on bridges when the - bridge networkstatus file is written to disk] - - "desc/all-recent" -- the latest server descriptor for every router that - Tor knows about. (See md note about "desc/id" and "desc/name" above.) - - "network-status" -- [Deprecated in 0.3.1.1-alpha, removed - in 0.4.5.1-alpha.] - - "address-mappings/all" - "address-mappings/config" - "address-mappings/cache" - "address-mappings/control" -- a \r\n-separated list of address - mappings, each in the form of "from-address to-address expiry". 
- The 'config' key returns those address mappings set in the - configuration; the 'cache' key returns the mappings in the - client-side DNS cache; the 'control' key returns the mappings set - via the control interface; the 'all' target returns the mappings - set through any mechanism. - Expiry is formatted as with ADDRMAP events, except that "expiry" is - always a time in UTC or the string "NEVER"; see section 4.1.7. - First introduced in 0.2.0.3-alpha. - - "addr-mappings/*" -- as for address-mappings/*, but without the - expiry portion of the value. Use of this value is deprecated - since 0.2.0.3-alpha; use address-mappings instead. - - "address" -- the best guess at our external IP address. If we - have no guess, return a 551 error. (Added in 0.1.2.2-alpha) - - "address/v4" - "address/v6" - the best guess at our respective external IPv4 or IPv6 address. - If we have no guess, return a 551 error. (Added in 0.4.5.1-alpha) - - "fingerprint" -- the contents of the fingerprint file that Tor - writes as a relay, or a 551 if we're not a relay currently. - (Added in 0.1.2.3-alpha) - - "circuit-status" - A series of lines as for a circuit status event. Each line is of - the form described in section 4.1.1, omitting the initial - "650 CIRC ". Note that clients must be ready to accept additional - arguments as described in section 4.1. - - "stream-status" - A series of lines as for a stream status event. Each is of the form: - StreamID SP StreamStatus SP CircuitID SP Target CRLF - - "orconn-status" - A series of lines as for an OR connection status event. 
In Tor - 0.1.2.2-alpha with feature VERBOSE_NAMES enabled and in Tor - 0.2.2.1-alpha and later by default, each line is of the form: - LongName SP ORStatus CRLF - - In Tor versions 0.1.2.2-alpha through 0.2.2.1-alpha with feature - VERBOSE_NAMES turned off and before version 0.1.2.2-alpha, each line - is of the form: - ServerID SP ORStatus CRLF - - "entry-guards" - A series of lines listing the currently chosen entry guards, if any. - In Tor 0.1.2.2-alpha with feature VERBOSE_NAMES enabled and in Tor - 0.2.2.1-alpha and later by default, each line is of the form: - LongName SP Status [SP ISOTime] CRLF - - In Tor versions 0.1.2.2-alpha through 0.2.2.1-alpha with feature - VERBOSE_NAMES turned off and before version 0.1.2.2-alpha, each line - is of the form: - ServerID2 SP Status [SP ISOTime] CRLF - ServerID2 = Nickname / 40*HEXDIG - - The definition of Status is the same for both: - Status = "up" / "never-connected" / "down" / - "unusable" / "unlisted" - - [From 0.1.1.4-alpha to 0.1.1.10-alpha, entry-guards was called - "helper-nodes". Tor still supports calling "helper-nodes", but it - is deprecated and should not be used.] - - [Older versions of Tor (before 0.1.2.x-final) generated 'down' instead - of unlisted/unusable. Between 0.1.2.x-final and 0.2.6.3-alpha, - 'down' was never generated.] - - [XXXX ServerID2 differs from ServerID in not prefixing fingerprints - with a $. This is an implementation error. It would be nice to add - the $ back in if we can do so without breaking compatibility.] - - "traffic/read" -- Total bytes read (downloaded). - - "traffic/written" -- Total bytes written (uploaded). - - "uptime" -- Uptime of the Tor daemon (in seconds). Added in - 0.3.5.1-alpha. - - "accounting/enabled" - "accounting/hibernating" - "accounting/bytes" - "accounting/bytes-left" - "accounting/interval-start" - "accounting/interval-wake" - "accounting/interval-end" - Information about accounting status. If accounting is enabled, - "enabled" is 1; otherwise it is 0. 
The "hibernating" field is "hard" - if we are accepting no data; "soft" if we're accepting no new - connections, and "awake" if we're not hibernating at all. The "bytes" - and "bytes-left" fields contain (read-bytes SP write-bytes), for the - start and the rest of the interval respectively. The 'interval-start' - and 'interval-end' fields are the borders of the current interval; the - 'interval-wake' field is the time within the current interval (if any) - where we plan[ned] to start being active. The times are UTC. - - "config/names" - A series of lines listing the available configuration options. Each is - of the form: - OptionName SP OptionType [ SP Documentation ] CRLF - OptionName = Keyword - OptionType = "Integer" / "TimeInterval" / "TimeMsecInterval" / - "DataSize" / "Float" / "Boolean" / "Time" / "CommaList" / - "Dependent" / "Virtual" / "String" / "LineList" - Documentation = Text - Note: The incorrect spelling "Dependant" was used from the time this key - was introduced in Tor 0.1.1.4-alpha until it was corrected in Tor - 0.3.0.2-alpha. It is recommended that clients accept both spellings. - - "config/defaults" - A series of lines listing default values for each configuration - option. Options which don't have a valid default don't show up - in the list. Introduced in Tor 0.2.4.1-alpha. - OptionName SP OptionValue CRLF - OptionName = Keyword - OptionValue = Text - - "info/names" - A series of lines listing the available GETINFO options. Each is of - one of these forms: - OptionName SP Documentation CRLF - OptionPrefix SP Documentation CRLF - OptionPrefix = OptionName "/*" - The OptionPrefix form indicates a number of options beginning with the - prefix. So if "config/*" is listed, other options beginning with - "config/" will work, but "config/*" itself is not an option. - - "events/names" - A space-separated list of all the events supported by this version of - Tor's SETEVENTS. 
- - "features/names" - A space-separated list of all the features supported by this version - of Tor's USEFEATURE. - - "signal/names" - A space-separated list of all the values supported by the SIGNAL - command. - - "ip-to-country/ipv4-available" - "ip-to-country/ipv6-available" - "1" if the relevant geoip or geoip6 database is present; "0" otherwise. - This field was added in Tor 0.3.2.1-alpha. - - "ip-to-country/*" - Maps IP addresses to 2-letter country codes. For example, - "GETINFO ip-to-country/18.0.0.1" should give "US". - - "process/pid" -- Process id belonging to the main tor process. - "process/uid" -- User id running the tor process, -1 if unknown (this is - unimplemented on Windows, returning -1). - "process/user" -- Username under which the tor process is running, - providing an empty string if none exists (this is unimplemented on - Windows, returning an empty string). - "process/descriptor-limit" -- Upper bound on the file descriptor limit, -1 - if unknown - - "dir/status-vote/current/consensus" [added in Tor 0.2.1.6-alpha] - "dir/status-vote/current/consensus-microdesc" [added in Tor 0.4.3.1-alpha] - "dir/status/authority" - "dir/status/fp/" - "dir/status/fp/++" - "dir/status/all" - "dir/server/fp/" - "dir/server/fp/++" - "dir/server/d/" - "dir/server/d/++" - "dir/server/authority" - "dir/server/all" - A series of lines listing directory contents, provided according to the - specification for the URLs listed in Section 4.4 of dir-spec.txt. Note - that Tor MUST NOT provide private information, such as descriptors for - routers not marked as general-purpose. When asked for 'authority' - information for which this Tor is not authoritative, Tor replies with - an empty string. - - Note that, as of Tor 0.2.3.3-alpha, Tor clients don't download server - descriptors anymore, but microdescriptors. So, a "551 Servers - unavailable" reply to all "GETINFO dir/server/*" requests is actually - correct. 
If you have an old program which absolutely requires server - descriptors to work, try setting UseMicrodescriptors 0 or - FetchUselessDescriptors 1 in your client's torrc. - - "status/circuit-established" - "status/enough-dir-info" - "status/good-server-descriptor" - "status/accepted-server-descriptor" - "status/..." - These provide the current internal Tor values for various Tor - states. See Section 4.1.10 for explanations. (Only a few of the - status events are available as getinfo's currently. Let us know if - you want more exposed.) - "status/reachability-succeeded/or" - 0 or 1, depending on whether we've found our ORPort reachable. - "status/reachability-succeeded/dir" - 0 or 1, depending on whether we've found our DirPort reachable. - 1 if there is no DirPort, and therefore no need for a reachability - check. - "status/reachability-succeeded" - "OR=" ("0"/"1") SP "DIR=" ("0"/"1") - Combines status/reachability-succeeded/*; controllers MUST ignore - unrecognized elements in this entry. - "status/bootstrap-phase" - Returns the most recent bootstrap phase status event - sent. Specifically, it returns a string starting with either - "NOTICE BOOTSTRAP ..." or "WARN BOOTSTRAP ...". Controllers should - use this getinfo when they connect or attach to Tor to learn its - current bootstrap state. - "status/version/recommended" - List of currently recommended versions. - "status/version/current" - Status of the current version. One of: new, old, unrecommended, - recommended, new in series, obsolete, unknown. - "status/clients-seen" - A summary of which countries we've seen clients from recently, - formatted the same as the CLIENTS_SEEN status event described in - Section 4.1.14. This GETINFO option is currently available only - for bridge relays. - "status/fresh-relay-descs" - Provides fresh server and extra-info descriptors for our relay. 
Note - this is *not* the latest descriptors we've published, but rather what we - would generate if we needed to make a new descriptor right now. - - "net/listeners/*" - - A quoted, space-separated list of the locations where Tor is listening - for connections of the specified type. These can contain IPv4 - network address... - - "127.0.0.1:9050" "127.0.0.1:9051" - - ... or local unix sockets... - - "unix:/home/my_user/.tor/socket" - - ... or IPv6 network addresses: - - "[2001:0db8:7000:0000:0000:dead:beef:1234]:9050" - - [New in Tor 0.2.2.26-beta.] - - "net/listeners/or" - - Listeners for OR connections. Talks Tor protocol as described in - tor-spec.txt. - - "net/listeners/dir" - - Listeners for Tor directory protocol, as described in dir-spec.txt. - - "net/listeners/socks" - - Listeners for onion proxy connections that talk SOCKS4/4a/5 protocol. - - "net/listeners/trans" - - Listeners for transparent connections redirected by firewall, such as - pf or netfilter. - - "net/listeners/natd" - - Listeners for transparent connections redirected by natd. - - "net/listeners/dns" - - Listeners for a subset of DNS protocol that Tor network supports. - - "net/listeners/control" - - Listeners for Tor control protocol, described herein. - - "net/listeners/extor" - - Listeners corresponding to Extended ORPorts for integration with - pluggable transports. See proposals 180 and 196. - - "net/listeners/httptunnel" - - Listeners for onion proxy connections that leverage HTTP CONNECT - tunnelling. - - [The extor and httptunnel lists were added in 0.3.2.12, 0.3.3.10, and - 0.3.4.6-rc.] - - "dir-usage" - A newline-separated list of how many bytes we've served to answer - each type of directory request. The format of each line is: - Keyword 1*SP Integer 1*SP Integer - where the first integer is the number of bytes written, and the second - is the number of requests answered. - - [This feature was added in Tor 0.2.2.1-alpha, and removed in - Tor 0.2.9.1-alpha. 
Even when it existed, it only provided - useful output when the Tor client was built with either the - INSTRUMENT_DOWNLOADS or RUNNING_DOXYGEN compile-time options.] - - "bw-event-cache" - A space-separated summary of recent BW events in chronological order - from oldest to newest. Each event is represented by a comma-separated - tuple of "R,W", R is the number of bytes read, and W is the number of - bytes written. These entries each represent about one second's worth - of traffic. - [New in Tor 0.2.6.3-alpha] - - "consensus/valid-after" - "consensus/fresh-until" - "consensus/valid-until" - Each of these produces an ISOTime describing part of the lifetime of - the current (valid, accepted) consensus that Tor has. - [New in Tor 0.2.6.3-alpha] - - "hs/client/desc/id/" - Prints the content of the hidden service descriptor corresponding to - the given which is an onion address without the ".onion" part. - The client's cache is queried to find the descriptor. The format of - the descriptor is described in section 1.3 of the rend-spec.txt - document. - - If is unrecognized or if not found in the cache, a 551 error is - returned. - - [New in Tor 0.2.7.1-alpha] - [HS v3 support added 0.3.3.1-alpha] - - "hs/service/desc/id/" - Prints the content of the hidden service descriptor corresponding to - the given which is an onion address without the ".onion" part. - The service's local descriptor cache is queried to find the descriptor. - The format of the descriptor is described in section 1.3 of the - rend-spec.txt document. - - If is unrecognized or if not found in the cache, a 551 error is - returned. - - [New in Tor 0.2.7.2-alpha] - [HS v3 support added 0.3.3.1-alpha] - - "onions/current" - "onions/detached" - A newline-separated list of the Onion ("Hidden") Services created - via the "ADD_ONION" command. The 'current' key returns Onion Services - belonging to the current control connection. 
The 'detached' key - returns Onion Services detached from the parent control connection - (as in, belonging to no control connection). - The format of each line is: - HSAddress - [New in Tor 0.2.7.1-alpha.] - [HS v3 support added 0.3.3.1-alpha] - - "network-liveness" - The string "up" or "down", indicating whether we currently believe the - network is reachable. - - "downloads/" - The keys under downloads/ are used to query download statuses; they all - return either a sequence of newline-terminated hex-encoded digests, or - a "serialized download status" as follows: - - SerializedDownloadStatus = - -- when do we plan to next attempt to download this object? - "next-attempt-at" SP ISOTime CRLF - -- how many times have we failed since the last success? - "n-download-failures" SP UInt CRLF - -- how many times have we tried to download this? - "n-download-attempts" SP UInt CRLF - -- according to which schedule rule will we download this? - "schedule" SP DownloadSchedule CRLF - -- do we want to fetch this from an authority, or will any cache do? - "want-authority" SP DownloadWantAuthority CRLF - -- do we increase our download delay whenever we fail to fetch this, - -- or whenever we attempt fetching this? - "increment-on" SP DownloadIncrementOn CRLF - -- do we increase the download schedule deterministically, or at - -- random? - "backoff" SP DownloadBackoff CRLF - [ - -- with an exponential backoff, where are we in the schedule? - "last-backoff-position" SP UInt CRLF - -- with an exponential backoff, what was our last delay? 
- "last-delay-used" SP UInt CRLF - ] - - where - - DownloadSchedule = - "DL_SCHED_GENERIC" / "DL_SCHED_CONSENSUS" / "DL_SCHED_BRIDGE" - DownloadWantAuthority = - "DL_WANT_ANY_DIRSERVER" / "DL_WANT_AUTHORITY" - DownloadIncrementOn = - "DL_SCHED_INCREMENT_FAILURE" / "DL_SCHED_INCREMENT_ATTEMPT" - DownloadBackoff = - "DL_SCHED_DETERMINISTIC" / "DL_SCHED_RANDOM_EXPONENTIAL" - - The optional last two lines must be present if DownloadBackoff is - "DL_SCHED_RANDOM_EXPONENTIAL" and must be absent if DownloadBackoff - is "DL_SCHED_DETERMINISTIC". - - In detail, the keys supported are: - - "downloads/networkstatus/ns" - The SerializedDownloadStatus for the NS-flavored consensus for - whichever bootstrap state Tor is currently in. - - "downloads/networkstatus/ns/bootstrap" - The SerializedDownloadStatus for the NS-flavored consensus at - bootstrap time, regardless of whether we are currently bootstrapping. - - "downloads/networkstatus/ns/running" - - The SerializedDownloadStatus for the NS-flavored consensus when - running, regardless of whether we are currently bootstrapping. - - "downloads/networkstatus/microdesc" - The SerializedDownloadStatus for the microdesc-flavored consensus for - whichever bootstrap state Tor is currently in. - - "downloads/networkstatus/microdesc/bootstrap" - The SerializedDownloadStatus for the microdesc-flavored consensus at - bootstrap time, regardless of whether we are currently bootstrapping. - - "downloads/networkstatus/microdesc/running" - The SerializedDownloadStatus for the microdesc-flavored consensus when - running, regardless of whether we are currently bootstrapping. - - "downloads/cert/fps" - - A newline-separated list of hex-encoded digests for authority - certificates for which we have download status available. - - "downloads/cert/fp/" - A SerializedDownloadStatus for the default certificate for the - identity digest returned by the downloads/cert/fps key. 
- - "downloads/cert/fp//sks" - A newline-separated list of hex-encoded signing key digests for the - authority identity digest returned by the - downloads/cert/fps key. - - "downloads/cert/fp//" - A SerializedDownloadStatus for the certificate for the identity - digest returned by the downloads/cert/fps key and signing - key digest returned by the downloads/cert/fp// - sks key. - - "downloads/desc/descs" - A newline-separated list of hex-encoded router descriptor digests - [note, not identity digests - the Tor process may not have seen them - yet while downloading router descriptors]. If the Tor process is not - using a NS-flavored consensus, a 551 error is returned. - - "downloads/desc/" - A SerializedDownloadStatus for the router descriptor with digest - as returned by the downloads/desc/descs key. If the Tor - process is not using a NS-flavored consensus, a 551 error is returned. - - "downloads/bridge/bridges" - A newline-separated list of hex-encoded bridge identity digests. If - the Tor process is not using bridges, a 551 error is returned. - - "downloads/bridge/" - A SerializedDownloadStatus for the bridge descriptor with identity - digest as returned by the downloads/bridge/bridges key. If - the Tor process is not using bridges, a 551 error is returned. - - "sr/current" - "sr/previous" - The current or previous shared random value, as received in the - consensus, base-64 encoded. An empty value means that either - the consensus has no shared random value, or Tor has no consensus. - - "current-time/local" - "current-time/utc" - The current system or UTC time, as returned by the system, in ISOTime2 - format. (Introduced in 0.3.4.1-alpha.) - - "stats/ntor/requested" - "stats/ntor/assigned" - The NTor circuit onion handshake rephist values which are requested or - assigned. (Introduced in 0.4.5.1-alpha) - - "stats/tap/requested" - "stats/tap/assigned" - The TAP circuit onion handshake rephist values which are requested or - assigned. 
(Introduced in 0.4.5.1-alpha) - - "config-can-saveconf" - 0 or 1, depending on whether it is possible to use SAVECONF without the - FORCE flag. (Introduced in 0.3.1.1-alpha.) - - "limits/max-mem-in-queues" - The amount of memory that Tor's out-of-memory checker will allow - Tor to allocate (in places it can see) before it starts freeing memory - and killing circuits. See the MaxMemInQueues option for more - details. Unlike the option, this value reflects Tor's actual limit, and - may be adjusted depending on the available system memory rather than on - the MaxMemInQueues option. (Introduced in 0.2.5.4-alpha) - - Example: - - C: GETINFO version desc/name/moria1 - S: 250+desc/name/moria1= - S: [Descriptor for moria1] - S: . - S: 250-version=Tor 0.1.1.0-alpha-cvs - S: 250 OK - -3.10. EXTENDCIRCUIT - - Sent from the client to the server. The format is: - - "EXTENDCIRCUIT" SP CircuitID - [SP ServerSpec *("," ServerSpec)] - [SP "purpose=" Purpose] CRLF - - This request takes one of two forms: either the CircuitID is zero, in - which case it is a request for the server to build a new circuit, - or the CircuitID is nonzero, in which case it is a request for the - server to extend an existing circuit with that ID according to the - specified path. - - If the CircuitID is 0, the controller has the option of providing - a path for Tor to use to build the circuit. If it does not provide - a path, Tor will select one automatically from high-capacity nodes - according to path-spec.txt. - - If CircuitID is 0 and "purpose=" is specified, then the circuit's - purpose is set. Two choices are recognized: "general" and - "controller". If not specified, circuits are created as "general". - - If the request is successful, the server sends a reply containing a - message body consisting of the CircuitID of the (possibly newly created) - circuit. The syntax is "250" SP "EXTENDED" SP CircuitID CRLF. - -3.11. SETCIRCUITPURPOSE - - Sent from the client to the server. 
The format is: - - "SETCIRCUITPURPOSE" SP CircuitID SP "purpose=" Purpose CRLF - - This changes the circuit's purpose. See EXTENDCIRCUIT above for details. - -3.12. SETROUTERPURPOSE - - Sent from the client to the server. The format is: - - "SETROUTERPURPOSE" SP NicknameOrKey SP Purpose CRLF - - This changes the descriptor's purpose. See +POSTDESCRIPTOR below - for details. - - NOTE: This command was disabled and made obsolete as of Tor - 0.2.0.8-alpha. It doesn't exist anymore, and is listed here only for - historical interest. - -3.13. ATTACHSTREAM - - Sent from the client to the server. The syntax is: - - "ATTACHSTREAM" SP StreamID SP CircuitID [SP "HOP=" HopNum] CRLF - - This message informs the server that the specified stream should be - associated with the specified circuit. Each stream may be associated with - at most one circuit, and multiple streams may share the same circuit. - Streams can only be attached to completed circuits (that is, circuits that - have sent a circuit status 'BUILT' event or are listed as built in a - GETINFO circuit-status request). - - If the circuit ID is 0, responsibility for attaching the given stream is - returned to Tor. - - If HOP=HopNum is specified, Tor will choose the HopNumth hop in the - circuit as the exit node, rather than the last node in the circuit. - Hops are 1-indexed; generally, it is not permitted to attach to hop 1. - - Tor responds with "250 OK" if it can attach the stream, 552 if the - circuit or stream didn't exist, 555 if the stream isn't in an - appropriate state to be attached (e.g. it's already open), or 551 if - the stream couldn't be attached for another reason. - - {Implementation note: Tor will close unattached streams by itself, - roughly two minutes after they are born. 
Let the developers know if - that turns out to be a problem.} - - {Implementation note: By default, Tor automatically attaches streams to - circuits itself, unless the configuration variable - "__LeaveStreamsUnattached" is set to "1". Attempting to attach streams - via TC when "__LeaveStreamsUnattached" is false may cause a race between - Tor and the controller, as both attempt to attach streams to circuits.} - - {Implementation note: You can try to attachstream to a stream that - has already sent a connect or resolve request but hasn't succeeded - yet, in which case Tor will detach the stream from its current circuit - before proceeding with the new attach request.} - -3.14. POSTDESCRIPTOR - - Sent from the client to the server. The syntax is: - - "+POSTDESCRIPTOR" [SP "purpose=" Purpose] [SP "cache=" Cache] - CRLF Descriptor CRLF "." CRLF - - This message informs the server about a new descriptor. If Purpose is - specified, it must be either "general", "controller", or "bridge", - else we return a 552 error. The default is "general". - - If Cache is specified, it must be either "no" or "yes", else we - return a 552 error. If Cache is not specified, Tor will decide for - itself whether it wants to cache the descriptor, and controllers - must not rely on its choice. - - The descriptor, when parsed, must contain a number of well-specified - fields, including fields for its nickname and identity. - - If there is an error in parsing the descriptor, the server must send a - "554 Invalid descriptor" reply. If the descriptor is well-formed but - the server chooses not to add it, it must reply with a 251 message - whose body explains why the server was not added. If the descriptor - is added, Tor replies with "250 OK". - -3.15. REDIRECTSTREAM - - Sent from the client to the server. The syntax is: - - "REDIRECTSTREAM" SP StreamID SP Address [SP Port] CRLF - - Tells the server to change the exit address on the specified stream. 
If - Port is specified, changes the destination port as well. No remapping - is performed on the newly provided address. - - To be sure that the modified address will be used, this command must be sent - after a new stream event is received, and before attaching this stream to - a circuit. - - Tor replies with "250 OK" on success. - -3.16. CLOSESTREAM - - Sent from the client to the server. The syntax is: - - "CLOSESTREAM" SP StreamID SP Reason *(SP Flag) CRLF - - Tells the server to close the specified stream. The reason should be one - of the Tor RELAY_END reasons given in tor-spec.txt, as a decimal. Flags are - not currently used; Tor servers SHOULD ignore unrecognized flags. Tor may - hold the stream open for a while to flush any data that is pending. - - Tor replies with "250 OK" on success, or a 512 if there aren't enough - arguments, or a 552 if it doesn't recognize the StreamID or reason. - -3.17. CLOSECIRCUIT - - The syntax is: - - "CLOSECIRCUIT" SP CircuitID *(SP Flag) CRLF - Flag = "IfUnused" - - Tells the server to close the specified circuit. If "IfUnused" is - provided, do not close the circuit unless it is unused. - - Other flags may be defined in the future; Tor SHOULD ignore unrecognized - flags. - - Tor replies with "250 OK" on success, or a 512 if there aren't enough - arguments, or a 552 if it doesn't recognize the CircuitID. - -3.18. QUIT - - Tells the server to hang up on this controller connection. This command - can be used before authenticating. - -3.19. USEFEATURE - - Adding features to the control protocol sometimes breaks - backwards compatibility. Initially such features are added into Tor and - disabled by default. USEFEATURE can enable these additional features. - - The syntax is: - - "USEFEATURE" *(SP FeatureName) CRLF - FeatureName = 1*(ALPHA / DIGIT / "_" / "-") - - Feature names are case-insensitive. - - Once enabled, a feature stays enabled for the duration of the connection - to the controller. 
A new connection to the controller must be opened to - disable an enabled feature. - - Features are a forward-compatibility mechanism; each feature will eventually - become a standard part of the control protocol. Once a feature becomes part - of the protocol, it is always-on. Each feature documents the version in which - it was introduced as a feature and the version in which it became part of the - protocol. - - Tor will ignore a request to use any feature that is always-on. Tor will give - a 552 error in response to an unrecognized feature. - - EXTENDED_EVENTS - - Same as passing 'EXTENDED' to SETEVENTS; this is the preferred way to - request the extended event syntax. - - This feature was first introduced in 0.1.2.3-alpha. It is always-on - and part of the protocol in Tor 0.2.2.1-alpha and later. - - VERBOSE_NAMES - - Replaces ServerID with LongName in events and GETINFO results. LongName - provides a Fingerprint for all routers, an indication of Named status, - and a Nickname if one is known. LongName is strictly more informative - than ServerID, which only provides either a Fingerprint or a Nickname. - - This feature was first introduced in 0.1.2.2-alpha. It is always-on and - part of the protocol in Tor 0.2.2.1-alpha and later. - -3.20. RESOLVE - - The syntax is: - - "RESOLVE" *Option *Address CRLF - Option = "mode=reverse" - Address = a hostname or IPv4 address - - This command launches a remote hostname lookup request for every specified - address (or a reverse lookup if "mode=reverse" is specified). Note that the - lookup is done in the background: to see the answers, your controller will - need to listen for ADDRMAP events; see 4.1.7 below. - - [Added in Tor 0.2.0.3-alpha] - -3.21. 
PROTOCOLINFO - - The syntax is: - - "PROTOCOLINFO" *(SP PIVERSION) CRLF - - The server reply format is: - - "250-PROTOCOLINFO" SP PIVERSION CRLF *InfoLine "250 OK" CRLF - - InfoLine = AuthLine / VersionLine / OtherLine - - AuthLine = "250-AUTH" SP "METHODS=" AuthMethod *("," AuthMethod) - *(SP "COOKIEFILE=" AuthCookieFile) CRLF - VersionLine = "250-VERSION" SP "Tor=" TorVersion OptArguments CRLF - - AuthMethod = - "NULL" / ; No authentication is required - "HASHEDPASSWORD" / ; A controller must supply the original password - "COOKIE" / ; ... or supply the contents of a cookie file - "SAFECOOKIE" ; ... or prove knowledge of a cookie file's contents - - AuthCookieFile = QuotedString - TorVersion = QuotedString - - OtherLine = "250-" Keyword OptArguments CRLF - - PIVERSION: 1*DIGIT - - This command tells the controller what kinds of authentication are - supported. - - Tor MAY give its InfoLines in any order; controllers MUST ignore InfoLines - with keywords they do not recognize. Controllers MUST ignore extraneous - data on any InfoLine. - - PIVERSION is there in case we drastically change the syntax one day. For - now it should always be "1". Controllers MAY provide a list of the - protocolinfo versions they support; Tor MAY select a version that the - controller does not support. - - AuthMethod is used to specify one or more control authentication - methods that Tor currently accepts. - - AuthCookieFile specifies the absolute path and filename of the - authentication cookie that Tor is expecting and is provided iff the - METHODS field contains the method "COOKIE" and/or "SAFECOOKIE". - Controllers MUST handle escape sequences inside this string. - - All authentication cookies are 32 bytes long. Controllers MUST NOT - use the contents of a non-32-byte-long file as an authentication - cookie. - - If the METHODS field contains the method "SAFECOOKIE", every - AuthCookieFile must contain the same authentication cookie. 
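The AuthLine grammar and the 32-byte cookie rule above can be illustrated with a small controller-side parser. This is only a sketch, not Tor's or any library's implementation: the function names are hypothetical, and the QuotedString handling is deliberately simplified to decode only the `\\` and `\"` escapes, so a real controller would need a fuller decoder.

```python
import re

# METHODS= followed by a comma-separated list of method names.
_METHODS = re.compile(r'METHODS=([A-Za-z0-9]+(?:,[A-Za-z0-9]+)*)')
# COOKIEFILE= followed by a QuotedString (may appear zero or more times).
_COOKIEFILE = re.compile(r'COOKIEFILE="((?:[^"\\]|\\.)*)"')

COOKIE_LEN = 32  # "All authentication cookies are 32 bytes long."


def _unescape(quoted_body):
    # Simplifying assumption: only \\ and \" are decoded; any other
    # escape sequence is left untouched by this sketch.
    return re.sub(r'\\(["\\])', r'\1', quoted_body)


def parse_auth_line(line):
    """Parse one '250-AUTH ...' InfoLine from a PROTOCOLINFO reply."""
    prefix = "250-AUTH "
    if not line.startswith(prefix):
        raise ValueError("not an AUTH InfoLine")
    rest = line[len(prefix):].rstrip("\r\n")
    match = _METHODS.match(rest)
    if match is None:
        raise ValueError("AUTH line has no METHODS field")
    return {
        "methods": match.group(1).split(","),
        "cookie_files": [_unescape(b) for b in _COOKIEFILE.findall(rest)],
    }


def is_usable_cookie(data):
    """Controllers MUST NOT use a file that is not exactly 32 bytes long."""
    return len(data) == COOKIE_LEN
```

Extraneous data after the recognized fields is ignored by the parser, in line with the requirement that controllers ignore extra data on an InfoLine.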
- - The COOKIE authentication method exposes the user running a - controller to an unintended information disclosure attack whenever - the controller has greater filesystem read access than the process - that it has connected to. (Note that a controller may connect to a - process other than Tor.) It is almost never safe to use, even if - the controller's user has explicitly specified which filename to - read an authentication cookie from. For this reason, the COOKIE - authentication method has been deprecated and will be removed from - a future version of Tor. - - The VERSION line contains the Tor version. - - [Unlike other commands besides AUTHENTICATE, PROTOCOLINFO may be used (but - only once!) before AUTHENTICATE.] - - [PROTOCOLINFO was not supported before Tor 0.2.0.5-alpha.] - -3.22. LOADCONF - - The syntax is: - - "+LOADCONF" CRLF ConfigText CRLF "." CRLF - - This command allows a controller to upload the text of a config file - to Tor over the control port. This config file is then loaded as if - it had been read from disk. - - [LOADCONF was added in Tor 0.2.1.1-alpha.] - -3.23. TAKEOWNERSHIP - - The syntax is: - - "TAKEOWNERSHIP" CRLF - - This command instructs Tor to shut down when this control - connection is closed. This command affects each control connection - that sends it independently; if multiple control connections send - the TAKEOWNERSHIP command to a Tor instance, Tor will shut down when - any of those connections closes. - - (As of Tor 0.2.5.2-alpha, Tor does not wait a while for circuits to - close when shutting down because of an exiting controller. If you - want to ensure a clean shutdown--and you should!--then send "SIGNAL - SHUTDOWN" and wait for the Tor process to close.) - - This command is intended to be used with the - __OwningControllerProcess configuration option. 
A controller that - starts a Tor process which the user cannot easily control or stop - should 'own' that Tor process: - - * When starting Tor, the controller should specify its PID in an - __OwningControllerProcess on Tor's command line. This will - cause Tor to poll for the existence of a process with that PID, - and exit if it does not find such a process. (This is not a - completely reliable way to detect whether the 'owning - controller' is still running, but it should work well enough in - most cases.) - - * Once the controller has connected to Tor's control port, it - should send the TAKEOWNERSHIP command along its control - connection. At this point, *both* the TAKEOWNERSHIP command and - the __OwningControllerProcess option are in effect: Tor will - exit when the control connection ends *and* Tor will exit if it - detects that there is no process with the PID specified in the - __OwningControllerProcess option. - - * After the controller has sent the TAKEOWNERSHIP command, it - should send "RESETCONF __OwningControllerProcess" along its - control connection. This will cause Tor to stop polling for the - existence of a process with its owning controller's PID; Tor - will still exit when the control connection ends. - - [TAKEOWNERSHIP was added in Tor 0.2.2.28-beta.] - -3.24. AUTHCHALLENGE - - The syntax is: - - "AUTHCHALLENGE" SP "SAFECOOKIE" - SP ClientNonce - CRLF - - ClientNonce = 2*HEXDIG / QuotedString - - This command is used to begin the authentication routine for the - SAFECOOKIE method of authentication. - - If the server accepts the command, the server reply format is: - - "250 AUTHCHALLENGE" - SP "SERVERHASH=" ServerHash - SP "SERVERNONCE=" ServerNonce - CRLF - - ServerHash = 64*64HEXDIG - ServerNonce = 64*64HEXDIG - - The ClientNonce, ServerHash, and ServerNonce values are - encoded/decoded in the same way as the argument passed to the - AUTHENTICATE command. ServerNonce MUST be 32 bytes long. 
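The two HMAC-SHA256 computations that this section specifies for SAFECOOKIE can be sketched in Python as follows. This is a sketch under stated assumptions: "|" is read as byte concatenation, all inputs are already decoded to raw bytes, and the names `safecookie_hash` and `authenticate_argument` are hypothetical helpers rather than anything defined by Tor.

```python
import hashlib
import hmac

SERVER_KEY = b"Tor safe cookie authentication server-to-controller hash"
CLIENT_KEY = b"Tor safe cookie authentication controller-to-server hash"

def safecookie_hash(key: bytes, cookie: bytes, client_nonce: bytes,
                    server_nonce: bytes) -> bytes:
    """HMAC-SHA256(key, CookieString | ClientNonce | ServerNonce)."""
    if len(cookie) != 32:
        raise ValueError("authentication cookies are 32 bytes long")
    if len(server_nonce) != 32:
        raise ValueError("ServerNonce MUST be 32 bytes long")
    return hmac.new(key, cookie + client_nonce + server_nonce,
                    hashlib.sha256).digest()

def authenticate_argument(cookie: bytes, client_nonce: bytes,
                          server_nonce: bytes, server_hash: bytes) -> str:
    """Check ServerHash, then return the hex string to send with AUTHENTICATE."""
    expected = safecookie_hash(SERVER_KEY, cookie, client_nonce, server_nonce)
    # Constant-time comparison; a mismatch means the peer does not know the cookie.
    if not hmac.compare_digest(expected, server_hash):
        raise ValueError("ServerHash mismatch")
    return safecookie_hash(CLIENT_KEY, cookie, client_nonce, server_nonce).hex().upper()
```

Verifying ServerHash before answering is what lets the controller avoid revealing anything derived from the cookie to a process that is not Tor.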
- - ServerHash is computed as: - - HMAC-SHA256("Tor safe cookie authentication server-to-controller hash", - CookieString | ClientNonce | ServerNonce) - - (with the HMAC key as its first argument) - - After a controller sends a successful AUTHCHALLENGE command, the - next command sent on the connection must be an AUTHENTICATE command, - and the only authentication string which that AUTHENTICATE command - will accept is: - - HMAC-SHA256("Tor safe cookie authentication controller-to-server hash", - CookieString | ClientNonce | ServerNonce) - - [Unlike other commands besides AUTHENTICATE, AUTHCHALLENGE may be - used (but only once!) before AUTHENTICATE.] - - [AUTHCHALLENGE was added in Tor 0.2.3.13-alpha.] - -3.25. DROPGUARDS - - The syntax is: - - "DROPGUARDS" CRLF - - Tells the server to drop all guard nodes. Do not invoke this command - lightly; it can increase vulnerability to tracking attacks over time. - - Tor replies with "250 OK" on success. - - [DROPGUARDS was added in Tor 0.2.5.2-alpha.] - -3.26. HSFETCH - - The syntax is: - - "HSFETCH" SP (HSAddress / "v" Version "-" DescId) - *[SP "SERVER=" Server] CRLF - - HSAddress = 16*Base32Character / 56*Base32Character - Version = "2" / "3" - DescId = 32*Base32Character - Server = LongName - - This command launches hidden service descriptor fetch(es) for the given - HSAddress or DescId. - - HSAddress can be version 2 or version 3 addresses. DescIDs can only be - version 2 IDs. Version 2 addresses consist of 16*Base32Character and - version 3 addresses consist of 56*Base32Character. - - If a DescId is specified, at least one Server MUST also be provided, - otherwise a 512 error is returned. If no DescId and Server(s) are specified, - it behaves like a normal Tor client descriptor fetch. If one or more - Server are given, they are used instead triggering a fetch on each of them - in parallel. - - The caching behavior when fetching a descriptor using this command is - identical to normal Tor client behavior. 
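As an illustration of the HSFETCH argument forms described above, the following Python sketch classifies the first argument before a command is sent. The function name `classify_hsfetch_target` and its return labels are assumptions made for this example. It follows the prose rule that a DescId can only be a version 2 ID.

```python
import re

_B32 = "[a-z2-7]"  # Base32Character, compared case-insensitively

def classify_hsfetch_target(arg: str) -> str:
    """Return 'v2-address', 'v3-address', or 'v2-descid' for an HSFETCH target."""
    s = arg.lower()
    if re.fullmatch(_B32 + "{16}", s):
        return "v2-address"
    if re.fullmatch(_B32 + "{56}", s):
        return "v3-address"
    if re.fullmatch("v2-" + _B32 + "{32}", s):
        # A DescId additionally requires at least one SERVER= argument.
        return "v2-descid"
    raise ValueError("not a valid HSFETCH target: %r" % arg)
```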
- - Details on how to compute a descriptor id (DescId) can be found in - rend-spec.txt section 1.3. - - If any values are unrecognized, a 513 error is returned and the command is - stopped. On success, Tor replies "250 OK" then Tor MUST eventually follow - this with both HS_DESC and HS_DESC_CONTENT events with the results. If - SERVER is specified then events are emitted for each location. - - Examples are: - - C: HSFETCH v2-gezdgnbvgy3tqolbmjrwizlgm5ugs2tl - SERVER=9695DFC35FFEB861329B9F1AB04C46397020CE31 - S: 250 OK - - C: HSFETCH ajkhdsfuygaesfaa - S: 250 OK - - C: HSFETCH vww6ybal4bd7szmgncyruucpgfkqahzddi37ktceo3ah7ngmcopnpyyd - S: 250 OK - - [HSFETCH was added in Tor 0.2.7.1-alpha] - [HS v3 support added 0.4.1.1-alpha] - -3.27. ADD_ONION - - The syntax is: - - "ADD_ONION" SP KeyType ":" KeyBlob - [SP "Flags=" Flag *("," Flag)] - [SP "MaxStreams=" NumStreams] - 1*(SP "Port=" VirtPort ["," Target]) - *(SP "ClientAuth=" ClientName [":" ClientBlob]) - *(SP "ClientAuthV3=" V3Key) CRLF - - KeyType = - "NEW" / ; The server should generate a key of algorithm KeyBlob - "RSA1024" / ; The server should use the 1024 bit RSA key provided - as KeyBlob (v2). - "ED25519-V3"; The server should use the ed25519 v3 key provided as - KeyBlob (v3). - - KeyBlob = - "BEST" / ; The server should generate a key using the "best" - supported algorithm (KeyType == "NEW"). - [As of 0.4.2.3-alpha, ED25519-V3 is used] - "RSA1024" / ; The server should generate a 1024 bit RSA key - (KeyType == "NEW") (v2). - "ED25519-V3" / ; The server should generate an ed25519 private key - (KeyType == "NEW") (v3). - String ; A serialized private key (without whitespace) - - Flag = - "DiscardPK" / ; The server should not include the newly generated - private key as part of the response. - "Detach" / ; Do not associate the newly created Onion Service - to the current control connection. - "BasicAuth" / ; Client authorization is required using the "basic" - method (v2 only).
- "V3Auth" / ; Version 3 client authorization is required (v3 only). - - "NonAnonymous" /; Add a non-anonymous Single Onion Service. Tor - checks this flag matches its configured hidden - service anonymity mode. - "MaxStreamsCloseCircuit"; Close the circuit if the maximum number of streams - allowed is reached. - - NumStreams = A value between 0 and 65535 which is used as the maximum - streams that can be attached on a rendezvous circuit. Setting - it to 0 means unlimited which is also the default behavior. - - VirtPort = The virtual TCP Port for the Onion Service (As in the - HiddenServicePort "VIRTPORT" argument). - - Target = The (optional) target for the given VirtPort (As in the - optional HiddenServicePort "TARGET" argument). - - ClientName = An identifier 1 to 16 characters long, using only - characters in A-Za-z0-9+-_ (no spaces) (v2 only). - - ClientBlob = Authorization data for the client, in an opaque format - specific to the authorization method (v2 only). - - V3Key = The client's base32-encoded x25519 public key, using only the key - part of rend-spec-v3.txt section G.1.2 (v3 only). - - The server reply format is: - - "250-ServiceID=" ServiceID CRLF - ["250-PrivateKey=" KeyType ":" KeyBlob CRLF] - *("250-ClientAuth=" ClientName ":" ClientBlob CRLF) - "250 OK" CRLF - - ServiceID = The Onion Service address without the trailing ".onion" - suffix - - Tells the server to create a new Onion ("Hidden") Service, with the - specified private key and algorithm. If a KeyType of "NEW" is selected, - the server will generate a new keypair using the selected algorithm. - The "Port" argument's VirtPort and Target values have identical - semantics to the corresponding HiddenServicePort configuration values. - - The server response will only include a private key if the server was - requested to generate a new keypair, and also the "DiscardPK" flag was - not specified.
(Note that if "DiscardPK" flag is specified, there is no - way to recreate the generated keypair and the corresponding Onion - Service at a later date). - - If client authorization is enabled using the "BasicAuth" flag (which is v2 - only), the service will not be accessible to clients without valid - authorization data (configured with the "HidServAuth" option). The list of - authorized clients is specified with one or more "ClientAuth" parameters. - If "ClientBlob" is not specified for a client, a new credential will be - randomly generated and returned. - - Tor instances can either be in anonymous hidden service mode, or - non-anonymous single onion service mode. All hidden services on the same - tor instance have the same anonymity. To guard against unexpected loss - of anonymity, Tor checks that the ADD_ONION "NonAnonymous" flag matches - the current hidden service anonymity mode. The hidden service anonymity - mode is configured using the Tor options HiddenServiceSingleHopMode and - HiddenServiceNonAnonymousMode. If both these options are 1, the - "NonAnonymous" flag must be provided to ADD_ONION. If both these options - are 0 (the Tor default), the flag must NOT be provided. - - Once created the new Onion Service will remain active until either the - Onion Service is removed via "DEL_ONION", the server terminates, or the - control connection that originated the "ADD_ONION" command is closed. - It is possible to override disabling the Onion Service on control - connection close by specifying the "Detach" flag. - - It is the Onion Service server application's responsibility to close - existing client connections if desired after the Onion Service is - removed. - - (The KeyBlob format is left intentionally opaque, however for "RSA1024" - keys it is currently the Base64 encoded DER representation of a PKCS#1 - RSAPrivateKey, with all newlines removed. 
For an "ED25519-V3" key it is - the Base64 encoding of the concatenation of the 32-byte ed25519 secret - scalar in little-endian and the 32-byte ed25519 PRF secret.) - - [Note: The ED25519-V3 format is not the same as, e.g., SUPERCOP - ed25519/ref, which stores the 32-byte ed25519 - hash seed concatenated with the 32-byte public key, and which derives - the secret scalar and PRF secret by expanding the hash seed with - SHA-512. Our key blinding scheme is incompatible with storing - private keys as seeds, so we store the secret scalar alongside the - PRF secret, and just pay the cost of recomputing the public key when - importing an ED25519-V3 key.] - - Examples: - - C: ADD_ONION NEW:BEST Flags=DiscardPK Port=80 - S: 250-ServiceID=exampleoniont2pqglbny66wpovyvao3ylc23eileodtevc4b75ikpad - S: 250 OK - - C: ADD_ONION RSA1024:[Blob Redacted] Port=80,192.168.1.1:8080 - S: 250-ServiceID=sampleonion12456 - S: 250 OK - - C: ADD_ONION NEW:BEST Port=22 Port=80,8080 - S: 250-ServiceID=sampleonion4t2pqglbny66wpovyvao3ylc23eileodtevc4b75ikpad - S: 250-PrivateKey=ED25519-V3:[Blob Redacted] - S: 250 OK - - C: ADD_ONION NEW:RSA1024 Flags=DiscardPK,BasicAuth Port=22 - ClientAuth=alice:[Blob Redacted] ClientAuth=bob - S: 250-ServiceID=testonion1234567 - S: 250-ClientAuth=bob:[Blob Redacted] - S: 250 OK - - C: ADD_ONION NEW:ED25519-V3 ClientAuthV3=[Blob Redacted] Port=22 - S: 250-ServiceID=n35etu3yjxrqjpntmfziom5sjwspoydchmelc4xleoy4jk2u4lziz2yd - S: 250-ClientAuthV3=[Blob Redacted] - S: 250 OK - - Examples with Tor in anonymous onion service mode: - - C: ADD_ONION NEW:BEST Flags=DiscardPK Port=22 - S: 250-ServiceID=exampleoniont2pqglbny66wpovyvao3ylc23eileodtevc4b75ikpad - S: 250 OK - - C: ADD_ONION NEW:BEST Flags=DiscardPK,NonAnonymous Port=22 - S: 512 Tor is in anonymous hidden service mode - - Examples with Tor in non-anonymous onion service mode: - - C: ADD_ONION NEW:BEST Flags=DiscardPK Port=22 - S: 512 Tor is in non-anonymous hidden service mode - - C: 
ADD_ONION NEW:BEST Flags=DiscardPK,NonAnonymous Port=22 - S: 250-ServiceID=exampleoniont2pqglbny66wpovyvao3ylc23eileodtevc4b75ikpad - S: 250 OK - - [ADD_ONION was added in Tor 0.2.7.1-alpha.] - [MaxStreams and MaxStreamsCloseCircuit were added in Tor 0.2.7.2-alpha] - [ClientAuth was added in Tor 0.2.9.1-alpha. It is v2 only.] - [NonAnonymous was added in Tor 0.2.9.3-alpha.] - [HS v3 support added 0.3.3.1-alpha] - [ClientAuthV3 support added 0.4.6.1-alpha] - -3.28. DEL_ONION - - The syntax is: - - "DEL_ONION" SP ServiceID CRLF - - ServiceID = The Onion Service address without the trailing ".onion" - suffix - - Tells the server to remove an Onion ("Hidden") Service that was - previously created via an "ADD_ONION" command. It is only possible to - remove Onion Services that were created on the same control connection - as the "DEL_ONION" command, and those that belong to no control - connection in particular (The "Detach" flag was specified at creation). - - If the ServiceID is invalid, or is neither owned by the current control - connection nor a detached Onion Service, the server will return a 552. - - It is the Onion Service server application's responsibility to close - existing client connections if desired after the Onion Service has been - removed via "DEL_ONION". - - Tor replies with "250 OK" on success, or a 512 if there is an invalid - number of arguments, or a 552 if it doesn't recognize the ServiceID. - - [DEL_ONION was added in Tor 0.2.7.1-alpha.] - [HS v3 support added 0.3.3.1-alpha] - -3.29. HSPOST - - The syntax is: - - "+HSPOST" *[SP "SERVER=" Server] [SP "HSADDRESS=" HSAddress] - CRLF Descriptor CRLF "." CRLF - - Server = LongName - HSAddress = 56*Base32Character - Descriptor = The text of the descriptor formatted as specified - in rend-spec.txt section 1.3. - - The "HSAddress" key is optional and only applies for v3 descriptors. A 513 - error is returned if used with v2.
- - This command launches a hidden service descriptor upload to the specified - HSDirs. If one or more Server arguments are provided, an upload is triggered - on each of them in parallel. If no Server options are provided, it behaves - like a normal HS descriptor upload and will upload to the set of responsible - HS directories. - - If any value is unrecognized, a 552 error is returned and the command is - stopped. If there is an error in parsing the descriptor, the server - must send a "554 Invalid descriptor" reply. - - On success, Tor replies "250 OK" then Tor MUST eventually follow - this with a HS_DESC event with the result for each upload location. - - Examples are: - - C: +HSPOST SERVER=9695DFC35FFEB861329B9F1AB04C46397020CE31 - [DESCRIPTOR] - . - S: 250 OK - - [HSPOST was added in Tor 0.2.7.1-alpha] - -3.30. ONION_CLIENT_AUTH_ADD - - The syntax is: - - "ONION_CLIENT_AUTH_ADD" SP HSAddress - SP KeyType ":" PrivateKeyBlob - [SP "ClientName=" Nickname] - [SP "Flags=" TYPE] CRLF - - HSAddress = 56*Base32Character - KeyType = "x25519" is the only one supported right now - PrivateKeyBlob = base64 encoding of x25519 key - - Tells the connected Tor to add client-side v3 client auth credentials for the - onion service with "HSAddress". The "PrivateKeyBlob" is the x25519 private - key that should be used for this client, and "Nickname" is an optional - nickname for the client. - - FLAGS is a comma-separated tuple of flags for this new client. For now, the - currently supported flags are: - - "Permanent" - This client's credentials should be stored in the filesystem. - If this is not set, the client's credentials are ephemeral - and stored in memory. - - If client auth credentials already existed for this service, replace them - with the new ones. - - If Tor has cached onion service descriptors that it has been unable to - decrypt in the past (due to lack of client auth credentials), attempt to - decrypt those descriptors as soon as this command succeeds. 
- - On success, "250 OK" is returned. Otherwise, the following error codes exist: - - 251 - Client auth credentials for this onion service already existed and were replaced. - 252 - Added client auth credentials and successfully decrypted a cached descriptor. - 451 - We reached authorized client capacity - 512 - Syntax error in "HSAddress", or "PrivateKeyBlob" or "Nickname" - 551 - Client with this "Nickname" already exists - 552 - Unrecognized KeyType - - [ONION_CLIENT_AUTH_ADD was added in Tor 0.4.3.1-alpha] - -3.31. ONION_CLIENT_AUTH_REMOVE - - The syntax is: - - "ONION_CLIENT_AUTH_REMOVE" SP HSAddress - - KeyType = "x25519" is the only one supported right now - - Tells the connected Tor to remove the client-side v3 client auth credentials - for the onion service with "HSAddress". - - On success "250 OK" is returned. Otherwise, the following error codes exist: - - 512 - Syntax error in "HSAddress". - 251 - Client credentials for "HSAddress" did not exist. - - [ONION_CLIENT_AUTH_REMOVE was added in Tor 0.4.3.1-alpha] - -3.32. ONION_CLIENT_AUTH_VIEW - - The syntax is: - - "ONION_CLIENT_AUTH_VIEW" [SP HSAddress] CRLF - - Tells the connected Tor to list all the stored client-side v3 client auth - credentials for "HSAddress". If no "HSAddress" is provided, list all the - stored client-side v3 client auth credentials. - - The server reply format is: - - "250-ONION_CLIENT_AUTH_VIEW" [SP HSAddress] CRLF - *("250-CLIENT" SP HSAddress SP KeyType ":" PrivateKeyBlob - [SP "ClientName=" Nickname] - [SP "Flags=" FLAGS] CRLF) - "250 OK" CRLF - - HSAddress = The onion address under which this credential is stored - KeyType = "x25519" is the only one supported right now - PrivateKeyBlob = base64 encoding of x25519 key - - "Nickname" is an optional nickname for this client, which can be set either - through the ONION_CLIENT_AUTH_ADD command, or it's the filename of this - client if the credentials are stored in the filesystem.
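A minimal Python sketch of reading one "250-CLIENT" line from an ONION_CLIENT_AUTH_VIEW reply, following the reply format above. The function name `parse_client_line` and the dictionary layout are assumptions made for illustration, and the sketch assumes nicknames contain no spaces.

```python
def parse_client_line(line: str) -> dict:
    """Split a '250-CLIENT' reply line into address, key, nickname and flags."""
    parts = line.rstrip("\r\n").split(" ")
    if len(parts) < 3 or parts[0] != "250-CLIENT":
        raise ValueError("not a 250-CLIENT line")
    # KeyType and PrivateKeyBlob are separated by the first ':'.
    key_type, _, key_blob = parts[2].partition(":")
    entry = {
        "address": parts[1],
        "key_type": key_type,
        "key_blob": key_blob,
        "client_name": None,
        "flags": [],
    }
    for extra in parts[3:]:
        key, _, value = extra.partition("=")
        if key == "ClientName":
            entry["client_name"] = value
        elif key == "Flags":
            entry["flags"] = value.split(",")
        # Unknown keywords are skipped so that future additions do not break parsing.
    return entry
```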
- - FLAGS is a comma-separated field of flags for this client, the currently - supported flags are: - - "Permanent" - This client's credentials are stored in the filesystem. - - On success "250 OK" is returned. Otherwise, the following error codes exist: - - 512 - Syntax error in "HSAddress". - - [ONION_CLIENT_AUTH_VIEW was added in Tor 0.4.3.1-alpha] - -3.33. DROPOWNERSHIP - - The syntax is: - - "DROPOWNERSHIP" CRLF - - This command instructs Tor to relinquish ownership of its control - connection. As such tor will not shut down when this control - connection is closed. - - This method is idempotent. If the control connection does not - already have ownership this method returns successfully, and - does nothing. - - The controller can call TAKEOWNERSHIP again to re-establish - ownership. - - [DROPOWNERSHIP was added in Tor 0.4.0.0-alpha] - -3.34. DROPTIMEOUTS - - The syntax is: - "DROPTIMEOUTS" CRLF - - Tells the server to drop all circuit build times. Do not invoke this command - lightly; it can increase vulnerability to tracking attacks over time. - - Tor replies with "250 OK" on success. Tor also emits the BUILDTIMEOUT_SET - RESET event right after this "250 OK". - - [DROPTIMEOUTS was added in Tor 0.4.5.0-alpha.] - -4. Replies - - Reply codes follow the same 3-character format as used by SMTP, with the - first character defining a status, the second character defining a - subsystem, and the third designating fine-grained information. - - The TC protocol currently uses the following first characters: - - 2yz Positive Completion Reply - The command was successful; a new request can be started. - - 4yz Temporary Negative Completion reply - The command was unsuccessful but might be reattempted later. - - 5yz Permanent Negative Completion Reply - The command was unsuccessful; the client should not try exactly - that sequence of commands again. - - 6yz Asynchronous Reply - Sent out-of-order in response to an earlier SETEVENTS command. 
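The reply-code structure of this section (a status digit followed by a subsystem digit) can be illustrated with a small Python sketch. The table contents come from the lists in this section; the function name `classify_reply` and the "unknown" fallback are assumptions for the example.

```python
STATUS = {
    "2": "positive completion",
    "4": "temporary negative completion",
    "5": "permanent negative completion",
    "6": "asynchronous",
}

SUBSYSTEM = {
    "0": "syntax",
    "1": "protocol",
    "5": "tor",
}

def classify_reply(code: str) -> tuple:
    """Map a 3-digit reply code to (status class, subsystem)."""
    if len(code) != 3 or not (code.isascii() and code.isdigit()):
        raise ValueError("reply codes are exactly three digits")
    # Unlisted digits map to "unknown" rather than raising, so that a
    # controller stays tolerant of codes added later.
    return STATUS.get(code[0], "unknown"), SUBSYSTEM.get(code[1], "unknown")
```

For example, `classify_reply("552")` yields a permanent negative completion in the Tor subsystem, and `classify_reply("650")` yields an asynchronous reply.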
- - The following second characters are used: - - x0z Syntax - Sent in response to ill-formed or nonsensical commands. - - x1z Protocol - Refers to operations of the Tor Control protocol. - - x5z Tor - Refers to actual operations of Tor system. - - The following codes are defined: - - 250 OK - 251 Operation was unnecessary - [Tor has declined to perform the operation, but no harm was done.] - - 451 Resource exhausted - - 500 Syntax error: protocol - - 510 Unrecognized command - 511 Unimplemented command - 512 Syntax error in command argument - 513 Unrecognized command argument - 514 Authentication required - 515 Bad authentication - - 550 Unspecified Tor error - - 551 Internal error - [Something went wrong inside Tor, so that the client's - request couldn't be fulfilled.] - - 552 Unrecognized entity - [A configuration key, a stream ID, circuit ID, event, - mentioned in the command did not actually exist.] - - 553 Invalid configuration value - [The client tried to set a configuration option to an - incorrect, ill-formed, or impossible value.] - - 554 Invalid descriptor - - 555 Unmanaged entity - - 650 Asynchronous event notification - - Unless specified to have specific contents, the human-readable messages - in error replies should not be relied upon to match those in this document. - -4.1. Asynchronous events - - These replies can be sent after a corresponding SETEVENTS command has been - received. They will not be interleaved with other Reply elements, but they - can appear between a command and its corresponding reply. 
For example, - this sequence is possible: - - C: SETEVENTS CIRC - S: 250 OK - C: GETCONF SOCKSPORT ORPORT - S: 650 CIRC 1000 EXTENDED moria1,moria2 - S: 250-SOCKSPORT=9050 - S: 250 ORPORT=0 - - But this sequence is disallowed: - - C: SETEVENTS CIRC - S: 250 OK - C: GETCONF SOCKSPORT ORPORT - S: 250-SOCKSPORT=9050 - S: 650 CIRC 1000 EXTENDED moria1,moria2 - S: 250 ORPORT=0 - - Clients MUST tolerate more arguments in an asynchronous reply than - expected, and MUST tolerate more lines in an asynchronous reply than - expected. For instance, a client that expects a CIRC message like: - - 650 CIRC 1000 EXTENDED moria1,moria2 - - must tolerate: - - 650-CIRC 1000 EXTENDED moria1,moria2 0xBEEF - 650-EXTRAMAGIC=99 - 650 ANONYMITY=high - - If clients receive extended events (selected by USEFEATURE - EXTENDED_EVENTS in Tor 0.1.2.2-alpha..Tor-0.2.1.x, and always-on in - Tor 0.2.2.x and later), then each event line as specified below may be - followed by additional arguments and additional lines. Additional - lines will be of the form: - - "650" ("-"/" ") KEYWORD ["=" ARGUMENTS] CRLF - - Additional arguments will be of the form - - SP KEYWORD ["=" ( QuotedString / * NonSpDquote ) ] - - Clients MUST tolerate events with arguments and keywords they do not - recognize, and SHOULD process those events as if any unrecognized - arguments and keywords were not present. - - Clients SHOULD NOT depend on the order of keyword=value arguments, - and SHOULD NOT depend on there being no new keyword=value arguments - appearing between existing keyword=value arguments, though as of this - writing (Jun 2011) some do. Thus, extensions to this protocol should - add new keywords only after the existing keywords, until all - controllers have been fixed. At some point this "SHOULD NOT" might - become a "MUST NOT". - -4.1.1.
Circuit status changed - - The syntax is: - - "650" SP "CIRC" SP CircuitID SP CircStatus [SP Path] - [SP "BUILD_FLAGS=" BuildFlags] [SP "PURPOSE=" Purpose] - [SP "HS_STATE=" HSState] [SP "REND_QUERY=" HSAddress] - [SP "TIME_CREATED=" TimeCreated] - [SP "REASON=" Reason [SP "REMOTE_REASON=" Reason]] - [SP "SOCKS_USERNAME=" EscapedUsername] - [SP "SOCKS_PASSWORD=" EscapedPassword] - [SP "HS_POW=" HSPoW ] - CRLF - - CircStatus = - "LAUNCHED" / ; circuit ID assigned to new circuit - "BUILT" / ; all hops finished, can now accept streams - "GUARD_WAIT" / ; all hops finished, waiting to see if a - ; circuit with a better guard will be usable. - "EXTENDED" / ; one more hop has been completed - "FAILED" / ; circuit closed (was not built) - "CLOSED" ; circuit closed (was built) - - Path = LongName *("," LongName) - ; In Tor versions 0.1.2.2-alpha through 0.2.2.1-alpha with feature - ; VERBOSE_NAMES turned off and before version 0.1.2.2-alpha, Path - ; is as follows: - ; Path = ServerID *("," ServerID) - - BuildFlags = BuildFlag *("," BuildFlag) - BuildFlag = "ONEHOP_TUNNEL" / "IS_INTERNAL" / - "NEED_CAPACITY" / "NEED_UPTIME" - - Purpose = "GENERAL" / "HS_CLIENT_INTRO" / "HS_CLIENT_REND" / - "HS_SERVICE_INTRO" / "HS_SERVICE_REND" / "TESTING" / - "CONTROLLER" / "MEASURE_TIMEOUT" / - "HS_VANGUARDS" / "PATH_BIAS_TESTING" / - "CIRCUIT_PADDING" - - HSState = "HSCI_CONNECTING" / "HSCI_INTRO_SENT" / "HSCI_DONE" / - "HSCR_CONNECTING" / "HSCR_ESTABLISHED_IDLE" / - "HSCR_ESTABLISHED_WAITING" / "HSCR_JOINED" / - "HSSI_CONNECTING" / "HSSI_ESTABLISHED" / - "HSSR_CONNECTING" / "HSSR_JOINED" - - HSPoWType = "v1" - HSPoWEffort = 1*DIGIT - HSPoW = HSPoWType "," HSPoWEffort - - EscapedUsername = QuotedString - EscapedPassword = QuotedString - - HSAddress = 16*Base32Character / 56*Base32Character - Base32Character = ALPHA / "2" / "3" / "4" / "5" / "6" / "7" - - TimeCreated = ISOTime2Frac - Seconds = 1*DIGIT - Microseconds = 1*DIGIT - - Reason = "NONE" / "TORPROTOCOL" / "INTERNAL" / "REQUESTED" 
/ - "HIBERNATING" / "RESOURCELIMIT" / "CONNECTFAILED" / - "OR_IDENTITY" / "OR_CONN_CLOSED" / "TIMEOUT" / - "FINISHED" / "DESTROYED" / "NOPATH" / "NOSUCHSERVICE" / - "MEASUREMENT_EXPIRED" - - The path is provided only when the circuit has been extended at least one - hop. - - The "BUILD_FLAGS" field is provided only in versions 0.2.3.11-alpha - and later. Clients MUST accept build flags not listed above. - Build flags are defined as follows: - - ONEHOP_TUNNEL (one-hop circuit, used for tunneled directory conns) - IS_INTERNAL (internal circuit, not to be used for exiting streams) - NEED_CAPACITY (this circuit must use only high-capacity nodes) - NEED_UPTIME (this circuit must use only high-uptime nodes) - - The "PURPOSE" field is provided only in versions 0.2.1.6-alpha and - later, and only if extended events are enabled (see 3.19). Clients - MUST accept purposes not listed above. Purposes are defined as - follows: - - GENERAL (circuit for AP and/or directory request streams) - HS_CLIENT_INTRO (HS client-side introduction-point circuit) - HS_CLIENT_REND (HS client-side rendezvous circuit; carries AP streams) - HS_SERVICE_INTRO (HS service-side introduction-point circuit) - HS_SERVICE_REND (HS service-side rendezvous circuit) - TESTING (reachability-testing circuit; carries no traffic) - CONTROLLER (circuit built by a controller) - MEASURE_TIMEOUT (circuit being kept around to see how long it takes) - HS_VANGUARDS (circuit created ahead of time when using - HS vanguards, and later repurposed as needed) - PATH_BIAS_TESTING (circuit used to probe whether our circuits are - being deliberately closed by an attacker) - CIRCUIT_PADDING (circuit that is being held open to disguise its - true close time) - - The "HS_STATE" field is provided only for hidden-service circuits, - and only in versions 0.2.3.11-alpha and later. Clients MUST accept - hidden-service circuit states not listed above. 
Hidden-service - circuit states are defined as follows: - - HSCI_* (client-side introduction-point circuit states) - HSCI_CONNECTING (connecting to intro point) - HSCI_INTRO_SENT (sent INTRODUCE1; waiting for reply from IP) - HSCI_DONE (received reply from IP relay; closing) - - HSCR_* (client-side rendezvous-point circuit states) - HSCR_CONNECTING (connecting to or waiting for reply from RP) - HSCR_ESTABLISHED_IDLE (established RP; waiting for introduction) - HSCR_ESTABLISHED_WAITING (introduction sent to HS; waiting for rend) - HSCR_JOINED (connected to HS) - - HSSI_* (service-side introduction-point circuit states) - HSSI_CONNECTING (connecting to intro point) - HSSI_ESTABLISHED (established intro point) - - HSSR_* (service-side rendezvous-point circuit states) - HSSR_CONNECTING (connecting to client's rend point) - HSSR_JOINED (connected to client's RP circuit) - - The "SOCKS_USERNAME" and "SOCKS_PASSWORD" fields indicate the credentials - that were used by a SOCKS client to connect to Tor's SOCKS port and - initiate this circuit. (Streams for SOCKS clients connected with different - usernames and/or passwords are isolated on separate circuits if the - IsolateSOCKSAuth flag is active; see Proposal 171.) [Added in Tor - 0.4.3.1-alpha.] - - The "REND_QUERY" field is provided only for hidden-service-related - circuits, and only in versions 0.2.3.11-alpha and later. Clients - MUST accept hidden service addresses in formats other than that - specified above. [Added in Tor 0.4.3.1-alpha.] - - The "TIME_CREATED" field is provided only in versions 0.2.3.11-alpha and - later. TIME_CREATED is the time at which the circuit was created or - cannibalized. [Added in Tor 0.4.3.1-alpha.] - - The "REASON" field is provided only for FAILED and CLOSED events, and only - if extended events are enabled (see 3.19). Clients MUST accept reasons - not listed above. [Added in Tor 0.4.3.1-alpha.] 
Reasons are as given in - tor-spec.txt, except for: - - NOPATH (Not enough nodes to make circuit) - MEASUREMENT_EXPIRED (As "TIMEOUT", except that we had left the circuit - open for measurement purposes to see how long it - would take to finish.) - IP_NOW_REDUNDANT (Closing a circuit to an introduction point that - has become redundant, since some other circuit - opened in parallel with it has succeeded.) - - The "REMOTE_REASON" field is provided only when we receive a DESTROY or - TRUNCATE cell, and only if extended events are enabled. It contains the - actual reason given by the remote OR for closing the circuit. Clients MUST - accept reasons not listed above. Reasons are as listed in tor-spec.txt. - [Added in Tor 0.4.3.1-alpha.] - -4.1.2. Stream status changed - - The syntax is: - - "650" SP "STREAM" SP StreamID SP StreamStatus SP CircuitID SP Target - [SP "REASON=" Reason [ SP "REMOTE_REASON=" Reason ]] - [SP "SOURCE=" Source] [ SP "SOURCE_ADDR=" Address ":" Port ] - [SP "PURPOSE=" Purpose] [SP "SOCKS_USERNAME=" EscapedUsername] - [SP "SOCKS_PASSWORD=" EscapedPassword] - [SP "CLIENT_PROTOCOL=" ClientProtocol] [SP "NYM_EPOCH=" NymEpoch] - [SP "SESSION_GROUP=" SessionGroup] [SP "ISO_FIELDS=" IsoFields] - CRLF - - StreamStatus = - "NEW" / ; New request to connect - "NEWRESOLVE" / ; New request to resolve an address - "REMAP" / ; Address re-mapped to another - "SENTCONNECT" / ; Sent a connect cell along a circuit - "SENTRESOLVE" / ; Sent a resolve cell along a circuit - "SUCCEEDED" / ; Received a reply; stream established - "FAILED" / ; Stream failed and not retriable - "CLOSED" / ; Stream closed - "DETACHED" / ; Detached from circuit; still retriable - "CONTROLLER_WAIT" ; Waiting for controller to use ATTACHSTREAM - ; (new in 0.4.5.1-alpha) - "XOFF_SENT" ; XOFF has been sent for this stream - ; (new in 0.4.7.5-alpha) - "XOFF_RECV" ; XOFF has been received for this stream - ; (new in 0.4.7.5-alpha) - "XON_SENT" ; XON has been sent for this stream - ; (new in 
0.4.7.5-alpha) - "XON_RECV" ; XON has been received for this stream - ; (new in 0.4.7.5-alpha) - - Target = TargetAddress ":" Port - Port = an integer from 0 to 65535 inclusive - TargetAddress = Address / "(Tor_internal)" - - EscapedUsername = QuotedString - EscapedPassword = QuotedString - - ClientProtocol = - "SOCKS4" / - "SOCKS5" / - "TRANS" / - "NATD" / - "DNS" / - "HTTPCONNECT" / - "UNKNOWN" - - NymEpoch = a nonnegative integer - SessionGroup = an integer - - IsoFields = a comma-separated list of IsoField values - - IsoField = - "CLIENTADDR" / - "CLIENTPORT" / - "DESTADDR" / - "DESTPORT" / - the name of a field that is valid for STREAM events - - The circuit ID designates which circuit this stream is attached to. If - the stream is unattached, the circuit ID "0" is given. The target - indicates the address which the stream is meant to resolve or connect to; - it can be "(Tor_internal)" for a virtual stream created by the Tor program - to talk to itself. - - Reason = "MISC" / "RESOLVEFAILED" / "CONNECTREFUSED" / - "EXITPOLICY" / "DESTROY" / "DONE" / "TIMEOUT" / - "NOROUTE" / "HIBERNATING" / "INTERNAL"/ "RESOURCELIMIT" / - "CONNRESET" / "TORPROTOCOL" / "NOTDIRECTORY" / "END" / - "PRIVATE_ADDR" - - The "REASON" field is provided only for FAILED, CLOSED, and DETACHED - events, and only if extended events are enabled (see 3.19). Clients MUST - accept reasons not listed above. Reasons are as given in tor-spec.txt, - except for: - - END (We received a RELAY_END cell from the other side of this - stream.) - PRIVATE_ADDR (The client tried to connect to a private address like - 127.0.0.1 or 10.0.0.1 over Tor.) - [XXXX document more. -NM] - - The "REMOTE_REASON" field is provided only when we receive a RELAY_END - cell, and only if extended events are enabled. It contains the actual - reason given by the remote OR for closing the stream. Clients MUST accept - reasons not listed above. Reasons are as listed in tor-spec.txt. 
- - "REMAP" events include a Source if extended events are enabled: - - Source = "CACHE" / "EXIT" - - Clients MUST accept sources not listed above. "CACHE" is given if - the Tor client decided to remap the address because of a cached - answer, and "EXIT" is given if the remote node we queried gave us - the new address as a response. - - The "SOURCE_ADDR" field is included with NEW and NEWRESOLVE events if - extended events are enabled. It indicates the address and port - that requested the connection, and can be (e.g.) used to look up the - requesting program. - - Purpose = "DIR_FETCH" / "DIR_UPLOAD" / "DNS_REQUEST" / - "USER" / "DIRPORT_TEST" - - The "PURPOSE" field is provided only for NEW and NEWRESOLVE events, and - only if extended events are enabled (see 3.19). Clients MUST accept - purposes not listed above. The purposes above are defined as: - - "DIR_FETCH" -- This stream is generated internally to Tor for - fetching directory information. - "DIR_UPLOAD" -- An internal stream for uploading information to - a directory authority. - "DIRPORT_TEST" -- A stream we're using to test our own directory - port to make sure it's reachable. - "DNS_REQUEST" -- A user-initiated DNS request. - "USER" -- This stream is handling user traffic, OR it's internal - to Tor, but it doesn't match one of the purposes above. - - The "SOCKS_USERNAME" and "SOCKS_PASSWORD" fields indicate the credentials - that were used by a SOCKS client to connect to Tor's SOCKS port and - initiate this stream. (Streams for SOCKS clients connected with different - usernames and/or passwords are isolated on separate circuits if the - IsolateSOCKSAuth flag is active; see Proposal 171.) - - The "CLIENT_PROTOCOL" field indicates the protocol that was used by a client - to initiate this stream. (Streams for clients connected with different - protocols are isolated on separate circuits if the IsolateClientProtocol - flag is active.) Controllers MUST tolerate unrecognized client protocols. 
- - The "NYM_EPOCH" field indicates the nym epoch that was active when a client - initiated this stream. The epoch increments when the NEWNYM signal is - received. (Streams with different nym epochs are isolated on separate - circuits.) - - The "SESSION_GROUP" field indicates the session group of the listener port - that a client used to initiate this stream. By default, the session group is - different for each listener port, but this can be overridden for a listener - via the "SessionGroup" option in torrc. (Streams with different session - groups are isolated on separate circuits.) - - The "ISO_FIELDS" field indicates the set of STREAM event fields for which - stream isolation is enabled for the listener port that a client used to - initiate this stream. The special values "CLIENTADDR", "CLIENTPORT", - "DESTADDR", and "DESTPORT", if their correspondingly named fields are not - present, refer to the Address and Port components of the "SOURCE_ADDR" and - Target fields. - -4.1.3. OR Connection status changed - - The syntax is: - - "650" SP "ORCONN" SP (LongName / Target) SP ORStatus [ SP "REASON=" - Reason ] [ SP "NCIRCS=" NumCircuits ] [ SP "ID=" ConnID ] CRLF - - ORStatus = "NEW" / "LAUNCHED" / "CONNECTED" / "FAILED" / "CLOSED" - - ; In Tor versions 0.1.2.2-alpha through 0.2.2.1-alpha with feature - ; VERBOSE_NAMES turned off and before version 0.1.2.2-alpha, OR - ; Connection is as follows: - "650" SP "ORCONN" SP (ServerID / Target) SP ORStatus [ SP "REASON=" - Reason ] [ SP "NCIRCS=" NumCircuits ] CRLF - - NEW is for incoming connections, and LAUNCHED is for outgoing - connections. CONNECTED means the TLS handshake has finished (in - either direction). FAILED means a connection is being closed that - hasn't finished its handshake, and CLOSED is for connections that - have handshaked. - - A LongName or ServerID is specified unless it's a NEW connection, in - which case we don't know what server it is yet, so we use Address:Port. 
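The distinction above between a LongName and an Address:Port can be sketched in Python as follows. This is an illustration under stated assumptions, not Tor's own code: the function name parse_orconn_event is hypothetical, and the sketch assumes that a LongName starts with "$" followed by 40 hexadecimal digits and an optional "~" or "=" plus nickname, as LongName is defined elsewhere in this specification. A bare name with neither "$" nor ":" is treated as a legacy ServerID nickname.

```python
def parse_orconn_event(line):
    """Split one ORCONN event line into a dict (illustrative sketch only)."""
    parts = line.rstrip("\r\n").split(" ")
    if len(parts) < 4 or parts[:2] != ["650", "ORCONN"]:
        raise ValueError("not an ORCONN event: %r" % line)
    peer, status = parts[2], parts[3]
    event = {
        "status": status,  # unknown statuses are kept as-is
        # Optional REASON=, NCIRCS= and ID= fields, kept as strings.
        "fields": dict(p.split("=", 1) for p in parts[4:] if "=" in p),
    }
    if peer.startswith("$"):
        # Assumed LongName layout: "$" + 40 hex digits, optionally
        # followed by "~" or "=" and a nickname.
        event["fingerprint"] = peer[1:41]
        event["nickname"] = peer[42:] or None
    elif ":" in peer:
        # Address:Port, as used for NEW connections.
        address, _, port = peer.rpartition(":")
        event["address"] = address
        event["port"] = int(port)
    else:
        # Assumption: a bare name is a legacy ServerID nickname.
        event["nickname"] = peer
    return event
```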
- - If extended events are enabled (see 3.19), optional reason and - circuit counting information is provided for CLOSED and FAILED - events. - - Reason = "MISC" / "DONE" / "CONNECTREFUSED" / - "IDENTITY" / "CONNECTRESET" / "TIMEOUT" / "NOROUTE" / - "IOERROR" / "RESOURCELIMIT" / "PT_MISSING" - - NumCircuits counts both established and pending circuits. - - The ORStatus values are as follows: - - NEW -- We have received a new incoming OR connection, and are starting - the server-side handshake. - LAUNCHED -- We have launched a new outgoing OR connection, and are - starting the client-side handshake. - CONNECTED -- The OR connection has been connected and the handshake is - done. - FAILED -- Our attempt to open the OR connection failed. - CLOSED -- The OR connection closed in an unremarkable way. - - The Reason values for closed/failed OR connections are: - - DONE -- The OR connection has shut down cleanly. - CONNECTREFUSED -- We got an ECONNREFUSED while connecting to the target - OR. - IDENTITY -- We connected to the OR, but found that its identity was - not what we expected. - CONNECTRESET -- We got an ECONNRESET or similar IO error from the - connection with the OR. - TIMEOUT -- We got an ETIMEOUT or similar IO error from the connection - with the OR, or we're closing the connection for being idle for too - long. - NOROUTE -- We got an ENOTCONN, ENETUNREACH, ENETDOWN, EHOSTUNREACH, or - similar error while connecting to the OR. - IOERROR -- We got some other IO error on our connection to the OR. - RESOURCELIMIT -- We don't have enough operating system resources (file - descriptors, buffers, etc) to connect to the OR. - PT_MISSING -- No pluggable transport was available. - MISC -- The OR connection closed for some other reason. - - [First added ID parameter in 0.2.5.2-alpha] - -4.1.4. 
Bandwidth used in the last second - - The syntax is: - - "650" SP "BW" SP BytesRead SP BytesWritten *(SP Type "=" Num) CRLF - BytesRead = 1*DIGIT - BytesWritten = 1*DIGIT - Type = "DIR" / "OR" / "EXIT" / "APP" / ... - Num = 1*DIGIT - - BytesRead and BytesWritten are the totals. [In a future Tor version, - we may also include a breakdown of the connection types that used - bandwidth this second (not implemented yet).] - -4.1.5. Log messages - - The syntax is: - - "650" SP Severity SP ReplyText CRLF - - or - - "650+" Severity CRLF Data 650 SP "OK" CRLF - - Severity = "DEBUG" / "INFO" / "NOTICE" / "WARN"/ "ERR" - - Some low-level logs may be sent from signal handlers, so their destination - logs must be signal-safe. These low-level logs include backtraces, - logging function errors, and errors in code called by logging functions. - Signal-safe logs are never sent as control port log events. - - Control port message trace debug logs are never sent as control port log - events, to avoid modifying control output when debugging. - -4.1.6. New descriptors available - - This event is generated when new router descriptors (not microdescs or - extrainfos or anything else) are received. - - Syntax: - - "650" SP "NEWDESC" 1*(SP LongName) CRLF - ; In Tor versions 0.1.2.2-alpha through 0.2.2.1-alpha with feature - ; VERBOSE_NAMES turned off and before version 0.1.2.2-alpha, it - ; is as follows: - "650" SP "NEWDESC" 1*(SP ServerID) CRLF - -4.1.7. New Address mapping - - These events are generated when a new address mapping is entered in - Tor's address map cache, or when the answer for a RESOLVE command is - found. Entries can be created by a successful or failed DNS lookup, - a successful or failed connection attempt, a RESOLVE command, - a MAPADDRESS command, the AutomapHostsOnResolve feature, or the - TrackHostExits feature. 
- - Syntax: - - "650" SP "ADDRMAP" SP Address SP NewAddress SP Expiry - [SP "error=" ErrorCode] [SP "EXPIRES=" UTCExpiry] [SP "CACHED=" Cached] - [SP "STREAMID=" StreamId] CRLF - - NewAddress = Address / "" - Expiry = DQUOTE ISOTime DQUOTE / "NEVER" - - ErrorCode = "yes" / "internal" / "Unable to launch resolve request" - UTCExpiry = DQUOTE IsoTime DQUOTE - - Cached = DQUOTE "YES" DQUOTE / DQUOTE "NO" DQUOTE - StreamId = DQUOTE StreamId DQUOTE - - Error and UTCExpiry are only provided if extended events are enabled. - The values for Error are mostly useless. Future values will be - chosen to match 1*(ALNUM / "_"); the "Unable to launch resolve request" - value is a bug in Tor before 0.2.4.7-alpha. - - Expiry is expressed as the local time (rather than UTC). This is a bug, - left in for backward compatibility; new code should look at UTCExpiry - instead. (If Expiry is "NEVER", UTCExpiry is omitted.) - - Cached indicates whether the mapping will be stored until it expires, or if - it is just a notification in response to a RESOLVE command. - - StreamId is the global stream identifier of the stream or circuit from which - the address was resolved. - -4.1.8. Descriptors uploaded to us in our role as authoritative dirserver - - [NOTE: This feature was removed in Tor 0.3.2.1-alpha.] - - Tor generates this event when it's a directory authority, and - somebody has just uploaded a server descriptor. - - Syntax: - - "650" "+" "AUTHDIR_NEWDESCS" CRLF Action CRLF Message CRLF - Descriptor CRLF "." CRLF "650" SP "OK" CRLF - Action = "ACCEPTED" / "DROPPED" / "REJECTED" - Message = Text - - The Descriptor field is the text of the server descriptor; the Action - field is "ACCEPTED" if we're accepting the descriptor as the new - best valid descriptor for its router, "REJECTED" if we aren't taking - the descriptor and we're complaining to the uploading relay about - it, and "DROPPED" if we decide to drop the descriptor without - complaining. 
The Message field is a human-readable string - explaining why we chose the Action. (It doesn't contain newlines.) - -4.1.9. Our descriptor changed - - Syntax: - - "650" SP "DESCCHANGED" CRLF - - [First added in 0.1.2.2-alpha.] - -4.1.10. Status events - - Status events (STATUS_GENERAL, STATUS_CLIENT, and STATUS_SERVER) are sent - based on occurrences in the Tor process pertaining to the general state of - the program. Generally, they correspond to log messages of severity Notice - or higher. They differ from log messages in that their format is a - specified interface. - - Syntax: - - "650" SP StatusType SP StatusSeverity SP StatusAction - [SP StatusArguments] CRLF - - StatusType = "STATUS_GENERAL" / "STATUS_CLIENT" / "STATUS_SERVER" - StatusSeverity = "NOTICE" / "WARN" / "ERR" - StatusAction = 1*ALPHA - StatusArguments = StatusArgument *(SP StatusArgument) - StatusArgument = StatusKeyword '=' StatusValue - StatusKeyword = 1*(ALNUM / "_") - StatusValue = 1*(ALNUM / '_') / QuotedString - - StatusAction is a string, and StatusArguments is a series of - keyword=value pairs on the same line. Values may be space-terminated - strings, or quoted strings. - - These events are always produced with EXTENDED_EVENTS and - VERBOSE_NAMES; see the explanations in the USEFEATURE section - for details. - - Controllers MUST tolerate unrecognized actions, MUST tolerate - unrecognized arguments, MUST tolerate missing arguments, and MUST - tolerate arguments that arrive in any order. - - Each event description below is accompanied by a recommendation for - controllers. These recommendations are suggestions only; no controller - is required to implement them. - - Compatibility note: versions of Tor before 0.2.0.22-rc incorrectly - generated "STATUS_SERVER" as "STATUS_SEVER". To be compatible with those - versions, tools should accept both. 
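The tolerance rules above can be illustrated with a short Python sketch. It is an example only, under these assumptions: the function name parse_status_event and the returned dictionary are hypothetical, quoted values have their surrounding quotes removed without decoding escape sequences, and the sample lines in any usage are made up. The sketch collects arguments into a dictionary so that unknown, missing, or reordered arguments need no special handling, and it maps the historical "STATUS_SEVER" spelling to "STATUS_SERVER".

```python
import re

# keyword=value or keyword="quoted value"; inside quotes, a backslash
# escapes the following character.
_ARG = re.compile(r'([A-Za-z0-9_]+)=("(?:[^"\\]|\\.)*"|\S*)')

_STATUS_TYPES = {"STATUS_GENERAL", "STATUS_CLIENT", "STATUS_SERVER"}


def parse_status_event(line):
    """Split one status event line into a dict (illustrative sketch only)."""
    parts = line.rstrip("\r\n").split(" ", 4)
    if len(parts) < 4 or parts[0] != "650":
        raise ValueError("not a status event: %r" % line)
    status_type, severity, action = parts[1:4]
    if status_type == "STATUS_SEVER":
        # Misspelling generated by Tor versions before 0.2.0.22-rc.
        status_type = "STATUS_SERVER"
    if status_type not in _STATUS_TYPES:
        raise ValueError("not a status event: %r" % line)
    args = {}
    for key, value in _ARG.findall(parts[4] if len(parts) > 4 else ""):
        if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
            value = value[1:-1]  # strip quotes; escapes are not decoded
        args[key] = value
    # The action is returned unvalidated so that unrecognized actions
    # can be ignored or logged by the caller.
    return {"type": status_type, "severity": severity,
            "action": action, "args": args}
```

A caller would then look up only the arguments it understands, for example with args.get("REASON"), and ignore the rest.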
- - Actions for STATUS_GENERAL events can be as follows: - - CLOCK_JUMPED - "TIME=NUM" - Tor spent enough time without CPU cycles that it has closed all - its circuits and will establish them anew. This typically - happens when a laptop goes to sleep and then wakes up again. It - also happens when the system is swapping so heavily that Tor is - starving. The "time" argument specifies the number of seconds Tor - thinks it was unconscious for (or alternatively, the number of - seconds it went back in time). - - This status event is sent as NOTICE severity normally, but WARN - severity if Tor is acting as a server currently. - - {Recommendation for controller: ignore it, since we don't really - know what the user should do anyway. Hm.} - - DANGEROUS_VERSION - "CURRENT=version" - "REASON=NEW/OBSOLETE/UNRECOMMENDED" - "RECOMMENDED=\"version, version, ...\"" - Tor has found that directory servers don't recommend its version of - the Tor software. RECOMMENDED is a comma-and-space-separated string - of Tor versions that are recommended. REASON is NEW if this version - of Tor is newer than any recommended version, OBSOLETE if - this version of Tor is older than any recommended version, and - UNRECOMMENDED if some recommended versions of Tor are newer and - some are older than this version. (The "OBSOLETE" reason was called - "OLD" from Tor 0.1.2.3-alpha up to and including 0.2.0.12-alpha.) - - {Controllers may want to suggest that the user upgrade OLD or - UNRECOMMENDED versions. NEW versions may be known-insecure, or may - simply be development versions.} - - TOO_MANY_CONNECTIONS - "CURRENT=NUM" - Tor has reached its ulimit -n or whatever the native limit is on file - descriptors or sockets. CURRENT is the number of sockets Tor - currently has open. The user should really do something about - this. The "current" argument shows the number of connections currently - open. - - {Controllers may recommend that the user increase the limit, or - increase it for them. 
Recommendations should be phrased in an - OS-appropriate way and automated when possible.} - - BUG - "REASON=STRING" - Tor has encountered a situation that its developers never expected, - and the developers would like to learn that it happened. Perhaps - the controller can explain this to the user and encourage her to - file a bug report? - - {Controllers should log bugs, but shouldn't annoy the user in case a - bug appears frequently.} - - CLOCK_SKEW - SKEW="+" / "-" SECONDS - MIN_SKEW="+" / "-" SECONDS - SOURCE="DIRSERV:" IP ":" Port / - "NETWORKSTATUS:" IP ":" Port / - "OR:" IP ":" Port / - "CONSENSUS" - If "SKEW" is present, it's an estimate of how far we are from the - time declared in the source. (In other words, if we're an hour in - the past, the value is -3600.) If "MIN_SKEW" is present, it's a lower - bound. If the source is a DIRSERV, we got the current time from a - connection to a dirserver. If the source is a NETWORKSTATUS, we - decided we're skewed because we got a v2 networkstatus from far in - the future. If the source is OR, the skew comes from a NETINFO - cell from a connection to another relay. If the source is - CONSENSUS, we decided we're skewed because we got a networkstatus - consensus from the future. - - {Tor should send this message to controllers when it thinks the - skew is so high that it will interfere with proper Tor operation. - Controllers shouldn't blindly adjust the clock, since the more - accurate source of skew info (DIRSERV) is currently - unauthenticated.} - - BAD_LIBEVENT - "METHOD=" libevent method - "VERSION=" libevent version - "BADNESS=" "BROKEN" / "BUGGY" / "SLOW" - "RECOVERED=" "NO" / "YES" - Tor knows about bugs in using the configured event method in this - version of libevent. "BROKEN" libevents won't work at all; - "BUGGY" libevents might work okay; "SLOW" libevents will work - fine, but not quickly. If "RECOVERED" is YES, Tor managed to - switch to a more reliable (but probably slower!) libevent method.
- - {Controllers may want to warn the user if this event occurs, though - generally it's the fault of whoever built the Tor binary and there's - not much the user can do besides upgrade libevent or upgrade the - binary.} - - DIR_ALL_UNREACHABLE - Tor believes that none of the known directory servers are - reachable -- this is most likely because the local network is - down or otherwise not working, and might help to explain for the - user why Tor appears to be broken. - - {Controllers may want to warn the user if this event occurs; further - action is generally not possible.} - - Actions for STATUS_CLIENT events can be as follows: - - BOOTSTRAP - "PROGRESS=" num - "TAG=" Keyword - "SUMMARY=" String - ["WARNING=" String] - ["REASON=" Keyword] - ["COUNT=" num] - ["RECOMMENDATION=" Keyword] - ["HOST=" QuotedString] - ["HOSTADDR=" QuotedString] - - Tor has made some progress at establishing a connection to the - Tor network, fetching directory information, or making its first - circuit; or it has encountered a problem while bootstrapping. This - status event is especially useful for users with slow connections - or with connectivity problems. - - "Progress" gives a number between 0 and 100 for how far through - the bootstrapping process we are. "Summary" is a string that can - be displayed to the user to describe the *next* task that Tor - will tackle, i.e., the task it is working on after sending the - status event. "Tag" is a string that controllers can use to - recognize bootstrap phases, if they want to do something smarter - than just blindly displaying the summary string; see Section 5 - for the current tags that Tor issues. - - The StatusSeverity describes whether this is a normal bootstrap - phase (severity notice) or an indication of a bootstrapping - problem (severity warn). 
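As a rough illustration of how a controller might use the fields described so far, the Python sketch below turns the severity and arguments of a BOOTSTRAP event into a one-line description. The function name describe_bootstrap and the output format are assumptions made for this example; the arguments are expected as a dictionary of already-unquoted strings, and missing or malformed arguments are tolerated rather than treated as errors.

```python
def describe_bootstrap(severity, args):
    """Return a one-line description of a BOOTSTRAP status event.

    `severity` is the StatusSeverity string; `args` maps argument names
    to already-unquoted string values. Illustrative sketch only.
    """
    try:
        # PROGRESS is a number between 0 and 100; clamp defensively.
        progress = max(0, min(100, int(args.get("PROGRESS", ""))))
    except ValueError:
        progress = None  # missing or malformed argument
    # Severity NOTICE marks a normal phase; anything else is treated
    # here as a bootstrapping problem.
    kind = "progress" if severity == "NOTICE" else "problem"
    percent = "??%" if progress is None else "%d%%" % progress
    tag = args.get("TAG", "unknown")
    # SUMMARY describes the next task that Tor will tackle.
    summary = args.get("SUMMARY", "")
    return "bootstrap %s at %s [%s]: %s" % (kind, percent, tag, summary)
```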
- - For bootstrap problems, we include the same progress, tag, and - summary values as we would for a normal bootstrap event, but we - also include "warning", "reason", "count", and "recommendation" - key/value combos. The "count" number tells how many bootstrap - problems there have been so far at this phase. The "reason" - string lists one of the reasons allowed in the ORCONN event. The - "warning" argument is a string with any hints Tor has to offer about - why it's having trouble bootstrapping. - - The "reason" values are long-term-stable controller-facing tags to - identify particular issues in a bootstrapping step. The warning - strings, on the other hand, are human-readable. Controllers - SHOULD NOT rely on the format of any warning string. Currently - the possible values for "recommendation" are either "ignore" or - "warn" -- if ignore, the controller can accumulate the string in - a pile of problems to show the user if the user asks; if warn, - the controller should alert the user that Tor is pretty sure - there's a bootstrapping problem. - - The "host" value is the identity digest (in hex) of the node we're - trying to connect to; the "hostaddr" is an address:port combination, - where 'address' is an IPv4 or IPv6 address. - - Currently Tor uses recommendation=ignore for the first - nine bootstrap problem reports for a given phase, and then - uses recommendation=warn for subsequent problems at that - phase. Hopefully this is a good balance between tolerating - occasional errors and reporting serious problems quickly. - - ENOUGH_DIR_INFO - Tor now knows enough network-status documents and enough server - descriptors that it's going to start trying to build circuits now. - [Newer versions of Tor (0.2.6.2-alpha and later): - If the consensus contains Exits (the typical case), Tor will build - both exit and internal circuits. If not, Tor will only build internal - circuits.]
- - {Controllers may want to use this event to decide when to indicate - progress to their users, but should not interrupt the user's browsing - to tell them so.} - - NOT_ENOUGH_DIR_INFO - We discarded expired statuses and server descriptors to fall - below the desired threshold of directory information. We won't - try to build any circuits until ENOUGH_DIR_INFO occurs again. - - {Controllers may want to use this event to decide when to indicate - progress to their users, but should not interrupt the user's browsing - to tell them so.} - - CIRCUIT_ESTABLISHED - Tor is able to establish circuits for client use. This event will - only be sent if we just built a circuit that changed our mind -- - that is, prior to this event we didn't know whether we could - establish circuits. - - {Suggested use: controllers can notify their users that Tor is - ready for use as a client once they see this status event. [Perhaps - controllers should also have a timeout if too much time passes and - this event hasn't arrived, to give tips on how to troubleshoot. - On the other hand, hopefully Tor will send further status events - if it can identify the problem.]} - - CIRCUIT_NOT_ESTABLISHED - "REASON=" "EXTERNAL_ADDRESS" / "DIR_ALL_UNREACHABLE" / "CLOCK_JUMPED" - We are no longer confident that we can build circuits. The "reason" - keyword provides an explanation: which other status event type caused - our lack of confidence. - - {Controllers may want to use this event to decide when to indicate - progress to their users, but should not interrupt the user's browsing - to do so.} - [Note: only REASON=CLOCK_JUMPED is implemented currently.] - - CONSENSUS_ARRIVED - Tor has received and validated a new consensus networkstatus. - (This event can be delayed a little while after the consensus - is received, if Tor needs to fetch certificates.) 
- - DANGEROUS_PORT - "PORT=" port - "RESULT=" "REJECT" / "WARN" - A stream was initiated to a port that's commonly used for - vulnerable-plaintext protocols. If the Result is "reject", we - refused the connection; whereas if it's "warn", we allowed it. - - {Controllers should warn their users when this occurs, unless they - happen to know that the application using Tor is in fact doing so - correctly (e.g., because it is part of a distributed bundle). They - might also want some sort of interface to let the user configure - their RejectPlaintextPorts and WarnPlaintextPorts config options.} - - DANGEROUS_SOCKS - "PROTOCOL=" "SOCKS4" / "SOCKS5" - "ADDRESS=" IP:port - A connection was made to Tor's SOCKS port using one of the SOCKS - approaches that doesn't support hostnames -- only raw IP addresses. - If the client application got this address from gethostbyname(), - it may be leaking target addresses via DNS. - - {Controllers should warn their users when this occurs, unless they - happen to know that the application using Tor is in fact doing so - correctly (e.g., because it is part of a distributed bundle).} - - SOCKS_UNKNOWN_PROTOCOL - "DATA=string" - A connection was made to Tor's SOCKS port that tried to use it - for something other than the SOCKS protocol. Perhaps the user is - using Tor as an HTTP proxy? The DATA is the first few characters - sent to Tor on the SOCKS port. - - {Controllers may want to warn their users when this occurs: it - indicates a misconfigured application.} - - SOCKS_BAD_HOSTNAME - "HOSTNAME=QuotedString" - Some application gave us a funny-looking hostname. Perhaps - it is broken? In any case it won't work with Tor and the user - should know. 
- - {Controllers may want to warn their users when this occurs: it - usually indicates a misconfigured application.} - - Actions for STATUS_SERVER can be as follows: - - EXTERNAL_ADDRESS - "ADDRESS=IP" - "HOSTNAME=NAME" - "METHOD=CONFIGURED/CONFIGURED_ORPORT/DIRSERV/RESOLVED/ - INTERFACE/GETHOSTNAME" - Our best idea for our externally visible IP has changed to 'IP'. If - 'HOSTNAME' is present, we got the new IP by resolving 'NAME'. If the - method is 'CONFIGURED', the IP was given verbatim as the Address - configuration option. If the method is 'CONFIGURED_ORPORT', the IP was - given verbatim in the ORPort configuration option. If the method is - 'RESOLVED', we resolved the Address configuration option to get the IP. - If the method is 'GETHOSTNAME', we resolved our hostname to get the IP. - If the method is 'INTERFACE', we got the address of one of our network - interfaces to get the IP. If the method is 'DIRSERV', a directory - server told us a guess for what our IP might be. - - {Controllers may want to record this info and display it to the user.} - - CHECKING_REACHABILITY - "ORADDRESS=IP:port" - "DIRADDRESS=IP:port" - We're going to start testing the reachability of our external OR port - or directory port. - - {This event could affect the controller's idea of server status, but - the controller should not interrupt the user to tell them so.} - - REACHABILITY_SUCCEEDED - "ORADDRESS=IP:port" - "DIRADDRESS=IP:port" - We successfully verified the reachability of our external OR port or - directory port (depending on which of ORADDRESS or DIRADDRESS is - given.) - - {This event could affect the controller's idea of server status, but - the controller should not interrupt the user to tell them so.} - - GOOD_SERVER_DESCRIPTOR - We successfully uploaded our server descriptor to at least one - of the directory authorities, with no complaints. 
- - {Originally, the goal of this event was to declare "every authority - has accepted the descriptor, so there will be no complaints - about it." But since some authorities might be offline, it's - harder to get certainty than we had thought. As such, this event - is equivalent to ACCEPTED_SERVER_DESCRIPTOR below. Controllers - should just look at ACCEPTED_SERVER_DESCRIPTOR and should ignore - this event for now.} - - SERVER_DESCRIPTOR_STATUS - "STATUS=" "LISTED" / "UNLISTED" - We just got a new networkstatus consensus, and whether we're in - it or not in it has changed. Specifically, status is "listed" - if we're listed in it but previous to this point we didn't know - we were listed in a consensus; and status is "unlisted" if we - thought we should have been listed in it (e.g. we were listed in - the last one), but we're not. - - {Moving from listed to unlisted is not necessarily cause for - alarm. The relay might have failed a few reachability tests, - or the Internet might have had some routing problems. So this - feature is mainly to let relay operators know when their relay - has successfully been listed in the consensus.} - - [Not implemented yet. We should do this in 0.2.2.x. -RD] - - NAMESERVER_STATUS - "NS=addr" - "STATUS=" "UP" / "DOWN" - "ERR=" message - One of our nameservers has changed status. - - {This event could affect the controller's idea of server status, but - the controller should not interrupt the user to tell them so.} - - NAMESERVER_ALL_DOWN - All of our nameservers have gone down. - - {This is a problem; if it happens often without the nameservers - coming up again, the user needs to configure more or better - nameservers.} - - DNS_HIJACKED - Our DNS provider is providing an address when it should be saying - "NOTFOUND"; Tor will treat the address as a synonym for "NOTFOUND". 
- - {This is an annoyance; controllers may want to tell admins that their - DNS provider is not to be trusted.} - - DNS_USELESS - Our DNS provider is giving a hijacked address instead of well-known - websites; Tor will not try to be an exit node. - - {Controllers could warn the admin if the relay is running as an - exit node: the admin needs to configure a good DNS server. - Alternatively, this happens a lot in some restrictive environments - (hotels, universities, coffeeshops) when the user hasn't registered.} - - BAD_SERVER_DESCRIPTOR - "DIRAUTH=addr:port" - "REASON=string" - A directory authority rejected our descriptor. Possible reasons - include malformed descriptors, incorrect keys, highly skewed clocks, - and so on. - - {Controllers should warn the admin, and try to cope if they can.} - - ACCEPTED_SERVER_DESCRIPTOR - "DIRAUTH=addr:port" - A single directory authority accepted our descriptor. - // actually notice - - {This event could affect the controller's idea of server status, but - the controller should not interrupt the user to tell them so.} - - REACHABILITY_FAILED - "ORADDRESS=IP:port" - "DIRADDRESS=IP:port" - We failed to connect to our external OR port or directory port - successfully. - - {This event could affect the controller's idea of server status. The - controller should warn the admin and suggest reasonable steps to take.} - - HIBERNATION_STATUS - "STATUS=" "AWAKE" | "SOFT" | "HARD" - Our bandwidth based accounting status has changed, and we are now - relaying traffic/rejecting new connections/hibernating. - - {This event could affect the controller's idea of server status. The - controller MAY inform the admin, though presumably the accounting was - explicitly enabled for a reason.} - - [This event was added in tor 0.2.9.0-alpha.] - -4.1.11. Our set of guard nodes has changed - - Syntax: - - "650" SP "GUARD" SP Type SP Name SP Status ... 
CRLF - Type = "ENTRY" - Name = ServerSpec - (Identifies the guard affected) - Status = "NEW" | "UP" | "DOWN" | "BAD" | "BAD_L2" | "GOOD" | "DROPPED" - - The ENTRY type indicates a guard used for connections to the Tor - network. - - The Status values are: - - "NEW" -- This node was not previously used as a guard; now we have - picked it as one. - "DROPPED" -- This node is one we previously picked as a guard; we - no longer consider it to be a member of our guard list. - "UP" -- The guard now seems to be reachable. - "DOWN" -- The guard now seems to be unreachable. - "BAD" -- Because of flags set in the consensus and/or values in the - configuration, this node is now unusable as a guard. - "BAD_L2" -- This layer2 guard has expired or got removed from the - consensus. This node is removed from the layer2 guard set. - "GOOD" -- Because of flags set in the consensus and/or values in the - configuration, this node is now usable as a guard. - - Controllers must accept unrecognized types and unrecognized statuses. - -4.1.12. Network status has changed - - Syntax: - - "650" "+" "NS" CRLF 1*NetworkStatus "." CRLF "650" SP "OK" CRLF - - The event is used whenever our local view of a relay status changes. - This happens when we get a new v3 consensus (in which case the entries - we see are a duplicate of what we see in the NEWCONSENSUS event, - below), but it also happens when we decide to mark a relay as up or - down in our local status, for example based on connection attempts. - - [First added in 0.1.2.3-alpha] - -4.1.13. Bandwidth used on an application stream - - The syntax is: - - "650" SP "STREAM_BW" SP StreamID SP BytesWritten SP BytesRead SP - Time CRLF - BytesWritten = 1*DIGIT - BytesRead = 1*DIGIT - Time = ISOTime2Frac - - BytesWritten and BytesRead are the number of bytes written and read - by the application since the last STREAM_BW event on this stream. - - Note that from Tor's perspective, *reading* a byte on a stream means - that the application *wrote* the byte.
That's why the order of "written" - vs "read" is opposite for STREAM_BW events compared to BW events. - - The Time field is provided only in versions 0.3.2.1-alpha and later. It - records when Tor created the bandwidth event. - - These events are generated about once per second per stream; no events - are generated for streams that have not written or read. These events - apply only to streams entering Tor (such as on a SOCKSPort, TransPort, - and so on). They are not generated for exiting streams. - -4.1.14. Per-country client stats - - The syntax is: - - "650" SP "CLIENTS_SEEN" SP TimeStarted SP CountrySummary SP - IPVersions CRLF - - We just generated a new summary of which countries we've seen clients - from recently. The controller could display this for the user, e.g. - in their "relay" configuration window, to give them a sense that they - are actually being useful. - - Currently only bridge relays will receive this event, but once we figure - out how to sufficiently aggregate and sanitize the client counts on - main relays, we might start sending these events in other cases too. - - TimeStarted is a quoted string indicating when the reported summary - counts from (in UTC). - - The CountrySummary keyword has as its argument a comma-separated, - possibly empty set of "countrycode=count" pairs. For example (without - linebreak), - 650-CLIENTS_SEEN TimeStarted="2008-12-25 23:50:43" - CountrySummary=us=16,de=8,uk=8 - - The IPVersions keyword has as its argument a comma-separated set of - "protocol-family=count" pairs. For example, - IPVersions=v4=16,v6=40 - - Note that these values are rounded, not exact. The rounding - algorithm is specified in the description of "geoip-client-origins" - in dir-spec.txt. - -4.1.15. New consensus networkstatus has arrived - - The syntax is: - - "650" "+" "NEWCONSENSUS" CRLF 1*NetworkStatus "." CRLF "650" SP - "OK" CRLF - - A new consensus networkstatus has arrived. We include NS-style lines for - every relay in the consensus.
NEWCONSENSUS is a separate event from the - NS event, because the list here represents every usable relay: so any - relay *not* mentioned in this list is implicitly no longer recommended. - - [First added in 0.2.1.13-alpha] - -4.1.16. New circuit buildtime has been set - - The syntax is: - - "650" SP "BUILDTIMEOUT_SET" SP Type SP "TOTAL_TIMES=" Total SP - "TIMEOUT_MS=" Timeout SP "XM=" Xm SP "ALPHA=" Alpha SP - "CUTOFF_QUANTILE=" Quantile SP "TIMEOUT_RATE=" TimeoutRate SP - "CLOSE_MS=" CloseTimeout SP "CLOSE_RATE=" CloseRate - CRLF - Type = "COMPUTED" / "RESET" / "SUSPENDED" / "DISCARD" / "RESUME" - Total = Integer count of timeouts stored - Timeout = Integer timeout in milliseconds - Xm = Estimated integer Pareto parameter Xm in milliseconds - Alpha = Estimated floating point Pareto parameter alpha - Quantile = Floating point CDF quantile cutoff point for this timeout - TimeoutRate = Floating point ratio of circuits that time out - CloseTimeout = How long to keep measurement circs in milliseconds - CloseRate = Floating point ratio of measurement circuits that are closed - - A new circuit build timeout time has been set. If Type is "COMPUTED", - Tor has computed the value based on historical data. If Type is "RESET", - initialization or drastic network changes have caused Tor to reset - the timeout back to the default, to relearn again. If Type is - "SUSPENDED", Tor has detected a loss of network connectivity and has - temporarily changed the timeout value to the default until the network - recovers. If Type is "DISCARD", Tor has decided to discard timeout - values that likely happened while the network was down. If Type is - "RESUME", Tor has decided to resume timeout calculation. - - The Total value is the count of circuit build times Tor used in - computing this value. It is capped internally at the maximum number - of build times Tor stores (NCIRCUITS_TO_OBSERVE). - - The Timeout itself is provided in milliseconds.
Internally, Tor rounds - this value to the nearest second before using it. - - [First added in 0.2.2.7-alpha] - -4.1.17. Signal received - - The syntax is: - - "650" SP "SIGNAL" SP Signal CRLF - - Signal = "RELOAD" / "DUMP" / "DEBUG" / "NEWNYM" / "CLEARDNSCACHE" - - A signal has been received and actions taken by Tor. The meaning of each - signal, and the mapping to Unix signals, is as defined in section 3.7. - Future versions of Tor MAY generate signals other than those listed here; - controllers MUST be able to accept them. - - If Tor chose to ignore a signal (such as NEWNYM), this event will not be - sent. Note that some options (like ReloadTorrcOnSIGHUP) may affect the - semantics of the signals here. - - Note that the HALT (SIGTERM) and SHUTDOWN (SIGINT) signals do not currently - generate any event. - - [First added in 0.2.3.1-alpha] - -4.1.18. Configuration changed - - The syntax is: - - StartReplyLine *(MidReplyLine) EndReplyLine - - StartReplyLine = "650-CONF_CHANGED" CRLF - MidReplyLine = "650-" KEYWORD ["=" VALUE] CRLF - EndReplyLine = "650 OK" - - Tor configuration options have changed (such as via a SETCONF or RELOAD - signal). KEYWORD and VALUE specify the configuration option that was changed. - Undefined configuration options contain only the KEYWORD. - -4.1.19. Circuit status changed slightly - - The syntax is: - - "650" SP "CIRC_MINOR" SP CircuitID SP CircEvent [SP Path] - [SP "BUILD_FLAGS=" BuildFlags] [SP "PURPOSE=" Purpose] - [SP "HS_STATE=" HSState] [SP "REND_QUERY=" HSAddress] - [SP "TIME_CREATED=" TimeCreated] - [SP "OLD_PURPOSE=" Purpose [SP "OLD_HS_STATE=" HSState]] CRLF - - CircEvent = - "PURPOSE_CHANGED" / ; circuit purpose or HS-related state changed - "CANNIBALIZED" ; circuit cannibalized - - Clients MUST accept circuit events not listed above. - - The "OLD_PURPOSE" field is provided for both PURPOSE_CHANGED and - CANNIBALIZED events. 
The "OLD_HS_STATE" field is provided whenever - the "OLD_PURPOSE" field is provided and is a hidden-service-related - purpose. - - Other fields are as specified in section 4.1.1 above. - - [First added in 0.2.3.11-alpha] - -4.1.20. Pluggable transport launched - - The syntax is: - - "650" SP "TRANSPORT_LAUNCHED" SP Type SP Name SP TransportAddress SP Port - Type = "server" | "client" - Name = The name of the pluggable transport - TransportAddress = An IPv4 or IPv6 address on which the pluggable - transport is listening for connections - Port = The TCP port on which it is listening for connections. - - A pluggable transport called 'Name' of type 'Type' was launched - successfully and is now listening for connections on - 'TransportAddress':'Port'. - -4.1.21. Bandwidth used on an OR or DIR or EXIT connection - - The syntax is: - - "650" SP "CONN_BW" SP "ID=" ConnID SP "TYPE=" ConnType - SP "READ=" BytesRead SP "WRITTEN=" BytesWritten CRLF - - ConnType = "OR" / ; Carrying traffic within the tor network. This can - either be our own (client) traffic or traffic we're - relaying within the network. - "DIR" / ; Fetching tor descriptor data, or transmitting - descriptors we're mirroring. - "EXIT" ; Carrying traffic between the tor network and an - external destination. - - BytesRead = 1*DIGIT - BytesWritten = 1*DIGIT - - Controllers MUST tolerate unrecognized connection types. - - BytesWritten and BytesRead are the number of bytes written and read - by Tor since the last CONN_BW event on this connection. - - These events are generated about once per second per connection; no - events are generated for connections that have not read or written. - These events are only generated if TestingTorNetwork is set. - - [First added in 0.2.5.2-alpha] - -4.1.22.
Bandwidth used by all streams attached to a circuit - - The syntax is: - - "650" SP "CIRC_BW" SP "ID=" CircuitID SP "READ=" BytesRead SP - "WRITTEN=" BytesWritten SP "TIME=" Time SP - "DELIVERED_READ=" DeliveredBytesRead SP - "OVERHEAD_READ=" OverheadBytesRead SP - "DELIVERED_WRITTEN=" DeliveredBytesWritten SP - "OVERHEAD_WRITTEN=" OverheadBytesWritten SP - "SS=" SlowStartState SP - "CWND=" CWNDCells SP - "RTT=" RTTMilliseconds SP - "MIN_RTT=" RTTMilliseconds CRLF - BytesRead = 1*DIGIT - BytesWritten = 1*DIGIT - OverheadBytesRead = 1*DIGIT - OverheadBytesWritten = 1*DIGIT - DeliveredBytesRead = 1*DIGIT - DeliveredBytesWritten = 1*DIGIT - SlowStartState = 0 or 1 - CWNDCells = 1*DIGIT - RTTMilliseconds= 1*DIGIT - Time = ISOTime2Frac - - BytesRead and BytesWritten are the number of bytes read and written - on this circuit since the last CIRC_BW event. These bytes have not - necessarily been validated by Tor, and can include invalid cells, - dropped cells, and ignored cells (such as padding cells). These - values include the relay headers, but not circuit headers. - - Circuit data that has been validated and processed by Tor is further - broken down into two categories: delivered payloads and overhead. - DeliveredBytesRead and DeliveredBytesWritten are the total relay cell - payloads transmitted since the last CIRC_BW event, not counting relay - cell headers or circuit headers. OverheadBytesRead and - OverheadBytesWritten are the extra unused bytes at the end of each - cell in order for it to be the fixed CELL_LEN bytes long. - - The sum of DeliveredBytesRead and OverheadBytesRead MUST be less than - BytesRead, and the same is true for their written counterparts. This - sum represents the total relay cell bytes on the circuit that - have been validated by Tor, not counting relay headers and cell headers. 
- Subtracting this sum (plus relay cell headers) from the BytesRead - (or BytesWritten) value gives the byte count that Tor has decided to - reject due to protocol errors, or has otherwise decided to ignore. - - The Time field is provided only in versions 0.3.2.1-alpha and later. It - records when Tor created the bandwidth event. - - The SS, CWND, RTT, and MIN_RTT fields are present only if the circuit - has negotiated congestion control to an onion service or Exit hop (any - intermediate leaky pipe congestion control hops are not examined here). - SS provides an indication if the circuit is in slow start (1), or not (0). - CWND is the size of the congestion window in terms of number of cells. - RTT is the N_EWMA smoothed current RTT value, and MIN_RTT is the minimum - RTT value of the circuit. The SS and CWND fields apply only to the - upstream direction of the circuit. The slow start state and CWND values - of the other endpoint may be different. - - These events are generated about once per second per circuit; no events - are generated for circuits that had no attached stream writing or - reading. - - [First added in 0.2.5.2-alpha] - - [DELIVERED_READ, OVERHEAD_READ, DELIVERED_WRITTEN, and OVERHEAD_WRITTEN - were added in Tor 0.3.4.0-alpha] - - [SS, CWND, RTT, and MIN_RTT were added in Tor 0.4.7.5-alpha] - -4.1.23. 
Per-circuit cell stats - - The syntax is: - - "650" SP "CELL_STATS" - [ SP "ID=" CircuitID ] - [ SP "InboundQueue=" QueueID SP "InboundConn=" ConnID ] - [ SP "InboundAdded=" CellsByType ] - [ SP "InboundRemoved=" CellsByType SP - "InboundTime=" MsecByType ] - [ SP "OutboundQueue=" QueueID SP "OutboundConn=" ConnID ] - [ SP "OutboundAdded=" CellsByType ] - [ SP "OutboundRemoved=" CellsByType SP - "OutboundTime=" MsecByType ] CRLF - CellsByType, MsecByType = CellType ":" 1*DIGIT - 0*( "," CellType ":" 1*DIGIT ) - CellType = 1*( "a" - "z" / "0" - "9" / "_" ) - - Examples are: - - 650 CELL_STATS ID=14 OutboundQueue=19403 OutboundConn=15 - OutboundAdded=create_fast:1,relay_early:2 - OutboundRemoved=create_fast:1,relay_early:2 - OutboundTime=create_fast:0,relay_early:0 - 650 CELL_STATS InboundQueue=19403 InboundConn=32 - InboundAdded=relay:1,created_fast:1 - InboundRemoved=relay:1,created_fast:1 - InboundTime=relay:0,created_fast:0 - OutboundQueue=6710 OutboundConn=18 - OutboundAdded=create:1,relay_early:1 - OutboundRemoved=create:1,relay_early:1 - OutboundTime=create:0,relay_early:0 - - ID is the locally unique circuit identifier that is only included if the - circuit originates at this node. - - Inbound and outbound refer to the direction of cell flow through the - circuit which is either to origin (inbound) or from origin (outbound). - - InboundQueue and OutboundQueue are identifiers of the inbound and - outbound circuit queues of this circuit. These identifiers are only - unique per OR connection. OutboundQueue is chosen by this node and - matches InboundQueue of the next node in the circuit. - - InboundConn and OutboundConn are locally unique IDs of inbound and - outbound OR connection. OutboundConn does not necessarily match - InboundConn of the next node in the circuit. - - InboundQueue and InboundConn are not present if the circuit originates - at this node. OutboundQueue and OutboundConn are not present if the - circuit (currently) ends at this node. 
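The field layout described so far can be sketched in Python. This is a minimal illustration, not part of the protocol: the names parse_cell_stats and circuit_role are hypothetical, the event is assumed to arrive as a single line with fields separated by spaces, and unrecognized keys are kept as plain strings.

```python
# Hypothetical helper names; a sketch only, not an official controller API.

# Keys whose values follow the CellsByType / MsecByType grammar.
BY_TYPE_KEYS = {
    "InboundAdded", "InboundRemoved", "InboundTime",
    "OutboundAdded", "OutboundRemoved", "OutboundTime",
}


def parse_cell_stats(line):
    """Parse one CELL_STATS event line into a dict of its fields."""
    parts = line.split()
    if parts[:2] != ["650", "CELL_STATS"]:
        raise ValueError("not a CELL_STATS event")
    event = {}
    for field in parts[2:]:
        key, _, value = field.partition("=")
        if key in BY_TYPE_KEYS:
            # "create_fast:1,relay_early:2" -> {"create_fast": 1, ...}
            counts = {}
            for item in value.split(","):
                cell_type, _, number = item.partition(":")
                counts[cell_type] = int(number)
            event[key] = counts
        else:
            # ID, queue and connection identifiers, and unknown keys.
            event[key] = value
    return event


def circuit_role(event):
    """Guess this node's position from which queue fields are present."""
    if "InboundQueue" not in event:
        return "origin"   # circuit originates at this node
    if "OutboundQueue" not in event:
        return "end"      # circuit (currently) ends at this node
    return "middle"
```

Applied to the first example event above (joined onto one line), parse_cell_stats yields ID "14" and an OutboundAdded mapping with create_fast and relay_early counts; circuit_role reports "origin" because no InboundQueue field is present.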
- - InboundAdded and OutboundAdded are total number of cells by cell type - added to inbound and outbound queues. Only present if at least one cell - was added to a queue. - - InboundRemoved and OutboundRemoved are total number of cells by - cell type processed from inbound and outbound queues. InboundTime and - OutboundTime are total waiting times in milliseconds of all processed - cells by cell type. Only present if at least one cell was removed from - a queue. - - These events are generated about once per second per circuit; no - events are generated for circuits that have not added or processed any - cell. These events are only generated if TestingTorNetwork is set. - - [First added in 0.2.5.2-alpha] - -4.1.24. Token buckets refilled - - The syntax is: - - "650" SP "TB_EMPTY" SP BucketName [ SP "ID=" ConnID ] SP - "READ=" ReadBucketEmpty SP "WRITTEN=" WriteBucketEmpty SP - "LAST=" LastRefill CRLF - - BucketName = "GLOBAL" / "RELAY" / "ORCONN" - ReadBucketEmpty = 1*DIGIT - WriteBucketEmpty = 1*DIGIT - LastRefill = 1*DIGIT - - Examples are: - - 650 TB_EMPTY ORCONN ID=16 READ=0 WRITTEN=0 LAST=100 - 650 TB_EMPTY GLOBAL READ=93 WRITTEN=93 LAST=100 - 650 TB_EMPTY RELAY READ=93 WRITTEN=93 LAST=100 - - This event is generated when refilling a previously empty token - bucket. BucketNames "GLOBAL" and "RELAY" keywords are used for the - global or relay token buckets, BucketName "ORCONN" is used for the - token buckets of an OR connection. Controllers MUST tolerate - unrecognized bucket names. - - ConnID is only included if the BucketName is "ORCONN". - - If both global and relay buckets and/or the buckets of one or more OR - connections run out of tokens at the same time, multiple separate - events are generated. - - ReadBucketEmpty (WriteBucketEmpty) is the time in millis that the read - (write) bucket was empty since the last refill. LastRefill is the - time in millis since the last refill. 
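As a rough illustration of the TB_EMPTY fields described above, the following Python sketch parses one event line and derives the fraction of the refill interval during which each bucket was empty. The names parse_tb_empty and empty_fractions are hypothetical, and the ratio is an interpretation of the fields rather than something the protocol defines.

```python
# Hypothetical helper names; a sketch only, not an official controller API.

def parse_tb_empty(line):
    """Parse one TB_EMPTY event line into a dict."""
    parts = line.split()
    if len(parts) < 3 or parts[0] != "650" or parts[1] != "TB_EMPTY":
        raise ValueError("not a TB_EMPTY event")
    # The bucket name is positional; unrecognized names are kept as-is.
    event = {"bucket": parts[2]}
    for field in parts[3:]:
        key, sep, value = field.partition("=")
        if not sep:
            continue  # ignore anything that is not key=value
        if key in ("READ", "WRITTEN", "LAST"):
            event[key] = int(value)  # all three are times in milliseconds
        else:
            event[key] = value       # e.g. ID=ConnID for ORCONN buckets
    return event


def empty_fractions(event):
    """Return (read, write) empty time as a fraction of the refill interval."""
    last = event["LAST"]
    if last == 0:
        return (0.0, 0.0)
    return (event["READ"] / last, event["WRITTEN"] / last)
```

For the GLOBAL example above (READ=93 WRITTEN=93 LAST=100), both buckets were empty for 93 of the 100 milliseconds since the last refill.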
- - If a bucket went negative and if refilling tokens didn't make it go - positive again, there will be multiple consecutive TB_EMPTY events for - each refill interval during which the bucket contained zero tokens or - less. In such a case, ReadBucketEmpty or WriteBucketEmpty are capped - at LastRefill in order not to report empty times more than once. - - These events are only generated if TestingTorNetwork is set. - - [First added in 0.2.5.2-alpha] - -4.1.25. HiddenService descriptors - - The syntax is: - - "650" SP "HS_DESC" SP Action SP HSAddress SP AuthType SP HsDir - [SP DescriptorID] [SP "REASON=" Reason] [SP "REPLICA=" Replica] - [SP "HSDIR_INDEX=" HSDirIndex] - - Action = "REQUESTED" / "UPLOAD" / "RECEIVED" / "UPLOADED" / "IGNORE" / - "FAILED" / "CREATED" - HSAddress = 16*Base32Character / 56*Base32Character / "UNKNOWN" - AuthType = "NO_AUTH" / "BASIC_AUTH" / "STEALTH_AUTH" / "UNKNOWN" - HsDir = LongName / Fingerprint / "UNKNOWN" - DescriptorID = 32*Base32Character / 43*Base64Character - Reason = "BAD_DESC" / "QUERY_REJECTED" / "UPLOAD_REJECTED" / "NOT_FOUND" / - "UNEXPECTED" / "QUERY_NO_HSDIR" / "QUERY_RATE_LIMITED" - Replica = 1*DIGIT - HSDirIndex = 64*HEXDIG - - These events will be triggered when required HiddenService descriptor is - not found in the cache and a fetch or upload with the network is performed. - - If the fetch was triggered with only a DescriptorID (using the HSFETCH - command for instance), the HSAddress only appears in the Action=RECEIVED - since there is no way to know the HSAddress from the DescriptorID thus - the value will be "UNKNOWN". - - If we already had the v0 descriptor, the newly fetched v2 descriptor - will be ignored and a "HS_DESC" event with "IGNORE" action will be - generated. - - For HsDir, LongName is always preferred. If HsDir cannot be found in node - list at the time event is sent, Fingerprint will be used instead. - - If Action is "FAILED", Tor SHOULD send Reason field as well. 
Possible - values of Reason are: - - "BAD_DESC" - descriptor was retrieved, but found to be unparsable. - - "QUERY_REJECTED" - query was rejected by HS directory. - - "UPLOAD_REJECTED" - descriptor was rejected by HS directory. - - "NOT_FOUND" - HS descriptor with given identifier was not found. - - "UNEXPECTED" - nature of failure is unknown. - - "QUERY_NO_HSDIR" - No suitable HSDir were found for the query. - - "QUERY_RATE_LIMITED" - query for this service is rate-limited - - For "QUERY_NO_HSDIR" or "QUERY_RATE_LIMITED", the HsDir will be set to - "UNKNOWN" which was introduced in tor 0.3.1.0-alpha and 0.4.1.0-alpha - respectively. - - If Action is "CREATED", Tor SHOULD send Replica field as well. The Replica - field contains the replica number of the generated descriptor. The Replica - number is specified in rend-spec.txt section 1.3 and determines the - descriptor ID of the descriptor. - - For hidden service v3, the following applies: - - The "HSDIR_INDEX=" is an optional field that is only for version 3 - which contains the computed index of the HsDir the descriptor was - uploaded to or fetched from. - - The "DescriptorID" key is the descriptor blinded key used for the index - value at the "HsDir". - - The "REPLICA=" field is not used for the "CREATED" event because v3 - doesn't use the replica number in the descriptor ID computation. - - Because client authentication is not yet implemented, the "AuthType" - field is always "NO_AUTH". - - [HS v3 support added 0.3.3.1-alpha] - -4.1.26. HiddenService descriptors content - - The syntax is: - - "650" "+" "HS_DESC_CONTENT" SP HSAddress SP DescId SP HsDir CRLF - Descriptor CRLF "." 
CRLF "650" SP "OK" CRLF - - HSAddress = 16*Base32Character / 56*Base32Character / "UNKNOWN" - DescId = 32*Base32Character / 32*Base64Character - HsDir = LongName / "UNKNOWN" - Descriptor = The text of the descriptor formatted as specified in - rend-spec.txt section 1.3 (v2) or rend-spec-v3.txt - section 2.4 (v3) or empty string on failure. - - This event is triggered when a successfully fetched HS descriptor is - received. The text of that descriptor is then replied. If the HS_DESC - event is enabled, it is replied just after the RECEIVED action. - - If a fetch fails, the Descriptor is an empty string and HSAddress is set - to "UNKNOWN". The HS_DESC event should be used to get more information on - the failed request. - - If the fetch fails for the QUERY_NO_HSDIR or QUERY_RATE_LIMITED reason from - the HS_DESC event, the HsDir is set to "UNKNOWN". This was introduced in - 0.3.1.0-alpha and 0.4.1.0-alpha respectively. - - It's expected to receive a reply relatively fast as in it's the time it - takes to fetch something over the Tor network. This can be between a - couple of seconds up to 60 seconds (not a hard limit). But, in any cases, - this event will reply either the descriptor's content or an empty one. - - [HS_DESC_CONTENT was added in Tor 0.2.7.1-alpha] - [HS v3 support added 0.3.3.1-alpha] - -4.1.27. Network liveness has changed - - Syntax: - - "650" SP "NETWORK_LIVENESS" SP Status CRLF - Status = "UP" / ; The network now seems to be reachable. - "DOWN" / ; The network now seems to be unreachable. - - Controllers MUST tolerate unrecognized status types. - - [NETWORK_LIVENESS was added in Tor 0.2.7.2-alpha] - -4.1.28. Pluggable Transport Logs - - Syntax: - - "650" SP "PT_LOG" SP PT=Program SP Message - - Program = The program path as defined in the *TransportPlugin - configuration option. Tor accepts relative and full path. - Message = The log message that the PT sends back to the tor parent - process minus the "LOG" string prefix. 
Formatted as - specified in pt-spec.txt section "3.3.4. Pluggable - Transport Log Message". - - This event is triggered when tor receives a log message from the PT. - - Example: - - PT (obfs4): LOG SEVERITY=debug MESSAGE="Connected to bridge A" - - the resulting control port event would be: - - Tor: 650 PT_LOG PT=/usr/bin/obfs4proxy SEVERITY=debug MESSAGE="Connected to bridge A" - - [PT_LOG was added in Tor 0.4.0.1-alpha] - -4.1.29. Pluggable Transport Status - - Syntax: - - "650" SP "PT_STATUS" SP PT=Program SP TRANSPORT=Transport SP Message - - Program = The program path as defined in the *TransportPlugin - configuration option. Tor accepts relative and full paths. - Transport = A hint about what the PT is, such as its name or the - protocol it uses. - Message = The status message that the PT sends back to the tor parent - process minus the "STATUS" string prefix. Formatted as - specified in pt-spec.txt section "3.3.5 Pluggable - Transport Status Message". - - This event is triggered when tor receives a status message from the PT. - - Example: - - PT (obfs4): STATUS TRANSPORT=obfs4 CONNECT=Success - - the resulting control port event would be: - - Tor: 650 PT_STATUS PT=/usr/bin/obfs4proxy TRANSPORT=obfs4 CONNECT=Success - - [PT_STATUS was added in Tor 0.4.0.1-alpha] - -5. Implementation notes - -5.1. Authentication - - If the control port is open and no authentication method is enabled, Tor - trusts any local user that connects to the control port. This is generally - a poor idea. - - If the 'CookieAuthentication' option is true, Tor writes a "magic - cookie" file named "control_auth_cookie" into its data directory (or - to another file specified in the 'CookieAuthFile' option).
To - authenticate, the controller must demonstrate that it can read the - contents of the cookie file: - - * Current versions of Tor support cookie authentication - using the "COOKIE" authentication method: the controller sends the - contents of the cookie file, encoded in hexadecimal. This - authentication method exposes the user running a controller to an - unintended information disclosure attack whenever the controller - has greater filesystem read access than the process that it has - connected to. (Note that a controller may connect to a process - other than Tor.) It is almost never safe to use, even if the - controller's user has explicitly specified which filename to read - an authentication cookie from. For this reason, the COOKIE - authentication method has been deprecated and will be removed in - some future version of Tor. - - * 0.2.2.x versions of Tor starting with 0.2.2.36, and all versions of - Tor after 0.2.3.12-alpha, support cookie authentication using the - "SAFECOOKIE" authentication method, which discloses much less - information about the contents of the cookie file. - - If the 'HashedControlPassword' option is set, it must contain the salted - hash of a secret password. The salted hash is computed according to the - S2K algorithm in RFC 2440 (OpenPGP), and prefixed with the s2k specifier. - This is then encoded in hexadecimal, prefixed by the indicator sequence - "16:". Thus, for example, the password 'foo' could encode to: - - 16:660537E3E1CD49996044A3BF558097A981F539FEA2F9DA662B4626C1C2 - ++++++++++++++++**^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - salt hashed value - indicator - - You can generate the salted hash of a password by calling - - 'tor --hash-password ' - - or by using the example code in the Python and Java controller libraries. - To authenticate under this scheme, the controller sends Tor the original - secret that was used to generate the hash, either as a quoted string - or encoded in hexadecimal. - -5.2.
Don't let the buffer get too big. - - With old versions of Tor (before 0.2.0.16-alpha), if you ask for - lots of events, and 16MB of them queue up on the buffer, the Tor - process will close the socket. - - Newer Tor versions do not have this 16 MB buffer limit. However, - if you leave huge numbers of events unread, Tor may still run out - of memory, so you should still be careful about buffer size. - -5.3. Backward compatibility with v0 control protocol. - - The 'version 0' control protocol was replaced in Tor 0.1.1.x. Support - was removed in Tor 0.2.0.x. Every non-obsolete version of Tor now - supports the version 1 control protocol. - - For backward compatibility with the "version 0" control protocol, - Tor used to check whether the third octet of the first command is zero. - (If it was, Tor assumed that version 0 is in use.) - - This compatibility was removed in Tor 0.1.2.16 and 0.2.0.4-alpha. - -5.4. Tor config options for use by controllers - - Tor provides a few special configuration options for use by controllers. - These options are not saved to disk by SAVECONF. Most can be set and - examined by the SETCONF and GETCONF commands, but some (noted below) can - only be given in a torrc file or on the command line. - - Generally, these options make Tor unusable by disabling a portion of Tor's - normal operations. Unless a controller provides replacement functionality - to fill this gap, Tor will not correctly handle user requests. - - __AllDirActionsPrivate - - If true, Tor will try to launch all directory operations through - anonymous connections. (Ordinarily, Tor only tries to anonymize - requests related to hidden services.) This option will slow down - directory access, and may stop Tor from working entirely if it does not - yet have enough directory information to build circuits. - - (Boolean. Default: "0".) - - __DisablePredictedCircuits - - If true, Tor will not launch preemptive "general-purpose" circuits for - streams to attach to. 
(It will still launch circuits for testing and - for hidden services.) - - (Boolean. Default: "0".) - - __LeaveStreamsUnattached - - If true, Tor will not automatically attach new streams to circuits; - instead, the controller must attach them with ATTACHSTREAM. If the - controller does not attach the streams, their data will never be routed. - - (Boolean. Default: "0".) - - __HashedControlSessionPassword - - As HashedControlPassword, but is not saved to the torrc file by - SAVECONF. Added in Tor 0.2.0.20-rc. - - __ReloadTorrcOnSIGHUP - - If this option is true (the default), we reload the torrc from disk - every time we get a SIGHUP (from the controller or via a signal). - Otherwise, we don't. This option exists so that controllers can keep - their options from getting overwritten when a user sends Tor a HUP for - some other reason (for example, to rotate the logs). - - (Boolean. Default: "1") - - __OwningControllerProcess - - If this option is set to a process ID, Tor will periodically check - whether a process with the specified PID exists, and exit if one - does not. Added in Tor 0.2.2.28-beta. This option's intended use - is documented in section 3.23 with the related TAKEOWNERSHIP - command. - - Note that this option can only specify a single process ID, unlike - the TAKEOWNERSHIP command which can be sent along multiple control - connections. - - (String. Default: unset.) - - __OwningControllerFD - - If this option is a valid socket, Tor will start with an open control - connection on this socket. Added in Tor 0.3.3.1-alpha. - - This socket will be an owning controller, as if it had already called - TAKEOWNERSHIP. It will be automatically authenticated. This option - should only be used by other programs that are starting Tor. - - This option cannot be changed via SETCONF; it must be set in a torrc or - via the command line. - - (Integer. Default: -1.) 
- - __DisableSignalHandlers - - If this option is set to true during startup, then Tor will not install - any signal handlers to watch for POSIX signals. The SIGNAL controller - command will still work. - - This option is meant for embedding Tor inside another process, when - the controlling process would rather handle signals on its own. - - This option cannot be changed via SETCONF; it must be set in a torrc or - via the command line. - - (Boolean. Default: 0.) - -5.5. Phases from the Bootstrap status event. - - [For the bootstrap phases reported by Tor prior to 0.4.0.x, see - Section 5.6.] - - This section describes the various bootstrap phases currently reported - by Tor. Controllers should not assume that the percentages and tags - listed here will continue to match up, or even that the tags will stay - in the same order. Some phases might also be skipped (not reported) - if the associated bootstrap step is already complete, or if the phase - no longer is necessary. Only "starting" and "done" are guaranteed to - exist in all future versions. - - Current Tor versions enter these phases in order, monotonically. - Future Tors MAY revisit earlier phases, for example, if the network - fails. - -5.5.1. Overview of Bootstrap reporting. - - Bootstrap phases can be viewed as belonging to one of three stages: - - 1. Initial connection to a Tor relay or bridge - 2. Obtaining directory information - 3. Building an application circuit - - Tor doesn't specifically enter Stage 1; that is a side effect of - other actions that Tor is taking. Tor could be making a connection - to a fallback directory server, or it could be making a connection - to a guard candidate. Either one counts as Stage 1 for the purposes - of bootstrap reporting. - - Stage 2 might involve Tor contacting directory servers, or it might - involve reading cached directory information from a previous - session. 
Large parts of Stage 2 might be skipped if there is already - enough cached directory information to build circuits. Tor will - defer reporting progress in Stage 2 until Stage 1 is complete. - - Tor defers this reporting because Tor can already have enough - directory information to build circuits, yet not be able to connect - to a relay. Without that deferral, a user might misleadingly see Tor - stuck at a large amount of progress when something as fundamental as - making a TCP connection to any relay is failing. - - Tor also doesn't specifically enter Stage 3; that is a side effect - of Tor building circuits for some purpose or other. In a typical - client, Tor builds predicted circuits to provide lower latency for - application connection requests. In Stage 3, Tor might make new - connections to relays or bridges that it did not connect to in Stage - 1. - -5.5.2. Phases in Bootstrap Stage 1. - - Phase 0: - tag=starting summary="Starting" - - Tor starts out in this phase. - - Phase 1: - tag=conn_pt summary="Connecting to pluggable transport" - [This phase is new in 0.4.0.x] - - Tor is making a TCP connection to the transport plugin for a - pluggable transport. Tor will use this pluggable transport to make - its first connection to a bridge. - - Phase 2: - tag=conn_done_pt summary="Connected to pluggable transport" - [New in 0.4.0.x] - - Tor has completed its TCP connection to the transport plugin for the - pluggable transport. - - Phase 3: - tag=conn_proxy summary="Connecting to proxy" - [New in 0.4.0.x] - - Tor is making a TCP connection to a proxy to make its first - connection to a relay or bridge. - - Phase 4: - tag=conn_done_proxy summary="Connected to proxy" - [New in 0.4.0.x] - - Tor has completed its TCP connection to a proxy to make its first - connection to a relay or bridge. 
- - Phase 5: - tag=conn summary="Connecting to a relay" - [New in 0.4.0.x; prior versions of Tor had a "conn_dir" phase that - sometimes but not always corresponded to connecting to a directory server] - - Tor is making its first connection to a relay. This might be through - a pluggable transport or proxy connection that Tor has already - established. - - Phase 10: - tag=conn_done summary="Connected to a relay" - [New in 0.4.0.x] - - Tor has completed its first connection to a relay. - - Phase 14: - tag=handshake summary="Handshaking with a relay" - [New in 0.4.0.x; prior versions of Tor had a "handshake_dir" phase] - - Tor is in the process of doing a TLS handshake with a relay. - - Phase 15: - tag=handshake_done summary="Handshake with a relay done" - [New in 0.4.0.x] - - Tor has completed its TLS handshake with a relay. - -5.5.3. Phases in Bootstrap Stage 2. - - Phase 20: - tag=onehop_create summary="Establishing an encrypted directory connection" - [prior to 0.4.0.x, this was numbered 15] - - Once TLS is finished with a relay, Tor will send a CREATE_FAST cell - to establish a one-hop circuit for retrieving directory information. - It will remain in this phase until it receives the CREATED_FAST cell - back, indicating that the circuit is ready. - - Phase 25: - tag=requesting_status summary="Asking for networkstatus consensus" - [prior to 0.4.0.x, this was numbered 20] - - Once we've finished our one-hop circuit, we will start a new stream - for fetching the networkstatus consensus. We'll stay in this phase - until we get the 'connected' relay cell back, indicating that we've - established a directory connection. - - Phase 30: - tag=loading_status summary="Loading networkstatus consensus" - [prior to 0.4.0.x, this was numbered 25] - - Once we've established a directory connection, we will start fetching - the networkstatus consensus document. 
This could take a while; this - phase is a good opportunity for using the "progress" keyword to indicate - partial progress. - - This phase could stall if the directory server we picked doesn't - have a copy of the networkstatus consensus so we have to ask another, - or it does give us a copy but we don't find it valid. - - Phase 40: - tag=loading_keys summary="Loading authority key certs" - - Sometimes when we've finished loading the networkstatus consensus, - we find that we don't have all the authority key certificates for the - keys that signed the consensus. At that point we put the consensus we - fetched on hold and fetch the keys so we can verify the signatures. - - Phase 45 - tag=requesting_descriptors summary="Asking for relay descriptors" - - Once we have a valid networkstatus consensus and we've checked all - its signatures, we start asking for relay descriptors. We stay in this - phase until we have received a 'connected' relay cell in response to - a request for descriptors. - - [Some versions of Tor (starting with 0.2.6.2-alpha but before - 0.4.0.x): Tor could report having internal paths only; see Section - 5.6] - - Phase 50: - tag=loading_descriptors summary="Loading relay descriptors" - - We will ask for relay descriptors from several different locations, - so this step will probably make up the bulk of the bootstrapping, - especially for users with slow connections. We stay in this phase until - we have descriptors for a significant fraction of the usable relays - listed in the networkstatus consensus (this can be between 25% and 95% - depending on Tor's configuration and network consensus parameters). - This phase is also a good opportunity to use the "progress" keyword to - indicate partial steps. 
- - [Some versions of Tor (starting with 0.2.6.2-alpha but before - 0.4.0.x): Tor could report having internal paths only; see Section - 5.6] - - Phase 75: - tag=enough_dirinfo summary="Loaded enough directory info to build - circuits" - [New in 0.4.0.x; previously, Tor would misleadingly report the - "conn_or" tag once it had enough directory info.] - -5.5.4. Phases in Bootstrap Stage 3. - - Phase 76: - tag=ap_conn_pt summary="Connecting to pluggable transport to build - circuits" - [New in 0.4.0.x] - - This is similar to conn_pt, except for making connections to - additional relays or bridges that Tor needs to use to build - application circuits. - - Phase 77: - tag=ap_conn_done_pt summary="Connected to pluggable transport to build circuits" - [New in 0.4.0.x] - - This is similar to conn_done_pt, except for making connections to - additional relays or bridges that Tor needs to use to build - application circuits. - - Phase 78: - tag=ap_conn_proxy summary="Connecting to proxy to build circuits" - [New in 0.4.0.x] - - This is similar to conn_proxy, except for making connections to - additional relays or bridges that Tor needs to use to build - application circuits. - - Phase 79: - tag=ap_conn_done_proxy summary="Connected to proxy to build circuits" - [New in 0.4.0.x] - - This is similar to conn_done_proxy, except for making connections to - additional relays or bridges that Tor needs to use to build - application circuits. - - Phase 80: - tag=ap_conn summary="Connecting to a relay to build circuits" - [New in 0.4.0.x] - - This is similar to conn, except for making connections to additional - relays or bridges that Tor needs to use to build application - circuits. - - Phase 85: - tag=ap_conn_done summary="Connected to a relay to build circuits" - [New in 0.4.0.x] - - This is similar to conn_done, except for making connections to - additional relays or bridges that Tor needs to use to build - application circuits. 
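A controller that consumes these phases could be sketched as follows. This is a minimal Python illustration, not part of the specification. It assumes that the bootstrap status event carries PROGRESS=, TAG=, and SUMMARY= key-value pairs as described earlier in this document, and that phases below 20 belong to stage 1, phases 20 through 75 to stage 2, and phases 76 and above to stage 3, following the section boundaries above. The names parse_bootstrap and bootstrap_stage are hypothetical.

```python
import re

# One KEY=value pair; the value is either a double-quoted string or a bare token.
_PAIR = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S+)')


def parse_bootstrap(line):
    """Return the KEY=value pairs of a bootstrap status line as a dict.

    Quoted values have their surrounding quotes removed. Backslash escapes
    inside quoted values are left undecoded in this sketch.
    """
    fields = {}
    for key, value in _PAIR.findall(line):
        if value.startswith('"'):
            value = value[1:-1]
        fields[key] = value
    return fields


def bootstrap_stage(progress):
    """Map a phase number to a bootstrap stage (assumed boundaries: 20 and 76)."""
    if progress < 20:
        return 1
    if progress <= 75:
        return 2
    return 3
```

Matching on the TAG value rather than on the phase number is likely to be more robust for a controller, because the phase numbers were renumbered in 0.4.0.x while several tags kept their names.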
- - Phase 89: - tag=ap_handshake summary="Finishing handshake with a relay to build circuits" - [New in 0.4.0.x] - - This is similar to handshake, except for making connections to - additional relays or bridges that Tor needs to use to build - application circuits. - - Phase 90: - tag=ap_handshake_done summary="Handshake finished with a relay to build circuits" - [New in 0.4.0.x] - - This is similar to handshake_done, except for making connections to - additional relays or bridges that Tor needs to use to build - application circuits. - - Phase 95: - tag=circuit_create summary="Establishing a[n internal] Tor circuit" - [prior to 0.4.0.x, this was numbered 90] - - Once we've finished our TLS handshake with the first hop of a circuit, - we will set about trying to make some 3-hop circuits in case we need them - soon. - - [Some versions of Tor (starting with 0.2.6.2-alpha but before - 0.4.0.x): Tor could report having internal paths only; see Section - 5.6] - - Phase 100: - tag=done summary="Done" - - A full 3-hop circuit has been established. Tor is ready to handle - application connections now. - - [Some versions of Tor (starting with 0.2.6.2-alpha but before - 0.4.0.x): Tor could report having internal paths only; see Section - 5.6] - -5.6. Bootstrap phases reported by older versions of Tor - - These phases were reported by Tor older than 0.4.0.x. For newer - versions of Tor, see Section 5.5. - - [Newer versions of Tor (0.2.6.2-alpha and later): - If the consensus contains Exits (the typical case), Tor will build both - exit and internal circuits. When bootstrap completes, Tor will be ready - to handle an application requesting an exit circuit to services like the - World Wide Web. - - If the consensus does not contain Exits, Tor will only build internal - circuits. In this case, earlier statuses will have included "internal" - as indicated above. 
When bootstrap completes, Tor will be ready to handle - an application requesting an internal circuit to hidden services at - ".onion" addresses. - - If a future consensus contains Exits, exit circuits may become available.] - - Phase 0: - tag=starting summary="Starting" - - Tor starts out in this phase. - - Phase 5: - tag=conn_dir summary="Connecting to directory server" - - Tor sends this event as soon as Tor has chosen a directory server -- - e.g. one of the authorities if bootstrapping for the first time or - after a long downtime, or one of the relays listed in its cached - directory information otherwise. - - Tor will stay at this phase until it has successfully established - a TCP connection with some directory server. Problems in this phase - generally happen because Tor doesn't have a network connection, or - because the local firewall is dropping SYN packets. - - Phase 10: - tag=handshake_dir summary="Finishing handshake with directory server" - - This event occurs when Tor establishes a TCP connection with a relay or - authority used as a directory server (or its https proxy if it's using - one). Tor remains in this phase until the TLS handshake with the relay - or authority is finished. - - Problems in this phase generally happen because Tor's firewall is - doing more sophisticated MITM attacks on it, or doing packet-level - keyword recognition of Tor's handshake. - - Phase 15: - tag=onehop_create summary="Establishing an encrypted directory connection" - - Once TLS is finished with a relay, Tor will send a CREATE_FAST cell - to establish a one-hop circuit for retrieving directory information. - It will remain in this phase until it receives the CREATED_FAST cell - back, indicating that the circuit is ready. - - Phase 20: - tag=requesting_status summary="Asking for networkstatus consensus" - - Once we've finished our one-hop circuit, we will start a new stream - for fetching the networkstatus consensus. 
We'll stay in this phase - until we get the 'connected' relay cell back, indicating that we've - established a directory connection. - - Phase 25: - tag=loading_status summary="Loading networkstatus consensus" - - Once we've established a directory connection, we will start fetching - the networkstatus consensus document. This could take a while; this - phase is a good opportunity for using the "progress" keyword to indicate - partial progress. - - This phase could stall if the directory server we picked doesn't - have a copy of the networkstatus consensus so we have to ask another, - or it does give us a copy but we don't find it valid. - - Phase 40: - tag=loading_keys summary="Loading authority key certs" - - Sometimes when we've finished loading the networkstatus consensus, - we find that we don't have all the authority key certificates for the - keys that signed the consensus. At that point we put the consensus we - fetched on hold and fetch the keys so we can verify the signatures. - - Phase 45: - tag=requesting_descriptors summary="Asking for relay descriptors - [ for internal paths]" - - Once we have a valid networkstatus consensus and we've checked all - its signatures, we start asking for relay descriptors. We stay in this - phase until we have received a 'connected' relay cell in response to - a request for descriptors. - - [Newer versions of Tor (0.2.6.2-alpha and later): - If the consensus contains Exits (the typical case), Tor will ask for - descriptors for both exit and internal paths. If not, Tor will only ask - for descriptors for internal paths. In this case, this status will - include "internal" as indicated above.] - - Phase 50: - tag=loading_descriptors summary="Loading relay descriptors[ for internal - paths]" - - We will ask for relay descriptors from several different locations, - so this step will probably make up the bulk of the bootstrapping, - especially for users with slow connections.
We stay in this phase until - we have descriptors for a significant fraction of the usable relays - listed in the networkstatus consensus (this can be between 25% and 95% - depending on Tor's configuration and network consensus parameters). - This phase is also a good opportunity to use the "progress" keyword to - indicate partial steps. - - [Newer versions of Tor (0.2.6.2-alpha and later): - If the consensus contains Exits (the typical case), Tor will download - descriptors for both exit and internal paths. If not, Tor will only - download descriptors for internal paths. In this case, this status will - include "internal" as indicated above.] - - Phase 80: - tag=conn_or summary="Connecting to the Tor network[ internally]" - - Once we have a valid consensus and enough relay descriptors, we choose - entry guard(s) and start trying to build some circuits. This step - is similar to the "conn_dir" phase above; the only difference is - the context. - - If a Tor starts with enough recent cached directory information, - its first bootstrap status event will be for the conn_or phase. - - [Newer versions of Tor (0.2.6.2-alpha and later): - If the consensus contains Exits (the typical case), Tor will build both - exit and internal circuits. If not, Tor will only build internal circuits. - In this case, this status will include "internal(ly)" as indicated above.] - - Phase 85: - tag=handshake_or summary="Finishing handshake with first hop[ of internal - circuit]" - - This phase is similar to the "handshake_dir" phase, but it gets reached - if we finish a TCP connection to a Tor relay and we have already reached - the "conn_or" phase. We'll stay in this phase until we complete a TLS - handshake with a Tor relay. - - [Newer versions of Tor (0.2.6.2-alpha and later): - If the consensus contains Exits (the typical case), Tor may be finishing - a handshake with the first hop of either an exit or internal circuit. In - this case, it won't specify which type.
If the consensus contains no Exits, - Tor will only build internal circuits. In this case, this status will - include "internal" as indicated above.] - - Phase 90: - tag=circuit_create summary="Establishing a[n internal] Tor circuit" - - Once we've finished our TLS handshake with the first hop of a circuit, - we will set about trying to make some 3-hop circuits in case we need them - soon. - - [Newer versions of Tor (0.2.6.2-alpha and later): - If the consensus contains Exits (the typical case), Tor will build both - exit and internal circuits. If not, Tor will only build internal circuits. - In this case, this status will include "internal" as indicated above.] - - Phase 100: - tag=done summary="Done" - - A full 3-hop circuit has been established. Tor is ready to handle - application connections now. - - [Newer versions of Tor (0.2.6.2-alpha and later): - If the consensus contains Exits (the typical case), Tor will build both - exit and internal circuits. At this stage, Tor will be ready to handle - an application requesting an exit circuit to services like the World - Wide Web. - - If the consensus does not contain Exits, Tor will only build internal - circuits. In this case, earlier statuses will have included "internal" - as indicated above. At this stage, Tor will be ready to handle an - application requesting an internal circuit to hidden services at ".onion" - addresses. - - If a future consensus contains Exits, exit circuits may become available.] diff --git a/dir-list-spec.txt b/dir-list-spec.txt deleted file mode 100644 index 65af536..0000000 --- a/dir-list-spec.txt +++ /dev/null @@ -1,529 +0,0 @@ - - Tor Directory List Format - Tim Wilson-Brown (teor) - -Table of Contents - - 1. Scope and Preliminaries - 1.1. Format Overview - 1.2. Acknowledgements - 1.3. Format Versions - 1.4. Future Plans - 2. Format Details - 2.1. Nonterminals - 2.2. List Header - 2.2.1. List Header Format - 2.3. List Generation - 2.3.1. List Generation Format - 2.4. 
Directory Entry - 2.4.1. Directory Entry Format - 3. Usage Considerations - 3.1. Caching - 3.2. Retrieving Directory Information - 3.3. Fallback Reliability - A.1. Sample Data - A.1.1. Sample Fallback List Header - A.1.2. Sample Fallback List Generation - A.1.3. Sample Fallback Entries - -1. Scope and Preliminaries - - This document describes the format of Tor's directory lists, which are - compiled and hard-coded into the tor binary. There is currently one - list: the fallback directory mirrors. This list is also parsed by other - libraries, like stem and metrics-lib. Alternate Tor implementations can - use this list to bootstrap from the latest public Tor directory - information. - - The FallbackDir feature was introduced by proposal 210, and was first - supported by Tor in Tor version 0.2.4.7-alpha. The first hard-coded - list was shipped in 0.2.8.1-alpha. - - The hard-coded fallback directory list is located in the tor source - repository at: - - src/app/config/fallback_dirs.inc - - In Tor 0.3.4 and earlier, the list is located at: - - src/or/fallback_dirs.inc - - This document describes version 2.0.0 and later of the directory list - format. - - Legacy, semi-structured versions of the fallback list were released with - Tor 0.2.8.1-alpha through Tor 0.3.1.9. We call this format version 1. - Stem and Relay Search have parsers for this legacy format. - -1.1. Format Overview - - A directory list is a C code fragment containing an array of C string - constants. Each double-quoted C string constant is a valid torrc - FallbackDir entry. Each entry contains various data fields. - - Directory lists do not include the C array's declaration, or the array's - terminating NULL. Entries in directory lists do not include the - FallbackDir torrc option. These are handled by the including C code. - - Directory lists also include C-style comments and whitespace. The - presence of whitespace may be significant, but the amount of whitespace - is never significant. 
The type of whitespace is not significant to the - C compiler or Tor C string parser. However, other parsers MAY rely on - the distinction between newlines and spaces. (And that the only - whitespace characters in the list are newlines and spaces.) - - The directory entry C string constants are split over multiple lines for - readability. Structured C-style comments are used to provide additional - data fields. This information is not used by Tor, but may be of interest - to other libraries. - - The order of directory entries and data fields is not significant, - except where noted below. - -1.2. Acknowledgements - - The original fallback directory script and format were created by - weasel. The current script uses code written by gsathya & karsten. - - This specification was revised after feedback from: - - Damian Johnson ("atagar") - Iain R. Learmonth ("irl") - -1.3. Format Versions - - The directory list format uses semantic versioning: https://semver.org - - In particular: - * major versions are used for incompatible changes, like - removing non-optional fields - * minor versions are used for compatible changes, like adding - fields - * patch versions are for bug fixes, like fixing an - incorrectly-formatted Summary item - - 1.0.0 - The legacy fallback directory list format - - 2.0.0 - Adds name and extrainfo structured comments, and section separator - comments to make the list easier to parse. Also adds a source list - comment to the header. - - 3.0.0 - Modifies the format of the source list comment. - -1.4. Future Plans - - Tor also has an auth_dirs.inc file, but it is not yet in this format. - Tor uses slightly different formats for authorities and fallback - directory mirrors, so we will need to make some changes to tor so that - it parses this format. (We will also need to add authority-specific - information to this format.) See #24818 for details. - - We want to add a torrc option so operators can opt-in their relays as - fallback directory mirrors.
This gives us a signed opt-in confirmation. - (We can also continue to accept whitelist entries, and do other checks.) - We need to write a short proposal, and make some changes to tor and the - fallback update script. See #24839 for details. - -2. Format Details - - Directory lists contain the following sections: - - - List Header (exactly once) - - List Generation (exactly once, may be empty) - - Directory Entry (zero or more times) - - Each section (or entry) ends with a separator. - -2.1. Nonterminals - - The following nonterminals are defined in the Onionoo details document - specification: - - dir_address - fingerprint - nickname - - See https://metrics.torproject.org/onionoo.html#details - - The following nonterminals are defined in the "Tor directory protocol" - specification in dir-spec.txt: - - Keyword - ArgumentChar - NL (newline) - SP (space) - bool (must not be confused with Onionoo's JSON "boolean") - - We derive the following nonterminals from Onionoo and dir-spec.txt: - - ipv4_or_port ::= port from an IPv4 or_addresses item - - The ipv4_or_port is the port part of an IPv4 address from the - Onionoo or_addresses list. - - ipv6_or_address ::= an IPv6 or_addresses item - - The ipv6_or_address is an IPv6 address and port from the Onionoo - or_addresses list. The address MAY be in the canonical RFC 5952 - IPv6 address format. - - A key-value pair: - - value ::= Zero or more ArgumentChar, excluding the following strings: - * a double quotation mark (DQUOTE), and - * the C comment terminators ("/*" and "*/"). - - Note that the C++ comment ("//") and equals sign ("=") are - not excluded, because they are reserved for future use in - base64 values. - - key_value ::= Keyword "=" value - - We also define these additional nonterminals: - - number ::= An optional negative sign ("-"), followed by one or more - numeric characters ([0-9]), with an optional decimal part - (".", followed by one or more numeric characters). 
- - separator ::= "/*" SP+ "=====" SP+ "*/" - -2.2. List Header - - The list header consists of a number of key-value pairs, embedded in - C-style comments. - -2.2.1. List Header Format - - "/*" SP+ "type=" Keyword SP+ "*/" SP* NL - - [At start, exactly once.] - - The type of directory entries in the list. Parsers SHOULD exit with - an error if this is not the first line of the list, or if the value - is anything other than "fallback". - - "/*" SP+ "version=" version_number SP+ "*/" SP* NL - - [In second position, exactly once.] - - The version of the directory list format. - - version_number is a semantic version, see the "Format Versions" - section for details. - - Version 1.0.0 represents the undocumented, legacy fallback list - format(s). Version 2.0.0 and later are documented by this - specification. - - "/*" SP+ "timestamp=" number SP+ "*/" SP* NL - - [Exactly once.] - - A positive integer that indicates when this directory list was - generated. This timestamp is guaranteed to increase for every - version 2.0.0 and later directory list. - - The current timestamp format is YYYYMMDDHHMMSS, as an integer. - - "/*" SP+ "source=" Keyword ("," Keyword)* SP+ "*/" SP* NL - - [Zero or one time.] - - A list of the sources of the directory entries in the list. - - As of version 3.0.0, the possible sources are: - * "offer-list" - the fallback_offer_list file in the fallback-scripts - repository. - * "descriptor" - one or more signed descriptors, each containing an - "offer-fallback-dir" line. This feature will be - implemented in ticket #24839. - * "fallback" - a fallback_dirs.inc file from a tor repository. - Used in check_existing mode. - - Before #24839 is implemented, the default is "offer-list". During the - transition to signed offers, it will be "descriptor,offer-list". - Afterwards, it will be "descriptor". - - In version 2.0.0, only one source name was allowed after "source=", - and the deprecated "whitelist" source name was used instead of - "offer-list". 
- - This line was added in version 2.0.0 of this specification. The format - of this line was modified in version 3.0.0 of this specification. - - "/*" SP+ key_value SP+ "*/" SP* NL - - [Zero or more times.] - - Future releases may include additional header fields. Parsers MUST NOT - rely on the order of these additional fields. Additional header fields - will be accompanied by a minor version increment. - - separator SP* NL - - The list header ends with the section separator. - -2.3. List Generation - - The list generation information consists of human-readable prose - describing the content and origin of this directory list. It is contained - in zero or more C-style comments, and may contain multi-line comments and - uncommented C code. - - In particular, this section may contain C-style comments that contain - an equals ("=") character. It may also be entirely empty. - - Future releases may arbitrarily change the content of this section. - Parsers MUST NOT rely on a version increment when the format changes. - -2.3.1. List Generation Format - - In general, parsers MUST NOT rely on the format of this section. - - Parsers MAY rely on the following details: - - The list generation section MUST NOT be a valid directory entry. - - The list generation summary MUST end with a section separator: - - separator SP* NL - - There MUST NOT be any section separators in the list generation - section, other than the terminating section separator. - -2.4. Directory Entry - - A directory entry consists of a C string constant, and one or more - C-style comments. The C string constant is a valid argument to the - DirAuthority or FallbackDir torrc option. The section also contains - additional key-value fields in C-style comments. - - The list of fallback entries does not include the directory - authorities: they are in a separate list. (The Tor implementation combines - these lists after parsing them, and applies the DirAuthorityFallbackRate - to their weights.) - -2.4.1. 
Directory Entry Format - - If a directory entry does not conform to this format, the entry SHOULD - be ignored by parsers. - - DQUOTE dir_address SP+ "orport=" ipv4_or_port SP+ - "id=" fingerprint DQUOTE SP* NL - - [At start, exactly once, on a single line.] - - This line consists of the following fields: - - dir_address - - An IPv4 address and DirPort for this directory, as defined by - Onionoo. In this format version, all IPv4 addresses and DirPorts - are guaranteed to be non-zero. (For IPv4 addresses, this means - that they are not equal to "0.0.0.0".) - - ipv4_or_port - - An IPv4 ORPort for this directory, derived from Onionoo. In this - format version, all IPv4 ORPorts are guaranteed to be non-zero. - - fingerprint - - The relay fingerprint of this directory, as defined by Onionoo. - All relay fingerprints are guaranteed to have one or more non-zero - digits. - - Note: - - Each double-quoted C string line that occurs after the first line, - starts with space inside the quotes. This is a requirement of the - Tor implementation. - - DQUOTE SP+ "ipv6=" ipv6_or_address DQUOTE SP* NL - - [Zero or one time.] - - The IPv6 address and ORPort for this directory, as defined by - Onionoo. If present, IPv6 addresses and ORPorts are guaranteed to be - non-zero. (For IPv6 addresses, this means that they are not equal to - "[::]".) - - DQUOTE SP+ "weight=" number DQUOTE SP* NL - - [Zero or one time.] - - A non-negative, real-numbered weight for this directory. - The default fallback weight is 1.0, and the default - DirAuthorityFallbackRate is 1.0 in legacy Tor versions, and 0.1 in - recent Tor versions. - - weight was removed in version 2.0.0, but is documented because it - may be of interest to libraries implementing Tor's fallback - behaviour. - - DQUOTE SP+ key_value DQUOTE SP* NL - - [Zero or more times.] - - Future releases may include additional data fields in double-quoted - C string constants. Parsers MUST NOT rely on the order of these - additional fields. 
Additional data fields will be accompanied by a - minor version increment. - - "/*" SP+ "nickname=" nickname* SP+ "*/" SP* NL - - [Exactly once.] - - The nickname for this directory, as defined by Onionoo. An - empty nickname indicates that the nickname is unknown. - - The first fallback list in the 2.0.0 format had nickname lines, but - they were all empty. - - "/*" SP+ "extrainfo=" bool SP+ "*/" SP* NL - - [Exactly once.] - - An integer flag that indicates whether this directory caches - extra-info documents. Set to 1 if the directory claimed that it - cached extra-info documents in its descriptor when the list was - created. 0 indicates that it did not, or its descriptor was not - available. - - The first fallback list in the 2.0.0 format had extrainfo lines, but - they were all zero. - - "/*" SP+ key_value SP+ "*/" SP* NL - - [Zero or more times.] - - Future releases may include additional data fields in C-style - comments. Parsers MUST NOT rely on the order of these additional - fields. Additional data fields will be accompanied by a minor version - increment. - - separator SP* NL - - [Exactly once.] - - Each directory entry ends with the section separator. - - "," SP* NL - - [Exactly once.] - - The comma terminates the C string constant. (Multiple C string - constants separated by whitespace or comments are coalesced by - the C compiler.) - -3. Usage Considerations - - This section contains recommended library behaviours. It does not affect - the format of directory lists. - -3.1. Caching - - The fallback list typically changes once every 6-12 months. The data in - the list represents the state of the fallback directory entries when the - list was created. Fallbacks can and do change their details over time. - - Libraries SHOULD parse and cache the most recent version of these lists - during their build or release processes. Libraries MUST NOT retrieve the - lists by default every time they are deployed or executed. 
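A build-time step that parses a cached copy of the list, as recommended above, might look like the following Python sketch. It is an illustration under stated assumptions rather than a reference parser: it reads the whole list as one string, checks only the value of the "type" header field (not its position), treats each key-value comment as a plain string, and skips entries that lack "orport" or "id". The function name parse_fallback_list and the returned dictionary layout are hypothetical.

```python
import re

# Section separator: "/*" SP+ "=====" SP+ "*/"
SEPARATOR = re.compile(r'/\* +===== +\*/')
# A double-quoted C string constant (no escape handling in this sketch).
STRING = re.compile(r'"([^"]*)"')
# A structured comment: "/*" SP+ Keyword "=" value SP+ "*/"
COMMENT_KV = re.compile(r'/\* +([A-Za-z0-9-]+)=(\S*) +\*/')


def parse_fallback_list(text):
    """Return (header, entries) parsed from a version 2.0.0 or later list."""
    sections = SEPARATOR.split(text)
    if len(sections) < 3:
        raise ValueError("missing list header or list generation separator")
    header = dict(COMMENT_KV.findall(sections[0]))
    if header.get("type") != "fallback":
        raise ValueError("unsupported list type")
    entries = []
    # sections[1] is the free-form list generation section; it is skipped.
    for section in sections[2:]:
        # Adjacent C string constants are coalesced, as the C compiler does.
        tokens = "".join(STRING.findall(section)).split()
        if not tokens:
            continue  # only the trailing comma after the last entry
        fields = dict(t.split("=", 1) for t in tokens[1:] if "=" in t)
        if "orport" not in fields or "id" not in fields:
            continue  # non-conforming entries are ignored
        fields["dir_address"] = tokens[0]
        fields.update(COMMENT_KV.findall(section))
        entries.append(fields)
    return header, entries
```

Splitting on the section separator first keeps the free-form list generation text away from the key-value matching, which matters because that section may itself contain comments with an equals sign.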
- - The latest fallback list can be retrieved from: - - https://gitweb.torproject.org/tor.git/plain/src/or/fallback_dirs.inc - - Libraries MUST NOT rely on the availability of the server that hosts - these lists. - - The list can also be retrieved using: - - git clone https://git.torproject.org/tor.git - - If you just want the latest list, you may wish to perform a shallow - clone. - -3.2. Retrieving Directory Information - - Some libraries retrieve directory documents directly from the Tor - Directory Authorities. The directory authorities are designed to support - Tor relay and client bootstrap, and MAY choose to rate-limit library - access. Libraries MAY provide a user-agent in their requests, if they - are not intended to support anonymous operation. (User agents are a - fingerprinting vector.) - - Libraries SHOULD consider the potential load on the authorities, and - whether other sources can meet their needs. - - Libraries that require high-uptime availability of Tor directory - information should investigate the following options: - - * Onionoo: https://metrics.torproject.org/onionoo.html - * Third-party Onionoo mirrors are also available - * CollecTor: https://collector.torproject.org/ - * Fallback Directory Mirrors - - Onionoo and CollecTor are typically updated every hour on a regular - schedule. Fallbacks update their own directory information at random - intervals; see dir-spec for details. - -3.3. Fallback Reliability - - The fallback list is typically regenerated when the fallback failure - rate exceeds 25%. Libraries SHOULD NOT rely on any particular fallback - being available, or some proportion of fallbacks being available. - - Libraries that use fallbacks MAY wish to query an authority after a - few fallback queries fail. For example, Tor clients try 3-4 fallbacks - before trying an authority. - -A.1.
Sample Data - - A sample version 2.0.0 fallback list is available here: - - https://trac.torproject.org/projects/tor/raw-attachment/ticket/22759/fallback_dirs_new_format_version.4.inc - - A sample transitional version 2.0.0 fallback list is available here: - - https://raw.githubusercontent.com/teor2345/tor/fallback-format-2-v4/src/or/fallback_dirs.inc - -A.1.1. Sample Fallback List Header - -/* type=fallback */ -/* version=2.0.0 */ -/* ===== */ - -A.1.2. Sample Fallback List Generation - -/* Whitelist & blacklist excluded 1326 of 1513 candidates. */ -/* Checked IPv4 DirPorts served a consensus within 15.0s. */ -/* -Final Count: 151 (Eligible 187, Target 392 (1963 * 0.20), Max 200) -Excluded: 36 (Same Operator 27, Failed/Skipped Download 9, Excess 0) -Bandwidth Range: 1.3 - 40.0 MByte/s -*/ -/* -Onionoo Source: details Date: 2017-05-16 07:00:00 Version: 4.0 -URL: https:onionoo.torproject.orgdetails?fields=fingerprint%2Cnickname%2Ccontact%2Clast_changed_address_or_port%2Cconsensus_weight%2Cadvertised_bandwidth%2Cor_addresses%2Cdir_address%2Crecommended_version%2Cflags%2Ceffective_family%2Cplatform&flag=V2Dir&type=relay&last_seen_days=-0&first_seen_days=30- -*/ -/* -Onionoo Source: uptime Date: 2017-05-16 07:00:00 Version: 4.0 -URL: https:onionoo.torproject.orguptime?first_seen_days=30-&flag=V2Dir&type=relay&last_seen_days=-0 -*/ -/* ===== */ - -A.1.3. Sample Fallback Entries - -"176.10.104.240:80 orport=443 id=0111BA9B604669E636FFD5B503F382A4B7AD6E80" -/* nickname=foo */ -/* extrainfo=1 */ -/* ===== */ -, -"5.9.110.236:9030 orport=9001 id=0756B7CD4DFC8182BE23143FAC0642F515182CEB" -" ipv6=[2a01:4f8:162:51e2::2]:9001" -/* nickname= */ -/* extrainfo=0 */ -/* ===== */ -, diff --git a/dir-spec.txt b/dir-spec.txt deleted file mode 100644 index f133c39..0000000 --- a/dir-spec.txt +++ /dev/null @@ -1,4299 +0,0 @@ - - Tor directory protocol, version 3 - -Table of Contents - - 0. Scope and preliminaries - 0.1. History - 0.2. Goals of the version 3 protocol - 0.3.
Some Remaining questions - 1. Outline - 1.1. What's different from version 2? - 1.2. Document meta-format - 1.3. Signing documents - 1.4. Voting timeline - 2. Router operation and formats - 2.1. Uploading server descriptors and extra-info documents - 2.1.1. Server descriptor format - 2.1.2. Extra-info document format - 2.1.3. Nonterminals in server descriptors - 3. Directory authority operation and formats - 3.1. Creating key certificates - 3.2. Accepting server descriptor and extra-info document uploads - 3.3. Computing microdescriptors - 3.4. Exchanging votes - 3.4.1. Vote and consensus status document formats - 3.4.2. Assigning flags in a vote - 3.4.3. Serving bandwidth list files - 3.5. Downloading missing certificates from other directory authorities - 3.6. Downloading server descriptors from other directory authorities - 3.7. Downloading extra-info documents from other directory authorities - 3.8. Computing a consensus from a set of votes - 3.8.0.1. Deciding which Ids to include. - 3.8.0.2. Deciding which descriptors to include - 3.8.1. Forward compatibility - 3.8.2. Encoding port lists - 3.8.3. Computing Bandwidth Weights - 3.9. Computing consensus flavors - 3.9.1. ns consensus - 3.9.2. Microdescriptor consensus - 3.10. Exchanging detached signatures - 3.11. Publishing the signed consensus - 4. Directory cache operation - 4.1. Downloading consensus status documents from directory authorities - 4.2. Downloading server descriptors from directory authorities - 4.3. Downloading microdescriptors from directory authorities - 4.4. Downloading extra-info documents from directory authorities - 4.5. Consensus diffs - 4.5.1. Consensus diff format - 4.5.2. Serving and requesting diff - 4.6 Retrying failed downloads - 5. Client operation - 5.1. Downloading network-status documents - 5.2. Downloading server descriptors or microdescriptors - 5.3. Downloading extra-info documents - 5.4. Using directory information - 5.4.1. Choosing routers for circuits. - 5.4.2. 
Managing naming - 5.4.3. Software versions - 5.4.4. Warning about a router's status. - 5.5. Retrying failed downloads - 6. Standards compliance - 6.1. HTTP headers - 6.2. HTTP status codes - A. Consensus-negotiation timeline. - B. General-use HTTP URLs - C. Converting a curve25519 public key to an ed25519 public key - D. Inferring missing proto lines. - E. Limited ed diff format - -0. Scope and preliminaries - - This directory protocol is used by Tor version 0.2.0.x-alpha and later. - See dir-spec-v1.txt for information on the protocol used up to the - 0.1.0.x series, and dir-spec-v2.txt for information on the protocol - used by the 0.1.1.x and 0.1.2.x series. - - This document merges and supersedes the following proposals: - - 101 Voting on the Tor Directory System - 103 Splitting identity key from regularly used signing key - 104 Long and Short Router Descriptors - - XXX timeline - XXX fill in XXXXs - - The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL - NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and - "OPTIONAL" in this document are to be interpreted as described in - RFC 2119. - -0.1. History - - The earliest versions of Onion Routing shipped with a list of known - routers and their keys. When the set of routers changed, users needed to - fetch a new list. - - The Version 1 Directory Protocol - -------------------------------- - - Early versions of Tor (0.0.2) introduced "Directory authorities": servers - that served signed "directory" documents containing a list of signed - "server descriptors", along with a short summary of the status of each - router. Thus, clients could get up-to-date information on the state of - the network automatically, and be certain that the list they were getting - was attested by a trusted directory authority. - - Later versions (0.0.8) added directory caches, which download - directories from the authorities and serve them to clients.
Non-caches - fetch from the caches in preference to fetching from the authorities, thus - distributing bandwidth requirements. - - Also added during the version 1 directory protocol were "router status" - documents: short documents that listed only the up/down status of the - routers on the network, rather than a complete list of all the - descriptors. Clients and caches would fetch these documents far more - frequently than they would fetch full directories. - - The Version 2 Directory Protocol - -------------------------------- - - During the Tor 0.1.1.x series, Tor revised its handling of directory - documents in order to address two major problems: - - * Directories had grown quite large (over 1MB), and most directory - downloads consisted mainly of server descriptors that clients - already had. - - * Every directory authority was a trust bottleneck: if a single - directory authority lied, it could make clients believe for a time - an arbitrarily distorted view of the Tor network. (Clients - trusted the most recent signed document they downloaded.) Thus, - adding more authorities would make the system less secure, not - more. - - To address these, we extended the directory protocol so that - authorities now published signed "network status" documents. Each - network status listed, for every router in the network: a hash of its - identity key, a hash of its most recent descriptor, and a summary of - what the authority believed about its status. Clients would download - the authorities' network status documents in turn, and believe - statements about routers iff they were attested to by more than half of - the authorities. - - Instead of downloading all server descriptors at once, clients - downloaded only the descriptors that they did not have. Descriptors - were indexed by their digests, in order to prevent malicious caches - from giving different versions of a server descriptor to different - clients. 
- - Routers began working harder to upload new descriptors only when their - contents were substantially changed. - - -0.2. Goals of the version 3 protocol - - Version 3 of the Tor directory protocol tries to solve the following - issues: - - * A great deal of bandwidth used to transmit server descriptors was - used by two fields that are not actually used by Tor routers - (namely read-history and write-history). We save about 60% by - moving them into a separate document that most clients do not - fetch or use. - - * It was possible under certain perverse circumstances for clients - to download an unusual set of network status documents, thus - partitioning themselves from clients who have a more recent and/or - typical set of documents. Even under the best of circumstances, - clients were sensitive to the ages of the network status documents - they downloaded. Therefore, instead of having the clients - correlate multiple network status documents, we have the - authorities collectively vote on a single consensus network status - document. - - * The most sensitive data in the entire network (the identity keys - of the directory authorities) needed to be stored unencrypted so - that the authorities can sign network-status documents on the fly. - Now, the authorities' identity keys are stored offline, and used - to certify medium-term signing keys that can be rotated. - -0.3. Some Remaining questions - - Things we could solve on a v3 timeframe: - - The SHA-1 hash is showing its age. We should do something about our - dependency on it. We could probably future-proof ourselves here in - this revision, at least so far as documents from the authorities are - concerned. - - Too many things about the authorities are hardcoded by IP. - - Perhaps we should start accepting longer identity keys for routers - too. - - Things to solve eventually: - - Requiring every client to know about every router won't scale forever. 
- - Requiring every directory cache to know every router won't scale - forever. - - -1. Outline - - There is a small set (say, around 5-10) of semi-trusted directory - authorities. A default list of authorities is shipped with the Tor - software. Users can change this list, but are encouraged not to do so, - in order to avoid partitioning attacks. - - Every authority has a very-secret, long-term "Authority Identity Key". - This is stored encrypted and/or offline, and is used to sign "key - certificate" documents. Every key certificate contains a medium-term - (3-12 months) "authority signing key", which is used by the authority to - sign other directory information. (Note that the authority identity - key is distinct from the router identity key that the authority uses - in its role as an ordinary router.) - - Routers periodically upload signed "router descriptors" to the - directory authorities describing their keys, capabilities, and other - information. Routers may also upload signed "extra-info documents" - containing information that is not required for the Tor protocol. - Directory authorities serve server descriptors indexed by router - identity, or by hash of the descriptor. - - Routers may act as directory caches to reduce load on the directory - authorities. They announce this in their descriptors. - - Periodically, each directory authority generates a view of - the current descriptors and status for known routers. They send a - signed summary of this view (a "status vote") to the other - authorities. The authorities compute the result of this vote, and sign - a "consensus status" document containing the result of the vote. - - Directory caches download, cache, and re-serve consensus documents. - - Clients, directory caches, and directory authorities all use consensus - documents to find out when their list of routers is out-of-date. - (Directory authorities also use vote statuses.) If it is, they download - any missing server descriptors. 
Clients download missing descriptors - from caches; caches and authorities download from authorities. - Descriptors are downloaded by the hash of the descriptor, not by the - relay's identity key: this prevents directory servers from attacking - clients by giving them descriptors nobody else uses. - - All directory information is uploaded and downloaded with HTTP. - -1.1. What's different from version 2? - - Clients used to download multiple network status documents, - corresponding roughly to "status votes" above. They would compute the - result of the vote on the client side. - - Authorities used to sign documents using the same private keys they used - for their roles as routers. This forced them to keep these extremely - sensitive keys in memory unencrypted. - - All of the information in extra-info documents used to be kept in the - main descriptors. - -1.2. Document meta-format - - Server descriptors, directories, and running-routers documents all obey the - following lightweight extensible information format. - - The highest level object is a Document, which consists of one or more - Items. Every Item begins with a KeywordLine, followed by zero or more - Objects. A KeywordLine begins with a Keyword, optionally followed by - whitespace and more non-newline characters, and ends with a newline. A - Keyword is a sequence of one or more characters in the set [A-Za-z0-9-], - but may not start with -. - An Object is a block of encoded data in pseudo-Privacy-Enhanced-Mail (PEM) - style format: that is, lines of encoded data MAY be wrapped by inserting - an ascii linefeed ("LF", also called newline, or "NL" here) character - (cf. RFC 4648 §3.1). When line wrapping, implementations MUST wrap lines - at 64 characters. Upon decoding, implementations MUST ignore and discard - all linefeed characters. - - More formally: - - NL = The ascii LF character (hex value 0x0a). - Document ::= (Item | NL)+ - Item ::= KeywordLine Object? 
- KeywordLine ::= Keyword (WS Argument)* NL - Keyword = KeywordStart KeywordChar* - KeywordStart ::= 'A' ... 'Z' | 'a' ... 'z' | '0' ... '9' - KeywordChar ::= KeywordStart | '-' - Argument := ArgumentChar+ - ArgumentChar ::= any graphical printing ASCII character. - WS = (SP | TAB)+ - Object ::= BeginLine Base64-encoded-data EndLine - BeginLine ::= "-----BEGIN " Keyword (" " Keyword)* "-----" NL - EndLine ::= "-----END " Keyword (" " Keyword)* "-----" NL - - A Keyword may not be "-----BEGIN". - - The BeginLine and EndLine of an Object must use the same keyword. - - When interpreting a Document, software MUST ignore any KeywordLine that - starts with a keyword it doesn't recognize; future implementations MUST NOT - require current clients to understand any KeywordLine not currently - described. - - Other implementations that want to extend Tor's directory format MAY - introduce their own items. The keywords for extension items SHOULD start - with the characters "x-" or "X-", to guarantee that they will not conflict - with keywords used by future versions of Tor. - - In our document descriptions below, we tag Items with a multiplicity in - brackets. Possible tags are: - - "At start, exactly once": These items MUST occur in every instance of - the document type, and MUST appear exactly once, and MUST be the - first item in their documents. - - "Exactly once": These items MUST occur exactly one time in every - instance of the document type. - - "At end, exactly once": These items MUST occur in every instance of - the document type, and MUST appear exactly once, and MUST be the - last item in their documents. - - "At most once": These items MAY occur zero or one times in any - instance of the document type, but MUST NOT occur more than once. - - "Any number": These items MAY occur zero, one, or more times in any - instance of the document type. - - "Once or more": These items MUST occur at least once in any instance - of the document type, and MAY occur more. 
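As an illustration only, the meta-format described above can be read with a small parser along the following lines. This is a sketch in Python, not part of the specification: the names Item and parse_document are hypothetical, the sketch follows the formal grammar (at most one Object per Item), and it does not enforce the multiplicity tags.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Keyword ::= KeywordStart KeywordChar*  (may not start with '-')
KEYWORD = r"[A-Za-z0-9][A-Za-z0-9-]*"
KEYWORD_RE = re.compile(KEYWORD)
BEGIN_RE = re.compile(rf"-----BEGIN ({KEYWORD}(?: {KEYWORD})*)-----")


@dataclass
class Item:
    """One Item: a KeywordLine plus an optional Object (hypothetical type)."""
    keyword: str
    arguments: List[str] = field(default_factory=list)
    object_label: Optional[str] = None
    object_data: Optional[str] = None  # base64 text, linefeeds discarded


def parse_document(text: str) -> List[Item]:
    """Split a document into Items, following Document ::= (Item | NL)+."""
    lines = text.split("\n")
    items: List[Item] = []
    i = 0
    while i < len(lines):
        line = lines[i]
        i += 1
        if line == "":
            continue  # a bare NL between items is allowed
        # WS ::= (SP | TAB)+ separates the keyword from its arguments.
        parts = [p for p in re.split(r"[ \t]+", line) if p]
        if line[0] in " \t" or not KEYWORD_RE.fullmatch(parts[0]):
            raise ValueError(f"bad keyword line: {line!r}")
        item = Item(parts[0], parts[1:])
        begin = BEGIN_RE.fullmatch(lines[i]) if i < len(lines) else None
        if begin:
            label = begin.group(1)
            end_line = f"-----END {label}-----"  # must use the same keyword
            i += 1
            body = []
            while i < len(lines) and lines[i] != end_line:
                body.append(lines[i])
                i += 1
            if i == len(lines):
                raise ValueError(f"unterminated object: {label}")
            i += 1  # skip the EndLine
            item.object_label = label
            item.object_data = "".join(body)  # discard wrapping linefeeds
        items.append(item)
    return items
```

A caller that does not recognize an item's keyword would simply skip that Item, as required for forward compatibility.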
- - For forward compatibility, each item MUST allow extra arguments at the - end of the line unless otherwise noted. So if an item's description below - is given as: - - "thing" int int int NL - - then implementations SHOULD accept this string as well: - - "thing 5 9 11 13 16 12" NL - - but not this string: - - "thing 5" NL - - and not this string: - - "thing 5 10 thing" NL - . - - Whenever an item DOES NOT allow extra arguments, we will tag it with - "no extra arguments". - -1.3. Signing documents - - Every signable document below is signed in a similar manner, using a - given "Initial Item", a final "Signature Item", a digest algorithm, and - a signing key. - - The Initial Item must be the first item in the document. - - The Signature Item has the following format: - - <signature item keyword> [arguments] NL SIGNATURE NL - - The "SIGNATURE" Object contains a signature (using the signing key) of - the PKCS#1 1.5 padded digest of the entire document, taken from the - beginning of the Initial item, through the newline after the Signature - Item's keyword and its arguments. - - The signature does not include the algorithmIdentifier specified in PKCS #1. - - Unless specified otherwise, the digest algorithm is SHA-1. - - All documents are invalid unless signed with the correct signing key. - - The "Digest" of a document, unless stated otherwise, is its digest *as - signed by this signature scheme*. - -1.4. Voting timeline - - Every consensus document has a "valid-after" (VA) time, a "fresh-until" - (FU) time and a "valid-until" (VU) time. VA MUST precede FU, which MUST - in turn precede VU. Times are chosen so that every consensus will be - "fresh" until the next consensus becomes valid, and "valid" for a while - after. At least 3 consensuses should be valid at any given time. - - The timeline for a given consensus is as follows: - - VA-DistSeconds-VoteSeconds: The authorities exchange votes. Each authority - uploads its vote to all other authorities. 
- - VA-DistSeconds-VoteSeconds/2: The authorities try to download any - votes they don't have. - - Authorities SHOULD also reject any votes that other authorities try to - upload after this time. (0.4.4.1-alpha was the first version to reject votes - in this way.) - - Note: Refusing late uploaded votes reduces the chance of a consensus - split, particularly when authorities are under bandwidth pressure. If an - authority is struggling to upload its vote, and finally uploads to a - fraction of authorities after this period, those authorities will compute - a consensus different from the others. By refusing uploaded votes after - this time, we increase the likelihood that most authorities will use the - same vote set. - - Rejecting late uploaded votes does not fix the problem entirely. If - some authorities are able to download a specific vote, but others fail - to do so, then there may still be a consensus split. However, this - change does remove one common cause of consensus splits. - - VA-DistSeconds: The authorities calculate the consensus and exchange - signatures. (This is the earliest point at which anybody can - possibly get a given consensus if they ask for it.) - - VA-DistSeconds/2: The authorities try to download any signatures - they don't have. - - VA: All authorities have a multiply signed consensus. - - VA ... FU: Caches download the consensus. (Note that since caches have - no way of telling what VA and FU are until they have downloaded - the consensus, they assume that the present consensus's VA is - equal to the previous one's FU, and that its FU is one interval after - that.) - - FU: The consensus is no longer the freshest consensus. - - FU ... (the current consensus's VU): Clients download the consensus. - (See note above: clients guess that the next consensus's FU will be - two intervals after the current VA.) - - VU: The consensus is no longer valid; clients should continue to try to - download a new consensus if they have not done so already. 
- - VU + 24 hours: Clients will no longer use the consensus at all. - - VoteSeconds and DistSeconds MUST each be at least 20 seconds; FU-VA and - VU-FU MUST each be at least 5 minutes. - -2. Router operation and formats - -2.1. Uploading server descriptors and extra-info documents - - ORs SHOULD generate a new server descriptor and a new extra-info - document whenever any of the following events have occurred: - - - A period of time (18 hrs by default) has passed since the last - time a descriptor was generated. - - - A descriptor field other than bandwidth or uptime has changed. - - - Its uptime is less than 24h and bandwidth has changed by a factor of 2 - from the last time a descriptor was generated, and at least a given - interval of time (3 hours by default) has passed since then. - - - Its uptime has been reset (by restarting). - - - It receives a networkstatus consensus in which it is not listed. - - - It receives a networkstatus consensus in which it is listed - with the StaleDesc flag. - - [XXX this list is incomplete; see router_differences_are_cosmetic() - in routerlist.c for others] - - ORs SHOULD NOT publish a new server descriptor or extra-info document - if none of the above events have occurred and not much time has passed - (12 hours by default). - - Tor versions older than 0.3.5.1-alpha ignore uptime when checking for - bandwidth changes. - - After generating a descriptor, ORs upload them to every directory - authority they know, by posting them (in order) to the URL - - http:///tor/ - - Server descriptors may not exceed 20,000 bytes in length; extra-info - documents may not exceed 50,000 bytes in length. If they do, the - authorities SHOULD reject them. - -2.1.1. Server descriptor format - - Server descriptors consist of the following items. - - In lines that take multiple arguments, extra arguments SHOULD be - accepted and ignored. Many of the nonterminals below are defined in - section 2.1.3. 
- - Note that many versions of Tor will generate an extra newline at the - end of their descriptors. Implementations MUST tolerate one or - more blank lines at the end of a single descriptor or a list of - concatenated descriptors. New implementations SHOULD NOT generate - such blank lines. - - "router" nickname address ORPort SOCKSPort DirPort NL - - [At start, exactly once.] - - Indicates the beginning of a server descriptor. "nickname" must be a - valid router nickname as specified in section 2.1.3. "address" must - be an IPv4 - address in dotted-quad format. The last three numbers indicate the - TCP ports at which this OR exposes functionality. ORPort is a port at - which this OR accepts TLS connections for the main OR protocol; - SOCKSPort is deprecated and should always be 0; and DirPort is the - port at which this OR accepts directory-related HTTP connections. If - any port is not supported, the value 0 is given instead of a port - number. (At least one of DirPort and ORPort SHOULD be set; - authorities MAY reject any descriptor with both DirPort and ORPort of - 0.) - - "identity-ed25519" NL "-----BEGIN ED25519 CERT-----" NL certificate - "-----END ED25519 CERT-----" NL - - [Exactly once, in second position in document.] - [No extra arguments] - - The certificate is a base64-encoded Ed25519 certificate (see - cert-spec.txt) with terminating =s removed. When this element - is present, it MUST appear as the first or second element in - the router descriptor. - - The certificate has CERT_TYPE of [04]. It must include a - signed-with-ed25519-key extension (see cert-spec.txt, - section 2.2.1), so that we can extract the master identity key. - - [Before Tor 0.4.5.1-alpha, this field was optional.] - - "master-key-ed25519" SP MasterKey NL - - [Exactly once] - - Contains the base-64 encoded ed25519 master key as a single - argument. If it is present, it MUST match the identity key - in the identity-ed25519 entry. 
- - [Before Tor 0.4.5.1-alpha, this field was optional.] - - "bandwidth" bandwidth-avg bandwidth-burst bandwidth-observed NL - - [Exactly once] - - Estimated bandwidth for this router, in bytes per second. The - "average" bandwidth is the volume per second that the OR is willing to - sustain over long periods; the "burst" bandwidth is the volume that - the OR is willing to sustain in very short intervals. The "observed" - value is an estimate of the capacity this relay can handle. The - relay remembers the maximum output bandwidth it sustained over any - ten-second period in the past 5 days, and likewise the maximum - sustained input bandwidth. The "observed" value is the lesser of - these two numbers. - - Tor versions released before 2018 only kept bandwidth-observed for one - day. These versions are no longer supported or recommended. - - "platform" string NL - - [At most once] - - A human-readable string describing the system on which this OR is - running. This MAY include the operating system, and SHOULD include - the name and version of the software implementing the Tor protocol. - - "published" YYYY-MM-DD HH:MM:SS NL - - [Exactly once] - - The time, in UTC, when this descriptor (and its corresponding - extra-info document if any) was generated. - - "fingerprint" fingerprint NL - - [At most once] - - A fingerprint (a HASH_LEN-byte hash of the ASN.1-encoded public key, - encoded in hex, with a single space after every 4 characters) for this - router's identity key. A descriptor is considered invalid (and MUST be - rejected) if the fingerprint line does not match the public key. - - [We didn't start parsing this line until Tor 0.1.0.6-rc; it should - be marked with "opt" until earlier versions of Tor are obsolete.] - - "hibernating" bool NL - - [At most once] - - If the value is 1, then the Tor relay was hibernating when the - descriptor was published, and shouldn't be used to build circuits. 
- - [We didn't start parsing this line until Tor 0.1.0.6-rc; it should be - marked with "opt" until earlier versions of Tor are obsolete.] - - "uptime" number NL - - [At most once] - - The number of seconds that this OR process has been running. - - "onion-key" NL a public key in PEM format - - [Exactly once] - [No extra arguments] - - This key is used to encrypt CREATE cells for this OR. The key MUST be - accepted for at least 1 week after any new key is published in a - subsequent descriptor. It MUST be 1024 bits. - - The key encoding is the encoding of the key as a PKCS#1 RSAPublicKey - structure, encoded in base64, and wrapped in "-----BEGIN RSA PUBLIC - KEY-----" and "-----END RSA PUBLIC KEY-----". - - "onion-key-crosscert" NL a RSA signature in PEM format. - - [Exactly once] - [No extra arguments] - - This element contains an RSA signature, generated using the - onion-key, of the following: - - A SHA1 hash of the RSA identity key, - i.e. RSA key from "signing-key" (see below) [20 bytes] - The Ed25519 identity key, - i.e. Ed25519 key from "master-key-ed25519" [32 bytes] - - If there is no Ed25519 identity key, or if in some future version - there is no RSA identity key, the corresponding field must be - zero-filled. - - Parties verifying this signature MUST allow additional data - beyond the 52 bytes listed above. - - This signature proves that the party creating the descriptor - had control over the secret key corresponding to the - onion-key. - - [Before Tor 0.4.5.1-alpha, this field was optional whenever - identity-ed25519 was absent.] - - "ntor-onion-key" base-64-encoded-key - - [Exactly once] - - A curve25519 public key used for the ntor circuit extended - handshake. It's the standard encoding of the OR's curve25519 - public key, encoded in base 64. The trailing '=' sign MAY be - omitted from the base64 encoding. The key MUST be accepted - for at least 1 week after any new key is published in a - subsequent descriptor. 
- - [Before Tor 0.4.5.1-alpha, this field was optional.] - - "ntor-onion-key-crosscert" SP Bit NL - "-----BEGIN ED25519 CERT-----" NL certificate - "-----END ED25519 CERT-----" NL - - [Exactly once] - [No extra arguments] - - A signature created with the ntor-onion-key, using the - certificate format documented in cert-spec.txt, with type - [0a]. The signed key here is the master identity key. - - Bit must be "0" or "1". It indicates the sign of the ed25519 - public key corresponding to the ntor onion key. If Bit is "0", - then implementations MUST guarantee that the x-coordinate of - the resulting ed25519 public key is positive. Otherwise, if - Bit is "1", then the sign of the x-coordinate MUST be negative. - - To compute the ed25519 public key corresponding to a curve25519 - key, and for further explanation on key formats, see appendix C. - - This signature proves that the party creating the descriptor - had control over the secret key corresponding to the - ntor-onion-key. - - [Before Tor 0.4.5.1-alpha, this field was optional whenever - identity-ed25519 was absent.] - - "signing-key" NL a public key in PEM format - - [Exactly once] - [No extra arguments] - - The OR's long-term RSA identity key. It MUST be 1024 bits. - - The encoding is as for "onion-key" above. - - "accept" exitpattern NL - "reject" exitpattern NL - - [Any number] - - These lines describe an "exit policy": the rules that an OR follows - when deciding whether to allow a new stream to a given address. The - 'exitpattern' syntax is described below. There MUST be at least one - such entry. The rules are considered in order; if no rule matches, - the address will be accepted. For clarity, the last such entry SHOULD - be accept *:* or reject *:*. - - "ipv6-policy" SP ("accept" / "reject") SP PortList NL - - [At most once.] - - An exit-policy summary as specified in sections 3.4.1 and 3.8.2, - summarizing - the router's rules for connecting to IPv6 addresses. 
A missing - "ipv6-policy" line is equivalent to "ipv6-policy reject - 1-65535". - - "overload-general" SP version SP YYYY-MM-DD HH:MM:SS NL - - [At most once.] - - Indicates that a relay has reached an "overloaded state", which is - triggered by one or more of the following load metrics: - - - Any OOM invocation due to memory pressure - - Any ntor onionskins are dropped - - TCP port exhaustion - - The timestamp is when at least one of these metrics was detected. It - should always be on the hour and thus, as an example, "2020-01-10 13:00:00" - is an expected timestamp. Because this is a binary state, if the line is - present, we consider that it was hit at the very least once somewhere - between the provided timestamp and the "published" timestamp of the - document, which is when the document was generated. - - The overload-general line should remain in place for 72 hours after it - was last triggered. If the limits are reached again in this period, the - timestamp is updated, and this 72 hour period restarts. - - The 'version' field is set to '1' for now. - - (Introduced in tor-0.4.6.1-alpha, but moved from extra-info to general - descriptor in tor-0.4.6.2-alpha) - - "router-sig-ed25519" SP Signature NL - - [Exactly once.] - - It MUST be the next-to-last element in the descriptor, appearing - immediately before the RSA signature. It MUST contain an Ed25519 - signature of a SHA256 digest of the entire document. This digest is - taken from the first character up to and including the first space - after the "router-sig-ed25519" string. Before computing the digest, - the string "Tor router descriptor signature v1" is prefixed to the - document. - - The signature is encoded in Base64, with terminating =s removed. - - The signing key in the identity-ed25519 certificate MUST - be the one used to sign the document. - - [Before Tor 0.4.5.1-alpha, this field was optional whenever - identity-ed25519 was absent.] 
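The digest covered by "router-sig-ed25519" can be sketched as follows. This is an illustrative Python fragment under stated assumptions, not Tor's implementation: the function names are hypothetical, the descriptor is assumed to be available as raw bytes, and the sketch assumes the "router-sig-ed25519" keyword appears at the start of a line. Verifying the Ed25519 signature itself needs a cryptographic library and is left out.

```python
import base64
import hashlib

# Prefix string and item keyword taken from the item description above.
PREFIX = b"Tor router descriptor signature v1"
MARKER = b"\nrouter-sig-ed25519 "  # keyword at line start, plus first space


def ed25519_signed_digest(descriptor: bytes) -> bytes:
    """Return the SHA256 digest that the Ed25519 signature is made over."""
    idx = descriptor.find(MARKER)
    if idx < 0:
        raise ValueError("no router-sig-ed25519 item found")
    # First character of the document up to and including the first space
    # after the "router-sig-ed25519" string.
    signed_part = descriptor[: idx + len(MARKER)]
    return hashlib.sha256(PREFIX + signed_part).digest()


def decode_unpadded_base64(value: str) -> bytes:
    """Decode Base64 whose terminating '=' characters were removed."""
    return base64.b64decode(value + "=" * (-len(value) % 4))
```

A verifier would decode the Signature argument with decode_unpadded_base64 and check it against this digest using the signing key from the identity-ed25519 certificate.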
- - "router-signature" NL Signature NL - - [At end, exactly once] - [No extra arguments] - - The "SIGNATURE" object contains a signature of the PKCS1-padded - hash of the entire server descriptor, taken from the beginning of the - "router" line, through the newline after the "router-signature" line. - The server descriptor is invalid unless the signature is performed - with the router's identity key. - - "contact" info NL - - [At most once] - - Describes a way to contact the relay's administrator, preferably - including an email address and a PGP key fingerprint. - - "bridge-distribution-request" SP Method NL - - [At most once, bridges only.] - - The "Method" describes how a Bridge address is distributed by - BridgeDB. Recognized methods are: "none", "any", "https", "email", - "moat". If set to "none", BridgeDB will avoid distributing your bridge - address. If set to "any", BridgeDB will choose how to distribute your - bridge address. Choosing any of the other methods will tell BridgeDB to - distribute your bridge via a specific method: - - - "https" specifies distribution via the web interface at - https://bridges.torproject.org; - - "email" specifies distribution via the email autoresponder at - bridges@torproject.org; - - "moat" specifies distribution via an interactive menu inside Tor - Browser; and - - Potential future "Method" specifiers must be as follows: - Method = (KeywordChar | "_") + - - All bridges SHOULD include this line. Non-bridges MUST NOT include - it. - - BridgeDB SHOULD treat unrecognized Method values as if they were - "none". - - (Default: "any") - - [This line was introduced in 0.3.2.3-alpha, with a minimal backport - to 0.2.5.16, 0.2.8.17, 0.2.9.14, 0.3.0.13, 0.3.1.9, and later.] - - "family" names NL - - [At most once] - - 'Names' is a space-separated list of relay nicknames or - hexdigests. If two ORs list one another in their "family" entries, - then OPs should treat them as a single OR for the purpose of path - selection. 
- - For example, if node A's descriptor contains "family B", and node B's - descriptor contains "family A", then node A and node B should never - be used on the same circuit. - - "read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL - [At most once] - "write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL - [At most once] - - (These fields once appeared in router descriptors, but have - appeared in extra-info descriptors since 0.2.0.x.) - - "eventdns" bool NL - - [At most once] - - Declare whether this version of Tor is using the newer enhanced - dns logic. Versions of Tor with this field set to false SHOULD NOT - be used for reverse hostname lookups. - - [This option is obsolete. All Tor current relays should be presumed - to have the evdns backend.] - - "caches-extra-info" NL - - [At most once.] - [No extra arguments] - - Present only if this router is a directory cache that provides - extra-info documents. - - [Versions before 0.2.0.1-alpha don't recognize this] - - "extra-info-digest" SP sha1-digest [SP sha256-digest] NL - - [At most once] - - "sha1-digest" is a hex-encoded SHA1 digest (using upper-case characters) - of the router's extra-info document, as signed in the router's - extra-info (that is, not including the signature). (If this field is - absent, the router is not uploading a corresponding extra-info - document.) - - "sha256-digest" is a base64-encoded SHA256 digest of the extra-info - document. Unlike the "sha1-digest", this digest is calculated over the - entire document, including the signature. This difference is due to - a long-lived bug in the tor implementation that it would be difficult - to roll out an incremental fix for, not a design choice. Future digest - algorithms specified should not include the signature in the data used - to compute the digest. - - [Versions before 0.2.7.2-alpha did not include a SHA256 digest.] - [Versions before 0.2.0.1-alpha don't recognize this field at all.] 
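The two digests in "extra-info-digest" cover different spans of the extra-info document, which the following Python sketch makes explicit. It is illustrative only: the function name is hypothetical, it assumes the extra-info document's Signature Item is a "router-signature" line with no arguments, and it keeps Python's standard Base64 padding because the text above does not say whether padding is removed.

```python
import base64
import hashlib
from typing import Tuple

# Assumed Signature Item line of an extra-info document.
SIGNATURE_ITEM = b"\nrouter-signature\n"


def extra_info_digests(document: bytes) -> Tuple[str, str]:
    """Return (sha1-digest, sha256-digest) for one extra-info document."""
    idx = document.find(SIGNATURE_ITEM)
    if idx < 0:
        raise ValueError("no router-signature item found")
    # SHA1 covers the document "as signed": from the start through the
    # newline after the Signature Item's keyword, excluding the signature.
    signed_part = document[: idx + len(SIGNATURE_ITEM)]
    sha1_hex = hashlib.sha1(signed_part).hexdigest().upper()
    # SHA256 covers the entire document, including the signature object.
    sha256_b64 = base64.b64encode(hashlib.sha256(document).digest())
    return sha1_hex, sha256_b64.decode("ascii")
```

Comparing both values against the "extra-info-digest" line of the matching server descriptor is then a plain string comparison, after any padding normalization the caller chooses.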
- - "hidden-service-dir" NL - - [At most once.] - - Present only if this router stores and serves hidden service - descriptors. This router supports the descriptor versions declared - in the HSDir "proto" entry. If there is no "proto" entry, this - router supports version 2 descriptors. - - "protocols" SP "Link" SP LINK-VERSION-LIST SP "Circuit" SP - CIRCUIT-VERSION-LIST NL - - [At most once.] - - An obsolete list of protocol versions, superseded by the "proto" - entry. This list was never parsed, and has not been emitted - since Tor 0.2.9.4-alpha. New code should neither generate nor - parse this line. - - "allow-single-hop-exits" NL - - [At most once.] - [No extra arguments] - - Present only if the router allows single-hop circuits to make exit - connections. Most Tor relays do not support this: this is - included for specialized controllers designed to support perspective - access and such. This is obsolete in tor version >= 0.3.1.0-alpha. - - "or-address" SP ADDRESS ":" PORT NL - - [Any number] - - ADDRESS = IPV6ADDR | IPV4ADDR - IPV6ADDR = an ipv6 address, surrounded by square brackets. - IPV4ADDR = an ipv4 address, represented as a dotted quad. - PORT = a number between 1 and 65535 inclusive. - - An alternative for the address and ORPort of the "router" line, but with - two added capabilities: - - * or-address can be either an IPv4 or IPv6 address - * or-address allows for multiple ORPorts and addresses - - A descriptor SHOULD NOT include an or-address line that does nothing but - duplicate the address:port pair from its "router" line. - - The ordering of or-address lines and their PORT entries matters because - Tor MAY accept a limited number of address/port pairs. As of - Tor 0.2.3.x only the first address/port pair is advertised and used. - - "tunnelled-dir-server" NL - - [At most once.] - [No extra arguments] - - Present if the router accepts "tunneled" directory requests using a - BEGIN_DIR cell over the router's OR port. - (Added in 0.2.8.1-alpha. 
Before this, Tor relays accepted - tunneled directory requests only if they had a DirPort open, - or if they were bridges.) - - "proto" SP Entries NL - - [Exactly once.] - - Entries = - Entries = Entry - Entries = Entry SP Entries - - Entry = Keyword "=" Values - - Values = - Values = Value - Values = Value "," Values - - Value = Int - Value = Int "-" Int - - Int = NON_ZERO_DIGIT - Int = Int DIGIT - - Each 'Entry' in the "proto" line indicates that the Tor relay supports - one or more versions of the protocol in question. Entries should be - sorted by keyword. Values should be numerically ascending within each - entry. (This implies that there should be no overlapping ranges.) - Ranges should be represented as compactly as possible. Ints must be no - larger than 63. - - This field was first added in Tor 0.2.9.x. - - [Before Tor 0.4.5.1-alpha, this field was optional.] - - -2.1.2. Extra-info document format - - Extra-info documents consist of the following items: - - "extra-info" Nickname Fingerprint NL - [At start, exactly once.] - - Identifies what router this is an extra-info descriptor for. - Fingerprint is encoded in hex (using upper-case letters), with - no spaces. - - "identity-ed25519" - [As in router descriptors] - - "published" YYYY-MM-DD HH:MM:SS NL - - [Exactly once.] - - The time, in UTC, when this document (and its corresponding router - descriptor if any) was generated. It MUST match the published time - in the corresponding server descriptor. - - "read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL - [At most once.] - "write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL - [At most once.] - - Declare how much bandwidth the OR has used recently. Usage is divided - into intervals of NSEC seconds. The YYYY-MM-DD HH:MM:SS field - defines the end of the most recent interval. The numbers are the - number of bytes used in the most recent intervals, ordered from - oldest to newest. 
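To make the interval arithmetic concrete, a "read-history" or "write-history" line can be decoded roughly as follows. This is a Python sketch with hypothetical names (History, parse_history); it keeps the timestamp as a naive datetime and ignores any extra trailing arguments.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Tuple

HISTORY_RE = re.compile(
    r"^(read-history|write-history) "
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"\((\d+) s\)"
    r"(?: (\d+(?:,\d+)*))?"
)


@dataclass
class History:
    keyword: str
    interval_end: datetime  # end of the most recent interval
    nsec: int               # interval length in seconds
    values: List[int]       # bytes per interval, oldest to newest

    def intervals(self) -> List[Tuple[datetime, int]]:
        """Pair each byte count with the end time of its interval."""
        n = len(self.values)
        return [
            (self.interval_end - timedelta(seconds=self.nsec * (n - 1 - k)), v)
            for k, v in enumerate(self.values)
        ]


def parse_history(line: str) -> History:
    """Parse one bandwidth history line into a History record."""
    m = HISTORY_RE.match(line.rstrip("\n"))
    if m is None:
        raise ValueError(f"not a bandwidth history line: {line!r}")
    keyword, stamp, nsec, nums = m.groups()
    end = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
    values = [int(x) for x in nums.split(",")] if nums else []
    return History(keyword, end, int(nsec), values)
```

For a line with three values and NSEC of 900, the oldest value then belongs to the interval ending 1800 seconds before the stated timestamp.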
- - These fields include both IPv4 and IPv6 traffic. - - "ipv6-read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM... NL - [At most once] - "ipv6-write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM... NL - [At most once] - - Declare how much bandwidth the OR has used recently, on IPv6 - connections. See "read-history" and "write-history" for full details. - - "geoip-db-digest" Digest NL - [At most once.] - - SHA1 digest of the IPv4 GeoIP database file that is used to - resolve IPv4 addresses to country codes. - - "geoip6-db-digest" Digest NL - [At most once.] - - SHA1 digest of the IPv6 GeoIP database file that is used to - resolve IPv6 addresses to country codes. - - ("geoip-start-time" YYYY-MM-DD HH:MM:SS NL) - ("geoip-client-origins" CC=NUM,CC=NUM,... NL) - - Only generated by bridge routers (see blocking.pdf), and only - when they have been configured with a geoip database. - Non-bridges SHOULD NOT generate these fields. Contains a list - of mappings from two-letter country codes (CC) to the number - of clients that have connected to that bridge from that - country (approximate, and rounded up to the nearest multiple of 8 - in order to hamper traffic analysis). A country is included - only if it has at least one address. The time in - "geoip-start-time" is the time at which we began collecting geoip - statistics. - - "geoip-start-time" and "geoip-client-origins" have been replaced by - "bridge-stats-end" and "bridge-ips" in 0.2.2.4-alpha. The - reason is that the measurement interval with "geoip-stats" as - determined by subtracting "geoip-start-time" from "published" could - have had a variable length, whereas the measurement interval in - 0.2.2.4-alpha and later is set to be exactly 24 hours long. In - order to clearly distinguish the new measurement intervals from - the old ones, the new keywords have been introduced. - - "bridge-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL - [At most once.] 
- - YYYY-MM-DD HH:MM:SS defines the end of the included measurement - interval of length NSEC seconds (86400 seconds by default). - - A "bridge-stats-end" line, as well as any other "bridge-*" line, - is only added when the relay has been running as a bridge for at - least 24 hours. - - "bridge-ips" CC=NUM,CC=NUM,... NL - [At most once.] - - List of mappings from two-letter country codes to the number of - unique IP addresses that have connected from that country to the - bridge and which are not known relays, rounded up to the nearest - multiple of 8. - - "bridge-ip-versions" FAM=NUM,FAM=NUM,... NL - [At most once.] - - Number of unique IP addresses that have connected to the bridge, - per protocol family. - - "bridge-ip-transports" PT=NUM,PT=NUM,... NL - [At most once.] - - List of mappings from pluggable transport names to the number - of unique IP addresses that have connected using that - pluggable transport. Unobfuscated connections are counted - using the reserved pluggable transport name "<OR>" (without - quotes). If we received a connection from a transport proxy - but we couldn't figure out the name of the pluggable - transport, we use the reserved pluggable transport name - "<??>". - - ("<OR>" and "<??>" are reserved because normal pluggable - transport names MUST match the following regular expression: - "[a-zA-Z_][a-zA-Z0-9_]*" ) - - The pluggable transport name list is sorted into lexically - ascending order. - - If no clients have connected to the bridge yet, we only write - "bridge-ip-transports" to the stats file. - - "dirreq-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL - [At most once.] - - YYYY-MM-DD HH:MM:SS defines the end of the included measurement - interval of length NSEC seconds (86400 seconds by default). - - A "dirreq-stats-end" line, as well as any other "dirreq-*" line, - is only added when the relay has opened its Dir port and after 24 - hours of measuring directory requests. - - "dirreq-v2-ips" CC=NUM,CC=NUM,... NL - [At most once.]
- "dirreq-v3-ips" CC=NUM,CC=NUM,... NL - [At most once.] - - List of mappings from two-letter country codes to the number of - unique IP addresses that have connected from that country to - request a v2/v3 network status, rounded up to the nearest multiple - of 8. Only those IP addresses are counted that the directory can - answer with a 200 OK status code. (Note here and below: current Tor - versions, as of 0.2.5.2-alpha, no longer cache or serve v2 - networkstatus documents.) - - "dirreq-v2-reqs" CC=NUM,CC=NUM,... NL - [At most once.] - "dirreq-v3-reqs" CC=NUM,CC=NUM,... NL - [At most once.] - - List of mappings from two-letter country codes to the number of - requests for v2/v3 network statuses from that country, rounded up - to the nearest multiple of 8. Only those requests are counted that - the directory can answer with a 200 OK status code. - - "dirreq-v2-share" NUM% NL - [At most once.] - "dirreq-v3-share" NUM% NL - [At most once.] - - The share of v2/v3 network status requests that the directory - expects to receive from clients based on its advertised bandwidth - compared to the overall network bandwidth capacity. Shares are - formatted in percent with two decimal places. Shares are - calculated as means over the whole 24-hour interval. - - "dirreq-v2-resp" status=NUM,... NL - [At most once.] - "dirreq-v3-resp" status=NUM,... NL - [At most once.] - - List of mappings from response statuses to the number of requests - for v2/v3 network statuses that were answered with that response - status, rounded up to the nearest multiple of 4. Only response - statuses with at least 1 response are reported. New response - statuses can be added at any time. The current list of response - statuses is as follows: - - "ok": a network status request is answered; this number - corresponds to the sum of all requests as reported in - "dirreq-v2-reqs" or "dirreq-v3-reqs", respectively, before - rounding up. 
- "not-enough-sigs": a version 3 network status is not signed by a - sufficient number of requested authorities. - "unavailable": a requested network status object is unavailable. - "not-found": a requested network status is not found. - "not-modified": a network status has not been modified since the - If-Modified-Since time that is included in the request. - "busy": the directory is busy. - - "dirreq-v2-direct-dl" key=NUM,... NL - [At most once.] - "dirreq-v3-direct-dl" key=NUM,... NL - [At most once.] - "dirreq-v2-tunneled-dl" key=NUM,... NL - [At most once.] - "dirreq-v3-tunneled-dl" key=NUM,... NL - [At most once.] - - List of statistics about possible failures in the download process - of v2/v3 network statuses. Requests are either "direct" - HTTP-encoded requests over the relay's directory port, or - "tunneled" requests using a BEGIN_DIR cell over the relay's OR - port. The list of possible statistics can change, and statistics - can be left out from reporting. The current list of statistics is - as follows: - - Successful downloads and failures: - - "complete": a client has finished the download successfully. - "timeout": a download did not finish within 10 minutes after - starting to send the response. - "running": a download is still running at the end of the - measurement period, less than 10 minutes after starting to - send the response. - - Download times: - - "min", "max": smallest and largest measured bandwidth in B/s. - "d[1-4,6-9]": 1st to 4th and 6th to 9th decile of measured - bandwidth in B/s. For a given decile i, i/10 of all downloads - had a smaller bandwidth than di, and (10-i)/10 of all downloads - had a larger bandwidth than di. - "q[1,3]": 1st and 3rd quartile of measured bandwidth in B/s. One - fourth of all downloads had a smaller bandwidth than q1, one - fourth of all downloads had a larger bandwidth than q3, and the - remaining half of all downloads had a bandwidth between q1 and - q3. - "md": median of measured bandwidth in B/s.
Half of the downloads - had a smaller bandwidth than md, the other half had a larger - bandwidth than md. - - "dirreq-read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM... NL - [At most once] - "dirreq-write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM... NL - [At most once] - - Declare how much bandwidth the OR has spent on answering directory - requests. Usage is divided into intervals of NSEC seconds. The - YYYY-MM-DD HH:MM:SS field defines the end of the most recent - interval. The numbers are the number of bytes used in the most - recent intervals, ordered from oldest to newest. - - "entry-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL - [At most once.] - - YYYY-MM-DD HH:MM:SS defines the end of the included measurement - interval of length NSEC seconds (86400 seconds by default). - - An "entry-stats-end" line, as well as any other "entry-*" - line, is first added after the relay has been running for at least - 24 hours. - - "entry-ips" CC=NUM,CC=NUM,... NL - [At most once.] - - List of mappings from two-letter country codes to the number of - unique IP addresses that have connected from that country to the - relay and which are not known to be other relays, rounded up to the - nearest multiple of 8. - - "cell-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL - [At most once.] - - YYYY-MM-DD HH:MM:SS defines the end of the included measurement - interval of length NSEC seconds (86400 seconds by default). - - A "cell-stats-end" line, as well as any other "cell-*" line, - is first added after the relay has been running for at least 24 - hours. - - "cell-processed-cells" NUM,...,NUM NL - [At most once.] - - Mean number of processed cells per circuit, subdivided into - deciles of circuits by the number of cells they have processed in - descending order from loudest to quietest circuits. - - "cell-queued-cells" NUM,...,NUM NL - [At most once.] - - Mean number of cells contained in queues by circuit decile.
These - means are calculated by 1) determining the mean number of cells in - a single circuit between its creation and its termination and 2) - calculating the mean for all circuits in a given decile as - determined in "cell-processed-cells". Numbers have a precision of - two decimal places. - - Note that this statistic can be inaccurate for circuits that had - queued cells at the start or end of the measurement interval. - - "cell-time-in-queue" NUM,...,NUM NL - [At most once.] - - Mean time cells spend in circuit queues in milliseconds. Times are - calculated by 1) determining the mean time cells spend in the - queue of a single circuit and 2) calculating the mean for all - circuits in a given decile as determined in - "cell-processed-cells". - - Note that this statistic can be inaccurate for circuits that had - queued cells at the start or end of the measurement interval. - - "cell-circuits-per-decile" NUM NL - [At most once.] - - Mean number of circuits that are included in any of the deciles, - rounded up to the next integer. - - "conn-bi-direct" YYYY-MM-DD HH:MM:SS (NSEC s) BELOW,READ,WRITE,BOTH NL - [At most once] - - Number of connections, split into 10-second intervals, that are - used uni-directionally or bi-directionally as observed in the NSEC - seconds (usually 86400 seconds) before YYYY-MM-DD HH:MM:SS. Every - 10 seconds, we determine for every connection whether we read and - wrote less than a threshold of 20 KiB (BELOW), read at least 10 - times more than we wrote (READ), wrote at least 10 times more than - we read (WRITE), or read and wrote more than the threshold, but - not 10 times more in either direction (BOTH). After classifying a - connection, read and write counters are reset for the next - 10-second interval. - - This measurement includes both IPv4 and IPv6 connections. 
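The "conn-bi-direct" classification described above can be sketched as a small function applied to each connection once per 10-second interval. This is an illustrative sketch only: the function names are hypothetical, and comparing the sum of bytes read and written against the 20 KiB threshold is an assumption, since the text does not say whether the threshold applies to the sum or to each direction.

```python
THRESHOLD_BYTES = 20 * 1024  # 20 KiB threshold from the text
FACTOR = 10                  # "at least 10 times more"

def classify_connection(read_bytes, written_bytes):
    """Classify one connection's traffic for one 10-second interval."""
    # Assumption: the threshold is applied to read + written combined.
    if read_bytes + written_bytes < THRESHOLD_BYTES:
        return "BELOW"
    if read_bytes >= FACTOR * written_bytes:
        return "READ"
    if written_bytes >= FACTOR * read_bytes:
        return "WRITE"
    return "BOTH"

def tally(samples):
    """Count classifications over (read_bytes, written_bytes) samples."""
    counts = {"BELOW": 0, "READ": 0, "WRITE": 0, "BOTH": 0}
    for read_bytes, written_bytes in samples:
        counts[classify_connection(read_bytes, written_bytes)] += 1
    return counts
```

After each classification the caller would reset the per-connection counters, as the text requires, before starting the next 10-second interval.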
- - "ipv6-conn-bi-direct" YYYY-MM-DD HH:MM:SS (NSEC s) BELOW,READ,WRITE,BOTH NL - [At most once] - - Number of IPv6 connections that are used uni-directionally or - bi-directionally. See "conn-bi-direct" for more details. - - "exit-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL - [At most once.] - - YYYY-MM-DD HH:MM:SS defines the end of the included measurement - interval of length NSEC seconds (86400 seconds by default). - - An "exit-stats-end" line, as well as any other "exit-*" line, is - first added after the relay has been running for at least 24 hours - and only if the relay permits exiting (where exiting to a single - port and IP address is sufficient). - - "exit-kibibytes-written" port=N,port=N,... NL - [At most once.] - "exit-kibibytes-read" port=N,port=N,... NL - [At most once.] - - List of mappings from ports to the number of kibibytes that the - relay has written to or read from exit connections to that port, - rounded up to the next full kibibyte. Relays may limit the - number of listed ports and subsume any remaining kibibytes under - port "other". - - "exit-streams-opened" port=N,port=N,... NL - [At most once.] - - List of mappings from ports to the number of opened exit streams - to that port, rounded up to the nearest multiple of 4. Relays may - limit the number of listed ports and subsume any remaining opened - streams under port "other". - - "hidserv-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL - [At most once.] - "hidserv-v3-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL - [At most once.] - - YYYY-MM-DD HH:MM:SS defines the end of the included measurement - interval of length NSEC seconds (86400 seconds by default). - - A "hidserv-stats-end" line, as well as any other "hidserv-*" line, - is first added after the relay has been running for at least 24 - hours. - - (Introduced in tor-0.4.6.1-alpha) - - "hidserv-rend-relayed-cells" SP NUM SP key=val SP key=val ... NL - [At most once.] - "hidserv-rend-v3-relayed-cells" SP NUM SP key=val SP key=val ... 
NL - [At most once.] - - Approximate number of RELAY cells seen in either direction on a - circuit after receiving and successfully processing a RENDEZVOUS1 - cell. - - The original measurement value is obfuscated in several steps: - first, it is rounded up to the nearest multiple of 'bin_size' - which is reported in the key=val part of this line; second, a - (possibly negative) noise value is added to the result of the - first step by randomly sampling from a Laplace distribution with - mu = 0 and b = (delta_f / epsilon) with 'delta_f' and 'epsilon' - being reported in the key=val part, too; third, the result of the - previous obfuscation steps is truncated to the next smaller - integer and included as 'NUM'. Note that the overall reported - value can be negative. - - (Introduced in tor-0.4.6.1-alpha) - - "hidserv-dir-onions-seen" SP NUM SP key=val SP key=val ... NL - [At most once.] - "hidserv-dir-v3-onions-seen" SP NUM SP key=val SP key=val ... NL - [At most once.] - - Approximate number of unique hidden-service identities seen in - descriptors published to and accepted by this hidden-service - directory. - - The original measurement value is obfuscated in the same way as - the 'NUM' value reported in "hidserv-rend-relayed-cells", but - possibly with different parameters as reported in the key=val part - of this line. Note that the overall reported value can be - negative. - - (Introduced in tor-0.4.6.1-alpha) - - "transport" transportname address:port [arglist] NL - [Any number.] - - Signals that the router supports the 'transportname' pluggable - transport on IP address 'address' and TCP port 'port'. A single - descriptor MUST NOT have more than one transport line with the - same 'transportname'. - - Pluggable transports are only relevant to bridges, but these entries - can appear in non-bridge relays as well. - - "padding-counts" YYYY-MM-DD HH:MM:SS (NSEC s) key=NUM key=NUM ... NL - [At most once.]
- - YYYY-MM-DD HH:MM:SS defines the end of the included measurement - interval of length NSEC seconds (86400 seconds by default). Counts - are reset to 0 at the end of this interval. - - The keyword list is currently as follows: - - bin-size - - The current rounding value for cell count fields (10000 by - default) - write-drop - - The number of RELAY_DROP cells this relay sent - write-pad - - The number of CELL_PADDING cells this relay sent - write-total - - The total number of cells this relay sent - read-drop - - The number of RELAY_DROP cells this relay received - read-pad - - The number of CELL_PADDING cells this relay received - read-total - - The total number of cells this relay received - enabled-read-pad - - The number of CELL_PADDING cells this relay received on - connections that support padding - enabled-read-total - - The total number of cells this relay received on connections - that support padding - enabled-write-pad - - The number of CELL_PADDING cells this relay sent on - connections that support padding - enabled-write-total - - The total number of cells sent by this relay on connections - that support padding - max-chanpad-timers - - The maximum number of timers that this relay scheduled for - padding in the previous NSEC interval - - "overload-ratelimits" SP version SP YYYY-MM-DD SP HH:MM:SS - SP rate-limit SP burst-limit - SP read-overload-count SP write-overload-count NL - [At most once.] - - Indicates that a bandwidth limit was exhausted for this relay. - - The "rate-limit" and "burst-limit" are the raw values from the - BandwidthRate and BandwidthBurst found in the torrc configuration file. - - The "{read|write}-overload-count" are the counts of how many times the - reported limits of burst/rate were exhausted and thus the maximum - between the read and write count occurrences. To make the counter more - meaningful and to avoid multiple connections saturating the counter - when a relay is overloaded, we only increment it once a minute.
- - The 'version' field is set to '1' for now. - - (Introduced in tor-0.4.6.1-alpha) - - "overload-fd-exhausted" SP version YYYY-MM-DD HH:MM:SS NL - [At most once.] - - Indicates that a file descriptor exhaustion was experienced by this - relay. - - The timestamp indicates that the maximum was reached between the - timestamp and the "published" timestamp of the document. - - This overload field should remain in place for 72 hours since last - triggered. If the limits are reached again in this period, the - timestamp is updated, and this 72 hour period restarts. - - The 'version' field is set to '1' for the initial implementation which - detects fd exhaustion only when a socket open fails. - - (Introduced in tor-0.4.6.1-alpha) - - "router-sig-ed25519" - [As in router descriptors] - - "router-signature" NL Signature NL - [At end, exactly once.] - [No extra arguments] - - A document signature as documented in section 1.3, using the - initial item "extra-info" and the final item "router-signature", - signed with the router's identity key. - -2.1.3. Nonterminals in server descriptors - - nickname ::= between 1 and 19 alphanumeric characters ([A-Za-z0-9]), - case-insensitive. - hexdigest ::= a '$', followed by 40 hexadecimal characters - ([A-Fa-f0-9]). [Represents a relay by the digest of its identity - key.] - - exitpattern ::= addrspec ":" portspec - portspec ::= "*" | port | port "-" port - port ::= an integer between 1 and 65535, inclusive. - - [Some implementations incorrectly generate ports with value 0. - Implementations SHOULD accept this, and SHOULD NOT generate it. - Connections to port 0 are never permitted.] - - addrspec ::= "*" | ip4spec | ip6spec - ip4spec ::= ip4 | ip4 "/" num_ip4_bits | ip4 "/" ip4mask - ip4 ::= an IPv4 address in dotted-quad format - ip4mask ::= an IPv4 mask in dotted-quad format - num_ip4_bits ::= an integer between 0 and 32 - ip6spec ::= ip6 | ip6 "/" num_ip6_bits - ip6 ::= an IPv6 address, surrounded by square brackets.
- num_ip6_bits ::= an integer between 0 and 128 - - bool ::= "0" | "1" - -3. Directory authority operation and formats - - Every authority has two keys used in this protocol: a signing key, and - an authority identity key. (Authorities also have a router identity - key used in their role as a router and by earlier versions of the - directory protocol.) The identity key is used from time to time to - sign new key certificates using new signing keys; it is very sensitive. - The signing key is used to sign key certificates and status documents. - -3.1. Creating key certificates - - Key certificates consist of the following items: - - "dir-key-certificate-version" version NL - - [At start, exactly once.] - - Determines the version of the key certificate. MUST be "3" for - the protocol described in this document. Implementations MUST - reject formats they don't understand. - - "dir-address" IPPort NL - [At most once] - - An IP:Port for this authority's directory port. - - "fingerprint" fingerprint NL - - [Exactly once.] - - Hexadecimal encoding without spaces based on the authority's - identity key. - - "dir-identity-key" NL a public key in PEM format - - [Exactly once.] - [No extra arguments] - - The long-term authority identity key for this authority. This key - SHOULD be at least 2048 bits long; it MUST NOT be shorter than - 1024 bits. - - "dir-key-published" YYYY-MM-DD HH:MM:SS NL - - [Exactly once.] - - The time (in UTC) when this document and corresponding key were - last generated. - - Implementations SHOULD reject certificates that are published - too far in the future, though they MAY tolerate some clock skew. - - "dir-key-expires" YYYY-MM-DD HH:MM:SS NL - - [Exactly once.] - - A time (in UTC) after which this key is no longer valid. - - Implementations SHOULD reject expired certificates, though they - MAY tolerate some clock skew. - - "dir-signing-key" NL a key in PEM format - - [Exactly once.] 
- [No extra arguments] - - The directory server's public signing key. This key MUST be at - least 1024 bits, and MAY be longer. - - "dir-key-crosscert" NL CrossSignature NL - - [Exactly once.] - [No extra arguments] - - CrossSignature is a signature, made using the certificate's signing - key, of the digest of the PKCS1-padded hash of the certificate's - identity key. For backward compatibility with broken versions of the - parser, we wrap the base64-encoded signature in -----BEGIN ID - SIGNATURE----- and -----END ID SIGNATURE----- tags. Implementations - MUST allow the "ID " portion to be omitted, however. - - Implementations MUST verify that the signature is a correct signature - of the hash of the identity key using the signing key. - - "dir-key-certification" NL Signature NL - - [At end, exactly once.] - [No extra arguments] - - A document signature as documented in section 1.3, using the - initial item "dir-key-certificate-version" and the final item - "dir-key-certification", signed with the authority identity key. - - Authorities MUST generate a new signing key and corresponding - certificate before the key expires. - -3.2. Accepting server descriptor and extra-info document uploads - - When a router posts a signed descriptor to a directory authority, the - authority first checks whether it is well-formed and correctly - self-signed. If it is, the authority next verifies that the nickname - in question is not already assigned to a router with a different - public key. - Finally, the authority MAY check that the router is not blacklisted - because of its key, IP, or another reason. - - An authority also keeps a record of all the Ed25519/RSA1024 - identity key pairs that it has seen before. It rejects any - descriptor that has a known Ed/RSA identity key that it has - already seen accompanied by a different RSA/Ed identity key - in an older descriptor.
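The key-pair record described in the paragraph above can be pictured as two lookup tables, one per key type. This is a minimal sketch under stated assumptions: the class and method names are hypothetical, identities are represented as opaque strings, and persistence across restarts is left out.

```python
class KeyPinRecord:
    """Remembers which Ed25519 and RSA identity keys were seen together."""

    def __init__(self):
        self._rsa_for_ed = {}  # Ed25519 identity -> RSA identity
        self._ed_for_rsa = {}  # RSA identity -> Ed25519 identity

    def check_and_record(self, ed_id, rsa_id):
        """Return True if the pair is acceptable, recording it if so.

        A pair is rejected when either key was previously seen
        alongside a different key of the other type.
        """
        known_rsa = self._rsa_for_ed.get(ed_id)
        known_ed = self._ed_for_rsa.get(rsa_id)
        if known_rsa is not None and known_rsa != rsa_id:
            return False
        if known_ed is not None and known_ed != ed_id:
            return False
        self._rsa_for_ed[ed_id] = rsa_id
        self._ed_for_rsa[rsa_id] = ed_id
        return True
```

A repeated upload with the same pair is accepted, while reusing either key with a new partner is rejected.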
- - At a future date, authorities will begin rejecting all - descriptors whose RSA key was previously accompanied by an - Ed25519 key, if the descriptor does not list an Ed25519 key. - - At a future date, authorities will begin rejecting all descriptors - that do not list an Ed25519 key. - - If the descriptor passes these tests, and the authority does not already - have a descriptor for a router with this public key, it accepts the - descriptor and remembers it. - - If the authority _does_ have a descriptor with the same public key, the - newly uploaded descriptor is remembered if its publication time is more - recent than the most recent old descriptor for that router, and either: - - - There are non-cosmetic differences between the old descriptor and the - new one. - - Enough time has passed between the descriptors' publication times. - (Currently, 2 hours.) - - Differences between server descriptors are "non-cosmetic" if they would be - sufficient to force an upload as described in section 2.1 above. - - Note that the "cosmetic difference" test only applies to uploaded - descriptors, not to descriptors that the authority downloads from other - authorities. - - When a router posts a signed extra-info document to a directory authority, - the authority again checks it for well-formedness and correct signature, - and checks that its digest matches the extra-info-digest in some router - descriptor that it believes is currently useful. If so, it accepts it and - stores it and serves it as requested. If not, it drops it. - - -3.3. Computing microdescriptors - - Microdescriptors are a stripped-down version of server descriptors - generated by the directory authorities which may additionally contain - authority-generated information. Microdescriptors contain only the - most relevant parts that clients care about. Microdescriptors are - expected to be relatively static and only change about once per week.
- Microdescriptors do not contain any information that clients need to - use to decide which servers to fetch information about, or which - servers to fetch information from. - - Microdescriptors are a straight transform from the server descriptor - and the consensus method. Microdescriptors have no header or footer. - Each microdescriptor is identified by the hash of its concatenated - elements without a signature by the router. Microdescriptors do not - contain any version information, because their version is determined - by the consensus method. - - Starting with consensus method 8, microdescriptors contain the - following elements taken from or based on the server descriptor. Order - matters here, because different directory authorities must be able to - transform a given server descriptor and consensus method into the exact - same microdescriptor. - - "onion-key" NL a public key in PEM format - - [Exactly once, at start] - [No extra arguments] - - The "onion-key" element as specified in section 2.1.1. - - When generating microdescriptors for consensus method 30 or later, - the trailing = sign must be absent. For consensus method 29 or - earlier, the trailing = sign must be present. - - "ntor-onion-key" SP base-64-encoded-key NL - - [Exactly once] - - The "ntor-onion-key" element as specified in section 2.1.1. - - (Only included when generating microdescriptors for - consensus-method 16 or later.) - - [Before Tor 0.4.5.1-alpha, this field was optional.] - - "a" SP address ":" port NL - - [Any number] - - Additional advertised addresses for the OR. - - Present currently only if the OR advertises at least one IPv6 - address; currently, the first address is included and all others are - omitted. Any other IPv4 or IPv6 addresses should be ignored. - - Address and port are as for "or-address" as specified in - section 2.1.1. - - (Only included when generating microdescriptors for - consensus-methods 14 to 27.)
- - "family" names NL - - [At most once] - - The "family" element as specified in section 2.1.1. - - When generating microdescriptors for consensus method 29 or later, - the following canonicalization algorithm is applied to improve - compression: - - For all entries of the form $hexid=name or $hexid~name, - remove the =name or ~name portion. - - Remove all entries of the form $hexid, where hexid is not - 40 hexadecimal characters long. - - If an entry is a valid nickname, put it into lower case. - - If an entry is a valid $hexid, put it into upper case. - - If there are any entries, add a single $hexid entry for - the relay in question, so that it is a member of its own - family. - - Sort all entries in lexical order. - - Remove duplicate entries. - - (Note that if an entry is not of the form "nickname", "$hexid", - "$hexid=nickname" or "$hexid~nickname", then it will be unchanged: - this is what makes the algorithm forward-compatible.) - - "p" SP ("accept" / "reject") SP PortList NL - - [Exactly once.] - - The exit-policy summary as specified in sections 3.4.1 and 3.8.2. - - [With microdescriptors, clients don't learn exact exit policies: - clients can only guess whether a relay accepts their request, try the - BEGIN request, and might get end-reason-exit-policy if they guessed - wrong, in which case they'll have to try elsewhere.] - - [In consensus methods before 5, this line was omitted.] - - "p6" SP ("accept" / "reject") SP PortList NL - - [At most once] - - The IPv6 exit policy summary as specified in sections 3.4.1 and - 3.8.2. A missing "p6" line is equivalent to "p6 reject 1-65535". - - (Only included when generating microdescriptors for - consensus-method 15 or later.) - - "id" SP "rsa1024" SP base64-encoded-identity-digest NL - - [At most once] - - The node identity digest (as described in tor-spec.txt), base64 - encoded, without trailing =s. This line is included to prevent - collisions between microdescriptors. 
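The canonicalization steps listed for the "family" element in this section can be sketched as follows. This is one reading of the steps, not the reference implementation: the function name is hypothetical, the suffix stripping is applied before the length check, and the relay's own entry is added only if entries remain after filtering.

```python
import re

NICKNAME_RE = re.compile(r"^[A-Za-z0-9]{1,19}$")
HEX_ONLY_RE = re.compile(r"^\$[0-9A-Fa-f]*$")
HEX_NAMED_RE = re.compile(r"^(\$[0-9A-Fa-f]+)[=~][A-Za-z0-9]{1,19}$")

def canonicalize_family(entries, own_hexid):
    """Canonicalize family entries; own_hexid is 40 hex chars, no '$'."""
    out = []
    for entry in entries:
        # Strip the =name or ~name portion from $hexid=name / $hexid~name.
        named = HEX_NAMED_RE.match(entry)
        if named:
            entry = named.group(1)
        if HEX_ONLY_RE.match(entry):
            # Drop $hexid entries that are not 40 hex characters long.
            if len(entry) != 41:
                continue
            entry = entry.upper()
        elif NICKNAME_RE.match(entry):
            entry = entry.lower()
        # Anything else is kept unchanged (forward compatibility).
        out.append(entry)
    if out:
        # The relay becomes a member of its own family.
        out.append("$" + own_hexid.upper())
    # Sort lexically and remove duplicates.
    return sorted(set(out))
```

Because unrecognized entries pass through untouched, a later entry syntax would survive canonicalization, which matches the forward-compatibility note in the text.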
- - Implementations SHOULD ignore these lines: they are - added to microdescriptors only to prevent collisions. - - (Only included when generating microdescriptors for - consensus-method 18 or later.) - - "id" SP "ed25519" SP base64-encoded-ed25519-identity NL - - [At most once] - - The node's master Ed25519 identity key, base64 encoded, - without trailing =s. - - All implementations MUST ignore this key for any microdescriptor - whose corresponding entry in the consensus includes the - 'NoEdConsensus' flag. - - (Only included when generating microdescriptors for - consensus-method 21 or later.) - - "id" SP keytype ... NL - - [At most once per distinct keytype.] - - Implementations MUST ignore "id" lines with unrecognized - key-types in place of "rsa1024" or "ed25519" - - "pr" SP Entries NL - - [Exactly once.] - - The "proto" element as specified in section 2.1.1. - - [Before Tor 0.4.5.1-alpha, this field was optional.] - - (Note that with microdescriptors, clients do not learn the RSA identity of - their routers: they only learn a hash of the RSA identity key. This is - all they need to confirm the actual identity key when doing a TLS - handshake, and all they need to put the identity key digest in their - CREATE cells.) - -3.4. Exchanging votes - - Authorities divide time into Intervals. Authority administrators SHOULD - try to all pick the same interval length, and SHOULD pick intervals that - are commonly used divisions of time (e.g., 5 minutes, 15 minutes, 30 - minutes, 60 minutes, 90 minutes). Voting intervals SHOULD be chosen to - divide evenly into a 24-hour day. - - Authorities SHOULD act according to interval and delays in the - latest consensus. Lacking a latest consensus, they SHOULD default to a - 30-minute Interval, a 5 minute VotingDelay, and a 5 minute DistDelay. - - Authorities MUST take pains to ensure that their clocks remain accurate - within a few seconds. (Running NTP is usually sufficient.) 
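The interval rules in this section (the first voting period starts at 00:00 UTC, and a final period that would be truncated by one-half or more is merged into the one before it) can be illustrated with a short calculation. This is a sketch only: the function names are hypothetical, times are expressed as seconds since midnight UTC, and the defaults are the 5-minute delays named above.

```python
DAY_SECONDS = 24 * 60 * 60

def voting_period_starts(interval_seconds):
    """Return the start offsets (seconds after 00:00 UTC) of each period."""
    if not 0 < interval_seconds <= DAY_SECONDS:
        raise ValueError("interval must be between 1 second and 1 day")
    starts = list(range(0, DAY_SECONDS, interval_seconds))
    last_length = DAY_SECONDS - starts[-1]
    # "Truncated by one-half or more" means the last period keeps at
    # most half of the nominal interval length; merge it if so.
    if len(starts) > 1 and 2 * last_length <= interval_seconds:
        starts.pop()
    return starts

def vote_publish_offset(period_start, vote_seconds=300, dist_seconds=300):
    """Offset at which a vote is published for the given period start."""
    return period_start - (vote_seconds + dist_seconds)
```

With an interval that divides the day evenly, such as the default 30 minutes, no merging happens; a negative publish offset simply falls on the previous day.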
- - The first voting period of each day begins at 00:00 (midnight) UTC. If - the last period of the day would be truncated by one-half or more, it is - merged with the second-to-last period. - - An authority SHOULD publish its vote immediately at the start of each voting - period (minus VoteSeconds+DistSeconds). It does this by making it - available at - - http://<hostname>/tor/status-vote/next/authority.z - - and sending it in an HTTP POST request to each other authority at the URL - - http://<hostname>/tor/post/vote - - If, at the start of the voting period, minus DistSeconds, an authority - does not have a current statement from another authority, the first - authority downloads the other's statement. - - Once an authority has a vote from another authority, it makes it available - at - - http://<hostname>/tor/status-vote/next/<fp>.z - - where <fp> is the fingerprint of the other authority's identity key. - And at - - http://<hostname>/tor/status-vote/next/d/<d>.z - - where <d> is the digest of the vote document. - - Also, once an authority receives a vote from another authority, it - examines it for new descriptors and fetches them from that authority. - This may be the only way for an authority to hear about relays that didn't - publish their descriptor to all authorities, and, while it's too late - for the authority to include relays in its current vote, it can include - them in its next vote. See section 3.6 below for details. - -3.4.1. Vote and consensus status document formats - - Votes and consensuses are more strictly formatted than other documents - in this specification, since different authorities must be able to - generate exactly the same consensus given the same set of votes. - - The procedure for deciding when to generate vote and consensus status - documents is described in section 1.4 on the voting timeline. - - Status documents contain a preamble, an authority section, a list of - router status entries, and one or more footer signatures, in that order.
- - Unlike other formats described above, a SP in these documents must be a - single space character (hex 20). - - Some items appear only in votes, and some items appear only in - consensuses. Unless specified, items occur in both. - - The preamble contains the following items. They SHOULD occur in the - order given here: - - "network-status-version" SP version NL - - [At start, exactly once.] - - A document format version. For this specification, the version is - "3". - - "vote-status" SP type NL - - [Exactly once.] - - The status MUST be "vote" or "consensus", depending on the type of - the document. - - "consensus-methods" SP IntegerList NL - - [At most once for votes; does not occur in consensuses.] - - A space-separated list of supported methods for generating - consensuses from votes. See section 3.8.1 for details. Absence of - the line means that only method "1" is supported. - - "consensus-method" SP Integer NL - - [At most once for consensuses; does not occur in votes.] - [No extra arguments] - - See section 3.8.1 for details. - - (Only included when the vote is generated with consensus-method 2 or - later.) - - "published" SP YYYY-MM-DD SP HH:MM:SS NL - - [Exactly once for votes; does not occur in consensuses.] - - The publication time for this status document (if a vote). - - "valid-after" SP YYYY-MM-DD SP HH:MM:SS NL - - [Exactly once.] - - The start of the Interval for this vote. Before this time, the - consensus document produced from this vote is not officially in - use. - - (Note that because of propagation delays, clients and relays - may see consensus documents that are up to `DistSeconds` - earlier than this time, and should not warn about them.) - - See section 1.4 for voting timeline information. - - "fresh-until" SP YYYY-MM-DD SP HH:MM:SS NL - - [Exactly once.] - - The time at which the next consensus should be produced; before this - time, there is no point in downloading another consensus, since there - won't be a new one. 
See section 1.4 for voting timeline information. - - "valid-until" SP YYYY-MM-DD SP HH:MM:SS NL - - [Exactly once.] - - The end of the Interval for this vote. After this time, all - clients should try to find a more recent consensus. See section 1.4 - for voting timeline information. - - In practice, clients continue to use the consensus for up to 24 hours - after it is no longer valid, if no more recent consensus can be - downloaded. - - "voting-delay" SP VoteSeconds SP DistSeconds NL - - [Exactly once.] - - VoteSeconds is the number of seconds that we will allow to collect - votes from all authorities; DistSeconds is the number of seconds - we'll allow to collect signatures from all authorities. See - section 1.4 for voting timeline information. - - "client-versions" SP VersionList NL - - [At most once.] - - A comma-separated list of recommended Tor versions for client - usage, in ascending order. The versions are given as defined by - version-spec.txt. If absent, no opinion is held about client - versions. - - "server-versions" SP VersionList NL - - [At most once.] - - A comma-separated list of recommended Tor versions for relay - usage, in ascending order. The versions are given as defined by - version-spec.txt. If absent, no opinion is held about server - versions. - - "package" SP PackageName SP Version SP URL SP DIGESTS NL - - [Any number of times.] - - For this element: - - PACKAGENAME = NONSPACE - VERSION = NONSPACE - URL = NONSPACE - DIGESTS = DIGEST | DIGESTS SP DIGEST - DIGEST = DIGESTTYPE "=" DIGESTVAL - NONSPACE = one or more non-space printing characters - DIGESTVAL = DIGESTTYPE = one or more non-space printing characters - other than "=". - - Indicates that a package called "package" of version VERSION may be - found at URL, and its digest as computed with DIGESTTYPE is equal to - DIGESTVAL. In consensuses, these lines are sorted lexically by - "PACKAGENAME VERSION" pairs, and DIGESTTYPES must appear in ascending - order. 
A consensus must not contain the same "PACKAGENAME VERSION" - more than once. If a vote contains the same "PACKAGENAME VERSION" - more than once, all but the last is ignored. - - Included in consensuses only for methods 19-33. Earlier methods - did not include this; method 34 removed it. - - "known-flags" SP FlagList NL - - [Exactly once.] - - A space-separated list of all of the flags that this document - might contain. A flag is "known" either because the authority - knows about them and might set them (if in a vote), or because - enough votes were counted for the consensus for an authoritative - opinion to have been formed about their status. - - "flag-thresholds" SP Thresholds NL - - [At most once for votes; does not occur in consensuses.] - - A space-separated list of the internal performance thresholds - that the directory authority had at the moment it was forming - a vote. - - The metaformat is: - Thresholds = Threshold | Threshold SP Thresholds - Threshold = ThresholdKey '=' ThresholdVal - ThresholdKey = (KeywordChar | "_") + - ThresholdVal = [0-9]+("."[0-9]+)? "%"? - - Commonly used Thresholds at this point include: - - "stable-uptime" -- Uptime (in seconds) required for a relay - to be marked as stable. - - "stable-mtbf" -- MTBF (in seconds) required for a relay to be - marked as stable. - - "enough-mtbf" -- Whether we have measured enough MTBF to look - at stable-mtbf instead of stable-uptime. - - "fast-speed" -- Bandwidth (in bytes per second) required for - a relay to be marked as fast. - - "guard-wfu" -- WFU (in seconds) required for a relay to be - marked as guard. - - "guard-tk" -- Weighted Time Known (in seconds) required for a - relay to be marked as guard. - - "guard-bw-inc-exits" -- If exits can be guards, then all guards - must have a bandwidth this high. - - "guard-bw-exc-exits" -- If exits can't be guards, then all guards - must have a bandwidth this high. 
- - "ignoring-advertised-bws" -- 1 if we have enough measured bandwidths - that we'll ignore the advertised bandwidth - claims of routers without measured bandwidth. - - "recommended-client-protocols" SP Entries NL - "recommended-relay-protocols" SP Entries NL - "required-client-protocols" SP Entries NL - "required-relay-protocols" SP Entries NL - - [At most once for each.] - - The "proto" element as specified in section 2.1.1. - - To vote on these entries, a protocol/version combination is included - only if it is listed by a majority of the voters. - - These lines should be voted on. A majority of votes is sufficient to - make a protocol un-supported. A supermajority of authorities (2/3) - are needed to make a protocol required. The required protocols - should not be torrc-configurable, but rather should be hardwired in - the Tor code. - - The tor-spec.txt section 9 details how a relay and a client should - behave when they encounter these lines in the consensus. - - "params" SP [Parameters] NL - - [At most once] - - Parameter ::= Keyword '=' Int32 - Int32 ::= A decimal integer between -2147483648 and 2147483647. - Parameters ::= Parameter | Parameters SP Parameter - - The parameters list, if present, contains a space-separated list of - case-sensitive key-value pairs, sorted in lexical order by their - keyword (as ASCII byte strings). Each parameter has its own meaning. - - (Only included when the vote is generated with consensus-method 7 or - later.) - - See param-spec.txt for a list of parameters and their meanings. - - "shared-rand-previous-value" SP NumReveals SP Value NL - - [At most once] - - NumReveals ::= An integer greater or equal to 0. - Value ::= Base64-encoded-data - - The shared_random_value that was generated during the second-to-last - shared randomness protocol run. For example, if this document was - created on the 5th of November, this field carries the shared random - value generated during the protocol run of the 3rd of November. 
- - See section [SRCALC] of srv-spec.txt for instructions on how to compute - this value, and see section [CONS] for why we include old shared random - values in votes and consensus. - - Value is the actual shared random value encoded in base64. It will - be exactly 256 bits long. NumReveals is the number of commits used - to generate this SRV. - - "shared-rand-current-value" SP NumReveals SP Value NL - - [At most once] - - NumReveals ::= An integer greater or equal to 0. - Value ::= Base64-encoded-data - - The shared_random_value that was generated during the latest shared - randomness protocol run. For example, if this document was created on - the 5th of November, this field carries the shared random value - generated during the protocol run of the 4th of November - - See section [SRCALC] of srv-spec.txt for instructions on how to compute - this value given the active commits. - - Value is the actual shared random value encoded in base64. It will - be exactly 256 bits long. NumReveals is the number of commits used to - generate this SRV. - - "bandwidth-file-headers" SP KeyValues NL - - [At most once for votes; does not occur in consensuses.] - - KeyValues ::= "" | KeyValue | KeyValues SP KeyValue - KeyValue ::= Keyword '=' Value - Value ::= ArgumentCharValue+ - ArgumentCharValue ::= any printing ASCII character except NL and SP. - - The headers from the bandwidth file used to generate this vote. - The bandwidth file headers are described in bandwidth-file-spec.txt. - - If an authority is not configured with a V3BandwidthsFile, this line - SHOULD NOT appear in its vote. - - If an authority is configured with a V3BandwidthsFile, but parsing - fails, this line SHOULD appear in its vote, but without any headers. - - First-appeared: Tor 0.3.5.1-alpha. - - "bandwidth-file-digest" 1*(SP algorithm "=" digest) NL - - [At most once for votes; does not occur in consensuses.] - - A digest of the bandwidth file used to generate this vote. 
- "algorithm" is the name of the hash algorithm producing "digest", - which can be "sha256" or another algorithm. "digest" is the - base64 encoding of the hash of the bandwidth file, with trailing =s - omitted. - - If an authority is not configured with a V3BandwidthsFile, this line - SHOULD NOT appear in its vote. - - If an authority is configured with a V3BandwidthsFile, but parsing - fails, this line SHOULD appear in its vote, with the digest(s) of the - unparseable file. - - First-appeared: Tor 0.4.0.4-alpha - - The authority section of a vote contains the following items, followed - in turn by the authority's current key certificate: - - "dir-source" SP nickname SP identity SP address SP IP SP dirport SP - orport NL - - [Exactly once, at start] - - Describes this authority. The nickname is a convenient identifier - for the authority. The identity is an uppercase hex fingerprint of - the authority's current (v3 authority) identity key. The address is - the server's hostname. The IP is the server's current IP address, - and dirport is its current directory port. The orport is the - port at that address where the authority listens for OR - connections. - - "contact" SP string NL - - [Exactly once] - - An arbitrary string describing how to contact the directory - server's administrator. Administrators should include at least an - email address and a PGP fingerprint. - - "legacy-dir-key" SP FINGERPRINT NL - - [At most once] - - Lists a fingerprint for an obsolete _identity_ key still used - by this authority to keep older clients working. This option - is used to keep key around for a little while in case the - authorities need to migrate many identity keys at once. - (Generally, this would only happen because of a security - vulnerability that affected multiple authorities, like the - Debian OpenSSL RNG bug of May 2008.) 
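As an illustration of the "dir-source" item described above, here is a minimal Python sketch of a parser for that line. It assumes fields are separated by exactly one space (as required for status documents) and checks only what the text states: six arguments and an uppercase hex identity. The names DirSource and parse_dir_source are hypothetical, and the sample values in the usage note are made up.

```python
import re
from collections import namedtuple

DirSource = namedtuple(
    "DirSource", "nickname identity address ip dirport orport")


def parse_dir_source(line):
    """Parse one "dir-source" line into a DirSource tuple.

    Raises ValueError if the line does not have the expected shape.
    """
    parts = line.rstrip("\n").split(" ")
    if len(parts) != 7 or parts[0] != "dir-source":
        raise ValueError("not a well-formed dir-source line")
    nickname, identity, address, ip, dirport, orport = parts[1:]
    # The identity is described as an uppercase hex fingerprint.
    if not re.fullmatch(r"[0-9A-F]+", identity):
        raise ValueError("identity must be uppercase hexadecimal")
    return DirSource(nickname, identity, address, ip,
                     int(dirport), int(orport))
```

For example, parse_dir_source("dir-source example " + "AB" * 20 + " example.invalid 192.0.2.1 80 443") yields a tuple whose dirport is 80 and whose orport is 443.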
- - "shared-rand-participate" NL - - [At most once] - - Denotes that the directory authority supports and can participate in the - shared random protocol. - - "shared-rand-commit" SP Version SP AlgName SP Identity SP Commit [SP Reveal] NL - - [Any number of times] - - Version ::= An integer greater or equal to 0. - AlgName ::= 1*(ALPHA / DIGIT / "_" / "-") - Identity ::= 40 * HEXDIG - Commit ::= Base64-encoded-data - Reveal ::= Base64-encoded-data - - Denotes a directory authority commit for the shared randomness - protocol, containing the commitment value and potentially also the - reveal value. See sections [COMMITREVEAL] and [VALIDATEVALUES] of - srv-spec.txt on how to generate and validate these values. - - Version is the current shared randomness protocol version. AlgName is - the hash algorithm that is used (e.g. "sha3-256") and Identity is the - authority's SHA1 v3 identity fingerprint. Commit is the encoded - commitment value in base64. Reveal is optional and if it's set, it - contains the reveal value in base64. - - If a vote contains multiple commits from the same authority, the - receiver MUST only consider the first commit listed. - - "shared-rand-previous-value" SP NumReveals SP Value NL - - [At most once] - - See shared-rand-previous-value description above. - - "shared-rand-current-value" SP NumReveals SP Value NL - - [At most once] - - See shared-rand-current-value description above. - - The authority section of a consensus contains groups of the following items, - in the order given, with one group for each authority that contributed to - the consensus, with groups sorted by authority identity digest: - - "dir-source" SP nickname SP identity SP address SP IP SP dirport SP - orport NL - - [Exactly once, at start] - - As in the authority section of a vote. - - "contact" SP string NL - - [Exactly once.] - - As in the authority section of a vote. - - "vote-digest" SP digest NL - - [Exactly once.] 
- - A digest of the vote from the authority that contributed to this - consensus, as signed (that is, not including the signature). - (Hex, upper-case.) - - For each "legacy-dir-key" in the vote, there is an additional "dir-source" - line containing that legacy key's fingerprint, the authority's nickname - with "-legacy" appended, and all other fields as in the main "dir-source" - line for that authority. These "dir-source" lines do not have - corresponding "contact" or "vote-digest" entries. - - Each router status entry contains the following items. Router status - entries are sorted in ascending order by identity digest. - - "r" SP nickname SP identity SP digest SP publication SP IP SP ORPort - SP DirPort NL - - [At start, exactly once.] - - "Nickname" is the OR's nickname. "Identity" is a hash of its - identity key, encoded in base64, with trailing equals sign(s) - removed. "Digest" is a hash of its most recent descriptor as - signed (that is, not including the signature) by the RSA identity - key (see section 1.3.), encoded in base64. - - "Publication" was once the publication time of the router's most - recent descriptor, in the form YYYY-MM-DD HH:MM:SS, in UTC. Now - it is only used in votes, and may be set to a fixed value in - consensus documents. Implementations SHOULD ignore this value - in non-vote documents. - - "IP" is its current IP address; ORPort is its current OR port, - "DirPort" is its current directory port, or "0" for "none". - - "a" SP address ":" port NL - - [Any number] - - The first advertised IPv6 address for the OR, if it is reachable. - - Present only if the OR advertises at least one IPv6 address, and the - authority believes that the first advertised address is reachable. - Any other IPv4 or IPv6 addresses should be ignored. - - Address and port are as for "or-address" as specified in - section 2.1.1. - - (Only included when the vote or consensus is generated with - consensus-method 14 or later.) - - "s" SP Flags NL - - [Exactly once.] 
- - A series of space-separated status flags, in lexical order (as ASCII - byte strings). Currently documented flags are: - - "Authority" if the router is a directory authority. - "BadExit" if the router is believed to be useless as an exit node - (because its ISP censors it, because it is behind a restrictive - proxy, or for some similar reason). - "Exit" if the router is more useful for building - general-purpose exit circuits than for relay circuits. The - path building algorithm uses this flag; see path-spec.txt. - "Fast" if the router is suitable for high-bandwidth circuits. - "Guard" if the router is suitable for use as an entry guard. - "HSDir" if the router is considered a v2 hidden service directory. - "MiddleOnly" if the router is considered unsuitable for - usage other than as a middle relay. Clients do not need - to handle this option, since when it is present, the authorities - will automatically vote against flags that would make the router - usable in other positions. (Since 0.4.7.2-alpha.) - "NoEdConsensus" if any Ed25519 key in the router's descriptor or - microdescriptor does not reflect authority consensus. - "Stable" if the router is suitable for long-lived circuits. - "StaleDesc" if the router should upload a new descriptor because - the old one is too old. - "Running" if the router is currently usable over all its published - ORPorts. (Authorities ignore IPv6 ORPorts unless configured to - check IPv6 reachability.) Relays without this flag are omitted - from the consensus, and current clients (since 0.2.9.4-alpha) - assume that every listed relay has this flag. - "Valid" if the router has been 'validated'. Clients before - 0.2.9.4-alpha would not use routers without this flag by - default. Currently, relays without this flag are omitted - from the consensus, and current (post-0.2.9.4-alpha) clients - assume that every listed relay has this flag. - "V2Dir" if the router implements the v2 directory protocol or - higher. 
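The ordering requirement on the "s" line above can be checked mechanically. The following Python sketch splits the line on single spaces and verifies that the flags are in lexical order when compared as ASCII byte strings. The function name is hypothetical, and the sketch does not decide whether repeated flags are acceptable, since the text above does not say.

```python
def parse_flags_line(line):
    """Parse an "s" line and return its flags as a list of strings.

    Raises ValueError if the keyword is wrong or the flags are not in
    lexical order when compared as ASCII byte strings.
    """
    parts = line.rstrip("\n").split(" ")
    if parts[0] != "s":
        raise ValueError("not an s line")
    flags = parts[1:]
    encoded = [flag.encode("ascii") for flag in flags]
    if encoded != sorted(encoded):
        raise ValueError("flags are not in lexical order")
    return flags
```

Note that "V2Dir" sorts before "Valid" under this comparison, because the byte for "2" is smaller than the byte for "a".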
- - "v" SP version NL - - [At most once.] - - The version of the Tor protocol that this relay is running. If - the value begins with "Tor" SP, the rest of the string is a Tor - version number, and the protocol is "The Tor protocol as supported - by the given version of Tor." Otherwise, if the value begins with - some other string, Tor has upgraded to a more sophisticated - protocol versioning system, and the protocol is "a version of the - Tor protocol more recent than any we recognize." - - Directory authorities SHOULD omit version strings they receive from - descriptors if they would cause "v" lines to be over 128 characters - long. - - "pr" SP Entries NL - - [At most once.] - - The "proto" family element as specified in section 2.1.1. - - During voting, authorities copy these lines immediately below the "v" - lines. When a descriptor does not contain a "proto" entry, the - authorities should reconstruct it using the approach described below - in section D. They are included in the consensus using the same rules - as currently used for "v" lines, if a sufficiently late consensus - method is in use. - - "w" SP "Bandwidth=" INT [SP "Measured=" INT] [SP "Unmeasured=1"] NL - - [At most once.] - - An estimate of the bandwidth of this relay, in an arbitrary - unit (currently kilobytes per second). Used to weight router - selection. See section 3.4.2 for details on how the value of - Bandwidth is determined in a consensus. - - Additionally, the Measured= keyword is present in votes by - participating bandwidth measurement authorities to indicate - a measured bandwidth currently produced by measuring stream - capacities. It does not occur in consensuses. - - 'Bandwidth=' and 'Measured=' values must be between 0 and - 2^32 - 1 inclusive. - - The "Unmeasured=1" value is included in consensuses generated - with method 17 or later when the 'Bandwidth=' value is not - based on a threshold of 3 or more measurements for this relay. 
- - Other weighting keywords may be added later. - Clients MUST ignore keywords they do not recognize. - - "p" SP ("accept" / "reject") SP PortList NL - - [At most once.] - - PortList = PortOrRange - PortList = PortList "," PortOrRange - PortOrRange = INT "-" INT / INT - - A list of those ports that this router supports (if 'accept') - or does not support (if 'reject') for exit to "most - addresses". - - "m" SP methods 1*(SP algorithm "=" digest) NL - - [Any number, only in votes.] - - Microdescriptor hashes for all consensus methods that an authority - supports and that use the same microdescriptor format. "methods" - is a comma-separated list of the consensus methods that the - authority believes will produce "digest". "algorithm" is the name - of the hash algorithm producing "digest", which can be "sha256" or - something else, depending on the consensus "methods" supporting - this algorithm. "digest" is the base64 encoding of the hash of - the router's microdescriptor with trailing =s omitted. - - "id" SP "ed25519" SP ed25519-identity NL - "id" SP "ed25519" SP "none" NL - [vote only, at most once] - - "stats" SP [KeyValues] NL - - [At most once. Vote only] - - KeyValue ::= Keyword '=' Number - Number ::= [0-9]+("."[0-9]+)? - KeyValues ::= KeyValue | KeyValues SP KeyValue - - Line containing various statistics that an authority has computed for - this relay. Each statistic is represented as a key/value pair. Reported keys - are: - - "wfu" - Weighted Fractional Uptime - "tk" - Weighted Time Known - "mtbf" - Mean Time Between Failures (stability) - - (As of tor-0.4.6.1-alpha) - - The footer section is delineated in all votes and consensuses supporting - consensus method 9 and above with the following: - - "directory-footer" NL - [No extra arguments] - - It contains two subsections, a bandwidth-weights line and a - directory-signature. (Prior to consensus method 9, footers only contained - directory-signatures without a 'directory-footer' line or - bandwidth-weights.) 
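To make the PortList grammar of the "p" line above concrete, here is a small Python sketch that parses such a line into a policy and a list of inclusive port ranges, and then answers whether a port is covered by the summary. The function names are hypothetical, and the sketch does not validate port bounds beyond requiring decimal digits.

```python
def parse_port_summary(line):
    """Parse a "p" line into (policy, [(low, high), ...])."""
    parts = line.rstrip("\n").split(" ")
    if len(parts) != 3 or parts[0] != "p":
        raise ValueError("not a well-formed p line")
    policy = parts[1]
    if policy not in ("accept", "reject"):
        raise ValueError("policy must be accept or reject")
    ranges = []
    for item in parts[2].split(","):
        low, sep, high = item.partition("-")
        if not sep:
            high = low  # a single port is a one-element range
        if not (low.isdigit() and high.isdigit()):
            raise ValueError("ports must be decimal integers")
        ranges.append((int(low), int(high)))
    return policy, ranges


def port_allowed(summary, port):
    """Return True if the summary says the port is supported for exit."""
    policy, ranges = summary
    listed = any(low <= port <= high for low, high in ranges)
    return listed if policy == "accept" else not listed
```

A "reject" summary inverts the test: a port is supported exactly when it is not listed.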
- - The bandwidth-weights line appears At Most Once for a consensus. It does - not appear in votes. - - "bandwidth-weights" [SP Weights] NL - - Weight ::= Keyword '=' Int32 - Int32 ::= A decimal integer between -2147483648 and 2147483647. - Weights ::= Weight | Weights SP Weight - - List of optional weights to apply to router bandwidths during path - selection. They are sorted in lexical order (as ASCII byte strings) and - values are divided by the consensus' "bwweightscale" param. The - known entries are: - - Wgg - Weight for Guard-flagged nodes in the guard position - Wgm - Weight for non-flagged nodes in the guard position - Wgd - Weight for Guard+Exit-flagged nodes in the guard position - - Wmg - Weight for Guard-flagged nodes in the middle position - Wmm - Weight for non-flagged nodes in the middle position - Wme - Weight for Exit-flagged nodes in the middle position - Wmd - Weight for Guard+Exit-flagged nodes in the middle position - - Weg - Weight for Guard-flagged nodes in the exit position - Wem - Weight for non-flagged nodes in the exit position - Wee - Weight for Exit-flagged nodes in the exit position - Wed - Weight for Guard+Exit-flagged nodes in the exit position - - Wgb - Weight for BEGIN_DIR-supporting Guard-flagged nodes - Wmb - Weight for BEGIN_DIR-supporting non-flagged nodes - Web - Weight for BEGIN_DIR-supporting Exit-flagged nodes - Wdb - Weight for BEGIN_DIR-supporting Guard+Exit-flagged nodes - - Wbg - Weight for Guard-flagged nodes for BEGIN_DIR requests - Wbm - Weight for non-flagged nodes for BEGIN_DIR requests - Wbe - Weight for Exit-flagged nodes for BEGIN_DIR requests - Wbd - Weight for Guard+Exit-flagged nodes for BEGIN_DIR requests - - These values are calculated as specified in section 3.8.3. - - The signature contains the following item, which appears Exactly Once - for a vote, and At Least Once for a consensus. 
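The "bandwidth-weights" line described above can be read with a few lines of Python. This sketch checks the Int32 range and the lexical ordering, and divides each value by a scale supplied by the caller (the consensus "bwweightscale" param). The function name is hypothetical, and the weights in the usage note are invented for illustration rather than taken from any real consensus.

```python
def parse_bandwidth_weights(line, bwweightscale):
    """Parse a "bandwidth-weights" line into {keyword: scaled weight}.

    bwweightscale is the value of the consensus "bwweightscale" param;
    the caller is assumed to have obtained it from the "params" line.
    """
    parts = line.rstrip("\n").split(" ")
    if parts[0] != "bandwidth-weights":
        raise ValueError("not a bandwidth-weights line")
    keys = []
    weights = {}
    for item in parts[1:]:
        key, sep, value = item.partition("=")
        if not sep:
            raise ValueError("weight must have the form Keyword=Int32")
        raw = int(value)
        if not -2**31 <= raw <= 2**31 - 1:
            raise ValueError("weight is outside the Int32 range")
        keys.append(key)
        weights[key] = raw / bwweightscale
    encoded = [key.encode("ascii") for key in keys]
    if encoded != sorted(encoded):
        raise ValueError("weights are not in lexical order")
    return weights
```

For example, with a scale of 10000, the made-up line "bandwidth-weights Wee=10000 Wgg=5000 Wmm=10000" gives Wgg a scaled weight of 0.5 and the other two a weight of 1.0.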
- - "directory-signature" [SP Algorithm] SP identity SP signing-key-digest - NL Signature - - This is a signature of the status document, with the initial item - "network-status-version", and the signature item - "directory-signature", using the signing key. (In this case, we take - the hash through the _space_ after directory-signature, not the - newline: this ensures that all authorities sign the same thing.) - "identity" is the hex-encoded digest of the authority identity key of - the signing authority, and "signing-key-digest" is the hex-encoded - digest of the current authority signing key of the signing authority. - - The Algorithm is one of "sha1" or "sha256" if it is present; - implementations MUST ignore directory-signature entries with an - unrecognized Algorithm. "sha1" is the default, if no Algorithm is - given. The algorithm describes how to compute the hash of the - document before signing it. - - "ns"-flavored consensus documents must contain only sha1 signatures. - Votes and microdescriptor documents may contain other signature - types. Note that only one signature from each authority should be - "counted" as meaning that the authority has signed the consensus. - - (Tor clients before 0.2.3.x did not understand the 'algorithm' - field.) - -3.4.2. Assigning flags in a vote - - (This section describes how directory authorities choose which status - flags to apply to routers. Later directory authorities MAY do things - differently, so long as clients keep working well. Clients MUST NOT - depend on the exact behaviors in this section.) - - In the below definitions, a router is considered "active" if it is - running, valid, and not hibernating. - - When we speak of a router's bandwidth in this section, we mean either - its measured bandwidth, or its advertised bandwidth. 
If a sufficient - threshold (configurable with MinMeasuredBWsForAuthToIgnoreAdvertised, - 500 by default) of routers have measured bandwidth values, then the - authority bases flags on _measured_ bandwidths, and treats nodes with - non-measured bandwidths as if their bandwidths were zero. Otherwise, - it uses measured bandwidths for nodes that have them, and advertised - bandwidths for other nodes. - - When computing thresholds based on percentiles of nodes, an authority - only considers nodes that are active, that have not been - omitted as a sybil (see below), and whose bandwidth is at least - 4 KB. Nodes that don't meet these criteria do not influence any - threshold calculations (including calculation of stability and uptime - and bandwidth thresholds) and also do not have their Exit status - change. - - "Valid" -- a router is 'Valid' if it is running a version of Tor not - known to be broken, and the directory authority has not blacklisted - it as suspicious. - - "Named" -- - "Unnamed" -- Directory authorities no longer assign these flags. - They were once used to determine whether a relay's nickname was - canonically linked to its public key. - - "Running" -- A router is 'Running' if the authority managed to connect to - it successfully within the last 45 minutes on all its published ORPorts. - Authorities check reachability on: - - * the IPv4 ORPort in the "r" line, and - * the IPv6 ORPort considered for the "a" line, if: - * the router advertises at least one IPv6 ORPort, and - * AuthDirHasIPv6Connectivity 1 is set on the authority. - - A minority of voting authorities that set AuthDirHasIPv6Connectivity will - drop unreachable IPv6 ORPorts from the full consensus. Consensus method 27 - in 0.3.3.x puts IPv6 ORPorts in the microdesc consensus, so that - authorities can drop unreachable IPv6 ORPorts from all consensus flavors. - Consensus method 28 removes IPv6 ORPorts from microdescriptors. 
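The bandwidth rule stated near the start of this section (measured versus advertised bandwidth) can be sketched in Python as below. The data layout and function name are hypothetical. The sketch also assumes that "a sufficient threshold" means "at least MinMeasuredBWsForAuthToIgnoreAdvertised routers", which the text does not spell out.

```python
def bandwidths_for_flags(routers, min_measured=500):
    """Pick the bandwidth value used when assigning flags.

    routers maps a router id to (measured, advertised), where measured
    is None when no measurement exists. min_measured stands in for
    MinMeasuredBWsForAuthToIgnoreAdvertised (500 by default).
    """
    n_measured = sum(1 for measured, _ in routers.values()
                     if measured is not None)
    # Assumption: the threshold is met when the count is >= min_measured.
    ignore_advertised = n_measured >= min_measured
    result = {}
    for router_id, (measured, advertised) in routers.items():
        if measured is not None:
            result[router_id] = measured
        elif ignore_advertised:
            result[router_id] = 0  # unmeasured routers count as zero
        else:
            result[router_id] = advertised
    return result
```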
- - "Stable" -- A router is 'Stable' if it is active, and either its Weighted - MTBF is at least the median for known active routers or its Weighted MTBF - corresponds to at least 7 days. Routers are never called Stable if they are - running a version of Tor known to drop circuits stupidly. (0.1.1.10-alpha - through 0.1.1.16-rc are stupid this way.) - - To calculate weighted MTBF, compute the weighted mean of the lengths - of all intervals when the router was observed to be up, weighting - intervals by $\alpha^n$, where $n$ is the amount of time that has - passed since the interval ended, and $\alpha$ is chosen so that - measurements over approximately one month old no longer influence the - weighted MTBF much. - - [XXXX what happens when we have less than 4 days of MTBF info.] - - "Exit" -- A router is called an 'Exit' iff it allows exits to at - least one /8 address space on each of ports 80 and 443. (Up until - Tor version 0.3.2, the flag was assigned if relays exit to at least - two of the ports 80, 443, and 6667.) - - "Fast" -- A router is 'Fast' if it is active, and its bandwidth is either in - the top 7/8ths for known active routers or at least 100KB/s. - - "Guard" -- A router is a possible Guard if all of the following apply: - - - It is Fast, - - It is Stable, - - Its Weighted Fractional Uptime is at least the median for "familiar" - active routers, - - It is "familiar", - - Its bandwidth is at least AuthDirGuardBWGuarantee (if set, 2 MB by - default), OR its bandwidth is among the 25% fastest relays, - - It qualifies for the V2Dir flag as described below (this - constraint was added in 0.3.3.x, because in 0.3.0.x clients - started avoiding guards that didn't also have the V2Dir flag). - - To calculate weighted fractional uptime, compute the fraction - of time that the router is up in any given day, weighting so that - downtime and uptime in the past counts less. 
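The weighted MTBF computation described under "Stable" above amounts to a weighted mean with exponentially decaying weights. A minimal Python sketch follows. It assumes each observed up-interval is given as a (length, age) pair, where age is the time since the interval ended, in whatever unit alpha was chosen for. The function name and the choice to return 0.0 for an empty history are assumptions, not part of this specification.

```python
def weighted_mtbf(intervals, alpha):
    """Return the weighted mean length of the observed up-intervals.

    intervals is an iterable of (length, age) pairs; each interval is
    weighted by alpha ** age, so older intervals count for less when
    0 < alpha < 1.
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for length, age in intervals:
        weight = alpha ** age
        total_weight += weight
        weighted_sum += weight * length
    if total_weight == 0.0:
        return 0.0  # assumption: no history yields an MTBF of zero
    return weighted_sum / total_weight
```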
- - A node is 'familiar' if 1/8 of all active nodes have appeared more - recently than it, OR it has been around for a few weeks. - - "Authority" -- A router is called an 'Authority' if the authority - generating the network-status document believes it is an authority. - - "V2Dir" -- A router supports the v2 directory protocol or higher if it has - an open directory port OR a tunnelled-dir-server line in its router - descriptor, and it is running a version of the directory - protocol that supports the functionality clients need. (Currently, every - supported version of Tor supports the functionality that clients need, - but some relays might set "DirCache 0" or set really low rate limiting, - making them unqualified to be a directory mirror, i.e. they will omit - the tunnelled-dir-server line from their descriptor.) - - "HSDir" -- A router is a v2 hidden service directory if it stores and - serves v2 hidden service descriptors, has the Stable and Fast flag, and the - authority believes that it's been up for at least 96 hours (or the current - value of MinUptimeHidServDirectoryV2). - - "MiddleOnly" -- An authority should vote for this flag if it believes - that a relay is unsuitable for use except as a middle relay. When - voting for this flag, the authority should also vote against "Exit", - "Guard", "HsDir", and "V2Dir". When voting for this flag, if the - authority votes on the "BadExit" flag, the authority should vote in - favor of "BadExit". (This flag was added in 0.4.7.2-alpha.) - - "NoEdConsensus" -- authorities should not vote on this flag; it is - produced as part of the consensus for consensus method 22 or later. - - "StaleDesc" -- authorities should vote to assign this flag if the - published time on the descriptor is over 18 hours in the past. (This flag - was added in 0.4.0.1-alpha.) - - "Sybil" -- authorities SHOULD NOT accept more than 2 relays on a single IP. 
- If this happens, the authority *should* vote for the excess relays, but - should omit the Running or Valid flags and instead should assign the "Sybil" - flag. When there are more than 2 (or AuthDirMaxServersPerAddr) relays to - choose from, authorities should first prefer authorities to non-authorities, - then prefer Running to non-Running, and then prefer high-bandwidth to - low-bandwidth relays. In this comparison, measured bandwidth is used unless - it is not present for a router, in which case advertised bandwidth is used. - - Thus, the network-status vote includes all non-blacklisted, - non-expired, non-superseded descriptors. - - The bandwidth in a "w" line should be taken as the best estimate - of the router's actual capacity that the authority has. For now, - this should be the lesser of the observed bandwidth and bandwidth - rate limit from the server descriptor. It is given in kilobytes - per second, and capped at some arbitrary value (currently 10 MB/s). - - The Measured= keyword on a "w" line vote is currently computed - by multiplying the previous published consensus bandwidth by the - ratio of the measured average node stream capacity to the network - average. If 3 or more authorities provide a Measured= keyword for - a router, the authorities produce a consensus containing a "w" - Bandwidth= keyword equal to the median of the Measured= votes. - - As a special case, if the "w" line in a vote is about a relay with the - Authority flag, it should not include a Measured= keyword. The goal is - to leave such relays marked as Unmeasured, so they can reserve their - attention for authority-specific activities. "w" lines for votes about - authorities may include the bandwidth authority's measurement using - a different keyword, e.g. MeasuredButAuthority=, so it can still be - reported and recorded for posterity. 
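The preference order given in the "Sybil" rule earlier in this section can be expressed as a sort key. The Python sketch below groups relays by IP address, ranks each group (authorities first, then Running relays, then higher bandwidth, using measured bandwidth when present and advertised bandwidth otherwise), and marks everything beyond the per-address limit as a Sybil. The Relay record and function name are hypothetical.

```python
from collections import defaultdict, namedtuple

Relay = namedtuple(
    "Relay", "name ip is_authority running measured_bw advertised_bw")


def split_sybils(relays, max_per_addr=2):
    """Return (kept, sybils) after applying the per-address limit.

    max_per_addr stands in for AuthDirMaxServersPerAddr.
    """
    def preference(relay):
        bandwidth = (relay.measured_bw if relay.measured_bw is not None
                     else relay.advertised_bw)
        return (relay.is_authority, relay.running, bandwidth)

    by_ip = defaultdict(list)
    for relay in relays:
        by_ip[relay.ip].append(relay)

    kept, sybils = [], []
    for group in by_ip.values():
        ranked = sorted(group, key=preference, reverse=True)
        kept.extend(ranked[:max_per_addr])
        sybils.extend(ranked[max_per_addr:])
    return kept, sybils
```

Relays returned in the second list would still be voted on, but without the Running or Valid flags and with the "Sybil" flag assigned.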
- - The ports listed in a "p" line should be taken as those ports for - which the router's exit policy permits 'most' addresses, ignoring any - accept not for all addresses, ignoring all rejects for private - netblocks. "Most" addresses are permitted if no more than 2^25 - IPv4 addresses (two /8 networks) were blocked. The list is encoded - as described in section 3.8.2. - -3.4.3. Serving bandwidth list files - - If an authority has used a bandwidth list file to generate a vote - document, it SHOULD make it available at - - http:///tor/status-vote/next/bandwidth.z - - at the start of each voting period. - - It MUST NOT attempt to send its bandwidth list file in an HTTP POST to - other authorities and it SHOULD NOT make bandwidth list files from other - authorities available. - - If an authority makes this file available, it MUST be the bandwidth file - used to create the vote document available at - - http:///tor/status-vote/next/authority.z - - To avoid inconsistent reads, authorities SHOULD only read the bandwidth - file once per voting period. Further processing and serving SHOULD use a - cached copy. - - The bandwidth list format is described in bandwidth-file-spec.txt. - - The standard URLs for bandwidth list files first appeared in - Tor 0.4.0.4-alpha. - -3.5. Downloading missing certificates from other directory authorities - - XXX when to download certificates. - -3.6. Downloading server descriptors from other directory authorities - - Periodically (currently, every 10 seconds), directory authorities check - whether there are any specific descriptors that they do not have and that - they are not currently trying to download. - Authorities identify them by hash in vote (if publication date is more - recent than the descriptor we currently have). - - [XXXX need a way to fetch descriptors ahead of the vote? v2 status docs can - do that for now.]
- - If so, the directory authority launches requests to the authorities for these - descriptors, such that each authority is only asked for descriptors listed - in its most recent vote. If more - than one authority lists the descriptor, we choose which to ask at random. - - If one of these downloads fails, we do not try to download that descriptor - from the authority that failed to serve it again unless we receive a newer - network-status (consensus or vote) from that authority that lists the same - descriptor. - - Directory authorities must potentially cache multiple descriptors for each - router. Authorities must not discard any descriptor listed by any recent - consensus. If there is enough space to store additional descriptors, - authorities SHOULD try to hold those which clients are likely to download the - most. (Currently, this is judged based on the interval for which each - descriptor seemed newest.) -[XXXX define recent] - - Authorities SHOULD NOT download descriptors for routers that they would - immediately reject for reasons listed in section 3.2. - -3.7. Downloading extra-info documents from other directory authorities - - Periodically, an authority checks whether it is missing any extra-info - documents: in other words, if it has any server descriptors with an - extra-info-digest field that does not match any of the extra-info - documents currently held. If so, it downloads whatever extra-info - documents are missing. We follow the same splitting and back-off rules - as in section 3.6. - -3.8. Computing a consensus from a set of votes - - Given a set of votes, authorities compute the contents of the consensus. 
- - The consensus status, along with as many signatures as the server - currently knows (see section 3.10 below), should be available at - - http:///tor/status-vote/next/consensus.z - - The contents of the consensus document are as follows: - - The "valid-after", "valid-until", and "fresh-until" times are taken as - the median of the respective values from all the votes. - - The times in the "voting-delay" line are taken as the median of the - VoteSeconds and DistSeconds times in the votes. - - Known-flags is the union of all flags known by any voter. - - Entries are given on the "params" line for every keyword on which a - majority of authorities (total authorities, not just those - participating in this vote) voted on, or if at least three - authorities voted for that parameter. The values given are the - low-median of all votes on that keyword. - - (In consensus methods 7 to 11 inclusive, entries were given on - the "params" line for every keyword on which *any* authority voted, - the value given being the low-median of all votes on that keyword.) - - "client-versions" and "server-versions" are sorted in ascending - order; A version is recommended in the consensus if it is recommended - by more than half of the voting authorities that included a - client-versions or server-versions lines in their votes. - - With consensus methods 19 through 33, a package line is generated for a - given PACKAGENAME/VERSION pair if at least three authorities list such a - package in their votes. (Call these lines the "input" lines for - PACKAGENAME.) The consensus will contain every "package" line that is - listed verbatim by more than half of the authorities listing a line for - the PACKAGENAME/VERSION pair, and no others. - - The authority item groups (dir-source, contact, fingerprint, - vote-digest) are taken from the votes of the voting - authorities. These groups are sorted by the digests of the - authorities identity keys, in ascending order. 
If the consensus - method is 3 or later, a dir-source line must be included for - every vote with legacy-key entry, using the legacy-key's - fingerprint, the voter's ordinary nickname with the string - "-legacy" appended, and all other fields as from the original - vote's dir-source line. - - A router status entry: - * is included in the result if some router status entry with the same - identity is included by more than half of the authorities (total - authorities, not just those whose votes we have). - (Consensus method earlier than 21) - - * is included according to the rules in section 3.8.0.1 and - 3.8.0.2 below. (Consensus method 22 or later) - - * For any given RSA identity digest, we include at most - one router status entry. - - * For any given Ed25519 identity, we include at most one router - status entry. - - * A router entry has a flag set if that is included by more than half - of the authorities who care about that flag. - - * Two router entries are "the same" if they have the same - tuple. - We choose the tuple for a given router as whichever tuple appears - for that router in the most votes. We break ties first in favor of - the more recently published, then in favor of smaller server - descriptor digest. - - [ - * The Named flag appears if it is included for this routerstatus by - _any_ authority, and if all authorities that list it list the same - nickname. However, if consensus-method 2 or later is in use, and - any authority calls this identity/nickname pair Unnamed, then - this routerstatus does not get the Named flag. - - * If consensus-method 2 or later is in use, the Unnamed flag is - set for a routerstatus if any authorities have voted for a different - identities to be Named with that nickname, or if any authority - lists that nickname/ID pair as Unnamed. - - (With consensus-method 1, Unnamed is set like any other flag.) - - [But note that authorities no longer vote for the Named flag, - and the above two bulletpoints are now irrelevant.] 
- ] - - * The version is given as whichever version is listed by the most - voters, with ties decided in favor of more recent versions. - - * If consensus-method 4 or later is in use, then routers that - do not have the Running flag are not listed at all. - - * If consensus-method 5 or later is in use, then the "w" line - is generated using a low-median of the bandwidth values from - the votes that included "w" lines for this router. - - * If consensus-method 5 or later is in use, then the "p" line - is taken from the votes that have the same policy summary - for the descriptor we are listing. (They should all be the - same. If they are not, we pick the most commonly listed - one, breaking ties in favor of the lexicographically larger - vote.) The port list is encoded as specified in section 3.8.2. - - * If consensus-method 6 or later is in use and if 3 or more - authorities provide a Measured= keyword in their votes for - a router, the authorities produce a consensus containing a - Bandwidth= keyword equal to the median of the Measured= votes. - - * If consensus-method 7 or later is in use, the params line is - included in the output. - - * If the consensus method is under 11, bad exits are considered as - possible exits when computing bandwidth weights. Otherwise, if - method 11 or later is in use, any router that is determined to get - the BadExit flag doesn't count when we're calculating weights. - - * If consensus method 12 or later is used, only consensus - parameters that more than half of the total number of - authorities voted for are included in the consensus. - - [ As of 0.2.6.1-alpha, authorities no longer advertise or negotiate - any consensus methods lower than 13. ] - - * If consensus method 13 or later is used, microdesc consensuses - omit any router for which no microdesc was agreed upon. - - * If consensus method 14 or later is used, the ns consensus and - microdescriptors may include an "a" line for each router, listing - an IPv6 OR port. 
- - * If consensus method 15 or later is used, microdescriptors - include "p6" lines including IPv6 exit policies. - - * If consensus method 16 or later is used, ntor-onion-key - are included in microdescriptors - - * If consensus method 17 or later is used, authorities impose a - maximum on the Bandwidth= values that they'll put on a 'w' - line for any router that doesn't have at least 3 measured - bandwidth values in votes. They also add an "Unmeasured=1" - flag to such 'w' lines. - - * If consensus method 18 or later is used, authorities include - "id" lines in microdescriptors. This method adds RSA ids. - - * If consensus method 19 or later is used, authorities may include - "package" lines in consensuses. - - * If consensus method 20 or later is used, authorities may include - GuardFraction information in microdescriptors. - - * If consensus method 21 or later is used, authorities may include - an "id" line for ed25519 identities in microdescriptors. - - [ As of 0.2.8.2-alpha, authorities no longer advertise or negotiate - consensus method 21, because it contains bugs. ] - - * If consensus method 22 or later is used, and the votes do not - produce a majority consensus about a relay's Ed25519 key (see - 3.8.0.1 below), the consensus must include a NoEdConsensus flag on - the "s" line for every relay whose listed Ed key does not reflect - consensus. - - * If consensus method 23 or later is used, authorities include - shared randomness protocol data on their votes and consensus. - - * If consensus-method 24 or later is in use, then routers that - do not have the Valid flag are not listed at all. - - [ As of 0.3.4.1-alpha, authorities no longer advertise or negotiate - any consensus methods lower than 25. ] - - * If consensus-method 25 or later is in use, then we vote - on recommended-protocols and required-protocols lines in the - consensus. We also include protocols lines in routerstatus - entries. 
- - * If consensus-method 26 or later is in use, then we initialize - bandwidth weights to 1 in our calculations, to avoid - division-by-zero errors on unusual networks. - - * If consensus method 27 or later is used, the microdesc consensus - may include an "a" line for each router, listing an IPv6 OR port. - - [ As of 0.4.3.1-alpha, authorities no longer advertise or negotiate - any consensus methods lower than 28. ] - - * If consensus method 28 or later is used, microdescriptors no longer - include "a" lines. - - * If consensus method 29 or later is used, microdescriptor "family" - lines are canonicalized to improve compression. - - * If consensus method 30 or later is used, the base64 encoded - ntor-onion-key does not include the trailing = sign. - - * If consensus method 31 or later is used, authorities parse the - "bwweightscale" and "maxunmeasuredbw" parameters correctly when - computing votes. - - * If consensus method 32 or later is used, authorities handle the - "MiddleOnly" flag specially when computing a consensus. When the - voters agree to include "MiddleOnly" in a routerstatus, they - automatically remove "Exit", "Guard", "V2Dir", and "HSDir". If - the BadExit flag is included in the consensus, they automatically - add it to the routerstatus. - - * If consensus method 33 or later is used, and the consensus - flavor is "microdesc", then the "Publication" field in the "r" - line is set to "2038-01-01 00:00:00". - - * If consensus method 34 or later is used, the consensus - does not include any "package" lines. - - The signatures at the end of a consensus document are sorted in - ascending order by identity digest. - - All ties in computing medians are broken in favor of the smaller or - earlier item. - -3.8.0.1. Deciding which Ids to include. - - This sorting algorithm is used for consensus-method 22 and later. 
- - First, consider each listing by tuple of identities, where 'Ed' - may be "None" if the voter included "id ed25519 none" to indicate that - the authority knows what ed25519 identities are, and thinks that the RSA - key doesn't have one. - - For each such tuple that is listed by more than half of the - total authorities (not just total votes), include it. (It is not - possible for any other to have as many votes.) If more - than half of the authorities list a single pair of this type, we - consider that Ed key to be "consensus"; see description of the - NoEdConsensus flag. - - Log any other id-RSA values corresponding to an id-Ed we included, and any - other id-Ed values corresponding to an id-RSA we included. - - For each that is not yet included, if it is listed by more than - half of the total authorities, and we do not already have it listed with - some , include it, but do not consider its Ed identity canonical. - -3.8.0.2. Deciding which descriptors to include - - Deciding which descriptors to include. - - A tuple belongs to an identity if it is a new tuple that - matches both ID parts, or if it is an old tuple (one with no Ed opinion) - that matches the RSA part. A tuple belongs to an identity if its - RSA identity matches. - - A tuple matches another tuple if all the fields that are present in both - tuples are the same. - - For every included identity, consider the tuples belonging to that - identity. Group them into sets of matching tuples. Include the tuple - that matches the largest set, breaking ties in favor of the most recently - published, and then in favor of the smaller server descriptor digest. - -3.8.1. Forward compatibility - - Future versions of Tor will need to include new information in the - consensus documents, but it is important that all authorities (or at least - half) generate and sign the same signed consensus. - - To achieve this, authorities list in their votes their supported methods - for generating consensuses from votes. 
Later methods will be assigned - higher numbers. Currently specified methods: - - "1" -- The first implemented version. - "2" -- Added support for the Unnamed flag. - "3" -- Added legacy ID key support to aid in authority ID key rollovers - "4" -- No longer list routers that are not running in the consensus - "5" -- adds support for "w" and "p" lines. - "6" -- Prefers measured bandwidth values rather than advertised - "7" -- Provides keyword=integer pairs of consensus parameters - "8" -- Provides microdescriptor summaries - "9" -- Provides weights for selecting flagged routers in paths - "10" -- Fixes edge case bugs in router flag selection weights - "11" -- Don't consider BadExits when calculating bandwidth weights - "12" -- Params are only included if enough auths voted for them - "13" -- Omit router entries with missing microdescriptors. - "14" -- Adds support for "a" lines in ns consensuses and microdescriptors. - "15" -- Adds support for "p6" lines. - "16" -- Adds ntor keys to microdescriptors - "17" -- Adds "Unmeasured=1" flags to "w" lines - "18" -- Adds 'id' to microdescriptors. - "19" -- Adds "package" lines to consensuses - "20" -- Adds GuardFraction information to microdescriptors. - "21" -- Adds Ed25519 keys to microdescriptors. - "22" -- Instantiates Ed25519 voting algorithm correctly. - "23" -- Adds shared randomness protocol data. - "24" -- No longer lists routers that are not Valid in the consensus. - "25" -- Vote on recommended-protocols and required-protocols. - "26" -- Initialize bandwidth weights to 1 to avoid division-by-zero. - "27" -- Adds support for "a" lines in microdescriptor consensuses. - "28" -- Removes "a" lines from microdescriptors. - "29" -- Canonicalizes families in microdescriptors. - "30" -- Removes padding from ntor-onion-key. - "31" -- Uses correct parsing for bwweightscale and maxunmeasuredbw - when computing weights - "32" -- Adds special handling for MiddleOnly flag. 
- "33" -- Sets "publication" field in microdesc consensus "r" lines - to a meaningless value. - "34" -- Removes "package" lines from consensus. - - Before generating a consensus, an authority must decide which consensus - method to use. To do this, it looks for the highest version number - supported by more than 2/3 of the authorities voting. If it supports this - method, then it uses it. Otherwise, it falls back to the newest consensus - method that it supports (which will probably not result in a sufficiently - signed consensus). - - All authorities MUST support method 25; authorities SHOULD support - more recent methods as well. Authorities SHOULD NOT support or - advertise support for any method before 25. Clients MAY assume that - they will never see a current valid signed consensus for any method - before method 25. - - (The consensuses generated by new methods must be parsable by - implementations that only understand the old methods, and must not cause - those implementations to compromise their anonymity. This is a means for - making changes in the contents of consensus; not for making - backward-incompatible changes in their format.) - - The following methods have incorrect implementations; authorities SHOULD - NOT advertise support for them: - - "21" -- Did not correctly enable support for ed25519 key collation. - -3.8.2. Encoding port lists - - Whether the summary shows the list of accepted ports or the list of - rejected ports depends on which list is shorter (has a shorter string - representation). In case of ties we choose the list of accepted - ports. As an exception to this rule an allow-all policy is - represented as "accept 1-65535" instead of "reject " and a reject-all - policy is similarly given as "reject 1-65535". - - Summary items are compressed, that is instead of "80-88,89-100" there - only is a single item of "80-100", similarly instead of "20,21" a - summary will say "20-21". - - Port lists are sorted in ascending order. 
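The encoding rules above can be sketched roughly as follows: ports are sorted, adjacent ports are merged into ranges, and the shorter of the accept and reject lists is used, with ties going to accept. This is a minimal illustration, not the reference implementation. The function names compress_ports() and summarize() are made up for this example, and the sketch does not handle any limit on summary length.

```python
from typing import Iterable, List


def compress_ports(ports: Iterable[int]) -> str:
    """Return a sorted, range-compressed port list such as "20-21,80-100"."""
    items: List[str] = []
    start = prev = None
    for port in sorted(set(ports)):
        if prev is not None and port == prev + 1:
            prev = port  # extend the current run of adjacent ports
            continue
        if start is not None:
            items.append(str(start) if start == prev else f"{start}-{prev}")
        start = prev = port
    if start is not None:
        items.append(str(start) if start == prev else f"{start}-{prev}")
    return ",".join(items)


def summarize(accepted_ports: Iterable[int]) -> str:
    """Choose the accept or reject form, whichever string is shorter."""
    accepted = set(accepted_ports)
    rejected = set(range(1, 65536)) - accepted
    if not rejected:
        return "accept 1-65535"  # allow-all exception
    if not accepted:
        return "reject 1-65535"  # reject-all exception
    acc, rej = compress_ports(accepted), compress_ports(rejected)
    # Ties are decided in favor of the list of accepted ports.
    return "accept " + acc if len(acc) <= len(rej) else "reject " + rej
```

For example, summarize({80, 443}) gives "accept 80,443", while a policy that accepts every port except 25 is summarized as "reject 25".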
- - The maximum allowed length of a policy summary (including the "accept " - or "reject ") is 1000 characters. If a summary exceeds that length we - use an accept-style summary and list as much of the port list as is - possible within these 1000 bytes. [XXXX be more specific.] - -3.8.3. Computing Bandwidth Weights - - Let weight_scale = 10000, or the value of the "bwweightscale" parameter. - (Before consensus method 31 there was a bug in parsing bwweightscale, so - that if there were any consensus parameters after it alphabetically, it - would always be treated as 10000. A similar bug existed for - "maxunmeasuredbw".) - - Starting with consensus method 26, G, M, E, and D are initialized to 1 and - T to 4. Prior consensus methods initialize them all to 0. With this change, - test tor networks that are small or new are much more likely to produce - bandwidth-weights in their consensus. The extra bandwidth has a negligible - impact on the bandwidth weights in the public tor network. - - Let G be the total bandwidth for Guard-flagged nodes. - Let M be the total bandwidth for non-flagged nodes. - Let E be the total bandwidth for Exit-flagged nodes. - Let D be the total bandwidth for Guard+Exit-flagged nodes. - Let T = G+M+E+D - - Let Wgd be the weight for choosing a Guard+Exit for the guard position. - Let Wmd be the weight for choosing a Guard+Exit for the middle position. - Let Wed be the weight for choosing a Guard+Exit for the exit position. - - Let Wme be the weight for choosing an Exit for the middle position. - Let Wmg be the weight for choosing a Guard for the middle position. - - Let Wgg be the weight for choosing a Guard for the guard position. - Let Wee be the weight for choosing an Exit for the exit position. 
- - Balanced network conditions then arise from solutions to the following - system of equations: - - Wgg*G + Wgd*D == M + Wmd*D + Wme*E + Wmg*G (guard bw = middle bw) - Wgg*G + Wgd*D == Wee*E + Wed*D (guard bw = exit bw) - Wed*D + Wmd*D + Wgd*D == D (aka: Wed+Wmd+Wgd = weight_scale) - Wmg*G + Wgg*G == G (aka: Wgg = weight_scale-Wmg) - Wme*E + Wee*E == E (aka: Wee = weight_scale-Wme) - - We are short 2 constraints with the above set. The remaining constraints - come from examining different cases of network load. The following - constraints are used in consensus method 10 and above. There is another, - incorrect and obsolete, set of constraints used for these same cases in - consensus method 9. For those, see dir-spec.txt in Tor 0.2.2.10-alpha - to 0.2.2.16-alpha. - - Case 1: E >= T/3 && G >= T/3 (Neither Exit nor Guard Scarce) - - In this case, the additional two constraints are: Wmg == Wmd, - Wed == 1/3. - - This leads to the solution: - Wgd = weight_scale/3 - Wed = weight_scale/3 - Wmd = weight_scale/3 - Wee = (weight_scale*(E+G+M))/(3*E) - Wme = weight_scale - Wee - Wmg = (weight_scale*(2*G-E-M))/(3*G) - Wgg = weight_scale - Wmg - - Case 2: E < T/3 && G < T/3 (Both are scarce) - - Let R denote the more scarce class (Rare) between Guard vs Exit. - Let S denote the less scarce class. - - Subcase a: R+D < S - - In this subcase, we simply devote all of D bandwidth to the - scarce class. - - Wgg = Wee = weight_scale - Wmg = Wme = Wmd = 0; - if E < G: - Wed = weight_scale - Wgd = 0 - else: - Wed = 0 - Wgd = weight_scale - - Subcase b: R+D >= S - - In this case, if M <= T/3, we have enough bandwidth to try to achieve - a balancing condition.
- - Add constraints Wgg = weight_scale, Wmd == Wgd to maximize bandwidth in - the guard position while still allowing exits to be used as middle nodes: - - Wee = (weight_scale*(E - G + M))/E - Wed = (weight_scale*(D - 2*E + 4*G - 2*M))/(3*D) - Wme = (weight_scale*(G-M))/E - Wmg = 0 - Wgg = weight_scale - Wmd = (weight_scale - Wed)/2 - Wgd = (weight_scale - Wed)/2 - - If this system ends up with any values out of range (i.e. negative, or - above weight_scale), use the constraints Wgg == weight_scale and Wee == - weight_scale, since both of those positions are scarce: - - Wgg = weight_scale - Wee = weight_scale - Wed = (weight_scale*(D - 2*E + G + M))/(3*D) - Wmd = (weight_scale*(D - 2*M + G + E))/(3*D) - Wme = 0 - Wmg = 0 - Wgd = weight_scale - Wed - Wmd - - If M > T/3, then the Wmd weight above will become negative. Set it to 0 - in this case: - Wmd = 0 - Wgd = weight_scale - Wed - - Case 3: One of E < T/3 or G < T/3 - - Let S be the scarce class (of E or G). - - Subcase a: (S+D) < T/3: - if G=S: - Wgg = Wgd = weight_scale; - Wmd = Wed = Wmg = 0; - // Minor subcase, if E is more scarce than M, - // keep its bandwidth in place. - if (E < M) Wme = 0; - else Wme = (weight_scale*(E-M))/(2*E); - Wee = weight_scale-Wme; - if E=S: - Wee = Wed = weight_scale; - Wmd = Wgd = Wme = 0; - // Minor subcase, if G is more scarce than M, - // keep its bandwidth in place.
- if (G < M) Wmg = 0; - else Wmg = (weight_scale*(G-M))/(2*G); - Wgg = weight_scale-Wmg; - - Subcase b: (S+D) >= T/3 - if G=S: - Add constraints Wgg = weight_scale, Wmd == Wed to maximize bandwidth - in the guard position, while still allowing exits to be - used as middle nodes: - Wgg = weight_scale - Wgd = (weight_scale*(D - 2*G + E + M))/(3*D) - Wmg = 0 - Wee = (weight_scale*(E+M))/(2*E) - Wme = weight_scale - Wee - Wmd = (weight_scale - Wgd)/2 - Wed = (weight_scale - Wgd)/2 - if E=S: - Add constraints Wee == weight_scale, Wmd == Wgd to maximize bandwidth - in the exit position: - Wee = weight_scale; - Wed = (weight_scale*(D - 2*E + G + M))/(3*D); - Wme = 0; - Wgg = (weight_scale*(G+M))/(2*G); - Wmg = weight_scale - Wgg; - Wmd = (weight_scale - Wed)/2; - Wgd = (weight_scale - Wed)/2; - - To ensure consensus, all calculations are performed using integer math - with a fixed precision determined by the bwweightscale consensus - parameter (defaults at 10000, Min: 1, Max: INT32_MAX). (See note above - about parsing bug in bwweightscale before consensus method 31.) - - For future balancing improvements, Tor clients support 11 additional weights - for directory requests and middle weighting. These weights are currently - set at weight_scale, with the exception of the following groups of - assignments: - - Directory requests use middle weights: - - Wbd=Wmd, Wbg=Wmg, Wbe=Wme, Wbm=Wmm - - Handle bridges and strange exit policies: - - Wgm=Wgg, Wem=Wee, Weg=Wed - -3.9. Computing consensus flavors - - Consensus flavors are variants of the consensus that clients can choose - to download and use instead of the unflavored consensus. The purpose - of a consensus flavor is to remove or replace information in the - unflavored consensus without forcing clients to download information - they would not use anyway. - - Directory authorities can produce and serve an arbitrary number of - flavors of the same consensus. 
A downside of creating too many new - flavors is that clients will be distinguishable based on which flavor - they download. A new flavor should not be created when adding a field - instead wouldn't be too onerous. - - Examples for consensus flavors include: - - - Publishing hashes of microdescriptors instead of hashes of - full descriptors (see section 3.9.2). - - Including different digests of descriptors, instead of the - perhaps-soon-to-be-totally-broken SHA1. - - Consensus flavors are derived from the unflavored consensus once the - voting process is complete. This is to avoid consensus synchronization - problems. - - Every consensus flavor has a name consisting of a sequence of one - or more alphanumeric characters and dashes. For compatibility, - the original (unflavored) consensus type is called "ns". - - The supported consensus flavors are defined as part of the - authorities' consensus method. - - All consensus flavors have in common that their first line is - "network-status-version" where version is 3 or higher, and the flavor - is a string consisting of alphanumeric characters and dashes: - - "network-status-version" SP version [SP flavor] NL - -3.9.1. ns consensus - - The ns consensus flavor is equivalent to the unflavored consensus. - When the flavor is omitted from the "network-status-version" line, - it should be assumed to be "ns". Some implementations may explicitly - state that the flavor is "ns" when generating consensuses, but should - accept consensuses where the flavor is omitted. - -3.9.2. Microdescriptor consensus - - The microdescriptor consensus is a consensus flavor that contains - microdescriptor hashes instead of descriptor hashes and that omits - exit-policy summaries which are contained in microdescriptors. The - microdescriptor consensus was designed to contain elements that are - small and frequently changing. 
Clients use the information in the - microdescriptor consensus to decide which servers to fetch information - about and which servers to fetch information from. - - The microdescriptor consensus is based on the unflavored consensus with - the following exceptions: - - "network-status-version" SP version SP "microdesc" NL - - [At start, exactly once.] - - The flavor name of a microdescriptor consensus is "microdesc". - - Changes to router status entries are as follows: - - "r" SP nickname SP identity SP publication SP IP SP ORPort - SP DirPort NL - - [At start, exactly once.] - - Similar to "r" lines in section 3.4.1, but without the digest element. - - "a" SP address ":" port NL - - [Any number] - - Identical to the "a" lines in section 3.4.1. - - (Only included when the vote is generated with consensus-method 14 - or later, and the consensus is generated with consensus-method 27 or - later.) - - "p" ... NL - - [At most once] - - Not currently generated. - - Exit policy summaries are contained in microdescriptors and - therefore omitted in the microdescriptor consensus. - - "m" SP digest NL - - [Exactly once.*] - - "digest" is the base64 of the SHA256 hash of the router's - microdescriptor with trailing =s omitted. For a given router - descriptor digest and consensus method there should only be a - single microdescriptor digest in the "m" lines of all votes. - If different votes have different microdescriptor digests for - the same descriptor digest and consensus method, at least one - of the authorities is broken. If this happens, the microdesc - consensus should contain whichever microdescriptor digest is - most common. If there is no winner, we break ties in favor - of the lexically earliest. - - [*Before consensus method 13, this field was sometimes erroneously - omitted.] - - Additionally, a microdescriptor consensus SHOULD use the sha256 digest - algorithm for its signatures. - -3.10.
Exchanging detached signatures - - Once an authority has computed and signed a consensus network status, it - should send its detached signature to each other authority in an HTTP POST - request to the URL: - - http:///tor/post/consensus-signature - - [XXX Note why we support push-and-then-pull.] - - All of the detached signatures it knows for consensus status should be - available at: - - http:///tor/status-vote/next/consensus-signatures.z - - Assuming full connectivity, every authority should compute and sign the - same consensus including any flavors in each period. Therefore, it - isn't necessary to download the consensus or any flavors of it computed - by each authority; instead, the authorities only push/fetch each - others' signatures. A "detached signature" document contains items as - follows: - - "consensus-digest" SP Digest NL - - [At start, at most once.] - - The digest of the consensus being signed. - - "valid-after" SP YYYY-MM-DD SP HH:MM:SS NL - "fresh-until" SP YYYY-MM-DD SP HH:MM:SS NL - "valid-until" SP YYYY-MM-DD SP HH:MM:SS NL - - [As in the consensus] - - "additional-digest" SP flavor SP algname SP digest NL - - [Any number.] - - For each supported consensus flavor, every directory authority - adds one or more "additional-digest" lines. "flavor" is the name - of the consensus flavor, "algname" is the name of the hash - algorithm that is used to generate the digest, and "digest" is the - hex-encoded digest. - - The hash algorithm for the microdescriptor consensus flavor is - defined as SHA256 with algname "sha256". - - "additional-signature" SP flavor SP algname SP identity SP - signing-key-digest NL signature. - - [Any number.] - - For each supported consensus flavor and defined digest algorithm, - every directory authority adds an "additional-signature" line. - "flavor" is the name of the consensus flavor. "algname" is the - name of the algorithm that was used to hash the identity and - signing keys, and to compute the signature. 
"identity" is the - hex-encoded digest of the authority identity key of the signing - authority, and "signing-key-digest" is the hex-encoded digest of - the current authority signing key of the signing authority. - - The "sha256" signature format is defined as the RSA signature of - the OAEP+-padded SHA256 digest of the item to be signed. When - checking signatures, the signature MUST be treated as valid if the - signature material begins with SHA256(document), so that other - data can get added later. - [To be honest, I didn't fully understand the previous paragraph - and only copied it from the proposals. Review carefully. -KL] - - "directory-signature" - - [As in the consensus; the signature object is the same as in the - consensus document.] - -3.11. Publishing the signed consensus - - The voting period ends at the valid-after time. If the consensus has - been signed by a majority of authorities, these documents are made - available at - - http:///tor/status-vote/current/consensus.z - - and - - http:///tor/status-vote/current/consensus-signatures.z - - [XXX current/consensus-signatures is not currently implemented, as it - is not used in the voting protocol.] - - [XXX possible future features include support for downloading old - consensuses.] - - The other vote documents are analogously made available under - - http:///tor/status-vote/current/authority.z - http:///tor/status-vote/current/.z - http:///tor/status-vote/current/d/.z - http:///tor/status-vote/current/bandwidth.z - - once the voting period ends, regardless of the number of signatures. - - The authorities serve another consensus of each flavor "F" from the - locations - - /tor/status-vote/(current|next)/consensus-F.z. and - /tor/status-vote/(current|next)/consensus-F/+....z. - - The standard URLs for bandwidth list files first-appeared in Tor 0.3.5. - -4. Directory cache operation - - All directory caches implement this section, except as noted. - -4.1. 
Downloading consensus status documents from directory authorities - - All directory caches try to keep a recent - network-status consensus document to serve to clients. A cache ALWAYS - downloads a network-status consensus if any of the following are true: - - - The cache has no consensus document. - - The cache's consensus document is no longer valid. - - Otherwise, the cache downloads a new consensus document at a randomly - chosen time in the first half-interval after its current consensus - stops being fresh. (This time is chosen at random to avoid swarming - the authorities at the start of each period. The interval size is - inferred from the difference between the valid-after time and the - fresh-until time on the consensus.) - - [For example, if a cache has a consensus that became valid at 1:00, - and is fresh until 2:00, that cache will fetch a new consensus at - a random time between 2:00 and 2:30.] - - Directory caches also fetch consensus flavors from the authorities. - Caches check the correctness of consensus flavors, but do not check - anything about an unrecognized consensus document beyond its digest and - length. Caches serve all consensus flavors from the same locations as - the directory authorities. - -4.2. Downloading server descriptors from directory authorities - - Periodically (currently, every 10 seconds), directory caches check - whether there are any specific descriptors that they do not have and that - they are not currently trying to download. Caches identify these - descriptors by hash in the recent network-status consensus documents. - - If so, the directory cache launches requests to the authorities for these - descriptors. - - If one of these downloads fails, we do not try to download that descriptor - from the authority that failed to serve it again unless we receive a newer - network-status consensus that lists the same descriptor. - - Directory caches must potentially cache multiple descriptors for each - router. 
Caches must not discard any descriptor listed by any recent - consensus. If there is enough space to store additional descriptors, - caches SHOULD try to hold those which clients are likely to download the - most. (Currently, this is judged based on the interval for which each - descriptor seemed newest.) - - [XXXX define recent] - -4.3. Downloading microdescriptors from directory authorities - - Directory mirrors should fetch, cache, and serve each microdescriptor - from the authorities. - - The microdescriptors with base64 hashes ,, are available - at: - - http:///tor/micro/d/--[.z] - - are base64 encoded with trailing =s omitted for size and for - consistency with the microdescriptor consensus format. -s are used - instead of +s to separate items, since the + character is used in - base64 encoding. - - Directory mirrors should check to make sure that the microdescriptors - they're about to serve match the right hashes (either the hashes from - the fetch URL or the hashes from the consensus, respectively). - - (NOTE: Due to squid proxy url limitations at most 92 microdescriptor hashes - can be retrieved in a single request.) - -4.4. Downloading extra-info documents from directory authorities - - Any cache that chooses to cache extra-info documents should implement this - section. - - Periodically, the Tor instance checks whether it is missing any extra-info - documents: in other words, if it has any server descriptors with an - extra-info-digest field that does not match any of the extra-info - documents currently held. If so, it downloads whatever extra-info - documents are missing. Caches download from authorities. We follow the - same splitting and back-off rules as in section 4.2. - -4.5. Consensus diffs - - Instead of downloading an entire consensus, clients may download - a "diff" document containing an ed-style diff from a previous - consensus document. Caches (and authorities) make these diffs as - they learn about new consensuses. 
To do so, they must store a - record of older consensuses. - - (Support for consensus diffs was added in 0.3.1.1-alpha, and is - advertised with the DirCache protocol version "2" or later.) - -4.5.1. Consensus diff format - - Consensus diffs are formatted as follows: - - The first line is "network-status-diff-version 1" NL - - The second line is - - "hash" SP FromDigest SP ToDigest NL - - where FromDigest is the hex-encoded SHA3-256 digest of the _signed - part_ of the consensus that the diff should be applied to, and - ToDigest is the hex-encoded SHA3-256 digest of the _entire_ - consensus resulting from applying the diff. (See 3.4.1 for - information on which part of a consensus is signed.) - - The third and subsequent lines encode the diff from FromDigest to - ToDigest in a limited subset of the ed diff format, as specified - in appendix E. - -4.5.2. Serving and requesting diffs. - - When downloading the current consensus, a client may include an - HTTP header of the form - - X-Or-Diff-From-Consensus: HASH1, HASH2, ... - - where the HASH values are hex-encoded SHA3-256 digests of the - _signed part_ of one or more consensuses that the client knows - about. - - If a cache knows a consensus diff from one of those consensuses - to the most recent consensus of the requested flavor, it may - send that diff instead of the specified consensus. - - Caches also serve diffs from the URIs: - - /tor/status-vote/current/consensus/diff//.z - /tor/status-vote/current/consensus-/diff//.z - - where FLAVOR is the consensus flavor, defaulting to "ns", and - FPRLIST is a +-separated list of recognized authority identity - fingerprints as in appendix B. - -4.6. Retrying failed downloads - - See section 5.5 below; it applies to caches as well as clients. - -5. Client operation - - Every Tor that is not a directory server (that is, those that do - not have a DirPort set) implements this section. - -5.1.
Downloading network-status documents - - Each client maintains a list of directory authorities. Insofar as - possible, clients SHOULD all use the same list. - - [Newer versions of Tor (0.2.8.1-alpha and later): - Each client also maintains a list of default fallback directory mirrors - (fallbacks). Each released version of Tor MAY have a different list, - depending on the mirrors that satisfy the fallback directory criteria at - release time.] - - Clients try to have a live consensus network-status document at all times. - A network-status document is "live" if the time in its valid-after field - has passed, and the time in its valid-until field has not passed. - - When a client has no consensus network-status document, it downloads it - from a randomly chosen fallback directory mirror or authority. Clients - prefer fallbacks to authorities, trying them earlier and more frequently. - In all other cases, the client downloads from caches randomly chosen from - among those believed to be V3 directory servers. (This information comes - from the network-status documents.) - - After receiving any response client MUST discard any network-status - documents that it did not request. - - On failure, the client waits briefly, then tries that network-status - document again from another cache. The client does not build circuits - until it has a live network-status consensus document, and it has - descriptors for a significant proportion of the routers that it believes - are running (this is configurable using torrc options and consensus - parameters). - - [Newer versions of Tor (0.2.6.2-alpha and later): - If the consensus contains Exits (the typical case), Tor will build both - exit and internal circuits. When bootstrap completes, Tor will be ready - to handle an application requesting an exit circuit to services like the - World Wide Web. - - If the consensus does not contain Exits, Tor will only build internal - circuits. 
In this case, earlier statuses will have included "internal" - as indicated above. When bootstrap completes, Tor will be ready to handle - an application requesting an internal circuit to hidden services at - ".onion" addresses. - - If a future consensus contains Exits, exit circuits may become available.] - - (Note: clients can and should pick caches based on the network-status - information they have: once they have first fetched network-status info - from an authority or fallback, they should not need to go to the authority - directly again, and should only choose the fallback at random, based on its - consensus weight in the current consensus.) - - To avoid swarming the caches whenever a consensus expires, the - clients download new consensuses at a randomly chosen time after the - caches are expected to have a fresh consensus, but before their - consensus will expire. (This time is chosen uniformly at random from - the interval between the time 3/4 into the first interval after the - consensus is no longer fresh, and 7/8 of the time remaining after - that before the consensus is invalid.) - - [For example, if a client has a consensus that became valid at 1:00, - and is fresh until 2:00, and expires at 4:00, that client will fetch - a new consensus at a random time between 2:45 and 3:50, since 3/4 - of the one-hour interval is 45 minutes, and 7/8 of the remaining 75 - minutes is 65 minutes.] - - Clients may choose to download the microdescriptor consensus instead - of the general network status consensus. In that case they should use - the same update strategy as for the normal consensus. They should not - download more than one consensus flavor. - - When a client does not have a live consensus, it will generally use the - most recent consensus it has if that consensus is "reasonably live". A - "reasonably live" consensus is one that expired less than 24 hours ago. - -5.2. 
Downloading server descriptors or microdescriptors - - Clients try to have the best descriptor for each router. A descriptor is - "best" if: - - * It is listed in the consensus network-status document. - - Periodically (currently every 10 seconds) clients check whether there are - any "downloadable" descriptors. A descriptor is downloadable if: - - - It is the "best" descriptor for some router. - - The descriptor was published at least 10 minutes in the past. - (This prevents clients from trying to fetch descriptors that the - mirrors have probably not yet retrieved and cached.) - - The client does not currently have it. - - The client is not currently trying to download it. - - The client would not discard it immediately upon receiving it. - - The client thinks it is running and valid (see section 5.4.1 below). - - If at least 16 known routers have downloadable descriptors, or if - enough time (currently 10 minutes) has passed since the last time the - client tried to download descriptors, it launches requests for all - downloadable descriptors. - - When downloading multiple server descriptors, the client chooses multiple - mirrors so that: - - - At least 3 different mirrors are used, except when this would result - in more than one request for under 4 descriptors. - - No more than 128 descriptors are requested from a single mirror. - - Otherwise, as few mirrors as possible are used. - After choosing mirrors, the client divides the descriptors among them - randomly. - - After receiving any response the client MUST discard any descriptors that - it did not request. - - When a descriptor download fails, the client notes it, and does not - consider the descriptor downloadable again until a certain amount of time - has passed. (Currently 0 seconds for the first failure, 60 seconds for the - second, 5 minutes for the third, 10 minutes for the fourth, and 1 day - thereafter.) Periodically (currently once an hour) clients reset the - failure count. 
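The retry schedule in the paragraph above can be written as a small lookup. This is a minimal sketch, not Tor's implementation: the names RETRY_DELAYS, DELAY_AFTER_SCHEDULE and retry_delay are hypothetical, and only the delay values come from the text.

```python
# Hypothetical sketch of the descriptor retry schedule described above.
# Only the delay values are taken from the text; the names are illustrative.

# Delay in seconds after the 1st, 2nd, 3rd and 4th failure.
RETRY_DELAYS = [0, 60, 5 * 60, 10 * 60]

# "1 day thereafter", in seconds.
DELAY_AFTER_SCHEDULE = 24 * 60 * 60


def retry_delay(failure_count):
    """Return how long (in seconds) a descriptor stays non-downloadable
    after its failure_count-th consecutive failure (counted from 1)."""
    if failure_count < 1:
        raise ValueError("failure_count must be at least 1")
    if failure_count <= len(RETRY_DELAYS):
        return RETRY_DELAYS[failure_count - 1]
    return DELAY_AFTER_SCHEDULE
```

The hourly reset of the failure count would correspond to starting again from failure_count = 1.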
- - Clients retain the most recent descriptor they have downloaded for each - router so long as it is listed in the consensus. If it is not listed, - they keep it so long as it is not too old (currently, ROUTER_MAX_AGE=48 - hours) and no better router descriptor has been downloaded for the same - relay. Caches retain descriptors until they are at least - OLD_ROUTER_DESC_MAX_AGE=5 days old. - - Clients which chose to download the microdescriptor consensus instead - of the general consensus must download the referenced microdescriptors - instead of server descriptors. Clients fetch and cache - microdescriptors preemptively from dir mirrors when starting up, like - they currently fetch descriptors. After bootstrapping, clients only - need to fetch the microdescriptors that have changed. - - When a client gets a new microdescriptor consensus, it looks to see if - there are any microdescriptors it needs to learn, and launches a request - for them. - - Clients maintain a cache of microdescriptors along with metadata like - when it was last referenced by a consensus, and which identity key - it corresponds to. They keep a microdescriptor until it hasn't been - mentioned in any consensus for a week. Future clients might cache them - for longer or shorter times. - -5.3. Downloading extra-info documents - - Any client that uses extra-info documents should implement this - section. - - Note that generally, clients don't need extra-info documents. - - Periodically, the Tor instance checks whether it is missing any extra-info - documents: in other words, if it has any server descriptors with an - extra-info-digest field that does not match any of the extra-info - documents currently held. If so, it downloads whatever extra-info - documents are missing. Clients try to download from caches. - We follow the same splitting and back-off rules as in section 5.2. - -5.4. Using directory information - - [XXX This subsection really belongs in path-spec.txt, not here. 
-KL] - - Everyone besides directory authorities uses the approaches in this section - to decide which relays to use and what their keys are likely to be. - (Directory authorities just believe their own opinions, as in section 3.4.2 - above.) - -5.4.1. Choosing routers for circuits. - - Circuits SHOULD NOT be built until the client has enough directory - information: a live consensus network status [XXXX fallback?] and - descriptors for at least 1/4 of the relays believed to be running. - - A relay is "listed" if it is included by the consensus network-status - document. Clients SHOULD NOT use unlisted relays. - - These flags are used as follows: - - - Clients SHOULD NOT use non-'Valid' or non-'Running' routers unless - requested to do so. - - - Clients SHOULD NOT use non-'Fast' routers for any purpose other than - very-low-bandwidth circuits (such as introduction circuits). - - - Clients SHOULD NOT use non-'Stable' routers for circuits that are - likely to need to be open for a very long time (such as those used for - IRC or SSH connections). - - - Clients SHOULD NOT choose non-'Guard' nodes when picking entry guard - nodes. - - See the "path-spec.txt" document for more details. - -5.4.2. Managing naming - - (This section is removed; authorities no longer assign the 'Named' flag.) - -5.4.3. Software versions - - An implementation of Tor SHOULD warn when it has fetched a consensus - network-status, and it is running a software version not listed. - -5.4.4. Warning about a router's status. - - (This section is removed; authorities no longer assign the 'Named' flag.) - -5.5. Retrying failed downloads - - This section applies to caches as well as to clients. - - When a client fails to download a resource (a consensus, a router - descriptor, a microdescriptor, etc) it waits for a certain amount of - time before retrying the download. To determine the amount of time - to wait, clients use a randomized exponential backoff algorithm. 
- (Specifically, they use a variation of the "decorrelated jitter" - algorithm from - https://aws.amazon.com/blogs/architecture/exponential-backoff-and-jitter/ .) - - The specific formula used to compute the 'i+1'th delay is: - - Delay_{i+1} = MIN(cap, random_between(lower_bound, upper_bound)) - where upper_bound = MAX(lower_bound+1, Delay_i * 3) - lower_bound = MAX(1, base_delay). - - The value of 'cap' is set to INT_MAX; the value of 'base_delay' - depends on what is being downloaded, whether the client is fully - bootstrapped, how the client is configured, and where it is - downloading from. Current base_delay values are: - - Consensus objects, as a non-bridge cache: - 0 (TestingServerConsensusDownloadInitialDelay) - - Consensus objects, as a client or bridge that has bootstrapped: - 0 (TestingClientConsensusDownloadInitialDelay) - - Consensus objects, as a client or bridge that is bootstrapping, - when connecting to an authority because no "fallback" caches are - known: - 0 (ClientBootstrapConsensusAuthorityOnlyDownloadInitialDelay) - - Consensus objects, as a client or bridge that is bootstrapping, - when "fallback" caches are known but connecting to an authority - anyway: - 6 (ClientBootstrapConsensusAuthorityDownloadInitialDelay) - - Consensus objects, as a client or bridge that is bootstrapping, - when downloading from a "fallback" cache: - 0 (ClientBootstrapConsensusFallbackDownloadInitialDelay) - - Bridge descriptors, as a bridge-using client when at least one bridge - is usable: - 10800 (TestingBridgeDownloadInitialDelay) - - Bridge descriptors, otherwise: - 0 (TestingBridgeBootstrapDownloadInitialDelay) - - Other objects, as cache or authority: - 0 (TestingServerDownloadInitialDelay) - - Other objects, as client: - 0 (TestingClientDownloadInitialDelay) - - -6. Standards compliance - - All clients and servers MUST support HTTP 1.0. Clients and servers MAY - support later versions of HTTP as well. - -6.1.
HTTP headers - - Servers SHOULD set Content-Encoding to the algorithm used to compress the - document(s) being served. Recognized algorithms are: - - - "identity" -- RFC2616 section 3.5 - - "deflate" -- RFC2616 section 3.5 - - "gzip" -- RFC2616 section 3.5 - - "x-zstd" -- The zstandard compression algorithm (www.zstd.net) - - "x-tor-lzma" -- The lzma compression algorithm, with a "preset" - value no higher than 6. - - Clients SHOULD use Accept-Encoding on most directory requests to indicate - which of the above compression algorithms they support. If they omit it - (as Tor clients did before 0.3.1.1-alpha), then the server should serve - only "deflate" or "identity" encoded documents, based on the presence or - absence of the ".z" suffix on the requested URL. - - Note that for anonymous directory requests (that is, requests made over - multi-hop circuits, like those for onion service lookups) implementations - SHOULD NOT advertise any Accept-Encoding values other than deflate. To do - so would be to create a fingerprinting opportunity. - - When receiving multiple documents, clients MUST accept compressed - concatenated documents and concatenated compressed documents as - equivalent. - - Servers MAY set the Content-Length: header. When they do, it should - match the number of compressed bytes that they are sending. - - Servers MAY include an X-Your-Address-Is: header, whose value is the - apparent IP address of the client connecting to them (as a dotted quad). - For directory connections tunneled over a BEGIN_DIR stream, servers SHOULD - report the IP from which the circuit carrying the BEGIN_DIR stream reached - them. - - Servers SHOULD disable caching of multiple network statuses or multiple - server descriptors. Servers MAY enable caching of single descriptors, - single network statuses, the list of all server descriptors, a v1 - directory, or a v1 running routers document. XXX mention times. - -6.2. HTTP status codes - - Tor delivers the following status codes. 
Some were chosen without much - thought; other code SHOULD NOT rely on specific status codes yet. - - 200 -- the operation completed successfully - -- the user requested statuses or serverdescs, and none of the ones we - requested were found (0.2.0.4-alpha and earlier). - - 304 -- the client specified an if-modified-since time, and none of the - requested resources have changed since that time. - - 400 -- the request is malformed, or - -- the URL is for a malformed variation of one of the URLs we support, - or - -- the client tried to post to a non-authority, or - -- the authority rejected a malformed posted document, or - - 404 -- the requested document was not found. - -- the user requested statuses or serverdescs, and none of the ones - requested were found (0.2.0.5-alpha and later). - - 503 -- we are declining the request in order to save bandwidth - -- user requested some items that we ordinarily generate or store, - but we do not have any available. - -A. Consensus-negotiation timeline. - - Period begins: this is the Published time. - Everybody sends votes - Reconciliation: everybody tries to fetch missing votes. - consensus may exist at this point. - End of voting period: - everyone swaps signatures. - Now it's okay for caches to download - Now it's okay for clients to download. - - Valid-after/valid-until switchover - -B. General-use HTTP URLs - - "Fingerprints" in these URLs are base16-encoded SHA1 hashes. - - The most recent v3 consensus should be available at: - - http:///tor/status-vote/current/consensus.z - - Similarly, the v3 microdescriptor consensus should be available at: - - http:///tor/status-vote/current/consensus-microdesc.z - - Starting with Tor version 0.2.1.1-alpha is also available at: - - http:///tor/status-vote/current/consensus/++.z - - (NOTE: Due to squid proxy url limitations at most 96 fingerprints can be - retrieved in a single request.) - - Where F1, F2, etc. are authority identity fingerprints the client trusts. 
- Servers will only return a consensus if more than half of the requested - authorities have signed the document, otherwise a 404 error will be sent - back. The fingerprints can be shortened to a length of any multiple of - two, using only the leftmost part of the encoded fingerprint. Tor uses - 3 bytes (6 hex characters) of the fingerprint. - - Clients SHOULD sort the fingerprints in ascending order. Server MUST - accept any order. - - Clients SHOULD use this format when requesting consensus documents from - directory authority servers and from caches running a version of Tor - that is known to support this URL format. - - A concatenated set of all the current key certificates should be available - at: - - http:///tor/keys/all.z - - The key certificate for this server should be available at: - - http:///tor/keys/authority.z - - The key certificate for an authority whose authority identity fingerprint - is should be available at: - - http:///tor/keys/fp/.z - - The key certificate whose signing key fingerprint is should be - available at: - - http:///tor/keys/sk/.z - - The key certificate whose identity key fingerprint is and whose signing - key fingerprint is should be available at: - - http:///tor/keys/fp-sk/-.z - - (As usual, clients may request multiple certificates using: - - http:///tor/keys/fp-sk/-+-.z ) - - [The above fp-sk format was not supported before Tor 0.2.1.9-alpha.] - - The most recent descriptor for a server whose identity key has a - fingerprint of should be available at: - - http:///tor/server/fp/.z - - The most recent descriptors for servers with identity fingerprints - ,, should be available at: - - http:///tor/server/fp/++.z - - (NOTE: Due to squid proxy url limitations at most 96 fingerprints can be - retrieved in a single request. - - Implementations SHOULD NOT download descriptors by identity key - fingerprint. 
This allows a corrupted server (in collusion with a cache) to - provide a unique descriptor to a client, and thereby partition that client - from the rest of the network.) - - The server descriptor with (descriptor) digest (in hex) should be - available at: - - http:///tor/server/d/.z - - The most recent descriptors with digests ,, should be - available at: - - http:///tor/server/d/++.z - - The most recent descriptor for this server should be at: - - http:///tor/server/authority.z - - This is used for authorities, and also if a server is configured - as a bridge. The official Tor implementations (starting at - 0.1.1.x) use this resource to test whether a server's own DirPort - is reachable. It is also useful for debugging purposes. - - A concatenated set of the most recent descriptors for all known servers - should be available at: - - http:///tor/server/all.z - - Extra-info documents are available at the URLS - - http:///tor/extra/d/... - http:///tor/extra/fp/... - http:///tor/extra/all[.z] - http:///tor/extra/authority[.z] - (As for /tor/server/ URLs: supports fetching extra-info - documents by their digest, by the fingerprint of their servers, - or all at once. When serving by fingerprint, we serve the - extra-info that corresponds to the descriptor we would serve by - that fingerprint. Only directory authorities of version - 0.2.0.1-alpha or later are guaranteed to support the first - three classes of URLs. Caches may support them, and MUST - support them if they have advertised "caches-extra-info".) - - For debugging, directories SHOULD expose non-compressed objects at - URLs like the above, but without the final ".z". If the client uses - Accept-Encodings header, it should override the presence or absence - of the ".z" (see section 6.1). - - Clients SHOULD use upper case letters (A-F) when base16-encoding - fingerprints. Servers MUST accept both upper and lower case fingerprints - in requests. - -C. 
Converting a curve25519 public key to an ed25519 public key - - Given an X25519 key, that is, an affine point (u,v) on the - Montgomery curve defined by - - bv^2 = u(u^2 + au +1) - - where - - a = 486662 - b = 1 - - and comprised of the compressed form (i.e. consisting of only the - u-coordinate), we can retrieve the y-coordinate of the affine point - (x,y) on the twisted Edwards form of the curve defined by - - -x^2 + y^2 = 1 + d x^2 y^2 - - where - - d = - 121665/121666 - - by computing - - y = (u-1)/(u+1). - - and then we can apply the usual curve25519 twisted Edwards point - decompression algorithm to find _an_ x-coordinate of an affine - twisted Edwards point to check signatures with. Signing keys for - ed25519 are compressed curve points in twisted Edwards form (so a - y-coordinate and the sign of the x-coordinate), and X25519 keys are - compressed curve points in Montgomery form (i.e. a u-coordinate). - - However, note that a compressed point in Montgomery form neglects to - encode what the sign of the corresponding twisted Edwards - x-coordinate would be. Thus, we need the sign of the x-coordinate - to do this operation; otherwise, we'll have two possible - x-coordinates that might correspond to the ed25519 public key. - - To get the sign, the easiest way is to take the corresponding - private key, feed it to the ed25519 public key generation - algorithm, and see what the sign is. - - [Recomputing the sign bit from the private key every time sounds - rather strange and inefficient to me… -isis] - - Note that in addition to its coordinates, an expanded Ed25519 private key - also has a 32-byte random value, "prefix", used to compute internal `r` - values in the signature. For security, this prefix value should be - derived deterministically from the curve25519 key.
The Tor - implementation derives it as SHA512(private_key | STR)[0..32], where - STR is the nul-terminated string: - - "Derive high part of ed25519 key from curve25519 key\0" - - - On the client side, where there is no access to the curve25519 private - keys, one may use the curve25519 public key's Montgomery u-coordinate to - recover the Montgomery v-coordinate by computing the right-hand side of - the Montgomery curve equation: - - bv^2 = u(u^2 + au +1) - - where - - a = 486662 - b = 1 - - Then, knowing the intended sign of the Edwards x-coordinate, one - may recover said x-coordinate by computing: - - x = (u/v) * sqrt(-a - 2) - -D. Inferring missing proto lines. - - The directory authorities no longer allow versions of Tor before - 0.2.4.18-rc. But right now, there is no version of Tor in the consensus - before 0.2.4.19. Therefore, we should disallow versions of Tor earlier - than 0.2.4.19, so that we can have the protocol list for all current Tor - versions include: - - Cons=1-2 Desc=1-2 DirCache=1 HSDir=1 HSIntro=3 HSRend=1-2 Link=1-4 - LinkAuth=1 Microdesc=1-2 Relay=1-2 - - For Desc, Microdesc and Cons, Tor versions before 0.2.7.stable should be - taken to only support version 1. - -E. Limited ed diff format - - We support the following format for consensus diffs. It's a - subset of the ed diff format, but clients MUST NOT accept other - ed commands. - - We support the following ed commands, each on a line by itself: - - - "d" Delete line n1 - - ",d" Delete lines n1 through n2, inclusive - - ",$d" Delete line n1 through the end of the file, inclusive. - - "c" Replace line n1 with the following block - - ",c" Replace lines n1 through n2, inclusive, with the - following block. - - "a" Append the following block after line n1. - - Note that line numbers always apply to the file after all previous - commands have already been applied. Note also that line numbers - are 1-indexed. 
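The command list above can be exercised with a small applier. This is a minimal sketch, assuming that a text block is terminated by a line containing only "." (as in standard ed) and that the file and the diff are given as lists of lines without newlines. The name apply_ed_diff is hypothetical and this is not the code Tor uses. Any command outside the listed forms is rejected.

```python
import re

# n1, optionally followed by ",n2" or ",$", then one of a, c, d.
_CMD = re.compile(r"^(\d+)(?:,(\d+|\$))?([acd])$")


def apply_ed_diff(lines, diff_lines):
    """Apply a limited ed-style diff to `lines` and return the new list.

    Commands are applied in the order given; line numbers are 1-indexed
    and refer to the file as it stands when the command is applied.
    """
    out = list(lines)
    i = 0
    while i < len(diff_lines):
        m = _CMD.match(diff_lines[i])
        if m is None:
            raise ValueError("unsupported ed command: %r" % diff_lines[i])
        i += 1
        n1, end, op = int(m.group(1)), m.group(2), m.group(3)
        if end == "$":
            if op != "d":
                raise ValueError("'$' is only accepted with 'd'")
            n2 = len(out)
        elif end is None:
            n2 = n1
        else:
            n2 = int(end)
        # Collect the text block for append and change commands.
        block = []
        if op in "ac":
            while True:
                if i >= len(diff_lines):
                    raise ValueError("unterminated block")
                line = diff_lines[i]
                i += 1
                if line == ".":
                    break
                block.append(line)
        if op == "a":
            if end is not None or n1 > len(out):
                raise ValueError("bad append address")
            out[n1:n1] = block           # insert after line n1
        else:
            if not 1 <= n1 <= n2 <= len(out):
                raise ValueError("line range out of bounds")
            out[n1 - 1:n2] = block       # empty block for 'd'
    return out
```

For instance, applying the commands "4,$d", then "2c" with block "B", then "1a" with block "x", "y" to the five lines a, b, c, d, e yields a, x, y, B, c.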
- - The commands MUST apply to the file from back to front, such that - lines are only ever referred to by their position in the original - file. - - If there are any directory signatures on the original document, the - first command MUST be a ",$d" form to remove all of the directory - signatures. Using this format ensures that the client will - successfully apply the diff even if they have an unusual encoding for - the signatures. - - The replace and append command take blocks. These blocks are simply - appended to the diff after the line with the command. A line with - just a period (".") ends the block (and is not part of the lines - to add). Note that it is impossible to insert a line with just - a single dot. diff --git a/ext-orport-spec.txt b/ext-orport-spec.txt deleted file mode 100644 index 6b8f8e1..0000000 --- a/ext-orport-spec.txt +++ /dev/null @@ -1,226 +0,0 @@ - Extended ORPort for pluggable transports - George Kadianakis, Nick Mathewson - -Table of Contents - - 1. Overview - 2. Establishing a connection and authenticating. - 2.1. Authentication type: SAFE_COOKIE - 2.1.2. Cookie-file format - 2.1.3. SAFE_COOKIE Protocol specification - 3. The extended ORPort protocol - 3.1. Protocol - 3.2. Command descriptions - 3.2.1. USERADDR - 3.2.2. TRANSPORT - 4. Security Considerations - -1. Overview - - This document describes the "Extended ORPort" protocol, a wrapper - around Tor's ordinary ORPort protocol for use by bridges that - support pluggable transports. It provides a way for server-side PTs - and bridges to exchange additional information before beginning - the actual OR connection. - - See `tor-spec.txt` for information on the regular OR protocol, and - `pt-spec.txt` for information on pluggable transports. - - This protocol was originally proposed in proposal 196, and - extended with authentication in proposal 217. - -2. Establishing a connection and authenticating. 
- - When a client (that is to say, a server-side pluggable transport) - connects to an Extended ORPort, the server sends: - - AuthTypes [variable] - EndAuthTypes [1 octet] - - Where, - - + AuthTypes are the authentication schemes that the server supports - for this session. They are multiple concatenated 1-octet values that - take values from 1 to 255. - + EndAuthTypes is the special value 0. - - The client reads the list of supported authentication schemes, - chooses one, and sends it back: - - AuthType [1 octet] - - Where, - - + AuthType is the authentication scheme that the client wants to use - for this session. A valid authentication type takes values from 1 to - 255. A value of 0 means that the client did not like the - authentication types offered by the server. - - If the client sent an AuthType of value 0, or an AuthType that the - server does not support, the server MUST close the connection. - -2.1. Authentication type: SAFE_COOKIE - - We define one authentication type: SAFE_COOKIE. Its AuthType - value is 1. It is based on the client proving to the bridge that - it can access a given "cookie" file on disk. The purpose of - authentication is to defend against cross-protocol attacks. - - If the Extended ORPort is enabled, Tor should regenerate the cookie - file on startup and store it in - $DataDirectory/extended_orport_auth_cookie. - - The location of the cookie can be overridden by using the - configuration file parameter ExtORPortCookieAuthFile, which is - defined as: - - ExtORPortCookieAuthFile - - where is a filesystem path. - -2.1.2. Cookie-file format - - The format of the cookie-file is: - - StaticHeader [32 octets] - Cookie [32 octets] - - Where, - + StaticHeader is the following string: - "! Extended ORPort Auth Cookie !\x0a" - + Cookie is the shared-secret. During the SAFE_COOKIE protocol, the - cookie is called CookieString. 
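The cookie-file layout above can be checked with a few lines of code. This is a minimal sketch: the function name parse_cookie_file is hypothetical, and rejecting files that are not exactly 64 octets long is an assumption made for the example rather than something the text requires.

```python
# Hypothetical sketch of reading the cookie file described above.

STATIC_HEADER = b"! Extended ORPort Auth Cookie !\x0a"  # 32 octets
COOKIE_LEN = 32


def parse_cookie_file(data):
    """Return the 32-octet cookie from the raw contents of a cookie file.

    Raises ValueError if the header is missing or the length is wrong.
    The exact-length check is an assumption of this sketch.
    """
    if len(data) != len(STATIC_HEADER) + COOKIE_LEN:
        raise ValueError("unexpected cookie file length")
    if data[:len(STATIC_HEADER)] != STATIC_HEADER:
        raise ValueError("StaticHeader not found")
    return data[len(STATIC_HEADER):]
```

A caller would read the file in binary mode and pass the bytes to parse_cookie_file; the returned value plays the role of CookieString.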
- - Extended ORPort clients MUST make sure that the StaticHeader is - present in the cookie file, before proceeding with the - authentication protocol. - -2.1.3. SAFE_COOKIE Protocol specification - - - A client that performs the SAFE_COOKIE handshake begins by sending: - - ClientNonce [32 octets] - - Where, - + ClientNonce is 32 octets of random data. - - Then, the server replies with: - - ServerHash [32 octets] - ServerNonce [32 octets] - - Where, - + ServerHash is computed as: - HMAC-SHA256(CookieString, - "ExtORPort authentication server-to-client hash" | ClientNonce | ServerNonce) - + ServerNonce is 32 random octets. - - Upon receiving that data, the client computes ServerHash, and - validates it against the ServerHash provided by the server. - - If the server-provided ServerHash is invalid, the client MUST - terminate the connection. - - Otherwise the client replies with: - - ClientHash [32 octets] - - Where, - + ClientHash is computed as: - HMAC-SHA256(CookieString, - "ExtORPort authentication client-to-server hash" | ClientNonce | ServerNonce) - - Upon receiving that data, the server computes ClientHash, and - validates it against the ClientHash provided by the client. - - Finally, the server replies with: - - Status [1 octet] - - Where, - + Status is 1 if the authentication was successful. If the - authentication failed, Status is 0. - -3. The extended ORPort protocol - - Once a connection is established and authenticated, the parties - communicate with the protocol described here. - -3.1. Protocol - - The extended server port protocol is as follows: - - COMMAND [2 bytes, big-endian] - BODYLEN [2 bytes, big-endian] - BODY [BODYLEN bytes] - - Commands sent from the transport proxy to the bridge are: - - [0x0000] DONE: There is no more information to give. The next - bytes sent by the transport will be those tunneled over it. - (body ignored) - - [0x0001] USERADDR: an address:port string that represents the - client's address. 
- - [0x0002] TRANSPORT: a string of the name of the pluggable - transport currently in effect on the connection. - - Replies sent from tor to the proxy are: - - [0x1000] OKAY: Send the user's traffic. (body ignored) - - [0x1001] DENY: Tor would prefer not to get more traffic from - this address for a while. (body ignored) - - [0x1002] CONTROL: (Not used) - - Parties MUST ignore command codes that they do not understand. - - If the server receives a recognized command that does not parse, it - MUST close the connection to the client. - -3.2. Command descriptions - -3.2.1. USERADDR - - An ASCII string holding the TCP/IP address of the client of the - pluggable transport proxy. A Tor bridge SHOULD use that address to - collect statistics about its clients. Recognized formats are: - 1.2.3.4:5678 - [1:2::3:4]:5678 - - (Current Tor versions may accept other formats, but this is a bug: - transports MUST NOT send them.) - - The string MUST NOT be NUL-terminated. - -3.2.2. TRANSPORT - - An ASCII string holding the name of the pluggable transport used by - the client of the pluggable transport proxy. A Tor bridge that - supports multiple transports SHOULD use that information to collect - statistics about the popularity of individual pluggable transports. - - The string MUST NOT be NUL-terminated. - - Pluggable transport names are C-identifiers and Tor MUST check them - for correctness. - -4. Security Considerations - - Extended ORPort or TransportControlPort do _not_ provide link - confidentiality, authentication or integrity. Sensitive data, like - cryptographic material, should not be transferred through them. - - An attacker with superuser access is able to sniff network traffic, - and capture TransportControlPort identifiers and any data passed - through those ports. - - Tor SHOULD issue a warning if the bridge operator tries to bind - Extended ORPort to a non-localhost address. 
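As an illustration of the wire formats in sections 2.1.3 and 3.1, the following Python sketch computes the SAFE_COOKIE hashes and encodes and decodes Extended ORPort commands. It is a sketch under stated assumptions, not a reference implementation: all function names are hypothetical, "|" is read as byte concatenation, and the constant-time comparison is an addition that the protocol text does not require.

```python
import hashlib
import hmac
import struct

SERVER_CONST = b"ExtORPort authentication server-to-client hash"
CLIENT_CONST = b"ExtORPort authentication client-to-server hash"


def safe_cookie_hash(cookie, constant, client_nonce, server_nonce):
    """HMAC-SHA256(CookieString, constant | ClientNonce | ServerNonce)."""
    # Assumption: "|" in the spec means concatenation of octet strings.
    msg = constant + client_nonce + server_nonce
    return hmac.new(cookie, msg, hashlib.sha256).digest()


def client_reply(cookie, client_nonce, server_hash, server_nonce):
    """Validate ServerHash; return ClientHash, or raise to signal 'terminate'."""
    expected = safe_cookie_hash(cookie, SERVER_CONST, client_nonce, server_nonce)
    # compare_digest is a constant-time comparison (an addition, see lead-in).
    if not hmac.compare_digest(expected, server_hash):
        raise ValueError("invalid ServerHash: terminate the connection")
    return safe_cookie_hash(cookie, CLIENT_CONST, client_nonce, server_nonce)


def encode_command(command, body=b""):
    """COMMAND and BODYLEN as 2-byte big-endian integers, followed by BODY."""
    if len(body) > 0xFFFF:
        raise ValueError("body does not fit in a 2-byte BODYLEN")
    return struct.pack(">HH", command, len(body)) + body


def decode_command(buf):
    """Return (command, body, remaining bytes), or None if buf is incomplete."""
    if len(buf) < 4:
        return None
    command, bodylen = struct.unpack(">HH", buf[:4])
    if len(buf) < 4 + bodylen:
        return None
    return command, buf[4:4 + bodylen], buf[4 + bodylen:]
```

For example, a transport proxy would send encode_command(0x0001, b"1.2.3.4:5678") for USERADDR and then encode_command(0x0000) for DONE.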
- - Pluggable transport proxies SHOULD issue a warning if they are - instructed to connect to a non-localhost Extended ORPort. - diff --git a/gettor-spec.txt b/gettor-spec.txt deleted file mode 100644 index a4959b4..0000000 --- a/gettor-spec.txt +++ /dev/null @@ -1,88 +0,0 @@ - - GetTor specification - Jacob Appelbaum - -Table of Contents - - 0. Preface - 1. Overview - 2. Implementation - 2.1. Reference implementation - 3. SMTP transport - 3.1. SMTP transport security considerations - 3.2. SMTP transport privacy considerations - 4. Other transports - 5. Implementation suggestions - -0. Preface - - This document describes GetTor and how to properly implement GetTor. - -1. Overview - - GetTor was created to resolve direct and indirect censorship of Tor's - software. In many countries and networks Tor's main website is blocked and - would-be Tor users are unable to download even the source code to the Tor - program. Other software hosted by the Tor Project is similarly censored. The - filtering of the possible download sites is sometimes easy to bypass by using - our TLS enabled website. In other cases the website and all of the mirrors are - entirely blocked; this is a situation where a user seems to actually need Tor - to fetch Tor. We discovered that it is feasible to use alternate transport - methods such as SMTP between a non-trusted third party or with IRC and XDCC. - -2. Implementation - - Any compliant GetTor implementation will implement at least a single transport - to meet the needs of a certain class of users. It should be i18n and l10n - compliant for all user facing interactions; users should be able to manually - set their language and this should serve as their preference for localization - of any software delivered. The implementation must be free software and it - should be freely available by request from the implementation that they - interface with to download any of the other software available from that - GetTor instance. 
Security and privacy considerations should be described on a - per transport basis. - -2.1. Reference implementation - - We have implemented[0] a compliant GetTor that supports SMTP as a transport. - -3. SMTP transport - - The SMTP transport for GetTor should allow users to send any RFC822 compliant - message in any known human language; GetTor should respond in whatever - language is detected with supplementary translations in the same email. - GetTor shall offer a list of all available software in the body of the email - - it should offer the software as a list of packages and their subsequent - descriptions. - -3.1. SMTP transport security considerations - - Any GetTor instance that offers SMTP as a transport should optionally - implement the checking of DKIM signatures to ensure that email is not forged. - Optionally GetTor should take an OpenPGP key from the user and encrypt the - response with a blinded message. - -3.2. SMTP transport privacy considerations - - Any GetTor instance that offers SMTP as a transport must at least store the - requester's address for the time that it takes to process a response. This - should not be written to any permanent storage medium; GetTor should function - without any long term storage excepting a cache of files that it will send to - any user who requests it. - - GetTor may optionally collect anonymized usage statistics to better understand - how GetTor[1] is in use. This must not include any personally identifying - information about any requester beyond language selection. - -4. Other transports - - At this time no other transports have been specified. IRC XDCC is a likely - useful system as is XMPP/Jabber with the newest OTR file sharing transport. - -5. Implementation suggestions - - It is suggested that any compliant GetTor instance should be written in a so - called "safe" language such as Python. 
- -[0] https://gitweb.torproject.org/gettor.git -[1] https://metrics.torproject.org/packages.html diff --git a/glossary.txt b/glossary.txt deleted file mode 100644 index 68de376..0000000 --- a/glossary.txt +++ /dev/null @@ -1,198 +0,0 @@ - - Glossary - - The Tor Project - -This document aims to specify terms, notations, and phrases related -to Tor, as used in the Tor specification documents and other documentation. - -This glossary is not a design document; it is only a reference. - -This glossary is a work-in-progress; double-check its definitions before -citing them authoritatively. ;) - -Table of Contents - - 0. Preliminaries - 1.0. Commonly used Tor configuration terms - 2.0. Tor network components - 2.1. Relays, aka OR (onion router) - 2.1.1. Specific roles - 2.2. Client, aka OP (onion proxy) - 2.3. Authorities - 2.4. Hidden Service - 2.5. Circuit - 2.6. Edge connection - 2.7. Consensus - 2.8. Descriptor - 3.0. Tor network protocols - 3.1. Link handshake - 3.2. Circuit handshake - 3.3. Hidden Service Protocol - 3.4. Directory Protocol - 4.0. General network definitions - -0. Preliminaries - - The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL - NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and - "OPTIONAL" in this document are to be interpreted as described in - RFC 2119. - -1.0. Commonly used Tor configuration terms - - ORPort - Onion Router Port - DirPort - Directory Port - -2.0. Tor network components - -2.1. Relays, aka OR (onion router) - - [Style guide: prefer the term "Relay"] - -2.1.1. Specific roles - - Exit relay: The final hop in an exit circuit before traffic leaves - the Tor network to connect to external servers. - - Non-exit relay: Relays that send and receive traffic only to - other Tor relays and Tor clients. - - Entry relay: The first hop in a Tor circuit. Can be either a guard - relay or a bridge, depending on the client's configuration. - - Guard relay: A relay that a client uses as its entry for a longer - period of time. 
Guard relays are rotated more slowly to prevent - attacks that can come from being exposed to too many guards. - - Bridge: A relay intentionally not listed in the public Tor - consensus, with the purpose of circumventing entities (such as - governments or ISPs) seeking to block clients from using Tor. - Currently, bridges are used only as entry relays. - - Directory cache: A relay that downloads cached directory information - from the directory authorities and serves it to clients on demand. - Any relay will act as a directory cache, if its bandwidth is high enough. - - Rendezvous point: A relay connecting a client to a hidden service. - Each party builds a three-hop circuit, meeting at the - rendezvous point. - -2.2. Client, aka OP (onion proxy) - - [Style: the "OP" and "onion proxy" terms are deprecated.] - -2.3. Authorities: - - Directory Authority: Nine total in the Tor network, operated by - trusted individuals. Directory authorities define and serve the - consensus document, defining the "state of the network." This document - contains a "router status" section for every relay currently - in the network. Directory authorities also serve router descriptors, - extra info documents, microdescriptors, and the microdescriptor consensus. - - Bridge Authority: One total. Similar in responsibility to directory - authorities, but for bridges. - - Fallback directory mirror: One of a list of directory caches distributed - with the Tor software. (When a client first connects to the network, and - has no directory information, it asks a fallback directory. From then on, - the client can ask any directory cache that's listed in the directory - information it has.) - -2.4. Hidden Service: - - A hidden service is a server that will only accept incoming - connections via the hidden service protocol. 
Connection - initiators will not be able to learn the IP address of the hidden - service, allowing the hidden service to receive incoming connections, - serve content, etc, while preserving its location anonymity. - -2.5. Circuit: - - An established path through the network, where cryptographic keys - are negotiated using the ntor protocol or TAP (Tor Authentication - Protocol (deprecated)) with each hop. Circuits can differ in length - depending on their purpose. See also Leaky Pipe Topology. - - Origin Circuit - - - Exit Circuit: A circuit which connects clients to destinations - outside the Tor network. For example, if a client wanted to visit - duckduckgo.com, this connection would require an exit circuit. - - Internal Circuit: A circuit whose traffic never leaves the Tor - network. For example, a client could connect to a hidden service via - an internal circuit. - -2.6. Edge connection: - -2.7. Consensus: The state of the Tor network, published every hour, - decided by a vote from the network's directory authorities. Clients - fetch the consensus from directory authorities, fallback - directories, or directory caches. - -2.8. Descriptor: Each descriptor represents information about one - relay in the Tor network. The descriptor includes the relay's IP - address, public keys, and other data. Relays send - descriptors to directory authorities, who vote and publish a - summary of them in the network consensus. - -3.0. Tor network protocols - -3.1. Link handshake - - The link handshake establishes the TLS connection over which two - Tor participants will send Tor cells. This handshake also - authenticates the participants to each other, possibly using Tor - cells. - -3.2. Circuit handshake - - Circuit handshakes establish the hop-by-hop onion encryption - that clients use to tunnel their application traffic. The - client does a pairwise key establishment handshake with each - individual relay in the circuit. 
For every hop except the - first, these handshakes tunnel through existing hops in the - circuit. Each cell type in this protocol also has a newer - version (with a "2" suffix), e.g., CREATE2. - - CREATE cell: First part of a handshake, sent by the initiator. - - CREATED cell: Second part of a handshake, sent by the responder. - - EXTEND cell: (also known as a RELAY_EXTEND cell) First part of a - handshake, tunneled through an existing circuit. The last relay - in the circuit so far will decrypt this cell and send the - payload in a CREATE cell to the chosen next hop relay. - - EXTENDED cell: (also known as a RELAY_EXTENDED cell) Second part - of a handshake, tunneled through an existing circuit. The last - relay in the circuit so far receives the CREATED cell from the - new last hop relay and encrypts the payload in an EXTENDED cell - to tunnel back to the client. - - Onion skin: A CREATE/CREATE2 or EXTEND/EXTEND2 payload that - contains the first part of the TAP or ntor key establishment - handshake. - -3.3. Hidden Service Protocol - -3.4. Directory Protocol - - -4.0. General network definitions - - Leaky Pipe Topology: The ability of the origin of a circuit to address - relay cells to any hop in the path of a circuit. In Tor, - the destination hop is determined by using the 'recognized' field of relay - cells. - - Stream: A single application-level connection or request, multiplexed over - a Tor circuit. A 'Stream' can currently carry the contents of a TCP - connection, a DNS request, or a Tor directory request. - - Channel: A pairwise connection between two Tor relays, or between a - client and a relay. Circuits are multiplexed over Channels. All - channels are currently implemented as TLS connections. 
- diff --git a/guard-spec.txt b/guard-spec.txt deleted file mode 100644 index 154edae..0000000 --- a/guard-spec.txt +++ /dev/null @@ -1,972 +0,0 @@ - - Tor Guard Specification - - Isis Lovecruft - George Kadianakis - Ola Bini - Nick Mathewson - -Table of Contents - - 1. Introduction and motivation - 2. State instances - 3. Circuit Creation, Entry Guard Selection (1000 foot view) - 3.1 Path selection - 3.1.1 Managing entry guards - 3.1.2 Middle and exit node selection - 3.2 Circuit Building - 4. The algorithm. - 4.0. The guards listed in the current consensus. [Section:GUARDS] - 4.1. The Sampled Guard Set. [Section:SAMPLED] - 4.2. The Usable Sample [Section:FILTERED] - 4.3. The confirmed-guard list. [Section:CONFIRMED] - 4.4. The Primary guards [Section:PRIMARY] - 4.5. Retrying guards. [Section:RETRYING] - 4.6. Selecting guards for circuits. [Section:SELECTING] - 4.7. When a circuit fails. [Section:ON_FAIL] - 4.8. When a circuit succeeds [Section:ON_SUCCESS] - 4.9. Updating the list of waiting circuits [Section:UPDATE_WAITING] - 4.10. Whenever we get a new consensus. [Section:ON_CONSENSUS] - 4.11. Deciding whether to generate a new circuit. - 4.12. When we are missing descriptors. - A. Appendices - A.0. Acknowledgements - A.1. Parameters with suggested values. [Section:PARAM_VALS] - A.2. Random values [Section:RANDOM] - A.3. Why not a sliding scale of primaryness? [Section:CVP] - A.4. Controller changes - A.5. Persistent state format - -1. Introduction and motivation - - Tor uses entry guards to prevent an attacker who controls some - fraction of the network from observing a fraction of every user's - traffic. If users chose their entries and exits uniformly at - random from the list of servers every time they build a circuit, - then an adversary who had (k/N) of the network would deanonymize - F=(k/N)^2 of all circuits... and after a given user had built C - circuits, the attacker would see them at least once with - probability 1-(1-F)^C. 
With large C, the attacker would get a - sample of every user's traffic with probability 1. - - To prevent this from happening, Tor clients choose a small number - of guard nodes (e.g. 3). These guard nodes are the only - nodes that the client will connect to directly. If they are not - compromised, the user's paths are not compromised. - - This specification outlines Tor's guard housekeeping algorithm, - which tries to meet the following goals: - - - Heuristics and algorithms for determining how and which guards - are chosen should be kept as simple and easy to understand as - possible. - - - Clients in censored regions or who are behind a fascist - firewall who connect to the Tor network should not experience - any significant disadvantage in terms of reachability or - usability. - - - Tor should make a best attempt at discovering the most - appropriate behavior, with as little user input and - configuration as possible. - - - Tor clients should discover usable guards without too much - delay. - - - Tor clients should resist (to the extent possible) attacks - that try to force them onto compromised guards. - - - Should maintain the load-balancing offered by the path selection - algorithm - -2. State instances - - In the algorithm below, we describe a set of persistent and - non-persistent state variables. These variables should be - treated as an object, of which multiple instances can exist. - - In particular, we specify the use of three particular instances: - - A. UseBridges - - If UseBridges is set, then we replace the {GUARDS} set in - [Sec:GUARDS] below with the list of configured - bridges. We maintain a separate persistent instance of - {SAMPLED_GUARDS} and {CONFIRMED_GUARDS} and other derived - values for the UseBridges case. - - In this case, we impose no upper limit on the sample size. - - B. 
EntryNodes / ExcludeNodes / Reachable*Addresses / - FascistFirewall / ClientUseIPv4=0 - - If one of the above options is set, and UseBridges is not, - then we compare the fraction of usable guards in the consensus - to the total number of guards in the consensus. - - If this fraction is less than {MEANINGFUL_RESTRICTION_FRAC}, - we use a separate instance of the state. - - (While Tor is running, we do not change back and forth between - the separate instance of the state and the default instance - unless the fraction of usable guards is 5% higher than, or 5% - lower than, {MEANINGFUL_RESTRICTION_FRAC}. This prevents us - from flapping back and forth between instances if we happen to - hit {MEANINGFUL_RESTRICTION_FRAC} exactly.) - - If this fraction is less than {EXTREME_RESTRICTION_FRAC}, we use a - separate instance of the state, and warn the user. - - [TODO: should we have a different instance for each set of heavily - restricted options?] - - C. Default - - If neither of the above variant-state instances is used, - we use a default instance. - -3. Circuit Creation, Entry Guard Selection (1000 foot view) - - A circuit in Tor is a path through the network connecting a client to - its destination. At a high level, a three-hop exit circuit will look - like this: - - Client <-> Entry Guard <-> Middle Node <-> Exit Node <-> Destination - - Entry guards are the only nodes which a client will connect to - directly. Exit relays are the nodes by which traffic exits the - Tor network in order to connect to an external destination. - - 3.1 Path selection - - For any multi-hop circuit, at least one entry guard and middle node(s) are - required. An exit node is required if traffic will exit the Tor - network. Depending on its configuration, a relay listed in a - consensus could be used for any of these roles. However, this - specification defines how entry guards specifically should be selected and - managed, as opposed to middle or exit nodes. 
- - 3.1.1 Managing entry guards - - At a high level, a relay listed in a consensus will move through the - following states in the process from initial selection to eventual - usage as an entry guard: - - relays listed in consensus - | - sampled - | | - confirmed filtered - | | | - primary usable_filtered - - Relays listed in the latest consensus can be sampled for guard usage - if they have the "Guard" flag. Sampling is random but weighted by - a measured bandwidth multiplied by bandwidth-weights (Wgg if guard only, - Wgd if guard+exit flagged). - - Once a path is built and a circuit established using this guard, it - is marked as confirmed. Until this point, guards are first sampled - and then filtered based on information such as our current - configuration (see SAMPLED and FILTERED sections) and later marked as - usable_filtered if the guard is not primary but can be reached. - - It is always preferable to use a primary guard when building a new - circuit in order to reduce guard churn; only on failure to connect to - existing primary guards will new guards be used. - - 3.1.2 Middle and exit node selection - - Middle nodes are selected at random from relays listed in the latest - consensus, weighted by bandwidth and bandwidth-weights. Exit nodes are - chosen similarly but restricted to relays with a sufficiently permissive - exit policy. - - 3.2 Circuit Building - - Once a path is chosen, Tor will use this path to build a new circuit. - - If the circuit is built successfully, Tor will either use it - immediately, or Tor will wait for a circuit with a more preferred - guard if there's a good chance that it will be able to make one. - - If the circuit fails in a way that makes us conclude that a guard - is not reachable, the guard is marked as unreachable, the circuit is - closed, and waiting circuits are updated. - -4. The algorithm. - -4.0. The guards listed in the current consensus. 
[Section:GUARDS] - - By {set:GUARDS} we mean the set of all guards in the current - consensus that are usable for all circuits and directory - requests. (They must have the flags: Stable, Fast, V2Dir, Guard.) - - **Rationale** - - We require all guards to have the flags that we potentially need - from any guard, so that all guards are usable for all circuits. - -4.1. The Sampled Guard Set. [Section:SAMPLED] - - We maintain a set, {set:SAMPLED_GUARDS}, that persists across - invocations of Tor. It is a subset of the nodes ordered by a sample idx that - we have seen listed as a guard in the consensus at some point. - For each such guard, we record persistently: - - - {pvar:ADDED_ON_DATE}: The date on which it was added to - sampled_guards. - - We set this value to a point in the past, using - RAND(now, {GUARD_LIFETIME}/10). See - Appendix [RANDOM] below. - - - {pvar:ADDED_BY_VERSION}: The version of Tor that added it to - sampled_guards. - - - {pvar:IS_LISTED}: Whether it was listed as a usable Guard in - the _most recent_ consensus we have seen. - - - {pvar:FIRST_UNLISTED_AT}: If IS_LISTED is false, the publication date - of the earliest consensus in which this guard was listed such that we - have not seen it listed in any later consensus. Otherwise "None." - We randomize this to a point in the past, based on - RAND(added_at_time, {REMOVE_UNLISTED_GUARDS_AFTER} / 5) - - For each guard in {SAMPLED_GUARDS}, we also record this data, - non-persistently: - - - {tvar:last_tried_connect}: A 'last tried to connect at' - time. Default 'never'. - - - {tvar:is_reachable}: an "is reachable" tristate, with - possible values { <yes>, <no>, <maybe> }. - Default '<maybe>.' - - [Note: "yes" is not strictly necessary, but I'm - making it distinct from "maybe" anyway, to make our - logic clearer. A guard is "maybe" reachable if it's - worth trying. A guard is "yes" reachable if we tried - it and succeeded.] - - - {tvar:failing_since}: The first time when we failed to - connect to this guard. 
Defaults to "never". Reset to - "never" when we successfully connect to this guard. - - - {tvar:is_pending} A "pending" flag. This indicates that we - are trying to build an exploratory circuit through the - guard, and we don't know whether it will succeed. - - - {tvar:pending_since}: A timestamp. Set whenever we set - {tvar:is_pending} to true; cleared whenever we set - {tvar:is_pending} to false. NOTE - - We require that {SAMPLED_GUARDS} contain at least - {MIN_FILTERED_SAMPLE} guards from the consensus (if possible), - but not more than {MAX_SAMPLE_THRESHOLD} of the number of guards - in the consensus, and not more than {MAX_SAMPLE_SIZE} in total. - (But if the maximum would be smaller than {MIN_FILTERED_SAMPLE}, we - set the maximum at {MIN_FILTERED_SAMPLE}.) - - To add a new guard to {SAMPLED_GUARDS}, pick an entry at random from - ({GUARDS} - {SAMPLED_GUARDS}), according to the path selection rules. - - We remove an entry from {SAMPLED_GUARDS} if: - - * We have a live consensus, and {IS_LISTED} is false, and - {FIRST_UNLISTED_AT} is over {REMOVE_UNLISTED_GUARDS_AFTER} - days in the past. - - OR - - * We have a live consensus, and {ADDED_ON_DATE} is over - {GUARD_LIFETIME} ago, *and* {CONFIRMED_ON_DATE} is either - "never", or over {GUARD_CONFIRMED_MIN_LIFETIME} ago. - - Note that {SAMPLED_GUARDS} does not depend on our configuration. - It is possible that we can't actually connect to any of these - guards. - - **Rationale** - - The {SAMPLED_GUARDS} set is meant to limit the total number of - guards that a client will connect to in a given period. The - upper limit on its size prevents us from considering too many - guards. - - The first expiration mechanism is there so that our - {SAMPLED_GUARDS} list does not accumulate so many dead - guards that we cannot add new ones. - - The second expiration mechanism makes us rotate our guards slowly - over time. 
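The two expiration mechanisms can be sketched in Python as follows. This is a minimal sketch rather than Tor's implementation: the SampledGuard class and the should_remove function are hypothetical names, "never" and "None" are modelled as Python None, and the three durations are passed in as parameters because their suggested values are defined elsewhere in the specification.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class SampledGuard:
    # Hypothetical container for the persistent fields described above.
    added_on_date: datetime
    is_listed: bool = True
    first_unlisted_at: Optional[datetime] = None   # None models "None."
    confirmed_on_date: Optional[datetime] = None   # None models "never"


def should_remove(guard: SampledGuard, now: datetime, have_live_consensus: bool,
                  remove_unlisted_after: timedelta,
                  guard_lifetime: timedelta,
                  confirmed_min_lifetime: timedelta) -> bool:
    """Return True if the guard should be dropped from {SAMPLED_GUARDS}."""
    # Both rules apply only when we have a live consensus.
    if not have_live_consensus:
        return False
    # Rule 1: unlisted for longer than {REMOVE_UNLISTED_GUARDS_AFTER}.
    if (not guard.is_listed and guard.first_unlisted_at is not None
            and now - guard.first_unlisted_at > remove_unlisted_after):
        return True
    # Rule 2: sampled more than {GUARD_LIFETIME} ago, and either never
    # confirmed or confirmed more than {GUARD_CONFIRMED_MIN_LIFETIME} ago.
    if now - guard.added_on_date > guard_lifetime:
        if (guard.confirmed_on_date is None
                or now - guard.confirmed_on_date > confirmed_min_lifetime):
            return True
    return False
```

Passing the durations as arguments keeps the sketch independent of any particular parameter values.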
- - Ordering the {SAMPLED_GUARDS} set in the order in which we sampled those - guards and picking guards from that set according to this ordering improves - load-balancing. It more closely matches the expected usage of the guard nodes - as per the path selection rules. - - The ordering also improves on another objective of this proposal: trying to - resist an adversary pushing clients over compromised guards, since the - adversary would need the clients to exhaust all their initial - {SAMPLED_GUARDS} set before having a chance to use a newly deployed - adversary node. - - -4.2. The Usable Sample [Section:FILTERED] - - We maintain another set, {set:FILTERED_GUARDS}, that does not - persist. It is derived from: - - - {SAMPLED_GUARDS} - - our current configuration, - - the path bias information. - - A guard is a member of {set:FILTERED_GUARDS} if and only if all - of the following are true: - - - It is a member of {SAMPLED_GUARDS}, with {IS_LISTED} set to - true. - - It is not disabled because of path bias issues. - - It is not disabled because of ReachableAddresses policy, - the ClientUseIPv4 setting, the ClientUseIPv6 setting, - the FascistFirewall setting, or some other - option that prevents using some addresses. - - It is not disabled because of ExcludeNodes. - - It is a bridge if UseBridges is true; or it is not a - bridge if UseBridges is false. - - Is included in EntryNodes if EntryNodes is set and - UseBridges is not. (But see 2.B above). - - We have an additional subset, {set:USABLE_FILTERED_GUARDS}, which - is defined to be the subset of {FILTERED_GUARDS} where - {is_reachable} is <yes> or <maybe>. 
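The derivation of {FILTERED_GUARDS} and {USABLE_FILTERED_GUARDS} can be sketched in Python as follows. All names here are hypothetical. The configuration-dependent checks (address policy, ExcludeNodes, UseBridges, EntryNodes) are folded into a single caller-supplied predicate, allowed_by_config, which is an assumption made to keep the sketch short. Reachability is modelled with the strings "yes", "no" and "maybe".

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SampledGuard:
    # Hypothetical subset of the per-guard state needed for filtering.
    name: str
    is_listed: bool = True
    is_reachable: str = "maybe"      # "yes", "no", or "maybe"
    path_bias_disabled: bool = False


def filtered_guards(sampled: List[SampledGuard],
                    allowed_by_config: Callable[[SampledGuard], bool]
                    ) -> List[SampledGuard]:
    """Apply the filters after sampling, preserving sample order."""
    return [g for g in sampled
            if g.is_listed
            and not g.path_bias_disabled
            and allowed_by_config(g)]


def usable_filtered_guards(filtered: List[SampledGuard]) -> List[SampledGuard]:
    """Keep only filtered guards that are reachable or worth trying."""
    return [g for g in filtered if g.is_reachable in ("yes", "maybe")]
```

Because neither set persists, both functions would be re-evaluated whenever the sample, the configuration, or a guard's reachability changes.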
- - We try to maintain a requirement that {USABLE_FILTERED_GUARDS} - contain at least {MIN_FILTERED_SAMPLE} elements: - - Whenever we are going to sample from {USABLE_FILTERED_GUARDS}, - and it contains fewer than {MIN_FILTERED_SAMPLE} elements, we - add new elements to {SAMPLED_GUARDS} until one of the following - is true: - - * {USABLE_FILTERED_GUARDS} is large enough, - OR - * {SAMPLED_GUARDS} is at its maximum size. - - - ** Rationale ** - - These filters are applied _after_ sampling: if we applied them - before the sampling, then our sample would reflect the set of - filtering restrictions that we had in the past. - -4.3. The confirmed-guard list. [Section:CONFIRMED] - - [formerly USED_GUARDS] - - We maintain a persistent ordered list, {list:CONFIRMED_GUARDS}. - It contains guards that we have used before, in our preference - order of using them. It is a subset of {SAMPLED_GUARDS}. For - each guard in this list, we store persistently: - - - {pvar:IDENTITY} Its fingerprint. - - - {pvar:CONFIRMED_ON_DATE} When we added this guard to - {CONFIRMED_GUARDS}. - - Randomized to a point in the past as RAND(now, {GUARD_LIFETIME}/10). - - We append new members to {CONFIRMED_GUARDS} when we mark a circuit - built through a guard as "for user traffic." - - Whenever we remove a member from {SAMPLED_GUARDS}, we also remove - it from {CONFIRMED_GUARDS}. - - [Note: You can also regard the {CONFIRMED_GUARDS} list as a - total ordering defined over a subset of {SAMPLED_GUARDS}.] - - Definition: we call Guard A "higher priority" than another Guard B - if, when A and B are both reachable, we would rather use A. We - define priority as follows: - - * Every guard in {CONFIRMED_GUARDS} has a higher priority - than every guard not in {CONFIRMED_GUARDS}. - - * Among guards in {CONFIRMED_GUARDS}, the one appearing earlier - on the {CONFIRMED_GUARDS} list has a higher priority. - - * Among guards that do not appear in {CONFIRMED_GUARDS}, - {is_pending}==true guards have higher priority. 
- - * Among those, the guard with earlier {last_tried_connect} time - has higher priority. - - * Finally, among guards that do not appear in - {CONFIRMED_GUARDS} with {is_pending==false}, all have equal - priority. - - ** Rationale ** - - We add elements to this ordering when we have actually used them - for building a usable circuit. We could mark them at some other - time (such as when we attempt to connect to them, or when we - actually connect to them), but this approach keeps us from - committing to a guard before we actually use it for sensitive - traffic. - -4.4. The Primary guards [Section:PRIMARY] - - We keep a run-time non-persistent ordered list of - {list:PRIMARY_GUARDS}. It is a subset of {FILTERED_GUARDS}. It - contains {N_PRIMARY_GUARDS} elements. - - To compute primary guards, take the ordered intersection of - {CONFIRMED_GUARDS} and {FILTERED_GUARDS}, and take the first - {N_PRIMARY_GUARDS} elements. If there are fewer than - {N_PRIMARY_GUARDS} elements, append additional elements to - PRIMARY_GUARDS chosen from ({FILTERED_GUARDS} - {CONFIRMED_GUARDS}), - ordered in "sample order" (that is, by {ADDED_ON_DATE}). - - Once an element has been added to {PRIMARY_GUARDS}, we do not remove it - until it is replaced by some element from {CONFIRMED_GUARDS}. - That is: if a non-primary guard becomes confirmed and not every primary - guard is confirmed, then the list of primary guards list is regenerated, - first from the confirmed guards (as before), and then from any - non-confirmed primary guards. - - Note that {PRIMARY_GUARDS} do not have to be in - {USABLE_FILTERED_GUARDS}: they might be unreachable. - - ** Rationale ** - - These guards are treated differently from other guards. If one of - them is usable, then we use it right away. For other guards - {FILTERED_GUARDS}, if it's usable, then before using it we might - first double-check whether perhaps one of the primary guards is - usable after all. - -4.5. Retrying guards. 
[Section:RETRYING] - - (We run this process as frequently as needed. It can be done once - a second, or just-in-time.) - - If a primary sampled guard's {is_reachable} status is <no>, then - we decide whether to update its {is_reachable} status to <maybe> - based on its {last_tried_connect} time, its {failing_since} time, - and the {PRIMARY_GUARDS_RETRY_SCHED} schedule. - - If a non-primary sampled guard's {is_reachable} status is <no>, then - we decide whether to update its {is_reachable} status to <maybe> - based on its {last_tried_connect} time, its {failing_since} time, - and the {GUARDS_RETRY_SCHED} schedule. - - ** Rationale ** - - An observation that a guard has been 'unreachable' only lasts for - a given amount of time, since we can't infer that it's unreachable - now from the fact that it was unreachable a few minutes ago. - -4.6. Selecting guards for circuits. [Section:SELECTING] - - Every origin circuit is now in one of these states: - <usable_on_completion>, - <usable_if_no_better_guard>, - <waiting_for_better_guard>, or - <complete>. - - You may only attach streams to <complete> circuits. - (Additionally, you may only send RENDEZVOUS cells, ESTABLISH_INTRO - cells, and INTRODUCE cells on <complete> circuits.) - - The per-circuit state machine is: - - New circuits are <usable_on_completion> or - <usable_if_no_better_guard>. - - A <usable_on_completion> circuit may become <complete>, or may - fail. - - A <usable_if_no_better_guard> circuit may become - <usable_on_completion>; may become <waiting_for_better_guard>; or may - fail. - - A <waiting_for_better_guard> circuit will become <complete>, or will - be closed, or will fail. - - A <complete> circuit remains <complete> until it fails or is - closed. - - Each of these transitions is described below. - - We keep, as global transient state: - - * {tvar:last_time_on_internet} -- the last time at which we - successfully used a circuit or connected to a guard. At - startup we set this to "infinitely far in the past." - - When we want to build a circuit, and we need to pick a guard: - - * If any entry in {PRIMARY_GUARDS} has {is_reachable} status of - <maybe> or <yes>, return one of the first - {NUM_USABLE_PRIMARY_GUARDS} or - {NUM_USABLE_PRIMARY_DIRECTORY_GUARDS} such guards, chosen - uniformly at random. The circuit is <usable_on_completion>.
- - [Note: We do not use {is_pending} on primary guards, since we - are willing to try to build multiple circuits through them - before we know for sure whether they work, and since we will - not use any non-primary guards until we are sure that the - primary guards are all down. (XX is this good?)] - - * Otherwise, if the ordered intersection of {CONFIRMED_GUARDS} - and {USABLE_FILTERED_GUARDS} is nonempty, return the first - entry in that intersection that has {is_pending} set to - false. Set its value of {is_pending} to true, - and set its {pending_since} to the current time. - The circuit - is now <usable_if_no_better_guard>. (If all entries have - {is_pending} true, pick the first one.) - - * Otherwise, if there is no such entry, select a member from - {USABLE_FILTERED_GUARDS} in sample order. Set its {is_pending} field to - true, and set its {pending_since} to the current time. - The circuit is <usable_if_no_better_guard>. - - * Otherwise, if {USABLE_FILTERED_GUARDS} is empty, we have exhausted - all the sampled guards. In this case we proceed by marking all guards - as reachable so that we can keep on trying circuits. - - Whenever we select a guard for a new circuit attempt, we update the - {last_tried_connect} time for the guard to 'now.' - - In some cases (for example, when we need a certain directory feature, - or when we need to avoid using a certain exit as a guard), we need to - restrict the guards that we use for a single circuit. When this happens, we - remember the restrictions that applied when choosing the guard for - that circuit, since we will need them later (see [UPDATE_WAITING]). - - ** Rationale ** - - We're getting to the core of the algorithm here. Our main goals are to - make sure that - - 1. If it's possible to use a primary guard, we do. - 2. We probably use the first primary guard. - - So we only try non-primary guards if we're pretty sure that all - the primary guards are down, and we only try a given primary guard - if the earlier primary guards seem down.
- - When we _do_ try non-primary guards, however, we only build one - circuit through each, to give it a chance to succeed or fail. If - ever such a circuit succeeds, we don't use it until we're pretty - sure that it's the best guard we're getting. (See below.) - - [XXX timeout.] - -4.7. When a circuit fails. [Section:ON_FAIL] - - When a circuit fails in a way that makes us conclude that a guard - is not reachable, we take the following steps: - - * Set the guard's {is_reachable} status to <no>. If it had - {is_pending} set to true, we make it non-pending and clear - {pending_since}. - - * Close the circuit, of course. (This removes it from - consideration by the algorithm in [UPDATE_WAITING].) - - * Update the list of waiting circuits. (See [UPDATE_WAITING] - below.) - - [Note: the existing Tor logic will cause us to create more - circuits in response to some of these steps; and also see - [ON_CONSENSUS].] - - ** Rationale ** - - See [SELECTING] above for rationale. - -4.8. When a circuit succeeds [Section:ON_SUCCESS] - - When a circuit succeeds in a way that makes us conclude that a - guard _was_ reachable, we take these steps: - - * We set its {is_reachable} status to <yes>. - * We set its {failing_since} to "never". - * If the guard was {is_pending}, we clear the {is_pending} flag - and clear {pending_since}. - * If the guard was not a member of {CONFIRMED_GUARDS}, we add - it to the end of {CONFIRMED_GUARDS}. - - * If this circuit was <usable_on_completion>, this circuit is - now <complete>. You may attach streams to this circuit, - and use it for hidden services. - - * If this circuit was <usable_if_no_better_guard>, it is now - <waiting_for_better_guard>. You may not yet attach streams to it. - Then check whether the {last_time_on_internet} is more than - {INTERNET_LIKELY_DOWN_INTERVAL} seconds ago: - - * If it is, then mark all {PRIMARY_GUARDS} as "maybe" - reachable. - - * If it is not, update the list of waiting circuits.
(See - [UPDATE_WAITING] below.) - - [Note: the existing Tor logic will cause us to create more - circuits in response to some of these steps; and see - [ON_CONSENSUS].] - - ** Rationale ** - - See [SELECTING] above for rationale. - -4.9. Updating the list of waiting circuits [Section:UPDATE_WAITING] - - We run this procedure whenever it's possible that a - <waiting_for_better_guard> circuit might be ready to be called - <complete>. - - * If any circuit C1 is <waiting_for_better_guard>, AND: - * All primary guards have reachable status of <no>. - * There is no circuit C2 that "blocks" C1. - Then, upgrade C1 to <complete>. - - Definition: In the algorithm above, C2 "blocks" C1 if: - * C2 obeys all the restrictions that C1 had to obey, AND - * C2 has higher priority than C1, AND - * Either C2 is <complete>, or C2 is <waiting_for_better_guard>, - or C2 has been <usable_if_no_better_guard> for no more than - {NONPRIMARY_GUARD_CONNECT_TIMEOUT} seconds. - - We run this procedure periodically: - - * If any circuit stays in <waiting_for_better_guard> - for more than {NONPRIMARY_GUARD_IDLE_TIMEOUT} seconds, - time it out. - - **Rationale** - - If we open a connection to a guard, we might want to use it - immediately (if we're sure that it's the best we can do), or we - might want to wait a little while to see if some other circuit - which we like better will finish. - - - When we mark a circuit <complete>, we don't close the - lower-priority circuits immediately: we might decide to use - them after all if the <complete> circuit goes down before - {NONPRIMARY_GUARD_IDLE_TIMEOUT} seconds. - -4.9.1. Without a list of waiting circuits [Section:NO_CIRCLIST] - - As an alternative to the section [Section:UPDATE_WAITING] above, - this section presents a new way to maintain guard status - independently of tracking individual circuit status. This - formulation gives a result equivalent or similar to the approach - above, but simplifies the necessary communications between the - guard and circuit subsystems. - - As before, when all primary guards are Unreachable, we need to - try non-primary guards.
We select the first such guard (in - preference order) that is neither Unreachable nor Pending. - Whenever we give out such a guard, if the guard's status is - Unknown, then we mark that guard "Pending" by setting its {is_pending} - flag, until the attempt to use it succeeds or fails. We remember - when the guard became Pending with the {pending_since} variable. - - After completing a circuit, the implementation must check whether - its guard is usable. A guard's usability status may be "usable", - "unusable", or "unknown". A guard's usability is determined by - these rules: - - 1. Primary guards are always usable. - - 2. Non-primary guards are usable _for a given circuit_ if every - guard earlier in the preference list is either unsuitable for - that circuit (e.g. because of family restrictions), or marked as - Unreachable, or has been pending for at least - `{NONPRIMARY_GUARD_CONNECT_TIMEOUT}`. - - Non-primary guards are not usable _for a given circuit_ if some - guard earlier in the preference list is suitable for the circuit - _and_ Reachable. - - Non-primary guards are unusable if they have not become - usable after `{NONPRIMARY_GUARD_IDLE_TIMEOUT}` seconds. - - 3. If a circuit's guard is neither usable nor unusable immediately, the - circuit is not discarded; instead, it is kept (but not used) until the - guard becomes usable or unusable. - - -4.10. Whenever we get a new consensus. [Section:ON_CONSENSUS] - - We update {GUARDS}. - - For every guard in {SAMPLED_GUARDS}, we update {IS_LISTED} and - {FIRST_UNLISTED_AT}. - - [**] We remove entries from {SAMPLED_GUARDS} if appropriate, - according to the sampled-guards expiration rules. If they were - in {CONFIRMED_GUARDS}, we also remove them from - {CONFIRMED_GUARDS}. - - We recompute {FILTERED_GUARDS}, and everything that derives from - it, including {USABLE_FILTERED_GUARDS} and {PRIMARY_GUARDS}.
- - (Whenever one of the configuration options that affects the - filter is updated, we repeat the process above, starting at the - [**] line.) - -4.11. Deciding whether to generate a new circuit. - [Section:NEW_CIRCUIT_NEEDED] - - We generate a new circuit when we don't have - enough circuits either built or in-progress to handle a given - stream, or an expected stream. - - For the purpose of this rule, we say that <waiting_for_better_guard> - circuits are neither built nor in-progress; that <complete> - circuits are built; and that the other states are in-progress. - -4.12. When we are missing descriptors. - [Section:MISSING_DESCRIPTORS] - - We need either a router descriptor or a microdescriptor in order - to build a circuit through a guard. If we do not have such a - descriptor for a guard, we can still use the guard for one-hop - directory fetches, but not for longer circuits. - - (Also, when we are missing descriptors for our first - {NUM_USABLE_PRIMARY_GUARDS} primary guards, we don't build - circuits at all until we have fetched them.) - -A. Appendices - -A.0. Acknowledgements - - This research was supported in part by NSF grants CNS-1111539, - CNS-1314637, CNS-1526306, CNS-1619454, and CNS-1640548. - -A.1. Parameters with suggested values. [Section:PARAM_VALS] - - (All suggested values chosen arbitrarily) - - {param:MAX_SAMPLE_THRESHOLD} -- 20% - - {param:MAX_SAMPLE_SIZE} -- 60 - - {param:GUARD_LIFETIME} -- 120 days - - {param:REMOVE_UNLISTED_GUARDS_AFTER} -- 20 days - [previously ENTRY_GUARD_REMOVE_AFTER] - - {param:MIN_FILTERED_SAMPLE} -- 20 - - {param:N_PRIMARY_GUARDS} -- 3 - - {param:PRIMARY_GUARDS_RETRY_SCHED} - - We recommend the following schedule, which is the one - used in Arti: - - -- Use the "decorrelated-jitter" algorithm from "dir-spec.txt" - section 5.5 where `base_delay` is 30 seconds and `cap` - is 6 hours.
- - This legacy schedule is the one used in C tor: - - -- every 10 minutes for the first six hours, - -- every 90 minutes for the next 90 hours, - -- every 4 hours for the next 3 days, - -- every 9 hours thereafter. - - {param:GUARDS_RETRY_SCHED} -- - - We recommend the following schedule, which is the one - used in Arti: - - -- Use the "decorrelated-jitter" algorithm from "dir-spec.txt" - section 5.5 where `base_delay` is 10 minutes and `cap` - is 36 hours. - - This legacy schedule is the one used in C tor: - - -- every hour for the first six hours, - -- every 4 hours for the 90 hours, - -- every 18 hours for the next 3 days, - -- every 36 hours thereafter. - - {param:INTERNET_LIKELY_DOWN_INTERVAL} -- 10 minutes - - {param:NONPRIMARY_GUARD_CONNECT_TIMEOUT} -- 15 seconds - - {param:NONPRIMARY_GUARD_IDLE_TIMEOUT} -- 10 minutes - - {param:MEANINGFUL_RESTRICTION_FRAC} -- .2 - - {param:EXTREME_RESTRICTION_FRAC} -- .01 - - {param:GUARD_CONFIRMED_MIN_LIFETIME} -- 60 days - - {param:NUM_USABLE_PRIMARY_GUARDS} -- 1 - - {param:NUM_USABLE_PRIMARY_DIRECTORY_GUARDS} -- 3 - -A.2. Random values [Section:RANDOM] - - Frequently, we want to randomize the expiration time of something - so that it's not easy for an observer to match it to its start - time. We do this by randomizing its start date a little, so that - we only need to remember a fixed expiration interval. - - By RAND(now, INTERVAL) we mean a time between now and INTERVAL in - the past, chosen uniformly at random. - - -A.3. Why not a sliding scale of primaryness? [Section:CVP] - - At one meeting, I floated the idea of having "primaryness" be a - continuous variable rather than a boolean. - - I'm no longer sure this is a great idea, but I'll try to outline - how it might work. - - To begin with: being "primary" gives it a few different traits: - - 1) We retry primary guards more frequently. 
[Section:RETRYING] - - 2) We don't even _try_ building circuits through - lower-priority guards until we're pretty sure that the - higher-priority primary guards are down. (With non-primary - guards, on the other hand, we launch exploratory circuits - which we plan not to use if higher-priority guards - succeed.) [Section:SELECTING] - - 3) We retry them all one more time if a circuit succeeds after - the net has been down for a while. [Section:ON_SUCCESS] - - We could make each of the above traits continuous: - - 1) We could make the interval at which a guard is retried - depend continuously on its position in CONFIRMED_GUARDS. - - 2) We could change the number of guards we test in parallel - based on their position in CONFIRMED_GUARDS. - - 3) We could change the rule for how long the higher-priority - guards need to have been down before we call a - <usable_if_no_better_guard> circuit <complete> based on a - possible network-down condition. For example, we could - retry the first guard if we tried it more than 10 seconds - ago, the second if we tried it more than 20 seconds ago, - etc. - - I am pretty sure, however, that if these are worth doing, they - need more analysis! Here's why: - - * They all have the potential to leak more information about a - guard's exact position on the list. Is that safe? Is there - any way to exploit that? I don't think we know. - - * They all seem like changes which it would be relatively - simple to make to the code after we implement the simpler - version of the algorithm described above. - -A.4. Controller changes - - We will add to control-spec.txt a new possible circuit state, GUARD_WAIT, - that can be given as part of circuit events and GETINFO responses about - circuits. A circuit is in the GUARD_WAIT state when it is fully built, - but we will not use it because a circuit with a better guard might - become built too. - -A.5.
Persistent state format - - The persistent state format doesn't need to be part of this - specification, since different implementations can do it - differently. Nonetheless, here's the one Tor uses: - - The "state" file contains one Guard entry for each sampled guard - in each instance of the guard state (see section 2). The value - of this Guard entry is a set of space-separated K=V entries, - where K contains any nonspace character except =, and V contains - any nonspace characters. - - Implementations must retain any unrecognized K=V entries for a - sampled guard when they regenerate the state file. - - The order of K=V entries is not allowed to matter. - - Recognized fields (values of K) are: - - "in" -- the name of the guard state instance that this - sampled guard is in. If a sampled guard is in two guard - states instances, it appears twice, with a different "in" - field each time. Required. - - "rsa_id" -- the RSA id digest for this guard, encoded in - hex. Required. - - "bridge_addr" -- If the guard is a bridge, its configured address and - port (this can be the ORPort or a pluggable transport port). Optional. - - "nickname" -- the guard's nickname, if any. Optional. - - "sampled_on" -- the date when the guard was sampled. Required. - - "sampled_by" -- the Tor version that sampled this guard. - Optional. - - "unlisted_since" -- the date since which the guard has been - unlisted. Optional. - - "listed" -- 0 if the guard is not listed; 1 if it is. Required. - - "confirmed_on" -- date when the guard was - confirmed. Optional. - - "confirmed_idx" -- position of the guard in the confirmed - list. Optional. - - "pb_use_attempts", "pb_use_successes", "pb_circ_attempts", - "pb_circ_successes", "pb_successful_circuits_closed", - "pb_collapsed_circuits", "pb_unusable_circuits", - "pb_timeouts" -- state for the circuit path bias algorithm, - given in decimal fractions. Optional. 
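As a rough illustration of the K=V rules above, the following Python sketch parses the value of one Guard entry and writes it back out while retaining unrecognized keys. The function names and the sample value are hypothetical; they are not taken from any Tor implementation.

```python
# Fields that the list above marks as "Required".
REQUIRED_FIELDS = ("in", "rsa_id", "sampled_on", "listed")


def parse_guard_entry(value):
    """Parse the value of one Guard entry into a dict of K=V fields."""
    fields = {}
    for item in value.split():
        # K never contains '=', so split on the first '=' only;
        # V may contain any nonspace character, including '='.
        key, sep, val = item.partition("=")
        if not sep or not key:
            raise ValueError(f"malformed K=V entry: {item!r}")
        fields[key] = val
    return fields


def missing_required(fields):
    """Return the required field names that are absent from this entry."""
    return [name for name in REQUIRED_FIELDS if name not in fields]


def encode_guard_entry(fields):
    """Serialize every field, recognized or not; order carries no meaning."""
    return " ".join(f"{key}={val}" for key, val in fields.items())


# Hypothetical sample value; "future_key" stands in for an unrecognized field.
sample = ("in=default rsa_id=" + "AB" * 20 +
          " sampled_on=2016-11-29T19:39:31 listed=1 future_key=x=y")
entry = parse_guard_entry(sample)
```

Because the parser stores every pair it sees, regenerating the state file from `entry` keeps `future_key` intact, which is one simple way to meet the retention requirement.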
- - All dates here are given as a (spaceless) ISO8601 combined date - and time in UTC (e.g., 2016-11-29T19:39:31). - - -TODO. Still non-addressed issues [Section:TODO] - - Simulate to answer: Will this work in a dystopic world? - - Simulate actual behavior. - - For all lifetimes: instead of storing the "this began at" time, - store the "remove this at" time, slightly randomized. - - Clarify that when you get a circuit, you might need to - relaunch circuits through that same guard immediately, if they - are circuits that have to be independent. - - - Fix all items marked XX or TODO. - - "Directory guards" -- do they matter? - - Suggestion: require that all guards support downloads via BEGINDIR. - We don't need to worry about directory guards for relays, since we - aren't trying to prevent relay enumeration. - - IP version preferences via ClientPreferIPv6ORPort - - Suggestion: Treat it as a preference when adding to - {CONFIRMED_GUARDS}, but not otherwise. - diff --git a/padding-spec.txt b/padding-spec.txt deleted file mode 100644 index 206a7f1..0000000 --- a/padding-spec.txt +++ /dev/null @@ -1,625 +0,0 @@ - - Tor Padding Specification - - Mike Perry, George Kadianakis - -Note: This is an attempt to specify Tor as currently implemented. Future -versions of Tor will implement improved algorithms. - -This document tries to cover how Tor chooses to use cover traffic to obscure -various traffic patterns from external and internal observers. Other -implementations MAY take other approaches, but implementors should be aware of -the anonymity and load-balancing implications of their choices. - - The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL - NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and - "OPTIONAL" in this document are to be interpreted as described in - RFC 2119. - -Table of Contents - - 1. Overview - 2. Connection-level padding - 2.1. Background - 2.2. Implementation - 2.3. Padding Cell Timeout Distribution Statistics - 2.4. 
Maximum overhead bounds - 2.5. Reducing or Disabling Padding via Negotiation - 2.6. Consensus Parameters Governing Behavior - 3. Circuit-level padding - 3.1. Circuit Padding Negotiation - 3.2. Circuit Padding Machine Message Management - 3.3. Obfuscating client-side onion service circuit setup - 3.3.1. Common general circuit construction sequences - 3.3.2. Client-side onion service introduction circuit obfuscation - 3.3.3. Client-side rendezvous circuit hiding - 3.3.4. Circuit setup machine overhead - 3.4. Circuit padding consensus parameters - A. Acknowledgments - -1. Overview - - Tor supports two classes of cover traffic: connection-level padding, and - circuit-level padding. - - Connection-level padding uses the CELL_PADDING cell command for cover - traffic, where as circuit-level padding uses the RELAY_COMMAND_DROP relay - command. CELL_PADDING is single-hop only and can be differentiated from - normal traffic by Tor relays ("internal" observers), but not by entities - monitoring Tor OR connections ("external" observers). - - RELAY_COMMAND_DROP is multi-hop, and is not visible to intermediate Tor - relays, because the relay command field is covered by circuit layer - encryption. Moreover, Tor's 'recognized' field allows RELAY_COMMAND_DROP - padding to be sent to any intermediate node in a circuit (as per Section - 6.1 of tor-spec.txt). - - Tor uses both connection level and circuit level padding. Connection - level padding is described in section 2. Circuit level padding is - described in section 3. - - The circuit-level padding system is completely orthogonal to the - connection-level padding. The connection-level padding system regards - circuit-level padding as normal data traffic, and hence the connection-level - padding system will not add any additional overhead while the circuit-level - padding system is actively padding. - - -2. Connection-level padding - -2.1. 
Background - - Tor clients and relays make use of CELL_PADDING to reduce the resolution of - connection-level metadata retention by ISPs and surveillance infrastructure. - - Such metadata retention is implemented by Internet routers in the form of - Netflow, jFlow, Netstream, or IPFIX records. These records are emitted by - gateway routers in a raw form and then exported (often over plaintext) to a - "collector" that either records them verbatim, or reduces their granularity - further[1]. - - Netflow records and the associated data collection and retention tools are - very configurable, and have many modes of operation, especially when - configured to handle high throughput. However, at ISP scale, per-flow records - are very likely to be employed, since they are the default, and also provide - very high resolution in terms of endpoint activity, second only to full packet - and/or header capture. - - Per-flow records record the endpoint connection 5-tuple, as well as the - total number of bytes sent and received by that 5-tuple during a particular - time period. They can store additional fields as well, but it is primarily - timing and bytecount information that concern us. - - When configured to provide per-flow data, routers emit these raw flow - records periodically for all active connections passing through them - based on two parameters: the "active flow timeout" and the "inactive - flow timeout". - - The "active flow timeout" causes the router to emit a new record - periodically for every active TCP session that continuously sends data. The - default active flow timeout for most routers is 30 minutes, meaning that a - new record is created for every TCP session at least every 30 minutes, no - matter what. This value can be configured from 1 minute to 60 minutes on - major routers. - - The "inactive flow timeout" is used by routers to create a new record if a - TCP session is inactive for some number of seconds. 
It allows routers to - avoid the need to track a large number of idle connections in memory, and - instead emit a separate record only when there is activity. This value - ranges from 10 seconds to 600 seconds on common routers. It appears as - though no routers support a value lower than 10 seconds. - - For reference, here are default values and ranges (in parenthesis when - known) for common routers, along with citations to their manuals. - - Some routers speak other collection protocols than Netflow, and in the - case of Juniper, use different timeouts for these protocols. Where this - is known to happen, it has been noted. - - Inactive Timeout Active Timeout - Cisco IOS[3] 15s (10-600s) 30min (1-60min) - Cisco Catalyst[4] 5min 32min - Juniper (jFlow)[5] 15s (10-600s) 30min (1-60min) - Juniper (Netflow)[6,7] 60s (10-600s) 30min (1-30min) - H3C (Netstream)[8] 60s (60-600s) 30min (1-60min) - Fortinet[9] 15s 30min - MicroTik[10] 15s 30min - nProbe[14] 30s 120s - Alcatel-Lucent[2] 15s (10-600s) 30min (1-600min) - - The combination of the active and inactive netflow record timeouts allow us - to devise a low-cost padding defense that causes what would otherwise be - split records to "collapse" at the router even before they are exported to - the collector for storage. So long as a connection transmits data before the - "inactive flow timeout" expires, then the router will continue to count the - total bytes on that flow before finally emitting a record at the "active - flow timeout". - - This means that for a minimal amount of padding that prevents the "inactive - flow timeout" from expiring, it is possible to reduce the resolution of raw - per-flow netflow data to the total amount of bytes send and received in a 30 - minute window. This is a vast reduction in resolution for HTTP, IRC, XMPP, - SSH, and other intermittent interactive traffic, especially when all - user traffic in that time period is multiplexed over a single connection - (as it is with Tor). 
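The record-collapsing effect described above can be illustrated with a deliberately simplified model of a flow exporter. This is a sketch under stated assumptions, not a description of any real router: the functions `count_flow_records` and `pad_idle_gaps` are hypothetical, a single unidirectional flow is modeled, and the 15 second and 30 minute timeouts are the Cisco IOS defaults from the table.

```python
def count_flow_records(send_times, inactive_timeout, active_timeout):
    """Count the records a simplified exporter would emit for one flow.

    A new record starts when the flow has been idle for longer than
    inactive_timeout, or when the current record has been open for at
    least active_timeout. All times are in seconds.
    """
    records = 0
    start = last = None
    for t in sorted(send_times):
        if start is None:
            start, records = t, 1
        elif t - last > inactive_timeout or t - start >= active_timeout:
            records += 1
            start = t
        last = t
    return records


def pad_idle_gaps(send_times, max_gap):
    """Insert padding transmissions so that no idle gap exceeds max_gap."""
    padded = []
    for t in sorted(send_times):
        while padded and t - padded[-1] > max_gap:
            padded.append(padded[-1] + max_gap)
        padded.append(t)
    return padded


# Intermittent traffic: one burst per minute for just under 30 minutes.
bursts = list(range(0, 1800, 60))

unpadded_records = count_flow_records(bursts, 15, 1800)
padded_records = count_flow_records(pad_idle_gaps(bursts, 9.5), 15, 1800)
```

In this model the unpadded flow is split into one record per burst, while the padded flow stays inside a single record until the active timeout would fire.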
- - Though flow measurement in principle can be bidirectional (counting cells - sent in both directions between a pair of IPs) or unidirectional (counting - only cells sent from one IP to another), we assume for safety that all - measurement is unidirectional, and so traffic must be sent by both parties - in order to prevent record splitting. - -2.2. Implementation - - Tor clients currently maintain one TLS connection to their Guard node to - carry actual application traffic, and make up to 3 additional connections to - other nodes to retrieve directory information. - - We pad only the client's connection to the Guard node, and not any other - connection. We treat Bridge node connections to the Tor network as client - connections, and pad them, but otherwise do not pad between normal relays. - - Both clients and Guards will maintain a timer for all application (i.e., - non-directory) TLS connections. Every time a padding packet is sent by an - endpoint, that endpoint will sample a timeout value from - the max(X,X) distribution described in Section 2.3. The default - range is from 1.5 seconds to 9.5 seconds, subject to consensus - parameters as specified in Section 2.6. - - (The timing is randomized to avoid making it obvious which cells are - padding.) - - If another cell is sent for any reason before this timer expires, the timer - is reset to a new random value. - - If the connection remains inactive until the timer expires, a - single CELL_PADDING cell will be sent on that connection (which will - also start a new timer). - - In this way, the connection will only be padded in a given direction in - the event that it is idle in that direction, and will always transmit a - packet before the minimum 10 second inactive timeout. - - (In practice, an implementation may not be able to determine when, - exactly, a cell is sent on a given channel.
For example, even though the - cell has been given to the kernel via a call to `send(2)`, the kernel may - still be buffering that cell. In cases such as these, implementations - should use a reasonable proxy for the time at which a cell is sent: for - example, when the cell is queued. If this strategy is used, - implementations should try to observe the innermost (closest to the wire) - queue that they practically can, and if this queue is already nonempty, - padding should not be scheduled until after the queue does become empty.) - -2.3. Padding Cell Timeout Distribution Statistics - - To limit the amount of padding sent, instead of sampling each endpoint - timeout uniformly, we instead sample it from max(X,X), where X is - uniformly distributed. - - If X is a random variable uniform from 0..R-1 (where R=high-low), then the - random variable Y = max(X,X) has Prob(Y == i) = (2.0*i + 1)/(R*R). - - Then, when both sides apply timeouts sampled from Y, the resulting - bidirectional padding packet rate is now a third random variable: - Z = min(Y,Y). - - The distribution of Z is slightly bell-shaped, but mostly flat around the - mean. It also turns out that Exp[Z] ~= Exp[X]. Here's a table of average - values for each random variable: - - R Exp[X] Exp[Z] Exp[min(X,X)] Exp[Y=max(X,X)] - 2000 999.5 1066 666.2 1332.8 - 3000 1499.5 1599.5 999.5 1999.5 - 5000 2499.5 2666 1666.2 3332.8 - 6000 2999.5 3199.5 1999.5 3999.5 - 7000 3499.5 3732.8 2332.8 4666.2 - 8000 3999.5 4266.2 2666.2 5332.8 - 10000 4999.5 5328 3332.8 6666.2 - 15000 7499.5 7995 4999.5 9999.5 - 20000 9900.5 10661 6666.2 13332.8 - - -2.4. Maximum overhead bounds - - With the default parameters and the above distribution, we expect a - padded connection to send one padding cell every 5.5 seconds. This - averages to 103 bytes per second full duplex (~52 bytes/sec in each - direction), assuming a 512 byte cell and 55 bytes of TLS+TCP+IP headers. 
- For a client connection that remains otherwise idle for its expected - ~50 minute lifespan (governed by the circuit available timeout plus a - small additional connection timeout), this is about 154.5KB of overhead - in each direction (309KB total). - - With 2.5M completely idle clients connected simultaneously, 52 bytes per - second amounts to 130MB/second in each direction network-wide, which is - roughly the current amount of Tor directory traffic[11]. Of course, our - 2.5M daily users will neither be connected simultaneously, nor entirely - idle, so we expect the actual overhead to be much lower than this. - -2.5. Reducing or Disabling Padding via Negotiation - - To allow mobile clients to either disable or reduce their padding overhead, - the CELL_PADDING_NEGOTIATE cell (tor-spec.txt section 7.2) may be sent from - clients to relays. This cell is used to instruct relays to cease sending - padding. - - If the client has opted to use reduced padding, it continues to send - padding cells sampled from the range [9000,14000] milliseconds (subject to - consensus parameter alteration as per Section 2.6), still using the - Y=max(X,X) distribution. Since the padding is now unidirectional, the - expected frequency of padding cells is now governed by the Y distribution - above as opposed to Z. For a range of 5000ms, we can see that we expect to - send a padding packet every 9000+3332.8 = 12332.8ms. We also halve the - circuit available timeout from ~50min down to ~25min, which causes the - client's OR connections to be closed shortly thereafter when it is idle, - thus reducing overhead. - - These two changes cause the padding overhead to go from 309KB per one-time-use - Tor connection down to 69KB per one-time-use Tor connection. For continual - usage, the maximum overhead goes from 103 bytes/sec down to 46 bytes/sec.
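The overhead figures in sections 2.4 and 2.5 can be reproduced with a few lines of arithmetic. The Python sketch below is only a check of the numbers quoted in the text: the variable and function names are made up for illustration, and the 5.5 second average interval is taken from section 2.4 as stated rather than derived.

```python
CELL_BYTES = 512    # cell size assumed in section 2.4
HEADER_BYTES = 55   # TLS+TCP+IP header bytes assumed in section 2.4


def expected_max_of_two(r):
    """Exp[Y] for Y = max(X,X), X uniform on 0..r-1.

    Uses Prob(Y == i) = (2*i + 1)/(r*r) from section 2.3.
    """
    return sum(i * (2 * i + 1) for i in range(r)) / (r * r)


wire_bytes = CELL_BYTES + HEADER_BYTES  # 567 bytes per padding cell

# Default padding: one cell every 5.5 s on average, counting both directions.
default_rate = wire_bytes / 5.5                          # ~103 bytes/sec
default_kb_each_way = default_rate / 2 * 50 * 60 / 1000  # ~154.6 KB in 50 min

# Reduced padding: client-only, range [9000,14000] ms, so R = 5000.
reduced_interval_ms = 9000 + expected_max_of_two(5000)   # ~12332.8 ms
reduced_rate = wire_bytes / (reduced_interval_ms / 1000)  # ~46 bytes/sec
reduced_kb = reduced_rate * 25 * 60 / 1000                # ~69 KB in 25 min
```

The small difference between ~154.6KB here and the 154.5KB in the text comes from rounding the per-direction rate to 51.5 bytes/sec before multiplying.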
- - If a client opts to completely disable padding, it sends a - CELL_PADDING_NEGOTIATE to instruct the relay not to pad, and then does not - send any further padding itself. - - Currently, clients negotiate padding only when a channel is created, - immediately after sending their NETINFO cell. Recipients SHOULD, however, - accept padding negotiation messages at any time. - - If a client which previously negotiated reduced, or disabled, padding, and - wishes to re-enable default padding (ie padding according to the consensus - parameters), it SHOULD send CELL_PADDING_NEGOTIATE START with zero in the - ito_low_ms and ito_high_ms fields. (It therefore SHOULD NOT copy the values - from its own established consensus into the CELL_PADDING_NEGOTIATE cell.) - This avoids the client needing to send updated padding negotiations if the - consensus parameters should change. The recipient's clamping of the timing - parameters will cause the recipient to use its notion of the consensus - parameters. - - Clients and bridges MUST reject padding negotiation messages from relays, - and close the channel if they receive one. - -2.6. Consensus Parameters Governing Behavior - - Connection-level padding is controlled by the following consensus parameters: - - * nf_ito_low - - The low end of the range to send padding when inactive, in ms. - - Default: 1500 - - * nf_ito_high - - The high end of the range to send padding, in ms. - - Default: 9500 - - If nf_ito_low == nf_ito_high == 0, padding will be disabled. - - * nf_ito_low_reduced - - For reduced padding clients: the low end of the range to send padding - when inactive, in ms. - - Default: 9000 - - * nf_ito_high_reduced - - For reduced padding clients: the high end of the range to send padding, - in ms. - - Default: 14000 - - * nf_conntimeout_clients - - The number of seconds to keep never-used circuits opened and - available for clients to use. 
Note that the actual client timeout is - randomized uniformly from this value to twice this value. - - The number of seconds to keep idle (not currently used) canonical - channels open and available. (We do this to ensure a sufficient - time duration of padding, which is the ultimate goal.) - - This value is also used to determine how long, after a port has been - used, we should attempt to keep building predicted circuits for that - port. (See path-spec.txt section 2.1.1.) This behavior was - originally added to work around implementation limitations, but it - serves as a reasonable default regardless of implementation. - - For all use cases, reduced padding clients use half the consensus - value. - - Implementations MAY mark circuits held open past the reduced padding - quantity (half the consensus value) as "not to be used for streams", - to prevent their use from becoming a distinguisher. - - Default: 1800 - - * nf_pad_before_usage - - If set to 1, OR connections are padded before the client uses them - for any application traffic. If 0, OR connections are not padded - until application data begins. - - Default: 1 - - * nf_pad_relays - - If set to 1, we also pad inactive relay-to-relay connections. - - Default: 0 - - * nf_conntimeout_relays - - The number of seconds that idle relay-to-relay connections are kept - open. - - Default: 3600 - - -3. Circuit-level padding - - The circuit padding system in Tor is an extension of the WTF-PAD - event-driven state machine design[15]. At a high level, this design places - one or more padding state machines at the client, and one or more padding - state machines at a relay, on each circuit. - - State transition and histogram generation have been generalized to be fully - programmable, and probability distribution support was added to support more - compact representations like APE[16]. Additionally, packet count limits, - rate limiting, and circuit application conditions have been added.
- - At present, Tor uses this system to deploy two pairs of circuit padding - machines, to obscure differences between the setup phase of client-side - onion service circuits, up to the first 10 cells. - - This specification covers only the resulting behavior of these padding - machines, and thus does not cover the state machine implementation details or - operation. For full details on using the circuit padding system to develop - future padding defenses, see the research developer documentation[17]. - -3.1. Circuit Padding Negotiation - - Circuit padding machines are advertised as "Padding" subprotocol versions - (see tor-spec.txt Section 9). The onion service circuit padding machines are - advertised as "Padding=2". - - Because circuit padding machines only become active at certain points in - circuit lifetime, and because more than one padding machine may be active at - any given point in circuit lifetime, there is also a padding negotiation - cell and a negotiated response. These are relay commands 41 and 42, with - relay headers as per section 6.1 of tor-spec.txt. - - The fields of the relay cell Data payload of a negotiate request are - as follows: - - const CIRCPAD_COMMAND_STOP = 1; - const CIRCPAD_COMMAND_START = 2; - - const CIRCPAD_RESPONSE_OK = 1; - const CIRCPAD_RESPONSE_ERR = 2; - - const CIRCPAD_MACHINE_CIRC_SETUP = 1; - - struct circpad_negotiate { - u8 version IN [0]; - u8 command IN [CIRCPAD_COMMAND_START, CIRCPAD_COMMAND_STOP]; - - u8 machine_type IN [CIRCPAD_MACHINE_CIRC_SETUP]; - - u8 unused; // Formerly echo_request - - u32 machine_ctr; - }; - - When a client wants to start a circuit padding machine, it first checks that - the desired destination hop advertises the appropriate subprotocol version for - that machine. It then sends a circpad_negotiate cell to that hop with - command=CIRCPAD_COMMAND_START, and machine_type=CIRCPAD_MACHINE_CIRC_SETUP (for - the circ setup machine, the destination hop is the second hop in the - circuit). 
The machine_ctr is the count of which machine instance this is on - the circuit. It is used to disambiguate shutdown requests. - - When a relay receives a circpad_negotiate cell, it checks that it supports - the requested machine, and sends a circpad_negotiated cell, which is formatted - in the data payload of a relay cell with command number 42 (see tor-spec.txt - section 6.1), as follows: - - struct circpad_negotiated { - u8 version IN [0]; - u8 command IN [CIRCPAD_COMMAND_START, CIRCPAD_COMMAND_STOP]; - u8 response IN [CIRCPAD_RESPONSE_OK, CIRCPAD_RESPONSE_ERR]; - - u8 machine_type IN [CIRCPAD_MACHINE_CIRC_SETUP]; - - u32 machine_ctr; - }; - - If the machine is supported, the response field will contain - CIRCPAD_RESPONSE_OK. If it is not, it will contain CIRCPAD_RESPONSE_ERR. - - Either side may send a CIRCPAD_COMMAND_STOP to shut down the padding machines - (clients MUST only send circpad_negotiate, and relays MUST only send - circpad_negotiated for this purpose). - - If the machine_ctr does not match the current machine instance count - on the circuit, the command is ignored. - -3.2. Circuit Padding Machine Message Management - - Clients MAY send padding cells towards the relay before receiving the - circpad_negotiated response, to allow for outbound cover traffic before - negotiation completes. - - Clients MAY send another circpad_negotiate cell before receiving the - circpad_negotiated response, to allow for rapid machine changes. - - Relays MUST NOT send padding cells or circpad_negotiated cells, unless a - padding machine is active. Any padding-related cells that arrive at the client - from unexpected relay sources are protocol violations, and clients MAY - immediately tear down such circuits to avoid side channel risk. - -3.3. Obfuscating client-side onion service circuit setup - - The circuit padding currently deployed in Tor attempts to hide client-side - onion service circuit setup. 
Service-side setup is not covered, because doing - so would involve significantly more overhead, and/or require interaction with - the application layer. - - The approach taken aims to make client-side introduction and rendezvous - circuits match the cell direction sequence and cell count of 3 hop general - circuits used for normal web traffic, for the first 10 cells only. The - lifespan of introduction circuits is also made to match the lifespan - of general circuits. - - Note that inter-arrival timing is not obfuscated by this defense. - -3.3.1. Common general circuit construction sequences - - Most general Tor circuits used to surf the web or download directory - information start with the following 6-cell relay cell sequence (cells - surrounded in [brackets] are outgoing, the others are incoming): - - [EXTEND2] -> EXTENDED2 -> [EXTEND2] -> EXTENDED2 -> [BEGIN] -> CONNECTED - - When this is done, the client has established a 3-hop circuit and also opened - a stream to the other end. Usually after this comes a series of DATA cells that - either fetch pages, establish an SSL connection, or fetch directory - information: - - [DATA] -> [DATA] -> DATA -> DATA...(inbound cells continue) - - The above stream of 10 relay cells defines the vast majority of general - circuits that come out of Tor browser during our testing, and it's what we use - to make introduction and rendezvous circuits blend in. - - Please note that in this section we only investigate relay cells and not - connection-level cells like CREATE/CREATED or AUTHENTICATE/etc. that are used - during the link-layer handshake. The rationale is that connection-level cells - depend on the type of guard used and are not an effective fingerprint for a - network/guard-level adversary. - -3.3.2. Client-side onion service introduction circuit obfuscation - - Two circuit padding machines work to hide client-side introduction circuits: - one machine at the origin, and one machine at the second hop of the circuit.
- Each machine sends padding towards the other. The padding from the origin-side - machine terminates at the second hop and does not get forwarded to the actual - introduction point. - - From Section 3.3.1 above, most general circuits have the following initial - relay cell sequence (outgoing cells marked in [brackets]): - - [EXTEND2] -> EXTENDED2 -> [EXTEND2] -> EXTENDED2 -> [BEGIN] -> CONNECTED - -> [DATA] -> [DATA] -> DATA -> DATA...(inbound data cells continue) - - Whereas normal introduction circuits usually look like: - - [EXTEND2] -> EXTENDED2 -> [EXTEND2] -> EXTENDED2 -> [EXTEND2] -> EXTENDED2 - -> [INTRO1] -> INTRODUCE_ACK - - This means that up to the sixth cell (first line of each sequence above), - both general and intro circuits have identical cell sequences. After that - we want to mimic the second line sequence of - - -> [DATA] -> [DATA] -> DATA -> DATA...(inbound data cells continue) - - We achieve this by starting padding after INTRODUCE1 has been sent. With padding - negotiation cells, in the common case, the second line looks like: - - -> [INTRO1] -> [PADDING_NEGOTIATE] -> PADDING_NEGOTIATED -> INTRO_ACK - - Then, the middle node will send between INTRO_MACHINE_MINIMUM_PADDING (7) and - INTRO_MACHINE_MAXIMUM_PADDING (10) cells, to match the "...(inbound data cells - continue)" portion of the trace (aka the rest of an HTTPS response body). - - We also set a special flag which keeps the circuit open even after the - introduction is performed. With this feature the circuit will stay alive for - the same duration as normal web circuits before they expire (usually 10 - minutes). - -3.3.3.
Client-side rendezvous circuit hiding - - Following a similar argument as for intro circuits, we are aiming for padded - rendezvous circuits to blend in with the initial cell sequence of general - circuits which usually look like this: - - [EXTEND2] -> EXTENDED2 -> [EXTEND2] -> EXTENDED2 -> [BEGIN] -> CONNECTED - -> [DATA] -> [DATA] -> DATA -> DATA...(incoming cells continue) - - Whereas normal rendezvous circuits usually look like: - - [EXTEND2] -> EXTENDED2 -> [EXTEND2] -> EXTENDED2 -> [EST_REND] -> REND_EST - -> REND2 -> [BEGIN] - - This means that up to the sixth cell (the first line), both general and - rend circuits have identical cell sequences. - - After that we want to mimic a [DATA] -> [DATA] -> DATA -> DATA sequence. - - With padding negotiation right after the REND_ESTABLISHED, the sequence - becomes: - - [EXTEND2] -> EXTENDED2 -> [EXTEND2] -> EXTENDED2 -> [EST_REND] -> REND_EST - -> [PADDING_NEGOTIATE] -> [DROP] -> PADDING_NEGOTIATED -> DROP... - - After which normal application DATA cells continue on the circuit. - - Hence this way we make rendezvous circuits look like general circuits up - till the end of the circuit setup. - - After that our machine gets deactivated, and we let the actual rendezvous - circuit shape the traffic flow. Since rendezvous circuits usually imitate - general circuits (their purpose is to surf the web), we can expect that they - will look alike. - -3.3.4. Circuit setup machine overhead - - For the intro circuit case, we see that the origin-side machine just sends a - single [PADDING_NEGOTIATE] cell, whereas the relay-side machine sends a - PADDING_NEGOTIATED cell and between 7 and 10 DROP cells. This means that the - average overhead of this machine is 11 padding cells per introduction circuit. - - For the rend circuit case, this machine is quite light. Both sides send 2 - padding cells, for a total of 4 padding cells. - -3.4.
Circuit padding consensus parameters - - The circuit padding system has a handful of consensus parameters that can - either disable circuit padding entirely, or rate limit the total overhead - at relays and clients. - - * circpad_padding_disabled - - If set to 1, no circuit padding machines will negotiate, and all - current padding machines will cease padding immediately. - - Default: 0 - - * circpad_padding_reduced - - If set to 1, only circuit padding machines marked as "reduced"/"low - overhead" will be used. (Currently no such machines are marked - as "reduced overhead"). - - Default: 0 - - * circpad_global_allowed_cells - - This is the number of padding cells that must be sent before - the 'circpad_global_max_padding_percent' parameter is applied. - - Default: 0 - - * circpad_global_max_padding_percent - - This is the maximum ratio of padding cells to total cells, specified - as a percent. If the global ratio of padding cells to total cells - across all circuits exceeds this percent value, no more padding is sent - until the ratio becomes lower. 0 means no limit. - - Default: 0 - - * circpad_max_circ_queued_cells - - This is the maximum number of cells that can be in the circuitmux queue - before padding stops being sent on that circuit. - - Default: CIRCWINDOW_START_MAX (1000) - - -A. Acknowledgments - - This research was supported in part by NSF grants CNS-1111539, - CNS-1314637, CNS-1526306, CNS-1619454, and CNS-1640548. - -1. https://en.wikipedia.org/wiki/NetFlow -2. http://infodoc.alcatel-lucent.com/html/0_add-h-f/93-0073-10-01/7750_SR_OS_Router_Configuration_Guide/Cflowd-CLI.html -3. http://www.cisco.com/en/US/docs/ios/12_3t/netflow/command/reference/nfl_a1gt_ps5207_TSD_Products_Command_Reference_Chapter.html#wp1185203 -4. http://www.cisco.com/c/en/us/support/docs/switches/catalyst-6500-series-switches/70974-netflow-catalyst6500.html#opconf -5. 
https://www.juniper.net/techpubs/software/erx/junose60/swconfig-routing-vol1/html/ip-jflow-stats-config4.html#560916 -6. http://www.jnpr.net/techpubs/en_US/junos15.1/topics/reference/configuration-statement/flow-active-timeout-edit-forwarding-options-po.html -7. http://www.jnpr.net/techpubs/en_US/junos15.1/topics/reference/configuration-statement/flow-active-timeout-edit-forwarding-options-po.html -8. http://www.h3c.com/portal/Technical_Support___Documents/Technical_Documents/Switches/H3C_S9500_Series_Switches/Command/Command/H3C_S9500_CM-Release1648%5Bv1.24%5D-System_Volume/200901/624854_1285_0.htm#_Toc217704193 -9. http://docs-legacy.fortinet.com/fgt/handbook/cli52_html/FortiOS%205.2%20CLI/config_system.23.046.html -10. http://wiki.mikrotik.com/wiki/Manual:IP/Traffic_Flow -11. https://metrics.torproject.org/dirbytes.html -12. http://freehaven.net/anonbib/cache/murdoch-pet2007.pdf -13. https://gitweb.torproject.org/torspec.git/tree/proposals/188-bridge-guards.txt -14. http://www.ntop.org/wp-content/uploads/2013/03/nProbe_UserGuide.pdf -15. http://arxiv.org/pdf/1512.00524 -16. https://www.cs.kau.se/pulls/hot/thebasketcase-ape/ -17. https://github.com/torproject/tor/tree/master/doc/HACKING/CircuitPaddingDevelopment.md -18. https://www.usenix.org/node/190967 - https://blog.torproject.org/technical-summary-usenix-fingerprinting-paper - diff --git a/param-spec.txt b/param-spec.txt deleted file mode 100644 index d8ea80b..0000000 --- a/param-spec.txt +++ /dev/null @@ -1,517 +0,0 @@ - - Tor network parameters - -This file lists the recognized parameters that can appear on the "params" -line of a directory consensus. - -Table of Contents - - 1. Network protocol parameters - 2. Performance-tuning parameters - 3. Voting-related parameters - 4. Circuit-build-timeout parameters - 5. Directory-related parameters - 6. Pathbias parameters - 7. Relay behavior - 8. V3 onion service parameters - 9. Denial-of-service parameters - 10. Padding-related parameters - 11. 
Guard-related parameters - X. Obsolete parameters - -1. Network protocol parameters - - "circwindow" -- the default package window that circuits should be - established with. It started out at 1000 cells, but some research - indicates that a lower value would mean fewer cells in transit in the - network at any given time. - Min: 100, Max: 1000, Default: 1000 - First-appeared: Tor 0.2.1.20 - - "UseOptimisticData" -- If set to zero, clients by default shouldn't try - to send optimistic data to servers until they have received a - RELAY_CONNECTED cell. - Min: 0, Max: 1, Default: 1 - First-appeared: 0.2.3.3-alpha - Default was 0 before: 0.2.9.1-alpha - Removed in 0.4.5.1-alpha; now always on. - - "usecreatefast" -- Used to control whether clients use the CREATE_FAST - handshake on the first hop of their circuits. - Min: 0, Max: 1. Default: 1. - First-appeared: 0.2.4.23, 0.2.5.2-alpha - Removed in 0.4.5.1-alpha; now always off. - - "min_paths_for_circs_pct" -- A percentage threshold that determines - whether clients believe they have enough directory information to - build circuits. This value applies to the total fraction of - bandwidth-weighted paths that the client could build; see - path-spec.txt for more information. - Min: 25, Max: 95, Default: 60 - First-appeared: 0.2.4 - - "ExtendByEd25519ID" -- If true, clients should include Ed25519 - identities for relays when generating EXTEND2 cells. - Min: 0. Max: 1. Default: 0. - First-appeared: 0.3.0 - - "sendme_emit_min_version" -- Minimum SENDME version that can be sent. - Min: 0. Max: 255. Default 0. - First appeared: 0.4.1.1-alpha. - - "sendme_accept_min_version" -- Minimum SENDME version that is accepted. - Min: 0. Max: 255. Default 0. - First appeared: 0.4.1.1-alpha. - - "allow-network-reentry" -- If true, the Exit relays allow connections that - are exiting the network to re-enter. If false, any exit connection going - to a relay ORPort or an authority ORPort and DirPort is denied and the - stream is terminated.
- Min: 0. Max: 1. Default: 0 - First appeared: 0.4.5.1-alpha. - -2. Performance-tuning parameters - - "CircuitPriorityHalflifeMsec" -- the halflife parameter used when - weighting which circuit will send the next cell. Obeyed by Tor - 0.2.2.10-alpha and later. (Versions of Tor between 0.2.2.7-alpha and - 0.2.2.10-alpha recognized a "CircPriorityHalflifeMsec" parameter, but - mishandled it badly.) - Min: 1, Max: 2147483647 (INT32_MAX), Default: 30000. - First-appeared: Tor 0.2.2.11-alpha - - "perconnbwrate" and "perconnbwburst" -- if set, each relay sets up a - separate token bucket for every client OR connection, and rate limits - that connection independently. Typically left unset, except when used for - performance experiments around trac entry 1750. Only honored by relays - running Tor 0.2.2.16-alpha and later. (Note that relays running - 0.2.2.7-alpha through 0.2.2.14-alpha looked for bwconnrate and - bwconnburst, but then did the wrong thing with them; see bug 1830 for - details.) - Min: 1, Max: 2147483647 (INT32_MAX), Default: (user setting of - BandwidthRate/BandwidthBurst). - First-appeared: 0.2.2.7-alpha - Removed-in: 0.2.2.16-alpha - - "NumNTorsPerTAP" -- When balancing ntor and TAP cells at relays, - how many ntor handshakes should we perform for each TAP handshake? - Min: 1. Max: 100000. Default: 10. - First-appeared: 0.2.4.17-rc - - "circ_max_cell_queue_size" -- This parameter determines the maximum - number of cells allowed per circuit queue. - Min: 1000. Max: 2147483647 (INT32_MAX). Default: 50000. - First-appeared: 0.3.3.6-rc. - - "KISTSchedRunInterval" -- How frequently should the "KIST" scheduler - run in order to decide which data to write to the network? Value in - units of milliseconds. - Min: 2. Max: 100. Default: 2 - First appeared: 0.3.2 - - "KISTSchedRunIntervalClient" -- How frequently should the "KIST" scheduler - run in order to decide which data to write to the network, on clients? Value - in units of milliseconds. 
The client value needs to be much lower than - the relay value. - Min: 2. Max: 100. Default: 2. - First appeared: 0.4.8.2 - -3. Voting-related parameters - - "bwweightscale" -- Value that bandwidth-weights are divided by. If not - present then this defaults to 10000. - Min: 1 - First-appeared: 0.2.2.10-alpha - - "maxunmeasuredbw" -- Used by authorities during voting with method 17 or - later. The maximum value to give for any Bandwidth= entry for a router - that isn't based on at least three measurements. - - (Note: starting in version 0.4.6.1-alpha - there was a bug where Tor authorities would instead look at - a parameter called "maxunmeasurdbw", without the "e". - This bug was fixed in 0.4.9.1-alpha and in 0.4.8.8. - Until all relays are running a fixed version, then either this parameter - must not be set, or it must be set to the same value for both - spellings.) - - First-appeared: 0.2.4.11-alpha - - "FastFlagMinThreshold", "FastFlagMaxThreshold" -- lowest and highest - allowable values for the cutoff for routers that should get the Fast - flag. This is used during voting to prevent the threshold for getting - the Fast flag from being too low or too high. - FastFlagMinThreshold: Min: 4. Max: INT32_MAX: Default: 4. - FastFlagMaxThreshold: Min: -. Max: INT32_MAX: Default: INT32_MAX - First-appeared: 0.2.3.11-alpha - - "AuthDirNumSRVAgreements" -- Minimum number of agreeing directory - authority votes required for a fresh shared random value to be written in - the consensus (this rule only applies on the first commit round of the - shared randomness protocol). - Min: 1. Max: INT32_MAX. Default: 2/3 of the total number of - dirauth. - -4. Circuit-build-timeout parameters - - "cbtdisabled", "cbtnummodes", "cbtrecentcount", "cbtmaxtimeouts", - "cbtmincircs", "cbtquantile", "cbtclosequantile", "cbttestfreq", - "cbtmintimeout", "cbtlearntimeout", "cbtmaxopencircs", and - "cbtinitialtimeout" -- see "2.4.5. 
Consensus parameters governing - behavior" in path-spec.txt for a series of circuit build time related - consensus parameters. - - -5. Directory-related parameters - - "max-consensus-age-to-cache-for-diff" -- Determines how much - consensus history (in hours) relays should try to cache in order to - serve diffs. (min 0, max 8192, default 72) - - "try-diff-for-consensus-newer-than" -- This parameter determines how - old a consensus can be (in hours) before a client should no longer - try to find a diff for it. (min 0, max 8192, default 72) - -6. Pathbias parameters - - "pb_mincircs", "pb_noticepct", "pb_warnpct", "pb_extremepct", - "pb_dropguards", "pb_scalecircs", "pb_scalefactor", - "pb_multfactor", "pb_minuse", "pb_noticeusepct", - "pb_extremeusepct", "pb_scaleuse" -- DOCDOC - -7. Relay behavior - - "refuseunknownexits" -- if set to one, exit relays look at the previous - hop of circuits that ask to open an exit stream, and refuse to exit if - they don't recognize it as a relay. The goal is to make it harder for - people to use them as one-hop proxies. See trac entry 1751 for details. - Min: 0, Max: 1 - First-appeared: 0.2.2.17-alpha - - "onion-key-rotation-days" -- (min 1, max 90, default 28) - - "onion-key-grace-period-days" -- (min 1, max - onion-key-rotation-days, default 7) - - Every relay should list each onion key it generates for - onion-key-rotation-days days after generating it, and then - replace it. Relays should continue to accept their most recent - previous onion key for an additional onion-key-grace-period-days - days after it is replaced. (Introduced in 0.3.1.1-alpha; - prior versions of tor hardcoded both of these values to 7 days.) - - "AllowNonearlyExtend" -- If true, permit EXTEND cells that are not inside - RELAY_EARLY cells. - Min: 0. Max: 1. Default: 0. 
- First-appeared: 0.2.3.11-alpha - - "overload_dns_timeout_scale_percent" -- This value is a percentage of how - many DNS timeouts over N seconds we accept before reporting the overload - general state. It is scaled by a factor of 1000 in order to be able to - represent decimal values. As an example, a value of 1000 means 1%. - Min: 0. Max: 100000. Default: 1000. - First-appeared: 0.4.6.8 - Deprecated: 0.4.7.3-alpha-dev - - "overload_dns_timeout_period_secs" -- This value is the period in seconds - of the DNS timeout measurements (the N in the - "overload_dns_timeout_scale_percent" parameter). For this number of - seconds, we will gather DNS statistics and at the end, we'll do an - assessment on the overload general signal with regard to DNS timeouts. - Min: 0. Max: 2147483647. Default: 600 - First-appeared: 0.4.6.8 - Deprecated: 0.4.7.3-alpha-dev - - "overload_onionskin_ntor_scale_percent" -- This value is a percentage of - how many onionskin ntor drops over N seconds we accept before reporting the - overload general state. It is scaled by a factor of 1000 in order to be - able to represent decimal values. As an example, a value of 1000 means 1%. - Min: 0. Max: 100000. Default: 1000. - First-appeared: 0.4.7.5-alpha - - "overload_onionskin_ntor_period_secs" -- This value is the period in - seconds of the onionskin ntor overload measurements (the N in the - "overload_onionskin_ntor_scale_percent" parameter). For this number of - seconds, we will gather onionskin ntor statistics and at the end, we'll do - an assessment on the overload general signal. - Min: 0. Max: 2147483647. Default: 21600 (6 hours) - First-appeared: 0.4.7.5-alpha - - "assume-reachable" -- If true, relays should publish descriptors - even when they cannot make a connection to their IPv4 ORPort. - Min: 0. Max: 1. Default: 0. - First appeared: 0.4.5.1-alpha. - - "assume-reachable-ipv6" -- If true, relays should publish - descriptors even when they cannot make a connection to their IPv6 - ORPort.
- Min: 0. Max: 1. Default: 0. - First appeared: 0.4.5.1-alpha. - - "exit_dns_timeout" -- The time in milliseconds an Exit sets libevent to - wait before it considers the DNS timed out. The corresponding libevent - option is "timeout:". - Min: 1. Max: 120000. Default: 1000 (1sec) - First appeared: 0.4.7.5-alpha. - - "exit_dns_num_attempts" -- How many attempts _after the first_ should an - Exit make for a timing-out DNS query before calling it hopeless? (Each of - these attempts will wait for "exit_dns_timeout" independently). The - corresponding libevent option is "attempts:". - Min: 0. Max: 255. Default: 2 - First appeared: 0.4.7.5-alpha. - -8. V3 onion service parameters - - "hs_intro_min_introduce2", "hs_intro_max_introduce2" -- - Minimum/maximum number of INTRODUCE2 cells allowed per circuit - before rotation (actual number picked at random between these two - values). - Min: 0. Max: INT32_MAX. Defaults: 16384, 32768. - - "hs_intro_min_lifetime", "hs_intro_max_lifetime" -- Minimum/maximum - lifetime in seconds that a service should keep an intro point for - (actual lifetime picked at random between these two values). - Min: 0. Max: INT32_MAX. Defaults: 18 hours, 24 hours. - - "hs_intro_num_extra" -- Number of extra intro points a service is - allowed to open. This concept comes from proposal #155. - Min: 0. Max: 128. Default: 2. - - "hsdir_interval" -- The length of a time period, _in minutes_. See - rend-spec-v3.txt section [TIME-PERIODS]. - Min: 30. Max: 14400. Default: 1440. - - "hsdir_n_replicas" -- Number of HS descriptor replicas. - Min: 1. Max: 16. Default: 2. - - "hsdir_spread_fetch" -- Total number of HSDirs per replica a tor - client should select to try to fetch a descriptor. - Min: 1. Max: 128. Default: 3. - - "hsdir_spread_store" -- Total number of HSDirs per replica a service - will upload its descriptor to. - Min: 1. Max: 128. Default: 4 - - "HSV3MaxDescriptorSize" -- Maximum descriptor size (in bytes). - Min: 1. Max: INT32_MAX.
Default: 50000 - - "hs_service_max_rdv_failures" -- This parameter determines the - maximum number of rendezvous attempts an HS service can make per - introduction. - Min 1. Max 10. Default 2. - First-appeared: 0.3.3.0-alpha. - - "HiddenServiceEnableIntroDoSDefense" -- This parameter makes tor - start using this defense if the introduction point supports it - (for protover HSIntro=5). - Min: 0. Max: 1. Default: 0. - First appeared: 0.4.2.1-alpha. - - "HiddenServiceEnableIntroDoSBurstPerSec" -- Maximum burst to be used - for token bucket for the introduction point rate-limiting. - Min: 0. Max: INT32_MAX. Default: 200 - First appeared: 0.4.2.1-alpha. - - "HiddenServiceEnableIntroDoSRatePerSec" -- Refill rate to be used - for token bucket for the introduction point rate-limiting. - Min: 0. Max: INT32_MAX. Default: 25 - First appeared: 0.4.2.1-alpha. - -9. Denial-of-service parameters - - Denial of Service mitigation parameters. Introduced in 0.3.3.2-alpha: - - "DoSCircuitCreationEnabled" -- Enable the circuit creation DoS - mitigation. - - "DoSCircuitCreationMinConnections" -- Minimum threshold of - concurrent connections before a client address can be flagged as - executing a circuit creation DoS - - "DoSCircuitCreationRate" -- Allowed circuit creation rate per second - per client IP address once the minimum concurrent connection - threshold is reached. - - "DoSCircuitCreationBurst" -- The allowed circuit creation burst per - client IP address once the minimum concurrent connection threshold - is reached. - - "DoSCircuitCreationDefenseType" -- Defense type applied to a - detected client address for the circuit creation mitigation. - 1: No defense. - 2: Refuse circuit creation for the length of - "DoSCircuitCreationDefenseTimePeriod". - - - "DoSCircuitCreationDefenseTimePeriod" -- The base time period that - the DoS defense is activated for. - - "DoSConnectionEnabled" -- Enable the connection DoS mitigation.
- - "DoSConnectionMaxConcurrentCount" -- The maximum threshold of - concurrent connections from a client IP address. - - "DoSConnectionDefenseType" -- Defense type applied to a detected - client address for the connection mitigation. Possible values are: - 1: No defense. - 2: Immediately close new connections. - - "DoSRefuseSingleHopClientRendezvous" -- Refuse establishment of - rendezvous points for single hop clients. - -10. Padding-related parameters - - "circpad_max_circ_queued_cells" -- The circuitpadding module will - stop sending more padding cells if more than this many cells are in - the circuit queue of a given circuit. - Min: 0. Max: 50000. Default 1000. - First appeared: 0.4.0.3-alpha. - - "circpad_global_allowed_cells" -- DOCDOC - - "circpad_global_max_padding_pct" -- DOCDOC - - "circpad_padding_disabled" -- DOCDOC - - "circpad_padding_reduced" -- DOCDOC - - "nf_conntimeout_clients" -- DOCDOC - - "nf_conntimeout_relays" -- DOCDOC - - "nf_ito_high_reduced" -- DOCDOC - - "nf_ito_low" -- DOCDOC - - "nf_ito_low_reduced" -- DOCDOC - - "nf_pad_before_usage" -- DOCDOC - - "nf_pad_relays" -- DOCDOC - - "nf_pad_single_onion" -- DOCDOC - -11. Guard-related parameters - - (See guard-spec.txt for more information on the vocabulary used here.) - - "UseGuardFraction" -- If true, clients use `GuardFraction` - information from the consensus in order to decide how to weight - guards when picking them. - Min: 0. Max: 1. Default: 0. - First appeared: 0.2.6 - - "guard-lifetime-days" -- Controls guard lifetime. If an unconfirmed - guard has been sampled more than this many days ago, it should be - removed from the guard sample. - Min: 1. Max: 3650. Default: 120. - First appeared: 0.3.0 - - "guard-confirmed-min-lifetime-days" -- Controls confirmed guard - lifetime: if a guard was confirmed more than this many days ago, it - should be removed from the guard sample. - Min: 1. Max: 3650. Default: 60.
- First appeared: 0.3.0 - - "guard-internet-likely-down-interval" -- If Tor has been unable to - build a circuit for this long (in seconds), assume that the internet - connection is down, and treat guard failures as unproven. - Min: 1. Max: INT32_MAX. Default: 600. - First appeared: 0.3.0 - - "guard-max-sample-size" -- Largest number of guards that clients - should try to collect in their sample. - Min: 1. Max: INT32_MAX. Default: 60. - First appeared: 0.3.0 - - "guard-max-sample-threshold-percent" -- Largest bandwidth-weighted - fraction of guards that clients should try to collect in their - sample. - Min: 1. Max: 100. Default: 20. - First appeared: 0.3.0 - - "guard-meaningful-restriction-percent" -- If the client has - configured tor to exclude so many guards that the available guard - bandwidth is less than this percentage of the total, treat the guard - sample as "restricted", and keep it in a separate sample. - Min: 1. Max: 100. Default: 20. - First appeared: 0.3.0 - - "guard-extreme-restriction-percent" -- Warn the user if they have - configured tor to exclude so many guards that the available guard - bandwidth is less than this percentage of the total. - Min: 1. Max: 100. Default: 1. - First appeared: 0.3.0. MAX was INT32_MAX, which would have no meaningful - effect. MAX lowered to 100 in 0.4.7. - - "guard-min-filtered-sample-size" -- If fewer than this number of - guards is available in the sample after filtering out unusable - guards, the client should try to add more guards to the sample (if - allowed). - Min: 1. Max: INT32_MAX. Default: 20. - First appeared: 0.3.0 - - "guard-n-primary-guards" -- The number of confirmed guards that the - client should treat as "primary guards". - Min: 1. Max: INT32_MAX. Default: 3. - First appeared: 0.3.0 - - "guard-n-primary-guards-to-use", "guard-n-primary-dir-guards-to-use" - -- number of primary guards and primary directory guards that the - client should be willing to use in parallel. 
Other primary guards - won't get used unless the earlier ones are down. - "guard-n-primary-guards-to-use": - Min 1, Max INT32_MAX: Default: 1. - "guard-n-primary-dir-guards-to-use" - Min 1, Max INT32_MAX: Default: 3. - First appeared: 0.3.0 - - "guard-nonprimary-guard-connect-timeout" -- When trying to confirm - nonprimary guards, if a guard doesn't answer for more than this long - in seconds, treat lower-priority guards as usable. - Min: 1. Max: INT32_MAX. Default: 15 - First appeared: 0.3.0 - - "guard-nonprimary-guard-idle-timeout" -- When trying to confirm - nonprimary guards, if a guard doesn't answer for more than this long - in seconds, treat it as down. - Min: 1. Max: INT32_MAX. Default: 600 - First appeared: 0.3.0 - - "guard-remove-unlisted-guards-after-days" -- If a guard has been - unlisted in the consensus for at least this many days, remove it - from the sample. - Min: 1. Max: 3650. Default: 20. - First appeared: 0.3.0 - -X. Obsolete parameters - - "NumDirectoryGuards", "NumEntryGuards" -- Number of guard nodes - clients should use by default. If NumDirectoryGuards is 0, we - default to NumEntryGuards. - NumDirectoryGuards: Min: 0. Max: 10. Default: 0 - NumEntryGuards: Min: 1. Max: 10. Default: 3 - First-appeared: 0.2.4.23, 0.2.5.6-alpha - Removed in: 0.3.0 - - "GuardLifetime" -- Duration for which clients should choose guard - nodes, in seconds. - Min: 30 days. Max: 1826 days. Default: 60 days. - First-appeared: 0.2.4.12-alpha - Removed in: 0.3.0. - - "UseNTorHandshake" -- If true, then versions of Tor that support - NTor will prefer to use it by default. - Min: 0, Max: 1. Default: 1. - First-appeared: 0.2.4.8-alpha - Removed in: 0.2.9. - - "Support022HiddenServices" -- Used to implement a mass switch-over - from sending timestamps to hidden services by default to sending no - timestamps at all. If this option is absent, or is set to 1, - clients with the default configuration send timestamps; otherwise, - they do not. - Min: 0, Max: 1. Default: 1. 
- First-appeared: 0.2.4.18-rc - Removed in: 0.2.6 diff --git a/path-spec.txt b/path-spec.txt deleted file mode 100644 index 33d50e5..0000000 --- a/path-spec.txt +++ /dev/null @@ -1,1051 +0,0 @@ - - Tor Path Specification - - Roger Dingledine - Nick Mathewson - -Note: This is an attempt to specify Tor as currently implemented. Future -versions of Tor will implement improved algorithms. - -This document tries to cover how Tor chooses to build circuits and assign -streams to circuits. Other implementations MAY take other approaches, but -implementors should be aware of the anonymity and load-balancing implications -of their choices. - - THIS SPEC ISN'T DONE YET. - - The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL - NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and - "OPTIONAL" in this document are to be interpreted as described in - RFC 2119. - -Table of Contents - - 1. General operation - 1.1. Terminology - 1.2. A relay's bandwidth - 2. Building circuits - 2.1. When we build - 2.1.0. We don't build circuits until we have enough directory info - 2.1.1. Clients build circuits preemptively - 2.1.2. Clients build circuits on demand - 2.1.3. Relays build circuits for testing reachability and bandwidth - 2.1.4. Hidden-service circuits - 2.1.5. Rate limiting of failed circuits - 2.1.6. When to tear down circuits - 2.2. Path selection and constraints - 2.2.1. Choosing an exit - 2.2.2. User configuration - 2.3. Cannibalizing circuits - 2.4. Learning when to give up ("timeout") on circuit construction - 2.4.1. Distribution choice - 2.4.2. How much data to record - 2.4.3. Parameter estimation - 2.4.3. Calculating timeout thresholds for circuits of different lengths - 2.4.4. How to record timeouts - 2.4.5. Detecting Changing Network Conditions - 2.4.6. Consensus parameters governing behavior - 2.5. Handling failure - 3. Attaching streams to circuits - 4. Hidden-service related circuits - 5. Guard nodes - 5.1. How consensus bandwidth weights factor into entry guard selection - 6. 
Server descriptor purposes - 7. Detecting route manipulation by Guard nodes (Path Bias) - 7.1. Measuring path construction success rates - 7.2. Measuring path usage success rates - 7.3. Scaling success counts - 7.4. Parametrization - 7.5. Known barriers to enforcement - X. Old notes - X.1. Do we actually do this? - X.2. A thing we could do to deal with reachability. - X.3. Some stuff that worries me about entry guards. 2006 Jun, Nickm. - -1. General operation - - Tor begins building circuits as soon as it has enough directory - information to do so (see section 5 of dir-spec.txt). Some circuits are - built preemptively because we expect to need them later (for user - traffic), and some are built because of immediate need (for user traffic - that no current circuit can handle, for testing the network or our - reachability, and so on). - - [Newer versions of Tor (0.2.6.2-alpha and later): - If the consensus contains Exits (the typical case), Tor will build both - exit and internal circuits. When bootstrap completes, Tor will be ready - to handle an application requesting an exit circuit to services like the - World Wide Web. - - If the consensus does not contain Exits, Tor will only build internal - circuits. In this case, earlier statuses will have included "internal" - as indicated above. When bootstrap completes, Tor will be ready to handle - an application requesting an internal circuit to hidden services at - ".onion" addresses. - - If a future consensus contains Exits, exit circuits may become available.] - - When a client application creates a new stream (by opening a SOCKS - connection or launching a resolve request), we attach it to an appropriate - open circuit if one exists, or wait if an appropriate circuit is - in-progress. We launch a new circuit only - if no current circuit can handle the request. We rotate circuits over - time to avoid some profiling attacks. 
- - To build a circuit, we choose all the nodes we want to use, and then - construct the circuit. Sometimes, when we want a circuit that ends at a - given hop, and we have an appropriate unused circuit, we "cannibalize" the - existing circuit and extend it to the new terminus. - - These processes are described in more detail below. - - This document describes Tor's automatic path selection logic only; path - selection can be overridden by a controller (with the EXTENDCIRCUIT and - ATTACHSTREAM commands). Paths constructed through these means may - violate some constraints given below. - -1.1. Terminology - - A "path" is an ordered sequence of nodes, not yet built as a circuit. - - A "clean" circuit is one that has not yet been used for any traffic. - - A "fast" or "stable" or "valid" node is one that has the 'Fast' or - 'Stable' or 'Valid' flag - set respectively, based on our current directory information. A "fast" - or "stable" circuit is one consisting only of "fast" or "stable" nodes. - - In an "exit" circuit, the final node is chosen based on waiting stream - requests if any, and in any case it avoids nodes with exit policy of - "reject *:*". An "internal" circuit, on the other hand, is one where - the final node is chosen just like a middle node (ignoring its exit - policy). - - A "request" is a client-side stream or DNS resolve that needs to be - served by a circuit. - - A "pending" circuit is one that we have started to build, but which has - not yet completed. - - A circuit or path "supports" a request if it is okay to use the - circuit/path to fulfill the request, according to the rules given below. - A circuit or path "might support" a request if some aspect of the request - is unknown (usually its target IP), but we believe the path probably - supports the request according to the rules given below. - -1.2. 
A relay's bandwidth - - Old versions of Tor did not report bandwidths in network status - documents, so clients had to learn them from the routers' advertised - relay descriptors. - - For versions of Tor prior to 0.2.1.17-rc, everywhere below where we - refer to a relay's "bandwidth", we mean its clipped advertised - bandwidth, computed by taking the smaller of the 'rate' and - 'observed' arguments to the "bandwidth" element in the relay's - descriptor. If a router's advertised bandwidth is greater than - MAX_BELIEVABLE_BANDWIDTH (currently 10 MB/s), we clipped to that - value. - - For more recent versions of Tor, we take the bandwidth value declared - in the consensus, and fall back to the clipped advertised bandwidth - only if the consensus does not have bandwidths listed. - -2. Building circuits - -2.1. When we build - -2.1.0. We don't build circuits until we have enough directory info - - There's a class of possible attacks where our directory servers - only give us information about the relays that they would like us - to use. To prevent this attack, we don't build multi-hop - circuits for real traffic (like those in 2.1.1, 2.1.2, 2.1.4 - below) until we have enough directory information to be - reasonably confident this attack isn't being done to us. - - Here, "enough" directory information is defined as: - - * Having a consensus that's been valid at some point in the - last REASONABLY_LIVE_TIME interval (24 hours). - - * Having enough descriptors that we could build at least some - fraction F of all bandwidth-weighted paths, without taking - ExitNodes/EntryNodes/etc into account. - - (F is set by the PathsNeededToBuildCircuits option, - defaulting to the 'min_paths_for_circs_pct' consensus - parameter, with a final default value of 60%.) - - * Having enough descriptors that we could build at least some - fraction F of all bandwidth-weighted paths, _while_ taking - ExitNodes/EntryNodes/etc into account. - - (F is as above.) 
- - * Having a descriptor for every one of the first - NUM_USABLE_PRIMARY_GUARDS guards among our primary guards. (see - guard-spec.txt) - - We define the "fraction of bandwidth-weighted paths" as the product of - these three fractions. - - * The fraction of descriptors that we have for nodes with the Guard - flag, weighted by their bandwidth for the guard position. - * The fraction of descriptors that we have for all nodes, - weighted by their bandwidth for the middle position. - * The fraction of descriptors that we have for nodes with the Exit - flag, weighted by their bandwidth for the exit position. - - If the consensus has zero weighted bandwidth for a given kind of - relay (Guard, Middle, or Exit), Tor instead uses the fraction of relays - for which it has the descriptor (not weighted by bandwidth at all). - - If the consensus lists zero exit-flagged relays, Tor instead uses the - fraction of middle relays. - - -2.1.1. Clients build circuits preemptively - - When running as a client, Tor tries to maintain at least a certain - number of clean circuits, so that new streams can be handled - quickly. To increase the likelihood of success, Tor tries to - predict what circuits will be useful by choosing from among nodes - that support the ports we have used in the recent past (by default - one hour). Specifically, on startup Tor tries to maintain one clean - fast exit circuit that allows connections to port 80, and at least - two fast clean stable internal circuits in case we get a resolve - request or hidden service request (at least three if we _run_ a - hidden service). 
- - After that, Tor will adapt the circuits that it preemptively builds - based on the requests it sees from the user: it tries to have two fast - clean exit circuits available for every port seen within the past hour - (each circuit can be adequate for many predicted ports -- it doesn't - need two separate circuits for each port), and it tries to have the - above internal circuits available if we've seen resolves or hidden - service activity within the past hour. If there are 12 or more clean - circuits open, it doesn't open more even if it has more predictions. - - Only stable circuits can "cover" a port that is listed in the - LongLivedPorts config option. Similarly, hidden service requests - to ports listed in LongLivedPorts make us create stable internal - circuits. - - Note that if there are no requests from the user for an hour, Tor - will predict no use and build no preemptive circuits. - - The Tor client SHOULD NOT store its list of predicted requests to a - persistent medium. - -2.1.2. Clients build circuits on demand - - Additionally, when a client request exists that no circuit (built or - pending) might support, we create a new circuit to support the request. - For exit connections, we pick an exit node that will handle the - most pending requests (choosing arbitrarily among ties), launch a - circuit to end there, and repeat until every unattached request - might be supported by a pending or built circuit. For internal - circuits, we pick an arbitrary acceptable path, repeating as needed. - - Clients consider a circuit to become "dirty" as soon as a stream is - attached to it, or some other request is performed over the circuit. - If a circuit has been "dirty" for at least MaxCircuitDirtiness seconds, - new circuits may not be attached to it. - - In some cases we can reuse an already established circuit if it's - clean; see Section 2.3 (cannibalizing circuits) for details. - -2.1.3. 
Relays build circuits for testing reachability and bandwidth - - Tor relays test reachability of their ORPort once they have - successfully built a circuit (on startup and whenever their IP address - changes). They build an ordinary fast internal circuit with themselves - as the last hop. As soon as any testing circuit succeeds, the Tor - relay decides it's reachable and is willing to publish a descriptor. - - We launch multiple testing circuits (one at a time), until we - have NUM_PARALLEL_TESTING_CIRC (4) such circuits open. Then we - do a "bandwidth test" by sending a certain number of relay drop - cells down each circuit: BandwidthRate * 10 / CELL_NETWORK_SIZE - total cells divided across the four circuits, but never more than - CIRCWINDOW_START (1000) cells total. This exercises both outgoing and - incoming bandwidth, and helps to jumpstart the observed bandwidth - (see dir-spec.txt). - - Tor relays also test reachability of their DirPort once they have - established a circuit, but they use an ordinary exit circuit for - this purpose. - -2.1.4. Hidden-service circuits - - See section 4 below. - -2.1.5. Rate limiting of failed circuits - - If we fail to build a circuit N times in a X second period (see Section - 2.3 for how this works), we stop building circuits until the X seconds - have elapsed. - XXXX - -2.1.6. When to tear down circuits - - Clients should tear down circuits (in general) only when those circuits - have no streams on them. Additionally, clients should tear-down - stream-less circuits only under one of the following conditions: - - - The circuit has never had a stream attached, and it was created too - long in the past (based on CircuitsAvailableTimeout or - cbtlearntimeout, depending on timeout estimate status). - - - The circuit is dirty (has had a stream attached), and it has been - dirty for at least MaxCircuitDirtiness. - -2.2. Path selection and constraints - - We choose the path for each new circuit before we build it. 
We choose the - exit node first, followed by the other nodes in the circuit, front to - back. (In other words, for a 3-hop circuit, we first pick hop 3, - then hop 1, then hop 2.) All paths we generate obey the following - constraints: - - - We do not choose the same router twice for the same path. - - We do not choose any router in the same family as another in the same - path. (Two routers are in the same family if each one lists the other - in the "family" entries of its descriptor.) - - We do not choose more than one router in a given /16 subnet - (unless EnforceDistinctSubnets is 0). - - We don't choose any non-running or non-valid router unless we have - been configured to do so. By default, we are configured to allow - non-valid routers in "middle" and "rendezvous" positions. - - If we're using Guard nodes, the first node must be a Guard (see 5 - below) - - XXXX Choosing the length - - For "fast" circuits, we only choose nodes with the Fast flag. For - non-"fast" circuits, all nodes are eligible. - - For all circuits, we weight node selection according to router bandwidth. - - We also weight the bandwidth of Exit and Guard flagged nodes depending on - the fraction of total bandwidth that they make up and depending upon the - position they are being selected for. - - These weights are published in the consensus, and are computed as described - in Section "Computing Bandwidth Weights" of dir-spec.txt. 
They are: - - Wgg - Weight for Guard-flagged nodes in the guard position - Wgm - Weight for non-flagged nodes in the guard position - Wgd - Weight for Guard+Exit-flagged nodes in the guard position - - Wmg - Weight for Guard-flagged nodes in the middle position - Wmm - Weight for non-flagged nodes in the middle position - Wme - Weight for Exit-flagged nodes in the middle position - Wmd - Weight for Guard+Exit-flagged nodes in the middle position - - Weg - Weight for Guard-flagged nodes in the exit position - Wem - Weight for non-flagged nodes in the exit position - Wee - Weight for Exit-flagged nodes in the exit position - Wed - Weight for Guard+Exit-flagged nodes in the exit position - - Wgb - Weight for BEGIN_DIR-supporting Guard-flagged nodes - Wmb - Weight for BEGIN_DIR-supporting non-flagged nodes - Web - Weight for BEGIN_DIR-supporting Exit-flagged nodes - Wdb - Weight for BEGIN_DIR-supporting Guard+Exit-flagged nodes - - Wbg - Weight for Guard-flagged nodes for BEGIN_DIR requests - Wbm - Weight for non-flagged nodes for BEGIN_DIR requests - Wbe - Weight for Exit-flagged nodes for BEGIN_DIR requests - Wbd - Weight for Guard+Exit-flagged nodes for BEGIN_DIR requests - - If any of those weights is malformed or not present in a consensus, - clients proceed with the regular path selection algorithm, setting - the weights to the default value of 10000. - - Additionally, we may be building circuits with one or more requests in - mind. Each kind of request puts certain constraints on paths: - - - All service-side introduction circuits and all rendezvous paths - should be Stable. - - All connection requests for connections that we think will need to - stay open a long time require Stable circuits. Currently, Tor decides - this by examining the request's target port, and comparing it to a - list of "long-lived" ports. (Default: 21, 22, 706, 1863, 5050, - 5190, 5222, 5223, 6667, 6697, 8300.) 
- - DNS resolves require an exit node whose exit policy is not equivalent - to "reject *:*". - - Reverse DNS resolves require a version of Tor with advertised eventdns - support (available in Tor 0.1.2.1-alpha-dev and later). - - All connection requests require an exit node whose exit policy - supports their target address and port (if known), or which "might - support it" (if the address isn't known). See 2.2.1. - - Rules for Fast? XXXXX - -2.2.1. Choosing an exit - - If we know what IP address we want to connect to or resolve, we can - trivially tell whether a given router will support it by simulating - its declared exit policy. - - Because we often connect to addresses of the form hostname:port, we do not - always know the target IP address when we select an exit node. In these - cases, we need to pick an exit node that "might support" connections to a - given port at an unknown address. An exit node "might support" - such a connection if any clause that accepts any connections to that port - precedes all clauses (if any) that reject all connections to that port. - - Unless requested to do so by the user, we never choose an exit node - flagged as "BadExit" by more than half of the authorities who advertise - themselves as listing bad exits. - -2.2.2. User configuration - - Users can alter the default behavior for path selection with configuration - options. - - - If "ExitNodes" is provided, then every request requires an exit node on - the ExitNodes list. (If a request is supported by no nodes on that list, - and StrictExitNodes is false, then Tor treats that request as if - ExitNodes were not provided.) - - - "EntryNodes" and "StrictEntryNodes" behave analogously. - - - If a user tries to connect to or resolve a hostname of the form - HOSTNAME.EXITNODE.exit, the request is rewritten to a request for - HOSTNAME, and the request is only supported by the exit whose nickname - or fingerprint is EXITNODE. 
- - - When set, "HSLayer2Nodes" and "HSLayer3Nodes" relax Tor's path - restrictions to allow nodes in the same /16 and node family to reappear - in the path. They also allow the guard node to be chosen as the RP, IP, - and HSDIR, and as the hop before those positions. - -2.3. Cannibalizing circuits - - If we need a circuit and have a clean one already established, in - some cases we can adapt the clean circuit for our new - purpose. Specifically, - - For hidden service interactions, we can "cannibalize" a clean internal - circuit if one is available, so we don't need to build those circuits - from scratch on demand. - - We can also cannibalize clean circuits when the client asks to exit - at a given node -- either via the ".exit" notation or because the - destination is running at the same location as an exit node. - -2.4. Learning when to give up ("timeout") on circuit construction - - Since version 0.2.2.8-alpha, Tor clients attempt to learn when to give - up on circuits based on network conditions. - -2.4.1. Distribution choice - - Based on studies of build times, we found that the distribution of - circuit build times appears to be a Frechet distribution (and a multi-modal - Frechet distribution, if more than one guard or bridge is used). However, - estimators and quantile functions of the Frechet distribution are difficult - to work with and slow to converge. So instead, since we are only interested - in the accuracy of the tail, clients approximate the tail of the multi-modal - distribution with a single Pareto curve. - -2.4.2. How much data to record - - From our observations, the minimum number of circuit build times for a - reasonable fit appears to be on the order of 100. However, to keep a - good fit over the long term, clients store 1000 most recent circuit build - times in a circular array. 
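As an illustration only, the bounded history described above can be sketched in Python (Tor itself is written in C). The class name, method names, and constants are hypothetical; only the capacity of 1000 and the rough minimum of 100 build times come from the text.

```python
from collections import deque

BUILD_TIME_CAPACITY = 1000  # most recent build times kept, per the text
MIN_TIMES_FOR_FIT = 100     # "on the order of 100" for a reasonable fit


class BuildTimeHistory:
    """Hypothetical fixed-size history of circuit build times (milliseconds)."""

    def __init__(self, capacity=BUILD_TIME_CAPACITY):
        # A deque with maxlen silently drops the oldest entry when a new one
        # is appended to a full buffer, which mimics a circular array.
        self._times = deque(maxlen=capacity)

    def record(self, build_time_ms):
        """Store one completed build time, evicting the oldest if full."""
        self._times.append(build_time_ms)

    def enough_data(self, minimum=MIN_TIMES_FOR_FIT):
        """Return True once at least `minimum` build times are stored."""
        return len(self._times) >= minimum

    def times(self):
        """Return the stored build times, oldest first."""
        return list(self._times)
```

A deque is used here purely for brevity; a fixed array with a wrapping write index would behave the same way.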
- - These build times only include the times required to build three-hop - circuits, and the times required to build the first three hops of circuits - with more than three hops. Circuits of fewer than three hops are not - recorded, and hops past the third are not recorded. - - The Tor client should build test circuits at a rate of one every 'cbttestfreq' - (10 seconds) until 'cbtmincircs' (100 circuits) are built, with a maximum of - 'cbtmaxopencircs' (default: 10) circuits open at once. This allows a fresh - Tor to have a CircuitBuildTimeout estimated within 30 minutes after install - or network change (see section 2.4.5 below). - - Timeouts are stored on disk in a histogram of 10ms bin width, the same - width used to calculate the Xm value above. The timeouts recorded in the - histogram must be shuffled after being read from disk, to preserve a - proper expiration of old values after restart. - - Thus, some build time resolution is lost during restart. Implementations may - choose a different persistence mechanism than this histogram, but be aware - that build time binning is still needed for parameter estimation. - -2.4.3. Parameter estimation - - Once 'cbtmincircs' build times are recorded, Tor clients update the - distribution parameters and recompute the timeout every circuit completion - (though see section 2.4.5 for when to pause and reset timeout due to - too many circuits timing out). - - Tor clients calculate the parameters for a Pareto distribution fitting the - data using the maximum likelihood estimator. For derivation, see: - https://en.wikipedia.org/wiki/Pareto_distribution#Estimation_of_parameters - - Because build times are not a true Pareto distribution, we alter how Xm is - computed. In a max likelihood estimator, the mode of the distribution is - used directly as Xm. 
- - Instead of using the mode of discrete build times directly, Tor clients - compute the Xm parameter using the weighted average of the midpoints - of the 'cbtnummodes' (10) most frequently occurring 10ms histogram bins. - Ties are broken in favor of earlier bins (that is, in favor of bins - corresponding to shorter build times). - - (The use of 10 modes was found to minimize error from the selected - cbtquantile, with 10ms bins for quantiles 60-80, compared to many other - heuristics). - - To avoid ln(1.0+epsilon) precision issues, use log laws to rewrite the - estimator for 'alpha' as the sum of logs followed by subtraction, rather - than multiplication and division: - - alpha = n/(Sum_n{ln(MAX(Xm, x_i))} - n*ln(Xm)) - - In this, n is the total number of build times that have completed, x_i is - the ith recorded build time, and Xm is the mode-based estimate computed - from the x_i as above. - - All times below Xm are counted as having the Xm value via the MAX(), - because in Pareto estimators, Xm is supposed to be the lowest value. - However, since clients use mode averaging to estimate Xm, there can be - values below our Xm. Effectively, the Pareto estimator then treats - everything smaller than Xm as having happened at Xm. One can also see that if - clients did not do this, alpha could become negative, which - results in an exponential curve, not a Pareto probability distribution. - - The timeout itself is calculated by using the Pareto Quantile function (the - inverted CDF) to give us the value on the CDF such that 80% of the mass - of the distribution is below the timeout value (parameter 'cbtquantile'). - - The Pareto Quantile Function (inverse CDF) is: - - F(q) = Xm/((1.0-q)^(1.0/alpha)) - - Thus, clients obtain the circuit build timeout for 3-hop circuits by - computing: - - timeout_ms = F(0.8) # 'cbtquantile' == 0.8 - - With this, we expect that the Tor client will accept the fastest 80% of the - total number of paths on the network. 
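The estimation steps above can be sketched as follows. This is an illustrative Python sketch, not Tor's C implementation: the function names are hypothetical, and weighting each bin's midpoint by its count is an assumption, since the text says only that the average is weighted.

```python
import math
from collections import Counter

BIN_WIDTH_MS = 10  # histogram bin width given in the text


def estimate_xm(build_times_ms, num_modes=10, bin_width=BIN_WIDTH_MS):
    """Weighted average of the midpoints of the most populated bins.

    Assumption: each bin is weighted by the number of build times in it.
    """
    counts = Counter(int(t // bin_width) for t in build_times_ms)
    # Highest count first; ties go to the earlier (shorter build time) bin.
    top = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:num_modes]
    total = sum(count for _, count in top)
    return sum((b * bin_width + bin_width / 2) * count for b, count in top) / total


def estimate_alpha(build_times_ms, xm):
    """alpha = n / (Sum ln(MAX(Xm, x_i)) - n*ln(Xm))."""
    n = len(build_times_ms)
    log_sum = sum(math.log(max(xm, t)) for t in build_times_ms)
    denom = log_sum - n * math.log(xm)
    if denom <= 0:
        # Sketch-only guard: no build time exceeds Xm, so there is no tail to fit.
        raise ValueError("no build times above Xm")
    return n / denom


def pareto_quantile(q, xm, alpha):
    """F(q) = Xm / ((1.0 - q)^(1.0 / alpha))."""
    return xm / ((1.0 - q) ** (1.0 / alpha))
```

With build times in milliseconds, the 3-hop timeout would then be pareto_quantile(0.8, xm, alpha) for the default 'cbtquantile'.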
- - Clients obtain the circuit close time to completely abandon circuits as: - - close_ms = F(0.99) # 'cbtclosequantile' == 0.99 - - To avoid waiting an unreasonably long period of time for circuits that - simply have relays that are down, Tor clients cap timeout_ms at the max - build time actually observed so far, and cap close_ms at twice this max, - but at least 60 seconds: - - timeout_ms = MIN(timeout_ms, max_observed_timeout) - close_ms = MAX(MIN(close_ms, 2*max_observed_timeout), 'cbtinitialtimeout') - -2.4.3. Calculating timeout thresholds for circuits of different lengths - - The timeout_ms and close_ms estimates above are good only for 3-hop - circuits, since only 3-hop circuits are recorded in the list of build - times. - - To calculate the appropriate timeouts and close timeouts for circuits of - other lengths, the client multiplies the timeout_ms and close_ms values - by a scaling factor determined by the number of communication hops - needed to build their circuits: - - timeout_ms[hops=N] = timeout_ms * Actions(N) / Actions(3) - - close_ms[hops=N] = close_ms * Actions(N) / Actions(3) - - where Actions(N) = N * (N + 1) / 2. - - To calculate timeouts for operations other than circuit building, - the client should add X to Actions(N) for every round-trip communication - required with the Xth hop. - -2.4.4. How to record timeouts - - Pareto estimators begin to lose their accuracy if the tail is omitted. - Hence, Tor clients actually calculate two timeouts: a usage timeout, and a - close timeout. - - Circuits that pass the usage timeout are marked as measurement circuits, - and are allowed to continue to build until the close timeout corresponding - to the point 'cbtclosequantile' (default 99) on the Pareto curve, or 60 - seconds, whichever is greater. - - The actual completion times for these measurement circuits should be - recorded. 
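As an illustrative aside, the capping rule and the per-length scaling given earlier for timeout_ms and close_ms can be sketched in Python. The function names are hypothetical, and the 60000 ms default stands in for 'cbtinitialtimeout'; this is a sketch under those assumptions, not Tor's implementation.

```python
def cap_timeouts(timeout_ms, close_ms, max_observed_ms, initial_timeout_ms=60000):
    """Apply the MIN/MAX caps to the usage and close timeouts."""
    timeout_ms = min(timeout_ms, max_observed_ms)
    close_ms = max(min(close_ms, 2 * max_observed_ms), initial_timeout_ms)
    return timeout_ms, close_ms


def actions(n_hops):
    """Actions(N) = N * (N + 1) / 2."""
    return n_hops * (n_hops + 1) // 2


def scale_for_hops(value_ms, n_hops):
    """Scale a 3-hop timeout (usage or close) to a circuit of n_hops hops."""
    return value_ms * actions(n_hops) / actions(3)
```

For example, a 3-hop timeout of 600 ms scales by Actions(4)/Actions(3) = 10/6 for a 4-hop circuit.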
- - Implementations should completely abandon a circuit - if the total build time exceeds the close threshold. Such closed circuits - should be ignored, as this typically means one of the relays in the path is - offline. - -2.4.5. Detecting Changing Network Conditions - - Tor clients attempt to detect both network connectivity loss and drastic - changes in the timeout characteristics. - - To detect changing network conditions, clients keep a history of - the timeout or non-timeout status of the past 'cbtrecentcount' circuits - (20 circuits) that successfully completed at least one hop. If more than - 90% of these circuits time out, the client discards all build time history, - resets the timeout to 'cbtinitialtimeout' (60 seconds), and then begins - recomputing the timeout. - - If the timeout was already at least 'cbtinitialtimeout', - the client doubles the timeout. - - The records here (of how many circuits succeeded or failed among the most - recent 'cbtrecentcount') are not stored as persistent state. On reload, - we start with a new, empty state. - -2.4.6. Consensus parameters governing behavior - - Clients that implement circuit build timeout learning should obey the - following consensus parameters that govern behavior, in order to allow - us to handle bugs or other emergent behaviors due to client circuit - construction. If these parameters are not present in the consensus, - the listed default values should be used instead. - - cbtdisabled - Default: 0 - Min: 0 - Max: 1 - Effect: If 1, all CircuitBuildTime learning code should be - disabled and history should be discarded. For use in - emergency situations only. - - cbtnummodes - Default: 10 - Min: 1 - Max: 20 - Effect: This value governs how many modes to use in the weighted - average calculation of Pareto parameter Xm. Selecting Xm as the - average of multiple modes improves accuracy of the Pareto tail - for quantile cutoffs from 60-80% (see cbtquantile). 
- - cbtrecentcount - Default: 20 - Min: 3 - Max: 1000 - Effect: This is the number of circuit build outcomes (success vs - timeout) to keep track of for the following option. - - cbtmaxtimeouts - Default: 18 - Min: 3 - Max: 10000 - Effect: When this many timeouts happen in the last 'cbtrecentcount' - circuit attempts, the client should discard all of its - history and begin learning a fresh timeout value. - - Note that if this parameter's value is greater than the value - of 'cbtrecentcount', then the history will never be - discarded because of this feature. - - cbtmincircs - Default: 100 - Min: 1 - Max: 10000 - Effect: This is the minimum number of circuits to build before - computing a timeout. - - Note that if this parameter's value is higher than 1000 (the - number of time observations that a client keeps in its - circular buffer), circuit build timeout calculation is - effectively disabled, and the default timeouts are used - indefinitely. - - cbtquantile - Default: 80 - Min: 10 - Max: 99 - Effect: This is the position on the quantile curve to use to set the - timeout value. It is a percent (10-99). - - cbtclosequantile - Default: 99 - Min: Value of cbtquantile parameter - Max: 99 - Effect: This is the position on the quantile curve to use to set the - timeout value to use to actually close circuits. It is a - percent (0-99). - - cbttestfreq - Default: 10 - Min: 1 - Max: 2147483647 (INT32_MAX) - Effect: Describes how often in seconds to build a test circuit to - gather timeout values. Only applies if less than 'cbtmincircs' - have been recorded. - - cbtmintimeout - Default: 10 - Min: 10 - Max: 2147483647 (INT32_MAX) - Effect: This is the minimum allowed timeout value in milliseconds. - - cbtinitialtimeout - Default: 60000 - Min: Value of cbtmintimeout - Max: 2147483647 (INT32_MAX) - Effect: This is the timeout value to use before we have enough data - to compute a timeout, in milliseconds. 
If we do not have - enough data to compute a timeout estimate (see cbtmincircs), - then we use this interval both for the usage timeout and the - close timeout. - - cbtlearntimeout - Default: 180 - Min: 10 - Max: 60000 - Effect: This is how long idle circuits will be kept open while cbt is - learning a new timeout value. - - cbtmaxopencircs - Default: 10 - Min: 0 - Max: 14 - Effect: This is the maximum number of circuits that can be open at - the same time during the circuit build time learning phase. - -2.5. Handling failure - - If an attempt to extend a circuit fails (either because the first create - failed or a subsequent extend failed) then the circuit is torn down and is - no longer pending. (XXXX really?) Requests that might have been - supported by the pending circuit thus become unsupported, and a new - circuit needs to be constructed. - - If a stream "begin" attempt fails with an EXITPOLICY error, we - decide that the exit node's exit policy is not correctly advertised, - so we treat the exit node as if it were a non-exit until we retrieve - a fresh descriptor for it. - - Excessive amounts of either type of failure can indicate an - attack on anonymity. See section 7 for how excessive failure is handled. - -3. Attaching streams to circuits - - When a circuit that might support a request is built, Tor tries to attach - the request's stream to the circuit and sends a BEGIN, BEGIN_DIR, - or RESOLVE relay - cell as appropriate. If the request completes unsuccessfully, Tor - considers the reason given in the END relay cell. [XXX yes, and?] - - - After a request has remained unattached for SocksTimeout (2 minutes - by default), Tor abandons the attempt and signals an error to the - client as appropriate (e.g., by closing the SOCKS connection). - - XXX Timeouts and when Tor auto-retries. - - * What stream-end-reasons are appropriate for retrying. - - If no reply to BEGIN/RESOLVE, then the stream will time out and fail. - -4. 
Hidden-service related circuits - - XXX Tracking expected hidden service use (client-side and hidserv-side) - -5. Guard nodes - - We use Guard nodes (also called "helper nodes" in the research - literature) to prevent certain profiling attacks. For an overview of - our Guard selection algorithm -- which has grown rather complex -- see - guard-spec.txt. - -5.1. How consensus bandwidth weights factor into entry guard selection - - When weighting a list of routers for choosing an entry guard, the following - consensus parameters (from the "bandwidth-weights" line) apply: - - Wgg - Weight for Guard-flagged nodes in the guard position - Wgm - Weight for non-flagged nodes in the guard Position - Wgd - Weight for Guard+Exit-flagged nodes in the guard Position - Wgb - Weight for BEGIN_DIR-supporting Guard-flagged nodes - Wmb - Weight for BEGIN_DIR-supporting non-flagged nodes - Web - Weight for BEGIN_DIR-supporting Exit-flagged nodes - Wdb - Weight for BEGIN_DIR-supporting Guard+Exit-flagged nodes - - Please see "bandwidth-weights" in §3.4.1 of dir-spec.txt for more in depth - descriptions of these parameters. - - If a router has been marked as both an entry guard and an exit, then we - prefer to use it more, with our preference for doing so (roughly) linearly - increasing w.r.t. the router's non-guard bandwidth and bandwidth weight - (calculated without taking the guard flag into account). From proposal - #236: - - | - | Let Wpf denote the weight from the 'bandwidth-weights' line a - | client would apply to N for position p if it had the guard - | flag, Wpn the weight if it did not have the guard flag, and B the - | measured bandwidth of N in the consensus. Then instead of choosing - | N for position p proportionally to Wpf*B or Wpn*B, clients should - | choose N proportionally to F*Wpf*B + (1-F)*Wpn*B. - - where F is the weight as calculated using the above parameters. - -6. 
Server descriptor purposes - - There are currently three "purposes" supported for server descriptors: - general, controller, and bridge. Most descriptors are of type general - -- these are the ones listed in the consensus, and the ones fetched - and used in normal cases. - - Controller-purpose descriptors are those delivered by the controller - and labelled as such: they will be kept around (and expire like - normal descriptors), and they can be used by the controller in its - CIRCUITEXTEND commands. Otherwise they are ignored by Tor when it - chooses paths. - - Bridge-purpose descriptors are for routers that are used as bridges. See - doc/design-paper/blocking.pdf for more design explanation, or proposal - 125 for specific details. Currently bridge descriptors are used in place - of normal entry guards, for Tor clients that have UseBridges enabled. - -7. Detecting route manipulation by Guard nodes (Path Bias) - - The Path Bias defense is designed to defend against a type of route - capture where malicious Guard nodes deliberately fail or choke circuits - that extend to non-colluding Exit nodes to maximize their network - utilization in favor of carrying only compromised traffic. - - In the extreme, the attack allows an adversary that carries c/n - of the network capacity to deanonymize c/n of the network - connections, breaking the O((c/n)^2) property of Tor's original - threat model. It also allows targeted attacks aimed at monitoring - the activity of specific users, bridges, or Guard nodes. - - There are two points where path selection can be manipulated: - during construction, and during usage. Circuit construction - can be manipulated by inducing circuit failures during circuit - extend steps, which causes the Tor client to transparently retry - the circuit construction with a new path. 
Circuit usage can be - manipulated by abusing the stream retry features of Tor (for - example by withholding stream attempt responses from the client - until the stream timeout has expired), at which point the tor client - will also transparently retry the stream on a new path. - - The defense as deployed therefore makes two independent sets of - measurements of successful path use: one during circuit construction, - and one during circuit usage. - - The intended behavior is for clients to ultimately disable the use - of Guards responsible for excessive circuit failure of either type - (see section 7.4); however known issues with the Tor network currently - restrict the defense to being informational only at this stage (see - section 7.5). - -7.1. Measuring path construction success rates - - Clients maintain two counts for each of their guards: a count of the - number of times a circuit was extended to at least two hops through that - guard, and a count of the number of circuits that successfully complete - through that guard. The ratio of these two numbers is used to determine - a circuit success rate for that Guard. - - Circuit build timeouts are counted as construction failures if the - circuit fails to complete before the 95% "right-censored" timeout - interval, not the 80% timeout condition (see section 2.4). - - If a circuit closes prematurely after construction but before being - requested to close by the client, this is counted as a failure. - -7.2. Measuring path usage success rates - - Clients maintain two usage counts for each of their guards: a count - of the number of usage attempts, and a count of the number of - successful usages. - - A usage attempt means any attempt to attach a stream to a circuit. - - Usage success status is temporarily recorded by state flags on circuits. - Guard usage success counts are not incremented until circuit close. 
A - circuit is marked as successfully used if we receive a properly - recognized RELAY cell on that circuit that was expected for the current - circuit purpose. - - If subsequent stream attachments fail or time out, the successfully used - state of the circuit is cleared, causing it once again to be regarded - as a usage attempt only. - - Upon close by the client, all circuits that are still marked as usage - attempts are probed using a RELAY_BEGIN cell constructed with a - destination of the form 0.a.b.c:25, where a.b.c is a 24 bit random - nonce. If we get a RELAY_COMMAND_END in response matching our nonce, - the circuit is counted as successfully used. - - If any unrecognized RELAY cells arrive after the probe has been sent, - the circuit is counted as a usage failure. - - If the stream failure reason codes DESTROY, TORPROTOCOL, or INTERNAL - are received in response to any stream attempt, such circuits are not - probed and are declared usage failures. - - Prematurely closed circuits are not probed, and are counted as usage - failures. - -7.3. Scaling success counts - - To provide a moving average of recent Guard activity while - still preserving the ability to verify correctness, we periodically - "scale" the success counts by multiplying them by a scale factor - between 0 and 1.0. - - Scaling is performed when either usage or construction attempt counts - exceed a parametrized value. - - To avoid error due to scaling during circuit construction and use, - currently open circuits are subtracted from the usage counts before - scaling, and added back after scaling. - -7.4. Parametrization - - The following consensus parameters tune various aspects of the - defense. - - pb_mincircs - Default: 150 - Min: 5 - Effect: This is the minimum number of circuits that must complete - at least 2 hops before we begin evaluating construction rates. 
- - - pb_noticepct - Default: 70 - Min: 0 - Max: 100 - Effect: If the circuit success rate falls below this percentage, - we emit a notice log message. - - pb_warnpct - Default: 50 - Min: 0 - Max: 100 - Effect: If the circuit success rate falls below this percentage, - we emit a warn log message. - - pb_extremepct - Default: 30 - Min: 0 - Max: 100 - Effect: If the circuit success rate falls below this percentage, - we emit a more alarmist warning log message. If - pb_dropguards is set to 1, we also disable the use of the - guard. - - pb_dropguards - Default: 0 - Min: 0 - Max: 1 - Effect: If the circuit success rate falls below pb_extremepct, - when pb_dropguards is set to 1, we disable use of that - guard. - - pb_scalecircs - Default: 300 - Min: 10 - Effect: After this many circuits have completed at least two hops, - Tor performs the scaling described in Section 7.3. - - pb_multfactor and pb_scalefactor - Default: 1/2 - Min: 0.0 - Max: 1.0 - Effect: The double-precision result obtained from - pb_multfactor/pb_scalefactor is multiplied by our current - counts to scale them. - - pb_minuse - Default: 20 - Min: 3 - Effect: This is the minimum number of circuits that we must attempt to - use before we begin evaluating usage rates. - - pb_noticeusepct - Default: 80 - Min: 3 - Effect: If the circuit usage success rate falls below this percentage, - we emit a notice log message. - - pb_extremeusepct - Default: 60 - Min: 3 - Effect: If the circuit usage success rate falls below this percentage, - we emit a warning log message. We also disable the use of the - guard if pb_dropguards is set to 1. - - pb_scaleuse - Default: 100 - Min: 10 - Effect: After we have attempted to use this many circuits, - Tor performs the scaling described in Section 7.3. - -7.5. Known barriers to enforcement - - Due to intermittent CPU overload at relays, the normal rate of - successful circuit completion is highly variable.
The Guard-dropping - version of the defense is unlikely to be deployed until the ntor - circuit handshake is enabled, or the nature of CPU overload induced - failure is better understood. - - - -X. Old notes - -X.1. Do we actually do this? - -How to deal with network down. - - While all helpers are down/unreachable and there are no established - or on-the-way testing circuits, launch a testing circuit. (Do this - periodically in the same way we try to establish normal circuits - when things are working normally.) - (Testing circuits are a special type of circuit, that streams won't - attach to by accident.) - - When a testing circuit succeeds, mark all helpers up and hold - the testing circuit open. - - If a connection to a helper succeeds, close all testing circuits. - Else mark that helper down and try another. - - If the last helper is marked down and we already have a testing - circuit established, then add the first hop of that testing circuit - to the end of our helper node list, close that testing circuit, - and go back to square one. (Actually, rather than closing the - testing circuit, can we get away with converting it to a normal - circuit and beginning to use it immediately?) - - [Do we actually do any of the above? If so, let's spec it. If not, let's - remove it. -NM] - -X.2. A thing we could do to deal with reachability. - -And as a bonus, it leads to an answer to Nick's attack ("If I pick -my helper nodes all on 18.0.0.0:*, then I move, you'll know where I -bootstrapped") -- the answer is to pick your original three helper nodes -without regard for reachability. Then the above algorithm will add some -more that are reachable for you, and if you move somewhere, it's more -likely (though not certain) that some of the originals will become useful. -Is that smart or just complex? - -X.3. Some stuff that worries me about entry guards. 2006 Jun, Nickm. - - It is unlikely for two users to have the same set of entry guards. 
- Observing a user is sufficient to learn its entry guards. So, as we move - around, entry guards make us linkable. If we want to change guards when - our location (IP? subnet?) changes, we have two bad options. We could - - - Drop the old guards. But if we go back to our old location, - we'll not use our old guards. For a laptop that sometimes gets used - from work and sometimes from home, this is pretty fatal. - - Remember the old guards as associated with the old location, and use - them again if we ever go back to the old location. This would be - nasty, since it would force us to record where we've been. - - [Do we do any of this now? If not, this should move into 099-misc or - 098-todo. -NM] diff --git a/pt-spec.txt b/pt-spec.txt deleted file mode 100644 index 45b4c31..0000000 --- a/pt-spec.txt +++ /dev/null @@ -1,828 +0,0 @@ - Pluggable Transport Specification (Version 1) - -Abstract - - Pluggable Transports (PTs) are a generic mechanism for the rapid - development and deployment of censorship circumvention, - based around the idea of modular sub-processes that transform - traffic to defeat censors. - - This document specifies the sub-process startup, shutdown, - and inter-process communication mechanisms required to utilize - PTs. - -Table of Contents - - 1. Introduction - 1.1. Requirements Notation - 2. Architecture Overview - 3. Specification - 3.1. Pluggable Transport Naming - 3.2. Pluggable Transport Configuration Environment Variables - 3.2.1. Common Environment Variables - 3.2.2. Pluggable Transport Client Environment Variables - 3.2.3. Pluggable Transport Server Environment Variables - 3.3. Pluggable Transport To Parent Process Communication - 3.3.1. Common Messages - 3.3.2. Pluggable Transport Client Messages - 3.3.3. Pluggable Transport Server Messages - 3.4. Pluggable Transport Shutdown - 3.5. Pluggable Transport Client Per-Connection Arguments - 4. Anonymity Considerations - 5 References - 6. Acknowledgments - Appendix A. 
Example Client Pluggable Transport Session - Appendix B. Example Server Pluggable Transport Session - -1. Introduction - - This specification describes a way to decouple protocol-level - obfuscation from an application's client/server code, in a manner - that promotes rapid development of obfuscation/circumvention - tools and promotes reuse beyond the scope of the Tor Project's - efforts in that area. - - This is accomplished by utilizing helper sub-processes that - implement the necessary forward/reverse proxy servers that handle - the censorship circumvention, with a well defined and - standardized configuration and management interface. - - Any application code that implements the interfaces as specified - in this document will be able to use all spec compliant Pluggable - Transports. - -1.1. Requirements Notation - - The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL - NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and - "OPTIONAL" in this document are to be interpreted as described in - [RFC2119]. - -2. Architecture Overview - - +------------+ +---------------------------+ - | Client App +-- Local Loopback --+ PT Client (SOCKS Proxy) +--+ - +------------+ +---------------------------+ | - | - Public Internet (Obfuscated/Transformed traffic) ==> | - | - +------------+ +---------------------------+ | - | Server App +-- Local Loopback --+ PT Server (Reverse Proxy) +--+ - +------------+ +---------------------------+ - - On the client's host, the PT Client software exposes a SOCKS proxy - [RFC1928] to the client application, and obfuscates or otherwise - transforms traffic before forwarding it to the server's host. - - On the server's host, the PT Server software exposes a reverse proxy - that accepts connections from PT Clients, and handles reversing the - obfuscation/transformation applied to traffic, before forwarding it - to the actual server software. 
An optional lightweight protocol - exists to facilitate communicating connection meta-data that would - otherwise be lost, such as the source IP address and port - [EXTORPORT]. - - All PT instances are configured by the respective parent process via - a set of standardized environment variables (3.2) that are set at - launch time, and report status information back to the parent via - writing output in a standardized format to stdout (3.3). - - Each invocation of a PT MUST be either a client OR a server. - - All PT client forward proxies MUST support either SOCKS 4 or SOCKS 5, - and SHOULD prefer SOCKS 5 over SOCKS 4. - -3. Specification - - Pluggable Transport proxies follow this workflow - throughout their lifespan. - - 1) Parent process sets the required environment values (3.2) - and launches the PT proxy as a sub-process (fork()/exec()). - - 2) The PT Proxy determines the versions of the PT specification - supported by the parent via "TOR_PT_MANAGED_TRANSPORT_VER" (3.2.1). - - 2.1) If there are no compatible versions, the PT proxy - writes a "VERSION-ERROR" message (3.3.1) to stdout and - terminates. - - 2.2) If there is a compatible version, the PT proxy writes - a "VERSION" message (3.3.1) to stdout. - - 3) The PT Proxy parses the rest of the environment values. - - 3.1) If the environment values are malformed, or otherwise - invalid, the PT proxy writes an "ENV-ERROR" message - (3.3.1) to stdout and terminates. - - 3.2) Determining if it is a client side forward proxy or - a server side reverse proxy can be done via examining - the "TOR_PT_CLIENT_TRANSPORTS" and "TOR_PT_SERVER_TRANSPORTS" - environment variables. - - 4) (Client only) If there is an upstream proxy specified via - "TOR_PT_PROXY" (3.2.2), the PT proxy validates the URI - provided. - - 4.1) If the upstream proxy is unusable, the PT proxy writes - a "PROXY-ERROR" message (3.3.2) to stdout and - terminates.
- - 4.2) If there is a supported and well-formed upstream proxy, - the PT proxy writes a "PROXY DONE" message (3.3.2) to - stdout. - - 5) The PT Proxy initializes the transports and reports the - status via stdout (3.3.2, 3.3.3). - - 6) The PT Proxy forwards and transforms traffic as appropriate. - - 7) Upon being signaled to terminate by the parent process (3.4), - the PT Proxy gracefully shuts down. - -3.1. Pluggable Transport Naming - - Pluggable Transport names serve as unique identifiers, and every - PT MUST have a unique name. - - PT names MUST be valid C identifiers. PT names MUST begin with - a letter or underscore, and the remaining characters MUST be - ASCII letters, numbers or underscores. No length limit is - imposed. - - PT names MUST satisfy the regular expression "[a-zA-Z_][a-zA-Z0-9_]*". - -3.2. Pluggable Transport Configuration Environment Variables - - All Pluggable Transport proxy instances are configured by their - parent process at launch time via a set of well defined - environment variables. - - The "TOR_PT_" prefix is used for namespacing reasons and does not - indicate any relation to Tor, except for the origins of this - specification. - -3.2.1. Common Environment Variables - - When launching either a client or server Pluggable Transport proxy, - the following common environment variables MUST be set. - - "TOR_PT_MANAGED_TRANSPORT_VER" - - Specifies the versions of the Pluggable Transport specification - the parent process supports, delimited by commas. All PTs MUST - accept any well-formed list, as long as a compatible version is - present. - - Valid versions MUST consist entirely of non-whitespace, - non-comma printable ASCII characters. - - The version of the Pluggable Transport specification as of this - document is "1".
- - Example: - - TOR_PT_MANAGED_TRANSPORT_VER=1,1a,2b,this_is_a_valid_ver - - "TOR_PT_STATE_LOCATION" - - Specifies an absolute path to a directory where the PT is - allowed to store state that will be persisted across - invocations. The directory is not required to exist when - the PT is launched, however PT implementations SHOULD be - able to create it as required. - - PTs MUST only store files in the path provided, and MUST NOT - create or modify files elsewhere on the system. - - Example: - - TOR_PT_STATE_LOCATION=/var/lib/tor/pt_state/ - - "TOR_PT_EXIT_ON_STDIN_CLOSE" - - Specifies that the parent process will close the PT proxy's - standard input (stdin) stream to indicate that the PT proxy - should gracefully exit. - - PTs MUST NOT treat a closed stdin as a signal to terminate - unless this environment variable is set to "1". - - PTs SHOULD treat stdin being closed as a signal to gracefully - terminate if this environment variable is set to "1". - - Example: - - TOR_PT_EXIT_ON_STDIN_CLOSE=1 - - "TOR_PT_OUTBOUND_BIND_ADDRESS_V4" - - Specifies an IPv4 IP address that the PT proxy SHOULD use as source address for - outgoing IPv4 IP packets. This feature allows people with multiple network - interfaces to specify explicitly which interface they prefer the PT proxy to - use. - - If this value is unset or empty, the PT proxy MUST use the default source - address for outgoing connections. - - This setting MUST be ignored for connections to - loopback addresses (127.0.0.0/8). - - Example: - - TOR_PT_OUTBOUND_BIND_ADDRESS_V4=203.0.113.4 - - "TOR_PT_OUTBOUND_BIND_ADDRESS_V6" - - Specifies an IPv6 IP address that the PT proxy SHOULD use as source address for - outgoing IPv6 IP packets. This feature allows people with multiple network - interfaces to specify explicitly which interface they prefer the PT proxy to - use. - - If this value is unset or empty, the PT proxy MUST use the default source - address for outgoing connections. 
- - This setting MUST be ignored for connections to the loopback address ([::1]). - - IPv6 addresses MUST always be wrapped in square brackets. - - Example: - - TOR_PT_OUTBOUND_BIND_ADDRESS_V6=[2001:db8::4] - -3.2.2. Pluggable Transport Client Environment Variables - - Client-side Pluggable Transport forward proxies are configured - via the following environment variables. - - "TOR_PT_CLIENT_TRANSPORTS" - - Specifies the PT protocols the client proxy should initialize, - as a comma separated list of PT names. - - PTs SHOULD ignore PT names that they do not recognize. - - Parent processes MUST set this environment variable when - launching a client-side PT proxy instance. - - Example: - - TOR_PT_CLIENT_TRANSPORTS=obfs2,obfs3,obfs4 - - "TOR_PT_PROXY" - - Specifies an upstream proxy that the PT MUST use when making - outgoing network connections. It is a URI [RFC3986] of the - format: - - <proxy_type>://[<user_name>[:<password>]@]<ip>:<port>. - - The "TOR_PT_PROXY" environment variable is OPTIONAL and - MUST be omitted if there is no need to connect via an - upstream proxy. - - Examples: - - TOR_PT_PROXY=socks5://tor:test1234@198.51.100.1:8000 - TOR_PT_PROXY=socks4a://198.51.100.2:8001 - TOR_PT_PROXY=http://198.51.100.3:443 - -3.2.3. Pluggable Transport Server Environment Variables - - Server-side Pluggable Transport reverse proxies are configured - via the following environment variables. - - "TOR_PT_SERVER_TRANSPORTS" - - Specifies the PT protocols the server proxy should initialize, - as a comma separated list of PT names. - - PTs SHOULD ignore PT names that they do not recognize. - - Parent processes MUST set this environment variable when - launching a server-side PT reverse proxy instance. - - Example: - - TOR_PT_SERVER_TRANSPORTS=obfs3,scramblesuit - - "TOR_PT_SERVER_TRANSPORT_OPTIONS" - - Specifies per-PT protocol configuration directives, as a - semicolon-separated list of <key>:<value> pairs, where <key> - is a PT name and <value> is a k=v string value with options - that are to be passed to the transport.
- - Colons, semicolons, and backslashes MUST be - escaped with a backslash. - - If there are no arguments that need to be passed to any of - the PT transport protocols, "TOR_PT_SERVER_TRANSPORT_OPTIONS" - MAY be omitted. - - Example: - - TOR_PT_SERVER_TRANSPORT_OPTIONS=scramblesuit:key=banana;automata:rule=110;automata:depth=3 - - Will pass to 'scramblesuit' the parameter 'key=banana' and to - 'automata' the arguments 'rule=110' and 'depth=3'. - - "TOR_PT_SERVER_BINDADDR" - - A comma separated list of <key>-<value> pairs, where <key> is - a PT name and <value> is the <address>:<port> on which it - should listen for incoming client connections. - - The keys holding transport names MUST be in the same order as - they appear in "TOR_PT_SERVER_TRANSPORTS". - - The <address> MAY be a locally scoped address as long as port - forwarding is done externally. - - The <address>:<port> combination MUST be an IP address - supported by `bind()`, and MUST NOT be a host name. - - Applications MUST NOT set more than one <address>:<port> pair - per PT name. - - If there is no specific <address>:<port> combination to be - configured for any transports, "TOR_PT_SERVER_BINDADDR" MAY - be omitted. - - Example: - - TOR_PT_SERVER_BINDADDR=obfs3-198.51.100.1:1984,scramblesuit-127.0.0.1:4891 - - "TOR_PT_ORPORT" - - Specifies the destination that the PT reverse proxy should forward - traffic to after transforming it as appropriate, as an - <address>:<port>. - - Connections to the destination specified via "TOR_PT_ORPORT" - MUST only contain application payload. If the parent process - requires the actual source IP address of client connections - (or other metadata), it should set "TOR_PT_EXTENDED_SERVER_PORT" - instead. - - Example: - - TOR_PT_ORPORT=127.0.0.1:9001 - - "TOR_PT_EXTENDED_SERVER_PORT" - - Specifies the destination that the PT reverse proxy should - forward traffic to, via the Extended ORPort protocol [EXTORPORT] - as an
<address>:<port>. - - The Extended ORPort protocol allows the PT reverse proxy to - communicate per-connection metadata such as the PT name and - client IP address/port to the parent process. - - If the parent process does not support the ExtORPort protocol, - it MUST set "TOR_PT_EXTENDED_SERVER_PORT" to an empty string. - - Example: - - TOR_PT_EXTENDED_SERVER_PORT=127.0.0.1:4200 - - "TOR_PT_AUTH_COOKIE_FILE" - - Specifies an absolute filesystem path to the Extended ORPort - authentication cookie, required to communicate with the - Extended ORPort specified via "TOR_PT_EXTENDED_SERVER_PORT". - - If the parent process is not using the ExtORPort protocol for - incoming traffic, "TOR_PT_AUTH_COOKIE_FILE" MUST be omitted. - - Example: - - TOR_PT_AUTH_COOKIE_FILE=/var/lib/tor/extended_orport_auth_cookie - -3.3. Pluggable Transport To Parent Process Communication - - All Pluggable Transport Proxies communicate to the parent process - via writing NL-terminated lines to stdout. The line metaformat is: - - <Line> ::= <Keyword> <OptArgs> <NL> - <Keyword> ::= <KeywordChar> | <Keyword> <KeywordChar> - <KeywordChar> ::= <any US-ASCII alphanumeric, dash, and underscore> - <OptArgs> ::= <Args>* - <Args> ::= <SP> <ArgChar> | <Args> <ArgChar> - <ArgChar> ::= <any US-ASCII character but NUL or NL> - <SP> ::= <US-ASCII whitespace symbol (32)> - <NL> ::= <US-ASCII newline (line feed) character (10)> - - The parent process MUST ignore lines received from PT proxies with - unknown keywords. - -3.3.1. Common Messages - - When a PT proxy first starts up, it must determine which version - of the Pluggable Transports Specification to use to configure - itself. - - It does this via the "TOR_PT_MANAGED_TRANSPORT_VER" (3.2.1) - environment variable, which contains all of the versions supported - by the application. - - Upon determining the version to use, or lack thereof, the PT - proxy responds with one of two messages. - - VERSION-ERROR <ErrorMessage> - - The "VERSION-ERROR" message is used to signal that there was - no compatible Pluggable Transport Specification version - present in the "TOR_PT_MANAGED_TRANSPORT_VER" list. - - The <ErrorMessage> SHOULD be set to "no-version" for - historical reasons but MAY be set to a useful error message - instead. - - PT proxies MUST terminate after outputting a "VERSION-ERROR" - message.
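The version negotiation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name negotiate_version and the SUPPORTED_VERSIONS tuple are hypothetical, and a real PT would write the returned line to stdout and terminate after a "VERSION-ERROR".

```python
# Versions this hypothetical PT implements; this document defines "1".
SUPPORTED_VERSIONS = ("1",)


def negotiate_version(environ):
    """Return the stdout line a PT would emit after version negotiation.

    `environ` is any mapping of environment variables (e.g. os.environ).
    """
    raw = environ.get("TOR_PT_MANAGED_TRANSPORT_VER", "")
    # The parent offers a comma-delimited list of version strings.
    offered = [version for version in raw.split(",") if version]
    for version in SUPPORTED_VERSIONS:
        if version in offered:
            return "VERSION " + version
    # "no-version" is the historical error text named by the spec.
    return "VERSION-ERROR no-version"
```

A caller would typically print the result with flush=True, since the parent process reads the PT's stdout line by line.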
- - Example: - - VERSION-ERROR no-version - - VERSION <ProtocolVersion> - - The "VERSION" message is used to signal the Pluggable Transport - Specification version (as in "TOR_PT_MANAGED_TRANSPORT_VER") - that the PT proxy will use to configure its transports and - communicate with the parent process. - - The version for the environment values and reply messages - specified by this document is "1". - - PT proxies MUST either report an error and terminate, or output - a "VERSION" message before moving on to client/server proxy - initialization and configuration. - - Example: - - VERSION 1 - - After version negotiation has been completed the PT proxy must - then validate that all of the required environment variables are - provided, and that all of the configuration values supplied are - well formed. - - At any point, if there is an error encountered related to - configuration supplied via the environment variables, the PT proxy MAY - respond with an error message and terminate. - - ENV-ERROR <ErrorMessage> - - The "ENV-ERROR" message is used to signal the PT proxy's - failure to parse the configuration environment variables (3.2). - - The <ErrorMessage> SHOULD consist of a useful error message - that can be used to diagnose and correct the root cause of - the failure. - - PT proxies MUST terminate after outputting an "ENV-ERROR" - message. - - Example: - - ENV-ERROR No TOR_PT_AUTH_COOKIE_FILE when TOR_PT_EXTENDED_SERVER_PORT set - -3.3.2. Pluggable Transport Client Messages - - After negotiating the Pluggable Transport Specification version, - PT client proxies MUST first validate "TOR_PT_PROXY" (3.2.2) if - it is set, before initializing any transports. - - Assuming that an upstream proxy is provided, PT client proxies - MUST respond with a message indicating that the proxy is valid, - supported, and will be used OR a failure message. - - PROXY DONE - - The "PROXY DONE" message is used to signal the PT proxy's - acceptance of the upstream proxy specified by "TOR_PT_PROXY".
- - PROXY-ERROR <ErrorMessage> - - The "PROXY-ERROR" message is used to signal that the upstream - proxy is malformed/unsupported or otherwise unusable. - - PT proxies MUST terminate immediately after outputting a - "PROXY-ERROR" message. - - Example: - - PROXY-ERROR SOCKS 4 upstream proxies unsupported. - - After the upstream proxy (if any) is configured, PT clients then - iterate over the requested transports in "TOR_PT_CLIENT_TRANSPORTS" - and initialize the listeners. - - For each transport initialized, the PT proxy reports the listener - status back to the parent via messages to stdout. - - CMETHOD <transport> <'socks4','socks5'> <address:port> - - The "CMETHOD" message is used to signal that a requested - PT transport has been launched, the protocol which the parent - should use to make outgoing connections, and the IP address - and port that the PT transport's forward proxy is listening on. - - Example: - - CMETHOD trebuchet socks5 127.0.0.1:19999 - - CMETHOD-ERROR <transport> <ErrorMessage> - - The "CMETHOD-ERROR" message is used to signal that a - requested PT transport could not be launched. - - Example: - - CMETHOD-ERROR trebuchet no rocks available - - Once all PT transports have been initialized (or have failed), the - PT proxy MUST send a final message indicating that it has finished - initializing. - - CMETHODS DONE - - The "CMETHODS DONE" message signals that the PT proxy has - finished initializing all of the transports that it is capable - of handling. - - Upon sending the "CMETHODS DONE" message, the PT proxy - initialization is complete. - - Notes: - - - Unknown transports in "TOR_PT_CLIENT_TRANSPORTS" are ignored - entirely, and MUST NOT result in a "CMETHOD-ERROR" message. - Thus it is entirely possible for a given PT proxy to - immediately output "CMETHODS DONE". - - - Parent processes MUST handle "CMETHOD"/"CMETHOD-ERROR" - messages in any order, regardless of ordering in - "TOR_PT_CLIENT_TRANSPORTS". - -3.3.3.
Pluggable Transport Server Messages - - PT server reverse proxies iterate over the requested transports - in "TOR_PT_SERVER_TRANSPORTS" and initialize the listeners. - - For each transport initialized, the PT proxy reports the listener - status back to the parent via messages to stdout. - - SMETHOD <transport> <address:port> [options] - - The "SMETHOD" message is used to signal that a requested - PT transport has been launched, the protocol which will be - used to handle incoming connections, and the IP address and - port that clients should use to reach the reverse-proxy. - - If there is a specific <address:port> provided for a given - PT transport via "TOR_PT_SERVER_BINDADDR", the transport - MUST be initialized using that as the server address. - - The OPTIONAL 'options' field is used to pass additional - per-transport information back to the parent process. - - The currently recognized 'options' are: - - ARGS:[<Key>=<Value>,]+[<Key>=<Value>] - - The "ARGS" option is used to pass additional key/value - formatted information that clients will require to use - the reverse proxy. - - Equal signs and commas MUST be escaped with a backslash. - - Tor: The ARGS are included in the transport line of the - Bridge's extra-info document. - - Examples: - - SMETHOD trebuchet 198.51.100.1:19999 - SMETHOD rot_by_N 198.51.100.1:2323 ARGS:N=13 - - SMETHOD-ERROR <transport> <ErrorMessage> - - The "SMETHOD-ERROR" message is used to signal that a - requested PT transport reverse proxy could not be - launched. - - Example: - - SMETHOD-ERROR trebuchet no cows available - - Once all PT transports have been initialized (or have failed), the - PT proxy MUST send a final message indicating that it has finished - initializing. - - SMETHODS DONE - - The "SMETHODS DONE" message signals that the PT proxy has - finished initializing all of the transports that it is capable - of handling. - - Upon sending the "SMETHODS DONE" message, the PT proxy - initialization is complete. - -3.3.4.
Pluggable Transport Log Message - - This message is for a client or server PT to be able to signal back to the - parent process via stdout or stderr any log messages. - - A log message can be any kind of message (human readable) that the PT - sends back so the parent process can gather information about what is going - on in the child process. It is not intended for the parent process to parse - and act accordingly but rather a message used for plain logging. - - For example, the tor daemon logs those messages at the Severity level and - sends them onto the control port using the PT_LOG (see control-spec.txt) - event so any third party can pick them up for debugging. - - The format of the message: - - LOG SEVERITY=Severity MESSAGE=Message - - The SEVERITY value indicates at which logging level the message applies. - The accepted values for <Severity> are: error, warning, notice, info, debug - - The MESSAGE value is a human readable string formatted by the PT. The - <Message> contains the log message which can be a String or CString (see - section 2 in control-spec.txt). - - Example: - - LOG SEVERITY=debug MESSAGE="Connected to bridge A" - -3.3.5. Pluggable Transport Status Message - - This message is for a client or server PT to be able to signal back to the - parent process via stdout or stderr any status messages. - - The format of the message: - - STATUS TRANSPORT=Transport <K_1>=<V_1> [<K_2>=<V_2> ...] - - The TRANSPORT value gives a hint about what the PT is, such as its name or - the protocol used. As an example, obfs4proxy would use - "obfs4". Thus, the Transport value can be anything the PT itself defines - and it can be a String or CString (see section 2 in control-spec.txt). - - The <K_n>=<V_n> values are specific to the PT and there has to be at least - one. They are messages that reflect the status that the PT wants to - report. <V_n> can be a String or CString.
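The LOG and STATUS formats described above can be split into a keyword and a set of key/value pairs along the following lines. This is a minimal sketch and not part of the specification: the function name parse_pt_line is hypothetical, and the sketch leans on Python's shlex quoting rules, which only approximate the String/CString syntax of control-spec.txt (octal escapes, for instance, are not decoded).

```python
import shlex

def parse_pt_line(line):
    """Split a PT "LOG" or "STATUS" line into (keyword, {key: value}).

    Hypothetical helper for illustration. shlex handles double-quoted
    values and backslash-escaped quotes, but it is only an approximation
    of CString parsing.
    """
    tokens = shlex.split(line)
    if not tokens:
        raise ValueError("empty line")
    keyword, fields = tokens[0], {}
    for token in tokens[1:]:
        key, sep, value = token.partition("=")
        if not sep or not key:
            raise ValueError("expected KEY=VALUE, got %r" % token)
        fields[key] = value
    return keyword, fields

keyword, fields = parse_pt_line(
    'STATUS TRANSPORT=obfs4 ADDRESS=198.51.100.222:2222 '
    'CONNECT=Failed ERRSTR="Connection refused"')
print(keyword, fields["TRANSPORT"], fields["ERRSTR"])
```

Since the parent is only expected to log these messages, a parser like this would mainly be useful for forwarding the fields to a control-port style event rather than for acting on them.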
- - Examples (fictional): - - STATUS TRANSPORT=obfs4 ADDRESS=198.51.100.123:1234 CONNECT=Success - STATUS TRANSPORT=obfs4 ADDRESS=198.51.100.222:2222 CONNECT=Failed FINGERPRINT=<Fingerprint> ERRSTR="Connection refused" - STATUS TRANSPORT=trebuchet ADDRESS=198.51.100.15:443 PERCENT=42 - -3.4. Pluggable Transport Shutdown - - The recommended way for Pluggable Transport using applications and - Pluggable Transports to handle graceful shutdown is as follows. - - - (Parent) Set "TOR_PT_EXIT_ON_STDIN_CLOSE" (3.2.1) when - launching the PT proxy, to indicate that stdin will be used - for graceful shutdown notification. - - - (Parent) When the time comes to terminate the PT proxy: - - 1. Close the PT proxy's stdin. - 2. Wait for a "reasonable" amount of time for the PT to exit. - 3. Attempt to use OS specific mechanisms to cause graceful - PT shutdown (eg: 'SIGTERM') - 4. Use OS specific mechanisms to force terminate the PT - (eg: 'SIGKILL', 'TerminateProcess()'). - - - PT proxies SHOULD monitor stdin, and exit gracefully when - it is closed, if the parent supports that behavior. - - - PT proxies SHOULD handle OS specific mechanisms to gracefully - terminate (eg: Install a signal handler on 'SIGTERM' that - causes cleanup and a graceful shutdown if able). - - - PT proxies SHOULD attempt to detect when the parent has - terminated (eg: via detecting that its parent process ID has - changed on U*IX systems), and gracefully terminate. - -3.5. Pluggable Transport Client Per-Connection Arguments - - Certain PT transport protocols require that the client provides - per-connection arguments when making outgoing connections. On - the server side, this is handled by the "ARGS" optional argument - as part of the "SMETHOD" message. - - On the client side, arguments are passed via the authentication - fields that are part of the SOCKS protocol. - - First the "<Key>=<Value>" formatted arguments MUST be escaped, - such that all backslash, equal sign, and semicolon characters - are escaped with a backslash.
- - Second, all of the escaped arguments are concatenated together, - separated by semicolons. - - Example: - - shared-secret=rahasia;secrets-file=/tmp/blob - - Lastly the arguments are transmitted when making the outgoing - connection using the authentication mechanism specific to the - SOCKS protocol version. - - - In the case of SOCKS 4, the concatenated argument list is - transmitted in the "USERID" field of the "CONNECT" request. - - - In the case of SOCKS 5, the parent process must negotiate - "Username/Password" authentication [RFC1929], and transmit - the arguments encoded in the "UNAME" and "PASSWD" fields. - - If the encoded argument list is less than 255 bytes in - length, the "PLEN" field must be set to "1" and the "PASSWD" - field must contain a single NUL character. - -4. Anonymity Considerations - - When designing and implementing a Pluggable Transport, care - should be taken to preserve the privacy of clients and to avoid - leaking personally identifying information. - - Examples of client related considerations are: - - - Not logging client IP addresses to disk. - - - Not leaking DNS addresses except when necessary. - - - Ensuring that "TOR_PT_PROXY"'s "fail closed" behavior is - implemented correctly. - - Additionally, certain obfuscation mechanisms rely on information - such as the server IP address/port being confidential, so clients - also need to take care to keep server side information - confidential when applicable. - -5. References - - [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate - Requirement Levels", BCP 14, RFC 2119, March 1997. - - [RFC1928] Leech, M., Ganis, M., Lee, Y., Kuris, R., - Koblas, D., Jones, L., "SOCKS Protocol Version 5", - RFC 1928, March 1996. - - [EXTORPORT] Kadianakis, G., Mathewson, N., "Extended ORPort and - TransportControlPort", Tor Proposal 196, March 2012. - - [RFC3986] Berners-Lee, T., Fielding, R., Masinter, L., "Uniform - Resource Identifier (URI): Generic Syntax", RFC 3986, - January 2005.
- - [RFC1929] Leech, M., "Username/Password Authentication for - SOCKS V5", RFC 1929, March 1996. - -6. Acknowledgments - - This specification draws heavily from prior versions done by Jacob - Appelbaum, Nick Mathewson, and George Kadianakis. - -Appendix A. Example Client Pluggable Transport Session - - Environment variables: - - TOR_PT_MANAGED_TRANSPORT_VER=1 - TOR_PT_STATE_LOCATION=/var/lib/tor/pt_state/ - TOR_PT_EXIT_ON_STDIN_CLOSE=1 - TOR_PT_PROXY=socks5://127.0.0.1:8001 - TOR_PT_CLIENT_TRANSPORTS=obfs3,obfs4 - - Messages the PT Proxy writes to stdout: - - VERSION 1 - PROXY DONE - CMETHOD obfs3 socks5 127.0.0.1:32525 - CMETHOD obfs4 socks5 127.0.0.1:37347 - CMETHODS DONE - -Appendix B. Example Server Pluggable Transport Session - - Environment variables: - - TOR_PT_MANAGED_TRANSPORT_VER=1 - TOR_PT_STATE_LOCATION=/var/lib/tor/pt_state - TOR_PT_EXIT_ON_STDIN_CLOSE=1 - TOR_PT_SERVER_TRANSPORTS=obfs3,obfs4 - TOR_PT_SERVER_BINDADDR=obfs3-198.51.100.1:1984 - - Messages the PT Proxy writes to stdout: - - VERSION 1 - SMETHOD obfs3 198.51.100.1:1984 - SMETHOD obfs4 198.51.100.1:43734 ARGS:cert=HszPy3vWfjsESCEOo9ZBkRv6zQ/1mGHzc8arF0y2SpwFr3WhsMu8rK0zyaoyERfbz3ddFw,iat-mode=0 - SMETHODS DONE diff --git a/rend-spec-v3.txt b/rend-spec-v3.txt deleted file mode 100644 index d836d23..0000000 --- a/rend-spec-v3.txt +++ /dev/null @@ -1,2869 +0,0 @@ - - Tor Rendezvous Specification - Version 3 - -This document specifies how the hidden service version 3 protocol works. This -text used to be proposal 224-rend-spec-ng.txt. - - -Table of contents: - - 0. Hidden services: overview and preliminaries. - 0.1. Improvements over previous versions. - 0.2. Notation and vocabulary - 0.3. Cryptographic building blocks - 0.4. Protocol building blocks [BUILDING-BLOCKS] - 0.5. Assigned relay cell types - 0.6. Acknowledgments - 1. Protocol overview - 1.1. View from 10,000 feet - 1.2. In more detail: naming hidden services [NAMING] - 1.3. In more detail: Access control [IMD:AC] - 1.4.
In more detail: Distributing hidden service descriptors. [IMD:DIST] - 1.5. In more detail: Scaling to multiple hosts - 1.6. In more detail: Backward compatibility with older hidden service protocols - 1.7. In more detail: Keeping crypto keys offline - 1.8. In more detail: Encryption Keys And Replay Resistance - 1.9. In more detail: A menagerie of keys - 1.9.1. In even more detail: Client authorization keys [CLIENT-AUTH] - 2. Generating and publishing hidden service descriptors [HSDIR] - 2.1. Deriving blinded keys and subcredentials [SUBCRED] - 2.2. Locating, uploading, and downloading hidden service descriptors - 2.2.1. Dividing time into periods [TIME-PERIODS] - 2.2.2. When to publish a hidden service descriptor [WHEN-HSDESC] - 2.2.3. Where to publish a hidden service descriptor [WHERE-HSDESC] - 2.2.4. Using time periods and SRVs to fetch/upload HS descriptors - 2.2.5. Expiring hidden service descriptors [EXPIRE-DESC] - 2.2.6. URLs for anonymous uploading and downloading - 2.3. Publishing shared random values [PUB-SHAREDRANDOM] - 2.3.1. Client behavior in the absence of shared random values - 2.3.2. Hidden services and changing shared random values - 2.4. Hidden service descriptors: outer wrapper [DESC-OUTER] - 2.5. Hidden service descriptors: encryption format [HS-DESC-ENC] - 2.5.1. First layer of encryption [HS-DESC-FIRST-LAYER] - 2.5.1.1. First layer encryption logic - 2.5.1.2. First layer plaintext format - 2.5.1.3. Client behavior - 2.5.1.4. Obfuscating the number of authorized clients - 2.5.2. Second layer of encryption [HS-DESC-SECOND-LAYER] - 2.5.2.1. Second layer encryption keys - 2.5.2.2. Second layer plaintext format - 2.5.3. Deriving hidden service descriptor encryption keys [HS-DESC-ENCRYPTION-KEYS] - 3. The introduction protocol [INTRO-PROTOCOL] - 3.1. Registering an introduction point [REG_INTRO_POINT] - 3.1.1. Extensible ESTABLISH_INTRO protocol. [EST_INTRO] - 3.1.1.1. Denial-of-Service Defense Extension. [EST_INTRO_DOS_EXT] - 3.1.2.
Registering an introduction point on a legacy Tor node [LEGACY_EST_INTRO] - 3.1.3. Acknowledging establishment of introduction point [INTRO_ESTABLISHED] - 3.2. Sending an INTRODUCE1 cell to the introduction point. [SEND_INTRO1] - 3.2.1. INTRODUCE1 cell format [FMT_INTRO1] - 3.2.2. INTRODUCE_ACK cell format. [INTRO_ACK] - 3.3. Processing an INTRODUCE2 cell at the hidden service. [PROCESS_INTRO2] - 3.3.1. Introduction handshake encryption requirements [INTRO-HANDSHAKE-REQS] - 3.3.2. Example encryption handshake: ntor with extra data [NTOR-WITH-EXTRA-DATA] - 3.4. Authentication during the introduction phase. [INTRO-AUTH] - 3.4.1. Ed25519-based authentication. - 4. The rendezvous protocol - 4.1. Establishing a rendezvous point [EST_REND_POINT] - 4.2. Joining to a rendezvous point [JOIN_REND] - 4.2.1. Key expansion - 4.3. Using legacy hosts as rendezvous points - 5. Encrypting data between client and host - 6. Encoding onion addresses [ONIONADDRESS] - 7. Open Questions: - --1. Draft notes - - This document describes a proposed design and specification for - hidden services in Tor version 0.2.5.x or later. It's a replacement - for the current rend-spec.txt, rewritten for clarity and for improved - design. - - Look for the string "TODO" below: it describes gaps or uncertainties - in the design. - - Change history: - - 2013-11-29: Proposal first numbered. Some TODO and XXX items remain. - - 2014-01-04: Clarify some unclear sections. - - 2014-01-21: Fix a typo. - - 2014-02-20: Move more things to the revised certificate format in the - new updated proposal 220. - - 2015-05-26: Fix two typos. - - -0. Hidden services: overview and preliminaries. - - Hidden services aim to provide responder anonymity for bidirectional - stream-based communication on the Tor network. Unlike regular Tor - connections, where the connection initiator receives anonymity but - the responder does not, hidden services attempt to provide - bidirectional anonymity. 
- - Participants: - - Operator -- A person running a hidden service - - Host, "Server" -- The Tor software run by the operator to provide - a hidden service. - - User -- A person contacting a hidden service. - - Client -- The Tor software running on the User's computer - - Hidden Service Directory (HSDir) -- A Tor node that hosts signed - statements from hidden service hosts so that users can make - contact with them. - - Introduction Point -- A Tor node that accepts connection requests - for hidden services and anonymously relays those requests to the - hidden service. - - Rendezvous Point -- A Tor node to which clients and servers - connect and which relays traffic between them. - -0.1. Improvements over previous versions. - - Here is a list of improvements of this proposal over the legacy hidden - services: - - a) Better crypto (replaced SHA1/DH/RSA1024 with SHA3/ed25519/curve25519) - b) Improved directory protocol leaking less to directory servers. - c) Improved directory protocol with smaller surface for targeted attacks. - d) Better onion address security against impersonation. - e) More extensible introduction/rendezvous protocol. - f) Offline keys for onion services - g) Advanced client authorization - -0.2. Notation and vocabulary - - Unless specified otherwise, all multi-octet integers are big-endian. - - We write sequences of bytes in two ways: - - 1. A sequence of two-digit hexadecimal values in square brackets, - as in [AB AD 1D EA]. - - 2. A string of characters enclosed in quotes, as in "Hello". The - characters in these strings are encoded in their ascii - representations; strings are NOT nul-terminated unless - explicitly described as NUL terminated. - - We use the words "byte" and "octet" interchangeably. - - We use the vertical bar | to denote concatenation. - - We use INT_N(val) to denote the network (big-endian) encoding of the - unsigned integer "val" in N bytes. For example, INT_4(1337) is [00 00 - 05 39]. 
Values are truncated like so: val % (2 ^ (N * 8)). For example, - INT_4(42) is 42 % 4294967296 (32 bit). - -0.3. Cryptographic building blocks - - This specification uses the following cryptographic building blocks: - - * A pseudorandom number generator backed by a strong entropy source. - The output of the PRNG should always be hashed before being posted on - the network to avoid leaking raw PRNG bytes to the network - (see [PRNG-REFS]). - - * A stream cipher STREAM(iv, k) where iv is a nonce of length - S_IV_LEN bytes and k is a key of length S_KEY_LEN bytes. - - * A public key signature system SIGN_KEYGEN()->seckey, pubkey; - SIGN_SIGN(seckey,msg)->sig; and SIGN_CHECK(pubkey, sig, msg) -> - { "OK", "BAD" }; where secret keys are of length SIGN_SECKEY_LEN - bytes, public keys are of length SIGN_PUBKEY_LEN bytes, and - signatures are of length SIGN_SIG_LEN bytes. - - This signature system must also support key blinding operations - as discussed in appendix [KEYBLIND] and in section [SUBCRED]: - SIGN_BLIND_SECKEY(seckey, blind)->seckey2 and - SIGN_BLIND_PUBKEY(pubkey, blind)->pubkey2 . - - * A public key agreement system "PK", providing - PK_KEYGEN()->seckey, pubkey; PK_VALID(pubkey) -> {"OK", "BAD"}; - and PK_HANDSHAKE(seckey, pubkey)->output; where secret keys are - of length PK_SECKEY_LEN bytes, public keys are of length - PK_PUBKEY_LEN bytes, and the handshake produces outputs of - length PK_OUTPUT_LEN bytes. - - * A cryptographic hash function H(d), which should be preimage and - collision resistant. It produces hashes of length HASH_LEN - bytes. - - * A cryptographic message authentication code MAC(key,msg) that - produces outputs of length MAC_LEN bytes. - - * A key derivation function KDF(message, n) that outputs n bytes. - - As a first pass, I suggest: - - * Instantiate STREAM with AES256-CTR. - - * Instantiate SIGN with Ed25519 and the blinding protocol in - [KEYBLIND]. - - * Instantiate PK with Curve25519. - - * Instantiate H with SHA3-256. 
- - * Instantiate KDF with SHAKE-256. - - * Instantiate MAC(key=k, message=m) with H(k_len | k | m), - where k_len is htonll(len(k)). - - When we need a particular MAC key length below, we choose - MAC_KEY_LEN=32 (256 bits). - - For legacy purposes, we specify compatibility with older versions of - the Tor introduction point and rendezvous point protocols. These used - RSA1024, DH1024, AES128, and SHA1, as discussed in - rend-spec.txt. - - As in [proposal 220], all signatures are generated not over strings - themselves, but over those strings prefixed with a distinguishing - value. - -0.4. Protocol building blocks [BUILDING-BLOCKS] - - In sections below, we need to transmit the locations and identities - of Tor nodes. We do so in the link identification format used by - EXTEND2 cells in the Tor protocol. - - NSPEC (Number of link specifiers) [1 byte] - NSPEC times: - LSTYPE (Link specifier type) [1 byte] - LSLEN (Link specifier length) [1 byte] - LSPEC (Link specifier) [LSLEN bytes] - - Link specifier types are as described in tor-spec.txt. Every set of - link specifiers SHOULD include at minimum specifiers of type [00] - (TLS-over-TCP, IPv4), [02] (legacy node identity) and [03] (ed25519 - identity key). Sets of link specifiers without these three types - SHOULD be rejected. - - As of 0.4.1.1-alpha, Tor includes both IPv4 and IPv6 link specifiers - in v3 onion service protocol link specifier lists. All available - addresses SHOULD be included as link specifiers, regardless of the - address that Tor actually used to connect/extend to the remote relay. - - We also incorporate Tor's circuit extension handshakes, as used in - the CREATE2 and CREATED2 cells described in tor-spec.txt. In these - handshakes, a client who knows a public key for a server sends a - message and receives a message from that server. 
Once the exchange is - done, the two parties have a shared set of forward-secure key - material, and the client knows that nobody else shares that key - material unless they control the secret key corresponding to the - server's public key. - -0.5. Assigned relay cell types - - These relay cell types are reserved for use in the hidden service - protocol. - - 32 -- RELAY_COMMAND_ESTABLISH_INTRO - - Sent from hidden service host to introduction point; - establishes introduction point. Discussed in - [REG_INTRO_POINT]. - - 33 -- RELAY_COMMAND_ESTABLISH_RENDEZVOUS - - Sent from client to rendezvous point; creates rendezvous - point. Discussed in [EST_REND_POINT]. - - 34 -- RELAY_COMMAND_INTRODUCE1 - - Sent from client to introduction point; requests - introduction. Discussed in [SEND_INTRO1] - - 35 -- RELAY_COMMAND_INTRODUCE2 - - Sent from introduction point to hidden service host; requests - introduction. Same format as INTRODUCE1. Discussed in - [FMT_INTRO1] and [PROCESS_INTRO2] - - 36 -- RELAY_COMMAND_RENDEZVOUS1 - - Sent from hidden service host to rendezvous point; - attempts to join host's circuit to - client's circuit. Discussed in [JOIN_REND] - - 37 -- RELAY_COMMAND_RENDEZVOUS2 - - Sent from rendezvous point to client; - reports join of host's circuit to - client's circuit. Discussed in [JOIN_REND] - - 38 -- RELAY_COMMAND_INTRO_ESTABLISHED - - Sent from introduction point to hidden service host; - reports status of attempt to establish introduction - point. Discussed in [INTRO_ESTABLISHED] - - 39 -- RELAY_COMMAND_RENDEZVOUS_ESTABLISHED - - Sent from rendezvous point to client; acknowledges - receipt of ESTABLISH_RENDEZVOUS cell. Discussed in - [EST_REND_POINT] - - 40 -- RELAY_COMMAND_INTRODUCE_ACK - - Sent from introduction point to client; acknowledges - receipt of INTRODUCE1 cell and reports success/failure. - Discussed in [INTRO_ACK] - -0.6. Acknowledgments - - This design includes ideas from many people, including - - Christopher Baines, - Daniel J. 
Bernstein, - Matthew Finkel, - Ian Goldberg, - George Kadianakis, - Aniket Kate, - Tanja Lange, - Robert Ransom, - Roger Dingledine, - Aaron Johnson, - Tim Wilson-Brown ("teor"), - special (John Brooks), - s7r - - It's based on Tor's original hidden service design by Roger - Dingledine, Nick Mathewson, and Paul Syverson, and on improvements to - that design over the years by people including - - Tobias Kamm, - Thomas Lauterbach, - Karsten Loesing, - Alessandro Preite Martinez, - Robert Ransom, - Ferdinand Rieger, - Christoph Weingarten, - Christian Wilms, - - We wouldn't be able to do any of this work without good attack - designs from researchers including - - Alex Biryukov, - Lasse Øverlier, - Ivan Pustogarov, - Paul Syverson, - Ralf-Philipp Weinmann, - - See [ATTACK-REFS] for their papers. - - Several of these ideas have come from conversations with - - Christian Grothoff, - Brian Warner, - Zooko Wilcox-O'Hearn, - - And if this document makes any sense at all, it's thanks to - editing help from - - Matthew Finkel, - George Kadianakis, - Peter Palfrader, - Tim Wilson-Brown ("teor"), - - - [XXX Acknowledge the huge bunch of people working on 8106.] - [XXX Acknowledge the huge bunch of people working on 8244.] - - - Please forgive me if I've missed you; please forgive me if I've - misunderstood your best ideas here too. - - -1. Protocol overview - - In this section, we outline the hidden service protocol. This section - omits some details in the name of simplicity; those are given more - fully below, when we specify the protocol in more detail. - -1.1. View from 10,000 feet - - A hidden service host prepares to offer a hidden service by choosing - several Tor nodes to serve as its introduction points. It builds - circuits to those nodes, and tells them to forward introduction - requests to it using those circuits. 
- - Once introduction points have been picked, the host builds a set of - documents called "hidden service descriptors" (or just "descriptors" - for short) and uploads them to a set of HSDir nodes. These documents - list the hidden service's current introduction points and describe - how to make contact with the hidden service. - - When a client wants to connect to a hidden service, it first chooses - a Tor node at random to be its "rendezvous point" and builds a - circuit to that rendezvous point. If the client does not have an - up-to-date descriptor for the service, it contacts an appropriate - HSDir and requests such a descriptor. - - The client then builds an anonymous circuit to one of the hidden - service's introduction points listed in its descriptor, and gives the - introduction point an introduction request to pass to the hidden - service. This introduction request includes the target rendezvous - point and the first part of a cryptographic handshake. - - Upon receiving the introduction request, the hidden service host - makes an anonymous circuit to the rendezvous point and completes the - cryptographic handshake. The rendezvous point connects the two - circuits, and the cryptographic handshake gives the two parties a - shared key and proves to the client that it is indeed talking to the - hidden service. - - Once the two circuits are joined, the client can send Tor RELAY cells - to the server. RELAY_BEGIN cells open streams to an external process - or processes configured by the server; RELAY_DATA cells are used to - communicate data on those streams, and so forth. - -1.2. In more detail: naming hidden services [NAMING] - - A hidden service's name is its long term master identity key. This is - encoded as a hostname by encoding the entire key in Base 32, including a - version byte and a checksum, and then appending the string ".onion" at the - end. The result is a 56-character domain name. 
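To illustrate why the resulting name is 56 characters long, the sketch below encodes a 32-byte public key as a hostname. It assumes the layout given in section [ONIONADDRESS]: the public key, followed by a 2-byte checksum taken from SHA3-256 over ".onion checksum" | pubkey | version, followed by the version byte 0x03. The function name onion_address and the all-zero key are made up for the example; the zero key is a placeholder, not a usable service key.

```python
import base64
import hashlib

def onion_address(pubkey, version=3):
    """Encode a 32-byte ed25519 public key as a v3 .onion hostname.

    Illustrative sketch; the byte layout is assumed from [ONIONADDRESS].
    """
    if len(pubkey) != 32:
        raise ValueError("public key must be 32 bytes")
    ver = bytes([version])
    checksum = hashlib.sha3_256(b".onion checksum" + pubkey + ver).digest()[:2]
    # 32 + 2 + 1 = 35 bytes = 280 bits = 56 base32 characters, no padding.
    label = base64.b32encode(pubkey + checksum + ver).decode("ascii").lower()
    return label + ".onion"

addr = onion_address(bytes(32))  # placeholder key, illustration only
print(len(addr) - len(".onion"))  # 56
```

Because the version byte comes last and 35 bytes divide evenly into 5-bit groups, every name produced this way ends in the same base32 character for a given version.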
- - (This is a change from older versions of the hidden service protocol, - where we used an 80-bit truncated SHA1 hash of a 1024 bit RSA key.) - - The names in this format are distinct from earlier names because of - their length. An older name might look like: - - unlikelynamefora.onion - yyhws9optuwiwsns.onion - - And a new name following this specification might look like: - - l5satjgud6gucryazcyvyvhuxhr74u6ygigiuyixe3a6ysis67ororad.onion - - Please see section [ONIONADDRESS] for the encoding specification. - -1.3. In more detail: Access control [IMD:AC] - - Access control for a hidden service is imposed at multiple points through - the process above. Furthermore, there is also the option to impose - additional client authorization access control using pre-shared secrets - exchanged out-of-band between the hidden service and its clients. - - The first stage of access control happens when downloading HS descriptors. - Specifically, in order to download a descriptor, clients must know which - blinded signing key was used to sign it. (See the next section for more info - on key blinding.) - - To learn the introduction points, clients must decrypt the body of the - hidden service descriptor. To do so, clients must know the _unblinded_ - public key of the service, which makes the descriptor unusable by entities - without that knowledge (e.g. HSDirs that don't know the onion address). - - Also, if optional client authorization is enabled, hidden service - descriptors are superencrypted using each authorized user's identity x25519 - key, to further ensure that unauthorized entities cannot decrypt it. - - In order to make the introduction point send a rendezvous request to the - service, the client needs to use the per-introduction-point authentication - key found in the hidden service descriptor. 
- - The final level of access control happens at the server itself, which may - decide to respond or not respond to the client's request depending on the - contents of the request. The protocol is extensible at this point: at a - minimum, the server requires that the client demonstrate knowledge of the - contents of the encrypted portion of the hidden service descriptor. If - optional client authorization is enabled, the service may additionally - require the client to prove knowledge of a pre-shared private key. - -1.4. In more detail: Distributing hidden service descriptors. [IMD:DIST] - - Periodically, hidden service descriptors become stored at different - locations to prevent a single directory or small set of directories - from becoming a good DoS target for removing a hidden service. - - For each period, the Tor directory authorities agree upon a - collaboratively generated random value. (See section 2.3 for a - description of how to incorporate this value into the voting - practice; generating the value is described in other proposals, - including [SHAREDRANDOM-REFS].) That value, combined with hidden service - directories' public identity keys, determines each HSDir's position - in the hash ring for descriptors made in that period. - - Each hidden service's descriptors are placed into the ring in - positions based on the key that was used to sign them. Note that - hidden service descriptors are not signed with the services' public - keys directly. Instead, we use a key-blinding system [KEYBLIND] to - create a new key-of-the-day for each hidden service. Any client that - knows the hidden service's public identity key can derive these blinded - signing keys for a given period. It should be impossible to derive - the blinded signing key lacking that knowledge. - - This is achieved using two nonces: - - * A "credential", derived from the public identity key KP_hs_id. - N_hs_cred. 
- - * A "subcredential", derived from the credential N_hs_cred - and information which varies with the current time period. - N_hs_subcred. - - The body of each descriptor is also encrypted with a key derived from - the public signing key. - - To avoid a "thundering herd" problem where every service generates - and uploads a new descriptor at the start of each period, each - descriptor comes online at a time during the period that depends on - its blinded signing key. The keys for the last period remain valid - until the new keys come online. - -1.5. In more detail: Scaling to multiple hosts - - This design is compatible with our current approaches for scaling hidden - services. Specifically, hidden service operators can use onionbalance to - achieve high availability between multiple nodes on the HSDir - layer. Furthermore, operators can use proposal 255 to load balance their - hidden services on the introduction layer. See [SCALING-REFS] for further - discussions on this topic and alternative designs. - -1.6. In more detail: Backward compatibility with older hidden service - protocols - - This design is incompatible with the clients, server, and hsdir node - protocols from older versions of the hidden service protocol as - described in rend-spec.txt. On the other hand, it is designed to - enable the use of older Tor nodes as rendezvous points and - introduction points. - -1.7. In more detail: Keeping crypto keys offline - - In this design, a hidden service's secret identity key may be - stored offline. It's used only to generate blinded signing keys, - which are used to sign descriptor signing keys. - - In order to operate a hidden service, the operator can generate in - advance a number of blinded signing keys and descriptor signing - keys (and their credentials; see [DESC-OUTER] and [HS-DESC-ENC] - below), and their corresponding descriptor encryption keys, and - export those to the hidden service hosts.
- - As a result, in the scenario where the Hidden Service gets - compromised, the adversary can only impersonate it for a limited - period of time (depending on how many signing keys were generated - in advance). - - It's important to not send the private part of the blinded signing - key to the Hidden Service since an attacker can derive from it the - secret master identity key. The secret blinded signing key should - only be used to create credentials for the descriptor signing keys. - - (NOTE: although the protocol allows them, offline keys are not - implemented as of 0.3.2.1-alpha.) - -1.8. In more detail: Encryption Keys And Replay Resistance - - To avoid replays of an introduction request by an introduction point, - a hidden service host must never accept the same request - twice. Earlier versions of the hidden service design used an - authenticated timestamp here, but including a view of the current - time can create a problematic fingerprint. (See proposal 222 for more - discussion.) - -1.9. In more detail: A menagerie of keys - - [In the text below, an "encryption keypair" is roughly "a keypair you - can do Diffie-Hellman with" and a "signing keypair" is roughly "a - keypair you can do ECDSA with."] - - Public/private keypairs defined in this document: - - Master (hidden service) identity key -- A master signing keypair - used as the identity for a hidden service. This key is long - term and not used on its own to sign anything; it is only used - to generate blinded signing keys as described in [KEYBLIND] - and [SUBCRED]. The public key is encoded in the ".onion" - address according to [NAMING]. - KP_hs_id, KS_hs_id. - - Blinded signing key -- A keypair derived from the identity key, - used to sign descriptor signing keys. It changes periodically for - each service. Clients who know a 'credential' consisting of the - service's public identity key and an optional secret can derive - the public blinded identity key for a service. 
This key is used - as an index in the DHT-like structure of the directory system - (see [SUBCRED]). - KP_hs_blind_id, KS_hs_blind_id. - - Descriptor signing key -- A key used to sign hidden service - descriptors. This is signed by blinded signing keys. Unlike - blinded signing keys and master identity keys, the secret part - of this key must be stored online by hidden service hosts. The - public part of this key is included in the unencrypted section - of HS descriptors (see [DESC-OUTER]). - KP_hs_desc_sign, KS_hs_desc_sign. - - Introduction point authentication key -- A short-term signing - keypair used to identify a hidden service's session at a given - introduction point. The service makes a fresh keypair for each - introduction point; these are used to sign the request that a - hidden service host makes when establishing an introduction - point, so that clients who know the public component of this key - can get their introduction requests sent to the right - service. No keypair is ever used with more than one introduction - point. (previously called a "service key" in rend-spec.txt) - KP_hs_ipt_sid, KS_hs_ipt_sid - ("hidden service introduction point session id"). - - Introduction point encryption key -- A short-term encryption - keypair used when establishing connections via an introduction - point. Plays a role analogous to Tor nodes' onion keys. The service - makes a fresh keypair for each introduction point. - KP_hss_ntor, KS_hss_ntor. - - Ephemeral descriptor encryption key -- A short-lived encryption - keypair made by the service, and used to encrypt the inner layer - of hidden service descriptors when client authentication is in - use. - KP_hss_desc_enc, KS_hss_desc_enc - - Nonces defined in this document: - - N_hs_desc_enc -- a nonce used to derive keys to decrypt the inner - encryption layer of hidden service descriptors. This is - sometimes also called a "descriptor cookie". 
- - Public/private keypairs defined elsewhere: - - Onion key -- Short-term encryption keypair (KS_ntor, KP_ntor). - - (Node) identity key (KP_relayid). - - Symmetric key-like things defined elsewhere: - - KH from circuit handshake -- An unpredictable value derived as - part of the Tor circuit extension handshake, used to tie a request - to a particular circuit. - -1.9.1. In even more detail: Client authorization keys [CLIENT-AUTH] - - When client authorization is enabled, each authorized client of a hidden - service has two more asymmetric keypairs which are shared with the hidden - service. An entity without those keys is not able to use the hidden - service. Throughout this document, we assume that these pre-shared keys are - exchanged between the hidden service and its clients in a secure out-of-band - fashion. - - Specifically, each authorized client possesses: - - - An x25519 keypair used to compute decryption keys that allow the client to - decrypt the hidden service descriptor. See [HS-DESC-ENC]. This is - the client's counterpart to KP_hss_desc_enc. - KP_hsc_desc_enc, KS_hsc_desc_enc. - - - An ed25519 keypair which allows the client to compute signatures which - prove to the hidden service that the client is authorized. These - signatures are inserted into the INTRODUCE1 cell, and without them the - introduction to the hidden service cannot be completed. See [INTRO-AUTH]. - KP_hsc_intro_auth, KS_hsc_intro_auth. - - The right way to exchange these keys is to have the client generate keys and - send the corresponding public keys to the hidden service out-of-band. An - easier but less secure way of doing this exchange would be to have the - hidden service generate the keypairs and pass the corresponding private keys - to its clients. See section [CLIENT-AUTH-MGMT] for more details on how these - keys should be managed. - - [TODO: Also specify stealth client authorization.] - - (NOTE: client authorization is implemented as of 0.3.5.1-alpha.) - -2.
Generating and publishing hidden service descriptors [HSDIR] - - Hidden service descriptors follow the same metaformat as other Tor - directory objects. They are published anonymously to Tor servers with the - HSDir flag, HSDir=2 protocol version and tor version >= 0.3.0.8 (because a - bug was fixed in this version). - -2.1. Deriving blinded keys and subcredentials [SUBCRED] - - In each time period (see [TIME-PERIODS] for a definition of time - periods), a hidden service host uses a different blinded private key - to sign its directory information, and clients use a different - blinded public key as the index for fetching that information. - - For a candidate for a key derivation method, see Appendix [KEYBLIND]. - - Additionally, clients and hosts derive a subcredential for each - period. Knowledge of the subcredential is needed to decrypt hidden - service descriptors for each period and to authenticate with the - hidden service host in the introduction process. Unlike the - credential, it changes each period. Knowing the subcredential, even - in combination with the blinded private key, does not enable the - hidden service host to derive the main credential--therefore, it is - safe to put the subcredential on the hidden service host while - leaving the hidden service's private key offline. - - The subcredential for a period is derived as: - - N_hs_subcred = H("subcredential" | N_hs_cred | blinded-public-key). - - In the above formula, credential corresponds to: - - N_hs_cred = H("credential" | public-identity-key) - - where public-identity-key is the public identity master key of the hidden - service. - -2.2. Locating, uploading, and downloading hidden service descriptors - [HASHRING] - - To avoid attacks where a hidden service's descriptor is easily - targeted for censorship, we store them at different directories over - time, and use shared random values to prevent those directories from - being predictable far in advance. 
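As a rough illustration of the credential and subcredential derivation given in [SUBCRED] above, here is a minimal Python sketch. It assumes that H is SHA3-256 and that "|" is plain byte concatenation; neither is stated in this part of the document, so treat both as assumptions. The function names hs_credential and hs_subcredential are hypothetical, and the keys are expected as raw 32-byte strings.

```python
import hashlib


def H(data: bytes) -> bytes:
    # Assumption: H is instantiated with SHA3-256.
    return hashlib.sha3_256(data).digest()


def hs_credential(public_identity_key: bytes) -> bytes:
    # N_hs_cred = H("credential" | public-identity-key)
    return H(b"credential" + public_identity_key)


def hs_subcredential(public_identity_key: bytes,
                     blinded_public_key: bytes) -> bytes:
    # N_hs_subcred = H("subcredential" | N_hs_cred | blinded-public-key)
    n_hs_cred = hs_credential(public_identity_key)
    return H(b"subcredential" + n_hs_cred + blinded_public_key)
```

Because the blinded public key changes every time period, the subcredential changes with it, while the credential depends only on the long-term identity key.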
- - Which Tor servers host a hidden service's descriptor depends on: - - * the current time period, - * the daily subcredential, - * the hidden service directories' public keys, - * a shared random value that changes in each time period, - shared_random_value. - * a set of network-wide networkstatus consensus parameters. - (Consensus parameters are integer values voted on by authorities - and published in the consensus documents, described in - dir-spec.txt, section 3.3.) - - Below we explain in more detail. - -2.2.1. Dividing time into periods [TIME-PERIODS] - - To prevent a single set of hidden service directories from becoming a - target for adversaries looking to permanently censor a hidden service, - hidden service descriptors are uploaded to different locations that - change over time. - - The length of a "time period" is controlled by the consensus - parameter 'hsdir-interval', and is a number of minutes between 30 and - 14400 (10 days). The default time period length is 1440 (one day). - - Time periods start at the Unix epoch (Jan 1, 1970), and are computed by - taking the number of minutes since the epoch and dividing by the time - period length. However, we want our time periods to start at a regular offset - from the SRV voting schedule, so we subtract a "rotation time offset" - of 12 voting periods from the number of minutes since the epoch, before - dividing by the time period length (effectively making "our" epoch start at Jan - 1, 1970 12:00 UTC when the voting period is 1 hour.) - - Example: If the current time is 2016-04-13 11:15:01 UTC, then the seconds - since the epoch are 1460546101, and the number of minutes since the epoch is - 24342435. We then subtract the "rotation time offset" of 12*60 minutes from - the minutes since the epoch, to get 24341715. If the current time period - length is 1440 minutes, by doing the division (rounding down) we see that we are currently - in time period number 16903.
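The arithmetic in the example above can be sketched in Python as follows. This is only an illustration: the function names are hypothetical, and the defaults assume a 1440-minute time period and a 60-minute voting period, as in the example.

```python
from datetime import datetime, timezone

ROTATION_OFFSET_VOTING_PERIODS = 12  # "rotation time offset" from the text


def time_period_num(unix_seconds: int,
                    period_length_minutes: int = 1440,
                    voting_period_minutes: int = 60) -> int:
    # Minutes since the epoch, rounded down.
    minutes = unix_seconds // 60
    # Subtract the rotation time offset (12 voting periods), then divide
    # by the time period length, rounding down.
    offset = ROTATION_OFFSET_VOTING_PERIODS * voting_period_minutes
    return (minutes - offset) // period_length_minutes


def time_period_start(period_num: int,
                      period_length_minutes: int = 1440,
                      voting_period_minutes: int = 60) -> int:
    # Unix time (in seconds) at which the given time period begins.
    offset = ROTATION_OFFSET_VOTING_PERIODS * voting_period_minutes
    return (period_num * period_length_minutes + offset) * 60


def time_period_start_utc(period_num: int) -> datetime:
    # Convenience wrapper using the default lengths above.
    return datetime.fromtimestamp(time_period_start(period_num),
                                  tz=timezone.utc)
```

With these defaults, time_period_num(1460546101) evaluates to 16903, and time_period_start_utc(16903) is 2016-04-12 12:00 UTC.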
- - Specifically, time period #16903 began 16903*1440*60 + (12*60*60) seconds - after the epoch, at 2016-04-12 12:00 UTC, and ended at 16904*1440*60 + - (12*60*60) seconds after the epoch, at 2016-04-13 12:00 UTC. - -2.2.2. When to publish a hidden service descriptor [WHEN-HSDESC] - - Hidden services periodically publish their descriptor to the responsible - HSDirs. The set of responsible HSDirs is determined as specified in - [WHERE-HSDESC]. - - Specifically, every time a hidden service publishes its descriptor, it also - sets up a timer for a random time between 60 minutes and 120 minutes in the - future. When the timer triggers, the hidden service needs to publish its - descriptor again to the responsible HSDirs for that time period. - [TODO: Control republish period using a consensus parameter?] - -2.2.2.1. Overlapping descriptors - - Hidden services need to upload multiple descriptors so that they can be - reachable to clients with older or newer consensuses than them. Services - need to upload their descriptors to the HSDirs _before_ the beginning of - each upcoming time period, so that they are readily available for clients to - fetch them. Furthermore, services should keep uploading their old descriptor - even after the end of a time period, so that they can be reachable by - clients that still have consensuses from the previous time period. - - Hence, services maintain two active descriptors at every point. Clients on - the other hand, don't have a notion of overlapping descriptors, and instead - always download the descriptor for the current time period and shared random - value. It's the job of the service to ensure that descriptors will be - available for all clients. See section [FETCHUPLOADDESC] for how this is - achieved. - - [TODO: What to do when we run multiple hidden services in a single host?] - -2.2.3. Where to publish a hidden service descriptor [WHERE-HSDESC] - - This section specifies how the HSDir hash ring is formed at any given - time. 
Whenever a time value is needed (e.g. to get the current time period - number), we assume that clients and services use the valid-after time from - their latest live consensus. - - The following consensus parameters control where a hidden service - descriptor is stored; - - hsdir_n_replicas = an integer in range [1,16] with default value 2. - hsdir_spread_fetch = an integer in range [1,128] with default value 3. - hsdir_spread_store = an integer in range [1,128] with default value 4. - (Until 0.3.2.8-rc, the default was 3.) - - To determine where a given hidden service descriptor will be stored - in a given period, after the blinded public key for that period is - derived, the uploading or downloading party calculates: - - for replicanum in 1...hsdir_n_replicas: - hs_service_index(replicanum) = H("store-at-idx" | - blinded_public_key | - INT_8(replicanum) | - INT_8(period_length) | - INT_8(period_num) ) - - where blinded_public_key is specified in section [KEYBLIND], period_length - is the length of the time period in minutes, and period_num is calculated - using the current consensus "valid-after" as specified in section - [TIME-PERIODS]. - - Then, for each node listed in the current consensus with the HSDir flag, - we compute a directory index for that node as: - - hs_relay_index(node) = H("node-idx" | node_identity | - shared_random_value | - INT_8(period_num) | - INT_8(period_length) ) - - where shared_random_value is the shared value generated by the authorities - in section [PUB-SHAREDRANDOM], and node_identity is the ed25519 identity - key of the node. - - Finally, for replicanum in 1...hsdir_n_replicas, the hidden service - host uploads descriptors to the first hsdir_spread_store nodes whose - indices immediately follow hs_service_index(replicanum). If any of those - nodes have already been selected for a lower-numbered replica of the - service, any nodes already chosen are disregarded (i.e. 
skipped over) - when choosing a replica's hsdir_spread_store nodes. - - When choosing an HSDir to download from, clients choose randomly from - among the first hsdir_spread_fetch nodes after the indices. (Note - that, in order to make the system better tolerate disappearing - HSDirs, hsdir_spread_fetch may be less than hsdir_spread_store.) - Again, nodes from lower-numbered replicas are disregarded when - choosing the spread for a replica. - -2.2.4. Using time periods and SRVs to fetch/upload HS descriptors [FETCHUPLOADDESC] - - Hidden services and clients need to make correct use of time periods (TP) - and shared random values (SRVs) to successfully fetch and upload - descriptors. Furthermore, to avoid problems with skewed clocks, both clients - and services use the 'valid-after' time of a live consensus as a way to take - decisions with regards to uploading and fetching descriptors. By using the - consensus times as the ground truth here, we minimize the desynchronization - of clients and services due to system clock. Whenever time-based decisions - are taken in this section, assume that they are consensus times and not - system times. - - As [PUB-SHAREDRANDOM] specifies, consensuses contain two shared random - values (the current one and the previous one). Hidden services and clients - are asked to match these shared random values with descriptor time periods - and use the right SRV when fetching/uploading descriptors. This section - attempts to precisely specify how this works. - - Let's start with an illustration of the system: - - +------------------------------------------------------------------+ - | | - | 00:00 12:00 00:00 12:00 00:00 12:00 | - | SRV#1 TP#1 SRV#2 TP#2 SRV#3 TP#3 | - | | - | $==========|-----------$===========|-----------$===========| | - | | - | | - +------------------------------------------------------------------+ - - Legend: [TP#1 = Time Period #1] - [SRV#1 = Shared Random Value #1] - ["$" = descriptor rotation moment] - -2.2.4.1. 
Client behavior for fetching descriptors [CLIENTFETCH] - - And here is how clients use TPs and SRVs to fetch descriptors: - - Clients always aim to synchronize their TP with SRV, so they always want to - use TP#N with SRV#N: To achieve this wrt time periods, clients always use - the current time period when fetching descriptors. Now wrt SRVs, if a client - is in the time segment between a new time period and a new SRV (i.e. the - segments drawn with "-") it uses the current SRV, else if the client is in a - time segment between a new SRV and a new time period (i.e. the segments - drawn with "="), it uses the previous SRV. - - Example: - - +------------------------------------------------------------------+ - | | - | 00:00 12:00 00:00 12:00 00:00 12:00 | - | SRV#1 TP#1 SRV#2 TP#2 SRV#3 TP#3 | - | | - | $==========|-----------$===========|-----------$===========| | - | ^ ^ | - | C1 C2 | - +------------------------------------------------------------------+ - - If a client (C1) is at 13:00 right after TP#1, then it will use TP#1 and - SRV#1 for fetching descriptors. Also, if a client (C2) is at 01:00 right - after SRV#2, it will still use TP#1 and SRV#1. - -2.2.4.2. Service behavior for uploading descriptors [SERVICEUPLOAD] - - As discussed above, services maintain two active descriptors at any time. We - call these the "first" and "second" service descriptors. Services rotate - their descriptor every time they receive a consensus with a valid_after time - past the next SRV calculation time. They rotate their descriptors by - discarding their first descriptor, pushing the second descriptor to the - first, and rebuilding their second descriptor with the latest data. - - Services like clients also employ a different logic for picking SRV and TP - values based on their position in the graph above. Here is the logic: - -2.2.4.2.1. 
First descriptor upload logic [FIRSTDESCUPLOAD] - - Here is the service logic for uploading its first descriptor: - - When a service is in the time segment between a new time period and a new SRV - (i.e. the segments drawn with "-"), it uses the previous time period and - previous SRV for uploading its first descriptor: that's meant to cover - for clients that have a consensus that is still in the previous time period. - - Example: Consider in the above illustration that the service is at 13:00 - right after TP#1. It will upload its first descriptor using TP#0 and SRV#0. - So if a client still has an 11:00 consensus it will be able to access it - based on the client logic above. - - Now if a service is in the time segment between a new SRV and a new time - period (i.e. the segments drawn with "=") it uses the current time period - and the previous SRV for its first descriptor: that's meant to cover clients - with an up-to-date consensus in the same time period as the service. - - Example: - - +------------------------------------------------------------------+ - | | - | 00:00 12:00 00:00 12:00 00:00 12:00 | - | SRV#1 TP#1 SRV#2 TP#2 SRV#3 TP#3 | - | | - | $==========|-----------$===========|-----------$===========| | - | ^ | - | S | - +------------------------------------------------------------------+ - - Consider that the service is at 01:00 right after SRV#2: it will upload its - first descriptor using TP#1 and SRV#1. - -2.2.4.2.2. Second descriptor upload logic [SECONDDESCUPLOAD] - - Here is the service logic for uploading its second descriptor: - - When a service is in the time segment between a new time period and a new SRV - (i.e. the segments drawn with "-"), it uses the current time period and - current SRV for uploading its second descriptor: that's meant to cover for - clients that have an up-to-date consensus on the same TP as the service.
- - Example: Consider in the above illustration that the service is at 13:00 - right after TP#1: it will upload its second descriptor using TP#1 and SRV#1. - - Now if a service is in the time segment between a new SRV and a new time - period (i.e. the segments drawn with "=") it uses the next time period and - the current SRV for its second descriptor: that's meant to cover clients - with a newer consensus than the service (in the next time period). - - Example: - - +------------------------------------------------------------------+ - | | - | 00:00 12:00 00:00 12:00 00:00 12:00 | - | SRV#1 TP#1 SRV#2 TP#2 SRV#3 TP#3 | - | | - | $==========|-----------$===========|-----------$===========| | - | ^ | - | S | - +------------------------------------------------------------------+ - - Consider that the service is at 01:00 right after SRV#2: it will upload its - second descriptor using TP#2 and SRV#2. - -2.2.4.3. Directory behavior for handling descriptor uploads [DIRUPLOAD] - - Upon receiving a hidden service descriptor publish request, directories MUST - check the following: - - * The outer wrapper of the descriptor can be parsed according to - [DESC-OUTER] - * The version-number of the descriptor is "3" - * If the directory has already cached a descriptor for this hidden service, - the revision-counter of the uploaded descriptor must be greater than the - revision-counter of the cached one - * The descriptor signature is valid - - If any of these basic validity checks fails, the directory MUST reject the - descriptor upload. - - NOTE: Even if the descriptor passes the checks above, its first and second - layers could still be invalid: directories cannot validate the encrypted - layers of the descriptor, as they do not have access to the public key of the - service (required for decrypting the first layer of encryption), or the - necessary client credentials (for decrypting the second layer). - -2.2.5. 
Expiring hidden service descriptors [EXPIRE-DESC] - - Hidden services set their descriptor's "descriptor-lifetime" field to 180 - minutes (3 hours). Hidden services ensure that their descriptor will remain - valid in the HSDir caches, by republishing their descriptors periodically as - specified in [WHEN-HSDESC]. - - Hidden services MUST also keep their introduction circuits alive for as long - as descriptors including those intro points are valid (even if that's after - the time period has changed). - -2.2.6. URLs for anonymous uploading and downloading - - Hidden service descriptors conforming to this specification are uploaded - with an HTTP POST request to the URL /tor/hs/<version>/publish relative to - the hidden service directory's root, and downloaded with an HTTP GET - request for the URL /tor/hs/<version>/<z> where <z> is a base64 encoding of - the hidden service's blinded public key and <version> is the protocol - version which is "3" in this case. - - These requests must be made anonymously, on circuits not used for - anything else. - -2.2.7. Client-side validation of onion addresses - - When a Tor client receives a prop224 onion address from the user, it - MUST first validate the onion address before attempting to connect or - fetch its descriptor. If the validation fails, the client MUST - refuse to connect. - - As part of the address validation, Tor clients should check that the - underlying ed25519 key does not have a torsion component. If Tor accepted - ed25519 keys with torsion components, attackers could create multiple - equivalent onion addresses for a single ed25519 key, which would map to the - same service. We want to avoid that because it could lead to phishing - attacks and surprising behaviors (e.g. imagine a browser plugin that blocks - onion addresses, but could be bypassed using an equivalent onion address - with a torsion component).
- - The right way for clients to detect such fraudulent addresses (which should - only occur malevolently and never naturally) is to extract the ed25519 - public key from the onion address and multiply it by the ed25519 group order - and ensure that the result is the ed25519 identity element. For more - details, please see [TORSION-REFS]. - -2.3. Publishing shared random values [PUB-SHAREDRANDOM] - - Our design for limiting the predictability of HSDir upload locations - relies on a shared random value (SRV) that isn't predictable in advance or - too influenceable by an attacker. The authorities must run a protocol - to generate such a value at least once per hsdir period. Here we - describe how they publish these values; the procedure they use to - generate them can change independently of the rest of this - specification. For more information see [SHAREDRANDOM-REFS]. - - According to proposal 250, we add two new lines in consensuses: - - "shared-rand-previous-value" SP NUM_REVEALS SP VALUE NL - "shared-rand-current-value" SP NUM_REVEALS SP VALUE NL - -2.3.1. Client behavior in the absence of shared random values - - If the previous or current shared random value cannot be found in a - consensus, then Tor clients and services need to generate their own random - value for use when choosing HSDirs. - - To do so, Tor clients and services use: - - SRV = H("shared-random-disaster" | INT_8(period_length) | INT_8(period_num)) - - where period_length is the length of a time period in minutes, - rounded down; period_num is calculated as specified in - [TIME-PERIODS] for the wanted shared random value that could not be - found originally. - -2.3.2. Hidden services and changing shared random values - - It's theoretically possible that the consensus shared random values will - change or disappear in the middle of a time period because of directory - authorities dropping offline or misbehaving. 
- - To avoid client reachability issues in this rare event, hidden services - should use the new shared random values to find the new responsible HSDirs - and upload their descriptors there. - - XXX How long should they upload descriptors there for? - -2.4. Hidden service descriptors: outer wrapper [DESC-OUTER] - - The format for a hidden service descriptor is as follows, using the - meta-format from dir-spec.txt. - - "hs-descriptor" SP version-number NL - - [At start, exactly once.] - - The version-number is a 32 bit unsigned integer indicating the version - of the descriptor. Current version is "3". - - "descriptor-lifetime" SP LifetimeMinutes NL - - [Exactly once] - - The lifetime of a descriptor in minutes. An HSDir SHOULD expire the - hidden service descriptor at least LifetimeMinutes after it was - uploaded. - - The LifetimeMinutes field can take values between 30 and 720 (12 - hours). - - "descriptor-signing-key-cert" NL certificate NL - - [Exactly once.] - - The 'certificate' field contains a certificate in the format from - proposal 220, wrapped with "-----BEGIN ED25519 CERT-----". The - certificate cross-certifies the short-term descriptor signing key with - the blinded public key. The certificate type must be [08], and the - blinded public key must be present as the signing-key extension. - - "revision-counter" SP Integer NL - - [Exactly once.] - - The revision number of the descriptor. If an HSDir receives a - second descriptor for a key that it already has a descriptor for, - it should retain and serve the descriptor with the higher - revision-counter. - - (Checking for monotonically increasing revision-counter values - prevents an attacker from replacing a newer descriptor signed by - a given key with a copy of an older version.) - - Implementations MUST be able to parse 64-bit values for these - counters. - - "superencrypted" NL encrypted-string - - [Exactly once.] - - An encrypted blob, whose format is discussed in [HS-DESC-ENC] below. 
The - blob is base64 encoded and enclosed in -----BEGIN MESSAGE---- and - ----END MESSAGE---- wrappers. (The resulting document does not end with - a newline character.) - - "signature" SP signature NL - - [exactly once, at end.] - - A signature of all previous fields, using the signing key in the - descriptor-signing-key-cert line, prefixed by the string "Tor onion - service descriptor sig v3". We use a separate key for signing, so that - the hidden service host does not need to have its private blinded key - online. - - HSDirs accept hidden service descriptors of up to 50k bytes (a consensus - parameter should also be introduced to control this value). - -2.5. Hidden service descriptors: encryption format [HS-DESC-ENC] - - Hidden service descriptors are protected by two layers of encryption. - Clients need to decrypt both layers to connect to the hidden service. - - The first layer of encryption provides confidentiality against entities who - don't know the public key of the hidden service (e.g. HSDirs), while the - second layer of encryption is only useful when client authorization is enabled - and protects against entities that do not possess valid client credentials. - -2.5.1. First layer of encryption [HS-DESC-FIRST-LAYER] - - The first layer of HS descriptor encryption is designed to protect - descriptor confidentiality against entities who don't know the public - identity key of the hidden service. - -2.5.1.1. First layer encryption logic - - The encryption keys and format for the first layer of encryption are - generated as specified in [HS-DESC-ENCRYPTION-KEYS] with customization - parameters: - - SECRET_DATA = blinded-public-key - STRING_CONSTANT = "hsdir-superencrypted-data" - - The encryption scheme in [HS-DESC-ENCRYPTION-KEYS] uses the service - credential which is derived from the public identity key (see [SUBCRED]) to - ensure that only entities who know the public identity key can decrypt the - first descriptor layer. 
- - The ciphertext is placed on the "superencrypted" field of the descriptor. - - Before encryption the plaintext is padded with NUL bytes to the nearest - multiple of 10k bytes. - -2.5.1.2. First layer plaintext format - - After clients decrypt the first layer of encryption, they need to parse the - plaintext to get to the second layer ciphertext which is contained in the - "encrypted" field. - - If client auth is enabled, the hidden service generates a fresh - descriptor_cookie key (`N_hs_desc_enc`, 32 random bytes) and encrypts - it using each authorized client's identity x25519 key. Authorized - clients can use the descriptor cookie (`N_hs_desc_enc`) to decrypt - the second (inner) layer of encryption. Our encryption scheme - requires the hidden service to also generate an ephemeral x25519 - keypair for each new descriptor. - - If client auth is disabled, fake data is placed in each of the fields below - to obfuscate whether client authorization is enabled. - - Here are all the supported fields: - - "desc-auth-type" SP type NL - - [Exactly once] - - This field contains the type of authorization used to protect the - descriptor. The only recognized type is "x25519" and specifies the - encryption scheme described in this section. - - If client authorization is disabled, the value here should be "x25519". - - "desc-auth-ephemeral-key" SP KP_hs_desc_ephem NL - - [Exactly once] - - This field contains `KP_hss_desc_enc`, an ephemeral x25519 public - key generated by the hidden service and encoded in base64. The key - is used by the encryption scheme below. - - If client authorization is disabled, the value here should be a fresh - x25519 pubkey that will remain unused. - - "auth-client" SP client-id SP iv SP encrypted-cookie - - [At least once] - - When client authorization is enabled, the hidden service inserts an - "auth-client" line for each of its authorized clients. 
If client - authorization is disabled, the fields here can be populated with random - data of the right size (that's 8 bytes for 'client-id', 16 bytes for 'iv' - and 16 bytes for 'encrypted-cookie' all encoded with base64). - - When client authorization is enabled, each "auth-client" line - contains the descriptor cookie `N_hs_desc_enc` encrypted to each - individual client. We assume that each authorized client possesses - a pre-shared x25519 keypair (`KP_hsc_desc_enc`) which is used to - decrypt the descriptor cookie. - - We now describe the descriptor cookie encryption scheme. Here is what - the hidden service computes: - - SECRET_SEED = x25519(KS_hss_desc_enc, KP_hsc_desc_enc) - KEYS = KDF(N_hs_subcred | SECRET_SEED, 40) - CLIENT-ID = first 8 bytes of KEYS - COOKIE-KEY = last 32 bytes of KEYS - - Here is a description of the fields in the "auth-client" line: - - - The "client-id" field is CLIENT-ID from above encoded in base64. - - - The "iv" field is 16 random bytes encoded in base64. - - - The "encrypted-cookie" field contains the descriptor cookie ciphertext - as follows and is encoded in base64: - encrypted-cookie = STREAM(iv, COOKIE-KEY) XOR N_hs_desc_enc. - - See section [FIRST-LAYER-CLIENT-BEHAVIOR] for the client-side logic of - how to decrypt the descriptor cookie. - - "encrypted" NL encrypted-string - - [Exactly once] - - An encrypted blob containing the second layer ciphertext, whose format is - discussed in [HS-DESC-SECOND-LAYER] below. The blob is base64 encoded - and enclosed in -----BEGIN MESSAGE----- and -----END MESSAGE----- wrappers. - - Compatibility note: The C Tor implementation does not include a final - newline when generating this first-layer-plaintext section; other - implementations MUST accept this section even if it is missing its final - newline. Other implementations MAY generate this section without a final - newline themselves, to avoid being distinguishable from C tor. - -2.5.1.3.
Client behavior [FIRST-LAYER-CLIENT-BEHAVIOR] - - The goal of clients at this stage is to decrypt the "encrypted" field as - described in [HS-DESC-SECOND-LAYER]. - - If client authorization is enabled, authorized clients need to extract the - descriptor cookie to proceed with decryption of the second layer as - follows: - - An authorized client parsing the first layer of an encrypted descriptor, - extracts the ephemeral key from "desc-auth-ephemeral-key" and calculates - CLIENT-ID and COOKIE-KEY as described in the section above using their - x25519 private key. The client then uses CLIENT-ID to find the right - "auth-client" field which contains the ciphertext of the descriptor - cookie. The client then uses COOKIE-KEY and the iv to decrypt the - descriptor_cookie, which is used to decrypt the second layer of descriptor - encryption as described in [HS-DESC-SECOND-LAYER]. - -2.5.1.4. Hiding client authorization data - - Hidden services should avoid leaking whether client authorization is - enabled or how many authorized clients there are. - - Hence even when client authorization is disabled, the hidden service adds - fake "desc-auth-type", "desc-auth-ephemeral-key" and "auth-client" lines to - the descriptor, as described in [HS-DESC-FIRST-LAYER]. - - The hidden service also avoids leaking the number of authorized clients by - adding fake "auth-client" entries to its descriptor. Specifically, - descriptors always contain a number of authorized clients that is a - multiple of 16 by adding fake "auth-client" entries if needed. - [XXX consider randomization of the value 16] - - Clients MUST accept descriptors with any number of "auth-client" lines as - long as the total descriptor size is within the max limit of 50k (also - controlled with a consensus parameter). - -2.5.2. Second layer of encryption [HS-DESC-SECOND-LAYER] - - The second layer of descriptor encryption is designed to protect descriptor - confidentiality against unauthorized clients. 
If client authorization is - enabled, it's encrypted using the descriptor_cookie, and contains needed - information for connecting to the hidden service, like the list of its - introduction points. - - If client authorization is disabled, then the second layer of HS encryption - does not offer any additional security, but is still used. - -2.5.2.1. Second layer encryption keys - - The encryption keys and format for the second layer of encryption are - generated as specified in [HS-DESC-ENCRYPTION-KEYS] with customization - parameters as follows: - - SECRET_DATA = blinded-public-key | descriptor_cookie - STRING_CONSTANT = "hsdir-encrypted-data" - - If client authorization is disabled the 'descriptor_cookie' field is left blank. - - The ciphertext is placed on the "encrypted" field of the descriptor. - -2.5.2.2. Second layer plaintext format - - After decrypting the second layer ciphertext, clients can finally learn the - list of intro points etc. The plaintext has the following format: - - "create2-formats" SP formats NL - - [Exactly once] - - A space-separated list of integers denoting CREATE2 cell HTYPEs - (handshake types) that the server recognizes. Must include at least - ntor as described in tor-spec.txt. See tor-spec section 5.1 for a list - of recognized handshake types. - - "intro-auth-required" SP types NL - - [At most once] - - A space-separated list of introduction-layer authentication types; see - section [INTRO-AUTH] for more info. A client that does not support at - least one of these authentication types will not be able to contact the - host. Recognized types are: 'ed25519'. - - "single-onion-service" - - [None or at most once] - - If present, this line indicates that the service is a Single Onion - Service (see prop260 for more details about that type of service). This - field has been introduced in 0.3.0 meaning 0.2.9 service don't include - this. 
- - Followed by zero or more introduction points as follows (see section - [NUM_INTRO_POINT] below for accepted values): - - "introduction-point" SP link-specifiers NL - - [Exactly once per introduction point at start of introduction - point section] - - The link-specifiers is a base64 encoding of a link specifier - block in the format described in [BUILDING-BLOCKS] above. - - As of 0.4.1.1-alpha, services include both IPv4 and IPv6 link - specifiers in descriptors. All available addresses SHOULD be - included in the descriptor, regardless of the address that the - onion service actually used to connect/extend to the intro - point. - - The client SHOULD NOT reject any LSTYPE fields which it doesn't - recognize; instead, it should use them verbatim in its EXTEND - request to the introduction point. - - The client SHOULD perform the basic validity checks on the link - specifiers in the descriptor, described in `tor-spec.txt` - section 5.1.2. These checks SHOULD NOT leak - detailed information about the client's version, configuration, - or consensus. (See 3.3 for service link specifier handling.) - - When connecting to the introduction point, the client SHOULD send - this list of link specifiers verbatim, in the same order as given - here. - - The client MAY reject the list of link specifiers if it is - inconsistent with relay information from the directory, but SHOULD - NOT modify it. - - "onion-key" SP "ntor" SP key NL - - [Exactly once per introduction point] - - The key is a base64 encoded curve25519 public key which is the onion - key of the introduction point Tor node used for the ntor handshake - when a client extends to it. - - "onion-key" SP KeyType SP key.. NL - - [Any number of times] - - Implementations should accept other types of onion keys using this - syntax (where "KeyType" is some string other than "ntor"); - unrecognized key types should be ignored. 
- - "auth-key" NL certificate NL - - [Exactly once per introduction point] - - The certificate is a proposal 220 certificate wrapped in - "-----BEGIN ED25519 CERT-----" armor. It contains the introduction - point authentication key (`KP_hs_ipt_sid`), signed by - the descriptor signing key (`KP_hs_desc_sign`). The - certificate type must be [09], and the signing key extension - is mandatory. - - NOTE: This certificate was originally intended to be - constructed the other way around: the signing and signed keys - are meant to be reversed. However, C tor implemented it - backwards, and other implementations now need to do the same - in order to conform. (Since this section is inside the - descriptor, which is _already_ signed by `KP_hs_desc_sign`, - the verification aspect of this certificate serves no point in - its current form.) - - "enc-key" SP "ntor" SP key NL - - [Exactly once per introduction point] - - The key is a base64 encoded curve25519 public key used to encrypt - the introduction request to the service. (`KP_hss_ntor`) - - "enc-key" SP KeyType SP key.. NL - - [Any number of times] - - Implementations should accept other types of onion keys using this - syntax (where "KeyType" is some string other than "ntor"); - unrecognized key types should be ignored. - - "enc-key-cert" NL certificate NL - - [Exactly once per introduction point] - - Cross-certification of the encryption key using the descriptor - signing key. - - For "ntor" keys, the certificate is a proposal 220 certificate - wrapped in "-----BEGIN ED25519 CERT-----" armor. The subject - key is the ed25519 equivalent of a curve25519 public - encryption key (`KP_hss_ntor`), with the ed25519 key - derived using the process in proposal 228 appendix A. The - signing key is the descriptor signing key (`KP_hs_desc_sign`). - The certificate type must be [0B], and the signing-key - extension is mandatory. - - NOTE: As with "auth-key", this certificate was intended to be - constructed the other way around. 
However, for compatibility - with C tor, implementations need to construct it this way. It - serves even less point than "auth-key", however, since the - encryption key `KP_hss_ntor` is already available from - the `enc-key` entry. - - "legacy-key" NL key NL - - [None or at most once per introduction point] - [This field is obsolete and should never be generated; it - is included for historical reasons only.] - - The key is an ASN.1 encoded RSA public key in PEM format used for a - legacy introduction point as described in [LEGACY_EST_INTRO]. - - This field is only present if the introduction point only supports - legacy protocol (v2) that is <= 0.2.9 or the protocol version value - "HSIntro 3". - - "legacy-key-cert" NL certificate NL - - [None or at most once per introduction point] - [This field is obsolete and should never be generated; it - is included for historical reasons only.] - - MUST be present if "legacy-key" is present. - - The certificate is a proposal 220 RSA->Ed cross-certificate wrapped - in "-----BEGIN CROSSCERT-----" armor, cross-certifying the RSA - public key found in "legacy-key" using the descriptor signing key. - - To remain compatible with future revisions to the descriptor format, - clients should ignore unrecognized lines in the descriptor. - Other encryption and authentication key formats are allowed; clients - should ignore ones they do not recognize. - - Clients who manage to extract the introduction points of the hidden service - can proceed with the introduction protocol as specified in [INTRO-PROTOCOL]. - - Compatibility note: At least some versions of OnionBalance do not include - a final newline when generating this inner plaintext section; other - implementations MUST accept this section even if it is missing its final - newline. - -2.5.3. Deriving hidden service descriptor encryption keys [HS-DESC-ENCRYPTION-KEYS] - - In this section we present the generic encryption format for hidden service - descriptors. 
We use the same encryption format in both encryption layers, - hence we introduce two customization parameters SECRET_DATA and - STRING_CONSTANT which vary between the layers. - - The SECRET_DATA parameter specifies the secret data that are used during - encryption key generation, while STRING_CONSTANT is merely a string constant - that is used as part of the KDF. - - Here is the key generation logic: - - SALT = 16 bytes from H(random), changes each time we rebuild the - descriptor even if the content of the descriptor hasn't changed. - (So that we don't leak whether the intro point list etc. changed) - - secret_input = SECRET_DATA | N_hs_subcred | INT_8(revision_counter) - - keys = KDF(secret_input | SALT | STRING_CONSTANT, S_KEY_LEN + S_IV_LEN + MAC_KEY_LEN) - - SECRET_KEY = first S_KEY_LEN bytes of keys - SECRET_IV = next S_IV_LEN bytes of keys - MAC_KEY = last MAC_KEY_LEN bytes of keys - - The encrypted data has the format: - - SALT hashed random bytes from above [16 bytes] - ENCRYPTED The ciphertext [variable] - MAC D_MAC of both above fields [32 bytes] - - The final encryption format is ENCRYPTED = STREAM(SECRET_IV,SECRET_KEY) XOR Plaintext. - - Where D_MAC = H(mac_key_len | MAC_KEY | salt_len | SALT | ENCRYPTED) - and - mac_key_len = htonll(len(MAC_KEY)) - and - salt_len = htonll(len(SALT)). - -2.5.4. Number of introduction points [NUM_INTRO_POINT] - - This section defines how many introduction points a hidden service - descriptor can have at minimum, by default, and at maximum: - - Minimum: 0 - Default: 3 - Maximum: 20 - - A value of 0 means that the service is still alive but doesn't want - to be reached by any client at the moment. Note that the descriptor size - increases considerably as more introduction points are added. 
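As an illustration of the key generation logic in [HS-DESC-ENCRYPTION-KEYS] above, here is a minimal Python sketch of the key derivation and of D_MAC. It rests on assumptions that this section does not restate: that KDF is SHAKE-256, that H is SHA3-256, that INT_8 and htonll both produce an 8-byte big-endian integer, and that S_KEY_LEN = 32, S_IV_LEN = 16 and MAC_KEY_LEN = 32. The function names are hypothetical, and the STREAM encryption step is left out.

```python
import hashlib
import struct

# Assumed lengths; this section does not restate them.
S_KEY_LEN = 32
S_IV_LEN = 16
MAC_KEY_LEN = 32


def derive_layer_keys(secret_data, subcredential, revision_counter, salt,
                      string_constant):
    """Return (SECRET_KEY, SECRET_IV, MAC_KEY) for one descriptor layer."""
    # secret_input = SECRET_DATA | N_hs_subcred | INT_8(revision_counter)
    secret_input = (secret_data + subcredential
                    + struct.pack(">Q", revision_counter))
    total = S_KEY_LEN + S_IV_LEN + MAC_KEY_LEN
    # KDF is assumed to be SHAKE-256 with the requested output length.
    keys = hashlib.shake_256(secret_input + salt + string_constant).digest(total)
    secret_key = keys[:S_KEY_LEN]
    secret_iv = keys[S_KEY_LEN:S_KEY_LEN + S_IV_LEN]
    mac_key = keys[S_KEY_LEN + S_IV_LEN:]
    return secret_key, secret_iv, mac_key


def descriptor_mac(mac_key, salt, encrypted):
    """D_MAC = H(mac_key_len | MAC_KEY | salt_len | SALT | ENCRYPTED)."""
    h = hashlib.sha3_256()  # H is assumed to be SHA3-256.
    h.update(struct.pack(">Q", len(mac_key)))  # htonll(len(MAC_KEY))
    h.update(mac_key)
    h.update(struct.pack(">Q", len(salt)))  # htonll(len(SALT))
    h.update(salt)
    h.update(encrypted)
    return h.digest()
```

For the second layer, a caller would pass blinded-public-key | descriptor_cookie as secret_data and "hsdir-encrypted-data" as string_constant, with a fresh SALT on every descriptor rebuild.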
- - The maximum value of 20 gives tools like OnionBalance enough - scalability to load balance up to 120 servers (20 x 6 - HSDirs), while keeping the descriptor size from overwhelming hidden - service directories with user-defined values that could be gigantic. - -3. The introduction protocol [INTRO-PROTOCOL] - - The introduction protocol proceeds in three steps. - - First, a hidden service host builds an anonymous circuit to a Tor - node and registers that circuit as an introduction point. - - Single Onion Services attempt to build a non-anonymous single-hop circuit, - but use an anonymous 3-hop circuit if: - - * the intro point is on an address that is configured as unreachable via - a direct connection, or - * the initial attempt to connect to the intro point over a single-hop - circuit fails, and they are retrying the intro point connection. - - [After 'First' and before 'Second', the hidden service publishes its - introduction points and associated keys, and the client fetches - them as described in section [HSDIR] above.] - - Second, a client builds an anonymous circuit to the introduction - point, and sends an introduction request. - - Third, the introduction point relays the introduction request along - the introduction circuit to the hidden service host, and acknowledges - the introduction request to the client. - -3.1. Registering an introduction point [REG_INTRO_POINT] - -3.1.1. Extensible ESTABLISH_INTRO protocol. 
[EST_INTRO] - - When a hidden service is establishing a new introduction point, it - sends an ESTABLISH_INTRO cell with the following contents: - - AUTH_KEY_TYPE [1 byte] - AUTH_KEY_LEN [2 bytes] - AUTH_KEY [AUTH_KEY_LEN bytes] - N_EXTENSIONS [1 byte] - N_EXTENSIONS times: - EXT_FIELD_TYPE [1 byte] - EXT_FIELD_LEN [1 byte] - EXT_FIELD [EXT_FIELD_LEN bytes] - HANDSHAKE_AUTH [MAC_LEN bytes] - SIG_LEN [2 bytes] - SIG [SIG_LEN bytes] - - The AUTH_KEY_TYPE field indicates the type of the introduction point - authentication key and the type of the MAC to use in - HANDSHAKE_AUTH. Recognized types are: - - [00, 01] -- Reserved for legacy introduction cells; see - [LEGACY_EST_INTRO] below - [02] -- Ed25519; SHA3-256. - - The AUTH_KEY_LEN field determines the length of the AUTH_KEY - field. The AUTH_KEY field contains the public introduction point - authentication key, KP_hs_ipt_sid. - - The EXT_FIELD_TYPE, EXT_FIELD_LEN, EXT_FIELD entries are reserved for - extensions to the introduction protocol. Extensions with - unrecognized EXT_FIELD_TYPE values must be ignored. - (`EXT_FIELD_LEN` may be zero, in which case EXT_FIELD is absent.) - - Unless otherwise specified in the documentation for an extension type: - * Each extension type SHOULD be sent only once in a message. - * Parties MUST ignore all occurrences of an extension - with a given type after the first such occurrence. - * Extensions SHOULD be sent in numerically ascending order by type. - (The above extension sorting and multiplicity rules are only defaults; - they may be overridden in the descriptions of individual extensions.) - - The HANDSHAKE_AUTH field contains the MAC of all earlier fields in - the cell using as its key the shared per-circuit material ("KH") - generated during the circuit extension protocol; see tor-spec.txt - section 5.2, "Setting circuit keys". It prevents replays of - ESTABLISH_INTRO cells. - - SIG_LEN is the length of the signature. 
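As an aside on the extension block described above (N_EXTENSIONS followed by type, length and body triples), the default multiplicity rule can be sketched in Python as follows. The function name and the choice to raise ValueError on truncation are illustrative assumptions; the spec does not say how malformed input should be handled. Unrecognized types are returned like any other, leaving the caller to ignore the ones it does not understand.

```python
def parse_extensions(buf, offset=0):
    """Parse an extension block starting at N_EXTENSIONS.

    Returns (extensions, new_offset), where extensions maps
    EXT_FIELD_TYPE to the EXT_FIELD bytes of its first occurrence.
    """
    if offset >= len(buf):
        raise ValueError("truncated: missing N_EXTENSIONS")
    n_extensions = buf[offset]
    offset += 1
    extensions = {}
    for _ in range(n_extensions):
        if offset + 2 > len(buf):
            raise ValueError("truncated extension header")
        ext_type = buf[offset]
        ext_len = buf[offset + 1]  # may be zero: EXT_FIELD is then absent
        offset += 2
        if offset + ext_len > len(buf):
            raise ValueError("truncated extension body")
        body = bytes(buf[offset:offset + ext_len])
        offset += ext_len
        # Default rule: later occurrences of the same type are ignored.
        extensions.setdefault(ext_type, body)
    return extensions, offset
```

The returned offset points at the first byte after the extension block, which in an ESTABLISH_INTRO cell is the start of HANDSHAKE_AUTH.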
- - SIG is a signature, using AUTH_KEY, of all contents of the cell, up - to but not including SIG_LEN and SIG. These contents are prefixed - with the string "Tor establish-intro cell v1". - - Upon receiving an ESTABLISH_INTRO cell, a Tor node first decodes the - key and the signature, and checks the signature. The node must reject - the ESTABLISH_INTRO cell and destroy the circuit in these cases: - - * If the key type is unrecognized - * If the key is ill-formatted - * If the signature is incorrect - * If the HANDSHAKE_AUTH value is incorrect - - * If the circuit is already a rendezvous circuit. - * If the circuit is already an introduction circuit. - [TODO: some scalability designs fail there.] - * If the key is already in use by another circuit. - - Otherwise, the node must associate the key with the circuit, for use - later in INTRODUCE1 cells. - -3.1.1.1. Denial-of-Service Defense Extension. [EST_INTRO_DOS_EXT] - - This extension can be used to send Denial-of-Service (DoS) parameters to - the introduction point so that it applies them to the introduction - circuit. - - If used, it needs to be encoded within the extension fields of the - ESTABLISH_INTRO cell defined in the previous section. The content is - defined as follows: - - EXT_FIELD_TYPE: - - [01] -- Denial-of-Service Parameters. - - If this flag is set, the extension should be used by the introduction - point to learn what values the denial of service subsystem should be - using. - - EXT_FIELD content format is: - - N_PARAMS [1 byte] - N_PARAMS times: - PARAM_TYPE [1 byte] - PARAM_VALUE [8 bytes] - - The possible PARAM_TYPE values are: - - [01] -- DOS_INTRODUCE2_RATE_PER_SEC - The rate per second of INTRODUCE2 cells relayed to the - service. - - [02] -- DOS_INTRODUCE2_BURST_PER_SEC - The burst per second of INTRODUCE2 cells relayed to the - service. - - The PARAM_VALUE size is 8 bytes in order to accommodate 64-bit values. 
- It MUST match the specified limit for the following PARAM_TYPE: - - [01] -- Min: 0, Max: 2147483647 - [02] -- Min: 0, Max: 2147483647 - - A value of 0 means the defense is disabled. If the rate per second is - set to 0 (param 0x01) then the burst value should be ignored. And - vice-versa, if the burst value is 0 (param 0x02), then the rate value - should be ignored. In other words, setting one single parameter to 0 - disables the defense. - - The burst can NOT be smaller than the rate. If it is, the parameters - should be ignored by the introduction point. - - Any valid value takes precedence over the network-wide consensus - parameter. - - Using this extension with both parameters extends the payload of the - ESTABLISH_INTRO cell by 21 bytes (a 19-byte EXT_FIELD plus the 2 bytes of - EXT_FIELD_TYPE and EXT_FIELD_LEN), bringing it from 134 bytes to 155 bytes. - - This extension can only be used with relays supporting the protocol version - "HSIntro=5". - - Introduced in tor-0.4.2.1-alpha. - -3.1.2. Registering an introduction point on a legacy Tor node - [LEGACY_EST_INTRO] - - [This section is obsolete and refers to a workaround for now-obsolete Tor - relay versions. It is included for historical reasons.] - - Tor nodes should also support an older version of the ESTABLISH_INTRO - cell, first documented in rend-spec.txt. New hidden service hosts - must use this format when establishing introduction points at older - Tor nodes that do not support the format above in [EST_INTRO]. - - In this older protocol, an ESTABLISH_INTRO cell contains: - - KEY_LEN [2 bytes] - KEY [KEY_LEN bytes] - HANDSHAKE_AUTH [20 bytes] - SIG [variable, up to end of relay payload] - - The KEY_LEN variable determines the length of the KEY field. - - The KEY field is the ASN1-encoded legacy RSA public key that was also - included in the hidden service descriptor. - - The HANDSHAKE_AUTH field contains the SHA1 digest of (KH | "INTRODUCE"). - - The SIG field contains an RSA signature, using PKCS1 padding, of all - earlier fields. 
- - Older versions of Tor always use a 1024-bit RSA key for these introduction - authentication keys. - -3.1.3. Acknowledging establishment of introduction point [INTRO_ESTABLISHED] - - After setting up an introduction circuit, the introduction point reports its - status back to the hidden service host with an INTRO_ESTABLISHED cell. - - The INTRO_ESTABLISHED cell has the following contents: - - N_EXTENSIONS [1 byte] - N_EXTENSIONS times: - EXT_FIELD_TYPE [1 byte] - EXT_FIELD_LEN [1 byte] - EXT_FIELD [EXT_FIELD_LEN bytes] - - Older versions of Tor send back an empty INTRO_ESTABLISHED cell instead. - Services must accept an empty INTRO_ESTABLISHED cell from a legacy relay. - [The above paragraph is obsolete and refers to a workaround for - now-obsolete Tor relay versions. It is included for historical reasons.] - - The same rules for multiplicity, ordering, and handling unknown types - apply to the extension fields here as described [EST_INTRO] above. - - -3.2. Sending an INTRODUCE1 cell to the introduction point. [SEND_INTRO1] - - In order to participate in the introduction protocol, a client must - know the following: - - * An introduction point for a service. - * The introduction authentication key for that introduction point. - * The introduction encryption key for that introduction point. - - The client sends an INTRODUCE1 cell to the introduction point, - containing an identifier for the service, an identifier for the - encryption key that the client intends to use, and an opaque blob to - be relayed to the hidden service host. - - In reply, the introduction point sends an INTRODUCE_ACK cell back to - the client, either informing it that its request has been delivered, - or that its request will not succeed. - - [TODO: specify what tor should do when receiving a malformed cell. Drop it? - Kill circuit? This goes for all possible cells.] - -3.2.1. 
INTRODUCE1 cell format [FMT_INTRO1] - - When a client is connecting to an introduction point, INTRODUCE1 cells - should be of the form: - - LEGACY_KEY_ID [20 bytes] - AUTH_KEY_TYPE [1 byte] - AUTH_KEY_LEN [2 bytes] - AUTH_KEY [AUTH_KEY_LEN bytes] - N_EXTENSIONS [1 byte] - N_EXTENSIONS times: - EXT_FIELD_TYPE [1 byte] - EXT_FIELD_LEN [1 byte] - EXT_FIELD [EXT_FIELD_LEN bytes] - ENCRYPTED [Up to end of relay payload] - - AUTH_KEY_TYPE is defined as in [EST_INTRO]. Currently, the only value of - AUTH_KEY_TYPE for this cell is an Ed25519 public key [02]. - - The LEGACY_KEY_ID field is used to distinguish between legacy and new-style - INTRODUCE1 cells. In new-style INTRODUCE1 cells, LEGACY_KEY_ID is 20 zero - bytes. Upon receiving an INTRODUCE1 cell, the introduction point checks the - LEGACY_KEY_ID field. If LEGACY_KEY_ID is non-zero, the INTRODUCE1 cell - should be handled as a legacy INTRODUCE1 cell by the intro point. - - Upon receiving an INTRODUCE1 cell, the introduction point checks - whether AUTH_KEY matches the introduction point authentication key for an - active introduction circuit. If so, the introduction point sends an - INTRODUCE2 cell with exactly the same contents to the service, and sends an - INTRODUCE_ACK response to the client. - - (Note that the introduction point does not "clean up" the - INTRODUCE1 cells that it retransmits. Specifically, it does not - change the order or multiplicity of the extensions sent by the - client.) - - The same rules for multiplicity, ordering, and handling unknown types - apply to the extension fields here as described in [EST_INTRO] above. - - -3.2.2. INTRODUCE_ACK cell format. [INTRO_ACK] - - An INTRODUCE_ACK cell has the following fields: - - STATUS [2 bytes] - N_EXTENSIONS [1 byte] - N_EXTENSIONS times: - EXT_FIELD_TYPE [1 byte] - EXT_FIELD_LEN [1 byte] - EXT_FIELD [EXT_FIELD_LEN bytes] - - Recognized status values are: - - [00 00] -- Success: cell relayed to hidden service host. 
- [00 01] -- Failure: service ID not recognized - [00 02] -- Bad message format - [00 03] -- Can't relay cell to service - - The same rules for multiplicity, ordering, and handling unknown types - apply to the extension fields here as described [EST_INTRO] above. - - -3.3. Processing an INTRODUCE2 cell at the hidden service. [PROCESS_INTRO2] - - Upon receiving an INTRODUCE2 cell, the hidden service host checks whether - the AUTH_KEY or LEGACY_KEY_ID field matches the keys for this - introduction circuit. - - The service host then checks whether it has received a cell with these - contents or rendezvous cookie before. If it has, it silently drops it as a - replay. (It must maintain a replay cache for as long as it accepts cells - with the same encryption key. Note that the encryption format below should - be non-malleable.) - - If the cell is not a replay, it decrypts the ENCRYPTED field, - establishes a shared key with the client, and authenticates the whole - contents of the cell as having been unmodified since they left the - client. There may be multiple ways of decrypting the ENCRYPTED field, - depending on the chosen type of the encryption key. Requirements for - an introduction handshake protocol are described in - [INTRO-HANDSHAKE-REQS]. We specify one below in section - [NTOR-WITH-EXTRA-DATA]. 
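The replay check described above can be pictured with a small Python sketch. This is only one possible shape under stated assumptions: the class and method names are hypothetical, the cache stores a SHA3-256 digest of the raw cell body rather than the body itself, and the rendezvous cookie is checked separately once it is known. Nothing here is taken from an existing implementation.

```python
import hashlib


class IntroReplayCache:
    """Hypothetical replay cache scoped to one introduction encryption key."""

    def __init__(self):
        self._cell_digests = set()
        self._cookies = set()

    def seen_cell(self, cell_body):
        """Record a raw INTRODUCE2 body; return True if it was seen before."""
        digest = hashlib.sha3_256(cell_body).digest()
        if digest in self._cell_digests:
            return True
        self._cell_digests.add(digest)
        return False

    def seen_cookie(self, rendezvous_cookie):
        """Record a 20-byte rendezvous cookie; return True if seen before."""
        if len(rendezvous_cookie) != 20:
            raise ValueError("rendezvous cookie must be 20 bytes")
        if rendezvous_cookie in self._cookies:
            return True
        self._cookies.add(rendezvous_cookie)
        return False

    def rotate_key(self):
        """Forget everything once the encryption key stops being accepted."""
        self._cell_digests.clear()
        self._cookies.clear()
```

A service would silently drop any cell for which either check returns True, and would call rotate_key() only after it no longer accepts cells under the old encryption key.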
- - The decrypted plaintext must have the form: - - RENDEZVOUS_COOKIE [20 bytes] - N_EXTENSIONS [1 byte] - N_EXTENSIONS times: - EXT_FIELD_TYPE [1 byte] - EXT_FIELD_LEN [1 byte] - EXT_FIELD [EXT_FIELD_LEN bytes] - ONION_KEY_TYPE [1 bytes] - ONION_KEY_LEN [2 bytes] - ONION_KEY [ONION_KEY_LEN bytes] - NSPEC (Number of link specifiers) [1 byte] - NSPEC times: - LSTYPE (Link specifier type) [1 byte] - LSLEN (Link specifier length) [1 byte] - LSPEC (Link specifier) [LSLEN bytes] - PAD (optional padding) [up to end of plaintext] - - Upon processing this plaintext, the hidden service makes sure that - any required authentication is present in the extension fields, and - then extends a rendezvous circuit to the node described in the LSPEC - fields, using the ONION_KEY to complete the extension. As mentioned - in [BUILDING-BLOCKS], the "TLS-over-TCP, IPv4" and "Legacy node - identity" specifiers must be present. - - As of 0.4.1.1-alpha, clients include both IPv4 and IPv6 link specifiers - in INTRODUCE1 cells. All available addresses SHOULD be included in the - cell, regardless of the address that the client actually used to extend - to the rendezvous point. - - The hidden service should handle invalid or unrecognised link specifiers - the same way as clients do in section 2.5.2.2. In particular, services - SHOULD perform basic validity checks on link specifiers, and SHOULD NOT - reject unrecognised link specifiers, to avoid information leaks. - The list of link specifiers received here SHOULD either be rejected, or - sent verbatim when extending to the rendezvous point, in the same order - received. - - The service MAY reject the list of link specifiers if it is - inconsistent with relay information from the directory, but SHOULD - NOT modify it. - - The ONION_KEY_TYPE field is: - - [01] NTOR: ONION_KEY is 32 bytes long. - - The ONION_KEY field describes the onion key that must be used when - extending to the rendezvous point. 
It must be of a type listed as - supported in the hidden service descriptor. - - The PAD field should be filled with zeros; its size should be chosen - so that the INTRODUCE2 message occupies a fixed maximum size, in - order to hide the length of the encrypted data. (This maximum size is - 490 bytes, since we assume that future Tor implementations will implement - proposal 340 and thus lower the number of bytes that can be contained - in a single relay message.) Note also that current versions of Tor - only pad the INTRODUCE2 message up to 246 bytes. - - Upon receiving a well-formed INTRODUCE2 cell, the hidden service host - will have: - - * The information needed to connect to the client's chosen - rendezvous point. - * The second half of a handshake to authenticate and establish a - shared key with the hidden service client. - * A set of shared keys to use for end-to-end encryption. - - The same rules for multiplicity, ordering, and handling unknown types - apply to the extension fields here as described in [EST_INTRO] above. - - -3.3.1. Introduction handshake encryption requirements [INTRO-HANDSHAKE-REQS] - - When decoding the encrypted information in an INTRODUCE2 cell, a - hidden service host must be able to: - - * Decrypt additional information included in the INTRODUCE2 cell, - including the rendezvous token and the information needed to - extend to the rendezvous point. - - * Establish a set of shared keys for use with the client. - - * Authenticate that the cell has not been modified since the client - generated it. - - Note that the old TAP-derived protocol of the previous hidden service - design achieved the first two requirements, but not the third. - -3.3.2. 
Example encryption handshake: ntor with extra data - [NTOR-WITH-EXTRA-DATA] - - [TODO: relocate this] - - This is a variant of the ntor handshake (see tor-spec.txt, section - 5.1.4; see proposal 216; and see "Anonymity and one-way - authentication in key-exchange protocols" by Goldberg, Stebila, and - Ustaoglu). - - It behaves the same as the ntor handshake, except that, in addition - to negotiating forward secure keys, it also provides a means for - encrypting non-forward-secure data to the server (in this case, to - the hidden service host) as part of the handshake. - - Notation here is as in section 5.1.4 of tor-spec.txt, which defines - the ntor handshake. - - The PROTOID for this variant is "tor-hs-ntor-curve25519-sha3-256-1". - We also use the following tweak values: - - t_hsenc = PROTOID | ":hs_key_extract" - t_hsverify = PROTOID | ":hs_verify" - t_hsmac = PROTOID | ":hs_mac" - m_hsexpand = PROTOID | ":hs_key_expand" - - To make an INTRODUCE1 cell, the client must know a public encryption - key B for the hidden service on this introduction circuit. 
The client - generates a single-use keypair: - - x,X = KEYGEN() - - and computes: - - intro_secret_hs_input = EXP(B,x) | AUTH_KEY | X | B | PROTOID - info = m_hsexpand | N_hs_subcred - hs_keys = KDF(intro_secret_hs_input | t_hsenc | info, S_KEY_LEN+MAC_LEN) - ENC_KEY = hs_keys[0:S_KEY_LEN] - MAC_KEY = hs_keys[S_KEY_LEN:S_KEY_LEN+MAC_KEY_LEN] - - and sends, as the ENCRYPTED part of the INTRODUCE1 cell: - - CLIENT_PK [PK_PUBKEY_LEN bytes] - ENCRYPTED_DATA [Padded to length of plaintext] - MAC [MAC_LEN bytes] - - - Substituting those fields into the INTRODUCE1 cell body format - described in [FMT_INTRO1] above, we have - - LEGACY_KEY_ID [20 bytes] - AUTH_KEY_TYPE [1 byte] - AUTH_KEY_LEN [2 bytes] - AUTH_KEY [AUTH_KEY_LEN bytes] - N_EXTENSIONS [1 bytes] - N_EXTENSIONS times: - EXT_FIELD_TYPE [1 byte] - EXT_FIELD_LEN [1 byte] - EXT_FIELD [EXT_FIELD_LEN bytes] - ENCRYPTED: - CLIENT_PK [PK_PUBKEY_LEN bytes] - ENCRYPTED_DATA [Padded to length of plaintext] - MAC [MAC_LEN bytes] - - - (This format is as documented in [FMT_INTRO1] above, except that here - we describe how to build the ENCRYPTED portion.) - - Here, the encryption key plays the role of B in the regular ntor - handshake, and the AUTH_KEY field plays the role of the node ID. - The CLIENT_PK field is the public key X. The ENCRYPTED_DATA field is - the message plaintext, encrypted with the symmetric key ENC_KEY. The - MAC field is a MAC of all of the cell from the AUTH_KEY through the - end of ENCRYPTED_DATA, using the MAC_KEY value as its key. - - To process this format, the hidden service checks PK_VALID(CLIENT_PK) - as necessary, and then computes ENC_KEY and MAC_KEY as the client did - above, except using EXP(CLIENT_PK,b) in the calculation of - intro_secret_hs_input. The service host then checks whether the MAC is - correct. If it is invalid, it drops the cell. Otherwise, it computes - the plaintext by decrypting ENCRYPTED_DATA. 
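The derivation of ENC_KEY and MAC_KEY above can be sketched in Python as follows. The sketch assumes that KDF is SHAKE-256 and that S_KEY_LEN and MAC_KEY_LEN are both 32, none of which is restated in this section. The curve25519 operation is not performed here: the caller passes in the shared secret, which is EXP(B,x) on the client and EXP(CLIENT_PK,b) on the service. The function name is hypothetical.

```python
import hashlib

PROTOID = b"tor-hs-ntor-curve25519-sha3-256-1"
T_HSENC = PROTOID + b":hs_key_extract"
M_HSEXPAND = PROTOID + b":hs_key_expand"

# Assumed lengths; this section does not restate them.
S_KEY_LEN = 32
MAC_KEY_LEN = 32


def derive_intro_keys(shared_secret, auth_key, client_pk, enc_pk, subcredential):
    """Return (ENC_KEY, MAC_KEY) for the ENCRYPTED part of INTRODUCE1.

    shared_secret is EXP(B,x) for the client or EXP(X,b) for the service;
    client_pk is X, enc_pk is B, subcredential is N_hs_subcred.
    """
    # intro_secret_hs_input = EXP(B,x) | AUTH_KEY | X | B | PROTOID
    intro_secret_hs_input = shared_secret + auth_key + client_pk + enc_pk + PROTOID
    # info = m_hsexpand | N_hs_subcred
    info = M_HSEXPAND + subcredential
    # KDF is assumed to be SHAKE-256.
    hs_keys = hashlib.shake_256(intro_secret_hs_input + T_HSENC + info).digest(
        S_KEY_LEN + MAC_KEY_LEN)
    return hs_keys[:S_KEY_LEN], hs_keys[S_KEY_LEN:S_KEY_LEN + MAC_KEY_LEN]
```

Because both sides feed the same byte string into the KDF, the client and the service arrive at the same two keys whenever their curve25519 results agree.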
- - The hidden service host now completes the service side of the - extended ntor handshake, as described in tor-spec.txt section 5.1.4, - with the modified PROTOID as given above. To be explicit, the hidden - service host generates a keypair of y,Y = KEYGEN(), and uses its - introduction point encryption key 'b' to compute: - - intro_secret_hs_input = EXP(X,b) | AUTH_KEY | X | B | PROTOID - info = m_hsexpand | N_hs_subcred - hs_keys = KDF(intro_secret_hs_input | t_hsenc | info, S_KEY_LEN+MAC_LEN) - HS_DEC_KEY = hs_keys[0:S_KEY_LEN] - HS_MAC_KEY = hs_keys[S_KEY_LEN:S_KEY_LEN+MAC_KEY_LEN] - - (The above are used to check the MAC and then decrypt the - encrypted data.) - - rend_secret_hs_input = EXP(X,y) | EXP(X,b) | AUTH_KEY | B | X | Y | PROTOID - NTOR_KEY_SEED = MAC(rend_secret_hs_input, t_hsenc) - verify = MAC(rend_secret_hs_input, t_hsverify) - auth_input = verify | AUTH_KEY | B | Y | X | PROTOID | "Server" - AUTH_INPUT_MAC = MAC(auth_input, t_hsmac) - - (The above are used to finish the ntor handshake.) - - The server's handshake reply is: - - SERVER_PK Y [PK_PUBKEY_LEN bytes] - AUTH AUTH_INPUT_MAC [MAC_LEN bytes] - - These fields will be sent to the client in a RENDEZVOUS1 cell using the - HANDSHAKE_INFO element (see [JOIN_REND]). - - The hidden service host now also knows the keys generated by the - handshake, which it will use to encrypt and authenticate data - end-to-end between the client and the server. These keys are as - computed in tor-spec.txt section 5.1.4, except that instead of using - AES-128 and SHA1 for this hop, we use AES-256 and SHA3-256. - -3.4. Authentication during the introduction phase. [INTRO-AUTH] - - Hidden services may restrict access only to authorized users. - One mechanism to do so is the credential mechanism, where only users who - know the credential for a hidden service may connect at all. - - There is one defined authentication type: `ed25519`. - - -3.4.1. Ed25519-based authentication `ed25519`. 
- - (NOTE: This section is not implemented by Tor. It is likely - that we would want to change its design substantially before - deploying any implementation. At the very least, we would - want to bind these extensions to a single onion service, to - prevent replays. We might also want to look for ways to limit - the number of keys a user needs to have.) - - To authenticate with an Ed25519 private key, the user must include an - extension field in the encrypted part of the INTRODUCE1 cell with an - EXT_FIELD_TYPE type of [02] and the contents: - - Nonce [16 bytes] - Pubkey [32 bytes] - Signature [64 bytes] - - Nonce is a random value. Pubkey is the public key that will be used - to authenticate. [TODO: should this be an identifier for the public - key instead?] Signature is the signature, using Ed25519, of: - - "hidserv-userauth-ed25519" - Nonce (same as above) - Pubkey (same as above) - AUTH_KEY (As in the INTRODUCE1 cell) - - The hidden service host checks this by seeing whether it recognizes - and would accept a signature from the provided public key. If it - would, then it checks whether the signature is correct. If it is, - then the correct user has authenticated. - - Replay prevention on the whole cell is sufficient to prevent replays - on the authentication. - - Users SHOULD NOT use the same public key with multiple hidden - services. - -4. The rendezvous protocol - - Before connecting to a hidden service, the client first builds a - circuit to an arbitrarily chosen Tor node (known as the rendezvous - point), and sends an ESTABLISH_RENDEZVOUS cell. The hidden service - later connects to the same node and sends a RENDEZVOUS cell. Once - this has occurred, the relay forwards the contents of the RENDEZVOUS - cell to the client, and joins the two circuits together. 
- - Single Onion Services attempt to build a non-anonymous single-hop circuit, - but use an anonymous 3-hop circuit if: - - * the rend point is on an address that is configured as unreachable via - a direct connection, or - * the initial attempt to connect to the rend point over a single-hop - circuit fails, and they are retrying the rend point connection. - -4.1. Establishing a rendezvous point [EST_REND_POINT] - - The client sends the rendezvous point a RELAY_COMMAND_ESTABLISH_RENDEZVOUS - cell containing a 20-byte value. - - RENDEZVOUS_COOKIE [20 bytes] - - Rendezvous points MUST ignore any extra bytes in an - ESTABLISH_RENDEZVOUS cell. (Older versions of Tor did not.) - - The rendezvous cookie is an arbitrary 20-byte value, chosen randomly - by the client. The client SHOULD choose a new rendezvous cookie for - each new connection attempt. If the rendezvous cookie is already in - use on an existing circuit, the rendezvous point should reject it and - destroy the circuit. - - Upon receiving an ESTABLISH_RENDEZVOUS cell, the rendezvous point associates - the cookie with the circuit on which it was sent. It replies to the client - with an empty RENDEZVOUS_ESTABLISHED cell to indicate success. Clients MUST - ignore any extra bytes in a RENDEZVOUS_ESTABLISHED cell. - - The client MUST NOT use the circuit which sent the cell for any - purpose other than rendezvous with the given location-hidden service. - - The client should establish a rendezvous point BEFORE trying to - connect to a hidden service. - -4.2. Joining to a rendezvous point [JOIN_REND] - - To complete a rendezvous, the hidden service host builds a circuit to - the rendezvous point and sends a RENDEZVOUS1 cell containing: - - RENDEZVOUS_COOKIE [20 bytes] - HANDSHAKE_INFO [variable; depends on handshake type - used.] - - where RENDEZVOUS_COOKIE is the cookie suggested by the client during the - introduction (see [PROCESS_INTRO2]) and HANDSHAKE_INFO is defined in - [NTOR-WITH-EXTRA-DATA]. 
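The RENDEZVOUS1 body above, combined with the SERVER_PK and AUTH fields of the handshake reply in [NTOR-WITH-EXTRA-DATA], can be sketched in Python like this. PK_PUBKEY_LEN = 32 and MAC_LEN = 32 are assumptions not restated in this section, and the function names are hypothetical.

```python
COOKIE_LEN = 20
# Assumed lengths; this section does not restate them.
PK_PUBKEY_LEN = 32
MAC_LEN = 32


def build_rendezvous1(cookie, server_pk, auth):
    """Return RENDEZVOUS_COOKIE | HANDSHAKE_INFO (SERVER_PK | AUTH)."""
    if len(cookie) != COOKIE_LEN:
        raise ValueError("RENDEZVOUS_COOKIE must be 20 bytes")
    if len(server_pk) != PK_PUBKEY_LEN or len(auth) != MAC_LEN:
        raise ValueError("unexpected SERVER_PK or AUTH length")
    return cookie + server_pk + auth


def split_rendezvous1(body):
    """Rendezvous point view: return (cookie, opaque HANDSHAKE_INFO)."""
    if len(body) < COOKIE_LEN:
        raise ValueError("RENDEZVOUS1 body too short")
    return body[:COOKIE_LEN], body[COOKIE_LEN:]


def parse_handshake_info(handshake_info):
    """Client view: return (SERVER_PK, AUTH) from HANDSHAKE_INFO."""
    if len(handshake_info) < PK_PUBKEY_LEN + MAC_LEN:
        raise ValueError("HANDSHAKE_INFO too short")
    server_pk = handshake_info[:PK_PUBKEY_LEN]
    auth = handshake_info[PK_PUBKEY_LEN:PK_PUBKEY_LEN + MAC_LEN]
    return server_pk, auth
```

The rendezvous point only needs split_rendezvous1(): it matches the cookie and forwards HANDSHAKE_INFO without interpreting it.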
- - If the cookie matches the rendezvous cookie set on any - not-yet-connected circuit on the rendezvous point, the rendezvous - point connects the two circuits, and sends a RENDEZVOUS2 cell to the - client containing the HANDSHAKE_INFO field of the RENDEZVOUS1 cell. - - Upon receiving the RENDEZVOUS2 cell, the client verifies that HANDSHAKE_INFO - correctly completes a handshake. To do so, the client parses SERVER_PK from - HANDSHAKE_INFO and reverses the final operations of section - [NTOR-WITH-EXTRA-DATA] as shown here: - - rend_secret_hs_input = EXP(Y,x) | EXP(B,x) | AUTH_KEY | B | X | Y | PROTOID - NTOR_KEY_SEED = MAC(rend_secret_hs_input, t_hsenc) - verify = MAC(rend_secret_hs_input, t_hsverify) - auth_input = verify | AUTH_KEY | B | Y | X | PROTOID | "Server" - AUTH_INPUT_MAC = MAC(auth_input, t_hsmac) - - Finally the client verifies that the received AUTH field of HANDSHAKE_INFO - is equal to the computed AUTH_INPUT_MAC. - - Now both parties use the handshake output to derive shared keys for use on - the circuit as specified in the section below: - -4.2.1. Key expansion - - The hidden service and its client need to derive crypto keys from the - NTOR_KEY_SEED part of the handshake output. To do so, they use the KDF - construction as follows: - - K = KDF(NTOR_KEY_SEED | m_hsexpand, HASH_LEN * 2 + S_KEY_LEN * 2) - - The first HASH_LEN bytes of K form the forward digest Df; the next HASH_LEN - bytes form the backward digest Db; the next S_KEY_LEN bytes form Kf, and the - final S_KEY_LEN bytes form Kb. Excess bytes from K are discarded. - - Subsequently, the rendezvous point passes relay cells, unchanged, from each - of the two circuits to the other.
When Alice's OP sends RELAY cells along - the circuit, it authenticates with Df, and encrypts them with the Kf, then - with all of the keys for the ORs in Alice's side of the circuit; and when - Alice's OP receives RELAY cells from the circuit, it decrypts them with the - keys for the ORs in Alice's side of the circuit, then decrypts them with Kb, - and checks integrity with Db. Bob's OP does the same, with Kf and Kb - interchanged. - - [TODO: Should we encrypt HANDSHAKE_INFO as we did INTRODUCE2 - contents? It's not necessary, but it could be wise. Similarly, we - should make it extensible.] - -4.3. Using legacy hosts as rendezvous points - - [This section is obsolete and refers to a workaround for now-obsolete Tor - relay versions. It is included for historical reasons.] - - The behavior of ESTABLISH_RENDEZVOUS is unchanged from older versions - of this protocol, except that relays should now ignore unexpected - bytes at the end. - - Old versions of Tor required that RENDEZVOUS cell payloads be exactly - 168 bytes long. All shorter rendezvous payloads should be padded to - this length with random bytes, to make them difficult to distinguish from - older protocols at the rendezvous point. - - Relays older than 0.2.9.1 should not be used for rendezvous points by next - generation onion services because they enforce too-strict length checks to - rendezvous cells. Hence the "HSRend" protocol from proposal#264 should be - used to select relays for rendezvous points. - -5. Encrypting data between client and host - - A successfully completed handshake, as embedded in the - INTRODUCE/RENDEZVOUS cells, gives the client and hidden service host - a shared set of keys Kf, Kb, Df, Db, which they use for sending - end-to-end traffic encryption and authentication as in the regular - Tor relay encryption protocol, applying encryption with these keys - before other encryption, and decrypting with these keys before other - decryption. 
The client encrypts with Kf and decrypts with Kb; the - service host does the opposite. - -6. Encoding onion addresses [ONIONADDRESS] - - The onion address of a hidden service includes its identity public key, a - version field and a basic checksum. All this information is then base32 - encoded as shown below: - - onion_address = base32(PUBKEY | CHECKSUM | VERSION) + ".onion" - CHECKSUM = H(".onion checksum" | PUBKEY | VERSION)[:2] - - where: - - PUBKEY is the 32 bytes ed25519 master pubkey of the hidden service. - - VERSION is a one byte version field (default value '\x03') - - ".onion checksum" is a constant string - - CHECKSUM is truncated to two bytes before inserting it in onion_address - - Here are a few example addresses: - - pg6mmjiyjmcrsslvykfwnntlaru7p5svn6y2ymmju6nubxndf4pscryd.onion - sp3k262uwy4r2k3ycr5awluarykdpag6a7y33jxop4cs2lu5uz5sseqd.onion - xa4r2iadxm55fbnqgwwi5mymqdcofiu3w6rpbtqn7b2dyn7mgwj64jyd.onion - - For more information about this encoding, please see our discussion thread - at [ONIONADDRESS-REFS]. - -7. Open Questions: - - Scaling hidden services is hard. There are on-going discussions that - you might be able to help with. See [SCALING-REFS]. - - How can we improve the HSDir unpredictability design proposed in - [SHAREDRANDOM]? See [SHAREDRANDOM-REFS] for discussion. - - How can hidden service addresses become memorable while retaining - their self-authenticating and decentralized nature? See - [HUMANE-HSADDRESSES-REFS] for some proposals; many more are possible. - - Hidden Services are pretty slow. Both because of the lengthy setup - procedure and because the final circuit has 6 hops. How can we make - the Hidden Service protocol faster? See [PERFORMANCE-REFS] for some - suggestions. 
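The address encoding from section 6 can be sketched as below. This is a minimal sketch, assuming H is SHA3-256 (the hash is defined outside this section) and that the base32 output is written in lowercase, as in the example addresses. The function names are hypothetical.

```python
import base64
import hashlib

VERSION = b"\x03"  # one-byte version field
SUFFIX = ".onion"


def onion_checksum(pubkey, version=VERSION):
    # CHECKSUM = H(".onion checksum" | PUBKEY | VERSION)[:2]
    # Assumption: H is SHA3-256.
    return hashlib.sha3_256(b".onion checksum" + pubkey + version).digest()[:2]


def encode_onion_address(pubkey, version=VERSION):
    """onion_address = base32(PUBKEY | CHECKSUM | VERSION) + ".onion"."""
    if len(pubkey) != 32:
        raise ValueError("PUBKEY must be 32 bytes")
    raw = pubkey + onion_checksum(pubkey, version) + version
    # 35 bytes encode to exactly 56 base32 characters, so no padding appears.
    return base64.b32encode(raw).decode("ascii").lower() + SUFFIX


def decode_onion_address(address):
    """Return (PUBKEY, VERSION), raising ValueError on a malformed address."""
    if not address.endswith(SUFFIX):
        raise ValueError("missing .onion suffix")
    label = address[: -len(SUFFIX)]
    if len(label) != 56:
        raise ValueError("unexpected address length")
    raw = base64.b32decode(label.upper())
    pubkey, checksum, version = raw[:32], raw[32:34], raw[34:35]
    if checksum != onion_checksum(pubkey, version):
        raise ValueError("checksum mismatch")
    return pubkey, version
```

Because VERSION is the last byte and equals 3, every address produced this way ends in "d", which matches the three example addresses in section 6.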
- -References: - -[KEYBLIND-REFS]: - https://trac.torproject.org/projects/tor/ticket/8106 - https://lists.torproject.org/pipermail/tor-dev/2012-September/004026.html - -[KEYBLIND-PROOF]: - https://lists.torproject.org/pipermail/tor-dev/2013-December/005943.html - -[SHAREDRANDOM-REFS]: - https://gitweb.torproject.org/torspec.git/tree/proposals/250-commit-reveal-consensus.txt - https://trac.torproject.org/projects/tor/ticket/8244 - -[SCALING-REFS]: - https://lists.torproject.org/pipermail/tor-dev/2013-October/005556.html - -[HUMANE-HSADDRESSES-REFS]: - https://gitweb.torproject.org/torspec.git/blob/HEAD:/proposals/ideas/xxx-onion-nyms.txt - http://archives.seul.org/or/dev/Dec-2011/msg00034.html - -[PERFORMANCE-REFS]: - "Improving Efficiency and Simplicity of Tor circuit - establishment and hidden services" by Overlier, L., and - P. Syverson - - [TODO: Need more here! Do we have any? :( ] - -[ATTACK-REFS]: - "Trawling for Tor Hidden Services: Detection, Measurement, - Deanonymization" by Alex Biryukov, Ivan Pustogarov, - Ralf-Philipp Weinmann - - "Locating Hidden Servers" by Lasse Øverlier and Paul - Syverson - -[ED25519-REFS]: - "High-speed high-security signatures" by Daniel - J. Bernstein, Niels Duif, Tanja Lange, Peter Schwabe, and - Bo-Yin Yang. http://cr.yp.to/papers.html#ed25519 - -[ED25519-B-REF]: - https://tools.ietf.org/html/draft-josefsson-eddsa-ed25519-03#section-5: - -[PRNG-REFS]: - http://projectbullrun.org/dual-ec/ext-rand.html - https://lists.torproject.org/pipermail/tor-dev/2015-November/009954.html - -[SRV-TP-REFS]: - https://lists.torproject.org/pipermail/tor-dev/2016-April/010759.html - -[VANITY-REFS]: - https://github.com/Yawning/horse25519 - -[ONIONADDRESS-REFS]: - https://lists.torproject.org/pipermail/tor-dev/2017-January/011816.html - -[TORSION-REFS]: - https://lists.torproject.org/pipermail/tor-dev/2017-April/012164.html - https://getmonero.org/2017/05/17/disclosure-of-a-major-bug-in-cryptonote-based-currencies.html - -Appendix A. 
Signature scheme with key blinding [KEYBLIND] - -A.1. Key derivation overview - - As described in [IMD:DIST] and [SUBCRED] above, we require a "key - blinding" system that works (roughly) as follows: - - There is a master keypair (sk, pk). - - Given the keypair and a nonce n, there is a derivation function - that gives a new blinded keypair (sk_n, pk_n). This keypair can - be used for signing. - - Given only the public key and the nonce, there is a function - that gives pk_n. - - Without knowing pk, it is not possible to derive pk_n; without - knowing sk, it is not possible to derive sk_n. - - It's possible to check that a signature was made with sk_n while - knowing only pk_n. - - Someone who sees a large number of blinded public keys and - signatures made using those public keys can't tell which - signatures and which blinded keys were derived from the same - master keypair. - - You can't forge signatures. - - [TODO: Insert a more rigorous definition and better references.] - -A.2. Tor's key derivation scheme - - We propose the following scheme for key blinding, based on Ed25519. - - (This is an ECC group, so remember that scalar multiplication is the - trapdoor function, and it's defined in terms of iterated point - addition. See the Ed25519 paper [Reference ED25519-REFS] for a fairly - clear writeup.) - - Let B be the ed25519 basepoint as found in section 5 of [ED25519-B-REF]: - - B = (15112221349535400772501151409588531511454012693041857206046113283949847762202, - 46316835694926478169428394003475163141307993866256225615783033603165251855960) - - Assume B has prime order l, so lB=0. Let a master keypair be written as - (a,A), where a is the private key and A is the public key (A=aB). 
- - To derive the key for a nonce N and an optional secret s, compute the - blinding factor like this: - - h = H(BLIND_STRING | A | s | B | N) - BLIND_STRING = "Derive temporary signing key" | INT_1(0) - N = "key-blind" | INT_8(period-number) | INT_8(period_length) - B = "(1511[...]2202, 4631[...]5960)" - - then clamp the blinding factor 'h' according to the ed25519 spec: - - h[0] &= 248; - h[31] &= 63; - h[31] |= 64; - - and do the key derivation as follows: - - private key for the period: - - a' = h a mod l - RH' = SHA-512(RH_BLIND_STRING | RH)[:32] - RH_BLIND_STRING = "Derive temporary signing key hash input" - - public key for the period: - - A' = h A = (ha)B - - Generating a signature of M: given a deterministic random-looking r - (see EdDSA paper), take R=rB, S=r+hash(R,A',M)ah mod l. Send signature - (R,S) and public key A'. - - Verifying the signature: Check whether SB = R+hash(R,A',M)A'. - - (If the signature is valid, - SB = (r + hash(R,A',M)ah)B - = rB + (hash(R,A',M)ah)B - = R + hash(R,A',M)A' ) - - This boils down to regular Ed25519 with key pair (a', A'). - - See [KEYBLIND-REFS] for an extensive discussion on this scheme and - possible alternatives. Also, see [KEYBLIND-PROOF] for a security - proof of this scheme. - -Appendix B. Selecting nodes [PICKNODES] - - Picking introduction points - Picking rendezvous points - Building paths - Reusing circuits - - (TODO: This needs a writeup) - -Appendix C. Recommendations for searching for vanity .onions [VANITY] - - EDITORIAL NOTE: The author thinks that it's silly to brute-force the - keyspace for a key that, when base-32 encoded, spells out the name of - your website. It also feels a bit dangerous to me. 
If you train your - users to connect to - - llamanymityx4fi3l6x2gyzmtmgxjyqyorj9qsb5r543izcwymle.onion - - I worry that you're making it easier for somebody to trick them into - connecting to - - llamanymityb4sqi0ta0tsw6uovyhwlezkcrmczeuzdvfauuemle.onion - - Nevertheless, people are probably going to try to do this, so here's a - decent algorithm to use. - - To search for a public key with some criterion X: - - Generate a random (sk,pk) pair. - - While pk does not satisfy X: - - Add the number 8 to sk - Add the point 8*B to pk - - Return sk, pk. - - We add 8 and 8*B, rather than 1 and B, so that sk is always a valid - Curve25519 private key, with the lowest 3 bits equal to 0. - - This algorithm is safe [source: djb, personal communication] [TODO: - Make sure I understood correctly!] so long as only the final (sk,pk) - pair is used, and all previous values are discarded. - - To parallelize this algorithm, start with an independent (sk,pk) pair - generated for each independent thread, and let each search proceed - independently. - - See [VANITY-REFS] for a reference implementation of this vanity .onion - search scheme. - -Appendix D. Numeric values reserved in this document - - [TODO: collect all the lists of commands and values mentioned above] - -Appendix E. Reserved numbers - - We reserve these certificate type values for Ed25519 certificates: - - [08] short-term descriptor signing key, signed with blinded - public key. (Section 2.4) - [09] intro point authentication key, cross-certifying the descriptor - signing key. (Section 2.5) - [0B] ed25519 key derived from the curve25519 intro point encryption key, - cross-certifying the descriptor signing key. (Section 2.5) - - Note: The value "0A" is skipped because it's reserved for the onion key - cross-certifying ntor identity key from proposal 228. - -Appendix F. 
Hidden service directory format [HIDSERVDIR-FORMAT] - - This appendix section specifies the contents of the HiddenServiceDir directory: - - - "hostname" [FILE] - - This file contains the onion address of the onion service. - - - "private_key_ed25519" [FILE] - - This file contains the private master ed25519 key of the onion service. - [TODO: Offline keys] - - - "./authorized_clients/" [DIRECTORY] - "./authorized_clients/alice.auth" [FILE] - "./authorized_clients/bob.auth" [FILE] - "./authorized_clients/charlie.auth" [FILE] - - If client authorization is enabled, this directory MUST contain a ".auth" - file for each authorized client. Each such file contains the public key of - the respective client. The files are transmitted to the service operator by - the client. - - See section [CLIENT-AUTH-MGMT] for more details and the format of the client file. - - (NOTE: client authorization is implemented as of 0.3.5.1-alpha.) - -Appendix G. Managing authorized client data [CLIENT-AUTH-MGMT] - - Hidden services and clients can configure their authorized client data either - using the torrc, or using the control port. This section presents a suggested - scheme for configuring client authorization. Please see appendix - [HIDSERVDIR-FORMAT] for more information about relevant hidden service files. - - (NOTE: client authorization is implemented as of 0.3.5.1-alpha.) - - G.1. Configuring client authorization using torrc - - G.1.1. Hidden Service side configuration - - A hidden service that wants to enable client authorization, needs to - populate the "authorized_clients/" directory of its HiddenServiceDir - directory with the ".auth" files of its authorized clients. - - When Tor starts up with a configured onion service, Tor checks its - /authorized_clients/ directory for ".auth" files, and if - any recognized and parseable such files are found, then client - authorization becomes activated for that service. - - G.1.2. 
Service-side bookkeeping - - This section contains more details on how onion services should keep - track of their client ".auth" files. - - For the "descriptor" authentication type, the ".auth" file MUST contain - the x25519 public key of that client. Here is a suggested file format: - - :: - - Here is an example: - - descriptor:x25519:OM7TGIVRYMY6PFX6GAC6ATRTA5U6WW6U7A4ZNHQDI6OVL52XVV2Q - - Tor SHOULD ignore lines it does not recognize. - Tor SHOULD ignore files that don't use the ".auth" suffix. - - G.1.3. Client side configuration - - A client who wants to register client authorization data for onion - services needs to add the following line to their torrc to indicate the - directory which hosts ".auth_private" files containing client-side - credentials for onion services: - - ClientOnionAuthDir - - That directory contains a file with the suffix ".auth_private" for each onion - service the client is authorized with. Tor should scan the directory for - ".auth_private" files to find which onion services require client - authorization from this client. - - For the "descriptor" auth-type, a ".auth_private" file contains the - private x25519 key: - - :descriptor:x25519: - - The keypair used for client authorization is created by a third-party tool; - the public key needs to be transferred to the service operator - in a secure out-of-band way. The third-party tool SHOULD add appropriate - headers to the private key file to ensure that users won't accidentally - give out their private key. - - G.2. Configuring client authorization using the control port - - G.2.1. Service side - - A hidden service also has the option to configure authorized clients - using the control port. The idea is that hidden service operators can use - controller utilities that manage their access control instead of using - the filesystem to register client keys.
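As an aside on the filesystem format from G.1.2, a service-side reader for one ".auth" line might look like the sketch below. It is based only on the example line shown in G.1.2; the field names, the function name, and the choice to return None for unrecognized lines are assumptions made for illustration.

```python
import base64


def parse_auth_line(line):
    """Parse a line such as 'descriptor:x25519:BASE32KEY'.

    Returns (auth_type, key_type, key_bytes), or None when the line is not
    recognized, since unrecognized lines are meant to be ignored.
    """
    fields = line.strip().split(":")
    if len(fields) != 3:
        return None
    auth_type, key_type, encoded = fields
    if auth_type != "descriptor" or key_type != "x25519":
        return None
    # The example key is unpadded base32; restore padding before decoding.
    padded = encoded.upper() + "=" * (-len(encoded) % 8)
    try:
        key = base64.b32decode(padded)
    except ValueError:
        return None
    if len(key) != 32:  # x25519 public keys are 32 bytes
        return None
    return auth_type, key_type, key
```

Returning None rather than raising keeps the caller's loop over a directory of ".auth" files simple: it can skip anything that does not parse.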
- - Specifically, we require a new control port command ADD_ONION_CLIENT_AUTH - which is able to register x25519/ed25519 public keys tied to a specific - authorized client. - [XXX figure out control port command format] - - Hidden services who use the control port interface for client auth need - to perform their own key management. - - G.2.2. Client side - - There should also be a control port interface for clients to register - authorization data for hidden services without having to use the - torrc. It should allow both generation of client authorization private - keys, and also to import client authorization data provided by a hidden - service - - This way, Tor Browser can present "Generate client auth keys" and "Import - client auth keys" dialogs to users when they try to visit a hidden service - that is protected by client authorization. - - Specifically, we require two new control port commands: - IMPORT_ONION_CLIENT_AUTH_DATA - GENERATE_ONION_CLIENT_AUTH_DATA - which import and generate client authorization data respectively. - - [XXX how does key management work here?] - [XXX what happens when people use both the control port interface and the - filesystem interface?] - -Appendix F. Two methods for managing revision counters. - - Implementations MAY generate revision counters in any way they please, - so long as they are monotonically increasing over the lifetime of each - blinded public key. But to avoid fingerprinting, implementors SHOULD - choose a strategy also used by other Tor implementations. Here we - describe two, and additionally list some strategies that implementors - should NOT use. - - F.1. Increment-on-generation - - This is the simplest strategy, and the one used by Tor through at - least version 0.3.4.0-alpha. - - Whenever using a new blinded key, the service records the - highest revision counter it has used with that key. 
When generating - a descriptor, the service uses the smallest non-negative number - higher than any number it has already used. - - In other words, the revision counters under this system start fresh - with each blinded key as 0, 1, 2, 3, and so on. - - F.2. Encrypted time in period - - This scheme is what we recommend for situations when multiple - service instances need to coordinate their revision counters, - without an actual coordination mechanism. - - Let T be the number of seconds that have elapsed since the descriptor - became valid, plus 1. (T must be at least 1.) Implementations can use the - number of seconds since the start time of the shared random protocol run - that corresponds to this descriptor. - - Let S be a secret that all the service providers share. For - example, it could be the private signing key corresponding to the - current blinded key. - - Let K be an AES-256 key, generated as - K = H("rev-counter-generation" | S) - - Use K, and AES in counter mode with IV=0, to generate a stream of T - * 2 bytes. Consider these bytes as a sequence of T 16-bit - little-endian words. Add these words. - - Let the sum of these words be the revision counter. - - - Cryptowiki attributes roughly this scheme to G. Bebek in: - - G. Bebek. Anti-tamper database research: Inference control - techniques. Technical Report EECS 433 Final Report, Case - Western Reserve University, November 2002. - - Although we believe it is suitable for use in this application, it - is not a perfect order-preserving encryption algorithm (and all - order-preserving encryption has weaknesses). Please think twice - before using it for anything else. - - (This scheme can be optimized pretty easily by caching the encryption of - X*1, X*2, X*3, etc for some well chosen X.) - - For a slow reference implementation, see src/test/ope_ref.py in the - Tor source repository. [XXXX for now, see the same file in Nick's - "ope_hax" branch -- it isn't merged yet.] 
- - This scheme is not currently implemented in Tor. - - F.X. Some revision-counter strategies to avoid - - Though it might be tempting, implementations SHOULD NOT use the - current time or the current time within the period directly as their - revision counter -- doing so leaks their view of the current time, - which can be used to link the onion service to other services run on - the same host. - - Similarly, implementations SHOULD NOT let the revision counter - increase forever without resetting it -- doing so links the service - across changes in the blinded public key. - -Appendix G. Test vectors - - G.1. Test vectors for hs-ntor / NTOR-WITH-EXTRA-DATA - - Here is a set of test values for the hs-ntor handshake, called - [NTOR-WITH-EXTRA-DATA] in this document. They were generated by - instrumenting Tor's code to dump the values for an INTRODUCE/RENDEZVOUS - handshake, and then by running that code on a Chutney network. - - We assume an onion service with: - - KP_hs_ipd_sid = 34E171E4358E501BFF21ED907E96AC6B - FEF697C779D040BBAF49ACC30FC5D21F - KP_hss_ntor = 8E5127A40E83AABF6493E41F142B6EE3 - 604B85A3961CD7E38D247239AFF71979 - KS_hss_ntor = A0ED5DBF94EEB2EDB3B514E4CF6ABFF6 - 022051CC5F103391F1970A3FCD15296A - N_hs_subcred = 0085D26A9DEBA252263BF0231AEAC59B - 17CA11BAD8A218238AD6487CBAD68B57 - - The client wants to make an INTRODUCE request. It generates - the following header (everything before the ENCRYPTED portion) - of its INTRODUCE1 cell: - - H = 000000000000000000000000000000000000000002002034E171E4358E501BFF - 21ED907E96AC6BFEF697C779D040BBAF49ACC30FC5D21F00 - - It generates the following plaintext body to encrypt. (This - is the "decrypted plaintext body" from [PROCESS_INTRO2].)
- - P = 6BD364C12638DD5C3BE23D76ACA05B04E6CE932C0101000100200DE6130E4FCA - C4EDDA24E21220CC3EADAE403EF6B7D11C8273AC71908DE565450300067F0000 - 0113890214F823C4F8CC085C792E0AEE0283FE00AD7520B37D0320728D5DF39B - 7B7077A0118A900FF4456C382F0041300ACF9C58E51C392795EF870000000000 - 0000000000000000000000000000000000000000000000000000000000000000 - 000000000000000000000000000000000000000000000000000000000000 - - (Note! This should in fact be padded to be longer; when these - test vectors were generated, the target INTRODUCE1 length in C - Tor was needlessly short.) - - The client now begins the hs-ntor handshake. It generates - a curve25519 keypair: - - x = 60B4D6BF5234DCF87A4E9D7487BDF3F4 - A69B6729835E825CA29089CFDDA1E341 - X = BF04348B46D09AED726F1D66C618FDEA - 1DE58E8CB8B89738D7356A0C59111D5D - - Then it calculates: - - ENC_KEY = 9B8917BA3D05F3130DACCE5300C3DC27 - F6D012912F1C733036F822D0ED238706 - MAC_KEY = FC4058DA59D4DF61E7B40985D122F502 - FD59336BC21C30CAF5E7F0D4A2C38FD5 - - With these, it encrypts the plaintext body P with ENC_KEY, getting - an encrypted value C. It computes MAC(MAC_KEY, H | X | C), - getting a MAC value M. It then assembles the final INTRODUCE1 - body as H | X | C | M: - - 000000000000000000000000000000000000000002002034E171E4358E501BFF - 21ED907E96AC6BFEF697C779D040BBAF49ACC30FC5D21F00BF04348B46D09AED - 726F1D66C618FDEA1DE58E8CB8B89738D7356A0C59111D5DADBECCCB38E37830 - 4DCC179D3D9E437B452AF5702CED2CCFEC085BC02C4C175FA446525C1B9D5530 - 563C362FDFFB802DAB8CD9EBC7A5EE17DA62E37DEEB0EB187FBB48C63298B0E8 - 3F391B7566F42ADC97C46BA7588278273A44CE96BC68FFDAE31EF5F0913B9A9C - 7E0F173DBC0BDDCD4ACB4C4600980A7DDD9EAEC6E7F3FA3FC37CD95E5B8BFB3E - 35717012B78B4930569F895CB349A07538E42309C993223AEA77EF8AEA64F25D - DEE97DA623F1AEC0A47F150002150455845C385E5606E41A9A199E7111D54EF2 - D1A51B7554D8B3692D85AC587FB9E69DF990EFB776D8 - - Later the service receives that body in an INTRODUCE2 cell. 
It - processes it according to the hs-ntor handshake, and recovers - the client's plaintext P. To continue the hs-ntor handshake, - the service chooses a curve25519 keypair: - - y = 68CB5188CA0CD7924250404FAB54EE13 - 92D3D2B9C049A2E446513875952F8F55 - Y = 8FBE0DB4D4A9C7FF46701E3E0EE7FD05 - CD28BE4F302460ADDEEC9E93354EE700 - - From this and the client's input, it computes: - - AUTH_INPUT_MAC = 4A92E8437B8424D5E5EC279245D5C72B - 25A0327ACF6DAF902079FCB643D8B208 - NTOR_KEY_SEED = 4D0C72FE8AFF35559D95ECC18EB5A368 - 83402B28CDFD48C8A530A5A3D7D578DB - - The service sends back Y | AUTH_INPUT_MAC in its RENDEZVOUS1 cell - body. From these, the client finishes the handshake, validates - AUTH_INPUT_MAC, and computes the same NTOR_KEY_SEED. - - Now that both parties have the same NTOR_KEY_SEED, they can derive - the shared key material they will use for their circuit. diff --git a/socks-extensions.txt b/socks-extensions.txt deleted file mode 100644 index c35069d..0000000 --- a/socks-extensions.txt +++ /dev/null @@ -1,175 +0,0 @@ - - Tor's extensions to the SOCKS protocol - -Table of Contents - - 1. Overview - 1.1. Extent of support - 2. Name lookup - 3. Other command extensions. - 4. HTTP-resistance - 5. Optimistic data - 6. Extended error codes - -1. Overview - - The SOCKS protocol provides a generic interface for TCP proxies. Client - software connects to a SOCKS server via TCP, and requests a TCP connection - to another address and port. The SOCKS server establishes the connection, - and reports success or failure to the client. After the connection has - been established, the client application uses the TCP stream as usual. - - Tor supports SOCKS4 as defined in [1], SOCKS4A as defined in [2], and - SOCKS5 as defined in [3] and [4]. - - The stickiest issue for Tor in supporting clients, in practice, is forcing - DNS lookups to occur at the OR side: if clients do their own DNS lookup, - the DNS server can learn which addresses the client wants to reach. 
- SOCKS4 supports addressing by IPv4 address; SOCKS4A is a kludge on top of - SOCKS4 to allow addressing by hostname; SOCKS5 supports IPv4, IPv6, and - hostnames. - -1.1. Extent of support - - Tor supports the SOCKS4, SOCKS4A, and SOCKS5 standards, except as follows: - - BOTH: - - The BIND command is not supported. - - SOCKS4,4A: - - SOCKS4 usernames are used to implement stream isolation. - - SOCKS5: - - The (SOCKS5) "UDP ASSOCIATE" command is not supported. - - SOCKS5 BIND command is not supported. - - IPv6 is not supported in CONNECT commands. - - SOCKS5 GSSAPI subnegotiation is not supported. - - The "NO AUTHENTICATION REQUIRED" (SOCKS5) authentication method [00] is - supported; and as of Tor 0.2.3.2-alpha, the "USERNAME/PASSWORD" (SOCKS5) - authentication method [02] is supported too, and used as a method to - implement stream isolation. As an extension to support some broken clients, - we allow clients to pass "USERNAME/PASSWORD" authentication message to us - even if no authentication was selected. Furthermore, we allow - username/password fields of this message to be empty. This technically - violates RFC1929 [4], but ensures interoperability with somewhat broken - SOCKS5 client implementations. - - Custom reply error code. The "REP" fields, as per the RFC[3], has - unassigned values which are used to describe Tor internal errors. See - ExtendedErrors in the tor.1 man page for more details. It is only sent - back if this SocksPort flag is set. - - (For more information on stream isolation, see IsolateSOCKSAuth on the Tor - manpage.) - -2. Name lookup - - As an extension to SOCKS4A and SOCKS5, Tor implements a new command value, - "RESOLVE" [F0]. When Tor receives a "RESOLVE" SOCKS command, it initiates - a remote lookup of the hostname provided as the target address in the SOCKS - request. The reply is either an error (if the address couldn't be - resolved) or a success response. 
In the case of success, the address is - stored in the portion of the SOCKS response reserved for remote IP address. - - (We support RESOLVE in SOCKS4 too, even though it is unnecessary.) - - For SOCKS5 only, we support reverse resolution with a new command value, - "RESOLVE_PTR" [F1]. In response to a "RESOLVE_PTR" SOCKS5 command with - an IPv4 address as its target, Tor attempts to find the canonical - hostname for that IPv4 record, and returns it in the "server bound - address" portion of the reply. - (This command was not supported before Tor 0.1.2.2-alpha.) - -3. Other command extensions. - - Tor 0.1.2.4-alpha added a new command value: "CONNECT_DIR" [F2]. - In this case, Tor will open an encrypted direct TCP connection to the - directory port of the Tor server specified by address:port (the port - specified should be the ORPort of the server). It uses a one-hop tunnel - and a "BEGIN_DIR" relay cell to accomplish this secure connection. - - The F2 command value was removed in Tor 0.2.0.10-alpha in favor of a - new use_begindir flag in edge_connection_t. - -4. HTTP-resistance - - Tor checks the first byte of each SOCKS request to see whether it looks - more like an HTTP request (that is, it starts with a "G", "H", or "P"). If - so, Tor returns a small webpage, telling the user that his/her browser is - misconfigured. This is helpful for the many users who mistakenly try to - use Tor as an HTTP proxy instead of a SOCKS proxy. - -5. Optimistic data - - Tor allows SOCKS clients to send connection data before Tor has sent a - SOCKS response. When using an exit node that supports "optimistic data", - Tor will send such data to the server without waiting to see whether the - connection attempt succeeds. This behavior can save a single round-trip - time when starting connections with a protocol where the client speaks - first (like HTTP). Clients that do this must be ready to hear that - their connection has succeeded or failed _after_ they have sent the - data. - -6. 
Extended error codes - - We define a set of additional extension error codes that can be returned - by our SOCKS implementation in response to failed onion service - connections. - - (In the C Tor implementation, these error codes can be disabled - via the ExtendedErrors flag. In Arti, these error codes are enabled - whenever onion services are.) - - * X'F0' Onion Service Descriptor Can Not be Found - - The requested onion service descriptor can't be found on the hashring - and the service is thus not reachable by the client. - - * X'F1' Onion Service Descriptor Is Invalid - - The requested onion service descriptor can't be parsed or signature - validation failed. - - * X'F2' Onion Service Introduction Failed - - Client failed to introduce to the service, meaning the descriptor was - found but the service is no longer at the introduction points. The - service has likely changed its descriptor or is not running. - - * X'F3' Onion Service Rendezvous Failed - - Client failed to rendezvous with the service, which means that the client - is unable to finalize the connection. - - * X'F4' Onion Service Missing Client Authorization - - Tor was able to download the requested onion service descriptor but is - unable to decrypt its content because it is missing client authorization - information for it. - - * X'F5' Onion Service Wrong Client Authorization - - Tor was able to download the requested onion service descriptor but is - unable to decrypt its content using the client authorization information - it has. This means the client's access was revoked. - - * X'F6' Onion Service Invalid Address - - The given .onion address is invalid. This error is returned in any - of these cases: the address checksum doesn't match, the ed25519 public - key is invalid, or the encoding is invalid. - - * X'F7' Onion Service Introduction Timed Out - - Similar to the X'F2' code, but in this case all introduction attempts - have failed due to a timeout.
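A SOCKS client that wants to act on these codes could map the reply byte to a symbolic name, as in the sketch below. The enum and its member names are hypothetical; only the numeric values X'F0' through X'F7' come from the list above.

```python
from enum import IntEnum


class OnionSocksError(IntEnum):
    """Extended SOCKS5 reply codes for onion services (names are illustrative)."""

    DESCRIPTOR_NOT_FOUND = 0xF0
    DESCRIPTOR_INVALID = 0xF1
    INTRODUCTION_FAILED = 0xF2
    RENDEZVOUS_FAILED = 0xF3
    MISSING_CLIENT_AUTH = 0xF4
    WRONG_CLIENT_AUTH = 0xF5
    INVALID_ADDRESS = 0xF6
    INTRODUCTION_TIMED_OUT = 0xF7


def describe_rep(rep):
    """Return the symbolic name for an extended REP byte, or None otherwise."""
    try:
        return OnionSocksError(rep).name
    except ValueError:
        return None  # a standard SOCKS5 reply code or an unassigned value
```

For example, a client could prompt for client authorization keys when describe_rep() reports MISSING_CLIENT_AUTH and treat the other codes as plain connection failures.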
- - (Note that not all of the above error codes are currently returned - by Arti as of August 2023.) - - -References: - [1] http://en.wikipedia.org/wiki/SOCKS#SOCKS4 - [2] http://en.wikipedia.org/wiki/SOCKS#SOCKS4a - [3] SOCKS5: RFC 1928 https://www.ietf.org/rfc/rfc1928.txt - [4] RFC 1929: https://www.ietf.org/rfc/rfc1929.txt - diff --git a/srv-spec.txt b/srv-spec.txt deleted file mode 100644 index f768b73..0000000 --- a/srv-spec.txt +++ /dev/null @@ -1,653 +0,0 @@ - - Tor Shared Random Subsystem Specification - -This document specifies how the commit-and-reveal shared random subsystem of -Tor works. This text used to be proposal 250-commit-reveal-consensus.txt. - - Table Of Contents: - - 1. Introduction - 1.1. Motivation - 1.2. Previous work - 2. Overview - 2.1. Introduction to our commit-and-reveal protocol - 2.2. Ten thousand feet view of the protocol - 2.3. How we use the consensus [CONS] - 2.3.1. Inserting Shared Random Values in the consensus - 2.4. Persistent State of the Protocol [STATE] - 2.5. Protocol Illustration - 3. Protocol - 3.1 Commitment Phase [COMMITMENTPHASE] - 3.1.1. Voting During Commitment Phase - 3.1.2. Persistent State During Commitment Phase [STATECOMMIT] - 3.2 Reveal Phase - 3.2.1. Voting During Reveal Phase - 3.2.2. Persistent State During Reveal Phase [STATEREVEAL] - 3.3. Shared Random Value Calculation At 00:00UTC - 3.3.1. Shared Randomness Calculation [SRCALC] - 3.4. Bootstrapping Procedure - 3.5. Rebooting Directory Authorities [REBOOT] - 4. Specification [SPEC] - 4.1. Voting - 4.1.1. Computing commitments and reveals [COMMITREVEAL] - 4.1.2. Validating commitments and reveals [VALIDATEVALUES] - 4.1.4. Encoding commit/reveal values in votes [COMMITVOTE] - 4.1.5. Shared Random Value [SRVOTE] - 4.2. Encoding Shared Random Values in the consensus [SRCONSENSUS] - 4.3. Persistent state format [STATEFORMAT] - 5. Security Analysis - 5.1. Security of commit-and-reveal and future directions - 5.2. 
Predicting the shared random value during reveal phase - 5.3. Partition attacks - 5.3.1. Partition attacks during commit phase - 5.3.2. Partition attacks during reveal phase - 6. Discussion - 6.1. Why the added complexity from proposal 225? - 6.2. Why do you do a commit-and-reveal protocol in 24 rounds? - 6.3. Why can't we recover if the 00:00UTC consensus fails? - 7. Acknowledgements - - -1. Introduction - -1.1. Motivation - - For the next generation hidden services project, we need the Tor network to - produce a fresh random value every day in such a way that it cannot be - predicted in advance or influenced by an attacker. - - Currently we need this random value to make the HSDir hash ring - unpredictable (#8244), which should resolve a wide class of hidden service - DoS attacks and should make it harder for people to gauge the popularity - and activity of target hidden services. Furthermore this random value can - be used by other systems in need of fresh global randomness like - Tor-related protocols (e.g. OnioNS) or even non-Tor-related (e.g. warrant - canaries). - -1.2. Previous work - - Proposal 225 specifies a commit-and-reveal protocol that can be run as an - external script and have the results be fed to the directory authorities. - However, directory authority operators feel unsafe running a third-party - script that opens TCP ports and accepts connections from the Internet. - Hence, this proposal aims to embed the commit-and-reveal idea in the Tor - voting process which should make it smoother to deploy and maintain. - -2. Overview - - This proposal alters the Tor consensus protocol such that a random number is - generated every midnight by the directory authorities during the regular voting - process. The distributed random generator scheme is based on the - commit-and-reveal technique. - - The proposal also specifies how the final shared random value is embedded - in consensus documents so that clients who need it can get it. - -2.1. 
Introduction to our commit-and-reveal protocol - - Every day, before voting for the consensus at 00:00UTC each authority - generates a new random value and keeps it for the whole day. The authority - cryptographically hashes the random value and calls the output its - "commitment" value. The original random value is called the "reveal" value. - - The idea is that given a reveal value you can cryptographically confirm that - it corresponds to a given commitment value (by hashing it). However given a - commitment value you should not be able to derive the underlying reveal - value. The construction of these values is specified in section [COMMITREVEAL]. - -2.2. Ten thousand feet view of the protocol - - Our commit-and-reveal protocol aims to produce a fresh shared random value - (denoted shared_random_value here and elsewhere) every day at 00:00UTC. The - final fresh random value is embedded in the consensus document at that - time. - - Our protocol has two phases and uses the hourly voting procedure of Tor. - Each phase lasts 12 hours, which means that 12 voting rounds happen in - between. In short, the protocol works as follows: - - Commit phase: - - Starting at 00:00UTC and for a period of 12 hours, authorities every - hour include their commitment in their votes. They also include any - received commitments from other authorities, if available. - - Reveal phase: - - At 12:00UTC, the reveal phase starts and lasts till the end of the - protocol at 00:00UTC. In this stage, authorities must reveal the value - they committed to in the previous phase. The commitment and revealed - values from other authorities, when available, are also added to the - vote. - - Shared Randomness Calculation: - - At 00:00UTC, the shared random value is computed from the agreed - revealed values and added to the consensus. - - This concludes the commit-and-reveal protocol every day at 00:00UTC. - -2.3.
How we use the consensus [CONS] - - The produced shared random values need to be readily available to - clients. For this reason we include them in the consensus documents. - - Every hour the consensus documents need to include the shared random value - of the day, as well as the shared random value of the previous day. That's - because either of these values might be needed at a given time for a Tor - client to access a hidden service according to section [TIME-OVERLAP] of - proposal 224. This means that both of these two values need to be included - in votes as well. - - Hence, consensuses need to include: - - (a) The shared random value of the current time period. - (b) The shared random value of the previous time period. - - For this, a new SR consensus method will be needed to indicate which - authorities support this new protocol. - -2.3.1. Inserting Shared Random Values in the consensus - - After voting happens, we need to be careful on how we pick which shared - random values (SRV) to put in the consensus, to avoid breaking the consensus - because of authorities having different views of the commit-and-reveal - protocol (because maybe they missed some rounds of the protocol). - - For this reason, authorities look at the received votes before creating a - consensus and employ the following logic: - - - First of all, they make sure that the agreed upon consensus method is - above the SR consensus method. - - - Authorities include an SRV in the consensus if and only if the SRV has - been voted by at least the majority of authorities. - - - For the consensus at 00:00UTC, authorities include an SRV in the consensus - if and only if the SRV has been voted by at least AuthDirNumAgreements - authorities (where AuthDirNumAgreements is a newly introduced consensus - parameter). - - Authorities include in the consensus the most popular SRV that also - satisfies the above constraints. Otherwise, no SRV should be included. 
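The selection logic above can be sketched as follows. This is a non-authoritative sketch under stated assumptions: the function name pick_srv and its parameters are hypothetical; "above the SR consensus method" is read as "at least"; "majority" is read as more than half of the authorities; and at 00:00UTC both the majority rule and the AuthDirNumAgreements rule are applied together. None of these readings is confirmed by the text.

```python
from collections import Counter
from typing import Optional, Sequence


def pick_srv(voted_srvs: Sequence[str],
             n_authorities: int,
             at_midnight: bool,
             auth_dir_num_agreements: int,
             consensus_method: int,
             sr_consensus_method: int) -> Optional[str]:
    """Return the SRV to place in the consensus, or None if none qualifies.

    voted_srvs holds one SRV per vote that listed one (hypothetical input).
    """
    # Assumption: "above the SR consensus method" is treated as ">=".
    if consensus_method < sr_consensus_method:
        return None
    if not voted_srvs:
        return None

    # Assumption: a majority means strictly more than half of all authorities.
    majority = n_authorities // 2 + 1
    # Assumption: at 00:00UTC the stricter of the two thresholds applies.
    threshold = max(majority, auth_dir_num_agreements) if at_midnight else majority

    # Most popular SRV among the votes.
    value, count = Counter(voted_srvs).most_common(1)[0]
    return value if count >= threshold else None
```

Because the threshold is always more than half of the authorities, at most one value can qualify when each authority votes for a single SRV, so ties between qualifying values cannot occur in this sketch.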
- - The above logic is used to make it harder to break the consensus by natural - partitioning causes. - - We use the AuthDirNumAgreements consensus parameter to enforce that a - _supermajority_ of dirauths supports the SR protocol during SRV creation, so - that even if a few of those dirauths drop offline in the middle of the run - the SR protocol does not get disturbed. We go to extra lengths to ensure - this because changing SRVs in the middle of the day has terrible - reachability consequences for hidden service clients. - -2.4. Persistent State of the Protocol [STATE] - - A directory authority needs to keep a persistent state on disk of the - ongoing protocol run. This allows an authority to join the protocol seamlessly - in the case of a reboot. - - During the commitment phase, it is populated with the commitments of all - authorities. Then during the reveal phase, the reveal values are also - stored in the state. - - As discussed previously, the shared random values from the current and - previous time period must also be present in the state at all times if they - are available. - -2.5. Protocol Illustration - - An illustration for better understanding the protocol can be found here: - - https://people.torproject.org/~asn/hs_notes/shared_rand.jpg - - It reads left-to-right. - - The illustration displays what the authorities (A_1, A_2, A_3) put in their - votes. A chain 'A_1 -> c_1 -> r_1' denotes that authority A_1 committed to - the value c_1 which corresponds to the reveal value r_1. - - The illustration depicts only a few rounds of the whole protocol. It starts - with the first three rounds of the commit phase, then it jumps to the last - round of the commit phase. It continues with the first two rounds of the - reveal phase and then it jumps to the final round of the protocol run. It - finally shows the first round of the commit phase of the next protocol run - (00:00UTC) where the final Shared Random Value is computed.
In our fictional - example, the SRV was computed with 3 authority contributions and its value - is "a56fg39h". - - We advise you to revisit this after you have read the whole document. - -3. Protocol - - In this section we give a detailed specification of the protocol. We - describe the protocol participants' logic and the messages they send. The - encoding of the messages is specified in the next section ([SPEC]). - - Now we go through the phases of the protocol: - -3.1. Commitment Phase [COMMITMENTPHASE] - - The commit phase lasts from 00:00UTC to 12:00UTC. - - During this phase, an authority commits a value in its vote and - saves it to the permanent state as well. - - Authorities also save any received authoritative commits by other authorities - in their permanent state. We call a commit by Alice "authoritative" if it was - included in Alice's vote. - -3.1.1. Voting During Commitment Phase - - During the commit phase, each authority includes in its votes: - - - The commitment value for this protocol run. - - Any authoritative commitments received from other authorities. - - The two previous shared random values produced by the protocol (if any). - - The commit phase lasts for 12 hours, so authorities have multiple chances to - commit their values. An authority MUST NOT commit a second value during a - subsequent round of the commit phase. - - If an authority publishes a second commitment value in the same commit - phase, only the first commitment should be taken into account by other - authorities. Any subsequent commitments MUST be ignored. - -3.1.2. Persistent State During Commitment Phase [STATECOMMIT] - - During the commitment phase, authorities save in their persistent state the - authoritative commits they have received from each authority. Only one commit - per authority must be considered trusted and active at a given time. - -3.2. Reveal Phase - - The reveal phase lasts from 12:00UTC to 00:00UTC.
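Given the two phase windows stated above (commit from 00:00UTC to 12:00UTC, reveal from 12:00UTC to 00:00UTC), the phase of an hourly voting round can be derived from the UTC hour of its valid-after time. The sketch below is illustrative only; the function name sr_phase is hypothetical, and it assumes the caller passes a timezone-aware datetime.

```python
from datetime import datetime, timezone


def sr_phase(valid_after: datetime) -> str:
    """Return "commit" or "reveal" for a voting round.

    Assumes valid_after is timezone-aware; it is converted to UTC first.
    """
    hour = valid_after.astimezone(timezone.utc).hour
    # Rounds at 00:00..11:00 UTC fall in the commit phase,
    # rounds at 12:00..23:00 UTC fall in the reveal phase.
    return "commit" if hour < 12 else "reveal"
```

For example, a round with a valid-after time of 12:00UTC is the first round of the reveal phase, while the 00:00UTC round belongs to the commit phase of the next protocol run.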
- - Now that the commitments have been agreed on, it's time for authorities to - reveal their random values. - -3.2.1. Voting During Reveal Phase - - During the reveal phase, each authority includes in its votes: - - - Its reveal value that was previously committed in the commit phase. - - All the commitments and reveals received from other authorities. - - The two previous shared random values produced by the protocol (if any). - - The set of commitments has been decided during the commitment - phase and must remain the same. If an authority tries to change its - commitment during the reveal phase or introduce a new commitment, - the new commitment MUST be ignored. - -3.2.2. Persistent State During Reveal Phase [STATEREVEAL] - - During the reveal phase, authorities keep the authoritative commits from the - commit phase in their persistent state. They also save any received reveals - that correspond to authoritative commits and are valid (as specified in - [VALIDATEVALUES]). - - An authority that just received a reveal value from another authority's vote - MUST wait till the next voting round before including that reveal value in - its votes. - -3.3. Shared Random Value Calculation At 00:00UTC - - Finally, at 00:00UTC every day, authorities compute a fresh shared random - value and this value must be added to the consensus so clients can use it. - - Authorities calculate the shared random value using the reveal values in - their state as specified in subsection [SRCALC]. - - Authorities at 00:00UTC start including this new shared random value in - their votes, replacing the one from two protocol runs ago. Authorities also - start including this new shared random value in the consensus as well. - - Apart from that, authorities at 00:00UTC proceed voting normally as they - would in the first round of the commitment phase (section [COMMITMENTPHASE]). - -3.3.1.
Shared Randomness Calculation [SRCALC] - - An authority that wants to derive the shared random value SRV, should use - the appropriate reveal values for that time period and calculate SRV as - follows. - - HASHED_REVEALS = H(ID_a | R_a | ID_b | R_b | ..) - - SRV = SHA3-256("shared-random" | INT_8(REVEAL_NUM) | INT_4(VERSION) | - HASHED_REVEALS | PREVIOUS_SRV) - - where the ID_a value is the identity key fingerprint of authority 'a' and R_a - is the corresponding reveal value of that authority for the current period. - - Also, REVEAL_NUM is the number of revealed values in this construction, - VERSION is the protocol version number and PREVIOUS_SRV is the previous - shared random value. If no previous shared random value is known, then - PREVIOUS_SRV is set to 32 NUL (\x00) bytes. - - To maintain consistent ordering in HASHED_REVEALS, all the ID_a | R_a pairs - are ordered based on the R_a value in ascending order. - -3.4. Bootstrapping Procedure - - As described in [CONS], two shared random values are required for the HSDir - overlay periods to work properly as specified in proposal 224. Hence - clients MUST NOT use the randomness of this system till it has bootstrapped - completely; that is, until two shared random values are included in a - consensus. This should happen after three 00:00UTC consensuses have been - produced, which takes 48 hours. - -3.5. Rebooting Directory Authorities [REBOOT] - - The shared randomness protocol must be able to support directory - authorities who leave or join in the middle of the protocol execution. - - An authority that commits in the Commitment Phase and then leaves MUST have - stored its reveal value on disk so that it continues participating in the - protocol if it returns before or during the Reveal Phase. The reveal value - MUST be stored timestamped to avoid sending it on wrong protocol runs. 
- - An authority that misses the Commitment Phase cannot commit anymore, so it's - unable to participate in the protocol for that run. Same goes for an - authority that misses the Reveal phase. Authorities who do not participate in - the protocol SHOULD still carry commits and reveals of others in their vote. - - Finally, authorities MUST implement their persistent state in such a way that they - will never commit two different values in the same protocol run, even if they - have to reboot in the middle (assuming that their persistent state file is - kept). A suggested way to structure the persistent state is found at [STATEFORMAT]. - -4. Specification [SPEC] - -4.1. Voting - - This section describes how commitments, reveals and SR values are encoded in - votes. We describe how to encode both the authority's own - commitments/reveals and also the commitments/reveals received from the other - authorities. Commitments and reveals share the same line, but reveals are - optional. - - Participating authorities need to include the line: - - "shared-rand-participate" - - in their votes to announce that they take part in the protocol. - -4.1.1. Computing commitments and reveals [COMMITREVEAL] - - A directory authority that wants to participate in this protocol needs to - create a new pair of commitment/reveal values for every protocol - run. Authorities SHOULD generate a fresh pair of such values right before the - first commitment phase of the day (at 00:00UTC). - - The value REVEAL is computed as follows: - - REVEAL = base64-encode( TIMESTAMP || H(RN) ) - - where RN is the SHA3 hashed value of a 256-bit random value. We hash the - random value to avoid exposing raw bytes from our PRNG to the network (see - [RANDOM-REFS]). - - TIMESTAMP is an 8-byte network-endian time_t value.
Authorities SHOULD - set TIMESTAMP to the valid-after time of the vote document they first plan - to publish their commit into (so usually at 00:00UTC, except if they start - up in a later commit round). - - The value COMMIT is computed as follows: - - COMMIT = base64-encode( TIMESTAMP || H(REVEAL) ) - -4.1.2. Validating commitments and reveals [VALIDATEVALUES] - - Given a COMMIT message and a REVEAL message it should be possible to verify - that they indeed correspond. To do so, the client extracts the random value - H(RN) from the REVEAL message, hashes it, and compares it with the H(H(RN)) - from the COMMIT message. We say that the COMMIT and REVEAL messages - correspond, if the comparison was successful. - - Participants MUST also check that corresponding COMMIT and REVEAL values - have the same timestamp value. - - Authorities should ignore reveal values during the Reveal Phase that don't - correspond to commit values published during the Commitment Phase. - -4.1.4. Encoding commit/reveal values in votes [COMMITVOTE] - - An authority puts in its vote the commitments and reveals it has produced and - seen from the other authorities. To do so, it includes the following in its - votes: - - "shared-rand-commit" SP VERSION SP ALGNAME SP IDENTITY SP COMMIT [SP REVEAL] NL - - where VERSION is the version of the protocol the commit was created with. - IDENTITY is the authority's SHA1 identity fingerprint and COMMIT is the - encoded commit [COMMITREVEAL]. Authorities during the reveal phase can - also optionally include an encoded reveal value REVEAL. There MUST be only - one line per authority else the vote is considered invalid. Finally, the - ALGNAME is the hash algorithm that should be used to compute COMMIT and - REVEAL which is "sha3-256" for version 1. - -4.1.5. 
Shared Random Value [SRVOTE] - - Authorities include a shared random value (SRV) in their votes using the - following encoding for the previous and current value respectively: - - "shared-rand-previous-value" SP NUM_REVEALS SP VALUE NL - "shared-rand-current-value" SP NUM_REVEALS SP VALUE NL - - where VALUE is the actual shared random value encoded in hex (computed as - specified in section [SRCALC]). NUM_REVEALS is the number of reveal values - used to generate this SRV. - - To maintain consistent ordering, the shared random values of the previous - period should be listed before the values of the current period. - -4.2. Encoding Shared Random Values in the consensus [SRCONSENSUS] - - Authorities insert the two active shared random values in the consensus - following the same encoding format as in [SRVOTE]. - -4.3. Persistent state format [STATEFORMAT] - - As a way to keep ground truth state in this protocol, an authority MUST - keep a persistent state of the protocol. The next sub-section suggests a - format for this state which is the same as the current state file format. - - It contains a preamble, a commitment and reveal section and a list of - shared random values. - - The preamble (or header) contains the following items. They MUST occur in - the order given here: - - "Version" SP version NL - - [At start, exactly once.] - - A document format version. For this specification, version is "1". - - "ValidUntil" SP YYYY-MM-DD SP HH:MM:SS NL - - [Exactly once] - - After this time, this state is expired and shouldn't be used nor - trusted. The validity time period is till the end of the current - protocol run (the upcoming noon). - - The following details the commitment and reveal section. They are encoded - the same as in the vote. This makes it easier for implementation purposes. - - "Commit" SP version SP algname SP identity SP commit [SP reveal] NL - - [Exactly once per authority] - - The values are the same as detailed in section [COMMITVOTE].
- - This line is also used by an authority to store its own value. - - Finally, there is the shared random value section. - - "SharedRandPreviousValue" SP num_reveals SP value NL - - [At most once] - - This is the previous shared random value agreed on at the previous - period. The fields are the same as in section [SRVOTE]. - - "SharedRandCurrentValue" SP num_reveals SP value NL - - [At most once] - - This is the latest shared random value. The fields are the same as in - section [SRVOTE]. - -5. Security Analysis - -5.1. Security of commit-and-reveal and future directions - - The security of commit-and-reveal protocols is well understood, and has - certain flaws. Basically, the protocol is insecure to the extent that an - adversary who controls b of the authorities gets to choose among 2^b - outcomes for the result of the protocol. However, an attacker who is not a - dirauth should not be able to influence the outcome at all. - - We believe that this system offers sufficient security especially compared - to the current situation. More secure solutions require much more advanced - crypto and more complex protocols so this seems like an acceptable solution - for now. - - Here are some examples of possible future directions: - - Schemes based on threshold signatures (e.g. see [HOPPER]) - - Unicorn scheme by Lenstra et al. [UNICORN] - - Schemes based on Verifiable Delay Functions [VDFS] - - For more alternative approaches on collaborative random number generation - also see the discussion at [RNGMESSAGING]. - -5.2. Predicting the shared random value during reveal phase - - The reveal phase lasts 12 hours, and most authorities will send their - reveal value on the first round of the reveal phase. This means that an - attacker can predict the final shared random value about 12 hours before - it's generated. - - This does not pose a problem for the HSDir hash ring, since we impose a - higher uptime restriction on HSDir nodes, so 12 hours predictability is not - an issue.
- - Any other protocols using the shared random value from this system should - be aware of this property. - -5.3. Partition attacks - - This design is not immune to certain partition attacks. We believe they - don't offer much gain to an attacker as they are very easy to detect and - difficult to pull off since an attacker would need to compromise a directory - authority at the very least. Also, because of the Byzantine generals problem, - it's very hard (even impossible in some cases) to protect against all such - attacks. Nevertheless, this section describes all possible partition attacks - and how to detect them. - -5.3.1. Partition attacks during commit phase - - A malicious directory authority could send only its commit to one single - authority which results in that authority having an extra commit value for - the shared random calculation that the others don't have. Since the - consensus needs majority, this won't affect the final SRV value. However, - the attacker, using this attack, could remove a single directory authority - from the consensus decision at 24:00 when the SRV is computed. - - An attacker could also partition the authorities by sending two different - commitment values to different authorities during the commit phase. - - All of the above is fairly easy to detect. Commitment values in the vote - coming from an authority should NEVER be different between authorities. If - so, this means an attack is ongoing or a very bad bug (highly unlikely). - -5.3.2. Partition attacks during reveal phase - - Let's consider Alice, a malicious directory authority. Alice could wait - until the last reveal round, and reveal its value to half of the - authorities. That would partition the authorities into two sets: the ones - who think that the shared random value should contain this new reveal, and - the rest who don't know about it. This would result in a tie and two - different shared random values. - - A similar attack is possible.
For example, two rounds before the end of the - reveal phase, Alice could advertise her reveal value to only half of the - dirauths. This way, in the last reveal phase round, half of the dirauths - will include that reveal value in their votes and the others will not. In - the end of the reveal phase, half of the dirauths will calculate a - different shared randomness value than the others. - - We claim that this attack is not particularly fruitful: Alice ends up - having two shared random values to choose from which is a fundamental - problem of commit-and-reveal protocols as well (since the last person can - always abort or reveal). The attacker can also sabotage the consensus, but - there are other ways this can be done with the current voting system. - - Furthermore, we claim that such an attack is very noisy and detectable. - First of all, it requires the authority to sabotage two consensuses which - will cause quite some noise. Furthermore, the authority needs to send - different votes to different auths which is detectable. Like the commit - phase attack, the detection here is to make sure that the commitment values - in a vote coming from an authority are always the same for each authority. - -6. Discussion - -6.1. Why the added complexity from proposal 225? - - The complexity difference between this proposal and prop225 is in part - because prop225 doesn't specify how the shared random value gets to the - clients. This proposal spends lots of effort specifying how the two shared - random values can always be readily accessible to clients. - -6.2. Why do you do a commit-and-reveal protocol in 24 rounds? - - The reader might be wondering why we span the protocol over the course of a - whole day (24 hours), when only 3 rounds would be sufficient to generate a - shared random value. - - We decided to do it this way, because we piggyback on the Tor voting - protocol which also happens every hour. 
- - We could instead only do the shared randomness protocol from 21:00 to 00:00 - every day. Or to do it multiple times a day. - - However, we decided that since the shared random value needs to be in every - consensus anyway, carrying the commitments/reveals as well will not be a - big problem. Also, this way we give more chances for a failing dirauth to - recover and rejoin the protocol. - -6.3. Why can't we recover if the 00:00UTC consensus fails? - - If the 00:00UTC consensus fails, there will be no shared random value for - the whole day. In theory, we could recover by calculating the shared - randomness of the day at 01:00UTC instead. However, the engineering issues - with adding such recovery logic are too great. For example, it's not easy - for an authority who just booted to learn whether a specific consensus - failed to be created. - -7. Acknowledgements - - Thanks to everyone who has contributed to this design with feedback and - discussion. - - Thanks go to arma, ioerror, kernelcorn, nickm, s7r, Sebastian, teor, weasel - and everyone else! - -References: - -[RANDOM-REFS]: - http://projectbullrun.org/dual-ec/ext-rand.html - https://lists.torproject.org/pipermail/tor-dev/2015-November/009954.html - -[RNGMESSAGING]: - https://moderncrypto.org/mail-archive/messaging/2015/002032.html - -[HOPPER]: - https://lists.torproject.org/pipermail/tor-dev/2014-January/006053.html - -[UNICORN]: - https://eprint.iacr.org/2015/366.pdf - -[VDFS]: - https://eprint.iacr.org/2018/601.pdf diff --git a/tor-spec.txt b/tor-spec.txt deleted file mode 100644 index 4d21c9a..0000000 --- a/tor-spec.txt +++ /dev/null @@ -1,2735 +0,0 @@ - - Tor Protocol Specification - - Roger Dingledine - Nick Mathewson - -Table of Contents - - 0. Preliminaries - 0.1. Notation and encoding - 0.2. Security parameters - 0.3. Ciphers - 0.4. A bad hybrid encryption algorithm, for legacy purposes - 1. System overview - 1.1. Keys and names - 2. Connections - 2.1. Picking TLS ciphersuites - 2.2. 
TLS security considerations - 3. Cell Packet format - 4. Negotiating and initializing connections - 4.1. Negotiating versions with VERSIONS cells - 4.2. CERTS cells - 4.3. AUTH_CHALLENGE cells - 4.4. AUTHENTICATE cells - 4.4.1. Link authentication type 1: RSA-SHA256-TLSSecret - 4.4.2. Link authentication type 3: Ed25519-SHA256-RFC5705 - 4.5. NETINFO cells - 5. Circuit management - 5.1. CREATE and CREATED cells - 5.1.1. Choosing circuit IDs in create cells - 5.1.2. EXTEND and EXTENDED cells - 5.1.3. The "TAP" handshake - 5.1.4. The "ntor" handshake - 5.1.4.1. The "ntor-v3" handshake. - 5.1.5. CREATE_FAST/CREATED_FAST cells - 5.1.6. Additional data in CREATE/CREATED cells - 5.2. Setting circuit keys - 5.2.1. KDF-TOR - 5.2.2. KDF-RFC5869 - 5.3. Creating circuits - 5.3.1. Canonical connections - 5.4. Tearing down circuits - 5.5. Routing relay cells - 5.5.1. Circuit ID Checks - 5.5.2. Forward Direction - 5.5.2.1. Routing from the Origin - 5.5.2.2. Relaying Forward at Onion Routers - 5.5.3. Backward Direction - 5.5.3.1. Relaying Backward at Onion Routers - 5.5.4. Routing to the Origin - 5.6. Handling relay_early cells - 6. Application connections and stream management - 6.1. Relay cells - 6.1.1. Calculating the 'Digest' field - 6.2. Opening streams and transferring data - 6.2.1. Opening a directory stream - 6.3. Closing streams - 6.4. Remote hostname lookup - 7. Flow control - 7.1. Link throttling - 7.2. Link padding - 7.3. Circuit-level flow control - 7.3.1. SENDME Cell Format - 7.4. Stream-level flow control - 8. Handling resource exhaustion - 8.1. Memory exhaustion - 9. Subprotocol versioning - 9.1. "Link" - 9.2. "LinkAuth" - 9.3. "Relay" - 9.4. "HSIntro" - 9.5. "HSRend" - 9.6. "HSDir" - 9.7. "DirCache" - 9.8. "Desc" - 9.9. "Microdesc" - 9.10. "Cons" - 9.11. "Padding" - 9.12. "FlowCtrl" - -Note: This document aims to specify Tor as currently implemented, though it -may take it a little time to become fully up to date. 
Future versions of Tor -may implement improved protocols, and compatibility is not guaranteed. -We may or may not remove compatibility notes for other obsolete versions of -Tor as they become obsolete. - -This specification is not a design document; most design criteria -are not examined. For more information on why Tor acts as it does, -see tor-design.pdf. - -0. Preliminaries - - The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL - NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and - "OPTIONAL" in this document are to be interpreted as described in - RFC 2119. - -0.1. Notation and encoding - - KP -- a public key for an asymmetric cipher. - KS -- a private key for an asymmetric cipher. - K -- a key for a symmetric cipher. - N -- a "nonce", a random value, usually deterministically chosen - from other inputs using hashing. - - a|b -- concatenation of 'a' and 'b'. - - [A0 B1 C2] -- a three-byte sequence, containing the bytes with - hexadecimal values A0, B1, and C2, in that order. - - H(m) -- a cryptographic hash of m. - - We use "byte" and "octet" interchangeably. Possibly we shouldn't. - - Some specs mention "base32". This means RFC4648, without "=" padding. - -0.1.1. Encoding integers - - Unless we explicitly say otherwise below, all numeric values in the - Tor protocol are encoded in network (big-endian) order. So a "32-bit - integer" means a big-endian 32-bit integer; a "2-byte" integer means - a big-endian 16-bit integer, and so forth. - -0.2. Security parameters - - Tor uses a stream cipher, a public-key cipher, the Diffie-Hellman - protocol, and a hash function. - - KEY_LEN -- the length of the stream cipher's key, in bytes. - - KP_ENC_LEN -- the length of a public-key encrypted message, in bytes. - KP_PAD_LEN -- the number of bytes added in padding for public-key - encryption, in bytes. (The largest number of bytes that can be encrypted - in a single public-key operation is therefore KP_ENC_LEN-KP_PAD_LEN.) 
- - DH_LEN -- the number of bytes used to represent a member of the - Diffie-Hellman group. - DH_SEC_LEN -- the number of bytes used in a Diffie-Hellman private key (x). - - HASH_LEN -- the length of the hash function's output, in bytes. - - PAYLOAD_LEN -- The longest allowable cell payload, in bytes. (509) - - CELL_LEN(v) -- The length of a Tor cell, in bytes, for link protocol - version v. - CELL_LEN(v) = 512 if v is less than 4; - = 514 otherwise. - -0.3. Ciphers - - These are the ciphers we use _unless otherwise specified_. Several of - them are deprecated for new use. - - For a stream cipher, unless otherwise specified, we use 128-bit AES in - counter mode, with an IV of all 0 bytes. (We also require AES256.) - - For a public-key cipher, unless otherwise specified, we use RSA with - 1024-bit keys and a fixed exponent of 65537. We use OAEP-MGF1 - padding, with SHA-1 as its digest function. We leave the optional - "Label" parameter unset. (For OAEP padding, see - ftp://ftp.rsasecurity.com/pub/pkcs/pkcs-1/pkcs-1v2-1.pdf) - - We also use the Curve25519 group and the Ed25519 signature format in - several places. - - For Diffie-Hellman, unless otherwise specified, we use a generator - (g) of 2. For the modulus (p), we use the 1024-bit safe prime from - rfc2409 section 6.2 whose hex representation is: - - "FFFFFFFFFFFFFFFFC90FDAA22168C234C4C6628B80DC1CD129024E08" - "8A67CC74020BBEA63B139B22514A08798E3404DDEF9519B3CD3A431B" - "302B0A6DF25F14374FE1356D6D51C245E485B576625E7EC6F44C42E9" - "A637ED6B0BFF5CB6F406B7EDEE386BFB5A899FA5AE9F24117C4B1FE6" - "49286651ECE65381FFFFFFFFFFFFFFFF" - - As an optimization, implementations SHOULD choose DH private keys (x) of - 320 bits. Implementations that do this MUST never use any DH key more - than once. - [May other implementations reuse their DH keys?? -RD] - [Probably not. Conceivably, you could get away with changing DH keys once - per second, but there are too many oddball attacks for me to be - comfortable that this is safe. 
-NM] - - For a hash function, unless otherwise specified, we use SHA-1. - - KEY_LEN=16. - DH_LEN=128; DH_SEC_LEN=40. - KP_ENC_LEN=128; KP_PAD_LEN=42. - HASH_LEN=20. - - We also use SHA256 and SHA3-256 in some places. - - When we refer to "the hash of a public key", unless otherwise - specified, we mean the SHA-1 hash of the DER encoding of an ASN.1 RSA - public key (as specified in PKCS.1). - - All "random" values MUST be generated with a cryptographically - strong pseudorandom number generator seeded from a strong entropy - source, unless otherwise noted. - -0.4. A bad hybrid encryption algorithm, for legacy purposes. - - Some specifications will refer to the "legacy hybrid encryption" of a - byte sequence M with a public key KP. It is computed as follows: - - 1. If the length of M is no more than KP_ENC_LEN-KP_PAD_LEN, - pad and encrypt M with KP. - 2. Otherwise, generate a KEY_LEN byte random key K. - Let M1 = the first KP_ENC_LEN-KP_PAD_LEN-KEY_LEN bytes of M, - and let M2 = the rest of M. - Pad and encrypt K|M1 with KP. Encrypt M2 with our stream cipher, - using the key K. Concatenate these encrypted values. - - Note that this "hybrid encryption" approach does not prevent - an attacker from adding or removing bytes to the end of M. It also - allows attackers to modify the bytes not covered by the OAEP -- - see Goldberg's PET2006 paper for details. Do not use it as the basis - for new protocols! Also note that as used in Tor's protocols, case 1 - never occurs. - -1. System overview - - Tor is a distributed overlay network designed to anonymize - low-latency TCP-based applications such as web browsing, secure shell, - and instant messaging. Clients choose a path through the network and - build a ``circuit'', in which each node (or ``onion router'' or ``OR'') - in the path knows its predecessor and successor, but no other nodes in - the circuit. 
Traffic flowing down the circuit is sent in fixed-size - ``cells'', which are unwrapped by a symmetric key at each node (like - the layers of an onion) and relayed downstream. - -1.1. Keys and names - - Every Tor relay has multiple public/private keypairs: - - These are 1024-bit RSA keys: - - - A long-term signing-only "Identity key" used to sign documents and - certificates, and used to establish relay identity. - KP_relayid_rsa, KS_relayid_rsa. - - A medium-term TAP "Onion key" used to decrypt onion skins when accepting - circuit extend attempts. (See 5.1.) Old keys MUST be accepted for a - while after they are no longer advertised. Because of this, - relays MUST retain old keys for a while after they're rotated. (See - "onion key lifetime parameters" in dir-spec.txt.) - KP_onion_tap, KS_onion_tap. - - A short-term "Connection key" used to negotiate TLS connections. - Tor implementations MAY rotate this key as often as they like, and - SHOULD rotate this key at least once a day. - KP_conn_tls, KS_conn_tls. - - This is a Curve25519 key: - - - A medium-term ntor "Onion key" used to handle onion key handshakes when - accepting incoming circuit extend requests. As with TAP onion keys, - old ntor keys MUST be accepted for at least one week after they are no - longer advertised. Because of this, relays MUST retain old keys for a - while after they're rotated. (See "onion key lifetime parameters" in - dir-spec.txt.) - KP_ntor, KS_ntor. - - These are Ed25519 keys: - - - A long-term "master identity" key. This key never - changes; it is used only to sign the "signing" key below. It may be - kept offline. - KP_relayid_ed, KS_relayid_ed. - - A medium-term "signing" key. This key is signed by the master identity - key, and must be kept online. A new one should be generated - periodically. It signs nearly everything else. - KP_relaysign_ed, KS_relaysign_ed. - - A short-term "link authentication" key, used to authenticate - the link handshake: see section 4 below. 
This key is signed - by the "signing" key, and should be regenerated frequently. - KP_link_ed, KS_link_ed. - - KP_relayid_* together identify a router uniquely. Once a router - has used a KP_relayid_ed (an Ed25519 master identity key) - together with a given KP_relayid_rsa (RSA identity key), neither of - those keys may ever be used with a different key. - - We write KP_relayid to refer to a key which is either - KP_relayid_rsa or KP_relayid_ed. - - The same key or keypair should never be used for separate roles within - the Tor protocol suite, unless specifically stated. For example, - a relay's identity keys K_relayid should not also be used as the - identity keypair for a hidden service K_hs_id (see rend-spec-v3.txt). - -2. Connections - - Connections between two Tor relays, or between a client and a relay, - use TLS/SSLv3 for link authentication and encryption. All - implementations MUST support the SSLv3 ciphersuite - "TLS_DHE_RSA_WITH_AES_128_CBC_SHA" if it is available. They SHOULD - support better ciphersuites if available. - - There are three ways to perform TLS handshakes with a Tor server. In - the first way, "certificates-up-front", both the initiator and - responder send a two-certificate chain as part of their initial - handshake. (This is supported in all Tor versions.) In the second - way, "renegotiation", the responder provides a single certificate, - and the initiator immediately performs a TLS renegotiation. (This is - supported in Tor 0.2.0.21 and later.) And in the third way, - "in-protocol", the initial TLS negotiation completes, and the - parties bootstrap themselves to mutual authentication via use of the - Tor protocol without further TLS handshaking. (This is supported in - 0.2.3.6-alpha and later.) - - Each of these options provides a way for the parties to learn it is - available: a client does not need to know the version of the Tor - server in order to connect to it properly. 
- - In "certificates up-front" (a.k.a "the v1 handshake"), - the connection initiator always sends a - two-certificate chain, consisting of an X.509 certificate using a - short-term connection public key and a second, self-signed X.509 - certificate containing its identity key. The other party sends a similar - certificate chain. The initiator's ClientHello MUST NOT include any - ciphersuites other than: - - TLS_DHE_RSA_WITH_AES_256_CBC_SHA - TLS_DHE_RSA_WITH_AES_128_CBC_SHA - SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA - - In "renegotiation" (a.k.a. "the v2 handshake"), - the connection initiator sends no certificates, and - the responder sends a single connection certificate. Once the TLS - handshake is complete, the initiator renegotiates the handshake, with each - party sending a two-certificate chain as in "certificates up-front". - The initiator's ClientHello MUST include at least one ciphersuite not in - the list above -- that's how the initiator indicates that it can - handle this handshake. For other considerations on the initiator's - ClientHello, see section 2.1 below. - - In "in-protocol" (a.k.a. "the v3 handshake"), the initiator sends no - certificates, and the - responder sends a single connection certificate. The choice of - ciphersuites must be as in a "renegotiation" handshake. There are - additionally a set of constraints on the connection certificate, - which the initiator can use to learn that the in-protocol handshake - is in use. Specifically, at least one of these properties must be - true of the certificate: - - * The certificate is self-signed - * Some component other than "commonName" is set in the subject or - issuer DN of the certificate. - * The commonName of the subject or issuer of the certificate ends - with a suffix other than ".net". - * The certificate's public key modulus is longer than 1024 bits. 
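The four certificate properties listed above can be sketched as a single predicate. This is an illustrative Python sketch only, not code from any Tor implementation: the function name, and the choice to pass already-extracted certificate fields (a self-signed flag, subject and issuer DNs as dictionaries, and the modulus size) instead of a parsed X.509 object, are assumptions made to keep the example self-contained.

```python
def indicates_in_protocol_handshake(self_signed, subject_dn, issuer_dn, modulus_bits):
    """Return True if at least one of the four listed properties holds.

    subject_dn and issuer_dn are dicts mapping DN component names
    (e.g. "commonName") to string values. This representation is
    hypothetical; a real implementation would read these from X.509.
    """
    # Property 1: the certificate is self-signed.
    if self_signed:
        return True
    for dn in (subject_dn, issuer_dn):
        # Property 2: some component other than commonName is set.
        if any(name != "commonName" for name in dn):
            return True
        # Property 3: the commonName ends with a suffix other than ".net".
        common_name = dn.get("commonName")
        if common_name is not None and not common_name.endswith(".net"):
            return True
    # Property 4: the public key modulus is longer than 1024 bits.
    return modulus_bits > 1024
```

An initiator that gets False from a check like this would fall back to treating the connection as a "renegotiation" handshake.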
- - The initiator then sends a VERSIONS cell to the responder, which then - replies with a VERSIONS cell; they have then negotiated a Tor - protocol version. Assuming that the version they negotiate is 3 or higher - (the only ones specified for use with this handshake right now), the - responder sends a CERTS cell, an AUTH_CHALLENGE cell, and a NETINFO - cell to the initiator, which may send either CERTS, AUTHENTICATE, - NETINFO if it wants to authenticate, or just NETINFO if it does not. - - For backward compatibility between later handshakes and "certificates - up-front", the ClientHello of an initiator that supports a later - handshake MUST include at least one ciphersuite other than those listed - above. The connection responder examines the initiator's ciphersuite list - to see whether it includes any ciphers other than those included in the - list above. If extra ciphers are included, the responder proceeds as in - "renegotiation" and "in-protocol": it sends a single certificate and - does not request - client certificates. Otherwise (in the case that no extra ciphersuites - are included in the ClientHello) the responder proceeds as in - "certificates up-front": it requests client certificates, and sends a - two-certificate chain. In either case, once the responder has sent its - certificate or certificates, the initiator counts them. If two - certificates have been sent, it proceeds as in "certificates up-front"; - otherwise, it proceeds as in "renegotiation" or "in-protocol". - - To decide whether to do "renegotiation" or "in-protocol", the - initiator checks whether the responder's initial certificate matches - the criteria listed above. - - All new relay implementations of the Tor protocol MUST support - backwards-compatible renegotiation; clients SHOULD do this too. 
If - this is not possible, new client implementations MUST support both - "renegotiation" and "in-protocol" and use the router's - published link protocols list (see dir-spec.txt on the "protocols" entry) - to decide which to use. - - In all of the above handshake variants, certificates sent in the clear - SHOULD NOT include any strings to identify the host as a Tor relay. In - the "renegotiation" and "backwards-compatible renegotiation" steps, the - initiator SHOULD choose a list of ciphersuites and TLS extensions - to mimic one used by a popular web browser. - - Even though the connection protocol is identical, we will think of the - initiator as either an onion router (OR) if it is willing to relay - traffic for other Tor users, or an onion proxy (OP) if it only handles - local requests. Onion proxies SHOULD NOT provide long-term-trackable - identifiers in their handshakes. - - In all handshake variants, once all certificates are exchanged, all - parties receiving certificates must confirm that the identity key is as - expected. If the key is not as expected, the party must close the - connection. - - (When initiating a connection, if a reasonably live consensus is - available, then the expected identity key is taken from that - consensus. But when initiating a connection otherwise, the expected - identity key is the one given in the hard-coded authority or - fallback list. Finally, when creating a connection because of an - EXTEND/EXTEND2 cell, the expected identity key is the one given in - the cell.) - - When connecting to an OR, all parties SHOULD reject the connection if that - OR has a malformed or missing certificate. When accepting an incoming - connection, an OR SHOULD NOT reject incoming connections from parties with - malformed or missing certificates. (However, an OR should not believe - that an incoming connection is from another OR unless the certificates - are present and well-formed.) 
- - [Before version 0.1.2.8-rc, ORs rejected incoming connections from ORs and - OPs alike if their certificates were missing or malformed.] - - Once a TLS connection is established, the two sides send cells - (specified below) to one another. Cells are sent serially. Standard - cells are CELL_LEN(link_proto) bytes long, but variable-length cells - also exist; see Section 3. Cells may be sent embedded in TLS records - of any size or divided across TLS records, but the framing of TLS - records MUST NOT leak information about the type or contents of the - cells. - - TLS connections are not permanent. Either side MAY close a connection - if there are no circuits running over it and an amount of time - (KeepalivePeriod, defaults to 5 minutes) has passed since the last time - any traffic was transmitted over the TLS connection. Clients SHOULD - also hold a TLS connection with no circuits open, if it is likely that a - circuit will be built soon using that connection. - - Client-only Tor instances are encouraged to avoid using handshake - variants that include certificates, if those certificates provide - any persistent tags to the relays they contact. If clients do use - certificates, they SHOULD NOT keep using the same certificates when - their IP address changes. Clients MAY send certificates using any - of the above handshake variants. - -2.1. Picking TLS ciphersuites - - Clients SHOULD send a ciphersuite list chosen to emulate some popular - web browser or other program common on the internet. Clients may send - the "Fixed Ciphersuite List" below. If they do not, they MUST NOT - advertise any ciphersuite that they cannot actually support, unless that - cipher is one not supported by OpenSSL 1.0.1. 
- - The fixed ciphersuite list is: - - TLS1_ECDHE_ECDSA_WITH_AES_256_CBC_SHA - TLS1_ECDHE_RSA_WITH_AES_256_CBC_SHA - TLS1_DHE_RSA_WITH_AES_256_SHA - TLS1_DHE_DSS_WITH_AES_256_SHA - TLS1_ECDH_RSA_WITH_AES_256_CBC_SHA - TLS1_ECDH_ECDSA_WITH_AES_256_CBC_SHA - TLS1_RSA_WITH_AES_256_SHA - TLS1_ECDHE_ECDSA_WITH_RC4_128_SHA - TLS1_ECDHE_ECDSA_WITH_AES_128_CBC_SHA - TLS1_ECDHE_RSA_WITH_RC4_128_SHA - TLS1_ECDHE_RSA_WITH_AES_128_CBC_SHA - TLS1_DHE_RSA_WITH_AES_128_SHA - TLS1_DHE_DSS_WITH_AES_128_SHA - TLS1_ECDH_RSA_WITH_RC4_128_SHA - TLS1_ECDH_RSA_WITH_AES_128_CBC_SHA - TLS1_ECDH_ECDSA_WITH_RC4_128_SHA - TLS1_ECDH_ECDSA_WITH_AES_128_CBC_SHA - SSL3_RSA_RC4_128_MD5 - SSL3_RSA_RC4_128_SHA - TLS1_RSA_WITH_AES_128_SHA - TLS1_ECDHE_ECDSA_WITH_DES_192_CBC3_SHA - TLS1_ECDHE_RSA_WITH_DES_192_CBC3_SHA - SSL3_EDH_RSA_DES_192_CBC3_SHA - SSL3_EDH_DSS_DES_192_CBC3_SHA - TLS1_ECDH_RSA_WITH_DES_192_CBC3_SHA - TLS1_ECDH_ECDSA_WITH_DES_192_CBC3_SHA - SSL3_RSA_FIPS_WITH_3DES_EDE_CBC_SHA - SSL3_RSA_DES_192_CBC3_SHA - [*] The "extended renegotiation is supported" ciphersuite, 0x00ff, is - not counted when checking the list of ciphersuites. - - If the client sends the Fixed Ciphersuite List, the responder MUST NOT - select any ciphersuite besides TLS_DHE_RSA_WITH_AES_256_CBC_SHA, - TLS_DHE_RSA_WITH_AES_128_CBC_SHA, and SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA: - such ciphers might not actually be supported by the client. - - If the client sends a v2+ ClientHello with a list of ciphers other than - the Fixed Ciphersuite List, the responder can trust that the client - supports every cipher advertised in that list, so long as that ciphersuite - is also supported by OpenSSL 1.0.1. - - Responders MUST NOT select any TLS ciphersuite that lacks ephemeral keys, - or whose symmetric keys are less than KEY_LEN bits, or whose digests are - less than HASH_LEN bits. Responders SHOULD NOT select any SSLv3 - ciphersuite other than the DHE+3DES suites listed above. - -2.2. 
TLS security considerations - - Implementations MUST NOT allow TLS session resumption -- it can - exacerbate some attacks (e.g. the "Triple Handshake" attack from - Feb 2013), and it plays havoc with forward secrecy guarantees. - - Implementations SHOULD NOT allow TLS compression -- although we don't - know a way to apply a CRIME-style attack to current Tor directly, - it's a waste of resources. - -3. Cell Packet format - - The basic unit of communication for onion routers and onion - proxies is a fixed-width "cell". - - On a version 1 connection, each cell contains the following - fields: - - CircID [CIRCID_LEN bytes] - Command [1 byte] - Payload (padded with padding bytes) [PAYLOAD_LEN bytes] - - On a version 2 or higher connection, all cells are as in version 1 - connections, except for variable-length cells, whose format is: - - CircID [CIRCID_LEN octets] - Command [1 octet] - Length [2 octets; big-endian integer] - Payload (some commands MAY pad) [Length bytes] - - Most variable-length cells MAY be padded with padding bytes, except - for VERSIONS cells, which MUST NOT contain any additional bytes. - (The payload of VPADDING cells consists of padding bytes.) - - On a version 2 connection, variable-length cells are indicated by a - command byte equal to 7 ("VERSIONS"). On a version 3 or - higher connection, variable-length cells are indicated by a command - byte equal to 7 ("VERSIONS"), or greater than or equal to 128. - - CIRCID_LEN is 2 for link protocol versions 1, 2, and 3. CIRCID_LEN - is 4 for link protocol version 4 or higher. The first VERSIONS cell, - and any cells sent before the first VERSIONS cell, always have - CIRCID_LEN == 2 for backward compatibility. - - The CircID field determines which circuit, if any, the cell is - associated with. 
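The cell framing described above can be sketched as a small parser. This is a minimal Python sketch under stated assumptions, not code from any Tor implementation: the function names are hypothetical, command 7 is treated as variable-length for every link version, and the caller is assumed to pass a link version below 4 until a version has been negotiated, so that CIRCID_LEN is 2 for the first VERSIONS cell.

```python
PAYLOAD_LEN = 509  # longest allowable fixed-length cell payload, in bytes


def circid_len(link_version):
    """CIRCID_LEN is 2 for link protocols 1-3 and 4 for version 4 or higher."""
    return 2 if link_version < 4 else 4


def is_variable_length(command, link_version):
    """Command 7 (VERSIONS) is variable-length; so is any command >= 128 on v3+."""
    if command == 7:
        return True
    return link_version >= 3 and command >= 128


def parse_cell(buf, link_version):
    """Parse one cell from the start of buf.

    Returns (circ_id, command, payload, bytes_consumed), or None if buf
    does not yet hold a complete cell.
    """
    clen = circid_len(link_version)
    if len(buf) < clen + 1:
        return None
    circ_id = int.from_bytes(buf[:clen], "big")
    command = buf[clen]
    if is_variable_length(command, link_version):
        if len(buf) < clen + 3:
            return None
        # Two-octet big-endian Length field follows the command byte.
        length = int.from_bytes(buf[clen + 1:clen + 3], "big")
        start = clen + 3
    else:
        length = PAYLOAD_LEN
        start = clen + 1
    end = start + length
    if len(buf) < end:
        return None
    return circ_id, command, bytes(buf[start:end]), end
```

With this layout a fixed-length cell consumes 512 bytes on link protocols below 4 and 514 bytes otherwise, which matches CELL_LEN(v).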
- - The 'Command' field of a fixed-length cell holds one of the following - values: - - 0 -- PADDING (Padding) (See Sec 7.2) - 1 -- CREATE (Create a circuit) (See Sec 5.1) - 2 -- CREATED (Acknowledge create) (See Sec 5.1) - 3 -- RELAY (End-to-end data) (See Sec 5.5 and 6) - 4 -- DESTROY (Stop using a circuit) (See Sec 5.4) - 5 -- CREATE_FAST (Create a circuit, no KP) (See Sec 5.1) - 6 -- CREATED_FAST (Circuit created, no KP) (See Sec 5.1) - 8 -- NETINFO (Time and address info) (See Sec 4.5) - 9 -- RELAY_EARLY (End-to-end data; limited) (See Sec 5.6) - 10 -- CREATE2 (Extended CREATE cell) (See Sec 5.1) - 11 -- CREATED2 (Extended CREATED cell) (See Sec 5.1) - 12 -- PADDING_NEGOTIATE (Padding negotiation) (See Sec 7.2) - - Variable-length command values are: - - 7 -- VERSIONS (Negotiate proto version) (See Sec 4) - 128 -- VPADDING (Variable-length padding) (See Sec 7.2) - 129 -- CERTS (Certificates) (See Sec 4.2) - 130 -- AUTH_CHALLENGE (Challenge value) (See Sec 4.3) - 131 -- AUTHENTICATE (Client authentication) (See Sec 4.4) - 132 -- AUTHORIZE (Client authorization) (Not yet used) - - The interpretation of 'Payload' depends on the type of the cell. - - VPADDING/PADDING: - Payload contains padding bytes. - CREATE/CREATE2: Payload contains the handshake challenge. - CREATED/CREATED2: Payload contains the handshake response. - RELAY/RELAY_EARLY: Payload contains the relay header and relay body. - DESTROY: Payload contains a reason for closing the circuit. - (see 5.4) - - Upon receiving any other value for the command field, an OR must - drop the cell. Since more cell types may be added in the future, ORs - should generally not warn when encountering unrecognized commands. - - The cell is padded up to the cell length with padding bytes. - - Senders set padding bytes depending on the cell's command: - - VERSIONS: Payload MUST NOT contain padding bytes. - AUTHORIZE: Payload is unspecified and reserved for future use. 
- Other variable-length cells: - Payload MAY contain padding bytes at the end of the cell. - Padding bytes SHOULD be set to NUL. - RELAY/RELAY_EARLY: Payload MUST be padded to PAYLOAD_LEN with padding - bytes. Padding bytes SHOULD be set to random values. - Other fixed-length cells: - Payload MUST be padded to PAYLOAD_LEN with padding bytes. - Padding bytes SHOULD be set to NUL. - - We recommend random padding in RELAY/RELAY_EARLY cells, so that the cell - content is unpredictable. See the format of relay cells in section 6.1 - for detail. - - For other cells, TLS authenticates cell content, so randomized padding - bytes are redundant. - - Receivers MUST ignore padding bytes. - - PADDING cells are currently used to implement connection keepalive. - If there is no other traffic, ORs and OPs send one another a PADDING - cell every few minutes. - - CREATE, CREATE2, CREATED, CREATED2, and DESTROY cells are used to - manage circuits; see section 5 below. - - RELAY cells are used to send commands and data along a circuit; see - section 6 below. - - VERSIONS and NETINFO cells are used to set up connections in link - protocols v2 and higher; in link protocol v3 and higher, CERTS, - AUTH_CHALLENGE, and AUTHENTICATE may also be used. See section 4 - below. - -4. Negotiating and initializing connections - - After Tor instances negotiate handshake with either the "renegotiation" or - "in-protocol" handshakes, they must exchange a set of cells to set up - the Tor connection and make it "open" and usable for circuits. - - When the renegotiation handshake is used, both parties immediately - send a VERSIONS cell (4.1 below), and after negotiating a link - protocol version (which will be 2), each send a NETINFO cell (4.5 - below) to confirm their addresses and timestamps. No other intervening - cell types are allowed. - - When the in-protocol handshake is used, the initiator sends a - VERSIONS cell to indicate that it will not be renegotiating. 
The - responder sends a VERSIONS cell, a CERTS cell (4.2 below) to give the - initiator the certificates it needs to learn the responder's - identity, an AUTH_CHALLENGE cell (4.3) that the initiator must include - as part of its answer if it chooses to authenticate, and a NETINFO - cell (4.5). As soon as it gets the CERTS cell, the initiator knows - whether the responder is correctly authenticated. At this point the - initiator behaves differently depending on whether it wants to - authenticate or not. If it does not want to authenticate, it MUST - send a NETINFO cell. If it does want to authenticate, it MUST send a - CERTS cell, an AUTHENTICATE cell (4.4), and a NETINFO. When this - handshake is in use, the first cell must be VERSIONS, VPADDING, or - AUTHORIZE, and no other cell type is allowed to intervene besides - those specified, except for VPADDING cells. - - The AUTHORIZE cell type is reserved for future use by scanning-resistance - designs. - - [Tor versions before 0.2.3.11-alpha did not recognize the AUTHORIZE cell, - and did not permit any command other than VERSIONS as the first cell of - the in-protocol handshake.] - -4.1. Negotiating versions with VERSIONS cells - - There are multiple instances of the Tor link connection protocol. Any - connection negotiated using the "certificates up front" handshake (see - section 2 above) is "version 1". In any connection where both parties - have behaved as in the "renegotiation" handshake, the link protocol - version must be 2. In any connection where both parties have behaved - as in the "in-protocol" handshake, the link protocol must be 3 or higher. - - To determine the version, in any connection where the "renegotiation" - or "in-protocol" handshake was used (that is, where the responder - sent only one certificate at first and where the initiator did not - send any certificates in the first negotiation), both parties MUST - send a VERSIONS cell. 
In "renegotiation", they send a VERSIONS cell - right after the renegotiation is finished, before any other cells are - sent. In "in-protocol", the initiator sends a VERSIONS cell - immediately after the initial TLS handshake, and the responder - replies immediately with a VERSIONS cell. (As an exception to this rule, - if both sides support the "in-protocol" handshake, either side may send - VPADDING cells at any time.) - - The payload in a VERSIONS cell is a series of big-endian two-byte - integers. Both parties MUST select as the link protocol version the - highest number contained both in the VERSIONS cell they sent and in the - versions cell they received. If they have no such version in common, - they cannot communicate and MUST close the connection. Either party MUST - close the connection if the versions cell is not well-formed (for example, - if the payload contains an odd number of bytes). - - Any VERSIONS cells sent after the first VERSIONS cell MUST be ignored. - (To be interpreted correctly, later VERSIONS cells MUST have a CIRCID_LEN - matching the version negotiated with the first VERSIONS cell.) - - Since the version 1 link protocol does not use the "renegotiation" - handshake, implementations MUST NOT list version 1 in their VERSIONS - cell. When the "renegotiation" handshake is used, implementations - MUST list only the version 2. When the "in-protocol" handshake is - used, implementations MUST NOT list any version before 3, and SHOULD - list at least version 3. - - Link protocols differences are: - - 1 -- The "certs up front" handshake. - 2 -- Uses the renegotiation-based handshake. Introduces - variable-length cells. - 3 -- Uses the in-protocol handshake. - 4 -- Increases circuit ID width to 4 bytes. - 5 -- Adds support for link padding and negotiation (padding-spec.txt). - - -4.2. CERTS cells - - The CERTS cell describes the keys that a Tor instance is claiming - to have. It is a variable-length cell. 
Its payload format is: - - N: Number of certs in cell [1 octet] - N times: - CertType [1 octet] - CLEN [2 octets] - Certificate [CLEN octets] - - Any extra octets at the end of a CERTS cell MUST be ignored. - - Relevant certType values are: - 1: Link key certificate certified by RSA1024 identity - 2: RSA1024 Identity certificate, self-signed. - 3: RSA1024 AUTHENTICATE cell link certificate, signed with RSA1024 key. - 4: Ed25519 signing key, signed with identity key. - 5: TLS link certificate, signed with ed25519 signing key. - 6: Ed25519 AUTHENTICATE cell key, signed with ed25519 signing key. - 7: Ed25519 identity, signed with RSA identity. - - The certificate format for certificate types 1-3 is DER encoded - X509. For others, the format is as documented in cert-spec.txt. - Note that type 7 uses a different format from types 4-6. - - A CERTS cell may have no more than one certificate of each CertType. - - - To authenticate the responder as having a given Ed25519,RSA identity key - combination, the initiator MUST check the following. - - * The CERTS cell contains exactly one CertType 2 "ID" certificate. - * The CERTS cell contains exactly one CertType 4 Ed25519 - "Id->Signing" cert. - * The CERTS cell contains exactly one CertType 5 Ed25519 - "Signing->link" certificate. - * The CERTS cell contains exactly one CertType 7 "RSA->Ed25519" - cross-certificate. - * All X.509 certificates above have validAfter and validUntil dates; - no X.509 or Ed25519 certificates are expired. - * All certificates are correctly signed. - * The certified key in the Signing->Link certificate matches the - SHA256 digest of the certificate that was used to - authenticate the TLS connection. - * The identity key listed in the ID->Signing cert was used to - sign the ID->Signing Cert. - * The Signing->Link cert was signed with the Signing key listed - in the ID->Signing cert. 
- * The RSA->Ed25519 cross-certificate certifies the Ed25519 - identity, and is signed with the RSA identity listed in the - "ID" certificate. - * The certified key in the ID certificate is a 1024-bit RSA key. - * The RSA ID certificate is correctly self-signed. - - To authenticate the responder as having a given RSA identity only, - the initiator MUST check the following: - - * The CERTS cell contains exactly one CertType 1 "Link" certificate. - * The CERTS cell contains exactly one CertType 2 "ID" certificate. - * Both certificates have validAfter and validUntil dates that - are not expired. - * The certified key in the Link certificate matches the - link key that was used to negotiate the TLS connection. - * The certified key in the ID certificate is a 1024-bit RSA key. - * The certified key in the ID certificate was used to sign both - certificates. - * The link certificate is correctly signed with the key in the - ID certificate - * The ID certificate is correctly self-signed. - - In both cases above, checking these conditions is sufficient to - authenticate that the initiator is talking to the Tor node with the - expected identity, as certified in the ID certificate(s). - - - To authenticate the initiator as having a given Ed25519,RSA - identity key combination, the responder MUST check the following: - - * The CERTS cell contains exactly one CertType 2 "ID" certificate. - * The CERTS cell contains exactly one CertType 4 Ed25519 - "Id->Signing" certificate. - * The CERTS cell contains exactly one CertType 6 Ed25519 - "Signing->auth" certificate. - * The CERTS cell contains exactly one CertType 7 "RSA->Ed25519" - cross-certificate. - * All X.509 certificates above have validAfter and validUntil dates; - no X.509 or Ed25519 certificates are expired. - * All certificates are correctly signed. - * The identity key listed in the ID->Signing cert was used to - sign the ID->Signing Cert. 
- * The Signing->AUTH cert was signed with the Signing key listed - in the ID->Signing cert. - * The RSA->Ed25519 cross-certificate certifies the Ed25519 - identity, and is signed with the RSA identity listed in the - "ID" certificate. - * The certified key in the ID certificate is a 1024-bit RSA key. - * The RSA ID certificate is correctly self-signed. - - - To authenticate the initiator as having an RSA identity key only, - the responder MUST check the following: - - * The CERTS cell contains exactly one CertType 3 "AUTH" certificate. - * The CERTS cell contains exactly one CertType 2 "ID" certificate. - * Both certificates have validAfter and validUntil dates that - are not expired. - * The certified key in the AUTH certificate is a 1024-bit RSA key. - * The certified key in the ID certificate is a 1024-bit RSA key. - * The certified key in the ID certificate was used to sign both - certificates. - * The auth certificate is correctly signed with the key in the - ID certificate. - * The ID certificate is correctly self-signed. - - Checking these conditions is NOT sufficient to authenticate that the - initiator has the ID it claims; to do so, the cells in 4.3 and 4.4 - below must be exchanged. - - -4.3. AUTH_CHALLENGE cells - - An AUTH_CHALLENGE cell is a variable-length cell with the following - fields: - - Challenge [32 octets] - N_Methods [2 octets] - Methods [2 * N_Methods octets] - - It is sent from the responder to the initiator. Initiators MUST - ignore unexpected bytes at the end of the cell. Responders MUST - generate every challenge independently using a strong RNG or PRNG. - - The Challenge field is a randomly generated string that the - initiator must sign (a hash of) as part of authenticating. The - methods are the authentication methods that the responder will - accept. Only two authentication methods are defined right now: - see 4.4.1 and 4.4.2 below. - -4.4. 
AUTHENTICATE cells - - If an initiator wants to authenticate, it responds to the - AUTH_CHALLENGE cell with a CERTS cell and an AUTHENTICATE cell. - The CERTS cell is as a server would send, except that instead of - sending a CertType 1 (and possibly CertType 5) certs for arbitrary link - certificates, the initiator sends a CertType 3 (and possibly - CertType 6) cert for an RSA/Ed25519 AUTHENTICATE key. - - This difference is because we allow any link key type on a TLS - link, but the protocol described here will only work for specific key - types as described in 4.4.1 and 4.4.2 below. - - An AUTHENTICATE cell contains the following: - - AuthType [2 octets] - AuthLen [2 octets] - Authentication [AuthLen octets] - - Responders MUST ignore extra bytes at the end of an AUTHENTICATE - cell. Recognized AuthTypes are 1 and 3, described in the next - two sections. - - Initiators MUST NOT send an AUTHENTICATE cell before they have - verified the certificates presented in the responder's CERTS - cell, and authenticated the responder. - -4.4.1. Link authentication type 1: RSA-SHA256-TLSSecret - - If AuthType is 1 (meaning "RSA-SHA256-TLSSecret"), then the - Authentication field of the AUTHENTICATE cell contains the following: - - TYPE: The characters "AUTH0001" [8 octets] - CID: A SHA256 hash of the initiator's RSA1024 identity key [32 octets] - SID: A SHA256 hash of the responder's RSA1024 identity key [32 octets] - SLOG: A SHA256 hash of all bytes sent from the responder to the - initiator as part of the negotiation up to and including the - AUTH_CHALLENGE cell; that is, the VERSIONS cell, the CERTS cell, - the AUTH_CHALLENGE cell, and any padding cells. [32 octets] - CLOG: A SHA256 hash of all bytes sent from the initiator to the - responder as part of the negotiation so far; that is, the - VERSIONS cell and the CERTS cell and any padding cells. [32 - octets] - SCERT: A SHA256 hash of the responder's TLS link certificate. 
[32 - octets] - TLSSECRETS: A SHA256 HMAC, using the TLS master secret as the - secret key, of the following: - - client_random, as sent in the TLS Client Hello - - server_random, as sent in the TLS Server Hello - - the NUL terminated ASCII string: - "Tor V3 handshake TLS cross-certification" - [32 octets] - RAND: A 24 byte value, randomly chosen by the initiator. (In an - imitation of SSL3's gmt_unix_time field, older versions of Tor - sent an 8-byte timestamp as the first 8 bytes of this field; - new implementations should not do that.) [24 octets] - SIG: A signature of a SHA256 hash of all the previous fields - using the initiator's "Authenticate" key as presented. (As - always in Tor, we use OAEP-MGF1 padding; see tor-spec.txt - section 0.3.) - [variable length] - - To check the AUTHENTICATE cell, a responder checks that all fields - from TYPE through TLSSECRETS contain their unique - correct values as described above, and then verifies the signature. - The server MUST ignore any extra bytes in the signed data after - the RAND field. - - Responders MUST NOT accept this AuthType if the initiator has - claimed to have an Ed25519 identity. - - (There is no AuthType 2: It was reserved but never implemented.) - -4.4.2. Link authentication type 3: Ed25519-SHA256-RFC5705. - - If AuthType is 3, meaning "Ed25519-SHA256-RFC5705", the - Authentication field of the AUTHENTICATE cell is as below: - - Modified values and new fields below are marked with asterisks. - - TYPE: The characters "AUTH0003" [8 octets] - CID: A SHA256 hash of the initiator's RSA1024 identity key [32 octets] - SID: A SHA256 hash of the responder's RSA1024 identity key [32 octets] - CID_ED: The initiator's Ed25519 identity key [32 octets] - SID_ED: The responder's Ed25519 identity key, or all-zero. 
[32 octets] - SLOG: A SHA256 hash of all bytes sent from the responder to the - initiator as part of the negotiation up to and including the - AUTH_CHALLENGE cell; that is, the VERSIONS cell, the CERTS cell, - the AUTH_CHALLENGE cell, and any padding cells. [32 octets] - CLOG: A SHA256 hash of all bytes sent from the initiator to the - responder as part of the negotiation so far; that is, the - VERSIONS cell and the CERTS cell and any padding cells. [32 - octets] - SCERT: A SHA256 hash of the responder's TLS link certificate. [32 - octets] - TLSSECRETS: The output of an RFC5705 Exporter function on the - TLS session, using as its inputs: - - The label string "EXPORTER FOR TOR TLS CLIENT BINDING AUTH0003" - - The context value equal to the initiator's Ed25519 identity key. - - The length 32. - [32 octets] - RAND: A 24 byte value, randomly chosen by the initiator. [24 octets] - SIG: A signature of all previous fields using the initiator's - Ed25519 authentication key (as in the cert with CertType 6). - [variable length] - - To check the AUTHENTICATE cell, a responder checks that all fields - from TYPE through TLSSECRETS contain their unique - correct values as described above, and then verifies the signature. - The server MUST ignore any extra bytes in the signed data after - the RAND field. - -4.5. NETINFO cells - - If version 2 or higher is negotiated, each party sends the other a - NETINFO cell. The cell's payload is: - - TIME (Timestamp) [4 bytes] - OTHERADDR (Other OR's address) [variable] - ATYPE (Address type) [1 byte] - ALEN (Address length) [1 byte] - AVAL (Address value in NBO) [ALEN bytes] - NMYADDR (Number of this OR's addresses) [1 byte] - NMYADDR times: - ATYPE (Address type) [1 byte] - ALEN (Address length) [1 byte] - AVAL (Address value in NBO) [ALEN bytes] - - Recognized address types (ATYPE) are: - - [04] IPv4. - [06] IPv6. - - ALEN MUST be 4 when ATYPE is 0x04 (IPv4) and 16 when ATYPE is 0x06 - (IPv6). 
If the ALEN value is wrong for the given ATYPE value, then - the provided address should be ignored. - - The timestamp is a big-endian unsigned integer number of seconds - since the Unix epoch. Implementations MUST ignore unexpected bytes - at the end of the cell. Clients SHOULD send "0" as their timestamp, to - avoid fingerprinting. - - Implementations MAY use the timestamp value to help decide if their - clocks are skewed. Initiators MAY use "other OR's address" to help - learn which address their connections may be originating from, if they do - not know it; and to learn whether the peer will treat the current - connection as canonical. Implementations SHOULD NOT trust these - values unconditionally, especially when they come from non-authorities, - since the other party can lie about the time or IP addresses it sees. - - Initiators SHOULD use "this OR's address" to make sure - that they have connected to another OR at its canonical address. - (See 5.3.1 below.) - -5. Circuit management - -5.1. CREATE and CREATED cells - - Users set up circuits incrementally, one hop at a time. To create a - new circuit, OPs send a CREATE/CREATE2 cell to the first node, with - the first half of an authenticated handshake; that node responds with - a CREATED/CREATED2 cell with the second half of the handshake. To - extend a circuit past the first hop, the OP sends an EXTEND/EXTEND2 - relay cell (see section 5.1.2) which instructs the last node in the - circuit to send a CREATE/CREATE2 cell to extend the circuit. - - There are two kinds of CREATE and CREATED cells: The older - "CREATE/CREATED" format, and the newer "CREATE2/CREATED2" format. The - newer format is extensible by design; the older one is not. 
- - A CREATE2 cell contains: - - HTYPE (Client Handshake Type) [2 bytes] - HLEN (Client Handshake Data Len) [2 bytes] - HDATA (Client Handshake Data) [HLEN bytes] - - A CREATED2 cell contains: - - HLEN (Server Handshake Data Len) [2 bytes] - HDATA (Server Handshake Data) [HLEN bytes] - - Recognized HTYPEs (handshake types) are: - - 0x0000 TAP -- the original Tor handshake; see 5.1.3 - 0x0001 reserved - 0x0002 ntor -- the ntor+curve25519+sha256 handshake; see 5.1.4 - 0x0003 ntor-v3 -- ntor extended with extra data; see 5.1.4.1 - - The format of a CREATE cell is one of the following: - - HDATA (Client Handshake Data) [TAP_C_HANDSHAKE_LEN bytes] - - or - - HTAG (Client Handshake Type Tag) [16 bytes] - HDATA (Client Handshake Data) [TAP_C_HANDSHAKE_LEN-16 bytes] - - The first format is equivalent to a CREATE2 cell with HTYPE of 'tap' - and length of TAP_C_HANDSHAKE_LEN. The second format is a way to - encapsulate new handshake types into the old CREATE cell format for - migration. See 5.1.2 below. Recognized HTAG values are: - - ntor -- 'ntorNTORntorNTOR' - - The format of a CREATED cell is: - - HDATA (Server Handshake Data) [TAP_S_HANDSHAKE_LEN bytes] - - (It's equivalent to a CREATED2 cell with length of TAP_S_HANDSHAKE_LEN.) - - As usual with DH, x and y MUST be generated randomly. - - In general, clients SHOULD use CREATE whenever they are using the TAP - handshake, and CREATE2 otherwise. Clients SHOULD NOT send the - second format of CREATE cells (the one with the handshake type tag) - to a server directly. - - Servers always reply to a successful CREATE with a CREATED, and to a - successful CREATE2 with a CREATED2. On failure, a server sends a - DESTROY cell to tear down the circuit. - - [CREATE2 is handled by Tor 0.2.4.7-alpha and later.] - -5.1.1. Choosing circuit IDs in create cells - - The CircID for a CREATE/CREATE2 cell is a nonzero integer, selected - by the node (OP or OR) that sends the CREATE/CREATE2 cell. 
- Depending on the link protocol version, there are certain rules for - choosing the value of CircID which MUST be obeyed, as implementations - MAY decide to refuse in case of a violation. In link protocol 3 or - lower, CircIDs are 2 bytes long; in protocol 4 or higher, CircIDs are - 4 bytes long. - - In link protocol version 3 or lower, the nodes choose from only one - half of the possible values based on the ORs' public identity keys, - in order to avoid collisions. If the sending node has a lower key, - it chooses a CircID with an MSB of 0; otherwise, it chooses a CircID - with an MSB of 1. (Public keys are compared numerically by modulus.) - A client with no public key MAY choose any CircID it wishes, since - clients never need to process CREATE/CREATE2 cells. - - In link protocol version 4 or higher, whichever node initiated the - connection MUST set its MSB to 1, and whichever node didn't initiate - the connection MUST set its MSB to 0. - - The CircID value 0 is specifically reserved for cells that do not - belong to any circuit: CircID 0 MUST NOT be used for circuits. No - other CircID value, including 0x8000 or 0x80000000, is reserved. - - Existing Tor implementations choose their CircID values at random from - among the available unused values. To avoid distinguishability, new - implementations should do the same. Implementations MAY give up and stop - attempting to build new circuits on a channel, if a certain number of - randomly chosen CircID values are all in use (today's Tor stops after 64). - -5.1.2. EXTEND and EXTENDED cells - - To extend an existing circuit, the client sends an EXTEND or EXTEND2 - RELAY_EARLY cell to the last node in the circuit. 
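The CircID rules of section 5.1.1 for link protocol version 4 or higher can be sketched as follows. This is only an illustration, not Tor's implementation: the function name `choose_circ_id` and the `in_use` set are hypothetical, and the default retry limit of 64 mirrors the figure quoted in 5.1.1.

```python
import secrets


def choose_circ_id(is_initiator, in_use, max_tries=64):
    """Pick a random unused 4-byte CircID (link protocol 4 or higher).

    The connection initiator sets the MSB to 1, the other side sets it
    to 0. CircID 0 is never returned. Returns None when every randomly
    chosen candidate is already in use, after max_tries attempts.
    """
    for _ in range(max_tries):
        low = secrets.randbits(31)          # the 31 low-order bits
        circ_id = (0x80000000 | low) if is_initiator else low
        if circ_id != 0 and circ_id not in in_use:
            return circ_id
    return None
```

A caller would add the returned value to `in_use` before sending the CREATE/CREATE2 cell, and treat `None` as "stop building circuits on this channel for now".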
- - An EXTEND2 cell's relay payload contains: - - NSPEC (Number of link specifiers) [1 byte] - NSPEC times: - LSTYPE (Link specifier type) [1 byte] - LSLEN (Link specifier length) [1 byte] - LSPEC (Link specifier) [LSLEN bytes] - HTYPE (Client Handshake Type) [2 bytes] - HLEN (Client Handshake Data Len) [2 bytes] - HDATA (Client Handshake Data) [HLEN bytes] - - Link specifiers describe the next node in the circuit and how to - connect to it. Recognized specifiers are: - - [00] TLS-over-TCP, IPv4 address - A four-byte IPv4 address plus two-byte ORPort - [01] TLS-over-TCP, IPv6 address - A sixteen-byte IPv6 address plus two-byte ORPort - [02] Legacy identity - A 20-byte SHA1 identity fingerprint. At most one may be listed. - [03] Ed25519 identity - A 32-byte Ed25519 identity fingerprint. At most one may - be listed. - - Nodes MUST ignore unrecognized specifiers, and MUST accept multiple - instances of specifiers other than 'legacy identity' and - 'Ed25519 identity'. (Nodes SHOULD reject link specifier lists - that include multiple instances of either one of those specifiers.) - - For purposes of indistinguishability, implementations SHOULD send - these link specifiers, if using them, in this order: [00], [02], [03], - [01]. - - The relay payload for an EXTEND relay cell consists of: - - Address [4 bytes] - Port [2 bytes] - Onion skin [TAP_C_HANDSHAKE_LEN bytes] - Identity fingerprint [HASH_LEN bytes] - - The "legacy identity" and "identity fingerprint" fields are the - SHA1 hash of the PKCS#1 ASN1 encoding of the next onion router's - identity (signing) key. (See 0.3 above.) The "Ed25519 identity" - field is the Ed25519 identity key of the target node. Including - this key information allows the extending OR to verify that it is - indeed connected to the correct target OR, and prevents certain - man-in-the-middle attacks. 
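As an illustration of the EXTEND2 relay payload layout above, here is a minimal encoding sketch. The helper names `encode_link_specifier` and `encode_extend2` are hypothetical and not part of any Tor API, and the sketch assumes that multi-byte integers are big-endian, as elsewhere in this document. It emits an IPv4 specifier, a legacy identity, and an Ed25519 identity, in the recommended order [00], [02], [03].

```python
import ipaddress
import struct

# Link specifier types, from the list of recognized specifiers.
LS_IPV4, LS_IPV6, LS_LEGACY_ID, LS_ED25519_ID = 0x00, 0x01, 0x02, 0x03


def encode_link_specifier(lstype, lspec):
    """Encode LSTYPE [1 byte] | LSLEN [1 byte] | LSPEC [LSLEN bytes]."""
    if len(lspec) > 255:
        raise ValueError("link specifier too long for a 1-byte LSLEN")
    return struct.pack("!BB", lstype, len(lspec)) + lspec


def encode_extend2(ipv4, orport, legacy_id, ed_id, htype, hdata):
    """Build an EXTEND2 relay payload (illustrative helper only)."""
    if len(legacy_id) != 20 or len(ed_id) != 32:
        raise ValueError("identities must be 20 and 32 bytes long")
    # Four-byte IPv4 address followed by the two-byte ORPort.
    addr = ipaddress.IPv4Address(ipv4).packed + struct.pack("!H", orport)
    specs = [
        encode_link_specifier(LS_IPV4, addr),
        encode_link_specifier(LS_LEGACY_ID, legacy_id),
        encode_link_specifier(LS_ED25519_ID, ed_id),
    ]
    body = struct.pack("!B", len(specs)) + b"".join(specs)
    # HTYPE [2 bytes] | HLEN [2 bytes] | HDATA [HLEN bytes]
    return body + struct.pack("!HH", htype, len(hdata)) + hdata
```

For example, `encode_extend2("192.0.2.1", 9001, legacy_id, ed_id, 0x0002, onion_skin)` would build the payload for an ntor extension to a relay at that documentation-only address.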
- - Extending ORs MUST check _all_ provided identity keys (if they - recognize the format), and MUST NOT extend the circuit if the - target OR did not prove its ownership of any such identity key. - If only one identity key is provided, but the extending OR knows - the other (from directory information), then the OR SHOULD also - enforce the key in the directory. - - If an extending OR has a channel with a given Ed25519 ID and RSA - identity, and receives a request for that Ed25519 ID and a - different RSA identity, it SHOULD NOT attempt to make another - connection: it should just fail and DESTROY the circuit. - - The client MAY include multiple IPv4 or IPv6 link specifiers in an - EXTEND cell; current OR implementations only consider the first - of each type. - - After checking relay identities, extending ORs generate a - CREATE/CREATE2 cell from the contents of the EXTEND/EXTEND2 cell. - See section 5.3 for details. - - The payload of an EXTENDED cell is the same as the payload of a - CREATED cell. - - The payload of an EXTENDED2 cell is the same as the payload of a - CREATED2 cell. - - [Support for EXTEND2/EXTENDED2 was added in Tor 0.2.4.8-alpha.] - - Clients SHOULD use the EXTEND format whenever sending a TAP - handshake, and MUST use it whenever the EXTEND cell will be handled - by a node running a version of Tor too old to support EXTEND2. In - other cases, clients SHOULD use EXTEND2. - - When generating an EXTEND2 cell, clients SHOULD include the target's - Ed25519 identity whenever the target has one, and whenever the - target supports LinkAuth subprotocol version "3". (See section 9.2.) - - When encoding a non-TAP handshake in an EXTEND cell, clients SHOULD - use the format with 'client handshake type tag'. - -5.1.3. 
The "TAP" handshake - - This handshake uses Diffie-Hellman in Z_p and RSA to compute a set of - shared keys which the client knows are shared only with a particular - server, and the server knows are shared with whomever sent the - original handshake (or with nobody at all). It's not very fast and - not very good. (See Goldberg's "On the Security of the Tor - Authentication Protocol".) - - Define TAP_C_HANDSHAKE_LEN as DH_LEN+KEY_LEN+KP_PAD_LEN. - Define TAP_S_HANDSHAKE_LEN as DH_LEN+HASH_LEN. - - The payload for a CREATE cell is an 'onion skin', which consists of - the first step of the DH handshake data (also known as g^x). This - value is encrypted using the "legacy hybrid encryption" algorithm - (see 0.4 above) to the server's onion key, giving a client handshake: - - KP-encrypted: - Padding [KP_PAD_LEN bytes] - Symmetric key [KEY_LEN bytes] - First part of g^x [KP_ENC_LEN-KP_PAD_LEN-KEY_LEN bytes] - Symmetrically encrypted: - Second part of g^x [DH_LEN-(KP_ENC_LEN-KP_PAD_LEN-KEY_LEN) - bytes] - - The payload for a CREATED cell, or the relay payload for an - EXTENDED cell, contains: - - DH data (g^y) [DH_LEN bytes] - Derivative key data (KH) [HASH_LEN bytes] - - Once the handshake between the OP and an OR is completed, both can - now calculate g^xy with ordinary DH. Before computing g^xy, both parties - MUST verify that the received g^x or g^y value is not degenerate; - that is, it must be strictly greater than 1 and strictly less than p-1 - where p is the DH modulus. Implementations MUST NOT complete a handshake - with degenerate keys. Implementations MUST NOT discard other "weak" - g^x values. - - (Discarding degenerate keys is critical for security; if bad keys - are not discarded, an attacker can substitute the OR's CREATED - cell's g^y with 0 or 1, thus creating a known g^xy and impersonating - the OR. Discarding other keys may allow attacks to learn bits of - the private key.) 
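The degenerate-value check described above can be sketched as follows. The function name `check_dh_public` is hypothetical, and the sketch assumes that the received value arrives as a big-endian unsigned integer; the modulus used in the usage note is a toy value for illustration, not Tor's DH modulus.

```python
def check_dh_public(raw, p):
    """Return the received g^x or g^y as an int, rejecting degenerate values.

    A value is acceptable only if it is strictly greater than 1 and
    strictly less than p-1, where p is the DH modulus. No other "weak"
    values are rejected.
    """
    value = int.from_bytes(raw, "big")
    if not (1 < value < p - 1):
        raise ValueError("degenerate DH public value; abort the handshake")
    return value
```

With the toy modulus p = 23, the values 0, 1, and 22 are rejected, while 2 and 21 are accepted.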
- - Once both parties have g^xy, they derive their shared circuit keys - and 'derivative key data' value via the KDF-TOR function in 5.2.1. - -5.1.4. The "ntor" handshake - - This handshake uses a set of DH handshakes to compute a set of - shared keys which the client knows are shared only with a particular - server, and the server knows are shared with whomever sent the - original handshake (or with nobody at all). Here we use the - "curve25519" group and representation as specified in "Curve25519: - new Diffie-Hellman speed records" by D. J. Bernstein. - - [The ntor handshake was added in Tor 0.2.4.8-alpha.] - - In this section, define: - - H(x,t) as HMAC_SHA256 with message x and key t. - H_LENGTH = 32. - ID_LENGTH = 20. - G_LENGTH = 32 - PROTOID = "ntor-curve25519-sha256-1" - t_mac = PROTOID | ":mac" - t_key = PROTOID | ":key_extract" - t_verify = PROTOID | ":verify" - G = The preferred base point for curve25519 ([9]) - KEYGEN() = The curve25519 key generation algorithm, returning - a private/public keypair. - m_expand = PROTOID | ":key_expand" - KEYID(A) = A - EXP(a, b) = The ECDH algorithm for establishing a shared secret. - - To perform the handshake, the client needs to know an identity key - digest for the server, and an ntor onion key (a curve25519 public - key) for that server. Call the ntor onion key "B". 
The client - generates a temporary keypair: - - x,X = KEYGEN() - - and generates a client-side handshake with contents: - - NODEID Server identity digest [ID_LENGTH bytes] - KEYID KEYID(B) [H_LENGTH bytes] - CLIENT_KP X [G_LENGTH bytes] - - The server generates a keypair of y,Y = KEYGEN(), and uses its ntor - private key 'b' to compute: - - secret_input = EXP(X,y) | EXP(X,b) | ID | B | X | Y | PROTOID - KEY_SEED = H(secret_input, t_key) - verify = H(secret_input, t_verify) - auth_input = verify | ID | B | Y | X | PROTOID | "Server" - - The server's handshake reply is: - - SERVER_KP Y [G_LENGTH bytes] - AUTH H(auth_input, t_mac) [H_LENGTH bytes] - - The client then checks Y is in G^* [see NOTE below], and computes - - secret_input = EXP(Y,x) | EXP(B,x) | ID | B | X | Y | PROTOID - KEY_SEED = H(secret_input, t_key) - verify = H(secret_input, t_verify) - auth_input = verify | ID | B | Y | X | PROTOID | "Server" - - The client verifies that AUTH == H(auth_input, t_mac). - - Both parties check that none of the EXP() operations produced the - point at infinity. [NOTE: This is an adequate replacement for - checking Y for group membership, if the group is curve25519.] - - Both parties now have a shared value for KEY_SEED. They expand this - into the keys needed for the Tor relay protocol, using the KDF - described in 5.2.2 and the tag m_expand. - -5.1.4.1. The "ntor-v3" handshake - - This handshake extends the ntor handshake to include support - for extra data transmitted as part of the handshake. Both - the client and the server can transmit extra data; in both cases, - the extra data is encrypted, but only server data receives - forward secrecy. - - To advertise support for this handshake, servers advertise the - "Relay=4" subprotocol version. To select it, clients use the - 'ntor-v3' HTYPE value in their CREATE2 cells. 
- - In this handshake, we define: - - PROTOID = "ntor3-curve25519-sha3_256-1" - t_msgkdf = PROTOID | ":kdf_phase1" - t_msgmac = PROTOID | ":msg_mac" - t_key_seed = PROTOID | ":key_seed" - t_verify = PROTOID | ":verify" - t_final = PROTOID | ":kdf_final" - t_auth = PROTOID | ":auth_final" - - `ENCAP(s)` -- an encapsulation function. We define this - as `htonll(len(s)) | s`. (Note that `len(ENCAP(s)) = len(s) + 8`). - - `PARTITION(s, n1, n2, n3, ...)` -- a function that partitions a - bytestring `s` into chunks of length `n1`, `n2`, `n3`, and so - on. Extra data is put into a final chunk. If `s` is not long - enough, the function fails. - - H(s, t) = SHA3_256(ENCAP(t) | s) - MAC(k, msg, t) = SHA3_256(ENCAP(t) | ENCAP(k) | msg) - KDF(s, t) = SHAKE_256(ENCAP(t) | s) - ENC(k, m) = AES_256_CTR(k, m) - - EXP(pk,sk), KEYGEN: defined as in curve25519 - - DIGEST_LEN = MAC_LEN = MAC_KEY_LEN = ENC_KEY_LEN = PUB_KEY_LEN = 32 - - ID_LEN = 32 (representing an ed25519 identity key) - - For any tag "t_foo": - H_foo(s) = H(s, t_foo) - MAC_foo(k, msg) = MAC(k, msg, t_foo) - KDF_foo(s) = KDF(s, t_foo) - - Other notation is as in the ntor description in 5.1.4 above. - - The client begins by knowing: - - B, ID -- The curve25519 onion key and Ed25519 ID of the server that it - wants to use. - CM -- A message it wants to send as part of its handshake. - VER -- An optional shared verification string. - - The client computes: - - x,X = KEYGEN() - Bx = EXP(B,x) - secret_input_phase1 = Bx | ID | X | B | PROTOID | ENCAP(VER) - phase1_keys = KDF_msgkdf(secret_input_phase1) - (ENC_K1, MAC_K1) = PARTITION(phase1_keys, ENC_KEY_LEN, MAC_KEY_LEN) - encrypted_msg = ENC(ENC_K1, CM) - msg_mac = MAC_msgmac(MAC_K1, ID | B | X | encrypted_msg) - - The client then sends, as its CREATE handshake: - - NODEID ID [ID_LEN bytes] - KEYID B [PUB_KEY_LEN bytes] - CLIENT_PK X [PUB_KEY_LEN bytes] - MSG encrypted_msg [len(CM) bytes] - MAC msg_mac [MAC_LEN bytes] - - The client remembers x, X, B, ID, Bx, and msg_mac. 
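The helper functions defined for this handshake, and the client's phase-1 key derivation, can be sketched with Python's standard library as below. This is only a sketch under stated assumptions: the curve25519 result Bx and the AES_256_CTR ciphertext are taken as inputs because the standard library provides neither, an absent VER is treated as the empty string, the trailing input of MAC is read as the message argument, and the explicit output length passed to `kdf` is an addition that SHAKE_256 needs as an extendable-output function. All Python names here are hypothetical.

```python
import hashlib
import struct

PROTOID = b"ntor3-curve25519-sha3_256-1"
T_MSGKDF = PROTOID + b":kdf_phase1"
T_MSGMAC = PROTOID + b":msg_mac"
ENC_KEY_LEN = MAC_KEY_LEN = 32


def encap(s):
    """ENCAP(s) = htonll(len(s)) | s, an 8-byte big-endian length prefix."""
    return struct.pack("!Q", len(s)) + s


def partition(s, *lengths):
    """Split s into chunks of the given lengths, plus a final remainder chunk."""
    if len(s) < sum(lengths):
        raise ValueError("input too short to partition")
    chunks, pos = [], 0
    for n in lengths:
        chunks.append(s[pos:pos + n])
        pos += n
    chunks.append(s[pos:])
    return chunks


def h(s, t):
    return hashlib.sha3_256(encap(t) + s).digest()


def mac(k, msg, t):
    return hashlib.sha3_256(encap(t) + encap(k) + msg).digest()


def kdf(s, t, out_len):
    # out_len is an assumption of this sketch; SHAKE_256 needs a length.
    return hashlib.shake_256(encap(t) + s).digest(out_len)


def phase1_keys(bx, node_id, x_pub, b_pub, ver=b""):
    """Derive (ENC_K1, MAC_K1) from Bx | ID | X | B | PROTOID | ENCAP(VER)."""
    secret_input = bx + node_id + x_pub + b_pub + PROTOID + encap(ver)
    stream = kdf(secret_input, T_MSGKDF, ENC_KEY_LEN + MAC_KEY_LEN)
    enc_k1, mac_k1, _rest = partition(stream, ENC_KEY_LEN, MAC_KEY_LEN)
    return enc_k1, mac_k1


def msg_mac(mac_k1, node_id, b_pub, x_pub, encrypted_msg):
    """MAC_msgmac(MAC_K1, ID | B | X | encrypted_msg)."""
    return mac(mac_k1, node_id + b_pub + x_pub + encrypted_msg, T_MSGMAC)
```

Because both sides feed the same inputs into `phase1_keys`, the relay can recompute MAC_K1 from Xb and compare its own `msg_mac` result against the MAC field it received.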
- - When the server receives this handshake, it checks whether NODEID is as - expected, and looks up the (b,B) keypair corresponding to KEYID. If the - keypair is missing or the NODEID is wrong, the handshake fails. - - Now the relay uses `X=CLIENT_PK` to compute: - - Xb = EXP(X,b) - secret_input_phase1 = Xb | ID | X | B | PROTOID | ENCAP(VER) - phase1_keys = KDF_msgkdf(secret_input_phase1) - (ENC_K1, MAC_K1) = PARTITION(phase1_keys, ENC_KEY_LEN, MAC_KEY_LEN) - - expected_mac = MAC_msgmac(MAC_K1, ID | B | X | MSG) - - If `expected_mac` is not `MAC`, the handshake fails. Otherwise - the relay computes `CM` as: - - CM = DEC(ENC_K1, MSG) - - The relay then checks whether `CM` is well-formed, and in response - composes `SM`, the reply that it wants to send as part of the - handshake. It then generates a new ephemeral keypair: - - y,Y = KEYGEN() - - and computes the rest of the handshake: - - Xy = EXP(X,y) - secret_input = Xy | Xb | ID | B | X | Y | PROTOID | ENCAP(VER) - ntor_key_seed = H_key_seed(secret_input) - verify = H_verify(secret_input) - - RAW_KEYSTREAM = KDF_final(ntor_key_seed) - (ENC_KEY, KEYSTREAM) = PARTITION(RAW_KEYSTREAM, ENC_KEY_LEN, ...) - - encrypted_msg = ENC(ENC_KEY, SM) - - auth_input = verify | ID | B | Y | X | MAC | ENCAP(encrypted_msg) | - PROTOID | "Server" - AUTH = H_auth(auth_input) - - The relay then sends as its CREATED handshake: - - Y Y [PUB_KEY_LEN bytes] - AUTH AUTH [DIGEST_LEN bytes] - MSG encrypted_msg [len(SM) bytes, up to end of the message] - - Upon receiving this handshake, the client computes: - - Yx = EXP(Y, x) - secret_input = Yx | Bx | ID | B | X | Y | PROTOID | ENCAP(VER) - ntor_key_seed = H_key_seed(secret_input) - verify = H_verify(secret_input) - - auth_input = verify | ID | B | Y | X | MAC | ENCAP(MSG) | - PROTOID | "Server" - AUTH_expected = H_auth(auth_input) - - If AUTH_expected is equal to AUTH, then the handshake has - succeeded. 
The client can then calculate: - - RAW_KEYSTREAM = KDF_final(ntor_key_seed) - (ENC_KEY, KEYSTREAM) = PARTITION(RAW_KEYSTREAM, ENC_KEY_LEN, ...) - - SM = DEC(ENC_KEY, MSG) - - SM is the message from the relay, and the client uses KEYSTREAM to - generate the shared secrets for the newly created circuit. - - Now both parties share the same KEYSTREAM, and can use it to generate - their circuit keys. - -5.1.5. CREATE_FAST/CREATED_FAST cells - - When initializing the first hop of a circuit, the OP has already - established the OR's identity and negotiated a secret key using TLS. - Because of this, it is not always necessary for the OP to perform the - public key operations to create a circuit. In this case, the - OP MAY send a CREATE_FAST cell instead of a CREATE cell for the first - hop only. The OR responds with a CREATED_FAST cell, and the circuit is - created. - - A CREATE_FAST cell contains: - - Key material (X) [HASH_LEN bytes] - - A CREATED_FAST cell contains: - - Key material (Y) [HASH_LEN bytes] - Derivative key data [HASH_LEN bytes] (See 5.2.1 below) - - The values of X and Y must be generated randomly. - - Once both parties have X and Y, they derive their shared circuit keys - and 'derivative key data' value via the KDF-TOR function in 5.2.1. - - The CREATE_FAST handshake is currently deprecated whenever it is not - necessary; the migration is controlled by the "usecreatefast" - networkstatus parameter as described in dir-spec.txt. - - [Tor 0.3.1.1-alpha and later disable CREATE_FAST by default.] - -5.1.6. Additional data in CREATE/CREATED cells - - Some handshakes (currently ntor-v3 defined above) allow the client or the - relay to send additional data as part of the handshake. 
When used in a - CREATE/CREATED handshake, this additional data must have the following - format: - - N_EXTENSIONS [one byte] - N_EXTENSIONS times: - EXT_FIELD_TYPE [one byte] - EXT_FIELD_LEN [one byte] - EXT_FIELD [EXT_FIELD_LEN bytes] - - (`EXT_FIELD_LEN` may be zero, in which case EXT_FIELD is absent.) - - All parties MUST reject messages that are not well-formed per the - rules above. - - We do not specify specific TYPE semantics here; we leave those for - other proposals and specifications. - - Parties MUST ignore extensions whose `EXT_FIELD_TYPE` values they do not - recognize. - - Unless otherwise specified in the documentation for an extension type: - * Each extension type SHOULD be sent only once in a message. - * Parties MUST ignore all occurrences of an extension - with a given type after the first such occurrence. - * Extensions SHOULD be sent in numerically ascending order by type. - - (The above extension sorting and multiplicity rules are only defaults; - they may be overridden in the description of individual extensions.) - - Currently supported extensions are: - - 1 -- CC_FIELD_REQUEST [Client to server] - - Contains an empty payload. Signifies that the client - wants to use the extended congestion control described - in proposal 324. - - 2 -- CC_FIELD_RESPONSE [Server to client] - - Indicates that the relay will use the congestion control - of proposal 324, as requested by the client. One byte - in length: - - sendme_inc [1 byte] - -5.2. Setting circuit keys - -5.2.1. KDF-TOR - - This key derivation function is used by the TAP and CREATE_FAST - handshakes, and in the current hidden service protocol. It shouldn't - be used for new functionality. - - If the TAP handshake is used to extend a circuit, both parties - base their key material on K0=g^xy, represented as a big-endian unsigned - integer. - - If CREATE_FAST is used, both parties base their key material on - K0=X|Y. 
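The expansion that KDF-TOR applies to K0, as described in this section, can be sketched as follows. The sketch assumes the constants from the preliminaries of this specification, which are not restated in this section: H is SHA-1, HASH_LEN is 20, and KEY_LEN is 16. The function name `kdf_tor` and the dictionary it returns are illustrative choices only.

```python
import hashlib

HASH_LEN = 20  # assumed: SHA-1 digest length, per the spec's preliminaries
KEY_LEN = 16   # assumed: stream cipher key length, per the spec's preliminaries


def kdf_tor(k0):
    """Expand K0 into KH, Df, Db, Kf, Kb using H(K0 | [00]) | H(K0 | [01]) | ..."""
    needed = KEY_LEN * 2 + HASH_LEN * 3
    k = b""
    counter = 0
    while len(k) < needed:
        k += hashlib.sha1(k0 + bytes([counter])).digest()
        counter += 1
    pos = 0
    out = {}
    # Slice K in the order KH, Df, Db, Kf, Kb; excess bytes are discarded.
    for name, length in (("KH", HASH_LEN), ("Df", HASH_LEN), ("Db", HASH_LEN),
                         ("Kf", KEY_LEN), ("Kb", KEY_LEN)):
        out[name] = k[pos:pos + length]
        pos += length
    return out
```

For CREATE_FAST, `k0` would be the concatenation X|Y; for TAP, it would be g^xy encoded as a big-endian unsigned integer.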
- - From the base key material K0, they compute KEY_LEN*2+HASH_LEN*3 bytes of - derivative key data as - - K = H(K0 | [00]) | H(K0 | [01]) | H(K0 | [02]) | ... - - The first HASH_LEN bytes of K form KH; the next HASH_LEN form the forward - digest Df; the next HASH_LEN form the backward digest Db; the next - KEY_LEN form Kf, and the final KEY_LEN form Kb. Excess bytes from K - are discarded. - - KH is used in the handshake response to demonstrate knowledge of the - computed shared key. Df is used to seed the integrity-checking hash - for the stream of data going from the OP to the OR, and Db seeds the - integrity-checking hash for the data stream from the OR to the OP. Kf - is used to encrypt the stream of data going from the OP to the OR, and - Kb is used to encrypt the stream of data going from the OR to the OP. - -5.2.2. KDF-RFC5869 - - For newer KDF needs, Tor uses the key derivation function HKDF from - RFC5869, instantiated with SHA256. (This is due to a construction - from Krawczyk.) The generated key material is: - - K = K_1 | K_2 | K_3 | ... - - Where H(x,t) is HMAC_SHA256 with value x and key t - and K_1 = H(m_expand | INT8(1) , KEY_SEED ) - and K_(i+1) = H(K_i | m_expand | INT8(i+1) , KEY_SEED ) - and m_expand is an arbitrarily chosen value, - and INT8(i) is an octet with the value "i". - - In RFC5869's vocabulary, this is HKDF-SHA256 with info == m_expand, - salt == t_key, and IKM == secret_input. - - When used in the ntor handshake, the first HASH_LEN bytes form the - forward digest Df; the next HASH_LEN form the backward digest Db; the - next KEY_LEN form Kf, the next KEY_LEN form Kb, and the final - DIGEST_LEN bytes are taken as a nonce to use in the place of KH in the - hidden service protocol. Excess bytes from K are discarded. - -5.3. Creating circuits - - When creating a circuit through the network, the circuit creator - (OP) performs the following steps: - - 1. 
Choose an onion router as an end node (R_N): - * N MAY be 1 for non-anonymous directory mirror, introduction point, - or service rendezvous connections. - * N SHOULD be 3 or more for anonymous connections. - Some end nodes accept streams (see 6.1), others are introduction - or rendezvous points (see rend-spec-{v2,v3}.txt). - - 2. Choose a chain of (N-1) onion routers (R_1...R_N-1) to constitute - the path, such that no router appears in the path twice. - - 3. If not already connected to the first router in the chain, - open a new connection to that router. - - 4. Choose a circID not already in use on the connection with the - first router in the chain; send a CREATE/CREATE2 cell along - the connection, to be received by the first onion router. - - 5. Wait until a CREATED/CREATED2 cell is received; finish the - handshake and extract the forward key Kf_1 and the backward - key Kb_1. - - 6. For each subsequent onion router R (R_2 through R_N), extend - the circuit to R. - - To extend the circuit by a single onion router R_M, the OP performs - these steps: - - 1. Create an onion skin, encrypted to R_M's public onion key. - - 2. Send the onion skin in a relay EXTEND/EXTEND2 cell along - the circuit (see sections 5.1.2 and 5.5). - - 3. When a relay EXTENDED/EXTENDED2 cell is received, verify KH, - and calculate the shared keys. The circuit is now extended. - - When an onion router receives an EXTEND relay cell, it sends a CREATE - cell to the next onion router, with the enclosed onion skin as its - payload. - - When an onion router receives an EXTEND2 relay cell, it sends a CREATE2 - cell to the next onion router, with the enclosed HLEN, HTYPE, and HDATA - as its payload. The initiating onion router chooses some circID not yet - used on the connection between the two onion routers. (But see section - 5.1.1 above, concerning choosing circIDs.) 
- - As special cases, if the EXTEND/EXTEND2 cell includes a legacy identity, or - identity fingerprint of all zeroes, or asks to extend back to the relay - that sent the extend cell, the circuit will fail and be torn down. - - Ed25519 identity keys are not required in EXTEND2 cells, so all zero - keys SHOULD be accepted. If the extending relay knows the ed25519 key from - the consensus, it SHOULD also check that key. (See section 5.1.2.) - - If an EXTEND2 cell contains the ed25519 key of the relay that sent the - extend cell, the circuit will fail and be torn down. - - When an onion router receives a CREATE/CREATE2 cell, if it already has a - circuit on the given connection with the given circID, it drops the - cell. Otherwise, after receiving the CREATE/CREATE2 cell, it completes - the specified handshake, and replies with a CREATED/CREATED2 cell. - - Upon receiving a CREATED/CREATED2 cell, an onion router packs its payload - into an EXTENDED/EXTENDED2 relay cell (see section 5.1.2), and sends - that cell up the circuit. Upon receiving the EXTENDED/EXTENDED2 relay - cell, the OP can retrieve the handshake material. - - (As an optimization, OR implementations may delay processing onions - until a break in traffic allows time to do so without harming - network latency too greatly.) - -5.3.1. Canonical connections - - It is possible for an attacker to launch a man-in-the-middle attack - against a connection by telling OR Alice to extend to OR Bob at some - address X controlled by the attacker. The attacker cannot read the - encrypted traffic, but the attacker is now in a position to count all - bytes sent between Alice and Bob (assuming Alice was not already - connected to Bob.) - - To prevent this, when an OR gets an extend request, it SHOULD use an - existing OR connection if the ID matches, and ANY of the following - conditions hold: - - - The IP matches the requested IP. 
- - The OR knows that the IP of the connection it's using is canonical - because it was listed in the NETINFO cell. - - ORs SHOULD NOT check the IPs that are listed in the server descriptor. - Trusting server IPs makes it easier to covertly impersonate a relay, after - stealing its keys. - -5.4. Tearing down circuits - - Circuits are torn down when an unrecoverable error occurs along - the circuit, or when all streams on a circuit are closed and the - circuit's intended lifetime is over. - - ORs SHOULD also tear down circuits which attempt to create: - - * streams with RELAY_BEGIN, or - * rendezvous points with ESTABLISH_RENDEZVOUS, - ending at the first hop. Letting Tor be used as a single hop proxy makes - exit and rendezvous nodes a more attractive target for compromise. - - ORs MAY use multiple methods to check if they are the first hop: - - * If an OR sees a circuit created with CREATE_FAST, the OR is sure to be - the first hop of a circuit. - * If an OR is the responder, and the initiator: - * did not authenticate the link, or - * authenticated with a key that is not in the consensus, - then the OR is probably the first hop of a circuit (or the second hop of - a circuit via a bridge relay). - - Circuits may be torn down either completely or hop-by-hop. - - To tear down a circuit completely, an OR or OP sends a DESTROY - cell to the adjacent nodes on that circuit, using the appropriate - direction's circID. - - Upon receiving an outgoing DESTROY cell, an OR frees resources - associated with the corresponding circuit. If it's not the end of - the circuit, it sends a DESTROY cell for that circuit to the next OR - in the circuit. If the node is the end of the circuit, then it tears - down any associated edge connections (see section 6.1). - - After a DESTROY cell has been processed, an OR ignores all data or - destroy cells for the corresponding circuit. 
- - To tear down part of a circuit, the OP may send a RELAY_TRUNCATE cell - signaling a given OR (Stream ID zero). That OR sends a DESTROY - cell to the next node in the circuit, and replies to the OP with a - RELAY_TRUNCATED cell. - - [Note: If an OR receives a TRUNCATE cell and it has any RELAY cells - still queued on the circuit for the next node it will drop them - without sending them. This is not considered conformant behavior, - but it probably won't get fixed until a later version of Tor. Thus, - clients SHOULD NOT send a TRUNCATE cell to a node running any current - version of Tor if a) they have sent relay cells through that node, - and b) they aren't sure whether those cells have been sent on yet.] - - When an unrecoverable error occurs along a circuit, the nodes - must report it as follows: - * If possible, send a DESTROY cell to ORs _away_ from the client. - * If possible, send *either* a DESTROY cell towards the client, or - a RELAY_TRUNCATED cell towards the client. - - Current versions of Tor do not reuse truncated - circuits: An OP, upon receiving a RELAY_TRUNCATED, will send - forward a DESTROY cell in order to entirely tear down the circuit. - Because of this, we recommend that relays send DESTROY - towards the client, not RELAY_TRUNCATED. - - NOTE: - In tor versions before 0.4.5.13, 0.4.6.11 and 0.4.7.9, relays would - handle an inbound DESTROY by sending the client a RELAY_TRUNCATED - message. Beginning with those versions, relays now propagate - DESTROY cells in either direction, in order to tell every - intermediary OR to stop queuing data on the circuit. The earlier - behavior created queuing pressure on the intermediary ORs. - - The payload of a DESTROY or RELAY_TRUNCATED cell contains a single - octet, describing the reason that the circuit was - closed. RELAY_TRUNCATED cells, and DESTROY cells sent _towards_ the - client, should contain the actual reason from the list of error codes - below.
Reasons in DESTROY cell SHOULD NOT be propagated downward or - upward, due to potential side channel risk: An OR receiving a DESTROY - command should use the DESTROYED reason for its next cell. An OP - should always use the NONE reason for its own DESTROY cells. - - The error codes are: - - 0 -- NONE (No reason given.) - 1 -- PROTOCOL (Tor protocol violation.) - 2 -- INTERNAL (Internal error.) - 3 -- REQUESTED (A client sent a TRUNCATE command.) - 4 -- HIBERNATING (Not currently operating; trying to save bandwidth.) - 5 -- RESOURCELIMIT (Out of memory, sockets, or circuit IDs.) - 6 -- CONNECTFAILED (Unable to reach relay.) - 7 -- OR_IDENTITY (Connected to relay, but its OR identity was not - as expected.) - 8 -- CHANNEL_CLOSED (The OR connection that was carrying this circuit - died.) - 9 -- FINISHED (The circuit has expired for being dirty or old.) - 10 -- TIMEOUT (Circuit construction took too long) - 11 -- DESTROYED (The circuit was destroyed w/o client TRUNCATE) - 12 -- NOSUCHSERVICE (Request for unknown hidden service) - -5.5. Routing relay cells - -5.5.1. Circuit ID Checks - - When a node wants to send a RELAY or RELAY_EARLY cell, it checks the cell's - circID and determines whether the corresponding circuit along that - connection is still open. If not, the node drops the cell. - - When a node receives a RELAY or RELAY_EARLY cell, it checks the cell's - circID and determines whether it has a corresponding circuit along - that connection. If not, the node drops the cell. - -5.5.2. Forward Direction - - The forward direction is the direction that CREATE/CREATE2 cells - are sent. - -5.5.2.1. Routing from the Origin - - When a relay cell is sent from an OP, the OP encrypts the payload - with the stream cipher as follows: - - OP sends relay cell: - For I=N...1, where N is the destination node: - Encrypt with Kf_I. - Transmit the encrypted cell to node 1. - -5.5.2.2. 
Relaying Forward at Onion Routers - - When a forward relay cell is received by an OR, it decrypts the payload - with the stream cipher, as follows: - - 'Forward' relay cell: - Use Kf as key; decrypt. - - The OR then decides whether it recognizes the relay cell, by - inspecting the payload as described in section 6.1 below. If the OR - recognizes the cell, it processes the contents of the relay cell. - Otherwise, it passes the decrypted relay cell along the circuit if - the circuit continues. If the OR at the end of the circuit - encounters an unrecognized relay cell, an error has occurred: the OR - sends a DESTROY cell to tear down the circuit. - - For more information, see section 6 below. - -5.5.3. Backward Direction - - The backward direction is the opposite direction from - CREATE/CREATE2 cells. - -5.5.3.1. Relaying Backward at Onion Routers - - When a backward relay cell is received by an OR, it encrypts the payload - with the stream cipher, as follows: - - 'Backward' relay cell: - Use Kb as key; encrypt. - -5.5.3. Routing to the Origin - - When a relay cell arrives at an OP, the OP decrypts the payload - with the stream cipher as follows: - - OP receives relay cell from node 1: - For I=1...N, where N is the final node on the circuit: - Decrypt with Kb_I. - If the payload is recognized (see section 6.1), then: - The sending node is I. - Stop and process the payload. - -5.6. Handling relay_early cells - - A RELAY_EARLY cell is designed to limit the length any circuit can reach. - When an OR receives a RELAY_EARLY cell, and the next node in the circuit - is speaking v2 of the link protocol or later, the OR relays the cell as a - RELAY_EARLY cell. Otherwise, older Tors will relay it as a RELAY cell. - - If a node ever receives more than 8 RELAY_EARLY cells on a given - outbound circuit, it SHOULD close the circuit. If it receives any - inbound RELAY_EARLY cells, it MUST close the circuit immediately. 
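The RELAY_EARLY limits above can be sketched as a small per-circuit counter. This is only an illustration under stated assumptions: the class name `RelayEarlyCounter`, the method `on_cell`, and the string return values are hypothetical and are not taken from any Tor implementation.

```python
MAX_RELAY_EARLY = 8  # limit on outbound RELAY_EARLY cells, from the text above


class RelayEarlyCounter:
    """Hypothetical per-circuit bookkeeping for RELAY_EARLY cells."""

    def __init__(self) -> None:
        self.outbound_seen = 0

    def on_cell(self, is_relay_early: bool, outbound: bool) -> str:
        """Return "relay" to keep handling the cell, or "close" to close the circuit."""
        if not is_relay_early:
            return "relay"
        if not outbound:
            # Any inbound RELAY_EARLY cell: the circuit MUST be closed.
            return "close"
        self.outbound_seen += 1
        if self.outbound_seen > MAX_RELAY_EARLY:
            # More than 8 outbound RELAY_EARLY cells: the circuit SHOULD be closed.
            return "close"
        return "relay"
```

Because each EXTEND/EXTEND2 must travel inside a RELAY_EARLY cell, a counter like this bounds how many times a circuit can be extended through a given node.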
- - When speaking v2 of the link protocol or later, clients MUST only send - EXTEND/EXTEND2 cells inside RELAY_EARLY cells. Clients SHOULD send the first - ~8 RELAY cells that are not targeted at the first hop of any circuit as - RELAY_EARLY cells too, in order to partially conceal the circuit length. - - [Starting with Tor 0.2.3.11-alpha, relays should reject any - EXTEND/EXTEND2 cell not received in a RELAY_EARLY cell.] - -6. Application connections and stream management - -6.1. Relay cells - - Within a circuit, the OP and the end node use the contents of - RELAY packets to tunnel end-to-end commands and TCP connections - ("Streams") across circuits. End-to-end commands can be initiated - by either edge; streams are initiated by the OP. - - End nodes that accept streams may be: - * exit relays (RELAY_BEGIN, anonymous), - * directory servers (RELAY_BEGIN_DIR, anonymous or non-anonymous), - * onion services (RELAY_BEGIN, anonymous via a rendezvous point). - - The payload of each unencrypted RELAY cell consists of: - - Relay command [1 byte] - 'Recognized' [2 bytes] - StreamID [2 bytes] - Digest [4 bytes] - Length [2 bytes] - Data [Length bytes] - Padding [PAYLOAD_LEN - 11 - Length bytes] - - The relay commands are: - - 1 -- RELAY_BEGIN [forward] - 2 -- RELAY_DATA [forward or backward] - 3 -- RELAY_END [forward or backward] - 4 -- RELAY_CONNECTED [backward] - 5 -- RELAY_SENDME [forward or backward] [sometimes control] - 6 -- RELAY_EXTEND [forward] [control] - 7 -- RELAY_EXTENDED [backward] [control] - 8 -- RELAY_TRUNCATE [forward] [control] - 9 -- RELAY_TRUNCATED [backward] [control] - 10 -- RELAY_DROP [forward or backward] [control] - 11 -- RELAY_RESOLVE [forward] - 12 -- RELAY_RESOLVED [backward] - 13 -- RELAY_BEGIN_DIR [forward] - 14 -- RELAY_EXTEND2 [forward] [control] - 15 -- RELAY_EXTENDED2 [backward] [control] - - 16..18 -- Reserved for UDP; Not yet in use, see prop339. - - 19..22 -- Reserved for Conflux, see prop329. 
- - 32..40 -- Used for hidden services; see rend-spec-{v2,v3}.txt. - - 41..42 -- Used for circuit padding; see Section 3 of padding-spec.txt. - - Used for flow control; see Section 4 of prop324. - 43 -- XON [forward or backward] - 44 -- XOFF [forward or backward] - - Commands labelled as "forward" must only be sent by the originator - of the circuit. Commands labelled as "backward" must only be sent by - other nodes in the circuit back to the originator. Commands marked - as either can be sent either by the originator or other nodes. - - The 'recognized' field is used as a simple indication that the cell - is still encrypted. It is an optimization to avoid calculating - expensive digests for every cell. When sending cells, the unencrypted - 'recognized' MUST be set to zero. - - When receiving and decrypting cells the 'recognized' will always be - zero if we're the endpoint that the cell is destined for. For cells - that we should relay, the 'recognized' field will usually be nonzero, - but will accidentally be zero with P=2^-16. - - When handling a relay cell, if the 'recognized' field in a - decrypted relay payload is zero, the 'digest' field is computed as - the first four bytes of the running digest of all the bytes that have - been destined for this hop of the circuit or originated from this hop - of the circuit, seeded from Df or Db respectively (obtained in - section 5.2 above), and including this RELAY cell's entire payload - (taken with the digest field set to zero). Note that these digests - _do_ include the padding bytes at the end of the cell, not only those up - to "Len". If the digest is correct, the cell is considered "recognized" - for the purposes of decryption (see section 5.5 above). - - (The digest does not include any bytes from relay cells that do - not start or end at this hop of the circuit. That is, it does not - include forwarded data.
Therefore if 'recognized' is zero but the - digest does not match, the running digest at that node should - not be updated, and the cell should be forwarded on.) - - All RELAY cells pertaining to the same tunneled stream have the same - stream ID. StreamIDs are chosen arbitrarily by the OP. No stream - may have a StreamID of zero. Rather, RELAY cells that affect the - entire circuit rather than a particular stream use a StreamID of zero - -- they are marked in the table above as "[control]" style - cells. (Sendme cells are marked as "sometimes control" because they - can include a StreamID or not depending on their purpose -- see - Section 7.) - - The 'Length' field of a relay cell contains the number of bytes in - the relay payload which contain real payload data. The remainder of - the unencrypted payload is padded with padding bytes. Implementations - handle padding bytes of unencrypted relay cells as they do padding - bytes for other cell types; see Section 3. - - The 'Padding' field is used to make relay cell contents unpredictable, to - avoid certain attacks (see proposal 289 for rationale). Implementations - SHOULD fill this field with four zero-valued bytes, followed by as many - random bytes as will fit. (If there are fewer than 4 bytes for padding, - then they should all be filled with zero.) - - Implementations MUST NOT rely on the contents of the 'Padding' field. - - If the RELAY cell is recognized but the relay command is not - understood, the cell must be dropped and ignored. Its contents - still count with respect to the digests and flow control windows, though. - -6.1.1. Calculating the 'Digest' field - - The 'Digest' field serves to check whether a cell has been - fully decrypted, that is, all onion layers have been removed. Having a - single field, namely 'Recognized', is not sufficient, as outlined above.
- - When ENCRYPTING a RELAY cell, an implementation does the following: - - # Encode the cell in binary (recognized and digest set to zero) - tmp = cmd + [0, 0] + stream_id + [0, 0, 0, 0] + length + data + padding - - # Update the digest with the encoded data - digest_state = hash_update(digest_state, tmp) - digest = hash_calculate(digest_state) - - # The encoded data is the same as above with the digest field not being - # zero anymore - encoded = cmd + [0, 0] + stream_id + digest[0..4] + length + data + - padding - - # Now we can encrypt the cell by adding the onion layers ... - - When DECRYPTING a RELAY cell, an implementation does the following: - - decrypted = decrypt(cell) - - # Replace the digest field in decrypted by zeros - tmp = decrypted[0..5] + [0, 0, 0, 0] + decrypted[9..] - - # Update the digest field with the decrypted data and its digest field - # set to zero - digest_state = hash_update(digest_state, tmp) - digest = hash_calculate(digest_state) - - if digest[0..4] == decrypted[5..9] - # The cell has been fully decrypted ... - - The caveat itself is that only the binary data with the digest bytes set to - zero are being taken into account when calculating the running digest. The - final plain-text cells (with the digest field set to its actual value) are - not taken into the running digest. - -6.2. Opening streams and transferring data - - To open a new anonymized TCP connection, the OP chooses an open - circuit to an exit that may be able to connect to the destination - address, selects an arbitrary StreamID not yet used on that circuit, - and constructs a RELAY_BEGIN cell with a payload encoding the address - and port of the destination host. 
The payload format is: - - ADDRPORT [nul-terminated string] - FLAGS [4 bytes] - - ADDRPORT is made of ADDRESS | ':' | PORT | [00] - - where ADDRESS can be a DNS hostname, or an IPv4 address in - dotted-quad format, or an IPv6 address surrounded by square brackets; - and where PORT is a decimal integer between 1 and 65535, inclusive. - - The ADDRPORT string SHOULD be sent in lower case, to avoid - fingerprinting. Implementations MUST accept strings in any case. - - The FLAGS value has one or more of the following bits set, where - "bit 1" is the LSB of the 32-bit value, and "bit 32" is the MSB. - (Remember that all values in Tor are big-endian (see 0.1.1 above), so - the MSB of a 4-byte value is the MSB of the first byte, and the LSB - of a 4-byte value is the LSB of its last byte.) - - bit meaning - 1 -- IPv6 okay. We support learning about IPv6 addresses and - connecting to IPv6 addresses. - 2 -- IPv4 not okay. We don't want to learn about IPv4 addresses - or connect to them. - 3 -- IPv6 preferred. If there are both IPv4 and IPv6 addresses, - we want to connect to the IPv6 one. (By default, we connect - to the IPv4 address.) - 4..32 -- Reserved. Current clients MUST NOT set these. Servers - MUST ignore them. - - Upon receiving this cell, the exit node resolves the address as - necessary, and opens a new TCP connection to the target port. If the - address cannot be resolved, or a connection can't be established, the - exit node replies with a RELAY_END cell. (See 6.3 below.) 
- Otherwise, the exit node replies with a RELAY_CONNECTED cell, whose - payload is in one of the following formats: - - The IPv4 address to which the connection was made [4 octets] - A number of seconds (TTL) for which the address may be cached [4 octets] - - or - - Four zero-valued octets [4 octets] - An address type (6) [1 octet] - The IPv6 address to which the connection was made [16 octets] - A number of seconds (TTL) for which the address may be cached [4 octets] - - [Tor exit nodes before 0.1.2.0 set the TTL field to a fixed value. Later - versions set the TTL to the last value seen from a DNS server, and expire - their own cached entries after a fixed interval. This prevents certain - attacks.] - - Once a connection has been established, the OP and exit node - package stream data in RELAY_DATA cells, and upon receiving such - cells, echo their contents to the corresponding TCP stream. - - If the exit node does not support optimistic data (i.e. its - version number is before 0.2.3.1-alpha), then the OP MUST wait - for a RELAY_CONNECTED cell before sending any data. If the exit - node supports optimistic data (i.e. its version number is - 0.2.3.1-alpha or later), then the OP MAY send RELAY_DATA cells - immediately after sending the RELAY_BEGIN cell (and before - receiving either a RELAY_CONNECTED or RELAY_END cell). - - RELAY_DATA cells sent to unrecognized streams are dropped. If - the exit node supports optimistic data, then RELAY_DATA cells it - receives on streams which have seen RELAY_BEGIN but have not yet - been replied to with a RELAY_CONNECTED or RELAY_END are queued. - If the stream creation succeeds with a RELAY_CONNECTED, the queue - is processed immediately afterwards; if the stream creation fails - with a RELAY_END, the contents of the queue are deleted. - - Relay RELAY_DROP cells are long-range dummies; upon receiving such - a cell, the OR or OP must drop it. - -6.2.1. 
Opening a directory stream - - If a Tor relay is a directory server, it should respond to a - RELAY_BEGIN_DIR cell as if it had received a BEGIN cell requesting a - connection to its directory port. RELAY_BEGIN_DIR cells ignore exit - policy, since the stream is local to the Tor process. - - Directory servers may be: - * authoritative directories (RELAY_BEGIN_DIR, usually non-anonymous), - * bridge authoritative directories (RELAY_BEGIN_DIR, anonymous), - * directory mirrors (RELAY_BEGIN_DIR, usually non-anonymous), - * onion service directories (RELAY_BEGIN_DIR, anonymous). - - If the Tor relay is not running a directory service, it should respond - with a REASON_NOTDIRECTORY RELAY_END cell. - - Clients MUST generate an all-zero payload for RELAY_BEGIN_DIR cells, - and relays MUST ignore the payload. - - In response to a RELAY_BEGIN_DIR cell, relays respond either with a - RELAY_CONNECTED cell on success, or a RELAY_END cell on failure. They - MUST send a RELAY_CONNECTED cell with an all-zero payload, and clients MUST ignore - the payload. - - [RELAY_BEGIN_DIR was not supported before Tor 0.1.2.2-alpha; clients - SHOULD NOT send it to routers running earlier versions of Tor.] - -6.3. Closing streams - - When an anonymized TCP connection is closed, or an edge node - encounters an error on any stream, it sends a 'RELAY_END' cell along the - circuit (if possible) and closes the TCP connection immediately. If - an edge node receives a 'RELAY_END' cell for any stream, it closes - the TCP connection completely, and sends nothing more along the - circuit for that stream. - - The payload of a RELAY_END cell begins with a single 'reason' byte to - describe why the stream is closing. For some reasons, it contains - additional data (depending on the reason.)
The values are: - - 1 -- REASON_MISC (catch-all for unlisted reasons) - 2 -- REASON_RESOLVEFAILED (couldn't look up hostname) - 3 -- REASON_CONNECTREFUSED (remote host refused connection) [*] - 4 -- REASON_EXITPOLICY (OR refuses to connect to host or port) - 5 -- REASON_DESTROY (Circuit is being destroyed) - 6 -- REASON_DONE (Anonymized TCP connection was closed) - 7 -- REASON_TIMEOUT (Connection timed out, or OR timed out - while connecting) - 8 -- REASON_NOROUTE (Routing error while attempting to - contact destination) - 9 -- REASON_HIBERNATING (OR is temporarily hibernating) - 10 -- REASON_INTERNAL (Internal error at the OR) - 11 -- REASON_RESOURCELIMIT (OR has no resources to fulfill request) - 12 -- REASON_CONNRESET (Connection was unexpectedly reset) - 13 -- REASON_TORPROTOCOL (Sent when closing connection because of - Tor protocol violations.) - 14 -- REASON_NOTDIRECTORY (Client sent RELAY_BEGIN_DIR to a - non-directory relay.) - - [*] Older versions of Tor also send this reason when connections are - reset. - - OPs and ORs MUST accept reasons not on the above list, since future - versions of Tor may provide more fine-grained reasons. - - For most reasons, the format of RELAY_END is: - - Reason [1 byte] - - For REASON_EXITPOLICY, the format of RELAY_END is: - - Reason [1 byte] - IPv4 or IPv6 address [4 bytes or 16 bytes] - TTL [4 bytes] - - (If the TTL is absent, it should be treated as if it were 0xffffffff. - If the address is absent or is the wrong length, the RELAY_END message - should be processed anyway.) - - Tors SHOULD NOT send any reason except REASON_MISC for a stream that they - have originated. - - Implementations SHOULD accept empty RELAY_END messages, and treat them - as if they specified REASON_MISC. - - Upon receiving a RELAY_END cell, the recipient may be sure that no further - cells will arrive on that stream, and can treat such cells as a protocol - violation. 
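The RELAY_END payload rules above can be sketched as a tolerant parser. This is a minimal sketch, not Tor's own code: the function name `parse_relay_end` and the returned tuple shape are hypothetical. One further assumption is labeled in the code: when the address has an unexpected length, the sketch reports no address and falls back to the default TTL, since the text only says such a message should be processed anyway.

```python
import ipaddress
import struct
from typing import Optional, Tuple, Union

REASON_MISC = 1
REASON_EXITPOLICY = 4
DEFAULT_TTL = 0xFFFFFFFF  # used when the TTL is absent

Address = Union[ipaddress.IPv4Address, ipaddress.IPv6Address]


def parse_relay_end(data: bytes) -> Tuple[int, Optional[Address], Optional[int]]:
    """Parse the data of a RELAY_END cell into (reason, address, ttl)."""
    if not data:
        # Empty RELAY_END messages are treated as REASON_MISC.
        return (REASON_MISC, None, None)
    reason, rest = data[0], data[1:]
    if reason != REASON_EXITPOLICY:
        # Most reasons carry only the reason byte; unknown reasons are accepted.
        return (reason, None, None)
    # Body lengths: 4 or 16 (address only), 8 or 20 (address plus TTL).
    addr_len = {4: 4, 8: 4, 16: 16, 20: 16}.get(len(rest))
    if addr_len is None:
        # Assumption: absent or wrong-length address, so report none
        # and still process the message with the default TTL.
        return (reason, None, DEFAULT_TTL)
    raw = rest[:addr_len]
    addr: Address = (
        ipaddress.IPv4Address(raw) if addr_len == 4 else ipaddress.IPv6Address(raw)
    )
    ttl_bytes = rest[addr_len:]
    # TTL is a 4-byte big-endian integer when present.
    ttl = struct.unpack("!I", ttl_bytes)[0] if ttl_bytes else DEFAULT_TTL
    return (reason, addr, ttl)
```

Returning the reason unchanged for values outside the listed range follows the requirement that OPs and ORs accept reasons not on the list.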
- - After sending a RELAY_END cell, the sender needs to give the recipient - time to receive that cell. In the meantime, the sender SHOULD remember - how many cells of which types (CONNECTED, SENDME, DATA) it would have - accepted on that stream, and SHOULD kill the circuit if it receives more - than permitted. - - --- [The rest of this section describes unimplemented functionality.] - - Because TCP connections can be half-open, we follow an equivalent - to TCP's FIN/FIN-ACK/ACK protocol to close streams. - - An exit (or onion service) connection can have a TCP stream in one of - three states: 'OPEN', 'DONE_PACKAGING', and 'DONE_DELIVERING'. For the - purposes of modeling transitions, we treat 'CLOSED' as a fourth state, - although connections in this state are not, in fact, tracked by the - onion router. - - A stream begins in the 'OPEN' state. Upon receiving a 'FIN' from - the corresponding TCP connection, the edge node sends a 'RELAY_FIN' - cell along the circuit and changes its state to 'DONE_PACKAGING'. - Upon receiving a 'RELAY_FIN' cell, an edge node sends a 'FIN' to - the corresponding TCP connection (e.g., by calling - shutdown(SHUT_WR)) and changes its state to 'DONE_DELIVERING'. - - When a stream already in 'DONE_DELIVERING' receives a 'FIN', it - also sends a 'RELAY_FIN' along the circuit, and changes its state - to 'CLOSED'. When a stream already in 'DONE_PACKAGING' receives a - 'RELAY_FIN' cell, it sends a 'FIN' and changes its state to - 'CLOSED'. - - If an edge node encounters an error on any stream, it sends a - 'RELAY_END' cell (if possible) and closes the stream immediately. - -6.4. Remote hostname lookup - - To find the address associated with a hostname, the OP sends a - RELAY_RESOLVE cell containing the hostname to be resolved with a NUL - terminating byte. (For a reverse lookup, the OP sends a RELAY_RESOLVE - cell containing an in-addr.arpa address.) The OR replies with a - RELAY_RESOLVED cell containing any number of answers.
Each answer is - of the form: - - Type (1 octet) - Length (1 octet) - Value (variable-width) - TTL (4 octets) - "Length" is the length of the Value field. - "Type" is one of: - - 0x00 -- Hostname - 0x04 -- IPv4 address - 0x06 -- IPv6 address - 0xF0 -- Error, transient - 0xF1 -- Error, nontransient - - If any answer has a type of 'Error', then no other answer may be - given. - - The 'Value' field encodes the answer: - IP addresses are given in network order. - Hostnames are given in standard DNS order ("www.example.com") - and not NUL-terminated. - The content of Errors is currently ignored. Relays currently - set it to the string "Error resolving hostname" with no - terminating NUL. Implementations MUST ignore this value. - - For backward compatibility, if there are any IPv4 answers, one of those - must be given as the first answer. - - The RELAY_RESOLVE cell must use a nonzero, distinct streamID; the - corresponding RELAY_RESOLVED cell must use the same streamID. No stream - is actually created by the OR when resolving the name. - -7. Flow control - -7.1. Link throttling - - Each client or relay should do appropriate bandwidth throttling to - keep its user happy. - - Communicants rely on TCP's default flow control to push back when they - stop reading. - - The mainline Tor implementation uses token buckets (one for reads, - one for writes) for the rate limiting. - - Since 0.2.0.x, Tor has let the user specify an additional pair of - token buckets for "relayed" traffic, so people can deploy a Tor relay - with strict rate limiting, but also use the same Tor as a client. To - avoid partitioning concerns we combine both classes of traffic over a - given OR connection, and keep track of the last time we read or wrote - a high-priority (non-relayed) cell. If it's been less than N seconds - (currently N=30), we give the whole connection high priority, else we - give the whole connection low priority. 
We also give low priority - to reads and writes for connections that are serving directory - information. See proposal 111 for details. - -7.2. Link padding - - Link padding can be created by sending PADDING or VPADDING cells - along the connection; relay cells of type "DROP" can be used for - long-range padding. The payloads of PADDING, VPADDING, or DROP - cells are filled with padding bytes. See Section 3. - - If the link protocol is version 5 or higher, link level padding is - enabled as per padding-spec.txt. On these connections, clients may - negotiate the use of padding with a CELL_PADDING_NEGOTIATE command - whose format is as follows: - - Version [1 byte] - Command [1 byte] - ito_low_ms [2 bytes] - ito_high_ms [2 bytes] - - Currently, only version 0 of this cell is defined. In it, the command - field is either 1 (stop padding) or 2 (start padding). For the start - padding command, a pair of timeout values specifying a low and a high - range bounds for randomized padding timeouts may be specified as unsigned - integer values in milliseconds. The ito_low_ms field should not be lower - than the current consensus parameter value for nf_ito_low (default: - 1500). The ito_high_ms field should not be lower than ito_low_ms. - (If any party receives an out-of-range value, they clamp it so - that it is in-range.) - - For the stop padding command, the timeout fields should be sent as - zero (to avoid client distinguishability) and ignored by the recipient. - - For more details on padding behavior, see padding-spec.txt. - -7.3. Circuit-level flow control - - To control a circuit's bandwidth usage, each OR keeps track of two - 'windows', consisting of how many RELAY_DATA cells it is allowed to - originate or willing to consume. - - These two windows are respectively named: the package window (packaged for - transmission) and the deliver window (delivered for local streams). 
- - Because of our leaky-pipe topology, every relay on the circuit has a pair - of windows, and the OP has a pair of windows for every relay on the - circuit. These windows do not apply to relayed cells, however, and a relay - that is never used for streams will never decrement its window or cause the - client to decrement a window. - - Each 'window' value is initially set based on the consensus parameter - 'circwindow' in the directory (see dir-spec.txt), or to 1000 data cells if - no 'circwindow' value is given. In each direction, cells that are not - RELAY_DATA cells do not affect the window. - - An OR or OP (depending on the stream direction) sends a RELAY_SENDME cell - to indicate that it is willing to receive more cells when its deliver - window goes down below a full increment (100). For example, if the window - started at 1000, it should send a RELAY_SENDME when it reaches 900. - - When an OR or OP receives a RELAY_SENDME, it increments its package window - by a value of 100 (circuit window increment) and proceeds to sending the - remaining RELAY_DATA cells. - - If a package window reaches 0, the OR or OP stops reading from TCP - connections for all streams on the corresponding circuit, and sends no more - RELAY_DATA cells until receiving a RELAY_SENDME cell. - - If a deliver window goes below 0, the circuit should be torn down. - - Starting with tor-0.4.1.1-alpha, authenticated SENDMEs are supported - (version 1, see below). This means that both the OR and OP need to remember - the rolling digest of the cell that precedes (triggers) a RELAY_SENDME. - This can be known if the package window gets to a multiple of the circuit - window increment (100). - - When the RELAY_SENDME version 1 arrives, it will contain a digest that MUST - match the one remembered. This represents a proof that the end point of the - circuit saw the sent cells. On failure to match, the circuit should be torn - down. 
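The circuit-level window accounting above can be sketched as follows, leaving out the authenticated-SENDME digest check. The class `CircuitWindows` and its method names are hypothetical. The sketch also assumes that the receiving side adds one increment back to its deliver window when it sends a RELAY_SENDME; the text implies this but does not state it.

```python
class CircuitWindows:
    """Hypothetical package/deliver window pair for one end of a circuit."""

    def __init__(self, start: int = 1000, increment: int = 100) -> None:
        self.start = start          # 'circwindow' value, or 1000 by default
        self.increment = increment  # circuit window increment
        self.package = start        # RELAY_DATA cells we may still originate
        self.deliver = start        # RELAY_DATA cells we are willing to consume

    def can_package(self) -> bool:
        # At 0 the sender stops reading from TCP for streams on this circuit.
        return self.package > 0

    def on_data_sent(self) -> None:
        if not self.can_package():
            raise RuntimeError("package window is empty; wait for a RELAY_SENDME")
        self.package -= 1

    def on_sendme_received(self) -> None:
        self.package += self.increment

    def on_data_received(self) -> bool:
        """Return True when a RELAY_SENDME should be sent back."""
        self.deliver -= 1
        if self.deliver < 0:
            raise RuntimeError("deliver window below 0; tear down the circuit")
        if self.deliver <= self.start - self.increment:
            # Assumption: sending a SENDME restores one increment locally.
            self.deliver += self.increment
            return True
        return False
```

With the default values, the hundredth RELAY_DATA cell received brings the deliver window to 900 and triggers one RELAY_SENDME, matching the example in the text.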
- - To ensure unpredictability, random bytes should be added to at least one - RELAY_DATA cell within one increment window. In other words, every 100 cells - (increment), random bytes should be introduced in at least one cell. - -7.3.1. SENDME Cell Format - - A circuit-level RELAY_SENDME cell always has its StreamID=0. - - An OR or OP must obey these two consensus parameters in order to know which - version to emit and accept. - - 'sendme_emit_min_version': Minimum version to emit. - 'sendme_accept_min_version': Minimum version to accept. - - If a RELAY_SENDME version is received that is below the minimum accepted - version, the circuit should be closed. - - The RELAY_SENDME payload contains the following: - - VERSION [1 byte] - DATA_LEN [2 bytes] - DATA [DATA_LEN bytes] - - The VERSION tells us what is expected in the DATA section of length - DATA_LEN and how to handle it. The recognized values are: - - 0x00: The rest of the payload should be ignored. - - 0x01: Authenticated SENDME. The DATA section MUST contain: - - DIGEST [20 bytes] - - If the DATA_LEN value is less than 20 bytes, the cell should be - dropped and the circuit closed. If the value is more than 20 bytes, - then the first 20 bytes should be read to get the DIGEST value. - - The DIGEST is the rolling digest value from the RELAY_DATA cell that - immediately preceded (triggered) this RELAY_SENDME. This value is - matched on the other side from the previous cell sent that the OR/OP - must remember. - - (Note that if the digest in use has an output length greater than 20 - bytes—as is the case for the hop of an onion service rendezvous - circuit created by the hs_ntor handshake—we truncate the digest - to 20 bytes here.) - - If the VERSION is unrecognized or below the minimum accepted version (taken - from the consensus), the circuit should be torn down. - -7.4.
Stream-level flow control - - Edge nodes use RELAY_SENDME cells to implement end-to-end flow - control for individual connections across circuits. Similarly to - circuit-level flow control, edge nodes begin with a window of cells - (500) per stream, and increment the window by a fixed value (50) - upon receiving a RELAY_SENDME cell. Edge nodes initiate RELAY_SENDME - cells when both a) the window is <= 450, and b) there are less than - ten cell payloads remaining to be flushed at that edge. - - Stream-level RELAY_SENDME cells are distinguished by having nonzero - StreamID. They are still empty; the body still SHOULD be ignored. - - -8. Handling resource exhaustion - - -8.1. Memory exhaustion. - - (See also dos-spec.md.) - - If RAM becomes low, an OR should begin destroying circuits until - more memory is free again. We recommend the following algorithm: - - - Set a threshold amount of RAM to recover at 10% of the total RAM. - - - Sort the circuits by their 'staleness', defined as the age of the - oldest data queued on the circuit. This data can be: - - * Bytes that are waiting to flush to or from a stream on that - circuit. - - * Bytes that are waiting to flush from a connection created with - BEGIN_DIR. - - * Cells that are waiting to flush or be processed. - - - While we have not yet recovered enough RAM: - - * Free all memory held by the most stale circuit, and send DESTROY - cells in both directions on that circuit. Count the amount of - memory we recovered towards the total. - -9. Subprotocol versioning - - This section specifies the Tor subprotocol versioning. They are broken down - into different types with their current version numbers. Any new version - number should be added to this section. - - The dir-spec.txt details how those versions are encoded. 
See the - "proto"/"pr" line in a descriptor and the "recommended-relay-protocols", - "required-relay-protocols", "recommended-client-protocols" and - "required-client-protocols" lines in the vote/consensus format. - - Here are the rules a relay and client should follow when encountering a - protocol list in the consensus: - - - When a relay lacks a protocol listed in recommended-relay-protocols, - it should warn its operator that the relay is obsolete. - - - When a relay lacks a protocol listed in required-relay-protocols, it - should warn its operator as above. If the consensus is newer than the - date when the software was released or scheduled for release, it must - not attempt to join the network. - - - When a client lacks a protocol listed in recommended-client-protocols, - it should warn the user that the client is obsolete. - - - When a client lacks a protocol listed in required-client-protocols, - it should warn the user as above. If the consensus is newer than the - date when the software was released, it must not connect to the - network. This implements a "safe forward shutdown" mechanism for - zombie clients. - - - If a client or relay has a cached consensus telling it that a given - protocol is required, and it does not implement that protocol, it - SHOULD NOT try to fetch a newer consensus. - - Software release dates SHOULD be automatically updated as part of the - release process, to prevent forgetting to move them forward. Software - release dates MAY be manually adjusted by maintainers if necessary. - - Starting in version 0.2.9.4-alpha, the initial required protocols for - clients that we will Recommend and Require are: - - Cons=1-2 Desc=1-2 DirCache=1 HSDir=1 HSIntro=3 HSRend=1 Link=4 - LinkAuth=1 Microdesc=1-2 Relay=2 - - For relays we will Require: - - Cons=1 Desc=1 DirCache=1 HSDir=1 HSIntro=3 HSRend=1 Link=3-4 - LinkAuth=1 Microdesc=1 Relay=1-2 - - For relays, we will additionally Recommend all protocols which we - recommend for clients. 
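Protocol lists such as the ones above can be compared mechanically. The sketch below parses a list into sets of version numbers and reports which required versions are missing. The function names `parse_protocols` and `missing_protocols` are hypothetical, and the sketch assumes that a value may also be a comma-separated list of numbers and ranges (for example "LinkAuth=1,3"), as the encoding in dir-spec.txt is understood to allow.

```python
from typing import Dict, Set


def parse_protocols(line: str) -> Dict[str, Set[int]]:
    """Parse e.g. "Link=3-4 LinkAuth=1,3" into {"Link": {3, 4}, "LinkAuth": {1, 3}}."""
    result: Dict[str, Set[int]] = {}
    for entry in line.split():
        name, _, values = entry.partition("=")
        versions: Set[int] = set()
        for part in values.split(","):
            if not part:
                continue
            low, sep, high = part.partition("-")
            # A bare number "N" is treated as the range N-N.
            versions.update(range(int(low), int(high if sep else low) + 1))
        result[name] = versions
    return result


def missing_protocols(
    required: Dict[str, Set[int]], supported: Dict[str, Set[int]]
) -> Dict[str, Set[int]]:
    """Return the required protocol versions that `supported` lacks."""
    missing: Dict[str, Set[int]] = {}
    for name, versions in required.items():
        lacking = versions - supported.get(name, set())
        if lacking:
            missing[name] = lacking
    return missing
```

A relay could run a check like this against required-relay-protocols: an empty result means nothing is missing, while a non-empty result calls for the warning (and possibly the refusal to join the network) described in the rules above.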
-
-9.1. "Link"
-
-   The "link" protocols are those used by clients and relays to initiate and
-   receive OR connections and to handle cells on OR connections. The "link"
-   protocol versions correspond 1:1 to those versions.
-
-   Two Tor instances can make a connection to each other only if they have at
-   least one link protocol in common.
-
-   The current "link" versions are: "1" through "5". See section 4.1 for more
-   information. All current Tor versions support "1-3"; versions from
-   0.2.4.11-alpha and on support "1-4"; versions from 0.3.1.1-alpha and on
-   support "1-5". Eventually we will drop "1" and "2".
-
-9.2. "LinkAuth"
-
-   LinkAuth protocols correspond to varieties of Authenticate cells used for
-   the v3+ link protocols.
-
-   Current versions are:
-
-     "1" is the RSA link authentication described in section 4.4.1 above.
-
-     "2" is unused, and reserved by proposal 244.
-
-     "3" is the ed25519 link authentication described in 4.4.2 above.
-
-
-9.3. "Relay"
-
-   The "relay" protocols are those used to handle CREATE/CREATE2
-   cells, and those that handle the various RELAY cell types received
-   after a CREATE/CREATE2 cell. (Except, relay cells used to manage
-   introduction and rendezvous points are managed with the "HSIntro"
-   and "HSRend" protocols respectively.)
-
-   Current versions are:
-
-     "1" -- supports the TAP key exchange, with all features in Tor 0.2.3.
-            Support for CREATE and CREATED and CREATE_FAST and CREATED_FAST
-            and EXTEND and EXTENDED.
-
-     "2" -- supports the ntor key exchange, and all features in Tor
-            0.2.4.19. Includes support for CREATE2 and CREATED2 and
-            EXTEND2 and EXTENDED2.
-
-            Relay=2 has limited IPv6 support:
-              * Clients might not include IPv6 ORPorts in EXTEND2 cells.
-              * Relays (and bridges) might not initiate IPv6 connections in
-                response to EXTEND2 cells containing IPv6 ORPorts, even if they
-                are configured with an IPv6 ORPort.
-
-            However, relays support accepting inbound connections to their IPv6
-            ORPorts.
And they might extend circuits via authenticated IPv6
-            connections to other relays.
-
-     "3" -- relays support extending over IPv6 connections in response to an
-            EXTEND2 cell containing an IPv6 ORPort.
-
-            Bridges might not extend over IPv6, because they try to imitate
-            client behaviour.
-
-            A successful IPv6 extend requires:
-              * Relay subprotocol version 3 (or later) on the extending relay,
-              * an IPv6 ORPort on the extending relay,
-              * an IPv6 ORPort for the accepting relay in the EXTEND2 cell, and
-              * an IPv6 ORPort on the accepting relay.
-            (Because different tor instances can have different views of the
-            network, these checks should be done when the path is selected.
-            Extending relays should only check local IPv6 information, before
-            attempting the extend.)
-
-            When relays receive an EXTEND2 cell containing both an IPv4 and an
-            IPv6 ORPort, and there is no existing authenticated connection with
-            the target relay, the extending relay may choose between IPv4 and
-            IPv6 at random. The extending relay might not try the other address,
-            if the first connection fails.
-
-            As is the case with other subprotocol versions, tor advertises,
-            recommends, or requires support for this protocol version, regardless
-            of its current configuration.
-
-            In particular:
-              * relays without an IPv6 ORPort, and
-              * tor instances that are not relays,
-            have the following behaviour, regardless of their configuration:
-              * advertise support for "Relay=3" in their descriptor
-                (if they are a relay, bridge, or directory authority), and
-              * react to consensuses recommending or requiring support for
-                "Relay=3".
-
-            This subprotocol version is described in proposal 311, and
-            implemented in Tor 0.4.5.1-alpha.
-
-     "4" -- support the ntorv3 (version 3) key exchange and all features in
-            0.4.7.3-alpha. This adds a new CREATE2 cell type. See proposal 332
-            and section 5.1.4.1 above for more details.
-
-9.4. "HSIntro"
-
-   The "HSIntro" protocol handles introduction points.
-
-     "3" -- supports authentication as of proposal 121 in Tor
-            0.2.1.6-alpha.
-
-     "4" -- support ed25519 authentication keys which is defined by the HS v3
-            protocol as part of proposal 224 in Tor 0.3.0.4-alpha.
-
-     "5" -- support ESTABLISH_INTRO cell DoS parameters extension for onion
-            service version 3 only in Tor 0.4.2.1-alpha.
-
-9.5. "HSRend"
-
-   The "HSRend" protocol handles rendezvous points.
-
-     "1" -- supports all features in Tor 0.0.6.
-
-     "2" -- supports RENDEZVOUS2 cells of arbitrary length as long as they
-            have 20 bytes of cookie in Tor 0.2.9.1-alpha.
-
-9.6. "HSDir"
-
-   The "HSDir" protocols are the set of hidden service document types that can
-   be uploaded to, understood by, and downloaded from a tor relay, and the set
-   of URLs available to fetch them.
-
-     "1" -- supports all features in Tor 0.2.0.10-alpha.
-
-     "2" -- support ed25519 blinded keys request which is defined by the HS v3
-            protocol as part of proposal 224 in Tor 0.3.0.4-alpha.
-
-9.7. "DirCache"
-
-   The "DirCache" protocols are the set of documents available for download
-   from a directory cache via BEGIN_DIR, and the set of URLs available to
-   fetch them. (This excludes URLs for hidden service objects.)
-
-     "1" -- supports all features in Tor 0.2.4.19.
-
-     "2" -- adds support for consensus diffs in Tor 0.3.1.1-alpha.
-
-9.8. "Desc"
-
-   Describes features present or absent in descriptors.
-
-   Most features in descriptors don't require a "Desc" update -- only those
-   that need to someday be required. For example, someday clients will need
-   to understand ed25519 identities.
-
-     "1" -- supports all features in Tor 0.2.4.19.
-
-     "2" -- cross-signing with onion-keys, signing with ed25519
-            identities.
-
-9.9. "Microdesc"
-
-   Describes features present or absent in microdescriptors.
-
-   Most features in descriptors don't require a "MicroDesc" update -- only
-   those that need to someday be required. These correspond more or less with
-   consensus methods.
-
-     "1" -- consensus methods 9 through 20.
-
-     "2" -- consensus method 21 (adds ed25519 keys to microdescs).
-
-9.10. "Cons"
-
-   Describes features present or absent in consensus documents.
-
-   Most features in consensus documents don't require a "Cons" update -- only
-   those that need to someday be required.
-
-   These correspond more or less with consensus methods.
-
-     "1" -- consensus methods 9 through 20.
-
-     "2" -- consensus method 21 (adds ed25519 keys to microdescs).
-
-9.11. "Padding"
-
-   Describes the padding capabilities of the relay.
-
-     "1" -- [DEFUNCT] Relay supports circuit-level padding. This version MUST NOT
-            be used as it was also enabled in relays that don't actually support
-            circuit-level padding. Advertised by Tor versions from
-            tor-0.4.0.1-alpha and only up to and including tor-0.4.1.4-rc.
-
-     "2" -- Relay supports the HS circuit setup padding machines (proposal 302).
-            Advertised by Tor versions from tor-0.4.1.5 and onwards.
-
-9.12. "FlowCtrl"
-
-   Describes the flow control protocol at the circuit and stream level. If
-   there is no FlowCtrl advertised, tor supports the unauthenticated flow
-   control features (version 0).
-
-     "1" -- supports authenticated circuit level SENDMEs as of proposal 289 in
-            Tor 0.4.1.1-alpha.
-
-     "2" -- supports congestion control by the Exits which implies a new SENDME
-            format and algorithm. See proposal 324 for more details. Advertised
-            in tor 0.4.7.3-alpha.
-
-9.13. "Datagram"
-
-   Describes the UDP protocol capabilities of a relay.
-
-     "1" -- [RESERVED] supports UDP by an Exit as in the relay command
-            CONNECT_UDP, CONNECTED_UDP and DATAGRAM. See proposal
-            339 for more details. (Not yet advertised, reserved)
diff --git a/version-spec.txt b/version-spec.txt
deleted file mode 100644
index 615f6f2..0000000
--- a/version-spec.txt
+++ /dev/null
@@ -1,86 +0,0 @@
-
-                        HOW TOR VERSION NUMBERS WORK
-
-Table of Contents
-
-    1. The Old Way
-    2. The New Way
-    3. Version status.
-
-1.
The Old Way
-
-   Before 0.1.0, versions were of the format:
-
-     MAJOR.MINOR.MICRO(status(PATCHLEVEL))?(-cvs)?
-
-   where MAJOR, MINOR, MICRO, and PATCHLEVEL are numbers, status is one
-   of "pre" (for an alpha release), "rc" (for a release candidate), or
-   "." for a release. As a special case, "a.b.c" was equivalent to
-   "a.b.c.0". We compare the elements in order (major, minor, micro,
-   status, patchlevel, cvs), with "cvs" preceding non-cvs.
-
-   We would start each development branch with a final version in mind:
-   say, "0.0.8". Our first pre-release would be "0.0.8pre1", followed by
-   (for example) "0.0.8pre2-cvs", "0.0.8pre2", "0.0.8pre3-cvs",
-   "0.0.8rc1", "0.0.8rc2-cvs", and "0.0.8rc2". Finally, we'd release
-   0.0.8. The stable CVS branch would then be versioned "0.0.8.1-cvs",
-   and any eventual bugfix release would be "0.0.8.1".
-
-2. The New Way
-
-   Starting at 0.1.0.1-rc, versions are of the format:
-
-     MAJOR.MINOR.MICRO[.PATCHLEVEL][-STATUS_TAG][ (EXTRA_INFO)]*
-
-   The stuff in parentheses is optional. As before, MAJOR, MINOR, MICRO,
-   and PATCHLEVEL are numbers, with an absent number equivalent to 0.
-   All versions should be distinguishable purely by those four
-   numbers.
-
-   The STATUS_TAG is purely informational, and lets you know how
-   stable we think the release is: "alpha" is pretty unstable; "rc" is a
-   release candidate; and no tag at all means that we have a final
-   release. If the tag ends with "-cvs" or "-dev", you're looking at a
-   development snapshot that came after a given release. If we *do*
-   encounter two versions that differ only by status tag, we compare them
-   lexically. The STATUS_TAG can't contain whitespace.
-
-   The EXTRA_INFO is also purely informational, often containing information
-   about the SCM commit this version came from. It is surrounded by parentheses
-   and can't contain whitespace. Unlike the STATUS_TAG this never impacts the way
-   that versions should be compared. EXTRA_INFO may appear any number of
-   times.
Tools should generally not parse EXTRA_INFO entries.
-
-   Now, we start each development branch with (say) 0.1.1.1-alpha. The
-   patchlevel increments consistently as the status tag changes, for
-   example, as in: 0.1.1.2-alpha, 0.1.1.3-alpha, 0.1.1.4-rc, 0.1.1.5-rc.
-   Eventually, we release 0.1.1.6. The next patch release is 0.1.1.7.
-
-   Between these releases, CVS is versioned with a -cvs tag: after
-   0.1.1.1-alpha comes 0.1.1.1-alpha-cvs, and so on. But starting with
-   0.1.2.1-alpha-dev, we switched to SVN and started using the "-dev"
-   suffix instead of the "-cvs" suffix.
-
-3. Version status.
-
-   Sometimes we need to determine whether a Tor version is obsolete,
-   experimental, or neither, based on a list of recommended versions. The
-   logic is as follows:
-
-   * If a version is listed on the recommended list, then it is
-     "recommended".
-
-   * If a version is newer than every recommended version, that version
-     is "experimental" or "new".
-
-   * If a version is older than every recommended version, it is
-     "obsolete" or "old".
-
-   * The first three components (major,minor,micro) of a version number
-     are its "release series". If a version has other recommended
-     versions with the same release series, and the version is newer
-     than all such recommended versions, but it is not newer than
-     _every_ recommended version, then the version is "new in series".
-
-   * Finally, if none of the above conditions hold, then the version is
-     "un-recommended."
-- 
cgit v1.2.3-54-g00ecf
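
Editorial note (not part of the patch): the "New Way" format and the version-status rules in the removed version-spec.txt can be sketched in Python as below. This is an illustration under stated assumptions, not Tor's own parser. The names `parse_version` and `version_status` are hypothetical, only the post-0.1.0 format is handled, the recommended list is assumed to be non-empty, and status tags are compared as plain strings, which is one reading of "we compare them lexically".

```python
def parse_version(text):
    """Parse 'MAJOR.MINOR.MICRO[.PATCHLEVEL][-STATUS_TAG][ (EXTRA_INFO)]*'.

    Returns (major, minor, micro, patchlevel, status_tag). EXTRA_INFO
    entries are dropped because they never affect comparison.
    """
    core = text.split()[0]                    # discard " (EXTRA_INFO)" entries
    numbers, _, status = core.partition("-")  # status may itself contain "-"
    parts = [int(p) for p in numbers.split(".")]
    if not 3 <= len(parts) <= 4:
        raise ValueError(f"unexpected version format: {text!r}")
    parts += [0] * (4 - len(parts))           # an absent PATCHLEVEL equals 0
    return (*parts, status)


def version_status(version, recommended):
    """Classify `version` against a non-empty list of recommended versions."""
    v = parse_version(version)
    rec = [parse_version(r) for r in recommended]
    if v in rec:
        return "recommended"
    if all(v > r for r in rec):
        return "new"
    if all(v < r for r in rec):
        return "old"
    # The release series is the first three components.
    same_series = [r for r in rec if r[:3] == v[:3]]
    if same_series and all(v > r for r in same_series):
        return "new in series"
    return "un-recommended"
```

For example, with the recommended list 0.1.1.6, 0.1.2.3, 0.1.2.5, version 0.1.1.7 is newer than everything recommended in its own series but not newer than every recommended version, so it falls under the "new in series" rule.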