From e4e0d93d56ee8c1aec4c2efaa7046b651f0fe55c Mon Sep 17 00:00:00 2001
From: Nick Mathewson
Date: Thu, 12 Oct 2023 12:27:58 -0400
Subject: Move all text-only specifications into the OLD_TXT directory.

---
 bandwidth-file-spec.txt | 1315 -----------------------------------------------
 1 file changed, 1315 deletions(-)
 delete mode 100644 bandwidth-file-spec.txt

diff --git a/bandwidth-file-spec.txt b/bandwidth-file-spec.txt
deleted file mode 100644
index bad13f6..0000000
--- a/bandwidth-file-spec.txt
+++ /dev/null
@@ -1,1315 +0,0 @@
-
- Tor Bandwidth File Format
- juga
- teor
-
-Table of Contents
-
- 1. Scope and preliminaries
- 1.2. Acknowledgements
- 1.3. Outline
- 1.4. Format Versions
- 2. Format details
- 2.1. Definitions
- 2.2. Header List format
- 2.3. Relay Line format
- 2.4. Implementation details
- 2.4.1. Writing bandwidth files atomically
- 2.4.2. Additional KeyValue pair definitions
- 2.4.2.1. Simple Bandwidth Scanner
- 2.4.2.2. Torflow
- A. Sample data
- A.1. Generated by Torflow
- A.2. Generated by sbws version 0.1.0
- A.3. Generated by sbws version 1.0.3
- A.4. Headers generated by sbws version 1.0.4
- A.5. Generated by sbws version 1.1.0
- B. Scaling bandwidths
- B.1. Scaling requirements
- B.2. A linear scaling method
- B.3. Quota changes
- B.4. Torflow aggregation
-
-1. Scope and preliminaries
-
- This document describes the format of Tor's Bandwidth File, version
- 1.0.0 and later.
-
- It is a new specification for the existing bandwidth file format,
- which we call version 1.0.0. It also specifies new format versions
- 1.1.0 and later, which are backwards compatible with 1.0.0 parsers.
-
- Since Tor version 0.2.4.12-alpha, the directory authorities use
- the Bandwidth File called "V3BandwidthsFile" generated by
- Torflow [1]. The details of this format are described in Torflow's
- README.spec.txt. We also summarise the format in this specification.
- - The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL - NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and - "OPTIONAL" in this document are to be interpreted as described in - RFC 2119. - -1.2. Acknowledgements - - The original bandwidth generator (Torflow) and format was - created by mike. Teor suggested to write this specification while - contributing on pastly's new bandwidth generator implementation. - - This specification was revised after feedback from: - - Nick Mathewson (nickm) - Iain Learmonth (irl) - -1.3. Outline - - The Tor directory protocol (dir-spec.txt [3]) sections 3.4.1 - and 3.4.2, use the term bandwidth measurements, to refer to what - here is called Bandwidth File. - - A Bandwidth File contains information on relays' bandwidth - capacities and is produced by bandwidth generators, previously known - as bandwidth scanners. - -1.4. Format Versions - - 1.0.0 - The legacy Bandwidth File format - - 1.1.0 - Adds a header containing information about the bandwidth - file. Document the sbws and Torflow relay line keys. - - 1.2.0 - If there are not enough eligible relays, the bandwidth file - SHOULD contain a header, but no relays. (To match Torflow's - existing behaviour.) - - Adds scanner and destination countries to the header. - Adds new KeyValue Lines to the Header List section with - statistics about the number of relays included in the file. - Adds new KeyValues to Relay Bandwidth Lines, with different - bandwidth values (averages and descriptor bandwidths). - - 1.4.0 - Adds monitoring KeyValues to the header and relay lines. - - RelayLines for excluded relays MAY be present in the bandwidth - file for diagnostic reasons. Similarly, if there are not enough - eligible relays, the bandwidth file MAY contain all known relays. - - Diagnostic relay lines SHOULD be marked with vote=0, and - Tor SHOULD NOT use their bandwidths in its votes. - - Also adds Tor version. - 1.5.0 - Removes "recent_measurement_attempt_count" KeyValue. 
- 1.6.0 - Adds congestion control stream events KeyValues. - 1.7.0 - Adds ratios KeyValues to the relay lines and network averages - KeyValues to the header. - - All Tor versions can consume format version 1.0.0. - - All Tor versions can consume format version 1.1.0 and later, - but Tor versions earlier than 0.3.5.1-alpha warn if the header - contains any KeyValue lines after the Timestamp. - - Tor versions 0.4.0.3-alpha, 0.3.5.8, 0.3.4.11, and earlier do not - understand "vote=0". Instead, they will vote for the actual bandwidths - that sbws puts in diagnostic relay lines: - * 1 for relays with "unmeasured=1", and - * the relay's measured and scaled bandwidth when "under_min_report=1". - -2. Format details - - The Bandwidth File MUST contain the following sections: - - Header List (exactly once), which is a partially ordered list of - - Header Lines (one or more times), then - - Relay Lines (zero or more times), in an arbitrary order. - If it does not contain these sections, parsers SHOULD ignore the file. - -2.1. Definitions - - The following nonterminals are defined in Tor directory protocol - sections 1.2., 2.1.1., 2.1.3.: - - bool - Int - SP (space) - NL (newline) - KeywordChar - ArgumentChar - nickname - hexdigest (a '$', followed by 40 hexadecimal characters - ([A-Fa-f0-9])) - - Nonterminal defined section 2 of version-spec.txt [4]: - - version_number - - We define the following nonterminals: - - Line ::= ArgumentChar* NL - RelayLine ::= KeyValue (SP KeyValue)* NL - HeaderLine ::= KeyValue NL - KeyValue ::= Key "=" Value - Key ::= (KeywordChar | "_")+ - Value ::= ArgumentCharValue+ - ArgumentCharValue ::= any printing ASCII character except NL and SP. - Terminator ::= "=====" or "====" - Generators SHOULD use a 5-character terminator. - Timestamp ::= Int - Bandwidth ::= Int - MasterKey ::= a base64-encoded Ed25519 public key, with - padding characters omitted. 
- DateTime ::= "YYYY-MM-DDTHH:MM:SS", as in ISO 8601 - CountryCode ::= Two capital ASCII letters ([A-Z]{2}), as defined in - ISO 3166-1 alpha-2 plus "ZZ" to denote unknown country - (eg the destination is in a Content Delivery Network). - CountryCodeList ::= One or more CountryCode(s) separated by a comma - ([A-Z]{2}(,[A-Z]{2})*). - - Note that key_value and value are defined in Tor directory protocol - with different formats to KeyValue and Value here. - - Tor versions earlier than 0.3.5.1-alpha require all lines in the file - to be 510 characters or less. The previous limit was 254 characters in - Tor 0.2.6.2-alpha and earlier. Parsers MAY ignore longer Lines. - - Note that directory authorities are only supported on the two most - recent stable Tor versions, so we expect that line limits will be - removed after Tor 0.4.0 is released in 2019. - -2.2. Header List format - - It consists of a Timestamp line and zero or more HeaderLines. - - All the header lines MUST conform to the HeaderLine format, except - the first Timestamp line. - - The Timestamp line is not a HeaderLine to keep compatibility with - the legacy Bandwidth File format. - - Some header Lines MUST appear in specific positions, as documented - below. All other Lines can appear in any order. - - If a parser does not recognize any extra material in a header Line, - the Line MUST be ignored. - - If a header Line does not conform to this format, the Line SHOULD be - ignored by parsers. - - It consists of: - - Timestamp NL - - [At start, exactly once.] - - The Unix Epoch time in seconds of the most recent generator bandwidth - result. - - If the generator implementation has multiple threads or - subprocesses which can fail independently, it SHOULD take the most - recent timestamp from each thread and use the oldest value. This - ensures all the threads continue running. - - If there are threads that do not run continuously, they SHOULD be - excluded from the timestamp calculation. 
- - If there are no recent results, the generator MUST NOT generate a new - file. - - It does not follow the KeyValue format for backwards compatibility - with version 1.0.0. - - "version" version_number NL - - [In second position, zero or one time.] - - The specification document format version. - It uses semantic versioning [5]. - - This Line was added in version 1.1.0 of this specification. - - Version 1.0.0 documents do not contain this Line, and the - version_number is considered to be "1.0.0". - - "software" Value NL - - [Zero or one time.] - - The name of the software that created the document. - - This Line was added in version 1.1.0 of this specification. - - Version 1.0.0 documents do not contain this Line, and the software - is considered to be "torflow". - - "software_version" Value NL - - [Zero or one time.] - - The version of the software that created the document. - The version may be a version_number, a git commit, or some other - version scheme. - - This Line was added in version 1.1.0 of this specification. - - "file_created" DateTime NL - - [Zero or one time.] - - The date and time timestamp in ISO 8601 format and UTC time zone - when the file was created. - - This Line was added in version 1.1.0 of this specification. - - "generator_started" DateTime NL - - [Zero or one time.] - - The date and time timestamp in ISO 8601 format and UTC time zone - when the generator started. - - This Line was added in version 1.1.0 of this specification. - - "earliest_bandwidth" DateTime NL - - [Zero or one time.] - - The date and time timestamp in ISO 8601 format and UTC time zone - when the first relay bandwidth was obtained. - - This Line was added in version 1.1.0 of this specification. - - "latest_bandwidth" DateTime NL - - [Zero or one time.] - - The date and time timestamp in ISO 8601 format and UTC time zone - of the most recent generator bandwidth result. - - This time MUST be identical to the initial Timestamp line. 
-
- This duplicate value is included to make the format easier for people
- to read.
-
- This Line was added in version 1.1.0 of this specification.
-
- "number_eligible_relays" Int NL
-
- [Zero or one time.]
-
- The number of relays that have enough measurements to be
- included in the bandwidth file.
-
- This Line was added in version 1.2.0 of this specification.
-
- "minimum_percent_eligible_relays" Int NL
-
- [Zero or one time.]
-
- The percentage of relays in the consensus that SHOULD be
- included in every generated bandwidth file.
-
- If this threshold is not reached, format versions 1.3.0 and earlier
- SHOULD NOT contain any relays. (Bandwidth files always include a
- header.)
-
- Format versions 1.4.0 and later SHOULD include all the relays for
- diagnostic purposes, even if this threshold is not reached. But these
- relays SHOULD be marked so that Tor does not vote on them.
- See section 1.4 for details.
-
- The minimum percentage is 60% in Torflow, so sbws uses
- 60% as the default.
-
- This Line was added in version 1.2.0 of this specification.
-
- "number_consensus_relays" Int NL
-
- [Zero or one time.]
-
- The number of relays in the consensus.
-
- This Line was added in version 1.2.0 of this specification.
-
- "percent_eligible_relays" Int NL
-
- [Zero or one time.]
-
- The number of eligible relays, as a percentage of the number
- of relays in the consensus.
-
- This line SHOULD be equal to:
- (number_eligible_relays * 100.0) / number_consensus_relays
-
- This Line was added in version 1.2.0 of this specification.
-
- "minimum_number_eligible_relays" Int NL
-
- [Zero or one time.]
-
- The minimum number of relays that SHOULD be included in the bandwidth
- file. See minimum_percent_eligible_relays for details.
-
- This line SHOULD be equal to:
- number_consensus_relays * (minimum_percent_eligible_relays / 100.0)
-
- This Line was added in version 1.2.0 of this specification.
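The two formulas given for percent_eligible_relays and
minimum_number_eligible_relays can be sketched as follows. This is only an
illustration: the function names are hypothetical, and rounding down to an
integer is an assumption, because the specification requires an Int value
but does not say how to round.

```python
def percent_eligible_relays(number_eligible_relays, number_consensus_relays):
    """Eligible relays as a whole percentage of the consensus relays.

    Assumption: the result is rounded down; the spec only says "Int".
    """
    if number_consensus_relays <= 0:
        # Avoid dividing by zero when no consensus has been seen yet.
        return 0
    return (number_eligible_relays * 100) // number_consensus_relays


def minimum_number_eligible_relays(number_consensus_relays,
                                   minimum_percent_eligible_relays=60):
    """Minimum number of relays that should appear in the bandwidth file.

    The default of 60 mirrors the Torflow/sbws default percentage.
    Assumption: the result is rounded down.
    """
    return (number_consensus_relays * minimum_percent_eligible_relays) // 100


if __name__ == "__main__":
    consensus = 7000
    eligible = 4500
    print("percent_eligible_relays =",
          percent_eligible_relays(eligible, consensus))
    print("minimum_number_eligible_relays =",
          minimum_number_eligible_relays(consensus))
```

Integer arithmetic is used instead of the "100.0" floating-point form so
that the result does not depend on floating-point rounding.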
-
- "scanner_country" CountryCode NL
-
- [Zero or one time.]
-
- The country, as in political geolocation, where the generator is run.
-
- This Line was added in version 1.2.0 of this specification.
-
- "destinations_countries" CountryCodeList NL
-
- [Zero or one time.]
-
- The country, as in political geolocation, or countries where the
- destination Web server(s) are located.
- The destination Web Servers serve the data that the generator retrieves
- to measure the bandwidth.
-
- This Line was added in version 1.2.0 of this specification.
-
- "recent_consensus_count" Int NL
-
- [Zero or one time.]
-
- The number of different consensuses seen in the last data_period
- days. (data_period is 5 by default.)
-
- Assuming that Tor clients fetch a consensus every 1-2 hours,
- and that the data_period is 5 days, the Value of this Key SHOULD be
- between:
- data_period * 24 / 2 = 60
- data_period * 24 = 120
-
- This Line was added in version 1.4.0 of this specification.
-
- "recent_priority_list_count" Int NL
-
- [Zero or one time.]
-
- The number of times that a list with a subset of relays prioritized
- to be measured has been created in the last data_period days.
- (data_period is 5 by default.)
-
- In 2019, with 7000 relays in the network, the Value of this Key SHOULD be
- approximately:
- data_period * 24 / 1.5 = 80
- where 1.5 is the approximate number of hours it takes to measure a
- priority list of 7000 * 0.05 (350) relays, when the fraction of relays
- in a priority list is 5% (0.05).
-
- This Line was added in version 1.4.0 of this specification.
-
- "recent_priority_relay_count" Int NL
-
- [Zero or one time.]
-
- The number of relays that have been in the list of relays prioritized
- to be measured in the last data_period days. (data_period is 5 by
- default.)
- - In 2019, with 7000 relays in the network, the Value of this Key SHOULD be - approximately: - 80 * (7000 * 0.05) = 28000 - Being 0.05 (5%) the fraction of relays in a priority list and 80 - the approximate number of priority lists (see - "recent_priority_list_count"). - - This Line was added in version 1.4.0 of this specification. - - "recent_measurement_attempt_count" Int NL - - [Zero or one time.] - - The number of times that any relay has been queued to be measured - in the last data_period days. (data_period is 5 by default.) - - In 2019, with 7000 relays in the network, the Value of this Key SHOULD be - approximately the same as "recent_priority_relay_count", - assuming that there is one attempt to measure a relay for each relay that - has been prioritized unless there are system, network or implementation - issues. - - This Line was added in version 1.4.0 of this specification and removed - in version 1.5.0. - - "recent_measurement_failure_count" Int NL - - [Zero or one time.] - - The number of times that the scanner attempted to measure a relay in - the last data_period days (5 by default), but the relay has not been - measured because of system, network or implementation issues. - - This Line was added in version 1.4.0 of this specification. - - "recent_measurements_excluded_error_count" Int NL - - [Zero or one time.] - - The number of relays that have no successful measurements in the last - data_period days (5 by default). - - (See the note in section 1.4, version 1.4.0, about excluded relays.) - - This Line was added in version 1.4.0 of this specification. - - "recent_measurements_excluded_near_count" Int NL - - [Zero or one time.] - - The number of relays that have some successful measurements in the last - data_period days (5 by default), but all those measurements were - performed in a period of time that was too short (by default 1 day). - - (See the note in section 1.4, version 1.4.0, about excluded relays.) 
- - This Line was added in version 1.4.0 of this specification. - - "recent_measurements_excluded_old_count" Int NL - - [Zero or one time.] - - The number of relays that have some successful measurements, but all - those measurements are too old (more than 5 days, by default). - - Excludes relays that are already counted in - recent_measurements_excluded_near_count. - - (See the note in section 1.4, version 1.4.0, about excluded relays.) - - This Line was added in version 1.4.0 of this specification. - - "recent_measurements_excluded_few_count" Int NL - - [Zero or one time.] - - The number of relays that don't have enough recent successful - measurements. (Fewer than 2 measurements in the last 5 days, by - default). - - Excludes relays that are already counted in - recent_measurements_excluded_near_count and - recent_measurements_excluded_old_count. - - (See the note in section 1.4, version 1.4.0, about excluded relays.) - - This Line was added in version 1.4.0 of this specification. - - "time_to_report_half_network" Int NL - - [Zero or one time.] - - The time in seconds that it would take to report measurements about the - half of the network, given the number of eligible relays and the time - it took in the last days (5 days, by default). - - (See the note in section 1.4, version 1.4.0, about excluded relays.) - - This Line was added in version 1.4.0 of this specification. - - "tor_version" version_number NL - - [Zero or one time.] - - The Tor version of the Tor process controlled by the generator. - - This Line was added in version 1.4.0 of this specification. - - "mu" Int NL - - [Zero or one time.] - - The network stream bandwidth average calculated as explained in B4.2. - - This Line was added in version 1.7.0 of this specification. - - "muf" Int NL - - [Zero or one time.] - - The network stream bandwidth average filtered calculated as explained in - B4.2. - - This Line was added in version 1.7.0 of this specification. - - KeyValue NL - - [Zero or more times.] 
-
- There MUST NOT be multiple KeyValue header Lines with the same key.
- If there are, the parser SHOULD choose an arbitrary Line.
-
- If a parser does not recognize a Keyword in a KeyValue Line, that
- Line MUST be ignored.
-
- Future format versions may include additional KeyValue header Lines.
- Additional header Lines will be accompanied by a minor version
- increment.
-
- Implementations MAY add additional header Lines as needed. This
- specification SHOULD be updated to avoid conflicting meanings for
- the same header keys.
-
- Parsers MUST NOT rely on the order of these additional Lines.
-
- Additional header Lines MUST NOT use any keywords specified in the
- relay measurements format.
- If they do, the parser MAY ignore conflicting keywords.
-
- Terminator NL
-
- [Zero or one time.]
-
- The Header List section ends with a Terminator.
-
- In version 1.0.0, Header List ends when the first relay bandwidth
- is found conforming to the next section.
-
- Implementations of version 1.1.0 and later SHOULD use a 5-character
- terminator.
-
- Tor 0.4.0.1-alpha and later look for a 5-character terminator,
- or the first relay bandwidth line. sbws versions 0.1.0 to 1.0.2
- used a 4-character terminator; this bug was fixed in 1.0.3.
-
-2.3. Relay Line format
-
- It consists of zero or more RelayLines containing relay ids and
- bandwidths. The relays and their KeyValues are in arbitrary order.
-
- There MUST NOT be multiple KeyValue pairs with the same key in the same
- RelayLine. If there are, the parser SHOULD choose an arbitrary Value.
-
- There MUST NOT be multiple RelayLines per relay identity (node_id or
- master_key_ed25519). If there are, parsers SHOULD issue a warning.
- Parsers MAY reject the file, choose an arbitrary RelayLine, or ignore
- both RelayLines.
-
- If a parser does not recognize any extra material in a RelayLine,
- the extra material MUST be ignored.
-
- Each RelayLine includes the following KeyValue pairs:
-
- "node_id" hexdigest
-
- [Exactly once.]
-
- The fingerprint for the relay's RSA identity key.
-
- Note: In bandwidth files read by Tor versions earlier than
- 0.3.4.1-alpha, node_id MUST NOT be at the end of the Line.
- These authority versions are no longer supported.
-
- Current Tor versions ignore master_key_ed25519, so node_id MUST be
- present in each relay Line.
-
- Implementations of version 1.1.0 and later SHOULD include both node_id
- and master_key_ed25519. Parsers SHOULD accept Lines that contain at
- least one of them.
-
- "master_key_ed25519" MasterKey
-
- [Zero or one time.]
-
- The relay's master Ed25519 key, base64 encoded,
- without trailing "="s, to avoid ambiguity with the KeyValue "="
- character.
-
- This KeyValue pair SHOULD be present; see the note under node_id.
-
- This KeyValue was added in version 1.1.0 of this specification.
-
- "bw" Bandwidth
-
- [Exactly once.]
-
- The bandwidth of this relay in kilobytes per second.
-
- No Zero Bandwidths:
- Tor accepts zero bandwidths, but they trigger bugs in older Tor
- implementations. Therefore, implementations SHOULD NOT produce zero
- bandwidths. Instead, they SHOULD use one as their minimum bandwidth.
- If there are zero bandwidths, the parser MAY ignore them.
-
- Bandwidth Aggregation:
- Multiple measurements can be aggregated using an averaging scheme,
- such as a mean, median, or decaying average.
-
- Bandwidth Scaling:
- Torflow scales bandwidths to kilobytes per second. Other
- implementations SHOULD use kilobytes per second for their initial
- bandwidth scaling.
-
- If different implementations or configurations are used in votes for
- the same network, their measurements MAY need further scaling. See
- Appendix B for information about scaling, and one possible scaling
- method.
-
- MaxAdvertisedBandwidth:
- Bandwidth generators MUST limit the relays' measured bandwidth based
- on the MaxAdvertisedBandwidth.
- A relay's MaxAdvertisedBandwidth limits the bandwidth-avg in its
- descriptor.
bandwidth-avg is the minimum of MaxAdvertisedBandwidth, - BandwidthRate, RelayBandwidthRate, BandwidthBurst, and - RelayBandwidthBurst. - Therefore, generators MUST limit a relay's measured bandwidth to its - descriptor's bandwidth-avg. This limit needs to be implemented in the - generator, because generators may scale consensus weights before - sending them to Tor. - Generators SHOULD NOT limit measured bandwidths based on descriptors' - bandwidth-observed, because that penalises new relays. - - sbws limits the relay's measured bandwidth to the bandwidth-avg - advertised. - - Torflow partitions relays based on their bandwidth. For unmeasured - relays, Torflow uses the minimum of all descriptor bandwidths, - including bandwidth-avg (MaxAdvertisedBandwidth) and - bandwidth-observed. Then Torflow measures the relays in each partition - against each other, which implicitly limits a relay's measured - bandwidth to the bandwidths of similar relays. - - Torflow also generates consensus weights based on the ratio between the - measured bandwidth and the minimum of all descriptor bandwidths (at the - time of the measurement). So when an operator reduces the - MaxAdvertisedBandwidth for a relay, Torflow reduces that relay's - measured bandwidth. - - KeyValue - - [Zero or more times.] - - Future format versions may include additional KeyValue pairs on a - RelayLine. - Additional KeyValue pairs will be accompanied by a minor version - increment. - - Implementations MAY add additional relay KeyValue pairs as needed. - This specification SHOULD be updated to avoid conflicting meanings - for the same Keywords. - - Parsers MUST NOT rely on the order of these additional KeyValue - pairs. - - Additional KeyValue pairs MUST NOT use any keywords specified in the - header format. - If there are, the parser MAY ignore conflicting keywords. - -2.4. Implementation details - -2.4.1. 
Writing bandwidth files atomically - - To avoid inconsistent reads, implementations SHOULD write bandwidth files - atomically. If the file is transferred from another host, it SHOULD be - written to a temporary path, then renamed to the V3BandwidthsFile path. - - sbws versions 0.7.0 and later write the bandwidth file to an archival - location, create a temporary symlink to that location, then atomically rename - the symlink - to the configured V3BandwidthsFile path. - - Torflow does not write bandwidth files atomically. - -2.4.2. Additional KeyValue pair definitions - - KeyValue pairs in RelayLines that current implementations generate. - -2.4.2.1. Simple Bandwidth Scanner - - sbws RelayLines contain these keys: - - "node_id" hexdigest - - As above. - - "bw" Bandwidth - - As above. - - "nick" nickname - - [Exactly once.] - - The relay nickname. - - Torflow also has a "nick" KeyValue. - - "rtt" Int - - [Zero or one time.] - - The Round Trip Time in milliseconds to obtain 1 byte of data. - - This KeyValue was added in version 1.1.0 of this specification. - It became optional in version 1.3.0 or 1.4.0 of this specification. - - "time" DateTime - - [Exactly once.] - - The date and time timestamp in ISO 8601 format and UTC time zone - when the last bandwidth was obtained. - - This KeyValue was added in version 1.1.0 of this specification. - The Torflow equivalent is "measured_at". - - "success" Int - - [Zero or one time.] - - The number of times that the bandwidth measurements for this relay were - successful. - - This KeyValue was added in version 1.1.0 of this specification. - - "error_circ" Int - - [Zero or one time.] - - The number of times that the bandwidth measurements for this relay - failed because of circuit failures. - - This KeyValue was added in version 1.1.0 of this specification. - The Torflow equivalent is "circ_fail". - - "error_stream" Int - - [Zero or one time.] 
- - The number of times that the bandwidth measurements for this relay - failed because of stream failures. - - This KeyValue was added in version 1.1.0 of this specification. - - "error_destination" Int - - [Zero or one time.] - - The number of times that the bandwidth measurements for this relay - failed because the destination Web server was not available. - - This KeyValue was added in version 1.4.0 of this specification. - - "error_second_relay" Int - - [Zero or one time.] - - The number of times that the bandwidth measurements for this relay - failed because sbws could not find a second relay for the test circuit. - - This KeyValue was added in version 1.4.0 of this specification. - - "error_misc" Int - - [Zero or one time.] - - The number of times that the bandwidth measurements for this relay - failed because of other reasons. - - This KeyValue was added in version 1.1.0 of this specification. - - "bw_mean" Int - - [Zero or one time.] - - The measured bandwidth mean for this relay in bytes per second. - - This KeyValue was added in version 1.2.0 of this specification. - - "bw_median" Int - - [Zero or one time.] - - The measured bandwidth median for this relay in bytes per second. - - This KeyValue was added in version 1.2.0 of this specification. - - "desc_bw_avg" Int - - [Zero or one time.] - - The descriptor average bandwidth for this relay in bytes per second. - - This KeyValue was added in version 1.2.0 of this specification. - - "desc_bw_obs_last" Int - - [Zero or one time.] - - The last descriptor observed bandwidth for this relay in bytes per - second. - - This KeyValue was added in version 1.2.0 of this specification. - - "desc_bw_obs_mean" Int - - [Zero or one time.] - - The descriptor observed bandwidth mean for this relay in bytes per - second. - - This KeyValue was added in version 1.2.0 of this specification. - - "desc_bw_bur" Int - - [Zero or one time.] - - The descriptor burst bandwidth for this relay in bytes per - second. 
- - This KeyValue was added in version 1.2.0 of this specification. - - "consensus_bandwidth" Int - - [Zero or one time.] - - The consensus bandwidth for this relay in bytes per second. - - This KeyValue was added in version 1.2.0 of this specification. - - "consensus_bandwidth_is_unmeasured" Bool - - [Zero or one time.] - - If the consensus bandwidth for this relay was not obtained from - three or more bandwidth authorities, this KeyValue is True or - False otherwise. - - This KeyValue was added in version 1.2.0 of this specification. - - "relay_in_recent_consensus_count" Int - - [Zero or one time.] - - The number of times this relay was found in a consensus in the - last data_period days. (Unless otherwise stated, data_period is - 5 by default.) - - This KeyValue was added in version 1.4.0 of this specification. - - "relay_recent_priority_list_count" Int - - [Zero or one time.] - - The number of times this relay has been prioritized to be measured - in the last data_period days. - - This KeyValue was added in version 1.4.0 of this specification. - - "relay_recent_measurement_attempt_count" Int - - [Zero or one time.] - - The number of times this relay was tried to be measured in the - last data_period days. - - This KeyValue was added in version 1.4.0 of this specification. - - "relay_recent_measurement_failure_count" Int - - [Zero or one time.] - - The number of times this relay was tried to be measured in the - last data_period days, but it was not possible to obtain a - measurement. - - This KeyValue was added in version 1.4.0 of this specification. - - "relay_recent_measurements_excluded_error_count" Int - - [Zero or one time.] - - The number of recent relay measurement attempts that failed. - Measurements are recent if they are in the last data_period days - (5 by default). - - (See the note in section 1.4, version 1.4.0, about excluded relays.) - - This KeyValue was added in version 1.4.0 of this specification. 
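The RelayLine grammar from section 2.1 (KeyValue pairs separated by single
spaces) and the parsing rules from section 2.3 can be sketched as below.
This is a minimal illustration, not how Tor or sbws parse the file: the
function name parse_relay_line is hypothetical, keeping the first Value for
a duplicated key is one arbitrary choice among those the specification
allows, and rejecting a line without node_id or bw is an assumption based on
both keys being marked "Exactly once".

```python
def parse_relay_line(line):
    """Parse one RelayLine into a dict, or return None if it is unusable.

    Assumptions (not mandated by the spec): the first Value wins when a
    key is duplicated, and lines lacking node_id or a numeric bw are
    rejected.
    """
    pairs = {}
    for token in line.rstrip("\n").split(" "):
        key, sep, value = token.partition("=")
        if not sep or not key or not value:
            # Extra material that is not a KeyValue is ignored.
            continue
        # Duplicate keys: keep the first Value seen.
        pairs.setdefault(key, value)

    if "node_id" not in pairs or "bw" not in pairs:
        return None
    if not pairs["bw"].isdigit():
        return None
    pairs["bw"] = int(pairs["bw"])
    return pairs


if __name__ == "__main__":
    sample = ("bw=380 error_circ=0 error_misc=0 error_stream=1 nick=Test "
              "node_id=$68A483E05A2ABDCA6DA5A3EF8DB5177638A27F80 "
              "rtt=380 success=1 time=2018-05-08T16:13:26\n")
    print(parse_relay_line(sample))
```

Unknown keys are kept as strings rather than dropped, so a caller can
convert the sbws-specific Int keys it cares about and disregard the rest.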
- - "relay_recent_measurements_excluded_near_count" Int - - [Zero or one time.] - - When all of a relay's recent successful measurements were performed in - a period of time that was too short (by default 1 day), the relay is - excluded. This KeyValue contains the number of recent successful - measurements for the relay that were ignored for this reason. - - (See the note in section 1.4, version 1.4.0, about excluded relays.) - - This KeyValue was added in version 1.4.0 of this specification. - - "relay_recent_measurements_excluded_old_count" Int - - [Zero or one time.] - - The number of successful measurements for this relay that are too old - (more than data_period days, 5 by default). - - Excludes measurements that are already counted in - relay_recent_measurements_excluded_near_count. - - (See the note in section 1.4, version 1.4.0, about excluded relays.) - - This KeyValue was added in version 1.4.0 of this specification. - - "relay_recent_measurements_excluded_few_count" Int - - [Zero or one time.] - - The number of successful measurements for this relay that were ignored - because the relay did not have enough successful measurements (fewer - than 2, by default). - - Excludes measurements that are already counted in - relay_recent_measurements_excluded_near_count or - relay_recent_measurements_excluded_old_count. - - (See the note in section 1.4, version 1.4.0, about excluded relays.) - - This KeyValue was added in version 1.4.0 of this specification. - - "under_min_report" bool - - [Zero or one time.] - - If the value is 1, there are not enough eligible relays in the - bandwidth file, and Tor bandwidth authorities MAY NOT vote on this - relay. (Current Tor versions do not change their behaviour based on - the "under_min_report" key.) - - If the value is 0 or the KeyValue is not present, there are enough - relays in the bandwidth file. - - Because Tor versions released before April 2019 (see section 1.4. 
for - the full list of versions) ignore "vote=0", generator implementations - MUST NOT change the bandwidths for under_min_report relays. Using the - same bw value makes authorities that do not understand "vote=0" - or "under_min_report=1" produce votes that don't change relay weights - too much. It also avoids flapping when the reporting threshold is - reached. - - This KeyValue was added in version 1.4.0 of this specification. - - "unmeasured" bool - - [Zero or one time.] - - If the value is 1, this relay was not successfully measured and - Tor bandwidth authorities MAY NOT vote on this relay. - (Current Tor versions do not change their behaviour based on - the "unmeasured" key.) - - If the value is 0 or the KeyValue is not present, this relay - was successfully measured. - - Because Tor versions released before April 2019 (see section 1.4. for - the full list of versions) ignore "vote=0", generator implementations - MUST set "bw=1" for unmeasured relays. Using the minimum bw value - makes authorities that do not understand "vote=0" or "unmeasured=1" - produce votes that don't change relay weights too much. - - This KeyValue was added in version 1.4.0 of this specification. - - "vote" bool - - [Zero or one time.] - - If the value is 0, Tor directory authorities SHOULD ignore the relay's - entry in the bandwidth file. They SHOULD vote for the relay the same - way they would vote for a relay that is not present in the file. - - This MAY be the case when this relay was not successfully measured but - it is included in the Bandwidth File, to diagnose why they were not - measured. - - If the value is 1 or the KeyValue is not present, Tor directory - authorities MUST use the relay's bw value in any votes for that relay. - - Implementations MUST also set "bw=1" for unmeasured relays. - But they MUST NOT change the bw for under_min_report relays. - (See the explanations under "unmeasured" and "under_min_report" - for more details.) 
- - This KeyValue was added in version 1.4.0 of this specification. - - "xoff_recv" Int - - [Zero or one time.] - - The number of times this relay received `XOFF_RECV` stream events while - being measured in the last data_period days. - - This KeyValue was added in version 1.6.0 of this specification. - - "xoff_sent" Int - - [Zero or one time.] - - The number of times this relay received `XOFF_SENT` stream events while - being measured in the last data_period days. - - This KeyValue was added in version 1.6.0 of this specification. - - "r_strm" Float - - [Zero or one time.] - - The stream ratio of this relay, calculated as explained in section B.4, - step 3. - - This KeyValue was added in version 1.7.0 of this specification. - - "r_strm_filt" Float - - [Zero or one time.] - - The filtered stream ratio of this relay, calculated as explained in - section B.4, step 3. - - This KeyValue was added in version 1.7.0 of this specification. - - -2.4.2.2. Torflow - - Torflow RelayLines include node_id and bw, and other KeyValue pairs [2]. - -References: - -1. https://gitweb.torproject.org/torflow.git -2. https://gitweb.torproject.org/torflow.git/tree/NetworkScanners/BwAuthority/README.spec.txt#n332 - The Torflow specification is outdated, and does not match the current - implementation. See section A.1 for the format produced by Torflow. -3. https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt -4. https://gitweb.torproject.org/torspec.git/tree/version-spec.txt -5. https://semver.org/ - -A. Sample data - -The following data was not obtained from any real measurement. - -A.1. 
Generated by Torflow - -This is an example version 1.0.0 document: - -1523911758 -node_id=$68A483E05A2ABDCA6DA5A3EF8DB5177638A27F80 bw=760 nick=Test measured_at=1523911725 updated_at=1523911725 pid_error=4.11374090719 pid_error_sum=4.11374090719 pid_bw=57136645 pid_delta=2.12168374577 circ_fail=0.2 scanner=/filepath -node_id=$96C15995F30895689291F455587BD94CA427B6FC bw=189 nick=Test2 measured_at=1523911623 updated_at=1523911623 pid_error=3.96703337994 pid_error_sum=3.96703337994 pid_bw=47422125 pid_delta=2.65469736988 circ_fail=0.0 scanner=/filepath - -A.2. Generated by sbws version 0.1.0 - -1523911758 -version=1.1.0 -software=sbws -software_version=0.1.0 -latest_bandwidth=2018-04-16T20:49:18 -file_created=2018-04-16T21:49:18 -generator_started=2018-04-16T15:13:25 -earliest_bandwidth=2018-04-16T15:13:26 -==== -bw=380 error_circ=0 error_misc=0 error_stream=1 master_key_ed25519=YaqV4vbvPYKucElk297eVdNArDz9HtIwUoIeo0+cVIpQ nick=Test node_id=$68A483E05A2ABDCA6DA5A3EF8DB5177638A27F80 rtt=380 success=1 time=2018-05-08T16:13:26 -bw=189 error_circ=0 error_misc=0 error_stream=0 master_key_ed25519=a6a+dZadrQBtfSbmQkP7j2ardCmLnm5NJ4ZzkvDxbo0I nick=Test2 node_id=$96C15995F30895689291F455587BD94CA427B6FC rtt=378 success=1 time=2018-05-08T16:13:36 - -A.3. 
Generated by sbws version 1.0.3 - -1523911758 -version=1.2.0 -latest_bandwidth=2018-04-16T20:49:18 -file_created=2018-04-16T21:49:18 -generator_started=2018-04-16T15:13:25 -earliest_bandwidth=2018-04-16T15:13:26 -minimum_number_eligible_relays=3862 -minimum_percent_eligible_relays=60 -number_consensus_relays=6436 -number_eligible_relays=6000 -percent_eligible_relays=93 -software=sbws -software_version=1.0.3 -===== -bw=38000 bw_mean=1127824 bw_median=1180062 desc_bw_avg=1073741824 desc_bw_obs_last=17230879 desc_bw_obs_mean=14732306 error_circ=0 error_misc=0 error_stream=1 master_key_ed25519=YaqV4vbvPYKucElk297eVdNArDz9HtIwUoIeo0+cVIpQ nick=Test node_id=$68A483E05A2ABDCA6DA5A3EF8DB5177638A27F80 rtt=380 success=1 time=2018-05-08T16:13:26 -bw=1 bw_mean=199162 bw_median=185675 desc_bw_avg=409600 desc_bw_obs_last=836165 desc_bw_obs_mean=858030 error_circ=0 error_misc=0 error_stream=0 master_key_ed25519=a6a+dZadrQBtfSbmQkP7j2ardCmLnm5NJ4ZzkvDxbo0I nick=Test2 node_id=$96C15995F30895689291F455587BD94CA427B6FC rtt=378 success=1 time=2018-05-08T16:13:36 - -A.3.1. When there are not enough eligible measured relays: - -1540496079 -version=1.2.0 -earliest_bandwidth=2018-10-20T19:35:52 -file_created=2018-10-25T19:35:03 -generator_started=2018-10-25T11:42:56 -latest_bandwidth=2018-10-25T19:34:39 -minimum_number_eligible_relays=3862 -minimum_percent_eligible_relays=60 -number_consensus_relays=6436 -number_eligible_relays=2960 -percent_eligible_relays=46 -software=sbws -software_version=1.0.3 -===== - -A.4. 
Headers generated by sbws version 1.0.4 - -1523911758 -version=1.2.0 -latest_bandwidth=2018-04-16T20:49:18 -destinations_countries=TH,ZZ -file_created=2018-04-16T21:49:18 -generator_started=2018-04-16T15:13:25 -earliest_bandwidth=2018-04-16T15:13:26 -minimum_number_eligible_relays=3862 -minimum_percent_eligible_relays=60 -number_consensus_relays=6436 -number_eligible_relays=6000 -percent_eligible_relays=93 -scanner_country=SN -software=sbws -software_version=1.0.4 -===== - -A.5 Generated by sbws version 1.1.0 - -1523911758 -version=1.4.0 -latest_bandwidth=2018-04-16T20:49:18 -destinations_countries=TH,ZZ -file_created=2018-04-16T21:49:18 -generator_started=2018-04-16T15:13:25 -earliest_bandwidth=2018-04-16T15:13:26 -minimum_number_eligible_relays=3862 -minimum_percent_eligible_relays=60 -number_consensus_relays=6436 -number_eligible_relays=6000 -percent_eligible_relays=93 -recent_measurement_attempt_count=6243 -recent_measurement_failure_count=732 -recent_measurements_excluded_error_count=969 -recent_measurements_excluded_few_count=3946 -recent_measurements_excluded_near_count=90 -recent_measurements_excluded_old_count=0 -recent_priority_list_count=20 -recent_priority_relay_count=6243 -scanner_country=SN -software=sbws -software_version=1.1.0 -time_to_report_half_network=57273 -===== -bw=1 error_circ=1 error_destination=0 error_misc=0 error_second_relay=0 error_stream=0 master_key_ed25519=J3HQ24kOQWac3L1xlFLp7gY91qkb5NuKxjj1BhDi+m8 nick=snap269 node_id=$DC4D609F95A52614D1E69C752168AF1FCAE0B05F relay_recent_measurement_attempt_count=3 relay_recent_measurements_excluded_error_count=1 relay_recent_measurements_excluded_near_count=3 relay_recent_consensus_count=3 relay_recent_priority_list_count=3 success=3 time=2019-03-16T18:20:57 unmeasured=1 vote=0 -bw=1 error_circ=0 error_destination=0 error_misc=0 error_second_relay=0 error_stream=2 master_key_ed25519=h6ZB1E1yBFWIMloUm9IWwjgaPXEpL5cUbuoQDgdSDKg nick=relay node_id=$C4544F9E209A9A9B99591D548B3E2822236C0503 
relay_recent_measurement_attempt_count=3 relay_recent_measurements_excluded_error_count=2 relay_recent_measurements_excluded_few_count=1 relay_recent_consensus_count=3 relay_recent_priority_list_count=3 success=1 time=2019-03-17T06:50:58 unmeasured=1 vote=0 - -B. Scaling bandwidths - -B.1. Scaling requirements - - Tor accepts zero bandwidths, but they trigger bugs in older Tor - implementations. Therefore, scaling methods SHOULD perform the - following checks: - * If the total bandwidth is zero, all relays should be given equal - bandwidths. - * If the scaled bandwidth is zero, it should be rounded up to one. - - Initial experiments indicate that scaling may not be needed for - torflow and sbws, because their measured bandwidths are similar - enough already. - -B.2. A linear scaling method - - If scaling is required, here is a simple linear bandwidth scaling - method, which ensures that all bandwidth votes contain approximately - the same total bandwidth: - - 1. Calculate the relay quota by dividing the total measured bandwidth - in all votes, by the number of relays with measured bandwidth - votes. In the public tor network, this is approximately 7500 as of - April 2018. The quota should be a consensus parameter, so it can be - adjusted for all generators on the network. - - 2. Calculate a vote quota by multiplying the relay quota by the number - of relays this bandwidth authority has measured - bandwidths for. - - 3. Calculate a scaling factor by dividing the vote quota by the - total unscaled measured bandwidth in this bandwidth - authority's upcoming vote. - - 4. Multiply each unscaled measured bandwidth by the scaling - factor. - - Now, the total scaled bandwidth in the upcoming vote is - approximately equal to the quota. - -B.3. Quota changes - - If all generators are using scaling, the quota can be gradually - reduced or increased as needed. 
Smaller quotas decrease the size - of uncompressed consensuses, and may decrease the size of - consensus diffs and compressed consensuses. But if the relay - quota is too small, some relays may be over- or under-weighted. - -B.4. Torflow aggregation - - Torflow implements two methods to compute the bandwidth values from the - (stream) bandwidth measurements: with and without PID control feedback. - The method described here is without PID control (see Torflow - specification, section 2.2). - - In the following steps, the relays' measured bandwidths refer to the - ones that this bandwidth authority has measured for the relays that - would be included in this bandwidth authority's upcoming vote. - - 1. Calculate the filtered bandwidth for each relay: - - choose the relay's measurements (`bw_j`) that are equal to or greater - than the mean of the measurements for this relay - - calculate the mean of those measurements - - In pseudocode: - - bw_filt_i = mean(bw_j where bw_j >= mean(bw_j)) - - 2. Calculate network averages: - - calculate the filtered average by dividing the sum of all the - relays' filtered bandwidths by the number of relays that have been - measured (`n`), i.e., calculate the mean of the relays' - filtered bandwidths. - - calculate the stream average by dividing the sum of all the - relays' measured bandwidths by the number of relays that have been - measured (`n`), i.e., calculate the mean of the relays' - measured bandwidths. - - In pseudocode: - - bw_avg_filt = sum(bw_filt_i) / n - bw_avg_strm = sum(bw_i) / n - - 3. Calculate ratios for each relay: - - calculate the filtered ratio by dividing each relay's filtered - bandwidth by the filtered average - - calculate the stream ratio by dividing each relay's measured - bandwidth by the stream average - - In pseudocode: - - r_filt_i = bw_filt_i / bw_avg_filt - r_strm_i = bw_i / bw_avg_strm - - 4. 
Calculate the final ratio for each relay: - The final ratio is the larger of the filtered ratio and the - stream ratio. - - In pseudocode: - - r_i = max(r_filt_i, r_strm_i) - - 5. Calculate the scaled bandwidth for each relay: - The most recent descriptor observed bandwidth (`bw_obs_i`) is - multiplied by the final ratio. - - In pseudocode: - - bw_new_i = r_i * bw_obs_i
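
  The five steps of section B.4 can be sketched in Python as follows.
  This is an illustrative sketch, not Torflow's or sbws's code: the function
  name torflow_scale and its two dictionary arguments are hypothetical, and
  it assumes that a relay's measured bandwidth (`bw_i`) is the mean of its
  measurements. It does not apply the zero-bandwidth checks of section B.1.

```python
from statistics import mean


def torflow_scale(measurements, bw_obs):
    """Scale bandwidths without PID control.

    measurements: dict mapping relay id -> non-empty list of measured bandwidths
    bw_obs: dict mapping relay id -> most recent descriptor observed bandwidth
    Returns a dict mapping relay id -> scaled bandwidth (float).
    """
    # Assumption: bw_i is the mean of the relay's measurements.
    bw_strm = {r: mean(m) for r, m in measurements.items()}

    # Step 1: mean of the measurements that are >= the relay's own mean.
    bw_filt = {
        r: mean([b for b in m if b >= bw_strm[r]])
        for r, m in measurements.items()
    }

    # Step 2: network averages over the n measured relays.
    # (Raises ZeroDivisionError later if every measurement is zero.)
    n = len(measurements)
    bw_avg_filt = sum(bw_filt.values()) / n
    bw_avg_strm = sum(bw_strm.values()) / n

    scaled = {}
    for r in measurements:
        # Step 3: per-relay ratios.
        r_filt = bw_filt[r] / bw_avg_filt
        r_strm = bw_strm[r] / bw_avg_strm
        # Step 4: final ratio. Step 5: scale the observed bandwidth.
        scaled[r] = max(r_filt, r_strm) * bw_obs[r]
    return scaled
```

  For example, with measurements {"A": [100, 300], "B": [200, 200]} and an
  observed bandwidth of 1000 for both relays, relay A's filtered ratio
  (300 / 250) exceeds its stream ratio (200 / 200), so A is scaled to about
  1200, while B keeps its stream ratio of 1 and stays at 1000.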