diff options
Diffstat (limited to 'proposals/313-relay-ipv6-stats.txt')
-rw-r--r-- | proposals/313-relay-ipv6-stats.txt | 416 |
1 files changed, 416 insertions, 0 deletions
diff --git a/proposals/313-relay-ipv6-stats.txt b/proposals/313-relay-ipv6-stats.txt new file mode 100644 index 0000000..bd546b5 --- /dev/null +++ b/proposals/313-relay-ipv6-stats.txt @@ -0,0 +1,416 @@ +Filename: 313-relay-ipv6-stats.txt +Title: Tor Relay IPv6 Statistics +Author: teor, Karsten Loesing, Nick Mathewson +Created: 10-February-2020 +Status: Draft +Ticket: #33159 + +0. Abstract + + We propose that: + * tor relays should collect statistics on IPv6 connections, and + * tor relays and bridges should collect statistics on consumed bandwidth. + Like tor's existing connection and consumed bandwidth statistics, these new + IPv6 statistics will be published in each relay's extra-info descriptor. + + We also plan to write a script that shows the number of relays in the + consensus that support: + * IPv6 extends, and + * IPv6 client connections. + This script will be used for medium-term monitoring, during the deployment + of tor's IPv6 changes in 2020. (See [Proposal 311: Relay IPv6 Reachability] + and [Proposal 312: Relay Auto IPv6 Address].) + +1. Introduction + + Tor relays (and bridges) can accept IPv6 client connections via their + ORPort. But current versions of tor need to have an explicitly configured + IPv6 address (see [Proposal 312: Relay Auto IPv6 Address]), and they don't + perform IPv6 reachability self-checks (see + [Proposal 311: Relay IPv6 Reachability]). + + As we implement these new IPv6 features in tor, we want to monitor their + impact on the IPv6 connections and bandwidth in the tor network. + + Tor developers also need to know how many relays support these new IPv6 + features, so they can test tor's IPv6 reachability checks. (In particular, + see section 4.3.1 in [Proposal 311: Relay IPv6 Reachability]: Refusing to + Publish the Descriptor.) + +2. Scope + + This proposal modifies Tor's behaviour as follows: + + Relays, bridges, and directory authorities collect statistics on: + * IPv6 connections, and + * IPv6 consumed bandwidth. + The design of these statistics will be based on tor's existing connection + and consumed bandwidth statistics. + + Tor's existing consumed bandwidth statistics truncate their totals to the + nearest kilobyte. The existing connection statistics do not perform any + binning. + + We do not proposed to add any extra noise or binning to these statistics. + Instead, we expect to leave these changes until we have a consistent + privacy-preserving statistics framwework for tor. As an example of this + kind of framework, see + [Proposal 288: Privacy-Preserving Stats with Privcount (Shamir version)]. + + We avoid: + * splitting connection statistics into clients and relays, and + * collecting circuit statistics. + These statistics are more sensitive, so we want to implement + privacy-preserving statistics, before we consider adding them. + + Throughout this proposal, "relays" includes directory authorities, except + where they are specifically excluded. "relays" does not include bridges, + except where they are specifically included. (The first mention of "relays" + in each section should specifically exclude or include these other roles.) + + Tor clients do not collect any statistics for public reporting. Therefore, + clients are out of scope in this proposal. + + When this proposal describes Tor's current behaviour, it covers all + supported Tor versions (0.3.5.7 to 0.4.2.5), as of January 2020, except + where another version is specifically mentioned. + + This proposal also includes a medium-term monitoring script, which + calculates the number of relays in the consensus that support IPv6 extends, + and IPv6 client connections. + +3. Monitoring IPv6 Relays in the Consensus + + We propose writing a script that calculates: + * the number of relays, and + * the consensus weight fraction of relays, + in the consensus that: + * have an IPv6 ORPort, + * support IPv6 reachability checks, + * support IPv6 clients, and + * support IPv6 reachability checks, and IPv6 clients. + + In order to provide easy access to these statistics, we propose + that the script should: + * download a consensus (or read an existing consensus), and + * calculate and report these statistics. + + The following consensus weight fractions should divide by the total + consensus weight: + * have an IPv6 ORPort (all relays have an IPv4 ORPort), and + * support IPv6 reachability checks (all relays support IPv4 reachability). + + The following consensus weight fractions should divide by the + "usable Guard" consensus weight: + * support IPv6 clients, and + * support IPv6 reachability checks and IPv6 clients. + + "Usable Guards" have the Guard flag, but do not have the Exit flag. If the + Guard also has the BadExit flag, the Exit flag should be ignored. + + Note that this definition of "Usable Guards" is only valid when the + consensus contains many more guards than exits. That is, Wgd must be 0 in + the consensus. (See the [Tor Directory Protocol] for more details.) + + Therefore, the script should check that Wgd is 0. If it is not, the script + should log a warning about the accuracy of the "Usable Guard" statistics. + +4. Collecting IPv6 Consumed Bandwidth Statistics + + We propose that relays (and bridges) collect IPv6 consumed bandwidth + statistics. + + To minimise development and testing effort, we propose re-using the existing + "bw_array" code in rephist.c. + + In particular, tor currently counts these bandwidth statistics: + * read, + * write, + * dir_read, and + * dir_write. + + We propose adding the following bandwidth statistics: + * ipv6_read, and + * ipv6_write. + (The IPv4 statistics can be calculated by subtracting the IPv6 statistics + from the existing total consumed bandwidth statistics.) + + We believe that collecting IPv6 consumed bandwidth statistics is about as + safe as the existing IPv4+IPv6 total consumed bandwidth statistics. + + See also section 7.5, which adds a BandwidthStatistics torrc option and + consensus parameter. BandwidthStatistics is an optional change. + +5. Collecting IPv6 Connection Statistics + + We propose that relays (but not bridges) collect IPv6 connection statistics. + + Bridges refuse to collect the existing ConnDirectionStatistics, so we do not + believe it is safe to collect the smaller IPv6 totals on bridges. + + To minimise development and testing effort, we propose re-using the existing + "bidi" code in rephist.c. (This code may require some refactoring, because + the "bidi" totals are globals, rather than a struct.) + + In particular, tor currently counts these connection statistics: + * below threshold, + * mostly read, + * mostly written, and + * both read and written. + + We propose adding IPv6 variants of all these statistics. (The IPv4 + statistics can be calculated by subtracting the IPv6 statistics from the + existing total connection statistics.) + + See also section 7.6, which adds a ConnDirectionStatistics consensus + parameter. This consensus paramter is an optional change. + +6. Directory Protocol Specification Changes + + We propose adding IPv6 variants of the consumed bandwidth and connection + direction statistics to the tor directory protocol. + + We propose the following additions to the [Tor Directory Protocol] + specification, in section 2.1.2. Each addition should be inserted below the + existing consumed bandwidth and connection direction specifications. + + "ipv6-read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM... NL + [At most once] + "ipv6-write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM... NL + [At most once] + + Declare how much bandwidth the OR has used recently, on IPv6 + connections. See "read-history" and "write-history" for more details. + (The migration notes do not apply to IPv6.) + + "ipv6-conn-bi-direct" YYYY-MM-DD HH:MM:SS (NSEC s) BELOW,READ,WRITE,BOTH NL + [At most once] + + Number of IPv6 connections, that are used uni-directionally or + bi-directionally. See "conn-bi-direct" for more details. + + We also propose the following replacement, in the same section: + + "dirreq-read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM... NL + [At most once] + "dirreq-write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM... NL + [At most once] + + Declare how much bandwidth the OR has spent on answering directory + requests. See "read-history" and "write-history" for more details. + (The migration notes do not apply to dirreq.) + + This replacement is optional, but it may avoid the 3 *read-history + definitions getting out of sync. + +7. Optional Changes + + We propose some optional changes to help relay operators, tor developers, + and tor's network health. We also expect that these changes will drive IPv6 + relay adoption. + + Some of these changes may be more appropriate as future work, or along with + other proposed features. + +7.1. Log IPv6 Statistics in Tor's Heartbeat Logs + + We propose this optional change, so relay operators can see their own IPv6 + statistics: + + We propose that tor logs its IPv6 consumed bandwidth and connection + statistics in its regular "heartbeat" logs. + + These heartbeat statistics should be collected over the lifetime of the tor + process, rather than using the state file, like the statistics in sections + 4 and 5. + + Tor's existing heartbeat logs already show its consumed bandwidth and + connections (in the link protocol counts). + + We may also want to show IPv6 consumed bandwidth and connections as a + propotion of the total consumed bandwidth and connections. + + These statistics only show a relay's local bandwidth usage, so they can't + be used for reporting. + +7.2. Show IPv6 Relay Counts on Consensus Health + + The [Consensus Health] website displays a wide rage of tor statistics, + based on the most recent consensus. + + We propose this optional change, to: + * help tor developers improve IPv6 support on the tor network, + * help diagnose issues with IPv6 on the tor network, and + * drive IPv6 adoption on tor relays. + + Consensus Health adds an IPv6 section, with relays in the consensus that: + * have an IPv6 ORPort, and + * support IPv6 reachability checks. + + The definitions of these statistics are in section 3. + + These changes can be tested using the script proposed in section 3. + +7.3. Add an IPv6 Reachability Pseudo-Flag on Relay Search + + The [Relay Search] website displays tor relay information, based on the + current consensus and relay descriptors. + + We propose this optional change, to: + * help relay operators diagnose issues with IPv6 on their relays, and + * drive IPv6 adoption on tor relays. + + Relay Search adds a pseudo-flag for relay IPv6 reachability support. + + This pseudo-flag should be given to relays that have: + * a reachable IPv6 ORPort (in the consensus), and + * support tor subprotocol version "Relay=3" (or later). + See [Proposal 311: Relay IPv6 Reachability] for details. + + TODO: Is this a useful change? + Are there better ways of driving IPv6 adoption? + +7.4. Add IPv6 Connections and Consumed Bandwidth Graphs to Tor Metrics + + The [Tor Metrics: Traffic] website displays connection and bandwidth + information for the tor network, based on relay extra-info descriptors. + + We propose these optional changes, to: + * help tor developers improve IPv6 support on the tor network, + * help diagnose issues with IPv6 on the tor network, and + * drive IPv6 adoption on tor relays. + + Tor Metrics adds the following information to the graphs on the Traffic + page: + + Consumed Bandwidth by IP version + * added to the existing [Tor Metrics: Advertised bandwidth by IP version] + page + * as a stacked graph, like + [Tor Metrics: Advertised and consumed bandwidth by relay flags] + + Fraction of connections used uni-/bidirectionally by IP version + * added to the existing + [Tor Metrics: Fraction of connections used uni-/bidirectionally] page + * as a stacked graph, like + [Tor Metrics: Advertised and consumed bandwidth by relay flags] + +7.5. Add a BandwidthStatistics option + + We propose adding a new BandwidthStatistics torrc option and consensus + parameter, which activates reporting of all these statistics. Currently, + the existing statistics are controlled by ExtraInfoStatistics, but we + propose using the new BandwidthStatistics option for them as well. + + The default value of this option should be "auto", which checks the + consensus parameter. If there is no consensus parameter, the default should + be 1. (The existing bandwidth statistics are reported by default.) + +7.6. Add a ConnDirectionStatistics consensus parameter + + We propose using the existing ConnDirectionStatistics torrc option, and + adding a consensus parameter with the same name. This option will control + the new and existing connection statistics. + + The default value of this option should be "auto", which checks the + consensus parameter. If there is no consensus parameter, the default should + be 0. + + Bridges refuse to collect the existing ConnDirectionStatistics, so we do not + believe it is safe to collect the smaller IPv6 totals on bridges. The new + consensus parameter should also be ignored on bridges. + + The existing connection direction statistics are not reported by default, + but almost all relays actually report them. For more details, see: + [Ticket 33214: ConnDirectionStatistics is off by default, but most relays + report it]. + + If we fix the ConnDirectionStatistics default in Tor 0.4.4, we should also + implement the ConnDirectionStatistics consensus parameter. Then we can set + the consensus parameter to 1 for a week or two, so we can collect these + statistics. + +8. Test Plan + + We provide a quick summary of our testing plans. + +8.1. Testing IPv6 Relay Consensus Calculations + + We propose to test the IPv6 Relay consensus script using chutney networks. + However, chutney creates a limited number of relays, so we also need to + test these changes on consensuses from the public tor network. + + Some of these calculations are similar to the calculations that tor will do, + to find out if IPv6 reachability checks are reliable. So we may be able to + check the script against tor's reachability logs. (See section 4.3.1 in + [Proposal 311: Relay IPv6 Reachability]: Refusing to Publish the + Descriptor.) + + The Tor Metrics team may also independently check these calculations. + + Once the script is completed, its output will be monitored by tor + developers, as more volunteer relay operators deploy the relevant tor + versions. (And as the number of IPv6 relays in the consensus increases.) + +8.2. Testing IPv6 Extra-Info Statistics + + We propose to test the connection and consumed bandwidth statistics using + chutney networks. However, chutney runs for a short amount of time, and + creates a limited amount of traffic, so we also need to test these changes + on the public tor network. + + In particular, we have struggled to test statistics using chutney, because + tor's hard-coded statistics period is 24 hours. (And most chutney networks + run for under 1 minute.) + + Therefore, we propose to test these changes on the public network with a + small number of relays and bridges. + + During 2020, the Tor Metrics team will analyse these statistics on the + public tor network, and provide IPv6 progress reports. We expect that we may + discover some bugs during the first analysis. + + Once these changes are merged, they will be monitored by tor developers, as + more volunteer relay operators deploy the relevant tor versions. (And as the + number of IPv6 relays in the consensus increases.) + +References: + +[Consensus Health]: + https://consensus-health.torproject.org/ + +[Proposal 288: Privacy-Preserving Stats with Privcount (Shamir version)]: + https://gitweb.torproject.org/torspec.git/tree/proposals/288-privcount-with-shamir.txt + +[Proposal 311: Relay IPv6 Reachability]: + https://gitweb.torproject.org/torspec.git/tree/proposals/311-relay-ipv6-reachability.txt + +[Proposal 312: Relay Auto IPv6 Address]: + https://gitweb.torproject.org/torspec.git/tree/proposals/312-relay-auto-ipv6-addr.txt + +[Relay Search]: + https://metrics.torproject.org/rs.html + +[Ticket 33214: ConnDirectionStatistics is off by default, but most relays report it]: + https://trac.torproject.org/projects/tor/ticket/12377 + +[Tor Directory Protocol]: + (version 3) https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt + +[Tor Manual Page]: + https://2019.www.torproject.org/docs/tor-manual.html.en + +[Tor Metrics: Advertised and consumed bandwidth by relay flags]: + https://metrics.torproject.org/bandwidth-flags.html + +[Tor Metrics: Advertised bandwidth by IP version]: + https://metrics.torproject.org/advbw-ipv6.html + +[Tor Metrics: Fraction of connections used uni-/bidirectionally]: + https://metrics.torproject.org/connbidirect.html + +[Tor Metrics: Traffic]: + https://metrics.torproject.org/bandwidth-flags.html + +[Tor Specification]: + https://gitweb.torproject.org/torspec.git/tree/tor-spec.txt |