Diffstat (limited to 'proposals/313-relay-ipv6-stats.txt')
-rw-r--r-- proposals/313-relay-ipv6-stats.txt 416
1 files changed, 416 insertions, 0 deletions
diff --git a/proposals/313-relay-ipv6-stats.txt b/proposals/313-relay-ipv6-stats.txt
new file mode 100644
index 0000000..bd546b5
--- /dev/null
+++ b/proposals/313-relay-ipv6-stats.txt
@@ -0,0 +1,416 @@
+Filename: 313-relay-ipv6-stats.txt
+Title: Tor Relay IPv6 Statistics
+Author: teor, Karsten Loesing, Nick Mathewson
+Created: 10-February-2020
+Status: Draft
+Ticket: #33159
+
+0. Abstract
+
+ We propose that:
+ * tor relays should collect statistics on IPv6 connections, and
+ * tor relays and bridges should collect statistics on IPv6 consumed
+ bandwidth.
+ Like tor's existing connection and consumed bandwidth statistics, these new
+ IPv6 statistics will be published in each relay's extra-info descriptor.
+
+ We also plan to write a script that shows the number of relays in the
+ consensus that support:
+ * IPv6 extends, and
+ * IPv6 client connections.
+ This script will be used for medium-term monitoring, during the deployment
+ of tor's IPv6 changes in 2020. (See [Proposal 311: Relay IPv6 Reachability]
+ and [Proposal 312: Relay Auto IPv6 Address].)
+
+1. Introduction
+
+ Tor relays (and bridges) can accept IPv6 client connections via their
+ ORPort. But current versions of tor need to have an explicitly configured
+ IPv6 address (see [Proposal 312: Relay Auto IPv6 Address]), and they don't
+ perform IPv6 reachability self-checks (see
+ [Proposal 311: Relay IPv6 Reachability]).
+
+ As we implement these new IPv6 features in tor, we want to monitor their
+ impact on the IPv6 connections and bandwidth in the tor network.
+
+ Tor developers also need to know how many relays support these new IPv6
+ features, so they can test tor's IPv6 reachability checks. (In particular,
+ see section 4.3.1 in [Proposal 311: Relay IPv6 Reachability]: Refusing to
+ Publish the Descriptor.)
+
+2. Scope
+
+ This proposal modifies Tor's behaviour as follows:
+
+ Relays, bridges, and directory authorities collect statistics on:
+ * IPv6 connections (relays and directory authorities only; see section 5),
+ and
+ * IPv6 consumed bandwidth.
+ The design of these statistics will be based on tor's existing connection
+ and consumed bandwidth statistics.
+
+ Tor's existing consumed bandwidth statistics truncate their totals to the
+ nearest kilobyte. The existing connection statistics do not perform any
+ binning.
+
+ We do not propose to add any extra noise or binning to these statistics.
+ Instead, we expect to leave these changes until we have a consistent
+ privacy-preserving statistics framework for tor. As an example of this
+ kind of framework, see
+ [Proposal 288: Privacy-Preserving Stats with Privcount (Shamir version)].
+
+ We avoid:
+ * splitting connection statistics into clients and relays, and
+ * collecting circuit statistics.
+ These statistics are more sensitive, so we want to implement
+ privacy-preserving statistics before we consider adding them.
+
+ Throughout this proposal, "relays" includes directory authorities, except
+ where they are specifically excluded. "relays" does not include bridges,
+ except where they are specifically included. (The first mention of "relays"
+ in each section should specifically exclude or include these other roles.)
+
+ Tor clients do not collect any statistics for public reporting. Therefore,
+ clients are out of scope in this proposal.
+
+ When this proposal describes Tor's current behaviour, it covers all
+ supported Tor versions (0.3.5.7 to 0.4.2.5), as of January 2020, except
+ where another version is specifically mentioned.
+
+ This proposal also includes a medium-term monitoring script, which
+ calculates the number of relays in the consensus that support IPv6 extends,
+ and IPv6 client connections.
+
+3. Monitoring IPv6 Relays in the Consensus
+
+ We propose writing a script that calculates:
+ * the number of relays, and
+ * the consensus weight fraction of relays,
+ in the consensus that:
+ * have an IPv6 ORPort,
+ * support IPv6 reachability checks,
+ * support IPv6 clients, and
+ * support IPv6 reachability checks, and IPv6 clients.
+
+ In order to provide easy access to these statistics, we propose
+ that the script should:
+ * download a consensus (or read an existing consensus), and
+ * calculate and report these statistics.
+
+ The following consensus weight fractions should use the total consensus
+ weight as their denominator:
+ * have an IPv6 ORPort (all relays have an IPv4 ORPort), and
+ * support IPv6 reachability checks (all relays support IPv4 reachability).
+
+ The following consensus weight fractions should use the "usable Guard"
+ consensus weight as their denominator:
+ * support IPv6 clients, and
+ * support IPv6 reachability checks and IPv6 clients.
+
+ "Usable Guards" have the Guard flag, but do not have the Exit flag. If the
+ Guard also has the BadExit flag, the Exit flag should be ignored.
+
+ Note that this definition of "Usable Guards" is only valid when the
+ consensus contains many more guards than exits. That is, Wgd must be 0 in
+ the consensus. (See the [Tor Directory Protocol] for more details.)
+
+ Therefore, the script should check that Wgd is 0. If it is not, the script
+ should log a warning about the accuracy of the "Usable Guard" statistics.
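+
+ The calculations in this section can be sketched as follows. This is only
+ an illustration, not the proposed script: the Relay record, its field
+ names, and the summarise() function are hypothetical, and the sketch
+ assumes the consensus has already been parsed. It also assumes that a
+ relay "supports IPv6 clients" when it is a usable Guard with an IPv6
+ ORPort, which this proposal does not state explicitly.
+

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relay:
    """Hypothetical, already-parsed consensus entry (not a tor or stem type)."""
    weight: int               # consensus weight of this relay
    flags: frozenset          # for example {"Guard", "Running"}
    has_ipv6_orport: bool     # the consensus lists an IPv6 ORPort
    ipv6_reachability: bool   # assumed precomputed: supports IPv6 reachability checks


def is_usable_guard(relay):
    # Guard flag, no Exit flag. An Exit flag is ignored when BadExit is set.
    counts_as_exit = "Exit" in relay.flags and "BadExit" not in relay.flags
    return "Guard" in relay.flags and not counts_as_exit


def summarise(relays, wgd):
    """Return ({name: (relay count, consensus weight fraction)}, warnings)."""
    total_weight = sum(r.weight for r in relays)
    guards = [r for r in relays if is_usable_guard(r)]
    guard_weight = sum(r.weight for r in guards)

    def row(pool, pool_weight, predicate):
        chosen = [r for r in pool if predicate(r)]
        chosen_weight = sum(r.weight for r in chosen)
        fraction = chosen_weight / pool_weight if pool_weight else 0.0
        return len(chosen), fraction

    report = {
        # Denominator: total consensus weight.
        "ipv6_orport": row(relays, total_weight, lambda r: r.has_ipv6_orport),
        "ipv6_reachability": row(relays, total_weight,
                                 lambda r: r.ipv6_reachability),
        # Denominator: "usable Guard" consensus weight.
        "ipv6_clients": row(guards, guard_weight, lambda r: r.has_ipv6_orport),
        "ipv6_reachability_and_clients": row(
            guards, guard_weight,
            lambda r: r.has_ipv6_orport and r.ipv6_reachability),
    }

    warnings = []
    if wgd != 0:
        warnings.append(
            'Wgd is %d, not 0: "Usable Guard" statistics may be inaccurate' % wgd)
    return report, warnings
```

+
+ Keeping the denominator next to each statistic makes it harder to mix up
+ the total consensus weight and the "usable Guard" consensus weight.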
+
+4. Collecting IPv6 Consumed Bandwidth Statistics
+
+ We propose that relays (and bridges) collect IPv6 consumed bandwidth
+ statistics.
+
+ To minimise development and testing effort, we propose re-using the existing
+ "bw_array" code in rephist.c.
+
+ In particular, tor currently counts these bandwidth statistics:
+ * read,
+ * write,
+ * dir_read, and
+ * dir_write.
+
+ We propose adding the following bandwidth statistics:
+ * ipv6_read, and
+ * ipv6_write.
+ (The IPv4 statistics can be calculated by subtracting the IPv6 statistics
+ from the existing total consumed bandwidth statistics.)
+
+ We believe that collecting IPv6 consumed bandwidth statistics is about as
+ safe as the existing IPv4+IPv6 total consumed bandwidth statistics.
+
+ See also section 7.5, which adds a BandwidthStatistics torrc option and
+ consensus parameter. BandwidthStatistics is an optional change.
+
+5. Collecting IPv6 Connection Statistics
+
+ We propose that relays (but not bridges) collect IPv6 connection statistics.
+
+ Bridges refuse to collect the existing ConnDirectionStatistics, so we do not
+ believe it is safe to collect the smaller IPv6 totals on bridges.
+
+ To minimise development and testing effort, we propose re-using the existing
+ "bidi" code in rephist.c. (This code may require some refactoring, because
+ the "bidi" totals are globals, rather than a struct.)
+
+ In particular, tor currently counts these connection statistics:
+ * below threshold,
+ * mostly read,
+ * mostly written, and
+ * both read and written.
+
+ We propose adding IPv6 variants of all these statistics. (The IPv4
+ statistics can be calculated by subtracting the IPv6 statistics from the
+ existing total connection statistics.)
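+
+ One possible shape for the refactoring mentioned above is sketched below,
+ in Python rather than tor's C. Each set of four totals becomes one
+ struct-like object, and an IPv6 instance is kept next to the existing
+ total. The class names, method names, and the two constants are
+ assumptions made for this illustration; they are not taken from
+ rephist.c.
+

```python
from dataclasses import dataclass, astuple

# Illustrative values only (assumptions, not confirmed by this proposal).
BIDI_THRESHOLD = 20480   # bytes: less traffic than this is "below threshold"
BIDI_FACTOR = 10         # ratio at which one direction counts as dominant


@dataclass
class BidiCounts:
    """The four connection direction totals, grouped in one object."""
    below: int = 0
    read: int = 0
    write: int = 0
    both: int = 0

    def note(self, num_read, num_written):
        # Classify one observation of a connection into exactly one bucket.
        if num_read + num_written < BIDI_THRESHOLD:
            self.below += 1
        elif num_read >= num_written * BIDI_FACTOR:
            self.read += 1
        elif num_written >= num_read * BIDI_FACTOR:
            self.write += 1
        else:
            self.both += 1

    def line_value(self):
        # BELOW,READ,WRITE,BOTH order, as in "conn-bi-direct".
        return ",".join(str(n) for n in astuple(self))


class ConnStats:
    """Keeps the existing total and the proposed IPv6 subset side by side."""

    def __init__(self):
        self.total = BidiCounts()
        self.ipv6 = BidiCounts()

    def note_connection(self, num_read, num_written, is_ipv6):
        self.total.note(num_read, num_written)
        if is_ipv6:
            self.ipv6.note(num_read, num_written)

    def ipv4(self):
        # IPv4 is never stored: it is the total minus the IPv6 subset.
        pairs = zip(astuple(self.total), astuple(self.ipv6))
        return BidiCounts(*(t - s for t, s in pairs))
```

+
+ Because every IPv6 connection is also counted in the total, each IPv6
+ value is at most the matching total, and the derived IPv4 values are
+ never negative.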
+
+ See also section 7.6, which adds a ConnDirectionStatistics consensus
+ parameter. This consensus parameter is an optional change.
+
+6. Directory Protocol Specification Changes
+
+ We propose adding IPv6 variants of the consumed bandwidth and connection
+ direction statistics to the tor directory protocol.
+
+ We propose the following additions to the [Tor Directory Protocol]
+ specification, in section 2.1.2. Each addition should be inserted below the
+ existing consumed bandwidth and connection direction specifications.
+
+ "ipv6-read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM... NL
+ [At most once]
+ "ipv6-write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM... NL
+ [At most once]
+
+ Declare how much bandwidth the OR has used recently, on IPv6
+ connections. See "read-history" and "write-history" for more details.
+ (The migration notes do not apply to IPv6.)
+
+ "ipv6-conn-bi-direct" YYYY-MM-DD HH:MM:SS (NSEC s) BELOW,READ,WRITE,BOTH NL
+ [At most once]
+
+ Number of IPv6 connections that are used uni-directionally or
+ bi-directionally. See "conn-bi-direct" for more details.
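+
+ As an illustration of how a consumer of these lines might use them, the
+ following sketch parses a history line in the format above and derives
+ the IPv4 values by subtracting "ipv6-read-history" from "read-history"
+ (see section 4). The function names and the sample values in the usage
+ note are hypothetical.
+

```python
import re
from datetime import datetime

# KEYWORD YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM...
HISTORY_RE = re.compile(
    r"^(?P<keyword>[a-z0-9-]+) "
    r"(?P<end>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"\((?P<nsec>\d+) s\)"
    r"(?: (?P<values>\d+(?:,\d+)*))?$"
)


def parse_history(line):
    """Return (keyword, end of last interval, interval seconds, values)."""
    match = HISTORY_RE.match(line.strip())
    if match is None:
        raise ValueError("not a bandwidth history line: %r" % line)
    values = match["values"]
    return (
        match["keyword"],
        datetime.strptime(match["end"], "%Y-%m-%d %H:%M:%S"),
        int(match["nsec"]),
        [int(v) for v in values.split(",")] if values else [],
    )


def ipv4_history(total_line, ipv6_line):
    """Subtract an IPv6 history from the matching total history."""
    _, total_end, total_nsec, total = parse_history(total_line)
    _, ipv6_end, ipv6_nsec, ipv6 = parse_history(ipv6_line)
    # Only subtract histories that describe exactly the same intervals.
    if (total_end, total_nsec) != (ipv6_end, ipv6_nsec) or len(total) != len(ipv6):
        raise ValueError("histories do not cover the same intervals")
    return [t - s for t, s in zip(total, ipv6)]
```

+
+ For example, a total of "4096,2048,1024" and an IPv6 history of
+ "1024,2048,0" over the same intervals give IPv4 values of 3072, 0, and
+ 1024.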
+
+ We also propose the following replacement, in the same section:
+
+ "dirreq-read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM... NL
+ [At most once]
+ "dirreq-write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM... NL
+ [At most once]
+
+ Declare how much bandwidth the OR has spent on answering directory
+ requests. See "read-history" and "write-history" for more details.
+ (The migration notes do not apply to dirreq.)
+
+ This replacement is optional, but it may avoid the 3 *read-history
+ definitions getting out of sync.
+
+7. Optional Changes
+
+ We propose some optional changes to help relay operators, tor developers,
+ and tor's network health. We also expect that these changes will drive IPv6
+ relay adoption.
+
+ Some of these changes may be more appropriate as future work, or along with
+ other proposed features.
+
+7.1. Log IPv6 Statistics in Tor's Heartbeat Logs
+
+ We propose this optional change, so relay operators can see their own IPv6
+ statistics: tor should log its IPv6 consumed bandwidth and connection
+ statistics in its regular "heartbeat" logs.
+
+ These heartbeat statistics should be collected over the lifetime of the tor
+ process, rather than using the state file, like the statistics in sections
+ 4 and 5.
+
+ Tor's existing heartbeat logs already show its consumed bandwidth and
+ connections (in the link protocol counts).
+
+ We may also want to show IPv6 consumed bandwidth and connections as a
+ proportion of the total consumed bandwidth and connections.
+
+ These statistics only show a relay's local bandwidth usage, so they can't
+ be used for reporting.
+
+7.2. Show IPv6 Relay Counts on Consensus Health
+
+ The [Consensus Health] website displays a wide range of tor statistics,
+ based on the most recent consensus.
+
+ We propose this optional change, to:
+ * help tor developers improve IPv6 support on the tor network,
+ * help diagnose issues with IPv6 on the tor network, and
+ * drive IPv6 adoption on tor relays.
+
+ Consensus Health adds an IPv6 section, with relays in the consensus that:
+ * have an IPv6 ORPort, and
+ * support IPv6 reachability checks.
+
+ The definitions of these statistics are in section 3.
+
+ These changes can be tested using the script proposed in section 3.
+
+7.3. Add an IPv6 Reachability Pseudo-Flag on Relay Search
+
+ The [Relay Search] website displays tor relay information, based on the
+ current consensus and relay descriptors.
+
+ We propose this optional change, to:
+ * help relay operators diagnose issues with IPv6 on their relays, and
+ * drive IPv6 adoption on tor relays.
+
+ Relay Search adds a pseudo-flag for relay IPv6 reachability support.
+
+ This pseudo-flag should be given to relays that:
+ * have a reachable IPv6 ORPort (in the consensus), and
+ * support tor subprotocol version "Relay=3" (or later).
+ See [Proposal 311: Relay IPv6 Reachability] for details.
+
+ TODO: Is this a useful change?
+ Are there better ways of driving IPv6 adoption?
+
+7.4. Add IPv6 Connections and Consumed Bandwidth Graphs to Tor Metrics
+
+ The [Tor Metrics: Traffic] website displays connection and bandwidth
+ information for the tor network, based on relay extra-info descriptors.
+
+ We propose these optional changes, to:
+ * help tor developers improve IPv6 support on the tor network,
+ * help diagnose issues with IPv6 on the tor network, and
+ * drive IPv6 adoption on tor relays.
+
+ Tor Metrics adds the following information to the graphs on the Traffic
+ page:
+
+ Consumed Bandwidth by IP version
+ * added to the existing [Tor Metrics: Advertised bandwidth by IP version]
+ page
+ * as a stacked graph, like
+ [Tor Metrics: Advertised and consumed bandwidth by relay flags]
+
+ Fraction of connections used uni-/bidirectionally by IP version
+ * added to the existing
+ [Tor Metrics: Fraction of connections used uni-/bidirectionally] page
+ * as a stacked graph, like
+ [Tor Metrics: Advertised and consumed bandwidth by relay flags]
+
+7.5. Add a BandwidthStatistics option
+
+ We propose adding a new BandwidthStatistics torrc option and consensus
+ parameter, which activates reporting of all these statistics. Currently,
+ the existing statistics are controlled by ExtraInfoStatistics, but we
+ propose using the new BandwidthStatistics option for them as well.
+
+ The default value of this option should be "auto", which checks the
+ consensus parameter. If there is no consensus parameter, the default should
+ be 1. (The existing bandwidth statistics are reported by default.)
+
+7.6. Add a ConnDirectionStatistics consensus parameter
+
+ We propose using the existing ConnDirectionStatistics torrc option, and
+ adding a consensus parameter with the same name. This option will control
+ the new and existing connection statistics.
+
+ The default value of this option should be "auto", which checks the
+ consensus parameter. If there is no consensus parameter, the default should
+ be 0.
+
+ Bridges refuse to collect the existing ConnDirectionStatistics, so we do not
+ believe it is safe to collect the smaller IPv6 totals on bridges. The new
+ consensus parameter should also be ignored on bridges.
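+
+ The "auto" behaviour described in sections 7.5 and 7.6 can be sketched as
+ follows. This is an illustration under stated assumptions, not tor's
+ option-handling code: the function names are hypothetical, the torrc
+ value is modelled as one of the strings "0", "1", or "auto", and a
+ missing consensus parameter is modelled as None. The sketch also reads
+ "bridges refuse to collect" as meaning that bridges never collect
+ connection direction statistics, whatever the option is set to.
+

```python
def resolve_auto(torrc_value, consensus_param, default):
    """Resolve a "0" / "1" / "auto" option to a boolean.

    consensus_param is None when the consensus has no such parameter.
    """
    if torrc_value in ("0", "1"):
        # An explicit setting overrides the consensus.
        return torrc_value == "1"
    if torrc_value != "auto":
        raise ValueError("expected 0, 1, or auto, got %r" % torrc_value)
    if consensus_param is None:
        return bool(default)
    return consensus_param != 0


def bandwidth_stats_enabled(torrc_value, consensus_param):
    # Section 7.5: with no consensus parameter, the default is 1.
    return resolve_auto(torrc_value, consensus_param, default=1)


def conn_direction_stats_enabled(torrc_value, consensus_param, is_bridge):
    # Section 7.6: bridges ignore the option and the consensus parameter.
    if is_bridge:
        return False
    # With no consensus parameter, the default is 0.
    return resolve_auto(torrc_value, consensus_param, default=0)
```

+
+ Sharing one resolver keeps the two options consistent: they differ only
+ in their defaults and in how bridges are treated.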
+
+ The existing connection direction statistics are not reported by default,
+ but almost all relays actually report them. For more details, see:
+ [Ticket 33214: ConnDirectionStatistics is off by default, but most relays
+ report it].
+
+ If we fix the ConnDirectionStatistics default in Tor 0.4.4, we should also
+ implement the ConnDirectionStatistics consensus parameter. Then we can set
+ the consensus parameter to 1 for a week or two, so we can collect these
+ statistics.
+
+8. Test Plan
+
+ We provide a quick summary of our testing plans.
+
+8.1. Testing IPv6 Relay Consensus Calculations
+
+ We propose to test the IPv6 Relay consensus script using chutney networks.
+ However, chutney creates a limited number of relays, so we also need to
+ test these changes on consensuses from the public tor network.
+
+ Some of these calculations are similar to the calculations that tor will do,
+ to find out if IPv6 reachability checks are reliable. So we may be able to
+ check the script against tor's reachability logs. (See section 4.3.1 in
+ [Proposal 311: Relay IPv6 Reachability]: Refusing to Publish the
+ Descriptor.)
+
+ The Tor Metrics team may also independently check these calculations.
+
+ Once the script is completed, its output will be monitored by tor
+ developers, as more volunteer relay operators deploy the relevant tor
+ versions. (And as the number of IPv6 relays in the consensus increases.)
+
+8.2. Testing IPv6 Extra-Info Statistics
+
+ We propose to test the connection and consumed bandwidth statistics using
+ chutney networks. However, chutney runs for a short amount of time, and
+ creates a limited amount of traffic, so we also need to test these changes
+ on the public tor network.
+
+ In particular, we have struggled to test statistics using chutney, because
+ tor's hard-coded statistics period is 24 hours. (And most chutney networks
+ run for under 1 minute.)
+
+ Therefore, we propose to test these changes on the public network with a
+ small number of relays and bridges.
+
+ During 2020, the Tor Metrics team will analyse these statistics on the
+ public tor network, and provide IPv6 progress reports. We expect that we may
+ discover some bugs during the first analysis.
+
+ Once these changes are merged, they will be monitored by tor developers, as
+ more volunteer relay operators deploy the relevant tor versions. (And as the
+ number of IPv6 relays in the consensus increases.)
+
+References:
+
+[Consensus Health]:
+ https://consensus-health.torproject.org/
+
+[Proposal 288: Privacy-Preserving Stats with Privcount (Shamir version)]:
+ https://gitweb.torproject.org/torspec.git/tree/proposals/288-privcount-with-shamir.txt
+
+[Proposal 311: Relay IPv6 Reachability]:
+ https://gitweb.torproject.org/torspec.git/tree/proposals/311-relay-ipv6-reachability.txt
+
+[Proposal 312: Relay Auto IPv6 Address]:
+ https://gitweb.torproject.org/torspec.git/tree/proposals/312-relay-auto-ipv6-addr.txt
+
+[Relay Search]:
+ https://metrics.torproject.org/rs.html
+
+[Ticket 33214: ConnDirectionStatistics is off by default, but most relays report it]:
+ https://trac.torproject.org/projects/tor/ticket/33214
+
+[Tor Directory Protocol]:
+ (version 3) https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt
+
+[Tor Manual Page]:
+ https://2019.www.torproject.org/docs/tor-manual.html.en
+
+[Tor Metrics: Advertised and consumed bandwidth by relay flags]:
+ https://metrics.torproject.org/bandwidth-flags.html
+
+[Tor Metrics: Advertised bandwidth by IP version]:
+ https://metrics.torproject.org/advbw-ipv6.html
+
+[Tor Metrics: Fraction of connections used uni-/bidirectionally]:
+ https://metrics.torproject.org/connbidirect.html
+
+[Tor Metrics: Traffic]:
+ https://metrics.torproject.org/bandwidth-flags.html
+
+[Tor Specification]:
+ https://gitweb.torproject.org/torspec.git/tree/tor-spec.txt