-rw-r--r--  bandwidth-file-spec.txt                   129
-rw-r--r--  dir-spec.txt                               35
-rw-r--r--  param-spec.txt                             10
-rw-r--r--  proposals/000-index.txt                     2
-rw-r--r--  proposals/291-two-guard-nodes.txt           2
-rw-r--r--  proposals/324-rtt-congestion-control.txt  234
-rw-r--r--  proposals/329-traffic-splitting.txt       495
-rw-r--r--  proposals/342-decouple-hs-interval.md     107
-rw-r--r--  proposals/BY_INDEX.md                       1
-rw-r--r--  proposals/README.md                         1
-rw-r--r--  rend-spec-v3.txt                           15
-rw-r--r--  tor-spec.txt                                8
12 files changed, 658 insertions(+), 381 deletions(-)
diff --git a/bandwidth-file-spec.txt b/bandwidth-file-spec.txt
index d9f4db6..5ad946f 100644
--- a/bandwidth-file-spec.txt
+++ b/bandwidth-file-spec.txt
@@ -98,6 +98,7 @@ Table of Contents
Also adds Tor version.
1.5.0 - Removes "recent_measurement_attempt_count" KeyValue.
+ 1.6.0 - Adds congestion control stream events KeyValues.
All Tor versions can consume format version 1.0.0.
@@ -213,7 +214,7 @@ Table of Contents
It does not follow the KeyValue format for backwards compatibility
with version 1.0.0.
- "version=" version_number NL
+ "version" version_number NL
[In second position, zero or one time.]
@@ -225,7 +226,7 @@ Table of Contents
Version 1.0.0 documents do not contain this Line, and the
version_number is considered to be "1.0.0".
- "software=" Value NL
+ "software" Value NL
[Zero or one time.]
@@ -236,7 +237,7 @@ Table of Contents
Version 1.0.0 documents do not contain this Line, and the software
is considered to be "torflow".
- "software_version=" Value NL
+ "software_version" Value NL
[Zero or one time.]
@@ -246,7 +247,7 @@ Table of Contents
This Line was added in version 1.1.0 of this specification.
- "file_created=" DateTime NL
+ "file_created" DateTime NL
[Zero or one time.]
@@ -255,7 +256,7 @@ Table of Contents
This Line was added in version 1.1.0 of this specification.
- "generator_started=" DateTime NL
+ "generator_started" DateTime NL
[Zero or one time.]
@@ -264,7 +265,7 @@ Table of Contents
This Line was added in version 1.1.0 of this specification.
- "earliest_bandwidth=" DateTime NL
+ "earliest_bandwidth" DateTime NL
[Zero or one time.]
@@ -273,7 +274,7 @@ Table of Contents
This Line was added in version 1.1.0 of this specification.
- "latest_bandwidth=" DateTime NL
+ "latest_bandwidth" DateTime NL
[Zero or one time.]
@@ -287,7 +288,7 @@ Table of Contents
This Line was added in version 1.1.0 of this specification.
- "number_eligible_relays=" Int NL
+ "number_eligible_relays" Int NL
[Zero or one time.]
@@ -296,7 +297,7 @@ Table of Contents
This Line was added in version 1.2.0 of this specification.
- "minimum_percent_eligible_relays=" Int NL
+ "minimum_percent_eligible_relays" Int NL
[Zero or one time.]
@@ -317,7 +318,7 @@ Table of Contents
This Line was added in version 1.2.0 of this specification.
- "number_consensus_relays=" Int NL
+ "number_consensus_relays" Int NL
[Zero or one time.]
@@ -325,7 +326,7 @@ Table of Contents
This Line was added in version 1.2.0 of this specification.
- "percent_eligible_relays=" Int NL
+ "percent_eligible_relays" Int NL
[Zero or one time.]
@@ -338,7 +339,7 @@ Table of Contents
This Line was added in version 1.2.0 of this specification.
- "minimum_number_eligible_relays=" Int NL
+ "minimum_number_eligible_relays" Int NL
[Zero or one time.]
@@ -350,7 +351,7 @@ Table of Contents
This Line was added in version 1.2.0 of this specification.
- "scanner_country=" CountryCode NL
+ "scanner_country" CountryCode NL
[Zero or one time.]
@@ -358,7 +359,7 @@ Table of Contents
This Line was added in version 1.2.0 of this specification.
- "destinations_countries=" CountryCodeList NL
+ "destinations_countries" CountryCodeList NL
[Zero or one time.]
@@ -369,7 +370,7 @@ Table of Contents
This Line was added in version 1.2.0 of this specification.
- "recent_consensus_count=" Int NL
+ "recent_consensus_count" Int NL
    [Zero or one time.]
@@ -384,7 +385,7 @@ Table of Contents
This Line was added in version 1.4.0 of this specification.
- "recent_priority_list_count=" Int NL
+ "recent_priority_list_count" Int NL
[Zero or one time.]
@@ -401,7 +402,7 @@ Table of Contents
This Line was added in version 1.4.0 of this specification.
- "recent_priority_relay_count=" Int NL
+ "recent_priority_relay_count" Int NL
[Zero or one time.]
@@ -418,7 +419,7 @@ Table of Contents
This Line was added in version 1.4.0 of this specification.
- "recent_measurement_attempt_count=" Int NL
+ "recent_measurement_attempt_count" Int NL
[Zero or one time.]
@@ -434,7 +435,7 @@ Table of Contents
This Line was added in version 1.4.0 of this specification and removed
in version 1.5.0.
- "recent_measurement_failure_count=" Int NL
+ "recent_measurement_failure_count" Int NL
[Zero or one time.]
@@ -444,7 +445,7 @@ Table of Contents
This Line was added in version 1.4.0 of this specification.
- "recent_measurements_excluded_error_count=" Int NL
+ "recent_measurements_excluded_error_count" Int NL
[Zero or one time.]
@@ -455,7 +456,7 @@ Table of Contents
This Line was added in version 1.4.0 of this specification.
- "recent_measurements_excluded_near_count=" Int NL
+ "recent_measurements_excluded_near_count" Int NL
[Zero or one time.]
@@ -467,7 +468,7 @@ Table of Contents
This Line was added in version 1.4.0 of this specification.
- "recent_measurements_excluded_old_count=" Int NL
+ "recent_measurements_excluded_old_count" Int NL
[Zero or one time.]
@@ -481,7 +482,7 @@ Table of Contents
This Line was added in version 1.4.0 of this specification.
- "recent_measurements_excluded_few_count=" Int NL
+ "recent_measurements_excluded_few_count" Int NL
[Zero or one time.]
@@ -497,7 +498,7 @@ Table of Contents
This Line was added in version 1.4.0 of this specification.
- "time_to_report_half_network=" Int NL
+ "time_to_report_half_network" Int NL
[Zero or one time.]
@@ -509,7 +510,7 @@ Table of Contents
This Line was added in version 1.4.0 of this specification.
- "tor_version=" version_number NL
+ "tor_version" version_number NL
[Zero or one time.]
@@ -575,7 +576,7 @@ Table of Contents
Each RelayLine includes the following KeyValue pairs:
- "node_id=" hexdigest
+ "node_id" hexdigest
[Exactly once.]
@@ -592,7 +593,7 @@ Table of Contents
and master_key_ed25519. Parsers SHOULD accept Lines that contain at
least one of them.
- "master_key_ed25519=" MasterKey
+ "master_key_ed25519" MasterKey
[Zero or one time.]
@@ -604,7 +605,7 @@ Table of Contents
This KeyValue was added in version 1.1.0 of this specification.
- "bw=" Bandwidth
+ "bw" Bandwidth
[Exactly once.]
@@ -703,23 +704,23 @@ Table of Contents
sbws RelayLines contain these keys:
- "node_id=" hexdigest
+ "node_id" hexdigest
As above.
- "bw=" Bandwidth
+ "bw" Bandwidth
As above.
- "nick=" nickname
+ "nick" nickname
[Exactly once.]
The relay nickname.
- Torflow also has a "nick=" KeyValue.
+ Torflow also has a "nick" KeyValue.
- "rtt=" Int
+ "rtt" Int
[Zero or one time.]
@@ -728,7 +729,7 @@ Table of Contents
This KeyValue was added in version 1.1.0 of this specification.
It became optional in version 1.3.0 or 1.4.0 of this specification.
- "time=" DateTime
+ "time" DateTime
[Exactly once.]
@@ -736,9 +737,9 @@ Table of Contents
when the last bandwidth was obtained.
This KeyValue was added in version 1.1.0 of this specification.
- The Torflow equivalent is "measured_at=".
+ The Torflow equivalent is "measured_at".
- "success=" Int
+ "success" Int
[Zero or one time.]
@@ -747,7 +748,7 @@ Table of Contents
This KeyValue was added in version 1.1.0 of this specification.
- "error_circ=" Int
+ "error_circ" Int
[Zero or one time.]
@@ -755,9 +756,9 @@ Table of Contents
failed because of circuit failures.
This KeyValue was added in version 1.1.0 of this specification.
- The Torflow equivalent is "circ_fail=".
+ The Torflow equivalent is "circ_fail".
- "error_stream=" Int
+ "error_stream" Int
[Zero or one time.]
@@ -766,7 +767,7 @@ Table of Contents
This KeyValue was added in version 1.1.0 of this specification.
- "error_destination=" Int
+ "error_destination" Int
[Zero or one time.]
@@ -775,7 +776,7 @@ Table of Contents
This KeyValue was added in version 1.4.0 of this specification.
- "error_second_relay=" Int
+ "error_second_relay" Int
[Zero or one time.]
@@ -784,7 +785,7 @@ Table of Contents
This KeyValue was added in version 1.4.0 of this specification.
- "error_misc=" Int
+ "error_misc" Int
[Zero or one time.]
@@ -793,7 +794,7 @@ Table of Contents
This KeyValue was added in version 1.1.0 of this specification.
- "bw_mean=" Int
+ "bw_mean" Int
[Zero or one time.]
@@ -801,7 +802,7 @@ Table of Contents
This KeyValue was added in version 1.2.0 of this specification.
- "bw_median=" Int
+ "bw_median" Int
[Zero or one time.]
@@ -809,7 +810,7 @@ Table of Contents
This KeyValue was added in version 1.2.0 of this specification.
- "desc_bw_avg=" Int
+ "desc_bw_avg" Int
[Zero or one time.]
@@ -817,7 +818,7 @@ Table of Contents
This KeyValue was added in version 1.2.0 of this specification.
- "desc_bw_obs_last=" Int
+ "desc_bw_obs_last" Int
[Zero or one time.]
@@ -826,7 +827,7 @@ Table of Contents
This KeyValue was added in version 1.2.0 of this specification.
- "desc_bw_obs_mean=" Int
+ "desc_bw_obs_mean" Int
[Zero or one time.]
@@ -835,7 +836,7 @@ Table of Contents
This KeyValue was added in version 1.2.0 of this specification.
- "desc_bw_bur=" Int
+ "desc_bw_bur" Int
[Zero or one time.]
@@ -900,7 +901,7 @@ Table of Contents
This KeyValue was added in version 1.4.0 of this specification.
- "relay_recent_measurements_excluded_error_count=" Int
+ "relay_recent_measurements_excluded_error_count" Int
[Zero or one time.]
@@ -912,7 +913,7 @@ Table of Contents
This KeyValue was added in version 1.4.0 of this specification.
- "relay_recent_measurements_excluded_near_count=" Int
+ "relay_recent_measurements_excluded_near_count" Int
[Zero or one time.]
@@ -925,7 +926,7 @@ Table of Contents
This KeyValue was added in version 1.4.0 of this specification.
- "relay_recent_measurements_excluded_old_count=" Int
+ "relay_recent_measurements_excluded_old_count" Int
[Zero or one time.]
@@ -939,7 +940,7 @@ Table of Contents
This KeyValue was added in version 1.4.0 of this specification.
- "relay_recent_measurements_excluded_few_count=" Int
+ "relay_recent_measurements_excluded_few_count" Int
[Zero or one time.]
@@ -955,7 +956,7 @@ Table of Contents
This KeyValue was added in version 1.4.0 of this specification.
- "under_min_report=" bool
+ "under_min_report" bool
[Zero or one time.]
@@ -977,7 +978,7 @@ Table of Contents
This KeyValue was added in version 1.4.0 of this specification.
- "unmeasured=" bool
+ "unmeasured" bool
[Zero or one time.]
@@ -997,7 +998,7 @@ Table of Contents
This KeyValue was added in version 1.4.0 of this specification.
- "vote=" bool
+ "vote" bool
[Zero or one time.]
@@ -1019,6 +1020,24 @@ Table of Contents
This KeyValue was added in version 1.4.0 of this specification.
+ "xoff_recv" Int
+
+ [Zero or one time.]
+
+ The number of times this relay received `XOFF_RECV` stream events while
+ being measured in the last data_period days.
+
+ This KeyValue was added in version 1.6.0 of this specification.
+
+ "xoff_sent" Int
+
+ [Zero or one time.]
+
+ The number of times this relay received `XOFF_SENT` stream events while
+ being measured in the last data_period days.
+
+ This KeyValue was added in version 1.6.0 of this specification.
+
2.4.2.2. Torflow
Torflow RelayLines include node_id and bw, and other KeyValue pairs [2].
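Putting the header changes in this file together, a minimal Python sketch of
reading a bandwidth file's header Lines (assuming the usual layout of a Unix
timestamp line, "Key=Value" header Lines, and a "=====" terminator before the
relay Lines; the sample values are fabricated):

```python
# Hedged sketch, not the reference parser: header handling only, and the
# "=====" terminator is the common case rather than full Line handling.
def parse_header(lines):
    header = {}
    for line in lines[1:]:          # line 0 is the Unix timestamp
        if line.startswith("====="):
            break                   # end of header, relay Lines follow
        key, _, value = line.partition("=")
        header[key] = value
    return header

sample = [
    "1523911758",
    "version=1.6.0",
    "software=sbws",
    "earliest_bandwidth=2018-04-16T15:13:26",
    "=====",
    "node_id=$68A483E05A2ABDCA6DA5A3EF8DB5177638A27F80 bw=760",
]
hdr = parse_header(sample)
```

Note that the quoted key names above (e.g. "version", "software") still appear
on disk as "Key=Value" pairs; only the spec's quoting convention changed.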
diff --git a/dir-spec.txt b/dir-spec.txt
index 9ae9368..db64b5d 100644
--- a/dir-spec.txt
+++ b/dir-spec.txt
@@ -487,6 +487,12 @@ Table of Contents
accepted and ignored. Many of the nonterminals below are defined in
section 2.1.3.
+ Note that many versions of Tor will generate an extra newline at the
+ end of their descriptors. Implementations MUST tolerate one or
+ more blank lines at the end of a single descriptor or a list of
+ concatenated descriptors. New implementations SHOULD NOT generate
+ such blank lines.
+
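A sketch of the tolerance required above, splitting concatenated descriptors
while ignoring blank lines between and after them ("router " as the
start-of-descriptor marker comes from the grammar below; everything else here
is simplified and illustrative):

```python
# Hedged sketch: tolerate one or more blank lines around descriptors,
# per the MUST above. Not a full descriptor parser.
def split_descriptors(text):
    descs = []
    current = []
    for line in text.splitlines():
        if line.startswith("router ") and current:
            descs.append("\n".join(current))
            current = []
        if line.strip() or current:   # skip leading blanks only
            current.append(line)
    if current:
        descs.append("\n".join(current))
    return [d.rstrip("\n") for d in descs]
```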
"router" nickname address ORPort SOCKSPort DirPort NL
[At start, exactly once.]
@@ -2115,8 +2121,9 @@ Table of Contents
this value, and see section [CONS] for why we include old shared random
values in votes and consensus.
- Value is the actual shared random value encoded in base64. NumReveals
- is the number of commits used to generate this SRV.
+ Value is the actual shared random value encoded in base64. It will
+ be exactly 256 bits long. NumReveals is the number of commits used
+ to generate this SRV.
"shared-rand-current-value" SP NumReveals SP Value NL
@@ -2133,8 +2140,9 @@ Table of Contents
See section [SRCALC] of srv-spec.txt for instructions on how to compute
this value given the active commits.
- Value is the actual shared random value encoded in base64. NumReveals
- is the number of commits used to generate this SRV.
+ Value is the actual shared random value encoded in base64. It will
+ be exactly 256 bits long. NumReveals is the number of commits used to
+ generate this SRV.
"bandwidth-file-headers" SP KeyValues NL
@@ -2694,6 +2702,14 @@ Table of Contents
a router, the authorities produce a consensus containing a "w"
Bandwidth= keyword equal to the median of the Measured= votes.
+ As a special case, if the "w" line in a vote is about a relay with the
+ Authority flag, it should not include a Measured= keyword. The goal is
+ to leave such relays marked as Unmeasured, so they can reserve their
+ attention for authority-specific activities. "w" lines for votes about
+ authorities may include the bandwidth authority's measurement using
+ a different keyword, e.g. MeasuredButAuthority=, so it can still be
+ reported and recorded for posterity.
+
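The median rule and the authority special case above can be sketched as
follows (the choice of low median for even vote counts is an assumption,
not taken from this text):

```python
# Hedged sketch of the "w" Bandwidth= computation: median of Measured=
# votes, with authority relays left Unmeasured.
def consensus_bandwidth(measured_votes, is_authority):
    if is_authority or not measured_votes:
        return None                       # relay stays Unmeasured
    votes = sorted(measured_votes)
    return votes[(len(votes) - 1) // 2]   # low median (assumption)
```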
The ports listed in a "p" line should be taken as those ports for
which the router's exit policy permits 'most' addresses, ignoring any
accept not for all addresses, ignoring all rejects for private
@@ -4059,8 +4075,7 @@ B. General-use HTTP URLs
http://<hostname>/tor/keys/all.z
- The key certificate for this server (if it is an authority) should be
- available at:
+ The key certificate for this server should be available at:
http://<hostname>/tor/keys/authority.z
@@ -4117,10 +4132,10 @@ B. General-use HTTP URLs
http://<hostname>/tor/server/authority.z
- [Nothing in the Tor protocol uses this resource yet, but it is useful
- for debugging purposes. Also, the official Tor implementations
- (starting at 0.1.1.x) use this resource to test whether a server's
- own DirPort is reachable.]
+ This is used for authorities, and also if a server is configured
+ as a bridge. The official Tor implementations (starting at
+ 0.1.1.x) use this resource to test whether a server's own DirPort
+ is reachable. It is also useful for debugging purposes.
A concatenated set of the most recent descriptors for all known servers
should be available at:
diff --git a/param-spec.txt b/param-spec.txt
index 67c809d..a63ad3b 100644
--- a/param-spec.txt
+++ b/param-spec.txt
@@ -252,26 +252,34 @@ Table of Contents
         Minimum/maximum number of INTRODUCE2 cells allowed per circuit
before rotation (actual amount picked at random between these two
values).
+ Min: 0. Max: INT32_MAX. Defaults: 16384, 32768.
"hs_intro_min_lifetime", "hs_intro_max_lifetime" -- Minimum/maximum
lifetime in seconds that a service should keep an intro point for
(actual lifetime picked at random between these two values).
+ Min: 0. Max: INT32_MAX. Defaults: 18 hours, 24 hours.
"hs_intro_num_extra" -- Number of extra intro points a service is
allowed to open. This concept comes from proposal #155.
+ Min: 0. Max: 128. Default: 2.
- "hsdir_interval" -- The length of a time period. See
+ "hsdir_interval" -- The length of a time period, _in minutes_. See
rend-spec-v3.txt section [TIME-PERIODS].
+ Min: 30. Max: 14400. Default: 1440.
"hsdir_n_replicas" -- Number of HS descriptor replicas.
+ Min: 1. Max: 16. Default: 2.
"hsdir_spread_fetch" -- Total number of HSDirs per replica a tor
client should select to try to fetch a descriptor.
+ Min: 1. Max: 128. Default: 3.
"hsdir_spread_store" -- Total number of HSDirs per replica a service
will upload its descriptor to.
+ Min: 1. Max: 128. Default: 4.
"HSV3MaxDescriptorSize" -- Maximum descriptor size (in bytes).
+ Min: 1. Max: INT32_MAX. Default: 50000.
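The Min/Max/Default triples being added above imply a common lookup pattern:
take the consensus value if present, else the default, clamped to the range.
A sketch (the table mirrors a few entries from the text; the helper itself is
hypothetical, not C-Tor's API):

```python
# Hedged sketch of consensus-parameter lookup with clamping.
INT32_MAX = 2**31 - 1

PARAM_BOUNDS = {                       # (min, max, default), from the text
    "hsdir_interval":        (30, 14400, 1440),   # minutes
    "hsdir_n_replicas":      (1, 16, 2),
    "hsdir_spread_store":    (1, 128, 4),
    "HSV3MaxDescriptorSize": (1, INT32_MAX, 50000),
}

def param_value(name, consensus):
    lo, hi, default = PARAM_BOUNDS[name]
    return min(max(consensus.get(name, default), lo), hi)
```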
"hs_service_max_rdv_failures" -- This parameter determines the
      maximum number of rendezvous attempts an HS service can make per
diff --git a/proposals/000-index.txt b/proposals/000-index.txt
index 0f82741..a838b94 100644
--- a/proposals/000-index.txt
+++ b/proposals/000-index.txt
@@ -262,6 +262,7 @@ Proposals by number:
339 UDP traffic over Tor [ACCEPTED]
340 Packed and fragmented relay messages [OPEN]
341 A better algorithm for out-of-sockets eviction [OPEN]
+342 Decoupling hs_interval and SRV lifetime [DRAFT]
Proposals by status:
@@ -272,6 +273,7 @@ Proposals by status:
327 A First Take at PoW Over Introduction Circuits
329 Overcoming Tor's Bottlenecks with Traffic Splitting
331 Res tokens: Anonymous Credentials for Onion Service DoS Resilience
+ 342 Decoupling hs_interval and SRV lifetime
NEEDS-REVISION:
212 Increase Acceptable Consensus Age [for 0.2.4.x+]
219 Support for full DNS and DNSSEC resolution in Tor [for 0.2.5.x]
diff --git a/proposals/291-two-guard-nodes.txt b/proposals/291-two-guard-nodes.txt
index 43885e3..a9554c6 100644
--- a/proposals/291-two-guard-nodes.txt
+++ b/proposals/291-two-guard-nodes.txt
@@ -154,7 +154,7 @@ Status: Needs-Revision
However, while removing path restrictions will solve the immediate
problem, it will not address other instances where Tor temporarily opts
- use a second guard due to congestion, OOM, or failure of its primary
+ to use a second guard due to congestion, OOM, or failure of its primary
guard, and we're still running into bugs where this can be adversarially
controlled or just happen randomly[5].
diff --git a/proposals/324-rtt-congestion-control.txt b/proposals/324-rtt-congestion-control.txt
index c46fd4e..952d140 100644
--- a/proposals/324-rtt-congestion-control.txt
+++ b/proposals/324-rtt-congestion-control.txt
@@ -144,20 +144,22 @@ measured by the RTT estimator, and if these heuristics detect a stall or a jump,
we do not use that value to update RTT or BDP, nor do we update any congestion
control algorithm information that round.
-If the time delta is 0, that is always treated as a clock stall.
+If the time delta is 0, that is always treated as a clock stall, the RTT is
+not used, congestion control is not updated, and this fact is cached globally.
-If we have measured at least 'cc_bwe_min' RTT values or we have successfully
-exited slow start, then every sendme ACK, the new candidate RTT is compared to
-the stored EWMA RTT. If the new RTT is either 5000 times larger than the EWMA
-RTT, or 5000 times smaller than the stored EWMA RTT, then we do not record that
-estimate, and do not update BDP or the congestion control algorithms for that
-SENDME ack.
+If the circuit does not yet have an EWMA RTT or it is still in Slow Start, then
+no further checks are performed, and the RTT is used.
-Moreover, if a clock stall is detected by *any* circuit, this fact is
-cached, and this cached value is used on circuits for which we do not
-have enough data to compute the above heueristics. This cached value is
-also exported for use by the edge connection rate calculations done by
-[XON_ADVISORY].
+If the circuit has stored an EWMA RTT and has exited Slow Start, then every
+sendme ACK, the new candidate RTT is compared to the stored EWMA RTT. If the
+new RTT is 5000 times larger than the EWMA RTT, then the circuit does not
+record that estimate, and does not update BDP or the congestion control
+algorithms for that SENDME ack. If the new RTT is 5000 times smaller than the
+EWMA RTT, then the circuit uses the globally cached value from above (ie: it
+assumes the clock is stalled *only* if there was previously *also* a 0-delta RTT).
+
+If both ratio checks pass, the globally cached clock stall state is set to
+false (no stall), and the RTT value is used.
2.1.2. N_EWMA Smoothing [N_EWMA_SMOOTHING]
@@ -167,7 +169,11 @@ reduce the effects of packet jitter.
This smoothing is performed using N_EWMA[27], which is an Exponential
Moving Average with alpha = 2/(N+1):
- N_EWMA = BDP*2/(N+1) + N_EWMA_prev*(N-1)/(N+1).
+ N_EWMA = BDP*2/(N+1) + N_EWMA_prev*(N-1)/(N+1)
+ = (BDP*2 + N_EWMA_prev*(N-1))/(N+1).
+
+Note that the second rearranged form MUST be used in order to ensure that
+rounding errors are handled in the same manner as other implementations.
Flow control rate limiting uses this function.
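Under integer arithmetic the two algebraically equal forms above can differ,
which is why the rearranged form is a MUST. A small demonstration:

```python
# Hedged sketch: both forms of N_EWMA with alpha = 2/(N+1), using integer
# division to mimic a C implementation.
def n_ewma_int(bdp, prev, n):
    # rearranged form: (BDP*2 + N_EWMA_prev*(N-1)) / (N+1)
    return (bdp * 2 + prev * (n - 1)) // (n + 1)

def n_ewma_naive(bdp, prev, n):
    # first form, computed term by term; rounds each term separately
    return bdp * 2 // (n + 1) + prev * (n - 1) // (n + 1)
```

For example, with BDP=101, N_EWMA_prev=91, N=3 the rearranged form yields 96
while the term-by-term form yields 95, so implementations would diverge.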
@@ -389,12 +395,12 @@ Simplifying:
BDP = cwnd * RTT_min / RTT_current_ewma
-The RTT_min for this calculation comes from the minimum RTT_current_ewma
-seen in the lifetime of this circuit. If the congestion window falls to
-`cc_cwnd_min`, implementations MAY choose to reset RTT_min for use in this
-calculation to either the RTT_current_ewma, or a percentile-weighted average
-between RTT_min and RTT_current_ewma, specified by `cc_rtt_reset_pct`. This
-helps with escaping starvation conditions.
+The RTT_min for this calculation comes from the minimum RTT_current_ewma seen
+in the lifetime of this circuit. If the congestion window falls to
+`cc_cwnd_min` after slow start, implementations MAY choose to reset RTT_min
+for use in this calculation to either the RTT_current_ewma, or a
+percentile-weighted average between RTT_min and RTT_current_ewma, specified by
+`cc_rtt_reset_pct`. This helps with escaping starvation conditions.
The net effect of this estimation is to correct for any overshoot of
the cwnd over the actual BDP. It will obviously underestimate BDP if cwnd
@@ -538,7 +544,10 @@ original cwnd estimator. So while this capability to change the BDP estimator
remains in the C implementation, we do not expect it to be used.
However, it was useful to use a local OR connection block at the time of
-SENDME ack arrival, as an immediate congestion signal.
+SENDME ack arrival, as an immediate congestion signal. Note that in C-Tor,
+this orconn_block state is not derived from any socket info, but instead is a
+heuristic that declares an orconn as blocked if any circuit cell queue
+exceeds the 'cellq_high' consensus parameter.
(As an additional optimization, we could also use the ECN signal described in
ideas/xxx-backward-ecn.txt, but this is not implemented. It is likely only of
@@ -554,70 +563,132 @@ per the rules in RFC3742:
# Below the cap, we increment as per cc_cwnd_inc_pct_ss percent:
return round(cc_cwnd_inc_pct_ss*cc_sendme_inc/100)
else:
- # This returns an increment equivalent to RFC3742, rounded:
- # K = int(cwnd/(0.5 max_ssthresh));
- # inc = int(MSS/K);
- return round((cc_sendme_inc*cc_ss_cap_pathtype)/(2*cwnd));
+ # This returns an increment equivalent to RFC3742, rounded,
+ # with a minimum of inc=1.
+ # From RFC3742:
+ # K = int(cwnd/(0.5 max_ssthresh));
+ # inc = int(MSS/K);
+ return MAX(round((cc_sendme_inc*cc_ss_cap_pathtype)/(2*cwnd)), 1);
+
+During both Slow Start, and Steady State, if the congestion window is not full,
+we never increase the congestion window. We can still decrease it, or exit slow
+start, in this case. This is done to avoid causing overshoot. The original TCP
+Vegas addressed this problem by computing BDP and queue_use from inflight,
+instead of cwnd, but we found that approach to have significantly worse
+performance.
+
+Because C-Tor is single-threaded, multiple SENDME acks may arrive during one
+processing loop, before edge connections resume reading. For this reason,
+we provide two heuristics to provide some slack in determining the full
+condition. The first is to allow a gap between inflight and cwnd,
+parameterized as 'cc_cwnd_full_gap' multiples of 'cc_sendme_inc':
+ cwnd_is_full(cwnd, inflight):
+ if inflight + 'cc_cwnd_full_gap'*'cc_sendme_inc' >= cwnd:
+ return true
+ else
+ return false
+
+The second heuristic immediately resets the full state if it falls below
+'cc_cwnd_full_minpct' full:
+ cwnd_is_nonfull(cwnd, inflight):
+ if 100*inflight < 'cc_cwnd_full_minpct'*cwnd:
+ return true
+ else
+ return false
+
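The two helper predicates above translate directly, with the consensus
parameters passed in explicitly for illustration:

```python
# Hedged Python rendering of the full-window heuristics above; parameter
# values in the test are examples, not consensus defaults.
def cwnd_is_full(cwnd, inflight, cc_cwnd_full_gap, cc_sendme_inc):
    # allow a gap of cc_cwnd_full_gap * cc_sendme_inc cells
    return inflight + cc_cwnd_full_gap * cc_sendme_inc >= cwnd

def cwnd_is_nonfull(cwnd, inflight, cc_cwnd_full_minpct):
    # low watermark: below this percent of cwnd, immediately non-full
    return 100 * inflight < cc_cwnd_full_minpct * cwnd
```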
+This full status is cached once per cwnd if 'cc_cwnd_full_per_cwnd=1';
+otherwise it is cached once per cwnd update. These two helper functions
+determine the number of acks in each case:
+ SENDME_PER_CWND(cwnd):
+ return ((cwnd + 'cc_sendme_inc'/2)/'cc_sendme_inc')
+ CWND_UPDATE_RATE(cwnd, in_slow_start):
+ # In Slow Start, update every SENDME
+ if in_slow_start:
+ return 1
+      else: # Otherwise, update as per 'cc_cwnd_inc_rate' (31)
+ return ((cwnd + 'cc_cwnd_inc_rate'*'cc_sendme_inc'/2)
+ / ('cc_cwnd_inc_rate'*'cc_sendme_inc'));
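The ack-counting helpers above round to the nearest integer by adding half the
divisor before an integer division; a direct Python rendering:

```python
# Hedged sketch of SENDME_PER_CWND and CWND_UPDATE_RATE with explicit
# parameters; // mimics the integer division of a C implementation.
def sendme_per_cwnd(cwnd, cc_sendme_inc):
    return (cwnd + cc_sendme_inc // 2) // cc_sendme_inc

def cwnd_update_rate(cwnd, in_slow_start, cc_cwnd_inc_rate, cc_sendme_inc):
    if in_slow_start:
        return 1          # in Slow Start, update every SENDME
    step = cc_cwnd_inc_rate * cc_sendme_inc
    return (cwnd + step // 2) // step
```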
-After Slow Start, congestion signals from RTT, blocked OR connections, or ECN
-are processed only once per congestion window. This is achieved through the
-next_cc_event flag, which is initialized to a cwnd worth of SENDME acks, and
-is decremented each ack. Congestion signals are only evaluated when it reaches
-0.
+Shadow experimentation indicates that 'cc_cwnd_full_gap=2' and
+'cc_cwnd_full_per_cwnd=0' minimizes queue overshoot, whereas
+'cc_cwnd_full_per_cwnd=1' and 'cc_cwnd_full_gap=1' is slightly better
+for performance. Since there may be a difference between Shadow and live,
+we leave this parameterization in place.
Here is the complete pseudocode for TOR_VEGAS with RFC3742, which is run every
-time an endpoint receives a SENDME ack:
-
- # Update acked cells
- inflight -= cc_sendme_inc
+time an endpoint receives a SENDME ack. All variables are scoped to the
+circuit, unless prefixed by an underscore (local), or in single quotes
+(consensus parameters):
+ # Decrement counters that signal either an update or cwnd event
if next_cc_event:
next_cc_event--
+ if next_cwnd_event:
+ next_cwnd_event--
# Do not update anything if we detected a clock stall or jump,
# as per [CLOCK_HEURISTICS]
if clock_stalled_or_jumped:
+ inflight -= 'cc_sendme_inc'
return
if BDP > cwnd:
- queue_use = 0
+ _queue_use = 0
else:
- queue_use = cwnd - BDP
+ _queue_use = cwnd - BDP
+
+ if cwnd_is_full(cwnd, inflight):
+ cwnd_full = 1
+ else if cwnd_is_nonfull(cwnd, inflight):
+ cwnd_full = 0
if in_slow_start:
- if queue_use < cc_vegas_gamma and not orconn_blocked:
- inc = rfc3742_ss_inc(cwnd);
- cwnd += inc
- next_cc_event = 1
-
- # If the RFC3742 increment drops below steady-state increment
- # over a full cwnd worth of acks, exit slow start
- if inc*SENDME_PER_CWND(cwnd) <= cc_cwnd_inc:
- in_slow_start = 0
- next_cc_event = round(cwnd / (cc_cwnd_inc_rate * cc_sendme_inc))
- else:
+ if _queue_use < 'cc_vegas_gamma' and not orconn_blocked:
+ # Only increase cwnd if the cwnd is full
+ if cwnd_full:
+ _inc = rfc3742_ss_inc(cwnd);
+ cwnd += _inc
+
+ # If the RFC3742 increment drops below steady-state increment
+ # over a full cwnd worth of acks, exit slow start.
+ if _inc*SENDME_PER_CWND(cwnd) <= 'cc_cwnd_inc'*'cc_cwnd_inc_rate':
+ in_slow_start = 0
+ else: # Limit hit. Exit Slow start (even if cwnd not full)
in_slow_start = 0
- cwnd = BDP + cc_vegas_gamma
- next_cc_event = round(cwnd / (cc_cwnd_inc_rate * cc_sendme_inc))
+ cwnd = BDP + 'cc_vegas_gamma'
# Provide an emergency hard-max on slow start:
- if cwnd >= cc_ss_max:
- cwnd = cc_ss_max
+ if cwnd >= 'cc_ss_max':
+ cwnd = 'cc_ss_max'
in_slow_start = 0
- next_cc_event = round(cwnd / (cc_cwnd_inc_rate * cc_sendme_inc))
else if next_cc_event == 0:
- if queue_use > cc_vegas_delta:
- cwnd = BDP + cc_vegas_delta - cc_cwnd_inc
- elif queue_use > cc_vegas_beta or orconn_blocked:
- cwnd -= cc_cwnd_inc
- elif queue_use < cc_vegas_alpha:
- cwnd += cc_cwnd_inc
-
- cwnd = MAX(cwnd, cc_circwindow_min)
+ if _queue_use > 'cc_vegas_delta':
+ cwnd = BDP + 'cc_vegas_delta' - 'cc_cwnd_inc'
+ elif _queue_use > cc_vegas_beta or orconn_blocked:
+ cwnd -= 'cc_cwnd_inc'
+ elif cwnd_full and _queue_use < 'cc_vegas_alpha':
+ # Only increment if queue is low, *and* the cwnd is full
+ cwnd += 'cc_cwnd_inc'
+
+ cwnd = MAX(cwnd, 'cc_circwindow_min')
+
+ # Specify next cwnd and cc update
+ if next_cc_event == 0:
+ next_cc_event = CWND_UPDATE_RATE(cwnd)
+ if next_cwnd_event == 0:
+ next_cwnd_event = SENDME_PER_CWND(cwnd)
+
+ # Determine if we need to reset the cwnd_full state
+ # (Parameterized)
+ if 'cc_cwnd_full_per_cwnd' == 1:
+ if next_cwnd_event == SENDME_PER_CWND(cwnd):
+ cwnd_full = 0
+ else:
+ if next_cc_event == CWND_UPDATE_RATE(cwnd):
+ cwnd_full = 0
- # Count the number of sendme acks until next update of cwnd,
- # rounded to nearest integer
- next_cc_event = round(cwnd / (cc_cwnd_inc_rate * cc_sendme_inc))
+ # Update acked cells
+ inflight -= 'cc_sendme_inc'
3.4. Tor NOLA: Direct BDP tracker [TOR_NOLA]
@@ -747,6 +818,12 @@ struct xon_cell {
u32 kbps_ewma;
}
+Parties SHOULD treat XON or XOFF cells with unrecognized versions as a
+protocol violation.
+
+In `xon_cell`, a zero value for `kbps_ewma` means that the stream's rate is
+unlimited. Parties should therefore not send "0" to mean "do not send data".
+
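The `xon_cell` body above (u8 version followed by u32 kbps_ewma) is five
bytes; a packing sketch, assuming network byte order as elsewhere in the Tor
protocol (the version value here is illustrative):

```python
# Hedged sketch: wire encoding of the xon_cell body struct above.
import struct

def pack_xon_cell(version, kbps_ewma):
    return struct.pack("!BI", version, kbps_ewma)  # u8, u32, big-endian

body = pack_xon_cell(0, 0)   # kbps_ewma == 0 means "rate unlimited"
```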
4.1.1. XON/XOFF behavior
If the length of an edge outbuf queue exceeds the size provided in the
@@ -1473,6 +1550,39 @@ These are sorted in order of importance to tune, most important first.
The largest congestion window seen in Shadow is ~3000, so this was
set as a safety valve above that.
+ cc_cwnd_full_gap:
+ - Description: This parameter defines the integer number of
+ 'cc_sendme_inc' multiples of gap allowed between inflight and
+ cwnd, to still declare the cwnd full.
+ - Range: [0, INT32_MAX]
+ - Default: 1-2
+ - Shadow Tuning Results:
+ A value of 0 resulted in a slight loss of performance, and increased
+ variance in throughput. The optimal number here likely depends on
+ edgeconn inbuf size, edgeconn kernel buffer size, and eventloop
+ behavior.
+
+ cc_cwnd_full_minpct:
+   - Description: This parameter defines a low watermark in percent. If
+ inflight falls below this percent of cwnd, the congestion window
+ is immediately declared non-full.
+ - Range: [0, 100]
+ - Default: 75
+
+ cc_cwnd_full_per_cwnd:
+ - Description: This parameter governs how often a cwnd must be
+ full, in order to allow congestion window increase. If it is 1,
+ then the cwnd only needs to be full once per cwnd worth of acks.
+ If it is 0, then it must be full once every cwnd update (ie:
+ every SENDME).
+ - Range: [0, 1]
+ - Default: 1
+ - Shadow Tuning Results:
+ A value of 0 resulted in a slight loss of performance, and increased
+ variance in throughput. The optimal number here likely depends on
+ edgeconn inbuf size, edgeconn kernel buffer size, and eventloop
+ behavior.
+
6.5.4. NOLA Parameters
cc_nola_overshoot:
diff --git a/proposals/329-traffic-splitting.txt b/proposals/329-traffic-splitting.txt
index 85d704b..b9f5dce 100644
--- a/proposals/329-traffic-splitting.txt
+++ b/proposals/329-traffic-splitting.txt
@@ -19,11 +19,11 @@ Status: Draft
In order to understand our improvements to Conflux, it is important to
properly conceptualize what is involved in the design of multipath
algorithms in general.
-
+
The design space is broken into two orthogonal parts: congestion control
algorithms that apply to each path, and traffic scheduling algorithms
that decide which packets to send on each path.
-
+
MPTCP specifies 'coupled' congestion control (see [COUPLED]). Coupled
congestion control updates single-path congestion control algorithms to
account for shared bottlenecks between the paths, so that the combined
@@ -31,14 +31,14 @@ Status: Draft
happen to be shared between the multiple paths. Various ways of
accomplishing this have been proposed and implemented in the Linux
kernel.
-
+
Because Tor's congestion control only concerns itself with bottlenecks in
Tor relay queues, and not with any other bottlenecks (such as
intermediate Internet routers), we can avoid this complexity merely by
specifying that any paths that are constructed SHOULD NOT share any
relays. In this way, we can proceed to use the exact same congestion
control as specified in Proposal 324, for each path.
-
+
For this reason, this proposal will focus on the traffic scheduling
algorithms, rather than coupling. We propose three candidate algorithms
that have been studied in the literature, and will compare their
@@ -55,7 +55,7 @@ Status: Draft
(which will be available to us via Proposal 324). Additionally, since
the publication of [CONFLUX], more modern packet scheduling algorithms
have been developed, which aim to reduce out-of-order queue size.
-
+
We propose mitigations for these issues using modern scheduling
algorithms, as well as implementations options for avoiding the
out-of-order queue at Exit relays. Additionally, we consider resumption,
@@ -67,9 +67,9 @@ Status: Draft
The following section describes the Conflux design. Each sub-section is
a building block to the multipath design that Conflux proposes.
-
+
The circuit construction is as follow:
-
+
Primary Circuit (lower RTT)
+-------+ +--------+
|Guard 1|----->|Middle 1|----------+
@@ -81,7 +81,7 @@ Status: Draft
|Guard 2|----->|Middle 2|----------+
+-------+ +--------+
Secondary Circuit (higher RTT)
-
+
Both circuits are built using current Tor path selection, however they
SHOULD NOT share the same Guard relay, or middle relay. By avoiding
using the same relays in these positions in the path, we ensure
@@ -89,10 +89,10 @@ Status: Draft
'coupled' congestion control algorithms from the MPTCP
literature[COUPLED]. This both simplifies design, and improves
performance.
-
+
Then, the OP needs to link the two circuits together, as described in
[LINKING_CIRCUITS], [LINKING_EXIT], and [LINKING_SERVICE].
-
+
For ease of explanation, the primary circuit is the circuit with lower
RTT, and the secondary circuit is the circuit with higher RTT. Initial
RTT is measured during circuit linking, as described in
@@ -100,7 +100,7 @@ Status: Draft
in Proposal 324. This means that during use, the primary circuit and
secondary circuit may switch roles, depending on unrelated network
congestion caused by other Tor clients.
-
+
We also support linking onion service circuits together. In this case,
only two rendezvous circuits are linked. Each of these RP circuits will
be constructed separately, and then linked. However, the same path
@@ -109,31 +109,43 @@ Status: Draft
sharing some relays, this is not catastrophic. Multipath TCP researchers
we have consulted (see [ACKNOWLEDGEMENTS]), believe Tor's congestion
control from Proposal 324 to be sufficient in this rare case.
-
+
Only two circuits SHOULD be linked together. However, implementations
SHOULD make it easy for researchers to *test* more than two paths, as
this has been shown to assist in traffic analysis resistance[WTF_SPLIT].
At minimum, this means not hardcoding only two circuits in the
implementation.
-
+
If the number of circuits exceeds the current number of guard relays,
guard relays MAY be re-used, but implementations SHOULD use the same
number of Guards as paths.
-
+
Linked circuits MUST NOT be extended further once linked (ie:
'cannibalization' is not supported).
2.1. Advertising support for conflux
+2.1.1. Relay
+
We propose a new protocol version in order to advertise support for
circuit linking on the relay side:
-
- "Relay=4" -- Relay supports an 2 byte sequence number in a RELAY cell
- header used for multipath circuit which are linked with the
- new RELAY_CIRCUIT_LINK relay cell command.
-
- XXX: Advertise this in onion service descriptor.
- XXX: Onion service descriptor can advertise more than two circuits?
+
+ "Relay=5" -- Relay supports Conflux, i.e. linking circuits together
+              using the new LINK, LINKED, and SWITCH relay commands.
+
+2.1.2. Onion Service
+
+ We propose to add a new line in order to advertise conflux support in the
+ onion service descriptor:
+
+ "conflux" SP max-num-circ NL
+
+ The "max-num-circ" value indicates the maximum number of rendezvous
+ circuits that are allowed to be linked together.
+
+ XXX: We should let the service specify the conflux algorithm to use.
+ Some services may prefer latency (LowRTT), whereas some may prefer
+ throughput (BLEST).
The next section describes how the circuits are linked together.
@@ -144,199 +156,220 @@ Status: Draft
response. These commands create a 3way handshake, which allows each
endpoint to measure the initial RTT of each leg upon link, without
needing to wait for any data.
-
+
All three stages of this handshake are sent on *each* circuit leg to be
linked.
-
- To save round trips, these cells SHOULD be combined with the initial
- RELAY_BEGIN cell on the faster circuit leg, using Proposal 325. See
- [LINKING_EXIT] and [LINKING_SERVICE] for more details on setup in each
- case.
-
- There are other ways to do this linking that we have considered, but
- they seem not to be significantly better than this method, especially
- since we can use Proposal 325 to eliminate the RTT cost of this setup
- before sending data. For those other ideas, see [ALTERNATIVE_LINKING]
- and [ALTERNATIVE_RTT], in the appendix.
-
+
+ When packed cells are a reality (proposal 340), these cells SHOULD be
+ combined with the initial RELAY_BEGIN cell on the faster circuit leg. See
+ [LINKING_EXIT] and [LINKING_SERVICE] for more details on setup in each case.
+
+ There are other ways to do this linking that we have considered, but they
+ seem not to be significantly better than this method, especially since we can
+ use Proposal 340 to eliminate the RTT cost of this setup before sending data.
+ For those other ideas, see [ALTERNATIVE_LINKING] and [ALTERNATIVE_RTT], in
+ the appendix.
+
The first two parts of the handshake establish the link, and enable
resumption:
-
- 16 -- RELAY_CIRCUIT_LINK
-
- Sent from the OP to the exit/service in order to link
- circuits together at the end point.
-
- 17 -- RELAY_CIRCUIT_LINKED
-
- Sent from the exit/service to the OP, to confirm the circuits
- were linked.
-
+
+ 19 -- RELAY_CONFLUX_LINK
+
+ Sent from the OP to the exit/service in order to link circuits
+ together at the end point.
+
+ 20 -- RELAY_CONFLUX_LINKED
+
+ Sent from the exit/service to the OP, to confirm the circuits were
+ linked.
+
These cells have the following contents:
-
+
VERSION [1 byte]
PAYLOAD [variable, up to end of relay payload]
-
+
The VERSION tells us which circuit linking mechanism to use. At this
point in time, only 0x01 is recognized and is the one described by the
Conflux design.
-
+
For version 0x01, the PAYLOAD contains:
-
+
NONCE [32 bytes]
LAST_SEQNO_SENT [8 bytes]
LAST_SEQNO_RECV [8 bytes]
-
+ ALGORITHM [1 byte]
+
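+ As a non-normative illustration, the version 0x01 payload can be
+ serialized in network byte order. This Python sketch uses illustrative
+ names; only the field layout (VERSION, NONCE, LAST_SEQNO_SENT,
+ LAST_SEQNO_RECV, ALGORITHM) comes from this proposal:

```python
import os
import struct

def encode_link_payload(last_sent=0, last_recv=0, algorithm=0, nonce=None):
    # VERSION(1) | NONCE(32) | LAST_SEQNO_SENT(8) | LAST_SEQNO_RECV(8) | ALGORITHM(1)
    if nonce is None:
        nonce = os.urandom(32)   # random 256-bit link secret; never logged
    assert len(nonce) == 32
    return struct.pack("!B32sQQB", 0x01, nonce, last_sent, last_recv, algorithm)

payload = encode_link_payload()
assert len(payload) == 50        # 1 + 32 + 8 + 8 + 1 bytes
assert payload[0] == 0x01        # version byte
```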
XXX: Should we let endpoints specify their preferred [SCHEDULING] alg
here, to override consensus params? This has benefits: eg low-memory
mobile clients can ask for an alg that is better for their reorder
queues. But it also has complexity risk, if the other endpoint does not
want to support it, because of its own memory issues.
-
+ - YES. At least for Exit circuits, we *will* want to let clients
+ request LowRTT or BLEST/CWND scheduling. So we need an algorithm
+ field here.
+ - XXX: We need to define rules for negotiation then, for onions and
+ exits vs consensus.
+
The NONCE contains a random 256-bit secret, used to associate the two
circuits together. The nonce MUST NOT be shared outside of the circuit
transmission, or data may be injected into TCP streams. This means it
MUST NOT be logged to disk.
-
+
The two sequence number fields are 0 upon initial link, but non-zero in
the case of a resumption attempt (See [RESUMPTION]).
-
- If either circuit does not receive a RELAY_CIRCUIT_LINKED response, both
+
+ If either circuit does not receive a RELAY_CONFLUX_LINKED response, both
circuits MUST be closed.
-
+
The third stage of the handshake exists to help the exit/service measure
initial RTT, for use in [SCHEDULING]:
-
- 18 -- RELAY_CIRCUIT_LINKED_RTT_ACK
-
- Sent from the OP to the exit/service, to provide initial RTT
- measurement for the exit/service.
-
+
+ 21 -- RELAY_CONFLUX_LINKED_ACK
+
+ Sent from the OP to the exit/service, to provide initial RTT
+ measurement for the exit/service.
+
For timeout of the handshake, clients SHOULD use the normal SOCKS/stream
timeout already in use for RELAY_BEGIN.
-
- These three relay commands (RELAY_CIRCUIT_LINK, RELAY_CIRCUIT_LINKED,
- and RELAY_CIRCUIT_LINKED_RTT_ACK) are send on *each* leg, to allow each
- endpoint to measure the initial RTT of each leg.
+
+ These three relay commands are sent on *each* leg, to allow each endpoint to
+ measure the initial RTT of each leg.
+
+ The circuit SHOULD be closed if at least one of these conditions is met:
+
+ - Once a LINK is received, if the next relay cell command is not a
+ LINKED_ACK, unless the command is in a packed cell.
+ - Once a LINKED_ACK is received, receiving any other command than these:
+ * BEGIN, DATA, END, CONNECTED, RESOLVE, RESOLVED, XON, XOFF, SWITCH
+ - Receiving a LINKED without a LINK.
+ - Receiving a LINKED_ACK without having sent a LINKED.
+
+ XXX Must define our LINK rate limiting parameters.
2.2. Linking Circuits from OP to Exit [LINKING_EXIT]
To link exit circuits, two circuits to the same exit are built. The
client records the circuit build time of each.
-
- If the circuits are being built on-demand, for immediate use, the
- circuit with the lower build time SHOULD use Proposal 325 to append its
- first RELAY cell to the RELAY_COMMAND_LINK, on the circuit with the
- lower circuit build time. The exit MUST respond on this same leg. After
- that, actual RTT measurements MUST be used to determine future
- transmissions, as specified in [SCHEDULING].
-
- The RTT times between RELAY_COMMAND_LINK and RELAY_COMMAND_LINKED are
- measured by the client, to determine each circuit RTT to determine
- primary vs secondary circuit use, and for packet scheduling. Similarly,
- the exit measures the RTT times between RELAY_COMMAND_LINKED and
- RELAY_COMMAND_LINKED_RTT_ACK, for the same purpose.
-
+
+ If the circuits are being built on-demand, for immediate use, the circuit
+ with the lower build time SHOULD use Proposal 340 to append its first RELAY
+ cell to the RELAY_CONFLUX_LINK, on the circuit with the lower circuit build
+ time. The exit MUST respond on this same leg. After that, actual RTT
+ measurements MUST be used to determine future transmissions, as specified in
+ [SCHEDULING].
+
+ The RTT times between RELAY_CONFLUX_LINK and RELAY_CONFLUX_LINKED are
+ measured by the client, to determine each circuit's RTT for choosing
+ primary vs secondary circuit use, and for packet scheduling. Similarly,
+ the exit measures the RTT times between RELAY_CONFLUX_LINKED and
+ RELAY_CONFLUX_LINKED_ACK, for the same purpose.
+
2.3. Linking circuits to an onion service [LINKING_SERVICE]
-
+
For onion services, we will only concern ourselves with linking
rendezvous circuits.
-
+
To join rendezvous circuits, clients make two introduce requests to a
service's intropoint, causing it to create two rendezvous circuits, to
meet the client at two separate rendezvous points. These introduce
requests MUST be sent to the same intropoint (due to potential use of
onionbalance), and SHOULD be sent back-to-back on the same intro
- circuit. They MAY be combined with Proposal 325.
-
- The first rendezvous circuit to get joined SHOULD use Proposal 325 to
+ circuit. They MAY be combined with Proposal 340.
+
+ The first rendezvous circuit to get joined SHOULD use Proposal 340 to
append the RELAY_BEGIN command, and the service MUST answer on this
circuit, until RTT can be measured.
-
+
Once both circuits are linked and RTT is measured, packet scheduling
MUST be used, as per [SCHEDULING].
-
+
2.4. Congestion Control Application [CONGESTION_CONTROL]
-
+
The SENDMEs for congestion control are performed per-leg. As data
arrives, regardless of its ordering, it is counted towards SENDME
delivery. In this way, 'cwnd - package_window' of each leg always
reflects the available data to send on each leg. This is important for
[SCHEDULING].
-
+
+ The congestion control stream XON/XOFF can be sent on either leg, and
+ applies to the stream's transmission on both legs.
-
+
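+ A minimal sketch of this per-leg accounting (Python; the class and field
+ names are illustrative, with the SENDME increment of 31 taken from the
+ "cc_sendme_inc" parameter mentioned later in this proposal):

```python
class LegCC:
    """Per-leg SENDME accounting: arriving data counts toward SENDME
    delivery in arrival order, regardless of conflux reordering."""
    def __init__(self, cwnd, sendme_inc=31):
        self.cwnd = cwnd
        self.package_window = 0
        self.cells_since_sendme = 0
        self.sendme_inc = sendme_inc

    def can_send(self):
        # 'cwnd - package_window' is this leg's available send space.
        return self.cwnd - self.package_window > 0

    def on_cell_received(self):
        self.cells_since_sendme += 1
        if self.cells_since_sendme == self.sendme_inc:
            self.cells_since_sendme = 0
            return "SENDME"      # acks this leg only
        return None
```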
2.5. Sequencing [SEQUENCING]
-
+
With multiple paths for data, the problem of data re-ordering appears.
In other words, cells can arrive out of order from the two circuits
where cell N + 1 arrives before cell N.
-
+
Handling this reordering operates after congestion control for each
circuit leg, but before relay cell command processing or stream data
delivery.
-
+
For the receiver to be able to reorder the receiving cells, a sequencing
scheme needs to be implemented. However, because Tor does not drop or
reorder packets inside of a circuit, this sequence number can be very
small. It only has to signal that a cell comes after those arriving on
another circuit.
-
- To achieve this, we add a small sequence number to the common relay
- header for all relay cells on linked circuits. This sequence number is
- meant to signal the number of cells sent on the *other* leg, so that
- each endpoint knows how many cells are still in-flight on another leg.
- It is different from the absolute sequence number used in
- [LINKING_CIRCUITS] and [RESUMPTION], but can be derived from that
- number, using relative arithmetic.
-
- Relay command [1 byte]
- Recognized [2 bytes]
- StreamID [2 bytes]
- Digest [4 bytes]
- Length [2 bytes]
- > LongSeq [1 bit] # If this bit is set, use 31 bits for Seq
- > Sequencing [7 or 31 bits]
- Data [Remainder]
-
- The sequence number is only set for the first cell after the endpoint
- switches legs. In this case, LongSeq is set to 1, and the Sequencing
- field is 31 more bits. Otherwise it is a 1 byte 0 value.
-
- These fields MUST be present on ALL end-to-end relay cells on each leg
- that come from the endpoint, following a RELAY_CIRCUIT_LINK command.
-
- They are absent on 'leaky pipe' RELAY_COMMAND_DROP and
- RELAY_COMMAND_PADDING_NEGOTIATED cells that come from middle relays, as
- opposed to the endpoint, to support padding.
-
- When an endpoint switches legs, on the first cell in a new leg, LongSeq
- is set to 1, and the following 31 bits represent the *total* number of
- cells sent on the *other* leg, before the switch. The receiver MUST wait
- for that number of cells to arrive from the previous leg before
- delivering that cell.
-
- XXX: In the rare event that we send more than 2^31 cells (~1TB) on a
- single leg, do we force a switch of legs, or expand the field further?
-
- An alternative method of sequencing, that assumes that the endpoint
- knows when it is going to switch, the cell before it switches, is
- specified in [ALTERNATIVE_SEQUENCING]. Note that that method requires
- only 1 byte for sequence number and switch signaling, but requires that
- the sender know that it is planning to switch, the cell before it
- switches. (This is possible with [BLEST_TOR], but [LOWRTT_TOR] can
- switch based on RTT change, so it may be one cell late in that case).
+
+ To achieve this, we propose a new relay command used to indicate a switch to
+ another leg:
+
+ 22 -- RELAY_CONFLUX_SWITCH
+
+ Sent from the client to the exit/service when switching legs in an
+ already linked circuit construction.
+
+ The cell payload format is:
+
+ SeqNum [4 bytes]
+
+ The "SeqNum" value is a relative sequence number, which is the difference
+ between the last absolute sequence number sent on the new leg and the last
+ absolute sequence number sent on all other legs prior to the switch. In this
+ way, the endpoint knows what to increment its local absolute sequence number
+ by, before cells start to arrive.
+
+ To achieve this, the sender must maintain the last absolute sequence number
+ sent for each leg, and the receiver must maintain the last absolute sequence
+ number received for each leg.
+
+ As an example, let's say we send 10 cells on the first leg, so our absolute
+ sequence number is 10. If we then switch to the second leg, it is trivial to
+ see that we should send a SWITCH with 10 as the relative sequence number, to
+ indicate that regardless of the order in which the first cells are received,
+ subsequent cells on the second leg should start counting at 10.
+
+ However, if we then send 21 cells on this leg, our local absolute sequence
+ number as the sender is 31. So when we switch back to the first leg, where
+ the last absolute sequence sent was 10, we must send a SWITCH cell with 21,
+ so that when the first leg receives subsequent cells, it assigns those cells
+ an absolute sequence number starting at 31.
+
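+ The bookkeeping above can be sketched as follows (Python; names are
+ illustrative). Here switch_to() returns the relative SeqNum to place in
+ the SWITCH cell:

```python
class ConfluxSender:
    """Per-leg sequence bookkeeping for RELAY_CONFLUX_SWITCH."""
    def __init__(self, num_legs=2):
        self.abs_sent = 0                 # total cells sent across all legs
        self.leg_last = [0] * num_legs    # last absolute seqno sent per leg
        self.cur = 0

    def send_cells(self, n):
        self.abs_sent += n
        self.leg_last[self.cur] = self.abs_sent

    def switch_to(self, leg):
        # SWITCH carries the gap between the global counter and what the
        # new leg last saw, so the receiver knows what to increment by.
        rel = self.abs_sent - self.leg_last[leg]
        self.cur = leg
        return rel

s = ConfluxSender()
s.send_cells(10)             # 10 cells on the first leg
assert s.switch_to(1) == 10  # SWITCH to second leg carries 10
s.send_cells(21)             # absolute counter is now 31
assert s.switch_to(0) == 21  # SWITCH back carries 31 - 10 = 21
```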
+ In the rare event that we send more than 2^31 cells (~1TB) on a single leg,
+ the leg should be switched in order to reset that relative sequence number to
+ fit within 4 bytes.
+
+ In order to rate limit the use of SWITCH to prevent its use as a DropMark
+ side channel, the circuit SHOULD be closed if at least one of these
+ conditions is met:
+
+ - The SeqNum value is below the "cc_sendme_inc" which is currently set
+ at 31.
+ - If immediately after receiving a SWITCH, another one is received.
+ - If we are NOT an exit circuit.
+ - If the SeqNum makes our absolute sequence number overflow.
+
+ XXX: We should define our rate limiting.
2.6. Resumption [RESUMPTION]
In the event that a circuit leg is destroyed, it MAY be resumed.
-
+
Resumption is achieved by re-using the NONCE to the same endpoint
(either [LINKING_EXIT] or [LINKING_SERVICE]). The resumed path need
not use the same middle and guard relays as the destroyed leg(s), but
SHOULD NOT share any relays with any existing leg(s).
-
+
To provide resumption, endpoints store an absolute 64bit cell counter of
the last cell they have sent on a conflux pair (their LAST_SEQNO_SENT),
as well the last sequence number they have delivered in-order to edge
@@ -345,14 +378,14 @@ Status: Draft
inflight cells (ie the 'package_window' from proposal 324), for each
leg, along with information corresponding to those cells' absolute
sequence numbers.
-
+
These 64 bit absolute counters can wrap without issue, as congestion
windows will never grow to 2^64 cells until well past the Singularity.
However, it is possible that extremely long, bulk circuits could exceed
2^64 total sent or received cells, so endpoints SHOULD handle wrapped
sequence numbers for purposes of computing retransmit information. (But
even this case is unlikely to happen within the next decade or so).
-
+
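+ Wrap-tolerant comparison can use modulo-2^64 arithmetic, as sketched
+ here (Python; valid as long as the true gap is below 2^63):

```python
MASK = (1 << 64) - 1

def seqno_diff(newer, older):
    """How many cells 'newer' is ahead of 'older', tolerating a wrapped
    64-bit counter."""
    return (newer - older) & MASK

# Normal case: 150 sent, peer last received 100 -> 50 cells to retransmit.
assert seqno_diff(150, 100) == 50
# Wrapped case: counter passed 2^64 and restarted near zero.
assert seqno_diff(5, MASK - 9) == 15
```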
Upon resumption, the LAST_SEQNO_SENT and LAST_SEQNO_RECV fields are used
to convey the sequence numbers of the last cell the relay sent and
received on that leg. The other endpoint can use these sequence numbers
@@ -360,27 +393,27 @@ Status: Draft
since that point, up to and including this absolute sequence number. If
LAST_SEQNO_SENT has not been received, the endpoint MAY transmit the
missing data, if it still has it buffered.
-
+
Because both endpoints get information about the other side's absolute
SENT sequence number, they will know exactly how many re-transmitted
packets to expect, if the circuit is successfully resumed.
Re-transmitters MUST NOT re-increment their absolute sent fields
while re-transmitting.
-
+
If it does not have this missing data due to memory pressure, that
endpoint MUST destroy *both* legs, as this represents unrecoverable
data loss.
-
+
Otherwise, the new circuit can be re-joined, and its RTT can be compared
to the remaining circuit to determine if the new leg is primary or
secondary.
-
+
It is even possible to resume conflux circuits where both legs have been
collapsed using this scheme, if endpoints continue to buffer their
unacked package_window data for some time after this close. However, see
[TRAFFIC_ANALYSIS] for more details on the full scope of this issue.
-
+
If endpoints are buffering package_window data, such data should be
given priority to be freed in any oomkiller invocation. See [MEMORY_DOS]
for more oomkiller information.
@@ -393,14 +426,14 @@ Status: Draft
will have accurate information on the instantaneous available bandwidth
of each circuit leg, as 'cwnd - package_window' (see Section 3 of
Proposal 324).
-
+
Some additional RTT optimizations are also useful, to improve
responsiveness and minimize out-of-order queue sizes.
-
+
We specify two traffic schedulers from the multipath literature and
adapt them to Tor: [LOWRTT_TOR] and [BLEST_TOR]. [LOWRTT_TOR] also has
three variants, with different trade offs.
-
+
However, see the [TRAFFIC_ANALYSIS] sections of this proposal for
important details on how this selection can be changed, to reduce
website traffic fingerprinting.
@@ -409,21 +442,21 @@ Status: Draft
This scheduling algorithm is based on the original [CONFLUX] paper, with
ideas from [MPTCP]'s minRTT/LowRTT scheduler.
-
+
In this algorithm, endpoints send cells on the circuit with lower RTT
(primary circuit). This continues while the congestion window on the
circuit has available room: ie whenever cwnd - package_window > 0.
-
+
Whenever the primary circuit's congestion window becomes full, the
secondary circuit is used. We stop reading on the send window source
(edge connection) when both congestion windows become full.
-
+
In this way, unlike original conflux, we switch to the secondary circuit
without causing congestion on the primary circuit. This improves both
load times, and overall throughput.
-
+
This behavior matches minRTT from [MPTCP], sometimes called LowRTT.
-
+
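+ A sketch of this leg-selection rule (Python; the leg objects are
+ illustrative dicts, not Tor's actual data structures):

```python
def pick_leg_lowrtt(legs):
    """Return the lowest-RTT leg with congestion window space, else None
    (meaning: stop reading the edge connection)."""
    open_legs = [l for l in legs if l["cwnd"] - l["package_window"] > 0]
    if not open_legs:
        return None
    return min(open_legs, key=lambda l: l["rtt"])

primary = {"rtt": 50, "cwnd": 100, "package_window": 100}   # window full
secondary = {"rtt": 80, "cwnd": 100, "package_window": 40}
assert pick_leg_lowrtt([primary, secondary]) is secondary
```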
It may be better to stop reading on the edge connection when the primary
congestion window becomes full, rather than switch to the secondary
circuit as soon as the primary congestion window becomes full. (Ie: only
@@ -432,7 +465,7 @@ Status: Draft
causes us to optimize for responsiveness and congestion avoidance,
rather than throughput. For evaluation, we will control this switching
behavior with a consensus parameter (see [CONSENSUS_PARAMETERS]).
-
+
Because of potential side channel risk (see [SIDE_CHANNELS]), a third
variant of this algorithm, where the primary circuit is chosen during
the [LINKING_CIRCUITS] handshake and never changed, is also possible
@@ -444,25 +477,25 @@ Status: Draft
use this information to reorder transmitted data, to minimize
head-of-line blocking in the recipient (and thus minimize out-of-order
queues there).
-
+
BLEST_TOR uses the primary circuit until the congestion window is full.
Then, it uses the relative RTT times of the two circuits to calculate
how much data can be sent on the secondary circuit faster than if we
- just waited for the primary circuit to become available.
-
- This is achieved by computing two variables at the sender:
-
+ just waited for the primary circuit to become available.
+
+ This is achieved by computing two variables at the sender:
+
rtts = secondary.currRTT / primary.currRTT
primary_limit = (primary.cwnd + (rtts-1)/2)*rtts
-
+
Note: This (rtts-1)/2 factor represents anticipated congestion window
growth over this period. It may be different for Tor, depending on the
CC alg.
-
+
If primary_limit < secondary.cwnd - (secondary.package_window + 1), then
there is enough space on the secondary circuit to send data faster than
we could by waiting for the primary circuit.
-
+
XXX: Note that BLEST uses total_send_window where we use secondary.cwnd
in this check. total_send_window is min(recv_win, CWND). But since Tor
does not use receive windows and instead uses stream XON/XOFF, we only
@@ -472,29 +505,33 @@ Status: Draft
hopefully this is fine. If we need to, we could turn [REORDER_SIGNALING]
into a receive window indication of some kind, to indicate remaining
buffer size.
-
+
Otherwise, if the primary_limit condition is not hit, cease reading on
source edge connections until SENDME acks come back.
-
+
Here is the pseudocode for this:
-
+
while source.has_data_to_send():
if primary.cwnd > primary.package_window:
primary.send(source.get_packet())
continue
-
+
rtts = secondary.currRTT / primary.currRTT
primary_limit = (primary.cwnd + (rtts-1)/2)*rtts
-
+
if primary_limit < secondary.cwnd - (secondary.package_window+1):
secondary.send(source.get_packet())
else:
break # done for now, wait for SENDME to free up CWND and restart
-
+
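+ A runnable version of the pseudocode above, for a single scheduling
+ decision (Python; the leg objects are illustrative dicts):

```python
def blest_pick_leg(primary, secondary):
    # Prefer the primary leg while its congestion window has room.
    if primary["cwnd"] > primary["package_window"]:
        return primary
    # Otherwise, check whether the secondary leg can deliver data sooner
    # than waiting for the primary's window to reopen.
    rtts = secondary["currRTT"] / primary["currRTT"]
    primary_limit = (primary["cwnd"] + (rtts - 1) / 2) * rtts
    if primary_limit < secondary["cwnd"] - (secondary["package_window"] + 1):
        return secondary
    return None  # wait for a SENDME to free congestion window space

primary = {"cwnd": 10, "package_window": 10, "currRTT": 50}
secondary = {"cwnd": 40, "package_window": 5, "currRTT": 100}
# rtts = 2.0, primary_limit = (10 + 0.5) * 2 = 21.0 < 40 - 6 = 34
assert blest_pick_leg(primary, secondary) is secondary
```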
Note that BLEST also has a parameter lambda that is updated whenever HoL
blocking occurs. Because it is expensive and takes significant time to
signal this over Tor, we omit this.
-
+
+ XXX: We may want a third algorithm that only uses cwnd, for comparison.
+ The above algorithm may have issues if the primary cwnd grows while the
+ secondary does not. Expect this section to change.
+
XXX: See [REORDER_SIGNALING] section if we want this lambda feedback.
3.3. Reorder queue signaling [REORDER_SIGNALING]
@@ -502,7 +539,7 @@ Status: Draft
Reordering is a fairly simple task. By using the sequence
number field in [SEQUENCING], endpoints can know how many cells are
still in flight on the other leg.
-
+
To reorder them properly, a buffer of out of order cells needs to be
kept. On the Exit side, this can quickly become overwhelming
considering tens of thousands of possible circuits can be held open
@@ -512,26 +549,26 @@ Status: Draft
Luckily, [BLEST_TOR] and the form of [LOWRTT_TOR] that only uses the
primary circuit will minimize or eliminate this out-of-order buffer.
-
+
XXX: The remainder of this section may be over-complicating things... We
only need these concepts if we want to use BLEST's lambda feedback. Though
turning this into some kind of receive window that indicates remaining
reorder buffer size may also help with the total_send_window also noted
in BLEST_TOR.
-
+
The default for this queue size is governed by the 'cflx_reorder_client'
and 'cflx_reorder_srv' consensus parameters (see [CONSENSUS_PARAMS]).
'cflx_reorder_srv' applies to Exits and onion services. Both parameters
can be overridden by Torrc, to larger or smaller than the consensus
parameter. (Low memory clients may want to lower it; SecureDrop onion
services or other high-upload services may want to raise it).
-
+
When the reorder queue hits this size, a RELAY_CONFLUX_XOFF is sent down
the circuit leg that has data waiting in the queue and use of that leg
SHOULD cease, until it drains to half of this value, at which point an
RELAY_CONFLUX_XON is sent. Note that this is different than the stream
XON/XOFF from Proposal 324.
-
+
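+ A sketch of this hysteresis (Python; illustrative names -- the actual
+ limit would come from the cflx_reorder_* consensus parameters):

```python
class ReorderQueue:
    """Out-of-order buffer with XOFF/XON hysteresis: signal XOFF at the
    limit, and XON once drained to half of it."""
    def __init__(self, limit):
        self.limit = limit
        self.cells = {}          # absolute seqno -> cell
        self.xoff_sent = False

    def queue(self, seqno, cell):
        self.cells[seqno] = cell
        if not self.xoff_sent and len(self.cells) >= self.limit:
            self.xoff_sent = True
            return "RELAY_CONFLUX_XOFF"
        return None

    def deliver(self, next_seqno):
        """Pop the in-order run starting at next_seqno; maybe signal XON."""
        out = []
        while next_seqno in self.cells:
            out.append(self.cells.pop(next_seqno))
            next_seqno += 1
        signal = None
        if self.xoff_sent and len(self.cells) <= self.limit // 2:
            self.xoff_sent = False
            signal = "RELAY_CONFLUX_XON"
        return out, signal
```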
XXX: [BLEST] actually does not cease use of a path in this case, but
instead uses this signal to adjust the lambda parameter, which biases
traffic away from that leg.
@@ -543,38 +580,38 @@ Status: Draft
Both reorder queues and retransmit buffers inherently represent a memory
denial of service condition.
-
+
For [RESUMPTION] retransmit buffers, endpoints that support this feature
SHOULD free retransmit information as soon as they get close to memory
pressure. This prevents resumption while data is in flight, but will not
otherwise harm operation.
-
+
For reorder buffers, adversaries can potentially impact this at any
point, but most obviously and most severely from the client position.
-
+
In particular, clients can lie about sequence numbers, sending cells
with sequence numbers such that the next expected sequence number is
never sent. They can do this repeatedly on many circuits, to exhaust
memory at exits.
-
+
One option is to only allow actual traffic splitting in the downstream
direction, towards clients, and always use the primary circuit for
everything in the upstream direction. However, the ability to support
conflux from the client to the exit shows promise against traffic
analysis (see [WTF_SPLIT]).
-
+
The other option is to use [BLEST_TOR] from clients to exits, as it has
predictable interleaved cell scheduling, and minimizes reorder queues at
exits. If the ratios prescribed by that algorithm are not followed
within some bounds, the other endpoint can close both circuits, and free
the queue memory.
-
+
This still leaves the possibility that intermediate relays may block a
leg, allowing cells to traverse only one leg, thus still accumulating at
the reorder queue. Clients can also spoof sequence numbers similarly, to
make it appear that they are following [BLEST_TOR], without actually
sending any data on one of the legs.
-
+
To handle either of these cases, when a relay is under memory pressure,
the circuit OOM killer SHOULD free and close circuits with the oldest
reorder queue data, first. This heuristic was shown to be best during
@@ -585,7 +622,7 @@ Status: Draft
Two potential side channels may be introduced by the use of Conflux:
1. RTT leg-use bias by altering SENDME latency
2. Location info leaks through the use of both legs' latencies
-
+
For RTT and leg-use bias, Guard relays could delay legs to introduce a
pattern into the delivery of cells at the exit relay, by varying the
latency of SENDME cells (every 100th cell) to change the distribution of
@@ -593,14 +630,14 @@ Status: Draft
direction of traffic, to bias traffic load off of a particular Guard.
If an adversary controls both Guards, it could in theory send a binary
signal more easily, by alternating delays on each.
-
+
However, this risk weighs against the potential benefits against traffic
fingerprinting, as per [WTF_SPLIT]. Additionally, even ignoring
cryptographic tagging attacks, this side channel provides significantly
lower information over time than inter-packet-delay based side channels
that are already available to Guards and routers along the path to the
Guard.
-
+
Tor currently provides no defenses against already existing
single-circuit delay-based side channels, though both circuit padding
and [BACKLIT] are potential options it could conceivably deploy. The
@@ -610,17 +647,17 @@ Status: Draft
conflux side channels as well. Circuit padding can also help to obscure
which cells are SENDMEs, since circuit padding is not counted towards
SENDME totals.
-
+
The second class of side channel is where the Exit relay may be able to
use the two legs to further infer more information about client
location. See [LATENCY_LEAK] for more details. It is unclear at this
time how much more severe this is for two paths than just one.
-
+
We preserve the ability to disable conflux to and from Exit relays
using consensus parameters, if these side channels prove more severe,
or if it proves possible to mitigate single-circuit side
channels, but not conflux side channels.
-
+
In all cases, all of these side channels appear less severe for onion
service traffic, because of the higher path variability due to relay
selection, as well as the end-to-end nature of conflux in that case.
@@ -634,18 +671,18 @@ Status: Draft
packet counting and timing analysis at guards to guess which specific
circuits are linked. In particular, the 3 way handshake in
[LINKING_CIRCUITS] may be quite noticeable.
-
+
As one countermeasure, it may be possible to eliminate the third leg
- (RELAY_CIRCUIT_LINKED_RTT_ACK) by computing the exit/service RTT via
+ (RELAY_CONFLUX_LINKED_ACK) by computing the exit/service RTT via
measuring the time between CREATED/REND_JOINED and RELAY_CIRCUIT_LINK,
but this will introduce cross-component complexity into Tor's protocol
that could quickly become unwieldy and fragile.
-
+
Additionally, the conflux handshake may make onion services stand out
more, regardless of the number of stages in the handshake. For this
reason, it may be more wise to simply address these issues with circuit
padding machines during circuit setup (see padding-spec.txt).
-
+
Additional traffic analysis considerations arise when combining conflux
with padding, for purposes of mitigating traffic fingerprinting. For
this, it seems wise to treat the packet schedulers as another piece of a
@@ -653,7 +690,7 @@ Status: Draft
machines, perhaps introducing randomness or fudge factors into their
scheduling, as a parameterized distribution. For details, see
https://github.com/torproject/tor/blob/master/doc/HACKING/CircuitPaddingDevelopment.md
-
+
Finally, conflux may exacerbate forms of confirmation-based traffic
analysis that close circuits to determine concretely if they were in
use, since closing either leg might cause resumption to fail. TCP RST
@@ -665,13 +702,13 @@ Status: Draft
more vulnerable to this attack. However, if the adversary controls the
client, they will notice the resumption re-link, and still obtain
confirmation that way.
-
+
It seems the only way to fully mitigate these kinds of attacks is with
the Snowflake pluggable transport, which provides its own resumption and
retransmit behavior. Additionally, Snowflake's use of UDP DTLS also
protects against TCP RST injection, which we suspect to be the main
vector for such attacks.
-
+
In the future, a DTLS or QUIC transport for Tor such as masque could
provide similar RST injection resistance, and resumption at Guard/Bridge
nodes, as well.
@@ -713,51 +750,11 @@ Status: Draft
Appendix A [ALTERNATIVES]
-A.1 BEGIN/END sequencing [ALTERNATIVE_SEQUENCING]
-
- In this method of signaling, we increment the sequence number by 1 only
- when we switch legs, and use BEGIN/END "bookends" to know that all data
- on a leg has been received.
-
- To achieve this, we add a small sequence number to the common relay
- header for all relay cells on linked circuits, as well as a field to
- signal the beginning of a sequence, intermediate data, and the end of a
- sequence.
-
- Relay command [1 byte]
- Recognized [2 bytes]
- StreamID [2 bytes]
- Digest [4 bytes]
- Length [2 bytes]
- > Switching [2 bits] # 01 = BEGIN, 00 = CONTINUE, 10 = END
- > Sequencing [6 bits]
- Data [PAYLOAD_LEN - 12 - Length bytes]
-
- These fields MUST be present on ALL end-to-end relay cells on each leg
- that come from the endpoint, following a RELAY_CIRCUIT_LINK command.
-
- They are absent on 'leaky pipe' RELAY_COMMAND_DROP and
- RELAY_COMMAND_PADDING_NEGOTIATED cells that come from middle relays, as
- opposed to the endpoint, to support padding.
-
- Sequence numbers are incremented by one when an endpoint switches legs
- to transmit a cell. This number will wrap; implementations MUST treat
- 0 as the next sequence after 2^6-1. Because we do not expect to support
- significantly more than 2 legs, and much fewer than 63, this is not an
- issue.
-
- The first cell on a new circuit MUST use the BEGIN code for switching.
- Cells are delivered from that circuit until an END switching signal is
- received, even if cells arrive first on another circuit with the next
- sequence number before and END switching field. Recipients MUST only
- deliver cells with a BEGIN, if their Sequencing number is one more than
- the last END.
-
-A.2 Alternative Link Handshake [ALTERNATIVE_LINKING]
+A.1 Alternative Link Handshake [ALTERNATIVE_LINKING]
The circuit linking in [LINKING_CIRCUITS] could be done as encrypted
ntor onionskin extension fields, similar to those used by v3 onions.
-
+
This approach has at least four problems:
i). For onion services, since onionskins traverse the intro circuit
and return on the rend circuit, this handshake cannot measure
@@ -771,42 +768,42 @@ A.2 Alternative Link Handshake [ALTERNATIVE_LINKING]
iv). The overhead in processing this onionskin in onionskin queues
adds additional time for linking, even in the Exit case, making
that RTT potentially noisy.
-
+
Additionally, it is not clear that this approach actually saves us
anything in terms of setup time, because we can optimize away the
- linking phase using Proposal 325, to combine initial RELAY_BEGIN cells
+ linking phase using Proposal 340, to combine initial RELAY_BEGIN cells
with RELAY_CIRCUIT_LINK.
-A.3. Alternative RTT measurement [ALTERNATIVE_RTT]
+A.2. Alternative RTT measurement [ALTERNATIVE_RTT]
Instead of measuring RTTs during [LINKING_CIRCUITS], we could create
PING/PONG cells, whose sole purpose is to allow endpoints to measure
RTT.
-
+
This was rejected for several reasons. First, during circuit use, we
already have SENDMEs to measure RTT. Every 100 cells (or
'circwindow_inc' from Proposal 324), we are able to re-measure RTT based
on the time between that Nth cell and the SENDME ack. So we only need
PING/PONG to measure initial circuit RTT.
-
+
If we were able to use onionskins, as per [ALTERNATIVE_LINKING] above,
we might be able to specify a PING/PONG/PING handshake solely for
measuring initial RTT, especially for onion service circuits.
-
+
The reason for not making a dedicated PING/PONG for this purpose is that
it is context-free. Even if we were able to use onionskins for linking
and resumption, to avoid additional data in handshake that just measures
RTT, we would have to enforce that this PING/PONG/PING only follows the
exact form needed by this proposal, at the expected time, and at no
other points.
-
+
If we do not enforce this specific use of PING/PONG/PING, it becomes
another potential side channel, for use in attacks such as [DROPMARK].
-
+
In general, Tor is planning to remove current forms of context-free and
semantic-free cells from its protocol:
https://gitlab.torproject.org/tpo/core/torspec/-/issues/39
-
+
We should not add more.
@@ -814,15 +811,15 @@ Appendix B: Acknowledgments [ACKNOWLEDGEMENTS]
Thanks to Per Hurtig for helping us with the framing of the MPTCP
problem space.
-
+
Thanks to Simone Ferlin for clarifications on the [BLEST] paper, and for
pointing us at the Linux kernel implementation.
-
+
Extreme thanks goes again to Toke Høiland-Jørgensen, who helped
immensely towards our understanding of how the BLEST condition relates
to edge connection pushback, and for clearing up many other
misconceptions we had.
-
+
Finally, thanks to Mashael AlSabah, Kevin Bauer, Tariq Elahi, and Ian
Goldberg, for the original [CONFLUX] paper!
diff --git a/proposals/342-decouple-hs-interval.md b/proposals/342-decouple-hs-interval.md
new file mode 100644
index 0000000..395c454
--- /dev/null
+++ b/proposals/342-decouple-hs-interval.md
@@ -0,0 +1,107 @@
+```
+Filename: 342-decouple-hs-interval.md
+Title: Decoupling hs_interval and SRV lifetime
+Author: Nick Mathewson
+Created: 9 January 2023
+Status: Draft
+```
+
+# Motivation and introduction
+
+Tor uses shared random values (SRVs) in the consensus to determine
+positions of relays within a hash ring. Which shared random value is to
+be used for a given time period depends upon the time at which that
+shared random value became valid.
+
+But right now, the consensus voting period is closely tied to the shared
+random value voting cycle, and clients need to understand both of these
+in order to determine when a shared random value became current.
+
+This creates tight coupling between:
+ * The voting schedule
+ * The SRV liveness schedule
+  * The hsdir_interval parameter that determines the length of
+    an HSDIR index
+
+To decouple these values, this proposal describes a forward-compatible
+change to how Tor reports SRVs in consensuses, and how Tor decides which
+hash ring to use when.
+
+
+## Reporting SRV timestamps
+
+In consensus documents, parties should begin to accept
+`shared-rand-*-value` lines with an additional argument, in the format
+of an IsoTimeNospace timestamp (like "1985-10-26T00:00:00"). When
+present, this timestamp indicates the time at which the given shared
+random value first became the "current" SRV.
+
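As a non-authoritative sketch of the extended line format (the
`parse_shared_rand_line` helper name is hypothetical; the NumReveals and
Value fields are as specified in dir-spec.txt):

```python
from datetime import datetime, timezone

def parse_shared_rand_line(line):
    # A "shared-rand-*-value" line is: Keyword SP NumReveals SP Value,
    # optionally followed by the new IsoTimeNospace timestamp argument
    # indicating when the SRV first became "current".
    parts = line.split()
    keyword, num_reveals, value = parts[0], int(parts[1]), parts[2]
    became_current = None
    if len(parts) > 3:
        became_current = datetime.strptime(
            parts[3], "%Y-%m-%dT%H:%M:%S").replace(tzinfo=timezone.utc)
    return keyword, num_reveals, value, became_current
```

Parsers that do not know about the extra argument should keep ignoring
it, which is what makes the change forward compatible.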
+Additionally, we define a new consensus method that adds these
+timestamps to the consensus.
+
+We specify that, in the absence of such a timestamp, parties are to
+assume that the `shared-rand-current-value` SRV became "current" at the
+first 00:00 UTC on the UTC day of the consensus's valid-after timestamp,
+and that the `shared-rand-previous-value` SRV became "current" at 00:00
+UTC on the previous UTC day.
+
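The fallback rule above can be sketched as follows (a hypothetical
helper, assuming the consensus valid-after time is available as a
timezone-aware UTC datetime):

```python
from datetime import datetime, timedelta, timezone

def default_srv_timestamps(valid_after):
    # Absent explicit timestamps, the current SRV is taken to have
    # become "current" at the first 00:00 UTC on the valid-after day,
    # and the previous SRV at 00:00 UTC on the preceding UTC day.
    midnight = valid_after.replace(hour=0, minute=0, second=0,
                                   microsecond=0)
    return midnight, midnight - timedelta(days=1)
```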
+
+## Generalizing HSDir index scheduling
+
+Under the current HSDir design, there is one SRV for each time period,
+and one time period for which each SRV is in use. Decoupling
+`hsdir_interval` from 24 hours will require that we change this notion
+slightly.
+
+We therefore propose this set of generalized directory behavior rules,
+which should be equivalent to the current rules under current
+parameters.
+
+The calculation of time periods remains the same (see `rend-spec-v3.txt`
+section `[TIME PERIODS]`).
+
+A single SRV is associated with each time period: specifically, the SRV
+that was "current" at the start of the time period.
+
+There is a separate hash ring associated with each time period and its
+SRV.
+
+Whenever fetching an onion service descriptor, the client uses the hash
+ring for the time period that contains the start of the liveness
+interval of the current consensus. Call this the "consensus" time period.
+
+Whenever uploading an onion service descriptor, the service uses _two or
+three_ hash rings:
+ * The "consensus" time period (see above).
+ * The immediately preceding time period, if the SRV to calculate that
+ hash ring is available in the consensus.
+ * The immediately following time period, if the SRV to calculate that
+ hash ring is available in the consensus.
+
+(Under the current parameters, where `hsdir_interval = SRV_interval`,
+there will never be more than two possible time periods for which the
+service can qualify.)
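The upload rule above can be sketched as follows. The time-period
calculation matches rend-spec-v3 `[TIME PERIODS]` (default interval
1440 minutes, 12:00 UTC rotation offset); `srv_available` is a
hypothetical predicate answering whether the consensus carries the SRV
needed to build a given period's hash ring:

```python
def time_period_number(unix_time, hsdir_interval_minutes=1440,
                       offset_minutes=12 * 60):
    # Subtract the 12:00 UTC rotation offset, then count whole
    # hsdir_interval-long periods since the epoch.
    minutes = unix_time // 60 - offset_minutes
    return minutes // hsdir_interval_minutes

def service_upload_periods(consensus_tp, srv_available):
    # A service uploads to the consensus time period's ring, plus the
    # preceding and following periods whenever the SRV needed for those
    # rings is present in the consensus.
    periods = [consensus_tp]
    if srv_available(consensus_tp - 1):
        periods.insert(0, consensus_tp - 1)
    if srv_available(consensus_tp + 1):
        periods.append(consensus_tp + 1)
    return periods
```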
+
+## Migration
+
+We declare that, for at least the lifetime of the C tor client, we will
+not make any changes to the voting interval, the SRV interval, or the
+`hsdir_interval`. As such, we do not need to prioritize implementing
+these changes in the C client: we can make them in Arti only.
+
+## Issues left unsolved
+
+There are likely other lingering issues that would come up if we try to
+change the voting interval. This proposal does not attempt to solve
+them.
+
+This proposal does not attempt to add flexibility to the SRV voting
+algorithm itself.
+
+Changing `hsdir_interval` would create a flag day where everybody using
+old and new values of `hsdir_interval` would get different hash
+rings. We do not try to solve that here.
+
+## Acknowledgments
+
+Thanks to David Goulet for explaining all of this stuff to me!
diff --git a/proposals/BY_INDEX.md b/proposals/BY_INDEX.md
index b64ae55..d0b1214 100644
--- a/proposals/BY_INDEX.md
+++ b/proposals/BY_INDEX.md
@@ -259,4 +259,5 @@ Below are a list of proposals sorted by their proposal number. See
* [`339-udp-over-tor.md`](/proposals/339-udp-over-tor.md): UDP traffic over Tor [ACCEPTED]
* [`340-packed-and-fragmented.md`](/proposals/340-packed-and-fragmented.md): Packed and fragmented relay messages [OPEN]
* [`341-better-oos.md`](/proposals/341-better-oos.md): A better algorithm for out-of-sockets eviction [OPEN]
+* [`342-decouple-hs-interval.md`](/proposals/342-decouple-hs-interval.md): Decoupling hs_interval and SRV lifetime [DRAFT]
diff --git a/proposals/README.md b/proposals/README.md
index 4127329..0461d6a 100644
--- a/proposals/README.md
+++ b/proposals/README.md
@@ -109,6 +109,7 @@ discussion.
* [`327-pow-over-intro.txt`](/proposals/327-pow-over-intro.txt): A First Take at PoW Over Introduction Circuits
* [`329-traffic-splitting.txt`](/proposals/329-traffic-splitting.txt): Overcoming Tor's Bottlenecks with Traffic Splitting
* [`331-res-tokens-for-anti-dos.md`](/proposals/331-res-tokens-for-anti-dos.md): Res tokens: Anonymous Credentials for Onion Service DoS Resilience
+* [`342-decouple-hs-interval.md`](/proposals/342-decouple-hs-interval.md): Decoupling hs_interval and SRV lifetime
## NEEDS-REVISION proposals: ideas that we can't implement as-is
diff --git a/rend-spec-v3.txt b/rend-spec-v3.txt
index fac1395..0914c81 100644
--- a/rend-spec-v3.txt
+++ b/rend-spec-v3.txt
@@ -1100,6 +1100,9 @@ Table of contents:
prevents an attacker from replacing a newer descriptor signed by
a given key with a copy of an older version.)
+ Implementations MUST be able to parse 64-bit values for these
+ counters.
+
"superencrypted" NL encrypted-string
[Exactly once.]
@@ -1409,6 +1412,8 @@ Table of contents:
"legacy-key" NL key NL
[None or at most once per introduction point]
+ [This field is obsolete and should never be generated; it
+ is included for historical reasons only.]
The key is an ASN.1 encoded RSA public key in PEM format used for a
legacy introduction point as described in [LEGACY_EST_INTRO].
@@ -1420,6 +1425,8 @@ Table of contents:
"legacy-key-cert" NL certificate NL
[None or at most once per introduction point]
+ [This field is obsolete and should never be generated; it
+ is included for historical reasons only.]
MUST be present if "legacy-key" is present.
@@ -1653,6 +1660,9 @@ Table of contents:
3.1.2. Registering an introduction point on a legacy Tor node
[LEGACY_EST_INTRO]
+ [This section is obsolete and refers to a workaround for now-obsolete Tor
+ relay versions. It is included for historical reasons.]
+
Tor nodes should also support an older version of the ESTABLISH_INTRO
cell, first documented in rend-spec.txt. New hidden service hosts
must use this format when establishing introduction points at older
@@ -1693,6 +1703,8 @@ Table of contents:
Older versions of Tor send back an empty INTRO_ESTABLISHED cell instead.
Services must accept an empty INTRO_ESTABLISHED cell from a legacy relay.
+ [The above paragraph is obsolete and refers to a workaround for
+ now-obsolete Tor relay versions. It is included for historical reasons.]
The same rules for multiplicity, ordering, and handling unknown types
apply to the extension fields here as described in [EST_INTRO] above.
@@ -2133,6 +2145,9 @@ Table of contents:
4.3. Using legacy hosts as rendezvous points
+ [This section is obsolete and refers to a workaround for now-obsolete Tor
+ relay versions. It is included for historical reasons.]
+
The behavior of ESTABLISH_RENDEZVOUS is unchanged from older versions
of this protocol, except that relays should now ignore unexpected
bytes at the end.
diff --git a/tor-spec.txt b/tor-spec.txt
index 25a12a7..d5305f2 100644
--- a/tor-spec.txt
+++ b/tor-spec.txt
@@ -662,7 +662,7 @@ see tor-design.pdf.
versions cell they received. If they have no such version in common,
they cannot communicate and MUST close the connection. Either party MUST
close the connection if the versions cell is not well-formed (for example,
- if it contains an odd number of bytes).
+ if the payload contains an odd number of bytes).
Any VERSIONS cells sent after the first VERSIONS cell MUST be ignored.
(To be interpreted correctly, later VERSIONS cells MUST have a CIRCID_LEN
@@ -1651,8 +1651,8 @@ see tor-design.pdf.
inbound RELAY_EARLY cells, it MUST close the circuit immediately.
When speaking v2 of the link protocol or later, clients MUST only send
- EXTEND/EXTEND2 cells inside RELAY_EARLY cells. Clients SHOULD send the first ~8
- RELAY cells that are not targeted at the first hop of any circuit as
+ EXTEND/EXTEND2 cells inside RELAY_EARLY cells. Clients SHOULD send the first
+ ~8 RELAY cells that are not targeted at the first hop of any circuit as
RELAY_EARLY cells too, in order to partially conceal the circuit length.
[Starting with Tor 0.2.3.11-alpha, relays should reject any
@@ -1702,6 +1702,8 @@ see tor-design.pdf.
16..18 -- Reserved for UDP; Not yet in use, see prop339.
+ 19..22 -- Reserved for Conflux, see prop329.
+
32..40 -- Used for hidden services; see rend-spec-{v2,v3}.txt.
41..42 -- Used for circuit padding; see Section 3 of padding-spec.txt.