-rw-r--r--  address-spec.txt                    |   2
-rw-r--r--  control-spec.txt                    |   5
-rw-r--r--  dir-spec.txt                        |   2
-rw-r--r--  proposals/000-index.txt             |  14
-rw-r--r--  proposals/319-wide-everything.md    |   4
-rw-r--r--  proposals/325-packed-relay-cells.md |   4
-rw-r--r--  proposals/327-pow-over-intro.txt    | 408
-rw-r--r--  proposals/329-traffic-splitting.txt | 229
-rw-r--r--  proposals/343-rend-caa.txt          | 107
-rw-r--r--  proposals/BY_INDEX.md               |   7
-rw-r--r--  proposals/README.md                 |   7
-rw-r--r--  rend-spec-v3.txt                    |  92
-rw-r--r--  tor-spec.txt                        |   3
13 files changed, 634 insertions(+), 250 deletions(-)
diff --git a/address-spec.txt b/address-spec.txt
index 98e7068..1e90e6e 100644
--- a/address-spec.txt
+++ b/address-spec.txt
@@ -1,4 +1,3 @@
-
Special Hostnames in Tor
Nick Mathewson
@@ -71,6 +70,7 @@ Table of Contents
- PUBKEY is the 32 bytes ed25519 master pubkey of the onion service.
- VERSION is a one byte version field (default value '\x03')
- ".onion checksum" is a constant string
+ - H is SHA3-256
- CHECKSUM is truncated to two bytes before inserting it in onion_address
When Tor sees an address in this format, it tries to look up and connect to
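The construction these fields describe (per rend-spec-v3, onion_address = base32(PUBKEY | CHECKSUM | VERSION) with H = SHA3-256) can be sketched in Python. This is an illustrative sketch, not the tor implementation; `onion_address` is a hypothetical helper name:

```python
import base64
import hashlib
import os

def onion_address(pubkey: bytes, version: int = 3) -> str:
    """Sketch of the v3 onion address construction:
    CHECKSUM = H(".onion checksum" || PUBKEY || VERSION)[:2], H = SHA3-256."""
    assert len(pubkey) == 32
    checksum = hashlib.sha3_256(
        b".onion checksum" + pubkey + bytes([version])
    ).digest()[:2]
    onion = base64.b32encode(pubkey + checksum + bytes([version]))
    return onion.decode("ascii").lower() + ".onion"

# Round-trip check: decode the address and re-verify its checksum.
addr = onion_address(os.urandom(32))
raw = base64.b32decode(addr.removesuffix(".onion").upper())
pubkey, checksum, version = raw[:32], raw[32:34], raw[34]
assert checksum == hashlib.sha3_256(
    b".onion checksum" + pubkey + bytes([version])
).digest()[:2]
```

Since the 35-byte payload is a multiple of 5, the base32 form is always 56 characters with no padding.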
diff --git a/control-spec.txt b/control-spec.txt
index 9295580..52e11a0 100644
--- a/control-spec.txt
+++ b/control-spec.txt
@@ -2260,6 +2260,7 @@ Table of Contents
[SP "REASON=" Reason [SP "REMOTE_REASON=" Reason]]
[SP "SOCKS_USERNAME=" EscapedUsername]
[SP "SOCKS_PASSWORD=" EscapedPassword]
+ [SP "HS_POW=" HSPoW ]
CRLF
CircStatus =
@@ -2293,6 +2294,10 @@ Table of Contents
"HSSI_CONNECTING" / "HSSI_ESTABLISHED" /
"HSSR_CONNECTING" / "HSSR_JOINED"
+ HSPoWType = "v1"
+ HSPoWEffort = 1*DIGIT
+ HSPoW = HSPoWType "," HSPoWEffort
+
EscapedUsername = QuotedString
EscapedPassword = QuotedString
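A controller consuming CIRC events could parse the new HS_POW field as sketched below; `parse_hs_pow` is a hypothetical helper, not part of any controller library, written directly from the grammar above (HSPoW = HSPoWType "," HSPoWEffort):

```python
def parse_hs_pow(value: str) -> tuple[str, int]:
    """Parse an HS_POW value per the grammar above:
    HSPoWType is "v1" and HSPoWEffort is 1*DIGIT."""
    pow_type, _, effort = value.partition(",")
    if pow_type != "v1" or not effort.isdigit():
        raise ValueError(f"malformed HS_POW value: {value!r}")
    return pow_type, int(effort)

assert parse_hs_pow("v1,10000") == ("v1", 10000)
```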
diff --git a/dir-spec.txt b/dir-spec.txt
index b6a7c8e..d6b58cd 100644
--- a/dir-spec.txt
+++ b/dir-spec.txt
@@ -1032,7 +1032,7 @@ Table of Contents
statistics.
"geoip-start-time" and "geoip-client-origins" have been replaced by
- "bridge-stats-end" and "bridge-stats-ips" in 0.2.2.4-alpha. The
+ "bridge-stats-end" and "bridge-ips" in 0.2.2.4-alpha. The
reason is that the measurement interval with "geoip-stats" as
determined by subtracting "geoip-start-time" from "published" could
have had a variable length, whereas the measurement interval in
diff --git a/proposals/000-index.txt b/proposals/000-index.txt
index a838b94..160ce4e 100644
--- a/proposals/000-index.txt
+++ b/proposals/000-index.txt
@@ -239,17 +239,17 @@ Proposals by number:
316 FlashFlow: A Secure Speed Test for Tor (Parent Proposal) [DRAFT]
317 Improve security aspects of DNS name resolution [NEEDS-REVISION]
318 Limit protover values to 0-63 [CLOSED]
-319 RELAY_FRAGMENT cells [OPEN]
+319 RELAY_FRAGMENT cells [OBSOLETE]
320 Removing TAP usage from v2 onion services [REJECTED]
321 Better performance and usability for the MyFamily option (v2) [ACCEPTED]
322 Extending link specifiers to include the directory port [OPEN]
323 Specification for Walking Onions [OPEN]
324 RTT-based Congestion Control for Tor [OPEN]
-325 Packed relay cells: saving space on small commands [OPEN]
+325 Packed relay cells: saving space on small commands [OBSOLETE]
326 The "tor-relay" Well-Known Resource Identifier [OPEN]
327 A First Take at PoW Over Introduction Circuits [DRAFT]
328 Make Relays Report When They Are Overloaded [CLOSED]
-329 Overcoming Tor's Bottlenecks with Traffic Splitting [DRAFT]
+329 Overcoming Tor's Bottlenecks with Traffic Splitting [NEEDS-REVISION]
330 Modernizing authority contact entries [OPEN]
331 Res tokens: Anonymous Credentials for Onion Service DoS Resilience [DRAFT]
332 Ntor protocol with extra data, version 3 [FINISHED]
@@ -263,6 +263,7 @@ Proposals by number:
340 Packed and fragmented relay messages [OPEN]
341 A better algorithm for out-of-sockets eviction [OPEN]
342 Decoupling hs_interval and SRV lifetime [DRAFT]
+343 CAA Extensions for the Tor Rendezvous Specification [OPEN]
Proposals by status:
@@ -271,7 +272,6 @@ Proposals by status:
294 TLS 1.3 Migration
316 FlashFlow: A Secure Speed Test for Tor (Parent Proposal)
327 A First Take at PoW Over Introduction Circuits
- 329 Overcoming Tor's Bottlenecks with Traffic Splitting
331 Res tokens: Anonymous Credentials for Onion Service DoS Resilience
342 Decoupling hs_interval and SRV lifetime
NEEDS-REVISION:
@@ -283,6 +283,7 @@ Proposals by status:
279 A Name System API for Tor Onion Services
291 The move to two guard nodes
317 Improve security aspects of DNS name resolution
+ 329 Overcoming Tor's Bottlenecks with Traffic Splitting
OPEN:
239 Consensus Hash Chaining
240 Early signing key revocation for directory authorities
@@ -296,15 +297,14 @@ Proposals by status:
306 A Tor Implementation of IPv6 Happy Eyeballs
308 Counter Galois Onion: A New Proposal for Forward-Secure Relay Cryptography
309 Optimistic SOCKS Data
- 319 RELAY_FRAGMENT cells
322 Extending link specifiers to include the directory port
323 Specification for Walking Onions
324 RTT-based Congestion Control for Tor
- 325 Packed relay cells: saving space on small commands
326 The "tor-relay" Well-Known Resource Identifier
330 Modernizing authority contact entries
340 Packed and fragmented relay messages
341 A better algorithm for out-of-sockets eviction
+ 343 CAA Extensions for the Tor Rendezvous Specification
ACCEPTED:
265 Load Balancing with Overhead Parameters [for 0.2.9.x]
282 Remove "Named" and "Unnamed" handling from consensus voting [for 0.3.3.x]
@@ -508,6 +508,8 @@ Proposals by status:
263 Request to change key exchange protocol for handshake v1.2
268 New Guard Selection Behaviour
270 RebelAlliance: A Post-Quantum Secure Hybrid Handshake Based on NewHope
+ 319 RELAY_FRAGMENT cells
+ 325 Packed relay cells: saving space on small commands
RESERVE:
133 Incorporate Unreachable ORs into the Tor Network
172 GETINFO controller option for circuit information
diff --git a/proposals/319-wide-everything.md b/proposals/319-wide-everything.md
index 0de6676..06173ed 100644
--- a/proposals/319-wide-everything.md
+++ b/proposals/319-wide-everything.md
@@ -3,9 +3,11 @@ Filename: 319-wide-everything.md
Title: RELAY_FRAGMENT cells
Author: Nick Mathewson
Created: 11 May 2020
-Status: Open
+Status: Obsolete
```
+(Proposal superseded by proposal 340)
+
(This proposal is part of the Walking Onions spec project.)
# Introduction
diff --git a/proposals/325-packed-relay-cells.md b/proposals/325-packed-relay-cells.md
index 6498d4c..7a88840 100644
--- a/proposals/325-packed-relay-cells.md
+++ b/proposals/325-packed-relay-cells.md
@@ -3,9 +3,11 @@ Filename: 325-packed-relay-cells.md
Title: Packed relay cells: saving space on small commands
Author: Nick Mathewson
Created: 10 July 2020
-Status: Open
+Status: Obsolete
```
+(Proposal superseded by proposal 340)
+
# Introduction
In proposal 319 I suggested a way to fragment long commands across
diff --git a/proposals/327-pow-over-intro.txt b/proposals/327-pow-over-intro.txt
index ca7f7cd..8f17753 100644
--- a/proposals/327-pow-over-intro.txt
+++ b/proposals/327-pow-over-intro.txt
@@ -56,8 +56,8 @@ Status: Draft
The small botnet is a bunch of computers lined up to do an introduction
flooding attack. Assuming 500 medium-range computers, we are talking about
- an attacker with total access to 10 Thz of CPU and 10 TB of RAM. We consider
- the upfront cost for this attacker to be about $400.
+   an attacker with total access to 10 THz of CPU and 10 TB of RAM. We
+ consider the upfront cost for this attacker to be about $400.
"The large botnet"
@@ -68,7 +68,7 @@ Status: Draft
We hope that this proposal can help us defend against the script-kiddie
attacker and small botnets. To defend against a large botnet we would need
- more tools in our disposal (see [FUTURE_DESIGNS]).
+ more tools at our disposal (see [FUTURE_DESIGNS]).
1.2.2. User profiles [USER_MODEL]
@@ -104,8 +104,8 @@ Status: Draft
This proposal is not perfect and it does not cover all the use cases. Still,
we think that by covering some use cases and giving reachability to the
people who really need it, we will severely demotivate the attackers from
- continuing the DoS attacks and hence stop the DoS threat all
- together. Furthermore, by increasing the cost to launch a DoS attack, a big
+   continuing the DoS attacks and hence stop the DoS threat altogether.
+ Furthermore, by increasing the cost to launch a DoS attack, a big
class of DoS attackers will disappear from the map, since the expected ROI
will decrease.
@@ -135,33 +135,73 @@ Status: Draft
introduction phase of the onion service protocol.
The system described in this proposal is not meant to be on all the time, and
- should only be enabled by services when under duress. The percentage of
- clients receiving puzzles can also be configured based on the load of the
- service.
+ it can be entirely disabled for services that do not experience DoS attacks.
- In summary, the following steps are taken for the protocol to complete:
+ When the subsystem is enabled, suggested effort is continuously adjusted and
+ the computational puzzle can be bypassed entirely when the effort reaches
+ zero. In these cases, the proof-of-work subsystem can be dormant but still
+ provide the necessary parameters for clients to voluntarily provide effort
+ in order to get better placement in the priority queue.
+
+ The protocol involves the following major steps:
1) Service encodes PoW parameters in descriptor [DESC_POW]
2) Client fetches descriptor and computes PoW [CLIENT_POW]
3) Client completes PoW and sends results in INTRO1 cell [INTRO1_POW]
- 4) Service verifies PoW and queues introduction based on PoW effort [SERVICE_VERIFY]
+ 4) Service verifies PoW and queues introduction based on PoW effort
+ [SERVICE_VERIFY]
+ 5) Requests are continuously drained from the queue, highest effort first,
+ subject to multiple constraints on speed [HANDLE_QUEUE]
2.2. Proof-of-work overview
-2.2.1. Primitives
-
- For our proof-of-work function we will use the 'equix' scheme by tevador
- [REF_EQUIX]. Equix is an assymetric PoW function based on Equihash<60,3>. It
- features lightning fast verification speed, and also aims to minimize the
- assymetry between CPU and GPU. Furthermore, it's designed for this particular
- use-case and hence cryptocurrency miners are not incentivized to make
- optimized ASICs for it.
-
- The Equix scheme provides two functions that will be used in this proposal:
- - equix_solve(seed, nonce, effort) which solves a puzzle instance.
- - equix_verify(seed, nonce, effort, solution) which verifies a puzzle solution.
-
- We tune equix in section [EQUIX_TUNING].
+2.2.1. Algorithm overview
+
+ For our proof-of-work function we will use the Equi-X scheme by tevador
+   [REF_EQUIX]. Equi-X is an asymmetric PoW function based on Equihash<60,3>,
+   using HashX as the underlying layer. It features lightning-fast verification
+   speed, and also aims to minimize the asymmetry between CPU and GPU.
+ Furthermore, it's designed for this particular use-case and hence
+ cryptocurrency miners are not incentivized to make optimized ASICs for it.
+
+ The overall scheme consists of several layers that provide different pieces
+ of this functionality:
+
+ 1) At the lowest layers, blake2b and siphash are used as hashing and PRNG
+ algorithms that are well suited to common 64-bit CPUs.
+ 2) A custom hash function, HashX, uses dynamically generated functions that
+ are tuned to be a good match for pipelined integer and floating point
+ performance on current 64-bit CPUs. This layer provides the strongest ASIC
+ resistance, since a reimplementation in hardware would need to implement
+ much of a CPU to compute these functions efficiently.
+ 3) The Equi-X layer itself builds on HashX and adds an algorithmic puzzle
+ that's designed to be strongly asymmetric and to require RAM to solve
+ efficiently.
+ 4) The PoW protocol itself builds on this Equi-X function with a particular
+ construction of the challenge input and particular constraints on the
+ allowed blake2b hash of the solution. This layer provides a linearly
+   adjustable effort that we can verify.
+ 5) Above the level of individual PoW handshakes, the client and service
+ form a closed-loop system that adjusts the effort of future handshakes.
+
+ The Equi-X scheme provides two functions that will be used in this proposal:
+ - equix_solve(challenge) which solves a puzzle instance, returning
+ a variable number of solutions per invocation depending on the specific
+ challenge value.
+ - equix_verify(challenge, solution) which verifies a puzzle solution
+ quickly. Verification still depends on executing the HashX function,
+ but far fewer times than when searching for a solution.
+
+ For the purposes of this proposal, all cryptographic algorithms are assumed
+ to produce and consume byte strings, even if internally they operate on
+ some other data type like 64-bit words. This is conventionally little endian
+ order for blake2b, which contrasts with Tor's typical use of big endian.
+ HashX itself is configured with an 8-byte output but its input is a single
+ 64-bit word of undefined byte order, of which only the low 16 bits are used
+ by Equi-X in its solution output. We treat Equi-X solution arrays as byte
+ arrays using their packed little endian 16-bit representation.
+
+ We tune Equi-X in section [EQUIX_TUNING].
2.2.2. Dynamic PoW
@@ -184,21 +224,31 @@ Status: Draft
2.2.3. PoW effort
- For our dynamic PoW system to work, we will need to be able to compare PoW
- tokens with each other. To do so we define a function:
+ It's common for proof-of-work systems to define an exponential effort
+ function based on a particular number of leading zero bits or equivalent.
+ For the benefit of our effort estimation system, it's quite useful if we
+ instead have a linear scale. We use the first 32 bits of a hashed version
+ of the Equi-X solution as compared to the full 32-bit range.
+
+ Conceptually we could define a function:
unsigned effort(uint8_t *token)
- which takes as its argument a hash output token, interprets it as a
+ which takes as its argument a hashed solution, interprets it as a
bitstring, and returns the quotient of dividing a bitstring of 1s by it.
So for example:
- effort(00000001100010101101) = 11111111111111111111 / 00000001100010101101
+ effort(00000001100010101101) = 11111111111111111111
+ / 00000001100010101101
or the same in decimal:
effort(6317) = 1048575 / 6317 = 165.
- This definition of effort has the advantage of directly expressing the
- expected number of hashes that the client had to calculate to reach the
- effort. This is in contrast to the (cheaper) exponential effort definition of
- taking the number of leading zero bits.
+ In practice we can avoid even having to perform this division, performing
+ just one multiply instead to see if a request's claimed effort is supported
+ by the smallness of the resulting 32-bit hash prefix. This assumes we send
+ the desired effort explicitly as part of each PoW solution. We do want to
+ force clients to pick a specific effort before looking for a solution,
+ otherwise a client could opportunistically claim a very large effort any
+ time a lucky hash prefix comes up. Thus the effort is communicated explicitly
+ in our protocol, and it forms part of the concatenated Equi-X challenge.
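The divide-free check can be sketched as follows; `effort_ok` is an illustrative name, and the example scales the 20-bit figures above into the 32-bit range the protocol actually uses:

```python
UINT32_MAX = 2**32 - 1

def effort_ok(r: int, effort: int) -> bool:
    """A claimed effort E is supported by 32-bit hash prefix R iff
    R * E <= UINT32_MAX, i.e. floor(UINT32_MAX / R) >= E, with no division."""
    return r * effort <= UINT32_MAX

# Equivalent to the division-based effort() definition above:
r = 6317 << 12  # scale the 20-bit example value into the 32-bit range
assert effort_ok(r, 165) == (UINT32_MAX // r >= 165)
assert not effort_ok(r, 166)
```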
3. Protocol specification
@@ -219,14 +269,16 @@ Status: Draft
without trailing padding.
suggested-effort: An unsigned integer specifying an effort value that
- clients should aim for when contacting the service. See
+ clients should aim for when contacting the service. Can be
+ zero to mean that PoW is available but not currently
+ suggested for a first connection attempt. See
[EFFORT_ESTIMATION] for more details here.
- expiration-time: A timestamp in "YYYY-MM-DD SP HH:MM:SS" format after
- which the above seed expires and is no longer valid as
- the input for PoW. It's needed so that the size of our
- replay cache does not grow infinitely. It should be
- set to RAND_TIME(now+7200, 900) seconds.
+    expiration-time: A timestamp in "YYYY-MM-DDTHH:MM:SS" format (ISO time
+ with no space) after which the above seed expires and
+ is no longer valid as the input for PoW. It's needed
+ so that our replay cache does not grow infinitely. It
+ should be set to RAND_TIME(now+7200, 900) seconds.
The service should refresh its seed when expiration-time passes. The service
SHOULD keep its previous seed in memory and accept PoWs using it to avoid
@@ -240,7 +292,8 @@ Status: Draft
3.2. Client fetches descriptor and computes PoW [CLIENT_POW]
If a client receives a descriptor with "pow-params", it should assume that
- the service is expecting a PoW input as part of the introduction protocol.
+ the service is prepared to receive PoW solutions as part of the introduction
+ protocol.
The client parses the descriptor and extracts the PoW parameters. It makes
sure that the <expiration-time> has not expired and if it has, it needs to
@@ -248,25 +301,44 @@ Status: Draft
The client should then extract the <suggested-effort> field to configure its
PoW 'target' (see [REF_TARGET]). The client SHOULD NOT accept 'target' values
- that will cause an infinite PoW computation. {XXX: How to enforce this?}
+ that will cause unacceptably long PoW computation.
+
+ The client uses a "personalization string" P equal to the following
+   NUL-terminated ASCII string: "Tor hs intro v1\0".
+
+ The client looks up `ID`, the current 32-byte blinded public ID
+ (KP_hs_blind_id) for the onion service.
To complete the PoW the client follows the following logic:
- a) Client selects a target effort E.
- b) Client generates a random 16-byte nonce N.
+ a) Client selects a target effort E, based on <suggested-effort> and past
+ connection attempt history.
+ b) Client generates a secure random 16-byte nonce N, as the starting
+ point for the solution search.
c) Client derives seed C by decoding 'seed-b64'.
- d) Client calculates S = equix_solve(C || N || E)
- e) Client calculates R = blake2b(C || N || E || S)
+ d) Client calculates S = equix_solve(P || ID || C || N || E)
+ e) Client calculates R = ntohl(blake2b_32(P || ID || C || N || E || S))
f) Client checks if R * E <= UINT32_MAX.
- f1) If yes, success! The client can submit N, E, the first 4 bytes of C
- and S.
+ f1) If yes, success! The client can submit N, E, the first 4 bytes of
+ C, and S.
f2) If no, fail! The client interprets N as a 16-byte little-endian
- integer, increments it by 1 and goes back to step d).
+ integer, increments it by 1 and goes back to step d).
+
+ Note that the blake2b hash includes the output length parameter in its
+ initial state vector, so a blake2b_32 is not equivalent to the prefix of a
+ blake2b_512. We calculate the 32-bit blake2b specifically, and interpret it
+ in network byte order as an unsigned integer.
At the end of the above procedure, the client should have S as the solution
- of the Equix puzzle with N as the nonce, C as the seed. How quickly this
+   of the Equi-X puzzle with N as the nonce, C as the seed. How quickly this
happens depends solely on the target effort E parameter.
+ The algorithm as described is suitable for single-threaded computation.
+ Optionally, a client may choose multiple nonces and attempt several solutions
+ in parallel on separate CPU cores. The specific choice of nonce is entirely
+ up to the client, so parallelization choices like this do not impact the
+ network protocol's interoperability at all.
+
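Steps a) through f) above can be sketched as below. Since Equi-X is not a standard-library primitive, `equix_solve` here is a caller-supplied stand-in; only the blake2b_32 computation, byte orders, and the success condition follow the text above:

```python
import hashlib

P = b"Tor hs intro v1\x00"  # personalization string from above
UINT32_MAX = 2**32 - 1

def solve(seed_c: bytes, blinded_id: bytes, effort: int, equix_solve,
          nonce: int = 0):
    """Sketch of client steps (a)-(f). `equix_solve` is a stand-in for the
    real Equi-X solver; it returns candidate solutions S for a challenge."""
    e = effort.to_bytes(4, "big")
    while True:
        n = nonce.to_bytes(16, "little")  # N as a 16-byte little-endian int
        challenge = P + blinded_id + seed_c + n + e
        for s in equix_solve(challenge):
            # R = ntohl(blake2b_32(challenge || S)): a 4-byte blake2b digest
            # interpreted in network (big-endian) byte order.
            r = int.from_bytes(
                hashlib.blake2b(challenge + s, digest_size=4).digest(), "big")
            if r * effort <= UINT32_MAX:  # step (f) success condition
                return n, s
        nonce = (nonce + 1) % (1 << 128)  # step (f2): increment N, retry
```

Note that `digest_size=4` asks blake2b for a 4-byte digest directly, matching the remark above that blake2b_32 is not a prefix of blake2b_512.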
3.3. Client sends PoW in INTRO1 cell [INTRO1_POW]
Now that the client has an answer to the puzzle it's time to encode it into
@@ -292,6 +364,7 @@ Status: Draft
POW_VERSION is 1 for the protocol specified in this proposal
POW_NONCE is the nonce 'N' from the section above
+ POW_EFFORT is the 32-bit integer effort value, in network byte order
POW_SEED is the first 4 bytes of the seed used
This will increase the INTRODUCE1 payload size by 43 bytes since the
@@ -303,10 +376,10 @@ Status: Draft
3.4. Service verifies PoW and handles the introduction [SERVICE_VERIFY]
When a service receives an INTRODUCE1 with the PROOF_OF_WORK extension, it
- should check its configuration on whether proof-of-work is required to
- complete the introduction. If it's not required, the extension SHOULD BE
- ignored. If it is required, the service follows the procedure detailed in
- this section.
+ should check its configuration on whether proof-of-work is enabled on the
+   service. If it's not enabled, the extension SHOULD be ignored. If enabled,
+ even if the suggested effort is currently zero, the service follows the
+ procedure detailed in this section.
If the service requires the PROOF_OF_WORK extension but received an
INTRODUCE1 cell without any embedded proof-of-work, the service SHOULD
@@ -319,12 +392,12 @@ Status: Draft
a) Find a valid seed C that starts with POW_SEED. Fail if no such seed
exists.
- b) Fail if E = POW_EFFORT is lower than the minimum effort.
- c) Fail if N = POW_NONCE is present in the replay cache (see [REPLAY_PROTECTION[)
- d) Calculate R = blake2b(C || N || E || S)
- e) Fail if R * E > UINT32_MAX
- f) Fail if equix_verify(C || N || E, S) != EQUIX_OK
- g) Put the request in the queue with a priority of E
+ b) Fail if N = POW_NONCE is present in the replay cache
+ (see [REPLAY_PROTECTION])
+ c) Calculate R = ntohl(blake2b_32(P || ID || C || N || E || S))
+ d) Fail if R * E > UINT32_MAX
+ e) Fail if equix_verify(P || ID || C || N || E, S) != EQUIX_OK
+ f) Put the request in the queue with a priority of E
If any of these steps fail the service MUST ignore this introduction request
and abort the protocol.
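The verification side can be sketched as the mirror image of the client's search; `equix_verify` and the replay cache are caller-supplied stand-ins, and the function names are illustrative:

```python
import hashlib

P = b"Tor hs intro v1\x00"
UINT32_MAX = 2**32 - 1

def verify(seeds, blinded_id, pow_seed4, nonce, effort, solution,
           replay_cache, equix_verify) -> bool:
    """Sketch of service steps (a)-(f). `equix_verify` stands in for the
    real Equi-X verifier and must return True for valid solutions."""
    # (a) find a currently valid seed whose first 4 bytes match POW_SEED
    matches = [c for c in seeds if c[:4] == pow_seed4]
    if not matches:
        return False
    c = matches[0]
    # (b) reject replayed nonces
    if nonce in replay_cache:
        return False
    challenge = P + blinded_id + c + nonce + effort.to_bytes(4, "big")
    # (c)/(d) recompute R and check the claimed effort commitment
    r = int.from_bytes(
        hashlib.blake2b(challenge + solution, digest_size=4).digest(), "big")
    if r * effort > UINT32_MAX:
        return False
    # (e) verify the Equi-X solution itself
    if not equix_verify(challenge, solution):
        return False
    replay_cache.add(nonce)
    return True  # (f) caller enqueues the request with priority `effort`
```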
@@ -349,7 +422,9 @@ Status: Draft
will flag some connections as replays even if they are not; with this false
positive probability increasing as the number of entries increase. However,
with the right parameter tuning this probability should be negligible and
- well handled by clients. {TODO: Figure bloom filter}
+ well handled by clients.
+
+ {TODO: Design and specify a suitable bloom filter for this purpose.}
3.4.2. The Introduction Queue [INTRO_QUEUE]
@@ -365,11 +440,11 @@ Status: Draft
structure. Each element in that priority queue is an introduction request,
and its priority is the effort put into its PoW:
- When a verified introduction comes through, the service uses the effort()
- function with the solution S as its input, and uses the output to place requests
- into the right position of the priority_queue: The bigger the effort, the
- more priority it gets in the queue. If two elements have the same effort, the
- older one has priority over the newer one.
+ When a verified introduction comes through, the service uses its included
+ effort commitment value to place each request into the right position of the
+ priority_queue: The bigger the effort, the more priority it gets in the
+ queue. If two elements have the same effort, the older one has priority over
+ the newer one.
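The ordering rule (bigger effort first, ties broken in favor of the older request) can be sketched with a binary min-heap; `IntroQueue` is an illustrative name, not the tor data structure:

```python
import heapq
import itertools

class IntroQueue:
    """Priority queue as described above: highest effort first,
    equal-effort entries served oldest-first (FIFO)."""
    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # monotonic arrival order

    def add(self, effort: int, request):
        # Negate effort so Python's min-heap pops the largest effort first;
        # the arrival counter makes equal-effort entries FIFO.
        heapq.heappush(self._heap, (-effort, next(self._counter), request))

    def pop(self):
        return heapq.heappop(self._heap)[2]

q = IntroQueue()
q.add(10, "low")
q.add(500, "high")
q.add(500, "high-but-newer")
assert [q.pop(), q.pop(), q.pop()] == ["high", "high-but-newer", "low"]
```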
3.4.2.2. Handling introductions from the introduction queue [HANDLE_QUEUE]
@@ -380,43 +455,103 @@ Status: Draft
3.4.3. PoW effort estimation [EFFORT_ESTIMATION]
- The service starts with a default suggested-effort value of 5000 (see
- [EQUIX_DIFFICULTY] section for more info).
+3.4.3.1. High-level description of the effort estimation process
+
+ The service starts with a default suggested-effort value of 0, which keeps
+ the PoW defenses dormant until we notice signs of overload.
+
+ The overall process of determining effort can be thought of as a set of
+ multiple coupled feedback loops. Clients perform their own effort
+ adjustments via [CLIENT_TIMEOUT] atop a base effort suggested by the service.
+ That suggestion incorporates the service's control adjustments atop a base
+ effort calculated using a sum of currently-queued client effort.
+
+ Each feedback loop has an opportunity to cover different time scales. Clients
+ can make adjustments at every single circuit creation request, whereas
+ services are limited by the extra load that frequent updates would place on
+ HSDir nodes.
+
+ In the combined client/service system these client-side increases are
+ expected to provide the most effective quick response to an emerging DoS
+ attack. After early clients increase the effort using [CLIENT_TIMEOUT],
+ later clients will benefit from the service detecting this increased queued
+ effort and offering a larger suggested_effort.
+
+ Effort increases and decreases both have an intrinsic cost. Increasing effort
+ will make the service more expensive to contact, and decreasing effort makes
+ new requests likely to become backlogged behind older requests. The steady
+ state condition is preferable to either of these side-effects, but ultimately
+ it's expected that the control loop always oscillates to some degree.
+
+3.4.3.2. Service-side effort estimation
- Then during its operation the service continuously keeps track of the
- received PoW cell efforts to inform its clients of the effort they should put
- in their introduction to get service. The service informs the clients by
- using the <suggested-effort> field in the descriptor.
+ Services keep an internal effort estimation which updates on a regular
+ periodic timer in response to measurements made on the queueing behavior
+ in the previous period. These internal effort changes can optionally trigger
+ client-visible suggested_effort changes when the difference is great enough
+ to warrant republishing to the HSDir.
- Everytime the service handles or trims an introduction request from the
- priority queue in [HANDLE_QUEUE], the service adds the request's effort to a
- sorted list.
+ This evaluation and update period is referred to as HS_UPDATE_PERIOD.
+ The service side effort estimation takes inspiration from TCP congestion
+ control's additive increase / multiplicative decrease approach, but unlike
+ a typical AIMD this algorithm is fixed-rate and doesn't update immediately
+ in response to events.
- Then every HS_UPDATE_PERIOD seconds (which is controlled through a consensus
- parameter and has a default value of 300 seconds) and while the DoS feature
- is enabled, the service updates its <suggested-effort> value as follows:
+ {TODO: HS_UPDATE_PERIOD is hardcoded to 300 (5 minutes) currently, but it
+ should be configurable in some way. Is it more appropriate to use the
+ service's torrc here or a consensus parameter?}
- 1. Set TOTAL_EFFORT to the sum of the effort of all valid requests that
- have been received since the last HS descriptor update (this includes
- all handled requests, trimmed requests and requests still in the queue)
+3.4.3.3. Per-period service state
- 2. Set SUGGESTED_EFFORT = TOTAL_EFFORT / (SVC_BOTTOM_CAPACITY * HS_UPDATE_PERIOD).
- The denominator above is the max number of requests that the service
- could have handled during that time.
+ During each update period, the service maintains some state:
- 3. Set <suggested-effort> to max(MIN_EFFORT, SUGGESTED_EFFORT).
+ 1. TOTAL_EFFORT, a sum of all effort values for rendezvous requests that
+ were successfully validated and enqueued.
- During the above procedure we use the following default values:
- - MIN_EFFORT = 1000, as the result of a simulation experiment [REF_TEVADOR_SIM]
- - SVC_BOTTOM_CAPACITY = 100, which is the number of introduction requests
- that can be handled by the service per second. This was computed in
- [POW_DIFFICULTY_TOR] as 180, but we reduced it to 100 to account for
- slower computers and networks.
+ 2. REND_HANDLED, a count of rendezvous requests that were actually
+ launched. Requests that made it to dequeueing but were too old to launch
+ by then are not included.
- The above algorithm is meant to balance the suggested effort based on the
- effort of all received requests. It attempts to dynamically adjust the
- suggested effort so that it increases when an attack is received, and tones
- down when the attack has stopped.
+ 3. HAD_QUEUE, a flag which is set if at any time in the update period we
+ saw the priority queue filled with more than a minimum amount of work,
+ greater than we would expect to process in approximately 1/4 second
+ using the configured dequeue rate.
+
+ 4. MAX_TRIMMED_EFFORT, the largest observed single request effort that we
+ discarded during the period. Requests are discarded either due to age
+ (timeout) or during culling events that discard the bottom half of the
+ entire queue when it's too full.
+
+3.4.3.4. Service AIMD conditions
+
+ At the end of each period, the service may decide to increase effort,
+ decrease effort, or make no changes, based on these accumulated state values:
+
+ 1. If MAX_TRIMMED_EFFORT > our previous internal suggested_effort,
+ always INCREASE. Requests that follow our latest advice are being
+ dropped.
+
+ 2. If the HAD_QUEUE flag was set and the queue still contains at least
+ one item with effort >= our previous internal suggested_effort,
+ INCREASE. Even if we haven't yet reached the point of dropping requests,
+      this signal indicates that our latest suggestion isn't high enough
+ and requests will build up in the queue.
+
+ 3. If neither condition (1) or (2) are taking place and the queue is below
+ a level we would expect to process in approximately 1/4 second, choose
+ to DECREASE.
+
+ 4. If none of these conditions match, the suggested effort is unchanged.
+
+ When we INCREASE, the internal suggested_effort is increased to either its
+ previous value + 1, or (TOTAL_EFFORT / REND_HANDLED), whichever is larger.
+
+ When we DECREASE, the internal suggested_effort is scaled by 2/3rds.
+
+ Over time, this will continue to decrease our effort suggestion any time the
+ service is fully processing its request queue. If the queue stays empty, the
+ effort suggestion decreases to zero and clients should no longer submit a
+ proof-of-work solution with their first connection attempt.
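The four conditions and the two update rules can be sketched as one function. The two boolean parameters stand in for the queue inspections described in conditions (2) and (3), and the guard against REND_HANDLED = 0 is an assumption not specified above:

```python
def update_effort(prev: int, total_effort: int, rend_handled: int,
                  max_trimmed_effort: int, had_queue: bool,
                  queued_at_or_above_prev: bool, backlog_small: bool) -> int:
    """Sketch of the per-period AIMD decision, conditions (1)-(4) above."""
    if max_trimmed_effort > prev:                # (1) advice-followers dropped
        pass                                     #     -> INCREASE
    elif had_queue and queued_at_or_above_prev:  # (2) queue building up
        pass                                     #     -> INCREASE
    elif backlog_small:                          # (3) queue nearly drained
        return prev * 2 // 3                     #     -> DECREASE by 1/3
    else:                                        # (4) no change
        return prev
    # INCREASE: prev + 1 or the mean enqueued effort, whichever is larger.
    # max(rend_handled, 1) guards division when nothing was launched.
    return max(prev + 1, total_effort // max(rend_handled, 1))
```

Repeated DECREASE steps shrink the suggestion geometrically, so a consistently empty queue drives it to zero, matching the dormant-by-default behavior described above.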
It's worth noting that the suggested-effort is not a hard limit to the
efforts that are accepted by the service, and it's only meant to serve as a
@@ -424,33 +559,17 @@ Status: Draft
to the service. The service still adds requests with lower effort than
suggested-effort to the priority queue in [ADD_QUEUE].
- Finally, the above algorithm will never reset back to zero suggested-effort,
- even if the attack is completely over. That's because in that case it would
- be impossible to determine the total computing power of connecting
- clients. Instead it will reset back to MIN_EFFORT, and the operator will have
- to manually shut down the anti-DoS mechanism.
-
- {XXX: SVC_BOTTOM_CAPACITY is static above and will not be accurate for all
- boxes. Ideally we should calculate SVC_BOTTOM_CAPACITY dynamically based on
- the resources of every onion service while the algorithm is running.}
-
-3.4.3.1. Updating descriptor with new suggested effort
+3.4.3.5. Updating descriptor with new suggested effort
- Every HS_UPDATE_PERIOD seconds the service should upload a new descriptor
- with a new suggested-effort value.
+ The service descriptors may be updated for multiple reasons including
+ introduction point rotation common to all v3 onion services, the scheduled
+ seed rotations described in [DESC_POW], and updates to the effort suggestion.
+ Even though the internal effort estimate updates on a regular timer, we avoid
+ propagating those changes into the descriptor and the HSDir hosts unless
+ there is a significant change.
- The service should avoid uploading descriptors too often to avoid overwhelming
- the HSDirs. The service SHOULD NOT upload descriptors more often than
- HS_UPDATE_PERIOD. The service SHOULD NOT upload a new descriptor if the
- suggested-effort value changes by less than 15%.
-
- {XXX: Is this too often? Perhaps we can set different limits for when the
- difficulty goes up and different for when it goes down. It's more important
- to update the descriptor when the difficulty goes up.}
-
- {XXX: What attacks are possible here? Can the attacker intentionally hit this
- rate-limit and then influence the suggested effort so that clients do not
- learn about the new effort?}
+  If the PoW params otherwise match but the suggested effort has changed by
+  less than 15 percent, services SHOULD NOT upload a new descriptor.
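A minimal sketch of the upload decision described above, with hypothetical names (the 15 percent threshold is from this section; everything else is illustrative):

```python
# Hypothetical sketch: skip a descriptor upload when only the suggested
# effort moved, and it moved by less than 15 percent.
def should_upload_descriptor(old_effort: int, new_effort: int,
                             other_pow_params_changed: bool) -> bool:
    if other_pow_params_changed:
        return True          # e.g. a scheduled seed rotation always uploads
    if old_effort == 0:
        return new_effort != 0
    relative_change = abs(new_effort - old_effort) / old_effort
    return relative_change >= 0.15
```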
4. Client behavior [CLIENT_BEHAVIOR]
@@ -463,8 +582,10 @@ Status: Draft
not allow the service to inform the client that the rendezvous is never gonna
occur.
- For this reason we need to define some client behaviors to work around these
- issues.
+ From the client's perspective there's no way to attribute this failure to
+ the service itself rather than the introduction point, so error accounting
+  is performed separately for each introduction point. Existing mechanisms
+  will discard an introduction point that has required too many retries.
4.1. Clients handling timeouts [CLIENT_TIMEOUT]
@@ -477,31 +598,35 @@ Status: Draft
If the rendezvous request times out, the client SHOULD fetch a new descriptor
for the service to make sure that it's using the right suggested-effort for
- the PoW and the right PoW seed. The client SHOULD NOT fetch service
- descriptors more often than every 'hs-pow-desc-fetch-rate-limit' seconds
- (which is controlled through a consensus parameter and has a default value of
- 600 seconds).
+ the PoW and the right PoW seed. If the fetched descriptor includes a new
+ suggested effort or seed, it should first retry the request with these
+ parameters.
+
+ {TODO: This is not actually implemented yet, but we should do it. How often
+ should clients at most try to fetch new descriptors? Determined by a
+ consensus parameter? This change will also allow clients to retry
+ effectively in cases where the service has just been reconfigured to
+ enable PoW defenses.}
+
+ Every time the client retries the connection, it will count these failures
+ per-introduction-point. These counts of previous retries are combined with
+ the service's suggested_effort when calculating the actual effort to spend
+ on any individual request to a service that advertises PoW support, even
+ when the currently advertised suggested_effort is zero.
- {XXX: Is this too rare? Too often?}
+ On each retry, the client modifies its solver effort:
- When the client fetches a new descriptor, it should try connecting to the
- service with the new suggested-effort and PoW seed. If that doesn't work, it
- should double the effort and retry. The client should keep on
- doubling-and-retrying until it manages to get service, or its able to fetch a
- new descriptor again.
+ 1. If the effort is below (CLIENT_POW_EFFORT_DOUBLE_UNTIL = 1000)
+ it will be doubled.
- {XXX: This means that the client will keep on spinning and
- doubling-and-retrying for a service under this situation. There will never be
- a "Client connection timed out" page for the user. Is this good? Is this bad?
- Should we stop doubling-and-retrying after some iterations? Or should we
- throw a custom error page to the user, and ask the user to stop spinning
- whenever they want?}
+ 2. Otherwise, multiply the effort by (CLIENT_POW_RETRY_MULTIPLIER = 1.5).
-4.3. Other descriptor issues
+ 3. Constrain the new effort to be at least
+ (CLIENT_MIN_RETRY_POW_EFFORT = 8) and no greater than
+       (CLIENT_MAX_POW_EFFORT = 10000).
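The three retry rules above can be sketched as follows (constants mirror the values given in the list; the function name is hypothetical):

```python
# Sketch of the client's per-retry solver effort update.
CLIENT_POW_EFFORT_DOUBLE_UNTIL = 1000
CLIENT_POW_RETRY_MULTIPLIER = 1.5
CLIENT_MIN_RETRY_POW_EFFORT = 8
CLIENT_MAX_POW_EFFORT = 10000

def next_retry_effort(effort: int) -> int:
    if effort < CLIENT_POW_EFFORT_DOUBLE_UNTIL:
        effort *= 2                                     # rule 1: double
    else:
        effort = int(effort * CLIENT_POW_RETRY_MULTIPLIER)  # rule 2: x1.5
    # rule 3: clamp to [CLIENT_MIN_RETRY_POW_EFFORT, CLIENT_MAX_POW_EFFORT]
    return max(CLIENT_MIN_RETRY_POW_EFFORT,
               min(effort, CLIENT_MAX_POW_EFFORT))
```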
- Another race condition here is if the service enables PoW, while a client has
- a cached descriptor. How will the client notice that PoW is needed? Does it
- need to fetch a new descriptor? Should there be another feedback mechanism?
+ {TODO: These hardcoded limits should be replaced by timed limits and/or
+ an unlimited solver with robust cancellation. This is issue tor#40787}
5. Attacker strategies [ATTACK_META]
@@ -520,7 +645,8 @@ Status: Draft
that this attack is not possible: we tune this PoW parameter in section
[POW_TUNING_VERIFICATION].
-5.1.2. Overwhelm rendezvous capacity (aka "Overwhelm bottom half") [ATTACK_BOTTOM_HALF]
+5.1.2. Overwhelm rendezvous capacity (aka "Overwhelm bottom half")
+ [ATTACK_BOTTOM_HALF]
Given the way the introduction queue works (see [HANDLE_QUEUE]), a very
effective strategy for the attacker is to totally overwhelm the queue
diff --git a/proposals/329-traffic-splitting.txt b/proposals/329-traffic-splitting.txt
index 44f2e4e..93655f0 100644
--- a/proposals/329-traffic-splitting.txt
+++ b/proposals/329-traffic-splitting.txt
@@ -2,7 +2,7 @@ Filename: 329-traffic-splitting.txt
Title: Overcoming Tor's Bottlenecks with Traffic Splitting
Author: David Goulet, Mike Perry
Created: 2020-11-25
-Status: Needs Revision
+Status: Needs-Revision
0. Status
@@ -38,26 +38,27 @@ Status: Needs Revision
Tor relay queues, and not with any other bottlenecks (such as
intermediate Internet routers), we can avoid this complexity merely by
specifying that any paths that are constructed SHOULD NOT share any
- relays (except for the exit). This assumption is valid, because non-relay bottlenecks are managed
- by TCP of client-to-relay and relay-to-relay OR connections, and not
- Tor's circuit-level congestion control. In this way, we can proceed to
- use the exact same congestion control as specified in [PROP324],
- for each path.
+ relays (except for the exit). This assumption is valid, because non-relay
+ bottlenecks are managed by TCP of client-to-relay and relay-to-relay OR
+ connections, and not Tor's circuit-level congestion control. In this way,
+ we can proceed to use the exact same congestion control as specified in
+ [PROP324], for each path.
For this reason, this proposal will focus on protocol specification, and
the traffic scheduling algorithms, rather than coupling. Note that the
scheduling algorithms are currently in flux, and will be subject to
change as we tune them in Shadow, on the live network, and for future
- UDP implementation (see [PROP339]).
+ UDP implementation (see [PROP339]). This proposal will be kept up to
+ date with the current implementation.
1.2. Divergence from the initial Conflux design
The initial [CONFLUX] paper doesn't provide any indications on how to
handle the size of out-of-order cell queue, which we consider a
potential dangerous memory DoS vector (see [MEMORY_DOS]). It also used
- RTT as the sole heuristic for selecting which circuit to send on, which
+ RTT as the sole heuristic for selecting which circuit to send on (which
may vary depending on the geographical locations of the participant
- relays, without considering their actual available circuit capacity
+ relays), without considering their actual available circuit capacity
(which will be available to us via Proposal 324). Additionally, since
the publication of [CONFLUX], more modern packet scheduling algorithms
have been developed, which aim to reduce out-of-order queue size.
@@ -70,8 +71,7 @@ Status: Needs Revision
1.3. Design Overview
- The following section describes the Conflux design. Each sub-section is
- a building block to the multipath design that Conflux proposes.
+ The following section describes the Conflux design.
The circuit construction is as follows:
@@ -110,10 +110,10 @@ Status: Needs Revision
different scheduling algorithms to determine this.
Initial RTT is measured during circuit linking, as described in
- [CONFLUX_HANDSHAKE]. RTT is continually measured using SENDME timing, as
- in Proposal 324. This means that during use, the primary circuit and
- secondary circuit may switch roles, depending on unrelated network
- congestion caused by other Tor clients.
+ [CONFLUX_HANDSHAKE]. After the initial link, RTT is continually measured
+ using SENDME timing, as in Proposal 324. This means that during use,
+ the primary circuit and secondary circuit may switch roles, depending on
+ unrelated network congestion caused by other Tor clients.
We also support linking onion service circuits together. In this case,
only two rendezvous circuits are linked. Each of these RP circuits will
@@ -124,11 +124,11 @@ Status: Needs Revision
we have consulted (see [ACKNOWLEDGMENTS]), believe Tor's congestion
control from Proposal 324 to be sufficient in this rare case.
- In the algorithms we recommend here, only two circuits will be linked together at a time.
- However, implementations
- SHOULD support more than two paths, as this has been shown to assist in
- traffic analysis resistance[WTF_SPLIT], and will also be useful for
- maintaining a desired target RTT, for UDP VoIP applications.
+ In the algorithms we recommend here, only two circuits will be linked
+ together at a time. However, implementations SHOULD support more than
+ two paths, as this has been shown to assist in traffic analysis
+ resistance[WTF_SPLIT], and will also be useful for maintaining a desired
+ target RTT, for UDP VoIP applications.
If the number of circuits exceeds the current number of guard relays,
guard relays MAY be re-used, but implementations SHOULD use the same
@@ -160,11 +160,12 @@ Status: Needs Revision
The "max-num-circ" value indicate the maximum number of rendezvous
circuits that are allowed to be linked together.
- We let the service specify the conflux algorithm to use. Some services may
- prefer latency, where as some may prefer throughput. However, clients will
- also have to be able to override this request, because the high-throughput
- algorithms will require more out-of-order queue memory, which may be
- infeasible on mobile.
+ We let the service specify the conflux algorithm to use, when sending data
+  to the service. Some services may prefer latency, whereas some may prefer
+ throughput. However, clients also have the ability to request their own UX
+ for data that the service sends, in the LINK handshake below, in part
+ because the high-throughput algorithms will require more out-of-order queue
+ memory, which may be infeasible on mobile.
The next section describes how the circuits are linked together.
@@ -263,19 +264,34 @@ Status: Needs Revision
These three relay commands are sent on *each* leg, to allow each endpoint to
measure the initial RTT of each leg.
- The client SHOULD abandon and close circuit if the LINKED message takes too long to arrive.
- This timeout MUST be no larger than the normal SOCKS/stream timeout in use
- for RELAY_BEGIN, but MAY be the Circuit Build Timeout value, instead.
- (The C-Tor implementation currently uses Circuit Build Timeout).
+ The client SHOULD abandon and close circuit if the LINKED message takes too
+ long to arrive. This timeout MUST be no larger than the normal SOCKS/stream
+ timeout in use for RELAY_BEGIN, but MAY be the Circuit Build Timeout value,
+ instead. (The C-Tor implementation currently uses Circuit Build Timeout).
See [SIDE_CHANNELS] for rules for when to reject unexpected handshake cells.
2.2. Linking Circuits from OP to Exit [LINKING_EXIT]
- To link exit circuits, two circuits to the same exit are built. When
- each circuit is opened, we ensure that congestion control has been
- negotiated. If congestion control negotiation has failed, the circuit
- MUST be closed. After this, the linking handshake begins.
+ To link exit circuits, two circuits to the same exit are built, with
+ additional restrictions such that they do not share Guard or Middle
+ relays. This restriction applies via specific relay identity keys,
+ and not IP addresses, families, or networks. (This is because the purpose
+ of it is to avoid sharing a bottleneck *inside* relay circuit queues;
+ bottlenecks caused by a shared network are handled by TCP's congestion
+ control on the OR conns).
+
+  Bridges are also subject to the same constraint as Guard relays;
+  the C-Tor codebase emits a warning if only one bridge is configured, unless
+ that bridge has transport "snowflake". Snowflake is exempt from this
+ Guard restriction because it is actually backed by many bridges. In the
+ bridge case, we only warn, and do not refuse to build conflux circuits,
+ because it is not catastrophic that Bridges are shared, it is just
+ sub-optimal for performance and congestion.
+
+ When each circuit is opened, we ensure that congestion control
+ has been negotiated. If congestion control negotiation has failed, the
+ circuit MUST be closed. After this, the linking handshake begins.
The RTT times between RELAY_CONFLUX_LINK and RELAY_CONFLUX_LINKED are
measured by the client, to determine primary vs secondary circuit use,
@@ -285,9 +301,11 @@ Status: Needs Revision
Because of the race between initial data and the RELAY_CONFLUX_LINKED_ACK
cell, conditions can arise where an Exit needs to send data before the
- slowest circuit delivers this ACK. In these cases, it should prefer the
- circuit that has delivered the ACK (which will arrive immediately prior
- to any data).
+ slowest circuit delivers this ACK. In these cases, it should prefer sending
+ data on the circuit that has delivered the ACK (which will arrive immediately
+ prior to any data from the client). This circuit will be the lower RTT
+ circuit anyway, but the code needs to handle the fact that in this case,
+ there won't yet be an RTT for the second circuit.
2.3. Linking circuits to an onion service [LINKING_SERVICE]
@@ -299,7 +317,20 @@ Status: Needs Revision
meet the client at two separate rendezvous points. These introduce
requests MUST be sent to the same intropoint (due to potential use of
onionbalance), and SHOULD be sent back-to-back on the same intro
- circuit. They MAY be combined with Proposal 340.
+  circuit. They MAY be combined with Proposal 340. (Note that if we do not
+  use Prop340, we will have to raise the limit on the number of intros per
+  client circuit to 2 at intropoints).
+
+ When rendezvous circuits are built, they should use the same Guard,
+ Bridge, and Middle restrictions as specified in 2.2, for Exits. These
+ restrictions SHOULD also take into consideration all Middles in the path,
+ including the rendezvous point. All relay identities should be unique
+ (again, except for when the Snowflake transport is in use). The one
+  special case here is if the rendezvous points chosen by a client
+ are the same as the service's guards. In this case, the service SHOULD
+ NOT use different guards, but instead stick with the guards it has.
+ The reason for this is that we do not want to create the ability
+ for a client to force a service to use different guards.
The first rendezvous circuit to get joined SHOULD use Proposal 340 to
append the RELAY_BEGIN command, and the service MUST answer on this
@@ -324,9 +355,9 @@ Status: Needs Revision
In C-Tor, conflux is only used via circuit prebuilding. Pre-built conflux
sets are preferred over other pre-built circuits, but if the pre-built pool
ends up empty, normal pre-built circuits are used. If those run out, regular
- non-conflux circuits are built. Conflux sets are never built on-demand, but
- this is strictly an implementation decision, to simplify dealing with the
- C-Tor codebase.
+ non-conflux circuits are built. In other words, in C-Tor, conflux sets are
+ never built on-demand, but this is strictly an implementation decision, to
+  simplify dealing with the C-Tor codebase.
The consensus parameter 'cfx_max_prebuilt_set' specifies the number of
sets to pre-build.
@@ -340,13 +371,13 @@ Status: Needs Revision
When a set is launched, legs begin the handshake in the unlinked state.
As handshakes complete, finalization is attempted, to create a linked set.
On the client, this finalization happens upon receipt of the LINKED cell.
- On the exit/service, this finalization happens upon sending the LINKED_ACK.
+ On the exit/service, this finalization happens upon *sending* the LINKED_ACK.
The initiator of this handshake considers the set fully linked once the
RELAY_CONFLUX_LINKED_ACK is sent (roughly upon receipt of the LINKED cell).
Because of the potential race between LINKED_ACK, and initial data sent by
the client, the receiver of the handshake must consider a leg linked at
- the time of sending a LINKED cell.
+ the time of *sending* a LINKED_ACK cell.
This means that exit legs may not have an RTT measurement, if data on the
faster leg beats the LINKED_ACK on the slower leg. The implementation MUST
@@ -379,15 +410,15 @@ Status: Needs Revision
legs' maximum last_seq_recv, and a lower last_seq_recv than all
current legs last_seq_sent.
- This check is performed on finalization, not the receipt of the cell. This
- gives the data additional time to arrive.
+ This check is performed on finalization, not the receipt of first
+ handshake cell. This gives the data additional time to arrive.
2.5. Congestion Control Application [CONGESTION_CONTROL]
The SENDMEs for congestion control are performed per-leg. As soon as
data arrives, regardless of its ordering, it is counted towards SENDME
- delivery. In this way, 'cwnd - package_window' of each leg always
- reflects the available data to send on each leg. This is important for
+ delivery. In this way, 'cwnd - inflight' of each leg always reflects
+ the available data to send on each leg. This is important for
[SCHEDULING].
The Congestion control Stream XON/XOFF can be sent on either leg, and
@@ -397,7 +428,19 @@ Status: Needs Revision
of their circuit was blocked. Because conflux can send on the other
circuit, which uses a different OR conn, this form of stream blocking
has been decoupled from the OR conn status, and only happens when
- congestion control has decided that all circuits are blocked.
+ congestion control has decided that all circuits are blocked (congestion
+ control becomes blocked when either 'cwnd - inflight <= 0', *or* when
+ the local OR conn is blocked, so if all local OR conns of a set are
+ blocked, then the stream will block that way).
+
+ Note also that because congestion control only covers RELAY_COMMAND_DATA
+ cells, for all algorithms, a special case must be made such that if no
+ circuit is available to send on due to congestion control blocking,
+  commands other than RELAY_COMMAND_DATA MUST be sent on the current
+ circuit, even if the cell scheduler believes that no circuit is available.
+ Depending on the code structure of Arti, this special case may or may
+ not be necessary. It arises in C-Tor because nothing can block the
+ sending of arbitrary non-DATA relay command cells.
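The special case above amounts to a fallback rule, sketched here with hypothetical names (only the DATA/non-DATA distinction is from the text):

```python
# Sketch: congestion control only covers RELAY_COMMAND_DATA, so other
# relay commands fall back to the current circuit even when the cell
# scheduler reports that no circuit is available.
def pick_leg_for_cell(relay_command, schedulable_leg, current_leg):
    if schedulable_leg is not None:
        return schedulable_leg
    if relay_command != "RELAY_COMMAND_DATA":
        return current_leg   # non-DATA cells MUST still go out
    return None              # DATA must wait for congestion control
```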
2.6. Sequencing [SEQUENCING]
@@ -519,11 +562,12 @@ Status: Needs Revision
RELAY_COMMAND_XON
Currently, this set is the same as the set of cells that have stream ID,
- but the property that enforces this is that these cells must be ordered
- with respect to all data on the circuit. It is not impossible that future
- cells could be invented that don't have stream IDs, but yet must still
- arrive in order with respect to circuit data cells. Prop#253 is one
- possible example of such a thing (though we won't be implementing that).
+ but the property that leads to this requirement is not stream usage by
+ itself, it is that these cells must be ordered with respect to all data
+ on the circuit. It is not impossible that future relay commands could be
+ invented that don't have stream IDs, but yet must still arrive in order
+ with respect to circuit data cells. Prop#253 is one possible example of
+ such a thing (though we won't be implementing that proposal).
3. Traffic Scheduling [SCHEDULING]
@@ -532,60 +576,63 @@ Status: Needs Revision
original conflux paper used only RTT. However, with Proposal 324, we
will have accurate information on the instantaneous available bandwidth
of each circuit leg, as 'cwnd - inflight' (see Section 3 of
- Proposal 324).
-
- Some additional RTT optimizations are also useful, to improve
- responsiveness and minimize out-of-order queue sizes.
+ Proposal 324). We also have the TCP block state of the local OR
+ connection.
We specify two traffic schedulers from the multipath literature and
- adapt them to Tor: [LOWRTT_TOR] and [BLEST_TOR]. [LOWRTT_TOR] also has
- three variants, with different trade offs.
+  adapt them to Tor: [MINRTT_TOR] and [LOWRTT_TOR]. Additionally,
+ we create low-memory variants of these that aim to minimize the
+ out-of-order queue size at the receiving endpoint.
- However, see the [TRAFFIC_ANALYSIS] sections of this proposal for
+ Additionally, see the [TRAFFIC_ANALYSIS] sections of this proposal for
important details on how this selection can be changed, to reduce
website traffic fingerprinting.
- XXX: These sections are not accurate, and are subject to change
- during the alpha process, via Shadow simulation. We need to specify
- candidate algorithms for the UX properties. The latency algorithms
- will be related to LOWRTT_TOR, and the throughput algorithms related
- to BLEST_TOR, but significant changes will arise during evaluation,
- and possibly also live deployment iteration.
+3.1. MinRTT Scheduling [MINRTT_TOR]
+
+  This scheduling algorithm is used for the MIN_LATENCY user experience.
+
+ It works by always and only sending on the circuit with the current minimum
+ RTT. With this algorithm, conflux should effectively stay on the circuit with
+  the lowest initial RTT, unless that circuit's RTT rises above the RTT of the
+ other circuit (due to relay load or congestion). When the circuit's congestion
+ window is full (ie: cwnd - inflight <= 0), or if the local OR conn blocks,
+ the conflux set stops transmitting and stops reading on edge connections,
+ rather than switch.
-3.1. LowRTT Scheduling [LOWRTT_TOR]
+ This should result in low out-of-order queues in most situations, unless
+ the initial RTTs of the two circuits are very close (basically within the
+ Vegas RTT bounds of queue variance, 'alpha' and 'beta').
- This scheduling algorithm is based on the original [CONFLUX] paper, with
- ideas from [MPTCP]'s minRTT/LowRTT scheduler.
+3.2. LowRTT Scheduling [LOWRTT_TOR]
- In this algorithm, endpoints send cells on the circuit with lower RTT
- (primary circuit). This continues while the congestion window on the
- circuit has available room: ie whenever cwnd - package_window > 0.
+ This scheduling algorithm is based on [MPTCP]'s LowRTT scheduler. This
+ algorithm is used for the UX choice of HIGH_THROUGHPUT.
- Whenever the primary circuit's congestion window becomes full, the
- secondary circuit is used. We stop reading on the send window source
- (edge connection) when both congestion windows become full.
+ In this algorithm, endpoints send cells on the circuit with lowest RTT that
+ has an unblocked local OR connection, and room in its congestion window (ie:
+ cwnd - inflight > 0). We stop reading on edge connections only when both
+ congestion windows become full, or when both local OR connections are blocked.
In this way, unlike original conflux, we switch to the secondary circuit
- without causing congestion on the primary circuit. This improves both
- load times, and overall throughput.
+ without causing congestion either locally, or on either circuit. This
+ improves both load times, and overall throughput. Given a large enough
+ transmission, both circuits are used to their full capacity,
+ simultaneously.
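The difference between the two schedulers can be sketched as follows (the `Leg` type and function names are assumptions, not the C-Tor data structures; `cwnd - inflight > 0` means the leg has congestion window room):

```python
from dataclasses import dataclass

@dataclass
class Leg:
    rtt_usec: int
    cwnd: int
    inflight: int
    orconn_blocked: bool

    def can_send(self) -> bool:
        return not self.orconn_blocked and self.cwnd - self.inflight > 0

def minrtt_choose(legs):
    """MIN_LATENCY: only ever the lowest-RTT leg; block rather than switch."""
    best = min(legs, key=lambda l: l.rtt_usec)
    return best if best.can_send() else None

def lowrtt_choose(legs):
    """HIGH_THROUGHPUT: lowest-RTT leg that is actually sendable."""
    usable = [l for l in legs if l.can_send()]
    return min(usable, key=lambda l: l.rtt_usec) if usable else None
```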
- This behavior matches minRTT from [MPTCP], sometimes called LowRTT.
+3.3. MinRTT Low-Memory Scheduling [MINRTT_LOWMEM_TOR]
- It may be better to stop reading on the edge connection when the primary
- congestion window becomes full, rather than switch to the secondary
- circuit as soon as the primary congestion window becomes full. (Ie: only
- switch if the RTTs themselves change which circuit is primary). This is
- what was done in the original Conflux paper. This behavior effectively
- causes us to optimize for responsiveness and congestion avoidance,
- rather than throughput. For evaluation, we will control this switching
- behavior with a consensus parameter (see [CONSENSUS_PARAMETERS]).
+  The low-memory version of the MinRTT scheduler ensures that we do not
+ perform a switch more often than once per congestion window worth of data.
- Because of potential side channel risk (see [SIDE_CHANNELS]), a third
- variant of this algorithm, where the primary circuit is chosen during
- the [LINKING_CIRCUITS] handshake and never changed, is also possible
- to control via consensus parameter.
+ XXX: Other rate limiting, such as not switching unless the RTT changes by
+ more than X%, may be useful here.
-3.2. BLEST Scheduling [BLEST_TOR]
+3.4. BLEST Scheduling [BLEST_TOR]
+
+ XXX: Something like this might be useful for minimizing OOQ for the UX
+ choice of LOW_MEM_THROUGHPUT, but we might just be able to reduce switching
+ frequency instead.
XXX: We want an algorithm that only uses cwnd instead. This algorithm
has issues if the primary cwnd grows while the secondary does not.
@@ -701,7 +748,7 @@ Status: Needs Revision
important to be mindful of ways that an adversary can inject new cell
commands, as well as ways that the adversary can spawn new circuits
arbitrarily.
-
+
It is also important, though slightly less so, to be mindful of the uniqueness
of new handshakes, as handshakes can be used to classify usage (such as via
Onion Service Circuit Fingerprinting). Handshake side channels are only
diff --git a/proposals/343-rend-caa.txt b/proposals/343-rend-caa.txt
new file mode 100644
index 0000000..f5d449f
--- /dev/null
+++ b/proposals/343-rend-caa.txt
@@ -0,0 +1,107 @@
+Filename: 343-rend-caa.txt
+Title: CAA Extensions for the Tor Rendezvous Specification
+Author: Q Misell <q@as207960.net>
+Created: 2023-04-25
+Status: Open
+
+Overview:
+ The document defines extensions to the Tor Rendezvous Specification Hidden
+ Service descriptor format to allow the attachment of DNS style CAA records to
+ Tor hidden services to allow the same security benefits as CAA provides in the
+ DNS.
+
+Motivation:
+ As part of the work on draft-misell-acme-onion [I-D.misell-acme-onion] at the
+ IETF it was felt necessary to define a method to incorporate CAA records
+ [RFC8659] into Tor hidden services.
+
+  CAA records in the DNS provide a mechanism to indicate which Certificate
+ Authorities are permitted to issue certificates for a given domain name, and
+ restrict which validation methods are permitted for certificate validation.
+
+  As Tor hidden service domains are not in the DNS, another way to provide the
+ same security benefits as CAA does in the DNS needed to be devised.
+
+ More information about this project in general can be found at
+ https://e.as207960.net/w4bdyj/Gm2AylEF
+
+Specification:
+ To enable maximal code re-use in CA codebases the same CAA record format is
+ used in Tor hidden services as in the DNS. To this end a new field is added to
+ the second layer hidden service descriptor [tor-rend-spec-v3] § 2.5.2.2.
+ with the following format:
+
+ "caa" SP flags SP tag SP value NL
+ [Any number of times]
+
+  The contents of "flags", "tag", and "value" are as per [RFC8659] § 4.1.1.
+ Multiple CAA records may be present, as is the case in the DNS.
+
+ A hidden service's second layer descriptor using CAA may look
+ something like the following:
+
+ create2-formats 2
+ single-onion-service
+ caa 0 issue "example.com"
+ caa 0 iodef "mailto:security@example.com"
+ caa 128 validationmethods "onion-csr-01"
+ introduction-point AwAGsAk5nSMpAhRqhMHbTFCTSlfhP8f5PqUhe6DatgMgk7kSL3KHCZ...
+
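A hypothetical helper (not part of the proposal) illustrating how the `"caa" SP flags SP tag SP value` grammar above might be pulled out of a decoded second-layer descriptor body:

```python
import re

# One tuple (flags, tag, value) per caa line; value is assumed to be a
# double-quoted string as in the example descriptor above.
CAA_LINE = re.compile(r'^caa (\d+) ([A-Za-z0-9]+) "([^"]*)"$')

def parse_caa(descriptor_body: str):
    return [(int(m.group(1)), m.group(2), m.group(3))
            for line in descriptor_body.splitlines()
            if (m := CAA_LINE.match(line.strip()))]
```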
+ As the CAA records are in the second layer descriptor and in the case of a
+ hidden service requiring client authentication it is impossible to read them
+ without the hidden service trusting a CA's public key, a method is required to
+ signal that there are CAA records present (but not reveal their contents,
+ which may disclose unwanted information about the hidden service operator to
+ third parties). This is to allow a CA to know that it must attempt to check
+ CAA records before issuance, and fail if it is unable to do so.
+
+ To this end a new field is added to the first layer hidden service descriptor
+ [tor-rend-spec-v3] § 2.5.1.2. with the following format:
+
+ "caa-critical" NL
+ [At most once]
+
+Security Considerations:
+ The second layer descriptor is signed and MACed in a way that only a party
+ with access to the secret key of the hidden service could manipulate what is
+ published there. Therefore, Tor CAA records have at least the same security as
+ those in the DNS secured by DNSSEC.
+
+ The "caa-critical" flag is visible to anyone with knowledge of the hidden
+ service's public key, however it reveals no information that could be used to
+ de-anonymize the hidden service operator.
+
+ The CAA flags in the second layer descriptor may reveal information about the
+ hidden service operator if they choose to publish an "iodef", "contactemail",
+  or "contactphone" tag. These, however, are not required for the primary
+  goal of CAA: restricting which CAs may issue certificates for a given
+  domain name.
+
+ No more information is revealed by the "issue" nor "issuewild" tags than would
+ be revealed by someone making a connection to the hidden service and noting
+ which certificate is presented.
+
+Compatibility:
+ The hidden service spec [tor-rend-spec-v3] already requires that clients
+ ignore unknown lines when decoding hidden service descriptors, so this change
+  should not cause any compatibility issues. Additionally, in testing no
+  compatibility issues were found with existing Tor implementations.
+
+ A hidden service with CAA records published in its descriptor is available at
+ znkiu4wogurrktkqqid2efdg4nvztm7d2jydqenrzeclfgv3byevnbid.onion, to allow
+ further compatibility testing.
+
+References:
+ [I-D.misell-acme-onion]
+ Misell, Q., "Automated Certificate Management Environment (ACME)
+ Extensions for ".onion" Domain Names", Internet-Draft
+ draft-misell-acme-onion-02, April 2023,
+ <https://datatracker.ietf.org/doc/html/draft-misell-acme-onion-02>.
+
+ [RFC8659] Hallam-Baker, P., Stradling, R., and J. Hoffman-Andrews,
+ "DNS Certification Authority Authorization (CAA) Resource
+ Record", RFC 8659, DOI 10.17487/RFC8659, November 2019,
+ <https://www.rfc-editor.org/info/rfc8659>.
+
+ [tor-rend-spec-v3]
+ The Tor Project, "Tor Rendezvous Specification - Version 3",
+ <https://spec.torproject.org/rend-spec-v3>.
diff --git a/proposals/BY_INDEX.md b/proposals/BY_INDEX.md
index d0b1214..f48ef31 100644
--- a/proposals/BY_INDEX.md
+++ b/proposals/BY_INDEX.md
@@ -236,17 +236,17 @@ Below are a list of proposals sorted by their proposal number. See
* [`316-flashflow.md`](/proposals/316-flashflow.md): FlashFlow: A Secure Speed Test for Tor (Parent Proposal) [DRAFT]
* [`317-secure-dns-name-resolution.txt`](/proposals/317-secure-dns-name-resolution.txt): Improve security aspects of DNS name resolution [NEEDS-REVISION]
* [`318-limit-protovers.md`](/proposals/318-limit-protovers.md): Limit protover values to 0-63 [CLOSED]
-* [`319-wide-everything.md`](/proposals/319-wide-everything.md): RELAY_FRAGMENT cells [OPEN]
+* [`319-wide-everything.md`](/proposals/319-wide-everything.md): RELAY_FRAGMENT cells [OBSOLETE]
* [`320-tap-out-again.md`](/proposals/320-tap-out-again.md): Removing TAP usage from v2 onion services [REJECTED]
* [`321-happy-families.md`](/proposals/321-happy-families.md): Better performance and usability for the MyFamily option (v2) [ACCEPTED]
* [`322-dirport-linkspec.md`](/proposals/322-dirport-linkspec.md): Extending link specifiers to include the directory port [OPEN]
* [`323-walking-onions-full.md`](/proposals/323-walking-onions-full.md): Specification for Walking Onions [OPEN]
* [`324-rtt-congestion-control.txt`](/proposals/324-rtt-congestion-control.txt): RTT-based Congestion Control for Tor [OPEN]
-* [`325-packed-relay-cells.md`](/proposals/325-packed-relay-cells.md): Packed relay cells: saving space on small commands [OPEN]
+* [`325-packed-relay-cells.md`](/proposals/325-packed-relay-cells.md): Packed relay cells: saving space on small commands [OBSOLETE]
* [`326-tor-relay-well-known-uri-rfc8615.md`](/proposals/326-tor-relay-well-known-uri-rfc8615.md): The "tor-relay" Well-Known Resource Identifier [OPEN]
* [`327-pow-over-intro.txt`](/proposals/327-pow-over-intro.txt): A First Take at PoW Over Introduction Circuits [DRAFT]
* [`328-relay-overload-report.md`](/proposals/328-relay-overload-report.md): Make Relays Report When They Are Overloaded [CLOSED]
-* [`329-traffic-splitting.txt`](/proposals/329-traffic-splitting.txt): Overcoming Tor's Bottlenecks with Traffic Splitting [DRAFT]
+* [`329-traffic-splitting.txt`](/proposals/329-traffic-splitting.txt): Overcoming Tor's Bottlenecks with Traffic Splitting [NEEDS-REVISION]
* [`330-authority-contact.md`](/proposals/330-authority-contact.md): Modernizing authority contact entries [OPEN]
* [`331-res-tokens-for-anti-dos.md`](/proposals/331-res-tokens-for-anti-dos.md): Res tokens: Anonymous Credentials for Onion Service DoS Resilience [DRAFT]
* [`332-ntor-v3-with-extra-data.md`](/proposals/332-ntor-v3-with-extra-data.md): Ntor protocol with extra data, version 3 [FINISHED]
@@ -260,4 +260,5 @@ Below are a list of proposals sorted by their proposal number. See
* [`340-packed-and-fragmented.md`](/proposals/340-packed-and-fragmented.md): Packed and fragmented relay messages [OPEN]
* [`341-better-oos.md`](/proposals/341-better-oos.md): A better algorithm for out-of-sockets eviction [OPEN]
* [`342-decouple-hs-interval.md`](/proposals/342-decouple-hs-interval.md): Decoupling hs_interval and SRV lifetime [DRAFT]
+* [`343-rend-caa.txt`](/proposals/343-rend-caa.txt): CAA Extensions for the Tor Rendezvous Specification [OPEN]
diff --git a/proposals/README.md b/proposals/README.md
index 0461d6a..9ff4f18 100644
--- a/proposals/README.md
+++ b/proposals/README.md
@@ -32,15 +32,14 @@ for discussion.
* [`306-ipv6-happy-eyeballs.txt`](/proposals/306-ipv6-happy-eyeballs.txt): A Tor Implementation of IPv6 Happy Eyeballs
* [`308-counter-galois-onion.txt`](/proposals/308-counter-galois-onion.txt): Counter Galois Onion: A New Proposal for Forward-Secure Relay Cryptography
* [`309-optimistic-socks-in-tor.txt`](/proposals/309-optimistic-socks-in-tor.txt): Optimistic SOCKS Data
-* [`319-wide-everything.md`](/proposals/319-wide-everything.md): RELAY_FRAGMENT cells
* [`322-dirport-linkspec.md`](/proposals/322-dirport-linkspec.md): Extending link specifiers to include the directory port
* [`323-walking-onions-full.md`](/proposals/323-walking-onions-full.md): Specification for Walking Onions
* [`324-rtt-congestion-control.txt`](/proposals/324-rtt-congestion-control.txt): RTT-based Congestion Control for Tor
-* [`325-packed-relay-cells.md`](/proposals/325-packed-relay-cells.md): Packed relay cells: saving space on small commands
* [`326-tor-relay-well-known-uri-rfc8615.md`](/proposals/326-tor-relay-well-known-uri-rfc8615.md): The "tor-relay" Well-Known Resource Identifier
* [`330-authority-contact.md`](/proposals/330-authority-contact.md): Modernizing authority contact entries
* [`340-packed-and-fragmented.md`](/proposals/340-packed-and-fragmented.md): Packed and fragmented relay messages
* [`341-better-oos.md`](/proposals/341-better-oos.md): A better algorithm for out-of-sockets eviction
+* [`343-rend-caa.txt`](/proposals/343-rend-caa.txt): CAA Extensions for the Tor Rendezvous Specification
## ACCEPTED proposals: slated for implementation
@@ -107,7 +106,6 @@ discussion.
* [`294-tls-1.3.txt`](/proposals/294-tls-1.3.txt): TLS 1.3 Migration
* [`316-flashflow.md`](/proposals/316-flashflow.md): FlashFlow: A Secure Speed Test for Tor (Parent Proposal)
* [`327-pow-over-intro.txt`](/proposals/327-pow-over-intro.txt): A First Take at PoW Over Introduction Circuits
-* [`329-traffic-splitting.txt`](/proposals/329-traffic-splitting.txt): Overcoming Tor's Bottlenecks with Traffic Splitting
* [`331-res-tokens-for-anti-dos.md`](/proposals/331-res-tokens-for-anti-dos.md): Res tokens: Anonymous Credentials for Onion Service DoS Resilience
* [`342-decouple-hs-interval.md`](/proposals/342-decouple-hs-interval.md): Decoupling hs_interval and SRV lifetime
@@ -125,6 +123,7 @@ certain changes.
* [`279-naming-layer-api.txt`](/proposals/279-naming-layer-api.txt): A Name System API for Tor Onion Services
* [`291-two-guard-nodes.txt`](/proposals/291-two-guard-nodes.txt): The move to two guard nodes
* [`317-secure-dns-name-resolution.txt`](/proposals/317-secure-dns-name-resolution.txt): Improve security aspects of DNS name resolution
+* [`329-traffic-splitting.txt`](/proposals/329-traffic-splitting.txt): Overcoming Tor's Bottlenecks with Traffic Splitting
## NEEDS-RESEARCH proposals: blocking on research
@@ -357,7 +356,9 @@ longer relevant (the proposal is OBSOLETE).
* [`270-newhope-hybrid-handshake.txt`](/proposals/270-newhope-hybrid-handshake.txt): RebelAlliance: A Post-Quantum Secure Hybrid Handshake Based on NewHope [OBSOLETE]
* [`276-lower-bw-granularity.txt`](/proposals/276-lower-bw-granularity.txt): Report bandwidth with lower granularity in consensus documents [DEAD]
* [`286-hibernation-api.txt`](/proposals/286-hibernation-api.txt): Controller APIs for hibernation access on mobile [REJECTED]
+* [`319-wide-everything.md`](/proposals/319-wide-everything.md): RELAY_FRAGMENT cells [OBSOLETE]
* [`320-tap-out-again.md`](/proposals/320-tap-out-again.md): Removing TAP usage from v2 onion services [REJECTED]
+* [`325-packed-relay-cells.md`](/proposals/325-packed-relay-cells.md): Packed relay cells: saving space on small commands [OBSOLETE]
diff --git a/rend-spec-v3.txt b/rend-spec-v3.txt
index 4a12343..53880db 100644
--- a/rend-spec-v3.txt
+++ b/rend-spec-v3.txt
@@ -2723,3 +2723,95 @@ Appendix F. Two methods for managing revision counters.
Similarly, implementations SHOULD NOT let the revision counter
increase forever without resetting it -- doing so links the service
across changes in the blinded public key.
+
+Appendix G. Test vectors
+
+ G.1. Test vectors for hs-ntor / NTOR-WITH-EXTRA-DATA
+
+ Here is a set of test values for the hs-ntor handshake, called
+ [NTOR-WITH-EXTRA-DATA] in this document. They were generated by
+ instrumenting Tor's code to dump the values for an INTRODUCE/RENDEZVOUS
+ handshake, and then by running that code on a Chutney network.
+
+ We assume an onion service with:
+
+ KP_hs_ipd_sid = 34E171E4358E501BFF21ED907E96AC6B
+ FEF697C779D040BBAF49ACC30FC5D21F
+ KP_hss_ntor = 8E5127A40E83AABF6493E41F142B6EE3
+ 604B85A3961CD7E38D247239AFF71979
+ KS_hss_ntor = A0ED5DBF94EEB2EDB3B514E4CF6ABFF6
+ 022051CC5F103391F1970A3FCD15296A
+ N_hs_subcred = 0085D26A9DEBA252263BF0231AEAC59B
+ 17CA11BAD8A218238AD6487CBAD68B57
+
+ The client wants to make an INTRODUCE request. It generates
+ the following header (everything before the ENCRYPTED portion)
+ of its INTRODUCE1 cell:
+
+ H = 000000000000000000000000000000000000000002002034E171E4358E501BFF
+ 21ED907E96AC6BFEF697C779D040BBAF49ACC30FC5D21F00
+
+ It generates the following plaintext body to encrypt. (This
+ is the "decrypted plaintext body" from [PROCESS_INTRO2].)
+
+ P = 6BD364C12638DD5C3BE23D76ACA05B04E6CE932C0101000100200DE6130E4FCA
+ C4EDDA24E21220CC3EADAE403EF6B7D11C8273AC71908DE565450300067F0000
+ 0113890214F823C4F8CC085C792E0AEE0283FE00AD7520B37D0320728D5DF39B
+ 7B7077A0118A900FF4456C382F0041300ACF9C58E51C392795EF870000000000
+ 0000000000000000000000000000000000000000000000000000000000000000
+ 000000000000000000000000000000000000000000000000000000000000
+
+ The client now begins the hs-ntor handshake. It generates
+ a curve25519 keypair:
+
+ x = 60B4D6BF5234DCF87A4E9D7487BDF3F4
+ A69B6729835E825CA29089CFDDA1E341
+ X = BF04348B46D09AED726F1D66C618FDEA
+ 1DE58E8CB8B89738D7356A0C59111D5D
+
+ Then it calculates:
+
+ ENC_KEY = 9B8917BA3D05F3130DACCE5300C3DC27
+ F6D012912F1C733036F822D0ED238706
+ MAC_KEY = FC4058DA59D4DF61E7B40985D122F502
+ FD59336BC21C30CAF5E7F0D4A2C38FD5
+
+ With these, it encrypts the plaintext body P with ENC_KEY, getting
+ an encrypted value C. It computes MAC(MAC_KEY, H | X | C),
+ getting a MAC value M. It then assembles the final INTRODUCE1
+ body as H | X | C | M:
+
+ 000000000000000000000000000000000000000002002034E171E4358E501BFF
+ 21ED907E96AC6BFEF697C779D040BBAF49ACC30FC5D21F00BF04348B46D09AED
+ 726F1D66C618FDEA1DE58E8CB8B89738D7356A0C59111D5DADBECCCB38E37830
+ 4DCC179D3D9E437B452AF5702CED2CCFEC085BC02C4C175FA446525C1B9D5530
+ 563C362FDFFB802DAB8CD9EBC7A5EE17DA62E37DEEB0EB187FBB48C63298B0E8
+ 3F391B7566F42ADC97C46BA7588278273A44CE96BC68FFDAE31EF5F0913B9A9C
+ 7E0F173DBC0BDDCD4ACB4C4600980A7DDD9EAEC6E7F3FA3FC37CD95E5B8BFB3E
+ 35717012B78B4930569F895CB349A07538E42309C993223AEA77EF8AEA64F25D
+ DEE97DA623F1AEC0A47F150002150455845C385E5606E41A9A199E7111D54EF2
+ D1A51B7554D8B3692D85AC587FB9E69DF990EFB776D8
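As an informal cross-check of the vector above (not part of the protocol), the assembled body can be sliced back into its components: H is the 56-byte header shown earlier, X is the client's 32-byte public key, the trailing M is the 32-byte MAC, and whatever remains is the ciphertext C, which has the same length as the plaintext P. A short Python sketch:

```python
# Sanity-check sketch: slice the assembled INTRODUCE1 test-vector body
# back into H | X | C | M, using the lengths implied by the vectors
# above (H: 56 bytes, X: 32 bytes, M: 32 bytes).
BODY = bytes.fromhex(
    "000000000000000000000000000000000000000002002034E171E4358E501BFF"
    "21ED907E96AC6BFEF697C779D040BBAF49ACC30FC5D21F00BF04348B46D09AED"
    "726F1D66C618FDEA1DE58E8CB8B89738D7356A0C59111D5DADBECCCB38E37830"
    "4DCC179D3D9E437B452AF5702CED2CCFEC085BC02C4C175FA446525C1B9D5530"
    "563C362FDFFB802DAB8CD9EBC7A5EE17DA62E37DEEB0EB187FBB48C63298B0E8"
    "3F391B7566F42ADC97C46BA7588278273A44CE96BC68FFDAE31EF5F0913B9A9C"
    "7E0F173DBC0BDDCD4ACB4C4600980A7DDD9EAEC6E7F3FA3FC37CD95E5B8BFB3E"
    "35717012B78B4930569F895CB349A07538E42309C993223AEA77EF8AEA64F25D"
    "DEE97DA623F1AEC0A47F150002150455845C385E5606E41A9A199E7111D54EF2"
    "D1A51B7554D8B3692D85AC587FB9E69DF990EFB776D8"
)
H, X, C, M = BODY[:56], BODY[56:88], BODY[88:-32], BODY[-32:]

# X is the client's curve25519 public key from the handshake above.
assert X.hex().upper() == ("BF04348B46D09AED726F1D66C618FDEA"
                           "1DE58E8CB8B89738D7356A0C59111D5D")
# The ciphertext C is the same length as the plaintext body P.
assert len(C) == 190 and len(M) == 32
```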
+
+ Later the service receives that body in an INTRODUCE2 cell. It
+ processes it according to the hs-ntor handshake, and recovers
+ the client's plaintext P. To continue the hs-ntor handshake,
+ the service chooses a curve25519 keypair:
+
+ y = 68CB5188CA0CD7924250404FAB54EE13
+ 92D3D2B9C049A2E446513875952F8F55
+ Y = 8FBE0DB4D4A9C7FF46701E3E0EE7FD05
+ CD28BE4F302460ADDEEC9E93354EE700
+
+ From this and the client's input, it computes:
+
+ AUTH_INPUT_MAC = 4A92E8437B8424D5E5EC279245D5C72B
+ 25A0327ACF6DAF902079FCB643D8B208
+ NTOR_KEY_SEED = 4D0C72FE8AFF35559D95ECC18EB5A368
+ 83402B28CDFD48C8A530A5A3D7D578DB
+
+ The service sends back Y | AUTH_INPUT_MAC in its RENDEZVOUS1 cell
+ body. From these, the client finishes the handshake, validates
+ AUTH_INPUT_MAC, and computes the same NTOR_KEY_SEED.
+
+ Now that both parties have the same NTOR_KEY_SEED, they can derive
+ the shared key material they will use for their circuit.
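The full NTOR_KEY_SEED derivation involves further MAC/KDF steps defined in [NTOR-WITH-EXTRA-DATA], but the reason both parties can reach the same value is ordinary curve25519 Diffie-Hellman: EXP(Y, x) equals EXP(X, y). The following sketch checks that symmetry against the vectors above using a plain RFC 7748 Montgomery ladder; it is illustrative only, and real implementations should use a vetted constant-time curve25519 library.

```python
# Illustrative RFC 7748 X25519 scalar multiplication (not constant-time).
P = 2**255 - 19
A24 = 121665

def x25519(k: bytes, u: bytes) -> bytes:
    # Decode and clamp the scalar (RFC 7748, section 5).
    kb = bytearray(k)
    kb[0] &= 248
    kb[31] &= 127
    kb[31] |= 64
    scalar = int.from_bytes(kb, "little")
    # Decode the u-coordinate, masking the unused high bit.
    ub = bytearray(u)
    ub[31] &= 127
    x1 = int.from_bytes(ub, "little")
    # Montgomery ladder with conditional swaps.
    x2, z2, x3, z3, swap = 1, 0, x1, 1, 0
    for t in reversed(range(255)):
        kt = (scalar >> t) & 1
        swap ^= kt
        if swap:
            x2, x3, z2, z3 = x3, x2, z3, z2
        swap = kt
        a = (x2 + z2) % P
        aa = a * a % P
        b = (x2 - z2) % P
        bb = b * b % P
        e = (aa - bb) % P
        c = (x3 + z3) % P
        d = (x3 - z3) % P
        da = d * a % P
        cb = c * b % P
        x3 = pow(da + cb, 2, P)
        z3 = x1 * pow(da - cb, 2, P) % P
        x2 = aa * bb % P
        z2 = e * (aa + A24 * e) % P
    if swap:
        x2, x3, z2, z3 = x3, x2, z3, z2
    return (x2 * pow(z2, P - 2, P) % P).to_bytes(32, "little")

# The client's and service's keypairs from the vectors above.
x = bytes.fromhex("60B4D6BF5234DCF87A4E9D7487BDF3F4"
                  "A69B6729835E825CA29089CFDDA1E341")
X = bytes.fromhex("BF04348B46D09AED726F1D66C618FDEA"
                  "1DE58E8CB8B89738D7356A0C59111D5D")
y = bytes.fromhex("68CB5188CA0CD7924250404FAB54EE13"
                  "92D3D2B9C049A2E446513875952F8F55")
Y = bytes.fromhex("8FBE0DB4D4A9C7FF46701E3E0EE7FD05"
                  "CD28BE4F302460ADDEEC9E93354EE700")

# Both sides compute the same curve25519 shared secret.
assert x25519(x, Y) == x25519(y, X)
```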
diff --git a/tor-spec.txt b/tor-spec.txt
index c6dbcfd..0e9622b 100644
--- a/tor-spec.txt
+++ b/tor-spec.txt
@@ -1260,13 +1260,12 @@ see tor-design.pdf.
t_mac = PROTOID | ":mac"
t_key = PROTOID | ":key_extract"
t_verify = PROTOID | ":verify"
- MULT(a,b) = the multiplication of the curve25519 point 'a' by the
- scalar 'b'.
G = The preferred base point for curve25519 ([9])
KEYGEN() = The curve25519 key generation algorithm, returning
a private/public keypair.
m_expand = PROTOID | ":key_expand"
KEYID(A) = A
+ EXP(a, b) = The ECDH algorithm for establishing a shared secret.
To perform the handshake, the client needs to know an identity key
digest for the server, and an ntor onion key (a curve25519 public