authorMicah Elizabeth Scott <beth@torproject.org>2023-11-09 14:00:01 -0800
committerMicah Elizabeth Scott <beth@torproject.org>2023-11-09 14:16:27 -0800
commit1d7274bdcc7d46b0aec43ed44d99d2d7362593e5 (patch)
tree164f87aaec0d7c2e6f8437cda953600376aedc00
parentd3987e6889a51beef81bc7f4735c3749cd9fc78b (diff)
Another editing pass for hspow-spec and friends
This mostly updates formatting and links. I added a little bit of new context, primarily a disclaimer and updated benchmark info for analysis-discussion.md
-rw-r--r--spec/dos-spec/overview.md6
-rw-r--r--spec/hspow-spec/analysis-discussion.md28
-rw-r--r--spec/hspow-spec/common-protocol.md28
-rw-r--r--spec/hspow-spec/motivation.md4
-rw-r--r--spec/hspow-spec/v1-equix.md34
-rw-r--r--spec/rend-spec/introduction-protocol.md4
6 files changed, 59 insertions, 45 deletions
diff --git a/spec/dos-spec/overview.md b/spec/dos-spec/overview.md
index e159014..0ea0994 100644
--- a/spec/dos-spec/overview.md
+++ b/spec/dos-spec/overview.md
@@ -7,7 +7,7 @@ These mitigations are expected to improve network availability, but DoS mitigati
The attack and defense environment changes over time.
Expect that this document is an attempt to describe the current state of things, but that it may not be complete.
-The defenses here are organized by the type of resource under contention. These can be physical resources (Memory, CPU, Bandwidth) or protocol resources (Connections, Circuits, Introductions).
+The defenses here are organized by the type of resource under contention. These can be physical resources ([Memory](#memory), [CPU](#cpu), [Bandwidth](#bandwidth)) or protocol resources ([Channels](#channels), [Circuits](#circuits), [Introductions](#hs-intro)).
In practice there are always overlaps between these resource types.
Connecting to an onion service, for example, puts some strain on every resource type here.
@@ -73,6 +73,6 @@ Based on the queue behavior, servers will continuously provide an updated effort
Queue backlogs cause the effort to rise, and an idle server will cause the effort to decay.
If the queue is never overfull the effort decays to zero, asking clients not to include a proof-of-work solution at all.
-We may support multiple cryptographic algorithms for this puzzle in the future, but currently we support one type. It's called `v1` in our protocol, and it's based on the Equi-X algorithm developed for this purpose. See the document on [`Proof of Work for onion service introduction`](../hspow-spec/index.md).
+We may support multiple cryptographic algorithms for this puzzle in the future, but currently we support one type. It's called `v1` in our protocol, and it's based on the Equi-X algorithm developed for this purpose. See the document on [Proof of Work for onion service introduction](../hspow-spec/index.md).
-This defense is configured by an operator using the `HiddenServicePoW` configuration options. Additionally, it requires both the client and the onion service to be compiled with the `pow` module (`--enable-gpl` mode) available. Current versions of the Tor Browser do include `pow` support.
+This defense is configured by an operator using the `HiddenServicePoW` configuration options. Additionally, it requires both the client and the onion service to be compiled with the `pow` module (and `--enable-gpl` mode) available. Despite this non-default build setting, proof of work *is* available through common packagers like the Tor Browser and Debian.
diff --git a/spec/hspow-spec/analysis-discussion.md b/spec/hspow-spec/analysis-discussion.md
index 483bd31..8f25623 100644
--- a/spec/hspow-spec/analysis-discussion.md
+++ b/spec/hspow-spec/analysis-discussion.md
@@ -1,23 +1,27 @@
# Analysis and discussion
+*Warning*: Take all the PoW performance numbers on this page with a large grain of salt. Most of this is based on very early analysis that has not been updated for the current state of implementation.
+
+For current performance numbers on a specific piece of hardware, please run `cargo bench` from the [`equix/bench`](https://gitlab.torproject.org/tpo/core/arti/-/tree/main/crates/equix/bench) crate within [Arti](https://gitlab.torproject.org/tpo/core/arti/). This framework tests both the C and Rust implementations side-by-side.
+
## Attacker strategies {#attacker-strategies}
To design a protocol and choose its parameters, we first need to understand a few high-level attacker strategies to see what we are fighting against.
-### Overwhelm PoW verification (aka "Overwhelm top half") {#attack-top-half}
+### Overwhelm PoW verification ("top half") {#attack-top-half}
-A basic attack here is the adversary spamming with bogus INTRO cells so that the service does not have computing capacity to even verify the proof-of-work. This adversary tries to overwhelm the procedure in the [POW_VERIFY] section.
+A basic attack here is the adversary spamming with bogus INTRO cells so that the service does not have computing capacity to even verify the proof-of-work. This adversary tries to overwhelm the procedure in the [`v1` verification algorithm](./v1-equix.md#service-verify) section.
-That's why we need the PoW algorithm to have a cheap verification time so that this attack is not possible: we tune this PoW parameter in section [POW_TUNING_VERIFICATION].
+That's why we need the PoW algorithm to have a cheap verification time so that this attack is not possible: we explore this PoW parameter below in the section on [PoW verification](#pow-tuning-verification).
-### Overwhelm rendezvous capacity (aka "Overwhelm bottom half") {#attack-bottom-half}
+### Overwhelm rendezvous capacity ("bottom half") {#attack-bottom-half}
-Given the way the introduction queue works (see [HANDLE_QUEUE]), a very effective strategy for the attacker is to totally overwhelm the queue processing by sending more high-effort introductions than the onion service can handle at any given tick.
-This adversary tries to overwhelm the procedure in the [HANDLE_QUEUE] section.
+Given the way [the introduction queue](./common-protocol.md#intro-queue) works, a very effective strategy for the attacker is to totally overwhelm the queue processing by sending more high-effort introductions than the onion service can handle at any given tick.
+This adversary tries to overwhelm the process of [handling queued introductions](./common-protocol.md#handling-queue).
-To do so, the attacker would have to send at least 20 high-effort introduction cells every 100ms, where high-effort is a PoW which is above the estimated level of "the motivated user" (see [USER_MODEL]).
+To do so, the attacker would have to send at least 20 high-effort introduction cells every 100ms, where high-effort is a PoW which is above the estimated level of ["the motivated user"](./motivation.md#user-profiles).
-An easier attack for the adversary, is the same strategy but with introduction cells that are all above the comfortable level of "the standard user" (see [USER_MODEL]).
+An easier attack for the adversary is the same strategy, but with introduction cells that are all above the comfortable level of ["the standard user"](./motivation.md#user-profiles).
This would block out all standard users and only allow motivated users to pass.
### Hybrid overwhelm strategy {#attack-hybrid}
@@ -240,11 +244,11 @@ Nevertheless, there are some massive differences in both the scale and the dynam
We think we aren't making a bad situation worse: DoS attacks on the Tor network are already happening and attackers are already burning energy to carry them out.
As we see in the [denial-of-service overview](../dos-spec/overview.md#hs-intro), attacks on onion services are in a position to cause downstream resource consumption of nearly every type.
Each relay involved experiences increased CPU load from the circuit floods they process.
-We think that asking legitimate clients to carry out PoW computations is not gonna affect the equation too much, since an attacker right now can very quickly use the same resources that hundreds of legitimate clients do in a whole day.
+We think that asking legitimate clients to carry out PoW computations doesn't affect the equation too much, since an attacker right now can very quickly use the same resources that hundreds of legitimate clients do in a whole day.
We hope to make things better: The hope is that systems like this will make the DoS actors go away and hence the PoW system will not be used.
As long as DoS is happening there will be a waste of energy, but if we manage to demotivate them with technical means, the network as a whole will be less wasteful.
-Also see [The DoS Catch-22](./motivation.md#catch22) for a similar argument.
+Also see [The DoS Catch-22](./motivation.md#catch22).
## Acknowledgements {#acknowledgements}
@@ -272,8 +276,8 @@ Processing an INTRODUCE2 cell at the onion service means a series of operations
2. Decrypt cell (AES operations).
3. Parse cell header and process it depending on its RELAY_COMMAND.
4. INTRODUCE2 cell handling which means building a rendezvous circuit:
- - Path selection
- - Launch circuit to first hop.
+ - Path selection
+ - Launch circuit to first hop.
5. Return to mainloop event which essentially means back to step (1).
Tor will read at most 32 cells out of the inbuf per mainloop round.
diff --git a/spec/hspow-spec/common-protocol.md b/spec/hspow-spec/common-protocol.md
index 844d87f..e0910ac 100644
--- a/spec/hspow-spec/common-protocol.md
+++ b/spec/hspow-spec/common-protocol.md
@@ -85,7 +85,7 @@ Clients automatically increase their bid when retrying, and services regularly o
[Motivated users](./motivation.md#user-profiles) can spend a high amount of effort in their PoW computation, which should guarantee access to the service given reasonable adversary models.
-An effective effort estimation algorithm will improve reachability and UX by suggesting values that reduce overall service load to tolerable values while also leaving users with a tolerable overall delay.
+An effective effort control algorithm will improve reachability and UX by suggesting values that reduce overall service load to tolerable values while also leaving users with a tolerable overall delay.
The service starts with a default suggested-effort value of 0, which keeps the PoW defenses dormant until we notice signs of queue overload.
@@ -104,13 +104,13 @@ Increasing effort will make the service more expensive to contact,
and decreasing effort makes new requests likely to become backlogged behind older requests.
The steady state condition is preferable to either of these side-effects, but ultimately it's expected that the control loop always oscillates to some degree.
-### Service-side effort estimation {#service-effort}
+### Service-side effort control {#service-effort}
-Services keep an internal effort estimation which updates on a regular periodic timer in response to measurements made on the queueing behavior in the previous period.
+Services keep an internal suggested effort target which updates on a regular periodic timer in response to measurements made on queue behavior in the previous period.
These internal effort changes can optionally trigger client-visible [descriptor changes](#service-effort-update) when the difference is great enough to warrant republication to the [HSDir](../rend-spec/hsdesc.md).
This evaluation and update period is referred to as `HS_UPDATE_PERIOD`.
-The service-side effort estimation takes inspiration from TCP congestion control's additive increase / multiplicative decrease approach, but unlike a typical AIMD this algorithm is fixed-rate and doesn't update immediately in response to events.
+The service-side effort control loop takes inspiration from TCP congestion control's additive increase / multiplicative decrease approach, but unlike a typical AIMD this algorithm is fixed-rate and doesn't update immediately in response to events.
TODO: `HS_UPDATE_PERIOD` is hardcoded to 300 (5 minutes) currently, but it should be configurable in some way.
Is it more appropriate to use the service's torrc here or a consensus parameter?
@@ -128,35 +128,37 @@ During each update period, the service maintains some state:
At the end of each period, the service may decide to increase effort, decrease effort, or make no changes, based on these accumulated state values:
-1. If `MAX_TRIMMED_EFFORT` > our previous internal suggested_effort, always INCREASE.
+1. If `MAX_TRIMMED_EFFORT` > our previous internal `suggested_effort`, always INCREASE.
Requests that follow our latest advice are being dropped.
-2. If the `HAD_QUEUE` flag was set and the queue still contains at least one item with effort >= our previous internal suggested_effort, INCREASE.
+2. If the `HAD_QUEUE` flag was set and the queue still contains at least one item with effort >= our previous internal `suggested_effort`, INCREASE.
Even if we haven't yet reached the point of dropping requests, this signal indicates that our latest suggestion isn't high enough and requests will build up in the queue.
3. If neither condition 1 nor 2 applies and the queue is below a level we would expect to process in approximately 1/4 second, choose to DECREASE.
-4. If none of these conditions match, the suggested effort is unchanged.
+4. If none of these conditions match, the `suggested_effort` is unchanged.
-When we INCREASE, the internal suggested_effort is increased to either its previous value + 1, or (`TOTAL_EFFORT` / `REND_HANDLED`), whichever is larger.
+When we INCREASE, the internal `suggested_effort` is increased to either its previous value + 1, or (`TOTAL_EFFORT` / `REND_HANDLED`), whichever is larger.
-When we DECREASE, the internal suggested_effort is scaled by 2/3rds.
+When we DECREASE, the internal `suggested_effort` is scaled by 2/3rds.
Over time, this will continue to decrease our effort suggestion any time the service is fully processing its request queue.
If the queue stays empty, the effort suggestion decreases to zero and clients should no longer submit a proof-of-work solution with their first connection attempt.
-It's worth noting that the suggested-effort is not a hard limit to the efforts that are accepted by the service, and it's only meant to serve as a guideline for clients to reduce the number of unsuccessful requests that get to the service.
-When [adding requests to the queue](#add-queue), services do accept valid solutions with efforts lower than the published `suggested-effort`.
+It's worth noting that the `suggested_effort` is not a hard limit to the efforts that are accepted by the service, and it's only meant to serve as a guideline for clients to reduce the number of unsuccessful requests that get to the service.
+When [adding requests to the queue](#add-queue), services do accept valid solutions with efforts higher or lower than the published values from `pow-params`.
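The INCREASE/DECREASE rules above can be sketched as a single decision function. This is a minimal illustration only, not the C tor implementation; the parameter names are invented here, standing in for the per-period counters the section describes:

```python
def update_suggested_effort(prev_effort, max_trimmed_effort, had_queue,
                            queue_has_effort_at_or_above_prev,
                            queue_drains_in_quarter_second,
                            total_effort, rend_handled):
    """One HS_UPDATE_PERIOD evaluation of the service-side controller.

    Rules 1-2 trigger INCREASE (requests following our advice were dropped,
    or are still waiting in the queue); rule 3 triggers DECREASE when the
    queue is nearly empty; rule 4 leaves the suggestion unchanged.
    """
    if max_trimmed_effort > prev_effort:
        increase = True                       # rule 1: advice-followers dropped
    elif had_queue and queue_has_effort_at_or_above_prev:
        increase = True                       # rule 2: backlog at/above advice
    else:
        increase = False

    if increase:
        # Additive increase: at least +1, or the mean handled effort,
        # whichever is larger (TOTAL_EFFORT / REND_HANDLED).
        mean_effort = total_effort // rend_handled if rend_handled else 0
        return max(prev_effort + 1, mean_effort)
    if queue_drains_in_quarter_second:
        return prev_effort * 2 // 3           # rule 3: multiplicative decrease
    return prev_effort                        # rule 4: unchanged
```

Note how the fixed-rate evaluation differs from classic AIMD: the decision runs once per period on accumulated state, not per event.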
#### Updating descriptor with new suggested effort {#service-effort-update}
The service descriptors may be updated for multiple reasons including introduction point rotation common to all v3 onion services, scheduled seed rotations like the one described for [`v1` parameters](./v1-equix.md#parameter-descriptor), and updates to the effort suggestion.
-Even though the internal effort estimate updates on a regular timer, we avoid propagating those changes into the descriptor and the HSDir hosts unless there is a significant change.
+Even though the internal effort value updates on a regular timer, we avoid propagating those changes into the descriptor and the HSDir hosts unless there is a significant change.
If the PoW params otherwise match and the suggested effort has changed by less than 15 percent, services SHOULD NOT upload a new descriptor.
-### Client-side effort estimation {#client-effort}
+### Client-side effort control {#client-effort}
Clients are responsible for making their own effort adjustments in response to connection trouble, to allow the system a chance to react before the service has published new effort values.
This is an important tool to uphold UX expectations without relying on excessively frequent updates through the HSDir.
+TODO: This is the weak link in user experience for our current implementation. The C tor implementation does not detect and retry onion service connections as reliably as we would like. Currently our best strategy to improve retry behavior is the Arti rewrite.
+
#### Failure ambiguity {#client-failure-ambiguity}
The first challenge in reacting to failure, in our case, is to even accurately and quickly understand when a failure has occurred.
diff --git a/spec/hspow-spec/motivation.md b/spec/hspow-spec/motivation.md
index a79bac7..8fffe3f 100644
--- a/spec/hspow-spec/motivation.md
+++ b/spec/hspow-spec/motivation.md
@@ -21,7 +21,7 @@ With the right parameters, this proof-of-work scheme acts as a gatekeeper to blo
For a similar concept, see the three internet drafts that have been proposed for defending against TLS-based DDoS attacks using client puzzles:
- [`draft-nygren-tls-client-puzzles-02`](https://www.ietf.org/archive/id/draft-nygren-tls-client-puzzles-02.txt)
-- [`draft-nir-tls-puzzles-00`](https://tools.ietf.org/id/draft-nir-tls-puzzles-00.html)
+- [`draft-nir-tls-puzzles-00`](https://www.ietf.org/archive/id/draft-nir-tls-puzzles-00.txt)
- [`draft-ietf-ipsecme-ddos-protection-10`](https://tools.ietf.org/html/draft-ietf-ipsecme-ddos-protection-10)
## Threat model
@@ -52,7 +52,7 @@ Let's start with some adversary profiles:
The upfront cost for this attacker is about $36k.
We hope that this proposal can help us defend against the script-kiddie attacker and small botnets.
-To defend against a large botnet we would need more tools at our disposal (see the [discussion on future designs](./analysis-discussion.md#FUTURE_DESIGNS)).
+To defend against a large botnet we would need more tools at our disposal (see the [discussion on future designs](./analysis-discussion.md#future-designs)).
### User profiles {#user-profiles}
diff --git a/spec/hspow-spec/v1-equix.md b/spec/hspow-spec/v1-equix.md
index ccaf055..77b7f95 100644
--- a/spec/hspow-spec/v1-equix.md
+++ b/spec/hspow-spec/v1-equix.md
@@ -15,13 +15,17 @@ Furthermore, it's designed for this particular use-case and hence cryptocurrency
At this point there is no formal specification for Equi-X or the underlying HashX function.
We have two actively maintained implementations of both components, which we subject to automated cross-compatibility and fuzz testing:
-- A fork of tevador's implementation is maintained within the C tor repository, in the [`src/ext/equix` subdirectory](https://gitlab.torproject.org/tpo/core/tor/-/tree/main/src/ext/equix).
- Currently this contains important fixes for security, portability, and testability which have not been merged upstream.
+- A fork of tevador's implementation is maintained within the C tor repository.
+
+ This is the [`src/ext/equix` subdirectory](https://gitlab.torproject.org/tpo/core/tor/-/tree/main/src/ext/equix).
+ Currently this contains important fixes for security, portability, and testability which have not been merged upstream!
This implementation is released under the LGPL license.
When `tor` is built with the required `--enable-gpl` option this code will be statically linked.
+
- As part of Arti, a new Rust re-implementation was written based loosely on tevador's original.
- This implementation currently has somewhat lower verification performance than the original but otherwise offers equivalent features.
+
This is the [`equix` crate](https://tpo.pages.torproject.net/core/doc/rust/equix/index.html).
+ This implementation currently has somewhat lower verification performance than the original but otherwise offers equivalent features.
## Algorithm overview {#overview}
@@ -52,7 +56,7 @@ The underlying Equi-X puzzle has an approximately fixed computational cost.
Adjustable effort comes from the construction of the overlying Blake2b layer, which requires clients to test a variable number of Equi-X solutions in order to find answers which also satisfy this layer's effort constraint.
It's common for proof-of-work systems to define an exponential effort function based on a particular number of leading zero bits or equivalent.
-For the benefit of our effort estimation system, it's quite useful if we have a linear scale instead. We use the first 32 bits of a hashed version of the Equi-X solution as a uniformly distributed random value.
+For the benefit of our effort control system, it's quite useful if we have a linear scale instead. We use the first 32 bits of a hashed version of the Equi-X solution as a uniformly distributed random value.
Conceptually we could define a function:
```text
@@ -80,7 +84,7 @@ Thus the effort is communicated explicitly in our protocol, and it forms part of
This whole protocol starts with the service encoding its parameters in a `pow-params` line within the 'encrypted' (inner) part of the v3 descriptor. The [second layer plaintext format](../rend-spec/hsdesc-encrypt.md#second-layer-plaintext) describes it canonically. The parameters offered are:
- `type`, always `v1` for the algorithm described here
- `seed-b64`, a periodically updated 32-byte random seed, base64 encoded
-- `suggested-effort`, the latest output from [service-side effort estimation](./common-protocol.md#service-effort)
+- `suggested-effort`, the latest output from the [service-side effort controller](./common-protocol.md#service-effort)
- `expiration-time`, a timestamp when we plan to replace the seed.
Seed expiration and rotation allows used nonces to expire from the anti-replay memory.
@@ -88,13 +92,13 @@ At every seed rotation, a new expiration time is chosen uniformly at random from
- At the earliest, 105 minutes in the future
- At the latest, 2 hours in the future (15 minutes later)
-The service should refresh its seed when expiration-time passes.
+The service SHOULD refresh its seed when expiration-time passes.
The service SHOULD keep its previous seed in memory and accept PoWs using it to avoid race-conditions with clients that have an old seed.
-The service SHOULD avoid generating two consequent seeds that have a common 4 bytes prefix; see the usage of seed headings below in the [introduction extension]{#intro-ext}.
+The service SHOULD avoid generating two consecutive seeds that share a common 4-byte prefix; see the usage of seed heads below in the [introduction extension](#intro-ext).
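The rotation rules in this section can be sketched as follows. This is an illustration under stated assumptions, not the C tor implementation; the class and method names are invented here:

```python
import os
import random
import time

class SeedState:
    """Minimal sketch of v1 seed rotation: pick an expiration uniformly in
    the [105 min, 120 min] window, keep the previous seed so clients racing
    a rotation can still be verified, and avoid consecutive seeds that
    share a 4-byte prefix (clients identify seeds by this prefix)."""

    def __init__(self):
        self.seed = os.urandom(32)
        self.previous = None
        self.expiration = self._pick_expiration(time.time())

    @staticmethod
    def _pick_expiration(now):
        # Uniformly at random between 105 minutes and 2 hours from now.
        return now + random.uniform(105 * 60, 120 * 60)

    def rotate(self, now):
        self.previous = self.seed
        new = os.urandom(32)
        while new[:4] == self.previous[:4]:   # SHOULD NOT share a prefix
            new = os.urandom(32)
        self.seed = new
        self.expiration = self._pick_expiration(now)
```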
## Client computes a solution {#client-solver}
-If a client receives a descriptor with `pow-params``, it should assume that the service is prepared to receive PoW solutions as part of the introduction protocol.
+If a client receives a descriptor with `pow-params`, it should assume that the service is prepared to receive PoW solutions as part of the introduction protocol.
The client parses the descriptor and extracts the PoW parameters.
It makes sure that the `expiration-time` has not expired.
@@ -113,18 +117,22 @@ The solver itself is iterative; the following steps are repeated until they succ
1. Construct the *challenge string* by concatenating `P || ID || C || N || htonl(E)`.
2. Calculate a candidate proof `S` by passing this challenge to Equi-X.
- - `S = equix_solve(P || ID || C || N || htonl(E))`
+
+ `S = equix_solve(P || ID || C || N || htonl(E))`
3. Calculate a 32-bit check value by interpreting a 32-bit Blake2b hash of the concatenated challenge and solution as an integer in network byte order.
- - `R = ntohl(blake2b_32(P || ID || C || N || htonl(E) || S))`
+
+ `R = ntohl(blake2b_32(P || ID || C || N || htonl(E) || S))`
4. Check if 32-bit multiplication of `R * E` would overflow
- - If `R * E` overflows (the result would be greater than `UINT32_MAX`) the solver must retry with another nonce value. The client interprets N as a 16-byte little-endian integer, increments it by 1, and goes back to step 1.
- - If there is no overflow (the result is less than or equal to `UINT32_MAX`) this is a valid solution. The client can submit final nonce `N`, effort `E`, the first 4 bytes of seed `C`, and proof `S`.
+
+ If `R * E` overflows (the result would be greater than `UINT32_MAX`) the solver must retry with another nonce value. The client interprets N as a 16-byte little-endian integer, increments it by 1, and goes back to step 1.
+
+ If there is no overflow (the result is less than or equal to `UINT32_MAX`) this is a valid solution. The client can submit final nonce `N`, effort `E`, the first 4 bytes of seed `C`, and proof `S`.
Note that the Blake2b hash includes the output length parameter in its initial state vector, so a `blake2b_32` is not equivalent to the prefix of a `blake2b_512`.
We calculate the 32-bit Blake2b specifically, and interpret it in network byte order as an unsigned integer.
At the end of the above procedure, the client should have calculated a proof `S` and final nonce `N` that satisfies both the Equi-X proof conditions and the Blake2b effort test.
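The retry loop above can be sketched in Python. This is only an illustration of the nonce/overflow logic: the Equi-X call is a stand-in stub (there is no Python implementation), and the personalization string `P` is a placeholder, not the value the spec defines:

```python
import hashlib
import struct

UINT32_MAX = 0xFFFFFFFF
P = b"example-personalization"  # placeholder only; not the real constant

def equix_solve_stub(challenge: bytes) -> bytes:
    """Stand-in for the real Equi-X solver; returns a deterministic
    16-byte pseudo-solution purely for illustration."""
    return hashlib.blake2b(challenge, digest_size=16).digest()

def solve(ID: bytes, C: bytes, N: int, E: int):
    """Iterate nonces until the 32-bit check value R times effort E
    fits in 32 bits without overflow."""
    while True:
        nonce = N.to_bytes(16, "little")              # N: 16-byte LE integer
        challenge = P + ID + C + nonce + struct.pack(">I", E)  # htonl(E)
        S = equix_solve_stub(challenge)
        # R = ntohl(blake2b_32(challenge || S)): a true 32-bit Blake2b,
        # not a prefix of a longer digest, read in network byte order.
        R = int.from_bytes(
            hashlib.blake2b(challenge + S, digest_size=4).digest(), "big")
        if R * E <= UINT32_MAX:                        # no overflow: valid
            return N, E, C[:4], S                      # nonce, effort, seed head, proof
        N += 1                                         # retry with next nonce
```

Because `R` is uniformly distributed over 32 bits, the loop succeeds on each try with probability roughly `1/E`, which is what makes the effort scale linear.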
-How quickly this happens, on average, depends mainly on the target effort `E` parameter.
+The time taken, on average, is linearly proportional to the target effort `E` parameter.
The algorithm as described is suitable for single-threaded computation.
Optionally, a client may choose multiple nonces and attempt several solutions in parallel on separate CPU cores.
diff --git a/spec/rend-spec/introduction-protocol.md b/spec/rend-spec/introduction-protocol.md
index 3c0ba0c..60e5a40 100644
--- a/spec/rend-spec/introduction-protocol.md
+++ b/spec/rend-spec/introduction-protocol.md
@@ -349,8 +349,8 @@ POW_SOLUTION is a matching proof computed by the client's solver
Only version 1 is currently defined.
Other versions may have a different format.
-A correctly functioning client should only submit solutions with a version and seed which at some point were advertised by the server.
-An extension with an unknown version or seed is suspicious and SHOULD result in introduction failure.
+A correctly functioning client only submits solutions with a version and seed which were advertised by the server and have not yet expired.
+An extension with an unknown version or expired seed is suspicious and SHOULD result in introduction failure.
This will increase the INTRODUCE1 payload size by 43 bytes, since the extension type and length take 2 extra bytes, the N_EXTENSIONS field is always present and currently set to 0, and the EXT_FIELD is 41 bytes.
According to ticket #33650, INTRODUCE1 cells currently have more than 200 bytes available.
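The 43-byte figure can be checked from the `v1` field sizes. The nonce, effort, and seed-head sizes appear in the solver description above; treating the proof as the 16-byte Equi-X solution plus a one-byte version field is an assumption of this sketch:

```python
# v1 PoW extension field sizes (proof size assumed to be the
# 16-byte Equi-X solution; version assumed to be one byte).
POW_VERSION = 1     # version byte
POW_SEED = 4        # first 4 bytes of seed C
POW_NONCE = 16      # nonce N (16-byte integer)
POW_EFFORT = 4      # effort E, htonl-encoded 32-bit value
POW_SOLUTION = 16   # Equi-X proof S

EXT_FIELD = POW_VERSION + POW_SEED + POW_NONCE + POW_EFFORT + POW_SOLUTION
EXT_OVERHEAD = 2    # extension type and length bytes
assert EXT_FIELD == 41
assert EXT_OVERHEAD + EXT_FIELD == 43   # matches the stated payload increase
```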