Age | Commit message | Author
2023-05-10 | hs_pow: swap out some comments | Micah Elizabeth Scott
I think we're done with these? Also swap in a nonfatal assert to replace one of the comments. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-10 | hs_pow: always give other events a chance to run between rend requests | Micah Elizabeth Scott
This dequeue path has been through a few revisions by now, first limiting us to a fixed number per event loop callback, then an additional limit based on a token bucket, then the current version which has only the token bucket. The thinking behind processing multiple requests per callback was to optimize our usage of libevent, but in effect this creates a prioritization problem. I think even a small fixed limit would be less reliable than just backing out this optimization and always allowing other callbacks to interrupt us in between dequeues. With this patch I'm seeing much smoother queueing behavior when I add artificial delays to the main thread in testing. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-10 | hs_pow: modified approach to pqueue level thresholds | Micah Elizabeth Scott
This centralizes the logic for deciding on these magic thresholds, and tries to reduce them to just two: a min and max. The min should be a "nearly empty" threshold, indicating that the queue only contains work we expect to be able to complete very soon. The max level triggers a bulk culling process that reduces the queue to half that amount. This patch calculates both thresholds based on the torrc pqueue rate settings if they're present, and uses generic defaults if the user asked for an unlimited dequeue rate in torrc. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
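A rough sketch of the max-level culling described above, written in Python rather than tor's C. The function name `cull_if_over`, the use of bare integers as effort values, and the choice to keep the highest-effort entries are all assumptions made for illustration; the commit message only states that reaching the max level reduces the queue to half that amount.

```python
import heapq


def cull_if_over(efforts, max_level):
    """Return the queue contents after an optional bulk cull.

    efforts   -- list of per-request effort values (hypothetical model)
    max_level -- threshold that triggers the cull

    Assumption: when culling, the highest-effort entries are the ones kept.
    """
    if len(efforts) <= max_level:
        # At or below the max threshold: nothing to do.
        return list(efforts)
    # Over the max: keep only half of max_level entries.
    return heapq.nlargest(max_level // 2, efforts)
```

With `max_level=8`, a queue of ten entries is cut to four, while a queue of five entries is left alone.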
2023-05-10 | hs_pow: faster hs_circuitmap lookup for rend in pow_worker_job_t | Micah Elizabeth Scott
The worker job queue for hs_pow needs what's effectively a weak pointer to two circuits, but there's not a generic mechanism for this in c-tor. The previous approach of circuit_get_by_global_id() is straightforward but not efficient. These global IDs are normally only used by the control port protocol. To reduce the number of O(N) lookups we have over the whole circuit list, we can use hs_circuitmap to look up the rend circuit by its auth cookie. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-10 | hs_pow: review feedback, use MAX for max_trimmed_effort | Micah Elizabeth Scott
Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-10 | hs_pow: Lower several logs from notice to info | Micah Elizabeth Scott
Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-10 | hs_pow: update_suggested_effort fix and cleanup | Micah Elizabeth Scott
This is trying to be an AIMD event-driven algorithm, but we ended up with two different add paths with diverging behavior. This fix makes the AIMD events more explicit, and it fixes an earlier behavior where the effort could be decreased (by the add/recalculate branch) even when the pqueue was not emptying at all. With this patch we shouldn't drop down to an effort of zero as long as even low-effort attacks are flooding the pqueue. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
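For readers unfamiliar with AIMD (additive increase, multiplicative decrease), a generic sketch follows. It is not tor's update_suggested_effort: the function name, the step size of 1, and the decrease factor of 0.5 are hypothetical. The only property taken from the message above is that the decrease path runs solely when the queue has emptied.

```python
def aimd_update(effort, queue_emptied, increase_step=1, decrease_factor=0.5):
    """Generic AIMD step (illustrative parameters, not tor's values).

    queue_emptied -- True when the pqueue drained since the last update.
    """
    if queue_emptied:
        # Multiplicative decrease: only when the queue actually emptied.
        return int(effort * decrease_factor)
    # Additive increase: the queue is still backed up.
    return effort + increase_step
```

Because the decrease is tied to the "queue emptied" event, a queue that stays full keeps the effort at or above its current level in this model.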
2023-05-10 | hs_pow: client side effort adjustment | Micah Elizabeth Scott
The goal of this patch is to add an additional mechanism for adjusting PoW effort upwards, where clients rather than services can choose to solve their puzzles at a higher effort than what was suggested in the descriptor. I wanted to use hs_cache's existing unreachability stats to drive this effort bump, but this revealed some cases where a circuit (intro or rend) closed early on can end up in hs_cache with an all zero intro point key, where nobody will find it. This moves intro_auth_pk initialization earlier in a couple places and adds nonfatal asserts to catch the problem if it shows up elsewhere. The actual effort adjustment method I chose is to multiply the suggested effort by (1 + unresponsive_count), then ensure the result is at least 1. If a service has suggested effort of 0 but we fail to connect, retries will all use an effort of 1. If the suggestion was 50, we'll try 50, 100, 150, 200, etc. This is bounded both by our client effort limit and by the limit on unresponsive_count (currently 5). Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
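The adjustment rule above can be sketched in Python as follows. The function and parameter names are invented for this example. The limit of 10000 comes from the earlier "bump client-side effort limit" commit and the cap of 5 from the message above. Applying the "at least 1" floor only on retries is one reading of the text, which speaks of retries using an effort of 1.

```python
def adjusted_client_effort(suggested_effort, unresponsive_count,
                           client_effort_limit=10000,
                           max_unresponsive_count=5):
    """Effort a client would solve for, per the rule in the commit message."""
    # unresponsive_count is itself bounded (currently 5 per the message).
    count = min(max(unresponsive_count, 0), max_unresponsive_count)
    effort = suggested_effort
    if count > 0:
        # Retry after a failed attempt: scale up, and never retry at zero.
        effort = max(suggested_effort * (1 + count), 1)
    # Never exceed the client-side effort limit.
    return min(effort, client_effort_limit)
```

With a suggested effort of 50, successive failures give 50, 100, 150, 200, matching the sequence in the message.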
2023-05-10 | hs_pow: leak fix, free the contents of pqueue entries in hs_pow_free_service_state | Micah Elizabeth Scott
Asan catches this pretty readily when ending a service gracefully while a DoS is in progress and the queue is full of items that haven't yet timed out. The module boundaries in hs_circuit are quite fuzzy here, but I'm trying to follow the vibe of the existing hs_pow code. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-10 | hs_pow: bump client-side effort limit from 500 to 10000 | Micah Elizabeth Scott
500 was quite low, but this limit was helpful when the suggested-effort estimation algorithm was likely to give us large abrupt increases. Now that this should be fixed, let's allow spending a bit more time on the client puzzles if it's actually necessary. Solving a puzzle with effort=10000 usually completes within a minute on my old x86_64 machine. We may want to fine tune this further, and it should probably be made into a config option. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-10 | hs_pow: stop having a "minimum effort", and let PoW effort start low | Micah Elizabeth Scott
I don't think the concept of "minimum effort" is really useful to us, so this patch removes it entirely and consequently changes the way that "total" effort is calculated so that we don't rely on any minimum and we instead ramp up effort no faster than necessary. If at least some portion of the attack is conducted by clients that avoid PoW or provide incorrect solutions, those (potentially very cheap) attacks will end up keeping the pqueue full. Prior to this patch, that would cause suggested efforts to be unnecessarily high, because rounding these very cheap requests up to even a minimum of 1 will overestimate how much actual attack effort is being spent. The result is that this patch is a simplification and it also allows a slower start, where PoW effort jumps up either by a single unit or by an amount calculated from actual effort in the queue. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-10 | changes: Ticket 40634 (hs_pow) | Micah Elizabeth Scott
Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-10 | gitlab-ci: Try enabling GPL mode so we test hs_pow | Micah Elizabeth Scott
Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-10 | hs_pow: Represent equix_solution as a byte array | Micah Elizabeth Scott
This patch is intended to clarify the points at which we convert between the internal representation of an equix_solution and a portable but opaque byte array representation. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-10 | sandbox: allow stack mmap with prot_none | Micah Elizabeth Scott
This fixes a failure that was showing up on i386 Debian hosts with sandboxing enabled, now that cpuworker is enabled on clients. We already had allowances for creating threads and creating stacks in the sandbox, but prot_none (probably used for a stack guard) was not allowed so thread creation failed. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-10 | hs_pow: Fix nonce cache entry leak | Micah Elizabeth Scott
This leak was showing up in address sanitizer runs of test_hs_pow, but it will also happen during normal operation as seeds are rotated. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-10 | hs_pow: Define seed_head as uint8_t[4] instead of uint32_t | Micah Elizabeth Scott
This is more consistent with the specification, and it's much less confusing with endianness. This resolves the underlying cause of the earlier byte-swap. This patch itself does not change the wire protocol at all, it's just tidying up the types we use at the trunnel layer. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-10 | hs_pow: Don't require uint128_t | Micah Elizabeth Scott
We were using a native uint128_t to represent the hs_pow nonce, but as the comments note it's more portable and more flexible to use a byte array. Indeed the uint128_t was a problem for 32-bit platforms. This swaps in a new implementation that uses multiple machine words to implement the nonce incrementation. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
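To illustrate incrementing a nonce stored as a byte array instead of a native 128-bit integer, here is a small Python sketch. It carries byte by byte rather than across machine words as the patch does, and it assumes the lowest-order byte comes first; both are simplifications for illustration and not taken from the tor source. The name `increment_nonce` is hypothetical.

```python
def increment_nonce(nonce):
    """Increment a bytearray nonce in place, wrapping around on overflow.

    Assumption: byte 0 is the least significant byte.
    """
    for i in range(len(nonce)):
        nonce[i] = (nonce[i] + 1) & 0xFF
        if nonce[i] != 0:
            # No carry out of this byte, so we are done.
            return
    # Every byte overflowed: the nonce wrapped around to all zeros.
```

Nothing here depends on the platform having an integer type as wide as the nonce, which is the portability point the message makes.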
2023-05-10 | hs_pow: unswap byte order of seed_head field | Micah Elizabeth Scott
In proposal 327, "POW_SEED is the first 4 bytes of the seed used". The proposal doesn't specifically mention the data type of this field, and the code in hs_pow so far treats it as an integer but semantically it's more like the first four bytes of an already-encoded little endian blob. This leads to a byte swap, since the type confusion takes place in a little-endian subsystem but the wire encoding of seed_head uses tor's default of big endian. This patch does not address the underlying type confusion, it's a minimal change that only swaps the byte order and updates unit tests accordingly. Further changes will clean up the data types. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
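The byte swap described above can be reproduced in a few lines of Python. The seed bytes are made up, and the variable names are only for this example. The point is that reading four bytes as a little-endian integer and then writing that integer big-endian reverses them on the wire.

```python
import struct

# Hypothetical seed material; only the first four bytes matter here.
seed = bytes.fromhex("a1b2c3d4e5f6")
seed_head = seed[:4]

# Type confusion: the four bytes get interpreted as a little-endian integer...
as_integer = int.from_bytes(seed_head, "little")

# ...and that integer is then encoded big-endian for the wire.
swapped_on_wire = struct.pack(">I", as_integer)

# Treating the field as an opaque 4-byte array avoids the swap entirely.
unswapped_on_wire = bytes(seed_head)
```

Here `swapped_on_wire` is `d4c3b2a1` while `unswapped_on_wire` stays `a1b2c3d4`, the literal first four bytes of the seed.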
2023-05-10 | hs_pow: fix assert in services that receive unsolicited proof of work | Micah Elizabeth Scott
Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-10 | hs_pow: use the compiled HashX implementation | Micah Elizabeth Scott
Much faster per-hash, affects both verify and solve. Only implemented on x86_64 and aarch64, other platforms always use the interpreted version of hashx. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-10 | test_hs_pow: add test vectors for our hs_pow client puzzle | Micah Elizabeth Scott
This adds test vectors for the overall client puzzle at the hs_pow and hs_cell layers. These are similar to the crypto/equix tests, but they also cover particulars of our hs_pow format like the conversion to byte arrays, the replay cache, the effort test, and the formatting of the equix challenge string. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-10 | hashx: trim trailing whitespace | Micah Elizabeth Scott
Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-10 | equix: Portability fixes for big endian platforms | Micah Elizabeth Scott
Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-10 | equix: Build cleanly with -Wall -Werror | Micah Elizabeth Scott
Fixes some type nitpicks that show up in Tor development builds, which usually run with -Wall -Werror. Tested on x86_64 and aarch64 for clean build and passing equix-tests + hashx-tests. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-10 | ext: build equix and hashx using automake | Micah Elizabeth Scott
This replaces the sketchy cmake invocation we had inside configure. The libs are always built and always used in unit tests, but only included in libtor and tor when --enable-gpl is set. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-10 | hs_pow: Replace libb2 dependency with hashx's internal blake2 | Micah Elizabeth Scott
This forgoes another external library dependency, and instead introduces a compatibility header so that interested parties (who already depend on equix, like hs_pow and unit tests) can use the implementation of blake2b included in hashx. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-10 | test_crypto: add equix and hashx tests | Micah Elizabeth Scott
This adds test vectors for the Equi-X proof of work algorithm and the Hash-X function it's based on. The overall Equi-X test takes about 10 seconds to run on my machine, so it's in test_crypto_slow. The hashx test still covers both the compiled and interpreted versions of the hash function. There aren't any official test vectors for Equi-X or for its particular configuration of Hash-X, so I made some up based on the current implementation. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-10 | test_crypto: add blake2b test vectors | Micah Elizabeth Scott
I'm planning on swapping blake2b implementations, and this test is intended to prevent regressions. Right now blake2b is only used by hs_pow. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-10 | hs_pow: Make proof-of-work support optional in configure | Micah Elizabeth Scott
This adds a new "pow" module for the user-visible proof of work support in ./configure, and this disables src/feature/hs/hs_pow at compile-time. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-10 | configure: Add --enable-gpl option | Micah Elizabeth Scott
This change on its own doesn't use the option for anything, but it includes support for configure and a message in 'tor --version'. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-10 | hs_pow_solve: use equix_solve more efficiently | Micah Elizabeth Scott
This was apparently misinterpreting "zero solutions" as an error instead of just moving on to the next nonce. Additionally, equix could have been returning up to 8 solutions and we would only give one of those a chance to succeed. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
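The corrected control flow can be sketched as below. This is not the equix API: `find_solutions` and `meets_effort` are hypothetical callbacks standing in for the solver and the effort check, and `max_nonces` exists only to keep the example bounded.

```python
def solve(find_solutions, meets_effort, start_nonce=0, max_nonces=1000):
    """Search nonces until some solution passes the effort check.

    find_solutions(nonce)      -- returns a list of zero or more solutions
    meets_effort(nonce, sol)   -- True if this solution is good enough
    """
    for nonce in range(start_nonce, start_nonce + max_nonces):
        # An empty list is not an error: the inner loop simply does not run
        # and we move on to the next nonce.
        for solution in find_solutions(nonce):
            # Give every returned solution a chance, not just the first.
            if meets_effort(nonce, solution):
                return nonce, solution
    return None
```

With toy callbacks such as `lambda n: [] if n % 3 == 0 else [n * 10 + k for k in range(n % 3 + 1)]` and `lambda n, s: s % 7 == 0`, the search skips nonce 0 (no solutions), rejects both solutions for nonce 1, and accepts the second solution for nonce 2.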
2023-05-10 | hs_pow: reduce min_effort default to 1 | Micah Elizabeth Scott
We may want to choose something larger eventually, but 20 seemed much too large. Very low nonzero efforts are still useful against a script kiddie level DoS attack. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-10 | hs_pow: Rate limited dequeue | Micah Elizabeth Scott
This adds a token bucket ratelimiter on the dequeue side of hs_pow's priority queue. It adds config options and docs for those options. (HiddenServicePoWQueueRate/Burst) I'm testing this as a way to limit the overhead of circuit creation when we're experiencing a flood of rendezvous requests. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
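As background on the technique, a minimal token bucket in Python is shown below. It is a generic sketch, not tor's implementation: the class and method names are invented, time is passed in explicitly to keep the example deterministic, and `rate` and `burst` only loosely mirror the roles of HiddenServicePoWQueueRate and HiddenServicePoWQueueBurst.

```python
class TokenBucket:
    """Generic token bucket: `rate` tokens per second, at most `burst` stored."""

    def __init__(self, rate, burst, now=0.0):
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)  # start with a full bucket
        self.last = now

    def _refill(self, now):
        elapsed = max(0.0, now - self.last)
        # Add tokens for the elapsed time, never exceeding the burst size.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now

    def try_take(self, now):
        """Consume one token if available; each token permits one dequeue."""
        self._refill(now)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A dequeue loop would call `try_take()` before handling each queued request and stop for now when it returns False.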
2023-05-10 | fix typo in HiddenServiceExportCircuitID | Micah Elizabeth Scott
Really inconsequential, since the string was only used for logging a warning.
2023-05-10 | manpage: document HiddenServicePoWDefensesEnabled option | Micah Elizabeth Scott
Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-10 | hs_pow: check for expired params in can_client_refetch_desc | Micah Elizabeth Scott
Without this check, we never actually refetch the hs descriptor when PoW parameters expire, because can_client_refetch_desc deems the descriptor to be still good. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-10 | hs_metrics: Proof of Work pqueue depth, suggested effort | Micah Elizabeth Scott
Adds two new metrics for hs_pow, and an internal parameter within hs_metrics for implementing gauge parameters that reset before every update. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-10 | update_suggested_effort: avoid assert if the pqueue has emptied | Micah Elizabeth Scott
top_of_rend_pqueue_is_worthwhile requires a nonempty queue.
2023-05-10 | compute the client-side pow in a cpuworker thread | Roger Dingledine
We mark the intro circuit with a new flag saying that the pow is in the cpuworker queue. When the cpuworker comes back, it either has a solution, in which case we proceed with sending the intro1 cell, or it has no solution, in which case we unmark the intro circuit and let the whole process restart on the next iteration of connection_ap_handshake_attach_circuit().
2023-05-10 | refactor send_introduce1() | Roger Dingledine
into two parts:
* a "consider whether to send an intro1 cell" part (now called consider_sending_introduce1()), and
* an "actually send it" part (now called send_introduce1()).
2023-05-10 | start the cpuworkers always, even for clients | Roger Dingledine
Prepares the way for client-side pow cpuworkers. Also happens to resolve bug https://bugs.torproject.org/tpo/core/tor/40617 (which went into 0.4.7.4-alpha) because now we survive initing the cpuworker subsystem when we're not a relay.
2023-05-10 | allow suggested effort to be 0 | Roger Dingledine
First (both client and service), make descriptor parsing not fail when suggested_effort is 0. Second (client side), if we get a descriptor with a pow_params section but with suggested_effort of 0, treat it as not requiring a pow. Third (service side), when deciding whether the suggested effort has changed, don't treat "previous suggested effort 0, new suggested effort 0" as a change. An alternative design to resolve 'first' and 'second' above would be to omit the pow_params from the descriptor when suggested_effort is 0, so clients never see the pow_params so they don't compute a pow. But I decided to include a pow_params with an explicit suggested_effort of 0, since this way the client knows the seed etc so they can solve a higher-effort pow if they want. The tradeoff is that the descriptor reveals whether HiddenServicePoWDefensesEnabled is set to 1 for this onion service, even if the AIMD calculation is currently requiring effort 0.
2023-05-10 | Initialize startup effort at 0. | Mike Perry
If it works correctly, auto-tuning should set a non-zero effort once an attack begins.
2023-05-10 | Implement AIMD effort estimation. | Mike Perry
Now, pow should auto-enable and auto-disable itself.
2023-05-10 | Replace the constant bottom-half rate with handled count. | Mike Perry
This allows us to more accurately estimate effort, based on real bottom-half throughput over the duration of a descriptor update.
2023-05-10 | Make the thing compile. | Mike Perry
2023-05-10 | clients defend themselves from absurd pow requests | Roger Dingledine
If asked for higher than a cap, we just solve it at the cap. I picked 500 for now but maybe we'll pick a better number in the future.
2023-05-10 | log_err is reserved for fatal failures | Roger Dingledine
2023-05-10 | drop the default min effort to 20 | Roger Dingledine
effort 100 is really quite expensive