Age | Commit message | Author
2023-05-28 | hs_pow: Update for equix API to fix issue 40794 | Micah Elizabeth Scott
This change adapts the hs_pow layer and unit tests to API changes in hashx and equix which modify the fault recovery responsibilities and reporting behavior. This and the corresponding implementation changes in hashx and equix form the fix for #40794, both solving the segfault and giving hashx a way to report those failures up the call chain without them being mistaken for a different error (unusable seed) that would warrant a retry. To handle these new late compiler failures with a minimum of fuss or inefficiency, the failover is delegated to the internals of hashx and tor needs only pass in an EQUIX_CTX_TRY_COMPILE flag to get the behavior that tor was previously responsible for implementing. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-28 | equix: API changes for new result codes and hashx compatibility | Micah Elizabeth Scott
This change adapts Equi-X to the corresponding HashX API changes that added HASHX_TRY_COMPILE. The new regularized HashX return codes are reflected by revised corresponding Equi-X return codes. Both solve and verify operations now return an error/success code, and a new equix_solutions_buffer struct includes both the solution buffer and information about the solution count and hashx implementation. With this change, it's possible to discern between hash construction failures (invalid seed) and some external error like an mprotect() failure. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-28 | hashx: API changes to allow recovery from late compile failures | Micah Elizabeth Scott
This is an API breaking change to hashx, which modifies the error handling strategy. The main goal here is to allow unproblematic recovery from hashx_compile failures.

hashx_alloc can no longer fail for reasons other than memory allocation. All platform-specific compile failures are now reported via hashx_make(), in order to both allow later failure and avoid requiring users of the API to maintain and test multiple failure paths.

Note that late failures may be more common in actual use than early failures. Early failures represent architectures other than x86_64 and aarch64. Late failures could represent a number of system configurations where syscalls are restricted.

The definition of a hashx context no longer tries to overlay storage for the different types of program, and instead allows one context to always contain an interpretable description of the program as well as an optional buffer for compiled code.

The hashx_type enum is now used to mean either a specific type of hash function or a type of hashx context. You can allocate a context for use only with interpreted or compiled functions, or you can use HASHX_TRY_COMPILE to prefer the compiler with an automatic fallback on the interpreter. After calling hashx_make(), the new hashx_query_type() can be used if needed to determine which implementation was actually chosen.

The error return types have been overhauled so that everyone uses the hashx_result enum, and seed failures vs compile failures are always clearly distinguishable.

Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
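The "prefer the compiler, fall back on the interpreter" strategy in the message above can be illustrated with a toy model. This is only a sketch: every name in it (toy_ctx, toy_make, toy_jit_available, and the enum values) is a hypothetical stand-in, not the hashx API, and the compiler is simulated by a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; not the real hashx types. */
typedef enum { TOY_INTERPRETED, TOY_COMPILED, TOY_TRY_COMPILE } toy_type;
typedef enum { TOY_OK, TOY_FAIL_SEED, TOY_FAIL_COMPILE } toy_result;

typedef struct {
  toy_type requested;  /* what the caller asked for at allocation time */
  toy_type active;     /* implementation actually chosen by toy_make() */
  bool has_program;    /* interpretable program description is present */
  bool has_code;       /* optional compiled-code buffer is usable */
} toy_ctx;

/* Simulated environment: does the JIT step work (e.g. mprotect allowed)? */
static bool toy_jit_available = true;

/* Builds a "program" from a seed. An empty seed models an unusable seed. */
static toy_result toy_make(toy_ctx *ctx, const void *seed, size_t len)
{
  assert(ctx != NULL);
  ctx->has_program = false;
  ctx->has_code = false;
  if (seed == NULL || len == 0)
    return TOY_FAIL_SEED;          /* caller should choose another seed */

  ctx->has_program = true;         /* interpretable form is always kept */
  ctx->active = TOY_INTERPRETED;
  if (ctx->requested == TOY_INTERPRETED)
    return TOY_OK;

  if (toy_jit_available) {
    ctx->has_code = true;
    ctx->active = TOY_COMPILED;
    return TOY_OK;
  }
  /* Late compile failure: fall back only if the caller allowed it. */
  return ctx->requested == TOY_TRY_COMPILE ? TOY_OK : TOY_FAIL_COMPILE;
}
```

In this model a caller that requested TOY_TRY_COMPILE never sees a compile failure and can read ctx->active afterwards to learn which implementation was chosen, while a seed failure stays distinguishable because it has its own result code.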
2023-05-28 | hashx: allow hashx_compile to fail, avoid segfault without changing API | Micah Elizabeth Scott
This is a minimal portion of the fix for tor issue #40794, in which hashx segfaults due to denial of mprotect() syscalls at runtime.

Prior to this fix, hashx makes the assumption that if the JIT is supported on the current architecture, it will also be usable at runtime. This isn't true if mprotect fails on Linux, which it may for various reasons: the tor built-in sandbox, the shadow simulator, or external security software that implements a syscall filter.

The necessary error propagation was missing internally in hashx, causing us to obliviously call into code which was never made executable. With this fix, hashx_make() will instead fail by returning zero.

A proper fix will require API changes so that callers can discern between different types of failures. Zero already means that a program couldn't be constructed, which requires a different response: choosing a different seed, vs switching implementations. Callers would also benefit from a way to use one context (with its already-built program) to run in either compiled or interpreted mode.

Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
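The failure mode described above comes down to checking the result of mprotect() before treating a buffer as executable. The following is a minimal sketch using only POSIX calls; the function name can_make_executable is hypothetical, and it is not taken from hashx.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns 1 if a scratch page could be switched to read+execute,
 * 0 if the mapping or the mprotect() call was refused. */
static int can_make_executable(void)
{
  long page = sysconf(_SC_PAGESIZE);
  if (page <= 0)
    return 0;

  void *buf = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (buf == MAP_FAILED)
    return 0;

  /* The return value must be checked: a sandbox or syscall filter may
   * deny this call, and jumping into the buffer anyway would crash. */
  int ok = (mprotect(buf, (size_t)page, PROT_READ | PROT_EXEC) == 0);

  munmap(buf, (size_t)page);
  return ok;
}
```

A caller would use a result of 0 as the signal to report a compile failure or switch to an interpreter, rather than calling into memory that was never made executable.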
2023-05-28 | hashx: minor, another logical operator change | Micah Elizabeth Scott
The code style in equix and hashx sometimes uses bitwise operators in place of logical ones in cases where it doesn't really matter either way. This sometimes annoys our static analyzer tools. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
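As a small illustration of why the two operator families are often interchangeable on boolean operands yet still distinct to an analyzer, the sketch below compares them. The function names are hypothetical and the code is not taken from equix or hashx.

```c
#include <assert.h>
#include <stdbool.h>

static int calls;  /* counts evaluations of the right-hand operand */

static bool right_side(void)
{
  calls++;
  return true;
}

/* Bitwise AND on bool operands: both sides are always evaluated. */
static bool both_bitwise(bool a)
{
  return a & right_side();
}

/* Logical AND: the right side is skipped when the left side is false. */
static bool both_logical(bool a)
{
  return a && right_side();
}
```

Both functions return the same value for every input; they differ only in whether right_side() runs when a is false, which matters only if the right-hand operand has side effects.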
2023-05-28 | test_sandbox: equix crypto test case for issue 40794 | Micah Elizabeth Scott
This is an additional test case for test_sandbox that runs a small subset of test_crypto_equix() inside the syscall sandbox, where mprotect() is filtered. It's reasonable for the sandbox to disallow JIT. We could revise this policy if we want, but it seems a good default for now. The problem in issue 40794 is that both equix and hashx need improvements in their API to handle failures after allocation time, and this failure occurs while the hash function is being compiled. With this commit only, the segfault from issue 40794 is reproduced. Subsequent commits will fix the segfault and revise the API. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-25 | changes: Add file for ticket 40797 | David Goulet
Signed-off-by: David Goulet <dgoulet@torproject.org>
2023-05-25 | Forgot about the stub names | friendly73
2023-05-25 | Added relay prefix to new metrics functions | friendly73
2023-05-25 | Fixed enum type not found in relay_stub | friendly73
2023-05-25 | Added void stubs for the relay metrics functions to fix building without relay module | friendly73
2023-05-25 | Fixed new arguments for metrics_store_add | friendly73
2023-05-25 | Removed getter abstraction and moved from rephist to relay_metrics. | friendly73
2023-05-25 | Fixed est intro getter using wrong array | friendly73
2023-05-25 | Fixed REND1 metric label value | friendly73
2023-05-25 | Added INTRO and REND metrics for relay. | friendly73
2023-05-25 | Merge branch 'tor-gitlab/mr/443' | David Goulet
2023-05-25 | Add missing changes file for tor#33669. | Alexander Færøy
See: tpo/core/tor#33669.
2023-05-25 | Restart PT processes when they die on us. | Alexander Færøy
This patch forces a PT reconfigure of infant PT processes as part of the PT process' exit handler. See: tpo/core/tor#33669
2023-05-25 | Log at LD_PT instead of LD_GENERAL for PT process stdout lines. | Alexander Færøy
See: tpo/core/tor#33669
2023-05-25 | Only terminate PT processes that are running. | Alexander Færøy
See: tpo/core/tor#33669
2023-05-25 | Log name of managed proxy in exit handler. | Alexander Færøy
This patch ensures that we can figure out which PT terminated in the PT exit handler. See: tpo/core/tor#33669
2023-05-25 | Log state transitions for Pluggable Transports | Alexander Færøy
This patch makes Tor log state transitions within the PT layer at the info log-level. This should make it easier to figure out if Tor ends up in a strange state. See: tpo/core/tor#33669
2023-05-25 | test: Fix parseconf to account for ClientUseIPv6 change for dirauth disabled | David Goulet
Signed-off-by: David Goulet <dgoulet@torproject.org>
2023-05-25 | test: Fix parseconf to account for ClientUseIPv6 change | David Goulet
Signed-off-by: David Goulet <dgoulet@torproject.org>
2023-05-24 | Merge branch 'tor-gitlab/mr/711' | David Goulet
2023-05-24 | token_bucket_ctr: replace 32-bit wallclock time with monotime | Micah Elizabeth Scott
This started as a response to ticket #40792 where Coverity is complaining about a potential year 2038 bug where we cast time_t from approx_time() to uint32_t for use in token_bucket_ctr.

There was a larger can of worms though, since token_bucket really doesn't want to be using wallclock time here. I audited the call sites for approx_time() and changed any that used a 32-bit cast or made inappropriate use of wallclock time. Things like certificate lifetime, consensus intervals, etc. need wallclock time. Measurements of rates over time, however, are better served with a monotonic timer that does not try and sync with wallclock ever.

Looking closer at token_bucket, its design is a bit odd because it was initially intended for use with tick units but later forked into token_bucket_rw which uses ticks to count bytes per second, and token_bucket_ctr which uses seconds to count slower events. The rates represented by either token bucket can't be lower than 1 per second, so the slower timer in 'ctr' is necessary to represent the slower rates of things like connections or introduction packets or rendezvous attempts.

I considered modifying token_bucket to use 64-bit timestamps overall instead of 32-bit, but that seemed like an unnecessarily invasive change that would grant some peace of mind but probably not help much. I was more interested in removing the dependency on wallclock time. The token_bucket_rw timer already uses monotonic time. This patch converts token_bucket_ctr to use monotonic time as well. It introduces a new monotime_coarse_absolute_sec(), which is currently the same as nsec divided by a billion but could be optimized easily if we ever need to.

This patch also might fix a rollover bug. I haven't tested this extensively but I don't think the previous version of the rollover code on either token bucket was correct, and I would expect it to get stuck after the first rollover.

Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
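The idea of a seconds-based token bucket driven by a monotonic clock can be sketched as follows. This is an illustrative model, not tor's token_bucket_ctr: the names mono_sec, toy_bucket, and the toy_bucket_* functions are hypothetical, and the rollover handling shown (plain unsigned subtraction of 32-bit timestamps) is one common approach rather than the code from this patch.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: monotonic time in whole seconds, computed as
 * nanoseconds divided by a billion. Returns 0 if the clock is unavailable. */
static uint64_t mono_sec(void)
{
  struct timespec ts;
  if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
    return 0;
  uint64_t nsec = (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
  return nsec / 1000000000u;
}

typedef struct {
  uint32_t rate;      /* tokens added per second */
  uint32_t burst;     /* maximum tokens held */
  uint32_t tokens;    /* current level */
  uint32_t last_sec;  /* low 32 bits of the last refill time */
} toy_bucket;

static void toy_bucket_init(toy_bucket *b, uint32_t rate, uint32_t burst,
                            uint32_t now_sec)
{
  assert(b != NULL);
  b->rate = rate;
  b->burst = burst;
  b->tokens = burst;
  b->last_sec = now_sec;
}

static void toy_bucket_refill(toy_bucket *b, uint32_t now_sec)
{
  /* Unsigned subtraction yields the elapsed time even across a 32-bit
   * rollover, as long as fewer than 2^32 seconds have passed. */
  uint32_t elapsed = now_sec - b->last_sec;
  uint64_t level = (uint64_t)b->tokens + (uint64_t)elapsed * b->rate;
  b->tokens = level > b->burst ? b->burst : (uint32_t)level;
  b->last_sec = now_sec;
}

/* Returns 1 and removes n tokens if available, otherwise returns 0. */
static int toy_bucket_take(toy_bucket *b, uint32_t n)
{
  if (b->tokens < n)
    return 0;
  b->tokens -= n;
  return 1;
}
```

A caller would pass (uint32_t)mono_sec() as now_sec. Because the monotonic clock never jumps to follow wallclock adjustments, the measured rate is unaffected by time synchronization.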
2023-05-24 | Merge branch 'tor-gitlab/mr/709' | David Goulet
2023-05-24 | Merge branch 'tor-gitlab/mr/710' | David Goulet
2023-05-24 | test_hs_descriptor: Add a test case that fails without the fix for 40793 | Micah Elizabeth Scott
This adds a bit more to hs_descriptor/test_decode_descriptor, mostly testing pow-params and triggering the tor_assert() in issue #40793. There was no mechanism for adding arbitrary test strings to the encrypted portion of the desc without duplicating encode logic. One option might be to publicize get_inner_encrypted_layer_plaintext enough to add a mock implementation. In this patch I opt for what seems like the simplest solution, at the cost of a small amount of #ifdef noise. The unpacked descriptor grows a new test-only member that's used for dropping arbitrary data in at encode time. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-24 | Merge branch 'tor-gitlab/mr/708' | David Goulet
2023-05-24 | ipv6: Flip ClientUseIPv6 to 1 | agowa338
Fixes #40785 Signed-off-by: David Goulet <dgoulet@torproject.org>
2023-05-24 | metrics: Add ticket 40546 changes file and code fix | David Goulet
The MR was using an old function definition so the code fix is for that. Closes #40546 Signed-off-by: David Goulet <dgoulet@torproject.org>
2023-05-24 | Merge branch 'tor-gitlab/mr/698' | David Goulet
2023-05-24 | Merge branch 'tor-gitlab/mr/703' | David Goulet
2023-05-15 | hs_pow: fix insufficient length check in pow-params | Micah Elizabeth Scott
The descriptor validation table had an out of date minimum length for pow-params (3) whereas the spec and the current code expect at least 4 parameters. This was an opportunity for a malicious service to cause an assert failure in clients which attempted to parse its descriptor. Addresses issue #40793 Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
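The class of bug described above, a validation rule that admits fewer arguments than the parser later reads, can be sketched like this. The types and functions (toy_rule, toy_check_args, toy_parse_pow_params) are hypothetical and do not reflect tor's descriptor parsing code; only the keyword and the minimum of 4 come from the message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical rule entry: a keyword and its minimum argument count. */
typedef struct {
  const char *keyword;
  size_t min_args;
} toy_rule;

/* The minimum must match what the parser reads: 4 parameters, not 3. */
static const toy_rule pow_params_rule = { "pow-params", 4 };

/* Returns 0 if the line has enough arguments to parse safely, else -1. */
static int toy_check_args(const toy_rule *rule, size_t n_args)
{
  assert(rule != NULL);
  return n_args >= rule->min_args ? 0 : -1;
}

/* Rejects short input instead of asserting on a missing argument. */
static int toy_parse_pow_params(const char *const *args, size_t n_args,
                                const char **fourth_out)
{
  if (toy_check_args(&pow_params_rule, n_args) < 0)
    return -1;
  *fourth_out = args[3];  /* safe: at least 4 arguments are present */
  return 0;
}
```

With the rule's minimum set to 3, a three-argument line would pass validation and the read of args[3] would be out of bounds; keeping the table in step with the parser turns that into an ordinary parse error.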
2023-05-11 | Add changes file for conflux. | Mike Perry
2023-05-11 | test_crypto: avoid memory leak in some hashx test failures | Micah Elizabeth Scott
This should fix one of the warnings in issue #40792. I was sloppy with freeing memory in the failure cases for test_crypto_hashx. ASAN didn't notice but coverity did. Okay, I'll eat my vegetables and put hashx_ctx's deinit in an upper scope and use 'goto done' correctly like a properly diligent C programmer. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
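The 'goto done' cleanup idiom mentioned above looks roughly like the sketch below. It is a generic illustration with hypothetical names (toy_alloc, toy_free, toy_run_checks), not the code from test_crypto_hashx.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static int toy_live_allocs;  /* outstanding allocations, for the demo only */

static void *toy_alloc(size_t n)
{
  void *p = malloc(n);
  if (p)
    toy_live_allocs++;
  return p;
}

static void toy_free(void *p)
{
  if (p) {
    toy_live_allocs--;
    free(p);
  }
}

/* Returns 0 on success, -1 on failure; frees everything on both paths. */
static int toy_run_checks(int fail_midway)
{
  int rv = -1;
  char *ctx = NULL;      /* declared in the outer scope so 'done' can free it */
  char *scratch = NULL;

  ctx = toy_alloc(16);
  if (!ctx)
    goto done;
  scratch = toy_alloc(16);
  if (!scratch)
    goto done;
  if (fail_midway)
    goto done;           /* simulated test failure */
  memset(scratch, 0, 16);
  rv = 0;

 done:
  toy_free(scratch);
  toy_free(ctx);
  return rv;
}
```

Because every exit goes through the single done label and the pointers start as NULL, a failure at any step releases exactly what was allocated so far.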
2023-05-11 | equix: avoid a coverity warning in hashx_alloc() | Micah Elizabeth Scott
This addresses one of the warnings in issue #40792. As far as I can tell this is a false positive, since the use of "ctx->type" in hashx_free() can only be hit after the unioned code/program pointer is non-NULL. It's no big deal to zero this value explicitly to silence the warning though. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-11 | Add torrc option for conflux client UX. | Mike Perry
2023-05-11 | Fix unit tests. | Mike Perry
2023-05-11 | Clean up UX decision logic; hardcode for browser UX case. | Mike Perry
2023-05-11 | fix minor typos in conflux and pow areas | Roger Dingledine
2023-05-10 | Clean up and disable switch rate limiting. | Mike Perry
Switch rate limiting will likely be helpful for limiting OOQ, but according to shadow it was the cause of slower performance in Hong Kong endpoints. So let's disable it, and then optimize for OOQ later.
2023-05-10 | Remove two conflux algs: maxrate and cwndrate. | Mike Perry
Maxrate had slower throughput than lowrtt in Shadow, which is not too surprising. We just wanted to test it.
2023-05-10 | hs_pow: Modify challenge format, include blinded HS id | Micah Elizabeth Scott
This is a protocol breaking change that implements nickm's changes to prop 327 to add an algorithm personalization string and blinded HS id to the EquiX challenge string for our onion service client puzzle. This corresponds with the spec changes in torspec!130, and it fixes a proposed vulnerability documented in ticket tor#40789. Clients and services prior to this patch will no longer be compatible with the proposed "v1" proof-of-work protocol. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-10 | hs_pow: add per-circuit effort information to control port | Micah Elizabeth Scott
This lets controller apps see the outgoing PoW effort on client circuits, and the validated effort received on an incoming service circuit. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-10 | hs_pow: fix error path with outdated assumption | Micah Elizabeth Scott
This error path with the "PoW cpuworker returned with no solution. Will retry soon." message was usually lying. It's concerning now because we expect to always find a solution no matter how long it takes, rather than re-enter the solver repeatedly, so any exit without a solution is a sign of a problem. In fact when this error path gets hit, we are usually missing a circuit instead because the request is quite old and the circuits have been destroyed. This is not an emergency, it's just a sign of client-side overload. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-10 | hs_pow: swap out some comments | Micah Elizabeth Scott
I think we're done with these? And swap in a nonfatal assert to replace one of the comments. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-10 | hs_pow: always give other events a chance to run between rend requests | Micah Elizabeth Scott
This dequeue path has been through a few revisions by now, first limiting us to a fixed number per event loop callback, then an additional limit based on a token bucket, then the current version which has only the token bucket. The thinking behind processing multiple requests per callback was to optimize our usage of libevent, but in effect this creates a prioritization problem. I think even a small fixed limit would be less reliable than just backing out this optimization and always allowing other callbacks to interrupt us in-between dequeues. With this patch I'm seeing much smoother queueing behavior when I add artificial delays to the main thread in testing. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>