path: root/src/feature/hs/hs_circuit.c
Age | Commit message | Author
2023-05-24 | token_bucket_ctr: replace 32-bit wallclock time with monotime | Micah Elizabeth Scott
This started as a response to ticket #40792 where Coverity is complaining about a potential year 2038 bug where we cast time_t from approx_time() to uint32_t for use in token_bucket_ctr. There was a larger can of worms though, since token_bucket really doesn't want to be using wallclock time here. I audited the call sites for approx_time() and changed any that used a 32-bit cast or made inappropriate use of wallclock time. Things like certificate lifetime, consensus intervals, etc. need wallclock time. Measurements of rates over time, however, are better served with a monotonic timer that does not try and sync with wallclock ever. Looking closer at token_bucket, its design is a bit odd because it was initially intended for use with tick units but later forked into token_bucket_rw which uses ticks to count bytes per second, and token_bucket_ctr which uses seconds to count slower events. The rates represented by either token bucket can't be lower than 1 per second, so the slower timer in 'ctr' is necessary to represent the slower rates of things like connections or introduction packets or rendezvous attempts. I considered modifying token_bucket to use 64-bit timestamps overall instead of 32-bit, but that seemed like an unnecessarily invasive change that would grant some peace of mind but probably not help much. I was more interested in removing the dependency on wallclock time. The token_bucket_rw timer already uses monotonic time. This patch converts token_bucket_ctr to use monotonic time as well. It introduces a new monotime_coarse_absolute_sec(), which is currently the same as nsec divided by a billion but could be optimized easily if we ever need to. This patch also might fix a rollover bug.. I haven't tested this extensively but I don't think the previous version of the rollover code on either token bucket was correct, and I would expect it to get stuck after the first rollover. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
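The conversion described in this commit can be sketched in C roughly as follows. This is a minimal illustration, not tor's token_bucket_ctr code: the type `ctr_bucket_sketch_t` and every function name here are hypothetical, and the caller is assumed to pass in a reading from some monotonic nanosecond clock. It shows the nsec-to-seconds division and a refill step whose unsigned subtraction stays correct when the 32-bit seconds counter rolls over.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical counter-style bucket; field names are illustrative only. */
typedef struct {
  uint32_t rate;      /* tokens added per second */
  uint32_t burst;     /* maximum number of tokens held */
  uint32_t tokens;    /* tokens currently available */
  uint32_t last_sec;  /* monotonic second of the last refill */
} ctr_bucket_sketch_t;

/* Monotonic nanoseconds to whole seconds, truncated to 32 bits. */
uint32_t mono_nsec_to_sec32(uint64_t mono_nsec)
{
  return (uint32_t)(mono_nsec / UINT64_C(1000000000));
}

void ctr_bucket_init(ctr_bucket_sketch_t *b, uint32_t rate,
                     uint32_t burst, uint32_t now_sec)
{
  assert(rate > 0 && burst > 0);
  b->rate = rate;
  b->burst = burst;
  b->tokens = burst;
  b->last_sec = now_sec;
}

void ctr_bucket_refill(ctr_bucket_sketch_t *b, uint32_t now_sec)
{
  /* Unsigned subtraction is modulo 2^32, so the elapsed time is still
   * right after the 32-bit counter wraps, provided the clock is
   * monotonic and less than 2^32 seconds have really passed. */
  uint32_t elapsed = now_sec - b->last_sec;
  uint64_t total = (uint64_t)b->tokens + (uint64_t)elapsed * b->rate;
  b->tokens = total > b->burst ? b->burst : (uint32_t)total;
  b->last_sec = now_sec;
}

/* Returns 1 and consumes a token if one is available, else 0. */
int ctr_bucket_take(ctr_bucket_sketch_t *b, uint32_t now_sec)
{
  ctr_bucket_refill(b, now_sec);
  if (b->tokens == 0)
    return 0;
  b->tokens--;
  return 1;
}
```

The intermediate sum is kept in 64 bits so that a long idle period multiplied by a high rate cannot overflow before it is clamped to the burst size.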
2023-05-10 | hs_pow: Modify challenge format, include blinded HS id | Micah Elizabeth Scott
This is a protocol breaking change that implements nickm's changes to prop 327 to add an algorithm personalization string and blinded HS id to the EquiX challenge string for our onion service client puzzle. This corresponds with the spec changes in torspec!130, and it fixes a proposed vulnerability documented in ticket tor#40789. Clients and services prior to this patch will no longer be compatible with the proposed "v1" proof-of-work protocol. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-10 | hs_pow: add per-circuit effort information to control port | Micah Elizabeth Scott
This lets controller apps see the outgoing PoW effort on client circuits, and the validated effort received on an incoming service circuit. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-10 | hs_pow: always give other events a chance to run between rend requests | Micah Elizabeth Scott
This dequeue path has been through a few revisions by now, first limiting us to a fixed number per event loop callback, then an additional limit based on a token bucket, then the current version which has only the token bucket. The thinking behind processing multiple requests per callback was to optimize our usage of libevent, but in effect this creates a prioritization problem. I think even a small fixed limit would be less reliable than just backing out this optimization and always allowing other callbacks to interrupt us in-between dequeues. With this patch I'm seeing much smoother queueing behavior when I add artificial delays to the main thread in testing. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-10 | hs_pow: modified approach to pqueue level thresholds | Micah Elizabeth Scott
This centralizes the logic for deciding on these magic thresholds, and tries to reduce them to just two: a min and max. The min should be a "nearly empty" threshold, indicating that the queue only contains work we expect to be able to complete very soon. The max level triggers a bulk culling process that reduces the queue to half that amount. This patch calculates both thresholds based on the torrc pqueue rate settings if they're present, and uses generic defaults if the user asked for an unlimited dequeue rate in torrc. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-10 | hs_pow: review feedback, use MAX for max_trimmed_effort | Micah Elizabeth Scott
Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-10 | hs_pow: Lower several logs from notice to info | Micah Elizabeth Scott
Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-10 | hs_pow: client side effort adjustment | Micah Elizabeth Scott
The goal of this patch is to add an additional mechanism for adjusting PoW effort upwards, where clients rather than services can choose to solve their puzzles at a higher effort than what was suggested in the descriptor. I wanted to use hs_cache's existing unreachability stats to drive this effort bump, but this revealed some cases where a circuit (intro or rend) closed early on can end up in hs_cache with an all zero intro point key, where nobody will find it. This moves intro_auth_pk initialization earlier in a couple places and adds nonfatal asserts to catch the problem if it shows up elsewhere. The actual effort adjustment method I chose is to multiply the suggested effort by (1 + unresponsive_count), then ensure the result is at least 1. If a service has suggested effort of 0 but we fail to connect, retries will all use an effort of 1. If the suggestion was 50, we'll try 50, 100, 150, 200, etc. This is bounded both by our client effort limit and by the limit on unresponsive_count (currently 5). Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
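The adjustment rule in this commit message can be written out as a short C sketch. The function name `client_effort_for_attempt` and its parameters are hypothetical and are not taken from hs_circuit.c. The sketch also assumes the bump is applied only once `unresponsive_count` is nonzero, which is how the message's "retries will all use an effort of 1" is read here, and that the client effort limit is passed in by the caller.

```c
#include <assert.h>
#include <stdint.h>

/* Effort a client might use for its next attempt, following the rule
 * suggested * (1 + unresponsive_count), floored at 1 and capped at the
 * client's own limit. All names are illustrative. */
uint32_t client_effort_for_attempt(uint32_t suggested,
                                   unsigned unresponsive_count,
                                   uint32_t client_limit)
{
  assert(client_limit >= 1);

  uint64_t effort = suggested;
  if (unresponsive_count > 0) {
    /* 64-bit arithmetic so the product cannot overflow. */
    effort = (uint64_t)suggested * ((uint64_t)1 + unresponsive_count);
    if (effort < 1)
      effort = 1;
  }
  if (effort > client_limit)
    effort = client_limit;
  return (uint32_t)effort;
}
```

With a suggested effort of 50 and a generous limit, successive unresponsive counts of 0, 1, 2 and 3 give 50, 100, 150 and 200, matching the sequence in the commit message. A suggested effort of 0 gives 1 on every retry.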
2023-05-10 | hs_pow: leak fix, free the contents of pqueue entries in hs_pow_free_service_state | Micah Elizabeth Scott
ASan catches this pretty readily when ending a service gracefully while a DoS is in progress and the queue is full of items that haven't yet timed out. The module boundaries in hs_circuit are quite fuzzy here, but I'm trying to follow the vibe of the existing hs_pow code. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-10 | hs_pow: stop having a "minimum effort", and let PoW effort start low | Micah Elizabeth Scott
I don't think the concept of "minimum effort" is really useful to us, so this patch removes it entirely and consequentially changes the way that "total" effort is calculated so that we don't rely on any minimum and we instead ramp up effort no faster than necessary. If at least some portion of the attack is conducted by clients that avoid PoW or provide incorrect solutions, those (potentially very cheap) attacks will end up keeping the pqueue full. Prior to this patch, that would cause suggested efforts to be unnecessarily high, because rounding these very cheap requests up to even a minimum of 1 will overestimate how much actual attack effort is being spent. The result is that this patch is a simplification and it also allows a slower start, where PoW effort jumps up either by a single unit or by an amount calculated from actual effort in the queue. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-10 | hs_pow: Make proof-of-work support optional in configure | Micah Elizabeth Scott
This adds a new "pow" module for the user-visible proof of work support in ./configure, and this disables src/feature/hs/hs_pow at compile-time. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-10 | hs_pow: Rate limited dequeue | Micah Elizabeth Scott
This adds a token bucket ratelimiter on the dequeue side of hs_pow's priority queue. It adds config options and docs for those options. (HiddenServicePoWQueueRate/Burst) I'm testing this as a way to limit the overhead of circuit creation when we're experiencing a flood of rendezvous requests. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-10 | hs_metrics: Proof of Work pqueue depth, suggested effort | Micah Elizabeth Scott
Adds two new metrics for hs_pow, and an internal parameter within hs_metrics for implementing gauge parameters that reset before every update. Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-05-10 | update_suggested_effort: avoid assert if the pqueue has emptied | Micah Elizabeth Scott
top_of_rend_pqueue_is_worthwhile requires a nonempty queue.
2023-05-10 | Implement AIMD effort estimation. | Mike Perry
Now, pow should auto-enable and auto-disable itself.
2023-05-10 | Replace the constant bottom-half rate with handled count. | Mike Perry
This allows us to more accurately estimate effort, based on real bottom-half throughput over the duration of a descriptor update.
2023-05-10 | Make the thing compile. | Mike Perry
2023-05-10 | sort pqueue ties by time-added | Roger Dingledine
our pqueue implementation does bizarre unspecified things with ordering of elements that are equal. it certainly doesn't do any sort of "first in first out" property that i was expecting. now make it explicit by saying that "equal-effort, added-earlier" is higher priority.
2023-05-10 | rate-limit low-effort rendezvous responses | Roger Dingledine
specifically, if we have 16 in-flight rend circs, and the next one at the top of the pqueue is lower than our suggested effort, then don't launch it yet. this way we always launch adequate-effort requests immediately, and we always handle *some* low-effort requests, but we are ready at any moment to handle a few new adequate-effort requests.
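The launch decision described in this commit reduces to a small predicate, sketched below. The function `may_launch_top`, the constant name, and the representation of the queue as an array with its best entry at index 0 are all assumptions of this example. Treating "16 in-flight" as "16 or more" is likewise an interpretation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative constant for the in-flight threshold named in the text. */
#define MAX_IN_FLIGHT_SKETCH 16

/* efforts[0] is taken to be the effort of the request at the top of the
 * queue. Returns 1 if that request may be launched now, 0 if it should
 * wait. The queue must be nonempty. */
int may_launch_top(const uint32_t *efforts, size_t n,
                   unsigned in_flight, uint32_t suggested_effort)
{
  assert(efforts != NULL && n > 0);

  uint32_t top_effort = efforts[0];
  if (in_flight >= MAX_IN_FLIGHT_SKETCH && top_effort < suggested_effort)
    return 0;  /* busy, and the best queued request is low-effort */
  return 1;
}
```

Requests at or above the suggested effort are never held back by this check, and low-effort requests are still served whenever fewer than 16 circuits are in flight.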
2023-05-10 | make the rend_pqueue_cb event be postloop | Roger Dingledine
this change makes us reach the callback *after* each mainloop run, rather than as the next event to run immediately after activation. with the old behavior, we were starving everything else to drain the pqueue entirely, each time we got a new intro2 cell. now we at least will get to other activities as well.
2023-05-10 | track how many in-flight hs-side rend circs | Roger Dingledine
not used in decision-making yet, but it's all ready to use in a "don't dequeue any more if we have too many in-flight" kind of way
2023-05-10 | we were sorting our pqueue the wrong way | Roger Dingledine
i.e. we were putting higher effort intro2 cells at the *end*
2023-05-10 | bump up some log messages for easier debugging | Roger Dingledine
2023-05-10 | new design for handling too many pending rend reqs | Roger Dingledine
now we let ourselves queue up to twice as many as we expect, and when we get to the limit we make a new pqueue and move over the first n elements that we like most. (the old approach, of calling SMARTLIST_DEL_CURRENT_KEEPORDER() on elements in a pqueue, will destroy its heapify property.) we also discard elements that are too old, either during the trimming process or if they come up as the next request to respond to. lastly, fix a fencepost error on how many rend reqs we would handle per iteration.
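The trimming idea can be illustrated with the sketch below. It swaps in a plain array and `qsort` for tor's smartlist-based pqueue, so it shows the "drop the stale ones, keep the n we like most" logic rather than the actual rebuild of a heap. The type `queued_req_sketch_t`, the function names, and the `max_age` parameter are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Illustrative queue entry; not tor's real structure. */
typedef struct {
  uint32_t effort;    /* validated PoW effort */
  uint64_t added_at;  /* enqueue time, same units as `now` */
} queued_req_sketch_t;

/* Higher effort first; equal effort falls back to earlier arrival. */
static int by_preference(const void *pa, const void *pb)
{
  const queued_req_sketch_t *a = pa, *b = pb;
  if (a->effort != b->effort)
    return a->effort > b->effort ? -1 : 1;
  if (a->added_at != b->added_at)
    return a->added_at < b->added_at ? -1 : 1;
  return 0;
}

/* Compacts reqs[0..n) in place: entries older than max_age are dropped,
 * the rest are ordered by preference, and at most keep_n survive.
 * Returns the new length; entries past it are no longer meaningful. */
size_t trim_requests(queued_req_sketch_t *reqs, size_t n, size_t keep_n,
                     uint64_t now, uint64_t max_age)
{
  size_t kept = 0;
  for (size_t i = 0; i < n; i++) {
    assert(reqs[i].added_at <= now);
    if (now - reqs[i].added_at <= max_age)
      reqs[kept++] = reqs[i];
  }
  qsort(reqs, kept, sizeof(reqs[0]), by_preference);
  return kept < keep_n ? kept : keep_n;
}
```

Building the trimmed set in one pass avoids deleting individual elements from the middle of a heap, which is the operation the commit message warns about.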
2023-05-10 | pass time around as a parameter | Roger Dingledine
should help with unit testing
2023-05-10 | hs: Maximum rend request and trimming of the queue | David Goulet
Signed-off-by: David Goulet <dgoulet@torproject.org>
2023-05-10 | hs: Handle multiple rend request per mainloop run | David Goulet
Signed-off-by: David Goulet <dgoulet@torproject.org>
2023-05-10 | hs: Priority queue for rendezvous requests | David Goulet
If PoW is enabled, use a priority queue ordered by effort for the rendezvous requests, hooked into the mainloop. Signed-off-by: David Goulet <dgoulet@torproject.org>
2023-05-10 | hs: Move rendezvous circuit data structure | David Goulet
When parsing an INTRODUCE2 cell, we extract data in order to launch the rendezvous circuit. This commit creates a data structure just for that data so it can be used by future commits for prop327 in order to copy that data over a priority queue instead of the whole intro data structure, which contains pointers that could disappear. Signed-off-by: David Goulet <dgoulet@torproject.org>
2023-05-10 | hs: Client now solve PoW if present | David Goulet
At this commit, the tor main loop solves it. We might consider moving this to the CPU pool at some point. Signed-off-by: David Goulet <dgoulet@torproject.org>
2023-03-07 | metrics: Add a `reason` label to the HS error metrics. | Gabriela Moldovan
This adds a `reason` label to the `hs_intro_rejected_intro_req_count` and `hs_rdv_error_count` metrics introduced in #40755. Metric lookup and initialization are now a bit more involved. This may be fine for now, but it will become unwieldy if/when we add more labels (and as such will need to be refactored). Also, in the future, we may want to introduce finer grained `reason` labels. For example, the `invalid_introduce2` label actually covers multiple types of errors that can happen during the processing of an INTRODUCE2 cell (such as cell parse errors, replays, decryption errors). Signed-off-by: Gabriela Moldovan <gabi@torproject.org>
2023-02-16 | metrics: Add metrics for rendezvous and introduction request failures. | Gabriela Moldovan
This introduces a couple of new service side metrics:
* `hs_intro_rejected_intro_req_count`, which counts the number of introduction requests rejected by the hidden service
* `hs_rdv_error_count`, which counts the number of rendezvous errors as seen by the hidden service (this number includes the number of circuit establishment failures, failed retries, end-to-end circuit setup failures)
Closes #40755. This partially addresses #40717. Signed-off-by: Gabriela Moldovan <gabi@torproject.org>
2022-10-26 | Merge remote-tracking branch 'tor-gitlab/mr/638' | David Goulet
2022-10-24 | hs: Retry rdv circuit if repurposed | David Goulet
This can happen if our measurement subsystem decides to snatch it. Fixes #40696 Signed-off-by: David Goulet <dgoulet@torproject.org>
2022-10-19 | hs: Retry service rendezvous on circuit close | David Goulet
Move the retry from circuit_expire_building() to when the offending circuit is being closed. Fixes #40695 Signed-off-by: David Goulet <dgoulet@torproject.org>
2022-10-19 | circ: Get rid of hs_circ_has_timed_out | David Goulet
Logic is too convoluted and we can't efficiently apply a specific timeout depending on the purpose. Remove it and instead rely on the right circuit cutoff instead of keeping this flagged circuit open forever. Part of #40694 Signed-off-by: David Goulet <dgoulet@torproject.org>
2022-03-16 | hs: Helper function to setup congestion control | David Goulet
We had 3 callsites setting up the circuit congestion control and so this commit consolidates all 3 calls into 1 function. Related to #40586 Signed-off-by: David Goulet <dgoulet@torproject.org>
2022-03-16 | hs: Transfer ccontrol from circuit to cpath | David Goulet
Once the cpath is finalized, e2e encryption setup, transfer the ccontrol from the rendezvous circuit to the cpath. This allows the congestion control subsystem to properly function for both upload and download side of onion services. Closes #40586 Signed-off-by: David Goulet <dgoulet@torproject.org>
2022-02-23 | Properly initialize the cc_enabled field in hs intro data. | Mike Perry
2022-02-22 | Use path type hint for Vegas queue parameters. | Mike Perry
These parameters will vary depending on path length, especially for onions.
2022-02-22 | hs: Setup congestion control on service rends using intro data | David Goulet
Signed-off-by: David Goulet <dgoulet@torproject.org>
2022-02-22 | hs: Build INTRODUCE extension in the encrypted section | David Goulet
Signed-off-by: David Goulet <dgoulet@torproject.org>
2022-02-22 | trunnel: Make hs/cell_common.trunnel generic | David Goulet
Move it to extension.trunnel instead so that extension ABI construction can be used in other parts of tor than just HS cells. Specifically, we'll use it in the ntorv3 data payload and make a congestion control parameter extension using that binary structure. Only rename. No code behavior changes. Signed-off-by: David Goulet <dgoulet@torproject.org>
2021-03-12 | Update copyrights to 2021, using "make update-copyright" | Nick Mathewson
2021-02-19 | hs-v2: Removal of service and relay support | David Goulet
This is unfortunately massive but both functionalities were extremely intertwined and it would have required us to actually change the HSv2 code in order to be able to split this into multiple commits. After this commit, there are still artefacts of v2 in the code but there is no more support for service, intro point and HSDir. The v2 support for rendezvous circuit is still available since that code is the same for the v3 and we will leave it in so if a client is able to rendezvous on v2 then it can still transfer traffic. Once the entire network has moved away from v2, we can remove v2 rendezvous point support. Related to #40266 Signed-off-by: David Goulet <dgoulet@torproject.org>
2021-02-19 | hs-v2: Remove client support | David Goulet
Related to #40266 Signed-off-by: David Goulet <dgoulet@torproject.org>
2020-12-24 | Downgrade the severity of a few rendezvous circuit-related warnings. | Neel Chauhan
2020-11-12 | Fix typos. | Samanta Navarro
Typos found with codespell. Please keep in mind that this change has an impact on actual code and must be carefully evaluated:
src/core/or/lttng_circuit.inc
- ctf_enum_value("CONTROLER", CIRCUIT_PURPOSE_CONTROLLER)
+ ctf_enum_value("CONTROLLER", CIRCUIT_PURPOSE_CONTROLLER)
2020-10-27 | hs: Collect rendezvous circuit metrics | David Goulet
Collect the total number of rendezvous circuits created and the number of established ones; the latter is a gauge that is decremented so that the count stays up to date. Related to #40063 Signed-off-by: David Goulet <dgoulet@torproject.org>
2020-07-02 | Extract extend_info manipulation functions into a new file. | Nick Mathewson