Age | Commit message (Collapse) | Author |
|
|
|
Bug found and fixed by @hyunsoo.kim676.
|
|
This started as a response to ticket #40792 where Coverity is
complaining about a potential year 2038 bug where we cast time_t from
approx_time() to uint32_t for use in token_bucket_ctr.
There was a larger can of worms though, since token_bucket really
doesn't want to be using wallclock time here. I audited the call sites
for approx_time() and changed any that used a 32-bit cast or made
inappropriate use of wallclock time. Things like certificate lifetime,
consensus intervals, etc. need wallclock time. Measurements of rates
over time, however, are better served with a monotonic timer that does
not try and sync with wallclock ever.
Looking closer at token_bucket, its design is a bit odd because it was
initially intended for use with tick units but later forked into
token_bucket_rw which uses ticks to count bytes per second, and
token_bucket_ctr which uses seconds to count slower events. The rates
represented by either token bucket can't be lower than 1 per second, so
the slower timer in 'ctr' is necessary to represent the slower rates of
things like connections or introduction packets or rendezvous attempts.
I considered modifying token_bucket to use 64-bit timestamps overall
instead of 32-bit, but that seemed like an unnecessarily invasive change
that would grant some peace of mind but probably not help much. I was
more interested in removing the dependency on wallclock time. The
token_bucket_rw timer already uses monotonic time. This patch converts
token_bucket_ctr to use monotonic time as well. It introduces a new
monotime_coarse_absolute_sec(), which is currently the same as nsec
divided by a billion but could be optimized easily if we ever need to.
This patch also might fix a rollover bug.. I haven't tested this
extensively but I don't think the previous version of the rollover code
on either token bucket was correct, and I would expect it to get stuck
after the first rollover.
Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
|
|
This is a protocol breaking change that implements nickm's
changes to prop 327 to add an algorithm personalization string
and blinded HS id to the EquiX challenge string for our onion
service client puzzle.
This corresponds with the spec changes in torspec!130,
and it fixes a proposed vulnerability documented in
ticket tor#40789.
Clients and services prior to this patch will no longer
be compatible with the proposed "v1" proof-of-work protocol.
Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
|
|
i think we're done with these?
and swap in a nonfatal assert to replace one of the comments.
Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
|
|
This centralizes the logic for deciding on these magic thresholds,
and tries to reduce them to just two: a min and max. The min should be a
"nearly empty" threshold, indicating that the queue only contains work
we expect to be able to complete very soon. The max level triggers a
bulk culling process that reduces the queue to half that amount.
This patch calculates both thresholds based on the torrc pqueue rate
settings if they're present, and uses generic defaults if the user asked
for an unlimited dequeue rate in torrc.
Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
|
|
Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
|
|
This is trying to be an AIMD event-driven algorithm, but we ended up with
two different add paths with diverging behavior. This fix makes the AIMD
events more explicit, and it fixes an earlier behavior where the effort
could be decreased (by the add/recalculate branch) even when the pqueue
was not emptying at all. With this patch we shouldn't drop down to an
effort of zero as long as even low-effort attacks are flooding the
pqueue.
Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
|
|
I don't think the concept of "minimum effort" is really useful to us,
so this patch removes it entirely and consequentially changes the way
that "total" effort is calculated so that we don't rely on any minimum
and we instead ramp up effort no faster than necessary.
If at least some portion of the attack is conducted by clients that
avoid PoW or provide incorrect solutions, those (potentially very
cheap) attacks will end up keeping the pqueue full. Prior to this patch,
that would cause suggested efforts to be unnecessarily high, because
rounding these very cheap requests up to even a minimum of 1 will
overestimate how much actual attack effort is being spent.
The result is that this patch is a simplification and it also allows a
slower start, where PoW effort jumps up either by a single unit or by an
amount calculated from actual effort in the queue.
Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
|
|
This is more consistent with the specification, and it's much
less confusing with endianness. This resolves the underlying
cause of the earlier byte-swap. This patch itself does not
change the wire protocol at all, it's just tidying up the
types we use at the trunnel layer.
Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
|
|
This adds a new "pow" module for the user-visible proof
of work support in ./configure, and this disables
src/feature/hs/hs_pow at compile-time.
Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
|
|
This adds a token bucket ratelimiter on the dequeue side
of hs_pow's priority queue. It adds config options and docs
for those options. (HiddenServicePoWQueueRate/Burst)
I'm testing this as a way to limit the overhead of circuit
creation when we're experiencing a flood of rendezvous requests.
Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
|
|
Adds two new metrics for hs_pow, and an internal parameter within
hs_metrics for implementing gauge parameters that reset before
every update.
Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
|
|
top_of_rend_pqueue_is_worthwhile requires a nonempty queue.
|
|
First (both client and service), make descriptor parsing not fail when
suggested_effort is 0.
Second (client side), if we get a descriptor with a pow_params section
but with suggested_effort of 0, treat it as not requiring a pow.
Third (service side), when deciding whether the suggested effort has
changed, don't treat "previous suggested effort 0, new suggested effort 0"
as a change.
An alternative design to resolve 'first' and 'second' above would be
to omit the pow_params from the descriptor when suggested_effort is 0,
so clients never see the pow_params so they don't compute a pow. But
I decided to include a pow_params with an explicit suggested_effort
of 0, since this way the client knows the seed etc so they can solve
a higher-effort pow if they want. The tradeoff is that the descriptor
reveals whether HiddenServicePoWDefensesEnabled is set to 1 for this onion
service, even if the AIMD calculation is currently requiring effort 0.
|
|
If it works correctly, auto-tuning should set a non-zero effort once
an attack begins.
|
|
Now, pow should auto-enable and auto-disable itself.
|
|
This allows us to more accurately estimate effort, based on real bottom-half
throughput over the duration of a descriptor update.
|
|
|
|
Signed-off-by: David Goulet <dgoulet@torproject.org>
|
|
This adds 2 histogram metrics for hidden services:
* `tor_hs_rend_circ_build_time` - the rendezvous circuit build time in milliseconds
* `tor_hs_intro_circ_build_time` - the introduction circuit build time in milliseconds
The text representation representation of the new metrics looks like this:
```
# HELP tor_hs_rend_circ_build_time The rendezvous circuit build time in milliseconds
# TYPE tor_hs_rend_circ_build_time histogram
tor_hs_rend_circ_build_time_bucket{onion="<elided>",le="1000.00"} 2
tor_hs_rend_circ_build_time_bucket{onion="<elided>",le="5000.00"} 10
tor_hs_rend_circ_build_time_bucket{onion="<elided>",le="10000.00"} 10
tor_hs_rend_circ_build_time_bucket{onion="<elided>",le="30000.00"} 10
tor_hs_rend_circ_build_time_bucket{onion="<elided>",le="60000.00"} 10
tor_hs_rend_circ_build_time_bucket{onion="<elided>",le="+Inf"} 10
tor_hs_rend_circ_build_time_sum{onion="<elided>"} 10824
tor_hs_rend_circ_build_time_count{onion="<elided>"} 10
# HELP tor_hs_intro_circ_build_time The introduction circuit build time in milliseconds
# TYPE tor_hs_intro_circ_build_time histogram
tor_hs_intro_circ_build_time_bucket{onion="<elided>",le="1000.00"} 0
tor_hs_intro_circ_build_time_bucket{onion="<elided>",le="5000.00"} 6
tor_hs_intro_circ_build_time_bucket{onion="<elided>",le="10000.00"} 6
tor_hs_intro_circ_build_time_bucket{onion="<elided>",le="30000.00"} 6
tor_hs_intro_circ_build_time_bucket{onion="<elided>",le="60000.00"} 6
tor_hs_intro_circ_build_time_bucket{onion="<elided>",le="+Inf"} 6
tor_hs_intro_circ_build_time_sum{onion="<elided>"} 9843
tor_hs_intro_circ_build_time_count{onion="<elided>"} 6
```
Signed-off-by: Gabriela Moldovan <gabi@torproject.org>
|
|
This adds a `reason` label to the `hs_intro_rejected_intro_req_count` and
`hs_rdv_error_count` metrics introduced in #40755.
Metric look up and intialization is now more a bit more involved. This may be
fine for now, but it will become unwieldy if/when we add more labels (and as
such will need to be refactored).
Also, in the future, we may want to introduce finer grained `reason` labels.
For example, the `invalid_introduce2` label actually covers multiple types of
errors that can happen during the processing of an INTRODUCE2 cell (such as
cell parse errors, replays, decryption errors).
Signed-off-by: Gabriela Moldovan <gabi@torproject.org>
|
|
This introduces a couple of new service side metrics:
* `hs_intro_rejected_intro_req_count`, which counts the number of introduction
requests rejected by the hidden service
* `hs_rdv_error_count`, which counts the number of rendezvous errors as seen by
the hidden service (this number includes the number of circuit establishment
failures, failed retries, end-to-end circuit setup failures)
Closes #40755. This partially addresses #40717.
Signed-off-by: Gabriela Moldovan <gabi@torproject.org>
|
|
This updates the docs to stop suggesting `pk` can be NULL, as that doesn't seem
to be the case anymore (`tor_assert(pk)`).
Signed-off-by: Gabriela Moldovan <gabi@torproject.org>
|
|
Move the retry from circuit_expire_building() to when the offending
circuit is being closed.
Fixes #40695
Signed-off-by: David Goulet <dgoulet@torproject.org>
|
|
Move the retry from circuit_expire_building() to when the offending
circuit is being closed.
Fixes #40695
Signed-off-by: David Goulet <dgoulet@torproject.org>
|
|
|
|
Republishing is necessary to ensure that clients connect using the correct
sendme_inc upon any change. Additionally, introduction points must be
re-chosen, so that cached descriptors with old values are not usable.
We do not expect to change sendme_inc, unless cell size or TLS record size
changes, so this should be rare.
Signed-off-by: David Goulet <dgoulet@torproject.org>
|
|
Move it to extension.trunnel instead so that extension ABI construction
can be used in other parts of tor than just HS cells.
Specifically, we'll use it in the ntorv3 data payload and make a
congestion control parameter extension using that binary structure.
Only rename. No code behavior changes.
Signed-off-by: David Goulet <dgoulet@torproject.org>
|
|
|
|
Turns out that passing client authorization keys to ADD_ONION for v3 was
not working because we were not setting the "is_client_auth_enabled"
flag to true once the clients were configured. This lead to the
descriptor being encoded without the clients.
This patch removes that flag and instead adds an inline function that
can be used to check if a given service has client authorization
enabled.
This will be much less error prone of needing to keep in sync the client
list and a flag instead.
Fixes #40378
Signed-off-by: David Goulet <dgoulet@torproject.org>
|
|
|
|
|
|
When reloading a service, we can re-register a service and thus end up again
in the metrics store initialization code path which is fine. No need to BUG()
anymore.
Fixes #40334
Signed-off-by: David Goulet <dgoulet@torproject.org>
|
|
We still keep v2 rendezvous stats since we will allow them until the network
has entirely phased out.
Related to #40266
Signed-off-by: David Goulet <dgoulet@torproject.org>
|
|
This is unfortunately massive but both functionalities were extremely
intertwined and it would have required us to actually change the HSv2 code in
order to be able to split this into multiple commits.
After this commit, there are still artefacts of v2 in the code but there is no
more support for service, intro point and HSDir.
The v2 support for rendezvous circuit is still available since that code is
the same for the v3 and we will leave it in so if a client is able to
rendezvous on v2 then it can still transfer traffic. Once the entire network
has moved away from v2, we can remove v2 rendezvous point support.
Related to #40266
Signed-off-by: David Goulet <dgoulet@torproject.org>
|
|
Remove it from rendservice.c and move everything related to hs_common.{c|h}.
Related to #40266
Signed-off-by: David Goulet <dgoulet@torproject.org>
|
|
|
|
|
|
|
|
|
|
Some days before this commit, the network experienced a DDoS on the directory
authorities that prevented them to generate a consensus for more than 5 hours
straight.
That in turn entirely disabled onion service v3, client and service side, due
to the subsystem requiring a live consensus to function properly.
We know require a reasonably live consensus which means that the HSv3
subsystem will to its job for using the best consensus tor can find. If the
entire network is using an old consensus, than this should be alright.
If the service happens to use a live consensus while a client is not, it
should still work because the client will use the current SRV it sees which
might be the previous SRV for the service for which it still publish
descriptors for.
If the service is using an old one and somehow can't get a new one while
clients are on a new one, then reachability issues might arise. However, this
is a situation we already have at the moment since the service will simply not
work if it doesn't have a live consensus while a client has one.
Fixes #40237
Signed-off-by: David Goulet <dgoulet@torproject.org>
|
|
adding them
|
|
ownership of auth_clients_v3
|
|
|
|
|
|
|
|
|
|
Typos found with codespell.
Please keep in mind that this should have impact on actual code
and must be carefully evaluated:
src/core/or/lttng_circuit.inc
- ctf_enum_value("CONTROLER", CIRCUIT_PURPOSE_CONTROLLER)
+ ctf_enum_value("CONTROLLER", CIRCUIT_PURPOSE_CONTROLLER)
|
|
Tracks the total number of established introduction circuit.
Related to #40063
Signed-off-by: David Goulet <dgoulet@torproject.org>
|