Age | Commit message | Author |
|
Fix assert crash on relay-side due to on_circuit backpointer
See merge request tpo/core/tor!737
|
|
Signed-off-by: David Goulet <dgoulet@torproject.org>
|
|
Close #40824
Signed-off-by: David Goulet <dgoulet@torproject.org>
|
|
This brings us into sync with the consensus, and will be useful for test
vectors, to ensure behavior consistent with the consensus params.
|
|
KISTSchedRunIntervalClient
|
|
Also, double check that the consensus has enough overall exits before
attempting conflux set launch.
|
|
|
|
Also add calls to dump the legs of a conflux set if we have too many
|
|
|
|
Otherwise, the BEGIN cell arrives at the exit before it has an RTT,
and then it does not know which circuit to prefer in response.
|
|
This will give us a full stacktrace.
|
|
This started as a response to ticket #40792 where Coverity is
complaining about a potential year 2038 bug where we cast time_t from
approx_time() to uint32_t for use in token_bucket_ctr.
There was a larger can of worms though, since token_bucket really
doesn't want to be using wallclock time here. I audited the call sites
for approx_time() and changed any that used a 32-bit cast or made
inappropriate use of wallclock time. Things like certificate lifetime,
consensus intervals, etc. need wallclock time. Measurements of rates
over time, however, are better served by a monotonic timer that never
tries to sync with wallclock time.
Looking closer at token_bucket, its design is a bit odd because it was
initially intended for use with tick units but later forked into
token_bucket_rw which uses ticks to count bytes per second, and
token_bucket_ctr which uses seconds to count slower events. The rates
represented by either token bucket can't be lower than 1 per second, so
the slower timer in 'ctr' is necessary to represent the slower rates of
things like connections or introduction packets or rendezvous attempts.
I considered modifying token_bucket to use 64-bit timestamps overall
instead of 32-bit, but that seemed like an unnecessarily invasive change
that would grant some peace of mind but probably not help much. I was
more interested in removing the dependency on wallclock time. The
token_bucket_rw timer already uses monotonic time. This patch converts
token_bucket_ctr to use monotonic time as well. It introduces a new
monotime_coarse_absolute_sec(), which is currently the same as nsec
divided by a billion but could be optimized easily if we ever need to.
This patch might also fix a rollover bug. I haven't tested this
extensively but I don't think the previous version of the rollover code
on either token bucket was correct, and I would expect it to get stuck
after the first rollover.
Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
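The idea described above can be illustrated with a minimal sketch. It is not the code from this patch: every name prefixed with `sketch_` is hypothetical, and the bucket layout is an assumption made for illustration. It shows a seconds value derived from a monotonic nanosecond count by dividing by a billion, and a counter-style refill that computes elapsed time with unsigned 32-bit subtraction so that a timestamp rollover does not stall the bucket.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: convert a monotonic nanosecond count to whole
 * seconds, truncated to 32 bits (nsec divided by a billion). */
static uint32_t
sketch_monotime_nsec_to_sec(uint64_t nsec)
{
  return (uint32_t)(nsec / UINT64_C(1000000000));
}

/* Hypothetical counter bucket; the fields are assumptions for this sketch. */
typedef struct {
  uint32_t rate;            /* tokens added per second */
  uint32_t burst;           /* maximum tokens the bucket can hold */
  uint32_t tokens;          /* tokens currently available */
  uint32_t last_refill_sec; /* monotonic seconds at the last refill */
} sketch_bucket_t;

/* Refill using monotonic seconds. The unsigned subtraction is taken
 * modulo 2^32, so the elapsed time stays correct across one rollover
 * of the 32-bit timestamp. */
static void
sketch_bucket_refill(sketch_bucket_t *b, uint32_t now_sec)
{
  uint32_t elapsed = now_sec - b->last_refill_sec;
  /* Widen to 64 bits so that elapsed * rate cannot overflow. */
  uint64_t total = (uint64_t)b->tokens + (uint64_t)elapsed * b->rate;
  b->tokens = (total > b->burst) ? b->burst : (uint32_t)total;
  b->last_refill_sec = now_sec;
}
```

In this sketch a caller would feed `sketch_bucket_refill()` the truncated seconds value each time it checks the bucket; wallclock time never enters the calculation.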
|
|
Switch rate limiting will likely be helpful for limiting OOQ, but according to
Shadow it was the cause of slower performance at Hong Kong endpoints.
So let's disable it, and then optimize for OOQ later.
|
|
Maxrate had slower throughput than lowrtt in Shadow, which is not too
surprising. We just wanted to test it.
|
|
This lets controller apps see the outgoing PoW effort on client
circuits, and the validated effort received on an incoming service
circuit.
Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
|
|
The goal of this patch is to add an additional mechanism for adjusting
PoW effort upwards, where clients rather than services can choose to
solve their puzzles at a higher effort than what was suggested in the
descriptor.
I wanted to use hs_cache's existing unreachability stats to drive this
effort bump, but this revealed some cases where a circuit (intro or
rend) closed early on can end up in hs_cache with an all zero intro
point key, where nobody will find it. This moves intro_auth_pk
initialization earlier in a couple places and adds nonfatal asserts to
catch the problem if it shows up elsewhere.
The actual effort adjustment method I chose is to multiply the suggested
effort by (1 + unresponsive_count), then ensure the result is at least
1. If a service has suggested effort of 0 but we fail to connect,
retries will all use an effort of 1. If the suggestion was 50, we'll try
50, 100, 150, 200, etc. This is bounded both by our client effort limit
and by the limit on unresponsive_count (currently 5).
Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
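The adjustment rule described above can be sketched as follows. This is an illustration, not the code from this patch: the function name, the `SKETCH_MAX_UNRESPONSIVE` constant, and the `client_limit` parameter are hypothetical. The message does not say whether the floor of 1 applies to the very first attempt, so this sketch assumes the suggestion is left untouched when `unresponsive_count` is 0.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed cap on the unresponsive count ("currently 5" in the message). */
#define SKETCH_MAX_UNRESPONSIVE 5

/* Hypothetical helper: pick the effort a client uses for its next attempt.
 * suggested          - effort suggested in the descriptor
 * unresponsive_count - prior failed attempts recorded for the service
 * client_limit       - assumed upper bound on client-chosen effort */
static uint32_t
sketch_client_pow_effort(uint32_t suggested, uint32_t unresponsive_count,
                         uint32_t client_limit)
{
  uint64_t effort = suggested;

  if (unresponsive_count > SKETCH_MAX_UNRESPONSIVE)
    unresponsive_count = SKETCH_MAX_UNRESPONSIVE;

  if (unresponsive_count > 0) {
    /* Multiply by (1 + unresponsive_count), then floor at 1. */
    effort *= 1 + (uint64_t)unresponsive_count;
    if (effort < 1)
      effort = 1;
  }

  /* Never exceed the client-side effort limit. */
  if (effort > client_limit)
    effort = client_limit;

  return (uint32_t)effort;
}
```

With a suggestion of 50 and a generous limit, successive failures give 50, 100, 150, 200, matching the sequence in the message; a suggestion of 0 gives 1 on every retry.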
|
|
This adds a new "pow" module to ./configure for the user-visible proof
of work support. Disabling the module excludes
src/feature/hs/hs_pow at compile time.
Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
|
|
We mark the intro circuit with a new flag saying that the pow is
in the cpuworker queue. When the cpuworker comes back, it either
has a solution, in which case we proceed with sending the intro1
cell, or it has no solution, in which case we unmark the intro
circuit and let the whole process restart on the next iteration of
connection_ap_handshake_attach_circuit().
|
|
Prepares the way for client-side pow cpuworkers.
Also happens to resolve bug https://bugs.torproject.org/tpo/core/tor/40617
(which went into 0.4.7.4-alpha) because now we survive initing the
cpuworker subsystem when we're not a relay.
|
|
Signed-off-by: David Goulet <dgoulet@torproject.org>
|
|
At this commit, the tor main loop solves it. We might consider moving
this to the CPU pool at some point.
Signed-off-by: David Goulet <dgoulet@torproject.org>
|
|
During testing, we discovered two cases where edge connections can stall:
1. Due to final data sitting in the edge inbuf when it was resumed
2. Due to flag synchronization between the token bucket and XON/XOFF
The first issue has always existed in C-Tor, but we were able to tickle it
in scp testing. If the last data from the protocol fits in the inbuf but is
not large enough to send, and an XOFF or connection block comes in at exactly
that point, then when the edge connection resumes there is no data to read
from the socket, and the inbuf can just sit there, never draining.
We noticed the second issue along the way to finding the first. It seems
wrong, but it didn't seem to affect anything in practice.
These are extremely rare in normal operation, but with conflux, XON/XOFF
activity is more common, so we hit these.
Signed-off-by: David Goulet <dgoulet@torproject.org>
|
|
In https://gitlab.torproject.org/tpo/core/tor/-/issues/40623, we changed the
DESTROY propagation to ensure memory was freed quickly at relays. This was a
good move, but it exacerbates the condition where a stream is closed on a
circuit, and then the circuit is immediately closed because it is dirty. This
creates a race between the DESTROY and the last data sent on the stream. This
race is visible in Shadow, and does happen.
This could be backported. A better solution to these kinds of problems is to
create an ENDED cell, and not close any circuits until the ENDED comes back.
But this will also require thinking, since this ENDED cell can also get lost,
so some kind of timeout may be needed either way. The ENDED cell could just
allow us to have much longer timeouts for this case.
|
|
Signed-off-by: David Goulet <dgoulet@torproject.org>
|
|
Signed-off-by: David Goulet <dgoulet@torproject.org>
|