From e35a77088220314b7fcb4053033f131d1357dac6 Mon Sep 17 00:00:00 2001
From: Nick Mathewson
Date: Mon, 23 May 2022 14:23:54 -0400
Subject: netflow padding: clarify directionality and padding behavior.

The main points here are:

* We assume that flow measurements are unidirectional, so each side must make sure to send traffic.
* So we restart our timer when sending, only.
* We restart the timer whether we're sending real traffic or padding traffic.
* The logic for `max(X,X)` timing applies even though we aren't using a bidirectional trigger for timing.
---
 padding-spec.txt | 49 ++++++++++++++++++++++++-------------------------
 1 file changed, 24 insertions(+), 25 deletions(-)

diff --git a/padding-spec.txt b/padding-spec.txt
index 825f1d7..0a45e8b 100644
--- a/padding-spec.txt
+++ b/padding-spec.txt
@@ -143,6 +143,12 @@ Table of Contents
    user traffic in that time period is multiplexed over a single connection
    (as it is with Tor).
 
+   Though flow measurement in principle can be bidirectional (counting cells
+   sent in both directions between a pair of IPs) or unidirectional (counting
+   only cells sent from one IP to another), we assume for safety that all
+   measurement is unidirectional, and so traffic must be sent by both parties
+   in order to prevent record splitting.
+
 2.2. Implementation
 
    Tor clients currently maintain one TLS connection to their Guard node to
@@ -154,35 +160,31 @@ Table of Contents
    connections, and pad them, but otherwise not pad between normal relays.
 
    Both clients and Guards will maintain a timer for all application (ie:
-   non-directory) TLS connections. Every time a non-padding packet is sent or
-   received by either end, that endpoint will sample a timeout value from
-   between 1.5 seconds and 9.5 seconds using the max(X,X) distribution
-   described in Section 2.3. The time range is subject to consensus
+   non-directory) TLS connections. Every time a padding packet is sent by an
+   endpoint, that endpoint will sample a timeout value from
+   the max(X,X) distribution described in Section 2.3. The default
+   range is from 1.5 seconds to 9.5 seconds, subject to consensus
    parameters as specified in Section 2.6.
 
-   If the connection becomes active for any reason before this timer
-   expires, the timer is reset to a new random value between 1.5 and 9.5
-   seconds. If the connection remains inactive until the timer expires, a
-   single CELL_PADDING cell will be sent on that connection.
+   (The timing is randomized to avoid making it obvious which cells are
+   padding.)
 
-   In this way, the connection will only be padded in the event that it is
-   idle, and will always transmit a packet before the minimum 10 second inactive
-   timeout.
+   If another cell is sent for any reason before this timer expires, the timer
+   is reset to a new random value.
 
-2.3. Padding Cell Timeout Distribution Statistics
+   If the connection remains inactive until the timer expires, a
+   single CELL_PADDING cell will be sent on that connection (which will
+   also start a new timer).
 
-   It turns out that because the padding is bidirectional, and because both
-   endpoints are maintaining timers, this creates the situation where the time
-   before sending a padding packet in either direction is actually
-   min(client_timeout, server_timeout).
+   In this way, the connection will only be padded in a given direction in
+   the event that it is idle in that direction, and will always transmit a
+   packet before the minimum 10 second inactive timeout.
 
-   If client_timeout and server_timeout are uniformly sampled, then the
-   distribution of min(client_timeout,server_timeout) is no longer uniform, and
-   the resulting average timeout (Exp[min(X,X)]) is much lower than the
-   midpoint of the timeout range.
+2.3. Padding Cell Timeout Distribution Statistics
 
-   To compensate for this, instead of sampling each endpoint timeout uniformly,
-   we instead sample it from max(X,X), where X is uniformly distributed.
+   To limit the amount of padding sent, instead of sampling each endpoint
+   timeout uniformly, we instead sample it from max(X,X), where X is
+   uniformly distributed.
 
    If X is a random variable uniform from 0..R-1 (where R=high-low), then the
    random variable Y = max(X,X) has Prob(Y == i) = (2.0*i + 1)/(R*R).
@@ -206,9 +208,6 @@ Table of Contents
    15000     7499.5     7995     4999.5      9999.5
    20000     9900.5    10661     6666.2     13332.8
 
-   In this way, we maintain the property that the midpoint of the timeout range
-   is the expected mean time before a padding packet is sent in either
-   direction.
 
 2.4. Maximum overhead bounds
 
-- 
cgit v1.2.3-54-g00ecf

From 9aad630153d8bb7c9c3b07b47855700705595059 Mon Sep 17 00:00:00 2001
From: Nick Mathewson
Date: Mon, 23 May 2022 14:31:38 -0400
Subject: Clarify who sends padding negotiation and when.

Also explain what should happen if those assumptions are violated.
---
 padding-spec.txt | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/padding-spec.txt b/padding-spec.txt
index 0a45e8b..9806b2a 100644
--- a/padding-spec.txt
+++ b/padding-spec.txt
@@ -252,6 +252,13 @@ Table of Contents
    CELL_PADDING_NEGOTIATE to instruct the relay not to pad, and then does not
    send any further padding itself.
 
+   Currently, clients negotiate padding only when a channel is created,
+   immediately after sending their NETINFO cell. Recipients SHOULD, however,
+   accept padding negotiation messages at any time.
+
+   Clients and bridges MUST reject padding negotiation messages from relays,
+   and close the channel if they receive one.
+
 2.6. Consensus Parameters Governing Behavior
 
    Connection-level padding is controlled by the following consensus parameters:
-- 
cgit v1.2.3-54-g00ecf

From 836a5fb964e288e8ff20e918abf19df353c245ac Mon Sep 17 00:00:00 2001
From: Nick Mathewson
Date: Mon, 23 May 2022 14:57:23 -0400
Subject: Try to document the many uses of nf_conntimeout_clients.

(This is largely determined by reverse-engineering tor's current behavior.)
---
 padding-spec.txt | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/padding-spec.txt b/padding-spec.txt
index 9806b2a..471dd74 100644
--- a/padding-spec.txt
+++ b/padding-spec.txt
@@ -283,10 +283,18 @@ Table of Contents
      - Default: 14000
 
    * nf_conntimeout_clients
-     - The number of seconds to keep circuits opened and available for
-       clients to use. Note that the actual client timeout is randomized
-       uniformly from this value to twice this value. This governs client
-       OR conn lifespan. Reduced padding clients use half the consensus
+     - The number of seconds to keep never-used circuits opened and
+       available for clients to use. Note that the actual client timeout is
+       randomized uniformly from this value to twice this value.
+     - The number of seconds to keep idle (not currently used) canonical
+       channels open and available. (We do this to ensure a sufficient
+       time duration of padding, which is the ultimate goal.)
+     - This value is also used to determine how long, after a port has been
+       used, we should attempt to keep building predicted circuits for that
+       port. (See path-spec.txt section 2.1.1.) This behavior was
+       originally added to work around implementation limitations, but it
+       serves as a reasonable default regardless of implementation.
+     - For all use cases, reduced padding clients use half the consensus
        value.
     - Default: 1800
 
-- 
cgit v1.2.3-54-g00ecf

From 5536d29700d1bcea4b2652a3d7978a197b058a45 Mon Sep 17 00:00:00 2001
From: Nick Mathewson
Date: Mon, 23 May 2022 15:07:06 -0400
Subject: Padding spec: describe behavior with queues.

(Briefly: "Sent" is sometimes unobservable, so we should use "queued" as a reasonable proxy.)
---
 padding-spec.txt | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/padding-spec.txt b/padding-spec.txt
index 471dd74..262e88f 100644
--- a/padding-spec.txt
+++ b/padding-spec.txt
@@ -180,6 +180,16 @@ Table of Contents
    the event that it is idle in that direction, and will always transmit a
    packet before the minimum 10 second inactive timeout.
 
+   (In practice, an implementation may not be able to determine when,
+   exactly, a cell is sent on a given channel. For example, even though the
+   cell has been given to the kernel via a call to `send(2)`, the kernel may
+   still be buffering that cell. In cases such as these, implementations
+   should use a reasonable proxy for the time at which a cell is sent: for
+   example, when the cell is queued. If this strategy is used,
+   implementations should try to observe the innermost (closest to the wire)
+   queue that they practically can, and if this queue is already nonempty,
+   padding should not be scheduled until after the queue does become empty.)
+
 2.3. Padding Cell Timeout Distribution Statistics
 
    To limit the amount of padding sent, instead of sampling each endpoint
-- 
cgit v1.2.3-54-g00ecf

From 1272bd0db5ce44b76a8fb7aa50eb58fbcb66ce13 Mon Sep 17 00:00:00 2001
From: Mike Perry
Date: Thu, 26 May 2022 20:01:09 +0000
Subject: Describe a potential (rare) distinguisher in idle circuits.

In the rare event that a user resumes activity after a period between the
"reduced connection timeout" and the full value, and that user has not
set reduced padding, this is a distinguisher on circuits that have been
held idle and open for that long.
---
 padding-spec.txt | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/padding-spec.txt b/padding-spec.txt
index 262e88f..ea16d8b 100644
--- a/padding-spec.txt
+++ b/padding-spec.txt
@@ -306,6 +306,9 @@ Table of Contents
        serves as a reasonable default regardless of implementation.
      - For all use cases, reduced padding clients use half the consensus
        value.
+     - Implementations MAY mark circuits held open past the reduced padding
+       quantity (half the consensus value) as "not to be used for streams",
+       to prevent their use from becoming a distinguisher.
      - Default: 1800
 
    * nf_pad_before_usage
-- 
cgit v1.2.3-54-g00ecf
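
Reviewer's note (illustrative only, not part of any patch in this series):
the send-side timer and the max(X,X) sampling described in sections 2.2 and
2.3 could be sketched roughly as below. This is a minimal sketch under
stated assumptions, not tor's implementation. The names
sample_padding_timeout_ms, max_xx_pmf, max_xx_mean and SendSideTimer are
hypothetical, the timeout is assumed to be low + max(X,X) in milliseconds
with the default 1500..9500 range, and the closed-form mean is derived here
from the Prob(Y == i) formula rather than taken from the spec.

```python
import random


def sample_padding_timeout_ms(low_ms=1500, high_ms=9500, rng=random):
    """Draw low_ms + max(X, X), with X uniform on 0..R-1 and R = high_ms - low_ms."""
    r = high_ms - low_ms
    if r <= 0:
        # Degenerate range: nothing to randomize (an assumption of this sketch).
        return low_ms
    return low_ms + max(rng.randrange(r), rng.randrange(r))


def max_xx_pmf(i, r):
    """Prob(Y == i) for Y = max(X, X), as stated in section 2.3."""
    return (2.0 * i + 1) / (r * r)


def max_xx_mean(r):
    """Closed form of sum(i * max_xx_pmf(i, r) for i in range(r))."""
    return (r - 1) * (4 * r + 1) / (6 * r)


class SendSideTimer:
    """Hypothetical padding timer for one direction of one connection."""

    def __init__(self, now_ms, rng=random):
        self.rng = rng
        self.deadline_ms = now_ms + sample_padding_timeout_ms(rng=rng)

    def on_cell_sent(self, now_ms):
        # Restart whenever this endpoint sends a cell, real or padding.
        # Received cells deliberately do not touch the timer.
        self.deadline_ms = now_ms + sample_padding_timeout_ms(rng=self.rng)

    def padding_due(self, now_ms):
        # True once the send direction has stayed idle until the deadline.
        return now_ms >= self.deadline_ms
```

A caller would invoke on_cell_sent() for every outgoing cell (or, per the
queueing patch, when the cell enters the innermost observable queue) and
send one CELL_PADDING cell when padding_due() becomes true, which in turn
restarts the timer. With these assumptions, max_xx_mean(8000) is about
5332.8, so the mean timeout for the default range would be roughly 6.83
seconds.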