From e35a77088220314b7fcb4053033f131d1357dac6 Mon Sep 17 00:00:00 2001 From: Nick Mathewson Date: Mon, 23 May 2022 14:23:54 -0400 Subject: netflow padding: clarify directionality and padding behavior. The main points here are: * We assume that flow measurements are unidirectional, so each side must make sure to send traffic. * So we restart our timer when sending, only. * We restart the timer whether we're sending real traffic or padding traffic. * The logic for `max(X,X)` timing applies even though we aren't using a bidirectional trigger for timing. --- padding-spec.txt | 49 ++++++++++++++++++++++++------------------------- 1 file changed, 24 insertions(+), 25 deletions(-) (limited to 'padding-spec.txt') diff --git a/padding-spec.txt b/padding-spec.txt index 825f1d7..0a45e8b 100644 --- a/padding-spec.txt +++ b/padding-spec.txt @@ -143,6 +143,12 @@ Table of Contents user traffic in that time period is multiplexed over a single connection (as it is with Tor). + Though flow measurement in principle can be bidirectional (counting cells + sent in both directions between a pair of IPs) or unidirectional (counting + only cells sent from one IP to another), we assume for safety that all + measurement is unidirectional, and so traffic must be sent by both parties + in order to prevent record splitting. + 2.2. Implementation Tor clients currently maintain one TLS connection to their Guard node to @@ -154,35 +160,31 @@ Table of Contents connections, and pad them, but otherwise not pad between normal relays. Both clients and Guards will maintain a timer for all application (ie: - non-directory) TLS connections. Every time a non-padding packet is sent or - received by either end, that endpoint will sample a timeout value from - between 1.5 seconds and 9.5 seconds using the max(X,X) distribution - described in Section 2.3. The time range is subject to consensus + non-directory) TLS connections. Every time a padding packet sent by an + endpoint, that endpoint will sample a timeout value from + the max(X,X) distribution described in Section 2.3. The default + range is from 1.5 seconds to 9.5 seconds time range, subject to consensus parameters as specified in Section 2.6. - If the connection becomes active for any reason before this timer - expires, the timer is reset to a new random value between 1.5 and 9.5 - seconds. If the connection remains inactive until the timer expires, a - single CELL_PADDING cell will be sent on that connection. + (The timing is randomized to avoid making it obvious which cells are + padding.) - In this way, the connection will only be padded in the event that it is - idle, and will always transmit a packet before the minimum 10 second inactive - timeout. + If another cell is sent for any reason before this timer expires, the timer + is reset to a new random value. -2.3. Padding Cell Timeout Distribution Statistics + If the connection remains inactive until the timer expires, a + single CELL_PADDING cell will be sent on that connection (which will + also start a new timer). - It turns out that because the padding is bidirectional, and because both - endpoints are maintaining timers, this creates the situation where the time - before sending a padding packet in either direction is actually - min(client_timeout, server_timeout). + In this way, the connection will only be padded in a given direction in + the event that it is idle in that direction, and will always transmit a + packet before the minimum 10 second inactive timeout. - If client_timeout and server_timeout are uniformly sampled, then the - distribution of min(client_timeout,server_timeout) is no longer uniform, and - the resulting average timeout (Exp[min(X,X)]) is much lower than the - midpoint of the timeout range. +2.3. Padding Cell Timeout Distribution Statistics - To compensate for this, instead of sampling each endpoint timeout uniformly, - we instead sample it from max(X,X), where X is uniformly distributed. + To limit the amount of padding sent, instead of sampling each endpoint + timeout uniformly, we instead sample it from max(X,X), where X is + uniformly distributed. If X is a random variable uniform from 0..R-1 (where R=high-low), then the random variable Y = max(X,X) has Prob(Y == i) = (2.0*i + 1)/(R*R). @@ -206,9 +208,6 @@ Table of Contents 15000 7499.5 7995 4999.5 9999.5 20000 9900.5 10661 6666.2 13332.8 - In this way, we maintain the property that the midpoint of the timeout range - is the expected mean time before a padding packet is sent in either - direction. 2.4. Maximum overhead bounds -- cgit v1.2.3-54-g00ecf