From e35a77088220314b7fcb4053033f131d1357dac6 Mon Sep 17 00:00:00 2001
From: Nick Mathewson <nickm@torproject.org>
Date: Mon, 23 May 2022 14:23:54 -0400
Subject: netflow padding: clarify directionality and padding behavior.

The main points here are:

  * We assume that flow measurements are unidirectional, so
    each side must make sure to send traffic.
  * So we restart our timer when sending, only.
  * We restart the timer whether we're sending real traffic or
    padding traffic.
  * The logic for `max(X,X)` timing applies even though we aren't
    using a bidirectional trigger for timing.
---
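Note (not part of the commit): a minimal Python sketch of the send-side
timer this patch describes, assuming millisecond granularity and the default
1.5 to 9.5 second range. The names `sample_padding_timeout_ms`,
`pmf_max_of_two`, `expected_max_of_two`, and `SendPaddingTimer` are
hypothetical and are not taken from tor. It illustrates the rules in the
text, not the actual implementation.

```python
import random
from fractions import Fraction

LOW_MS = 1500   # default lower bound from the text (1.5 seconds)
HIGH_MS = 9500  # default upper bound from the text (9.5 seconds)


def sample_padding_timeout_ms(low=LOW_MS, high=HIGH_MS, rng=random):
    """Sample low + max(X, X), with X uniform on 0..R-1 and R = high - low."""
    r = high - low
    # Two independent uniform draws; keeping the larger skews toward `high`.
    return low + max(rng.randrange(r), rng.randrange(r))


def pmf_max_of_two(i, r):
    """Prob(Y == i) for Y = max(X, X), as given in the spec: (2i + 1) / R^2."""
    return Fraction(2 * i + 1, r * r)


def expected_max_of_two(r):
    """Exact mean of Y, computed by summing i * Prob(Y == i)."""
    return sum(i * pmf_max_of_two(i, r) for i in range(r))


class SendPaddingTimer:
    """Per-direction timer: restarted only by cells this endpoint sends."""

    def __init__(self, now_ms, rng=random):
        self.rng = rng
        self.padding_cells_sent = 0
        self.deadline_ms = now_ms + sample_padding_timeout_ms(rng=rng)

    def on_cell_sent(self, now_ms):
        # Real and padding cells alike restart the timer; received cells
        # are deliberately ignored (measurement is assumed unidirectional).
        self.deadline_ms = now_ms + sample_padding_timeout_ms(rng=self.rng)

    def poll(self, now_ms):
        """Return True if a CELL_PADDING cell should be sent at now_ms."""
        if now_ms < self.deadline_ms:
            return False
        self.padding_cells_sent += 1
        self.on_cell_sent(now_ms)  # sending padding also starts a new timer
        return True
```

Because the larger of two uniform draws is kept, the mean timeout sits
roughly two thirds of the way through the range rather than at its midpoint,
which is what limits the amount of padding sent.
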
 padding-spec.txt | 49 ++++++++++++++++++++++++-------------------------
 1 file changed, 24 insertions(+), 25 deletions(-)

diff --git a/padding-spec.txt b/padding-spec.txt
index 825f1d7..0a45e8b 100644
--- a/padding-spec.txt
+++ b/padding-spec.txt
@@ -143,6 +143,12 @@ Table of Contents
   user traffic in that time period is multiplexed over a single connection
   (as it is with Tor).
 
+  Though flow measurement in principle can be bidirectional (counting cells
+  sent in both directions between a pair of IPs) or unidirectional (counting
+  only cells sent from one IP to another), we assume for safety that all
+  measurement is unidirectional, and so traffic must be sent by both parties
+  in order to prevent record splitting.
+
 2.2. Implementation
 
   Tor clients currently maintain one TLS connection to their Guard node to
@@ -154,35 +160,31 @@ Table of Contents
   connections, and pad them, but otherwise not pad between normal relays.
 
   Both clients and Guards will maintain a timer for all application (ie:
-  non-directory) TLS connections. Every time a non-padding packet is sent or
-  received by either end, that endpoint will sample a timeout value from
-  between 1.5 seconds and 9.5 seconds using the max(X,X) distribution
-  described in Section 2.3. The time range is subject to consensus
+  non-directory) TLS connections. Every time a padding packet is sent by an
+  endpoint, that endpoint will sample a timeout value from
+  the max(X,X) distribution described in Section 2.3. The default
+  range is from 1.5 seconds to 9.5 seconds, subject to consensus
   parameters as specified in Section 2.6.
 
-  If the connection becomes active for any reason before this timer
-  expires, the timer is reset to a new random value between 1.5 and 9.5
-  seconds. If the connection remains inactive until the timer expires, a
-  single CELL_PADDING cell will be sent on that connection.
+  (The timing is randomized to avoid making it obvious which cells are
+  padding.)
 
-  In this way, the connection will only be padded in the event that it is
-  idle, and will always transmit a packet before the minimum 10 second inactive
-  timeout.
+  If another cell is sent for any reason before this timer expires, the timer
+  is reset to a new random value.
 
-2.3. Padding Cell Timeout Distribution Statistics
+  If the connection remains inactive until the timer expires, a
+  single CELL_PADDING cell will be sent on that connection (which will
+  also start a new timer).
 
-  It turns out that because the padding is bidirectional, and because both
-  endpoints are maintaining timers, this creates the situation where the time
-  before sending a padding packet in either direction is actually
-  min(client_timeout, server_timeout).
+  In this way, the connection will only be padded in a given direction in
+  the event that it is idle in that direction, and will always transmit a
+  packet before the minimum 10 second inactive timeout.
 
-  If client_timeout and server_timeout are uniformly sampled, then the
-  distribution of min(client_timeout,server_timeout) is no longer uniform, and
-  the resulting average timeout (Exp[min(X,X)]) is much lower than the
-  midpoint of the timeout range.
+2.3. Padding Cell Timeout Distribution Statistics
 
-  To compensate for this, instead of sampling each endpoint timeout uniformly,
-  we instead sample it from max(X,X), where X is uniformly distributed.
+  To limit the amount of padding sent, instead of sampling each endpoint
+  timeout uniformly, we sample it from max(X,X), where X is
+  uniformly distributed.
 
   If X is a random variable uniform from 0..R-1 (where R=high-low), then the
   random variable Y = max(X,X) has Prob(Y == i) = (2.0*i + 1)/(R*R).
@@ -206,9 +208,6 @@ Table of Contents
      15000   7499.5    7995       4999.5           9999.5
      20000   9900.5    10661      6666.2           13332.8
 
-  In this way, we maintain the property that the midpoint of the timeout range
-  is the expected mean time before a padding packet is sent in either
-  direction.
 
 2.4. Maximum overhead bounds
 
-- 
cgit v1.2.3-54-g00ecf


From 9aad630153d8bb7c9c3b07b47855700705595059 Mon Sep 17 00:00:00 2001
From: Nick Mathewson <nickm@torproject.org>
Date: Mon, 23 May 2022 14:31:38 -0400
Subject: Clarify who sends padding negotiation and when.

Also explain what should happen if those assumptions are violated.
---
 padding-spec.txt | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/padding-spec.txt b/padding-spec.txt
index 0a45e8b..9806b2a 100644
--- a/padding-spec.txt
+++ b/padding-spec.txt
@@ -252,6 +252,13 @@ Table of Contents
   CELL_PADDING_NEGOTIATE to instruct the relay not to pad, and then does not
   send any further padding itself.
 
+  Currently, clients negotiate padding only when a channel is created,
+  immediately after sending their NETINFO cell.  Recipients SHOULD, however,
+  accept padding negotiation messages at any time.
+
+  Clients and bridges MUST reject padding negotiation messages from relays,
+  and close the channel if they receive one.
+
 2.6. Consensus Parameters Governing Behavior
 
   Connection-level padding is controlled by the following consensus parameters:
-- 
cgit v1.2.3-54-g00ecf


From 836a5fb964e288e8ff20e918abf19df353c245ac Mon Sep 17 00:00:00 2001
From: Nick Mathewson <nickm@torproject.org>
Date: Mon, 23 May 2022 14:57:23 -0400
Subject: Try to document the many uses of nf_conntimeout_clients.

(This is largely determined by reverse-engineering tor's current
behavior.)
---
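Note (not part of the commit): a minimal Python sketch of the never-used
circuit timeout described below, under stated assumptions. The name
`client_timeout_seconds` and its parameters are hypothetical, whole-second
granularity is an assumption, and so is halving the consensus value before
randomizing (the text only says reduced padding clients use half the
consensus value).

```python
import random

NF_CONNTIMEOUT_CLIENTS_DEFAULT = 1800  # seconds, default from the text


def client_timeout_seconds(consensus_value=NF_CONNTIMEOUT_CLIENTS_DEFAULT,
                           reduced_padding=False, rng=random):
    """Pick a timeout uniformly between the base value and twice that value."""
    # Assumption: reduced padding halves the base before randomization.
    base = consensus_value // 2 if reduced_padding else consensus_value
    # randint is inclusive at both ends: base <= result <= 2 * base.
    return rng.randint(base, 2 * base)
```

With the default of 1800, this yields a timeout between 1800 and 3600
seconds, or between 900 and 1800 seconds for a reduced padding client.
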
 padding-spec.txt | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/padding-spec.txt b/padding-spec.txt
index 9806b2a..471dd74 100644
--- a/padding-spec.txt
+++ b/padding-spec.txt
@@ -283,10 +283,18 @@ Table of Contents
       - Default: 14000
 
     * nf_conntimeout_clients
-      - The number of seconds to keep circuits opened and available for
-        clients to use. Note that the actual client timeout is randomized
-        uniformly from this value to twice this value. This governs client
-        OR conn lifespan. Reduced padding clients use half the consensus
+      - The number of seconds to keep never-used circuits opened and
+        available for clients to use. Note that the actual client timeout is
+        randomized uniformly from this value to twice this value.
+      - The number of seconds to keep idle (not currently used) canonical
+        channels open and available. (We do this to ensure a sufficient
+        time duration of padding, which is the ultimate goal.)
+      - This value is also used to determine how long, after a port has been
+        used, we should attempt to keep building predicted circuits for that
+        port. (See path-spec.txt section 2.1.1.)  This behavior was
+        originally added to work around implementation limitations, but it
+        serves as a reasonable default regardless of implementation.
+      - For all use cases, reduced padding clients use half the consensus
         value.
       - Default: 1800
 
-- 
cgit v1.2.3-54-g00ecf


From 5536d29700d1bcea4b2652a3d7978a197b058a45 Mon Sep 17 00:00:00 2001
From: Nick Mathewson <nickm@torproject.org>
Date: Mon, 23 May 2022 15:07:06 -0400
Subject: Padding spec: describe behavior with queues.

(Briefly: "Sent" is sometimes unobservable, so we should use
"queued" as a reasonable proxy.)
---
 padding-spec.txt | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/padding-spec.txt b/padding-spec.txt
index 471dd74..262e88f 100644
--- a/padding-spec.txt
+++ b/padding-spec.txt
@@ -180,6 +180,16 @@ Table of Contents
   the event that it is idle in that direction, and will always transmit a
   packet before the minimum 10 second inactive timeout.
 
+  (In practice, an implementation may not be able to determine when,
+  exactly, a cell is sent on a given channel.  For example, even though the
+  cell has been given to the kernel via a call to `send(2)`, the kernel may
+  still be buffering that cell.  In cases such as these, implementations
+  should use a reasonable proxy for the time at which a cell is sent: for
+  example, when the cell is queued.  If this strategy is used,
+  implementations should try to observe the innermost (closest to the wire)
+  queue that they practically can, and if this queue is already nonempty,
+  padding should not be scheduled until after the queue does become empty.)
+
 2.3. Padding Cell Timeout Distribution Statistics
 
   To limit the amount of padding sent, instead of sampling each endpoint
-- 
cgit v1.2.3-54-g00ecf


From 1272bd0db5ce44b76a8fb7aa50eb58fbcb66ce13 Mon Sep 17 00:00:00 2001
From: Mike Perry <mikeperry-git@torproject.org>
Date: Thu, 26 May 2022 20:01:09 +0000
Subject: Describe a potential (rare) distinguisher in idle circuits.

In the rare event that a user resumes activity after a period between the
"reduced connection timeout" and the full value, and that user has not set
reduced padding, this is a distinguisher on circuits that have been held idle
and open for that long.
---
 padding-spec.txt | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/padding-spec.txt b/padding-spec.txt
index 262e88f..ea16d8b 100644
--- a/padding-spec.txt
+++ b/padding-spec.txt
@@ -306,6 +306,9 @@ Table of Contents
         serves as a reasonable default regardless of implementation.
       - For all use cases, reduced padding clients use half the consensus
         value.
+      - Implementations MAY mark circuits held open past the reduced padding
+        quantity (half the consensus value) as "not to be used for streams",
+        to prevent their use from becoming a distinguisher.
       - Default: 1800
 
     * nf_pad_before_usage
-- 
cgit v1.2.3-54-g00ecf