aboutsummaryrefslogtreecommitdiff
path: root/proposals
diff options
context:
space:
mode:
authorNick Mathewson <nickm@torproject.org>2022-07-25 15:04:42 -0400
committerNick Mathewson <nickm@torproject.org>2022-07-25 15:04:42 -0400
commit6f9ec72e279c837a07101b37c8790117cdfaf8a5 (patch)
treea245e5e568eaf0a5cbaa4d6c83f3afdadcd6ee38 /proposals
parent833d6b27a4427b5bbb2189218c10fd568fc3e415 (diff)
downloadtorspec-6f9ec72e279c837a07101b37c8790117cdfaf8a5.tar.gz
torspec-6f9ec72e279c837a07101b37c8790117cdfaf8a5.zip
Add proposal 341-better-oos.md
Diffstat (limited to 'proposals')
-rw-r--r--proposals/000-index.txt2
-rw-r--r--proposals/341-better-oos.md171
-rw-r--r--proposals/BY_INDEX.md1
-rw-r--r--proposals/README.md1
4 files changed, 175 insertions, 0 deletions
diff --git a/proposals/000-index.txt b/proposals/000-index.txt
index 7d518c1..0f82741 100644
--- a/proposals/000-index.txt
+++ b/proposals/000-index.txt
@@ -261,6 +261,7 @@ Proposals by number:
338 Use an 8-byte timestamp in NETINFO cells [ACCEPTED]
339 UDP traffic over Tor [ACCEPTED]
340 Packed and fragmented relay messages [OPEN]
+341 A better algorithm for out-of-sockets eviction [OPEN]
Proposals by status:
@@ -301,6 +302,7 @@ Proposals by status:
326 The "tor-relay" Well-Known Resource Identifier
330 Modernizing authority contact entries
340 Packed and fragmented relay messages
+ 341 A better algorithm for out-of-sockets eviction
ACCEPTED:
265 Load Balancing with Overhead Parameters [for 0.2.9.x]
282 Remove "Named" and "Unnamed" handling from consensus voting [for 0.3.3.x]
diff --git a/proposals/341-better-oos.md b/proposals/341-better-oos.md
new file mode 100644
index 0000000..0715c14
--- /dev/null
+++ b/proposals/341-better-oos.md
@@ -0,0 +1,171 @@
+```
+Filename: 341-better-oos.md
+Title: A better algorithm for out-of-sockets eviction
+Author: Nick Mathewson
+Created: 25 July 2022
+Status: Open
+```
+
+# Introduction
+
+Our existing algorithm for handling an out-of-sockets condition needs
+improvement. It only handles sockets used for OR connections, and
+prioritizes those with more circuits. Because of these weaknesses, the
+algorithm is trivial to circumvent, and it's disabled by default with
+`DisableOOSCheck`.
+
+Here we propose a new algorithm for choosing which connections to close
+when we're out of sockets. In summary, the new algorithm works by
+deciding which kinds of connections we have "too many" of, and then by
+closing excess connections of each kind. The algorithm for selecting
+connections of each kind is different.
+
+
+
+# Intuitions behind the algorithm below
+
+We want to keep a healthy mix of connections running; favoring one kind
+of connection over another gives the attacker a fine way to starve the
+disfavored connections by making a bunch of the favored kind.
+
+The correct mix of connections depends on the type of service we are
+providing. Everywhere _except_ authorities, for example, inbound
+directory connections are perfectly fine to close, since nothing in our
+protocol actually generates them.
+
+In general, we would prefer to close DirPort connections, then Exit
+connections, then OR connections.
+
+The priority with which to close connections is different depending on
+the connection type. "Age of connection" or "number of circuits" may be
+a fine metric for how truly used an OR connection is, but for a DirPort
+connection, high age is suspicious.
+
+# The algorithm
+
+Define a "candidate" connection as one that has a socket, and is either
+an exit stream, an _inbound_ directory stream, or an OR connection.
+
+(Note that OR connections can be from clients, relays, or bridges. Note
+that ordinary relays should not get directory streams that use sockets,
+since clients always use `BEGIN_DIR` to create tunneled directory
+streams.)
+
+In all of the following, treat subtraction as saturating at zero. In
+other words, when you see "A - B" below, read it as "MAX(A-B, 0)".
+
+## Phase 1: Deciding how many connections to close
+
+When we are find that we are low on sockets, we pick a number of sockets
+that we want to close according to our existing algorithm. (That is, we
+try to close 1/4 of our maximum sockets if we have reached our upper
+limit, or 1/10 of our maximum sockets if we have encountered a failure
+from socket(2).) Call this `N_CLOSE`.
+
+Then we decide which sockets to target based on this algorithm.
+
+1. Consider the total number of sockets used for exit streams
+ (`N_EXIT`), the total number used for _inbound_ directory streams
+ (`N_DIR`), and the total number used for OR connections (`N_OR`).
+ (In these calculations, we exclude connections that are already
+ marked to be closed.) Call the total `N_CONN = N_DIR + N_OR +
+ N_EXIT`. Define `N_RETAIN = N_CONN - N_CLOSE`.
+
+2. Compute how many connections of each type are "in excess". First,
+ calculate our target proportions:
+
+ * If we are an authority, let `T_DIR` = 1. Otherwise set `T_DIR = 0.1`.
+ * If we are an exit or we are running an onion service, let `T_EXIT =
+ 2`. Otherwise let `T_EXIT = 0.1`.
+ * Let `T_OR` = 1.
+
+ > TODO: Should those numbers be consensus parameters?
+
+ These numbers define the relative proportions of connections that
+ we would be willing to retain retain in our final mix. Compute a
+ number of _excess_ connections of each type by calculating.
+
+ ```
+ T_TOTAL = T_OR + T_DIR + T_EXIT.
+ EXCESS_DIR = N_DIR - N_RETAIN * (T_DIR / T_TOTAL)
+ EXCESS_EXIT = N_EXIT - N_RETAIN * (T_EXIT / T_TOTAL)
+ EXCESS_OR = N_OR - N_RETAIN * (T_OR / T_TOTAL)
+ ```
+
+3. Finally, divide N_CLOSE among the different types of excess
+ connections, assigning first to excess directory connections, then
+ excess exit connections, and finally to excess OR connections.
+
+ ```
+ CLOSE_DIR = MIN(EXCESS_DIR, N_CLOSE)
+ N_CLOSE := N_CLOSE - CLOSE_DIR
+ CLOSE_EXIT = MIN(EXCESS_EXIT, N_CLOSE)
+ N_CLOSE := N_CLOSE - CLOSE_EXIT
+ CLOSE_OR = MIN(EXCESS_OR, N_CLOSE)
+ ```
+
+We will try to close `CLOSE_DIR` directory connections, `CLOSE_EXIT`
+exit connections, and `CLOSE_OR` OR connections.
+
+## Phase 2: Closing directory connections
+
+We want to close a certain number of directory connections. To select
+our targets, we sort first by the number of directory connections from
+a similar address (see "similar address" below) and then by their age,
+preferring to close the oldest ones first.
+
+> This approach defeats "many requests from the same address" and "Open
+> a connection and hold it open, and do so from many addresses". It
+> doesn't do such a great job with defeating "open and close frequently
+> and do so on many addresses."
+
+> Note that fallback directories do not typically use sockets for
+> handling directory connections: theirs are usually created with
+> BEGIN_DIR.
+
+## Phase 3: Closing exit connections.
+
+We want to close a certain number of exit connections. To do this, we
+pick an exit connection at random, then close its circuit _along with
+all the other exit connections on the same circuit_. Then we repeat
+until we have closed at least our target number of exit connections.
+
+> This approach probabilistically favors closing circuits with a large
+> number of sockets open, regardless of how long those sockets have been
+> open. This defeats the easiest way of opening a large number of exit
+> streams ("open them all on once circuit") without making the
+> counter-approach ("open each exit stream on a its own circuit") much
+> more attractive.
+
+## Phase 3: Closing OR connections.
+
+We want to close a certain number of OR connections, to clients, bridges, or
+relays.
+
+To do this, we first close OR connections with zero circuits. Then we
+close all OR connections but the most recent 2 from each "similar
+address". Then we close OR connections at random from among those _not_
+to a recognized relay in the latest directory. Finally, we close OR
+connections at random.
+
+> We used to unconditionally prefer to close connections with fewer
+> circuits. That's trivial for an adversary to circumvent, though: they
+> can just open a bunch of circuits on their bogus OR connections, and
+> force us to preferentially close circuits from real clients, bridges,
+> and relays.
+
+> Note that some connections that seem like client connections ("not
+> from relays in the latest directory") are actually those created by
+> bridges.
+
+## What is "A similar address"?
+
+We define two connections as having a similar address if they are in the
+same IPv4 /30, or if they are in the same IPv6 /90.
+
+
+# Acknowledgments
+
+This proposal was inspired by a set of
+[OOS improvements](https://gitlab.torproject.org/tpo/core/tor/-/issues/32794)
+from `starlight`.
diff --git a/proposals/BY_INDEX.md b/proposals/BY_INDEX.md
index 9383351..b64ae55 100644
--- a/proposals/BY_INDEX.md
+++ b/proposals/BY_INDEX.md
@@ -258,4 +258,5 @@ Below are a list of proposals sorted by their proposal number. See
* [`338-netinfo-y2038.md`](/proposals/338-netinfo-y2038.md): Use an 8-byte timestamp in NETINFO cells [ACCEPTED]
* [`339-udp-over-tor.md`](/proposals/339-udp-over-tor.md): UDP traffic over Tor [ACCEPTED]
* [`340-packed-and-fragmented.md`](/proposals/340-packed-and-fragmented.md): Packed and fragmented relay messages [OPEN]
+* [`341-better-oos.md`](/proposals/341-better-oos.md): A better algorithm for out-of-sockets eviction [OPEN]
diff --git a/proposals/README.md b/proposals/README.md
index 1431e8d..4127329 100644
--- a/proposals/README.md
+++ b/proposals/README.md
@@ -40,6 +40,7 @@ for discussion.
* [`326-tor-relay-well-known-uri-rfc8615.md`](/proposals/326-tor-relay-well-known-uri-rfc8615.md): The "tor-relay" Well-Known Resource Identifier
* [`330-authority-contact.md`](/proposals/330-authority-contact.md): Modernizing authority contact entries
* [`340-packed-and-fragmented.md`](/proposals/340-packed-and-fragmented.md): Packed and fragmented relay messages
+* [`341-better-oos.md`](/proposals/341-better-oos.md): A better algorithm for out-of-sockets eviction
## ACCEPTED proposals: slated for implementation