From 6f9ec72e279c837a07101b37c8790117cdfaf8a5 Mon Sep 17 00:00:00 2001 From: Nick Mathewson Date: Mon, 25 Jul 2022 15:04:42 -0400 Subject: Add proposal 341-better-oos.md --- proposals/000-index.txt | 2 + proposals/341-better-oos.md | 171 ++++++++++++++++++++++++++++++++++++++++++++ proposals/BY_INDEX.md | 1 + proposals/README.md | 1 + 4 files changed, 175 insertions(+) create mode 100644 proposals/341-better-oos.md diff --git a/proposals/000-index.txt b/proposals/000-index.txt index 7d518c1..0f82741 100644 --- a/proposals/000-index.txt +++ b/proposals/000-index.txt @@ -261,6 +261,7 @@ Proposals by number: 338 Use an 8-byte timestamp in NETINFO cells [ACCEPTED] 339 UDP traffic over Tor [ACCEPTED] 340 Packed and fragmented relay messages [OPEN] +341 A better algorithm for out-of-sockets eviction [OPEN] Proposals by status: @@ -301,6 +302,7 @@ Proposals by status: 326 The "tor-relay" Well-Known Resource Identifier 330 Modernizing authority contact entries 340 Packed and fragmented relay messages + 341 A better algorithm for out-of-sockets eviction ACCEPTED: 265 Load Balancing with Overhead Parameters [for 0.2.9.x] 282 Remove "Named" and "Unnamed" handling from consensus voting [for 0.3.3.x] diff --git a/proposals/341-better-oos.md b/proposals/341-better-oos.md new file mode 100644 index 0000000..0715c14 --- /dev/null +++ b/proposals/341-better-oos.md @@ -0,0 +1,171 @@ +``` +Filename: 341-better-oos.md +Title: A better algorithm for out-of-sockets eviction +Author: Nick Mathewson +Created: 25 July 2022 +Status: Open +``` + +# Introduction + +Our existing algorithm for handling an out-of-sockets condition needs +improvement. It only handles sockets used for OR connections, and +prioritizes those with more circuits. Because of these weaknesses, the +algorithm is trivial to circumvent, and it's disabled by default with +`DisableOOSCheck`. + +Here we propose a new algorithm for choosing which connections to close +when we're out of sockets. In summary, the new algorithm works by +deciding which kinds of connections we have "too many" of, and then by +closing excess connections of each kind. The algorithm for selecting +connections of each kind is different. + + + +# Intuitions behind the algorithm below + +We want to keep a healthy mix of connections running; favoring one kind +of connection over another gives the attacker a fine way to starve the +disfavored connections by making a bunch of the favored kind. + +The correct mix of connections depends on the type of service we are +providing. Everywhere _except_ authorities, for example, inbound +directory connections are perfectly fine to close, since nothing in our +protocol actually generates them. + +In general, we would prefer to close DirPort connections, then Exit +connections, then OR connections. + +The priority with which to close connections is different depending on +the connection type. "Age of connection" or "number of circuits" may be +a fine metric for how truly used an OR connection is, but for a DirPort +connection, high age is suspicious. + +# The algorithm + +Define a "candidate" connection as one that has a socket, and is either +an exit stream, an _inbound_ directory stream, or an OR connection. + +(Note that OR connections can be from clients, relays, or bridges. Note +that ordinary relays should not get directory streams that use sockets, +since clients always use `BEGIN_DIR` to create tunneled directory +streams.) + +In all of the following, treat subtraction as saturating at zero. In +other words, when you see "A - B" below, read it as "MAX(A-B, 0)". + +## Phase 1: Deciding how many connections to close + +When we are find that we are low on sockets, we pick a number of sockets +that we want to close according to our existing algorithm. (That is, we +try to close 1/4 of our maximum sockets if we have reached our upper +limit, or 1/10 of our maximum sockets if we have encountered a failure +from socket(2).) Call this `N_CLOSE`. + +Then we decide which sockets to target based on this algorithm. + +1. Consider the total number of sockets used for exit streams + (`N_EXIT`), the total number used for _inbound_ directory streams + (`N_DIR`), and the total number used for OR connections (`N_OR`). + (In these calculations, we exclude connections that are already + marked to be closed.) Call the total `N_CONN = N_DIR + N_OR + + N_EXIT`. Define `N_RETAIN = N_CONN - N_CLOSE`. + +2. Compute how many connections of each type are "in excess". First, + calculate our target proportions: + + * If we are an authority, let `T_DIR` = 1. Otherwise set `T_DIR = 0.1`. + * If we are an exit or we are running an onion service, let `T_EXIT = + 2`. Otherwise let `T_EXIT = 0.1`. + * Let `T_OR` = 1. + + > TODO: Should those numbers be consensus parameters? + + These numbers define the relative proportions of connections that + we would be willing to retain retain in our final mix. Compute a + number of _excess_ connections of each type by calculating. + + ``` + T_TOTAL = T_OR + T_DIR + T_EXIT. + EXCESS_DIR = N_DIR - N_RETAIN * (T_DIR / T_TOTAL) + EXCESS_EXIT = N_EXIT - N_RETAIN * (T_EXIT / T_TOTAL) + EXCESS_OR = N_OR - N_RETAIN * (T_OR / T_TOTAL) + ``` + +3. Finally, divide N_CLOSE among the different types of excess + connections, assigning first to excess directory connections, then + excess exit connections, and finally to excess OR connections. + + ``` + CLOSE_DIR = MIN(EXCESS_DIR, N_CLOSE) + N_CLOSE := N_CLOSE - CLOSE_DIR + CLOSE_EXIT = MIN(EXCESS_EXIT, N_CLOSE) + N_CLOSE := N_CLOSE - CLOSE_EXIT + CLOSE_OR = MIN(EXCESS_OR, N_CLOSE) + ``` + +We will try to close `CLOSE_DIR` directory connections, `CLOSE_EXIT` +exit connections, and `CLOSE_OR` OR connections. + +## Phase 2: Closing directory connections + +We want to close a certain number of directory connections. To select +our targets, we sort first by the number of directory connections from +a similar address (see "similar address" below) and then by their age, +preferring to close the oldest ones first. + +> This approach defeats "many requests from the same address" and "Open +> a connection and hold it open, and do so from many addresses". It +> doesn't do such a great job with defeating "open and close frequently +> and do so on many addresses." + +> Note that fallback directories do not typically use sockets for +> handling directory connections: theirs are usually created with +> BEGIN_DIR. + +## Phase 3: Closing exit connections. + +We want to close a certain number of exit connections. To do this, we +pick an exit connection at random, then close its circuit _along with +all the other exit connections on the same circuit_. Then we repeat +until we have closed at least our target number of exit connections. + +> This approach probabilistically favors closing circuits with a large +> number of sockets open, regardless of how long those sockets have been +> open. This defeats the easiest way of opening a large number of exit +> streams ("open them all on once circuit") without making the +> counter-approach ("open each exit stream on a its own circuit") much +> more attractive. + +## Phase 3: Closing OR connections. + +We want to close a certain number of OR connections, to clients, bridges, or +relays. + +To do this, we first close OR connections with zero circuits. Then we +close all OR connections but the most recent 2 from each "similar +address". Then we close OR connections at random from among those _not_ +to a recognized relay in the latest directory. Finally, we close OR +connections at random. + +> We used to unconditionally prefer to close connections with fewer +> circuits. That's trivial for an adversary to circumvent, though: they +> can just open a bunch of circuits on their bogus OR connections, and +> force us to preferentially close circuits from real clients, bridges, +> and relays. + +> Note that some connections that seem like client connections ("not +> from relays in the latest directory") are actually those created by +> bridges. + +## What is "A similar address"? + +We define two connections as having a similar address if they are in the +same IPv4 /30, or if they are in the same IPv6 /90. + + +# Acknowledgments + +This proposal was inspired by a set of +[OOS improvements](https://gitlab.torproject.org/tpo/core/tor/-/issues/32794) +from `starlight`. diff --git a/proposals/BY_INDEX.md b/proposals/BY_INDEX.md index 9383351..b64ae55 100644 --- a/proposals/BY_INDEX.md +++ b/proposals/BY_INDEX.md @@ -258,4 +258,5 @@ Below are a list of proposals sorted by their proposal number. See * [`338-netinfo-y2038.md`](/proposals/338-netinfo-y2038.md): Use an 8-byte timestamp in NETINFO cells [ACCEPTED] * [`339-udp-over-tor.md`](/proposals/339-udp-over-tor.md): UDP traffic over Tor [ACCEPTED] * [`340-packed-and-fragmented.md`](/proposals/340-packed-and-fragmented.md): Packed and fragmented relay messages [OPEN] +* [`341-better-oos.md`](/proposals/341-better-oos.md): A better algorithm for out-of-sockets eviction [OPEN] diff --git a/proposals/README.md b/proposals/README.md index 1431e8d..4127329 100644 --- a/proposals/README.md +++ b/proposals/README.md @@ -40,6 +40,7 @@ for discussion. * [`326-tor-relay-well-known-uri-rfc8615.md`](/proposals/326-tor-relay-well-known-uri-rfc8615.md): The "tor-relay" Well-Known Resource Identifier * [`330-authority-contact.md`](/proposals/330-authority-contact.md): Modernizing authority contact entries * [`340-packed-and-fragmented.md`](/proposals/340-packed-and-fragmented.md): Packed and fragmented relay messages +* [`341-better-oos.md`](/proposals/341-better-oos.md): A better algorithm for out-of-sockets eviction ## ACCEPTED proposals: slated for implementation -- cgit v1.2.3-54-g00ecf