More editing, and an expanded analysis of port overlapping

author: Micah Elizabeth Scott <beth@torproject.org> 2024-01-11 19:26:54 -0800
committer: Micah Elizabeth Scott <beth@torproject.org> 2024-01-25 08:56:48 -0800
commit: 5b2d5866414fb205172ca735ce84d911feae948f (patch)
tree: 17a05c55763098d7c0f2c421ac2f6445a15338e4
parent: 0e662b4e0e40bd63e0c7216a6ca3a2ccdb44fea4 (diff)
1 files changed, 130 insertions, 85 deletions
diff --git a/proposals/XXX-udp-app-support.md b/proposals/XXX-udp-app-support.md
index 6adb598..55f2192 100644
--- a/proposals/XXX-udp-app-support.md
+++ b/proposals/XXX-udp-app-support.md
@@ -56,8 +56,8 @@ This proposal takes a fresh look at the problem of implementing support in Tor f
 
 This work is being done with the sponsorship and goals of the [Tor VPN Client for Android project](https://gitlab.torproject.org/groups/tpo/-/milestones/32).
 
-We start out by defining how this proposal compares to previous work, and the specific problem space we are addressing.
-This leads into an analysis that references appropriate standards and proposes some specific solutions with properties we can compare.
+The proposal begins with a summary of previous work and the specific problem space being addressed.
+This leads into an analysis of possible solutions, and finally some possible conclusions about the available development opportunities.
 
 ### History
 
@@ -71,10 +71,9 @@ The focus of this work was on a potential way to support unreliable traffic, not
 
 In proposal 100, a Tor *stream* is used for one pairing of local and remote address and port, copying the technique used by Tor for TCP.
 This works for some types of UDP applications, but it's broken by common behaviors like ICE connectivity checks, NAT traversal attempts, or using multiple servers via the same socket.
-We go into more detail about application behavior below.
 
 No additional large-message fragmentation protocol is defined, so the MTU in proposal 100 is limited to what fits in a single Tor cell.
-This value we will see is much too small for most applications.
+This value is much too small for most applications.
 
 It's possible these UDP protocol details would have been elaborated during design, but the proposal hit a snag elsewhere:
 there was no agreement on a way to avoid facilitating new attacks against anonymity.
@@ -92,7 +91,7 @@ Attacks that are described here, such as drops and injections, may be applied by
 [Proposal 339](https://spec.torproject.org/proposals/339-udp-over-tor.html) by Nick Mathewson in 2020 introduced a simpler UDP encapsulation design which had similar stream mapping properties as in proposal 100, but with the unreliable transport omitted. Datagrams are tunneled over a new type of Tor stream using a new type of Tor message.
 As a prerequisite, it depends on [proposal 319](https://spec.torproject.org/proposals/319-wide-everything.html) to support messages that may be larger than a cell, extending the MTU to support arbitrarily large UDP datagrams.
 
-In proposal 339 the property of binding a stream both to a local port and to a remote peer is described in UNIX-style terminology as a *connected socket*. We dive deeper into this idea below and supply an alternate formulation based on [RFC4787](https://www.rfc-editor.org/rfc/rfc4787), *NAT behavior requirements for UDP*.
+In proposal 339 the property of binding a stream both to a local port and to a remote peer is described in UNIX-style terminology as a *connected socket*. This idea is explored below using alternate terminology from [RFC4787](https://www.rfc-editor.org/rfc/rfc4787), *NAT behavior requirements for UDP*.
 The single-peer *connected socket* behavior would be referred to as an *endpoint-dependent mapping* in RFC4787.
 This type works fine for client/server apps but precludes the use of NAT traversal for peer-to-peer transfer.
 
@@ -104,11 +103,11 @@ We don't have a specific list of applications that must be supported, but we are
 
 Changes to the structure of the Tor network are out of scope, as are most performance optimizations. We expect to rely on common optimizations to the performance of Tor circuits, rather than looking to make specific changes that optimize for unreliable datagram transmission.
 
-We will discuss the design implications of UDP on onion services below.
-It's worth planning for this as a way to evaluate the design space, but in practice we are not aiming for UDP to onion services yet.
+This document will briefly discuss UDP for onion services below.
+It's worth planning for this as a way to evaluate the future design space, but in practice we are not aiming for UDP onion services yet.
 This will require changes to most applications that want to use it, as it implies that any media negotiations will need to understand onion addressing in addition to IPv4 and IPv6.
 
-We do not rigidly define the subset of UDP traffic that will be allowed.
+The allowed subset of UDP traffic is not subject to a single rigid definition.
 There are several options discussed below using the RFC4787 framework.
 
 We require support for DNS clients. Tor currently only supports a limited subset of DNS queries, and it's desirable to support more. This will be analyzed in detail as an application below. DNS is one of very few applications that still rely on fragmented UDP datagrams, though this may not be relevant for us since only servers typically need to control the production of fragments.
@@ -120,13 +119,13 @@ Video calls between two Tor users should transit directly between two exit nodes
 This requires that allocated UDP ports can each communicate with multiple peers:
 *endpoint-independent mapping* as described by RFC4787.
 
-We do not plan to support applications which accept arbitrary incoming datagrams, for example a DNS server hosted via Tor.
+We do not plan to support applications which accept incoming datagrams from previously-unknown peers, for example a DNS server hosted via Tor.
 RFC4787 calls this *endpoint-independent filtering*.
-It's unnecessary for running peer-to-peer apps, and it facilitates an extremely easy traffic injection attack.
+It's unnecessary for running peer-to-peer apps, and it facilitates an extremely easy [traffic injection](#traffic-injection) attack.
 
 ## UDP Traffic Models
 
-To better specify the role of a UDP extension for Tor, we will look at a few frameworks for describing UDP applications.
+To better specify the role of a UDP extension for Tor, this section explores a few frameworks for describing noteworthy subsets of UDP traffic.
 
 ### User Datagram Protocol (RFC768)
 
@@ -159,33 +158,40 @@ Every datagram sent or received on the socket may have a different peer address.
 
 ### Network Address Translation (NAT)
 
-Much of the real-world complexity in UDP applications comes from their strategies to detect and overcome the effects of NAT.
+Much of the real-world complexity in applying UDP comes from defining strategies to detect and overcome the effects of NAT.
+As a result, an intimidating quantity of IETF documentation has been written on NAT behavior and on strategies for NAT traversal.
 
-Many RFCs have been written on NAT behavior and NAT traversal strategies.
 [RFC4787](https://www.rfc-editor.org/rfc/rfc4787.html) and later [RFC7857](https://www.rfc-editor.org/rfc/rfc7857.html) offer best practices for implementing NAT. These are sometimes referred to as the [BEHAVE-WG](https://datatracker.ietf.org/wg/behave/about/) recommendations, based on the "Behavior Engineering for Hindrance Avoidance" working group behind them.
-Carrier-grade NAT requirements are addressed by [RFC6888](https://www.rfc-editor.org/rfc/rfc6888.html).
+
+[RFC6888](https://www.rfc-editor.org/rfc/rfc6888.html) makes additional recommendations for "carrier grade" NAT systems, where small pools of IP addresses are shared among a much larger number of subscribers.
 
 [RFC8445](https://www.rfc-editor.org/rfc/rfc8445.html) describes the Interactive Connectivity Establishment (ICE) protocol, which has become a common and recommended application-level technique for building peer-to-peer applications that work through NAT.
 
 There are multiple fundamental technical issues that NAT presents:
 
 1. NAT must be stateful in order to route replies back to the correct source.
+
    This directly conflicts with the stateless nature of UDP itself.
    The NAT's mapping lifetime, determined by a timer, will not necessarily match the lifetime of the application-level connection.
    This necessitates keep-alive packets in some protocols.
    Protocols that allow their binding to expire may be open to a NAT rebinding attack, when a different party acquires access to the NAT's port allocation.
+
 2. Applications no longer know an address they can be reached at without outside help.
+
    Chosen port numbers may or may not be used by the NAT.
    The effective IP address and port are not knowable without observing from an outside peer.
-3. Filtering and mapping approaches both vary, and it's not generally possible to establish a connection without testing several alternatives and choosing the one that works.
+
+3. Filtering and mapping approaches both vary, and it's not generally possible to establish a connection without interactive probing.
+
    This is the reason ICE exists, but it's also a possible anonymity hazard.
+   This risk is explored a bit further below in the context of [interaction with other networks](#interaction-with-other-networks).
 
 We can use the constraints of NAT both to understand application behavior and as an opportunity to model Tor's behavior as a type of NAT.
 In fact, Tor's many exit nodes already share similarity with some types of carrier-grade NAT.
 Applications will need to assume very little about the IP address their outbound UDP originates on, and we can use that to our advantage in implementing UDP for Tor.
 
-This body of IETF work is invaluable for understanding the scope of the problem and for defining common terminology.
-We must take inspiration from these documents while also keeping in mind that the analogy between Tor and a NAT is imperfect.
+This body of work is invaluable for understanding the scope of the problem and for defining common terminology.
+Let's take inspiration from these documents while also keeping in mind that the analogy between Tor and a NAT is imperfect.
 For example, in analyzing Tor as a type of carrier-grade NAT, we may consider the "pooling behavior" defined in RFC4787: the choice of which external addresses map to an internal address.
 Tor by necessity must carefully limit how predictable these mappings can ever be, to preserve its anonymity properties.
 A literal application of RFC6888 would find trouble in REQ-2 and REQ-9, as well as the various per-subscriber limiting requirements.
@@ -195,21 +201,21 @@ A literal application of RFC6888 would find trouble in REQ-2 and REQ-9, as well
 RFC4787 defines a framework for understanding the behavior of NAT by analyzing both its "mapping" and "filtering" behavior separately.
 Mappings are the NAT's unit of state tracking.
 Filters are layered on top of mappings, potentially rejecting incoming datagrams that don't match an already-expected address.
-Both RFC4787 and the demands of peer to peer applications make a good case for always using an **Endpoint-Independent Mapping**.
+Both RFC4787 and the demands of peer-to-peer applications make a good case for always using an **Endpoint-Independent Mapping**.
 
 Choice of filtering strategy is left open by the BEHAVE-WG recommendations.
-RFC4787 defines three types with different properties, and does not make one single recommendation for all circumstances.
-We can gain some additional insight by looking at requirements that come from outside RFC4787.
+RFC4787 does not make one single recommendation for all circumstances;
+instead it defines three behavior options with different properties:
 
 - **Endpoint-Independent Filtering** allows incoming datagrams from any peer once a mapping has been established.
 
   RFC4787 recommends this approach, with the concession that it may not be ideal for all security requirements.
 
-  In the context of Tor, we can likely rule out this technique entirely.
-  It makes traffic injection attacks possible from any source address, provided you can guess the UDP port number used at an exit.
+  This technique cannot be safely applied in the context of Tor.
+  It makes [traffic injection](#traffic-injection) attacks possible from any source address, provided you can guess the UDP port number used at an exit.
   It also makes possible clear-net hosting of UDP servers using an exit node's IP, which may have undesirable abuse properties.
 
-  It precludes "Port overlapping" behavior as defined in RFC7857 section 3, which may be necessary in order to achieve sufficient utilization of local port numbers on exit nodes.
+  This permissive filter is also incompatible with our proposed mitigation to [local port exhaustion](#local-port-exhaustion) on exit relays. Even with per-circuit rate limiting, an attacker could trivially overwhelm the local port capacity of all combined UDP-capable Tor exits.
 
   It is still common for present-day applications to *prefer* endpoint-independent filtering, as it allows incoming connections from peers which cannot use STUN or a similar address fixing protocol.
   Choosing endpoint-independent filtering would have *some* compatibility benefit, but among modern protocols which use ICE and STUN there would be no improvement.
@@ -229,11 +235,11 @@ We can gain some additional insight by looking at requirements that come from ou
 
   One security hazard of address-dependent and non-port-dependent filtering, identified in RFC4787, is that a peer on a NAT effectively negates the security benefits of this host filtering.
   In fact, this should raise additional red flags as applied to either Tor or carrier grade NAT.
-  If we are intending to support peer to peer applications, it should be commonplace to establish UDP flows between two tor exit nodes.
+  If supporting peer-to-peer applications, it should be commonplace to establish UDP flows between two Tor exit nodes.
   When this takes place, non-port-dependent filtering would then allow anyone on Tor to connect via those same nodes and perform traffic injection.
   The resulting security properties really become uncomfortably similar to endpoint-independent filtering.
 
-- **Address and Port-Dependent Filtering**
+- **Address- and Port-Dependent Filtering**
 
   This is the strictest variety of filtering, and it is an allowed alternative under RFC4787.
   It provides opportunities for increased security and opportunities for reduced compatibility, both of which in practice may depend on other factors.
@@ -244,13 +250,15 @@ We can gain some additional insight by looking at requirements that come from ou
   This is the only type of filtering that provides any barrier at all between cross-circuit traffic injection when the communicating parties are known.
 
 RFC4787 recommends that filtering style be configurable.
-We would like to implement that advice, but we are also looking for opportunities to make design decisions that give us the best network and end-user behaviors.
+We would like to implement that advice, but only to the extent it can be done safely and meaningfully in the context of an anonymity system.
+When possible, allowing applications to optionally request **Address-Dependent Filtering** would provide additional compatibility at no mandatory cost.
+Otherwise, **Address- and Port-Dependent Filtering** is the most appropriate default setting.
 
 ### Common Protocols
 
 Applications that want to use UDP are increasingly making use of higher-level protocols to avoid creating bespoke solutions for problems like NAT traversal, connection establishment, and reliable delivery.
 
-We will analyze how these protocols affect Tor's UDP traffic requirements.
+This section looks at how these protocols affect Tor's UDP traffic requirements.
 
 #### QUIC
 
@@ -263,19 +271,21 @@ The intention is to provide transparent roaming as mobile users change networks.
 This automated path discovery opens additional opportunities for malicious traffic, for which the RFC also offers mitigations. See *path validation* in section `8.2`, and the additional mitigations from section `9.3`.
 
 When QUIC is used as an optional upgrade path, we must compare any proposed UDP support against the baseline of a non-upgraded original connection.
-In these cases we are not looking for any specific compatibility enhancement, simply an avoidance of regression.
+In these cases the goal is not a compatibility enhancement but an avoidance of regression.
 
-In cases where QUIC is used as a primary protocol without TCP fallback, we expect UDP support to be vital. These applications are currently niche but we expect they may rise in popularity.
+In cases where QUIC is used as a primary protocol without TCP fallback, UDP compatibility will be vital.
+These applications are currently niche but we expect they may rise in popularity.
 
 #### WebRTC
 
 WebRTC is a large collection of protocols tuned to work together for media transport and NAT traversal.
-It is increasingly common, both for browser-based telephony and for peer to peer data transfer.
-Non-browser-based apps often implement WebRTC or have components in common with WebRTC.
+It is increasingly common, both for browser-based telephony and for peer-to-peer data transfer.
+Non-browser apps often implement WebRTC as well, for example using [`libwebrtc`](https://github.com/webrtc-sdk/libwebrtc).
+Even non-WebRTC apps sometimes have significant overlaps in their technology stacks, due to the independent history of ICE, RTP, and SDP adoption.
 
 Of particular importance to us, WebRTC uses the Interactive Connectivity Establishment (ICE) protocol to establish a bidirectional channel between endpoints that may or may not be behind a NAT with unknown configuration.
 
-Any generalized solution to connection establishment, like ICE, will require sending connectivity test probes. These have an inherent hazard to anonymity: assuming no delays are inserted intentionally, the result is a broadcast of similar traffic across all available network interfaces. This could form a convenient correlation beacon for an attacker attempting to de-anonymize users who use WebRTC over a Tor VPN.
+Any generalized solution to connection establishment, like ICE, will require sending connectivity test probes. These have an inherent hazard to anonymity: assuming no delays are inserted intentionally, the result is a broadcast of similar traffic across all available network interfaces. This could form a convenient correlation beacon for an attacker attempting to de-anonymize users who use WebRTC over a Tor VPN. This is the risk enumerated below as [*interaction with other networks*](#interaction-with-other-networks).
 
 See
 [RFC8825](https://www.rfc-editor.org/rfc/rfc8825.html) *Overview: Real-Time Protocols for Browser-Based Applications*,
@@ -290,7 +300,7 @@ With applications exhibiting such a wide variety of behaviors, how do we know wh
 How do we know which compatibility decisions will be most important to users?
 For this it's helpful to look at specific application behaviors.
 This is a best-effort analysis conducted at a point in time.
-It's not meant to be a definitive reference, think of it as a site survey taken before we plan a building.
+It's not meant to be a definitive reference; think of it as a site survey taken before planning a building.
 
 In alphabetical order:
 
@@ -313,11 +323,8 @@ In alphabetical order:
 
 ## Overview of Possible Solutions
 
-Now that we've defined some categories of UDP traffic we are interested in handling, this section starts to examine different high-level implementation techniques we could adopt.
-
-We can broadly split these into *datagram routing* and *tunneling*.
-
-Ideally we would be choosing a design that solves problems we have in the near-term while also providing a solid foundation for future enhancements to Tor, including changes which may add full support for unreliable delivery of datagrams. If we proceed down that path with insufficient understanding of the long-term goal, there's a risk that we will choose to adopt complexity in service of future goals while failing to serve them adequately when the time comes.
+This section starts to examine different high-level implementation techniques we could adopt.
+Broadly they can be split into *datagram routing* and *tunneling*.
 
 ### Datagram Routing
 
@@ -339,22 +346,22 @@ This points to the key weakness of relying on a separate network for UDP: Tor ha
 
 #### Future Work on Tor
 
-This is likely where we would seek to expand Tor's design in order to add end-to-end support for unreliable delivery in the future.
+There may be room for future changes to Tor which allow it to somehow transfer and route datagrams directly, without a separate process of establishing circuits and tunnels.
+If this is practical it may prove to be the simplest and highest performance route to achieving high quality UDP support in the long term.
 A specific design is out of the scope of this document.
 
 It is worth thinking early about how we can facilitate combinations of approaches.
-We may find a need for an abstraction similar to a network routing table, allowing multiple UDP providers to coexist.
-
-Even without bringing any new network configurations to Tor, achieving interoperable support for both exit nodes and onion services in a Tor UDP implementation requires some attention to how multiple UDP providers can coexist.
+Even without bringing any new network configurations to Tor, achieving interoperable support for both exit nodes and onion services in a Tor UDP implementation requires some attention to how multiple UDP providers can share protocol responsibilities.
+This may warrant the introduction of some additional routing layer.
 
 ### Tunneling
 
-The approaches in this section add a new construct which does not exist in UDP itself: a point to point tunnel between clients and some other location at which they establish the capability to send and receive UDP datagrams.
+The approaches in this section add a new construct which does not exist in UDP itself: a point-to-point tunnel between clients and some other location at which they establish the capability to send and receive UDP datagrams.
 
 Any tunneling approach requires some way to discover tunnel endpoints.
-We would like this to come as an extension of Tor's existing process for distributing consensus and representing exit policy.
+For the best usability and adoption this should come as an extension of Tor's existing process for distributing consensus and representing exit policy.
 
-We expect exit policies for UDP to have limited practical amounts of diversity.
+In practice, exit policies for UDP will have limited amounts of diversity.
 VPN implementations will need to know ahead of time which tunnel circuits to build, or they will suffer a significant spike in latency for the first outgoing datagram to a new peer.
 Additionally, it's common for UDP port numbers to be randomly assigned.
 This would make highly specific Tor exit policies even less useful and even higher overhead than they are with TCP.
@@ -365,7 +372,7 @@ The scope of this tunnel is quite similar to the existing TURN relays, used comm
 
 TURN is defined by [RFC8656](https://www.rfc-editor.org/rfc/rfc8656) as a set of extensions built on the framework from STUN in [RFC8489](https://www.rfc-editor.org/rfc/rfc8489.html). The capabilities are a good match for our needs, offering clients the ability to encapsulate UDP datagrams within a TCP stream, and to allocate local port mappings on the server.
 
-TURN was designed to be a set of modular and extensible pieces, which may be too far opposed to Tor's design philosophy of providing single canonical representations. Any adoption of TURN will need to consider the potential for malicious implementations to mark traffic, facilitating de-anonymization attacks.
+TURN was designed to be a set of modular and extensible pieces, which might be too distant from Tor's design philosophy of providing single canonical representations. Any adoption of TURN will need to consider the potential for malicious implementations to mark traffic, facilitating de-anonymization attacks.
 
 TURN has a popular embeddable C-language implementation, [coturn](https://github.com/coturn/coturn), which may be suitable for including alongside or inside C tor.
 
@@ -373,28 +380,32 @@ TURN has a popular embeddable C-language implementation, [coturn](https://github
 
 Most of the discussion on UDP implementation in Tor so far has assumed this approach. Essentially it's the same strategy as TCP exits, but for UDP. When the OP initializes support for UDP, it pre-builds circuits to exits that support required UDP exit policies. These pre-built circuits can then be used as tunnels for UDP datagrams.
 
-Within this overall approach, there are various ways we could choose to assign Tor *streams* for the UDP traffic. This will be considered below.
+Within this overall approach, there are various ways to assign Tor *streams* for the UDP traffic. This will be considered below.
 
 #### Tor Stream Tunnel to a Rendezvous Point
 
-To implement onion services which advertise UDP, we may consider using multiple simultaneous tunnels.
-In addition to exit nodes, clients could establish the ability to allocate virtual UDP ports on a rendezvous node of some kind.
+To implement onion services which advertise UDP ports, we can use additional tunnels.
+A new type of tunnel could end at a rendezvous point rather than an exit node.
+Clients could establish the ability to allocate a temporary virtual datagram mailbox at these rendezvous nodes.
 
+This leaves more open questions about how outgoing traffic is routed, and which addressing format would be used for the datagram mailbox.
 The most immediate challenge in UDP rendezvous would then become application support. Protocols like STUN and ICE deal directly with IPv4 and IPv6 formats in order to advertise a reachable address to their peer. Supporting onion services in WebRTC would require protocol extensions and software modifications for STUN, TURN, ICE, and SDP at minimum.
 
-UDP-like rendezvous extensions would have limited meaning unless they form part of a long-term strategy to forward datagrams in some new way for enhanced performance or compatibility. Otherwise, application authors might as well stick with Tor's existing TCP-like rendezvous functionality.
+UDP-like rendezvous extensions would have limited meaning unless they form part of a long-term strategy to forward datagrams in some new way for enhanced performance or compatibility. Otherwise, application authors might as well stick with Tor's existing reliable circuit rendezvous functionality.
 
 ## Specific Designs Using Tor Streams
 
 Let's look more closely at Tor *streams*, the multiplexing layer right below circuits.
 
-Streams have a 16-bit identifier, allocated arbitrarily by clients. Stream lifetimes are subject to some ambiguity still in the Tor spec. They are allocated by clients, but may be destroyed by either peer.
+Streams have an opaque 16-bit identifier, allocated from the onion proxy (OP) endpoint.
+Stream lifetimes are subject to some slight ambiguity still in the Tor spec.
+They are always allocated from the OP end but may be destroyed asynchronously by either circuit endpoint.
 
 We have an opportunity to use this additional existing multiplexing layer to serve a useful function in the new protocol, or we can opt to interact with streams as little as possible in order to keep the protocol features more orthogonal.
 
 ### One Stream per Tunnel
 
-The fewest new streams would be a single stream for all of UDP. This is what we get if we choose an off-the-shelf protocol like TURN as our UDP proxy.
+The fewest new streams would be a single stream for all of UDP. This is the result when using an off-the-shelf stream-oriented protocol like TURN-over-TCP as our UDP proxy.
 
 This approach would require only a single new Tor message type:
 
@@ -402,7 +413,9 @@ This approach would require only a single new Tor message type:
 
   - Establish a stream as a connection to the exit relay's built-in (or configured) TURN server.
 
-Note that RFC8656 requires authentication before data can be relayed, which is a good default best practice for the internet perhaps but is the opposite of what Tor is trying to do. We would either deviate from the specification to relax this auth requirement, or we would provide a way for clients to discover credentials: perhaps by fixing them ahead of time or by including them in the relay descriptor.
+    This would logically be a TURN-over-TCP connection, though it does not need to correspond to any real TCP socket if the TURN server is implemented in-process with tor.
+
+Note that RFC8656 requires authentication before data can be relayed, which is a good default best practice for the internet perhaps but is the opposite of what Tor is trying to do. We would either deviate from the specification to relax this auth requirement, or provide a way for clients to discover credentials: perhaps by fixing them ahead of time or by including them in the relay descriptor.
 
 ### One Stream per Socket
 
@@ -425,7 +438,7 @@ A simple one-to-one mapping between streams and sockets would preclude the optim
 
 One stream **per flow** has also been suggested.
 Specifically, Mike Perry brought this up during our conversations about UDP recently and we spent some time analyzing it from a RFC4787 perspective.
-We will see below it has interesting properties but also some hidden complexity.
+This approach has some interesting properties but also hidden complexity that may ultimately make other options more easily applicable.
 
 This would assign a stream ID to the tuple consisting of at least `(local port, remote address, remote port)`. Additional flags may be included for features like transmit and receive filtering, IPv4/v6 choice, and IP *Don't Fragment*.
 
@@ -447,7 +460,7 @@ When is this exit-originated circuit ID allocation potentially needed?
 It is clearly needed when using **address-dependent filtering**.
 An incoming datagram from a previously-unseen peer port is expected to be deliverable, and the exit would need to allocate an ID for it.
 
-Even with the stricter **address and port-dependent filtering** we may still be exposed to exit-originated circuit IDs if there are mismatches in the lifetime of the filter and the stream.
+Even with the stricter **address and port-dependent filtering** clients may still be exposed to exit-originated circuit IDs if there are mismatches in the lifetime of the filter and the stream.
 
 This approach thus requires some attention to either correctly allocating stream IDs on both sides of the circuit, or choosing a filtering strategy and filter/mapping lifetime that does not ever leave stream IDs undefined when expecting incoming datagrams.
 
@@ -455,11 +468,11 @@ This approach thus requires some attention to either correctly allocating stream
 
 One stream **per mapping** is an alternative which attempts to reduce the number of edge cases by merging the lifetimes of one stream and one **endpoint-independent mapping**.
 
-A mapping would always be allocated from the OP side.
+A mapping would always be allocated from the OP (client) side.
 It could explicitly specify a filtering style, if we wish to allow applications to request non-port-dependent filtering for compatibility.
 Each datagram within the stream would still need to be tagged with a peer address/port in some way.
 
-This approach would involve a single new type of stream, two new messages that pertain to these *mapping* streams:
+This approach would involve a single new type of stream, and two new messages that pertain to these *mapping* streams:
 
 - `NEW_UDP_MAPPING`
 
@@ -516,23 +529,24 @@ The implementation here could be a strict superset of the **per mapping** implem
   - Datagram contents only, without address.
   - Only appears on *flow* streams.
 
-We must consider the traffic marking opportunities we open when allowing an exit to represent one incoming datagram as either a *flow* or *mapping* datagram.
+We must consider the traffic marking opportunities provided when allowing an exit to represent one incoming datagram as either a *flow* or *mapping* datagram.
 
-It's possible this traffic injection potential is not worse than the baseline amount of injection potential than every UDP protocol presents. See more on [risks](#risks) below. For this hybrid stream approach specifically, there's a limited mitigation we can use to allow exits only a bounded amount of leaked information per UDP peer:
+It's possible this traffic injection potential is no worse than the baseline injection potential that every UDP protocol presents. See more on [risks](#risks) below. For this hybrid stream approach specifically, there's a limited mitigation available which allows exits only a bounded amount of leaked information per UDP peer:
 
-We would like to state that exits may not choose to send a `UDP_MAPPING_DATAGRAM` when they could have sent a `UDP_FLOW_DATAGRAM`.
+Ideally, exits would never be allowed to send a `UDP_MAPPING_DATAGRAM` when they could have sent a `UDP_FLOW_DATAGRAM`.
 Sometimes it is genuinely unclear though: an exit may have received this datagram in-between processing `NEW_UDP_MAPPING` and `NEW_UDP_FLOW`.
-We could choose to terminate circuits which send a `UDP_MAPPING_DATAGRAM` for a peer that has already been referenced in a `UDP_FLOW_DATAGRAM`, giving exits a one-way gate to let them switch a peer from *mapping* datagram to *flow* datagram but not the reverse.
+A partial mitigation would terminate circuits which send a `UDP_MAPPING_DATAGRAM` for a peer that has already been referenced in a `UDP_FLOW_DATAGRAM`.
+The exit is thereby given a one-way gate allowing it to switch from using *mapping* datagrams to using *flow* datagrams at some point, but not to switch back and forth repeatedly.
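
The one-way gate described above can be sketched as a small client-side check. This is only an illustration under stated assumptions: the class name `OneWayGate`, its method names, and the representation of a peer as an `(address, port)` tuple are hypothetical and not defined by the proposal.

```python
from typing import Set, Tuple

# Hypothetical peer representation: (remote address, remote port).
Peer = Tuple[str, int]


class OneWayGate:
    """Tracks which peers an exit has already referenced in flow datagrams."""

    def __init__(self) -> None:
        self._flow_peers: Set[Peer] = set()

    def on_flow_datagram(self, peer: Peer) -> None:
        # Once a peer is seen in a UDP_FLOW_DATAGRAM, the gate closes for it.
        self._flow_peers.add(peer)

    def on_mapping_datagram(self, peer: Peer) -> bool:
        # Returns True if the datagram is acceptable. False means the exit
        # switched back from flow to mapping datagrams for this peer, and
        # the circuit should be terminated.
        return peer not in self._flow_peers


gate = OneWayGate()
peer = ("203.0.113.5", 5000)
first_ok = gate.on_mapping_datagram(peer)   # mapping datagram before any flow
gate.on_flow_datagram(peer)                 # exit switches to flow datagrams
second_ok = gate.on_mapping_datagram(peer)  # switching back is a violation
```

A real client would also need to decide when entries leave the set, for example when the mapping itself is closed. That lifetime question is left open here.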
 
 Mappings that do not request port-specific filtering may always get unexpected `UDP_MAPPING_DATAGRAM`s. Mappings that do use port-specific filtering could make a flow for their only expected peers, then expect to never see `UDP_MAPPING_DATAGRAM`.
 
-We may wish for `NEW_UDP_MAPPING` to have an option requiring that only `UDP_FLOW_DATAGRAM` is to be used, never `UDP_MAPPING_DATAGRAM`.
+`NEW_UDP_MAPPING` could have an option requiring that only `UDP_FLOW_DATAGRAM` is to be used, never `UDP_MAPPING_DATAGRAM`.
 This would remove the potential for ambiguity, but costs in compatibility as it's no longer possible to implement non-port-specific filtering.
 
 ## Risks
 
 Any proposed UDP support involves significant risks to user privacy and software maintainability.
-We will try to elaborate some of these risks here, so they can be compared against the expected benefits.
+This section elaborates some of these risks, so they can be compared against expected benefits.
 
 ### Behavior Regressions
 
@@ -543,7 +557,7 @@ They may also occur for more fundamental reasons of protocol layering.
 For example, the redundant error correction layers when tunneling QUIC over TCP.
 These performance degradations are expected to be minor, but there's some unavoidable risk.
 
-We may mitigate the risk of severe performance or compatibility regressions by giving users a way to toggle UDP support per-application.
+The risk of severe performance or compatibility regressions may be mitigated by giving users a way to toggle UDP support per-application.
 
 Privacy and security regressions have more severe consequences and they can be much harder to detect.
 There are straightforward downgrades, like WebRTC apps that give up TURN-over-TLS for plaintext TURN-over-UDP.
@@ -551,7 +565,7 @@ More subtly, the act of centralizing connection establishment traffic in Tor exi
 
 ### Bandwidth Usage
 
-We expect an increase in overall exit bandwidth requirements due to peer-to-peer file sharing applications.
+We should expect an increase in overall exit bandwidth requirements due to peer-to-peer file sharing applications.
 
 Current users attempting to use BitTorrent over Tor are hampered by the lack of UDP compatibility. Interoperability with common file-sharing peers would make Tor more appealing to users with a large and sustained appetite for anonymized bandwidth.
 
@@ -559,7 +573,7 @@ Current users attempting to use BitTorrent over Tor are hampered by the lack of
 
 Exit routers will have a limited number of local UDP ports. In the most constrained scenario, an exit may have a single IP with 16384 or fewer ephemeral ports available. These ports could each be allocated by one client for an unbounded amount of exclusive use.
 
-In order to enforce high levels of isolation between different subsequent users of the same local UDP port, we may wish to enforce a delay between allocations during which nobody may own the port. Effective isolation requires this timer to be greater than any timer we expect to encounter on a peer or a NAT. In RFC4787's recommendations a NAT's mapping timer must be longer than 2 minutes. Our timer should ideally be *much* longer than 2 minutes.
+To enforce strong isolation between successive users of the same local UDP port, we may wish to enforce an extended delay between allocations during which nobody may own the port. Effective isolation requires this timer duration to be greater than any timer encountered on a peer or a NAT. RFC4787 requires that a NAT's mapping timer not expire in less than 2 minutes. Our timer should ideally be *much* longer than 2 minutes.
 
 An attacker who allocates ports for only this minimum duration of 2 minutes would need to send 136.5 requests per second to achieve sustained use of all available ports. With multiple simultaneous clients this could easily be done while bypassing per-circuit rate limiting.
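
The request rate quoted above follows directly from dividing the port count by the hold time. As a quick check, assuming each request holds exactly one port for the full duration (the function name is illustrative only):

```python
def sustained_request_rate(port_count: int, hold_seconds: float) -> float:
    """Requests per second needed to keep every port continuously allocated."""
    if port_count < 0 or hold_seconds <= 0:
        raise ValueError("port_count must be >= 0 and hold_seconds must be > 0")
    return port_count / hold_seconds


# 16384 ephemeral ports, each held for the 2 minute minimum.
rate = sustained_request_rate(16384, 120)
print(f"{rate:.1f} requests per second")
```

The same function shows how a longer minimum hold time lowers the rate an attacker must sustain, since each allocation occupies its port for longer.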
 
@@ -567,10 +581,28 @@ The expanded definition of "Port overlapping" from [RFC7857 section 3](https://d
 
 > This document clarifies that this port overlapping behavior may be extended to connections originating from different internal source IP addresses and ports as long as their destinations are different.
 
-This gives us an opportunity for a vast reduction in the number of required ports and file descriptors. Practically, though, it does require us to make a guess about which potential peers one source port may communicate with.
+This gives us an opportunity for a vast reduction in the number of required ports and file descriptors.
+Exit routers can automatically allocate local ports for use with a specific peer when that peer is first added to the client's filter.
+
+Due to the general requirements of NAT traversal, UDP applications with any NAT support will always need to communicate with a relatively well known server prior to any attempts at peer-to-peer communication.
+This early peer could be an entire application server, or it could be a STUN endpoint.
+In any case, the identity of this first peer gives us a hint about the set of all potential peers.
+
+Within the exit router, each local port will track a separate mapping owner for each peer.
+When processing that first outgoing datagram, the exit may choose any local port where the specific peer is not taken.
+Subsequent outgoing datagrams on the same port may communicate with a different peer, and there's no guarantee all these future peers will be claimed successfully.
+
+When is this a problem?
+An un-claimable peer represents a case where the exact `(local ip, local port, remote ip, remote port)` tuple is in use already for a different mapping in some other Tor stream.
+For example, imagine two clients are running different types of telecom apps which are nevertheless inter-compatible and capable of both calling the same peers.
+Alternatively, consider the same app but with servers in several regions.
+The two apps will begin by communicating with different sets of peers, due to different application servers and different bundled STUN servers.
+This is our hint that it's likely appropriate to overlap their local port allocations.
+At this point, both of these applications may be successfully sharing a `(local ip, local port)` tuple on the exit.
+As soon as one of these apps calls a peer with some `(remote ip, remote port)`, the other app will be unable to contact that specific peer.
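
The allocation behavior described above can be sketched as a table keyed by local port, with a separate owner recorded per peer. This is a minimal sketch, not the proposal's design: the class `OverlappingPortTable`, its methods, and the owner labels are hypothetical, and it omits port release and the reuse delay discussed earlier.

```python
from typing import Dict, Hashable, Iterable, Optional, Tuple

# Hypothetical peer representation: (remote address, remote port).
Peer = Tuple[str, int]


class OverlappingPortTable:
    """Per-port record of which mapping owns each remote peer."""

    def __init__(self, local_ports: Iterable[int]) -> None:
        self._owners: Dict[int, Dict[Peer, Hashable]] = {
            port: {} for port in local_ports
        }

    def allocate(self, owner: Hashable, first_peer: Peer) -> Optional[int]:
        # Choose any local port on which the first peer is not yet taken.
        for port, peers in self._owners.items():
            if first_peer not in peers:
                peers[first_peer] = owner
                return port
        return None  # every port already has this peer claimed

    def claim(self, owner: Hashable, port: int, peer: Peer) -> bool:
        # Later peers on the same port are claimed on first use; the claim
        # fails if a different mapping already owns that peer on this port.
        return self._owners[port].setdefault(peer, owner) == owner


table = OverlappingPortTable([40000])
port_a = table.allocate("mapping-a", ("192.0.2.10", 3478))
port_b = table.allocate("mapping-b", ("198.51.100.20", 3478))
shared_peer = ("203.0.113.5", 5000)
a_claimed = table.claim("mapping-a", port_a, shared_peer)
b_claimed = table.claim("mapping-b", port_b, shared_peer)
```

In this example both mappings share local port 40000 because their first peers differ. The first mapping to contact the shared peer claims it, and the second mapping's claim fails, which is the un-claimable peer case.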
 
-Our UDP implementation will need to choose a port assignment based on knowledge of only the first peer the app is sending to.
-Heuristically, we can make this work. The first peer in practice will be less unique than subsequent peers. Applications will contact centralized services before contacting peers. This ordering is necessary in the general case of ICE-like connection establishment.
+The lack of connectivity may seem like a niche inconvenience, and perhaps that is the extent of the issue.
+It seems likely this heuristic could result in a problematic information disclosure under some circumstances, and it deserves closer study.
 
 ### Application Fingerprinting
 
@@ -579,11 +611,12 @@ UDP applications present an increased surface of plaintext data that may be avai
 Exposed values can include short-lived identifiers like STUN usernames.
 Typically it will also be possible to determine what type of software is in use, and maybe what version of that software.
 
-Short-lived identifiers are still quite valuable to attackers, because they can reliably track application sessions across changes to the Tor exit. If longer-lived identifiers exist for any reason, that of course provides a powerful tool for call metadata gathering.
+Short-lived identifiers are still quite valuable to attackers, because they may reliably track application sessions across changes to the Tor exit. If longer-lived identifiers exist for any reason, that of course provides a powerful tool for call metadata gathering.
 
 ### Peer-to-Peer Metadata Collection
 
-One of our goals was to achieve the compatibility and perhaps performance benefits of allowing "peer-to-peer" (in our case really exit-to-exit) UDP connections. We expect this to enable the subset of applications that lack a fallback path which loops traffic through an app-provided server.
+One of our goals was to achieve the compatibility and perhaps performance benefits of allowing "peer-to-peer" (in our case really exit-to-exit) UDP connections.
+We expect this to enable the subset of applications that lack a fallback path which loops traffic through an app-provided server.
 
 This goal may be at odds with our privacy requirements. At minimum, a pool of malicious exit nodes could passively collect metadata about these connections as a noisy proxy for call metadata.
 
@@ -609,40 +642,52 @@ In case of malicious exit relays, whole datagrams can be inserted and dropped, a
 
 Of particular interest is the plaintext STUN, TURN, and ICE traffic used by most WebRTC apps. These applications rely on higher-level protocols (SRTP, DTLS) to provide end-to-end encryption and authentication. A compromise at the connection establishment layer would not violate application-level end-to-end security requirements, making it outside the threat model of WebRTC but very much still a concern for Tor.
 
-These attacks are not fully unique to the proposed UDP support, but UDP may increase exposure. In cases where the application already has a fallback using TURN-over-TLS, the proposal is a clear regression over previous behaviors. Even when we are comparing plaintext to plaintext, there may be a serious downside to centralizing all connection establishment traffic through a small number of exit IPs. Depending on your threat model, it could very well be more private to allow the UDP traffic to bypass Tor entirely.
+These attacks are not fully unique to the proposed UDP support, but UDP may increase exposure. In cases where the application already has a fallback using TURN-over-TLS, the proposal is a clear regression over previous behaviors. Even when comparing plaintext to plaintext, there may be a serious downside to centralizing all connection establishment traffic through a small number of exit IPs. Depending on your threat model, it could very well be more private to allow the UDP traffic to bypass Tor entirely.
 
 ### Malicious Outgoing Traffic
 
-We expect UDP compatibility in Tor will give malicious actors additional opportunities to transmit unwanted traffic.
+We can expect UDP compatibility in Tor will give malicious actors additional opportunities to transmit unwanted traffic.
+
+In general, exit abuse will need to be filtered administratively somehow.
+This is not unique to UDP support, and exit relay administration typically involves some type of filtering response tooling that falls outside the scope of Tor itself.
+
+Exit administrators may choose to modify their exit policy, or to silently drop problematic traffic.
+Silent dropping is discouraged in most cases, as Tor prioritizes the accuracy of an exit's advertised policy.
+Detailed exit policies have a significant space overhead in the overall Tor consensus document, but they are still seen as a valuable resource for clients during circuit establishment.
+
+Exit policy filtering may be less useful in UDP than with TCP due to the inconvenient latency spike when establishing a new [tunnel](#tunneling).
+Applications that are sensitive to RTT measurements made during connection establishment may fail entirely when the tunnel cannot be pre-built.
+
+This section lists a few potential hazards, but the real-world impact may be hard to predict owing to a diversity of custom UDP protocols implemented across the internet.
 
-#### Amplification attacks against arbitrary targets
+- Amplification attacks against arbitrary targets
 
-These are possible only in limited circumstances where the protocol allows an arbitrary reply address, like SIP.
+  These are possible only in limited circumstances where the protocol allows an arbitrary reply address, like SIP.
 The peer is often at fault for having an overly permissive configuration.
 Nevertheless, any of these *easy* amplification targets can be exploited from Tor with little consequence, creating a nuisance for the ultimate target and for exit operators.
 
-#### Amplification attacks against an exit relay
+- Amplification attacks against an exit relay
 
-An amplification peer which doesn't allow arbitrary destinations can still be used to attack the exit relay itself or other users of that relay.
+  An amplification peer which doesn't allow arbitrary destinations can still be used to attack the exit relay itself or other users of that relay.
 This is essentially the same attack that is possible against any NAT the attacker is behind.
 
-#### Malicious fragmented traffic
+- Malicious fragmented traffic
 
-If we allow sending large UDP datagrams over IPv4 without the *Don't Fragment* flag set, we allow attackers to generate fragmented IP datagrams.
+  If we allow sending large UDP datagrams over IPv4 without the *Don't Fragment* flag set, we allow attackers to generate fragmented IP datagrams.
 This is not itself a problem, but it has historically been a common source of inconsistencies in firewall behavior.
 
-#### Excessive sends to an uninterested peer
+- Excessive sends to an uninterested peer
 
-Whereas TCP mandates a successful handshake, UDP will happily send unlimited amounts of traffic to a peer that has never responded.
+  Whereas TCP mandates a successful handshake, UDP will happily send unlimited amounts of traffic to a peer that has never responded.
 To prevent denial of service attacks we have an opportunity and perhaps a responsibility to define our supported subset of UDP to include true bidirectional traffic but exclude continued sends to peers who do not respond.
 
-See also [RFC7675](https://www.rfc-editor.org/rfc/rfc7675.html) and STUN's concept of "Send consent".
+  See also [RFC7675](https://www.rfc-editor.org/rfc/rfc7675.html) and STUN's concept of "Send consent".
 
-#### Excessive number of peers
+- Excessive number of peers
 
-We may want to place conservative limits on the maximum number of peers per mapping or per circuit, in order to make bulk scanning of UDP port space less convenient.
+  We may want to place conservative limits on the maximum number of peers per mapping or per circuit, in order to make bulk scanning of UDP port space less convenient.
 
-The limit does need to be on peers, not stream IDs as we presently do for TCP.
+  The limit would need to be on peers, not stream IDs as we presently do for TCP.
 In this proposal stream IDs are not necessarily meaningful except as a representational choice made by clients.
 
 ## Next Steps