More writing and outlining

Intro material is somewhat more there now, and I put in a very very rough sketch of the protocol-level recommendations at the very end.
author: Micah Elizabeth Scott <beth@torproject.org> 2023-12-20 19:10:12 -0800
committer: Micah Elizabeth Scott <beth@torproject.org> 2024-01-25 08:56:48 -0800
commit: 9b886c681a35a9251bd63bb3c622fe1ca8c7cef5 (patch)
tree: 97c479a22acaad7fbd0646a95536c494fa7cbcfe
parent: 0dd88e6e89e620ab015fc402718bc6cb9ebaf4c1 (diff)
download: torspec-9b886c681a35a9251bd63bb3c622fe1ca8c7cef5.tar.gz
torspec-9b886c681a35a9251bd63bb3c622fe1ca8c7cef5.zip
1 files changed, 108 insertions, 24 deletions
diff --git a/proposals/XXX-udp-app-support.md b/proposals/XXX-udp-app-support.md
index 3a08706..ce30bda 100644
--- a/proposals/XXX-udp-app-support.md
+++ b/proposals/XXX-udp-app-support.md
@@ -6,18 +6,60 @@ Created: December 2023
 Status: Draft
 ```
 
-that exit relays can also relay some additional types of UDP traffic
-to the network.
-With the above in mind, the goal of this proposal is specifically to enable compatibility with more applications.
-We will analyze some of those applications below, in order to choose an appropriate subset of UDP to implement which achieves our compatibility goals.
-These compatibility goals will need to be weighed against any anonymity hazards or opportunities for abuse.
+# Introduction
 
-We expect that work to improve round-trip time and jitter characteristics, such as Conflux, will benefit both TCP and UDP applications.
-Many applications already support both transport protocols, and will perform similarly with or without this proposal.
+This proposal takes a fresh look at the problem of implementing support in Tor for applications which require UDP/IP communication.
+
+We start out by defining how this proposal compares to previous work, and the specific problem space we are addressing.
+This leads into an analysis that references appropriate standards and proposes some specific solutions with properties we can compare.
+
+## History
+
+There have been multiple attempts within Tor to define some type of UDP support.
+
+Proposal 100 by Marc Liberatore in 2006 suggested a way to "add support for tunneling unreliable datagrams through tor with as few modifications to the protocol as possible." This proposal extends the existing protocol with a DTLS link mode alongside the existing TLS+TCP. The focus of this work was on a potential way to support unreliable traffic, not necessarily on UDP itself or on UDP applications.
+
+In proposal 100, a Tor "Stream" is used for one pairing of local and remote address and port, copying the technique used by Tor for TCP. This works for some types of UDP applications, as we detail below, but for many peer-to-peer use cases it's unhelpful to allocate local ports that can only be used with a single remote peer.
+
+Furthermore, no additional large-message fragmentation protocol is defined, so the MTU in proposal 100 is limited to one Tor cell, a value that we will see is unusably small for most applications.
+
+It's possible these UDP protocol details would have been elaborated during design, but this proposal hit a snag elsewhere: there was no agreement on a way to avoid causing new attacks against anonymity.
+
+In 2018, Nick Mathewson and Mike Perry wrote this summary of the general issues with unreliable transports for Tor:
+
+https://research.torproject.org/techreports/side-channel-analysis-2018-11-27.pdf
+
+TODO: Look closer at this, some of the attacks described might overlap with non-datagram-transport ways of encapsulating UDP. (i.e. a malicious exit may always cause drops/injections regardless of what the rest of the anonymity network does.)
+
+TODO: There has got to be a section here for new risks with malicious exits. Some protocols may see increased risk vs. their existing TCP fallbacks.
+
+Proposal 339 by Nick Mathewson in 2020 introduced a simpler UDP encapsulation design with stream mapping properties similar to those in proposal 100, but with the unreliable transport omitted. Datagrams are tunneled over a new type of Tor circuit using a new type of Tor message. As a prerequisite, it depends on proposal 319 to support messages that may be larger than a cell, extending the MTU to support arbitrarily large UDP datagrams.
+
+In proposal 339 the property of binding a stream both to a local port and to a remote peer is described in UNIX-style terminology as a "connected socket". We dive deeper into this idea below and supply an alternate formulation based on RFC4787, NAT behavior requirements for UDP. This single-peer behavior would be referred to as an "endpoint-dependent mapping" in RFC4787. It works fine for client/server apps but precludes the use of peer-to-peer techniques.
+
+## Scope
+
+This proposal aims to allow Tor applications and Tor-based VPNs to provide compatibility with applications that require UDP/IP communications.
+
+We don't have a specific list of applications that must be supported, but we are currently aiming for broad support of popular applications while still respecting and referencing all applicable Internet standards documents.
+
+Changes to the structure of the Tor network are out of scope, as are specific performance optimizations. We expect to rely on common optimizations to the performance of Tor circuits, rather than looking to make specific changes to optimize for unreliable datagram transmission.
+
+We will discuss the design implications of UDP on onion services below.
+It's worth planning for this as a way to evaluate the design space, but in practice we are not aiming for UDP to onion services yet.
+This will require changes to most applications that want to use it, as it implies that any media negotiations will need to understand onion addressing in addition to IPv4 and IPv6.
+
+We do not rigidly define the subset of UDP traffic that will be allowed.
+There are several options discussed below using the framework of RFC4787 NAT recommendations.
+
+We do expect support for voice/video telecommunications apps. Even without an underlying transport that supports unreliable datagrams, we expect a tunnel to provide a usable level of compatibility. This design space is very similar to the TURN (RFC8656) specification, already used very widely for compatibility with networks that filter UDP. See the analysis of specific applications below.
+
+We do expect support for DNS clients. Tor currently only supports a limited subset of DNS queries, and it's desirable to support more. This will be analyzed in detail as an application below. DNS is one of very few applications that still rely on fragmented UDP datagrams, though this may not be relevant for us since only servers typically need to control the production of fragments.
 
 # UDP Traffic Models
 
 ## User Datagram Protocol (RFC768)
+
 The "User Interface" suggested by RFC768 for the protocol is a rough sketch, suggesting that applications have some way to allocate a local port for receiving datagrams and to transmit datagrams with arbitrary headers.
 Despite UDP's simplicity as an application of IP, we do need to be aware of IP features that are typically hidden by TCP's abstraction.
 UDP applications typically try to obtain an awareness of the path MTU, using some type of path MTU discovery (PMTUD) algorithm.
@@ -26,6 +68,7 @@ On IPv4, this requires sending packets with the "Don't Fragment" flag set, and m
 Note that many applications have their own requirements for path MTU. For example, QUIC and common implementations of WebRTC require an MTU no smaller than 1200 bytes, but they can discover larger MTUs when available.
 
 ## Socket Layer
+
 In practice the straightforward "User Interface" from RFC768, capable of arbitrary local address, is only available to privileged users.
 
 BSD-style sockets support UDP via `SOCK_DGRAM`.
@@ -189,6 +232,7 @@ In alphabetical order:
 | WhatsApp               | Telecom        | STUN, TURN-over-TCP. Multi server | Works              | Slight latency improvement                         |
 | WiFi Calling           | Telecom        | IPsec tunnel                      | Out of scope       | Still out of scope                                 |
 | Zoom                   | Telecom        | client/server or P2P, UDP/TCP     | Works              | Slight latency improvement                         |
+
 ## Malicious traffic
 
 TODO: Various kinds of traffic we want to avoid
@@ -209,19 +253,35 @@ TODO: Are there plaintext identifiers in these telecom apps?
 
 TODO: Is there any chance we make the anonymity risk worse by providing UDP exits than it would be with an application-provided TCP relay server?
 
-# Alternative designs
+# Design approaches
+
+## Datagram routing
+
+### 3rd party implementations
+
+### Hybrid approaches
+
+### Making room for a future Tor implementation
+
+## Tunneling
+
+### Using TURN (RFC8656) encapsulated in a Tor stream
 
 TODO: Comparison vs. an entirely out-of-protocol and potentially out-of-process TURN server. Is this complexity warranted?
 
-# Tor protocol design
+### Using Tor streams to an exit
+
+TODO: Bulk of the design goes here, this is what we likely start with
 
-Using the specification- and application-based goals above, here we will briefly discuss the design constraints as they relate to Tor's protocol.
+### Using Tor streams to a rendezvous point
 
-## Stream usage
+TODO: possible generalization of above to support onion UDP. big caveat that apps need extensions to support onion addressing.
+
+#### Stream usage
 
 An early design juncture in this project is the particular choice of scope for one *stream* in the existing Tor protocol.
 
-- One stream **per socket** was the approach suggested in an earlier version of this proposal.
+- One stream **per socket** was the approach suggested in Proposal 339.
 
   Each stream would match the lifetime of a source port allocation.
   There would be a single peer address/port allowed per allocation.
@@ -279,26 +339,50 @@ An early design juncture in this project is the particular choice of scope for o
   In TURN, a "channel" is only ever allocated from the originating side of the connection.
   Incoming datagrams with no channel can always be represented in the long form, so TURN never has to allocate channels unexpectedly.
 
+# Protocol Designs
+
+Currently we have two design candidates, and both are described here.
+
+TODO: Exit selection is likely common to both of these approaches. Talk about how we specify UDP exit policy, how we provide a smart tradeoff between ability to filter harmful traffic while avoiding an explosion in exit policy complexity.
+
+## Tor UDP Mapping Protocol
+
+This is a *tunneled* approach, using Tor's existing protocol objects where possible. Exits advertise support for specific UDP policies. Clients choose an exit where they would like to allocate a UDP local port. They create one *stream* with a lifetime matching the lifetime of that mapping. Optional additional *streams* may be used as fast paths for common peers. This is the *hybrid approach* to stream usage, from above.
+
+- `NEW_UDP_MAPPING`
 
-# Tor protocol specification
-TODO
+  - always client-to-exit
+  - stream ID for mapping
+  - no address given
+  - "don't fragment" flag
+  - port-specific filtering flag
+  - no reply? early data always ok.
+  - ended on circuit teardown or by `END`
 
-TODO
+- `UDP_MAPPING_DATAGRAM`
 
-TODO: Source port cookie, if we choose to have multiple streams per source port. (Under discussion still; see below) Flags for transmit/receive allowed.
+  - conveys one datagram on a stream defined by `NEW_UDP_MAPPING`.
+  - Includes peer address (ipv4/ipv6) as well as datagram content
 
-TODO: Do we need this at all? If a NAT is involved the info won't be correct. Even if it's correct, existing apps (connected via a VPN) wouldn't be able to make use of it. They would still talk to a STUN server. Do we care about UDP apps written directly to the Arti API? Are there non-malicious uses for this?
+- `NEW_UDP_FLOW`
 
-TODO: This would need an additional byte or two of 'peer ID' if we choose to implement streams as 1:1 with local port allocations. This is still being discussed and a conclusion hasn't been reached, see below.
+  - Lifetime is <= lifetime of UDP mapping.
+  - stream ID for parent mapping, and new flow within that mapping
+  - ended on mapping end or by explicit `END`
+  - Includes peer address (ipv4/ipv6)
 
-TODO: What does it mean for a stream to end here? This depends on how we choose to do port allocations, which needs more discussion. Should the stream lifetime and port allocation lifetime match, or should there be a separate timer system?
+- `UDP_FLOW_DATAGRAM`
 
-TODO: This is flawed, the "connected/unconnected" idea is just a UNIX'ism and we should be thinking about streams and about how we allocate source ports. Also, limiting connections to a single peer breaks 100% of P2P connections, since they require using STUN (or similar) first.
+  - Datagram contents without address, for flow streams.
+  - If a flow stream exists, the exit must use `UDP_FLOW_DATAGRAM` instead of `UDP_MAPPING_DATAGRAM`.
+  - Mappings that do not request port-specific filtering may always get unexpected `UDP_MAPPING_DATAGRAM` messages. Mappings that do use port-specific filtering could create a flow for each of their expected peers, then expect to never see `UDP_MAPPING_DATAGRAM`.
 
-TODO: Right now beth and mike have different ideas for how this could work, and we need to discuss more.
+## TURN over Tor
 
-TODO: Revise this section. Nearly everything assumes the MTU is larger, and we are driving this design based on app compatibility.
+In this approach, a single Tor stream is used for all UDP traffic.
 
-TODO: "Application" here being the VPN or the end-user app? We've discussed having a per-app opt-in for UDP, enforced at the VPN layer.
-TODO: update this
+- `CONNECT_TURN`
 
+  - Establish a stream as a connection to the exit relay's built-in (or configured) TURN server.
+  - RFC8656 requires authentication before data can be relayed. That may be a good default practice for the internet at large, but it is the opposite of what Tor is trying to do. We would either modify the specification to relax this auth requirement, or provide a way for clients to discover credentials: either by fixing them ahead of time or by including them in the relay descriptor.
+  
+\ No newline at end of file
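
Reading aid for the "Tor UDP Mapping Protocol" outline in this patch: a minimal sketch of how an exit might choose between `UDP_FLOW_DATAGRAM` and `UDP_MAPPING_DATAGRAM` for a datagram arriving on a mapped port. Only the four message names come from the patch. The `UdpMapping` class, its fields, and the tuple layout are hypothetical, and the filtering rule (drop datagrams from peers the client has neither sent to nor opened a flow for) is an assumption loosely modeled on RFC4787 address and port-dependent filtering, not something the draft specifies.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set, Tuple

Peer = Tuple[str, int]  # (IP address, UDP port) of a remote peer
# (message name, stream ID, peer address or None, payload); layout is illustrative only
Message = Tuple[str, int, Optional[Peer], bytes]


@dataclass
class UdpMapping:
    """Hypothetical exit-side state for one stream opened by NEW_UDP_MAPPING."""

    mapping_stream_id: int
    port_specific_filtering: bool = False
    flows: Dict[Peer, int] = field(default_factory=dict)  # peer -> flow stream ID
    sent_to: Set[Peer] = field(default_factory=set)       # peers the client has sent to

    def new_flow(self, flow_stream_id: int, peer: Peer) -> None:
        """NEW_UDP_FLOW: bind an additional stream to exactly one peer."""
        self.flows[peer] = flow_stream_id

    def note_outgoing(self, peer: Peer) -> None:
        """Record a client-to-peer datagram sent on the mapping stream."""
        self.sent_to.add(peer)

    def route_incoming(self, peer: Peer, payload: bytes) -> Optional[Message]:
        """Pick the message used to deliver a datagram received from `peer`."""
        flow_id = self.flows.get(peer)
        if flow_id is not None:
            # A flow exists, so the exit must use the address-free short form.
            return ("UDP_FLOW_DATAGRAM", flow_id, None, payload)
        if self.port_specific_filtering and peer not in self.sent_to:
            # Assumed filtering behavior: unexpected peers are dropped.
            return None
        # No flow for this peer: long form carrying the peer address.
        return ("UDP_MAPPING_DATAGRAM", self.mapping_stream_id, peer, payload)


mapping = UdpMapping(mapping_stream_id=1, port_specific_filtering=True)
stun_server = ("192.0.2.1", 3478)
mapping.new_flow(flow_stream_id=2, peer=stun_server)
print(mapping.route_incoming(stun_server, b"reply"))
print(mapping.route_incoming(("198.51.100.7", 5000), b"unexpected"))
```

With filtering requested and a flow opened for the only expected peer, this client receives flow datagrams only, which matches the last bullet under `UDP_FLOW_DATAGRAM`. A mapping created without the filtering flag would instead deliver the second datagram as `UDP_MAPPING_DATAGRAM` with the peer address attached.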