Diffstat (limited to 'proposals/344-protocol-info-leaks.txt')
-rw-r--r-- proposals/344-protocol-info-leaks.txt | 1188
1 file changed, 1188 insertions(+), 0 deletions(-)
diff --git a/proposals/344-protocol-info-leaks.txt b/proposals/344-protocol-info-leaks.txt
new file mode 100644
index 0000000..f20ebca
--- /dev/null
+++ b/proposals/344-protocol-info-leaks.txt
@@ -0,0 +1,1188 @@
+```
+Filename: 344-protocol-info-leaks.txt
+Title: Prioritizing Protocol Information Leaks in Tor
+Author: Mike Perry
+Created: 2023-07-17
+Purpose: Normative
+Status: Open
+
+
+0. Introduction
+
+Tor's protocol has numerous forms of information leaks, ranging from highly
+severe covert channels, to behavioral issues that have been useful
+in performing other attacks, to traffic analysis concerns.
+
+Historically, we have had difficulty determining the severity of these
+information leaks when they are considered in isolation. At a high level, many
+information leaks look similar, and all seem to be forms of traffic analysis,
+which is regarded as a difficult attack to perform due to Tor's distributed
+trust properties.
+
+However, some information leaks are indeed more severe than others: some can
+be used to remove Tor's distributed trust properties by providing a covert
+channel and using it to ensure that only colluding and communicating relays
+are present in a path, thus deanonymizing users. Some do not provide this
+capability, but can be combined with other info leak vectors to quickly yield
+Guard Discovery, and some only become dangerous once Guard Discovery or other
+anonymity set reduction is already achieved.
+
+By prioritizing information leak vectors by their co-factors, impact, and
+resulting consequences, we can see that these attack vectors are not all
+equivalent. Each vector of information leak also has a common solution, and
+some categories even share the same solution as other categories.
+
+This framework is essential for understanding the context in which we will be
+addressing information leaks, so that decisions and fixes can be understood
+properly. This framework is also essential for recognizing when new protocol
+changes might introduce information leaks or not, for gauging the severity of
+such information leaks, and for knowing what to do about them.
+
+Hence, we are including it in tor-spec, as a living, normative document to be
+updated with experience, and as external research progresses.
+
+It is essential reading material for any developers working on new Tor
+implementations, be they Arti, Arti-relay, or a third party implementation.
+
+This document is likely also useful to developers of Tor-like anonymity
+systems, of which there are now several, such as I2P, MASQUE, and Oxen. They
+definitely share at least some, and possibly even many of these issues.
+
+Readers who are relatively new to anonymity literature may wish to first
+consult the Glossary in Section 3, especially if terms such as Covert Channel,
+Path Bias, Guard Discovery, and False Positive/False Negative are unfamiliar
+or hazy. There is also a catalog of historical real-world attacks that are
+known to have been performed against Tor in Section 2, to help illustrate how
+information leaks have been used adversarially, in practice.
+
+We are interested in hearing from journalists and legal organizations who
+learn about court proceedings involving Tor. We became aware of three
+instances of real-world attacks covered in Section 2 in this way. Parallel
+construction (hiding the true source of evidence by inventing an alternate
+story for the court -- also known as lying) is a possibility in the US and
+elsewhere, but (so far) we are not aware of any direct evidence of this
+occurring with respect to Tor cases. Still, keep your eyes peeled...
+
+
+0.1. Table of Contents
+
+ 1. Info Leak Vectors
+ 1.1. Highly Severe Covert Channel Vectors
+ 1.1.1. Cryptographic Tagging
+ 1.1.2. End-to-end cell header manipulation
+ 1.1.3. Dropped cells
+ 1.2. Info Leaks that enable other attacks
+ 1.2.1. Handshakes with unique traffic patterns
+ 1.2.2. Adversary-Induced Circuit Creation
+ 1.2.3. Relay Bandwidth Lying
+ 1.2.4. Metrics Leakage
+ 1.2.5. Protocol Oracles
+ 1.3. Info Leaks of Research Concern
+ 1.3.1. Netflow Activity
+ 1.3.2. Active Traffic Manipulation Covert Channels
+ 1.3.3. Passive Application-Layer Traffic Patterns
+ 1.3.4. Protocol or Application Linkability
+ 1.3.5. Latency Measurement
+ 2. Attack Examples
+ 2.1. CMU Tagging Attack
+ 2.2. Guard Discovery Attacks with Netflow Deanonymization
+ 2.3. Netflow Anonymity Set Reduction
+ 2.4. Application Layer Confirmation
+ 3. Glossary
+
+
+1. Info Leak Vectors
+
+In this section, we enumerate the vectors of protocol-based information leak
+in Tor, in order of highest priority first. We separate these vectors into
+three categories: "Highly Severe Covert Channels", "Info Leaks that Enable
+other attacks", and "Info Leaks Of Research Concern". The first category
+yields deanonymization attacks on its own. The second category enables other
+attacks that can lead to deanonymization. The final category can be aided by
+the earlier vectors to become more severe, but overall severity is a
+combination of many factors, and requires further research to illuminate all
+of these factors.
+
+For each vector, we provide a brief "at-a-glance" summary, which includes a
+ballpark estimate of Accuracy in terms of False Positives (FP) and False
+Negatives (FN), as 0, near-zero, low, medium, or high. We then list what is
+required to make use of the info leak, the impact, the reason for the
+prioritization, and some details on where the signal is injected and observed.
+
+
+1.1. Highly Severe Covert Channel Vectors
+
+This category of info leak consists entirely of covert channel vectors that
+have zero or near-zero false positive and false negative rates, because they
+can inject a covert channel in places where similar activity would not happen,
+and they are end-to-end.
+
+They also either provide or enable path bias attacks that can capture
+the route clients use, to ensure that only malicious exits are used, leading
+to full deanonymization when the requirements are met.
+
+If the adversary has censorship capability, and can ensure that users only
+connect to compromised Guards (or Bridges), they can fully deanonymize all
+users with these covert channels.
+
+
+1.1.1. Cryptographic Tagging
+
+At a glance:
+ Accuracy: FP=0, FN=0
+ Requires: Malicious or compromised Guard, at least one exit
+ Impact: Full deanonymization (path bias, identifier transmission)
+ Path Bias: Automatic route capture (all non-deanonymized circuits fail)
+ Reason for prioritization: Severity of Impact; similar attacks used in wild
+ Signal is: Modified cell contents
+ Signal is injected: by guard
+ Signal is observed: by exit
+
+First reported at Black Hat in 2009 (see [ONECELL]), and elaborated further
+with the path bias amplification attack in 2012 by some Raccoons (see
+[RACCOON23]), this is the most severe vector of covert channel attack in Tor.
+
+Cryptographic tagging is where an adversary who controls a Guard (or Bridge)
+XORs an identifier, such as an IP address, directly into the circuit's
+cipher-stream, in an area of known-plaintext. This tag can be exactly
+recovered by a colluding exit relay, ensuring zero false positives and zero
+false negatives for this built-in identifier transmission, along with their
+collusion signal.
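Because the circuit layers are counter-mode cipher-streams, this malleability can be sketched with a toy XOR model. (This is purely illustrative: the names, sizes, and zero-filled known-plaintext region here are assumptions for the sketch, not Tor's actual cell format.)

```python
# Toy model of cryptographic tagging: a stream cipher is malleable, so a
# guard that XORs a tag into the ciphertext causes the exit to decrypt
# plaintext XOR tag. In a known-plaintext region, the exit recovers the
# tag exactly, with zero false positives or negatives.
import os

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

keystream = os.urandom(16)       # stand-in for the circuit cipher-stream
known_plaintext = bytes(16)      # region the exit can predict (illustrative)

ciphertext = xor(known_plaintext, keystream)

# Malicious guard XORs a 4-byte identifier (e.g. an IP address) into the stream.
tag = bytes([10, 0, 0, 42])
tagged = xor(ciphertext, tag + bytes(12))

# Colluding exit decrypts and recovers the tag from the known-plaintext area.
decrypted = xor(tagged, keystream)
recovered = xor(decrypted[:4], known_plaintext[:4])
assert recovered == tag
```

An honest exit would instead see a garbled cell and fail its digest validation, which is exactly what drives the automatic path bias amplification described below.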
+
+Additionally, because every circuit that does not have a colluding relay will
+automatically fail because of the failed digest validation, the adversary gets
+a free path bias amplification attack, such that their relay only actually
+carries traffic that they know they have successfully deanonymized. Because
+clients will continually attempt to re-build such circuits through the guard
+until they hit a compromised exit and succeed, this violates Tor's distributed
+trust assumption, reducing it to the same security level as a one-hop proxy
+(ie: the security of fully trusting the Guard relay). Worse still, when the
+adversary has full censorship control over all connections into the Tor
+network and also uses this vector, Tor provides zero anonymity or privacy
+against them.
+
+Because the Exit is able to close *all* circuits that are not deanonymized,
+for maximal efficiency, the adversary's Guard capacity should exactly match
+their Exit capacity. To make up for the loss of traffic caused by closing many
+circuits, relays can lie about their bandwidth (see Section 1.2.3).
+
+Large amounts of circuit failure (that might be evidence of such an attack)
+are tracked and reported by C-Tor in the logs, by the path bias detector, but
+when the Guard is under DDoS, or even heavy load, this can yield false alarms.
+These false alarms happened frequently during the network-wide DDoS of
+2022-2023. They can also be induced at arbitrary Guards via DoS, to make users
+suspicious of their Guards for no reason.
+
+The path bias detector could have a second layer in Arti that checks whether
+any specific Exits are overused when the circuit failure rate is high. This
+would be more indicative of an attack, but could still trigger if the user is
+actually trying to use rare exits (ie: country selection, bittorrent).
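Such a second-layer check could be sketched as follows. (This is a hypothetical heuristic with an arbitrary threshold, not an implemented or proposed Arti design: it flags exits whose share of successful circuits greatly exceeds their consensus-weight share.)

```python
# Hypothetical sketch of a second-layer path bias check: when circuit
# failure is high, compare each exit's share of successful circuits
# against its consensus-weight share. A few exits with a large excess is
# what a route-capture attack would look like.
def overused_exits(successes: dict, weights: dict, threshold: float = 4.0) -> list:
    """Return exits whose observed usage exceeds `threshold` times their
    expected (weight-proportional) usage."""
    total_succ = sum(successes.values())
    total_weight = sum(weights.values())
    flagged = []
    for exit_id, count in successes.items():
        observed = count / total_succ
        expected = weights[exit_id] / total_weight
        if expected > 0 and observed / expected >= threshold:
            flagged.append(exit_id)
    return flagged

weights = {"A": 50, "B": 45, "C": 5}
successes = {"A": 2, "B": 1, "C": 17}   # tiny exit C carries most circuits
assert overused_exits(successes, weights) == ["C"]
```

As the text notes, a user deliberately pinning rare exits would also trip such a check, so any real design would need to account for intentional exit restriction.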
+
+This attack, and the path bias attacks used in the next two sections, face
+some minor engineering barriers when performed against both onion and exit
+traffic, because onion service traffic is restricted to particular hops in
+the case of HSDIR and intro point circuits. However, because pre-built
+circuits are used to access HSDIRs and intro points, the adversary can use
+their covert channel such that only exits and pre-built onion service
+circuits are allowed to proceed. Onion services are harder to deanonymize in
+this way, because the adversary cannot control the client's HSDIR choice,
+but clients can still be connected to via pre-built circuits until the
+adversary also ends up in the HSDIR position, for deanonymization.
+
+Solution: Path Bias Exit Usage Counter;
+ Counter Galois Onion (CGO) (Forthcoming update to Prop#308).
+Status: Unfixed (Current PathBias detector is error-prone under DDoS)
+Funding: CGO explicitly funded via Sponsor 112
+
+
+1.1.2. End-to-end cell header manipulation
+
+At a glance:
+ Accuracy: FP=0, FN=0
+ Requires: Malicious or compromised Guard, at least one exit
+ Impact: Full deanonymization (path bias, identifier transmission)
+ Path Bias: Full route capture is trivial
+ Reason for prioritization: Severity of Impact; used in the wild
+ Signal is: Modified cell commands.
+ Signal is injected: By either guard or exit/HSDIR
+ Signal is observed: By either guard or exit/HSDIR
+
+The Tor protocol consists of both cell header commands, and relay header
+commands. Cell commands are not encrypted by circuit-level encryption, so they
+are visible and modifiable by every relay in the path. Relay header commands
+are encrypted, and not visible to every hop in the path.
+
+Not all cell commands are forwarded end-to-end. Currently, these are limited
+to RELAY, RELAY_EARLY, and DESTROY. Because of the attack described here,
+great care must be taken when adding new end-to-end cell commands, even if
+they are protected by a MAC.
+
+Previously, a group of researchers at CMU used this property to modify the
+cell command header of cells on circuits, to switch between RELAY_EARLY and
+RELAY at exits and HSDIRs (see [RELAY_EARLY]). This creates a visible bit in
+each cell, that can signal collusion, or with enough cells, can encode an
+identifier such as an IP address. They assisted the FBI in using this attack
+in the wild to deanonymize clients.
+
+We addressed the CMU attack by closing the circuit upon receiving an "inbound"
+(towards the client) RELAY_EARLY command cell, and by limiting the number of
+"outbound" (towards the exit) RELAY_EARLY command cells at relays, and by
+requiring the use of RELAY_EARLY for EXTEND (onionskin) relay commands. This
+defense is not generalized, though. Guards may still use this specific covert
+channel to send around 3-5 bits of information after the extend handshake,
+without killing the circuit. It is possible to use the remaining outbound
+vector to assist in path bias attacks for dropped cells, as a collusion signal
+to reduce the amount of non-compromised traffic that malicious exits must
+carry (see the following Section 1.1.3).
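The defense just described can be sketched as a small state check. (A toy model: the command names are real, but the cap value is illustrative rather than C-Tor's exact constant.)

```python
# Toy model of the RELAY_EARLY defense: reject any inbound RELAY_EARLY,
# and cap outbound RELAY_EARLY cells, so the visible header bit cannot
# encode more than a few bits of identifier before the circuit dies.
MAX_OUTBOUND_RELAY_EARLY = 8  # illustrative cap, not necessarily C-Tor's value

class Circuit:
    """Tracks the per-circuit RELAY_EARLY limits described above."""
    def __init__(self):
        self.outbound_early = 0
        self.closed = False

    def on_cell(self, command: str, inbound: bool) -> None:
        if self.closed or command != "RELAY_EARLY":
            return
        if inbound:
            # Any inbound RELAY_EARLY is a potential covert signal: kill it.
            self.closed = True
        else:
            # Outbound RELAY_EARLY is needed for EXTEND, but is capped.
            self.outbound_early += 1
            if self.outbound_early > MAX_OUTBOUND_RELAY_EARLY:
                self.closed = True

c = Circuit()
c.on_cell("RELAY_EARLY", inbound=True)
assert c.closed
```

The residual outbound budget in this model is what leaves the 3-5 bit covert channel described above.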
+
+If this covert channel is not addressed, it is trivial for colluding Guard
+and Exit relays to close every circuit that does not display it, providing a
+path bias amplification attack and distributed trust reduction, similar to
+cryptographic tagging attacks. Because the inbound direction *is*
+addressed, we believe this kind of path bias is currently not possible with
+this vector by itself (thus also requiring the vector from Section 1.1.3), but
+it could easily become possible if this defense is forgotten, or if a new
+end-to-end cell type is introduced.
+
+While more cumbersome than cryptographic tagging attacks, in practice this
+attack is just as successful, if these cell command types are not restricted
+and limited. It is somewhat surprising that the FBI used this attack before
+cryptographic tagging, but perhaps that was just a lucky coincidence of
+opportunity.
+
+Solution: CGO (Updated version of Prop#308) covers cell commands in the MAC;
+ Any future end-to-end cell commands must still limit usage
+Status: Fix specific to CMU attack; Outbound direction is unfixed
+Funding: Arti and relay-side fixes are explicitly funded via Sponsor 112
+
+
+1.1.3. Dropped cells
+
+At a glance:
+ Accuracy: FP=0, FN=0
+ Requires: Malicious Guard or Netflow data (if high volume), one exit
+ Impact: Full deanonymization (path bias amplification, collusion signal)
+ Path Bias: Full route capture is trivial
+ Reason for prioritization: Severity of Impact; similar attacks used in wild
+ Signal is: Unusual patterns in number of cells received
+ Signal is injected: By exit or HSDIR
+ Signal is observed: at guard or client<->guard connection.
+
+Dropped cells are cells that a relay can inject that end up ignored and
+discarded by a Tor client. These include:
+ - Unparsable cells
+ - Unrecognized cells (ie: wrong source hop, or decrypt failures)
+ - Invalid relay commands
+ - Unsupported (or consensus-disabled) relay commands or extensions
+ - Out-of-context relay commands
+ - Duplicate relay commands
+ - Relay commands that hit any error codepaths
+ - Relay commands for an invalid or already-closed stream ID
+ - Semantically void relay cells (incl relay data len == 0, or PING)
+ - Onion descriptor-appended junk
+
+This attack works by injecting inbound RELAY cells at the exit or at a middle
+relay, and then observing anomalous traffic patterns at the guard or at the
+client->guard connection.
+
+The severity of this covert channel is extreme (zero false positives; zero
+false negatives) when they are injected in cases where the circuit is
+otherwise known to be silent, because of the protocol state machine. These
+cases include:
+ - Immediately following an onionskin response
+ - During other protocol handshakes (onion services, conflux)
+ - Following relay CONNECTED or RESOLVED (not as severe - no path bias)
+
+Because of the stateful and deterministic nature of the Tor protocol,
+especially handshakes, it is easy to accurately recognize these specific cases
+even when observing only encrypted circuit traffic at the Guard relay (see
+[DROPMARK]).
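The reason this signal is so accurate can be sketched as a state-machine check. (The state names and expected-cell counts here are hypothetical simplifications, not Tor's actual state machine.)

```python
# Minimal sketch of why dropped cells before circuit use are so accurate:
# the protocol state machine bounds how many inbound cells are legal in
# each state, so any surplus is a near-perfect covert-channel indicator.
EXPECTED_INBOUND = {
    "AWAITING_CREATED": 1,    # exactly one onionskin response expected
    "AWAITING_CONNECTED": 1,  # exactly one CONNECTED/RESOLVED expected
}

def is_anomalous(state: str, inbound_cells: int) -> bool:
    """True if more inbound cells arrived than the state allows.
    Unknown states are treated permissively in this sketch."""
    return inbound_cells > EXPECTED_INBOUND.get(state, inbound_cells)

assert not is_anomalous("AWAITING_CREATED", 1)
assert is_anomalous("AWAITING_CREATED", 3)  # injected dropped cells
```

Per [DROPMARK], a guard can apply essentially this check to encrypted traffic volumes alone, without decrypting anything.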
+
+Because this covert channel is most accurate before actual circuit use, when
+the circuit is expected to be otherwise silent, it is trivial for a Guard
+relay to close every circuit that does not display this covert channel,
+providing a path bias amplification attack and distributed trust reduction,
+similar to cryptographic tagging attacks and end-to-end cell header
+manipulation. This ability to use the collusion signal to perform path bias
+before circuit use differentiates dropped cells within the Tor Protocol from
+deadweight traffic during application usage (such as javascript requests for
+404 URLs, covered in Section 1.3.2).
+
+By itself, this category is not quite as severe as the previous two, for two
+main reasons. However, due to other factors, these reasons may not matter in
+practice.
+
+First, the Exit can't use this covert channel to close circuits that are not
+deanonymized by a colluding Guard, since there is no covert channel from the
+Guard to the Exit with this vector alone. Thus, unlike cryptographic tagging,
+the adversary's Exits will still carry non-deanonymized traffic from
+non-adversary Guards, and thus the adversary needs more Exit capacity than
+Guard capacity. These kinds of more subtle trade-offs with respect to path
+bias are covered in [DOSSECURITY]. However, note that this issue can be fixed
+by using the previous RELAY_EARLY covert channel from the Guard to the Exit
+(since this direction is unfixed). This allows the adversary to confirm
+receipt of the dropped cell covert channel, allowing both the Guard and the
+Exit to close all non-confirmed circuits, and thus ensure that they only need
+to allocate equal amounts of compromised Guard and Exit traffic, to monitor
+all Tor traffic.
+
+Second, encoding a full unique identifier in this covert channel is
+non-trivial. A significant amount of injected traffic must be sent to exchange
+more than a simple collusion signal, to link circuits when attacking a large
+number of users. In practice, this likely means some amount of correlation,
+and a resulting (but very small) statistical error.
+
+Obviously, the actual practical consequences of these two limitations are
+questionable, so this covert channel is still regarded as "Highly Severe". It
+can still result in full deanonymization of all Tor traffic by an adversary
+with censorship capability, with very little error.
+
+Solution: Forthcoming dropped-cell proposal
+Status: Fixed with vanguards addon; Unfixed otherwise
+Funding: Arti and relay-side fixes are explicitly funded via Sponsor 112
+
+
+1.2. Info Leaks that enable other attacks
+
+These info leaks are less severe than the first group, as they do not yield
+full covert channels, but they do enable other attacks, including guard
+discovery and eventual netflow deanonymization, and website traffic
+fingerprinting.
+
+
+1.2.1. Handshakes with unique traffic patterns
+
+At a glance:
+ Accuracy: FP=near-zero, FN=near-zero
+ Requires: Compromised Guard
+ Impact: Anonymity Set Reduction and Oracle; assists in Guard Discovery
+ Path Bias: Full route capture is difficult (high failure rate)
+ Reason for Prioritization: Increases severity of vectors 1.2.2 and 1.3.3
+ Signal is: Unique cell patterns
+ Signal is injected: by client's behavior
+ Signal is observed: at guard
+
+Certain aspects of Tor's handshakes are very unique and easy to fingerprint,
+based only on observed traffic timing and volume patterns. In particular, the
+onion client and onion service handshake activity is fingerprintable with
+near-zero false negatives and near-zero false positive rates, as per
+[ONIONPRINT]. The conflux link handshake is also unique (and thus accurately
+recognizable), because it is our only 3-way handshake.
+
+This info leak is very accurate. However, the impact is much lower than that
+of covert channels, because by itself, it can only tell if a particular Tor
+protocol, behavior, or feature is in use.
+
+Additionally, Tor's distributed trust properties remain intact, because there
+is no collusion signal built into this info leak. When a path bias attack
+is mounted to close circuits during circuit construction without a
+collusion signal to the Exit, it must proceed hop-by-hop. Guards must close
+circuits that do not extend to colluding middles, and those colluding middles
+must close circuits that don't extend to colluding exits. This means that the
+adversary must control some relays in each position, and has a substantially
+higher circuit failure rate while directing circuits to each of these relays
+in a path.
+
+To put this into perspective, an adversary using a collusion signal with 10%
+of Exits expects to fail 9 circuits before detecting their signal at a
+colluding exit and allowing a circuit to succeed. However, an adversary
+without a collusion signal and 10% of all relays expects to fail 9 circuits
+before getting a circuit to a colluding middle, and then expects 9 of *those*
+circuits to fail before reaching a colluding Exit: roughly 99 circuit
+failures for every successful circuit (90 at the middle stage, plus 9 more
+at the exit stage).
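This is geometric-distribution arithmetic, and can be checked directly. (A back-of-envelope sketch: counting failures at both stages, the exact combined expectation without a collusion signal is (1 - p*p) / (p*p).)

```python
# Expected circuit failures before the first success, for success
# probability p (mean of a geometric distribution, counting failures).
def expected_failures(p: float) -> float:
    return (1 - p) / p

# With a collusion signal and 10% of exits: ~9 failures per success.
assert round(expected_failures(0.10), 6) == 9.0

# Without a signal, middle AND exit must independently collude, so the
# per-circuit success probability is p*p: ~99 failures per success.
assert round(expected_failures(0.10 * 0.10)) == 99
```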
+
+Published attacks have built upon this info leak, though.
+
+In particular, certain error conditions, such as returning a single
+"404"-containing relay cell for an unknown onion service descriptor, are
+uniquely recognizable. This fingerprint was used in the [ONIONFOUND] guard
+discovery attack, and that paper provides a measurement of its uniqueness.
+
+Additionally, onion client fingerprintability can be used to vastly reduce the
+set of website traffic traces that need to be considered for website traffic
+fingerprinting (see Section 1.3.3), making that attack realistic and
+practical. Effectively, it functions as a kind of oracle in this case (see
+Glossary, and [ORACLES]).
+
+Solution: Padding machines at middles for protocol handshakes (as per [PCP]);
+ Pathbias-lite.
+Status: Padding machines deployed for onion clients, but have weaknesses
+ against DF and stateful cross-circuit fingerprints
+Funding: Not explicitly funded
+
+
+1.2.2. Adversary-Induced Circuit Creation
+
+At a glance:
+ Accuracy: FP=high, FN=high
+ Requires: Onion service activity, or malicious exit
+ Impact: Guard discovery
+ Path Bias: Repeated circuits eventually provide the desired path
+ Reason for Prioritization: Enables Guard Discovery
+ Signal is: Inducing a client to make a new Tor circuit
+ Signal is injected: by application layer, client, or malicious relay
+ Signal is observed: At middle
+
+By itself, the ability for an adversary to cause a client to create circuits
+is not a covert channel or arguably even an info leak. Circuit creation, even
+bursts of frequent circuit creation, is commonplace on the Tor network.
+
+However, when this activity is combined with a covert channel from Section
+1.1, with a unique handshake from Section 1.2.1, or with active traffic
+manipulation (Section 1.3.2), then it leads to Guard Discovery, by allowing
+the adversary to recognize when they are chosen for the Middle position, and
+thus learn the Guard. Once Guard Discovery is achieved, netflow analysis of
+the Guard's connections can be used to perform intersection attacks and
+eventually determine the client IP address (see Section 1.3.1).
+
+Large quantities of circuit creation can be induced by:
+ - Many connections to an Onion Service
+ - Causing a client to make connections to many onion service addresses
+ - Application connection to ports in rare exit policies, followed by circuit
+ close at Exit
+ - Repeated Conflux leg failures
+
+In Tor 0.4.7 and later, onion services are protected from this activity via
+Vanguards-Lite (Proposal #333). This system adds a second layer of vanguards
+to onion service circuits, with rotation times set such that it is sufficient
+to protect a user for use cases on the order of weeks, assuming the adversary
+does not get lucky and land in a set. Non-Onion service activity, such as
+Conflux leg failures, is protected by feature-specific rate limits.
+
+Longer lived onion services should use the Vanguards Addon, which implements
+Mesh Vanguards (Prop#292). It uses two layers of vanguards, and supports
+expected use cases on the order of months.
+
+These attack times are probabilistic expectations, and are rough estimates.
+See the proposals for details. To derive these numbers, the proposals assume a
+100% accurate covert channel for detecting that the middle is in the desired
+circuit. If we address the low hanging fruit for such covert channels above,
+these numbers change, and such attacks also become much more easily
+detectable, as they will rely on application layer covert channels (See
+Section 1.3.2), which will resemble an application layer DoS or flood.
+
+Solution: Mesh-vanguards (Prop#292); Vanguards-lite (Prop#333); rate limiting
+ circuit creation attempts; rate limiting the total number of distinct
+ paths used by circuits
+Status: Vanguards-lite deployed in Tor 0.4.7; Mesh-vanguards is vanguards addon;
+ Conflux leg failures are limited per-exit; Exitpolicy scanner exists
+Funding: Not explicitly funded
+
+
+1.2.3. Relay Bandwidth Lying
+
+At a glance:
+ Accuracy: FP=high, FN=high
+ Requires: Running relays in the network
+ Impact: Additional traffic towards malicious relays
+ Path Bias: Bandwidth lying can make up for circuit rejection
+ Reason for prioritization: Assists Covert Channel Path Bias attacks
+ Signal is injected: by manipulating reported descriptor bandwidths
+ Signal is observed: by clients choosing lying relays more often
+ Signal is: the effect of using lying relays more often
+
+Tor clients select relays for circuits in proportion to their fraction of
+consensus "bandwidth" weight. This consensus weight is calculated by
+multiplying the relay's self-reported "observed" descriptor bandwidth value by
+a ratio that is measured by the Tor load balancing system (formerly TorFlow;
+now sbws -- see [SBWS] for an overview).
+
+The load balancing system uses two-hop paths to measure the stream bandwidth
+through all relays on the network. The ratio is computed by determining a
+network-wide average stream bandwidth, 'avg_sbw', and a per-relay average
+stream bandwidth, 'relay_sbw'. Each relay's ratio value is 'relay_sbw/avg_sbw'.
+(There are also additional filtering steps to remove slow outlier streams).
+
+Because consensus weights derive from self-reported descriptor values
+multiplied by this ratio, relays can still inflate their weight by
+egregiously lying in their descriptor value, thus attracting more
+client usage. They can also attempt to fingerprint load balancer activity and
+selectively give it better service, though this is more complicated than
+simply patching Tor to lie.
+
+This attack vector is especially useful when combined with a path bias attack
+from Section 1.1: if an adversary is using one of those covert channels to
+close a large portion of their circuits, they can make up for this loss of
+usage by inflating their corresponding bandwidth value by an equivalent
+amount, thus causing the load balancer to still measure a reasonable ratio for
+them, and thus still provide fast service for the fully deanonymized circuits
+that they do carry.
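The compensation arithmetic follows from the weight formula above. (Toy numbers; sbws's outlier filtering and other steps are omitted, and the halving of measured stream bandwidth is a rough simplification of the effect of closing half of all circuits.)

```python
# consensus_weight = reported descriptor bandwidth * (relay_sbw / avg_sbw),
# per the sbws description above. A relay that path-bias-closes ~half its
# circuits roughly halves its measured stream bandwidth, but doubling its
# reported descriptor value restores its original consensus weight.
def consensus_weight(reported_bw: float, relay_sbw: float, avg_sbw: float) -> float:
    return reported_bw * (relay_sbw / avg_sbw)

honest = consensus_weight(reported_bw=100.0, relay_sbw=2.0, avg_sbw=2.0)

# Closing half the circuits halves the measured stream bandwidth...
attacked = consensus_weight(reported_bw=100.0, relay_sbw=1.0, avg_sbw=2.0)
assert attacked == honest / 2

# ...but lying twice as hard in the descriptor makes up the difference.
lying = consensus_weight(reported_bw=200.0, relay_sbw=1.0, avg_sbw=2.0)
assert lying == honest
```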
+
+There are many research papers written on alternate approaches to the
+measurement problem. These have not been deployed for three reasons:
+ 1. The unwieldy complexity and fragility of the C-Tor codebase
+ 2. The conflation of measurement with load balancing (we need both)
+ 3. Difficulty performing measurement of the fastest relays with
+ non-detectable/distributed mechanisms
+
+In the medium term, we will work on detecting bandwidth lying and manipulation
+via scanners. In the long term, Arti-relay will allow the implementation of
+distributed and/or dedicated measurement components, such as [FLASHFLOW].
+(Note that FlashFlow still needs [SBWS] or another mechanism to handle load
+balancing, though, since FlashFlow only provides measurement).
+
+Solutions: Scan for lying relays; implement research measurement solutions
+Status: A sketch of the lying relay scanner design is in [LYING_SCANNER]
+Funding: Scanning for lying relays is funded via Sponsor 112
+
+
+1.2.4. Metrics Leakage
+
+At a glance:
+ Accuracy: FP=low, FN=high
+ Requires: Some mechanism to bias or inflate reported relay metrics
+ Impact: Guard discovery
+ Path Bias: Potentially relevant, depending on type of leak
+ Reason for prioritization: Historically severe issues
+ Signal is injected: by interacting with onion service
+ Signal is observed: by reading router descriptors
+ Signal is: information about volume of traffic and number of IP addresses
+
+In the past, we have had issues with info leaks in our metrics reporting (see
+[METRICSLEAK]). We addressed them by lowering the resolution of read/write
+history, and ensuring certain error conditions could not willfully introduce
+noticeable asymmetries. However, certain characteristics, like reporting local
+onion or SOCKS activity in relay bandwidth counts, still remain.
+
+Additionally, during extremely large flooding or DDoS attempts, it may still
+be possible to see the corresponding increases in reported metrics for Guards
+in use by an onion service, and thus discover its Guards.
+
+Solutions: Fix client traffic reporting; remove injectable asymmetries;
+ reduce metrics resolution; add noise
+Status: Metrics resolution reduced to 24hr; known asymmetries fixed
+Funding: Not funded
+
+
+1.2.5. Protocol Oracles
+
+At a glance:
+ Accuracy: FP=medium, FN=0 (for unpopular sites: FP=0, FN=0)
+ Requires: Probing relay DNS cache
+ Impact: Assists Website Traffic Fingerprinting; Domain Usage Analytics
+ Path Bias: Not Possible
+ Reason for prioritization: Historically accurate oracles
+ Signal is injected: by client causing DNS caching at exit
+ Signal is observed: by probing DNS responses wrt cell ordering via all exits
+ Signal is: If cached, response is immediate; otherwise other cells come first
+
+Protocol oracles, such as exit DNS cache timing to determine if a domain has
+been recently visited, increase the severity of Website Traffic Fingerprinting
+in Section 1.3.3, by reducing false positives, especially for unpopular
+websites.
+
+There are additional forms of oracles for Website Traffic Fingerprinting, but
+the remainder are not protocol oracles in Tor. See [ORACLES] in the
+references.
+
+Tor deployed a defense for this oracle in the [DNSORACLE] tickets, to
+randomize expiry time. This helps reduce the precision of this oracle for
+popular and moderately popular domains/websites in the network, but does not
+fully eliminate it for unpopular domains/websites.
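The randomized-expiry idea can be sketched as follows. (The parameters and uniform sampling here are illustrative assumptions, and do not mirror the exact deployed [DNSORACLE] behavior.)

```python
# Sketch of randomized cache expiry: instead of caching an entry for a
# fixed TTL, the exit samples the lifetime from a range, so a probing
# adversary cannot infer the insertion time from the eviction time.
import random

def randomized_expiry(now: float, base_ttl: float, rng: random.Random) -> float:
    # Sample a lifetime uniformly from [0, base_ttl] rather than using
    # base_ttl exactly; observing when the entry expires then reveals
    # little about when the client's lookup actually happened.
    return now + rng.uniform(0, base_ttl)

rng = random.Random(7)
expiries = [randomized_expiry(0.0, 3600.0, rng) for _ in range(100)]
assert all(0.0 <= e <= 3600.0 for e in expiries)
assert len(set(expiries)) > 1  # expiry is no longer deterministic
```

As the text notes, this blunts the oracle for popular domains but cannot eliminate it for unpopular ones, where any cache hit at all is informative.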
+
+The paper in [DNSORACLE] specifies a further defense in its Section 6.2,
+using a pre-load of popular names and circuit cache isolation, with third
+party resolvers. The purpose of the pre-load list is to preserve the cache
+hits for shared domains across circuits (~11-17% of cache hits, according to
+the paper). The purpose of circuit isolation is to avoid Tor cache hits for
+unpopular domains across circuits. The purpose of third party resolvers is to
+ensure that the local resolver's cache does not become measurable, when
+isolating non-preloaded domains to be per-circuit.
+
+Unfortunately, third party resolvers are unlikely to be recommended for use by
+Tor, since cache misses of unpopular domains would hit them, and be subject to
+sale in DNS analytics data at high resolution (see [NETFLOW_TICKET]).
+
+Also note that the cache probe attack can only be used by one adversary at a
+time (otherwise adversaries generate false positives for one another by
+actually *causing* caching, or must monitor one another to avoid this).
+This is in stark contrast to third party resolvers, where this information is
+sold and available to multiple adversaries concurrently, for all uncached
+domains, with high resolution timing, without the need for careful
+coordination by adversaries.
+
+However, note that an arti-relay implementation would no longer be single
+threaded, and would be able to reprioritize asynchronous cache activity
+arbitrarily, especially for sensitive uncached activity to a local resolver.
+This might be useful for reducing the accuracy of the side channel, in this
+case.
+
+Unfortunately, we lack sufficient clarity to determine if it is meaningful to
+implement any further defense that does not involve third party resolvers,
+under either current C-Tor or future arti-relay circumstances.
+
+Solutions: Isolate cache per circuit; provide a shared pre-warmed cache of
+ popular domains; smarter cache handling mechanisms?
+Status: Randomized expiry only - not fully eliminated
+Funding: Any further fixes are covered by Sponsor 112
+
+
+1.3. Info Leaks of Research Concern
+
+In this section, we list info leaks that either need further research, or are
+undergoing active research.
+
+Some of these are still severe, but typically less so than the already covered
+ones, unless they are part of a combined attack, such as with an Oracle,
+or with Guard Discovery.
+
+Some of these may be more or less severe than currently suspected: If we
+knew for certain, they wouldn't need research.
+
+
+1.3.1. Netflow Activity
+
+At a glance:
+ Accuracy: FP=high; FN=0 (FN=medium with incomplete vantage point set)
+ Requires: Access to netflow data market, or ISP coercion
+ Impact: Anonymity Set Reduction; Deanonymization with Guard Discovery/Oracle
+ Path Bias: Not possible
+ Reason for Prioritization: Low impact without Guard Discovery/Oracle
+ Signal is: created by using the network
+ Signal is observed: at the ISP of anything using the network.
+ Signal is: Connection tuple times and byte counts
+
+Netflow is a feature of internet routers that records connection tuples, as
+well as time stamps and byte counts, for analysis.
+
+This data is bought and sold by both governments and threat intelligence
+companies, as documented in [NETFLOW_TICKET].
+
+Tor has a padding mechanism to reduce the resolution of this data (see Section
+2 of [PADDING_SPEC]), but this hinges on clients' ability to keep connections
+open and padded for 45-60 minutes, even when idle. This padding reduces the
+resolution of intersection attacks, making them operate on 30 minute time
+windows, rather than 15 second time windows. This increases the false positive
+rate, and thus increases the duration of such intersection attacks.
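+
+As a toy model of why window size matters (numbers and probabilities here are
+illustrative assumptions, not measurements): each observation window lets the
+adversary eliminate candidate users whose activity fails to match the
+target's, and unrelated users match by chance more often when windows are
+coarse.

```python
import math

def rounds_to_isolate(n_users, p_match):
    # Expected survivors after k windows is n_users * p_match**k, where
    # p_match is the chance an unrelated user's (padded) activity matches
    # the target's in a given window.  Return the smallest k that drives
    # the expected survivor count below one candidate.
    return math.ceil(math.log(n_users) / math.log(1 / p_match))

# Coarse 30-minute windows: unrelated users often overlap by chance.
many_rounds = rounds_to_isolate(100_000, 0.5)
# Fine 15-second windows: chance overlap is rare.
few_rounds = rounds_to_isolate(100_000, 0.01)
```

+Under this crude model, padding that widens the effective window (raising
+p_match) multiplies the number of windows, and hence the wall-clock time,
+that the intersection attack needs.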
+
+Large scale Netflow data can also be used to track Tor users as they migrate
+from location to location, without necessarily deanonymizing them. Because Tor
+uses three directory guards and has ~4000 Guard relays, there are
+Choose(4000,3) = ~10 billion possible directory guard sets, though the
+probability weighting of Guard selection reduces this considerably in
+practice. Lowering the total number of Guard relays (via arti-relay and
+using only the fastest Guards), and using just two directory guards instead
+of three, would shrink this space enough that false positives become more
+common. More thorough solutions are discussed in [GUARDSETS].
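+
+The combinatorics above can be checked directly (the ~4000-relay figure is
+the proposal's round number; the smaller pool below is hypothetical):

```python
from math import comb

# ~4000 Guard relays, 3 directory guards per client:
assert comb(4000, 3) == 10_658_668_000  # ~10.7 billion possible sets

# A hypothetical pool of 2000 guards with only 2 directory guards
# shrinks the space by nearly four orders of magnitude, so distinct
# users are far more likely to share a guard set (i.e., the location
# tracker sees more false positives):
assert comb(2000, 2) == 1_999_000
```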
+
+Location tracking aside, this data by itself (especially when padded) is not
+a threat to client anonymity. However, it can be used in combination with a
+number of Oracles or confirmation vectors, such as:
+ - Guard Discovery
+ - Flooding an onion service with huge amounts of traffic in a pattern
+ - Advertising analytics or account activity log purchase
+ - TCP RST injection
+ - TLS conn rotation
+
+These oracles can be used to either confirm the connection of an onion
+service, or to deanonymize it after Guard Discovery.
+
+In the case of clients, the use of Oracle data can enable intersection attacks
+to deanonymize them. The oracle data necessary for client intersection
+attacks is also bought and sold, as documented in [NETFLOW_TICKET]. It is
+unknown how long such attacks take, but it is a function of the number of
+users under consideration, and their connection durations.
+
+The research interest here is in determining what can be done to increase the
+amount of time these attacks take, in terms of increasing connection duration,
+increasing the number of users, reducing the total number of Guard relays,
+using a UDP transport, or changing user behavior.
+
+Solutions: Netflow padding; connection duration increase; QUIC transport;
+ using bridges; decreasing total number of guards; using only two
+ directory guards; guardsets; limiting sensitive account usage
+Status: Netflow padding deployed in C-Tor and arti
+Funding: Not explicitly funded
+
+
+1.3.2. Active Traffic Manipulation Covert Channels
+
+At a Glance:
+ Accuracy: FP=medium, FN=low
+ Requires: Netflow data, or compromised/monitored Guard
+ Impact: Anonymity Set Reduction; Netflow-assisted deanonymization
+ Path Bias: Possible via exit policy or onion service reconnection
+ Reason for Prioritization: Can assist other attacks; lower severity otherwise
+ Signal is injected: by the target application, manipulated by the other end.
+ Signal is observed: at Guard, or target->guard connection.
+ Signal is: Unusual traffic volume or timing.
+
+This category of covert channel occurs after a client has begun using a
+circuit, by manipulating application data traffic. This manipulation can
+occur either at the application layer, or at the Tor protocol layer.
+
+Because it occurs after the circuit is in use, it does not permit the use of
+path bias or trust reduction properties by itself (unless combined with one of
+the above info leak attack vectors -- most often Adversary-Induced Circuit
+Creation).
+
+These covert channels also have a significantly higher false positive rate
+than those usable before circuit use, since application traffic is ad-hoc and
+arbitrary, and that same traffic is present during the attempted
+manipulation.
+
+For onion services, this covert channel is much more severe: Onion services may
+be flooded with application data in large-volume patterns over long periods of
+time, which can be seen in netflow logs.
+
+For clients, this covert channel typically is only effective after the
+adversary suspects an individual, for confirmation of their suspicion, or
+after Guard Discovery.
+
+Examples of this class of covert channel include:
+ - Application-layer manipulation (AJAX)
+ - Traffic delays (rainbow, swirl - see [BACKLIT])
+ - Onion Service flooding via HTTP POST
+ - Flooding Tor relays to notice traffic changes in onion service throughput
+ - Conflux leg switching patterns
+ - Traffic inflation (1 byte data cells)
+
+Solution: Protocol checks; Padding machines at middles for specific kinds of
+ traffic; limits on inbound onion service traffic; Backlit
+Status: Protocol checks performed for conflux; vanguards addon closes
+ high-volume circuits
+Funding: Not explicitly funded
+
+
+1.3.3. Passive Application-Layer Traffic Patterns
+
+At a Glance:
+ Accuracy: FP=medium, FN=Medium
+ Requires: Compromised Guard (external monitoring increases FP+FN rate)
+ Impact: Links client and destination activity (i.e. deanonymization with logs)
+ Path Bias: Not Possible
+ Reason for prioritization: Large FP rate without oracle, debated practicality
+ Signal is: not injected; passively extracted
+ Signal is observed: at Guard, or entire network
+ Signal is: timing and volume patterns of traffic.
+
+This category of information leak occurs after a client has begun using a
+circuit, by analyzing application data traffic.
+
+Examples of this class of information leak include:
+ - Website traffic fingerprinting
+ - End-to-end correlation
+
+The canonical application of this information leak is in end-to-end
+correlation, where application traffic entering the Tor network is correlated
+to traffic exiting the Tor network (see [DEEPCOFFEA]). This attack vector
+requires a global view of all Tor traffic, or false negatives skyrocket.
+However, this information leak is also possible to exploit at a single
+observation point, using machine learning classifiers (see [ROBFINGERPRINT]),
+typically either the Guard or bridge relay, or the path between the
+Guard/bridge and the client.
+
+In both cases, this information leak has a significant false positive rate,
+since application traffic is ad-hoc, arbitrary, and self-similar. Because
+multiple circuits are multiplexed on one TLS connection, the false positive
+and false negative rates are higher still at this observation location, as
+opposed to on a specific circuit.
+
+In both cases, the majority of the information gained by classifiers is in the
+beginning of the trace (see [FRONT] and [DEEPCOFFEA]).
+
+This information leak gets more severe when it is combined with another oracle
+(as per [ORACLES]) that can confirm the statistically derived activity, or
+narrow the scope of material to analyze. Example oracles include:
+ - DNS cache timing
+ - Onion service handshake fingerprinting
+ - Restricting targeting to either specific users, or specific websites
+ - Advertising analytics or account activity log purchase (see
+ [NETFLOW_TICKET])
+
+Website traffic fingerprinting literature is divided into two classes of
+attack study: Open World and Closed World. Closed World is when the adversary
+uses an Oracle to restrict the set of possible websites to classify traffic
+against. Open World is when the adversary attempts to recognize a specific
+website or set of websites out of all possible other traffic.
+
+The nature of the protocol usage by the application can make this attack
+easier or harder, which has resulted in application layer defenses, such as
+[ALPACA]. Additionally, the original Google QUIC was easier to fingerprint
+than HTTP (See [QUICPRINT1]), but IETF HTTP3 reversed this (See [QUICPRINT2]).
+Javascript usage makes these attacks easier (see [INTERSPACE], Table 3),
+whereas concurrent activity (in the case of TLS observation) makes them harder.
+Web3 protocols that exchange blocks of data instead of performing AJAX
+requests are likely to be much harder to fingerprint, so long as the web3
+application is accessed via its native protocol, and not via a website
+front-end.
+
+The entire research literature for this vector is fraught with analysis
+problems, unfortunately. Because smaller web crawl sizes make the attacks more
+effective, and because attack papers are easier to produce than defenses
+generally, dismal results are commonplace. [WFNETSIM] and [WFLIVE] examine
+some of these effects. It is common for additional hidden gifts to adversaries
+to creep in, leading to contradictory results, even in otherwise comprehensive
+papers at top-tier venues. The entire vein of literature must be read with a
+skeptical eye, a fine-tooth comb, and a large dumpster nearby.
+
+As one recent example, in an otherwise comprehensive evaluation of modern
+defenses, [DEFCRITIC] found a contrary result with respect to the Javascript
+finding in the [INTERSPACE] paper, by training and testing their classifiers
+with knowledge of the Javascript state of the browser (thus giving them a free
+oracle). In truth, neither [DEFCRITIC] nor [INTERSPACE] properly examined the
+effects of Javascript -- a rigorous test would train and test on a mix of
+Javascript and non-Javascript traffic, and then compare the classification
+accuracy of each set separately, after joint classification. Instead,
+[DEFCRITIC] just reported that disabling Javascript (via the security level of
+Tor Browser) has "no beneficial effect", which they showed by actually letting
+the adversary know which traces had Javascript disabled.
+
+Such hidden gifts to adversaries are commonplace, especially in attack papers.
+While it may be useful to do this while comparing defenses against each other,
+when these assumptions are hidden, and when defenses are not re-tunable for
+more realistic conditions, this leads to focus on burdensome defenses with
+large amounts of delay or huge amounts of overhead, at the expense of ignoring
+lighter approaches that actually improve the situation in practice.
+
+This of course means that nothing gets done at all, because Tor is neither
+going to add arbitrary cell delay at relays (because of queue memory required
+for this and the impacts on congestion control), nor add 400% overhead to
+both directions of traffic.
+
+In terms of defense deployment, it makes the most sense to place these padding
+machines at the Guards to start, for many reasons. This is in contrast to
+other lighter padding machines for earlier vectors, where it makes more sense
+to place them at the middle relay. In this case, the heavier padding machines
+necessary for this vector can take advantage of higher multiplexing, which
+means less overhead. They can also use the congestion signal at the TLS
+connection, to more easily avoid unnecessary padding when the TLS connection
+is blocked, thus only using "slack" Guard capacity. Conflux can also be
+tuned to provide at least some benefit here: even if it provides little
+benefit in lab conditions, in the realistic scenarios studied by [WFNETSIM]
+and [WFLIVE] the benefit may be considerable, unless the adversary holds both
+guards, which is more difficult for an internal adversary. Additionally, the
+distinction between external and internal adversaries is rarely, if ever,
+evaluated in the literature, so there is little guidance on this distinction
+as a whole right now.
+
+Solution: Application layer solutions ([ALPACA], disabling Javascript, web3
+          apps); Padding machines at guards for application traffic; conflux
+          tuning
+Status: Unfixed
+Funding: Padding machine and simulator port to arti are funded via Sponsor 112
+
+
+1.3.4. Protocol or Application Linkability
+
+At a Glance:
+ Accuracy: FP=0, FN=0
+ Requires: Compromised Exit; Traffic Observation; Hostile Website
+ Impact: Anonymity Set Reduction
+ Path Bias: Not Possible
+ Reason for prioritization: Low impact with faster releases
+ Signal is: not injected; passively extracted
+ Signal is observed: at Exit, or at application destination
+ Signal is: Rare protocol usage or behavior
+
+Historically, due to Tor's slow upgrade cycles, we have had concerns about
+deploying new features that may fragment the anonymity set of early adopters.
+
+Since we have moved to a more rapid release cycle for both clients and relays
+by abandoning the Tor LTS series, these concerns are much less severe.
+However, they can still present concerns during the upgrade cycle. For
+Conflux, for example, during the alpha series, the fact that few exits
+supported conflux caused us to limit the number of pre-built conflux sets to
+just one, to avoid concentrating alpha users at just a few exits. It is not
+clear that this was actually a serious anonymity concern, but it was certainly
+a concern with respect to concentrating the full activity of all these users
+at just a few locations, for load balancing reasons alone.
+
+Similar concerns exist for users of alternate implementations, both of Tor,
+and of applications like the browser. We regard this as a potential research
+concern, but it is likely not a severe one. For example, assuming Tor Browser
+and Brave both address browser fingerprinting, how bad is it for anonymity
+that they address it differently? Even if they ensure that all their users
+have the same or similar browser fingerprints, it will still be possible for
+websites, analytics datasets, and possibly even Exit relays or Exit-side
+network observers, to differentiate the use of one browser versus the other.
+Does this actually harm their anonymity in a real way, or must other oracles
+be involved? Are these oracles easy to obtain?
+
+Similarly, letting users choose their exit country is in this category. In
+some circumstances, this choice has serious anonymity implications: if the
+choice is a permanent, global one, and the user chooses an unpopular country
+with few exits, all of their activity will be much more linkable. However, if
+the country is popular, and/or if the choice is isolated per-tab or per-app,
+is this still significant such that it actually enables any real attacks? It
+seems like not so much.
+
+Solutions: Faster upgrade cycle; Avoiding concentrated use of new features
+Status: Tor LTS series is no longer supported
+Funding: Not explicitly funded
+
+
+1.3.5. Latency Measurement
+
+At a glance:
+ Accuracy: FP=high, FN=high
+ Requires: Onion service, or malicious Exit
+ Impact: Anonymity Set Reduction/Rough geolocation of services
+ Path Bias: Possible exacerbating factor
+ Reason for Prioritization: Low impact; multiple observations required
+ Signal is: created naturally by anything that has a "reply" mechanic
+ Signal is observed: at either end.
+ Signal is: delays between a message sent and a message received in reply.
+
+Latency's effect on anonymity sets has been studied in the [LATENCY_LEAK]
+papers.
+
+It may be possible to get a rough idea of the geolocation of an onion service
+by measuring the latency over many different circuits. This seems more
+realistic if the Guard or Guards are known, so that their contribution to
+latency statistics can be factored in, over a great many connections to an
+onion service. For normal client activity, route selection, and the fact that
+the Exit does not know the specific accounts or cookies in use, likely provide
+enough protection.
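+
+A crude sketch of the measurement idea (a toy model of our own construction;
+real attacks per [LATENCY_LEAK] require careful statistics over many
+samples):

```python
def service_latency_floor(circuit_rtts_ms, guard_rtt_ms):
    # Over many circuits to the same onion service, the minimum observed
    # round-trip time approaches the propagation floor; once the Guard is
    # known, subtracting its contribution bounds how far the service can
    # be from that Guard.  Illustrative only.
    floor = min(circuit_rtts_ms)
    return max(floor - guard_rtt_ms, 0.0)

# e.g. RTT samples over several circuits, with a known ~40ms guard leg:
estimate = service_latency_floor([120.0, 95.0, 210.0, 99.0], 40.0)
```

+A client-side delay, as suggested below, works against exactly this kind of
+minimum-filtering by raising the observable floor.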
+
+If this turns out to be severe, it seems the best option is to add a delay on
+the client side to attempt to mask the overall latency. This kind of approach
+is only likely to make sense for onion services. Other path selection
+alterations may help, though.
+
+Solutions: Guards, vanguards, alternative path selection, client-side delay
+Status: Guards and vanguards-lite are used in Tor since 0.4.7
+Funding: Not explicitly funded
+
+
+2. Attack Examples
+
+To demonstrate how info leaks combine, here we provide some historical
+real-world attacks that have used these info leaks to deanonymize Tor
+users.
+
+
+2.1. CMU Tagging Attack
+
+Perhaps the most famous historical attack was when a group at CMU assisted the
+FBI in performing dragnet deanonymization of Tor users, through their
+[RELAY_EARLY] attack on the live network. This attack could only work on users
+who happened to use their Guards, but those users could be fully deanonymized.
+
+The attack itself operated on connections to monitored HSDIRs: it encoded the
+address of the onion service in the cell command header, via the RELAY_EARLY
+bitflipping technique from Section 1.1.2. Their Guards then recorded this
+address, along with the IP address of the user, providing a log of onion
+services that each IP address visited.
+
+It is not clear if the CMU group even properly utilized the full path bias
+attack power here to deanonymize as many Tor users as possible, or if their
+logs were simply of interest to the FBI because of what they happened to
+capture. It seems like the latter is the case.
+
+A similar, motivated adversary could use any of the covert channels in Section
+1.1, in combination with Path Bias to close non-deanonymized circuits, to
+fully deanonymize all exit traffic carried by their Guard relays. There are
+path bias detectors in Tor to detect large amounts of circuit failure, but
+when the network (or the Guard) is also under heavy circuit load, they can
+become unreliable, and have their own false positives.
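+
+The shape of such a detector can be sketched as follows (the function name
+and threshold values are illustrative assumptions; C-Tor's actual thresholds
+come from consensus parameters):

```python
def path_bias_check(attempts, successes,
                    min_attempts=150, warn_rate=0.70, extreme_rate=0.30):
    # Flag a Guard through which an unusually low fraction of circuits
    # complete, which is what a path-bias attacker causes by destroying
    # insufficiently compromised circuits.  Returns "ok", "warn", or
    # "extreme".
    if attempts < min_attempts:
        return "ok"  # not enough data to judge
    rate = successes / attempts
    if rate < extreme_rate:
        return "extreme"
    if rate < warn_rate:
        return "warn"
    return "ok"
```

+The unreliability noted above corresponds to the fact that benign overload
+also drives the success rate down, producing false positives in this check.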
+
+While this attack vector requires the Guard relay, it is of interest to any
+adversary that would like to perform dragnet deanonymization of a wide range
+of Tor users, or to compel a Guard to deanonymize certain Tor users. It is
+also of interest to adversaries with censorship capability, who would like
+to monitor all Tor usage of users, rather than block them. Such an adversary
+would use their censorship capability to direct Tor users to only their own
+malicious Guards or Bridges.
+
+
+2.2. Guard Discovery Attacks with Netflow Deanonymization
+
+Prior to the introduction of Vanguards-lite in Tor 0.4.7, it was possible to
+combine "1.2.2. Adversary-Induced Circuit Creation", with a circuit-based
+covert channel (1.1.3, 1.2.1, or 1.3.2), to obtain a middle relay confirmed
+to be next to the user's Guard.
+
+Once the Guard is obtained, netflow connection times can be used to find the
+user of interest.
+
+There was at least one instance of this being used against a user of Ricochet,
+who was fully deanonymized. The user was using neither vanguards-lite, nor the
+vanguards addon, so this attack was trivial. It is unclear which covert
+channel type was used for Guard Discovery. The netflow attack proceeded
+quickly, because the attacker was able to determine when the user was on and
+offline via their onion service descriptor being available, and the number
+of users at the discovered Guard was relatively small.
+
+
+2.3. Netflow Anonymity Set Reduction
+
+Netflow records have been used, to varying degrees of success, to attempt to
+identify users who have posted violent threats in an area.
+
+In most cases, this has simply ended up hassling unrelated Tor users, without
+finding the posting user. However, in at least one case, the user was found.
+
+Netflow records were also reportedly used to build suspicion of a datacenter
+in Germany which was emitting large amounts of Tor traffic, to eventually
+identify it as a Tor hosting service providing service to drug markets, after
+further investigation. It is not clear if a flooding attack was also used in
+this case.
+
+
+2.4. Application Layer Confirmation
+
+The first (and only) known case of fine-grained traffic analysis of Tor
+involved an application layer confirmation attack, using the vector from
+1.3.2.
+
+In this case, a particular person was suspected as being involved in a group
+under investigation, due to the presence of an informant in that group. The
+FBI then monitored the suspect's WiFi, and sent a series of XMPP ping messages
+to the account in question. Despite the use of Tor, enough pings were sent
+that the traffic timing on the monitored WiFi showed overlap with the timing
+of the sent XMPP pings and responses. This was prior to Tor's introduction of
+netflow padding (which generates similar back-and-forth traffic every 4-9
+seconds between the client and the Guard).
+
+It should be noted that such attacks are still prone to error, especially for
+heavy Tor users whose other traffic would always cause such overlap, as
+opposed to those who use Tor for only one purpose, and very lightly or
+infrequently.
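+
+The confirmation logic above can be sketched as a simple overlap count (a toy
+model of our own construction, not the method actually used):

```python
import bisect

def overlap_fraction(probe_times, observed_times, tolerance=2.0):
    # Fraction of injected probes (e.g. XMPP pings) for which some packet
    # was seen on the monitored link within `tolerance` seconds.  For a
    # light Tor user a high fraction is suspicious; for a heavy user,
    # unrelated traffic inflates it, which is the error noted above.
    observed = sorted(observed_times)
    hits = 0
    for t in probe_times:
        i = bisect.bisect_left(observed, t - tolerance)
        if i < len(observed) and observed[i] <= t + tolerance:
            hits += 1
    return hits / len(probe_times) if probe_times else 0.0
```

+Constant background traffic, such as the netflow padding mentioned above,
+pushes this fraction toward 1.0 for everyone, destroying its discriminating
+power.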
+
+
+3. Glossary
+
+Covert Channel:
+ A kind of information leak that allows an adversary to send information
+ to another point in the network.
+
+Collusion Signal:
+ A Covert Channel that only reliably conveys 1 bit: if an adversary is
+ present. Such covert channels are weaker than those that enable full
+ identifier transmission, and also typically require correlation.
+
+Confirmation Signal:
+ Similar to a collusion signal, a confirmation signal is sent over a
+ weak or noisy channel, and can only confirm that an already suspected
+ entity is the target of the signal.
+
+False Negative:
+ A false negative is when the adversary fails to spot the presence of
+ an info leak vector, in instances where it is actually present.
+
+False Positive:
+ A false positive is when the adversary attempts to use an info leak vector,
+ but some unrelated traffic pattern or behavior elsewhere matches the
+ pattern of their info leak vector, producing an incorrect match.
+
+Guard Discovery:
+ The ability of an adversary to determine the Guard in use by a service or
+ client.
+
+Identifier Transmission:
+ The ability of a covert channel to reliably encode a unique identifier,
+ such as an IP address, without error.
+
+Oracle:
+ An additional mechanism used to confirm an observed info leak vector
+ that has a high rate of False Positives. Can take the form of DNS
+ cache, server logs, analytics data, and other factors. (See [ORACLES]).
+
+Path Bias (aka Route Manipulation, or Route Capture):
+ The ability of an adversary to direct circuits towards their other
+ compromised relays, by destroying circuits and/or TLS connections
+ whose paths are not sufficiently compromised.
+
+
+Acknowledgments:
+
+This document has benefited from review and suggestions by David Goulet, Nick
+Hopper, Rob Jansen, Nick Mathewson, Tobias Pulls, and Florentin Rochet.
+
+
+References:
+
+[ALPACA]
+ https://petsymposium.org/2017/papers/issue2/paper54-2017-2-source.pdf
+
+[BACKLIT]
+ https://www.freehaven.net/anonbib/cache/acsac11-backlit.pdf
+
+[DEEPCOFFEA]
+ https://www-users.cse.umn.edu/~hoppernj/deepcoffea.pdf
+
+[DEFCRITIC]
+ https://www-users.cse.umn.edu/~hoppernj/sok_wf_def_sp23.pdf
+
+[DNSORACLE]
+ https://www.usenix.org/system/files/usenixsecurity23-dahlberg.pdf
+ https://gitlab.torproject.org/rgdd/ttapd/-/tree/main/artifact/safety-board
+ https://gitlab.torproject.org/tpo/core/tor/-/issues/40674
+ https://gitlab.torproject.org/tpo/core/tor/-/issues/40539
+ https://gitlab.torproject.org/tpo/core/tor/-/issues/32678
+
+[DOSSECURITY]
+ https://www.princeton.edu/~pmittal/publications/dos-ccs07.pdf
+
+[DROPMARK]
+ https://petsymposium.org/2018/files/papers/issue2/popets-2018-0011.pdf
+
+[FLASHFLOW]
+ https://gitweb.torproject.org/torspec.git/tree/proposals/316-flashflow.md
+
+[FRONT]
+ https://www.usenix.org/system/files/sec20summer_gong_prepub.pdf
+
+[GUARDSETS]
+ https://www.freehaven.net/anonbib/cache/guardsets-pets2015.pdf
+ https://www.freehaven.net/anonbib/cache/guardsets-pets2018.pdf
+
+[INTERSPACE]
+ https://arxiv.org/pdf/2011.13471.pdf (Table 3)
+
+[LATENCY_LEAK]
+ https://www.freehaven.net/anonbib/cache/ccs07-latency-leak.pdf
+ https://www.robgjansen.com/publications/howlow-pets2013.pdf
+
+[LYING_SCANNER]
+ https://gitlab.torproject.org/tpo/network-health/team/-/issues/313
+
+[METRICSLEAK]
+ https://gitlab.torproject.org/tpo/core/tor/-/issues/23512
+
+[NETFLOW_TICKET]
+ https://gitlab.torproject.org/tpo/network-health/team/-/issues/42
+
+[ONECELL]
+ https://www.blackhat.com/presentations/bh-dc-09/Fu/BlackHat-DC-09-Fu-Break-Tors-Anonymity.pdf
+
+[ONIONPRINT]
+ https://www.freehaven.net/anonbib/cache/circuit-fingerprinting2015.pdf
+
+[ONIONFOUND]
+ https://www.researchgate.net/publication/356421302_From_Onion_Not_Found_to_Guard_Discovery/fulltext/619be24907be5f31b7ac194a/From-Onion-Not-Found-to-Guard-Discovery.pdf?origin=publication_detail
+
+[ORACLES]
+ https://petsymposium.org/popets/2020/popets-2020-0013.pdf
+
+[PADDING_SPEC]
+ https://gitlab.torproject.org/tpo/core/torspec/-/blob/main/padding-spec.txt#L68
+
+[PCP]
+ https://arxiv.org/abs/2103.03831
+
+[QUICPRINT1]
+ https://arxiv.org/abs/2101.11871 (see also: https://news.ycombinator.com/item?id=25969886)
+
+[QUICPRINT2]
+ https://netsec.ethz.ch/publications/papers/smith2021website.pdf
+
+[RACCOON23]
+ https://archives.seul.org/or/dev/Mar-2012/msg00019.html
+
+[RELAY_EARLY]
+ https://blog.torproject.org/tor-security-advisory-relay-early-traffic-confirmation-attack/
+
+[ROBFINGERPRINT]
+ https://www.usenix.org/conference/usenixsecurity23/presentation/shen-meng
+
+[SBWS]
+ https://tpo.pages.torproject.net/network-health/sbws/how_works.html
+
+[WFLIVE]
+ https://www.usenix.org/system/files/sec22-cherubin.pdf
+
+[WFNETSIM]
+ https://petsymposium.org/2023/files/papers/issue4/popets-2023-0125.pdf
+```