diff options
Diffstat (limited to 'doc/spec/proposals/115-two-hop-paths.txt')
-rw-r--r-- | doc/spec/proposals/115-two-hop-paths.txt | 387 |
1 files changed, 0 insertions, 387 deletions
diff --git a/doc/spec/proposals/115-two-hop-paths.txt b/doc/spec/proposals/115-two-hop-paths.txt deleted file mode 100644 index ee10d949c4..0000000000 --- a/doc/spec/proposals/115-two-hop-paths.txt +++ /dev/null @@ -1,387 +0,0 @@ -Filename: 115-two-hop-paths.txt -Title: Two Hop Paths -Version: $Revision$ -Last-Modified: $Date$ -Author: Mike Perry -Created: -Status: Dead -Supersedes: 112 - - -Overview: - - The idea is that users should be able to choose if they would like - to have either two or three hop paths through the tor network. - - Let us be clear: the users who would choose this option should be - those that are concerned with IP obfuscation only: ie they would not be - targets of a resource-intensive multi-node attack. It is sometimes said - that these users should find some other network to use other than Tor. - This is a foolish suggestion: more users improves security of everyone, - and the current small userbase size is a critical hindrance to - anonymity, as is discussed below and in [1]. - - This value should be modifiable from the controller, and should be - available from Vidalia. - - -Motivation: - - The Tor network is slow and overloaded. Increasingly often I hear - stories about friends and friends of friends who are behind firewalls, - annoying censorware, or under surveillance that interferes with their - productivity and Internet usage, or chills their speech. These people - know about Tor, but they choose to put up with the censorship because - Tor is too slow to be usable for them. In fact, to download a fresh, - complete copy of levine-timing.pdf for the Theoretical Argument - section of this proposal over Tor took me 3 tries. - - Furthermore, the biggest current problem with Tor's anonymity for - those who really need it is not someone attacking the network to - discover who they are. It's instead the extreme danger that so few - people use Tor because it's so slow, that those who do use it have - essentially no confusion set. - - The recent case where the professor and the rogue Tor user were the - only Tor users on campus, and thus suspected in an incident involving - Tor and that University underscores this point: "That was why the police - had come to see me. They told me that only two people on our campus were - using Tor: me and someone they suspected of engaging in an online scam. - The detectives wanted to know whether the other user was a former - student of mine, and why I was using Tor"[1]. - - Not only does Tor provide no anonymity if you use it to be anonymous - but are obviously from a certain institution, location or circumstance, - it is also dangerous to use Tor for risk of being accused of having - something significant enough to hide to be willing to put up with - the horrible performance as opposed to using some weaker alternative. - - There are many ways to improve the speed problem, and of course we - should and will implement as many as we can. Johannes's GSoC project - and my reputation system are longer term, higher-effort things that - will still provide benefit independent of this proposal. - - However, reducing the path length to 2 for those who do not need the - extra anonymity 3 hops provide not only improves their Tor experience - but also reduces their load on the Tor network by 33%, and should - increase adoption of Tor by a good deal. That's not just Win-Win, it's - Win-Win-Win. - - -Who will enable this option? - - This is the crux of the proposal. Admittedly, there is some anonymity - loss and some degree of decreased investment required on the part of - the adversary to attack 2 hop users versus 3 hop users, even if it is - minimal and limited mostly to up-front costs and false positives. - - The key questions are: - - 1. Are these users in a class such that their risk is significantly - less than the amount of this anonymity loss? - - 2. Are these users able to identify themselves? - - Many many users of Tor are not at risk for an adversary capturing c/n - nodes of the network just to see what they do. These users use Tor to - circumvent aggressive content filters, or simply to keep their IP out of - marketing and search engine databases. Most content filters have no - interest in running Tor nodes to catch violators, and marketers - certainly would never consider such a thing, both on a cost basis and a - legal one. - - In a sense, this represents an alternate threat model against these - users who are not at risk for Tor's normal threat model. - - It should be evident to these users that they fall into this class. All - that should be needed is a radio button - - * "I use Tor for local content filter circumvention and/or IP obfuscation, - not anonymity. Speed is more important to me than high anonymity. - No one will make considerable efforts to determine my real IP." - * "I use Tor for anonymity and/or national-level, legally enforced - censorship. It is possible effort will be taken to identify - me, including but not limited to network surveillance. I need more - protection." - - and then some explanation in the help for exactly what this means, and - the risks involved with eliminating the adversary's need for timing - attacks with respect to false positives. Ultimately, the decision is a - simple one that can be made without this information, however. The user - does not need Paul Syverson to instruct them on the deep magic of Onion - Routing to make this decision. They just need to know why they use Tor. - If they use it just to stay out of marketing databases and/or bypass a - local content filter, two hops is plenty. This is likely the vast - majority of Tor users, and many non-users we would like to bring on - board. - - So, having established this class of users, let us now go on to - examine theoretical and practical risks we place them at, and determine - if these risks violate the users needs, or introduce additional risk - to node operators who may be subject to requests from law enforcement - to track users who need 3 hops, but use 2 because they enjoy the - thrill of russian roulette. - - -Theoretical Argument: - - It has long been established that timing attacks against mixed - and onion networks are extremely effective, and that regardless - of path length, if the adversary has compromised your first and - last hop of your path, you can assume they have compromised your - identity for that connection. - - In fact, it was demonstrated that for all but the slowest, lossiest - networks, error rates for false positives and false negatives were - very near zero[2]. Only for constant streams of traffic over slow and - (more importantly) extremely lossy network links did the error rate - hit 20%. For loss rates typical to the Internet, even the error rate - for slow nodes with constant traffic streams was 13%. - - When you take into account that most Tor streams are not constant, - but probably much more like their "HomeIP" dataset, which consists - mostly of web traffic that exists over finite intervals at specific - times, error rates drop to fractions of 1%, even for the "worst" - network nodes. - - Therefore, the user has little benefit from the extra hop, assuming - the adversary does timing correlation on their nodes. Since timing - correlation is simply an implementation issue and is most likely - a single up-front cost (and one that is like quite a bit cheaper - than the cost of the machines purchased to host the nodes to mount - an attack), the real protection is the low probability of getting - both the first and last hop of a client's stream. - - -Practical Issues: - - Theoretical issues aside, there are several practical issues with the - implementation of Tor that need to be addressed to ensure that - identity information is not leaked by the implementation. - - Exit policy issues: - - If a client chooses an exit with a very restrictive exit policy - (such as an IP or IP range), the first hop then knows a good deal - about the destination. For this reason, clients should not select - exits that match their destination IP with anything other than "*". - - Partitioning: - - Partitioning attacks form another concern. Since Tor uses telescoping - to build circuits, it is possible to tell a user is constructing only - two hop paths at the entry node and on the local network. An external - adversary can potentially differentiate 2 and 3 hop users, and decide - that all IP addresses connecting to Tor and using 3 hops have something - to hide, and should be scrutinized more closely or outright apprehended. - - One solution to this is to use the "leaky-circuit" method of attaching - streams: The user always creates 3-hop circuits, but if the option - is enabled, they always exit from their 2nd hop. The ideal solution - would be to create a RELAY_SHISHKABOB cell which contains onion - skins for every host along the path, but this requires protocol - changes at the nodes to support. - - Guard nodes: - - Since guard nodes can rotate due to client relocation, network - failure, node upgrades and other issues, if you amortize the risk a - mobile, dialup, or otherwise intermittently connected user is exposed to - over any reasonable duration of Tor usage (on the order of a year), it - is the same with or without guard nodes. Assuming an adversary has - c%/n% of network bandwidth, and guards rotate on average with period R, - statistically speaking, it's merely a question of if the user wishes - their risk to be concentrated with probability c/n over an expected - period of R*c, and probability 0 over an expected period of R*(n-c), - versus a continuous risk of (c/n)^2. So statistically speaking, guards - only create a time-tradeoff of risk over the long run for normal Tor - usage. Rotating guards do not reduce risk for normal client usage long - term.[3] - - On other other hand, assuming a more stable method of guard selection - and preservation is devised, or a more stable client side network than - my own is typical (which rotates guards frequently due to network issues - and moving about), guard nodes provide a tradeoff in the form of c/n% of - the users being "sacrificial users" who are exposed to high risk O(c/n) - of identification, while the rest of the network is exposed to zero - risk. - - The nature of Tor makes it likely an adversary will take a "shock and - awe" approach to suppressing Tor by rounding up a few users whose - browsing activity has been observed to be made into examples, in an - attempt to prove that Tor is not perfect. - - Since this "shock and awe" attack can be applied with or without guard - nodes, stable guard nodes do offer a measure of accountability of sorts. - If a user was using a small set of guard nodes and knows them well, and - then is suddenly apprehended as a result of Tor usage, having a fixed - set of entry points to suspect is a lot better than suspecting the whole - network. Conversely, it can also give non-apprehended users comfort - that they are likely to remain safe indefinitely with their set of (now - presumably trusted) guards. This is probably the most beneficial - property of reliable guards: they deter the adversary from mounting - "shock and awe" attacks because the surviving users will not - intimidated, but instead made more confident. Of course, guards need to - be made much more stable and users need to be encouraged to know their - guards for this property to really take effect. - - This beneficial property of client vigilance also carries over to an - active adversary, except in this case instead of relying on the user - to remember their guard nodes and somehow communicate them after - apprehension, the code can alert them to the presence of an active - adversary before they are apprehended. But only if they use guard nodes. - - So lets consider the active adversary: Two hop paths allow malicious - guards to get considerably more benefit from failing circuits if they do - not extend to their colluding peers for the exit hop. Since guards can - detect the number of hops in a path via either timing or by statistical - analysis of the exit policy of the 2nd hop, they can perform this attack - predominantly against 2 hop users. - - This can be addressed by completely abandoning an entry guard after a - certain ratio of extend or general circuit failures with respect to - non-failed circuits. The proper value for this ratio can be determined - experimentally with TorFlow. There is the possibility that the local - network can abuse this feature to cause certain guards to be dropped, - but they can do that anyways with the current Tor by just making guards - they don't like unreachable. With this mechanism, Tor will complain - loudly if any guard failure rate exceeds the expected in any failure - case, local or remote. - - Eliminating guards entirely would actually not address this issue due - to the time-tradeoff nature of risk. In fact, it would just make it - worse. Without guard nodes, it becomes much more difficult for clients - to become alerted to Tor entry points that are failing circuits to make - sure that they only devote bandwidth to carry traffic for streams which - they observe both ends. Yet the rogue entry points are still able to - significantly increase their success rates by failing circuits. - - For this reason, guard nodes should remain enabled for 2 hop users, - at least until an IP-independent, undetectable guard scanner can - be created. TorFlow can scan for failing guards, but after a while, - its unique behavior gives away the fact that its IP is a scanner and - it can be given selective service. - - Consideration of risks for node operators: - - There is a serious risk for two hop users in the form of guard - profiling. If an adversary running an exit node notices that a - particular site is always visited from a fixed previous hop, it is - likely that this is a two hop user using a certain guard, which could be - monitored to determine their identity. Thus, for the protection of both - 2 hop users and node operators, 2 hop users should limit their guard - duration to a sufficient number of days to verify reliability of a node, - but not much more. This duration can be determined experimentally by - TorFlow. - - Considering a Tor client builds on average 144 circuits/day (10 - minutes per circuit), if the adversary owns c/n% of exits on the - network, they can expect to see 144*c/n circuits from this user, or - about 14 minutes of usage per day per percentage of network penetration. - Since it will take several occurrences of user-linkable exit content - from the same predecessor hop for the adversary to have any confidence - this is a 2 hop user, it is very unlikely that any sort of demands made - upon the predecessor node would guaranteed to be effective (ie it - actually was a guard), let alone be executed in time to apprehend the - user before they rotated guards. - - The reverse risk also warrants consideration. If a malicious guard has - orders to surveil Mike Perry, it can determine Mike Perry is using two - hops by observing his tendency to choose a 2nd hop with a viable exit - policy. This can be done relatively quickly, unfortunately, and - indicates Mike Perry should spend some of his time building real 3 hop - circuits through the same guards, to require them to at least wait for - him to actually use Tor to determine his style of operation, rather than - collect this information from his passive building patterns. - - However, to actively determine where Mike Perry is going, the guard - will need to require logging ahead of time at multiple exit nodes that - he may use over the course of the few days while he is at that guard, - and correlate the usage times of the exit node with Mike Perry's - activity at that guard for the few days he uses it. At this point, the - adversary is mounting a scale and method of attack (widespread logging, - timing attacks) that works pretty much just as effectively against 3 - hops, so exit node operators are exposed to no additional danger than - they otherwise normally are. - - -Why not fix Pathlen=2?: - - The main reason I am not advocating that we always use 2 hops is that - in some situations, timing correlation evidence by itself may not be - considered as solid and convincing as an actual, uninterrupted, fully - traced path. Are these timing attacks as effective on a real network as - they are in simulation? Maybe the circuit multiplexing of Tor can serve - to frustrate them to a degree? Would an extralegal adversary or - authoritarian government even care? In the face of these situation - dependent unknowns, it should be up to the user to decide if this is - a concern for them or not. - - It should probably also be noted that even a false positive - rate of 1% for a 200k concurrent-user network could mean that for a - given node, a given stream could be confused with something like 10 - users, assuming ~200 nodes carry most of the traffic (ie 1000 users - each). Though of course to really know for sure, someone needs to do - an attack on a real network, unfortunately. - - Additionally, at some point cover traffic schemes may be implemented to - frustrate timing attacks on the first hop. It is possible some expert - users may do this ad-hoc already, and may wish to continue using 3 hops - for this reason. - - -Implementation: - - new_route_len() can be modified directly with a check of the - Pathlen option. However, circuit construction logic should be - altered so that both 2 hop and 3 hop users build the same types of - circuits, and the option should ultimately govern circuit selection, - not construction. This improves coverage against guard nodes being - able to passively profile users who aren't even using Tor. - PathlenCoinWeight, anyone? :) - - The exit policy hack is a bit more tricky. compare_addr_to_addr_policy - needs to return an alternate ADDR_POLICY_ACCEPTED_WILDCARD or - ADDR_POLICY_ACCEPTED_SPECIFIC return value for use in - circuit_is_acceptable. - - The leaky exit is trickier still.. handle_control_attachstream - does allow paths to exit at a given hop. Presumably something similar - can be done in connection_ap_handshake_process_socks, and elsewhere? - Circuit construction would also have to be performed such that the - 2nd hop's exit policy is what is considered, not the 3rd's. - - The entry_guard_t structure could have num_circ_failed and - num_circ_succeeded members such that if it exceeds F% circuit - extend failure rate to a second hop, it is removed from the entry list. - - F should be sufficiently high to avoid churn from normal Tor circuit - failure as determined by TorFlow scans. - - The Vidalia option should be presented as a radio button. - - -Migration: - - Phase 1: Adjust exit policy checks if Pathlen is set, implement leaky - circuit ability, and 2-3 hop circuit selection logic governed by - Pathlen. - - Phase 2: Experiment to determine the proper ratio of circuit - failures used to expire garbage or malicious guards via TorFlow - (pending Bug #440 backport+adoption). - - Phase 3: Implement guard expiration code to kick off failure-prone - guards and warn the user. Cap 2 hop guard duration to a proper number - of days determined sufficient to establish guard reliability (to be - determined by TorFlow). - - Phase 4: Make radiobutton in Vidalia, along with help entry - that explains in layman's terms the risks involved. - - Phase 5: Allow user to specify path length by HTTP URL suffix. - - -[1] http://p2pnet.net/story/11279 -[2] http://www.cs.umass.edu/~mwright/papers/levine-timing.pdf -[3] Proof available upon request ;) |