From 61d35525e66251a21a74c2c695901d339b269e9e Mon Sep 17 00:00:00 2001
From: Mike Perry
Date: Fri, 1 Jun 2007 04:41:51 +0000
Subject: Add Two Hop Paths proposal as 115. Mark 112 superseded by 115.

svn:r10435
---
 proposals/115-two-hop-paths.txt | 292 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 292 insertions(+)
 create mode 100644 proposals/115-two-hop-paths.txt

(limited to 'proposals/115-two-hop-paths.txt')

diff --git a/proposals/115-two-hop-paths.txt b/proposals/115-two-hop-paths.txt
new file mode 100644
index 0000000..5f6be81
--- /dev/null
+++ b/proposals/115-two-hop-paths.txt
@@ -0,0 +1,292 @@
+Filename: 115-two-hop-paths.txt
+Title: Two Hop Paths
+Version: $Revision$
+Last-Modified: $Date$
+Author: Mike Perry
+Created:
+Status: Open
+Supersedes: 112
+
+
+Overview:
+
+  The idea is that users should be able to choose whether they would
+  like to have two or three hop paths through the Tor network.
+
+  This value should be modifiable from the controller, and should be
+  available from Vidalia.
+
+
+Motivation:
+
+  The Tor network is slow and overloaded. Increasingly often I hear
+  stories about friends and friends of friends who are behind firewalls,
+  annoying censorware, or under surveillance that interferes with their
+  productivity and Internet usage, or chills their speech. These people
+  know about Tor, but they choose to put up with the censorship because
+  Tor is too slow to be usable for them. In fact, downloading a fresh,
+  complete copy of levine-timing.pdf for the Theoretical Argument
+  section of this proposal over Tor took me 3 tries.
+
+  Furthermore, the biggest current problem with Tor's anonymity for
+  those who really need it is not someone attacking the network to
+  discover who they are. It is instead the extreme danger that, because
+  Tor is so slow, so few people use it that those who do use it have
+  essentially no confusion set.
+
+  The recent case where the professor and the rogue Tor user were the
+  only Tor users on campus, and thus suspected in an incident involving
+  Tor at that university, underscores this point: "That was why the police
+  had come to see me. They told me that only two people on our campus were
+  using Tor: me and someone they suspected of engaging in an online scam.
+  The detectives wanted to know whether the other user was a former
+  student of mine, and why I was using Tor"[1].
+
+  Not only does Tor provide no anonymity if you use it to be anonymous
+  but are obviously from a certain institution, location or circumstance,
+  it is also dangerous to use Tor, since you risk being accused of having
+  something significant enough to hide to be willing to put up with
+  the horrible performance.
+
+  There are many ways to address the speed problem, and of course we
+  should and will implement as many as we can. Johannes's GSoC project
+  and my reputation system are longer term, higher-effort things that
+  will still provide benefit independent of this proposal.
+
+  However, reducing the path length to 2 for those who do not need the
+  (questionable) extra anonymity 3 hops provide not only improves their
+  Tor experience but also reduces their load on the Tor network by 33%,
+  and can be done in fewer than 10 lines of code (not counting various
+  security enhancements). That's not just Win-Win, it's Win-Win-Win.
+
+
+Theoretical Argument:
+
+  It has long been established that timing attacks against mix
+  and onion networks are extremely effective, and that regardless
+  of path length, if the adversary has compromised the first and
+  last hop of your path, you can assume they have compromised your
+  identity for that connection.
+
+  In fact, it was demonstrated that for all but the slowest, lossiest
+  networks, error rates for false positives and false negatives were
+  very near zero[2]. Only for constant streams of traffic over slow and
+  (more importantly) extremely lossy network links did the error rate
+  hit 20%. For loss rates typical of the Internet, even the error rate
+  for slow nodes with constant traffic streams was 13%.
+
+  When you take into account that most Tor streams are not constant,
+  but probably much more like the "HomeIP" dataset in [2], which consists
+  mostly of web traffic that exists over finite intervals at specific
+  times, error rates drop to fractions of 1%, even for the "worst"
+  network nodes.
+
+  Therefore, the user gains little benefit from the extra hop, assuming
+  the adversary does timing correlation on their nodes. Since timing
+  correlation is simply an implementation issue and is most likely
+  a single up-front cost (and one that is likely quite a bit cheaper
+  than the cost of the machines purchased to host the nodes to mount
+  an attack), the real protection is the low probability of getting
+  both the first and last hop of a client's stream.
+
+
+Practical Issues:
+
+  Theoretical issues aside, there are several practical issues with the
+  implementation of Tor that need to be addressed to ensure that
+  identity information is not leaked by the implementation.
+
+  Exit policy issues:
+
+  If a client chooses an exit with a very restrictive exit policy
+  (such as an IP or IP range), the first hop then knows a good deal
+  about the destination. For this reason, clients should not select
+  exits that match their destination IP with anything other than "*".
+
+  Partitioning:
+
+  Partitioning attacks form another concern. Since Tor uses telescoping
+  to build circuits, it is possible to tell that a user is constructing
+  only two hop paths at the entry node and on the local network. An external
+  adversary can potentially differentiate 2 and 3 hop users, and decide
+  that all IP addresses connecting to Tor and using 3 hops have something
+  to hide, and should be scrutinized more closely or outright apprehended.
+
+  One solution to this is to use the "leaky-circuit" method of attaching
+  streams: The user always creates 3-hop circuits, but if the option
+  is enabled, they always exit from their 2nd hop. The ideal solution
+  would be to create a RELAY_SHISHKABOB cell which contains onion
+  skins for every host along the path, but supporting this requires
+  protocol changes at the nodes.
+
+  Guard nodes:
+
+  Since guard nodes do rotate due to network failure, node upgrades and
+  other issues, if you amortize the risk a user is exposed to over any
+  reasonable duration of Tor usage (on the order of a year), it is the
+  same with or without guard nodes. Assuming an adversary has c/n of
+  network bandwidth, and guards rotate on average with period R,
+  statistically speaking, it's merely a question of whether the user wishes
+  their risk to be concentrated with probability c/n over an expected
+  period of R*c, and probability 0 over an expected period of R*(n-c),
+  versus a continuous risk of (c/n)^2. So statistically speaking, guards
+  only create a time-tradeoff of risk over the long run for normal Tor
+  usage. They do not reduce risk for normal client usage long term.[3]
+
+  Guard nodes do offer a measure of accountability of sorts. If a user
+  was using a small set of guard nodes, and then is suddenly apprehended
+  as a result of Tor usage, having a fixed set of entry points to suspect
+  is a lot better than suspecting the whole network.
+
+  It has been speculated that a set of guard nodes can be used to
+  fingerprint a user (presumably by a local adversary) when they move
+  about. However, it is precisely this activity of moving your laptop that
+  causes guards to be marked as down by the Tor client, which then chooses
+  new ones.
+
+  All of this is not terribly relevant to this proposal, but worth bearing
+  in mind, since guard nodes do have a bit more ability to wreak
+  havoc with two hops than with three.
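Editor's illustration (not part of the patch): the time-tradeoff argument
in the "Guard nodes" discussion can be checked with a few lines of
arithmetic. This is a minimal sketch under the proposal's own simplifying
assumptions: the adversary holds c out of n units of bandwidth, a bad
guard exposes each circuit with probability c/n for an expected period of
R*c, and a good guard exposes nothing for an expected period of R*(n-c).
The function names and the default R are hypothetical and only serve the
example.

```python
from fractions import Fraction


def per_circuit_risk_without_guards(c, n):
    """Continuous risk when entry and exit are both picked per circuit."""
    f = Fraction(c, n)
    return f * f


def long_run_risk_with_guards(c, n, rotation_period=1):
    """Time-averaged risk when the entry is a guard kept for period R.

    While the guard is bad (expected time R*c) each circuit is exposed
    with probability c/n (a bad exit); while the guard is good
    (expected time R*(n-c)) the exposure is 0.
    """
    f = Fraction(c, n)
    bad_time = rotation_period * c
    good_time = rotation_period * (n - c)
    return (bad_time * f + good_time * 0) / (bad_time + good_time)


# Example: adversary holds 10 of 100 units of bandwidth.
print(per_circuit_risk_without_guards(10, 100))
print(long_run_risk_with_guards(10, 100, rotation_period=30))
```

Both functions give the same value, which is the proposal's point: under
these assumptions guards change when the risk is incurred, not how much
of it accumulates over the long run. R cancels out of the average.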
+
+  Two hop paths allow malicious guards to get considerably more benefit
+  from failing circuits if they do not extend to their colluding peers for
+  the exit hop. Since guards can detect the number of hops in a path
+  either via timing or via statistical analysis of the exit policy of the
+  2nd hop, they can perform this attack predominantly against 2 hop users
+  only.
+
+  This can be addressed by completely abandoning an entry guard after a
+  certain ratio of extend or general circuit failures with respect to
+  non-failed circuits. The proper value for this ratio can be determined
+  experimentally with TorFlow. There is the possibility that the local
+  network can abuse this feature to cause certain guards to be dropped,
+  but they can do that anyway with the current Tor by just making guards
+  they don't like unreachable. With this mechanism, Tor will complain
+  loudly if any guard failure rate exceeds the expected rate in any
+  failure case, local or remote.
+
+  Eliminating guards entirely would actually not address this issue due
+  to the time-tradeoff nature of risk. In fact, it would just make it
+  worse. Without guard nodes, it becomes much more difficult for clients
+  to become alerted to Tor entry points that are failing circuits to make
+  sure that they only devote bandwidth to carry traffic for streams for
+  which they observe both ends.
+
+  For this reason, guard nodes should remain enabled for 2 hop users,
+  at least until an IP-independent, undetectable guard scanner can
+  be created. TorFlow can scan for failing guards, but after a while,
+  its unique behavior gives away the fact that its IP is a scanner and
+  it can be given selective service.
+
+
+Why not fix Pathlen=2?:
+
+  The main reason I am not advocating that we always use 2 hops is that
+  in some situations, timing correlation evidence by itself may not be
+  considered as solid and convincing as an actual, uninterrupted, fully
+  traced path. Are these timing attacks as effective on a real network as
+  they are in simulation? Maybe the circuit multiplexing of Tor can serve
+  to frustrate them to a degree? Would an extralegal adversary or
+  authoritarian government even care? In the face of these
+  situation-dependent unknowns, it should be up to the user to decide
+  if this is a concern for them or not.
+
+  It should probably also be noted that even a false positive
+  rate of 1% for a 200k concurrent-user network could mean that for a
+  given node, a given stream could be confused with something like 10
+  users, assuming ~200 nodes carry most of the traffic (i.e. 1000 users
+  each). Though of course to really know for sure, someone needs to do
+  an attack on a real network, unfortunately.
+
+  Additionally, at some point cover traffic schemes may be implemented to
+  frustrate timing attacks on the first hop. It is possible some expert
+  users may do this ad-hoc already, and may wish to continue using 3 hops
+  for this reason.
+
+
+Who will enable this option?
+
+  This is the crux of the proposal. Admittedly, there is some anonymity
+  loss and some degree of decreased investment required on the part of
+  the adversary to attack 2 hop users versus 3 hop users, even if it is
+  minimal and limited mostly to up-front costs and false positives.
+
+  The key questions are:
+
+  1. Are these users in a class such that their risk is significantly
+  less than the amount of this anonymity loss?
+
+  2. Are these users able to identify themselves as members of this class?
+
+  Many users of Tor are not at risk from an adversary capturing c/n
+  of the network's nodes just to see what they do. These users use Tor to
+  circumvent aggressive content filters, or simply to keep their IP out of
+  marketing and search engine databases. Most content filters have no
+  interest in running Tor nodes to catch violators, and marketers
+  certainly would never consider such a thing, both on a cost basis and a
+  legal one.
+
+  In a sense, this represents an alternate threat model for those
+  users who are not at risk under Tor's normal threat model.
+
+  It should be evident to these users that they fall into this class. All
+  that should be needed is a radio button:
+
+  * "I use Tor for censorship resistance and IP obfuscation, not anonymity.
+    Speed is more important to me than high anonymity."
+  * "I use Tor for anonymity. I need more protection at the cost of speed."
+
+  and then some explanation in the help for exactly what this means, and
+  the risks involved with eliminating the adversary's need for timing
+  attacks with respect to false positives.
+
+
+Implementation:
+
+  new_route_len() can be modified directly with a check of the
+  Pathlen option.
+
+  The exit policy hack is a bit more tricky. compare_addr_to_addr_policy
+  needs to return an alternate ADDR_POLICY_ACCEPTED_WILDCARD or
+  ADDR_POLICY_ACCEPTED_SPECIFIC return value for use in
+  circuit_is_acceptable.
+
+  The leaky exit is trickier still. handle_control_attachstream
+  does allow paths to exit at a given hop. Presumably something similar
+  can be done in connection_ap_handshake_process_socks, and elsewhere?
+  Circuit construction would also have to be performed such that the
+  2nd hop's exit policy is what is considered, not the 3rd's.
+
+  The entry_guard_t structure could have num_circ_failed and
+  num_circ_succeeded members such that if a guard exceeds an F% circuit
+  extend failure rate to a second hop, it is removed from the entry list.
+
+  F should be sufficiently high to avoid churn from normal Tor circuit
+  failure as determined by TorFlow scans.
+
+  The Vidalia option should be presented as a radio button.
+
+
+Migration:
+
+  Phase 1: Adjust exit policy checks if Pathlen is set. Modify
+  new_route_len() to obey a 'Pathlen' config option.
+
+  Phase 2: Implement leaky circuit ability.
+
+  Phase 3: Experiment to determine the proper ratio of circuit
+  failures used to expire garbage or malicious guards via TorFlow
+  (pending Bug #440 backport+adoption).
+
+  Phase 4: Implement guard expiration code to kick off failure-prone
+  guards and warn the user.
+
+  Phase 5: Make radio button in Vidalia, along with help entry
+  that explains in layman's terms the risks involved.
+
+  Phase 6: Allow user to specify path length by HTTP URL suffix.
+
+
+[1] http://p2pnet.net/story/11279
+[2] http://www.cs.umass.edu/~mwright/papers/levine-timing.pdf
+[3] Proof available upon request ;)
--
cgit v1.2.3-54-g00ecf
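Editor's illustration (not part of the patch): the guard expiration check
described under "Implementation" and in Phase 4 could look roughly like
the sketch below. It is written in Python rather than Tor's C, and it is
not Tor code. Only the counter names num_circ_failed and
num_circ_succeeded come from the proposal; the EntryGuard class, the
expire_failing_guards() helper, the min_circuits floor and the 30%
threshold are assumptions made for the example. The proposal leaves the
real value of F to be determined by TorFlow scans.

```python
from dataclasses import dataclass


@dataclass
class EntryGuard:
    """Hypothetical stand-in for entry_guard_t with the proposed counters."""
    nickname: str
    num_circ_failed: int = 0
    num_circ_succeeded: int = 0

    def failure_rate(self) -> float:
        """Fraction of circuits through this guard that failed to extend."""
        total = self.num_circ_failed + self.num_circ_succeeded
        return self.num_circ_failed / total if total else 0.0


def expire_failing_guards(guards, max_failure_pct, min_circuits=20):
    """Split guards into (kept, dropped) by failure rate.

    max_failure_pct plays the role of F. min_circuits is an assumed
    floor so a guard is not dropped on the basis of a handful of
    circuits; the proposal does not specify one.
    """
    kept, dropped = [], []
    for guard in guards:
        total = guard.num_circ_failed + guard.num_circ_succeeded
        too_lossy = 100.0 * guard.failure_rate() > max_failure_pct
        if total >= min_circuits and too_lossy:
            dropped.append(guard)
        else:
            kept.append(guard)
    return kept, dropped


guards = [
    EntryGuard("guardA", num_circ_failed=3, num_circ_succeeded=97),
    EntryGuard("guardB", num_circ_failed=60, num_circ_succeeded=40),
    EntryGuard("guardC", num_circ_failed=2, num_circ_succeeded=1),
]
kept, dropped = expire_failing_guards(guards, max_failure_pct=30)
for guard in dropped:
    # Phase 4 also calls for warning the user when a guard is dropped.
    print(f"warning: dropping guard {guard.nickname}, "
          f"failure rate {guard.failure_rate():.0%}")
```

In this example only guardB is dropped: guardC has a high failure rate
but too few circuits to judge under the assumed floor. Whether such a
floor is desirable, and how it interacts with a local network that
deliberately fails circuits, is a design question the proposal leaves
open.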