author | Roger Dingledine <arma@torproject.org> | 2007-05-12 02:26:46 +0000 |
---|---|---|
committer | Roger Dingledine <arma@torproject.org> | 2007-05-12 02:26:46 +0000 |
commit | 7218188157864f1a433f8d28ee3360a3d84fd6eb (patch) | |
tree | d4e12a99b91ced2d3b352e775c21a7453033eb9a /doc | |
parent | 866313aafc8fd2d56a499711956d54d64bb34f7f (diff) | |
hack up a blocking.html via tth and some manual hacking
svn:r10168
Diffstat (limited to 'doc')
-rw-r--r-- | doc/design-paper/blocking.html | 2112 |
1 files changed, 2112 insertions, 0 deletions
diff --git a/doc/design-paper/blocking.html b/doc/design-paper/blocking.html new file mode 100644 index 0000000000..849d2679a5 --- /dev/null +++ b/doc/design-paper/blocking.html @@ -0,0 +1,2112 @@ +<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" + "DTD/xhtml1-transitional.dtd"> +<html> +<meta name="GENERATOR" content="TtH 3.77"> +<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"> + <style type="text/css"> div.p { margin-top: 7pt;}</style> + <style type="text/css"><!-- + td div.comp { margin-top: -0.6ex; margin-bottom: -1ex;} + td div.comb { margin-top: -0.6ex; margin-bottom: -.6ex;} + td div.hrcomp { line-height: 0.9; margin-top: -0.8ex; margin-bottom: -1ex;} + td div.norm {line-height:normal;} + span.roman {font-family: serif; font-style: normal; font-weight: normal;} + span.overacc2 {position: relative; left: .8em; top: -1.2ex;} + span.overacc1 {position: relative; left: .6em; top: -1.2ex;} --></style> + + +<title> Design of a blocking-resistant anonymity system\DRAFT</title> + +<h1 align="center">Design of a blocking-resistant anonymity system<br />DRAFT </h1> + +<div class="p"><!----></div> + +<h3 align="center">Roger Dingledine, Nick Mathewson </h3> + + +<div class="p"><!----></div> + +<h2> Abstract</h2> +Internet censorship is on the rise as websites around the world are +increasingly blocked by government-level firewalls. Although popular +anonymizing networks like Tor were originally designed to keep attackers from +tracing people's activities, many people are also using them to evade local +censorship. But if the censor simply denies access to the Tor network +itself, blocked users can no longer benefit from the security Tor offers. + +<div class="p"><!----></div> +Here we describe a design that builds upon the current Tor network +to provide an anonymizing network that resists blocking +by government-level attackers. 
+ +<div class="p"><!----></div> + + <h2><a name="tth_sEc1"> +1</a> Introduction and Goals</h2> + +<div class="p"><!----></div> +Anonymizing networks like Tor [<a href="#tor-design" name="CITEtor-design">11</a>] bounce traffic around a +network of encrypting relays. Unlike encryption, which hides only <i>what</i> +is said, these networks also aim to hide who is communicating with whom, which +users are using which websites, and similar relations. These systems have a +broad range of users, including ordinary citizens who want to avoid being +profiled for targeted advertisements, corporations who don't want to reveal +information to their competitors, and law enforcement and government +intelligence agencies who need to do operations on the Internet without being +noticed. + +<div class="p"><!----></div> +Historical anonymity research has focused on an +attacker who monitors the user (call her Alice) and tries to discover her +activities, yet lets her reach any piece of the network. In more modern +threat models such as Tor's, the adversary is allowed to perform active +attacks such as modifying communications to trick Alice +into revealing her destination, or intercepting some connections +to run a man-in-the-middle attack. But these systems still assume that +Alice can eventually reach the anonymizing network. + +<div class="p"><!----></div> +An increasing number of users are using the Tor software +less for its anonymity properties than for its censorship +resistance properties — if they use Tor to access Internet sites like +Wikipedia +and Blogspot, they are no longer affected by local censorship +and firewall rules. In fact, an informal user study +showed China as the third largest user base +for Tor clients, with perhaps ten thousand people accessing the Tor +network from China each day. 
+ +<div class="p"><!----></div> +The current Tor design is easy to block if the attacker controls Alice's +connection to the Tor network — by blocking the directory authorities, +by blocking all the server IP addresses in the directory, or by filtering +based on the fingerprint of the Tor TLS handshake. Here we describe an +extended design that builds upon the current Tor network to provide an +anonymizing +network that resists censorship as well as anonymity-breaking attacks. +In section <a href="#sec:adversary">2</a> we discuss our threat model — that is, +the assumptions we make about our adversary. Section <a href="#sec:current-tor">3</a> +describes the components of the current Tor design and how they can be +leveraged for a new blocking-resistant design. Section <a href="#sec:related">4</a> +explains the features and drawbacks of the currently deployed solutions. +In sections <a href="#sec:bridges">5</a> through <a href="#sec:discovery">7</a>, we explore the +components of our designs in detail. Section <a href="#sec:security">8</a> considers +security implications and Section <a href="#sec:reachability">9</a> presents other +issues with maintaining connectivity and sustainability for the design. +Section <a href="#sec:future">10</a> speculates about future more complex designs, +and finally Section <a href="#sec:conclusion">11</a> summarizes our next steps and +recommendations. + +<div class="p"><!----></div> + +<div class="p"><!----></div> + +<div class="p"><!----></div> + +<div class="p"><!----></div> + <h2><a name="tth_sEc2"> +<a name="sec:adversary"> +2</a> Adversary assumptions</h2> +</a> + +<div class="p"><!----></div> +To design an effective anti-censorship tool, we need a good model for the +goals and resources of the censors we are evading. Otherwise, we risk +spending our effort on keeping the adversaries from doing things they have no +interest in doing, and thwarting techniques they do not use. 
+The history of blocking-resistance designs is littered with conflicting +assumptions about what adversaries to expect and what problems are +in the critical path to a solution. Here we describe our best +understanding of the current situation around the world. + +<div class="p"><!----></div> +In the traditional security style, we aim to defeat a strong +attacker — if we can defend against this attacker, we inherit protection +against weaker attackers as well. After all, we want a general design +that will work for citizens of China, Thailand, and other censored +countries; for +whistleblowers in firewalled corporate networks; and for people in +unanticipated oppressive situations. In fact, by designing with +a variety of adversaries in mind, we can take advantage of the fact that +adversaries will be in different stages of the arms race at each location, +so a server blocked in one locale can still be useful in others. + +<div class="p"><!----></div> +We assume that the attackers' goals are somewhat complex. + +<dl compact="compact"> + + <dt><b></b></dt> + <dd><li>The attacker would like to restrict the flow of certain kinds of + information, particularly when this information is seen as embarrassing to + those in power (such as information about rights violations or corruption), + or when it enables or encourages others to oppose them effectively (such as + information about opposition movements or sites that are used to organize + protests).</dd> + <dt><b></b></dt> + <dd><li>As a second-order effect, censors aim to chill citizens' behavior by + creating an impression that their online activities are monitored.</dd> + <dt><b></b></dt> + <dd><li>In some cases, censors make a token attempt to block a few sites for + obscenity, blasphemy, and so on, but their efforts here are mainly for + show. 
In other cases, they really do try hard to block such content.</dd> + <dt><b></b></dt> + <dd><li>Complete blocking (where nobody at all can ever download censored + content) is not a + goal. Attackers typically recognize that perfect censorship is not only + impossible, but unnecessary: if "undesirable" information is known only + to a small few, further censoring efforts can be focused elsewhere.</dd> + <dt><b></b></dt> + <dd><li>Similarly, the censors are not attempting to shut down or block <i> + every</i> anti-censorship tool — merely the tools that are popular and + effective (because these tools impede the censors' information restriction + goals) and those tools that are highly visible (thus making the censors + look ineffectual to their citizens and their bosses).</dd> + <dt><b></b></dt> + <dd><li>Reprisal against <i>most</i> passive consumers of <i>most</i> kinds of + blocked information is also not a goal, given the broadness of most + censorship regimes. This seems borne out by fact.<a href="#tthFtNtAAB" name="tthFrefAAB"><sup>1</sup></a></dd> + <dt><b></b></dt> + <dd><li>Producers and distributors of targeted information are in much + greater danger than consumers; the attacker would like to not only block + their work, but identify them for reprisal.</dd> + <dt><b></b></dt> + <dd><li>The censors (or their governments) would like to have a working, useful + Internet. There are economic, political, and social factors that prevent + them from "censoring" the Internet by outlawing it entirely, or by + blocking access to all but a tiny list of sites. + Nevertheless, the censors <i>are</i> willing to block innocuous content + (like the bulk of a newspaper's reporting) in order to censor other content + distributed through the same channels (like that newspaper's coverage of + the censored country). 
+</dd> +</dl> + +<div class="p"><!----></div> +We assume there are three main technical network attacks in use by censors +currently [<a href="#clayton:pet2006" name="CITEclayton:pet2006">7</a>]: + +<div class="p"><!----></div> + +<dl compact="compact"> + + <dt><b></b></dt> + <dd><li>Block a destination or type of traffic by automatically searching for + certain strings or patterns in TCP packets. Offending packets can be + dropped, or can trigger a response like closing the + connection.</dd> + <dt><b></b></dt> + <dd><li>Block a destination by listing its IP address at a + firewall or other routing control point.</dd> + <dt><b></b></dt> + <dd><li>Intercept DNS requests and give bogus responses for certain + destination hostnames. +</dd> +</dl> + +<div class="p"><!----></div> +We assume the network firewall has limited CPU and memory per +connection [<a href="#clayton:pet2006" name="CITEclayton:pet2006">7</a>]. Against an adversary who could carefully +examine the contents of every packet and correlate the packets in every +stream on the network, we would need some stronger mechanism such as +steganography, which introduces its own +problems [<a href="#active-wardens" name="CITEactive-wardens">15</a>,<a href="#tcpstego" name="CITEtcpstego">26</a>]. But we make a "weak +steganography" assumption here: to remain unblocked, it is necessary to +remain unobservable only by computational resources on par with a modern +router, firewall, proxy, or IDS. + +<div class="p"><!----></div> +We assume that while various different regimes can coordinate and share +notes, there will be a time lag between one attacker learning how to overcome +a facet of our design and other attackers picking it up. (The most common +vector of transmission seems to be commercial providers of censorship tools: +once a provider adds a feature to meet one country's needs or requests, the +feature is available to all of the provider's customers.) 
Conversely, we +assume that insider attacks become a higher risk only after the early stages +of network development, once the system has reached a certain level of +success and visibility. + +<div class="p"><!----></div> +We do not assume that government-level attackers are always uniform +across the country. For example, users of different ISPs in China +experience different censorship policies and mechanisms. + +<div class="p"><!----></div> +We assume that the attacker may be able to use political and economic +resources to secure the cooperation of extraterritorial or multinational +corporations and entities in investigating information sources. +For example, the censors can threaten the service providers of +troublesome blogs with economic reprisals if they do not reveal the +authors' identities. + +<div class="p"><!----></div> +We assume that our users have control over their hardware and +software — they don't have any spyware installed, there are no +cameras watching their screens, etc. Unfortunately, in many situations +these threats are real [<a href="#zuckerman-threatmodels" name="CITEzuckerman-threatmodels">28</a>]; yet +software-based security systems like ours are poorly equipped to handle +a user who is entirely observed and controlled by the adversary. See +Section <a href="#subsec:cafes-and-livecds">8.4</a> for more discussion of what little +we can do about this issue. + +<div class="p"><!----></div> +Similarly, we assume that the user will be able to fetch a genuine +version of Tor, rather than one supplied by the adversary; see +Section <a href="#subsec:trust-chain">8.5</a> for discussion on helping the user +confirm that he has a genuine version and that he can connect to the +real Tor network. 
+ +<div class="p"><!----></div> + <h2><a name="tth_sEc3"> +<a name="sec:current-tor"> +3</a> Adapting the current Tor design to anti-censorship</h2> +</a> + +<div class="p"><!----></div> +Tor is popular and sees a lot of use — it's the largest anonymity +network of its kind, and has +attracted more than 800 volunteer-operated routers from around the +world. Tor protects each user by routing their traffic through a multiply +encrypted "circuit" built of a few randomly selected servers, each of which +can remove only a single layer of encryption. Each server sees only the step +before it and the step after it in the circuit, and so no single server can +learn the connection between a user and her chosen communication partners. +In this section, we examine some of the reasons why Tor has become popular, +with particular emphasis on how we can take advantage of these properties +for a blocking-resistance design. + +<div class="p"><!----></div> +Tor aims to provide three security properties: + +<dl compact="compact"> + + <dt><b></b></dt> + <dd>1. A local network attacker can't learn, or influence, your +destination.</dd> + <dt><b></b></dt> + <dd>2. No single router in the Tor network can link you to your +destination.</dd> + <dt><b></b></dt> + <dd>3. The destination, or somebody watching the destination, +can't learn your location. +</dd> +</dl> + +<div class="p"><!----></div> +For blocking-resistance, we care most clearly about the first +property. But as the arms race progresses, the second property +will become important — for example, to discourage an adversary +from volunteering a relay in order to learn that Alice is reading +or posting to certain websites. 
The third property helps keep users safe from +collaborating websites: consider websites and other Internet services +that have been pressured +recently into revealing the identity of bloggers +or treating clients differently depending on their network +location [<a href="#goodell-syverson06" name="CITEgoodell-syverson06">17</a>]. + +<div class="p"><!----></div> +The Tor design provides other features as well that are not typically +present in manual or ad hoc circumvention techniques. + +<div class="p"><!----></div> +First, Tor has a well-analyzed and well-understood way to distribute +information about servers. +Tor directory authorities automatically aggregate, test, +and publish signed summaries of the available Tor routers. Tor clients +can fetch these summaries to learn which routers are available and +which routers are suitable for their needs. Directory information is cached +throughout the Tor network, so once clients have bootstrapped they never +need to interact with the authorities directly. (To tolerate a minority +of compromised directory authorities, we use a threshold trust scheme — +see Section <a href="#subsec:trust-chain">8.5</a> for details.) + +<div class="p"><!----></div> +Second, the list of directory authorities is not hard-wired. +Clients use the default authorities if no others are specified, +but it's easy to start a separate (or even overlapping) Tor network just +by running a different set of authorities and convincing users to prefer +a modified client. For example, we could launch a distinct Tor network +inside China; some users could even use an aggregate network made up of +both the main network and the China network. (But we should not be too +quick to create other Tor networks — part of Tor's anonymity comes from +users behaving like other users, and there are many unsolved anonymity +questions if different users know about different pieces of the network.) 
+ +<div class="p"><!----></div> +Third, in addition to automatically learning from the chosen directories +which Tor routers are available and working, Tor takes care of building +paths through the network and rebuilding them as needed. So the user +never has to know how paths are chosen, never has to manually pick +working proxies, and so on. More generally, at its core the Tor protocol +is simply a tool that can build paths given a set of routers. Tor is +quite flexible about how it learns about the routers and how it chooses +the paths. Harvard's Blossom project [<a href="#blossom-thesis" name="CITEblossom-thesis">16</a>] makes this +flexibility more concrete: Blossom makes use of Tor not for its security +properties but for its reachability properties. It runs a separate set +of directory authorities, its own set of Tor routers (called the Blossom +network), and uses Tor's flexible path-building to let users view Internet +resources from any point in the Blossom network. + +<div class="p"><!----></div> +Fourth, Tor separates the role of <em>internal relay</em> from the +role of <em>exit relay</em>. That is, some volunteers choose just to relay +traffic between Tor users and Tor routers, and others choose to also allow +connections to external Internet resources. Because we don't force all +volunteers to play both roles, we end up with more relays. This increased +diversity in turn is what gives Tor its security: the more options the +user has for her first hop, and the more options she has for her last hop, +the less likely it is that a given attacker will be watching both ends +of her circuit [<a href="#tor-design" name="CITEtor-design">11</a>]. As a bonus, because our design attracts +more internal relays that want to help out but don't want to deal with +being an exit relay, we end up providing more options for the first +hop — the one most critical to being able to reach the Tor network. + +<div class="p"><!----></div> +Fifth, Tor is sustainable. 
Zero-Knowledge Systems offered the commercial +but now defunct Freedom Network [<a href="#freedom21-security" name="CITEfreedom21-security">2</a>], a design with +security comparable to Tor's, but its funding model relied on collecting +money from users to pay relay operators. Modern commercial proxy systems +similarly +need to keep collecting money to support their infrastructure. On the +other hand, Tor has built a self-sustaining community of volunteers who +donate their time and resources. This community trust is rooted in Tor's +open design: we tell the world exactly how Tor works, and we provide all +the source code. Users can decide for themselves, or pay any security +expert to decide, whether it is safe to use. Further, Tor's modularity +as described above, along with its open license, means that its impact +will continue to grow. + +<div class="p"><!----></div> +Sixth, Tor has an established user base of hundreds of +thousands of people from around the world. This diversity of +users contributes to sustainability as above: Tor is used by +ordinary citizens, activists, corporations, law enforcement, and +even government and military users, +and they can +only achieve their security goals by blending together in the same +network [<a href="#econymics" name="CITEeconymics">1</a>,<a href="#usability:weis2006" name="CITEusability:weis2006">9</a>]. This user base also provides +something else: hundreds of thousands of different and often-changing +addresses that we can leverage for our blocking-resistance design. + +<div class="p"><!----></div> +Finally and perhaps most importantly, Tor provides anonymity and prevents any +single server from linking users to their communication partners. Despite +initial appearances, <i>distributed-trust anonymity is critical for +anti-censorship efforts</i>. 
If any single server can expose dissident bloggers +or compile a list of users' behavior, the censors can profitably compromise +that server's operator, perhaps by applying economic pressure to their +employers, +breaking into their computer, pressuring their family (if they have relatives +in the censored area), or so on. Furthermore, in designs where any relay can +expose its users, the censors can spread suspicion that they are running some +of the relays and use this belief to chill use of the network. + +<div class="p"><!----></div> +We discuss and adapt these components further in +Section <a href="#sec:bridges">5</a>. But first we examine the strengths and +weaknesses of other blocking-resistance approaches, so we can expand +our repertoire of building blocks and ideas. + +<div class="p"><!----></div> + <h2><a name="tth_sEc4"> +<a name="sec:related"> +4</a> Current proxy solutions</h2> +</a> + +<div class="p"><!----></div> +Relay-based blocking-resistance schemes generally have two main +components: a relay component and a discovery component. The relay part +encompasses the process of establishing a connection, sending traffic +back and forth, and so on — everything that's done once the user knows +where she's going to connect. Discovery is the step before that: the +process of finding one or more usable relays. + +<div class="p"><!----></div> +For example, we can divide the pieces of Tor in the previous section +into the process of building paths and sending +traffic over them (relay) and the process of learning from the directory +servers about what routers are available (discovery). With this distinction +in mind, we now examine several categories of relay-based schemes. + +<div class="p"><!----></div> + <h3><a name="tth_sEc4.1"> +4.1</a> Centrally-controlled shared proxies</h3> + +<div class="p"><!----></div> +Existing commercial anonymity solutions (like Anonymizer.com) are based +on a set of single-hop proxies. 
In these systems, each user connects to +a single proxy, which then relays traffic between the user and her +destination. These public proxy +systems are typically characterized by two features: they control and +operate the proxies centrally, and many different users get assigned +to each proxy. + +<div class="p"><!----></div> +In terms of the relay component, single proxies provide weak security +compared to systems that distribute trust over multiple relays, since a +compromised proxy can trivially observe all of its users' actions, and +an eavesdropper only needs to watch a single proxy to perform timing +correlation attacks against all its users' traffic and thus learn where +everyone is connecting. Worse, all users +need to trust the proxy company to have good security itself as well as +to not reveal user activities. + +<div class="p"><!----></div> +On the other hand, single-hop proxies are easier to deploy, and they +can provide better performance than distributed-trust designs like Tor, +since traffic only goes through one relay. They're also more convenient +from the user's perspective — since users entirely trust the proxy, +they can just use their web browser directly. + +<div class="p"><!----></div> +Whether public proxy schemes are more or less scalable than Tor is +still up for debate: commercial anonymity systems can use some of their +revenue to provision more bandwidth as they grow, whereas volunteer-based +anonymity systems can attract thousands of fast relays to spread the load. + +<div class="p"><!----></div> +The discovery piece can take several forms. Most commercial anonymous +proxies have one or a handful of commonly known websites, and their users +log in to those websites and relay their traffic through them. 
When +these websites get blocked (generally soon after the company becomes +popular), if the company cares about users in the blocked areas, they +start renting lots of disparate IP addresses and rotating through them +as they get blocked. They notify their users of new addresses (by email, +for example). It's an arms race, since attackers can sign up to receive the +email too, but operators have one nice trick available to them: because they +have a list of paying subscribers, they can notify certain subscribers +about updates earlier than others. + +<div class="p"><!----></div> +Access control systems on the proxy let them provide service only to +users with certain characteristics, such as paying customers or people +from certain IP address ranges. + +<div class="p"><!----></div> +Discovery in the face of a government-level firewall is a complex and +unsolved +topic, and we're stuck in this same arms race ourselves; we explore it +in more detail in Section <a href="#sec:discovery">7</a>. But first we examine the +other end of the spectrum — getting volunteers to run the proxies, +and telling only a few people about each proxy. + +<div class="p"><!----></div> + <h3><a name="tth_sEc4.2"> +4.2</a> Independent personal proxies</h3> + +<div class="p"><!----></div> +Personal proxies such as Circumventor [<a href="#circumventor" name="CITEcircumventor">18</a>] and +CGIProxy [<a href="#cgiproxy" name="CITEcgiproxy">23</a>] use the same technology as the public ones as +far as the relay component goes, but they use a different strategy for +discovery. Rather than managing a few centralized proxies and constantly +getting new addresses for them as the old addresses are blocked, they +aim to have a large number of entirely independent proxies, each managing +its own (much smaller) set of users. + +<div class="p"><!----></div> +As the Circumventor site explains, "You don't +actually install the Circumventor <em>on</em> the computer that is blocked +from accessing Web sites. 
You, or a friend of yours, has to install the +Circumventor on some <em>other</em> machine which is not censored." + +<div class="p"><!----></div> +This tactic has great advantages in terms of blocking-resistance — recall +our assumption in Section <a href="#sec:adversary">2</a> that the attention +a system attracts from the attacker is proportional to its number of +users and level of publicity. If each proxy only has a few users, and +there is no central list of proxies, most of them will never get noticed by +the censors. + +<div class="p"><!----></div> +On the other hand, there's a huge scalability question that so far has +prevented these schemes from being widely useful: how does the fellow +in China find a person in Ohio who will run a Circumventor for him? In +some cases he may know and trust some people on the outside, but in many +cases he's just out of luck. Just as hard, how does a new volunteer in +Ohio find a person in China who needs it? + +<div class="p"><!----></div> + +<div class="p"><!----></div> +This challenge leads to a hybrid design (centrally distributed +personal proxies) which we will investigate in more detail in +Section <a href="#sec:discovery">7</a>. + +<div class="p"><!----></div> + <h3><a name="tth_sEc4.3"> +4.3</a> Open proxies</h3> + +<div class="p"><!----></div> +Yet another currently used approach to bypassing firewalls is to locate +open and misconfigured proxies on the Internet. A quick Google search +for "open proxy list" yields a wide variety of freely available lists +of HTTP, HTTPS, and SOCKS proxies. Many small companies have sprung up +providing more refined lists to paying customers. + +<div class="p"><!----></div> +There are some downsides to using these open proxies though. First, +the proxies are of widely varying quality in terms of bandwidth and +stability, and many of them are entirely unreachable. 
Second, unlike +networks of volunteers like Tor, the legality of routing traffic through +these proxies is questionable: it's widely believed that most of their +operators don't realize what they're offering, and probably wouldn't allow it if +they realized. Third, in many cases the connection to the proxy is +unencrypted, so firewalls that filter based on keywords in IP packets +will not be hindered. Fourth, in many countries (including China), the +firewall authorities hunt for open proxies as well, to preemptively +block them. And last, many users are suspicious that some +open proxies are a little <em>too</em> convenient: are they run by the +adversary, in which case they get to monitor all the user's requests +just as single-hop proxies can? + +<div class="p"><!----></div> +A distributed-trust design like Tor resolves each of these issues for +the relay component, but a constantly changing set of thousands of open +relays is clearly a useful idea for a discovery component. For example, +users might be able to make use of these proxies to bootstrap their +first introduction into the Tor network. + +<div class="p"><!----></div> + <h3><a name="tth_sEc4.4"> +4.4</a> Blocking resistance and JAP</h3> + +<div class="p"><!----></div> +Köpsell and Hillig's Blocking Resistance +design [<a href="#koepsell:wpes2004" name="CITEkoepsell:wpes2004">20</a>] is probably +the closest related work, and is the starting point for the design in this +paper. In this design, the JAP anonymity system [<a href="#web-mix" name="CITEweb-mix">3</a>] is used +as a base instead of Tor. Volunteers operate a large number of access +points that relay traffic to the core JAP +network, which in turn anonymizes users' traffic. The software to run these +relays is, as in our design, included in the JAP client software and enabled +only when the user decides to enable it. 
Discovery is handled with a +CAPTCHA-based mechanism; users prove that they aren't an automated process, +and are given the address of an access point. (The problem of a determined +attacker with enough manpower to launch many requests and enumerate all the +access points is not considered in depth.) There is also some suggestion +that information about access points could spread through existing social +networks. + +<div class="p"><!----></div> + <h3><a name="tth_sEc4.5"> +4.5</a> Infranet</h3> + +<div class="p"><!----></div> +The Infranet design [<a href="#infranet" name="CITEinfranet">14</a>] uses one-hop relays to deliver web +content, but disguises its communications as ordinary HTTP traffic. Requests +are split into multiple requests for URLs on the relay, which then encodes +its responses in the content it returns. The relay needs to be an actual +website with plausible content and a number of URLs which the user might want +to access — if the Infranet software produced its own cover content, it would +be far easier for censors to identify. To keep the censors from noticing +that cover content changes depending on what data is embedded, Infranet needs +the cover content to have an innocuous reason for changing frequently: the +paper recommends watermarked images and webcams. + +<div class="p"><!----></div> +The attacker and relay operators in Infranet's threat model are significantly +different than in ours. Unlike our attacker, Infranet's censor can't be +bypassed with encrypted traffic (presumably because the censor blocks +encrypted traffic, or at least considers it suspicious), and has more +computational resources to devote to each connection than ours (so it can +notice subtle patterns over time). Unlike our bridge operators, Infranet's +operators (and users) have more bandwidth to spare; the overhead in typical +steganography schemes is far higher than Tor's. + +<div class="p"><!----></div> +The Infranet design does not include a discovery element. 
Discovery, +however, is a critical point: if whatever mechanism allows users to learn +about relays also allows the censor to do so, he can trivially discover and +block their addresses, even if the steganography would prevent mere traffic +observation from revealing the relays' addresses. + +<div class="p"><!----></div> + <h3><a name="tth_sEc4.6"> +4.6</a> RST-evasion and other packet-level tricks</h3> + +<div class="p"><!----></div> +In their analysis of China's firewall's content-based blocking, Clayton, +Murdoch and Watson discovered that rather than blocking all packets in a TCP +stream once a forbidden word was noticed, the firewall was simply forging +RST packets to make the communicating parties believe that the connection was +closed [<a href="#clayton:pet2006" name="CITEclayton:pet2006">7</a>]. They proposed altering operating systems +to ignore forged RST packets. This approach might work in some cases, but +in practice it appears that many firewalls start filtering by IP address +once a sufficient number of RST packets have been sent. + +<div class="p"><!----></div> +Other packet-level responses to filtering include splitting +sensitive words across multiple TCP packets, so that the censors' +firewalls can't notice them without performing expensive stream +reconstruction [<a href="#ptacek98insertion" name="CITEptacek98insertion">27</a>]. This technique relies on the +same insight as our weak steganography assumption. + +<div class="p"><!----></div> + <h3><a name="tth_sEc4.7"> +4.7</a> Internal caching networks</h3> + +<div class="p"><!----></div> +Freenet [<a href="#freenet-pets00" name="CITEfreenet-pets00">6</a>] is an anonymous peer-to-peer data store. +Analyzing Freenet's security can be difficult, as its design is in flux as +new discovery and routing mechanisms are proposed, and no complete +specification has (to our knowledge) been written. 
Freenet servers relay +requests for specific content (indexed by a digest of the content) +"toward" the server that hosts it, and then cache the content as it +follows the same path back to +the requesting user. If Freenet's routing mechanism is successful in +allowing nodes to learn about each other and route correctly even as some +node-to-node links are blocked by firewalls, then users inside censored areas +can ask a local Freenet server for a piece of content, and get an answer +without having to connect out of the country at all. Of course, operators of +servers inside the censored area can still be targeted, and the addresses of +external servers can still be blocked. + +<div class="p"><!----></div> + <h3><a name="tth_sEc4.8"> +4.8</a> Skype</h3> + +<div class="p"><!----></div> +The popular Skype voice-over-IP software uses multiple techniques to tolerate +restrictive networks, some of which allow it to continue operating in the +presence of censorship. By switching ports and using encryption, Skype +attempts to resist trivial blocking and content filtering. Even if no +encryption were used, it would still be expensive to scan all voice +traffic for sensitive words. Also, most current keyloggers are unable to +store voice traffic. Nevertheless, Skype can still be blocked, especially at +its central login server. + +<div class="p"><!----></div> + <h3><a name="tth_sEc4.9"> +4.9</a> Tor itself</h3> + +<div class="p"><!----></div> +And last, we include Tor itself in the list of current solutions +to firewalls. Tens of thousands of people use Tor from countries that +routinely filter their Internet. Tor's website has been blocked in most +of them. But why hasn't the Tor network been blocked yet? + +<div class="p"><!----></div> +We have several theories. The first is the most straightforward: tens of +thousands of people are simply too few to matter. It may help that Tor is +perceived to be for experts only, and thus not worth attention yet. 
The +more subtle variant on this theory is that we've positioned Tor in the +public eye as a tool for retaining civil liberties in more free countries, +so perhaps blocking authorities don't view it as a threat. (We revisit +this idea when we consider whether and how to publicize a Tor variant +that improves blocking-resistance — see Section <a href="#subsec:publicity">9.5</a> +for more discussion.) + +<div class="p"><!----></div> +The broader explanation is that the maintenance of most government-level +filters is aimed at stopping widespread information flow and appearing to be +in control, not at the impossible goal of blocking all possible ways to bypass +censorship. Censors realize that there will always +be ways for a few people to get around the firewall, and as long as Tor +has not publicly threatened their control, they see no urgent need to +block it yet. + +<div class="p"><!----></div> +We should recognize that we're <em>already</em> in the arms race. These +constraints can give us insight into the priorities and capabilities of +our various attackers. + +<div class="p"><!----></div> + <h2><a name="tth_sEc5"> +<a name="sec:bridges"> +5</a> The relay component of our blocking-resistant design</h2> +</a> + +<div class="p"><!----></div> +Section <a href="#sec:current-tor">3</a> describes many reasons why Tor is +well-suited as a building block in our context, but several changes will +allow the design to resist blocking better. The most critical changes are +to get more relay addresses, and to distribute them to users differently. + +<div class="p"><!----></div> + +<div class="p"><!----></div> + +<div class="p"><!----></div> + <h3><a name="tth_sEc5.1"> +5.1</a> Bridge relays</h3> + +<div class="p"><!----></div> +Today, Tor servers operate on fewer than a thousand distinct IP addresses; +an adversary +could enumerate and block them all with little trouble. 
To provide a +means of ingress to the network, we need a larger set of entry points, most +of which an adversary won't be able to enumerate easily. Fortunately, we +have such a set: the Tor users. + +<div class="p"><!----></div> +Hundreds of thousands of people around the world use Tor. We can leverage +our already self-selected user base to produce a list of thousands of +frequently-changing IP addresses. Specifically, we can give them a little +button in the GUI that says "Tor for Freedom", and users who click +the button will turn into <em>bridge relays</em> (or just <em>bridges</em> +for short). They can rate limit relayed connections to 10 KB/s (almost +nothing for a broadband user in a free country, but plenty for a user +who otherwise has no access at all), and since they are just relaying +bytes back and forth between blocked users and the main Tor network, they +won't need to make any external connections to Internet sites. Because +of this separation of roles, and because we're making use of software +that the volunteers have already installed for their own use, we expect +our scheme to attract and maintain more volunteers than previous schemes. + +<div class="p"><!----></div> +As usual, there are new anonymity and security implications from running a +bridge relay, particularly from letting people relay traffic through your +Tor client; but we leave this discussion for Section <a href="#sec:security">8</a>. + +<div class="p"><!----></div> + +<div class="p"><!----></div> + <h3><a name="tth_sEc5.2"> +5.2</a> The bridge directory authority</h3> + +<div class="p"><!----></div> +How do the bridge relays advertise their existence to the world? We +introduce a second new component of the design: a specialized directory +authority that aggregates and tracks bridges. 
Bridge relays periodically +publish server descriptors (summaries of their keys, locations, etc, +signed by their long-term identity key), just like the relays in the +"main" Tor network, but in this case they publish them only to the +bridge directory authorities. + +<div class="p"><!----></div> +The main difference between bridge authorities and the directory +authorities for the main Tor network is that the main authorities provide +a list of every known relay, but the bridge authorities only give +out a server descriptor if you already know its identity key. That is, +you can keep up-to-date on a bridge's location and other information +once you know about it, but you can't just grab a list of all the bridges. + +<div class="p"><!----></div> +The identity key, IP address, and directory port for each bridge +authority ship by default with the Tor software, so the bridge relays +can be confident they're publishing to the right location, and the +blocked users can establish an encrypted authenticated channel. See +Section <a href="#subsec:trust-chain">8.5</a> for more discussion of the public key +infrastructure and trust chain. + +<div class="p"><!----></div> +Bridges use Tor to publish their descriptors privately and securely, +so even an attacker monitoring the bridge directory authority's network +can't make a list of all the addresses contacting the authority. +Bridges may publish to only a subset of the +authorities, to limit the potential impact of an authority compromise. + +<div class="p"><!----></div> + +<div class="p"><!----></div> + <h3><a name="tth_sEc5.3"> +<a name="subsec:relay-together"> +5.3</a> Putting them together</h3> +</a> + +<div class="p"><!----></div> +If a blocked user knows the identity keys of a set of bridge relays, and +he has correct address information for at least one of them, he can use +that one to make a secure connection to the bridge authority and update +his knowledge about the other bridge relays. 
He can also use it to make +secure connections to the main Tor network and directory servers, so he +can build circuits and connect to the rest of the Internet. All of these +updates happen in the background: from the blocked user's perspective, +he just accesses the Internet via his Tor client like always. + +<div class="p"><!----></div> +So now we've reduced the problem from how to circumvent the firewall +for all transactions (and how to know that the pages you get have not +been modified by the local attacker) to how to learn about a working +bridge relay. + +<div class="p"><!----></div> +There's another catch though. We need to make sure that the network +traffic we generate by simply connecting to a bridge relay doesn't stand +out too much. + +<div class="p"><!----></div> + +<div class="p"><!----></div> + +<div class="p"><!----></div> + <h2><a name="tth_sEc6"> +<a name="sec:network-fingerprint"> +<a name="subsec:enclave-dirs"> +6</a> Hiding Tor's network fingerprint</h2> +</a> +</a> + +<div class="p"><!----></div> +Currently, Tor uses two protocols for its network communications. The +main protocol uses TLS for encrypted and authenticated communication +between Tor instances. The second protocol is standard HTTP, used for +fetching directory information. All Tor servers listen on their "ORPort" +for TLS connections, and some of them opt to listen on their "DirPort" +as well, to serve directory information. Tor servers choose whatever port +numbers they like; the server descriptor they publish to the directory +tells users where to connect. + +<div class="p"><!----></div> +One format for communicating address information about a bridge relay is +its IP address and DirPort. From there, the user can ask the bridge's +directory cache for an up-to-date copy of its server descriptor, and +learn its current circuit keys, its ORPort, and so on. + +<div class="p"><!----></div> +However, connecting directly to the directory cache involves a plaintext +HTTP request. 
A censor could create a network fingerprint (known as a +<em>signature</em> in the intrusion detection field) for the request +and/or its response, and use it to block these connections. To resolve this +vulnerability, we've modified the Tor protocol so that users can connect +to the directory cache via the main Tor port — they establish a TLS +connection with the bridge as normal, and then send a special "begindir" +relay command to establish an internal connection to its directory cache. + +<div class="p"><!----></div> +Therefore a better way to summarize a bridge's address is by its IP +address and ORPort, so all communications between the client and the +bridge will use ordinary TLS. But there are other details that need +more investigation. + +<div class="p"><!----></div> +What port should bridges pick for their ORPort? We currently recommend +that they listen on port 443 (the default HTTPS port) if they want to +be most useful, because clients behind standard firewalls will have +the best chance to reach them. Is this the best choice in all cases, +or should we encourage some fraction of them to pick random ports, or other +ports commonly permitted through firewalls like 53 (DNS) or 110 +(POP)? Or perhaps we should use other ports where TLS traffic is +expected, like 993 (IMAPS) or 995 (POP3S). We need more research on our +potential users, and their current and anticipated firewall restrictions. + +<div class="p"><!----></div> +Furthermore, we need to look at the specifics of Tor's TLS handshake. +Right now Tor uses some predictable strings in its TLS handshakes. For +example, it sets the X.509 organizationName field to "Tor", and it puts +the Tor server's nickname in the certificate's commonName field. We +should tweak the handshake protocol so it doesn't rely on any unusual details +in the certificate, yet it remains secure; the certificate itself +should be made to resemble an ordinary HTTPS certificate. 
We should also try +to make our advertised cipher-suites closer to what an ordinary web server +would support. + +<div class="p"><!----></div> +Tor's TLS handshake uses two-certificate chains: one certificate +contains the self-signed identity key for +the router, and the second contains a current TLS key, signed by the +identity key. We use these to authenticate that we're talking to the right +router, and to limit the impact of TLS-key exposure. Most (though far from +all) consumer-oriented HTTPS services provide only a single certificate. +These extra certificates may help identify Tor's TLS handshake; instead, +bridges should consider using only a single TLS key certificate signed by +their identity key, and providing the full value of the identity key in an +early handshake cell. More significantly, Tor currently has all clients +present certificates, so that clients are harder to distinguish from servers. +But in a blocking-resistance environment, clients should not present +certificates at all. + +<div class="p"><!----></div> +Last, what if the adversary starts observing the network traffic even +more closely? Even if our TLS handshake looks innocent, our traffic timing +and volume still look different than a user making a secure web connection +to his bank. The same techniques used in the growing trend to build tools +to recognize encrypted Bittorrent traffic +could be used to identify Tor communication and recognize bridge +relays. Rather than trying to look like encrypted web traffic, we may be +better off trying to blend with some other encrypted network protocol. The +first step is to compare typical network behavior for a Tor client to +typical network behavior for various other protocols. This statistical +cat-and-mouse game is made more complex by the fact that Tor transports a +variety of protocols, and we'll want to automatically handle web browsing +differently from, say, instant messaging. 
+ +<div class="p"><!----></div> + +<div class="p"><!----></div> + <h3><a name="tth_sEc6.1"> +<a name="subsec:id-address"> +6.1</a> Identity keys as part of addressing information</h3> +</a> + +<div class="p"><!----></div> +We have described a way for the blocked user to bootstrap into the +network once he knows the IP address and ORPort of a bridge. What about +local spoofing attacks? That is, since we never learned an identity +key fingerprint for the bridge, a local attacker could intercept our +connection and pretend to be the bridge we had in mind. It turns out +that giving false information isn't that bad — since the Tor client +ships with trusted keys for the bridge directory authority and the Tor +network directory authorities, the user can learn whether he's being +given a real connection to the bridge authorities or not. (After all, +if the adversary intercepts every connection the user makes and gives +him a bad connection each time, there's nothing we can do.) + +<div class="p"><!----></div> +What about anonymity-breaking attacks from observing traffic, if the +blocked user doesn't start out knowing the identity key of his intended +bridge? The vulnerabilities aren't so bad in this case either — the +adversary could do similar attacks just by monitoring the network +traffic. + +<div class="p"><!----></div> +Once the Tor client has fetched the bridge's server descriptor, it should +remember the identity key fingerprint for that bridge relay. Thus if +the bridge relay moves to a new IP address, the client can query the +bridge directory authority to look up a fresh server descriptor using +this fingerprint. + +<div class="p"><!----></div> +So we've shown that it's <em>possible</em> to bootstrap into the network +just by learning the IP address and ORPort of a bridge, but are there +situations where it's more convenient or more secure to learn the bridge's +identity fingerprint as well, or instead, while bootstrapping? 
We keep +that question in mind as we next investigate bootstrapping and discovery. + +<div class="p"><!----></div> + <h2><a name="tth_sEc7"> +<a name="sec:discovery"> +7</a> Discovering working bridge relays</h2> +</a> + +<div class="p"><!----></div> +Tor's modular design means that we can develop a better relay component +independently of developing the discovery component. This modularity's +great promise is that we can pick any discovery approach we like; but the +unfortunate fact is that we have no magic bullet for discovery. We're +in the same arms race as all the other designs we described in +Section <a href="#sec:related">4</a>. + +<div class="p"><!----></div> +In this section we describe a variety of approaches to adding discovery +components for our design. + +<div class="p"><!----></div> + <h3><a name="tth_sEc7.1"> +<a name="subsec:first-bridge"> +7.1</a> Bootstrapping: finding your first bridge.</h3> +</a> + +<div class="p"><!----></div> +In Section <a href="#subsec:relay-together">5.3</a>, we showed that a user who knows +a working bridge address can use it to reach the bridge authority and +to stay connected to the Tor network. But how do new users reach the +bridge authority in the first place? After all, the bridge authority +will be one of the first addresses that a censor blocks. + +<div class="p"><!----></div> +First, we should recognize that most government firewalls are not +perfect. That is, they may allow connections to Google cache or some +open proxy servers, or they let file-sharing traffic, Skype, instant +messaging, or World-of-Warcraft connections through. Different users will +have different mechanisms for bypassing the firewall initially. Second, +we should remember that most people don't operate in a vacuum; users will +hopefully know other people who are in other situations or have other +resources available. 
In the rest of this section we develop a toolkit +of different options and mechanisms, so that we can enable users in a +diverse set of contexts to bootstrap into the system. + +<div class="p"><!----></div> +(For users who can't use any of these techniques, hopefully they know +a friend who can — for example, perhaps the friend already knows some +bridge relay addresses. If they can't get around it at all, then we +can't help them — they should go meet more people or learn more about +the technology running the firewall in their area.) + +<div class="p"><!----></div> +By deploying all the schemes in the toolkit at once, we let bridges and +blocked users employ the discovery approach that is most appropriate +for their situation. + +<div class="p"><!----></div> + <h3><a name="tth_sEc7.2"> +7.2</a> Independent bridges, no central discovery</h3> + +<div class="p"><!----></div> +The first design is simply to have no centralized discovery component at +all. Volunteers run bridges, and we assume they have some blocked users +in mind and communicate their address information to them out-of-band +(for example, through Gmail). This design allows for small personal +bridges that have only one or a handful of users in mind, but it can +also support an entire community of users. For example, Citizen Lab's +upcoming Psiphon single-hop proxy tool [<a href="#psiphon" name="CITEpsiphon">13</a>] plans to use this +<em>social network</em> approach as its discovery component. + +<div class="p"><!----></div> +There are several ways to do bootstrapping in this design. In the simple +case, the operator of the bridge informs each chosen user about his +bridge's address information and/or keys. A different approach involves +blocked users introducing new blocked users to the bridges they know. +That is, somebody in the blocked area can pass along a bridge's address to +somebody else they trust. 
This scheme brings in appealing but complex game +theoretic properties: the blocked user making the decision has an incentive +to delegate only to trustworthy people, since an adversary who learns +the bridge's address and filters it makes it unavailable for both of them. +Also, delegating known bridges to members of your social network can be +dangerous: an adversary who can learn who knows which bridges may +be able to reconstruct the social network. + +<div class="p"><!----></div> +Note that a central set of bridge directory authorities can still be +compatible with a decentralized discovery process. That is, how users +first learn about bridges is entirely up to the bridges, but the process +of fetching up-to-date descriptors for them can still proceed as described +in Section <a href="#sec:bridges">5</a>. Of course, creating a central place that +knows about all the bridges may not be smart, especially if every other +piece of the system is decentralized. Further, if a user only knows +about one bridge and he loses track of it, it may be quite a hassle to +reach the bridge authority. We address these concerns next. + +<div class="p"><!----></div> + <h3><a name="tth_sEc7.3"> +7.3</a> Families of bridges, no central discovery</h3> + +<div class="p"><!----></div> +Because the blocked users are running our software too, we have many +opportunities to improve usability or robustness. Our second design builds +on the first by encouraging volunteers to run several bridges at once +(or coordinate with other bridge volunteers), such that some +of the bridges are likely to be available at any given time. + +<div class="p"><!----></div> +The blocked user's Tor client would periodically fetch an updated set of +recommended bridges from any of the working bridges. Now the client can +learn new additions to the bridge pool, and can expire abandoned bridges +or bridges that the adversary has blocked, without the user ever needing +to care. 
To simplify maintenance of the community's bridge pool, each +community could run its own bridge directory authority — reachable via +the available bridges, and also mirrored at each bridge. + +<div class="p"><!----></div> + <h3><a name="tth_sEc7.4"> +7.4</a> Public bridges with central discovery</h3> + +<div class="p"><!----></div> +What about people who want to volunteer as bridges but don't know any +suitable blocked users? What about people who are blocked but don't +know anybody on the outside? Here we describe how to make use of these +<em>public bridges</em> in a way that still makes it hard for the attacker +to learn all of them. + +<div class="p"><!----></div> +The basic idea is to divide public bridges into a set of pools based on +identity key. Each pool corresponds to a <em>distribution strategy</em>: +an approach to distributing its bridge addresses to users. Each strategy +is designed to exercise a different scarce resource or property of +the user. + +<div class="p"><!----></div> +How do we divide bridges between these strategy pools such that they're +evenly distributed and the allocation is hard to influence or predict, +but also in a way that's amenable to creating more strategies later +on without reshuffling all the pools? We assign a given bridge +to a strategy pool by hashing the bridge's identity key along with a +secret that only the bridge authority knows: the first n bits of this +hash dictate the strategy pool number, where n is a parameter that +describes how many strategy pools we want at this point. We choose n=3 +to start, so we divide bridges between 8 pools; but as we later invent +new distribution strategies, we can increment n to split the 8 into +16. Since a bridge can't predict the next bit in its hash, it can't +anticipate which identity key will correspond to a certain new pool +when the pools are split. 
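The pool assignment just described can be sketched as follows. This is only an illustration under stated assumptions: the design does not name a hash function, so HMAC-SHA256 stands in for "hashing the identity key along with a secret", and the name `strategy_pool` and its parameters are hypothetical.

```python
import hashlib
import hmac


def strategy_pool(identity_key: bytes, authority_secret: bytes, n_bits: int) -> int:
    """Return the pool number given by the first n_bits of a keyed hash.

    Assumption: HMAC-SHA256 is used as the keyed hash; the paper only says
    the identity key is hashed together with a secret held by the authority.
    """
    digest = hmac.new(authority_secret, identity_key, hashlib.sha256).digest()
    as_int = int.from_bytes(digest, "big")
    # Keep only the leading n_bits of the 256-bit digest.
    return as_int >> (len(digest) * 8 - n_bits)


if __name__ == "__main__":
    key = b"example-identity-key"   # placeholder, not a real bridge key
    secret = b"authority-secret"    # placeholder secret
    pool_8 = strategy_pool(key, secret, 3)    # one of 8 pools
    pool_16 = strategy_pool(key, secret, 4)   # one of 16 pools after a split
    print(pool_8, pool_16)
```

Because the pool number is a prefix of the hash, raising n from 3 to 4 sends every bridge in pool p to either pool 2p or pool 2p+1, so a split never moves a bridge between unrelated pools.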
Further, since the bridge authority doesn't +provide any feedback to the bridge about which strategy pool it's in, +an adversary who signs up bridges with the goal of filling a certain +pool [<a href="#casc-rep" name="CITEcasc-rep">12</a>] will be hindered. + +<div class="p"><!----></div> + +<div class="p"><!----></div> +The first distribution strategy (used for the first pool) publishes bridge +addresses in a time-release fashion. The bridge authority divides the +available bridges into partitions, and each partition is deterministically +available only in certain time windows. That is, over the course of a +given time slot (say, an hour), each requester is given a random bridge +from within the currently available partition. When the next time slot arrives, a new set +of bridges from the pool becomes available for discovery. Thus some bridge +address is always available when a new +user arrives, but to learn about all bridges the attacker needs to fetch +all new addresses at every new time slot. By varying the length of the +time slots, we can make it harder for the attacker to guess when to check +back. We expect these bridges will be the first to be blocked, but they'll +help the system bootstrap until they <em>do</em> get blocked. Further, +remember that we're dealing with different blocking regimes around the +world that will progress at different rates — so this pool will still +be useful to some users even as the arms races progress. + +<div class="p"><!----></div> +The second distribution strategy publishes bridge addresses based on the IP +address of the requesting user. Specifically, the bridge authority will +divide the available bridges in the pool into a number of partitions +(as in the first distribution scheme), hash the requester's IP address +with a secret of its own (as in the above allocation scheme for creating +pools), and give the requester a random bridge from the appropriate +partition. 
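A minimal sketch of this address-based lookup, under stated assumptions: HMAC-SHA256 again stands in for the unspecified keyed hash, partitions are plain lists of bridge labels, and the names `partition_index` and `pick_bridge` are hypothetical.

```python
import hashlib
import hmac
import random


def partition_index(requester_ip: str, secret: bytes, num_partitions: int) -> int:
    """Map a requester address to a partition using a keyed hash.

    Assumption: HMAC-SHA256 over the textual address; the design only says
    the address is hashed with a secret known to the bridge authority.
    """
    digest = hmac.new(secret, requester_ip.encode("ascii"), hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % num_partitions


def pick_bridge(requester_ip: str, secret: bytes, partitions: list) -> str:
    """Return a random bridge from the partition assigned to this requester."""
    index = partition_index(requester_ip, secret, len(partitions))
    return random.choice(partitions[index])


if __name__ == "__main__":
    # Placeholder bridge labels and a documentation-range address.
    pools = [["bridge-a", "bridge-b"], ["bridge-c"], ["bridge-d", "bridge-e"]]
    print(pick_bridge("203.0.113.5", b"authority-secret", pools))
```

A requester who asks repeatedly from one address always lands in the same partition, so the most it can learn is that partition's bridges rather than the whole pool.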
To raise the bar, we should discard the last octet of the +IP address before inputting it to the hash function, so an attacker +who only controls a single "/24" network only counts as one user. A +large attacker like China will still be able to control many addresses, +but the hassle of establishing connections from each network (or spoofing +TCP connections) may still slow them down. Similarly, as a special case, +we should treat IP addresses that are Tor exit nodes as all being on +the same network. + +<div class="p"><!----></div> +The third strategy combines the time-based and location-based +strategies to further constrain and rate-limit the available bridge +addresses. Specifically, the bridge address provided in a given time +slot to a given network location is deterministic within the partition, +rather than chosen randomly each time from the partition. Thus, repeated +requests during that time slot from a given network are given the same +bridge address as the first request. + +<div class="p"><!----></div> +The fourth strategy is based on Circumventor's discovery strategy. +The Circumventor project, realizing that its adoption will remain limited +if it has no central coordination mechanism, has started a mailing list to +distribute new proxy addresses every few days. From experimentation it +seems they have concluded that sending updates every three or four days +is sufficient to stay ahead of the current attackers. + +<div class="p"><!----></div> +The fifth strategy provides an alternative approach to a mailing list: +users provide an email address and receive an automated response +listing an available bridge address. We could limit one response per +email address. To further rate limit queries, we could require a CAPTCHA +solution +in each case too. 
In fact, we wouldn't need to +implement the CAPTCHA on our side: if we only deliver bridge addresses +to Yahoo or GMail addresses, we can leverage the rate-limiting schemes +that other parties already impose for account creation. + +<div class="p"><!----></div> +The sixth strategy ties in the social network design with public +bridges and a reputation system. We pick some seeds — trusted people in +blocked areas — and give them each a few dozen bridge addresses and a few +<em>delegation tokens</em>. We run a website next to the bridge authority, +where users can log in (they connect via Tor, and they don't need to +provide actual identities, just persistent pseudonyms). Users can delegate +trust to other people they know by giving them a token, which can be +exchanged for a new account on the website. Accounts in "good standing" +then accrue new bridge addresses and new tokens. As usual, reputation +schemes bring in a host of new complexities [<a href="#rep-anon" name="CITErep-anon">10</a>]: how do we +decide that an account is in good standing? We could tie reputation +to whether the bridges they're told about have been blocked — see +Section <a href="#subsec:geoip">7.7</a> below for initial thoughts on how to discover +whether bridges have been blocked. We could track reputation between +accounts (if you delegate to somebody who screws up, it impacts you too), +or we could use blinded delegation tokens [<a href="#chaum-blind" name="CITEchaum-blind">5</a>] to prevent +the website from mapping the seeds' social network. We put off deeper +discussion of the social network reputation strategy for future work. + +<div class="p"><!----></div> +Pools seven and eight are held in reserve, in case our currently deployed +tricks all fail at once and the adversary blocks all those bridges — so +we can adapt and move to new approaches quickly, and have some bridges +immediately available for the new schemes. 
New strategies might be based +on some other scarce resource, such as relaying traffic for others or +other proof of energy spent. (We might also worry about the incentives +for bridges that sign up and get allocated to the reserve pools: will they +be unhappy that they're not being used? But this is a transient problem: +if Tor users are bridges by default, nobody will mind not being used yet. +See also Section <a href="#subsec:incentives">9.4</a>.) + +<div class="p"><!----></div> + +<div class="p"><!----></div> + <h3><a name="tth_sEc7.5"> +7.5</a> Public bridges with coordinated discovery</h3> + +<div class="p"><!----></div> +We presented the above discovery strategies in the context of a single +bridge directory authority, but in practice we will want to distribute the +operations over several bridge authorities — a single point of failure +or attack is a bad move. The first answer is to run several independent +bridge directory authorities, and bridges gravitate to one based on +their identity key. The better answer would be some federation of bridge +authorities that work together to provide redundancy but don't introduce +new security issues. We could even imagine designs where the bridge +authorities have encrypted versions of the bridge's server descriptors, +and the users learn a decryption key that they keep private when they +first hear about the bridge — this way the bridge authorities would not +be able to learn the IP address of the bridges. + +<div class="p"><!----></div> +We leave this design question for future work. + +<div class="p"><!----></div> + <h3><a name="tth_sEc7.6"> +7.6</a> Assessing whether bridges are useful</h3> + +<div class="p"><!----></div> +Learning whether a bridge is useful is important in the bridge authority's +decision to include it in responses to blocked users. 
For example, if +we end up with a list of thousands of bridges and only a few dozen of +them are reachable right now, most blocked users will not end up knowing +about working bridges. + +<div class="p"><!----></div> +There are three components for assessing how useful a bridge is. First, +is it reachable from the public Internet? Second, what proportion of +the time is it available? Third, is it blocked in certain jurisdictions? + +<div class="p"><!----></div> +The first component can be tested just as we test reachability of +ordinary Tor servers. Specifically, the bridges do a self-test — connect +to themselves via the Tor network — before they are willing to +publish their descriptor, to make sure they're not obviously broken or +misconfigured. Once the bridges publish, the bridge authority also tests +reachability to make sure they're not confused or outright lying. + +<div class="p"><!----></div> +The second component can be measured and tracked by the bridge authority. +By doing periodic reachability tests, we can get a sense of how often the +bridge is available. More complex tests will involve bandwidth-intensive +checks to force the bridge to commit resources in order to be counted as +available. We need to evaluate how the relationship of uptime percentage +should weigh into our choice of which bridges to advertise. We leave +this to future work. + +<div class="p"><!----></div> +The third component is perhaps the trickiest: with many different +adversaries out there, how do we keep track of which adversaries have +blocked which bridges, and how do we learn about new blocks as they +occur? We examine this problem next. 
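<div class="p"><!----></div>
Returning to the second component above, the following is a minimal sketch of how a bridge authority might turn periodic reachability probes into an uptime fraction and use it when choosing which bridges to advertise. The function names, the probe-log format, and the 0.8 threshold are assumptions made purely for illustration; the paper leaves the actual weighting of uptime to future work.

```python
from typing import Dict, List


def uptime_fraction(probes: List[bool]) -> float:
    """Fraction of periodic reachability probes that succeeded."""
    if not probes:
        return 0.0  # never probed: treat as unknown, so not advertisable
    return sum(probes) / len(probes)


def pick_advertisable(probe_log: Dict[str, List[bool]],
                      min_uptime: float = 0.8) -> List[str]:
    """Return ids of bridges whose measured uptime meets the (assumed) threshold.

    Results are ordered by uptime, highest first, with ties broken by id.
    """
    scored = [(uptime_fraction(p), bridge) for bridge, p in probe_log.items()]
    good = [(u, b) for u, b in scored if u >= min_uptime]
    good.sort(key=lambda ub: (-ub[0], ub[1]))
    return [b for _, b in good]


if __name__ == "__main__":
    # Hypothetical probe history: one boolean per periodic test.
    log = {
        "bridgeA": [True, True, True, True, False],
        "bridgeB": [True, False, False, True, False],
        "bridgeC": [True, True, True, True, True],
    }
    print(pick_advertisable(log))
```

A simple success fraction ignores the bandwidth-intensive checks mentioned above; those would replace the boolean probe result with a stronger test, leaving the bookkeeping unchanged.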
+ +<div class="p"><!----></div> + <h3><a name="tth_sEc7.7"> +<a name="subsec:geoip"> +7.7</a> How do we know if a bridge relay has been blocked?</h3> +</a> + +<div class="p"><!----></div> +There are two main mechanisms for testing whether bridges are reachable +from inside each blocked area: active testing via users, and passive +testing via bridges. + +<div class="p"><!----></div> +In the case of active testing, certain users inside each area +sign up as testing relays. The bridge authorities can then use a +Blossom-like [<a href="#blossom-thesis" name="CITEblossom-thesis">16</a>] system to build circuits through them +to each bridge and see if it can establish the connection. But how do +we pick the users? If we ask random users to do the testing (or if we +solicit volunteers from the users), the adversary should sign up so he +can enumerate the bridges we test. Indeed, even if we hand-select our +testers, the adversary might still discover their location and monitor +their network activity to learn bridge addresses. + +<div class="p"><!----></div> +Another answer is not to measure directly, but rather let the bridges +report whether they're being used. +Specifically, bridges should install a GeoIP database such as the public +IP-To-Country list [<a href="#ip-to-country" name="CITEip-to-country">19</a>], and then periodically report to the +bridge authorities which countries they're seeing use from. This data +would help us track which countries are making use of the bridge design, +and can also let us learn about new steps the adversary has taken in +the arms race. (The compressed GeoIP database is only several hundred +kilobytes, and we could even automate the update process by serving it +from the bridge authorities.) +More analysis of this passive reachability +testing design is needed to resolve its many edge cases: for example, +if a bridge stops seeing use from a certain area, does that mean the +bridge is blocked or does that mean those users are asleep? 
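<div class="p"><!----></div>
The passive scheme above can be sketched roughly as follows: a bridge maps client addresses to country codes, reports only the per-country counts, and the authority flags countries whose use has disappeared. The tiny lookup table, the placeholder country codes, the function names, and the `min_previous` threshold are all hypothetical; a real deployment would load a full GeoIP database such as the IP-To-Country list.

```python
import bisect
import ipaddress
from collections import Counter
from typing import Iterable, List, Tuple


def _ip(s: str) -> int:
    return int(ipaddress.ip_address(s))


# Hypothetical miniature GeoIP table: (range_start, range_end, country_code),
# sorted by range_start. Addresses are documentation ranges; "AA" and "BB"
# are placeholder country codes.
GEOIP_TABLE: List[Tuple[int, int, str]] = sorted([
    (_ip("192.0.2.0"), _ip("192.0.2.255"), "AA"),
    (_ip("198.51.100.0"), _ip("198.51.100.255"), "BB"),
])
_STARTS = [start for start, _, _ in GEOIP_TABLE]


def country_of(ip: str) -> str:
    """Look up the country code for an IPv4 address; '??' if unknown."""
    n = _ip(ip)
    i = bisect.bisect_right(_STARTS, n) - 1
    if i >= 0 and GEOIP_TABLE[i][0] <= n <= GEOIP_TABLE[i][1]:
        return GEOIP_TABLE[i][2]
    return "??"


def usage_report(client_ips: Iterable[str]) -> Counter:
    """Aggregate per-country counts; only counts are reported, not addresses."""
    return Counter(country_of(ip) for ip in client_ips)


def possibly_blocked(previous: Counter, current: Counter,
                     min_previous: int = 5) -> List[str]:
    """Countries that showed meaningful use in the previous period and none now."""
    return sorted(c for c, n in previous.items()
                  if n >= min_previous and current.get(c, 0) == 0)
```

A flagged country is only a candidate: as noted above, a drop in use may mean the bridge is blocked or simply that those users are asleep, so a result like this would need to be compared across longer periods or confirmed by active testing.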
+ +<div class="p"><!----></div> +There are many more problems with the general concept of detecting whether +bridges are blocked. First, different zones of the Internet are blocked +in different ways, and the actual firewall jurisdictions do not match +country borders. Our bridge scheme could help us map out the topology +of the censored Internet, but this is a huge task. More generally, +if a bridge relay isn't reachable, is that because of a network block +somewhere, because of a problem at the bridge relay, or just a temporary +outage somewhere in between? And last, an attacker could poison our +bridge database by signing up already-blocked bridges. In this case, +if we're stingy giving out bridge addresses, users in that country won't +learn working bridges. + +<div class="p"><!----></div> +All of these issues are made more complex when we try to integrate this +testing into our social network reputation system above. +Since in that case we punish or reward users based on whether bridges +get blocked, the adversary has new attacks to trick or bog down the +reputation tracking. Indeed, the bridge authority doesn't even know +what zone the blocked user is in, so do we blame him for any possible +censored zone, or what? + +<div class="p"><!----></div> +Clearly more analysis is required. The eventual solution will probably +involve a combination of passive measurement via GeoIP and active +measurement from trusted testers. More generally, we can use the passive +feedback mechanism to track usage of the bridge network as a whole — which +would let us respond to attacks and adapt the design, and it would also +let the general public track the progress of the project. 
+ +<div class="p"><!----></div> + +<div class="p"><!----></div> + <h3><a name="tth_sEc7.8"> +7.8</a> Advantages of deploying all solutions at once</h3> + +<div class="p"><!----></div> +For once, we're not in the position of the defender: we don't have to +defend against every possible filtering scheme; we just have to defend +against at least one. On the flip side, the attacker is forced to guess +how to allocate his resources to defend against each of these discovery +strategies. So by deploying all of our strategies at once, we not only +increase our chances of finding one that the adversary has difficulty +blocking, but we actually make <em>all</em> of the strategies more robust +in the face of an adversary with limited resources. + +<div class="p"><!----></div> + +<div class="p"><!----></div> + +<div class="p"><!----></div> + +<div class="p"><!----></div> + +<div class="p"><!----></div> + +<div class="p"><!----></div> + +<div class="p"><!----></div> + +<div class="p"><!----></div> + +<div class="p"><!----></div> + +<div class="p"><!----></div> + +<div class="p"><!----></div> + +<div class="p"><!----></div> + +<div class="p"><!----></div> + +<div class="p"><!----></div> + <h2><a name="tth_sEc8"> +<a name="sec:security"> +8</a> Security considerations</h2> +</a> + +<div class="p"><!----></div> + <h3><a name="tth_sEc8.1"> +8.1</a> Possession of Tor in oppressed areas</h3> + +<div class="p"><!----></div> +Many people speculate that installing and using a Tor client in areas with +particularly extreme firewalls is a high risk — and the risk increases +as the firewall gets more restrictive. This notion certainly has merit, but +there's +a counter pressure as well: as the firewall gets more restrictive, more +ordinary people behind it end up using Tor for more mainstream activities, +such as learning +about Wall Street prices or looking at pictures of women's ankles. 
So +as the restrictive firewall pushes up the number of Tor users, the +"typical" Tor user becomes more mainstream, and therefore mere +use or possession of the Tor software is not so surprising. + +<div class="p"><!----></div> +It's hard to say which of these pressures will ultimately win out, +but we should keep both sides of the issue in mind. + +<div class="p"><!----></div> + +<div class="p"><!----></div> + +<div class="p"><!----></div> + <h3><a name="tth_sEc8.2"> +<a name="subsec:upload-padding"> +8.2</a> Observers can tell who is publishing and who is reading</h3> +</a> + +<div class="p"><!----></div> +Tor encrypts traffic on the local network, and it obscures the eventual +destination of the communication, but it doesn't do much to obscure the +traffic volume. In particular, a user publishing a home video will have a +different network fingerprint than a user reading an online news article. +Based on our assumption in Section <a href="#sec:adversary">2</a> that users who +publish material are in more danger, should we work to improve Tor's +security in this situation? + +<div class="p"><!----></div> +In the general case this is an extremely challenging task: +effective <em>end-to-end traffic confirmation attacks</em> +are known where the adversary observes the origin and the +destination of traffic and confirms that they are part of the +same communication [<a href="#danezis:pet2004" name="CITEdanezis:pet2004">8</a>,<a href="#e2e-traffic" name="CITEe2e-traffic">24</a>]. Related are +<em>website fingerprinting attacks</em>, where the adversary downloads +a few hundred popular websites, makes a set of "fingerprints" for each +site, and then observes the target Tor client's traffic to look for +a match [<a href="#pet05-bissias" name="CITEpet05-bissias">4</a>,<a href="#defensive-dropping" name="CITEdefensive-dropping">21</a>]. But can we do better +against a limited adversary who just does coarse-grained sweeps looking +for unusually prolific publishers? 
+ +<div class="p"><!----></div> +One answer is for bridge users to automatically send bursts of padding +traffic periodically. (This traffic can be implemented in terms of +long-range drop cells, which are already part of the Tor specification.) +Of course, convincingly simulating an actual human publishing interesting +content is a difficult arms race, but it may be worthwhile to at least +start the race. More research remains. + +<div class="p"><!----></div> + <h3><a name="tth_sEc8.3"> +8.3</a> Anonymity effects from acting as a bridge relay</h3> + +<div class="p"><!----></div> +Against some attacks, relaying traffic for others can improve +anonymity. The simplest example is an attacker who owns a small number +of Tor servers. He will see a connection from the bridge, but he won't +be able to know whether the connection originated there or was relayed +from somebody else. More generally, the mere uncertainty of whether the +traffic originated from that user may be helpful. + +<div class="p"><!----></div> +There are some cases where it doesn't seem to help: if an attacker can +watch all of the bridge's incoming and outgoing traffic, then it's easy +to learn which connections were relayed and which started there. (In this +case he still doesn't know the final destinations unless he is watching +them too, but in this case bridges are no better off than if they were +an ordinary client.) + +<div class="p"><!----></div> +There are also some potential downsides to running a bridge. First, while +we try to make it hard to enumerate all bridges, it's still possible to +learn about some of them, and for some people just the fact that they're +running one might signal to an attacker that they place a higher value +on their anonymity. 
Second, there are some more esoteric attacks on Tor +relays that are not as well-understood or well-tested — for example, an +attacker may be able to "observe" whether the bridge is sending traffic +even if he can't actually watch its network, by relaying traffic through +it and noticing changes in traffic timing [<a href="#attack-tor-oak05" name="CITEattack-tor-oak05">25</a>]. On +the other hand, it may be that limiting the bandwidth the bridge is +willing to relay will allow this sort of attacker to determine if it's +being used as a bridge but not easily learn whether it is adding traffic +of its own. + +<div class="p"><!----></div> +We also need to examine how entry guards fit in. Entry guards +(a small set of nodes that are always used for the first +step in a circuit) help protect against certain attacks +where the attacker runs a few Tor servers and waits for +the user to choose these servers as the beginning and end of her +circuit<a href="#tthFtNtAAC" name="tthFrefAAC"><sup>2</sup></a>. +If the blocked user doesn't use the bridge's entry guards, then the bridge +doesn't gain as much cover benefit. On the other hand, what design changes +are needed for the blocked user to use the bridge's entry guards without +learning what they are (this seems hard), and even if we solve that, +do they then need to use the guards' guards and so on down the line? + +<div class="p"><!----></div> +It is an open research question whether the benefits of running a bridge +outweigh the risks. A lot of the decision rests on which attacks the +users are most worried about. For most users, we don't think running a +bridge relay will be that damaging, and it could help quite a bit. + +<div class="p"><!----></div> + <h3><a name="tth_sEc8.4"> +<a name="subsec:cafes-and-livecds"> +8.4</a> Trusting local hardware: Internet cafes and LiveCDs</h3> +</a> + +<div class="p"><!----></div> +Assuming that users have their own trusted hardware is not +always reasonable. 
+ +<div class="p"><!----></div> +For Internet cafe Windows computers that let you attach your own USB key, +a USB-based Tor image would be smart. There's Torpark, and hopefully +there will be more thoroughly analyzed and trustworthy options down the +road. Worries remain about hardware or software keyloggers and other +spyware, as well as physical surveillance. + +<div class="p"><!----></div> +If the system lets you boot from a CD or from a USB key, you can gain +a bit more security by bringing a privacy LiveCD with you. (This +approach isn't foolproof either, of course, since hardware +keyloggers and physical surveillance are still a worry). + +<div class="p"><!----></div> +In fact, LiveCDs are also useful if it's your own hardware, since it's +easier to avoid leaving private data and logs scattered around the +system. + +<div class="p"><!----></div> + +<div class="p"><!----></div> + <h3><a name="tth_sEc8.5"> +<a name="subsec:trust-chain"> +8.5</a> The trust chain</h3> +</a> + +<div class="p"><!----></div> +Tor's "public key infrastructure" provides a chain of trust to +let users verify that they're actually talking to the right servers. +There are four pieces to this trust chain. + +<div class="p"><!----></div> +First, when Tor clients are establishing circuits, at each step +they demand that the next Tor server in the path prove knowledge of +its private key [<a href="#tor-design" name="CITEtor-design">11</a>]. This step prevents the first node +in the path from just spoofing the rest of the path. Second, the +Tor directory authorities provide a signed list of servers along with +their public keys — so unless the adversary can control a threshold +of directory authorities, he can't trick the Tor client into using other +Tor servers. Third, the location and keys of the directory authorities, +in turn, are hard-coded in the Tor source code — so as long as the user +got a genuine version of Tor, he can know that he is using the genuine +Tor network. 
And last, the source code and other packages are signed +with the GPG keys of the Tor developers, so users can confirm that they +did in fact download a genuine version of Tor. + +<div class="p"><!----></div> +In the case of blocked users contacting bridges and bridge directory +authorities, the same logic applies in parallel: the blocked users fetch +information from both the bridge authorities and the directory authorities +for the "main" Tor network, and they combine this information locally. + +<div class="p"><!----></div> +How can a user in an oppressed country know that he has the correct +key fingerprints for the developers? As with other security systems, it +ultimately comes down to human interaction. The keys are signed by dozens +of people around the world, and we have to hope that our users have met +enough people in the PGP web of trust +that they can learn +the correct keys. For users that aren't connected to the global security +community, though, this question remains a critical weakness. + +<div class="p"><!----></div> + +<div class="p"><!----></div> + +<div class="p"><!----></div> + +<div class="p"><!----></div> + <h2><a name="tth_sEc9"> +<a name="sec:reachability"> +9</a> Maintaining reachability</h2> +</a> + +<div class="p"><!----></div> + <h3><a name="tth_sEc9.1"> +9.1</a> How many bridge relays should you know about?</h3> + +<div class="p"><!----></div> +The strategies described in Section <a href="#sec:discovery">7</a> talked about +learning one bridge address at a time. But if most bridges are ordinary +Tor users on cable modem or DSL connections, many of them will disappear +and/or move periodically. How many bridge relays should a blocked user +know about so that she is likely to have at least one reachable at any +given point? This is already a challenging problem if we only consider +natural churn: the best approach is to see what bridges we attract in +reality and measure their churn. 
We may also need to factor in a parameter +for how quickly bridges get discovered and blocked by the attacker; +we leave this for future work after we have more deployment experience. + +<div class="p"><!----></div> +A related question is: if the bridge relays change IP addresses +periodically, how often does the blocked user need to fetch updates in +order to keep from being cut out of the loop? + +<div class="p"><!----></div> +Once we have more experience and intuition, we should explore technical +solutions to this problem too. For example, if the discovery strategies +give out k bridge addresses rather than a single bridge address, perhaps +we can improve robustness from the user perspective without significantly +aiding the adversary. Rather than giving out a new random subset of k +addresses at each point, we could bind them together into <em>bridge +families</em>, so all users that learn about one member of the bridge family +are told about the rest as well. + +<div class="p"><!----></div> +This scheme may also help defend against attacks to map the set of +bridges. That is, if all blocked users learn a random subset of bridges, +the attacker should learn about a few bridges, monitor the country-level +firewall for connections to them, then watch those users to see what +other bridges they use, and repeat. By segmenting the bridge address +space, we can limit the exposure of other users. + +<div class="p"><!----></div> + <h3><a name="tth_sEc9.2"> +<a name="subsec:block-cable"> +9.2</a> Cablemodem users don't usually provide important websites</h3> +</a> + +<div class="p"><!----></div> +Another attack we might be concerned about is that the attacker could +just block all DSL and cablemodem network addresses, on the theory that +they don't run any important services anyway. If most of our bridges +are on these networks, this attack could really hurt. 
+ +<div class="p"><!----></div> +The first answer is to aim to get volunteers both from traditionally +"consumer" networks and also from traditionally "producer" networks. +Since bridges don't need to be Tor exit nodes, as we improve our usability +it seems quite feasible to get a lot of websites helping out. + +<div class="p"><!----></div> +The second answer (not as practical) would be to encourage more use of +consumer networks for popular and useful Internet services. + +<div class="p"><!----></div> +A related attack we might worry about is based on large countries putting +economic pressure on companies that want to expand their business. For +example, what happens if Verizon wants to sell services in China, and +China pressures Verizon to discourage its users in the free world from +running bridges? + +<div class="p"><!----></div> + <h3><a name="tth_sEc9.3"> +9.3</a> Scanning resistance: making bridges more subtle</h3> + +<div class="p"><!----></div> +If it's trivial to verify that a given address is operating as a bridge, +and most bridges run on a predictable port, then it's conceivable our +attacker could scan the whole Internet looking for bridges. (In fact, +he can just concentrate on scanning likely networks like cablemodem +and DSL services — see Section <a href="#subsec:block-cable">9.2</a> +above for +related attacks.) It would be nice to slow down this attack. It would +be even nicer to make it hard to learn whether we're a bridge without +first knowing some secret. We call this general property <em>scanning +resistance</em>, and it goes along with normalizing Tor's TLS handshake and +network fingerprint. + +<div class="p"><!----></div> +We could provide a password to the blocked user, and she (or her Tor +client) provides a nonced hash of this password when she connects. 
We'd +need to give her an ID key for the bridge too (in addition to the IP +address and port — see Section <a href="#subsec:id-address">6.1</a>), and wait to +present the password until we've finished the TLS handshake, else it +would look unusual. If Alice can authenticate the bridge before she +tries to send her password, we can resist an adversary who pretends +to be the bridge and launches a man-in-the-middle attack to learn the +password. But even if she can't, we still resist against widespread +scanning. + +<div class="p"><!----></div> +How should the bridge behave if accessed without the correct +authorization? Perhaps it should act like an unconfigured HTTPS server +("welcome to the default Apache page"), or maybe it should mirror +and act like common websites, or websites randomly chosen from Google. + +<div class="p"><!----></div> +We might assume that the attacker can recognize HTTPS connections that +use self-signed certificates. (This process would be resource-intensive +but not out of the realm of possibility.) But even in this case, many +popular websites around the Internet use self-signed or just plain broken +SSL certificates. + +<div class="p"><!----></div> + +<div class="p"><!----></div> + +<div class="p"><!----></div> + +<div class="p"><!----></div> + <h3><a name="tth_sEc9.4"> +<a name="subsec:incentives"> +9.4</a> How to motivate people to run bridge relays</h3> +</a> + +<div class="p"><!----></div> +One of the traditional ways to get people to run software that benefits +others is to give them motivation to install it themselves. An often +suggested approach is to install it as a stunning screensaver so everybody +will be pleased to run it. We take a similar approach here, by leveraging +the fact that these users are already interested in protecting their +own Internet traffic, so they will install and run the software. 
+ +<div class="p"><!----></div> +Eventually, we may be able to make all Tor users become bridges if they +pass their self-reachability tests — the software and installers need +more work on usability first, but we're making progress. + +<div class="p"><!----></div> +In the meantime, we can make a snazzy network graph with +Vidalia<a href="#tthFtNtAAD" name="tthFrefAAD"><sup>3</sup></a> that +emphasizes the connections the bridge user is currently relaying. + +<div class="p"><!----></div> + +<div class="p"><!----></div> + +<div class="p"><!----></div> + +<div class="p"><!----></div> + +<div class="p"><!----></div> + +<div class="p"><!----></div> + +<div class="p"><!----></div> + +<div class="p"><!----></div> + <h3><a name="tth_sEc9.5"> +<a name="subsec:publicity"> +9.5</a> Publicity attracts attention</h3> +</a> + +<div class="p"><!----></div> +Many people working in this field want to publicize the existence +and extent of censorship concurrently with the deployment of their +circumvention software. The easy reason for this two-pronged push is +to attract volunteers for running proxies in their systems; but in many +cases their main goal is not to focus on actually allowing individuals +to circumvent the firewall, but rather to educate the world about the +censorship. The media also tries to do its part by broadcasting the +existence of each new circumvention system. + +<div class="p"><!----></div> +But at the same time, this publicity attracts the attention of the +censors. We can slow down the arms race by not attracting as much +attention, and just spreading by word of mouth. If our goal is to +establish a solid social network of bridges and bridge users before +the adversary gets involved, does this extra attention work to our +disadvantage? 
+ +<div class="p"><!----></div> + <h3><a name="tth_sEc9.6"> +9.6</a> The Tor website: how to get the software</h3> + +<div class="p"><!----></div> +One of the first censoring attacks against a system like ours is to +block the website and make the software itself hard to find. Our system +should work well once the user is running an authentic +copy of Tor and has found a working bridge, but to get to that point +we rely on their individual skills and ingenuity. + +<div class="p"><!----></div> +Right now, most countries that block access to Tor block only the main +website and leave mirrors and the network itself untouched. +Falling back on word-of-mouth is always a good last resort, but we should +also take steps to make sure it's relatively easy for users to get a copy, +such as publicizing the mirrors more and making copies available through +other media. We might also mirror the latest version of the software on +each bridge, so users who hear about an honest bridge can get a good +copy. +See Section <a href="#subsec:first-bridge">7.1</a> for more discussion. + +<div class="p"><!----></div> + +<div class="p"><!----></div> + <h2><a name="tth_sEc10"> +<a name="sec:future"> +10</a> Future designs</h2> +</a> + +<div class="p"><!----></div> + <h3><a name="tth_sEc10.1"> +10.1</a> Bridges inside the blocked network too</h3> + +<div class="p"><!----></div> +Assuming actually crossing the firewall is the risky part of the +operation, can we have some bridge relays inside the blocked area too, +and more established users can use them as relays so they don't need to +communicate over the firewall directly at all? A simple example here is +to make new blocked users into internal bridges also — so they sign up +on the bridge authority as part of doing their query, and we give out +their addresses +rather than (or along with) the external bridge addresses. 
This design +is a lot trickier because it brings in the complexity of whether the +internal bridges will remain available, can maintain reachability with +the outside world, etc. + +<div class="p"><!----></div> +More complex future designs involve operating a separate Tor network +inside the blocked area, and using <em>hidden service bridges</em> — bridges +that can be accessed by users of the internal Tor network but whose +addresses are not published or findable, even by these users — to get +from inside the firewall to the rest of the Internet. But this design +requires directory authorities to run inside the blocked area too, +and they would be a fine target to take down the network. + +<div class="p"><!----></div> + +<div class="p"><!----></div> + <h2><a name="tth_sEc11"> +<a name="sec:conclusion"> +11</a> Next Steps</h2> +</a> + +<div class="p"><!----></div> +Technical solutions won't solve the whole censorship problem. After all, +the firewalls in places like China are <em>socially</em> very +successful, even if technologies and tricks exist to get around them. +However, having a strong technical solution is still necessary as one +important piece of the puzzle. + +<div class="p"><!----></div> +In this paper, we have shown that Tor provides a great set of building +blocks to start from. The next steps are to deploy prototype bridges and +bridge authorities, implement some of the proposed discovery strategies, +and then observe the system in operation and get more intuition about +the actual requirements and adversaries we're up against. + +<div class="p"><!----></div> + +<h2>References</h2> + +<dl compact="compact"> + <dt><a href="#CITEeconymics" name="econymics">[1]</a></dt><dd> +Alessandro Acquisti, Roger Dingledine, and Paul Syverson. + On the economics of anonymity. + In Rebecca N. Wright, editor, <em>Financial Cryptography</em>. + Springer-Verlag, LNCS 2742, 2003. 
+ +<div class="p"><!----></div> +</dd> + <dt><a href="#CITEfreedom21-security" name="freedom21-security">[2]</a></dt><dd> +Adam Back, Ian Goldberg, and Adam Shostack. + Freedom systems 2.1 security issues and analysis. + White paper, Zero Knowledge Systems, Inc., May 2001. + +<div class="p"><!----></div> +</dd> + <dt><a href="#CITEweb-mix" name="web-mix">[3]</a></dt><dd> +Oliver Berthold, Hannes Federrath, and Stefan Köpsell. + Web MIXes: A system for anonymous and unobservable Internet + access. + In H. Federrath, editor, <em>Designing Privacy Enhancing + Technologies: Workshop on Design Issue in Anonymity and Unobservability</em>. + Springer-Verlag, LNCS 2009, 2000. + +<div class="p"><!----></div> +</dd> + <dt><a href="#CITEpet05-bissias" name="pet05-bissias">[4]</a></dt><dd> +George Dean Bissias, Marc Liberatore, and Brian Neil Levine. + Privacy vulnerabilities in encrypted http streams. + In <em>Proceedings of Privacy Enhancing Technologies workshop (PET + 2005)</em>, May 2005. + + <a href="http://prisms.cs.umass.edu/brian/pubs/bissias.liberatore.pet.2005.pdf"><tt>http://prisms.cs.umass.edu/brian/pubs/bissias.liberatore.pet.2005.pdf</tt></a>. + +<div class="p"><!----></div> +</dd> + <dt><a href="#CITEchaum-blind" name="chaum-blind">[5]</a></dt><dd> +David Chaum. + Blind signatures for untraceable payments. + In D. Chaum, R.L. Rivest, and A.T. Sherman, editors, <em>Advances in + Cryptology: Proceedings of Crypto 82</em>, pages 199-203. Plenum Press, 1983. + +<div class="p"><!----></div> +</dd> + <dt><a href="#CITEfreenet-pets00" name="freenet-pets00">[6]</a></dt><dd> +Ian Clarke, Oskar Sandberg, Brandon Wiley, and Theodore W. Hong. + Freenet: A distributed anonymous information storage and retrieval + system. + In H. Federrath, editor, <em>Designing Privacy Enhancing + Technologies: Workshop on Design Issue in Anonymity and Unobservability</em>, + pages 46-66. Springer-Verlag, LNCS 2009, July 2000. 
+ +<div class="p"><!----></div> +</dd> + <dt><a href="#CITEclayton:pet2006" name="clayton:pet2006">[7]</a></dt><dd> +Richard Clayton, Steven J. Murdoch, and Robert N. M. Watson. + Ignoring the great firewall of china. + In <em>Proceedings of the Sixth Workshop on Privacy Enhancing + Technologies (PET 2006)</em>, Cambridge, UK, June 2006. Springer. + <a href="http://www.cl.cam.ac.uk/~rnc1/ignoring.pdf"><tt>http://www.cl.cam.ac.uk/~rnc1/ignoring.pdf</tt></a>. + +<div class="p"><!----></div> +</dd> + <dt><a href="#CITEdanezis:pet2004" name="danezis:pet2004">[8]</a></dt><dd> +George Danezis. + The traffic analysis of continuous-time mixes. + In David Martin and Andrei Serjantov, editors, <em>Privacy Enhancing + Technologies (PET 2004)</em>, LNCS, May 2004. + <a href="http://www.cl.cam.ac.uk/users/gd216/cmm2.pdf"><tt>http://www.cl.cam.ac.uk/users/gd216/cmm2.pdf</tt></a>. + +<div class="p"><!----></div> +</dd> + <dt><a href="#CITEusability:weis2006" name="usability:weis2006">[9]</a></dt><dd> +Roger Dingledine and Nick Mathewson. + Anonymity loves company: Usability and the network effect. + In <em>Proceedings of the Fifth Workshop on the Economics of + Information Security (WEIS 2006)</em>, Cambridge, UK, June 2006. + <a href="http://freehaven.net/doc/wupss04/usability.pdf"><tt>http://freehaven.net/doc/wupss04/usability.pdf</tt></a>. + +<div class="p"><!----></div> +</dd> + <dt><a href="#CITErep-anon" name="rep-anon">[10]</a></dt><dd> +Roger Dingledine, Nick Mathewson, and Paul Syverson. + Reputation in P2P Anonymity Systems. + In <em>Proceedings of Workshop on Economics of Peer-to-Peer + Systems</em>, June 2003. + <a href="http://freehaven.net/doc/econp2p03/econp2p03.pdf"><tt>http://freehaven.net/doc/econp2p03/econp2p03.pdf</tt></a>. + +<div class="p"><!----></div> +</dd> + <dt><a href="#CITEtor-design" name="tor-design">[11]</a></dt><dd> +Roger Dingledine, Nick Mathewson, and Paul Syverson. + Tor: The second-generation onion router. 
+ In <em>Proceedings of the 13th USENIX Security Symposium</em>, August + 2004. + <a href="http://tor.eff.org/tor-design.pdf"><tt>http://tor.eff.org/tor-design.pdf</tt></a>. + +<div class="p"><!----></div> +</dd> + <dt><a href="#CITEcasc-rep" name="casc-rep">[12]</a></dt><dd> +Roger Dingledine and Paul Syverson. + Reliable MIX Cascade Networks through Reputation. + In Matt Blaze, editor, <em>Financial Cryptography</em>. Springer-Verlag, + LNCS 2357, 2002. + +<div class="p"><!----></div> +</dd> + <dt><a href="#CITEpsiphon" name="psiphon">[13]</a></dt><dd> +Ronald Deibert et al. + Psiphon. + <a href="http://psiphon.civisec.org/"><tt>http://psiphon.civisec.org/</tt></a>. + +<div class="p"><!----></div> +</dd> + <dt><a href="#CITEinfranet" name="infranet">[14]</a></dt><dd> +Nick Feamster, Magdalena Balazinska, Greg Harfst, Hari Balakrishnan, and David + Karger. + Infranet: Circumventing web censorship and surveillance. + In <em>Proceedings of the 11th USENIX Security Symposium</em>, August + 2002. + <a href="http://nms.lcs.mit.edu/~feamster/papers/usenixsec2002.pdf"><tt>http://nms.lcs.mit.edu/~feamster/papers/usenixsec2002.pdf</tt></a>. + +<div class="p"><!----></div> +</dd> + <dt><a href="#CITEactive-wardens" name="active-wardens">[15]</a></dt><dd> +Gina Fisk, Mike Fisk, Christos Papadopoulos, and Joshua Neil. + Eliminating steganography in internet traffic with active wardens. + In Fabien Petitcolas, editor, <em>Information Hiding Workshop (IH + 2002)</em>. Springer-Verlag, LNCS 2578, October 2002. + +<div class="p"><!----></div> +</dd> + <dt><a href="#CITEblossom-thesis" name="blossom-thesis">[16]</a></dt><dd> +Geoffrey Goodell. + <em>Perspective Access Networks</em>. + PhD thesis, Harvard University, July 2006. + <a href="http://afs.eecs.harvard.edu/~goodell/thesis.pdf"><tt>http://afs.eecs.harvard.edu/~goodell/thesis.pdf</tt></a>. 
+ +<div class="p"><!----></div> +</dd> + <dt><a href="#CITEgoodell-syverson06" name="goodell-syverson06">[17]</a></dt><dd> +Geoffrey Goodell and Paul Syverson. + The right place at the right time: The use of network location in + authentication and abuse prevention, 2006. + Submitted. + +<div class="p"><!----></div> +</dd> + <dt><a href="#CITEcircumventor" name="circumventor">[18]</a></dt><dd> +Bennett Haselton. + How to install the Circumventor program. + + <a href="http://www.peacefire.org/circumventor/simple-circumventor-instructions.html"><tt>http://www.peacefire.org/circumventor/simple-circumventor-instructions.html</tt></a>. + +<div class="p"><!----></div> +</dd> + <dt><a href="#CITEip-to-country" name="ip-to-country">[19]</a></dt><dd> +Ip-to-country database. + <a href="http://ip-to-country.webhosting.info/"><tt>http://ip-to-country.webhosting.info/</tt></a>. + +<div class="p"><!----></div> +</dd> + <dt><a href="#CITEkoepsell:wpes2004" name="koepsell:wpes2004">[20]</a></dt><dd> +Stefan Köpsell and Ulf Hilling. + How to achieve blocking resistance for existing systems enabling + anonymous web surfing. + In <em>Proceedings of the Workshop on Privacy in the Electronic + Society (WPES 2004)</em>, Washington, DC, USA, October 2004. + <a href="http://freehaven.net/anonbib/papers/p103-koepsell.pdf"><tt>http://freehaven.net/anonbib/papers/p103-koepsell.pdf</tt></a>. + +<div class="p"><!----></div> +</dd> + <dt><a href="#CITEdefensive-dropping" name="defensive-dropping">[21]</a></dt><dd> +Brian N. Levine, Michael K. Reiter, Chenxi Wang, and Matthew Wright. + Timing analysis in low-latency mix-based systems. + In Ari Juels, editor, <em>Financial Cryptography</em>. Springer-Verlag, + LNCS (forthcoming), 2004. + +<div class="p"><!----></div> +</dd> + <dt><a href="#CITEmackinnon-personal" name="mackinnon-personal">[22]</a></dt><dd> +Rebecca MacKinnon. + Private communication, 2006. 
+ +<div class="p"><!----></div> +</dd> + <dt><a href="#CITEcgiproxy" name="cgiproxy">[23]</a></dt><dd> +James Marshall. + CGIProxy: HTTP/FTP Proxy in a CGI Script. + <a href="http://www.jmarshall.com/tools/cgiproxy/"><tt>http://www.jmarshall.com/tools/cgiproxy/</tt></a>. + +<div class="p"><!----></div> +</dd> + <dt><a href="#CITEe2e-traffic" name="e2e-traffic">[24]</a></dt><dd> +Nick Mathewson and Roger Dingledine. + Practical traffic analysis: Extending and resisting statistical + disclosure. + In David Martin and Andrei Serjantov, editors, <em>Privacy Enhancing + Technologies (PET 2004)</em>, LNCS, May 2004. + <a href="http://freehaven.net/doc/e2e-traffic/e2e-traffic.pdf"><tt>http://freehaven.net/doc/e2e-traffic/e2e-traffic.pdf</tt></a>. + +<div class="p"><!----></div> +</dd> + <dt><a href="#CITEattack-tor-oak05" name="attack-tor-oak05">[25]</a></dt><dd> +Steven J. Murdoch and George Danezis. + Low-cost traffic analysis of Tor. + In <em>IEEE Symposium on Security and Privacy</em>. IEEE CS, May 2005. + +<div class="p"><!----></div> +</dd> + <dt><a href="#CITEtcpstego" name="tcpstego">[26]</a></dt><dd> +Steven J. Murdoch and Stephen Lewis. + Embedding covert channels into TCP/IP. + In Mauro Barni, Jordi Herrera-Joancomartí, Stefan Katzenbeisser, + and Fernando Pérez-González, editors, <em>Information Hiding: 7th + International Workshop</em>, volume 3727 of <em>LNCS</em>, pages 247-261, + Barcelona, Catalonia (Spain), June 2005. Springer-Verlag. + +<div class="p"><!----></div> +</dd> + <dt><a href="#CITEptacek98insertion" name="ptacek98insertion">[27]</a></dt><dd> +Thomas H. Ptacek and Timothy N. Newsham. + Insertion, evasion, and denial of service: Eluding network intrusion + detection. + Technical report, Secure Networks, Inc., Suite 330, 1201 5th Street + S.W, Calgary, Alberta, Canada, T2R-0Y6, 1998. + +<div class="p"><!----></div> +</dd> + <dt><a href="#CITEzuckerman-threatmodels" name="zuckerman-threatmodels">[28]</a></dt><dd> +Ethan Zuckerman. 
+ We've got to adjust some of our threat models. + <a href="http://www.ethanzuckerman.com/blog/?p=1019"><tt>http://www.ethanzuckerman.com/blog/?p=1019</tt></a>.</dd> +</dl> + + +<div class="p"><!----></div> + +<div class="p"><!----></div> + +<div class="p"><!----></div> +<hr /><h3>Footnotes:</h3> + +<div class="p"><!----></div> +<a name="tthFtNtAAB"></a><a href="#tthFrefAAB"><sup>1</sup></a>So far in places + like China, the authorities mainly go after people who publish materials + and coordinate organized movements [<a href="#mackinnon-personal" name="CITEmackinnon-personal">22</a>]. + If they find that a + user happens to be reading a site that should be blocked, the typical + response is simply to block the site. Of course, even with an encrypted + connection, the adversary may be able to distinguish readers from + publishers by observing whether Alice is mostly downloading bytes or mostly + uploading them; we discuss this issue more in + Section <a href="#subsec:upload-padding">8.2</a>. +<div class="p"><!----></div> +<a name="tthFtNtAAC"></a><a href="#tthFrefAAC"><sup>2</sup></a><a href="http://wiki.noreply.org/noreply/TheOnionRouter/TorFAQ#EntryGuards"><tt>http://wiki.noreply.org/noreply/TheOnionRouter/TorFAQ#EntryGuards</tt></a> +<div class="p"><!----></div> +<a name="tthFtNtAAD"></a><a href="#tthFrefAAD"><sup>3</sup></a><a href="http://vidalia-project.net/"><tt>http://vidalia-project.net/</tt></a> +<br /><br /><hr /><small>File translated from +T<sub><font size="-1">E</font></sub>X +by <a href="http://hutchinson.belmont.ma.us/tth/"> +T<sub><font size="-1">T</font></sub>H</a>, +version 3.77.<br />On 11 May 2007, 21:49.</small> +</html> + |