Diffstat (limited to 'doc/design-paper')
-rw-r--r-- doc/design-paper/roadmap-2007.pdf    bin 119462 -> 0 bytes
-rw-r--r-- doc/design-paper/roadmap-2007.tex    690
-rw-r--r-- doc/design-paper/roadmap-future.pdf  bin 72297 -> 0 bytes
-rw-r--r-- doc/design-paper/roadmap-future.tex  895
4 files changed, 0 insertions, 1585 deletions
diff --git a/doc/design-paper/roadmap-2007.pdf b/doc/design-paper/roadmap-2007.pdf
deleted file mode 100644
index 2422c05888..0000000000
--- a/doc/design-paper/roadmap-2007.pdf
+++ /dev/null
Binary files differ
diff --git a/doc/design-paper/roadmap-2007.tex b/doc/design-paper/roadmap-2007.tex
deleted file mode 100644
index cebe4a5905..0000000000
--- a/doc/design-paper/roadmap-2007.tex
+++ /dev/null
@@ -1,690 +0,0 @@
-\documentclass{article}
-
-\usepackage{url}
-
-\newenvironment{tightlist}{\begin{list}{$\bullet$}{
- \setlength{\itemsep}{0mm}
- \setlength{\parsep}{0mm}
- % \setlength{\labelsep}{0mm}
- % \setlength{\labelwidth}{0mm}
- % \setlength{\topsep}{0mm}
- }}{\end{list}}
-\newcommand{\tmp}[1]{{\bf #1} [......] \\}
-\newcommand{\plan}[1]{ {\bf (#1)}}
-
-\begin{document}
-
-\title{Tor Development Roadmap: Wishlist for Nov 2006--Dec 2007}
-\author{Roger Dingledine \and Nick Mathewson \and Shava Nerad}
-
-\maketitle
-\pagestyle{plain}
-
-% TO DO:
-% add cites
-% add time estimates
-
-
-\section{Introduction}
-%Hi, Roger! Hi, Shava. This paragraph should get deleted soon. Right now,
-%this document goes into about as much detail as I'd like to go into for a
-%technical audience, since that's the audience I know best. It doesn't have
-%time estimates everywhere. It isn't well prioritized, and it doesn't
-%distinguish well between things that need lots of research and things that
-%don't. The breakdowns don't all make sense. There are lots of things where
-%I don't make it clear how they fit into larger goals, and lots of larger
-%goals that don't break down into little things. It isn't all stuff we can do
-%for sure, and it isn't even all stuff we can do for sure in 2007. The
-%tmp\{\} macro indicates stuff I haven't said enough about. That said, here
-%goes...
-
-Tor (the software) and Tor (the overall software/network/support/document
-suite) are now experiencing all the crises of success. Over the next year,
-we're probably going to grow more in terms of users, developers, and funding
-than before. This gives us the opportunity to perform long-neglected
-maintenance tasks.
-
-\section{Code and design infrastructure}
-
-\subsection{Protocol revision}
-To maintain backward compatibility, we've postponed major protocol
-changes and redesigns for a long time. Because of this, there are a number
-of sensible revisions we've been putting off until we could deploy several of
-them at once. To do each of these, we first need to discuss design
-alternatives with other cryptographers and outside collaborators to
-make sure that our choices are secure.
-
-First of all, our protocol needs better {\bf versioning support} so that we
-can make backward-incompatible changes to our core protocol. There are
-difficult anonymity issues here, since many naive designs would make it easy
-to tell clients apart (and then track them) based on their supported versions.
-
-With protocol versioning support would come the ability to {\bf future-proof
- our ciphersuites}. For example, not only our OR protocol, but also our
-directory protocol, is pretty firmly tied to the SHA-1 hash function, which
-though not yet known to be insecure for our purposes, has begun to show
-its age. We should
-remove from our design any assumption that public
-keys, secret keys, or digests will remain any particular size indefinitely.
-
-Our OR {\bf authentication protocol}, though provably
-secure\cite{tap:pet2006}, relies more on particular aspects of RSA and our
-implementation thereof than we had initially believed. To future-proof
-against changes, we should replace it with a less delicate approach.
-
-\plan{For all the above: 2 person-months to specify, spread over several
- months with time for interaction with external participants. One
- person-month to implement. Start specifying in early 2007.}
-
-We might design a {\bf stream migration} feature so that streams tunneled
-over Tor could be more resilient to dropped connections and changed IPs.
-\plan{Not in 2007.}
-
-A new protocol could support {\bf multiple cell sizes}. Right now, all data
-passes through the Tor network divided into 512-byte cells. This is
-efficient for high-bandwidth protocols, but inefficient for protocols
-like SSH or AIM that send information in small chunks. Of course, we need to
-investigate the extent to which multiple sizes could make it easier for an
-adversary to fingerprint a traffic pattern. \plan{Not in 2007.}
-
-As a part of our design, we should investigate possible {\bf cipher modes}
-other than counter mode. For example, a mode with built-in integrity
-checking, error propagation, and random access could simplify our protocol
-significantly. Sadly, many of these are patented and unavailable for us.
-\plan{Not in 2007.}
-
-\subsection{Scalability}
-
-\subsubsection{Improved directory efficiency}
-Right now, clients download a statement of the {\bf network status} made by
-each directory authority. We could reduce network bandwidth significantly by
-having the authorities jointly sign a statement reflecting their vote on the
-current network status. This would save clients up to 160K per hour, and
-make their view of the network more uniform. Of course, we'd need to make
-sure the voting process was secure and resilient to failures in the
-network.\plan{Must do; specify in 2006. 2 weeks to specify, 3-4 weeks to
- implement.}
-
-We should {\bf shorten router descriptors}, since the current format includes
-a great deal of information that's only of interest to the directory
-authorities, and not of interest to clients. We can do this by having each
-router upload a short-form and a long-form signed descriptor, and having
-clients download only the short form. Even a naive version of this would
-save about 40\% of the bandwidth currently spent by clients downloading
-descriptors.\plan{Must do; specify in 2006. 3-4 weeks.}
-
-We should {\bf have routers upload their descriptors even less often}, so
-that clients do not need to download replacements every 18 hours whether any
-information has changed or not. (As of Tor 0.1.2.3-alpha, clients tolerate
-routers that don't upload often, but routers still upload at least every 18
-hours to support older clients.) \plan{Must do, but not until 0.1.1.x is
-deprecated in mid 2007. 1 week.}
-
-\subsubsection{Non-clique topology}
-Our current network design achieves a certain amount of its anonymity by
-making clients act like each other through the simple expedient of making
-sure that all clients know all servers, and that any server can talk to any
-other server. But as the number of servers increases to serve an
-ever-greater number of clients, these assumptions become impractical.
-
-At worst, if these scalability issues become troubling before a solution is
-found, we can design and build a solution to {\bf split the network into
-multiple slices} until a better solution comes along. This is not ideal,
-since rather than looking like all other users from a point of view of path
-selection, users would ``only'' look like 200,000--300,000 other
-users.\plan{Not unless needed.}
-
-We are in the process of designing {\bf improved schemes for network
- scalability}. Some approaches focus on limiting what an adversary can know
-about what a user knows; others focus on reducing the extent to which an
-adversary can exploit this knowledge. These are currently in their infancy,
-and will probably not be needed in 2007, but they must be designed in 2007 if
-they are to be deployed in 2008.\plan{Design in 2007; unknown difficulty.
- Write a paper.}
-
-\subsubsection{Relay incentives}
-To support more users on the network, we need to get more servers. So far,
-we've relied on volunteerism to attract server operators, and so far it's
-served us well. But in the long run, we need to {\bf design incentives for
- users to run servers} and relay traffic for others. Most obviously, we
-could try to build the network so that servers offered improved service for
-other servers, but we would need to do so without weakening anonymity or
-making it obvious which connections originate from users running servers. We
-have some preliminary designs~\cite{incentives-txt,tor-challenges},
-but need to perform
-some more research to make sure they would be safe and effective.\plan{Write
- a draft paper; 2 person-months.}
-
-\subsection{Portability}
-Our {\bf Windows implementation}, though much improved, continues to lag
-behind Unix and Mac OS X, especially when running as a server. We hope to
-merge promising patches from Mike Chiussi to address this point, and bring
-Windows performance on par with other platforms.\plan{Do in 2007; 1.5 months
- to integrate not counting Mike's work.}
-
-We should have {\bf better support for portable devices}, including modes of
-operation that require less RAM, and that write to disk less frequently (to
-avoid wearing out flash RAM).\plan{Optional; 2 weeks.}
-
-We should {\bf stop using socketpair on Windows}; instead, we can use
-in-memory structures to communicate between cpuworkers and the main thread,
-and between connections.\plan{Optional; 1 week.}
-
-\subsection{Performance: resource usage}
-We've been working on {\bf using less RAM}, especially on servers. This has
-paid off a lot for directory caches in the 0.1.2.x series, which in some cases are
-using 90\% less memory than they used to require. But we can do better,
-especially in the area around our buffer management algorithms, by using an
-approach more like the BSD and Linux kernels use instead of our current ring
-buffer approach. (For OR connections, we can just use queues of cell-sized
-chunks produced with a specialized allocator.) This could potentially save
-around 25 to 50\% of the memory currently allocated for network buffers, and
-make Tor a more attractive proposition for restricted-memory environments
-like old computers, mobile devices, and the like.\plan{Do in 2007; 2-3 weeks
- plus one week measurement.}
-
-We should improve our {\bf bandwidth limiting}. The current system has been
-crucial in making users willing to run servers: nobody is willing to run a
-server if it might use an unbounded amount of bandwidth, especially if they
-are charged for their usage. We can make our system better by letting users
-configure bandwidth limits independently for their own traffic and traffic
-relayed for others; and by adding write limits for users running directory
-servers.\plan{Do in 2006; 2-3 weeks.}
-
-On many hosts, sockets are still in short supply, and will be until we can
-migrate our protocol to UDP. We can {\bf use fewer sockets} by making our
-self-to-self connections happen internally to the code rather than involving
-the operating system's socket implementation.\plan{Optional; 1 week.}
-
-\subsection{Performance: network usage}
-We know too little about how well our current path
-selection algorithms actually spread traffic around the network in practice.
-We should {\bf research the efficacy of our traffic allocation} and either
-assure ourselves that it is close enough to optimal as to need no improvement
-(unlikely) or {\bf identify ways to improve network usage}, and get more
-users' traffic delivered faster. Performing this research will require
-careful thought about anonymity implications.
-
-We should also {\bf examine the efficacy of our congestion control
- algorithm}, and see whether we can improve client performance in the
-presence of a congested network through dynamic `sendme' window sizes or
-other means. This will have anonymity implications too if we aren't careful.
-
-\plan{For both of the above: research, design and write
- a measurement tool in 2007: 1 month. See if we can interest a graduate
- student.}
-
-We should work on making Tor's cell-based protocol perform better on
-networks with low bandwidth
-and high packet loss.\plan{Do in 2007 if we're funded to do it; 4-6 weeks.}
-
-\subsection{Performance scenario: one Tor client, many users}
-We should {\bf improve Tor's performance when a single Tor handles many
- clients}. Many organizations want to manage a single Tor client on their
-firewall for many users, rather than having each user install a separate
-Tor client. We haven't optimized for this scenario, and it is likely that
-there are some code paths in the current implementation that become
-inefficient when a single Tor is servicing hundreds or thousands of client
-connections. (Additionally, it is likely that such clients have interesting
-anonymity requirements that we should investigate.) We should profile Tor
-under appropriate loads, identify bottlenecks, and fix them.\plan{Do in 2007
- if we're funded to do it; 4-8 weeks.}
-
-\subsection{Tor servers on asymmetric bandwidth}
-
-Tor should work better on servers that have asymmetric connections like cable
-or DSL. Because Tor has separate TCP connections between each
-hop, if the incoming bytes are arriving just fine and the outgoing bytes are
-all getting dropped on the floor, the TCP push-back mechanisms don't really
-transmit this information back to the incoming streams.\plan{Do in 2007 since
- related to bandwidth limiting. 3-4 weeks.}
-
-\subsection{Running Tor as both client and server}
-
-There are many performance tradeoffs and balances here that might need more attention.
-We first need to track and fix whatever bottlenecks emerge; but we also
-need to invent good algorithms for prioritizing the client's traffic
-without starving the server's traffic too much.\plan{No idea; try
-profiling and improving things in 2007.}
-
-\subsection{Protocol redesign for UDP}
-Tor has relayed only TCP traffic since its first versions, and has used
-TLS-over-TCP to do so. This approach has proved reliable and flexible, but
-in the long term we will need to allow UDP traffic on the network, and switch
-some or all of the network to using a UDP transport. {\bf Supporting UDP
- traffic} will make Tor more suitable for protocols that require UDP, such
-as many VOIP protocols. {\bf Using a UDP transport} could greatly reduce
-resource limitations on servers, and make the network far less interruptible
-by lossy connections. Either of these protocol changes would require a great
-deal of design work, however. We hope to be able to enlist the aid of a few
-talented graduate students to assist with the initial design and
-specification, but the actual implementation will require significant testing
-of different reliable transport approaches.\plan{Maybe do a design in 2007 if
-we find an interested academic. Ian or Ben L might be good partners here.}
-
-\section{Blocking resistance}
-
-\subsection{Design for blocking resistance}
-We have written a design document explaining our general approach to blocking
-resistance. We should workshop it with other experts in the field to get
-their ideas about how we can improve Tor's efficacy as an anti-censorship
-tool.
-
-\subsection{Implementation: client-side and bridges-side}
-
-Our anticensorship design calls for some nodes to act as ``bridges''
-that are outside a national firewall, and others inside the firewall to
-act as pure clients. This part of the design is quite clear-cut; we're
-probably ready to begin implementing it. To {\bf implement bridges}, we
-need to have servers publish themselves as limited-availability relays
-to a special bridge authority if they judge they'd make good servers.
-We will also need to help provide documentation for port forwarding,
-and an easy configuration tool for running as a bridge.
-
-To {\bf implement clients}, we need to provide a flexible interface to
-learn about bridges and to act on knowledge of bridges. We also need
-to teach them how to know to use bridges as their first hop, and how to
-fetch directory information from both classes of directory authority.
-
-Clients also need to {\bf use the encrypted directory variant} added in Tor
-0.1.2.3-alpha. This will let them retrieve directory information over Tor
-once they've got their initial bridges. We may want to get the rest of the
-Tor user base to begin using this encrypted directory variant too, to
-provide cover.
-
-Bridges will want to be able to {\bf listen on multiple addresses and ports}
-if they can, to give the adversary more ports to block.
-
-\subsection{Research: anonymity implications from becoming a bridge}
-
-\subsection{Implementation: bridge authority}
-
-The design here is also reasonably clear-cut: we need to run some
-directory authorities with a slightly modified protocol that doesn't leak
-the entire list of bridges. Thus users can learn up-to-date information
-for bridges they already know about, but they can't learn about arbitrary
-new bridges.
-
-\subsection{Normalizing the Tor protocol on the wire}
-Additionally, we should {\bf resist content-based filters}. Though an
-adversary can't see what users are saying, some aspects of our protocol are
-easy to fingerprint {\em as} Tor. We should correct this where possible.
-
-Look like Firefox; or look like nothing?
-Future research: investigate timing similarities with other protocols.
-
-\subsection{Access control for bridges}
-Design/impl: password-protecting bridges, in light of above.
-And/or more general access control.
-
-\subsection{Research: scanning-resistance}
-
-\subsection{Research/Design/Impl: how users discover bridges}
-Our design anticipates an arms race between discovery methods and censors.
-We need to begin the infrastructure on our side quickly, preferably in a
-flexible language like Python, so we can adapt quickly to censorship.
-
-phase one: personal bridges
-phase two: families of personal bridges
-phase three: more structured social network
-phase four: bag of tricks
-Research: phase five...
-
-Integration with Psiphon, etc?
-
-\subsection{Document best practices for users}
-Document best practices for various activities common among
-blocked users (e.g. WordPress use).
-
-\subsection{Research: how to know if a bridge has been blocked?}
-
-\subsection{GeoIP maintenance, and ``private'' user statistics}
-How to know if the whole idea is working?
-
-\subsection{Research: hiding whether the user is reading or publishing?}
-
-\subsection{Research: how many bridges do you need to know to maintain
-reachability?}
-
-\subsection{Resisting censorship of the Tor website, docs, and mirrors}
-
-We should take some effort to consider {\bf initial distribution of Tor and
- related information} in countries where the Tor website and mirrors are
-censored. (Right now, most countries that block access to Tor block only the
-main website and leave mirrors and the network itself untouched.) Falling
-back on word-of-mouth is always a good last resort, but we should also take
-steps to make sure it's relatively easy for users to get ahold of a copy.
-
-\section{Security}
-
-\subsection{Security research projects}
-
-We should investigate approaches with some promise to help Tor resist
-end-to-end traffic correlation attacks. It's an open research question
-whether (and to what extent) {\bf mixed-latency} networks, {\bf low-volume
- long-distance padding}, or other approaches can resist these attacks, which
-are currently some of the most effective against careful Tor users. We
-should research these questions and perform simulations to identify
-opportunities for strengthening our design without dropping performance to
-unacceptable levels. %Cite something
-\plan{Start doing this in 2007; write a paper. 8-16 weeks.}
-
-We've got some preliminary results suggesting that {\bf a topology-aware
- routing algorithm}~\cite{feamster:wpes2004} could reduce Tor users'
-vulnerability against local or ISP-level adversaries, by ensuring that they
-are never in a position to watch both ends of a connection. We need to
-examine the effects of this approach in more detail and consider side-effects
-on anonymity against other kinds of adversaries. If the approach still looks
-promising, we should investigate ways for clients to implement it (or an
-approximation of it) without having to download routing tables for the whole
-Internet. \plan{Not in 2007 unless a graduate student wants to do it.}
-
-%\tmp{defenses against end-to-end correlation} We don't expect any to work
-%right now, but it would be useful to learn that one did. Alternatively,
-%proving that one didn't would free up researchers in the field to go work on
-%other things.
-%
-% See above; I think I got this.
-
-We should research the efficacy of {\bf website fingerprinting} attacks,
-wherein an adversary tries to match the distinctive traffic and timing
-pattern of the resources constituting a given website to the traffic pattern
-of a user's client. These attacks work great in simulations, but in
-practice we hear they don't work nearly as well. We should get some actual
-numbers to investigate the issue, and figure out what's going on. If we
-resist these attacks, or can improve our design to resist them, we should.
-% add cites
-\plan{Possibly part of end-to-end correlation paper. Otherwise, not in 2007
- unless a graduate student is interested.}
-
-\subsection{Implementation security}
-Right now, each Tor node stores its keys unencrypted. We should {\bf encrypt
- more Tor keys} so that Tor authorities can require a startup password. We
-should look into adding intermediary medium-term ``signing keys'' between
-identity keys and onion keys, so that a password could be required to replace
-a signing key, but not to start Tor. This would improve Tor's long-term
-security, especially in its directory authority infrastructure.\plan{Design this
- as a part of the revised ``v2.1'' directory protocol; implement it in
- 2007. 3-4 weeks.}
-
-We should also {\bf mark RAM that holds key material as non-swappable} so
-that there is no risk of recovering key material from a hard disk
-compromise. This would require submitting patches upstream to OpenSSL, where
-support for marking memory as sensitive is currently in a very preliminary
-state.\plan{Nice to do, but not in immediate Tor scope.}
-
-There are numerous tools for identifying trouble spots in code (such as
-Coverity or even VS2005's code analysis tool) and we should convince somebody
-to run some of them against the Tor codebase. Ideally, we could figure out a
-way to get our code checked periodically rather than just once.\plan{Almost
- no time once we talk somebody into it.}
-
-We should try {\bf protocol fuzzing} to identify errors in our
-implementation.\plan{Not in 2007 unless we find a grad student or
- undergraduate who wants to try.}
-
-Our guard nodes help prevent an attacker from being able to become a chosen
-client's entry point by having each client choose a few favorite entry points
-as ``guards'' and stick to them. We should implement a {\bf directory
- guards} feature to keep adversaries from enumerating Tor users by acting as
-a directory cache.\plan{Do in 2007; 2 weeks.}
-
-\subsection{Detect corrupt exits and other servers}
-With the success of our network, we've attracted servers in many locations,
-operated by many kinds of people. Unfortunately, some of these locations
-have compromised or defective networks, and some of these people are
-untrustworthy or incompetent. Our current design relies on authority
-administrators to identify bad nodes and mark them as nonfunctioning. We
-should {\bf automate the process of identifying malfunctioning nodes} as
-follows:
-
-We should create a generic {\bf feedback mechanism for add-on tools} like
-Mike Perry's ``Snakes on a Tor'' to report failing nodes to authorities.
-\plan{Do in 2006; 1-2 weeks.}
-
-We should write tools to {\bf detect more kinds of innocent node failure},
-such as nodes whose network providers intercept SSL, nodes whose network
-providers censor popular websites, and so on. We should also try to detect
-{\bf routers that snoop traffic}; we could do this by launching connections
-to throwaway accounts, and seeing which accounts get used.\plan{Do in 2007;
- ask Mike Perry if he's interested. 4-6 weeks.}
-
-We should add {\bf an efficient way for authorities to mark a set of servers
- as probably collaborating} though not necessarily otherwise dishonest.
-This happens when an administrator starts multiple routers, but doesn't mark
-them as belonging to the same family.\plan{Do during v2.1 directory protocol
- redesign; 1-2 weeks to implement.}
-
-To avoid attacks where an adversary claims good performance in order to
-attract traffic, we should {\bf have authorities measure node performance}
-(including stability and bandwidth) themselves, and not simply believe what
-they're told. Measuring stability can be done by tracking MTBF. Measuring
-bandwidth can be tricky, since it's hard to distinguish between a server with
-low capacity, and a high-capacity server with most of its capacity in
-use.\plan{Do ``Stable'' in 2007; 2-3 weeks. ``Fast'' will be harder; do it
- if we can interest a grad student.}
-
-{\bf Operating a directory authority should be easier.} We rely on authority
-operators to keep the network running well, but right now their job involves
-too much busywork and administrative overhead. A better interface for them
-to use could free their time to work on exception cases rather than on
-adding named nodes to the network.\plan{Do in 2007; 4-5 weeks.}
-
-\subsection{Protocol security}
-
-In addition to other protocol changes discussed above,
-% And should we move some of them down here? -NM
-we should add {\bf hooks for denial-of-service resistance}; we have some
-preliminary designs, but we shouldn't postpone them until we really need them.
-If somebody tries a DDoS attack against the Tor network, we won't want to
-wait for all the servers and clients to upgrade to a new
-version.\plan{Research project; do this in 2007 if funded.}
-
-\section{Development infrastructure}
-
-\subsection{Build farm}
-We've begun to deploy a cross-platform distributed build farm of hosts
-that build and test the Tor source every time it changes in our development
-repository.
-
-We need to {\bf get more participants}, so that we can test a larger variety
-of platforms. (Previously, we've only found out when our code had broken on
-obscure platforms when somebody got around to building it.)
-
-We need also to {\bf add our dependencies} to the build farm, so that we can
-ensure that libraries we need (especially libevent) do not stop working on
-any important platform between one release and the next.
-
-\plan{This is ongoing as more buildbots arrive.}
-
-\subsection{Improved testing harness}
-Currently, our {\bf unit tests} cover only about 20\% of the code base. This
-is uncomfortably low; we should write more and switch to a more flexible
-testing framework.\plan{Ongoing basis, time permitting.}
-
-We should also write flexible {\bf automated single-host deployment tests} so
-we can more easily verify that the current codebase works with the
-network.\plan{Worthwhile in 2007; would save lots of time. 2-4 weeks.}
-
-We should build automated {\bf stress testing} frameworks so we can see which
-realistic loads cause Tor to perform badly, and regularly profile Tor against
-these loads. This would give us {\it in vitro} performance values to
-supplement our deployment experience.\plan{Worthwhile in 2007; 2-6 weeks.}
-
-We should improve our memory profiling code.\plan{...}
-
-
-\subsection{Centralized build system}
-We currently rely on a separate packager to maintain the packaging system and
-to build Tor on each platform for which we distribute binaries. Having separate
-package maintainers is sensible, but having separate package builders has meant
-long turnaround times between source releases and package releases. We
-should create the necessary infrastructure for us to produce binaries for all
-major packages within an hour or so of source release.\plan{We should
- brainstorm this at least in 2007.}
-
-\subsection{Improved metrics}
-We need a way to {\bf measure the network's health, capacity, and degree of
- utilization}. Our current means for doing this are ad hoc and not
-completely accurate.
-
-We need better ways to {\bf tell which countries our users are coming from,
- and how many there are}. A good perspective of the network helps us
-allocate resources and identify trouble spots, but our current approaches
-will work less and less well as we make it harder for adversaries to
-enumerate users. We'll probably want to shift to a smarter, statistical
-approach rather than our current ``count and extrapolate'' method.
-
-\plan{All of this in 2007 if funded; 4-8 weeks}
-
-% \tmp{We'd like to know how much of the network is getting used.}
-% I think this is covered above -NM
-
-\subsection{Controller library}
-We've done lots of design and development on our controller interface, which
-allows UI applications and other tools to interact with Tor. We could
-encourage the development of more such tools by releasing a {\bf
- general-purpose controller library}, ideally with API support for several
-popular programming languages.\plan{2006 or 2007; 1-2 weeks.}
-
-\section{User experience}
-
-\subsection{Get blocked less, get blocked less broadly}
-Right now, some services block connections from the Tor network because
-they don't have a better
-way to keep vandals from abusing them than blocking IP addresses associated
-with vandalism. Our approach so far has been to educate them about better
-solutions that currently exist, but we should also {\bf create better
-solutions for limiting vandalism by anonymous users} like credential and
-blind-signature based implementations, and encourage their use. Other
-promising starting points include writing a patch and explanation for
-Wikipedia, and helping Freenode to document, maintain, and expand its
-current Tor-friendly position.\plan{Do a writeup here in 2007; 1-2 weeks.}
-
-Those who do block Tor users also block overbroadly, sometimes blacklisting
-operators of Tor servers that do not permit exit to their services. We could
-obviate innocent reasons for doing so by designing a {\bf narrowly-targeted Tor
- RBL service} so that those who wanted to overblock Tor could no longer
-plead incompetence.\plan{Possibly in 2007 if we decide it's a good idea; 3
- weeks.}
-
-\subsection{All-in-one bundle}
-We need a well-tested, well-documented bundle of Tor and supporting
-applications configured to use it correctly. We have an initial
-implementation well under way, but it will need additional work in
-identifying requisite Firefox extensions, identifying security threats,
-improving user experience, and so on. This will need significantly more work
-before it's ready for a general public release.
-
-\subsection{LiveCD Tor}
-We need a nice bootable livecd containing a minimal OS and a few applications
-configured to use Tor correctly. The Anonym.OS project demonstrated that this
-is quite feasible, but their project is not currently maintained.
-
-\subsection{A Tor client in a VM}
-\tmp{a.k.a JanusVM} which is quite related to the firewall-level deployment
-section below. JanusVM is a Linux kernel running in VMWare. It gets an IP
-address from the network, and serves as a DHCP server for its host Windows
-machine. It intercepts all outgoing traffic and redirects it into Privoxy,
-Tor, etc. This Linux-in-Windows approach may help us with scalability in
-the short term, and it may also be a good long-term solution rather than
-accepting all security risks in Windows.
-
-%\subsection{Interface improvements}
-%\tmp{Allow controllers to manipulate server status.}
-% (Why is this in the User Experience section?) -RD
-% I think it's better left to a generic ``make controller iface better'' item.
-
-\subsection{Firewall-level deployment}
-Another useful deployment mode for some users is using {\bf Tor in a firewall
- configuration}, and directing all their traffic through Tor. This can be a
-little tricky to set up currently, but it's an effective way to make sure no
-traffic leaves the host un-anonymized. To achieve this, we need to {\bf
- improve and port our new TransPort} feature which allows Tor to be used
-without SOCKS support; to {\bf add an anonymizing DNS proxy} feature to Tor;
-and to {\bf construct a recommended set of firewall configurations} to redirect
-traffic to Tor.
-
-This is an area where {\bf deployment via a livecd}, or an installation
-targeted at specialized home routing hardware, could be useful.
-
-\subsection{Assess software and configurations for anonymity risks}
-Right now, users and packagers are more or less on their own when selecting
-Firefox extensions. We should {\bf assemble a recommended list of browser
- extensions} through experiment, and include this in the application bundles
-we distribute.
-
-We should also describe {\bf best practices for using Tor with each class of
- application}. For example, Ethan Zuckerman has written a detailed
-tutorial on how to use Tor, Firefox, GMail, and Wordpress to blog with
-improved safety. There are many other cases on the Internet where anonymity
-would be helpful, and there are a lot of ways to screw up using Tor.
-
-The Foxtor and Torbutton extensions serve similar purposes; we should pick a
-favorite, and merge in the useful features of the other.
-
-%\tmp{clean up our own bundled software:
-%E.g. Merge the good features of Foxtor into Torbutton}
-%
-% What else did you have in mind? -NM
-
-\subsection{Localization}
-Right now, most of our user-facing code is internationalized. We need to
-internationalize the last few hold-outs (like the Tor expert installer), and get
-more translations for the parts that are already internationalized.
-
-Also, we should look into a {\bf unified translator's solution}. Currently,
-since different tools have been internationalized using the
-framework-appropriate method, different tools require translators to localize
-them via different interfaces. As far as possible, translators should
-need only a single tool to translate the whole Tor suite.
-
-\section{Support}
-
-It would be nice to set up some {\bf user support infrastructure} and
-{\bf contributor support infrastructure}, especially focusing on server
-operators and on coordinating volunteers.
-
-This includes intuitive and easy ticket systems for bug reports and
-feature suggestions (not just mailing lists with a half dozen people
-and no clear roles for who answers what), but it also includes a more
-personalized and efficient framework for interaction so we keep the
-attention and interest of the contributors, and so we make them feel
-helpful and wanted.
-
-\section{Documentation}
-
-\subsection{Unified documentation scheme}
-
-We need to {\bf inventory our documentation.} Our documentation so far has
-been mostly produced on an {\it ad hoc} basis, in response to particular
-needs and requests. We should figure out what documentation we have, which of
-it (if any) should get priority, and whether we can put it all into a
-single format.
-
-We could {\bf unify the docs} into a single book-like thing. This will also
-help us identify what sections of the ``book'' are missing.
-
-\subsection{Missing technical documentation}
-
-We should {\bf revise our design paper} to reflect the new decisions and
-research we've made since it was published in 2004. This will help other
-researchers evaluate and suggest improvements to Tor's current design.
-
-Other projects sometimes implement the client side of our protocol. We
-encourage this, but we should write {\bf a document about how to avoid
-excessive resource use}, so we don't need to worry that they will do so
-without regard to the effect of their choices on server resources.
-
-\subsection{Missing user documentation}
-
-Our documentation falls into two broad categories: some is `discursive' and
-explains in detail why users should take certain actions, and other
-documentation is `comprehensive' and describes all of Tor's features. Right
-now, we have no document that is at once deep, readable, and thorough. We
-should correct this by identifying missing spots in our design.
-
-\bibliographystyle{plain} \bibliography{tor-design}
-
-\end{document}
-
diff --git a/doc/design-paper/roadmap-future.pdf b/doc/design-paper/roadmap-future.pdf
deleted file mode 100644
index 8300ce19c9..0000000000
--- a/doc/design-paper/roadmap-future.pdf
+++ /dev/null
Binary files differ
diff --git a/doc/design-paper/roadmap-future.tex b/doc/design-paper/roadmap-future.tex
deleted file mode 100644
index 4ab240f977..0000000000
--- a/doc/design-paper/roadmap-future.tex
+++ /dev/null
@@ -1,895 +0,0 @@
-\documentclass{article}
-
-\usepackage{url}
-\usepackage{fullpage}
-
-\newenvironment{tightlist}{\begin{list}{$\bullet$}{
- \setlength{\itemsep}{0mm}
- \setlength{\parsep}{0mm}
- % \setlength{\labelsep}{0mm}
- % \setlength{\labelwidth}{0mm}
- % \setlength{\topsep}{0mm}
- }}{\end{list}}
-\newcommand{\tmp}[1]{{\bf #1} [......] \\}
-\newcommand{\plan}[1]{ {\bf (#1)}}
-
-\begin{document}
-
-\title{Tor Development Roadmap: Wishlist for 2008 and beyond}
-\author{Roger Dingledine \and Nick Mathewson}
-\date{}
-
-\maketitle
-\pagestyle{plain}
-
-\section{Introduction}
-
-Tor (the software) and Tor (the overall software/network/support/document
-suite) are now experiencing all the crises of success. Over the next few
-years, we're probably going to grow even more in terms of users, developers,
-and funding than before. This document attempts to lay out all the
-well-understood next steps that Tor needs to take. We should periodically
-reorganize it to reflect current and intended priorities.
-
-\section{Everybody can be a relay}
-
-We've made a lot of progress towards letting an ordinary Tor client also
-serve as a Tor relay. But these issues remain.
-
-\subsection{UPnP}
-
-We should teach Vidalia how to speak UPnP to automatically open and
-forward ports on common (e.g. Linksys) routers. There are some promising
-Qt-based UPnP libs out there, and in any case there are others (e.g. in
-Perl) that we can base it on.
-
-\subsection{``ORPort auto'' to look for a reachable port}
-
-Vidalia defaults to port 443 on Windows and port 8080 elsewhere. But if
-that port is already in use, or the ISP filters incoming connections
-on that port (some cable modem providers filter 443 inbound), the user
-needs to learn how to notice this, and then pick a new one and type it
-into Vidalia.
-
-We should add a new option ``auto'' that cycles through a set of preferred
-ports, testing bindability and reachability for each of them, and only
-complains to the user once it's given up on the common choices.
-
-\subsection{Incentives design}
-
-Roger has been working with researchers at Rice University to simulate
-and analyze a new design where the directory authorities assign gold
-stars to well-behaving relays, and then all the relays give priority
-to traffic from gold-starred relays. The great feature of the design is
-that not only does it provide the (explicit) incentive to run a relay,
-but it also aims to grow the overall capacity of the network, so even
-non-relays will benefit.
-
-It needs more analysis, and perhaps more design work, before we try
-deploying it.
-
-\subsection{Windows libevent}
-
-Tor relays still don't work well or reliably on Windows XP or Windows
-Vista, because we don't use the Windows-native ``overlapped IO''
-approach. Christian King made a good start at teaching libevent about
-overlapped IO during Google Summer of Code 2007, and next steps are
-to a) finish that, b) teach Tor to do OpenSSL calls on buffers rather
-than directly to the network, and c) teach Tor to use the new libevent
-buffers approach.
-
-\subsection{Network scaling}
-
-If we attract many more relays, we will need to handle the growing pains
-in terms of getting all the directory information to all the users.
-
-The first piece of this issue is a practical question: since the
-directory size scales linearly with more relays, at some point it
-will no longer be practical for every client to learn about every
-relay. We can try to reduce the amount of information each client needs
-to fetch (e.g. based on fetching less information preemptively as in
-Section~\ref{subsec:fewer-descriptor-fetches} below), but eventually
-clients will need to learn about only a subset of the network, and we
-will need to design good ways to divide up the network information.
-
-The second piece is an anonymity question that arises from this
-partitioning: if Tor's security comes from having all the clients
-behaving in similar ways, yet we are now giving different clients
-different directory information, how can we minimize the new anonymity
-attacks we introduce?
-
-\subsection{Using fewer sockets}
-
-Since in the current network every Tor relay can reach every other Tor
-relay, and we have many times more users than relays, pretty much every
-possible link in the network is in use. That is, the current network
-is a clique in practice.
-
-And since each of these connections requires a TCP socket, it's going
-to be hard for the network to grow much larger: many systems come with
-a default of 1024 file descriptors allowed per process, and raising
-that ulimit is hard for end users. Worse, many low-end gateway/firewall
-routers can't handle this many connections in their routing table.
-
-One approach is a restricted-route topology~\cite{danezis:pet2003}:
-predefine which relays can reach which other relays, and communicate
-these restrictions to the relays and the clients. We need to compute
-which links are acceptable in a way that's decentralized yet scalable,
-and in a way that achieves a small-worlds property; and we
-need an efficient (compact) way to characterize the topology information
-so all the users could keep up to date.
-
-Another approach would be to switch to UDP-based transport between
-relays, so we don't need to keep the TCP sockets open at all. This approach
-needs more investigation too.
-
-\subsection{Auto bandwidth detection and rate limiting, especially for
- asymmetric connections.}
-
-
-\subsection{Better algorithms for giving priority to local traffic}
-
-Proposal 111 made a lot of progress at separating local traffic from
-relayed traffic, so Tor users can rate limit the relayed traffic at a
-stricter level. But since we want to pass both traffic classes over the
-same TCP connection, we can't keep them entirely separate. The current
-compromise is that we treat all bytes to/from a given connection as
-local traffic if any of the bytes within the past N seconds were local
-bytes. But a) we could use some more intelligent heuristics, and b)
-this leaks information to an active attacker about when local traffic
-was sent/received.
-
-\subsection{Tolerate absurdly wrong clocks, even for relays}
-
-Many of our users are on Windows, running with a clock several days or
-even several years off from reality. Some of them are even intentionally
-in this state so they can run software that will only run in the past.
-
-Before Tor 0.1.1.x, Tor clients would still function if their clock was
-wildly off --- they simply got a copy of the directory and believed it.
-Starting in Tor 0.1.1.x (and even more so in Tor 0.2.0.x), the clients
-only use networkstatus documents that they believe to be recent, so
-clients with extremely wrong clocks no longer work. (This bug has been
-an unending source of vague and confusing bug reports.)
-
-The first step is for clients to recognize when all the directory material
-they're fetching has roughly the same offset from their current time,
-and then automatically correct for it.
-
-Once that's working well, clients who opt to become bridge relays should
-be able to use the same approach to serve accurate directory information
-to their bridge users.
-
-\subsection{Risks from being a relay}
-
-Three different research
-papers~\cite{back01,clog-the-queue,attack-tor-oak05} describe ways to
-identify the nodes in a circuit by running traffic through candidate nodes
-and looking for dips in the traffic while the circuit is active. These
-clogging attacks are not that scary in the Tor context so long as relays
-are never clients too. But if we're trying to encourage more clients to
-turn on relay functionality too (whether as bridge relays or as normal
-relays), then we need to understand this threat better and learn how to
-mitigate it.
-
-One promising research direction is to investigate the RelayBandwidthRate
-feature that lets Tor rate limit relayed traffic differently from local
-traffic. Since the attacker's ``clogging'' traffic is not in the same
-bandwidth class as the traffic initiated by the user, it may be harder
-to detect interference. Or it may not be.
-
-\subsection{First a bridge, then a public relay?}
-
-Once enough of the items in this section are done, I want all clients
-to start out automatically detecting their reachability and opting
-to be bridge relays.
-
-Then if they realize they have enough consistency and bandwidth, they
-should automatically upgrade to being non-exit relays.
-
-What metrics should we use for deciding when we're fast enough
-and stable enough to switch? Given that the list of bridge relays needs
-to be kept secret, it doesn't make much sense to switch back.
-
-\section{Tor on low resources / slow links}
-\subsection{Reducing directory fetches further}
-\label{subsec:fewer-descriptor-fetches}
-\subsection{AvoidDiskWrites}
-\subsection{Using less ram}
-\subsection{Better DoS resistance for tor servers / authorities}
-\section{Blocking resistance}
-\subsection{Better bridge-address-distribution strategies}
-\subsection{Get more volunteers running bridges}
-\subsection{Handle multiple bridge authorities}
-\subsection{Anonymity for bridge users: second layer of entry guards, etc?}
-\subsection{More TLS normalization}
-\subsection{Harder to block Tor software distribution}
-\subsection{Integration with Psiphon}
-\section{Packaging}
-\subsection{Switch Privoxy out for Polipo}
- - Make Vidalia able to launch more programs itself
-\subsection{Continue Torbutton improvements}
- especially better docs
-\subsection{Vidalia and stability (especially wrt ongoing Windows problems)}
- learn how to get useful crash reports (tracebacks) from Windows users
-\subsection{Polipo support on Windows}
-\subsection{Auto update for Tor, Vidalia, others}
-\subsection{Tor browser bundle for USB and standalone use}
-\subsection{LiveCD solution}
-\subsection{VM-based solution}
-\subsection{Tor-on-enclave-firewall configuration}
-\subsection{General tutorials on what common applications are Tor-friendly}
-\subsection{Controller libraries (torctl) plus documentation}
-\subsection{Localization and translation (Vidalia, Torbutton, web pages)}
-\section{Interacting better with Internet sites}
-\subsection{Make tordnsel (tor exitlist) better and more well-known}
-\subsection{Nymble}
-\subsection{Work with Wikipedia, Slashdot, Google(, IRC networks)}
-\subsection{IPv6 support for exit destinations}
-\section{Network health}
-\subsection{torflow / soat to detect bad relays}
-\subsection{make authorities more automated}
-\subsection{torstatus pages and better trend tracking}
-\subsection{better metrics for assessing network health / growth}
- - geoip usage-by-country reporting and aggregation
- (Once that's working, switch to Directory guards)
-\section{Performance research}
-\subsection{Load balance better}
-\subsection{Improve our congestion control algorithms}
-\subsection{Two-hops vs Three-hops}
-\subsection{Transport IP packets end-to-end}
-\section{Outreach and user education}
-\subsection{``Who uses Tor'' use cases}
-\subsection{Law enforcement contacts}
- - ``Was this IP address a Tor relay recently?'' database
-\subsection{Commercial/enterprise outreach. Help them use Tor well and
- not fear it.}
-\subsection{NGO outreach and training.}
- - ``How to be a safe blogger''
-\subsection{More activist coordinators, more people to answer user questions}
-\subsection{More people to hold hands of server operators}
-\subsection{Teaching the media about Tor}
-\subsection{The-dangers-of-plaintext awareness}
-\subsection{check.torproject.org and other ``privacy checkers''}
-\subsection{Stronger legal FAQ for US}
-\subsection{Legal FAQs for other countries}
-\section{Anonymity research}
-\subsection{estimate relay bandwidth more securely}
-\subsection{website fingerprinting attacks}
-\subsection{safer e2e defenses}
-\subsection{Using Tor when you really need anonymity. Can you compose it
- with other steps, like more trusted guards or separate proxies?}
-\subsection{Topology-aware routing; routing-zones, steven's pet2007 paper.}
-\subsection{Exactly what do guard nodes provide?}
-
-Entry guards seem to defend against all sorts of attacks. Can we work
-through all the benefits they provide? Papers like Nikita's CCS 2007
-paper make me think their value is not well-understood by the research
-community.
-
-\section{Organizational growth and stability}
-\subsection{A contingency plan if Roger gets hit by a bus}
- - Get a new executive director
-\subsection{More diversity of funding}
- - Don't rely on any one funder as much
- - Don't rely on any sector or funder category as much
-\subsection{More Tor-funded people who are skilled at peripheral apps like
- Vidalia, Torbutton, Polipo, etc}
-\subsection{More coordinated media handling and strategy}
-\subsection{Clearer and more predictable trademark behavior}
-\subsection{More outside funding for internships, etc e.g. GSoC.}
-\section{Hidden services}
-\subsection{Scaling: how to handle many hidden services}
-\subsection{Performance: how to rendezvous with them quickly}
-\subsection{Authentication/authorization: how to tolerate DoS / load}
-\section{Tor as a general overlay network}
-\subsection{Choose paths / exit by country}
-\subsection{Easier to run your own private servers and have Tor use them
- anywhere in the path}
-\subsection{Easier to run an independent Tor network}
-\section{Code security/correctness}
-\subsection{veracode}
-\subsection{code audit}
-\subsection{more fuzzing tools}
-\subsection{build farm, better testing harness}
-\subsection{Long-overdue code refactoring and cleanup}
-\section{Protocol security}
-\subsection{safer circuit handshake}
-\subsection{protocol versioning for future compatibility}
-\subsection{cell sizes}
-\subsection{adapt to new key sizes, etc}
-
-\bibliographystyle{plain} \bibliography{tor-design}
-
-\end{document}
-
-
-
-
-\section{Code and design infrastructure}
-
-\subsection{Protocol revision}
-To maintain backward compatibility, we've postponed major protocol
-changes and redesigns for a long time. Because of this, there are a number
-of sensible revisions we've been putting off until we could deploy several of
-them at once. To do each of these, we first need to discuss design
-alternatives with other cryptographers and outside collaborators to
-make sure that our choices are secure.
-
-First of all, our protocol needs better {\bf versioning support} so that we
-can make backward-incompatible changes to our core protocol. There are
-difficult anonymity issues here, since many naive designs would make it easy
-to tell clients apart (and then track them) based on their supported versions.
-
-With protocol versioning support would come the ability to {\bf future-proof
- our ciphersuites}. For example, not only our OR protocol, but also our
-directory protocol, is pretty firmly tied to the SHA-1 hash function, which
-though not yet known to be insecure for our purposes, has begun to show
-its age. We should
-remove from our design every assumption that public
-keys, secret keys, or digests will remain any particular size indefinitely.
-
-Our OR {\bf authentication protocol}, though provably
-secure~\cite{tap:pet2006}, relies more on particular aspects of RSA and our
-implementation thereof than we had initially believed. To future-proof
-against changes, we should replace it with a less delicate approach.
-
-\plan{For all the above: 2 person-months to specify, spread over several
- months with time for interaction with external participants. One
- person-month to implement. Start specifying in early 2007.}
-
-We might design a {\bf stream migration} feature so that streams tunneled
-over Tor could be more resilient to dropped connections and changed IPs.
-\plan{Not in 2007.}
-
-A new protocol could support {\bf multiple cell sizes}. Right now, all data
-passes through the Tor network divided into 512-byte cells. This is
-efficient for high-bandwidth protocols, but inefficient for protocols
-like SSH or AIM that send information in small chunks. Of course, we need to
-investigate the extent to which multiple sizes could make it easier for an
-adversary to fingerprint a traffic pattern. \plan{Not in 2007.}
-
-As a part of our design, we should investigate possible {\bf cipher modes}
-other than counter mode. For example, a mode with built-in integrity
-checking, error propagation, and random access could simplify our protocol
-significantly. Sadly, many of these are patented and unavailable to us.
-\plan{Not in 2007.}
-
-\subsection{Scalability}
-
-\subsubsection{Improved directory efficiency}
-
-We should {\bf have routers upload their descriptors even less often}, so
-that clients do not need to download replacements every 18 hours whether any
-information has changed or not. (As of Tor 0.1.2.3-alpha, clients tolerate
-routers that don't upload often, but routers still upload at least every 18
-hours to support older clients.) \plan{Must do, but not until 0.1.1.x is
-deprecated in mid 2007. 1 week.}
-
-\subsubsection{Non-clique topology}
-Our current network design achieves a certain amount of its anonymity by
-making clients act like each other through the simple expedient of making
-sure that all clients know all servers, and that any server can talk to any
-other server. But as the number of servers increases to serve an
-ever-greater number of clients, these assumptions become impractical.
-
-At worst, if these scalability issues become troubling before a solution is
-found, we can design and build a solution to {\bf split the network into
-multiple slices} until a better solution comes along. This is not ideal,
-since rather than looking like all other users from a point of view of path
-selection, users would ``only'' look like 200,000--300,000 other
-users.\plan{Not unless needed.}
-
-We are in the process of designing {\bf improved schemes for network
- scalability}. Some approaches focus on limiting what an adversary can know
-about what a user knows; others focus on reducing the extent to which an
-adversary can exploit this knowledge. These are currently in their infancy,
-and will probably not be needed in 2007, but they must be designed in 2007 if
-they are to be deployed in 2008.\plan{Design in 2007; unknown difficulty.
- Write a paper.}
-
-\subsubsection{Relay incentives}
-To support more users on the network, we need to get more servers. So far,
-we've relied on volunteerism to attract server operators, and so far it's
-served us well. But in the long run, we need to {\bf design incentives for
- users to run servers} and relay traffic for others. Most obviously, we
-could try to build the network so that servers offered improved service for
-other servers, but we would need to do so without weakening anonymity and
-making it obvious which connections originate from users running servers. We
-have some preliminary designs~\cite{incentives-txt,tor-challenges},
-but need to perform
-some more research to make sure they would be safe and effective.\plan{Write
- a draft paper; 2 person-months.}
-(XXX we did that)
-
-\subsection{Portability}
-Our {\bf Windows implementation}, though much improved, continues to lag
-behind Unix and Mac OS X, especially when running as a server. We hope to
-merge promising patches from Christian King to address this point, and bring
-Windows performance on par with other platforms.\plan{Do in 2007; 1.5 months
- to integrate not counting Mike's work.}
-
-We should have {\bf better support for portable devices}, including modes of
-operation that require less RAM, and that write to disk less frequently (to
-avoid wearing out flash RAM).\plan{Optional; 2 weeks.}
-
-\subsection{Performance: resource usage}
-We've been working on {\bf using less RAM}, especially on servers. This has
-paid off a lot for directory caches in the 0.1.2.x series, which in some cases are
-using 90\% less memory than they used to require. But we can do better,
-especially in the area around our buffer management algorithms, by using an
-approach more like the one the BSD and Linux kernels use instead of our current ring
-buffer approach. (For OR connections, we can just use queues of cell-sized
-chunks produced with a specialized allocator.) This could potentially save
-around 25 to 50\% of the memory currently allocated for network buffers, and
-make Tor a more attractive proposition for restricted-memory environments
-like old computers, mobile devices, and the like.\plan{Do in 2007; 2-3 weeks
- plus one week measurement.} (XXX We did this, but we need to do something
-more/else.)
-
-\subsection{Performance: network usage}
-We know too little about how well our current path
-selection algorithms actually spread traffic around the network in practice.
-We should {\bf research the efficacy of our traffic allocation} and either
-assure ourselves that it is close enough to optimal to need no improvement
-(unlikely) or {\bf identify ways to improve network usage}, and get more
-users' traffic delivered faster. Performing this research will require
-careful thought about anonymity implications.
-
-We should also {\bf examine the efficacy of our congestion control
- algorithm}, and see whether we can improve client performance in the
-presence of a congested network through dynamic `sendme' window sizes or
-other means. This will have anonymity implications too if we aren't careful.
-
-\plan{For both of the above: research, design and write
- a measurement tool in 2007: 1 month. See if we can interest a graduate
- student.}
-
-We should work on making Tor's cell-based protocol perform better on
-networks with low bandwidth
-and high packet loss.\plan{Do in 2007 if we're funded to do it; 4-6 weeks.}
-
-\subsection{Performance scenario: one Tor client, many users}
-We should {\bf improve Tor's performance when a single Tor handles many
- clients}. Many organizations want to manage a single Tor client on their
-firewall for many users, rather than having each user install a separate
-Tor client. We haven't optimized for this scenario, and it is likely that
-there are some code paths in the current implementation that become
-inefficient when a single Tor is servicing hundreds or thousands of client
-connections. (Additionally, it is likely that such clients have interesting
-anonymity requirements that we should investigate.) We should profile Tor
-under appropriate loads, identify bottlenecks, and fix them.\plan{Do in 2007
- if we're funded to do it; 4-8 weeks.}
-
-\subsection{Tor servers on asymmetric bandwidth}
-
-Tor should work better on servers that have asymmetric connections like cable
-or DSL. Because Tor has separate TCP connections between each
-hop, if the incoming bytes are arriving just fine and the outgoing bytes are
-all getting dropped on the floor, the TCP push-back mechanisms don't really
-transmit this information back to the incoming streams.\plan{Do in 2007 since
- related to bandwidth limiting. 3-4 weeks.}
-
-\subsection{Running Tor as both client and server}
-
-There are many performance tradeoffs and balances that might need more attention.
-We first need to track and fix whatever bottlenecks emerge; but we also
-need to invent good algorithms for prioritizing the client's traffic
-without starving the server's traffic too much.\plan{No idea; try
-profiling and improving things in 2007.}
-
-\subsection{Protocol redesign for UDP}
-Tor has relayed only TCP traffic since its first versions, and has used
-TLS-over-TCP to do so. This approach has proved reliable and flexible, but
-in the long term we will need to allow UDP traffic on the network, and switch
-some or all of the network to using a UDP transport. {\bf Supporting UDP
- traffic} will make Tor more suitable for protocols that require UDP, such
-as many VOIP protocols. {\bf Using a UDP transport} could greatly reduce
-resource limitations on servers, and make the network far less interruptible
-by lossy connections. Either of these protocol changes would require a great
-deal of design work, however. We hope to be able to enlist the aid of a few
-talented graduate students to assist with the initial design and
-specification, but the actual implementation will require significant testing
-of different reliable transport approaches.\plan{Maybe do a design in 2007 if
-we find an interested academic. Ian or Ben L might be good partners here.}
-
-\section{Blocking resistance}
-
-\subsection{Design for blocking resistance}
-We have written a design document explaining our general approach to blocking
-resistance. We should workshop it with other experts in the field to get
-their ideas about how we can improve Tor's efficacy as an anti-censorship
-tool.
-
-\subsection{Implementation: client-side and bridges-side}
-
-Bridges will want to be able to {\bf listen on multiple addresses and ports}
-if they can, to give the adversary more ports to block.
-
-\subsection{Research: anonymity implications from becoming a bridge}
-
-See arma's bridge proposal; e.g., should bridge users use a second layer of
-entry guards?
-
-\subsection{Implementation: bridge authority}
-
-We run some
-directory authorities with a slightly modified protocol that doesn't leak
-the entire list of bridges. Thus users can learn up-to-date information
-for bridges they already know about, but they can't learn about arbitrary
-new bridges.
-
-We need a design for distributing the bridge authority over more than one
-server.
-
-\subsection{Normalizing the Tor protocol on the wire}
-Additionally, we should {\bf resist content-based filters}. Though an
-adversary can't see what users are saying, some aspects of our protocol are
-easy to fingerprint {\em as} Tor. We should correct this where possible.
-
-Look like Firefox; or look like nothing?
-Future research: investigate timing similarities with other protocols.
-
-\subsection{Research: scanning-resistance}
-
-\subsection{Research/Design/Impl: how users discover bridges}
-Our design anticipates an arms race between discovery methods and censors.
-We need to begin the infrastructure on our side quickly, preferably in a
-flexible language like Python, so we can adapt quickly to censorship.
-
-phase one: personal bridges \\
-phase two: families of personal bridges \\
-phase three: more structured social network \\
-phase four: bag of tricks \\
-Research: phase five...
-
-Integration with Psiphon, etc?
-
-\subsection{Document best practices for users}
-Document best practices for various activities common among
-blocked users (e.g. WordPress use).
-
-\subsection{Research: how to know if a bridge has been blocked?}
-
-\subsection{GeoIP maintenance, and ``private'' user statistics}
-How to know if the whole idea is working?
-
-\subsection{Research: hiding whether the user is reading or publishing?}
-
-\subsection{Research: how many bridges do you need to know to maintain
-reachability?}
-
-\subsection{Resisting censorship of the Tor website, docs, and mirrors}
-
-We should take some effort to consider {\bf initial distribution of Tor and
- related information} in countries where the Tor website and mirrors are
-censored. (Right now, most countries that block access to Tor block only the
-main website and leave mirrors and the network itself untouched.) Falling
-back on word-of-mouth is always a good last resort, but we should also take
-steps to make sure it's relatively easy for users to get ahold of a copy.
-
-\section{Security}
-
-\subsection{Security research projects}
-
-We should investigate approaches with some promise to help Tor resist
-end-to-end traffic correlation attacks. It's an open research question
-whether (and to what extent) {\bf mixed-latency} networks, {\bf low-volume
- long-distance padding}, or other approaches can resist these attacks, which
-are currently some of the most effective against careful Tor users. We
-should research these questions and perform simulations to identify
-opportunities for strengthening our design without dropping performance to
-unacceptable levels. %Cite something
-\plan{Start doing this in 2007; write a paper. 8-16 weeks.}
-
-We've got some preliminary results suggesting that {\bf a topology-aware
- routing algorithm}~\cite{feamster:wpes2004} could reduce Tor users'
-vulnerability to local or ISP-level adversaries, by ensuring that such
-adversaries are never in a position to watch both ends of a connection. We need to
-examine the effects of this approach in more detail and consider side-effects
-on anonymity against other kinds of adversaries. If the approach still looks
-promising, we should investigate ways for clients to implement it (or an
-approximation of it) without having to download routing tables for the whole
-Internet. \plan{Not in 2007 unless a graduate student wants to do it.}
-
-%\tmp{defenses against end-to-end correlation} We don't expect any to work
-%right now, but it would be useful to learn that one did. Alternatively,
-%proving that one didn't would free up researchers in the field to go work on
-%other things.
-%
-% See above; I think I got this.
-
-We should research the efficacy of {\bf website fingerprinting} attacks,
-wherein an adversary tries to match the distinctive traffic and timing
-pattern of the resources constituting a given website to the traffic pattern
-of a user's client. These attacks work great in simulations, but in
-practice we hear they don't work nearly as well. We should get some actual
-numbers to investigate the issue, and figure out what's going on. If our
-design already resists these attacks, we should confirm it; if we can improve
-the design to resist them, we should.
-% add cites
-\plan{Possibly part of end-to-end correlation paper. Otherwise, not in 2007
- unless a graduate student is interested.}
-
-\subsection{Implementation security}
-
-We should {\bf mark RAM that holds key material as non-swappable} so
-that there is no risk of recovering key material from a hard disk
-compromise. This would require submitting patches upstream to OpenSSL, where
-support for marking memory as sensitive is currently in a very preliminary
-state.\plan{Nice to do, but not in immediate Tor scope.}
-
-There are numerous tools for identifying trouble spots in code (such as
-Coverity or even VS2005's code analysis tool) and we should convince somebody
-to run some of them against the Tor codebase. Ideally, we could figure out a
-way to get our code checked periodically rather than just once.\plan{Almost
- no time once we talk somebody into it.}
-
-We should try {\bf protocol fuzzing} to identify errors in our
-implementation.\plan{Not in 2007 unless we find a grad student or
- undergraduate who wants to try.}
-
-Our guard nodes help prevent an attacker from being able to become a chosen
-client's entry point by having each client choose a few favorite entry points
-as ``guards'' and stick to them. We should implement a {\bf directory
- guards} feature to keep adversaries from enumerating Tor users by acting as
-a directory cache.\plan{Do in 2007; 2 weeks.}
-
-\subsection{Detect corrupt exits and other servers}
-With the success of our network, we've attracted servers in many locations,
-operated by many kinds of people. Unfortunately, some of these locations
-have compromised or defective networks, and some of these people are
-untrustworthy or incompetent. Our current design relies on authority
-administrators to identify bad nodes and mark them as nonfunctioning. We
-should {\bf automate the process of identifying malfunctioning nodes} as
-follows:
-
-We should create a generic {\bf feedback mechanism for add-on tools} like
-Mike Perry's ``Snakes on a Tor'' to report failing nodes to authorities.
-\plan{Do in 2006; 1-2 weeks.}
-
-We should write tools to {\bf detect more kinds of innocent node failure},
-such as nodes whose network providers intercept SSL, nodes whose network
-providers censor popular websites, and so on. We should also try to detect
-{\bf routers that snoop traffic}; we could do this by launching connections
-to throwaway accounts, and seeing which accounts get used.\plan{Do in 2007;
- ask Mike Perry if he's interested. 4-6 weeks.}
-
-We should add {\bf an efficient way for authorities to mark a set of servers
- as probably collaborating} though not necessarily otherwise dishonest.
-This happens when an administrator starts multiple routers, but doesn't mark
-them as belonging to the same family.\plan{Do during v2.1 directory protocol
- redesign; 1-2 weeks to implement.}
-
-To avoid attacks where an adversary claims good performance in order to
-attract traffic, we should {\bf have authorities measure node performance}
-(including stability and bandwidth) themselves, and not simply believe what
-they're told. We also measure stability by tracking MTBF. Measuring
-bandwidth will be tricky, since it's hard to distinguish between a server with
-low capacity, and a high-capacity server with most of its capacity in
-use. See also Nikita's NDSS 2008 paper.\plan{Do it if we can interest
-a grad student.}
-
-{\bf Operating a directory authority should be easier.} We rely on authority
-operators to keep the network running well, but right now their job involves
-too much busywork and administrative overhead. A better interface for them
-to use could free their time to work on exception cases rather than on
-adding named nodes to the network.\plan{Do in 2007; 4-5 weeks.}
-
-\subsection{Protocol security}
-
-In addition to other protocol changes discussed above,
-% And should we move some of them down here? -NM
-we should add {\bf hooks for denial-of-service resistance}; we have some
-preliminary designs, but we shouldn't postpone them until we really need them.
-If somebody tries a DDoS attack against the Tor network, we won't want to
-wait for all the servers and clients to upgrade to a new
-version.\plan{Research project; do this in 2007 if funded.}
-
-\section{Development infrastructure}
-
-\subsection{Build farm}
-We've begun to deploy a cross-platform distributed build farm of hosts
-that build and test the Tor source every time it changes in our development
-repository.
-
-We need to {\bf get more participants}, so that we can test a larger variety
-of platforms. (Previously, we found out that our code had broken on an
-obscure platform only when somebody got around to building it there.)
-
-We also need to {\bf add our dependencies} to the build farm, so that we can
-ensure that libraries we need (especially libevent) do not stop working on
-any important platform between one release and the next.
-
-\plan{This is ongoing as more buildbots arrive.}
-
-\subsection{Improved testing harness}
-Currently, our {\bf unit tests} cover only about 20\% of the code base. This
-is uncomfortably low; we should write more and switch to a more flexible
-testing framework.\plan{Ongoing basis, time permitting.}
-
-We should also write flexible {\bf automated single-host deployment tests} so
-we can more easily verify that the current codebase works with the
-network.\plan{Worthwhile in 2007; would save lots of time. 2-4 weeks.}
-
-We should build automated {\bf stress testing} frameworks so we can see which
-realistic loads cause Tor to perform badly, and regularly profile Tor against
-these loads. This would give us {\it in vitro} performance values to
-supplement our deployment experience.\plan{Worthwhile in 2007; 2-6 weeks.}
-
-We should improve our memory profiling code.\plan{...}
-
-
-\subsection{Centralized build system}
-We currently rely on a separate packager to maintain the packaging system and
-to build Tor on each platform for which we distribute binaries. Having separate
-package maintainers is sensible, but having separate package builders has meant
-long turnaround times between source releases and package releases. We
-should create the necessary infrastructure for us to produce binaries for all
-major packages within an hour or so of source release.\plan{We should
- brainstorm this at least in 2007.}
-
-\subsection{Improved metrics}
-We need a way to {\bf measure the network's health, capacity, and degree of
- utilization}. Our current means for doing this are ad hoc and not
-completely accurate.
-
-We need better ways to {\bf tell which countries our users are coming from,
- and how many there are}. A good perspective of the network helps us
-allocate resources and identify trouble spots, but our current approaches
-will work less and less well as we make it harder for adversaries to
-enumerate users. We'll probably want to shift to a smarter, statistical
-approach rather than our current ``count and extrapolate'' method.
-
-\plan{All of this in 2007 if funded; 4-8 weeks}
-
-% \tmp{We'd like to know how much of the network is getting used.}
-% I think this is covered above -NM
-
-\subsection{Controller library}
-We've done lots of design and development on our controller interface, which
-allows UI applications and other tools to interact with Tor. We could
-encourage the development of more such tools by releasing a {\bf
- general-purpose controller library}, ideally with API support for several
-popular programming languages.\plan{2006 or 2007; 1-2 weeks.}
-
-\section{User experience}
-
-\subsection{Get blocked less, get blocked less broadly}
-Right now, some services block connections from the Tor network because
-they don't have a better
-way to keep vandals from abusing them than blocking IP addresses associated
-with vandalism. Our approach so far has been to educate them about better
-solutions that currently exist, but we should also {\bf create better
-solutions for limiting vandalism by anonymous users} like credential and
-blind-signature based implementations, and encourage their use. Other
-promising starting points include writing a patch and explanation for
-Wikipedia, and helping Freenode to document, maintain, and expand its
-current Tor-friendly position.\plan{Do a writeup here in 2007; 1-2 weeks.}
-
-Those who do block Tor users also block overbroadly, sometimes blacklisting
-operators of Tor servers that do not permit exit to their services. We could
-obviate innocent reasons for doing so by designing a {\bf narrowly-targeted Tor
- RBL service} so that those who wanted to overblock Tor could no longer
-plead incompetence.\plan{Possibly in 2007 if we decide it's a good idea; 3
- weeks.}
-
-\subsection{All-in-one bundle}
-We need a well-tested, well-documented bundle of Tor and supporting
-applications configured to use it correctly. We have an initial
-implementation well under way, but it will need additional work in
-identifying requisite Firefox extensions, identifying security threats,
-improving user experience, and so on. This will need significantly more work
-before it's ready for a general public release.
-
-\subsection{LiveCD Tor}
-We need a nice bootable LiveCD containing a minimal OS and a few applications
-configured to use Tor correctly. The Anonym.OS project demonstrated that this
-is quite feasible, but their project is not currently maintained.
-
-\subsection{A Tor client in a VM}
-\tmp{a.k.a JanusVM} which is quite related to the firewall-level deployment
-section below. JanusVM is a Linux kernel running in VMware. It gets an IP
-address from the network, and serves as a DHCP server for its host Windows
-machine. It intercepts all outgoing traffic and redirects it into Privoxy,
-Tor, etc. This Linux-in-Windows approach may help us with scalability in
-the short term, and it may also be a good long-term solution rather than
-accepting all security risks in Windows.
-
-%\subsection{Interface improvements}
-%\tmp{Allow controllers to manipulate server status.}
-% (Why is this in the User Experience section?) -RD
-% I think it's better left to a generic ``make controller iface better'' item.
-
-\subsection{Firewall-level deployment}
-Another useful deployment mode for some users is using {\bf Tor in a firewall
- configuration}, and directing all their traffic through Tor. This can be a
-little tricky to set up currently, but it's an effective way to make sure no
-traffic leaves the host un-anonymized. To achieve this, we need to {\bf
- improve and port our new TransPort} feature which allows Tor to be used
-without SOCKS support; to {\bf add an anonymizing DNS proxy} feature to Tor;
-and to {\bf construct a recommended set of firewall configurations} to redirect
-traffic to Tor.
-
-This is an area where {\bf deployment via a livecd}, or an installation
-targeted at specialized home routing hardware, could be useful.
-
-\subsection{Assess software and configurations for anonymity risks}
-Right now, users and packagers are more or less on their own when selecting
-Firefox extensions. We should {\bf assemble a recommended list of browser
- extensions} through experiment, and include this in the application bundles
-we distribute.
-
-We should also describe {\bf best practices for using Tor with each class of
- application}. For example, Ethan Zuckerman has written a detailed
-tutorial on how to use Tor, Firefox, Gmail, and WordPress to blog with
-improved safety. There are many other cases on the Internet where anonymity
-would be helpful, and there are a lot of ways to screw up using Tor.
-
-The Foxtor and Torbutton extensions serve similar purposes; we should pick a
-favorite, and merge in the useful features of the other.
-
-%\tmp{clean up our own bundled software:
-%E.g. Merge the good features of Foxtor into Torbutton}
-%
-% What else did you have in mind? -NM
-
-\subsection{Localization}
-Right now, most of our user-facing code is internationalized. We need to
-internationalize the last few hold-outs (like the Tor expert installer), and get
-more translations for the parts that are already internationalized.
-
-Also, we should look into a {\bf unified translator's solution}. Currently,
-since different tools have been internationalized using the
-framework-appropriate method, different tools require translators to localize
-them via different interfaces. As far as possible, we should make it so
-translators need only a single tool to translate the whole Tor suite.
-
-\section{Support}
-
-It would be nice to set up some {\bf user support infrastructure} and
-{\bf contributor support infrastructure}, especially focusing on server
-operators and on coordinating volunteers.
-
-This includes intuitive and easy ticket systems for bug reports and
-feature suggestions (not just mailing lists with a half dozen people
-and no clear roles for who answers what), but it also includes a more
-personalized and efficient framework for interaction so we keep the
-attention and interest of the contributors, and so we make them feel
-helpful and wanted.
-
-\section{Documentation}
-
-\subsection{Unified documentation scheme}
-
-We need to {\bf inventory our documentation.} Our documentation so far has
-been mostly produced on an {\it ad hoc} basis, in response to particular
-needs and requests. We should figure out what documentation we have, which of
-it (if any) should get priority, and whether we can't put it all into a
-single format.
-
-We could {\bf unify the docs} into a single book-like thing. This will also
-help us identify what sections of the ``book'' are missing.
-
-\subsection{Missing technical documentation}
-
-We should {\bf revise our design paper} to reflect the new decisions we've
-made and the research we've done since it was published in 2004. This will help other
-researchers evaluate and suggest improvements to Tor's current design.
-
-Other projects sometimes implement the client side of our protocol. We
-encourage this, but we should write {\bf a document about how to avoid
-excessive resource use}, so we don't need to worry that they will do so
-without regard to the effect of their choices on server resources.
-
-\subsection{Missing user documentation}
-
-Our documentation falls into two broad categories: some is `discursive' and
-explains in detail why users should take certain actions, and other
-documentation is `comprehensive' and describes all of Tor's features. Right
-now, we have no document that is deep, readable, and thorough. We
-should correct this by identifying missing spots in our documentation.
-
-\bibliographystyle{plain} \bibliography{tor-design}
-
-\end{document}
-