Diffstat (limited to 'doc')
98 files changed, 2365 insertions, 584 deletions
diff --git a/doc/HACKING b/doc/HACKING index 50b5d80d18..3d3f2c1dfc 100644 --- a/doc/HACKING +++ b/doc/HACKING @@ -11,12 +11,20 @@ 0.1. Useful command-lines that are non-trivial to reproduce but can help with tracking bugs or leaks. +0.1.1. Dmalloc + dmalloc -l ~/dmalloc.log (run the commands it tells you) ./configure --with-dmalloc +0.2.2. Valgrind + valgrind --leak-check=yes --error-limit=no --show-reachable=yes src/or/tor +(Note that if you get a zillion openssl warnings, you will also need to + pass --undef-value-errors=no to valgrind, or rebuild your openssl + with -DPURIFY.) + 0.2. Running gcov for unit test coverage make clean @@ -4,8 +4,8 @@ We've split out our TODO into three files: TODO.02x is the list of items we're planning to get done in the next stable release. -TODO.external is the list of external constraints and deliverables that -we all need to keep in mind. +TODO.external lives in svn under /projects/todo/. It's the list of +external constraints and deliverables that we all need to keep in mind. TODO.future is the list of other items we plan to get to in later releases. diff --git a/doc/TODO.021 b/doc/TODO.021 index 881ba5ee4b..37c5b9845b 100644 --- a/doc/TODO.021 +++ b/doc/TODO.021 @@ -1,4 +1,3 @@ -$Id$ Legend: SPEC!! - Not specified SPEC - Spec not finalized diff --git a/doc/TODO.022 b/doc/TODO.022 index 3eeae006cb..f4fe2ebb2a 100644 --- a/doc/TODO.022 +++ b/doc/TODO.022 @@ -8,14 +8,17 @@ NOTE 2: It's easy to list stuff like this with no time estimates and 0.2.2, figure out how long the stuff we want will take, and triage accordingly, or vice versa. -- Design +- Design only - Begin design work for UDP transition; identify areas where we need to make changes or instrument stuff early. + [multiple weeks, ongoing. Need to do a draft early.] - Performance, mostly protocol-neutral. - Work with Libevent 2.0's bufferevent interface - Identify any performance stuff we need to push back into libevent to make it as fast as we want. 
+ - Get a decent rate-limiting feature into Libevent + - Get openssl support into Libevent. - Revise how we do bandwidth limiting and round-robining between circuits on a connection. @@ -30,21 +33,76 @@ NOTE 2: It's easy to list stuff like this with no time estimates and - Figure out good ways to instrument Tor internals so we can tell how well our bandwidth and flow-control stuff is actually working. + - What ports eat the bandwidth? + - How full do queues get? + - How much latency do queues get? -- Features + - Rate limit at clients: + - Give clients an upper bound on how much they're willing to use + the network if they're not relaying? + - ... or group client circuits by IP at the server and rate-limit + like that. + + - Use if-modified-since to download consensuses + + +- Other features - Proposals to implement: - - 146: reflect long-term stability + - 146: reflect long-term stability in consensuses - 147: Stop using v2 directories to generate v3 votes. + - Start pinging as soon as we learn about a relay, not on a + 22-minute cycle. Prioritize new and volatile relays for + testing. - Proposals to improve and implement - 158: microdescriptors + o Revise proposal + - Implement + o 160: list bandwidth in consensus + o Finish proposal + o and actually set it reasonably + o and actually use it. - Proposals to improve and implement if not broken - - IPv6 support. (Parts of 117, but figure out how to handle DNS + D IPv6 support. (Parts of 117, but figure out how to handle DNS requests.) - 140: Directory diffs + - Need a decent simple C diff implementation. + - Need a decent simple C ed patch implementation. - 149: learn info from netinfo cells. - - 134: handle authority fragmentation (Needs more analysis) + o Start discussion + - Revise proposal based on discussion. 
+ X 134: handle authority fragmentation (Needs more analysis) + - 165: Easy migration for voting authority sets + - 163: Detect client-status better + o Write proposal + - Possibly implement, depending on discussion. + - 164: Have authorities report relay and voting status better: make it + easy to answer, "Why is my server not listed/not Guard/not + Running/etc" + o Write proposal + - Possibly implement, depending on discussion + - 162: Have consensuses come in multiple "flavours". + o Write proposal + - Possibly implement, depending on discussion. + + - Needs a proposal, or at least some design + - Weaken the requirements for being a Guard, based on K's + measurements. +K - Finish measurements +K? - Write proposal + - Adaptive timeouts for giving up on circuits and streams. +M - Revise proposal 151 + - Downweight guards more sensibly: be more forgiving about using + Guard nodes as non-first-hop. + - Write proposal. + - Lagged weight updates in consensuses: don't just move abruptly. +M? - Write proposal + d Don't kill a circuit on the first failed extend. + +- Installers + - Switch to MSI on win32 + - Use Thandy, perhaps? - Deprecations - Make .exit safe, or make it off-by-default. diff --git a/doc/TODO.external b/doc/TODO.external index c02d6aca54..2e7e536efc 100644 --- a/doc/TODO.external +++ b/doc/TODO.external @@ -1,196 +1,4 @@ -$Id$ -Legend: -SPEC!! - Not specified -SPEC - Spec not finalized -N - nick claims -R - arma claims -P - phobos claims -S - Steven claims -E - Matt claims -M - Mike claims -J - Jeff claims -I - ioerror claims -W - weasel claims -K - Karsten claims -C - coderman claims - - Not done - * Top priority - . Partially done - o Done - d Deferrable - D Deferred - X Abandoned -======================================================================= - -External constraints: - -For June/July: -NR - Work more on Paul's NRL research problem. - -For March 22: -I * Email auto-responder - * teach gettor how to ask for (and attach) split files. - -K . 
Metrics. - . With Mike's help, use Torflow to start doing monthly rudimentary - performance evaluations: - . Circuit throughput and latency - - Measure via Broadband and dialup - . Publish a report addressing key long-term metrics questions: - . What metrics should we present? - . What data are available for these metrics? - . What data are missing, and can collect them safely? Can we - publish them safely? - . What systems are available to present this data? - -E . Vidalia improvements - o Vidalia displays by-country user summary for bridge operators -? - write a help page for vidalia, "what is this" - -For mid August: - -Section 0, items that didn't make it into the original roadmap: - -0.1, installers and packaging -C . i18n for the msi bundle files -P . more consistent TBB builds -IC- get a buildbot up again. Have Linux and BSD build machines. - (Windows would be nice but realistically will come later.) -E - Get Tor to work properly on the iPhone. - -3.1, performance work. [Section numbers in here are from performance.pdf] - - High-priority items from performance.pdf -RS - 1.2, new circuit window sizes. make the default package window lower. -R+ - 2.1, squeeze loud circuits - - Evaluate the code to see what stats we can keep about circuit use. - - Write proposals for various meddling. Look at the research papers - that Juliusz pointed us to. Ask our systems friends. Plan to put - a lot of the parameters in the consensus, so we can tune it with - short turnaround times. -E+ - 2.5, Change Vidalia's default exit policy to not click "other - protocols". Or choose not to. Think this through first. -R+ - 2.6, Tell users not to file-share. - - Put statement on the Tor front page - - Put statement on the download pages too - - And the FAQ - - 3.1.2, Tor weather -I - Implement time-to-notification (immediate, a day, a week) -I - Get a relay operator mailing list going, with a plan and supporting - scripts and so on. 
-R - Link to them from the Tor relay page -R - and the torrc.sample? -SM - 4.1, balance traffic better - - Steven and Mike should decide if we should do Steven's plan - (rejigger the bandwidth numbers at the authorities based on - Steven's algorithm), or Mike's plan (relay scanning to identify - the unbalanced relays and fix them on the fly), or both. - - Figure out how to actually modify bandwidths in the consensus. We - may need to change the consensus voting algorithm to decide what - bandwidth to advertise based on something other than median: - if 7 authorities provide bandwidths, and 2 are doing scanning, - then the 5 that aren't scanning will outvote any changes. Should - all 7 scan? Should only some vote? Extra points if it doesn't - change all the numbers every new consensus, so consensus diffing - is still practical. -? - 4.5, Older entry guards are overloaded - - Pick a conservative timeout like a month, and implement. -M - 5.2, better timeouts for giving up on circuits/streams - - clients gather data about circuit timeouts, and then abandon - circuits that take more than a std dev above that. - -4.1, IOCP / libevent / windows / tor -N - get it working for nick -N - put out a release so other people can start testing it. -N - both the libevent buffer abstraction, and the - tor-uses-libevent-buffer-abstraction. Unless we think that's - unreachable for this milestone? - -4.2.1, risks from becoming a relay -S - Have a clear plan for how users who become relays will be safe, - and be confident that we can build this plan. - - evaluate all the various attacks that are made possible by relaying. - specifically, see "relaying-traffic attacks" in 6.6. 
- - identify and evaluate ways to make them not a big deal - - setting a low RelayBandwidth - - Nick Hopper's FC08 paper suggesting that we should do a modified - round-robin so we leak less about other circuits - - instructing clients to disable pings in their firewall, etc - - pick the promising ones, improve them so they're even better, and - spec them out so we know how to build them and how much effort is - involved in building them. - -4.5, clients download less directory info -N * deploy proposal 158. -N - decide whether to do proposal 140. if so, construct an implementation - plan for how we'll do it. if not, explain why not. - -5.1, Normalize TLS fingerprint -N o write a draft list of possible attacks for this section, with - estimates about difficulty of attack, difficulty of solution, etc -N - revisit the list and revise our plans as needed -NR- put up a blog post about the two contradictory conclusions: we can - discuss the theory of arms races, and our quandry, without revealing - any specific vulnerabilities. (or decide not to put up a blog post, - and explain why not.) - -5.5, email autoresponder -I . maintenance and keeping it running - -5.7.2, metrics - -XXX. - -6.2, Vidalia work -E - add breakpad support or similar for windows debugging -E o let vidalia change languages without needing a restart -E - Implement the status warning event interface started for the - phase one deliverables. -E - Work with Steve Tyree on building a Vidalia plugin API to enable - building Herdict and TBB plugins. 
- -6.3, Node scanning -M - Steps toward automation - - Set up email list for results - - Map failure types to potential BadExit lines -M - Improve the ability of SoaT to mimic various real web browsers - - randomizing user agents and locale strings - - caching, XMLHTTPRequest, form posting, content sniffing - - Investigate ideas like running Chrome/xulrunner in parallel -M - Other protocols - - SSH, IMAPS, POPS, SMTPS -M - Add ability to geolocalize exit selection based on scanner location - - Use this to rescan dynamic urls filtered by the URL filter - -6.4, Torbutton development -M - Resolve extension conflicts and other high priority bugs -M - Fix or hack around ugly firefox bugs, especially Timezone issue. - Definitely leaning towards "hack around" unless we see some - level of love from Mozilla. -M - Vidalia New Nym Integration - - Implement for Torbutton to pick up on Vidalia's NEWNYM and clear - cookies based on FoeBud's source - - Do this in such a way that we could adapt polipo to purge cache - if we were so inclined -M - Write up a summary of our options for dealing with the google - you-must-solve-a-captcha-to-search problem, and pick one as our - favorite option. - -6.6, Evaluate new anonymity attacks -S - relaying-traffic attacks - - original murdoch-danezis attack - - nick hopper's latency measurement attack - - columbia bandwidth measurement attack - - christian grothoff's long-circuit attack -S - client attacks - - website fingerprinting - -7.1, Tor VM Research, analysis, and prototyping -C . Get a working package out, meaning other people are testing it. - -7.2, Tor Browser Bundle -I - Port to one of OS X or Linux, and start the port to the other. -I . Make it the recommended Tor download on Windows -I - Make sure it's easy to un-brand TBB in case Firefox asks us to -I - Evaluate CCC's Freedom Stick +[This file moved to svn in /projects/todo/. More people can edit +it more easily there. 
-RD] diff --git a/doc/TODO.future b/doc/TODO.future index 64169ecfec..a6cc95150e 100644 --- a/doc/TODO.future +++ b/doc/TODO.future @@ -1,4 +1,3 @@ -$Id$ Legend: SPEC!! - Not specified SPEC - Spec not finalized diff --git a/doc/design-paper/latex8.bst b/doc/design-paper/latex8.bst index 2dd3249633..bae8e209ee 100644 --- a/doc/design-paper/latex8.bst +++ b/doc/design-paper/latex8.bst @@ -1,8 +1,6 @@ % --------------------------------------------------------------- % -% $Id$ -% % by Paolo.Ienne@di.epfl.ch % diff --git a/doc/design-paper/usenix.sty b/doc/design-paper/usenix.sty index 4442f11574..575c854e77 100644 --- a/doc/design-paper/usenix.sty +++ b/doc/design-paper/usenix.sty @@ -5,8 +5,6 @@ % \usepackage{usenix-2e} % and put {\rm ....} around the author names. % -% $Id$ -% % The following definitions are modifications of standard article.sty % definitions, arranged to do a better job of matching the USENIX % guidelines. diff --git a/doc/spec/address-spec.txt b/doc/spec/address-spec.txt index 2a84d857e6..2e1aff2b8a 100644 --- a/doc/spec/address-spec.txt +++ b/doc/spec/address-spec.txt @@ -1,4 +1,3 @@ -$Id$ Special Hostnames in Tor Nick Mathewson @@ -34,10 +33,13 @@ $Id$ "www.google.com.foo.exit=64.233.161.99.foo.exit" to speed subsequent lookups. + The .exit notation is disabled by default as of Tor 0.2.2.1-alpha, due + to potential application-level attacks. + EXAMPLES: www.example.com.exampletornode.exit - Connect to www.example.com from the node called "exampletornode." + Connect to www.example.com from the node called "exampletornode". exampletornode.exit @@ -54,15 +56,3 @@ $Id$ When Tor sees an address in this format, it tries to look up and connect to the specified hidden service. See rend-spec.txt for full details. -4. .noconnect - - SYNTAX: [string].noconnect - - When Tor sees an address in this format, it immediately closes the - connection without attaching it to any circuit. 
This is useful for - controllers that want to test whether a given application is indeed using - the same instance of Tor that they're controlling. - -5. [XXX Is there a ".virtual" address that we expose too, or is that -just intended to be internal? -RD] - diff --git a/doc/spec/bridges-spec.txt b/doc/spec/bridges-spec.txt index 4a9b373c8e..647118815c 100644 --- a/doc/spec/bridges-spec.txt +++ b/doc/spec/bridges-spec.txt @@ -1,4 +1,3 @@ -$Id$ Tor bridges specification diff --git a/doc/spec/control-spec-v0.txt b/doc/spec/control-spec-v0.txt index faf75a64a4..3515d395a6 100644 --- a/doc/spec/control-spec-v0.txt +++ b/doc/spec/control-spec-v0.txt @@ -1,4 +1,3 @@ -$Id$ TC: A Tor control protocol (Version 0) diff --git a/doc/spec/control-spec.txt b/doc/spec/control-spec.txt index 576c5dcd53..fc4242ea16 100644 --- a/doc/spec/control-spec.txt +++ b/doc/spec/control-spec.txt @@ -1,4 +1,3 @@ -$Id$ TC: A Tor control protocol (Version 1) @@ -220,7 +219,7 @@ $Id$ "INFO" / "NOTICE" / "WARN" / "ERR" / "NEWDESC" / "ADDRMAP" / "AUTHDIR_NEWDESCS" / "DESCCHANGED" / "STATUS_GENERAL" / "STATUS_CLIENT" / "STATUS_SERVER" / "GUARD" / "NS" / "STREAM_BW" / - "CLIENTS_SEEN" + "CLIENTS_SEEN" / "NEWCONSENSUS" Any events *not* listed in the SETEVENTS line are turned off; thus, sending SETEVENTS with an empty body turns off all event reporting. @@ -503,7 +502,7 @@ $Id$ start and the rest of the interval respectively. The 'interval-start' and 'interval-end' fields are the borders of the current interval; the 'interval-wake' field is the time within the current interval (if any) - where we plan[ned] to start being active. + where we plan[ned] to start being active. The times are GMT. "config/names" A series of lines listing the available configuration options. Each is @@ -563,14 +562,14 @@ $Id$ states. See Section 4.1.10 for explanations. (Only a few of the status events are available as getinfo's currently. Let us know if you want more exposed.) 
- "status/reachability/or" + "status/reachability-succeeded/or" 0 or 1, depending on whether we've found our ORPort reachable. - "status/reachability/dir" + "status/reachability-succeeded/dir" 0 or 1, depending on whether we've found our DirPort reachable. - "status/reachability" + "status/reachability-succeeded" "OR=" ("0"/"1") SP "DIR=" ("0"/"1") - Combines status/reachability/*; controllers MUST ignore unrecognized - elements in this entry. + Combines status/reachability-succeeded/*; controllers MUST ignore + unrecognized elements in this entry. "status/bootstrap-phase" Returns the most recent bootstrap phase status event sent. Specifically, it returns a string starting with either @@ -774,9 +773,8 @@ $Id$ Same as passing 'EXTENDED' to SETEVENTS; this is the preferred way to request the extended event syntax. - This will not be always-enabled until at least two stable releases - after 0.1.2.3-alpha, the release where it was first used for - anything. + This feature was first used in 0.1.2.3-alpha. It is always-on in + Tor 0.2.2.1-alpha and later. VERBOSE_NAMES @@ -787,8 +785,9 @@ $Id$ LongName format includes a Fingerprint, an indication of Named status, and a Nickname (if one is known). - This will not be always-enabled until at least two stable releases - after 0.1.2.2-alpha, the release where it was first available. + This will not be always-enabled until at least two stable + releases after 0.1.2.2-alpha, the release where it was first + available. It is always-on in Tor 0.2.2.1-alpha and later. 3.20. 
RESOLVE diff --git a/doc/spec/dir-spec-v1.txt b/doc/spec/dir-spec-v1.txt index 286df664e2..a92fc7999a 100644 --- a/doc/spec/dir-spec-v1.txt +++ b/doc/spec/dir-spec-v1.txt @@ -1,4 +1,3 @@ -$Id$ Tor Protocol Specification diff --git a/doc/spec/dir-spec-v2.txt b/doc/spec/dir-spec-v2.txt index 4873c4a728..d1be27f3db 100644 --- a/doc/spec/dir-spec-v2.txt +++ b/doc/spec/dir-spec-v2.txt @@ -1,4 +1,3 @@ -$Id$ Tor directory protocol, version 2 diff --git a/doc/spec/dir-spec.txt b/doc/spec/dir-spec.txt index 9a2a62bc46..16f121a19a 100644 --- a/doc/spec/dir-spec.txt +++ b/doc/spec/dir-spec.txt @@ -1,4 +1,3 @@ -$Id$ Tor directory protocol, version 3 @@ -594,7 +593,7 @@ $Id$ "allow-single-hop-exits" - [At most one.] + [At most once.] Present only if the router allows single-hop circuits to make exit connections. Most Tor servers do not support this: this is @@ -642,6 +641,200 @@ $Id$ "geoip-start" is the time at which we began collecting geoip statistics. + "dirreq-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL + [At most once.] + + YYYY-MM-DD HH:MM:SS defines the end of the included measurement + interval of length NSEC seconds (86400 seconds by default). + + A "dirreq-stats-end" line, as well as any other "dirreq-*" line, + is only added when the relay has opened its Dir port and after 24 + hours of measuring directory requests. + + "dirreq-v2-ips" CC=N,CC=N,... NL + [At most once.] + "dirreq-v3-ips" CC=N,CC=N,... NL + [At most once.] + + List of mappings from two-letter country codes to the number of + unique IP addresses that have connected from that country to + request a v2/v3 network status, rounded up to the nearest multiple + of 8. Only those IP addresses are counted that the directory can + answer with a 200 OK status code. + + "dirreq-v2-reqs" CC=N,CC=N,... NL + [At most once.] + "dirreq-v3-reqs" CC=N,CC=N,... NL + [At most once.] 
+ + List of mappings from two-letter country codes to the number of + requests for v2/v3 network statuses from that country, rounded up + to the nearest multiple of 8. Only those requests are counted that + the directory can answer with a 200 OK status code. + + "dirreq-v2-share" num% NL + [At most once.] + "dirreq-v3-share" num% NL + [At most once.] + + The share of v2/v3 network status requests that the directory + expects to receive from clients based on its advertised bandwidth + compared to the overall network bandwidth capacity. Shares are + formatted in percent with two decimal places. Shares are + calculated as means over the whole 24-hour interval. + + "dirreq-v2-resp" status=num,... NL + [At most once.] + "dirreq-v3-resp" status=nul,... NL + [At most once.] + + List of mappings from response statuses to the number of requests + for v2/v3 network statuses that were answered with that response + status, rounded up to the nearest multiple of 4. Only response + statuses with at least 1 response are reported. New response + statuses can be added at any time. The current list of response + statuses is as follows: + + "ok": a network status request is answered; this number + corresponds to the sum of all requests as reported in + "dirreq-v2-reqs" or "dirreq-v3-reqs", respectively, before + rounding up. + "not-enough-sigs: a version 3 network status is not signed by a + sufficient number of requested authorities. + "unavailable": a requested network status object is unavailable. + "not-found": a requested network status is not found. + "not-modified": a network status has not been modified since the + If-Modified-Since time that is included in the request. + "busy": the directory is busy. + + "dirreq-v2-direct-dl" key=val,... NL + [At most once.] + "dirreq-v3-direct-dl" key=val,... NL + [At most once.] + "dirreq-v2-tunneled-dl" key=val,... NL + [At most once.] + "dirreq-v3-tunneled-dl" key=val,... NL + [At most once.] 
+ + List of statistics about possible failures in the download process + of v2/v3 network statuses. Requests are either "direct" + HTTP-encoded requests over the relay's directory port, or + "tunneled" requests using a BEGIN_DIR cell over the relay's OR + port. The list of possible statistics can change, and statistics + can be left out from reporting. The current list of statistics is + as follows: + + Successful downloads and failures: + + "complete": a client has finished the download successfully. + "timeout": a download did not finish within 10 minutes after + starting to send the response. + "running": a download is still running at the end of the + measurement period for less than 10 minutes after starting to + send the response. + + Download times: + + "min", "max": smallest and largest measured bandwidth in B/s. + "d[1-4,6-9]": 1st to 4th and 6th to 9th decile of measured + bandwidth in B/s. For a given decile i, i/10 of all downloads + had a smaller bandwidth than di, and (10-i)/10 of all downloads + had a larger bandwidth than di. + "q[1,3]": 1st and 3rd quartile of measured bandwidth in B/s. One + fourth of all downloads had a smaller bandwidth than q1, one + fourth of all downloads had a larger bandwidth than q3, and the + remaining half of all downloads had a bandwidth between q1 and + q3. + "md": median of measured bandwidth in B/s. Half of the downloads + had a smaller bandwidth than md, the other half had a larger + bandwidth than md. + + "entry-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL + [At most once.] + + YYYY-MM-DD HH:MM:SS defines the end of the included measurement + interval of length NSEC seconds (86400 seconds by default). + + An "entry-stats-end" line, as well as any other "entry-*" + line, is first added after the relay has been running for at least + 24 hours. + + "entry-ips" CC=N,CC=N,... NL + [At most once.] 
+ + List of mappings from two-letter country codes to the number of + unique IP addresses that have connected from that country to the + relay and which are no known other relays, rounded up to the + nearest multiple of 8. + + "cell-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL + [At most once.] + + YYYY-MM-DD HH:MM:SS defines the end of the included measurement + interval of length NSEC seconds (86400 seconds by default). + + A "cell-stats-end" line, as well as any other "cell-*" line, + is first added after the relay has been running for at least 24 + hours. + + "cell-processed-cells" num,...,num NL + [At most once.] + + Mean number of processed cells per circuit, subdivided into + deciles of circuits by the number of cells they have processed in + descending order from loudest to quietest circuits. + + "cell-queued-cells" num,...,num NL + [At most once.] + + Mean number of cells contained in queues by circuit decile. These + means are calculated by 1) determining the mean number of cells in + a single circuit between its creation and its termination and 2) + calculating the mean for all circuits in a given decile as + determined in "cell-processed-cells". Numbers have a precision of + two decimal places. + + "cell-time-in-queue" num,...,num NL + [At most once.] + + Mean time cells spend in circuit queues in milliseconds. Times are + calculated by 1) determining the mean time cells spend in the + queue of a single circuit and 2) calculating the mean for all + circuits in a given decile as determined in + "cell-processed-cells". + + "cell-circuits-per-decile" num NL + [At most once.] + + Mean number of circuits that are included in any of the deciles, + rounded up to the next integer. + + "exit-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL + [At most once.] + + YYYY-MM-DD HH:MM:SS defines the end of the included measurement + interval of length NSEC seconds (86400 seconds by default). 
+ + An "exit-stats-end" line, as well as any other "exit-*" line, is + first added after the relay has been running for at least 24 hours + and only if the relay permits exiting (where exiting to a single + port and IP address is sufficient). + + "exit-kibibytes-written" port=N,port=N,... NL + [At most once.] + "exit-kibibytes-read" port=N,port=N,... NL + [At most once.] + + List of mappings from ports to the number of kibibytes that the + relay has written to or read from exit connections to that port, + rounded up to the next full kibibyte. + + "exit-streams-opened" port=N,port=N,... NL + [At most once.] + + List of mappings from ports to the number of opened exit streams + to that port, rounded up to the nearest multiple of 4. + "router-signature" NL Signature NL [At end, exactly once.] @@ -798,7 +991,7 @@ $Id$ documents are described in section XXX below. Status documents contain a preamble, an authority section, a list of - router status entries, and one more footers signature, in that order. + router status entries, and one or more footer signature, in that order. Unlike other formats described above, a SP in these documents must be a single space character (hex 20). @@ -1030,13 +1223,20 @@ $Id$ descriptors if they would cause "v" lines to be over 128 characters long. - "w" SP "Bandwidth=" INT NL + "w" SP "Bandwidth=" INT [SP "Measured=" INT] NL [At most once.] An estimate of the bandwidth of this server, in an arbitrary unit (currently kilobytes per second). Used to weight router - selection. Other weighting keywords may be added later. + selection. + + Additionally, the Measured= keyword is present in votes by + participating bandwidth measurement authorites to indicate + a measured bandwidth currently produced by measuring stream + capacities. + + Other weighting keywords may be added later. Clients MUST ignore keywords they do not recognize. "p" SP ("accept" / "reject") SP PortList NL @@ -1179,6 +1379,13 @@ $Id$ rate limit from the router descriptor. 
It is given in kilobytes per second, and capped at some arbitrary value (currently 10 MB/s). + The Measured= keyword on a "w" line vote is currently computed + by multiplying the previous published consensus bandwidth by the + ratio of the measured average node stream capacity to the network + average. If 3 or more authorities provide a Measured= keyword for + a router, the authorites produce a consensus containing a "w" + Bandwidth= keyword equal to the median of the Measured= votes. + The ports listed in a "p" line should be taken as those ports for which the router's exit policy permits 'most' addresses, ignoring any accept not for all addresses, ignoring all rejects for private @@ -1261,6 +1468,11 @@ $Id$ one, breaking ties in favor of the lexicographically larger vote.) The port list is encoded as specified in 3.4.2. + * If consensus-method 6 or later is in use and if 3 or more + authorities provide a Measured= keyword in their votes for + a router, the authorities produce a consensus containing a + Bandwidth= keyword equal to the median of the Measured= votes. + The signatures at the end of a consensus document are sorted in ascending order by identity digest. @@ -1281,6 +1493,7 @@ $Id$ "3" -- Added legacy ID key support to aid in authority ID key rollovers "4" -- No longer list routers that are not running in the consensus "5" -- adds support for "w" and "p" lines. + "6" -- Prefers measured bandwidth values rather than advertised Before generating a consensus, an authority must decide which consensus method to use. To do this, it looks for the highest version number diff --git a/doc/spec/path-spec.txt b/doc/spec/path-spec.txt index dceb21dad7..78f3b63bcb 100644 --- a/doc/spec/path-spec.txt +++ b/doc/spec/path-spec.txt @@ -1,4 +1,3 @@ -$Id$ Tor Path Specification @@ -72,6 +71,24 @@ of their choices. is unknown (usually its target IP), but we believe the path probably supports the request according to the rules given below. +1.1. 
A server's bandwidth + + Old versions of Tor did not report bandwidths in network status + documents, so clients had to learn them from the routers' advertised + server descriptors. + + For versions of Tor prior to 0.2.1.17-rc, everywhere below where we + refer to a server's "bandwidth", we mean its clipped advertised + bandwidth, computed by taking the smaller of the 'rate' and + 'observed' arguments to the "bandwidth" element in the server's + descriptor. If a router's advertised bandwidth is greater than + MAX_BELIEVABLE_BANDWIDTH (currently 10 MB/s), we clipped to that + value. + + For more recent versions of Tor, we take the bandwidth value declared + in the consensus, and fall back to the clipped advertised bandwidth + only if the consensus does not have bandwidths listed. + 2. Building circuits 2.1. When we build @@ -179,16 +196,13 @@ of their choices. multiple candidates for a path element, we choose randomly. For "fast" circuits, we pick a given router as an exit with probability - proportional to its advertised bandwidth [the smaller of the 'rate' and - 'observed' arguments to the "bandwidth" element in its descriptor]. If a - router's advertised bandwidth is greater than MAX_BELIEVABLE_BANDWIDTH - (currently 10 MB/s), we clip to that value. + proportional to its bandwidth. For non-exit positions on "fast" circuits, we pick routers as above, but - we weight the clipped advertised bandwidth of Exit-flagged nodes depending + we weight the bandwidth of Exit-flagged nodes depending on the fraction of bandwidth available from non-Exit nodes. Call the - total clipped advertised bandwidth for Exit nodes under consideration E, - and the total clipped advertised bandwidth for all nodes under + total bandwidth for Exit nodes under consideration E, + and the total bandwidth for all nodes under consideration T. If E<T/3, we do not consider Exit-flagged nodes. Otherwise, we weight their bandwidth with the factor (E-T/3)/E. 
This ensures that bandwidth is evenly distributed over nodes in 3-hop paths. @@ -306,7 +320,7 @@ of their choices. We use Guard nodes (also called "helper nodes" in the literature) to prevent certain profiling attacks. Here's the risk: if we choose entry and exit nodes at random, and an attacker controls C out of N servers - (ignoring advertised bandwidth), then the + (ignoring bandwidth), then the attacker will control the entry and exit node of any given circuit with probability (C/N)^2. But as we make many different circuits over time, then the probability that the attacker will see a sample of about (C/N)^2 diff --git a/doc/spec/proposals/000-index.txt b/doc/spec/proposals/000-index.txt index d75157650d..d2d3ca5d72 100644 --- a/doc/spec/proposals/000-index.txt +++ b/doc/spec/proposals/000-index.txt @@ -1,7 +1,5 @@ Filename: 000-index.txt Title: Index of Tor Proposals -Version: $Revision$ -Last-Modified: $Date$ Author: Nick Mathewson Created: 26-Jan-2007 Status: Meta @@ -56,7 +54,7 @@ Proposals by number: 131 Help users to verify they are using Tor [NEEDS-REVISION] 132 A Tor Web Service For Verifying Correct Browser Configuration [DRAFT] 133 Incorporate Unreachable ORs into the Tor Network [DRAFT] -134 More robust consensus voting with diverse authority sets [ACCEPTED] +134 More robust consensus voting with diverse authority sets [REJECTED] 135 Simplify Configuration of Private Tor Networks [CLOSED] 136 Mass authority migration with legacy keys [CLOSED] 137 Keep controllers informed as Tor bootstraps [CLOSED] @@ -82,6 +80,13 @@ Proposals by number: 157 Make certificate downloads specific [ACCEPTED] 158 Clients download consensus + microdescriptors [OPEN] 159 Exit Scanning [OPEN] +160 Authorities vote for bandwidth offsets in consensus [OPEN] +161 Computing Bandwidth Adjustments [OPEN] +162 Publish the consensus in multiple flavors [OPEN] +163 Detecting whether a connection comes from a client [OPEN] +164 Reporting the status of server votes [OPEN] +165 Easy 
migration for voting authority sets [OPEN] +166 Including Network Statistics in Extra-Info Documents [ACCEPTED] Proposals by status: @@ -103,14 +108,20 @@ Proposals by status: 156 Tracking blocked ports on the client side [for 0.2.?] 158 Clients download consensus + microdescriptors 159 Exit Scanning + 160 Authorities vote for bandwidth offsets in consensus [for 0.2.2.x] + 161 Computing Bandwidth Adjustments [for 0.2.2.x] + 162 Publish the consensus in multiple flavors [for 0.2.2] + 163 Detecting whether a connection comes from a client [for 0.2.2] + 164 Reporting the status of server votes [for 0.2.2] + 165 Easy migration for voting authority sets ACCEPTED: 110 Avoiding infinite length circuits [for 0.2.1.x] [in 0.2.1.3-alpha] 117 IPv6 exits [for 0.2.1.x] 118 Advertising multiple ORPorts at once [for 0.2.1.x] - 134 More robust consensus voting with diverse authority sets [for 0.2.2.x] 140 Provide diffs between consensuses [for 0.2.2.x] 147 Eliminate the need for v2 directories in generating v3 directories [for 0.2.1.x] 157 Make certificate downloads specific [for 0.2.1.x] + 166 Including Network Statistics in Extra-Info Documents [for 0.2.2] META: 000 Index of Tor Proposals 001 The Tor Proposal Process @@ -159,3 +170,5 @@ Proposals by status: 120 Shutdown descriptors when Tor servers stop 128 Families of private bridges 142 Combine Introduction and Rendezvous Points + REJECTED: + 134 More robust consensus voting with diverse authority sets diff --git a/doc/spec/proposals/001-process.txt b/doc/spec/proposals/001-process.txt index 3a767b5fa4..636ba2c2fa 100644 --- a/doc/spec/proposals/001-process.txt +++ b/doc/spec/proposals/001-process.txt @@ -1,7 +1,5 @@ Filename: 001-process.txt Title: The Tor Proposal Process -Version: $Revision$ -Last-Modified: $Date$ Author: Nick Mathewson Created: 30-Jan-2007 Status: Meta @@ -47,7 +45,7 @@ How to change the specs now: Like an RFC, every proposal gets a number. 
Unlike RFCs, proposals can change over time and keep the same number, until they are finally accepted or rejected. The history for each proposal - will be stored in the Tor Subversion repository. + will be stored in the Tor repository. Once a proposal is in the repository, we should discuss and improve it until we've reached consensus that it's a good idea, and that it's @@ -82,9 +80,7 @@ How new proposals get added: What should go in a proposal: Every proposal should have a header containing these fields: - Filename, Title, Version, Last-Modified, Author, Created, Status. - The Version and Last-Modified fields should use the SVN Revision and Date - tags respectively. + Filename, Title, Author, Created, Status. These fields are optional but recommended: Target, Implemented-In. @@ -97,7 +93,7 @@ What should go in a proposal: what the proposal's about, what it does, and about what state it's in. After the Overview, the proposal becomes more free-form. Depending on its - the length and complexity, the proposal can break into sections as + length and complexity, the proposal can break into sections as appropriate, or follow a short discursive format. Every proposal should contain at least the following information before it is "ACCEPTED", though the information does not need to be in sections with these names. 
diff --git a/doc/spec/proposals/098-todo.txt b/doc/spec/proposals/098-todo.txt index e891ea890c..a0bbbeb568 100644 --- a/doc/spec/proposals/098-todo.txt +++ b/doc/spec/proposals/098-todo.txt @@ -1,7 +1,5 @@ Filename: 098-todo.txt Title: Proposals that should be written -Version: $Revision$ -Last-Modified: $Date$ Author: Nick Mathewson, Roger Dingledine Created: 26-Jan-2007 Status: Meta diff --git a/doc/spec/proposals/099-misc.txt b/doc/spec/proposals/099-misc.txt index ba13ea2a71..a3621dd25f 100644 --- a/doc/spec/proposals/099-misc.txt +++ b/doc/spec/proposals/099-misc.txt @@ -1,7 +1,5 @@ Filename: 099-misc.txt Title: Miscellaneous proposals -Version: $Revision$ -Last-Modified: $Date$ Author: Various Created: 26-Jan-2007 Status: Meta diff --git a/doc/spec/proposals/100-tor-spec-udp.txt b/doc/spec/proposals/100-tor-spec-udp.txt index 8224682ec8..7f062222c5 100644 --- a/doc/spec/proposals/100-tor-spec-udp.txt +++ b/doc/spec/proposals/100-tor-spec-udp.txt @@ -1,7 +1,5 @@ Filename: 100-tor-spec-udp.txt Title: Tor Unreliable Datagram Extension Proposal -Version: $Revision$ -Last-Modified: $Date$ Author: Marc Liberatore Created: 23 Feb 2006 Status: Dead diff --git a/doc/spec/proposals/101-dir-voting.txt b/doc/spec/proposals/101-dir-voting.txt index be900a641e..634d3f1948 100644 --- a/doc/spec/proposals/101-dir-voting.txt +++ b/doc/spec/proposals/101-dir-voting.txt @@ -1,7 +1,5 @@ Filename: 101-dir-voting.txt Title: Voting on the Tor Directory System -Version: $Revision$ -Last-Modified: $Date$ Author: Nick Mathewson Created: Nov 2006 Status: Closed diff --git a/doc/spec/proposals/102-drop-opt.txt b/doc/spec/proposals/102-drop-opt.txt index 8f6a38ae6c..490376bb53 100644 --- a/doc/spec/proposals/102-drop-opt.txt +++ b/doc/spec/proposals/102-drop-opt.txt @@ -1,7 +1,5 @@ Filename: 102-drop-opt.txt Title: Dropping "opt" from the directory format -Version: $Revision$ -Last-Modified: $Date$ Author: Nick Mathewson Created: Jan 2007 Status: Closed diff --git 
a/doc/spec/proposals/103-multilevel-keys.txt b/doc/spec/proposals/103-multilevel-keys.txt index ef51e18047..c8a7a6677b 100644 --- a/doc/spec/proposals/103-multilevel-keys.txt +++ b/doc/spec/proposals/103-multilevel-keys.txt @@ -1,7 +1,5 @@ Filename: 103-multilevel-keys.txt Title: Splitting identity key from regularly used signing key. -Version: $Revision$ -Last-Modified: $Date$ Author: Nick Mathewson Created: Jan 2007 Status: Closed diff --git a/doc/spec/proposals/104-short-descriptors.txt b/doc/spec/proposals/104-short-descriptors.txt index a1c42c8ff7..90e0764fe6 100644 --- a/doc/spec/proposals/104-short-descriptors.txt +++ b/doc/spec/proposals/104-short-descriptors.txt @@ -1,7 +1,5 @@ Filename: 104-short-descriptors.txt Title: Long and Short Router Descriptors -Version: $Revision$ -Last-Modified: $Date$ Author: Nick Mathewson Created: Jan 2007 Status: Closed diff --git a/doc/spec/proposals/105-handshake-revision.txt b/doc/spec/proposals/105-handshake-revision.txt index f6c209e71b..791a016c26 100644 --- a/doc/spec/proposals/105-handshake-revision.txt +++ b/doc/spec/proposals/105-handshake-revision.txt @@ -1,7 +1,5 @@ Filename: 105-handshake-revision.txt Title: Version negotiation for the Tor protocol. 
-Version: $Revision$ -Last-Modified: $Date$ Author: Nick Mathewson, Roger Dingledine Created: Jan 2007 Status: Closed diff --git a/doc/spec/proposals/106-less-tls-constraint.txt b/doc/spec/proposals/106-less-tls-constraint.txt index 35d6bf1066..7e7621df69 100644 --- a/doc/spec/proposals/106-less-tls-constraint.txt +++ b/doc/spec/proposals/106-less-tls-constraint.txt @@ -1,7 +1,5 @@ Filename: 106-less-tls-constraint.txt Title: Checking fewer things during TLS handshakes -Version: $Revision$ -Last-Modified: $Date$ Author: Nick Mathewson Created: 9-Feb-2007 Status: Closed diff --git a/doc/spec/proposals/107-uptime-sanity-checking.txt b/doc/spec/proposals/107-uptime-sanity-checking.txt index b11be89380..922129b21d 100644 --- a/doc/spec/proposals/107-uptime-sanity-checking.txt +++ b/doc/spec/proposals/107-uptime-sanity-checking.txt @@ -1,7 +1,5 @@ Filename: 107-uptime-sanity-checking.txt Title: Uptime Sanity Checking -Version: $Revision$ -Last-Modified: $Date$ Author: Kevin Bauer & Damon McCoy Created: 8-March-2007 Status: Closed diff --git a/doc/spec/proposals/108-mtbf-based-stability.txt b/doc/spec/proposals/108-mtbf-based-stability.txt index 2c66481530..294103760b 100644 --- a/doc/spec/proposals/108-mtbf-based-stability.txt +++ b/doc/spec/proposals/108-mtbf-based-stability.txt @@ -1,7 +1,5 @@ Filename: 108-mtbf-based-stability.txt Title: Base "Stable" Flag on Mean Time Between Failures -Version: $Revision$ -Last-Modified: $Date$ Author: Nick Mathewson Created: 10-Mar-2007 Status: Closed diff --git a/doc/spec/proposals/109-no-sharing-ips.txt b/doc/spec/proposals/109-no-sharing-ips.txt index 1a88b00c0f..5438cf049a 100644 --- a/doc/spec/proposals/109-no-sharing-ips.txt +++ b/doc/spec/proposals/109-no-sharing-ips.txt @@ -1,7 +1,5 @@ Filename: 109-no-sharing-ips.txt Title: No more than one server per IP address. 
-Version: $Revision$ -Last-Modified: $Date$ Author: Kevin Bauer & Damon McCoy Created: 9-March-2007 Status: Closed diff --git a/doc/spec/proposals/110-avoid-infinite-circuits.txt b/doc/spec/proposals/110-avoid-infinite-circuits.txt index 1834cd34a7..fffc41c25a 100644 --- a/doc/spec/proposals/110-avoid-infinite-circuits.txt +++ b/doc/spec/proposals/110-avoid-infinite-circuits.txt @@ -1,7 +1,5 @@ Filename: 110-avoid-infinite-circuits.txt Title: Avoiding infinite length circuits -Version: $Revision$ -Last-Modified: $Date$ Author: Roger Dingledine Created: 13-Mar-2007 Status: Accepted diff --git a/doc/spec/proposals/111-local-traffic-priority.txt b/doc/spec/proposals/111-local-traffic-priority.txt index f8a37efc94..9411463c21 100644 --- a/doc/spec/proposals/111-local-traffic-priority.txt +++ b/doc/spec/proposals/111-local-traffic-priority.txt @@ -1,7 +1,5 @@ Filename: 111-local-traffic-priority.txt Title: Prioritizing local traffic over relayed traffic -Version: $Revision$ -Last-Modified: $Date$ Author: Roger Dingledine Created: 14-Mar-2007 Status: Closed diff --git a/doc/spec/proposals/112-bring-back-pathlencoinweight.txt b/doc/spec/proposals/112-bring-back-pathlencoinweight.txt index e7cc6b4e36..3f6c3376f0 100644 --- a/doc/spec/proposals/112-bring-back-pathlencoinweight.txt +++ b/doc/spec/proposals/112-bring-back-pathlencoinweight.txt @@ -1,7 +1,5 @@ Filename: 112-bring-back-pathlencoinweight.txt Title: Bring Back Pathlen Coin Weight -Version: $Revision$ -Last-Modified: $Date$ Author: Mike Perry Created: Status: Superseded diff --git a/doc/spec/proposals/113-fast-authority-interface.txt b/doc/spec/proposals/113-fast-authority-interface.txt index 20cf33e429..8912b53220 100644 --- a/doc/spec/proposals/113-fast-authority-interface.txt +++ b/doc/spec/proposals/113-fast-authority-interface.txt @@ -1,7 +1,5 @@ Filename: 113-fast-authority-interface.txt Title: Simplifying directory authority administration -Version: $Revision$ -Last-Modified: $Date$ Author: Nick Mathewson 
Created: Status: Superseded diff --git a/doc/spec/proposals/114-distributed-storage.txt b/doc/spec/proposals/114-distributed-storage.txt index e9271fb82d..91a787d301 100644 --- a/doc/spec/proposals/114-distributed-storage.txt +++ b/doc/spec/proposals/114-distributed-storage.txt @@ -1,7 +1,5 @@ Filename: 114-distributed-storage.txt Title: Distributed Storage for Tor Hidden Service Descriptors -Version: $Revision$ -Last-Modified: $Date$ Author: Karsten Loesing Created: 13-May-2007 Status: Closed diff --git a/doc/spec/proposals/115-two-hop-paths.txt b/doc/spec/proposals/115-two-hop-paths.txt index ee10d949c4..9854c9ad55 100644 --- a/doc/spec/proposals/115-two-hop-paths.txt +++ b/doc/spec/proposals/115-two-hop-paths.txt @@ -1,7 +1,5 @@ Filename: 115-two-hop-paths.txt Title: Two Hop Paths -Version: $Revision$ -Last-Modified: $Date$ Author: Mike Perry Created: Status: Dead diff --git a/doc/spec/proposals/116-two-hop-paths-from-guard.txt b/doc/spec/proposals/116-two-hop-paths-from-guard.txt index 454b344abf..f45625350b 100644 --- a/doc/spec/proposals/116-two-hop-paths-from-guard.txt +++ b/doc/spec/proposals/116-two-hop-paths-from-guard.txt @@ -1,7 +1,5 @@ Filename: 116-two-hop-paths-from-guard.txt Title: Two hop paths from entry guards -Version: $Revision$ -Last-Modified: $Date$ Author: Michael Lieberman Created: 26-Jun-2007 Status: Dead diff --git a/doc/spec/proposals/117-ipv6-exits.txt b/doc/spec/proposals/117-ipv6-exits.txt index c8402821ed..00cd7cef10 100644 --- a/doc/spec/proposals/117-ipv6-exits.txt +++ b/doc/spec/proposals/117-ipv6-exits.txt @@ -1,7 +1,5 @@ Filename: 117-ipv6-exits.txt Title: IPv6 exits -Version: $Revision$ -Last-Modified: $Date$ Author: coderman Created: 10-Jul-2007 Status: Accepted diff --git a/doc/spec/proposals/118-multiple-orports.txt b/doc/spec/proposals/118-multiple-orports.txt index 1bef2504d9..2381ec7ca3 100644 --- a/doc/spec/proposals/118-multiple-orports.txt +++ b/doc/spec/proposals/118-multiple-orports.txt @@ -1,7 +1,5 @@ Filename: 
118-multiple-orports.txt Title: Advertising multiple ORPorts at once -Version: $Revision$ -Last-Modified: $Date$ Author: Nick Mathewson Created: 09-Jul-2007 Status: Accepted diff --git a/doc/spec/proposals/119-controlport-auth.txt b/doc/spec/proposals/119-controlport-auth.txt index dc57a27368..9ed1cc1cbe 100644 --- a/doc/spec/proposals/119-controlport-auth.txt +++ b/doc/spec/proposals/119-controlport-auth.txt @@ -1,7 +1,5 @@ Filename: 119-controlport-auth.txt Title: New PROTOCOLINFO command for controllers -Version: $Revision$ -Last-Modified: $Date$ Author: Roger Dingledine Created: 14-Aug-2007 Status: Closed diff --git a/doc/spec/proposals/120-shutdown-descriptors.txt b/doc/spec/proposals/120-shutdown-descriptors.txt index dc1265b03b..5cfe2b5bc6 100644 --- a/doc/spec/proposals/120-shutdown-descriptors.txt +++ b/doc/spec/proposals/120-shutdown-descriptors.txt @@ -1,7 +1,5 @@ Filename: 120-shutdown-descriptors.txt Title: Shutdown descriptors when Tor servers stop -Version: $Revision$ -Last-Modified: $Date$ Author: Roger Dingledine Created: 15-Aug-2007 Status: Dead diff --git a/doc/spec/proposals/121-hidden-service-authentication.txt b/doc/spec/proposals/121-hidden-service-authentication.txt index 828bf3c92d..0d92b53a8c 100644 --- a/doc/spec/proposals/121-hidden-service-authentication.txt +++ b/doc/spec/proposals/121-hidden-service-authentication.txt @@ -1,7 +1,5 @@ Filename: 121-hidden-service-authentication.txt Title: Hidden Service Authentication -Version: $Revision$ -Last-Modified: $Date$ Author: Tobias Kamm, Thomas Lauterbach, Karsten Loesing, Ferdinand Rieger, Christoph Weingarten Created: 10-Sep-2007 diff --git a/doc/spec/proposals/122-unnamed-flag.txt b/doc/spec/proposals/122-unnamed-flag.txt index 6502b9c560..2ce7bb22b9 100644 --- a/doc/spec/proposals/122-unnamed-flag.txt +++ b/doc/spec/proposals/122-unnamed-flag.txt @@ -1,7 +1,5 @@ Filename: 122-unnamed-flag.txt Title: Network status entries need a new Unnamed flag -Version: $Revision$ -Last-Modified: 
$Date$ Author: Roger Dingledine Created: 04-Oct-2007 Status: Closed diff --git a/doc/spec/proposals/123-autonaming.txt b/doc/spec/proposals/123-autonaming.txt index 6cd25329f8..74c486985d 100644 --- a/doc/spec/proposals/123-autonaming.txt +++ b/doc/spec/proposals/123-autonaming.txt @@ -1,7 +1,5 @@ Filename: 123-autonaming.txt Title: Naming authorities automatically create bindings -Version: $Revision$ -Last-Modified: $Date$ Author: Peter Palfrader Created: 2007-10-11 Status: Closed diff --git a/doc/spec/proposals/124-tls-certificates.txt b/doc/spec/proposals/124-tls-certificates.txt index 0a47772732..9472d14af8 100644 --- a/doc/spec/proposals/124-tls-certificates.txt +++ b/doc/spec/proposals/124-tls-certificates.txt @@ -1,7 +1,5 @@ Filename: 124-tls-certificates.txt Title: Blocking resistant TLS certificate usage -Version: $Revision$ -Last-Modified: $Date$ Author: Steven J. Murdoch Created: 2007-10-25 Status: Superseded diff --git a/doc/spec/proposals/125-bridges.txt b/doc/spec/proposals/125-bridges.txt index 8bb3169780..9d95729d42 100644 --- a/doc/spec/proposals/125-bridges.txt +++ b/doc/spec/proposals/125-bridges.txt @@ -1,7 +1,5 @@ Filename: 125-bridges.txt Title: Behavior for bridge users, bridge relays, and bridge authorities -Version: $Revision$ -Last-Modified: $Date$ Author: Roger Dingledine Created: 11-Nov-2007 Status: Closed diff --git a/doc/spec/proposals/126-geoip-reporting.txt b/doc/spec/proposals/126-geoip-reporting.txt index d48a08ba38..9f3b21c670 100644 --- a/doc/spec/proposals/126-geoip-reporting.txt +++ b/doc/spec/proposals/126-geoip-reporting.txt @@ -1,7 +1,5 @@ Filename: 126-geoip-reporting.txt Title: Getting GeoIP data and publishing usage summaries -Version: $Revision$ -Last-Modified: $Date$ Author: Roger Dingledine Created: 2007-11-24 Status: Closed diff --git a/doc/spec/proposals/127-dirport-mirrors-downloads.txt b/doc/spec/proposals/127-dirport-mirrors-downloads.txt index 1b55a02d61..72d6c0cb9f 100644 --- 
a/doc/spec/proposals/127-dirport-mirrors-downloads.txt +++ b/doc/spec/proposals/127-dirport-mirrors-downloads.txt @@ -1,7 +1,5 @@ Filename: 127-dirport-mirrors-downloads.txt Title: Relaying dirport requests to Tor download site / website -Version: $Revision$ -Last-Modified: $Date$ Author: Roger Dingledine Created: 2007-12-02 Status: Draft diff --git a/doc/spec/proposals/128-bridge-families.txt b/doc/spec/proposals/128-bridge-families.txt index e8a0050c3c..e5bdcf95cb 100644 --- a/doc/spec/proposals/128-bridge-families.txt +++ b/doc/spec/proposals/128-bridge-families.txt @@ -1,7 +1,5 @@ Filename: 128-bridge-families.txt Title: Families of private bridges -Version: $Revision$ -Last-Modified: $Date$ Author: Roger Dingledine Created: 2007-12-xx Status: Dead diff --git a/doc/spec/proposals/129-reject-plaintext-ports.txt b/doc/spec/proposals/129-reject-plaintext-ports.txt index d4767d03d8..8080ff5b75 100644 --- a/doc/spec/proposals/129-reject-plaintext-ports.txt +++ b/doc/spec/proposals/129-reject-plaintext-ports.txt @@ -1,7 +1,5 @@ Filename: 129-reject-plaintext-ports.txt Title: Block Insecure Protocols by Default -Version: $Revision$ -Last-Modified: $Date$ Author: Kevin Bauer & Damon McCoy Created: 2008-01-15 Status: Closed diff --git a/doc/spec/proposals/130-v2-conn-protocol.txt b/doc/spec/proposals/130-v2-conn-protocol.txt index 16f5bf2844..60e742a622 100644 --- a/doc/spec/proposals/130-v2-conn-protocol.txt +++ b/doc/spec/proposals/130-v2-conn-protocol.txt @@ -1,7 +1,5 @@ Filename: 130-v2-conn-protocol.txt Title: Version 2 Tor connection protocol -Version: $Revision$ -Last-Modified: $Date$ Author: Nick Mathewson Created: 2007-10-25 Status: Closed diff --git a/doc/spec/proposals/131-verify-tor-usage.txt b/doc/spec/proposals/131-verify-tor-usage.txt index 2687139189..d3c6efe75a 100644 --- a/doc/spec/proposals/131-verify-tor-usage.txt +++ b/doc/spec/proposals/131-verify-tor-usage.txt @@ -1,7 +1,5 @@ Filename: 131-verify-tor-usage.txt Title: Help users to verify they are 
using Tor -Version: $Revision$ -Last-Modified: $Date$ Author: Steven J. Murdoch Created: 2008-01-25 Status: Needs-Revision diff --git a/doc/spec/proposals/132-browser-check-tor-service.txt b/doc/spec/proposals/132-browser-check-tor-service.txt index d07a10dcde..6132e5d060 100644 --- a/doc/spec/proposals/132-browser-check-tor-service.txt +++ b/doc/spec/proposals/132-browser-check-tor-service.txt @@ -1,7 +1,5 @@ Filename: 132-browser-check-tor-service.txt Title: A Tor Web Service For Verifying Correct Browser Configuration -Version: $Revision$ -Last-Modified: $Date$ Author: Robert Hogan Created: 2008-03-08 Status: Draft diff --git a/doc/spec/proposals/134-robust-voting.txt b/doc/spec/proposals/134-robust-voting.txt index 5d5e77fa3b..c5dfb3b47f 100644 --- a/doc/spec/proposals/134-robust-voting.txt +++ b/doc/spec/proposals/134-robust-voting.txt @@ -2,8 +2,10 @@ Filename: 134-robust-voting.txt Title: More robust consensus voting with diverse authority sets Author: Peter Palfrader Created: 2008-04-01 -Status: Accepted -Target: 0.2.2.x +Status: Rejected + +History: + 2009 May 27: Added note on rejecting this proposal -- Nick Overview: @@ -103,3 +105,19 @@ Possible Attacks/Open Issues/Some thinking required: Q: Can this ever force us to build a consensus with authorities we do not recognize? A: No, we can never build a fully connected set with them in step 3. + +------------------------------ + +I'm rejecting this proposal as insecure. + +Suppose that we have a clique of size N, and M hostile members in the +clique. If these hostile members stop declaring trust for up to M-1 +good members of the clique, the clique with the hostile members will +in it will be larger than the one without them. + +The M hostile members will constitute a majority of this new clique +when M > (N-(M-1)) / 2, or when M > (N + 1) / 3. This breaks our +requirement that an adversary must compromise a majority of authorities +in order to control the consensus. 
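The rejection note's arithmetic can be checked with a small sketch. This is only an illustration of the inequality stated above, not code from Tor; the function names `hostile_controls` and `min_hostile` are hypothetical, and the example value N = 9 is chosen purely for demonstration.

```python
def hostile_controls(n: int, m: int) -> bool:
    """Return True if m hostile members form a strict majority of the clique
    that remains after they stop trusting m - 1 good members.

    n is the original clique size (N in the note), m the hostile count (M).
    """
    shrunk_clique = n - (m - 1)   # clique that still contains the hostile members
    return 2 * m > shrunk_clique  # M > (N - (M - 1)) / 2, kept in integers


def min_hostile(n: int) -> int:
    """Smallest hostile count that satisfies the inequality for a clique of size n."""
    return next(m for m in range(1, n + 1) if hostile_controls(n, m))


if __name__ == "__main__":
    n = 9  # hypothetical number of authorities
    print("hostile members needed under this proposal:", min_hostile(n))
    print("hostile members needed for a plain majority:", n // 2 + 1)
```

Rearranging 2M > N - M + 1 gives 3M > N + 1, which is the M > (N + 1) / 3 bound quoted in the note. The sketch compares that threshold with the plain-majority threshold the note says must be preserved.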
+ +-- Nick diff --git a/doc/spec/proposals/135-private-tor-networks.txt b/doc/spec/proposals/135-private-tor-networks.txt index 131bbb9068..19ef68b7b1 100644 --- a/doc/spec/proposals/135-private-tor-networks.txt +++ b/doc/spec/proposals/135-private-tor-networks.txt @@ -1,7 +1,5 @@ Filename: 135-private-tor-networks.txt Title: Simplify Configuration of Private Tor Networks -Version: $Revision$ -Last-Modified: $Date$ Author: Karsten Loesing Created: 29-Apr-2008 Status: Closed diff --git a/doc/spec/proposals/137-bootstrap-phases.txt b/doc/spec/proposals/137-bootstrap-phases.txt index 18d3dfae12..ebe044c707 100644 --- a/doc/spec/proposals/137-bootstrap-phases.txt +++ b/doc/spec/proposals/137-bootstrap-phases.txt @@ -1,7 +1,5 @@ Filename: 137-bootstrap-phases.txt Title: Keep controllers informed as Tor bootstraps -Version: $Revision$ -Last-Modified: $Date$ Author: Roger Dingledine Created: 07-Jun-2008 Status: Closed diff --git a/doc/spec/proposals/138-remove-down-routers-from-consensus.txt b/doc/spec/proposals/138-remove-down-routers-from-consensus.txt index a07764d536..776911b5c9 100644 --- a/doc/spec/proposals/138-remove-down-routers-from-consensus.txt +++ b/doc/spec/proposals/138-remove-down-routers-from-consensus.txt @@ -1,7 +1,5 @@ Filename: 138-remove-down-routers-from-consensus.txt Title: Remove routers that are not Running from consensus documents -Version: $Revision$ -Last-Modified: $Date$ Author: Peter Palfrader Created: 11-Jun-2008 Status: Closed diff --git a/doc/spec/proposals/140-consensus-diffs.txt b/doc/spec/proposals/140-consensus-diffs.txt index da63bfe23c..8bc4070bfe 100644 --- a/doc/spec/proposals/140-consensus-diffs.txt +++ b/doc/spec/proposals/140-consensus-diffs.txt @@ -1,12 +1,15 @@ Filename: 140-consensus-diffs.txt Title: Provide diffs between consensuses -Version: $Revision$ -Last-Modified: $Date$ Author: Peter Palfrader Created: 13-Jun-2008 Status: Accepted Target: 0.2.2.x +0. 
History + + 22-May-2009: Restricted the ed format even more strictly for ease of + implementation. -nickm + 1. Overview. Tor clients and servers need a list of which relays are on the @@ -135,6 +138,10 @@ Target: 0.2.2.x Note that line numbers always apply to the file after all previous commands have already been applied. + The commands MUST apply to the file from back to front, such that + lines are only ever referred to by their position in the original + file. + The "current line" is either the first line of the file, if this is the first command, the last line of a block we added in an append or change command, or the line immediate following a set of lines we just diff --git a/doc/spec/proposals/141-jit-sd-downloads.txt b/doc/spec/proposals/141-jit-sd-downloads.txt index b0c2b2cbcd..2ac7a086b7 100644 --- a/doc/spec/proposals/141-jit-sd-downloads.txt +++ b/doc/spec/proposals/141-jit-sd-downloads.txt @@ -1,7 +1,5 @@ Filename: 141-jit-sd-downloads.txt Title: Download server descriptors on demand -Version: $Revision$ -Last-Modified: $Date$ Author: Peter Palfrader Created: 15-Jun-2008 Status: Draft @@ -63,8 +61,8 @@ Status: Draft which tries to convey a server's capacity to clients. Currently we weigh servers differently for different purposes. There - is a weigh for when we use a server as a guard node (our entry to the - Tor network), there is one weigh we assign servers for exit duties, + is a weight for when we use a server as a guard node (our entry to the + Tor network), there is one weight we assign servers for exit duties, and a third for when we need intermediate (middle) nodes. 2.2 Exit information @@ -80,7 +78,7 @@ Status: Draft 2.3 Capability information - Server descriptors contain information about the specific version or + Server descriptors contain information about the specific version of the Tor protocol they understand [proposal 105]. 
Furthermore the server descriptor also contains the exact version of diff --git a/doc/spec/proposals/142-combine-intro-and-rend-points.txt b/doc/spec/proposals/142-combine-intro-and-rend-points.txt index 3456b285a9..3abd5c863d 100644 --- a/doc/spec/proposals/142-combine-intro-and-rend-points.txt +++ b/doc/spec/proposals/142-combine-intro-and-rend-points.txt @@ -1,7 +1,5 @@ Filename: 142-combine-intro-and-rend-points.txt Title: Combine Introduction and Rendezvous Points -Version: $Revision$ -Last-Modified: $Date$ Author: Karsten Loesing, Christian Wilms Created: 27-Jun-2008 Status: Dead diff --git a/doc/spec/proposals/143-distributed-storage-improvements.txt b/doc/spec/proposals/143-distributed-storage-improvements.txt index 8789d84663..0f7468f1dc 100644 --- a/doc/spec/proposals/143-distributed-storage-improvements.txt +++ b/doc/spec/proposals/143-distributed-storage-improvements.txt @@ -1,7 +1,5 @@ Filename: 143-distributed-storage-improvements.txt Title: Improvements of Distributed Storage for Tor Hidden Service Descriptors -Version: $Revision$ -Last-Modified: $Date$ Author: Karsten Loesing Created: 28-Jun-2008 Status: Open diff --git a/doc/spec/proposals/145-newguard-flag.txt b/doc/spec/proposals/145-newguard-flag.txt index 31d707d725..9e61e30be9 100644 --- a/doc/spec/proposals/145-newguard-flag.txt +++ b/doc/spec/proposals/145-newguard-flag.txt @@ -1,7 +1,5 @@ Filename: 145-newguard-flag.txt Title: Separate "suitable as a guard" from "suitable as a new guard" -Version: $Revision$ -Last-Modified: $Date$ Author: Nick Mathewson Created: 1-Jul-2008 Status: Open diff --git a/doc/spec/proposals/146-long-term-stability.txt b/doc/spec/proposals/146-long-term-stability.txt index 7cfd58f564..9af0017441 100644 --- a/doc/spec/proposals/146-long-term-stability.txt +++ b/doc/spec/proposals/146-long-term-stability.txt @@ -1,7 +1,5 @@ Filename: 146-long-term-stability.txt Title: Add new flag to reflect long-term stability -Version: $Revision$ -Last-Modified: $Date$ Author: Nick 
Mathewson Created: 19-Jun-2008 Status: Open diff --git a/doc/spec/proposals/147-prevoting-opinions.txt b/doc/spec/proposals/147-prevoting-opinions.txt index 2b8cf30e46..3d9659c984 100644 --- a/doc/spec/proposals/147-prevoting-opinions.txt +++ b/doc/spec/proposals/147-prevoting-opinions.txt @@ -1,7 +1,5 @@ Filename: 147-prevoting-opinions.txt Title: Eliminate the need for v2 directories in generating v3 directories -Version: $Revision$ -Last-Modified: $Date$ Author: Nick Mathewson Created: 2-Jul-2008 Status: Accepted diff --git a/doc/spec/proposals/148-uniform-client-end-reason.txt b/doc/spec/proposals/148-uniform-client-end-reason.txt index cec81253ea..1db3b3e596 100644 --- a/doc/spec/proposals/148-uniform-client-end-reason.txt +++ b/doc/spec/proposals/148-uniform-client-end-reason.txt @@ -1,7 +1,5 @@ Filename: 148-uniform-client-end-reason.txt Title: Stream end reasons from the client side should be uniform -Version: $Revision$ -Last-Modified: $Date$ Author: Roger Dingledine Created: 2-Jul-2008 Status: Closed diff --git a/doc/spec/proposals/149-using-netinfo-data.txt b/doc/spec/proposals/149-using-netinfo-data.txt index 4919514b4c..8bf8375d5d 100644 --- a/doc/spec/proposals/149-using-netinfo-data.txt +++ b/doc/spec/proposals/149-using-netinfo-data.txt @@ -1,7 +1,5 @@ Filename: 149-using-netinfo-data.txt Title: Using data from NETINFO cells -Version: $Revision$ -Last-Modified: $Date$ Author: Nick Mathewson Created: 2-Jul-2008 Status: Open @@ -24,14 +22,14 @@ Motivation idea of their own IP addresses, so they can publish correct descriptors. This is also in NETINFO cells. -Learning the time and IP +Learning the time and IP address We need to think about attackers here. Just because a router tells us that we have a given IP or a given clock skew doesn't mean that it's true. We believe this information only if we've heard it from a majority of the routers we've connected to recently, including at least 3 routers. 
Routers only believe this information if the - majority inclues at least one authority. + majority includes at least one authority. Avoiding MITM attacks diff --git a/doc/spec/proposals/150-exclude-exit-nodes.txt b/doc/spec/proposals/150-exclude-exit-nodes.txt index b73a9cc4d1..b497ae62c1 100644 --- a/doc/spec/proposals/150-exclude-exit-nodes.txt +++ b/doc/spec/proposals/150-exclude-exit-nodes.txt @@ -1,6 +1,5 @@ Filename: 150-exclude-exit-nodes.txt Title: Exclude Exit Nodes from a circuit -Version: $Revision$ Author: Mfr Created: 2008-06-15 Status: Closed diff --git a/doc/spec/proposals/151-path-selection-improvements.txt b/doc/spec/proposals/151-path-selection-improvements.txt index e3c8f35451..3d5f07d3ab 100644 --- a/doc/spec/proposals/151-path-selection-improvements.txt +++ b/doc/spec/proposals/151-path-selection-improvements.txt @@ -1,7 +1,5 @@ Filename: 151-path-selection-improvements.txt Title: Improving Tor Path Selection -Version: -Last-Modified: Author: Fallon Chen, Mike Perry Created: 5-Jul-2008 Status: Draft diff --git a/doc/spec/proposals/152-single-hop-circuits.txt b/doc/spec/proposals/152-single-hop-circuits.txt index e49a4250e0..d0b28b1c72 100644 --- a/doc/spec/proposals/152-single-hop-circuits.txt +++ b/doc/spec/proposals/152-single-hop-circuits.txt @@ -1,7 +1,5 @@ Filename: 152-single-hop-circuits.txt Title: Optionally allow exit from single-hop circuits -Version: -Last-Modified: Author: Geoff Goodell Created: 13-Jul-2008 Status: Closed diff --git a/doc/spec/proposals/153-automatic-software-update-protocol.txt b/doc/spec/proposals/153-automatic-software-update-protocol.txt index 7bc809d440..c2979bb695 100644 --- a/doc/spec/proposals/153-automatic-software-update-protocol.txt +++ b/doc/spec/proposals/153-automatic-software-update-protocol.txt @@ -1,7 +1,5 @@ Filename: 153-automatic-software-update-protocol.txt Title: Automatic software update protocol -Version: $Revision$ -Last-Modified: $Date$ Author: Jacob Appelbaum Created: 14-July-2008 Status: 
Superseded diff --git a/doc/spec/proposals/154-automatic-updates.txt b/doc/spec/proposals/154-automatic-updates.txt index 00a820de08..4c2c6d3899 100644 --- a/doc/spec/proposals/154-automatic-updates.txt +++ b/doc/spec/proposals/154-automatic-updates.txt @@ -1,7 +1,5 @@ Filename: 154-automatic-updates.txt Title: Automatic Software Update Protocol -Version: $Revision$ -Last-Modified: $Date$ Author: Matt Edman Created: 30-July-2008 Status: Superseded diff --git a/doc/spec/proposals/155-four-hidden-service-improvements.txt b/doc/spec/proposals/155-four-hidden-service-improvements.txt index f528f8baf2..e342bf1c39 100644 --- a/doc/spec/proposals/155-four-hidden-service-improvements.txt +++ b/doc/spec/proposals/155-four-hidden-service-improvements.txt @@ -1,7 +1,5 @@ Filename: 155-four-hidden-service-improvements.txt Title: Four Improvements of Hidden Service Performance -Version: $Revision$ -Last-Modified: $Date$ Author: Karsten Loesing, Christian Wilms Created: 25-Sep-2008 Status: Finished diff --git a/doc/spec/proposals/156-tracking-blocked-ports.txt b/doc/spec/proposals/156-tracking-blocked-ports.txt index 1e7b0d963f..419de7e74c 100644 --- a/doc/spec/proposals/156-tracking-blocked-ports.txt +++ b/doc/spec/proposals/156-tracking-blocked-ports.txt @@ -1,7 +1,5 @@ Filename: 156-tracking-blocked-ports.txt Title: Tracking blocked ports on the client side -Version: $Revision$ -Last-Modified: $Date$ Author: Robert Hogan Created: 14-Oct-2008 Status: Open diff --git a/doc/spec/proposals/157-specific-cert-download.txt b/doc/spec/proposals/157-specific-cert-download.txt index e54a987277..204b20973a 100644 --- a/doc/spec/proposals/157-specific-cert-download.txt +++ b/doc/spec/proposals/157-specific-cert-download.txt @@ -1,7 +1,5 @@ Filename: 157-specific-cert-download.txt Title: Make certificate downloads specific -Version: $Revision$ -Last-Modified: $Date$ Author: Nick Mathewson Created: 2-Dec-2008 Status: Accepted diff --git a/doc/spec/proposals/158-microdescriptors.txt 
b/doc/spec/proposals/158-microdescriptors.txt index f478a3c834..e6966c0cef 100644 --- a/doc/spec/proposals/158-microdescriptors.txt +++ b/doc/spec/proposals/158-microdescriptors.txt @@ -1,11 +1,20 @@ Filename: 158-microdescriptors.txt Title: Clients download consensus + microdescriptors -Version: $Revision$ -Last-Modified: $Date$ Author: Roger Dingledine Created: 17-Jan-2009 Status: Open +0. History + + 15 May 2009: Substantially revised based on discussions on or-dev + from late January. Removed the notion of voting on how to choose + microdescriptors; made it just a function of the consensus method. + (This lets us avoid the possibility of "desynchronization.") + Added suggestion to use a new consensus flavor. Specified use of + SHA256 for new hashes. -nickm + + 15 June 2009: Cleaned up based on comments from Roger. -nickm + 1. Overview This proposal replaces section 3.2 of proposal 141, which was @@ -13,9 +22,7 @@ Status: Open circuit-building protocol to fetch a server descriptor inline at each circuit extend, we instead put all of the information that clients need either into the consensus itself, or into a new set of data about each - relay called a microdescriptor. The microdescriptor is a direct - transform from the relay descriptor, so relays don't even need to know - this is happening. + relay called a microdescriptor. Descriptor elements that are small and frequently changing should go in the consensus itself, and descriptor elements that are small and @@ -24,6 +31,10 @@ Status: Open them, we'll need to resume considering some design like the one in proposal 141. + Note also that any descriptor element which clients need to use to + decide which servers to fetch info about, or which servers to fetch + info from, needs to stay in the consensus. + 2. Motivation See @@ -36,99 +47,91 @@ Status: Open 3. Design There are three pieces to the proposal. 
First, authorities will list in - their votes (and thus in the consensus) what relay descriptor elements - are included in the microdescriptor, and also list the expected hash - of microdescriptor for each relay. Second, directory mirrors will serve - microdescriptors. Third, clients will ask for them and cache them. + their votes (and thus in the consensus) the expected hash of + microdescriptor for each relay. Second, authorities will serve + microdescriptors, directory mirrors will cache and serve + them. Third, clients will ask for them and cache them. 3.1. Consensus changes - V3 votes should include a new line: - microdescriptor-elements bar baz foo - listing each descriptor element (sorted alphabetically) that authority - included when it calculated its expected microdescriptor hashes. + If the authorities choose a consensus method of a given version or + later, a microdescriptor format is implicit in that version. + A microdescriptor should in every case be a pure function of the + router descriptor and the consensus method. + + In votes, we need to include the hash of each expected microdescriptor + in the routerstatus section. I suggest a new "m" line for each stanza, + with the base64 of the SHA256 hash of the router's microdescriptor. + + For every consensus method that an authority supports, it includes a + separate "m" line in each router section of its vote, containing: + "m" SP methods 1*(SP AlgorithmName "=" digest) NL + where methods is a comma-separated list of the consensus methods + that the authority believes will produce "digest". - We also need to include the hash of each expected microdescriptor in - the routerstatus section. I suggest a new "m" line for each stanza, - with the base64 of the hash of the elements that the authority voted - for above. + (As with base64 encoding of SHA1 hashes in consensuses, let's + omit the trailing =s) The consensus microdescriptor-elements and "m" lines are then computed as described in Section 3.1.2 below. 
- I believe that means we need a new consensus-method "6" that knows - how to compute the microdescriptor-elements and add "m" lines. + (This means we need a new consensus-method that knows + how to compute the microdescriptor-elements and add "m" lines.) -3.1.1. Descriptor elements to include for now + The microdescriptor consensus uses the directory-signature format from + proposal 162, with the "sha256" algorithm. - To start, the element list that authorities suggest should be - family onion-key - (Note that the or-dev posts above only mention onion-key, but if - we don't also include family then clients will never learn it. It - seemed like it should be relatively static, so putting it in the - microdescriptor is smarter than trying to fit it into the consensus.) +3.1.1. Descriptor elements to include for now - We could imagine a config option "family,onion-key" so authorities - could change their voted preferences without needing to upgrade. + In the first version, the microdescriptor should contain the + onion-key element, and the family element from the router descriptor, + and the exit policy summary as currently specified in dir-spec.txt. 3.1.2. Computing consensus for microdescriptor-elements and "m" lines - One approach is for the consensus microdescriptor-elements line to - include every element listed by a majority of authorities, sorted. The - problem here is that it will no longer be deterministic what the correct - hash for the "m" line should be. We could imagine telling the authority - to go look in its descriptor and produce the right hash itself, but - we don't want consensus calculation to be based on external data like - that. (Plus, the authority may not have the descriptor that everybody - else voted to use.) - - The better approach is to take the exact set that has the most votes - (breaking ties by the set that has the most elements, and breaking - ties after that by whichever is alphabetically first). 
That will - increase the odds that we actually get a microdescriptor hash that - is both a) for the descriptor we're putting in the consensus, and b) - over the elements that we're declaring it should be for. - - Then the "m" line for a given relay is the one that gets the most votes - from authorities that both a) voted for the microdescriptor-elements - line we're using, and b) voted for the descriptor we're using. - - (If there's a tie, use the smaller hash. But really, if there are - multiple such votes and they differ about a microdescriptor, we caught - one of them lying or being buggy. We should log it to track down why.) - - If there are no such votes, then we leave out the "m" line for that - relay. That means clients should avoid it for this time period. (As - an extension it could instead mean that clients should fetch the - descriptor and figure out its microdescriptor themselves. But let's - not get ahead of ourselves.) - - It would be nice to have a more foolproof way to agree on what - microdescriptor hash each authority should vote for, so we can avoid - missing "m" lines. Just switching to a new consensus-method each time - we change the set of microdescriptor-elements won't help though, since - each authority will still have to decide what hash to vote for before - knowing what consensus-method will be used. - - Here's one way we could do it. Each vote / consensus includes - the microdescriptor-elements that were used to compute the hashes, - and also a preferred-microdescriptor-elements set. If an authority - has a consensus from the previous period, then it should use the - consensus preferred-microdescriptor-elements when computing its votes - for microdescriptor-elements and the appropriate hashes in the upcoming - period. (If it has no previous consensus, then it just writes its - own preferences in both lines.) - -3.2. 
- Directory mirrors serve microdescriptors - - Directory mirrors should then read the microdescriptor-elements line - from the consensus, and learn how to answer requests. (Directory mirrors - continue to serve normal relay descriptors too, a) to serve old clients - and b) to be able to construct microdescriptors on the fly.) - - The microdescriptors with hashes <D1>,<D2>,<D3> should be available at: - http://<hostname>/tor/micro/d/<D1>+<D2>+<D3>.z + When we are generating a consensus, we use whichever m line + unambiguously corresponds to the descriptor digest that will be + included in the consensus. + + (If different votes have different microdescriptor digests for a + single <descriptor-digest, consensus-method> pair, then at least one + of the authorities is broken. If this happens, the consensus should + contain whichever microdescriptor digest is most common. If there is + no winner, we break ties in favor of the lexically earliest. + Either way, we should log a warning: there is definitely a bug.) + + The "m" lines in a consensus contain only the digest, not a list of + consensus methods. + +3.1.3. A new flavor of consensus + + Rather than inserting "m" lines in the current consensus format, + they should be included in a new consensus flavor (see proposal + 162). + + This flavor can safely omit descriptor digests. + + When we implement this voting method, we can remove the exit policy + summary from the current "ns" flavor of consensus, since no current + clients use them, and they take up about 5% of the compressed + consensus. + + This new consensus flavor should be signed with the sha256 signature + format as documented in proposal 162. + +3.2. Directory mirrors fetch, cache, and serve microdescriptors + + Directory mirrors should fetch, cache, and serve each microdescriptor + from the authorities. (They need to continue to serve normal relay + descriptors too, to handle old clients.)
+ + The microdescriptors with base64 hashes <D1>,<D2>,<D3> should be + available at: + http://<hostname>/tor/micro/d/<D1>-<D2>-<D3>.z + (We use base64 for size and for consistency with the consensus + format. We use -s instead of +s to separate these items, since + the + character is used in base64 encoding.) All the microdescriptors from the current consensus should also be available at: @@ -136,24 +139,9 @@ Status: Open so a client that's bootstrapping doesn't need to send a 70KB URL just to name every microdescriptor it's looking for. - The format of a microdescriptor is the header line - "microdescriptor-header" - followed by each element (keyword and body), alphabetically. There's - no need to mention what hash it's for, since it's self-identifying: - you can hash the elements to learn this. - - (Do we need a footer line to show that it's over, or is the next - microdescriptor line or EOF enough of a hint? A footer line wouldn't - hurt much. Also, no fair voting for the microdescriptor-element - "microdescriptor-header".) - + Microdescriptors have no header or footer. The hash of the microdescriptor is simply the hash of the concatenated - elements -- not counting the header line or hypothetical footer line. - Unless you prefer that? - - Is there a reasonable way to version these things? We could say that - the microdescriptor-header line can contain arguments which clients - must ignore if they don't understand them. Any better ways? + elements. Directory mirrors should check to make sure that the microdescriptors they're about to serve match the right hashes (either the hashes from @@ -170,10 +158,14 @@ Status: Open When a client gets a new consensus, it looks to see if there are any microdescriptors it needs to learn. If it needs to learn more than some threshold of the microdescriptors (half?), it requests 'all', - else it requests only the missing ones. + else it requests only the missing ones. 
Clients MAY try to + determine whether the upload bandwidth for listing the + microdescriptors they want is more or less than the download + bandwidth for the microdescriptors they do not want. Clients maintain a cache of microdescriptors along with metadata like - when it was last referenced by a consensus. They keep a microdescriptor + when it was last referenced by a consensus, and which identity key + it corresponds to. They keep a microdescriptor until it hasn't been mentioned in any consensus for a week. Future clients might cache them for longer or shorter times. @@ -190,18 +182,17 @@ Status: Open Another future option would be to fetch some of the microdescriptors anonymously (via a Tor circuit). + Another crazy option (Roger's phrasing) is to do decoy fetches as + well. + 4. Transition and deployment Phase one, the directory authorities should start voting on - microdescriptors and microdescriptor elements, and putting them in the - consensus. This should happen during the 0.2.1.x series, and should - be relatively easy to do. + microdescriptors, and putting them in the consensus. Phase two, directory mirrors should learn how to serve them, and learn - how to read the consensus to find out what they should be serving. This - phase could be done either in 0.2.1.x or early in 0.2.2.x, depending - on how messy it turns out to be and how quickly we get around to it. + how to read the consensus to find out what they should be serving. Phase three, clients should start fetching and caching them instead - of normal descriptors. This should happen post 0.2.1.x. + of normal descriptors. 
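A minimal sketch of the naming scheme from sections 3.1 and 3.2 of proposal 158 above: hash a microdescriptor with SHA256, base64-encode the digest without the trailing "=", and join several digests with "-" to form the request URL. The function names, the consensus method numbers and the hostname used in any call are placeholders for illustration; none of them come from the proposal.

```python
import base64
import hashlib


def microdesc_digest(microdesc: bytes) -> str:
    """Base64 of the SHA256 hash of a microdescriptor, trailing '=' omitted."""
    raw = hashlib.sha256(microdesc).digest()
    return base64.b64encode(raw).decode("ascii").rstrip("=")


def format_m_line(methods, digest_b64: str) -> str:
    """Vote line: "m" SP methods 1*(SP AlgorithmName "=" digest) NL.

    `methods` is the list of consensus methods the authority believes
    will produce this digest (the numbers themselves are hypothetical).
    """
    method_list = ",".join(str(m) for m in methods)
    return "m {} sha256={}\n".format(method_list, digest_b64)


def micro_url(hostname: str, digests) -> str:
    """Request URL for a set of microdescriptors, digests joined by '-'."""
    return "http://{}/tor/micro/d/{}.z".format(hostname, "-".join(digests))
```

A SHA256 digest is 32 bytes, so each unpadded base64 digest is 43 characters long; that is what makes a full list of digests expensive to put in a URL and motivates the 'all' request.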
diff --git a/doc/spec/proposals/159-exit-scanning.txt b/doc/spec/proposals/159-exit-scanning.txt index fbc69aa9e6..7090f2ed08 100644 --- a/doc/spec/proposals/159-exit-scanning.txt +++ b/doc/spec/proposals/159-exit-scanning.txt @@ -1,7 +1,5 @@ Filename: 159-exit-scanning.txt Title: Exit Scanning -Version: $Revision$ -Last-Modified: $Date$ Author: Mike Perry Created: 13-Feb-2009 Status: Open diff --git a/doc/spec/proposals/160-bandwidth-offset.txt b/doc/spec/proposals/160-bandwidth-offset.txt new file mode 100644 index 0000000000..7ca74dfae3 --- /dev/null +++ b/doc/spec/proposals/160-bandwidth-offset.txt @@ -0,0 +1,105 @@ +Filename: 160-bandwidth-offset.txt +Title: Authorities vote for bandwidth offsets in consensus +Author: Roger Dingledine +Created: 4-May-2009 +Status: Open +Target: 0.2.2.x + +1. Motivation + + As part of proposal 141, we moved the bandwidth value for each relay + into the consensus. Now clients can know how they should load balance + even before they've fetched the corresponding relay descriptors. + + Putting the bandwidth in the consensus also lets the directory + authorities choose more accurate numbers to advertise, if we come up + with a better algorithm for deciding weightings. + + Our original plan was to teach directory authorities how to measure + bandwidth themselves; then every authority would vote for the bandwidth + it prefers, and we'd take the median of votes as usual. + + The problem comes when we have 7 authorities, and only a few of them + have smarter bandwidth allocation algorithms. So long as the majority + of them are voting for the number in the relay descriptor, the minority + that have better numbers will be ignored. + +2. Options + + One fix would be to demand that every authority also run the + new bandwidth measurement algorithms: in that case, part of the + responsibility of being an authority operator is that you need to run + this code too. 
But in practice we can't really require all current + authority operators to do that; and if we want to expand the set of + authority operators even further, it will become even more impractical. + Also, bandwidth testing adds load to the network, so we don't really + want to require that the number of concurrent bandwidth tests match + the number of authorities we have. + + The better fix is to allow certain authorities to specify that they are + voting on bandwidth measurements: more accurate bandwidth values that + have actually been evaluated. In this way, authorities can vote on + the median measured value if sufficient measured votes exist for a router, + and otherwise fall back to the median value taken from the published router + descriptors. + +3. Security implications + + If only some authorities choose to vote on an offset, then a majority of + those voting authorities can arbitrarily change the bandwidth weighting + for the relay. At the extreme, if there's only one offset-voting + authority, then that authority can dictate which relays clients will + find attractive. + + This problem isn't entirely new: we already have the worry wrt + the subset of authorities that vote for BadExit. + + To make it not so bad, we should deploy at least three offset-voting + authorities. + + Also, authorities that know how to vote for offsets should vote for + an offset of zero for new nodes, rather than choosing not to vote on + any offset in those cases. + +4. Design + + First, we need a new consensus method to support this new calculation. + + Now v3 votes can have an additional value on the "w" line: + "w Bandwidth=X Measured=" INT. + + Once we're using the new consensus method, the new way to compute the + Bandwidth weight is by checking if there are at least 3 "Measured" + votes. If so, the median of these is taken. Otherwise, the median + of the "Bandwidth=" values is taken, as described in Proposal 141.
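The rule above can be sketched as follows. This is an illustration only, not the code in networkstatus_compute_consensus(): the function names are hypothetical, and taking the lower middle element when the number of votes is even is an assumption, since the proposal only says "median".

```python
from statistics import median_low

# Threshold stated in the Design section: at least 3 "Measured" votes.
MIN_MEASURED_VOTES = 3


def parse_w_line(line):
    """Parse 'w Bandwidth=X [Measured=Y]' into (bandwidth, measured or None)."""
    fields = line.split()
    if not fields or fields[0] != "w":
        raise ValueError("not a w line")
    values = dict(field.split("=", 1) for field in fields[1:])
    bandwidth = int(values["Bandwidth"])
    measured = int(values["Measured"]) if "Measured" in values else None
    return bandwidth, measured


def consensus_bandwidth(votes):
    """votes: list of (bandwidth, measured or None), one entry per authority."""
    measured = [m for _, m in votes if m is not None]
    if len(measured) >= MIN_MEASURED_VOTES:
        # Enough measuring authorities: use the median of their values.
        return median_low(measured)
    # Otherwise fall back to the median of the Bandwidth= values.
    return median_low([b for b, _ in votes])
```

With seven authorities of which three vote Measured values, those three decide the result, which is the situation the Security implications section warns about.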
+ + Then the actual consensus looks just the same as it did before, + so clients never have to know that this additional calculation is + happening. + +5. Implementation + + The Measured values will be read from a file provided by the scanners + described in proposal 161. Files with a timestamp older than 3 days + will be ignored. + + The file will be read in from dirserv_generate_networkstatus_vote_obj() + in a location specified by a new config option "V3MeasuredBandwidths". + A helper function will be called to populate new 'measured' and + 'has_measured' fields of the routerstatus_t 'routerstatuses' list with + values read from this file. + + An additional for_vote flag will be passed to + routerstatus_format_entry() from format_networkstatus_vote(), which will + indicate that the "Measured=" string should be appended to the "w Bandwidth=" + line with the measured value in the struct. + + routerstatus_parse_entry_from_string() will be modified to parse the + "Measured=" lines into routerstatus_t struct fields. + + Finally, networkstatus_compute_consensus() will set rs_out.bandwidth + to the median of the measured values if there are at least 3, otherwise + it will use the bandwidth value median as normal. + + + diff --git a/doc/spec/proposals/161-computing-bandwidth-adjustments.txt b/doc/spec/proposals/161-computing-bandwidth-adjustments.txt new file mode 100644 index 0000000000..786e1afebd --- /dev/null +++ b/doc/spec/proposals/161-computing-bandwidth-adjustments.txt @@ -0,0 +1,174 @@ +Title: Computing Bandwidth Adjustments +Filename: 161-computing-bandwidth-adjustments.txt +Author: Mike Perry +Created: 12-May-2009 +Target: 0.2.2.x +Status: Open + + +1. Motivation + + There is high variance in the performance of the Tor network. Despite + our efforts to balance load evenly across the Tor nodes, some nodes are + significantly slower and more overloaded than others.
+ + Proposal 160 describes how we can augment the directory authorities to + vote on measured bandwidths for routers. This proposal describes what + goes into the measuring process. + + +2. Measurement Selection + + The general idea is to determine a load factor representing the ratio + of the capacity of measured nodes to the rest of the network. This load + factor could be computed from three potentially relevant statistics: + circuit failure rates, circuit extend times, or stream capacity. + + Circuit failure rates and circuit extend times appear to be + non-linearly proportional to node load. We've observed that the same + nodes when scanned at US nighttime hours (when load is presumably + lower) exhibit almost no circuit failure, and significantly faster + extend times than when scanned during the day. + + Stream capacity, however, is much more uniform, even during US + nighttime hours. Moreover, it is a more intuitive representation of + node capacity, and also less dependent upon distance and latency + if amortized over large stream fetches. + + +3. Average Stream Bandwidth Calculation + + The average stream bandwidths are obtained by dividing the network into + slices of 50 nodes each, grouped according to advertised node bandwidth. + + Two hop circuits are built using nodes from the same slice, and a large + file is downloaded via these circuits. The file sizes are set based + on node percentile rank as follows: + + 0-10: 2M + 10-20: 1M + 20-30: 512k + 30-50: 256k + 50-100: 128k + + These sizes are based on measurements performed during test scans. + + This process is repeated until each node has been chosen to participate + in at least 5 circuits. + + +4. Ratio Calculation + + The ratios are calculated by dividing each measured value by the + network-wide average. + + +5. Ratio Filtering + + After the base ratios are calculated, a second pass is performed + to remove any streams with nodes of ratios less than X=0.5 from + the results of other nodes. 
In addition, all outlying streams + with capacity of one standard deviation below a node's average + are also removed. + + The final ratio result will be the greater of the unfiltered ratio + and the filtered ratio. + + +6. Pseudocode for Ratio Calculation Algorithm + + Here is the complete pseudocode for the ratio algorithm: + + Slices = {S | S is 50 nodes of similar consensus capacity} + for S in Slices: + while exists node N in S with circ_chosen(N) < 7: + fetch_slice_file(build_2hop_circuit(N, (exit in S))) + for N in S: + BW_measured(N) = MEAN(b | b is bandwidth of a stream through N) + Bw_stddev(N) = STDDEV(b | b is bandwidth of a stream through N) + Bw_avg(S) = MEAN(b | b = BW_measured(N) for all N in S) + for N in S: + Normal_Streams(N) = {stream via N | bandwidth >= BW_measured(N)} + BW_Norm_measured(N) = MEAN(b | b is a bandwidth of Normal_Streams(N)) + + Bw_net_avg(Slices) = MEAN(BW_measured(N) for all N in Slices) + Bw_Norm_net_avg(Slices) = MEAN(BW_Norm_measured(N) for all N in Slices) + + for N in all Slices: + Bw_net_ratio(N) = BW_measured(N)/Bw_net_avg(Slices) + Bw_Norm_net_ratio(N) = BW_Norm_measured(N)/Bw_Norm_net_avg(Slices) + + ResultRatio(N) = MAX(Bw_net_ratio(N), Bw_Norm_net_ratio(N)) + + +7. Security implications + + The ratio filtering will deal with cases of sabotage by dropping + both very slow outliers in stream average calculations, as well + as dropping streams that used very slow nodes from the calculation + of other nodes. + + This scheme will not address nodes that try to game the system by + providing better service to scanners. The scanners can be detected + at the entry by IP address, and at the exit by the destination fetch + IP. + + Measures can be taken to obfuscate and separate the scanners' source + IP address from the directory authority IP address. For instance, + scans can happen offsite and the results can be rsynced into the + authorities. The destination server IP can also change.
+ + Neither of these methods is foolproof, but such nodes can already + lie about their bandwidth to attract more traffic, so this solution + does not set us back any in that regard. + + +8. Parallelization + + Because each slice takes as long as 6 hours to complete, we will want + to parallelize as much as possible. This will be done by concurrently + running multiple scanners from each authority to deal with different + segments of the network. Each scanner piece will continually loop + over a portion of the network, outputting files of the form: + + node_id=<idhex> SP strm_bw=<BW_measured(N)> SP + filt_bw=<BW_Norm_measured(N)> ns_bw=<CurrentConsensusBw(N)> NL + + The most recent file from each scanner will be periodically gathered + by another script that uses them to produce network-wide averages + and calculate ratios as per the algorithm in section 6. Because nodes + may shift in capacity, they may appear in more than one slice and/or + appear more than once in the file set. The most recently measured + line will be chosen in this case. + + +9. Integration with Proposal 160 + + The final results will be produced for the voting mechanism + described in Proposal 160 by multiplying the derived ratio by + the average published consensus bandwidth during the course of the + scan, and taking the weighted average with the previous consensus + bandwidth: + + Bw_new = Round((Bw_current * Alpha + Bw_scan_avg*Bw_ratio)/(Alpha + 1)) + + The Alpha parameter is a smoothing parameter intended to prevent + rapid oscillation between loaded and unloaded conditions. It is + currently fixed at 0.333. + + The Round() step consists of rounding to the 3 most significant figures + in base10, and then rounding that result to the nearest 1000, with + a minimum value of 1000.
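A sketch of the update and rounding steps described above. The half-up handling of ties and the use of integer arithmetic are assumptions, since the text does not say how ties are rounded; the function names are illustrative only.

```python
ALPHA = 0.333  # smoothing parameter, fixed value given in the text


def round_bw(value):
    """Round to 3 significant figures, then to the nearest 1000, minimum 1000."""
    n = int(value + 0.5)  # nearest integer; assumes value >= 0
    if n > 0:
        digits = len(str(n))
        # Keep only the 3 most significant base-10 digits (ties round up).
        factor = 10 ** max(digits - 3, 0)
        n = (n + factor // 2) // factor * factor
    # Then round to the nearest multiple of 1000 (ties round up).
    n = (n + 500) // 1000 * 1000
    return max(n, 1000)


def bw_new(bw_current, bw_scan_avg, bw_ratio, alpha=ALPHA):
    """Bw_new = Round((Bw_current*Alpha + Bw_scan_avg*Bw_ratio)/(Alpha + 1))."""
    smoothed = (bw_current * alpha + bw_scan_avg * bw_ratio) / (alpha + 1)
    return round_bw(smoothed)
```

For example, a node with a current consensus bandwidth of 100000, a scan-period average of 120000 and a ratio of 1.5 gives (33300 + 180000)/1.333, about 160015, which rounds to 160000.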
+ + This will produce a new bandwidth value that will be output into a + file consisting of lines of the form: + + node_id=<idhex> SP bw=<Bw_new> NL + + The first line of the file will contain a timestamp in UNIX time() + seconds. This will be used by the authority to decide if the + measured values are too old to use. + + This file can be either copied or rsynced into a directory readable + by the directory authority. + diff --git a/doc/spec/proposals/162-consensus-flavors.txt b/doc/spec/proposals/162-consensus-flavors.txt new file mode 100644 index 0000000000..8fdf9d07bf --- /dev/null +++ b/doc/spec/proposals/162-consensus-flavors.txt @@ -0,0 +1,178 @@ +Filename: 162-consensus-flavors.txt +Title: Publish the consensus in multiple flavors +Author: Nick Mathewson +Created: 14-May-2009 +Target: 0.2.2 +Status: Open + +Overview: + + This proposal describes a way to publish each consensus in + multiple simultaneous formats, or "flavors". This will reduce the + amount of time needed to deploy new consensus-like documents, and + reduce the size of consensus documents in the long term. + +Motivation: + + In the future, we will almost surely want different fields and + data in the network-status document. Examples include: + - Publishing hashes of microdescriptors instead of hashes of + full descriptors (Proposal 158). + - Including different digests of descriptors, instead of the + perhaps-soon-to-be-totally-broken SHA1. + + Note that in both cases, from the client's point of view, this + information _replaces_ older information. If we're using a + SHA256 hash, we don't need to see the SHA1. If clients only want + microdescriptors, they don't (necessarily) need to see hashes of + other things. + + Our past approach to cases like this has been to shovel all of + the data into the consensus document. But this is rather poor + for bandwidth. Adding a single SHA256 hash to a consensus for + each router increases the compressed consensus size by 47%. 
In + comparison, replacing a single SHA1 hash with a SHA256 hash for + each listed router increases the consensus size by only 18%. + +Design in brief: + + Let the voting process remain as it is, until a consensus is + generated. With future versions of the voting algorithm, instead + of just a single consensus being generated, multiple consensus + "flavors" are produced. + + Consensuses (all of them) include a list of which flavors are + being generated. Caches fetch and serve all flavors of consensus + that are listed, regardless of whether they can parse or validate + them, and serve them to clients. Thus, once this design is in + place, we won't need to deploy more cache changes in order to get + new flavors of consensus to be cached. + + Clients download only the consensus flavor they want. + +A note on hashes: + + Everything in this document is specified to use SHA256, and to be + upgradeable to use better hashes in the future. + +Spec modifications: + + 1. URLs and changes to the current consensus format. + + Every consensus flavor has a name consisting of a sequence of one + or more alphanumeric characters and dashes. For compatibility, the + current descriptor flavor is called "ns". + + The supported consensus flavors are defined as part of the + authorities' consensus method. + + For each supported flavor, every authority calculates another + consensus document of as-yet-unspecified format, and exchanges + detached signatures for these documents as in the current consensus + design. + + In addition to the consensus currently served at + /tor/status-vote/(current|next)/consensus.z , authorities serve + another consensus of each flavor "F" from the location + /tor/status-vote/(current|next)/F/consensus.z. + + When caches serve these documents, they do so from the same + locations. + + 2. Document format: generic consensus.
+ + The format of a flavored consensus is as-yet-unspecified, except + that the first line is: + "network-status-version" SP version SP flavor NL + + where version is 3 or higher, and the flavor is a string + consisting of alphanumeric characters and dashes, matching the + corresponding flavor listed in the unflavored consensus. + + 3. Document format: detached signatures. + + In addition to the current detached signature format, we allow + the first line to take the form, + "consensus-digest" SP flavor SP 1*(Algname "=" Digest) NL + + The consensus-signatures URL should contain the signatures + for _all_ flavors of consensus. + + 4. The consensus index: + + Authorities additionally generate and serve a consensus-index + document. Its format is: + + Header ValidAfter ValidUntil Documents Signatures + + Header = "consensus-index" SP version NL + ValidAfter = as in a consensus + ValidUntil = as in a consensus + Documents = Document* + Document = "document" SP flavor SP SignedLength + 1*(SP AlgorithmName "=" Digest) NL + Signatures = Signature* + Signature = "directory-signature" SP algname SP identity + SP signing-key-digest NL signature + + There must be one Document line for each generated consensus flavor. + Each Document line describes the length of the signed portion of + a consensus (the signatures themselves are not included), along + with one or more digests of that signed portion. Digests are + given in hex. The algorithm "sha256" MUST be included; others + are allowed. + + The algname part of a signature describes what algorithm was + used to hash the identity and signing keys, and to compute the + signature. The algorithm "sha256" MUST be recognized; + signatures with unrecognized algorithms MUST be ignored. + (See below). + + The consensus index is made available at + /tor/status-vote/(current|next)/consensus-index.z. + + Caches should fetch this document so they can check the + correctness of the different consensus documents they fetch. 
+ They do not need to check anything about an unrecognized + consensus document beyond its digest and length. + + 4.1. The "sha256" signature format. + + The 'SHA256' signature format for directory objects is defined as + the RSA signature of the OAEP+-padded SHA256 digest of the SHA256 + digest of the item to be signed. When checking signatures, + the signature MUST be treated as valid if the signature material + begins with SHA256(SHA256(document)); this allows us to add other + data later. + +Considerations: + + - We should not create a new flavor of consensus when adding a + field instead wouldn't be too onerous. + + - We should not proliferate flavors lightly: clients will be + distinguishable based on which flavor they download. + +Migration: + + - Stage one: authorities begin generating and serving + consensus-index files. + + - Stage two: Caches begin downloading consensus-index files, + validating them, and using them to decide what flavors of + consensus documents to cache. They download all listed + documents, and compare them to the digests given in the + consensus. + + - Stage three: Once we want to make a significant change to the + consensus format, we deploy another flavor of consensus at the + authorities. This will immediately start getting cached by the + caches, and clients can start fetching the new flavor without + waiting a version or two for enough caches to begin supporting + it. + +Acknowledgements: + + Aspects of this design and its applications to hash migration were + heavily influenced by IRC conversations with Marian. 
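A sketch of two checks from proposal 162 above: the digest-and-length test that a cache applies to a consensus flavor it cannot parse (section 4), and the prefix rule of section 4.1. It assumes the caller has already isolated the signed portion of the document, and has already applied the RSA public-key operation and removed the padding to obtain the signature material. The function names are hypothetical and are not taken from the Tor source.

```python
import hashlib


def parse_document_line(line):
    """Parse: "document" SP flavor SP SignedLength 1*(SP AlgorithmName "=" Digest)."""
    fields = line.split()
    if len(fields) < 4 or fields[0] != "document":
        raise ValueError("not a document line")
    flavor, signed_length = fields[1], int(fields[2])
    digests = dict(field.split("=", 1) for field in fields[3:])
    if "sha256" not in digests:
        raise ValueError("the sha256 digest MUST be included")
    return flavor, signed_length, digests


def matches_index(signed_portion: bytes, line: str) -> bool:
    """Check an unrecognized consensus against its consensus-index entry.

    Only the length and the hex sha256 digest are compared, which is all
    a cache needs to check for a flavor it does not understand.
    """
    _, signed_length, digests = parse_document_line(line)
    actual = hashlib.sha256(signed_portion).hexdigest()
    return len(signed_portion) == signed_length and actual == digests["sha256"].lower()


def signature_material_ok(material: bytes, document: bytes) -> bool:
    """Valid if the material begins with SHA256(SHA256(document))."""
    expected = hashlib.sha256(hashlib.sha256(document).digest()).digest()
    return material.startswith(expected)
```

Using a prefix test rather than an equality test is what lets later versions append further data after the double hash without breaking older verifiers.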
+ diff --git a/doc/spec/proposals/163-detecting-clients.txt b/doc/spec/proposals/163-detecting-clients.txt new file mode 100644 index 0000000000..d838b17063 --- /dev/null +++ b/doc/spec/proposals/163-detecting-clients.txt @@ -0,0 +1,115 @@ +Filename: 163-detecting-clients.txt +Title: Detecting whether a connection comes from a client +Author: Nick Mathewson +Created: 22-May-2009 +Target: 0.2.2 +Status: Open + + +Overview: + + Some aspects of Tor's design require relays to distinguish + connections from clients from connections that come from relays. + The existing means for doing this is easy to spoof. We propose + a better approach. + +Motivation: + + There are at least two reasons for which Tor servers want to tell + which connections come from clients and which come from other + servers: + + 1) Some exits, proposal 152 notwithstanding, want to disallow + their use as single-hop proxies. + 2) Some performance-related proposals involve prioritizing + traffic from relays, or limiting traffic per client (but not + per relay). + + Right now, we detect client vs server status based on how the + client opens circuits. (Check out the code that implements the + AllowSingleHopExits option if you want all the details.) This + method is depressingly easy to fake, though. This document + proposes better means. + +Goals: + + To make grabbing relay privileges at least as difficult as just + running a relay. + + In the analysis below, "using server privileges" means taking any + action that only servers are supposed to do, like delivering a + BEGIN cell to an exit node that doesn't allow single hop exits, + or claiming server-like amounts of bandwidth. + +Passive detection: + + A connection is definitely a client connection if it takes one of + the TLS methods during setup that does not establish an identity + key. + + A circuit is definitely a client circuit if it is initiated with + a CREATE_FAST cell, though the node could be a client or a server. 
+ + A node that's listed in a recent consensus is probably a server. + + A node to which we have successfully extended circuits from + multiple origins is probably a server. + +Active detection: + + If a node doesn't try to use server privileges at all, we never + need to care whether it's a server. + + When a node or circuit tries to use server privileges, if it is + "definitely a client" as per above, we can refuse it immediately. + + If it's "probably a server" as per above, we can accept it. + + Otherwise, we have either a client, or a server that is neither + listed in any consensus or used by any other clients -- in other + words, a new or private server. + + For these servers, we should attempt to build one or more test + circuits through them. If enough of the circuits succeed, the + node is a real relay. If not, it is probably a client. + + While we are waiting for the test circuits to succeed, we should + allow a short grace period in which server privileges are + permitted. When a test is done, we should remember its outcome + for a while, so we don't need to do it again. + +Why it's hard to do good testing: + + Doing a test circuit starting with an unlisted router requires + only that we have an open connection for it. Doing a test + circuit starting elsewhere _through_ an unlisted router--though + more reliable-- would require that we have a known address, port, + identity key, and onion key for the router. Only the address and + identity key are easily available via the current Tor protocol in + all cases. + + We could fix this part by requiring that all servers support + BEGIN_DIR and support downloading at least a current descriptor + for themselves. + +Open questions: + + What are the thresholds for the needed numbers of circuits + for us to decide that a node is a relay? + + [Suggested answer: two circuits from two distinct hosts.] + + How do we pick grace periods? How long do we remember the + outcome of a test? 
+ + [Suggested answer: 10 minute grace period; 48 hour memory of + test outcomes.] + + If we can build circuits starting at a suspect node, but we don't + have enough information to try extending circuits elsewhere + through the node, should we conclude that the node is + "server-like" or not? + + [Suggested answer: for now, just try making circuits through + the node. Extend this to extending circuits as needed.] + diff --git a/doc/spec/proposals/164-reporting-server-status.txt b/doc/spec/proposals/164-reporting-server-status.txt new file mode 100644 index 0000000000..705f5f1a84 --- /dev/null +++ b/doc/spec/proposals/164-reporting-server-status.txt @@ -0,0 +1,91 @@ +Filename: 164-reporting-server-status.txt +Title: Reporting the status of server votes +Author: Nick Mathewson +Created: 22-May-2009 +Target: 0.2.2 +Status: Open + + +Overview: + + When a given node isn't listed in the directory, it isn't always easy + to tell why. This proposal suggests a quick-and-dirty way for + authorities to export not only how they voted, but why, and a way to + collate the information. + +Motivation: + + Right now, if you want to know the reason why your server was listed + a certain way in the Tor directory, the following steps are + recommended: + + - Look through your log for reports of what the authority said + when you tried to upload. + + - Look at the consensus; see if you're listed. + + - Wait a while, see if things get better. + + - Download the votes from all the authorities, and see how they + voted. Try to figure out why. + + - If you think they'll listen to you, ask some authority + operators to look you up in their mtbf files and logs to see + why they voted as they did. + + This is far too hard. + +Solution: + + We should add a new vote-like information-only document that + authorities serve on request. Call it a "vote info". It is + generated at the same time as a vote, but used only for + determining why a server voted as it did.
It is served from + /tor/status-vote-info/current/authority[.z] + + It differs from a vote in that: + + * Its vote-status field is 'vote-info'. + + * It includes routers that the authority would not include + in its vote. + + For these, it includes an "omitted" line with an English + message explaining why they were omitted. + + * For each router, it includes a line describing its WFU and + MTBF. The format is: + + "stability <mtbf> up-since='date'" + "uptime <wfu> down-since='date'" + + * It describes the WFU and MTBF thresholds it requires to + vote for a given router in various roles in the header. + The format is: + + "flag-requirement <flag-name> <field> <op> <value>" + + e.g. + + "flag-requirement Guard uptime > 80" + + * It includes info on routers all of whose descriptors + were uploaded but rejected over the past few hours. The + "r" lines for these are the same as for regular routers. + The other lines are omitted for these routers, and are + replaced with a single "rejected" line, explaining (in + English) why the router was rejected. + + + A status site (like Torweather or Torstatus or another + tool) can poll these files when they are generated, collate + the data, and make it available to server operators. + +Risks: + + This document makes no provisions for caching these "vote + info" documents. If many people wind up fetching them + aggressively from the authorities, that would be bad. + + + diff --git a/doc/spec/proposals/165-simple-robust-voting.txt b/doc/spec/proposals/165-simple-robust-voting.txt new file mode 100644 index 0000000000..f813285a83 --- /dev/null +++ b/doc/spec/proposals/165-simple-robust-voting.txt @@ -0,0 +1,133 @@ +Filename: 165-simple-robust-voting.txt +Title: Easy migration for voting authority sets +Author: Nick Mathewson +Created: 2009-05-28 +Status: Open + +Overview: + + This proposal describes an easy-to-implement, easy-to-verify way to + change the set of authorities without creating a "flag day" situation.
+ +Motivation: + + From proposal 134 ("More robust consensus voting with diverse + authority sets") by Peter Palfrader: + + Right now there are about five authoritative directory servers + in the Tor network, tho this number is expected to rise to about + 15 eventually. + + Adding a new authority requires synchronized action from all + operators of directory authorities so that at any time during the + update at least half of all authorities are running and agree on + who is an authority. The latter requirement is there so that the + authorities can arrive at a common consensus: Each authority + builds the consensus based on the votes from all authorities it + recognizes, and so a different set of recognized authorities will + lead to a different consensus document. + + In response to this problem, proposal 134 suggested that every + candidate authority list in its vote whom it believes to be an + authority. These A-says-B-is-an-authority relationships form a + directed graph. Each authority then iteratively finds the largest + clique in the graph and removes it, until it finds one containing + itself. It votes with this clique. + + Proposal 134 had some problems: + + - It had a security problem in that M hostile authorities in a + clique could effectively kick out M-1 honest authorities. This + could enable a minority of the original authorities to take over. + + - It was too complex in its implications to analyze well: it took us + over a year to realize that it was insecure. + + - It tried to solve a bigger problem: general fragmentation of + authority trust. Really, all we wanted to have was the ability to + add and remove authorities without forcing a flag day. + +Proposed protocol design: + + A "Voting Set" is a set of authorities. Each authority has a list of + the voting sets it considers acceptable. These sets are chosen + manually by the authority operators. They must always contain the + authority itself.
Each authority lists all of these voting sets in + its votes. + + Authorities exchange votes with every other authority in any of their + voting sets. + + When it is time to calculate a consensus, an authority votes with + whichever voting set it lists that is listed by the most members of + that set. In other words, given two sets S1 and S2 that an authority + lists, that authority will prefer to vote with S1 over S2 whenever + the number of other authorities in S1 that themselves list S1 is + higher than the number of other authorities in S2 that themselves + list S2. + + For example, suppose authority A recognizes two sets, "A B C D" and + "A E F G H". Suppose that the first set is recognized by all of A, + B, C, and D, whereas the second set is recognized only by A, E, and + F. Because the first set is recognized by more of the authorities in + it than the other one, A will vote with the first set. + + Ties are broken by some arbitrary function of the identity + keys of the authorities in the set. + +How to migrate authority sets: + + In steady state, each authority operator should list only the current + actual voting set as accepted. + + When we want to add an authority, each authority operator configures + his or her server to list two voting sets: one containing all the old + authorities, and one containing the old authorities and the new + authority too. Once all authorities are listing the new set of + authorities, they will start voting with that set because of its + size. + + What if one or two authority operators are slow to list the new set? + Then the other operators can stop listing the old set once there are + enough authorities listing the new set to make its voting successful. + (Note that these authorities not listing the new set will still have + their votes counted, since they themselves will be members of the new + set. They will only fail to sign the consensus generated by the + other authorities who are using the new set.)
+ + When we want to remove an authority, the operators list two voting + sets: one containing all the authorities, and one omitting the + authority we want to remove. Once enough authorities list the new + set as acceptable, we start having authority operators stop listing + the old set. Once there are more listing the new set than the old + set, the new set will win. + +Data format changes: + + Add a new 'voting-set' line to the vote document format. Allow it to + occur any number of times. Its format is: + + voting-set SP 'fingerprint' SP 'fingerprint' ... NL + + where each fingerprint is the hex fingerprint of an identity key of + an authority. Sort fingerprints in ascending order. + + When the consensus method is at least 'X' (decide this when we + implement the proposal), add this line to the consensus format as + well, before the first dir-source line. [This information is not + redundant with the dir-source sections in the consensus: If an + authority is recognized but didn't vote, that authority will appear in + the voting-set line but not in the dir-source sections.] + + We don't need to list other information about authorities in our + vote. + +Migration issues: + + We should keep track somewhere of which Tor client versions + recognized which authorities. + +Acknowledgments: + + The design came out of an IRC conversation with Peter Palfrader. He + had the basic idea first. 
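The voting-set preference rule from proposal 165 above can be sketched as follows. This is a minimal Python illustration rather than Tor's implementation: the function name and data layout are hypothetical, and because the proposal leaves the tie-break as "some arbitrary function of the identity keys", the sketch simply assumes the set whose sorted member list compares lowest wins a tie.

```python
def choose_voting_set(me, my_sets, listed_by):
    """Pick the voting set that authority `me` should vote with.

    my_sets:   non-empty iterable of sets of authority identifiers that
               `me` lists as acceptable.
    listed_by: dict mapping an authority identifier to the collection of
               frozensets which that authority lists in its own vote.
    """
    def support(voting_set):
        # Count the *other* members of the set that themselves list it.
        return sum(
            1
            for other in voting_set
            if other != me and voting_set in listed_by.get(other, ())
        )

    candidates = [frozenset(s) for s in my_sets]
    # Highest support wins. Assumed tie-break: lowest sorted member list.
    return min(candidates, key=lambda s: (-support(s), sorted(s)))


# The example from the proposal: A lists "A B C D" and "A E F G H".
s1 = frozenset("ABCD")
s2 = frozenset("AEFGH")
votes = {"A": {s1, s2}, "B": {s1}, "C": {s1}, "D": {s1}, "E": {s2}, "F": {s2}}
chosen = choose_voting_set("A", [s1, s2], votes)
```

Here `chosen` is the first set, since three other members (B, C, D) list it while only two other members (E, F) list the second.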
diff --git a/doc/spec/proposals/166-statistics-extra-info-docs.txt b/doc/spec/proposals/166-statistics-extra-info-docs.txt new file mode 100644 index 0000000000..ab2716a71c --- /dev/null +++ b/doc/spec/proposals/166-statistics-extra-info-docs.txt @@ -0,0 +1,391 @@ +Filename: 166-statistics-extra-info-docs.txt +Title: Including Network Statistics in Extra-Info Documents +Author: Karsten Loesing +Created: 21-Jul-2009 +Target: 0.2.2 +Status: Accepted + +Change history: + + 21-Jul-2009 Initial proposal for or-dev + + +Overview: + + The Tor network has grown to almost two thousand relays and millions + of casual users over the past few years. With growth have come + increasing performance problems and attempts by some countries to + block access to the Tor network. In order to address these problems, + we need to learn more about the Tor network. This proposal suggests + measuring additional statistics and including them in extra-info documents + to help us understand the Tor network better. + + +Introduction: + + As of May 2009, relays, bridges, and directories gather the following + data for statistical purposes: + + - Relays and bridges count the number of bytes that they have pushed + in 15-minute intervals over the past 24 hours. Relays and bridges + include these data in extra-info documents that they send to the + directory authorities whenever they publish their server descriptor. + + - Bridges further include a rough number of clients per country that + they have seen in the past 48 hours in their extra-info documents. + + - Directories can be configured to count the number of clients they + see per country in the past 24 hours and to write them to a local + file. + + Since then we have extended the network statistics in Tor. These statistics + include: + + - Directories now gather more precise statistics about connecting + clients. Fixes include measuring in intervals of exactly 24 hours, + counting unsuccessful requests, measuring download times, etc.
The + directories append their statistics to a local file every 24 hours. + + - Entry guards count the number of clients per country per day like + bridges do and write them to a local file every 24 hours. + + - Relays measure statistics of the number of cells in their circuit + queues and how much time these cells spend waiting there. Relays + write these statistics to a local file every 24 hours. + + - Exit nodes count the number of read and written bytes on exit + connections per port as well as the number of opened exit streams + per port in 24-hour intervals. Exit nodes write their statistics to + a local file. + + The following four sections contain descriptions for adding these + statistics to the relays' extra-info documents. + + +Directory request statistics: + + The first type of statistics aims at measuring directory requests sent + by clients to a directory mirror or directory authority. More + precisely, these statistics aim at requests for v2 and v3 network + statuses only. These directory requests are sent non-anonymously, + either via HTTP-like requests to a directory's Dir port or tunneled + over a 1-hop circuit. + + Measuring directory request statistics is useful for several reasons: + First, the number of locally seen directory requests can be used to + estimate the total number of clients in the Tor network. Second, the + country-wise classification of requests using a GeoIP database can + help counting the relative and absolute number of users per country. + Third, the download times can give hints on the available bandwidth + capacity at clients. + + Directory requests do not give any hints on the contents that clients + send or receive over the Tor network. Every client requests network + statuses from the directories, so that there are no anonymity-related + concerns to gather these statistics. It might be, though, that clients + wish to hide the fact that they are connecting to the Tor network. 
+ Therefore, IP addresses are resolved to country codes in memory, + events are accumulated over 24 hours, and numbers are rounded up to + multiples of 4 or 8. + + "dirreq-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL + [At most once.] + + YYYY-MM-DD HH:MM:SS defines the end of the included measurement + interval of length NSEC seconds (86400 seconds by default). + + A "dirreq-stats-end" line, as well as any other "dirreq-*" line, + is only added when the relay has opened its Dir port and after 24 + hours of measuring directory requests. + + "dirreq-v2-ips" CC=N,CC=N,... NL + [At most once.] + "dirreq-v3-ips" CC=N,CC=N,... NL + [At most once.] + + List of mappings from two-letter country codes to the number of + unique IP addresses that have connected from that country to + request a v2/v3 network status, rounded up to the nearest multiple + of 8. Only those IP addresses are counted that the directory can + answer with a 200 OK status code. + + "dirreq-v2-reqs" CC=N,CC=N,... NL + [At most once.] + "dirreq-v3-reqs" CC=N,CC=N,... NL + [At most once.] + + List of mappings from two-letter country codes to the number of + requests for v2/v3 network statuses from that country, rounded up + to the nearest multiple of 8. Only those requests are counted that + the directory can answer with a 200 OK status code. + + "dirreq-v2-share" num% NL + [At most once.] + "dirreq-v3-share" num% NL + [At most once.] + + The share of v2/v3 network status requests that the directory + expects to receive from clients based on its advertised bandwidth + compared to the overall network bandwidth capacity. Shares are + formatted in percent with two decimal places. Shares are + calculated as means over the whole 24-hour interval. + + "dirreq-v2-resp" status=num,... NL + [At most once.] + "dirreq-v3-resp" status=num,... NL + [At most once.]
+ + List of mappings from response statuses to the number of requests + for v2/v3 network statuses that were answered with that response + status, rounded up to the nearest multiple of 4. Only response + statuses with at least 1 response are reported. New response + statuses can be added at any time. The current list of response + statuses is as follows: + + "ok": a network status request is answered; this number + corresponds to the sum of all requests as reported in + "dirreq-v2-reqs" or "dirreq-v3-reqs", respectively, before + rounding up. + "not-enough-sigs": a version 3 network status is not signed by a + sufficient number of requested authorities. + "unavailable": a requested network status object is unavailable. + "not-found": a requested network status is not found. + "not-modified": a network status has not been modified since the + If-Modified-Since time that is included in the request. + "busy": the directory is busy. + + "dirreq-v2-direct-dl" key=val,... NL + [At most once.] + "dirreq-v3-direct-dl" key=val,... NL + [At most once.] + "dirreq-v2-tunneled-dl" key=val,... NL + [At most once.] + "dirreq-v3-tunneled-dl" key=val,... NL + [At most once.] + + List of statistics about possible failures in the download process + of v2/v3 network statuses. Requests are either "direct" + HTTP-encoded requests over the relay's directory port, or + "tunneled" requests using a BEGIN_DIR cell over the relay's OR + port. The list of possible statistics can change, and statistics + can be left out from reporting. The current list of statistics is + as follows: + + Successful downloads and failures: + + "complete": a client has finished the download successfully. + "timeout": a download did not finish within 10 minutes after + starting to send the response. + "running": a download is still running at the end of the + measurement period for less than 10 minutes after starting to + send the response.
+ + Download times: + + "min", "max": smallest and largest measured bandwidth in B/s. + "d[1-4,6-9]": 1st to 4th and 6th to 9th decile of measured + bandwidth in B/s. For a given decile i, i/10 of all downloads + had a smaller bandwidth than di, and (10-i)/10 of all downloads + had a larger bandwidth than di. + "q[1,3]": 1st and 3rd quartile of measured bandwidth in B/s. One + fourth of all downloads had a smaller bandwidth than q1, one + fourth of all downloads had a larger bandwidth than q3, and the + remaining half of all downloads had a bandwidth between q1 and + q3. + "md": median of measured bandwidth in B/s. Half of the downloads + had a smaller bandwidth than md, the other half had a larger + bandwidth than md. + + +Entry guard statistics: + + Entry guard statistics include the number of clients per country and + per day that are connecting directly to an entry guard. + + Entry guard statistics are important to learn more about the + distribution of clients to countries. In the future, this knowledge + can be useful to detect if there are or start to be any restrictions + for clients connecting from specific countries. + + The information which client connects to a given entry guard is very + sensitive. This information must not be combined with the information + what contents are leaving the network at the exit nodes. Therefore, + entry guard statistics need to be aggregated to prevent them from + becoming useful for de-anonymization. Aggregation includes resolving + IP addresses to country codes, counting events over 24-hour intervals, + and rounding up numbers to the next multiple of 8. + + "entry-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL + [At most once.] + + YYYY-MM-DD HH:MM:SS defines the end of the included measurement + interval of length NSEC seconds (86400 seconds by default). + + An "entry-stats-end" line, as well as any other "entry-*" + line, is first added after the relay has been running for at least + 24 hours. + + "entry-ips" CC=N,CC=N,... 
NL + [At most once.] + + List of mappings from two-letter country codes to the number of + unique IP addresses that have connected from that country to the + relay and which are not known to be other relays, rounded up to the + nearest multiple of 8. + + +Cell statistics: + + The third type of statistics has to do with the time that cells spend + in circuit queues. In order to gather these statistics, the relay + memorizes when it puts a given cell in a circuit queue and when this + cell is flushed. The relay further notes the life time of the circuit. + These data are sufficient to determine the mean number of cells in a + queue over time and the mean time that cells spend in a queue. + + Cell statistics are necessary to learn more about possible reasons for + the poor network performance of the Tor network, especially high + latencies. The same statistics are also useful to determine the + effects of design changes by comparing today's data with future data. + + There are basically no privacy concerns from measuring cell + statistics, regardless of a node being an entry, middle, or exit node. + + "cell-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL + [At most once.] + + YYYY-MM-DD HH:MM:SS defines the end of the included measurement + interval of length NSEC seconds (86400 seconds by default). + + A "cell-stats-end" line, as well as any other "cell-*" line, + is first added after the relay has been running for at least 24 + hours. + + "cell-processed-cells" num,...,num NL + [At most once.] + + Mean number of processed cells per circuit, subdivided into + deciles of circuits by the number of cells they have processed in + descending order from loudest to quietest circuits. + + "cell-queued-cells" num,...,num NL + [At most once.] + + Mean number of cells contained in queues by circuit decile.
These + means are calculated by 1) determining the mean number of cells in + a single circuit between its creation and its termination and 2) + calculating the mean for all circuits in a given decile as + determined in "cell-processed-cells". Numbers have a precision of + two decimal places. + + "cell-time-in-queue" num,...,num NL + [At most once.] + + Mean time cells spend in circuit queues in milliseconds. Times are + calculated by 1) determining the mean time cells spend in the + queue of a single circuit and 2) calculating the mean for all + circuits in a given decile as determined in + "cell-processed-cells". + + "cell-circuits-per-decile" num NL + [At most once.] + + Mean number of circuits that are included in any of the deciles, + rounded up to the next integer. + + +Exit statistics: + + The last type of statistics affects exit nodes counting the number of + bytes written and read and the number of streams opened per port and + per 24 hours. Exit port statistics can be measured from looking at + headers of BEGIN and DATA cells. A BEGIN cell contains the exit port + that is required for the exit node to open a new exit stream. + Subsequent DATA cells coming from the client or being sent back to the + client contain a length field stating how many bytes of application + data are contained in the cell. + + Exit port statistics are important to measure in order to identify + possible load-balancing problems with respect to exit policies. Exit + nodes that permit more ports than others are very likely overloaded + with traffic for those ports plus traffic for other ports. Improving + load balancing in the Tor network improves the overall utilization of + bandwidth capacity. + + Exit traffic is one of the most sensitive parts of network data in the + Tor network. Even though these statistics do not require looking at + traffic contents, statistics are aggregated so that they are not + useful for de-anonymizing users. 
Only those ports are reported that + have seen at least 0.1% of exiting or incoming bytes, numbers of bytes + are rounded up to full kibibytes (KiB), and stream numbers are rounded + up to the next multiple of 4. + + "exit-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL + [At most once.] + + YYYY-MM-DD HH:MM:SS defines the end of the included measurement + interval of length NSEC seconds (86400 seconds by default). + + An "exit-stats-end" line, as well as any other "exit-*" line, is + first added after the relay has been running for at least 24 hours + and only if the relay permits exiting (where exiting to a single + port and IP address is sufficient). + + "exit-kibibytes-written" port=N,port=N,... NL + [At most once.] + "exit-kibibytes-read" port=N,port=N,... NL + [At most once.] + + List of mappings from ports to the number of kibibytes that the + relay has written to or read from exit connections to that port, + rounded up to the next full kibibyte. + + "exit-streams-opened" port=N,port=N,... NL + [At most once.] + + List of mappings from ports to the number of opened exit streams + to that port, rounded up to the nearest multiple of 4. + + +Implementation notes: + + Right now, relays that are configured accordingly write similar + statistics to those described in this proposal to disk every 24 hours. + With this proposal being implemented, relays include the contents of + these files in extra-info documents. + + The following steps are necessary to implement this proposal: + + 1. The current format of [dirreq|entry|buffer|exit]-stats files needs + to be adapted to the description in this proposal. This step + basically means renaming keywords. + + 2. The timing of writing the four *-stats files should be unified, so + that they are written exactly 24 hours after starting the + relay. 
Right now, the measurement intervals for dirreq, entry, and + exit stats start with the first observed request, and files are + written when observing the first request that occurs more than 24 + hours after the beginning of the measurement interval. With this + proposal, the measurement intervals should all start at the same + time, and files should be written exactly 24 hours later. + + 3. It is advantageous to cache statistics in local files in the data + directory until they are included in extra-info documents. The + reason is that the 24-hour measurement interval can be very + different from the 18-hour publication interval of extra-info + documents. When a relay crashes after finishing a measurement + interval, but before publishing the next extra-info document, + statistics would get lost. Therefore, statistics are written to + disk when finishing a measurement interval and read from disk when + generating an extra-info document. Only the statistics that were + appended to the *-stats files within the past 24 hours are included + in extra-info documents. Further, the contents of the *-stats files + need to be checked in the process of generating extra-info documents. + + 4. With the statistics patches being tested, the ./configure options + should be removed and the statistics code be compiled by default. + It is still required for relay operators to add configuration + options (DirReqStatistics, ExitPortStatistics, etc.) to enable + gathering statistics. However, in the near future, statistics shall + be gathered by all relays by default, where requiring a + ./configure option would be a barrier for many relay operators.
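The aggregation rule that proposal 166 above applies to its per-country lines (counts rounded up to the next multiple of 8, or 4 for some statistics) can be sketched as below. This is a hedged Python illustration, not the relay code: the function names are hypothetical, and the ordering of the CC=N entries (sorted by country code) is an assumption, since the proposal does not specify one.

```python
def round_up(count, multiple):
    """Round a non-negative integer up to the next multiple of `multiple`."""
    return -(-count // multiple) * multiple


def format_country_line(keyword, counts, multiple=8):
    """Build a line such as 'entry-ips de=16,us=24'.

    counts maps two-letter country codes to exact counts. The exact values
    never appear in the output, only the rounded-up ones.
    Assumption: entries are emitted sorted by country code.
    """
    pairs = ",".join(
        f"{cc}={round_up(n, multiple)}" for cc, n in sorted(counts.items())
    )
    return f"{keyword} {pairs}"
```

For example, `format_country_line("entry-ips", {"us": 17, "de": 9})` yields `entry-ips de=16,us=24`, so an observer cannot tell whether 9 or 16 addresses were seen from "de".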
diff --git a/doc/spec/proposals/167-params-in-consensus.txt b/doc/spec/proposals/167-params-in-consensus.txt new file mode 100644 index 0000000000..7649c040cd --- /dev/null +++ b/doc/spec/proposals/167-params-in-consensus.txt @@ -0,0 +1,47 @@ +Filename: 167-params-in-consensus.txt +Title: Vote on network parameters in consensus +Author: Roger Dingledine +Created: 18-Aug-2009 +Status: Open +Target: 0.2.2 + +0. History + + +1. Overview + + Several of our new performance plans involve guessing how to tune + clients and relays, yet we won't be able to learn whether we guessed + the right tuning parameters until many people have upgraded. Instead, + we should have directory authorities vote on the parameters, and teach + Tors to read the currently recommended values out of the consensus. + +2. Design + + V3 votes should include a new "params" line after the known-flags + line. It contains key=value pairs, where value is an integer. + + Consensus documents that are generated with a sufficiently new consensus + method (7?) then include a params line that includes every key listed + in any vote, and the median value for that key (in case of ties, + we use the median closer to zero). + +2.1. Planned keys. + + The first planned parameter is "circwindow=101", which is the initial + circuit packaging window that clients and relays should use. Putting + it in the consensus will let us perform experiments with different + values once enough Tors have upgraded -- see proposal 168. + + Later parameters might include a weighting for how much to favor quiet + circuits over loud circuits in our round-robin algorithm; a weighting + for how much to prioritize relays over clients if we use an incentive + scheme like the gold-star design; and what fraction of circuits we + should throw out from proposal 151. + +2.2. What about non-integers? + + I'm not sure how we would do median on non-integer values. Further, + I don't have any non-integer values in mind yet. 
So I say we cross + that bridge when we get to it. + diff --git a/doc/spec/proposals/168-reduce-circwindow.txt b/doc/spec/proposals/168-reduce-circwindow.txt new file mode 100644 index 0000000000..c10cf41e2e --- /dev/null +++ b/doc/spec/proposals/168-reduce-circwindow.txt @@ -0,0 +1,134 @@ +Filename: 168-reduce-circwindow.txt +Title: Reduce default circuit window +Author: Roger Dingledine +Created: 12-Aug-2009 +Status: Open +Target: 0.2.2 + +0. History + + +1. Overview + + We should reduce the starting circuit "package window" from 1000 to + 101. The lower package window will mean that clients will only be able + to receive 101 cells (~50KB) on a circuit before they need to send a + 'sendme' acknowledgement cell to request 100 more. + + Starting with a lower package window on exit relays should save on + buffer sizes (and thus memory requirements for the exit relay), and + should save on queue sizes (and thus latency for users). + + Lowering the package window will induce an extra round-trip for every + additional 50298 bytes of the circuit. This extra step is clearly a + slow-down for large streams, but ultimately we hope that a) clients + fetching smaller streams will see better response, and b) slowing + down the large streams in this way will produce lower e2e latencies, + so the round-trips won't be so bad. + +2. Motivation + + Karsten's torperf graphs show that the median download time for a 50KB + file over Tor in mid 2009 is 7.7 seconds, whereas the median download + time for 1MB and 5MB are around 50s and 150s respectively. The 7.7 + second figure is way too high, whereas the 50s and 150s figures are + surprisingly low. + + The median round-trip latency appears to be around 2s, with 25% of + the data points taking more than 5s. That's a lot of variance. + + We designed Tor originally with the original goal of maximizing + throughput. We figured that would also optimize other network properties + like round-trip latency. Looks like we were wrong. + +3. 
Design + + Wherever we initialize the circuit package window, initialize it to + 101 rather than 1000. Reducing it should be safe even when interacting + with old Tors: the old Tors will receive the 101 cells and send back + a sendme ack cell. They'll still have much higher deliver windows, + but the rest of their deliver window will go unused. + + You can find the patch at arma/circwindow. It seems to work. + +3.1. Why not 100? + + Tor 0.0.0 through 0.2.1.19 have a bug where they only send the sendme + ack cell after 101 cells rather than the intended 100 cells. + + Once 0.2.1.19 is obsolete we can change it back to 100 if we like. But + hopefully we'll have moved to some datagram protocol long before + 0.2.1.19 becomes obsolete. + +3.2. What about stream packaging windows? + + Right now the stream packaging windows start at 500. The goal was to + set the stream window to half the circuit window, to provide a crude + load balancing between streams on the same circuit. Once we lower + the circuit packaging window, the stream packaging window basically + becomes redundant. + + We could leave it in -- it isn't hurting much in either case. Or we + could take it out -- people building other Tor clients would thank us + for that step. Alas, people building other Tor clients are going to + have to be compatible with current Tor clients, so in practice there's + no point taking out the stream packaging windows. + +3.3. What about variable circuit windows? + + Once upon a time we imagined adapting the circuit package window to + the network conditions. That is, we would start the window small, + and raise it based on the latency and throughput we see. + + In theory that crude imitation of TCP's windowing system would allow + us to adapt to fill the network better. In practice, I think we want + to stick with the small window and never raise it. The low cap reduces + the total throughput you can get from Tor for a given circuit. But + that's a feature, not a bug. + +4. 
Evaluation + + How do we know this change is actually smart? It seems intuitive that + it's helpful, and some smart systems people have agreed that it's + a good idea (or said another way, they were shocked at how big the + default package window was before). + + To get a more concrete sense of the benefit, though, Karsten has been + running torperf side-by-side on exit relays with the old package window + vs the new one. The results are mixed currently -- it is slightly faster + for fetching 40KB files, and slightly slower for fetching 50KB files. + + I think it's going to be tough to get a clear conclusion that this is + a good design just by comparing one exit relay running the patch. The + trouble is that the other hops in the circuits are still getting bogged + down by other clients introducing too much traffic into the network. + + Ultimately, we'll want to put the circwindow parameter into the + consensus so we can test a broader range of values once enough relays + have upgraded. + +5. Transition and deployment + + We should put the circwindow in the consensus (see proposal 167), + with an initial value of 101. Then as more exit relays upgrade, + clients should seamlessly get the better behavior. + + Note that upgrading the exit relay will only affect the "download" + package window. An old client that's uploading lots of bytes will + continue to use the old package window at the client side, and we + can't throttle that window at the exit side without breaking protocol. + + The real question then is what we should backport to 0.2.1. Assuming + this could be a big performance win, we can't afford to wait until + 0.2.2.x comes out before starting to see the changes here. So we have + two options as I see them: + a) once clients in 0.2.2.x know how to read the value out of the + consensus, and it's been tested for a bit, backport that part to + 0.2.1.x. + b) if it's too complex to backport, just pick a number, like 101, and + backport that number. 
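A minimal sketch of what option (a) asks a client to do, combined with the median rule from proposal 167. It assumes a params line is the keyword "params" followed by space-separated key=value pairs; the proposals do not fix the exact syntax, and every function name here is hypothetical rather than taken from Tor's source. The fallback of 1000 is the old default package window mentioned in the overview.

```python
from collections import defaultdict


def parse_params_line(line):
    """Parse a 'params key=value ...' line into a dict of integers.

    The exact line syntax is an assumption made for this sketch.
    """
    keyword, _, rest = line.strip().partition(" ")
    if keyword != "params":
        raise ValueError("not a params line: %r" % line)
    params = {}
    for pair in rest.split():
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError("malformed pair: %r" % pair)
        params[key] = int(value)  # proposal 167: values are integers
    return params


def median_toward_zero(values):
    """Median of a non-empty list; on a tie, take the middle value closer to zero."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        return ordered[n // 2]
    low, high = ordered[n // 2 - 1], ordered[n // 2]
    # Assumption: if both candidates are equally far from zero, keep the lower one.
    return low if abs(low) <= abs(high) else high


def consensus_params(vote_lines):
    """Combine the params lines of several votes: every key seen, median value per key."""
    seen = defaultdict(list)
    for line in vote_lines:
        for key, value in parse_params_line(line).items():
            seen[key].append(value)
    return {key: median_toward_zero(vals) for key, vals in seen.items()}


def get_param(params, key, default):
    """Read a consensus parameter, falling back to a built-in default if absent."""
    return params.get(key, default)


votes = [
    "params circwindow=101",
    "params circwindow=1000",
    "params circwindow=101",
]
params = consensus_params(votes)
print(get_param(params, "circwindow", 1000))
```

A client that finds no circwindow key keeps its compiled-in default, which is what would let the same parsing code be backported before any authority votes on the parameter.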
+ + Clearly choice (a) is the better one if the consensus parsing part + isn't very complex. Let's shoot for that, and fall back to (b) if the + patch turns out to be so big that we reconsider. + diff --git a/doc/spec/proposals/ideas/xxx-bwrate-algs.txt b/doc/spec/proposals/ideas/xxx-bwrate-algs.txt new file mode 100644 index 0000000000..757f5bc55e --- /dev/null +++ b/doc/spec/proposals/ideas/xxx-bwrate-algs.txt @@ -0,0 +1,106 @@ +# The following two algorithms + + +# Algorithm 1 +# TODO: Burst and Relay/Regular differentiation + +BwRate = Bandwidth Rate in Bytes Per Second +GlobalWriteBucket = 0 +GlobalReadBucket = 0 +Epoch = Token Fill Rate in seconds: suggest 50ms=.050 +SecondCounter = 0 +MinWriteBytes = Minimum amount bytes per write + +Every Epoch Seconds: + UseMinWriteBytes = MinWriteBytes + WriteCnt = 0 + ReadCnt = 0 + BytesRead = 0 + + For Each Open OR Conn with pending write data: + WriteCnt++ + For Each Open OR Conn: + ReadCnt++ + + BytesToRead = (BwRate*Epoch + GlobalReadBucket)/ReadCnt + BytesToWrite = (BwRate*Epoch + GlobalWriteBucket)/WriteCnt + + if BwRate/WriteCnt < MinWriteBytes: + # If we aren't likely to accumulate enough bytes in a second to + # send a whole cell for our connections, send partials + Log(NOTICE, "Too many ORCons to write full blocks. Sending short packets.") + UseMinWriteBytes = 1 + # Other option: We could switch to plan 2 here + + # Service each writable ORConn. If there are any partial writes, + # return remaining bytes from this epoch to the global pool + For Each Open OR Conn with pending write data: + ORConn->write_bucket += BytesToWrite + if ORConn->write_bucket > UseMinWriteBytes: + w = write(ORConn, MIN(len(ORConn->write_data), ORConn->write_bucket)) + # possible that w < ORConn->write_data here due to TCP pushback. 
+ # We should restore the rest of the write_bucket to the global + # buffer + GlobalWriteBucket += (ORConn->write_bucket - w) + ORConn->write_bucket = 0 + + For Each Open OR Conn: + r = read_nonblock(ORConn, BytesToRead) + BytesRead += r + + SecondCounter += Epoch + if SecondCounter < 1: + # Save unused bytes from this epoch to be used later in the second + GlobalReadBucket += (BwRate*Epoch - BytesRead) + else: + SecondCounter = 0 + GlobalReadBucket = 0 + GlobalWriteBucket = 0 + For Each ORConn: + ORConn->write_bucket = 0 + + + +# Alternate plan for Writing fairly. Reads would still be covered +# by plan 1 as there is no additional network overhead for short reads, +# so we don't need to try to avoid them. +# +# I think this is actually pretty similar to what we do now, but +# with the addition that the bytes accumulate up to the second mark +# and we try to keep track of our position in the write list here +# (unless libevent is doing that for us already and I just don't see it) +# +# TODO: Burst and Relay/Regular differentiation + +# XXX: The inability to send single cells will cause us to block +# on EXTEND cells for low-bandwidth node pairs.. +BwRate = Bandwidth Rate in Bytes Per Second +WriteBytes = Bytes per write +Epoch = MAX(MIN(WriteBytes/BwRate, .333s), .050s) + +SecondCounter = 0 +GlobalWriteBucket = 0 + +# New connections are inserted at Head-1 (the 'tail' of this circular list) +# This is not 100% fifo for all node data, but it is the best we can do +# without insane amounts of additional queueing complexity. 
+WriteConnList = List of Open OR Conns with pending write data > WriteBytes +WriteConnHead = 0 + +Every Epoch Seconds: + GlobalWriteBucket += BwRate*Epoch + WriteListEnd = WriteConnHead + + do + ORConn = WriteConnList[WriteConnHead] + w = write(ORConn, WriteBytes) + GlobalWriteBucket -= w + WriteConnHead += 1 + while GlobalWriteBucket > 0 and WriteConnHead != WriteListEnd + + SecondCounter += Epoch + if SecondCounter >= 1: + SecondCounter = 0 + GlobalWriteBucket = 0 + + diff --git a/doc/spec/proposals/ideas/xxx-choosing-crypto-in-tor-protocol.txt b/doc/spec/proposals/ideas/xxx-choosing-crypto-in-tor-protocol.txt new file mode 100644 index 0000000000..e8489570f7 --- /dev/null +++ b/doc/spec/proposals/ideas/xxx-choosing-crypto-in-tor-protocol.txt @@ -0,0 +1,138 @@ +Filename: xxx-choosing-crypto-in-tor-protocol.txt +Title: Picking cryptographic standards in the Tor wire protocol +Author: Marian +Created: 2009-05-16 +Status: Draft + +Motivation: + + SHA-1 is horribly outdated and not suited for security critical + purposes. SHA-2, RIPEMD-160, Whirlpool and Tiger are good options + for a short-term replacement, but in the long run, we will + probably want to upgrade to the winner or a semi-finalist of the + SHA-3 competition. + + For a 2006 comparison of different hash algorithms, read: + http://www.sane.nl/sane2006/program/final-papers/R10.pdf + + Other reading about SHA-1: + http://www.schneier.com/blog/archives/2005/02/sha1_broken.html + http://www.schneier.com/blog/archives/2005/08/new_cryptanalyt.html + http://www.schneier.com/paper-preimages.html + + Additionally, AES has been theoretically broken for years. While + the attack is still not efficient enough that the public sector + has been able to prove that it works, we should probably consider + the time between a theoretical attack and a practical attack as an + opportunity to figure out how to upgrade to a better algorithm, + such as Twofish. 
+ + See: + http://schneier.com/crypto-gram-0209.html#1 + +Design: + + I suggest that nodes should publish in directories which + cryptographic standards, such as hash algorithms and ciphers, + they support. Clients communicating with nodes will then + pick whichever of those cryptographic standards they prefer + the most. In the case that the node does not publish which + cryptographic standards it supports, the client should assume + that the server supports the older standards, such as SHA-1 + and AES, until such time as we choose to desupport those + standards. + + Node to node communications could work similarly. However, in + case they both support a set of algorithms but have different + preferences, the disagreement would have to be resolved + somehow. Two possibilities include: + * the node requesting communications presents which + cryptographic standards it supports in the request. The + other node picks. + * both nodes send each other lists of what they support and + what version of Tor they are using. The newer node picks, + based on the assumption that the newer node has the most up + to date information about which hash algorithm is the best. + Of course, the node could lie about its version, but then + again, it could also maliciously choose only to support older + algorithms. + + Using this method, we could potentially add server side support + to hash algorithms and ciphers before we instruct clients to + begin preferring those hash algorithms and ciphers. In this way, + the clients could upgrade and the servers would already support + the newly preferred hash algorithms and ciphers, even if the + servers were still using older versions of Tor, so long as the + older versions of Tor were at least new enough to have server + side support. + + This would make quickly upgrading to new hash algorithms and + ciphers easier. This could be very useful when new attacks + are published. 
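The client-side selection rule described in the Design section can be sketched as follows. This is only an illustration under stated assumptions: the algorithm identifiers, the dictionary layout of a node's published list, and all function and variable names are hypothetical, and falling back to the legacy defaults for a category the node omits is a guess at how a partial listing would be treated.

```python
# Hypothetical names throughout; nothing here is taken from Tor's source.

# What a client assumes when a node publishes nothing (per the proposal text).
LEGACY_DEFAULTS = {"hash": ["sha1"], "cipher": ["aes"]}

# Example client preference lists, most preferred first (illustrative values).
CLIENT_PREFERENCES = {
    "hash": ["sha256", "sha1"],
    "cipher": ["aes"],
}


def pick_standards(published, preferences=CLIENT_PREFERENCES):
    """Return, per category, the client's most preferred option the node supports.

    `published` maps a category name to the list of algorithms the node
    advertises in the directory, or is None/empty if it advertises nothing.
    """
    supported = published if published else LEGACY_DEFAULTS
    chosen = {}
    for category, ranked in preferences.items():
        # Assumption: a category missing from the node's list means "legacy only".
        offered = set(supported.get(category, LEGACY_DEFAULTS.get(category, [])))
        for name in ranked:
            if name in offered:
                chosen[category] = name
                break
        else:
            raise ValueError("no common %s algorithm" % category)
    return chosen


# A node that advertises a newer hash, and one that advertises nothing.
print(pick_standards({"hash": ["sha1", "sha256"], "cipher": ["aes"]}))
print(pick_standards(None))
```

Because the node only advertises and the client alone ranks, servers can add support for an algorithm well before any client is told to prefer it.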
+ + One concern is that client preferences could expose the client + to segmentation attacks. To mitigate this, we suggest hardcoding + preferences in the client, to prevent the client from choosing + to use a new hash algorithm or cipher that no one else is using + yet. While offering a preference might be useful in case a client + with an older version of Tor wants to start using the newer hash + algorithm or cipher that everyone else is using, if the client + cares enough, he or she can just upgrade Tor. + + We may also have to worry about nodes which, through laziness or + maliciousness, refuse to start supporting new hash algorithms or + ciphers. This must be balanced with the need to maintain + backward compatibility so the client will have a large selection + of nodes to pick from. Adding new hash algorithms and ciphers + long before we suggest nodes start using them can help mitigate + this. However, eventually, once sufficient nodes support new + standards, client side support for older standards should be + disabled, particularly if there are practical rather than merely + theoretical attacks. + + Server side support for older standards can be kept much longer + than client side support, since clients using older hashes and + ciphers are really only hurting themselves. + + If server side support for a hash algorithm or cipher is added + but never preferred before we decide we don't really want it, + support can be removed without having to worry about backward + compatibility. + +Security implications: + Improving cryptography will improve Tor's security. However, if + clients pick different cryptographic standards, they could be + partitioned based on their cryptographic preferences. We also + need to worry about nodes refusing to support new standards. + These issues are detailed above. + +Specification: + + Todo. Need better understanding of how Tor currently works or + help from someone who does. 
+ +Compatibility: + + This idea is intended to allow easier upgrading of cryptographic + hash algorithms and ciphers while maintaining backwards + compatibility. However, at some point, backwards compatibility + with very old hashes and ciphers should be dropped for security + reasons. + +Implementation: + + Todo. + +Performance and scalability notes: + + Better hashes and ciphers are sometimes a little more CPU intensive + than weaker ones. For instance, on most computers AES is a little + faster than Twofish. However, in that example, I consider Twofish's + additional security worth the tradeoff. + +Acknowledgements: + + Discussed this on IRC with a few people, mostly Nick Mathewson. + Nick was particularly helpful in explaining how Tor works, + explaining goals, and providing various links to Tor + specifications. diff --git a/doc/spec/proposals/ideas/xxx-encrypted-services.txt b/doc/spec/proposals/ideas/xxx-encrypted-services.txt new file mode 100644 index 0000000000..3414f3c4fb --- /dev/null +++ b/doc/spec/proposals/ideas/xxx-encrypted-services.txt @@ -0,0 +1,18 @@ + +the basic idea might be to generate a keypair, and sign little statements +like "this key corresponds to this relay id", and publish them on karsten's +hs dht. + +so if you want to talk to it, you look it up, then go to that exit. +and by 'go to' i mean 'build a tor circuit like normal except you're sure +where to exit' + +connecting to it is slower than usual, but once you're connected, it's no +slower than normal tor. +and you get what wikileaks wants from its hidden service, which is really +just the UI piece. +indymedia also wants this. + +might be interesting to let an encrypted service list more than one relay, +too. 
+ diff --git a/doc/spec/proposals/ideas/xxx-hide-platform.txt b/doc/spec/proposals/ideas/xxx-hide-platform.txt index 3fed5cfbd4..ad19fb1fd4 100644 --- a/doc/spec/proposals/ideas/xxx-hide-platform.txt +++ b/doc/spec/proposals/ideas/xxx-hide-platform.txt @@ -1,7 +1,5 @@ Filename: xxx-hide-platform.txt Title: Hide Tor Platform Information -Version: $Revision$ -Last-Modified: $Date$ Author: Jacob Appelbaum Created: 24-July-2008 Status: Draft diff --git a/doc/spec/proposals/ideas/xxx-port-knocking.txt b/doc/spec/proposals/ideas/xxx-port-knocking.txt index 9fbcdf3545..85c27ec52d 100644 --- a/doc/spec/proposals/ideas/xxx-port-knocking.txt +++ b/doc/spec/proposals/ideas/xxx-port-knocking.txt @@ -1,7 +1,5 @@ Filename: xxx-port-knocking.txt Title: Port knocking for bridge scanning resistance -Version: $Revision$ -Last-Modified: $Date$ Author: Jacob Appelbaum Created: 19-April-2009 Status: Draft diff --git a/doc/spec/proposals/ideas/xxx-separate-streams-by-port.txt b/doc/spec/proposals/ideas/xxx-separate-streams-by-port.txt index cebde65a9b..f26c1e580f 100644 --- a/doc/spec/proposals/ideas/xxx-separate-streams-by-port.txt +++ b/doc/spec/proposals/ideas/xxx-separate-streams-by-port.txt @@ -1,7 +1,5 @@ Filename: xxx-separate-streams-by-port.txt Title: Separate streams across circuits by destination port -Version: $Revision$ -Last-Modified: $Date$ Author: Robert Hogan Created: 21-Oct-2008 Status: Draft diff --git a/doc/spec/proposals/ideas/xxx-what-uses-sha1.txt b/doc/spec/proposals/ideas/xxx-what-uses-sha1.txt index 9b6e20c586..b3ca3eea5a 100644 --- a/doc/spec/proposals/ideas/xxx-what-uses-sha1.txt +++ b/doc/spec/proposals/ideas/xxx-what-uses-sha1.txt @@ -1,8 +1,6 @@ Filename: xxx-what-uses-sha1.txt Title: Where does Tor use SHA-1 today? -Version: $Revision$ -Last-Modified: $Date$ -Author: Nick Mathewson +Authors: Nick Mathewson, Marian Created: 30-Dec-2008 Status: Meta @@ -15,9 +13,15 @@ Introduction: too long. 
According to smart crypto people, the SHA-2 functions (SHA-256, etc) - share too much of SHA-1's structure to be very good. Some people - like other hash functions; most of these have not seen enough - analysis to be widely regarded as an extra-good idea. + share too much of SHA-1's structure to be very good. RIPEMD-160 is + also based on flawed past hashes. Some people think other hash + functions (e.g. Whirlpool and Tiger) are not as bad; most of these + have not seen enough analysis to be used yet. + + Here is a 2006 paper about hash algorithms. + http://www.sane.nl/sane2006/program/final-papers/R10.pdf + + (Todo: Ask smart crypto people.) By 2012, the NIST SHA-3 competition will be done, and with luck we'll have something good to switch too. But it's probably a bad idea to @@ -54,50 +58,138 @@ Why now? one look silly. +Triage + + How severe are these problems? Let's divide them into these + categories, where H(x) is the SHA-1 hash of x: + PREIMAGE -- find any x such that a H(x) has a chosen value + -- A SHA-1 usage that only depends on preimage + resistance + * Also SECOND PREIMAGE. Given x, find a y not equal to + x such that H(x) = H(y) + COLLISION<role> -- A SHA-1 usage that depends on collision + resistance, but the only party who could mount a + collision-based attack is already in a trusted role + (like a distribution signer or a directory authority). + COLLISION -- find any x and y such that H(x) = H(y) -- A + SHA-1 usage that depends on collision resistance + and doesn't need the attacker to have any special keys. + + There is no need to put much effort into fixing PREIMAGE and SECOND + PREIMAGE usages in the near-term: while there have been some + theoretical results doing these attacks against SHA-1, they don't + seem to be close to practical yet. To fix COLLISION<code-signing> + usages is not too important either, since anyone who has the key to + sign the code can mount far worse attacks. 
It would be good to fix + COLLISION<authority> usages, since we try to resist bad authorities + to a limited extent. The COLLISION usages are the most important + to fix. + + Kelsey and Schneier published a theoretical second preimage attack + against SHA-1 in 2005, so it would be a good idea to fix PREIMAGE + and SECOND PREIMAGE usages after fixing COLLISION usages or where fixes + require minimal effort. + + http://www.schneier.com/paper-preimages.html + + Additionally, we need to consider the impact of a successful attack + in each of these cases. SHA-1 collisions are still expensive even + if recent results are verified, and anybody with the resources to + compute one also has the resources to mount a decent Sybil attack. + + Let's be pessimistic, and not assume that producing collisions of + a given format is actually any harder than producing collisions at + all. + What Tor uses hashes for today: 1. Infrastructure. A. Our X.509 certificates are signed with SHA-1. + COLLISION B. TLS uses SHA-1 (and MD5) internally to generate keys. + PREIMAGE? + * At least breaking SHA-1 and MD5 simultaneously is + much more difficult than breaking either + independently. C. Some of the TLS ciphersuites we allow use SHA-1. + PREIMAGE? D. When we sign our code with GPG, it might be using SHA-1. + COLLISION<code-signing> + * GPG 1.4 and up have writing support for SHA-2 hashes. + This blog has help for converting: + http://www.schwer.us/journal/2005/02/19/sha-1-broken-and-gnupg-gpg/ E. Our GPG keys might be authenticated with SHA-1. + COLLISION<code-signing-key-signing> F. OpenSSL's random number generator uses SHA-1, I believe. + PREIMAGE 2. The Tor protocol A. Everything we sign, we sign using SHA-1-based OAEP-MGF1. + PREIMAGE? B. Our CREATE cell format uses SHA-1 for: OAEP padding. + PREIMAGE? C. Our EXTEND cells use SHA-1 to hash the identity key of the target server. + COLLISION D. Our CREATED cells use SHA-1 to hash the derived key data. + ?? E. 
The data we use in CREATE_FAST cells to generate a key is the length of a SHA-1. + NONE F. The data we send back in a CREATED/CREATED_FAST cell is the length of a SHA-1. - G. We use SHA-1 to derive our circuit keys from the negotiated g^xy value. + NONE + G. We use SHA-1 to derive our circuit keys from the negotiated g^xy + value. + NONE H. We use SHA-1 to derive the digest field of each RELAY cell, but that's used more as a checksum than as a strong digest. + NONE 3. Directory services + [All are COLLISION or COLLISION<authority> ] + A. All signatures are generated on the SHA-1 of their corresponding documents, using PKCS1 padding. + * In dir-spec.txt, section 1.3, it states, + "SIGNATURE" Object contains a signature (using the signing key) + of the PKCS1-padded digest of the entire document, taken from + the beginning of the Initial item, through the newline after + the Signature Item's keyword and its arguments." + So our attacker, Malcom, could generate a collision for the hash + that is signed. Thus, a second pre-image attack is possible. + Vulnerable to regular collision attack only if key is stolen. + If the key is stolen, Malcom could distribute two different + copies of the document which have the same hash. Maybe useful + for a partitioning attack? B. Router descriptors identify their corresponding extra-info documents by their SHA-1 digest. + * A third party might use a second pre-image attack to generate a + false extra-info document that has the same hash. The router + itself might use a regular collision attack to generate multiple + extra-info documents with the same hash, which might be useful + for a partitioning attack. C. Fingerprints in router descriptors are taken using SHA-1. - D. Fingerprints in authority certs are taken using SHA-1. - E. Fingerprints in dir-source lines of votes and consensuses are taken + * The fingerprint must match the public key. 
Not sure what would + happen if two routers had different public keys but the same + fingerprint. There could perhaps be unpredictable behaviour. + D. In router descriptors, routers in the same "Family" may be listed + by server nicknames or hexdigests. + * Does not seem critical. + E. Fingerprints in authority certs are taken using SHA-1. + F. Fingerprints in dir-source lines of votes and consensuses are taken using SHA-1. - F. Networkstatuses refer to routers identity keys and descriptors by their + G. Networkstatuses refer to routers identity keys and descriptors by their SHA-1 digests. - G. Directory-signature lines identify which key is doing the signing by + H. Directory-signature lines identify which key is doing the signing by the SHA-1 digests of the authority's signing key and its identity key. - H. The following items are downloaded by the SHA-1 of their contents: + I. The following items are downloaded by the SHA-1 of their contents: XXXX list them - I. The following items are downloaded by the SHA-1 of an identity key: + J. The following items are downloaded by the SHA-1 of an identity key: XXXX list them too. 4. The rendezvous protocol @@ -107,6 +199,12 @@ What Tor uses hashes for today: establishment requests. B. Hidden servers use SHA-1 in multiple places when generating hidden service descriptors. + * The permanent-id is the first 80 bits of the SHA-1 hash of the + public key + ** time-period performs calculations using the permanent-id + * The secret-id-part is the SHA-1 hash of the time period, the + descriptor-cookie, and replica. + * Hash of introduction point's identity key. C. Hidden servers performing basic-type client authorization for their services use SHA-1 when encrypting introduction points contained in hidden service descriptors. @@ -115,26 +213,35 @@ What Tor uses hashes for today: identifier or not. E. Hidden servers use SHA-1 to derive .onion addresses of their services. 
+ * What's worse, it only uses the first 80 bits of the SHA-1 hash. + However, the rend-spec.txt says we aren't worried about arbitrary + collisions? F. Clients use SHA-1 to generate the current hidden service descriptor identifiers for a given .onion address. G. Hidden servers use SHA-1 to remember digests of the first parts of Diffie-Hellman handshakes contained in introduction requests in order - to detect replays. + to detect replays. See the RELAY_ESTABLISH_INTRO cell. We seem to be + taking a hash of a hash here. H. Hidden servers use SHA-1 during the Diffie-Hellman key exchange with a connecting client. 5. The bridge protocol XXXX write me + + A. Client may attempt to query for bridges where he knows a digest + (probably SHA-1) before a direct query. 6. The Tor user interface A. We log information about servers based on SHA-1 hashes of their identity keys. + COLLISION B. The controller identifies servers based on SHA-1 hashes of their identity keys. + COLLISION C. Nearly all of our configuration options that list servers allow SHA-1 hashes of their identity keys. + COLLISION E. The deprecated .exit notation uses SHA-1 hashes of identity keys - - + COLLISION diff --git a/doc/spec/proposals/reindex.py b/doc/spec/proposals/reindex.py index 2b4c02516b..980bc0659f 100755 --- a/doc/spec/proposals/reindex.py +++ b/doc/spec/proposals/reindex.py @@ -4,7 +4,7 @@ import re, os class Error(Exception): pass STATUSES = """DRAFT NEEDS-REVISION NEEDS-RESEARCH OPEN ACCEPTED META FINISHED - CLOSED SUPERSEDED DEAD""".split() + CLOSED SUPERSEDED DEAD REJECTED""".split() REQUIRED_FIELDS = [ "Filename", "Status", "Title" ] CONDITIONAL_FIELDS = { "OPEN" : [ "Target" ], "ACCEPTED" : [ "Target "], diff --git a/doc/spec/rend-spec.txt b/doc/spec/rend-spec.txt index e3fbe2253b..f030092679 100644 --- a/doc/spec/rend-spec.txt +++ b/doc/spec/rend-spec.txt @@ -1,4 +1,3 @@ -$Id$ Tor Rendezvous Specification @@ -145,33 +144,10 @@ $Id$ 1.2. Bob's OP generates service descriptors. 
The first time the OP provides an advertised service, it generates - a public/private keypair (stored locally). Periodically, the OP - generates and publishes a descriptor of type "V0". + a public/private keypair (stored locally). - The "V0" descriptor contains: - - KL Key length [2 octets] - PK Bob's public key [KL octets] - TS A timestamp [4 octets] - NI Number of introduction points [2 octets] - Ipt A list of NUL-terminated ORs [variable] - SIG Signature of above fields [variable] - - KL is the length of PK, in octets. - TS is the number of seconds elapsed since Jan 1, 1970. - - The members of Ipt may be either (a) nicknames, or (b) identity key - digests, encoded in hex, and prefixed with a '$'. Clients must - accept both forms. Services must only generate the second form. - Once 0.0.9.x is obsoleted, we can drop the first form. - - [It's ok for Bob to advertise 0 introduction points. He might want - to do that if he previously advertised some introduction points, - and now he doesn't have any. -RD] - - Beginning with 0.2.0.10-alpha, Bob's OP encodes "V2" descriptors in - addition to "V0" descriptors. The format of a "V2" descriptor is as - follows: + Beginning with 0.2.0.10-alpha, Bob's OP encodes "V2" descriptors. The + format of a "V2" descriptor is as follows: "rendezvous-service-descriptor" descriptor-id NL @@ -340,6 +316,10 @@ $Id$ (This ends the fields in the encrypted portion of the descriptor.) + [It's ok for Bob to advertise 0 introduction points. He might want + to do that if he previously advertised some introduction points, + and now he doesn't have any. -RD] + "signature" NL signature-string [At end, exactly once] @@ -349,6 +329,21 @@ $Id$ 1.2.1. Other descriptor formats we don't use. 
+ Support for the V0 descriptor format was dropped in 0.2.2.0-alpha-dev: + + KL Key length [2 octets] + PK Bob's public key [KL octets] + TS A timestamp [4 octets] + NI Number of introduction points [2 octets] + Ipt A list of NUL-terminated ORs [variable] + SIG Signature of above fields [variable] + + KL is the length of PK, in octets. + TS is the number of seconds elapsed since Jan 1, 1970. + + The members of Ipt may be either (a) nicknames, or (b) identity key + digests, encoded in hex, and prefixed with a '$'. + The V1 descriptor format was understood and accepted from 0.1.1.5-alpha-cvs to 0.2.0.6-alpha-dev, but no Tors generated it and it was removed: @@ -409,7 +404,7 @@ $Id$ RELAY_ESTABLISH_INTRO cell, containing: KL Key length [2 octets] - PK Bob's public key [KL octets] + PK Introduction public key [KL octets] HS Hash of session info [20 octets] SIG Signature of above information [variable] @@ -431,16 +426,13 @@ $Id$ currently associated with PK. On success, the OR sends Bob a RELAY_INTRO_ESTABLISHED cell with an empty payload. - If a hidden service is configured to publish only v2 hidden service - descriptors, Bob's OP does not include its own public key in the - RELAY_ESTABLISH_INTRO cell, but the public key of a freshly generated - key pair. The OP also includes these fresh public keys in the v2 hidden - service descriptor together with the other introduction point - information. The reason is that the introduction point does not need to - and therefore should not know for which hidden service it works, so as - to prevent it from tracking the hidden service's activity. If the hidden - service is configured to publish both, v0 and v2 descriptors, two - separate sets of introduction points are established. + Bob's OP does not include its own public key in the RELAY_ESTABLISH_INTRO + cell, but the public key of a freshly generated introduction key pair. 
+   The OP also includes these fresh public keys in the v2 hidden service
+   descriptor together with the other introduction point information. The
+   reason is that the introduction point does not need to and therefore
+   should not know for which hidden service it works, so as to prevent it
+   from tracking the hidden service's activity.
 
 1.4. Bob's OP advertises his service descriptor(s).
 
@@ -464,10 +456,8 @@ $Id$
    after its timestamp. At least every 18 hours, Bob's OP uploads a fresh
    descriptor.
 
-   If Bob's OP is configured to publish v2 descriptors instead of or in
-   addition to v0 descriptors, it does so to a changing subset of all v2
-   hidden service directories instead of the authoritative directory
-   servers. Therefore, Bob's OP opens a stream via Tor to each
+   Bob's OP publishes v2 descriptors to a changing subset of all v2 hidden
+   service directories. Therefore, Bob's OP opens a stream via Tor to each
    responsible hidden service directory. (He may re-use old circuits for
    this.) Over this stream, Bob's OP makes an HTTP 'POST' request to a URL
    "/tor/rendezvous2/publish" relative to the hidden service
@@ -520,12 +510,21 @@ $Id$
 
 1.6. Alice's OP retrieves a service descriptor.
 
-   Alice opens a stream to a directory server via Tor, and makes an HTTP GET
-   request for the document '/tor/rendezvous/<z>', where '<z>' is replaced
-   with the encoding of Bob's public key as described above. (She may re-use
-   old circuits for this.) The directory replies with a 404 HTTP response if
-   it does not recognize <z>, and otherwise returns Bob's most recently
-   uploaded service descriptor.
+   Similarly to the description in section 1.4, Alice's OP fetches a v2
+   descriptor from a randomly chosen hidden service directory out of the
+   changing subset of 6 nodes. If the request is unsuccessful, Alice retries
+   the other remaining responsible hidden service directories in a random
+   order. Alice relies on Bob to care about a potential clock skew between
+   the two by possibly storing two sets of descriptors (see end of section
+   1.4).
+
+   Alice's OP opens a stream via Tor to the chosen v2 hidden service
+   directory. (She may re-use old circuits for this.) Over this stream,
+   Alice's OP makes an HTTP 'GET' request for the document
+   "/tor/rendezvous2/<z>", where z is replaced with the encoding of the
+   descriptor ID. The directory replies with a 404 HTTP response if it does
+   not recognize <z>, and otherwise returns Bob's most recently uploaded
+   service descriptor.
 
    If Alice's OP receives a 404 response, it tries the other directory
    servers, and only fails the lookup if none recognize the public key hash.
@@ -541,22 +540,6 @@ $Id$
    [Caching may make her partitionable, but she fetched it anonymously, and
    we can't very well *not* cache it. -RD]
 
-   Alice's OP fetches v2 descriptors in parallel to v0 descriptors. Similarly
-   to the description in section 1.4, the OP fetches a v2 descriptor from a
-   randomly chosen hidden service directory out of the changing subset of
-   6 nodes. If the request is unsuccessful, Alice retries the other
-   remaining responsible hidden service directories in a random order.
-   Alice relies on Bob to care about a potential clock skew between the two
-   by possibly storing two sets of descriptors (see end of section 1.4).
-
-   Alice's OP opens a stream via Tor to the chosen v2 hidden service
-   directory. (She may re-use old circuits for this.) Over this stream,
-   Alice's OP makes an HTTP 'GET' request for the document
-   "/tor/rendezvous2/<z>", where z is replaced with the encoding of the
-   descriptor ID. The directory replies with a 404 HTTP response if it does
-   not recognize <z>, and otherwise returns Bob's most recently uploaded
-   service descriptor.
-
 1.7. Alice's OP establishes a rendezvous point.
    When Alice requests a connection to a given location-hidden service,
diff --git a/doc/spec/socks-extensions.txt b/doc/spec/socks-extensions.txt
index 8d58987f35..62d86acd9f 100644
--- a/doc/spec/socks-extensions.txt
+++ b/doc/spec/socks-extensions.txt
@@ -1,4 +1,3 @@
-$Id$
 Tor's extensions to the SOCKS protocol
 
 1. Overview
diff --git a/doc/spec/tor-spec.txt b/doc/spec/tor-spec.txt
index a321aa8694..efa6029f22 100644
--- a/doc/spec/tor-spec.txt
+++ b/doc/spec/tor-spec.txt
@@ -1,4 +1,3 @@
-$Id$
 
 Tor Protocol Specification
 
diff --git a/doc/spec/version-spec.txt b/doc/spec/version-spec.txt
index 842271ae19..265717f409 100644
--- a/doc/spec/version-spec.txt
+++ b/doc/spec/version-spec.txt
@@ -1,4 +1,3 @@
-$Id$
 
 HOW TOR VERSION NUMBERS WORK
 
diff --git a/doc/tor.1.in b/doc/tor.1.in
index 1a72ebd09f..a0f8e8b0f6 100644
--- a/doc/tor.1.in
+++ b/doc/tor.1.in
@@ -1,4 +1,4 @@
-.TH TOR 1 "January 2009" "TOR"
+.TH TOR 1 "August 2009" "TOR"
 .SH NAME
 tor \- The second-generation onion router
 .SH SYNOPSIS
@@ -241,6 +241,13 @@ fetching early. Normal users should leave it off.
 (Default: 0)
 .LP
 .TP
+\fBFetchDirInfoExtraEarly \fR\fB0\fR|\fB1\fR\fP
+If set to 1, Tor will fetch directory information before other
+directory caches. It will attempt to download directory information closer to
+the start of the consensus period. Normal users should leave it off.
+(Default: 0)
+.LP
+.TP
 \fBFetchHidServDescriptors \fR\fB0\fR|\fB1\fR\fP
 If set to 0, Tor will never fetch any hidden service descriptors from
 the rendezvous directories. This option is only useful if you're using
@@ -292,6 +299,25 @@ HTTPS proxy authentication that Tor supports; feel free to submit a patch if
 you want it to support others.
 .LP
 .TP
+\fBSocks4Proxy\fR \fIhost\fR[:\fIport\fR]\fP
+Tor will make all OR connections through the SOCKS 4 proxy at host:port
+(or host:1080 if port is not specified).
+.LP
+.TP
+\fBSocks5Proxy\fR \fIhost\fR[:\fIport\fR]\fP
+Tor will make all OR connections through the SOCKS 5 proxy at host:port
+(or host:1080 if port is not specified).
+.LP
+.TP
+\fBSocks5ProxyUsername\fR \fIusername\fP
+.LP
+.TP
+\fBSocks5ProxyPassword\fR \fIpassword\fP
+If defined, authenticate to the SOCKS 5 server using username and password
+in accordance to RFC 1929. Both username and password must be between 1 and 255
+characters.
+.LP
+.TP
 \fBKeepalivePeriod \fR\fINUM\fP
 To keep firewalls from expiring connections, send a padding keepalive
 cell every NUM seconds on open connections that are in use. If the
@@ -350,8 +376,19 @@ On startup, setuid to this user and setgid to their primary group.
 .LP
 .TP
 \fBHardwareAccel \fR\fB0\fR|\fB1\fP
-If non-zero, try to use crypto hardware acceleration when
-available. This is untested and probably buggy. (Default: 0)
+If non-zero, try to use built-in (static) crypto hardware acceleration when
+available. (Default: 0)
+.LP
+.TP
+\fBAccelName \fR\fINAME\fP
+When using OpenSSL hardware crypto acceleration attempt to load the dynamic
+engine of this name. This must be used for any dynamic hardware engine. Names
+can be verified with the openssl engine command.
+.LP
+.TP
+\fBAccelDir \fR\fIDIR\fP
+Specify this option if using dynamic hardware acceleration and the engine
+implementation library resides somewhere other than the OpenSSL default.
 .LP
 .TP
 \fBAvoidDiskWrites \fR\fB0\fR|\fB1\fP
@@ -674,6 +711,13 @@ resolved. This helps trap accidental attempts to resolve URLs and so on.
 (Default: 0)
 .LP
 .TP
+\fBAllowDotExit \fR\fB0\fR|\fB1\fR\fP
+If enabled, we convert "www.google.com.foo.exit" addresses on the
+SocksPort/TransPort/NatdPort into "www.google.com" addresses that exit
+from the node "foo". Disabled by default since attacking websites and
+exit relays can use it to manipulate your path selection. (Default: 0)
+.LP
+.TP
 \fBFastFirstHopPK \fR\fB0\fR|\fB1\fR\fP
 When this option is disabled, Tor uses the public key step for the first
 hop of creating circuits. Skipping it is generally safe since we have
@@ -1031,6 +1075,36 @@ behalf of clients.
 .TP
 \fBGeoIPFile \fR\fIfilename\fP
 A filename containing GeoIP data, for use with BridgeRecordUsageByCountry.
+.LP
+.TP
+\fBCellStatistics \fR\fB0\fR|\fB1\fR\fP
+When this option is enabled, Tor writes statistics on the mean time that
+cells spend in circuit queues to disk every 24 hours. Cannot be changed
+while Tor is running. (Default: 0)
+.LP
+.TP
+\fBDirReqStatistics \fR\fB0\fR|\fB1\fR\fP
+When this option is enabled, Tor writes statistics on the number and
+response time of network status requests to disk every 24 hours. Cannot be
+changed while Tor is running. (Default: 0)
+.LP
+.TP
+\fBEntryStatistics \fR\fB0\fR|\fB1\fR\fP
+When this option is enabled, Tor writes statistics on the number of
+directly connecting clients to disk every 24 hours. Cannot be changed
+while Tor is running. (Default: 0)
+.LP
+.TP
+\fBExitPortStatistics \fR\fB0\fR|\fB1\fR\fP
+When this option is enabled, Tor writes statistics on the number of
+relayed bytes and opened stream per exit port to disk every 24 hours.
+Cannot be changed while Tor is running. (Default: 0)
+.LP
+.TP
+\fBExtraInfoStatistics \fR\fB0\fR|\fB1\fR\fP
+When this option is enabled, Tor includes previously gathered statistics
+in its extra-info documents that it uploads to the directory authorities.
+(Default: 0)
 
 .SH DIRECTORY SERVER OPTIONS
 .PP
@@ -1295,7 +1369,7 @@ if you're using a Tor controller that handles hidserv publishing for you.
 .TP
 \fBHiddenServiceVersion \fR\fIversion\fR,\fIversion\fR,\fI...\fP
 A list of rendezvous service descriptor versions to publish for the hidden
-service. Possible version numbers are 0 and 2. (Default: 0, 2)
+service. Currently, only version 2 is supported. (Default: 2)
 .LP
 .TP
 \fBHiddenServiceAuthorizeClient \fR\fIauth-type\fR \fR\fIclient-name\fR,\fIclient-name\fR,\fI...\fP