aboutsummaryrefslogtreecommitdiff
path: root/doc/spec
diff options
context:
space:
mode:
Diffstat (limited to 'doc/spec')
-rw-r--r--doc/spec/Makefile.am2
-rw-r--r--doc/spec/address-spec.txt18
-rw-r--r--doc/spec/bridges-spec.txt1
-rw-r--r--doc/spec/control-spec-v0.txt1
-rw-r--r--doc/spec/control-spec.txt136
-rw-r--r--doc/spec/dir-spec-v1.txt1
-rw-r--r--doc/spec/dir-spec-v2.txt1
-rw-r--r--doc/spec/dir-spec.txt502
-rw-r--r--doc/spec/path-spec.txt70
-rw-r--r--doc/spec/proposals/000-index.txt33
-rw-r--r--doc/spec/proposals/001-process.txt10
-rw-r--r--doc/spec/proposals/098-todo.txt2
-rw-r--r--doc/spec/proposals/099-misc.txt2
-rw-r--r--doc/spec/proposals/100-tor-spec-udp.txt2
-rw-r--r--doc/spec/proposals/101-dir-voting.txt2
-rw-r--r--doc/spec/proposals/102-drop-opt.txt2
-rw-r--r--doc/spec/proposals/103-multilevel-keys.txt2
-rw-r--r--doc/spec/proposals/104-short-descriptors.txt2
-rw-r--r--doc/spec/proposals/105-handshake-revision.txt2
-rw-r--r--doc/spec/proposals/106-less-tls-constraint.txt2
-rw-r--r--doc/spec/proposals/107-uptime-sanity-checking.txt2
-rw-r--r--doc/spec/proposals/108-mtbf-based-stability.txt2
-rw-r--r--doc/spec/proposals/109-no-sharing-ips.txt2
-rw-r--r--doc/spec/proposals/110-avoid-infinite-circuits.txt2
-rw-r--r--doc/spec/proposals/111-local-traffic-priority.txt2
-rw-r--r--doc/spec/proposals/112-bring-back-pathlencoinweight.txt2
-rw-r--r--doc/spec/proposals/113-fast-authority-interface.txt2
-rw-r--r--doc/spec/proposals/114-distributed-storage.txt2
-rw-r--r--doc/spec/proposals/115-two-hop-paths.txt2
-rw-r--r--doc/spec/proposals/116-two-hop-paths-from-guard.txt2
-rw-r--r--doc/spec/proposals/117-ipv6-exits.txt2
-rw-r--r--doc/spec/proposals/118-multiple-orports.txt2
-rw-r--r--doc/spec/proposals/119-controlport-auth.txt2
-rw-r--r--doc/spec/proposals/120-shutdown-descriptors.txt2
-rw-r--r--doc/spec/proposals/121-hidden-service-authentication.txt2
-rw-r--r--doc/spec/proposals/122-unnamed-flag.txt2
-rw-r--r--doc/spec/proposals/123-autonaming.txt2
-rw-r--r--doc/spec/proposals/124-tls-certificates.txt2
-rw-r--r--doc/spec/proposals/125-bridges.txt2
-rw-r--r--doc/spec/proposals/126-geoip-reporting.txt2
-rw-r--r--doc/spec/proposals/127-dirport-mirrors-downloads.txt2
-rw-r--r--doc/spec/proposals/128-bridge-families.txt2
-rw-r--r--doc/spec/proposals/129-reject-plaintext-ports.txt2
-rw-r--r--doc/spec/proposals/130-v2-conn-protocol.txt2
-rw-r--r--doc/spec/proposals/131-verify-tor-usage.txt2
-rw-r--r--doc/spec/proposals/132-browser-check-tor-service.txt2
-rw-r--r--doc/spec/proposals/134-robust-voting.txt22
-rw-r--r--doc/spec/proposals/135-private-tor-networks.txt2
-rw-r--r--doc/spec/proposals/137-bootstrap-phases.txt2
-rw-r--r--doc/spec/proposals/138-remove-down-routers-from-consensus.txt2
-rw-r--r--doc/spec/proposals/140-consensus-diffs.txt11
-rw-r--r--doc/spec/proposals/141-jit-sd-downloads.txt8
-rw-r--r--doc/spec/proposals/142-combine-intro-and-rend-points.txt2
-rw-r--r--doc/spec/proposals/143-distributed-storage-improvements.txt2
-rw-r--r--doc/spec/proposals/145-newguard-flag.txt2
-rw-r--r--doc/spec/proposals/146-long-term-stability.txt2
-rw-r--r--doc/spec/proposals/147-prevoting-opinions.txt2
-rw-r--r--doc/spec/proposals/148-uniform-client-end-reason.txt2
-rw-r--r--doc/spec/proposals/149-using-netinfo-data.txt6
-rw-r--r--doc/spec/proposals/150-exclude-exit-nodes.txt1
-rw-r--r--doc/spec/proposals/151-path-selection-improvements.txt160
-rw-r--r--doc/spec/proposals/152-single-hop-circuits.txt2
-rw-r--r--doc/spec/proposals/153-automatic-software-update-protocol.txt2
-rw-r--r--doc/spec/proposals/154-automatic-updates.txt2
-rw-r--r--doc/spec/proposals/155-four-hidden-service-improvements.txt2
-rw-r--r--doc/spec/proposals/156-tracking-blocked-ports.txt2
-rw-r--r--doc/spec/proposals/157-specific-cert-download.txt2
-rw-r--r--doc/spec/proposals/158-microdescriptors.txt209
-rw-r--r--doc/spec/proposals/159-exit-scanning.txt2
-rw-r--r--doc/spec/proposals/160-bandwidth-offset.txt105
-rw-r--r--doc/spec/proposals/161-computing-bandwidth-adjustments.txt174
-rw-r--r--doc/spec/proposals/162-consensus-flavors.txt188
-rw-r--r--doc/spec/proposals/163-detecting-clients.txt115
-rw-r--r--doc/spec/proposals/164-reporting-server-status.txt91
-rw-r--r--doc/spec/proposals/165-simple-robust-voting.txt133
-rw-r--r--doc/spec/proposals/166-statistics-extra-info-docs.txt391
-rw-r--r--doc/spec/proposals/167-params-in-consensus.txt47
-rw-r--r--doc/spec/proposals/168-reduce-circwindow.txt134
-rw-r--r--doc/spec/proposals/169-eliminating-renegotiation.txt404
-rw-r--r--doc/spec/proposals/170-user-path-config.txt95
-rw-r--r--doc/spec/proposals/ideas/xxx-bwrate-algs.txt106
-rw-r--r--doc/spec/proposals/ideas/xxx-choosing-crypto-in-tor-protocol.txt138
-rw-r--r--doc/spec/proposals/ideas/xxx-encrypted-services.txt18
-rw-r--r--doc/spec/proposals/ideas/xxx-hide-platform.txt2
-rw-r--r--doc/spec/proposals/ideas/xxx-port-knocking.txt2
-rw-r--r--doc/spec/proposals/ideas/xxx-separate-streams-by-port.txt2
-rw-r--r--doc/spec/proposals/ideas/xxx-what-uses-sha1.txt139
-rwxr-xr-xdoc/spec/proposals/reindex.py2
-rw-r--r--doc/spec/rend-spec.txt111
-rw-r--r--doc/spec/socks-extensions.txt1
-rw-r--r--doc/spec/tor-spec.txt21
-rw-r--r--doc/spec/version-spec.txt1
92 files changed, 3207 insertions, 507 deletions
diff --git a/doc/spec/Makefile.am b/doc/spec/Makefile.am
index 208901d9db..e2fef42e81 100644
--- a/doc/spec/Makefile.am
+++ b/doc/spec/Makefile.am
@@ -1,5 +1,5 @@
EXTRA_DIST = tor-spec.txt rend-spec.txt control-spec.txt \
dir-spec.txt socks-extensions.txt path-spec.txt \
- version-spec.txt address-spec.txt
+ version-spec.txt address-spec.txt bridges-spec.txt
diff --git a/doc/spec/address-spec.txt b/doc/spec/address-spec.txt
index 2a84d857e6..2e1aff2b8a 100644
--- a/doc/spec/address-spec.txt
+++ b/doc/spec/address-spec.txt
@@ -1,4 +1,3 @@
-$Id$
Special Hostnames in Tor
Nick Mathewson
@@ -34,10 +33,13 @@ $Id$
"www.google.com.foo.exit=64.233.161.99.foo.exit" to speed subsequent
lookups.
+ The .exit notation is disabled by default as of Tor 0.2.2.1-alpha, due
+ to potential application-level attacks.
+
EXAMPLES:
www.example.com.exampletornode.exit
- Connect to www.example.com from the node called "exampletornode."
+ Connect to www.example.com from the node called "exampletornode".
exampletornode.exit
@@ -54,15 +56,3 @@ $Id$
When Tor sees an address in this format, it tries to look up and connect to
the specified hidden service. See rend-spec.txt for full details.
-4. .noconnect
-
- SYNTAX: [string].noconnect
-
- When Tor sees an address in this format, it immediately closes the
- connection without attaching it to any circuit. This is useful for
- controllers that want to test whether a given application is indeed using
- the same instance of Tor that they're controlling.
-
-5. [XXX Is there a ".virtual" address that we expose too, or is that
-just intended to be internal? -RD]
-
diff --git a/doc/spec/bridges-spec.txt b/doc/spec/bridges-spec.txt
index 4a9b373c8e..647118815c 100644
--- a/doc/spec/bridges-spec.txt
+++ b/doc/spec/bridges-spec.txt
@@ -1,4 +1,3 @@
-$Id$
Tor bridges specification
diff --git a/doc/spec/control-spec-v0.txt b/doc/spec/control-spec-v0.txt
index faf75a64a4..3515d395a6 100644
--- a/doc/spec/control-spec-v0.txt
+++ b/doc/spec/control-spec-v0.txt
@@ -1,4 +1,3 @@
-$Id$
TC: A Tor control protocol (Version 0)
diff --git a/doc/spec/control-spec.txt b/doc/spec/control-spec.txt
index cf92e2b9e3..b60baba052 100644
--- a/doc/spec/control-spec.txt
+++ b/doc/spec/control-spec.txt
@@ -1,4 +1,3 @@
-$Id$
TC: A Tor control protocol (Version 1)
@@ -88,6 +87,10 @@ $Id$
2.4. General-use tokens
+ ; CRLF means, "the ASCII Carriage Return character (decimal value value 13)
+ ; followed by the ASCII Linefeed character (decimal value 10)."
+ CRLF = CR LF
+
; Identifiers for servers.
ServerID = Nickname / Fingerprint
@@ -220,7 +223,7 @@ $Id$
"INFO" / "NOTICE" / "WARN" / "ERR" / "NEWDESC" / "ADDRMAP" /
"AUTHDIR_NEWDESCS" / "DESCCHANGED" / "STATUS_GENERAL" /
"STATUS_CLIENT" / "STATUS_SERVER" / "GUARD" / "NS" / "STREAM_BW" /
- "CLIENTS_SEEN"
+ "CLIENTS_SEEN" / "NEWCONSENSUS" / "BUILDTIMEOUT_SET"
Any events *not* listed in the SETEVENTS line are turned off; thus, sending
SETEVENTS with an empty body turns off all event reporting.
@@ -271,6 +274,9 @@ $Id$
returns "250 OK" if successful, or "551 Unable to write configuration
to disk" if it can't write the file or some other error occurs.
+ See also the "getinfo config-text" command, if the controller wants
+ to write the torrc file itself.
+
3.7. SIGNAL
Sent from the client to the server. The syntax is:
@@ -379,6 +385,10 @@ $Id$
"config-file" -- The location of Tor's configuration file ("torrc").
+ "config-text" -- The contents that Tor would write if you send it
+ a SAVECONF command, so the controller can write the file to
+ disk itself. [First implemented in 0.2.2.7-alpha.]
+
["exit-policy/prepend" -- The default exit policy lines that Tor will
*prepend* to the ExitPolicy config option.
-- Never implemented. Useful?]
@@ -503,7 +513,7 @@ $Id$
start and the rest of the interval respectively. The 'interval-start'
and 'interval-end' fields are the borders of the current interval; the
'interval-wake' field is the time within the current interval (if any)
- where we plan[ned] to start being active.
+ where we plan[ned] to start being active. The times are GMT.
"config/names"
A series of lines listing the available configuration options. Each is
@@ -564,14 +574,14 @@ $Id$
states. See Section 4.1.10 for explanations. (Only a few of the
status events are available as getinfo's currently. Let us know if
you want more exposed.)
- "status/reachability/or"
+ "status/reachability-succeeded/or"
0 or 1, depending on whether we've found our ORPort reachable.
- "status/reachability/dir"
+ "status/reachability-succeeded/dir"
0 or 1, depending on whether we've found our DirPort reachable.
- "status/reachability"
+ "status/reachability-succeeded"
"OR=" ("0"/"1") SP "DIR=" ("0"/"1")
- Combines status/reachability/*; controllers MUST ignore unrecognized
- elements in this entry.
+ Combines status/reachability-succeeded/*; controllers MUST ignore
+ unrecognized elements in this entry.
"status/bootstrap-phase"
Returns the most recent bootstrap phase status event
sent. Specifically, it returns a string starting with either
@@ -582,7 +592,7 @@ $Id$
List of currently recommended versions.
"status/version/current"
Status of the current version. One of: new, old, unrecommended,
- recommended, new in series, obsolete.
+ recommended, new in series, obsolete, unknown.
"status/clients-seen"
A summary of which countries we've seen clients from recently,
formatted the same as the CLIENTS_SEEN status event described in
@@ -600,15 +610,20 @@ $Id$
3.10. EXTENDCIRCUIT
Sent from the client to the server. The format is:
- "EXTENDCIRCUIT" SP CircuitID SP
- ServerSpec *("," ServerSpec)
- [SP "purpose=" Purpose] CRLF
+ "EXTENDCIRCUIT" SP CircuitID
+ [SP ServerSpec *("," ServerSpec)
+ SP "purpose=" Purpose] CRLF
This request takes one of two forms: either the CircuitID is zero, in
- which case it is a request for the server to build a new circuit according
- to the specified path, or the CircuitID is nonzero, in which case it is a
- request for the server to extend an existing circuit with that ID according
- to the specified path.
+ which case it is a request for the server to build a new circuit,
+ or the CircuitID is nonzero, in which case it is a request for the
+ server to extend an existing circuit with that ID according to the
+ specified path.
+
+ If the CircuitID is 0, the controller has the option of providing
+ a path for Tor to use to build the circuit. If it does not provide
+ a path, Tor will select one automatically from high capacity nodes
+ according to path-spec.txt.
If CircuitID is 0 and "purpose=" is specified, then the circuit's
purpose is set. Two choices are recognized: "general" and
@@ -775,9 +790,8 @@ $Id$
Same as passing 'EXTENDED' to SETEVENTS; this is the preferred way to
request the extended event syntax.
- This will not be always-enabled until at least two stable releases
- after 0.1.2.3-alpha, the release where it was first used for
- anything.
+ This feature was first used in 0.1.2.3-alpha. It is always-on in
+ Tor 0.2.2.1-alpha and later.
VERBOSE_NAMES
@@ -788,8 +802,9 @@ $Id$
LongName format includes a Fingerprint, an indication of Named status,
and a Nickname (if one is known).
- This will not be always-enabled until at least two stable releases
- after 0.1.2.2-alpha, the release where it was first available.
+ This will not be always-enabled until at least two stable
+ releases after 0.1.2.2-alpha, the release where it was first
+ available. It is always-on in Tor 0.2.2.1-alpha and later.
3.20. RESOLVE
@@ -1497,6 +1512,23 @@ $Id$
should just look at ACCEPTED_SERVER_DESCRIPTOR and should ignore
this event for now.}
+ SERVER_DESCRIPTOR_STATUS
+ "STATUS=" "LISTED" / "UNLISTED"
+ We just got a new networkstatus consensus, and whether we're in
+ it or not in it has changed. Specifically, status is "listed"
+ if we're listed in it but previous to this point we didn't know
+ we were listed in a consensus; and status is "unlisted" if we
+ thought we should have been listed in it (e.g. we were listed in
+ the last one), but we're not.
+
+ {Moving from listed to unlisted is not necessarily cause for
+ alarm. The relay might have failed a few reachability tests,
+ or the Internet might have had some routing problems. So this
+ feature is mainly to let relay operators know when their relay
+ has successfully been listed in the consensus.}
+
+ [Not implemented yet. We should do this in 0.2.2.x. -RD]
+
NAMESERVER_STATUS
"NS=addr"
"STATUS=" "UP" / "DOWN"
@@ -1581,17 +1613,21 @@ $Id$
4.1.13. Bandwidth used on an application stream
The syntax is:
- "650" SP "STREAM_BW" SP StreamID SP BytesRead SP BytesWritten CRLF
- BytesRead = 1*DIGIT
+ "650" SP "STREAM_BW" SP StreamID SP BytesWritten SP BytesRead CRLF
BytesWritten = 1*DIGIT
+ BytesRead = 1*DIGIT
+
+ BytesWritten and BytesRead are the number of bytes written and read
+ by the application since the last STREAM_BW event on this stream.
- BytesRead and BytesWritten are the number of bytes read and written since
- the last STREAM_BW event on this stream. These events are generated about
- once per second per stream; no events are generated for streams that have
- not read or written.
+ Note that from Tor's perspective, *reading* a byte on a stream means
+ that the application *wrote* the byte. That's why the order of "written"
+ vs "read" is opposite for stream_bw events compared to bw events.
- These events apply only to streams entering Tor (such as on a SOCKSPort,
- TransPort, or so on). They are not generated for exiting streams.
+ These events are generated about once per second per stream; no events
+ are generated for streams that have not written or read. These events
+ apply only to streams entering Tor (such as on a SOCKSPort, TransPort,
+ or so on). They are not generated for exiting streams.
4.1.14. Per-country client stats
@@ -1610,11 +1646,11 @@ $Id$
TimeStarted is a quoted string indicating when the reported summary
counts from (in GMT).
- The CountrySummary keyword has as its argument a comma-separated
- set of "countrycode=count" pairs. For example,
- 650-CLIENTS_SEEN TimeStarted="Thu Dec 25 23:50:43 EST 2008"
- 650 CountrySummary=us=16,de=8,uk=8
-[XXX Matt Edman informs me that the time format above is wrong. -RD]
+ The CountrySummary keyword has as its argument a comma-separated,
+ possibly empty set of "countrycode=count" pairs. For example (without
+ linebreak),
+ 650-CLIENTS_SEEN TimeStarted="2008-12-25 23:50:43"
+ CountrySummary=us=16,de=8,uk=8
4.1.15. New consensus networkstatus has arrived.
@@ -1629,6 +1665,38 @@ $Id$
[First added in 0.2.1.13-alpha]
+4.1.16. New circuit buildtime has been set.
+
+ The syntax is:
+ "650" SP "BUILDTIMEOUT_SET" SP Type SP "TOTAL_TIMES=" Total SP
+ "TIMEOUT_MS=" Timeout SP "XM=" Xm SP "ALPHA=" Alpha SP
+ "CUTOFF_QUANTILE=" Quantile CRLF
+ Type = "COMPUTED" / "RESET" / "SUSPENDED" / "DISCARD" / "RESUME"
+ Total = Integer count of timeouts stored
+ Timeout = Integer timeout in milliseconds
+ Xm = Estimated integer Pareto parameter Xm in milliseconds
+ Alpha = Estimated floating point Paredo paremter alpha
+ Quantile = Floating point CDF quantile cutoff point for this timeout
+
+ A new circuit build timeout time has been set. If Type is "COMPUTED",
+ Tor has computed the value based on historical data. If Type is "RESET",
+ initialization or drastic network changes have caused Tor to reset
+ the timeout back to the default, to relearn again. If Type is
+ "SUSPENDED", Tor has detected a loss of network connectivity and has
+ temporarily changed the timeout value to the default until the network
+ recovers. If type is "DISCARD", Tor has decided to discard timeout
+ values that likely happened while the network was down. If type is
+ "RESUME", Tor has decided to resume timeout calculation.
+
+ The Total value is the count of circuit build times Tor used in
+ computing this value. It is capped internally at the maximum number
+ of build times Tor stores (NCIRCUITS_TO_OBSERVE).
+
+ The Timeout itself is provided in milliseconds. Internally, Tor rounds
+ this value to the nearest second before using it.
+
+ [First added in 0.2.2.7-alpha]
+
5. Implementation notes
5.1. Authentication
diff --git a/doc/spec/dir-spec-v1.txt b/doc/spec/dir-spec-v1.txt
index 286df664e2..a92fc7999a 100644
--- a/doc/spec/dir-spec-v1.txt
+++ b/doc/spec/dir-spec-v1.txt
@@ -1,4 +1,3 @@
-$Id$
Tor Protocol Specification
diff --git a/doc/spec/dir-spec-v2.txt b/doc/spec/dir-spec-v2.txt
index 4873c4a728..d1be27f3db 100644
--- a/doc/spec/dir-spec-v2.txt
+++ b/doc/spec/dir-spec-v2.txt
@@ -1,4 +1,3 @@
-$Id$
Tor directory protocol, version 2
diff --git a/doc/spec/dir-spec.txt b/doc/spec/dir-spec.txt
index 9a2a62bc46..b88e838f36 100644
--- a/doc/spec/dir-spec.txt
+++ b/doc/spec/dir-spec.txt
@@ -1,4 +1,3 @@
-$Id$
Tor directory protocol, version 3
@@ -11,7 +10,7 @@ $Id$
Caches and authorities must still support older versions of the
directory protocols, until the versions of Tor that require them are
- finally out of commission. See Section XXXX on backward compatibility.
+ finally out of commission.
This document merges and supersedes the following proposals:
@@ -183,7 +182,8 @@ $Id$
All directory information is uploaded and downloaded with HTTP.
[Authorities also generate and caches also cache documents produced and
- used by earlier versions of this protocol; see section XXX for notes.]
+ used by earlier versions of this protocol; see dir-spec-v1.txt and
+ dir-spec-v2.txt for notes on those versions.]
1.1. What's different from version 2?
@@ -592,9 +592,9 @@ $Id$
with unrecognized items; the protocols line should be preceded with
an "opt" until these Tors are obsolete.]
- "allow-single-hop-exits"
+ "allow-single-hop-exits" NL
- [At most one.]
+ [At most once.]
Present only if the router allows single-hop circuits to make exit
connections. Most Tor servers do not support this: this is
@@ -613,7 +613,7 @@ $Id$
Fingerprint is encoded in hex (using upper-case letters), with
no spaces.
- "published"
+ "published" YYYY-MM-DD HH:MM:SS NL
[Exactly once.]
@@ -628,8 +628,8 @@ $Id$
As documented in 2.1 above. See migration notes in section 2.2.1.
- "geoip-start" YYYY-MM-DD HH:MM:SS NL
- "geoip-client-origins" CC=N,CC=N,... NL
+ ("geoip-start" YYYY-MM-DD HH:MM:SS NL)
+ ("geoip-client-origins" CC=N,CC=N,... NL)
Only generated by bridge routers (see blocking.pdf), and only
when they have been configured with a geoip database.
@@ -642,6 +642,227 @@ $Id$
"geoip-start" is the time at which we began collecting geoip
statistics.
+ "geoip-start" and "geoip-client-origins" have been replaced by
+ "bridge-stats-end" and "bridge-stats-ips" in 0.2.2.4-alpha. The
+ reason is that the measurement interval with "geoip-stats" as
+ determined by subtracting "geoip-start" from "published" could
+ have had a variable length, whereas the measurement interval in
+ 0.2.2.4-alpha and later is set to be exactly 24 hours long. In
+ order to clearly distinguish the new measurement intervals from
+ the old ones, the new keywords have been introduced.
+
+ "bridge-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
+ [At most once.]
+
+ YYYY-MM-DD HH:MM:SS defines the end of the included measurement
+ interval of length NSEC seconds (86400 seconds by default).
+
+ A "bridge-stats-end" line, as well as any other "bridge-*" line,
+ is only added when the relay has been running as a bridge for at
+ least 24 hours.
+
+ "bridge-ips" CC=N,CC=N,... NL
+ [At most once.]
+
+ List of mappings from two-letter country codes to the number of
+ unique IP addresses that have connected from that country to the
+ bridge and which are no known relays, rounded up to the nearest
+ multiple of 8.
+
+ "dirreq-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
+ [At most once.]
+
+ YYYY-MM-DD HH:MM:SS defines the end of the included measurement
+ interval of length NSEC seconds (86400 seconds by default).
+
+ A "dirreq-stats-end" line, as well as any other "dirreq-*" line,
+ is only added when the relay has opened its Dir port and after 24
+ hours of measuring directory requests.
+
+ "dirreq-v2-ips" CC=N,CC=N,... NL
+ [At most once.]
+ "dirreq-v3-ips" CC=N,CC=N,... NL
+ [At most once.]
+
+ List of mappings from two-letter country codes to the number of
+ unique IP addresses that have connected from that country to
+ request a v2/v3 network status, rounded up to the nearest multiple
+ of 8. Only those IP addresses are counted that the directory can
+ answer with a 200 OK status code.
+
+ "dirreq-v2-reqs" CC=N,CC=N,... NL
+ [At most once.]
+ "dirreq-v3-reqs" CC=N,CC=N,... NL
+ [At most once.]
+
+ List of mappings from two-letter country codes to the number of
+ requests for v2/v3 network statuses from that country, rounded up
+ to the nearest multiple of 8. Only those requests are counted that
+ the directory can answer with a 200 OK status code.
+
+ "dirreq-v2-share" num% NL
+ [At most once.]
+ "dirreq-v3-share" num% NL
+ [At most once.]
+
+ The share of v2/v3 network status requests that the directory
+ expects to receive from clients based on its advertised bandwidth
+ compared to the overall network bandwidth capacity. Shares are
+ formatted in percent with two decimal places. Shares are
+ calculated as means over the whole 24-hour interval.
+
+ "dirreq-v2-resp" status=num,... NL
+ [At most once.]
+ "dirreq-v3-resp" status=nul,... NL
+ [At most once.]
+
+ List of mappings from response statuses to the number of requests
+ for v2/v3 network statuses that were answered with that response
+ status, rounded up to the nearest multiple of 4. Only response
+ statuses with at least 1 response are reported. New response
+ statuses can be added at any time. The current list of response
+ statuses is as follows:
+
+ "ok": a network status request is answered; this number
+ corresponds to the sum of all requests as reported in
+ "dirreq-v2-reqs" or "dirreq-v3-reqs", respectively, before
+ rounding up.
+ "not-enough-sigs: a version 3 network status is not signed by a
+ sufficient number of requested authorities.
+ "unavailable": a requested network status object is unavailable.
+ "not-found": a requested network status is not found.
+ "not-modified": a network status has not been modified since the
+ If-Modified-Since time that is included in the request.
+ "busy": the directory is busy.
+
+ "dirreq-v2-direct-dl" key=val,... NL
+ [At most once.]
+ "dirreq-v3-direct-dl" key=val,... NL
+ [At most once.]
+ "dirreq-v2-tunneled-dl" key=val,... NL
+ [At most once.]
+ "dirreq-v3-tunneled-dl" key=val,... NL
+ [At most once.]
+
+ List of statistics about possible failures in the download process
+ of v2/v3 network statuses. Requests are either "direct"
+ HTTP-encoded requests over the relay's directory port, or
+ "tunneled" requests using a BEGIN_DIR cell over the relay's OR
+ port. The list of possible statistics can change, and statistics
+ can be left out from reporting. The current list of statistics is
+ as follows:
+
+ Successful downloads and failures:
+
+ "complete": a client has finished the download successfully.
+ "timeout": a download did not finish within 10 minutes after
+ starting to send the response.
+ "running": a download is still running at the end of the
+ measurement period for less than 10 minutes after starting to
+ send the response.
+
+ Download times:
+
+ "min", "max": smallest and largest measured bandwidth in B/s.
+ "d[1-4,6-9]": 1st to 4th and 6th to 9th decile of measured
+ bandwidth in B/s. For a given decile i, i/10 of all downloads
+ had a smaller bandwidth than di, and (10-i)/10 of all downloads
+ had a larger bandwidth than di.
+ "q[1,3]": 1st and 3rd quartile of measured bandwidth in B/s. One
+ fourth of all downloads had a smaller bandwidth than q1, one
+ fourth of all downloads had a larger bandwidth than q3, and the
+ remaining half of all downloads had a bandwidth between q1 and
+ q3.
+ "md": median of measured bandwidth in B/s. Half of the downloads
+ had a smaller bandwidth than md, the other half had a larger
+ bandwidth than md.
+
+ "entry-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
+ [At most once.]
+
+ YYYY-MM-DD HH:MM:SS defines the end of the included measurement
+ interval of length NSEC seconds (86400 seconds by default).
+
+ An "entry-stats-end" line, as well as any other "entry-*"
+ line, is first added after the relay has been running for at least
+ 24 hours.
+
+ "entry-ips" CC=N,CC=N,... NL
+ [At most once.]
+
+ List of mappings from two-letter country codes to the number of
+ unique IP addresses that have connected from that country to the
+ relay and which are no known other relays, rounded up to the
+ nearest multiple of 8.
+
+ "cell-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
+ [At most once.]
+
+ YYYY-MM-DD HH:MM:SS defines the end of the included measurement
+ interval of length NSEC seconds (86400 seconds by default).
+
+ A "cell-stats-end" line, as well as any other "cell-*" line,
+ is first added after the relay has been running for at least 24
+ hours.
+
+ "cell-processed-cells" num,...,num NL
+ [At most once.]
+
+ Mean number of processed cells per circuit, subdivided into
+ deciles of circuits by the number of cells they have processed in
+ descending order from loudest to quietest circuits.
+
+ "cell-queued-cells" num,...,num NL
+ [At most once.]
+
+ Mean number of cells contained in queues by circuit decile. These
+ means are calculated by 1) determining the mean number of cells in
+ a single circuit between its creation and its termination and 2)
+ calculating the mean for all circuits in a given decile as
+ determined in "cell-processed-cells". Numbers have a precision of
+ two decimal places.
+
+ "cell-time-in-queue" num,...,num NL
+ [At most once.]
+
+ Mean time cells spend in circuit queues in milliseconds. Times are
+ calculated by 1) determining the mean time cells spend in the
+ queue of a single circuit and 2) calculating the mean for all
+ circuits in a given decile as determined in
+ "cell-processed-cells".
+
+ "cell-circuits-per-decile" num NL
+ [At most once.]
+
+ Mean number of circuits that are included in any of the deciles,
+ rounded up to the next integer.
+
+ "exit-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
+ [At most once.]
+
+ YYYY-MM-DD HH:MM:SS defines the end of the included measurement
+ interval of length NSEC seconds (86400 seconds by default).
+
+ An "exit-stats-end" line, as well as any other "exit-*" line, is
+ first added after the relay has been running for at least 24 hours
+ and only if the relay permits exiting (where exiting to a single
+ port and IP address is sufficient).
+
+ "exit-kibibytes-written" port=N,port=N,... NL
+ [At most once.]
+ "exit-kibibytes-read" port=N,port=N,... NL
+ [At most once.]
+
+ List of mappings from ports to the number of kibibytes that the
+ relay has written to or read from exit connections to that port,
+ rounded up to the next full kibibyte.
+
+ "exit-streams-opened" port=N,port=N,... NL
+ [At most once.]
+
+ List of mappings from ports to the number of opened exit streams
+ to that port, rounded up to the nearest multiple of 4.
+
"router-signature" NL Signature NL
[At end, exactly once.]
@@ -795,10 +1016,10 @@ $Id$
generate exactly the same consensus given the same set of votes.
The procedure for deciding when to generate vote and consensus status
- documents are described in section XXX below.
+ documents are described in section 1.4 on the voting timeline.
Status documents contain a preamble, an authority section, a list of
- router status entries, and one more footers signature, in that order.
+ router status entries, and one or more footer signature, in that order.
Unlike other formats described above, a SP in these documents must be a
single space character (hex 20).
@@ -905,6 +1126,34 @@ $Id$
enough votes were counted for the consensus for an authoritative
opinion to have been formed about their status.
+ "params" SP [Parameters] NL
+
+ [At most once]
+
+ Parameter ::= Keyword '=' Int32
+ Int32 ::= A decimal integer between -2147483648 and 2147483647.
+ Parameters ::= Parameter | Parameters SP Parameter
+
+ The parameters list, if present, contains a space-separated list of
+ key-value pairs, sorted in lexical order by their keyword. Each
+ parameter has its own meaning.
+
+ (Only included when the vote is generated with consensus-method 7 or
+ later.)
+
+ Commonly used "param" arguments at this point include:
+
+ "CircWindow" -- the default package window that circuits should
+ be established with. It started out at 1000 cells, but some
+ research indicates that a lower value would mean fewer cells in
+ transit in the network at any given time. Obeyed by Tor 0.2.1.20
+ and later.
+
+ "CircuitPriorityHalflifeMsec" -- the halflife parameter used when
+ weighting which circuit will send the next cell. Obeyed by Tor
+ 0.2.2.10-alpha and later. (Versions of Tor between 0.2.2.7-alpha
+ and 0.2.2.10-alpha recognized a "CircPriorityHalflifeMsec" parameter,
+ but mishandled it badly.)
The authority section of a vote contains the following items, followed
in turn by the authority's current key certificate:
@@ -1030,13 +1279,20 @@ $Id$
descriptors if they would cause "v" lines to be over 128 characters
long.
- "w" SP "Bandwidth=" INT NL
+ "w" SP "Bandwidth=" INT [SP "Measured=" INT] NL
[At most once.]
An estimate of the bandwidth of this server, in an arbitrary
unit (currently kilobytes per second). Used to weight router
- selection. Other weighting keywords may be added later.
+ selection.
+
+ Additionally, the Measured= keyword is present in votes by
+ participating bandwidth measurement authorities to indicate
+ a measured bandwidth currently produced by measuring stream
+ capacities.
+
+ Other weighting keywords may be added later.
Clients MUST ignore keywords they do not recognize.
"p" SP ("accept" / "reject") SP PortList NL
@@ -1051,8 +1307,57 @@ $Id$
or does not support (if 'reject') for exit to "most
addresses".
- The signature section contains the following item, which appears
- Exactly Once for a vote, and At Least Once for a consensus.
+ The footer section is delineated in all votes and consensuses supporting
+ consensus method 9 and above with the following:
+
+ "directory-footer" NL
+
+ It contains two subsections, a bandwidths-weights line and a
+ directory-signature.
+
+ The bandwidths-weights line appears At Most Once for a consensus. It does
+ not appear in votes.
+
+ "bandwidth-weights" SP
+ "Wbd=" INT SP "Wbe=" INT SP "Wbg=" INT SP "Wbm=" INT SP
+ "Wdb=" INT SP
+ "Web=" INT SP "Wed=" INT SP "Wee=" INT SP "Weg=" INT SP "Wem=" INT SP
+ "Wgb=" INT SP "Wgd=" INT SP "Wgg=" INT SP "Wgm=" INT SP
+ "Wmb=" INT SP "Wmd=" INT SP "Wme=" INT SP "Wmg=" INT SP "Wmm=" INT NL
+
+ These values represent the weights to apply to router bandwidths during
+ path selection. They are sorted in alphabetical order in the list. The
+ integer values are divided by BW_WEIGHT_SCALE=10000 or the consensus
+ param "bwweightscale". They are:
+
+ Wgg - Weight for Guard-flagged nodes in the guard position
+ Wgm - Weight for non-flagged nodes in the guard Position
+ Wgd - Weight for Guard+Exit-flagged nodes in the guard Position
+
+ Wmg - Weight for Guard-flagged nodes in the middle Position
+ Wmm - Weight for non-flagged nodes in the middle Position
+ Wme - Weight for Exit-flagged nodes in the middle Position
+ Wmd - Weight for Guard+Exit flagged nodes in the middle Position
+
+ Weg - Weight for Guard flagged nodes in the exit Position
+ Wem - Weight for non-flagged nodes in the exit Position
+ Wee - Weight for Exit-flagged nodes in the exit Position
+ Wed - Weight for Guard+Exit-flagged nodes in the exit Position
+
+ Wgb - Weight for BEGIN_DIR-supporting Guard-flagged nodes
+ Wmb - Weight for BEGIN_DIR-supporting non-flagged nodes
+ Web - Weight for BEGIN_DIR-supporting Exit-flagged nodes
+ Wdb - Weight for BEGIN_DIR-supporting Guard+Exit-flagged nodes
+
+ Wbg - Weight for Guard+Exit-flagged nodes for BEGIN_DIR requests
+ Wbm - Weight for Guard+Exit-flagged nodes for BEGIN_DIR requests
+ Wbe - Weight for Guard+Exit-flagged nodes for BEGIN_DIR requests
+ Wbd - Weight for Guard+Exit-flagged nodes for BEGIN_DIR requests
+
+ These values are calculated as specified in Section 3.4.3.
+
+ The signature contains the following item, which appears Exactly Once
+ for a vote, and At Least Once for a consensus.
"directory-signature" SP identity SP signing-key-digest NL Signature
@@ -1065,7 +1370,7 @@ $Id$
the signing authority, and "signing-key-digest" is the hex-encoded
digest of the current authority signing key of the signing authority.
-3.3. Deciding how to vote.
+3.3. Assigning flags in a vote
(This section describes how directory authorities choose which status
flags to apply to routers, as of Tor 0.2.0.0-alpha-dev. Later directory
@@ -1128,7 +1433,7 @@ $Id$
least one /8 address space.
"Fast" -- A router is 'Fast' if it is active, and its bandwidth is
- either in the top 7/8ths for known active routers or at least 100KB/s.
+ either in the top 7/8ths for known active routers or at least 20KB/s.
"Guard" -- A router is a possible 'Guard' if its Weighted Fractional
Uptime is at least the median for "familiar" active routers, and if
@@ -1179,6 +1484,13 @@ $Id$
rate limit from the router descriptor. It is given in kilobytes
per second, and capped at some arbitrary value (currently 10 MB/s).
+ The Measured= keyword on a "w" line vote is currently computed
+ by multiplying the previous published consensus bandwidth by the
+ ratio of the measured average node stream capacity to the network
+ average. If 3 or more authorities provide a Measured= keyword for
+ a router, the authorities produce a consensus containing a "w"
+ Bandwidth= keyword equal to the median of the Measured= votes.
+
The ports listed in a "p" line should be taken as those ports for
which the router's exit policy permits 'most' addresses, ignoring any
accept not for all addresses, ignoring all rejects for private
@@ -1199,6 +1511,10 @@ $Id$
Known-flags is the union of all flags known by any voter.
+ Entries are given on the "params" line for every keyword on which any
+ authority voted. The values given are the low-median of all votes on
+ that keyword.
+
"client-versions" and "server-versions" are sorted in ascending
order; A version is recommended in the consensus if it is recommended
by more than half of the voting authorities that included a
@@ -1261,6 +1577,14 @@ $Id$
one, breaking ties in favor of the lexicographically larger
vote.) The port list is encoded as specified in 3.4.2.
+ * If consensus-method 6 or later is in use and if 3 or more
+ authorities provide a Measured= keyword in their votes for
+ a router, the authorities produce a consensus containing a
+ Bandwidth= keyword equal to the median of the Measured= votes.
+
+ * If consensus-method 7 or later is in use, the params line is
+ included in the output.
+
The signatures at the end of a consensus document are sorted in
ascending order by identity digest.
@@ -1281,6 +1605,10 @@ $Id$
"3" -- Added legacy ID key support to aid in authority ID key rollovers
"4" -- No longer list routers that are not running in the consensus
"5" -- adds support for "w" and "p" lines.
+ "6" -- Prefers measured bandwidth values rather than advertised
+ "7" -- Provides keyword=integer pairs of consensus parameters
+ "8" -- Provides microdescriptor summaries
+ "9" -- Provides weights for selecting flagged routers in paths
Before generating a consensus, an authority must decide which consensus
method to use. To do this, it looks for the highest version number
@@ -1313,6 +1641,147 @@ $Id$
use an accept-style summary and list as much of the port list as is
possible within these 1000 bytes. [XXXX be more specific.]
+3.4.3. Computing Bandwidth Weights
+
+ Let weight_scale = 10000
+
+ Let G be the total bandwidth for Guard-flagged nodes.
+ Let M be the total bandwidth for non-flagged nodes.
+ Let E be the total bandwidth for Exit-flagged nodes.
+ Let D be the total bandwidth for Guard+Exit-flagged nodes.
+ Let T = G+M+E+D
+
+ Let Wgd be the weight for choosing a Guard+Exit for the guard position.
+ Let Wmd be the weight for choosing a Guard+Exit for the middle position.
+ Let Wed be the weight for choosing a Guard+Exit for the exit position.
+
+ Let Wme be the weight for choosing an Exit for the middle position.
+ Let Wmg be the weight for choosing a Guard for the middle position.
+
+ Let Wgg be the weight for choosing a Guard for the guard position.
+ Let Wee be the weight for choosing an Exit for the exit position.
+
+ Balanced network conditions then arise from solutions to the following
+ system of equations:
+
+ Wgg*G + Wgd*D == M + Wmd*D + Wme*E + Wmg*G (guard bw = middle bw)
+ Wgg*G + Wgd*D == Wee*E + Wed*D (guard bw = exit bw)
+ Wed*D + Wmd*D + Wgd*D == D (aka: Wed+Wmd+Wdg = 1)
+ Wmg*G + Wgg*G == G (aka: Wgg = 1-Wmg)
+ Wme*E + Wee*E == E (aka: Wee = 1-Wme)
+
+ We are short 2 constraints with the above set. The remaining constraints
+ come from examining different cases of network load.
+
+ Case 1: E >= T/3 && G >= T/3 (Neither Exit nor Guard Scarce)
+
+ In this case, the additional two constraints are: Wme*E == Wmd*D and
+ Wgd == 0, which maximizes Exit-flagged bandwidth in the middle position.
+
+ This leads to the solution:
+
+ Wgg = (weight_scale*(D+E+G+M))/(3*G)
+ Wmd = (weight_scale*(2*D + 2*E - G - M))/(6*D)
+ Wme = (weight_scale*(2*D + 2*E - G - M))/(6*E)
+ Wee = (weight_scale*(-2*D + 4*E + G + M))/(6*E)
+ Wmg = weight_scale - Wgg
+ Wed = weight_scale - Wmd
+ Wgd = 0
+
+ Case 2: E < T/3 && G < T/3 (Both are scarce)
+
+ Let R denote the more scarce class (Rare) between Guard vs Exit.
+ Let S denote the less scarce class.
+
+ Subcase a: R+D < S
+
+ In this subcase, we simply devote all of D bandwidth to the
+ scarce class.
+
+ Wgg = Wee = weight_scale
+ Wmg = Wme = Wmd = 0;
+ if E < G:
+ Wed = weight_scale
+ Wgd = 0
+ else:
+ Wed = 0
+ Wgd = weight_scale
+
+ Subcase b: R+D >= S
+
+ In this case, if M <= T/3, we have enough bandwidth to try to achieve
+ a balancing condition, and add the constraints Wgg == 1 and
+ Wme*E == Wmd*D:
+
+ Wgg = weight_scale
+ Wgd = (weight_scale*(D + E - 2*G + M))/(3*D) (T/3 >= G (Ok))
+ Wmd = (weight_scale*(D + E + G - 2*M))/(6*D) (T/3 >= M)
+ Wme = (weight_scale*(D + E + G - 2*M))/(6*E)
+ Wee = (weight_scale*(-D + 5*E - G + 2*M))/(6*E) (2E+M >= T/3)
+ Wmg = 0;
+ Wed = weight_scale - Wgd - Wmd
+
+ If M >= T/3, the above solution will not be valid (one of the weights
+ will be < 0 or > 1). In this case, we use:
+
+ Wgg = weight_scale
+ Wee = weight_scale
+ Wmg = Wme = Wmd = 0
+ Wgd = (weight_scale*(D+E-G))/(2*D)
+ Wed = weight_scale - Wgd
+
+ Case 3: One of E < T/3 or G < T/3
+
+ Let S be the scarce class (of E or G).
+
+ Subcase a: (S+D) < T/3:
+ if G=S:
+ Wgg = Wgd = weight_scale;
+ Wmd = Wed = Wmg = 0;
+ Wme = (weight_scale*(E-M))/(2*E);
+ Wee = weight_scale-Wme;
+ if E=S:
+ Wee = Wed = weight_scale;
+ Wmd = Wgd = Wmg = 0;
+ Wmg = (weight_scale*(G-M))/(2*G);
+ Wgg = weight_scale-Wmg;
+
+ Subcase b: (S+D) >= T/3
+ if G=S:
+ Add constraints Wmg = 0, Wme*E == Wmd*D to maximize exit bandwidth
+ in the middle position:
+ Wgd = (weight_scale*(D + E - 2*G + M))/(3*D);
+ Wmd = (weight_scale*(D + E + G - 2*M))/(6*D);
+ Wme = (weight_scale*(D + E + G - 2*M))/(6*E);
+ Wee = (weight_scale*(-D + 5*E - G + 2*M))/(6*E);
+ Wgg = weight_scale;
+ Wmg = 0;
+ Wed = weight_scale - Wgd - Wmd;
+ if E=S:
+ Add constraints Wgd = 0, Wme*E == Wmd*D:
+ Wgg = (weight_scale*(D + E + G + M))/(3*G);
+ Wmd = (weight_scale*(2*D + 2*E - G - M))/(6*D);
+ Wme = (weight_scale*(2*D + 2*E - G - M))/(6*E);
+ Wee = (weight_scale*(-2*D + 4*E + G + M))/(6*E);
+ Wgd = 0;
+ Wmg = weight_scale - Wgg;
+ Wed = weight_scale - Wmd;
+
+ To ensure consensus, all calculations are performed using integer math
+ with a fixed precision determined by the bwweightscale consensus
+ parameter (defaults at 10000).
+
+ For future balancing improvements, Tor clients support 11 additional weights
+ for directory requests and middle weighting. These weights are currently
+ set at weight_scale, with the exception of the following groups of
+ assignments:
+
+ Directory requests use middle weights:
+ Wbd=Wmd, Wbg=Wmg, Wbe=Wme, Wbm=Wmm
+
+ Handle bridges and strange exit policies:
+ Wgm=Wgg, Wem=Wee, Weg=Wed
+
3.5. Detached signatures
Assuming full connectivity, every authority should compute and sign the
@@ -1884,7 +2353,6 @@ $Id$
A. Consensus-negotiation timeline.
-
Period begins: this is the Published time.
Everybody sends votes
Reconciliation: everybody tries to fetch missing votes.
diff --git a/doc/spec/path-spec.txt b/doc/spec/path-spec.txt
index dceb21dad7..8a85718a08 100644
--- a/doc/spec/path-spec.txt
+++ b/doc/spec/path-spec.txt
@@ -1,4 +1,3 @@
-$Id$
Tor Path Specification
@@ -72,6 +71,24 @@ of their choices.
is unknown (usually its target IP), but we believe the path probably
supports the request according to the rules given below.
+1.1. A server's bandwidth
+
+ Old versions of Tor did not report bandwidths in network status
+ documents, so clients had to learn them from the routers' advertised
+ server descriptors.
+
+ For versions of Tor prior to 0.2.1.17-rc, everywhere below where we
+ refer to a server's "bandwidth", we mean its clipped advertised
+ bandwidth, computed by taking the smaller of the 'rate' and
+ 'observed' arguments to the "bandwidth" element in the server's
+ descriptor. If a router's advertised bandwidth is greater than
+ MAX_BELIEVABLE_BANDWIDTH (currently 10 MB/s), we clipped to that
+ value.
+
+ For more recent versions of Tor, we take the bandwidth value declared
+ in the consensus, and fall back to the clipped advertised bandwidth
+ only if the consensus does not have bandwidths listed.
+
2. Building circuits
2.1. When we build
@@ -175,26 +192,41 @@ of their choices.
below)
- XXXX Choosing the length
- For circuits that do not need to be "fast", when choosing among
- multiple candidates for a path element, we choose randomly.
+ For "fast" circuits, we only choose nodes with the Fast flag. For
+ non-"fast" circuits, all nodes are eligible.
+
+ For all circuits, we weight node selection according to router bandwidth.
+
+ We also weight the bandwidth of Exit and Guard flagged nodes depending on
+ the fraction of total bandwidth that they make up and depending upon the
+ position they are being selected for.
+
+ These weights are published in the consensus, and are computed as described
+ in Section 3.4.3 of dir-spec.txt. They are:
+
+ Wgg - Weight for Guard-flagged nodes in the guard position
+ Wgm - Weight for non-flagged nodes in the guard Position
+ Wgd - Weight for Guard+Exit-flagged nodes in the guard Position
+
+ Wmg - Weight for Guard-flagged nodes in the middle Position
+ Wmm - Weight for non-flagged nodes in the middle Position
+ Wme - Weight for Exit-flagged nodes in the middle Position
+ Wmd - Weight for Guard+Exit flagged nodes in the middle Position
- For "fast" circuits, we pick a given router as an exit with probability
- proportional to its advertised bandwidth [the smaller of the 'rate' and
- 'observed' arguments to the "bandwidth" element in its descriptor]. If a
- router's advertised bandwidth is greater than MAX_BELIEVABLE_BANDWIDTH
- (currently 10 MB/s), we clip to that value.
+ Weg - Weight for Guard flagged nodes in the exit Position
+ Wem - Weight for non-flagged nodes in the exit Position
+ Wee - Weight for Exit-flagged nodes in the exit Position
+ Wed - Weight for Guard+Exit-flagged nodes in the exit Position
- For non-exit positions on "fast" circuits, we pick routers as above, but
- we weight the clipped advertised bandwidth of Exit-flagged nodes depending
- on the fraction of bandwidth available from non-Exit nodes. Call the
- total clipped advertised bandwidth for Exit nodes under consideration E,
- and the total clipped advertised bandwidth for all nodes under
- consideration T. If E<T/3, we do not consider Exit-flagged nodes.
- Otherwise, we weight their bandwidth with the factor (E-T/3)/E. This
- ensures that bandwidth is evenly distributed over nodes in 3-hop paths.
+ Wgb - Weight for BEGIN_DIR-supporting Guard-flagged nodes
+ Wmb - Weight for BEGIN_DIR-supporting non-flagged nodes
+ Web - Weight for BEGIN_DIR-supporting Exit-flagged nodes
+ Wdb - Weight for BEGIN_DIR-supporting Guard+Exit-flagged nodes
- Similarly, guard nodes are weighted by the factor (G-T/3)/G, and not
- considered for non-guard positions if this value is less than 0.
+ Wbg - Weight for Guard+Exit-flagged nodes for BEGIN_DIR requests
+ Wbm - Weight for Guard+Exit-flagged nodes for BEGIN_DIR requests
+ Wbe - Weight for Guard+Exit-flagged nodes for BEGIN_DIR requests
+ Wbd - Weight for Guard+Exit-flagged nodes for BEGIN_DIR requests
Additionally, we may be building circuits with one or more requests in
mind. Each kind of request puts certain constraints on paths:
@@ -306,7 +338,7 @@ of their choices.
We use Guard nodes (also called "helper nodes" in the literature) to
prevent certain profiling attacks. Here's the risk: if we choose entry and
exit nodes at random, and an attacker controls C out of N servers
- (ignoring advertised bandwidth), then the
+ (ignoring bandwidth), then the
attacker will control the entry and exit node of any given circuit with
probability (C/N)^2. But as we make many different circuits over time,
then the probability that the attacker will see a sample of about (C/N)^2
diff --git a/doc/spec/proposals/000-index.txt b/doc/spec/proposals/000-index.txt
index d75157650d..62327a1e61 100644
--- a/doc/spec/proposals/000-index.txt
+++ b/doc/spec/proposals/000-index.txt
@@ -1,7 +1,5 @@
Filename: 000-index.txt
Title: Index of Tor Proposals
-Version: $Revision$
-Last-Modified: $Date$
Author: Nick Mathewson
Created: 26-Jan-2007
Status: Meta
@@ -56,7 +54,7 @@ Proposals by number:
131 Help users to verify they are using Tor [NEEDS-REVISION]
132 A Tor Web Service For Verifying Correct Browser Configuration [DRAFT]
133 Incorporate Unreachable ORs into the Tor Network [DRAFT]
-134 More robust consensus voting with diverse authority sets [ACCEPTED]
+134 More robust consensus voting with diverse authority sets [REJECTED]
135 Simplify Configuration of Private Tor Networks [CLOSED]
136 Mass authority migration with legacy keys [CLOSED]
137 Keep controllers informed as Tor bootstraps [CLOSED]
@@ -73,7 +71,7 @@ Proposals by number:
148 Stream end reasons from the client side should be uniform [CLOSED]
149 Using data from NETINFO cells [OPEN]
150 Exclude Exit Nodes from a circuit [CLOSED]
-151 Improving Tor Path Selection [DRAFT]
+151 Improving Tor Path Selection [FINISHED]
152 Optionally allow exit from single-hop circuits [CLOSED]
153 Automatic software update protocol [SUPERSEDED]
154 Automatic Software Update Protocol [SUPERSEDED]
@@ -82,6 +80,17 @@ Proposals by number:
157 Make certificate downloads specific [ACCEPTED]
158 Clients download consensus + microdescriptors [OPEN]
159 Exit Scanning [OPEN]
+160 Authorities vote for bandwidth offsets in consensus [FINISHED]
+161 Computing Bandwidth Adjustments [FINISHED]
+162 Publish the consensus in multiple flavors [OPEN]
+163 Detecting whether a connection comes from a client [OPEN]
+164 Reporting the status of server votes [OPEN]
+165 Easy migration for voting authority sets [OPEN]
+166 Including Network Statistics in Extra-Info Documents [ACCEPTED]
+167 Vote on network parameters in consensus [CLOSED]
+168 Reduce default circuit window [OPEN]
+169 Eliminate TLS renegotiation for the Tor connection handshake [DRAFT]
+170 Configuration options regarding circuit building [DRAFT]
Proposals by status:
@@ -92,7 +101,8 @@ Proposals by status:
133 Incorporate Unreachable ORs into the Tor Network
141 Download server descriptors on demand
144 Increase the diversity of circuits by detecting nodes belonging the same provider
- 151 Improving Tor Path Selection
+ 169 Eliminate TLS renegotiation for the Tor connection handshake [for 0.2.2]
+ 170 Configuration options regarding circuit building
NEEDS-REVISION:
131 Help users to verify they are using Tor
OPEN:
@@ -103,14 +113,19 @@ Proposals by status:
156 Tracking blocked ports on the client side [for 0.2.?]
158 Clients download consensus + microdescriptors
159 Exit Scanning
+ 162 Publish the consensus in multiple flavors [for 0.2.2]
+ 163 Detecting whether a connection comes from a client [for 0.2.2]
+ 164 Reporting the status of server votes [for 0.2.2]
+ 165 Easy migration for voting authority sets
+ 168 Reduce default circuit window [for 0.2.2]
ACCEPTED:
110 Avoiding infinite length circuits [for 0.2.1.x] [in 0.2.1.3-alpha]
117 IPv6 exits [for 0.2.1.x]
118 Advertising multiple ORPorts at once [for 0.2.1.x]
- 134 More robust consensus voting with diverse authority sets [for 0.2.2.x]
140 Provide diffs between consensuses [for 0.2.2.x]
147 Eliminate the need for v2 directories in generating v3 directories [for 0.2.1.x]
157 Make certificate downloads specific [for 0.2.1.x]
+ 166 Including Network Statistics in Extra-Info Documents [for 0.2.2]
META:
000 Index of Tor Proposals
001 The Tor Proposal Process
@@ -118,7 +133,10 @@ Proposals by status:
099 Miscellaneous proposals
FINISHED:
121 Hidden Service Authentication [in 0.2.1.x]
+ 151 Improving Tor Path Selection
155 Four Improvements of Hidden Service Performance [in 0.2.1.x]
+ 160 Authorities vote for bandwidth offsets in consensus [for 0.2.2.x]
+ 161 Computing Bandwidth Adjustments [for 0.2.2.x]
CLOSED:
101 Voting on the Tor Directory System [in 0.2.0.x]
102 Dropping "opt" from the directory format [in 0.2.0.x]
@@ -146,6 +164,7 @@ Proposals by status:
148 Stream end reasons from the client side should be uniform [in 0.2.1.9-alpha]
150 Exclude Exit Nodes from a circuit [in 0.2.1.3-alpha]
152 Optionally allow exit from single-hop circuits [in 0.2.1.6-alpha]
+ 167 Vote on network parameters in consensus [in 0.2.2]
SUPERSEDED:
112 Bring Back Pathlen Coin Weight
113 Simplifying directory authority administration
@@ -159,3 +178,5 @@ Proposals by status:
120 Shutdown descriptors when Tor servers stop
128 Families of private bridges
142 Combine Introduction and Rendezvous Points
+ REJECTED:
+ 134 More robust consensus voting with diverse authority sets
diff --git a/doc/spec/proposals/001-process.txt b/doc/spec/proposals/001-process.txt
index 3a767b5fa4..636ba2c2fa 100644
--- a/doc/spec/proposals/001-process.txt
+++ b/doc/spec/proposals/001-process.txt
@@ -1,7 +1,5 @@
Filename: 001-process.txt
Title: The Tor Proposal Process
-Version: $Revision$
-Last-Modified: $Date$
Author: Nick Mathewson
Created: 30-Jan-2007
Status: Meta
@@ -47,7 +45,7 @@ How to change the specs now:
Like an RFC, every proposal gets a number. Unlike RFCs, proposals can
change over time and keep the same number, until they are finally
accepted or rejected. The history for each proposal
- will be stored in the Tor Subversion repository.
+ will be stored in the Tor repository.
Once a proposal is in the repository, we should discuss and improve it
until we've reached consensus that it's a good idea, and that it's
@@ -82,9 +80,7 @@ How new proposals get added:
What should go in a proposal:
Every proposal should have a header containing these fields:
- Filename, Title, Version, Last-Modified, Author, Created, Status.
- The Version and Last-Modified fields should use the SVN Revision and Date
- tags respectively.
+ Filename, Title, Author, Created, Status.
These fields are optional but recommended:
Target, Implemented-In.
@@ -97,7 +93,7 @@ What should go in a proposal:
what the proposal's about, what it does, and about what state it's in.
After the Overview, the proposal becomes more free-form. Depending on its
- the length and complexity, the proposal can break into sections as
+ length and complexity, the proposal can break into sections as
appropriate, or follow a short discursive format. Every proposal should
contain at least the following information before it is "ACCEPTED",
though the information does not need to be in sections with these names.
diff --git a/doc/spec/proposals/098-todo.txt b/doc/spec/proposals/098-todo.txt
index e891ea890c..a0bbbeb568 100644
--- a/doc/spec/proposals/098-todo.txt
+++ b/doc/spec/proposals/098-todo.txt
@@ -1,7 +1,5 @@
Filename: 098-todo.txt
Title: Proposals that should be written
-Version: $Revision$
-Last-Modified: $Date$
Author: Nick Mathewson, Roger Dingledine
Created: 26-Jan-2007
Status: Meta
diff --git a/doc/spec/proposals/099-misc.txt b/doc/spec/proposals/099-misc.txt
index ba13ea2a71..a3621dd25f 100644
--- a/doc/spec/proposals/099-misc.txt
+++ b/doc/spec/proposals/099-misc.txt
@@ -1,7 +1,5 @@
Filename: 099-misc.txt
Title: Miscellaneous proposals
-Version: $Revision$
-Last-Modified: $Date$
Author: Various
Created: 26-Jan-2007
Status: Meta
diff --git a/doc/spec/proposals/100-tor-spec-udp.txt b/doc/spec/proposals/100-tor-spec-udp.txt
index 8224682ec8..7f062222c5 100644
--- a/doc/spec/proposals/100-tor-spec-udp.txt
+++ b/doc/spec/proposals/100-tor-spec-udp.txt
@@ -1,7 +1,5 @@
Filename: 100-tor-spec-udp.txt
Title: Tor Unreliable Datagram Extension Proposal
-Version: $Revision$
-Last-Modified: $Date$
Author: Marc Liberatore
Created: 23 Feb 2006
Status: Dead
diff --git a/doc/spec/proposals/101-dir-voting.txt b/doc/spec/proposals/101-dir-voting.txt
index be900a641e..634d3f1948 100644
--- a/doc/spec/proposals/101-dir-voting.txt
+++ b/doc/spec/proposals/101-dir-voting.txt
@@ -1,7 +1,5 @@
Filename: 101-dir-voting.txt
Title: Voting on the Tor Directory System
-Version: $Revision$
-Last-Modified: $Date$
Author: Nick Mathewson
Created: Nov 2006
Status: Closed
diff --git a/doc/spec/proposals/102-drop-opt.txt b/doc/spec/proposals/102-drop-opt.txt
index 8f6a38ae6c..490376bb53 100644
--- a/doc/spec/proposals/102-drop-opt.txt
+++ b/doc/spec/proposals/102-drop-opt.txt
@@ -1,7 +1,5 @@
Filename: 102-drop-opt.txt
Title: Dropping "opt" from the directory format
-Version: $Revision$
-Last-Modified: $Date$
Author: Nick Mathewson
Created: Jan 2007
Status: Closed
diff --git a/doc/spec/proposals/103-multilevel-keys.txt b/doc/spec/proposals/103-multilevel-keys.txt
index ef51e18047..c8a7a6677b 100644
--- a/doc/spec/proposals/103-multilevel-keys.txt
+++ b/doc/spec/proposals/103-multilevel-keys.txt
@@ -1,7 +1,5 @@
Filename: 103-multilevel-keys.txt
Title: Splitting identity key from regularly used signing key.
-Version: $Revision$
-Last-Modified: $Date$
Author: Nick Mathewson
Created: Jan 2007
Status: Closed
diff --git a/doc/spec/proposals/104-short-descriptors.txt b/doc/spec/proposals/104-short-descriptors.txt
index a1c42c8ff7..90e0764fe6 100644
--- a/doc/spec/proposals/104-short-descriptors.txt
+++ b/doc/spec/proposals/104-short-descriptors.txt
@@ -1,7 +1,5 @@
Filename: 104-short-descriptors.txt
Title: Long and Short Router Descriptors
-Version: $Revision$
-Last-Modified: $Date$
Author: Nick Mathewson
Created: Jan 2007
Status: Closed
diff --git a/doc/spec/proposals/105-handshake-revision.txt b/doc/spec/proposals/105-handshake-revision.txt
index f6c209e71b..791a016c26 100644
--- a/doc/spec/proposals/105-handshake-revision.txt
+++ b/doc/spec/proposals/105-handshake-revision.txt
@@ -1,7 +1,5 @@
Filename: 105-handshake-revision.txt
Title: Version negotiation for the Tor protocol.
-Version: $Revision$
-Last-Modified: $Date$
Author: Nick Mathewson, Roger Dingledine
Created: Jan 2007
Status: Closed
diff --git a/doc/spec/proposals/106-less-tls-constraint.txt b/doc/spec/proposals/106-less-tls-constraint.txt
index 35d6bf1066..7e7621df69 100644
--- a/doc/spec/proposals/106-less-tls-constraint.txt
+++ b/doc/spec/proposals/106-less-tls-constraint.txt
@@ -1,7 +1,5 @@
Filename: 106-less-tls-constraint.txt
Title: Checking fewer things during TLS handshakes
-Version: $Revision$
-Last-Modified: $Date$
Author: Nick Mathewson
Created: 9-Feb-2007
Status: Closed
diff --git a/doc/spec/proposals/107-uptime-sanity-checking.txt b/doc/spec/proposals/107-uptime-sanity-checking.txt
index b11be89380..922129b21d 100644
--- a/doc/spec/proposals/107-uptime-sanity-checking.txt
+++ b/doc/spec/proposals/107-uptime-sanity-checking.txt
@@ -1,7 +1,5 @@
Filename: 107-uptime-sanity-checking.txt
Title: Uptime Sanity Checking
-Version: $Revision$
-Last-Modified: $Date$
Author: Kevin Bauer & Damon McCoy
Created: 8-March-2007
Status: Closed
diff --git a/doc/spec/proposals/108-mtbf-based-stability.txt b/doc/spec/proposals/108-mtbf-based-stability.txt
index 2c66481530..294103760b 100644
--- a/doc/spec/proposals/108-mtbf-based-stability.txt
+++ b/doc/spec/proposals/108-mtbf-based-stability.txt
@@ -1,7 +1,5 @@
Filename: 108-mtbf-based-stability.txt
Title: Base "Stable" Flag on Mean Time Between Failures
-Version: $Revision$
-Last-Modified: $Date$
Author: Nick Mathewson
Created: 10-Mar-2007
Status: Closed
diff --git a/doc/spec/proposals/109-no-sharing-ips.txt b/doc/spec/proposals/109-no-sharing-ips.txt
index 1a88b00c0f..5438cf049a 100644
--- a/doc/spec/proposals/109-no-sharing-ips.txt
+++ b/doc/spec/proposals/109-no-sharing-ips.txt
@@ -1,7 +1,5 @@
Filename: 109-no-sharing-ips.txt
Title: No more than one server per IP address.
-Version: $Revision$
-Last-Modified: $Date$
Author: Kevin Bauer & Damon McCoy
Created: 9-March-2007
Status: Closed
diff --git a/doc/spec/proposals/110-avoid-infinite-circuits.txt b/doc/spec/proposals/110-avoid-infinite-circuits.txt
index 1834cd34a7..fffc41c25a 100644
--- a/doc/spec/proposals/110-avoid-infinite-circuits.txt
+++ b/doc/spec/proposals/110-avoid-infinite-circuits.txt
@@ -1,7 +1,5 @@
Filename: 110-avoid-infinite-circuits.txt
Title: Avoiding infinite length circuits
-Version: $Revision$
-Last-Modified: $Date$
Author: Roger Dingledine
Created: 13-Mar-2007
Status: Accepted
diff --git a/doc/spec/proposals/111-local-traffic-priority.txt b/doc/spec/proposals/111-local-traffic-priority.txt
index f8a37efc94..9411463c21 100644
--- a/doc/spec/proposals/111-local-traffic-priority.txt
+++ b/doc/spec/proposals/111-local-traffic-priority.txt
@@ -1,7 +1,5 @@
Filename: 111-local-traffic-priority.txt
Title: Prioritizing local traffic over relayed traffic
-Version: $Revision$
-Last-Modified: $Date$
Author: Roger Dingledine
Created: 14-Mar-2007
Status: Closed
diff --git a/doc/spec/proposals/112-bring-back-pathlencoinweight.txt b/doc/spec/proposals/112-bring-back-pathlencoinweight.txt
index e7cc6b4e36..3f6c3376f0 100644
--- a/doc/spec/proposals/112-bring-back-pathlencoinweight.txt
+++ b/doc/spec/proposals/112-bring-back-pathlencoinweight.txt
@@ -1,7 +1,5 @@
Filename: 112-bring-back-pathlencoinweight.txt
Title: Bring Back Pathlen Coin Weight
-Version: $Revision$
-Last-Modified: $Date$
Author: Mike Perry
Created:
Status: Superseded
diff --git a/doc/spec/proposals/113-fast-authority-interface.txt b/doc/spec/proposals/113-fast-authority-interface.txt
index 20cf33e429..8912b53220 100644
--- a/doc/spec/proposals/113-fast-authority-interface.txt
+++ b/doc/spec/proposals/113-fast-authority-interface.txt
@@ -1,7 +1,5 @@
Filename: 113-fast-authority-interface.txt
Title: Simplifying directory authority administration
-Version: $Revision$
-Last-Modified: $Date$
Author: Nick Mathewson
Created:
Status: Superseded
diff --git a/doc/spec/proposals/114-distributed-storage.txt b/doc/spec/proposals/114-distributed-storage.txt
index e9271fb82d..91a787d301 100644
--- a/doc/spec/proposals/114-distributed-storage.txt
+++ b/doc/spec/proposals/114-distributed-storage.txt
@@ -1,7 +1,5 @@
Filename: 114-distributed-storage.txt
Title: Distributed Storage for Tor Hidden Service Descriptors
-Version: $Revision$
-Last-Modified: $Date$
Author: Karsten Loesing
Created: 13-May-2007
Status: Closed
diff --git a/doc/spec/proposals/115-two-hop-paths.txt b/doc/spec/proposals/115-two-hop-paths.txt
index ee10d949c4..9854c9ad55 100644
--- a/doc/spec/proposals/115-two-hop-paths.txt
+++ b/doc/spec/proposals/115-two-hop-paths.txt
@@ -1,7 +1,5 @@
Filename: 115-two-hop-paths.txt
Title: Two Hop Paths
-Version: $Revision$
-Last-Modified: $Date$
Author: Mike Perry
Created:
Status: Dead
diff --git a/doc/spec/proposals/116-two-hop-paths-from-guard.txt b/doc/spec/proposals/116-two-hop-paths-from-guard.txt
index 454b344abf..f45625350b 100644
--- a/doc/spec/proposals/116-two-hop-paths-from-guard.txt
+++ b/doc/spec/proposals/116-two-hop-paths-from-guard.txt
@@ -1,7 +1,5 @@
Filename: 116-two-hop-paths-from-guard.txt
Title: Two hop paths from entry guards
-Version: $Revision$
-Last-Modified: $Date$
Author: Michael Lieberman
Created: 26-Jun-2007
Status: Dead
diff --git a/doc/spec/proposals/117-ipv6-exits.txt b/doc/spec/proposals/117-ipv6-exits.txt
index c8402821ed..00cd7cef10 100644
--- a/doc/spec/proposals/117-ipv6-exits.txt
+++ b/doc/spec/proposals/117-ipv6-exits.txt
@@ -1,7 +1,5 @@
Filename: 117-ipv6-exits.txt
Title: IPv6 exits
-Version: $Revision$
-Last-Modified: $Date$
Author: coderman
Created: 10-Jul-2007
Status: Accepted
diff --git a/doc/spec/proposals/118-multiple-orports.txt b/doc/spec/proposals/118-multiple-orports.txt
index 1bef2504d9..2381ec7ca3 100644
--- a/doc/spec/proposals/118-multiple-orports.txt
+++ b/doc/spec/proposals/118-multiple-orports.txt
@@ -1,7 +1,5 @@
Filename: 118-multiple-orports.txt
Title: Advertising multiple ORPorts at once
-Version: $Revision$
-Last-Modified: $Date$
Author: Nick Mathewson
Created: 09-Jul-2007
Status: Accepted
diff --git a/doc/spec/proposals/119-controlport-auth.txt b/doc/spec/proposals/119-controlport-auth.txt
index dc57a27368..9ed1cc1cbe 100644
--- a/doc/spec/proposals/119-controlport-auth.txt
+++ b/doc/spec/proposals/119-controlport-auth.txt
@@ -1,7 +1,5 @@
Filename: 119-controlport-auth.txt
Title: New PROTOCOLINFO command for controllers
-Version: $Revision$
-Last-Modified: $Date$
Author: Roger Dingledine
Created: 14-Aug-2007
Status: Closed
diff --git a/doc/spec/proposals/120-shutdown-descriptors.txt b/doc/spec/proposals/120-shutdown-descriptors.txt
index dc1265b03b..5cfe2b5bc6 100644
--- a/doc/spec/proposals/120-shutdown-descriptors.txt
+++ b/doc/spec/proposals/120-shutdown-descriptors.txt
@@ -1,7 +1,5 @@
Filename: 120-shutdown-descriptors.txt
Title: Shutdown descriptors when Tor servers stop
-Version: $Revision$
-Last-Modified: $Date$
Author: Roger Dingledine
Created: 15-Aug-2007
Status: Dead
diff --git a/doc/spec/proposals/121-hidden-service-authentication.txt b/doc/spec/proposals/121-hidden-service-authentication.txt
index 828bf3c92d..0d92b53a8c 100644
--- a/doc/spec/proposals/121-hidden-service-authentication.txt
+++ b/doc/spec/proposals/121-hidden-service-authentication.txt
@@ -1,7 +1,5 @@
Filename: 121-hidden-service-authentication.txt
Title: Hidden Service Authentication
-Version: $Revision$
-Last-Modified: $Date$
Author: Tobias Kamm, Thomas Lauterbach, Karsten Loesing, Ferdinand Rieger,
Christoph Weingarten
Created: 10-Sep-2007
diff --git a/doc/spec/proposals/122-unnamed-flag.txt b/doc/spec/proposals/122-unnamed-flag.txt
index 6502b9c560..2ce7bb22b9 100644
--- a/doc/spec/proposals/122-unnamed-flag.txt
+++ b/doc/spec/proposals/122-unnamed-flag.txt
@@ -1,7 +1,5 @@
Filename: 122-unnamed-flag.txt
Title: Network status entries need a new Unnamed flag
-Version: $Revision$
-Last-Modified: $Date$
Author: Roger Dingledine
Created: 04-Oct-2007
Status: Closed
diff --git a/doc/spec/proposals/123-autonaming.txt b/doc/spec/proposals/123-autonaming.txt
index 6cd25329f8..74c486985d 100644
--- a/doc/spec/proposals/123-autonaming.txt
+++ b/doc/spec/proposals/123-autonaming.txt
@@ -1,7 +1,5 @@
Filename: 123-autonaming.txt
Title: Naming authorities automatically create bindings
-Version: $Revision$
-Last-Modified: $Date$
Author: Peter Palfrader
Created: 2007-10-11
Status: Closed
diff --git a/doc/spec/proposals/124-tls-certificates.txt b/doc/spec/proposals/124-tls-certificates.txt
index 0a47772732..9472d14af8 100644
--- a/doc/spec/proposals/124-tls-certificates.txt
+++ b/doc/spec/proposals/124-tls-certificates.txt
@@ -1,7 +1,5 @@
Filename: 124-tls-certificates.txt
Title: Blocking resistant TLS certificate usage
-Version: $Revision$
-Last-Modified: $Date$
Author: Steven J. Murdoch
Created: 2007-10-25
Status: Superseded
diff --git a/doc/spec/proposals/125-bridges.txt b/doc/spec/proposals/125-bridges.txt
index 8bb3169780..9d95729d42 100644
--- a/doc/spec/proposals/125-bridges.txt
+++ b/doc/spec/proposals/125-bridges.txt
@@ -1,7 +1,5 @@
Filename: 125-bridges.txt
Title: Behavior for bridge users, bridge relays, and bridge authorities
-Version: $Revision$
-Last-Modified: $Date$
Author: Roger Dingledine
Created: 11-Nov-2007
Status: Closed
diff --git a/doc/spec/proposals/126-geoip-reporting.txt b/doc/spec/proposals/126-geoip-reporting.txt
index d48a08ba38..9f3b21c670 100644
--- a/doc/spec/proposals/126-geoip-reporting.txt
+++ b/doc/spec/proposals/126-geoip-reporting.txt
@@ -1,7 +1,5 @@
Filename: 126-geoip-reporting.txt
Title: Getting GeoIP data and publishing usage summaries
-Version: $Revision$
-Last-Modified: $Date$
Author: Roger Dingledine
Created: 2007-11-24
Status: Closed
diff --git a/doc/spec/proposals/127-dirport-mirrors-downloads.txt b/doc/spec/proposals/127-dirport-mirrors-downloads.txt
index 1b55a02d61..72d6c0cb9f 100644
--- a/doc/spec/proposals/127-dirport-mirrors-downloads.txt
+++ b/doc/spec/proposals/127-dirport-mirrors-downloads.txt
@@ -1,7 +1,5 @@
Filename: 127-dirport-mirrors-downloads.txt
Title: Relaying dirport requests to Tor download site / website
-Version: $Revision$
-Last-Modified: $Date$
Author: Roger Dingledine
Created: 2007-12-02
Status: Draft
diff --git a/doc/spec/proposals/128-bridge-families.txt b/doc/spec/proposals/128-bridge-families.txt
index e8a0050c3c..e5bdcf95cb 100644
--- a/doc/spec/proposals/128-bridge-families.txt
+++ b/doc/spec/proposals/128-bridge-families.txt
@@ -1,7 +1,5 @@
Filename: 128-bridge-families.txt
Title: Families of private bridges
-Version: $Revision$
-Last-Modified: $Date$
Author: Roger Dingledine
Created: 2007-12-xx
Status: Dead
diff --git a/doc/spec/proposals/129-reject-plaintext-ports.txt b/doc/spec/proposals/129-reject-plaintext-ports.txt
index d4767d03d8..8080ff5b75 100644
--- a/doc/spec/proposals/129-reject-plaintext-ports.txt
+++ b/doc/spec/proposals/129-reject-plaintext-ports.txt
@@ -1,7 +1,5 @@
Filename: 129-reject-plaintext-ports.txt
Title: Block Insecure Protocols by Default
-Version: $Revision$
-Last-Modified: $Date$
Author: Kevin Bauer & Damon McCoy
Created: 2008-01-15
Status: Closed
diff --git a/doc/spec/proposals/130-v2-conn-protocol.txt b/doc/spec/proposals/130-v2-conn-protocol.txt
index 16f5bf2844..60e742a622 100644
--- a/doc/spec/proposals/130-v2-conn-protocol.txt
+++ b/doc/spec/proposals/130-v2-conn-protocol.txt
@@ -1,7 +1,5 @@
Filename: 130-v2-conn-protocol.txt
Title: Version 2 Tor connection protocol
-Version: $Revision$
-Last-Modified: $Date$
Author: Nick Mathewson
Created: 2007-10-25
Status: Closed
diff --git a/doc/spec/proposals/131-verify-tor-usage.txt b/doc/spec/proposals/131-verify-tor-usage.txt
index 2687139189..d3c6efe75a 100644
--- a/doc/spec/proposals/131-verify-tor-usage.txt
+++ b/doc/spec/proposals/131-verify-tor-usage.txt
@@ -1,7 +1,5 @@
Filename: 131-verify-tor-usage.txt
Title: Help users to verify they are using Tor
-Version: $Revision$
-Last-Modified: $Date$
Author: Steven J. Murdoch
Created: 2008-01-25
Status: Needs-Revision
diff --git a/doc/spec/proposals/132-browser-check-tor-service.txt b/doc/spec/proposals/132-browser-check-tor-service.txt
index d07a10dcde..6132e5d060 100644
--- a/doc/spec/proposals/132-browser-check-tor-service.txt
+++ b/doc/spec/proposals/132-browser-check-tor-service.txt
@@ -1,7 +1,5 @@
Filename: 132-browser-check-tor-service.txt
Title: A Tor Web Service For Verifying Correct Browser Configuration
-Version: $Revision$
-Last-Modified: $Date$
Author: Robert Hogan
Created: 2008-03-08
Status: Draft
diff --git a/doc/spec/proposals/134-robust-voting.txt b/doc/spec/proposals/134-robust-voting.txt
index 5d5e77fa3b..c5dfb3b47f 100644
--- a/doc/spec/proposals/134-robust-voting.txt
+++ b/doc/spec/proposals/134-robust-voting.txt
@@ -2,8 +2,10 @@ Filename: 134-robust-voting.txt
Title: More robust consensus voting with diverse authority sets
Author: Peter Palfrader
Created: 2008-04-01
-Status: Accepted
-Target: 0.2.2.x
+Status: Rejected
+
+History:
+ 2009 May 27: Added note on rejecting this proposal -- Nick
Overview:
@@ -103,3 +105,19 @@ Possible Attacks/Open Issues/Some thinking required:
Q: Can this ever force us to build a consensus with authorities we do not
recognize?
A: No, we can never build a fully connected set with them in step 3.
+
+------------------------------
+
+I'm rejecting this proposal as insecure.
+
+Suppose that we have a clique of size N, and M hostile members in the
+clique. If these hostile members stop declaring trust for up to M-1
+good members of the clique, the clique with the hostile members will
+in it will be larger than the one without them.
+
+The M hostile members will constitute a majority of this new clique
+when M > (N-(M-1)) / 2, or when M > (N + 1) / 3. This breaks our
+requirement that an adversary must compromise a majority of authorities
+in order to control the consensus.
+
+-- Nick
diff --git a/doc/spec/proposals/135-private-tor-networks.txt b/doc/spec/proposals/135-private-tor-networks.txt
index 131bbb9068..19ef68b7b1 100644
--- a/doc/spec/proposals/135-private-tor-networks.txt
+++ b/doc/spec/proposals/135-private-tor-networks.txt
@@ -1,7 +1,5 @@
Filename: 135-private-tor-networks.txt
Title: Simplify Configuration of Private Tor Networks
-Version: $Revision$
-Last-Modified: $Date$
Author: Karsten Loesing
Created: 29-Apr-2008
Status: Closed
diff --git a/doc/spec/proposals/137-bootstrap-phases.txt b/doc/spec/proposals/137-bootstrap-phases.txt
index 18d3dfae12..ebe044c707 100644
--- a/doc/spec/proposals/137-bootstrap-phases.txt
+++ b/doc/spec/proposals/137-bootstrap-phases.txt
@@ -1,7 +1,5 @@
Filename: 137-bootstrap-phases.txt
Title: Keep controllers informed as Tor bootstraps
-Version: $Revision$
-Last-Modified: $Date$
Author: Roger Dingledine
Created: 07-Jun-2008
Status: Closed
diff --git a/doc/spec/proposals/138-remove-down-routers-from-consensus.txt b/doc/spec/proposals/138-remove-down-routers-from-consensus.txt
index a07764d536..776911b5c9 100644
--- a/doc/spec/proposals/138-remove-down-routers-from-consensus.txt
+++ b/doc/spec/proposals/138-remove-down-routers-from-consensus.txt
@@ -1,7 +1,5 @@
Filename: 138-remove-down-routers-from-consensus.txt
Title: Remove routers that are not Running from consensus documents
-Version: $Revision$
-Last-Modified: $Date$
Author: Peter Palfrader
Created: 11-Jun-2008
Status: Closed
diff --git a/doc/spec/proposals/140-consensus-diffs.txt b/doc/spec/proposals/140-consensus-diffs.txt
index da63bfe23c..8bc4070bfe 100644
--- a/doc/spec/proposals/140-consensus-diffs.txt
+++ b/doc/spec/proposals/140-consensus-diffs.txt
@@ -1,12 +1,15 @@
Filename: 140-consensus-diffs.txt
Title: Provide diffs between consensuses
-Version: $Revision$
-Last-Modified: $Date$
Author: Peter Palfrader
Created: 13-Jun-2008
Status: Accepted
Target: 0.2.2.x
+0. History
+
+ 22-May-2009: Restricted the ed format even more strictly for ease of
+ implementation. -nickm
+
1. Overview.
Tor clients and servers need a list of which relays are on the
@@ -135,6 +138,10 @@ Target: 0.2.2.x
Note that line numbers always apply to the file after all previous
commands have already been applied.
+ The commands MUST apply to the file from back to front, such that
+ lines are only ever referred to by their position in the original
+ file.
+
The "current line" is either the first line of the file, if this is
the first command, the last line of a block we added in an append or
change command, or the line immediate following a set of lines we just
diff --git a/doc/spec/proposals/141-jit-sd-downloads.txt b/doc/spec/proposals/141-jit-sd-downloads.txt
index b0c2b2cbcd..2ac7a086b7 100644
--- a/doc/spec/proposals/141-jit-sd-downloads.txt
+++ b/doc/spec/proposals/141-jit-sd-downloads.txt
@@ -1,7 +1,5 @@
Filename: 141-jit-sd-downloads.txt
Title: Download server descriptors on demand
-Version: $Revision$
-Last-Modified: $Date$
Author: Peter Palfrader
Created: 15-Jun-2008
Status: Draft
@@ -63,8 +61,8 @@ Status: Draft
which tries to convey a server's capacity to clients.
Currently we weigh servers differently for different purposes. There
- is a weigh for when we use a server as a guard node (our entry to the
- Tor network), there is one weigh we assign servers for exit duties,
+ is a weight for when we use a server as a guard node (our entry to the
+ Tor network), there is one weight we assign servers for exit duties,
and a third for when we need intermediate (middle) nodes.
2.2 Exit information
@@ -80,7 +78,7 @@ Status: Draft
2.3 Capability information
- Server descriptors contain information about the specific version or
+ Server descriptors contain information about the specific version of
the Tor protocol they understand [proposal 105].
Furthermore the server descriptor also contains the exact version of
diff --git a/doc/spec/proposals/142-combine-intro-and-rend-points.txt b/doc/spec/proposals/142-combine-intro-and-rend-points.txt
index 3456b285a9..3abd5c863d 100644
--- a/doc/spec/proposals/142-combine-intro-and-rend-points.txt
+++ b/doc/spec/proposals/142-combine-intro-and-rend-points.txt
@@ -1,7 +1,5 @@
Filename: 142-combine-intro-and-rend-points.txt
Title: Combine Introduction and Rendezvous Points
-Version: $Revision$
-Last-Modified: $Date$
Author: Karsten Loesing, Christian Wilms
Created: 27-Jun-2008
Status: Dead
diff --git a/doc/spec/proposals/143-distributed-storage-improvements.txt b/doc/spec/proposals/143-distributed-storage-improvements.txt
index 8789d84663..0f7468f1dc 100644
--- a/doc/spec/proposals/143-distributed-storage-improvements.txt
+++ b/doc/spec/proposals/143-distributed-storage-improvements.txt
@@ -1,7 +1,5 @@
Filename: 143-distributed-storage-improvements.txt
Title: Improvements of Distributed Storage for Tor Hidden Service Descriptors
-Version: $Revision$
-Last-Modified: $Date$
Author: Karsten Loesing
Created: 28-Jun-2008
Status: Open
diff --git a/doc/spec/proposals/145-newguard-flag.txt b/doc/spec/proposals/145-newguard-flag.txt
index 31d707d725..9e61e30be9 100644
--- a/doc/spec/proposals/145-newguard-flag.txt
+++ b/doc/spec/proposals/145-newguard-flag.txt
@@ -1,7 +1,5 @@
Filename: 145-newguard-flag.txt
Title: Separate "suitable as a guard" from "suitable as a new guard"
-Version: $Revision$
-Last-Modified: $Date$
Author: Nick Mathewson
Created: 1-Jul-2008
Status: Open
diff --git a/doc/spec/proposals/146-long-term-stability.txt b/doc/spec/proposals/146-long-term-stability.txt
index 7cfd58f564..9af0017441 100644
--- a/doc/spec/proposals/146-long-term-stability.txt
+++ b/doc/spec/proposals/146-long-term-stability.txt
@@ -1,7 +1,5 @@
Filename: 146-long-term-stability.txt
Title: Add new flag to reflect long-term stability
-Version: $Revision$
-Last-Modified: $Date$
Author: Nick Mathewson
Created: 19-Jun-2008
Status: Open
diff --git a/doc/spec/proposals/147-prevoting-opinions.txt b/doc/spec/proposals/147-prevoting-opinions.txt
index 2b8cf30e46..3d9659c984 100644
--- a/doc/spec/proposals/147-prevoting-opinions.txt
+++ b/doc/spec/proposals/147-prevoting-opinions.txt
@@ -1,7 +1,5 @@
Filename: 147-prevoting-opinions.txt
Title: Eliminate the need for v2 directories in generating v3 directories
-Version: $Revision$
-Last-Modified: $Date$
Author: Nick Mathewson
Created: 2-Jul-2008
Status: Accepted
diff --git a/doc/spec/proposals/148-uniform-client-end-reason.txt b/doc/spec/proposals/148-uniform-client-end-reason.txt
index cec81253ea..1db3b3e596 100644
--- a/doc/spec/proposals/148-uniform-client-end-reason.txt
+++ b/doc/spec/proposals/148-uniform-client-end-reason.txt
@@ -1,7 +1,5 @@
Filename: 148-uniform-client-end-reason.txt
Title: Stream end reasons from the client side should be uniform
-Version: $Revision$
-Last-Modified: $Date$
Author: Roger Dingledine
Created: 2-Jul-2008
Status: Closed
diff --git a/doc/spec/proposals/149-using-netinfo-data.txt b/doc/spec/proposals/149-using-netinfo-data.txt
index 4919514b4c..8bf8375d5d 100644
--- a/doc/spec/proposals/149-using-netinfo-data.txt
+++ b/doc/spec/proposals/149-using-netinfo-data.txt
@@ -1,7 +1,5 @@
Filename: 149-using-netinfo-data.txt
Title: Using data from NETINFO cells
-Version: $Revision$
-Last-Modified: $Date$
Author: Nick Mathewson
Created: 2-Jul-2008
Status: Open
@@ -24,14 +22,14 @@ Motivation
idea of their own IP addresses, so they can publish correct
descriptors. This is also in NETINFO cells.
-Learning the time and IP
+Learning the time and IP address
We need to think about attackers here. Just because a router tells
us that we have a given IP or a given clock skew doesn't mean that
it's true. We believe this information only if we've heard it from
a majority of the routers we've connected to recently, including at
least 3 routers. Routers only believe this information if the
- majority inclues at least one authority.
+ majority includes at least one authority.
Avoiding MITM attacks
diff --git a/doc/spec/proposals/150-exclude-exit-nodes.txt b/doc/spec/proposals/150-exclude-exit-nodes.txt
index b73a9cc4d1..b497ae62c1 100644
--- a/doc/spec/proposals/150-exclude-exit-nodes.txt
+++ b/doc/spec/proposals/150-exclude-exit-nodes.txt
@@ -1,6 +1,5 @@
Filename: 150-exclude-exit-nodes.txt
Title: Exclude Exit Nodes from a circuit
-Version: $Revision$
Author: Mfr
Created: 2008-06-15
Status: Closed
diff --git a/doc/spec/proposals/151-path-selection-improvements.txt b/doc/spec/proposals/151-path-selection-improvements.txt
index e3c8f35451..363eebae84 100644
--- a/doc/spec/proposals/151-path-selection-improvements.txt
+++ b/doc/spec/proposals/151-path-selection-improvements.txt
@@ -1,16 +1,14 @@
Filename: 151-path-selection-improvements.txt
Title: Improving Tor Path Selection
-Version:
-Last-Modified:
Author: Fallon Chen, Mike Perry
Created: 5-Jul-2008
-Status: Draft
+Status: Finished
Overview
The performance of paths selected can be improved by adjusting the
CircuitBuildTimeout and avoiding failing guard nodes. This proposal
- describes a method of tracking buildtime statistics at the client, and
+ describes a method of tracking buildtime statistics at the client, and
using those statistics to adjust the CircuitBuildTimeout.
Motivation
@@ -22,121 +20,123 @@ Motivation
Implementation
- Storing Build Times
+ Gathering Build Times
- Circuit build times will be stored in the circular array
- 'circuit_build_times' consisting of uint16_t elements as milliseconds.
- The total size of this array will be based on the number of circuits
+ Circuit build times are stored in the circular array
+ 'circuit_build_times' consisting of uint32_t elements as milliseconds.
+ The total size of this array is based on the number of circuits
it takes to converge on a good fit of the long term distribution of
the circuit builds for a fixed link. We do not want this value to be
too large, because it will make it difficult for clients to adapt to
moving between different links.
- From our initial observations, this value appears to be on the order
- of 1000, but will be configurable in a #define NCIRCUITS_TO_OBSERVE.
- The exact value for this #define will be determined by performing
- goodness of fit tests using measurments obtained from the shufflebt.py
- script from TorFlow.
-
+ From our observations, the minimum value for a reasonable fit appears
+ to be on the order of 500 (MIN_CIRCUITS_TO_OBSERVE). However, to keep
+ a good fit over the long term, we store 5000 most recent circuits in
+ the array (NCIRCUITS_TO_OBSERVE).
+
+ The Tor client will build test circuits at a rate of one per
+ minute (BUILD_TIMES_TEST_FREQUENCY) up to the point of
+ MIN_CIRCUITS_TO_OBSERVE. This allows a fresh Tor to have
+ a CircuitBuildTimeout estimated within 8 hours after install,
+ upgrade, or network change (see below).
+
Long Term Storage
- The long-term storage representation will be implemented by storing a
- histogram with BUILDTIME_BIN_WIDTH millisecond buckets (default 50) when
- writing out the statistics to disk. The format of this histogram on disk
- is yet to be finalized, but it will likely be of the format
- 'CircuitBuildTime <bin> <count>', with the total specified as
- 'TotalBuildTimes <total>'
+ The long-term storage representation is implemented by storing a
+ histogram with BUILDTIME_BIN_WIDTH millisecond buckets (default 50) when
+ writing out the statistics to disk. The format this takes in the
+ state file is 'CircuitBuildTime <bin-ms> <count>', with the total
+ specified as 'TotalBuildTimes <total>'
Example:
TotalBuildTimes 100
- CircuitBuildTimeBin 1 50
- CircuitBuildTimeBin 2 25
- CircuitBuildTimeBin 3 13
+ CircuitBuildTimeBin 25 50
+ CircuitBuildTimeBin 75 25
+ CircuitBuildTimeBin 125 13
...
- Reading the histogram in will entail multiplying each bin by the
- BUILDTIME_BIN_WIDTH and then inserting <count> values into the
- circuit_build_times array each with the value of
- <bin>*BUILDTIME_BIN_WIDTH. In order to evenly distribute the
- values in the circular array, a form of index skipping must
- be employed. Values from bin #N with bin count C and total T
- will occupy indexes specified by N+((T/C)*k)-1, where k is the
- set of integers ranging from 0 to C-1.
-
- For example, this would mean that the values from bin 1 would
- occupy indexes 1+(100/50)*k-1, or 0, 2, 4, 6, 8, 10 and so on.
- The values for bin 2 would occupy positions 1, 5, 9, 13. Collisions
- will be inserted at the first empty position in the array greater
- than the selected index (which may requiring looping around the
- array back to index 0).
+ Reading the histogram in will entail inserting <count> values
+ into the circuit_build_times array each with the value of
+ <bin-ms> milliseconds. In order to evenly distribute the values
+ in the circular array, the Fisher-Yates shuffle will be performed
+ after reading values from the bins.
Learning the CircuitBuildTimeout
Based on studies of build times, we found that the distribution of
- circuit buildtimes appears to be a Pareto distribution.
+ circuit buildtimes appears to be a Frechet distribution. However,
+ estimators and quantile functions of the Frechet distribution are
+ difficult to work with and slow to converge. So instead, since we
+ are only interested in the accuracy of the tail, we approximate
+ the tail of the distribution with a Pareto curve starting at
+ the mode of the circuit build time sample set.
We will calculate the parameters for a Pareto distribution
fitting the data using the estimators at
http://en.wikipedia.org/wiki/Pareto_distribution#Parameter_estimation.
- The timeout itself will be calculated by solving the CDF for the
- a percentile cutoff BUILDTIME_PERCENT_CUTOFF. This value
- represents the percentage of paths the Tor client will accept out of
- the total number of paths. We have not yet determined a good
- cutoff for this mathematically, but 85% seems a good choice for now.
+ The timeout itself is calculated by using the Quartile function (the
+ inverted CDF) to give us the value on the CDF such that
+ BUILDTIME_PERCENT_CUTOFF (80%) of the mass of the distribution is
+ below the timeout value.
+
+ Thus, we expect that the Tor client will accept the fastest 80% of
+ the total number of paths on the network.
+
+ Detecting Changing Network Conditions
- From http://en.wikipedia.org/wiki/Pareto_distribution#Definition,
- the calculation we need is pow(BUILDTIME_PERCENT_CUTOFF/100.0, k)/Xm.
+ We attempt to detect both network connectivity loss and drastic
+ changes in the timeout characteristics.
+
+ We assume that we've had network connectivity loss if 3 circuits
+ timeout and we've received no cells or TLS handshakes since those
+ circuits began. We then set the timeout to 60 seconds and stop
+ counting timeouts.
+
+ If 3 more circuits timeout and the network still has not been
+ live within this new 60 second timeout window, we then discard
+ the previous timeouts during this period from our history.
+
+ To detect changing network conditions, we keep a history of
+ the timeout or non-timeout status of the past RECENT_CIRCUITS (20)
+ that successfully completed at least one hop. If more than 75%
+ of these circuits timeout, we discard all buildtimes history,
+ reset the timeout to 60, and then begin recomputing the timeout.
Testing
After circuit build times, storage, and learning are implemented,
the resulting histogram should be checked for consistency by
- verifying it persists across successive Tor invocations where
+ verifying it persists across successive Tor invocations where
no circuits are built. In addition, we can also use the existing
- buildtime scripts to record build times, and verify that the histogram
+ buildtime scripts to record build times, and verify that the histogram
the python produces matches that which is output to the state file in Tor,
and verify that the Pareto parameters and cutoff points also match.
-
- Soft timeout vs Hard Timeout
-
- At some point, it may be desirable to change the cutoff from a
- single hard cutoff that destroys the circuit to a soft cutoff and
- a hard cutoff, where the soft cutoff merely triggers the building
- of a new circuit, and the hard cutoff triggers destruction of the
- circuit.
-
- Good values for hard and soft cutoffs seem to be 85% and 65%
- respectively, but we should eventually justify this with observation.
-
- When to Begin Calculation
- The number of circuits to observe (NCIRCUITS_TO_CUTOFF) before
- changing the CircuitBuildTimeout will be tunable via a #define. From
- our measurements, a good value for NCIRCUITS_TO_CUTOFF appears to be
- on the order of 100.
+ We will also verify that there are no unexpected large deviations from
+ node selection, such as nodes from distant geographical locations being
+ completely excluded.
Dealing with Timeouts
- Timeouts should be counted as the expectation of the region of
- of the Pareto distribution beyond the cutoff. The proposal will
- be updated with this value soon.
+ Timeouts should be counted as the expectation of the region of
+ of the Pareto distribution beyond the cutoff. This is done by
+ generating a random sample for each timeout at points on the
+ curve beyond the current timeout cutoff.
- Also, in the event of network failure, the observation mechanism
- should stop collecting timeout data.
+ Future Work
- Client Hints
+ At some point, it may be desirable to change the cutoff from a
+ single hard cutoff that destroys the circuit to a soft cutoff and
+ a hard cutoff, where the soft cutoff merely triggers the building
+ of a new circuit, and the hard cutoff triggers destruction of the
+ circuit.
- Some research still needs to be done to provide initial values
- for CircuitBuildTimeout based on values learned from modem
- users, DSL users, Cable Modem users, and dedicated links. A
- radiobutton in Vidalia should eventually be provided that
- sets CircuitBuildTimeout to one of these values and also
- provide the option of purging all learned data, should any exist.
+ It may also be beneficial to learn separate timeouts for each
+ guard node, as they will have slightly different distributions.
+ This will take longer to generate initial values though.
- These values can either be published in the directory, or
- shipped hardcoded for a particular Tor version.
-
Issues
Impact on anonymity
diff --git a/doc/spec/proposals/152-single-hop-circuits.txt b/doc/spec/proposals/152-single-hop-circuits.txt
index e49a4250e0..d0b28b1c72 100644
--- a/doc/spec/proposals/152-single-hop-circuits.txt
+++ b/doc/spec/proposals/152-single-hop-circuits.txt
@@ -1,7 +1,5 @@
Filename: 152-single-hop-circuits.txt
Title: Optionally allow exit from single-hop circuits
-Version:
-Last-Modified:
Author: Geoff Goodell
Created: 13-Jul-2008
Status: Closed
diff --git a/doc/spec/proposals/153-automatic-software-update-protocol.txt b/doc/spec/proposals/153-automatic-software-update-protocol.txt
index 7bc809d440..c2979bb695 100644
--- a/doc/spec/proposals/153-automatic-software-update-protocol.txt
+++ b/doc/spec/proposals/153-automatic-software-update-protocol.txt
@@ -1,7 +1,5 @@
Filename: 153-automatic-software-update-protocol.txt
Title: Automatic software update protocol
-Version: $Revision$
-Last-Modified: $Date$
Author: Jacob Appelbaum
Created: 14-July-2008
Status: Superseded
diff --git a/doc/spec/proposals/154-automatic-updates.txt b/doc/spec/proposals/154-automatic-updates.txt
index 00a820de08..4c2c6d3899 100644
--- a/doc/spec/proposals/154-automatic-updates.txt
+++ b/doc/spec/proposals/154-automatic-updates.txt
@@ -1,7 +1,5 @@
Filename: 154-automatic-updates.txt
Title: Automatic Software Update Protocol
-Version: $Revision$
-Last-Modified: $Date$
Author: Matt Edman
Created: 30-July-2008
Status: Superseded
diff --git a/doc/spec/proposals/155-four-hidden-service-improvements.txt b/doc/spec/proposals/155-four-hidden-service-improvements.txt
index f528f8baf2..e342bf1c39 100644
--- a/doc/spec/proposals/155-four-hidden-service-improvements.txt
+++ b/doc/spec/proposals/155-four-hidden-service-improvements.txt
@@ -1,7 +1,5 @@
Filename: 155-four-hidden-service-improvements.txt
Title: Four Improvements of Hidden Service Performance
-Version: $Revision$
-Last-Modified: $Date$
Author: Karsten Loesing, Christian Wilms
Created: 25-Sep-2008
Status: Finished
diff --git a/doc/spec/proposals/156-tracking-blocked-ports.txt b/doc/spec/proposals/156-tracking-blocked-ports.txt
index 1e7b0d963f..419de7e74c 100644
--- a/doc/spec/proposals/156-tracking-blocked-ports.txt
+++ b/doc/spec/proposals/156-tracking-blocked-ports.txt
@@ -1,7 +1,5 @@
Filename: 156-tracking-blocked-ports.txt
Title: Tracking blocked ports on the client side
-Version: $Revision$
-Last-Modified: $Date$
Author: Robert Hogan
Created: 14-Oct-2008
Status: Open
diff --git a/doc/spec/proposals/157-specific-cert-download.txt b/doc/spec/proposals/157-specific-cert-download.txt
index e54a987277..204b20973a 100644
--- a/doc/spec/proposals/157-specific-cert-download.txt
+++ b/doc/spec/proposals/157-specific-cert-download.txt
@@ -1,7 +1,5 @@
Filename: 157-specific-cert-download.txt
Title: Make certificate downloads specific
-Version: $Revision$
-Last-Modified: $Date$
Author: Nick Mathewson
Created: 2-Dec-2008
Status: Accepted
diff --git a/doc/spec/proposals/158-microdescriptors.txt b/doc/spec/proposals/158-microdescriptors.txt
index f478a3c834..e6966c0cef 100644
--- a/doc/spec/proposals/158-microdescriptors.txt
+++ b/doc/spec/proposals/158-microdescriptors.txt
@@ -1,11 +1,20 @@
Filename: 158-microdescriptors.txt
Title: Clients download consensus + microdescriptors
-Version: $Revision$
-Last-Modified: $Date$
Author: Roger Dingledine
Created: 17-Jan-2009
Status: Open
+0. History
+
+ 15 May 2009: Substantially revised based on discussions on or-dev
+ from late January. Removed the notion of voting on how to choose
+ microdescriptors; made it just a function of the consensus method.
+ (This lets us avoid the possibility of "desynchronization.")
+ Added suggestion to use a new consensus flavor. Specified use of
+ SHA256 for new hashes. -nickm
+
+ 15 June 2009: Cleaned up based on comments from Roger. -nickm
+
1. Overview
This proposal replaces section 3.2 of proposal 141, which was
@@ -13,9 +22,7 @@ Status: Open
circuit-building protocol to fetch a server descriptor inline at each
circuit extend, we instead put all of the information that clients need
either into the consensus itself, or into a new set of data about each
- relay called a microdescriptor. The microdescriptor is a direct
- transform from the relay descriptor, so relays don't even need to know
- this is happening.
+ relay called a microdescriptor.
Descriptor elements that are small and frequently changing should go
in the consensus itself, and descriptor elements that are small and
@@ -24,6 +31,10 @@ Status: Open
them, we'll need to resume considering some design like the one in
proposal 141.
+ Note also that any descriptor element which clients need to use to
+ decide which servers to fetch info about, or which servers to fetch
+ info from, needs to stay in the consensus.
+
2. Motivation
See
@@ -36,99 +47,91 @@ Status: Open
3. Design
There are three pieces to the proposal. First, authorities will list in
- their votes (and thus in the consensus) what relay descriptor elements
- are included in the microdescriptor, and also list the expected hash
- of microdescriptor for each relay. Second, directory mirrors will serve
- microdescriptors. Third, clients will ask for them and cache them.
+ their votes (and thus in the consensus) the expected hash of
+ microdescriptor for each relay. Second, authorities will serve
+ microdescriptors, directory mirrors will cache and serve
+ them. Third, clients will ask for them and cache them.
3.1. Consensus changes
- V3 votes should include a new line:
- microdescriptor-elements bar baz foo
- listing each descriptor element (sorted alphabetically) that authority
- included when it calculated its expected microdescriptor hashes.
+ If the authorities choose a consensus method of a given version or
+ later, a microdescriptor format is implicit in that version.
+ A microdescriptor should in every case be a pure function of the
+ router descriptor and the consensus method.
+
+ In votes, we need to include the hash of each expected microdescriptor
+ in the routerstatus section. I suggest a new "m" line for each stanza,
+ with the base64 of the SHA256 hash of the router's microdescriptor.
+
+ For every consensus method that an authority supports, it includes a
+ separate "m" line in each router section of its vote, containing:
+ "m" SP methods 1*(SP AlgorithmName "=" digest) NL
+ where methods is a comma-separated list of the consensus methods
+ that the authority believes will produce "digest".
- We also need to include the hash of each expected microdescriptor in
- the routerstatus section. I suggest a new "m" line for each stanza,
- with the base64 of the hash of the elements that the authority voted
- for above.
+ (As with base64 encoding of SHA1 hashes in consensuses, let's
+ omit the trailing =s)
The consensus microdescriptor-elements and "m" lines are then computed
as described in Section 3.1.2 below.
- I believe that means we need a new consensus-method "6" that knows
- how to compute the microdescriptor-elements and add "m" lines.
+ (This means we need a new consensus-method that knows
+ how to compute the microdescriptor-elements and add "m" lines.)
-3.1.1. Descriptor elements to include for now
+ The microdescriptor consensus uses the directory-signature format from
+ proposal 162, with the "sha256" algorithm.
- To start, the element list that authorities suggest should be
- family onion-key
- (Note that the or-dev posts above only mention onion-key, but if
- we don't also include family then clients will never learn it. It
- seemed like it should be relatively static, so putting it in the
- microdescriptor is smarter than trying to fit it into the consensus.)
+3.1.1. Descriptor elements to include for now
- We could imagine a config option "family,onion-key" so authorities
- could change their voted preferences without needing to upgrade.
+ In the first version, the microdescriptor should contain the
+ onion-key element, and the family element from the router descriptor,
+ and the exit policy summary as currently specified in dir-spec.txt.
3.1.2. Computing consensus for microdescriptor-elements and "m" lines
- One approach is for the consensus microdescriptor-elements line to
- include every element listed by a majority of authorities, sorted. The
- problem here is that it will no longer be deterministic what the correct
- hash for the "m" line should be. We could imagine telling the authority
- to go look in its descriptor and produce the right hash itself, but
- we don't want consensus calculation to be based on external data like
- that. (Plus, the authority may not have the descriptor that everybody
- else voted to use.)
-
- The better approach is to take the exact set that has the most votes
- (breaking ties by the set that has the most elements, and breaking
- ties after that by whichever is alphabetically first). That will
- increase the odds that we actually get a microdescriptor hash that
- is both a) for the descriptor we're putting in the consensus, and b)
- over the elements that we're declaring it should be for.
-
- Then the "m" line for a given relay is the one that gets the most votes
- from authorities that both a) voted for the microdescriptor-elements
- line we're using, and b) voted for the descriptor we're using.
-
- (If there's a tie, use the smaller hash. But really, if there are
- multiple such votes and they differ about a microdescriptor, we caught
- one of them lying or being buggy. We should log it to track down why.)
-
- If there are no such votes, then we leave out the "m" line for that
- relay. That means clients should avoid it for this time period. (As
- an extension it could instead mean that clients should fetch the
- descriptor and figure out its microdescriptor themselves. But let's
- not get ahead of ourselves.)
-
- It would be nice to have a more foolproof way to agree on what
- microdescriptor hash each authority should vote for, so we can avoid
- missing "m" lines. Just switching to a new consensus-method each time
- we change the set of microdescriptor-elements won't help though, since
- each authority will still have to decide what hash to vote for before
- knowing what consensus-method will be used.
-
- Here's one way we could do it. Each vote / consensus includes
- the microdescriptor-elements that were used to compute the hashes,
- and also a preferred-microdescriptor-elements set. If an authority
- has a consensus from the previous period, then it should use the
- consensus preferred-microdescriptor-elements when computing its votes
- for microdescriptor-elements and the appropriate hashes in the upcoming
- period. (If it has no previous consensus, then it just writes its
- own preferences in both lines.)
-
-3.2. Directory mirrors serve microdescriptors
-
- Directory mirrors should then read the microdescriptor-elements line
- from the consensus, and learn how to answer requests. (Directory mirrors
- continue to serve normal relay descriptors too, a) to serve old clients
- and b) to be able to construct microdescriptors on the fly.)
-
- The microdescriptors with hashes <D1>,<D2>,<D3> should be available at:
- http://<hostname>/tor/micro/d/<D1>+<D2>+<D3>.z
+ When we are generating a consensus, we use whichever m line
+ unambiguously corresponds to the descriptor digest that will be
+ included in the consensus.
+
+ (If different votes have different microdescriptor digests for a
+ single <descriptor-digest, consensus-method> pair, then at least one
+ of the authorities is broken. If this happens, the consensus should
+ contain whichever microdescriptor digest is most common. If there is
+ no winner, we break ties in the favor of the lexically earliest.
+ Either way, we should log a warning: there is definitely a bug.)
+
+ The "m" lines in a consensus contain only the digest, not a list of
+ consensus methods.
+
+3.1.3. A new flavor of consensus
+
+ Rather than inserting "m" lines in the current consensus format,
+ they should be included in a new consensus flavor (see proposal
+ 162).
+
+ This flavor can safely omit descriptor digests.
+
+ When we implement this voting method, we can remove the exit policy
+ summary from the current "ns" flavor of consensus, since no current
+ clients use them, and they take up about 5% of the compressed
+ consensus.
+
+ This new consensus flavor should be signed with the sha256 signature
+ format as documented in proposal 162.
+
+3.2. Directory mirrors fetch, cache, and serve microdescriptors
+
+ Directory mirrors should fetch, catch, and serve each microdescriptor
+ from the authorities. (They need to continue to serve normal relay
+ descriptors too, to handle old clients.)
+
+ The microdescriptors with base64 hashes <D1>,<D2>,<D3> should be
+ available at:
+ http://<hostname>/tor/micro/d/<D1>-<D2>-<D3>.z
+ (We use base64 for size and for consistency with the consensus
+ format. We use -s instead of +s to separate these items, since
+ the + character is used in base64 encoding.)
All the microdescriptors from the current consensus should also be
available at:
@@ -136,24 +139,9 @@ Status: Open
so a client that's bootstrapping doesn't need to send a 70KB URL just
to name every microdescriptor it's looking for.
- The format of a microdescriptor is the header line
- "microdescriptor-header"
- followed by each element (keyword and body), alphabetically. There's
- no need to mention what hash it's for, since it's self-identifying:
- you can hash the elements to learn this.
-
- (Do we need a footer line to show that it's over, or is the next
- microdescriptor line or EOF enough of a hint? A footer line wouldn't
- hurt much. Also, no fair voting for the microdescriptor-element
- "microdescriptor-header".)
-
+ Microdescriptors have no header or footer.
The hash of the microdescriptor is simply the hash of the concatenated
- elements -- not counting the header line or hypothetical footer line.
- Unless you prefer that?
-
- Is there a reasonable way to version these things? We could say that
- the microdescriptor-header line can contain arguments which clients
- must ignore if they don't understand them. Any better ways?
+ elements.
Directory mirrors should check to make sure that the microdescriptors
they're about to serve match the right hashes (either the hashes from
@@ -170,10 +158,14 @@ Status: Open
When a client gets a new consensus, it looks to see if there are any
microdescriptors it needs to learn. If it needs to learn more than
some threshold of the microdescriptors (half?), it requests 'all',
- else it requests only the missing ones.
+ else it requests only the missing ones. Clients MAY try to
+ determine whether the upload bandwidth for listing the
+ microdescriptors they want is more or less than the download
+ bandwidth for the microdescriptors they do not want.
Clients maintain a cache of microdescriptors along with metadata like
- when it was last referenced by a consensus. They keep a microdescriptor
+ when it was last referenced by a consensus, and which identity key
+ it corresponds to. They keep a microdescriptor
until it hasn't been mentioned in any consensus for a week. Future
clients might cache them for longer or shorter times.
@@ -190,18 +182,17 @@ Status: Open
Another future option would be to fetch some of the microdescriptors
anonymously (via a Tor circuit).
+ Another crazy option (Roger's phrasing) is to do decoy fetches as
+ well.
+
4. Transition and deployment
Phase one, the directory authorities should start voting on
- microdescriptors and microdescriptor elements, and putting them in the
- consensus. This should happen during the 0.2.1.x series, and should
- be relatively easy to do.
+ microdescriptors, and putting them in the consensus.
Phase two, directory mirrors should learn how to serve them, and learn
- how to read the consensus to find out what they should be serving. This
- phase could be done either in 0.2.1.x or early in 0.2.2.x, depending
- on how messy it turns out to be and how quickly we get around to it.
+ how to read the consensus to find out what they should be serving.
Phase three, clients should start fetching and caching them instead
- of normal descriptors. This should happen post 0.2.1.x.
+ of normal descriptors.
diff --git a/doc/spec/proposals/159-exit-scanning.txt b/doc/spec/proposals/159-exit-scanning.txt
index fbc69aa9e6..7090f2ed08 100644
--- a/doc/spec/proposals/159-exit-scanning.txt
+++ b/doc/spec/proposals/159-exit-scanning.txt
@@ -1,7 +1,5 @@
Filename: 159-exit-scanning.txt
Title: Exit Scanning
-Version: $Revision$
-Last-Modified: $Date$
Author: Mike Perry
Created: 13-Feb-2009
Status: Open
diff --git a/doc/spec/proposals/160-bandwidth-offset.txt b/doc/spec/proposals/160-bandwidth-offset.txt
new file mode 100644
index 0000000000..96935ade7d
--- /dev/null
+++ b/doc/spec/proposals/160-bandwidth-offset.txt
@@ -0,0 +1,105 @@
+Filename: 160-bandwidth-offset.txt
+Title: Authorities vote for bandwidth offsets in consensus
+Author: Roger Dingledine
+Created: 4-May-2009
+Status: Finished
+Target: 0.2.2.x
+
+1. Motivation
+
+ As part of proposal 141, we moved the bandwidth value for each relay
+ into the consensus. Now clients can know how they should load balance
+ even before they've fetched the corresponding relay descriptors.
+
+ Putting the bandwidth in the consensus also lets the directory
+ authorities choose more accurate numbers to advertise, if we come up
+ with a better algorithm for deciding weightings.
+
+ Our original plan was to teach directory authorities how to measure
+ bandwidth themselves; then every authority would vote for the bandwidth
+ it prefers, and we'd take the median of votes as usual.
+
+ The problem comes when we have 7 authorities, and only a few of them
+ have smarter bandwidth allocation algorithms. So long as the majority
+ of them are voting for the number in the relay descriptor, the minority
+ that have better numbers will be ignored.
+
+2. Options
+
+ One fix would be to demand that every authority also run the
+ new bandwidth measurement algorithms: in that case, part of the
+ responsibility of being an authority operator is that you need to run
+ this code too. But in practice we can't really require all current
+ authority operators to do that; and if we want to expand the set of
+ authority operators even further, it will become even more impractical.
+ Also, bandwidth testing adds load to the network, so we don't really
+ want to require that the number of concurrent bandwidth tests match
+ the number of authorities we have.
+
+ The better fix is to allow certain authorities to specify that they are
+ voting on bandwidth measurements: more accurate bandwidth values that
+ have actually been evaluated. In this way, authorities can vote on
+ the median measured value if sufficient measured votes exist for a router,
+ and otherwise fall back to the median value taken from the published router
+ descriptors.
+
+3. Security implications
+
+ If only some authorities choose to vote on an offset, then a majority of
+ those voting authorities can arbitrarily change the bandwidth weighting
+ for the relay. At the extreme, if there's only one offset-voting
+ authority, then that authority can dictate which relays clients will
+ find attractive.
+
+ This problem isn't entirely new: we already have the worry wrt
+ the subset of authorities that vote for BadExit.
+
+ To make it not so bad, we should deploy at least three offset-voting
+ authorities.
+
+ Also, authorities that know how to vote for offsets should vote for
+ an offset of zero for new nodes, rather than choosing not to vote on
+ any offset in those cases.
+
+4. Design
+
+ First, we need a new consensus method to support this new calculation.
+
+ Now v3 votes can have an additional value on the "w" line:
+ "w Bandwidth=X Measured=" INT.
+
+ Once we're using the new consensus method, the new way to compute the
+ Bandwidth weight is by checking if there are at least 3 "Measured"
+ votes. If so, the median of these is taken. Otherwise, the median
+ of the "Bandwidth=" values are taken, as described in Proposal 141.
+
+ Then the actual consensus looks just the same as it did before,
+ so clients never have to know that this additional calculation is
+ happening.
+
+5. Implementation
+
+ The Measured values will be read from a file provided by the scanners
+ described in proposal 161. Files with a timestamp older than 3 days
+ will be ignored.
+
+ The file will be read in from dirserv_generate_networkstatus_vote_obj()
+ in a location specified by a new config option "V3MeasuredBandwidths".
+ A helper function will be called to populate new 'measured' and
+ 'has_measured' fields of the routerstatus_t 'routerstatuses' list with
+ values read from this file.
+
+ An additional for_vote flag will be passed to
+ routerstatus_format_entry() from format_networkstatus_vote(), which will
+ indicate that the "Measured=" string should be appended to the "w Bandwith="
+ line with the measured value in the struct.
+
+ routerstatus_parse_entry_from_string() will be modified to parse the
+ "Measured=" lines into routerstatus_t struct fields.
+
+ Finally, networkstatus_compute_consensus() will set rs_out.bandwidth
+ to the median of the measured values if there are more than 3, otherwise
+ it will use the bandwidth value median as normal.
+
+
+
diff --git a/doc/spec/proposals/161-computing-bandwidth-adjustments.txt b/doc/spec/proposals/161-computing-bandwidth-adjustments.txt
new file mode 100644
index 0000000000..d219826668
--- /dev/null
+++ b/doc/spec/proposals/161-computing-bandwidth-adjustments.txt
@@ -0,0 +1,174 @@
+Title: Computing Bandwidth Adjustments
+Filename: 161-computing-bandwidth-adjustments.txt
+Author: Mike Perry
+Created: 12-May-2009
+Target: 0.2.2.x
+Status: Finished
+
+
+1. Motivation
+
+ There is high variance in the performance of the Tor network. Despite
+ our efforts to balance load evenly across the Tor nodes, some nodes are
+ significantly slower and more overloaded than others.
+
+ Proposal 160 describes how we can augment the directory authorities to
+ vote on measured bandwidths for routers. This proposal describes what
+ goes into the measuring process.
+
+
+2. Measurement Selection
+
+ The general idea is to determine a load factor representing the ratio
+ of the capacity of measured nodes to the rest of the network. This load
+ factor could be computed from three potentially relevant statistics:
+ circuit failure rates, circuit extend times, or stream capacity.
+
+ Circuit failure rates and circuit extend times appear to be
+ non-linearly proportional to node load. We've observed that the same
+ nodes when scanned at US nighttime hours (when load is presumably
+ lower) exhibit almost no circuit failure, and significantly faster
+ extend times than when scanned during the day.
+
+ Stream capacity, however, is much more uniform, even during US
+ nighttime hours. Moreover, it is a more intuitive representation of
+ node capacity, and also less dependent upon distance and latency
+ if amortized over large stream fetches.
+
+
+3. Average Stream Bandwidth Calculation
+
+ The average stream bandwidths are obtained by dividing the network into
+ slices of 50 nodes each, grouped according to advertised node bandwidth.
+
+ Two hop circuits are built using nodes from the same slice, and a large
+ file is downloaded via these circuits. The file sizes are set based
+ on node percentile rank as follows:
+
+ 0-10: 2M
+ 10-20: 1M
+ 20-30: 512k
+ 30-50: 256k
+ 50-100: 128k
+
+ These sizes are based on measurements performed during test scans.
+
+ This process is repeated until each node has been chosen to participate
+ in at least 5 circuits.
+
+
+4. Ratio Calculation
+
+ The ratios are calculated by dividing each measured value by the
+ network-wide average.
+
+
+5. Ratio Filtering
+
+ After the base ratios are calculated, a second pass is performed
+ to remove any streams with nodes of ratios less than X=0.5 from
+ the results of other nodes. In addition, all outlying streams
+ with capacity of one standard deviation below a node's average
+ are also removed.
+
+ The final ratio result will be greater of the unfiltered ratio
+ and the filtered ratio.
+
+
+6. Pseudocode for Ratio Calculation Algorithm
+
+ Here is the complete pseudocode for the ratio algorithm:
+
+ Slices = {S | S is 50 nodes of similar consensus capacity}
+ for S in Slices:
+ while exists node N in S with circ_chosen(N) < 7:
+ fetch_slice_file(build_2hop_circuit(N, (exit in S)))
+ for N in S:
+ BW_measured(N) = MEAN(b | b is bandwidth of a stream through N)
+ Bw_stddev(N) = STDDEV(b | b is bandwidth of a stream through N)
+ Bw_avg(S) = MEAN(b | b = BW_measured(N) for all N in S)
+ for N in S:
+ Normal_Streams(N) = {stream via N | bandwidth >= BW_measured(N)}
+ BW_Norm_measured(N) = MEAN(b | b is a bandwidth of Normal_Streams(N))
+
+ Bw_net_avg(Slices) = MEAN(BW_measured(N) for all N in Slices)
+ Bw_Norm_net_avg(Slices) = MEAN(BW_Norm_measured(N) for all N in Slices)
+
+ for N in all Slices:
+ Bw_net_ratio(N) = Bw_measured(N)/Bw_net_avg(Slices)
+ Bw_Norm_net_ratio(N) = BW_Norm_measured(N)/Bw_Norm_net_avg(Slices)
+
+ ResultRatio(N) = MAX(Bw_net_ratio(N), Bw_Norm_net_ratio(N))
+
+
+7. Security implications
+
+ The ratio filtering will deal with cases of sabotage by dropping
+ both very slow outliers in stream average calculations, as well
+ as dropping streams that used very slow nodes from the calculation
+ of other nodes.
+
+ This scheme will not address nodes that try to game the system by
+ providing better service to scanners. The scanners can be detected
+ at the entry by IP address, and at the exit by the destination fetch
+ IP.
+
+ Measures can be taken to obfuscate and separate the scanners' source
+ IP address from the directory authority IP address. For instance,
+ scans can happen offsite and the results can be rsynced into the
+ authorities. The destination server IP can also change.
+
+ Neither of these methods are foolproof, but such nodes can already
+ lie about their bandwidth to attract more traffic, so this solution
+ does not set us back any in that regard.
+
+
+8. Parallelization
+
+ Because each slice takes as long as 6 hours to complete, we will want
+ to parallelize as much as possible. This will be done by concurrently
+ running multiple scanners from each authority to deal with different
+ segments of the network. Each scanner piece will continually loop
+ over a portion of the network, outputting files of the form:
+
+ node_id=<idhex> SP strm_bw=<BW_measured(N)> SP
+ filt_bw=<BW_Norm_measured(N)> ns_bw=<CurrentConsensusBw(N)> NL
+
+ The most recent file from each scanner will be periodically gathered
+ by another script that uses them to produce network-wide averages
+ and calculate ratios as per the algorithm in section 6. Because nodes
+ may shift in capacity, they may appear in more than one slice and/or
+ appear more than once in the file set. The most recently measured
+ line will be chosen in this case.
+
+
+9. Integration with Proposal 160
+
+ The final results will be produced for the voting mechanism
+ described in Proposal 160 by multiplying the derived ratio by
+ the average published consensus bandwidth during the course of the
+ scan, and taking the weighted average with the previous consensus
+ bandwidth:
+
+ Bw_new = Round((Bw_current * Alpha + Bw_scan_avg*Bw_ratio)/(Alpha + 1))
+
+ The Alpha parameter is a smoothing parameter intended to prevent
+ rapid oscillation between loaded and unloaded conditions. It is
+ currently fixed at 0.333.
+
+ The Round() step consists of rounding to the 3 most significant figures
+ in base10, and then rounding that result to the nearest 1000, with
+ a minimum value of 1000.
+
+ This will produce a new bandwidth value that will be output into a
+ file consisting of lines of the form:
+
+ node_id=<idhex> SP bw=<Bw_new> NL
+
+ The first line of the file will contain a timestamp in UNIX time()
+ seconds. This will be used by the authority to decide if the
+ measured values are too old to use.
+
+ This file can be either copied or rsynced into a directory readable
+ by the directory authority.
+
diff --git a/doc/spec/proposals/162-consensus-flavors.txt b/doc/spec/proposals/162-consensus-flavors.txt
new file mode 100644
index 0000000000..e3b697afee
--- /dev/null
+++ b/doc/spec/proposals/162-consensus-flavors.txt
@@ -0,0 +1,188 @@
+Filename: 162-consensus-flavors.txt
+Title: Publish the consensus in multiple flavors
+Author: Nick Mathewson
+Created: 14-May-2009
+Target: 0.2.2
+Status: Open
+
+Overview:
+
+ This proposal describes a way to publish each consensus in
+ multiple simultaneous formats, or "flavors". This will reduce the
+ amount of time needed to deploy new consensus-like documents, and
+ reduce the size of consensus documents in the long term.
+
+Motivation:
+
+ In the future, we will almost surely want different fields and
+ data in the network-status document. Examples include:
+ - Publishing hashes of microdescriptors instead of hashes of
+ full descriptors (Proposal 158).
+ - Including different digests of descriptors, instead of the
+ perhaps-soon-to-be-totally-broken SHA1.
+
+ Note that in both cases, from the client's point of view, this
+ information _replaces_ older information. If we're using a
+ SHA256 hash, we don't need to see the SHA1. If clients only want
+ microdescriptors, they don't (necessarily) need to see hashes of
+ other things.
+
+ Our past approach to cases like this has been to shovel all of
+ the data into the consensus document. But this is rather poor
+ for bandwidth. Adding a single SHA256 hash to a consensus for
+ each router increases the compressed consensus size by 47%. In
+ comparison, replacing a single SHA1 hash with a SHA256 hash for
+ each listed router increases the consensus size by only 18%.
+
+Design in brief:
+
+ Let the voting process remain as it is, until a consensus is
+ generated. With future versions of the voting algorithm, instead
+ of just a single consensus being generated, multiple consensus
+ "flavors" are produced.
+
+ Consensuses (all of them) include a list of which flavors are
+ being generated. Caches fetch and serve all flavors of consensus
+ that are listed, regardless of whether they can parse or validate
+ them, and serve them to clients. Thus, once this design is in
+ place, we won't need to deploy more cache changes in order to get
+ new flavors of consensus to be cached.
+
+ Clients download only the consensus flavor they want.
+
+A note on hashes:
+
+ Everything in this document is specified to use SHA256, and to be
+ upgradeable to use better hashes in the future.
+
+Spec modifications:
+
+ 1. URLs and changes to the current consensus format.
+
+ Every consensus flavor has a name consisting of a sequence of one
+ or more alphanumeric characters and dashes. For compatibility
+ current descriptor flavor is called "ns".
+
+ The supported consensus flavors are defined as part of the
+ authorities' consensus method.
+
+ For each supported flavor, every authority calculates another
+ consensus document of as-yet-unspecified format, and exchanges
+ detached signatures for these documents as in the current consensus
+ design.
+
+ In addition to the consensus currently served at
+ /tor/status-vote/(current|next)/consensus.z and
+ /tor/status-vote/(current|next)/consensus/<FP1>+<FP2>+<FP3>+....z ,
+ authorities serve another consensus of each flavor "F" from the
+ locations /tor/status-vote/(current|next)/consensus-F.z. and
+ /tor/status-vote/(current|next)/consensus-F/<FP1>+....z.
+
+ When caches serve these documents, they do so from the same
+ locations.
+
+ 2. Document format: generic consensus.
+
+ The format of a flavored consensus is as-yet-unspecified, except
+ that the first line is:
+ "network-status-version" SP version SP flavor NL
+
+ where version is 3 or higher, and the flavor is a string
+ consisting of alphanumeric characters and dashes, matching the
+ corresponding flavor listed in the unflavored consensus.
+
+ 3. Document format: detached signatures.
+
+ We amend the detached signature format to include more than one
+ consensus-digest line, and more than one set of signatures.
+
+ After the consensus-digest line, we allow more lines of the form:
+ "additional-digest" SP flavor SP algname SP digest NL
+
+ Before the directory-signature lines, we allow more entries of the form:
+ "additional-signature" SP flavor SP algname SP identity SP
+ signing-key-digest NL signature.
+
+ [We do not use "consensus-digest" or "directory-signature" for flavored
+ consensuses, since this could confuse older Tors.]
+
+ The consensus-signatures URL should contain the signatures
+ for _all_ flavors of consensus.
+
+ 4. The consensus index:
+
+ Authorities additionally generate and serve a consensus-index
+ document. Its format is:
+
+ Header ValidAfter ValidUntil Documents Signatures
+
+ Header = "consensus-index" SP version NL
+ ValidAfter = as in a consensus
+ ValidUntil = as in a consensus
+ Documents = Document*
+ Document = "document" SP flavor SP SignedLength
+ 1*(SP AlgorithmName "=" Digest) NL
+ Signatures = Signature*
+ Signature = "directory-signature" SP algname SP identity
+ SP signing-key-digest NL signature
+
+ There must be one Document line for each generated consensus flavor.
+ Each Document line describes the length of the signed portion of
+ a consensus (the signatures themselves are not included), along
+ with one or more digests of that signed portion. Digests are
+ given in hex. The algorithm "sha256" MUST be included; others
+ are allowed.
+
+ The algname part of a signature describes what algorithm was
+ used to hash the identity and signing keys, and to compute the
+ signature. The algorithm "sha256" MUST be recognized;
+ signatures with unrecognized algorithms MUST be ignored.
+ (See below).
+
+ The consensus index is made available at
+ /tor/status-vote/(current|next)/consensus-index.z.
+
+ Caches should fetch this document so they can check the
+ correctness of the different consensus documents they fetch.
+ They do not need to check anything about an unrecognized
+ consensus document beyond its digest and length.
+
+ 4.1. The "sha256" signature format.
+
+ The 'SHA256' signature format for directory objects is defined as
+ the RSA signature of the OAEP+-padded SHA256 digest of the item to
+ be signed. When checking signatures, the signature MUST be treated
+ as valid if the signature material begins with SHA256(document);
+ this allows us to add other data later.
+
+Considerations:
+
+ - We should not create a new flavor of consensus when adding a
+ field instead wouldn't be too onerous.
+
+ - We should not proliferate flavors lightly: clients will be
+ distinguishable based on which flavor they download.
+
+Migration:
+
+ - Stage one: authorities begin generating and serving
+ consensus-index files.
+
+ - Stage two: Caches begin downloading consensus-index files,
+ validating them, and using them to decide what flavors of
+ consensus documents to cache. They download all listed
+ documents, and compare them to the digests given in the
+ consensus.
+
+ - Stage three: Once we want to make a significant change to the
+ consensus format, we deploy another flavor of consensus at the
+ authorities. This will immediately start getting cached by the
+ caches, and clients can start fetching the new flavor without
+ waiting a version or two for enough caches to begin supporting
+ it.
+
+Acknowledgements:
+
+ Aspects of this design and its applications to hash migration were
+ heavily influenced by IRC conversations with Marian.
+
diff --git a/doc/spec/proposals/163-detecting-clients.txt b/doc/spec/proposals/163-detecting-clients.txt
new file mode 100644
index 0000000000..d838b17063
--- /dev/null
+++ b/doc/spec/proposals/163-detecting-clients.txt
@@ -0,0 +1,115 @@
+Filename: 163-detecting-clients.txt
+Title: Detecting whether a connection comes from a client
+Author: Nick Mathewson
+Created: 22-May-2009
+Target: 0.2.2
+Status: Open
+
+
+Overview:
+
+ Some aspects of Tor's design require relays to distinguish
+ connections from clients from connections that come from relays.
+ The existing means for doing this is easy to spoof. We propose
+ a better approach.
+
+Motivation:
+
+ There are at least two reasons for which Tor servers want to tell
+ which connections come from clients and which come from other
+ servers:
+
+ 1) Some exits, proposal 152 notwithstanding, want to disallow
+ their use as single-hop proxies.
+ 2) Some performance-related proposals involve prioritizing
+ traffic from relays, or limiting traffic per client (but not
+ per relay).
+
+ Right now, we detect client vs server status based on how the
+ client opens circuits. (Check out the code that implements the
+ AllowSingleHopExits option if you want all the details.) This
+ method is depressingly easy to fake, though. This document
+ proposes better means.
+
+Goals:
+
+ To make grabbing relay privileges at least as difficult as just
+ running a relay.
+
+ In the analysis below, "using server privileges" means taking any
+ action that only servers are supposed to do, like delivering a
+ BEGIN cell to an exit node that doesn't allow single hop exits,
+ or claiming server-like amounts of bandwidth.
+
+Passive detection:
+
+ A connection is definitely a client connection if it takes one of
+ the TLS methods during setup that does not establish an identity
+ key.
+
+ A circuit is definitely a client circuit if it is initiated with
+ a CREATE_FAST cell, though the node could be a client or a server.
+
+ A node that's listed in a recent consensus is probably a server.
+
+ A node to which we have successfully extended circuits from
+ multiple origins is probably a server.
+
+Active detection:
+
+ If a node doesn't try to use server privileges at all, we never
+ need to care whether it's a server.
+
+ When a node or circuit tries to use server privileges, if it is
+ "definitely a client" as per above, we can refuse it immediately.
+
+ If it's "probably a server" as per above, we can accept it.
+
+ Otherwise, we have either a client, or a server that is neither
+ listed in any consensus or used by any other clients -- in other
+ words, a new or private server.
+
+ For these servers, we should attempt to build one or more test
+ circuits through them. If enough of the circuits succeed, the
+ node is a real relay. If not, it is probably a client.
+
+ While we are waiting for the test circuits to succeed, we should
+ allow a short grace period in which server privileges are
+ permitted. When a test is done, we should remember its outcome
+ for a while, so we don't need to do it again.
+
+Why it's hard to do good testing:
+
+ Doing a test circuit starting with an unlisted router requires
+ only that we have an open connection for it. Doing a test
+ circuit starting elsewhere _through_ an unlisted router--though
+ more reliable-- would require that we have a known address, port,
+ identity key, and onion key for the router. Only the address and
+ identity key are easily available via the current Tor protocol in
+ all cases.
+
+ We could fix this part by requiring that all servers support
+ BEGIN_DIR and support downloading at least a current descriptor
+ for themselves.
+
+Open questions:
+
+ What are the thresholds for the needed numbers of circuits
+ for us to decide that a node is a relay?
+
+ [Suggested answer: two circuits from two distinct hosts.]
+
+ How do we pick grace periods? How long do we remember the
+ outcome of a test?
+
+ [Suggested answer: 10 minute grace period; 48 hour memory of
+ test outcomes.]
+
+ If we can build circuits starting at a suspect node, but we don't
+ have enough information to try extending circuits elsewhere
+ through the node, should we conclude that the node is
+ "server-like" or not?
+
+ [Suggested answer: for now, just try making circuits through
+ the node. Extend this to extending circuits as needed.]
+
diff --git a/doc/spec/proposals/164-reporting-server-status.txt b/doc/spec/proposals/164-reporting-server-status.txt
new file mode 100644
index 0000000000..705f5f1a84
--- /dev/null
+++ b/doc/spec/proposals/164-reporting-server-status.txt
@@ -0,0 +1,91 @@
+Filename: 164-reporting-server-status.txt
+Title: Reporting the status of server votes
+Author: Nick Mathewson
+Created: 22-May-2009
+Target: 0.2.2
+Status: Open
+
+
+Overview:
+
+ When a given node isn't listed in the directory, it isn't always easy
+ to tell why. This proposal suggest a quick-and-dirty way for
+ authorities to export not only how they voted, but why, and a way to
+ collate the information.
+
+Motivation:
+
+ Right now, if you want to know the reason why your server was listed
+ a certain way in the Tor directory, the following steps are
+ recommended:
+
+ - Look through your log for reports of what the authority said
+ when you tried to upload.
+
+ - Look at the consensus; see if you're listed.
+
+ - Wait a while, see if things get better.
+
+ - Download the votes from all the authorities, and see how they
+ voted. Try to figure out why.
+
+ - If you think they'll listen to you, ask some authority
+ operators to look you up in their mtbf files and logs to see
+ why they voted as they did.
+
+ This is far too hard.
+
+Solution:
+
+ We should add a new vote-like information-only document that
+ authorities serve on request. Call it a "vote info". It is
+ generated at the same time as a vote, but used only for
+ determining why a server voted as it did. It is served from
+ /tor/status-vote-info/current/authority[.z]
+
+ It differs from a vote in that:
+
+ * Its vote-status field is 'vote-info'.
+
+ * It includes routers that the authority would not include
+ in its vote.
+
+ For these, it includes an "omitted" line with an English
+ message explaining why they were omitted.
+
+ * For each router, it includes a line describing its WFU and
+ MTBF. The format is:
+
+ "stability <mtbf> up-since='date'"
+ "uptime <wfu> down-since='date'"
+
+ * It describes the WFU and MTBF thresholds it requires to
+ vote for a given router in various roles in the header.
+ The format is:
+
+ "flag-requirement <flag-name> <field> <op> <value>"
+
+ e.g.
+
+ "flag-requirement Guard uptime > 80"
+
+ * It includes info on routers all of whose descriptors that
+ were uploaded but rejected over the past few hours. The
+ "r" lines for these are the same as for regular routers.
+ The other lines are omitted for these routers, and are
+ replaced with a single "rejected" line, explaining (in
+ English) why the router was rejected.
+
+
+ A status site (like Torweather or Torstatus or another
+ tool) can poll these files when they are generated, collate
+ the data, and make it available to server operators.
+
+Risks:
+
+ This document makes no provisions for caching these "vote
+ info" documents. If many people wind up fetching them
+ aggressively from the authorities, that would be bad.
+
+
+
diff --git a/doc/spec/proposals/165-simple-robust-voting.txt b/doc/spec/proposals/165-simple-robust-voting.txt
new file mode 100644
index 0000000000..f813285a83
--- /dev/null
+++ b/doc/spec/proposals/165-simple-robust-voting.txt
@@ -0,0 +1,133 @@
+Filename: 165-simple-robust-voting.txt
+Title: Easy migration for voting authority sets
+Author: Nick Mathewson
+Created: 2009-05-28
+Status: Open
+
+Overview:
+
+ This proposal describes any easy-to-implement, easy-to-verify way to
+ change the set of authorities without creating a "flag day" situation.
+
+Motivation:
+
+ From proposal 134 ("More robust consensus voting with diverse
+ authority sets") by Peter Palfrader:
+
+ Right now there are about five authoritative directory servers
+ in the Tor network, tho this number is expected to rise to about
+ 15 eventually.
+
+ Adding a new authority requires synchronized action from all
+ operators of directory authorities so that at any time during the
+ update at least half of all authorities are running and agree on
+ who is an authority. The latter requirement is there so that the
+ authorities can arrive at a common consensus: Each authority
+ builds the consensus based on the votes from all authorities it
+ recognizes, and so a different set of recognized authorities will
+ lead to a different consensus document.
+
+ In response to this problem, proposal 134 suggested that every
+ candidate authority list in its vote whom it believes to be an
+ authority. These A-says-B-is-an-authority relationships form a
+ directed graph. Each authority then iteratively finds the largest
+ clique in the graph and remove it, until they find one containing
+ them. They vote with this clique.
+
+ Proposal 134 had some problems:
+
+ - It had a security problem in that M hostile authorities in a
+ clique could effectively kick out M-1 honest authorities. This
+ could enable a minority of the original authorities to take over.
+
+ - It was too complex in its implications to analyze well: it took us
+ over a year to realize that it was insecure.
+
+ - It tried to solve a bigger problem: general fragmentation of
+ authority trust. Really, all we wanted to have was the ability to
+ add and remove authorities without forcing a flag day.
+
+Proposed protocol design:
+
+ A "Voting Set" is a set of authorities. Each authority has a list of
+ the voting sets it considers acceptable. These sets are chosen
+ manually by the authority operators. They must always contain the
+ authority itself. Each authority lists all of these voting sets in
+ its votes.
+
+ Authorities exchange votes with every other authority in any of their
+ voting sets.
+
+ When it is time to calculate a consensus, an authority votes with
+ whichever voting set it lists that is listed by the most members of
+ that set. In other words, given two sets S1 and S2 that an authority
+ lists, that authority will prefer to vote with S1 over S2 whenever
+ the number of other authorities in S1 that themselves list S1 is
+ higher than the number of other authorities in S2 that themselves
+ list S2.
+
+ For example, suppose authority A recognizes two sets, "A B C D" and
+ "A E F G H". Suppose that the first set is recognized by all of A,
+ B, C, and D, whereas the second set is recognized only by A, E, and
+ F. Because the first set is recognize by more of the authorities in
+ it than the other one, A will vote with the first set.
+
+ Ties are broken in favor of some arbitrary function of the identity
+ keys of the authorities in the set.
+
+How to migrate authority sets:
+
+ In steady state, each authority operator should list only the current
+ actual voting set as accepted.
+
+ When we want to add an authority, each authority operator configures
+ his or her server to list two voting sets: one containing all the old
+ authorities, and one containing the old authorities and the new
+ authority too. Once all authorities are listing the new set of
+ authorities, they will start voting with that set because of its
+ size.
+
+ What if one or two authority operators are slow to list the new set?
+ Then the other operators can stop listing the old set once there are
+ enough authorities listing the new set to make its voting successful.
+ (Note that these authorities not listing the new set will still have
+ their votes counted, since they themselves will be members of the new
+ set. They will only fail to sign the consensus generated by the
+ other authorities who are using the new set.)
+
+ When we want to remove an authority, the operators list two voting
+ sets: one containing all the authorities, and one omitting the
+ authority we want to remove. Once enough authorities list the new
+ set as acceptable, we start having authority operators stop listing
+ the old set. Once there are more listing the new set than the old
+ set, the new set will win.
+
+Data format changes:
+
+ Add a new 'voting-set' line to the vote document format. Allow it to
+ occur any number of times. Its format is:
+
+ voting-set SP 'fingerprint' SP 'fingerprint' ... NL
+
+ where each fingerprint is the hex fingerprint of an identity key of
+ an authority. Sort fingerprints in ascending order.
+
+ When the consensus method is at least 'X' (decide this when we
+ implement the proposal), add this line to the consensus format as
+ well, before the first dir-source line. [This information is not
+ redundant with the dir-source sections in the consensus: If an
+ authority is recognized but didn't vote, that authority will appear in
+ the voting-set line but not in the dir-source sections.]
+
+ We don't need to list other information about authorities in our
+ vote.
+
+Migration issues:
+
+ We should keep track somewhere of which Tor client versions
+ recognized which authorities.
+
+Acknowledgments:
+
+ The design came out of an IRC conversation with Peter Palfrader. He
+ had the basic idea first.
diff --git a/doc/spec/proposals/166-statistics-extra-info-docs.txt b/doc/spec/proposals/166-statistics-extra-info-docs.txt
new file mode 100644
index 0000000000..ab2716a71c
--- /dev/null
+++ b/doc/spec/proposals/166-statistics-extra-info-docs.txt
@@ -0,0 +1,391 @@
+Filename: 166-statistics-extra-info-docs.txt
+Title: Including Network Statistics in Extra-Info Documents
+Author: Karsten Loesing
+Created: 21-Jul-2009
+Target: 0.2.2
+Status: Accepted
+
+Change history:
+
+ 21-Jul-2009 Initial proposal for or-dev
+
+
+Overview:
+
+ The Tor network has grown to almost two thousand relays and millions
+ of casual users over the past few years. With growth has come
+ increasing performance problems and attempts by some countries to
+ block access to the Tor network. In order to address these problems,
+ we need to learn more about the Tor network. This proposal suggests to
+ measure additional statistics and include them in extra-info documents
+ to help us understand the Tor network better.
+
+
+Introduction:
+
+ As of May 2009, relays, bridges, and directories gather the following
+ data for statistical purposes:
+
+ - Relays and bridges count the number of bytes that they have pushed
+ in 15-minute intervals over the past 24 hours. Relays and bridges
+ include these data in extra-info documents that they send to the
+ directory authorities whenever they publish their server descriptor.
+
+ - Bridges further include a rough number of clients per country that
+ they have seen in the past 48 hours in their extra-info documents.
+
+ - Directories can be configured to count the number of clients they
+ see per country in the past 24 hours and to write them to a local
+ file.
+
+ Since then we extended the network statistics in Tor. These statistics
+ include:
+
+ - Directories now gather more precise statistics about connecting
+ clients. Fixes include measuring in intervals of exactly 24 hours,
+ counting unsuccessful requests, measuring download times, etc. The
+ directories append their statistics to a local file every 24 hours.
+
+ - Entry guards count the number of clients per country per day like
+ bridges do and write them to a local file every 24 hours.
+
+ - Relays measure statistics of the number of cells in their circuit
+ queues and how much time these cells spend waiting there. Relays
+ write these statistics to a local file every 24 hours.
+
+ - Exit nodes count the number of read and written bytes on exit
+ connections per port as well as the number of opened exit streams
+ per port in 24-hour intervals. Exit nodes write their statistics to
+ a local file.
+
+ The following four sections contain descriptions for adding these
+ statistics to the relays' extra-info documents.
+
+
+Directory request statistics:
+
+ The first type of statistics aims at measuring directory requests sent
+ by clients to a directory mirror or directory authority. More
+ precisely, these statistics aim at requests for v2 and v3 network
+ statuses only. These directory requests are sent non-anonymously,
+ either via HTTP-like requests to a directory's Dir port or tunneled
+ over a 1-hop circuit.
+
+ Measuring directory request statistics is useful for several reasons:
+ First, the number of locally seen directory requests can be used to
+ estimate the total number of clients in the Tor network. Second, the
+ country-wise classification of requests using a GeoIP database can
+ help counting the relative and absolute number of users per country.
+ Third, the download times can give hints on the available bandwidth
+ capacity at clients.
+
+ Directory requests do not give any hints on the contents that clients
+ send or receive over the Tor network. Every client requests network
+ statuses from the directories, so that there are no anonymity-related
+ concerns to gather these statistics. It might be, though, that clients
+ wish to hide the fact that they are connecting to the Tor network.
+ Therefore, IP addresses are resolved to country codes in memory,
+ events are accumulated over 24 hours, and numbers are rounded up to
+ multiples of 4 or 8.
+
+ "dirreq-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
+ [At most once.]
+
+ YYYY-MM-DD HH:MM:SS defines the end of the included measurement
+ interval of length NSEC seconds (86400 seconds by default).
+
+ A "dirreq-stats-end" line, as well as any other "dirreq-*" line,
+ is only added when the relay has opened its Dir port and after 24
+ hours of measuring directory requests.
+
+ "dirreq-v2-ips" CC=N,CC=N,... NL
+ [At most once.]
+ "dirreq-v3-ips" CC=N,CC=N,... NL
+ [At most once.]
+
+ List of mappings from two-letter country codes to the number of
+ unique IP addresses that have connected from that country to
+ request a v2/v3 network status, rounded up to the nearest multiple
+ of 8. Only those IP addresses are counted that the directory can
+ answer with a 200 OK status code.
+
+ "dirreq-v2-reqs" CC=N,CC=N,... NL
+ [At most once.]
+ "dirreq-v3-reqs" CC=N,CC=N,... NL
+ [At most once.]
+
+ List of mappings from two-letter country codes to the number of
+ requests for v2/v3 network statuses from that country, rounded up
+ to the nearest multiple of 8. Only those requests are counted that
+ the directory can answer with a 200 OK status code.
+
+ "dirreq-v2-share" num% NL
+ [At most once.]
+ "dirreq-v3-share" num% NL
+ [At most once.]
+
+ The share of v2/v3 network status requests that the directory
+ expects to receive from clients based on its advertised bandwidth
+ compared to the overall network bandwidth capacity. Shares are
+ formatted in percent with two decimal places. Shares are
+ calculated as means over the whole 24-hour interval.
+
+ "dirreq-v2-resp" status=num,... NL
+ [At most once.]
+ "dirreq-v3-resp" status=nul,... NL
+ [At most once.]
+
+ List of mappings from response statuses to the number of requests
+ for v2/v3 network statuses that were answered with that response
+ status, rounded up to the nearest multiple of 4. Only response
+ statuses with at least 1 response are reported. New response
+ statuses can be added at any time. The current list of response
+ statuses is as follows:
+
+ "ok": a network status request is answered; this number
+ corresponds to the sum of all requests as reported in
+ "dirreq-v2-reqs" or "dirreq-v3-reqs", respectively, before
+ rounding up.
+ "not-enough-sigs: a version 3 network status is not signed by a
+ sufficient number of requested authorities.
+ "unavailable": a requested network status object is unavailable.
+ "not-found": a requested network status is not found.
+ "not-modified": a network status has not been modified since the
+ If-Modified-Since time that is included in the request.
+ "busy": the directory is busy.
+
+ "dirreq-v2-direct-dl" key=val,... NL
+ [At most once.]
+ "dirreq-v3-direct-dl" key=val,... NL
+ [At most once.]
+ "dirreq-v2-tunneled-dl" key=val,... NL
+ [At most once.]
+ "dirreq-v3-tunneled-dl" key=val,... NL
+ [At most once.]
+
+ List of statistics about possible failures in the download process
+ of v2/v3 network statuses. Requests are either "direct"
+ HTTP-encoded requests over the relay's directory port, or
+ "tunneled" requests using a BEGIN_DIR cell over the relay's OR
+ port. The list of possible statistics can change, and statistics
+ can be left out from reporting. The current list of statistics is
+ as follows:
+
+ Successful downloads and failures:
+
+ "complete": a client has finished the download successfully.
+ "timeout": a download did not finish within 10 minutes after
+ starting to send the response.
+ "running": a download is still running at the end of the
+ measurement period for less than 10 minutes after starting to
+ send the response.
+
+ Download times:
+
+ "min", "max": smallest and largest measured bandwidth in B/s.
+ "d[1-4,6-9]": 1st to 4th and 6th to 9th decile of measured
+ bandwidth in B/s. For a given decile i, i/10 of all downloads
+ had a smaller bandwidth than di, and (10-i)/10 of all downloads
+ had a larger bandwidth than di.
+ "q[1,3]": 1st and 3rd quartile of measured bandwidth in B/s. One
+ fourth of all downloads had a smaller bandwidth than q1, one
+ fourth of all downloads had a larger bandwidth than q3, and the
+ remaining half of all downloads had a bandwidth between q1 and
+ q3.
+ "md": median of measured bandwidth in B/s. Half of the downloads
+ had a smaller bandwidth than md, the other half had a larger
+ bandwidth than md.
+
+
+Entry guard statistics:
+
+ Entry guard statistics include the number of clients per country and
+ per day that are connecting directly to an entry guard.
+
+ Entry guard statistics are important to learn more about the
+ distribution of clients to countries. In the future, this knowledge
+ can be useful to detect if there are or start to be any restrictions
+ for clients connecting from specific countries.
+
+ The information which client connects to a given entry guard is very
+ sensitive. This information must not be combined with the information
+ what contents are leaving the network at the exit nodes. Therefore,
+ entry guard statistics need to be aggregated to prevent them from
+ becoming useful for de-anonymization. Aggregation includes resolving
+ IP addresses to country codes, counting events over 24-hour intervals,
+ and rounding up numbers to the next multiple of 8.
+
+ "entry-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
+ [At most once.]
+
+ YYYY-MM-DD HH:MM:SS defines the end of the included measurement
+ interval of length NSEC seconds (86400 seconds by default).
+
+ An "entry-stats-end" line, as well as any other "entry-*"
+ line, is first added after the relay has been running for at least
+ 24 hours.
+
+ "entry-ips" CC=N,CC=N,... NL
+ [At most once.]
+
+ List of mappings from two-letter country codes to the number of
+ unique IP addresses that have connected from that country to the
+ relay and which are no known other relays, rounded up to the
+ nearest multiple of 8.
+
+
+Cell statistics:
+
+ The third type of statistics have to do with the time that cells spend
+ in circuit queues. In order to gather these statistics, the relay
+ memorizes when it puts a given cell in a circuit queue and when this
+ cell is flushed. The relay further notes the life time of the circuit.
+ These data are sufficient to determine the mean number of cells in a
+ queue over time and the mean time that cells spend in a queue.
+
+ Cell statistics are necessary to learn more about possible reasons for
+ the poor network performance of the Tor network, especially high
+ latencies. The same statistics are also useful to determine the
+ effects of design changes by comparing today's data with future data.
+
+ There are basically no privacy concerns from measuring cell
+ statistics, regardless of a node being an entry, middle, or exit node.
+
+ "cell-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
+ [At most once.]
+
+ YYYY-MM-DD HH:MM:SS defines the end of the included measurement
+ interval of length NSEC seconds (86400 seconds by default).
+
+ A "cell-stats-end" line, as well as any other "cell-*" line,
+ is first added after the relay has been running for at least 24
+ hours.
+
+ "cell-processed-cells" num,...,num NL
+ [At most once.]
+
+ Mean number of processed cells per circuit, subdivided into
+ deciles of circuits by the number of cells they have processed in
+ descending order from loudest to quietest circuits.
+
+ "cell-queued-cells" num,...,num NL
+ [At most once.]
+
+ Mean number of cells contained in queues by circuit decile. These
+ means are calculated by 1) determining the mean number of cells in
+ a single circuit between its creation and its termination and 2)
+ calculating the mean for all circuits in a given decile as
+ determined in "cell-processed-cells". Numbers have a precision of
+ two decimal places.
+
+ "cell-time-in-queue" num,...,num NL
+ [At most once.]
+
+ Mean time cells spend in circuit queues in milliseconds. Times are
+ calculated by 1) determining the mean time cells spend in the
+ queue of a single circuit and 2) calculating the mean for all
+ circuits in a given decile as determined in
+ "cell-processed-cells".
+
+ "cell-circuits-per-decile" num NL
+ [At most once.]
+
+ Mean number of circuits that are included in any of the deciles,
+ rounded up to the next integer.
+
+
+Exit statistics:
+
+ The last type of statistics affects exit nodes counting the number of
+ bytes written and read and the number of streams opened per port and
+ per 24 hours. Exit port statistics can be measured from looking at
+ headers of BEGIN and DATA cells. A BEGIN cell contains the exit port
+ that is required for the exit node to open a new exit stream.
+ Subsequent DATA cells coming from the client or being sent back to the
+ client contain a length field stating how many bytes of application
+ data are contained in the cell.
+
+ Exit port statistics are important to measure in order to identify
+ possible load-balancing problems with respect to exit policies. Exit
+ nodes that permit more ports than others are very likely overloaded
+ with traffic for those ports plus traffic for other ports. Improving
+ load balancing in the Tor network improves the overall utilization of
+ bandwidth capacity.
+
+ Exit traffic is one of the most sensitive parts of network data in the
+ Tor network. Even though these statistics do not require looking at
+ traffic contents, statistics are aggregated so that they are not
+ useful for de-anonymizing users. Only those ports are reported that
+ have seen at least 0.1% of exiting or incoming bytes, numbers of bytes
+ are rounded up to full kibibytes (KiB), and stream numbers are rounded
+ up to the next multiple of 4.
+
+ "exit-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
+ [At most once.]
+
+ YYYY-MM-DD HH:MM:SS defines the end of the included measurement
+ interval of length NSEC seconds (86400 seconds by default).
+
+ An "exit-stats-end" line, as well as any other "exit-*" line, is
+ first added after the relay has been running for at least 24 hours
+ and only if the relay permits exiting (where exiting to a single
+ port and IP address is sufficient).
+
+ "exit-kibibytes-written" port=N,port=N,... NL
+ [At most once.]
+ "exit-kibibytes-read" port=N,port=N,... NL
+ [At most once.]
+
+ List of mappings from ports to the number of kibibytes that the
+ relay has written to or read from exit connections to that port,
+ rounded up to the next full kibibyte.
+
+ "exit-streams-opened" port=N,port=N,... NL
+ [At most once.]
+
+ List of mappings from ports to the number of opened exit streams
+ to that port, rounded up to the nearest multiple of 4.
+
+
+Implementation notes:
+
+ Right now, relays that are configured accordingly write similar
+ statistics to those described in this proposal to disk every 24 hours.
+ With this proposal being implemented, relays include the contents of
+ these files in extra-info documents.
+
+ The following steps are necessary to implement this proposal:
+
+ 1. The current format of [dirreq|entry|buffer|exit]-stats files needs
+ to be adapted to the description in this proposal. This step
+ basically means renaming keywords.
+
+ 2. The timing of writing the four *-stats files should be unified, so
+ that they are written exactly 24 hours after starting the
+ relay. Right now, the measurement intervals for dirreq, entry, and
+ exit stats starts with the first observed request, and files are
+ written when observing the first request that occurs more than 24
+ hours after the beginning of the measurement interval. With this
+ proposal, the measurement intervals should all start at the same
+ time, and files should be written exactly 24 hours later.
+
+ 3. It is advantageous to cache statistics in local files in the data
+ directory until they are included in extra-info documents. The
+ reason is that the 24-hour measurement interval can be very
+ different from the 18-hour publication interval of extra-info
+ documents. When a relay crashes after finishing a measurement
+ interval, but before publishing the next extra-info document,
+ statistics would get lost. Therefore, statistics are written to
+ disk when finishing a measurement interval and read from disk when
+ generating an extra-info document. Only the statistics that were
+ appended to the *-stats files within the past 24 hours are included
+ in extra-info documents. Further, the contents of the *-stats files
+ need to be checked in the process of generating extra-info documents.
+
+ 4. With the statistics patches being tested, the ./configure options
+ should be removed and the statistics code be compiled by default.
+ It is still required for relay operators to add configuration
+ options (DirReqStatistics, ExitPortStatistics, etc.) to enable
+ gathering statistics. However, in the near future, statistics shall
+ be enabled gathered by all relays by default, where requiring a
+ ./configure option would be a barrier for many relay operators.
diff --git a/doc/spec/proposals/167-params-in-consensus.txt b/doc/spec/proposals/167-params-in-consensus.txt
new file mode 100644
index 0000000000..d23bc9c01e
--- /dev/null
+++ b/doc/spec/proposals/167-params-in-consensus.txt
@@ -0,0 +1,47 @@
+Filename: 167-params-in-consensus.txt
+Title: Vote on network parameters in consensus
+Author: Roger Dingledine
+Created: 18-Aug-2009
+Status: Closed
+Implemented-In: 0.2.2
+
+0. History
+
+
+1. Overview
+
+ Several of our new performance plans involve guessing how to tune
+ clients and relays, yet we won't be able to learn whether we guessed
+ the right tuning parameters until many people have upgraded. Instead,
+ we should have directory authorities vote on the parameters, and teach
+ Tors to read the currently recommended values out of the consensus.
+
+2. Design
+
+ V3 votes should include a new "params" line after the known-flags
+ line. It contains key=value pairs, where value is an integer.
+
+ Consensus documents that are generated with a sufficiently new consensus
+ method (7?) then include a params line that includes every key listed
+ in any vote, and the median value for that key (in case of ties,
+ we use the median closer to zero).
+
+2.1. Planned keys.
+
+ The first planned parameter is "circwindow=101", which is the initial
+ circuit packaging window that clients and relays should use. Putting
+ it in the consensus will let us perform experiments with different
+ values once enough Tors have upgraded -- see proposal 168.
+
+ Later parameters might include a weighting for how much to favor quiet
+ circuits over loud circuits in our round-robin algorithm; a weighting
+ for how much to prioritize relays over clients if we use an incentive
+ scheme like the gold-star design; and what fraction of circuits we
+ should throw out from proposal 151.
+
+2.2. What about non-integers?
+
+ I'm not sure how we would do median on non-integer values. Further,
+ I don't have any non-integer values in mind yet. So I say we cross
+ that bridge when we get to it.
+
diff --git a/doc/spec/proposals/168-reduce-circwindow.txt b/doc/spec/proposals/168-reduce-circwindow.txt
new file mode 100644
index 0000000000..c10cf41e2e
--- /dev/null
+++ b/doc/spec/proposals/168-reduce-circwindow.txt
@@ -0,0 +1,134 @@
+Filename: 168-reduce-circwindow.txt
+Title: Reduce default circuit window
+Author: Roger Dingledine
+Created: 12-Aug-2009
+Status: Open
+Target: 0.2.2
+
+0. History
+
+
+1. Overview
+
+ We should reduce the starting circuit "package window" from 1000 to
+ 101. The lower package window will mean that clients will only be able
+ to receive 101 cells (~50KB) on a circuit before they need to send a
+ 'sendme' acknowledgement cell to request 100 more.
+
+ Starting with a lower package window on exit relays should save on
+ buffer sizes (and thus memory requirements for the exit relay), and
+ should save on queue sizes (and thus latency for users).
+
+ Lowering the package window will induce an extra round-trip for every
+ additional 50298 bytes of the circuit. This extra step is clearly a
+ slow-down for large streams, but ultimately we hope that a) clients
+ fetching smaller streams will see better response, and b) slowing
+ down the large streams in this way will produce lower e2e latencies,
+ so the round-trips won't be so bad.
+
+2. Motivation
+
+ Karsten's torperf graphs show that the median download time for a 50KB
+ file over Tor in mid 2009 is 7.7 seconds, whereas the median download
+ time for 1MB and 5MB are around 50s and 150s respectively. The 7.7
+ second figure is way too high, whereas the 50s and 150s figures are
+ surprisingly low.
+
+ The median round-trip latency appears to be around 2s, with 25% of
+ the data points taking more than 5s. That's a lot of variance.
+
+ We designed Tor originally with the original goal of maximizing
+ throughput. We figured that would also optimize other network properties
+ like round-trip latency. Looks like we were wrong.
+
+3. Design
+
+ Wherever we initialize the circuit package window, initialize it to
+ 101 rather than 1000. Reducing it should be safe even when interacting
+ with old Tors: the old Tors will receive the 101 cells and send back
+ a sendme ack cell. They'll still have much higher deliver windows,
+ but the rest of their deliver window will go unused.
+
+ You can find the patch at arma/circwindow. It seems to work.
+
+3.1. Why not 100?
+
+ Tor 0.0.0 through 0.2.1.19 have a bug where they only send the sendme
+ ack cell after 101 cells rather than the intended 100 cells.
+
+ Once 0.2.1.19 is obsolete we can change it back to 100 if we like. But
+ hopefully we'll have moved to some datagram protocol long before
+ 0.2.1.19 becomes obsolete.
+
+3.2. What about stream packaging windows?
+
+ Right now the stream packaging windows start at 500. The goal was to
+ set the stream window to half the circuit window, to provide a crude
+ load balancing between streams on the same circuit. Once we lower
+ the circuit packaging window, the stream packaging window basically
+ becomes redundant.
+
+ We could leave it in -- it isn't hurting much in either case. Or we
+ could take it out -- people building other Tor clients would thank us
+ for that step. Alas, people building other Tor clients are going to
+ have to be compatible with current Tor clients, so in practice there's
+ no point taking out the stream packaging windows.
+
+3.3. What about variable circuit windows?
+
+ Once upon a time we imagined adapting the circuit package window to
+ the network conditions. That is, we would start the window small,
+ and raise it based on the latency and throughput we see.
+
+ In theory that crude imitation of TCP's windowing system would allow
+ us to adapt to fill the network better. In practice, I think we want
+ to stick with the small window and never raise it. The low cap reduces
+ the total throughput you can get from Tor for a given circuit. But
+ that's a feature, not a bug.
+
+4. Evaluation
+
+ How do we know this change is actually smart? It seems intuitive that
+ it's helpful, and some smart systems people have agreed that it's
+ a good idea (or said another way, they were shocked at how big the
+ default package window was before).
+
+ To get a more concrete sense of the benefit, though, Karsten has been
+ running torperf side-by-side on exit relays with the old package window
+ vs the new one. The results are mixed currently -- it is slightly faster
+ for fetching 40KB files, and slightly slower for fetching 50KB files.
+
+ I think it's going to be tough to get a clear conclusion that this is
+ a good design just by comparing one exit relay running the patch. The
+ trouble is that the other hops in the circuits are still getting bogged
+ down by other clients introducing too much traffic into the network.
+
+ Ultimately, we'll want to put the circwindow parameter into the
+ consensus so we can test a broader range of values once enough relays
+ have upgraded.
+
+5. Transition and deployment
+
+ We should put the circwindow in the consensus (see proposal 167),
+ with an initial value of 101. Then as more exit relays upgrade,
+ clients should seamlessly get the better behavior.
+
+ Note that upgrading the exit relay will only affect the "download"
+ package window. An old client that's uploading lots of bytes will
+ continue to use the old package window at the client side, and we
+ can't throttle that window at the exit side without breaking protocol.
+
+ The real question then is what we should backport to 0.2.1. Assuming
+ this could be a big performance win, we can't afford to wait until
+ 0.2.2.x comes out before starting to see the changes here. So we have
+ two options as I see them:
+ a) once clients in 0.2.2.x know how to read the value out of the
+ consensus, and it's been tested for a bit, backport that part to
+ 0.2.1.x.
+ b) if it's too complex to backport, just pick a number, like 101, and
+ backport that number.
+
+ Clearly choice (a) is the better one if the consensus parsing part
+ isn't very complex. Let's shoot for that, and fall back to (b) if the
+ patch turns out to be so big that we reconsider.
+
diff --git a/doc/spec/proposals/169-eliminating-renegotiation.txt b/doc/spec/proposals/169-eliminating-renegotiation.txt
new file mode 100644
index 0000000000..2c90f9c9e8
--- /dev/null
+++ b/doc/spec/proposals/169-eliminating-renegotiation.txt
@@ -0,0 +1,404 @@
+Filename: 169-eliminating-renegotiation.txt
+Title: Eliminate TLS renegotiation for the Tor connection handshake
+Author: Nick Mathewson
+Created: 27-Jan-2010
+Status: Draft
+Target: 0.2.2
+
+1. Overview
+
+ I propose a backward-compatible change to the Tor connection
+ establishment protocol to avoid the use of TLS renegotiation.
+
+ Rather than doing a TLS renegotiation to exchange certificates
+ and authenticate the original handshake, this proposal takes an
+ approach similar to Steven Murdoch's proposal 124, and uses Tor
+ cells to finish authenticating the parties' identities once the
+ initial TLS handshake is finished.
+
+ Terminological note: I use "client" below to mean the Tor
+ instance (a client or a relay) that initiates a TLS connection,
+ and "server" to mean the Tor instance (a relay) that accepts it.
+
+2. Motivation and history
+
+ In the original Tor TLS connection handshake protocol ("V1", or
+ "two-cert"), parties that wanted to authenticate provided a
+ two-cert chain of X.509 certificates during the handshake setup
+ phase. Every party that wanted to authenticate sent these
+ certificates.
+
+ In the current Tor TLS connection handshake protocol ("V2", or
+ "renegotiating"), the parties begin with a single certificate
+ sent from the server (responder) to the client (initiator), and
+ then renegotiate to a two-certs-from-each-authenticating party.
+ We made this change to make Tor's handshake look like a browser
+ speaking SSL to a webserver. (See proposal 130, and
+ tor-spec.txt.) To tell whether to use the V1 or V2 handshake,
+ servers look at the list of ciphers sent by the client. (This is
+ ugly, but there's not much else in the ClientHello that they can
+ look at.) If the list contains any cipher not used by the V1
+ protocol, the server sends back a single cert and expects a
+ renegotiation. If the client gets back a single cert, then it
+ withholds its own certificates until the TLS renegotiation phase.
+
+ In other words, initiator behavior now looks like this:
+
+ - Begin TLS negotiation with V2 cipher list; wait for
+ certificate(s).
+ - If we get a certificate chain:
+ - Then we are using the V1 handshake. Send our own
+ certificate chain as part of this initial TLS handshake
+ if we want to authenticate; otherwise, send no
+ certificates. When the handshake completes, check
+ certificates. We are now mutually authenticated.
+
+ Otherwise, if we get just a single certificate:
+ - Then we are using the V2 handshake. Do not send any
+ certificates during this handshake.
+ - When the handshake is done, immediately start a TLS
+ renegotiation. During the renegotiation, expect
+ a certificate chain from the server; send a certificate
+ chain of our own if we want to authenticate ourselves.
+ - After the renegotiation, check the certificates. Then
+ send (and expect) a VERSIONS cell from the other side to
+ establish the link protocol version.
+
+ And V2 responder behavior now looks like this:
+
+ - When we get a TLS ClientHello request, look at the cipher
+ list.
+ - If the cipher list contains only the V1 ciphersuites:
+ - Then we're doing a V1 handshake. Send a certificate
+ chain. Expect a possible client certificate chain in
+ response.
+ Otherwise, if we get other ciphersuites:
+ - We're using the V2 handshake. Send back a single
+ certificate and let the handshake complete.
+ - Do not accept any data until the client has renegotiated.
+ - When the client is renegotiating, send a certificate
+ chain, and expect (possibly multiple) certificates in
+ reply.
+ - Check the certificates when the renegotiation is done.
+ Then exchange VERSIONS cells.
+
+ Late in 2009, researchers found a flaw in most applications' use
+ of TLS renegotiation: Although TLS renegotiation does not
+ reauthenticate any information exchanged before the renegotiation
+ takes place, many applications were treating it as though it did,
+ and assuming that data sent _before_ the renegotiation was
+ authenticated with the credentials negotiated _during_ the
+ renegotiation. This problem was exacerbated by the fact that
+ most TLS libraries don't actually give you an obvious good way to
+ tell where the renegotiation occurred relative to the datastream.
+ Tor wasn't directly affected by this vulnerability, but its
+ aftermath hurts us in a few ways:
+
+ 1) OpenSSL has disabled renegotiation by default, and created
+ a "yes we know what we're doing" option we need to set to
+ turn it back on. (Two options, actually: one for openssl
+ 0.9.8l and one for 0.9.8m and later.)
+
+ 2) Some vendors have removed all renegotiation support from
+ their versions of OpenSSL entirely, forcing us to tell
+ users to either replace their versions of OpenSSL or to
+ link Tor against a hand-built one.
+
+ 3) Because of 1 and 2, I'd expect TLS renegotiation to become
+ rarer and rarer in the wild, making our own use stand out
+ more.
+
+3. Design
+
+3.1. The view in the large
+
+ Taking a cue from Steven Murdoch's proposal 124, I propose that
+ we move the work currently done by the TLS renegotiation step
+ (that is, authenticating the parties to one another) and do it
+ with Tor cells instead of with TLS.
+
+ Using _yet another_ variant response from the responder (server),
+ we allow the client to learn that it doesn't need to rehandshake
+ and can instead use a cell-based authentication system. Once the
+ TLS handshake is done, the client and server exchange VERSIONS
+ cells to determine link protocol version (including
+ handshake version). If they're using the handshake version
+ specified here, the client and server arrive at link protocol
+ version 3 (or higher), and use cells to exchange further
+ authentication information.
+
+3.2. New TLS handshake variant
+
+ We already used the list of ciphers from the clienthello to
+ indicate whether the client can speak the V2 ("renegotiating")
+ handshake or later, so we can't encode more information there.
+
+ We can, however, change the DN in the certificate passed by the
+ server back to the client. Currently, all V2 certificates are
+ generated with CN values ending with ".net". I propose that we
+ have the ".net" commonName ending reserved to indicate the V2
+ protocol, and use commonName values ending with ".com" to
+ indicate the V3 ("minimal") handshake described herein.
+
+ Now, once the initial TLS handshake is done, the client can look
+ at the server's certificate(s). If there is a certificate chain,
+ the handshake is V1. If there is a single certificate whose
+ subject commonName ends in ".net", the handshake is V2 and the
+ client should try to renegotiate as it would currently.
+ Otherwise, the client should assume that the handshake is V3+.
+ [Servers should _only_ send ".com" addesses, to allow room for
+ more signaling in the future.]
+
+3.3. Authenticating inside Tor
+
+ Once the TLS handshake is finished, if the client renegotiates,
+ then the server should go on as it does currently.
+
+ If the client implements this proposal, however, and the server
+ has shown it can understand the V3+ handshake protocol, the
+ client immediately sends a VERSIONS cell to the server
+ and waits to receive a VERSIONS cell in return. We negotiate
+ the Tor link protocol version _before_ we proceed with the
+ negotiation, in case we need to change the authentication
+ protocol in the future.
+
+ Once either party has seen the VERSIONS cell from the other, it
+ knows which version they will pick (that is, the highest version
+ shared by both parties' VERSIONS cells). All Tor instances using
+ the handshake protocol described in 3.2 MUST support at least
+ link protocol version 3 as described here.
+
+ On learning the link protocol, the server then sends the client a
+ CERT cell and a NETINFO cell. If the client wants to
+ authenticate to the server, it sends a CERT cell, an AUTHENTICATE
+ cell, and a NETINFO cell, or it may simply send a NETINFO cell if
+ it does not want to authenticate.
+
+ The CERT cell describes the keys that a Tor instance is claiming
+ to have. It is a variable-length cell. Its payload format is:
+
+ N: Number of certs in cell [1 octet]
+ N times:
+ CLEN [2 octets]
+ Certificate [CLEN octets]
+
+ Any extra octets at the end of a CERT cell MUST be ignored.
+
+ Each certificate has the form:
+
+ CertType [1 octet]
+ CertPurpose [1 octet]
+ PublicKeyLen [2 octets]
+ PublicKey [PublicKeyLen octets]
+ NotBefore [4 octets]
+ NotAfter [4 octets]
+ SignerID [HASH256_LEN octets]
+ SignatureLen [2 octets]
+ Signature [SignatureLen octets]
+
+ where CertType is 1 (meaning "RSA/SHA256")
+ CertPurpose is 1 (meaning "link certificate")
+ PublicKey is the DER encoding of the ASN.1 representation
+ of the RSA key of the subject of this certificate,
+ NotBefore is a time in HOURS since January 1, 1970, 00:00
+ UTC before which this certificate should not be
+ considered valid.
+ NotAfter is a time in HOURS since January 1, 1970, 00:00
+ UTC after which this certificate should not be
+ considered valid.
+ SignerID is the SHA-256 digest of the public key signing
+ this certificate
+ and Signature is the signature of the all other fields in
+ this certificate, using SHA256 as described in proposal
+ 158.
+
+ While authenticating, a server need send only a self-signed
+ certificate for its identity key. (Its TLS certificate already
+ contains its link key signed by its identity key.) A client that
+ wants to authenticate MUST send two certificates: one containing
+ a public link key signed by its identity key, and one self-signed
+ cert for its identity.
+
+ Tor instances MUST ignore any certificate with an unrecognized
+ CertType or CertPurpose, and MUST ignore extra bytes in the cert.
+
+ The AUTHENTICATE cell proves to the server that the client with
+ whom it completed the initial TLS handshake is the one possessing
+ the link public key in its certificate. It is a variable-length
+ cell. Its contents are:
+
+ SignatureType [2 octets]
+ SignatureLen [2 octets]
+ Signature [SignatureLen octets]
+
+ where SignatureType is 1 (meaning "RSA-SHA256") and Signature is
+ an RSA-SHA256 signature of the HMAC-SHA256, using the TLS master
+ secret key as its key, of the following elements:
+
+ - The SignatureType field (0x00 0x01)
+ - The NUL terminated ASCII string: "Tor certificate verification"
+ - client_random, as sent in the Client Hello
+ - server_random, as sent in the Server Hello
+
+ Once the above handshake is complete, the client knows (from the
+ initial TLS handshake) that it has a secure connection to an
+ entity that controls a given link public key, and knows (from the
+ CERT cell) that the link public key is a valid public key for a
+ given Tor identity.
+
+ If the client authenticates, the server learns from the CERT cell
+ that a given Tor identity has a given current public link key.
+ From the AUTHENTICATE cell, it knows that an entity with that
+ link key knows the master secret for the TLS connection, and
+ hence must be the party with whom it's talking, if TLS works.
+
+3.4. Security checks
+
+ If the TLS handshake indicates a V2 or V3+ connection, the server
+ MUST reject any connection from the client that does not begin
+ with either a renegotiation attempt or a VERSIONS cell containing
+ at least link protocol version "3". If the TLS handshake
+ indicates a V3+ connection, the client MUST reject any connection
+ where the server sends anything before the client has sent a
+ VERSIONS cell, and any connection where the VERSIONS cell does
+ not contain at least link protocol version "3".
+
+ If link protocol version 3 is chosen:
+
+ Clients and servers MUST check that all digests and signatures
+ on the certificates in CERT cells they are given are as
+ described above.
+
+ After the VERSIONS cell, clients and servers MUST close the
+ connection if anything besides a CERT or AUTH cell is sent
+ before the
+
+ CERT or AUTHENTICATE cells anywhere after the first NETINFO
+ cell must be rejected.
+
+ ... [write more here. What else?] ...
+
+3.5. Summary
+
+ We now revisit the protocol outlines from section 2 to incorporate
+ our changes. New or modified steps are marked with a *.
+
+ The new initiator behavior now looks like this:
+
+ - Begin TLS negotiation with V2 cipher list; wait for
+ certificate(s).
+ - If we get a certificate chain:
+ - Then we are using the V1 handshake. Send our own
+ certificate chain as part of this initial TLS handshake
+ if we want to authenticate; otherwise, send no
+ certificates. When the handshake completes, check
+ certificates. We are now mutually authenticated.
+ Otherwise, if we get just a single certificate:
+ - Then we are using the V2 or the V3+ handshake. Do not
+ send any certificates during this handshake.
+ * When the handshake is done, look at the server's
+ certificate's subject commonName.
+ * If it ends with ".net", we're doing a V2 handshake:
+ - Immediately start a TLS renegotiation. During the
+ renegotiation, expect a certificate chain from the
+ server; send a certificate chain of our own if we
+ want to authenticate ourselves.
+ - After the renegotiation, check the certificates. Then
+ send (and expect) a VERSIONS cell from the other side
+ to establish the link protocol version.
+ * If it ends with anything else, assume a V3 or later
+ handshake:
+ * Send a VERSIONS cell, and wait for a VERSIONS cell
+ from the server.
+ * If we are authenticating, send CERT and AUTHENTICATE
+ cells.
+ * Send a NETINFO cell. Wait for a CERT and a NETINFO
+ cell from the server.
+ * If the CERT cell contains a valid self-identity cert,
+ and the identity key in the cert can be used to check
+ the signature on the x.509 certificate we got during
+ the TLS handshake, then we know we connected to the
+ server with that identity. If any of these checks
+ fail, or the identity key was not what we expected,
+ then we close the connection.
+ * Once the NETINFO cell arrives, continue as before.
+
+ And V3+ responder behavior now looks like this:
+
+ - When we get a TLS ClientHello request, look at the cipher
+ list.
+
+ - If the cipher list contains only the V1 ciphersuites:
+ - Then we're doing a V1 handshake. Send a certificate
+ chain. Expect a possible client certificate chain in
+ response.
+ Otherwise, if we get other ciphersuites:
+ - We're using the V2 handshake. Send back a single
+ certificate whose subject commonName ends with ".com",
+ and let the handshake complete.
+ * If the client does anything besides renegotiate or send a
+ VERSIONS cell, drop the connection.
+ - If the client renegotiates immediately, it's a V2
+ connection:
+ - When the client is renegotiating, send a certificate
+ chain, and expect (possibly multiple certificates in
+ reply).
+ - Check the certificates when the renegotiation is done.
+ Then exchange VERSIONS cells.
+ * Otherwise we got a VERSIONS cell and it's a V3 handshake.
+ * Send a VERSIONS cell, a CERT cell, an AUTHENTICATE
+ cell, and a NETINFO cell.
+ * Wait for the client to send cells in reply. If the
+ client sends a CERT and an AUTHENTICATE and a NETINFO,
+ use them to authenticate the client. If the client
+ sends a NETINFO, it is unauthenticated. If it sends
+ anything else before its NETINFO, it's rejected.
+
+4. Numbers to assign
+
+ We need a version number for this link protocol. I've been
+ calling it "3".
+
+ We need to reserve command numbers for CERT and AUTH cells. I
+ suggest that in link protocol 3 and higher, we reserve command
+ numbers 128..240 for variable-length cells. (241-256 we can hold
+ for future extensions.
+
+5. Efficiency
+
+ This protocol add a round-trip step when the client sends a
+ VERSIONS cell to the server, and waits for the {VERSIONS, CERT,
+ NETINFO} response in turn. (The server then waits for the
+ client's {NETINFO} or {CERT, AUTHENTICATE, NETINFO} reply,
+ but it would have already been waiting for the client's NETINFO,
+ so that's not an additional wait.)
+
+ This is actually fewer round-trip steps than required before for
+ TLS renegotiation, so that's a win.
+
+6. Open questions:
+
+ - Should we use X.509 certificates instead of the certificate-ish
+ things we describe here? They are more standard, but more ugly.
+
+ - May we cache which certificates we've already verified? It
+ might leak in timing whether we've connected with a given server
+ before, and how recently.
+
+ - Is there a better secret than the master secret to use in the
+ AUTHENTICATE cell? Say, a portable one? Can we get at it for
+ other libraries besides OpenSSL?
+
+ - Does using the client_random and server_random data in the
+ AUTHENTICATE message actually help us? How hard is it to pull
+ them out of the OpenSSL data structure?
+
+ - Can we give some way for clients to signal "I want to use the
+ V3 protocol if possible, but I can't renegotiate, so don't give
+ me the V2"? Clients currently have a fair idea of server
+ versions, so they could potentially do the V3+ handshake with
+ servers that support it, and fall back to V1 otherwise.
+
+ - What should servers that don't have TLS renegotiation do? For
+ now, I think they should just get it. Eventually we can
+ deprecate the V2 handshake as we did with the V1 handshake.
diff --git a/doc/spec/proposals/170-user-path-config.txt b/doc/spec/proposals/170-user-path-config.txt
new file mode 100644
index 0000000000..fa74c76f73
--- /dev/null
+++ b/doc/spec/proposals/170-user-path-config.txt
@@ -0,0 +1,95 @@
+Title: Configuration options regarding circuit building
+Filename: 170-user-path-config.txt
+Author: Sebastian Hahn
+Created: 01-March-2010
+Status: Draft
+
+Overview:
+
+ This document outlines how Tor handles the user configuration
+ options to influence the circuit building process.
+
+Motivation:
+
+ Tor's treatment of the configuration *Nodes options was surprising
+ to many users, and quite a few conspiracy theories have crept up. We
+ should update our specification and code to better describe and
+ communicate what is going during circuit building, and how we're
+ honoring configuration. So far, we've been tracking a bugreport
+ about this behaviour (
+ https://bugs.torproject.org/flyspray/index.php?do=details&id=1090 )
+ and Nick replied in a thread on or-talk (
+ http://archives.seul.org/or/talk/Feb-2010/msg00117.html ).
+
+ This proposal tries to document our intention for those configuration
+ options.
+
+Design:
+
+ Five configuration options are available to users to influence Tor's
+ circuit building. EntryNodes and ExitNodes define a list of nodes
+ that are for the Entry/Exit position in all circuits. ExcludeNodes
+ is a list of nodes that are used for no circuit, and
+ ExcludeExitNodes is a list of nodes that aren't used as the last
+ hop. StrictNodes defines Tor's behaviour in case of a conflict, for
+ example when a node that is excluded is the only available
+ introduction point. Setting StrictNodes to 1 breaks Tor's
+ functionality in that case, and it will refuse to build such a
+ circuit.
+
+ Neither Nick's email nor bug 1090 have clear suggestions how we
+ should behave in each case, so I tried to come up with something
+ that made sense to me.
+
+Security implications:
+
+ Deviating from normal circuit building can break one's anonymity, so
+ the documentation of the above option should contain a warning to
+ make users aware of the pitfalls.
+
+Specification:
+
+ It is proposed that the "User configuration" part of path-spec
+ (section 2.2.2) be replaced with this:
+
+ Users can alter the default behavior for path selection with
+ configuration options. In case of conflicts (excluding and requiring
+ the same node) the "StrictNodes" option is used to determine
+ behaviour. If a nodes is both excluded and required via a
+ configuration option, the exclusion takes preference.
+
+ - If "ExitNodes" is provided, then every request requires an exit
+ node on the ExitNodes list. If a request is supported by no nodes
+ on that list, and "StrictNodes" is false, then Tor treats that
+ request as if ExitNodes were not provided.
+
+ - "EntryNodes" behaves analogously.
+
+ - If "ExcludeNodes" is provided, then no circuit uses any of the
+ nodes listed. If a circuit requires an excluded node to be used,
+ and "StrictNodes" is false, then Tor uses the node in that
+ position while not using any other of the excluded nodes.
+
+ - If "ExcludeExitNodes" is provided, then Tor will not use the nodes
+ listed for the exit position in a circuit. If a circuit requires
+ an excluded node to be used in the exit position and "StrictNodes"
+ is false, then Tor builds that circuit as if ExcludeExitNodes were
+ not provided.
+
+ - If a user tries to connect to or resolve a hostname of the form
+ <target>.<servername>.exit and the "AllowDotExit" configuration
+ option is set to 1, the request is rewritten to a request for
+ <target>, and the request is only supported by the exit whose
+ nickname or fingerprint is <servername>. If "AllowDotExit" is set
+ to 0 (default), any request for <anything>.exit is denied.
+
+ - When any of the *Nodes settings are changed, all circuits are
+ expired immediately, to prevent a situation where a previously
+ built circuit is used even though some of its nodes are now
+ excluded.
+
+
+Compatibility:
+
+ The old Strict*Nodes options are deprecated, and the StrictNodes
+ option is new. Tor users may need to update their configuration file.
diff --git a/doc/spec/proposals/ideas/xxx-bwrate-algs.txt b/doc/spec/proposals/ideas/xxx-bwrate-algs.txt
new file mode 100644
index 0000000000..757f5bc55e
--- /dev/null
+++ b/doc/spec/proposals/ideas/xxx-bwrate-algs.txt
@@ -0,0 +1,106 @@
+# The following two algorithms
+
+
+# Algorithm 1
+# TODO: Burst and Relay/Regular differentiation
+
+BwRate = Bandwidth Rate in Bytes Per Second
+GlobalWriteBucket = 0
+GlobalReadBucket = 0
+Epoch = Token Fill Rate in seconds: suggest 50ms=.050
+SecondCounter = 0
+MinWriteBytes = Minimum amount bytes per write
+
+Every Epoch Seconds:
+ UseMinWriteBytes = MinWriteBytes
+ WriteCnt = 0
+ ReadCnt = 0
+ BytesRead = 0
+
+ For Each Open OR Conn with pending write data:
+ WriteCnt++
+ For Each Open OR Conn:
+ ReadCnt++
+
+ BytesToRead = (BwRate*Epoch + GlobalReadBucket)/ReadCnt
+ BytesToWrite = (BwRate*Epoch + GlobalWriteBucket)/WriteCnt
+
+ if BwRate/WriteCnt < MinWriteBytes:
+ # If we aren't likely to accumulate enough bytes in a second to
+ # send a whole cell for our connections, send partials
+ Log(NOTICE, "Too many ORCons to write full blocks. Sending short packets.")
+ UseMinWriteBytes = 1
+ # Other option: We could switch to plan 2 here
+
+ # Service each writable ORConn. If there are any partial writes,
+ # return remaining bytes from this epoch to the global pool
+ For Each Open OR Conn with pending write data:
+ ORConn->write_bucket += BytesToWrite
+ if ORConn->write_bucket > UseMinWriteBytes:
+ w = write(ORConn, MIN(len(ORConn->write_data), ORConn->write_bucket))
+ # possible that w < ORConn->write_data here due to TCP pushback.
+ # We should restore the rest of the write_bucket to the global
+ # buffer
+ GlobalWriteBucket += (ORConn->write_bucket - w)
+ ORConn->write_bucket = 0
+
+ For Each Open OR Conn:
+ r = read_nonblock(ORConn, BytesToRead)
+ BytesRead += r
+
+ SecondCounter += Epoch
+ if SecondCounter < 1:
+ # Save unused bytes from this epoch to be used later in the second
+ GlobalReadBucket += (BwRate*Epoch - BytesRead)
+ else:
+ SecondCounter = 0
+ GlobalReadBucket = 0
+ GlobalWriteBucket = 0
+ For Each ORConn:
+ ORConn->write_bucket = 0
+
+
+
+# Alternate plan for Writing fairly. Reads would still be covered
+# by plan 1 as there is no additional network overhead for short reads,
+# so we don't need to try to avoid them.
+#
+# I think this is actually pretty similar to what we do now, but
+# with the addition that the bytes accumulate up to the second mark
+# and we try to keep track of our position in the write list here
+# (unless libevent is doing that for us already and I just don't see it)
+#
+# TODO: Burst and Relay/Regular differentiation
+
+# XXX: The inability to send single cells will cause us to block
+# on EXTEND cells for low-bandwidth node pairs..
+BwRate = Bandwidth Rate in Bytes Per Second
+WriteBytes = Bytes per write
+Epoch = MAX(MIN(WriteBytes/BwRate, .333s), .050s)
+
+SecondCounter = 0
+GlobalWriteBucket = 0
+
+# New connections are inserted at Head-1 (the 'tail' of this circular list)
+# This is not 100% fifo for all node data, but it is the best we can do
+# without insane amounts of additional queueing complexity.
+WriteConnList = List of Open OR Conns with pending write data > WriteBytes
+WriteConnHead = 0
+
+Every Epoch Seconds:
+ GlobalWriteBucket += BwRate*Epoch
+ WriteListEnd = WriteConnHead
+
+ do
+ ORCONN = WriteConnList[WriteConnHead]
+ w = write(ORConn, WriteBytes)
+ GlobalWriteBucket -= w
+ WriteConnHead += 1
+ while GlobalWriteBucket > 0 and WriteConnHead != WriteListEnd
+
+ SecondCounter += Epoch
+ if SecondCounter >= 1:
+ SecondCounter = 0
+ GlobalWriteBucket = 0
+
+
diff --git a/doc/spec/proposals/ideas/xxx-choosing-crypto-in-tor-protocol.txt b/doc/spec/proposals/ideas/xxx-choosing-crypto-in-tor-protocol.txt
new file mode 100644
index 0000000000..e8489570f7
--- /dev/null
+++ b/doc/spec/proposals/ideas/xxx-choosing-crypto-in-tor-protocol.txt
@@ -0,0 +1,138 @@
+Filename: xxx-choosing-crypto-in-tor-protocol.txt
+Title: Picking cryptographic standards in the Tor wire protocol
+Author: Marian
+Created: 2009-05-16
+Status: Draft
+
+Motivation:
+
+ SHA-1 is horribly outdated and not suited for security critical
+ purposes. SHA-2, RIPEMD-160, Whirlpool and Tigerare good options
+ for a short-term replacement, but in the long run, we will
+ probably want to upgrade to the winner or a semi-finalist of the
+ SHA-3 competition.
+
+ For a 2006 comparison of different hash algorithms, read:
+ http://www.sane.nl/sane2006/program/final-papers/R10.pdf
+
+ Other reading about SHA-1:
+ http://www.schneier.com/blog/archives/2005/02/sha1_broken.html
+ http://www.schneier.com/blog/archives/2005/08/new_cryptanalyt.html
+ http://www.schneier.com/paper-preimages.html
+
+ Additionally, AES has been theoretically broken for years. While
+ the attack is still not efficient enough that the public sector
+ has been able to prove that it works, we should probably consider
+ the time between a theoretical attack and a practical attack as an
+ opportunity to figure out how to upgrade to a better algorithm,
+ such as Twofish.
+
+ See:
+ http://schneier.com/crypto-gram-0209.html#1
+
+Design:
+
+ I suggest that nodes should publish in directories which
+ cryptographic standards, such as hash algorithms and ciphers,
+ they support. Clients communicating with nodes will then
+ pick whichever of those cryptographic standards they prefer
+ the most. In the case that the node does not publish which
+ cryptographic standards it supports, the client should assume
+ that the server supports the older standards, such as SHA-1
+ and AES, until such time as we choose to desupport those
+ standards.
+
+ Node to node communications could work similarly. However, in
+ case they both support a set of algorithms but have different
+ preferences, the disagreement would have to be resolved
+ somehow. Two possibilities include:
+ * the node requesting communications presents which
+ cryptographic standards it supports in the request. The
+ other node picks.
+ * both nodes send each other lists of what they support and
+ what version of Tor they are using. The newer node picks,
+ based on the assumption that the newer node has the most up
+ to date information about which hash algorithm is the best.
+ Of course, the node could lie about its version, but then
+ again, it could also maliciously choose only to support older
+ algorithms.
+
+ Using this method, we could potentially add server side support
+ to hash algorithms and ciphers before we instruct clients to
+ begin preferring those hash algorithms and ciphers. In this way,
+ the clients could upgrade and the servers would already support
+ the newly preferred hash algorithms and ciphers, even if the
+ servers were still using older versions of Tor, so long as the
+ older versions of Tor were at least new enough to have server
+ side support.
+
+ This would make quickly upgrading to new hash algorithms and
+ ciphers easier. This could be very useful when new attacks
+ are published.
+
+ One concern is that client preferences could expose the client
+ to segmentation attacks. To mitigate this, we suggest hardcoding
+ preferences in the client, to prevent the client from choosing
+ to use a new hash algorithm or cipher that no one else is using
+ yet. While offering a preference might be useful in case a client
+ with an older version of Tor wants to start using the newer hash
+ algorithm or cipher that everyone else is using, if the client
+ cares enough, he or she can just upgrade Tor.
+
+ We may also have to worry about nodes which, through laziness or
+ maliciousness, refuse to start supporting new hash algorithms or
+ ciphers. This must be balanced with the need to maintain
+ backward compatibility so the client will have a large selection
+ of nodes to pick from. Adding new hash algorithms and ciphers
+ long before we suggest nodes start using them can help mitigate
+ this. However, eventually, once sufficient nodes support new
+ standards, client side support for older standards should be
+ disabled, particularly if there are practical rather than merely
+ theoretical attacks.
+
+ Server side support for older standards can be kept much longer
+ than client side support, since clients using older hashes and
+ ciphers are really only hurting theirselvse.
+
+ If server side support for a hash algorithm or cipher is added
+ but never preferred before we decide we don't really want it,
+ support can be removed without having to worry about backward
+ compatibility.
+
+Security implications:
+ Improving cryptography will improve Tor's security. However, if
+ clients pick different cryptographic standards, they could be
+ partitioned based on their cryptographic preferences. We also
+ need to worry about nodes refusing to support new standards.
+ These issues are detailed above.
+
+Specification:
+
+ Todo. Need better understanding of how Tor currently works or
+ help from someone who does.
+
+Compatibility:
+
+ This idea is intended to allow easier upgrading of cryptographic
+ hash algorithms and ciphers while maintaining backwards
+ compatibility. However, at some point, backwards compatibility
+ with very old hashes and ciphers should be dropped for security
+ reasons.
+
+Implementation:
+
+ Todo.
+
+Performance and scalability nodes:
+
+ Better hashes and cipher are someimes a little more CPU intensive
+ than weaker ones. For instance, on most computers AES is a little
+ faster than Twofish. However, in that example, I consider Twofish's
+ additional security worth the tradeoff.
+
+Acknowledgements:
+
+ Discussed this on IRC with a few people, mostly Nick Mathewson.
+ Nick was particularly helpful in explaining how Tor works,
+ explaining goals, and providing various links to Tor
+ specifications.
diff --git a/doc/spec/proposals/ideas/xxx-encrypted-services.txt b/doc/spec/proposals/ideas/xxx-encrypted-services.txt
new file mode 100644
index 0000000000..3414f3c4fb
--- /dev/null
+++ b/doc/spec/proposals/ideas/xxx-encrypted-services.txt
@@ -0,0 +1,18 @@
+
+the basic idea might be to generate a keypair, and sign little statements
+like "this key corresponds to this relay id", and publish them on karsten's
+hs dht.
+
+so if you want to talk to it, you look it up, then go to that exit.
+and by 'go to' i mean 'build a tor circuit like normal except you're sure
+where to exit'
+
+connecting to it is slower than usual, but once you're connected, it's no
+slower than normal tor.
+and you get what wikileaks wants from its hidden service, which is really
+just the UI piece.
+indymedia also wants this.
+
+might be interesting to let an encrypted service list more than one relay,
+too.
+
diff --git a/doc/spec/proposals/ideas/xxx-hide-platform.txt b/doc/spec/proposals/ideas/xxx-hide-platform.txt
index 3fed5cfbd4..ad19fb1fd4 100644
--- a/doc/spec/proposals/ideas/xxx-hide-platform.txt
+++ b/doc/spec/proposals/ideas/xxx-hide-platform.txt
@@ -1,7 +1,5 @@
Filename: xxx-hide-platform.txt
Title: Hide Tor Platform Information
-Version: $Revision$
-Last-Modified: $Date$
Author: Jacob Appelbaum
Created: 24-July-2008
Status: Draft
diff --git a/doc/spec/proposals/ideas/xxx-port-knocking.txt b/doc/spec/proposals/ideas/xxx-port-knocking.txt
index 9fbcdf3545..85c27ec52d 100644
--- a/doc/spec/proposals/ideas/xxx-port-knocking.txt
+++ b/doc/spec/proposals/ideas/xxx-port-knocking.txt
@@ -1,7 +1,5 @@
Filename: xxx-port-knocking.txt
Title: Port knocking for bridge scanning resistance
-Version: $Revision$
-Last-Modified: $Date$
Author: Jacob Appelbaum
Created: 19-April-2009
Status: Draft
diff --git a/doc/spec/proposals/ideas/xxx-separate-streams-by-port.txt b/doc/spec/proposals/ideas/xxx-separate-streams-by-port.txt
index cebde65a9b..f26c1e580f 100644
--- a/doc/spec/proposals/ideas/xxx-separate-streams-by-port.txt
+++ b/doc/spec/proposals/ideas/xxx-separate-streams-by-port.txt
@@ -1,7 +1,5 @@
Filename: xxx-separate-streams-by-port.txt
Title: Separate streams across circuits by destination port
-Version: $Revision$
-Last-Modified: $Date$
Author: Robert Hogan
Created: 21-Oct-2008
Status: Draft
diff --git a/doc/spec/proposals/ideas/xxx-what-uses-sha1.txt b/doc/spec/proposals/ideas/xxx-what-uses-sha1.txt
index 9b6e20c586..b3ca3eea5a 100644
--- a/doc/spec/proposals/ideas/xxx-what-uses-sha1.txt
+++ b/doc/spec/proposals/ideas/xxx-what-uses-sha1.txt
@@ -1,8 +1,6 @@
Filename: xxx-what-uses-sha1.txt
Title: Where does Tor use SHA-1 today?
-Version: $Revision$
-Last-Modified: $Date$
-Author: Nick Mathewson
+Authors: Nick Mathewson, Marian
Created: 30-Dec-2008
Status: Meta
@@ -15,9 +13,15 @@ Introduction:
too long.
According to smart crypto people, the SHA-2 functions (SHA-256, etc)
- share too much of SHA-1's structure to be very good. Some people
- like other hash functions; most of these have not seen enough
- analysis to be widely regarded as an extra-good idea.
+ share too much of SHA-1's structure to be very good. RIPEMD-160 is
+ also based on flawed past hashes. Some people think other hash
+ functions (e.g. Whirlpool and Tiger) are not as bad; most of these
+ have not seen enough analysis to be used yet.
+
+ Here is a 2006 paper about hash algorithms.
+ http://www.sane.nl/sane2006/program/final-papers/R10.pdf
+
+ (Todo: Ask smart crypto people.)
By 2012, the NIST SHA-3 competition will be done, and with luck we'll
have something good to switch too. But it's probably a bad idea to
@@ -54,50 +58,138 @@ Why now?
one look silly.
+Triage
+
+ How severe are these problems? Let's divide them into these
+ categories, where H(x) is the SHA-1 hash of x:
+ PREIMAGE -- find any x such that a H(x) has a chosen value
+ -- A SHA-1 usage that only depends on preimage
+ resistance
+ * Also SECOND PREIMAGE. Given x, find a y not equal to
+ x such that H(x) = H(y)
+ COLLISION<role> -- A SHA-1 usage that depends on collision
+ resistance, but the only party who could mount a
+ collision-based attack is already in a trusted role
+ (like a distribution signer or a directory authority).
+ COLLISION -- find any x and y such that H(x) = H(y) -- A
+ SHA-1 usage that depends on collision resistance
+ and doesn't need the attacker to have any special keys.
+
+ There is no need to put much effort into fixing PREIMAGE and SECOND
+ PREIMAGE usages in the near-term: while there have been some
+ theoretical results doing these attacks against SHA-1, they don't
+ seem to be close to practical yet. To fix COLLISION<code-signing>
+ usages is not too important either, since anyone who has the key to
+ sign the code can mount far worse attacks. It would be good to fix
+ COLLISION<authority> usages, since we try to resist bad authorities
+ to a limited extent. The COLLISION usages are the most important
+ to fix.
+
+ Kelsey and Schneier published a theoretical second preimage attack
+ against SHA-1 in 2005, so it would be a good idea to fix PREIMAGE
+ and SECOND PREIMAGE usages after fixing COLLISION usages or where fixes
+ require minimal effort.
+
+ http://www.schneier.com/paper-preimages.html
+
+ Additionally, we need to consider the impact of a successful attack
+ in each of these cases. SHA-1 collisions are still expensive even
+ if recent results are verified, and anybody with the resources to
+ compute one also has the resources to mount a decent Sybil attack.
+
+ Let's be pessimistic, and not assume that producing collisions of
+ a given format is actually any harder than producing collisions at
+ all.
+
What Tor uses hashes for today:
1. Infrastructure.
A. Our X.509 certificates are signed with SHA-1.
+ COLLSION
B. TLS uses SHA-1 (and MD5) internally to generate keys.
+ PREIMAGE?
+ * At least breaking SHA-1 and MD5 simultaneously is
+ much more difficult than breaking either
+ independently.
C. Some of the TLS ciphersuites we allow use SHA-1.
+ PREIMAGE?
D. When we sign our code with GPG, it might be using SHA-1.
+ COLLISION<code-signing>
+ * GPG 1.4 and up have writing support for SHA-2 hashes.
+ This blog has help for converting:
+ http://www.schwer.us/journal/2005/02/19/sha-1-broken-and-gnupg-gpg/
E. Our GPG keys might be authenticated with SHA-1.
+ COLLISION<code-signing-key-signing>
F. OpenSSL's random number generator uses SHA-1, I believe.
+ PREIMAGE
2. The Tor protocol
A. Everything we sign, we sign using SHA-1-based OAEP-MGF1.
+ PREIMAGE?
B. Our CREATE cell format uses SHA-1 for: OAEP padding.
+ PREIMAGE?
C. Our EXTEND cells use SHA-1 to hash the identity key of the
target server.
+ COLLISION
D. Our CREATED cells use SHA-1 to hash the derived key data.
+ ??
E. The data we use in CREATE_FAST cells to generate a key is the
length of a SHA-1.
+ NONE
F. The data we send back in a CREATED/CREATED_FAST cell is the length
of a SHA-1.
- G. We use SHA-1 to derive our circuit keys from the negotiated g^xy value.
+ NONE
+ G. We use SHA-1 to derive our circuit keys from the negotiated g^xy
+ value.
+ NONE
H. We use SHA-1 to derive the digest field of each RELAY cell, but that's
used more as a checksum than as a strong digest.
+ NONE
3. Directory services
+ [All are COLLISION or COLLISION<authority> ]
+
A. All signatures are generated on the SHA-1 of their corresponding
documents, using PKCS1 padding.
+ * In dir-spec.txt, section 1.3, it states,
+ "SIGNATURE" Object contains a signature (using the signing key)
+ of the PKCS1-padded digest of the entire document, taken from
+ the beginning of the Initial item, through the newline after
+ the Signature Item's keyword and its arguments."
+ So our attacker, Malcom, could generate a collision for the hash
+ that is signed. Thus, a second pre-image attack is possible.
+ Vulnerable to regular collision attack only if key is stolen.
+ If the key is stolen, Malcom could distribute two different
+ copies of the document which have the same hash. Maybe useful
+ for a partitioning attack?
B. Router descriptors identify their corresponding extra-info documents
by their SHA-1 digest.
+ * A third party might use a second pre-image attack to generate a
+ false extra-info document that has the same hash. The router
+ itself might use a regular collision attack to generate multiple
+ extra-info documents with the same hash, which might be useful
+ for a partitioning attack.
C. Fingerprints in router descriptors are taken using SHA-1.
- D. Fingerprints in authority certs are taken using SHA-1.
- E. Fingerprints in dir-source lines of votes and consensuses are taken
+ * The fingerprint must match the public key. Not sure what would
+ happen if two routers had different public keys but the same
+ fingerprint. There could perhaps be unpredictable behaviour.
+ D. In router descriptors, routers in the same "Family" may be listed
+ by server nicknames or hexdigests.
+ * Does not seem critical.
+ E. Fingerprints in authority certs are taken using SHA-1.
+ F. Fingerprints in dir-source lines of votes and consensuses are taken
using SHA-1.
- F. Networkstatuses refer to routers identity keys and descriptors by their
+ G. Networkstatuses refer to routers identity keys and descriptors by their
SHA-1 digests.
- G. Directory-signature lines identify which key is doing the signing by
+ H. Directory-signature lines identify which key is doing the signing by
the SHA-1 digests of the authority's signing key and its identity key.
- H. The following items are downloaded by the SHA-1 of their contents:
+ I. The following items are downloaded by the SHA-1 of their contents:
XXXX list them
- I. The following items are downloaded by the SHA-1 of an identity key:
+ J. The following items are downloaded by the SHA-1 of an identity key:
XXXX list them too.
4. The rendezvous protocol
@@ -107,6 +199,12 @@ What Tor uses hashes for today:
establishment requests.
B. Hidden servers use SHA-1 in multiple places when generating hidden
service descriptors.
+ * The permanent-id is the first 80 bits of the SHA-1 hash of the
+ public key
+ ** time-period performs caclulations using the permanent-id
+ * The secret-id-part is the SHA-1 has of the time period, the
+ descriptor-cookie, and replica.
+ * Hash of introduction point's identity key.
C. Hidden servers performing basic-type client authorization for their
services use SHA-1 when encrypting introduction points contained in
hidden service descriptors.
@@ -115,26 +213,35 @@ What Tor uses hashes for today:
identifier or not.
E. Hidden servers use SHA-1 to derive .onion addresses of their
services.
+ * What's worse, it only uses the first 80 bits of the SHA-1 hash.
+ However, the rend-spec.txt says we aren't worried about arbitrary
+ collisons?
F. Clients use SHA-1 to generate the current hidden service descriptor
identifiers for a given .onion address.
G. Hidden servers use SHA-1 to remember digests of the first parts of
Diffie-Hellman handshakes contained in introduction requests in order
- to detect replays.
+ to detect replays. See the RELAY_ESTABLISH_INTRO cell. We seem to be
+ taking a hash of a hash here.
H. Hidden servers use SHA-1 during the Diffie-Hellman key exchange with
a connecting client.
5. The bridge protocol
XXXX write me
+
+ A. Client may attempt to query for bridges where he knows a digest
+ (probably SHA-1) before a direct query.
6. The Tor user interface
A. We log information about servers based on SHA-1 hashes of their
identity keys.
+ COLLISION
B. The controller identifies servers based on SHA-1 hashes of their
identity keys.
+ COLLISION
C. Nearly all of our configuration options that list servers allow SHA-1
hashes of their identity keys.
+ COLLISION
E. The deprecated .exit notation uses SHA-1 hashes of identity keys
-
-
+ COLLISION
diff --git a/doc/spec/proposals/reindex.py b/doc/spec/proposals/reindex.py
index 2b4c02516b..980bc0659f 100755
--- a/doc/spec/proposals/reindex.py
+++ b/doc/spec/proposals/reindex.py
@@ -4,7 +4,7 @@ import re, os
class Error(Exception): pass
STATUSES = """DRAFT NEEDS-REVISION NEEDS-RESEARCH OPEN ACCEPTED META FINISHED
- CLOSED SUPERSEDED DEAD""".split()
+ CLOSED SUPERSEDED DEAD REJECTED""".split()
REQUIRED_FIELDS = [ "Filename", "Status", "Title" ]
CONDITIONAL_FIELDS = { "OPEN" : [ "Target" ],
"ACCEPTED" : [ "Target "],
diff --git a/doc/spec/rend-spec.txt b/doc/spec/rend-spec.txt
index e3fbe2253b..f030092679 100644
--- a/doc/spec/rend-spec.txt
+++ b/doc/spec/rend-spec.txt
@@ -1,4 +1,3 @@
-$Id$
Tor Rendezvous Specification
@@ -145,33 +144,10 @@ $Id$
1.2. Bob's OP generates service descriptors.
The first time the OP provides an advertised service, it generates
- a public/private keypair (stored locally). Periodically, the OP
- generates and publishes a descriptor of type "V0".
+ a public/private keypair (stored locally).
- The "V0" descriptor contains:
-
- KL Key length [2 octets]
- PK Bob's public key [KL octets]
- TS A timestamp [4 octets]
- NI Number of introduction points [2 octets]
- Ipt A list of NUL-terminated ORs [variable]
- SIG Signature of above fields [variable]
-
- KL is the length of PK, in octets.
- TS is the number of seconds elapsed since Jan 1, 1970.
-
- The members of Ipt may be either (a) nicknames, or (b) identity key
- digests, encoded in hex, and prefixed with a '$'. Clients must
- accept both forms. Services must only generate the second form.
- Once 0.0.9.x is obsoleted, we can drop the first form.
-
- [It's ok for Bob to advertise 0 introduction points. He might want
- to do that if he previously advertised some introduction points,
- and now he doesn't have any. -RD]
-
- Beginning with 0.2.0.10-alpha, Bob's OP encodes "V2" descriptors in
- addition to "V0" descriptors. The format of a "V2" descriptor is as
- follows:
+ Beginning with 0.2.0.10-alpha, Bob's OP encodes "V2" descriptors. The
+ format of a "V2" descriptor is as follows:
"rendezvous-service-descriptor" descriptor-id NL
@@ -340,6 +316,10 @@ $Id$
(This ends the fields in the encrypted portion of the descriptor.)
+ [It's ok for Bob to advertise 0 introduction points. He might want
+ to do that if he previously advertised some introduction points,
+ and now he doesn't have any. -RD]
+
"signature" NL signature-string
[At end, exactly once]
@@ -349,6 +329,21 @@ $Id$
1.2.1. Other descriptor formats we don't use.
+ Support for the V0 descriptor format was dropped in 0.2.2.0-alpha-dev:
+
+ KL Key length [2 octets]
+ PK Bob's public key [KL octets]
+ TS A timestamp [4 octets]
+ NI Number of introduction points [2 octets]
+ Ipt A list of NUL-terminated ORs [variable]
+ SIG Signature of above fields [variable]
+
+ KL is the length of PK, in octets.
+ TS is the number of seconds elapsed since Jan 1, 1970.
+
+ The members of Ipt may be either (a) nicknames, or (b) identity key
+ digests, encoded in hex, and prefixed with a '$'.
+
The V1 descriptor format was understood and accepted from
0.1.1.5-alpha-cvs to 0.2.0.6-alpha-dev, but no Tors generated it and
it was removed:
@@ -409,7 +404,7 @@ $Id$
RELAY_ESTABLISH_INTRO cell, containing:
KL Key length [2 octets]
- PK Bob's public key [KL octets]
+ PK Introduction public key [KL octets]
HS Hash of session info [20 octets]
SIG Signature of above information [variable]
@@ -431,16 +426,13 @@ $Id$
currently associated with PK. On success, the OR sends Bob a
RELAY_INTRO_ESTABLISHED cell with an empty payload.
- If a hidden service is configured to publish only v2 hidden service
- descriptors, Bob's OP does not include its own public key in the
- RELAY_ESTABLISH_INTRO cell, but the public key of a freshly generated
- key pair. The OP also includes these fresh public keys in the v2 hidden
- service descriptor together with the other introduction point
- information. The reason is that the introduction point does not need to
- and therefore should not know for which hidden service it works, so as
- to prevent it from tracking the hidden service's activity. If the hidden
- service is configured to publish both, v0 and v2 descriptors, two
- separate sets of introduction points are established.
+ Bob's OP does not include its own public key in the RELAY_ESTABLISH_INTRO
+ cell, but the public key of a freshly generated introduction key pair.
+ The OP also includes these fresh public keys in the v2 hidden service
+ descriptor together with the other introduction point information. The
+ reason is that the introduction point does not need to and therefore
+ should not know for which hidden service it works, so as to prevent it
+ from tracking the hidden service's activity.
1.4. Bob's OP advertises his service descriptor(s).
@@ -464,10 +456,8 @@ $Id$
after its timestamp. At least every 18 hours, Bob's OP uploads a
fresh descriptor.
- If Bob's OP is configured to publish v2 descriptors instead of or in
- addition to v0 descriptors, it does so to a changing subset of all v2
- hidden service directories instead of the authoritative directory
- servers. Therefore, Bob's OP opens a stream via Tor to each
+ Bob's OP publishes v2 descriptors to a changing subset of all v2 hidden
+ service directories. Therefore, Bob's OP opens a stream via Tor to each
responsible hidden service directory. (He may re-use old circuits
for this.) Over this stream, Bob's OP makes an HTTP 'POST' request to a
URL "/tor/rendezvous2/publish" relative to the hidden service
@@ -520,12 +510,21 @@ $Id$
1.6. Alice's OP retrieves a service descriptor.
- Alice opens a stream to a directory server via Tor, and makes an HTTP GET
- request for the document '/tor/rendezvous/<z>', where '<z>' is replaced
- with the encoding of Bob's public key as described above. (She may re-use
- old circuits for this.) The directory replies with a 404 HTTP response if
- it does not recognize <z>, and otherwise returns Bob's most recently
- uploaded service descriptor.
+ Similarly to the description in section 1.4, Alice's OP fetches a v2
+ descriptor from a randomly chosen hidden service directory out of the
+ changing subset of 6 nodes. If the request is unsuccessful, Alice retries
+ the other remaining responsible hidden service directories in a random
+ order. Alice relies on Bob to care about a potential clock skew between
+ the two by possibly storing two sets of descriptors (see end of section
+ 1.4).
+
+ Alice's OP opens a stream via Tor to the chosen v2 hidden service
+ directory. (She may re-use old circuits for this.) Over this stream,
+ Alice's OP makes an HTTP 'GET' request for the document
+ "/tor/rendezvous2/<z>", where z is replaced with the encoding of the
+ descriptor ID. The directory replies with a 404 HTTP response if it does
+ not recognize <z>, and otherwise returns Bob's most recently uploaded
+ service descriptor.
If Alice's OP receives a 404 response, it tries the other directory
servers, and only fails the lookup if none recognize the public key hash.
@@ -541,22 +540,6 @@ $Id$
[Caching may make her partitionable, but she fetched it anonymously,
and we can't very well *not* cache it. -RD]
- Alice's OP fetches v2 descriptors in parallel to v0 descriptors. Similarly
- to the description in section 1.4, the OP fetches a v2 descriptor from a
- randomly chosen hidden service directory out of the changing subset of
- 6 nodes. If the request is unsuccessful, Alice retries the other
- remaining responsible hidden service directories in a random order.
- Alice relies on Bob to care about a potential clock skew between the two
- by possibly storing two sets of descriptors (see end of section 1.4).
-
- Alice's OP opens a stream via Tor to the chosen v2 hidden service
- directory. (She may re-use old circuits for this.) Over this stream,
- Alice's OP makes an HTTP 'GET' request for the document
- "/tor/rendezvous2/<z>", where z is replaced with the encoding of the
- descriptor ID. The directory replies with a 404 HTTP response if it does
- not recognize <z>, and otherwise returns Bob's most recently uploaded
- service descriptor.
-
1.7. Alice's OP establishes a rendezvous point.
When Alice requests a connection to a given location-hidden service,
diff --git a/doc/spec/socks-extensions.txt b/doc/spec/socks-extensions.txt
index 8d58987f35..62d86acd9f 100644
--- a/doc/spec/socks-extensions.txt
+++ b/doc/spec/socks-extensions.txt
@@ -1,4 +1,3 @@
-$Id$
Tor's extensions to the SOCKS protocol
1. Overview
diff --git a/doc/spec/tor-spec.txt b/doc/spec/tor-spec.txt
index a321aa8694..e94ac7faaa 100644
--- a/doc/spec/tor-spec.txt
+++ b/doc/spec/tor-spec.txt
@@ -1,4 +1,3 @@
-$Id$
Tor Protocol Specification
@@ -21,7 +20,7 @@ see tor-design.pdf.
PK -- a public key.
SK -- a private key.
- K -- a key for a symmetric cypher.
+ K -- a key for a symmetric cipher.
a|b -- concatenation of 'a' and 'b'.
@@ -172,8 +171,8 @@ see tor-design.pdf.
In "renegotiation", the connection initiator sends no certificates, and
the responder sends a single connection certificate. Once the TLS
handshake is complete, the initiator renegotiates the handshake, with each
- parties sending a two-certificate chain as in "certificates up-front".
- The initiator's ClientHello MUST include at least once ciphersuite not in
+ party sending a two-certificate chain as in "certificates up-front".
+ The initiator's ClientHello MUST include at least one ciphersuite not in
the list above. The responder SHOULD NOT select any ciphersuite besides
those in the list above.
[The above "should not" is because some of the ciphers that
@@ -201,9 +200,9 @@ see tor-design.pdf.
to decide which to use.
In all of the above handshake variants, certificates sent in the clear
- SHOULD NOT include any strings to identify the host as a Tor server. In
- the "renegotation" and "backwards-compatible renegotiation", the
- initiator SHOULD chose a list of ciphersuites and TLS extensions chosen
+ SHOULD NOT include any strings to identify the host as a Tor server. In
+ the "renegotiation" and "backwards-compatible renegotiation" steps, the
+ initiator SHOULD choose a list of ciphersuites and TLS extensions
to mimic one used by a popular web browser.
Responders MUST NOT select any TLS ciphersuite that lacks ephemeral keys,
@@ -289,7 +288,7 @@ see tor-design.pdf.
6 -- CREATED_FAST (Circuit created, no PK) (See Sec 5.1)
7 -- VERSIONS (Negotiate proto version) (See Sec 4)
8 -- NETINFO (Time and address info) (See Sec 4)
- 9 -- RELAY_EARLY (End-to-end data; limited) (See sec 5.6)
+ 9 -- RELAY_EARLY (End-to-end data; limited)(See Sec 5.6)
The interpretation of 'Payload' depends on the type of the cell.
PADDING: Payload is unused.
@@ -357,7 +356,7 @@ see tor-design.pdf.
The address format is a type/length/value sequence as given in section
6.4 below. The timestamp is a big-endian unsigned integer number of
- seconds since the unix epoch.
+ seconds since the Unix epoch.
Implementations MAY use the timestamp value to help decide if their
clocks are skewed. Initiators MAY use "other OR's address" to help
@@ -399,7 +398,7 @@ see tor-design.pdf.
Onion skin [DH_LEN+KEY_LEN+PK_PAD_LEN bytes]
Identity fingerprint [HASH_LEN bytes]
- The port and address field denote the IPV4 address and port of the next
+ The port and address field denote the IPv4 address and port of the next
onion router in the circuit; the public key hash is the hash of the PKCS#1
ASN1 encoding of the next onion router's identity (signing) key. (See 0.3
above.) Including this hash allows the extending OR verify that it is
@@ -886,7 +885,7 @@ see tor-design.pdf.
6.4. Remote hostname lookup
To find the address associated with a hostname, the OP sends a
- RELAY_RESOLVE cell containing the hostname to be resolved with a nul
+ RELAY_RESOLVE cell containing the hostname to be resolved with a NUL
terminating byte. (For a reverse lookup, the OP sends a RELAY_RESOLVE
cell containing an in-addr.arpa address.) The OR replies with a
RELAY_RESOLVED cell containing a status byte, and any number of
diff --git a/doc/spec/version-spec.txt b/doc/spec/version-spec.txt
index 842271ae19..265717f409 100644
--- a/doc/spec/version-spec.txt
+++ b/doc/spec/version-spec.txt
@@ -1,4 +1,3 @@
-$Id$
HOW TOR VERSION NUMBERS WORK