Diffstat (limited to 'doc/spec')
-rw-r--r-- doc/spec/Makefile.am | 2
-rw-r--r-- doc/spec/address-spec.txt | 20
-rw-r--r-- doc/spec/bridges-spec.txt | 1
-rw-r--r-- doc/spec/control-spec-v0.txt | 1
-rw-r--r-- doc/spec/control-spec.txt | 283
-rw-r--r-- doc/spec/dir-spec-v1.txt | 1
-rw-r--r-- doc/spec/dir-spec-v2.txt | 1
-rw-r--r-- doc/spec/dir-spec.txt | 565
-rw-r--r-- doc/spec/path-spec.txt | 252
-rw-r--r-- doc/spec/proposals/000-index.txt | 39
-rw-r--r-- doc/spec/proposals/001-process.txt | 13
-rw-r--r-- doc/spec/proposals/098-todo.txt | 2
-rw-r--r-- doc/spec/proposals/099-misc.txt | 2
-rw-r--r-- doc/spec/proposals/100-tor-spec-udp.txt | 2
-rw-r--r-- doc/spec/proposals/101-dir-voting.txt | 2
-rw-r--r-- doc/spec/proposals/102-drop-opt.txt | 2
-rw-r--r-- doc/spec/proposals/103-multilevel-keys.txt | 2
-rw-r--r-- doc/spec/proposals/104-short-descriptors.txt | 2
-rw-r--r-- doc/spec/proposals/105-handshake-revision.txt | 2
-rw-r--r-- doc/spec/proposals/106-less-tls-constraint.txt | 2
-rw-r--r-- doc/spec/proposals/107-uptime-sanity-checking.txt | 2
-rw-r--r-- doc/spec/proposals/108-mtbf-based-stability.txt | 2
-rw-r--r-- doc/spec/proposals/109-no-sharing-ips.txt | 2
-rw-r--r-- doc/spec/proposals/110-avoid-infinite-circuits.txt | 2
-rw-r--r-- doc/spec/proposals/111-local-traffic-priority.txt | 2
-rw-r--r-- doc/spec/proposals/112-bring-back-pathlencoinweight.txt | 2
-rw-r--r-- doc/spec/proposals/113-fast-authority-interface.txt | 2
-rw-r--r-- doc/spec/proposals/114-distributed-storage.txt | 2
-rw-r--r-- doc/spec/proposals/115-two-hop-paths.txt | 2
-rw-r--r-- doc/spec/proposals/116-two-hop-paths-from-guard.txt | 2
-rw-r--r-- doc/spec/proposals/117-ipv6-exits.txt | 2
-rw-r--r-- doc/spec/proposals/118-multiple-orports.txt | 2
-rw-r--r-- doc/spec/proposals/119-controlport-auth.txt | 2
-rw-r--r-- doc/spec/proposals/120-shutdown-descriptors.txt | 2
-rw-r--r-- doc/spec/proposals/121-hidden-service-authentication.txt | 2
-rw-r--r-- doc/spec/proposals/122-unnamed-flag.txt | 2
-rw-r--r-- doc/spec/proposals/123-autonaming.txt | 2
-rw-r--r-- doc/spec/proposals/124-tls-certificates.txt | 2
-rw-r--r-- doc/spec/proposals/125-bridges.txt | 2
-rw-r--r-- doc/spec/proposals/126-geoip-reporting.txt | 2
-rw-r--r-- doc/spec/proposals/127-dirport-mirrors-downloads.txt | 2
-rw-r--r-- doc/spec/proposals/128-bridge-families.txt | 2
-rw-r--r-- doc/spec/proposals/129-reject-plaintext-ports.txt | 2
-rw-r--r-- doc/spec/proposals/130-v2-conn-protocol.txt | 2
-rw-r--r-- doc/spec/proposals/131-verify-tor-usage.txt | 2
-rw-r--r-- doc/spec/proposals/132-browser-check-tor-service.txt | 2
-rw-r--r-- doc/spec/proposals/134-robust-voting.txt | 22
-rw-r--r-- doc/spec/proposals/135-private-tor-networks.txt | 2
-rw-r--r-- doc/spec/proposals/137-bootstrap-phases.txt | 2
-rw-r--r-- doc/spec/proposals/138-remove-down-routers-from-consensus.txt | 2
-rw-r--r-- doc/spec/proposals/140-consensus-diffs.txt | 11
-rw-r--r-- doc/spec/proposals/141-jit-sd-downloads.txt | 8
-rw-r--r-- doc/spec/proposals/142-combine-intro-and-rend-points.txt | 2
-rw-r--r-- doc/spec/proposals/143-distributed-storage-improvements.txt | 2
-rw-r--r-- doc/spec/proposals/145-newguard-flag.txt | 2
-rw-r--r-- doc/spec/proposals/146-long-term-stability.txt | 2
-rw-r--r-- doc/spec/proposals/147-prevoting-opinions.txt | 2
-rw-r--r-- doc/spec/proposals/148-uniform-client-end-reason.txt | 2
-rw-r--r-- doc/spec/proposals/149-using-netinfo-data.txt | 6
-rw-r--r-- doc/spec/proposals/150-exclude-exit-nodes.txt | 1
-rw-r--r-- doc/spec/proposals/151-path-selection-improvements.txt | 161
-rw-r--r-- doc/spec/proposals/152-single-hop-circuits.txt | 2
-rw-r--r-- doc/spec/proposals/153-automatic-software-update-protocol.txt | 2
-rw-r--r-- doc/spec/proposals/154-automatic-updates.txt | 2
-rw-r--r-- doc/spec/proposals/155-four-hidden-service-improvements.txt | 2
-rw-r--r-- doc/spec/proposals/156-tracking-blocked-ports.txt | 2
-rw-r--r-- doc/spec/proposals/157-specific-cert-download.txt | 2
-rw-r--r-- doc/spec/proposals/158-microdescriptors.txt | 209
-rw-r--r-- doc/spec/proposals/159-exit-scanning.txt | 2
-rw-r--r-- doc/spec/proposals/160-bandwidth-offset.txt | 105
-rw-r--r-- doc/spec/proposals/161-computing-bandwidth-adjustments.txt | 174
-rw-r--r-- doc/spec/proposals/162-consensus-flavors.txt | 188
-rw-r--r-- doc/spec/proposals/163-detecting-clients.txt | 115
-rw-r--r-- doc/spec/proposals/164-reporting-server-status.txt | 91
-rw-r--r-- doc/spec/proposals/165-simple-robust-voting.txt | 133
-rw-r--r-- doc/spec/proposals/166-statistics-extra-info-docs.txt | 391
-rw-r--r-- doc/spec/proposals/167-params-in-consensus.txt | 47
-rw-r--r-- doc/spec/proposals/168-reduce-circwindow.txt | 134
-rw-r--r-- doc/spec/proposals/169-eliminating-renegotiation.txt | 404
-rw-r--r-- doc/spec/proposals/170-user-path-config.txt | 95
-rw-r--r-- doc/spec/proposals/172-circ-getinfo-option.txt | 138
-rw-r--r-- doc/spec/proposals/173-getinfo-option-expansion.txt | 101
-rw-r--r-- doc/spec/proposals/174-optimistic-data-server.txt | 242
-rw-r--r-- doc/spec/proposals/ideas/xxx-bwrate-algs.txt | 106
-rw-r--r-- doc/spec/proposals/ideas/xxx-choosing-crypto-in-tor-protocol.txt | 138
-rw-r--r-- doc/spec/proposals/ideas/xxx-encrypted-services.txt | 18
-rw-r--r-- doc/spec/proposals/ideas/xxx-hide-platform.txt | 2
-rw-r--r-- doc/spec/proposals/ideas/xxx-port-knocking.txt | 2
-rw-r--r-- doc/spec/proposals/ideas/xxx-separate-streams-by-port.txt | 2
-rw-r--r-- doc/spec/proposals/ideas/xxx-using-spdy.txt | 143
-rw-r--r-- doc/spec/proposals/ideas/xxx-what-uses-sha1.txt | 139
-rwxr-xr-x doc/spec/proposals/reindex.py | 2
-rw-r--r-- doc/spec/rend-spec.txt | 612
-rw-r--r-- doc/spec/socks-extensions.txt | 1
-rw-r--r-- doc/spec/tor-spec.txt | 39
-rw-r--r-- doc/spec/version-spec.txt | 1
96 files changed, 4546 insertions, 715 deletions
diff --git a/doc/spec/Makefile.am b/doc/spec/Makefile.am
index 208901d9db..e2fef42e81 100644
--- a/doc/spec/Makefile.am
+++ b/doc/spec/Makefile.am
@@ -1,5 +1,5 @@
EXTRA_DIST = tor-spec.txt rend-spec.txt control-spec.txt \
dir-spec.txt socks-extensions.txt path-spec.txt \
- version-spec.txt address-spec.txt
+ version-spec.txt address-spec.txt bridges-spec.txt
diff --git a/doc/spec/address-spec.txt b/doc/spec/address-spec.txt
index 2a84d857e6..ce6d2b65e7 100644
--- a/doc/spec/address-spec.txt
+++ b/doc/spec/address-spec.txt
@@ -1,4 +1,3 @@
-$Id$
Special Hostnames in Tor
Nick Mathewson
@@ -13,7 +12,7 @@ $Id$
These hostnames can be passed to Tor as the address part of a SOCKS4a or
SOCKS5 request. If the application is connected to Tor using an IP-only
- method (such as SOCKS4, TransPort, or NatdPort), these hostnames can be
+ method (such as SOCKS4, TransPort, or NATDPort), these hostnames can be
substituted for certain IP addresses using the MapAddress configuration
option or the MAPADDRESS control command.
@@ -34,10 +33,13 @@ $Id$
"www.google.com.foo.exit=64.233.161.99.foo.exit" to speed subsequent
lookups.
+ The .exit notation is disabled by default as of Tor 0.2.2.1-alpha, due
+ to potential application-level attacks.
+
EXAMPLES:
www.example.com.exampletornode.exit
- Connect to www.example.com from the node called "exampletornode."
+ Connect to www.example.com from the node called "exampletornode".
exampletornode.exit
@@ -54,15 +56,3 @@ $Id$
When Tor sees an address in this format, it tries to look up and connect to
the specified hidden service. See rend-spec.txt for full details.
-4. .noconnect
-
- SYNTAX: [string].noconnect
-
- When Tor sees an address in this format, it immediately closes the
- connection without attaching it to any circuit. This is useful for
- controllers that want to test whether a given application is indeed using
- the same instance of Tor that they're controlling.
-
-5. [XXX Is there a ".virtual" address that we expose too, or is that
-just intended to be internal? -RD]
-
diff --git a/doc/spec/bridges-spec.txt b/doc/spec/bridges-spec.txt
index 4a9b373c8e..647118815c 100644
--- a/doc/spec/bridges-spec.txt
+++ b/doc/spec/bridges-spec.txt
@@ -1,4 +1,3 @@
-$Id$
Tor bridges specification
diff --git a/doc/spec/control-spec-v0.txt b/doc/spec/control-spec-v0.txt
index faf75a64a4..3515d395a6 100644
--- a/doc/spec/control-spec-v0.txt
+++ b/doc/spec/control-spec-v0.txt
@@ -1,4 +1,3 @@
-$Id$
TC: A Tor control protocol (Version 0)
diff --git a/doc/spec/control-spec.txt b/doc/spec/control-spec.txt
index cf92e2b9e3..255adf00a4 100644
--- a/doc/spec/control-spec.txt
+++ b/doc/spec/control-spec.txt
@@ -1,4 +1,3 @@
-$Id$
TC: A Tor control protocol (Version 1)
@@ -16,6 +15,11 @@ $Id$
versions 0.1.0.x; the protocol in this document only works with Tor
versions in the 0.1.1.x series and later.)
+ The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
+ NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
+ "OPTIONAL" in this document are to be interpreted as described in
+ RFC 2119.
+
1. Protocol outline
TC is a bidirectional message-based protocol. It assumes an underlying
@@ -88,32 +92,39 @@ $Id$
2.4. General-use tokens
- ; Identifiers for servers.
- ServerID = Nickname / Fingerprint
-
- Nickname = 1*19 NicknameChar
- NicknameChar = "a"-"z" / "A"-"Z" / "0" - "9"
- Fingerprint = "$" 40*HEXDIG
-
- ; A "=" indicates that the given nickname is canonical; a "~" indicates
- ; that the given nickname is not canonical. If no nickname is given at
- ; all, Tor does not even have a guess for what this router calls itself.
- LongName = Fingerprint [ ( "=" / "~" ) Nickname ]
+ ; CRLF means, "the ASCII Carriage Return character (decimal value 13)
+ ; followed by the ASCII Linefeed character (decimal value 10)."
+ CRLF = CR LF
; How a controller tells Tor about a particular OR. There are four
; possible formats:
- ; $Digest -- The router whose identity key hashes to the given digest.
+ ; $Fingerprint -- The router whose identity key hashes to the fingerprint.
; This is the preferred way to refer to an OR.
- ; $Digest~Name -- The router whose identity key hashes to the given
- ; digest, but only if the router has the given nickname.
- ; $Digest=Name -- The router whose identity key hashes to the given
- ; digest, but only if the router is Named and has the given
+ ; $Fingerprint~Nickname -- The router whose identity key hashes to the
+ ; given fingerprint, but only if the router has the given nickname.
+ ; $Fingerprint=Nickname -- The router whose identity key hashes to the
+ ; given fingerprint, but only if the router is Named and has the given
; nickname.
- ; Name -- The Named router with the given nickname, or, if no such
+ ; Nickname -- The Named router with the given nickname, or, if no such
; router exists, any router whose nickname matches the one given.
; This is not a safe way to refer to routers, since Named status
; could under some circumstances change over time.
+ ;
+ ; The tokens that implement the above follow:
+
ServerSpec = LongName / Nickname
+ LongName = Fingerprint [ ( "=" / "~" ) Nickname ]
+
+ Fingerprint = "$" 40*HEXDIG
+ NicknameChar = "a"-"z" / "A"-"Z" / "0" - "9"
+ Nickname = 1*19 NicknameChar
+
+ ; What follows is an outdated way to refer to ORs.
+ ; Feature VERBOSE_NAMES replaces ServerID with LongName in events and
+ ; GETINFO results. VERBOSE_NAMES can be enabled starting in Tor version
+ ; 0.1.2.2-alpha and it is always-on in 0.2.2.1-alpha and later.
+ ServerID = Nickname / Fingerprint
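
The ServerSpec, LongName, Fingerprint and Nickname tokens above can be
recognized with two regular expressions. The following Python sketch is
only an illustration: the function name parse_server_spec and the shape
of its return value are assumptions made for this example, not part of
Tor or of any controller library, and it reads the Fingerprint rule as
exactly 40 hex digits.

```python
import re

# "$" plus 40 hex digits, optionally followed by "=" or "~" and a nickname.
_LONG_NAME = re.compile(r"\$([0-9A-Fa-f]{40})(?:([=~])([A-Za-z0-9]{1,19}))?")
# A bare nickname: 1 to 19 alphanumeric characters.
_NICKNAME = re.compile(r"[A-Za-z0-9]{1,19}")


def parse_server_spec(spec):
    """Return (fingerprint, marker, nickname); absent parts are None.

    marker is "=" (nickname is canonical), "~" (not canonical), or None.
    """
    match = _LONG_NAME.fullmatch(spec)
    if match:
        return match.group(1).upper(), match.group(2), match.group(3)
    if _NICKNAME.fullmatch(spec):
        return None, None, spec
    raise ValueError("not a valid ServerSpec: %r" % (spec,))
```

A controller would normally prefer the fingerprint forms, since the
comments above note that a bare Nickname is not a safe way to refer to
a router.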
+
; Unique identifiers for streams or circuits. Currently, Tor only
; uses digits, but this may change
@@ -220,7 +231,7 @@ $Id$
"INFO" / "NOTICE" / "WARN" / "ERR" / "NEWDESC" / "ADDRMAP" /
"AUTHDIR_NEWDESCS" / "DESCCHANGED" / "STATUS_GENERAL" /
"STATUS_CLIENT" / "STATUS_SERVER" / "GUARD" / "NS" / "STREAM_BW" /
- "CLIENTS_SEEN"
+ "CLIENTS_SEEN" / "NEWCONSENSUS" / "BUILDTIMEOUT_SET"
Any events *not* listed in the SETEVENTS line are turned off; thus, sending
SETEVENTS with an empty body turns off all event reporting.
@@ -271,6 +282,9 @@ $Id$
returns "250 OK" if successful, or "551 Unable to write configuration
to disk" if it can't write the file or some other error occurs.
+ See also the "getinfo config-text" command, if the controller wants
+ to write the torrc file itself.
+
3.7. SIGNAL
Sent from the client to the server. The syntax is:
@@ -379,6 +393,10 @@ $Id$
"config-file" -- The location of Tor's configuration file ("torrc").
+ "config-text" -- The contents that Tor would write if you send it
+ a SAVECONF command, so the controller can write the file to
+ disk itself. [First implemented in 0.2.2.7-alpha.]
+
["exit-policy/prepend" -- The default exit policy lines that Tor will
*prepend* to the ExitPolicy config option.
-- Never implemented. Useful?]
@@ -462,25 +480,36 @@ $Id$
StreamID SP StreamStatus SP CircID SP Target CRLF
"orconn-status"
- A series of lines as for an OR connection status event. Each is of the
- form:
+ A series of lines as for an OR connection status event. In Tor
+ 0.1.2.2-alpha with feature VERBOSE_NAMES enabled and in Tor
+ 0.2.2.1-alpha and later by default, each line is of the form:
+ LongName SP ORStatus CRLF
+
+ In Tor versions 0.1.2.2-alpha through 0.2.2.1-alpha with feature
+ VERBOSE_NAMES turned off and before version 0.1.2.2-alpha, each line
+ is of the form:
ServerID SP ORStatus CRLF
"entry-guards"
A series of lines listing the currently chosen entry guards, if any.
- Each is of the form:
- ServerID2 SP Status [SP ISOTime] CRLF
-
- Status-with-time = ("unlisted") SP ISOTime
- Status = ("up" / "never-connected" / "down" /
- "unusable" / "unlisted" )
+ In Tor 0.1.2.2-alpha with feature VERBOSE_NAMES enabled and in Tor
+ 0.2.2.1-alpha and later by default, each line is of the form:
+ LongName SP Status [SP ISOTime] CRLF
+ In Tor versions 0.1.2.2-alpha through 0.2.2.1-alpha with feature
+ VERBOSE_NAMES turned off and before version 0.1.2.2-alpha, each line
+ is of the form:
+ ServerID2 SP Status [SP ISOTime] CRLF
ServerID2 = Nickname / 40*HEXDIG
- [From 0.1.1.4-alpha to 0.1.1.10-alpha, this was called "helper-nodes".
- Tor still supports calling it that for now, but support will be
- removed in 0.1.3.x.]
+ The definition of Status is the same for both:
+ Status = "up" / "never-connected" / "down" /
+ "unusable" / "unlisted"
+ [From 0.1.1.4-alpha to 0.1.1.10-alpha, entry-guards was called
+ "helper-nodes". Tor still accepts the name "helper-nodes", but it
+ is deprecated and should not be used.]
+
[Older versions of Tor (before 0.1.2.x-final) generated 'down' instead
of unlisted/unusable. Current Tors never generate 'down'.]
@@ -503,7 +532,7 @@ $Id$
start and the rest of the interval respectively. The 'interval-start'
and 'interval-end' fields are the borders of the current interval; the
'interval-wake' field is the time within the current interval (if any)
- where we plan[ned] to start being active.
+ where we plan[ned] to start being active. The times are GMT.
"config/names"
A series of lines listing the available configuration options. Each is
@@ -564,14 +593,14 @@ $Id$
states. See Section 4.1.10 for explanations. (Only a few of the
status events are available as getinfo's currently. Let us know if
you want more exposed.)
- "status/reachability/or"
+ "status/reachability-succeeded/or"
0 or 1, depending on whether we've found our ORPort reachable.
- "status/reachability/dir"
+ "status/reachability-succeeded/dir"
0 or 1, depending on whether we've found our DirPort reachable.
- "status/reachability"
+ "status/reachability-succeeded"
"OR=" ("0"/"1") SP "DIR=" ("0"/"1")
- Combines status/reachability/*; controllers MUST ignore unrecognized
- elements in this entry.
+ Combines status/reachability-succeeded/*; controllers MUST ignore
+ unrecognized elements in this entry.
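
The requirement that controllers ignore unrecognized elements in the
combined reachability reply can be met with a tolerant parser. This
Python sketch assumes the reply body has already been extracted from
the GETINFO response; the function name parse_reachability is
hypothetical.

```python
def parse_reachability(reply):
    """Parse a body such as "OR=1 DIR=0" into {"OR": True, "DIR": False}."""
    result = {}
    for element in reply.split():
        key, sep, value = element.partition("=")
        # Elements other than OR= and DIR= are skipped rather than
        # rejected, so that future additions do not break the parser.
        if sep and key in ("OR", "DIR") and value in ("0", "1"):
            result[key] = (value == "1")
    return result
```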
"status/bootstrap-phase"
Returns the most recent bootstrap phase status event
sent. Specifically, it returns a string starting with either
@@ -582,7 +611,7 @@ $Id$
List of currently recommended versions.
"status/version/current"
Status of the current version. One of: new, old, unrecommended,
- recommended, new in series, obsolete.
+ recommended, new in series, obsolete, unknown.
"status/clients-seen"
A summary of which countries we've seen clients from recently,
formatted the same as the CLIENTS_SEEN status event described in
@@ -600,15 +629,20 @@ $Id$
3.10. EXTENDCIRCUIT
Sent from the client to the server. The format is:
- "EXTENDCIRCUIT" SP CircuitID SP
- ServerSpec *("," ServerSpec)
- [SP "purpose=" Purpose] CRLF
+ "EXTENDCIRCUIT" SP CircuitID
+ [SP ServerSpec *("," ServerSpec)
+ SP "purpose=" Purpose] CRLF
This request takes one of two forms: either the CircuitID is zero, in
- which case it is a request for the server to build a new circuit according
- to the specified path, or the CircuitID is nonzero, in which case it is a
- request for the server to extend an existing circuit with that ID according
- to the specified path.
+ which case it is a request for the server to build a new circuit,
+ or the CircuitID is nonzero, in which case it is a request for the
+ server to extend an existing circuit with that ID according to the
+ specified path.
+
+ If the CircuitID is 0, the controller has the option of providing
+ a path for Tor to use to build the circuit. If it does not provide
+ a path, Tor will select one automatically from high capacity nodes
+ according to path-spec.txt.
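
A controller might assemble the EXTENDCIRCUIT line as in the Python
sketch below. The helper name build_extendcircuit is hypothetical. The
sketch treats the path and the "purpose=" argument as independently
optional when the CircuitID is 0, which is an assumption based on the
prose above rather than on the bracketing of the syntax line, and it
refuses to extend an existing circuit without a path.

```python
def build_extendcircuit(circuit_id, path=(), purpose=None):
    """Return an EXTENDCIRCUIT command line terminated by CRLF.

    path is a sequence of ServerSpec strings; purpose is e.g. "general".
    """
    circuit_id = str(circuit_id)
    if circuit_id != "0" and not path:
        raise ValueError("extending an existing circuit requires a path")
    parts = ["EXTENDCIRCUIT", circuit_id]
    if path:
        parts.append(",".join(path))
    if purpose is not None:
        parts.append("purpose=" + purpose)
    return " ".join(parts) + "\r\n"
```

Calling build_extendcircuit(0) yields a request that leaves path
selection to Tor.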
If CircuitID is 0 and "purpose=" is specified, then the circuit's
purpose is set. Two choices are recognized: "general" and
@@ -750,46 +784,47 @@ $Id$
3.19. USEFEATURE
+ Adding additional features to the control protocol sometimes will break
+ backwards compatibility. Initially such features are added into Tor and
+ disabled by default. USEFEATURE can enable these additional features.
+
The syntax is:
"USEFEATURE" *(SP FeatureName) CRLF
FeatureName = 1*(ALPHA / DIGIT / "_" / "-")
- Sometimes extensions to the controller protocol break compatibility with
- older controllers. In this case, whenever possible, the extensions are
- first included in Tor disabled by default, and only enabled on a given
- controller connection when the "USEFEATURE" command is given. Once a
- "USEFEATURE" command is given, it applies to all subsequent interactions on
- the same connection; to disable an enabled feature, a new controller
- connection must be opened.
+ Feature names are case-insensitive.
- This is a forward-compatibility mechanism; each feature will eventually
- become a regular part of the control protocol in some future version of Tor.
- Tor will ignore a request to use any feature that is already on by default.
- Tor will give a "552" error if any requested feature is not recognized.
+ Once enabled, a feature stays enabled for the duration of the connection
+ to the controller. A new connection to the controller must be opened to
+ disable an enabled feature.
- Feature names are case-insensitive.
+ Features are a forward-compatibility mechanism; each feature will eventually
+ become a standard part of the control protocol. Once a feature becomes part
+ of the protocol, it is always-on. Each feature documents the version in
+ which it was introduced as a feature and the version in which it became
+ part of the protocol.
+
+ Tor will ignore a request to use any feature that is always-on. Tor will give
+ a 552 error in response to an unrecognized feature.
EXTENDED_EVENTS
Same as passing 'EXTENDED' to SETEVENTS; this is the preferred way to
request the extended event syntax.
- This will not be always-enabled until at least two stable releases
- after 0.1.2.3-alpha, the release where it was first used for
- anything.
+ This feature was first introduced in 0.1.2.3-alpha. It is always-on
+ and part of the protocol in Tor 0.2.2.1-alpha and later.
VERBOSE_NAMES
- Instead of ServerID as specified above, the controller should
- identify ORs by LongName in events and GETINFO results. This format is
- strictly more informative: rather than including Nickname for
- known Named routers and Fingerprint for unknown or unNamed routers, the
- LongName format includes a Fingerprint, an indication of Named status,
- and a Nickname (if one is known).
+ Replaces ServerID with LongName in events and GETINFO results. LongName
+ provides a Fingerprint for all routers, an indication of Named status,
+ and a Nickname if one is known. LongName is strictly more informative
+ than ServerID, which only provides either a Fingerprint or a Nickname.
- This will not be always-enabled until at least two stable releases
- after 0.1.2.2-alpha, the release where it was first available.
+ This feature was first introduced in 0.1.2.2-alpha. It is always-on and
+ part of the protocol in Tor 0.2.2.1-alpha and later.
3.20. RESOLVE
@@ -980,12 +1015,17 @@ $Id$
"FAILED" / ; circuit closed (was not built)
"CLOSED" ; circuit closed (was built)
- Path = ServerID *("," ServerID)
+ Path = LongName *("," LongName)
+ ; In Tor versions 0.1.2.2-alpha through 0.2.2.1-alpha with feature
+ ; VERBOSE_NAMES turned off and before version 0.1.2.2-alpha, Path
+ ; is as follows:
+ Path = ServerID *("," ServerID)
Reason = "NONE" / "TORPROTOCOL" / "INTERNAL" / "REQUESTED" /
"HIBERNATING" / "RESOURCELIMIT" / "CONNECTFAILED" /
"OR_IDENTITY" / "OR_CONN_CLOSED" / "TIMEOUT" /
- "FINISHED" / "DESTROYED" / "NOPATH" / "NOSUCHSERVICE"
+ "FINISHED" / "DESTROYED" / "NOPATH" / "NOSUCHSERVICE" /
+ "MEASUREMENT_EXPIRED"
The path is provided only when the circuit has been extended at least one
hop.
@@ -1029,7 +1069,7 @@ $Id$
Reason = "MISC" / "RESOLVEFAILED" / "CONNECTREFUSED" /
"EXITPOLICY" / "DESTROY" / "DONE" / "TIMEOUT" /
- "HIBERNATING" / "INTERNAL"/ "RESOURCELIMIT" /
+ "NOROUTE" / "HIBERNATING" / "INTERNAL"/ "RESOURCELIMIT" /
"CONNRESET" / "TORPROTOCOL" / "NOTDIRECTORY" / "END"
The "REASON" field is provided only for FAILED, CLOSED, and DETACHED
@@ -1068,19 +1108,26 @@ $Id$
4.1.3. OR Connection status changed
The syntax is:
- "650" SP "ORCONN" SP (ServerID / Target) SP ORStatus [ SP "REASON="
+
+ "650" SP "ORCONN" SP (LongName / Target) SP ORStatus [ SP "REASON="
Reason ] [ SP "NCIRCS=" NumCircuits ] CRLF
ORStatus = "NEW" / "LAUNCHED" / "CONNECTED" / "FAILED" / "CLOSED"
+ ; In Tor versions 0.1.2.2-alpha through 0.2.2.1-alpha with feature
+ ; VERBOSE_NAMES turned off and before version 0.1.2.2-alpha, OR
+ ; Connection is as follows:
+ "650" SP "ORCONN" SP (ServerID / Target) SP ORStatus [ SP "REASON="
+ Reason ] [ SP "NCIRCS=" NumCircuits ] CRLF
+
NEW is for incoming connections, and LAUNCHED is for outgoing
connections. CONNECTED means the TLS handshake has finished (in
either direction). FAILED means a connection is being closed that
hasn't finished its handshake, and CLOSED is for connections that
have handshaked.
- A ServerID is specified unless it's a NEW connection, in which
- case we don't know what server it is yet, so we use Address:Port.
+ A LongName or ServerID is specified unless it's a NEW connection, in
+ which case we don't know what server it is yet, so we use Address:Port.
If extended events are enabled (see 3.19), optional reason and
circuit counting information is provided for CLOSED and FAILED
@@ -1117,7 +1164,11 @@ $Id$
4.1.6. New descriptors available
Syntax:
- "650" SP "NEWDESC" 1*(SP ServerID) CRLF
+ "650" SP "NEWDESC" 1*(SP LongName) CRLF
+ ; In Tor versions 0.1.2.2-alpha through 0.2.2.1-alpha with feature
+ ; VERBOSE_NAMES turned off and before version 0.1.2.2-alpha, it
+ ; is as follows:
+ "650" SP "NEWDESC" 1*(SP ServerID) CRLF
4.1.7. New Address mapping
@@ -1497,6 +1548,23 @@ $Id$
should just look at ACCEPTED_SERVER_DESCRIPTOR and should ignore
this event for now.}
+ SERVER_DESCRIPTOR_STATUS
+ "STATUS=" "LISTED" / "UNLISTED"
+ We just got a new networkstatus consensus, and whether we're in
+ it or not in it has changed. Specifically, status is "listed"
+ if we're listed in it but previous to this point we didn't know
+ we were listed in a consensus; and status is "unlisted" if we
+ thought we should have been listed in it (e.g. we were listed in
+ the last one), but we're not.
+
+ {Moving from listed to unlisted is not necessarily cause for
+ alarm. The relay might have failed a few reachability tests,
+ or the Internet might have had some routing problems. So this
+ feature is mainly to let relay operators know when their relay
+ has successfully been listed in the consensus.}
+
+ [Not implemented yet. We should do this in 0.2.2.x. -RD]
+
NAMESERVER_STATUS
"NS=addr"
"STATUS=" "UP" / "DOWN"
@@ -1581,17 +1649,21 @@ $Id$
4.1.13. Bandwidth used on an application stream
The syntax is:
- "650" SP "STREAM_BW" SP StreamID SP BytesRead SP BytesWritten CRLF
- BytesRead = 1*DIGIT
+ "650" SP "STREAM_BW" SP StreamID SP BytesWritten SP BytesRead CRLF
BytesWritten = 1*DIGIT
+ BytesRead = 1*DIGIT
- BytesRead and BytesWritten are the number of bytes read and written since
- the last STREAM_BW event on this stream. These events are generated about
- once per second per stream; no events are generated for streams that have
- not read or written.
+ BytesWritten and BytesRead are the number of bytes written and read
+ by the application since the last STREAM_BW event on this stream.
- These events apply only to streams entering Tor (such as on a SOCKSPort,
- TransPort, or so on). They are not generated for exiting streams.
+ Note that from Tor's perspective, *reading* a byte on a stream means
+ that the application *wrote* the byte. That's why the order of "written"
+ vs "read" is opposite for STREAM_BW events compared to BW events.
+
+ These events are generated about once per second per stream; no events
+ are generated for streams that have not written or read. These events
+ apply only to streams entering Tor (such as on a SOCKSPort, TransPort,
+ or so on). They are not generated for exiting streams.
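
Because the STREAM_BW field order differs from that of BW events, a
parser should name the fields explicitly. The Python sketch below
follows the syntax given above; the function name parse_stream_bw is
hypothetical, and tolerating extra trailing fields is an assumption
made for forward compatibility.

```python
def parse_stream_bw(line):
    """Return (stream_id, bytes_written, bytes_read) from a STREAM_BW event."""
    fields = line.rstrip("\r\n").split(" ")
    if len(fields) < 5 or fields[:2] != ["650", "STREAM_BW"]:
        raise ValueError("not a STREAM_BW event: %r" % (line,))
    # BytesWritten comes before BytesRead in this event.
    return fields[2], int(fields[3]), int(fields[4])
```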
4.1.14. Per-country client stats
@@ -1610,11 +1682,11 @@ $Id$
TimeStarted is a quoted string indicating when the reported summary
counts from (in GMT).
- The CountrySummary keyword has as its argument a comma-separated
- set of "countrycode=count" pairs. For example,
- 650-CLIENTS_SEEN TimeStarted="Thu Dec 25 23:50:43 EST 2008"
- 650 CountrySummary=us=16,de=8,uk=8
-[XXX Matt Edman informs me that the time format above is wrong. -RD]
+ The CountrySummary keyword has as its argument a comma-separated,
+ possibly empty set of "countrycode=count" pairs. For example (without
+ linebreak),
+ 650-CLIENTS_SEEN TimeStarted="2008-12-25 23:50:43"
+ CountrySummary=us=16,de=8,uk=8
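
The CountrySummary argument can be turned into a mapping as in this
Python sketch. It assumes the argument has already been split off from
the event line; the function name parse_country_summary is
hypothetical.

```python
def parse_country_summary(argument):
    """Parse "us=16,de=8" into {"us": 16, "de": 8}; "" gives {}."""
    summary = {}
    for pair in argument.split(","):
        if not pair:
            continue  # the set of pairs may be empty
        country, _, count = pair.partition("=")
        summary[country] = int(count)
    return summary
```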
4.1.15. New consensus networkstatus has arrived.
@@ -1629,6 +1701,43 @@ $Id$
[First added in 0.2.1.13-alpha]
+4.1.16. New circuit buildtime has been set.
+
+ The syntax is:
+ "650" SP "BUILDTIMEOUT_SET" SP Type SP "TOTAL_TIMES=" Total SP
+ "TIMEOUT_MS=" Timeout SP "XM=" Xm SP "ALPHA=" Alpha SP
+ "CUTOFF_QUANTILE=" Quantile SP "TIMEOUT_RATE=" TimeoutRate SP
+ "CLOSE_MS=" CloseTimeout SP "CLOSE_RATE=" CloseRate
+ CRLF
+ Type = "COMPUTED" / "RESET" / "SUSPENDED" / "DISCARD" / "RESUME"
+ Total = Integer count of timeouts stored
+ Timeout = Integer timeout in milliseconds
+ Xm = Estimated integer Pareto parameter Xm in milliseconds
+ Alpha = Estimated floating point Pareto parameter alpha
+ Quantile = Floating point CDF quantile cutoff point for this timeout
+ TimeoutRate = Floating point ratio of circuits that timeout
+ CloseTimeout = How long to keep measurement circs in milliseconds
+ CloseRate = Floating point ratio of measurement circuits that are closed
+
+ A new circuit build timeout time has been set. If Type is "COMPUTED",
+ Tor has computed the value based on historical data. If Type is "RESET",
+ initialization or drastic network changes have caused Tor to reset
+ the timeout back to the default and relearn it. If Type is
+ "SUSPENDED", Tor has detected a loss of network connectivity and has
+ temporarily changed the timeout value to the default until the network
+ recovers. If Type is "DISCARD", Tor has decided to discard timeout
+ values that likely happened while the network was down. If Type is
+ "RESUME", Tor has decided to resume timeout calculation.
+
+ The Total value is the count of circuit build times Tor used in
+ computing this value. It is capped internally at the maximum number
+ of build times Tor stores (NCIRCUITS_TO_OBSERVE).
+
+ The Timeout itself is provided in milliseconds. Internally, Tor rounds
+ this value to the nearest second before using it.
+
+ [First added in 0.2.2.7-alpha]
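
A BUILDTIMEOUT_SET event can be decoded into typed values as in the
Python sketch below. The function name parse_buildtimeout_set is
hypothetical, and keeping unrecognized keywords as plain strings is an
assumption of this sketch, not a rule stated above.

```python
_INT_KEYS = {"TOTAL_TIMES", "TIMEOUT_MS", "XM", "CLOSE_MS"}
_FLOAT_KEYS = {"ALPHA", "CUTOFF_QUANTILE", "TIMEOUT_RATE", "CLOSE_RATE"}


def parse_buildtimeout_set(line):
    """Return a dict with "Type" plus one entry per KEY=value field."""
    fields = line.strip().split(" ")
    if len(fields) < 3 or fields[:2] != ["650", "BUILDTIMEOUT_SET"]:
        raise ValueError("not a BUILDTIMEOUT_SET event: %r" % (line,))
    event = {"Type": fields[2]}
    for field in fields[3:]:
        key, sep, value = field.partition("=")
        if not sep:
            continue  # skip anything that is not KEY=value
        if key in _INT_KEYS:
            event[key] = int(value)
        elif key in _FLOAT_KEYS:
            event[key] = float(value)
        else:
            event[key] = value  # unknown keyword, kept as text
    return event
```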
+
5. Implementation notes
5.1. Authentication
diff --git a/doc/spec/dir-spec-v1.txt b/doc/spec/dir-spec-v1.txt
index 286df664e2..a92fc7999a 100644
--- a/doc/spec/dir-spec-v1.txt
+++ b/doc/spec/dir-spec-v1.txt
@@ -1,4 +1,3 @@
-$Id$
Tor Protocol Specification
diff --git a/doc/spec/dir-spec-v2.txt b/doc/spec/dir-spec-v2.txt
index 4873c4a728..d1be27f3db 100644
--- a/doc/spec/dir-spec-v2.txt
+++ b/doc/spec/dir-spec-v2.txt
@@ -1,4 +1,3 @@
-$Id$
Tor directory protocol, version 2
diff --git a/doc/spec/dir-spec.txt b/doc/spec/dir-spec.txt
index 9a2a62bc46..6e35deb00e 100644
--- a/doc/spec/dir-spec.txt
+++ b/doc/spec/dir-spec.txt
@@ -1,4 +1,3 @@
-$Id$
Tor directory protocol, version 3
@@ -11,7 +10,7 @@ $Id$
Caches and authorities must still support older versions of the
directory protocols, until the versions of Tor that require them are
- finally out of commission. See Section XXXX on backward compatibility.
+ finally out of commission.
This document merges and supersedes the following proposals:
@@ -19,13 +18,15 @@ $Id$
103 Splitting identity key from regularly used signing key
104 Long and Short Router Descriptors
- AS OF 14 JUNE 2007, THIS SPECIFICATION HAS NOT YET BEEN COMPLETELY
- IMPLEMENTED, OR COMPLETELY COMPLETED.
-
XXX when to download certificates.
XXX timeline
XXX fill in XXXXs
+ The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
+ NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
+ "OPTIONAL" in this document are to be interpreted as described in
+ RFC 2119.
+
0.1. History
The earliest versions of Onion Routing shipped with a list of known
@@ -183,7 +184,8 @@ $Id$
All directory information is uploaded and downloaded with HTTP.
[Authorities also generate and caches also cache documents produced and
- used by earlier versions of this protocol; see section XXX for notes.]
+ used by earlier versions of this protocol; see dir-spec-v1.txt and
+ dir-spec-v2.txt for notes on those versions.]
1.1. What's different from version 2?
@@ -592,9 +594,9 @@ $Id$
with unrecognized items; the protocols line should be preceded with
an "opt" until these Tors are obsolete.]
- "allow-single-hop-exits"
+ "allow-single-hop-exits" NL
- [At most one.]
+ [At most once.]
Present only if the router allows single-hop circuits to make exit
connections. Most Tor servers do not support this: this is
@@ -613,7 +615,7 @@ $Id$
Fingerprint is encoded in hex (using upper-case letters), with
no spaces.
- "published"
+ "published" YYYY-MM-DD HH:MM:SS NL
[Exactly once.]
@@ -628,8 +630,8 @@ $Id$
As documented in 2.1 above. See migration notes in section 2.2.1.
- "geoip-start" YYYY-MM-DD HH:MM:SS NL
- "geoip-client-origins" CC=N,CC=N,... NL
+ ("geoip-start" YYYY-MM-DD HH:MM:SS NL)
+ ("geoip-client-origins" CC=N,CC=N,... NL)
Only generated by bridge routers (see blocking.pdf), and only
when they have been configured with a geoip database.
@@ -642,6 +644,238 @@ $Id$
"geoip-start" is the time at which we began collecting geoip
statistics.
+ "geoip-start" and "geoip-client-origins" have been replaced by
+ "bridge-stats-end" and "bridge-ips" in 0.2.2.4-alpha. The
+ reason is that the measurement interval with "geoip-stats" as
+ determined by subtracting "geoip-start" from "published" could
+ have had a variable length, whereas the measurement interval in
+ 0.2.2.4-alpha and later is set to be exactly 24 hours long. In
+ order to clearly distinguish the new measurement intervals from
+ the old ones, the new keywords have been introduced.
+
+ "bridge-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
+ [At most once.]
+
+ YYYY-MM-DD HH:MM:SS defines the end of the included measurement
+ interval of length NSEC seconds (86400 seconds by default).
+
+ A "bridge-stats-end" line, as well as any other "bridge-*" line,
+ is only added when the relay has been running as a bridge for at
+ least 24 hours.
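
The "*-stats-end" lines give the end of an interval and its length, so
the start of the interval follows by subtraction. The Python sketch
below handles the "bridge-stats-end" and "dirreq-stats-end" keywords
described in this section; the function name parse_stats_end is
hypothetical.

```python
import re
from datetime import datetime, timedelta

_STATS_END = re.compile(
    r"((?:bridge|dirreq)-stats-end) "
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \((\d+) s\)"
)


def parse_stats_end(line):
    """Return (keyword, interval_start, interval_end) as datetimes."""
    match = _STATS_END.fullmatch(line.rstrip("\n"))
    if match is None:
        raise ValueError("not a stats-end line: %r" % (line,))
    end = datetime.strptime(match.group(2), "%Y-%m-%d %H:%M:%S")
    length = timedelta(seconds=int(match.group(3)))
    return match.group(1), end - length, end
```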
+
+ "bridge-ips" CC=N,CC=N,... NL
+ [At most once.]
+
+ List of mappings from two-letter country codes to the number of
+ unique IP addresses that have connected from that country to the
+ bridge and which are not known relays, rounded up to the nearest
+ multiple of 8.
+
+ "dirreq-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
+ [At most once.]
+
+ YYYY-MM-DD HH:MM:SS defines the end of the included measurement
+ interval of length NSEC seconds (86400 seconds by default).
+
+ A "dirreq-stats-end" line, as well as any other "dirreq-*" line,
+ is only added when the relay has opened its Dir port and after 24
+ hours of measuring directory requests.
+
+ "dirreq-v2-ips" CC=N,CC=N,... NL
+ [At most once.]
+ "dirreq-v3-ips" CC=N,CC=N,... NL
+ [At most once.]
+
+ List of mappings from two-letter country codes to the number of
+ unique IP addresses that have connected from that country to
+ request a v2/v3 network status, rounded up to the nearest multiple
+ of 8. Only those IP addresses are counted that the directory can
+ answer with a 200 OK status code.
+
+ "dirreq-v2-reqs" CC=N,CC=N,... NL
+ [At most once.]
+ "dirreq-v3-reqs" CC=N,CC=N,... NL
+ [At most once.]
+
+ List of mappings from two-letter country codes to the number of
+ requests for v2/v3 network statuses from that country, rounded up
+ to the nearest multiple of 8. Only those requests are counted that
+ the directory can answer with a 200 OK status code.
+
+ "dirreq-v2-share" num% NL
+ [At most once.]
+ "dirreq-v3-share" num% NL
+ [At most once.]
+
+ The share of v2/v3 network status requests that the directory
+ expects to receive from clients based on its advertised bandwidth
+ compared to the overall network bandwidth capacity. Shares are
+ formatted in percent with two decimal places. Shares are
+ calculated as means over the whole 24-hour interval.
+
+ "dirreq-v2-resp" status=num,... NL
+ [At most once.]
+ "dirreq-v3-resp" status=num,... NL
+ [At most once.]
+
+ List of mappings from response statuses to the number of requests
+ for v2/v3 network statuses that were answered with that response
+ status, rounded up to the nearest multiple of 4. Only response
+ statuses with at least 1 response are reported. New response
+ statuses can be added at any time. The current list of response
+ statuses is as follows:
+
+ "ok": a network status request is answered; this number
+ corresponds to the sum of all requests as reported in
+ "dirreq-v2-reqs" or "dirreq-v3-reqs", respectively, before
+ rounding up.
+ "not-enough-sigs": a version 3 network status is not signed by a
+ sufficient number of requested authorities.
+ "unavailable": a requested network status object is unavailable.
+ "not-found": a requested network status is not found.
+ "not-modified": a network status has not been modified since the
+ If-Modified-Since time that is included in the request.
+ "busy": the directory is busy.
+
+ "dirreq-v2-direct-dl" key=val,... NL
+ [At most once.]
+ "dirreq-v3-direct-dl" key=val,... NL
+ [At most once.]
+ "dirreq-v2-tunneled-dl" key=val,... NL
+ [At most once.]
+ "dirreq-v3-tunneled-dl" key=val,... NL
+ [At most once.]
+
+ List of statistics about possible failures in the download process
+ of v2/v3 network statuses. Requests are either "direct"
+ HTTP-encoded requests over the relay's directory port, or
+ "tunneled" requests using a BEGIN_DIR cell over the relay's OR
+ port. The list of possible statistics can change, and statistics
+ can be left out from reporting. The current list of statistics is
+ as follows:
+
+ Successful downloads and failures:
+
+ "complete": a client has finished the download successfully.
+ "timeout": a download did not finish within 10 minutes after
+ starting to send the response.
+ "running": a download is still running at the end of the
+ measurement period for less than 10 minutes after starting to
+ send the response.
+
+ Download times:
+
+ "min", "max": smallest and largest measured bandwidth in B/s.
+ "d[1-4,6-9]": 1st to 4th and 6th to 9th decile of measured
+ bandwidth in B/s. For a given decile i, i/10 of all downloads
+ had a smaller bandwidth than di, and (10-i)/10 of all downloads
+ had a larger bandwidth than di.
+ "q[1,3]": 1st and 3rd quartile of measured bandwidth in B/s. One
+ fourth of all downloads had a smaller bandwidth than q1, one
+ fourth of all downloads had a larger bandwidth than q3, and the
+ remaining half of all downloads had a bandwidth between q1 and
+ q3.
+ "md": median of measured bandwidth in B/s. Half of the downloads
+ had a smaller bandwidth than md, the other half had a larger
+ bandwidth than md.
+
+ "dirreq-read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM... NL
+ [At most once.]
+ "dirreq-write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM... NL
+ [At most once.]
+
+ Declare how much bandwidth the OR has spent on answering directory
+ requests. Usage is divided into intervals of NSEC seconds. The
+ YYYY-MM-DD HH:MM:SS field defines the end of the most recent
+ interval. The numbers are the number of bytes used in the most
+ recent intervals, ordered from oldest to newest.
+
+ "entry-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
+ [At most once.]
+
+ YYYY-MM-DD HH:MM:SS defines the end of the included measurement
+ interval of length NSEC seconds (86400 seconds by default).
+
+ An "entry-stats-end" line, as well as any other "entry-*"
+ line, is first added after the relay has been running for at least
+ 24 hours.
+
+ "entry-ips" CC=N,CC=N,... NL
+ [At most once.]
+
+ List of mappings from two-letter country codes to the number of
+ unique IP addresses that have connected from that country to the
+ relay and which are not other known relays, rounded up to the
+ nearest multiple of 8.
+
+ "cell-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
+ [At most once.]
+
+ YYYY-MM-DD HH:MM:SS defines the end of the included measurement
+ interval of length NSEC seconds (86400 seconds by default).
+
+ A "cell-stats-end" line, as well as any other "cell-*" line,
+ is first added after the relay has been running for at least 24
+ hours.
+
+ "cell-processed-cells" num,...,num NL
+ [At most once.]
+
+ Mean number of processed cells per circuit, subdivided into
+ deciles of circuits by the number of cells they have processed in
+ descending order from loudest to quietest circuits.
+
+ "cell-queued-cells" num,...,num NL
+ [At most once.]
+
+ Mean number of cells contained in queues by circuit decile. These
+ means are calculated by 1) determining the mean number of cells in
+ a single circuit between its creation and its termination and 2)
+ calculating the mean for all circuits in a given decile as
+ determined in "cell-processed-cells". Numbers have a precision of
+ two decimal places.
+
+ "cell-time-in-queue" num,...,num NL
+ [At most once.]
+
+ Mean time cells spend in circuit queues in milliseconds. Times are
+ calculated by 1) determining the mean time cells spend in the
+ queue of a single circuit and 2) calculating the mean for all
+ circuits in a given decile as determined in
+ "cell-processed-cells".
+
+ "cell-circuits-per-decile" num NL
+ [At most once.]
+
+ Mean number of circuits that are included in any of the deciles,
+ rounded up to the next integer.
+
+ "exit-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
+ [At most once.]
+
+ YYYY-MM-DD HH:MM:SS defines the end of the included measurement
+ interval of length NSEC seconds (86400 seconds by default).
+
+ An "exit-stats-end" line, as well as any other "exit-*" line, is
+ first added after the relay has been running for at least 24 hours
+ and only if the relay permits exiting (where exiting to a single
+ port and IP address is sufficient).
+
+ "exit-kibibytes-written" port=N,port=N,... NL
+ [At most once.]
+ "exit-kibibytes-read" port=N,port=N,... NL
+ [At most once.]
+
+ List of mappings from ports to the number of kibibytes that the
+ relay has written to or read from exit connections to that port,
+ rounded up to the next full kibibyte.
+
+ "exit-streams-opened" port=N,port=N,... NL
+ [At most once.]
+
+ List of mappings from ports to the number of opened exit streams
+ to that port, rounded up to the nearest multiple of 4.
+
"router-signature" NL Signature NL
[At end, exactly once.]
@@ -795,10 +1029,10 @@ $Id$
generate exactly the same consensus given the same set of votes.
The procedure for deciding when to generate vote and consensus status
- documents are described in section XXX below.
+ documents is described in section 1.4 on the voting timeline.
Status documents contain a preamble, an authority section, a list of
- router status entries, and one more footers signature, in that order.
+ router status entries, and one or more footer signatures, in that order.
Unlike other formats described above, a SP in these documents must be a
single space character (hex 20).
@@ -905,6 +1139,53 @@ $Id$
enough votes were counted for the consensus for an authoritative
opinion to have been formed about their status.
+ "params" SP [Parameters] NL
+
+ [At most once]
+
+ Parameter ::= Keyword '=' Int32
+ Int32 ::= A decimal integer between -2147483648 and 2147483647.
+ Parameters ::= Parameter | Parameters SP Parameter
+
+ The parameters list, if present, contains a space-separated list of
+ case-sensitive key-value pairs, sorted in lexical order by
+ their keyword. Each parameter has its own meaning.
+
+ (Only included when the vote is generated with consensus-method 7 or
+ later.)
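A minimal Python sketch of a parser for the "params" grammar above. The function name is hypothetical, and rejecting unsorted or duplicate keywords is an assumption of this example: the text only says that the list is sorted, not how a reader should treat a violation.

```python
import re

INT32_MIN, INT32_MAX = -2147483648, 2147483647


def parse_params_line(line):
    """Parse '"params" SP [Parameters] NL' into a {keyword: int} dict."""
    keyword, _, rest = line.rstrip("\n").partition(" ")
    if keyword != "params":
        raise ValueError("not a params line")
    if not rest:
        return {}

    pairs = []
    for item in rest.split(" "):
        key, sep, value = item.partition("=")
        # Each Parameter is Keyword '=' Int32, with a decimal integer value.
        if not key or not sep or not re.fullmatch(r"-?[0-9]+", value):
            raise ValueError(f"malformed parameter: {item!r}")
        number = int(value)
        if not INT32_MIN <= number <= INT32_MAX:
            raise ValueError(f"value out of Int32 range: {item!r}")
        pairs.append((key, number))

    keys = [key for key, _ in pairs]
    # Assumption: enforce strictly increasing byte-wise (case-sensitive) order.
    if any(a >= b for a, b in zip(keys, keys[1:])):
        raise ValueError("keywords are not in strict lexical order")
    return dict(pairs)
```

Note that case-sensitive ordering places upper-case keywords first, so "CircuitPriorityHalflifeMsec" sorts before "circwindow".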
+
+ Commonly used "param" arguments at this point include:
+
+ "circwindow" -- the default package window that circuits should
+ be established with. It started out at 1000 cells, but some
+ research indicates that a lower value would mean fewer cells in
+ transit in the network at any given time. Obeyed by Tor 0.2.1.20
+ and later.
+
+ "CircuitPriorityHalflifeMsec" -- the halflife parameter used when
+ weighting which circuit will send the next cell. Obeyed by Tor
+ 0.2.2.10-alpha and later. (Versions of Tor between 0.2.2.7-alpha
+ and 0.2.2.10-alpha recognized a "CircPriorityHalflifeMsec" parameter,
+ but mishandled it badly.)
+
+ "perconnbwrate" and "perconnbwburst" -- if set, each relay sets
+ up a separate token bucket for every client OR connection,
+ and rate limits that connection independently. Typically left
+ unset, except when used for performance experiments around trac
+ entry 1750. Only honored by relays running Tor 0.2.2.16-alpha
+ and later. (Note that relays running 0.2.2.7-alpha through
+ 0.2.2.14-alpha looked for bwconnrate and bwconnburst, but then
+ did the wrong thing with them; see bug 1830 for details.)
+
+ "refuseunknownexits" -- if set and non-zero, exit relays look at
+ the previous hop of circuits that ask to open an exit stream,
+ and refuse to exit if they don't recognize it as a relay. The
+ goal is to make it harder for people to use them as one-hop
+ proxies. See trac entry 1751 for details.
+
+ See also "2.4.5. Consensus parameters governing behavior"
+ in path-spec.txt for a series of circuit build time related
+ consensus params.
The authority section of a vote contains the following items, followed
in turn by the authority's current key certificate:
@@ -1030,13 +1311,20 @@ $Id$
descriptors if they would cause "v" lines to be over 128 characters
long.
- "w" SP "Bandwidth=" INT NL
+ "w" SP "Bandwidth=" INT [SP "Measured=" INT] NL
[At most once.]
An estimate of the bandwidth of this server, in an arbitrary
unit (currently kilobytes per second). Used to weight router
- selection. Other weighting keywords may be added later.
+ selection.
+
+ Additionally, the Measured= keyword is present in votes by
+ participating bandwidth measurement authorities to indicate
+ a measured bandwidth currently produced by measuring stream
+ capacities.
+
+ Other weighting keywords may be added later.
Clients MUST ignore keywords they do not recognize.
"p" SP ("accept" / "reject") SP PortList NL
@@ -1051,8 +1339,57 @@ $Id$
or does not support (if 'reject') for exit to "most
addresses".
- The signature section contains the following item, which appears
- Exactly Once for a vote, and At Least Once for a consensus.
+ The footer section is delineated in all votes and consensuses supporting
+ consensus method 9 and above with the following:
+
+ "directory-footer" NL
+
+ It contains two subsections, a bandwidth-weights line and a
+ directory-signature.
+
+ The bandwidth-weights line appears At Most Once for a consensus. It does
+ not appear in votes.
+
+ "bandwidth-weights" SP
+ "Wbd=" INT SP "Wbe=" INT SP "Wbg=" INT SP "Wbm=" INT SP
+ "Wdb=" INT SP
+ "Web=" INT SP "Wed=" INT SP "Wee=" INT SP "Weg=" INT SP "Wem=" INT SP
+ "Wgb=" INT SP "Wgd=" INT SP "Wgg=" INT SP "Wgm=" INT SP
+ "Wmb=" INT SP "Wmd=" INT SP "Wme=" INT SP "Wmg=" INT SP "Wmm=" INT NL
+
+ These values represent the weights to apply to router bandwidths during
+ path selection. They are sorted in alphabetical order in the list. The
+ integer values are divided by BW_WEIGHT_SCALE=10000 or the consensus
+ param "bwweightscale". They are:
+
+ Wgg - Weight for Guard-flagged nodes in the guard position
+ Wgm - Weight for non-flagged nodes in the guard position
+ Wgd - Weight for Guard+Exit-flagged nodes in the guard position
+
+ Wmg - Weight for Guard-flagged nodes in the middle position
+ Wmm - Weight for non-flagged nodes in the middle position
+ Wme - Weight for Exit-flagged nodes in the middle position
+ Wmd - Weight for Guard+Exit-flagged nodes in the middle position
+
+ Weg - Weight for Guard-flagged nodes in the exit position
+ Wem - Weight for non-flagged nodes in the exit position
+ Wee - Weight for Exit-flagged nodes in the exit position
+ Wed - Weight for Guard+Exit-flagged nodes in the exit position
+
+ Wgb - Weight for BEGIN_DIR-supporting Guard-flagged nodes
+ Wmb - Weight for BEGIN_DIR-supporting non-flagged nodes
+ Web - Weight for BEGIN_DIR-supporting Exit-flagged nodes
+ Wdb - Weight for BEGIN_DIR-supporting Guard+Exit-flagged nodes
+
+ Wbg - Weight for Guard-flagged nodes for BEGIN_DIR requests
+ Wbm - Weight for non-flagged nodes for BEGIN_DIR requests
+ Wbe - Weight for Exit-flagged nodes for BEGIN_DIR requests
+ Wbd - Weight for Guard+Exit-flagged nodes for BEGIN_DIR requests
+
+ These values are calculated as specified in Section 3.4.3.
+
+ The signature contains the following item, which appears Exactly Once
+ for a vote, and At Least Once for a consensus.
"directory-signature" SP identity SP signing-key-digest NL Signature
@@ -1065,7 +1402,7 @@ $Id$
the signing authority, and "signing-key-digest" is the hex-encoded
digest of the current authority signing key of the signing authority.
-3.3. Deciding how to vote.
+3.3. Assigning flags in a vote
(This section describes how directory authorities choose which status
flags to apply to routers, as of Tor 0.2.0.0-alpha-dev. Later directory
@@ -1128,14 +1465,11 @@ $Id$
least one /8 address space.
"Fast" -- A router is 'Fast' if it is active, and its bandwidth is
- either in the top 7/8ths for known active routers or at least 100KB/s.
+ either in the top 7/8ths for known active routers or at least 20KB/s.
"Guard" -- A router is a possible 'Guard' if its Weighted Fractional
Uptime is at least the median for "familiar" active routers, and if
its bandwidth is at least median or at least 250KB/s.
- If the total bandwidth of active non-BadExit Exit servers is less
- than one third of the total bandwidth of all active servers, no Exit is
- listed as a Guard.
To calculate weighted fractional uptime, compute the fraction
of time that the router is up in any given day, weighting so that
@@ -1179,6 +1513,13 @@ $Id$
rate limit from the router descriptor. It is given in kilobytes
per second, and capped at some arbitrary value (currently 10 MB/s).
+ The Measured= keyword on a "w" line vote is currently computed
+ by multiplying the previous published consensus bandwidth by the
+ ratio of the measured average node stream capacity to the network
+ average. If 3 or more authorities provide a Measured= keyword for
+ a router, the authorities produce a consensus containing a "w"
+ Bandwidth= keyword equal to the median of the Measured= votes.
+
The ports listed in a "p" line should be taken as those ports for
which the router's exit policy permits 'most' addresses, ignoring any
accept not for all addresses, ignoring all rejects for private
@@ -1199,6 +1540,10 @@ $Id$
Known-flags is the union of all flags known by any voter.
+ Entries are given on the "params" line for every keyword on which any
+ authority voted. The values given are the low-median of all votes on
+ that keyword.
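The low-median rule above can be sketched in Python as follows. The function name and the representation of each vote as a dict are assumptions of this example. For an even number of votes, the low-median is the smaller of the two middle values, which is what the standard library's statistics.median_low returns.

```python
import statistics


def consensus_params(votes):
    """Combine per-authority {keyword: int} votes into consensus params.

    Every keyword that appears in at least one vote gets an entry whose
    value is the low-median of the values voted for that keyword.
    """
    values_by_keyword = {}
    for vote in votes:
        for keyword, value in vote.items():
            values_by_keyword.setdefault(keyword, []).append(value)
    # Keywords are emitted in sorted order, matching the "params" line.
    return {
        keyword: statistics.median_low(values)
        for keyword, values in sorted(values_by_keyword.items())
    }
```

Using the low-median rather than the mean keeps the result an integer that some authority actually voted for.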
+
"client-versions" and "server-versions" are sorted in ascending
order; A version is recommended in the consensus if it is recommended
by more than half of the voting authorities that included a
@@ -1261,6 +1606,14 @@ $Id$
one, breaking ties in favor of the lexicographically larger
vote.) The port list is encoded as specified in 3.4.2.
+ * If consensus-method 6 or later is in use and if 3 or more
+ authorities provide a Measured= keyword in their votes for
+ a router, the authorities produce a consensus containing a
+ Bandwidth= keyword equal to the median of the Measured= votes.
+
+ * If consensus-method 7 or later is in use, the params line is
+ included in the output.
+
The signatures at the end of a consensus document are sorted in
ascending order by identity digest.
@@ -1281,6 +1634,11 @@ $Id$
"3" -- Added legacy ID key support to aid in authority ID key rollovers
"4" -- No longer list routers that are not running in the consensus
"5" -- adds support for "w" and "p" lines.
+ "6" -- Prefers measured bandwidth values rather than advertised
+ "7" -- Provides keyword=integer pairs of consensus parameters
+ "8" -- Provides microdescriptor summaries
+ "9" -- Provides weights for selecting flagged routers in paths
+ "10" -- Fixes edge case bugs in router flag selection weights
Before generating a consensus, an authority must decide which consensus
method to use. To do this, it looks for the highest version number
@@ -1313,6 +1671,168 @@ $Id$
use an accept-style summary and list as much of the port list as is
possible within these 1000 bytes. [XXXX be more specific.]
+3.4.3. Computing Bandwidth Weights
+
+ Let weight_scale = 10000
+
+ Let G be the total bandwidth for Guard-flagged nodes.
+ Let M be the total bandwidth for non-flagged nodes.
+ Let E be the total bandwidth for Exit-flagged nodes.
+ Let D be the total bandwidth for Guard+Exit-flagged nodes.
+ Let T = G+M+E+D
+
+ Let Wgd be the weight for choosing a Guard+Exit for the guard position.
+ Let Wmd be the weight for choosing a Guard+Exit for the middle position.
+ Let Wed be the weight for choosing a Guard+Exit for the exit position.
+
+ Let Wme be the weight for choosing an Exit for the middle position.
+ Let Wmg be the weight for choosing a Guard for the middle position.
+
+ Let Wgg be the weight for choosing a Guard for the guard position.
+ Let Wee be the weight for choosing an Exit for the exit position.
+
+ Balanced network conditions then arise from solutions to the following
+ system of equations:
+
+ Wgg*G + Wgd*D == M + Wmd*D + Wme*E + Wmg*G (guard bw = middle bw)
+ Wgg*G + Wgd*D == Wee*E + Wed*D (guard bw = exit bw)
+ Wed*D + Wmd*D + Wgd*D == D (aka: Wed+Wmd+Wgd = 1)
+ Wmg*G + Wgg*G == G (aka: Wgg = 1-Wmg)
+ Wme*E + Wee*E == E (aka: Wee = 1-Wme)
+
+ We are short 2 constraints with the above set. The remaining constraints
+ come from examining different cases of network load. The following
+ constraints are used in consensus method 10 and above. There is another,
+ incorrect and obsolete, set of constraints used for these same cases in
+ consensus method 9. For those, see dir-spec.txt in Tor 0.2.2.10-alpha
+ to 0.2.2.16-alpha.
+
+ Case 1: E >= T/3 && G >= T/3 (Neither Exit nor Guard Scarce)
+
+ In this case, the additional two constraints are: Wmg == Wmd,
+ Wed == 1/3.
+
+ This leads to the solution:
+ Wgd = weight_scale/3
+ Wed = weight_scale/3
+ Wmd = weight_scale/3
+ Wee = (weight_scale*(E+G+M))/(3*E)
+ Wme = weight_scale - Wee
+ Wmg = (weight_scale*(2*G-E-M))/(3*G)
+ Wgg = weight_scale - Wmg
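The Case 1 solution above can be checked numerically with a short Python sketch. The function name is hypothetical, and testing the case condition as 3*E >= T (instead of comparing against an integer-divided T/3) is an assumption of this example, not something the text specifies. All arithmetic uses integer division.

```python
WEIGHT_SCALE = 10000


def case1_weights(G, M, E, D, weight_scale=WEIGHT_SCALE):
    """Bandwidth weights for Case 1 (neither Exit nor Guard scarce).

    G, M, E, D are the total bandwidths of Guard-flagged, non-flagged,
    Exit-flagged and Guard+Exit-flagged nodes, as non-negative integers.
    """
    T = G + M + E + D
    # Assumption: "E >= T/3 && G >= T/3" is evaluated without rounding.
    if not (3 * E >= T and 3 * G >= T):
        raise ValueError("inputs do not fall under Case 1")

    w = {}
    w["Wgd"] = w["Wed"] = w["Wmd"] = weight_scale // 3
    w["Wee"] = (weight_scale * (E + G + M)) // (3 * E)
    w["Wme"] = weight_scale - w["Wee"]
    w["Wmg"] = (weight_scale * (2 * G - E - M)) // (3 * G)
    w["Wgg"] = weight_scale - w["Wmg"]
    return w
```

For example, with G=400, M=100, E=400, D=100 this yields Wee=Wgg=7500 and Wme=Wmg=2500, so each of the guard, middle and exit positions receives about one third of the total bandwidth, up to the rounding of weight_scale/3.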
+
+ Case 2: E < T/3 && G < T/3 (Both are scarce)
+
+ Let R denote the more scarce class (Rare) between Guard vs Exit.
+ Let S denote the less scarce class.
+
+ Subcase a: R+D < S
+
+ In this subcase, we simply devote all of D bandwidth to the
+ scarce class.
+
+ Wgg = Wee = weight_scale
+ Wmg = Wme = Wmd = 0;
+ if E < G:
+ Wed = weight_scale
+ Wgd = 0
+ else:
+ Wed = 0
+ Wgd = weight_scale
+
+ Subcase b: R+D >= S
+
+ In this case, if M <= T/3, we have enough bandwidth to try to achieve
+ a balancing condition.
+
+ Add constraints Wgg = 1, Wmd == Wgd to maximize bandwidth in the guard
+ position while still allowing exits to be used as middle nodes:
+
+ Wee = (weight_scale*(E - G + M))/E
+ Wed = (weight_scale*(D - 2*E + 4*G - 2*M))/(3*D)
+ Wme = (weight_scale*(G-M))/E
+ Wmg = 0
+ Wgg = weight_scale
+ Wmd = (weight_scale - Wed)/2
+ Wgd = (weight_scale - Wed)/2
+
+ If this system ends up with any values out of range (ie negative, or
+ above weight_scale), use the constraints Wgg == 1 and Wee == 1, since
+ both those positions are scarce:
+
+ Wgg = weight_scale
+ Wee = weight_scale
+ Wed = (weight_scale*(D - 2*E + G + M))/(3*D)
+ Wmd = (weight_scale*(D - 2*M + G + E))/(3*D)
+ Wme = 0
+ Wmg = 0
+ Wgd = weight_scale - Wed - Wmd
+
+ If M > T/3, then the Wmd weight above will become negative. Set it to 0
+ in this case:
+ Wmd = 0
+ Wgd = weight_scale - Wed
+
+ Case 3: One of E < T/3 or G < T/3
+
+ Let S be the scarce class (of E or G).
+
+ Subcase a: (S+D) < T/3:
+ if G=S:
+ Wgg = Wgd = weight_scale;
+ Wmd = Wed = Wmg = 0;
+ // Minor subcase, if E is more scarce than M,
+ // keep its bandwidth in place.
+ if (E < M) Wme = 0;
+ else Wme = (weight_scale*(E-M))/(2*E);
+ Wee = weight_scale-Wme;
+ if E=S:
+ Wee = Wed = weight_scale;
+ Wmd = Wgd = Wme = 0;
+ // Minor subcase, if G is more scarce than M,
+ // keep its bandwidth in place.
+ if (G < M) Wmg = 0;
+ else Wmg = (weight_scale*(G-M))/(2*G);
+ Wgg = weight_scale-Wmg;
+
+ Subcase b: (S+D) >= T/3
+ if G=S:
+ Add constraints Wgg = 1, Wmd == Wed to maximize bandwidth
+ in the guard position, while still allowing exits to be
+ used as middle nodes:
+ Wgg = weight_scale
+ Wgd = (weight_scale*(D - 2*G + E + M))/(3*D)
+ Wmg = 0
+ Wee = (weight_scale*(E+M))/(2*E)
+ Wme = weight_scale - Wee
+ Wmd = (weight_scale - Wgd)/2
+ Wed = (weight_scale - Wgd)/2
+ if E=S:
+ Add constraints Wee == 1, Wmd == Wgd to maximize bandwidth
+ in the exit position:
+ Wee = weight_scale;
+ Wed = (weight_scale*(D - 2*E + G + M))/(3*D);
+ Wme = 0;
+ Wgg = (weight_scale*(G+M))/(2*G);
+ Wmg = weight_scale - Wgg;
+ Wmd = (weight_scale - Wed)/2;
+ Wgd = (weight_scale - Wed)/2;
+
+ To ensure consensus, all calculations are performed using integer math
+ with a fixed precision determined by the bwweightscale consensus
+ parameter (default: 10000).
+
+ For future balancing improvements, Tor clients support 11 additional weights
+ for directory requests and middle weighting. These weights are currently
+ set at weight_scale, with the exception of the following groups of
+ assignments:
+
+ Directory requests use middle weights:
+ Wbd=Wmd, Wbg=Wmg, Wbe=Wme, Wbm=Wmm
+
+ Handle bridges and strange exit policies:
+ Wgm=Wgg, Wem=Wee, Weg=Wed
+
3.5. Detached signatures
Assuming full connectivity, every authority should compute and sign the
@@ -1884,7 +2404,6 @@ $Id$
A. Consensus-negotiation timeline.
-
Period begins: this is the Published time.
Everybody sends votes
Reconciliation: everybody tries to fetch missing votes.
diff --git a/doc/spec/path-spec.txt b/doc/spec/path-spec.txt
index dceb21dad7..2e4207bd56 100644
--- a/doc/spec/path-spec.txt
+++ b/doc/spec/path-spec.txt
@@ -1,4 +1,3 @@
-$Id$
Tor Path Specification
@@ -15,6 +14,11 @@ of their choices.
THIS SPEC ISN'T DONE YET.
+ The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
+ NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
+ "OPTIONAL" in this document are to be interpreted as described in
+ RFC 2119.
+
1. General operation
Tor begins building circuits as soon as it has enough directory
@@ -72,6 +76,24 @@ of their choices.
is unknown (usually its target IP), but we believe the path probably
supports the request according to the rules given below.
+1.1. A server's bandwidth
+
+ Old versions of Tor did not report bandwidths in network status
+ documents, so clients had to learn them from the routers' advertised
+ server descriptors.
+
+ For versions of Tor prior to 0.2.1.17-rc, everywhere below where we
+ refer to a server's "bandwidth", we mean its clipped advertised
+ bandwidth, computed by taking the smaller of the 'rate' and
+ 'observed' arguments to the "bandwidth" element in the server's
+ descriptor. If a router's advertised bandwidth is greater than
+ MAX_BELIEVABLE_BANDWIDTH (currently 10 MB/s), we clipped to that
+ value.
+
+ For more recent versions of Tor, we take the bandwidth value declared
+ in the consensus, and fall back to the clipped advertised bandwidth
+ only if the consensus does not have bandwidths listed.
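A small Python sketch of the clipped advertised bandwidth described above. The function name is hypothetical, and the constant assumes that "10 MB/s" means 10,000,000 bytes per second, which the text does not state.

```python
# Assumption: "10 MB/s" is taken as 10,000,000 bytes per second.
MAX_BELIEVABLE_BANDWIDTH = 10 * 1000 * 1000


def clipped_advertised_bandwidth(rate, observed,
                                 cap=MAX_BELIEVABLE_BANDWIDTH):
    """Return the smaller of 'rate' and 'observed', clipped to the cap.

    'rate' and 'observed' are the arguments of the "bandwidth" element
    in a server descriptor, in bytes per second.
    """
    return min(rate, observed, cap)
```

A client that falls back to this value should keep in mind that descriptor bandwidths and consensus bandwidths may be expressed in different units.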
+
2. Building circuits
2.1. When we build
@@ -158,6 +180,7 @@ of their choices.
XXXX
+
2.2. Path selection and constraints
We choose the path for each new circuit before we build it. We choose the
@@ -175,26 +198,41 @@ of their choices.
below)
- XXXX Choosing the length
- For circuits that do not need to be "fast", when choosing among
- multiple candidates for a path element, we choose randomly.
+ For "fast" circuits, we only choose nodes with the Fast flag. For
+ non-"fast" circuits, all nodes are eligible.
+
+ For all circuits, we weight node selection according to router bandwidth.
+
+ We also weight the bandwidth of Exit and Guard flagged nodes depending on
+ the fraction of total bandwidth that they make up and depending upon the
+ position they are being selected for.
+
+ These weights are published in the consensus, and are computed as described
+ in Section 3.4.3 of dir-spec.txt. They are:
+
+ Wgg - Weight for Guard-flagged nodes in the guard position
+ Wgm - Weight for non-flagged nodes in the guard position
+ Wgd - Weight for Guard+Exit-flagged nodes in the guard position
- For "fast" circuits, we pick a given router as an exit with probability
- proportional to its advertised bandwidth [the smaller of the 'rate' and
- 'observed' arguments to the "bandwidth" element in its descriptor]. If a
- router's advertised bandwidth is greater than MAX_BELIEVABLE_BANDWIDTH
- (currently 10 MB/s), we clip to that value.
+ Wmg - Weight for Guard-flagged nodes in the middle position
+ Wmm - Weight for non-flagged nodes in the middle position
+ Wme - Weight for Exit-flagged nodes in the middle position
+ Wmd - Weight for Guard+Exit-flagged nodes in the middle position
- For non-exit positions on "fast" circuits, we pick routers as above, but
- we weight the clipped advertised bandwidth of Exit-flagged nodes depending
- on the fraction of bandwidth available from non-Exit nodes. Call the
- total clipped advertised bandwidth for Exit nodes under consideration E,
- and the total clipped advertised bandwidth for all nodes under
- consideration T. If E<T/3, we do not consider Exit-flagged nodes.
- Otherwise, we weight their bandwidth with the factor (E-T/3)/E. This
- ensures that bandwidth is evenly distributed over nodes in 3-hop paths.
+ Weg - Weight for Guard-flagged nodes in the exit position
+ Wem - Weight for non-flagged nodes in the exit position
+ Wee - Weight for Exit-flagged nodes in the exit position
+ Wed - Weight for Guard+Exit-flagged nodes in the exit position
- Similarly, guard nodes are weighted by the factor (G-T/3)/G, and not
- considered for non-guard positions if this value is less than 0.
+ Wgb - Weight for BEGIN_DIR-supporting Guard-flagged nodes
+ Wmb - Weight for BEGIN_DIR-supporting non-flagged nodes
+ Web - Weight for BEGIN_DIR-supporting Exit-flagged nodes
+ Wdb - Weight for BEGIN_DIR-supporting Guard+Exit-flagged nodes
+
+ Wbg - Weight for Guard-flagged nodes for BEGIN_DIR requests
+ Wbm - Weight for non-flagged nodes for BEGIN_DIR requests
+ Wbe - Weight for Exit-flagged nodes for BEGIN_DIR requests
+ Wbd - Weight for Guard+Exit-flagged nodes for BEGIN_DIR requests
Additionally, we may be building circuits with one or more requests in
mind. Each kind of request puts certain constraints on paths:
@@ -263,8 +301,182 @@ of their choices.
at a given node -- either via the ".exit" notation or because the
destination is running at the same location as an exit node.
+2.4. Learning when to give up ("timeout") on circuit construction
+
+ Since version 0.2.2.8-alpha, Tor attempts to learn when to give up on
+ circuits based on network conditions.
+
+2.4.1. Distribution choice and parameter estimation
+
+ Based on studies of build times, we found that the distribution of
+ circuit build times appears to be a Frechet distribution. However,
+ estimators and quantile functions of the Frechet distribution are
+ difficult to work with and slow to converge. So instead, since we
+ are only interested in the accuracy of the tail, we approximate
+ the tail of the distribution with a Pareto curve.
+
+ We calculate the parameters for a Pareto distribution fitting the data
+ using the estimators in equation 4 from:
+ http://portal.acm.org/citation.cfm?id=1647962.1648139
+
+ This is:
+
+ alpha_m = s/(ln(U(X)/Xm^n))
+
+ where s is the total number of completed circuits we have seen, and
+
+ U(X) = x_max^u * Prod_s{x_i}
+
+ with x_i as our i-th completed circuit time, x_max as the longest
+ completed circuit build time we have yet observed, u as the
+ number of unobserved timeouts that have no exact value recorded,
+ and n as u+s, the total number of circuits that either time out or
+ complete.
+
+ Using log laws, we compute this as the sum of logs to avoid
+ overflow and ln(1.0+epsilon) precision issues:
+
+ alpha_m = s/(u*ln(x_max) + Sum_s{ln(x_i)} - n*ln(Xm))
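The sum-of-logs form above translates directly into code. The following Python sketch is illustrative only: the function and argument names are hypothetical, and raising an error when the denominator is not positive is an assumption about how to handle degenerate input.

```python
import math


def estimate_alpha(build_times, num_timeouts, xm):
    """Right-censored Pareto shape estimate, computed as a sum of logs.

    build_times  -- completed circuit build times x_i (all positive)
    num_timeouts -- u, the timeouts with no exact value recorded
    xm           -- the Pareto scale parameter Xm
    """
    s = len(build_times)
    u = num_timeouts
    n = s + u
    if s == 0:
        raise ValueError("need at least one completed circuit")
    x_max = max(build_times)
    denominator = (
        u * math.log(x_max)
        + sum(math.log(x) for x in build_times)
        - n * math.log(xm)
    )
    # Assumption: treat a non-positive denominator as unusable input.
    if denominator <= 0:
        raise ValueError("cannot estimate alpha from this data")
    return s / denominator
```

With build times of 100, 200 and 400 ms, no timeouts and Xm = 100, the denominator is ln(2) + ln(4) = ln(8), so alpha is 3/ln(8), roughly 1.44.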
+
+ This estimator is closely related to the parameters present in:
+ http://en.wikipedia.org/wiki/Pareto_distribution#Parameter_estimation
+ except they are adjusted to handle the fact that our samples are
+ right-censored at the timeout cutoff.
+
+ Additionally, because this is not a true Pareto distribution, we alter
+ how Xm is computed. The Xm parameter is computed as the midpoint of the most
+ frequently occurring 50ms histogram bin, until the point where 1000
+ circuits are recorded. After this point, the weighted average of the top
+ 'cbtnummodes' (default: 3) midpoint modes is used as Xm. All times below
+ this value are counted as having the midpoint value of this weighted average bin.
+
+ The timeout itself is calculated by using the Pareto Quantile function (the
+ inverted CDF) to give us the value on the CDF such that 80% of the mass
+ of the distribution is below the timeout value.
+
+ Thus, we expect that the Tor client will accept the fastest 80% of
+ the total number of paths on the network.
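For a Pareto distribution with scale Xm and shape alpha, the CDF is F(x) = 1 - (Xm/x)^alpha, so the quantile function is Q(p) = Xm / (1-p)^(1/alpha). A minimal Python sketch of the timeout calculation follows; the function name is hypothetical, and the 0.8 default mirrors the 80% figure in the text.

```python
def pareto_quantile(xm, alpha, quantile=0.8):
    """Return x such that a `quantile` fraction of the Pareto mass is below x.

    Inverts F(x) = 1 - (xm / x) ** alpha.
    """
    if not 0.0 <= quantile < 1.0:
        raise ValueError("quantile must be in [0, 1)")
    if xm <= 0 or alpha <= 0:
        raise ValueError("xm and alpha must be positive")
    return xm / (1.0 - quantile) ** (1.0 / alpha)
```

For instance, with Xm = 100 ms and alpha = 1, the 80% quantile is 100/0.2, or about 500 ms.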
+
+2.4.2. How much data to record
+
+ From our observations, the minimum number of circuit build times for a
+ reasonable fit appears to be on the order of 100. However, to keep a
+ good fit over the long term, we store the 1000 most recent circuit build times
+ in a circular array.
+
+ The Tor client should build test circuits at a rate of one per
+ minute up until 100 circuits are built. This allows a fresh Tor to have
+ a CircuitBuildTimeout estimated within 1.5 hours after install,
+ upgrade, or network change (see below).
+
+ Timeouts are stored on disk in a histogram of 50ms bin width, the same
+ width used to calculate the Xm value above. This histogram must be shuffled
+ after being read from disk, to preserve a proper expiration of old values
+ after restart.
+
+2.4.3. How to record timeouts
+
+ Circuits that pass the timeout threshold should be allowed to continue
+ building until a time corresponding to the point 'cbtclosequantile'
+ (default 95) on the Pareto curve, or 60 seconds, whichever is greater.
+
+ The actual completion times for these circuits should be recorded.
+ Implementations should completely abandon a circuit and record a value
+ as an 'unknown' timeout if the total build time exceeds this threshold.
+
+ The reason for this is that right-censored Pareto estimators begin to lose
+ their accuracy if more than approximately 5% of the values are censored.
+ Since we wish to set the cutoff at 20%, we must allow circuits to continue
+ building past this cutoff point up to the 95th percentile.
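The recording rule above can be sketched as a small classifier. The function names and the three labels are hypothetical; only the thresholds (the timeout, and the greater of the 'cbtclosequantile' point and 60 seconds) come from the text.

```python
MIN_CLOSE_MS = 60_000  # the 60 second floor from the text


def close_threshold_ms(close_quantile_ms):
    """Point at which a still-building circuit is abandoned."""
    return max(close_quantile_ms, MIN_CLOSE_MS)


def classify_build(elapsed_ms, timeout_ms, close_ms):
    """Decide how a circuit's build time is treated.

    - "completed": finished within the timeout.
    - "timed_out_recorded": passed the timeout, but its actual
      completion time is still recorded.
    - "abandoned": exceeded the close threshold; recorded as an
      'unknown' timeout.
    """
    if elapsed_ms <= timeout_ms:
        return "completed"
    if elapsed_ms <= close_ms:
        return "timed_out_recorded"
    return "abandoned"
```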
+
+2.4.4. Detecting Changing Network Conditions
+
+ We attempt to detect both network connectivity loss and drastic
+ changes in the timeout characteristics.
+
+ We assume that we've had network connectivity loss if 3 circuits
+ time out and we've received no cells or TLS handshakes since those
+ circuits began. We then temporarily set the timeout to 60 seconds
+ and stop counting timeouts.
+
+ If 3 more circuits time out and the network still has not been
+ live within this new 60 second timeout window, we then discard
+ the previous timeouts during this period from our history.
+
+ To detect changing network conditions, we keep a history of
+ the timeout or non-timeout status of the past 20 circuits that
+ successfully completed at least one hop. If more than 90% of
+ these circuits time out, we discard all build-time history, reset
+ the timeout to 60 seconds, and then begin recomputing the timeout.
+
+ If the timeout was already 60 seconds or higher, we double the timeout.
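A sketch of the change-detection rule in this section, using the "more than 90% of the past 20 circuits" wording. The class and function names are hypothetical, and the caller is assumed to feed in only circuits that completed at least one hop.

```python
from collections import deque

RESET_TIMEOUT_S = 60  # reset value given in the text


class RecentTimeouts:
    """Sliding window over the timeout status of recent circuits."""

    def __init__(self, recent_count=20):
        self.recent_count = recent_count
        self.recent = deque(maxlen=recent_count)

    def record(self, timed_out):
        """Record one circuit; return True if history should be discarded."""
        self.recent.append(bool(timed_out))
        # "More than 90%" of the window, in integer arithmetic.
        return sum(self.recent) * 10 > 9 * self.recent_count


def timeout_after_reset(current_timeout_s):
    """Reset to 60 seconds, or double if already at 60 seconds or more."""
    if current_timeout_s >= RESET_TIMEOUT_S:
        return current_timeout_s * 2
    return RESET_TIMEOUT_S
```

With a window of 20, the reset fires on the 19th timeout in the window, since 18 of 20 is exactly 90% and not more.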
+
+2.4.5. Consensus parameters governing behavior
+
+ Clients that implement circuit build timeout learning should obey the
+ following consensus parameters that govern behavior, in order to allow
+ us to handle bugs or other emergent behaviors due to client circuit
+ construction. If these parameters are not present in the consensus,
+ the listed default values should be used instead.
-2.4. Handling failure
+ cbtdisabled
+ Default: 0
+ Effect: If non-zero, all CircuitBuildTime learning code should be
+ disabled and history should be discarded. For use in
+ emergency situations only.
+
+ cbtnummodes
+ Default: 3
+ Effect: This value governs how many modes to use in the weighted
+ average calculation of Pareto parameter Xm. A value of 3 introduces
+ some bias (2-5% of CDF) under ideal conditions, but allows for better
+ performance in the event that a client chooses guard nodes of radically
+ different performance characteristics.
+
+ cbtrecentcount
+ Default: 20
+ Effect: This is the number of circuit build times to keep track of
+ for the following option.
+
+ cbtmaxtimeouts
+ Default: 18
+ Effect: When this many timeouts happen in the last 'cbtrecentcount'
+ circuit attempts, the client should discard all of its
+ history and begin learning a fresh timeout value.
+
+ cbtmincircs
+ Default: 100
+ Effect: This is the minimum number of circuits to build before
+ computing a timeout.
+
+ cbtquantile
+ Default: 80
+ Effect: This is the position on the quantile curve to use to set the
+ timeout value. It is a percent (0-99).
+
+ cbtclosequantile
+ Default: 95
+ Effect: This is the position on the quantile curve to use to set the
+ timeout value to use to actually close circuits. It is a percent
+ (0-99).
+
+ cbttestfreq
+ Default: 60
+ Effect: Describes how often in seconds to build a test circuit to
+ gather timeout values. Only applies if fewer than 'cbtmincircs'
+ have been recorded.
+
+ cbtmintimeout
+ Default: 2000
+ Effect: This is the minimum allowed timeout value in milliseconds.
+
+ cbtinitialtimeout
+ Default: 60000
+ Effect: This is the timeout value to use before computing a timeout,
+ in milliseconds.
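The fallback rule for these parameters can be sketched as a lookup table. The defaults are the ones listed above; representing the consensus parameters as a plain name-to-integer mapping, and the function name, are assumptions made for illustration.

```python
CBT_DEFAULTS = {
    "cbtdisabled": 0,
    "cbtnummodes": 3,
    "cbtrecentcount": 20,
    "cbtmaxtimeouts": 18,
    "cbtmincircs": 100,
    "cbtquantile": 80,         # percent
    "cbtclosequantile": 95,    # percent
    "cbttestfreq": 60,         # seconds
    "cbtmintimeout": 2000,     # milliseconds
    "cbtinitialtimeout": 60000,  # milliseconds
}


def cbt_param(consensus_params, name):
    """Return the consensus value for `name`, or the listed default."""
    if name not in CBT_DEFAULTS:
        raise KeyError(f"unknown circuit build timeout parameter: {name}")
    return consensus_params.get(name, CBT_DEFAULTS[name])
```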
+
+
+2.5. Handling failure
If an attempt to extend a circuit fails (either because the first create
failed or a subsequent extend failed) then the circuit is torn down and is
@@ -306,7 +518,7 @@ of their choices.
We use Guard nodes (also called "helper nodes" in the literature) to
prevent certain profiling attacks. Here's the risk: if we choose entry and
exit nodes at random, and an attacker controls C out of N servers
- (ignoring advertised bandwidth), then the
+ (ignoring bandwidth), then the
attacker will control the entry and exit node of any given circuit with
probability (C/N)^2. But as we make many different circuits over time,
then the probability that the attacker will see a sample of about (C/N)^2
diff --git a/doc/spec/proposals/000-index.txt b/doc/spec/proposals/000-index.txt
index d75157650d..f6f313e58d 100644
--- a/doc/spec/proposals/000-index.txt
+++ b/doc/spec/proposals/000-index.txt
@@ -1,7 +1,5 @@
Filename: 000-index.txt
Title: Index of Tor Proposals
-Version: $Revision$
-Last-Modified: $Date$
Author: Nick Mathewson
Created: 26-Jan-2007
Status: Meta
@@ -56,7 +54,7 @@ Proposals by number:
131 Help users to verify they are using Tor [NEEDS-REVISION]
132 A Tor Web Service For Verifying Correct Browser Configuration [DRAFT]
133 Incorporate Unreachable ORs into the Tor Network [DRAFT]
-134 More robust consensus voting with diverse authority sets [ACCEPTED]
+134 More robust consensus voting with diverse authority sets [REJECTED]
135 Simplify Configuration of Private Tor Networks [CLOSED]
136 Mass authority migration with legacy keys [CLOSED]
137 Keep controllers informed as Tor bootstraps [CLOSED]
@@ -73,7 +71,7 @@ Proposals by number:
148 Stream end reasons from the client side should be uniform [CLOSED]
149 Using data from NETINFO cells [OPEN]
150 Exclude Exit Nodes from a circuit [CLOSED]
-151 Improving Tor Path Selection [DRAFT]
+151 Improving Tor Path Selection [FINISHED]
152 Optionally allow exit from single-hop circuits [CLOSED]
153 Automatic software update protocol [SUPERSEDED]
154 Automatic Software Update Protocol [SUPERSEDED]
@@ -82,6 +80,20 @@ Proposals by number:
157 Make certificate downloads specific [ACCEPTED]
158 Clients download consensus + microdescriptors [OPEN]
159 Exit Scanning [OPEN]
+160 Authorities vote for bandwidth offsets in consensus [FINISHED]
+161 Computing Bandwidth Adjustments [FINISHED]
+162 Publish the consensus in multiple flavors [OPEN]
+163 Detecting whether a connection comes from a client [OPEN]
+164 Reporting the status of server votes [OPEN]
+165 Easy migration for voting authority sets [OPEN]
+166 Including Network Statistics in Extra-Info Documents [ACCEPTED]
+167 Vote on network parameters in consensus [CLOSED]
+168 Reduce default circuit window [OPEN]
+169 Eliminate TLS renegotiation for the Tor connection handshake [DRAFT]
+170 Configuration options regarding circuit building [DRAFT]
+172 GETINFO controller option for circuit information [ACCEPTED]
+173 GETINFO Option Expansion [ACCEPTED]
+174 Optimistic Data for Tor: Server Side [OPEN]
Proposals by status:
@@ -92,7 +104,8 @@ Proposals by status:
133 Incorporate Unreachable ORs into the Tor Network
141 Download server descriptors on demand
144 Increase the diversity of circuits by detecting nodes belonging the same provider
- 151 Improving Tor Path Selection
+ 169 Eliminate TLS renegotiation for the Tor connection handshake [for 0.2.2]
+ 170 Configuration options regarding circuit building
NEEDS-REVISION:
131 Help users to verify they are using Tor
OPEN:
@@ -103,14 +116,22 @@ Proposals by status:
156 Tracking blocked ports on the client side [for 0.2.?]
158 Clients download consensus + microdescriptors
159 Exit Scanning
+ 162 Publish the consensus in multiple flavors [for 0.2.2]
+ 163 Detecting whether a connection comes from a client [for 0.2.2]
+ 164 Reporting the status of server votes [for 0.2.2]
+ 165 Easy migration for voting authority sets
+ 168 Reduce default circuit window [for 0.2.2]
+ 174 Optimistic Data for Tor: Server Side
ACCEPTED:
110 Avoiding infinite length circuits [for 0.2.1.x] [in 0.2.1.3-alpha]
117 IPv6 exits [for 0.2.1.x]
118 Advertising multiple ORPorts at once [for 0.2.1.x]
- 134 More robust consensus voting with diverse authority sets [for 0.2.2.x]
140 Provide diffs between consensuses [for 0.2.2.x]
147 Eliminate the need for v2 directories in generating v3 directories [for 0.2.1.x]
157 Make certificate downloads specific [for 0.2.1.x]
+ 166 Including Network Statistics in Extra-Info Documents [for 0.2.2]
+ 172 GETINFO controller option for circuit information
+ 173 GETINFO Option Expansion
META:
000 Index of Tor Proposals
001 The Tor Proposal Process
@@ -118,7 +139,10 @@ Proposals by status:
099 Miscellaneous proposals
FINISHED:
121 Hidden Service Authentication [in 0.2.1.x]
+ 151 Improving Tor Path Selection
155 Four Improvements of Hidden Service Performance [in 0.2.1.x]
+ 160 Authorities vote for bandwidth offsets in consensus [for 0.2.2.x]
+ 161 Computing Bandwidth Adjustments [for 0.2.2.x]
CLOSED:
101 Voting on the Tor Directory System [in 0.2.0.x]
102 Dropping "opt" from the directory format [in 0.2.0.x]
@@ -146,6 +170,7 @@ Proposals by status:
148 Stream end reasons from the client side should be uniform [in 0.2.1.9-alpha]
150 Exclude Exit Nodes from a circuit [in 0.2.1.3-alpha]
152 Optionally allow exit from single-hop circuits [in 0.2.1.6-alpha]
+ 167 Vote on network parameters in consensus [in 0.2.2]
SUPERSEDED:
112 Bring Back Pathlen Coin Weight
113 Simplifying directory authority administration
@@ -159,3 +184,5 @@ Proposals by status:
120 Shutdown descriptors when Tor servers stop
128 Families of private bridges
142 Combine Introduction and Rendezvous Points
+ REJECTED:
+ 134 More robust consensus voting with diverse authority sets
diff --git a/doc/spec/proposals/001-process.txt b/doc/spec/proposals/001-process.txt
index 3a767b5fa4..e2fe358fed 100644
--- a/doc/spec/proposals/001-process.txt
+++ b/doc/spec/proposals/001-process.txt
@@ -1,7 +1,5 @@
Filename: 001-process.txt
Title: The Tor Proposal Process
-Version: $Revision$
-Last-Modified: $Date$
Author: Nick Mathewson
Created: 30-Jan-2007
Status: Meta
@@ -47,7 +45,7 @@ How to change the specs now:
Like an RFC, every proposal gets a number. Unlike RFCs, proposals can
change over time and keep the same number, until they are finally
accepted or rejected. The history for each proposal
- will be stored in the Tor Subversion repository.
+ will be stored in the Tor repository.
Once a proposal is in the repository, we should discuss and improve it
until we've reached consensus that it's a good idea, and that it's
@@ -82,9 +80,7 @@ How new proposals get added:
What should go in a proposal:
Every proposal should have a header containing these fields:
- Filename, Title, Version, Last-Modified, Author, Created, Status.
- The Version and Last-Modified fields should use the SVN Revision and Date
- tags respectively.
+ Filename, Title, Author, Created, Status.
These fields are optional but recommended:
Target, Implemented-In.
@@ -97,7 +93,7 @@ What should go in a proposal:
what the proposal's about, what it does, and about what state it's in.
After the Overview, the proposal becomes more free-form. Depending on its
- the length and complexity, the proposal can break into sections as
+ length and complexity, the proposal can break into sections as
appropriate, or follow a short discursive format. Every proposal should
contain at least the following information before it is "ACCEPTED",
though the information does not need to be in sections with these names.
@@ -131,7 +127,8 @@ What should go in a proposal:
Implementation: If the proposal will be tricky to implement in Tor's
current architecture, the document can contain some discussion of how
- to go about making it work.
+ to go about making it work. Actual patches should go on public git
+ branches, or be uploaded to trac.
Performance and scalability notes: If the feature will have an effect
on performance (in RAM, CPU, bandwidth) or scalability, there should
diff --git a/doc/spec/proposals/098-todo.txt b/doc/spec/proposals/098-todo.txt
index e891ea890c..a0bbbeb568 100644
--- a/doc/spec/proposals/098-todo.txt
+++ b/doc/spec/proposals/098-todo.txt
@@ -1,7 +1,5 @@
Filename: 098-todo.txt
Title: Proposals that should be written
-Version: $Revision$
-Last-Modified: $Date$
Author: Nick Mathewson, Roger Dingledine
Created: 26-Jan-2007
Status: Meta
diff --git a/doc/spec/proposals/099-misc.txt b/doc/spec/proposals/099-misc.txt
index ba13ea2a71..a3621dd25f 100644
--- a/doc/spec/proposals/099-misc.txt
+++ b/doc/spec/proposals/099-misc.txt
@@ -1,7 +1,5 @@
Filename: 099-misc.txt
Title: Miscellaneous proposals
-Version: $Revision$
-Last-Modified: $Date$
Author: Various
Created: 26-Jan-2007
Status: Meta
diff --git a/doc/spec/proposals/100-tor-spec-udp.txt b/doc/spec/proposals/100-tor-spec-udp.txt
index 8224682ec8..7f062222c5 100644
--- a/doc/spec/proposals/100-tor-spec-udp.txt
+++ b/doc/spec/proposals/100-tor-spec-udp.txt
@@ -1,7 +1,5 @@
Filename: 100-tor-spec-udp.txt
Title: Tor Unreliable Datagram Extension Proposal
-Version: $Revision$
-Last-Modified: $Date$
Author: Marc Liberatore
Created: 23 Feb 2006
Status: Dead
diff --git a/doc/spec/proposals/101-dir-voting.txt b/doc/spec/proposals/101-dir-voting.txt
index be900a641e..634d3f1948 100644
--- a/doc/spec/proposals/101-dir-voting.txt
+++ b/doc/spec/proposals/101-dir-voting.txt
@@ -1,7 +1,5 @@
Filename: 101-dir-voting.txt
Title: Voting on the Tor Directory System
-Version: $Revision$
-Last-Modified: $Date$
Author: Nick Mathewson
Created: Nov 2006
Status: Closed
diff --git a/doc/spec/proposals/102-drop-opt.txt b/doc/spec/proposals/102-drop-opt.txt
index 8f6a38ae6c..490376bb53 100644
--- a/doc/spec/proposals/102-drop-opt.txt
+++ b/doc/spec/proposals/102-drop-opt.txt
@@ -1,7 +1,5 @@
Filename: 102-drop-opt.txt
Title: Dropping "opt" from the directory format
-Version: $Revision$
-Last-Modified: $Date$
Author: Nick Mathewson
Created: Jan 2007
Status: Closed
diff --git a/doc/spec/proposals/103-multilevel-keys.txt b/doc/spec/proposals/103-multilevel-keys.txt
index ef51e18047..c8a7a6677b 100644
--- a/doc/spec/proposals/103-multilevel-keys.txt
+++ b/doc/spec/proposals/103-multilevel-keys.txt
@@ -1,7 +1,5 @@
Filename: 103-multilevel-keys.txt
Title: Splitting identity key from regularly used signing key.
-Version: $Revision$
-Last-Modified: $Date$
Author: Nick Mathewson
Created: Jan 2007
Status: Closed
diff --git a/doc/spec/proposals/104-short-descriptors.txt b/doc/spec/proposals/104-short-descriptors.txt
index a1c42c8ff7..90e0764fe6 100644
--- a/doc/spec/proposals/104-short-descriptors.txt
+++ b/doc/spec/proposals/104-short-descriptors.txt
@@ -1,7 +1,5 @@
Filename: 104-short-descriptors.txt
Title: Long and Short Router Descriptors
-Version: $Revision$
-Last-Modified: $Date$
Author: Nick Mathewson
Created: Jan 2007
Status: Closed
diff --git a/doc/spec/proposals/105-handshake-revision.txt b/doc/spec/proposals/105-handshake-revision.txt
index f6c209e71b..791a016c26 100644
--- a/doc/spec/proposals/105-handshake-revision.txt
+++ b/doc/spec/proposals/105-handshake-revision.txt
@@ -1,7 +1,5 @@
Filename: 105-handshake-revision.txt
Title: Version negotiation for the Tor protocol.
-Version: $Revision$
-Last-Modified: $Date$
Author: Nick Mathewson, Roger Dingledine
Created: Jan 2007
Status: Closed
diff --git a/doc/spec/proposals/106-less-tls-constraint.txt b/doc/spec/proposals/106-less-tls-constraint.txt
index 35d6bf1066..7e7621df69 100644
--- a/doc/spec/proposals/106-less-tls-constraint.txt
+++ b/doc/spec/proposals/106-less-tls-constraint.txt
@@ -1,7 +1,5 @@
Filename: 106-less-tls-constraint.txt
Title: Checking fewer things during TLS handshakes
-Version: $Revision$
-Last-Modified: $Date$
Author: Nick Mathewson
Created: 9-Feb-2007
Status: Closed
diff --git a/doc/spec/proposals/107-uptime-sanity-checking.txt b/doc/spec/proposals/107-uptime-sanity-checking.txt
index b11be89380..922129b21d 100644
--- a/doc/spec/proposals/107-uptime-sanity-checking.txt
+++ b/doc/spec/proposals/107-uptime-sanity-checking.txt
@@ -1,7 +1,5 @@
Filename: 107-uptime-sanity-checking.txt
Title: Uptime Sanity Checking
-Version: $Revision$
-Last-Modified: $Date$
Author: Kevin Bauer & Damon McCoy
Created: 8-March-2007
Status: Closed
diff --git a/doc/spec/proposals/108-mtbf-based-stability.txt b/doc/spec/proposals/108-mtbf-based-stability.txt
index 2c66481530..294103760b 100644
--- a/doc/spec/proposals/108-mtbf-based-stability.txt
+++ b/doc/spec/proposals/108-mtbf-based-stability.txt
@@ -1,7 +1,5 @@
Filename: 108-mtbf-based-stability.txt
Title: Base "Stable" Flag on Mean Time Between Failures
-Version: $Revision$
-Last-Modified: $Date$
Author: Nick Mathewson
Created: 10-Mar-2007
Status: Closed
diff --git a/doc/spec/proposals/109-no-sharing-ips.txt b/doc/spec/proposals/109-no-sharing-ips.txt
index 1a88b00c0f..5438cf049a 100644
--- a/doc/spec/proposals/109-no-sharing-ips.txt
+++ b/doc/spec/proposals/109-no-sharing-ips.txt
@@ -1,7 +1,5 @@
Filename: 109-no-sharing-ips.txt
Title: No more than one server per IP address.
-Version: $Revision$
-Last-Modified: $Date$
Author: Kevin Bauer & Damon McCoy
Created: 9-March-2007
Status: Closed
diff --git a/doc/spec/proposals/110-avoid-infinite-circuits.txt b/doc/spec/proposals/110-avoid-infinite-circuits.txt
index 1834cd34a7..fffc41c25a 100644
--- a/doc/spec/proposals/110-avoid-infinite-circuits.txt
+++ b/doc/spec/proposals/110-avoid-infinite-circuits.txt
@@ -1,7 +1,5 @@
Filename: 110-avoid-infinite-circuits.txt
Title: Avoiding infinite length circuits
-Version: $Revision$
-Last-Modified: $Date$
Author: Roger Dingledine
Created: 13-Mar-2007
Status: Accepted
diff --git a/doc/spec/proposals/111-local-traffic-priority.txt b/doc/spec/proposals/111-local-traffic-priority.txt
index f8a37efc94..9411463c21 100644
--- a/doc/spec/proposals/111-local-traffic-priority.txt
+++ b/doc/spec/proposals/111-local-traffic-priority.txt
@@ -1,7 +1,5 @@
Filename: 111-local-traffic-priority.txt
Title: Prioritizing local traffic over relayed traffic
-Version: $Revision$
-Last-Modified: $Date$
Author: Roger Dingledine
Created: 14-Mar-2007
Status: Closed
diff --git a/doc/spec/proposals/112-bring-back-pathlencoinweight.txt b/doc/spec/proposals/112-bring-back-pathlencoinweight.txt
index e7cc6b4e36..3f6c3376f0 100644
--- a/doc/spec/proposals/112-bring-back-pathlencoinweight.txt
+++ b/doc/spec/proposals/112-bring-back-pathlencoinweight.txt
@@ -1,7 +1,5 @@
Filename: 112-bring-back-pathlencoinweight.txt
Title: Bring Back Pathlen Coin Weight
-Version: $Revision$
-Last-Modified: $Date$
Author: Mike Perry
Created:
Status: Superseded
diff --git a/doc/spec/proposals/113-fast-authority-interface.txt b/doc/spec/proposals/113-fast-authority-interface.txt
index 20cf33e429..8912b53220 100644
--- a/doc/spec/proposals/113-fast-authority-interface.txt
+++ b/doc/spec/proposals/113-fast-authority-interface.txt
@@ -1,7 +1,5 @@
Filename: 113-fast-authority-interface.txt
Title: Simplifying directory authority administration
-Version: $Revision$
-Last-Modified: $Date$
Author: Nick Mathewson
Created:
Status: Superseded
diff --git a/doc/spec/proposals/114-distributed-storage.txt b/doc/spec/proposals/114-distributed-storage.txt
index e9271fb82d..91a787d301 100644
--- a/doc/spec/proposals/114-distributed-storage.txt
+++ b/doc/spec/proposals/114-distributed-storage.txt
@@ -1,7 +1,5 @@
Filename: 114-distributed-storage.txt
Title: Distributed Storage for Tor Hidden Service Descriptors
-Version: $Revision$
-Last-Modified: $Date$
Author: Karsten Loesing
Created: 13-May-2007
Status: Closed
diff --git a/doc/spec/proposals/115-two-hop-paths.txt b/doc/spec/proposals/115-two-hop-paths.txt
index ee10d949c4..9854c9ad55 100644
--- a/doc/spec/proposals/115-two-hop-paths.txt
+++ b/doc/spec/proposals/115-two-hop-paths.txt
@@ -1,7 +1,5 @@
Filename: 115-two-hop-paths.txt
Title: Two Hop Paths
-Version: $Revision$
-Last-Modified: $Date$
Author: Mike Perry
Created:
Status: Dead
diff --git a/doc/spec/proposals/116-two-hop-paths-from-guard.txt b/doc/spec/proposals/116-two-hop-paths-from-guard.txt
index 454b344abf..f45625350b 100644
--- a/doc/spec/proposals/116-two-hop-paths-from-guard.txt
+++ b/doc/spec/proposals/116-two-hop-paths-from-guard.txt
@@ -1,7 +1,5 @@
Filename: 116-two-hop-paths-from-guard.txt
Title: Two hop paths from entry guards
-Version: $Revision$
-Last-Modified: $Date$
Author: Michael Lieberman
Created: 26-Jun-2007
Status: Dead
diff --git a/doc/spec/proposals/117-ipv6-exits.txt b/doc/spec/proposals/117-ipv6-exits.txt
index c8402821ed..00cd7cef10 100644
--- a/doc/spec/proposals/117-ipv6-exits.txt
+++ b/doc/spec/proposals/117-ipv6-exits.txt
@@ -1,7 +1,5 @@
Filename: 117-ipv6-exits.txt
Title: IPv6 exits
-Version: $Revision$
-Last-Modified: $Date$
Author: coderman
Created: 10-Jul-2007
Status: Accepted
diff --git a/doc/spec/proposals/118-multiple-orports.txt b/doc/spec/proposals/118-multiple-orports.txt
index 1bef2504d9..2381ec7ca3 100644
--- a/doc/spec/proposals/118-multiple-orports.txt
+++ b/doc/spec/proposals/118-multiple-orports.txt
@@ -1,7 +1,5 @@
Filename: 118-multiple-orports.txt
Title: Advertising multiple ORPorts at once
-Version: $Revision$
-Last-Modified: $Date$
Author: Nick Mathewson
Created: 09-Jul-2007
Status: Accepted
diff --git a/doc/spec/proposals/119-controlport-auth.txt b/doc/spec/proposals/119-controlport-auth.txt
index dc57a27368..9ed1cc1cbe 100644
--- a/doc/spec/proposals/119-controlport-auth.txt
+++ b/doc/spec/proposals/119-controlport-auth.txt
@@ -1,7 +1,5 @@
Filename: 119-controlport-auth.txt
Title: New PROTOCOLINFO command for controllers
-Version: $Revision$
-Last-Modified: $Date$
Author: Roger Dingledine
Created: 14-Aug-2007
Status: Closed
diff --git a/doc/spec/proposals/120-shutdown-descriptors.txt b/doc/spec/proposals/120-shutdown-descriptors.txt
index dc1265b03b..5cfe2b5bc6 100644
--- a/doc/spec/proposals/120-shutdown-descriptors.txt
+++ b/doc/spec/proposals/120-shutdown-descriptors.txt
@@ -1,7 +1,5 @@
Filename: 120-shutdown-descriptors.txt
Title: Shutdown descriptors when Tor servers stop
-Version: $Revision$
-Last-Modified: $Date$
Author: Roger Dingledine
Created: 15-Aug-2007
Status: Dead
diff --git a/doc/spec/proposals/121-hidden-service-authentication.txt b/doc/spec/proposals/121-hidden-service-authentication.txt
index 828bf3c92d..0d92b53a8c 100644
--- a/doc/spec/proposals/121-hidden-service-authentication.txt
+++ b/doc/spec/proposals/121-hidden-service-authentication.txt
@@ -1,7 +1,5 @@
Filename: 121-hidden-service-authentication.txt
Title: Hidden Service Authentication
-Version: $Revision$
-Last-Modified: $Date$
Author: Tobias Kamm, Thomas Lauterbach, Karsten Loesing, Ferdinand Rieger,
Christoph Weingarten
Created: 10-Sep-2007
diff --git a/doc/spec/proposals/122-unnamed-flag.txt b/doc/spec/proposals/122-unnamed-flag.txt
index 6502b9c560..2ce7bb22b9 100644
--- a/doc/spec/proposals/122-unnamed-flag.txt
+++ b/doc/spec/proposals/122-unnamed-flag.txt
@@ -1,7 +1,5 @@
Filename: 122-unnamed-flag.txt
Title: Network status entries need a new Unnamed flag
-Version: $Revision$
-Last-Modified: $Date$
Author: Roger Dingledine
Created: 04-Oct-2007
Status: Closed
diff --git a/doc/spec/proposals/123-autonaming.txt b/doc/spec/proposals/123-autonaming.txt
index 6cd25329f8..74c486985d 100644
--- a/doc/spec/proposals/123-autonaming.txt
+++ b/doc/spec/proposals/123-autonaming.txt
@@ -1,7 +1,5 @@
Filename: 123-autonaming.txt
Title: Naming authorities automatically create bindings
-Version: $Revision$
-Last-Modified: $Date$
Author: Peter Palfrader
Created: 2007-10-11
Status: Closed
diff --git a/doc/spec/proposals/124-tls-certificates.txt b/doc/spec/proposals/124-tls-certificates.txt
index 0a47772732..9472d14af8 100644
--- a/doc/spec/proposals/124-tls-certificates.txt
+++ b/doc/spec/proposals/124-tls-certificates.txt
@@ -1,7 +1,5 @@
Filename: 124-tls-certificates.txt
Title: Blocking resistant TLS certificate usage
-Version: $Revision$
-Last-Modified: $Date$
Author: Steven J. Murdoch
Created: 2007-10-25
Status: Superseded
diff --git a/doc/spec/proposals/125-bridges.txt b/doc/spec/proposals/125-bridges.txt
index 8bb3169780..9d95729d42 100644
--- a/doc/spec/proposals/125-bridges.txt
+++ b/doc/spec/proposals/125-bridges.txt
@@ -1,7 +1,5 @@
Filename: 125-bridges.txt
Title: Behavior for bridge users, bridge relays, and bridge authorities
-Version: $Revision$
-Last-Modified: $Date$
Author: Roger Dingledine
Created: 11-Nov-2007
Status: Closed
diff --git a/doc/spec/proposals/126-geoip-reporting.txt b/doc/spec/proposals/126-geoip-reporting.txt
index d48a08ba38..9f3b21c670 100644
--- a/doc/spec/proposals/126-geoip-reporting.txt
+++ b/doc/spec/proposals/126-geoip-reporting.txt
@@ -1,7 +1,5 @@
Filename: 126-geoip-reporting.txt
Title: Getting GeoIP data and publishing usage summaries
-Version: $Revision$
-Last-Modified: $Date$
Author: Roger Dingledine
Created: 2007-11-24
Status: Closed
diff --git a/doc/spec/proposals/127-dirport-mirrors-downloads.txt b/doc/spec/proposals/127-dirport-mirrors-downloads.txt
index 1b55a02d61..72d6c0cb9f 100644
--- a/doc/spec/proposals/127-dirport-mirrors-downloads.txt
+++ b/doc/spec/proposals/127-dirport-mirrors-downloads.txt
@@ -1,7 +1,5 @@
Filename: 127-dirport-mirrors-downloads.txt
Title: Relaying dirport requests to Tor download site / website
-Version: $Revision$
-Last-Modified: $Date$
Author: Roger Dingledine
Created: 2007-12-02
Status: Draft
diff --git a/doc/spec/proposals/128-bridge-families.txt b/doc/spec/proposals/128-bridge-families.txt
index e8a0050c3c..e5bdcf95cb 100644
--- a/doc/spec/proposals/128-bridge-families.txt
+++ b/doc/spec/proposals/128-bridge-families.txt
@@ -1,7 +1,5 @@
Filename: 128-bridge-families.txt
Title: Families of private bridges
-Version: $Revision$
-Last-Modified: $Date$
Author: Roger Dingledine
Created: 2007-12-xx
Status: Dead
diff --git a/doc/spec/proposals/129-reject-plaintext-ports.txt b/doc/spec/proposals/129-reject-plaintext-ports.txt
index d4767d03d8..8080ff5b75 100644
--- a/doc/spec/proposals/129-reject-plaintext-ports.txt
+++ b/doc/spec/proposals/129-reject-plaintext-ports.txt
@@ -1,7 +1,5 @@
Filename: 129-reject-plaintext-ports.txt
Title: Block Insecure Protocols by Default
-Version: $Revision$
-Last-Modified: $Date$
Author: Kevin Bauer & Damon McCoy
Created: 2008-01-15
Status: Closed
diff --git a/doc/spec/proposals/130-v2-conn-protocol.txt b/doc/spec/proposals/130-v2-conn-protocol.txt
index 16f5bf2844..60e742a622 100644
--- a/doc/spec/proposals/130-v2-conn-protocol.txt
+++ b/doc/spec/proposals/130-v2-conn-protocol.txt
@@ -1,7 +1,5 @@
Filename: 130-v2-conn-protocol.txt
Title: Version 2 Tor connection protocol
-Version: $Revision$
-Last-Modified: $Date$
Author: Nick Mathewson
Created: 2007-10-25
Status: Closed
diff --git a/doc/spec/proposals/131-verify-tor-usage.txt b/doc/spec/proposals/131-verify-tor-usage.txt
index 2687139189..d3c6efe75a 100644
--- a/doc/spec/proposals/131-verify-tor-usage.txt
+++ b/doc/spec/proposals/131-verify-tor-usage.txt
@@ -1,7 +1,5 @@
Filename: 131-verify-tor-usage.txt
Title: Help users to verify they are using Tor
-Version: $Revision$
-Last-Modified: $Date$
Author: Steven J. Murdoch
Created: 2008-01-25
Status: Needs-Revision
diff --git a/doc/spec/proposals/132-browser-check-tor-service.txt b/doc/spec/proposals/132-browser-check-tor-service.txt
index d07a10dcde..6132e5d060 100644
--- a/doc/spec/proposals/132-browser-check-tor-service.txt
+++ b/doc/spec/proposals/132-browser-check-tor-service.txt
@@ -1,7 +1,5 @@
Filename: 132-browser-check-tor-service.txt
Title: A Tor Web Service For Verifying Correct Browser Configuration
-Version: $Revision$
-Last-Modified: $Date$
Author: Robert Hogan
Created: 2008-03-08
Status: Draft
diff --git a/doc/spec/proposals/134-robust-voting.txt b/doc/spec/proposals/134-robust-voting.txt
index 5d5e77fa3b..c5dfb3b47f 100644
--- a/doc/spec/proposals/134-robust-voting.txt
+++ b/doc/spec/proposals/134-robust-voting.txt
@@ -2,8 +2,10 @@ Filename: 134-robust-voting.txt
Title: More robust consensus voting with diverse authority sets
Author: Peter Palfrader
Created: 2008-04-01
-Status: Accepted
-Target: 0.2.2.x
+Status: Rejected
+
+History:
+ 2009 May 27: Added note on rejecting this proposal -- Nick
Overview:
@@ -103,3 +105,19 @@ Possible Attacks/Open Issues/Some thinking required:
Q: Can this ever force us to build a consensus with authorities we do not
recognize?
A: No, we can never build a fully connected set with them in step 3.
+
+------------------------------
+
+I'm rejecting this proposal as insecure.
+
+Suppose that we have a clique of size N, and M hostile members in the
+clique. If these hostile members stop declaring trust for up to M-1
+good members of the clique, the clique with the hostile members will
+in it will be larger than the one without them.
+
+The M hostile members will constitute a majority of this new clique
+when M > (N-(M-1)) / 2, or when M > (N + 1) / 3. This breaks our
+requirement that an adversary must compromise a majority of authorities
+in order to control the consensus.
+
+-- Nick
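The inequality in the note above follows from 2M > N - (M - 1), which rearranges to 3M > N + 1. The short check below, written with illustrative names, confirms that the two forms agree over a range of clique sizes.

```python
def hostile_majority(n, m):
    """True if m hostile members outnumber the rest of the shrunken clique.

    The shrunken clique has n - (m - 1) members: the original n minus
    the m - 1 good members that the hostile members stopped trusting.
    """
    remaining = n - (m - 1)
    return 2 * m > remaining


def matches_closed_form(max_n=50):
    """Check 2M > N - (M - 1) against 3M > N + 1 for all 1 <= M <= N <= max_n."""
    return all(
        hostile_majority(n, m) == (3 * m > n + 1)
        for n in range(1, max_n + 1)
        for m in range(1, n + 1)
    )
```

For instance, with N = 7 and M = 3 the shrunken clique has 5 members, 3 of them hostile, even though 3 of 7 is not a majority of the original clique.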
diff --git a/doc/spec/proposals/135-private-tor-networks.txt b/doc/spec/proposals/135-private-tor-networks.txt
index 131bbb9068..19ef68b7b1 100644
--- a/doc/spec/proposals/135-private-tor-networks.txt
+++ b/doc/spec/proposals/135-private-tor-networks.txt
@@ -1,7 +1,5 @@
Filename: 135-private-tor-networks.txt
Title: Simplify Configuration of Private Tor Networks
-Version: $Revision$
-Last-Modified: $Date$
Author: Karsten Loesing
Created: 29-Apr-2008
Status: Closed
diff --git a/doc/spec/proposals/137-bootstrap-phases.txt b/doc/spec/proposals/137-bootstrap-phases.txt
index 18d3dfae12..ebe044c707 100644
--- a/doc/spec/proposals/137-bootstrap-phases.txt
+++ b/doc/spec/proposals/137-bootstrap-phases.txt
@@ -1,7 +1,5 @@
Filename: 137-bootstrap-phases.txt
Title: Keep controllers informed as Tor bootstraps
-Version: $Revision$
-Last-Modified: $Date$
Author: Roger Dingledine
Created: 07-Jun-2008
Status: Closed
diff --git a/doc/spec/proposals/138-remove-down-routers-from-consensus.txt b/doc/spec/proposals/138-remove-down-routers-from-consensus.txt
index a07764d536..776911b5c9 100644
--- a/doc/spec/proposals/138-remove-down-routers-from-consensus.txt
+++ b/doc/spec/proposals/138-remove-down-routers-from-consensus.txt
@@ -1,7 +1,5 @@
Filename: 138-remove-down-routers-from-consensus.txt
Title: Remove routers that are not Running from consensus documents
-Version: $Revision$
-Last-Modified: $Date$
Author: Peter Palfrader
Created: 11-Jun-2008
Status: Closed
diff --git a/doc/spec/proposals/140-consensus-diffs.txt b/doc/spec/proposals/140-consensus-diffs.txt
index da63bfe23c..8bc4070bfe 100644
--- a/doc/spec/proposals/140-consensus-diffs.txt
+++ b/doc/spec/proposals/140-consensus-diffs.txt
@@ -1,12 +1,15 @@
Filename: 140-consensus-diffs.txt
Title: Provide diffs between consensuses
-Version: $Revision$
-Last-Modified: $Date$
Author: Peter Palfrader
Created: 13-Jun-2008
Status: Accepted
Target: 0.2.2.x
+0. History
+
+ 22-May-2009: Restricted the ed format even more strictly for ease of
+ implementation. -nickm
+
1. Overview.
Tor clients and servers need a list of which relays are on the
@@ -135,6 +138,10 @@ Target: 0.2.2.x
Note that line numbers always apply to the file after all previous
commands have already been applied.
+ The commands MUST apply to the file from back to front, such that
+ lines are only ever referred to by their position in the original
+ file.
+
The "current line" is either the first line of the file, if this is
the first command, the last line of a block we added in an append or
change command, or the line immediately following a set of lines we just
diff --git a/doc/spec/proposals/141-jit-sd-downloads.txt b/doc/spec/proposals/141-jit-sd-downloads.txt
index b0c2b2cbcd..2ac7a086b7 100644
--- a/doc/spec/proposals/141-jit-sd-downloads.txt
+++ b/doc/spec/proposals/141-jit-sd-downloads.txt
@@ -1,7 +1,5 @@
Filename: 141-jit-sd-downloads.txt
Title: Download server descriptors on demand
-Version: $Revision$
-Last-Modified: $Date$
Author: Peter Palfrader
Created: 15-Jun-2008
Status: Draft
@@ -63,8 +61,8 @@ Status: Draft
which tries to convey a server's capacity to clients.
Currently we weigh servers differently for different purposes. There
- is a weigh for when we use a server as a guard node (our entry to the
- Tor network), there is one weigh we assign servers for exit duties,
+ is a weight for when we use a server as a guard node (our entry to the
+ Tor network), there is one weight we assign servers for exit duties,
and a third for when we need intermediate (middle) nodes.
2.2 Exit information
@@ -80,7 +78,7 @@ Status: Draft
2.3 Capability information
- Server descriptors contain information about the specific version or
+ Server descriptors contain information about the specific version of
the Tor protocol they understand [proposal 105].
Furthermore the server descriptor also contains the exact version of
diff --git a/doc/spec/proposals/142-combine-intro-and-rend-points.txt b/doc/spec/proposals/142-combine-intro-and-rend-points.txt
index 3456b285a9..3abd5c863d 100644
--- a/doc/spec/proposals/142-combine-intro-and-rend-points.txt
+++ b/doc/spec/proposals/142-combine-intro-and-rend-points.txt
@@ -1,7 +1,5 @@
Filename: 142-combine-intro-and-rend-points.txt
Title: Combine Introduction and Rendezvous Points
-Version: $Revision$
-Last-Modified: $Date$
Author: Karsten Loesing, Christian Wilms
Created: 27-Jun-2008
Status: Dead
diff --git a/doc/spec/proposals/143-distributed-storage-improvements.txt b/doc/spec/proposals/143-distributed-storage-improvements.txt
index 8789d84663..0f7468f1dc 100644
--- a/doc/spec/proposals/143-distributed-storage-improvements.txt
+++ b/doc/spec/proposals/143-distributed-storage-improvements.txt
@@ -1,7 +1,5 @@
Filename: 143-distributed-storage-improvements.txt
Title: Improvements of Distributed Storage for Tor Hidden Service Descriptors
-Version: $Revision$
-Last-Modified: $Date$
Author: Karsten Loesing
Created: 28-Jun-2008
Status: Open
diff --git a/doc/spec/proposals/145-newguard-flag.txt b/doc/spec/proposals/145-newguard-flag.txt
index 31d707d725..9e61e30be9 100644
--- a/doc/spec/proposals/145-newguard-flag.txt
+++ b/doc/spec/proposals/145-newguard-flag.txt
@@ -1,7 +1,5 @@
Filename: 145-newguard-flag.txt
Title: Separate "suitable as a guard" from "suitable as a new guard"
-Version: $Revision$
-Last-Modified: $Date$
Author: Nick Mathewson
Created: 1-Jul-2008
Status: Open
diff --git a/doc/spec/proposals/146-long-term-stability.txt b/doc/spec/proposals/146-long-term-stability.txt
index 7cfd58f564..9af0017441 100644
--- a/doc/spec/proposals/146-long-term-stability.txt
+++ b/doc/spec/proposals/146-long-term-stability.txt
@@ -1,7 +1,5 @@
Filename: 146-long-term-stability.txt
Title: Add new flag to reflect long-term stability
-Version: $Revision$
-Last-Modified: $Date$
Author: Nick Mathewson
Created: 19-Jun-2008
Status: Open
diff --git a/doc/spec/proposals/147-prevoting-opinions.txt b/doc/spec/proposals/147-prevoting-opinions.txt
index 2b8cf30e46..3d9659c984 100644
--- a/doc/spec/proposals/147-prevoting-opinions.txt
+++ b/doc/spec/proposals/147-prevoting-opinions.txt
@@ -1,7 +1,5 @@
Filename: 147-prevoting-opinions.txt
Title: Eliminate the need for v2 directories in generating v3 directories
-Version: $Revision$
-Last-Modified: $Date$
Author: Nick Mathewson
Created: 2-Jul-2008
Status: Accepted
diff --git a/doc/spec/proposals/148-uniform-client-end-reason.txt b/doc/spec/proposals/148-uniform-client-end-reason.txt
index cec81253ea..1db3b3e596 100644
--- a/doc/spec/proposals/148-uniform-client-end-reason.txt
+++ b/doc/spec/proposals/148-uniform-client-end-reason.txt
@@ -1,7 +1,5 @@
Filename: 148-uniform-client-end-reason.txt
Title: Stream end reasons from the client side should be uniform
-Version: $Revision$
-Last-Modified: $Date$
Author: Roger Dingledine
Created: 2-Jul-2008
Status: Closed
diff --git a/doc/spec/proposals/149-using-netinfo-data.txt b/doc/spec/proposals/149-using-netinfo-data.txt
index 4919514b4c..8bf8375d5d 100644
--- a/doc/spec/proposals/149-using-netinfo-data.txt
+++ b/doc/spec/proposals/149-using-netinfo-data.txt
@@ -1,7 +1,5 @@
Filename: 149-using-netinfo-data.txt
Title: Using data from NETINFO cells
-Version: $Revision$
-Last-Modified: $Date$
Author: Nick Mathewson
Created: 2-Jul-2008
Status: Open
@@ -24,14 +22,14 @@ Motivation
idea of their own IP addresses, so they can publish correct
descriptors. This is also in NETINFO cells.
-Learning the time and IP
+Learning the time and IP address
We need to think about attackers here. Just because a router tells
us that we have a given IP or a given clock skew doesn't mean that
it's true. We believe this information only if we've heard it from
a majority of the routers we've connected to recently, including at
least 3 routers. Routers only believe this information if the
- majority inclues at least one authority.
+ majority includes at least one authority.
Avoiding MITM attacks
diff --git a/doc/spec/proposals/150-exclude-exit-nodes.txt b/doc/spec/proposals/150-exclude-exit-nodes.txt
index b73a9cc4d1..b497ae62c1 100644
--- a/doc/spec/proposals/150-exclude-exit-nodes.txt
+++ b/doc/spec/proposals/150-exclude-exit-nodes.txt
@@ -1,6 +1,5 @@
Filename: 150-exclude-exit-nodes.txt
Title: Exclude Exit Nodes from a circuit
-Version: $Revision$
Author: Mfr
Created: 2008-06-15
Status: Closed
diff --git a/doc/spec/proposals/151-path-selection-improvements.txt b/doc/spec/proposals/151-path-selection-improvements.txt
index e3c8f35451..af89f21193 100644
--- a/doc/spec/proposals/151-path-selection-improvements.txt
+++ b/doc/spec/proposals/151-path-selection-improvements.txt
@@ -1,16 +1,15 @@
Filename: 151-path-selection-improvements.txt
Title: Improving Tor Path Selection
-Version:
-Last-Modified:
Author: Fallon Chen, Mike Perry
Created: 5-Jul-2008
-Status: Draft
+Status: Finished
+In-Spec: path-spec.txt
Overview
The performance of paths selected can be improved by adjusting the
CircuitBuildTimeout and avoiding failing guard nodes. This proposal
- describes a method of tracking buildtime statistics at the client, and
+ describes a method of tracking buildtime statistics at the client, and
using those statistics to adjust the CircuitBuildTimeout.
Motivation
@@ -22,121 +21,123 @@ Motivation
Implementation
- Storing Build Times
+ Gathering Build Times
- Circuit build times will be stored in the circular array
- 'circuit_build_times' consisting of uint16_t elements as milliseconds.
- The total size of this array will be based on the number of circuits
+ Circuit build times are stored in the circular array
+ 'circuit_build_times' consisting of uint32_t elements as milliseconds.
+ The total size of this array is based on the number of circuits
it takes to converge on a good fit of the long term distribution of
the circuit builds for a fixed link. We do not want this value to be
too large, because it will make it difficult for clients to adapt to
moving between different links.
- From our initial observations, this value appears to be on the order
- of 1000, but will be configurable in a #define NCIRCUITS_TO_OBSERVE.
- The exact value for this #define will be determined by performing
- goodness of fit tests using measurments obtained from the shufflebt.py
- script from TorFlow.
-
+ From our observations, the minimum value for a reasonable fit appears
+ to be on the order of 500 (MIN_CIRCUITS_TO_OBSERVE). However, to keep
+ a good fit over the long term, we store the 5000 most recent circuits in
+ the array (NCIRCUITS_TO_OBSERVE).
+
+ The Tor client will build test circuits at a rate of one per
+ minute (BUILD_TIMES_TEST_FREQUENCY) up to the point of
+ MIN_CIRCUITS_TO_OBSERVE. This allows a fresh Tor to have
+ a CircuitBuildTimeout estimated within 8 hours after install,
+ upgrade, or network change (see below).
+
Long Term Storage
- The long-term storage representation will be implemented by storing a
- histogram with BUILDTIME_BIN_WIDTH millisecond buckets (default 50) when
- writing out the statistics to disk. The format of this histogram on disk
- is yet to be finalized, but it will likely be of the format
- 'CircuitBuildTime <bin> <count>', with the total specified as
- 'TotalBuildTimes <total>'
+ The long-term storage representation is implemented by storing a
+ histogram with BUILDTIME_BIN_WIDTH millisecond buckets (default 50) when
+ writing out the statistics to disk. The format this takes in the
+ state file is 'CircuitBuildTimeBin <bin-ms> <count>', with the total
+ specified as 'TotalBuildTimes <total>'
Example:
TotalBuildTimes 100
- CircuitBuildTimeBin 1 50
- CircuitBuildTimeBin 2 25
- CircuitBuildTimeBin 3 13
+ CircuitBuildTimeBin 25 50
+ CircuitBuildTimeBin 75 25
+ CircuitBuildTimeBin 125 13
...
- Reading the histogram in will entail multiplying each bin by the
- BUILDTIME_BIN_WIDTH and then inserting <count> values into the
- circuit_build_times array each with the value of
- <bin>*BUILDTIME_BIN_WIDTH. In order to evenly distribute the
- values in the circular array, a form of index skipping must
- be employed. Values from bin #N with bin count C and total T
- will occupy indexes specified by N+((T/C)*k)-1, where k is the
- set of integers ranging from 0 to C-1.
-
- For example, this would mean that the values from bin 1 would
- occupy indexes 1+(100/50)*k-1, or 0, 2, 4, 6, 8, 10 and so on.
- The values for bin 2 would occupy positions 1, 5, 9, 13. Collisions
- will be inserted at the first empty position in the array greater
- than the selected index (which may requiring looping around the
- array back to index 0).
+ Reading the histogram in will entail inserting <count> values
+ into the circuit_build_times array each with the value of
+ <bin-ms> milliseconds. In order to evenly distribute the values
+ in the circular array, the Fisher-Yates shuffle will be performed
+ after reading values from the bins.
Learning the CircuitBuildTimeout
Based on studies of build times, we found that the distribution of
- circuit buildtimes appears to be a Pareto distribution.
+ circuit buildtimes appears to be a Frechet distribution. However,
+ estimators and quantile functions of the Frechet distribution are
+ difficult to work with and slow to converge. So instead, since we
+ are only interested in the accuracy of the tail, we approximate
+ the tail of the distribution with a Pareto curve starting at
+ the mode of the circuit build time sample set.
We will calculate the parameters for a Pareto distribution
fitting the data using the estimators at
http://en.wikipedia.org/wiki/Pareto_distribution#Parameter_estimation.
- The timeout itself will be calculated by solving the CDF for the
- a percentile cutoff BUILDTIME_PERCENT_CUTOFF. This value
- represents the percentage of paths the Tor client will accept out of
- the total number of paths. We have not yet determined a good
- cutoff for this mathematically, but 85% seems a good choice for now.
+ The timeout itself is calculated by using the quantile function (the
+ inverse CDF) to give us the value on the CDF such that
+ BUILDTIME_PERCENT_CUTOFF (80%) of the mass of the distribution is
+ below the timeout value.
+
+ Thus, we expect that the Tor client will accept the fastest 80% of
+ the total number of paths on the network.
+
+ Detecting Changing Network Conditions
- From http://en.wikipedia.org/wiki/Pareto_distribution#Definition,
- the calculation we need is pow(BUILDTIME_PERCENT_CUTOFF/100.0, k)/Xm.
+ We attempt to detect both network connectivity loss and drastic
+ changes in the timeout characteristics.
+
+ We assume that we've had network connectivity loss if 3 circuits
+ time out and we've received no cells or TLS handshakes since those
+ circuits began. We then set the timeout to 60 seconds and stop
+ counting timeouts.
+
+ If 3 more circuits time out and the network still has not been
+ live within this new 60 second timeout window, we then discard
+ the previous timeouts during this period from our history.
+
+ To detect changing network conditions, we keep a history of
+ the timeout or non-timeout status of the past RECENT_CIRCUITS (20)
+ that successfully completed at least one hop. If more than 75%
+ of these circuits time out, we discard all buildtimes history,
+ reset the timeout to 60 seconds, and then begin recomputing the timeout.
Testing
After circuit build times, storage, and learning are implemented,
the resulting histogram should be checked for consistency by
- verifying it persists across successive Tor invocations where
+ verifying it persists across successive Tor invocations where
no circuits are built. In addition, we can also use the existing
- buildtime scripts to record build times, and verify that the histogram
+ buildtime scripts to record build times, and verify that the histogram
the python produces matches that which is output to the state file in Tor,
and verify that the Pareto parameters and cutoff points also match.
-
- Soft timeout vs Hard Timeout
-
- At some point, it may be desirable to change the cutoff from a
- single hard cutoff that destroys the circuit to a soft cutoff and
- a hard cutoff, where the soft cutoff merely triggers the building
- of a new circuit, and the hard cutoff triggers destruction of the
- circuit.
-
- Good values for hard and soft cutoffs seem to be 85% and 65%
- respectively, but we should eventually justify this with observation.
-
- When to Begin Calculation
- The number of circuits to observe (NCIRCUITS_TO_CUTOFF) before
- changing the CircuitBuildTimeout will be tunable via a #define. From
- our measurements, a good value for NCIRCUITS_TO_CUTOFF appears to be
- on the order of 100.
+ We will also verify that there are no unexpected large deviations from
+ node selection, such as nodes from distant geographical locations being
+ completely excluded.
Dealing with Timeouts
- Timeouts should be counted as the expectation of the region of
- of the Pareto distribution beyond the cutoff. The proposal will
- be updated with this value soon.
+ Timeouts should be counted as the expectation of the region of
+ the Pareto distribution beyond the cutoff. This is done by
+ generating a random sample for each timeout at points on the
+ curve beyond the current timeout cutoff.
- Also, in the event of network failure, the observation mechanism
- should stop collecting timeout data.
+ Future Work
- Client Hints
+ At some point, it may be desirable to change the cutoff from a
+ single hard cutoff that destroys the circuit to a soft cutoff and
+ a hard cutoff, where the soft cutoff merely triggers the building
+ of a new circuit, and the hard cutoff triggers destruction of the
+ circuit.
- Some research still needs to be done to provide initial values
- for CircuitBuildTimeout based on values learned from modem
- users, DSL users, Cable Modem users, and dedicated links. A
- radiobutton in Vidalia should eventually be provided that
- sets CircuitBuildTimeout to one of these values and also
- provide the option of purging all learned data, should any exist.
+ It may also be beneficial to learn separate timeouts for each
+ guard node, as they will have slightly different distributions.
+ This will take longer to generate initial values though.
- These values can either be published in the directory, or
- shipped hardcoded for a particular Tor version.
-
Issues
Impact on anonymity
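
As an illustration of the revised proposal 151 text in the hunk above, the following is a minimal sketch of two of its steps: expanding the state-file histogram into individual build times with a Fisher-Yates shuffle, and deriving the timeout from the Pareto quantile function. The function names, the choice to pass Xm (the mode) in as a parameter, and the decision to ignore samples below Xm when estimating alpha are assumptions made for this sketch; they are not taken from the Tor source.

```python
import math
import random


def load_histogram(bins, rng=random):
    """Expand (bin_ms, count) pairs into build times, then shuffle them.

    Hypothetical helper: 'bins' mirrors the 'CircuitBuildTimeBin' lines
    of the state file as (bin-ms, count) tuples.
    """
    times = [bin_ms for bin_ms, count in bins for _ in range(count)]
    # Fisher-Yates shuffle: walk from the end, swapping each slot with a
    # uniformly chosen slot at or before it.
    for i in range(len(times) - 1, 0, -1):
        j = rng.randint(0, i)
        times[i], times[j] = times[j], times[i]
    return times


def pareto_timeout(times, xm, cutoff=0.80):
    """Return the build time below which 'cutoff' of the Pareto mass lies.

    Assumption for this sketch: only samples >= xm contribute to the
    maximum-likelihood estimate alpha = n / sum(ln(x_i / xm)).
    """
    tail = [t for t in times if t >= xm]
    log_sum = sum(math.log(t / xm) for t in tail)
    if log_sum <= 0:
        raise ValueError("need at least one sample strictly above xm")
    alpha = len(tail) / log_sum
    # Pareto quantile function: Q(p) = xm / (1 - p)^(1 / alpha)
    return xm / (1.0 - cutoff) ** (1.0 / alpha)
```

A caller would pass the mode of the sample set as xm and 0.80 for BUILDTIME_PERCENT_CUTOFF. A higher cutoff yields a longer timeout, since more of the tail must fall below it.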
diff --git a/doc/spec/proposals/152-single-hop-circuits.txt b/doc/spec/proposals/152-single-hop-circuits.txt
index e49a4250e0..d0b28b1c72 100644
--- a/doc/spec/proposals/152-single-hop-circuits.txt
+++ b/doc/spec/proposals/152-single-hop-circuits.txt
@@ -1,7 +1,5 @@
Filename: 152-single-hop-circuits.txt
Title: Optionally allow exit from single-hop circuits
-Version:
-Last-Modified:
Author: Geoff Goodell
Created: 13-Jul-2008
Status: Closed
diff --git a/doc/spec/proposals/153-automatic-software-update-protocol.txt b/doc/spec/proposals/153-automatic-software-update-protocol.txt
index 7bc809d440..c2979bb695 100644
--- a/doc/spec/proposals/153-automatic-software-update-protocol.txt
+++ b/doc/spec/proposals/153-automatic-software-update-protocol.txt
@@ -1,7 +1,5 @@
Filename: 153-automatic-software-update-protocol.txt
Title: Automatic software update protocol
-Version: $Revision$
-Last-Modified: $Date$
Author: Jacob Appelbaum
Created: 14-July-2008
Status: Superseded
diff --git a/doc/spec/proposals/154-automatic-updates.txt b/doc/spec/proposals/154-automatic-updates.txt
index 00a820de08..4c2c6d3899 100644
--- a/doc/spec/proposals/154-automatic-updates.txt
+++ b/doc/spec/proposals/154-automatic-updates.txt
@@ -1,7 +1,5 @@
Filename: 154-automatic-updates.txt
Title: Automatic Software Update Protocol
-Version: $Revision$
-Last-Modified: $Date$
Author: Matt Edman
Created: 30-July-2008
Status: Superseded
diff --git a/doc/spec/proposals/155-four-hidden-service-improvements.txt b/doc/spec/proposals/155-four-hidden-service-improvements.txt
index f528f8baf2..e342bf1c39 100644
--- a/doc/spec/proposals/155-four-hidden-service-improvements.txt
+++ b/doc/spec/proposals/155-four-hidden-service-improvements.txt
@@ -1,7 +1,5 @@
Filename: 155-four-hidden-service-improvements.txt
Title: Four Improvements of Hidden Service Performance
-Version: $Revision$
-Last-Modified: $Date$
Author: Karsten Loesing, Christian Wilms
Created: 25-Sep-2008
Status: Finished
diff --git a/doc/spec/proposals/156-tracking-blocked-ports.txt b/doc/spec/proposals/156-tracking-blocked-ports.txt
index 1e7b0d963f..419de7e74c 100644
--- a/doc/spec/proposals/156-tracking-blocked-ports.txt
+++ b/doc/spec/proposals/156-tracking-blocked-ports.txt
@@ -1,7 +1,5 @@
Filename: 156-tracking-blocked-ports.txt
Title: Tracking blocked ports on the client side
-Version: $Revision$
-Last-Modified: $Date$
Author: Robert Hogan
Created: 14-Oct-2008
Status: Open
diff --git a/doc/spec/proposals/157-specific-cert-download.txt b/doc/spec/proposals/157-specific-cert-download.txt
index e54a987277..204b20973a 100644
--- a/doc/spec/proposals/157-specific-cert-download.txt
+++ b/doc/spec/proposals/157-specific-cert-download.txt
@@ -1,7 +1,5 @@
Filename: 157-specific-cert-download.txt
Title: Make certificate downloads specific
-Version: $Revision$
-Last-Modified: $Date$
Author: Nick Mathewson
Created: 2-Dec-2008
Status: Accepted
diff --git a/doc/spec/proposals/158-microdescriptors.txt b/doc/spec/proposals/158-microdescriptors.txt
index f478a3c834..e6966c0cef 100644
--- a/doc/spec/proposals/158-microdescriptors.txt
+++ b/doc/spec/proposals/158-microdescriptors.txt
@@ -1,11 +1,20 @@
Filename: 158-microdescriptors.txt
Title: Clients download consensus + microdescriptors
-Version: $Revision$
-Last-Modified: $Date$
Author: Roger Dingledine
Created: 17-Jan-2009
Status: Open
+0. History
+
+ 15 May 2009: Substantially revised based on discussions on or-dev
+ from late January. Removed the notion of voting on how to choose
+ microdescriptors; made it just a function of the consensus method.
+ (This lets us avoid the possibility of "desynchronization.")
+ Added suggestion to use a new consensus flavor. Specified use of
+ SHA256 for new hashes. -nickm
+
+ 15 June 2009: Cleaned up based on comments from Roger. -nickm
+
1. Overview
This proposal replaces section 3.2 of proposal 141, which was
@@ -13,9 +22,7 @@ Status: Open
circuit-building protocol to fetch a server descriptor inline at each
circuit extend, we instead put all of the information that clients need
either into the consensus itself, or into a new set of data about each
- relay called a microdescriptor. The microdescriptor is a direct
- transform from the relay descriptor, so relays don't even need to know
- this is happening.
+ relay called a microdescriptor.
Descriptor elements that are small and frequently changing should go
in the consensus itself, and descriptor elements that are small and
@@ -24,6 +31,10 @@ Status: Open
them, we'll need to resume considering some design like the one in
proposal 141.
+ Note also that any descriptor element which clients need to use to
+ decide which servers to fetch info about, or which servers to fetch
+ info from, needs to stay in the consensus.
+
2. Motivation
See
@@ -36,99 +47,91 @@ Status: Open
3. Design
There are three pieces to the proposal. First, authorities will list in
- their votes (and thus in the consensus) what relay descriptor elements
- are included in the microdescriptor, and also list the expected hash
- of microdescriptor for each relay. Second, directory mirrors will serve
- microdescriptors. Third, clients will ask for them and cache them.
+ their votes (and thus in the consensus) the expected hash of the
+ microdescriptor for each relay. Second, authorities will serve
+ microdescriptors, and directory mirrors will cache and serve
+ them. Third, clients will ask for them and cache them.
3.1. Consensus changes
- V3 votes should include a new line:
- microdescriptor-elements bar baz foo
- listing each descriptor element (sorted alphabetically) that authority
- included when it calculated its expected microdescriptor hashes.
+ If the authorities choose a consensus method of a given version or
+ later, a microdescriptor format is implicit in that version.
+ A microdescriptor should in every case be a pure function of the
+ router descriptor and the consensus method.
+
+ In votes, we need to include the hash of each expected microdescriptor
+ in the routerstatus section. I suggest a new "m" line for each stanza,
+ with the base64 of the SHA256 hash of the router's microdescriptor.
+
+ For every consensus method that an authority supports, it includes a
+ separate "m" line in each router section of its vote, containing:
+ "m" SP methods 1*(SP AlgorithmName "=" digest) NL
+ where methods is a comma-separated list of the consensus methods
+ that the authority believes will produce "digest".
- We also need to include the hash of each expected microdescriptor in
- the routerstatus section. I suggest a new "m" line for each stanza,
- with the base64 of the hash of the elements that the authority voted
- for above.
+ (As with base64 encoding of SHA1 hashes in consensuses, let's
+ omit the trailing =s)
The consensus microdescriptor-elements and "m" lines are then computed
as described in Section 3.1.2 below.
- I believe that means we need a new consensus-method "6" that knows
- how to compute the microdescriptor-elements and add "m" lines.
+ (This means we need a new consensus-method that knows
+ how to compute the microdescriptor-elements and add "m" lines.)
-3.1.1. Descriptor elements to include for now
+ The microdescriptor consensus uses the directory-signature format from
+ proposal 162, with the "sha256" algorithm.
- To start, the element list that authorities suggest should be
- family onion-key
- (Note that the or-dev posts above only mention onion-key, but if
- we don't also include family then clients will never learn it. It
- seemed like it should be relatively static, so putting it in the
- microdescriptor is smarter than trying to fit it into the consensus.)
+3.1.1. Descriptor elements to include for now
- We could imagine a config option "family,onion-key" so authorities
- could change their voted preferences without needing to upgrade.
+ In the first version, the microdescriptor should contain the
+ onion-key element, and the family element from the router descriptor,
+ and the exit policy summary as currently specified in dir-spec.txt.
3.1.2. Computing consensus for microdescriptor-elements and "m" lines
- One approach is for the consensus microdescriptor-elements line to
- include every element listed by a majority of authorities, sorted. The
- problem here is that it will no longer be deterministic what the correct
- hash for the "m" line should be. We could imagine telling the authority
- to go look in its descriptor and produce the right hash itself, but
- we don't want consensus calculation to be based on external data like
- that. (Plus, the authority may not have the descriptor that everybody
- else voted to use.)
-
- The better approach is to take the exact set that has the most votes
- (breaking ties by the set that has the most elements, and breaking
- ties after that by whichever is alphabetically first). That will
- increase the odds that we actually get a microdescriptor hash that
- is both a) for the descriptor we're putting in the consensus, and b)
- over the elements that we're declaring it should be for.
-
- Then the "m" line for a given relay is the one that gets the most votes
- from authorities that both a) voted for the microdescriptor-elements
- line we're using, and b) voted for the descriptor we're using.
-
- (If there's a tie, use the smaller hash. But really, if there are
- multiple such votes and they differ about a microdescriptor, we caught
- one of them lying or being buggy. We should log it to track down why.)
-
- If there are no such votes, then we leave out the "m" line for that
- relay. That means clients should avoid it for this time period. (As
- an extension it could instead mean that clients should fetch the
- descriptor and figure out its microdescriptor themselves. But let's
- not get ahead of ourselves.)
-
- It would be nice to have a more foolproof way to agree on what
- microdescriptor hash each authority should vote for, so we can avoid
- missing "m" lines. Just switching to a new consensus-method each time
- we change the set of microdescriptor-elements won't help though, since
- each authority will still have to decide what hash to vote for before
- knowing what consensus-method will be used.
-
- Here's one way we could do it. Each vote / consensus includes
- the microdescriptor-elements that were used to compute the hashes,
- and also a preferred-microdescriptor-elements set. If an authority
- has a consensus from the previous period, then it should use the
- consensus preferred-microdescriptor-elements when computing its votes
- for microdescriptor-elements and the appropriate hashes in the upcoming
- period. (If it has no previous consensus, then it just writes its
- own preferences in both lines.)
-
-3.2. Directory mirrors serve microdescriptors
-
- Directory mirrors should then read the microdescriptor-elements line
- from the consensus, and learn how to answer requests. (Directory mirrors
- continue to serve normal relay descriptors too, a) to serve old clients
- and b) to be able to construct microdescriptors on the fly.)
-
- The microdescriptors with hashes <D1>,<D2>,<D3> should be available at:
- http://<hostname>/tor/micro/d/<D1>+<D2>+<D3>.z
+ When we are generating a consensus, we use whichever m line
+ unambiguously corresponds to the descriptor digest that will be
+ included in the consensus.
+
+ (If different votes have different microdescriptor digests for a
+ single <descriptor-digest, consensus-method> pair, then at least one
+ of the authorities is broken. If this happens, the consensus should
+ contain whichever microdescriptor digest is most common. If there is
+ no winner, we break ties in favor of the lexically earliest.
+ Either way, we should log a warning: there is definitely a bug.)
+
+ The "m" lines in a consensus contain only the digest, not a list of
+ consensus methods.
+
+3.1.3. A new flavor of consensus
+
+ Rather than inserting "m" lines in the current consensus format,
+ they should be included in a new consensus flavor (see proposal
+ 162).
+
+ This flavor can safely omit descriptor digests.
+
+ When we implement this voting method, we can remove the exit policy
+ summary from the current "ns" flavor of consensus, since no current
+ clients use them, and they take up about 5% of the compressed
+ consensus.
+
+ This new consensus flavor should be signed with the sha256 signature
+ format as documented in proposal 162.
+
+3.2. Directory mirrors fetch, cache, and serve microdescriptors
+
+ Directory mirrors should fetch, cache, and serve each microdescriptor
+ from the authorities. (They need to continue to serve normal relay
+ descriptors too, to handle old clients.)
+
+ The microdescriptors with base64 hashes <D1>,<D2>,<D3> should be
+ available at:
+ http://<hostname>/tor/micro/d/<D1>-<D2>-<D3>.z
+ (We use base64 for size and for consistency with the consensus
+ format. We use -s instead of +s to separate these items, since
+ the + character is used in base64 encoding.)
All the microdescriptors from the current consensus should also be
available at:
@@ -136,24 +139,9 @@ Status: Open
so a client that's bootstrapping doesn't need to send a 70KB URL just
to name every microdescriptor it's looking for.
- The format of a microdescriptor is the header line
- "microdescriptor-header"
- followed by each element (keyword and body), alphabetically. There's
- no need to mention what hash it's for, since it's self-identifying:
- you can hash the elements to learn this.
-
- (Do we need a footer line to show that it's over, or is the next
- microdescriptor line or EOF enough of a hint? A footer line wouldn't
- hurt much. Also, no fair voting for the microdescriptor-element
- "microdescriptor-header".)
-
+ Microdescriptors have no header or footer.
The hash of the microdescriptor is simply the hash of the concatenated
- elements -- not counting the header line or hypothetical footer line.
- Unless you prefer that?
-
- Is there a reasonable way to version these things? We could say that
- the microdescriptor-header line can contain arguments which clients
- must ignore if they don't understand them. Any better ways?
+ elements.
Directory mirrors should check to make sure that the microdescriptors
they're about to serve match the right hashes (either the hashes from
@@ -170,10 +158,14 @@ Status: Open
When a client gets a new consensus, it looks to see if there are any
microdescriptors it needs to learn. If it needs to learn more than
some threshold of the microdescriptors (half?), it requests 'all',
- else it requests only the missing ones.
+ else it requests only the missing ones. Clients MAY try to
+ determine whether the upload bandwidth for listing the
+ microdescriptors they want is more or less than the download
+ bandwidth for the microdescriptors they do not want.
Clients maintain a cache of microdescriptors along with metadata like
- when it was last referenced by a consensus. They keep a microdescriptor
+ when it was last referenced by a consensus, and which identity key
+ it corresponds to. They keep a microdescriptor
until it hasn't been mentioned in any consensus for a week. Future
clients might cache them for longer or shorter times.
@@ -190,18 +182,17 @@ Status: Open
Another future option would be to fetch some of the microdescriptors
anonymously (via a Tor circuit).
+ Another crazy option (Roger's phrasing) is to do decoy fetches as
+ well.
+
4. Transition and deployment
Phase one, the directory authorities should start voting on
- microdescriptors and microdescriptor elements, and putting them in the
- consensus. This should happen during the 0.2.1.x series, and should
- be relatively easy to do.
+ microdescriptors, and putting them in the consensus.
Phase two, directory mirrors should learn how to serve them, and learn
- how to read the consensus to find out what they should be serving. This
- phase could be done either in 0.2.1.x or early in 0.2.2.x, depending
- on how messy it turns out to be and how quickly we get around to it.
+ how to read the consensus to find out what they should be serving.
Phase three, clients should start fetching and caching them instead
- of normal descriptors. This should happen post 0.2.1.x.
+ of normal descriptors.
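
To make the digest and URL conventions from the proposal 158 hunks above concrete, here is a minimal sketch, assuming the standard base64 alphabet and treating the microdescriptor as an opaque byte string. The function names and the example hostname are hypothetical; only the SHA256 hash, the omitted trailing "=", the "-" separator, and the URL layout come from the proposal text.

```python
import base64
import hashlib


def micro_digest(microdescriptor: bytes) -> str:
    """Base64 of the SHA256 hash of a microdescriptor, without trailing '='."""
    raw = hashlib.sha256(microdescriptor).digest()
    return base64.b64encode(raw).decode("ascii").rstrip("=")


def micro_url(hostname: str, digests) -> str:
    """Build the fetch URL for several microdescriptors.

    Digests are joined with '-' because '+' can occur inside base64 text.
    """
    return "http://%s/tor/micro/d/%s.z" % (hostname, "-".join(digests))
```

For example, `micro_url("mirror.example", [micro_digest(md1), micro_digest(md2)])` names two microdescriptors in one request, where `md1` and `md2` are the concatenated elements of each microdescriptor.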
diff --git a/doc/spec/proposals/159-exit-scanning.txt b/doc/spec/proposals/159-exit-scanning.txt
index fbc69aa9e6..7090f2ed08 100644
--- a/doc/spec/proposals/159-exit-scanning.txt
+++ b/doc/spec/proposals/159-exit-scanning.txt
@@ -1,7 +1,5 @@
Filename: 159-exit-scanning.txt
Title: Exit Scanning
-Version: $Revision$
-Last-Modified: $Date$
Author: Mike Perry
Created: 13-Feb-2009
Status: Open
diff --git a/doc/spec/proposals/160-bandwidth-offset.txt b/doc/spec/proposals/160-bandwidth-offset.txt
new file mode 100644
index 0000000000..96935ade7d
--- /dev/null
+++ b/doc/spec/proposals/160-bandwidth-offset.txt
@@ -0,0 +1,105 @@
+Filename: 160-bandwidth-offset.txt
+Title: Authorities vote for bandwidth offsets in consensus
+Author: Roger Dingledine
+Created: 4-May-2009
+Status: Finished
+Target: 0.2.2.x
+
+1. Motivation
+
+ As part of proposal 141, we moved the bandwidth value for each relay
+ into the consensus. Now clients can know how they should load balance
+ even before they've fetched the corresponding relay descriptors.
+
+ Putting the bandwidth in the consensus also lets the directory
+ authorities choose more accurate numbers to advertise, if we come up
+ with a better algorithm for deciding weightings.
+
+ Our original plan was to teach directory authorities how to measure
+ bandwidth themselves; then every authority would vote for the bandwidth
+ it prefers, and we'd take the median of votes as usual.
+
+ The problem comes when we have 7 authorities, and only a few of them
+ have smarter bandwidth allocation algorithms. So long as the majority
+ of them are voting for the number in the relay descriptor, the minority
+ that have better numbers will be ignored.
+
+2. Options
+
+ One fix would be to demand that every authority also run the
+ new bandwidth measurement algorithms: in that case, part of the
+ responsibility of being an authority operator is that you need to run
+ this code too. But in practice we can't really require all current
+ authority operators to do that; and if we want to expand the set of
+ authority operators even further, it will become even more impractical.
+ Also, bandwidth testing adds load to the network, so we don't really
+ want to require that the number of concurrent bandwidth tests match
+ the number of authorities we have.
+
+ The better fix is to allow certain authorities to specify that they are
+ voting on bandwidth measurements: more accurate bandwidth values that
+ have actually been evaluated. In this way, authorities can vote on
+ the median measured value if sufficient measured votes exist for a router,
+ and otherwise fall back to the median value taken from the published router
+ descriptors.
+
+3. Security implications
+
+ If only some authorities choose to vote on an offset, then a majority of
+ those voting authorities can arbitrarily change the bandwidth weighting
+ for the relay. At the extreme, if there's only one offset-voting
+ authority, then that authority can dictate which relays clients will
+ find attractive.
+
+ This problem isn't entirely new: we already have the worry wrt
+ the subset of authorities that vote for BadExit.
+
+ To make it not so bad, we should deploy at least three offset-voting
+ authorities.
+
+ Also, authorities that know how to vote for offsets should vote for
+ an offset of zero for new nodes, rather than choosing not to vote on
+ any offset in those cases.
+
+4. Design
+
+ First, we need a new consensus method to support this new calculation.
+
+ Now v3 votes can have an additional value on the "w" line:
+ "w Bandwidth=X Measured=" INT.
+
+ Once we're using the new consensus method, the new way to compute the
+ Bandwidth weight is by checking if there are at least 3 "Measured"
+ votes. If so, the median of these is taken. Otherwise, the median
+ of the "Bandwidth=" values is taken, as described in Proposal 141.
+
+ Then the actual consensus looks just the same as it did before,
+ so clients never have to know that this additional calculation is
+ happening.
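
A minimal sketch of the selection rule above, not the Tor implementation. The function name and the list-based inputs are hypothetical, and the use of the low median for even-sized vote lists is an assumption; the proposal only says "the median".

```python
from statistics import median_low

MIN_MEASURED_VOTES = 3  # threshold stated in the Design section


def consensus_bandwidth(bandwidth_votes, measured_votes):
    """Pick the consensus Bandwidth value for one relay.

    bandwidth_votes: "Bandwidth=" values, one per voting authority.
    measured_votes:  "Measured=" values, only from authorities that measure.
    """
    if len(measured_votes) >= MIN_MEASURED_VOTES:
        # Enough measuring authorities: use the median of their values.
        return median_low(measured_votes)
    # Otherwise fall back to the descriptor-based values (Proposal 141).
    return median_low(bandwidth_votes)
```

With seven authorities voting Bandwidth=100 and three of them also voting Measured values of 250, 300 and 400, the sketch returns 300; with only two Measured votes it returns 100.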
+
+5. Implementation
+
+ The Measured values will be read from a file provided by the scanners
+ described in proposal 161. Files with a timestamp older than 3 days
+ will be ignored.
+
+ The file will be read in from dirserv_generate_networkstatus_vote_obj()
+ in a location specified by a new config option "V3MeasuredBandwidths".
+ A helper function will be called to populate new 'measured' and
+ 'has_measured' fields of the routerstatus_t 'routerstatuses' list with
+ values read from this file.
+
+ An additional for_vote flag will be passed to
+ routerstatus_format_entry() from format_networkstatus_vote(), which will
+ indicate that the "Measured=" string should be appended to the "w Bandwidth="
+ line with the measured value in the struct.
+
+ routerstatus_parse_entry_from_string() will be modified to parse the
+ "Measured=" lines into routerstatus_t struct fields.
+
+ Finally, networkstatus_compute_consensus() will set rs_out.bandwidth
+ to the median of the measured values if there are at least 3, otherwise
+ it will use the bandwidth value median as normal.
+
+
+
diff --git a/doc/spec/proposals/161-computing-bandwidth-adjustments.txt b/doc/spec/proposals/161-computing-bandwidth-adjustments.txt
new file mode 100644
index 0000000000..d219826668
--- /dev/null
+++ b/doc/spec/proposals/161-computing-bandwidth-adjustments.txt
@@ -0,0 +1,174 @@
+Title: Computing Bandwidth Adjustments
+Filename: 161-computing-bandwidth-adjustments.txt
+Author: Mike Perry
+Created: 12-May-2009
+Target: 0.2.2.x
+Status: Finished
+
+
+1. Motivation
+
+ There is high variance in the performance of the Tor network. Despite
+ our efforts to balance load evenly across the Tor nodes, some nodes are
+ significantly slower and more overloaded than others.
+
+ Proposal 160 describes how we can augment the directory authorities to
+ vote on measured bandwidths for routers. This proposal describes what
+ goes into the measuring process.
+
+
+2. Measurement Selection
+
+ The general idea is to determine a load factor representing the ratio
+ of the capacity of measured nodes to the rest of the network. This load
+ factor could be computed from three potentially relevant statistics:
+ circuit failure rates, circuit extend times, or stream capacity.
+
+ Circuit failure rates and circuit extend times appear to be
+ non-linearly proportional to node load. We've observed that the same
+ nodes when scanned at US nighttime hours (when load is presumably
+ lower) exhibit almost no circuit failure, and significantly faster
+ extend times than when scanned during the day.
+
+ Stream capacity, however, is much more uniform, even during US
+ nighttime hours. Moreover, it is a more intuitive representation of
+ node capacity, and also less dependent upon distance and latency
+ if amortized over large stream fetches.
+
+
+3. Average Stream Bandwidth Calculation
+
+ The average stream bandwidths are obtained by dividing the network into
+ slices of 50 nodes each, grouped according to advertised node bandwidth.
+
+ Two hop circuits are built using nodes from the same slice, and a large
+ file is downloaded via these circuits. The file sizes are set based
+ on node percentile rank as follows:
+
+ 0-10: 2M
+ 10-20: 1M
+ 20-30: 512k
+ 30-50: 256k
+ 50-100: 128k
+
+ These sizes are based on measurements performed during test scans.
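
The size table can be read as a simple lookup. The sketch below is illustrative only: the function name is hypothetical, and treating each range as half-open (so rank 10 falls into "10-20") is an assumption, since the table does not say which range owns a boundary value. Sizes are kept as the labels used in the table to avoid guessing at units.

```python
# (exclusive upper percentile bound, file size label from the table)
SIZE_BY_PERCENTILE = [
    (10, "2M"),
    (20, "1M"),
    (30, "512k"),
    (50, "256k"),
    (100, "128k"),
]


def slice_file_size(percentile):
    """Return the download size label for a node's percentile rank (0-100)."""
    if not 0 <= percentile <= 100:
        raise ValueError("percentile rank must be between 0 and 100")
    for upper, size in SIZE_BY_PERCENTILE:
        if percentile < upper:
            return size
    # percentile == 100 belongs to the last range.
    return SIZE_BY_PERCENTILE[-1][1]
```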
+
+ This process is repeated until each node has been chosen to participate
+ in at least 5 circuits.
+
+
+4. Ratio Calculation
+
+ The ratios are calculated by dividing each measured value by the
+ network-wide average.
+
+
+5. Ratio Filtering
+
+ After the base ratios are calculated, a second pass is performed
+ to remove any streams with nodes of ratios less than X=0.5 from
+ the results of other nodes. In addition, all outlying streams
+ with capacity of one standard deviation below a node's average
+ are also removed.
+
+ The final ratio result will be the greater of the unfiltered ratio
+ and the filtered ratio.
+
+
+6. Pseudocode for Ratio Calculation Algorithm
+
+ Here is the complete pseudocode for the ratio algorithm:
+
+ Slices = {S | S is 50 nodes of similar consensus capacity}
+ for S in Slices:
+ while exists node N in S with circ_chosen(N) < 7:
+ fetch_slice_file(build_2hop_circuit(N, (exit in S)))
+ for N in S:
+ BW_measured(N) = MEAN(b | b is bandwidth of a stream through N)
+ Bw_stddev(N) = STDDEV(b | b is bandwidth of a stream through N)
+ Bw_avg(S) = MEAN(b | b = BW_measured(N) for all N in S)
+ for N in S:
+ Normal_Streams(N) = {stream via N | bandwidth >= BW_measured(N)}
+ BW_Norm_measured(N) = MEAN(b | b is a bandwidth of Normal_Streams(N))
+
+ Bw_net_avg(Slices) = MEAN(BW_measured(N) for all N in Slices)
+ Bw_Norm_net_avg(Slices) = MEAN(BW_Norm_measured(N) for all N in Slices)
+
+ for N in all Slices:
+ Bw_net_ratio(N) = BW_measured(N)/Bw_net_avg(Slices)
+ Bw_Norm_net_ratio(N) = BW_Norm_measured(N)/Bw_Norm_net_avg(Slices)
+
+ ResultRatio(N) = MAX(Bw_net_ratio(N), Bw_Norm_net_ratio(N))
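
The ratio step of the pseudocode can be sketched in Python as follows. This covers only the arithmetic after the scans have finished; circuit building and slicing are left out. The function name and the dict-of-lists input are hypothetical, and the filter follows the pseudocode as written (streams at or above the node's own mean are kept).

```python
from statistics import mean


def result_ratios(streams):
    """streams maps a node id to the list of stream bandwidths seen through it."""
    # BW_measured(N): plain mean of every stream through the node.
    bw_measured = {n: mean(b) for n, b in streams.items()}
    # BW_Norm_measured(N): mean of the streams at or above the node's mean.
    bw_norm = {
        n: mean([x for x in b if x >= bw_measured[n]])
        for n, b in streams.items()
    }
    # Network-wide averages over all nodes in all slices.
    net_avg = mean(bw_measured.values())
    norm_net_avg = mean(bw_norm.values())
    # ResultRatio(N): the greater of the unfiltered and filtered ratios.
    return {
        n: max(bw_measured[n] / net_avg, bw_norm[n] / norm_net_avg)
        for n in streams
    }
```

For example, `result_ratios({"fast": [100, 300], "slow": [50, 50]})` gives "fast" its filtered ratio 300/175 and "slow" its unfiltered ratio 50/125.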
+
+
+7. Security implications
+
+ The ratio filtering will deal with cases of sabotage by dropping
+ both very slow outliers in stream average calculations, as well
+ as dropping streams that used very slow nodes from the calculation
+ of other nodes.
+
+ This scheme will not address nodes that try to game the system by
+ providing better service to scanners. The scanners can be detected
+ at the entry by IP address, and at the exit by the destination fetch
+ IP.
+
+ Measures can be taken to obfuscate and separate the scanners' source
+ IP address from the directory authority IP address. For instance,
+ scans can happen offsite and the results can be rsynced into the
+ authorities. The destination server IP can also change.
+
+ Neither of these methods is foolproof, but such nodes can already
+ lie about their bandwidth to attract more traffic, so this solution
+ does not set us back any in that regard.
+
+
+8. Parallelization
+
+ Because each slice takes as long as 6 hours to complete, we will want
+ to parallelize as much as possible. This will be done by concurrently
+ running multiple scanners from each authority to deal with different
+ segments of the network. Each scanner piece will continually loop
+ over a portion of the network, outputting files of the form:
+
+ node_id=<idhex> SP strm_bw=<BW_measured(N)> SP
+ filt_bw=<BW_Norm_measured(N)> ns_bw=<CurrentConsensusBw(N)> NL
+
+ The most recent file from each scanner will be periodically gathered
+ by another script that uses them to produce network-wide averages
+ and calculate ratios as per the algorithm in section 6. Because nodes
+ may shift in capacity, they may appear in more than one slice and/or
+ appear more than once in the file set. The most recently measured
+ line will be chosen in this case.
+
+
+9. Integration with Proposal 160
+
+ The final results will be produced for the voting mechanism
+ described in Proposal 160 by multiplying the derived ratio by
+ the average published consensus bandwidth during the course of the
+ scan, and taking the weighted average with the previous consensus
+ bandwidth:
+
+ Bw_new = Round((Bw_current * Alpha + Bw_scan_avg*Bw_ratio)/(Alpha + 1))
+
+ The Alpha parameter is a smoothing parameter intended to prevent
+ rapid oscillation between loaded and unloaded conditions. It is
+ currently fixed at 0.333.
+
+ The Round() step consists of rounding to the 3 most significant figures
+ in base10, and then rounding that result to the nearest 1000, with
+ a minimum value of 1000.
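
A sketch of the smoothing formula and the Round() step, under stated assumptions: the helper names are hypothetical, ties are rounded half up, and the value is converted to an integer before rounding. The proposal does not specify tie-breaking.

```python
ALPHA = 0.333  # smoothing parameter given in the text


def round_sig(value, figures=3):
    """Round a non-negative integer to `figures` significant decimal digits."""
    if value <= 0:
        return 0
    digits = len(str(value))
    if digits <= figures:
        return value
    unit = 10 ** (digits - figures)
    return (value + unit // 2) // unit * unit


def round_bw(value):
    """Round(): 3 significant figures, then nearest 1000, minimum 1000."""
    v = round_sig(int(value + 0.5))
    v = (v + 500) // 1000 * 1000
    return max(v, 1000)


def new_bandwidth(bw_current, bw_scan_avg, bw_ratio, alpha=ALPHA):
    """Bw_new = Round((Bw_current*Alpha + Bw_scan_avg*Bw_ratio)/(Alpha + 1))."""
    raw = (bw_current * alpha + bw_scan_avg * bw_ratio) / (alpha + 1)
    return round_bw(raw)
```

A relay whose current and scan-average consensus bandwidth are both 100000 and whose ratio is 2.0 gets a raw value of about 175019, which rounds to 175000.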
+
+ This will produce a new bandwidth value that will be output into a
+ file consisting of lines of the form:
+
+ node_id=<idhex> SP bw=<Bw_new> NL
+
+ The first line of the file will contain a timestamp in UNIX time()
+ seconds. This will be used by the authority to decide if the
+ measured values are too old to use.
+
+ This file can be either copied or rsynced into a directory readable
+ by the directory authority.
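
A reader for the output file might look like the sketch below. It is not the authority code: the function name is hypothetical, fields are assumed to be separated by single spaces, and the 3-day age limit is taken from the Implementation section of Proposal 160.

```python
import time

MAX_AGE_SECONDS = 3 * 24 * 60 * 60  # "older than 3 days" per Proposal 160


def parse_measured_file(text, now=None):
    """Return {node_id: bw} from a scanner result file, or {} if it is stale."""
    now = time.time() if now is None else now
    lines = text.splitlines()
    if not lines:
        return {}
    # The first line holds the UNIX timestamp of the scan results.
    timestamp = int(lines[0].strip())
    if now - timestamp > MAX_AGE_SECONDS:
        return {}
    result = {}
    for line in lines[1:]:
        # Each remaining line is a list of key=value fields.
        fields = dict(p.split("=", 1) for p in line.split(" ") if "=" in p)
        if "node_id" in fields and "bw" in fields:
            result[fields["node_id"]] = int(fields["bw"])
    return result
```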
+
diff --git a/doc/spec/proposals/162-consensus-flavors.txt b/doc/spec/proposals/162-consensus-flavors.txt
new file mode 100644
index 0000000000..e3b697afee
--- /dev/null
+++ b/doc/spec/proposals/162-consensus-flavors.txt
@@ -0,0 +1,188 @@
+Filename: 162-consensus-flavors.txt
+Title: Publish the consensus in multiple flavors
+Author: Nick Mathewson
+Created: 14-May-2009
+Target: 0.2.2
+Status: Open
+
+Overview:
+
+ This proposal describes a way to publish each consensus in
+ multiple simultaneous formats, or "flavors". This will reduce the
+ amount of time needed to deploy new consensus-like documents, and
+ reduce the size of consensus documents in the long term.
+
+Motivation:
+
+ In the future, we will almost surely want different fields and
+ data in the network-status document. Examples include:
+ - Publishing hashes of microdescriptors instead of hashes of
+ full descriptors (Proposal 158).
+ - Including different digests of descriptors, instead of the
+ perhaps-soon-to-be-totally-broken SHA1.
+
+ Note that in both cases, from the client's point of view, this
+ information _replaces_ older information. If we're using a
+ SHA256 hash, we don't need to see the SHA1. If clients only want
+ microdescriptors, they don't (necessarily) need to see hashes of
+ other things.
+
+ Our past approach to cases like this has been to shovel all of
+ the data into the consensus document. But this is rather poor
+ for bandwidth. Adding a single SHA256 hash to a consensus for
+ each router increases the compressed consensus size by 47%. In
+ comparison, replacing a single SHA1 hash with a SHA256 hash for
+ each listed router increases the consensus size by only 18%.
+
+Design in brief:
+
+ Let the voting process remain as it is, until a consensus is
+ generated. With future versions of the voting algorithm, instead
+ of just a single consensus being generated, multiple consensus
+ "flavors" are produced.
+
+ Consensuses (all of them) include a list of which flavors are
+ being generated. Caches fetch and serve all flavors of consensus
+ that are listed, regardless of whether they can parse or validate
+ them, and serve them to clients. Thus, once this design is in
+ place, we won't need to deploy more cache changes in order to get
+ new flavors of consensus to be cached.
+
+ Clients download only the consensus flavor they want.
+
+A note on hashes:
+
+ Everything in this document is specified to use SHA256, and to be
+ upgradeable to use better hashes in the future.
+
+Spec modifications:
+
+ 1. URLs and changes to the current consensus format.
+
+ Every consensus flavor has a name consisting of a sequence of one
+ or more alphanumeric characters and dashes. For compatibility, the
+ current descriptor flavor is called "ns".
+
+ The supported consensus flavors are defined as part of the
+ authorities' consensus method.
+
+ For each supported flavor, every authority calculates another
+ consensus document of as-yet-unspecified format, and exchanges
+ detached signatures for these documents as in the current consensus
+ design.
+
+ In addition to the consensus currently served at
+ /tor/status-vote/(current|next)/consensus.z and
+ /tor/status-vote/(current|next)/consensus/<FP1>+<FP2>+<FP3>+....z ,
+ authorities serve another consensus of each flavor "F" from the
+ locations /tor/status-vote/(current|next)/consensus-F.z and
+ /tor/status-vote/(current|next)/consensus-F/<FP1>+....z.
+
+ When caches serve these documents, they do so from the same
+ locations.
+
+ 2. Document format: generic consensus.
+
+ The format of a flavored consensus is as-yet-unspecified, except
+ that the first line is:
+ "network-status-version" SP version SP flavor NL
+
+ where version is 3 or higher, and the flavor is a string
+ consisting of alphanumeric characters and dashes, matching the
+ corresponding flavor listed in the unflavored consensus.
+
+ 3. Document format: detached signatures.
+
+ We amend the detached signature format to include more than one
+ consensus-digest line, and more than one set of signatures.
+
+ After the consensus-digest line, we allow more lines of the form:
+ "additional-digest" SP flavor SP algname SP digest NL
+
+ Before the directory-signature lines, we allow more entries of the form:
+ "additional-signature" SP flavor SP algname SP identity SP
+ signing-key-digest NL signature.
+
+ [We do not use "consensus-digest" or "directory-signature" for flavored
+ consensuses, since this could confuse older Tors.]
+
+ The consensus-signatures URL should contain the signatures
+ for _all_ flavors of consensus.
+
+ 4. The consensus index:
+
+ Authorities additionally generate and serve a consensus-index
+ document. Its format is:
+
+ Header ValidAfter ValidUntil Documents Signatures
+
+ Header = "consensus-index" SP version NL
+ ValidAfter = as in a consensus
+ ValidUntil = as in a consensus
+ Documents = Document*
+ Document = "document" SP flavor SP SignedLength
+ 1*(SP AlgorithmName "=" Digest) NL
+ Signatures = Signature*
+ Signature = "directory-signature" SP algname SP identity
+ SP signing-key-digest NL signature
+
+ There must be one Document line for each generated consensus flavor.
+ Each Document line describes the length of the signed portion of
+ a consensus (the signatures themselves are not included), along
+ with one or more digests of that signed portion. Digests are
+ given in hex. The algorithm "sha256" MUST be included; others
+ are allowed.
+
+ The algname part of a signature describes what algorithm was
+ used to hash the identity and signing keys, and to compute the
+ signature. The algorithm "sha256" MUST be recognized;
+ signatures with unrecognized algorithms MUST be ignored.
+ (See below).
+
+ The consensus index is made available at
+ /tor/status-vote/(current|next)/consensus-index.z.
+
+ Caches should fetch this document so they can check the
+ correctness of the different consensus documents they fetch.
+ They do not need to check anything about an unrecognized
+ consensus document beyond its digest and length.
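
The cache-side check described above (digest and length only) could be sketched like this. The function names are hypothetical, and the sketch assumes the caller has already isolated the signed portion of the fetched consensus as bytes; signature checking of the index itself is out of scope here.

```python
import hashlib


def parse_document_line(line):
    """Parse: "document" SP flavor SP SignedLength 1*(SP AlgorithmName "=" Digest)."""
    parts = line.rstrip("\n").split(" ")
    if len(parts) < 4 or parts[0] != "document":
        raise ValueError("not a document line")
    flavor, length = parts[1], int(parts[2])
    digests = {}
    for item in parts[3:]:
        algname, _, digest = item.partition("=")
        digests[algname] = digest.lower()  # digests are given in hex
    if "sha256" not in digests:
        raise ValueError("the sha256 digest MUST be included")
    return flavor, length, digests


def matches_index(signed_portion, length, digests):
    """Check a fetched consensus against its consensus-index entry."""
    return (
        len(signed_portion) == length
        and hashlib.sha256(signed_portion).hexdigest() == digests["sha256"]
    )
```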
+
+ 4.1. The "sha256" signature format.
+
+ The 'SHA256' signature format for directory objects is defined as
+ the RSA signature of the OAEP+-padded SHA256 digest of the item to
+ be signed. When checking signatures, the signature MUST be treated
+ as valid if the signature material begins with SHA256(document);
+ this allows us to add other data later.
+
+Considerations:
+
+ - We should not create a new flavor of consensus when adding a
+ field instead wouldn't be too onerous.
+
+ - We should not proliferate flavors lightly: clients will be
+ distinguishable based on which flavor they download.
+
+Migration:
+
+ - Stage one: authorities begin generating and serving
+ consensus-index files.
+
+ - Stage two: Caches begin downloading consensus-index files,
+ validating them, and using them to decide what flavors of
+ consensus documents to cache. They download all listed
+ documents, and compare them to the digests given in the
+ consensus.
+
+ - Stage three: Once we want to make a significant change to the
+ consensus format, we deploy another flavor of consensus at the
+ authorities. This will immediately start getting cached by the
+ caches, and clients can start fetching the new flavor without
+ waiting a version or two for enough caches to begin supporting
+ it.
+
+Acknowledgements:
+
+ Aspects of this design and its applications to hash migration were
+ heavily influenced by IRC conversations with Marian.
+
diff --git a/doc/spec/proposals/163-detecting-clients.txt b/doc/spec/proposals/163-detecting-clients.txt
new file mode 100644
index 0000000000..d838b17063
--- /dev/null
+++ b/doc/spec/proposals/163-detecting-clients.txt
@@ -0,0 +1,115 @@
+Filename: 163-detecting-clients.txt
+Title: Detecting whether a connection comes from a client
+Author: Nick Mathewson
+Created: 22-May-2009
+Target: 0.2.2
+Status: Open
+
+
+Overview:
+
+ Some aspects of Tor's design require relays to distinguish
+ connections from clients from connections that come from relays.
+ The existing means for doing this is easy to spoof. We propose
+ a better approach.
+
+Motivation:
+
+ There are at least two reasons for which Tor servers want to tell
+ which connections come from clients and which come from other
+ servers:
+
+ 1) Some exits, proposal 152 notwithstanding, want to disallow
+ their use as single-hop proxies.
+ 2) Some performance-related proposals involve prioritizing
+ traffic from relays, or limiting traffic per client (but not
+ per relay).
+
+ Right now, we detect client vs server status based on how the
+ client opens circuits. (Check out the code that implements the
+ AllowSingleHopExits option if you want all the details.) This
+ method is depressingly easy to fake, though. This document
+ proposes better means.
+
+Goals:
+
+ To make grabbing relay privileges at least as difficult as just
+ running a relay.
+
+ In the analysis below, "using server privileges" means taking any
+ action that only servers are supposed to do, like delivering a
+ BEGIN cell to an exit node that doesn't allow single hop exits,
+ or claiming server-like amounts of bandwidth.
+
+Passive detection:
+
+ A connection is definitely a client connection if it takes one of
+ the TLS methods during setup that does not establish an identity
+ key.
+
+ A circuit is definitely a client circuit if it is initiated with
+ a CREATE_FAST cell, though the node could be a client or a server.
+
+ A node that's listed in a recent consensus is probably a server.
+
+ A node to which we have successfully extended circuits from
+ multiple origins is probably a server.
+
+Active detection:
+
+ If a node doesn't try to use server privileges at all, we never
+ need to care whether it's a server.
+
+ When a node or circuit tries to use server privileges, if it is
+ "definitely a client" as per above, we can refuse it immediately.
+
+ If it's "probably a server" as per above, we can accept it.
+
+ Otherwise, we have either a client, or a server that is neither
+ listed in any consensus nor used by any other clients -- in other
+ words, a new or private server.
+
+ For these servers, we should attempt to build one or more test
+ circuits through them. If enough of the circuits succeed, the
+ node is a real relay. If not, it is probably a client.
+
+ While we are waiting for the test circuits to succeed, we should
+ allow a short grace period in which server privileges are
+ permitted. When a test is done, we should remember its outcome
+ for a while, so we don't need to do it again.
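
The decision made when a peer first tries to use server privileges can be summarized as a small function. This is only an illustration of the rules above; the names and boolean inputs are hypothetical, and the test-circuit machinery and grace period are not modeled.

```python
from enum import Enum


class Verdict(Enum):
    REFUSE = "refuse"  # definitely a client
    ACCEPT = "accept"  # probably a server
    TEST = "test"      # unknown: build test circuits, allow a grace period


def classify(no_identity_key_tls, used_create_fast,
             in_recent_consensus, extended_from_multiple_origins):
    """Classify a peer (or circuit) that is trying to use server privileges."""
    # Passive signals that mark a connection or circuit as a client's.
    if no_identity_key_tls or used_create_fast:
        return Verdict.REFUSE
    # Passive signals that suggest a real relay.
    if in_recent_consensus or extended_from_multiple_origins:
        return Verdict.ACCEPT
    # Either a client or a new/private server: only active testing can tell.
    return Verdict.TEST
```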
+
+Why it's hard to do good testing:
+
+ Doing a test circuit starting with an unlisted router requires
+ only that we have an open connection for it. Doing a test
+ circuit starting elsewhere _through_ an unlisted router--though
+ more reliable-- would require that we have a known address, port,
+ identity key, and onion key for the router. Only the address and
+ identity key are easily available via the current Tor protocol in
+ all cases.
+
+ We could fix this part by requiring that all servers support
+ BEGIN_DIR and support downloading at least a current descriptor
+ for themselves.
+
+Open questions:
+
+ What are the thresholds for the needed numbers of circuits
+ for us to decide that a node is a relay?
+
+ [Suggested answer: two circuits from two distinct hosts.]
+
+ How do we pick grace periods? How long do we remember the
+ outcome of a test?
+
+ [Suggested answer: 10 minute grace period; 48 hour memory of
+ test outcomes.]
+
+ If we can build circuits starting at a suspect node, but we don't
+ have enough information to try extending circuits elsewhere
+ through the node, should we conclude that the node is
+ "server-like" or not?
+
+ [Suggested answer: for now, just try making circuits through
+ the node. Extend this to extending circuits as needed.]
+
diff --git a/doc/spec/proposals/164-reporting-server-status.txt b/doc/spec/proposals/164-reporting-server-status.txt
new file mode 100644
index 0000000000..705f5f1a84
--- /dev/null
+++ b/doc/spec/proposals/164-reporting-server-status.txt
@@ -0,0 +1,91 @@
+Filename: 164-reporting-server-status.txt
+Title: Reporting the status of server votes
+Author: Nick Mathewson
+Created: 22-May-2009
+Target: 0.2.2
+Status: Open
+
+
+Overview:
+
+ When a given node isn't listed in the directory, it isn't always easy
+ to tell why. This proposal suggests a quick-and-dirty way for
+ authorities to export not only how they voted, but why, and a way to
+ collate the information.
+
+Motivation:
+
+ Right now, if you want to know the reason why your server was listed
+ a certain way in the Tor directory, the following steps are
+ recommended:
+
+ - Look through your log for reports of what the authority said
+ when you tried to upload.
+
+ - Look at the consensus; see if you're listed.
+
+ - Wait a while, see if things get better.
+
+ - Download the votes from all the authorities, and see how they
+ voted. Try to figure out why.
+
+ - If you think they'll listen to you, ask some authority
+ operators to look you up in their mtbf files and logs to see
+ why they voted as they did.
+
+ This is far too hard.
+
+Solution:
+
+ We should add a new vote-like information-only document that
+ authorities serve on request. Call it a "vote info". It is
+ generated at the same time as a vote, but used only for
+ determining why a server voted as it did. It is served from
+ /tor/status-vote-info/current/authority[.z]
+
+ It differs from a vote in that:
+
+ * Its vote-status field is 'vote-info'.
+
+ * It includes routers that the authority would not include
+ in its vote.
+
+ For these, it includes an "omitted" line with an English
+ message explaining why they were omitted.
+
+ * For each router, it includes a line describing its WFU and
+ MTBF. The format is:
+
+ "stability <mtbf> up-since='date'"
+ "uptime <wfu> down-since='date'"
+
+ * It describes the WFU and MTBF thresholds it requires to
+ vote for a given router in various roles in the header.
+ The format is:
+
+ "flag-requirement <flag-name> <field> <op> <value>"
+
+ e.g.
+
+ "flag-requirement Guard uptime > 80"
+
+ * It includes info on routers all of whose descriptors
+ were uploaded but rejected over the past few hours. The
+ "r" lines for these are the same as for regular routers.
+ The other lines are omitted for these routers, and are
+ replaced with a single "rejected" line, explaining (in
+ English) why the router was rejected.
+
+
+ A status site (like Torweather or Torstatus or another
+ tool) can poll these files when they are generated, collate
+ the data, and make it available to server operators.
+
+Risks:
+
+ This document makes no provisions for caching these "vote
+ info" documents. If many people wind up fetching them
+ aggressively from the authorities, that would be bad.
+
+
+
diff --git a/doc/spec/proposals/165-simple-robust-voting.txt b/doc/spec/proposals/165-simple-robust-voting.txt
new file mode 100644
index 0000000000..f813285a83
--- /dev/null
+++ b/doc/spec/proposals/165-simple-robust-voting.txt
@@ -0,0 +1,133 @@
+Filename: 165-simple-robust-voting.txt
+Title: Easy migration for voting authority sets
+Author: Nick Mathewson
+Created: 2009-05-28
+Status: Open
+
+Overview:
+
+ This proposal describes an easy-to-implement, easy-to-verify way to
+ change the set of authorities without creating a "flag day" situation.
+
+Motivation:
+
+ From proposal 134 ("More robust consensus voting with diverse
+ authority sets") by Peter Palfrader:
+
+ Right now there are about five authoritative directory servers
+ in the Tor network, tho this number is expected to rise to about
+ 15 eventually.
+
+ Adding a new authority requires synchronized action from all
+ operators of directory authorities so that at any time during the
+ update at least half of all authorities are running and agree on
+ who is an authority. The latter requirement is there so that the
+ authorities can arrive at a common consensus: Each authority
+ builds the consensus based on the votes from all authorities it
+ recognizes, and so a different set of recognized authorities will
+ lead to a different consensus document.
+
+ In response to this problem, proposal 134 suggested that every
+ candidate authority list in its vote whom it believes to be an
+ authority. These A-says-B-is-an-authority relationships form a
+ directed graph. Each authority then iteratively finds the largest
+ clique in the graph and removes it, until they find one containing
+ them. They vote with this clique.
+
+ Proposal 134 had some problems:
+
+ - It had a security problem in that M hostile authorities in a
+ clique could effectively kick out M-1 honest authorities. This
+ could enable a minority of the original authorities to take over.
+
+ - It was too complex in its implications to analyze well: it took us
+ over a year to realize that it was insecure.
+
+ - It tried to solve a bigger problem: general fragmentation of
+ authority trust. Really, all we wanted to have was the ability to
+ add and remove authorities without forcing a flag day.
+
+Proposed protocol design:
+
+ A "Voting Set" is a set of authorities. Each authority has a list of
+ the voting sets it considers acceptable. These sets are chosen
+ manually by the authority operators. They must always contain the
+ authority itself. Each authority lists all of these voting sets in
+ its votes.
+
+ Authorities exchange votes with every other authority in any of their
+ voting sets.
+
+ When it is time to calculate a consensus, an authority votes with
+ whichever voting set it lists that is listed by the most members of
+ that set. In other words, given two sets S1 and S2 that an authority
+ lists, that authority will prefer to vote with S1 over S2 whenever
+ the number of other authorities in S1 that themselves list S1 is
+ higher than the number of other authorities in S2 that themselves
+ list S2.
+
+ For example, suppose authority A recognizes two sets, "A B C D" and
+ "A E F G H". Suppose that the first set is recognized by all of A,
+ B, C, and D, whereas the second set is recognized only by A, E, and
+ F. Because the first set is recognized by more of the authorities in
+ it than the other one, A will vote with the first set.
+
+ Ties are broken in favor of some arbitrary function of the identity
+ keys of the authorities in the set.
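
The selection rule can be sketched as follows. The function name and data layout are hypothetical, and the tie-break used here (the lexicographically greatest sorted member list) is only one possible "arbitrary function"; the proposal does not fix one.

```python
def choose_voting_set(me, listed_sets):
    """Pick the voting set that authority `me` should vote with.

    listed_sets maps each authority to the voting sets (frozensets of
    authority identities) that it lists in its vote.
    """
    def support(voting_set):
        # Count the other members of the set that themselves list the set.
        return sum(
            1
            for member in voting_set
            if member != me and voting_set in listed_sets.get(member, ())
        )

    # Prefer the best-supported set; break ties deterministically.
    return max(
        listed_sets[me],
        key=lambda vs: (support(vs), tuple(sorted(vs))),
    )
```

Applied to the example above, authority A lists "A B C D" (supported by B, C and D) and "A E F G H" (supported by E and F only), so the first set wins.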
+
+How to migrate authority sets:
+
+ In steady state, each authority operator should list only the current
+ actual voting set as accepted.
+
+ When we want to add an authority, each authority operator configures
+ his or her server to list two voting sets: one containing all the old
+ authorities, and one containing the old authorities and the new
+ authority too. Once all authorities are listing the new set of
+ authorities, they will start voting with that set because of its
+ size.
+
+ What if one or two authority operators are slow to list the new set?
+ Then the other operators can stop listing the old set once there are
+ enough authorities listing the new set to make its voting successful.
+ (Note that these authorities not listing the new set will still have
+ their votes counted, since they themselves will be members of the new
+ set. They will only fail to sign the consensus generated by the
+ other authorities who are using the new set.)
+
+ When we want to remove an authority, the operators list two voting
+ sets: one containing all the authorities, and one omitting the
+ authority we want to remove. Once enough authorities list the new
+ set as acceptable, we start having authority operators stop listing
+ the old set. Once there are more listing the new set than the old
+ set, the new set will win.
+
+Data format changes:
+
+ Add a new 'voting-set' line to the vote document format. Allow it to
+ occur any number of times. Its format is:
+
+ voting-set SP 'fingerprint' SP 'fingerprint' ... NL
+
+ where each fingerprint is the hex fingerprint of an identity key of
+ an authority. Sort fingerprints in ascending order.
+
+ When the consensus method is at least 'X' (decide this when we
+ implement the proposal), add this line to the consensus format as
+ well, before the first dir-source line. [This information is not
+ redundant with the dir-source sections in the consensus: If an
+ authority is recognized but didn't vote, that authority will appear in
+ the voting-set line but not in the dir-source sections.]
+
+ We don't need to list other information about authorities in our
+ vote.
+
+Migration issues:
+
+ We should keep track somewhere of which Tor client versions
+ recognized which authorities.
+
+Acknowledgments:
+
+ The design came out of an IRC conversation with Peter Palfrader. He
+ had the basic idea first.
diff --git a/doc/spec/proposals/166-statistics-extra-info-docs.txt b/doc/spec/proposals/166-statistics-extra-info-docs.txt
new file mode 100644
index 0000000000..ab2716a71c
--- /dev/null
+++ b/doc/spec/proposals/166-statistics-extra-info-docs.txt
@@ -0,0 +1,391 @@
+Filename: 166-statistics-extra-info-docs.txt
+Title: Including Network Statistics in Extra-Info Documents
+Author: Karsten Loesing
+Created: 21-Jul-2009
+Target: 0.2.2
+Status: Accepted
+
+Change history:
+
+ 21-Jul-2009 Initial proposal for or-dev
+
+
+Overview:
+
+ The Tor network has grown to almost two thousand relays and millions
+ of casual users over the past few years. With growth has come
+ increasing performance problems and attempts by some countries to
+ block access to the Tor network. In order to address these problems,
+ we need to learn more about the Tor network. This proposal suggests
+ measuring additional statistics and including them in extra-info
+ documents to help us understand the Tor network better.
+
+
+Introduction:
+
+ As of May 2009, relays, bridges, and directories gather the following
+ data for statistical purposes:
+
+ - Relays and bridges count the number of bytes that they have pushed
+ in 15-minute intervals over the past 24 hours. Relays and bridges
+ include these data in extra-info documents that they send to the
+ directory authorities whenever they publish their server descriptor.
+
+ - Bridges further include a rough number of clients per country that
+ they have seen in the past 48 hours in their extra-info documents.
+
+ - Directories can be configured to count the number of clients they
+ see per country in the past 24 hours and to write them to a local
+ file.
+
+ Since then we extended the network statistics in Tor. These statistics
+ include:
+
+ - Directories now gather more precise statistics about connecting
+ clients. Fixes include measuring in intervals of exactly 24 hours,
+ counting unsuccessful requests, measuring download times, etc. The
+ directories append their statistics to a local file every 24 hours.
+
+ - Entry guards count the number of clients per country per day like
+ bridges do and write them to a local file every 24 hours.
+
+ - Relays measure statistics of the number of cells in their circuit
+ queues and how much time these cells spend waiting there. Relays
+ write these statistics to a local file every 24 hours.
+
+ - Exit nodes count the number of read and written bytes on exit
+ connections per port as well as the number of opened exit streams
+ per port in 24-hour intervals. Exit nodes write their statistics to
+ a local file.
+
+ The following four sections contain descriptions for adding these
+ statistics to the relays' extra-info documents.
+
+
+Directory request statistics:
+
+ The first type of statistics aims at measuring directory requests sent
+ by clients to a directory mirror or directory authority. More
+ precisely, these statistics aim at requests for v2 and v3 network
+ statuses only. These directory requests are sent non-anonymously,
+ either via HTTP-like requests to a directory's Dir port or tunneled
+ over a 1-hop circuit.
+
+ Measuring directory request statistics is useful for several reasons:
+ First, the number of locally seen directory requests can be used to
+ estimate the total number of clients in the Tor network. Second, the
+ country-wise classification of requests using a GeoIP database can
+ help counting the relative and absolute number of users per country.
+ Third, the download times can give hints on the available bandwidth
+ capacity at clients.
+
+ Directory requests do not give any hints on the contents that clients
+ send or receive over the Tor network. Every client requests network
+ statuses from the directories, so that there are no anonymity-related
+ concerns about gathering these statistics. It might be, though, that clients
+ wish to hide the fact that they are connecting to the Tor network.
+ Therefore, IP addresses are resolved to country codes in memory,
+ events are accumulated over 24 hours, and numbers are rounded up to
+ multiples of 4 or 8.
+
+ "dirreq-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
+ [At most once.]
+
+ YYYY-MM-DD HH:MM:SS defines the end of the included measurement
+ interval of length NSEC seconds (86400 seconds by default).
+
+ A "dirreq-stats-end" line, as well as any other "dirreq-*" line,
+ is only added when the relay has opened its Dir port and after 24
+ hours of measuring directory requests.
+
+ "dirreq-v2-ips" CC=N,CC=N,... NL
+ [At most once.]
+ "dirreq-v3-ips" CC=N,CC=N,... NL
+ [At most once.]
+
+ List of mappings from two-letter country codes to the number of
+ unique IP addresses that have connected from that country to
+ request a v2/v3 network status, rounded up to the nearest multiple
+ of 8. Only those IP addresses are counted that the directory can
+ answer with a 200 OK status code.
+
+ "dirreq-v2-reqs" CC=N,CC=N,... NL
+ [At most once.]
+ "dirreq-v3-reqs" CC=N,CC=N,... NL
+ [At most once.]
+
+ List of mappings from two-letter country codes to the number of
+ requests for v2/v3 network statuses from that country, rounded up
+ to the nearest multiple of 8. Only those requests are counted that
+ the directory can answer with a 200 OK status code.
+
+ "dirreq-v2-share" num% NL
+ [At most once.]
+ "dirreq-v3-share" num% NL
+ [At most once.]
+
+ The share of v2/v3 network status requests that the directory
+ expects to receive from clients based on its advertised bandwidth
+ compared to the overall network bandwidth capacity. Shares are
+ formatted in percent with two decimal places. Shares are
+ calculated as means over the whole 24-hour interval.
+
+ "dirreq-v2-resp" status=num,... NL
+ [At most once.]
+ "dirreq-v3-resp" status=num,... NL
+ [At most once.]
+
+ List of mappings from response statuses to the number of requests
+ for v2/v3 network statuses that were answered with that response
+ status, rounded up to the nearest multiple of 4. Only response
+ statuses with at least 1 response are reported. New response
+ statuses can be added at any time. The current list of response
+ statuses is as follows:
+
+ "ok": a network status request is answered; this number
+ corresponds to the sum of all requests as reported in
+ "dirreq-v2-reqs" or "dirreq-v3-reqs", respectively, before
+ rounding up.
+ "not-enough-sigs": a version 3 network status is not signed by a
+ sufficient number of requested authorities.
+ "unavailable": a requested network status object is unavailable.
+ "not-found": a requested network status is not found.
+ "not-modified": a network status has not been modified since the
+ If-Modified-Since time that is included in the request.
+ "busy": the directory is busy.
+
+ "dirreq-v2-direct-dl" key=val,... NL
+ [At most once.]
+ "dirreq-v3-direct-dl" key=val,... NL
+ [At most once.]
+ "dirreq-v2-tunneled-dl" key=val,... NL
+ [At most once.]
+ "dirreq-v3-tunneled-dl" key=val,... NL
+ [At most once.]
+
+ List of statistics about possible failures in the download process
+ of v2/v3 network statuses. Requests are either "direct"
+ HTTP-encoded requests over the relay's directory port, or
+ "tunneled" requests using a BEGIN_DIR cell over the relay's OR
+ port. The list of possible statistics can change, and statistics
+ can be left out from reporting. The current list of statistics is
+ as follows:
+
+ Successful downloads and failures:
+
+ "complete": a client has finished the download successfully.
+ "timeout": a download did not finish within 10 minutes after
+ starting to send the response.
+ "running": a download is still running at the end of the
+ measurement period, less than 10 minutes after starting to
+ send the response.
+
+ Download times:
+
+ "min", "max": smallest and largest measured bandwidth in B/s.
+ "d[1-4,6-9]": 1st to 4th and 6th to 9th decile of measured
+ bandwidth in B/s. For a given decile i, i/10 of all downloads
+ had a smaller bandwidth than di, and (10-i)/10 of all downloads
+ had a larger bandwidth than di.
+ "q[1,3]": 1st and 3rd quartile of measured bandwidth in B/s. One
+ fourth of all downloads had a smaller bandwidth than q1, one
+ fourth of all downloads had a larger bandwidth than q3, and the
+ remaining half of all downloads had a bandwidth between q1 and
+ q3.
+ "md": median of measured bandwidth in B/s. Half of the downloads
+ had a smaller bandwidth than md, the other half had a larger
+ bandwidth than md.
+
+
+Entry guard statistics:
+
+ Entry guard statistics include the number of clients per country and
+ per day that are connecting directly to an entry guard.
+
+ Entry guard statistics are important to learn more about the
+ distribution of clients to countries. In the future, this knowledge
+ can be useful to detect if there are or start to be any restrictions
+ for clients connecting from specific countries.
+
+ Information about which clients connect to a given entry guard is very
+ sensitive. It must not be combined with information about
+ what contents are leaving the network at the exit nodes. Therefore,
+ entry guard statistics need to be aggregated to prevent them from
+ becoming useful for de-anonymization. Aggregation includes resolving
+ IP addresses to country codes, counting events over 24-hour intervals,
+ and rounding up numbers to the next multiple of 8.
+
+ "entry-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
+ [At most once.]
+
+ YYYY-MM-DD HH:MM:SS defines the end of the included measurement
+ interval of length NSEC seconds (86400 seconds by default).
+
+ An "entry-stats-end" line, as well as any other "entry-*"
+ line, is first added after the relay has been running for at least
+ 24 hours.
+
+ "entry-ips" CC=N,CC=N,... NL
+ [At most once.]
+
+ List of mappings from two-letter country codes to the number of
+ unique IP addresses that have connected from that country to the
+ relay and that are not known to be other relays, rounded up to the
+ nearest multiple of 8.
+
+
+Cell statistics:
+
+ The third type of statistics has to do with the time that cells spend
+ in circuit queues. In order to gather these statistics, the relay
+ memorizes when it puts a given cell in a circuit queue and when this
+ cell is flushed. The relay further notes the life time of the circuit.
+ These data are sufficient to determine the mean number of cells in a
+ queue over time and the mean time that cells spend in a queue.
+
+ Cell statistics are necessary to learn more about possible reasons for
+ the poor network performance of the Tor network, especially high
+ latencies. The same statistics are also useful to determine the
+ effects of design changes by comparing today's data with future data.
+
+ There are basically no privacy concerns from measuring cell
+ statistics, regardless of a node being an entry, middle, or exit node.
+
+ "cell-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
+ [At most once.]
+
+ YYYY-MM-DD HH:MM:SS defines the end of the included measurement
+ interval of length NSEC seconds (86400 seconds by default).
+
+ A "cell-stats-end" line, as well as any other "cell-*" line,
+ is first added after the relay has been running for at least 24
+ hours.
+
+ "cell-processed-cells" num,...,num NL
+ [At most once.]
+
+ Mean number of processed cells per circuit, subdivided into
+ deciles of circuits by the number of cells they have processed in
+ descending order from loudest to quietest circuits.
+
+ "cell-queued-cells" num,...,num NL
+ [At most once.]
+
+ Mean number of cells contained in queues by circuit decile. These
+ means are calculated by 1) determining the mean number of cells in
+ a single circuit between its creation and its termination and 2)
+ calculating the mean for all circuits in a given decile as
+ determined in "cell-processed-cells". Numbers have a precision of
+ two decimal places.
+
+ "cell-time-in-queue" num,...,num NL
+ [At most once.]
+
+ Mean time cells spend in circuit queues in milliseconds. Times are
+ calculated by 1) determining the mean time cells spend in the
+ queue of a single circuit and 2) calculating the mean for all
+ circuits in a given decile as determined in
+ "cell-processed-cells".
+
+ "cell-circuits-per-decile" num NL
+ [At most once.]
+
+ Mean number of circuits that are included in any of the deciles,
+ rounded up to the next integer.
+
+
+Exit statistics:
+
+ The last type of statistics affects exit nodes counting the number of
+ bytes written and read and the number of streams opened per port and
+ per 24 hours. Exit port statistics can be measured from looking at
+ headers of BEGIN and DATA cells. A BEGIN cell contains the exit port
+ that is required for the exit node to open a new exit stream.
+ Subsequent DATA cells coming from the client or being sent back to the
+ client contain a length field stating how many bytes of application
+ data are contained in the cell.
+
+ Exit port statistics are important to measure in order to identify
+ possible load-balancing problems with respect to exit policies. Exit
+ nodes that permit more ports than others are very likely overloaded
+ with traffic for those ports plus traffic for other ports. Improving
+ load balancing in the Tor network improves the overall utilization of
+ bandwidth capacity.
+
+ Exit traffic is one of the most sensitive parts of network data in the
+ Tor network. Even though these statistics do not require looking at
+ traffic contents, statistics are aggregated so that they are not
+ useful for de-anonymizing users. Only those ports are reported that
+ have seen at least 0.1% of exiting or incoming bytes, numbers of bytes
+ are rounded up to full kibibytes (KiB), and stream numbers are rounded
+ up to the next multiple of 4.
+
+ "exit-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
+ [At most once.]
+
+ YYYY-MM-DD HH:MM:SS defines the end of the included measurement
+ interval of length NSEC seconds (86400 seconds by default).
+
+ An "exit-stats-end" line, as well as any other "exit-*" line, is
+ first added after the relay has been running for at least 24 hours
+ and only if the relay permits exiting (where exiting to a single
+ port and IP address is sufficient).
+
+ "exit-kibibytes-written" port=N,port=N,... NL
+ [At most once.]
+ "exit-kibibytes-read" port=N,port=N,... NL
+ [At most once.]
+
+ List of mappings from ports to the number of kibibytes that the
+ relay has written to or read from exit connections to that port,
+ rounded up to the next full kibibyte.
+
+ "exit-streams-opened" port=N,port=N,... NL
+ [At most once.]
+
+ List of mappings from ports to the number of opened exit streams
+ to that port, rounded up to the nearest multiple of 4.
+
+
+Implementation notes:
+
+ Right now, relays that are configured accordingly write similar
+ statistics to those described in this proposal to disk every 24 hours.
+ With this proposal being implemented, relays include the contents of
+ these files in extra-info documents.
+
+ The following steps are necessary to implement this proposal:
+
+ 1. The current format of [dirreq|entry|buffer|exit]-stats files needs
+ to be adapted to the description in this proposal. This step
+ basically means renaming keywords.
+
+ 2. The timing of writing the four *-stats files should be unified, so
+ that they are written exactly 24 hours after starting the
+ relay. Right now, the measurement intervals for dirreq, entry, and
+ exit stats start with the first observed request, and files are
+ written when observing the first request that occurs more than 24
+ hours after the beginning of the measurement interval. With this
+ proposal, the measurement intervals should all start at the same
+ time, and files should be written exactly 24 hours later.
+
+ 3. It is advantageous to cache statistics in local files in the data
+ directory until they are included in extra-info documents. The
+ reason is that the 24-hour measurement interval can be very
+ different from the 18-hour publication interval of extra-info
+ documents. When a relay crashes after finishing a measurement
+ interval, but before publishing the next extra-info document,
+ statistics would get lost. Therefore, statistics are written to
+ disk when finishing a measurement interval and read from disk when
+ generating an extra-info document. Only the statistics that were
+ appended to the *-stats files within the past 24 hours are included
+ in extra-info documents. Further, the contents of the *-stats files
+ need to be checked in the process of generating extra-info documents.
+
+ 4. With the statistics patches being tested, the ./configure options
+ should be removed and the statistics code be compiled by default.
+ It is still required for relay operators to add configuration
+ options (DirReqStatistics, ExitPortStatistics, etc.) to enable
+ gathering statistics. However, in the near future, statistics shall
+ be gathered by all relays by default, where requiring a
+ ./configure option would be a barrier for many relay operators.
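
Illustration (not part of the patch): a minimal sketch of the rounding
and line formats that proposal 166 describes for the "*-stats-end",
"dirreq-*-ips" and "exit-kibibytes-*" lines. All function names are
hypothetical. The alphabetical ordering of country codes and the
omission of zero counts are assumptions, since the proposal does not
specify either.

```python
from datetime import datetime


def round_up(n, multiple):
    """Round a non-negative count up to the next multiple (4 or 8 in the text)."""
    return -(-n // multiple) * multiple


def bytes_to_kibibytes(nbytes):
    """Round a byte count up to full kibibytes, as for exit-kibibytes-* lines."""
    return -(-nbytes // 1024)


def format_stats_end(keyword, end, nsec=86400):
    """Format e.g. 'dirreq-stats-end YYYY-MM-DD HH:MM:SS (NSEC s)'."""
    return "%s %s (%d s)\n" % (keyword, end.strftime("%Y-%m-%d %H:%M:%S"), nsec)


def format_country_line(keyword, counts, multiple=8):
    """Format a CC=N,CC=N,... line with counts rounded up to `multiple`."""
    # Assumptions: countries with a zero count are left out, and the
    # remaining entries are listed in alphabetical order.
    parts = [
        "%s=%d" % (cc, round_up(n, multiple))
        for cc, n in sorted(counts.items())
        if n > 0
    ]
    return keyword + " " + ",".join(parts) + "\n"
```

For example, a directory that saw 9 unique addresses from "us" and 1
from "de" would report both rounded up to multiples of 8, so that the
exact counts are not revealed.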
diff --git a/doc/spec/proposals/167-params-in-consensus.txt b/doc/spec/proposals/167-params-in-consensus.txt
new file mode 100644
index 0000000000..d23bc9c01e
--- /dev/null
+++ b/doc/spec/proposals/167-params-in-consensus.txt
@@ -0,0 +1,47 @@
+Filename: 167-params-in-consensus.txt
+Title: Vote on network parameters in consensus
+Author: Roger Dingledine
+Created: 18-Aug-2009
+Status: Closed
+Implemented-In: 0.2.2
+
+0. History
+
+
+1. Overview
+
+ Several of our new performance plans involve guessing how to tune
+ clients and relays, yet we won't be able to learn whether we guessed
+ the right tuning parameters until many people have upgraded. Instead,
+ we should have directory authorities vote on the parameters, and teach
+ Tors to read the currently recommended values out of the consensus.
+
+2. Design
+
+ V3 votes should include a new "params" line after the known-flags
+ line. It contains key=value pairs, where value is an integer.
+
+ Consensus documents that are generated with a sufficiently new consensus
+ method (7?) then include a params line that includes every key listed
+ in any vote, and the median value for that key (in case of ties,
+ we use the median closer to zero).
+
+2.1. Planned keys.
+
+ The first planned parameter is "circwindow=101", which is the initial
+ circuit packaging window that clients and relays should use. Putting
+ it in the consensus will let us perform experiments with different
+ values once enough Tors have upgraded -- see proposal 168.
+
+ Later parameters might include a weighting for how much to favor quiet
+ circuits over loud circuits in our round-robin algorithm; a weighting
+ for how much to prioritize relays over clients if we use an incentive
+ scheme like the gold-star design; and what fraction of circuits we
+ should throw out from proposal 151.
+
+2.2. What about non-integers?
+
+ I'm not sure how we would do median on non-integer values. Further,
+ I don't have any non-integer values in mind yet. So I say we cross
+ that bridge when we get to it.
+
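
Illustration (not part of the patch): a minimal sketch of the rule in
section 2 of proposal 167, where the consensus lists every key that
appears in any vote together with the median of the voted values, and
an even number of votes is resolved by taking the middle value closer
to zero. Function names are hypothetical. Separating pairs with spaces
and sorting keys are assumptions, as is the handling of two middle
values with equal magnitude (the lower one is chosen here).

```python
def consensus_params(votes):
    """Combine per-authority {key: int} votes into consensus parameters."""
    values = {}
    for vote in votes:
        for key, value in vote.items():
            values.setdefault(key, []).append(value)

    result = {}
    for key, vals in values.items():
        vals.sort()
        n = len(vals)
        if n % 2 == 1:
            result[key] = vals[n // 2]
        else:
            lower, upper = vals[n // 2 - 1], vals[n // 2]
            # Tie between two medians: use the one closer to zero.
            result[key] = min((lower, upper), key=abs)
    return result


def format_params_line(params):
    """Format a 'params' line; the separator and ordering are assumptions."""
    pairs = ["%s=%d" % (k, v) for k, v in sorted(params.items())]
    return "params " + " ".join(pairs) + "\n"
```

With two votes of circwindow=101 and circwindow=1000, the two middle
values are 101 and 1000, and the rule selects 101.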
diff --git a/doc/spec/proposals/168-reduce-circwindow.txt b/doc/spec/proposals/168-reduce-circwindow.txt
new file mode 100644
index 0000000000..c10cf41e2e
--- /dev/null
+++ b/doc/spec/proposals/168-reduce-circwindow.txt
@@ -0,0 +1,134 @@
+Filename: 168-reduce-circwindow.txt
+Title: Reduce default circuit window
+Author: Roger Dingledine
+Created: 12-Aug-2009
+Status: Open
+Target: 0.2.2
+
+0. History
+
+
+1. Overview
+
+ We should reduce the starting circuit "package window" from 1000 to
+ 101. The lower package window will mean that clients will only be able
+ to receive 101 cells (~50KB) on a circuit before they need to send a
+ 'sendme' acknowledgement cell to request 100 more.
+
+ Starting with a lower package window on exit relays should save on
+ buffer sizes (and thus memory requirements for the exit relay), and
+ should save on queue sizes (and thus latency for users).
+
+ Lowering the package window will induce an extra round-trip for every
+ additional 50298 bytes of the circuit. This extra step is clearly a
+ slow-down for large streams, but ultimately we hope that a) clients
+ fetching smaller streams will see better response, and b) slowing
+ down the large streams in this way will produce lower e2e latencies,
+ so the round-trips won't be so bad.
+
+2. Motivation
+
+ Karsten's torperf graphs show that the median download time for a 50KB
+ file over Tor in mid 2009 is 7.7 seconds, whereas the median download
+ time for 1MB and 5MB are around 50s and 150s respectively. The 7.7
+ second figure is way too high, whereas the 50s and 150s figures are
+ surprisingly low.
+
+ The median round-trip latency appears to be around 2s, with 25% of
+ the data points taking more than 5s. That's a lot of variance.
+
+ We originally designed Tor with the goal of maximizing
+ throughput. We figured that would also optimize other network properties
+ like round-trip latency. Looks like we were wrong.
+
+3. Design
+
+ Wherever we initialize the circuit package window, initialize it to
+ 101 rather than 1000. Reducing it should be safe even when interacting
+ with old Tors: the old Tors will receive the 101 cells and send back
+ a sendme ack cell. They'll still have much higher deliver windows,
+ but the rest of their deliver window will go unused.
+
+ You can find the patch at arma/circwindow. It seems to work.
+
+3.1. Why not 100?
+
+ Tor 0.0.0 through 0.2.1.19 have a bug where they only send the sendme
+ ack cell after 101 cells rather than the intended 100 cells.
+
+ Once 0.2.1.19 is obsolete we can change it back to 100 if we like. But
+ hopefully we'll have moved to some datagram protocol long before
+ 0.2.1.19 becomes obsolete.
+
+3.2. What about stream packaging windows?
+
+ Right now the stream packaging windows start at 500. The goal was to
+ set the stream window to half the circuit window, to provide a crude
+ load balancing between streams on the same circuit. Once we lower
+ the circuit packaging window, the stream packaging window basically
+ becomes redundant.
+
+ We could leave it in -- it isn't hurting much in either case. Or we
+ could take it out -- people building other Tor clients would thank us
+ for that step. Alas, people building other Tor clients are going to
+ have to be compatible with current Tor clients, so in practice there's
+ no point taking out the stream packaging windows.
+
+3.3. What about variable circuit windows?
+
+ Once upon a time we imagined adapting the circuit package window to
+ the network conditions. That is, we would start the window small,
+ and raise it based on the latency and throughput we see.
+
+ In theory that crude imitation of TCP's windowing system would allow
+ us to adapt to fill the network better. In practice, I think we want
+ to stick with the small window and never raise it. The low cap reduces
+ the total throughput you can get from Tor for a given circuit. But
+ that's a feature, not a bug.
+
+4. Evaluation
+
+ How do we know this change is actually smart? It seems intuitive that
+ it's helpful, and some smart systems people have agreed that it's
+ a good idea (or said another way, they were shocked at how big the
+ default package window was before).
+
+ To get a more concrete sense of the benefit, though, Karsten has been
+ running torperf side-by-side on exit relays with the old package window
+ vs the new one. The results are mixed currently -- it is slightly faster
+ for fetching 40KB files, and slightly slower for fetching 50KB files.
+
+ I think it's going to be tough to get a clear conclusion that this is
+ a good design just by comparing one exit relay running the patch. The
+ trouble is that the other hops in the circuits are still getting bogged
+ down by other clients introducing too much traffic into the network.
+
+ Ultimately, we'll want to put the circwindow parameter into the
+ consensus so we can test a broader range of values once enough relays
+ have upgraded.
+
+5. Transition and deployment
+
+ We should put the circwindow in the consensus (see proposal 167),
+ with an initial value of 101. Then as more exit relays upgrade,
+ clients should seamlessly get the better behavior.
+
+ Note that upgrading the exit relay will only affect the "download"
+ package window. An old client that's uploading lots of bytes will
+ continue to use the old package window at the client side, and we
+ can't throttle that window at the exit side without breaking protocol.
+
+ The real question then is what we should backport to 0.2.1. Assuming
+ this could be a big performance win, we can't afford to wait until
+ 0.2.2.x comes out before starting to see the changes here. So we have
+ two options as I see them:
+ a) once clients in 0.2.2.x know how to read the value out of the
+ consensus, and it's been tested for a bit, backport that part to
+ 0.2.1.x.
+ b) if it's too complex to backport, just pick a number, like 101, and
+ backport that number.
+
+ Clearly choice (a) is the better one if the consensus parsing part
+ isn't very complex. Let's shoot for that, and fall back to (b) if the
+ patch turns out to be so big that we reconsider.
+
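
Illustration (not part of the patch): a minimal sketch of the
arithmetic behind the "50298 bytes" figure in section 1 of proposal
168. It assumes 498 bytes of stream data per relay cell, which is
consistent with 50298 / 101 = 498. The stall estimate is a deliberately
crude model of "one extra round-trip for every additional 50298 bytes"
and ignores sendme cells that arrive while the sender is still
transmitting. All names are hypothetical.

```python
# Assumption: each relay data cell carries 498 bytes of stream payload.
RELAY_PAYLOAD_BYTES = 498


def bytes_per_window(window_cells):
    """Bytes that can be sent before the package window is exhausted."""
    return window_cells * RELAY_PAYLOAD_BYTES


def extra_round_trips(stream_bytes, window_cells=101):
    """Rough count of extra round-trips for a stream of the given size."""
    per_window = bytes_per_window(window_cells)
    if stream_bytes <= per_window:
        return 0
    # One additional round-trip for each further (possibly partial) window.
    return (stream_bytes - 1) // per_window
```

Under this model a 50 KB fetch fits in the first window of 101 cells,
while a 1 MB fetch incurs on the order of 19 additional round-trips.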
diff --git a/doc/spec/proposals/169-eliminating-renegotiation.txt b/doc/spec/proposals/169-eliminating-renegotiation.txt
new file mode 100644
index 0000000000..2c90f9c9e8
--- /dev/null
+++ b/doc/spec/proposals/169-eliminating-renegotiation.txt
@@ -0,0 +1,404 @@
+Filename: 169-eliminating-renegotiation.txt
+Title: Eliminate TLS renegotiation for the Tor connection handshake
+Author: Nick Mathewson
+Created: 27-Jan-2010
+Status: Draft
+Target: 0.2.2
+
+1. Overview
+
+ I propose a backward-compatible change to the Tor connection
+ establishment protocol to avoid the use of TLS renegotiation.
+
+ Rather than doing a TLS renegotiation to exchange certificates
+ and authenticate the original handshake, this proposal takes an
+ approach similar to Steven Murdoch's proposal 124, and uses Tor
+ cells to finish authenticating the parties' identities once the
+ initial TLS handshake is finished.
+
+ Terminological note: I use "client" below to mean the Tor
+ instance (a client or a relay) that initiates a TLS connection,
+ and "server" to mean the Tor instance (a relay) that accepts it.
+
+2. Motivation and history
+
+ In the original Tor TLS connection handshake protocol ("V1", or
+ "two-cert"), parties that wanted to authenticate provided a
+ two-cert chain of X.509 certificates during the handshake setup
+ phase. Every party that wanted to authenticate sent these
+ certificates.
+
+ In the current Tor TLS connection handshake protocol ("V2", or
+ "renegotiating"), the parties begin with a single certificate
+ sent from the server (responder) to the client (initiator), and
+ then renegotiate to a two-certs-from-each-authenticating-party handshake.
+ We made this change to make Tor's handshake look like a browser
+ speaking SSL to a webserver. (See proposal 130, and
+ tor-spec.txt.) To tell whether to use the V1 or V2 handshake,
+ servers look at the list of ciphers sent by the client. (This is
+ ugly, but there's not much else in the ClientHello that they can
+ look at.) If the list contains any cipher not used by the V1
+ protocol, the server sends back a single cert and expects a
+ renegotiation. If the client gets back a single cert, then it
+ withholds its own certificates until the TLS renegotiation phase.
+
+ In other words, initiator behavior now looks like this:
+
+ - Begin TLS negotiation with V2 cipher list; wait for
+ certificate(s).
+ - If we get a certificate chain:
+ - Then we are using the V1 handshake. Send our own
+ certificate chain as part of this initial TLS handshake
+ if we want to authenticate; otherwise, send no
+ certificates. When the handshake completes, check
+ certificates. We are now mutually authenticated.
+
+ Otherwise, if we get just a single certificate:
+ - Then we are using the V2 handshake. Do not send any
+ certificates during this handshake.
+ - When the handshake is done, immediately start a TLS
+ renegotiation. During the renegotiation, expect
+ a certificate chain from the server; send a certificate
+ chain of our own if we want to authenticate ourselves.
+ - After the renegotiation, check the certificates. Then
+ send (and expect) a VERSIONS cell from the other side to
+ establish the link protocol version.
+
+ And V2 responder behavior now looks like this:
+
+ - When we get a TLS ClientHello request, look at the cipher
+ list.
+ - If the cipher list contains only the V1 ciphersuites:
+ - Then we're doing a V1 handshake. Send a certificate
+ chain. Expect a possible client certificate chain in
+ response.
+ Otherwise, if we get other ciphersuites:
+ - We're using the V2 handshake. Send back a single
+ certificate and let the handshake complete.
+ - Do not accept any data until the client has renegotiated.
+ - When the client is renegotiating, send a certificate
+ chain, and expect (possibly multiple) certificates in
+ reply.
+ - Check the certificates when the renegotiation is done.
+ Then exchange VERSIONS cells.
+
+ Late in 2009, researchers found a flaw in most applications' use
+ of TLS renegotiation: Although TLS renegotiation does not
+ reauthenticate any information exchanged before the renegotiation
+ takes place, many applications were treating it as though it did,
+ and assuming that data sent _before_ the renegotiation was
+ authenticated with the credentials negotiated _during_ the
+ renegotiation. This problem was exacerbated by the fact that
+ most TLS libraries don't actually give you an obvious good way to
+ tell where the renegotiation occurred relative to the datastream.
+ Tor wasn't directly affected by this vulnerability, but its
+ aftermath hurts us in a few ways:
+
+ 1) OpenSSL has disabled renegotiation by default, and created
+ a "yes we know what we're doing" option we need to set to
+ turn it back on. (Two options, actually: one for openssl
+ 0.9.8l and one for 0.9.8m and later.)
+
+ 2) Some vendors have removed all renegotiation support from
+ their versions of OpenSSL entirely, forcing us to tell
+ users to either replace their versions of OpenSSL or to
+ link Tor against a hand-built one.
+
+ 3) Because of 1 and 2, I'd expect TLS renegotiation to become
+ rarer and rarer in the wild, making our own use stand out
+ more.
+
+3. Design
+
+3.1. The view in the large
+
+ Taking a cue from Steven Murdoch's proposal 124, I propose that
+ we move the work currently done by the TLS renegotiation step
+ (that is, authenticating the parties to one another) and do it
+ with Tor cells instead of with TLS.
+
+ Using _yet another_ variant response from the responder (server),
+ we allow the client to learn that it doesn't need to rehandshake
+ and can instead use a cell-based authentication system. Once the
+ TLS handshake is done, the client and server exchange VERSIONS
+ cells to determine link protocol version (including
+ handshake version). If they're using the handshake version
+ specified here, the client and server arrive at link protocol
+ version 3 (or higher), and use cells to exchange further
+ authentication information.
+
+3.2. New TLS handshake variant
+
+ We already used the list of ciphers from the clienthello to
+ indicate whether the client can speak the V2 ("renegotiating")
+ handshake or later, so we can't encode more information there.
+
+ We can, however, change the DN in the certificate passed by the
+ server back to the client. Currently, all V2 certificates are
+ generated with CN values ending with ".net". I propose that we
+ have the ".net" commonName ending reserved to indicate the V2
+ protocol, and use commonName values ending with ".com" to
+ indicate the V3 ("minimal") handshake described herein.
+
+ Now, once the initial TLS handshake is done, the client can look
+ at the server's certificate(s). If there is a certificate chain,
+ the handshake is V1. If there is a single certificate whose
+ subject commonName ends in ".net", the handshake is V2 and the
+ client should try to renegotiate as it would currently.
+ Otherwise, the client should assume that the handshake is V3+.
+ [Servers should _only_ send ".com" addresses, to allow room for
+ more signaling in the future.]
+
+3.3. Authenticating inside Tor
+
+ Once the TLS handshake is finished, if the client renegotiates,
+ then the server should go on as it does currently.
+
+ If the client implements this proposal, however, and the server
+ has shown it can understand the V3+ handshake protocol, the
+ client immediately sends a VERSIONS cell to the server
+ and waits to receive a VERSIONS cell in return. We negotiate
+ the Tor link protocol version _before_ we proceed with the
+ negotiation, in case we need to change the authentication
+ protocol in the future.
+
+ Once either party has seen the VERSIONS cell from the other, it
+ knows which version they will pick (that is, the highest version
+ shared by both parties' VERSIONS cells). All Tor instances using
+ the handshake protocol described in 3.2 MUST support at least
+ link protocol version 3 as described here.
+
+ On learning the link protocol, the server then sends the client a
+ CERT cell and a NETINFO cell. If the client wants to
+ authenticate to the server, it sends a CERT cell, an AUTHENTICATE
+ cell, and a NETINFO cell, or it may simply send a NETINFO cell if
+ it does not want to authenticate.
+
+ The CERT cell describes the keys that a Tor instance is claiming
+ to have. It is a variable-length cell. Its payload format is:
+
+ N: Number of certs in cell [1 octet]
+ N times:
+ CLEN [2 octets]
+ Certificate [CLEN octets]
+
+ Any extra octets at the end of a CERT cell MUST be ignored.
+
+ Each certificate has the form:
+
+ CertType [1 octet]
+ CertPurpose [1 octet]
+ PublicKeyLen [2 octets]
+ PublicKey [PublicKeyLen octets]
+ NotBefore [4 octets]
+ NotAfter [4 octets]
+ SignerID [HASH256_LEN octets]
+ SignatureLen [2 octets]
+ Signature [SignatureLen octets]
+
+ where CertType is 1 (meaning "RSA/SHA256")
+ CertPurpose is 1 (meaning "link certificate")
+ PublicKey is the DER encoding of the ASN.1 representation
+ of the RSA key of the subject of this certificate,
+ NotBefore is a time in HOURS since January 1, 1970, 00:00
+ UTC before which this certificate should not be
+ considered valid.
+ NotAfter is a time in HOURS since January 1, 1970, 00:00
+ UTC after which this certificate should not be
+ considered valid.
+ SignerID is the SHA-256 digest of the public key signing
+ this certificate
+ and Signature is the signature of all other fields in
+ this certificate, using SHA256 as described in proposal
+ 158.
+
+ While authenticating, a server need send only a self-signed
+ certificate for its identity key. (Its TLS certificate already
+ contains its link key signed by its identity key.) A client that
+ wants to authenticate MUST send two certificates: one containing
+ a public link key signed by its identity key, and one self-signed
+ cert for its identity.
+
+ Tor instances MUST ignore any certificate with an unrecognized
+ CertType or CertPurpose, and MUST ignore extra bytes in the cert.
+
+ The AUTHENTICATE cell proves to the server that the client with
+ whom it completed the initial TLS handshake is the one possessing
+ the link public key in its certificate. It is a variable-length
+ cell. Its contents are:
+
+ SignatureType [2 octets]
+ SignatureLen [2 octets]
+ Signature [SignatureLen octets]
+
+ where SignatureType is 1 (meaning "RSA-SHA256") and Signature is
+ an RSA-SHA256 signature of the HMAC-SHA256, using the TLS master
+ secret key as its key, of the following elements:
+
+ - The SignatureType field (0x00 0x01)
+ - The NUL terminated ASCII string: "Tor certificate verification"
+ - client_random, as sent in the Client Hello
+ - server_random, as sent in the Server Hello
+
+ Once the above handshake is complete, the client knows (from the
+ initial TLS handshake) that it has a secure connection to an
+ entity that controls a given link public key, and knows (from the
+ CERT cell) that the link public key is a valid public key for a
+ given Tor identity.
+
+ If the client authenticates, the server learns from the CERT cell
+ that a given Tor identity has a given current public link key.
+ From the AUTHENTICATE cell, it knows that an entity with that
+ link key knows the master secret for the TLS connection, and
+ hence must be the party with whom it's talking, if TLS works.
+
+3.4. Security checks
+
+ If the TLS handshake indicates a V2 or V3+ connection, the server
+ MUST reject any connection from the client that does not begin
+ with either a renegotiation attempt or a VERSIONS cell containing
+ at least link protocol version "3". If the TLS handshake
+ indicates a V3+ connection, the client MUST reject any connection
+ where the server sends anything before the client has sent a
+ VERSIONS cell, and any connection where the VERSIONS cell does
+ not contain at least link protocol version "3".
+
+ If link protocol version 3 is chosen:
+
+ Clients and servers MUST check that all digests and signatures
+ on the certificates in CERT cells they are given are as
+ described above.
+
+ After the VERSIONS cell, clients and servers MUST close the
+ connection if anything besides a CERT or AUTHENTICATE cell is
+ sent before the first NETINFO cell.
+
+ CERT or AUTHENTICATE cells anywhere after the first NETINFO
+ cell must be rejected.
+
+ ... [write more here. What else?] ...
+
+3.5. Summary
+
+ We now revisit the protocol outlines from section 2 to incorporate
+ our changes. New or modified steps are marked with a *.
+
+ The new initiator behavior now looks like this:
+
+ - Begin TLS negotiation with V2 cipher list; wait for
+ certificate(s).
+ - If we get a certificate chain:
+ - Then we are using the V1 handshake. Send our own
+ certificate chain as part of this initial TLS handshake
+ if we want to authenticate; otherwise, send no
+ certificates. When the handshake completes, check
+ certificates. We are now mutually authenticated.
+ Otherwise, if we get just a single certificate:
+ - Then we are using the V2 or the V3+ handshake. Do not
+ send any certificates during this handshake.
+ * When the handshake is done, look at the server's
+ certificate's subject commonName.
+ * If it ends with ".net", we're doing a V2 handshake:
+ - Immediately start a TLS renegotiation. During the
+ renegotiation, expect a certificate chain from the
+ server; send a certificate chain of our own if we
+ want to authenticate ourselves.
+ - After the renegotiation, check the certificates. Then
+ send (and expect) a VERSIONS cell from the other side
+ to establish the link protocol version.
+ * If it ends with anything else, assume a V3 or later
+ handshake:
+ * Send a VERSIONS cell, and wait for a VERSIONS cell
+ from the server.
+ * If we are authenticating, send CERT and AUTHENTICATE
+ cells.
+ * Send a NETINFO cell. Wait for a CERT and a NETINFO
+ cell from the server.
+ * If the CERT cell contains a valid self-identity cert,
+ and the identity key in the cert can be used to check
+ the signature on the x.509 certificate we got during
+ the TLS handshake, then we know we connected to the
+ server with that identity. If any of these checks
+ fail, or the identity key was not what we expected,
+ then we close the connection.
+ * Once the NETINFO cell arrives, continue as before.
+
+ And V3+ responder behavior now looks like this:
+
+ - When we get a TLS ClientHello request, look at the cipher
+ list.
+
+ - If the cipher list contains only the V1 ciphersuites:
+ - Then we're doing a V1 handshake. Send a certificate
+ chain. Expect a possible client certificate chain in
+ response.
+ Otherwise, if we get other ciphersuites:
+ - We're using the V2 or the V3+ handshake. Send back a single
+ certificate whose subject commonName ends with ".com",
+ and let the handshake complete.
+ * If the client does anything besides renegotiate or send a
+ VERSIONS cell, drop the connection.
+ - If the client renegotiates immediately, it's a V2
+ connection:
+ - When the client is renegotiating, send a certificate
+ chain, and expect (possibly multiple) certificates in
+ reply.
+ - Check the certificates when the renegotiation is done.
+ Then exchange VERSIONS cells.
+ * Otherwise we got a VERSIONS cell and it's a V3 handshake.
+ * Send a VERSIONS cell, a CERT cell, and a NETINFO
+ cell.
+ * Wait for the client to send cells in reply. If the
+ client sends a CERT and an AUTHENTICATE and a NETINFO,
+ use them to authenticate the client. If the client
+ sends a NETINFO, it is unauthenticated. If it sends
+ anything else before its NETINFO, it's rejected.
+
+4. Numbers to assign
+
+ We need a version number for this link protocol. I've been
+ calling it "3".
+
+ We need to reserve command numbers for CERT and AUTH cells. I
+ suggest that in link protocol 3 and higher, we reserve command
+ numbers 128..240 for variable-length cells. (241-256 we can hold
+ for future extensions.)
+
+5. Efficiency
+
+ This protocol adds a round-trip step when the client sends a
+ VERSIONS cell to the server, and waits for the {VERSIONS, CERT,
+ NETINFO} response in turn. (The server then waits for the
+ client's {NETINFO} or {CERT, AUTHENTICATE, NETINFO} reply,
+ but it would have already been waiting for the client's NETINFO,
+ so that's not an additional wait.)
+
+ This is actually fewer round-trip steps than required before for
+ TLS renegotiation, so that's a win.
+
+6. Open questions:
+
+ - Should we use X.509 certificates instead of the certificate-ish
+ things we describe here? They are more standard, but more ugly.
+
+ - May we cache which certificates we've already verified? It
+ might leak in timing whether we've connected with a given server
+ before, and how recently.
+
+ - Is there a better secret than the master secret to use in the
+ AUTHENTICATE cell? Say, a portable one? Can we get at it for
+ other libraries besides OpenSSL?
+
+ - Does using the client_random and server_random data in the
+ AUTHENTICATE message actually help us? How hard is it to pull
+ them out of the OpenSSL data structure?
+
+ - Can we give some way for clients to signal "I want to use the
+ V3 protocol if possible, but I can't renegotiate, so don't give
+ me the V2"? Clients currently have a fair idea of server
+ versions, so they could potentially do the V3+ handshake with
+ servers that support it, and fall back to V1 otherwise.
+
+ - What should servers that don't have TLS renegotiation do? For
+ now, I think they should just get it. Eventually we can
+ deprecate the V2 handshake as we did with the V1 handshake.
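
Editorial sketch (not part of the patch): the CERT cell payload layout from
section 3.3 of the proposal above (a one-octet count N, then N entries of a
two-octet CLEN followed by CLEN octets) can be split as shown below. This is
a minimal illustration only. The names parse_cert_cell and cert_span_t are
hypothetical, and big-endian (network order) encoding of CLEN is an
assumption, since the proposal text does not state the byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of one certificate inside a CERT cell payload. */
typedef struct {
  const uint8_t *body; /* points into the caller's payload buffer */
  uint16_t len;        /* the CLEN field for this certificate */
} cert_span_t;

/* Split a CERT cell payload into certificate spans.
 * Returns the number of certificates found, or -1 if the payload is
 * truncated or holds more certificates than out_cap.
 * Assumption: CLEN is big-endian. */
int
parse_cert_cell(const uint8_t *payload, size_t payload_len,
                cert_span_t *out, size_t out_cap)
{
  assert(payload != NULL && out != NULL);
  if (payload_len < 1)
    return -1;
  size_t n = payload[0]; /* N: number of certs in cell */
  if (n > out_cap)
    return -1;
  size_t off = 1;
  for (size_t i = 0; i < n; i++) {
    if (payload_len - off < 2)
      return -1; /* no room for CLEN */
    uint16_t clen = (uint16_t)((payload[off] << 8) | payload[off + 1]);
    off += 2;
    if (payload_len - off < clen)
      return -1; /* certificate body is truncated */
    out[i].body = payload + off;
    out[i].len = clen;
    off += clen;
  }
  /* Any extra octets after the last certificate are ignored. */
  return (int)n;
}
```

The spans alias the input buffer rather than copying it, so the caller must
keep the payload alive while inspecting them. Parsing the per-certificate
fields (CertType, CertPurpose, and so on) and checking signatures are left
out of this sketch.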
diff --git a/doc/spec/proposals/170-user-path-config.txt b/doc/spec/proposals/170-user-path-config.txt
new file mode 100644
index 0000000000..fa74c76f73
--- /dev/null
+++ b/doc/spec/proposals/170-user-path-config.txt
@@ -0,0 +1,95 @@
+Title: Configuration options regarding circuit building
+Filename: 170-user-path-config.txt
+Author: Sebastian Hahn
+Created: 01-March-2010
+Status: Draft
+
+Overview:
+
+ This document outlines how Tor handles the user configuration
+ options to influence the circuit building process.
+
+Motivation:
+
+ Tor's treatment of the configuration *Nodes options was surprising
+ to many users, and quite a few conspiracy theories have cropped up. We
+ should update our specification and code to better describe and
+ communicate what is going on during circuit building, and how we're
+ honoring configuration. So far, we've been tracking a bug report
+ about this behaviour (
+ https://bugs.torproject.org/flyspray/index.php?do=details&id=1090 )
+ and Nick replied in a thread on or-talk (
+ http://archives.seul.org/or/talk/Feb-2010/msg00117.html ).
+
+ This proposal tries to document our intention for those configuration
+ options.
+
+Design:
+
+ Five configuration options are available to users to influence Tor's
+ circuit building. EntryNodes and ExitNodes define a list of nodes
+ that are for the Entry/Exit position in all circuits. ExcludeNodes
+ is a list of nodes that are used for no circuit, and
+ ExcludeExitNodes is a list of nodes that aren't used as the last
+ hop. StrictNodes defines Tor's behaviour in case of a conflict, for
+ example when a node that is excluded is the only available
+ introduction point. Setting StrictNodes to 1 breaks Tor's
+ functionality in that case, and it will refuse to build such a
+ circuit.
+
+ Neither Nick's email nor bug 1090 has clear suggestions for how we
+ should behave in each case, so I tried to come up with something
+ that made sense to me.
+
+Security implications:
+
+ Deviating from normal circuit building can break one's anonymity, so
+ the documentation of the above options should contain a warning to
+ make users aware of the pitfalls.
+
+Specification:
+
+ It is proposed that the "User configuration" part of path-spec
+ (section 2.2.2) be replaced with this:
+
+ Users can alter the default behavior for path selection with
+ configuration options. In case of conflicts (excluding and requiring
+ the same node) the "StrictNodes" option is used to determine
+ behaviour. If a node is both excluded and required via a
+ configuration option, the exclusion takes precedence.
+
+ - If "ExitNodes" is provided, then every request requires an exit
+ node on the ExitNodes list. If a request is supported by no nodes
+ on that list, and "StrictNodes" is false, then Tor treats that
+ request as if ExitNodes were not provided.
+
+ - "EntryNodes" behaves analogously.
+
+ - If "ExcludeNodes" is provided, then no circuit uses any of the
+ nodes listed. If a circuit requires an excluded node to be used,
+ and "StrictNodes" is false, then Tor uses the node in that
+ position while not using any other of the excluded nodes.
+
+ - If "ExcludeExitNodes" is provided, then Tor will not use the nodes
+ listed for the exit position in a circuit. If a circuit requires
+ an excluded node to be used in the exit position and "StrictNodes"
+ is false, then Tor builds that circuit as if ExcludeExitNodes were
+ not provided.
+
+ - If a user tries to connect to or resolve a hostname of the form
+ <target>.<servername>.exit and the "AllowDotExit" configuration
+ option is set to 1, the request is rewritten to a request for
+ <target>, and the request is only supported by the exit whose
+ nickname or fingerprint is <servername>. If "AllowDotExit" is set
+ to 0 (default), any request for <anything>.exit is denied.
+
+ - When any of the *Nodes settings are changed, all circuits are
+ expired immediately, to prevent a situation where a previously
+ built circuit is used even though some of its nodes are now
+ excluded.
+
+
+Compatibility:
+
+ The old Strict*Nodes options are deprecated, and the StrictNodes
+ option is new. Tor users may need to update their configuration file.
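
Editorial sketch (not part of the patch): the ExitNodes rule in the
specification text above, combined with "the exclusion takes preference",
could look roughly like the following. This is a simplified illustration
under stated assumptions. exit_candidate_t and choose_exit are hypothetical
names, and the sketch returns the first acceptable candidate where a real
implementation would pick a weighted random one. It does not model the
fallback for a circuit that requires one specific excluded node.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of one candidate exit relay. */
typedef struct {
  bool in_exit_nodes;    /* listed in ExitNodes */
  bool excluded;         /* listed in ExcludeNodes or ExcludeExitNodes */
  bool supports_request; /* exit policy allows the requested target */
} exit_candidate_t;

/* Return the index of the first usable exit, or -1 if there is none.
 * Exclusion always wins over inclusion. If ExitNodes is set but no listed
 * node supports the request, fall back to ignoring ExitNodes only when
 * StrictNodes is false. */
int
choose_exit(const exit_candidate_t *c, size_t n,
            bool exit_nodes_set, bool strict_nodes)
{
  assert(c != NULL || n == 0);
  for (size_t i = 0; i < n; i++) {
    if (c[i].supports_request && !c[i].excluded &&
        (!exit_nodes_set || c[i].in_exit_nodes))
      return (int)i;
  }
  if (exit_nodes_set && !strict_nodes) {
    /* Treat the request as if ExitNodes were not provided. */
    for (size_t i = 0; i < n; i++) {
      if (c[i].supports_request && !c[i].excluded)
        return (int)i;
    }
  }
  return -1;
}
```

With StrictNodes set, the same inputs that trigger the fallback instead
yield -1, which corresponds to Tor refusing to build the circuit.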
diff --git a/doc/spec/proposals/172-circ-getinfo-option.txt b/doc/spec/proposals/172-circ-getinfo-option.txt
new file mode 100644
index 0000000000..b7fd79c9a8
--- /dev/null
+++ b/doc/spec/proposals/172-circ-getinfo-option.txt
@@ -0,0 +1,138 @@
+Filename: 172-circ-getinfo-option.txt
+Title: GETINFO controller option for circuit information
+Author: Damian Johnson
+Created: 03-June-2010
+Status: Accepted
+
+Overview:
+
+ This details an additional GETINFO option that would provide information
+ concerning a relay's current circuits.
+
+Motivation:
+
+ The original proposal was for connection related information, but Jake made
+ the excellent point that any information retrieved from the control port
+ is...
+
+ 1. completely ineffectual for auditing purposes since either (a) these
+ results can be fetched from netstat already or (b) the information would
+ only be provided via tor and can't be validated.
+
+ 2. The more useful uses for connection information can be achieved with
+ much less (and safer) information.
+
+ Hence the proposal is now for circuit based rather than connection based
+ information. This would strip the most controversial and sensitive data
+ entirely (ip addresses, ports, and connection based bandwidth breakdowns)
+ while still being useful for the following purposes:
+
+ - Basic Relay Usage Questions
+ How is the bandwidth I'm contributing broken down? Is it being evenly
+ distributed or is someone hogging most of it? Do these circuits belong to
+ the hidden service I'm running or something else? Now that I'm using exit
+ policy X am I desirable as an exit, or are most people just using me as a
+ relay?
+
+ - Debugging
+ Say a relay has a restrictive firewall policy for outbound connections,
+ with the ORPort whitelisted but doesn't realize that tor needs random high
+ ports. Tor would report success ("your orport is reachable - excellent")
+ yet the relay would be nonfunctional. This proposed information would
+ reveal numerous RELAY -> YOU -> UNESTABLISHED circuits, giving a good
+ indicator of what's wrong.
+
+ - Visualization
+ A nice benefit of visualizing tor's behavior is that it becomes a helpful
+ tool in puzzling out how tor works. For instance, tor spawns numerous
+ client connections at startup (even if unused as a client). As a newcomer
+ to tor these asymmetric (outbound only) connections mystified me for quite
+ a while until Roger explained their use to me. The proposed
+ TYPE_FLAGS would let controllers clearly label them as being client
+ related, making their purpose a bit clearer.
+
+ At the moment connection data can only be retrieved via commands like
+ netstat, ss, and lsof. However, providing an alternative via the control
+ port provides several advantages:
+
+ - scrubbing for private data
+ Raw connection data has no notion of what's sensitive and what is
+ not. The relay's flags and cached consensus can be used to take
+ educated guesses concerning which connections could possibly belong
+ to client or exit traffic, but this is both difficult and inaccurate.
+ Anything provided via the control port can be scrubbed to make sure we
+ aren't providing anything we think relay operators should not see.
+
+ - additional information
+ All connection querying commands strictly provide the ip address and
+ port of connections, and nothing else. However, for the uses listed
+ above the far more interesting attributes are the circuit's type,
+ bandwidth usage and uptime.
+
+ - improved performance
+ Querying connection data is an expensive activity, especially for
+ busy relays or low end processors (such as mobile devices). Tor
+ already internally knows its circuits, allowing for vastly quicker
+ lookups.
+
+ - cross platform capability
+ The connection querying utilities mentioned above not only aren't
+ available under Windows, but differ widely among different *nix
+ platforms. FreeBSD in particular takes a very unique approach,
+ dropping important options from netstat and assigning ss to a
+ spreadsheet application instead. A controller interface, however,
+ would provide a uniform means of retrieving this information.
+
+Security Implications:
+
+ This is an open question. This proposal lacks the most controversial pieces
+ of information (ip addresses and ports) and insight into potential threats
+ this would pose would be very welcome!
+
+Specification:
+
+ The following addition would be made to the control-spec's GETINFO section:
+
+ "rcirc/id/<Circuit identity>" -- Provides entry for the associated relay
+ circuit, formatted as:
+ CIRC_ID=<circuit ID> CREATED=<timestamp> UPDATED=<timestamp> TYPE=<flag>
+ READ=<bytes> WRITE=<bytes>
+
+ None of the parameters contain whitespace, and additional results must be
+ ignored to allow for future expansion. Parameters are defined as follows:
+ CIRC_ID - Unique numeric identifier for the circuit this belongs to.
+ CREATED - Unix timestamp (as seconds since the Epoch) for when the
+ circuit was created.
+ UPDATED - Unix timestamp for when this information was last updated.
+ TYPE - Single character flags indicating attributes in the circuit:
+ (E)ntry : has a connection that doesn't belong to a known Tor server,
+ indicating that this is either the first hop or bridged
+ E(X)it : has been used for at least one exit stream
+ (R)elay : has been extended
+ Rende(Z)vous : is being used for a rendezvous point
+ (I)ntroduction : is being used for a hidden service introduction
+ (N)one of the above: none of the above have happened yet.
+ READ - Total bytes transmitted toward the exit over the circuit.
+ WRITE - Total bytes transmitted toward the client over the circuit.
+
+ "rcirc/all" -- The 'rcirc/id/*' output for all current circuits, joined by
+ newlines.
+
+ The following would be included for circ info update events.
+
+4.1.X. Relay circuit status changed
+
+ The syntax is:
+ "650" SP "RCIRC" SP CircID SP Notice [SP Created SP Updated SP Type SP
+ Read SP Write] CRLF
+
+ Notice =
+ "NEW" / ; first information being provided for this circuit
+ "UPDATE" / ; update for a previously reported circuit
+ "CLOSED" ; notice that the circuit no longer exists
+
+ Notice indicating that queryable information on a relay related circuit has
+ changed. If the Notice parameter is either "NEW" or "UPDATE" then this
+ provides the same fields that would be given by calling "GETINFO rcirc/id/"
+ with the CircID.
+
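
Editorial sketch (not part of the patch): a controller could parse the
"rcirc/id/<Circuit identity>" reply format described above along these
lines. This is a minimal sketch under stated assumptions. rcirc_info_t and
parse_rcirc are hypothetical names, TYPE is assumed to be a short string of
flag characters, and unknown KEY=value pairs are skipped because the
proposal says additional results must be ignored.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical holder for one parsed rcirc entry. */
typedef struct {
  unsigned long long circ_id, created, updated, read_bytes, write_bytes;
  char type[8]; /* flag characters such as "E", "X", "R" (assumed short) */
} rcirc_info_t;

/* True if the key of length klen starting at k equals name. */
static int
key_is(const char *k, size_t klen, const char *name)
{
  return strlen(name) == klen && strncmp(k, name, klen) == 0;
}

/* Parse a space-separated KEY=value line.
 * Returns 0 if all six known keys were seen, -1 otherwise. */
int
parse_rcirc(const char *line, rcirc_info_t *out)
{
  assert(line != NULL && out != NULL);
  memset(out, 0, sizeof(*out));
  unsigned seen = 0;
  const char *p = line;
  for (;;) {
    p += strspn(p, " \r\n");            /* skip separators */
    size_t tok = strcspn(p, " \r\n");   /* length of this token */
    if (tok == 0)
      break;
    const char *eq = memchr(p, '=', tok);
    if (eq != NULL) {
      size_t klen = (size_t)(eq - p);
      const char *val = eq + 1;
      size_t vlen = tok - klen - 1;
      if (key_is(p, klen, "CIRC_ID")) {
        out->circ_id = strtoull(val, NULL, 10); seen |= 1u;
      } else if (key_is(p, klen, "CREATED")) {
        out->created = strtoull(val, NULL, 10); seen |= 2u;
      } else if (key_is(p, klen, "UPDATED")) {
        out->updated = strtoull(val, NULL, 10); seen |= 4u;
      } else if (key_is(p, klen, "TYPE") && vlen < sizeof(out->type)) {
        memcpy(out->type, val, vlen);
        out->type[vlen] = '\0';
        seen |= 8u;
      } else if (key_is(p, klen, "READ")) {
        out->read_bytes = strtoull(val, NULL, 10); seen |= 16u;
      } else if (key_is(p, klen, "WRITE")) {
        out->write_bytes = strtoull(val, NULL, 10); seen |= 32u;
      }
      /* Unknown keys fall through and are ignored. */
    }
    p += tok;
  }
  return seen == 0x3fu ? 0 : -1;
}
```

The parser tolerates any key order and does not validate that numeric
values are purely digits. A stricter controller might reject malformed
numbers instead of accepting the leading digits.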
diff --git a/doc/spec/proposals/173-getinfo-option-expansion.txt b/doc/spec/proposals/173-getinfo-option-expansion.txt
new file mode 100644
index 0000000000..03e18ef8d4
--- /dev/null
+++ b/doc/spec/proposals/173-getinfo-option-expansion.txt
@@ -0,0 +1,101 @@
+Filename: 173-getinfo-option-expansion.txt
+Title: GETINFO Option Expansion
+Author: Damian Johnson
+Created: 02-June-2010
+Status: Accepted
+
+Overview:
+
+ Over the course of developing arm there have been numerous hacks and
+ workarounds to glean pieces of basic, desirable information about the tor
+ process. As per Roger's request I've compiled a list of these pain points
+ to try and improve the control protocol interface.
+
+Motivation:
+
+ The purpose of this proposal is to expose additional process and relay
+ related information that is currently unavailable in a convenient,
+ dependable, and/or platform independent way. Examples of this are...
+
+ - The relay's total contributed bandwidth. This is a highly requested
+ piece of information and, based on the following patch from pipe, looks
+ trivial to include.
+ http://www.mail-archive.com/or-talk@freehaven.net/msg13085.html
+
+ - The process ID of the tor process. There is a high degree of guess work
+ in obtaining this. Arm for instance uses pidof, netstat, and ps yet
+ still fails on some platforms, and Orbot recently got a ticket about
+ its own attempt to fetch it with ps:
+ https://trac.torproject.org/projects/tor/ticket/1388
+
+ This just includes the pieces of missing information I've noticed
+ (suggestions or questions of their usefulness are welcome!).
+
+Security Implications:
+
+ None that I'm aware of. From a security standpoint this seems decently
+ innocuous.
+
+Specification:
+
+ The following addition would be made to the control-spec's GETINFO section:
+
+ "relay/bw-limit" -- Effective relayed bandwidth limit.
+
+ "relay/burst-limit" -- Effective relayed burst limit.
+
+ "relay/read-total" -- Total bytes relayed (download).
+
+ "relay/write-total" -- Total bytes relayed (upload).
+
+ "relay/flags" -- Space separated listing of flags currently held by the
+ relay as reported by the currently cached consensus.
+
+ "process/user" -- Username under which the tor process is running,
+ providing an empty string if none exists.
+
+ "process/pid" -- Process id belonging to the main tor process, -1 if none
+ exists for the platform.
+
+ "process/uptime" -- Total uptime of the tor process (in seconds).
+
+ "process/uptime-reset" -- Time since last reset (startup, sighup, or RELOAD
+ signal, in seconds).
+
+ "process/descriptors-used" -- Count of file descriptors used.
+
+ "process/descriptor-limit" -- File descriptor limit (getrlimit results).
+
+ "ns/authority" -- Router status info (v2 directory style) for all
+ recognized directory authorities, joined by newlines.
+
+ "state/names" -- A space-separated list of all the keys supported by this
+ version of Tor's state.
+
+ "state/val/<key>" -- Provides the current state value belonging to the
+ given key. If undefined, this provides the key's default value.
+
+ "status/ports-seen" -- A summary of which ports we've seen exiting
+ circuits connect to recently, formatted the same as the EXITS_SEEN status
+ event described in Section 4.1.XX. This GETINFO option is currently
+ available only for exit relays.
+
+4.1.XX. Per-port exit stats
+
+ The syntax is:
+ "650" SP "EXITS_SEEN" SP TimeStarted SP PortSummary CRLF
+
+ We just generated a new summary of which ports we've seen exiting circuits
+ connecting to recently. The controller could display this for the user, e.g.
+ in their "relay" configuration window, to give them a sense of how they're
+ being used (popularity of the various ports they exit to). Currently only
+ exit relays will receive this event.
+
+ TimeStarted is a quoted string indicating when the reported summary
+ counts from (in GMT).
+
+ The PortSummary keyword has as its argument a comma-separated, possibly
+ empty set of "port=count" pairs. For example (without linebreak),
+ 650-EXITS_SEEN TimeStarted="2008-12-25 23:50:43"
+ PortSummary=80=16,443=8
+
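
Editorial sketch (not part of the patch): the PortSummary argument described
above is a comma-separated, possibly empty set of "port=count" pairs, for
example 80=16,443=8. A controller might parse it as follows. port_count_t
and parse_port_summary are hypothetical names. Rejecting ports above 65535
and rejecting a trailing comma are choices made for this sketch, not
requirements stated in the proposal.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical holder for one "port=count" pair. */
typedef struct {
  unsigned port;
  unsigned long count;
} port_count_t;

/* Parse "port=count,port=count,..." into out.
 * Returns the number of pairs (0 for an empty string), or -1 on
 * malformed input or when more than cap pairs are present. */
int
parse_port_summary(const char *s, port_count_t *out, size_t cap)
{
  assert(s != NULL && out != NULL);
  size_t n = 0;
  while (*s != '\0') {
    char *end;
    if (!isdigit((unsigned char)*s))
      return -1;
    unsigned long port = strtoul(s, &end, 10);
    if (*end != '=' || port > 65535)
      return -1;
    s = end + 1;
    if (!isdigit((unsigned char)*s))
      return -1;
    unsigned long count = strtoul(s, &end, 10);
    if (n == cap)
      return -1;
    out[n].port = (unsigned)port;
    out[n].count = count;
    n++;
    if (*end == ',') {
      s = end + 1;
      if (*s == '\0')
        return -1; /* trailing comma with no pair after it */
    } else if (*end == '\0') {
      s = end;
    } else {
      return -1;
    }
  }
  return (int)n;
}
```

The isdigit checks before each strtoul call keep the parser from accepting
leading whitespace or a sign character, which strtoul would otherwise allow.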
diff --git a/doc/spec/proposals/174-optimistic-data-server.txt b/doc/spec/proposals/174-optimistic-data-server.txt
new file mode 100644
index 0000000000..d97c45e909
--- /dev/null
+++ b/doc/spec/proposals/174-optimistic-data-server.txt
@@ -0,0 +1,242 @@
+Filename: 174-optimistic-data-server.txt
+Title: Optimistic Data for Tor: Server Side
+Author: Ian Goldberg
+Created: 2-Aug-2010
+Status: Open
+
+Overview:
+
+When a SOCKS client opens a TCP connection through Tor (for an HTTP
+request, for example), the query latency is about 1.5x higher than it
+needs to be. Simply, the problem is that the sequence of data flows
+is this:
+
+1. The SOCKS client opens a TCP connection to the OP
+2. The SOCKS client sends a SOCKS CONNECT command
+3. The OP sends a BEGIN cell to the Exit
+4. The Exit opens a TCP connection to the Server
+5. The Exit returns a CONNECTED cell to the OP
+6. The OP returns a SOCKS CONNECTED notification to the SOCKS client
+7. The SOCKS client sends some data (the GET request, for example)
+8. The OP sends a DATA cell to the Exit
+9. The Exit sends the GET to the server
+10. The Server returns the HTTP result to the Exit
+11. The Exit sends the DATA cells to the OP
+12. The OP returns the HTTP result to the SOCKS client
+
+Note that the Exit node knows that the connection to the Server was
+successful at the end of step 4, but is unable to send the HTTP query to
+the server until step 9.
+
+This proposal (as well as its upcoming sibling concerning the client
+side) aims to reduce the latency by allowing:
+1. SOCKS clients to optimistically send data before they are notified
+ that the SOCKS connection has completed successfully
+2. OPs to optimistically send DATA cells on streams in the CONNECT_WAIT
+ state
+3. Exit nodes to accept and queue DATA cells while in the
+ EXIT_CONN_STATE_CONNECTING state
+
+This particular proposal deals with #3.
+
+In this way, the flow would be as follows:
+
+1. The SOCKS client opens a TCP connection to the OP
+2. The SOCKS client sends a SOCKS CONNECT command, followed immediately
+ by data (such as the GET request)
+3. The OP sends a BEGIN cell to the Exit, followed immediately by DATA
+ cells
+4. The Exit opens a TCP connection to the Server
+5. The Exit returns a CONNECTED cell to the OP, and sends the queued GET
+ request to the Server
+6. The OP returns a SOCKS CONNECTED notification to the SOCKS client,
+ and the Server returns the HTTP result to the Exit
+7. The Exit sends the DATA cells to the OP
+8. The OP returns the HTTP result to the SOCKS client
+
+Motivation:
+
+This change will save one OP<->Exit round trip (down to one from two).
+There are still two SOCKS Client<->OP round trips (negligible time) and
+two Exit<->Server round trips. Depending on the ratio of the
+Exit<->Server (Internet) RTT to the OP<->Exit (Tor) RTT, this will
+decrease the latency by 25 to 50 percent. Experiments validate these
+predictions. [Goldberg, PETS 2010 rump session; see
+https://thunk.cs.uwaterloo.ca/optimistic-data-pets2010-rump.pdf ]
+
+Design:
+
+The current code actually correctly handles queued data at the Exit; if
+there is queued data in an EXIT_CONN_STATE_CONNECTING stream, that data
+will be immediately sent when the connection succeeds. If the
+connection fails, the data will be correctly ignored and freed. The
+problem with the current server code is that the server currently
+drops DATA cells on streams in the EXIT_CONN_STATE_CONNECTING state.
+Also, if you try to queue data in the EXIT_CONN_STATE_RESOLVING state,
+bad things happen because streams in that state don't yet have
+conn->write_event set, and so some existing sanity checks (any stream
+with queued data is at least potentially writable) are no longer sound.
+
+The solution is to simply not drop received DATA cells while in the
+EXIT_CONN_STATE_CONNECTING state. Also do not send SENDME cells in this
+state, so that the OP cannot send more than one window's worth of data
+to be queued at the Exit. Finally, patch the sanity checks so that
+streams in the EXIT_CONN_STATE_RESOLVING state that have buffered data
+can pass.
+
+If no clients ever send such optimistic data, the new code will never be
+executed, and the behaviour of Tor will not change. When clients begin
+to send optimistic data, the performance of those clients' streams will
+improve.
+
+After discussion with nickm, it seems best to just have the server
+version number be the indicator of whether a particular Exit supports
+optimistic data. (If a client sends optimistic data to an Exit which
+does not support it, the data will be dropped, and the client's request
+will fail to complete.) What do version numbers for hypothetical future
+protocol-compatible implementations look like, though?
+
+Security implications:
+
+Servers (for sure the Exit, and possibly others, by watching the
+pattern of packets) will be able to tell that a particular client
+is using optimistic data. This will be discussed more in the sibling
+proposal.
+
+On the Exit side, servers will be queueing a little bit of extra data, but
+no more than one window. Clients today can cause Exits to queue that
+much data anyway, simply by establishing a Tor connection to a slow
+machine, and sending one window of data.
+
+Specification:
+
+tor-spec section 6.2 currently says:
+
+ The OP waits for a RELAY_CONNECTED cell before sending any data.
+ Once a connection has been established, the OP and exit node
+ package stream data in RELAY_DATA cells, and upon receiving such
+ cells, echo their contents to the corresponding TCP stream.
+ RELAY_DATA cells sent to unrecognized streams are dropped.
+
+It is not clear exactly what an "unrecognized" stream is, but this last
+sentence would be changed to say that RELAY_DATA cells received on a
+stream that has processed a RELAY_BEGIN cell and has not yet issued a
+RELAY_END or a RELAY_CONNECTED cell are queued; that queue is processed
+immediately after a RELAY_CONNECTED cell is issued for the stream, or
+freed after a RELAY_END cell is issued for the stream.
+
+The earlier part of this section will be addressed in the sibling
+proposal.
+
+Compatibility:
+
+There are compatibility issues, as mentioned above. OPs MUST NOT send
+optimistic data to Exit nodes whose version numbers predate (something).
+OPs MAY send optimistic data to Exit nodes whose version numbers match
+or follow that value. (But see the question about independent server
+reimplementations, above.)
+
+Implementation:
+
+Here is a simple patch. It seems to work with both regular streams and
+hidden services, but there may be other corner cases I'm not aware of.
+(Do streams used for directory fetches, hidden services, etc. take a
+different code path?)
+
+diff --git a/src/or/connection.c b/src/or/connection.c
+index 7b1493b..f80cd6e 100644
+--- a/src/or/connection.c
++++ b/src/or/connection.c
+@@ -2845,7 +2845,13 @@ _connection_write_to_buf_impl(const char *string, size_t len,
+ return;
+ }
+
+- connection_start_writing(conn);
++ /* If we receive optimistic data in the EXIT_CONN_STATE_RESOLVING
++ * state, we don't want to try to write it right away, since
++ * conn->write_event won't be set yet. Otherwise, write data from
++ * this conn as the socket is available. */
++ if (conn->state != EXIT_CONN_STATE_RESOLVING) {
++ connection_start_writing(conn);
++ }
+ if (zlib) {
+ conn->outbuf_flushlen += buf_datalen(conn->outbuf) - old_datalen;
+ } else {
+@@ -3382,7 +3388,11 @@ assert_connection_ok(connection_t *conn, time_t now)
+ tor_assert(conn->s < 0);
+
+ if (conn->outbuf_flushlen > 0) {
+- tor_assert(connection_is_writing(conn) || conn->write_blocked_on_bw ||
++ /* With optimistic data, we may have queued data in
++ * EXIT_CONN_STATE_RESOLVING while the conn is not yet marked to writing.
++ * */
++ tor_assert(conn->state == EXIT_CONN_STATE_RESOLVING ||
++ connection_is_writing(conn) || conn->write_blocked_on_bw ||
+ (CONN_IS_EDGE(conn) && TO_EDGE_CONN(conn)->edge_blocked_on_circ));
+ }
+
+diff --git a/src/or/relay.c b/src/or/relay.c
+index fab2d88..e45ff70 100644
+--- a/src/or/relay.c
++++ b/src/or/relay.c
+@@ -1019,6 +1019,9 @@ connection_edge_process_relay_cell(cell_t *cell, circuit_t *circ,
+ relay_header_t rh;
+ unsigned domain = layer_hint?LD_APP:LD_EXIT;
+ int reason;
++ int optimistic_data = 0; /* Set to 1 if we receive data on a stream
++ that's in the EXIT_CONN_STATE_RESOLVING
++ or EXIT_CONN_STATE_CONNECTING states.*/
+
+ tor_assert(cell);
+ tor_assert(circ);
+@@ -1038,9 +1041,20 @@ connection_edge_process_relay_cell(cell_t *cell, circuit_t *circ,
+ /* either conn is NULL, in which case we've got a control cell, or else
+ * conn points to the recognized stream. */
+
+- if (conn && !connection_state_is_open(TO_CONN(conn)))
+- return connection_edge_process_relay_cell_not_open(
+- &rh, cell, circ, conn, layer_hint);
++ if (conn && !connection_state_is_open(TO_CONN(conn))) {
++ if ((conn->_base.state == EXIT_CONN_STATE_CONNECTING ||
++ conn->_base.state == EXIT_CONN_STATE_RESOLVING) &&
++ rh.command == RELAY_COMMAND_DATA) {
++ /* We're going to allow DATA cells to be delivered to an exit
++ * node in state EXIT_CONN_STATE_CONNECTING or
++ * EXIT_CONN_STATE_RESOLVING. This speeds up HTTP, for example. */
++ log_warn(domain, "Optimistic data received.");
++ optimistic_data = 1;
++ } else {
++ return connection_edge_process_relay_cell_not_open(
++ &rh, cell, circ, conn, layer_hint);
++ }
++ }
+
+ switch (rh.command) {
+ case RELAY_COMMAND_DROP:
+@@ -1090,7 +1104,9 @@ connection_edge_process_relay_cell(cell_t *cell, circuit_t *circ,
+ log_debug(domain,"circ deliver_window now %d.", layer_hint ?
+ layer_hint->deliver_window : circ->deliver_window);
+
+- circuit_consider_sending_sendme(circ, layer_hint);
++ if (!optimistic_data) {
++ circuit_consider_sending_sendme(circ, layer_hint);
++ }
+
+ if (!conn) {
+ log_info(domain,"data cell dropped, unknown stream (streamid %d).",
+@@ -1107,7 +1123,9 @@ connection_edge_process_relay_cell(cell_t *cell, circuit_t *circ,
+ stats_n_data_bytes_received += rh.length;
+ connection_write_to_buf(cell->payload + RELAY_HEADER_SIZE,
+ rh.length, TO_CONN(conn));
+- connection_edge_consider_sending_sendme(conn);
++ if (!optimistic_data) {
++ connection_edge_consider_sending_sendme(conn);
++ }
+ return 0;
+ case RELAY_COMMAND_END:
+ reason = rh.length > 0 ?
+
+Performance and scalability notes:
+
+There may be more RAM used at Exit nodes, as mentioned above, but it is
+transient.
diff --git a/doc/spec/proposals/ideas/xxx-bwrate-algs.txt b/doc/spec/proposals/ideas/xxx-bwrate-algs.txt
new file mode 100644
index 0000000000..757f5bc55e
--- /dev/null
+++ b/doc/spec/proposals/ideas/xxx-bwrate-algs.txt
@@ -0,0 +1,106 @@
+# The following two algorithms
+
+
+# Algorithm 1
+# TODO: Burst and Relay/Regular differentiation
+
+BwRate = Bandwidth Rate in Bytes Per Second
+GlobalWriteBucket = 0
+GlobalReadBucket = 0
+Epoch = Token Fill Rate in seconds: suggest 50ms=.050
+SecondCounter = 0
+MinWriteBytes = Minimum number of bytes per write
+
+Every Epoch Seconds:
+ UseMinWriteBytes = MinWriteBytes
+ WriteCnt = 0
+ ReadCnt = 0
+ BytesRead = 0
+
+ For Each Open OR Conn with pending write data:
+ WriteCnt++
+ For Each Open OR Conn:
+ ReadCnt++
+
+ BytesToRead = (BwRate*Epoch + GlobalReadBucket)/ReadCnt
+ BytesToWrite = (BwRate*Epoch + GlobalWriteBucket)/WriteCnt
+
+ if BwRate/WriteCnt < MinWriteBytes:
+ # If we aren't likely to accumulate enough bytes in a second to
+ # send a whole cell for our connections, send partials
+ Log(NOTICE, "Too many ORCons to write full blocks. Sending short packets.")
+ UseMinWriteBytes = 1
+ # Other option: We could switch to plan 2 here
+
+ # Service each writable ORConn. If there are any partial writes,
+ # return remaining bytes from this epoch to the global pool
+ For Each Open OR Conn with pending write data:
+ ORConn->write_bucket += BytesToWrite
+ if ORConn->write_bucket > UseMinWriteBytes:
+ w = write(ORConn, MIN(len(ORConn->write_data), ORConn->write_bucket))
+ # It is possible that w < len(ORConn->write_data) here due to TCP
+ # pushback. We should restore the rest of the write_bucket to the
+ # global bucket
+ GlobalWriteBucket += (ORConn->write_bucket - w)
+ ORConn->write_bucket = 0
+
+ For Each Open OR Conn:
+ r = read_nonblock(ORConn, BytesToRead)
+ BytesRead += r
+
+ SecondCounter += Epoch
+ if SecondCounter < 1:
+ # Save unused bytes from this epoch to be used later in the second
+ GlobalReadBucket += (BwRate*Epoch - BytesRead)
+ else:
+ SecondCounter = 0
+ GlobalReadBucket = 0
+ GlobalWriteBucket = 0
+ For Each ORConn:
+ ORConn->write_bucket = 0
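
As a worked example of the per-epoch arithmetic in Algorithm 1, the
following sketch computes the per-connection read and write allotments
and the effective minimum write size. It is an assumption-laden
illustration: the function name epoch_shares and its parameters are
hypothetical, the epoch is given in integer milliseconds to keep the
arithmetic exact, and the guard against zero connections is an addition
that the pseudocode above does not spell out.

```python
def epoch_shares(bw_rate, epoch_ms, read_bucket, write_bucket,
                 read_cnt, write_cnt, min_write_bytes):
    """Return (bytes_to_read, bytes_to_write, use_min_write_bytes).

    bw_rate is in bytes per second; epoch_ms is the epoch in milliseconds.
    """
    # Bytes earned during one epoch at the configured rate.
    budget = bw_rate * epoch_ms // 1000
    # Split the budget plus any carried-over bucket evenly across connections.
    bytes_to_read = (budget + read_bucket) // read_cnt if read_cnt else 0
    bytes_to_write = (budget + write_bucket) // write_cnt if write_cnt else 0
    # If a connection cannot accumulate a full block within a second,
    # fall back to allowing short writes.
    use_min = min_write_bytes
    if write_cnt and bw_rate / write_cnt < min_write_bytes:
        use_min = 1
    return bytes_to_read, bytes_to_write, use_min
```

For instance, at 100000 bytes per second with a 50 ms epoch, empty
buckets, 10 readable connections and 4 writable ones, each epoch earns
5000 bytes: 500 per reader and 1250 per writer.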
+
+
+
+# Alternate plan for Writing fairly. Reads would still be covered
+# by plan 1 as there is no additional network overhead for short reads,
+# so we don't need to try to avoid them.
+#
+# I think this is actually pretty similar to what we do now, but
+# with the addition that the bytes accumulate up to the second mark
+# and we try to keep track of our position in the write list here
+# (unless libevent is doing that for us already and I just don't see it)
+#
+# TODO: Burst and Relay/Regular differentiation
+
+# XXX: The inability to send single cells will cause us to block
+# on EXTEND cells for low-bandwidth node pairs..
+BwRate = Bandwidth Rate in Bytes Per Second
+WriteBytes = Bytes per write
+Epoch = MAX(MIN(WriteBytes/BwRate, .333s), .050s)
+
+SecondCounter = 0
+GlobalWriteBucket = 0
+
+# New connections are inserted at Head-1 (the 'tail' of this circular list)
+# This is not 100% fifo for all node data, but it is the best we can do
+# without insane amounts of additional queueing complexity.
+WriteConnList = List of Open OR Conns with pending write data > WriteBytes
+WriteConnHead = 0
+
+Every Epoch Seconds:
+ GlobalWriteBucket += BwRate*Epoch
+ WriteListEnd = WriteConnHead
+
+ do
+ ORConn = WriteConnList[WriteConnHead]
+ w = write(ORConn, WriteBytes)
+ GlobalWriteBucket -= w
+ WriteConnHead = (WriteConnHead + 1) % len(WriteConnList)
+ while GlobalWriteBucket > 0 and WriteConnHead != WriteListEnd
+
+ SecondCounter += Epoch
+ if SecondCounter >= 1:
+ SecondCounter = 0
+ GlobalWriteBucket = 0
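
The alternate write plan can be sketched as follows. This is a
simplified model under stated assumptions: the names epoch_length and
service_writes are hypothetical, each connection is represented only by
its count of pending bytes, a write always succeeds in full (no TCP
pushback), and the head index wraps around the circular list.

```python
def epoch_length(write_bytes, bw_rate):
    """Epoch in seconds, clamped to the range [0.050, 0.333]."""
    return max(min(write_bytes / bw_rate, 0.333), 0.050)


def service_writes(pending, head, bucket, write_bytes):
    """Service connections round-robin for one epoch.

    pending: list of pending byte counts, one per connection (circular).
    head:    index of the connection to service first.
    bucket:  global write bucket, already topped up for this epoch.
    Returns the new (head, bucket); pending is updated in place.
    """
    if not pending:
        return head, bucket
    end = head
    while True:
        w = min(pending[head], write_bytes)  # bytes actually written
        pending[head] -= w
        bucket -= w
        head = (head + 1) % len(pending)     # advance around the circle
        if bucket <= 0 or head == end:
            break
    return head, bucket
```

Because the head position is carried over between epochs, a connection
skipped when the bucket runs out is the first one serviced next time.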
+
+
diff --git a/doc/spec/proposals/ideas/xxx-choosing-crypto-in-tor-protocol.txt b/doc/spec/proposals/ideas/xxx-choosing-crypto-in-tor-protocol.txt
new file mode 100644
index 0000000000..e8489570f7
--- /dev/null
+++ b/doc/spec/proposals/ideas/xxx-choosing-crypto-in-tor-protocol.txt
@@ -0,0 +1,138 @@
+Filename: xxx-choosing-crypto-in-tor-protocol.txt
+Title: Picking cryptographic standards in the Tor wire protocol
+Author: Marian
+Created: 2009-05-16
+Status: Draft
+
+Motivation:
+
+ SHA-1 is horribly outdated and not suited for security critical
+ purposes. SHA-2, RIPEMD-160, Whirlpool and Tiger are good options
+ for a short-term replacement, but in the long run, we will
+ probably want to upgrade to the winner or a semi-finalist of the
+ SHA-3 competition.
+
+ For a 2006 comparison of different hash algorithms, read:
+ http://www.sane.nl/sane2006/program/final-papers/R10.pdf
+
+ Other reading about SHA-1:
+ http://www.schneier.com/blog/archives/2005/02/sha1_broken.html
+ http://www.schneier.com/blog/archives/2005/08/new_cryptanalyt.html
+ http://www.schneier.com/paper-preimages.html
+
+ Additionally, AES has been theoretically broken for years. While
+ the attack is still not efficient enough that the public sector
+ has been able to prove that it works, we should probably consider
+ the time between a theoretical attack and a practical attack as an
+ opportunity to figure out how to upgrade to a better algorithm,
+ such as Twofish.
+
+ See:
+ http://schneier.com/crypto-gram-0209.html#1
+
+Design:
+
+ I suggest that nodes should publish in directories which
+ cryptographic standards, such as hash algorithms and ciphers,
+ they support. Clients communicating with nodes will then
+ pick whichever of those cryptographic standards they prefer
+ the most. In the case that the node does not publish which
+ cryptographic standards it supports, the client should assume
+ that the server supports the older standards, such as SHA-1
+ and AES, until such time as we choose to desupport those
+ standards.
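
 The client-side selection described above might look like the
 following sketch. Everything here is hypothetical: no descriptor
 format for publishing algorithms exists yet, so the dictionaries,
 the function name choose, and the example preference order are
 assumptions made only for illustration.

```python
# Standards assumed when a node publishes nothing (the older defaults).
LEGACY = {"hash": ["SHA-1"], "cipher": ["AES"]}

# Hypothetical client preference order, most preferred first.
CLIENT_PREFS = {"hash": ["SHA-256", "SHA-1"], "cipher": ["AES"]}


def choose(kind, published):
    """Pick the client's most preferred algorithm that the node supports.

    kind:      "hash" or "cipher".
    published: dict of lists taken from the node's directory entry,
               or None if the node publishes nothing.
    Returns the chosen algorithm name, or None if there is no overlap.
    """
    supported = published.get(kind) if published else None
    if not supported:
        supported = LEGACY[kind]  # fall back to the older standards
    for alg in CLIENT_PREFS[kind]:
        if alg in supported:
            return alg
    return None
```

 With these example preferences, a node that lists both SHA-1 and
 SHA-256 is contacted using SHA-256, while a node that lists nothing
 is treated as supporting only SHA-1 and AES.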
+
+ Node to node communications could work similarly. However, in
+ case they both support a set of algorithms but have different
+ preferences, the disagreement would have to be resolved
+ somehow. Two possibilities include:
+ * the node requesting communications presents which
+ cryptographic standards it supports in the request. The
+ other node picks.
+ * both nodes send each other lists of what they support and
+ what version of Tor they are using. The newer node picks,
+ based on the assumption that the newer node has the most up
+ to date information about which hash algorithm is the best.
+ Of course, the node could lie about its version, but then
+ again, it could also maliciously choose only to support older
+ algorithms.
+
+ Using this method, we could potentially add server side support
+ to hash algorithms and ciphers before we instruct clients to
+ begin preferring those hash algorithms and ciphers. In this way,
+ the clients could upgrade and the servers would already support
+ the newly preferred hash algorithms and ciphers, even if the
+ servers were still using older versions of Tor, so long as the
+ older versions of Tor were at least new enough to have server
+ side support.
+
+ This would make quickly upgrading to new hash algorithms and
+ ciphers easier. This could be very useful when new attacks
+ are published.
+
+ One concern is that client preferences could expose the client
+ to segmentation attacks. To mitigate this, we suggest hardcoding
+ preferences in the client, to prevent the client from choosing
+ to use a new hash algorithm or cipher that no one else is using
+ yet. While offering a preference might be useful in case a client
+ with an older version of Tor wants to start using the newer hash
+ algorithm or cipher that everyone else is using, if the client
+ cares enough, he or she can just upgrade Tor.
+
+ We may also have to worry about nodes which, through laziness or
+ maliciousness, refuse to start supporting new hash algorithms or
+ ciphers. This must be balanced with the need to maintain
+ backward compatibility so the client will have a large selection
+ of nodes to pick from. Adding new hash algorithms and ciphers
+ long before we suggest nodes start using them can help mitigate
+ this. However, eventually, once sufficient nodes support new
+ standards, client side support for older standards should be
+ disabled, particularly if there are practical rather than merely
+ theoretical attacks.
+
+ Server side support for older standards can be kept much longer
+ than client side support, since clients using older hashes and
+ ciphers are really only hurting themselves.
+
+ If server side support for a hash algorithm or cipher is added
+ but never preferred before we decide we don't really want it,
+ support can be removed without having to worry about backward
+ compatibility.
+
+Security implications:
+ Improving cryptography will improve Tor's security. However, if
+ clients pick different cryptographic standards, they could be
+ partitioned based on their cryptographic preferences. We also
+ need to worry about nodes refusing to support new standards.
+ These issues are detailed above.
+
+Specification:
+
+ Todo. Need better understanding of how Tor currently works or
+ help from someone who does.
+
+Compatibility:
+
+ This idea is intended to allow easier upgrading of cryptographic
+ hash algorithms and ciphers while maintaining backwards
+ compatibility. However, at some point, backwards compatibility
+ with very old hashes and ciphers should be dropped for security
+ reasons.
+
+Implementation:
+
+ Todo.
+
+Performance and scalability notes:
+
+ Better hashes and ciphers are sometimes a little more CPU-intensive
+ than weaker ones. For instance, on most computers AES is a little
+ faster than Twofish. However, in that example, I consider Twofish's
+ additional security worth the tradeoff.
+
+Acknowledgements:
+
+ Discussed this on IRC with a few people, mostly Nick Mathewson.
+ Nick was particularly helpful in explaining how Tor works,
+ explaining goals, and providing various links to Tor
+ specifications.
diff --git a/doc/spec/proposals/ideas/xxx-encrypted-services.txt b/doc/spec/proposals/ideas/xxx-encrypted-services.txt
new file mode 100644
index 0000000000..3414f3c4fb
--- /dev/null
+++ b/doc/spec/proposals/ideas/xxx-encrypted-services.txt
@@ -0,0 +1,18 @@
+
+the basic idea might be to generate a keypair, and sign little statements
+like "this key corresponds to this relay id", and publish them on karsten's
+hs dht.
+
+so if you want to talk to it, you look it up, then go to that exit.
+and by 'go to' i mean 'build a tor circuit like normal except you're sure
+where to exit'
+
+connecting to it is slower than usual, but once you're connected, it's no
+slower than normal tor.
+and you get what wikileaks wants from its hidden service, which is really
+just the UI piece.
+indymedia also wants this.
+
+might be interesting to let an encrypted service list more than one relay,
+too.
+
diff --git a/doc/spec/proposals/ideas/xxx-hide-platform.txt b/doc/spec/proposals/ideas/xxx-hide-platform.txt
index 3fed5cfbd4..ad19fb1fd4 100644
--- a/doc/spec/proposals/ideas/xxx-hide-platform.txt
+++ b/doc/spec/proposals/ideas/xxx-hide-platform.txt
@@ -1,7 +1,5 @@
Filename: xxx-hide-platform.txt
Title: Hide Tor Platform Information
-Version: $Revision$
-Last-Modified: $Date$
Author: Jacob Appelbaum
Created: 24-July-2008
Status: Draft
diff --git a/doc/spec/proposals/ideas/xxx-port-knocking.txt b/doc/spec/proposals/ideas/xxx-port-knocking.txt
index 9fbcdf3545..85c27ec52d 100644
--- a/doc/spec/proposals/ideas/xxx-port-knocking.txt
+++ b/doc/spec/proposals/ideas/xxx-port-knocking.txt
@@ -1,7 +1,5 @@
Filename: xxx-port-knocking.txt
Title: Port knocking for bridge scanning resistance
-Version: $Revision$
-Last-Modified: $Date$
Author: Jacob Appelbaum
Created: 19-April-2009
Status: Draft
diff --git a/doc/spec/proposals/ideas/xxx-separate-streams-by-port.txt b/doc/spec/proposals/ideas/xxx-separate-streams-by-port.txt
index cebde65a9b..f26c1e580f 100644
--- a/doc/spec/proposals/ideas/xxx-separate-streams-by-port.txt
+++ b/doc/spec/proposals/ideas/xxx-separate-streams-by-port.txt
@@ -1,7 +1,5 @@
Filename: xxx-separate-streams-by-port.txt
Title: Separate streams across circuits by destination port
-Version: $Revision$
-Last-Modified: $Date$
Author: Robert Hogan
Created: 21-Oct-2008
Status: Draft
diff --git a/doc/spec/proposals/ideas/xxx-using-spdy.txt b/doc/spec/proposals/ideas/xxx-using-spdy.txt
new file mode 100644
index 0000000000..d733a84b69
--- /dev/null
+++ b/doc/spec/proposals/ideas/xxx-using-spdy.txt
@@ -0,0 +1,143 @@
+Filename: xxx-using-spdy.txt
+Title: Using the SPDY protocol to improve Tor performance
+Author: Steven J. Murdoch
+Created: 03-Feb-2010
+Status: Draft
+Target:
+
+1. Overview
+
+ The SPDY protocol [1] is an alternative method for transferring
+ web content over TCP, designed to improve efficiency and
+ performance. A SPDY-aware browser can already communicate with
+ a SPDY-aware web server over Tor, because this only requires a TCP
+ stream to be set up. However, a SPDY-aware browser cannot
+ communicate with a non-SPDY-aware web server. This proposal
+ outlines how Tor could support this latter case, and why it
+ may be good for performance.
+
+2. Motivation
+
+ About 90% of Tor traffic, by connection, is HTTP [2], but
+ users report subjective performance to be poor. It would
+ therefore be desirable to improve this situation. SPDY was
+ designed to offer better performance than HTTP, in
+ high-latency and/or low-bandwidth situations, and is therefore
+ an option worth examining.
+
+ If a user wishes to access a SPDY-enabled web server over Tor,
+ all they need to do is to configure their SPDY-enabled browser
+ (e.g. Google Chrome) to use Tor. However, there are few
+ SPDY-enabled web servers, and even if there were high demand
+ from Tor users, there would be little motivation for server
+ operators to upgrade, for the benefit of only a small
+ proportion of their users.
+
+ The motivation of this proposal is to allow only the user to
+ install a SPDY-enabled browser, and permit web servers to
+ remain unmodified. Essentially, Tor would incorporate a proxy
+ on the exit node, which communicates SPDY to the web browser
+ and normal HTTP to the web server. This proxy would translate
+ between the two transport protocols, and possibly perform
+ other optimizations.
+
+ SPDY currently offers five optimizations:
+
+ 1) Multiplexed streams:
+ An unlimited number of resources can be transferred
+ concurrently, over a single TCP connection.
+
+ 2) Request prioritization:
+ The client can set a priority on each resource, to assist
+ the server in re-ordering responses.
+
+ 3) Compression:
+ Both HTTP header and resource content can be compressed.
+
+ 4) Server push:
+ The server can offer the client resources which have not
+ been requested, but which the server believes will be.
+
+ 5) Server hint:
+ The server can suggest that the client request further
+ resources, before the main content is transferred.
+
+ Tor currently effectively implements (1), by being able to put
+ multiple streams on one circuit. SPDY however requires fewer
+ round-trips to do the same. The other features are not
+ implemented by Tor. Therefore it is reasonable to expect that
+ a HTTP <-> SPDY proxy may improve Tor performance, by some
+ amount.
+
+ The consequences on caching need to be considered carefully.
+ Most of the optimizations SPDY offers have no effect because
+ the existing HTTP cache control headers are transmitted without
+ modification. Server push is more problematic, because here
+ the server may push a resource that the client already has.
+
+3. Design outline
+
+ One way to implement the SPDY proxy is for Tor exit nodes to
+ advertise this capability in their descriptor. The OP would
+ then preferentially select these nodes when routing streams
+ destined for port 80.
+
+ Then, rather than sending the usual RELAY_BEGIN cell, the OP
+ would send a RELAY_BEGIN_TRANSFORMED cell, with a parameter to
+ indicate that the exit node should translate between SPDY and
+ HTTP. The rest of the connection process would operate as
+ usual.
+
+ There would need to be some way of elegantly handling non-HTTP
+ traffic which goes over port 80.
+
+4. Implementation status
+
+ SPDY is under active development and both the specification
+ and implementations are in a state of flux. Initial
+ experiments with Google Chrome in SPDY-mode and server
+ libraries indicate that more work is needed before they are
+ production-ready. There is no indication that browsers other
+ than Google Chrome will support SPDY (and no official
+ statement as to whether Google Chrome will eventually enable
+ SPDY by default).
+
+ Implementing a full SPDY proxy would be non-trivial. Stream
+ multiplexing and compression are supported by existing
+ libraries and would be fairly simple to implement. Request
+ prioritization would require some form of caching on the
+ proxy-side. Server push and server hint would require content
+ parsing to identify resources which should be treated
+ specially.
+
+5. Security and policy implications
+
+ A SPDY proxy would be a significant amount of code, and may
+ pull in external libraries. This code will process potentially
+ malicious data, both at the SPDY and HTTP sides. This proposal
+ therefore increases the risk that exit nodes will be
+ compromised by exploiting a bug in the proxy.
+
+ This proposal would also be the first way in which Tor is
+ modifying TCP stream data. Arguably this is still meta-data
+ (HTTP headers), but there may be some concern that Tor should
+ not be doing this.
+
+ Torbutton only works with Firefox, but SPDY only works with
+ Google Chrome. We should be careful not to recommend that
+ users adopt a browser which harms their privacy in other ways.
+
+6. Open questions:
+
+ - How difficult would this be to implement?
+
+ - How much performance improvement would it actually result in?
+
+ - Is there some way to rapidly develop a prototype which would
+ answer the previous question?
+
+[1] SPDY: An experimental protocol for a faster web
+ http://dev.chromium.org/spdy/spdy-whitepaper
+[2] Shining Light in Dark Places: Understanding the Tor Network.
+ Damon McCoy, Kevin Bauer, Dirk Grunwald, Tadayoshi Kohno, Douglas Sicker
+ http://www.cs.washington.edu/homes/yoshi/papers/Tor/PETS2008_37.pdf
diff --git a/doc/spec/proposals/ideas/xxx-what-uses-sha1.txt b/doc/spec/proposals/ideas/xxx-what-uses-sha1.txt
index 9b6e20c586..b3ca3eea5a 100644
--- a/doc/spec/proposals/ideas/xxx-what-uses-sha1.txt
+++ b/doc/spec/proposals/ideas/xxx-what-uses-sha1.txt
@@ -1,8 +1,6 @@
Filename: xxx-what-uses-sha1.txt
Title: Where does Tor use SHA-1 today?
-Version: $Revision$
-Last-Modified: $Date$
-Author: Nick Mathewson
+Authors: Nick Mathewson, Marian
Created: 30-Dec-2008
Status: Meta
@@ -15,9 +13,15 @@ Introduction:
too long.
According to smart crypto people, the SHA-2 functions (SHA-256, etc)
- share too much of SHA-1's structure to be very good. Some people
- like other hash functions; most of these have not seen enough
- analysis to be widely regarded as an extra-good idea.
+ share too much of SHA-1's structure to be very good. RIPEMD-160 is
+ also based on flawed past hashes. Some people think other hash
+ functions (e.g. Whirlpool and Tiger) are not as bad; most of these
+ have not seen enough analysis to be used yet.
+
+ Here is a 2006 paper about hash algorithms.
+ http://www.sane.nl/sane2006/program/final-papers/R10.pdf
+
+ (Todo: Ask smart crypto people.)
By 2012, the NIST SHA-3 competition will be done, and with luck we'll
have something good to switch too. But it's probably a bad idea to
@@ -54,50 +58,138 @@ Why now?
one look silly.
+Triage
+
+ How severe are these problems? Let's divide them into these
+ categories, where H(x) is the SHA-1 hash of x:
+ PREIMAGE -- find any x such that H(x) has a chosen value
+ -- A SHA-1 usage that only depends on preimage
+ resistance
+ * Also SECOND PREIMAGE. Given x, find a y not equal to
+ x such that H(x) = H(y)
+ COLLISION<role> -- A SHA-1 usage that depends on collision
+ resistance, but the only party who could mount a
+ collision-based attack is already in a trusted role
+ (like a distribution signer or a directory authority).
+ COLLISION -- find any x and y such that H(x) = H(y) -- A
+ SHA-1 usage that depends on collision resistance
+ and doesn't need the attacker to have any special keys.
+
+ There is no need to put much effort into fixing PREIMAGE and SECOND
+ PREIMAGE usages in the near-term: while there have been some
+ theoretical results doing these attacks against SHA-1, they don't
+ seem to be close to practical yet. To fix COLLISION<code-signing>
+ usages is not too important either, since anyone who has the key to
+ sign the code can mount far worse attacks. It would be good to fix
+ COLLISION<authority> usages, since we try to resist bad authorities
+ to a limited extent. The COLLISION usages are the most important
+ to fix.
+
+ Kelsey and Schneier published a theoretical second preimage attack
+ against SHA-1 in 2005, so it would be a good idea to fix PREIMAGE
+ and SECOND PREIMAGE usages after fixing COLLISION usages or where fixes
+ require minimal effort.
+
+ http://www.schneier.com/paper-preimages.html
+
+ Additionally, we need to consider the impact of a successful attack
+ in each of these cases. SHA-1 collisions are still expensive even
+ if recent results are verified, and anybody with the resources to
+ compute one also has the resources to mount a decent Sybil attack.
+
+ Let's be pessimistic, and not assume that producing collisions of
+ a given format is actually any harder than producing collisions at
+ all.
+
What Tor uses hashes for today:
1. Infrastructure.
A. Our X.509 certificates are signed with SHA-1.
+ COLLISION
B. TLS uses SHA-1 (and MD5) internally to generate keys.
+ PREIMAGE?
+ * At least breaking SHA-1 and MD5 simultaneously is
+ much more difficult than breaking either
+ independently.
C. Some of the TLS ciphersuites we allow use SHA-1.
+ PREIMAGE?
D. When we sign our code with GPG, it might be using SHA-1.
+ COLLISION<code-signing>
+ * GPG 1.4 and up have writing support for SHA-2 hashes.
+ This blog has help for converting:
+ http://www.schwer.us/journal/2005/02/19/sha-1-broken-and-gnupg-gpg/
E. Our GPG keys might be authenticated with SHA-1.
+ COLLISION<code-signing-key-signing>
F. OpenSSL's random number generator uses SHA-1, I believe.
+ PREIMAGE
2. The Tor protocol
A. Everything we sign, we sign using SHA-1-based OAEP-MGF1.
+ PREIMAGE?
B. Our CREATE cell format uses SHA-1 for: OAEP padding.
+ PREIMAGE?
C. Our EXTEND cells use SHA-1 to hash the identity key of the
target server.
+ COLLISION
D. Our CREATED cells use SHA-1 to hash the derived key data.
+ ??
E. The data we use in CREATE_FAST cells to generate a key is the
length of a SHA-1.
+ NONE
F. The data we send back in a CREATED/CREATED_FAST cell is the length
of a SHA-1.
- G. We use SHA-1 to derive our circuit keys from the negotiated g^xy value.
+ NONE
+ G. We use SHA-1 to derive our circuit keys from the negotiated g^xy
+ value.
+ NONE
H. We use SHA-1 to derive the digest field of each RELAY cell, but that's
used more as a checksum than as a strong digest.
+ NONE
3. Directory services
+ [All are COLLISION or COLLISION<authority> ]
+
A. All signatures are generated on the SHA-1 of their corresponding
documents, using PKCS1 padding.
+ * In dir-spec.txt, section 1.3, it states,
+ "SIGNATURE" Object contains a signature (using the signing key)
+ of the PKCS1-padded digest of the entire document, taken from
+ the beginning of the Initial item, through the newline after
+ the Signature Item's keyword and its arguments."
+ So our attacker, Malcom, could generate a collision for the hash
+ that is signed. Thus, a second pre-image attack is possible.
+ Vulnerable to regular collision attack only if key is stolen.
+ If the key is stolen, Malcom could distribute two different
+ copies of the document which have the same hash. Maybe useful
+ for a partitioning attack?
B. Router descriptors identify their corresponding extra-info documents
by their SHA-1 digest.
+ * A third party might use a second pre-image attack to generate a
+ false extra-info document that has the same hash. The router
+ itself might use a regular collision attack to generate multiple
+ extra-info documents with the same hash, which might be useful
+ for a partitioning attack.
C. Fingerprints in router descriptors are taken using SHA-1.
- D. Fingerprints in authority certs are taken using SHA-1.
- E. Fingerprints in dir-source lines of votes and consensuses are taken
+ * The fingerprint must match the public key. Not sure what would
+ happen if two routers had different public keys but the same
+ fingerprint. There could perhaps be unpredictable behaviour.
+ D. In router descriptors, routers in the same "Family" may be listed
+ by server nicknames or hexdigests.
+ * Does not seem critical.
+ E. Fingerprints in authority certs are taken using SHA-1.
+ F. Fingerprints in dir-source lines of votes and consensuses are taken
using SHA-1.
- F. Networkstatuses refer to routers identity keys and descriptors by their
+ G. Networkstatuses refer to routers identity keys and descriptors by their
SHA-1 digests.
- G. Directory-signature lines identify which key is doing the signing by
+ H. Directory-signature lines identify which key is doing the signing by
the SHA-1 digests of the authority's signing key and its identity key.
- H. The following items are downloaded by the SHA-1 of their contents:
+ I. The following items are downloaded by the SHA-1 of their contents:
XXXX list them
- I. The following items are downloaded by the SHA-1 of an identity key:
+ J. The following items are downloaded by the SHA-1 of an identity key:
XXXX list them too.
4. The rendezvous protocol
@@ -107,6 +199,12 @@ What Tor uses hashes for today:
establishment requests.
B. Hidden servers use SHA-1 in multiple places when generating hidden
service descriptors.
+ * The permanent-id is the first 80 bits of the SHA-1 hash of the
+ public key
+ ** time-period performs calculations using the permanent-id
+ * The secret-id-part is the SHA-1 hash of the time period, the
+ descriptor-cookie, and replica.
+ * Hash of introduction point's identity key.
C. Hidden servers performing basic-type client authorization for their
services use SHA-1 when encrypting introduction points contained in
hidden service descriptors.
@@ -115,26 +213,35 @@ What Tor uses hashes for today:
identifier or not.
E. Hidden servers use SHA-1 to derive .onion addresses of their
services.
+ * What's worse, it only uses the first 80 bits of the SHA-1 hash.
+ However, the rend-spec.txt says we aren't worried about arbitrary
+ collisions?
F. Clients use SHA-1 to generate the current hidden service descriptor
identifiers for a given .onion address.
G. Hidden servers use SHA-1 to remember digests of the first parts of
Diffie-Hellman handshakes contained in introduction requests in order
- to detect replays.
+ to detect replays. See the RELAY_ESTABLISH_INTRO cell. We seem to be
+ taking a hash of a hash here.
H. Hidden servers use SHA-1 during the Diffie-Hellman key exchange with
a connecting client.
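
 To make the 80-bit truncation mentioned in items B and E concrete,
 here is a small sketch. It assumes the .onion label is the base32
 encoding of the permanent-id, as rend-spec.txt describes; the
 function names are hypothetical, and the input is any byte string
 standing in for an encoded public key, not a real key.

```python
import base64
import hashlib


def permanent_id(public_key_bytes):
    """First 80 bits (10 bytes) of the SHA-1 hash of the public key."""
    return hashlib.sha1(public_key_bytes).digest()[:10]


def onion_label(public_key_bytes):
    """Base32 form of the permanent-id, in lower case.

    Ten bytes encode to exactly 16 base32 characters with no padding.
    """
    encoded = base64.b32encode(permanent_id(public_key_bytes))
    return encoded.decode("ascii").lower()
```

 Since only 80 bits are kept, a generic birthday search for two keys
 with the same label needs on the order of 2^40 hash evaluations,
 regardless of how strong the underlying hash function is.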
5. The bridge protocol
XXXX write me
+
+ A. Client may attempt to query for bridges where he knows a digest
+ (probably SHA-1) before a direct query.
6. The Tor user interface
A. We log information about servers based on SHA-1 hashes of their
identity keys.
+ COLLISION
B. The controller identifies servers based on SHA-1 hashes of their
identity keys.
+ COLLISION
C. Nearly all of our configuration options that list servers allow SHA-1
hashes of their identity keys.
+ COLLISION
E. The deprecated .exit notation uses SHA-1 hashes of identity keys
-
-
+ COLLISION
diff --git a/doc/spec/proposals/reindex.py b/doc/spec/proposals/reindex.py
index 2b4c02516b..980bc0659f 100755
--- a/doc/spec/proposals/reindex.py
+++ b/doc/spec/proposals/reindex.py
@@ -4,7 +4,7 @@ import re, os
class Error(Exception): pass
STATUSES = """DRAFT NEEDS-REVISION NEEDS-RESEARCH OPEN ACCEPTED META FINISHED
- CLOSED SUPERSEDED DEAD""".split()
+ CLOSED SUPERSEDED DEAD REJECTED""".split()
REQUIRED_FIELDS = [ "Filename", "Status", "Title" ]
CONDITIONAL_FIELDS = { "OPEN" : [ "Target" ],
"ACCEPTED" : [ "Target "],
diff --git a/doc/spec/rend-spec.txt b/doc/spec/rend-spec.txt
index e3fbe2253b..3c14ebc662 100644
--- a/doc/spec/rend-spec.txt
+++ b/doc/spec/rend-spec.txt
@@ -1,11 +1,15 @@
-$Id$
Tor Rendezvous Specification
0. Overview and preliminaries
+ The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
+ NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
+ "OPTIONAL" in this document are to be interpreted as described in
+ RFC 2119.
+
Read
- https://www.torproject.org/doc/design-paper/tor-design.html#sec:rendezvous
+ https://svn.torproject.org/svn/projects/design-paper/tor-design.html#sec:rendezvous
before you read this specification. It will make more sense.
Rendezvous points provide location-hidden services (server
@@ -16,11 +20,10 @@ $Id$
Bob does this by anonymously advertising a public key for his
service, along with a list of onion routers to act as "Introduction
Points" for his service. He creates forward circuits to those
- introduction points, and tells them about his public key. To
+ introduction points, and tells them about his service. To
connect to Bob, Alice first builds a circuit to an OR to act as
her "Rendezvous Point." She then connects to one of Bob's chosen
- introduction points, optionally provides authentication or
- authorization information, and asks it to tell him about her Rendezvous
+ introduction points, and asks it to tell him about her Rendezvous
Point (RP). If Bob chooses to answer, he builds a circuit to her
RP, and tells it to connect him to Alice. The RP joins their
circuits together, and begins relaying cells. Alice's 'BEGIN'
@@ -60,23 +63,21 @@ $Id$
0.2. Protocol outline
- 1. Bob->Bob's OP: "Offer IP:Port as
- public-key-name:Port". [configuration]
+ 1. Bob->Bob's OP: "Offer IP:Port as public-key-name:Port". [configuration]
(We do not specify this step; it is left to the implementor of
Bob's OP.)
- 2. Bob's OP generates keypair and rendezvous service descriptor:
- "Meet public-key X at introduction point A, B, or C." (signed)
+ 2. Bob's OP generates a long-term keypair.
3. Bob's OP->Introduction point via Tor: [introduction setup]
- "This pk is me."
+ "This public key is (currently) associated to me."
- 4. Bob's OP->directory service via Tor: publishes Bob's service
- descriptor [advertisement]
+ 4. Bob's OP->directory service via Tor: publishes Bob's service descriptor
+ [advertisement]
+ "Meet public-key X at introduction point A, B, or C." (signed)
- 5. Out of band, Alice receives a [x.y.]z.onion:port address.
- She opens a SOCKS connection to her OP, and requests
- x.y.z.onion:port.
+ 5. Out of band, Alice receives a z.onion:port address.
+ She opens a SOCKS connection to her OP, and requests z.onion:port.
6. Alice's OP retrieves Bob's descriptor via Tor. [descriptor lookup.]
@@ -85,29 +86,31 @@ $Id$
setup.]
8. Alice connects to the Introduction point via Tor, and tells it about
- her rendezvous point and optional authentication/authorization
- information. (Encrypted to Bob.) [Introduction 1]
+ her rendezvous point. (Encrypted to Bob.) [Introduction 1]
9. The Introduction point passes this on to Bob's OP via Tor, along the
introduction circuit. [Introduction 2]
10. Bob's OP decides whether to connect to Alice, and if so, creates a
circuit to Alice's RP via Tor. Establishes a shared circuit.
- [Rendezvous.]
+ [Rendezvous 1]
- 11. Alice's OP sends begin cells to Bob's OP. [Connection]
+ 11. The Rendezvous point forwards Bob's confirmation to Alice's OP.
+ [Rendezvous 2]
+
+ 12. Alice's OP sends begin cells to Bob's OP. [Connection]
0.3. Constants and new cell types
Relay cell types
- 32 -- RELAY_ESTABLISH_INTRO
- 33 -- RELAY_ESTABLISH_RENDEZVOUS
- 34 -- RELAY_INTRODUCE1
- 35 -- RELAY_INTRODUCE2
- 36 -- RELAY_RENDEZVOUS1
- 37 -- RELAY_RENDEZVOUS2
- 38 -- RELAY_INTRO_ESTABLISHED
- 39 -- RELAY_RENDEZVOUS_ESTABLISHED
+ 32 -- RELAY_COMMAND_ESTABLISH_INTRO
+ 33 -- RELAY_COMMAND_ESTABLISH_RENDEZVOUS
+ 34 -- RELAY_COMMAND_INTRODUCE1
+ 35 -- RELAY_COMMAND_INTRODUCE2
+ 36 -- RELAY_COMMAND_RENDEZVOUS1
+ 37 -- RELAY_COMMAND_RENDEZVOUS2
+ 38 -- RELAY_COMMAND_INTRO_ESTABLISHED
+ 39 -- RELAY_COMMAND_RENDEZVOUS_ESTABLISHED
40 -- RELAY_COMMAND_INTRODUCE_ACK
0.4. Version overview
@@ -117,14 +120,14 @@ $Id$
other parts remained the same. The following list of potentially
versioned protocol parts should help reduce some confusion:
- - Hidden service descriptor: the binary-based v0 was the default for
- a long time, and an ascii-based v2 has been added by proposal
- 114. See 1.2.
+ - Hidden service descriptor: the binary-based v0 was the default for a
+ long time, and an ASCII-based v2 has been added by proposal 114. The
+ v0 descriptor format has been deprecated in 0.2.2.1-alpha. See 1.3.
- Hidden service descriptor propagation mechanism: currently related to
the hidden service descriptor version -- v0 publishes to the original
hs directory authorities, whereas v2 publishes to a rotating subset
- of relays with the "hsdir" flag; see 1.4 and 1.6.
+ of relays with the "HSDir" flag; see 1.4 and 1.6.
- Introduction protocol for how to generate an introduction cell:
v0 specified a nickname for the rendezvous point and assumed the
@@ -142,11 +145,56 @@ $Id$
service. Bob provides a mapping from each of these virtual ports
to a local IP:Port pair.
-1.2. Bob's OP generates service descriptors.
+1.2. Bob's OP establishes his introduction points.
The first time the OP provides an advertised service, it generates
- a public/private keypair (stored locally). Periodically, the OP
- generates and publishes a descriptor of type "V0".
+ a public/private keypair (stored locally).
+
+ The OP chooses a small number of Tor servers as introduction points.
+ The OP establishes a new introduction circuit to each introduction
+ point. These circuits MUST NOT be used for anything but hidden service
+ introduction. To establish the introduction, Bob sends a
+ RELAY_COMMAND_ESTABLISH_INTRO cell, containing:
+
+ KL Key length [2 octets]
+ PK Bob's public key or service key [KL octets]
+ HS Hash of session info [20 octets]
+ SIG Signature of above information [variable]
+
+ KL is the length of PK, in octets.
+
+ To prevent replay attacks, the HS field contains a SHA-1 hash based on the
+ shared secret KH between Bob's OP and the introduction point, as
+ follows:
+ HS = H(KH | "INTRODUCE")
+ That is:
+ HS = H(KH | [49 4E 54 52 4F 44 55 43 45])
+ (KH, as specified in tor-spec.txt, is H(g^xy | [00]) .)
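The HS computation above can be sketched as follows. This is a minimal illustration, assuming H is SHA-1 (as the text states for HS) and that g^xy is already available as a byte string; the function names are hypothetical and not taken from the Tor source.

```python
import hashlib


def handshake_digest(g_xy: bytes) -> bytes:
    """KH = H(g^xy | [00]), with H assumed to be SHA-1."""
    return hashlib.sha1(g_xy + b"\x00").digest()


def establish_intro_hs(kh: bytes) -> bytes:
    """HS = H(KH | "INTRODUCE"), the 20-octet anti-replay field."""
    return hashlib.sha1(kh + b"INTRODUCE").digest()


# Illustrative shared secret only; a real g^xy comes from the circuit handshake.
example_kh = handshake_digest(b"\x01" * 128)
example_hs = establish_intro_hs(example_kh)
```

Because KH depends on the circuit's own Diffie-Hellman exchange, an HS value captured on one circuit does not verify on another, which is what defeats replay.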
+
+ Upon receiving such a cell, the OR first checks that the signature is
+ correct with the included public key. If so, it checks whether HS is
+ correct given the shared state between Bob's OP and the OR. If either
+ check fails, the OR discards the cell; otherwise, it associates the
+ circuit with Bob's public key, and dissociates any other circuits
+ currently associated with PK. On success, the OR sends Bob a
+ RELAY_COMMAND_INTRO_ESTABLISHED cell with an empty payload.
+
+ Bob's OP uses either Bob's public key or a freshly generated, single-use
+ service key in the RELAY_COMMAND_ESTABLISH_INTRO cell, depending on the
+ configured hidden service descriptor version. The public key is used for
+ v0 descriptors, the service key for v2 descriptors. In the latter case, the
+ service keys of all introduction points are included in the v2 hidden
+ service descriptor together with the other introduction point information.
+ The reason is that the introduction point does not need to and therefore
+ should not know for which hidden service it works, so as to prevent it from
+ tracking the hidden service's activity. If the hidden service is configured
+ to publish both v0 and v2 descriptors, two separate sets of introduction
+ points are established.
+
+1.3. Bob's OP generates service descriptors.
+
+ For versions before 0.2.2.1-alpha, Bob's OP periodically generates and
+ publishes a descriptor of type "V0".
The "V0" descriptor contains:
@@ -157,7 +205,6 @@ $Id$
Ipt A list of NUL-terminated ORs [variable]
SIG Signature of above fields [variable]
- KL is the length of PK, in octets.
TS is the number of seconds elapsed since Jan 1, 1970.
The members of Ipt may be either (a) nicknames, or (b) identity key
@@ -170,8 +217,8 @@ $Id$
and now he doesn't have any. -RD]
Beginning with 0.2.0.10-alpha, Bob's OP encodes "V2" descriptors in
- addition to "V0" descriptors. The format of a "V2" descriptor is as
- follows:
+ addition to (or instead of) "V0" descriptors. The format of a "V2"
+ descriptor is as follows:
"rendezvous-service-descriptor" descriptor-id NL
@@ -179,11 +226,7 @@ $Id$
Indicates the beginning of the descriptor. "descriptor-id" is a
periodically changing identifier of 160 bits formatted as 32 base32
- chars that is calculated by the hidden service and its clients. If
- the optional "descriptor-cookie" is used, this "descriptor-id"
- cannot be computed by anyone else. (Everyone can verify that this
- "descriptor-id" belongs to the rest of the descriptor, even without
- knowing the optional "descriptor-cookie", as described below.) The
+ chars that is calculated by the hidden service and its clients. The
"descriptor-id" is calculated by performing the following operation:
descriptor-id =
@@ -195,29 +238,20 @@ $Id$
permanent-id = H(public-key)[:10]
- "H(time-period | descriptor-cookie | replica)" is the (possibly
- secret) id part that is
- necessary to verify that the hidden service is the true originator
- of this descriptor. It can only be created by the hidden service
- and its clients, but the "signature" below can only be created by
- the service.
+ Note: If Bob's OP has "stealth" authorization enabled (see Section 2.2),
+ it uses the client key in place of the public hidden service key.
- "descriptor-cookie" is an optional secret password of 128 bits that
- is shared between the hidden service provider and its clients.
-
- "replica" denotes the number of the non-consecutive replica.
+ "H(time-period | descriptor-cookie | replica)" is the (possibly
+ secret) id part that is necessary to verify that the hidden service is
+ the true originator of this descriptor and that is therefore contained
+ in the descriptor, too. The descriptor ID can only be created by the
+ hidden service and its clients, but the "signature" below can only be
+ created by the service.
- (Each descriptor is replicated on a number of _consecutive_ nodes
- in the identifier ring by making every storing node responsible
- for the identifier intervals starting from its 3rd predecessor's
- ID to its own ID. In addition to that, every service publishes
- multiple descriptors with different descriptor IDs in order to
- distribute them to different places on the ring. Therefore,
- "replica" chooses one of the _non-consecutive_ replicas. -KL)
+ "time-period" changes periodically as a function of time and
- The "time-period" changes periodically depending on the global time and
- as a function of "permanent-id". The current value for "time-period" can
- be calculated using the following formula:
+ "permanent-id". The current value for "time-period" can be calculated
+ using the following formula:
time-period = (current-time + permanent-id-byte * 86400 / 256)
/ 86400
@@ -231,6 +265,15 @@ $Id$
of the overall operation is a (network-ordered) 32-bit integer, e.g.
13753 or 0x000035B9 with the example values given above.
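As a rough check of the time-period formula, the sketch below evaluates it with integer arithmetic. The input values (a current-time of 1188241957 and a permanent-id-byte of 143) are assumptions chosen for illustration; they are not taken from the lines shown here, but they do produce the result 13753 quoted above. The function name is hypothetical.

```python
import struct

SECONDS_PER_PERIOD = 86400  # one time period lasts 24 hours


def time_period(current_time: int, permanent_id_byte: int) -> int:
    """time-period = (current-time + permanent-id-byte * 86400 / 256) / 86400,
    using integer division throughout."""
    offset = permanent_id_byte * SECONDS_PER_PERIOD // 256
    return (current_time + offset) // SECONDS_PER_PERIOD


# Assumed example inputs, for illustration only.
period = time_period(1188241957, 143)
# Network-ordered (big-endian) 32-bit encoding of the result.
encoded = struct.pack("!I", period)
```

The per-service offset means that different services roll over to a new descriptor ID at different times of day, rather than all at once.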
+ "descriptor-cookie" is an optional secret password of 128 bits that
+ is shared between the hidden service provider and its clients. If the
+ descriptor-cookie is left out, the input to the hash function is 128
+ bits shorter.
+
+ "replica" denotes the number of the replica. A service publishes
+ multiple descriptors with different descriptor IDs in order to
+ distribute them to different places on the ring.
+
"version" version-number NL
[Exactly once]
@@ -284,13 +327,16 @@ $Id$
The unencrypted string may begin with:
- ["service-authentication" auth-type NL auth-data ... reserved]
+ "service-authentication" auth-type auth-data NL
- [At start, any number]
+ [Any number]
The service-specific authentication data can be used to perform
client authentication. This data is independent of the selected
- introduction point as opposed to "intro-authentication" below.
+ introduction point as opposed to "intro-authentication" below. The
+ format of auth-data (base64-encoded or PEM format) depends on
+ auth-type. See section 2 of this document for details on auth
+ mechanisms.
Subsequently, an arbitrary number of introduction point entries may
follow, each containing the following data:
@@ -329,17 +375,23 @@ $Id$
The public key that can be used to encrypt messages to the hidden
service.
- ["intro-authentication" auth-type NL auth-data ... reserved]
+ "intro-authentication" auth-type auth-data NL
[Any number]
The introduction-point-specific authentication data can be used
to perform client authentication. This data depends on the
selected introduction point as opposed to "service-authentication"
- above.
+ above. The format of auth-data (base64-encoded or PEM format)
+ depends on auth-type. See section 2 of this document for details
+ on auth mechanisms.
(This ends the fields in the encrypted portion of the descriptor.)
+ [It's ok for Bob to advertise 0 introduction points. He might want
+ to do that if he previously advertised some introduction points,
+ and now he doesn't have any. -RD]
+
"signature" NL signature-string
[At end, exactly once]
@@ -347,7 +399,22 @@ $Id$
A signature of all fields above with the private key of the hidden
service.
-1.2.1. Other descriptor formats we don't use.
+1.3.1. Other descriptor formats we don't use.
+
+ Support for the V0 descriptor format was dropped in 0.2.2.0-alpha-dev:
+
+ KL Key length [2 octets]
+ PK Bob's public key [KL octets]
+ TS A timestamp [4 octets]
+ NI Number of introduction points [2 octets]
+ Ipt A list of NUL-terminated ORs [variable]
+ SIG Signature of above fields [variable]
+
+ KL is the length of PK, in octets.
+ TS is the number of seconds elapsed since Jan 1, 1970.
+
+ The members of Ipt may be either (a) nicknames, or (b) identity key
+ digests, encoded in hex, and prefixed with a '$'.
The V1 descriptor format was understood and accepted from
0.1.1.5-alpha-cvs to 0.2.0.6-alpha-dev, but no Tors generated it and
@@ -401,56 +468,17 @@ $Id$
Currently only AUTHT of [00 00] is supported, with an AUTHL of 0.
See section 2 of this document for details on auth mechanisms.
-1.3. Bob's OP establishes his introduction points.
-
- The OP establishes a new introduction circuit to each introduction
- point. These circuits MUST NOT be used for anything but hidden service
- introduction. To establish the introduction, Bob sends a
- RELAY_ESTABLISH_INTRO cell, containing:
-
- KL Key length [2 octets]
- PK Bob's public key [KL octets]
- HS Hash of session info [20 octets]
- SIG Signature of above information [variable]
-
- [XXX011, need to add auth information here. -RD]
-
- To prevent replay attacks, the HS field contains a SHA-1 hash based on the
- shared secret KH between Bob's OP and the introduction point, as
- follows:
- HS = H(KH | "INTRODUCE")
- That is:
- HS = H(KH | [49 4E 54 52 4F 44 55 43 45])
- (KH, as specified in tor-spec.txt, is H(g^xy | [00]) .)
-
- Upon receiving such a cell, the OR first checks that the signature is
- correct with the included public key. If so, it checks whether HS is
- correct given the shared state between Bob's OP and the OR. If either
- check fails, the OP discards the cell; otherwise, it associates the
- circuit with Bob's public key, and dissociates any other circuits
- currently associated with PK. On success, the OR sends Bob a
- RELAY_INTRO_ESTABLISHED cell with an empty payload.
-
- If a hidden service is configured to publish only v2 hidden service
- descriptors, Bob's OP does not include its own public key in the
- RELAY_ESTABLISH_INTRO cell, but the public key of a freshly generated
- key pair. The OP also includes these fresh public keys in the v2 hidden
- service descriptor together with the other introduction point
- information. The reason is that the introduction point does not need to
- and therefore should not know for which hidden service it works, so as
- to prevent it from tracking the hidden service's activity. If the hidden
- service is configured to publish both, v0 and v2 descriptors, two
- separate sets of introduction points are established.
-
1.4. Bob's OP advertises his service descriptor(s).
- Bob's OP opens a stream to each directory server's directory port via Tor.
- (He may re-use old circuits for this.) Over this stream, Bob's OP makes
- an HTTP 'POST' request, to a URL "/tor/rendezvous/publish" relative to the
- directory server's root, containing as its body Bob's service descriptor.
+ Bob's OP advertises his service descriptor to a fixed set of v0 hidden
+ service directory servers and/or a changing subset of all v2 hidden service
+ directories.
- Bob should upload a service descriptor for each version format that
- is supported in the current Tor network.
+ For versions before 0.2.2.1-alpha, Bob's OP opens a stream to each v0
+ directory server's directory port via Tor. (He may re-use old circuits for
+ this.) Over this stream, Bob's OP makes an HTTP 'POST' request, to a URL
+ "/tor/rendezvous/publish" relative to the directory server's root,
+ containing as its body Bob's service descriptor.
Upon receiving a descriptor, the directory server checks the signature,
and discards the descriptor if the signature does not match the enclosed
@@ -464,13 +492,12 @@ $Id$
after its timestamp. At least every 18 hours, Bob's OP uploads a
fresh descriptor.
- If Bob's OP is configured to publish v2 descriptors instead of or in
- addition to v0 descriptors, it does so to a changing subset of all v2
- hidden service directories instead of the authoritative directory
- servers. Therefore, Bob's OP opens a stream via Tor to each
- responsible hidden service directory. (He may re-use old circuits
- for this.) Over this stream, Bob's OP makes an HTTP 'POST' request to a
- URL "/tor/rendezvous2/publish" relative to the hidden service
+ If Bob's OP is configured to publish v2 descriptors, it does so to a
+ changing subset of all v2 hidden service directories instead of the
+ authoritative directory servers. Therefore, Bob's OP opens a stream via
+ Tor to each responsible hidden service directory. (He may re-use old
+ circuits for this.) Over this stream, Bob's OP makes an HTTP 'POST'
+ request to a URL "/tor/rendezvous2/publish" relative to the hidden service
directory's root, containing as its body Bob's service descriptor.
At any time, there are 6 hidden service directories responsible for
@@ -487,45 +514,41 @@ $Id$
Bob's OP publishes a new v2 descriptor once an hour or whenever its
content changes. V2 descriptors can be found by clients within a given
time period of 24 hours, after which they change their ID as described
- under 1.2. If a published descriptor would be valid for less than 60
+ under 1.3. If a published descriptor would be valid for less than 60
minutes (= 2 x 30 minutes to allow the server to be 30 minutes behind
and the client 30 minutes ahead), Bob's OP publishes the descriptor
under the IDs of both the current and the next publication period.
-1.5. Alice receives a x.y.z.onion address.
+1.5. Alice receives a z.onion address.
When Alice receives a pointer to a location-hidden service, it is as a
- hostname of the form "z.onion" or "y.z.onion" or "x.y.z.onion", where
- z is a base-32 encoding of a 10-octet hash of Bob's service's public
- key, computed as follows:
+ hostname of the form "z.onion", where z is a base-32 encoding of a
+ 10-octet hash of Bob's service's public key, computed as follows:
1. Let H = H(PK).
2. Let H' = the first 80 bits of H, considering each octet from
most significant bit to least significant bit.
- 2. Generate a 16-character encoding of H', using base32 as defined
+ 3. Generate a 16-character encoding of H', using base32 as defined
in RFC 3548.
(We only use 80 bits instead of the 160 bits from SHA1 because we
don't need to worry about arbitrary collisions, and because it will
make handling the URLs more convenient.)
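The three steps above can be sketched as follows. This is a minimal illustration, assuming H is SHA-1 and that the public key is already serialized to bytes; the serialization itself is not specified in this section. Lower-casing the result is an assumption based on the example addresses in this document, and the function name is hypothetical.

```python
import base64
import hashlib


def onion_hostname(public_key: bytes) -> str:
    """Derive "z.onion" from the serialized public key bytes."""
    digest = hashlib.sha1(public_key).digest()   # step 1: H = H(PK)
    first_80_bits = digest[:10]                  # step 2: first 80 bits
    z = base64.b32encode(first_80_bits)          # step 3: 16 base32 chars
    return z.decode("ascii").lower() + ".onion"


# Placeholder bytes standing in for a real serialized key.
example_hostname = onion_hostname(b"example public key bytes")
```

Ten octets are exactly 80 bits, which is a multiple of the 5 bits carried per base32 character, so the encoding is 16 characters long with no padding.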
- The string "x", if present, is the base-32 encoding of the
- authentication/authorization required by the introduction point.
- The string "y", if present, is the base-32 encoding of the
- authentication/authorization required by the hidden service.
- Omitting a string is taken to mean auth type [00 00].
- See section 2 of this document for details on auth mechanisms.
-
[Yes, numbers are allowed at the beginning. See RFC 1123. -NM]
1.6. Alice's OP retrieves a service descriptor.
- Alice opens a stream to a directory server via Tor, and makes an HTTP GET
- request for the document '/tor/rendezvous/<z>', where '<z>' is replaced
- with the encoding of Bob's public key as described above. (She may re-use
- old circuits for this.) The directory replies with a 404 HTTP response if
- it does not recognize <z>, and otherwise returns Bob's most recently
- uploaded service descriptor.
+ Alice's OP fetches the service descriptor from the fixed set of v0 hidden
+ service directory servers and/or a changing subset of all v2 hidden service
+ directories.
+
+ For versions before 0.2.2.1-alpha, Alice's OP opens a stream to a directory
+ server via Tor, and makes an HTTP GET request for the document
+ '/tor/rendezvous/<z>', where '<z>' is replaced with the encoding of Bob's
+ public key as described above. (She may re-use old circuits for this.) The
+ directory replies with a 404 HTTP response if it does not recognize <z>,
+ and otherwise returns Bob's most recently uploaded service descriptor.
If Alice's OP receives a 404 response, it tries the other directory
servers, and only fails the lookup if none recognize the public key hash.
@@ -541,13 +564,15 @@ $Id$
[Caching may make her partitionable, but she fetched it anonymously,
and we can't very well *not* cache it. -RD]
- Alice's OP fetches v2 descriptors in parallel to v0 descriptors. Similarly
- to the description in section 1.4, the OP fetches a v2 descriptor from a
- randomly chosen hidden service directory out of the changing subset of
- 6 nodes. If the request is unsuccessful, Alice retries the other
- remaining responsible hidden service directories in a random order.
- Alice relies on Bob to care about a potential clock skew between the two
- by possibly storing two sets of descriptors (see end of section 1.4).
+ If Alice's OP is running 0.2.1.10-alpha or higher, it fetches v2 hidden
+ service descriptors. Versions before 0.2.2.1-alpha fetch both v0 and
+ v2 descriptors in parallel. Similar to the description in section 1.4,
+ Alice's OP fetches a v2 descriptor from a randomly chosen hidden service
+ directory out of the changing subset of 6 nodes. If the request is
+ unsuccessful, Alice retries the other remaining responsible hidden service
+ directories in a random order. Alice relies on Bob to care about a potential
+ clock skew between the two by possibly storing two sets of descriptors (see
+ end of section 1.4).
Alice's OP opens a stream via Tor to the chosen v2 hidden service
directory. (She may re-use old circuits for this.) Over this stream,
@@ -563,19 +588,18 @@ $Id$
and Alice's OP does not have an established circuit to that service,
the OP builds a rendezvous circuit. It does this by establishing
a circuit to a randomly chosen OR, and sending a
- RELAY_ESTABLISH_RENDEZVOUS cell to that OR. The body of that cell
+ RELAY_COMMAND_ESTABLISH_RENDEZVOUS cell to that OR. The body of that cell
contains:
RC Rendezvous cookie [20 octets]
- [XXX011 this looks like an auth mechanism. should we generalize here? -RD]
-
The rendezvous cookie is an arbitrary 20-byte value, chosen randomly by
- Alice's OP.
+ Alice's OP. Alice SHOULD choose a new rendezvous cookie for each new
+ connection attempt.
- Upon receiving a RELAY_ESTABLISH_RENDEZVOUS cell, the OR associates the
- RC with the circuit that sent it. It replies to Alice with an empty
- RELAY_RENDEZVOUS_ESTABLISHED cell to indicate success.
+ Upon receiving a RELAY_COMMAND_ESTABLISH_RENDEZVOUS cell, the OR associates
+ the RC with the circuit that sent it. It replies to Alice with an empty
+ RELAY_COMMAND_RENDEZVOUS_ESTABLISHED cell to indicate success.
Alice's OP MUST NOT use the circuit which sent the cell for any purpose
other than rendezvous with the given location-hidden service.
@@ -583,7 +607,7 @@ $Id$
1.8. Introduction: from Alice's OP to Introduction Point
Alice builds a separate circuit to one of Bob's chosen introduction
- points, and sends it a RELAY_INTRODUCE1 cell containing:
+ points, and sends it a RELAY_COMMAND_INTRODUCE1 cell containing:
Cleartext
PK_ID Identifier for Bob's PK [20 octets]
@@ -605,15 +629,32 @@ $Id$
KEY Rendezvous point onion key [KLEN octets]
RC Rendezvous cookie [20 octets]
g^x Diffie-Hellman data, part 1 [128 octets]
+ OR (in the v3 intro protocol)
+ VER Version byte: set to 3. [1 octet]
+ AUTHT The auth type that is used [1 octet]
+ AUTHL Length of auth data [2 octets]
+ AUTHD Auth data [variable]
+ TS A timestamp [4 octets]
+ IP Rendezvous point's address [4 octets]
+ PORT Rendezvous point's OR port [2 octets]
+ ID Rendezvous point identity ID [20 octets]
+ KLEN Length of onion key [2 octets]
+ KEY Rendezvous point onion key [KLEN octets]
+ RC Rendezvous cookie [20 octets]
+ g^x Diffie-Hellman data, part 1 [128 octets]
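The v3 field layout above could be serialized as in the sketch below. It assumes network (big-endian) byte order for all multi-octet integers and an IPv4 address passed as 4 raw octets; the function and parameter names are hypothetical. The result is only the plaintext that would then be hybrid-encrypted to Bob's PK; the encryption step is not shown.

```python
import struct


def pack_intro_v3(auth_type: int, auth_data: bytes, timestamp: int,
                  rp_ipv4: bytes, rp_port: int, rp_identity: bytes,
                  rp_onion_key: bytes, rendezvous_cookie: bytes,
                  dh_public: bytes) -> bytes:
    """Serialize the v3 introduction plaintext in the field order listed."""
    if len(rp_ipv4) != 4:
        raise ValueError("IP must be 4 octets")
    if len(rp_identity) != 20 or len(rendezvous_cookie) != 20:
        raise ValueError("ID and RC must be 20 octets each")
    if len(dh_public) != 128:
        raise ValueError("g^x must be 128 octets")
    # VER (1) | AUTHT (1) | AUTHL (2), followed by AUTHD
    head = struct.pack("!BBH", 3, auth_type, len(auth_data))
    # TS (4) | IP (4) | PORT (2) | ID (20) | KLEN (2), followed by KEY
    rp = struct.pack("!I4sH20sH", timestamp, rp_ipv4, rp_port,
                     rp_identity, len(rp_onion_key))
    return head + auth_data + rp + rp_onion_key + rendezvous_cookie + dh_public


# Placeholder values: no auth data and an all-zero 140-octet onion key.
example_body = pack_intro_v3(0, b"", 0, bytes(4), 9001, bytes(20),
                             bytes(140), bytes(20), bytes(128))
```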
- PK_ID is the hash of Bob's public key. RP is NUL-padded and
- terminated. In version 0, it must contain a nickname. In version 1,
- it must contain EITHER a nickname or an identity key digest that is
- encoded in hex and prefixed with a '$'.
+ PK_ID is the hash of Bob's public key or the service key, depending on the
+ hidden service descriptor version. In case of a v0 descriptor, Alice's OP
+ uses Bob's public key. If Alice has downloaded a v2 descriptor, she uses
+ the contained public key ("service-key").
+
+ RP is NUL-padded and terminated. In version 0 of the intro protocol, RP
+ must contain a nickname. In version 1, it must contain EITHER a nickname or
+ an identity key digest that is encoded in hex and prefixed with a '$'.
The hybrid encryption to Bob's PK works just like the hybrid
encryption in CREATE cells (see tor-spec). Thus the payload of the
- version 0 RELAY_INTRODUCE1 cell on the wire will contain
+ version 0 RELAY_COMMAND_INTRODUCE1 cell on the wire will contain
20+42+16+20+20+128=246 bytes, and the version 1 and version 2
introduction formats have other sizes.
@@ -622,51 +663,31 @@ $Id$
v1, and v2 since 0.1.1.x. As of Tor 0.2.0.7-alpha and 0.1.2.18,
clients switched to using the v2 intro format.
- If Alice has downloaded a v2 descriptor, she uses the contained public
- key ("service-key") instead of Bob's public key to create the
- RELAY_INTRODUCE1 cell as described above.
-
-1.8.1. Other introduction formats we don't use.
-
- We briefly speculated about using the following format for the
- "encrypted to Bob's PK" part of the introduction, but no Tors have
- ever generated these.
-
- VER Version byte: set to 3. [1 octet]
- ATYPE An address type (typically 4) [1 octet]
- ADDR Rendezvous point's IP address [4 or 16 octets]
- PORT Rendezvous point's OR port [2 octets]
- AUTHT The auth type that is supported [2 octets]
- AUTHL Length of auth data [1 octet]
- AUTHD Auth data [variable]
- ID Rendezvous point identity ID [20 octets]
- KLEN Length of onion key [2 octets]
- KEY Rendezvous point onion key [KLEN octets]
- RC Rendezvous cookie [20 octets]
- g^x Diffie-Hellman data, part 1 [128 octets]
-
1.9. Introduction: From the Introduction Point to Bob's OP
If the Introduction Point recognizes PK_ID as a public key which has
- established a circuit for introductions as in 1.3 above, it sends the body
- of the cell in a new RELAY_INTRODUCE2 cell down the corresponding circuit.
- (If the PK_ID is unrecognized, the RELAY_INTRODUCE1 cell is discarded.)
-
- After sending the RELAY_INTRODUCE2 cell, the OR replies to Alice with an
- empty RELAY_COMMAND_INTRODUCE_ACK cell. If no RELAY_INTRODUCE2 cell can
- be sent, the OR replies to Alice with a non-empty cell to indicate an
- error. (The semantics of the cell body may be determined later; the
- current implementation sends a single '1' byte on failure.)
-
- When Bob's OP receives the RELAY_INTRODUCE2 cell, it decrypts it with
- the private key for the corresponding hidden service, and extracts the
+ established a circuit for introductions as in 1.2 above, it sends the body
+ of the cell in a new RELAY_COMMAND_INTRODUCE2 cell down the corresponding
+ circuit. (If the PK_ID is unrecognized, the RELAY_COMMAND_INTRODUCE1 cell is
+ discarded.)
+
+ After sending the RELAY_COMMAND_INTRODUCE2 cell to Bob, the OR replies to
+ Alice with an empty RELAY_COMMAND_INTRODUCE_ACK cell. If no
+ RELAY_COMMAND_INTRODUCE2 cell can be sent, the OR replies to Alice with a
+ non-empty cell to indicate an error. (The semantics of the cell body may be
+ determined later; the current implementation sends a single '1' byte on
+ failure.)
+
+ When Bob's OP receives the RELAY_COMMAND_INTRODUCE2 cell, it decrypts it
+ with the private key for the corresponding hidden service, and extracts the
rendezvous point's nickname, the rendezvous cookie, and the value of g^x
chosen by Alice.
1.10. Rendezvous
Bob's OP builds a new Tor circuit ending at Alice's chosen rendezvous
- point, and sends a RELAY_RENDEZVOUS1 cell along this circuit, containing:
+ point, and sends a RELAY_COMMAND_RENDEZVOUS1 cell along this circuit,
+ containing:
RC Rendezvous cookie [20 octets]
g^y Diffie-Hellman [128 octets]
KH Handshake digest [20 octets]
@@ -674,7 +695,7 @@ $Id$
(Bob's OP MUST NOT use this circuit for any other purpose.)
If the RP recognizes RC, it relays the rest of the cell down the
- corresponding circuit in a RELAY_RENDEZVOUS2 cell, containing:
+ corresponding circuit in a RELAY_COMMAND_RENDEZVOUS2 cell, containing:
g^y Diffie-Hellman [128 octets]
KH Handshake digest [20 octets]
@@ -682,10 +703,10 @@ $Id$
(If the RP does not recognize the RC, it discards the cell and
tears down the circuit.)
- When Alice's OP receives a RELAY_RENDEZVOUS2 cell on a circuit which
- has sent a RELAY_ESTABLISH_RENDEZVOUS cell but which has not yet received
- a reply, it uses g^y and H(g^xy) to complete the handshake as in the Tor
- circuit extend process: they establish a 60-octet string as
+ When Alice's OP receives a RELAY_COMMAND_RENDEZVOUS2 cell on a circuit which
+ has sent a RELAY_COMMAND_ESTABLISH_RENDEZVOUS cell but which has not yet
+ received a reply, it uses g^y and H(g^xy) to complete the handshake as in
+ the Tor circuit extend process: they establish a 60-octet string as
K = SHA1(g^xy | [00]) | SHA1(g^xy | [01]) | SHA1(g^xy | [02])
and generate
KH = K[0..15]
@@ -704,7 +725,7 @@ $Id$
1.11. Creating streams
To open TCP connections to Bob's location-hidden service, Alice's OP sends
- a RELAY_BEGIN cell along the established circuit, using the special
+ a RELAY_COMMAND_BEGIN cell along the established circuit, using the special
address "", and a chosen port. Bob's OP chooses a destination IP and
port, based on the configuration of the service connected to the circuit,
and opens a TCP stream. From then on, Bob's OP treats the stream as an
@@ -712,13 +733,190 @@ $Id$
[ Except he doesn't include addr in the connected cell or the end
cell. -RD]
- Alice MAY send multiple RELAY_BEGIN cells along the circuit, to open
- multiple streams to Bob. Alice SHOULD NOT send RELAY_BEGIN cells for any
- other address along her circuit to Bob; if she does, Bob MUST reject them.
+ Alice MAY send multiple RELAY_COMMAND_BEGIN cells along the circuit, to open
+ multiple streams to Bob. Alice SHOULD NOT send RELAY_COMMAND_BEGIN cells
+ for any other address along her circuit to Bob; if she does, Bob MUST reject
+ them.
2. Authentication and authorization.
-Foo.
+ The rendezvous protocol as described in Section 1 provides a few options
+ for implementing client-side authorization. There are two steps in the
+ rendezvous protocol that can be used for performing client authorization:
+ when downloading and decrypting parts of the hidden service descriptor and
+ at Bob's Tor client before contacting the rendezvous point. A service
+ provider can restrict access to his service at these two points to
+ authorized clients only.
+
+ There are currently two authorization protocols specified that are
+ described in more detail below:
+
+ 1. The first protocol allows a service provider to restrict access
+ to clients with a previously received secret key only, but does not
+ attempt to hide service activity from others.
+
+ 2. The second protocol, although feasible only for a limited set of
+ about 16 clients, performs client authorization and hides service
+ activity from everyone but the authorized clients.
+
+2.1. Service with large-scale client authorization
+
+ The first client authorization protocol aims at performing access control
+ while consuming as few additional resources as possible. This is the "basic"
+ authorization protocol. A service provider should be able to permit access
+ to a large number of clients while denying access for everyone else.
+ However, the price for scalability is that the service won't be able to hide
+ its activity from unauthorized or formerly authorized clients.
+
+ The main idea of this protocol is to encrypt the introduction-point part
+ in hidden service descriptors to authorized clients using symmetric keys.
+ This ensures that nobody else but authorized clients can learn which
+ introduction points a service currently uses, nor can someone send a
+ valid INTRODUCE1 message without knowing the introduction key. Therefore,
+ a subsequent authorization at the introduction point is not required.
+
+ A service provider generates symmetric "descriptor cookies" for his
+ clients and distributes them outside of Tor. The suggested key size is
+ 128 bits, so that descriptor cookies can be encoded in 22 base64 chars
+ (which can hold up to 22 * 6 = 132 bits, leaving 4 bits to encode the
+ authorization type (here: "0") and allow a client to distinguish this
+ authorization protocol from others like the one proposed below).
+ Typically, the contact information for a hidden service using this
+ authorization protocol looks like this:
+
+ v2cbb2l4lsnpio4q.onion Ll3X7Xgz9eHGKCCnlFH0uz
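
The size arithmetic above (128 key bits plus 4 type bits in 22 base64 characters of 6 bits each) can be illustrated with a minimal sketch. The bit layout (key first, auth type in the low 4 bits), the use of the standard base64 alphabet, and the function names are assumptions made for illustration; the text above does not fix them.

```python
import string

# Standard base64 alphabet (an assumption; the text does not name one).
B64 = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def encode_cookie(key: bytes, auth_type: int) -> str:
    """Pack a 128-bit key and a 4-bit auth type into 22 base64 characters."""
    if len(key) != 16:
        raise ValueError("descriptor cookie key must be 16 octets")
    if not 0 <= auth_type < 16:
        raise ValueError("auth type must fit in 4 bits")
    # 128 key bits followed by 4 type bits = 132 bits = 22 * 6 bits.
    value = (int.from_bytes(key, "big") << 4) | auth_type
    return "".join(B64[(value >> shift) & 0x3F] for shift in range(126, -1, -6))


def decode_cookie(text: str) -> tuple[bytes, int]:
    """Inverse of encode_cookie: return (key, auth_type)."""
    if len(text) != 22:
        raise ValueError("encoded cookie must be 22 characters")
    value = 0
    for ch in text:
        value = (value << 6) | B64.index(ch)
    return (value >> 4).to_bytes(16, "big"), value & 0xF
```

Under these assumptions a client can recover both the key and the authorization type from the single string it was given.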
+
+ When generating a hidden service descriptor, the service encrypts the
+ introduction-point part with a single randomly generated symmetric
+ 128-bit session key using AES-CTR as described for v2 hidden service
+ descriptors in rend-spec. Afterwards, the service encrypts the session
+ key under each descriptor cookie using AES. An authorized client should be
+ able to efficiently find the session key that is encrypted for him/her,
+ so a 4-octet client ID is generated for each client from the descriptor
+ cookie and the initialization vector. Descriptors always contain a number
+ of encrypted session keys that is a multiple of 16, achieved by adding
+ fake entries.
+ Encrypted session keys are ordered by client IDs in order to conceal
+ addition or removal of authorized clients by the service provider.
+
+ ATYPE Authorization type: set to 1. [1 octet]
+ ALEN Number of clients := 1 + ((clients - 1) div 16) [1 octet]
+ for each symmetric descriptor cookie:
+ ID Client ID: H(descriptor cookie | IV)[:4] [4 octets]
+ SKEY Session key encrypted with descriptor cookie [16 octets]
+ (end of client-specific part)
+ RND Random data [(15 - ((clients - 1) mod 16)) * 20 octets]
+ IV AES initialization vector [16 octets]
+ IPOS Intro points, encrypted with session key [remaining octets]
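
The length fields in the layout above can be checked with a short sketch. It assumes H is SHA-1 (this section does not define H) and uses hypothetical helper names; it only reproduces the arithmetic and the ID truncation, not the encryption.

```python
import hashlib

ENTRY_LEN = 4 + 16  # ID (4 octets) + SKEY (16 octets) per client entry


def block_count(clients: int) -> int:
    """ALEN as given above: 1 + ((clients - 1) div 16)."""
    if clients < 1:
        raise ValueError("at least one client is required")
    return 1 + ((clients - 1) // 16)


def padding_octets(clients: int) -> int:
    """Length of RND: fake entries that pad to a multiple of 16 entries."""
    if clients < 1:
        raise ValueError("at least one client is required")
    return (15 - ((clients - 1) % 16)) * ENTRY_LEN


def client_id(descriptor_cookie: bytes, iv: bytes) -> bytes:
    """ID = H(descriptor cookie | IV)[:4], with H assumed to be SHA-1."""
    return hashlib.sha1(descriptor_cookie + iv).digest()[:4]
```

For example, a service with 17 clients has `block_count(17) == 2` and pads with 15 fake entries (300 octets), so the client-specific part always spans a whole number of 16-entry blocks.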
+
+ An authorized client needs to configure Tor to use the descriptor cookie
+ when accessing the hidden service. Therefore, a user adds the contact
+ information that she received from the service provider to her torrc
+ file. Upon downloading a hidden service descriptor, Tor finds the
+ encrypted introduction-point part and attempts to decrypt it using the
+ configured descriptor cookie. (In the rare event of two or more client
+ IDs being equal, a client tries to decrypt all of them.)
+
+ Upon sending the introduction, the client includes her descriptor cookie
+ as auth type "1" in the INTRODUCE2 cell that she sends to the service.
+ The hidden service checks whether the included descriptor cookie is
+ authorized to access the service and either responds to the introduction
+ request, or not.
+
+2.2. Authorization for limited number of clients
+
+ A second, more sophisticated client authorization protocol goes the extra
+ mile of hiding service activity from unauthorized clients. This is the
+ "stealth" authorization protocol. Otherwise identical to the preceding
+ authorization protocol, the second protocol publishes a separate hidden
+ service descriptor for each user and encrypts the introduction-point part
+ of each descriptor to a single client only. This allows the
+ service to stop publishing descriptors for removed clients. As long as a
+ removed client cannot link descriptors issued for other clients to the
+ service, it cannot derive service activity any more. The downside of this
+ approach is limited scalability. Even though the distributed storage of
+ descriptors (cf. proposal 114) tackles the problem of limited scalability to
+ a certain extent, this protocol should not be used for services with more
+ than 16 clients. (In fact, Tor should refuse to advertise services for more
+ than this number of clients.)
+
+ A hidden service generates an asymmetric "client key" and a symmetric
+ "descriptor cookie" for each client. The client key is used as
+ replacement for the service's permanent key, so that the service uses a
+ different identity for each of his clients. The descriptor cookie is used
+ to store descriptors at changing directory nodes that are unpredictable
+ for anyone but service and client, to encrypt the introduction-point
+ part, and to be included in INTRODUCE2 cells. Once the service has
+ created the client key and descriptor cookie, he conveys them to the client
+ outside of Tor. The contact information string looks similar to the one
+ used by the preceding authorization protocol (with the only difference
+ that it has "1" encoded as auth-type in the remaining 4 of 132 bits
+ instead of "0" as before).
+
+ When creating a hidden service descriptor for an authorized client, the
+ hidden service uses the client key and descriptor cookie to compute
+ secret ID part and descriptor ID:
+
+ secret-id-part = H(time-period | descriptor-cookie | replica)
+
+ descriptor-id = H(client-key[:10] | secret-id-part)
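
The two formulas above can be sketched as follows. This is illustrative only: it assumes H is SHA-1, that time-period is encoded as a 4-octet big-endian integer and replica as a single octet, and that `client_key_id` is an octet string identifying the client key from which the first 10 octets are taken. None of these encodings is fixed by the text above.

```python
import hashlib
import struct


def secret_id_part(time_period: int, descriptor_cookie: bytes, replica: int) -> bytes:
    """secret-id-part = H(time-period | descriptor-cookie | replica)."""
    data = struct.pack(">I", time_period) + descriptor_cookie + bytes([replica])
    return hashlib.sha1(data).digest()


def descriptor_id(client_key_id: bytes, secret: bytes) -> bytes:
    """descriptor-id = H(client-key[:10] | secret-id-part)."""
    return hashlib.sha1(client_key_id[:10] + secret).digest()
```

Because the descriptor cookie enters the hash, only the service and the client holding that cookie can predict where a descriptor will be stored.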
+
+ The hidden service also replaces permanent-key in the descriptor with
+ client-key and encrypts introduction-points with the descriptor cookie.
+
+ ATYPE Authorization type: set to 2. [1 octet]
+ IV AES initialization vector [16 octets]
+ IPOS Intro points, encr. with descriptor cookie [remaining octets]
+
+ When uploading descriptors, the hidden service needs to make sure that
+ descriptors for different clients are not uploaded at the same time (cf.
+ Section 1.1) which is also a limiting factor for the number of clients.
+
+ When a client is asked to establish a connection to a hidden service,
+ it looks up whether it has any authorization data configured for that
+ service. If the user has configured authorization data for authorization
+ protocol "2", the descriptor ID is determined as described above. Upon
+ receiving a descriptor, the client decrypts the
+ introduction-point part using its descriptor cookie. Further, the client
+ includes its descriptor cookie as auth-type "2" in INTRODUCE2 cells that
+ it sends to the service.
+
+2.3. Hidden service configuration
+
+ A hidden service that is meant to perform client authorization adds a
+ new option HiddenServiceAuthorizeClient to its hidden service
+ configuration. This option contains the authorization type which is
+ either "basic" for the protocol described in 2.1 or "stealth" for the
+ protocol in 2.2 and a comma-separated list of human-readable client
+ names, so that Tor can create authorization data for these clients:
+
+ HiddenServiceAuthorizeClient auth-type client-name,client-name,...
+
+ If this option is configured, HiddenServiceVersion is automatically
+ reconfigured to contain only version numbers of 2 or higher. There is
+ a maximum of 512 client names for basic auth and a maximum of 16 for
+ stealth auth.
+
+ Tor stores all generated authorization data for the authorization
+ protocols described in Sections 2.1 and 2.2 in a new file using the
+ following file format:
+
+ "client-name" human-readable client identifier NL
+ "descriptor-cookie" 128-bit key ^= 22 base64 chars NL
+
+ If the authorization protocol of Section 2.2 is used, Tor also generates
+ and stores the following data:
+
+ "client-key" NL a public key in PEM format
+
+2.4. Client configuration
+
+ Clients need to make their authorization data known to Tor using another
+ configuration option that contains a service name (mainly for the sake of
+ convenience), the service address, and the descriptor cookie that is
+ required to access a hidden service (the authorization protocol number is
+ encoded in the descriptor cookie):
+
+ HidServAuth service-name service-address descriptor-cookie
3. Hidden service directory operation
diff --git a/doc/spec/socks-extensions.txt b/doc/spec/socks-extensions.txt
index 8d58987f35..62d86acd9f 100644
--- a/doc/spec/socks-extensions.txt
+++ b/doc/spec/socks-extensions.txt
@@ -1,4 +1,3 @@
-$Id$
Tor's extensions to the SOCKS protocol
1. Overview
diff --git a/doc/spec/tor-spec.txt b/doc/spec/tor-spec.txt
index a321aa8694..91ad561b8d 100644
--- a/doc/spec/tor-spec.txt
+++ b/doc/spec/tor-spec.txt
@@ -1,4 +1,3 @@
-$Id$
Tor Protocol Specification
@@ -17,11 +16,16 @@ see tor-design.pdf.
0. Preliminaries
+ The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
+ NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
+ "OPTIONAL" in this document are to be interpreted as described in
+ RFC 2119.
+
0.1. Notation and encoding
PK -- a public key.
SK -- a private key.
- K -- a key for a symmetric cypher.
+ K -- a key for a symmetric cipher.
a|b -- concatenation of 'a' and 'b'.
@@ -172,8 +176,8 @@ see tor-design.pdf.
In "renegotiation", the connection initiator sends no certificates, and
the responder sends a single connection certificate. Once the TLS
handshake is complete, the initiator renegotiates the handshake, with each
- parties sending a two-certificate chain as in "certificates up-front".
- The initiator's ClientHello MUST include at least once ciphersuite not in
+ party sending a two-certificate chain as in "certificates up-front".
+ The initiator's ClientHello MUST include at least one ciphersuite not in
the list above. The responder SHOULD NOT select any ciphersuite besides
those in the list above.
[The above "should not" is because some of the ciphers that
@@ -201,9 +205,9 @@ see tor-design.pdf.
to decide which to use.
In all of the above handshake variants, certificates sent in the clear
- SHOULD NOT include any strings to identify the host as a Tor server. In
- the "renegotation" and "backwards-compatible renegotiation", the
- initiator SHOULD chose a list of ciphersuites and TLS extensions chosen
+ SHOULD NOT include any strings to identify the host as a Tor server. In
+ the "renegotiation" and "backwards-compatible renegotiation" steps, the
+ initiator SHOULD choose a list of ciphersuites and TLS extensions
to mimic one used by a popular web browser.
Responders MUST NOT select any TLS ciphersuite that lacks ephemeral keys,
@@ -289,7 +293,7 @@ see tor-design.pdf.
6 -- CREATED_FAST (Circuit created, no PK) (See Sec 5.1)
7 -- VERSIONS (Negotiate proto version) (See Sec 4)
8 -- NETINFO (Time and address info) (See Sec 4)
- 9 -- RELAY_EARLY (End-to-end data; limited) (See sec 5.6)
+ 9 -- RELAY_EARLY (End-to-end data; limited) (See Sec 5.6)
The interpretation of 'Payload' depends on the type of the cell.
PADDING: Payload is unused.
@@ -357,7 +361,7 @@ see tor-design.pdf.
The address format is a type/length/value sequence as given in section
6.4 below. The timestamp is a big-endian unsigned integer number of
- seconds since the unix epoch.
+ seconds since the Unix epoch.
Implementations MAY use the timestamp value to help decide if their
clocks are skewed. Initiators MAY use "other OR's address" to help
@@ -399,7 +403,7 @@ see tor-design.pdf.
Onion skin [DH_LEN+KEY_LEN+PK_PAD_LEN bytes]
Identity fingerprint [HASH_LEN bytes]
- The port and address field denote the IPV4 address and port of the next
+ The port and address field denote the IPv4 address and port of the next
onion router in the circuit; the public key hash is the hash of the PKCS#1
ASN1 encoding of the next onion router's identity (signing) key. (See 0.3
above.) Including this hash allows the extending OR to verify that it is
@@ -593,6 +597,14 @@ see tor-design.pdf.
cell to the next node in the circuit, and replies to the OP with a
RELAY_TRUNCATED cell.
+ [Note: If an OR receives a TRUNCATE cell and it has any RELAY cells
+ still queued on the circuit for the next node it will drop them
+ without sending them. This is not considered conformant behavior,
+ but it probably won't get fixed until a later version of Tor. Thus,
+ clients SHOULD NOT send a TRUNCATE cell to a node running any current
+ version of Tor if a) they have sent relay cells through that node,
+ and b) they aren't sure whether those cells have been sent on yet.]
+
When an unrecoverable error occurs along one connection in a
circuit, the nodes on either side of the connection should, if they
are able, act as follows: the node closer to the OP should send a
@@ -831,7 +843,8 @@ see tor-design.pdf.
6 -- REASON_DONE (Anonymized TCP connection was closed)
7 -- REASON_TIMEOUT (Connection timed out, or OR timed out
while connecting)
- 8 -- (unallocated) [**]
+ 8 -- REASON_NOROUTE (Routing error while attempting to
+ contact destination)
9 -- REASON_HIBERNATING (OR is temporarily hibernating)
10 -- REASON_INTERNAL (Internal error at the OR)
11 -- REASON_RESOURCELIMIT (OR has no resources to fulfill request)
@@ -853,8 +866,6 @@ see tor-design.pdf.
[*] Older versions of Tor also send this reason when connections are
reset.
- [**] Due to a bug in versions of Tor through 0095, error reason 8 must
- remain allocated until that version is obsolete.
--- [The rest of this section describes unimplemented functionality.]
@@ -886,7 +897,7 @@ see tor-design.pdf.
6.4. Remote hostname lookup
To find the address associated with a hostname, the OP sends a
- RELAY_RESOLVE cell containing the hostname to be resolved with a nul
+ RELAY_RESOLVE cell containing the hostname to be resolved with a NUL
terminating byte. (For a reverse lookup, the OP sends a RELAY_RESOLVE
cell containing an in-addr.arpa address.) The OR replies with a
RELAY_RESOLVED cell containing a status byte, and any number of
diff --git a/doc/spec/version-spec.txt b/doc/spec/version-spec.txt
index 842271ae19..265717f409 100644
--- a/doc/spec/version-spec.txt
+++ b/doc/spec/version-spec.txt
@@ -1,4 +1,3 @@
-$Id$
HOW TOR VERSION NUMBERS WORK