Diffstat (limited to 'spec/dir-spec/extra-info-document-format.md')
-rw-r--r-- spec/dir-spec/extra-info-document-format.md 591
1 files changed, 591 insertions, 0 deletions
diff --git a/spec/dir-spec/extra-info-document-format.md b/spec/dir-spec/extra-info-document-format.md
new file mode 100644
index 0000000..0defe70
--- /dev/null
+++ b/spec/dir-spec/extra-info-document-format.md
@@ -0,0 +1,591 @@
+<a id="dir-spec.txt-2.1.2"></a>
+
+# Extra-info document format
+
+Extra-info documents consist of the following items:
+
+```text
+ "extra-info" Nickname Fingerprint NL
+ [At start, exactly once.]
+```
+
+Identifies what router this is an extra-info descriptor for.
+Fingerprint is encoded in hex (using upper-case letters), with
+no spaces.
+
+```text
+ "identity-ed25519"
+ [As in router descriptors]
+
+ "published" YYYY-MM-DD HH:MM:SS NL
+
+ [Exactly once.]
+```
+
+The time, in UTC, when this document (and its corresponding router
+descriptor if any) was generated. It MUST match the published time
+in the corresponding server descriptor.
+
+```text
+ "read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL
+ [At most once.]
+ "write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL
+ [At most once.]
+```
+
+Declare how much bandwidth the OR has used recently. Usage is divided
+into intervals of NSEC seconds. The YYYY-MM-DD HH:MM:SS field
+defines the end of the most recent interval. The numbers are the
+number of bytes used in the most recent intervals, ordered from
+oldest to newest.
+
+These fields include both IPv4 and IPv6 traffic.
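+
As an illustration of how the history lines above could be consumed, here is a minimal Python sketch. The names `HISTORY_RE` and `parse_history` are hypothetical, and accepting a line with no byte counts at all is a lenient assumption rather than something the format text states.

```python
import re
from datetime import datetime, timedelta, timezone

# Matches e.g.: read-history 2024-01-02 03:00:00 (900 s) 10,20,30
HISTORY_RE = re.compile(
    r"^(?P<keyword>[a-z0-9-]+-history) "
    r"(?P<end>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"\((?P<nsec>\d+) s\)"
    r"(?: (?P<values>\d+(?:,\d+)*))?$"
)


def parse_history(line):
    """Return (keyword, [(interval_end_utc, byte_count), ...]), oldest first."""
    m = HISTORY_RE.match(line.strip())
    if m is None:
        raise ValueError(f"not a history line: {line!r}")
    end = datetime.strptime(m["end"], "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)
    step = timedelta(seconds=int(m["nsec"]))
    # Assumption: an absent value list is treated as an empty history.
    values = [int(v) for v in m["values"].split(",")] if m["values"] else []
    # The timestamp ends the newest interval; each earlier value ends NSEC sooner.
    n = len(values)
    return m["keyword"], [(end - step * (n - 1 - i), v) for i, v in enumerate(values)]
```

The same function handles the "ipv6-" and "dirreq-" history variants, since they share the layout.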
+
+```text
+ "ipv6-read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM... NL
+ [At most once.]
+ "ipv6-write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM... NL
+ [At most once.]
+```
+
+Declare how much bandwidth the OR has used recently, on IPv6
+connections. See "read-history" and "write-history" for full details.
+
+```text
+ "geoip-db-digest" Digest NL
+ [At most once.]
+```
+
+SHA1 digest of the IPv4 GeoIP database file that is used to
+resolve IPv4 addresses to country codes.
+
+```text
+ "geoip6-db-digest" Digest NL
+ [At most once.]
+```
+
+SHA1 digest of the IPv6 GeoIP database file that is used to
+resolve IPv6 addresses to country codes.
+
+("geoip-start-time" YYYY-MM-DD HH:MM:SS NL)
+("geoip-client-origins" CC=NUM,CC=NUM,... NL)
+
+Only generated by bridge routers (see blocking.pdf), and only
+when they have been configured with a geoip database.
+Non-bridges SHOULD NOT generate these fields. Contains a list
+of mappings from two-letter country codes (CC) to the number
+of clients that have connected to that bridge from that
+country (approximate, and rounded up to the nearest multiple of 8
+in order to hamper traffic analysis). A country is included
+only if it has at least one address. The time in
+"geoip-start-time" is the time at which we began collecting geoip
+statistics.
+
+"geoip-start-time" and "geoip-client-origins" have been replaced by
+"bridge-stats-end" and "bridge-ips" in 0.2.2.4-alpha. The
+reason is that the measurement interval with "geoip-stats" as
+determined by subtracting "geoip-start-time" from "published" could
+have had a variable length, whereas the measurement interval in
+0.2.2.4-alpha and later is set to be exactly 24 hours long. In
+order to clearly distinguish the new measurement intervals from
+the old ones, the new keywords have been introduced.
+
+```text
+ "bridge-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
+ [At most once.]
+```
+
+YYYY-MM-DD HH:MM:SS defines the end of the included measurement
+interval of length NSEC seconds (86400 seconds by default).
+
+A "bridge-stats-end" line, as well as any other "bridge-\*" line,
+is only added when the relay has been running as a bridge for at
+least 24 hours.
+
+```text
+ "bridge-ips" CC=NUM,CC=NUM,... NL
+ [At most once.]
+```
+
+List of mappings from two-letter country codes to the number of
+unique IP addresses that have connected from that country to the
+bridge and that do not belong to known relays, rounded up to the
+nearest multiple of 8.
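+
The rounding rule can be sketched in Python as follows. The helper names are hypothetical, and the choice to sort entries by country code and to omit countries with a zero count are assumptions made for the example, not requirements taken from the text.

```python
def round_up_to_multiple(count, multiple=8):
    """Round a non-negative integer up to the nearest multiple of `multiple`."""
    return -(-count // multiple) * multiple


def format_bridge_ips(unique_ips_by_country):
    """Build a "bridge-ips" line from {country_code: exact_unique_ip_count}."""
    pairs = [
        f"{cc}={round_up_to_multiple(n)}"
        for cc, n in sorted(unique_ips_by_country.items())  # assumed ordering
        if n > 0  # assumption: countries without any address are left out
    ]
    return "bridge-ips " + ",".join(pairs)
```

Because every reported value is a multiple of 8, a consumer can only recover a range (for example 1 to 8 addresses for a reported 8), never the exact count.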
+
+```text
+ "bridge-ip-versions" FAM=NUM,FAM=NUM,... NL
+ [At most once.]
+```
+
+List of mappings from protocol family to the number of unique IP
+addresses that have connected to the bridge using that family.
+
+```text
+ "bridge-ip-transports" PT=NUM,PT=NUM,... NL
+ [At most once.]
+```
+
+List of mappings from pluggable transport names to the number
+of unique IP addresses that have connected using that
+pluggable transport. Unobfuscated connections are counted
+using the reserved pluggable transport name "`<OR>`" (without
+quotes). If we received a connection from a transport proxy
+but we couldn't figure out the name of the pluggable
+transport, we use the reserved pluggable transport name
+"`<??>`".
+
+("`<OR>`" and "`<??>`" are reserved because normal pluggable
+transport names MUST match the following regular expression:
+"`[a-zA-Z_][a-zA-Z0-9_]*`" )
+
+The pluggable transport name list is sorted into lexically
+ascending order.
+
+If no clients have connected to the bridge yet, we write only the
+keyword "bridge-ip-transports", with no list, to the stats file.
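+
A small Python sketch of the naming and ordering rules for this line follows. The function name is hypothetical, the counts are taken as already prepared by the caller, and "lexically ascending" is interpreted here as plain code-point order, which is an assumption.

```python
import re

# Regular expression for ordinary pluggable transport names, as quoted above.
PT_NAME_RE = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")
# Reserved names that deliberately fall outside that expression.
RESERVED_NAMES = {"<OR>", "<??>"}


def format_bridge_ip_transports(counts):
    """Build a "bridge-ip-transports" line from {transport_name: count}."""
    for name in counts:
        if name not in RESERVED_NAMES and PT_NAME_RE.fullmatch(name) is None:
            raise ValueError(f"invalid pluggable transport name: {name!r}")
    if not counts:
        # No clients yet: only the keyword is written.
        return "bridge-ip-transports"
    pairs = [f"{name}={counts[name]}" for name in sorted(counts)]
    return "bridge-ip-transports " + ",".join(pairs)
```

With code-point ordering, both reserved names sort ahead of any ordinary transport name because "<" precedes letters and "_".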
+
+```text
+ "dirreq-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
+ [At most once.]
+```
+
+YYYY-MM-DD HH:MM:SS defines the end of the included measurement
+interval of length NSEC seconds (86400 seconds by default).
+
+A "dirreq-stats-end" line, as well as any other "dirreq-\*" line,
+is only added when the relay has opened its Dir port and after 24
+hours of measuring directory requests.
+
+```text
+ "dirreq-v2-ips" CC=NUM,CC=NUM,... NL
+ [At most once.]
+ "dirreq-v3-ips" CC=NUM,CC=NUM,... NL
+ [At most once.]
+```
+
+List of mappings from two-letter country codes to the number of
+unique IP addresses that have connected from that country to
+request a v2/v3 network status, rounded up to the nearest multiple
+of 8. Only those IP addresses are counted that the directory can
+answer with a 200 OK status code. (Note here and below: current Tor
+versions, as of 0.2.5.2-alpha, no longer cache or serve v2
+networkstatus documents.)
+
+```text
+ "dirreq-v2-reqs" CC=NUM,CC=NUM,... NL
+ [At most once.]
+ "dirreq-v3-reqs" CC=NUM,CC=NUM,... NL
+ [At most once.]
+```
+
+List of mappings from two-letter country codes to the number of
+requests for v2/v3 network statuses from that country, rounded up
+to the nearest multiple of 8. Only those requests are counted that
+the directory can answer with a 200 OK status code.
+
+```text
+ "dirreq-v2-share" NUM% NL
+ [At most once.]
+ "dirreq-v3-share" NUM% NL
+ [At most once.]
+```
+
+The share of v2/v3 network status requests that the directory
+expects to receive from clients based on its advertised bandwidth
+compared to the overall network bandwidth capacity. Shares are
+formatted in percent with two decimal places. Shares are
+calculated as means over the whole 24-hour interval.
+
+```text
+ "dirreq-v2-resp" status=NUM,... NL
+ [At most once.]
+ "dirreq-v3-resp" status=NUM,... NL
+ [At most once.]
+```
+
+List of mappings from response statuses to the number of requests
+for v2/v3 network statuses that were answered with that response
+status, rounded up to the nearest multiple of 4. Only response
+statuses with at least 1 response are reported. New response
+statuses can be added at any time. The current list of response
+statuses is as follows:
+
+```text
+ "ok": a network status request is answered; this number
+ corresponds to the sum of all requests as reported in
+ "dirreq-v2-reqs" or "dirreq-v3-reqs", respectively, before
+ rounding up.
+ "not-enough-sigs": a version 3 network status is not signed by a
+ sufficient number of requested authorities.
+ "unavailable": a requested network status object is unavailable.
+ "not-found": a requested network status is not found.
+ "not-modified": a network status has not been modified since the
+ If-Modified-Since time that is included in the request.
+ "busy": the directory is busy.
+```
+
+```text
+ "dirreq-v2-direct-dl" key=NUM,... NL
+ [At most once.]
+ "dirreq-v3-direct-dl" key=NUM,... NL
+ [At most once.]
+ "dirreq-v2-tunneled-dl" key=NUM,... NL
+ [At most once.]
+ "dirreq-v3-tunneled-dl" key=NUM,... NL
+ [At most once.]
+```
+
+List of statistics about possible failures in the download process
+of v2/v3 network statuses. Requests are either "direct"
+HTTP-encoded requests over the relay's directory port, or
+"tunneled" requests using a BEGIN_DIR relay message over the relay's OR
+port. The list of possible statistics can change, and statistics
+can be left out from reporting. The current list of statistics is
+as follows:
+
+Successful downloads and failures:
+
+```text
+ "complete": a client has finished the download successfully.
+ "timeout": a download did not finish within 10 minutes after
+ starting to send the response.
+ "running": a download was still in progress at the end of the
+ measurement period, less than 10 minutes after the relay started
+ to send the response.
+```
+
+Download times:
+
+```text
+ "min", "max": smallest and largest measured bandwidth in B/s.
+ "d[1-4,6-9]": 1st to 4th and 6th to 9th decile of measured
+ bandwidth in B/s. For a given decile i, i/10 of all downloads
+ had a smaller bandwidth than di, and (10-i)/10 of all downloads
+ had a larger bandwidth than di.
+ "q[1,3]": 1st and 3rd quartile of measured bandwidth in B/s. One
+ fourth of all downloads had a smaller bandwidth than q1, one
+ fourth of all downloads had a larger bandwidth than q3, and the
+ remaining half of all downloads had a bandwidth between q1 and
+ q3.
+ "md": median of measured bandwidth in B/s. Half of the downloads
+ had a smaller bandwidth than md, the other half had a larger
+ bandwidth than md.
+```
+
+```text
+ "dirreq-read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM... NL
+ [At most once.]
+ "dirreq-write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM... NL
+ [At most once.]
+```
+
+Declare how much bandwidth the OR has spent on answering directory
+requests. Usage is divided into intervals of NSEC seconds. The
+YYYY-MM-DD HH:MM:SS field defines the end of the most recent
+interval. The numbers are the number of bytes used in the most
+recent intervals, ordered from oldest to newest.
+
+```text
+ "entry-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
+ [At most once.]
+```
+
+YYYY-MM-DD HH:MM:SS defines the end of the included measurement
+interval of length NSEC seconds (86400 seconds by default).
+
+An "entry-stats-end" line, as well as any other "entry-\*"
+line, is first added after the relay has been running for at least
+24 hours.
+
+```text
+ "entry-ips" CC=NUM,CC=NUM,... NL
+ [At most once.]
+```
+
+List of mappings from two-letter country codes to the number of
+unique IP addresses that have connected from that country to the
+relay and that do not belong to other known relays, rounded up to
+the nearest multiple of 8.
+
+```text
+ "cell-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
+ [At most once.]
+```
+
+YYYY-MM-DD HH:MM:SS defines the end of the included measurement
+interval of length NSEC seconds (86400 seconds by default).
+
+A "cell-stats-end" line, as well as any other "cell-\*" line,
+is first added after the relay has been running for at least 24
+hours.
+
+```text
+ "cell-processed-cells" NUM,...,NUM NL
+ [At most once.]
+```
+
+Mean number of processed cells per circuit, subdivided into
+deciles of circuits by the number of cells they have processed in
+descending order from loudest to quietest circuits.
+
+```text
+ "cell-queued-cells" NUM,...,NUM NL
+ [At most once.]
+```
+
+Mean number of cells contained in queues by circuit decile. These
+means are calculated by 1) determining the mean number of cells in
+a single circuit between its creation and its termination and 2)
+calculating the mean for all circuits in a given decile as
+determined in "cell-processed-cells". Numbers have a precision of
+two decimal places.
+
+Note that this statistic can be inaccurate for circuits that had
+queued cells at the start or end of the measurement interval.
+
+```text
+ "cell-time-in-queue" NUM,...,NUM NL
+ [At most once.]
+```
+
+Mean time cells spend in circuit queues in milliseconds. Times are
+calculated by 1) determining the mean time cells spend in the
+queue of a single circuit and 2) calculating the mean for all
+circuits in a given decile as determined in
+"cell-processed-cells".
+
+Note that this statistic can be inaccurate for circuits that had
+queued cells at the start or end of the measurement interval.
+
+```text
+ "cell-circuits-per-decile" NUM NL
+ [At most once.]
+```
+
+Mean number of circuits that are included in any of the deciles,
+rounded up to the next integer.
+
+```text
+ "conn-bi-direct" YYYY-MM-DD HH:MM:SS (NSEC s) BELOW,READ,WRITE,BOTH NL
+ [At most once.]
+```
+
+Number of connections, split into 10-second intervals, that are
+used uni-directionally or bi-directionally as observed in the NSEC
+seconds (usually 86400 seconds) before YYYY-MM-DD HH:MM:SS. Every
+10 seconds, we determine for every connection whether we read and
+wrote less than a threshold of 20 KiB (BELOW), read at least 10
+times more than we wrote (READ), wrote at least 10 times more than
+we read (WRITE), or read and wrote more than the threshold, but
+not 10 times more in either direction (BOTH). After classifying a
+connection, read and write counters are reset for the next
+10-second interval.
+
+This measurement includes both IPv4 and IPv6 connections.
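+
The per-interval classification can be sketched in Python as below. The function and constant names are hypothetical. The sketch assumes the 20 KiB threshold is compared against the combined number of bytes read and written, which is one reading of the description above.

```python
from collections import Counter

THRESHOLD_BYTES = 20 * 1024  # 20 KiB
FACTOR = 10                  # "10 times more" in either direction


def classify_interval(bytes_read, bytes_written):
    """Classify one connection's traffic over a single 10-second interval."""
    # Assumption: the threshold applies to read and written bytes combined.
    if bytes_read + bytes_written < THRESHOLD_BYTES:
        return "BELOW"
    if bytes_read >= FACTOR * bytes_written:
        return "READ"
    if bytes_written >= FACTOR * bytes_read:
        return "WRITE"
    return "BOTH"


def conn_bi_direct_counts(samples):
    """samples: iterable of (bytes_read, bytes_written), one per connection
    per 10-second interval. Returns the BELOW,READ,WRITE,BOTH field."""
    tally = Counter(classify_interval(r, w) for r, w in samples)
    return ",".join(str(tally[k]) for k in ("BELOW", "READ", "WRITE", "BOTH"))
```

Each sample is classified independently, which mirrors the counters being reset after every 10-second interval.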
+
+```text
+ "ipv6-conn-bi-direct" YYYY-MM-DD HH:MM:SS (NSEC s) BELOW,READ,WRITE,BOTH NL
+ [At most once.]
+```
+
+Number of IPv6 connections that are used uni-directionally or
+bi-directionally. See "conn-bi-direct" for more details.
+
+```text
+ "exit-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
+ [At most once.]
+```
+
+YYYY-MM-DD HH:MM:SS defines the end of the included measurement
+interval of length NSEC seconds (86400 seconds by default).
+
+An "exit-stats-end" line, as well as any other "exit-\*" line, is
+first added after the relay has been running for at least 24 hours
+and only if the relay permits exiting (where exiting to a single
+port and IP address is sufficient).
+
+```text
+ "exit-kibibytes-written" port=N,port=N,... NL
+ [At most once.]
+ "exit-kibibytes-read" port=N,port=N,... NL
+ [At most once.]
+```
+
+List of mappings from ports to the number of kibibytes that the
+relay has written to or read from exit connections to that port,
+rounded up to the next full kibibyte. Relays may limit the
+number of listed ports and subsume any remaining kibibytes under
+port "other".
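+
A rough Python sketch of the kibibyte rounding and the "other" bucket follows. The function names and the `max_ports` cut-off are hypothetical, as is the choice to keep the busiest ports and list them in ascending port order; the text above only says that relays may limit the number of listed ports.

```python
def bytes_to_kibibytes(num_bytes):
    """Round a byte count up to the next full kibibyte (1024 bytes)."""
    return -(-num_bytes // 1024)


def format_exit_kibibytes(keyword, bytes_by_port, max_ports=10):
    """keyword: e.g. "exit-kibibytes-written"; bytes_by_port: {port: bytes}."""
    # Assumption: keep the ports that carried the most bytes.
    busiest = sorted(bytes_by_port, key=bytes_by_port.get, reverse=True)[:max_ports]
    # Everything else is subsumed under the pseudo-port "other".
    other = sum(n for port, n in bytes_by_port.items() if port not in busiest)
    pairs = [f"{port}={bytes_to_kibibytes(bytes_by_port[port])}" for port in sorted(busiest)]
    if other:
        pairs.append(f"other={bytes_to_kibibytes(other)}")
    return f"{keyword} " + ",".join(pairs)
```

Summing the remaining bytes before rounding keeps the "other" value from accumulating one rounding error per hidden port.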
+
+```text
+ "exit-streams-opened" port=N,port=N,... NL
+ [At most once.]
+```
+
+List of mappings from ports to the number of opened exit streams
+to that port, rounded up to the nearest multiple of 4. Relays may
+limit the number of listed ports and subsume any remaining opened
+streams under port "other".
+
+```text
+ "hidserv-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
+ [At most once.]
+ "hidserv-v3-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
+ [At most once.]
+```
+
+YYYY-MM-DD HH:MM:SS defines the end of the included measurement
+interval of length NSEC seconds (86400 seconds by default).
+
+A "hidserv-stats-end" line, as well as any other "hidserv-\*" line,
+is first added after the relay has been running for at least 24
+hours.
+
+(Introduced in tor-0.4.6.1-alpha)
+
+```text
+ "hidserv-rend-relayed-cells" SP NUM SP key=val SP key=val ... NL
+ [At most once.]
+ "hidserv-rend-v3-relayed-cells" SP NUM SP key=val SP key=val ... NL
+ [At most once.]
+```
+
+Approximate number of relay cells seen in either direction on a
+circuit after receiving and successfully processing a RENDEZVOUS1
+cell.
+
+The original measurement value is obfuscated in several steps:
+first, it is rounded up to the nearest multiple of 'bin_size'
+which is reported in the key=val part of this line; second, a
+(possibly negative) noise value is added to the result of the
+first step by randomly sampling from a Laplace distribution with
+mu = 0 and b = (delta_f / epsilon) with 'delta_f' and 'epsilon'
+being reported in the key=val part, too; third, the result of the
+previous obfuscation steps is truncated to the next smaller
+integer and included as 'NUM'. Note that the overall reported
+value can be negative.
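+
The three obfuscation steps can be sketched in Python as follows. The function names are hypothetical, and the optional `noise` argument exists only so the deterministic steps can be checked. The Laplace sample is drawn as the scaled difference of two unit exponentials, which is one standard way to sample that distribution and not necessarily how any given implementation does it.

```python
import math
import random


def sample_laplace(mu, b, rng=random):
    """Draw from Laplace(mu, b): the difference of two independent Exp(1)
    variables follows Laplace(0, 1), which is then scaled and shifted."""
    return mu + b * (rng.expovariate(1.0) - rng.expovariate(1.0))


def obfuscate(value, bin_size, delta_f, epsilon, noise=None, rng=random):
    """Apply binning, Laplace noise, and truncation to an integer count."""
    # Step 1: round up to the nearest multiple of bin_size.
    binned = -(-value // bin_size) * bin_size
    # Step 2: add noise sampled with mu = 0 and b = delta_f / epsilon.
    if noise is None:
        noise = sample_laplace(0.0, delta_f / epsilon, rng)
    # Step 3: truncate to the next smaller integer; the result may be negative.
    return math.floor(binned + noise)
```

For example, with an illustrative `bin_size` of 1024, a true value of 1500 is first binned to 2048, so a noise draw below -2048 yields a negative reported value.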
+
+(Introduced in tor-0.4.6.1-alpha)
+
+```text
+ "hidserv-dir-onions-seen" SP NUM SP key=val SP key=val ... NL
+ [At most once.]
+ "hidserv-dir-v3-onions-seen" SP NUM SP key=val SP key=val ... NL
+ [At most once.]
+```
+
+Approximate number of unique hidden-service identities seen in
+descriptors published to and accepted by this hidden-service
+directory.
+
+The original measurement value is obfuscated in the same way as
+the 'NUM' value reported in "hidserv-rend-relayed-cells", but
+possibly with different parameters as reported in the key=val part
+of this line. Note that the overall reported value can be
+negative.
+
+(Introduced in tor-0.4.6.1-alpha)
+
+```text
+ "transport" transportname address:port [arglist] NL
+ [Any number.]
+```
+
+Signals that the router supports the 'transportname' pluggable
+transport at IP address 'address' and TCP port 'port'. A single
+descriptor MUST NOT have more than one transport line with the
+same 'transportname'.
+
+Pluggable transports are only relevant to bridges, but these entries
+can appear in non-bridge relays as well.
+
+```text
+ "padding-counts" YYYY-MM-DD HH:MM:SS (NSEC s) key=NUM key=NUM ... NL
+ [At most once.]
+```
+
+YYYY-MM-DD HH:MM:SS defines the end of the included measurement
+interval of length NSEC seconds (86400 seconds by default). Counts
+are reset to 0 at the end of this interval.
+
+The keyword list is currently as follows:
+
+```text
+ bin-size
+ - The current rounding value for cell count fields (10000 by
+ default)
+ write-drop
+ - The number of RELAY_DROP messages this relay sent
+ write-pad
+ - The number of PADDING cells this relay sent
+ write-total
+ - The total number of cells this relay sent
+ read-drop
+ - The number of RELAY_DROP messages this relay received
+ read-pad
+ - The number of PADDING cells this relay received
+ read-total
+ - The total number of cells this relay received
+ enabled-read-pad
+ - The number of PADDING cells this relay received on
+ connections that support padding
+ enabled-read-total
+ - The total number of cells this relay received on connections
+ that support padding
+ enabled-write-pad
+ - The number of PADDING cells this relay sent on connections
+ that support padding
+ enabled-write-total
+ - The total number of cells sent by this relay on connections
+ that support padding
+ max-chanpad-timers
+ - The maximum number of timers that this relay scheduled for
+ padding in the previous NSEC interval
+```
+
+```text
+ "overload-ratelimits" SP version SP YYYY-MM-DD SP HH:MM:SS
+ SP rate-limit SP burst-limit
+ SP read-overload-count SP write-overload-count NL
+ [At most once.]
+```
+
+Indicates that a bandwidth limit was exhausted for this relay.
+
+The "rate-limit" and "burst-limit" are the raw values from the
+BandwidthRate and BandwidthBurst found in the torrc configuration file.
+
+The "read-overload-count" and "write-overload-count" fields count how
+many times the reported rate or burst limit was exhausted for reads
+and for writes, respectively. To keep the counters meaningful, and to
+prevent many connections from saturating them while a relay is
+overloaded, each counter is incremented at most once per minute.
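+
The once-a-minute behaviour can be sketched in Python as a small counter. The class name, method name, and the use of a caller-supplied clock value in seconds are all assumptions made for illustration.

```python
class OverloadCounter:
    """Counts limit-exhaustion events, adding at most one per minute."""

    def __init__(self, min_interval=60.0):
        self.count = 0
        self.min_interval = min_interval  # seconds between counted events
        self._last_counted = None

    def note_exhausted(self, now):
        """Record that a limit was exhausted at time `now` (in seconds).

        Returns True if the event was counted, False if it fell within a
        minute of the previously counted event.
        """
        if self._last_counted is not None and now - self._last_counted < self.min_interval:
            return False
        self.count += 1
        self._last_counted = now
        return True
```

A relay would keep one such counter for reads and one for writes, so a burst of exhaustion events across many connections adds at most one to each.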
+
+The 'version' field is set to '1' for now.
+
+(Introduced in tor-0.4.6.1-alpha)
+
+```text
+ "overload-fd-exhausted" SP version SP YYYY-MM-DD SP HH:MM:SS NL
+ [At most once.]
+```
+
+Indicates that a file descriptor exhaustion was experienced by this
+relay.
+
+The timestamp indicates that the maximum was reached between the
+timestamp and the "published" timestamp of the document.
+
+This overload field should remain in place for 72 hours after it was
+last triggered. If the limits are reached again in this period, the
+timestamp is updated, and this 72-hour period restarts.
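+
The 72-hour retention rule can be sketched in Python as below. The function name is hypothetical, and dropping the line at exactly 72 hours (rather than just after) is an assumption about the boundary.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(hours=72)


def overload_fd_exhausted_line(last_triggered, now, version=1):
    """Return the "overload-fd-exhausted" line, or None if it has expired.

    last_triggered: UTC datetime of the most recent exhaustion, or None.
    now: UTC datetime at which the document is being generated.
    """
    # Assumption: the line is dropped once a full 72 hours have elapsed.
    if last_triggered is None or now - last_triggered >= RETENTION:
        return None
    return f"overload-fd-exhausted {version} {last_triggered:%Y-%m-%d %H:%M:%S}"
```

Restarting the period is then just a matter of overwriting `last_triggered` with the time of the newest exhaustion event.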
+
+The 'version' field is set to '1' for the initial implementation which
+detects fd exhaustion only when a socket open fails.
+
+(Introduced in tor-0.4.6.1-alpha)
+
+```text
+ "router-sig-ed25519"
+ [As in router descriptors]
+
+ "router-signature" NL Signature NL
+ [At end, exactly once.]
+ [No extra arguments]
+```
+
+A document signature as documented in section 1.3, using the
+initial item "extra-info" and the final item "router-signature",
+signed with the router's identity key.