Diffstat (limited to 'spec/dir-spec/extra-info-document-format.md')
-rw-r--r-- | spec/dir-spec/extra-info-document-format.md | 591 |
1 files changed, 591 insertions, 0 deletions
diff --git a/spec/dir-spec/extra-info-document-format.md b/spec/dir-spec/extra-info-document-format.md
new file mode 100644
index 0000000..0defe70
--- /dev/null
+++ b/spec/dir-spec/extra-info-document-format.md
@@ -0,0 +1,591 @@
+<a id="dir-spec.txt-2.1.2"></a>
+
+# Extra-info document format
+
+Extra-info documents consist of the following items:
+
+```text
+    "extra-info" Nickname Fingerprint NL
+        [At start, exactly once.]
+```
+
+Identifies what router this is an extra-info descriptor for.
+Fingerprint is encoded in hex (using upper-case letters), with
+no spaces.
+
+```text
+    "identity-ed25519"
+        [As in router descriptors]
+
+    "published" YYYY-MM-DD HH:MM:SS NL
+
+        [Exactly once.]
+```
+
+The time, in UTC, when this document (and its corresponding router
+descriptor if any) was generated. It MUST match the published time
+in the corresponding server descriptor.
+
+```text
+    "read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL
+        [At most once.]
+    "write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL
+        [At most once.]
+```
+
+Declare how much bandwidth the OR has used recently. Usage is divided
+into intervals of NSEC seconds. The YYYY-MM-DD HH:MM:SS field
+defines the end of the most recent interval. The numbers are the
+number of bytes used in the most recent intervals, ordered from
+oldest to newest.
+
+These fields include both IPv4 and IPv6 traffic.
+
+```text
+    "ipv6-read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM... NL
+        [At most once]
+    "ipv6-write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM... NL
+        [At most once]
+```
+
+Declare how much bandwidth the OR has used recently, on IPv6
+connections. See "read-history" and "write-history" for full details.
+
+```text
+    "geoip-db-digest" Digest NL
+        [At most once.]
+```
+
+SHA1 digest of the IPv4 GeoIP database file that is used to
+resolve IPv4 addresses to country codes.
+
+```text
+    "geoip6-db-digest" Digest NL
+        [At most once.]
+```
+
+SHA1 digest of the IPv6 GeoIP database file that is used to
+resolve IPv6 addresses to country codes.
+
+("geoip-start-time" YYYY-MM-DD HH:MM:SS NL)
+("geoip-client-origins" CC=NUM,CC=NUM,... NL)
+
+Only generated by bridge routers (see blocking.pdf), and only
+when they have been configured with a geoip database.
+Non-bridges SHOULD NOT generate these fields. Contains a list
+of mappings from two-letter country codes (CC) to the number
+of clients that have connected to that bridge from that
+country (approximate, and rounded up to the nearest multiple of 8
+in order to hamper traffic analysis). A country is included
+only if it has at least one address. The time in
+"geoip-start-time" is the time at which we began collecting geoip
+statistics.
+
+"geoip-start-time" and "geoip-client-origins" have been replaced by
+"bridge-stats-end" and "bridge-ips" in 0.2.2.4-alpha. The
+reason is that the measurement interval with "geoip-stats" as
+determined by subtracting "geoip-start-time" from "published" could
+have had a variable length, whereas the measurement interval in
+0.2.2.4-alpha and later is set to be exactly 24 hours long. In
+order to clearly distinguish the new measurement intervals from
+the old ones, the new keywords have been introduced.
+
+```text
+    "bridge-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
+        [At most once.]
+```
+
+YYYY-MM-DD HH:MM:SS defines the end of the included measurement
+interval of length NSEC seconds (86400 seconds by default).
+
+A "bridge-stats-end" line, as well as any other "bridge-\*" line,
+is only added when the relay has been running as a bridge for at
+least 24 hours.
+
+```text
+    "bridge-ips" CC=NUM,CC=NUM,... NL
+        [At most once.]
+```
+
+List of mappings from two-letter country codes to the number of
+unique IP addresses that have connected from that country to the
+bridge and which are not known relays, rounded up to the nearest
+multiple of 8.
+
+```text
+    "bridge-ip-versions" FAM=NUM,FAM=NUM,... NL
+        [At most once.]
+```
+
+List of mappings from protocol family (FAM) to the number of unique IP
+addresses that have connected to the bridge.
+
+```text
+    "bridge-ip-transports" PT=NUM,PT=NUM,... NL
+        [At most once.]
+```
+
+List of mappings from pluggable transport names to the number
+of unique IP addresses that have connected using that
+pluggable transport. Unobfuscated connections are counted
+using the reserved pluggable transport name "`<OR>`" (without
+quotes). If we received a connection from a transport proxy
+but we couldn't figure out the name of the pluggable
+transport, we use the reserved pluggable transport name
+"`<??>`".
+
+("`<OR>`" and "`<??>`" are reserved because normal pluggable
+transport names MUST match the following regular expression:
+"`[a-zA-Z_][a-zA-Z0-9_]*`" )
+
+The pluggable transport name list is sorted into lexically
+ascending order.
+
+If no clients have connected to the bridge yet, we only write
+"bridge-ip-transports" to the stats file.
+
+```text
+    "dirreq-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
+        [At most once.]
+```
+
+YYYY-MM-DD HH:MM:SS defines the end of the included measurement
+interval of length NSEC seconds (86400 seconds by default).
+
+A "dirreq-stats-end" line, as well as any other "dirreq-\*" line,
+is only added when the relay has opened its Dir port and after 24
+hours of measuring directory requests.
+
+```text
+    "dirreq-v2-ips" CC=NUM,CC=NUM,... NL
+        [At most once.]
+    "dirreq-v3-ips" CC=NUM,CC=NUM,... NL
+        [At most once.]
+```
+
+List of mappings from two-letter country codes to the number of
+unique IP addresses that have connected from that country to
+request a v2/v3 network status, rounded up to the nearest multiple
+of 8. Only those IP addresses are counted that the directory can
+answer with a 200 OK status code. (Note here and below: current Tor
+versions, as of 0.2.5.2-alpha, no longer cache or serve v2
+networkstatus documents.)
+
+```text
+    "dirreq-v2-reqs" CC=NUM,CC=NUM,... NL
+        [At most once.]
+    "dirreq-v3-reqs" CC=NUM,CC=NUM,... NL
+        [At most once.]
+```
+
+List of mappings from two-letter country codes to the number of
+requests for v2/v3 network statuses from that country, rounded up
+to the nearest multiple of 8. Only those requests are counted that
+the directory can answer with a 200 OK status code.
+
+```text
+    "dirreq-v2-share" NUM% NL
+        [At most once.]
+    "dirreq-v3-share" NUM% NL
+        [At most once.]
+```
+
+The share of v2/v3 network status requests that the directory
+expects to receive from clients based on its advertised bandwidth
+compared to the overall network bandwidth capacity. Shares are
+formatted in percent with two decimal places. Shares are
+calculated as means over the whole 24-hour interval.
+
+```text
+    "dirreq-v2-resp" status=NUM,... NL
+        [At most once.]
+    "dirreq-v3-resp" status=NUM,... NL
+        [At most once.]
+```
+
+List of mappings from response statuses to the number of requests
+for v2/v3 network statuses that were answered with that response
+status, rounded up to the nearest multiple of 4. Only response
+statuses with at least 1 response are reported. New response
+statuses can be added at any time. The current list of response
+statuses is as follows:
+
+```text
+    "ok": a network status request is answered; this number
+      corresponds to the sum of all requests as reported in
+      "dirreq-v2-reqs" or "dirreq-v3-reqs", respectively, before
+      rounding up.
+    "not-enough-sigs": a version 3 network status is not signed by a
+      sufficient number of requested authorities.
+    "unavailable": a requested network status object is unavailable.
+    "not-found": a requested network status is not found.
+    "not-modified": a network status has not been modified since the
+      If-Modified-Since time that is included in the request.
+    "busy": the directory is busy.
+
+    "dirreq-v2-direct-dl" key=NUM,... NL
+        [At most once.]
+    "dirreq-v3-direct-dl" key=NUM,... NL
+        [At most once.]
+    "dirreq-v2-tunneled-dl" key=NUM,... NL
+        [At most once.]
+    "dirreq-v3-tunneled-dl" key=NUM,... NL
+        [At most once.]
+```
+
+List of statistics about possible failures in the download process
+of v2/v3 network statuses. Requests are either "direct"
+HTTP-encoded requests over the relay's directory port, or
+"tunneled" requests using a BEGIN_DIR relay message over the relay's OR
+port. The list of possible statistics can change, and statistics
+can be left out from reporting. The current list of statistics is
+as follows:
+
+Successful downloads and failures:
+
+```text
+    "complete": a client has finished the download successfully.
+    "timeout": a download did not finish within 10 minutes after
+      starting to send the response.
+    "running": a download is still running at the end of the
+      measurement period, less than 10 minutes after starting to
+      send the response.
+
+    Download times:
+
+    "min", "max": smallest and largest measured bandwidth in B/s.
+    "d[1-4,6-9]": 1st to 4th and 6th to 9th decile of measured
+      bandwidth in B/s. For a given decile i, i/10 of all downloads
+      had a smaller bandwidth than di, and (10-i)/10 of all downloads
+      had a larger bandwidth than di.
+    "q[1,3]": 1st and 3rd quartile of measured bandwidth in B/s. One
+      fourth of all downloads had a smaller bandwidth than q1, one
+      fourth of all downloads had a larger bandwidth than q3, and the
+      remaining half of all downloads had a bandwidth between q1 and
+      q3.
+    "md": median of measured bandwidth in B/s. Half of the downloads
+      had a smaller bandwidth than md, the other half had a larger
+      bandwidth than md.
+
+    "dirreq-read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM... NL
+        [At most once]
+    "dirreq-write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM... NL
+        [At most once]
+```
+
+Declare how much bandwidth the OR has spent on answering directory
+requests. Usage is divided into intervals of NSEC seconds. The
+YYYY-MM-DD HH:MM:SS field defines the end of the most recent
+interval. The numbers are the number of bytes used in the most
+recent intervals, ordered from oldest to newest.
+
+```text
+    "entry-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
+        [At most once.]
+```
+
+YYYY-MM-DD HH:MM:SS defines the end of the included measurement
+interval of length NSEC seconds (86400 seconds by default).
+
+An "entry-stats-end" line, as well as any other "entry-\*"
+line, is first added after the relay has been running for at least
+24 hours.
+
+```text
+    "entry-ips" CC=NUM,CC=NUM,... NL
+        [At most once.]
+```
+
+List of mappings from two-letter country codes to the number of
+unique IP addresses that have connected from that country to the
+relay and which are not known to be other relays, rounded up to the
+nearest multiple of 8.
+
+```text
+    "cell-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
+        [At most once.]
+```
+
+YYYY-MM-DD HH:MM:SS defines the end of the included measurement
+interval of length NSEC seconds (86400 seconds by default).
+
+A "cell-stats-end" line, as well as any other "cell-\*" line,
+is first added after the relay has been running for at least 24
+hours.
+
+```text
+    "cell-processed-cells" NUM,...,NUM NL
+        [At most once.]
+```
+
+Mean number of processed cells per circuit, subdivided into
+deciles of circuits by the number of cells they have processed in
+descending order from loudest to quietest circuits.
+
+```text
+    "cell-queued-cells" NUM,...,NUM NL
+        [At most once.]
+```
+
+Mean number of cells contained in queues by circuit decile. These
+means are calculated by 1) determining the mean number of cells in
+a single circuit between its creation and its termination and 2)
+calculating the mean for all circuits in a given decile as
+determined in "cell-processed-cells". Numbers have a precision of
+two decimal places.
+
+Note that this statistic can be inaccurate for circuits that had
+queued cells at the start or end of the measurement interval.
+
+```text
+    "cell-time-in-queue" NUM,...,NUM NL
+        [At most once.]
+```
+
+Mean time cells spend in circuit queues in milliseconds. Times are
+calculated by 1) determining the mean time cells spend in the
+queue of a single circuit and 2) calculating the mean for all
+circuits in a given decile as determined in
+"cell-processed-cells".
+
+Note that this statistic can be inaccurate for circuits that had
+queued cells at the start or end of the measurement interval.
+
+```text
+    "cell-circuits-per-decile" NUM NL
+        [At most once.]
+```
+
+Mean number of circuits that are included in any of the deciles,
+rounded up to the next integer.
+
+```text
+    "conn-bi-direct" YYYY-MM-DD HH:MM:SS (NSEC s) BELOW,READ,WRITE,BOTH NL
+        [At most once]
+```
+
+Number of connections, split into 10-second intervals, that are
+used uni-directionally or bi-directionally as observed in the NSEC
+seconds (usually 86400 seconds) before YYYY-MM-DD HH:MM:SS. Every
+10 seconds, we determine for every connection whether we read and
+wrote less than a threshold of 20 KiB (BELOW), read at least 10
+times more than we wrote (READ), wrote at least 10 times more than
+we read (WRITE), or read and wrote more than the threshold, but
+not 10 times more in either direction (BOTH). After classifying a
+connection, read and write counters are reset for the next
+10-second interval.
+
+This measurement includes both IPv4 and IPv6 connections.
+
+```text
+    "ipv6-conn-bi-direct" YYYY-MM-DD HH:MM:SS (NSEC s) BELOW,READ,WRITE,BOTH NL
+        [At most once]
+```
+
+Number of IPv6 connections that are used uni-directionally or
+bi-directionally. See "conn-bi-direct" for more details.
+
+```text
+    "exit-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
+        [At most once.]
+```
+
+YYYY-MM-DD HH:MM:SS defines the end of the included measurement
+interval of length NSEC seconds (86400 seconds by default).
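
As an aside on the "conn-bi-direct" item described earlier in this hunk, the per-interval classification can be sketched in Python roughly as follows. This is an illustrative sketch, not Tor's implementation: `classify_connection` and the constant names are hypothetical, and it assumes the 20 KiB threshold applies to the sum of bytes read and written within one 10-second interval, which the text does not state explicitly.

```python
# Values taken from the "conn-bi-direct" description.
THRESHOLD_BYTES = 20 * 1024  # 20 KiB
FACTOR = 10                  # "at least 10 times more"


def classify_connection(bytes_read: int, bytes_written: int) -> str:
    """Classify one connection for one 10-second interval.

    Assumption: the threshold is compared against read + written.
    """
    if bytes_read + bytes_written < THRESHOLD_BYTES:
        return "BELOW"
    if bytes_read >= FACTOR * bytes_written:
        return "READ"
    if bytes_written >= FACTOR * bytes_read:
        return "WRITE"
    return "BOTH"


def tally(intervals):
    """Count classifications over (bytes_read, bytes_written) pairs."""
    counts = {"BELOW": 0, "READ": 0, "WRITE": 0, "BOTH": 0}
    for bytes_read, bytes_written in intervals:
        counts[classify_connection(bytes_read, bytes_written)] += 1
    return counts
```

The four counters returned by `tally` correspond, in order, to the BELOW,READ,WRITE,BOTH fields of the line; resetting the per-connection byte counters between intervals is left to the caller.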
+
+An "exit-stats-end" line, as well as any other "exit-\*" line, is
+first added after the relay has been running for at least 24 hours
+and only if the relay permits exiting (where exiting to a single
+port and IP address is sufficient).
+
+```text
+    "exit-kibibytes-written" port=N,port=N,... NL
+        [At most once.]
+    "exit-kibibytes-read" port=N,port=N,... NL
+        [At most once.]
+```
+
+List of mappings from ports to the number of kibibytes that the
+relay has written to or read from exit connections to that port,
+rounded up to the next full kibibyte. Relays may limit the
+number of listed ports and subsume any remaining kibibytes under
+port "other".
+
+```text
+    "exit-streams-opened" port=N,port=N,... NL
+        [At most once.]
+```
+
+List of mappings from ports to the number of opened exit streams
+to that port, rounded up to the nearest multiple of 4. Relays may
+limit the number of listed ports and subsume any remaining opened
+streams under port "other".
+
+```text
+    "hidserv-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
+        [At most once.]
+    "hidserv-v3-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
+        [At most once.]
+```
+
+YYYY-MM-DD HH:MM:SS defines the end of the included measurement
+interval of length NSEC seconds (86400 seconds by default).
+
+A "hidserv-stats-end" line, as well as any other "hidserv-\*" line,
+is first added after the relay has been running for at least 24
+hours.
+
+(Introduced in tor-0.4.6.1-alpha)
+
+```text
+    "hidserv-rend-relayed-cells" SP NUM SP key=val SP key=val ... NL
+        [At most once.]
+    "hidserv-rend-v3-relayed-cells" SP NUM SP key=val SP key=val ... NL
+        [At most once.]
+```
+
+Approximate number of relay cells seen in either direction on a
+circuit after receiving and successfully processing a RENDEZVOUS1
+cell.
+
+The original measurement value is obfuscated in several steps:
+first, it is rounded up to the nearest multiple of 'bin_size'
+which is reported in the key=val part of this line; second, a
+(possibly negative) noise value is added to the result of the
+first step by randomly sampling from a Laplace distribution with
+mu = 0 and b = (delta_f / epsilon) with 'delta_f' and 'epsilon'
+being reported in the key=val part, too; third, the result of the
+previous obfuscation steps is truncated to the next smaller
+integer and included as 'NUM'. Note that the overall reported
+value can be negative.
+
+(Introduced in tor-0.4.6.1-alpha)
+
+```text
+    "hidserv-dir-onions-seen" SP NUM SP key=val SP key=val ... NL
+        [At most once.]
+    "hidserv-dir-v3-onions-seen" SP NUM SP key=val SP key=val ... NL
+        [At most once.]
+```
+
+Approximate number of unique hidden-service identities seen in
+descriptors published to and accepted by this hidden-service
+directory.
+
+The original measurement value is obfuscated in the same way as
+the 'NUM' value reported in "hidserv-rend-relayed-cells", but
+possibly with different parameters as reported in the key=val part
+of this line. Note that the overall reported value can be
+negative.
+
+(Introduced in tor-0.4.6.1-alpha)
+
+```text
+    "transport" transportname address:port [arglist] NL
+        [Any number.]
+```
+
+Signals that the router supports the 'transportname' pluggable
+transport at IP address 'address' and TCP port 'port'. A single
+descriptor MUST NOT have more than one transport line with the
+same 'transportname'.
+
+Pluggable transports are only relevant to bridges, but these entries
+can appear in non-bridge relays as well.
+
+```text
+    "padding-counts" YYYY-MM-DD HH:MM:SS (NSEC s) key=NUM key=NUM ... NL
+        [At most once.]
+```
+
+YYYY-MM-DD HH:MM:SS defines the end of the included measurement
+interval of length NSEC seconds (86400 seconds by default). Counts
+are reset to 0 at the end of this interval.
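
To make the "padding-counts" line format above concrete, here is a minimal Python sketch of how a consumer might parse such a line. It is not taken from any Tor tooling: the function name `parse_padding_counts`, the regular expression, and the sample line in the usage note are all hypothetical, and the sketch assumes keys consist only of lowercase letters and hyphens.

```python
import re
from datetime import datetime

# Hypothetical pattern for:
#   "padding-counts" YYYY-MM-DD HH:MM:SS (NSEC s) key=NUM key=NUM ...
PADDING_COUNTS_RE = re.compile(
    r"^padding-counts "
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"\((\d+) s\)"
    r"((?: [a-z-]+=\d+)*)$"
)


def parse_padding_counts(line: str):
    """Return (interval_end, interval_seconds, counts) for one line.

    Raises ValueError if the line does not match the assumed format.
    """
    match = PADDING_COUNTS_RE.match(line.strip())
    if match is None:
        raise ValueError("not a well-formed padding-counts line")
    # Parsed as a naive datetime; no time zone is attached here.
    interval_end = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
    interval_seconds = int(match.group(2))
    counts = {}
    for pair in match.group(3).split():
        key, _, value = pair.partition("=")
        counts[key] = int(value)
    return interval_end, interval_seconds, counts
```

For a made-up line such as `padding-counts 2024-01-01 12:00:00 (86400 s) bin-size=10000 write-drop=0 read-pad=20000`, the function returns the interval end, 86400, and a dictionary with the three counters. Unknown keys are kept rather than rejected, since the keyword list is described as current rather than fixed.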
+
+The keyword list is currently as follows:
+
+```text
+    bin-size
+      - The current rounding value for cell count fields (10000 by
+        default)
+    write-drop
+      - The number of RELAY_DROP messages this relay sent
+    write-pad
+      - The number of PADDING cells this relay sent
+    write-total
+      - The total number of cells this relay sent
+    read-drop
+      - The number of RELAY_DROP messages this relay received
+    read-pad
+      - The number of PADDING cells this relay received
+    read-total
+      - The total number of cells this relay received
+    enabled-read-pad
+      - The number of PADDING cells this relay received on
+        connections that support padding
+    enabled-read-total
+      - The total number of cells this relay received on connections
+        that support padding
+    enabled-write-pad
+      - The number of PADDING cells this relay sent on connections
+        that support padding
+    enabled-write-total
+      - The total number of cells sent by this relay on connections
+        that support padding
+    max-chanpad-timers
+      - The maximum number of timers that this relay scheduled for
+        padding in the previous NSEC interval
+
+    "overload-ratelimits" SP version SP YYYY-MM-DD SP HH:MM:SS
+                    SP rate-limit SP burst-limit
+                    SP read-overload-count SP write-overload-count NL
+        [At most once.]
+
+    Indicates that a bandwidth limit was exhausted for this relay.
+```
+
+The "rate-limit" and "burst-limit" are the raw values from the
+BandwidthRate and BandwidthBurst found in the torrc configuration file.
+
+The "{read|write}-overload-count" are the counts of how many times the
+reported limits of burst/rate were exhausted and thus the maximum
+between the read and write count occurrences. To make the counter more
+meaningful and to avoid multiple connections saturating the counter
+when a relay is overloaded, we only increment it once a minute.
+
+The 'version' field is set to '1' for now.
+
+(Introduced in tor-0.4.6.1-alpha)
+
+```text
+    "overload-fd-exhausted" SP version YYYY-MM-DD HH:MM:SS NL
+        [At most once.]
+```
+
+Indicates that a file descriptor exhaustion was experienced by this
+relay.
+
+The timestamp indicates that the maximum was reached between the
+timestamp and the "published" timestamp of the document.
+
+This overload field should remain in place for 72 hours since last
+triggered. If the limits are reached again in this period, the
+timestamp is updated, and this 72 hour period restarts.
+
+The 'version' field is set to '1' for the initial implementation which
+detects fd exhaustion only when a socket open fails.
+
+(Introduced in tor-0.4.6.1-alpha)
+
+```text
+    "router-sig-ed25519"
+        [As in router descriptors]
+
+    "router-signature" NL Signature NL
+        [At end, exactly once.]
+        [No extra arguments]
+```
+
+A document signature as documented in section 1.3, using the
+initial item "extra-info" and the final item "router-signature",
+signed with the router's identity key.
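
As a closing illustration, the three obfuscation steps described for "hidserv-rend-relayed-cells" (round up to a multiple of 'bin_size', add Laplace noise with mu = 0 and b = delta_f / epsilon, truncate to the next smaller integer) can be sketched in Python as below. This is a sketch under stated assumptions, not Tor's code: all function names are hypothetical, the Laplace sample is drawn by inverse-CDF sampling from Python's `random` module (the document does not say how the sample is generated), and the parameter values in the usage note are made up.

```python
import math
import random


def round_up_to_multiple(value: int, bin_size: int) -> int:
    """Step 1: round a non-negative integer up to a multiple of bin_size."""
    return -(-value // bin_size) * bin_size


def sample_laplace(b: float, rng: random.Random) -> float:
    """Draw one sample from Laplace(mu=0, b) by inverting the CDF."""
    u = rng.random()
    while u == 0.0:  # exclude 0 so the logarithm below stays finite
        u = rng.random()
    u -= 0.5  # now uniform on the open interval (-0.5, 0.5)
    return -b * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))


def obfuscate(value: int, bin_size: int, delta_f: float,
              epsilon: float, rng: random.Random) -> int:
    """Apply the three steps; the result may be negative."""
    binned = round_up_to_multiple(value, bin_size)           # step 1
    noisy = binned + sample_laplace(delta_f / epsilon, rng)  # step 2
    return math.floor(noisy)                                 # step 3
```

A call such as `obfuscate(1500, 1024, 2048, 0.3, random.Random())` first bins 1500 up to 2048 and then perturbs it; because the noise is symmetric around zero, small true values can yield a negative 'NUM', which matches the note in the text.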