From 33c1c31567d633d6861ffb3d96c2cc9cdf2bf6d0 Mon Sep 17 00:00:00 2001
From: Nick Mathewson
Date: Thu, 25 Feb 2016 09:52:04 -0500
Subject: proposal: 267-tor-consensus-transparency.txt (from Linus)

---
 proposals/267-tor-consensus-transparency.txt | 363 +++++++++++++++++++++++++++
 1 file changed, 363 insertions(+)
 create mode 100644 proposals/267-tor-consensus-transparency.txt

diff --git a/proposals/267-tor-consensus-transparency.txt b/proposals/267-tor-consensus-transparency.txt
new file mode 100644
index 0000000..9d761e7
--- /dev/null
+++ b/proposals/267-tor-consensus-transparency.txt
@@ -0,0 +1,363 @@
+Filename: 267-tor-consensus-transparency.txt
+Title: Tor Consensus Transparency
+Author: Linus Nordberg
+Created: 2014-06-28
+Status: Draft
+
+0. Introduction
+
+  This document describes how to provide and use public, append-only,
+  verifiable logs containing Tor consensus and vote status documents,
+  much like what Certificate Transparency [CT] does for TLS
+  certificates, making it possible for log monitors to detect false
+  consensuses and votes.
+
+  Tor clients and relays can refuse to use a consensus not present in
+  a set of logs of their choosing, as well as provide possible
+  evidence of misissuance by submitting such a consensus to any
+  number of logs.
+
+1. Overview
+
+  Tor status documents, consensuses as well as votes, are stored in
+  one or more public, append-only, externally verifiable logs using a
+  history tree like the one described in [CrosbyWallach].
+
+  Consensus-users, i.e. Tor clients and relays, expect to receive one
+  or more "proofs of inclusion" with new consensus documents. A proof
+  of inclusion is a hash sum representing the tree head of a log,
+  signed by the log's private key, and an audit path listing the nodes
+  in the tree needed to recreate the tree head. Consensus-users are
+  configured to use one or more logs by listing a log address and a
+  public key for each log.
+  This is enough for verifying that a given
+  consensus document is present in a given log.
+
+  Submission of status documents to a log can be done by anyone with
+  an internet connection (and access to the Tor network, in the case
+  of logs reachable only at a .onion address). The submitter gets a
+  signed tree head and a proof of inclusion in return. Directory
+  authorities are expected to submit to one or more logs and include
+  the proofs when serving consensus documents. Directory caches and
+  consensus-users receiving a consensus not including a proof of
+  inclusion may submit the document and use the proof they receive in
+  return.
+
+  Auditing log behaviour and monitoring the contents of logs is
+  performed in cooperation between the Tor network and external
+  services. Relays act as log auditors with help from Tor clients
+  gossiping about what they see. Directory authorities are good
+  candidates for monitoring log content since they know what votes
+  they have sent and received as well as what consensus documents
+  they have issued. Anybody can run both an auditor and a monitor
+  though, which is an important property of the proposed system.
+
+2. Motivation
+
+  Popping a handful of boxes (currently five) or factoring the same
+  number of RSA keys should not be ruled out as a possible attack
+  against a subset of Tor users. An attacker controlling a majority
+  of the directory authorities' signing keys can, using
+  man-in-the-middle or man-on-the-side attacks, serve consensus
+  documents listing relays under their control. If mounted on a small
+  subset of Tor users on the internet, the chance of detection is
+  probably low. Implementation of this proposal increases the cost
+  of such an attack by raising the chances of it being detected.
+
+  Note that while the proposed solution gives each individual some
+  degree of protection against using a false consensus, this is not
+  the primary goal but more of a nice side effect.
+  The primary goal is to detect correctly signed consensus documents
+  which differ from the consensus of the directory authorities. This
+  raises the risk of exposure of an attacker capable of producing a
+  consensus and feeding it to users.
+
+  The complexity of the proposed solution is motivated by the fact
+  that the log key is not just another key on top of the directory
+  authority keys since the log doesn't have to be trusted. Another
+  benefit is decentralisation -- anybody can run their own
+  log and use it. Anybody can audit all existing logs and verify
+  their correct behaviour. This empowers people outside the group of
+  Tor directory authority operators and the people who trust them for
+  one reason or another.
+
+3. Design
+
+  Communication with logs is done over HTTP using TLS or Tor onion
+  services for transport, similar to what is defined in
+  [rfc6962-bis-12]. Parameters for POSTs and all responses are
+  encoded as name/value pairs in JSON objects [RFC4627].
+
+  Summary of proposed changes to Tor:
+
+  - Configuration is added for listing known logs and for describing
+    policy for using them.
+
+  - Directory authorities start submitting newly created consensuses
+    to at least one public log.
+
+  - Tor clients and relays receiving a consensus not accompanied by a
+    proof of inclusion start submitting that consensus to at least
+    one public log.
+
+  - Consensus-users start rejecting consensuses accompanied by an
+    invalid proof of inclusion.
+
+  - A new cell type LOG_STH is defined, for clients and relays to
+    exchange information about seen tree heads and their validity.
+
+  - Consensus-users send seen tree heads to relays acting as log
+    auditors.
+
+  - Relays acting as log auditors validate tree heads (section 3.2.2)
+    received from consensus-users and send results back.
+
+  - Consensus-users start rejecting consensuses for which valid
+    proofs of inclusion cannot be obtained.
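
To illustrate the check behind "rejecting consensuses accompanied by an
invalid proof of inclusion", here is a minimal sketch in Python. It is not
part of the proposal. It assumes the Merkle tree hashing and audit-path
algorithm of RFC 6962 (SHA-256, a 0x00 prefix for leaves and a 0x01 prefix
for interior nodes), whereas the proposal defers to [rfc6962-bis-12] for the
actual structures. The names leaf_hash, node_hash and verify_inclusion are
hypothetical. Verifying the log's signature on the tree head is left out.

```python
import hashlib
from typing import Sequence


def leaf_hash(entry: bytes) -> bytes:
    """Hash a log entry as a tree leaf (assumed 0x00 prefix, RFC 6962 style)."""
    return hashlib.sha256(b"\x00" + entry).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    """Hash two child hashes into their parent (assumed 0x01 prefix)."""
    return hashlib.sha256(b"\x01" + left + right).digest()


def verify_inclusion(entry: bytes, leaf_index: int, tree_size: int,
                     audit_path: Sequence[bytes], root_hash: bytes) -> bool:
    """Recompute the tree head from an entry and its audit path.

    Returns True only if the recomputed head equals root_hash.
    """
    if leaf_index < 0 or leaf_index >= tree_size:
        return False
    fn, sn = leaf_index, tree_size - 1
    r = leaf_hash(entry)
    for p in audit_path:
        if sn == 0:
            return False  # path is longer than the tree is deep
        if (fn & 1) or fn == sn:
            # Current node is a right child, or the last node on its level.
            r = node_hash(p, r)
            if not (fn & 1):
                # Skip levels where this node has no sibling.
                while fn != 0 and not (fn & 1):
                    fn >>= 1
                    sn >>= 1
        else:
            r = node_hash(r, p)
        fn >>= 1
        sn >>= 1
    return sn == 0 and r == root_hash


# Hypothetical three-entry log; the byte strings stand in for consensuses.
entries = [b"consensus-a", b"consensus-b", b"consensus-c"]
hashes = [leaf_hash(e) for e in entries]
head = node_hash(node_hash(hashes[0], hashes[1]), hashes[2])
ok = verify_inclusion(entries[0], 0, 3, [hashes[1], hashes[2]], head)
```

A consensus-user would run a check of this kind once per received proof. As
section 3.2.1 notes, a successful check only shows that the consensus is in
the tree the log presented, not that other users see the same tree.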
+
+  Definitions:
+
+  - Log id: The SHA-256 hash of the log's public key, to be treated
+    as an opaque byte string identifying the log.
+
+3.1. Consensus submission
+
+  Logs accept consensus submissions from anyone as long as the
+  consensus is signed by a majority of the Tor directory authorities
+  of the Tor network that it's logging.
+
+  Consensus documents are POSTed to a well-known URL as defined in
+  section 5.2.
+
+  The output is what we call a proof of inclusion.
+
+3.2. Verification
+
+3.2.1. Log entry membership verification
+
+  Calculate a tree head from the hash of the received consensus and
+  the audit path in the accompanying proof. Verify that the
+  calculated tree head is identical to the tree head in the
+  proof. This can easily be done by consensus-users for each received
+  consensus.
+
+  We now know that the consensus is part of a tree which the log
+  claims to be The Tree. Whether this tree is the same tree that
+  everybody else sees is unknown at this point.
+
+3.2.2. Log consistency verification
+
+  Ask the log for a consistency proof between the tree head to verify
+  and a previously known good tree head from the pool (section
+  3.6). Section 5.3 specifies how to fetch a consistency proof.
+
+  [[TBD require auditors to fetch and store the tree head for the
+  empty tree as part of bootstrapping, in order to avoid the case
+  where there's no older tree to verify against?]]
+
+  [[TODO description of verification of consistency goes here]]
+
+  Relays acting as auditors cache results to minimise calculations
+  and communication with log servers.
+
+  [[TBD have clients verify consistency as well? NOTE: we still want
+  relays to see tree heads in order to catch a lying log (the
+  split-view attack)]]
+
+  We now know that the verified tree is a superset of a known good
+  tree.
+
+3.3. Log auditing
+
+  A log auditor verifies two things:
+
+  - A log's append-only property, i.e. that no entries once accepted
+    by a log are ever altered or removed.
+
+  - That a log presents the same view to all of its users [[TODO
+    describe the Tor network's role in auditing more than what's found
+    in section 3.2.2]]
+
+  A log auditor typically doesn't care about the contents of the log
+  entries, other than calculating their hash sums for auditing
+  purposes.
+
+  Tor relays should act as log auditors.
+
+3.4. Log monitoring
+
+  A log monitor downloads and investigates each entry in a log
+  searching for anomalies according to its monitoring policy.
+
+  This document doesn't define monitoring policies but does outline a
+  few strategies for monitoring in section [[TBD]].
+
+  Note that there can be more than one valid consensus document for
+  a given point in time. One reason for this is that the number of
+  signatures can differ due to consensus voting timing
+  details. [[TODO Are there more reasons?]]
+
+  [[TODO expand on monitoring strategies -- even if this is not part
+  of the proposed extensions to the Tor network it's good for
+  understanding. a) dirauths can verify consensus documents byte for
+  byte; b) anyone can look for diffs larger than D per time T, where
+  "diffs" certainly can be smarter than a plain text diff]]
+
+3.5. Consensus-user behaviour
+
+  [[TODO move most of this to section 5]]
+
+  Keep an on-disk cache of consensus documents. Mark them as being in
+  one of three states:
+
+    LOG_STATE_UNKNOWN -- don't know whether it's present in enough logs
+                         or not
+    LOG_STATE_LOGGED -- have seen good proof(s) of inclusion
+    LOG_STATE_LOGGED_GOOD -- confident about the tree head representing
+                             a good tree
+
+  Newly arrived consensus documents start in UNKNOWN or LOGGED
+  depending on whether they are accompanied by enough proofs or
+  not. There are two possible state transitions:
+
+  - UNKNOWN --> LOGGED: When enough correctly verifying proofs of
+    inclusion (section 3.2.1) have been seen. The number of good
+    proofs required is a policy setting in the configuration of the
+    consensus-user.
+
+  - LOGGED --> LOGGED_GOOD: When the tree heads in enough of the
+    inclusion proofs have been verified (section 3.2.2) or enough
+    LOG_STH cells vouching for the same tree heads have been
+    seen. The number of verifications required is a policy setting in
+    the configuration of the consensus-user.
+
+  Consensuses in state UNKNOWN are not used but are instead submitted
+  to one or more logs. If the submission succeeds, this will take the
+  consensus to state LOGGED.
+
+  Consensuses in state LOGGED are used despite not being fully
+  verified with regard to logging. LOG_STH cells containing
+  tree heads from received proofs are sent to relays for
+  verification. Clients send to all relays that they have a circuit
+  to, i.e. their guard relay(s). Relays send to three random relays
+  that they have a circuit to.
+
+3.6. Relay behaviour when acting as an auditor
+
+  In order to verify the append-only property of a log, relays acting
+  as log auditors verify the consistency of tree heads received in
+  LOG_STH cells. An auditor keeps a copy of 2+N known good tree heads
+  in a pool stored on persistent media [[TBD where N is either a
+  fixed number in the range 32-128 or is a function of the log
+  size]]. Two of them are the oldest and newest tree heads seen,
+  respectively. The remaining N are randomly chosen from the tree
+  heads seen.
+
+  [[TODO describe or refer to an algorithm for "randomly chosen",
+  hopefully not subject to flushing attacks (or other attacks)]].
+
+3.7. Notable differences from Certificate Transparency
+
+  - The data logged is "strictly time-stamped", i.e. ordered.
+
+  - Much shorter lifetime of logged data -- a day rather than a
+    year. Are the effects of this difference of importance only for
+    "one-shot attacks"?
+
+  - Directory authorities have consensus about what they're
+    signing -- there are no "web sites knowing better".
+
+  - Submitters are not in the same hurry as CAs and can wait minutes
+    rather than seconds for a proof of inclusion.
+
+4. Security implications
+
+  TODO
+
+5. Specification
+
+5.0. Data structures
+
+  Data structures are defined as described in [RFC5246] section 4,
+  i.e. TLS 1.2 presentation language. While it is tempting to try to
+  avoid yet another format, the cost of redefining the data
+  structures in [rfc6962-bis-12] outweighs this consideration. The
+  burden of redefining, reimplementing and testing is especially high
+  for those structures which need precise definitions because they
+  are to be signed.
+
+5.1. Signed Tree Head (STH)
+
+  An STH is a TransItem structure of type "signed_tree_head" as
+  defined in [rfc6962-bis-12] section 5.8.
+
+5.2. Submitting a consensus document to a log
+
+  POST https:///tct/v1/add-consensus
+
+  Input:
+
+    consensus: A consensus status document as defined in [dir-spec]
+    section 3.4.1 [[TBD gzipped and base64 encoded to save 50%?]]
+
+  Output:
+
+    sth: A signed tree head as defined in section 5.1 referring to a
+    tree in which the submitted document is included.
+
+    inclusion: An inclusion proof as specified for the "inclusion"
+    output in [rfc6962-bis-12] section 6.5.
+
+5.3. Getting a consistency proof from a log
+
+  GET https:///tct/v1/get-sth-consistency
+
+  Input and output as specified in [rfc6962-bis-12] section 6.4.
+
+5.x. LOG_STH cells
+
+  A LOG_STH cell is a variable-length cell with the following
+  fields:
+
+    TBDname [TBD octets]
+    TBDname [TBD octets]
+    TBDname [TBD octets]
+
+6. Compatibility
+
+  TBD
+
+7. Implementation
+
+  TBD
+
+8. Performance and scalability notes
+
+  TBD
+
+A. Open issues / TODOs
+
+  - TODO: Add SCTs from CT, at least as a practical "cookie" (i.e. no
+    need to send them around or include them anywhere). Logs should
+    be given more time for distributing than we're willing to wait on
+    an HTTP response for.
+
+  - TODO: explain why there is no hash function or signing algorithm
+    agility; see [rfc6962-bis-12] section 10
+
+  - TODO: add a blurb about the value of publishing logs as onion
+    services
+
+  - TODO: discuss compromise of log keys
+
+B. Acknowledgements
+
+  This proposal leans heavily on [rfc6962-bis-12]. Some definitions
+  are copied verbatim from that document. Valuable feedback has been
+  received from Ben Laurie, Karsten Loesing and Ximin Luo.
+
+C. References
+
+  [CrosbyWallach] http://static.usenix.org/event/sec09/tech/full_papers/crosby.pdf
+  [dir-spec] https://gitweb.torproject.org/torspec.git/blob/HEAD:/dir-spec.txt
+  [RFC4627] https://tools.ietf.org/html/rfc4627
+  [RFC5246] https://tools.ietf.org/html/rfc5246
+  [rfc6962-bis-12] https://datatracker.ietf.org/doc/draft-ietf-trans-rfc6962-bis/12
+  [CT] https://www.certificate-transparency.org/