Diffstat (limited to 'doc/spec/proposals/158-microdescriptors.txt')
-rw-r--r-- | doc/spec/proposals/158-microdescriptors.txt | 207 |
1 files changed, 0 insertions, 207 deletions
diff --git a/doc/spec/proposals/158-microdescriptors.txt b/doc/spec/proposals/158-microdescriptors.txt
deleted file mode 100644
index f478a3c834..0000000000
--- a/doc/spec/proposals/158-microdescriptors.txt
+++ /dev/null
@@ -1,207 +0,0 @@
-Filename: 158-microdescriptors.txt
-Title: Clients download consensus + microdescriptors
-Version: $Revision$
-Last-Modified: $Date$
-Author: Roger Dingledine
-Created: 17-Jan-2009
-Status: Open
-
-1. Overview
-
- This proposal replaces section 3.2 of proposal 141, which was
- called "Fetching descriptors on demand". Rather than modifying the
- circuit-building protocol to fetch a server descriptor inline at each
- circuit extend, we instead put all of the information that clients need
- either into the consensus itself, or into a new set of data about each
- relay called a microdescriptor. The microdescriptor is a direct
- transform from the relay descriptor, so relays don't even need to know
- this is happening.
-
- Descriptor elements that are small and frequently changing should go
- in the consensus itself, and descriptor elements that are small and
- relatively static should go in the microdescriptor. If we ever end up
- with descriptor elements that aren't small yet clients need to know
- them, we'll need to resume considering some design like the one in
- proposal 141.
-
-2. Motivation
-
- See
- http://archives.seul.org/or/dev/Nov-2008/msg00000.html and
- http://archives.seul.org/or/dev/Nov-2008/msg00001.html and especially
- http://archives.seul.org/or/dev/Nov-2008/msg00007.html
- for a discussion of the options and why this is currently the best
- approach.
-
-3. Design
-
- There are three pieces to the proposal. First, authorities will list in
- their votes (and thus in the consensus) what relay descriptor elements
- are included in the microdescriptor, and also list the expected hash
- of microdescriptor for each relay. Second, directory mirrors will serve
- microdescriptors.
Third, clients will ask for them and cache them.
-
-3.1. Consensus changes
-
- V3 votes should include a new line:
-     microdescriptor-elements bar baz foo
- listing each descriptor element (sorted alphabetically) that authority
- included when it calculated its expected microdescriptor hashes.
-
- We also need to include the hash of each expected microdescriptor in
- the routerstatus section. I suggest a new "m" line for each stanza,
- with the base64 of the hash of the elements that the authority voted
- for above.
-
- The consensus microdescriptor-elements and "m" lines are then computed
- as described in Section 3.1.2 below.
-
- I believe that means we need a new consensus-method "6" that knows
- how to compute the microdescriptor-elements and add "m" lines.
-
-3.1.1. Descriptor elements to include for now
-
- To start, the element list that authorities suggest should be
-     family onion-key
-
- (Note that the or-dev posts above only mention onion-key, but if
- we don't also include family then clients will never learn it. It
- seemed like it should be relatively static, so putting it in the
- microdescriptor is smarter than trying to fit it into the consensus.)
-
- We could imagine a config option "family,onion-key" so authorities
- could change their voted preferences without needing to upgrade.
-
-3.1.2. Computing consensus for microdescriptor-elements and "m" lines
-
- One approach is for the consensus microdescriptor-elements line to
- include every element listed by a majority of authorities, sorted. The
- problem here is that it will no longer be deterministic what the correct
- hash for the "m" line should be. We could imagine telling the authority
- to go look in its descriptor and produce the right hash itself, but
- we don't want consensus calculation to be based on external data like
- that. (Plus, the authority may not have the descriptor that everybody
- else voted to use.)
-
- The better approach is to take the exact set that has the most votes
- (breaking ties by the set that has the most elements, and breaking
- ties after that by whichever is alphabetically first). That will
- increase the odds that we actually get a microdescriptor hash that
- is both a) for the descriptor we're putting in the consensus, and b)
- over the elements that we're declaring it should be for.
-
- Then the "m" line for a given relay is the one that gets the most votes
- from authorities that both a) voted for the microdescriptor-elements
- line we're using, and b) voted for the descriptor we're using.
-
- (If there's a tie, use the smaller hash. But really, if there are
- multiple such votes and they differ about a microdescriptor, we caught
- one of them lying or being buggy. We should log it to track down why.)
-
- If there are no such votes, then we leave out the "m" line for that
- relay. That means clients should avoid it for this time period. (As
- an extension it could instead mean that clients should fetch the
- descriptor and figure out its microdescriptor themselves. But let's
- not get ahead of ourselves.)
-
- It would be nice to have a more foolproof way to agree on what
- microdescriptor hash each authority should vote for, so we can avoid
- missing "m" lines. Just switching to a new consensus-method each time
- we change the set of microdescriptor-elements won't help though, since
- each authority will still have to decide what hash to vote for before
- knowing what consensus-method will be used.
-
- Here's one way we could do it. Each vote / consensus includes
- the microdescriptor-elements that were used to compute the hashes,
- and also a preferred-microdescriptor-elements set. If an authority
- has a consensus from the previous period, then it should use the
- consensus preferred-microdescriptor-elements when computing its votes
- for microdescriptor-elements and the appropriate hashes in the upcoming
- period.
(If it has no previous consensus, then it just writes its
- own preferences in both lines.)
-
-3.2. Directory mirrors serve microdescriptors
-
- Directory mirrors should then read the microdescriptor-elements line
- from the consensus, and learn how to answer requests. (Directory mirrors
- continue to serve normal relay descriptors too, a) to serve old clients
- and b) to be able to construct microdescriptors on the fly.)
-
- The microdescriptors with hashes <D1>,<D2>,<D3> should be available at:
-     http://<hostname>/tor/micro/d/<D1>+<D2>+<D3>.z
-
- All the microdescriptors from the current consensus should also be
- available at:
-     http://<hostname>/tor/micro/all.z
- so a client that's bootstrapping doesn't need to send a 70KB URL just
- to name every microdescriptor it's looking for.
-
- The format of a microdescriptor is the header line
-     "microdescriptor-header"
- followed by each element (keyword and body), alphabetically. There's
- no need to mention what hash it's for, since it's self-identifying:
- you can hash the elements to learn this.
-
- (Do we need a footer line to show that it's over, or is the next
- microdescriptor line or EOF enough of a hint? A footer line wouldn't
- hurt much. Also, no fair voting for the microdescriptor-element
- "microdescriptor-header".)
-
- The hash of the microdescriptor is simply the hash of the concatenated
- elements -- not counting the header line or hypothetical footer line.
- Unless you prefer that?
-
- Is there a reasonable way to version these things? We could say that
- the microdescriptor-header line can contain arguments which clients
- must ignore if they don't understand them. Any better ways?
-
- Directory mirrors should check to make sure that the microdescriptors
- they're about to serve match the right hashes (either the hashes from
- the fetch URL or the hashes from the consensus, respectively).
-
- We will probably want to consider some sort of smart data structure to
- be able to quickly convert microdescriptor hashes into the appropriate
- microdescriptor. Clients will want this anyway when they load their
- microdescriptor cache and want to match it up with the consensus to
- see what's missing.
-
-3.3. Clients fetch them and cache them
-
- When a client gets a new consensus, it looks to see if there are any
- microdescriptors it needs to learn. If it needs to learn more than
- some threshold of the microdescriptors (half?), it requests 'all',
- else it requests only the missing ones.
-
- Clients maintain a cache of microdescriptors along with metadata like
- when it was last referenced by a consensus. They keep a microdescriptor
- until it hasn't been mentioned in any consensus for a week. Future
- clients might cache them for longer or shorter times.
-
-3.3.1. Information leaks from clients
-
- If a client asks you for a set of microdescs, then you know she didn't
- have them cached before. How much does that leak? What about when
- we're all using our entry guards as directory guards, and we've seen
- that user make a bunch of circuits already?
-
- Fetching "all" when you need at least half is a good first order fix,
- but might not be all there is to it.
-
- Another future option would be to fetch some of the microdescriptors
- anonymously (via a Tor circuit).
-
-4. Transition and deployment
-
- Phase one, the directory authorities should start voting on
- microdescriptors and microdescriptor elements, and putting them in the
- consensus. This should happen during the 0.2.1.x series, and should
- be relatively easy to do.
-
- Phase two, directory mirrors should learn how to serve them, and learn
- how to read the consensus to find out what they should be serving. This
- phase could be done either in 0.2.1.x or early in 0.2.2.x, depending
- on how messy it turns out to be and how quickly we get around to it.
-
- Phase three, clients should start fetching and caching them instead
- of normal descriptors. This should happen post 0.2.1.x.
-
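For readers of the removed proposal, the selection rules in its Section 3.1.2 and the hash definition in its Section 3.2 can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not Tor's code. The function names, the dict layout of a parsed descriptor, and the per-relay vote records are all hypothetical. The proposal does not name a hash function, so SHA-256 is used purely as a placeholder. Skipping elements that a descriptor lacks, comparing element sets as space-joined lines for the "alphabetically first" tie-break, and comparing base64 strings for the "smaller hash" tie-break are also assumptions.

```python
import base64
import hashlib
from collections import Counter

HEADER = "microdescriptor-header"  # header line named in Section 3.2


def build_microdescriptor(descriptor, elements):
    """Return (text, digest) for one relay.

    descriptor: dict mapping element keyword -> full element text
                (keyword and body); a hypothetical parsed form.
    elements:   keywords from the microdescriptor-elements line.
    """
    # Elements appear alphabetically; absent ones are skipped (assumption).
    body = "".join(descriptor[k] + "\n" for k in sorted(elements) if k in descriptor)
    # The hash covers the concatenated elements only, not the header line.
    # SHA-256 is a placeholder: the proposal does not name a hash function.
    raw = hashlib.sha256(body.encode("ascii")).digest()
    digest = base64.b64encode(raw).decode("ascii")
    return HEADER + "\n" + body, digest


def choose_elements(element_votes):
    """Pick the consensus element set from one voted set per authority."""
    counts = Counter(tuple(sorted(vote)) for vote in element_votes)
    # Most votes, then most elements, then alphabetically first
    # (compared here as the space-joined line, an assumption).
    return min(counts, key=lambda s: (-counts[s], -len(s), " ".join(s)))


def choose_m_line(relay_votes, chosen_elements, chosen_descriptor):
    """Pick the "m" hash for one relay, or None if no vote qualifies.

    relay_votes: list of dicts with hypothetical keys "elements",
                 "descriptor" (the descriptor digest voted for) and "m".
    """
    counts = Counter(
        vote["m"]
        for vote in relay_votes
        if tuple(sorted(vote["elements"])) == chosen_elements
        and vote["descriptor"] == chosen_descriptor
    )
    if not counts:
        return None  # leave out the "m" line for this relay
    # Most votes; ties go to the smaller hash (string comparison, an assumption).
    return min(counts, key=lambda h: (-counts[h], h))


if __name__ == "__main__":
    desc = {"family": "family relayA relayB", "onion-key": "onion-key EXAMPLE"}
    votes = [("family", "onion-key"), ("onion-key", "family"), ("onion-key",)]
    chosen = choose_elements(votes)
    text, digest = build_microdescriptor(desc, chosen)
    relay_votes = [
        {"elements": v, "descriptor": "D1", "m": build_microdescriptor(desc, v)[1]}
        for v in votes
    ]
    print(chosen, choose_m_line(relay_votes, chosen, "D1") == digest)
```

Because only votes cast over the winning element set and the winning descriptor are counted, every surviving "m" value was computed over the same input. That is the determinism the proposal says the per-element majority approach would lose.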