author | Nick Mathewson <nickm@torproject.org> | 2012-10-11 10:36:31 -0400 |
---|---|---|
committer | Nick Mathewson <nickm@torproject.org> | 2012-10-11 10:36:31 -0400 |
commit | 7dd452336e4df3e4bcc66b40f7cbbd56416bd6a2 | |
tree | 511ad896f208a6f531321349cfde2b8d7bab92ed /proposals/210-faster-headless-consensus-bootstrap.txt | |
parent | 5c00b2d335a4c6aa0b01dc51069a2010667ae545 | |
faster headless consensus bootstrapping is now proposal 210
Diffstat (limited to 'proposals/210-faster-headless-consensus-bootstrap.txt')
-rw-r--r-- | proposals/210-faster-headless-consensus-bootstrap.txt | 113 |
1 files changed, 113 insertions, 0 deletions
diff --git a/proposals/210-faster-headless-consensus-bootstrap.txt b/proposals/210-faster-headless-consensus-bootstrap.txt
new file mode 100644
index 0000000..6b1502b
--- /dev/null
+++ b/proposals/210-faster-headless-consensus-bootstrap.txt
@@ -0,0 +1,113 @@
+Filename: 210-faster-headless-consensus-bootstrap.txt
+Title: Faster Headless Consensus Bootstrapping
+Author: Mike Perry
+Created: 01-10-2012
+Status: Open
+Target: 0.2.4.x+
+
+
+Overview and Motivation
+
+ This proposal describes a way for clients to fetch the initial
+ consensus more quickly in situations where some or all of the directory
+ authorities are unreachable. This proposal is meant to describe a
+ solution for bug #4483.
+
+Design: Bootstrap Process Changes
+
+ The core idea is to attempt to establish bootstrap connections in
+ parallel during the bootstrap process, and download the consensus from
+ the first connection that completes.
+
+ Connection attempts will be done in batches of three. Only one
+ connection will be performed to one of the canonical directory
+ authorities. Two connections will be performed to randomly chosen hard
+ coded directory mirrors.
+
+ If no connections complete within 5 seconds, another batch of three
+ connections will be launched. Otherwise, the first connection to
+ complete will be used to download the consensus document and the others
+ will be closed, after which bootstrapping will proceed as normal.
+
+ If at any time, the total outstanding bootstrap connection attempts
+ exceeds 15, no new connection attempts are to be launched until existing
+ connection attempts experience full timeout.
+
+Design: Fallback Dir Mirror Selection
+
+ The set of hard coded directory mirrors from #572 shall be chosen using
+ the 100 Guard nodes with the longest uptime.
+
+ The fallback weights will be set using each mirror's fraction of
+ consensus bandwidth out of the total of all 100 mirrors.
+
+ This list of fallback dir mirrors should be updated with every
+ major Tor release. In future releases, the number of dir mirrors
+ should be set at 20% of the current Guard nodes, rather than fixed at
+ 100.
+
+Performance: Additional Load with Current Parameter Choices
+
+ This design and the connection count parameters were chosen such that
+ no additional bandwidth load would be placed on the directory
+ authorities. In fact, the directory authorities should experience less
+ load, because they will not need to serve the consensus document for a
+ connection in the event that one of the directory mirrors completes its
+ connection before the directory authority does.
+
+ However, the scheme does place additional TLS connection load on the
+ fallback dir mirrors. Because bootstrapping is rare and all but one of
+ the TLS connections will be very short-lived and unused, this should not
+ be a substantial issue.
+
+ The dangerous case is in the event of a prolonged consensus failure
+ that induces all clients to enter into the bootstrap process. In this
+ case, the number of initial TLS connections to the fallback dir mirrors
+ would be 2*C/100, or 10,000 for C=500,000 users. If no connections
+ complete before the five retries, this could reach as high as 50,000
+ connection attempts, but this is extremely unlikely to happen in full
+ aggregate.
+
+ However, in the no-consensus scenario today, the directory authorities
+ would already experience C/9 or 55,555 connection attempts. The
+ 5-retry scheme increases their total maximum load to about 275,000
+ connection attempts, but again this is unlikely to be reached
+ in aggregate. Additionally, with this scheme, even if the dirauths
+ are taken down by this load, the dir mirrors should be able to survive
+ it.
+
+Implementation Notes: Code Modifications
+
+ The implementation of the bootstrap process is unfortunately mixed
+ in with many types of directory activity.
+
+ The process starts in update_consensus_networkstatus_downloads(),
+ which initiates a single directory connection through
+ directory_get_from_dirserver(). Depending on bootstrap state,
+ a single directory server is selected and a connection is
+ eventually made through directory_initiate_command_rend().
+
+ There appear to be a few options for altering this code to perform
+ multiple connections. Without refactoring, one approach would be
+ to make multiple calls to directory_initiate_command_routerstatus()
+ from directory_get_from_dirserver() if the purpose is
+ DIR_PURPOSE_FETCH_CONSENSUS and the only directory servers available
+ are the authorities and the fallback dir mirrors.
+
+ The code in directory_initiate_command_rend() would then need to be
+ altered to maintain a list of the dircons created for this purpose as
+ well as avoid immediately queuing the directory_send_command() request
+ for the DIR_PURPOSE_FETCH_CONSENSUS purpose. A flag would need to be set
+ on the dircon to be checked in connection_dir_finished_connecting().
+
+ The function connection_dir_finished_connecting() would need to be
+ altered to examine the list of pending dircons, determine if this one is
+ the first to complete, and if so, then call directory_send_command() to
+ download the consensus and close the other pending dircons.
+
+ An additional timer would need to be installed to re-call
+ update_consensus_networkstatus_downloads() or a related helper after 5
+ seconds. connection_dir_finished_connecting() would cancel this timer.
+ The helper would check the list of pending connections and ensure it
+ never exceeds 15.
+
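The batching rules in the proposal's "Design: Bootstrap Process Changes" section can be sketched as a small simulation. This is an illustration only and not Tor code: the names pick_batch and simulate_bootstrap, the connect_delay table, and the 30-second full_timeout are hypothetical, and the cap is read here as "launch a new batch only if it keeps the pending count at or below 15". The batch size of three, the 5-second interval, and the limit of 15 come from the proposal text.

```python
import random

BATCH_INTERVAL = 5.0   # seconds between batches (from the proposal)
MAX_OUTSTANDING = 15   # cap on pending attempts (from the proposal)
BATCH_SIZE = 3         # one authority plus two fallback mirrors


def pick_batch(authorities, mirror_weights, rng):
    """Return one random authority and two distinct bandwidth-weighted mirrors."""
    batch = [rng.choice(authorities)]
    remaining = dict(mirror_weights)
    for _ in range(2):
        names = list(remaining)
        choice = rng.choices(names, weights=[remaining[n] for n in names])[0]
        batch.append(choice)
        del remaining[choice]  # sample mirrors without replacement
    return batch


def simulate_bootstrap(authorities, mirror_weights, connect_delay, rng,
                       full_timeout=30.0, give_up_after=120.0):
    """Simulate batch launches until one connection completes.

    connect_delay maps a server name to the seconds its connection takes;
    servers missing from the map never complete (assumed model).
    Returns (winner, finish_time, attempts).
    """
    pending = []  # (launch_time, server) for attempts still outstanding
    attempts = 0
    now = 0.0
    while now <= give_up_after:
        # Attempts that have hit the full timeout no longer count as pending.
        pending = [(t0, s) for t0, s in pending if now - t0 < full_timeout]
        if len(pending) + BATCH_SIZE <= MAX_OUTSTANDING:
            for server in pick_batch(authorities, mirror_weights, rng):
                pending.append((now, server))
                attempts += 1
        # Collect attempts that finish before the next batch would launch.
        deadline = now + BATCH_INTERVAL
        done = []
        for t0, server in pending:
            delay = connect_delay.get(server)
            if delay is not None and delay < full_timeout and t0 + delay <= deadline:
                done.append((t0 + delay, server))
        if done:
            # The first connection to complete wins; the rest would be closed.
            finish_time, winner = min(done)
            return winner, finish_time, attempts
        now = deadline
    return None, None, attempts
```

For example, with every authority unreachable and every mirror answering after one second, simulate_bootstrap returns a mirror as the winner after a single batch of three attempts. With nothing reachable at all, the pending count climbs to 15 after five batches and then stalls until the oldest attempts time out.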