diff options
Diffstat (limited to 'doc/spec/proposals/114-distributed-storage.txt')
-rw-r--r-- | doc/spec/proposals/114-distributed-storage.txt | 441 |
1 files changed, 0 insertions, 441 deletions
diff --git a/doc/spec/proposals/114-distributed-storage.txt b/doc/spec/proposals/114-distributed-storage.txt deleted file mode 100644 index e9271fb82d..0000000000 --- a/doc/spec/proposals/114-distributed-storage.txt +++ /dev/null @@ -1,441 +0,0 @@ -Filename: 114-distributed-storage.txt -Title: Distributed Storage for Tor Hidden Service Descriptors -Version: $Revision$ -Last-Modified: $Date$ -Author: Karsten Loesing -Created: 13-May-2007 -Status: Closed -Implemented-In: 0.2.0.x - -Change history: - - 13-May-2007 Initial proposal - 14-May-2007 Added changes suggested by Lasse Øverlier - 30-May-2007 Changed descriptor format, key length discussion, typos - 09-Jul-2007 Incorporated suggestions by Roger, added status of specification - and implementation for upcoming GSoC mid-term evaluation - 11-Aug-2007 Updated implementation statuses, included non-consecutive - replication to descriptor format - 20-Aug-2007 Renamed config option HSDir as HidServDirectoryV2 - 02-Dec-2007 Closed proposal - -Overview: - - The basic idea of this proposal is to distribute the tasks of storing and - serving hidden service descriptors from currently three authoritative - directory nodes among a large subset of all onion routers. The three - reasons to do this are better robustness (availability), better - scalability, and improved security properties. Further, - this proposal suggests changes to the hidden service descriptor format to - prevent new security threats coming from decentralization and to gain even - better security properties. - -Status: - - As of December 2007, the new hidden service descriptor format is implemented - and usable. However, servers and clients do not yet make use of descriptor - cookies, because there are open usability issues of this feature that might - be resolved in proposal 121. Further, hidden service directories do not - perform replication by themselves, because (unauthorized) replica fetch - requests would allow any attacker to fetch all hidden service descriptors in - the system. As neither issue is critical to the functioning of v2 - descriptors and their distribution, this proposal is considered as Closed. - -Motivation: - - The current design of hidden services exhibits the following performance and - security problems: - - First, the three hidden service authoritative directories constitute a - performance bottleneck in the system. The directory nodes are responsible for - storing and serving all hidden service descriptors. As of May 2007 there are - about 1000 descriptors at a time, but this number is assumed to increase in - the future. Further, there is no replication protocol for descriptors between - the three directory nodes, so that hidden services must ensure the - availability of their descriptors by manually publishing them on all - directory nodes. Whenever a fourth or fifth hidden service authoritative - directory is added, hidden services will need to maintain an equally - increasing number of replicas. These scalability issues have an impact on the - current usage of hidden services and put an even higher burden on the - development of new kinds of applications for hidden services that might - require storing even more descriptors. - - Second, besides posing a limitation to scalability, storing all hidden - service descriptors on three directory nodes also constitutes a security - risk. The directory node operators could easily analyze the publish and fetch - requests to derive information on service activity and usage and read the - descriptor contents to determine which onion routers work as introduction - points for a given hidden service and need to be attacked or threatened to - shut it down. Furthermore, the contents of a hidden service descriptor offer - only minimal security properties to the hidden service. Whoever gets aware of - the service ID can easily find out whether the service is active at the - moment and which introduction points it has. This applies to (former) - clients, (former) introduction points, and of course to the directory nodes. - It requires only to request the descriptor for the given service ID, which - can be performed by anyone anonymously. - - This proposal suggests two major changes to approach the described - performance and security problems: - - The first change affects the storage location for hidden service descriptors. - Descriptors are distributed among a large subset of all onion routers instead - of three fixed directory nodes. Each storing node is responsible for a subset - of descriptors for a limited time only. It is not able to choose which - descriptors it stores at a certain time, because this is determined by its - onion ID which is hard to change frequently and in time (only routers which - are stable for a given time are accepted as storing nodes). In order to - resist single node failures and untrustworthy nodes, descriptors are - replicated among a certain number of storing nodes. A first replication - protocol makes sure that descriptors don't get lost when the node population - changes; therefore, a storing node periodically requests the descriptors from - its siblings. A second replication protocol distributes descriptors among - non-consecutive nodes of the ID ring to prevent a group of adversaries from - generating new onion keys until they have consecutive IDs to create a 'black - hole' in the ring and make random services unavailable. Connections to - storing nodes are established by extending existing circuits by one hop to - the storing node. This also ensures that contents are encrypted. The effect - of this first change is that the probability that a single node operator - learns about a certain hidden service is very small and that it is very hard - to track a service over time, even when it collaborates with other node - operators. - - The second change concerns the content of hidden service descriptors. - Obviously, security problems cannot be solved only by decentralizing storage; - in fact, they could also get worse if done without caution. At first, a - descriptor ID needs to change periodically in order to be stored on changing - nodes over time. Next, the descriptor ID needs to be computable only for the - service's clients, but should be unpredictable for all other nodes. Further, - the storing node needs to be able to verify that the hidden service is the - true originator of the descriptor with the given ID even though it is not a - client. Finally, a storing node should learn as little information as - necessary by storing a descriptor, because it might not be as trustworthy as - a directory node; for example it does not need to know the list of - introduction points. Therefore, a second key is applied that is only known to - the hidden service provider and its clients and that is not included in the - descriptor. It is used to calculate descriptor IDs and to encrypt the - introduction points. This second key can either be given to all clients - together with the hidden service ID, or to a group or a single client as - an authentication token. In the future this second key could be the result of - some key agreement protocol between the hidden service and one or more - clients. A new text-based format is proposed for descriptors instead of an - extension of the existing binary format for reasons of future extensibility. - -Design: - - The proposed design is described by the required changes to the current - design. These requirements are grouped by content, rather than by affected - specification documents or code files, and numbered for reference below. - - Hidden service clients, servers, and directories: - - /1/ Create routing list - - All participants can filter the consensus status document received from the - directory authorities to one routing list containing only those servers - that store and serve hidden service descriptors and which are running for - at least 24 hours. A participant only trusts its own routing list and never - learns about routing information from other parties. - - /2/ Determine responsible hidden service directory - - All participants can determine the hidden service directory that is - responsible for storing and serving a given ID, as well as the hidden - service directories that replicate its content. Every hidden service - directory is responsible for the descriptor IDs in the interval from - its predecessor, exclusive, to its own ID, inclusive. Further, a hidden - service directory holds replicas for its n predecessors, where n denotes - the number of consecutive replicas. (requires /1/) - - [/3/ and /4/ were requirements to use BEGIN_DIR cells for directory - requests which have not been fulfilled in the course of the implementation - of this proposal, but elsewhere.] - - Hidden service directory nodes: - - /5/ Advertise hidden service directory functionality - - Every onion router that has its directory port open can decide whether it - wants to store and serve hidden service descriptors by setting a new config - option "HidServDirectoryV2" 0|1 to 1. An onion router with this config - option being set includes the flag "hidden-service-dir" in its router - descriptors that it sends to directory authorities. - - /6/ Accept v2 publish requests, parse and store v2 descriptors - - Hidden service directory nodes accept publish requests for hidden service - descriptors and store them to their local memory. (It is not necessary to - make descriptors persistent, because after disconnecting, the onion router - would not be accepted as storing node anyway, because it has not been - running for at least 24 hours.) All requests and replies are formatted as - HTTP messages. Requests are directed to the router's directory port and are - contained within BEGIN_DIR cells. A hidden service directory node stores a - descriptor only when it thinks that it is responsible for storing that - descriptor based on its own routing table. Every hidden service directory - node is responsible for the descriptor IDs in the interval of its n-th - predecessor in the ID circle up to its own ID (n denotes the number of - consecutive replicas). (requires /1/) - - /7/ Accept v2 fetch requests - - Same as /6/, but with fetch requests for hidden service descriptors. - (requires /2/) - - /8/ Replicate descriptors with neighbors - - A hidden service directory node replicates descriptors from its two - predecessors by downloading them once an hour. Further, it checks its - routing table periodically for changes. Whenever it realizes that a - predecessor has left the network, it establishes a connection to the new - n-th predecessor and requests its stored descriptors in the interval of its - (n+1)-th predecessor and the requested n-th predecessor. Whenever it - realizes that a new onion router has joined with an ID higher than its - former n-th predecessor, it adds it to its predecessors and discards all - descriptors in the interval of its (n+1)-th and its n-th predecessor. - (requires /1/) - - [Dec 02: This function has not been implemented, because arbitrary nodes - what have been able to download the entire set of v2 descriptors. An - authorized replication request would be necessary. For the moment, the - system runs without any directory-side replication. -KL] - - Authoritative directory nodes: - - /9/ Confirm a router's hidden service directory functionality - - Directory nodes include a new flag "HSDir" for routers that decided to - provide storage for hidden service descriptors and that are running for at - least 24 hours. The last requirement prevents a node from frequently - changing its onion key to become responsible for an identifier it wants to - target. - - Hidden service provider: - - /10/ Configure v2 hidden service - - Each hidden service provider that has set the config option - "PublishV2HidServDescriptors" 0|1 to 1 is configured to publish v2 - descriptors and conform to the v2 connection establishment protocol. When - configuring a hidden service, a hidden service provider checks if it has - already created a random secret_cookie and a hostname2 file; if not, it - creates both of them. (requires /2/) - - /11/ Establish introduction points with fresh key - - If configured to publish only v2 descriptors and no v0/v1 descriptors any - more, a hidden service provider that is setting up the hidden service at - introduction points does not pass its own public key, but the public key - of a freshly generated key pair. It also includes these fresh public keys - in the hidden service descriptor together with the other introduction point - information. The reason is that the introduction point does not need to and - therefore should not know for which hidden service it works, so as to - prevent it from tracking the hidden service's activity. (If a hidden - service provider supports both, v0/v1 and v2 descriptors, v0/v1 clients - rely on the fact that all introduction points accept the same public key, - so that this new feature cannot be used.) - - /12/ Encode v2 descriptors and send v2 publish requests - - If configured to publish v2 descriptors, a hidden service provider - publishes a new descriptor whenever its content changes or a new - publication period starts for this descriptor. If the current publication - period would only last for less than 60 minutes (= 2 x 30 minutes to allow - the server to be 30 minutes behind and the client 30 minutes ahead), the - hidden service provider publishes both a current descriptor and one for - the next period. Publication is performed by sending the descriptor to all - hidden service directories that are responsible for keeping replicas for - the descriptor ID. This includes two non-consecutive replicas that are - stored at 3 consecutive nodes each. (requires /1/ and /2/) - - Hidden service client: - - /13/ Send v2 fetch requests - - A hidden service client that has set the config option - "FetchV2HidServDescriptors" 0|1 to 1 handles SOCKS requests for v2 onion - addresses by requesting a v2 descriptor from a randomly chosen hidden - service directory that is responsible for keeping replica for the - descriptor ID. In total there are six replicas of which the first and the - last three are stored on consecutive nodes. The probability of picking one - of the three consecutive replicas is 1/6, 2/6, and 3/6 to incorporate the - fact that the availability will be the highest on the node with next higher - ID. A hidden service client relies on the hidden service provider to store - two sets of descriptors to compensate clock skew between service and - client. (requires /1/ and /2/) - - /14/ Process v2 fetch reply and parse v2 descriptors - - A hidden service client that has sent a request for a v2 descriptor can - parse it and store it to the local cache of rendezvous service descriptors. - - /15/ Establish connection to v2 hidden service - - A hidden service client can establish a connection to a hidden service - using a v2 descriptor. This includes using the secret cookie for decrypting - the introduction points contained in the descriptor. When contacting an - introduction point, the client does not use the public key of the hidden - service provider, but the freshly-generated public key that is included in - the hidden service descriptor. Whether or not a fresh key is used instead - of the key of the hidden service depends on the available protocol versions - that are included in the descriptor; by this, connection establishment is - to a certain extend decoupled from fetching the descriptor. - - Hidden service descriptor: - - (Requirements concerning the descriptor format are contained in /6/ and /7/.) - - The new v2 hidden service descriptor format looks like this: - - onion-address = h(public-key) + cookie - descriptor-id = h(h(public-key) + h(time-period + cookie + relica)) - descriptor-content = { - descriptor-id, - version, - public-key, - h(time-period + cookie + replica), - timestamp, - protocol-versions, - { introduction-points } encrypted with cookie - } signed with private-key - - The "descriptor-id" needs to change periodically in order for the - descriptor to be stored on changing nodes over time. It may only be - computable by a hidden service provider and all of his clients to prevent - unauthorized nodes from tracking the service activity by periodically - checking whether there is a descriptor for this service. Finally, the - hidden service directory needs to be able to verify that the hidden service - provider is the true originator of the descriptor with the given ID. - - Therefore, "descriptor-id" is derived from the "public-key" of the hidden - service provider, the current "time-period" which changes every 24 hours, - a secret "cookie" shared between hidden service provider and clients, and - a "replica" denoting the number of this non-consecutive replica. (The - "time-period" is constructed in a way that time periods do not change at - the same moment for all descriptors by deriving a value between 0:00 and - 23:59 hours from h(public-key) and making the descriptors of this hidden - service provider expire at that time of the day.) The "descriptor-id" is - defined to be 160 bits long. [extending the "descriptor-id" length - suggested by LØ] - - Only the hidden service provider and the clients are able to generate - future "descriptor-ID"s. Hence, the "onion-address" is extended from now - the hash value of "public-key" by the secret "cookie". The "public-key" is - determined to be 80 bits long, whereas the "cookie" is dimensioned to be - 120 bits long. This makes a total of 200 bits or 40 base32 chars, which is - quite a lot to handle for a human, but necessary to provide sufficient - protection against an adversary from generating a key pair with same - "public-key" hash or guessing the "cookie". - - A hidden service directory can verify that a descriptor was created by the - hidden service provider by checking if the "descriptor-id" corresponds to - the "public-key" and if the signature can be verified with the - "public-key". - - The "introduction-points" that are included in the descriptor are encrypted - using the same "cookie" that is shared between hidden service provider and - clients. [correction to use another key than h(time-period + cookie) as - encryption key for introduction points made by LØ] - - A new text-based format is proposed for descriptors instead of an extension - of the existing binary format for reasons of future extensibility. - -Security implications: - - The security implications of the proposed changes are grouped by the roles of - nodes that could perform attacks or on which attacks could be performed. - - Attacks by authoritative directory nodes - - Authoritative directory nodes are no longer the single places in the - network that know about a hidden service's activity and introduction - points. Thus, they cannot perform attacks using this information, e.g. - track a hidden service's activity or usage pattern or attack its - introduction points. Formerly, it would only require a single corrupted - authoritative directory operator to perform such an attack. - - Attacks by hidden service directory nodes - - A hidden service directory node could misuse a stored descriptor to track a - hidden service's activity and usage pattern by clients. Though there is no - countermeasure against this kind of attack, it is very expensive to track a - certain hidden service over time. An attacker would need to run a large - number of stable onion routers that work as hidden service directory nodes - to have a good probability to become responsible for its changing - descriptor IDs. For each period, the probability is: - - 1-(N-c choose r)/(N choose r) for N-c>=r and 1 otherwise, with N - as total - number of hidden service directories, c as compromised nodes, and r as - number of replicas - - The hidden service directory nodes could try to make a certain hidden - service unavailable to its clients. Therefore, they could discard all - stored descriptors for that hidden service and reply to clients that there - is no descriptor for the given ID or return an old or false descriptor - content. The client would detect a false descriptor, because it could not - contain a correct signature. But an old content or an empty reply could - confuse the client. Therefore, the countermeasure is to replicate - descriptors among a small number of hidden service directories, e.g. 5. - The probability of a group of collaborating nodes to make a hidden service - completely unavailable is in each period: - - (c choose r)/(N choose r) for c>=r and N>=r, and 0 otherwise, - with N as total - number of hidden service directories, c as compromised nodes, and r as - number of replicas - - A hidden service directory could try to find out which introduction points - are working on behalf of a hidden service. In contrast to the previous - design, this is not possible anymore, because this information is encrypted - to the clients of a hidden service. - - Attacks on hidden service directory nodes - - An anonymous attacker could try to swamp a hidden service directory with - false descriptors for a given descriptor ID. This is prevented by requiring - that descriptors are signed. - - Anonymous attackers could swamp a hidden service directory with correct - descriptors for non-existing hidden services. There is no countermeasure - against this attack. However, the creation of valid descriptors is more - expensive than verification and storage in local memory. This should make - this kind of attack unattractive. - - Attacks by introduction points - - Current or former introduction points could try to gain information on the - hidden service they serve. But due to the fresh key pair that is used by - the hidden service, this attack is not possible anymore. - - Attacks by clients - - Current or former clients could track a hidden service's activity, attack - its introduction points, or determine the responsible hidden service - directory nodes and attack them. There is nothing that could prevent them - from doing so, because honest clients need the full descriptor content to - establish a connection to the hidden service. At the moment, the only - countermeasure against dishonest clients is to change the secret cookie and - pass it only to the honest clients. - -Compatibility: - - The proposed design is meant to replace the current design for hidden service - descriptors and their storage in the long run. - - There should be a first transition phase in which both, the current design - and the proposed design are served in parallel. Onion routers should start - serving as hidden service directories, and hidden service providers and - clients should make use of the new design if both sides support it. Hidden - service providers should be allowed to publish descriptors of the current - format in parallel, and authoritative directories should continue storing and - serving these descriptors. - - After the first transition phase, hidden service providers should stop - publishing descriptors on authoritative directories, and hidden service - clients should not try to fetch descriptors from the authoritative - directories. However, the authoritative directories should continue serving - hidden service descriptors for a second transition phase. As of this point, - all v2 config options should be set to a default value of 1. - - After the second transition phase, the authoritative directories should stop - serving hidden service descriptors. - |