Diffstat (limited to 'doc/spec/proposals/ideas')
-rw-r--r--  doc/spec/proposals/ideas/xxx-auto-update.txt                      |  39 -
-rw-r--r--  doc/spec/proposals/ideas/xxx-bridge-disbursement.txt              | 174 -
-rw-r--r--  doc/spec/proposals/ideas/xxx-bwrate-algs.txt                      | 106 -
-rw-r--r--  doc/spec/proposals/ideas/xxx-choosing-crypto-in-tor-protocol.txt  | 138 -
-rw-r--r--  doc/spec/proposals/ideas/xxx-controllers-intercept-extends.txt    |  44 -
-rw-r--r--  doc/spec/proposals/ideas/xxx-encrypted-services.txt               |  18 -
-rw-r--r--  doc/spec/proposals/ideas/xxx-exit-scanning-outline.txt            |  44 -
-rw-r--r--  doc/spec/proposals/ideas/xxx-geoip-survey-plan.txt                | 137 -
-rw-r--r--  doc/spec/proposals/ideas/xxx-grand-scaling-plan.txt               |  97 -
-rw-r--r--  doc/spec/proposals/ideas/xxx-hide-platform.txt                    |  37 -
-rw-r--r--  doc/spec/proposals/ideas/xxx-port-knocking.txt                    |  91 -
-rw-r--r--  doc/spec/proposals/ideas/xxx-rate-limit-exits.txt                 |  63 -
-rw-r--r--  doc/spec/proposals/ideas/xxx-separate-streams-by-port.txt         |  59 -
-rw-r--r--  doc/spec/proposals/ideas/xxx-using-spdy.txt                       | 143 -
-rw-r--r--  doc/spec/proposals/ideas/xxx-what-uses-sha1.txt                   | 247 -
15 files changed, 0 insertions(+), 1437 deletions(-)
diff --git a/doc/spec/proposals/ideas/xxx-auto-update.txt b/doc/spec/proposals/ideas/xxx-auto-update.txt
deleted file mode 100644
index dc9a857c1e..0000000000
--- a/doc/spec/proposals/ideas/xxx-auto-update.txt
+++ /dev/null
@@ -1,39 +0,0 @@

Notes on an auto updater:

steve wants a "latest" symlink so he can always just fetch that.

roger worries that this will exacerbate the "what version are you
using?" "latest." problem.

weasel suggests putting the latest recommended version in dns. then
we don't have to hit the website. it's got caching, it's lightweight,
it scales. just put it in a TXT record or something.

but, no dnssec.

roger suggests a file on the https website that lists the latest
recommended version (or filename or url or something like that).

(steve seems to already be doing this with xerobank. he additionally
suggests a little blurb that can be displayed to the user to describe
what's new.)

how to verify you're getting the right file?
a) it's https.
b) ship with a signing key, and use some openssl functions to verify.
c) both

andrew reminds us that we have a "recommended versions" line in the
consensus directory already.

if only we had some way to point out the "latest stable recommendation"
from this list. we could list it first, or something.

the recommended versions line also doesn't take into account which
packages are available -- e.g. on Windows one version might be the best
available, and on OS X it might be a different one.

aren't there existing solutions to this? surely there is a beautiful,
efficient, crypto-correct auto updater lib out there. even for windows.
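option (b) is easy to sketch. a minimal illustration, not the project's
actual updater: it assumes an RSA key shipped with the client as PEM and
a detached PKCS#1 v1.5 signature over the version file; all file names
here are hypothetical.

  # Sketch of option (b): verify a downloaded version file against a
  # key that ships with the client. Hypothetical paths; assumes an RSA
  # key and a detached PKCS#1 v1.5 signature over the file's raw bytes.
  from cryptography.exceptions import InvalidSignature
  from cryptography.hazmat.primitives import hashes, serialization
  from cryptography.hazmat.primitives.asymmetric import padding

  def verify_update(update_path, sig_path, key_path="update-key.pem"):
      with open(key_path, "rb") as f:
          pubkey = serialization.load_pem_public_key(f.read())
      with open(update_path, "rb") as f:
          data = f.read()
      with open(sig_path, "rb") as f:
          sig = f.read()
      try:
          pubkey.verify(sig, data, padding.PKCS1v15(), hashes.SHA256())
          return True
      except InvalidSignature:
          return False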
diff --git a/doc/spec/proposals/ideas/xxx-bridge-disbursement.txt b/doc/spec/proposals/ideas/xxx-bridge-disbursement.txt
deleted file mode 100644
index 6c9a3c71ed..0000000000
--- a/doc/spec/proposals/ideas/xxx-bridge-disbursement.txt
+++ /dev/null
@@ -1,174 +0,0 @@

How to hand out bridges.

Divide bridges into 'strategies' as they come in. Do this uniformly
at random for now.

For each strategy, we'll hand out bridges in a different way to
clients. This document describes two strategies: email-based and
IP-based.

0. Notation:

   HMAC(k,v) : an HMAC of v using the key k.

   A|B: The string A concatenated with the string B.


1. Email-based.

   Goal: bootstrap based on one or more popular email services' sybil
   prevention algorithms.

   Parameters:
      HMAC -- an HMAC function
      P -- a time period
      K -- the number of bridges to send in a period.

   Setup: Generate two nonces, N and M.

   As bridges arrive, put them into a ring according to HMAC(N,ID),
   where ID is the bridge's identity digest.

   Divide time into divisions of length P.

   When we get an email:

      If it's not from a supported email service, reject it.

      If we already sent a response to that email address (normalized)
      in this period, send _exactly_ the same response.

      If it is from a supported service, generate X = HMAC(M,PS|E), where
      E is the lowercased normalized email address for the user, and
      PS is the start of the current period. Send
      the first K bridges in the ring after point X.

      [If we want to make sure that repeat queries are given exactly the
      same results, then we can't let the ring change during the
      time period. For a long time period like a month, that's quite a
      hassle. How about instead just keeping a replay cache of addresses
      that have been answered, and sending them a "sorry, you already got
      your addresses for the time period; perhaps you should try these
      other fine distribution strategies while you wait?" response? This
      approach would also resolve the "Make sure you can't construct a
      distinct address to match an existing one" note below. -RD]

      [I think, if we get a replay, we need to send back the same
      answer as we did the first time, not say "try again."
      Otherwise we need to worry that an attacker can keep people
      from getting bridges by preemptively asking for them,
      or that an attacker may force them to prove they haven't
      gotten any bridges by asking. -NM]

      [While we're at it, if we do the replay cache thing and don't need
      repeatable answers, we could just pick K random answers from the
      pool. Is it beneficial that a bridge user who knows about a clump of
      nodes will be sharing them with other users who know about a similar
      (overlapping) clump? One good aspect is against an adversary who
      learns about a clump this way and watches those bridges to learn
      other users and discover *their* bridges: he doesn't learn about
      as many new bridges as he might if they were randomly distributed.
      A drawback is against an adversary who happens to pick two email
      addresses in P that include overlapping answers: he can measure
      the difference in clumps and estimate how quickly the bridge pool
      is growing. -RD]

      [Random is one more darn thing to implement; rings are already
      there. -NM]

      [If we make the period P be mailbox-specific, and make it a random
      value around some mean, then we make it harder for an attacker to
      know when to try using his small army of gmail addresses to gather
      another harvest. But we also make it harder for users to know when
      they can try again. -RD]

      [Letting the users know about when they can try again seems
      worthwhile. Otherwise users and attackers will all probe and
      probe and probe until they get an answer. No additional
      security will be achieved, but bandwidth will be lost. -NM]

   To normalize an email address:
      Start with the RFC822 address. Consider only the mailbox {???}
      portion of the address (username@domain). Put this into lowercase
      ascii.

   Questions:
      What to do with weird character encodings? Look up the RFC.

   Notes:
      Make sure that you can't force a single email address to appear
      in lots of different ways. IOW, if nickm@freehaven.net and
      NICKM@freehaven.net aren't treated the same, then I can get lots
      more bridges than I should.

      Make sure you can't construct a distinct address to match an
      existing one. IOW, if we treat nickm@X and nickm@Y as the same
      user, then anybody can register nickm@Z and use it to tell which
      bridges nickm@X got (or would get).

      Make sure that we actually check headers so we can't be trivially
      used to spam people.
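   [To make the ring arithmetic concrete, here's a small sketch of the
   lookup above, using SHA-1 HMACs from the Python standard library.
   The helper names and the string form of PS are hypothetical; the
   document doesn't fix an encoding.

     import hashlib
     import hmac

     def h(key: bytes, msg: bytes) -> bytes:
         # HMAC(k,v) from section 0, instantiated with SHA-1.
         return hmac.new(key, msg, hashlib.sha1).digest()

     def build_ring(n_key: bytes, bridge_ids):
         # As bridges arrive, order them on a ring by HMAC(N, ID).
         return sorted((h(n_key, bid), bid) for bid in bridge_ids)

     def bridges_for_email(m_key: bytes, ring, ps: str, email: str, k: int):
         # X = HMAC(M, PS|E); answer the first K bridges after point X.
         x = h(m_key, (ps + "|" + email.lower()).encode())
         after = [bid for pos, bid in ring if pos > x]
         before = [bid for pos, bid in ring if pos <= x]
         return (after + before)[:k]  # wrap around the ring
   ]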
2. IP-based.

   Goal: avoid handing out all the bridges to users in a similar IP
   space and time.

   Parameters:

      T_Flush -- how long it should take a user on a single network to
      see a whole cluster of bridges.

      N_C

      K -- the number of bridges we hand out in response to a single
      request.

   Setup: using an AS map or a geoip map or some other flawed input
   source, divide IP space into "areas" such that surveying a large
   collection of "areas" is hard. For v0, use /24 address blocks.

   Group areas into N_C clusters.

   Generate secrets L, M, N.

   Set the period P such that P*(bridges-per-cluster/K) = T_Flush.
   Don't set P to greater than a week, or less than three hours.

   When we get a bridge:

      Based on HMAC(L,ID), assign the bridge to a cluster. Within each
      cluster, keep the bridges in a ring based on HMAC(M,ID).

      [Should we re-sort the rings for each new time period, so the ring
      for a given cluster is based on HMAC(M,PS|ID)? -RD]

   When we get a connection:

      If it's http, redirect it to https.

      Let area be the incoming IP network. Let PS be the current
      period. Compute X = HMAC(N, PS|area). Return the next K bridges
      in the ring after X.

      [Don't we want to compute C = HMAC(key, area) to learn what cluster
      to answer from, and then X = HMAC(key, PS|area) to pick a point in
      that ring? -RD]


   Need to clarify that some HMACs are for rings, and some are for
   partitions. How rings scale is clear. How do we grow the number of
   partitions? Looking at successive bits from the HMAC output is one way.

3. Open issues

   Denial of service attacks
   A good view of network topology

at some point we should learn some reliability stats on our bridges. when
we say above 'give out k bridges', we might give out 2 reliable ones and
k-2 others. we count around the ring the same way we do now, to find them.

diff --git a/doc/spec/proposals/ideas/xxx-bwrate-algs.txt b/doc/spec/proposals/ideas/xxx-bwrate-algs.txt
deleted file mode 100644
index 757f5bc55e..0000000000
--- a/doc/spec/proposals/ideas/xxx-bwrate-algs.txt
+++ /dev/null
@@ -1,106 +0,0 @@
# The following two algorithms are alternative plans for bandwidth
# rate limiting.


# Algorithm 1
# TODO: Burst and Relay/Regular differentiation

BwRate = Bandwidth Rate in Bytes Per Second
GlobalWriteBucket = 0
GlobalReadBucket = 0
Epoch = Token Fill Rate in seconds: suggest 50ms=.050
SecondCounter = 0
MinWriteBytes = Minimum number of bytes per write

Every Epoch Seconds:
  UseMinWriteBytes = MinWriteBytes
  WriteCnt = 0
  ReadCnt = 0
  BytesRead = 0

  For Each Open OR Conn with pending write data:
    WriteCnt++
  For Each Open OR Conn:
    ReadCnt++

  BytesToRead = (BwRate*Epoch + GlobalReadBucket)/ReadCnt
  BytesToWrite = (BwRate*Epoch + GlobalWriteBucket)/WriteCnt

  if BwRate/WriteCnt < MinWriteBytes:
    # If we aren't likely to accumulate enough bytes in a second to
    # send a whole cell for our connections, send partials
    Log(NOTICE, "Too many ORCons to write full blocks. Sending short packets.")
    UseMinWriteBytes = 1
    # Other option: We could switch to plan 2 here

  # Service each writable ORConn. If there are any partial writes,
  # return remaining bytes from this epoch to the global pool
  For Each Open OR Conn with pending write data:
    ORConn->write_bucket += BytesToWrite
    if ORConn->write_bucket > UseMinWriteBytes:
      w = write(ORConn, MIN(len(ORConn->write_data), ORConn->write_bucket))
      # possible that w < len(ORConn->write_data) here due to TCP pushback.
      # We should restore the rest of the write_bucket to the global
      # buffer
      GlobalWriteBucket += (ORConn->write_bucket - w)
      ORConn->write_bucket = 0

  For Each Open OR Conn:
    r = read_nonblock(ORConn, BytesToRead)
    BytesRead += r

  SecondCounter += Epoch
  if SecondCounter < 1:
    # Save unused bytes from this epoch to be used later in the second
    GlobalReadBucket += (BwRate*Epoch - BytesRead)
  else:
    SecondCounter = 0
    GlobalReadBucket = 0
    GlobalWriteBucket = 0
    For Each ORConn:
      ORConn->write_bucket = 0
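
# The per-epoch split above can be expressed compactly. A runnable
# Python sketch of just the write-allowance computation, with the same
# NOTICE fallback; illustrative only -- it ignores bursts, the read
# side, and the Relay/Regular TODO.
#
#   def allocate_write_tokens(bw_rate: float, epoch: float,
#                             global_bucket: int, pending_conns: int,
#                             min_write_bytes: int) -> int:
#       """Per-connection write allowance for this epoch."""
#       if pending_conns == 0:
#           return 0
#       per_conn = int((bw_rate * epoch + global_bucket) / pending_conns)
#       # If we can't accumulate a full cell per connection within a
#       # second, send short writes rather than stalling.
#       if bw_rate / pending_conns < min_write_bytes:
#           return max(per_conn, 1)
#       return per_conn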


# Alternate plan for Writing fairly. Reads would still be covered
# by plan 1 as there is no additional network overhead for short reads,
# so we don't need to try to avoid them.
#
# I think this is actually pretty similar to what we do now, but
# with the addition that the bytes accumulate up to the second mark
# and we try to keep track of our position in the write list here
# (unless libevent is doing that for us already and I just don't see it)
#
# TODO: Burst and Relay/Regular differentiation

# XXX: The inability to send single cells will cause us to block
# on EXTEND cells for low-bandwidth node pairs.
BwRate = Bandwidth Rate in Bytes Per Second
WriteBytes = Bytes per write
Epoch = MAX(MIN(WriteBytes/BwRate, .333s), .050s)

SecondCounter = 0
GlobalWriteBucket = 0

# New connections are inserted at Head-1 (the 'tail' of this circular list)
# This is not 100% fifo for all node data, but it is the best we can do
# without insane amounts of additional queueing complexity.
WriteConnList = List of Open OR Conns with pending write data > WriteBytes
WriteConnHead = 0

Every Epoch Seconds:
  GlobalWriteBucket += BwRate*Epoch
  WriteListEnd = WriteConnHead

  do
    ORConn = WriteConnList[WriteConnHead]
    w = write(ORConn, WriteBytes)
    GlobalWriteBucket -= w
    WriteConnHead += 1
  while GlobalWriteBucket > 0 and WriteConnHead != WriteListEnd

  SecondCounter += Epoch
  if SecondCounter >= 1:
    SecondCounter = 0
    GlobalWriteBucket = 0

diff --git a/doc/spec/proposals/ideas/xxx-choosing-crypto-in-tor-protocol.txt b/doc/spec/proposals/ideas/xxx-choosing-crypto-in-tor-protocol.txt
deleted file mode 100644
index e8489570f7..0000000000
--- a/doc/spec/proposals/ideas/xxx-choosing-crypto-in-tor-protocol.txt
+++ /dev/null
@@ -1,138 +0,0 @@
Filename: xxx-choosing-crypto-in-tor-protocol.txt
Title: Picking cryptographic standards in the Tor wire protocol
Author: Marian
Created: 2009-05-16
Status: Draft

Motivation:

  SHA-1 is horribly outdated and not suited for security critical
  purposes. SHA-2, RIPEMD-160, Whirlpool and Tiger are good options
  for a short-term replacement, but in the long run, we will
  probably want to upgrade to the winner or a semi-finalist of the
  SHA-3 competition.

  For a 2006 comparison of different hash algorithms, read:
  http://www.sane.nl/sane2006/program/final-papers/R10.pdf

  Other reading about SHA-1:
  http://www.schneier.com/blog/archives/2005/02/sha1_broken.html
  http://www.schneier.com/blog/archives/2005/08/new_cryptanalyt.html
  http://www.schneier.com/paper-preimages.html

  Additionally, AES has been theoretically broken for years. While
  the attack is still not efficient enough that the public sector
  has been able to prove that it works, we should probably consider
  the time between a theoretical attack and a practical attack as an
  opportunity to figure out how to upgrade to a better algorithm,
  such as Twofish.

  See:
  http://schneier.com/crypto-gram-0209.html#1

Design:

  I suggest that nodes should publish in directories which
  cryptographic standards, such as hash algorithms and ciphers,
  they support. Clients communicating with nodes will then
  pick whichever of those cryptographic standards they prefer
  the most. In the case that the node does not publish which
  cryptographic standards it supports, the client should assume
  that the server supports the older standards, such as SHA-1
  and AES, until such time as we choose to desupport those
  standards.
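  A minimal sketch of this selection rule, with hardcoded preferences
  as suggested later in this section to avoid partitioning. The
  algorithm names and ordering are illustrative only, not a proposed
  preference list:

    PREFERENCE_ORDER = ["SHA-3-winner", "Whirlpool", "SHA-1"]
    LEGACY = "SHA-1"

    def pick_digest(local_algs, peer_published_algs=None):
        if peer_published_algs is None:
            # Peer published nothing: assume it only speaks the old
            # standards, until we desupport them.
            return LEGACY
        common = set(local_algs) & set(peer_published_algs)
        for alg in PREFERENCE_ORDER:  # hardcoded, not user-tunable
            if alg in common:
                return alg
        return LEGACY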
  Node to node communications could work similarly. However, in
  case they both support a set of algorithms but have different
  preferences, the disagreement would have to be resolved
  somehow. Two possibilities include:
    * the node requesting communications presents which
      cryptographic standards it supports in the request. The
      other node picks.
    * both nodes send each other lists of what they support and
      what version of Tor they are using. The newer node picks,
      based on the assumption that the newer node has the most up
      to date information about which hash algorithm is the best.
      Of course, the node could lie about its version, but then
      again, it could also maliciously choose only to support older
      algorithms.

  Using this method, we could potentially add server side support
  for hash algorithms and ciphers before we instruct clients to
  begin preferring those hash algorithms and ciphers. In this way,
  the clients could upgrade and the servers would already support
  the newly preferred hash algorithms and ciphers, even if the
  servers were still using older versions of Tor, so long as the
  older versions of Tor were at least new enough to have server
  side support.

  This would make quickly upgrading to new hash algorithms and
  ciphers easier. This could be very useful when new attacks
  are published.

  One concern is that client preferences could expose the client
  to segmentation attacks. To mitigate this, we suggest hardcoding
  preferences in the client, to prevent the client from choosing
  to use a new hash algorithm or cipher that no one else is using
  yet. While offering a preference might be useful in case a client
  with an older version of Tor wants to start using the newer hash
  algorithm or cipher that everyone else is using, if the client
  cares enough, he or she can just upgrade Tor.

  We may also have to worry about nodes which, through laziness or
  maliciousness, refuse to start supporting new hash algorithms or
  ciphers. This must be balanced with the need to maintain
  backward compatibility so the client will have a large selection
  of nodes to pick from. Adding new hash algorithms and ciphers
  long before we suggest nodes start using them can help mitigate
  this. However, eventually, once sufficient nodes support new
  standards, client side support for older standards should be
  disabled, particularly if there are practical rather than merely
  theoretical attacks.

  Server side support for older standards can be kept much longer
  than client side support, since clients using older hashes and
  ciphers are really only hurting themselves.

  If server side support for a hash algorithm or cipher is added
  but never preferred before we decide we don't really want it,
  support can be removed without having to worry about backward
  compatibility.

Security implications:

  Improving cryptography will improve Tor's security. However, if
  clients pick different cryptographic standards, they could be
  partitioned based on their cryptographic preferences. We also
  need to worry about nodes refusing to support new standards.
  These issues are detailed above.

Specification:

  Todo. Need better understanding of how Tor currently works, or
  help from someone who does.

Compatibility:

  This idea is intended to allow easier upgrading of cryptographic
  hash algorithms and ciphers while maintaining backwards
  compatibility. However, at some point, backwards compatibility
  with very old hashes and ciphers should be dropped for security
  reasons.

Implementation:

  Todo.

Performance and scalability notes:

  Better hashes and ciphers are sometimes a little more CPU intensive
  than weaker ones. For instance, on most computers AES is a little
  faster than Twofish. However, in that example, I consider Twofish's
  additional security worth the tradeoff.

Acknowledgements:

  Discussed this on IRC with a few people, mostly Nick Mathewson.
  Nick was particularly helpful in explaining how Tor works,
  explaining goals, and providing various links to Tor
  specifications.
diff --git a/doc/spec/proposals/ideas/xxx-controllers-intercept-extends.txt b/doc/spec/proposals/ideas/xxx-controllers-intercept-extends.txt
deleted file mode 100644
index 76ba5c84b5..0000000000
--- a/doc/spec/proposals/ideas/xxx-controllers-intercept-extends.txt
+++ /dev/null
@@ -1,44 +0,0 @@
Author: Geoff Goodell
Title: Allow controller to manage circuit extensions
Date: 12 March 2006

History:

  This was once bug 268. Moving it into the proposal system for posterity.

Text:

Tor controllers should have a means of learning more about circuits built
through Tor routers. Specifically, if a Tor controller is connected to a Tor
router, it should be able to subscribe to a new class of events, perhaps
"onion" or "router" events. A Tor router SHOULD then ensure that the
controller is informed:

(a) (NEW) when it receives a connection from some other location, in which
case it SHOULD indicate (1) a unique identifier for the circuit, and (2) a
ServerID in the event of an OR connection from another Tor router, and a
Hostname otherwise;

(b) (REQUEST) when it receives a request to extend an existing circuit to a
successive Tor router, in which case it SHOULD provide (1) the unique
identifier for the circuit, (2) a Hostname (or, if possible, ServerID) of the
previous Tor router in the circuit, and (3) a ServerID for the requested
successive Tor router in the circuit;

(c) (EXTEND) when Tor will attempt to extend the circuit to some other
router, in which case it SHOULD provide the same fields as provided for
REQUEST;

(d) (SUCCEEDED) when the circuit has been successfully extended to some
other router, in which case it SHOULD provide the same fields as provided
for REQUEST.

We also need a new configuration option analogous to _leavestreamsunattached,
specifying whether the controller is to manage circuit extensions or not.
Perhaps we can call it "_leavecircuitsunextended". When set to 0, Tor
manages everything as usual. When set to 1, a circuit received by the Tor
router cannot transition from "REQUEST" to "EXTEND" state without being
directed by a new controller command. The controller command probably does
not need any arguments, since circuits are extended per client source
routing, and all that the controller does is accept or reject the extension.

This feature can be used as a basis for enforcing routing policy.
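A sketch of how a controller might gate the REQUEST -> EXTEND transition
under such an option. The event object and its approve/reject methods are
entirely hypothetical -- no such controller API exists; the event kinds
follow the list above:

  POLICY_ALLOWED = {"moria1", "tor26"}  # example ServerIDs we trust

  def handle_onion_event(event):
      if event.kind != "REQUEST":
          return  # NEW/EXTEND/SUCCEEDED are informational only
      if event.next_server_id in POLICY_ALLOWED:
          event.approve()  # circuit may move from REQUEST to EXTEND
      else:
          event.reject()   # extension is refused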
diff --git a/doc/spec/proposals/ideas/xxx-encrypted-services.txt b/doc/spec/proposals/ideas/xxx-encrypted-services.txt
deleted file mode 100644
index 3414f3c4fb..0000000000
--- a/doc/spec/proposals/ideas/xxx-encrypted-services.txt
+++ /dev/null
@@ -1,18 +0,0 @@

the basic idea might be to generate a keypair, and sign little statements
like "this key corresponds to this relay id", and publish them on karsten's
hs dht.

so if you want to talk to it, you look it up, then go to that exit.
and by 'go to' i mean 'build a tor circuit like normal except you're sure
where to exit'

connecting to it is slower than usual, but once you're connected, it's no
slower than normal tor.
and you get what wikileaks wants from its hidden service, which is really
just the UI piece.
indymedia also wants this.

might be interesting to let an encrypted service list more than one relay,
too.

diff --git a/doc/spec/proposals/ideas/xxx-exit-scanning-outline.txt b/doc/spec/proposals/ideas/xxx-exit-scanning-outline.txt
deleted file mode 100644
index d84094400a..0000000000
--- a/doc/spec/proposals/ideas/xxx-exit-scanning-outline.txt
+++ /dev/null
@@ -1,44 +0,0 @@
1. Scanning process
   A. Non-HTML/JS HTTP mime types compared via SHA1 hash
   B. Dynamic HTTP content filtered at 4 levels:
      1. IP change+Tor cookie utilization
         - Tor cookies replayed with new IP in case of changes
      2. HTML Tag+Attribute+JS comparison
         - Comparisons made based only on "relevant" HTML tags
           and attributes
      3. HTML Tag+Attribute+JS diffing
         - Tags, attributes and JS AST nodes that change during
           Non-Tor fetches pruned from comparison
      4. URLs with > N% of node failures removed
         - results purged from filesystem at end of scan loop
   C. SSL scanning handles some forms of dynamic certs
      1. Catalogs certs for all IPs resolved locally
         by getaddrinfo over the duration of the scan.
         - Updated each test.
      2. If the domain presents a new cert for each IP, this
         is noted on the failure result for the node
      3. If the same IP presents two different certs locally,
         the cert list is first refreshed, and if it happens
         again, discarded
      4. An N% node failure filter also applies
   D. Scanner can be restarted from any point in the event
      of scanner or system crashes, or graceful shutdown.
      - Results+scan state pickled to filesystem continuously
2. Cron job checks results periodically for reporting
   A. Divide failures into three types of BadExit based on type
      and frequency over time and incident rate
   B. Write reject lines to approved-routers for those three types:
      1. ID Hex based (for misconfig/network problems easily fixed)
      2. IP based (for content modification)
      3. IP+mask based (for continuous/egregious content modification)
   C. Emails results to tor-scanners@freehaven.net
3. Human Review and Appeal
   A. ID Hex-based BadExit is meant to be possible to remove easily
      without needing to beg us.
      - Should this behavior be encouraged?
   B. Optionally can reserve IP based badexits for human review
      1. Results are encapsulated fully on the filesystem and can be
         reviewed without network access
      2. Soat has --rescan to rescan failed nodes from a data directory
         - New set of URLs used
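Step 1.A above is a plain digest comparison. A minimal sketch, assuming
the two response bodies have already been fetched (the fetch helpers are
out of scope here):

  import hashlib

  def sha1_digest(body: bytes) -> str:
      return hashlib.sha1(body).hexdigest()

  def content_differs(direct_body: bytes, exit_body: bytes) -> bool:
      # Only sound for static mime types; dynamic HTML/JS needs the
      # tag/attribute/AST filtering described in 1.B.
      return sha1_digest(direct_body) != sha1_digest(exit_body)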
diff --git a/doc/spec/proposals/ideas/xxx-geoip-survey-plan.txt b/doc/spec/proposals/ideas/xxx-geoip-survey-plan.txt
deleted file mode 100644
index 49c6615a66..0000000000
--- a/doc/spec/proposals/ideas/xxx-geoip-survey-plan.txt
+++ /dev/null
@@ -1,137 +0,0 @@


Abstract

  This document explains how to estimate how many Tor users there
  are, and how many there are in which country. Statistics are
  involved.

Motivation

  There are a few reasons we need to keep track of which countries
  Tor users (in aggregate) are coming from:

  - Resource allocation. Knowing about underserved countries with
    lots of users can let us know where we need to direct
    translation and outreach efforts.

  - Anticensorship. Sudden drops in usage on a national basis can
    indicate the arrival of a censorious firewall.

  - Sponsor outreach and self-evaluation. Many people and
    organizations who are interested in funding The Tor Project's
    work want to know that we're successfully serving parts of the
    world they're interested in, and that efforts to expand our
    userbase are actually succeeding. So do we.

Goals

  We want to know approximately how many Tor users there are, and which
  countries they're in, even in the presence of a hypothetical
  "directory guard" feature. Some uncertainty is okay, but we'd like
  to be able to put a bound on the uncertainty.

  We need to make sure this information isn't exposed in a way that
  helps an adversary.

Methods for current clients:

  Every client downloads network status documents. There are
  currently three methods (one hypothetical) for clients to get them.

  - 0.1.2.x clients (and earlier) fetch a v2 networkstatus
    document about every NETWORKSTATUS_CLIENT_DL_INTERVAL [30
    minutes].

  - 0.2.0.x clients fetch a v3 networkstatus consensus document
    at a random interval between when their current document is no
    longer freshest, and when their current document is about to
    expire.

    [In both of the above cases, clients choose a running
    directory cache at random with odds roughly proportional to
    its bandwidth. If they're just starting, they know a XXXX FIXME -NM]

  - In some future version, clients will choose directory caches
    to serve as their "directory guards" to avoid profiling
    attacks, similarly to how clients currently start all their
    circuits at guard nodes.

  We assume that a directory cache can tell which of these three
  categories a client is in by the format of its status request.

  A directory cache can be made to count distinct client IP
  addresses that make a certain request of it in a given timeframe,
  and total requests made to it over that timeframe. For the first
  two cases, a cache can get a picture of the overall
  number and countries of users in the network by dividing the IP
  count by the probability with which they (as a cache) would be
  chosen. Assuming that our listed bandwidth is such that we expect
  to be chosen with probability P for any given request, and we've
  been counting IPs for long enough that we expect the average
  client to have made N requests, they will have visited us at least
  once with probability P' = 1-(1-P)^N, and so we divide the IP
  counts we've seen by P' for our estimate. To estimate the total
  number of clients of a given type, determine how many requests a
  client of that type will make over that time, and assume we'll
  have seen P of them.

  Both of these numbers are useful: the IP counts will give the
  total number of IPs connecting to the network, and the request
  counts will give the total number of users on the network at any
  given time.

  Notes:
  - [Over H hours, the N for V2 clients is 2*H, and the N for V3
    clients is currently around H/2 or H/3.]

  - (We should only count requests that we actually intend to answer;
    503 requests shouldn't count.)

  - These measurements should also be taken at a directory
    authority if possible: their picture of the network is skewed
    by clients that fetch from them directly. These clients,
    however, are all the clients that are just bootstrapping
    (assuming that the fallback-consensus feature isn't yet used
    much).

  - These measurements also overestimate the V2 download rate if
    some downloads fail and clients retry them later after backing
    off.
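  A worked example of the estimate above, expressed as code. The
  numbers in the comment are illustrative, not measurements:

    def estimate_total_clients(unique_ips_seen: int, p: float,
                               n: float) -> float:
        # A cache chosen with probability P per request, watching long
        # enough that the average client makes N requests, sees a given
        # client with probability P' = 1 - (1-P)**N, so unique-IP
        # counts are scaled up by 1/P'.
        p_seen = 1.0 - (1.0 - p) ** n
        return unique_ips_seen / p_seen

    # E.g. a cache picked 1% of the time, over a 24-hour window where a
    # v2 client makes N = 2*H = 48 requests: P' = 1 - 0.99**48 ~= 0.38,
    # so 3800 unique IPs suggest roughly 9,900 clients of that type.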
Methods for directory guards:

  If directory guards are in use, directory guards get a picture of
  all those users who chose them as a guard when they were listed
  as a good choice for a guard, and who are also on the network
  now. The cleanest data here will come from nodes that were listed
  as good new-guard choices for a while, and have not been so for a
  while longer (to study decay rates); nodes that have been listed
  as good new-guard choices consistently for a long time (to get a
  sample of the network); and nodes that have been listed as good
  new-guard choices only recently (to get a sample of new users and
  users whose guards have died out).

  Since directory guards are currently unspecified, we'll need to
  make some guesses about how they'll turn out to work. Here are
  a couple of approaches that could work.

  - We could have clients pick completely new directory guards on
    a rolling basis every two months or so. This would ensure
    that staying as a guard for a while would be sufficient to
    see a sample of users. This is potentially advantageous for
    load-balancing the network as well, though it might lose some
    of the benefits of directory guards. We need to quantify the
    impact of this; it might not actually make stuff worse in
    practice, if most guards don't stay good guards for a month
    or two.

  - We could try to collect statistics at several directory
    guards and combine their statistics, but we would need to make
    sure that for all time, at least one of the directory guards
    had been recommended as a good choice for new guards. By
    looking at new-IP rates for guards, we could get an idea of
    user uptake; by looking at old-IP decay rates, we could get
    an idea of turnover. This approach would entail significant
    complexity, and we'd probably need to record more information
    than we'd really like to.

diff --git a/doc/spec/proposals/ideas/xxx-grand-scaling-plan.txt b/doc/spec/proposals/ideas/xxx-grand-scaling-plan.txt
deleted file mode 100644
index 336798cc0f..0000000000
--- a/doc/spec/proposals/ideas/xxx-grand-scaling-plan.txt
+++ /dev/null
@@ -1,97 +0,0 @@

Right now as I understand it, there are n big scaling problems heading
our way:

1) Clients need to learn all the relay descriptors they could use. That's
a lot of bytes through a potentially small pipe.
2) Relays need to hold open TCP connections to most other relays.
3) Clients need to learn the whole networkstatus. Even using v3, as
the network grows that will become unwieldy.
4) Dir mirrors need to mirror all the relay descriptors; eventually this
will get big too.

Here's my plan.

--------------------------------------------------------------------

Piece one: download O(1) descriptors rather than O(n) descriptors.

We need to change our circuit extend protocol so it fetches a relay
descriptor at every 'extend' operation:
  - Client fetches networkstatus, picks guards, connects to one.
  - Client picks middle hop out of networkstatus, asks guard for
    its descriptor, then extends to it.
  - Client picks exit hop out of networkstatus, asks middle hop
    for its descriptor, then extends to it. Done.

The client needs to ask for the descriptor even if it already has a
copy, because otherwise we leak too much.
Also, the descriptor needs to
be padded to some large (but not too large) size to prevent the middle
hops from guessing about it.

The first step towards this is to instrument the current code to see
how much of a win this would actually be -- I am guessing it is already
a win even with the current number of descriptors.

We also would need to assign the 'Exit' flag more usefully, and make
clients pay attention to it when picking their last hop, since they
don't actually know the exit policies of the relays they're choosing from.

We also need to think harder about other implications -- for example,
a relay with a tiny exit policy won't get the Exit flag, and thus won't
ever get picked as an exit relay. Plus, our "enclave exit" model is out
the window unless we figure out a cool trick.

More generally, we'll probably want to compress the descriptors that we
send back; maybe 8k is a good upper bound? I wonder if we could ask for
several descriptors, and bundle back all of the ones that fit in the 8k?

We'd also want to put the load balancing weights into the networkstatus,
so clients can choose fast nodes more often without needing to see the
descriptors. This is a good opportunity for the authorities to be able
to put "more accurate" weights in if they learn to detect attacks. It
also means we should consider running automated audits to make sure the
authorities aren't trying to snooker everybody.

I'm aiming to get Peter Palfrader to tackle this problem in mid 2008,
but I bet he could use some help.

--------------------------------------------------------------------

Piece two: inter-relay communication uses UDP

If relays send packets to/from other relays via UDP, they don't need a
new descriptor for each such link. Thus we'll still need to keep state
for each link, but we won't max out on sockets.

Clearly a lot more work needs to be done here. Ian Goldberg has a student
who has been working on it, and if all goes well we'll be chipping in
some funding to continue that. Also, Camilo Viecco has been doing his
PhD thesis on it.

--------------------------------------------------------------------

Piece three: networkstatus documents get partitioned

While the authorities should be expected to be able to handle learning
about all the relays, there's no reason the clients or the mirrors need
to. Authorities should put a cap on the number of relays listed in a
single networkstatus, and split them when they get too big.

We'd need a good way to have each authority come to the same conclusion
about which partition a given relay goes into; one deterministic option
is sketched below.

Directory mirrors would then mirror all the relay descriptors in their
partition. This is compatible with 'piece one' above, since clients in
a given partition will only ask about descriptors in that partition.

More complex versions of this design would involve overlapping partitions,
but that would seem to start contradicting other parts of this proposal
right quick.

Nobody is working on this piece yet. It's hard to say when we'll need
it, but it would be nice to have some more thought on it before the week
that we need it.
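One hypothetical deterministic assignment rule (a sketch, not a design:
the shared key and SHA-256 are assumptions): hash each relay identity
with a key all authorities share, and take the leading bits as the
partition index.

  import hashlib
  import hmac

  def partition_for_relay(shared_key: bytes, identity_digest: bytes,
                          partition_bits: int) -> int:
      mac = hmac.new(shared_key, identity_digest, hashlib.sha256).digest()
      # Taking successive leading bits lets the partition count double
      # as the network grows, only ever splitting existing partitions.
      return int.from_bytes(mac, "big") >> (len(mac) * 8 - partition_bits)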
--------------------------------------------------------------------

diff --git a/doc/spec/proposals/ideas/xxx-hide-platform.txt b/doc/spec/proposals/ideas/xxx-hide-platform.txt
deleted file mode 100644
index ad19fb1fd4..0000000000
--- a/doc/spec/proposals/ideas/xxx-hide-platform.txt
+++ /dev/null
@@ -1,37 +0,0 @@
Filename: xxx-hide-platform.txt
Title: Hide Tor Platform Information
Author: Jacob Appelbaum
Created: 24-July-2008
Status: Draft

              Hiding Tor Platform Information

0.0 Introduction

The current Tor program publishes its specific Tor version and related OS
platform information. This information could be misused by an attacker.

0.1 Current Implementation

Currently, the Tor binary sends data that looks like the following:

  Tor 0.2.0.26-rc (r14597) on Darwin Power Macintosh
  Tor 0.1.2.19 on Windows XP Service Pack 3 [workstation] {terminal services,
  single user}

1.0 Suggested changes

It would be useful to allow a user to configure the disclosure of such
information. Such a change would be an option in the torrc file like so:

  HidePlatform Yes

1.1 Suggested default behavior in the future

If a user would like to disclose this information, they could configure
their Tor to do so:

  HidePlatform No

diff --git a/doc/spec/proposals/ideas/xxx-port-knocking.txt b/doc/spec/proposals/ideas/xxx-port-knocking.txt
deleted file mode 100644
index 85c27ec52d..0000000000
--- a/doc/spec/proposals/ideas/xxx-port-knocking.txt
+++ /dev/null
@@ -1,91 +0,0 @@
Filename: xxx-port-knocking.txt
Title: Port knocking for bridge scanning resistance
Author: Jacob Appelbaum
Created: 19-April-2009
Status: Draft

           Port knocking for bridge scanning resistance

0.0 Introduction

This document is a collection of ideas relating to improving scanning
resistance for private bridge relays. This is intended to stop
opportunistic network scanning and subsequent discovery of private bridge
relays.

0.1 Current Implementation

Currently private bridges are only hidden by their obscurity. If you know
a bridge IP address, the bridge can be detected trivially and added to a
block list.

0.2 Configuring an external port knocking program to control the firewall

It is currently possible for bridge operators to configure a port knocking
daemon that controls access to the incoming OR port. This is currently out
of scope for Tor and Tor configuration. This process requires the firewall
to know the current nodes in the Tor network.

1.0 Suggested changes

Private bridge operators should be able to configure a method of hiding
their relay. Only authorized users should be able to communicate with the
private bridge. This should be done with Tor and, if possible, without the
help of the firewall. It should be possible for a Tor user to enter a
secret key into Tor, or optionally Vidalia, on a per-bridge basis. This
secret key should be used to authenticate the bridge user to the private
bridge.

1.x Issues with low ports and bind() for ORPort

Tor opens low numbered ports during startup and then drops privileges. It
is no longer possible to rebind to those lower ports after they are closed.

1.x Issues with OS level packet filtering

Tor does not know about any OS level packet filtering. Currently there are
no packet filters that understand the Tor network in real time.

1.x Possible partitioning of users by bridge operator

Depending on implementation, it may be possible for bridge operators to
uniquely identify users.
This appears to be a general bridge issue when a
bridge operator uniquely deploys bridges per user.

2.0 Implementation ideas

This is a suggested set of methods for port knocking.

2.x Using SPA port knocking

Single Packet Authentication port knocking encodes all required data into a
single UDP packet. Improperly formatted packets may be simply discarded.
Properly formatted packets should be processed and appropriate actions
taken.

2.x Using DNS as a transport for SPA

It should be possible for Tor to bind to port 53 at startup and merely drop
all packets that are not valid. UDP does not require a response and invalid
packets will not trigger a response from Tor. With base32 encoding it
should be possible to encode SPA as valid DNS requests. This should allow
use of the public DNS infrastructure for authorization requests if desired.

2.x Ghetto firewalling with opportunistic connection closing

Until a user has authenticated with Tor, Tor only has a UDP listener. This
listener should never send data in response; it should only open an ORPort
when a user has successfully authenticated. After a user has authenticated
with Tor to open an ORPort, only users who have authenticated will be able
to use it. All other users, as identified by their IP address, will have
their connection closed before any data is sent or received. This should be
accomplished with an access policy. By default, the access policy should
block all access to the ORPort.

2.x Timing and reset of access policies

Access to the ORPort is sensitive. The bridge should remove any exceptions
to its access policy regularly when the ORPort is unused. Valid users
should reauthenticate if they do not use the ORPort within a given time
frame.

2.x Additional considerations

There are many. A format of the packet and the crypto involved is a good
start.
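To make the DNS-transport idea concrete, a minimal sketch of packing an
SPA payload into a query name. The payload format and cover domain are
hypothetical, and as noted above the crypto over the payload is still an
open question:

  import base64

  def spa_to_dns_name(payload: bytes, domain: str = "example.com") -> str:
      # Base32 keeps the name a syntactically valid DNS query; labels
      # are capped at 63 octets and the whole name at 255, so payloads
      # must stay small.
      encoded = base64.b32encode(payload).decode().rstrip("=").lower()
      labels = [encoded[i:i + 63] for i in range(0, len(encoded), 63)]
      return ".".join(labels + [domain])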
diff --git a/doc/spec/proposals/ideas/xxx-rate-limit-exits.txt b/doc/spec/proposals/ideas/xxx-rate-limit-exits.txt
deleted file mode 100644
index 81fed20af8..0000000000
--- a/doc/spec/proposals/ideas/xxx-rate-limit-exits.txt
+++ /dev/null
@@ -1,63 +0,0 @@

1. Overview

  We should rate limit the volume of stream creations at exits:

2.1. Per-circuit limits

  If a given circuit opens more than N streams in X seconds, further
  stream requests over the next Y seconds should fail with the reason
  'resourcelimit'. Clients will automatically notice this and switch to
  a new circuit.

  The goal is to limit the effects of port scans on a given exit relay,
  so the relay's ISP won't get hassled as much.

  First thoughts for parameters would be N=100 streams in X=5 seconds
  causes 30 seconds of fails; and N=300 streams in X=30 seconds causes
  30 seconds of fails.

  We could simplify by, instead of having a "for 30 seconds" parameter,
  just marking the circuit as forever failing new requests. (We don't want
  to just close the circuit because it may still have open streams on it.)

2.2. Per-destination limits

  If a given circuit opens more than N1 streams in X seconds to a single
  IP address, or all the circuits combined open more than N2 streams,
  then we should fail further attempts to reach that address for a while.

  The goal is to limit the abuse that Tor exit relays can dish out
  to a single target either for socket DoS or for web crawling, in
  the hopes of a) not triggering their automated defenses, and b) not
  making them upset at Tor. Hopefully these self-imposed bans will be
  much shorter-lived than bans or barriers put up by the websites.

3. Issues

3.1. Circuit-creation overload

  Making clients move to new circuits more often will cause more circuit
  creation requests.

3.2. How to pick the parameters?

  If we pick the numbers too low, then popular sites are effectively
  cut out of Tor. If we pick them too high, we don't do much good.

  Worse, picking them wrong isn't easy to fix, since the deployed Tor
  servers will ship with a certain set of numbers.

  We could put numbers (or "general settings") in the networkstatus
  consensus, and Tor exits would adapt more dynamically.

  We could also have a local config option about how aggressive this
  server should be with its parameters.

4. Client-side limitations

  Perhaps the clients should have built-in rate limits too, so they avoid
  harassing the servers by default?

  Tricky if we want to get Tor clients in use at large enclaves.
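The per-circuit limit in 2.1 amounts to a sliding-window counter per
circuit. A sketch using the first-guess parameters N=100, X=5, Y=30 from
the text (Python for illustration only; Tor itself is C):

  import time
  from collections import deque

  class CircuitStreamLimiter:
      def __init__(self, n=100, x=5.0, y=30.0):
          self.n, self.x, self.y = n, x, y
          self.times = deque()       # begin timestamps inside the window
          self.refuse_until = 0.0

      def allow_stream(self, now=None) -> bool:
          now = time.monotonic() if now is None else now
          if now < self.refuse_until:
              return False           # client sees END reason 'resourcelimit'
          while self.times and now - self.times[0] > self.x:
              self.times.popleft()   # drop begins older than X seconds
          self.times.append(now)
          if len(self.times) > self.n:
              self.refuse_until = now + self.y
              return False
          return True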
diff --git a/doc/spec/proposals/ideas/xxx-separate-streams-by-port.txt b/doc/spec/proposals/ideas/xxx-separate-streams-by-port.txt
deleted file mode 100644
index f26c1e580f..0000000000
--- a/doc/spec/proposals/ideas/xxx-separate-streams-by-port.txt
+++ /dev/null
@@ -1,59 +0,0 @@
Filename: xxx-separate-streams-by-port.txt
Title: Separate streams across circuits by destination port
Author: Robert Hogan
Created: 21-Oct-2008
Status: Draft

Here's a patch Robert Hogan wrote to use only one destination port per
circuit. It's based on a wishlist item Roger wrote, to never send AIM
usernames over the same circuit that we're hoping to browse anonymously
through. The remaining open question is: how many extra circuits does this
cause an ordinary user to create? My guess is not very many, but I'm wary
of putting this in until we have some better estimate. On the other hand,
not putting it in means that we have a known security flaw. Hm.

Index: src/or/or.h
===================================================================
--- src/or/or.h (revision 17143)
+++ src/or/or.h (working copy)
@@ -1874,6 +1874,7 @@

   uint8_t state; /**< Current status of this circuit. */
   uint8_t purpose; /**< Why are we creating this circuit? */
+  uint16_t service; /**< Port conn must have to use this circuit. */

   /** How many relay data cells can we package (read from edge streams)
    * on this circuit before we receive a circuit-level sendme cell asking
Index: src/or/circuituse.c
===================================================================
--- src/or/circuituse.c (revision 17143)
+++ src/or/circuituse.c (working copy)
@@ -62,10 +62,16 @@
       return 0;
   }

-  if (purpose == CIRCUIT_PURPOSE_C_GENERAL)
+  if (purpose == CIRCUIT_PURPOSE_C_GENERAL) {
     if (circ->timestamp_dirty &&
         circ->timestamp_dirty+get_options()->MaxCircuitDirtiness <= now)
       return 0;
+    /* If the circuit is dirty and used for services on another port,
+     * then it is not suitable. */
+    if (circ->service && conn->socks_request->port &&
+        (circ->service != conn->socks_request->port))
+      return 0;
+  }

   /* decide if this circ is suitable for this conn */

@@ -1351,7 +1357,9 @@
     if (connection_ap_handshake_send_resolve(conn) < 0)
       return -1;
   }
-
+  if (conn->socks_request->port
+      && (TO_CIRCUIT(circ)->purpose == CIRCUIT_PURPOSE_C_GENERAL))
+    TO_CIRCUIT(circ)->service = conn->socks_request->port;
   return 1;
 }

diff --git a/doc/spec/proposals/ideas/xxx-using-spdy.txt b/doc/spec/proposals/ideas/xxx-using-spdy.txt
deleted file mode 100644
index d733a84b69..0000000000
--- a/doc/spec/proposals/ideas/xxx-using-spdy.txt
+++ /dev/null
@@ -1,143 +0,0 @@
Filename: xxx-using-spdy.txt
Title: Using the SPDY protocol to improve Tor performance
Author: Steven J. Murdoch
Created: 03-Feb-2010
Status: Draft
Target:

1. Overview

   The SPDY protocol [1] is an alternative method for transferring
   web content over TCP, designed to improve efficiency and
   performance. A SPDY-aware browser can already communicate with
   a SPDY-aware web server over Tor, because this only requires a TCP
   stream to be set up. However, a SPDY-aware browser cannot
   communicate with a non-SPDY-aware web server. This proposal
   outlines how Tor could support this latter case, and why it
   may be good for performance.

2. Motivation

   About 90% of Tor traffic, by connection, is HTTP [2], but
   users report subjective performance to be poor. It would
   therefore be desirable to improve this situation. SPDY was
   designed to offer better performance than HTTP, in
   high-latency and/or low-bandwidth situations, and is therefore
   an option worth examining.

   If a user wishes to access a SPDY-enabled web server over Tor,
   all they need to do is to configure their SPDY-enabled browser
   (e.g. Google Chrome) to use Tor. However, there are few
   SPDY-enabled web servers, and even if there was high demand
   from Tor users, there would be little motivation for server
   operators to upgrade, for the benefit of only a small
   proportion of their users.

   The motivation of this proposal is to allow only the user to
   install a SPDY-enabled browser, and permit web servers to
   remain unmodified. Essentially, Tor would incorporate a proxy
   on the exit node, which communicates SPDY to the web browser
   and normal HTTP to the web server. This proxy would translate
   between the two transport protocols, and possibly perform
   other optimizations.

   SPDY currently offers five optimizations:

   1) Multiplexed streams:
      An unlimited number of resources can be transferred
      concurrently, over a single TCP connection.

   2) Request prioritization:
      The client can set a priority on each resource, to assist
      the server in re-ordering responses.

   3) Compression:
      Both HTTP header and resource content can be compressed.

   4) Server push:
      The server can offer the client resources which have not
      been requested, but which the server believes will be.

   5) Server hint:
      The server can suggest that the client request further
      resources, before the main content is transferred.

   Tor currently effectively implements (1), by being able to put
   multiple streams on one circuit. SPDY however requires fewer
   round-trips to do the same. The other features are not
   implemented by Tor. Therefore it is reasonable to expect that
   a HTTP <-> SPDY proxy may improve Tor performance, by some
   amount.

   The consequences on caching need to be considered carefully.
   Most of the optimizations SPDY offers have no effect because
   the existing HTTP cache control headers are transmitted without
   modification. Server push is more problematic, because here
   the server may push a resource that the client already has.

3. Design outline

   One way to implement the SPDY proxy is for Tor exit nodes to
   advertise this capability in their descriptor. The OP would
   then preferentially select these nodes when routing streams
   destined for port 80.

   Then, rather than sending the usual RELAY_BEGIN cell, the OP
   would send a RELAY_BEGIN_TRANSFORMED cell, with a parameter to
   indicate that the exit node should translate between SPDY and
   HTTP. The rest of the connection process would operate as
   usual.

   There would need to be some way of elegantly handling non-HTTP
   traffic which goes over port 80.

4. Implementation status

   SPDY is under active development and both the specification
   and implementations are in a state of flux. Initial
   experiments with Google Chrome in SPDY mode and server
   libraries indicate that more work is needed before they are
   production-ready. There is no indication that browsers other
   than Google Chrome will support SPDY (and no official
   statement as to whether Google Chrome will eventually enable
   SPDY by default).

   Implementing a full SPDY proxy would be non-trivial. Stream
   multiplexing and compression are supported by existing
   libraries and would be fairly simple to implement. Request
   prioritization would require some form of caching on the
   proxy side. Server push and server hint would require content
   parsing to identify resources which should be treated
   specially.

5. Security and policy implications

   A SPDY proxy would be a significant amount of code, and may
   pull in external libraries. This code will process potentially
   malicious data, both at the SPDY and HTTP sides. This proposal
   therefore increases the risk that exit nodes will be
   compromised by exploiting a bug in the proxy.

   This proposal would also be the first way in which Tor is
   modifying TCP stream data. Arguably this is still meta-data
   (HTTP headers), but there may be some concern that Tor should
   not be doing this.

   Torbutton only works with Firefox, but SPDY only works with
   Google Chrome. We should be careful not to recommend that
   users adopt a browser which harms their privacy in other ways.

6. Open questions:

   - How difficult would this be to implement?

   - How much performance improvement would it actually result in?

   - Is there some way to rapidly develop a prototype which would
     answer the previous question?

[1] SPDY: An experimental protocol for a faster web
    http://dev.chromium.org/spdy/spdy-whitepaper
[2] Shining Light in Dark Places: Understanding the Tor Network
    Damon McCoy, Kevin Bauer, Dirk Grunwald, Tadayoshi Kohno,
    Douglas Sicker
    http://www.cs.washington.edu/homes/yoshi/papers/Tor/PETS2008_37.pdf
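A sketch of the client-side half of the design outline in section 3:
prefer exits that advertise a hypothetical SPDY-proxy capability for
port-80 streams. The descriptor field and relay attributes here are
assumptions from this proposal, not existing Tor behavior:

  import random

  def weighted_choice(relays):
      # Bandwidth-weighted pick, as OPs already do for path selection.
      return random.choices(relays, weights=[r.bandwidth for r in relays],
                            k=1)[0]

  def pick_exit(candidates, dest_port):
      if dest_port == 80:
          spdy_exits = [r for r in candidates
                        if "SpdyProxy" in r.capabilities]
          if spdy_exits:
              return weighted_choice(spdy_exits)  # then RELAY_BEGIN_TRANSFORMED
      return weighted_choice(candidates)          # otherwise normal RELAY_BEGIN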
diff --git a/doc/spec/proposals/ideas/xxx-what-uses-sha1.txt b/doc/spec/proposals/ideas/xxx-what-uses-sha1.txt
deleted file mode 100644
index b3ca3eea5a..0000000000
--- a/doc/spec/proposals/ideas/xxx-what-uses-sha1.txt
+++ /dev/null
@@ -1,247 +0,0 @@
Filename: xxx-what-uses-sha1.txt
Title: Where does Tor use SHA-1 today?
Authors: Nick Mathewson, Marian
Created: 30-Dec-2008
Status: Meta


Introduction:

   Tor uses SHA-1 as a message digest. SHA-1 is showing its age:
   theoretical attacks for finding collisions against it get better
   every year or two, and it will likely be broken in practice before
   too long.

   According to smart crypto people, the SHA-2 functions (SHA-256, etc)
   share too much of SHA-1's structure to be very good. RIPEMD-160 is
   also based on flawed past hashes. Some people think other hash
   functions (e.g. Whirlpool and Tiger) are not as bad; most of these
   have not seen enough analysis to be used yet.

   Here is a 2006 paper about hash algorithms:
   http://www.sane.nl/sane2006/program/final-papers/R10.pdf

   (Todo: Ask smart crypto people.)

   By 2012, the NIST SHA-3 competition will be done, and with luck we'll
   have something good to switch to. But it's probably a bad idea to
   wait until 2012 to figure out _how_ to migrate to a new hash
   function, for two reasons:
      1) It's not inconceivable we'll want to migrate in a hurry
         some time before then.
      2) It's likely that migrating to a new hash function will
         require protocol changes, and it's easiest to make protocol
         changes backward compatible if we lay the groundwork in
         advance. It would suck to have to break compatibility with
         a big hard-to-test "flag day" protocol change.

   This document attempts to list everything Tor uses SHA-1 for today.
   This is the first step in getting all the design work done to switch
   to something else.

   This document SHOULD NOT be a clearinghouse of what to do about our
   use of SHA-1. That's better left for other individual proposals.


Why now?

   The recent publication of "MD5 considered harmful today: Creating a
   rogue CA certificate" by Alexander Sotirov, Marc Stevens, Jacob
   Appelbaum, Arjen Lenstra, David Molnar, Dag Arne Osvik, and Benne de
   Weger has reminded me that:

      * You can't rely on theoretical attacks to stay theoretical.
      * It's quite unpleasant when theoretical attacks become practical
        and public on days you were planning to leave for vacation.
      * Broken hash functions (which SHA-1 is not quite yet, AFAIU)
        should be dropped like hot potatoes. Failure to do so can make
        one look silly.


Triage

   How severe are these problems? Let's divide them into these
   categories, where H(x) is the SHA-1 hash of x:

      PREIMAGE -- find any x such that H(x) has a chosen value
          -- A SHA-1 usage that only depends on preimage
             resistance
          * Also SECOND PREIMAGE: given x, find a y not equal to
            x such that H(x) = H(y)
      COLLISION<role> -- A SHA-1 usage that depends on collision
          resistance, but the only party who could mount a
          collision-based attack is already in a trusted role
          (like a distribution signer or a directory authority).
      COLLISION -- find any x and y such that H(x) = H(y) -- A
          SHA-1 usage that depends on collision resistance
          and doesn't need the attacker to have any special keys.

   There is no need to put much effort into fixing PREIMAGE and SECOND
   PREIMAGE usages in the near-term: while there have been some
   theoretical results doing these attacks against SHA-1, they don't
   seem to be close to practical yet. To fix COLLISION<code-signing>
   usages is not too important either, since anyone who has the key to
   sign the code can mount far worse attacks. It would be good to fix
   COLLISION<authority> usages, since we try to resist bad authorities
   to a limited extent. The COLLISION usages are the most important
   to fix.
   Kelsey and Schneier published a theoretical second preimage attack
   against SHA-1 in 2005, so it would be a good idea to fix PREIMAGE
   and SECOND PREIMAGE usages after fixing COLLISION usages, or where
   fixes require minimal effort.

   http://www.schneier.com/paper-preimages.html

   Additionally, we need to consider the impact of a successful attack
   in each of these cases. SHA-1 collisions are still expensive even
   if recent results are verified, and anybody with the resources to
   compute one also has the resources to mount a decent Sybil attack.

   Let's be pessimistic, and not assume that producing collisions of
   a given format is actually any harder than producing collisions at
   all.


What Tor uses hashes for today:

1. Infrastructure.

   A. Our X.509 certificates are signed with SHA-1.
      COLLISION
   B. TLS uses SHA-1 (and MD5) internally to generate keys.
      PREIMAGE?
      * At least breaking SHA-1 and MD5 simultaneously is
        much more difficult than breaking either
        independently.
   C. Some of the TLS ciphersuites we allow use SHA-1.
      PREIMAGE?
   D. When we sign our code with GPG, it might be using SHA-1.
      COLLISION<code-signing>
      * GPG 1.4 and up have writing support for SHA-2 hashes.
        This blog has help for converting:
        http://www.schwer.us/journal/2005/02/19/sha-1-broken-and-gnupg-gpg/
   E. Our GPG keys might be authenticated with SHA-1.
      COLLISION<code-signing-key-signing>
   F. OpenSSL's random number generator uses SHA-1, I believe.
      PREIMAGE

2. The Tor protocol

   A. Everything we sign, we sign using SHA-1-based OAEP-MGF1.
      PREIMAGE?
   B. Our CREATE cell format uses SHA-1 for OAEP padding.
      PREIMAGE?
   C. Our EXTEND cells use SHA-1 to hash the identity key of the
      target server.
      COLLISION
   D. Our CREATED cells use SHA-1 to hash the derived key data.
      ??
   E. The data we use in CREATE_FAST cells to generate a key is the
      length of a SHA-1.
      NONE
   F. The data we send back in a CREATED/CREATED_FAST cell is the length
      of a SHA-1.
      NONE
   G. We use SHA-1 to derive our circuit keys from the negotiated g^xy
      value.
      NONE
   H. We use SHA-1 to derive the digest field of each RELAY cell, but
      that's used more as a checksum than as a strong digest.
      NONE
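   For concreteness, item 2.C hashes the same construction used for
   router fingerprints below: the SHA-1 of the DER-encoded (PKCS#1) RSA
   identity key. A sketch using the Python `cryptography` package; the
   key path is hypothetical:

     import hashlib
     from cryptography.hazmat.primitives import serialization

     def fingerprint(pem_path="identity-key.pem") -> str:
         with open(pem_path, "rb") as f:
             key = serialization.load_pem_public_key(f.read())
         der = key.public_bytes(
             encoding=serialization.Encoding.DER,
             format=serialization.PublicFormat.PKCS1,  # RSAPublicKey
         )
         return hashlib.sha1(der).hexdigest().upper()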
3. Directory services

   [All are COLLISION or COLLISION<authority>.]

   A. All signatures are generated on the SHA-1 of their corresponding
      documents, using PKCS1 padding.
      * In dir-spec.txt, section 1.3, it states:
        "SIGNATURE" Object contains a signature (using the signing key)
        of the PKCS1-padded digest of the entire document, taken from
        the beginning of the Initial item, through the newline after
        the Signature Item's keyword and its arguments."
        So our attacker, Malcom, could generate a collision for the hash
        that is signed. Thus, a second pre-image attack is possible.
        Vulnerable to a regular collision attack only if the key is
        stolen. If the key is stolen, Malcom could distribute two
        different copies of the document which have the same hash.
        Maybe useful for a partitioning attack?
   B. Router descriptors identify their corresponding extra-info
      documents by their SHA-1 digest.
      * A third party might use a second pre-image attack to generate a
        false extra-info document that has the same hash. The router
        itself might use a regular collision attack to generate multiple
        extra-info documents with the same hash, which might be useful
        for a partitioning attack.
   C. Fingerprints in router descriptors are taken using SHA-1.
      * The fingerprint must match the public key. Not sure what would
        happen if two routers had different public keys but the same
        fingerprint. There could perhaps be unpredictable behaviour.
   D. In router descriptors, routers in the same "Family" may be listed
      by server nicknames or hexdigests.
      * Does not seem critical.
   E. Fingerprints in authority certs are taken using SHA-1.
   F. Fingerprints in dir-source lines of votes and consensuses are taken
      using SHA-1.
   G. Networkstatuses refer to routers' identity keys and descriptors by
      their SHA-1 digests.
   H. Directory-signature lines identify which key is doing the signing
      by the SHA-1 digests of the authority's signing key and its
      identity key.
   I. The following items are downloaded by the SHA-1 of their contents:
      XXXX list them
   J. The following items are downloaded by the SHA-1 of an identity key:
      XXXX list them too.

4. The rendezvous protocol

   A. Hidden servers use SHA-1 to establish introduction points on
      relays, and relays use SHA-1 to check incoming introduction point
      establishment requests.
   B. Hidden servers use SHA-1 in multiple places when generating hidden
      service descriptors.
      * The permanent-id is the first 80 bits of the SHA-1 hash of the
        public key.
        ** time-period performs calculations using the permanent-id.
      * The secret-id-part is the SHA-1 hash of the time period, the
        descriptor-cookie, and the replica.
      * Hash of introduction point's identity key.
   C. Hidden servers performing basic-type client authorization for their
      services use SHA-1 when encrypting introduction points contained in
      hidden service descriptors.
   D. Hidden service directories use SHA-1 to check whether a given
      hidden service descriptor may be published under a given descriptor
      identifier or not.
   E. Hidden servers use SHA-1 to derive .onion addresses of their
      services.
      * What's worse, it only uses the first 80 bits of the SHA-1 hash.
        However, rend-spec.txt says we aren't worried about arbitrary
        collisions?
   F. Clients use SHA-1 to generate the current hidden service descriptor
      identifiers for a given .onion address.
   G. Hidden servers use SHA-1 to remember digests of the first parts of
      Diffie-Hellman handshakes contained in introduction requests in
      order to detect replays. See the RELAY_ESTABLISH_INTRO cell. We
      seem to be taking a hash of a hash here.
   H. Hidden servers use SHA-1 during the Diffie-Hellman key exchange
      with a connecting client.

5. The bridge protocol

   XXXX write me

   A. Client may attempt to query for bridges where he knows a digest
      (probably SHA-1) before a direct query.

6. The Tor user interface

   A. We log information about servers based on SHA-1 hashes of their
      identity keys.
      COLLISION
   B. The controller identifies servers based on SHA-1 hashes of their
      identity keys.
      COLLISION
   C. Nearly all of our configuration options that list servers allow
      SHA-1 hashes of their identity keys.
      COLLISION
   E. The deprecated .exit notation uses SHA-1 hashes of identity keys.
      COLLISION
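
   Returning to item 4.E above, the derivation is short enough to show
   in full: a .onion address is the base32 encoding of the permanent-id,
   i.e. the first 80 bits (10 bytes) of the SHA-1 of the service's
   DER-encoded public key.

     import base64
     import hashlib

     def onion_address(der_pubkey: bytes) -> str:
         digest = hashlib.sha1(der_pubkey).digest()
         # Only the first 80 bits survive, hence the truncation worry.
         return base64.b32encode(digest[:10]).decode().lower() + ".onion"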