Diffstat (limited to 'doc/spec/proposals/ideas')
18 files changed, 0 insertions, 2554 deletions
diff --git a/doc/spec/proposals/ideas/xxx-auto-update.txt b/doc/spec/proposals/ideas/xxx-auto-update.txt deleted file mode 100644 index dc9a857c1e..0000000000 --- a/doc/spec/proposals/ideas/xxx-auto-update.txt +++ /dev/null @@ -1,39 +0,0 @@ - -Notes on an auto updater: - -steve wants a "latest" symlink so he can always just fetch that. - -roger worries that this will exacerbate the "what version are you -using?" "latest." problem. - -weasel suggests putting the latest recommended version in dns. then -we don't have to hit the website. it's got caching, it's lightweight, -it scales. just put it in a TXT record or something. - -but, no dnssec. - -roger suggests a file on the https website that lists the latest -recommended version (or filename or url or something like that). - -(steve seems to already be doing this with xerobank. he additionally -suggests a little blurb that can be displayed to the user to describe -what's new.) - -how to verify you're getting the right file? -a) it's https. -b) ship with a signing key, and use some openssl functions to verify. -c) both - -andrew reminds us that we have a "recommended versions" line in the -consensus directory already. - -if only we had some way to point out the "latest stable recommendation" -from this list. we could list it first, or something. - -the recommended versions line also doesn't take into account which -packages are available -- e.g. on Windows one version might be the best -available, and on OS X it might be a different one. - -aren't there existing solutions to this? surely there is a beautiful, -efficient, crypto-correct auto updater lib out there. even for windows. - diff --git a/doc/spec/proposals/ideas/xxx-bridge-disbursement.txt b/doc/spec/proposals/ideas/xxx-bridge-disbursement.txt deleted file mode 100644 index 6c9a3c71ed..0000000000 --- a/doc/spec/proposals/ideas/xxx-bridge-disbursement.txt +++ /dev/null @@ -1,174 +0,0 @@ - -How to hand out bridges. 
- -Divide bridges into 'strategies' as they come in. Do this uniformly -at random for now. - -For each strategy, we'll hand out bridges in a different way to -clients. This document describes two strategies: email-based and -IP-based. - -0. Notation: - - HMAC(k,v) : an HMAC of v using the key k. - - A|B: The string A concatenated with the string B. - - -1. Email-based. - - Goal: bootstrap based on one or more popular email service's sybil - prevention algorithms. - - - Parameters: - HMAC -- an HMAC function - P -- a time period - K -- the number of bridges to send in a period. - - Setup: Generate two nonces, N and M. - - As bridges arrive, put them into a ring according to HMAC(N,ID) - where ID is the bridge's identity digest. - - Divide time into divisions of length P. - - When we get an email: - - If it's not from a supported email service, reject it. - - If we already sent a response to that email address (normalized) - in this period, send _exactly_ the same response. - - If it is from a supported service, generate X = HMAC(M,PS|E) where E - is the lowercased normalized email address for the user, and - where PS is the start of the current period. Send - the first K bridges in the ring after point X. - - [If we want to make sure that repeat queries are given exactly the - same results, then we can't let the ring change during the - time period. For a long time period like a month, that's quite a - hassle. How about instead just keeping a replay cache of addresses - that have been answered, and sending them a "sorry, you already got - your addresses for the time period; perhaps you should try these - other fine distribution strategies while you wait?" response? This - approach would also resolve the "Make sure you can't construct a - distinct address to match an existing one" note below. -RD] - - [I think, if we get a replay, we need to send back the same - answer as we did the first time, not say "try again." 
- Otherwise we need to worry that an attacker can keep people - from getting bridges by preemptively asking for them, - or that an attacker may force them to prove they haven't - gotten any bridges by asking. -NM] - - [While we're at it, if we do the replay cache thing and don't need - repeatable answers, we could just pick K random answers from the - pool. Is it beneficial that a bridge user who knows about a clump of - nodes will be sharing them with other users who know about a similar - (overlapping) clump? One good aspect is against an adversary who - learns about a clump this way and watches those bridges to learn - other users and discover *their* bridges: he doesn't learn about - as many new bridges as he might if they were randomly distributed. - A drawback is against an adversary who happens to pick two email - addresses in P that include overlapping answers: he can measure - the difference in clumps and estimate how quickly the bridge pool - is growing. -RD] - - [Random is one more darn thing to implement; rings are already - there. -NM] - - [If we make the period P be mailbox-specific, and make it a random - value around some mean, then we make it harder for an attacker to - know when to try using his small army of gmail addresses to gather - another harvest. But we also make it harder for users to know when - they can try again. -RD] - - [Letting the users know about when they can try again seems - worthwhile. Otherwise users and attackers will all probe and - probe and probe until they get an answer. No additional - security will be achieved, but bandwidth will be lost. -NM] - - To normalize an email address: - Start with the RFC822 address. Consider only the mailbox {???} - portion of the address (username@domain). Put this into lowercase - ascii. - - Questions: - What to do with weird character encodings? Look up the RFC. - - Notes: - Make sure that you can't force a single email address to appear - in lots of different ways. 
IOW, if nickm@freehaven.net and - NICKM@freehaven.net aren't treated the same, then I can get lots - more bridges than I should. - - Make sure you can't construct a distinct address to match an - existing one. IOW, if we treat nickm@X and nickm@Y as the same - user, then anybody can register nickm@Z and use it to tell which - bridges nickm@X got (or would get). - - Make sure that we actually check headers so we can't be trivially - used to spam people. - - -2. IP-based. - - Goal: avoid handing out all the bridges to users in a similar IP - space and time. - - Parameters: - - T_Flush -- how long it should take a user on a single network to - see a whole cluster of bridges. - - N_C - - K -- the number of bridges we hand out in response to a single - request. - - Setup: using an AS map or a geoip map or some other flawed input - source, divide IP space into "areas" such that surveying a large - collection of "areas" is hard. For v0, use /24 address blocks. - - Group areas into N_C clusters. - - Generate secrets L, M, N. - - Set the period P such that P*(bridges-per-cluster/K) = T_Flush. - Don't set P to greater than a week, or less than three hours. - - When we get a bridge: - - Based on HMAC(L,ID), assign the bridge to a cluster. Within each - cluster, keep the bridges in a ring based on HMAC(M,ID). - - [Should we re-sort the rings for each new time period, so the ring - for a given cluster is based on HMAC(M,PS|ID)? -RD] - - When we get a connection: - - If it's http, redirect it to https. - - Let area be the incoming IP network. Let PS be the current - period. Compute X = HMAC(N, PS|area). Return the next K bridges - in the ring after X. - - [Don't we want to compute C = HMAC(key, area) to learn what cluster - to answer from, and then X = HMAC(key, PS|area) to pick a point in - that ring? -RD] - - - Need to clarify that some HMACs are for rings, and some are for - partitions. How rings scale is clear. How do we grow the number of - partitions? 
Looking at successive bits from the HMAC output is one way. - -3. Open issues - - Denial of service attacks - A good view of network topology - -at some point we should learn some reliability stats on our bridges. when -we say above 'give out k bridges', we might give out 2 reliable ones and -k-2 others. we count around the ring the same way we do now, to find them. - diff --git a/doc/spec/proposals/ideas/xxx-bwrate-algs.txt b/doc/spec/proposals/ideas/xxx-bwrate-algs.txt deleted file mode 100644 index 757f5bc55e..0000000000 --- a/doc/spec/proposals/ideas/xxx-bwrate-algs.txt +++ /dev/null @@ -1,106 +0,0 @@ -# The following two algorithms - - -# Algorithm 1 -# TODO: Burst and Relay/Regular differentiation - -BwRate = Bandwidth Rate in Bytes Per Second -GlobalWriteBucket = 0 -GlobalReadBucket = 0 -Epoch = Token Fill Rate in seconds: suggest 50ms=.050 -SecondCounter = 0 -MinWriteBytes = Minimum amount bytes per write - -Every Epoch Seconds: - UseMinWriteBytes = MinWriteBytes - WriteCnt = 0 - ReadCnt = 0 - BytesRead = 0 - - For Each Open OR Conn with pending write data: - WriteCnt++ - For Each Open OR Conn: - ReadCnt++ - - BytesToRead = (BwRate*Epoch + GlobalReadBucket)/ReadCnt - BytesToWrite = (BwRate*Epoch + GlobalWriteBucket)/WriteCnt - - if BwRate/WriteCnt < MinWriteBytes: - # If we aren't likely to accumulate enough bytes in a second to - # send a whole cell for our connections, send partials - Log(NOTICE, "Too many ORCons to write full blocks. Sending short packets.") - UseMinWriteBytes = 1 - # Other option: We could switch to plan 2 here - - # Service each writable ORConn. If there are any partial writes, - # return remaining bytes from this epoch to the global pool - For Each Open OR Conn with pending write data: - ORConn->write_bucket += BytesToWrite - if ORConn->write_bucket > UseMinWriteBytes: - w = write(ORConn, MIN(len(ORConn->write_data), ORConn->write_bucket)) - # possible that w < ORConn->write_data here due to TCP pushback. 
- # We should restore the rest of the write_bucket to the global - # buffer - GlobalWriteBucket += (ORConn->write_bucket - w) - ORConn->write_bucket = 0 - - For Each Open OR Conn: - r = read_nonblock(ORConn, BytesToRead) - BytesRead += r - - SecondCounter += Epoch - if SecondCounter < 1: - # Save unused bytes from this epoch to be used later in the second - GlobalReadBucket += (BwRate*Epoch - BytesRead) - else: - SecondCounter = 0 - GlobalReadBucket = 0 - GlobalWriteBucket = 0 - For Each ORConn: - ORConn->write_bucket = 0 - - - -# Alternate plan for Writing fairly. Reads would still be covered -# by plan 1 as there is no additional network overhead for short reads, -# so we don't need to try to avoid them. -# -# I think this is actually pretty similar to what we do now, but -# with the addition that the bytes accumulate up to the second mark -# and we try to keep track of our position in the write list here -# (unless libevent is doing that for us already and I just don't see it) -# -# TODO: Burst and Relay/Regular differentiation - -# XXX: The inability to send single cells will cause us to block -# on EXTEND cells for low-bandwidth node pairs.. -BwRate = Bandwidth Rate in Bytes Per Second -WriteBytes = Bytes per write -Epoch = MAX(MIN(WriteBytes/BwRate, .333s), .050s) - -SecondCounter = 0 -GlobalWriteBucket = 0 - -# New connections are inserted at Head-1 (the 'tail' of this circular list) -# This is not 100% fifo for all node data, but it is the best we can do -# without insane amounts of additional queueing complexity. 
-WriteConnList = List of Open OR Conns with pending write data > WriteBytes -WriteConnHead = 0 - -Every Epoch Seconds: - GlobalWriteBucket += BwRate*Epoch - WriteListEnd = WriteConnHead - - do - ORConn = WriteConnList[WriteConnHead] - w = write(ORConn, WriteBytes) - GlobalWriteBucket -= w - WriteConnHead += 1 - while GlobalWriteBucket > 0 and WriteConnHead != WriteListEnd - - SecondCounter += Epoch - if SecondCounter >= 1: - SecondCounter = 0 - GlobalWriteBucket = 0 - - diff --git a/doc/spec/proposals/ideas/xxx-choosing-crypto-in-tor-protocol.txt b/doc/spec/proposals/ideas/xxx-choosing-crypto-in-tor-protocol.txt deleted file mode 100644 index e8489570f7..0000000000 --- a/doc/spec/proposals/ideas/xxx-choosing-crypto-in-tor-protocol.txt +++ /dev/null @@ -1,138 +0,0 @@ -Filename: xxx-choosing-crypto-in-tor-protocol.txt -Title: Picking cryptographic standards in the Tor wire protocol -Author: Marian -Created: 2009-05-16 -Status: Draft - -Motivation: - - SHA-1 is horribly outdated and not suited for security critical - purposes. SHA-2, RIPEMD-160, Whirlpool and Tiger are good options - for a short-term replacement, but in the long run, we will - probably want to upgrade to the winner or a semi-finalist of the - SHA-3 competition. - - For a 2006 comparison of different hash algorithms, read: - http://www.sane.nl/sane2006/program/final-papers/R10.pdf - - Other reading about SHA-1: - http://www.schneier.com/blog/archives/2005/02/sha1_broken.html - http://www.schneier.com/blog/archives/2005/08/new_cryptanalyt.html - http://www.schneier.com/paper-preimages.html - - Additionally, AES has been theoretically broken for years. While - the attack is still not efficient enough that the public sector - has been able to prove that it works, we should probably consider - the time between a theoretical attack and a practical attack as an - opportunity to figure out how to upgrade to a better algorithm, - such as Twofish. 
- - See: - http://schneier.com/crypto-gram-0209.html#1 - -Design: - - I suggest that nodes should publish in directories which - cryptographic standards, such as hash algorithms and ciphers, - they support. Clients communicating with nodes will then - pick whichever of those cryptographic standards they prefer - the most. In the case that the node does not publish which - cryptographic standards it supports, the client should assume - that the server supports the older standards, such as SHA-1 - and AES, until such time as we choose to desupport those - standards. - - Node to node communications could work similarly. However, in - case they both support a set of algorithms but have different - preferences, the disagreement would have to be resolved - somehow. Two possibilities include: - * the node requesting communications presents which - cryptographic standards it supports in the request. The - other node picks. - * both nodes send each other lists of what they support and - what version of Tor they are using. The newer node picks, - based on the assumption that the newer node has the most up - to date information about which hash algorithm is the best. - Of course, the node could lie about its version, but then - again, it could also maliciously choose only to support older - algorithms. - - Using this method, we could potentially add server side support - to hash algorithms and ciphers before we instruct clients to - begin preferring those hash algorithms and ciphers. In this way, - the clients could upgrade and the servers would already support - the newly preferred hash algorithms and ciphers, even if the - servers were still using older versions of Tor, so long as the - older versions of Tor were at least new enough to have server - side support. - - This would make quickly upgrading to new hash algorithms and - ciphers easier. This could be very useful when new attacks - are published. 
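The client-side selection rule described above (pick the client's most-preferred standard among those the node advertises, and assume only the legacy standards when the node publishes nothing) can be sketched as follows. This is an illustrative sketch only: `CLIENT_PREFERENCES`, `pick_hash`, and the algorithm labels are hypothetical names, not part of any Tor API.

```python
# Illustrative sketch of the client-side pick described above.
# CLIENT_PREFERENCES and the algorithm labels are hypothetical, not Tor names.

CLIENT_PREFERENCES = ["sha3-256", "sha256", "sha1"]  # hardcoded, most-preferred first

def pick_hash(advertised):
    """Pick the client's most-preferred hash among those the node advertises."""
    if not advertised:
        # Node published nothing: assume it only supports the legacy standard.
        return "sha1"
    for alg in CLIENT_PREFERENCES:
        if alg in advertised:
            return alg
    # No overlap at all: fall back to the legacy standard.
    return "sha1"
```

Note that the preference list is hardcoded rather than user-configurable, which is exactly the anti-partitioning mitigation this design argues for: a client cannot single itself out by choosing an algorithm nobody else uses yet.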
- - One concern is that client preferences could expose the client - to segmentation attacks. To mitigate this, we suggest hardcoding - preferences in the client, to prevent the client from choosing - to use a new hash algorithm or cipher that no one else is using - yet. While offering a preference might be useful in case a client - with an older version of Tor wants to start using the newer hash - algorithm or cipher that everyone else is using, if the client - cares enough, he or she can just upgrade Tor. - - We may also have to worry about nodes which, through laziness or - maliciousness, refuse to start supporting new hash algorithms or - ciphers. This must be balanced with the need to maintain - backward compatibility so the client will have a large selection - of nodes to pick from. Adding new hash algorithms and ciphers - long before we suggest nodes start using them can help mitigate - this. However, eventually, once sufficient nodes support new - standards, client side support for older standards should be - disabled, particularly if there are practical rather than merely - theoretical attacks. - - Server side support for older standards can be kept much longer - than client side support, since clients using older hashes and - ciphers are really only hurting themselves. - - If server side support for a hash algorithm or cipher is added - but never preferred before we decide we don't really want it, - support can be removed without having to worry about backward - compatibility. - -Security implications: - Improving cryptography will improve Tor's security. However, if - clients pick different cryptographic standards, they could be - partitioned based on their cryptographic preferences. We also - need to worry about nodes refusing to support new standards. - These issues are detailed above. - -Specification: - - Todo. Need better understanding of how Tor currently works or - help from someone who does. 
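The node-to-node case in the Design section leaves the tie-break to the newer node: both sides exchange their supported lists and versions, and the newer node's preference order wins. Under the assumption that versions compare as tuples, that rule could be sketched as follows (all names here are illustrative, not Tor code):

```python
# Hypothetical sketch of the "newer node picks" resolution from the Design
# section; version tuples and algorithm names are illustrative only.

def resolve(supported_a, version_a, supported_b, version_b):
    """Let the newer node's preference order decide; None if no overlap."""
    if version_a >= version_b:
        newer, other = supported_a, supported_b
    else:
        newer, other = supported_b, supported_a
    for alg in newer:          # the newer node's list is in preference order
        if alg in other:
            return alg
    return None                # no mutually supported algorithm
```

As the Design section notes, a node could lie about its version to win the tie-break, but a node willing to lie could equally well just advertise only old algorithms.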
- -Compatibility: - - This idea is intended to allow easier upgrading of cryptographic - hash algorithms and ciphers while maintaining backwards - compatibility. However, at some point, backwards compatibility - with very old hashes and ciphers should be dropped for security - reasons. - -Implementation: - - Todo. - -Performance and scalability notes: - - Better hashes and ciphers are sometimes a little more CPU intensive - than weaker ones. For instance, on most computers AES is a little - faster than Twofish. However, in that example, I consider Twofish's - additional security worth the tradeoff. - -Acknowledgements: - - Discussed this on IRC with a few people, mostly Nick Mathewson. - Nick was particularly helpful in explaining how Tor works, - explaining goals, and providing various links to Tor - specifications. diff --git a/doc/spec/proposals/ideas/xxx-controllers-intercept-extends.txt b/doc/spec/proposals/ideas/xxx-controllers-intercept-extends.txt deleted file mode 100644 index 76ba5c84b5..0000000000 --- a/doc/spec/proposals/ideas/xxx-controllers-intercept-extends.txt +++ /dev/null @@ -1,44 +0,0 @@ -Author: Geoff Goodell -Title: Allow controller to manage circuit extensions -Date: 12 March 2006 - -History: - - This was once bug 268. Moving it into the proposal system for posterity. - -Test: - -Tor controllers should have a means of learning more about circuits built -through Tor routers. Specifically, if a Tor controller is connected to a Tor -router, it should be able to subscribe to a new class of events, perhaps -"onion" or "router" events. A Tor router SHOULD then ensure that the -controller is informed: - -(a) (NEW) when it receives a connection from some other location, in which -case it SHOULD indicate (1) a unique identifier for the circuit, and (2) a -ServerID in the event of an OR connection from another Tor router, and -Hostname otherwise. 
- -(b) (REQUEST) when it receives a request to extend an existing circuit to a -successive Tor router, in which case it SHOULD provide (1) the unique -identifier for the circuit, (2) a Hostname (or, if possible, ServerID) of the -previous Tor router in the circuit, and (3) a ServerID for the requested -successive Tor router in the circuit; - -(c) (EXTEND) Tor will attempt to extend the circuit to some other router, in -which case it SHOULD provide the same fields as provided for REQUEST. - -(d) (SUCCEEDED) The circuit has been successfully extended to some other -router, in which case it SHOULD provide the same fields as provided for -REQUEST. - -We also need a new configuration option analogous to _leavestreamsunattached, -specifying whether the controller is to manage circuit extensions or not. -Perhaps we can call it "_leavecircuitsunextended". When set to 0, Tor -manages everything as usual. When set to 1, a circuit received by the Tor -router cannot transition from "REQUEST" to "EXTEND" state without being -directed by a new controller command. The controller command probably does -not need any arguments, since circuits are extended per client source -routing, and all that the controller does is accept or reject the extension. - -This feature can be used as a basis for enforcing routing policy. diff --git a/doc/spec/proposals/ideas/xxx-crypto-migration.txt b/doc/spec/proposals/ideas/xxx-crypto-migration.txt deleted file mode 100644 index 1c734229b8..0000000000 --- a/doc/spec/proposals/ideas/xxx-crypto-migration.txt +++ /dev/null @@ -1,384 +0,0 @@ - -Title: Initial thoughts on migrating Tor to new cryptography -Author: Nick Mathewson -Created: 12 December 2010 - -1. Introduction - - Tor currently uses AES-128, RSA-1024, and SHA1. 
Even though these - ciphers were a decent choice back in 2003, and even though attacking - these algorithms is by no means the best way for a well-funded - adversary to attack users (correlation attacks are still cheaper, even - with pessimistic assumptions about the security of each cipher), we - will want to move to better algorithms in the future. Indeed, if - migrating to a new ciphersuite were simple, we would probably have - already moved to RSA-2048/AES-128/SHA256 or something like that. - - So it's a good idea to start figuring out how we can move to better - ciphers. Unfortunately, this is a bit nontrivial, so before we start - doing the design work here, we should start by examining the issues - involved. Robert Ransom and I both decided to spend this weekend - writing up documents of this type so that we can see how much two - people working independently agree on. I know more Tor than Robert; - Robert knows far more cryptography than I do. With luck we'll - complement each other's work nicely. - - A note on scope: This document WILL NOT attempt to pick a new cipher - or set of ciphers. Instead, it's about how to migrate to new ciphers - in general. Any algorithms mentioned other than those we use today - are just for illustration. - - Also, I don't much consider the importance of updating each particular - usage; only the methods that you'd use to do it. - - Also, this isn't a complete proposal. - -2. General principles and tricks - - Before I get started, let's talk about some general design issues. - -2.1. Many algorithms or few? - - Protocols like TLS and OpenPGP allow a wide choice of cryptographic - algorithms; so long as the sender and receiver (or the responder and - initiator) have at least one mutually acceptable algorithm, they can - converge upon it and send each other messages. - - This isn't the best choice for anonymity designs. If two clients - support a different set of algorithms, then an attacker can tell them - apart. 
A protocol with N ciphersuites would in principle split - clients into 2**N-1 sets. (In practice, nearly all users will use the - default, and most users who choose _not_ to use the default will do so - without considering the loss of anonymity. See "Anonymity Loves - Company: Usability and the Network Effect".) - - On the other hand, building only one ciphersuite into Tor has a flaw - of its own: it has proven difficult to migrate to another one. So - perhaps instead of specifying only a single new ciphersuite, we should - specify more than one, with plans to switch over (based on a flag in - the consensus or some other secure signal) once the first choice of - algorithms start looking iffy. This switch-based approach would seem - especially easy for parameterizable stuff like key sizes. - -2.2. Waiting for old clients and servers to upgrade - - The easiest way to implement a shift in algorithms would be to declare - a "flag day": once we have the new versions of the protocols - implemented, pick a day by which everybody must upgrade to the new - software. Before this day, the software would have the old behavior; - after this day, it would use the improved behavior. - - Tor tries to avoid flag days whenever possible; they have well-known - issues. First, since a number of our users don't automatically - update, it can take a while for people to upgrade to new versions of - our software. Second and more worryingly, it's hard to get adequate - testing for new behavior that is off-by-default. Flag days in other - systems have been known to leave whole networks more or less - inoperable for months; we should not trust in our skill to avoid - similar problems. - - So if we're avoiding flag days, what can we do? - - * We can add _support_ for new behavior early, and have clients use it - where it's available. (Clients know the advertised versions of the - Tor servers they use-- but see 2.3 below for a danger here, and 2.4 - for a bigger danger.) 
- - * We can remove misfeatures that _prevent_ deployment of new - behavior. For instance, if a certain key length has an arbitrary - 1024-bit limit, we can remove that arbitrary limitation. - - * Once an optional new behavior is ubiquitous enough, the authorities - can stop accepting descriptors from servers that do not have it - until they upgrade. - - It is far easier to remove arbitrary limitations than to make other - changes; such changes are generally safe to back-port to older stable - release series. But in general, it's much better to avoid any plans - that require waiting for any version of Tor to no longer be in common - use: a stable release can take on the order of 2.5 years to start - dropping off the radar. Thandy might fix that, but even if a perfect - Thandy release comes out tomorrow, we'll still have lots of older - clients and relays not using it. - - We'll have to approach the migration problem on a case-by-case basis - as we consider the algorithms used by Tor and how to change them. - -2.3. Early adopters and other partitioning dangers - - It's pretty much unavoidable that clients running software that speaks - the new version of any protocol will be distinguishable from those - that cannot speak the new version. This is inevitable, though we - could try to minimize the number of such partitioning sets by having - features turned on in the same release rather than one-at-a-time. - - Another option here is to have new protocols controlled by a - configuration tri-state with values "on", "off", and "auto". The - "auto" value means to look at the consensus to decide whether to use - the feature; the other two values are self-explanatory. We'd ship - clients with the feature set to "auto" by default, with people only - using "on" for testing. - - If we're worried about early client-side implementations of a protocol - turning out to be broken, we can have the consensus value say _which_ - versions should turn on the protocol. - -2.4. 
Avoid whole-circuit switches - - One risky kind of protocol migration is a feature that gets used only - when all the routers in a circuit support it. If such a feature is - implemented by few relays, then each relay learns a lot about the rest - of the path by seeing it used. On the other hand, if the feature is - implemented by most relays, then a relay learns a lot about the rest of - the path when the feature is *not* used. - - It's okay to have a feature that can be only used if two consecutive - routers in the path support it: each router knows the ones adjacent - to it, after all, so knowing what version of Tor they're running is no - big deal. - -2.5. The Second System Effect rears its ugly head - - Any attempt at improving Tor's crypto is likely to involve changes - throughout the Tor protocol. We should be aware of the risks of - falling into what Fred Brooks called the "Second System Effect": when - redesigning a fielded system, it's always tempting to try to shovel in - every possible change that one ever wanted to make to it. - - This is a fine time to make parts of our protocol that weren't - previously versionable into ones that are easier to upgrade in the - future. This probably _isn't_ time to redesign every aspect of the - Tor protocol that anybody finds problematic. - -2.6. Low-hanging fruit and well-lit areas - - Not all parts of Tor are tightly coupled. If it's possible to upgrade - different parts of the system at different rates from one another, we - should consider doing the stuff we can do easier, earlier. - - But remember the story of the policeman who finds a drunk under a - streetlamp, staring at the ground? The cop asks, "What are you - doing?" The drunk says, "I'm looking for my keys!" "Oh, did you drop - them around here?" says the policeman. "No," says the drunk, "But the - light is so much better here!" - - Or less proverbially: Simply because a change is easiest, does not - mean it is the best use of our time. 
We should avoid getting bogged - down solving the _easy_ aspects of our system unless they happen also - to be _important_. - -2.7. Nice safe boring codes - - Let's avoid, to the extent that we can: - - being the primary user of any cryptographic construction or - protocol. - - anything that hasn't gotten much attention in the literature. - - anything we would have to implement from scratch - - anything without a nice BSD-licensed C implementation - - Sometimes we'll have the choice of a more efficient algorithm or a - more boring & well-analyzed one. We should not even consider trading - conservative design for efficiency unless we are firmly in the - critical path. - -2.8. Key restrictions - - Our spec says that RSA exponents should be 65537, but our code never - checks for that. If we want to bolster resistance against collision - attacks, we could check this requirement. To the best of my - knowledge, nothing violates it except for tools like "shallot" that - generate cute memorable .onion names. If we want to be nice to - shallot users, we could check the requirement for everything *except* - hidden service identity keys. - -3. Aspects of Tor's cryptography, and thoughts on how to upgrade them all - -3.1. Link cryptography - - Tor uses TLS for its link cryptography; it is easy to add more - ciphersuites to the acceptable list, or increase the length of - link-crypto public keys, or increase the length of the DH parameter, - or sign the X509 certificates with any digest algorithm that OpenSSL - clients will support. Current Tor versions do not check any of these - against expected values. - - The identity key used to sign the second certificate in the current - handshake protocol, however, is harder to change, since it needs to - match up with what we see in the router descriptor for the router - we're connecting to. See notes on router identity below. 
So long as - the certificate chain is ultimately authenticated by a RSA-1024 key, - it's not clear whether making the link RSA key longer on its own - really improves matters or not. - - Recall also that for anti-fingerprinting reasons, we're thinking of - revising the protocol handshake sometime in the 0.2.3.x timeframe. - If we do that, that might be a good time to make sure that we aren't - limited by the old identity key size. - -3.2. Circuit-extend crypto - - Currently, our code requires RSA onion keys to be 1024 bits long. - Additionally, current nodes will not deliver an EXTEND cell unless it - is the right length. - - For this, we might add a second, longer onion-key to router - descriptors, and a second CREATE2 cell to open new circuits - using this key type. It should contain not only the onionskin, but - also information on onionskin version and ciphersuite. Onionskins - generated for CREATE2 cells should use a larger DH group as well, and - keys should be derived from DH results using a better digest algorithm. - - We should remove the length limit on EXTEND cells, backported to all - supported stable versions; call these "EXTEND2" cells. Call these - "lightly patched". Clients could use the new EXTEND2/CREATE2 format - whenever using a lightly patched or new server to extend to a new - server, and the old EXTEND/CREATE format otherwise. - - The new onion skin format should try to avoid the design oddities of - our old one. Instead of its current iffy hybrid encryption scheme, it - should probably do something more like a BEAR/LIONESS operation with a - fixed key on the g^x value, followed by a public key encryption on the - start of the encrypted data. (Robert reminded me about this - construction.) - - The current EXTEND cell format ends with a router identity - fingerprint, which is used by the extended-from router to authenticate - the extended-to router when it connects. 
Changes to this will
- interact with changes to how long an identity key can be and to the
- link protocol; see notes on the link protocol above and about router
- identity below.
-
-3.2.1. Circuit-extend crypto: fast case
-
- When we do unauthenticated circuit extends with CREATE/CREATED_FAST,
- the two input values are combined with SHA1. I believe that's okay;
- using any entropy here at all is overkill.
-
-3.3. Relay crypto
-
- Upon receiving relay cells, a router transforms the payload portion
- of the cell with the appropriate key, sees if it recognizes the cell
- (the recognized field is zero, the digest field is correct, the cell
- is outbound), and passes it on if not. It is possible for each hop
- in the circuit to handle the relay crypto differently; nobody but
- the client and the hop in question need to coordinate their
- operations.
-
- It's not clear, though, whether updating the relay crypto algorithms
- would help anything, unless we changed the whole relay cell processing
- format too. The stream cipher is good enough, and 4 bytes of digest
- do not provide enough bits of cryptographic strength, no matter what
- cipher we use.
-
- This is the likeliest area for the second-system effect to strike;
- there are lots of opportunities to try to be more clever than we are
- now.
-
-3.4. Router identity
-
- This is one of the hardest things to change. Right now, routers are
- identified by a "fingerprint" equal to the SHA1 hash of their 1024-bit
- identity key as given in their router descriptor. No existing Tor
- will accept any other size of identity key, or any other hash
- algorithm.
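The fingerprint computation is concrete enough to sketch. A minimal illustration, assuming `identity_key_der` holds the DER-encoded identity key bytes as they appear in the router descriptor (the helper name and dummy bytes are hypothetical):

```python
import hashlib

def router_fingerprint(identity_key_der: bytes) -> str:
    """SHA-1 hash of the DER-encoded identity key, as uppercase hex.

    This is the only identity format existing Tors accept; swapping in
    another hash or key size breaks every place the fingerprint is used.
    """
    return hashlib.sha1(identity_key_der).hexdigest().upper()

# 40 hex characters = 160 bits, the size assumed throughout the codebase.
fp = router_fingerprint(b"\x30\x81\x89" + b"\x00" * 137)  # dummy DER bytes
```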
The identity key itself is used:
- - To sign the router descriptors
- - To sign link-key certificates
- - To determine the least significant bits of circuit IDs used on a
- Tor instance's links (see tor-spec §5.1)
-
- The fingerprint is used:
- - To identify a router identity key in EXTEND cells
- - To identify a router identity key in bridge lines
- - Throughout the controller interface
- - To fetch bridge descriptors for a bridge
- - To identify a particular router throughout the codebase
- - In the .exit notation.
- - By the controller to identify nodes
- - To identify servers in the logs
- - Probably other places too
-
- To begin to allow other key types, key lengths, and hash functions, we
- would either need to wait till all current Tors are obsolete, or allow
- routers to have more than one identity for a while.
-
- To allow routers to have more than one identity, we need to
- cross-certify identity keys. We can do this trivially, in theory, by
- listing both keys in the router descriptor and having both identities
- sign the descriptor. In practice, we will need to analyze this pretty
- carefully to avoid attacks where one key is completely fake, aimed at
- tricking old clients somehow.
-
- Upgrading the hash algorithm once would be easy: just say that all
- new-type keys get hashed using the new hash algorithm. Remaining
- future-proof could be tricky.
-
- This is one of the hardest areas to update; "SHA1 of identity key" is
- assumed in so many places throughout Tor that we'll probably need a
- lot of design work to work with something else.
-
-3.5. Directory objects
-
- Fortunately, the problem is not so bad for consensuses themselves,
- because:
- - Authority identity keys are allowed to be RSA keys of any length;
- in practice I think they are all 3072 bits.
- - Authority signing keys are also allowed to be of any length.
- AFAIK the code works with longer signing keys just fine.
- - Currently, votes are hashed with both sha1 and sha256; adding
- more hash algorithms isn't so hard.
- - Microdescriptor consensuses are all signed using sha256. While
- regular consensuses are signed using sha1, exploitable collisions
- are hard to come up with, since once you had a collision, you
- would need to get a majority of other authorities to agree to
- generate it.
-
- Router descriptors are currently identified by SHA1 digests of their
- identity keys and descriptor digests in regular consensuses, and by
- SHA1 digests of identity keys and SHA256 digests of microdescriptors
- in microdesc consensuses. The consensus-flavors design allows us to
- generate new flavors of consensus that identify routers by new hashes
- of their identity keys. Alternatively, existing consensuses could be
- expanded to contain more hashes, though that would have some space
- concerns.
-
- Router descriptors themselves are signed using RSA-1024 identity keys
- and SHA1. For information on updating identity keys, see above.
-
- Router descriptors and extra-info documents cross-certify one another
- using SHA1.
-
- Microdescriptors are currently specified to contain exactly one
- onion key, of length 1024 bits.
-
-3.6. The directory protocol
-
- Most objects are indexed by SHA1 hash of an identity key or a
- descriptor object. Adding more hash types wouldn't be a huge problem
- at the directory cache level.
-
-3.7. The hidden service protocol
-
- Hidden services self-identify by a 1024-bit RSA key. Other key
- lengths are not supported. This key is turned into an 80 bit half
- SHA-1 hash for hidden service names.
-
- The simplest change here would be to set an interface for putting
- the whole ugly SHA1 hash in the hidden service name. Remember that
- this needs to coexist with the authentication system which also uses
- .onion hostnames, and that hostnames top out around 255 characters
- and their components top out at 63.
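That 80-bit half-hash naming scheme can be sketched as follows; `service_key_der` stands in for the DER-encoded 1024-bit RSA service key (a hypothetical variable, with dummy bytes used here):

```python
import base64
import hashlib

def onion_address(service_key_der: bytes) -> str:
    """First 80 bits (10 bytes) of the key's SHA-1, base32-encoded."""
    half_digest = hashlib.sha1(service_key_der).digest()[:10]
    return base64.b32encode(half_digest).decode("ascii").lower() + ".onion"

addr = onion_address(b"dummy DER-encoded service key")
# 80 bits of base32 is 16 characters; even a whole SHA-1 hash (32 base32
# characters) would still fit under the 63-character component limit.
```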
- - Currently, ESTABLISH_INTRO cells take a key length parameter, so in
- theory they allow longer keys. The rest of the protocol assumes that
- this will be hashed into a 20-byte SHA1 identifier. Changing that
- would require changes at the introduction point as well as the hidden
- service.
-
- The parsing code for hidden service descriptors currently enforces a
- 1024-bit identity key, though this does not seem to be described in
- the specification. Changing that would be at least as hard as doing
- it for regular identity keys.
-
- Fortunately, hidden services are nearly completely orthogonal to
- everything else.
-
diff --git a/doc/spec/proposals/ideas/xxx-crypto-requirements.txt b/doc/spec/proposals/ideas/xxx-crypto-requirements.txt
deleted file mode 100644
index 8a8943a42f..0000000000
--- a/doc/spec/proposals/ideas/xxx-crypto-requirements.txt
+++ /dev/null
@@ -1,72 +0,0 @@
-Title: Requirements for Tor's circuit cryptography
-Author: Robert Ransom
-Created: 12 December 2010
-
-Overview
-
- This draft is intended to specify the meaning of 'secure' for a Tor
- circuit protocol, hopefully in enough detail that
- mathematically-inclined cryptographers can use this definition to
- prove that a Tor circuit protocol (or component thereof) is secure
- under reasonably well-accepted assumptions.
-
- Tor's current circuit protocol consists of the CREATE, CREATED, RELAY,
- DESTROY, CREATE_FAST, CREATED_FAST, and RELAY_EARLY cells (including
- all subtypes of RELAY and RELAY_EARLY cells). Tor currently has two
- circuit-extension handshake protocols: one consists of the CREATE and
- CREATED cells; the other, used only over the TLS connection to the
- first node in a circuit, consists of the CREATE_FAST and CREATED_FAST
- cells.
-
-Requirements
-
- 1.
Every circuit-extension handshake protocol must provide forward - secrecy -- the protocol must allow both the client and the relay to - destroy, immediately after a circuit is closed, enough key material - that no attacker who can eavesdrop on all handshake and circuit cells - and who can seize and inspect the client and relay after the circuit - is closed will be able to decrypt any non-handshake data sent along - the circuit. - - In particular, the protocol must not require that a key which can be - used to decrypt non-handshake data be stored for a predetermined - period of time, as such a key must be written to persistent storage. - - 2. Every circuit-extension handshake protocol must specify what key - material must be used only once in order to allow unlinkability of - circuit-extension handshakes. - - 3. Every circuit-extension handshake protocol must authenticate the relay - to the client -- an attacker who can eavesdrop on all handshake and - circuit cells and who can participate in handshakes with the client - must not be able to determine a symmetric session key that a circuit - will use without either knowing a secret key corresponding to a - handshake-authentication public key published by the relay or breaking - a cryptosystem for which the relay published a - handshake-authentication public key. - - 4. Every circuit-extension handshake protocol must ensure that neither - the client nor the relay can cause the handshake to result in a - predetermined symmetric session key. - - 5. 
Every circuit-extension handshake protocol should ensure that an
- attacker who can predict the relay's ephemeral secret input to the
- handshake and can eavesdrop on all handshake and circuit cells, but
- who does not know a secret key corresponding to the
- handshake-authentication public key used in the handshake, cannot
- break the handshake-authentication public key's cryptosystem, and
- cannot predict the client's ephemeral secret input to the handshake,
- cannot predict the symmetric session keys used for the resulting
- circuit.
-
- 6. The circuit protocol must specify an end-to-end flow-control
- mechanism, and must allow for the addition of new mechanisms.
-
- 7. The circuit protocol should specify the statistics to be exchanged
- between circuit endpoints in order to support end-to-end flow control,
- and should specify how such statistics can be verified.
-
-
- 8. The circuit protocol should allow an endpoint to verify that the other
- endpoint is participating in an end-to-end flow-control protocol
- honestly.
diff --git a/doc/spec/proposals/ideas/xxx-draft-spec-for-TLS-normalization.txt b/doc/spec/proposals/ideas/xxx-draft-spec-for-TLS-normalization.txt
deleted file mode 100644
index 16484e6375..0000000000
--- a/doc/spec/proposals/ideas/xxx-draft-spec-for-TLS-normalization.txt
+++ /dev/null
@@ -1,360 +0,0 @@
-Filename: xxx-draft-spec-for-TLS-normalization.txt
-Title: Draft spec for TLS certificate and handshake normalization
-Author: Jacob Appelbaum, Gladys Shufflebottom
-Created: 16-Feb-2011
-Status: Draft
-
-
- Draft spec for TLS certificate and handshake normalization
-
-
- Overview
-
-Scope
-
-This document proposes improvements to Tor's current TLS (Transport Layer
-Security) certificates and handshake that will reduce the distinguishability
-of Tor traffic from other encrypted traffic that uses TLS. It also
-addresses some of the fingerprinting attacks possible against the current
-Tor TLS protocol setup process.
-
-Motivation and history
-
-Censorship is an arms race and this is a step forward in the defense
-of Tor. This proposal outlines ideas to make it more difficult to
-fingerprint and block Tor traffic.
-
-Goals
-
-This proposal intends to normalize or remove easy-to-predict or static
-values in the Tor TLS certificates and in the Tor TLS setup process.
-These values can be used as criteria for the automated classification of
-encrypted traffic as Tor traffic. Network observers should not be able
-to trivially detect Tor merely by receiving or observing the certificate
-used or advertised by a Tor relay. I also propose the creation of
-a hard-to-detect covert channel through which a server can signal that it
-supports the third version ("V3") of the Tor handshake protocol.
-
-Non-Goals
-
-This document is not intended to solve all of the possible active or passive
-Tor fingerprinting problems. This document focuses on removing distinctive
-and predictable features of TLS protocol negotiation; we do not attempt to
-make guarantees about resisting other kinds of fingerprinting of Tor
-traffic, such as fingerprinting techniques related to timing or volume of
-transmitted data.
-
- Implementation details
-
-
-Certificate Issues
-
-The CN or commonName ASN1 field
-
-Tor generates certificates with a predictable commonName field; the
-field is within a given range of values that is specific to Tor.
-Additionally, the generated host names have other undesirable properties.
-The host names typically do not resolve in the DNS because the domain
-names referred to are generated at random. Although they are syntactically
-valid, they usually refer to domains that have never been registered by
-any domain name registrar.
- -An example of the current commonName field: CN=www.s4ku5skci.net - -An example of OpenSSL’s asn1parse over a typical Tor certificate: - - 0:d=0 hl=4 l= 438 cons: SEQUENCE - 4:d=1 hl=4 l= 287 cons: SEQUENCE - 8:d=2 hl=2 l= 3 cons: cont [ 0 ] - 10:d=3 hl=2 l= 1 prim: INTEGER :02 - 13:d=2 hl=2 l= 4 prim: INTEGER :4D3C763A - 19:d=2 hl=2 l= 13 cons: SEQUENCE - 21:d=3 hl=2 l= 9 prim: OBJECT :sha1WithRSAEncryption - 32:d=3 hl=2 l= 0 prim: NULL - 34:d=2 hl=2 l= 35 cons: SEQUENCE - 36:d=3 hl=2 l= 33 cons: SET - 38:d=4 hl=2 l= 31 cons: SEQUENCE - 40:d=5 hl=2 l= 3 prim: OBJECT :commonName - 45:d=5 hl=2 l= 24 prim: PRINTABLESTRING :www.vsbsvwu5b4soh4wg.net - 71:d=2 hl=2 l= 30 cons: SEQUENCE - 73:d=3 hl=2 l= 13 prim: UTCTIME :110123184058Z - 88:d=3 hl=2 l= 13 prim: UTCTIME :110123204058Z - 103:d=2 hl=2 l= 28 cons: SEQUENCE - 105:d=3 hl=2 l= 26 cons: SET - 107:d=4 hl=2 l= 24 cons: SEQUENCE - 109:d=5 hl=2 l= 3 prim: OBJECT :commonName - 114:d=5 hl=2 l= 17 prim: PRINTABLESTRING :www.s4ku5skci.net - 133:d=2 hl=3 l= 159 cons: SEQUENCE - 136:d=3 hl=2 l= 13 cons: SEQUENCE - 138:d=4 hl=2 l= 9 prim: OBJECT :rsaEncryption - 149:d=4 hl=2 l= 0 prim: NULL - 151:d=3 hl=3 l= 141 prim: BIT STRING - 295:d=1 hl=2 l= 13 cons: SEQUENCE - 297:d=2 hl=2 l= 9 prim: OBJECT :sha1WithRSAEncryption - 308:d=2 hl=2 l= 0 prim: NULL - 310:d=1 hl=3 l= 129 prim: BIT STRING - -I propose that we match OpenSSL's default self-signed certificates. I hypothesise -that they are the most common self-signed certificates. If this turns out not -to be the case, then we should use whatever the most common turns out to be. - -Certificate serial numbers - -Currently our generated certificate serial number is set to the number of -seconds since the epoch at the time of the certificate's creation. I propose -that we should ensure that our serial numbers are unrelated to the epoch, -since the generation methods are potentially recognizable as Tor-related. 
-
-Instead, I propose that we use a randomly generated number that is
-subsequently hashed with SHA-512 and then truncated to eight bytes[1].
-
-Random sixteen-byte values appear to be the high bound for serial numbers
-as issued by Verisign and DigiCert. RapidSSL serial numbers appear to be
-three bytes in length. Other common byte lengths appear to be between one
-and four bytes. The default OpenSSL certificates are eight bytes and we
-should use this length with our self-signed certificates.
-
-This randomly generated serial number field may now serve as a covert channel
-that signals to the client that the OR will not support TLS renegotiation; this
-means that the client can expect to perform a V3 TLS handshake setup.
-Otherwise, if the serial number is a reasonable time since the epoch, we should
-assume the OR is using an earlier protocol version and hence that it expects
-renegotiation.
-
-We also have a need to signal properties with our certificates for a possible
-v3 handshake in the future. Therefore I propose that we match OpenSSL default
-self-signed certificates (a 64-bit random number), but reserve the two least-
-significant bits for signaling. For the moment, these two bits will be zero.
-
-This means that an attacker may be able to identify Tor certificates from default
-OpenSSL certificates with a 75% probability.
-
-As a security note, care must be taken to ensure that supporting this
-covert channel will not lead to an attacker having a method to downgrade client
-behavior. This shouldn't be a risk because the TLS Finished message hashes over
-all the bytes of the handshake, including the certificates.
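The proposed serial-number generation might look like the following sketch: a randomly seeded SHA-512 digest truncated to eight bytes, with the two reserved low-order signaling bits cleared as proposed (function name is illustrative):

```python
import hashlib
import os

def tor_serial_number() -> int:
    """64-bit serial: SHA-512 over random input, truncated to eight
    bytes, with the two least-significant bits reserved for signaling
    (currently zero, per the proposal)."""
    digest = hashlib.sha512(os.urandom(32)).digest()
    serial = int.from_bytes(digest[:8], "big")
    return serial & ~0b11  # clear the two reserved signaling bits

s = tor_serial_number()
# Aside from the two zeroed bits, such a serial should look like
# OpenSSL's default 64-bit random serial (hence the 75% figure above).
```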
-
-Certificate fingerprinting issues expressed as base64 encoding
-
-It appears that all deployed Tor certificates have the following strings in
-common:
-
-MIIB
-CCA
-gAwIBAgIETU
-ANBgkqhkiG9w0BAQUFADA
-YDVQQDEx
-3d3cu
-
-As expected these values correspond to specific ASN.1 OBJECT IDENTIFIER (OID)
-properties (sha1WithRSAEncryption, commonName, etc) of how we generate our
-certificates.
-
-As an illustrated example of the common bytes of all certificates used within
-the Tor network within a single one hour window, I have replaced the actual
-value with a wild card ('.') character here:
-
------BEGIN CERTIFICATE-----
-MIIB..CCA..gAwIBAgIETU....ANBgkqhkiG9w0BAQUFADA.M..w..YDVQQDEx.3
-d3cu............................................................
-................................................................
-................................................................
-................................................................
-................................................................
-................................................................
-................................................................
-................................................................
-........................... <--- Variable length and padding
------END CERTIFICATE-----
-
-This fine ascii art only illustrates the bytes that absolutely match in all
-cases. In many cases, it's likely that there is a high probability for a given
-byte to be only a small subset of choices.
-
-Using the above strings, the EFF's certificate observatory may trivially
-discover all known relays, known bridges and unknown bridges in a single SQL
-query. I propose that we test our certificates to ensure that they do not
-have these kinds of statistical similarities unless they also overlap with a
-very large cross section of the internet's certificates.
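The statistical-similarity test proposed here could start as simply as comparing byte positions across a sample of base64 certificate bodies (a toy sketch; a real test would also compare against a large sample of non-Tor certificates):

```python
def common_positions(cert_bodies):
    """Return (position, byte) pairs identical across every sample."""
    shortest = min(len(body) for body in cert_bodies)
    return [
        (i, cert_bodies[0][i])
        for i in range(shortest)
        if all(body[i] == cert_bodies[0][i] for body in cert_bodies)
    ]

# Every sampled Tor certificate begins with "MIIB", so the first four
# positions always match (the sample bodies here are made up):
sample = ["MIIBxyCCAab...", "MIIBqqCCAzz..."]
matches = common_positions(sample)
```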
-
-Certificate dating and validity issues
-
-TLS certificates found in the wild are generally found to be long-lived;
-they are frequently old and often even expired. The current Tor certificate
-validity time is a very small time window starting at generation time and
-ending shortly thereafter, as defined in or.h by MAX_SSL_KEY_LIFETIME
-(2*60*60).
-
-I propose that the certificate validity time be extended to a period of
-twelve Earth months, possibly with a small random skew to be determined by
-the implementer. Tor should randomly set the start date to a point in the
-past, within some currently unspecified window of time before the current
-date. This would more closely track the typical distribution of non-Tor
-TLS certificate expiration times.
-
-The certificate values, such as expiration, should not be used for anything
-relating to security; for example, if the OR presents an expired TLS
-certificate, this does not imply that the client should terminate the
-connection (as would be appropriate for an ordinary TLS implementation).
-Rather, I propose we use a TOFU style expiration policy - the certificate
-should never be trusted for more than a two hour window from first sighting.
-
-This policy should have two major impacts. The first is that an adversary will
-have to perform a differential analysis of all certificates for a given IP
-address rather than a single check. The second is that the server expiration
-time is enforced by the client and confirmed by keys rotating in the consensus.
-
-The expiration time should not be a fixed time that is simple to calculate by
-any Deep Packet Inspection device or it will become a new Tor TLS setup
-fingerprint.
-
-Proposed certificate form
-
-The following output from openssl asn1parse results from the proposed
-certificate generation algorithm.
It matches the results of generating a -default self-signed certificate: - - 0:d=0 hl=4 l= 513 cons: SEQUENCE - 4:d=1 hl=4 l= 362 cons: SEQUENCE - 8:d=2 hl=2 l= 9 prim: INTEGER :DBF6B3B864FF7478 - 19:d=2 hl=2 l= 13 cons: SEQUENCE - 21:d=3 hl=2 l= 9 prim: OBJECT :sha1WithRSAEncryption - 32:d=3 hl=2 l= 0 prim: NULL - 34:d=2 hl=2 l= 69 cons: SEQUENCE - 36:d=3 hl=2 l= 11 cons: SET - 38:d=4 hl=2 l= 9 cons: SEQUENCE - 40:d=5 hl=2 l= 3 prim: OBJECT :countryName - 45:d=5 hl=2 l= 2 prim: PRINTABLESTRING :AU - 49:d=3 hl=2 l= 19 cons: SET - 51:d=4 hl=2 l= 17 cons: SEQUENCE - 53:d=5 hl=2 l= 3 prim: OBJECT :stateOrProvinceName - 58:d=5 hl=2 l= 10 prim: PRINTABLESTRING :Some-State - 70:d=3 hl=2 l= 33 cons: SET - 72:d=4 hl=2 l= 31 cons: SEQUENCE - 74:d=5 hl=2 l= 3 prim: OBJECT :organizationName - 79:d=5 hl=2 l= 24 prim: PRINTABLESTRING :Internet Widgits Pty Ltd - 105:d=2 hl=2 l= 30 cons: SEQUENCE - 107:d=3 hl=2 l= 13 prim: UTCTIME :110217011237Z - 122:d=3 hl=2 l= 13 prim: UTCTIME :120217011237Z - 137:d=2 hl=2 l= 69 cons: SEQUENCE - 139:d=3 hl=2 l= 11 cons: SET - 141:d=4 hl=2 l= 9 cons: SEQUENCE - 143:d=5 hl=2 l= 3 prim: OBJECT :countryName - 148:d=5 hl=2 l= 2 prim: PRINTABLESTRING :AU - 152:d=3 hl=2 l= 19 cons: SET - 154:d=4 hl=2 l= 17 cons: SEQUENCE - 156:d=5 hl=2 l= 3 prim: OBJECT :stateOrProvinceName - 161:d=5 hl=2 l= 10 prim: PRINTABLESTRING :Some-State - 173:d=3 hl=2 l= 33 cons: SET - 175:d=4 hl=2 l= 31 cons: SEQUENCE - 177:d=5 hl=2 l= 3 prim: OBJECT :organizationName - 182:d=5 hl=2 l= 24 prim: PRINTABLESTRING :Internet Widgits Pty Ltd - 208:d=2 hl=3 l= 159 cons: SEQUENCE - 211:d=3 hl=2 l= 13 cons: SEQUENCE - 213:d=4 hl=2 l= 9 prim: OBJECT :rsaEncryption - 224:d=4 hl=2 l= 0 prim: NULL - 226:d=3 hl=3 l= 141 prim: BIT STRING - 370:d=1 hl=2 l= 13 cons: SEQUENCE - 372:d=2 hl=2 l= 9 prim: OBJECT :sha1WithRSAEncryption - 383:d=2 hl=2 l= 0 prim: NULL - 385:d=1 hl=3 l= 129 prim: BIT STRING - - -Custom Certificates - -It should be possible for a Tor relay operator to use a 
specifically supplied
-certificate and secret key. This will allow a relay or bridge operator to use a
-certificate signed by any member of any geographically relevant certificate
-authority racket; it will also allow for any other user-supplied certificate.
-This may be desirable in some kinds of filtered networks or when attempting to
-avoid attracting suspicion by blending in with the TLS web server certificate
-crowd.
-
-Problematic Diffie–Hellman parameters
-
-We currently send a static Diffie–Hellman parameter, prime p (or “prime p
-outlaw”) as specified in RFC2409 as part of the TLS Server Hello response.
-
-The use of this prime in TLS negotiations may, as a result, be filtered and
-effectively banned by certain networks. We do not have to use this particular
-prime in all cases.
-
-While amusing to have the power to make specific prime numbers into a new class
-of numbers (cf. imaginary, irrational, illegal [3]) - our new friend prime p
-outlaw is not required.
-
-I propose that the function to initialize and generate DH parameters be
-split into two functions.
-
-First, init_dh_param() should be used only for OR-to-OR DH setup and
-communication. Second, it is proposed that we create a new function
-init_tls_dh_param() that will have a two-stage development process.
-
-The first stage init_tls_dh_param() will use the same prime that
-Apache2.x [4] sends (or “dh1024_apache_p”), and this change should be
-made immediately. This is a known good and safe prime number (p-1 / 2
-is also prime) that is currently not known to be blocked.
-
-The second stage init_tls_dh_param() should randomly generate a new prime on a
-regular basis; this is designed to make the prime difficult to outlaw or
-filter. Call this a shape-shifting or "Rakshasa" prime.
This should be added
-to the 0.2.3.x branch of Tor. This prime can be generated at setup or execution
-time and probably does not need to be stored on disk. Rakshasa primes only
-need to be generated by Tor relays as Tor clients will never send them. Such
-a prime should absolutely not be shared between different Tor relays nor
-should it ever be static after the 0.2.3.x release.
-
-As a security precaution, care must be taken to ensure that we do not generate
-weak primes or known filtered primes. Both weak and filtered primes will
-undermine the TLS connection security properties. OpenSSH solves this issue
-dynamically in RFC 4419 [5] and may provide a solution that works reasonably
-well for Tor. More research in this area, including the applicability of
-Miller-Rabin or AKS primality tests[6], will need to be done, and the
-results probably added to Tor.
-
-Practical key size
-
-Currently we use a 1024 bit long RSA modulus. I propose that we increase the
-RSA key size to 2048 as an additional channel to signal support for the V3
-handshake setup. 2048 appears to be the most common key size[0] above 1024.
-Additionally, the increase in modulus size provides a reasonable security boost
-with regard to key security properties.
-
-The implementer should increase the 1024 bit RSA modulus to 2048 bits.
-
-Possible future filtering nightmares
-
-At some point it may be cost effective or politically feasible for a network
-filter to simply block all signed or self-signed certificates without a known
-valid CA trust chain. This will break many applications on the internet and
-hopefully, our option for custom certificates will ensure that this step is
-simply avoided by the censors.
-
-The Rakshasa prime approach may cause censors to specifically allow only
-certain known and accepted DH parameters.
-
-
-Appendix: Other issues
-
-What other obvious TLS certificate issues exist? What other static values are
-present in the Tor TLS setup process?
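As a sketch of the primality checking that Rakshasa prime generation would need, here is a Miller-Rabin test plus the safe-prime condition mentioned earlier (illustrative only; a real implementation would use OpenSSL's generators and proper TLS-sized primes):

```python
import random

def probably_prime(n, rounds=40):
    """Miller-Rabin probabilistic primality test."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2^r with d odd.
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True

def is_safe_prime(p):
    """p is 'safe' when both p and (p - 1) / 2 are prime."""
    return probably_prime(p) and probably_prime((p - 1) // 2)
```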
-
-[0] http://archives.seul.org/or/dev/Jan-2011/msg00051.html
-[1] http://archives.seul.org/or/dev/Feb-2011/msg00016.html
-[2] http://archives.seul.org/or/dev/Feb-2011/msg00039.html
-[3] To be fair this is hardly a new class of numbers. History is rife with
- similar examples of inane authoritarian attempts at mathematical secrecy.
- Probably the most dramatic example is the story of Hippasus of
- Metapontum, a pupil of the famous Pythagoras, who, legend goes, proved
- that the square root of two cannot be expressed as a fraction of whole
- numbers (now called an irrational number) and was assassinated for
- revealing this secret. Further reading on the subject may be found on
- the Wikipedia: http://en.wikipedia.org/wiki/Hippasus
-
-[4] httpd-2.2.17/modules/ssl/ssl_engine_dh.c
-[5] http://tools.ietf.org/html/rfc4419
-[6] http://archives.seul.org/or/dev/Jan-2011/msg00037.html
diff --git a/doc/spec/proposals/ideas/xxx-encrypted-services.txt b/doc/spec/proposals/ideas/xxx-encrypted-services.txt
deleted file mode 100644
index 3c2ac67fa4..0000000000
--- a/doc/spec/proposals/ideas/xxx-encrypted-services.txt
+++ /dev/null
@@ -1,66 +0,0 @@
-Filename: xxx-encrypted-services.txt
-Title: Encrypted services as a replacement to exit enclaving
-Author: Roger Dingledine
-Created: 2011-01-12
-Status: Draft
-
-We should offer a way to run a Tor hidden service where the server-side
-rendezvous circuits are just one hop.
-
-1. Motivation
-
- There are three Tor use cases that this idea addresses:
-
- 1) Indymedia wants to run an exit enclave that provides end-to-end
- authentication and encryption. They tried running an exit relay that
- just exits to themselves:
- https://trac.torproject.org/projects/tor/ticket/800
- but a) it handles lots of other traffic too since it's a relay, and
- b) exit enclaves don't actually work consistently, because the first
- connection from the user doesn't realize it should use the exit enclave.
- - 2) Wikileaks uses Tor hidden services not to hide their service, - but because the hidden service address provides a type of usability - we didn't think much about: unlike a more normal address, a Tor - hidden service address either works (meaning you get your end-to-end - authentication and encryption) or it fails hard. With a hidden service - address there's no way a user could accidentally submit their documents - to Wikileaks without using Tor, but with normal Tor it's possible. - - 3) The Freenode IRC network wants to provide end-to-end encryption and - authentication to its users, a) to handle the fact that the IRC protocol - doesn't really provide much of that by default, and b) to funnel all - their Tor users into a single location so they can handle the anonymous - users better. They don't mind the fact that their service is hidden, but - they'd rather have better performance for their users given the choice. - -2. Design - - It seems that the main changes required would be to a) make - circuit_launch_by_extend_info() know to use 1 hop rather than the - default, and know not to try to cannibalize a general 3-hop circ for - these circuits, and b) add a way in the torrc file to specify that this - service wants to be an encrypted service rather than a hidden service. - - I had originally pondered some sort of even more efficient "signed - document saying this service is running at this Tor relay", which - would be more efficient because it would cut out the rendezvous step. - But by reusing the hidden service rendezvous infrastructure, we a) - blend in with hidden services (and hidden service descriptors) and - don't need to teach users (or their Tor clients) a new interface, - and b) can offer the encrypted service on a non-relay. - - One design question to ponder: should we continue to use three-hop - circuits for our introduction points, and for publishing our encrypted - service descriptor? Probably. - -3. 
Security implications - - There's a possible second-order effect here since both encrypted - services and hidden services will have foo.onion addresses and it's - not clear based on the address whether the service will be hidden -- - if *some* .onion addresses are easy to track down, are we encouraging - adversaries to attack all rendezvous points just in case? - -... - diff --git a/doc/spec/proposals/ideas/xxx-exit-scanning-outline.txt b/doc/spec/proposals/ideas/xxx-exit-scanning-outline.txt deleted file mode 100644 index d84094400a..0000000000 --- a/doc/spec/proposals/ideas/xxx-exit-scanning-outline.txt +++ /dev/null @@ -1,44 +0,0 @@ -1. Scanning process - A. Non-HTML/JS HTTP mime types compared via SHA1 hash - B. Dynamic HTTP content filtered at 4 levels: - 1. IP change+Tor cookie utilization - - Tor cookies replayed with new IP in case of changes - 2. HTML Tag+Attribute+JS comparison - - Comparisons made based only on "relevant" HTML tags - and attributes - 3. HTML Tag+Attribute+JS diffing - - Tags, attributes and JS AST nodes that change during - Non-Tor fetches pruned from comparison - 4. URLS with > N% of node failures removed - - results purged from filesystem at end of scan loop - C. SSL scanning handles some forms of dynamic certs - 1. Catalogs certs for all IPs resolved locally - by getaddrinfo over the duration of the scan. - - Updated each test. - 2. If the domain presents a new cert for each IP, this - is noted on the failure result for the node - 3. If the same IP presents two different certs locally, - the cert list is first refreshed, and if it happens - again, discarded - 4. A N% node failure filter also applies - D. Scanner can be restarted from any point in the event - of scanner or system crashes, or graceful shutdown. - - Results+scan state pickled to filesystem continuously -2. Cron job checks results periodically for reporting - A. Divide failures into three types of BadExit based on type - and frequency over time and incident rate - B. 
write reject lines to approved-routers for those three types:
- 1. ID Hex based (for misconfig/network problems easily fixed)
- 2. IP based (for content modification)
- 3. IP+mask based (for continuous/egregious content modification)
- C. Emails results to tor-scanners@freehaven.net
-3. Human Review and Appeal
- A. ID Hex-based BadExit is meant to be possible to remove easily
- without needing to beg us.
- - Should this behavior be encouraged?
- B. Optionally can reserve IP based badexits for human review
- 1. Results are encapsulated fully on the filesystem and can be
- reviewed without network access
- 2. Soat has --rescan to rescan failed nodes from a data directory
- - New set of URLs used
-
diff --git a/doc/spec/proposals/ideas/xxx-geoip-survey-plan.txt b/doc/spec/proposals/ideas/xxx-geoip-survey-plan.txt
deleted file mode 100644
index 49c6615a66..0000000000
--- a/doc/spec/proposals/ideas/xxx-geoip-survey-plan.txt
+++ /dev/null
@@ -1,137 +0,0 @@
-
-
-Abstract
-
- This document explains how to estimate how many Tor users there
- are, and how many there are in which country. Statistics are
- involved.
-
-Motivation
-
- There are a few reasons we need to keep track of which countries
- Tor users (in aggregate) are coming from:
-
- - Resource allocation. Knowing about underserved countries with
- lots of users can let us know about where we need to direct
- translation and outreach efforts.
-
- - Anticensorship. Sudden drops in usage on a national basis can
- indicate the arrival of a censorious firewall.
-
- - Sponsor outreach and self-evaluation. Many people and
- organizations who are interested in funding The Tor Project's
- work want to know that we're successfully serving parts of the
- world they're interested in, and that efforts to expand our
- userbase are actually succeeding. So do we.
- -Goals - - We want to know approximately how many Tor users there are, and which - countries they're in, even in the presence of a hypothetical - "directory guard" feature. Some uncertainty is okay, but we'd like - to be able to put a bound on the uncertainty. - - We need to make sure this information isn't exposed in a way that - helps an adversary. - -Methods for current clients: - - Every client downloads network status documents. There are - currently three methods (one hypothetical) for clients to get them. - - 0.1.2.x clients (and earlier) fetch a v2 networkstatus - document about every NETWORKSTATUS_CLIENT_DL_INTERVAL [30 - minutes]. - - - 0.2.0.x clients fetch a v3 networkstatus consensus document - at a random interval between when their current document is no - longer freshest, and when their current document is about to - expire. - - [In both of the above cases, clients choose a running - directory cache at random with odds roughly proportional to - its bandwidth. If they're just starting, they know a XXXX FIXME -NM] - - - In some future version, clients will choose directory caches - to serve as their "directory guards" to avoid profiling - attacks, similarly to how clients currently start all their - circuits at guard nodes. - - We assume that a directory cache can tell which of these three - categories a client is in by the format of its status request. - - A directory cache can be made to count distinct client IP - addresses that make a certain request of it in a given timeframe, - and total requests made to it over that timeframe. For the first - two cases, a cache can get a picture of the overall - number and countries of users in the network by dividing the IP - count by the probability with which they (as a cache) would be - chosen. 
Assuming that our listed bandwidth is such that we expect - to be chosen with probability P for any given request, and we've - been counting IPs for long enough that we expect the average - client to have made N requests, they will have visited us at least - once with probability P' = 1-(1-P)^N, and so we divide the IP - counts we've seen by P' for our estimate. To estimate total - number of clients of a given type, determine how many requests a - client of that type will make over that time, and assume we'll - have seen P of them. - - Both of these numbers are useful: the IP counts will give the - total number of IPs connecting to the network, and the request - counts will give the total number of users on the network at any - given time. - - Notes: - - [Over H hours, the N for V2 clients is 2*H, and the N for V3 - clients is currently around H/2 or H/3.] - - - (We should only count requests that we actually intend to answer; - 503 requests shouldn't count.) - - - These measurements should also be taken at a directory - authority if possible: their picture of the network is skewed - by clients that fetch from them directly. These clients, - however, are all the clients that are just bootstrapping - (assuming that the fallback-consensus feature isn't yet used - much). - - - These measurements also overestimate the V2 download rate if - some downloads fail and clients retry them later after backing - off. - -Methods for directory guards: - - If directory guards are in use, directory guards get a picture of - all those users who chose them as a guard when they were listed - as a good choice for a guard, and who are also on the network - now. 
The cleanest data here will come from nodes that were listed
- as good new-guards choices for a while, and have not been so for a
- while longer (to study decay rates); nodes that have been listed
- as good new-guard choices consistently for a long time (to get a
- sample of the network); and nodes that have been listed as good
- new-guard choices only recently (to get a sample of new users and
- users whose guards have died out.)
-
- Since directory guards are currently unspecified, we'll need to
- make some guesses about how they'll turn out to work. Here are
- a couple of approaches that could work.
-
- We could have clients pick completely new directory guards on
- a rolling basis every two months or so. This would ensure
- that staying as a guard for a while would be sufficient to
- see a sample of users. This is potentially advantageous for
- load-balancing the network as well, though it might lose some
- of the benefits of directory guards. We need to quantify the
- impact of this; it might not actually make stuff worse in
- practice, if most guards don't stay good guards for a month
- or two.
-
- - We could try to collect statistics at several directory
- guards and combine their statistics, but we would need to make
- sure that for all time, at least one of the directory guards
- had been recommended as a good choice for new guards. By
- looking at new-IP rates for guards, we could get an idea of
- user uptake; by looking at old-IP decay rates, we could get
- an idea of turnover. This approach would entail significant
- complexity, and we'd probably need to record more information
- than we'd really like to.
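The estimation rule described earlier in this document (scale observed distinct IPs by P' = 1-(1-P)^N, and request counts by P) can be sketched in a few lines. The function names are ours, purely for illustration; they are not part of any Tor tool:

```python
def estimate_unique_clients(ip_count, p, n):
    """Scale the distinct-IP count by the probability P' that an
    average client (making N requests, each reaching this cache with
    probability P) contacted us at least once: P' = 1 - (1 - P)^N."""
    p_seen = 1.0 - (1.0 - p) ** n
    return ip_count / p_seen

def estimate_concurrent_clients(request_count, p, requests_per_client):
    """We expect to observe a fraction P of all requests made in the
    window; each client of this type makes requests_per_client of them."""
    return request_count / (p * requests_per_client)
```

For example, a cache chosen with probability P=0.01 per request, counting over a window in which the average client makes N=100 requests, would divide its IP count by P' ≈ 0.634.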
-
-
diff --git a/doc/spec/proposals/ideas/xxx-grand-scaling-plan.txt b/doc/spec/proposals/ideas/xxx-grand-scaling-plan.txt
deleted file mode 100644
index 336798cc0f..0000000000
--- a/doc/spec/proposals/ideas/xxx-grand-scaling-plan.txt
+++ /dev/null
@@ -1,97 +0,0 @@
-
-Right now as I understand it, there are n big scaling problems heading
-our way:
-
-1) Clients need to learn all the relay descriptors they could use. That's
-a lot of bytes through a potentially small pipe.
-2) Relays need to hold open TCP connections to most other relays.
-3) Clients need to learn the whole networkstatus. Even using v3, as
-the network grows that will become unwieldy.
-4) Dir mirrors need to mirror all the relay descriptors; eventually this
-will get big too.
-
-Here's my plan.
-
--------------------------------------------------------------------
-
-Piece one: download O(1) descriptors rather than O(n) descriptors.
-
-We need to change our circuit extend protocol so it fetches a relay
-descriptor at every 'extend' operation:
- - Client fetches networkstatus, picks guards, connects to one.
- - Client picks middle hop out of networkstatus, asks guard for
- its descriptor, then extends to it.
- - Client picks exit hop out of networkstatus, asks middle hop
- for its descriptor, then extends to it. Done.
-
-The client needs to ask for the descriptor even if it already has a
-copy, because otherwise we leak too much. Also, the descriptor needs to
-be padded to some large (but not too large) size to prevent the middle
-hops from guessing about it.
-
-The first step towards this is to instrument the current code to see
-how much of a win this would actually be -- I am guessing it is already
-a win even with the current number of descriptors.
-
-We also would need to assign the 'Exit' flag more usefully, and make
-clients pay attention to it when picking their last hop, since they
-don't actually know the exit policies of the relays they're choosing from.
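One detail of "piece one" worth pinning down is the padding step. A minimal sketch, taking the 8k figure from the guess below as the assumed budget (the real bound needs measurement):

```python
PAD_SIZE = 8 * 1024  # assumed per-fetch budget; "maybe 8k?" per the text

def pad_descriptor(desc: bytes) -> bytes:
    """Pad a (possibly compressed) descriptor to a fixed size, so the
    relay serving the fetch reveals nothing about which descriptor it
    returned and a middle hop can't guess it from the length."""
    if len(desc) > PAD_SIZE:
        raise ValueError("descriptor exceeds the per-fetch budget")
    return desc + b"\x00" * (PAD_SIZE - len(desc))
```

A real design would pad inside the encrypted relay payload, not with literal zero bytes on the wire; this only illustrates the fixed-size property.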
- -We also need to think harder about other implications -- for example, -a relay with a tiny exit policy won't get the Exit flag, and thus won't -ever get picked as an exit relay. Plus, our "enclave exit" model is out -the window unless we figure out a cool trick. - -More generally, we'll probably want to compress the descriptors that we -send back; maybe 8k is a good upper bound? I wonder if we could ask for -several descriptors, and bundle back all of the ones that fit in the 8k? - -We'd also want to put the load balancing weights into the networkstatus, -so clients can choose fast nodes more often without needing to see the -descriptors. This is a good opportunity for the authorities to be able -to put "more accurate" weights in if they learn to detect attacks. It -also means we should consider running automated audits to make sure the -authorities aren't trying to snooker everybody. - -I'm aiming to get Peter Palfrader to tackle this problem in mid 2008, -but I bet he could use some help. - --------------------------------------------------------------------- - -Piece two: inter-relay communication uses UDP - -If relays send packets to/from other relays via UDP, they don't need a -new descriptor for each such link. Thus we'll still need to keep state -for each link, but we won't max out on sockets. - -Clearly a lot more work needs to be done here. Ian Goldberg has a student -who has been working on it, and if all goes well we'll be chipping in -some funding to continue that. Also, Camilo Viecco has been doing his -PhD thesis on it. - --------------------------------------------------------------------- - -Piece three: networkstatus documents get partitioned - -While the authorities should be expected to be able to handle learning -about all the relays, there's no reason the clients or the mirrors need -to. Authorities should put a cap on the number of relays listed in a -single networkstatus, and split them when they get too big. 
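For "piece three", each authority must map relays to partitions identically without any coordination. One deterministic sketch (our own illustration, not a worked-out design) is to hash the relay's identity fingerprint into a fixed number of buckets:

```python
import hashlib

def partition_of(identity_fingerprint: str, num_partitions: int) -> int:
    """Every authority computing this over the same fingerprint and
    partition count gets the same answer with no coordination."""
    digest = hashlib.sha1(identity_fingerprint.encode("ascii")).digest()
    return int.from_bytes(digest, "big") % num_partitions
```

Note that changing num_partitions reshuffles nearly every relay; a consistent-hashing variant would limit churn when partitions are split.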
- -We'd need a good way to have each authority come to the same conclusion -about which partition a given relay goes into. - -Directory mirrors would then mirror all the relay descriptors in their -partition. This is compatible with 'piece one' above, since clients in -a given partition will only ask about descriptors in that partition. - -More complex versions of this design would involve overlapping partitions, -but that would seem to start contradicting other parts of this proposal -right quick. - -Nobody is working on this piece yet. It's hard to say when we'll need -it, but it would be nice to have some more thought on it before the week -that we need it. - --------------------------------------------------------------------- - diff --git a/doc/spec/proposals/ideas/xxx-hide-platform.txt b/doc/spec/proposals/ideas/xxx-hide-platform.txt deleted file mode 100644 index ad19fb1fd4..0000000000 --- a/doc/spec/proposals/ideas/xxx-hide-platform.txt +++ /dev/null @@ -1,37 +0,0 @@ -Filename: xxx-hide-platform.txt -Title: Hide Tor Platform Information -Author: Jacob Appelbaum -Created: 24-July-2008 -Status: Draft - - - Hiding Tor Platform Information - -0.0 Introduction - -The current Tor program publishes its specific Tor version and related OS -platform information. This information could be misused by an attacker. - -0.1 Current Implementation - -Currently, the Tor binary sends data that looks like the following: - - Tor 0.2.0.26-rc (r14597) on Darwin Power Macintosh - Tor 0.1.2.19 on Windows XP Service Pack 3 [workstation] {terminal services, - single user} - -1.0 Suggested changes - -It would be useful to allow a user to configure the disclosure of such -information. Such a change would be an option in the torrc file like so: - - HidePlatform Yes - -1.1 Suggested default behavior in the future - -If a user would like to disclose this information, they could configure their -Tor to do so. 
- - HidePlatform No - - diff --git a/doc/spec/proposals/ideas/xxx-pluggable-transport.txt b/doc/spec/proposals/ideas/xxx-pluggable-transport.txt deleted file mode 100644 index 53ba9c630b..0000000000 --- a/doc/spec/proposals/ideas/xxx-pluggable-transport.txt +++ /dev/null @@ -1,312 +0,0 @@ -Filename: xxx-pluggable-transport.txt -Title: Pluggable transports for circumvention -Author: Jacob Appelbaum, Nick Mathewson -Created: 15-Oct-2010 -Status: Draft - -Overview - - This proposal describes a way to decouple protocol-level obfuscation - from the core Tor protocol in order to better resist client-bridge - censorship. Our approach is to specify a means to add pluggable - transport implementations to Tor clients and bridges so that they can - negotiate a superencipherment for the Tor protocol. - -Scope - - This is a document about transport plugins; it does not cover - discovery improvements, or bridgedb improvements. While these - requirements might be solved by a program that also functions as a - transport plugin, this proposal only covers the requirements and - operation of transport plugins. - -Motivation - - Frequently, people want to try a novel circumvention method to help - users connect to Tor bridges. Some of these methods are already - pretty easy to deploy: if the user knows an unblocked VPN or open - SOCKS proxy, they can just use that with the Tor client today. - - Less easy to deploy are methods that require participation by both the - client and the bridge. In order of increasing sophistication, we - might want to support: - - 1. A protocol obfuscation tool that transforms the output of a TLS - connection into something that looks like HTTP as it leaves the - client, and back to TLS as it arrives at the bridge. - 2. An additional authentication step that a client would need to - perform for a given bridge before being allowed to connect. - 3. 
An information passing system that uses a side-channel in some - existing protocol to convey traffic between a client and a bridge - without the two of them ever communicating directly. - 4. A set of clients to tunnel client->bridge traffic over an existing - large p2p network, such that the bridge is known by an identifier - in that network rather than by an IP address. - - We could in theory support these almost fine with Tor as it stands - today: every Tor client can take a SOCKS proxy to use for its outgoing - traffic, so a suitable client proxy could handle the client's traffic - and connections on its behalf, while a corresponding program on the - bridge side could handle the bridge's side of the protocol - transformation. Nevertheless, there are some reasons to add support - for transportation plugins to Tor itself: - - 1. It would be good for bridges to have a standard way to advertise - which transports they support, so that clients can have multiple - local transport proxies, and automatically use the right one for - the right bridge. - - 2. There are some changes to our architecture that we'll need for a - system like this to work. For testing purposes, if a bridge blocks - off its regular ORPort and instead has an obfuscated ORPort, the - bridge authority has no way to test it. Also, unless the bridge - has some way to tell that the bridge-side proxy at 127.0.0.1 is not - the origin of all the connections it is relaying, it might decide - that there are too many connections from 127.0.0.1, and start - paring them down to avoid a DoS. - - 3. Censorship and anticensorship techniques often evolve faster than - the typical Tor release cycle. As such, it's a good idea to - provide ways to test out new anticensorship mechanisms on a more - rapid basis. - - 4. Transport obfuscation is a relatively distinct problem - from the other privacy problems that Tor tries to solve, and it - requires a fairly distinct skill-set from hacking the rest of Tor. 
- By decoupling transport obfuscation from the Tor core, we hope to
- encourage people working on transport obfuscation who would
- otherwise not be interested in hacking Tor.
-
- 5. Finally, we hope that defining a generic transport obfuscation plugin
- mechanism will be useful to other anticensorship projects.
-
-Non-Goals
-
- We're not going to talk about automatic verification of plugin
- correctness and safety via sandboxing, proof-carrying code, or
- whatever.
-
- We need to do more with discovery and distribution, but that's not
- what this proposal is about. We're pretty convinced that the problems
- are sufficiently orthogonal that we should be fine so long as we don't
- preclude a single program from implementing both transport and
- discovery extensions.
-
- This proposal is not about what transport plugins are the best ones
- for people to write. We do, however, make some general
- recommendations for plugin authors in an appendix.
-
- We've considered issues involved with completely replacing Tor's TLS
- with another encryption layer, rather than layering it inside the
- obfuscation layer. We describe how to do this in an appendix to the
- current proposal, though we are not currently sure whether it's a good
- idea to implement.
-
- We deliberately reject any design that would involve linking more code
- into Tor's process space.
-
-Design overview
-
- To write a new transport protocol, an implementer must provide two
- pieces: a "Client Proxy" to run at the initiator side, and a "Server
- Proxy" to run at the server side. These two pieces may or may not be
- implemented by the same program.
-
- Each client may run any number of Client Proxies. Each one acts like
- a SOCKS proxy that accepts connections on localhost. Each one
- runs on a different port, and implements one or more transport
- methods. If the protocol has any parameters, they are passed from Tor
- inside the regular username/password parts of the SOCKS protocol.
-
- Bridges (and maybe relays) may run any number of Server Proxies: these
- programs provide an interface like stunnel-server (or whatever the
- option is): they get connections from the network (typically by
- listening for connections on the network) and relay them to the
- Bridge's real ORPort.
-
- To configure one of these programs, it should be sufficient simply to
- list it in your torrc. The program tells Tor which transports it
- provides. The Tor consensus should carry a new approved version number that
- is specific for pluggable transport; this will allow Tor to know when a
- particular transport is known to be unsafe or non-functional.
-
- Bridges (and maybe relays) report in their descriptors which transport
- protocols they support. This information can be copied into bridge
- lines. Bridges using a transport protocol may have multiple bridge
- lines.
-
- Any methods that are wildly successful, we can bake into Tor.
-
-Specifications: Client behavior
-
- Bridge lines can now follow the extended format "bridge method
- address:port [[keyid=]id-fingerprint] [k=v] [k=v] [k=v]". To connect
- to such a bridge, a client must open a local connection to the SOCKS
- proxy for "method", and ask it to connect to address:port. If
- [id-fingerprint] is provided, it should expect the public identity key
- on the TLS connection to match the digest provided in
- [id-fingerprint]. If any [k=v] items are provided, they are
- configuration parameters for the proxy: Tor should separate them with
- semicolons and put them in the user and password fields of the request,
- splitting them across the fields as necessary. If a key or value
- must contain a semicolon or a backslash, it is escaped with a
- backslash.
-
- The "id-fingerprint" field is always provided in a field named
- "keyid", if it was given. Method names must be C identifiers.
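The escaping and splitting rules above can be sketched as follows. The 255-byte cap per field comes from the SOCKS5 username/password message format (each length field is one byte); the helper itself is hypothetical:

```python
def encode_socks_args(params: dict) -> tuple:
    """Join k=v pairs with ';', escaping ';' and '\\' with a
    backslash, then split the blob across the SOCKS5 username and
    password fields, each limited to 255 bytes."""
    def esc(s):
        return s.replace("\\", "\\\\").replace(";", "\\;")
    blob = ";".join("%s=%s" % (esc(k), esc(v)) for k, v in params.items())
    if len(blob) > 2 * 255:
        raise ValueError("parameters don't fit in one SOCKS request")
    return blob[:255], blob[255:]
```

With the trebuchet example below, this yields "rocks=20;height=5.6m" in the username and an empty password.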
-
- Example: if the bridge line is "bridge trebuchet www.example.com:3333
- rocks=20 height=5.6m" AND if the Tor client knows that the
- 'trebuchet' method is provided by a SOCKS5 proxy on
- 127.0.0.1:19999, the client should connect to that proxy, ask it to
- connect to www.example.com, and provide the string
- "rocks=20;height=5.6m" as the username, the password, or split
- across the username and password.
-
- There are two ways to tell Tor clients about protocol proxies:
- external proxies and managed proxies. An external proxy is configured
- with "ClientTransportPlugin trebuchet socks5 127.0.0.1:9999". This
- tells Tor that another program is already running to handle
- 'trebuchet' connections, and Tor doesn't need to worry about it. A
- managed proxy is configured with "ClientTransportPlugin trebuchet
- exec /usr/libexec/tor-proxies/trebuchet [options]", and tells Tor to launch
- an external program on-demand to provide a socks proxy for 'trebuchet'
- connections. The Tor client only launches one instance of each
- external program, even if the same executable is listed for more than
- one method.
-
- The same program can implement a managed or an external proxy: it just
- needs to take an argument saying which one to be.
-
-Client proxy behavior
-
- When launched from the command-line by a Tor client, a transport
- proxy needs to tell Tor which methods and ports it supports. It does
- this by printing one or more CMETHOD: lines to its stdout. These look
- like
-
- CMETHOD: trebuchet SOCKS5 127.0.0.1:19999 ARGS:rocks,height \
- OPT-ARGS:tensile-strength
-
- The ARGS field lists mandatory parameters that must appear in every
- bridge line for this method. The OPT-ARGS field lists optional
- parameters. If no ARGS or OPT-ARGS field is provided, Tor should not
- check the parameters in bridge lines for this method.
-
- The proxy should print a single "METHODS: DONE" line after it is
- finished telling Tor about the methods it provides.
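The proxy's side of this startup handshake is tiny; a sketch, using the field layout from the CMETHOD: example in this proposal (the helper and its argument shape are our own invention):

```python
import sys

def announce_methods(methods):
    """Print one CMETHOD: line per supported transport to stdout,
    then the terminator line, as described in this proposal."""
    for name, socks_ver, addr, args, opt_args in methods:
        line = "CMETHOD: %s %s %s" % (name, socks_ver, addr)
        if args:
            line += " ARGS:" + ",".join(args)
        if opt_args:
            line += " OPT-ARGS:" + ",".join(opt_args)
        print(line)
    print("METHODS: DONE")
    sys.stdout.flush()  # Tor reads these lines from the pipe as they appear
```
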
- - The transport proxy MUST exit cleanly when it receives a SIGTERM from - Tor. - - The Tor client MUST ignore lines beginning with a keyword and a colon - if it does not recognize the keyword. - - In the future, if we need a control mechanism, we can use the - stdin/stdout from Tor to the transport proxy. - - A transport proxy MUST handle SOCKS connect requests using the SOCKS - version it advertises. - - Tor clients SHOULD NOT use any method from a client proxy unless it - is both listed as a possible method for that proxy in torrc, and it - is listed by the proxy as a method it supports. - - [XXXX say something about versioning.] - -Server behavior - - Server proxies are configured similarly to client proxies. - - - -Server proxy behavior - - - - [so, we can have this work like client proxies, where the bridge - launches some programs, and they tell the bridge, "I am giving you - method X with parameters Y"? Do you have to take all the methods? If - not, which do you specify?] - - [Do we allow programs that get started independently?] - - [We'll need to figure out how this works with port forwarding. Is - port forwarding the bridge's problem, the proxy's problem, or some - combination of the two?] - - [If we're using the bridge authority/bridgedb system for distributing - bridge info, the right place to advertise bridge lines is probably - the extrainfo document. We also need a way to tell the bridge - authority "don't give out a default bridge line for me"] - -Server behavior - -Bridge authority behavior - -Implementation plan - - Turn this into a draft proposal - - Circulate and discuss on or-dev. - - We should ship a couple of null plugin implementations in one or two - popular, portable languages so that people get an idea of how to - write the stuff. - - 1. We should have one that's just a proof of concept that does - nothing but transfer bytes back and forth. - - 1. We should not do a rot13 one. - - 2. 
We should implement a basic proxy that does not transform the bytes at all
-
- 1. We should implement DNS or HTTP using other software (as goodell
- did years ago with DNS) as an example of wrapping existing code into
- our plugin model.
-
- 2. The obfuscated-ssh superencipherment is pretty trivial and pretty
- useful. It makes the protocol stringwise unfingerprintable.
-
- 1. Nick needs to be told firmly not to bikeshed the obfuscated-ssh
- superencipherment too badly
-
- 1. Go ahead, bikeshed my day
-
- 1. If we do a raw-traffic proxy, openssh tunnels would be the logical choice.
-
-Appendix: recommendations for transports
-
- Be free/open-source software. Also, if you think your code might
- someday do so well at circumvention that it should be implemented
- inside Tor, it should use the same license as Tor.
-
- Use libraries that Tor already requires. (You can rely on openssl and
- libevent being present if current Tor is present.)
-
- Be portable: most Tor users are on Windows, and most Tor developers
- are not, so designing your code for just one of these platforms will
- make it either get a small userbase or poor auditing.
-
- Think secure: if your code is in a C-like language, and it's hard to
- read it and become convinced it's safe, then it's probably not safe.
-
- Think small: we want to minimize the bytes that a Windows user needs
- to download for a transport client.
-
- Specify: if you can't come up with a good explanation
-
- Avoid security-through-obscurity if possible. Specify.
-
- Resist trivial fingerprinting: There should be no good string or regex
- to search for to distinguish your protocol from protocols permitted by
- censors.
-
- Imitate a real profile: There are many ways to implement most
- protocols -- and in many cases, most possible variants of a given
- protocol won't actually exist in the wild.
-
-Appendix: Raw-traffic transports
-
- This section describes an optional extension to the proposal above.
- We are not sure whether it is a good idea.
diff --git a/doc/spec/proposals/ideas/xxx-port-knocking.txt b/doc/spec/proposals/ideas/xxx-port-knocking.txt
deleted file mode 100644
index 85c27ec52d..0000000000
--- a/doc/spec/proposals/ideas/xxx-port-knocking.txt
+++ /dev/null
@@ -1,91 +0,0 @@
-Filename: xxx-port-knocking.txt
-Title: Port knocking for bridge scanning resistance
-Author: Jacob Appelbaum
-Created: 19-April-2009
-Status: Draft
-
- Port knocking for bridge scanning resistance
-
-0.0 Introduction
-
-This document is a collection of ideas relating to improving scanning
-resistance for private bridge relays. This is intended to stop opportunistic
-network scanning and subsequent discovery of private bridge relays.
-
-
-0.1 Current Implementation
-
-Currently private bridges are only hidden by their obscurity. If you know
-a bridge IP address, the bridge can be detected trivially and added to a block
-list.
-
-0.2 Configuring an external port knocking program to control the firewall
-
-It is currently possible for bridge operators to configure a port knocking
-daemon that controls access to the incoming OR port. This is currently out of
-scope for Tor and Tor configuration. This process requires the firewall to know
-the current nodes in the Tor network.
-
-1.0 Suggested changes
-
-Private bridge operators should be able to configure a method of hiding their
-relay. Only authorized users should be able to communicate with the private
-bridge. This should be done with Tor and if possible without the help of the
-firewall. It should be possible for a Tor user to enter a secret key into
-Tor or optionally Vidalia on a per bridge basis. This secret key should be
-used to authenticate the bridge user to the private bridge.
-
-1.x Issues with low ports and bind() for ORPort
-
-Tor opens low numbered ports during startup and then drops privileges. It is
-no longer possible to rebind to those lower ports after they are closed.
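The secret-key authentication suggested in 1.0 could take the shape of a single self-contained datagram. Everything in this sketch -- the timestamp/nonce layout, the HMAC-SHA256 tag, the 60-second skew window -- is our own guess at "a format of the packet and the crypto involved", not a vetted design:

```python
import hashlib, hmac, os, time

def make_spa_packet(secret: bytes) -> bytes:
    """Client side: 8-byte timestamp + 8-byte nonce, tagged with
    HMAC-SHA256 under the per-bridge secret (48 bytes total)."""
    payload = int(time.time()).to_bytes(8, "big") + os.urandom(8)
    return payload + hmac.new(secret, payload, hashlib.sha256).digest()

def check_spa_packet(secret: bytes, pkt: bytes, max_skew=60) -> bool:
    """Bridge side: silently drop anything malformed, forged, or stale."""
    if len(pkt) != 48:
        return False
    payload, tag = pkt[:16], pkt[16:]
    expected = hmac.new(secret, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return False
    ts = int.from_bytes(payload[:8], "big")
    return abs(time.time() - ts) <= max_skew
```

A real design would also need replay protection (e.g. remembering recent nonces), which this sketch omits.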
-
-1.x Issues with OS level packet filtering
-
-Tor does not know about any OS level packet filtering. Currently there are no
-packet filters that understand the Tor network in real time.
-
-1.x Possible partitioning of users by bridge operator
-
-Depending on implementation, it may be possible for bridge operators to
-uniquely identify users. This appears to be a general bridge issue when a
-bridge operator uniquely deploys bridges per user.
-
-2.0 Implementation ideas
-
-This is a suggested set of methods for port knocking.
-
-2.x Using SPA port knocking
-
-Single Packet Authentication port knocking encodes all required data into a
-single UDP packet. Improperly formatted packets may be simply discarded.
-Properly formatted packets should be processed and appropriate actions taken.
-
-2.x Using DNS as a transport for SPA
-
-It should be possible for Tor to bind to port 53 at startup and merely drop all
-packets that are not valid. UDP does not require a response and invalid packets
-will not trigger a response from Tor. With base32 encoding it should be
-possible to encode SPA as valid DNS requests. This should allow use of the
-public DNS infrastructure for authorization requests if desired.
-
-2.x Ghetto firewalling with opportunistic connection closing
-
-Until a user has authenticated with Tor, Tor only has a UDP listener. This
-listener should never send data in response, it should only open an ORPort
-when a user has successfully authenticated. After a user has authenticated
-with Tor to open an ORPort, only users who have authenticated will be able
-to use it. All other users as identified by their IP address will have their
-connection closed before any data is sent or received. This should be
-accomplished with an access policy. By default, the access policy should block
-all access to the ORPort.
-
-2.x Timing and reset of access policies
-
-Access to the ORPort is sensitive.
The bridge should remove any exceptions
-to its access policy regularly when the ORPort is unused. Valid users should
-reauthenticate if they do not use the ORPort within a given time frame.
-
-2.x Additional considerations
-
-There are many. Defining the packet format and the crypto involved is a good
-place to start.
diff --git a/doc/spec/proposals/ideas/xxx-rate-limit-exits.txt b/doc/spec/proposals/ideas/xxx-rate-limit-exits.txt
deleted file mode 100644
index 81fed20af8..0000000000
--- a/doc/spec/proposals/ideas/xxx-rate-limit-exits.txt
+++ /dev/null
@@ -1,63 +0,0 @@
-
-1. Overview
-
- We should rate limit the volume of stream creations at exits:
-
-2.1. Per-circuit limits
-
- If a given circuit opens more than N streams in X seconds, further
- stream requests over the next Y seconds should fail with the reason
- 'resourcelimit'. Clients will automatically notice this and switch to
- a new circuit.
-
- The goal is to limit the effects of port scans on a given exit relay,
- so the relay's ISP won't get hassled as much.
-
- First thoughts for parameters would be N=100 streams in X=5 seconds
- causes 30 seconds of fails; and N=300 streams in X=30 seconds causes
- 30 seconds of fails.
-
- We could simplify by, instead of having a "for 30 seconds" parameter,
- just marking the circuit as forever failing new requests. (We don't want
- to just close the circuit because it may still have open streams on it.)
-
-2.2. Per-destination limits
-
- If a given circuit opens more than N1 streams in X seconds to a single
- IP address, or all the circuits combined open more than N2 streams,
- then we should fail further attempts to reach that address for a while.
-
- The goal is to limit the abuse that Tor exit relays can dish out
- to a single target either for socket DoS or for web crawling, in
- the hopes of a) not triggering their automated defenses, and b) not
- making them upset at Tor.
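The per-circuit rule in 2.1 maps naturally onto a sliding-window counter. A minimal model, with the N=100/X=5/Y=30 first-thoughts numbers as defaults (the class and its interface are illustrative only):

```python
import collections

class CircuitStreamLimiter:
    """Sketch of 2.1: more than N stream-begins inside X seconds marks
    the circuit as refusing further begins for Y seconds (the client
    sees 'resourcelimit' and switches to a new circuit)."""
    def __init__(self, n=100, x=5.0, y=30.0):
        self.n, self.x, self.y = n, x, y
        self.begins = collections.deque()
        self.refuse_until = 0.0

    def allow_begin(self, now):
        if now < self.refuse_until:
            return False
        while self.begins and self.begins[0] <= now - self.x:
            self.begins.popleft()   # forget begins outside the window
        if len(self.begins) >= self.n:
            self.refuse_until = now + self.y
            return False
        self.begins.append(now)
        return True
```

Passing the clock in explicitly keeps the policy testable; a relay would feed it its monotonic clock. The per-destination limits in 2.2 would need a second, shared instance keyed by target address.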
Hopefully these self-imposed bans will be
- much shorter-lived than bans or barriers put up by the websites.
-
-3. Issues
-
-3.1. Circuit-creation overload
-
- Making clients move to new circuits more often will cause more circuit
- creation requests.
-
-3.2. How to pick the parameters?
-
- If we pick the numbers too low, then popular sites are effectively
- cut out of Tor. If we pick them too high, we don't do much good.
-
- Worse, picking them wrong isn't easy to fix, since the deployed Tor
- servers will ship with a certain set of numbers.
-
- We could put numbers (or "general settings") in the networkstatus
- consensus, and Tor exits would adapt more dynamically.
-
- We could also have a local config option about how aggressive this
- server should be with its parameters.
-
-4. Client-side limitations
-
- Perhaps the clients should have built-in rate limits too, so they avoid
- harassing the servers by default?
-
- Tricky if we want to get Tor clients in use at large enclaves.
-
diff --git a/doc/spec/proposals/ideas/xxx-using-spdy.txt b/doc/spec/proposals/ideas/xxx-using-spdy.txt
deleted file mode 100644
index d733a84b69..0000000000
--- a/doc/spec/proposals/ideas/xxx-using-spdy.txt
+++ /dev/null
@@ -1,143 +0,0 @@
-Filename: xxx-using-spdy.txt
-Title: Using the SPDY protocol to improve Tor performance
-Author: Steven J. Murdoch
-Created: 03-Feb-2010
-Status: Draft
-Target:
-
-1. Overview
-
- The SPDY protocol [1] is an alternative method for transferring
- web content over TCP, designed to improve efficiency and
- performance. A SPDY-aware browser can already communicate with
- a SPDY-aware web server over Tor, because this only requires a TCP
- stream to be set up. However, a SPDY-aware browser cannot
- communicate with a non-SPDY-aware web server. This proposal
- outlines how Tor could support this latter case, and why it
- may be good for performance.
-
-2.
Motivation - - About 90% of Tor traffic, by connection, is HTTP [2], but - users report subjective performance to be poor. It would - therefore be desirable to improve this situation. SPDY was - designed to offer better performance than HTTP, in - high-latency and/or low-bandwidth situations, and is therefore - an option worth examining. - - If a user wishes to access a SPDY-enabled web server over Tor, - all they need to do is configure their SPDY-enabled browser - (e.g. Google Chrome) to use Tor. However, there are few - SPDY-enabled web servers, and even if there were high demand - from Tor users, there would be little motivation for server - operators to upgrade, for the benefit of only a small - proportion of their users. - - The motivation of this proposal is to require only that the user - install a SPDY-enabled browser, while permitting web servers to - remain unmodified. Essentially, Tor would incorporate a proxy - on the exit node, which communicates SPDY to the web browser - and normal HTTP to the web server. This proxy would translate - between the two transport protocols, and possibly perform - other optimizations. - - SPDY currently offers five optimizations: - - 1) Multiplexed streams: - An unlimited number of resources can be transferred - concurrently, over a single TCP connection. - - 2) Request prioritization: - The client can set a priority on each resource, to assist - the server in re-ordering responses. - - 3) Compression: - Both HTTP header and resource content can be compressed. - - 4) Server push: - The server can offer the client resources which have not - been requested, but which the server believes will be. - - 5) Server hint: - The server can suggest that the client request further - resources, before the main content is transferred. - - Tor currently effectively implements (1), by being able to put - multiple streams on one circuit. SPDY however requires fewer - round-trips to do the same. The other features are not - implemented by Tor.
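To make optimization (1) concrete, here is a toy illustration of multiplexing several logical streams over a single byte pipe. The 8-byte frame header is invented for this example and is not SPDY's actual wire format:

```python
import struct

# Toy framing: each frame is [4-byte stream id][4-byte length][payload].
# This only illustrates the multiplexing idea; SPDY's real framing differs.

def pack_frame(stream_id: int, payload: bytes) -> bytes:
    """Serialize one frame for one logical stream."""
    return struct.pack(">II", stream_id, len(payload)) + payload

def unpack_frames(data: bytes):
    """Yield (stream_id, payload) pairs back out of the shared byte pipe."""
    off = 0
    while off < len(data):
        stream_id, length = struct.unpack_from(">II", data, off)
        off += 8
        yield stream_id, data[off:off + length]
        off += length

# Two logical streams interleaved over one "connection".
wire = pack_frame(1, b"GET /a") + pack_frame(2, b"GET /b") + pack_frame(1, b"...")
streams = {}
for sid, chunk in unpack_frames(wire):
    streams.setdefault(sid, b"")
    streams[sid] += chunk
```

The point of the sketch is that frame (2) does not wait behind the rest of stream (1), which is the round-trip saving the paragraph above describes.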
Therefore it is reasonable to expect that - an HTTP <-> SPDY proxy may improve Tor performance by some - amount. - - The consequences for caching need to be considered carefully. - Most of the optimizations SPDY offers have no effect on caching, because - the existing HTTP cache control headers are transmitted without - modification. Server push is more problematic, because here - the server may push a resource that the client already has. - -3. Design outline - - One way to implement the SPDY proxy is for Tor exit nodes to - advertise this capability in their descriptor. The OP would - then preferentially select these nodes when routing streams - destined for port 80. - - Then, rather than sending the usual RELAY_BEGIN cell, the OP - would send a RELAY_BEGIN_TRANSFORMED cell, with a parameter to - indicate that the exit node should translate between SPDY and - HTTP. The rest of the connection process would operate as - usual. - - There would need to be some way of elegantly handling non-HTTP - traffic which goes over port 80. - -4. Implementation status - - SPDY is under active development and both the specification - and implementations are in a state of flux. Initial - experiments with Google Chrome in SPDY mode and server - libraries indicate that more work is needed before they are - production-ready. There is no indication that browsers other - than Google Chrome will support SPDY (and no official - statement as to whether Google Chrome will eventually enable - SPDY by default). - - Implementing a full SPDY proxy would be non-trivial. Stream - multiplexing and compression are supported by existing - libraries and would be fairly simple to implement. Request - prioritization would require some form of caching on the - proxy-side. Server push and server hint would require content - parsing to identify resources which should be treated - specially. - -5.
Security and policy implications - - A SPDY proxy would be a significant amount of code, and may - pull in external libraries. This code will process potentially - malicious data, both at the SPDY and HTTP sides. This proposal - therefore increases the risk that exit nodes will be - compromised by exploiting a bug in the proxy. - - This proposal would also be the first way in which Tor is - modifying TCP stream data. Arguably this is still meta-data - (HTTP headers), but there may be some concern that Tor should - not be doing this. - - Torbutton only works with Firefox, but SPDY only works with - Google Chrome. We should be careful not to recommend that - users adopt a browser which harms their privacy in other ways. - -6. Open questions: - - - How difficult would this be to implement? - - - How much performance improvement would it actually result in? - - - Is there some way to rapidly develop a prototype which would - answer the previous question? - -[1] SPDY: An experimental protocol for a faster web - http://dev.chromium.org/spdy/spdy-whitepaper -[2] Shining Light in Dark Places: Understanding the Tor Network Damon McCoy, - Kevin Bauer, Dirk Grunwald, Tadayoshi Kohno, Douglas Sicker - http://www.cs.washington.edu/homes/yoshi/papers/Tor/PETS2008_37.pdf diff --git a/doc/spec/proposals/ideas/xxx-what-uses-sha1.txt b/doc/spec/proposals/ideas/xxx-what-uses-sha1.txt deleted file mode 100644 index b3ca3eea5a..0000000000 --- a/doc/spec/proposals/ideas/xxx-what-uses-sha1.txt +++ /dev/null @@ -1,247 +0,0 @@ -Filename: xxx-what-uses-sha1.txt -Title: Where does Tor use SHA-1 today? -Authors: Nick Mathewson, Marian -Created: 30-Dec-2008 -Status: Meta - - -Introduction: - - Tor uses SHA-1 as a message digest. SHA-1 is showing its age: - theoretical attacks for finding collisions against it get better - every year or two, and it will likely be broken in practice before - too long. 
- - According to smart crypto people, the SHA-2 functions (SHA-256, etc) - share too much of SHA-1's structure to be very good. RIPEMD-160 is - also based on flawed past hashes. Some people think other hash - functions (e.g. Whirlpool and Tiger) are not as bad; most of these - have not seen enough analysis to be used yet. - - Here is a 2006 paper about hash algorithms. - http://www.sane.nl/sane2006/program/final-papers/R10.pdf - - (Todo: Ask smart crypto people.) - - By 2012, the NIST SHA-3 competition will be done, and with luck we'll - have something good to switch to. But it's probably a bad idea to - wait until 2012 to figure out _how_ to migrate to a new hash - function, for two reasons: - 1) It's not inconceivable we'll want to migrate in a hurry - some time before then. - 2) It's likely that migrating to a new hash function will - require protocol changes, and it's easiest to make protocol - changes backward compatible if we lay the groundwork in - advance. It would suck to have to break compatibility with - a big hard-to-test "flag day" protocol change. - - This document attempts to list everything Tor uses SHA-1 for today. - This is the first step in getting all the design work done to switch - to something else. - - This document SHOULD NOT be a clearinghouse of what to do about our - use of SHA-1. That's better left for other individual proposals. - - -Why now? - - The recent publication of "MD5 considered harmful today: Creating a - rogue CA certificate" by Alexander Sotirov, Marc Stevens, Jacob - Appelbaum, Arjen Lenstra, David Molnar, Dag Arne Osvik, and Benne de - Weger has reminded me that: - - * You can't rely on theoretical attacks to stay theoretical. - * It's quite unpleasant when theoretical attacks become practical - and public on days you were planning to leave for vacation. - * Broken hash functions (which SHA-1 is not quite yet AFAIU) - should be dropped like hot potatoes. Failure to do so can make - one look silly.
- - -Triage - - How severe are these problems? Let's divide them into these - categories, where H(x) is the SHA-1 hash of x: - PREIMAGE -- find any x such that H(x) has a chosen value - -- A SHA-1 usage that only depends on preimage - resistance - * Also SECOND PREIMAGE. Given x, find a y not equal to - x such that H(x) = H(y) - COLLISION<role> -- A SHA-1 usage that depends on collision - resistance, but the only party who could mount a - collision-based attack is already in a trusted role - (like a distribution signer or a directory authority). - COLLISION -- find any x and y such that H(x) = H(y) -- A - SHA-1 usage that depends on collision resistance - and doesn't need the attacker to have any special keys. - - There is no need to put much effort into fixing PREIMAGE and SECOND - PREIMAGE usages in the near-term: while there have been some - theoretical results doing these attacks against SHA-1, they don't - seem to be close to practical yet. Fixing COLLISION<code-signing> - usages is not too important either, since anyone who has the key to - sign the code can mount far worse attacks. It would be good to fix - COLLISION<authority> usages, since we try to resist bad authorities - to a limited extent. The COLLISION usages are the most important - to fix. - - Kelsey and Schneier published a theoretical second preimage attack - against SHA-1 in 2005, so it would be a good idea to fix PREIMAGE - and SECOND PREIMAGE usages after fixing COLLISION usages or where fixes - require minimal effort. - - http://www.schneier.com/paper-preimages.html - - Additionally, we need to consider the impact of a successful attack - in each of these cases. SHA-1 collisions are still expensive even - if recent results are verified, and anybody with the resources to - compute one also has the resources to mount a decent Sybil attack. - - Let's be pessimistic, and not assume that producing collisions of - a given format is actually any harder than producing collisions at - all.
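The gap between COLLISION and PREIMAGE difficulty can be made concrete with a toy birthday search on a truncated digest. This demonstrates only the generic square-root speedup; it is not an attack on full SHA-1:

```python
import hashlib
from itertools import count

def trunc_sha1(x: bytes, nbytes: int) -> bytes:
    """SHA-1 truncated to nbytes, standing in for a weak hash."""
    return hashlib.sha1(x).digest()[:nbytes]

def birthday_collision(nbytes: int):
    """Find m1 != m2 with equal truncated digests.

    A birthday search needs roughly 2^(4*nbytes) hash evaluations,
    far fewer than the ~2^(8*nbytes) a brute-force preimage search
    needs -- which is why COLLISION usages break first.
    """
    seen = {}
    for i in count():
        m = i.to_bytes(8, "big")   # distinct messages, so any repeat
        d = trunc_sha1(m, nbytes)  # of a digest is a true collision
        if d in seen:
            return seen[d], m
        seen[d] = m

# 24-bit digest: a collision typically appears after a few thousand tries.
m1, m2 = birthday_collision(3)
```

The same asymmetry is why the triage above treats COLLISION as the urgent category and leaves PREIMAGE fixes for later.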
- - -What Tor uses hashes for today: - -1. Infrastructure. - - A. Our X.509 certificates are signed with SHA-1. - COLLISION - B. TLS uses SHA-1 (and MD5) internally to generate keys. - PREIMAGE? - * At least breaking SHA-1 and MD5 simultaneously is - much more difficult than breaking either - independently. - C. Some of the TLS ciphersuites we allow use SHA-1. - PREIMAGE? - D. When we sign our code with GPG, it might be using SHA-1. - COLLISION<code-signing> - * GPG 1.4 and up have writing support for SHA-2 hashes. - This blog has help for converting: - http://www.schwer.us/journal/2005/02/19/sha-1-broken-and-gnupg-gpg/ - E. Our GPG keys might be authenticated with SHA-1. - COLLISION<code-signing-key-signing> - F. OpenSSL's random number generator uses SHA-1, I believe. - PREIMAGE - -2. The Tor protocol - - A. Everything we sign, we sign using SHA-1-based OAEP-MGF1. - PREIMAGE? - B. Our CREATE cell format uses SHA-1 for OAEP padding. - PREIMAGE? - C. Our EXTEND cells use SHA-1 to hash the identity key of the - target server. - COLLISION - D. Our CREATED cells use SHA-1 to hash the derived key data. - ?? - E. The data we use in CREATE_FAST cells to generate a key is the - length of a SHA-1. - NONE - F. The data we send back in a CREATED/CREATED_FAST cell is the length - of a SHA-1. - NONE - G. We use SHA-1 to derive our circuit keys from the negotiated g^xy - value. - NONE - H. We use SHA-1 to derive the digest field of each RELAY cell, but that's - used more as a checksum than as a strong digest. - NONE - -3. Directory services - - [All are COLLISION or COLLISION<authority> ] - - A. All signatures are generated on the SHA-1 of their corresponding - documents, using PKCS1 padding. - * In dir-spec.txt, section 1.3, it states, - "SIGNATURE" Object contains a signature (using the signing key) - of the PKCS1-padded digest of the entire document, taken from - the beginning of the Initial item, through the newline after - the Signature Item's keyword and its arguments."
- So our attacker, Malcolm, could generate a second document with the - same hash as the one that is signed; that is, a second pre-image - attack is possible. Vulnerable to a regular collision attack only if - the key is stolen. - If the key is stolen, Malcolm could distribute two different - copies of the document which have the same hash. Maybe useful - for a partitioning attack? - B. Router descriptors identify their corresponding extra-info documents - by their SHA-1 digest. - * A third party might use a second pre-image attack to generate a - false extra-info document that has the same hash. The router - itself might use a regular collision attack to generate multiple - extra-info documents with the same hash, which might be useful - for a partitioning attack. - C. Fingerprints in router descriptors are taken using SHA-1. - * The fingerprint must match the public key. Not sure what would - happen if two routers had different public keys but the same - fingerprint. There could perhaps be unpredictable behaviour. - D. In router descriptors, routers in the same "Family" may be listed - by server nicknames or hexdigests. - * Does not seem critical. - E. Fingerprints in authority certs are taken using SHA-1. - F. Fingerprints in dir-source lines of votes and consensuses are taken - using SHA-1. - G. Networkstatuses refer to routers' identity keys and descriptors by their - SHA-1 digests. - H. Directory-signature lines identify which key is doing the signing by - the SHA-1 digests of the authority's signing key and its identity key. - I. The following items are downloaded by the SHA-1 of their contents: - XXXX list them - J. The following items are downloaded by the SHA-1 of an identity key: - XXXX list them too. - -4. The rendezvous protocol - - A. Hidden servers use SHA-1 to establish introduction points on relays, - and relays use SHA-1 to check incoming introduction point - establishment requests. - B. Hidden servers use SHA-1 in multiple places when generating hidden - service descriptors.
- * The permanent-id is the first 80 bits of the SHA-1 hash of the - public key - ** time-period calculations use the permanent-id - * The secret-id-part is the SHA-1 hash of the time period, the - descriptor-cookie, and replica. - * Hash of introduction point's identity key. - C. Hidden servers performing basic-type client authorization for their - services use SHA-1 when encrypting introduction points contained in - hidden service descriptors. - D. Hidden service directories use SHA-1 to check whether a given hidden - service descriptor may be published under a given descriptor - identifier or not. - E. Hidden servers use SHA-1 to derive .onion addresses of their - services. - * What's worse, it only uses the first 80 bits of the SHA-1 hash. - However, the rend-spec.txt says we aren't worried about arbitrary - collisions? - F. Clients use SHA-1 to generate the current hidden service descriptor - identifiers for a given .onion address. - G. Hidden servers use SHA-1 to remember digests of the first parts of - Diffie-Hellman handshakes contained in introduction requests in order - to detect replays. See the RELAY_ESTABLISH_INTRO cell. We seem to be - taking a hash of a hash here. - H. Hidden servers use SHA-1 during the Diffie-Hellman key exchange with - a connecting client. - -5. The bridge protocol - - XXXX write me - - A. Clients may attempt to query for bridges for which they know a digest - (probably SHA-1) before making a direct query. - -6. The Tor user interface - - A. We log information about servers based on SHA-1 hashes of their - identity keys. - COLLISION - B. The controller identifies servers based on SHA-1 hashes of their - identity keys. - COLLISION - C. Nearly all of our configuration options that list servers allow SHA-1 - hashes of their identity keys. - COLLISION - E. The deprecated .exit notation uses SHA-1 hashes of identity keys. - COLLISION
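The 80-bit truncation worried about in item 4.E looks roughly like this (a sketch only; the input below is placeholder bytes, not a real DER-encoded service key):

```python
import base64
import hashlib

def v2_onion_address(der_public_key: bytes) -> str:
    """Derive a v2-style .onion address: the base32 encoding of the
    first 80 bits (10 bytes) of the SHA-1 of the service's public key."""
    permanent_id = hashlib.sha1(der_public_key).digest()[:10]
    # 10 bytes = exactly 16 base32 characters, no padding needed.
    return base64.b32encode(permanent_id).decode("ascii").lower() + ".onion"

addr = v2_onion_address(b"placeholder key bytes")
```

Because only 80 bits survive the truncation, a generic birthday search for two keys with the same address needs on the order of 2^40 work, far below SHA-1's full 160-bit output would suggest.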