aboutsummaryrefslogtreecommitdiff
path: root/proposals/ideas
diff options
context:
space:
mode:
authorNick Mathewson <nickm@torproject.org>2011-10-18 21:43:36 -0400
committerNick Mathewson <nickm@torproject.org>2011-10-19 19:59:47 -0400
commitd55883abf351b0df0a46ee6f5adc17dad61da9de (patch)
treee643c7ccc8ebec704cde6c7139266626f0124c58 /proposals/ideas
parentef5a3c93402000dcaa5eff4878010cbb36a01205 (diff)
downloadtorspec-d55883abf351b0df0a46ee6f5adc17dad61da9de.tar.gz
torspec-d55883abf351b0df0a46ee6f5adc17dad61da9de.zip
start a draft for "new crypto ops"
Diffstat (limited to 'proposals/ideas')
-rw-r--r--proposals/ideas/xxx-new-crypto-sketch.txt601
-rw-r--r--proposals/ideas/xxx-ntor-handshake.txt17
2 files changed, 614 insertions, 4 deletions
diff --git a/proposals/ideas/xxx-new-crypto-sketch.txt b/proposals/ideas/xxx-new-crypto-sketch.txt
new file mode 100644
index 0000000..a4893e1
--- /dev/null
+++ b/proposals/ideas/xxx-new-crypto-sketch.txt
@@ -0,0 +1,601 @@
+Title: Design sketch for new crypto ops
+Date: 19 Oct 2011
+Author: Nick Mathewson
+
+0. Overview
+
+ The point of this document is to discuss what crypto we ought to be using.
+ See "Initial Thoughts on Migrating Tor to New Cryptography" from last year
+ for general guidelines and principles.
+
+ In broad strokes, the parts of our crypto are:
+
+ IDENTITY KEYS AND FINGERPRINTS
+ Addressed here in Section 2.
+ LINK CRYPTO (TLS) --
+ Addressed in proposls 176, 184. We say a little here though in
+ section 5.
+ CREATE/EXTEND CRYPTO --
+ Addressed in xxx-ntor-handshake.txt and rransom's extend draft
+ RELAY CRYPTO
+ Addressed here in Section 6
+ DIRECTORY SYSTEM
+ Addressed here.
+ HIDDEN SERVICE SYSTEM
+ Addressed in a forthcoming document by rransom.
+
+1. Base algorithm choice
+
+ There seem to be two main candidate algorithms for signatures: RSA with big
+ keys (hereinafter "RSA>1024"); and Ed25519, which is DSA with the sharp
+ edges filed off on an Edwards curve related to DJB's Curve25519. We can
+ look at other ECC groups too. {But see ECC Notes.}
+
+ For Diffie Hellman: Curve25519 seems like a decent choice; failing that,
+ another . DH
+ on Z_p with big groups (hereinafter "DH>1024"). {But see ECC notes.}
+
+ For a hash function: SHA256, switching to SHA3 in 2012 when it comes out.
+ It might be worthwhile waiting for SHA3 in most places, and skipping over
+ the sha256 stage entirely.
+
+ For a stream cipher: AES-CTR is in one sense a conservative choice inasmuch
+ as AES is well-analyzed, but AES's well-known issues with cache-based
+ timing attacks are pretty worrisome. We can mitigate that some by using
+ random secret IVs for AES-CTR, so that we will be encrypting neither
+ attacker-chosen nor attacker-known plaintext with out AES cipher, but
+ that's a bit kludgy. Salsa20 is what rransom likes these days, but IMO we
+ aren't competent to tell whether it looks good or not; the existing attacks
+ against it don't look like very bad news to me, but who knows whether it's
+ getting enough attention that we can read. See also ChaCha; see also the
+ other EuroCrypt {XXXX} winners/finalists; see also SHA3 if the SHA3 winner
+ specifies a way to use it as a stream cipher, or specifies an underlying
+ stream/block cipher.
+
+ For a random number generator: We currently use OpenSSL seeded with
+ RAND_poll and with platform entropy. OpenSSL uses RC4 (XXX check). The
+ platform entropy management can be messy, obscure, or both. I suggest
+ that:
+ * we should seed our PRNG with more entropy sources if we can find some
+ promising code with an appropriate license
+ * instead of just using OpenSSL's PRNG, we should use OpenSSL's PRNG
+ xor'd with some other good PRNG. (Is Yarrow still cool? And is there
+ a combine operation better than xor?)
+ * We should re-seed the RNG before and after very sensitive operations,
+ like private key generation.
+
+
+1.1. ECC notes
+
+ ECC is the brave new[*] crypto of the future! It's faster[**] than doing
+ crypto in Z_n (as we do for RSA and DH now) for equivalent levels of
+ security, and the resulting outputs are much shorter.
+
+ As near as I can tell as a layman, Certicom is muddying the waters as much
+ as possible wrt claiming that it's nigh impractical to deploy ECC without
+ licensing their patents. This is rather like the silliness that PKP used
+ to pull back in the day, where they claimed that their patents covered not
+ only the existing public key cryptography algorithms, but also the very
+ idea of public key cryptography itself.
+
+ DJB claims that for every patent he's aware of, either that patent doesn't
+ cover his code, or that patent is invalid because of prior art. I'm not
+ going to try to evaluate these claims, since I'm not supposed to be reading
+ patents for typical "let's avoid the appearance of knowing infringement"
+ reasons. But before we dive into the world of ECC, we should see if we can
+ ask any friendly patent attorneys and ECC experts for a second or third
+ opinion here.
+
+ I note in passing that nearly all of the patents that DJB mentions in his
+ list would appear to expire over the next 12 months or so.
+
+ Additionally, there are ECC groups out there less fast than DJB's, but more
+ widely available and analyzed. We should consider some of those too.
+
+ One final issue to investigate is whether using these algorithms will make
+ any major free software distribution decide not to include us. I seem to
+ recall seeing that one or two of the big ones had at one point decided to
+ ship OpenSSL only with ECC disabled, either because of real patent
+ concerns, or because of an opinion that the certicom license for ECC use in
+ TLS was problematic for free software, or something like that. We should
+ check that out.
+
+ [*] Actually, it's older than onion routing, and older than some members of
+ the Tor project.
+
+ [**] Actually, because of the common practice of choosing a small-ish prime
+ value (65537) for e in RSA, RSA public key operations can be a little
+ faster than equivalent-security ECDH or ECDSA operations. The private key
+ operations in RSA are still much much slower.
+
+2. New identities
+
+ Identity keys and their fingerprints are used:
+ - To sign router descriptors
+ - To identify nodes in consensus directories
+ - To make sure we're talking to the right node in the link handshake
+ - To make sure that the extending node is talking to the right next node
+ when sending an extend cell
+ - To identify particular nodes in the hidden service subsystem
+ - To identify nodes in the UI in various places
+ - Internally, to identify a node uniquely in the codebase.
+ - To determine which part of the circuit ID space to use on a Tor
+ instance's links.
+
+2.1. New identities, option 1: RSA>1024, slow migration.
+
+ We use RSA for identity keys indefinitely. Nearly all operations done with
+ an identity key are signature checking; signing happens only a few times an
+ hour per node even with pathological cases. Since signature checking is
+ really cheap with RSA, there's no speed advantage for ECC here. (There is
+ a space advantage, since the keys are much smaller.)
+
+ The easiest way to migrate to longer identity keys is to tell all Tors to
+ begin accepting longer identity keys now, and to tweak all our protocols so
+ that longer RSA identity keys are understood. We should then have a pair
+ of parameters in the consensus that determines the largest and smallest
+ acceptable identity key size in the network. Clients and servers should
+ reject any keys longer or shorter than specified. Once all versions of Tor
+ can accept long identity keys, we raise the maximum size from 1024 to
+ somewhere in the 2048-4096 range.
+
+2.2. New identities option 2: RSA>1024, faster migration.
+
+ We use RSA for identity keys indefinitely as above. But we allow nodes to
+ begin having longer identities now, even though older Tors won't understand
+ them. This implies, of course, that every such node needs to have at least
+ 2 identities: one RSA1024 identity for backward compatibility, one RSA>1024
+ identity for more secure identification.
+
+ We would have these identities cross-certify as follows: All keys would be
+ listed in the router descriptor. RSA>1024 keys would be called something
+ other than identity-key, so as not to confuse older clients. A signature
+ with the RSA>1024 key would appear right before the current RSA1024
+ signature. This way, signed material would include both keys, and would be
+ signed by both keys.
+
+ [In other words, descriptors would look something like:
+
+ router foo...
+ ...
+ identity-key
+ -----BEGIN RSA KEY-----
+ 1024-bit RSA key here
+ -----END RSA KEY-----
+ ext-identity-key
+ -----BEGIN RSA KEY-----
+ 3072-bit RSA key here
+ -----END RSA KEY-----
+ ...
+ ext-signature
+ -----BEGIN SIGNATURE-----
+ signature of everything through "ext-signature\n", using the long key
+ -----END SIGNATURE-----
+ router-signature
+ -----BEGIN SIGNATURE-----
+ signature of everything through "router-signature\n", using the short key
+ -----END SIGNATURE-----
+
+ ]
+
+ See "UI notes" in the "new fingerprints" section below for some of the
+ implications of letting nodes have multiple identity keys.
+
+ We'll need to advertise these new identities in consensus directories too;
+ see XXXX below for more info there.
+
+2.3. New identities option 3: RSA>1024 and/or Ed25519, faster migration
+
+ As in option 2 above, but new keys can also be Ed25519. If we expect that
+ not all installations will allow Ed25519 (see "ECC Notes", section 1.1),
+ we'll need to say that every server with an Ed25519 key must also have an
+ RSA>1024 key.
+
+2.4. Implications for current use of identity keys
+
+ Let's review our use of identity keys again and make sure that we can
+ handle all of them with the ideas above.
+
+ - To sign router descriptors
+
+ We discussed this in 2.2.
+
+ - To make sure we're talking to the right node in the link handshake
+
+ The current v3 link handshake can handle presenting multiple identity
+ certificates in the CERT cell. We should consider ourselves to be
+ connected to a node with identity X if _any_ of the identity certificates
+ that it presents in its authenticated CERT cell has identity X. To handle
+ EXTEND cells correctly, we should verify every identity we can.
+
+ - To make sure that the extending node is talking to the right next node
+ when sending an extend cell
+
+ The new extend cell format needs to allow the client to tell the extending
+ node about some identity for the destination node that the extending node
+ will be able to understand. This is a capability of the extending node
+ that the client needs to be able to check. (Also, the extend cell needs to
+ hash that identity in a form the extending node can understand; but that's
+ a fingerprint issue.)
+
+ - To determine which part of the circuit ID space to use on a Tor
+ instance's links.
+
+ We can continue to use RSA1024 identity key comparison here by default. We
+ can also use some other parameter of the v3 handshake, or introduce a new
+ link protocol where if the initiator authenticates, the initiator always
+ gets the low circIDs and the responder always gets the high ones.
+
+ - To identity nodes in consensus directories
+ - To identify nodes in the UI in various places
+ - Internally, to identify a node uniquely in the codebase.
+
+ See sections 3 and 4 below.
+
+ - To identify particular nodes in the hidden service subsystem
+
+ Out of scope.
+
+2.5. Migrating away from short ID keys entirely
+
+ Eventually no version of Tor that requires 1024-bit identity keys will
+ remain. When that happens, we should stop using them entirely. That means
+ that if we take any path other than the "slow migration" path of 2.1, we'll
+ need to make everything that looks at a node's identity also accept nodes
+ with _only_ a RSA>1024/Ed25519 identity.
+
+ At the directory service level, we should have an option to allow nodes
+ without RSA1024 identity keys (off until all clients and nodes accept new
+ identity keys).
+
+2.6. Implications for directory information
+
+ Clients must know a hash for each node's identity key, or else they can't
+ make an authenticated connection to the node or tell ORs how to extend to
+ the node.
+
+
+3. New fingerprints
+
+ Right now we compute fingerprints by taking the SHA1 hash of an ASN1
+ encoding of the RSA1024 identity key. We encode this in hex almost
+ everywhere, and prefix it with a $.
+
+ I propose that fingerprints of the future be determined by taking a digest
+ using SHA256 or SHA3 of:
+
+ "Hash Algorithm Name", "Key Type Name", encoded key
+
+ When representing these internally, we should include the hash algorithm
+ that was used. When representing them in the UI, we should use the
+ notation %b64, where b64 is a base-64 encoding, omitting the trailing =s.
+
+ (Other plausible characters to use are @, ?, +, ~, =, etc. I like %, but
+ can be persuaded. Bikeshed bikeshed bikeshed.)
+
+ Since 43 base-64 characters is enough to represent a 256-bit digest, with 2
+ bits left over, I propose that the b64 value encode
+ hh | D(hash algorithm name, key type, encoded key),
+ where hh is a 2-bit value, with one of the values:
+ 00 -- sha256
+ 01 -- sha3
+ 10 -- to be determined
+ 11 -- reserved.
+
+ We should investigate in the interface whether it's plausible to allow a
+ prefix of a node ID where the full ID would otherwise be required. That
+ seems risky for short prefixes, though.
+
+3.1. How many fingerprints is that anyway?!
+
+ Suppose that we allow sha256 and sha3 as hash algorithms, and we allow each
+ node to have 3 identity keys: one RSA1024, one RSA>1024, and one ECC. Then
+ we would have 7 fingerprints (6 plus the legacy SHA1(RSA1024) fingerprint),
+ for a total of 20+6*32==212 bytes per node.
+
+ It's not a horrible problem to accept them all in the UI, but the UI isn't
+ the only place that needs to know fingerprints. Instead, let's say that
+ RSA1024 identities are only identified with SHA1 hashes. This limits our
+ fingerprint load to a more manageable 20+32*2 == 84 bytes per node. Still
+ not great, though.
+
+3.2. What does this imply for the UI?
+
+ In the UI we'll lose the property that no node has more than one
+ fingerprint: I do not believe that this actually hurts us.
+
+
+4. Directory changes
+
+4.1. Better cross-referencing
+
+ In some places, directory objects cross-reference one another by SHA1
+ hash. They should use a better hash algorithm instead.
+
+ XXX
+
+4.2. More fingerprints
+
+ Because extending from node A to node B requires that we have node B's
+ fingerprint in a way that node A will understand, it is not enough to get a
+ set of identity fingerprints for each node in the format _we_ like best.
+ Instead, we must .
+
+
+4.x. An option: compound signatures on directory objects
+
+ In Tor 0.2.2.x and later, when we check a signature on a directory object
+ (not including hidden service descriptors), we only look at the first
+ DIGEST_LEN bytes of the RSA-signed data. Once 0.2.1.x is obsolete, or on
+ any types of signatures not checked in 0.2.1.x, we can use the rest of the
+ space. (We're using PKCS1 padding on our signatures, which has an
+ overhead of 11 bytes. Signing a SHA1 hash with a 1024-bit key therefore
+ leaves 128-11-20==97 more bytes we could use for a SHA2 or a SHA3 hash.)
+
+4.x.
+
+
+5. Link crypto changes
+
+
+
+
+6. Relay crypto changes
+
+There are a few things we might want out of improved relay crypto. They
+include:
+ - Resistance to end-to-end bitwise tagging attacks.
+ - Better resistance to malleability.
+ - If using counter mode, no block-cipher operations on any value known
+ to the attacker.
+
+I'll try to provide these in increasing order of difficulty. None of these
+is necessarily correct; I should look for a security proof or a better
+construction for any that we seem likely to use.
+
+Rationales: Our existing malleability resistance is a kludge. Doing no
+block-cipher ops on attacker-known values increases our security margins a
+little. Our arguments about tagging attacks hold that an attacker who
+controls both ends has plenty of ways to win even if tagging attacks are
+foiled; nonetheless, most of these ways are technically slightly more
+difficult than xor-based tagging, and it could be useful to boost our
+defense-in-depth a little bit, just in case other active end-to-end attacks
+turn out to be harder than we'd though.)
+
+Option 1: Use AES-CTR in a less scary mode
+
+ When doing key expansion, in addition to establishing Kf, Kb, Df, and Db,
+ also establish IVf and IVb. Use the current relay crypto, except instead
+ of starting the counters at 0, start them at IVf and IVb. This way, an
+ attacker doesn't have any known plaintexts to work with, which makes AES a
+ little more robust.
+
+Option 2: As 1, but tagging attacks garble the circuit after one block.
+
+ Keep an HMAC of all previously received encrypted cells on a circuit.
+ When decrypting a cell, use this HMAC value to determine the first 64 bits
+ of the counter; increment the low 64 bits of the counter as usual.
+
+ This way, if an adversary flips any bits before passing the stream through
+ an honest node, no _subsequent_ block will be recoverable.
+
+ To prevent any part of the stream from being re-used, close any circuit if
+ the low 64 bits of the counter would ever wrap (that is, around 295
+ million terabytes).
+
+ (If we're using a stream cipher with fast re-key, then we can just have
+ the key used for each block be an HMAC of all previously received
+ ciphertext.)
+
+Option 3: As 1, but tagging attacks garble the circuit in the same block.
+
+ Use a large-block cipher mode, such as BEAR or LIONESS (depending on
+ whether we need a PRP or SPRP). Base the key material for each block on
+ an HMAC of all previous blocks' ciphertexts.
+
+ This way, if an adversary makes any alteration in a block, that block and
+ all subsequent blocks will be garbled. It's more expensive than 2,
+ though, especially if we need to use a LIONESS construction.
+
+ {I considered IGE here, with a trick where odd-numbered nodes on a circuit
+ start from the front of the block and even-numbered nodes start from the
+ end, but it didn't seem much better. We should investigate relative
+ performance, though.}
+
+Option 4: Shall we have middle nodes be able to fast-stop bad data?
+
+ In all the above options, if a cell is altered, the middle node can at
+ best turn that cell and the rest of the cells on the circuit into garbage,
+ which the last node won't deliver (if honest) or can't deliver (if dishonest).
+
+ Might we prefer to do as in mixnets, and have nodes kill circuits upon
+ receiving altered cells?
+
+ I'm not so sure. It's relatively expensive in per-cell overhead, and the
+ next-best attack to the one it prevents isn't so great.
+
+ Consider that if option 3 is in place, an end-to-end attacker who wants to
+ do a tagging attack at one node can garble the rest of the circuit, and
+ see if the output is garbled at the exit node. But such an attacker could
+ just as easily close the circuit at one of those nodes and watch for a
+ corresponding close event, or even better -- simply pause traffic on that
+ circuit for a while and watch for a corresponding gap at the exit. The
+ only advantage of the garbling attack would be that garbled cells are
+ presumably rarer than circuit closes or traffic pauses, and thus easier to
+ use to distinguish target circuits. But that's still bunk; the other
+ attacks win fine, and the pause attack doesn't risk detection so much.
+
+ Still, to do this one, we'd want to have outgoing and incoming circuits
+ treated differently. Incoming cells could work as in 1 or 2 or 3 above;
+ outgoing cells would want to have a header portion as in mixmaster, where
+ each node checks that the first N bits of the header match a MAC of all
+ data seen so far, *including the rest of the cell*. They'd then decrypt
+ the rest of the cell, remove the first N bits of the header, and re-pad
+ the header with N bits at the end, taken from a PRNG whose seed is shared
+ with the client. (This is basically how mixmaster works, and how
+ mixminion works in the common case.)
+
+ The space overhead here is kind of large: N bits per cell per node. In
+ the most paranoid case, if we used 256-byte HMACs on 3-node paths, that's
+ 96 bytes per cell, which is more than 20% of the total length. But we can
+ probably do better if we let the CREATE operation also tell the node some
+ N to check. For example, the first node doesn't need to check any bits.
+ The second and third nodes could check 64 bits apiece; that only has 16
+ bytes overhead, and high probability of catching any changes. (Birthday
+ attacks don't matter here, and an attacker who mounts this attack for long
+ enough to accidentally find a 64-bit mac will break so many circuits in
+ the process as to become totally unreliable.
+
+ But all of this leaks the path lengths and position on the path to various
+ nodes. We might open ourselves up to partitioning attacks if different
+ clients choose different numbers of bits. What's more, we might leak the
+ length of the path to the last node by how much junk there is at the end
+ of the cell.
+
+ Here's a simple construction for this format, to be concrete:
+
+ The CREATE operation's KDF produces the following outputs:
+ Kf, IVf (stream cipher key, and possibly IV, for forward direction)
+ Kb, IVb (stream cipher key, and possibly IV, for reverse direction)
+ Mf (MAC key for forward direction)
+ Mb (MAC key for reverse direction)
+ SEEDf (PRNG key for forward direction)
+ And it also sets the following user-selected parameter:
+ MACBYTESf (an integer between 0 and 32 inclusive)
+ MACBYTESb (an integer between 0 and 32 inclusive)
+ CANEXIT (boolean: can we exit from this hop?)
+
+ Let Kf[i], Mf[i], etc denote the parameter Kf, Mf, etc as shared between
+ the client and the i'th node in its circuit.
+
+ Relay cells sent towards the client have the following plaintext format:
+
+ Body:
+ Content:
+ Relay Command [1 byte]
+ StreamID [2 bytes]
+ Length [2 bytes]
+ Data [Up to CELL_DATA_LEN-5-MACBYTESb bytes]
+ Padding [randomly generated as needed to fill the cell]
+ MAC(All previous encrypted content + encrypted content,
+ Mb)[:MACBYTESb] [MACBYTESb bytes]
+
+ The originator of the client-bound cell encrypts the content with the
+ next part of its Kb,IVb stream, then appends the MAC.
+
+ Non-clients receiving a client-bound relay cell encrypt the entire cell
+ body, MAC included, with the next part of the stream cipher that was
+ keyed with Kb,IVb.
+
+ When the client receives a relay cell body, it iteratively does:
+
+ For node i in circuit from 1..N:
+ Let cells_i = all previous cells which we decided were from i,
+ and let cellbody = the body of the cell, except for the last
+ MACBYTESb[i] bytes.
+ {XXXX I'd be more comfortable if the MAC covered all cells
+ passed by every node on the circuit.}
+
+ If MACBYTESb[i]>0, check whether MAC(cells_i | cellbody,
+ Mb[i])[:MACBYTESb[i]] the last MACBYTESb[i] bytes of the cell. If
+ so, this cell is from node i.
+
+ If this cell is from node i, decrypt the first
+ CELL_DATA_LEN-MACBYTESb[i] bytes of the cell using the stream
+ keyed with Kb[i],IVb[i]. Act on it as a relay cell.
+
+ Otherwise decrypt the entire cell, MAC included, with the stream
+ keyed with Kb[i], IVb[i].
+
+ If no node sent this cell: it's junk and somebody is probably
+ messing with us! Destroy the circuit.
+
+ When the client *sends* a cell outbound to node N:
+
+ Let cells[i] start at "" for all i in 1...N initially, and get
+ updated as below.
+
+ Let MACLEN = SUM(MACBYTESf[1...N])
+
+ Let Body =
+ Relay Command [1 byte]
+ StreamID [2 bytes]
+ Length [2 bytes]
+ Data [Up to CELL_DATA_LEN-5-MACLEN bytes]
+ Padding [randomly generated,
+ CELL_DATA_LEN-5-MACLEN-len(Data) bytes]
+
+ Let PAD[i] = the next MACBYTESf[i] bytes from the PRNG keyed
+ with SEEDf[i], for i in 1...N.
+
+ Let STREAM[i] = the next CELL_DATA_LEN-MACBYTESf[i] bytes of
+ the stream keyed by Kf[i],IV[i], for i in 1...N.
+
+ Let PADSEEN[1] == ""
+
+ For i in 2...N:
+ Let L = len(PADSEEN[i-1])
+ Let PADSEEN[i] = (PADSEEN[i-1] xor STREAM[i-1][CELL_DATA_LEN-L:]) | PAD[i-1]
+
+ For i in N down to 1:
+
+ Let Encbody = Body xor STREAM[i][len(body)]
+ Let extra = "RECOGNIZED" if i == N, "OK" otherwise
+ Let cells[i] = cells[i] | Body | PADSEEN[i]
+ Let M = MAC(cells[i] | extra , Mf[i])
+
+ Let Body = M[:MACBYTESf[i]] | EncBody
+
+
+ To receive an outbound cell:
+
+ Let M be the first MACBYTESf bytes of the cell, let REST be the rest
+ of the cell, and let "cells" be all previous cells on this circuit.
+ If CANEXIT, and M = MAC(cells|rest|"RECOGNIZED", Mb)[:MACBYTESf],
+ and MACBYTESf > 0, this cell is for us. If M = MAC(cells|rest|"OK",
+ Mb)[:MACBYTESf], this cell is not for us, but is valid. Otherwise,
+ destroy the circuit.
+
+ Decrypt REST using the stream cipher keyed with Kf,IVf. If this
+ cell is for us, act on it as a relay cell. Otherwise, let
+ PAD = the next MACBYTESf[i] bytes of the PRNG keyed with SEEDf,
+ and relay REST | PAD.
+
+ ANOTHER VARIANT:
+
+ If we restrict MACBYTESf values to range 0..HL/2, where HL is the
+ length of the MAC output, we can replace
+ MAC(x | "RECOGIZED")[:MACBYTESf] and MAC(x | "OK")[:MACBYTESf]
+ with
+ MAC(x)[:MACBYTESf] and MAC(x)[HL-MACBYTESf:]
+
+ PICKING MACBYTESf,MACBYTESb.
+
+ We don't need to worry about birthday attacks:
+ Because we're using a MAC, only the parties who are making the
+ MACs could try to do a brute-force search for a collision, but
+ they have no reason to do so.
+
+ If a collision occurs accidentally, an adversary can't substitute
+ an earlier-seen cell for a later one with the same MAC, since the
+ MAC covers not only the cell, but all previous cells on the
+ circuit.
+ So 16 bytes is about the most we should ever do, given our usual
+ security parameters. Let me moot the number 8 for MACBYTESb.
+
+ For outbound cells, for any hop we can exit from, choosing
+ MACBYTESf=6 gets us the current security level. For the first hop,
+ assuming we don't exit from it, choosing MACBYTESf=0 is totally
+ safe, since the link crypto guarantees that nothing was corrupted on
+ the way.
+
+ In general, to prevent an end-to-end tagging attack, it seems
+ sufficient to do something like setting MACBYTES=8 for the last
+ hop, and MACBYTES=8 for one hop in the middle.
+
+
+ACKS
+
+ Lots of the good ideas and concerns here are due to Robert Ransom.
+
+ Michael Stone helped some with "relay option 4" above.
diff --git a/proposals/ideas/xxx-ntor-handshake.txt b/proposals/ideas/xxx-ntor-handshake.txt
index 41af5c7..1f988fc 100644
--- a/proposals/ideas/xxx-ntor-handshake.txt
+++ b/proposals/ideas/xxx-ntor-handshake.txt
@@ -67,13 +67,20 @@ Protocol:
NODEID: ID -- H_LENGTH bytes
KEYID: KEYID(B) -- H_LENGTH bytes
CLIENT_PK: X -- G_LENGTH bytes
+ PARAMSLEN: -- 2 bytes
+ PARMS: -- PARAMSLEN byets
+
+ (The "PARAMS" component is used to encode any additional authenticated
+ information that's needed for establishing the right kind of circuit.)
The server generates a keypair of y,Y = KEYGEN(), and computes
- secret_input = EXP(X,y) | EXP(X,b) | ID | B | X | Y | PROTOID
+ secret_input = EXP(X,y) | EXP(X,b) | ID | B | X | Y | PARAMSLEN | PARAMS
+ | PROTOID
KEY_SEED = H(secret_input, t_key)
verify = H(secret_input, t_verify)
- auth_input = verify | ID | B | Y | X | PROTOID | "Server"
+ auth_input = verify | ID | B | Y | X | PARAMSLEN | PARAMS | PROTOID
+ | "Server"
The server sends a CREATED cell containing:
@@ -82,10 +89,12 @@ Protocol:
The client then checks Y is in G^* [see below], and computes
- secret_input = EXP(Y,x) | EXP(B,x) | ID | B | X | Y | PROTOID
+ secret_input = EXP(Y,x) | EXP(B,x) | ID | B | X | Y | PARAMSLEN | PARAMS
+ | PROTOID
KEY_SEED = H(secret_input, t_key)
verify = H(secret_input, t_verify)
- auth_input = verify | ID | B | Y | X | PROTOID | "Server"
+ auth_input = verify | ID | B | Y | X | PARAMLENS | PARAMS | PROTOID
+ | "Server"
The client verifies that AUTH == H(auth_input, t_mac).