author Jacob Appelbaum <jacob@appelbaum.net> 2012-02-29 15:09:17 -0400
committer Jacob Appelbaum <jacob@appelbaum.net> 2012-02-29 15:09:17 -0400
commit 7b76d529802740b11af62b5ded6f61dc5d39c1d4
tree 8e84b70c40459f6ec6333d0a15877e5a55dfa8bf
parent 5b2b6b591ee436b9ff1e65247f4f32fe145c2c78
Add proposal 194 by Saizai and Alex Fink
Diffstat (limited to 'proposals/194-mnemonic-urls.txt')
-rw-r--r-- proposals/194-mnemonic-urls.txt 200
1 files changed, 200 insertions, 0 deletions
diff --git a/proposals/194-mnemonic-urls.txt b/proposals/194-mnemonic-urls.txt
new file mode 100644
index 0000000..7d66196
--- /dev/null
+++ b/proposals/194-mnemonic-urls.txt
@@ -0,0 +1,200 @@
+Filename: 194-mnemonic-urls.txt
+Title: Mnemonic .onion URLs
+Author: Sai, Alex Fink
+Created: 29-Feb-2012
+Status: Open
+
+1. Overview
+
+ Currently, canonical Tor .onion URLs consist of a naked 80-bit hash[1]. This
+ is not something that users can even recognize for validity, let alone produce
+ directly. It is vulnerable to partial-match fuzzing attacks[2], where a
+ would-be MITM attacker generates a very similar hash and uses various social
+ engineering, wiki poisoning, or other methods to trick the user into visiting
+ the spoof site.
+
+ This proposal gives an alternative method for displaying and entering .onion
+ and other URLs, such that they will be easily remembered and generated by end
+ users, and easily published by hidden service websites, without any dependency
+ on a full domain-name-style system such as Namecoin[3]. This makes it easier
+ to implement (requiring only a change in the proxy).
+
+ This proposal could equally be used for IPv4, IPv6, etc., if normal DNS is for
+ some reason untrusted.
+
+ This is not a petname system[4], in that it does not allow service providers
+ or users[5] to associate a name of their choosing to an address[6]. Rather, it
+ is a mnemonic system that encodes the 80-bit .onion address into a
+ meaningful[7] and memorable sentence. A full petname system (based on
+ registration of some kind, and allowing for shorter, service-chosen URLs) can
+ be implemented in parallel[8].
+
+ This system has the three properties of being secure, distributed, and
+ human-meaningful — it just doesn't also have choice of name (except of course
+ by brute force creation of multiple keys to see if one has an encoding the
+ operator likes).
+
+ This is inspired by Jonathan Ackerman's "Four Little Words" proposal[9] for
+ doing the same thing with IPv4 addresses. We just need to handle 80+ bits, not
+ just 32 bits.
+
+ It is similar to Markus Jakobsson & Ruj Akavipat's FastWord system[10], except
+ that it does not permit user choice of passphrase, does not know what URL a
+ user will enter (vs verifying against a single stored password), and again has
+ to encode significantly more data.
+
+ This is also similar to RFC1751[11], RFC2289[12], and multiple other
+ fingerprint encoding systems[13] (e.g. PGPfone[14] using the PGP
+ wordlist[15], and Arturo Filatsò's OnionURL[16]), but we aim to make something
+ that's as easy as possible for users to remember — and significantly easier
+ than just a list of words or pseudowords, which we consider only useful as an
+ active confirmation tool, not as something that can be fully memorized and
+ recalled, like a normal domain name.
+
+2. Requirements
+
+2.1. encodes at least 80 bits of random data (preferably more, e.g. for a
+checksum)
+
+2.2. valid, visualizable English sentence — not just a series of words[17]
+
+2.3. words are common enough that non-native speakers and bad spellers will have
+minimal difficulty remembering and producing them (perhaps with some spellcheck
+help)
+
+2.4. not syntactically confusable (e.g. order should not matter)
+
+2.5. short enough to be easily memorized and fully recalled at will, not just
+recognized
+
+2.6. no dependency on an external service
+
+2.7. dictionary size small enough to be reasonable for end users to download as
+part of the onion package
+
+2.8. consistent across users (so that websites can e.g. reinforce their random
+hash's phrase with a clever drawing)
+
+2.9. does not create offensive sentences that service providers will reject
+
+2.10. resistant against semantic fuzzing (e.g. by having uniqueness against
+WordNet synsets[18])
+
+3. Possible implementations
+
+ This section is intentionally left unfinished; full listing of template
+ sentences and the details of their parser and generating implementation is
+ co-dependent on the creation of word class dictionaries fulfilling the above
+ criteria. Since that's fairly labor-intensive, we're pausing at this stage for
+ input first, to avoid wasting work.
+
+3.1. Have a fixed number of template sentences, such as:
+
+ 1. Adj subj adv vtrans adj obj
+ 2. Subj and subj vtrans adj obj
+ 3. … etc
+
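+ For a 6 word sentence with 8 (3b) templates, the remaining 77 bits must be
+ split across 6 words: ~13b (8k word) dictionaries for each word category
+ cover all 80 bits, while ~12b (4k word) dictionaries reach only 75.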
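+
+ As a rough check of this bit budget, the sketch below (in Python, with
+ hypothetical helper names that are not part of this proposal) computes how
+ many bits each word must carry. It assumes a power-of-two number of templates
+ and equal-sized dictionaries for every word slot.
+
```python
def template_bits(templates: int) -> int:
    """Whole bits encoded by the choice of template (floor of log2)."""
    return templates.bit_length() - 1


def bits_per_word(total_bits: int, words: int, templates: int) -> int:
    """Smallest per-word bit count so that the sentence covers total_bits."""
    remaining = total_bits - template_bits(templates)
    return -(-remaining // words)  # ceiling division


def capacity(words: int, templates: int, dict_bits: int) -> int:
    """Total bits a sentence carries when every dictionary has dict_bits bits."""
    return template_bits(templates) + words * dict_bits


if __name__ == "__main__":
    need = bits_per_word(80, words=6, templates=8)
    print("bits per word:", need, "dictionary size:", 2 ** need)
    print("capacity with 12-bit dictionaries:", capacity(6, 8, 12))
```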
+
+ If multiple words of the same category are used, they must either play
+ different grammatical roles (e.g. subj vs obj, or adj on a different item), be
+ chosen from different dictionaries, or there needs to be an order-agnostic way
+ to join them at the bit level. Preferably this should be avoided, just to
+ prevent users forgetting the order.
+
+3.2. As 3.1, but treat sentence generation as decoding a prefix code, and have
+ a Huffman code for each word class.
+
+ We suppose it’s okay if the generated sentence has a few more words than it
+ might, as long as they’re common lean words. E.g., for adjectives, "good"
+ might cost only six bits while "unfortunate" costs twelve.
+
+ Choice between different sentence syntaxes could be worked into the prefix
+ code as well, and potentially done separately for each syntactic constituent.
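+
+ To illustrate the prefix-code idea, here is a minimal Python sketch. The
+ word lists, codewords, and function names are invented for illustration and
+ are not part of this proposal; a real implementation would derive a Huffman
+ code from word frequencies and would also need to handle padding at the end
+ of the 80-bit string, which this sketch does not.
+
```python
from typing import Dict, List, Tuple

# Hypothetical toy prefix codes, one per word class. Shorter codewords go to
# more common words, so a common word consumes fewer bits of the hash.
CODES: Dict[str, Dict[str, str]] = {
    "adj": {"0": "good", "10": "small", "110": "yellow", "111": "unfortunate"},
    "subj": {"0": "dog", "10": "camel", "11": "barnaby"},
    "vtrans": {"0": "sees", "1": "mangles"},
}


def decode_word(bits: str, pos: int, code: Dict[str, str]) -> Tuple[str, int]:
    """Read one codeword starting at bits[pos]; return (word, next position)."""
    prefix = ""
    while pos < len(bits):
        prefix += bits[pos]
        pos += 1
        if prefix in code:  # prefix-free: the first match is the only match
            return code[prefix], pos
    raise ValueError("bit string ended inside a codeword")


def decode_sentence(bits: str, template: List[str]) -> Tuple[List[str], int]:
    """Decode one word per template slot; return the words and bits consumed."""
    words: List[str] = []
    pos = 0
    for word_class in template:
        word, pos = decode_word(bits, pos, CODES[word_class])
        words.append(word)
    return words, pos


if __name__ == "__main__":
    print(decode_sentence("110101", ["adj", "subj", "vtrans"]))
```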
+
+4. Usage
+
+ To form a mnemonic .onion URL, just join the words with dashes or underscores,
+ stripping minimal words like 'a', 'the', 'and' etc., and append '.onion'. This
+ can be readily distinguished from standard hash-style .onion URLs by form.
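+
+ A minimal Python sketch of this step follows. The stop-word list, the
+ lowercasing, and the function names are assumptions made for illustration;
+ the only fixed fact used is that an 80-bit hash address is 16 base32
+ characters followed by '.onion'.
+
```python
import re

# Assumed list of "minimal words"; the proposal does not fix this set.
STOP_WORDS = {"a", "an", "the", "and"}

# 80 bits in base32 is 16 characters drawn from a-z and 2-7.
HASH_STYLE = re.compile(r"^[a-z2-7]{16}\.onion$")


def sentence_to_host(sentence: str, sep: str = "-") -> str:
    """Join the content words of a sentence into a mnemonic .onion hostname."""
    words = re.findall(r"[a-z]+", sentence.lower())
    kept = [w for w in words if w not in STOP_WORDS]
    return sep.join(kept) + ".onion"


def is_hash_style(host: str) -> bool:
    """True if host has the form of a standard 80-bit hash .onion name."""
    return HASH_STYLE.match(host.lower()) is not None


if __name__ == "__main__":
    host = sentence_to_host(
        "Barnaby thoughtfully mangles the simplistic yellow camels."
    )
    print(host, is_hash_style(host))
```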
+
+ Translation should take place at the client, though hidden service servers
+ should also be able to output the mnemonic form of hashes, to assist
+ website operators in publishing them (e.g. by posting an amusing drawing of
+ the described situation on their website to reinforce the mnemonic).
+
+ After the translation stage of name resolution, everything proceeds as normal
+ for an 80-bit hash onion URL.
+
+ The user should be notified of the mnemonic form of a hash URL in some way, and
+ have an easy way in the client UI to translate mnemonics to hashes and vice
+ versa. For the purposes of browser URLs and the like though, the mnemonic
+ should be treated on par with the hash; if the user enters a mnemonic URL they
+ should not be redirected to the hash version. (If anything, the opposite
+ may be true, so that users become used to seeing and verifying the mnemonic
+ version of hash URLs, and gain the security benefits against partial-match
+ fuzzing.)
+
+ Ideally, inputs that don't validly resolve should have a response page served
+ by the proxy that uses a simple spell-check system to suggest alternate domain
+ names that are valid hash encodings. This could hypothetically be done inline
+ in URL input, but would require changes on the browser (normally domain names
+ aren't subject to spellcheck), and this avoids that implementation problem.
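+
+ One possible shape for such a suggestion step, sketched in Python with the
+ standard library's difflib. The proposal does not specify a spell-check
+ algorithm, so the similarity measure, the function name, and the tiny word
+ list here are all assumptions.
+
```python
import difflib
from typing import Dict, List


def suggest_words(label: str, dictionary: List[str], n: int = 3) -> Dict[str, List[str]]:
    """For each unknown word in a mnemonic label, list close dictionary words.

    label is the part of the hostname before '.onion', with words joined by
    dashes or underscores. Words already in the dictionary are skipped.
    """
    suggestions: Dict[str, List[str]] = {}
    for word in label.replace("_", "-").split("-"):
        if word and word not in dictionary:
            suggestions[word] = difflib.get_close_matches(word, dictionary, n=n)
    return suggestions


if __name__ == "__main__":
    vocab = ["yellow", "mellow", "camel", "mangles"]  # hypothetical word list
    print(suggest_words("yelow-camel", vocab))
```
+
+ A response page could then offer only those corrected sentences that decode
+ to a valid hash.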
+
+5. International support
+
+ It is not possible for this scheme to support non-English languages without
+ a) (usually) Unicode in domains (which is not yet well supported by browsers),
+ and
+ b) fully customized dictionaries and phrase patterns per language
+
+ The scheme must not be used in an attempted 'translation' by simply replacing
+ English words with glosses in the target language. Several of the necessary
+ features would be completely mangled by this (e.g. other languages have
+ different synonym, homonym, etc groupings, not to mention completely different
+ grammar).
+
+ It is unlikely a priori that URLs constructed using a non-English
+ dictionary/pattern setup would in any sense 'translate' semantically to
+ English; more likely is that each language would have completely unrelated
+ encodings for a given hash.
+
+ We intend to only make an English version at first, to avoid these issues
+ during testing.
+
+________________
+
+[1] https://trac.torproject.org/projects/tor/wiki/doc/HiddenServiceNames
+https://gitweb.torproject.org/torspec.git/blob/HEAD:/address-spec.txt
+[2] http://www.thc.org/papers/ffp.html
+[3] http://dot-bit.org/Namecoin
+[4] https://en.wikipedia.org/wiki/Zooko's_triangle
+[5] https://addons.mozilla.org/en-US/firefox/addon/petname-tool/
+[6] However, service operators can generate a large number of hidden service
+descriptors and check whether their hashes result in a desirable phrasal
+encoding (much like certain hidden services currently use brute force generated
+hashes to ensure their name is the prefix of their raw hash). This won't get you
+whatever phrase you want, but will at least improve the likelihood that it's
+something amusing and acceptable.
+[7] "Meaningful" here inasmuch as e.g. "Barnaby thoughtfully mangles simplistic
+yellow camels" is an absurdist but meaningful sentence. Absurdness is a feature,
+not a bug; it decreases the probability of mistakes if the scenario described is
+not one that the user would try to fit into a template of things they have
+previously encountered IRL. See research into linguistic schema for further
+details.
+[8] https://gitweb.torproject.org/torspec.git/blob/HEAD:/proposals/ideas/xxx-onion-nyms.txt
+[9] http://blog.rabidgremlin.com/2010/11/28/4-little-words/
+[10] http://fastword.me/
+[11] https://tools.ietf.org/html/rfc1751
+[12] http://tools.ietf.org/html/rfc2289
+[13] https://github.com/singpolyma/mnemonicode
+http://mysteryrobot.com
+https://github.com/zacharyvoase/humanhash
+[14] http://www.mathcs.duq.edu/~juola/papers.d/icslp96.pdf
+[15] http://en.wikipedia.org/wiki/PGP_word_list
+[16] https://github.com/hellais/Onion-url
+https://github.com/hellais/Onion-url/blob/master/dev/mnemonic.py
+[17] http://www.reddit.com/r/technology/comments/ecllk
+[18] http://wordnet.princeton.edu/wordnet/man2.1/wnstats.7WN.html
+