``` Filename: 179-TLS-cert-and-parameter-normalization.txt Title: TLS certificate and parameter normalization Author: Jacob Appelbaum, Gladys Shufflebottom Created: 16-Feb-2011 Status: Closed Target: 0.2.3.x Draft spec for TLS certificate and handshake normalization Overview STATUS NOTE: This document is implemented in part in 0.2.3.x, deferred in part, and rejected in part. See indented bracketed comments in individual sections below for more information. -NM Scope This is a document that proposes improvements to problems with Tor's current TLS (Transport Layer Security) certificates and handshake that will reduce the distinguishability of Tor traffic from other encrypted traffic that uses TLS. It also addresses some of the possible fingerprinting attacks possible against the current Tor TLS protocol setup process. Motivation and history Censorship is an arms race and this is a step forward in the defense of Tor. This proposal outlines ideas to make it more difficult to fingerprint and block Tor traffic. Goals This proposal intends to normalize or remove easy-to-predict or static values in the Tor TLS certificates and with the Tor TLS setup process. These values can be used as criteria for the automated classification of encrypted traffic as Tor traffic. Network observers should not be able to trivially detect Tor merely by receiving or observing the certificate used or advertised by a Tor relay. I also propose the creation of a hard-to-detect covert channel through which a server can signal that it supports the third version ("V3") of the Tor handshake protocol. Non-Goals This document is not intended to solve all of the possible active or passive Tor fingerprinting problems. This document focuses on removing distinctive and predictable features of TLS protocol negotiation; we do not attempt to make guarantees about resisting other kinds of fingerprinting of Tor traffic, such as fingerprinting techniques related to timing or volume of transmitted data. Implementation details Certificate Issues The CN or commonName ASN1 field Tor generates certificates with a predictable commonName field; the field is within a given range of values that is specific to Tor. Additionally, the generated host names have other undesirable properties. The host names typically do not resolve in the DNS because the domain names referred to are generated at random. Although they are syntatically valid, they usually refer to domains that have never been registered by any domain name registrar. An example of the current commonName field: CN=www.s4ku5skci.net An example of OpenSSL’s asn1parse over a typical Tor certificate: 0:d=0 hl=4 l= 438 cons: SEQUENCE 4:d=1 hl=4 l= 287 cons: SEQUENCE 8:d=2 hl=2 l= 3 cons: cont [ 0 ] 10:d=3 hl=2 l= 1 prim: INTEGER :02 13:d=2 hl=2 l= 4 prim: INTEGER :4D3C763A 19:d=2 hl=2 l= 13 cons: SEQUENCE 21:d=3 hl=2 l= 9 prim: OBJECT :sha1WithRSAEncryption 32:d=3 hl=2 l= 0 prim: NULL 34:d=2 hl=2 l= 35 cons: SEQUENCE 36:d=3 hl=2 l= 33 cons: SET 38:d=4 hl=2 l= 31 cons: SEQUENCE 40:d=5 hl=2 l= 3 prim: OBJECT :commonName 45:d=5 hl=2 l= 24 prim: PRINTABLESTRING :www.vsbsvwu5b4soh4wg.net 71:d=2 hl=2 l= 30 cons: SEQUENCE 73:d=3 hl=2 l= 13 prim: UTCTIME :110123184058Z 88:d=3 hl=2 l= 13 prim: UTCTIME :110123204058Z 103:d=2 hl=2 l= 28 cons: SEQUENCE 105:d=3 hl=2 l= 26 cons: SET 107:d=4 hl=2 l= 24 cons: SEQUENCE 109:d=5 hl=2 l= 3 prim: OBJECT :commonName 114:d=5 hl=2 l= 17 prim: PRINTABLESTRING :www.s4ku5skci.net 133:d=2 hl=3 l= 159 cons: SEQUENCE 136:d=3 hl=2 l= 13 cons: SEQUENCE 138:d=4 hl=2 l= 9 prim: OBJECT :rsaEncryption 149:d=4 hl=2 l= 0 prim: NULL 151:d=3 hl=3 l= 141 prim: BIT STRING 295:d=1 hl=2 l= 13 cons: SEQUENCE 297:d=2 hl=2 l= 9 prim: OBJECT :sha1WithRSAEncryption 308:d=2 hl=2 l= 0 prim: NULL 310:d=1 hl=3 l= 129 prim: BIT STRING I propose that we match OpenSSL's default self-signed certificates. I hypothesise that they are the most common self-signed certificates. If this turns out not to be the case, then we should use whatever the most common turns out to be. Certificate serial numbers Currently our generated certificate serial number is set to the number of seconds since the epoch at the time of the certificate's creation. I propose that we should ensure that our serial numbers are unrelated to the epoch, since the generation methods are potentially recognizable as Tor-related. Instead, I propose that we use a randomly generated number that is subsequently hashed with SHA-512 and then truncate the data to eight bytes[1]. Random sixteen byte values appear to be the high bound for serial number as issued by Verisign and DigiCert. RapidSSL appears to be three bytes in length. Others common byte lengths appear to be between one and four bytes. The default OpenSSL certificates are eight bytes and we should use this length with our self-signed certificates. This randomly generated serial number field may now serve as a covert channel that signals to the client that the OR will not support TLS renegotiation; this means that the client can expect to perform a V3 TLS handshake setup. Otherwise, if the serial number is a reasonable time since the epoch, we should assume the OR is using an earlier protocol version and hence that it expects renegotiation. We also have a need to signal properties with our certificates for a possible v3 handshake in the future. Therefore I propose that we match OpenSSL default self-signed certificates (a 64-bit random number), but reserve the two least- significant bits for signaling. For the moment, these two bits will be zero. This means that an attacker may be able to identify Tor certificates from default OpenSSL certificates with a 75% probability. As a security note, care must be taken to ensure that supporting this covert channel will not lead to an attacker having a method to downgrade client behavior. This shouldn't be a risk because the TLS Finished message hashes over all the bytes of the handshake, including the certificates. [Randomized serial numbers are implemented in 0.2.3.9-alpha. We probably shouldn't do certificate tagging by a covert channel in serial numbers, since doing so would mean we could never have an externally signed cert. -NM] Certificate fingerprinting issues expressed as base64 encoding It appears that all deployed Tor certificates have the following strings in common: MIIB CCA gAwIBAgIETU ANBgkqhkiG9w0BAQUFADA YDVQQDEx 3d3cu As expected these values correspond to specific ASN.1 OBJECT IDENTIFIER (OID) properties (sha1WithRSAEncryption, commonName, etc) of how we generate our certificates. As an illustrated example of the common bytes of all certificates used within the Tor network within a single one hour window, I have replaced the actual value with a wild card ('.') character here: -----BEGIN CERTIFICATE----- MIIB..CCA..gAwIBAgIETU....ANBgkqhkiG9w0BAQUFADA.M..w..YDVQQDEx.3 d3cu............................................................ ................................................................ ................................................................ ................................................................ ................................................................ ................................................................ ................................................................ ................................................................ ........................... <--- Variable length and padding -----END CERTIFICATE----- This fine ascii art only illustrates the bytes that absolutely match in all cases. In many cases, it's likely that there is a high probability for a given byte to be only a small subset of choices. Using the above strings, the EFF's certificate observatory may trivially discover all known relays, known bridges and unknown bridges in a single SQL query. I propose that we ensure that we test our certificates to ensure that they do not have these kinds of statistical similarities without ensuring overlap with a very large cross section of the internet's certificates. Certificate dating and validity issues TLS certificates found in the wild are generally found to be long-lived; they are frequently old and often even expired. The current Tor certificate validity time is a very small time window starting at generation time and ending shortly thereafter, as defined in or.h by MAX_SSL_KEY_LIFETIME (2*60*60). I propose that the certificate validity time length is extended to a period of twelve Earth months, possibly with a small random skew to be determined by the implementer. Tor should randomly set the start date in the past or some currently unspecified window of time before the current date. This would more closely track the typical distribution of non-Tor TLS certificate expiration times. The certificate values, such as expiration, should not be used for anything relating to security; for example, if the OR presents an expired TLS certificate, this does not imply that the client should terminate the connection (as would be appropriate for an ordinary TLS implementation). Rather, I propose we use a TOFU style expiration policy - the certificate should never be trusted for more than a two hour window from first sighting. This policy should have two major impacts. The first is that an adversary will have to perform a differential analysis of all certificates for a given IP address rather than a single check. The second is that the server expiration time is enforced by the client and confirmed by keys rotating in the consensus. The expiration time should not be a fixed time that is simple to calculate by any Deep Packet Inspection device or it will become a new Tor TLS setup fingerprint. [Deferred and needs revision; see proposal XXX. -NM] Proposed certificate form The following output from openssl asn1parse results from the proposed certificate generation algorithm. It matches the results of generating a default self-signed certificate: 0:d=0 hl=4 l= 513 cons: SEQUENCE 4:d=1 hl=4 l= 362 cons: SEQUENCE 8:d=2 hl=2 l= 9 prim: INTEGER :DBF6B3B864FF7478 19:d=2 hl=2 l= 13 cons: SEQUENCE 21:d=3 hl=2 l= 9 prim: OBJECT :sha1WithRSAEncryption 32:d=3 hl=2 l= 0 prim: NULL 34:d=2 hl=2 l= 69 cons: SEQUENCE 36:d=3 hl=2 l= 11 cons: SET 38:d=4 hl=2 l= 9 cons: SEQUENCE 40:d=5 hl=2 l= 3 prim: OBJECT :countryName 45:d=5 hl=2 l= 2 prim: PRINTABLESTRING :AU 49:d=3 hl=2 l= 19 cons: SET 51:d=4 hl=2 l= 17 cons: SEQUENCE 53:d=5 hl=2 l= 3 prim: OBJECT :stateOrProvinceName 58:d=5 hl=2 l= 10 prim: PRINTABLESTRING :Some-State 70:d=3 hl=2 l= 33 cons: SET 72:d=4 hl=2 l= 31 cons: SEQUENCE 74:d=5 hl=2 l= 3 prim: OBJECT :organizationName 79:d=5 hl=2 l= 24 prim: PRINTABLESTRING :Internet Widgits Pty Ltd 105:d=2 hl=2 l= 30 cons: SEQUENCE 107:d=3 hl=2 l= 13 prim: UTCTIME :110217011237Z 122:d=3 hl=2 l= 13 prim: UTCTIME :120217011237Z 137:d=2 hl=2 l= 69 cons: SEQUENCE 139:d=3 hl=2 l= 11 cons: SET 141:d=4 hl=2 l= 9 cons: SEQUENCE 143:d=5 hl=2 l= 3 prim: OBJECT :countryName 148:d=5 hl=2 l= 2 prim: PRINTABLESTRING :AU 152:d=3 hl=2 l= 19 cons: SET 154:d=4 hl=2 l= 17 cons: SEQUENCE 156:d=5 hl=2 l= 3 prim: OBJECT :stateOrProvinceName 161:d=5 hl=2 l= 10 prim: PRINTABLESTRING :Some-State 173:d=3 hl=2 l= 33 cons: SET 175:d=4 hl=2 l= 31 cons: SEQUENCE 177:d=5 hl=2 l= 3 prim: OBJECT :organizationName 182:d=5 hl=2 l= 24 prim: PRINTABLESTRING :Internet Widgits Pty Ltd 208:d=2 hl=3 l= 159 cons: SEQUENCE 211:d=3 hl=2 l= 13 cons: SEQUENCE 213:d=4 hl=2 l= 9 prim: OBJECT :rsaEncryption 224:d=4 hl=2 l= 0 prim: NULL 226:d=3 hl=3 l= 141 prim: BIT STRING 370:d=1 hl=2 l= 13 cons: SEQUENCE 372:d=2 hl=2 l= 9 prim: OBJECT :sha1WithRSAEncryption 383:d=2 hl=2 l= 0 prim: NULL 385:d=1 hl=3 l= 129 prim: BIT STRING [Rejected pending more evidence; this pattern is trivially detectable, and there is just not enough reason at the moment to think that this particular certificate pattern is common enough for sites that matter that the censors wouldn't be willing to block it. -NM] Custom Certificates It should be possible for a Tor relay operator to use a specifically supplied certificate and secret key. This will allow a relay or bridge operator to use a certificate signed by any member of any geographically relevant certificate authority racket; it will also allow for any other user-supplied certificate. This may be desirable in some kinds of filtered networks or when attempting to avoid attracting suspicion by blending in with the TLS web server certificate crowd. [Deferred; see proposal XXX] Problematic Diffie–Hellman parameters We currently send a static Diffie–Hellman parameter, prime p (or “prime p outlaw”) as specified in RFC2409 as part of the TLS Server Hello response. The use of this prime in TLS negotiations may, as a result, be filtered and effectively banned by certain networks. We do not have to use this particular prime in all cases. While amusing to have the power to make specific prime numbers into a new class of numbers (cf. imaginary, irrational, illegal [3]) - our new friend prime p outlaw is not required. The use of this prime in TLS negotiations may, as a result, be filtered and effectively banned by certain networks. We do not have to use this particular prime in all cases. I propose that the function to initialize and generate DH parameters be split into two functions. First, init_dh_param() should be used only for OR-to-OR DH setup and communication. Second, it is proposed that we create a new function init_tls_dh_param() that will have a two-stage development process. The first stage init_tls_dh_param() will use the same prime that Apache2.x [4] sends (or “dh1024_apache_p”), and this change should be made immediately. This is a known good and safe prime number (p-1 / 2 is also prime) that is currently not known to be blocked. The second stage init_tls_dh_param() should randomly generate a new prime on a regular basis; this is designed to make the prime difficult to outlaw or filter. Call this a shape-shifting or "Rakshasa" prime. This should be added to the 0.2.3.x branch of Tor. This prime can be generated at setup or execution time and probably does not need to be stored on disk. Rakshasa primes only need to be generated by Tor relays as Tor clients will never send them. Such a prime should absolutely not be shared between different Tor relays nor should it ever be static after the 0.2.3.x release. As a security precaution, care must be taken to ensure that we do not generate weak primes or known filtered primes. Both weak and filtered primes will undermine the TLS connection security properties. OpenSSH solves this issue dynamically in RFC 4419 [5] and may provide a solution that works reasonably well for Tor. More research in this area including the applicability of Miller-Rabin or AKS primality tests[6] will need to be analyzed and probably added to Tor. [Randomized DH groups are implemented in 0.2.3.9-alpha. -NM] Practical key size Currently we use a 1024 bit long RSA modulus. I propose that we increase the RSA key size to 2048 as an additional channel to signal support for the V3 handshake setup. 2048 appears to be the most common key size[0] above 1024. Additionally, the increase in modulus size provides a reasonable security boost with regard to key security properties. The implementer should increase the 1024 bit RSA modulus to 2048 bits. [Deferred and needs performance analysis. See proposal XXX. Additionally, DH group strength seems far more crucial. Still, this is out-of-scope for a "normalization" question. -NM] Possible future filtering nightmares At some point it may cost effective or politically feasible for a network filter to simply block all signed or self-signed certificates without a known valid CA trust chain. This will break many applications on the internet and hopefully, our option for custom certificates will ensure that this step is simply avoided by the censors. The Rakshasa prime approach may cause censors to specifically allow only certain known and accepted DH parameters. Appendix: Other issues What other obvious TLS certificate issues exist? What other static values are present in the Tor TLS setup process? [0] http://archives.seul.org/or/dev/Jan-2011/msg00051.html [1] http://archives.seul.org/or/dev/Feb-2011/msg00016.html [2] http://archives.seul.org/or/dev/Feb-2011/msg00039.html [3] To be fair this is hardly a new class of numbers. History is rife with similar examples of inane authoritarian attempts at mathematical secrecy. Probably the most dramatic example is the story of the pupil Hipassus of Metapontum, pupil of the famous Pythagoras, who, legend goes, proved the fact that Root2 cannot be expressed as a fraction of whole numbers (now called an irrational number) and was assassinated for revealing this secret. Further reading on the subject may be found on the Wikipedia: http://en.wikipedia.org/wiki/Hippasus [4] httpd-2.2.17/modules/ss/ssl_engine_dh.c [5] http://tools.ietf.org/html/rfc4419 [6] http://archives.seul.org/or/dev/Jan-2011/msg00037.html ```