## CREATE and CREATED cells

Users set up circuits incrementally, one hop at a time. To create a new circuit, OPs send a CREATE/CREATE2 cell to the first node, with the first half of an authenticated handshake; that node responds with a CREATED/CREATED2 cell with the second half of the handshake. To extend a circuit past the first hop, the OP sends an EXTEND/EXTEND2 relay cell (see section 5.1.2) which instructs the last node in the circuit to send a CREATE/CREATE2 cell to extend the circuit.

There are two kinds of CREATE and CREATED cells: The older "CREATE/CREATED" format, and the newer "CREATE2/CREATED2" format. The newer format is extensible by design; the older one is not.

A CREATE2 cell contains:

```text
    HTYPE     (Client Handshake Type)     [2 bytes]
    HLEN      (Client Handshake Data Len) [2 bytes]
    HDATA     (Client Handshake Data)     [HLEN bytes]

A CREATED2 cell contains:

    HLEN      (Server Handshake Data Len) [2 bytes]
    HDATA     (Server Handshake Data)     [HLEN bytes]

Recognized HTYPEs (handshake types) are:

    0x0000  TAP      -- the original Tor handshake; see 5.1.3
    0x0001  reserved
    0x0002  ntor     -- the ntor+curve25519+sha256 handshake; see 5.1.4
    0x0003  ntor-v3  -- ntor extended with extra data; see 5.1.4.1

The format of a CREATE cell is one of the following:

    HDATA     (Client Handshake Data)     [TAP_C_HANDSHAKE_LEN bytes]

or

    HTAG      (Client Handshake Type Tag) [16 bytes]
    HDATA     (Client Handshake Data)     [TAP_C_HANDSHAKE_LEN-16 bytes]
```

The first format is equivalent to a CREATE2 cell with HTYPE of 'tap' and length of TAP_C_HANDSHAKE_LEN. The second format is a way to encapsulate new handshake types into the old CREATE cell format for migration. See 5.1.2 below. Recognized HTAG values are:

    ntor -- 'ntorNTORntorNTOR'

The format of a CREATED cell is:

    HDATA     (Server Handshake Data)     [TAP_S_HANDSHAKE_LEN bytes]

(It's equivalent to a CREATED2 cell with length of TAP_S_HANDSHAKE_LEN.)

As usual with DH, x and y MUST be generated randomly.
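The CREATE2 layout above can be sketched as a small encoder and decoder. This is a minimal illustration, not a reference implementation: the function names are hypothetical, integers are assumed to be in network (big-endian) byte order, and bytes following HDATA are assumed to be cell padding that the decoder may ignore.

```python
import struct

# Handshake type values from the HTYPE table above.
HTYPE_TAP, HTYPE_NTOR, HTYPE_NTOR_V3 = 0x0000, 0x0002, 0x0003


def encode_create2(htype: int, hdata: bytes) -> bytes:
    """Pack HTYPE | HLEN | HDATA (hypothetical helper, big-endian assumed)."""
    return struct.pack(">HH", htype, len(hdata)) + hdata


def decode_create2(body: bytes):
    """Return (HTYPE, HDATA); trailing bytes are treated as padding (assumption)."""
    if len(body) < 4:
        raise ValueError("truncated CREATE2 header")
    htype, hlen = struct.unpack(">HH", body[:4])
    if len(body) - 4 < hlen:
        raise ValueError("HLEN exceeds the available data")
    return htype, body[4:4 + hlen]
```

A CREATED2 body could be handled the same way with the HTYPE field dropped.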
In general, clients SHOULD use CREATE whenever they are using the TAP handshake, and CREATE2 otherwise. Clients SHOULD NOT send the second format of CREATE cells (the one with the handshake type tag) to a server directly.

Servers always reply to a successful CREATE with a CREATED, and to a successful CREATE2 with a CREATED2. On failure, a server sends a DESTROY cell to tear down the circuit.

[CREATE2 is handled by Tor 0.2.4.7-alpha and later.]

### Choosing circuit IDs in create cells

The CircID for a CREATE/CREATE2 cell is a nonzero integer, selected by the node (OP or OR) that sends the CREATE/CREATE2 cell. Depending on the link protocol version, there are certain rules for choosing the value of CircID which MUST be obeyed, as implementations MAY decide to refuse in case of a violation. In link protocol 3 or lower, CircIDs are 2 bytes long; in protocol 4 or higher, CircIDs are 4 bytes long.

In link protocol version 3 or lower, the nodes choose from only one half of the possible values based on the ORs' public identity keys, in order to avoid collisions. If the sending node has a lower key, it chooses a CircID with an MSB of 0; otherwise, it chooses a CircID with an MSB of 1. (Public keys are compared numerically by modulus.) A client with no public key MAY choose any CircID it wishes, since clients never need to process CREATE/CREATE2 cells.

In link protocol version 4 or higher, whichever node initiated the connection MUST set its MSB to 1, and whichever node didn't initiate the connection MUST set its MSB to 0.

The CircID value 0 is specifically reserved for cells that do not belong to any circuit: CircID 0 MUST NOT be used for circuits. No other CircID value, including 0x8000 or 0x80000000, is reserved.

Existing Tor implementations choose their CircID values at random from among the available unused values. To avoid distinguishability, new implementations should do the same.
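The selection rules above can be sketched roughly as follows. The function name and parameters are hypothetical, the case of a keyless client on link protocol 3 or lower (which may pick any value) is left out for brevity, and the retry cap is an illustrative parameter rather than a requirement.

```python
import secrets


def choose_circ_id(link_version: int, in_use: set, *, initiator: bool = False,
                   our_key_is_lower: bool = False, max_attempts: int = 64):
    """Pick a random unused CircID with the required MSB, or None on failure."""
    bits = 16 if link_version <= 3 else 32
    if link_version <= 3:
        # Lower identity key takes the half of the space with MSB 0.
        msb = 0 if our_key_is_lower else 1
    else:
        # The side that opened the connection takes the half with MSB 1.
        msb = 1 if initiator else 0
    for _ in range(max_attempts):
        candidate = (msb << (bits - 1)) | secrets.randbits(bits - 1)
        # CircID 0 is reserved for cells that belong to no circuit.
        if candidate != 0 and candidate not in in_use:
            return candidate
    return None
```

Returning `None` lets the caller decide whether to stop building circuits on that channel.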
Implementations MAY give up and stop attempting to build new circuits on a channel, if a certain number of randomly chosen CircID values are all in use (today's Tor stops after 64).

### EXTEND and EXTENDED cells

To extend an existing circuit, the client sends an EXTEND or EXTEND2 RELAY_EARLY cell to the last node in the circuit.

An EXTEND2 cell's relay payload contains:

```text
    NSPEC      (Number of link specifiers)     [1 byte]
      NSPEC times:
        LSTYPE (Link specifier type)           [1 byte]
        LSLEN  (Link specifier length)         [1 byte]
        LSPEC  (Link specifier)                [LSLEN bytes]
    HTYPE      (Client Handshake Type)         [2 bytes]
    HLEN       (Client Handshake Data Len)     [2 bytes]
    HDATA      (Client Handshake Data)         [HLEN bytes]
```

Link specifiers describe the next node in the circuit and how to connect to it. Recognized specifiers are:

```text
    [00] TLS-over-TCP, IPv4 address
         A four-byte IPv4 address plus two-byte ORPort
    [01] TLS-over-TCP, IPv6 address
         A sixteen-byte IPv6 address plus two-byte ORPort
    [02] Legacy identity
         A 20-byte SHA1 identity fingerprint. At most one may be listed.
    [03] Ed25519 identity
         A 32-byte Ed25519 identity fingerprint. At most one may
         be listed.
```

Nodes MUST ignore unrecognized specifiers, and MUST accept multiple instances of specifiers other than 'legacy identity' and 'Ed25519 identity'. (Nodes SHOULD reject link specifier lists that include multiple instances of either one of those specifiers.)

For purposes of indistinguishability, implementations SHOULD send these link specifiers, if using them, in this order: [00], [02], [03], [01].

The relay payload for an EXTEND relay cell consists of:

```text
    Address                       [4 bytes]
    Port                          [2 bytes]
    Onion skin                    [TAP_C_HANDSHAKE_LEN bytes]
    Identity fingerprint          [HASH_LEN bytes]
```

The "legacy identity" and "identity fingerprint" fields are the SHA1 hash of the PKCS#1 ASN1 encoding of the next onion router's identity (signing) key. (See 0.3 above.) The "Ed25519 identity" field is the Ed25519 identity key of the target node.
Including this key information allows the extending OR to verify that it is indeed connected to the correct target OR, and prevents certain man-in-the-middle attacks.

Extending ORs MUST check _all_ provided identity keys (if they recognize the format), and MUST NOT extend the circuit if the target OR did not prove its ownership of any such identity key. If only one identity key is provided, but the extending OR knows the other (from directory information), then the OR SHOULD also enforce the key in the directory.

If an extending OR has a channel with a given Ed25519 ID and RSA identity, and receives a request for that Ed25519 ID and a different RSA identity, it SHOULD NOT attempt to make another connection: it should just fail and DESTROY the circuit.

The client MAY include multiple IPv4 or IPv6 link specifiers in an EXTEND cell; current OR implementations only consider the first of each type.

After checking relay identities, extending ORs generate a CREATE/CREATE2 cell from the contents of the EXTEND/EXTEND2 cell. See section 5.3 for details.

The payload of an EXTENDED cell is the same as the payload of a CREATED cell.

The payload of an EXTENDED2 cell is the same as the payload of a CREATED2 cell.

[Support for EXTEND2/EXTENDED2 was added in Tor 0.2.4.8-alpha.]

Clients SHOULD use the EXTEND format whenever sending a TAP handshake, and MUST use it whenever the EXTEND cell will be handled by a node running a version of Tor too old to support EXTEND2. In other cases, clients SHOULD use EXTEND2.

When generating an EXTEND2 cell, clients SHOULD include the target's Ed25519 identity whenever the target has one, and whenever the target supports LinkAuth subprotocol version "3". (See section 9.2.)

When encoding a non-TAP handshake in an EXTEND cell, clients SHOULD use the format with 'client handshake type tag'.
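To make the EXTEND2 relay payload concrete, here is a minimal encoding sketch that emits link specifiers in the recommended order [00], [02], [03], [01]. The helper names and keyword arguments are hypothetical, and integers are assumed to be big-endian; it is not taken from any existing implementation.

```python
import ipaddress
import struct
from typing import Optional, Tuple


def link_spec(lstype: int, body: bytes) -> bytes:
    """Encode one specifier as LSTYPE | LSLEN | LSPEC."""
    if len(body) > 255:
        raise ValueError("link specifier too long for a 1-byte LSLEN")
    return struct.pack(">BB", lstype, len(body)) + body


def encode_extend2(htype: int, hdata: bytes, *,
                   ipv4: Optional[Tuple[str, int]] = None,
                   legacy_id: Optional[bytes] = None,
                   ed25519_id: Optional[bytes] = None,
                   ipv6: Optional[Tuple[str, int]] = None) -> bytes:
    """Build NSPEC | specifiers | HTYPE | HLEN | HDATA."""
    specs = []
    if ipv4 is not None:  # [00] four-byte address plus two-byte ORPort
        addr = ipaddress.IPv4Address(ipv4[0]).packed
        specs.append(link_spec(0x00, addr + struct.pack(">H", ipv4[1])))
    if legacy_id is not None:  # [02] 20-byte SHA1 fingerprint
        if len(legacy_id) != 20:
            raise ValueError("legacy identity must be 20 bytes")
        specs.append(link_spec(0x02, legacy_id))
    if ed25519_id is not None:  # [03] 32-byte Ed25519 identity
        if len(ed25519_id) != 32:
            raise ValueError("Ed25519 identity must be 32 bytes")
        specs.append(link_spec(0x03, ed25519_id))
    if ipv6 is not None:  # [01] sixteen-byte address plus two-byte ORPort
        addr = ipaddress.IPv6Address(ipv6[0]).packed
        specs.append(link_spec(0x01, addr + struct.pack(">H", ipv6[1])))
    return (bytes([len(specs)]) + b"".join(specs)
            + struct.pack(">HH", htype, len(hdata)) + hdata)
```

Taking each identity as a single optional argument means the encoder cannot emit more than one legacy or Ed25519 identity specifier.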
### The "TAP" handshake

This handshake uses Diffie-Hellman in Z_p and RSA to compute a set of shared keys which the client knows are shared only with a particular server, and the server knows are shared with whomever sent the original handshake (or with nobody at all). It's not very fast and not very good. (See Goldberg's "On the Security of the Tor Authentication Protocol".)

Define TAP_C_HANDSHAKE_LEN as DH_LEN+KEY_LEN+KP_PAD_LEN. Define TAP_S_HANDSHAKE_LEN as DH_LEN+HASH_LEN.

The payload for a CREATE cell is an 'onion skin', which consists of the first step of the DH handshake data (also known as g^x). This value is encrypted using the "legacy hybrid encryption" algorithm (see 0.4 above) to the server's onion key, giving a client handshake:

```text
    KP-encrypted:
      Padding                       [KP_PAD_LEN bytes]
      Symmetric key                 [KEY_LEN bytes]
      First part of g^x             [KP_ENC_LEN-KP_PAD_LEN-KEY_LEN bytes]
    Symmetrically encrypted:
      Second part of g^x            [DH_LEN-(KP_ENC_LEN-KP_PAD_LEN-KEY_LEN)
                                        bytes]
```

The payload for a CREATED cell, or the relay payload for an EXTENDED cell, contains:

```text
    DH data (g^y)                 [DH_LEN bytes]
    Derivative key data (KH)      [HASH_LEN bytes]
```

Once the handshake between the OP and an OR is completed, both can now calculate g^xy with ordinary DH. Before computing g^xy, both parties MUST verify that the received g^x or g^y value is not degenerate; that is, it must be strictly greater than 1 and strictly less than p-1 where p is the DH modulus. Implementations MUST NOT complete a handshake with degenerate keys. Implementations MUST NOT discard other "weak" g^x values.

(Discarding degenerate keys is critical for security; if bad keys are not discarded, an attacker can substitute the OR's CREATED cell's g^y with 0 or 1, thus creating a known g^xy and impersonating the OR. Discarding other keys may allow attacks to learn bits of the private key.)

Once both parties have g^xy, they derive their shared circuit keys and 'derivative key data' value via the KDF-TOR function in 5.2.1.
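The degenerate-key rule in the TAP handshake above amounts to a range check on the received public value. The sketch below shows that check only; the function names are hypothetical, the value is assumed to arrive as a big-endian integer, and the tiny modulus is for illustration and is not the real DH prime.

```python
def is_degenerate(value: int, p: int) -> bool:
    """True unless 1 < value < p-1, the range the handshake requires."""
    return not (1 < value < p - 1)


def parse_dh_public(data: bytes, p: int) -> int:
    """Decode a received g^x or g^y and refuse degenerate values."""
    value = int.from_bytes(data, "big")  # big-endian encoding is an assumption
    if is_degenerate(value, p):
        raise ValueError("degenerate DH public value")
    return value


# Toy modulus, far too small for real use.
P_TOY = 23
for bad in (0, 1, P_TOY - 1):
    # Whatever the secret exponent, these inputs land in a tiny known set,
    # so an attacker who substitutes them can predict g^xy.
    assert pow(bad, 7, P_TOY) in (0, 1, P_TOY - 1)
```

The check deliberately stops there: per the text above, other "weak" values are not to be discarded.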
### The "ntor" handshake

This handshake uses a set of DH handshakes to compute a set of shared keys which the client knows are shared only with a particular server, and the server knows are shared with whomever sent the original handshake (or with nobody at all). Here we use the "curve25519" group and representation as specified in "Curve25519: new Diffie-Hellman speed records" by D. J. Bernstein.

[The ntor handshake was added in Tor 0.2.4.8-alpha.]

In this section, define:

```text
    H(x,t) as HMAC_SHA256 with message x and key t.
    H_LENGTH  = 32.
    ID_LENGTH = 20.
    G_LENGTH  = 32
    PROTOID   = "ntor-curve25519-sha256-1"
    t_mac     = PROTOID | ":mac"
    t_key     = PROTOID | ":key_extract"
    t_verify  = PROTOID | ":verify"
    G         = The preferred base point for curve25519 ([9])
    KEYGEN()  = The curve25519 key generation algorithm, returning
                a private/public keypair.
    m_expand  = PROTOID | ":key_expand"
    KEYID(A)  = A
    EXP(a, b) = The ECDH algorithm for establishing a shared secret.
```

To perform the handshake, the client needs to know an identity key digest for the server, and an ntor onion key (a curve25519 public key) for that server. Call the ntor onion key "B".
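The byte-string definitions above map directly onto Python's standard `hmac` and `hashlib` modules. The sketch below covers only `H` and the tag constants; the curve25519 operations (`KEYGEN`, `EXP`) need a separate library and are left out. The constant names are this example's own spelling of the spec's symbols.

```python
import hashlib
import hmac

PROTOID = b"ntor-curve25519-sha256-1"
T_MAC = PROTOID + b":mac"
T_KEY = PROTOID + b":key_extract"
T_VERIFY = PROTOID + b":verify"
M_EXPAND = PROTOID + b":key_expand"
H_LENGTH = 32


def H(x: bytes, t: bytes) -> bytes:
    """HMAC-SHA256 with message x and key t (note: message first, key second)."""
    return hmac.new(t, x, hashlib.sha256).digest()
```

The argument order is worth care: `hmac.new` takes the key first, whereas `H(x,t)` names the message first.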
The client generates a temporary keypair:

    x,X = KEYGEN()

and generates a client-side handshake with contents:

```text
    NODEID      Server identity digest  [ID_LENGTH bytes]
    KEYID       KEYID(B)                [H_LENGTH bytes]
    CLIENT_KP   X                       [G_LENGTH bytes]
```

The server generates a keypair of y,Y = KEYGEN(), and uses its ntor private key 'b' to compute:

```text
    secret_input = EXP(X,y) | EXP(X,b) | ID | B | X | Y | PROTOID
    KEY_SEED = H(secret_input, t_key)
    verify = H(secret_input, t_verify)
    auth_input = verify | ID | B | Y | X | PROTOID | "Server"

The server's handshake reply is:

    SERVER_KP   Y                       [G_LENGTH bytes]
    AUTH        H(auth_input, t_mac)    [H_LENGTH bytes]

The client then checks Y is in G^* [see NOTE below], and computes

    secret_input = EXP(Y,x) | EXP(B,x) | ID | B | X | Y | PROTOID
    KEY_SEED = H(secret_input, t_key)
    verify = H(secret_input, t_verify)
    auth_input = verify | ID | B | Y | X | PROTOID | "Server"

The client verifies that AUTH == H(auth_input, t_mac).
```

Both parties check that none of the EXP() operations produced the point at infinity. [NOTE: This is an adequate replacement for checking Y for group membership, if the group is curve25519.]

Both parties now have a shared value for KEY_SEED. They expand this into the keys needed for the Tor relay protocol, using the KDF described in 5.2.2 and the tag m_expand.

#### The "ntor-v3" handshake

This handshake extends the ntor handshake to include support for extra data transmitted as part of the handshake. Both the client and the server can transmit extra data; in both cases, the extra data is encrypted, but only server data receives forward secrecy.

To advertise support for this handshake, servers advertise the "Relay=4" subprotocol version. To select it, clients use the 'ntor-v3' HTYPE value in their CREATE2 cells.
In this handshake, we define:

```text
PROTOID = "ntor3-curve25519-sha3_256-1"
t_msgkdf = PROTOID | ":kdf_phase1"
t_msgmac = PROTOID | ":msg_mac"
t_key_seed = PROTOID | ":key_seed"
t_verify = PROTOID | ":verify"
t_final = PROTOID | ":kdf_final"
t_auth = PROTOID | ":auth_final"

`ENCAP(s)` -- an encapsulation function. We define this
as `htonll(len(s)) | s`. (Note that `len(ENCAP(s)) = len(s) + 8`).

`PARTITION(s, n1, n2, n3, ...)` -- a function that partitions a
bytestring `s` into chunks of length `n1`, `n2`, `n3`, and so
on. Extra data is put into a final chunk. If `s` is not long
enough, the function fails.

H(s, t) = SHA3_256(ENCAP(t) | s)
MAC(k, msg, t) = SHA3_256(ENCAP(t) | ENCAP(k) | msg)
KDF(s, t) = SHAKE_256(ENCAP(t) | s)
ENC(k, m) = AES_256_CTR(k, m)

EXP(pk,sk), KEYGEN: defined as in curve25519

DIGEST_LEN = MAC_LEN = MAC_KEY_LEN = ENC_KEY_LEN = PUB_KEY_LEN = 32

ID_LEN = 32  (representing an ed25519 identity key)

For any tag "t_foo":

   H_foo(s) = H(s, t_foo)
   MAC_foo(k, msg) = MAC(k, msg, t_foo)
   KDF_foo(s) = KDF(s, t_foo)

Other notation is as in the ntor description in 5.1.4 above.

The client begins by knowing:

   B, ID -- The curve25519 onion key and Ed25519 ID of the server
            that it wants to use.
   CM -- A message it wants to send as part of its handshake.
   VER -- An optional shared verification string.

The client computes:

   x,X = KEYGEN()
   Bx = EXP(B,x)
   secret_input_phase1 = Bx | ID | X | B | PROTOID | ENCAP(VER)
   phase1_keys = KDF_msgkdf(secret_input_phase1)
   (ENC_K1, MAC_K1) = PARTITION(phase1_keys, ENC_KEY_LEN, MAC_KEY_LEN)
   encrypted_msg = ENC(ENC_K1, CM)
   msg_mac = MAC_msgmac(MAC_K1, ID | B | X | encrypted_msg)

The client then sends, as its CREATE handshake:

   NODEID      ID             [ID_LEN bytes]
   KEYID       B              [PUB_KEY_LEN bytes]
   CLIENT_PK   X              [PUB_KEY_LEN bytes]
   MSG         encrypted_msg  [len(CM) bytes]
   MAC         msg_mac        [MAC_LEN bytes]

The client remembers x, X, B, ID, Bx, and msg_mac.
```

When the server receives this handshake, it checks whether NODEID is as expected, and looks up the (b,B) keypair corresponding to KEYID. If the keypair is missing or the NODEID is wrong, the handshake fails.

Now the relay uses `X=CLIENT_PK` to compute:

```text
Xb = EXP(X,b)
secret_input_phase1 = Xb | ID | X | B | PROTOID | ENCAP(VER)
phase1_keys = KDF_msgkdf(secret_input_phase1)
(ENC_K1, MAC_K1) = PARTITION(phase1_keys, ENC_KEY_LEN, MAC_KEY_LEN)

expected_mac = MAC_msgmac(MAC_K1, ID | B | X | MSG)
```

If `expected_mac` is not `MAC`, the handshake fails. Otherwise the relay computes `CM` as:

    CM = DEC(ENC_K1, MSG)

The relay then checks whether `CM` is well-formed, and in response composes `SM`, the reply that it wants to send as part of the handshake. It then generates a new ephemeral keypair:

    y,Y = KEYGEN()

and computes the rest of the handshake:

```text
Xy = EXP(X,y)
secret_input = Xy | Xb | ID | B | X | Y | PROTOID | ENCAP(VER)
ntor_key_seed = H_key_seed(secret_input)
verify = H_verify(secret_input)

RAW_KEYSTREAM = KDF_final(ntor_key_seed)
(ENC_KEY, KEYSTREAM) = PARTITION(RAW_KEYSTREAM, ENC_KEY_LEN, ...)

encrypted_msg = ENC(ENC_KEY, SM)

auth_input = verify | ID | B | Y | X | MAC | ENCAP(encrypted_msg) |
    PROTOID | "Server"
AUTH = H_auth(auth_input)

The relay then sends as its CREATED handshake:

   Y     Y              [PUB_KEY_LEN bytes]
   AUTH  AUTH           [DIGEST_LEN bytes]
   MSG   encrypted_msg  [len(SM) bytes, up to end of the message]

Upon receiving this handshake, the client computes:

   Yx = EXP(Y, x)
   secret_input = Yx | Bx | ID | B | X | Y | PROTOID | ENCAP(VER)
   ntor_key_seed = H_key_seed(secret_input)
   verify = H_verify(secret_input)

   auth_input = verify | ID | B | Y | X | MAC | ENCAP(MSG) |
       PROTOID | "Server"
   AUTH_expected = H_auth(auth_input)
```

If AUTH_expected is equal to AUTH, then the handshake has succeeded. The client can then calculate:

```text
RAW_KEYSTREAM = KDF_final(ntor_key_seed)
(ENC_KEY, KEYSTREAM) = PARTITION(RAW_KEYSTREAM, ENC_KEY_LEN, ...)
SM = DEC(ENC_KEY, MSG)
```

SM is the message from the relay, and the client uses KEYSTREAM to generate the shared secrets for the newly created circuit.

Now both parties share the same KEYSTREAM, and can use it to generate their circuit keys.

### CREATE_FAST/CREATED_FAST cells

When initializing the first hop of a circuit, the OP has already established the OR's identity and negotiated a secret key using TLS. Because of this, it is not always necessary for the OP to perform the public key operations to create a circuit. In this case, the OP MAY send a CREATE_FAST cell instead of a CREATE cell for the first hop only. The OR responds with a CREATED_FAST cell, and the circuit is created.

A CREATE_FAST cell contains:

    Key material (X)    [HASH_LEN bytes]

A CREATED_FAST cell contains:

```text
    Key material (Y)    [HASH_LEN bytes]
    Derivative key data [HASH_LEN bytes] (See 5.2.1 below)

The values of X and Y must be generated randomly.
```

Once both parties have X and Y, they derive their shared circuit keys and 'derivative key data' value via the KDF-TOR function in 5.2.1.

The CREATE_FAST handshake is currently deprecated whenever it is not necessary; the migration is controlled by the "usecreatefast" networkstatus parameter as described in dir-spec.txt.

[Tor 0.3.1.1-alpha and later disable CREATE_FAST by default.]

### Additional data in CREATE/CREATED cells

Some handshakes (currently ntor-v3 defined above) allow the client or the relay to send additional data as part of the handshake. When used in a CREATE/CREATED handshake, this additional data must have the following format:

```text
N_EXTENSIONS  [one byte]
N_EXTENSIONS times:
     EXT_FIELD_TYPE  [one byte]
     EXT_FIELD_LEN   [one byte]
     EXT_FIELD       [EXT_FIELD_LEN bytes]

(`EXT_FIELD_LEN` may be zero, in which case EXT_FIELD is absent.)

All parties MUST reject messages that are not well-formed per the
rules above.

We do not specify specific TYPE semantics here; we leave those for
other proposals and specifications.
Parties MUST ignore extensions whose `EXT_FIELD_TYPE` they do not
recognize.

Unless otherwise specified in the documentation for an extension type:
    * Each extension type SHOULD be sent only once in a message.
    * Parties MUST ignore all occurrences of an extension
      with a given type after the first such occurrence.
    * Extensions SHOULD be sent in numerically ascending order by type.

(The above extension sorting and multiplicity rules are only defaults;
they may be overridden in the description of individual extensions.)

Currently supported extensions are:

  1 -- CC_FIELD_REQUEST [Client to server]

       Contains an empty payload. Signifies that the client
       wants to use the extended congestion control described
       in proposal 324.

  2 -- CC_FIELD_RESPONSE [Server to client]

       Indicates that the relay will use the congestion control
       of proposal 324, as requested by the client. One byte
       in length:

         sendme_inc  [1 byte]
```
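The extension format above lends itself to a short parser. The following is a sketch under stated assumptions: the function name is hypothetical, it applies the default "first occurrence wins" rule, and it treats bytes left over after the last extension as malformed, which the text above does not spell out.

```python
def parse_extensions(data: bytes) -> dict:
    """Map EXT_FIELD_TYPE to EXT_FIELD, keeping the first body seen per type."""
    if not data:
        raise ValueError("missing N_EXTENSIONS")
    count, pos = data[0], 1
    extensions = {}
    for _ in range(count):
        if pos + 2 > len(data):
            raise ValueError("truncated extension header")
        ext_type, ext_len = data[pos], data[pos + 1]
        pos += 2
        if pos + ext_len > len(data):
            raise ValueError("truncated extension body")
        # Later duplicates of the same type are ignored by default.
        extensions.setdefault(ext_type, data[pos:pos + ext_len])
        pos += ext_len
    if pos != len(data):
        # Assumption of this sketch: trailing bytes count as malformed.
        raise ValueError("trailing data after extensions")
    return extensions
```

A caller would then look up only the types it understands (for example `1` or `2` above) and leave any others untouched, which matches the rule that unrecognized extensions are ignored.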