aboutsummaryrefslogtreecommitdiff
path: root/proposals/270-newhope-hybrid-handshake.txt
blob: 6ad2e18d8eb79dcb501fc16cd10d03c67c59579a (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
```
Filename: 270-newhope-hybrid-handshake.txt
Title: RebelAlliance: A Post-Quantum Secure Hybrid Handshake Based on NewHope
Author: Isis Lovecruft, Peter Schwabe
Created: 16 Apr 2016
Updated: 22 Jul 2016
Status: Obsolete
Depends: prop#220 prop#249 prop#264 prop#270

§0. Introduction

  RebelAlliance is a post-quantum secure hybrid handshake, comprised of an
  alliance between the X25519 and NewHope key exchanges.

  NewHope is a post-quantum-secure lattice-based key-exchange protocol based
  on the ring-learning-with-errors (Ring-LWE) problem.  We propose a hybrid
  handshake for Tor, based on a combination of Tor's current NTor handshake
  and a shared key derived through a NewHope ephemeral key exchange.

  For further details on the NewHope key exchange, the reader is referred to
  "Post-quantum key exchange - a new hope" by Alkim, Ducas, Pöppelmann, and
  Schwabe [0][1].

  For the purposes of brevity, we consider that NTor is currently the only
  handshake protocol in Tor; the older TAP protocol is ignored completely, due
  to the fact that it is currently deprecated and nearly entirely unused.


§1. Motivation

  An attacker currently monitoring and storing circuit-layer NTor handshakes
  who later has the ability to run Shor's algorithm on a quantum computer will
  be able to break Tor's current handshake protocol and decrypt previous
  communications.

  It is unclear if and when such attackers equipped with large quantum
  computers will exist, but various estimates by researchers in quantum
  physics and quantum engineering give estimates of only 1 to 2 decades.
  Clearly, the security requirements of many Tor users include secrecy of
  their messages beyond this time span, which means that Tor needs to update
  the key exchange to protect against such attackers as soon as possible.


§2. Design

  An initiator and responder, in parallel, conduct two handshakes:

  - An X25519 key exchange, as described in the description of the NTor
    handshake in Tor proposal #216.
  - A NewHope key exchange.

  The shared keys derived from these two handshakes are then concatenated and
  used as input to the SHAKE-256 extendable output function (XOF), as described
  in FIPS-PUB-202 [2], in order to produce a shared key of the desired length.
  The testvectors in §C assume that this key has a length of 32 bytes, but the
  use of a XOF allows arbitrary lengths to easily support future updates of
  the symmetric primitives using the key. See also §3.3.1.


§3. Specification

§3.1. Notation

  Let `a || b` be the concatenation of a with b.

  Let `a^b` denote the exponentiation of a to the bth power.

  Let `a == b` denote the equality of a with b, and vice versa.

  Let `a := b` be the assignment of the value of b to the variable a.

  Let `H(x)` be 32-bytes of output of the SHAKE-256 XOF (as described in
  FIPS-PUB-202) applied to message x.

  Let X25519 refer to the curve25519-based key agreement protocol described
  in RFC7748 §6.1. [3]

  Let `EXP(a, b) == X25519(., b, a)` with `g == 9`. Let X25519_KEYGEN() do
  the appropriate manipulations when generating the secret key (clearing the
  low bits, twidding the high bits).  Additionally, EXP() MUST include the
  check for all-zero output due to the input point being of small
  order (cf. RFC7748 §6).

  Let `X25519_KEYID(B) == B` where B is a valid X25519 public key.

  When representing an element of the Curve25519 subgroup as a byte string,
  use the standard (32-byte, little-endian, x-coordinate-only) representation
  for Curve25519 points.

  Let `ID` be a router's identity key taken from the router microdescriptor.
  In the case for relays possessing Ed25519 identity keys (cf. Tor proposal
  #220), this is a 32-byte string representing the public Ed25519 identity key.
  For backwards and forwards compatibility with routers which do not possess
  Ed25519 identity keys, this is a 32-byte string created via the output of
  H(ID).

  We refer to the router as the handshake "responder", and the client (which
  may be an OR or an OP) as the "initiator".


  ID_LENGTH      [32 bytes]
  H_LENGTH       [32 bytes]
  G_LENGTH       [32 bytes]

  PROTOID  :=    "pqtor-x25519-newhope-shake256-1"
  T_MAC    :=    PROTOID || ":mac"
  T_KEY    :=    PROTOID || ":key_extract"
  T_VERIFY :=    PROTOID || ":verify"

  (X25519_SK, X25519_PK) := X25519_KEYGEN()


§3.2. Protocol

 ========================================================================================
 |                                                                                      |
 | Fig. 1: The NewHope-X25519 Hybrid Handshake.                                         |
 |                                                                                      |
 | Before the handshake the Initiator is assumed to know Z, a public X25519 key for     |
 | the Responder, as well as the Responder's ID.                                        |
 ----------------------------------------------------------------------------------------
 |                                                                                      |
 | Initiator                             Responder                                      |
 |                                                                                      |
 | SEED         := H(randombytes(32))                                                   |
 | x, X         := X25519_KEYGEN()                                                      |
 | a, A         := NEWHOPE_KEYGEN(SEED)                                                 |
 | CLIENT_HDATA := ID || Z || X || A                                                    |
 |                                                                                      |
 |               --- CLIENT_HDATA --->                                                  |
 |                                                                                      |
 |                                       y, Y           := X25519_KEYGEN()              |
 |                                       NTOR_KEY, AUTH := NTOR_SHAREDB(X,y,Y,z,Z,ID,B) |
 |                                       M, NEWHOPE_KEY := NEWHOPE_SHAREDB(A)           |
 |                                       SERVER_HDATA   := Y || AUTH || M               |
 |                                       sk := SHAKE-256(NTOR_KEY || NEWHOPE_KEY)       |
 |                                                                                      |
 |               <-- SERVER_HDATA ----                                                  |
 |                                                                                      |
 | NTOR_KEY    := NTOR_SHAREDA(x, X, Y, Z, ID, AUTH)                                    |
 | NEWHOPE_KEY := NEWHOPE_SHAREDA(M, a)                                                 |
 | sk := SHAKE-256(NTOR_KEY || NEWHOPE_KEY)                                             |
 |                                                                                      |
 ========================================================================================


§3.2.1. The NTor Handshake

§3.2.1.1. Prologue

  Take a router with identity ID. As setup, the router generates a secret key z,
  and a public onion key Z with:

    z, Z := X25519_KEYGEN()

  The router publishes Z in its server descriptor in the "ntor-onion-key" entry.
  Henceforward, we refer to this router as the "responder".


§3.2.1.2. Initiator

  To send a create cell, the initiator generates a keypair:

    x, X := X25519_KEYGEN()

  and creates the NTor portion of a CREATE2V cell's HDATA section:

    CLIENT_NTOR    := ID || Z || X                   [96 bytes]

  The initiator includes the responder's ID and Z in the CLIENT_NTOR so that, in
  the event the responder OR has recently rotated keys, the responder can
  determine which keypair to use.

  The initiator then concatenates CLIENT_NTOR with CLIENT_NEWHOPE (see §3.2.2),
  to create CLIENT_HDATA, and creates and sends a CREATE2V cell (see §A.1)
  to the responder.

    CLIENT_NEWHOPE                                   [1824 bytes]  (see §3.2.2)
    CLIENT_HDATA   := CLIENT_NTOR || CLIENT_NEWHOPE  [1920 bytes]

  If the responder does not respond with a CREATED2V cell, the initiator SHOULD
  NOT attempt to extend the circuit through the responder by sending fragmented
  EXTEND2 cells, since the responder's lack of support for CREATE2V cells is
  assumed to imply the responder also lacks support for fragmented EXTEND2
  cells.  Alternatively, for initiators with a sufficiently late consensus
  method, the initiator MUST check that "proto" line in the responder's
  descriptor (cf. Tor proposal #264) advertises support for the "Relay"
  subprotocol version 3 (see §5).


§3.2.1.3. Responder

  The responder generates a keypair of y, Y = X25519_KEYGEN(), and does
  NTOR_SHAREDB() as follows:

  (NTOR_KEY, AUTH) ← NTOR_SHAREDB(X, y, Y, z, Z, ID, B):
    secret_input := EXP(X, y) || EXP(X, z) || ID || B || Z || Y || PROTOID
    NTOR_KEY     := H(secret_input, T_KEY)
    verify       := H(secret_input, T_VERIFY)
    auth_input   := verify || ID || Z || Y || X || PROTOID || "Server"
    AUTH         := H(auth_input, T_MAC)

  The responder sends a CREATED2V cell containing:

    SERVER_NTOR    := Y || AUTH                      [64 bytes]
    SERVER_NEWHOPE                                   [2048 bytes]  (see §3.2.2)
    SERVER_HDATA   := SERVER_NTOR || SERVER_NEWHOPE  [2112 bytes]

  and sends this to the initiator.


§3.2.1.4. Finalisation

  The initiator then checks Y is in G^* [see NOTE below], and does
  NTOR_SHAREDA() as follows:

  (NTOR_KEY) ← NTOR_SHAREDA(x, X, Y, Z, ID, AUTH)
    secret_input := EXP(Y, x) || EXP(Z, x) || ID || Z || X || Y || PROTOID
    NTOR_KEY     := H(secret_input, T_KEY)
    verify       := H(secret_input, T_VERIFY)
    auth_input   := verify || ID || Z || Y || X || PROTOID || "Server"
    if AUTH == H(auth_input, T_MAC)
       return NTOR_KEY

  Both parties now have a shared value for NTOR_KEY.  They expand this into
  the keys needed for the Tor relay protocol.

  [XXX We think we want to omit the final hashing in the production of NTOR_KEY
  here, and instead put all the inputs through SHAKE-256. --isis, peter]

  [XXX We probably want to remove ID and B from the input to the shared key
  material, since they serve for authentication but, as pre-established
  "prologue" material to the handshake, they should not be used in attempts to
  strengthen the cryptographic suitability of the shared key.  Also, their
  inclusion is implicit in the DH exponentiations.  I should probably ask Ian
  about the reasoning for the original design choice.  --isis]


§3.2.2. The NewHope Handshake

§3.2.2.1. Parameters & Mathematical Structures

  Let ℤ be the ring of rational integers. Let ℤq, for q ≥ 1, denote the quotient
  ring ℤ/qℤ.  We define R = ℤ[X]/((X^n)+1) as the ring of integer polynomials
  modulo ((X^n)+1), and Rq = ℤq[X]/((X^n)+1) as the ring of integer polynomials
  modulo ((X^n)+1) where each coefficient is reduced modulo q. When we refer to
  a polynomial, we mean an element of Rq.

    n := 1024
    q := 12289

    SEED         [32 Bytes]
    NEWHOPE_POLY [1792 Bytes]
    NEWHOPE_REC  [256 Bytes]
    NEWHOPE_KEY  [32 Bytes]

    NEWHOPE_MSGA := (NEWHOPE_POLY || SEED)
    NEWHOPE_MSGB := (NEWHOPE_POLY || NEWHOPE_REC)


§3.2.2.2. High-level Description of Newhope API Functions

  For a description of internal functions, see §B.

    (NEWHOPE_POLY, NEWHOPE_MSGA) ← NEWHOPE_KEYGEN(SEED):
        â    := gen_a(seed)
        s    := poly_getnoise()
        e    := poly_getnoise()
        ŝ    := poly_ntt(s)
        ê    := poly_ntt(e)
        b̂    := pointwise(â, ŝ) + ê
        sp   := poly_tobytes(ŝ)
        bp   := poly_tobytes(b̂)
        return (sp, (bp || seed))

    (NEWHOPE_MSGB, NEWHOPE_KEY) ← NEWHOPE_SHAREDB(NEWHOPE_MSGA):
        s'   := poly_getnoise()
        e'   := poly_getnoise()
        e"   := poly_getnoise()
        b̂    := poly_frombytes(bp)
        â    := gen_a(seed)
        ŝ'   := poly_ntt(s')
        ê'   := poly_ntt(e')
        û    := poly_pointwise(â, ŝ') + ê'
        v    := poly_invntt(poly_pointwise(b̂,ŝ')) + e"
        r    := helprec(v)
        up   := poly_tobytes(û)
        k    := rec(v, r)
        return ((up || r), k)

    NEWHOPE_KEY ← NEWHOPE_SHAREDA(NEWHOPE_MSGB, NEWHOPE_POLY):
        û    := poly_frombytes(up)
        ŝ    := poly_frombytes(sp)
        v'   := poly_invntt(poly_pointwise(û, ŝ))
        k    := rec(v', r)
        return k

  When a client uses a SEED within a CREATE2V cell, the client SHOULD NOT use
  that SEED in any other CREATE2V or EXTEND2 cells.  See §4 for further
  discussion.


§3.3. Key Expansion

  The client and server derive a shared key, SHARED, by:

    HKDFID := "THESE ARENT THE DROIDS YOURE LOOKING FOR"
    SHARED := SHAKE_256(HKDFID || NTorKey || NewHopeKey)


§3.3.1. Note on the Design Choice

  The reader may wonder why one would use SHAKE-256 to produce a 256-bit
  output, since the security strength in bits for SHAKE-256 is min(d/2,256)
  for collision resistance and min(d,256) for first- and second-order
  preimages, where d is the output length.

  The reasoning is that we should be aiming for 256-bit security for all of
  our symmetric cryptography.  One could then argue that we should just use
  SHA3-256 for the KDF.  We choose SHAKE-256 instead in order to provide an
  easy way to derive longer shared secrets in the future without requiring a
  new handshake.  The construction is odd, but the future is bright.
  As we are already using SHAKE-256 for the 32-byte output hash, we are also
  using it for all other 32-byte hashes involved in the protocol. Note that
  the only difference between SHA3-256 and SHAKE-256 with 32-byte output is
  one domain-separation byte.

  [XXX why would you want 256-bit security for the symmetric side? Are you
  talking pre- or post-quantum security? --peter]


§4. Security & Anonymity Implications

  This handshake protocol is one-way authenticated.  That is, the server is
  authenticated, while the client remains anonymous.

  The client MUST NOT cache and reuse SEED.  Doing so gives non-trivial
  adversarial advantages w.r.t. all-for-the-price-of-one attacks during the
  caching period.  More importantly, if the SEED used to generate NEWHOPE_MSGA
  is reused for handshakes along the same circuit or multiple different
  circuits, an adversary conducting a sybil attack somewhere along the path(s)
  will be able to correlate the identity of the client across circuits or
  hops.


§5. Compatibility

  Because our proposal requires both the client and server to send more than
  the 505 bytes possible within a CREATE2 cell's HDATA section, it depends
  upon the implementation of a mechanism for allowing larger CREATE cells
  (cf. Tor proposal #249).

  We reserve the following handshake type for use in CREATE2V/CREATED2V and
  EXTEND2V/EXTENDED2V cells:

    0x0003            [NEWHOPE + X25519 HYBRID HANDSHAKE]

  We introduce a new sub-protocol number, "Relay=3", (cf. Tor proposal #264
  §5.3) to signify support this handshake, and hence for the CREATE2V and
  fragmented EXTEND2 cells which it requires.

  There are no additional entries or changes required within either router
  descriptors or microdescriptors to support this handshake method, due to the
  NewHope keys being ephemeral and derived on-the-fly, and due to the NTor X25519
  public keys already being included within the "ntor-onion-key" entry.

  Add a "UseNewHopeKEX" configuration option and a corresponding consensus
  parameter to control whether clients prefer using this NewHope hybrid
  handshake or some previous handshake protocol.  If the configuration option
  is "auto", clients SHOULD obey the consensus parameter.  The default
  configuration SHOULD be "auto" and the consensus value SHOULD initially be "0".


§6. Implementation

  The paper by Alkim, Ducas, Pöppelmann and Schwabe describes two software
  implementations of NewHope, one C reference implementation and an optimized
  implementation using AVX2 vector instructions. Those implementations are
  available at [1].

  Additionally, there are implementations in Go by Yawning Angel, available
  from [4] and in Rust by Isis Lovecruft, available from [5].

  The software used to generate the test vectors in §C is based on the C
  reference implementation and available from:

  https://code.ciph.re/isis/newhope-tor-testvectors
  https://github.com/isislovecruft/newhope-tor-testvectors


§7. Performance & Scalability

  The computationally expensive part in the current NTor handshake is the
  X25519 key-pair generation and the X25519 shared-key computation. The
  current implementation in Tor is a wrapper to support various highly optimized
  implementations on different architectures. On Intel Haswell processors, the
  fastest implementation of X25519, as reported by the eBACS benchmarking
  project [6], takes 169920 cycles for key-pair generation and 161648 cycles
  for shared-key computation; these add up to a total of 331568 cycles on each
  side (initiator and responder).

  The C reference implementation of NewHope, also benchmarked on Intel
  Haswell, takes 358234 cycles for the initiator and 402058 cycles for the
  Responder. The core computation of the proposed combination of NewHope and
  X25519 will thus mean a slowdown of about a factor of 2.1 for the Initiator
  and a slowdown by a factor of 2.2 for the Responder compared to the current
  NTor handshake. These numbers assume a fully optimized implementation of the
  NTor handshake and a C reference implementation of NewHope. With optimized
  implementations of NewHope, such as the one for Intel Haswell described in
  [0], the computational slowdown will be considerably smaller than a factor
  of 2.


§8. References

[0]: https://cryptojedi.org/papers/newhope-20160328.pdf
[1]: https://cryptojedi.org/crypto/#newhope
[2]: http://www.nist.gov/customcf/get_pdf.cfm?pub_id=919061
[3]: https://tools.ietf.org/html/rfc7748#section-6.1
[4]: https://github.com/Yawning/newhope
[5]: https://code.ciph.re/isis/newhopers
[6]: http://bench.cr.yp.to


§A. Cell Formats

§A.1. CREATE2V Cells

  The client portion of the handshake should send CLIENT_HDATA, formatted
  into a CREATE2V cell as follows:

    CREATE2V {                                              [2114 bytes]
      HTYPE   := 0x0003                                     [2 bytes]
      HLEN    := 0x0780                                     [2 bytes]
      HDATA   := CLIENT_HDATA                               [1920 bytes]
      IGNORED := 0x00                                       [194 bytes]
    }

  [XXX do we really want to pad with IGNORED to make CLIENT_HDATA the
  same number of bytes as SERVER_HDATA? --isis]

§A.2. CREATED2V Cells

  The server responds to the client's CREATE2V cell with SERVER_HDATA,
  formatted into a CREATED2V cell as follows:

    CREATED2V {                                             [2114 bytes]
      HLEN    := 0x0800                                     [2 bytes]
      HDATA   := SERVER_HDATA                               [2112 bytes]
      IGNORED := 0x00                                       [0 bytes]
    }

§A.3. Fragmented EXTEND2 Cells

  When the client wishes to extend a circuit, the client should fragment
  CLIENT_HDATA into four EXTEND2 cells:

    EXTEND2 {
      NSPEC := 0x02 {                                     [1 byte]
        LINK_ID_SERVER                                    [22 bytes] XXX
        LINK_ADDRESS_SERVER                               [8 bytes]  XXX
      }
      HTYPE := 0x0003                                     [2 bytes]
      HLEN  := 0x0780                                     [2 bytes]
      HDATA := CLIENT_HDATA[0,461]                        [462 bytes]
    }
    EXTEND2 {
      NSPEC := 0x00                                       [1 byte]
      HTYPE := 0xFFFF                                     [2 bytes]
      HLEN  := 0x0000                                     [2 bytes]
      HDATA := CLIENT_HDATA[462,954]                      [492 bytes]
    }
    EXTEND2 {
      NSPEC := 0x00                                       [1 byte]
      HTYPE := 0xFFFF                                     [2 bytes]
      HLEN  := 0x0000                                     [2 bytes]
      HDATA := CLIENT_HDATA[955,1447]                     [492 bytes]
    }
    EXTEND2 {
      NSPEC := 0x00                                       [1 byte]
      HTYPE := 0xFFFF                                     [2 bytes]
      HLEN  := 0x0000                                     [2 bytes]
      HDATA := CLIENT_HDATA[1448,1919] || 0x00[20]        [492 bytes]
    }
    EXTEND2 {
      NSPEC := 0x00                                       [1 byte]
      HTYPE := 0xFFFF                                     [2 bytes]
      HLEN  := 0x0000                                     [2 bytes]
      HDATA := 0x00[172]                                  [172 bytes]
    }

  The client sends this to the server to extend the circuit from, and that
  server should format the fragmented EXTEND2 cells into a CREATE2V cell, as
  described in §A.1.

§A.4. Fragmented EXTENDED2 Cells

    EXTENDED2 {
      NSPEC := 0x02 {                                     [1 byte]
        LINK_ID_SERVER                                    [22 bytes] XXX
        LINK_ADDRESS_SERVER                               [8 bytes]  XXX
      }
      HTYPE := 0x0003                                     [2 bytes]
      HLEN  := 0x0800                                     [2 bytes]
      HDATA := SERVER_HDATA[0,461]                        [462 bytes]
    }
    EXTENDED2 {
      NSPEC := 0x00                                       [1 byte]
      HTYPE := 0xFFFF                                     [2 bytes]
      HLEN  := 0x0000                                     [2 bytes]
      HDATA := SERVER_HDATA[462,954]                      [492 bytes]
    }
    EXTEND2 {
      NSPEC := 0x00                                       [1 byte]
      HTYPE := 0xFFFF                                     [2 bytes]
      HLEN  := 0x0000                                     [2 bytes]
      HDATA := SERVER_HDATA[955,1447]                     [492 bytes]
    }
    EXTEND2 {
      NSPEC := 0x00                                       [1 byte]
      HTYPE := 0xFFFF                                     [2 bytes]
      HLEN  := 0x0000                                     [2 bytes]
      HDATA := SERVER_HDATA[1448,1939]                    [492 bytes]
    }
    EXTEND2 {
      NSPEC := 0x00                                       [1 byte]
      HTYPE := 0xFFFF                                     [2 bytes]
      HLEN  := 0x0000                                     [2 bytes]
      HDATA := SERVER_HDATA[1940,2112]                    [172 bytes]
    }


§B. NewHope Internal Functions

  gen_a(SEED):                  returns a uniformly random poly
  poly_getnoise():              returns a poly sampled from a centered binomial
  poly_ntt(poly):               number-theoretic transform; returns a poly
  poly_invntt(poly):            inverse number-theoretic transform; returns a poly
  poly_pointwise(poly, poly):   pointwise multiplication; returns a poly
  poly_tobytes(poly):           packs a poly to a NEWHOPE_POLY byte array
  poly_frombytes(NEWHOPE_POLY): unpacks a NEWHOPE_POLY byte array to a poly

  helprec(poly):                returns a NEWHOPE_REC byte array
  rec(poly, NEWHOPE_REC):       returns a NEWHOPE_KEY


  --- Description of the Newhope internal functions ---

  gen_a(SEED seed) receives as input a 32-byte (public) seed.  It expands
  this seed through SHAKE-128 from the FIPS202 standard. The output of SHAKE-128
  is considered a sequence of 16-bit little-endian integers. This sequence is
  used to initialize the coefficients of the returned polynomial from the least
  significant (coefficient of X^0) to the most significant (coefficient of
  X^1023) coefficient. For each of the 16-bit integers first eliminate the
  highest two bits (to make it a 14-bit integer) and then use it as the next
  coefficient if it is smaller than q=12289.
  Note that the amount of output required from SHAKE to initialize all 1024
  coefficients of the polynomial varies depending on the input seed.
  Note further that this function does not process any secret data and thus does
  not need any timing-attack protection.


  poly_getnoise() first generates 4096 bytes of uniformly random data. This can
  be done by reading these bytes from the system's RNG; efficient
  implementations will typically only read a 32-byte seed from the system's RNG
  and expand it through some fast PRG (for example, ChaCha20 or AES-256 in CTR
  mode). The output of the PRG is considered an array of 2048 16-bit integers
  r[0],...,r[2047]. The coefficients of the output polynomial are computed as
  HW(r[0])-HW(r[1]), HW(r[2])-HW(r[3]),...,HW(r[2046])-HW(r[2047]), where HW
  stands for Hamming weight.
  Note that the choice of RNG is a local decision; different implementations are
  free to use different RNGs.
  Note further that the output of this function is secret; the PRG (and the
  computation of HW) need to be protected against timing attacks.


  poly_ntt(poly f): For a mathematical description of poly_ntt see the [0]; a
  pseudocode description of a very naive in-place transformation of an input
  polynomial f = f[0] + f[1]*X + f[2]*X^2 + ... + f[1023]*X^1023 is the
  following code (all arithmetic on coefficients performed modulo q):

    psi   = 7
    omega = 49

    for i in range(0,n):
      t[i] = f[i] * psi^i

    for i in range(0,n):
      f[i] = 0
      for j in range(0,n):
        f[i] += t[j] * omega^((i*j)%n)

  Note that this is not how poly_ntt should be implemented if performance is
  an issue; in particular, efficient algorithms for the number-theoretic
  transform take time O(n*log(n)) and not O(n^2)
  Note further that all arithmetic in poly_ntt has to be protected against
  timing attacks.


  poly_invntt(poly f): For a mathematical description of poly_invntt see the
  [0]; a pseudocode description of a very naive in-place transformation of an
  input polynomial f = f[0] + f[1]*X + f[2]*X^2 + ... + f[1023]*X^1023 is the
  following code (all arithmetic on coefficients performed modulo q):

    invpsi = 8778;
    invomega = 1254;
    invn = 12277;

    for i in range(0,n):
      t[i] = f[i];

    for i in range(0,n):
      f[i]=0;
      for j in range(0,n):
        f[i] += t[j] * invomega^((i*j)%n)
      f[i] *= invpsi^i
      f[i] *= invn

  Note that this is not how poly_invntt should be implemented if performance
  is an issue; in particular, efficient algorithms for the inverse
  number-theoretic transform take time O(n*log(n)) and not O(n^2)
  Note further that all arithmetic in poly_invntt has to be protected against
  timing attacks.


  poly_pointwise(poly f, poly g) performs pointwise multiplication of the two
  polynomials.  This means that for f = (f0 + f1*X + f2*X^2 + ... +
  f1023*X^1023) and g = (g0 + g1*X + g2*X^2 + ... + g1023*X^1023) it computes
  and returns h = (h0 + h1*X + h2*X^2 + ... + h1023*X^1023) with h0 = f0*g0,
  h1 = f1*g1,..., h1023 = f1023*g1023.


  poly_tobytes(poly f) first reduces all coefficents of f modulo q, i.e.,
  brings them to the interval [0,q-1]. Denote these reduced coefficients as
  f0,..., f1023; note that they all fit into 14 bits. The function then packs
  those coefficients into an array of 1792 bytes r[0],..., r[1792] in "packed
  little-endian representation", i.e.,
  r[0]     = f[0] & 0xff;
  r[1]     = (f[0] >>  8) & ((f[1] & 0x03) << 6)
  r[2]     = (f[1] >>  2) & 0xff;
  r[3]     = (f[1] >> 10) & ((f[2] & 0x0f) << 4)
  .
  .
  .
  r[1790]  = (f[1022]) >> 12) & ((f[1023] & 0x3f) << 2)
  r[1791]  = f[1023] >> 6
  Note that this function needs to be protected against timing attacks. In
  particular, avoid non-constant-time conditional subtractions (or other
  non-constant-time expressions) in the reduction modulo q of the coefficients.


  poly_frombytes(NEWHOPE_POLY b) is the inverse of poly_tobytes; it receives
  as input an array of 1792 bytes and coverts it into the internal
  representation of a poly. Note that poly_frombytes does not need to check
  whether the coefficients are reduced modulo q or reduce coefficients modulo
  q. Note further that the function must not leak any information about its
  inputs through timing information, as it is also applied to the secret key
  of the initiator.


  helprec(poly f) computes 256 bytes of reconciliation information from the
  input poly f. Internally, one byte of reconciliation information is computed
  from four coefficients of f by a function helprec4. Let the input polynomial f
  = (f0 + f1*X + f2*X^2 + ... + f1023*X^1023); let the output byte array be
  r[0],...r[256]. This output byte array is computed as
  r[0]   = helprec4(f0,f256,f512,f768)
  r[1]   = helprec4(f1,f257,f513,f769)
  r[2]   = helprec4(f2,f258,f514,f770)
  .
  .
  .
  r[255] = helprec4(f255,f511,f767,f1023), where helprec4 does the following:

    helprec4(x0,x1,x2,x3):
      b = randombit()
      r0,r1,r2,r3 = CVPD4(8*x0+4*b,8*x1+4*b,8*x2+4*b,8*x3+4*b)
      r = (r0 & 0x03) | ((r1 & 0x03) << 2) | ((r2 & 0x03) << 4) | ((r3 & 0x03) << 6)
      return r

  The function CVPD4 does the following:

    CVPD4(y0,y1,y2,y3):
      v00 = round(y0/2q)
      v01 = round(y1/2q)
      v02 = round(y2/2q)
      v03 = round(y3/2q)
      v10 = round((y0-1)/2q)
      v11 = round((y1-1)/2q)
      v12 = round((y2-1)/2q)
      v13 = round((y3-1)/2q)
      t   = abs(y0 - 2q*v00)
      t  += abs(y1 - 2q*v01)
      t  += abs(y2 - 2q*v02)
      t  += abs(y3 - 2q*v03)
      if(t < 2q):
        v0 = v00
        v1 = v01
        v2 = v02
        v3 = v03
        k  = 0
      else
        v0 = v10
        v1 = v11
        v2 = v12
        v3 = v13
        r  = 1
      return (v0-v3,v1-v3,v2-v3,k+2*v3)

  In this description, round(x) is defined as ⌊x + 0.5⌋, where ⌊x⌋ rounds to
  the largest integer that does not exceed x; abs() returns the absolute
  value.
  Note that all computations involved in helprec operate on secret data and must
  be protected against timing attacks.


  rec(poly f, NEWHOPE_REC r) computes the pre-hash (see paper) Newhope key from
  f and r. Specifically, it computes one bit of key from 4 coefficients of f and
  one byte of r. Let f = f0 + f1*X + f2*X^2 + ... + f1023*X^1023 and let r =
  r[0],r[1],...,r[255]. Let the bytes of the output by k[0],...,k[31] and let
  the bits of the output by k0,...,k255, where
  k0   = k[0] & 0x01
  k1   = (k[0] >> 1) & 0x01
  k2   = (k[0] >> 2) & 0x01
  .
  .
  .
  k8   = k[1] & 0x01
  k9   = (k[1] >> 1) & 0x01
  .
  .
  .
  k255 = (k[32] >> 7)
  The function rec computes k0,...,k255 as
  k0   = rec4(f0,f256,f512,f768,r[0])
  k1   = rec4(f1,f257,f513,f769,r[1])
  .
  .
  .
  k255 = rec4(f255,f511,f767,f1023,r[255])

  The function rec4 does the following:

    rec4(y0,y1,y2,y3,r):
      r0 = r & 0x03
      r1 = (r >> 2) & 0x03
      r2 = (r >> 4) & 0x03
      r3 = (r >> 6) & 0x03
      Decode(8*y0-2q*r0, 8*y1-2q*r1, 8*y2-2q*r2, 8*y3-q*r3)

  The function Decode does the following:

    Decode(v0,v1,v2,v3):
      t0 = round(v0/8q)
      t1 = round(v1/8q)
      t2 = round(v2/8q)
      t3 = round(v3/8q)
      t  = abs(v0 - 8q*t0)
      t += abs(v0 - 8q*t0)
      t += abs(v0 - 8q*t0)
      t += abs(v0 - 8q*t0)
      if(t > 1) return 1
      else return 0


§C. Test Vectors
```