1 files changed, 195 insertions, 0 deletions
diff --git a/spec/vanguards-spec/vanguards-stats.md b/spec/vanguards-spec/vanguards-stats.md
new file mode 100644
index 0000000..445f2f0
--- /dev/null
+++ b/spec/vanguards-spec/vanguards-stats.md
@@ -0,0 +1,195 @@
+# Vanguard Rotation Statistics
+
+## Sybil rotation counts for a given number of Guards {#SybilTable}
+
+The probability of Sybil success for Guard discovery can be modeled as
+the probability of choosing 1 or more malicious middle nodes for a
+sensitive circuit over some period of time.
+
+```text
+  P(At least 1 bad middle) = 1 - P(All Good Middles)
+                           = 1 - P(One Good middle)^(num_middles)
+                           = 1 - (1 - c/n)^(num_middles)
+```
+
+c/n is the adversary compromise percentage
+
+In the case of Vanguards, num_middles is the number of Guards you rotate
+through in a given time period. This is a function of the number of vanguards
+in that position (v), as well as the number of rotations (r).
+
+```text
+  P(At least one bad middle) = 1 - (1 - c/n)^(v*r)
+```
+
+Here's detailed tables in terms of the number of rotations required for
+a given Sybil success rate for certain number of guards.
+
+```text
+  1.0% Network Compromise:
+   Sybil Success   One   Two  Three  Four  Five  Six  Eight  Nine  Ten  Twelve  Sixteen
+    10%            11     6     4     3     3     2     2     2     2     1       1
+    15%            17     9     6     5     4     3     3     2     2     2       2
+    25%            29    15    10     8     6     5     4     4     3     3       2
+    50%            69    35    23    18    14    12     9     8     7     6       5
+    60%            92    46    31    23    19    16    12    11    10     8       6
+    75%           138    69    46    35    28    23    18    16    14    12       9
+    85%           189    95    63    48    38    32    24    21    19    16      12
+    90%           230   115    77    58    46    39    29    26    23    20      15
+    95%           299   150   100    75    60    50    38    34    30    25      19
+    99%           459   230   153   115    92    77    58    51    46    39      29
+
+  5.0% Network Compromise:
+   Sybil Success   One   Two  Three  Four  Five  Six  Eight  Nine  Ten  Twelve  Sixteen
+    10%             3     2     1     1     1     1     1     1     1     1       1
+    15%             4     2     2     1     1     1     1     1     1     1       1
+    25%             6     3     2     2     2     1     1     1     1     1       1
+    50%            14     7     5     4     3     3     2     2     2     2       1
+    60%            18     9     6     5     4     3     3     2     2     2       2
+    75%            28    14    10     7     6     5     4     4     3     3       2
+    85%            37    19    13    10     8     7     5     5     4     4       3
+    90%            45    23    15    12     9     8     6     5     5     4       3
+    95%            59    30    20    15    12    10     8     7     6     5       4
+    99%            90    45    30    23    18    15    12    10     9     8       6
+
+  10.0% Network Compromise:
+   Sybil Success   One   Two  Three  Four  Five  Six  Eight  Nine  Ten  Twelve  Sixteen
+    10%             2     1     1     1     1     1     1     1     1     1       1
+    15%             2     1     1     1     1     1     1     1     1     1       1
+    25%             3     2     1     1     1     1     1     1     1     1       1
+    50%             7     4     3     2     2     2     1     1     1     1       1
+    60%             9     5     3     3     2     2     2     1     1     1       1
+    75%            14     7     5     4     3     3     2     2     2     2       1
+    85%            19    10     7     5     4     4     3     3     2     2       2
+    90%            22    11     8     6     5     4     3     3     3     2       2
+    95%            29    15    10     8     6     5     4     4     3     3       2
+    99%            44    22    15    11     9     8     6     5     5     4       3
+```
+
+The rotation counts in these tables were generated with:
+
+## Skewed Rotation Distribution {#MaxDist}
+
+In order to skew the distribution of the third layer guard towards
+higher values, we use max(X,X) for the distribution, where X is a
+random variable that takes on values from the uniform distribution.
+
+Here's a table of expectation (arithmetic means) for relevant
+ranges of X (sampled from 0..N-1). The table was generated with the
+following python functions:
+
+```text
+
+  def ProbMinXX(N, i): return (2.0*(N-i)-1)/(N*N)
+  def ProbMaxXX(N, i): return (2.0*i+1)/(N*N)
+
+  def ExpFn(N, ProbFunc):
+    exp = 0.0
+    for i in range(N): exp += i*ProbFunc(N, i)
+    return exp
+```
+
+The current choice for second-layer Vanguards-Lite guards is noted with **,
+and the current choice for third-layer Full Vanguards is noted with ***.
+
+```text
+   Range  Min(X,X)   Max(X,X)
+   22      6.84        14.16**
+   23      7.17        14.83
+   24      7.51        15.49
+   25      7.84        16.16
+   26      8.17        16.83
+   27      8.51        17.49
+   28      8.84        18.16
+   29      9.17        18.83
+   30      9.51        19.49
+   31      9.84        20.16
+   32      10.17       20.83
+   33      10.51       21.49
+   34      10.84       22.16
+   35      11.17       22.83
+   36      11.50       23.50
+   37      11.84       24.16
+   38      12.17       24.83
+   39      12.50       25.50
+   40      12.84       26.16
+   40      12.84       26.16
+   41      13.17       26.83
+   42      13.50       27.50
+   43      13.84       28.16
+   44      14.17       28.83
+   45      14.50       29.50
+   46      14.84       30.16
+   47      15.17       30.83
+   48      15.50       31.50***
+```
+
+The Cumulative Density Function (CDF) tells us the probability that a
+guard will no longer be in use after a given number of time units have
+passed.
+
+Because the Sybil attack on the third node is expected to complete at any
+point in the second node's rotation period with uniform probability, if we
+want to know the probability that a second-level Guard node will still be in
+use after t days, we first need to compute the probability distribution of
+the rotation duration of the second-level guard at a uniformly random point
+in time. Let's call this P(R=r).
+
+For P(R=r), the probability of the rotation duration depends on the selection
+probability of a rotation duration, and the fraction of total time that
+rotation is likely to be in use. This can be written as:
+
+```text
+  P(R=r) = ProbMaxXX(X=r)*r / \sum_{i=1}^N ProbMaxXX(X=i)*i
+```
+
+  or in Python:
+
+```text
+  def ProbR(N, r, ProbFunc=ProbMaxXX):
+     return ProbFunc(N, r)*r/ExpFn(N, ProbFunc)
+```
+
+For the full CDF, we simply sum up the fractional probability density for
+all rotation durations. For rotation durations less than t days, we add the
+entire probability mass for that period to the density function. For
+durations d greater than t days, we take the fraction of that rotation
+period's selection probability and multiply it by t/d and add it to the
+density. In other words:
+
+```text
+  def FullCDF(N, t, ProbFunc=ProbR):
+    density = 0.0
+    for d in range(N):
+      if t >= d: density += ProbFunc(N, d)
+      # The +1's below compensate for 0-indexed arrays:
+      else: density += ProbFunc(N, d)*(float(t+1))/(d+1)
+    return density
+```
+
+Computing this yields the following distribution for our current parameters:
+
+```text
+   t          P(SECOND_ROTATION <= t)
+   1               0.03247
+   2               0.06494
+   3               0.09738
+   4               0.12977
+   5               0.16207
+  10               0.32111
+  15               0.47298
+  20               0.61353
+  25               0.73856
+  30               0.84391
+  35               0.92539
+  40               0.97882
+  45               1.00000
+```
+
+This CDF tells us that for the second-level Guard rotation, the
+adversary can expect that 3.3% of the time, their third-level Sybil
+attack will provide them with a second-level guard node that has only
+1 day remaining before it rotates. 6.5% of the time, there will
+be only 2 day or less remaining, and 9.7% of the time, 3 days or less.
+
+Note that this distribution is still a day-resolution approximation.