diff options
Diffstat (limited to 'spec/vanguards-spec/vanguards-stats.md')
-rw-r--r-- | spec/vanguards-spec/vanguards-stats.md | 195 |
1 files changed, 195 insertions, 0 deletions
diff --git a/spec/vanguards-spec/vanguards-stats.md b/spec/vanguards-spec/vanguards-stats.md new file mode 100644 index 0000000..445f2f0 --- /dev/null +++ b/spec/vanguards-spec/vanguards-stats.md @@ -0,0 +1,195 @@ +# Vanguard Rotation Statistics + +## Sybil rotation counts for a given number of Guards {#SybilTable} + +The probability of Sybil success for Guard discovery can be modeled as +the probability of choosing 1 or more malicious middle nodes for a +sensitive circuit over some period of time. + +```text + P(At least 1 bad middle) = 1 - P(All Good Middles) + = 1 - P(One Good middle)^(num_middles) + = 1 - (1 - c/n)^(num_middles) +``` + +c/n is the adversary compromise percentage + +In the case of Vanguards, num_middles is the number of Guards you rotate +through in a given time period. This is a function of the number of vanguards +in that position (v), as well as the number of rotations (r). + +```text + P(At least one bad middle) = 1 - (1 - c/n)^(v*r) +``` + +Here's detailed tables in terms of the number of rotations required for +a given Sybil success rate for certain number of guards. + +```text + 1.0% Network Compromise: + Sybil Success One Two Three Four Five Six Eight Nine Ten Twelve Sixteen + 10% 11 6 4 3 3 2 2 2 2 1 1 + 15% 17 9 6 5 4 3 3 2 2 2 2 + 25% 29 15 10 8 6 5 4 4 3 3 2 + 50% 69 35 23 18 14 12 9 8 7 6 5 + 60% 92 46 31 23 19 16 12 11 10 8 6 + 75% 138 69 46 35 28 23 18 16 14 12 9 + 85% 189 95 63 48 38 32 24 21 19 16 12 + 90% 230 115 77 58 46 39 29 26 23 20 15 + 95% 299 150 100 75 60 50 38 34 30 25 19 + 99% 459 230 153 115 92 77 58 51 46 39 29 + + 5.0% Network Compromise: + Sybil Success One Two Three Four Five Six Eight Nine Ten Twelve Sixteen + 10% 3 2 1 1 1 1 1 1 1 1 1 + 15% 4 2 2 1 1 1 1 1 1 1 1 + 25% 6 3 2 2 2 1 1 1 1 1 1 + 50% 14 7 5 4 3 3 2 2 2 2 1 + 60% 18 9 6 5 4 3 3 2 2 2 2 + 75% 28 14 10 7 6 5 4 4 3 3 2 + 85% 37 19 13 10 8 7 5 5 4 4 3 + 90% 45 23 15 12 9 8 6 5 5 4 3 + 95% 59 30 20 15 12 10 8 7 6 5 4 + 99% 90 45 30 23 18 15 12 10 9 8 6 + + 10.0% Network Compromise: + Sybil Success One Two Three Four Five Six Eight Nine Ten Twelve Sixteen + 10% 2 1 1 1 1 1 1 1 1 1 1 + 15% 2 1 1 1 1 1 1 1 1 1 1 + 25% 3 2 1 1 1 1 1 1 1 1 1 + 50% 7 4 3 2 2 2 1 1 1 1 1 + 60% 9 5 3 3 2 2 2 1 1 1 1 + 75% 14 7 5 4 3 3 2 2 2 2 1 + 85% 19 10 7 5 4 4 3 3 2 2 2 + 90% 22 11 8 6 5 4 3 3 3 2 2 + 95% 29 15 10 8 6 5 4 4 3 3 2 + 99% 44 22 15 11 9 8 6 5 5 4 3 +``` + +The rotation counts in these tables were generated with: + +## Skewed Rotation Distribution {#MaxDist} + +In order to skew the distribution of the third layer guard towards +higher values, we use max(X,X) for the distribution, where X is a +random variable that takes on values from the uniform distribution. + +Here's a table of expectation (arithmetic means) for relevant +ranges of X (sampled from 0..N-1). The table was generated with the +following python functions: + +```text + + def ProbMinXX(N, i): return (2.0*(N-i)-1)/(N*N) + def ProbMaxXX(N, i): return (2.0*i+1)/(N*N) + + def ExpFn(N, ProbFunc): + exp = 0.0 + for i in range(N): exp += i*ProbFunc(N, i) + return exp +``` + +The current choice for second-layer Vanguards-Lite guards is noted with **, +and the current choice for third-layer Full Vanguards is noted with ***. + +```text + Range Min(X,X) Max(X,X) + 22 6.84 14.16** + 23 7.17 14.83 + 24 7.51 15.49 + 25 7.84 16.16 + 26 8.17 16.83 + 27 8.51 17.49 + 28 8.84 18.16 + 29 9.17 18.83 + 30 9.51 19.49 + 31 9.84 20.16 + 32 10.17 20.83 + 33 10.51 21.49 + 34 10.84 22.16 + 35 11.17 22.83 + 36 11.50 23.50 + 37 11.84 24.16 + 38 12.17 24.83 + 39 12.50 25.50 + 40 12.84 26.16 + 40 12.84 26.16 + 41 13.17 26.83 + 42 13.50 27.50 + 43 13.84 28.16 + 44 14.17 28.83 + 45 14.50 29.50 + 46 14.84 30.16 + 47 15.17 30.83 + 48 15.50 31.50*** +``` + +The Cumulative Density Function (CDF) tells us the probability that a +guard will no longer be in use after a given number of time units have +passed. + +Because the Sybil attack on the third node is expected to complete at any +point in the second node's rotation period with uniform probability, if we +want to know the probability that a second-level Guard node will still be in +use after t days, we first need to compute the probability distribution of +the rotation duration of the second-level guard at a uniformly random point +in time. Let's call this P(R=r). + +For P(R=r), the probability of the rotation duration depends on the selection +probability of a rotation duration, and the fraction of total time that +rotation is likely to be in use. This can be written as: + +```text + P(R=r) = ProbMaxXX(X=r)*r / \sum_{i=1}^N ProbMaxXX(X=i)*i +``` + + or in Python: + +```text + def ProbR(N, r, ProbFunc=ProbMaxXX): + return ProbFunc(N, r)*r/ExpFn(N, ProbFunc) +``` + +For the full CDF, we simply sum up the fractional probability density for +all rotation durations. For rotation durations less than t days, we add the +entire probability mass for that period to the density function. For +durations d greater than t days, we take the fraction of that rotation +period's selection probability and multiply it by t/d and add it to the +density. In other words: + +```text + def FullCDF(N, t, ProbFunc=ProbR): + density = 0.0 + for d in range(N): + if t >= d: density += ProbFunc(N, d) + # The +1's below compensate for 0-indexed arrays: + else: density += ProbFunc(N, d)*(float(t+1))/(d+1) + return density +``` + +Computing this yields the following distribution for our current parameters: + +```text + t P(SECOND_ROTATION <= t) + 1 0.03247 + 2 0.06494 + 3 0.09738 + 4 0.12977 + 5 0.16207 + 10 0.32111 + 15 0.47298 + 20 0.61353 + 25 0.73856 + 30 0.84391 + 35 0.92539 + 40 0.97882 + 45 1.00000 +``` + +This CDF tells us that for the second-level Guard rotation, the +adversary can expect that 3.3% of the time, their third-level Sybil +attack will provide them with a second-level guard node that has only +1 day remaining before it rotates. 6.5% of the time, there will +be only 2 day or less remaining, and 9.7% of the time, 3 days or less. + +Note that this distribution is still a day-resolution approximation. |