From 2e5e0cb3f87f6813b789f09459daea6ebcaa4eb4 Mon Sep 17 00:00:00 2001
From: Nick Mathewson
Date: Fri, 24 Feb 2017 11:23:31 -0500
Subject: Four new proposals based on experiments with download size

---
 proposals/276-lower-bw-granularity.txt | 70 ++++++++++++++++++++++++++++++++++
 1 file changed, 70 insertions(+)
 create mode 100644 proposals/276-lower-bw-granularity.txt

diff --git a/proposals/276-lower-bw-granularity.txt b/proposals/276-lower-bw-granularity.txt
new file mode 100644
index 0000000..4d3735c
--- /dev/null
+++ b/proposals/276-lower-bw-granularity.txt
@@ -0,0 +1,70 @@
Filename: 276-lower-bw-granularity.txt
Title: Report bandwidth with lower granularity in consensus documents
Author: Nick Mathewson
Created: 20-Feb-2017
Status: Open
Target: 0.3.1.x-alpha

1. Overview

   This document proposes that, in order to limit the bandwidth needed
   for networkstatus diffs, we lower the granularity with which
   bandwidth is reported in consensus documents.

   Making this change will reduce the total compressed ed diff download
   volume by around 10%.

2. Motivation

   Consensus documents currently report bandwidth values as the median
   of the measured bandwidth values in the votes. (Or as the median of
   all votes' values if there are not enough measurements.) And when
   voting, in turn, authorities simply report whatever measured value
   they most recently encountered, clipped to 3 significant base-10
   figures.

   This means that, from one consensus to the next, these weights
   change very often and with little significance: a large fraction of
   bandwidth transitions are under 2% in magnitude.

   As we begin to use consensus diffs, each change will take space to
   transmit. So lowering the number of changes will lower client
   bandwidth requirements significantly.

3. Proposal

   I propose that we round the bandwidth values as they are placed in
   the votes to no more than two significant digits.
   In addition, for values beginning with decimal "2" through "4", we
   should round the first two digits to the nearest multiple of 2. For
   values beginning with decimal "5" through "9", we should round to
   the nearest multiple of 5.

   This change does not require a consensus method; it will take effect
   once enough authorities have upgraded.

4. Analysis

   The rounding proposed above will not round any value by more than
   5%, so the overall impact on bandwidth balancing should be small.

   In order to assess the bandwidth savings of this approach, I
   smoothed the January 2017 consensus documents' Bandwidth fields,
   using scripts from [1]. I found that if clients download consensus
   diffs once an hour, they can expect 11-13% mean savings after xz or
   gz compression. For two-hour intervals, the savings is 8-10%; for
   three-hour or four-hour intervals, the savings is only 6-8%. After
   that point, we start seeing diminishing returns, with only 1-2%
   savings on a 72-hour interval's diff.

   [1] https://github.com/nmathewson/consensus-diff-analysis

5. Open questions:

   Is there a greedier smoothing algorithm that would produce better
   results?

   Is there any reason to think this amount of smoothing would not
   be safe?

   Would a time-aware smoothing mechanism work better?
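As a non-normative illustration, the rounding rule in section 3 can be read as the following sketch. The function name `smooth_bandwidth` and the integer-arithmetic scaling are assumptions for illustration; this is not the actual tor implementation.

```python
def smooth_bandwidth(bw):
    """Round a bandwidth value to at most two significant digits, then
    snap those digits to a multiple of 1, 2, or 5 depending on the
    leading decimal digit, per the rule sketched in section 3."""
    if bw <= 0:
        return 0
    # Find the power of ten that leaves bw/scale in [1, 100).
    scale = 1
    while bw >= 100 * scale:
        scale *= 10
    x = bw / scale           # mantissa with two digits before the point
    leading = int(x) // 10   # leading decimal digit (0 if bw < 10)
    if leading >= 5:
        step = 5             # "5" through "9": nearest multiple of 5
    elif leading >= 2:
        step = 2             # "2" through "4": nearest multiple of 2
    else:
        step = 1             # "1" (or a single digit): keep two digits
    return step * round(x / step) * scale

# Examples: 1234 -> 1200, 2345 -> 2400, 7890 -> 8000
```

Under this reading, no positive integer value is moved by more than 5%, which is consistent with the claim in section 4.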