author Mike Perry <mikeperry-git@fscked.org> 2009-02-12 09:54:54 +0000
committer Mike Perry <mikeperry-git@fscked.org> 2009-02-12 09:54:54 +0000
commit bc2c75cd1f7458f81f2742f90c508c26a149762f
tree b968c43d90b6ac37f5f43c16b99b680938e4b576
parent ea37df68d6f04ac6c1dc652b4b199717bc84ec89
Add exit scanning proposal outline from discussions with arma.
svn:r18501
Diffstat (limited to 'proposals/ideas')
-rw-r--r-- proposals/ideas/xxx-exit-scanning-outline.txt 34
1 files changed, 34 insertions, 0 deletions
diff --git a/proposals/ideas/xxx-exit-scanning-outline.txt b/proposals/ideas/xxx-exit-scanning-outline.txt
new file mode 100644
index 0000000..8d2d456
--- /dev/null
+++ b/proposals/ideas/xxx-exit-scanning-outline.txt
@@ -0,0 +1,34 @@
+1. Scanning process
+ A. Non-HTML/JS mime types compared via SHA1 hash
+ B. Dynamic content filtered at 4 levels:
+ 1. IP change+Tor cookie utilization
+ - Tor cookies replayed with new IP in case of changes
+ 2. HTML Tag+Attribute+JS comparison
+ - Comparisons made based only on "relevant" HTML tags
+ and attributes
+ 3. HTML Tag+Attribute+JS diffing
+ - Tags, attributes and JS AST nodes that change between
+ non-Tor fetches are pruned from comparison
+ 4. URLs with > N% of node failures removed
+ - results purged from filesystem at end of scan loop
+ C. Scanner can be restarted from any point in the event
+ of scanner or system crashes, or graceful shutdown.
+ - Results+scan state pickled to filesystem continuously
+2. Cron job checks results periodically for reporting
+ A. Divide failures into three types of BadExit based on failure
+ type, frequency over time, and incident rate
+ B. Write reject lines to approved-routers for those three types:
+ 1. ID Hex based (for misconfig/network problems easily fixed)
+ 2. IP based (for content modification)
+ 3. IP+mask based (for continuous/egregious content modification)
+ C. Email results to tor-scanners@freehaven.net
+3. Human Review and Appeal
+ A. ID Hex-based BadExit is meant to be easy to remove
+ without needing to beg us.
+ - Should this behavior be encouraged?
+ B. Can optionally reserve IP-based BadExits for human review
+ 1. Results are encapsulated fully on the filesystem and can be
+ reviewed without network access
+ 2. Soat has --rescan to rescan failed nodes from a data directory
+ - New set of URLs used
+
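
Items 1.A and 1.B.4 of the outline can be illustrated with a minimal sketch. This is not code from the scanner itself: the function names, the (url, exit_id, passed) result tuple, and the max_failure_pct parameter (standing in for "N") are all hypothetical, chosen only to show the idea of hashing non-HTML/JS bodies with SHA1 and dropping URLs that fail on too many exits.

```python
import hashlib
from collections import defaultdict


def sha1_hex(body: bytes) -> str:
    """Return the SHA1 digest of a response body as a hex string."""
    return hashlib.sha1(body).hexdigest()


def content_matches(direct_body: bytes, tor_body: bytes) -> bool:
    """Item 1.A: compare a non-Tor fetch and a Tor fetch by SHA1 hash."""
    return sha1_hex(direct_body) == sha1_hex(tor_body)


def prune_noisy_urls(results, max_failure_pct):
    """Item 1.B.4: drop URLs that fail on more than max_failure_pct of exits.

    results is a list of hypothetical (url, exit_id, passed) tuples.
    Returns (kept_results, noisy_urls).
    """
    tested = defaultdict(set)   # url -> exits that fetched it
    failed = defaultdict(set)   # url -> exits whose fetch did not match
    for url, exit_id, passed in results:
        tested[url].add(exit_id)
        if not passed:
            failed[url].add(exit_id)

    # A URL that fails across many unrelated exits is more likely to be
    # dynamic content than evidence of many misbehaving exits.
    noisy = {
        url for url in tested
        if 100.0 * len(failed[url]) / len(tested[url]) > max_failure_pct
    }
    kept = [r for r in results if r[0] not in noisy]
    return kept, noisy
```

Counting distinct exits per URL, rather than raw fetches, is a design assumption here: it keeps one frequently rescanned exit from pushing a URL over the threshold on its own.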