Legend:
SPEC!!  - Not specified
SPEC    - Spec not finalized
NICK    - nick claims
ARMA    - arma claims
        - Not done
        * Top priority
        . Partially done
        o Done
        D Deferred
        X Abandoned

For 0.0.7:
        o allow multiple log files
        o *bindaddress
          o include the port
          o allow multiple of them
          o have an allow/deny series for socks
        o break exitpolicy into multiple config lines
        o have the OP forget routers it hasn't heard about in 24 hours
        . rename/rearrange functions for what file they're in
        D try to break apart the main clump of functions better.
        o rend_services_introduce should check if it's failed a lot
          recently, and not try for a while if so
        o check tor version as soon as you get the recommended-versions
          string, regardless of whether parsing the directory succeeded.
        - make all ORs serve the directory too.


For September:
        . Windows port
          o works as client
            - deal with pollhup / reached_eof on all platforms
          . robust as a client
          - works as server
            - can be configured
          - robust as a server
          - docs for building on Windows
          - installer?

        - Docs
          - FAQ
          - overview of tor. how does it work, what's it do, pros and
            cons of using it, why should I use it, etc.
          - a howto tutorial with examples
          - tutorial: how to set up your own tor network
            - (need to not hardcode dirservers file in config.c)
          . correct, update, polish spec
          - document the exposed function api?
          - document what we mean by socks.

        - packages
          - rpm
          - find a long-term rpm maintainer

        - code
          - better warn/info messages
          - let tor do resolves.
          - extend socks4 to do resolves?
          - make script to ask tor for resolves
          - tsocks
            - gather patches, submit to maintainer
            - intercept gethostbyname and others, do resolve via tor
          - redesign and thorough code revamp, with particular eye toward:
            - support half-open tcp connections
            - conn key rotation
            - other transports -- http, airhook
            - modular introduction mechanism
            - allow non-clique topology

Other details and small and hard things:
        - tor should be able to have a pool of outgoing IP addresses
          that it is able to rotate through. (maybe)
        - tie into squid
        - buffer size pool, to let a few buffers grow huge or many buffers
          grow a bit
        - hidserv offerers shouldn't need to define a SocksPort
        - when the client fails to pick an intro point for a hidserv,
          it should refetch the hidserv desc.
        . should maybe make clients exit(1) when bad things happen?
          e.g. clock skew.
        - should retry exitpolicy end streams even if the end cell didn't
          resolve the address for you
        - Add '[...truncated]' or similar to truncated log entries (like the directory
          in connection_dir_process_inbuf()).
        . Make logs handle it better when writing to them fails.
        o Dirserver shouldn't put you in running-routers list if you haven't
          uploaded a descriptor recently
        . Refactor: add own routerinfo to routerlist.  Right now, only
          router_get_by_nickname knows about 'this router', as a hack to
          get circuit_launch_new to do the right thing.
        . Scrubbing proxies
                - Find an smtp proxy?
                . Get socks4a support into Mozilla
        - Extend by nickname/hostname/something, not by IP.
        - Need a relay teardown cell, separate from one-way ends.
        - Make it harder to circumvent bandwidth caps: look at number of bytes
          sent across sockets, not number sent inside TLS stream.
        - fix router_get_by_* functions so they can get ourselves too,
          and audit everything to make sure rend and intro points are
          just as likely to be us as not.
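
A minimal sketch of the truncated-log-entry item above, assuming a fixed
per-entry length limit. The names format_log_entry and TRUNCATED_MARKER are
hypothetical and do not correspond to anything in the tor source; the point
is only that the marker must fit inside the limit rather than being appended
past it.

```python
TRUNCATED_MARKER = "[...truncated]"


def format_log_entry(message: str, max_len: int) -> str:
    """Return message cut to at most max_len characters.

    If the message has to be cut, the tail is replaced with
    TRUNCATED_MARKER so the reader can tell the entry is incomplete.
    """
    if max_len < len(TRUNCATED_MARKER):
        raise ValueError("max_len is too small to hold the truncation marker")
    if len(message) <= max_len:
        return message
    # Reserve room for the marker inside the limit.
    keep = max_len - len(TRUNCATED_MARKER)
    return message[:keep] + TRUNCATED_MARKER
```

Short messages pass through unchanged; long ones come back exactly max_len
characters long and end in the marker.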



***************************Future tasks:****************************

Rendezvous and hidden services:
  make it fast:
    - preemptively build and start rendezvous circs.
    - preemptively build n-1 hops of intro circs?
    - cannibalize general circs?
  make it reliable:
    - standby/hotswap/redundant services.
    - store stuff to disk? dirservers forget service descriptors when
      they restart; nodes offering hidden services forget their chosen
      intro points when they restart.
  make it robust:
    - auth mechanisms to let midpoint and bob selectively choose
      connection requests.
  make it scalable:
    - right now the hidserv store/lookup system is run by the dirservers;
      this won't scale.

Tor scalability:
  Relax clique assumptions.
  Redesign how directories are handled.
    - Separate running-routers lookup from descriptor list lookup.
    - Resolve directory agreement somehow.
    - Cache directory on all servers.
  Find and remove bottlenecks
    - Address linear searches on e.g. circuit and connection lists.
  Reputation/memory system, so dirservers can measure people,
    and so other people can verify their measurements.
    - Need to measure via relay, so it's not distinguishable.
  Bandwidth-aware path selection. So people with T3s are picked
    more often than people with DSL.
  Reliability-aware node selection. So people who are stable are
    preferred for long-term circuits such as intro and rend circs,
    and general circs for irc, aim, ssh, etc.
  Let dissidents get to Tor servers via Tor users. ("Backbone model")
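
One possible reading of the bandwidth-aware path selection item, sketched
as a weighted random choice. This is an assumption about how it could work,
not a description of any implemented algorithm: pick_router and the
bandwidth figures are hypothetical, and a real design would also have to
deal with routers lying about their bandwidth.

```python
import random
from typing import Dict, Optional


def pick_router(bandwidths: Dict[str, int],
                rng: Optional[random.Random] = None) -> str:
    """Pick one router name with probability proportional to its bandwidth.

    bandwidths maps a router nickname to its advertised bandwidth
    (any consistent unit). Routers with zero bandwidth are never picked.
    """
    if rng is None:
        rng = random.Random()
    names = list(bandwidths)
    weights = [bandwidths[name] for name in names]
    if any(w < 0 for w in weights):
        raise ValueError("bandwidth must not be negative")
    if not names or sum(weights) == 0:
        raise ValueError("no router with usable bandwidth")
    return rng.choices(names, weights=weights, k=1)[0]
```

With a T3-class router advertising 45000 and a DSL-class router advertising
1500, the first is chosen roughly thirty times as often, while the second
still sees some use.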

Anonymity improvements:
  Is abandoning the circuit the only option when an extend fails, or
    can we do something without impacting anonymity too much?
  Is exiting from the middle of the circuit always a bad idea?
  Helper nodes. Decide how to use them to improve safety.
  DNS resolution: need to make tor support resolve requests. Need to write
    a script and an interface (including an extension to the socks
    protocol) so we can ask it to do resolve requests. Need to patch
    tsocks to intercept gethostbyname, else we'll continue leaking it.
  Improve path selection algorithms based on routing-zones paper. Be sure
    to start and end circuits in different ASs. Ideally, consider AS of
    source and destination -- maybe even enter and exit via nearby AS.
  Intermediate model, with some delays and mixing.
  Add defensive dropping regime?

Make it more correct:
  Handle half-open connections: right now we don't support all TCP
    streams, at least according to the protocol. But we handle all that
    we've seen in the wild.
  Support IPv6.

Efficiency/speed/robustness:
  Congestion control. Is our current design sufficient once we have heavy
    use? Need to measure and tweak, or maybe overhaul.
  Allow small cells and large cells on the same network?
  Cell buffering and resending. This will allow us to handle broken
    circuits as long as the endpoints don't break, plus will allow
    connection (tls session key) rotation.
  Implement Morphmix, so we can compare its behavior, complexity, etc.
  Use cpuworker for more heavy lifting.
    - Signing (and verifying) hidserv descriptors
    - Signing (and verifying) intro/rend requests
    - Signing (and verifying) router descriptors
    - Signing (and verifying) directories
    - Doing TLS handshake (this is very hard to separate out, though)
  Buffer size pool: allocate a maximum size for all buffers, not
    a maximum size for each buffer. So we don't have to give up as
    quickly (and kill the thickpipe!) when there's congestion.
  Exit node caching: tie into squid or other caching web proxy.
  Other transport. HTTP, udp, rdp, airhook, etc. May have to do our own
    link crypto, unless we can bully openssl into it.
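
The buffer size pool item above can be sketched as a shared byte budget
that every buffer draws from, instead of a fixed cap per buffer. This is
only an illustration under that assumption; BufferPool and PooledBuffer are
hypothetical names, not part of the tor source.

```python
class BufferPool:
    """Tracks one byte budget shared by all buffers."""

    def __init__(self, total_limit: int) -> None:
        self.total_limit = total_limit
        self.used = 0

    def reserve(self, nbytes: int) -> bool:
        """Claim nbytes from the budget; return False if it would overflow."""
        if self.used + nbytes > self.total_limit:
            return False
        self.used += nbytes
        return True

    def release(self, nbytes: int) -> None:
        """Give nbytes back to the budget."""
        self.used -= nbytes


class PooledBuffer:
    """A buffer with no cap of its own; only the shared pool limits it."""

    def __init__(self, pool: BufferPool) -> None:
        self.pool = pool
        self.data = bytearray()

    def append(self, chunk: bytes) -> bool:
        """Add chunk if the pool has room; the caller must handle False."""
        if not self.pool.reserve(len(chunk)):
            return False
        self.data += chunk
        return True

    def drain(self, nbytes: int) -> bytes:
        """Remove and return up to nbytes, returning the space to the pool."""
        out = bytes(self.data[:nbytes])
        del self.data[:nbytes]
        self.pool.release(len(out))
        return out
```

One buffer may grow to nearly the whole budget while the others stay small,
or many may grow a little; a write is refused only when the pool as a whole
is full.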

P2P Tor:
  Do all the scalability stuff above, first.
  Incentives to relay. Not so hard.
  Incentives to allow exit. Possibly quite hard.
  Sybil defenses without having a human bottleneck.
  How to gather random sample of nodes.
  How to handle nodelist recommendations.
  Consider incremental switches: a p2p tor with only 50 users has
    different anonymity properties than one with 10k users, and should
    be treated differently.