Diffstat (limited to 'doc/HACKING')
-rw-r--r-- doc/HACKING | 117
1 files changed, 117 insertions, 0 deletions
diff --git a/doc/HACKING b/doc/HACKING
new file mode 100644
index 0000000000..421b32f904
--- /dev/null
+++ b/doc/HACKING
@@ -0,0 +1,117 @@
+
+0. Intro.
+Onion Routing is still very much in development. This document
+aims to get you started in the right direction if you want to understand
+the code, add features, fix bugs, etc.
+
+Read the README file first, so you can get familiar with the basics.
+
+1. The programs.
+
+1.1. "or". This is the main program here. It functions as both a server
+and a client, depending on which config file you give it. ...
+
+2. The pieces.
+
+2.1. Routers. Onion routers, as far as the 'or' program is concerned,
+are a bunch of data items that are loaded into the router_array when
+the program starts. After it's loaded, the router information is never
+changed. When a new OR connection is started (see below), the relevant
+information is copied from the router struct to the connection struct.
+
+2.2. Connections. A connection is a long-standing TCP socket between
+nodes. A connection is named based on what it's connected to -- an "OR
+connection" has an onion router on the other end, an "OP connection" has
+an onion proxy on the other end, an "exit connection" has a website or
+other server on the other end, and an "AP connection" has an application
+proxy (and thus a user) on the other end.
+
+2.3. Circuits. A circuit is a single conversation between two
+participants over the onion routing network. One end of the circuit has
+an AP connection, and the other end has an exit connection. AP and exit
+connections have only one circuit associated with them (and thus these
+connection types are closed when the circuit is closed), whereas OP and
+OR connections multiplex many circuits at once, and stay standing even
+when there are no circuits running over them.
+
+2.4. Cells. Some connections, specifically OR and OP connections, speak
+"cells". This means that data over that connection is bundled into
+128-byte packets (8 bytes of header and 120 bytes of payload). Each cell
+has a type, or "command", which indicates what it's for.
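
The cell sizes given in 2.4 can be sketched in C as follows. This is only
an illustration: the text fixes the sizes (8 + 120 = 128 bytes) and says
each cell carries a command, but it does not give the header layout, so
storing the command in header[0] is an assumption, and the names
sketch_cell_t, cells_needed and cell_pack are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CELL_HEADER_LEN  8    /* from 2.4: 8 bytes of header */
#define CELL_PAYLOAD_LEN 120  /* from 2.4: 120 bytes of payload */
#define CELL_LEN (CELL_HEADER_LEN + CELL_PAYLOAD_LEN)

/* Hypothetical in-memory view of one cell. Only the two sizes come
 * from the text; the header is otherwise treated as opaque here. */
typedef struct {
  uint8_t header[CELL_HEADER_LEN];
  uint8_t payload[CELL_PAYLOAD_LEN];
} sketch_cell_t;

/* Number of cells needed to carry len bytes of data. */
static size_t cells_needed(size_t len) {
  return (len + CELL_PAYLOAD_LEN - 1) / CELL_PAYLOAD_LEN;
}

/* Zero the cell, record the command (ASSUMPTION: in header[0]) and copy
 * at most CELL_PAYLOAD_LEN bytes of data into the payload.
 * Returns the number of bytes actually copied. */
static size_t cell_pack(sketch_cell_t *cell, uint8_t command,
                        const uint8_t *data, size_t len) {
  size_t n = len < CELL_PAYLOAD_LEN ? len : CELL_PAYLOAD_LEN;
  memset(cell, 0, sizeof(*cell));
  cell->header[0] = command;
  memcpy(cell->payload, data, n);
  return n;
}
```

A caller with more than 120 bytes to send would call cell_pack repeatedly,
advancing its data pointer by the returned count each time.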
+
+
+3. Important parameters in the code.
+
+3.1. Role.
+
+
+4. Robustness features.
+
+4.1. Bandwidth throttling. Each cell-speaking connection has a maximum
+bandwidth it can use, as specified in the routers.or file. Bandwidth
+throttling occurs on both the sender side and the receiving side. The
+sending side sends cells at regularly spaced intervals (e.g., a connection
+with a bandwidth of 12800B/s would queue a cell every 10ms). The receiving
+side protects against misbehaving servers that send cells more frequently,
+by using a simple token bucket:
+
+Each connection has a token bucket with a specified capacity. Tokens are
+added to the bucket each second (when the bucket is full, new tokens
+are discarded). Each token represents permission to receive one byte
+from the network: to receive a byte, the connection must remove a
+token from the bucket. Thus if the bucket is empty, that connection must
+wait until more tokens arrive. The number of tokens we add enforces a
+long-term average rate of incoming bytes, yet we still permit short-term
+bursts above the allowed bandwidth. Currently bucket sizes are set to
+ten seconds' worth of traffic.
+
+The bandwidth throttling uses TCP to push back when we stop reading.
+We extend it with token buckets to allow more flexibility for traffic
+bursts.
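
The throttling described in 4.1 might look roughly like the sketch below.
It is not the code in the tree: the type and function names are made up
for illustration, and starting the bucket full is an assumption. What
does come from the text is one token per byte, a refill once per second,
a capacity of ten seconds of traffic, and the sender-side spacing
(128-byte cells at 12800B/s gives one cell every 10ms).

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_CELL_LEN 128       /* cell size from 2.4 */
#define BUCKET_SECONDS  10        /* bucket holds ten seconds of traffic */

/* Hypothetical receiver-side bucket; one token allows one byte. */
typedef struct {
  size_t rate;      /* tokens added per second (bytes/s) */
  size_t capacity;  /* maximum tokens the bucket can hold */
  size_t tokens;    /* tokens currently available */
} token_bucket_t;

static void bucket_init(token_bucket_t *b, size_t bytes_per_sec) {
  b->rate = bytes_per_sec;
  b->capacity = BUCKET_SECONDS * bytes_per_sec;
  b->tokens = b->capacity;  /* ASSUMPTION: the bucket starts full */
}

/* Call once per second; tokens beyond the capacity are discarded. */
static void bucket_refill(token_bucket_t *b) {
  b->tokens += b->rate;
  if (b->tokens > b->capacity)
    b->tokens = b->capacity;
}

/* Ask to read `want` bytes; returns how many bytes may be read now.
 * A return of 0 means the connection must wait for the next refill. */
static size_t bucket_take(token_bucket_t *b, size_t want) {
  size_t n = want < b->tokens ? want : b->tokens;
  b->tokens -= n;
  return n;
}

/* Sender side: microseconds between cells for a given bandwidth.
 * Returns 0 for a bandwidth of 0 to avoid dividing by zero. */
static unsigned long cell_interval_usec(unsigned long bytes_per_sec) {
  if (bytes_per_sec == 0)
    return 0;
  return (1000000UL * SKETCH_CELL_LEN) / bytes_per_sec;
}
```

With a rate of 12800B/s the bucket holds 128000 tokens, so a burst of up
to 1000 cells can be read at once, after which reads are limited to
12800 bytes per refill.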
+
+4.2. Data congestion control. Even with the above bandwidth throttling,
+we still need to worry about congestion, either accidental or intentional.
+If a lot of people make circuits into the same node, and they all come out
+through the same connection, then that connection may become saturated
+(be unable to send out data cells as quickly as it wants to). An adversary
+can make a 'put' request through the onion routing network to a webserver
+he owns, and then refuse to read any of the bytes at the webserver end
+of the circuit. These bottlenecks can propagate back through the entire
+network, mucking up everything.
+
+To handle this congestion, each circuit starts out with a receive
+window at each node of 100 cells -- it is willing to receive at most 100
+cells on that circuit. (It handles each direction separately; so that's
+really 100 cells forward and 100 cells back.) The edge of the circuit
+is willing to create at most 100 cells from data coming from outside the
+onion routing network. Nodes in the middle of the circuit will tear down
+the circuit if a data cell arrives when the receive window is 0. When
+data has traversed the network, the edge node buffers it on its outbuf,
+and evaluates whether to respond with a 'sendme' acknowledgement: if its
+outbuf is not too full, and its receive window is less than 90, then it
+queues a 'sendme' cell backwards in the circuit. Each node that receives
+the sendme increments its window by 10 and passes the cell onward.
+
+In practice, all the nodes in the circuit maintain a receive window
+close to 100 except the exit node, which stays around 0, periodically
+receiving a sendme and reading 10 more data cells from the webserver.
+In this way we can use pretty much all of the available bandwidth for
+data, but gracefully back off when faced with multiple circuits (a new
+sendme arrives only after some cells have traversed the entire network),
+stalled network connections, or attacks.
+
+We don't need to reimplement full TCP windows, with sequence numbers,
+the ability to drop cells when we're full, etc., because the TCP streams
+already guarantee in-order delivery of each cell. Rather than trying
+to build some sort of TCP-on-TCP scheme, we implement this minimal data
+congestion control; so far it's enough.
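
The window bookkeeping from 4.2 can be sketched as below, for one
direction of one circuit. The numbers (start at 100, sendme when the
window is below 90, increment by 10, tear down when a data cell arrives
at window 0) are from the text. The names are hypothetical, "outbuf is
not too full" is left to the caller as a flag because the text gives no
threshold, and having the edge node raise its own window when it queues
a sendme is an assumption.

```c
#include <assert.h>

#define WINDOW_START     100  /* initial receive window, in cells */
#define WINDOW_INCREMENT 10   /* cells granted by one sendme */
#define SENDME_BELOW     90   /* edge sends a sendme below this window */

/* Hypothetical per-direction state for one circuit at one node. */
typedef struct {
  int receive_window;
} circ_dir_t;

/* A data cell arrived. Returns 0 if it is accepted, or -1 if the window
 * was already 0, in which case the node tears down the circuit. */
static int window_on_data_cell(circ_dir_t *c) {
  if (c->receive_window <= 0)
    return -1;
  c->receive_window--;
  return 0;
}

/* Edge node, after buffering data on its outbuf: decide whether to queue
 * a sendme backwards. Returns 1 if a sendme should be queued.
 * ASSUMPTION: the edge also raises its own window when it does so. */
static int edge_maybe_sendme(circ_dir_t *c, int outbuf_too_full) {
  if (outbuf_too_full || c->receive_window >= SENDME_BELOW)
    return 0;
  c->receive_window += WINDOW_INCREMENT;
  return 1;
}

/* A node received a sendme: open the window by 10 cells. The caller is
 * then expected to pass the sendme cell onward. */
static void window_on_sendme(circ_dir_t *c) {
  c->receive_window += WINDOW_INCREMENT;
}
```

Since each direction is handled separately, a real circuit would keep
two such structures, one forward and one backward.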
+
+4.3. Router twins. In many cases when we ask for a router with a given
+address and port, we really mean a router that knows a given key. Router
+twins are two or more routers that all share the same private key. We thus
+give a router extra flexibility in choosing the next hop in the circuit: if
+some of the twins are down or slow, it can choose the more available ones.
+
+Currently the code tries for the primary router first, and if it's down,
+chooses the first available twin.
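
The selection rule in 4.3 (primary first, otherwise the first available
twin) could be sketched like this. The struct and function are
hypothetical and do not reflect the real router_array entries; in
particular key_id is only a stand-in for "shares the same private key",
and is_up stands in for however availability is actually determined.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified router entry. */
typedef struct {
  const char *address;
  unsigned short port;
  int key_id;   /* ASSUMPTION: routers with equal key_id are twins */
  int is_up;    /* ASSUMPTION: nonzero if the router is reachable */
} sketch_router_t;

/* Returns the index of the router to use for the next hop: the primary
 * if it is up, else the first twin that is up, else -1. */
static int choose_next_hop(const sketch_router_t *routers, size_t n,
                           size_t primary) {
  size_t i;
  if (primary >= n)
    return -1;
  if (routers[primary].is_up)
    return (int)primary;
  for (i = 0; i < n; i++) {
    if (i != primary && routers[i].is_up &&
        routers[i].key_id == routers[primary].key_id)
      return (int)i;
  }
  return -1;
}
```

Choosing among twins by load rather than by position would be a natural
extension, but the text only describes the first-available rule.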
+