author Roger Dingledine <arma@torproject.org> 2003-04-05 19:04:05 +0000
committer Roger Dingledine <arma@torproject.org> 2003-04-05 19:04:05 +0000
commit 1ae95f66ede45f64fd6795cd7cdcc20f9a780c76
tree f595fd24a198c8e673883ac4ffb911adfae457c8
parent 03f4ed309f8d7743817521dfa9cd361364d2183f
bring docs closer to reality
svn:r221
Diffstat (limited to 'doc/HACKING')
-rw-r--r-- doc/HACKING | 58
1 files changed, 27 insertions, 31 deletions
diff --git a/doc/HACKING b/doc/HACKING
index 421b32f904..e6a9e8157a 100644
--- a/doc/HACKING
+++ b/doc/HACKING
@@ -8,16 +8,20 @@ Read the README file first, so you can get familiar with the basics.
1. The programs.
-1.1. "or". This is the main program here. It functions as both a server
-and a client, depending on which config file you give it. ...
+1.1. "or". This is the main program here. It functions as either a server
+or a client, depending on which config file you give it.
+
+1.2. "orkeygen". Use "orkeygen file-for-privkey file-for-pubkey" to
+generate key files for an onion router.
2. The pieces.
2.1. Routers. Onion routers, as far as the 'or' program is concerned,
are a bunch of data items that are loaded into the router_array when
-the program starts. After it's loaded, the router information is never
-changed. When a new OR connection is started (see below), the relevant
-information is copied from the router struct to the connection struct.
+the program starts. Periodically it downloads a new set of routers
+from a directory server, and updates the router_array. When a new OR
+connection is started (see below), the relevant information is copied
+from the router struct to the connection struct.
2.2. Connections. A connection is a long-standing tcp socket between
nodes. A connection is named based on what it's connected to -- an "OR
@@ -26,34 +30,36 @@ an onion proxy on the other end, an "exit connection" has a website or
other server on the other end, and an "AP connection" has an application
proxy (and thus a user) on the other end.
-2.3. Circuits. A circuit is a single conversation between two
-participants over the onion routing network. One end of the circuit has
-an AP connection, and the other end has an exit connection. AP and exit
+2.3. Circuits. A circuit is a path over the onion routing
+network. Applications can connect to one end of the circuit, and can
+create exit connections at the other end of the circuit. AP and exit
connections have only one circuit associated with them (and thus these
connection types are closed when the circuit is closed), whereas OP and
OR connections multiplex many circuits at once, and stay standing even
when there are no circuits running over them.
+2.4. Topics. Topics are specific conversations between an AP and an exit.
+Topics are multiplexed over circuits.
+
2.4. Cells. Some connections, specifically OR and OP connections, speak
-"cells". This means that data over that connection is bundled into 128
-byte packets (8 bytes of header and 120 bytes of payload). Each cell has
+"cells". This means that data over that connection is bundled into 256
+byte packets (8 bytes of header and 248 bytes of payload). Each cell has
a type, or "command", which indicates what it's for.
3. Important parameters in the code.
-3.1. Role.
4. Robustness features.
4.1. Bandwidth throttling. Each cell-speaking connection has a maximum
bandwidth it can use, as specified in the routers.or file. Bandwidth
-throttling occurs on both the sender side and the receiving side. The
-sending side sends cells at regularly spaced intervals (e.g., a connection
-with a bandwidth of 12800B/s would queue a cell every 10ms). The receiving
-side protects against misbehaving servers that send cells more frequently,
-by using a simple token bucket:
+throttling can occur on both the sender side and the receiving side. If
+the LinkPadding option is on, the sending side sends cells at regularly
+spaced intervals (e.g., a connection with a bandwidth of 25600B/s would
+queue a cell every 10ms). The receiving side protects against misbehaving
+servers that send cells more frequently, by using a simple token bucket:
Each connection has a token bucket with a specified capacity. Tokens are
added to the bucket each second (when the bucket is full, new tokens
@@ -79,22 +85,12 @@ he owns, and then refuse to read any of the bytes at the webserver end
of the circuit. These bottlenecks can propagate back through the entire
network, mucking up everything.
-To handle this congestion, each circuit starts out with a receive
-window at each node of 100 cells -- it is willing to receive at most 100
-cells on that circuit. (It handles each direction separately; so that's
-really 100 cells forward and 100 cells back.) The edge of the circuit
-is willing to create at most 100 cells from data coming from outside the
-onion routing network. Nodes in the middle of the circuit will tear down
-the circuit if a data cell arrives when the receive window is 0. When
-data has traversed the network, the edge node buffers it on its outbuf,
-and evaluates whether to respond with a 'sendme' acknowledgement: if its
-outbuf is not too full, and its receive window is less than 90, then it
-queues a 'sendme' cell backwards in the circuit. Each node that receives
-the sendme increments its window by 10 and passes the cell onward.
+(See the tor-spec.txt document for details of how congestion control
+works.)
In practice, all the nodes in the circuit maintain a receive window
-close to 100 except the exit node, which stays around 0, periodically
-receiving a sendme and reading 10 more data cells from the webserver.
+close to maximum except the exit node, which stays around 0, periodically
+receiving a sendme and reading more data cells from the webserver.
In this way we can use pretty much all of the available bandwidth for
data, but gracefully back off when faced with multiple circuits (a new
sendme arrives only after some cells have traversed the entire network),
@@ -108,7 +104,7 @@ congestion control; so far it's enough.
4.3. Router twins. In many cases when we ask for a router with a given
address and port, we really mean a router who knows a given key. Router
-twins are two or more routers that all share the same private key. We thus
+twins are two or more routers that share the same private key. We thus
give routers extra flexibility in choosing the next hop in the circuit: if
some of the twins are down or slow, it can choose the more available ones.
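
Note on the numbers in the 2.4 and 4.1 hunks above: the cell size and the
pacing example change together. At 128 bytes per cell and 12800B/s, and
likewise at 256 bytes per cell and 25600B/s, the sender emits 100 cells
per second, i.e. one cell every 10ms. The C sketch below works through
that arithmetic and shows a minimal receive-side token bucket. All names
in it (CELL_SIZE, cell_interval_usec, token_bucket_t, bucket_refill,
bucket_try_consume) are hypothetical and are not taken from the Tor
source. Two behaviours are assumptions, because the hunk cuts off
mid-sentence: that one token stands for one byte, and that tokens beyond
the bucket's capacity are dropped on refill.

```c
#include <assert.h>
#include <stdint.h>

/* Sizes as stated in the updated section 2.4 text. */
#define CELL_SIZE         256
#define CELL_HEADER_SIZE  8
#define CELL_PAYLOAD_SIZE (CELL_SIZE - CELL_HEADER_SIZE) /* 248 bytes */

/* Hypothetical helper: microseconds between cells when a connection is
 * paced at the given bandwidth in bytes per second. */
static uint32_t cell_interval_usec(uint32_t bandwidth_bytes_per_sec)
{
    assert(bandwidth_bytes_per_sec > 0);
    return (uint32_t)(((uint64_t)CELL_SIZE * 1000000u) /
                      bandwidth_bytes_per_sec);
}

/* Hypothetical receive-side token bucket.
 * Assumption: one token grants permission to receive one byte. */
typedef struct {
    uint32_t capacity;       /* maximum tokens the bucket can hold */
    uint32_t tokens;         /* tokens currently available (<= capacity) */
    uint32_t refill_per_sec; /* tokens added once per second */
} token_bucket_t;

/* Call once per second. Assumption: tokens that would overflow the
 * bucket are dropped rather than carried over. */
static void bucket_refill(token_bucket_t *b)
{
    uint32_t room = b->capacity - b->tokens;
    b->tokens += (b->refill_per_sec < room) ? b->refill_per_sec : room;
}

/* Try to spend n tokens. Returns 1 and deducts them on success;
 * returns 0 and leaves the bucket untouched if too few are available. */
static int bucket_try_consume(token_bucket_t *b, uint32_t n)
{
    if (n > b->tokens)
        return 0;
    b->tokens -= n;
    return 1;
}
```

A receiver built along these lines would call bucket_refill() from a
once-per-second timer and bucket_try_consume() before reading from the
socket, simply not reading while the bucket is empty. How the real code
reacts to a peer that overruns its bucket is not shown in this diff.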