author | Nick Mathewson <nickm@torproject.org> | 2019-11-04 12:22:58 -0500 |
---|---|---|
committer | Nick Mathewson <nickm@torproject.org> | 2019-11-04 12:24:46 -0500 |
commit | 607b1ff776b5a5e5c9ba0197b5768751e5b9c68c (patch) | |
tree | a73de7edb8f504f15b85551082a2cf3c76a8f71d | |
parent | a5085c52d0902c35ae889c68e99d5f2a1422dd30 (diff) | |
Move much of 00-overview.md into doxygen.
-rw-r--r-- | doc/HACKING/design/00-overview.md | 118 |
-rw-r--r-- | src/mainpage.dox | 119 |
2 files changed, 115 insertions, 122 deletions
diff --git a/doc/HACKING/design/00-overview.md b/doc/HACKING/design/00-overview.md
index ff40a566be..1c14dc8c10 100644
--- a/doc/HACKING/design/00-overview.md
+++ b/doc/HACKING/design/00-overview.md
@@ -1,124 +1,6 @@
 ## Overview ##
-This document describes the general structure of the Tor codebase, how
-it fits together, what functionality is available for extending Tor,
-and gives some notes on how Tor got that way.
-
-Tor remains a work in progress: We've been working on it for nearly two
-decades, and we've learned a lot about good coding since we first
-started. This means, however, that some of the older pieces of Tor will
-have some "code smell" in them that could stand a brisk
-refactoring. So when I describe a piece of code, I'll sometimes give a
-note on how it got that way, and whether I still think that's a good
-idea.
-
-The first drafts of this document were written in the Summer and Fall of
-2015, when Tor 0.2.6 was the most recent stable version, and Tor 0.2.7
-was under development. There is a revision in progress (as of late
-2019), to bring it up to pace with Tor as of version 0.4.2. If you're
-reading this far in the future, some things may have changed. Caveat
-haxxor!
-
-This document is not an overview of the Tor protocol. For that, see the
-design paper and the specifications at https://spec.torproject.org/ .
-
-For more information about Tor's coding standards and some helpful
-development tools, see doc/HACKING in the Tor repository.
-
-
-### The very high level ###
-
-Ultimately, Tor runs as an event-driven network daemon: it responds to
-network events, signals, and timers by sending and receiving things over
-the network. Clients, relays, and directory authorities all use the
-same codebase: the Tor process will run as a client, relay, or authority
-depending on its configuration.
-
-Tor has a few major dependencies, including Libevent (used to tell which
-sockets are readable and writable), OpenSSL or NSS (used for many encryption
-functions, and to implement the TLS protocol), and zlib (used to
-compress and uncompress directory information).
-
-Most of Tor's work today is done in a single event-driven main thread.
-Tor also spawns one or more worker threads to handle CPU-intensive
-tasks. (Right now, this only includes circuit encryption and the more
-expensive compression algorithms.)
-
-On startup, Tor initializes its libraries, reads and responds to its
-configuration files, and launches a main event loop. At first, the only
-events that Tor listens for are a few signals (like TERM and HUP), and
-one or more listener sockets (for different kinds of incoming
-connections). Tor also configures several timers to handle periodic
-events. As Tor runs over time, other events will open, and new events
-will be scheduled.
-
-The codebase is divided into a few top-level subdirectories, each of
-which contains several sub-modules.
-
- * `src/ext` -- Code maintained elsewhere that we include in the Tor
- source distribution.
-
- * src/lib` -- Lower-level utility code, not necessarily tor-specific.
-
- * `src/trunnel` -- Automatically generated code (from the Trunnel
- tool): used to parse and encode binary formats.
-
- * `src/core` -- Networking code that is implements the central parts of
- the Tor protocol and main loop.
-
- * `src/feature` -- Aspects of Tor (like directory management, running a
- relay, running a directory authorities, managing a list of nodes,
- running and using onion services) that are built on top of the
- mainloop code.
-
- * `src/app` -- Highest-level functionality; responsible for setting up
- and configuring the Tor daemon, making sure all the lower-level
- modules start up when required, and so on.
-
- * `src/tools` -- Binaries other than Tor that we produce. Currently this
- is tor-resolve, tor-gencert, and the tor_runner.o helper module.
-
- * `src/test` -- unit tests, regression tests, and a few integration
- tests.
-
-In theory, the above parts of the codebase are sorted from highest-level to
-lowest-level, where high-level code is only allowed to invoke lower-level
-code, and lower-level code never includes or depends on code of a higher
-level. In practice, this refactoring is incomplete: The modules in `src/lib`
-are well-factored, but there are many layer violations ("upward
-dependencies") in `src/core` and `src/feature`. We aim to eliminate those
-over time.
-
-### Some key high-level abstractions ###
-
-The most important abstractions at Tor's high-level are Connections,
-Channels, Circuits, and Nodes.
-
-A 'Connection' represents a stream-based information flow. Most
-connections are TCP connections to remote Tor servers and clients. (But
-as a shortcut, a relay will sometimes make a connection to itself
-without actually using a TCP connection. More details later on.)
-Connections exist in different varieties, depending on what
-functionality they provide. The principle types of connection are
-"edge" (eg a socks connection or a connection from an exit relay to a
-destination), "OR" (a TLS stream connecting to a relay), "Directory" (an
-HTTP connection to learn about the network), and "Control" (a connection
-from a controller).
-
-A 'Circuit' is persistent tunnel through the Tor network, established
-with public-key cryptography, and used to send cells one or more hops.
-Clients keep track of multi-hop circuits, and the cryptography
-associated with each hop. Relays, on the other hand, keep track only of
-their hop of each circuit.
-
-A 'Channel' is an abstract view of sending cells to and from a Tor
-relay. Currently, all channels are implemented using OR connections.
-If we switch to other strategies in the future, we'll have more
-connection types.
-
-A 'Node' is a view of a Tor instance's current knowledge and opinions
-about a Tor relay or bridge.
 ### The rest of this document. ###
diff --git a/src/mainpage.dox b/src/mainpage.dox
index 84eea3c526..7c64e0dffb 100644
--- a/src/mainpage.dox
+++ b/src/mainpage.dox
@@ -3,9 +3,120 @@
 @section intro Getting to know Tor
-Welcome to the Tor source code documentation! Here we have documentation for
-nearly every function, type, and module in the Tor source code. The high-level
-documentation is a work in progress. For now, have a look at the source code
-overview in doc/HACKING/design.
+Welcome!
+
+This documentation describes the general structure of the Tor codebase, how
+it fits together, what functionality is available for extending Tor, and
+gives some notes on how Tor got that way. It also includes a reference for
+nearly every function, type, file, and module in the Tor source code. The
+high-level documentation is a work in progress.
+
+Tor itself remains a work in progress too: We've been working on it for
+nearly two decades, and we've learned a lot about good coding since we first
+started. This means, however, that some of the older pieces of Tor will have
+some "code smell" in them that could stand a brisk refactoring. So when we
+describe a piece of code, we'll sometimes give a note on how it got that way,
+and whether we still think that's a good idea.
+
+This document is not an overview of the Tor protocol. For that, see the
+design paper and the specifications at https://spec.torproject.org/ .
+
+For more information about Tor's coding standards and some helpful
+development tools, see doc/HACKING in the Tor repository.
+
+@section highlevel The very high level
+
+Ultimately, Tor runs as an event-driven network daemon: it responds to
+network events, signals, and timers by sending and receiving things over
+the network. Clients, relays, and directory authorities all use the
+same codebase: the Tor process will run as a client, relay, or authority
+depending on its configuration.
+
+Tor has a few major dependencies, including Libevent (used to tell which
+sockets are readable and writable), OpenSSL or NSS (used for many encryption
+functions, and to implement the TLS protocol), and zlib (used to
+compress and uncompress directory information).
+
+Most of Tor's work today is done in a single event-driven main thread.
+Tor also spawns one or more worker threads to handle CPU-intensive
+tasks. (Right now, this only includes circuit encryption and the more
+expensive compression algorithms.)
+
+On startup, Tor initializes its libraries, reads and responds to its
+configuration files, and launches a main event loop. At first, the only
+events that Tor listens for are a few signals (like TERM and HUP), and
+one or more listener sockets (for different kinds of incoming
+connections). Tor also configures several timers to handle periodic
+events. As Tor runs over time, other events will open, and new events
+will be scheduled.
+
+The codebase is divided into a few top-level subdirectories, each of
+which contains several sub-modules.
+
+ - `ext` -- Code maintained elsewhere that we include in the Tor
+ source distribution.
+
+ - \refdir{lib} -- Lower-level utility code, not necessarily
+ tor-specific.
+
+ - `trunnel` -- Automatically generated code (from the Trunnel
+ tool): used to parse and encode binary formats.
+
+ - \refdir{core} -- Networking code that is implements the central
+ parts of the Tor protocol and main loop.
+
+ - \refdir{feature} -- Aspects of Tor (like directory management,
+ running a relay, running a directory authorities, managing a list of
+ nodes, running and using onion services) that are built on top of the
+ mainloop code.
+
+ - \refdir{app} -- Highest-level functionality; responsible for setting
+ up and configuring the Tor daemon, making sure all the lower-level
+ modules start up when required, and so on.
+
+ - \refdir{tools} -- Binaries other than Tor that we produce.
+ Currently this is tor-resolve, tor-gencert, and the tor_runner.o helper
+ module.
+
+ - `test` -- unit tests, regression tests, and a few integration
+ tests.
+
+In theory, the above parts of the codebase are sorted from highest-level to
+lowest-level, where high-level code is only allowed to invoke lower-level
+code, and lower-level code never includes or depends on code of a higher
+level. In practice, this refactoring is incomplete: The modules in
+\refdir{lib} are well-factored, but there are many layer violations ("upward
+dependencies") in \refdir{core} and \refdir{feature}.
+We aim to eliminate those over time.
+
+@section keyabstractions Some key high-level abstractions
+
+The most important abstractions at Tor's high-level are Connections,
+Channels, Circuits, and Nodes.
+
+A 'Connection' (connection_t) represents a stream-based information flow.
+Most connections are TCP connections to remote Tor servers and clients. (But
+as a shortcut, a relay will sometimes make a connection to itself without
+actually using a TCP connection. More details later on.) Connections exist
+in different varieties, depending on what functionality they provide. The
+principle types of connection are edge_connection_t (eg a socks connection or
+a connection from an exit relay to a destination), or_connection_t (a TLS
+stream connecting to a relay), dir_connection_t (an HTTP connection to learn
+about the network), and control_connection_t (a connection from a
+controller).
+
+A 'Circuit' (circuit_t) is persistent tunnel through the Tor network,
+established with public-key cryptography, and used to send cells one or more
+hops. Clients keep track of multi-hop circuits (origin_circuit_t), and the
+cryptography associated with each hop. Relays, on the other hand, keep track
+only of their hop of each circuit (or_circuit_t).
+
+A 'Channel' (channel_t) is an abstract view of sending cells to and from a
+Tor relay. Currently, all channels are implemented using OR connections
+(channel_tls_t). If we switch to other strategies in the future, we'll have
+more connection types.
+
+A 'Node' (node_t) is a view of a Tor instance's current knowledge and opinions
+about a Tor relay or bridge.
 **/
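
Note on the type names introduced above: the new mainpage.dox text pairs a general type (circuit_t) with a client-side variant (origin_circuit_t) and a relay-side variant (or_circuit_t). One common way to express that kind of relationship in C is to embed the base struct as the first member of each variant and to guard downcasts with a tag check. The sketch below illustrates that pattern only. Every `demo_` name, field, and tag value is hypothetical and is not taken from this commit or from Tor's headers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tag values, chosen only for this illustration. */
#define DEMO_ORIGIN_CIRCUIT_MAGIC 0x0000C1E1u
#define DEMO_OR_CIRCUIT_MAGIC     0x0000C1E2u

/* Base type: the fields shared by every kind of circuit. */
typedef struct demo_circuit_t {
  uint32_t magic; /* records which variant embeds this base */
} demo_circuit_t;

/* Client-side variant: tracks every hop of a multi-hop circuit. */
typedef struct demo_origin_circuit_t {
  demo_circuit_t base_; /* first member, so the casts below are valid */
  int path_len;         /* number of hops the client knows about */
} demo_origin_circuit_t;

/* Relay-side variant: tracks only this relay's own hop. */
typedef struct demo_or_circuit_t {
  demo_circuit_t base_; /* first member, so the casts below are valid */
  int is_first_hop;     /* nonzero if the previous hop is a client */
} demo_or_circuit_t;

/* Initialize a client-side circuit with a given path length. */
static void
demo_origin_circuit_init(demo_origin_circuit_t *circ, int path_len)
{
  circ->base_.magic = DEMO_ORIGIN_CIRCUIT_MAGIC;
  circ->path_len = path_len;
}

/* Initialize a relay-side circuit. */
static void
demo_or_circuit_init(demo_or_circuit_t *circ, int is_first_hop)
{
  circ->base_.magic = DEMO_OR_CIRCUIT_MAGIC;
  circ->is_first_hop = is_first_hop;
}

/* Return nonzero if the base pointer belongs to a client-side circuit. */
static int
demo_circuit_is_origin(const demo_circuit_t *circ)
{
  return circ->magic == DEMO_ORIGIN_CIRCUIT_MAGIC;
}

/* Checked downcast: abort if the tag does not match the target type. */
static demo_origin_circuit_t *
demo_to_origin_circuit(demo_circuit_t *circ)
{
  assert(circ->magic == DEMO_ORIGIN_CIRCUIT_MAGIC);
  return (demo_origin_circuit_t *)circ;
}

/* Checked downcast for the relay-side variant. */
static demo_or_circuit_t *
demo_to_or_circuit(demo_circuit_t *circ)
{
  assert(circ->magic == DEMO_OR_CIRCUIT_MAGIC);
  return (demo_or_circuit_t *)circ;
}
```

Generic code can then hold a `demo_circuit_t *` and call a checked downcast only where variant-specific fields are needed. The same shape would fit the connection_t family named in the diff, though whether Tor lays its structs out exactly this way should be confirmed against the real headers.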