summaryrefslogtreecommitdiff
path: root/doc/HACKING
diff options
context:
space:
mode:
Diffstat (limited to 'doc/HACKING')
-rw-r--r--doc/HACKING/CodeStructure.md129
-rw-r--r--doc/HACKING/CodingStandards.md43
-rw-r--r--doc/HACKING/CodingStandardsRust.md24
-rw-r--r--doc/HACKING/HelpfulTools.md46
-rw-r--r--doc/HACKING/Module.md111
-rw-r--r--doc/HACKING/ReleasingTor.md4
-rw-r--r--doc/HACKING/Tracing.md2
7 files changed, 337 insertions, 22 deletions
diff --git a/doc/HACKING/CodeStructure.md b/doc/HACKING/CodeStructure.md
new file mode 100644
index 0000000000..736d6cd484
--- /dev/null
+++ b/doc/HACKING/CodeStructure.md
@@ -0,0 +1,129 @@
+
+TODO: revise this to talk about how things are, rather than how things
+have changed.
+
+TODO: Make this into good markdown.
+
+
+
+For quite a while now, the program "tor" has been built from source
+code in just two directories: src/common and src/or.
+
+This has become more-or-less untenable, for a few reasons -- most
+notably of which is that it has led our code to become more
+spaghetti-ish than I can endorse with a clean conscience.
+
+So to fix that, we've gone and done a huge code movement in our git
+master branch, which will land in a release once Tor 0.3.5.1-alpha is
+out.
+
+Here's what we did:
+
+ * src/common has been turned into a set of static libraries. These
+all live in the "src/lib/*" directories. The dependencies between
+these libraries should have no cycles. The libraries are:
+
+ arch -- Headers to handle architectural differences
+ cc -- headers to handle differences among compilers
+ compress -- wraps zlib, zstd, lzma
+ container -- high-level container types
+ crypt_ops -- Cryptographic operations. Planning to split this into
+a higher and lower level library
+ ctime -- Operations that need to run in constant-time. (Properly,
+data-invariant time)
+ defs -- miscelaneous definitions needed throughout Tor.
+ encoding -- transforming one data type into another, and various
+data types into strings.
+ err -- lowest-level error handling, in cases where we can't use
+the logs because something that the logging system needs has broken.
+ evloop -- Generic event-loop handling logic
+ fdio -- Low-level IO wrapper functions for file descriptors.
+ fs -- Operations on the filesystem
+ intmath -- low-level integer math and misc bit-twiddling hacks
+ lock -- low-level locking code
+ log -- Tor's logging module. This library sits roughly halfway up
+the library dependency diagram, since everything it depends on has to
+be carefully crafted to *not* log.
+ malloc -- Low-level wrappers for the platform memory allocation functions.
+ math -- Higher-level mathematical functions, and floating-point math
+ memarea -- An arena allocator
+ meminfo -- Functions for querying the current process's memory
+status and resources
+ net -- Networking compatibility and convenience code
+ osinfo -- Querying information about the operating system
+ process -- Launching and querying the status of other processes
+ sandbox -- Backend for the linux seccomp2 sandbox
+ smartlist_core -- The lowest-level of the smartlist_t data type.
+Separated from the rest of the containers library because the logging
+subsystem depends on it.
+ string -- Compatibility and convenience functions for manipulating
+C strings.
+ term -- Terminal-related functions (currently limited to a getpass
+function).
+ testsupport -- Macros for mocking, unit tests, etc.
+ thread -- Higher-level thread compatibility code
+ time -- Higher-level time management code, including format
+conversions and monotonic time
+ tls -- Our wrapper around our TLS library
+ trace -- Formerly src/trace -- a generic event tracing API
+ wallclock -- Low-level time code, used by the log module.
+
+ * To ensure that the dependency graph in src/common remains under
+control, there is a tool that you can run called "make
+check-includes". It verifies that each module in Tor only includes
+the headers that it is permitted to include, using a per-directory
+".may_include" file.
+
+ * The src/or/or.h header has been split into numerous smaller
+headers. Notably, many important structures are now declared in a
+header called foo_st.h, where "foo" is the name of the structure.
+
+ * The src/or directory, which had most of Tor's code, had been split
+up into several directories. This is still a work in progress: This
+code has not itself been refactored, and its dependency graph is still
+a tangled web. I hope we'll be working on that over the coming
+releases, but it will take a while to do.
+
+ The new top-level source directories are:
+
+ src/core -- Code necessary to actually perform or use onion routing.
+ src/feature -- Code used only by some onion routing
+configurations, or only for a special purpose.
+ src/app -- Top-level code to run, invoke, and configure the
+lower-level code
+
+ The new second-level source directories are:
+ src/core/crypto -- High-level cryptographic protocols used in Tor
+ src/core/mainloop -- Tor's event loop, connection-handling, and
+traffic-routing code.
+ src/core/or -- Parts related to handling onion routing itself
+ src/core/proto -- support for encoding and decoding different
+wire protocols
+
+ src/feature/api -- Support for making Tor embeddable
+ src/feature/client -- Functionality which only Tor clients need
+ src/feature/control -- Controller implementation
+ src/feature/dirauth -- Directory authority
+ src/feature/dircache -- Directory cache
+ src/feature/dirclient -- Directory client
+ src/feature/dircommon -- Shared code between the other directory modules
+ src/feature/hibernate -- Hibernating when Tor is out of bandwidth
+or shutting down
+ src/feature/hs -- v3 onion service implementation
+ src/feature/hs_common -- shared code between both onion service
+implementations
+ src/feature/nodelist -- storing and accessing the list of relays on
+the network.
+ src/feature/relay -- code that only relay servers and exit servers need.
+ src/feature/rend -- v2 onion service implementation
+ src/feature/stats -- statistics and history
+
+ src/app/config -- configuration and state for Tor
+ src/app/main -- Top-level functions to invoke the rest or Tor.
+
+ * The "tor" executable is now built in src/app/tor rather than src/or/tor.
+
+ * There are more static libraries than before that you need to build
+into your application if you want to embed Tor. Rather than
+maintaining this list yourself, I recommend that you run "make
+show-libs" to have Tor emit a list of what you need to link.
diff --git a/doc/HACKING/CodingStandards.md b/doc/HACKING/CodingStandards.md
index 79a6a9f0ce..4f229348e4 100644
--- a/doc/HACKING/CodingStandards.md
+++ b/doc/HACKING/CodingStandards.md
@@ -42,6 +42,23 @@ If you have changed build system components:
- For example, if you have changed Makefiles, autoconf files, or anything
else that affects the build system.
+License issues
+==============
+
+Tor is distributed under the license terms in the LICENSE -- in
+brief, the "3-clause BSD license". If you send us code to
+distribute with Tor, it needs to be code that we can distribute
+under those terms. Please don't send us patches unless you agree
+to allow this.
+
+Some compatible licenses include:
+
+ - 3-clause BSD
+ - 2-clause BSD
+ - CC0 Public Domain Dedication
+
+
+
How we use Git branches
=======================
@@ -155,7 +172,6 @@ deviations from our C whitespace style. Generally, we use:
- Unix-style line endings
- K&R-style indentation
- No space before newlines
- - A blank line at the end of each file
- Never more than one blank line in a row
- Always spaces, never tabs
- No more than 79-columns per line.
@@ -168,6 +184,9 @@ deviations from our C whitespace style. Generally, we use:
`puts (x)`.
- Function declarations at the start of the line.
+If you use an editor that has plugins for editorconfig.org, the file
+`.editorconfig` will help you to conform this coding style.
+
We try hard to build without warnings everywhere. In particular, if
you're using gcc, you should invoke the configure script with the
option `--enable-fatal-warnings`. This will tell the compiler
@@ -181,8 +200,8 @@ We have some wrapper functions like `tor_malloc`, `tor_free`, `tor_strdup`, and
always succeed or exit.)
You can get a full list of the compatibility functions that Tor provides by
-looking through `src/common/util*.h` and `src/common/compat*.h`. You can see the
-available containers in `src/common/containers*.h`. You should probably
+looking through `src/lib/*/*.h`. You can see the
+available containers in `src/lib/containers/*.h`. You should probably
familiarize yourself with these modules before you write too much code, or
else you'll wind up reinventing the wheel.
@@ -195,6 +214,24 @@ We don't call `memcmp()` directly. Use `fast_memeq()`, `fast_memneq()`,
Also see a longer list of functions to avoid in:
https://people.torproject.org/~nickm/tor-auto/internal/this-not-that.html
+What code can use what other code?
+----------------------------------
+
+We're trying to simplify Tor's structure over time. In the long run, we want
+Tor to be structured as a set of modules with *no circular dependencies*.
+
+This property is currently provided by the modules in src/lib, but not
+throughout the rest of Tor. In general, higher-level libraries may use
+lower-level libraries, but never the reverse.
+
+To prevent new circular dependencies from landing, we have a tool that
+you can invoke with `make check-includes`, and which is run
+automatically as part of `make check`. This tool will verify that, for
+every source directory with a `.may_include` file, no local headers are
+included except those specifically permitted by the `.may_include` file.
+When editing one of these files, please make sure that you are not
+introducing any cycles into Tor's dependency graph.
+
Floating point math is hard
---------------------------
diff --git a/doc/HACKING/CodingStandardsRust.md b/doc/HACKING/CodingStandardsRust.md
index 7c6405e624..fc562816db 100644
--- a/doc/HACKING/CodingStandardsRust.md
+++ b/doc/HACKING/CodingStandardsRust.md
@@ -104,7 +104,7 @@ repo.
Documentation
---------------
-You MUST include `#[deny(missing_docs)]` in your crate.
+You MUST include `#![deny(missing_docs)]` in your crate.
For function/method comments, you SHOULD include a one-sentence, "first person"
description of function behaviour (see requirements for documentation as
@@ -324,12 +324,26 @@ Here are some additional bits of advice and rules:
}
}
-3. Pass only integer types and bytes over the boundary
+3. Pass only C-compatible primitive types and bytes over the boundary
- The only non-integer type which may cross the FFI boundary is
+ Rust's C-compatible primitive types are integers and floats.
+ These types are declared in the [libc crate](https://doc.rust-lang.org/libc/x86_64-unknown-linux-gnu/libc/index.html#types).
+ Most Rust objects have different [representations](https://doc.rust-lang.org/libc/x86_64-unknown-linux-gnu/libc/index.html#types)
+ in C and Rust, so they can't be passed using FFI.
+
+ Tor currently uses the following Rust primitive types from libc for FFI:
+ * defined-size integers: `uint32_t`
+ * native-sized integers: `c_int`
+ * native-sized floats: `c_double`
+ * native-sized raw pointers: `* c_void`, `* c_char`, `** c_char`
+
+ TODO: C smartlist to Stringlist conversion using FFI
+
+ The only non-primitive type which may cross the FFI boundary is
bytes, e.g. `&[u8]`. This SHOULD be done on the Rust side by
- passing a pointer (`*mut libc::c_char`) and a length
- (`libc::size_t`).
+ passing a pointer (`*mut libc::c_char`). The length can be passed
+ explicitly (`libc::size_t`), or the string can be NUL-byte terminated
+ C string.
One might be tempted to do this via doing
`CString::new("blah").unwrap().into_raw()`. This has several problems:
diff --git a/doc/HACKING/HelpfulTools.md b/doc/HACKING/HelpfulTools.md
index f919d08ec1..d499238526 100644
--- a/doc/HACKING/HelpfulTools.md
+++ b/doc/HACKING/HelpfulTools.md
@@ -4,25 +4,49 @@ Useful tools
These aren't strictly necessary for hacking on Tor, but they can help track
down bugs.
+Travis/Appveyor CI
+------------------
+It's CI.
+
+Looks like this:
+* https://travis-ci.org/torproject/tor
+* https://ci.appveyor.com/project/torproject/tor
+
+Travis builds and runs tests on Linux, and eventually macOS (#24629).
+Appveyor builds and runs tests on Windows (using Windows Services for Linux).
+
+Runs automatically on Pull Requests sent to torproject/tor. You can set it up
+for your fork to build commits outside of PRs too:
+
+1. sign up for GitHub: https://github.com/join
+2. fork https://github.com/torproject/tor:
+ https://help.github.com/articles/fork-a-repo/
+3. follow https://docs.travis-ci.com/user/getting-started/#To-get-started-with-Travis-CI.
+ skip steps involving `.travis.yml` (we already have one).
+4. go to https://ci.appveyor.com/login , log in with github, and select
+ "NEW PROJECT"
+
+Builds should show up on the web at travis-ci.com and on IRC at #tor-ci on
+OFTC. If they don't, ask #tor-dev (also on OFTC).
+
Jenkins
-------
- https://jenkins.torproject.org
+It's CI/builders. Looks like this: https://jenkins.torproject.org
-Dmalloc
--------
+Runs automatically on commits merged to git.torproject.org. We CI the
+master branch and all supported tor versions. We also build nightly debian
+packages from master.
-The dmalloc library will keep track of memory allocation, so you can find out
-if we're leaking memory, doing any double-frees, or so on.
+Builds Linux and Windows cross-compilation. Runs Linux tests.
- dmalloc -l -/dmalloc.log
- (run the commands it tells you)
- ./configure --with-dmalloc
+Builds should show up on the web at jenkins.torproject.org and on IRC at
+#tor-bots on OFTC. If they don't, ask #tor-dev (also on OFTC).
Valgrind
--------
- valgrind --leak-check=yes --error-limit=no --show-reachable=yes src/or/tor
+ valgrind --leak-check=yes --error-limit=no --show-reachable=yes src/app/tor
(Note that if you get a zillion openssl warnings, you will also need to
pass `--undef-value-errors=no` to valgrind, or rebuild your openssl
@@ -226,10 +250,10 @@ Beforehand, install google-perftools.
Now you can run Tor with profiling enabled, and use the pprof utility to look at
performance! See the gperftools manual for more info, but basically:
-2. Run `env CPUPROFILE=/tmp/profile src/or/tor -f <path/torrc>`. The profile file
+2. Run `env CPUPROFILE=/tmp/profile src/app/tor -f <path/torrc>`. The profile file
is not written to until Tor finishes execuction.
-3. Run `pprof src/or/tor /tm/profile` to start the REPL.
+3. Run `pprof src/app/tor /tm/profile` to start the REPL.
Generating and analyzing a callgraph
------------------------------------
diff --git a/doc/HACKING/Module.md b/doc/HACKING/Module.md
new file mode 100644
index 0000000000..9cf36090b4
--- /dev/null
+++ b/doc/HACKING/Module.md
@@ -0,0 +1,111 @@
+# Modules in Tor #
+
+This document describes the build system and coding standards when writing a
+module in Tor.
+
+## What is a module? ##
+
+In the context of the tor code base, a module is a subsystem that we can
+selectively enable or disable, at `configure` time.
+
+Currently, there is only one module:
+
+ - Directory Authority subsystem (dirauth)
+
+It is located in its own directory in `src/feature/dirauth/`. To disable it,
+one need to pass `--disable-module-dirauth` at configure time. All modules
+are currently enabled by default.
+
+## Build System ##
+
+The changes to the build system are pretty straightforward.
+
+1. Locate in the `configure.ac` file this define: `m4_define(MODULES`. It
+ contains a list (white-space separated) of the module in tor. Add yours to
+ the list.
+
+2. Use the `AC_ARG_ENABLE([module-dirauth]` template for your new module. We
+ use the "disable module" approach instead of enabling them one by one. So,
+ by default, tor will build all the modules.
+
+ This will define the `HAVE_MODULE_<name>` statement which can be used in
+ the C code to conditionally compile things for your module. And the
+ `BUILD_MODULE_<name>` is also defined for automake files (e.g: include.am).
+
+3. In the `src/core/include.am` file, locate the `MODULE_DIRAUTH_SOURCES`
+ value. You need to create your own `_SOURCES` variable for your module
+ and then conditionally add the it to `LIBTOR_A_SOURCES` if you should
+ build the module.
+
+ It is then **very** important to add your SOURCES variable to
+ `src_or_libtor_testing_a_SOURCES` so the tests can build it.
+
+4. Do the same for header files, locate `ORHEADERS +=` which always add all
+ headers of all modules so the symbol can be found for the module entry
+ points.
+
+Finally, your module will automatically be included in the
+`TOR_MODULES_ALL_ENABLED` variable which is used to build the unit tests. They
+always build everything in order to tests everything.
+
+## Coding ##
+
+As mentioned above, a module must be isolated in its own directory (name of
+the module) in `src/feature/`.
+
+There are couples of "rules" you want to follow:
+
+* Minimize as much as you can the number of entry points into your module.
+ Less is always better but of course that doesn't work out for every use
+ case. However, it is a good thing to always keep that in mind.
+
+* Do **not** use the `HAVE_MODULE_<name>` define outside of the module code
+ base. Every entry point should have a second definition if the module is
+ disabled. For instance:
+
+ ```
+ #ifdef HAVE_MODULE_DIRAUTH
+
+ int sr_init(int save_to_disk);
+
+ #else /* HAVE_MODULE_DIRAUTH */
+
+ static inline int
+ sr_init(int save_to_disk)
+ {
+ (void) save_to_disk;
+ return 0;
+ }
+
+ #endif /* HAVE_MODULE_DIRAUTH */
+
+ ```
+
+ The main reason for this approach is to avoid having conditional code
+ everywhere in the code base. It should be centralized as much as possible
+ which helps maintainability but also avoids conditional spaghetti code
+ making the code much more difficult to follow/understand.
+
+* It is possible that you end up with code that needs to be used by the rest
+ of the code base but is still part of your module. As a good example, if
+ you look at `src/feature/shared_random_client.c`: it contains code needed
+ by the hidden service subsystem but mainly related to the shared random
+ subsystem very specific to the dirauth module.
+
+ This is fine but try to keep it as lean as possible and never use the same
+ filename as the one in the module. For example, this is a bad idea and
+ should never be done:
+
+ - `src/feature/dirclient/shared_random.c`
+ - `src/feature/dirauth/shared_random.c`
+
+* When you include headers from the module, **always** use the full module
+ path in your statement. Example:
+
+ `#include "feature/dirauth/dirvote.h"`
+
+ The main reason is that we do **not** add the module include path by default
+ so it needs to be specified. But also, it helps our human brain understand
+ which part comes from a module or not.
+
+ Even **in** the module itself, use the full include path like above.
diff --git a/doc/HACKING/ReleasingTor.md b/doc/HACKING/ReleasingTor.md
index 6c8fa1331f..55a40fc89b 100644
--- a/doc/HACKING/ReleasingTor.md
+++ b/doc/HACKING/ReleasingTor.md
@@ -7,7 +7,7 @@ new Tor release:
=== 0. Preliminaries
-1. Get at least three of weasel/arma/Sebastian/Sina to put the new
+1. Get at least two of weasel/arma/Sebastian to put the new
version number in their approved versions list. Give them a few
days to do this if you can.
@@ -34,7 +34,7 @@ new Tor release:
What about Coverity Scan?
- What about clan scan-build?
+ What about clang scan-build?
Does 'make distcheck' complain?
diff --git a/doc/HACKING/Tracing.md b/doc/HACKING/Tracing.md
index 349aade23a..24fa761310 100644
--- a/doc/HACKING/Tracing.md
+++ b/doc/HACKING/Tracing.md
@@ -69,7 +69,7 @@ configure option:
## Instrument Tor ##
This is pretty easy. Let's say you want to add a trace event in
-`src/or/rendcache.c`, you only have to add this include statement:
+`src/feature/rend/rendcache.c`, you only have to add this include statement:
#include "trace/events.h"