diff options
Diffstat (limited to 'doc/HACKING')
-rw-r--r-- | doc/HACKING | 545 | ||||
-rw-r--r-- | doc/HACKING/CodingStandards.md | 245 | ||||
-rw-r--r-- | doc/HACKING/GettingStarted.md | 187 | ||||
-rw-r--r-- | doc/HACKING/HelpfulTools.md | 293 | ||||
-rw-r--r-- | doc/HACKING/HowToReview.md | 85 | ||||
-rw-r--r-- | doc/HACKING/README.1st.md | 62 | ||||
-rw-r--r-- | doc/HACKING/ReleasingTor.md | 143 | ||||
-rw-r--r-- | doc/HACKING/WritingTests.md | 445 |
8 files changed, 1460 insertions, 545 deletions
diff --git a/doc/HACKING b/doc/HACKING deleted file mode 100644 index c69b2a6fee..0000000000 --- a/doc/HACKING +++ /dev/null @@ -1,545 +0,0 @@ -Hacking Tor: An Incomplete Guide -================================ - -Getting started ---------------- - -For full information on how Tor is supposed to work, look at the files in -https://gitweb.torproject.org/torspec.git/tree - -For an explanation of how to change Tor's design to work differently, look at -https://gitweb.torproject.org/torspec.git/blob_plain/HEAD:/proposals/001-process.txt - -For the latest version of the code, get a copy of git, and - - git clone https://git.torproject.org/git/tor - -We talk about Tor on the tor-talk mailing list. Design proposals and -discussion belong on the tor-dev mailing list. We hang around on -irc.oftc.net, with general discussion happening on #tor and development -happening on #tor-dev. - -How we use Git branches ------------------------ - -Each main development series (like 0.2.1, 0.2.2, etc) has its main work -applied to a single branch. At most one series can be the development series -at a time; all other series are maintenance series that get bug-fixes only. -The development series is built in a git branch called "master"; the -maintenance series are built in branches called "maint-0.2.0", "maint-0.2.1", -and so on. We regularly merge the active maint branches forward. - -For all series except the development series, we also have a "release" branch -(as in "release-0.2.1"). The release series is based on the corresponding -maintenance series, except that it deliberately lags the maint series for -most of its patches, so that bugfix patches are not typically included in a -maintenance release until they've been tested for a while in a development -release. Occasionally, we'll merge an urgent bugfix into the release branch -before it gets merged into maint, but that's rare. - -If you're working on a bugfix for a bug that occurs in a particular version, -base your bugfix branch on the "maint" branch for the first supported series -that has that bug. (As of June 2013, we're supporting 0.2.3 and later.) If -you're working on a new feature, base it on the master branch. - - -How we log changes ------------------- - -When you do a commit that needs a ChangeLog entry, add a new file to -the "changes" toplevel subdirectory. It should have the format of a -one-entry changelog section from the current ChangeLog file, as in - - o Major bugfixes: - - Fix a potential buffer overflow. Fixes bug 99999; bugfix on - 0.3.1.4-beta. - -To write a changes file, first categorize the change. Some common categories -are: Minor bugfixes, Major bugfixes, Minor features, Major features, Code -simplifications and refactoring. Then say what the change does. If -it's a bugfix, mention what bug it fixes and when the bug was -introduced. To find out which Git tag the change was introduced in, -you can use "git describe --contains <sha1 of commit>". - -If at all possible, try to create this file in the same commit where -you are making the change. Please give it a distinctive name that no -other branch will use for the lifetime of your change. - -When we go to make a release, we will concatenate all the entries -in changes to make a draft changelog, and clear the directory. We'll -then edit the draft changelog into a nice readable format. - -What needs a changes file?:: - A not-exhaustive list: Anything that might change user-visible - behavior. Anything that changes internals, documentation, or the build - system enough that somebody could notice. Big or interesting code - rewrites. Anything about which somebody might plausibly wonder "when - did that happen, and/or why did we do that" 6 months down the line. - -Why use changes files instead of Git commit messages?:: - Git commit messages are written for developers, not users, and they - are nigh-impossible to revise after the fact. - -Why use changes files instead of entries in the ChangeLog?:: - Having every single commit touch the ChangeLog file tended to create - zillions of merge conflicts. - -Useful tools ------------- - -These aren't strictly necessary for hacking on Tor, but they can help track -down bugs. - -Jenkins -~~~~~~~ - -https://jenkins.torproject.org - -Dmalloc -~~~~~~~ - -The dmalloc library will keep track of memory allocation, so you can find out -if we're leaking memory, doing any double-frees, or so on. - - dmalloc -l ~/dmalloc.log - (run the commands it tells you) - ./configure --with-dmalloc - -Valgrind -~~~~~~~~ - -valgrind --leak-check=yes --error-limit=no --show-reachable=yes src/or/tor - -(Note that if you get a zillion openssl warnings, you will also need to -pass --undef-value-errors=no to valgrind, or rebuild your openssl -with -DPURIFY.) - -Running gcov for unit test coverage -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - ------ - ./configure --enable-coverage - make - make check - mkdir coverage-output - ./scripts/test/coverage coverage-output ------ - -(On OSX, you'll need to start with "--enable-coverage CC=clang".) - -Then, look at the .gcov files in coverage-output. '-' before a line means -that the compiler generated no code for that line. '######' means that the -line was never reached. Lines with numbers were called that number of times. - -If that doesn't work: - * Try configuring Tor with --disable-gcc-hardening - * You might need to run 'make clean' after you run './configure'. - -If you make changes to Tor and want to get another set of coverage results, -you can run "make reset-gcov" to clear the intermediary gcov output. - -If you have two different "coverage-output" directories, and you want to see -a meaningful diff between them, you can run: - ------ - ./scripts/test/cov-diff coverage-output1 coverage-output2 | less ------ - -In this diff, any lines that were visited at least once will have coverage -"1". This lets you inspect what you (probably) really want to know: which -untested lines were changed? Are there any new untested lines? - -Running integration tests -~~~~~~~~~~~~~~~~~~~~~~~~~ - -We have the beginnings of a set of scripts to run integration tests using -Chutney. To try them, set CHUTNEY_PATH to your chutney source directory, and -run "make test-network". - -Profiling Tor with oprofile -~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -The oprofile tool runs (on Linux only!) to tell you what functions Tor is -spending its CPU time in, so we can identify berformance pottlenecks. - -Here are some basic instructions - - - Build tor with debugging symbols (you probably already have, unless - you messed with CFLAGS during the build process). - - Build all the libraries you care about with debugging symbols - (probably you only care about libssl, maybe zlib and Libevent). - - Copy this tor to a new directory - - Copy all the libraries it uses to that dir too (ldd ./tor will - tell you) - - Set LD_LIBRARY_PATH to include that dir. ldd ./tor should now - show you it's using the libs in that dir - - Run that tor - - Reset oprofiles counters/start it - * "opcontrol --reset; opcontrol --start", if Nick remembers right. - - After a while, have it dump the stats on tor and all the libs - in that dir you created. - * "opcontrol --dump;" - * "opreport -l that_dir/*" - - Profit - - -Coding conventions ------------------- - -Patch checklist -~~~~~~~~~~~~~~~ - -If possible, send your patch as one of these (in descending order of -preference) - - - A git branch we can pull from - - Patches generated by git format-patch - - A unified diff - -Did you remember... - - - To build your code while configured with --enable-gcc-warnings? - - To run "make check-spaces" on your code? - - To run "make check-docs" to see whether all new options are on - the manpage? - - To write unit tests, as possible? - - To base your code on the appropriate branch? - - To include a file in the "changes" directory as appropriate? - -Whitespace and C conformance -~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Invoke "make check-spaces" from time to time, so it can tell you about -deviations from our C whitespace style. Generally, we use: - - - Unix-style line endings - - K&R-style indentation - - No space before newlines - - A blank line at the end of each file - - Never more than one blank line in a row - - Always spaces, never tabs - - No more than 79-columns per line. - - Two spaces per indent. - - A space between control keywords and their corresponding paren - "if (x)", "while (x)", and "switch (x)", never "if(x)", "while(x)", or - "switch(x)". - - A space between anything and an open brace. - - No space between a function name and an opening paren. "puts(x)", not - "puts (x)". - - Function declarations at the start of the line. - -We try hard to build without warnings everywhere. In particular, if you're -using gcc, you should invoke the configure script with the option -"--enable-gcc-warnings". This will give a bunch of extra warning flags to -the compiler, and help us find divergences from our preferred C style. - -Getting emacs to edit Tor source properly -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -Nick likes to put the following snippet in his .emacs file: - ------ - (add-hook 'c-mode-hook - (lambda () - (font-lock-mode 1) - (set-variable 'show-trailing-whitespace t) - - (let ((fname (expand-file-name (buffer-file-name)))) - (cond - ((string-match "^/home/nickm/src/libevent" fname) - (set-variable 'indent-tabs-mode t) - (set-variable 'c-basic-offset 4) - (set-variable 'tab-width 4)) - ((string-match "^/home/nickm/src/tor" fname) - (set-variable 'indent-tabs-mode nil) - (set-variable 'c-basic-offset 2)) - ((string-match "^/home/nickm/src/openssl" fname) - (set-variable 'indent-tabs-mode t) - (set-variable 'c-basic-offset 8) - (set-variable 'tab-width 8)) - )))) ------ - -You'll note that it defaults to showing all trailing whitespace. The "cond" -test detects whether the file is one of a few C free software projects that I -often edit, and sets up the indentation level and tab preferences to match -what they want. - -If you want to try this out, you'll need to change the filename regex -patterns to match where you keep your Tor files. - -If you use emacs for editing Tor and nothing else, you could always just say: - ------ - (add-hook 'c-mode-hook - (lambda () - (font-lock-mode 1) - (set-variable 'show-trailing-whitespace t) - (set-variable 'indent-tabs-mode nil) - (set-variable 'c-basic-offset 2))) ------ - -There is probably a better way to do this. No, we are probably not going -to clutter the files with emacs stuff. - - -Functions to use -~~~~~~~~~~~~~~~~ - -We have some wrapper functions like tor_malloc, tor_free, tor_strdup, and -tor_gettimeofday; use them instead of their generic equivalents. (They -always succeed or exit.) - -You can get a full list of the compatibility functions that Tor provides by -looking through src/common/util.h and src/common/compat.h. You can see the -available containers in src/common/containers.h. You should probably -familiarize yourself with these modules before you write too much code, or -else you'll wind up reinventing the wheel. - -Use 'INLINE' instead of 'inline', so that we work properly on Windows. - -Calling and naming conventions -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Whenever possible, functions should return -1 on error and 0 on success. - -For multi-word identifiers, use lowercase words combined with -underscores. (e.g., "multi_word_identifier"). Use ALL_CAPS for macros and -constants. - -Typenames should end with "_t". - -Function names should be prefixed with a module name or object name. (In -general, code to manipulate an object should be a module with the same name -as the object, so it's hard to tell which convention is used.) - -Functions that do things should have imperative-verb names -(e.g. buffer_clear, buffer_resize); functions that return booleans should -have predicate names (e.g. buffer_is_empty, buffer_needs_resizing). - -If you find that you have four or more possible return code values, it's -probably time to create an enum. If you find that you are passing three or -more flags to a function, it's probably time to create a flags argument that -takes a bitfield. - -What To Optimize -~~~~~~~~~~~~~~~~ - -Don't optimize anything if it's not in the critical path. Right now, the -critical path seems to be AES, logging, and the network itself. Feel free to -do your own profiling to determine otherwise. - -Log conventions -~~~~~~~~~~~~~~~ - -https://trac.torproject.org/projects/tor/wiki/doc/TorFAQ#loglevel - -No error or warning messages should be expected during normal OR or OP -operation. - -If a library function is currently called such that failure always means ERR, -then the library function should log WARN and let the caller log ERR. - -Every message of severity INFO or higher should either (A) be intelligible -to end-users who don't know the Tor source; or (B) somehow inform the -end-users that they aren't expected to understand the message (perhaps -with a string like "internal error"). Option (A) is to be preferred to -option (B). - -Doxygen -~~~~~~~~ - -We use the 'doxygen' utility to generate documentation from our -source code. Here's how to use it: - - 1. Begin every file that should be documented with - /** - * \file filename.c - * \brief Short description of the file. - **/ - - (Doxygen will recognize any comment beginning with /** as special.) - - 2. Before any function, structure, #define, or variable you want to - document, add a comment of the form: - - /** Describe the function's actions in imperative sentences. - * - * Use blank lines for paragraph breaks - * - and - * - hyphens - * - for - * - lists. - * - * Write <b>argument_names</b> in boldface. - * - * \code - * place_example_code(); - * between_code_and_endcode_commands(); - * \endcode - */ - - 3. Make sure to escape the characters "<", ">", "\", "%" and "#" as "\<", - "\>", "\\", "\%", and "\#". - - 4. To document structure members, you can use two forms: - - struct foo { - /** You can put the comment before an element; */ - int a; - int b; /**< Or use the less-than symbol to put the comment - * after the element. */ - }; - - 5. To generate documentation from the Tor source code, type: - - $ doxygen -g - - To generate a file called 'Doxyfile'. Edit that file and run - 'doxygen' to generate the API documentation. - - 6. See the Doxygen manual for more information; this summary just - scratches the surface. - -Doxygen comment conventions -^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -Say what functions do as a series of one or more imperative sentences, as -though you were telling somebody how to be the function. In other words, DO -NOT say: - - /** The strtol function parses a number. - * - * nptr -- the string to parse. It can include whitespace. - * endptr -- a string pointer to hold the first thing that is not part - * of the number, if present. - * base -- the numeric base. - * returns: the resulting number. - */ - long strtol(const char *nptr, char **nptr, int base); - -Instead, please DO say: - - /** Parse a number in radix <b>base</b> from the string <b>nptr</b>, - * and return the result. Skip all leading whitespace. If - * <b>endptr</b> is not NULL, set *<b>endptr</b> to the first character - * after the number parsed. - **/ - long strtol(const char *nptr, char **nptr, int base); - -Doxygen comments are the contract in our abstraction-by-contract world: if -the functions that call your function rely on it doing something, then your -function should mention that it does that something in the documentation. If -you rely on a function doing something beyond what is in its documentation, -then you should watch out, or it might do something else later. - -Putting out a new release -------------------------- - -Here are the steps Roger takes when putting out a new Tor release: - -1) Use it for a while, as a client, as a relay, as a hidden service, -and as a directory authority. See if it has any obvious bugs, and -resolve those. - -1.5) As applicable, merge the maint-X branch into the release-X branch. - -2) Gather the changes/* files into a changelog entry, rewriting many -of them and reordering to focus on what users and funders would find -interesting and understandable. - - 2.1) Make sure that everything that wants a bug number has one. - Make sure that everything which is a bugfix says what version - it was a bugfix on. - 2.2) Concatenate them. - 2.3) Sort them by section. Within each section, sort by "version it's - a bugfix on", else by numerical ticket order. - - 2.4) Clean them up: - - Standard idioms: - "Fixes bug 9999; bugfix on 0.3.3.3-alpha." - - One period after a space. - - Make stuff very terse - - Make sure each section name ends with a colon - - Describe the user-visible problem right away - - Mention relevant config options by name. If they're rare or unusual, - remind people what they're for - - Avoid starting lines with open-paren - - Present and imperative tense: not past. - - 'Relays', not 'servers' or 'nodes' or 'Tor relays'. - - "Stop FOOing", not "Fix a bug where we would FOO". - - Try not to let any given section be longer than about a page. Break up - long sections into subsections by some sort of common subtopic. This - guideline is especially important when organizing Release Notes for - new stable releases. - - If a given changes stanza showed up in a different release (e.g. - maint-0.2.1), be sure to make the stanzas identical (so people can - distinguish if these are the same change). - - 2.5) Merge them in. - - 2.6) Clean everything one last time. - - 2.7) Run it through fmt to make it pretty. - -3) Compose a short release blurb to highlight the user-facing -changes. Insert said release blurb into the ChangeLog stanza. If it's -a stable release, add it to the ReleaseNotes file too. If we're adding -to a release-0.2.x branch, manually commit the changelogs to the later -git branches too. - -4) Bump the version number in configure.ac and rebuild. - -5) Make dist, put the tarball up somewhere, and tell #tor about it. Wait -a while to see if anybody has problems building it. Try to get Sebastian -or somebody to try building it on Windows. - -6) Get at least two of weasel/arma/sebastian to put the new version number -in their approved versions list. - -7) Sign the tarball, then sign and push the git tag: - gpg -ba <the_tarball> - git tag -u <keyid> tor-0.2.x.y-status - git push origin tag tor-0.2.x.y-status - -8) scp the tarball and its sig to the website in the dist/ directory -(i.e. /srv/www-master.torproject.org/htdocs/dist/ on vescum). Edit -"include/versions.wmi" and "Makefile" to note the new version. From your -website checkout, run ./publish to build and publish the website. - -9) Email the packagers (cc'ing tor-assistants) that a new tarball is up. - -10) Add the version number to Trac. To do this, go to Trac, log in, -select "Admin" near the top of the screen, then select "Versions" from -the menu on the left. At the right, there will be an "Add version" -box. By convention, we enter the version in the form "Tor: -0.2.2.23-alpha" (or whatever the version is), and we select the date as -the date in the ChangeLog. - -11) Forward-port the ChangeLog. - -12) Wait up to a day or two (for a development release), or until most -packages are up (for a stable release), and mail the release blurb and -changelog to tor-talk or tor-announce. - - (We might be moving to faster announcements, but don't announce until - the website is at least updated.) - -13) If it's a stable release, bump the version number in the maint-x.y.z - branch to "newversion-dev", and do a "merge -s ours" merge to avoid - taking that change into master. Do a similar 'merge -s theirs' - merge to get the change (and only that change) into release. (Some - of the build scripts require that maint merge cleanly into release.) - diff --git a/doc/HACKING/CodingStandards.md b/doc/HACKING/CodingStandards.md new file mode 100644 index 0000000000..4aafa5ddd4 --- /dev/null +++ b/doc/HACKING/CodingStandards.md @@ -0,0 +1,245 @@ +Coding conventions for Tor +========================== + +tl;dr: + + - Run configure with `--enable-gcc-warnings` + - Run `make check-spaces` to catch whitespace errors + - Document your functions + - Write unit tests + - Add a file in `changes` for your branch. + +Patch checklist +--------------- + +If possible, send your patch as one of these (in descending order of +preference) + + - A git branch we can pull from + - Patches generated by git format-patch + - A unified diff + +Did you remember... + + - To build your code while configured with `--enable-gcc-warnings`? + - To run `make check-spaces` on your code? + - To run `make check-docs` to see whether all new options are on + the manpage? + - To write unit tests, as possible? + - To base your code on the appropriate branch? + - To include a file in the `changes` directory as appropriate? + +How we use Git branches +======================= + +Each main development series (like 0.2.1, 0.2.2, etc) has its main work +applied to a single branch. At most one series can be the development series +at a time; all other series are maintenance series that get bug-fixes only. +The development series is built in a git branch called "master"; the +maintenance series are built in branches called "maint-0.2.0", "maint-0.2.1", +and so on. We regularly merge the active maint branches forward. + +For all series except the development series, we also have a "release" branch +(as in "release-0.2.1"). The release series is based on the corresponding +maintenance series, except that it deliberately lags the maint series for +most of its patches, so that bugfix patches are not typically included in a +maintenance release until they've been tested for a while in a development +release. Occasionally, we'll merge an urgent bugfix into the release branch +before it gets merged into maint, but that's rare. + +If you're working on a bugfix for a bug that occurs in a particular version, +base your bugfix branch on the "maint" branch for the first supported series +that has that bug. (As of June 2013, we're supporting 0.2.3 and later.) If +you're working on a new feature, base it on the master branch. + + +How we log changes +================== + +When you do a commit that needs a ChangeLog entry, add a new file to +the `changes` toplevel subdirectory. It should have the format of a +one-entry changelog section from the current ChangeLog file, as in + +- Major bugfixes: + - Fix a potential buffer overflow. Fixes bug 99999; bugfix on + 0.3.1.4-beta. + +To write a changes file, first categorize the change. Some common categories +are: Minor bugfixes, Major bugfixes, Minor features, Major features, Code +simplifications and refactoring. Then say what the change does. If +it's a bugfix, mention what bug it fixes and when the bug was +introduced. To find out which Git tag the change was introduced in, +you can use `git describe --contains <sha1 of commit>`. + +If at all possible, try to create this file in the same commit where you are +making the change. Please give it a distinctive name that no other branch will +use for the lifetime of your change. To verify the format of the changes file, +you can use `make check-changes`. + +When we go to make a release, we will concatenate all the entries +in changes to make a draft changelog, and clear the directory. We'll +then edit the draft changelog into a nice readable format. + +To make sure that stuff is in the right format, we use +scripts/maint/lintChanges.py to check the changes files for +(superficial) validity. You can run this script on your own changes +files! + +What needs a changes file? + + * A not-exhaustive list: Anything that might change user-visible + behavior. Anything that changes internals, documentation, or the build + system enough that somebody could notice. Big or interesting code + rewrites. Anything about which somebody might plausibly wonder "when + did that happen, and/or why did we do that" 6 months down the line. + +Why use changes files instead of Git commit messages? + + * Git commit messages are written for developers, not users, and they + are nigh-impossible to revise after the fact. + +Why use changes files instead of entries in the ChangeLog? + + * Having every single commit touch the ChangeLog file tended to create + zillions of merge conflicts. + +Whitespace and C conformance +---------------------------- + +Invoke `make check-spaces` from time to time, so it can tell you about +deviations from our C whitespace style. Generally, we use: + + - Unix-style line endings + - K&R-style indentation + - No space before newlines + - A blank line at the end of each file + - Never more than one blank line in a row + - Always spaces, never tabs + - No more than 79-columns per line. + - Two spaces per indent. + - A space between control keywords and their corresponding paren + `if (x)`, `while (x)`, and `switch (x)`, never `if(x)`, `while(x)`, or + `switch(x)`. + - A space between anything and an open brace. + - No space between a function name and an opening paren. `puts(x)`, not + `puts (x)`. + - Function declarations at the start of the line. + +We try hard to build without warnings everywhere. In particular, if you're +using gcc, you should invoke the configure script with the option +`--enable-gcc-warnings`. This will give a bunch of extra warning flags to +the compiler, and help us find divergences from our preferred C style. + +Functions to use; functions not to use +-------------------------------------- + +We have some wrapper functions like `tor_malloc`, `tor_free`, `tor_strdup`, and +`tor_gettimeofday;` use them instead of their generic equivalents. (They +always succeed or exit.) + +You can get a full list of the compatibility functions that Tor provides by +looking through `src/common/util*.h` and `src/common/compat*.h`. You can see the +available containers in `src/common/containers*.h`. You should probably +familiarize yourself with these modules before you write too much code, or +else you'll wind up reinventing the wheel. + +We don't use `strcat` or `strcpy` or `sprintf` of any of those notoriously broken +old C functions. Use `strlcat`, `strlcpy`, or `tor_snprintf/tor_asprintf` instead. + +We don't call `memcmp()` directly. Use `fast_memeq()`, `fast_memneq()`, +`tor_memeq()`, or `tor_memneq()` for most purposes. + +Functions not to write +---------------------- + +Try to never hand-write new code to parse or generate binary +formats. Instead, use trunnel if at all possible. See + + https://gitweb.torproject.org/trunnel.git/tree + +for more information about trunnel. + +For information on adding new trunnel code to Tor, see src/trunnel/README + + +Calling and naming conventions +------------------------------ + +Whenever possible, functions should return -1 on error and 0 on success. + +For multi-word identifiers, use lowercase words combined with +underscores. (e.g., `multi_word_identifier`). Use ALL_CAPS for macros and +constants. + +Typenames should end with `_t`. + +Function names should be prefixed with a module name or object name. (In +general, code to manipulate an object should be a module with the same name +as the object, so it's hard to tell which convention is used.) + +Functions that do things should have imperative-verb names +(e.g. `buffer_clear`, `buffer_resize`); functions that return booleans should +have predicate names (e.g. `buffer_is_empty`, `buffer_needs_resizing`). + +If you find that you have four or more possible return code values, it's +probably time to create an enum. If you find that you are passing three or +more flags to a function, it's probably time to create a flags argument that +takes a bitfield. + +What To Optimize +---------------- + +Don't optimize anything if it's not in the critical path. Right now, the +critical path seems to be AES, logging, and the network itself. Feel free to +do your own profiling to determine otherwise. + +Log conventions +--------------- + +`https://www.torproject.org/docs/faq#LogLevel` + +No error or warning messages should be expected during normal OR or OP +operation. + +If a library function is currently called such that failure always means ERR, +then the library function should log WARN and let the caller log ERR. + +Every message of severity INFO or higher should either (A) be intelligible +to end-users who don't know the Tor source; or (B) somehow inform the +end-users that they aren't expected to understand the message (perhaps +with a string like "internal error"). Option (A) is to be preferred to +option (B). + + + +Doxygen comment conventions +--------------------------- + +Say what functions do as a series of one or more imperative sentences, as +though you were telling somebody how to be the function. In other words, DO +NOT say: + + /** The strtol function parses a number. + * + * nptr -- the string to parse. It can include whitespace. + * endptr -- a string pointer to hold the first thing that is not part + * of the number, if present. + * base -- the numeric base. + * returns: the resulting number. + */ + long strtol(const char *nptr, char **nptr, int base); + +Instead, please DO say: + + /** Parse a number in radix <b>base</b> from the string <b>nptr</b>, + * and return the result. Skip all leading whitespace. If + * <b>endptr</b> is not NULL, set *<b>endptr</b> to the first character + * after the number parsed. + **/ + long strtol(const char *nptr, char **nptr, int base); + +Doxygen comments are the contract in our abstraction-by-contract world: if +the functions that call your function rely on it doing something, then your +function should mention that it does that something in the documentation. If +you rely on a function doing something beyond what is in its documentation, +then you should watch out, or it might do something else later. diff --git a/doc/HACKING/GettingStarted.md b/doc/HACKING/GettingStarted.md new file mode 100644 index 0000000000..0295adc1ff --- /dev/null +++ b/doc/HACKING/GettingStarted.md @@ -0,0 +1,187 @@ + +Getting started in Tor development +================================== + +Congratulations! You've found this file, and you're reading it! This +means that you might be interested in getting started in developing Tor. + +(This guide is just about Tor itself--the small network program at the +heart of the Tor network--and not about all the other programs in the +whole Tor ecosystem.) + + +If you are looking for a more bare-bones, less user-friendly information +dump of important information, you might like reading doc/HACKING +instead. You should probably read it before you write your first patch. + + +Required background +------------------- + +First, I'm going to assume that you can build Tor from source, and that +you know enough of the C language to read and write it. (See the README +file that comes with the Tor source for more information on building it, +and any high-quality guide to C for information on programming.) + +I'm also going to assume that you know a little bit about how to use +Git, or that you're able to follow one of the several excellent guides +at http://git-scm.org to learn. + +Most Tor developers develop using some Unix-based system, such as Linux, +BSD, or OSX. It's okay to develop on Windows if you want, but you're +going to have a more difficult time. + + +Getting your first patch into Tor +--------------------------------- + +Once you've reached this point, here's what you need to know. + + 1. Get the source. + + We keep our source under version control in Git. To get the latest + version, run + + git clone https://git.torproject.org/git/tor + + This will give you a checkout of the master branch. If you're + going to fix a bug that appears in a stable version, check out the + appropriate "maint" branch, as in: + + git checkout maint-0.2.7 + + 2. Find your way around the source + + Our overall code structure is explained in the "torguts" documents, + currently at + + git clone https://git.torproject.org/user/nickm/torguts.git + + Find a part of the code that looks interesting to you, and start + looking around it to see how it fits together! + + We do some unusual things in our codebase. Our testing-related + practices and kludges are explained in doc/WritingTests.txt. + + If you see something that doesn't make sense, we love to get + questions! + + 3. Find something cool to hack on. + + You may already have a good idea of what you'd like to work on, or + you might be looking for a way to contribute. + + Many people have gotten started by looking for an area where they + personally felt Tor was underperforming, and investigating ways to + fix it. If you're looking for ideas, you can head to our bug + tracker at trac.torproject.org and look for tickets that have + received the "easy" tag: these are ones that developers think would + be pretty simple for a new person to work on. For a bigger + challenge, you might want to look for tickets with the "lorax" + keyword: these are tickets that the developers think might be a + good idea to build, but which we have no time to work on any time + soon. + + Or you might find another open ticket that piques your + interest. It's all fine! + + For your first patch, it is probably NOT a good idea to make + something huge or invasive. In particular, you should probably + avoid: + + * Major changes spread across many parts of the codebase. + * Major changes to programming practice or coding style. + * Huge new features or protocol changes. + + 4. Meet the developers! + + We discuss stuff on the tor-dev mailing list and on the #tor-dev + IRC channel on OFTC. We're generally friendly and approachable, + and we like to talk about how Tor fits together. If we have ideas + about how something should be implemented, we'll be happy to share + them. + + We currently have a patch workshop at least once a week, where + people share patches they've made and discuss how to make them + better. The time might change in the future, but generally, + there's no bad time to talk, and ask us about patch ideas. + + 5. Do you need to write a design proposal? + + If your idea is very large, or it will require a change to Tor's + protocols, there needs to be a written design proposal before it + can be merged. (We use this process to manage changes in the + protocols.) To write one, see the instructions at + https://gitweb.torproject.org/torspec.git/tree/proposals/001-process.txt + . If you'd like help writing a proposal, just ask! We're happy to + help out with good ideas. + + You might also like to look around the rest of that directory, to + see more about open and past proposed changes to Tor's behavior. + + 6. Writing your patch + + As you write your code, you'll probably want it to fit in with the + standards of the rest of the Tor codebase so it will be easy for us + to review and merge. You can learn our coding standards in + doc/HACKING. + + If your patch is large and/or is divided into multiple logical + components, remember to divide it into a series of Git commits. A + series of small changes is much easier to review than one big lump. + + 7. Testing your patch + + We prefer that all new or modified code have unit tests for it to + ensure that it runs correctly. Also, all code should actually be + _run_ by somebody, to make sure it works. + + See doc/WritingTests.txt for more information on how we test things + in Tor. If you'd like any help writing tests, just ask! We're + glad to help out. + + 8. Submitting your patch + + We review patches through tickets on our bugtracker at + trac.torproject.org. You can either upload your patches there, or + put them at a public git repository somewhere we can fetch them + (like github or bitbucket) and then paste a link on the appropriate + trac ticket. + + Once your patches are available, write a short explanation of what + you've done on trac, and then change the status of the ticket to + needs_review. + + 9. Review, Revision, and Merge + + With any luck, somebody will review your patch soon! If not, you + can ask on the IRC channel; sometimes we get really busy and take + longer than we should. But don't let us slow you down: you're the + one who's offering help here, and we should respect your time and + contributions. + + When your patch is reviewed, one of these things will happen: + + * The reviewer will say "looks good to me" and your + patch will get merged right into Tor. [Assuming we're not + in the middle of a code-freeze window. If the codebase is + frozen, your patch will go into the next release series.] + + * OR the reviewer will say "looks good, just needs some small + changes!" And then the reviewer will make those changes, + and merge the modified patch into Tor. + + * OR the reviewer will say "Here are some questions and + comments," followed by a bunch of stuff that the reviewer + thinks should change in your code, or questions that the + reviewer has. + + At this point, you might want to make the requested changes + yourself, and comment on the trac ticket once you have done + so. Or if you disagree with any of the comments, you should + say so! And if you won't have time to make some of the + changes, you should say that too, so that other developers + will be able to pick up the unfinished portion. + + Congratulations! You have now written your first patch, and gotten + it integrated into mainline Tor. diff --git a/doc/HACKING/HelpfulTools.md b/doc/HACKING/HelpfulTools.md new file mode 100644 index 0000000000..a7f36e6c7e --- /dev/null +++ b/doc/HACKING/HelpfulTools.md @@ -0,0 +1,293 @@ +Useful tools +============ + +These aren't strictly necessary for hacking on Tor, but they can help track +down bugs. + +Jenkins +------- + + https://jenkins.torproject.org + +Dmalloc +------- + +The dmalloc library will keep track of memory allocation, so you can find out +if we're leaking memory, doing any double-frees, or so on. + + dmalloc -l -/dmalloc.log + (run the commands it tells you) + ./configure --with-dmalloc + +Valgrind +-------- + + valgrind --leak-check=yes --error-limit=no --show-reachable=yes src/or/tor + +(Note that if you get a zillion openssl warnings, you will also need to +pass `--undef-value-errors=no` to valgrind, or rebuild your openssl +with `-DPURIFY`.) + +Coverity +-------- + +Nick regularly runs the coverity static analyzer on the Tor codebase. + +The preprocessor define `__COVERITY__` is used to work around instances +where coverity picks up behavior that we wish to permit. + +clang Static Analyzer +--------------------- + +The clang static analyzer can be run on the Tor codebase using Xcode (WIP) +or a command-line build. + +The preprocessor define `__clang_analyzer__` is used to work around instances +where clang picks up behavior that we wish to permit. + +clang Runtime Sanitizers +------------------------ + +To build the Tor codebase with the clang Address and Undefined Behavior +sanitizers, see the file `contrib/clang/sanitize_blacklist.txt`. + +Preprocessor workarounds for instances where clang picks up behavior that +we wish to permit are also documented in the blacklist file. + +Running lcov for unit test coverage +----------------------------------- + +Lcov is a utility that generates pretty HTML reports of test code coverage. +To generate such a report: + + ./configure --enable-coverage + make + make coverage-html + $BROWSER ./coverage_html/index.html + +This will run the tor unit test suite `./src/test/test` and generate the HTML +coverage code report under the directory `./coverage_html/`. To change the +output directory, use `make coverage-html HTML_COVER_DIR=./funky_new_cov_dir`. + +Coverage diffs using lcov are not currently implemented, but are being +investigated (as of July 2014). + +Running the unit tests +---------------------- + +To quickly run all the tests distributed with Tor: + + make check + +To run the fast unit tests only: + + make test + +To selectively run just some tests (the following can be combined +arbitrarily): + + ./src/test/test <name_of_test> [<name of test 2>] ... + ./src/test/test <prefix_of_name_of_test>.. [<prefix_of_name_of_test2>..] ... + ./src/test/test :<name_of_excluded_test> [:<name_of_excluded_test2]... + +To run all tests, including those based on Stem or Chutney: + + make test-full + +To run all tests, including those based on Stem or Chutney that require a +working connection to the internet: + + make test-full-online + +Running gcov for unit test coverage +----------------------------------- + + ./configure --enable-coverage + make + make check + # or--- make test-full ? make test-full-online? + mkdir coverage-output + ./scripts/test/coverage coverage-output + +(On OSX, you'll need to start with `--enable-coverage CC=clang`.) + +Then, look at the .gcov files in `coverage-output`. '-' before a line means +that the compiler generated no code for that line. '######' means that the +line was never reached. Lines with numbers were called that number of times. + +If that doesn't work: + + * Try configuring Tor with `--disable-gcc-hardening` + * You might need to run `make clean` after you run `./configure`. + +If you make changes to Tor and want to get another set of coverage results, +you can run `make reset-gcov` to clear the intermediary gcov output. + +If you have two different `coverage-output` directories, and you want to see +a meaningful diff between them, you can run: + + ./scripts/test/cov-diff coverage-output1 coverage-output2 | less + +In this diff, any lines that were visited at least once will have coverage +"1". This lets you inspect what you (probably) really want to know: which +untested lines were changed? Are there any new untested lines? + +Running integration tests +------------------------- + +We have the beginnings of a set of scripts to run integration tests using +Chutney. To try them, set CHUTNEY_PATH to your chutney source directory, and +run `make test-network`. + +We also have scripts to run integration tests using Stem. To try them, set +`STEM_SOURCE_DIR` to your Stem source directory, and run `test-stem`. + +Profiling Tor with oprofile +--------------------------- + +The oprofile tool runs (on Linux only!) to tell you what functions Tor is +spending its CPU time in, so we can identify performance bottlenecks. + +Here are some basic instructions + + - Build tor with debugging symbols (you probably already have, unless + you messed with CFLAGS during the build process). + - Build all the libraries you care about with debugging symbols + (probably you only care about libssl, maybe zlib and Libevent). + - Copy this tor to a new directory + - Copy all the libraries it uses to that dir too (`ldd ./tor` will + tell you) + - Set LD_LIBRARY_PATH to include that dir. `ldd ./tor` should now + show you it's using the libs in that dir + - Run that tor + - Reset oprofiles counters/start it + * `opcontrol --reset; opcontrol --start`, if Nick remembers right. + - After a while, have it dump the stats on tor and all the libs + in that dir you created. + * `opcontrol --dump;` + * `opreport -l that_dir/*` + - Profit + +Generating and analyzing a callgraph +------------------------------------ + +1. Run `./scripts/maint/generate_callgraph.sh`. This will generate a + bunch of files in a new ./callgraph directory. + +2. Run `./scripts/maint/analyze_callgraph.py callgraph/src/*/*`. This + will do a lot of graph operations and then dump out a new + `callgraph.pkl` file, containing data in Python's 'pickle' format. + +3. Run `./scripts/maint/display_callgraph.py`. It will display: + - the number of functions reachable from each function. + - all strongly-connnected components in the Tor callgraph + - the largest bottlenecks in the largest SCC in the Tor callgraph. + +Note that currently the callgraph generator can't detect calls that pass +through function pointers. + +Getting emacs to edit Tor source properly +----------------------------------------- + +Nick likes to put the following snippet in his .emacs file: + + + (add-hook 'c-mode-hook + (lambda () + (font-lock-mode 1) + (set-variable 'show-trailing-whitespace t) + + (let ((fname (expand-file-name (buffer-file-name)))) + (cond + ((string-match "^/home/nickm/src/libevent" fname) + (set-variable 'indent-tabs-mode t) + (set-variable 'c-basic-offset 4) + (set-variable 'tab-width 4)) + ((string-match "^/home/nickm/src/tor" fname) + (set-variable 'indent-tabs-mode nil) + (set-variable 'c-basic-offset 2)) + ((string-match "^/home/nickm/src/openssl" fname) + (set-variable 'indent-tabs-mode t) + (set-variable 'c-basic-offset 8) + (set-variable 'tab-width 8)) + )))) + + +You'll note that it defaults to showing all trailing whitespace. The `cond` +test detects whether the file is one of a few C free software projects that I +often edit, and sets up the indentation level and tab preferences to match +what they want. + +If you want to try this out, you'll need to change the filename regex +patterns to match where you keep your Tor files. + +If you use emacs for editing Tor and nothing else, you could always just say: + + + (add-hook 'c-mode-hook + (lambda () + (font-lock-mode 1) + (set-variable 'show-trailing-whitespace t) + (set-variable 'indent-tabs-mode nil) + (set-variable 'c-basic-offset 2))) + + +There is probably a better way to do this. No, we are probably not going +to clutter the files with emacs stuff. + + +Doxygen +------- + +We use the 'doxygen' utility to generate documentation from our +source code. Here's how to use it: + + 1. Begin every file that should be documented with + + /** + * \file filename.c + * \brief Short description of the file. + */ + + (Doxygen will recognize any comment beginning with /** as special.) + + 2. Before any function, structure, #define, or variable you want to + document, add a comment of the form: + + /** Describe the function's actions in imperative sentences. + * + * Use blank lines for paragraph breaks + * - and + * - hyphens + * - for + * - lists. + * + * Write <b>argument_names</b> in boldface. + * + * \code + * place_example_code(); + * between_code_and_endcode_commands(); + * \endcode + */ + + 3. Make sure to escape the characters `<`, `>`, `\`, `%` and `#` as `\<`, + `\>`, `\\`, `\%` and `\#`. + + 4. To document structure members, you can use two forms: + + struct foo { + /** You can put the comment before an element; */ + int a; + int b; /**< Or use the less-than symbol to put the comment + * after the element. */ + }; + + 5. To generate documentation from the Tor source code, type: + + $ doxygen -g + + to generate a file called `Doxyfile`. Edit that file and run + `doxygen` to generate the API documentation. + + 6. See the Doxygen manual for more information; this summary just + scratches the surface. diff --git a/doc/HACKING/HowToReview.md b/doc/HACKING/HowToReview.md new file mode 100644 index 0000000000..de7891c923 --- /dev/null +++ b/doc/HACKING/HowToReview.md @@ -0,0 +1,85 @@ +How to review a patch +===================== + +Some folks have said that they'd like to review patches more often, but they +don't know how. + +So, here are a bunch of things to check for when reviewing a patch! + +Note that if you can't do every one of these, that doesn't mean you can't do +a good review! Just make it clear what you checked for and what you didn't. + + +Top-level smell-checks +---------------------- + +(Difficulty: easy) + +- Does it compile with `--enable-gcc-warnings`? + +- Does `make check-spaces` pass? + +- Does it have a reasonable amount of tests? Do they pass? Do they leak + memory? + +- Do all the new functions, global variables, types, and structure members have + documentation? + +- Do all the functions, global variables, types, and structure members with + modified behavior have modified documentation? + +- Do all the new torrc options have documentation? + +- If this changes Tor's behavior on the wire, is there a design proposal? + + + +Let's look at the code! +----------------------- + +- Does the code conform to CodingStandards.txt? + +- Does the code leak memory? + +- If two or more pointers ever point to the same object, is it clear which + pointer "owns" the object? + +- Are all allocated resources freed? + +- Are all pointers that should be const, const? + +- Are `#defines` used for 'magic' numbers? + +- Can you understand what the code is trying to do? + +- Can you convince yourself that the code really does that? + +- Is there duplicated code that could be turned into a function? + + +Let's look at the documentation! +-------------------------------- + +- Does the documentation confirm to CodingStandards.txt? + +- Does it make sense? + +- Can you predict what the function will do from its documentation? + + +Let's think about security! +--------------------------- + +- If there are any arrays, buffers, are you 100% sure that they cannot + overflow? + +- If there is any integer math, can it overflow or underflow? + +- If there are any allocations, are you sure there are corresponding + deallocations? + +- Is there a safer pattern that could be used in any case? + +- Have they used one of the Forbidden Functions? + +(Also see your favorite secure C programming guides.) diff --git a/doc/HACKING/README.1st.md b/doc/HACKING/README.1st.md new file mode 100644 index 0000000000..8299fe634a --- /dev/null +++ b/doc/HACKING/README.1st.md @@ -0,0 +1,62 @@ + +In this directory +----------------- + +This directory has helpful information about what you need to know to +hack on Tor! + +First, read `GettingStarted.md` to learn how to get a start in Tor +development. + +If you've decided to write a patch, `CodingStandards.txt` will give +you a bunch of information about how we structure our code. + +It's important to get code right! Reading `WritingTests.md` will +tell you how to write and run tests in the Tor codebase. + +There are a bunch of other programs we use to help maintain and +develop the codebase: `HelpfulTools.md` can tell you how to use them +with Tor. + +If it's your job to put out Tor releases, see `ReleasingTor.md` so +that you don't miss any steps! + + +----------------------- + +For full information on how Tor is supposed to work, look at the files in +`https://gitweb.torproject.org/torspec.git/tree`. + +For an explanation of how to change Tor's design to work differently, look at +`https://gitweb.torproject.org/torspec.git/blob_plain/HEAD:/proposals/001-process.txt`. + +For the latest version of the code, get a copy of git, and + + git clone https://git.torproject.org/git/tor + +We talk about Tor on the `tor-talk` mailing list. Design proposals and +discussion belong on the `tor-dev` mailing list. We hang around on +irc.oftc.net, with general discussion happening on #tor and development +happening on `#tor-dev`. + +The other files in this `HACKING` directory may also be useful as you +get started working with Tor. + +Happy hacking! + + +----------------------- + +XXXXX also describe + +doc/HACKING/WritingTests.md + +torguts.git + +torspec.git + +The design paper + +freehaven.net/anonbib + +XXXX describe these and add links. diff --git a/doc/HACKING/ReleasingTor.md b/doc/HACKING/ReleasingTor.md new file mode 100644 index 0000000000..2378aef568 --- /dev/null +++ b/doc/HACKING/ReleasingTor.md @@ -0,0 +1,143 @@ + +Putting out a new release +------------------------- + +Here are the steps Roger takes when putting out a new Tor release: + +1. Use it for a while, as a client, as a relay, as a hidden service, + and as a directory authority. See if it has any obvious bugs, and + resolve those. + + As applicable, merge the `maint-X` branch into the `release-X` branch. + +2. Gather the `changes/*` files into a changelog entry, rewriting many + of them and reordering to focus on what users and funders would find + interesting and understandable. + + 1. Make sure that everything that wants a bug number has one. + Make sure that everything which is a bugfix says what version + it was a bugfix on. + + 2. Concatenate them. + + 3. Sort them by section. Within each section, sort by "version it's + a bugfix on", else by numerical ticket order. + + 4. Clean them up: + + Standard idioms: + `Fixes bug 9999; bugfix on 0.3.3.3-alpha.` + + One space after a period. + + Make stuff very terse + + Make sure each section name ends with a colon + + Describe the user-visible problem right away + + Mention relevant config options by name. If they're rare or unusual, + remind people what they're for + + Avoid starting lines with open-paren + + Present and imperative tense: not past. + + 'Relays', not 'servers' or 'nodes' or 'Tor relays'. + + "Stop FOOing", not "Fix a bug where we would FOO". + + Try not to let any given section be longer than about a page. Break up + long sections into subsections by some sort of common subtopic. This + guideline is especially important when organizing Release Notes for + new stable releases. + + If a given changes stanza showed up in a different release (e.g. + maint-0.2.1), be sure to make the stanzas identical (so people can + distinguish if these are the same change). + + 5. Merge them in. + + 6. Clean everything one last time. + + 7. Run `./scripts/maint/format_changelog.py` to make it prettier. + +3. Compose a short release blurb to highlight the user-facing + changes. Insert said release blurb into the ChangeLog stanza. If it's + a stable release, add it to the ReleaseNotes file too. If we're adding + to a release-0.2.x branch, manually commit the changelogs to the later + git branches too. + + If you're doing the first stable release in a series, you need to + create a ReleaseNotes for the series as a whole. To get started + there, copy all of the Changelog entries from the series into a new + file, and run `./scripts/maint/sortChanges.py` on it. That will + group them by category. Then kill every bugfix entry for fixing + bugs that were introduced within that release series; those aren't + relevant changes since the last series. At that point, it's time + to start sorting and condensing entries. (Generally, we don't edit the + text of existing entries, though.) + +4. In `maint-0.2.x`, bump the version number in `configure.ac` and run + `scripts/maint/updateVersions.pl` to update version numbers in other + places, and commit. Then merge `maint-0.2.x` into `release-0.2.x`. + + (NOTE: To bump the version number, edit `configure.ac`, and then run + either `make`, or `perl scripts/maint/updateVersions.pl`, depending on + your version.) + +5. Make distcheck, put the tarball up somewhere, and tell `#tor` about + it. Wait a while to see if anybody has problems building it. Try to + get Sebastian or somebody to try building it on Windows. + +6. Get at least two of weasel/arma/Sebastian to put the new version number + in their approved versions list. + +7. Sign the tarball, then sign and push the git tag: + + gpg -ba <the_tarball> + git tag -u <keyid> tor-0.2.x.y-status + git push origin tag tor-0.2.x.y-status + +8. scp the tarball and its sig to the dist website, i.e. + `/srv/dist-master.torproject.org/htdocs/` on dist-master. When you want + it to go live, you run "static-update-component dist.torproject.org" + on dist-master. + + Edit `include/versions.wmi` and `Makefile` to note the new version. + + (NOTE: Due to #17805, there can only be one stable version listed at + once. Nonetheless, do not call your version "alpha" if it is stable, + or people will get confused.) + +9. Email the packagers (cc'ing tor-assistants) that a new tarball is up. + The current list of packagers is: + + - {weasel,gk,mikeperry} at torproject dot org + - {blueness} at gentoo dot org + - {paul} at invizbox dot io + - {ondrej.mikle} at gmail dot com + - {lfleischer} at archlinux dot org + - {tails-dev} at boum dot org + +10. Add the version number to Trac. To do this, go to Trac, log in, + select "Admin" near the top of the screen, then select "Versions" from + the menu on the left. At the right, there will be an "Add version" + box. By convention, we enter the version in the form "Tor: + 0.2.2.23-alpha" (or whatever the version is), and we select the date as + the date in the ChangeLog. + +11. Forward-port the ChangeLog (and ReleaseNotes if appropriate). + +12. Wait up to a day or two (for a development release), or until most + packages are up (for a stable release), and mail the release blurb and + changelog to tor-talk or tor-announce. + + (We might be moving to faster announcements, but don't announce until + the website is at least updated.) + +13. If it's a stable release, bump the version number in the `maint-x.y.z` + branch to "newversion-dev", and do a `merge -s ours` merge to avoid + taking that change into master. Do a similar `merge -s theirs` + merge to get the change (and only that change) into release. (Some + of the build scripts require that maint merge cleanly into release.) diff --git a/doc/HACKING/WritingTests.md b/doc/HACKING/WritingTests.md new file mode 100644 index 0000000000..4e98d3d645 --- /dev/null +++ b/doc/HACKING/WritingTests.md @@ -0,0 +1,445 @@ + +Writing tests for Tor: an incomplete guide +========================================== + +Tor uses a variety of testing frameworks and methodologies to try to +keep from introducing bugs. The major ones are: + + 1. Unit tests written in C and shipped with the Tor distribution. + + 2. Integration tests written in Python and shipped with the Tor + distribution. + + 3. Integration tests written in Python and shipped with the Stem + library. Some of these use the Tor controller protocol. + + 4. System tests written in Python and SH, and shipped with the + Chutney package. These work by running many instances of Tor + locally, and sending traffic through them. + + 5. The Shadow network simulator. + +How to run these tests +---------------------- + +### The easy version + +To run all the tests that come bundled with Tor, run `make check`. + +To run the Stem tests as well, fetch stem from the git repository, +set `STEM_SOURCE_DIR` to the checkout, and run `make test-stem`. + +To run the Chutney tests as well, fetch chutney from the git repository, +set `CHUTNEY_PATH` to the checkout, and run `make test-network`. + +To run all of the above, run `make test-full`. + +To run all of the above, plus tests that require a working connection to the +internet, run `make test-full-online`. + +### Running particular subtests + +The Tor unit tests are divided into separate programs and a couple of +bundled unit test programs. + +Separate programs are easy. For example, to run the memwipe tests in +isolation, you just run `./src/test/test-memwipe`. + +To run tests within the unit test programs, you can specify the name +of the test. The string ".." can be used as a wildcard at the end of the +test name. For example, to run all the cell format tests, enter +`./src/test/test cellfmt/..`. To run + +Many tests that need to mess with global state run in forked subprocesses in +order to keep from contaminating one another. But when debugging a failing test, +you might want to run it without forking a subprocess. To do so, use the +`--no-fork` option with a single test. (If you specify it along with +multiple tests, they might interfere.) + +You can turn on logging in the unit tests by passing one of `--debug`, +`--info`, `--notice`, or `--warn`. By default only errors are displayed. + +Unit tests are divided into `./src/test/test` and `./src/test/test-slow`. +The former are those that should finish in a few seconds; the latter tend to +take more time, and may include CPU-intensive operations, deliberate delays, +and stuff like that. + +### Finding test coverage + +Test coverage is a measurement of which lines your tests actually visit. + +When you configure Tor with the `--enable-coverage` option, it should +build with support for coverage in the unit tests, and in a special +`tor-cov` binary. + +Then, run the tests you'd like to see coverage from. If you have old +coverage output, you may need to run `reset-gcov` first. + +Now you've got a bunch of files scattered around your build directories +called `*.gcda`. In order to extract the coverage output from them, make a +temporary directory for them and run `./scripts/test/coverage ${TMPDIR}`, +where `${TMPDIR}` is the temporary directory you made. This will create a +`.gcov` file for each source file under tests, containing that file's source +annotated with the number of times the tests hit each line. (You'll need to +have gcov installed.) + +You can get a summary of the test coverage for each file by running +`./scripts/test/cov-display ${TMPDIR}/*` . Each line lists the file's name, +the number of uncovered lines, the number of uncovered lines, and the +coverage percentage. + +For a summary of the test coverage for each _function_, run +`./scripts/test/cov-display -f ${TMPDIR}/*`. + +### Comparing test coverage + +Sometimes it's useful to compare test coverage for a branch you're writing to +coverage from another branch (such as git master, for example). But you +can't run `diff` on the two coverage outputs directly, since the actual +number of times each line is executed aren't so important, and aren't wholly +deterministic. + +Instead, follow the instructions above for each branch, creating a separate +temporary directory for each. Then, run `./scripts/test/cov-diff ${D1} +${D2}`, where D1 and D2 are the directories you want to compare. This will +produce a diff of the two directories, with all lines normalized to be either +covered or uncovered. + +To count new or modified uncovered lines in D2, you can run: + + ./scripts/test/cov-diff ${D1} ${D2}" | grep '^+ *\#' | wc -l + + +What kinds of test should I write? +---------------------------------- + +Integration testing and unit testing are complementary: it's probably a +good idea to make sure that your code is hit by both if you can. + +If your code is very-low level, and its behavior is easily described in +terms of a relation between inputs and outputs, or a set of state +transitions, then it's a natural fit for unit tests. (If not, please +consider refactoring it until most of it _is_ a good fit for unit +tests!) + +If your code adds new externally visible functionality to Tor, it would +be great to have a test for that functionality. That's where +integration tests more usually come in. + +Unit and regression tests: Does this function do what it's supposed to? +----------------------------------------------------------------------- + +Most of Tor's unit tests are made using the "tinytest" testing framework. +You can see a guide to using it in the tinytest manual at + + https://github.com/nmathewson/tinytest/blob/master/tinytest-manual.md + +To add a new test of this kind, either edit an existing C file in `src/test/`, +or create a new C file there. Each test is a single function that must +be indexed in the table at the end of the file. We use the label "done:" as +a cleanup point for all test functions. + +(Make sure you read `tinytest-manual.md` before proceeding.) + +I use the term "unit test" and "regression tests" very sloppily here. + +### A simple example + +Here's an example of a test function for a simple function in util.c: + + static void + test_util_writepid(void *arg) + { + (void) arg; + + char *contents = NULL; + const char *fname = get_fname("tmp_pid"); + unsigned long pid; + char c; + + write_pidfile(fname); + + contents = read_file_to_str(fname, 0, NULL); + tt_assert(contents); + + int n = sscanf(contents, "%lu\n%c", &pid, &c); + tt_int_op(n, OP_EQ, 1); + tt_int_op(pid, OP_EQ, getpid()); + + done: + tor_free(contents); + } + +This should look pretty familiar to you if you've read the tinytest +manual. One thing to note here is that we use the testing-specific +function `get_fname` to generate a file with respect to a temporary +directory that the tests use. You don't need to delete the file; +it will get removed when the tests are done. + +Also note our use of `OP_EQ` instead of `==` in the `tt_int_op()` calls. +We define `OP_*` macros to use instead of the binary comparison +operators so that analysis tools can more easily parse our code. +(Coccinelle really hates to see `==` used as a macro argument.) + +Finally, remember that by convention, all `*_free()` functions that +Tor defines are defined to accept NULL harmlessly. Thus, you don't +need to say `if (contents)` in the cleanup block. + +### Exposing static functions for testing + +Sometimes you need to test a function, but you don't want to expose +it outside its usual module. + +To support this, Tor's build system compiles a testing version of +each module, with extra identifiers exposed. If you want to +declare a function as static but available for testing, use the +macro `STATIC` instead of `static`. Then, make sure there's a +macro-protected declaration of the function in the module's header. + +For example, `crypto_curve25519.h` contains: + + #ifdef CRYPTO_CURVE25519_PRIVATE + STATIC int curve25519_impl(uint8_t *output, const uint8_t *secret, + const uint8_t *basepoint); + #endif + +The `crypto_curve25519.c` file and the `test_crypto.c` file both define +`CRYPTO_CURVE25519_PRIVATE`, so they can see this declaration. + +### STOP! Does this test really test? + +When writing tests, it's not enough to just generate coverage on all the +lines of the code that you're testing: It's important to make sure that +the test _really tests_ the code. + +For example, here is a _bad_ test for the unlink() function (which is +supposed to remove a file). + + static void + test_unlink_badly(void *arg) + { + (void) arg; + int r; + + const char *fname = get_fname("tmpfile"); + + /* If the file isn't there, unlink returns -1 and sets ENOENT */ + r = unlink(fname); + tt_int_op(n, OP_EQ, -1); + tt_int_op(errno, OP_EQ, ENOENT); + + /* If the file DOES exist, unlink returns 0. */ + write_str_to_file(fname, "hello world", 0); + r = unlink(fnme); + tt_int_op(r, OP_EQ, 0); + + done: + tor_free(contents); + } + + +This test might get very high coverage on unlink(). So why is it a +bad test? Because it doesn't check that unlink() *actually removes the +named file*! + +Remember, the purpose of a test is to succeed if the code does what +it's supposed to do, and fail otherwise. Try to design your tests so +that they check for the code's intended and documented functionality +as much as possible. + + +### Mock functions for testing in isolation + +Often we want to test that a function works right, but the function to +be tested depends on other functions whose behavior is hard to observe, +or which require a working Tor network, or something like that. + +To write tests for this case, you can replace the underlying functions +with testing stubs while your unit test is running. You need to declare +the underlying function as 'mockable', as follows: + + MOCK_DECL(returntype, functionname, (argument list)); + +and then later implement it as: + + MOCK_IMPL(returntype, functionname, (argument list)) + { + /* implementation here */ + } + +For example, if you had a 'connect to remote server' function, you could +declare it as: + + + MOCK_DECL(int, connect_to_remote, (const char *name, status_t *status)); + +When you declare a function this way, it will be declared as normal in +regular builds, but when the module is built for testing, it is declared +as a function pointer initialized to the actual implementation. + +In your tests, if you want to override the function with a temporary +replacement, you say: + + MOCK(functionname, replacement_function_name); + +And later, you can restore the original function with: + + UNMOCK(functionname); + +For more information, see the definitions of this mocking logic in +`testsupport.h`. + +### Okay but what should my tests actually do? + +We talk above about "test coverage" -- making sure that your tests visit +every line of code, or every branch of code. But visiting the code isn't +enough: we want to verify that it's correct. + +So when writing tests, try to make tests that should pass with any correct +implementation of the code, and that should fail if the code doesn't do what +it's supposed to do. + +You can write "black-box" tests or "glass-box" tests. A black-box test is +one that you write without looking at the structure of the function. A +glass-box one is one you implement while looking at how the function is +implemented. + +In either case, make sure to consider common cases *and* edge cases; success +cases and failure csaes. + +For example, consider testing this function: + + /** Remove all elements E from sl such that E==element. Preserve + * the order of any elements before E, but elements after E can be + * rearranged. + */ + void smartlist_remove(smartlist_t *sl, const void *element); + +In order to test it well, you should write tests for at least all of the +following cases. (These would be black-box tests, since we're only looking +at the declared behavior for the function: + + * Remove an element that is in the smartlist. + * Remove an element that is not in the smartlist. + * Remove an element that appears in the smartlist more than once. + +And your tests should verify that it behaves correct. At minimum, you should +test: + + * That other elements before E are in the same order after you call the + functions. + * That the target element is really removed. + * That _only_ the target element is removed. + +When you consider edge cases, you might try: + + * Remove an element from an empty list. + * Remove an element from a singleton list containing that element. + * Remove an element for a list containing several instances of that + element, and nothing else. + +Now let's look at the implementation: + + void + smartlist_remove(smartlist_t *sl, const void *element) + { + int i; + if (element == NULL) + return; + for (i=0; i < sl->num_used; i++) + if (sl->list[i] == element) { + sl->list[i] = sl->list[--sl->num_used]; /* swap with the end */ + i--; /* so we process the new i'th element */ + sl->list[sl->num_used] = NULL; + } + } + +Based on the implementation, we now see three more edge cases to test: + + * Removing NULL from the list. + * Removing an element from the end of the list + * Removing an element from a position other than the end of the list. + + +### What should my tests NOT do? + +Tests shouldn't require a network connection. + +Whenever possible, tests shouldn't take more than a second. Put the test +into test/slow if it genuinely needs to be run. + +Tests should not alter global state unless they run with `TT_FORK`: Tests +should not require other tests to be run before or after them. + +Tests should not leak memory or other resources. To find out if your tests +are leaking memory, run them under valgrind (see HelpfulTools.txt for more +information on how to do that). + +When possible, tests should not be over-fit to the implementation. That is, +the test should verify that the documented behavior is implemented, but +should not break if other permissible behavior is later implemented. + + +### Advanced techniques: Namespaces + +Sometimes, when you're doing a lot of mocking at once, it's convenient to +isolate your identifiers within a single namespace. If this were C++, we'd +already have namespaces, but for C, we do the best we can with macros and +token-pasting. + +We have some macros defined for this purpose in `src/test/test.h`. To use +them, you define `NS_MODULE` to a prefix to be used for your identifiers, and +then use other macros in place of identifier names. See `src/test/test.h` for +more documentation. + + +Integration tests: Calling Tor from the outside +----------------------------------------------- + +Some tests need to invoke Tor from the outside, and shouldn't run from the +same process as the Tor test program. Reasons for doing this might include: + + * Testing the actual behavior of Tor when run from the command line + * Testing that a crash-handler correctly logs a stack trace + * Verifying that violating a sandbox or capability requirement will + actually crash the program. + * Needing to run as root in order to test capability inheritance or + user switching. + +To add one of these, you generally want a new C program in `src/test`. Add it +to `TESTS` and `noinst_PROGRAMS` if it can run on its own and return success or +failure. If it needs to be invoked multiple times, or it needs to be +wrapped, add a new shell script to `TESTS`, and the new program to +`noinst_PROGRAMS`. If you need access to any environment variable from the +makefile (eg `${PYTHON}` for a python interpreter), then make sure that the +makefile exports them. + +Writing integration tests with Stem +----------------------------------- + +The 'stem' library includes extensive unit tests for the Tor controller +protocol. + +For more information on writing new tests for stem, have a look around +the `test/*` directory in stem, and find a good example to emulate. You +might want to start with +`https://gitweb.torproject.org/stem.git/tree/test/integ/control/controller.py` +to improve Tor's test coverage. + +You can run stem tests from tor with `make test-stem`, or see +`https://stem.torproject.org/faq.html#how-do-i-run-the-tests`. + +System testing with Chutney +--------------------------- + +The 'chutney' program configures and launches a set of Tor relays, +authorities, and clients on your local host. It has a `test network` +functionality to send traffic through them and verify that the traffic +arrives correctly. + +You can write new test networks by adding them to `networks`. To add +them to Tor's tests, add them to the `test-network` or `test-network-all` +targets in `Makefile.am`. + +(Adding new kinds of program to chutney will still require hacking the +code.) |