diff options
Diffstat (limited to 'doc/HACKING')
-rw-r--r-- | doc/HACKING/CodingStandards.md | 154 | ||||
-rw-r--r-- | doc/HACKING/CodingStandardsRust.md | 469 | ||||
-rw-r--r-- | doc/HACKING/Fuzzing.md | 123 | ||||
-rw-r--r-- | doc/HACKING/GettingStarted.md | 5 | ||||
-rw-r--r-- | doc/HACKING/GettingStartedRust.md | 161 | ||||
-rw-r--r-- | doc/HACKING/HelpfulTools.md | 90 | ||||
-rw-r--r-- | doc/HACKING/HowToReview.md | 3 | ||||
-rw-r--r-- | doc/HACKING/ReleasingTor.md | 118 | ||||
-rw-r--r-- | doc/HACKING/Tracing.md | 91 | ||||
-rw-r--r-- | doc/HACKING/WritingTests.md | 8 |
10 files changed, 1142 insertions, 80 deletions
diff --git a/doc/HACKING/CodingStandards.md b/doc/HACKING/CodingStandards.md index f1c65850a4..dd21d6fd2c 100644 --- a/doc/HACKING/CodingStandards.md +++ b/doc/HACKING/CodingStandards.md @@ -4,9 +4,10 @@ Coding conventions for Tor tl;dr: - Run configure with `--enable-fatal-warnings` - - Run `make check-spaces` to catch whitespace errors - Document your functions - Write unit tests + - Run `make check` before submitting a patch + - Run `make distcheck` if you have made changes to build system components - Add a file in `changes` for your branch. Patch checklist @@ -22,13 +23,25 @@ preference) Did you remember... - To build your code while configured with `--enable-fatal-warnings`? - - To run `make check-spaces` on your code? - To run `make check-docs` to see whether all new options are on the manpage? - To write unit tests, as possible? + - To run `make test-full` to test against all unit and integration tests (or + `make test-full-online` if you have a working connection to the internet)? + - To test that the distribution will actually work via `make distcheck`? - To base your code on the appropriate branch? - To include a file in the `changes` directory as appropriate? +If you are submitting a major patch or new feature, or want to in the future... + + - Set up Chutney and Stem, see HACKING/WritingTests.md + - Run `make test-full` to test against all unit and integration tests. + +If you have changed build system components: + - Please run `make distcheck` + - For example, if you have changed Makefiles, autoconf files, or anything + else that affects the build system. + How we use Git branches ======================= @@ -49,8 +62,17 @@ before it gets merged into maint, but that's rare. If you're working on a bugfix for a bug that occurs in a particular version, base your bugfix branch on the "maint" branch for the first supported series -that has that bug. (As of June 2013, we're supporting 0.2.3 and later.) If -you're working on a new feature, base it on the master branch. +that has that bug. (As of June 2013, we're supporting 0.2.3 and later.) + +If you're working on a new feature, base it on the master branch. If you're +working on a new feature and it will take a while to implement and/or you'd +like to avoid the possibility of unrelated bugs in Tor while you're +implementing your feature, consider branching off of the latest maint- branch. +_Never_ branch off a relase- branch. Don't branch off a tag either: they come +from release branches. Doing so will likely produce a nightmare of merge +conflicts in the ChangeLog when it comes time to merge your branch into Tor. +Best advice: don't try to keep an independent branch forked for more than 6 +months and expect it to merge cleanly. Try to merge pieces early and often. How we log changes @@ -74,17 +96,34 @@ you can use `git describe --contains <sha1 of commit>`. If at all possible, try to create this file in the same commit where you are making the change. Please give it a distinctive name that no other branch will use for the lifetime of your change. To verify the format of the changes file, -you can use `make check-changes`. +you can use `make check-changes`. This is run automatically as part of +`make check` -- if it fails, we must fix it before we release. These +checks are implemented in `scripts/maint/lintChanges.py`. + +Changes file style guide: + * Changes files begin with " o Header (subheading):". The header + should usually be "Minor/Major bugfixes/features". The subheading is a + particular area within Tor. See the ChangeLog for examples. + + * Make everything terse. + + * Write from the user's point of view: describe the user-visible changes + right away. + + * Mention configuration options by name. If they're rare or unusual, + remind people what they're for. + + * Describe changes in the present tense and in the imperative: not past. + + * Every bugfix should have a sentence of the form "Fixes bug 1234; bugfix + on 0.1.2.3-alpha", describing what bug was fixed and where it came from. + + * "Relays", not "servers", "nodes", or "Tor relays". When we go to make a release, we will concatenate all the entries in changes to make a draft changelog, and clear the directory. We'll then edit the draft changelog into a nice readable format. -To make sure that stuff is in the right format, we use -scripts/maint/lintChanges.py to check the changes files for -(superficial) validity. You can run this script on your own changes -files! - What needs a changes file? * A not-exhaustive list: Anything that might change user-visible @@ -93,6 +132,10 @@ What needs a changes file? rewrites. Anything about which somebody might plausibly wonder "when did that happen, and/or why did we do that" 6 months down the line. +What does not need a changes file? + + * Bugfixes for code that hasn't shipped in any released version of Tor + Why use changes files instead of Git commit messages? * Git commit messages are written for developers, not users, and they @@ -149,6 +192,79 @@ old C functions. Use `strlcat`, `strlcpy`, or `tor_snprintf/tor_asprintf` inste We don't call `memcmp()` directly. Use `fast_memeq()`, `fast_memneq()`, `tor_memeq()`, or `tor_memneq()` for most purposes. +Also see a longer list of functions to avoid in: +https://people.torproject.org/~nickm/tor-auto/internal/this-not-that.html + +Floating point math is hard +--------------------------- + +Floating point arithmetic as typically implemented by computers is +very counterintuitive. Failure to adequately analyze floating point +usage can result in surprising behavior and even security +vulnerabilities! + +General advice: + + - Don't use floating point. + - If you must use floating point, document how the limits of + floating point precision and calculation accuracy affect function + outputs. + - Try to do as much as possible of your calculations using integers + (possibly acting as fixed-point numbers) and convert to floating + point for display. + - If you must send floating point numbers on the wire, serialize + them in a platform-independent way. Tor avoids exchanging + floating-point values, but when it does, it uses ASCII numerals, + with a decimal point ("."). + - Binary fractions behave very differently from decimal fractions. + Make sure you understand how these differences affect your + calculations. + - Every floating point arithmetic operation is an opportunity to + lose precision, overflow, underflow, or otherwise produce + undesired results. Addition and subtraction tend to be worse + than multiplication and division (due to things like catastrophic + cancellation). Try to arrange your calculations to minimize such + effects. + - Changing the order of operations changes the results of many + floating-point calculations. Be careful when you simplify + calculations! If the order is significant, document it using a + code comment. + - Comparing most floating point values for equality is unreliable. + Avoid using `==`, instead, use `>=` or `<=`. If you use an + epsilon value, make sure it's appropriate for the ranges in + question. + - Different environments (including compiler flags and per-thread + state on a single platform!) can get different results from the + same floating point calculations. This means you can't use + floats in anything that needs to be deterministic, like consensus + generation. This also makes reliable unit tests of + floating-point outputs hard to write. + +For additional useful advice (and a little bit of background), see +[What Every Programmer Should Know About Floating-Point +Arithmetic](http://floating-point-gui.de/). + +A list of notable (and surprising) facts about floating point +arithmetic is at [Floating-point +complexities](https://randomascii.wordpress.com/2012/04/05/floating-point-complexities/). +Most of that [series of posts on floating +point](https://randomascii.wordpress.com/category/floating-point/) is +helpful. + +For more detailed (and math-intensive) background, see [What Every +Computer Scientist Should Know About Floating-Point +Arithmetic](https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html). + +Other C conventions +------------------- + +The `a ? b : c` trinary operator only goes inside other expressions; +don't use it as a replacement for if. (You can ignore this inside macro +definitions when necessary.) + +Assignment operators shouldn't nest inside other expressions. (You can +ignore this inside macro definitions when necessary.) + Functions not to write ---------------------- @@ -210,7 +326,25 @@ end-users that they aren't expected to understand the message (perhaps with a string like "internal error"). Option (A) is to be preferred to option (B). +Assertions In Tor +----------------- + +Assertions should be used for bug-detection only. Don't use assertions to +detect bad user inputs, network errors, resource exhaustion, or similar +issues. + +Tor is always built with assertions enabled, so try to only use +`tor_assert()` for cases where you are absolutely sure that crashing is the +least bad option. Many bugs have been caused by use of `tor_assert()` when +another kind of check would have been safer. + +If you're writing an assertion to test for a bug that you _can_ recover from, +use `tor_assert_nonfatal()` in place of `tor_assert()`. If you'd like to +write a conditional that incorporates a nonfatal assertion, use the `BUG()` +macro, as in: + if (BUG(ptr == NULL)) + return -1; Doxygen comment conventions --------------------------- diff --git a/doc/HACKING/CodingStandardsRust.md b/doc/HACKING/CodingStandardsRust.md new file mode 100644 index 0000000000..d0b17c1604 --- /dev/null +++ b/doc/HACKING/CodingStandardsRust.md @@ -0,0 +1,469 @@ + + Rust Coding Standards +======================= + +You MUST follow the standards laid out in `.../doc/HACKING/CodingStandards.md`, +where applicable. + + Module/Crate Declarations +--------------------------- + +Each Tor C module which is being rewritten MUST be in its own crate. +See the structure of `.../src/rust` for examples. + +In your crate, you MUST use `lib.rs` ONLY for pulling in external +crates (e.g. `extern crate libc;`) and exporting public objects from +other Rust modules (e.g. `pub use mymodule::foo;`). For example, if +you create a crate in `.../src/rust/yourcrate`, your Rust code should +live in `.../src/rust/yourcrate/yourcode.rs` and the public interface +to it should be exported in `.../src/rust/yourcrate/lib.rs`. + +If your code is to be called from Tor C code, you MUST define a safe +`ffi.rs`. See the "Safety" section further down for more details. + +For example, in a hypothetical `tor_addition` Rust module: + +In `.../src/rust/tor_addition/addition.rs`: + + pub fn get_sum(a: i32, b: i32) -> i32 { + a + b + } + +In `.../src/rust/tor_addition/lib.rs`: + + pub use addition::*; + +In `.../src/rust/tor_addition/ffi.rs`: + + #[no_mangle] + pub extern "C" fn tor_get_sum(a: c_int, b: c_int) -> c_int { + get_sum(a, b) + } + +If your Rust code must call out to parts of Tor's C code, you must +declare the functions you are calling in the `external` crate, located +at `.../src/rust/external`. + +<!-- XXX get better examples of how to declare these externs, when/how they --> +<!-- XXX are unsafe, what they are expected to do —isis --> + +Modules should strive to be below 500 lines (tests excluded). Single +responsibility and limited dependencies should be a guiding standard. + +If you have any external modules as dependencies (e.g. `extern crate +libc;`), you MUST declare them in your crate's `lib.rs` and NOT in any +other module. + + Dependencies +-------------- + +In general, we use modules from only the Rust standard library +whenever possible. We will review including external crates on a +case-by-case basis. + + Documentation +--------------- + +You MUST include `#[deny(missing_docs)]` in your crate. + +For function/method comments, you SHOULD include a one-sentence, "first person" +description of function behaviour (see requirements for documentation as +described in `.../src/HACKING/CodingStandards.md`), then an `# Inputs` section +for inputs or initialisation values, a `# Returns` section for return +values/types, a `# Warning` section containing warnings for unsafe behaviours or +panics that could happen. For publicly accessible +types/constants/objects/functions/methods, you SHOULD also include an +`# Examples` section with runnable doctests. + +You MUST document your module with _module docstring_ comments, +i.e. `//!` at the beginning of each line. + + Style +------- + +You SHOULD consider breaking up large literal numbers with `_` when it makes it +more human readable to do so, e.g. `let x: u64 = 100_000_000_000`. + + Testing +--------- + +All code MUST be unittested and integration tested. + +Public functions/objects exported from a crate SHOULD include doctests +describing how the function/object is expected to be used. + +Integration tests SHOULD go into a `tests/` directory inside your +crate. Unittests SHOULD go into their own module inside the module +they are testing, e.g. in `.../src/rust/tor_addition/addition.rs` you +should put: + + #[cfg(test)] + mod test { + use super::*; + + #[test] + fn addition_with_zero() { + let sum: i32 = get_sum(5i32, 0i32); + assert_eq!(sum, 5); + } + } + + Benchmarking +-------------- + +The external `test` crate can be used for most benchmarking. However, using +this crate requires nightly Rust. Since we may want to switch to a more +stable Rust compiler eventually, we shouldn't do things which will automatically +break builds for stable compilers. Therefore, you MUST feature-gate your +benchmarks in the following manner. + +If you wish to benchmark some of your Rust code, you MUST put the +following in the `[features]` section of your crate's `Cargo.toml`: + + [features] + bench = [] + +Next, in your crate's `lib.rs` you MUST put: + + #[cfg(all(test, feature = "bench"))] + extern crate test; + +This ensures that the external crate `test`, which contains utilities +for basic benchmarks, is only used when running benchmarks via `cargo +bench --features bench`. + +Finally, to write your benchmark code, in +`.../src/rust/tor_addition/addition.rs` you SHOULD put: + + #[cfg(all(test, features = "bench"))] + mod bench { + use test::Bencher; + use super::*; + + #[bench] + fn addition_small_integers(b: &mut Bencher) { + b.iter(| | get_sum(5i32, 0i32)); + } + } + + Fuzzing +--------- + +If you wish to fuzz parts of your code, please see the +[`cargo fuzz`](https://github.com/rust-fuzz/cargo-fuzz) crate, which uses +[libfuzzer-sys](https://github.com/rust-fuzz/libfuzzer-sys). + + Whitespace & Formatting +------------------------- + +You MUST run `rustfmt` (https://github.com/rust-lang-nursery/rustfmt) +on your code before your code will be merged. You can install rustfmt +by doing `cargo install rustfmt-nightly` and then run it with `cargo +fmt`. + + Safety +-------- + +You SHOULD read [the nomicon](https://doc.rust-lang.org/nomicon/) before writing +Rust FFI code. It is *highly advised* that you read and write normal Rust code +before attempting to write FFI or any other unsafe code. + +Here are some additional bits of advice and rules: + +0. Any behaviours which Rust considers to be undefined are forbidden + + From https://doc.rust-lang.org/reference/behavior-considered-undefined.html: + + > Behavior considered undefined + > + > The following is a list of behavior which is forbidden in all Rust code, + > including within unsafe blocks and unsafe functions. Type checking provides the + > guarantee that these issues are never caused by safe code. + > + > * Data races + > * Dereferencing a null/dangling raw pointer + > * Reads of [undef](http://llvm.org/docs/LangRef.html#undefined-values) + > (uninitialized) memory + > * Breaking the + > [pointer aliasing rules](http://llvm.org/docs/LangRef.html#pointer-aliasing-rules) + > with raw pointers (a subset of the rules used by C) + > * `&mut T` and `&T` follow LLVM’s scoped noalias model, except if the `&T` + > contains an `UnsafeCell<U>`. Unsafe code must not violate these aliasing + > guarantees. + > * Mutating non-mutable data (that is, data reached through a shared + > reference or data owned by a `let` binding), unless that data is + > contained within an `UnsafeCell<U>`. + > * Invoking undefined behavior via compiler intrinsics: + > - Indexing outside of the bounds of an object with + > `std::ptr::offset` (`offset` intrinsic), with the exception of + > one byte past the end which is permitted. + > - Using `std::ptr::copy_nonoverlapping_memory` (`memcpy32`/`memcpy64` + > intrinsics) on overlapping buffers + > * Invalid values in primitive types, even in private fields/locals: + > - Dangling/null references or boxes + > - A value other than `false` (0) or `true` (1) in a `bool` + > - A discriminant in an `enum` not included in the type definition + > - A value in a `char` which is a surrogate or above `char::MAX` + > - Non-UTF-8 byte sequences in a `str` + > * Unwinding into Rust from foreign code or unwinding from Rust into foreign + > code. Rust's failure system is not compatible with exception handling in other + > languages. Unwinding must be caught and handled at FFI boundaries. + +1. `unwrap()` + + If you call `unwrap()`, anywhere, even in a test, you MUST include + an inline comment stating how the unwrap will either 1) never fail, + or 2) should fail (i.e. in a unittest). + + You SHOULD NOT use `unwrap()` anywhere in which it is possible to handle the + potential error with either `expect()` or the eel operator, `?`. + For example, consider a function which parses a string into an integer: + + fn parse_port_number(config_string: &str) -> u16 { + u16::from_str_radix(config_string, 10).unwrap() + } + + There are numerous ways this can fail, and the `unwrap()` will cause the + whole program to byte the dust! Instead, either you SHOULD use `expect()` + (or another equivalent function which will return an `Option` or a `Result`) + and change the return type to be compatible: + + fn parse_port_number(config_string: &str) -> Option<u16> { + u16::from_str_radix(config_string, 10).expect("Couldn't parse port into a u16") + } + + or you SHOULD use `or()` (or another similar method): + + fn parse_port_number(config_string: &str) -> Option<u16> { + u16::from_str_radix(config_string, 10).or(Err("Couldn't parse port into a u16") + } + + Using methods like `or()` can be particularly handy when you must do + something afterwards with the data, for example, if we wanted to guarantee + that the port is high. Combining these methods with the eel operator (`?`) + makes this even easier: + + fn parse_port_number(config_string: &str) -> Result<u16, Err> { + let port = u16::from_str_radix(config_string, 10).or(Err("Couldn't parse port into a u16"))?; + + if port > 1024 { + return Ok(port); + } else { + return Err("Low ports not allowed"); + } + } + +2. `unsafe` + + If you use `unsafe`, you MUST describe a contract in your + documentation which describes how and when the unsafe code may + fail, and what expectations are made w.r.t. the interfaces to + unsafe code. This is also REQUIRED for major pieces of FFI between + C and Rust. + + When creating an FFI in Rust for C code to call, it is NOT REQUIRED + to declare the entire function `unsafe`. For example, rather than doing: + + #[no_mangle] + pub unsafe extern "C" fn increment_and_combine_numbers(mut numbers: [u8; 4]) -> u32 { + for number in &mut numbers { + *number += 1; + } + std::mem::transmute::<[u8; 4], u32>(numbers) + } + + You SHOULD instead do: + + #[no_mangle] + pub extern "C" fn increment_and_combine_numbers(mut numbers: [u8; 4]) -> u32 { + for index in 0..numbers.len() { + numbers[index] += 1; + } + unsafe { + std::mem::transmute::<[u8; 4], u32>(numbers) + } + } + +3. Pass only integer types and bytes over the boundary + + The only non-integer type which may cross the FFI boundary is + bytes, e.g. `&[u8]`. This SHOULD be done on the Rust side by + passing a pointer (`*mut libc::c_char`) and a length + (`libc::size_t`). + + One might be tempted to do this via doing + `CString::new("blah").unwrap().into_raw()`. This has several problems: + + a) If you do `CString::new("bl\x00ah")` then the unwrap() will fail + due to the additional NULL terminator, causing a dangling + pointer to be returned (as well as a potential use-after-free). + + b) Returning the raw pointer will cause the CString to run its deallocator, + which causes any C code which tries to access the contents to dereference a + NULL pointer. + + c) If we were to do `as_raw()` this would result in a potential double-free + since the Rust deallocator would run and possibly Tor's deallocator. + + d) Calling `into_raw()` without later using the same pointer in Rust to call + `from_raw()` and then deallocate in Rust can result in a + [memory leak](https://doc.rust-lang.org/std/ffi/struct.CString.html#method.into_raw). + + [It was determined](https://github.com/rust-lang/rust/pull/41074) that this + is safe to do if you use the same allocator in C and Rust and also specify + the memory alignment for CString (except that there is no way to specify + the alignment for CString). It is believed that the alignment is always 1, + which would mean it's safe to dealloc the resulting `*mut c_char` in Tor's + C code. However, the Rust developers are not willing to guarantee the + stability of, or a contract for, this behaviour, citing concerns that this + is potentially extremely and subtly unsafe. + +4. Perform an allocation on the other side of the boundary + + After crossing the boundary, the other side MUST perform an + allocation to copy the data and is therefore responsible for + freeing that memory later. + +5. No touching other language's enums + + Rust enums should never be touched from C (nor can they be safely + `#[repr(C)]`) nor vice versa: + + > "The chosen size is the default enum size for the target platform's C + > ABI. Note that enum representation in C is implementation defined, so this is + > really a "best guess". In particular, this may be incorrect when the C code + > of interest is compiled with certain flags." + + (from https://gankro.github.io/nomicon/other-reprs.html) + +6. Type safety + + Wherever possible and sensical, you SHOULD create new types in a + manner which prevents type confusion or misuse. For example, + rather than using an untyped mapping between strings and integers + like so: + + use std::collections::HashMap; + + pub fn get_elements_with_over_9000_points(map: &HashMap<String, usize>) -> Vec<String> { + ... + } + + It would be safer to define a new type, such that some other usage + of `HashMap<String, usize>` cannot be confused for this type: + + pub struct DragonBallZPowers(pub HashMap<String, usize>); + + impl DragonBallZPowers { + pub fn over_nine_thousand<'a>(&'a self) -> Vec<&'a String> { + let mut powerful_enough: Vec<&'a String> = Vec::with_capacity(5); + + for (character, power) in &self.0 { + if *power > 9000 { + powerful_enough.push(character); + } + } + powerful_enough + } + } + + Note the following code, which uses Rust's type aliasing, is valid + but it does NOT meet the desired type safety goals: + + pub type Power = usize; + + pub fn over_nine_thousand(power: &Power) -> bool { + if *power > 9000 { + return true; + } + false + } + + // We can still do the following: + let his_power: usize = 9001; + over_nine_thousand(&his_power); + +7. Unsafe mucking around with lifetimes + + Because lifetimes are technically, in type theory terms, a kind, i.e. a + family of types, individual lifetimes can be treated as types. For example, + one can arbitrarily extend and shorten lifetime using `std::mem::transmute`: + + struct R<'a>(&'a i32); + + unsafe fn extend_lifetime<'b>(r: R<'b>) -> R<'static> { + std::mem::transmute::<R<'b>, R<'static>>(r) + } + + unsafe fn shorten_invariant_lifetime<'b, 'c>(r: &'b mut R<'static>) -> &'b mut R<'c> { + std::mem::transmute::<&'b mut R<'static>, &'b mut R<'c>>(r) + } + + Calling `extend_lifetime()` would cause an `R` passed into it to live forever + for the life of the program (the `'static` lifetime). Similarly, + `shorten_invariant_lifetime()` could be used to take something meant to live + forever, and cause it to disappear! This is incredibly unsafe. If you're + going to be mucking around with lifetimes like this, first, you better have + an extremely good reason, and second, you may as be honest and explicit about + it, and for ferris' sake just use a raw pointer. + + In short, just because lifetimes can be treated like types doesn't mean you + should do it. + +8. Doing excessively unsafe things when there's a safer alternative + + Similarly to #7, often there are excessively unsafe ways to do a task and a + simpler, safer way. You MUST choose the safer option where possible. + + For example, `std::mem::transmute` can be abused in ways where casting with + `as` would be both simpler and safer: + + // Don't do this + let ptr = &0; + let ptr_num_transmute = unsafe { std::mem::transmute::<&i32, usize>(ptr)}; + + // Use an `as` cast instead + let ptr_num_cast = ptr as *const i32 as usize; + + In fact, using `std::mem::transmute` for *any* reason is a code smell and as + such SHOULD be avoided. + +9. Casting integers with `as` + + This is generally fine to do, but it has some behaviours which you should be + aware of. Casting down chops off the high bits, e.g.: + + let x: u32 = 4294967295; + println!("{}", x as u16); // prints 65535 + + Some cases which you MUST NOT do include: + + * Casting an `u128` down to an `f32` or vice versa (e.g. + `u128::MAX as f32` but this isn't only a problem with overflowing + as it is also undefined behaviour for `42.0f32 as u128`), + + * Casting between integers and floats when the thing being cast + cannot fit into the type it is being casted into, e.g.: + + println!("{}", 42949.0f32 as u8); // prints 197 in debug mode and 0 in release + println!("{}", 1.04E+17 as u8); // prints 0 in both modes + println!("{}", (0.0/0.0) as i64); // prints whatever the heck LLVM wants + + Because this behaviour is undefined, it can even produce segfaults in + safe Rust code. For example, the following program built in release + mode segfaults: + + #[inline(never)] + pub fn trigger_ub(sl: &[u8; 666]) -> &[u8] { + // Note that the float is out of the range of `usize`, invoking UB when casting. + let idx = 1e99999f64 as usize; + &sl[idx..] // The bound check is elided due to `idx` being of an undefined value. + } + + fn main() { + println!("{}", trigger_ub(&[1; 666])[999999]); // ~ out of bound + } + + And in debug mode panics with: + + thread 'main' panicked at 'slice index starts at 140721821254240 but ends at 666', /checkout/src/libcore/slice/mod.rs:754:4 diff --git a/doc/HACKING/Fuzzing.md b/doc/HACKING/Fuzzing.md new file mode 100644 index 0000000000..2039d6a4c0 --- /dev/null +++ b/doc/HACKING/Fuzzing.md @@ -0,0 +1,123 @@ += Fuzzing Tor + +== The simple version (no fuzzing, only tests) + +Check out fuzzing-corpora, and set TOR_FUZZ_CORPORA to point to the place +where you checked it out. + +To run the fuzzing test cases in a deterministic fashion, use: + make test-fuzz-corpora + +This won't actually fuzz Tor! It will just run all the fuzz binaries +on our existing set of testcases for the fuzzer. + + +== Different kinds of fuzzing + +Right now we support three different kinds of fuzzer. + +First, there's American Fuzzy Lop (AFL), a fuzzer that works by forking +a target binary and passing it lots of different inputs on stdin. It's the +trickiest one to set up, so I'll be describing it more below. + +Second, there's libFuzzer, a llvm-based fuzzer that you link in as a library, +and it runs a target function over and over. To use this one, you'll need to +have a reasonably recent clang and libfuzzer installed. At that point, you +just build with --enable-expensive-hardening and --enable-libfuzzer. That +will produce a set of binaries in src/test/fuzz/lf-fuzz-* . These programs +take as input a series of directories full of fuzzing examples. For more +information on libfuzzer, see http://llvm.org/docs/LibFuzzer.html + +Third, there's Google's OSS-Fuzz infrastructure, which expects to get all of +its. For more on this, see https://github.com/google/oss-fuzz and the +projects/tor subdirectory. You'll need to mess around with Docker a bit to +test this one out; it's meant to run on Google's infrastructure. + +In all cases, you'll need some starting examples to give the fuzzer when it +starts out. There's a set in the "fuzzing-corpora" git repository. Try +setting TOR_FUZZ_CORPORA to point to a checkout of that repository + +== Writing Tor fuzzers + +A tor fuzzing harness should have: +* a fuzz_init() function to set up any necessary global state. +* a fuzz_main() function to receive input and pass it to a parser. +* a fuzz_cleanup() function to clear global state. + +Most fuzzing frameworks will produce many invalid inputs - a tor fuzzing +harness should rejecting invalid inputs without crashing or behaving badly. + +But the fuzzing harness should crash if tor fails an assertion, triggers a +bug, or accesses memory it shouldn't. This helps fuzzing frameworks detect +"interesting" cases. + + +== Guided Fuzzing with AFL + +There is no HTTPS, hash, or signature for American Fuzzy Lop's source code, so +its integrity can't be verified. That said, you really shouldn't fuzz on a +machine you care about, anyway. + +To Build: + Get AFL from http://lcamtuf.coredump.cx/afl/ and unpack it + cd afl + make + cd ../tor + PATH=$PATH:../afl/ CC="../afl/afl-gcc" ./configure --enable-expensive-hardening + AFL_HARDEN=1 make clean fuzzers + +To Find The ASAN Memory Limit: (64-bit only) + +On 64-bit platforms, afl needs to know how much memory ASAN uses, +because ASAN tends to allocate a ridiculous amount of virtual memory, +and then not actually use it. + +Read afl/docs/notes_for_asan.txt for more details. + + Download recidivm from http://jwilk.net/software/recidivm + Download the signature + Check the signature + tar xvzf recidivm*.tar.gz + cd recidivm* + make + /path/to/recidivm -v src/test/fuzz/fuzz-http + Use the final "ok" figure as the input to -m when calling afl-fuzz + (Normally, recidivm would output a figure automatically, but in some cases, + the fuzzing harness will hang when the memory limit is too small.) + +You could also just say "none" instead of the memory limit below, if you +don't care about memory limits. + + +To Run: + mkdir -p src/test/fuzz/fuzz_http_findings + ../afl/afl-fuzz -i ${TOR_FUZZ_CORPORA}/http -o src/test/fuzz/fuzz_http_findings -m <asan-memory-limit> -- src/test/fuzz/fuzz-http + + +AFL has a multi-core mode, check the documentation for details. +You might find the included fuzz-multi.sh script useful for this. + +macOS (OS X) requires slightly more preparation, including: +* using afl-clang (or afl-clang-fast from the llvm directory) +* disabling external crash reporting (AFL will guide you through this step) + +== Triaging Issues + +Crashes are usually interesting, particularly if using AFL_HARDEN=1 and --enable-expensive-hardening. Sometimes crashes are due to bugs in the harness code. + +Hangs might be interesting, but they might also be spurious machine slowdowns. +Check if a hang is reproducible before reporting it. Sometimes, processing +valid inputs may take a second or so, particularly with the fuzzer and +sanitizers enabled. + +To see what fuzz-http is doing with a test case, call it like this: + src/test/fuzz/fuzz-http --debug < /path/to/test.case + +(Logging is disabled while fuzzing to increase fuzzing speed.) + +== Reporting Issues + +Please report any issues discovered using the process in Tor's security issue +policy: + +https://trac.torproject.org/projects/tor/wiki/org/meetings/2016SummerDevMeeting/Notes/SecurityIssuePolicy diff --git a/doc/HACKING/GettingStarted.md b/doc/HACKING/GettingStarted.md index 0295adc1ff..0c42404634 100644 --- a/doc/HACKING/GettingStarted.md +++ b/doc/HACKING/GettingStarted.md @@ -11,8 +11,9 @@ whole Tor ecosystem.) If you are looking for a more bare-bones, less user-friendly information -dump of important information, you might like reading doc/HACKING -instead. You should probably read it before you write your first patch. +dump of important information, you might like reading the "torguts" +documents linked to below. You should probably read it before you write +your first patch. Required background diff --git a/doc/HACKING/GettingStartedRust.md b/doc/HACKING/GettingStartedRust.md new file mode 100644 index 0000000000..a5253b46a6 --- /dev/null +++ b/doc/HACKING/GettingStartedRust.md @@ -0,0 +1,161 @@ + + Hacking on Rust in Tor +======================== + + Getting Started +----------------- + +Please read or review our documentation on Rust coding standards +(`.../doc/HACKING/CodingStandardsRust.md`) before doing anything. + +Please also read +[the Rust Code of Conduct](https://www.rust-lang.org/en-US/conduct.html). We aim +to follow the good example set by the Rust community and be excellent to one +another. Let's be careful with each other, so we can be memory-safe together! + +Next, please contact us before rewriting anything! Rust in Tor is still an +experiment. It is an experiment that we very much want to see succeed, so we're +going slowly and carefully. For the moment, it's also a completely +volunteer-driven effort: while many, if not most, of us are paid to work on Tor, +we are not yet funded to write Rust code for Tor. Please be patient with the +other people who are working on getting more Rust code into Tor, because they +are graciously donating their free time to contribute to this effort. + + Resources for learning Rust +----------------------------- + +**Beginning resources** + +The primary resource for learning Rust is +[The Book](https://doc.rust-lang.org/book/). If you'd like to start writing +Rust immediately, without waiting for anything to install, there is +[an interactive browser-based playground](https://play.rust-lang.org/). + +**Advanced resources** + +If you're interested in playing with various Rust compilers and viewing a very +nicely displayed output of the generated assembly, there is +[the Godbolt compiler explorer](https://rust.godbolt.org/) + +For learning how to write unsafe Rust, read +[The Rustonomicon](https://doc.rust-lang.org/nomicon/). + +For learning everything you ever wanted to know about Rust macros, there is +[The Little Book of Rust Macros](https://danielkeep.github.io/tlborm/book/index.html). + +For learning more about FFI and Rust, see Jake Goulding's +[Rust FFI Omnibus](http://jakegoulding.com/rust-ffi-omnibus/). + + Compiling Tor with Rust enabled +--------------------------------- + +You will need to run the `configure` script with the `--enable-rust` flag to +explicitly build with Rust. Additionally, you will need to specify where to +fetch Rust dependencies, as we allow for either fetching dependencies from Cargo +or specifying a local directory. + +**Fetch dependencies from Cargo** + + ./configure --enable-rust --enable-cargo-online-mode + +**Using a local dependency cache** + +**NOTE**: local dependency caches which were not *originally* created via + `--enable-cargo-online-mode` are broken. See https://bugs.torproject.org/22907 + +To specify a local directory: + + RUST_DEPENDENCIES='path_to_dependencies_directory' ./configure --enable-rust + +(Note that RUST_DEPENDENCIES must be the full path to the directory; it cannot +be relative.) + +You'll need the following Rust dependencies (as of this writing): + + libc==0.2.22 + +To get them, do: + + mkdir path_to_dependencies_directory + cd path_to_dependencies_directory + git clone https://github.com/rust-lang/libc + cd libc + git checkout 0.2.22 + cargo package + cd .. + ln -s libc/target/package/libc-0.2.22 libc-0.2.22 + + + Identifying which modules to rewrite +====================================== + +The places in the Tor codebase that are good candidates for porting to Rust are: + +1. loosely coupled to other Tor submodules, +2. have high test coverage, and +3. would benefit from being implemented in a memory safe language. + +Help in either identifying places such as this, or working to improve existing +areas of the C codebase by adding regression tests and simplifying dependencies, +would be really helpful. + +Furthermore, as submodules in C are implemented in Rust, this is a good +opportunity to refactor, add more tests, and split modules into smaller areas of +responsibility. + +A good first step is to build a module-level callgraph to understand how +interconnected your target module is. + + git clone https://git.torproject.org/user/nickm/calltool.git + cd tor + CFLAGS=0 ./configure + ../calltool/src/main.py module_callgraph + +The output will tell you each module name, along with a set of every module that +the module calls. Modules which call fewer other modules are better targets. + + Writing your Rust module +========================== + +Strive to change the C API as little as possible. + +We are currently targetting Rust nightly, *for now*. We expect this to change +moving forward, as we understand more about which nightly features we need. It +is on our TODO list to try to cultivate good standing with various distro +maintainers of `rustc` and `cargo`, in order to ensure that whatever version we +solidify on is readily available. + + Adding your Rust module to Tor's build system +----------------------------------------------- + +0. Your translation of the C module should live in its own crate(s) + in the `.../tor/src/rust/` directory. +1. Add your crate to `.../tor/src/rust/Cargo.toml`, in the + `[workspace.members]` section. +2. Append your crate's static library to the `rust_ldadd` definition + (underneath `if USE_RUST`) in `.../tor/Makefile.am`. + + How to test your Rust code +---------------------------- + +Everything should be tested full stop. Even non-public functionality. + +Be sure to edit `.../tor/src/test/test_rust.sh` to add the name of your crate to +the `crates` variable! This will ensure that `cargo test` is run on your crate. + +Configure Tor's build system to build with Rust enabled: + + ./configure --enable-fatal-warnings --enable-rust --enable-cargo-online-mode + +Tor's test should be run by doing: + + make check + +Tor's integration tests should also pass: + + make test-stem + + Submitting a patch +===================== + +Please follow the instructions in `.../doc/HACKING/GettingStarted.md`. diff --git a/doc/HACKING/HelpfulTools.md b/doc/HACKING/HelpfulTools.md index a7f36e6c7e..f919d08ec1 100644 --- a/doc/HACKING/HelpfulTools.md +++ b/doc/HACKING/HelpfulTools.md @@ -111,15 +111,19 @@ Running gcov for unit test coverage (On OSX, you'll need to start with `--enable-coverage CC=clang`.) -Then, look at the .gcov files in `coverage-output`. '-' before a line means -that the compiler generated no code for that line. '######' means that the -line was never reached. Lines with numbers were called that number of times. - If that doesn't work: * Try configuring Tor with `--disable-gcc-hardening` * You might need to run `make clean` after you run `./configure`. +Then, look at the .gcov files in `coverage-output`. '-' before a line means +that the compiler generated no code for that line. '######' means that the +line was never reached. Lines with numbers were called that number of times. + +For more details about how to read gcov output, see the [Invoking +gcov](https://gcc.gnu.org/onlinedocs/gcc/Invoking-Gcov.html) chapter +of the GCC manual. + If you make changes to Tor and want to get another set of coverage results, you can run `make reset-gcov` to clear the intermediary gcov output. @@ -128,9 +132,13 @@ a meaningful diff between them, you can run: ./scripts/test/cov-diff coverage-output1 coverage-output2 | less -In this diff, any lines that were visited at least once will have coverage -"1". This lets you inspect what you (probably) really want to know: which -untested lines were changed? Are there any new untested lines? +In this diff, any lines that were visited at least once will have coverage "1", +and line numbers are deleted. This lets you inspect what you (probably) really +want to know: which untested lines were changed? Are there any new untested +lines? + +If you run ./scripts/test/cov-exclude, it marks excluded unreached +lines with 'x', and excluded reached lines with '!!!'. Running integration tests ------------------------- @@ -142,6 +150,12 @@ run `make test-network`. We also have scripts to run integration tests using Stem. To try them, set `STEM_SOURCE_DIR` to your Stem source directory, and run `test-stem`. +Profiling Tor +------------- + +Ongoing notes about Tor profiling can be found at +https://pad.riseup.net/p/profiling-tor + Profiling Tor with oprofile --------------------------- @@ -168,20 +182,62 @@ Here are some basic instructions * `opreport -l that_dir/*` - Profit +Profiling Tor with perf +----------------------- + +This works with a running Tor, and requires root. + +1. Decide how long you want to profile for. Start with (say) 30 seconds. If that + works, try again with longer times. + +2. Find the PID of your running tor process. + +3. Run `perf record --call-graph dwarf -p <PID> sleep <SECONDS>` + + (You may need to do this as root.) + + You might need to add `-e cpu-clock` as an option to the perf record line + above, if you are on an older CPU without access to hardware profiling + events, or in a VM, or something. + +4. Now you have a perf.data file. Have a look at it with `perf report + --no-children --sort symbol,dso` or `perf report --no-children --sort + symbol,dso --stdio --header`. How does it look? + +5a. Once you have a nice big perf.data file, you can compress it, encrypt it, + and send it to your favorite Tor developers. + +5b. Or maybe you'd rather not send a nice big perf.data file. Who knows what's + in that!? It's kinda scary. To generate a less scary file, you can use `perf + report -g > <FILENAME>.out`. Then you can compress that and put it somewhere + public. + +Profiling Tor with gperftools aka Google-performance-tools +---------------------------------------------------------- + +This should work on nearly any unixy system. It doesn't seem to be compatible +with RunAsDaemon though. + +Beforehand, install google-perftools. + +1. You need to rebuild Tor, hack the linking steps to add `-lprofiler` to the + libs. You can do this by adding `LIBS=-lprofiler` when you call `./configure`. + +Now you can run Tor with profiling enabled, and use the pprof utility to look at +performance! See the gperftools manual for more info, but basically: + +2. Run `env CPUPROFILE=/tmp/profile src/or/tor -f <path/torrc>`. The profile file + is not written to until Tor finishes execuction. + +3. Run `pprof src/or/tor /tm/profile` to start the REPL. + Generating and analyzing a callgraph ------------------------------------ -1. Run `./scripts/maint/generate_callgraph.sh`. This will generate a - bunch of files in a new ./callgraph directory. - -2. Run `./scripts/maint/analyze_callgraph.py callgraph/src/*/*`. This - will do a lot of graph operations and then dump out a new - `callgraph.pkl` file, containing data in Python's 'pickle' format. +0. Build Tor on linux or mac, ideally with -O0 or -fno-inline. -3. Run `./scripts/maint/display_callgraph.py`. It will display: - - the number of functions reachable from each function. - - all strongly-connnected components in the Tor callgraph - - the largest bottlenecks in the largest SCC in the Tor callgraph. +1. Clone 'https://gitweb.torproject.org/user/nickm/calltool.git/' . + Follow the README in that repository. Note that currently the callgraph generator can't detect calls that pass through function pointers. diff --git a/doc/HACKING/HowToReview.md b/doc/HACKING/HowToReview.md index d53318942f..2d1f3d1c9e 100644 --- a/doc/HACKING/HowToReview.md +++ b/doc/HACKING/HowToReview.md @@ -19,6 +19,8 @@ Top-level smell-checks - Does `make check-spaces` pass? +- Does `make check-changes` pass? + - Does it have a reasonable amount of tests? Do they pass? Do they leak memory? @@ -32,6 +34,7 @@ Top-level smell-checks - If this changes Tor's behavior on the wire, is there a design proposal? +- If this changes anything in the code, is there a "changes" file? Let's look at the code! diff --git a/doc/HACKING/ReleasingTor.md b/doc/HACKING/ReleasingTor.md index 7595398241..62029b44f0 100644 --- a/doc/HACKING/ReleasingTor.md +++ b/doc/HACKING/ReleasingTor.md @@ -8,7 +8,11 @@ new Tor release: === 0. Preliminaries 1. Get at least three of weasel/arma/Sebastian/Sina to put the new - version number in their approved versions list. + version number in their approved versions list. Give them a few + days to do this if you can. + +2. If this is going to be an important security release, give the packagers + some advance warning: See this list of packagers in IV.3 below. === I. Make sure it works @@ -26,39 +30,37 @@ new Tor release: What about Coverity Scan? - Is make check-spaces happy? + What about clan scan-build? - Does 'make distcheck' compain? + Does 'make distcheck' complain? - How about 'make test-stem' and 'make test-network'? + How about 'make test-stem' and 'make test-network' and + `make test-network-full`? - Are all those tests still happy with --enable-expensive-hardening ? Any memory leaks? -=== II. Write a changelog. +=== II. Write a changelog + +1a. (Alpha release variant) -1. Gather the `changes/*` files into a changelog entry, rewriting many + Gather the `changes/*` files into a changelog entry, rewriting many of them and reordering to focus on what users and funders would find interesting and understandable. - 1. Make sure that everything that wants a bug number has one. - Make sure that everything which is a bugfix says what version - it was a bugfix on. - - 2. Concatenate them. + To do this, first run `./scripts/maint/lintChanges.py changes/*` and + fix as many warnings as you can. Then run `./scripts/maint/sortChanges.py + changes/* > changelog.in` to combine headings and sort the entries. + After that, it's time to hand-edit and fix the issues that lintChanges + can't find: - 3. Sort them by section. Within each section, sort by "version it's - a bugfix on", else by numerical ticket order. + 1. Within each section, sort by "version it's a bugfix on", else by + numerical ticket order. - 4. Clean them up: - - Standard idioms: - `Fixes bug 9999; bugfix on 0.3.3.3-alpha.` - - One space after a period. + 2. Clean them up: Make stuff very terse @@ -86,19 +88,32 @@ new Tor release: maint-0.2.1), be sure to make the stanzas identical (so people can distinguish if these are the same change). - 5. Merge them in. + 3. Clean everything one last time. + + 4. Run `./scripts/maint/format_changelog.py --inplace` to make it prettier - 6. Clean everything one last time. +1b. (old-stable release variant) - 7. Run `./scripts/maint/format_changelog.py` to make it prettier. + For stable releases that backport things from later, we try to compose + their releases, we try to make sure that we keep the changelog entries + identical to their original versions, with a 'backport from 0.x.y.z' + note added to each section. So in this case, once you have the items + from the changes files copied together, don't use them to build a new + changelog: instead, look up the corrected versions that were merged + into ChangeLog in the master branch, and use those. 2. Compose a short release blurb to highlight the user-facing changes. Insert said release blurb into the ChangeLog stanza. If it's a stable release, add it to the ReleaseNotes file too. If we're adding - to a release-0.2.x branch, manually commit the changelogs to the later + to a release-* branch, manually commit the changelogs to the later git branches too. -3. If you're doing the first stable release in a series, you need to +3. If there are changes that require or suggest operator intervention + before or during the update, mail operators (either dirauth or relays + list) with a headline that indicates that an action is required or + appreciated. + +4. If you're doing the first stable release in a series, you need to create a ReleaseNotes for the series as a whole. To get started there, copy all of the Changelog entries from the series into a new file, and run `./scripts/maint/sortChanges.py` on it. That will @@ -111,51 +126,52 @@ new Tor release: === III. Making the source release. -1. In `maint-0.2.x`, bump the version number in `configure.ac` and run - `scripts/maint/updateVersions.pl` to update version numbers in other - places, and commit. Then merge `maint-0.2.x` into `release-0.2.x`. +1. In `maint-0.?.x`, bump the version number in `configure.ac` and run + `perl scripts/maint/updateVersions.pl` to update version numbers in other + places, and commit. Then merge `maint-0.?.x` into `release-0.?.x`. (NOTE: To bump the version number, edit `configure.ac`, and then run either `make`, or `perl scripts/maint/updateVersions.pl`, depending on your version.) -2. Make distcheck, put the tarball up somewhere, and tell `#tor` about - it. Wait a while to see if anybody has problems building it. Try to - get Sebastian or somebody to try building it on Windows. +2. Make distcheck, put the tarball up in somewhere (how about your + homedir on your homedir on people.torproject.org?) , and tell `#tor` + about it. Wait a while to see if anybody has problems building it. + (Though jenkins is usually pretty good about catching these things.) === IV. Commit, upload, announce 1. Sign the tarball, then sign and push the git tag: gpg -ba <the_tarball> - git tag -u <keyid> tor-0.2.x.y-status - git push origin tag tor-0.2.x.y-status + git tag -u <keyid> tor-0.3.x.y-status + git push origin tag tor-0.3.x.y-status 2. scp the tarball and its sig to the dist website, i.e. `/srv/dist-master.torproject.org/htdocs/` on dist-master. When you want it to go live, you run "static-update-component dist.torproject.org" on dist-master. - Edit `include/versions.wmi` and `Makefile` to note the new version. + In the webwml.git repository, `include/versions.wmi` and `Makefile` + to note the new version. (NOTE: Due to #17805, there can only be one stable version listed at once. Nonetheless, do not call your version "alpha" if it is stable, or people will get confused.) -3. Email the packagers (cc'ing tor-assistants) that a new tarball is up. +3. Email the packagers (cc'ing tor-team) that a new tarball is up. The current list of packagers is: - {weasel,gk,mikeperry} at torproject dot org - {blueness} at gentoo dot org - {paul} at invizbox dot io + - {vincent} at invizbox dot com - {lfleischer} at archlinux dot org - {Nathan} at freitas dot net - {mike} at tig dot as - - {tails-rm} at boum dot org (for pre-release announcments) - - - - {tails-dev} at boum dot org (for at-release announcements) - + - {tails-rm} at boum dot org + - {simon} at sdeziel.info + - {yuri} at rawbw.com 4. Add the version number to Trac. To do this, go to Trac, log in, select "Admin" near the top of the screen, then select "Versions" from @@ -164,22 +180,26 @@ new Tor release: 0.2.2.23-alpha" (or whatever the version is), and we select the date as the date in the ChangeLog. -5. Wait up to a day or two (for a development release), or until most - packages are up (for a stable release), and mail the release blurb and - changelog to tor-talk or tor-announce. +5. Mail the release blurb and ChangeLog to tor-talk (development release) or + tor-announce (stable). + + Post the changelog on the blog as well. You can generate a + blog-formatted version of the changelog with the -B option to + format-changelog. - (We might be moving to faster announcements, but don't announce until - the website is at least updated.) + When you post, include an estimate of when the next TorBrowser + releases will come out that include this Tor release. This will + usually track https://wiki.mozilla.org/RapidRelease/Calendar , but it + can vary. === V. Aftermath and cleanup -1. If it's a stable release, bump the version number in the `maint-x.y.z` - branch to "newversion-dev", and do a `merge -s ours` merge to avoid - taking that change into master. Do a similar `merge -s theirs` - merge to get the change (and only that change) into release. (Some - of the build scripts require that maint merge cleanly into release.) +1. If it's a stable release, bump the version number in the + `maint-x.y.z` branch to "newversion-dev", and do a `merge -s ours` + merge to avoid taking that change into master. 2. Forward-port the ChangeLog (and ReleaseNotes if appropriate). +3. Keep an eye on the blog post, to moderate comments and answer questions. diff --git a/doc/HACKING/Tracing.md b/doc/HACKING/Tracing.md new file mode 100644 index 0000000000..a5fb5165e2 --- /dev/null +++ b/doc/HACKING/Tracing.md @@ -0,0 +1,91 @@ +# Tracing # + +This document describes how the event tracing subsystem works in tor so +developers can add events to the code base but also hook them to an event +tracing framework. + +## Basics ### + +Event tracing is seperated in two concepts, trace events and a tracer. The +tracing subsystem can be found in `src/trace`. The `events.h` header file is +the main file that maps the different tracers to trace events. + +### Events ### + +A trace event is basically a function from which we can pass any data that +we want to collect. In addition, we specify a context for the event such as +a subsystem and an event name. + +A trace event in tor has the following standard format: + + tor_trace(subsystem, event\_name, args...) + +The `subsystem` parameter is the name of the subsytem the trace event is in. +For example that could be "scheduler" or "vote" or "hs". The idea is to add +some context to the event so when we collect them we know where it's coming +from. The `event_name` is the name of the event which helps a lot with +adding some semantic to the event. Finally, `args` is any number of +arguments we want to collect. + +Here is an example of a possible tracepoint in main(): + + tor_trace(main, init_phase, argc) + +The above is a tracepoint in the `main` subsystem with `init_phase` as the +event name and the `int argc` is passed to the event as well. + +How `argc` is collected or used has nothing to do with the instrumentation +(adding trace events to the code). It is the work of the tracer so this is why +the trace events and collection framework (tracer) are decoupled. You _can_ +have trace events without a tracer. + +### Tracer ### + +In `src/trace/events.h`, we map the `tor_trace()` function to the right +tracer. A tracer support is only enabled at compile time. For instance, the +file `src/trace/debug.h` contains the mapping of the generic tracing function +`tor_trace()` to the `log_debug()` function. More specialized function can be +mapped depending on the tracepoint. + +## Build System ## + +This section describes how it is integrated into the build system of tor. + +By default, every tracing events are disabled in tor that is `tor_trace()` +is a NOP. + +To enable a tracer, there is a configure option on the form of: + + --enable-tracing-<tracer> + +We have an option that will send every trace events to a `log_debug()` (as +mentionned above) which will print you the subsystem and name of the event but +not the arguments for technical reasons. This is useful if you want to quickly +see if your trace event is being hit or well written. To do so, use this +configure option: + + --enable-tracing-debug + +## Instrument Tor ## + +This is pretty easy. Let's say you want to add a trace event in +`src/or/rendcache.c`, you only have to add this include statement: + + #include "trace/events.h" + +Once done, you can add as many as you want `tor_trace()` that you need. +Please use the right subsystem (here it would be `hs`) and a unique name that +tells what the event is for. For example: + + tor_trace(hs, store_desc_as_client, desc, desc_id); + +If you look in `src/trace/events.h`, you'll see that if tracing is enabled it +will be mapped to a function called: + + tor_trace_hs_store_desc_as_client(desc, desc_id) + +And the point of all this is for that function to be defined in a new file +that you might want to add named `src/trace/hs.{c|h}` which would defined how +to collect the data for the `tor_trace_hs_store_desc_as_client()` function +like for instance sending it to a `log_debug()` or do more complex operations +or use a userspace tracer like LTTng (https://lttng.org). diff --git a/doc/HACKING/WritingTests.md b/doc/HACKING/WritingTests.md index de80bbdef2..cc393494ec 100644 --- a/doc/HACKING/WritingTests.md +++ b/doc/HACKING/WritingTests.md @@ -48,7 +48,7 @@ isolation, you just run `./src/test/test-memwipe`. To run tests within the unit test programs, you can specify the name of the test. The string ".." can be used as a wildcard at the end of the test name. For example, to run all the cell format tests, enter -`./src/test/test cellfmt/..`. To run +`./src/test/test cellfmt/..`. Many tests that need to mess with global state run in forked subprocesses in order to keep from contaminating one another. But when debugging a failing test, @@ -91,6 +91,9 @@ coverage percentage. For a summary of the test coverage for each _function_, run `./scripts/test/cov-display -f ${TMPDIR}/*`. +For more details on using gcov, including the helper scripts in +scripts/test, see HelpfulTools.md. + ### Comparing test coverage Sometimes it's useful to compare test coverage for a branch you're writing to @@ -117,7 +120,8 @@ with LCOV_EXCL_START... LCOV_EXCL_STOP. Note that older versions of lcov don't understand these lines. You can post-process .gcov files to make these lines 'unreached' by -running ./scripts/test/cov-exclude on them. +running ./scripts/test/cov-exclude on them. It marks excluded +unreached lines with 'x', and excluded reached lines with '!!!'. Note: you should never do this unless the line is meant to 100% unreachable by actual code. |