author | Nick Mathewson <nickm@torproject.org> | 2019-10-14 13:49:27 -0400 |
---|---|---|
committer | Nick Mathewson <nickm@torproject.org> | 2019-10-14 13:49:27 -0400 |
commit | 8ef5d96c2e7c026feff3a4dd20f0096f6d8cf901 (patch) | |
tree | e5006a762530501b820334613de6afb25b27a69b | |
parent | 908070bbd5096efc09b251154dbc058559920f05 (diff) | |
download | tor-8ef5d96c2e7c026feff3a4dd20f0096f6d8cf901.tar.gz tor-8ef5d96c2e7c026feff3a4dd20f0096f6d8cf901.zip |
Rewrite "common" overview into a "lib" overview.
-rw-r--r-- | doc/HACKING/design/01.00-lib-overview.md | 206 |
1 files changed, 128 insertions, 78 deletions
diff --git a/doc/HACKING/design/01.00-lib-overview.md b/doc/HACKING/design/01.00-lib-overview.md
index 79a6a7b7d3..08dec51a00 100644
--- a/doc/HACKING/design/01.00-lib-overview.md
+++ b/doc/HACKING/design/01.00-lib-overview.md
@@ -1,121 +1,171 @@
-## Utility code in Tor
+## Library code in Tor.

-Most of Tor's utility code is in modules in the src/common subdirectory.
+Most of Tor's utility code is in modules in the `src/lib` subdirectory. In
+general, this code is not necessarily Tor-specific, but is instead possibly
+useful for other applications.

-These are divided, broadly, into _compatibility_ functions, _utility_
-functions, _containers_, and _cryptography_. (Someday in the future, it
-would be great to split these modules into separate directories. Also, some
-functions are probably put in the wrong modules)
+This code includes:

-### Compatibility code
+ * Compatibility wrappers, to provide a uniform API across different
+ platforms.

-These functions live in src/common/compat\*.c; some corresponding macros live
-in src/common/compat\*.h. They serve as wrappers around platform-specific or
-compiler-specific logic functionality.
+ * Library wrappers, to provide a tor-like API over different libraries
+ that Tor uses for things like compression and cryptography.

-In general, the rest of the Tor code *should not* be calling platform-specific
-or otherwise non-portable functions. Instead, they should call wrappers from
-compat.c, which implement a common cross-platform API. (If you don't know
-whether a function is portable, it's usually good enough to see whether it
-exists on OSX, Linux, and Windows.)
+ * Containers, to implement some general-purpose data container types.

-Other compatibility modules include backtrace.c, which generates stack traces
-for crash reporting; sandbox.c, which implements the Linux seccomp2 sandbox;
-and procmon.c, which handles monitoring a child process.
+The modules in `src/lib` are currently well-factored: each one depends
+only on lower-level modules. You can see an up-to-date list of the
+modules sorted from lowest to highest level by running
+`./scripts/maint/practracker/includes.py --toposort`.

-Parts of address.c are compatibility code for handling network addressing
-issues; other parts are in util.c.
+As of this writing, the library modules are (from lowest to highest
+level):

-Notable compatibility areas are:
+ * `lib/cc` -- Macros for managing the C compiler and
+ language. Includes macros for improving compatibility and clarity
+ across different C compilers.

- * mmap support for mapping files into the address space (read-only)
+ * `lib/version` -- Holds the current version of Tor.

- * Code to work around the intricacies
+ * `lib/testsupport` -- Helpers for making test-only code and test
+ mocking support.

- * Workaround code for Windows's horrible winsock incompatibilities and
- Linux's intricate socket extensions.
+ * `lib/defs` -- Lowest-level constants used in many places across the
+ code.

- * Helpful string functions like memmem, memstr, asprintf, strlcpy, and
- strlcat that not all platforms have.
+ * `lib/subsys` -- Types used for declaring a "subsystem". A subsystem
+ is a module with support for initialization, shutdown,
+ configuration, and so on.

- * Locale-ignoring variants of the ctypes functions.
+ * `lib/conf` -- Types and macros used for declaring configuration
+ options.

- * Time-manipulation functions
+ * `lib/arch` -- Compatibility functions and macros for handling
+ differences in CPU architecture.

- * File locking function
+ * `lib/err` -- Lowest-level error handling code: responsible for
+ generating stack traces, handling raw assertion failures, and
+ otherwise reporting problems that might not be safe to report
+ via the regular logging module.

- * IPv6 functions for platforms that don't have enough IPv6 support
+ * `lib/malloc` -- Wrappers and utilities for memory management.

- * Endianness functions
+ * `lib/intmath` -- Utilities for integer mathematics.

- * OS functions
+ * `lib/fdio` -- Utilities and compatibility code for reading and
+ writing data on file descriptors (and on sockets, for platforms
+ where a socket is not a kind of fd).

- * Threading and locking functions.
+ * `lib/lock` -- Compatibility code for declaring and using locks.
+ Lower-level than the rest of the threading code.

-=== Utility functions
+ * `lib/ctime` -- Constant-time implementations for data comparison
+ and table lookup, used to avoid timing side-channels from standard
+ implementations of memcmp() and so on.

-General-purpose utilities are in util.c; they include higher-level wrappers
-around many of the compatibility functions to provide things like
-file-at-once access, memory management functions, math, string manipulation,
-time manipulation, filesystem manipulation, etc.
+ * `lib/string` -- Low-level compatibility wrappers and utility
+ functions for string manipulation.

-(Some functionality, like daemon-launching, would be better off in a
-compatibility module.)
+ * `lib/wallclock` -- Compatibility and utility functions for
+ inspecting and manipulating the current (UTC) time.

-In util_format.c, we have code to implement stuff like base-32 and base-64
-encoding.
+ * `lib/osinfo` -- Functions for inspecting the version and
+ capabilities of the operating system.

-The address.c module interfaces with the system resolver and implements
-address parsing and formatting functions. It converts sockaddrs to and from
-a more compact tor_addr_t type.
+ * `lib/smartlist_core` -- The bare-bones pieces of our dynamic array
+ ("smartlist") implementation. There are higher-level pieces, but
+ these ones are used by (and therefore cannot use) the logging code.

-The di_ops.c module provides constant-time comparison and associative-array
-operations, for side-channel avoidance.
+ * `lib/log` -- Implements the logging system used by all higher-level
+ Tor code. You can think of this as the logical "midpoint" of the
+ library code: much of the higher-level code is higher-level
+ _because_ it uses the logging module, and much of the lower-level
+ code is specifically written to avoid having to log, because the
+ logging module depends on it.

-The logging subsystem in log.c supports logging to files, to controllers, to
-stdout/stderr, or to the system log.
+ * `lib/container` -- General purpose containers, including dynamic arrays,
+ hashtables, bit arrays, weak-reference-like "handles", bloom
+ filters, and a bit more.

-The abstraction in memarea.c is used in cases when a large amount of
-temporary objects need to be allocated, and they can all be freed at the same
-time.
+ * `lib/trace` -- A general-purpose API for introducing
+ function-tracing functionality into Tor. Currently not much used.

-The torgzip.c module wraps the zlib library to implement compression.
+ * `lib/thread` -- Threading compatibility and utility functionality,
+ other than low-level locks (which are in `lib/lock`) and
+ workqueue/threadpool code (which belongs in `lib/evloop`).

-Workqueue.c provides a simple multithreaded work-queue implementation.
+ * `lib/term` -- Code for terminal manipulation functions (like
+ reading a password from the user).

-### Containers
+ * `lib/memarea` -- A data structure for a fast "arena" style allocator,
+ where the data is freed all at once. Used for parsing.

-The container.c module defines these container types, used throughout the Tor
-codebase.
+ * `lib/encoding` -- Implementations for encoding data in various
+ formats, datatypes, and transformations.

-There is a dynamic array called **smartlist**, used as our general resizeable
-array type. It supports sorting, searching, common set operations, and so
-on. It has specialized functions for smartlists of strings, and for
-heap-based priority queues.
+ * `lib/dispatch` -- A general-purpose in-process message delivery
+ system. Used by `lib/pubsub` to implement our inter-module
+ publish/subscribe system.

-There's a bit-array type.
+ * `lib/sandbox` -- Our Linux seccomp2 sandbox implementation.

-A set of mapping types to map strings, 160-bit digests, and 256-bit digests
-to void \*. These are what we generally use when we want O(1) lookup.
+ * `lib/pubsub` -- Code and macros to implement our publish/subscribe
+ message passing system.

-Additionally, for containers, we use the ht.h and tor_queue.h headers, in
-src/ext. These provide intrusive hashtable and linked-list macros.
+ * `lib/fs` -- Utility and compatibility code for manipulating files,
+ filenames, directories, and so on.

-### Cryptography
+ * `lib/confmgt` -- Code to parse, encode, and manipulate our
+ configuration files, state files, and so forth.

-Once, we tried to keep our cryptography code in a single "crypto.c" file,
-with an "aes.c" module containing an AES implementation for use with older
-OpenSSLs.
+ * `lib/crypt_ops` -- Cryptographic operations. This module contains
+ wrappers around the cryptographic libraries that we support,
+ and implementations for some higher-level cryptographic
+ constructions that we use.

-Now, our practice has become to introduce crypto_\*.c modules when adding new
-cryptography backend code. We have modules for Ed25519, Curve25519,
-secret-to-key algorithms, and password-based boxed encryption.
+ * `lib/meminfo` -- Functions for inspecting our memory usage, if the
+ malloc implementation exposes that to us.

-Our various TLS compatibility code, wrappers, and hacks are kept in
-tortls.c, which is probably too full of Tor-specific kludges. I'm
-hoping we can eliminate most of those kludges when we finally remove
-support for older versions of our TLS handshake.
+ * `lib/time` -- Higher level time functions, including fine-gained and
+ monotonic timers.

+ * `lib/math` -- Floating-point mathematical utilities, including
+ compatibility code, and probability distributions.

+ * `lib/buf` -- A general purpose queued buffer implementation,
+ similar to the BSD kernel's "mbuf" structure.

+ * `lib/net` -- Networking code, including address manipulation,
+ compatibility wrappers,
+
+ * `lib/compress` -- A compatibility wrapper around several
+ compression libraries, currently including zlib, zstd, and lzma.
+
+ * `lib/geoip` -- Utilities to manage geoip (IP to country) lookups
+ and formats.
+
+ * `lib/tls` -- Compatibility wrappers around the library (NSS or
+ OpenSSL, depending on configuration) that Tor uses to implement the
+ TLS link security protocol.
+
+ * `lib/evloop` -- Tools to manage the event loop and related
+ functionality, in order to implement asynchronous networking,
+ timers, periodic events, and other scheduling tasks.
+
+ * `lib/process` -- Utilities and compatibility code to launch and
+ manage subprocesses.
+
+### What belongs in lib?
+
+In general, if you can imagine some program wanting the functionality
+you're writing, even if that program had nothing to do with Tor, your
+functionality belongs in lib.
+
+If it falls into one of the existing "lib" categories, your
+functionality belongs in lib.
+
+If you are using platform-specific `#ifdef`s to manage compatibility
+issues among platforms, you should probably consider whether you can
+put your code into lib.
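
Reviewer note: the `lib/ctime` entry in this patch describes constant-time data comparison as a way to avoid the timing side-channels of `memcmp()`. The following is only a minimal sketch of that general technique, not the code in Tor's tree. The name `ct_memeq_sketch` is hypothetical and was made up for this illustration, and a production version would also need to guard against compiler optimizations that reintroduce data-dependent branches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not Tor's API: return 1 if the two n-byte buffers
 * are equal and 0 otherwise. Every byte is read regardless of where the
 * first difference occurs, so the loop's running time does not depend on
 * the buffers' contents. */
static int
ct_memeq_sketch(const void *a, const void *b, size_t n)
{
  const uint8_t *x = a;
  const uint8_t *y = b;
  uint32_t diff = 0;
  size_t i;

  /* Checking the pointers does not leak anything about the data. */
  assert(a != NULL && b != NULL);

  /* OR together the XOR of each byte pair; there is no early exit. */
  for (i = 0; i < n; ++i)
    diff |= (uint32_t)(x[i] ^ y[i]);

  /* diff is in [0, 255]. Subtracting 1 wraps around only when diff == 0,
   * so bit 8 of (diff - 1) is set exactly when all bytes matched. */
  return (int)(1 & ((diff - 1) >> 8));
}
```

A caller would use it wherever secret-dependent data is compared, for example `ct_memeq_sketch(expected_mac, received_mac, mac_len)`, instead of `memcmp()`, which may return as soon as it finds the first differing byte.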