From 9c653e58b41f12c58da060669fa148172f20127a Mon Sep 17 00:00:00 2001
From: David Goulet <dgoulet@torproject.org>
Date: Wed, 15 Jul 2020 09:44:12 -0400
Subject: doc: Document circuit subsystem tracing events

Create a doc/tracing/ directory to contain a top level README.md which is the
previously named Tracing.md and add the EventsCircuit.md which describes the
circuit subsystem tracing events in depth.

Closes #40036

Signed-off-by: David Goulet <dgoulet@torproject.org>
---
 doc/HACKING/Tracing.md               | 163 -----------------------------------
 doc/HACKING/tracing/EventsCircuit.md | 139 +++++++++++++++++++++++++++++
 doc/HACKING/tracing/README.md        | 163 +++++++++++++++++++++++++++++++++++
 3 files changed, 302 insertions(+), 163 deletions(-)
 delete mode 100644 doc/HACKING/Tracing.md
 create mode 100644 doc/HACKING/tracing/EventsCircuit.md
 create mode 100644 doc/HACKING/tracing/README.md

(limited to 'doc')
diff --git a/doc/HACKING/Tracing.md b/doc/HACKING/Tracing.md
deleted file mode 100644
index d9fb2e5341..0000000000
--- a/doc/HACKING/Tracing.md
+++ /dev/null
@@ -1,163 +0,0 @@
-# Tracing
-
-This document describes how the event tracing subsystem works in tor so
-developers can add events to the code base but also hook them to an event
-tracing framework (i.e. tracer).
-
-## WARNING ##
-
-Tracing the tor daemon **always** generates sensitive data if used in
-production (on the public network).
-
-It **is** ethical for researchers to use tracing for their own tor client (for
-example: building paths, timings, or performance).
-
-It is **NOT** ethical to archive, publish or keep data containing other users'
-activity such as relay data or anything that handles users' traffic. This
-of course includes any logs below notice level.
-
-Publishing analysis of tracing data containing user traffic is **NOT** safe
-either.
-
-In other words, tracing data that contains other users's activity is **NOT**
-safe to publish in any form.
-
-## Basics ###
-
-Tracing is separated in two different concepts. The tracing API and the
-tracing probes.
-
-The API is in `src/lib/trace/` which defines how to call tracepoints in the
-tor code. Every C files should include `src/lib/trace/events.h` if they want
-to call a tracepoint.
-
-The probes are what actually record the tracepoint data. Because they often
-need to access specific subsystem objects, the probes are within each
-subsystem. They are defined in the `trace-probes-<subsystem>.c` files.
-
-### Events
-
-A trace event is basically a function from which we can pass any data that we
-want to collect. In addition, we specify a context for the event such as the
-subsystem and an event name.
-
-A trace event in tor has the following standard format:
-
-```c
-tor_trace(subsystem, event_name, args...);
-```
-
-The `subsystem` parameter is the name of the subsytem the trace event is in.
-For example that could be "scheduler" or "vote" or "hs". The idea is to add
-some context to the event so when we collect them we know where it's coming
-from.
-
-The `event_name` is the name of the event which adds better semantic to the
-event.
-
-The `args` can be any number of arguments we want to collect.
-
-Here is an example of a possible tracepoint in main():
-
-```c
-tor_trace(main, init_phase, argc);
-```
-
-The above is a tracepoint in the `main` subsystem with `init_phase` as the
-event name and the `int argc` is passed to the event as one argument.
-
-How `argc` is collected or used has nothing to do with the instrumentation
-(adding trace events to the code). It is the work of the tracer so this is why
-the trace events and collection framework (tracer) are decoupled. You _can_
-have trace events without a tracer.
-
-### Instrumentation ###
-
-In `src/lib/trace/events.h`, we map the high level `tor_trace()` macro to one
-or many enabled instrumentation.
-
-Currently, we have 3 types of possible instrumentation:
-
-1. Debug
-
-  This will map every tracepoint to `log_debug()`. However, none of the
-  arguments will be passed on because we don't know their type nor the string
-  format of the debug log. The output is standardized like this:
-
-```
-[debug] __FUNC__: Tracepoint <event_name> from subsystem <subsystem> hit.
-```
-
-2. USDT
-
-  User Statically-Defined Tracing (USDT) is a kind of probe which can be
-  handled by a variety of tracers such as SystemTap, DTrace, perf, eBPF and
-  ftrace.
-
-  For each tracer, one will need to define the ABI in order for the tracer to
-  be able to extract the data from the tracepoint objects. For instance, the
-  tracer needs to know how to print the circuit state of a `circuit_t`
-  object.
-
-3. LTTng-UST
-
-  LTTng Userspace is a tracer that has it own type of instrumentation. The
-  probe definitions are created within the C code and is strongly typed.
-
-  For more information, see https://lttng.org/docs.
-
-## Build System
-
-This section describes how the instrumentation is integrated into the build
-system of tor.
-
-By default, every tracing events are disabled in tor that is `tor_trace()` is
-a NOP thus has no execution cost time.
-
-To enable a specific instrumentation, there are configure options:
-
-1. Debug: `--enable-tracing-instrumentation-debug`
-
-2. USDT: `--enable-tracing-instrumentation-usdt`
-
-3. LTTng: `--enable-tracing-instrumentation-lttng`
-
-They can all be used together or independently. If one of them is set,
-`HAVE_TRACING` define is set. And for each instrumentation, a
-`USE_TRACING_INSTRUMENTATION_<type>` is set.
-
-## Adding a Tracepoint ##
-
-This is pretty easy. Let's say you want to add a trace event in
-`src/feature/rend/rendcache.c`, you first need to include this file:
-
-```c
-#include "lib/trace/events.h"
-```
-
-Then, the `tor_trace()` macro can be used with the specific format detailled
-before in a previous section. As an example:
-
-```c
-tor_trace(hs, store_desc_as_client, desc, desc_id);
-```
-
-For `Debug` instrumentation, you have nothing else to do.
-
-For `USDT`, instrumentation, you will need to define the probes in a way the
-specific tracer can understand. For instance, SystemTap requires you to define
-a `tapset` for each tracepoints.
-
-For `LTTng`, you will need to define the probes in the
-`trace-probes-<subsystem>.{c|h}` file. See the `trace-probes-circuit.{c|h}`
-file as an example and https://lttng.org/docs/v2.11/#doc-instrumenting.
-
-## Performance ##
-
-A word about performance when a tracepoint is enabled. One of the goal of a
-tracepoint (USDT, LTTng-UST, ...) is that they can be enabled or disabled. By
-default, they are disabled which means the tracer will not record the data but
-it has to do a check thus the cost is basically the one of a `branch`.
-
-If enabled, then the performance depends on the tracer. In the case of
-LTTng-UST, the event costs around 110nsec.
diff --git a/doc/HACKING/tracing/EventsCircuit.md b/doc/HACKING/tracing/EventsCircuit.md
new file mode 100644
index 0000000000..42abdda856
--- /dev/null
+++ b/doc/HACKING/tracing/EventsCircuit.md
@@ -0,0 +1,139 @@
+# Circuit Subsystem Trace Events
+
+The circuit subsystem emits a series of tracing events related to a circuit
+object life cycle and its state change.
+
+This document describes each event as in what data they record and what they
+represent.
+
+## Background
+
+There are two types of circuits: origin and OR (onion router). Both of them
+are derived from a base object called a general circuit.
+
+- Origin circuits are the ones initiated by tor itself so client or onion
+  service circuits for instance.
+
+- OR circuits are the ones going through us that we have not initiated and
+  thus only seen by relays.
+
+Many operations are done on the base (general) circuit, and some are specific
+to an origin or OR. The following section describes each of them by circuit
+type.
+
+## Trace Events
+
+For the LTTng tracer, the subsystem name of these events is: `tor_circuit`.
+
+Also, unless specified otherwise, every event emits a common set of parameters
+thus they should always be expected in the following order:
+
+- `circ_id`: For an origin circuit, this is the global circuit identifier used
+  in a cell. For an OR circuit, the value is 0.
+
+- `purpose`: Purpose of the circuit as in what it is used for. Note that this
+  can change during the lifetime of a circuit. See `CIRCUIT_PURPOSE_*` in
+  `core/or/circuitlist.h` for an exhaustive list of the possible values.
+
+- `state`: State of a circuit. This changes during the lifetime of a circuit.
+  See `CIRCUIT_STATE_*` in `core/or/circuitlist.h` for an exhaustive list of
+  the possible values.
+
+Now, the tracing events.
+
+### General Circuit (`circuit_t`)
+
+The following events are triggered for the base circuit object and thus apply
+to all types of circuits.
+
+  * `free`: A circuit object is freed that is memory is released and not
+    usable anymore. After this event, no more events will be emitted for the
+    specific circuit object.
+
+  * `mark_for_close`: A circuit object is marked for close that is scheduled
+    to be closed in a later mainloop periodic event.
+
+    Extra parameters:
+
+    - `end_reason`: Reason why the circuit is closed. Tor often changes that
+      reason to something generic sometimes in order to avoid leaking internal
+      reasons to the end point. Thus, this value can be different from
+      orig_close_reason.
+
+    - `orig_close_reason`: Original reason why the circuit is closed. That
+      value never changes and contains the internal reason why we close it. It
+      is **never** this reason that is sent back on the circuit.
+
+  * `change_purpose`: Purpose change.
+
+    Extra parameters:
+
+    (`purpose` parameter is not present)
+
+    - `old_purpose`: Previous purpose that is no longer.
+
+    - `new_purpose`: New purpose assigned to the circuit.
+
+  * `change_state`: State change.
+
+    Extra parameters:
+
+    (`state` parameter is not present)
+
+    - `old_state`: Previous state that is no longer.
+
+    - `new_state`: New state assigned to the circuit.
+
+### Origin Circuit (`origin_circuit_t`)
+
+The following events are triggered only for origin circuits.
+
+  * `new_origin`: New origin circuit has been created meaning it has been
+    newly allocated, initialized and added to the global list.
+
+  * `establish`: Circuit is being established. This is the initial first step
+    where the path was selected and a connection to the first hop has been
+    launched.
+
+  * `cannibalized`: Circuit has been cannibalized. This happens when we have
+    an already opened unused circuit (preemptive circuits) and it was picked.
+
+  * `first_onion_skin`: First onion skin was sent that is the handshake with
+    the first hop.
+
+    Extra parameters:
+
+    - `fingerprint`: Identity digest (RSA) of the first hop.
+
+  * `intermediate_onion_skin`: An intermediate onion skin was sent which can
+    be why any hops after the first one. There is thus `N - 1` of these events
+    where `N` is the total number of hops in the path.
+
+    Extra parameters:
+
+    - `fingerprint`: Identity digest (RSA) of the next hop.
+
+  * `opened`: Circuit just became opened which means that all hops down the
+    path have negotiated the handshake between them and us and the circuit is
+    now ready to send cells.
+
+  * `timeout`: Circuit has timed out that is we waited too long for the
+    circuit to be built.
+
+  * `idle_timeout`: Circuit has timed out due to idleness. This is controlled
+    by the MaxCircuitDirtiness parameter which is 10 min by default.
+
+For the common use case of a 3-hop circuit, the following events should be
+seen in this order:
+
+  `new_origin` -> `establish` -> `first_onion_skin` ->
+  `intermediate_onion_skin` -> `intermediate_onion_skin` -> `opened`
+
+### OR Circuit (`or_circuit_t`)
+
+The following events are triggered only for OR circuits. For each of them, the
+`circ_id` parameter is not present since it would always be 0. The `purpose`
+and `state` remain.
+
+  * `new_or`: New OR circuit has been created meaning it has been newly
+    allocated, initialized and added to the global list.
diff --git a/doc/HACKING/tracing/README.md b/doc/HACKING/tracing/README.md
new file mode 100644
index 0000000000..d9fb2e5341
--- /dev/null
+++ b/doc/HACKING/tracing/README.md
@@ -0,0 +1,163 @@
+# Tracing
+
+This document describes how the event tracing subsystem works in tor so
+developers can add events to the code base but also hook them to an event
+tracing framework (i.e. tracer).
+
+## WARNING ##
+
+Tracing the tor daemon **always** generates sensitive data if used in
+production (on the public network).
+
+It **is** ethical for researchers to use tracing for their own tor client (for
+example: building paths, timings, or performance).
+
+It is **NOT** ethical to archive, publish or keep data containing other users'
+activity such as relay data or anything that handles users' traffic. This
+of course includes any logs below notice level.
+
+Publishing analysis of tracing data containing user traffic is **NOT** safe
+either.
+
+In other words, tracing data that contains other users's activity is **NOT**
+safe to publish in any form.
+
+## Basics ###
+
+Tracing is separated in two different concepts. The tracing API and the
+tracing probes.
+
+The API is in `src/lib/trace/` which defines how to call tracepoints in the
+tor code. Every C files should include `src/lib/trace/events.h` if they want
+to call a tracepoint.
+
+The probes are what actually record the tracepoint data. Because they often
+need to access specific subsystem objects, the probes are within each
+subsystem. They are defined in the `trace-probes-<subsystem>.c` files.
+
+### Events
+
+A trace event is basically a function from which we can pass any data that we
+want to collect. In addition, we specify a context for the event such as the
+subsystem and an event name.
+
+A trace event in tor has the following standard format:
+
+```c
+tor_trace(subsystem, event_name, args...);
+```
+
+The `subsystem` parameter is the name of the subsytem the trace event is in.
+For example that could be "scheduler" or "vote" or "hs". The idea is to add
+some context to the event so when we collect them we know where it's coming
+from.
+
+The `event_name` is the name of the event which adds better semantic to the
+event.
+
+The `args` can be any number of arguments we want to collect.
+
+Here is an example of a possible tracepoint in main():
+
+```c
+tor_trace(main, init_phase, argc);
+```
+
+The above is a tracepoint in the `main` subsystem with `init_phase` as the
+event name and the `int argc` is passed to the event as one argument.
+
+How `argc` is collected or used has nothing to do with the instrumentation
+(adding trace events to the code). It is the work of the tracer so this is why
+the trace events and collection framework (tracer) are decoupled. You _can_
+have trace events without a tracer.
+
+### Instrumentation ###
+
+In `src/lib/trace/events.h`, we map the high level `tor_trace()` macro to one
+or many enabled instrumentation.
+
+Currently, we have 3 types of possible instrumentation:
+
+1. Debug
+
+  This will map every tracepoint to `log_debug()`. However, none of the
+  arguments will be passed on because we don't know their type nor the string
+  format of the debug log. The output is standardized like this:
+
+```
+[debug] __FUNC__: Tracepoint <event_name> from subsystem <subsystem> hit.
+```
+
+2. USDT
+
+  User Statically-Defined Tracing (USDT) is a kind of probe which can be
+  handled by a variety of tracers such as SystemTap, DTrace, perf, eBPF and
+  ftrace.
+
+  For each tracer, one will need to define the ABI in order for the tracer to
+  be able to extract the data from the tracepoint objects. For instance, the
+  tracer needs to know how to print the circuit state of a `circuit_t`
+  object.
+
+3. LTTng-UST
+
+  LTTng Userspace is a tracer that has it own type of instrumentation. The
+  probe definitions are created within the C code and is strongly typed.
+
+  For more information, see https://lttng.org/docs.
+
+## Build System
+
+This section describes how the instrumentation is integrated into the build
+system of tor.
+
+By default, every tracing events are disabled in tor that is `tor_trace()` is
+a NOP thus has no execution cost time.
+
+To enable a specific instrumentation, there are configure options:
+
+1. Debug: `--enable-tracing-instrumentation-debug`
+
+2. USDT: `--enable-tracing-instrumentation-usdt`
+
+3. LTTng: `--enable-tracing-instrumentation-lttng`
+
+They can all be used together or independently. If one of them is set,
+`HAVE_TRACING` define is set. And for each instrumentation, a
+`USE_TRACING_INSTRUMENTATION_<type>` is set.
+
+## Adding a Tracepoint ##
+
+This is pretty easy. Let's say you want to add a trace event in
+`src/feature/rend/rendcache.c`, you first need to include this file:
+
+```c
+#include "lib/trace/events.h"
+```
+
+Then, the `tor_trace()` macro can be used with the specific format detailled
+before in a previous section. As an example:
+
+```c
+tor_trace(hs, store_desc_as_client, desc, desc_id);
+```
+
+For `Debug` instrumentation, you have nothing else to do.
+
+For `USDT`, instrumentation, you will need to define the probes in a way the
+specific tracer can understand. For instance, SystemTap requires you to define
+a `tapset` for each tracepoints.
+
+For `LTTng`, you will need to define the probes in the
+`trace-probes-<subsystem>.{c|h}` file. See the `trace-probes-circuit.{c|h}`
+file as an example and https://lttng.org/docs/v2.11/#doc-instrumenting.
+
+## Performance ##
+
+A word about performance when a tracepoint is enabled. One of the goal of a
+tracepoint (USDT, LTTng-UST, ...) is that they can be enabled or disabled. By
+default, they are disabled which means the tracer will not record the data but
+it has to do a check thus the cost is basically the one of a `branch`.
+
+If enabled, then the performance depends on the tracer. In the case of
+LTTng-UST, the event costs around 110nsec.
-- 
cgit v1.2.3-54-g00ecf