<a id="path-spec.txt-2.1"></a>

# When we build

<a id="path-spec.txt-2.1.0"></a>

## We don't build circuits until we have enough directory info

There's a class of possible attacks where our directory servers
only give us information about the relays that they would like us
to use.  To prevent this attack, we don't build multi-hop
circuits
(including
[preemptive circuits](#preemptive),
[on-demand circuits](#on-demand),
[onion-service circuits](#onion-service),
or [self-testing circuits](#self-test))
for real traffic
until we have enough directory information to be
reasonably confident this attack isn't being done to us.

Here, "enough" directory information is defined as:

```text
      * Having a consensus that's been valid at some point in the
        last REASONABLY_LIVE_TIME interval (24 hours).

      * Having enough descriptors that we could build at least some
        fraction F of all bandwidth-weighted paths, without taking
        ExitNodes/EntryNodes/etc into account.

        (F is set by the PathsNeededToBuildCircuits option,
        defaulting to the 'min_paths_for_circs_pct' consensus
        parameter, with a final default value of 60%.)

      * Having enough descriptors that we could build at least some
        fraction F of all bandwidth-weighted paths, _while_ taking
        ExitNodes/EntryNodes/etc into account.

        (F is as above.)

      * Having a descriptor for every one of the first
        NUM_USABLE_PRIMARY_GUARDS guards among our primary guards. (see
        guard-spec.txt)
```
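
As a non-normative illustration, here is a rough Python sketch of how these
four criteria combine. The helper `path_fraction()` is sketched after the
next list; every other name here (`consensus`, `descriptors`, `options`,
`primary_guards`) is invented for the example and is not taken from Tor's
implementation.

```python
REASONABLY_LIVE_TIME = 24 * 60 * 60  # seconds

def have_enough_dir_info(consensus, descriptors, options, primary_guards,
                         num_usable_primary_guards, now):
    # 1. The consensus must have been valid at some point within the last
    #    REASONABLY_LIVE_TIME (24 hours).
    if now - consensus.valid_until > REASONABLY_LIVE_TIME:
        return False

    # F: the PathsNeededToBuildCircuits option, else the
    # 'min_paths_for_circs_pct' consensus parameter, else 60%.
    f = options.get("PathsNeededToBuildCircuits")
    if f is None:
        f = consensus.params.get("min_paths_for_circs_pct", 60) / 100.0

    # 2. Enough bandwidth-weighted paths, ignoring ExitNodes/EntryNodes/etc...
    if path_fraction(consensus, descriptors, restricted=False) < f:
        return False
    # 3. ...and enough while honoring those restrictions.
    if path_fraction(consensus, descriptors, restricted=True) < f:
        return False

    # 4. A descriptor for each of the first NUM_USABLE_PRIMARY_GUARDS
    #    primary guards (see guard-spec.txt).
    first_guards = primary_guards[:num_usable_primary_guards]
    return all(g.fingerprint in descriptors for g in first_guards)
```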

We define the "fraction of bandwidth-weighted paths" as the product of
these three fractions.

```text
      * The fraction of descriptors that we have for nodes with the Guard
        flag, weighted by their bandwidth for the guard position.
      * The fraction of descriptors that we have for all nodes,
        weighted by their bandwidth for the middle position.
      * The fraction of descriptors that we have for nodes with the Exit
        flag, weighted by their bandwidth for the exit position.
```

If the consensus has zero weighted bandwidth for a given kind of
relay (Guard, Middle, or Exit), Tor instead uses the fraction of relays
for which it has the descriptor (not weighted by bandwidth at all).

If the consensus lists zero exit-flagged relays, Tor instead uses the
fraction of middle relays.
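
The product and the two fallback rules above could be computed along the
following lines. This is only a sketch: the node fields (`fingerprint`,
`flags`, the per-position bandwidth weights) and the `allowed_by_config()`
helper are illustrative assumptions, not Tor's actual data structures.

```python
def weighted_fraction(nodes, descriptors, weight):
    """Fraction of position-weighted bandwidth whose descriptors we hold."""
    total = sum(weight(n) for n in nodes)
    have = [n for n in nodes if n.fingerprint in descriptors]
    if total == 0:
        # Zero weighted bandwidth for this position: fall back to the
        # plain (unweighted) fraction of relays with a descriptor.
        return len(have) / len(nodes) if nodes else 0.0
    return sum(weight(n) for n in have) / total

def path_fraction(consensus, descriptors, restricted=False):
    nodes = [n for n in consensus.nodes
             if not restricted or allowed_by_config(n)]  # EntryNodes/ExitNodes/etc.
    guards = [n for n in nodes if "Guard" in n.flags]
    exits = [n for n in nodes if "Exit" in n.flags]
    f_guard = weighted_fraction(guards, descriptors, lambda n: n.guard_bw)
    f_middle = weighted_fraction(nodes, descriptors, lambda n: n.middle_bw)
    # With zero exit-flagged relays in the consensus, reuse the middle fraction.
    f_exit = (weighted_fraction(exits, descriptors, lambda n: n.exit_bw)
              if exits else f_middle)
    return f_guard * f_middle * f_exit
```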

<a id="path-spec.txt-2.1.1"></a>

## Clients build circuits preemptively {#preemptive}

When running as a client, Tor tries to maintain at least a certain
number of clean circuits, so that new streams can be handled
quickly.  To increase the likelihood of success, Tor tries to
predict what circuits will be useful by choosing from among nodes
that support the ports we have used in the recent past (by default
one hour). Specifically, on startup Tor tries to maintain one clean
fast exit circuit that allows connections to port 80, and at least
two fast clean stable internal circuits in case we get a resolve
request or hidden service request (at least three if we _run_ a
hidden service).

After that, Tor will adapt the circuits that it preemptively builds
based on the requests it sees from the user: it tries to have two fast
clean exit circuits available for every port seen within the past hour
(each circuit can be adequate for many predicted ports -- it doesn't
need two separate circuits for each port), and it tries to have the
above internal circuits available if we've seen resolves or hidden
service activity within the past hour. If there are 12 or more clean
circuits open, it doesn't open more even if it has more predictions.

Only stable circuits can "cover" a port that is listed in the
LongLivedPorts config option. Similarly, hidden service requests
to ports listed in LongLivedPorts make us create stable internal
circuits.

Note that if there are no requests from the user for an hour, Tor
will predict no use and build no preemptive circuits.

The Tor client SHOULD NOT store its list of predicted requests to a
persistent medium.
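
A non-normative sketch of the prediction bookkeeping described above. The
class, the circuit attributes (`is_exit`, `is_stable`, `allows_port()`), and
the constants mirroring the defaults in the text are all illustrative.

```python
import time

PREDICTED_PORT_LIFETIME = 60 * 60   # remember a used port for one hour
MAX_CLEAN_CIRCUITS = 12             # don't open more clean circuits than this
CIRCS_PER_PORT = 2                  # clean exit circuits kept per predicted port

class PortPredictor:
    """Illustrative and in-memory only (the spec says: never persist this)."""

    def __init__(self):
        self.last_used = {}          # port -> timestamp of last user request

    def note_request(self, port):
        self.last_used[port] = time.time()

    def predicted_ports(self):
        cutoff = time.time() - PREDICTED_PORT_LIFETIME
        return [p for p, t in self.last_used.items() if t >= cutoff]

def ports_needing_circuits(predictor, clean_circuits, long_lived_ports):
    """Which predicted ports still lack CIRCS_PER_PORT clean exit circuits."""
    if len(clean_circuits) >= MAX_CLEAN_CIRCUITS:
        return []                    # global cap on clean circuits
    needed = []
    for port in predictor.predicted_ports():
        stable_required = port in long_lived_ports   # LongLivedPorts need Stable
        covering = [c for c in clean_circuits
                    if c.is_exit and c.allows_port(port)
                    and (c.is_stable or not stable_required)]
        if len(covering) < CIRCS_PER_PORT:
            needed.append(port)
    return needed
```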

<a id="path-spec.txt-2.1.2"></a>

## Clients build circuits on demand {#on-demand}

Additionally, when a client request exists that no circuit (built or
pending) might support, we create a new circuit to support the request.
For exit connections, we pick an exit node that will handle the
most pending requests (choosing arbitrarily among ties), launch a
circuit to end there, and repeat until every unattached request
might be supported by a pending or built circuit. For internal
circuits, we pick an arbitrary acceptable path, repeating as needed.
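
The greedy choice described here ("pick the exit that handles the most
pending requests, repeat") might look roughly like the sketch below, where
`exits_for()` is a hypothetical helper returning the exit nodes able to
serve a given request, and exit nodes are assumed hashable.

```python
from collections import Counter

def plan_on_demand_circuits(pending_requests, exits_for):
    """Repeatedly pick the exit covering the most still-unattached requests,
    until every request could be served by a planned circuit."""
    unattached = list(pending_requests)
    planned = []
    while unattached:
        counts = Counter()
        for req in unattached:
            for ex in exits_for(req):
                counts[ex] += 1
        if not counts:
            break                                 # nothing can serve the rest
        exit_node, _ = counts.most_common(1)[0]   # ties broken arbitrarily
        planned.append(exit_node)
        unattached = [r for r in unattached if exit_node not in exits_for(r)]
    return planned
```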

Clients consider a circuit to become "dirty" as soon as a stream is
attached to it, or some other request is performed over the circuit.
If a circuit has been "dirty" for at least MaxCircuitDirtiness seconds,
new streams may not be attached to it.
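
A minimal sketch of that dirtiness check, assuming a hypothetical
`dirty_since` field recording when the circuit first carried a stream or
other request:

```python
def may_attach_stream(circuit, now, max_circuit_dirtiness):
    """May a new stream still be attached to this circuit?"""
    if circuit.dirty_since is None:
        return True   # still clean
    # Dirty circuits stop accepting new streams after MaxCircuitDirtiness.
    return now - circuit.dirty_since < max_circuit_dirtiness
```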

In some cases we can reuse an already established circuit if it's
clean; see ["cannibalizing circuits"](./cannibalizing-circuits.md)
for details.

<a id="path-spec.txt-2.1.3"></a>

## Relays build circuits for testing reachability and bandwidth {#self-test}

Tor relays test reachability of their ORPort once they have
successfully built a circuit (on startup and whenever their IP address
changes). They build an ordinary fast internal circuit with themselves
as the last hop. As soon as any testing circuit succeeds, the Tor
relay decides it's reachable and is willing to publish a descriptor.

We launch multiple testing circuits (one at a time), until we
have NUM_PARALLEL_TESTING_CIRC (4) such circuits open. Then we
do a "bandwidth test" by sending a certain number of relay drop
cells down each circuit: BandwidthRate * 10 / CELL_NETWORK_SIZE
total cells divided across the four circuits, but never more than
CIRCWINDOW_START (1000) cells total. This exercises both outgoing and
incoming bandwidth, and helps to jumpstart the observed bandwidth
(see dir-spec.txt).
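
The cell-count arithmetic can be written out as a small sketch, assuming a
wire cell size of 514 bytes (512 on older link protocols); the function name
and parameter are illustrative.

```python
CELL_NETWORK_SIZE = 514          # bytes per cell on the wire (512 on older link protocols)
CIRCWINDOW_START = 1000          # circuit-level flow-control window
NUM_PARALLEL_TESTING_CIRC = 4

def bandwidth_test_cells(bandwidth_rate_bytes_per_sec):
    """How many relay DROP cells to send per testing circuit."""
    total = bandwidth_rate_bytes_per_sec * 10 // CELL_NETWORK_SIZE
    total = min(total, CIRCWINDOW_START)          # never more than 1000 cells total
    return total // NUM_PARALLEL_TESTING_CIRC

# Example: with BandwidthRate = 1 MiB/s the uncapped total is ~20,400 cells,
# so the cap applies and each of the four circuits gets 250 cells.
```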

Tor relays also test reachability of their DirPort once they have
established a circuit, but they use an ordinary exit circuit for
this purpose.

<a id="path-spec.txt-2.1.4"></a>

## Hidden-service circuits {#onion-service}

See section 4 below.

<a id="path-spec.txt-2.1.5"></a>

## Rate limiting of failed circuits

If we fail to build a circuit N times in an X-second period
(see ["Handling failure"](./handling-failure.md)
for how this works), we stop building circuits until the X seconds
have elapsed.
XXXX
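
One way to read this rule, sketched with N and X left as parameters (their
values come from "Handling failure"); the class and method names are
invented for illustration.

```python
import time

class CircuitBuildRateLimiter:
    """Illustrative only: N failures inside one X-second window pause
    circuit building until that window has elapsed."""

    def __init__(self, n_failures, x_seconds):
        self.n = n_failures
        self.x = x_seconds
        self.failures = []        # timestamps of recent failures
        self.paused_until = 0.0

    def note_failure(self, now=None):
        now = time.time() if now is None else now
        # Keep only failures inside the current X-second window.
        self.failures = [t for t in self.failures if now - t < self.x]
        self.failures.append(now)
        if len(self.failures) >= self.n:
            # Pause until the window that began at the first failure ends.
            self.paused_until = self.failures[0] + self.x

    def may_build(self, now=None):
        now = time.time() if now is None else now
        return now >= self.paused_until
```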

<a id="path-spec.txt-2.1.6"></a>

## When to tear down circuits

Clients should tear down circuits (in general) only when those circuits
have no streams on them.  Additionally, clients should tear down
stream-less circuits only under one of the following conditions:

```text
     - The circuit has never had a stream attached, and it was created too
       long in the past (based on CircuitsAvailableTimeout or
       cbtlearntimeout, depending on timeout estimate status).

     - The circuit is dirty (has had a stream attached), and it has been
       dirty for at least MaxCircuitDirtiness.
```
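
A hedged sketch of these tear-down rules as a single predicate; the circuit
fields (`num_streams`, `dirty_since`, `created_at`) are assumed for the
example.

```python
def may_tear_down(circuit, now, circuits_available_timeout,
                  max_circuit_dirtiness):
    """Should this circuit be torn down?"""
    if circuit.num_streams > 0:
        return False      # never tear down a circuit that still carries streams
    if circuit.dirty_since is not None:
        # Dirty (has had a stream) for at least MaxCircuitDirtiness.
        return now - circuit.dirty_since >= max_circuit_dirtiness
    # Never used: too old, judged against CircuitsAvailableTimeout (or
    # cbtlearntimeout while the timeout estimate is still being learned).
    return now - circuit.created_at >= circuits_available_timeout
```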