Diffstat (limited to 'spec/intro/index.md')
-rw-r--r-- | spec/intro/index.md | 140 |
1 files changed, 140 insertions, 0 deletions
diff --git a/spec/intro/index.md b/spec/intro/index.md
new file mode 100644
index 0000000..894d7e0
--- /dev/null
+++ b/spec/intro/index.md
@@ -0,0 +1,140 @@
+# A short introduction to Tor {#tor-intro}
+
+### Basic functionality {#basics}
+
+Tor is a distributed overlay network designed to anonymize
+low-latency TCP-based applications
+such as web browsing, secure shell, and instant messaging.
+The network is built of a number of servers, called **relays**
+(also called "onion routers" or "ORs" in some older documentation).
+
+To connect to the network,
+a client needs to download an up-to-date signed directory
+of the relays on the network.
+These directory documents are generated and signed
+by a set of semi-trusted **directory authority** servers,
+and are cached by the relays themselves.
+(If a client does not yet have a directory,
+it finds a cache by looking at a list of stable cache locations,
+distributed along with its source code.)
+
+> For more information on the directory subsystem,
+> see the [directory protocol specification](../dir-spec).
+
+After the client knows the relays on the network,
+it can pick a relay and open a [**channel**](../tor-spec/channels.md)
+to one of these relays.
+A channel is an encrypted reliable non-anonymous transport
+between a client and a relay or a relay and a relay,
+used to transmit messages called [**cells**](../tor-spec/cell-packet-format.md).
+(Under the hood, a channel is just a TLS connection over TCP,
+with a specified encoding for cells.)
+
+To anonymize its traffic,
+a client chooses a **path**—a sequence of relays on the network—
+and opens a channel to the first relay on the path
+(if it does not already have a channel open to that relay).
+The client then uses that channel to build
+a multi-hop cryptographic structure
+called a [**circuit**](../tor-spec/circuit-management.md).
+A circuit is built over a sequence of relays (typically three).
+Every relay in the circuit knows its predecessor and successor,
+but no other relays in the circuit.
+Many circuits can be multiplexed over a single channel.
+
+> For more information on how paths are selected,
+> see the [path specification](../path-spec).
+> The first hop on a path,
+> also called a **guard node**,
+> has complicated rules for its selection;
+> for more on those, see the [guard specification](../guard-spec).
+
+Once a circuit exists,
+the client can use it to exchange fixed-length
+[**relay cells**](../tor-spec/relay-cells.md)
+with any relay on the circuit.
+These relay cells are wrapped in multiple layers of encryption:
+as part of building the circuit,
+the client [negotiates](../tor-spec/create-created-cells.md)
+a separate set of symmetric keys
+with each relay on the circuit.
+Each relay removes (or adds)
+a [single layer of encryption](../tor-spec/routing-relay-cells.md)
+for each relay cell before passing it on.
+
+A client uses these relay cells
+to exchange [**relay messages**](../tor-spec/relay-cells.md) with relays on a circuit.
+These "relay messages" in turn are used
+to actually deliver traffic over the network.
+In the [simplest use case](../tor-spec/opening-streams.md),
+the client sends a `BEGIN` message
+to tell the last relay on the circuit
+(called the **exit node**)
+to create a new session, or **stream**,
+and associate that stream
+with a new TCP connection to a target host.
+The exit node replies with a `CONNECTED` message
+to say that the TCP connection has succeeded.
+Then the client and the exit exchange `DATA` messages
+to represent the contents of the anonymized stream.
+
+> Note that as of 2023,
+> the specifications do not perfectly distinguish
+> between relay cells and relay messages.
+> This is because, until recently,
+> there was a 1-to-1 relationship between the two:
+> every relay cell held a single relay message.
+> As [proposal 340](../proposals/340-packed-and-fragmented.md) is implemented,
+> we will revise the specifications
+> for improved clarity on this point.
+
+Other kinds of relay messages can be used
+for more advanced functionality.
+
+<!-- TODO: I'm not so sure about the vocabulary in this part. -->
+
+Using a system called **conflux**,
+a client can build multiple circuits to the _same_ exit node,
+and associate those circuits within a **conflux set**.
+Once this is done,
+relay messages can be sent over _either_ circuit in the set,
+depending on capacity and performance.
+
+> For more on conflux,
+> which has been integrated into the C tor implementation,
+> but not yet (as of 2023) into this document,
+> see [proposal 329](../proposals/329-traffic-splitting.txt).
+
+### Advanced topics: Onion services and responder anonymity {#onions}
+
+In addition to _initiating_ anonymous communications,
+clients can also arrange to _receive_ communications
+without revealing their identity or location.
+This is called **responder anonymity**,
+and the mechanism Tor uses to achieve it
+is called **onion services**
+(or "hidden services" or "rendezvous services"
+in some older documentation).
+
+> For the details on onion services,
+> see the [Tor Rendezvous Specification](../rend-spec).
+
+### Advanced topics: Censorship resistance {#anticensorship}
+
+In some places, Tor is censored.
+Typically, censors do this by blocking connections
+to the addresses of the known Tor relays,
+and by blocking traffic that resembles Tor.
+
+To resist this censorship,
+some Tor relays, called **bridges**,
+are unlisted in the public directory:
+their addresses are distributed by [other means](../bridgedb-spec.md).
+(To distinguish ordinary published relays from bridges,
+we sometimes call the former **public relays**.)
+
+Additionally, Tor clients and bridges can use extension programs,
+called [**pluggable transports**](../pt-spec),
+that obfuscate their traffic to make it harder to detect.
+
+
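
The layered encryption described in the introduction (one set of symmetric keys per relay, with each relay removing a single layer before passing a relay cell on) can be sketched roughly as below. This is a toy illustration and not part of the patch. The names `CELL_LEN`, `keystream`, `xor_layer`, `client_wrap`, and `relay_forward` are hypothetical, the keys are hard-coded rather than negotiated, and the SHA-256 counter keystream stands in for a real stream cipher; it is not the cipher Tor specifies.

```python
import hashlib

# Hypothetical fixed cell length for this sketch; not the size Tor uses.
CELL_LEN = 64


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key || counter. Illustration only."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_layer(key: bytes, data: bytes) -> bytes:
    """Add or remove one encryption layer (XOR is its own inverse)."""
    stream = keystream(key, len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


def client_wrap(hop_keys: list[bytes], payload: bytes) -> bytes:
    """Client side: apply one layer per relay, innermost layer for the last hop."""
    cell = payload
    for key in reversed(hop_keys):
        cell = xor_layer(key, cell)
    return cell


def relay_forward(hop_keys: list[bytes], cell: bytes) -> list[bytes]:
    """Walk the circuit in path order; each relay strips only its own layer.

    Returns what the cell looks like after each hop has processed it.
    """
    views = []
    for key in hop_keys:
        cell = xor_layer(key, cell)
        views.append(cell)
    return views


# Usage: a three-hop circuit with made-up keys.
hop_keys = [b"guard-key", b"middle-key", b"exit-key"]
payload = b"hello".ljust(CELL_LEN, b"\0")  # pad to the fixed cell length
cell = client_wrap(hop_keys, payload)
views = relay_forward(hop_keys, cell)
```

In this sketch only the last entry of `views` equals the padded payload; the earlier relays see data that is still wrapped in the remaining layers. Integrity checks and per-cell cipher state, both of which a real design needs, are deliberately left out.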