aboutsummaryrefslogtreecommitdiff
path: root/doc/design-paper/performance.tex
blob: 748eaa3f24faa3421b26513b4ca5f0f845c1ff6a (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
\documentclass{article}
%\usepackage{palatcm}
\usepackage{fancyhdr}
\usepackage{color}
\usepackage{graphicx}

\usepackage{prettyref}
\newrefformat{sec}{Section~\ref{#1}}
\newrefformat{tab}{Table~\ref{#1}}
\newrefformat{fig}{Figure~\ref{#1}}
\newrefformat{cha}{Chapter~\ref{#1}}
\newrefformat{eqn}{Equation~\ref{#1}}
\newrefformat{apx}{Appendix~\ref{#1}}

\usepackage{hyperref}
\hypersetup{colorlinks, citecolor=MyDarkRed, filecolor=MyDarkBlue, linkcolor=MyDarkRed, urlcolor=MyDarkBlue}

\definecolor{MyDarkBlue}{rgb}{0, 0.0, 0.45}
\definecolor{MyDarkRed}{rgb}{0.45, 0.0, 0}
\definecolor{MyDarkGreen}{rgb}{0, 0.45, 0}
\definecolor{MyLightGray}{gray}{.90}
\definecolor{MyLightGreen}{rgb}{0.5, 0.99, 0.5}

\usepackage{xspace}
\makeatletter
\newcommand{\ie}{i.e.\@\xspace}
\newcommand{\eg}{e.g.\@\xspace}
\newcommand{\cf}{cf.\@\xspace}
\newcommand{\vs}{vs.\@\xspace}
\newcommand{\wrt}{w.r.t.\@\xspace}
\newcommand{\etal}{\textit{et al.\@\xspace}}
\newcommand{\detal}{\textit{et al.}}
\newcommand{\ia}{inter alia\xspace}
\makeatother

\newcommand{\thetitle}{Performance Improvements on Tor}
\title{\thetitle}

%% Please add your name in here if you contribute
\author{Steven J. Murdoch}

\pagestyle{fancy}
\fancyhf{}

\fancyhead[C]{\thetitle}
\fancyfoot[C]{\thepage}  

\begin{document}

\thispagestyle{plain}
 
\maketitle

\section{Considering exit policy in node selection}

When selecting an exit node for a circuit, a Tor client will build a list of all exit nodes which can carry the desired stream, then select from them with a probability weighted by each node's capacity\footnote{The actual algorithm is slightly more complex, in particular exit nodes which are also guard nodes will be weighted less, and there is also preemptive circuit creation}.
This means that nodes with more permissive exit policies will be candidates for more circuits, and hence will be more heavily loaded compared to nodes with restrictive policies.

\begin{figure}
\includegraphics[width=\textwidth]{node-selection/exit-capacity}
\caption{Exit node capacity, in terms of number of nodes and advertised bandwidth for a selection of port numbers.}
\label{fig:exit-capacity}
\end{figure}

\prettyref{fig:exit-capacity} shows the exit node capacity for a selection of port numbers.
It can be clearly seen that there is a radical difference in the availability of nodes for certain ports, generally those not in the default exit policy.
Any traffic to these ports will be routed through a small number of exit nodes, and if they have a permissive exit policy, they will likely become overloaded from all the other traffic they receive.
The extent of this effect will depend on how much traffic in Tor is to ports which are not in the default exit policy.

The overloading of permissive exit nodes can be counteracted by adjusting the selection probability of a node based on its exit policy and knowledge of the network load per-port.
While it should improve performance, this modification will make it easier for malicious exit nodes to select traffic they wish to monitor.
For example, an exit node which wants to attack SSH sessions can currently list only port 22 in their exit policy.
Currently they will get a small amount of traffic compared to their capacity, but with the modification they will get a much larger share of SSH traffic.
However a malicious exit node could already do this, by artificially inflating their advertised bandwidth capacity.

\subsection{Further work}

To properly balance exit node usage, it is necessary to know the usage of the Tor network, by port.
McCoy \etal have figures for protocol usage in Tor, but these figures were generated through deep packet inspection, rather than by port number.
Furthermore, the exit node they ran used the fairly permissive default exit policy.
Therefore, their measurements will underestimate the relative traffic on ports which are present in the default exit policy, and are also present in more restrictive policies.
To accurately estimate the Tor network usage by port, it is necessary to measure the network usage by port on one or more exit nodes, while simultaneously recording the exit policy of all other exit nodes considered usable.

\section{Altering node selection algorithm}

Currently Tor selects nodes with a probability proportional to their bandwidth contribution to the network, however this may not be the optimal algorithm.
Murdoch and Watson investigated the performance impact of different node selection algorithms, and derived a formula for estimating average latency $T$:

\begin{equation}
T = \sum_{i=1}^n q_i t_i = \sum_{i=1}^n \frac{q_i x_i (2 - q_i x_i \Lambda)}{2 (1 - q_i x_i \Lambda)}
\label{eqn:waiting}
\end{equation}

Where $q_i$ is the probability of the $i$th node (out of $n$ nodes) being selected, $t_i$ is the average latency at the $i$th node, $x_i$ is the reciprocal of the $i$th node's bandwidth, and $\Lambda$ is the total network load.

This calculation is subject to a number of assumptions.
In particular, it assumes that Tor nodes have infinite length queues and input traffic is Poisson distributed.
Whereas in practise Tor nodes have finite length queues (which controls network load), and the distribution of input cells is not known.
Unfortunately, these assumptions are necessary to apply standard queueing theory results.

Despite the simplifications made to the network model, results derived from it may still be useful.
This is especially the case because it models the entire network, whereas experiments can feasibly change only a few of the clients' behaviour.
The formula is also amenable to mathematical analysis such as non-linear optimization.

To try and find the optimum node selection probabilities, I used a hill-climbing algorithm to minimize network latency, with a Tor directory snapshot as input.
The results (shown in \prettyref{fig:optimum-selection} and \prettyref{fig:relative-selection}) depend on the network load relative to overall capacity.
As load approaches capacity, the optimum selection probabilities converge to the one used by Tor: node bandwidth proportional to network capacity.
However, as load drops, the optimized selection algorithm favours slow nodes less and faster nodes more; many nodes are not used at all.

\begin{figure}
\includegraphics[width=\textwidth]{node-selection/optimum-selection-probabilities}
\caption{Optimum node selection probabilities for a variety of network loads. Tor is currently at around 50\% utilization. The node selection probabilities currently used by Tor are shown in black.}
\label{fig:optimum-selection}
\end{figure}

\begin{figure}
\includegraphics[width=\textwidth]{node-selection/relative-selection-probabilities}
\caption{Difference between Tor's current node selection probabilities and the optimum, for a variety of network loads. For Tor's current network load ($\approx 50$\%) shown in pink, the slowest nodes are not used at all, and the slower nodes are favoured less.}
\label{fig:relative-selection}
\end{figure}

\subsection{Impact of network load estimation}

The node selection probabilities discussed above are tuned to a particular level of network load.
It is possible to estimate network load because all Tor nodes report back both their capacity and usage in their descriptor.
However, this estimate may be inaccurate and it is possible that the network load will change over time.
In which case the node selection algorithm chosen will no longer be optimal.
\prettyref{fig:vary-load} shows how average network latency is affected by node selection probabilities, for different levels of network load.

As can be seen, small changes in network load do not significantly affect the network latency, and for all load levels examined, the optimized selection probabilities do offer lower latency when compared to the Tor selection algorithm.
However, each probability distribution has a cut-off point at which at least one node will have a higher load than its capacity, at which its queue length, and hence latency, will become infinite.
For the optimized selection probability distributions, this cut-off point is a few percent above the level they were designed to operate at.
For the Tor selection algorithm, it is when the overall network capacity equals the overall network load.

In this respect the Tor selection algorithm reaches the theoretical optimum, as no network can operate at greater than 100\% utilization while maintaining finite latency.
However, in a real instantiation of any of these alternative probability distributions, the network latency would not become infinite; instead a connection would time out and a different circuit would be selected.
So in practise, if the wrong probability distribution was selected, the network would converge at a different one.
Unfortunately the standard queuing theory models cannot handle this case and it seems likely that a simulation would be required to estimate the effect.

\begin{figure}
\includegraphics[width=\textwidth]{node-selection/vary-network-load}
\caption{Average network latency against network load. Three node selection probabilities are shown, optimized for 50\%, 75\%, and 90\% network load. The Tor node selection algorithm is also included (black). The dots on the $x$ axis show the level of network load at which the node selection probability distributions are optimized for. The line is cut off when the model predicts that at least one node will have an infinite queue length, which occurs before load $=$ capacity for all node selection algorithms except for Tor's current one.}
\label{fig:vary-load}
\end{figure}

\section{TLS application record overhead reduction}

OpenSSL will, by default, insert an empty TLS application record before any one which contains data.
This is to prevent an attack, by which someone who has partial control over the plaintext of a TLS stream, can also confirm guesses as to the plaintext which he does not control.
By including an empty application record, which incorporates a MAC, the attacker is made unable to control the CBC initialization vector, and hence does not have control of the input to the encryption function\footnote{\url{http://www.openssl.org/~bodo/tls-cbc.txt}}.

This application record does introduce an appreciable overhead.
Most Tor cells are sent in application records of their own, giving application records of 512 bytes (cell) $+$ 20 bytes (MAC) $+$ 12 bytes (TLS padding) $+$ 5 bytes (TLS application record header) $=$ 549 bytes.
The empty application records contain only 20 bytes (MAC) $+$ 12 bytes (TLS padding) $+$ 5 bytes (TLS application record header) $=$ 37 bytes.
There is also a 20 byte IP header and 32 byte TCP header.

Thus the overhead saved by removing the empty TLS application record itself is $37 / (549 + 37 + 20 + 32) = 5.8\%$.
This calculation is assuming that the same number of IP packets will be sent, because currently Tor sends packets, with only one cell, far smaller than the path MTU.
If Tor were to pack cells optimally efficiently into packets, then removing the empty application records would also reduce the number of packets, and hence TCP/IP headers, that needed to be sent.
The reduction in TCP/IP header overhead would be $37/(549 + 37) = 6.3\%$.

Of course, the empty application record was inserted for a reason -- to prevent an attack on the CBC mode of operation used by TLS, so before removing it we must be confident the attack does not apply to Tor.
Ben Laurie (one of the OpenSSL developers), concluded that in his opinion Tor could safely remove the insertion of empty TLS application records\footnote{\url{http://archives.seul.org/or/dev/Dec-2008/msg00005.html}}.
I was able to come up with only certificational weaknesses (discussed in the above analysis), which are expensive to exploit and give little information to the attacker.

To be successful, the attacker must have full control of the plaintext application record before the one he wishes to guess.
Tor makes this difficult because all cells where the payload is controlled by the attacker are prepended with a two byte circuit ID, unknown to the attacker.
Also, because the majority of cells sent in Tor are encrypted by a key not known by the attacker, the probability that an attacker can guess what a cell might be is extremely small.
The exception is a padding cell, which has no circuit ID and a zero length payload, however Tor does not currently send padding cells, other than as a periodic keep-alive.

\subsection*{Acknowledgements}

% Mike Perry

\end{document}