summaryrefslogtreecommitdiff
path: root/doc/design-paper/performance.tex
blob: 41d7ea8dbb90742756766b6636ffa4eb177d0a79 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
\documentclass{article}
%\usepackage{palatcm}
\usepackage{fancyhdr}
\usepackage{color}
\usepackage{graphicx}

\usepackage{prettyref}
\newrefformat{sec}{Section~\ref{#1}}
\newrefformat{tab}{Table~\ref{#1}}
\newrefformat{fig}{Figure~\ref{#1}}
\newrefformat{cha}{Chapter~\ref{#1}}
\newrefformat{eqn}{Equation~\ref{#1}}
\newrefformat{apx}{Appendix~\ref{#1}}

\usepackage{hyperref}
\hypersetup{colorlinks, citecolor=MyDarkRed, filecolor=MyDarkBlue, linkcolor=MyDarkRed, urlcolor=MyDarkBlue}

\definecolor{MyDarkBlue}{rgb}{0, 0.0, 0.45}
\definecolor{MyDarkRed}{rgb}{0.45, 0.0, 0}
\definecolor{MyDarkGreen}{rgb}{0, 0.45, 0}
\definecolor{MyLightGray}{gray}{.90}
\definecolor{MyLightGreen}{rgb}{0.5, 0.99, 0.5}

\newcommand{\thetitle}{Performance Improvements on Tor}
\title{\thetitle}

%% Please add your name in here if you contribute
\author{Steven J. Murdoch}

\pagestyle{fancy}
\fancyhf{}

\fancyhead[C]{\thetitle}
\fancyfoot[C]{\thepage}  

\begin{document}

\thispagestyle{plain}
 
\maketitle

\section{Altering node selection algorithm}

Currently Tor selects nodes with a probability proportional to their bandwidth contribution to the network, however this may not be the optimal algorithm.
Murdoch and Watson investigated the performance impact of different node selection algorithms, and derived a formula for estimating average latency $T$:

\begin{equation}
T = \sum_{i=1}^n q_i t_i = \sum_{i=1}^n \frac{q_i x_i (2 - q_i x_i \Lambda)}{2 (1 - q_i x_i \Lambda)}
\label{eqn:waiting}
\end{equation}

Where $q_i$ is the probability of the $i$th node (out of $n$ nodes) being selected, $t_i$ is the average latency at the $i$th node, $x_i$ is the reciprocal of the $i$th node's bandwidth, and $\Lambda$ is the total network load.

This calculation is subject to a number of assumptions.
In particular, it assumes that Tor nodes have infinite length queues and input traffic is Poisson distributed.
Whereas in practise Tor nodes have finite length queues (which controls network load), and the distribution of input cells is not known.
Unfortunately, these assumptions are necessary to apply standard queueing theory results.

Despite the simplifications made to the network model, results derived from it may still be useful.
This is especially the case because it models the entire network, whereas experiments can feasibly change only a few of the clients' behaviour.
The formula is also amenable to mathematical analysis such as non-linear optimization.

To try and find the optimum node selection probabilities, I used a hill-climbing algorithm to minimize network latency, with a Tor directory snapshot as input.
The results (shown in \prettyref{fig:optimum-selection} and \prettyref{fig:relative-selection}) depend on the network load relative to overall capacity.
As load approaches capacity, the optimum selection probabilities converge to the one used by Tor: node bandwidth proportional to network capacity.
However, as load drops, the optimized selection algorithm favours slow nodes less and faster nodes more; many nodes are not used at all.

\begin{figure}
\includegraphics[width=\textwidth]{node-selection/optimum-selection-probabilities}
\caption{Optimum node selection probabilities for a variety of network loads. Tor is currently at around 50\% utilization. The node selection probabilities currently used by Tor are shown in black.}
\label{fig:optimum-selection}
\end{figure}

\begin{figure}
\includegraphics[width=\textwidth]{node-selection/relative-selection-probabilities}
\caption{Difference between Tor's current node selection probabilities and the optimum, for a variety of network loads. For Tor's current network load ($\approx 50$\%) shown in pink, the slowest nodes are not used at all, and the slower nodes are favoured less.}
\label{fig:relative-selection}
\end{figure}

\subsection{Impact of network load estimation}

The node selection probabilities discussed above are tuned to a particular level of network load.
It is possible to estimate network load because all Tor nodes report back both their capacity and usage in their descriptor.
However, this estimate may be inaccurate and it is possible that the network load will change over time.
In which case the node selection algorithm chosen will no longer be optimal.
\prettyref{fig:vary-load} shows how average network latency is affected by node selection probabilities, for different levels of network load.

As can be seen, small changes in network load do not significantly affect the network latency, and for all load levels examined, the optimized selection probabilities do offer lower latency when compared to the Tor selection algorithm.
However, each probability distribution has a cut-off point at which at least one node will have a higher load than its capacity, at which its queue length, and hence latency, will become infinite.
For the optimized selection probability distributions, this cut-off point is a few percent above the level they were designed to operate at.
For the Tor selection algorithm, it is when the overall network capacity equals the overall network load.

In this respect the Tor selection algorithm reaches the theoretical optimum, as no network can operate at greater than 100\% utilization while maintaining finite latency.
However, in a real instantiation of any of these alternative probability distributions, the network latency would not become infinite; instead a connection would time out and a different circuit would be selected.
So in practise, if the wrong probability distribution was selected, the network would converge at a different one.
Unfortunately the standard queuing theory models cannot handle this case and it seems likely that a simulation would be required to estimate the effect.

\begin{figure}
\includegraphics[width=\textwidth]{node-selection/vary-network-load}
\caption{Average network latency against network load. Three node selection probabilities are shown, optimized for 50\%, 75\%, and 90\% network load. The Tor node selection algorithm is also included (black). The dots on the $x$ axis show the level of network load at which the node selection probability distributions are optimized for. The line is cut off when the model predicts that at least one node will have an infinite queue length, which occurs before load $=$ capacity for all node selection algorithms except for Tor's current one.}
\label{fig:vary-load}
\end{figure}

\section{TLS application record overhead reduction}

OpenSSL will, by default, insert an empty TLS application record before any one which contains data.
This is to prevent an attack, by which someone who has partial control over the plaintext of a TLS stream, can also confirm guesses as to the plaintext which he does not control.
By including an empty application record, which incorporates a MAC, the attacker is made unable to control the CBC initialization vector, and hence does not have control of the input to the encryption function\footnote{\url{http://www.openssl.org/~bodo/tls-cbc.txt}}.

This application record does introduce an appreciable overhead.
Most Tor cells are sent in application records of their own, giving application records of 512 bytes (cell) $+$ 20 bytes (MAC) $+$ 12 bytes (TLS padding) $+$ 5 bytes (TLS application record header) $=$ 549 bytes.
The empty application records contain only 20 bytes (MAC) $+$ 12 bytes (TLS padding) $+$ 5 bytes (TLS application record header) $=$ 37 bytes.
There is also a 20 byte IP header and 32 byte TCP header.

Thus the overhead saved by removing the empty TLS application record itself is $37 / (549 + 37 + 20 + 32) = 5.8\%$.
This calculation is assuming that the same number of IP packets will be sent, because currently Tor sends packets, with only one cell, far smaller than the path MTU.
If Tor were to pack cells optimally efficiently into packets, then removing the empty application records would also reduce the number of packets, and hence TCP/IP headers, that needed to be sent.
The reduction in TCP/IP header overhead would be $37/(549 + 37) = 6.3\%$.

Of course, the empty application record was inserted for a reason -- to prevent an attack on the CBC mode of operation used by TLS, so before removing it we must be confident the attack does not apply to Tor.
Ben Laurie (one of the OpenSSL developers), concluded that in his opinion Tor could safely remove the insertion of empty TLS application records\footnote{\url{http://archives.seul.org/or/dev/Dec-2008/msg00005.html}}.
I was able to come up with only certificational weaknesses (discussed in the above analysis), which are expensive to exploit and give little information to the attacker.

To be successful, the attacker must have full control of the plaintext application record before the one he wishes to guess.
Tor makes this difficult because all cells where the payload is controlled by the attacker are prepended with a two byte circuit ID, unknown to the attacker.
Also, because the majority of cells sent in Tor are encrypted by a key not known by the attacker, the probability that an attacker can guess what a cell might be is extremely small.
The exception is a padding cell, which has no circuit ID and a zero length payload, however Tor does not currently send padding cells, other than as a periodic keep-alive.

\end{document}