\chapter{Performance Measurement}
When measuring performance, we look at the additional running time a task requires when its software is provided by \cvmfs\ instead of a local file system or an alternative network file system.
We distinguish between \newterm{cold cache} and \newterm{warm cache}.
Cold cache means the cache is empty at the beginning of the benchmark.
Warm cache means all necessary files are already in the local disk cache at the beginning of the benchmark.
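A cold-cache run can be reproduced by clearing both the \cvmfs\ disk cache and the kernel caches before starting the benchmark; running the same workload a second time then measures the warm-cache case. The following sketch illustrates the procedure; the benchmark script is a placeholder, and the \texttt{cvmfs\_config wipecache} command is available in recent clients only, older clients require clearing the cache directory manually.
\begin{verbatim}
# cold cache: clear the CernVM-FS disk cache and the kernel caches,
# then time the benchmark
sudo cvmfs_config wipecache
sync; echo 3 | sudo tee /proc/sys/vm/drop_caches
time ./run_benchmark.sh

# warm cache: repeat without clearing anything, so all files are
# served from the local disk cache and the kernel caches
time ./run_benchmark.sh
\end{verbatim}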
\section{Cold Cache}
Cold cache measurements show the network performance of \cvmfs.
We use the ROOT application benchmark \texttt{stressHepix}.
The \texttt{stressHepix} benchmark is a collection of several ROOT benchmarks that cover typical HEP tasks.
Our benchmarks also include compilation of the stressHepix test suite with GCC 3.4.
The overall running time on the benchmark machine is approximately 3 minutes.
While running the benchmark, the system has to find and fetch certain libraries and headers from the distributed file system but, once loaded, the libraries remain in a local cache.
The cache footprint is about 650 files totalling roughly 100\,MB.
To put the numbers into context, we compare \cvmfs\ to {\scshape AFS} and {\scshape GROW-FS}, another HTTP file system.
We use the same setup as the \cernvm\ production environment. The file server runs Apache 2.2.8 and Squid 2.6 for \cvmfs\ and OpenAFS 1.4.8 for {\scshape AFS}, with the repository stored on a read-only volume.
The benchmark machine is connected to the server via local Gbit Ethernet.
In order to see the impact of wide area networks, we shape the outgoing network traffic using the \texttt{tc} tool with \texttt{netem} and \texttt{tbf} qdiscs\footnote{So-called queueing disciplines allow setting certain properties of the outgoing network traffic kernel queue. See also \url{http://lartc.org/howto/lartc.qdisc.html}}.
Thus, we artificially introduce latency and bandwidth restrictions.
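As an illustration, a \texttt{netem} qdisc adding a fixed delay can be combined with a \texttt{tbf} qdisc limiting the rate; one possible invocation, with an illustrative interface name and parameter values, looks as follows.
\begin{verbatim}
# add 100 ms of delay to outgoing packets on eth0
tc qdisc add dev eth0 root handle 1:0 netem delay 100ms

# additionally limit the outgoing rate to 15 Mbit/s with a
# token bucket filter attached below the netem qdisc
tc qdisc add dev eth0 parent 1:1 handle 10: tbf rate 15mbit \
    buffer 32kb limit 64kb
\end{verbatim}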
Figures~\ref{fig:latency} and~\ref{fig:bandwidth} show the results.
Latency in particular turns out to be the key issue, since compiling and running software results in many \texttt{open()} operations.
Furthermore, without specifically tuned TCP parameters, we see a strong decrease in TCP throughput caused by the high bandwidth-delay product.
For \cvmfs, noticeable performance losses due to limited bandwidth do not appear until the throughput drops to 15--18\,Mb/s, which is the download speed of current ADSL+ connections.
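The effect of the bandwidth-delay product can be estimated from the window size: a single TCP connection transfers at most one window of data per round trip. With a standard 16\,KiB window and a round trip time of 100\,ms (cf.\ Figure~\ref{fig:latency}), the throughput of a single connection is bounded by
\[
  \mbox{throughput} \leq \frac{\mbox{window size}}{\mbox{RTT}}
                     = \frac{16\,\mbox{KiB}}{100\,\mbox{ms}}
                     \approx 1.3\,\mbox{Mb/s},
\]
independently of the nominal link bandwidth.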
\begin{figure}
\begin{center}
\input{latency}
\end{center}
\caption{Time penalty as a function of latency for {\scshape AFS} and \cvmfs\ compared to local storage. With a standard window size of 16\,KiB, TCP throughput is directly affected by latency. A round trip time of 100\,ms corresponds to the parameters observed between a client running on Amazon EC2 (U.S. East Coast) and file servers located at CERN.}
\label{fig:latency}
\end{figure}
\begin{figure}
\begin{center}
\input{bandwidth}
\end{center}
\caption{Impact of low bandwidth in wide area networks in addition to high latency (100\,ms).}
\label{fig:bandwidth}
\end{figure}
\section{Warm Cache}
As soon as all the necessary files are in the local disk cache, there is almost no additional running time for \cvmfs\ compared to a local file system.
Though Fuse introduces some overhead, the cache layers work particularly well when running and compiling software.
Table~\ref{tbl:warmcache} lists the results for typical tasks performed on \cvmfs.
In case of a kernel cache hit, the call is directly satisfied by the VFS layer without even calling the Fuse kernel module.
For the \texttt{open()} call, kernel caching is not possible because a new file descriptor has to be registered.
Here, the \cvmfs\ catalog cache at least mitigates the cost of an SQLite lookup.
Overall, for running and compiling software, the Fuse overhead is negligible.
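A simple way to obtain such system call counts is \texttt{strace}; the following sketch, with a placeholder benchmark command, prints a per-call summary for the benchmark and all of its child processes.
\begin{verbatim}
# print a per-syscall count summary for the whole process tree
strace -c -f ./run_benchmark.sh

# alternatively, trace only the calls most relevant for the
# file system benchmark
strace -f -e trace=open,stat,read ./run_benchmark.sh
\end{verbatim}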
\begin{table}
\begin{center}
\input{warmcache}
\end{center}
\caption{Number of system calls and cache hit rates for several benchmarks. The catalog cache covers \emph{remaining} system calls that are not already covered by a kernel cache hit.}
\label{tbl:warmcache}
\end{table}