-
Notifications
You must be signed in to change notification settings - Fork 30
/
2012Lecture10.tex
203 lines (177 loc) · 7.24 KB
/
2012Lecture10.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
%!TEX root = InfoSec.tex
% Lecture 10: 17 October 2012
\sektion{10}{Memory safety}
Core problem: how to \underline{isolate} programs running on the same machine.
\makebox[9cm]{\framebox[2.5cm]{component}\makebox[1cm]{$\leftrightarrow$}\parbox[c]{2cm}{interfaces\\\makebox[.75cm]{}\framebox[.5cm]I}\makebox[1cm]{$\leftrightarrow$}\framebox[2.5cm]{component}}\\
\makebox[9cm]{\makebox[6.5cm]{}\makebox[2.5cm]{$\updownarrow$}}\\
\makebox[9cm]{\makebox[6.5cm]{}\makebox[2.5cm]{\framebox[.5cm]{I}}}\\
\makebox[9cm]{\makebox[6.5cm]{}\makebox[2.5cm]{$\updownarrow$}}\\
\framebox[9cm]{operating system}
Goals:
\begin{itemize}
\item limit addresses a program can read/write
\item prohibit jumps outside prgram's own code (except via interfaces)
\end{itemize}
Common themes:
\begin{itemize}
\item enforcement mechanism more powerful than the program
\item enforcement mechanism must start first
\end{itemize}
Two styles of memory safety enforcement
\begin{itemize}
\item hardware-based (e.g. most OSes)
\item software-based (e.g. most browsers)
\end{itemize}
\subsektion{Hardware-based memory safety}
\begin{itemize}
\item virtual memory:
\begin{tabular}{rcc}
address & \framebox[4cm]{virtual page \#} & \framebox[3cm]{offset}\\
& $\downarrow$ & \multirow{3}{*}{\begin{sideways}$\xleftarrow{\qquad\qquad\quad}$\end{sideways}}\\
& \ovalbox{\parbox[c]{2cm}{\centering translate\\via page\\table}} &\\
& $\downarrow$ &\\
physical address & \framebox[4cm]{physical page \#} & \framebox[3cm]{offset}
\end{tabular}
page table, separate for each process
\begin{table}[h!]\centering\begin{tabular}{c|c}
virtual page \# & physical page \#\\
\hline
0 & 73\\
\hline
1 & 0\\
\hline
$\vdots$ & $\vdots$
\end{tabular}\end{table}
If physical page number isn't in the page table, then that program
can't access it.
\item page protection bits: for each page, say whether current program
\begin{itemize}
\item can read
\item can write
\item can execute (on some machines)
\end{itemize}
\end{itemize}
%Use same mechanism for controlling disk and network access hardware by mapping
%to region of virtual addresses?
{\bf OS Kernel}\\
Special program with superpowers -- can set up page table, protection bits, etc.
Hardware ``priviledged'' bit:
\begin{itemize}
\item in privileged mode, can modify permissions
\item on when kernel is running, off when not
\end{itemize}
Certain instructions will run only in privileged mode.
System call:
\begin{itemize}
\item special way for ordinary code to call the kernel
\item forces call to go through a special interface -- to a
\underline{fixed} program address in kernel
\item special ``syscall'' (hardware) instruction, simultaneously:
\begin{itemize}
\item sets privileged bit to true
\item sets program counter to a fixed value (forced to go through
entry point where everything will be checked)
\end{itemize}
\end{itemize}
{\bf Hardware-based: pros and cons}
\begin{description}
\item[+] simple mechanisms, can do fast checking
\item[--] slow to cross protection boundaries
\item[--] narrow interface between components
\item[--] hard to layer this on itself (if a program wants to use hardware-
based protection for its own internals)
\end{description}
\subsektion{Software-based memory safety}
Big fad back around 1970 then almost died out; came back in late 90's with
widespread use of the Internet
{\bf Usual model}
\begin{itemize}
\item code producer: writes and delievers code
\item code consumer: checks and runs code
\item goal: protect consumer from producer
\end{itemize}
{\bf Approach 1: interpretation}
\begin{tabular}{ccc}
CP & & CC\\
\framebox[3cm]{source code} & $\xrightarrow{\qquad\qquad}$ & \ovalbox{\parbox[c]{3cm}{\centering interpreter\\checks safety\\as it runs}}
\end{tabular}
Example: JavaScript
Pros: portable, rich interfaces, straightforward to implement
Cons: must use certain programming language. interpretation is often slow
{\bf Approach 2: type-safe language}
Example: Java in browser
Java type safety - every access to an object
\begin{itemize}
\item goes to an object that this program created, and
\item obeys \underline{private}, \underline{protected} keywords
\end{itemize}
\begin{theorem*}{\textnormal{(conjecture)}}
If memory is allocated only within a program's own space, then type safety
implies memory safety.
\end{theorem*}
\begin{tabular}{ccc}
CP & & CC\\
\framebox[3cm]{source code} & &\\
$\downarrow$ & &\\
\ovalbox{compiler} & &\\
$\downarrow$ & &\\
\framebox[3cm]{.class files} & $\xrightarrow{\qquad\qquad}$ & \ovalbox{verify}\\
& & $\downarrow$ if ok\\
& & \ovalbox{interpret/compile}
\end{tabular}
Pros and cons:
\begin{description}
\item[+] support rich interfaces
\item[+] untrusting components can run together
\item[--] some compilation/interpretation at the consumer (slow, complex)
\item[--] have to use one language? Java class files very dependent on the
Java type system, so compiling other kinds of files into .class hard
\end{description}
{\bf Approach 3: insert checking code}
\begin{tabular}{ccc}
CP & & CC\\
\framebox[3cm]{source code} & &\\
$\downarrow$ & &\\
\ovalbox{compiler} & &\\
$\downarrow$ & &\\
\framebox[3cm]{machine code} & $\xrightarrow{\qquad\qquad}$ & \ovalbox{transformer}\\
& & $\downarrow$\\
& & \framebox[3.5cm]{safe machine code}
\end{tabular}
Transformer inserts checking code before every read, write, or jump instruction.
Complexities:
\begin{itemize}
\item insert code $\rightarrow$ recalculate offsets in the code
\item can't let the code jump past the inserted checks
\end{itemize}
{\bf Approach 4: proof-carrying code}
\begin{tabular}{ccc}
CP & & CC\\
\framebox[3cm]{source code} & &\\
$\downarrow$ & &\\
\ovalbox{compiler} & &\\
$\downarrow$ & &\\
\framebox[3cm]{machine code} & $\xrightarrow{\qquad\qquad}$ & \multirow{2}{*}{\ovalbox{\parbox[c]{1.25cm}{verify\\proof}}}\\
\framebox[3cm]{proof} & $\xrightarrow{\qquad\qquad}$ &\\
& & $\downarrow$ if ok\\
& & \framebox[3cm]{machine code}
\end{tabular}
Where does the proof come from? Use a certifying compiler: starting with a
typesafe program, as the compiler goes through transformations to lower-level
representions of the program it also transforms the proof of why safe to that
level. Note that the compiler can always add additional checking code in order
to make the proof simpler.
What a safety proof looks like: (in assembly code)
\begin{itemize}
\item annotate each instruction iwth a precondition and a postcondition
\item to prove safety, prove:
\begin{itemize}
\item when program starts, precondition of first instruction is true
\item for every instruction, if precondition and execute
instruction, then afterward postcondition will be true
\item for every instruction, precondition implies (read/write/jump)
instruction is safe to execute
\item postcondition of each instruction implies precondition of
every instruction that can come next
\end{itemize}
\end{itemize}