<!DOCTYPE html>
<html>
<head>
<title>Key Ideas, Part 2</title>
<meta charset="utf-8">
<style>
@import url(https://fonts.googleapis.com/css?family=Montserrat);
@import url(https://fonts.googleapis.com/css?family=Lato:400,700,400italic);
@import url(https://fonts.googleapis.com/css?family=Source+Code+Pro:400,700,400italic);
body { font-family: 'Lato'; }
h1, h2, h3 {
font-family: 'Montserrat';
font-weight: normal;
}
img {
max-width: 100%;
}
.remark-code, .remark-inline-code { font-family: 'Source Code Pro'; }
</style>
</head>
<body>
<textarea id="source">
class: center, middle
# Key Ideas, Part 2
## Search spaces
---
# Search spaces
* Does the search space contain the solution?
* Can we find the solution?
???
* Want a search space that is large enough that it contains the solution.
* Want a search space that's well-structured. We can't use gradient descent to
  explore a discrete collection of functions, so we probably want a family of
  functions parameterized by continuous variables, one that's well-suited to
  gradient descent.
* These two requirements are in tension with each other.
* We already said we're using computational graphs, but what are the individual
nodes (operations)? In the linear regression example, we had some operations
like additions and multiplications along with pointwise squaring and
summation.
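* A minimal sketch of that example (toy data and numbers are illustrative, not
  from the lecture): the forward pass is exactly those graph operations, and
  gradient descent adjusts the continuous parameters.

```python
import numpy as np

# Toy data: y is roughly 3x + 1.
rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 3 * x + 1 + 0.1 * rng.normal(size=100)

w, b = 0.0, 0.0   # the continuous parameters we search over
lr = 0.1          # learning rate

for _ in range(200):
    y_hat = w * x + b            # multiply, add
    err = y_hat - y              # intermediate node of the graph
    loss = np.sum(err ** 2)      # pointwise square, then sum
    # Gradients of the loss w.r.t. w and b (chain rule, done by hand here;
    # a framework would derive these from the computational graph).
    dw = 2 * np.sum(err * x) / len(x)
    db = 2 * np.sum(err) / len(x)
    w -= lr * dw
    b -= lr * db

print(w, b)  # approaches 3 and 1
```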
---
# Neurons
.center[
![](figs/neuron.png)
]
???
* Brain-inspired model of computation. Neuron is the basic building block.
* Don't read too deeply into the brain-inspired part. It's the basic idea plus
a bunch of hacks that work well in practice.
* Vaguely: there's input from a bunch of different places, and if that input
exceeds a certain threshold, then a signal is propagated.
---
class: center, middle
# Artificial neurons
???
* Basic unit of computation
* [ 02-01-notes ]
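* A sketch of one artificial neuron (ReLU is just one common choice of
  nonlinearity; the full details are in the notes):

```python
import numpy as np

def neuron(x, w, b):
    """One artificial neuron: a weighted sum of the inputs plus a bias,
    passed through a nonlinearity that decides how strongly to 'fire'."""
    z = np.dot(w, x) + b        # pre-activation
    return max(0.0, z)          # ReLU activation

# Example: three inputs with fixed weights.
print(neuron(np.array([1.0, 2.0, 3.0]), np.array([0.5, -0.2, 0.1]), 0.05))
```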
---
class: center, middle
# Artificial neural networks
???
* [ 02-02-notes ]
* Think about the big ideas we talked about before: we're searching over
programs. This is merely describing a function parameterized by the weights
and biases, and we can find the optimal program by gradient descent to
minimize a loss which will express how well the function fits some data.
* At a high level: as you go deeper and deeper into the network, it's
extracting higher and higher level information. Basically, the deeper you go,
the further into the processing pipeline you are.
* Empirically, large and deep neural nets are really good at solving problems:
scaling up has a huge benefit (need lots of compute and data).
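* A sketch of the parameterized function itself (the layer sizes here are
  placeholders): the `(W, b)` pairs are exactly what gradient descent
  searches over.

```python
import numpy as np

def mlp(x, params):
    """Forward pass of a fully-connected network. params is a list of
    (W, b) pairs; the length of the list is the depth of the network."""
    for W, b in params[:-1]:
        x = np.maximum(0.0, W @ x + b)   # affine map + ReLU
    W, b = params[-1]
    return W @ x + b                     # last layer kept linear

rng = np.random.default_rng(0)
sizes = [4, 16, 16, 2]                   # placeholder architecture
params = [(rng.normal(size=(m, n)), np.zeros(m))
          for n, m in zip(sizes, sizes[1:])]
print(mlp(rng.normal(size=4), params))
```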
---
# Convolutional neural networks
.center[
![](figs/convolution.png)
]
<!-- image from http://intellabs.github.io/RiverTrail/tutorial/ -->
???
* Turns out that for many problems, deep fully-connected neural networks
describe too large a search space. The solution is definitely there, but we
can't efficiently search the space. So we need a better structural prior. For
image-related problems, one solution is convolutional neural networks.
* Convolution: have a convolutional filter, slide it over all possible
positions on the input, take dot products.
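* A sketch of that sliding dot product (strictly speaking this is
  cross-correlation, which is what deep learning libraries usually call
  "convolution"):

```python
import numpy as np

def conv2d(image, kernel):
    """Slide the kernel over every valid position; each output pixel is
    the dot product of the kernel with the patch underneath it."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out
```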
---
# Convolutional neural networks
* Example: Sobel kernel
`$$\begin{bmatrix}
1 & 0 & -1 \\
2 & 0 & -2 \\
1 & 0 & -1 \\
\end{bmatrix} * A$$`
.center[
![](figs/sobel.jpg)
]
???
* A small number of parameters can describe a powerful operation.
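* To try it (the random image here is just a stand-in for a real grayscale
  photo):

```python
import numpy as np
from scipy.signal import correlate2d  # correlation = the sliding dot product

sobel_x = np.array([[1, 0, -1],
                    [2, 0, -2],
                    [1, 0, -1]])       # only a handful of parameters

image = np.random.rand(64, 64)         # stand-in for a real grayscale image
edges = correlate2d(image, sobel_x, mode='valid')  # responds to vertical edges
```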
---
# Convolutional neural networks
.center[
![](figs/vgg16.png)
]
<!-- https://blog.heuritech.com/2016/02/29/a-brief-report-of-the-heuritech-deep-learning-meetup-5/ -->
???
* VGG-16 network from 2014: takes in images, predicts probability distribution
over the 1000 ImageNet classes.
* As you go deeper into the network, it extracts higher and higher level
representations. It's all convolution and pooling until the end, where there
are a couple of fully connected layers and, finally, a softmax, which
rescales the outputs into a probability distribution.
* Highly structured search space.
* This could be expressed by a huge deep fully-connected network, but you'd
never get it to converge.
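* The final softmax, sketched (subtracting the max is a standard trick for
  numerical stability):

```python
import numpy as np

def softmax(logits):
    """Rescale raw scores into a probability distribution."""
    z = logits - np.max(logits)   # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

p = softmax(np.random.randn(1000))  # e.g. scores over the 1000 ImageNet classes
assert np.isclose(p.sum(), 1.0) and (p >= 0).all()
```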
---
class: center, middle
# Neural network layers
???
* Many types of neural network layers.
* We don't have time to talk about all of these, but if you understand the
fundamental concepts, this stuff is easy to learn.
* Basically, layers are there either to structure the search space or to make
  it easier for gradient descent to find good minima.
---
class: center, middle
# Network architectures
???
* How do you choose a search space (network architecture)? Look at what others
  have done before and try your own variations; it requires some trial and
  error.
</textarea>
<script src="https://gnab.github.io/remark/downloads/remark-latest.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.1/MathJax.js?config=TeX-AMS_HTML&delayStartupUntil=configured" type="text/javascript"></script>
<script type="text/javascript">
var slideshow = remark.create({
countIncrementalSlides: false
});
// Setup MathJax
MathJax.Hub.Config({
tex2jax: {
skipTags: ['script', 'noscript', 'style', 'textarea', 'pre']
}
});
MathJax.Hub.Configured();
</script>
</body>
</html>