<!DOCTYPE HTML>
<!--
Solarize by TEMPLATED
templated.co @templatedco
Released for free under the Creative Commons Attribution 3.0 license (templated.co/license)
-->
<html>
<head>
<title>Final Project</title>
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
<meta name="description" content="" />
<meta name="keywords" content="" />
<!--[if lte IE 8]><script src="css/ie/html5shiv.js"></script><![endif]-->
<script src="js/jquery.min.js"></script>
<script src="js/jquery.dropotron.min.js"></script>
<script src="js/skel.min.js"></script>
<script src="js/skel-layers.min.js"></script>
<script src="js/init.js"></script>
<noscript>
<link rel="stylesheet" href="css/skel.css" />
<link rel="stylesheet" href="css/style.css" />
</noscript>
<!--[if lte IE 8]><link rel="stylesheet" href="css/ie/v8.css" /><![endif]-->
</head>
<body>
<!-- Header Wrapper -->
<div class="wrapper style1">
<!-- Header -->
<div id="header">
<div class="container">
<!-- Logo -->
<h1><a href="#" id="logo">Big Data</a></h1>
<!-- Nav -->
<nav id="nav">
<ul>
<li><a href="index.html">Overview</a></li>
<li><a href="syllabus.html">Syllabus</a></li>
<li><a href="schedule.html">Course Schedule</a></li>
<li class="active"><a href="project.html">Final Project</a></li>
<li><a href="resources.html">Resources</a></li>
</ul>
</nav>
</div>
</div>
</div>
<!-- Main -->
<div id="main" class="wrapper style6">
<!-- Content -->
<div id="content" class="container">
<section>
<header class="major">
<h2 align="center">Mathematics of Big Data</h2>
<span class="byline" align="center">
Professor Weiqing Gu <br>
Spring 2019
</span>
</header>
<h2>Final Project</h2>
<br>
<h3>Description:</h3>
<p>
This is by far the largest component of the course.
You will discover, explore, and attack a real-world
problem of your choosing.
There are <b> three </b> types of projects you
can work on, shown below in order of increasing difficulty:
</p>
<ul>
<li>
(1) Application of an existing algorithm to a new problem and
potentially new data.
</li>
<li>
(2) Algorithmic work. Extend an existing algorithm
or conceive a new one to solve some problem.
This inherently includes the first option because you will
need to test this new algorithm on data.
</li>
<li>
(3) Theoretical work. For example, create a new convergence bound
on a learning algorithm, or show that in some limit
one learning algorithm becomes another.
</li>
</ul>
<p>
These options also carry increasing risk. For example, you cannot
turn in a paper saying you worked on a convergence bound
for months with no results. Option two carries medium risk because
part of the process of creating a new algorithm is creating
baselines to improve upon, so you will have baseline results to
report even if the new algorithm falls short.
At <i>any</i> time during the course, <i>please</i> feel free
to come and discuss your problem with the instructor or TA
and ask them questions.
</p>
<h3>Requirements:</h3>
All of the requirements below must be satisfied
to receive full credit for the project:
<ul>
<br>
<li>
● <b><i>Partner:</i></b> <br>
Maximum of 1 partner (we may allow 2 partners
in exceptional cases, e.g., a huge coding project). All
partners must contribute equally.
</li>
<li>
● <b><i>Dataset:</i></b> <br>
You must use at least one dataset with at least
half a million (500,000) data points as a significant part of
your project.
</li>
<li>
● <b><i>Format:</i></b> <br>
Your report must be submitted as a PDF in
<a href="https://nips.cc/Conferences/2016/PaperInformation/AuthorSubmissionInstructions">NIPS format</a>.
Note that this means you must use LaTeX with their
<a href="https://nips.cc/Conferences/2016/PaperInformation/StyleFiles">style file</a>.
(NIPS, Neural Information Processing Systems,
is one of the major machine learning conferences.)<br>
If you do not know how to use LaTeX, we recommend finding a
partner who does.
</li>
<li>
● <b><i>Code Style:</i></b> <br>
All code used in the production of your final report should be clean
<a href="https://drivendata.github.io/cookiecutter-data-science/">
(suggested format)</a>
and placed in a public GitHub repository under the account of
one of the partners. Include the repository URL as a footnote
somewhere in your final PDF. Although it is not required,
we recommend releasing your code under an
<a href="http://choosealicense.com/">open-source license</a>
such as <a href="http://choosealicense.com/licenses/mit/">MIT</a>.
</li>
</ul>
<h3>Due Dates:</h3>
<ul>
<br>
<li>
● <b><i>Feb 11: Project Proposal</i></b><br>
A typed (LaTeX) proposal of
one page maximum explaining your problem, which
datasets you are likely to use (you must find some candidates),
who your partners are, and which methods (of those you know of)
you think you might use. Note that this is not 100% final,
but it should be within some epsilon of your final project.
</li>
<br>
<li>
● <b><i>April 1: Midterm Presentation</i></b><br>
A 10 to 12 minute presentation (plus 3 minutes for questions)
detailing your progress towards
your goal. The accompanying write-up should be 6 to 8 pages for a 1-person
group, 12 to 15 pages for a 2-person group, and 15 to 20 pages for a 3-person group.
</li>
<br>
<li>
● <b><i>April 15: Draft of Final Project Submission</i></b> <br>
A typed (LaTeX) draft of the final report and all code written
so far must be submitted.
The draft should detail your progress on the final project, which
is expected to be significant by this point. The draft is
used to demonstrate what you have done so far and to show that you
are ready for the final presentation.
It does not need to follow
the NIPS format (which is required for the final version).
The code does not
need to be fully clean and organized for this draft submission,
but it is expected to be cleaned up for the final submission. You do
not need to have the presentation slides ready for this submission.
</li>
<br>
<li>
● <b><i>May 8 (tentatively): Final Project Presentation</i></b> <br>
The presentation should be as detailed as possible and should
run between 10 minutes and half an hour.
</li>
<br>
<li>
● <b><i>May 9 (tentatively): Final Project Submission</i></b> <br>
The final project must be submitted electronically.
The submission must include the final LaTeX report
in NIPS format, all code written for the project, the
dataset, the presentation (.pptx or .pdf), and any other files used.
Only one copy of each item needs to be turned in per group.
The submission must conform to the requirements above. If the dataset is too
large to upload to GitHub, please contact the instructor or
TA to arrange submission of the dataset.
</li>
</ul>
</section>
</div>
</div>
<!-- Copyright -->
<div id="copyright">
Copyright &copy; 2019 Mathematics of Big Data, Spring 2019. All Rights Reserved.
</div>
</body>
</html>