Data Mining 290

Description
Learn how to obtain, clean, visualize, understand, model, and predict the world around you using data. Grading consists of homework (30%), a midterm (30%), and a project (40%).
Instructor
Jim Blomo <jblomo@ischool>
GSI
Shreyas <shreyas@ischool>
Textbook
Han, J., Kamber, M., & Pei, J. (2011). Data Mining: Concepts and Techniques (3rd ed.). Morgan Kaufmann.

Syllabus

DM[0-9]+ indicates a chapter of the textbook, Data Mining (for example, DM4 is chapter 4 and DM11.1 is section 11.1).

| Date | Readings | Slides | Homework / Project |
|------+----------+--------+--------------------|
| Jan 25 | Try Github ; A Taxonomy of Data Science | Class Intro ; Tools Intro by GUEST: Shreyas | Git Intro |
| Feb 1 | DM1 ; The Yelp Factor: Are Consumer Reviews Good for Business? | Case Studies ; Obtaining Data | Obtain & Explore Data |
| Feb 8 | DM2, DM3 | Probability ; Preprocessing | Data Stats |
| Feb 15 | DM4, Apache Hadoop: Petabytes and Terawatts (slides); mrjob docs (for homework) | Data Warehouse ; MapReduce | Project Details ; mrjob |
| Feb 22 | DM8 | Decision Trees; Naive Bayes | Gini Index |
| Mar 1 | DM[9.1-9.3], 9.5 ; Understanding the Bias-Variance Tradeoff | SVM ; Neural Networks | Neural Network Back Propagation |
| Mar 8 | DM10 | Agglomerative - Clustering ; Hierarchical, Density - Clustering | K-Means |
| Mar 15 | DM11.1 | Review | prepare 1 cheat sheet |
| Mar 22 | 1 cheat sheet | Midterm | - |
| Mar 29 | HOLIDAY | | |
| Apr 5 | DM6 | Advanced Clustering ; Frequent Pattern | AWS ; Project Proposal Due |
| Apr 12 | DM11.3; PageRank; Uncovering Social Network Sybils in the Wild | Graphs; PageRank | Adjacency Representations |
| Apr 19 | Non-linear regression | GUEST: Gene Lee Ceaser’s Pricing Strategy; Ceaser’s Recruiting | Price Elasticity |
| Apr 26 | DM12; Shazam Audio Search | Outliers; Images & Audio | Midterm Review |
| May 3 | Embedded Plots ; Data-Driven Documents | Visualization ; Yelp’s Visualizations | D3 Intro D3 Lab |
| May 10 | A Few Useful Things to Know about Machine Learning ; Top 10 Algorithms in Data Mining | In Real Life ; Presentations | May 16th: Project Papers Due |
| May 17 | - | Final Presentation | Bye! |