Data science is a sexy job. The salaries are high, the work is interesting, and there’s significant prestige that comes with the title.
A data scientist will:
- Analyze data
- Clean data using pandas and NumPy
- Gain insights
- Build models on data
Bill James applied data analysis to baseball to answer questions such as:
- Who are the top performers?
- How can you best predict future performance?
Netflix uses data analysis to recommend movies.
Participants will be able to:
- Create a Jupyter Notebook to begin data analysis
- Perform exploratory data analysis (EDA)
- Understand the purpose and methods of cleaning data
- Understand the methods of analyzing a dataset
- Accessing Jupyter Notebooks
- Importing libraries such as pandas and NumPy into Jupyter Notebooks
- Techniques for exploratory data analysis (EDA)
- Identifying missing or erroneous data for possible cleaning
- Using pandas and NumPy to analyze a dataset
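A minimal sketch of the workflow listed above, written as it might appear in a single notebook cell. The DataFrame is a small made-up example (the column names `player`, `games`, and `hits` are hypothetical and not taken from any course dataset); in practice the data would typically be loaded from a file, for example with `pd.read_csv`. Dropping rows is only one possible cleaning strategy.

```python
import numpy as np
import pandas as pd

# Hypothetical toy dataset with one missing value and one impossible value.
df = pd.DataFrame({
    "player": ["A", "B", "C", "D"],
    "games": [10, 12, np.nan, 8],
    "hits": [7, -3, 5, 9],  # -3 is an erroneous entry
})

# Exploratory data analysis: size, column types, and summary statistics.
print(df.shape)
print(df.dtypes)
print(df.describe())

# Identify missing data: count of NaN values per column.
missing_per_column = df.isna().sum()

# Identify erroneous data: a hit count can never be negative.
bad_rows = df[df["hits"] < 0]

# One possible cleaning strategy: drop rows with missing or invalid values.
clean = df.dropna()
clean = clean[clean["hits"] >= 0]

# Simple analysis on the cleaned data using pandas and NumPy.
mean_hits = clean["hits"].mean()
hits_per_game = np.round(clean["hits"] / clean["games"], 2)
print(mean_hits)
print(hits_per_game)
```

Whether to drop, fill, or correct a problematic value depends on the dataset and the question being asked, so the cleaning step here should be read as an illustration rather than a rule.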
Data science is a multidisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge and insights from data in various forms, both structured and unstructured. In this respect it is similar to data mining.
Life Cycle of Data Science
- Tools such as pandas, NumPy, Hadoop, and Spark make up an important part of the data science toolbox. It is up to the data scientist to decide which tool fits a given situation, and how to use it correctly, in order to solve open-ended analytical problems.
- Access to More Data Translates to Higher Accuracy
- Data Science and Business Intelligence Are the Same
- You Must Have Access to Lots of Data
What are the advantages of using a NumPy array?
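As one way into this question, the sketch below contrasts a plain Python list with a NumPy array. It illustrates two commonly cited advantages: element-wise arithmetic written as a single vectorized expression, and a single element type stored compactly. The sample values and variable names are arbitrary and chosen only for illustration.

```python
import numpy as np

prices = [10.0, 20.0, 30.0]    # plain Python list
prices_arr = np.array(prices)  # NumPy array with a single dtype (float64)

# With a list, element-wise arithmetic needs an explicit loop or comprehension.
discounted_list = [p * 0.9 for p in prices]

# With an array, the same operation is one vectorized expression.
discounted_arr = prices_arr * 0.9

# Aggregations and boolean masks are built in.
total = prices_arr.sum()
expensive = prices_arr[prices_arr > 15]

# Every element shares one dtype, so storage is compact and predictable.
bytes_per_element = prices_arr.itemsize

print(discounted_arr, total, expensive, bytes_per_element)
```

On large arrays the vectorized form is usually much faster than the equivalent Python loop, because the work runs in compiled code rather than in the interpreter.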
What differentiates data science from other analytical fields (business intelligence, etc.)?
Assignments: