Notebook for an 18F tech talk about solving common data munging challenges with Pandas.
Nothing fancy here: this talk is a brief tour of the Pandas library and its role in basic (smallish) data exploration and munging tasks.
Goals:
- provide a quick overview for potential new Pandas users
- give a basic refresher (with links to resources for more advanced tasks) to people who already use Pandas
To run the code samples in the Jupyter notebook interactively:
- Install the Miniconda Python package manager.
- From a terminal, clone this project repository to your local machine:
  git clone git@github.com:bsweger/pandas-munging.git
- If you don't have a GitHub account and want to get a read-only version of the code, use this command instead:
  git clone git://github.com/bsweger/pandas-munging.git
- Change to the project directory:
  cd pandas-munging
- Install Python and dependencies into a conda virtual environment called pandas-munging:
  conda env create
- Activate the conda environment:
  source activate pandas-munging
- Start up the Jupyter notebook server:
  jupyter notebook
- The notebook startup process should have opened a Jupyter web page at localhost:8888 (if not, point your browser there).
- On the Files tab of the Jupyter home page, click pandas_data_munging.ipynb.
- The Jupyter notebook with the slides and code from the presentation should open.
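If the notebook opens but you want to confirm that the conda environment set up Pandas correctly, a quick smoke test like the one below can be run in a new notebook cell or a Python prompt. This is only a sketch and is not part of the talk materials: the function name `check_environment`, the column names, and the sample values are made up for illustration.

```python
import pandas as pd


def check_environment():
    """Run a tiny groupby to confirm pandas is installed and usable."""
    # Throwaway data invented for this check; it is not from the talk's dataset.
    df = pd.DataFrame(
        {"agency": ["GSA", "GSA", "OPM"], "amount": [10, 5, 7]}
    )
    # Sum the amounts per agency, a typical small munging operation.
    totals = df.groupby("agency")["amount"].sum()
    return pd.__version__, totals


version, totals = check_environment()
print(f"pandas {version} imported successfully")
print(totals)
```

If the import fails, the most likely cause is that the pandas-munging environment is not active in the terminal that started Jupyter.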