This repository contains the material for the Talk Python Training course Getting Started with Dask.
In this free course, we will get you up to speed with Dask and show you how to convert pandas workloads to run on Dask clusters, either locally across cores or scaled out across cloud servers.
Learn more and take the course at: training.talkpython.fm
In this course, you will:
- Explore the problems solved by Dask: What is big data and how can you work with it?
- Learn the Dask API and how to use it
- Analyze the NYC taxicab dataset with Dask on a local cluster
- Scale that same computation to the cloud with Coiled
- Connect to local and remote Dask cluster visualization and reporting dashboards
- And more!
Required:
- Basic Python
Not required, but nice to have:
- pandas
- JupyterLab
- conda (for local setup)
- terminal (for local setup)
You can get up and running in two ways: online with Binder, or locally with conda.
The Binder project lets you open the Jupyter notebooks in this repository in an online executable environment. Click the "launch binder" link to get started. The environment might take a few minutes to start.
Note: Binder notebooks time out if inactive for more than 10 minutes.
- Clone your forked repository:
git clone https://github.com/<username>/talkpython-getting-started-with-dask
- Change into the repository root directory:
cd talkpython-getting-started-with-dask
- Create a new conda environment:
conda env create -f environment.yml
- Activate the conda environment:
conda activate talkpython-dask
- Start JupyterLab:
jupyter lab
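Once the environment is active, a quick check can confirm that the main packages are importable before you open the notebooks. The sketch below is not part of the course material: `missing_packages` is a hypothetical helper, and the package list is an assumption for illustration (environment.yml remains the authoritative list).

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed package list for illustration; environment.yml is authoritative.
    assumed = ["dask", "distributed", "pandas", "jupyterlab"]
    missing = missing_packages(assumed)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All assumed packages were found.")
```

Run it with the talkpython-dask environment active, for example from a notebook cell. If anything is reported missing, recreating the conda environment is a reasonable first step.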