Script to get data for all years of the London Marathon (takes ~15 mins)

michaelwalshe/scrape_london_marathon

Scraping the Virgin London Marathon

The Virgin London Marathon website publishes a results page for each year. The underlying dataset is not publicly available, but because the search interface changes only slightly from year to year, the results tables for every year can be scraped.
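
To illustrate what scraping a results table involves, here is a minimal sketch that pulls the rows out of an HTML table using only the standard library. It is not the code used in this repository: the `TableParser` class, the `parse_results_table` helper and the sample markup are all hypothetical, and the real results pages will almost certainly have a different structure and need their own cleaning rules.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # completed rows, each a list of cell strings
        self._row = None   # row currently being built, or None outside <tr>
        self._cell = None  # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_results_table(html):
    """Return the table rows found in an HTML string as lists of strings."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    return parser.rows


# Invented sample markup (assumption): not taken from the marathon website.
SAMPLE_HTML = """
<table>
  <tr><th>Place</th><th>Name</th><th>Finish</th></tr>
  <tr><td>1</td><td>Runner A</td><td>02:05:00</td></tr>
  <tr><td>2</td><td>Runner B</td><td>02:06:30</td></tr>
</table>
"""

rows = parse_results_table(SAMPLE_HTML)
header, results = rows[0], rows[1:]
```

In practice a library such as pandas or BeautifulSoup would usually be more convenient. The standard-library parser is used here only to keep the sketch free of dependencies.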

The main program is pipeline.py, which scrapes and cleans the data in parallel using multithreading, then tests the output to check that it looks as expected. A complete run takes ~15 minutes. For a simpler view of the program, or for testing, example_2020_scrape_london_marathon.ipynb fetches only the 2020 results table, without multithreading.
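
The general shape of a scrape-in-parallel-then-check pipeline can be sketched as follows. This is an illustration under stated assumptions, not the contents of pipeline.py: `fetch_year`, `scrape_all_years` and `check_output` are hypothetical names, and `fetch_year` returns placeholder rows instead of making any network request.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_year(year):
    """Hypothetical stand-in for a per-year scraper; makes no network calls."""
    return [{"year": year, "place": place} for place in (1, 2, 3)]


def scrape_all_years(years, fetch=fetch_year, max_workers=8):
    """Fetch every year concurrently and return a {year: rows} mapping."""
    years = list(years)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order, so zip pairs each
        # year with its own rows even if the threads finish out of order.
        return dict(zip(years, pool.map(fetch, years)))


def check_output(results, years):
    """Basic sanity checks on the scraped data; raises ValueError on failure."""
    missing = [year for year in years if year not in results]
    if missing:
        raise ValueError(f"no data for years: {missing}")
    for year, rows in results.items():
        if not rows:
            raise ValueError(f"empty table for {year}")
        if any(row["year"] != year for row in rows):
            raise ValueError(f"rows for {year} carry the wrong year")


years = [2018, 2019, 2020]
results = scrape_all_years(years)
check_output(results, years)
```

Threads suit this kind of job because most of the time is spent waiting on HTTP responses, not on computation. Passing `fetch` in as a parameter also lets the checks run against a fake scraper without touching the website.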
