Silvervox325/reddit-crawler

Reddit Crawler

Python script to crawl Reddit.

About

I wrote this crawler because I wanted to collect all of the /r/DailyProgrammer/ challenges but couldn't be bothered to go through hundreds of posts by hand, page by page.

Features

  • Crawl any subreddit
  • Choose how many pages you wish to crawl
  • Save crawled data and do whatever you want with it
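
As a rough illustration of what "crawl a chosen number of pages" can look like, here is a minimal sketch. It is not the code in reddit_crawler.py. It assumes Reddit's JSON listing format, where each page carries its posts under data.children and a token for the next page under data.after. The names listing_url, crawl, and fetch_json are hypothetical, and the fetcher is passed in so the example runs without network access.

```python
from typing import Callable, List, Optional


def listing_url(subreddit_url: str, after: Optional[str] = None) -> str:
    """Build the JSON listing URL for one page (assumed URL scheme)."""
    base = subreddit_url.rstrip("/") + "/.json"
    return base if after is None else f"{base}?after={after}"


def crawl(subreddit_url: str, depth: int,
          fetch_json: Callable[[str], dict]) -> List[dict]:
    """Collect posts from up to `depth` pages, following the `after` token."""
    posts: List[dict] = []
    after: Optional[str] = None
    for _ in range(depth):
        page = fetch_json(listing_url(subreddit_url, after))
        data = page["data"]
        # Each child wraps one post; keep only the post payload.
        posts.extend(child["data"] for child in data["children"])
        after = data.get("after")
        if after is None:
            # No further pages are available, so stop early.
            break
    return posts
```

Injecting fetch_json keeps the paging logic separate from the HTTP layer, so a real fetcher (or a fake one for testing) can be swapped in. The returned list of dictionaries can then be written out with json.dump or processed further.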

Prerequisites

Setup

$ git clone https://github.com/filipkonieczny/reddit-crawler.git
$ cd reddit-crawler/
$ virtualenv .venv
$ source .venv/bin/activate
$ pip install -r requirements.txt

Usage

Run the script from the project directory like this:

$ python reddit_crawler.py SUBREDDIT CRAWLING_DEPTH

and supply SUBREDDIT along with CRAWLING_DEPTH (optional, defaults to 1), for example:

$ python reddit_crawler.py http://www.reddit.com/r/dailyprogrammer/ 10

will crawl the first 10 pages of the /r/DailyProgrammer/ subreddit.
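
The command-line interface described above (a required SUBREDDIT and an optional CRAWLING_DEPTH defaulting to 1) could be parsed roughly as follows. This is a sketch using the standard argparse module, not necessarily how reddit_crawler.py reads its arguments. The function name parse_args and the attribute names are assumptions made for illustration.

```python
import argparse


def parse_args(argv=None):
    """Parse SUBREDDIT and the optional CRAWLING_DEPTH (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        prog="reddit_crawler.py",
        description="Crawl a subreddit page by page.",
    )
    parser.add_argument("subreddit", metavar="SUBREDDIT",
                        help="URL of the subreddit to crawl")
    # nargs="?" makes the positional optional, so the default of 1 applies
    # when CRAWLING_DEPTH is left out.
    parser.add_argument("crawling_depth", metavar="CRAWLING_DEPTH",
                        nargs="?", type=int, default=1,
                        help="number of pages to crawl (default: 1)")
    return parser.parse_args(argv)


# Example: the same arguments as in the usage line above.
args = parse_args(["http://www.reddit.com/r/dailyprogrammer/", "10"])
print(args.subreddit, args.crawling_depth)
```

With type=int, a non-numeric depth is rejected with a usage message instead of failing later in the crawl.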
