Reddit posts scraper and sentiment analyzer using Python

Summary:

A web scraper script that fetches Reddit posts and saves them in CSV files. It searches specific subreddits for the given keywords, then runs sentiment analysis on the matching posts to quantify them, using off-the-shelf sentiment analyzers: Flair, TextBlob, and VADER. The results are saved as CSV.

download_data_from_reddit.py

A scraper script that searches Reddit posts by keyword in a subreddit of interest.
It uses the Pushshift API at https://api.pushshift.io/. There is no need to get API secret keys from reddit.com to use the Pushshift API (as of this writing).
Sample data generated by the script looks like this (see sample_reddit_data.png).
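
The search-and-save flow described above can be sketched as follows. This is a minimal illustration, not the code from download_data_from_reddit.py: the endpoint path and the query parameters (q, subreddit, after, before, size) are assumptions based on Pushshift's submission search API, the function names are hypothetical, and the mapping of the "subtext" CSV column to the "selftext" field is a guess.

```python
import csv
import json
import urllib.parse
import urllib.request

# Assumed endpoint: Pushshift's submission search.
PUSHSHIFT_URL = "https://api.pushshift.io/reddit/search/submission/"

# CSV column -> Pushshift field. The "subtext" name mirrors this README;
# mapping it to "selftext" is a guess, not taken from the original script.
COLUMNS = {"id": "id", "created_utc": "created_utc",
           "title": "title", "subtext": "selftext"}


def build_query_url(keyword, subreddit, after, before, size=100):
    """Build a search URL for posts matching `keyword` in `subreddit`.

    `after` and `before` are Unix timestamps bounding the search window.
    """
    params = {"q": keyword, "subreddit": subreddit,
              "after": after, "before": before, "size": size}
    return PUSHSHIFT_URL + "?" + urllib.parse.urlencode(params)


def to_row(post):
    """Keep only the columns of interest; absent fields become ''."""
    return {column: post.get(field, "") for column, field in COLUMNS.items()}


def fetch_posts(keyword, subreddit, after, before):
    """Fetch one page of matching posts (performs a network request)."""
    url = build_query_url(keyword, subreddit, after, before)
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)["data"]


def save_posts(posts, path):
    """Write the posts to a CSV file, one row per post."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(COLUMNS))
        writer.writeheader()
        writer.writerows(to_row(post) for post in posts)
```

With these helpers, a call such as `save_posts(fetch_posts("bitcoin", "CryptoCurrency", 1609459200, 1609545600), "posts.csv")` would write one page of results to disk; the keyword, subreddit, timestamps, and file name here are placeholder values.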

reddit_post_sentiment_analysis.py

Takes the CSV file generated by download_data_from_reddit.py.
Combines the title and subtext columns and performs sentiment analysis on the combined text.
Runs Flair (https://pypi.org/project/flair/), TextBlob (https://pypi.org/project/textblob/), and VADER (https://www.nltk.org/_modules/nltk/sentiment/vader.html) to get sentiment scores.
Sample data generated at this stage looks like this (see sample_reddit_data_sentiment.png).
Bucketizes the rows to combine all values for each hour. Sentiment scores are averaged, and missing values are set to 0.
The final generated data looks like this (see sample_reddit_data_sentiment_bucketized.png).
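
The steps above (combine the text, score it, bucketize by hour) can be sketched as follows. This is an illustration under stated assumptions, not the code from reddit_post_sentiment_analysis.py: the function names are hypothetical, the `created_utc` timestamp column is assumed, the signed Flair score is one possible convention, and "missing values are set to 0" is read here as "hours with no scored posts get 0.0". The bucketing uses only the standard library; the original script may well use a different approach.

```python
from collections import defaultdict
from datetime import datetime, timezone
from functools import lru_cache


def combine_text(row):
    """Join the title and subtext of one post into a single string."""
    parts = (row.get("title") or "", row.get("subtext") or "")
    return " ".join(part.strip() for part in parts if part.strip())


# Each scorer imports its library lazily, so only the ones you call
# need to be installed. Each returns a signed score in [-1, 1].
def textblob_score(text):
    from textblob import TextBlob
    return TextBlob(text).sentiment.polarity


@lru_cache(maxsize=1)
def _vader():
    # Requires a one-time nltk.download("vader_lexicon").
    from nltk.sentiment.vader import SentimentIntensityAnalyzer
    return SentimentIntensityAnalyzer()


def vader_score(text):
    return _vader().polarity_scores(text)["compound"]


@lru_cache(maxsize=1)
def _flair():
    from flair.models import TextClassifier
    return TextClassifier.load("en-sentiment")


def flair_score(text):
    from flair.data import Sentence
    sentence = Sentence(text)
    _flair().predict(sentence)
    label = sentence.labels[0]
    # Assumed convention: sign the confidence by the predicted label.
    return label.score if label.value == "POSITIVE" else -label.score


def bucketize_hourly(rows, score_keys):
    """Average each score per UTC hour; hours with no scores get 0.0."""
    buckets = defaultdict(list)
    for row in rows:
        # Round the Unix timestamp down to the start of its hour.
        hour = int(float(row["created_utc"])) // 3600 * 3600
        buckets[hour].append(row)
    if not buckets:
        return []
    result = []
    # Walk every hour in the covered range so gaps appear as zero rows.
    for hour in range(min(buckets), max(buckets) + 3600, 3600):
        out = {"hour": datetime.fromtimestamp(hour, tz=timezone.utc).isoformat()}
        for key in score_keys:
            values = [float(post[key]) for post in buckets.get(hour, [])
                      if post.get(key) not in (None, "")]
            out[key] = sum(values) / len(values) if values else 0.0
        result.append(out)
    return result
```

A typical use would add a score column to each row, for example `row["vader"] = vader_score(combine_text(row))`, and then call `bucketize_hourly(rows, ["vader"])` to get one averaged row per hour.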

This framework is used in https://github.com/pratikpv/predicting_bitcoin_market

Credits: The scraper code is based on the code from https://medium.com/@RareLoot/using-pushshifts-api-to-extract-reddit-submissions-fb517b286563.

