This is a Python app that reads data from the Twitter Sample Stream and returns, in JSON format, the occurrence count for every word in the stream (or for a number of top words specified by the user). It can be used as the backend for a word-cloud service.
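The counting step can be sketched with nothing but the standard library. The function below is a simplified stand-in, not the actual code from wordcloud.py: the tokenization (lowercase, whitespace split) and the output shape are assumptions.

```python
import json
from collections import Counter

def word_counts(texts, max_words=None):
    """Count word occurrences across tweet texts and return a JSON string.

    Simplified stand-in for what wordcloud.py produces; the real
    tokenization and output format may differ.
    """
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())
    top = counts.most_common(max_words)  # None means "every word"
    return json.dumps(dict(top))

print(word_counts(["hello world", "hello again"]))
```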
The app runs in a Docker container and uses docker-compose to communicate with a Redis server running in another container.
To use the app, you have to register a Twitter app to get a consumer key and access token for accessing the stream, and set them in wordcloud.py at lines 12-15.
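The credential block at lines 12-15 of wordcloud.py presumably looks something like the following; the variable names here are illustrative, and the placeholder strings must be replaced with the values shown on your Twitter app's page.

```python
# Illustrative credential block; the actual variable names in
# wordcloud.py may differ. Fill in the values obtained when
# registering your Twitter app.
consumer_key = "YOUR_CONSUMER_KEY"
consumer_secret = "YOUR_CONSUMER_SECRET"
access_token = "YOUR_ACCESS_TOKEN"
access_token_secret = "YOUR_ACCESS_TOKEN_SECRET"
```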
- Install docker and docker-compose
- To run the app with the default args:
sudo docker-compose up
(this will download the Docker images for redis and python:2.7.9, then gather data for 2 seconds from the stream and print the first 4 words ordered by occurrence count, by running the command in docker-compose.yml)
- To run the app with custom command line args, you can use
sudo docker-compose run web ./wordcloud.py [seconds_of_streaming] [max_words]
- To run the app without Docker, make sure you have Python 2.7.9 installed. For older versions, use
pip install requests[security]
so that the requests made by urllib3 won't throw security exceptions.
- Install redis:
pip install redis
(Python 2.7.9 comes with pip)
- Install tweepy:
pip install tweepy
- Start the Redis server
- Replace the value of the Redis host at line 19 in wordcloud.py with the IP address of the Redis server. You can use 'localhost' if the server runs locally.
- Run
./wordcloud.py [seconds_of_streaming] [max_words]