This project builds DogDogGo, a tweet search engine. It crawls tweets from Twitter, indexes them, and provides a graphical user interface for searching them. It also analyzes the corpus to show the user the impact of tweets across the Twitter sphere.
A demo of the application is available at https://www.youtube.com/watch?v=hruluDxwTU8
Platform | Tech-Stack |
---|---|
Front-End | ReactJS and Redux, CSS, HTML |
Back-End | Django, Python |
Search Platform | Solr/Lucene |
Translation Platform | Microsoft Azure |
Analytics | Plotly |
News Scraping | News API, PRAW |
Server Instance | Amazon EC2 |
- Search: DogDogGo lets the user search for words, phrases, hashtags, and more, with a flexible set of search features.
- Translation: The user can search in multiple languages.
- Highlighting: Matches are highlighted in the search results.
- More Like This: When the user finds a relevant tweet, clicking "More like this" searches for similar tweets. This is similar to the "More Like This" feature in Google News.
- Custom Search: The user can customize a search with filters on POI (Person of Interest), Location, Hashtags, Sentiment, Language, and Source.
- Analytics: We use the SentimentIntensityAnalyzer from the vaderSentiment library to analyze the sentiment of each tweet as well as of the search results. Sentiment is color-coded: red for negative, green for positive, orange for neutral.
- Dynamic Search Result Analysis: Several analyses are generated from the search results.
  - Location Distribution: Locations of the tweets that match the query term.
  - Sentiment Analysis: Sentiment analysis of the fetched results.
  - Person of Interest Distribution: Frequency of the query term on each POI's Twitter handle.
  - Distribution of Devices: Devices from which the tweets were posted.
- Tweet Corpus Analysis: The user can also visualize statistics of the tweet corpus.
  - World Twitter Usage: Geo mapping of tweets around the world.
  - Country Time Series: Twitter usage by country over time.
  - POI Time Series: Twitter usage by Person of Interest over time.
  - Sentiment Time Series: Sentiment of tweets over time.
  - Location Distribution: Distribution of tweets by location.
- Relevant News Articles: The user can view news articles related to a tweet and open the original article.
- Tweet Replies: The user can view the replies to a particular tweet.
- Additional Features: A few additional features enhance the user experience.
  - Total search count and search engine response time
  - Pagination
  - Interactive plots
  - Clean user interface
  - Phrase search
  - Sentiment, retweet count, reply count, and article count displayed for each tweet
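
The color coding described under Analytics could be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual code: the helper names `sentiment_label`, `sentiment_color`, `sentiment_distribution`, and `score_tweet` are hypothetical, and the ±0.05 cutoffs on VADER's compound score are the convention suggested in the vaderSentiment documentation, which the project may or may not follow.

```python
from collections import Counter
from typing import Iterable

# Colors as listed in this README: red = negative, green = positive, orange = neutral.
SENTIMENT_COLORS = {"negative": "red", "positive": "green", "neutral": "orange"}

# Assumed cutoff: the +/-0.05 convention from the vaderSentiment documentation.
THRESHOLD = 0.05


def sentiment_label(compound: float) -> str:
    """Map a VADER compound score in [-1, 1] to a sentiment label."""
    if compound >= THRESHOLD:
        return "positive"
    if compound <= -THRESHOLD:
        return "negative"
    return "neutral"


def sentiment_color(compound: float) -> str:
    """Return the display color for a compound score."""
    return SENTIMENT_COLORS[sentiment_label(compound)]


def sentiment_distribution(compounds: Iterable[float]) -> Counter:
    """Count labels over a set of search results (one compound score per tweet)."""
    return Counter(sentiment_label(c) for c in compounds)


def score_tweet(text: str) -> float:
    """Return the VADER compound score for a tweet.

    Requires the third-party package: pip install vaderSentiment
    """
    # Imported lazily so the mapping helpers above work without the library installed.
    from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

    return SentimentIntensityAnalyzer().polarity_scores(text)["compound"]


if __name__ == "__main__":
    # Hand-picked scores for illustration; real scores would come from score_tweet().
    for score in (0.62, -0.40, 0.0):
        print(score, sentiment_label(score), sentiment_color(score))
```

With vaderSentiment installed, `sentiment_color(score_tweet(text))` would give the color for a single tweet, and `sentiment_distribution` applied to the scores of all fetched tweets would give the counts behind a per-query sentiment chart.
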
Front-end: Anirudh
Back-end: Snigdha and Raunaq
Analytics: Raunaq and Anirudh
News Scraping: Deepesh
Translation and Solr Querying: Snigdha, Deepesh and Raunaq
Documentation: Snigdha