-
Notifications
You must be signed in to change notification settings - Fork 0
s4sarath/sentimental_analysis_twitter
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
#### SENTIMENTAL ANALYSIS ASSIGNMENT 1. The folder named ancient- is the one which is a Flask application. A version has been deployed to heroku, but it results in timeout due to complexity of the model. But work fine locally. Run "python routes.py" 2. The sentimental_analysis.ipynb is the ipython version which is quite self explanatory , trained on sts-gold data. But as the data is less, it will be not perfect for new test samples. But it results in an accuracy of 77% when repeated strings like "happyyy", "yesss" will be added as features. This is also mentioned in the ipyython notebook. 3. Sentimental_new.ipynb is the one which trained on stanford data. As the test data is more than 16 lakhs, I wont be able to run in my system. i tried with 2 lakhs, it wont run. But all the model creation is also memntioned in data. 4. Heroku link is "https://ancient-peak-6490.herokuapp.com/" in the search box add a tweet . By deafult it will wait for 30 sec. if it picks up no tweet, the request will be rejected. Used tweepy module and multithreading. 5. As a part of feature extraction I have used the following: a.) Count vectorizer for counting the total number of words after stemming and tokenization. b.) Tf-idf , which doesnt make much change in accuracy. c.) word2vec features, which is a genesim library package, which is an implementation of deep-learning based, word to context features. It results in 71% accuracy. More can be done and experimented with a sophisticated system. System used : Mac Yosemite with 8gb Ram.
About
No description, website, or topics provided.
Resources
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published