A simple way to configure your schema and scheduler from the CLI and save tweets to your local MongoDB database.
P.S.: If you receive a duplicate error on db.[collection].[$_id_], try running db.collection.dropIndexes() in a mongo instance.
- Converted from commander to inquirer for the interface and user experience.
- Updated the streaming functionality to reflect the recent migration.
- Organized file structure.
To install the required packages:
npm install
Head to Twitter for developers and create a new app. In ./server/config.js you will notice:
var client = require('twitter')({
  consumer_key: '',
  consumer_secret: '',
  access_token_key: '',
  access_token_secret: ''
});
Fill those in to have access to Twitter API v1.1 Public Stream.
Make sure you have MongoDB installed, then run this command to start your server instance.
mongod
Then run your app.
node miner.js
Two things to take care of:
- For filtering languages: a comma-separated list of language codes, following the Twitter API and ISO 639-1 standards. Example: en,eu,ar,fr
- For scheduling: a comma-separated list of values used to create a Date(year,month,day,hour,minute,second,millis) object. Example: 2015,01,22,15,0,0
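The two inputs above could be parsed roughly as follows. This is a minimal sketch, not the code in miner.js: the function names parseLanguages and parseSchedule are hypothetical, and requiring at least year, month, and day is an assumption.

```javascript
// Hypothetical helpers; the names are not taken from miner.js.

// "en,eu,ar,fr" -> ["en", "eu", "ar", "fr"]
function parseLanguages(input) {
  return input
    .split(',')
    .map(function (code) { return code.trim().toLowerCase(); })
    .filter(function (code) { return code.length > 0; });
}

// "2015,01,22,15,0,0" -> Date(year, month, day, hour, minute, second, millis)
function parseSchedule(input) {
  var parts = input.split(',').map(function (part) {
    // Treat blank entries as invalid instead of letting Number('') become 0.
    return part.trim() === '' ? NaN : Number(part);
  });
  // Assumption: at least year, month, and day must be given.
  if (parts.length < 3 || parts.length > 7 || parts.some(isNaN)) {
    throw new Error('Expected 3 to 7 comma-separated numbers, got: ' + input);
  }
  // Local-time Date constructor; the month is zero-based, so 01 means February.
  return new Date(
    parts[0], parts[1], parts[2],
    parts[3] || 0, parts[4] || 0, parts[5] || 0, parts[6] || 0
  );
}
```

With this sketch, the example 2015,01,22,15,0,0 resolves to 22 February 2015 at 15:00 local time, because the month value is zero-based.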
As soon as the server is running, your default browser will launch so you can log in to Twitter.
Default: http://localhost:3000/
That's it!
Saying Yes launches the program.
Saying No re-launches the questions.
You are not allowed to:
- Enter a start/end time earlier than or equal to the current time.
- Enter an end date with no start date.
- Enter an end date earlier than the start date.
Note that months are zero-based (0 is January).
If you do not specify start/end time(s), manual interaction will be required.
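The scheduling rules above can be expressed as a small check. This is a sketch under assumptions: validateSchedule is a hypothetical name, start and end are Date objects (or null when left blank), and the error messages are illustrative rather than the ones the tool prints.

```javascript
// Hypothetical validator; returns a list of rule violations (empty when valid).
function validateSchedule(start, end, now) {
  now = now || new Date();
  var errors = [];

  // Rule 1: start/end must be strictly later than the current time.
  if (start && start.getTime() <= now.getTime()) {
    errors.push('Start time must be later than the current time.');
  }
  if (end && end.getTime() <= now.getTime()) {
    errors.push('End time must be later than the current time.');
  }

  // Rule 2: an end date needs a start date.
  if (end && !start) {
    errors.push('An end date requires a start date.');
  }

  // Rule 3: the end date must not be earlier than the start date.
  if (start && end && end.getTime() < start.getTime()) {
    errors.push('End date must not be earlier than the start date.');
  }

  return errors;
}
```

Passing null for both start and end returns an empty list, which corresponds to the manual-interaction case.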
var schema = mongoose.Schema({
  _id: { type: String },
  text: { type: String },
  screen_name: { type: String },
  verified: { type: Boolean },
  followers_count: { type: Number },
  image_url: { type: String },
  coordinates: { type: Array },
  retweet_count: { type: Number },
  timestamp: { type: Date }
});
The application will look for retweeted statuses in stream data and do the following:
- If _id exists in the local database, increment retweet_count.
- Else consider it as a new tweet.
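The retweet handling described above might look like the sketch below. It is not the application's actual code: handleTweet is a hypothetical name, the Tweet argument is assumed to be a Mongoose model built from the schema (findByIdAndUpdate and create are standard Mongoose model methods), and the mapping from Twitter API v1.1 status fields to schema fields is an assumption.

```javascript
// Hypothetical handler for one status object from the stream.
async function handleTweet(Tweet, status) {
  // Assumption: for a retweet, the _id to look up is the original tweet's id.
  var original = status.retweeted_status || status;

  if (status.retweeted_status) {
    // Resolves to null when no document with this _id exists.
    var updated = await Tweet.findByIdAndUpdate(
      original.id_str,
      { $inc: { retweet_count: 1 } }
    );
    if (updated) {
      return 'incremented';
    }
  }

  // Not seen before: store it as a new tweet.
  await Tweet.create({
    _id: original.id_str,
    text: original.text,
    screen_name: original.user.screen_name,
    verified: original.user.verified,
    followers_count: original.user.followers_count,
    image_url: original.user.profile_image_url,
    // v1.1 sends coordinates as a GeoJSON point or null.
    coordinates: original.coordinates ? original.coordinates.coordinates : [],
    retweet_count: original.retweet_count,
    // Parsing Twitter's created_at string relies on the engine's Date parser.
    timestamp: new Date(original.created_at)
  });
  return 'created';
}
```

Passing the model in as an argument keeps the function easy to exercise with a stand-in object that implements the same two methods.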