# TO-DO
- The scraper for guardian.com currently extracts articles only for the month of May. Add command-line arguments so the user can specify the day, month, and year.
- Make a new version that scrapes all articles from the beginning.
- Combine the create-dataset and build-dataset steps into one script.
- Move industry-wise/company-wise extraction of PDFs from annualreports.com into a separate repo.
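
For the first two items, a possible starting point is sketched below. It assumes the scraper is written in Python, which this file does not confirm. The flag names `--year`, `--month`, and `--day` and the helpers `parse_args` and `iter_months` are hypothetical and do not exist in the current scraper.

```python
import argparse
from datetime import date


def parse_args(argv=None):
    """Parse the date to scrape from the command line (hypothetical flags)."""
    today = date.today()
    parser = argparse.ArgumentParser(
        description="Scrape articles for a given date (sketch, not the real CLI)."
    )
    parser.add_argument("--year", type=int, default=today.year)
    parser.add_argument("--month", type=int, choices=range(1, 13),
                        default=today.month)
    # Assumption: omitting --day means "scrape the whole month".
    parser.add_argument("--day", type=int, default=None)
    args = parser.parse_args(argv)

    # Reject impossible dates such as 31 February before any scraping starts.
    if args.day is not None:
        try:
            date(args.year, args.month, args.day)
        except ValueError as exc:
            parser.error(f"invalid date: {exc}")
    return args


def iter_months(start_year, start_month, end_year, end_month):
    """Yield (year, month) pairs from the start month to the end month, inclusive."""
    year, month = start_year, start_month
    while (year, month) <= (end_year, end_month):
        yield year, month
        month += 1
        if month > 12:
            year, month = year + 1, 1


if __name__ == "__main__":
    args = parse_args()
    print(args.year, args.month, args.day)
```

A "scrape everything" version could loop over `iter_months(...)` from the earliest month the site's archive offers up to the current month, calling the existing per-month scraping code for each pair. The earliest month is not stated here, so it would need to be checked against the site.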