Crawl scientific webpages for relevant papers. The easiest way to start your journey in the scientific jungle.
- The SPH Team
You can try the SPH here.
This service / repo is built for educational purposes only!
The Scientific-Purpose-Harvester (SPH) aims to provide a flexible way to crawl scientific webpages for content relevant to your questions.
Check Google Scholar for the best scientific results for your question, with the help of an easy-to-use Graphical User Interface.
In the long run we might connect additional data sources like https://dblp.uni-trier.de/ (conferences not yet peer reviewed...).
Get started quickly with a video tutorial for the SPH! (Click the image to go to YouTube)
The easiest way to access the SPH.
Simply open the SPH, hosted by an SPH team member. This website uses the Svelte version of the SPH.
- Clone the repo
git clone https://github.com/SimonScapan/scientific-purpose-harvester.git
- Navigate into harvester
cd harvester
- Uncomment Lines 22-24 in api.py
- Run api.py to start the harvester
python api.py
- Open Local Website in your Browser
- Shut down the local website using CTRL+C in your terminal
- Enter your question
- Hit the search button and wait for results.
- Get a quick Overview of the best scientific papers for your questions. Follow a Link to get directly to the paper.
- ScraperAPI allows us to crawl Google Scholar (or other websites) without getting blacklisted.
- A Free Plan of ScraperAPI is used. It allows 1000 free Requests per Month
- If there is a problem with the provided API key:
- Get your own free API-Key on the ScraperAPI Website
- Replace the given API-Key with your personal API-Key in the harvester_scholar.py file (Line 34)
- Svelte allows us to use Python files within the website
- The Papers are ranked by citation count
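The two ideas above, routing a Google Scholar search through ScraperAPI and ranking the results by citation count, can be sketched roughly as follows. This is a minimal illustration, not the code in harvester_scholar.py: the endpoint format is assumed from ScraperAPI's public documentation, and the function names, the `SCRAPERAPI_KEY` environment variable, and the `title` / `citations` field names are hypothetical.

```python
import os
from typing import Dict, List
from urllib.parse import urlencode

# Assumption: ScraperAPI accepts the key and the target URL as query parameters.
SCRAPERAPI_ENDPOINT = "http://api.scraperapi.com"
SCHOLAR_SEARCH_URL = "https://scholar.google.com/scholar"


def build_scraperapi_url(question: str, api_key: str) -> str:
    """Wrap a Google Scholar search for `question` in a ScraperAPI request URL."""
    target = f"{SCHOLAR_SEARCH_URL}?{urlencode({'q': question})}"
    # urlencode percent-encodes the whole target URL so it survives as one parameter.
    return f"{SCRAPERAPI_ENDPOINT}?{urlencode({'api_key': api_key, 'url': target})}"


def rank_by_citations(papers: List[Dict]) -> List[Dict]:
    """Return the papers sorted by citation count, most cited first."""
    # Papers without a (hypothetical) 'citations' field are treated as uncited.
    return sorted(papers, key=lambda p: p.get("citations", 0), reverse=True)


if __name__ == "__main__":
    # Hypothetical environment variable; the placeholder is not a working key.
    key = os.environ.get("SCRAPERAPI_KEY", "YOUR_API_KEY")
    print(build_scraperapi_url("graph neural networks", key))

    # Made-up records, only to show the ordering.
    sample = [
        {"title": "Paper A", "citations": 3},
        {"title": "Paper B", "citations": 10},
    ]
    for paper in rank_by_citations(sample):
        print(paper["citations"], paper["title"])
```

Reading the key from an environment variable instead of editing the source file is one possible design choice; it keeps a personal API key out of version control.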
Here are some ideas for future extensions. Feel free to fork this project and add some of these, or your own ideas!
- Free Text based NLP Training --> Q&A Pair generation --> Feed Fancy Flash Cards with Content
- Build a network of the cited articles. Who cited whom? Where are the connections?
- Build an integration with more scientific search engines, like IEEE, arXiv, ...
- Generate abstracts for each paper with the help of NLP
Thank you for using the SPH. If you have questions, feel free to reach out to the SPH team:
- Jan Brebeck - Brebeck-Jan
- Andreas Bernrieder - Phantomias3782
- Simon Scapan - SimonScapan
- Thorsten Hilbradt - Thorsten-H
- Niklas Wichter - NWichter