Crawls the web and collects page links. Ranks and visualizes pages based on the number of links.

priyankkhanna/web_page_ranker_crawler

Repository files navigation

Order of operation

  • spider.py crawls a web site and stores the pages it links to in the database, recording the links between pages. Data is stored in spider.sqlite.
  • spdump.py cleans the collected data.
  • sprank.py ranks the pages: it assigns each page a weight based on its number of links, then ranks the pages by those weights. Iterate it as many times as you want.
  • spdump.py cleans the data again.
  • spreset.py restarts the page rank calculations without re-spidering the web pages.
  • spjson.py creates spider.js.
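The weight-and-rank step that sprank.py performs can be sketched as a simplified PageRank-style update. This is a minimal illustration, not the repository's actual code: a hypothetical in-memory link dictionary stands in for the tables in spider.sqlite, and the real script's schema and damping details may differ.

```python
def rank_iteration(links, ranks):
    """One iteration: each page splits its current rank evenly
    among the pages it links to (simplified PageRank update)."""
    new_ranks = {page: 0.0 for page in ranks}
    for page, outbound in links.items():
        if not outbound:
            continue  # pages with no outbound links pass nothing on
        share = ranks[page] / len(outbound)
        for target in outbound:
            new_ranks[target] += share
    return new_ranks

# Hypothetical link graph: page 1 links to 2 and 3, page 2 to 3, page 3 to 1.
links = {1: [2, 3], 2: [3], 3: [1]}
ranks = {page: 1.0 for page in links}  # every page starts with equal weight

for _ in range(10):  # "iterate it as many times as you want"
    ranks = rank_iteration(links, ranks)

print(ranks)  # page 3, with two inbound links, ends up outranking page 2
```

Because every page's rank is redistributed rather than created or destroyed, the total weight stays constant across iterations, which is why the calculation can simply be re-run (or reset with spreset.py) without re-crawling.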
