PageRank computation of Wikipedia's articles using Hadoop.

MrYawe/wiki-pagerank-hadoop

wiki-pagerank-hadoop

Getting started

  1. Download these files: frwiki-latest-page.sql.gz and frwiki-latest-pagelinks.sql.gz.
  2. Create the input_pages and input_pagelinks folders at the root of the project.
  3. Put frwiki-latest-page.sql.gz in input_pages and frwiki-latest-pagelinks.sql.gz in input_pagelinks.
  4. Build the project and download its dependencies with mvn install.
  5. Run the jar in the target folder with 3 arguments: "input_pagelinks input_pages final_result". The final_result folder is created automatically and must not exist beforehand.
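
The folder setup and the "output folder must not exist" rule from the steps above can be sketched as two small shell helpers. The function names `prepare_inputs` and `check_output_absent` are hypothetical and not part of the project. The folder names follow the run arguments (input_pagelinks, input_pages, final_result).

```shell
#!/bin/sh
# Hypothetical helpers; only the folder names come from the steps above.

# Create the two input folders at the project root.
# mkdir -p is a no-op when the folders already exist.
prepare_inputs() {
    mkdir -p input_pages input_pagelinks
}

# Succeed only if the given output folder is absent,
# because the job creates it itself and expects it not to exist.
check_output_absent() {
    if [ -e "$1" ]; then
        echo "error: '$1' already exists; remove it or choose another name" >&2
        return 1
    fi
}

# Example usage from the project root (the jar name depends on the build,
# so the final run command is left out here):
#   prepare_inputs
#   check_output_absent final_result && mvn install
```

Running `check_output_absent final_result` before launching the jar turns a late job failure into an immediate, readable error.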
