mdtareque/IRE-Wikipedia-Search-Engine

Main class for the query engine: com.mtk.ire.QueryProcessor

Other standalone main classes:

  1. External Merge Sort
  2. IndexGenerator (indexes the 53 GB Wikipedia dump)
  3. IndexCreator (creates the primary and secondary indexes)
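
The merge phase of an external merge sort can be sketched as a k-way merge over sorted runs. This is a minimal illustration, not the repository's code: the class name `KWayMergeSketch` and the method `merge` are hypothetical, and in-memory lists stand in for the sorted run files that a real external sort would stream from disk.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch: names and structure are assumptions, not taken from the repository.
public class KWayMergeSketch {

    // One cursor per sorted run: the current head entry plus the iterator behind it.
    private static final class Cursor implements Comparable<Cursor> {
        String head;
        final Iterator<String> rest;

        Cursor(Iterator<String> it) {
            this.rest = it;
            this.head = it.next();
        }

        @Override
        public int compareTo(Cursor other) {
            return head.compareTo(other.head);
        }
    }

    // Merges already-sorted runs into a single sorted list.
    // In an on-disk version, each run would be a file read line by line.
    static List<String> merge(List<List<String>> runs) {
        PriorityQueue<Cursor> heap = new PriorityQueue<>();
        for (List<String> run : runs) {
            if (!run.isEmpty()) {
                heap.add(new Cursor(run.iterator()));
            }
        }
        List<String> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            Cursor c = heap.poll();          // smallest head across all runs
            out.add(c.head);
            if (c.rest.hasNext()) {
                c.head = c.rest.next();      // advance this run and re-queue it
                heap.add(c);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<String>> runs = List.of(
            List.of("apple", "mango"),
            List.of("banana", "zebra"),
            List.of("cherry"));
        System.out.println(merge(runs)); // prints [apple, banana, cherry, mango, zebra]
    }
}
```

The heap holds at most one entry per run, so memory use depends on the number of runs rather than on the total size of the data, which is what makes this approach suitable for a dump that does not fit in RAM.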

Implemented:

  1. Single-word queries
  2. Multi-word queries
  3. Field queries
  4. Phrase queries (partial support, using n-grams)
  5. Short summary snippet for results (work in progress)
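
As a rough illustration of how n-grams can support phrase queries, the sketch below splits a phrase into word-level n-grams that could each be looked up as a single index term. It assumes word n-grams and lowercase normalization; the repository may use a different n-gram scheme, and the names `PhraseNgramSketch` and `wordNgrams` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch: an assumed word-level n-gram scheme, not the repository's implementation.
public class PhraseNgramSketch {

    // Returns every run of n consecutive words in the phrase, lowercased.
    static List<String> wordNgrams(String phrase, int n) {
        List<String> grams = new ArrayList<>();
        String trimmed = phrase.trim().toLowerCase(Locale.ROOT);
        if (trimmed.isEmpty() || n <= 0) {
            return grams;
        }
        String[] tokens = trimmed.split("\\s+");
        for (int i = 0; i + n <= tokens.length; i++) {
            grams.add(String.join(" ", Arrays.copyOfRange(tokens, i, i + n)));
        }
        return grams;
    }

    public static void main(String[] args) {
        System.out.println(wordNgrams("New York City", 2)); // prints [new york, york city]
    }
}
```

A document that matches every n-gram of the query is a candidate for the phrase, but with n smaller than the phrase length this does not guarantee the words appear contiguously in order, which is consistent with the support being described as partial.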

About

IRE (Information Retrieval) project. Indexes a 53 GB Wikipedia dump (https://dumps.wikimedia.org/enwiki/latest/enwiki-latest-pages-articles.xml.bz2) and builds a query engine on top of it.
