ryrutherford/word-sense-disambiguation

Word Sense Disambiguation

This project implements four methods for the NLP task of word sense disambiguation (WSD):

  • Lesk's algorithm using the most frequent sense baseline (attempts to disambiguate all words)
  • NLTK's WordNet-based implementation of Lesk's algorithm (attempts to disambiguate all words)
  • Naive Bayes (each classifier is trained to disambiguate a single specified word)
  • Decision Trees using Bagging (each classifier is trained to disambiguate a single specified word)
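
As a rough illustration of the first method, here is a minimal sketch of a simplified Lesk algorithm that falls back to the most frequent sense when no gloss overlaps the context. It is not the code from this repository: the `TOY_SENSES` inventory, the `STOPWORDS` set, and the function names are hypothetical stand-ins for WordNet glosses and sense ordering, chosen so the example runs without any external data.

```python
# Hypothetical toy sense inventory (an assumption for illustration only).
# Senses for each word are listed most-frequent first, mimicking the way
# a real sense inventory can be ordered by frequency.
TOY_SENSES = {
    "bank": [
        ("bank.n.01", "a financial institution that accepts deposits and lends money"),
        ("bank.n.02", "sloping land beside a body of water such as a river"),
    ],
}

# Small hand-picked stopword list (also an assumption).
STOPWORDS = {"a", "an", "the", "of", "and", "that", "such", "as",
             "to", "in", "on", "by", "beside"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return {w for w in text.lower().split() if w not in STOPWORDS}


def simplified_lesk(word, context, senses=TOY_SENSES):
    """Pick the sense whose gloss shares the most words with the context.

    Falls back to the first (most frequent) sense when no gloss overlaps.
    Returns None if the word is not in the inventory.
    """
    candidates = senses.get(word, [])
    if not candidates:
        return None
    context_words = tokenize(context) - {word}
    # Start from the most frequent sense so ties and zero overlap default to it.
    best_sense, best_overlap = candidates[0][0], 0
    for sense_id, gloss in candidates:
        overlap = len(context_words & tokenize(gloss))
        if overlap > best_overlap:
            best_sense, best_overlap = sense_id, overlap
    return best_sense
```

Calling `simplified_lesk("bank", "she sat on the bank of the river")` selects the river sense because "river" appears in its gloss, while a context with no overlapping words falls back to the first listed sense.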

Two datasets were used for this project. The implementations of Lesk's algorithm used the SemEval 2013 Shared Task #12 dataset, and the classifiers used the entire SemCor corpus.
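
To make the "one classifier per word" setup concrete, here is a minimal sketch of a multinomial Naive Bayes classifier with add-one smoothing that disambiguates a single target word from its context tokens. The class name `WordSenseNB`, the sense labels, and the training format are assumptions made for illustration; the repository's classifiers and their SemCor feature extraction may differ.

```python
import math
from collections import Counter, defaultdict


class WordSenseNB:
    """Multinomial Naive Bayes for the senses of one target word (sketch)."""

    def __init__(self, target):
        self.target = target
        self.sense_counts = Counter()            # number of examples per sense
        self.word_counts = defaultdict(Counter)  # context word counts per sense
        self.vocab = set()                       # all context words seen in training

    def train(self, examples):
        """examples: iterable of (context_tokens, sense_label) pairs."""
        for tokens, sense in examples:
            self.sense_counts[sense] += 1
            for tok in tokens:
                if tok != self.target:  # the target word itself carries no signal
                    self.word_counts[sense][tok] += 1
                    self.vocab.add(tok)

    def predict(self, tokens):
        """Return the sense with the highest log posterior, or None if untrained."""
        total = sum(self.sense_counts.values())
        best_sense, best_logp = None, -math.inf
        for sense, count in self.sense_counts.items():
            logp = math.log(count / total)  # log prior
            denom = sum(self.word_counts[sense].values()) + len(self.vocab)
            for tok in tokens:
                if tok in self.vocab:  # skip words never seen in training
                    # add-one (Laplace) smoothed likelihood
                    logp += math.log((self.word_counts[sense][tok] + 1) / denom)
            if logp > best_logp:
                best_sense, best_logp = sense, logp
        return best_sense
```

A separate `WordSenseNB` instance would be trained for each word of interest, using only the sentences in which that word occurs with a sense annotation.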

For a summary of the results, see report.pdf.
