This project implements four methods for the NLP task of Word Sense Disambiguation (WSD). The methods are as follows:
- Lesk's algorithm using the most frequent sense baseline (attempts to disambiguate all words)
- NLTK's WordNet-based Lesk algorithm (attempts to disambiguate all words)
- Naive Bayes (each classifier is trained to disambiguate a single specified word)
- Decision Trees using Bagging (each classifier is trained to disambiguate a single specified word)
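
To illustrate the idea behind the Lesk-style methods listed above, here is a minimal sketch of simplified Lesk in plain Python. It is not the project's code: the sense inventory `TOY_SENSES`, the sense labels, the stopword list, and the function names are all hypothetical, and a real implementation would draw glosses from WordNet instead. The sketch picks the sense whose gloss shares the most words with the context, and falls back to the first-listed sense when nothing overlaps (a stand-in for a most-frequent-sense fallback, assuming senses are listed in frequency order).

```python
# Hypothetical toy sense inventory: word -> list of (sense label, gloss).
# Senses are assumed to be listed from most to least frequent.
TOY_SENSES = {
    "bank": [
        ("bank#river", "sloping land beside a body of water such as a river"),
        ("bank#finance", "a financial institution that accepts deposits and lends money"),
    ],
}

# Small illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"a", "an", "the", "of", "to", "and", "that", "such", "as",
             "in", "on", "at", "by", "is"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stopwords.

    Punctuation is not stripped, so this only suits clean toy input.
    """
    return {w for w in text.lower().split() if w not in STOPWORDS}


def simplified_lesk(word, context, senses=TOY_SENSES):
    """Return the sense label whose gloss overlaps most with the context."""
    candidates = senses.get(word, [])
    if not candidates:
        return None  # word is not in the inventory

    context_words = tokenize(context) - {word}

    # Start from the first-listed sense so that ties and zero overlap
    # fall back to it.
    best_sense, best_overlap = candidates[0][0], 0
    for sense_label, gloss in candidates:
        overlap = len(context_words & tokenize(gloss))
        if overlap > best_overlap:
            best_sense, best_overlap = sense_label, overlap
    return best_sense


if __name__ == "__main__":
    print(simplified_lesk("bank", "she deposits money at the bank"))
    print(simplified_lesk("bank", "they sat on the bank"))
```

The first call shares "deposits" and "money" with the financial gloss, so that sense is chosen; the second shares nothing with either gloss and falls back to the first-listed sense.
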

Two datasets were used for this project. The implementations of Lesk's algorithm used the SemEval 2013 Shared Task #12 dataset, and the classifiers used the entire SemCor corpus.

For a summary of the results, see report.pdf.