Word Sense Disambiguation

This project implements four different methods for the NLP task of Word Sense Disambiguation (WSD). The methods are as follows:

Lesk's algorithm using the most frequent sense baseline (attempts to disambiguate all words)
NLTK's WordNet Lesk's algorithm (attempts to disambiguate all words)
Naive Bayes (each classifier is trained to disambiguate a single specified word)
Decision Trees using Bagging (each classifier is trained to disambiguate a single specified word)
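
As a rough illustration of the first approach, here is a minimal sketch of simplified Lesk with a most-frequent-sense fallback. It is not the code in src: the toy sense inventory (TOY_SENSES), the sense ids, the stopword list, and the function names are all hypothetical, and a real run would take glosses and sense ordering from WordNet. It also assumes the baseline serves as the fallback when no gloss overlaps the context, which is one common reading of the method.

```python
from typing import Dict, List, Optional, Set

# Hypothetical toy sense inventory. Senses are listed in descending frequency
# order, so the first entry stands in for the "most frequent sense".
TOY_SENSES: Dict[str, List[Dict[str, str]]] = {
    "bank": [
        {"id": "bank.n.01",
         "gloss": "sloping land beside a body of water such as a river"},
        {"id": "bank.n.02",
         "gloss": "a financial institution that accepts deposits and lends money"},
    ],
}

# Small illustrative stopword list (an assumption, not the project's list).
STOPWORDS = {"a", "an", "the", "of", "and", "that", "such", "as",
             "to", "in", "at", "on", "i", "my", "we"}


def tokenize(text: str) -> Set[str]:
    """Lowercase, split on whitespace, and drop stopwords."""
    return {tok for tok in text.lower().split() if tok not in STOPWORDS}


def most_frequent_sense(word: str) -> Optional[str]:
    """Baseline: always return the first-listed sense, or None if unknown."""
    senses = TOY_SENSES.get(word)
    return senses[0]["id"] if senses else None


def simplified_lesk(word: str, context: str) -> Optional[str]:
    """Pick the sense whose gloss shares the most tokens with the context.

    Falls back to the most frequent sense when no gloss overlaps.
    """
    senses = TOY_SENSES.get(word)
    if not senses:
        return None
    context_tokens = tokenize(context) - {word}
    best_id = senses[0]["id"]  # most-frequent-sense fallback
    best_overlap = 0
    for sense in senses:
        overlap = len(context_tokens & tokenize(sense["gloss"]))
        if overlap > best_overlap:
            best_id, best_overlap = sense["id"], overlap
    return best_id
```

Calling simplified_lesk("bank", "I went to deposit money at the bank") selects the financial sense because "money" appears in its gloss, while a context with no gloss overlap returns the same answer as most_frequent_sense("bank").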

Two datasets were used for this project. The implementations of Lesk's algorithm used the SemEval 2013 Shared Task #12 dataset and the classifiers used the entire SemCor corpus.
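
The per-word classifier setup can be sketched in the same spirit. The example below is a small multinomial Naive Bayes classifier with add-one smoothing, trained for a single target word on hand-made context tokens. The class name WordSenseNB, the sense labels, and the training examples are hypothetical stand-ins, not SemCor data or the project's own implementation, which may use different features and libraries.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, List, Optional, Set, Tuple


class WordSenseNB:
    """Multinomial Naive Bayes over context tokens for one target word."""

    def __init__(self, alpha: float = 1.0) -> None:
        self.alpha = alpha  # Laplace smoothing constant
        self.sense_counts: Counter = Counter()
        self.word_counts: Dict[str, Counter] = defaultdict(Counter)
        self.vocab: Set[str] = set()

    def fit(self, examples: List[Tuple[List[str], str]]) -> "WordSenseNB":
        """Count senses and per-sense context tokens."""
        for tokens, sense in examples:
            self.sense_counts[sense] += 1
            for tok in tokens:
                self.word_counts[sense][tok] += 1
                self.vocab.add(tok)
        return self

    def predict(self, tokens: List[str]) -> Optional[str]:
        """Return the sense with the highest log posterior, or None if untrained."""
        total = sum(self.sense_counts.values())
        vocab_size = len(self.vocab)
        best_sense, best_score = None, -math.inf
        for sense in sorted(self.sense_counts):
            score = math.log(self.sense_counts[sense] / total)  # log prior
            denom = sum(self.word_counts[sense].values()) + self.alpha * vocab_size
            for tok in tokens:
                if tok in self.vocab:  # ignore tokens never seen in training
                    count = self.word_counts[sense][tok]
                    score += math.log((count + self.alpha) / denom)
            if score > best_score:
                best_sense, best_score = sense, score
        return best_sense


# Hypothetical training contexts for the single target word "bank".
TRAIN = [
    (["deposit", "money", "account"], "bank_finance"),
    (["loan", "money", "interest"], "bank_finance"),
    (["river", "water", "shore"], "bank_river"),
    (["fishing", "river", "mud"], "bank_river"),
]

clf = WordSenseNB().fit(TRAIN)
```

One classifier of this kind would be trained per target word. A bagged decision tree ensemble fits the same per-word structure, with each tree trained on a bootstrap resample of that word's examples and the ensemble voting on the sense.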

For a summary of the results, see report.pdf.


ryrutherford/word-sense-disambiguation
