Add "most_similar_to_given" method for KeyedVectors #1582

Merged: 5 commits, Oct 17, 2017
27 changes: 26 additions & 1 deletion gensim/models/keyedvectors.py
@@ -71,7 +71,8 @@

 from numpy import dot, zeros, dtype, float32 as REAL,\
     double, array, vstack, fromstring, sqrt, newaxis,\
-    ndarray, sum as np_sum, prod, ascontiguousarray
+    ndarray, sum as np_sum, prod, ascontiguousarray,\
+    argmax

 from gensim import utils, matutils  # utility fnc for pickling, common scipy operations etc
 from gensim.corpora.dictionary import Dictionary
@@ -618,6 +619,30 @@ def similarity(self, w1, w2):
         """
         return dot(matutils.unitvec(self[w1]), matutils.unitvec(self[w2]))

+    def most_similar_to_given(self, w1, word_list):
+        """Return the word from word_list most similar to w1.
+
+        Args:
+            w1 (str): a word
+            word_list (list): list of words containing a word most similar to w1
+
+        Returns:
+            the word in word_list with the highest similarity to w1
+
+        Raises:
+            KeyError: If w1 or any word in word_list is not in the vocabulary
+
+        Example::
+
+            >>> trained_model.most_similar_to_given('music', ['water', 'sound', 'backpack', 'mouse'])
+            'sound'
+
+            >>> trained_model.most_similar_to_given('snake', ['food', 'pencil', 'animal', 'phone'])
+            'animal'
+
+        """
+        return word_list[argmax([self.similarity(w1, word) for word in word_list])]
+
     def n_similarity(self, ws1, ws2):
         """
         Compute cosine similarity between two sets of words.
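
For readers who want to try the selection logic without a trained model, here is a minimal, dependency-free sketch of what the new method does: score each candidate by cosine similarity against `w1` and return the candidate with the highest score. This is not the gensim implementation. The `cosine` helper, the standalone `most_similar_to_given` function taking a plain dict, and the `toy_vectors` data are all hypothetical stand-ins for `KeyedVectors.similarity`, the method in this diff, and a real vocabulary.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (stand-in for KeyedVectors.similarity)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar_to_given(vectors, w1, word_list):
    """Return the entry of word_list whose vector is most similar to that of w1.

    `vectors` is a plain dict mapping word -> list of floats (an assumption for
    this sketch; the real method reads vectors from the KeyedVectors instance).
    A missing word raises KeyError; an empty word_list raises ValueError.
    """
    sims = [cosine(vectors[w1], vectors[word]) for word in word_list]
    # Index of the largest similarity; on ties the first such index is kept.
    best = max(range(len(sims)), key=sims.__getitem__)
    return word_list[best]


# Made-up 3-dimensional vectors, chosen only so the example has a clear winner.
toy_vectors = {
    "music": [1.0, 0.2, 0.0],
    "sound": [0.9, 0.3, 0.1],
    "water": [0.0, 1.0, 0.2],
    "mouse": [0.1, 0.0, 1.0],
}

result = most_similar_to_given(toy_vectors, "music", ["water", "sound", "mouse"])
print(result)
```

Note that the sketch fails on an empty `word_list` with a `ValueError`, a case the docstring's `Raises` section in the diff does not cover.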