Error while iterating over SimilarityABC with chunk.shape[0] == 1 #791

Closed
hg-bauerc opened this issue Jul 15, 2016 · 1 comment

@hg-bauerc

This is the relevant section of SimilarityABC.__iter__:

if chunk.shape[0] > 1:
    for sim in self[chunk]:
        yield sim
else:
    yield self[chunk]

When this code runs with chunk.shape[0] == 1 (i.e. the else branch), I get a ValueError (too many values to unpack).
If I'm right, the code above can simply be replaced by:

for sim in self[chunk]:
    yield sim

Thanks for verifying!

@hg-bauerc

The problem with the else-path can be seen in this demo session:

>>> from gensim import corpora, models, similarities
>>> corpus = [[(0, 1.0), (1, 1.0), (2, 1.0)],
...           [(2, 1.0), (3, 1.0), (4, 1.0), (5, 1.0), (6, 1.0), (8, 1.0)],
...           [(1, 1.0), (3, 1.0), (4, 1.0), (7, 1.0)],
...           [(0, 1.0), (4, 2.0), (7, 1.0)]]
>>> tfidf = models.TfidfModel(corpus)
>>> index = similarities.MatrixSimilarity(tfidf[corpus], num_features=12)
>>> index.chunksize=2
>>> for s in index:
...   print s
...
[ 0.99999994  0.15336274  0.32415688  0.35208049]
[ 0.15336274  1.          0.17483118  0.05580689]
[ 0.32415688  0.17483118  1.00000012  0.46034482]
[ 0.35208049  0.05580689  0.46034482  1.        ]
>>> index.chunksize=3
>>> for s in index:
...   print s
...
[ 0.99999994  0.15336274  0.32415688  0.35208049]
[ 0.15336274  1.          0.17483118  0.05580689]
[ 0.32415688  0.17483118  1.00000012  0.46034482]
[[ 0.35208049  0.05580689  0.46034482  1.        ]]

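With chunksize == 3, the four documents are split into a chunk of three and a chunk of one. The else branch runs for that last single-vector chunk and yields the similarity vector wrapped in an extra pair of brackets (see the last output line) instead of as a flat vector like the others.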
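
For anyone who wants to see the effect without gensim installed, here is a minimal pure-Python sketch of the same pattern. The names `similarities`, `iter_with_branch` and `iter_uniform` are hypothetical and not part of gensim, and plain nested lists stand in for the arrays, so treat it as an illustration of the branch behaviour rather than the actual library code:

```python
def similarities(chunk, vectors):
    """Dot product of every row in `chunk` with every indexed vector.

    Hypothetical stand-in for `self[chunk]`; always returns one row per query.
    """
    return [[sum(a * b for a, b in zip(row, vec)) for vec in vectors]
            for row in chunk]


def iter_with_branch(vectors, chunksize):
    """Mimics the reported pattern: a special case for one-row chunks."""
    for start in range(0, len(vectors), chunksize):
        chunk = vectors[start:start + chunksize]
        if len(chunk) > 1:
            for sim in similarities(chunk, vectors):
                yield sim
        else:
            # Yields the whole one-row result, i.e. a row nested in a list.
            yield similarities(chunk, vectors)


def iter_uniform(vectors, chunksize):
    """Mimics the proposed fix: always iterate over the rows of the result."""
    for start in range(0, len(vectors), chunksize):
        chunk = vectors[start:start + chunksize]
        for sim in similarities(chunk, vectors):
            yield sim


# Four toy vectors and chunksize 3, so the last chunk holds a single row.
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 2.0]]
branch_rows = list(iter_with_branch(vectors, chunksize=3))
uniform_rows = list(iter_uniform(vectors, chunksize=3))

print(branch_rows[-1])   # [[0.0, 2.0, 2.0, 4.0]]  (nested)
print(uniform_rows[-1])  # [0.0, 2.0, 2.0, 4.0]    (flat)
```

Both generators yield four items, but only the uniform loop gives every item the same shape, which is what a caller iterating over the index would expect.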

@tmylk tmylk added bug Issue described a bug difficulty easy Easy issue: required small fix labels Oct 5, 2016
@tmylk tmylk closed this as completed in #839 Oct 7, 2016