ShardedCorpus skips the first value of a generator. This is possibly because ShardedCorpus does not use the fixed corpus returned by the is_corpus method, but I haven't verified this yet.
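The suspected mechanism can be sketched without gensim at all. The snippet below is only an illustration of the general peek-and-restore pattern: `peek_is_corpus` is a hypothetical stand-in written for this example, not gensim's actual `is_corpus`, and whether ShardedCorpus really discards the restored object is still unverified.

```python
import itertools


def peek_is_corpus(obj):
    """Hypothetical peek-based check (not gensim's code).

    Returns (verdict, restored), where `restored` yields every document,
    including the one consumed while peeking.
    """
    if hasattr(obj, "__next__"):
        try:
            first = next(obj)  # peeking consumes the first document
        except StopIteration:
            return True, obj  # empty iterator, nothing to restore
        # Put the consumed document back in front of the remaining ones.
        obj = itertools.chain([first], obj)
    else:
        first = next(iter(obj), [])  # re-iterable containers lose nothing
    verdict = all(isinstance(item, tuple) and len(item) == 2 for item in first)
    return verdict, obj


def docs():
    yield [(0, 1)]
    yield [(1, 1)]
    yield [(2, 1)]


# Suspected buggy pattern: ignore the restored object, keep the original generator.
gen = docs()
_, _ignored = peek_is_corpus(gen)
buggy = list(gen)

# Intended pattern: continue with the restored object.
_, restored = peek_is_corpus(docs())
correct = list(restored)

print(len(buggy), len(correct))  # prints "2 3"
```

If the caller keeps iterating the original generator after the peek, the first document is gone, which matches the symptom reported here.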
Steps/Code/Corpus to Reproduce
from gensim.corpora.sharded_corpus import ShardedCorpus
def my_generator():
    yield [(0, 1)]
    yield [(1, 1)]
    yield [(2, 1)]
corpus = ShardedCorpus("corpus", my_generator(), dim=3, overwrite=True)
print(len(corpus))
print(corpus[0])
Expected Results
Expected output:
3
[ 1. 0. 0.]
Actual Results
Actual output:
2
[ 0. 1. 0.]
That is, the first item in the generator has been skipped and is missing from the resulting corpus.
Versions
Darwin-16.7.0-x86_64-i386-64bit
Python 3.6.1 |Anaconda 4.4.0 (x86_64)| (default, May 11 2017, 13:04:09)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)]
NumPy 1.12.1
SciPy 0.19.0
gensim 2.3.0
FAST_VERSION 1