Gensim 4.0 loading Phraser trained from Gensim 3.x #3173

Closed
canmingh opened this issue Jun 15, 2021 · 2 comments · Fixed by #3174

@canmingh

Problem description

Gensim 4.0 cannot correctly load a Phraser model trained with Gensim 3.x.

Steps/code/corpus to reproduce

gensim.models.phrases.Phraser.load("saved_phraser.pkl")

This does not correctly load a Phraser trained in Gensim 3.x, due to a bug in the source code. See below.

Versions

This line of code from 4.0.1 (and also the current development branch) needs to be fixed:
https://github.com/RaRe-Technologies/gensim/blob/4.0.1/gensim/models/phrases.py#L367

model.phrasegrams = {
  str(model.delimiter.join(component), encoding='utf8'): score
  for key, val in phrasegrams.items()
}

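The existing code keeps only the first phrase: the comprehension builds each entry from `component` and `score` rather than from the loop variables `key` and `val`, so every iteration writes the same key. To upgrade and load all phrases, it should be replaced with: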

model.phrasegrams = {
  str(model.delimiter.join(key), encoding='utf8'): val
  for key, val in phrasegrams.items()
}
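
A minimal sketch of the difference, runnable without Gensim. It assumes that `component` and `score` are bound once from the first item before the comprehension runs (the surrounding code is not shown in this issue), and it uses a made-up two-entry dict with tuple-of-bytes keys and a bytes delimiter as a stand-in for a 3.x `phrasegrams` mapping:

```python
# Hypothetical stand-in for a Gensim 3.x phrasegrams dict:
# keys are tuples of bytestrings, values are scores.
phrasegrams = {
    (b"new", b"york"): 12.5,
    (b"san", b"francisco"): 9.0,
}
delimiter = b"_"  # assumed bytes delimiter, as implied by str(..., encoding='utf8')

# Assumption: these names are bound once, from the first item only.
component, score = next(iter(phrasegrams.items()))

# Buggy form: ignores the loop variables, so every iteration writes the same key.
buggy = {
    str(delimiter.join(component), encoding="utf8"): score
    for key, val in phrasegrams.items()
}

# Proposed form: uses the loop variables, so every phrase is converted.
fixed = {
    str(delimiter.join(key), encoding="utf8"): val
    for key, val in phrasegrams.items()
}

print(buggy)  # {'new_york': 12.5}
print(fixed)  # {'new_york': 12.5, 'san_francisco': 9.0}
```
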
@piskvorky
Owner

You're right, thanks. Can you open a PR?

@canmingh
Author

Just created a pull request. See above. Thanks.
