v1.0.0a1: Update sense2vec for spaCy v2.1.x or standalone use

Pre-release
@ines released this 12 Sep 14:12
· 275 commits to master since this release

⚠️ This is an alpha release and not yet ready for production. You can install sense2vec via pip by specifying the exact version.

pip install sense2vec==1.0.0a1

Note that the library no longer depends on spaCy, so you may need to install spaCy and the English model separately. The Reddit vectors (trained on all comments from 2015) are attached to this release as a .tar.gz file. For more details and usage instructions, see the README.
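
Since the vectors ship as a .tar.gz archive, they need to be unpacked before use. The following is a minimal sketch using only Python's standard library. It is not part of sense2vec: the helper name `extract_vectors` and both of its arguments are hypothetical, and the archive path is whatever file you downloaded from this release.

```python
import tarfile
from pathlib import Path


def extract_vectors(archive_path, target_dir):
    """Unpack a .tar.gz archive into target_dir.

    Hypothetical helper for illustration only, not part of sense2vec.
    Returns the sorted top-level names found in the archive.
    """
    target = Path(target_dir).resolve()
    target.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as archive:
        members = archive.getmembers()
        # Refuse members that would be written outside the target directory.
        for member in members:
            destination = (target / member.name).resolve()
            if destination != target and target not in destination.parents:
                raise ValueError(f"Unsafe path in archive: {member.name}")
        archive.extractall(target)
    # Collect the first path component of every member.
    top_level = set()
    for member in members:
        parts = Path(member.name).parts
        if parts:
            top_level.add(parts[0])
    return sorted(top_level)
```

Calling `extract_vectors(archive_path, target_dir)` returns the top-level names in the archive, which should tell you which directory to point sense2vec at. See the README for how to load the vectors from that directory.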


✨ New features and improvements

  • NEW: Remove spaCy dependency and allow standalone use of the sense2vec library.
  • NEW: Include spaCy v2.x pipeline component to add sense2vec-compatible token merging and token attributes and methods.
  • Attach reddit_vectors model to release and make models easier to download and load.

📖 Documentation and examples

  • Rewrite README from scratch and include full API docs.

🚧 Todo

  • Replace VectorMap implementation with spaCy's Vectors class.
  • Don't merge tokens at runtime and adjust extension attributes accordingly.
  • Update training and pre-processing scripts for spaCy v2.x.
  • Retrain vectors on more data.