v1.0.0a2: Refactor and modernize, spaCy v2.2 support, more features, Prodigy recipes
⚠️ This is an alpha release and not yet ready for production. You can download sense2vec via pip by specifying the exact version: `pip install sense2vec==1.0.0a2`

The converted Reddit vectors (trained on all comments of 2015) are attached to this release as a `.tar.gz` file. For more details and usage instructions, see the README.
✨ New features and improvements
- Completely rewrite package from scratch.
- Replace built-in vector storage with spaCy's `Vectors`, making this package a pure Python package and allowing easy out-of-the-box serialization of vectors.
- Add fully serializable spaCy pipeline component and extension attributes.
- Add new methods `get_best_sense` and `get_other_senses` and improve `most_similar`.
- Add annotation recipes for Prodigy to easily create word lists and match patterns from similar phrases using sense2vec vectors (like the `terms.teach` recipe, just with multi-word expressions).
- New and more efficient training and preprocessing scripts using GloVe.
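
The new lookup methods can be combined roughly as in the sketch below. `describe_phrase` is a hypothetical helper, not part of sense2vec. It assumes `s2v` is an already loaded `Sense2Vec` object, that keys follow the `phrase|SENSE` pattern (for example `duck|NOUN`), that `get_best_sense` returns `None` for unknown phrases, and that the second positional argument of `most_similar` is the number of results. Check the README for the exact signatures in this alpha.

```python
def describe_phrase(s2v, phrase, n=3):
    """Hypothetical helper: summarize what the vectors know about a phrase.

    `s2v` is assumed to expose get_best_sense, most_similar and
    get_other_senses, as a loaded Sense2Vec object does.
    """
    # Pick the most likely sense key for the bare phrase, e.g. "duck|NOUN".
    best_key = s2v.get_best_sense(phrase)
    if best_key is None:
        # Assumption: None signals that no sense of the phrase is in the vectors.
        return None
    return {
        "best_sense": best_key,
        # Nearest neighbours of the chosen key; the count is passed
        # positionally because the keyword name is an assumption here.
        "most_similar": s2v.most_similar(best_key, n),
        # The same phrase under other senses, e.g. "duck|VERB".
        "other_senses": s2v.get_other_senses(best_key),
    }
```

Calling `describe_phrase(s2v, "duck")` would then return a dictionary with the best key, its nearest neighbours, and any alternative senses, or `None` if the phrase is unknown.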
⚠️ Backwards incompatibilities
- The `sense2vec.load` method has been removed. Use `Sense2Vec.from_disk` instead.
- The previous `VectorMap` and `VectorStorage` have been removed.
- This package now requires Python 3.6+.
- This update requires a new vectors format (see attached `.tar.gz`).
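
For code that previously called `sense2vec.load`, a migration could look like the sketch below. `load_vectors` is a hypothetical wrapper, and the example path is a placeholder for wherever the attached `.tar.gz` was extracted. It assumes `from_disk` returns the populated object, so the call can be chained.

```python
from pathlib import Path


def load_vectors(path):
    """Hypothetical replacement for code that used the removed sense2vec.load."""
    vectors_dir = Path(path)
    # Fail early with a clear message if the archive was not extracted here.
    if not vectors_dir.is_dir():
        raise FileNotFoundError(f"Vectors directory not found: {vectors_dir}")
    # Imported lazily so the path check works even before sense2vec is installed.
    from sense2vec import Sense2Vec

    # Assumption: from_disk returns the loaded Sense2Vec object.
    return Sense2Vec().from_disk(vectors_dir)


# Example usage (placeholder path, pointing at the extracted vectors):
# s2v = load_vectors("/path/to/extracted_vectors")
```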
📖 Documentation and examples
- Rewrite README from scratch and include full API docs.
👥 Contributors
Thanks to @kabirkhan for contributing the Prodigy recipes!