v1.0.0a2: Refactor and modernize, spaCy v2.2 support, more features, Prodigy recipes

@ines ines released this 31 Oct 21:17
· 190 commits to master since this release
d11dbef

⚠️ This is an alpha release and not yet ready for production. You can install sense2vec via pip by specifying the exact version:

pip install sense2vec==1.0.0a2

The converted Reddit vectors (trained on all comments of 2015) are attached to this release as a .tar.gz file. For more details and usage instructions, see the README.


✨ New features and improvements

  • Completely rewrite package from scratch.
  • Replace built-in vector storage with spaCy's Vectors, making this package a pure Python package and allowing easy out-of-the-box serialization of vectors.
  • Add fully serializable spaCy pipeline component and extension attributes.
  • Add new methods get_best_sense and get_other_senses and improve most_similar.
  • Add annotation recipes for Prodigy to easily create word lists and match patterns from similar phrases using sense2vec vectors (like the terms.teach recipe, just with multi-word expressions).
  • Add new, more efficient training and preprocessing scripts using GloVe.
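
To give a rough idea of what the new get_best_sense and get_other_senses methods are for, here is a toy model in plain Python. It is not the library's implementation and does not import sense2vec. It assumes that keys follow a "phrase|SENSE" convention, that every key has a frequency count, and that the "best" sense is simply the most frequent one. The names FREQS, split_key, best_sense and other_senses, and all the counts, are made up for this sketch. See the README for the actual method signatures.

```python
# Toy illustration in plain Python, NOT sense2vec's implementation.
# Assumption: keys look like "phrase|SENSE" and each key has a frequency count.
from typing import Dict, List, Optional, Tuple

# Hypothetical frequency table; the numbers are invented for this example.
FREQS: Dict[str, int] = {
    "duck|NOUN": 30,
    "duck|VERB": 5,
    "duck|ADJ": 1,
    "goose|NOUN": 12,
}


def split_key(key: str) -> Tuple[str, str]:
    """Split "phrase|SENSE" into (phrase, sense) at the last "|"."""
    phrase, sense = key.rsplit("|", 1)
    return phrase, sense


def best_sense(word: str, freqs: Dict[str, int]) -> Optional[str]:
    """Return the most frequent key for a word, or None if it is unknown."""
    candidates = [k for k in freqs if split_key(k)[0] == word]
    return max(candidates, key=lambda k: freqs[k]) if candidates else None


def other_senses(key: str, freqs: Dict[str, int]) -> List[str]:
    """Return all keys that share the phrase of `key` but differ in sense."""
    word, _ = split_key(key)
    return [k for k in freqs if k != key and split_key(k)[0] == word]


print(best_sense("duck", FREQS))         # duck|NOUN
print(other_senses("duck|NOUN", FREQS))  # ['duck|VERB', 'duck|ADJ']
```

The sketch keeps lookups as simple dictionary scans to stay readable. A real vector table would index phrases rather than scanning every key.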

⚠️ Backwards incompatibilities

  • The sense2vec.load method has been removed. Use Sense2Vec.from_disk instead.
  • The previous VectorMap and VectorStorage have been removed.
  • This package now requires Python 3.6+.
  • This update requires a new vectors format (see attached .tar.gz).

📖 Documentation and examples

  • Rewrite README from scratch and include full API docs.

👥 Contributors

Thanks to @kabirkhan for contributing the Prodigy recipes!