Commit
KeyedVectors & *2Vec API streamlining, consistency (#2698)
* slim low-value warnings
* clarify vectors/vectors_vocab relationship; fix lockf & nonsense ngram-norming confusion
* mv FT, KV tests to right place
* rm deprecations, obsolete refs/tests, delete_temporary_training_data, update usages
* update usages, tests, flake8 cleanup
* expand KeyedVectors to obviate Doc2VecKeyedVectors; upconvert old offset-style doctags
* fix docstring warnings; update usages
* rm unused old plain-python codepaths
* unify class comments under __init__ for consistency w/ api doc presentation
* name/comment harmonization (rm 'entity', lessen 'word'-centricity)
* table formatting
* return pyemd to linux test env
* split backcompat tests for better resolution
* convert Vocab & related data items to use dataclasses
* rm obsolete Vocab/Trainable/abstract/Wrapper classes, persistent callbacks (bug #2136), outdated tests/warnings; update usages
* tune tests for stability, runtimes; rm auto reruns that hide flakiness
* fix numpy FutureWarning: arrays to stack must be sequence
* (commented-out) deoptimization option
* stronger FB model testing; no _unpack_copy test
* merge redundant methods; rm duplicated imports/defs
* rationalize _lockf, buckets_word behaviors
* rename .docvecs to .dv
* update usages; rm obsolete tests; restore gensim.utils import
* intensify FT tests (more epochs, more buckets)
* flake8-3.8.0 style fixes - but also pin flake8-3.7.9 vs 3.8.0 'output_file' error
* replace vectors_norm with 1d norms
* tighten testParallel
* rm .vocab & 'Vocab' classes; add expandable 'vecattrs'
* update usages (no vocabs)
* enable running inside '-m mtprof' (or cProfile) via explicit unittest.main(module=..)
* faster sample_int reads
* load_word2vec_format(.., no_header=True) to support GloVe text vectors
* refactor & comment lockf feature; allow single-element lockf
* improve FT comment
* rm deprecated/unneeded init_sims calls
* fixes to code style
* flake8: fix overlong lines
* rm stray merge error
* rm duplicated, old nonstandard hash workarounds
* use numpy-recommended PRNG constructor
* add sg to FastTextConfig & consult it; rm remaining broken-hash cruft
* reorg conditional packages for clarity
* comments, names, refactoring, randomization
* Apply suggestions from code review
  Co-authored-by: Radim Řehůřek <me@radimrehurek.com>
* fix cruft left from suggestion
* fix numpy-32bit-on-Windows; executable docs
* mv lee_corpus to utils; cleanup
* update poincare for latest KV __init__ signature
* restore word_vec method for proper overriding, but rm usages
* Apply suggestions from code review
  Co-authored-by: Radim Řehůřek <me@radimrehurek.com>
* adjust testParallel against failure risk
* intensify training for an occasionally failing test
* clarify word/char ngrams handling; rm outdated comments
* mostly avoid duplicating FastTextConfig fields into locals
* avoid copies/pointers for no-bucket (FT as W2V) case
* rm obsolete test (already skipped & somewhat originally misguided)
* simpler/faster .get(..., default) (avoids exception-catching in has_index_for)
* add default option to get_index; avoid exception in has_index_for
* chained range check
  Co-authored-by: Radim Řehůřek <me@radimrehurek.com>
* Update CHANGELOG.md

Co-authored-by: Radim Řehůřek <radimrehurek@seznam.cz>
Co-authored-by: Radim Řehůřek <me@radimrehurek.com>
Co-authored-by: Michael Penkov <m@penkov.dev>
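
The last few items (a `.get(..., default)` lookup, a `default` option on `get_index`, an exception-free `has_index_for`, and a chained range check) describe a common pattern. The sketch below illustrates that pattern only; it is not the code from this commit. `TinyIndex` is a hypothetical stand-in class, and the attribute names `index_to_key` and `key_to_index` are assumptions chosen for illustration.

```python
class TinyIndex:
    """Hypothetical key-to-index mapping; an illustration, not gensim's class."""

    def __init__(self, keys):
        self.index_to_key = list(keys)
        self.key_to_index = {key: i for i, key in enumerate(self.index_to_key)}

    def get_index(self, key, default=None):
        # dict.get returns a sentinel on a miss instead of raising KeyError
        idx = self.key_to_index.get(key, -1)
        if idx >= 0:
            return idx
        # chained range check: an int inside the valid range is its own index
        if isinstance(key, int) and 0 <= key < len(self.index_to_key):
            return key
        if default is not None:
            return default
        raise KeyError(f"Key {key!r} not present")

    def has_index_for(self, key):
        # no try/except needed: a miss comes back as the sentinel -1
        return self.get_index(key, -1) >= 0


index = TinyIndex(["cat", "dog"])
```

With this shape, `index.has_index_for("bird")` answers a membership question without raising and catching an exception, while `index.get_index("bird")` with no default still raises `KeyError`.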
1 parent 4cdf228, commit c0e0169
Showing 76 changed files with 5,642 additions and 14,143 deletions.