v2.3.0

@github-actions github-actions released this 17 Dec 16:10
· 1 commit to main since this release

2.3.0 - 2024-12-17

Features

  • Release a new multilabel biomedBERT model trained on NER data synthetically generated by an LLM (Gemini). The model was trained on over 7000 LLM-annotated documents with a total of 295822 samples.
    The model was trained for 21 epochs and achieved an F1 score of 95.6% on a held-out test set. (multilabel_bert)
  • Added a multilabel NER training example and config.
  • Added docs and an example for scaling Kazu with Ray.
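
For readers unfamiliar with how an F1 score can be computed in a multilabel setting, here is a minimal sketch of micro-averaged F1 over per-token label sets. The release notes do not say which averaging scheme produced the 95.6% figure, so the micro-averaging, the function name `micro_f1`, and the entity labels below are all assumptions chosen for illustration, not the project's evaluation code.

```python
def micro_f1(gold, pred):
    """Micro-averaged F1 over parallel lists of label sets (one set per token)."""
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        tp += len(g & p)   # labels predicted and present in the gold set
        fp += len(p - g)   # labels predicted but absent from the gold set
        fn += len(g - p)   # gold labels that were missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: a token may carry zero, one, or several entity classes.
gold = [{"gene"}, {"disease", "drug"}, set(), {"drug"}]
pred = [{"gene"}, {"disease"}, {"gene"}, {"drug"}]
score = micro_f1(gold, pred)
```

In this toy data there are 3 true positives, 1 false positive, and 1 false negative, so precision and recall are both 0.75.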

Bugfixes

  • Fix an issue with TransformersModelForTokenClassificationNerStep when processing large numbers of documents. The fix offloads tensors onto the CPU before performing the torch.cat operation, which previously produced a zero tensor. (pytorch_memory_issue)
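
The general pattern behind this fix can be sketched as follows: move each per-batch tensor to the CPU first, then concatenate there. This is an illustration only, not the code from TransformersModelForTokenClassificationNerStep; the helper name `concat_on_cpu` and the tensor shapes are hypothetical.

```python
import torch


def concat_on_cpu(batches):
    """Concatenate per-batch tensors along dim 0 after moving each to the CPU."""
    # detach() drops autograd history; cpu() is a no-op for tensors already on the CPU.
    cpu_batches = [t.detach().cpu() for t in batches]
    return torch.cat(cpu_batches, dim=0)


# Assumed shapes: 3 batches of (batch=4, tokens=16, labels=5) model outputs.
device = "cuda" if torch.cuda.is_available() else "cpu"
logits_per_batch = [torch.randn(4, 16, 5, device=device) for _ in range(3)]
all_logits = concat_on_cpu(logits_per_batch)
```

Concatenating on the CPU means the combined tensor is allocated in host memory rather than on the accelerator, which matters when the number of documents is large.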