Releases: umcu/clinlp
v0.9.0
0.9.0 (2024-07-10)
Added
- Mantra GSC corpus for evaluation
- Loading and exporting `InfoExtractionDataset` as dictionaries or JSON files
- Metric support for multi-class qualifiers
- In the `RuleBasedEntityMatcher`, option to add terms as a `dict` (in addition to `str`, `list` and `Term`)
- In the `RuleBasedEntityMatcher`, option to add terms from dict (`add_terms_from_dict`), json (`add_terms_from_json`) or csv (`add_terms_from_csv`)
- In the `Term` class, an option to override arguments that were not set
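
As a rough illustration of the dict- and csv-based options above, the sketch below groups csv rows into a mapping from concept to terms using only the standard library. The column names `concept` and `term`, the sample rows, the helper `concepts_from_csv` and the shape of the resulting dict are all assumptions made for illustration; check the clinlp documentation for the formats `add_terms_from_csv` and `add_terms_from_dict` actually accept.

```python
import csv
import io

# Hypothetical csv content; the column names are assumed, not taken from clinlp.
CSV_TEXT = """concept,term
diabetes,diabetes mellitus
diabetes,suikerziekte
hypertension,hypertensie
"""


def concepts_from_csv(text: str) -> dict[str, list[str]]:
    """Group terms by concept, giving {"concept": ["term", ...]}."""
    grouped: dict[str, list[str]] = {}
    for row in csv.DictReader(io.StringIO(text)):
        # Create the list on first sight of a concept, then append the term.
        grouped.setdefault(row["concept"], []).append(row["term"])
    return grouped


concepts = concepts_from_csv(CSV_TEXT)
print(concepts)
```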
Changed
- Moved regression test cases to data directory in more open format, so they are re-usable
- Made the `default` field for `Qualifier` optional
- `InfoExtractionDataset` and `InfoExtractionMetrics` use `Qualifier` objects for qualifiers rather than `dict`
- ❗ `InfoExtractionDataset` and `InfoExtractionMetrics` no longer track or use qualifier defaults
- Made qualifiers optional for metrics in `Annotation`
- Added a `normalize` method to `Normalizer`, so it can be used/tested directly
- The logic for determining whether the `RuleBasedEntityMatcher` should internally use the phrase matcher or the matcher is simplified
Deprecated
- ❗ The `create_concept_dict` method, which is now replaced by `add_terms_from_csv` in `RuleBasedEntityMatcher`
- ❗ In the `RuleBasedEntityMatcher`, the `load_concepts` method, which is now replaced by `add_terms_from_dict` and `add_terms_from_json`
v0.8.1
0.8.1 (2024-06-27)
Added
- Docstrings on all modules, classes, methods and functions
Changed
- In `InformationExtractionDataset`, renamed `span_counts`, `label_counts` and `qualifier_counts` to `span_freqs`, `label_freqs` and `qualifier_freqs` respectively.
- The `clinlp_component` utility now returns the class itself, rather than a helper function for making it
- Changed order of `direction` and `qualifier` arguments of `ContextRule`
- Simplified default settings for `clinlp` components and `Term` class
- Normalizer uses casefold rather than lower for normalizing text
- Parameterized `spans_key` for ie components
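
The switch from lower to casefold matters for characters whose lowercase form is not their fully case-insensitive form. A minimal sketch using only Python's built-in string methods (it illustrates the string behaviour, not the `Normalizer` implementation itself):

```python
# Compare str.lower() and str.casefold() on text where they differ.
# Illustration only: this is plain Python, not clinlp's Normalizer.
samples = ["Patient", "STRASSE", "Straße"]

lowered = [s.lower() for s in samples]
folded = [s.casefold() for s in samples]

for original, low, fold in zip(samples, lowered, folded):
    print(f"{original!r}: lower={low!r} casefold={fold!r}")

# With lower(), "STRASSE" and "Straße" stay distinct; with casefold() they match.
same_after_lower = "STRASSE".lower() == "Straße".lower()
same_after_casefold = "STRASSE".casefold() == "Straße".casefold()
```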
v0.8.0
0.8.0 (2024-06-03)
Changed
- ❗ Renamed the `clinlp_entity_matcher` to `clinlp_rule_based_entity_matcher`
- ❗ `clinlp` now stores entities in `doc.spans['ents']` rather than `doc.ents`, allowing for overlap
  - ❗ Overlap in entities found by the entity matcher is no longer resolved by default (replacing old behaviour). To remove overlap, pass `resolve_overlap=True`.
- Refactored tests to use `pytest` best practices
- Changed `clinlp_autocomponent` to `clinlp_component`, which automatically registers your component with spaCy
- Codebase and linting improvements
- Renamed the `other_threshold` config to `family_threshold` in the `clinlp_experiencer_transformer` component
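
To make the overlap change more concrete, here is a small stand-alone sketch of what resolving overlap between entity spans can look like. It is not clinlp's implementation: the `Span` tuple, the `drop_overlapping_spans` helper and the longest-span-first rule are assumptions chosen for illustration, and the library may resolve overlap differently.

```python
from typing import NamedTuple


class Span(NamedTuple):
    """Hypothetical stand-in for an entity span (token offsets, end exclusive)."""

    start: int
    end: int
    label: str


def drop_overlapping_spans(spans: list[Span]) -> list[Span]:
    """Keep a non-overlapping subset, preferring longer spans, then earlier ones.

    Assumed strategy for illustration only.
    """
    kept: list[Span] = []
    # Visit the longest spans first; ties are broken by the earliest start.
    for span in sorted(spans, key=lambda s: (-(s.end - s.start), s.start)):
        # A span is kept only if it lies entirely before or after every kept span.
        if all(span.end <= k.start or span.start >= k.end for k in kept):
            kept.append(span)
    return sorted(kept, key=lambda s: s.start)


# A long match with a shorter match nested inside it, plus an unrelated span.
found = [
    Span(0, 4, "diabetes_type_2"),
    Span(0, 2, "diabetes"),
    Span(6, 7, "hypertension"),
]

print(found)                          # overlapping spans kept side by side
print(drop_overlapping_spans(found))  # the shorter nested span is dropped
```

Keeping all spans by default leaves the choice of resolution strategy to the caller, which is useful when a nested shorter match carries information of its own.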
Fixed
- The `clinlp_rule_based_entity_matcher` no longer overwrites entities detected by other components (but appends them)
v0.7.0
0.7.0 (2024-05-16)
Added
- Integrated the clin_nlp_metrics package in this repository, specifically in `clinlp.metrics.ie`
- Support for non-binary qualifier in the Context Algorithm (e.g. 'Change', with values Decreasing, Stable and Increasing)
- Support for bidirectional qualifier patterns
Changed
- Moved all components related to information extraction to `clinlp.ie`. Please update imports accordingly (e.g. `from clinlp.ie import Term`).
- Updated the framework for qualifiers, to now have three qualifier classes: Presence, Temporality and Experiencer. For more details, see docs.