##v2.1.0
Implemented Enhancements:
- An improved NER system that allows for much better performance on inorganic materials.
- New tokenization to go with the new NER system.
- The addition of InferredProperty allows for users to define explicit links between different properties included in their data models, reducing a large amount of boilerplate parser code.
- The Every parse element means that users can specify that a certain token satisfy multiple condition.
- A more flexible tagging system that allows for the creation of taggers beyond just part of speech and NER taggers.
- Batch tagging.
- A new, modern theme for the documentation, along with much more detail in the documentation on certain parts of ChemDataExtractor, such as tagging and tokenization.
Breaking Changes:
- Any taggers previously written by the user will be broken. Please refer to the migration guide for version 2.1.
- The new tokenization can break some parse rules written by the user. This can either be fixed by adopting a few changes to the parse rules, or by reverting to the previous NER system and tokenizer. Please refer to the migration guide for more details.
v2.0 (2019-09-xx)
Implemented enhancements:
- New model structure changed so that the Compound class is no longer at the root of all properties
- Hierarchy changed so that documents own models, not parsers, so that the user doesn’t need to remember to pass in all the correct parsers.
- Quantity based models, allowing for easy detection of units and values. Also allows for better comparisons of models.
- Completely new table parsing routine with the incorporation of TableDataExtractor. This returns a more structured form for tables without any user input.
- Automatically generated parsers based on the dimensional information of properties.
- Forward looking Interdependency Resolution for detecting definitions of specifier terms and chemical names.
- Improved Interdependency Resolution to account for more complex models.
- Snowball integration where Snowball parsers can be used seamlessly alongside rule-based parsers.
- Improved performance, with parsing up to 2x faster in real-world usage.
- The incorporation of an evaluation package for measuring the performance of CDE.
- Improved tokenization when using new quantity based models.
- Improved documentation, including a migration guide for users coming from older versions.
v1.3.0 (2017-02-03)
Implemented enhancements:
v1.2.3 (2017-01-22)
Fixed bugs:
- _in_stoplist should return True for entities trimmed out of existence #12
v1.2.2 (2016-11-02)
Fixed bugs:
v1.2.1 (2016-10-24)
Fixed bugs:
- RSCHTMLReader throws bytes/string error #8
- Fix encoding bug in RSC image character handling #9 (mcs07)
v1.2.0 (2016-10-11)
Implemented enhancements:
Fixed bugs:
- import error: HTMLParser in Python 3 #7
- Installation on Windows 7 #3
- HTML unescape py2/3 compat - fixes #4 #6 (mcs07)
v1.1.1 (2016-10-04)
Implemented enhancements:
Fixed bugs:
- version of pdfminer #1
v1.1.0 (2016-10-03)
* This Change Log was automatically generated by github_changelog_generator