- Add: update gnfinder to v1.1.5
- Add: sort uses `slices` package.
v1.0.0 - 2023-03-27 Mon
- Add: update modules, finalize v1
v1.0.0-RC6 - 2023-01-31 Tue
- Add #65: remove duplicates from the short version of occurrences dump.
v1.0.0-RC5 - 2023-01-19 Thu
- Add #64: add occurrence number to short dump, use it in the filter.rb.
- Add #63: add verbatim names to short dump, optionally normalized.
v1.0.0-RC4 - 2023-01-10 Tue
- Add #62: dump odds vs verification stats.
- Add #61: shortened and filtered dump for BHL-related data.
- Add #60: normalize odds according to verification results.
v1.0.0-RC3 - 2022-12-13 Tue
- Add: documentation in README about creating and configuring database and its user.
- Fix: show an error if rebuilding or initializing the database did not work as expected.
v1.0.0-RC2 - 2022-11-28 Mon
- Add #59: modify dump data according to #57.
- Add #58: pre-scan data for duplicate page Ids before name-finding.
- Add #57: switch to an OCR dump structure that uses item and page Ids.
- Add #56: add an option to return all verification results instead of the best results only.
v1.0.0-RC1 - 2022-10-30 Sun
- Add #55: refactor the directory structure using `internal` directory to hide code not suitable for public use.
v0.13.2 - 2022-09-12 Mon
- Add #53: classification ranks and IDs in dump files.
v0.13.1 - 2022-09-08 Thu
- Add #52: dump pages information
v0.13.0 - 2022-09-01 Thu
- Add #51: remove RESTful interface, no more remote access. All data is taken from dumps.
- Add #50: dump saves pages and names separately and accepts a flag to dump results only for specific data-sources. Dump also has a flag that sets the directory where dump data is saved.
v0.12.6 - 2022-08-29 Mon
- Add: compatibility with GNverifier v1.0.0
- Add: info for RESTful API.
- Add: RESTful API for occurrences takes data_sources into account.
- Add: improve code documentation.
- Add: detected verbatim name to results and data-dump.
- Add: shorten barcode for page to sequence number.
- Fix: deal with verbatim names longer than 255 bytes.
- Add: improve help messages.
- Add: update to gnfinder v0.19.2.
- Add: Update to gnfinder v0.19.1.
- Fix: add classification ranks, ids to REST API.
- Add #49: add classification ranks, ids.
- Add #48: change RESTful pagination to use IDs.
- Add #47: implement `dump` command.
- Add #45: create RESTful service.
- Add #46: switch to gnverifier for name verification.
- Add #43: refactor to improve architecture and usability.
- Add #41: Update to gnfinder v0.11.1.
- Add #39: Save annotations about new species, combinations, subspecies.
- Add #38: Save 5 words before and after name-candidates.
- Add #36: Rename `title` to `item` to be in sync with BHL terminology, name_string export via gRPC.
- Add #35: Fixes in dictionaries. In particular, names of botanical genera authors are not in the dictionary anymore. Also, common Latin capitalized words from species descriptions are now added to the 'grey' dictionary. As a result, the calculation of the Bayes odds score improved quite a bit.
- Add #34: There are more indices.
- Add #32: Pages are not considered unique anymore and we take a combination of item id and archive page id as unique.
- Add #31: save preferred data-sources results to db.
- Add #30: average odds and occurrence number for name_strings.
- Add #29: matched canonical form from verification.
- Fix #28: sporadic non-zero edit distance for ExactMatch.
- Fix #27: no verification for abbreviated names.
- Add #26: add Go modules to make builds more stable.
- Add #24: updates in verification interface.
- Add #23: gRPC has an option to limit stream of pages to specific volumes.
- Add #22: gRPC has a stream of volumes metainfo.
- Add #18: gRPC example groups names by class clade.
- Add #17: gRPC does not stream volumes, streams pages and names and text.
- Add #16: gRPC streams volumes, pages, and names.
- Add #15: simple gRPC server and an example of how to use it.
- Fix #25: gRPC serves pages in ascending order instead of random order.
- Add #14: curation information for verified names.
- Add #12, #13: options to set workers in command line app, better CLI.
- Add #9, #10, #11: improve command line interface.
- Add #8: decouple name-finding and name-verification.
- Add #4: set a Makefile to simplify compilation and packaging.
- Add #3: verification of name-strings against gnindex.
- Add #2: saving unique name-strings to database.
- Add: gnfinder support for Bayes searches.
- Update: tests to pass again.
- Update: to changes in dependencies.
- Remove: `*.txt` files from `git lfs`.
- Add: `git lfs` support
- Add: documentation in `README.md` file and script/README.md file.
- Update: to recent `gnfinder`.
- Add: Biodiversity Heritage Library production trial, 1h for 50 million pages.
- Add: heuristic name finding via gnfinder.
- Add: saving data to database.
- Add: production wrapper script to reset db and do name-finding.
- Add: command line program.
- Add: name-finding framework.
- Add: Postgresql support and migrations.
- Add: development environment with `docker-compose`.
This document follows changelog guidelines