Parser for corpus_pubtator.txt #10

Open
izuna385 opened this issue Feb 21, 2021 · 1 comment

Comments

@izuna385

Hi, I just wrote a parser for PubTator-format annotations, intended for this dataset and others in the same format.
https://github.com/izuna385/PubTator-Multiprocess-Parser

I hope you find it useful.
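
For readers unfamiliar with the format, here is a minimal sketch of what parsing a PubTator-style file involves. It is not the code from the repository linked above. It assumes the commonly used layout: `PMID|t|title` and `PMID|a|abstract` lines, followed by tab-separated annotation lines (`PMID`, start, end, mention, type(s), concept ID), with a blank line between documents. The names `Annotation`, `Document`, `parse_pubtator` and the `SAMPLE` text with its placeholder IDs are all invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Iterable, Iterator, List, Optional


@dataclass
class Annotation:
    start: int           # character offset into title + separator + abstract
    end: int             # exclusive end offset
    mention: str
    types: List[str]     # comma-separated in the file
    concept_id: str


@dataclass
class Document:
    pmid: str
    title: str = ""
    abstract: str = ""
    annotations: List[Annotation] = field(default_factory=list)


def parse_pubtator(lines: Iterable[str]) -> Iterator[Document]:
    """Yield one Document per blank-line-separated block (hypothetical helper)."""
    doc: Optional[Document] = None
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            # A blank line closes the current document.
            if doc is not None:
                yield doc
                doc = None
            continue
        head = line.split("|", 2)
        # Text lines look like PMID|t|... or PMID|a|...; the tab check keeps
        # annotation lines whose mention contains "|" from matching here.
        if len(head) == 3 and head[1] in ("t", "a") and "\t" not in head[0]:
            pmid, kind, text = head
            if doc is None:
                doc = Document(pmid=pmid)
            if kind == "t":
                doc.title = text
            else:
                doc.abstract = text
            continue
        fields = line.split("\t")
        if doc is not None and len(fields) >= 6:
            _, start, end, mention, types, concept_id = fields[:6]
            doc.annotations.append(
                Annotation(int(start), int(end), mention, types.split(","), concept_id)
            )
    # The last document may not be followed by a blank line.
    if doc is not None:
        yield doc


# Invented sample; the type and concept IDs are placeholders, not real codes.
SAMPLE = (
    "1|t|Aspirin helps\n"
    "1|a|It treats headache.\n"
    "1\t0\t7\tAspirin\tT000\tC0000001\n"
    "1\t24\t32\theadache\tT000\tC0000002\n"
    "\n"
    "2|t|Second title\n"
    "2|a|Second abstract.\n"
)

docs = list(parse_pubtator(SAMPLE.splitlines()))
```

To read a real file, something like `parse_pubtator(open("corpus_pubtator.txt", encoding="utf-8"))` should work under the same assumptions. The sketch also assumes offsets index into the title and abstract joined by a single separator character, which is worth verifying against the corpus before relying on it.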

@gpiat

gpiat commented Mar 8, 2021

Hello,
I thought this would be a good place to mention that there are a number of projects that help with loading PubTator-format files.

The oldest package I could find that handles PubTator is PubTator2Anndoc, but its scope is very limited.

Perhaps the first packages to implement multipurpose PubTator support were Kindred and bconv.

I've personally been working with MedMentions for over a year now and recently released the code I've been using to parse it on the Python Package Index as well. It's called pubtatortool and can be found here.

Shortly thereafter, pubtator-loader and pubtator2dataset were released as well.

To help anyone trying to decide which package to use, it may be worth putting together a table of supported features and posting it in this thread.
