arXiv fetcher: import more information using DOI #9092

Closed
mlep opened this issue Aug 25, 2022 · 7 comments · Fixed by #9170
Labels
fetcher, good first issue, import

Comments

@mlep
Contributor

mlep commented Aug 25, 2022

Is your suggestion for improvement related to a problem? Please describe.
When importing a reference by its arXiv number (i.e. by pasting the arXiv number into the entry table):

  • the provided arXiv number is not stored
  • less information is imported, compared to using the DOI of the arXiv publication.

Describe the solution you'd like
Import the reference using the DOI.
It is straightforward to get the DOI from the arXiv number: e.g. the publication with the arXiv ID 1811.10364 has the DOI 10.48550/arXiv.1811.10364. See https://arxiv.org/abs/1811.10364
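
As a minimal sketch of the mapping described above, the DOI can be built by prepending the prefix to the arXiv ID. The class and method names here are hypothetical and are not taken from the JabRef codebase; the only fact used is the prefix quoted in this issue.

```java
/** Hypothetical helper, not part of JabRef: builds the arXiv-issued DOI from an arXiv ID. */
final class ArxivDoiSketch {

    // Prefix quoted in this issue: arXiv ID 1811.10364 has the DOI 10.48550/arXiv.1811.10364.
    private static final String ARXIV_DOI_PREFIX = "10.48550/arXiv.";

    private ArxivDoiSketch() {
    }

    /** Prepends the DOI prefix to an arXiv ID that the caller has already validated. */
    static String doiFromArxivId(String arxivId) {
        String id = arxivId.trim();
        if (id.isEmpty()) {
            throw new IllegalArgumentException("arXiv ID must not be empty");
        }
        return ARXIV_DOI_PREFIX + id;
    }

    public static void main(String[] args) {
        System.out.println(doiFromArxivId("1811.10364"));
    }
}
```

The resulting string could then be handed to whatever DOI-based import path is available; how that is wired up inside JabRef is not shown here.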

@Siedlerchr Siedlerchr added the good first issue label Sep 15, 2022
@thiagocferr
Contributor

From what I've seen of the arXiv importer and the usual API responses, extracting the DOI from the arXiv ID is only trivial when the author provides the DOI.

The DOI displayed on their site may actually be one generated by DataCite (see image below), in which case it does not seem to be included in the response to the API call made in the code (for example, http://export.arxiv.org/api/query?id_list=1811.10364). According to the API manual, the DOI would appear as the arxiv:doi element (present only if the author included it) or as a link element, but this does not seem to be the case for some entries (e.g. arXiv ID 1811.10364).

image

Considering that, this feature would either only work on entries where the DOI was provided in the API response, or JabRef could try to get the missing information by other means (web scraping, other APIs such as DataCite, matching against other archives, etc.).
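
To illustrate the first option, here is a rough sketch of checking an API response for the arxiv:doi element. It assumes the response is an Atom feed in which the arxiv prefix is bound to the namespace http://arxiv.org/schemas/atom; the class name, method name, and the sample feed are made up for illustration and do not reflect how JabRef's fetcher is written.

```java
import java.io.StringReader;
import java.util.Optional;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical helper, not part of JabRef: looks for an arxiv:doi element in an Atom response. */
final class ArxivApiDoiSketch {

    // Assumed namespace URI for the "arxiv:" prefix in the API's Atom feed.
    private static final String ARXIV_NS = "http://arxiv.org/schemas/atom";

    private ArxivApiDoiSketch() {
    }

    /** Returns the text of the first arxiv:doi element, or empty if the feed has none. */
    static Optional<String> doiFromApiResponse(String atomXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            // Refuse DOCTYPE declarations so untrusted input cannot pull in external entities.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            Document doc = factory.newDocumentBuilder()
                                  .parse(new InputSource(new StringReader(atomXml)));
            NodeList nodes = doc.getElementsByTagNameNS(ARXIV_NS, "doi");
            if (nodes.getLength() == 0) {
                return Optional.empty();
            }
            String doi = nodes.item(0).getTextContent().trim();
            return doi.isEmpty() ? Optional.empty() : Optional.of(doi);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse API response", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample feed with a placeholder DOI, for demonstration only.
        String sample = "<feed xmlns=\"http://www.w3.org/2005/Atom\""
                + " xmlns:arxiv=\"http://arxiv.org/schemas/atom\">"
                + "<entry><arxiv:doi>10.1000/example</arxiv:doi></entry></feed>";
        System.out.println(doiFromApiResponse(sample));
    }
}
```

An empty result is the case discussed above: the response carries no DOI, so the information would have to come from somewhere else.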

Please correct me if this is a false conclusion, as I am still not very familiar with most of the codebase.

@mlep
Contributor Author

mlep commented Sep 16, 2022

I am not sure that I got you right.

When the user provides the arXiv ID (e.g. 1811.10364), the user also provides, in effect, the DOI. This is because you just need to add the prefix 10.48550/arXiv. to the arXiv ID to get the DOI (10.48550/arXiv.1811.10364). So JabRef, when provided with an arXiv ID, could use not only the arXiv fetcher but also the DOI fetcher.

@thiagocferr
Contributor

thiagocferr commented Sep 19, 2022

What I don't get, using your example, is how you could possibly know to add the prefix 10.48550/arXiv. (more specifically, the 10.48550 part) if the only information provided is the arXiv ID (1811.10364). From my understanding, this would only be known if the arXiv fetcher could get the DOI in the same request, which it does not always do, as shown before...

@mlep
Contributor Author

mlep commented Sep 19, 2022

You could possibly add the prefix 10.48550/arXiv because you have identified that 1811.10364 is an arXiv ID.

And, currently, JabRef is already able to identify if a string is an arXiv ID: by simply pasting 1811.10364 in the entry table, JabRef is able to determine that this string is an arXiv ID ("Found arxiv identifier in clipboard" is written in the log file).
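
For illustration only, recognizing an arXiv ID in pasted text could look roughly like the sketch below. This is not JabRef's actual detection code. The regular expression is an assumption that covers only new-style IDs such as 1811.10364, optionally with an "arXiv:" label or a version suffix like v2, and dropping the version suffix is also an assumption.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not JabRef's detection logic: recognizes new-style arXiv IDs. */
final class ArxivIdSketch {

    // Assumed shape: optional "arXiv:" label, YYMM.NNNN or YYMM.NNNNN, optional version (v2).
    private static final Pattern NEW_STYLE_ID =
            Pattern.compile("(?i)(?:arxiv:)?(\\d{4}\\.\\d{4,5})(?:v\\d+)?");

    private ArxivIdSketch() {
    }

    /** Returns the bare ID (without label or version) if the whole text is an arXiv ID. */
    static Optional<String> parseArxivId(String text) {
        Matcher matcher = NEW_STYLE_ID.matcher(text.trim());
        return matcher.matches() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(parseArxivId("1811.10364"));
        System.out.println(parseArxivId("not an identifier"));
    }
}
```

Once a string has been recognized this way, the prefix 10.48550/arXiv. mentioned above can be prepended to the returned ID. Old-style IDs that include an archive name are deliberately left out of this sketch.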

@thiagocferr
Contributor

I think I get it now. After a quick search, I found from this article that, indeed, all arXiv articles have a DOI with the same prefix, which I never really paid attention to 😅.

@thiagocferr
Contributor

Well, I've tackled a bit of code around this functionality and I have an idea on how to implement it, so I'd like to contribute to it as my first issue on this repo.

@Siedlerchr
Member

@thiagocferr That sounds great! If you have any questions, you can ask them in your PR.
Make sure to follow our contribution guide! https://devdocs.jabref.org/contributing.html#contribute-code

@ThiloteE ThiloteE moved this to Free to take in Good First Issues Oct 14, 2022
@ThiloteE ThiloteE moved this from Free to take to In Progress in Good First Issues Oct 14, 2022
@koppor koppor moved this to Normal priority in Features & Enhancements Nov 7, 2022
Repository owner moved this from Normal priority to Done in Features & Enhancements Nov 10, 2022
Repository owner moved this from In Progress to Done in Good First Issues Nov 10, 2022
3 participants