Fixes Jabref#7660 Unable to download some arXiv links if the "eprint" field is missing #7663

Merged
merged 1 commit into from
Apr 23, 2021

JavuesZhang
Contributor

Fixes #7660

Brief summary

  1. Run EprintCleanup on a copy of the entry that the ArXiv fetcher is processing, before reading the arXiv ID from the eprint field.
  2. Add two test methods: one finds the full text for an entry with a colon in its title and a journal field; the other does so for an entry with a colon in its title and a url field.

Problem

When finding full text, this BibTeX reference works:

@Article{booth_bayes-trex_2020,
  author        = {Serena Booth and Yilun Zhou and Ankit Shah and Julie Shah},
  journal       = {arXiv:2002.10248v4 [cs]},
  title         = {Bayes-TrEx: a Bayesian Sampling Approach to Model Transparency by Example},
  year          = {2020},
  month         = dec,
  archiveprefix = {arXiv},
  eprint        = {2002.10248},
  url           = {http://arxiv.org/abs/2002.10248v4},
}

#7660-0

But when the eprint field is missing, no full text is found:

@Article{booth_bayes-trex_2020,
  author        = {Serena Booth and Yilun Zhou and Ankit Shah and Julie Shah},
  journal       = {arXiv:2002.10248v4 [cs]},
  title         = {Bayes-TrEx: a Bayesian Sampling Approach to Model Transparency by Example},
  year          = {2020},
  month         = dec,
  archiveprefix = {arXiv},
  url           = {http://arxiv.org/abs/2002.10248v4},
}

#7660-1

Solution

Because the title contains a colon, and arXiv uses a colon to separate a key from its value, the title may be misinterpreted. To avoid this, the eprint field is derived from other fields instead. Thanks to @tobiasdiez for the advice.
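
As a rough illustration of the idea, the sketch below derives an arXiv ID from the url or journal field when eprint is absent, without ever looking at the title. It is not JabRef's EprintCleanup: the class name, the plain `Map` standing in for a BibEntry, the regular expression, and the field order are all assumptions made for this example.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for illustration; not JabRef's EprintCleanup or BibEntry.
class EprintFromFieldsSketch {

    // Assumed pattern for new-style arXiv IDs such as 2002.10248 or 2002.10248v4.
    private static final Pattern ARXIV_ID = Pattern.compile("(\\d{4}\\.\\d{4,5})(v\\d+)?");

    // Fields inspected when eprint is missing (assumed order; the title is never used).
    private static final List<String> CANDIDATE_FIELDS = List.of("url", "journal");

    static Optional<String> findArxivId(Map<String, String> fields) {
        String eprint = fields.get("eprint");
        if (eprint != null && !eprint.isBlank()) {
            return Optional.of(eprint.trim());
        }
        for (String name : CANDIDATE_FIELDS) {
            String value = fields.get(name);
            // Only trust a number that appears next to an explicit arXiv marker.
            if (value == null || !value.toLowerCase(Locale.ROOT).contains("arxiv")) {
                continue;
            }
            Matcher matcher = ARXIV_ID.matcher(value);
            if (matcher.find()) {
                return Optional.of(matcher.group(1)); // ID without the version suffix
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Map<String, String> entry = new HashMap<>();
        entry.put("title", "Bayes-TrEx: a Bayesian Sampling Approach to Model Transparency by Example");
        entry.put("journal", "arXiv:2002.10248v4 [cs]");
        System.out.println(findArxivId(entry).orElse("not found"));
    }
}
```

With the second BibTeX entry from the Problem section, this sketch would recover `2002.10248` from either the journal or the url field, so the colon in the title never enters the lookup.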

Screenshots

After fix:
#7660-2

  • Change in CHANGELOG.md described in a way that is understandable for the average user (if applicable)
  • Tests created for changes (if applicable)
  • Manually tested changed features in running JabRef (always required)
  • Screenshots added in PR description (for UI changes)
  • Checked documentation: Is the information available and up to date? If not created an issue at https://github.com/JabRef/user-documentation/issues or, even better, submitted a pull request to the documentation repository.

Member

@tobiasdiez tobiasdiez left a comment

Wow that was quick 🚀 . Code looks good to me! Thanks for your contribution.

@tobiasdiez tobiasdiez added the status: ready-for-review Pull Requests that are ready to be reviewed by the maintainers label Apr 23, 2021
@@ -116,6 +118,9 @@ public TrustLevel getTrustLevel() {
    }

    private List<ArXivEntry> searchForEntries(BibEntry entry) throws FetcherException {
        entry = (BibEntry) entry.clone();
Member

Why do you clone the entry?

Contributor Author

If we ran the cleanup on the original entry, its fields would be changed, but I don't think we should change any fields here because this is just a search.
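
The point of this reply can be sketched in a few lines: run the mutating cleanup on a copy, so the search leaves the caller's entry untouched. This is only an illustration under stated assumptions. A plain `Map` stands in for JabRef's BibEntry, and `fillEprintFromUrl` and `arxivIdForSearch` are hypothetical helpers, not JabRef methods.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration; a plain Map stands in for JabRef's BibEntry.
class CleanupOnCopySketch {

    // Assumed pattern for an arXiv abstract URL with a new-style ID.
    private static final Pattern ARXIV_URL = Pattern.compile("arxiv\\.org/abs/(\\d{4}\\.\\d{4,5})");

    // Hypothetical cleanup: mutates the given map by filling in "eprint" from "url".
    static void fillEprintFromUrl(Map<String, String> fields) {
        String url = fields.get("url");
        if (url == null || fields.containsKey("eprint")) {
            return;
        }
        Matcher matcher = ARXIV_URL.matcher(url);
        if (matcher.find()) {
            fields.put("eprint", matcher.group(1));
        }
    }

    // Search helper: cleans up a copy so the caller's entry is left as it was.
    static String arxivIdForSearch(Map<String, String> entry) {
        Map<String, String> copy = new HashMap<>(entry); // shallow copy suffices for String values
        fillEprintFromUrl(copy);
        return copy.get("eprint"); // null if no ID could be derived
    }

    public static void main(String[] args) {
        Map<String, String> entry = new HashMap<>();
        entry.put("url", "http://arxiv.org/abs/2002.10248v4");
        System.out.println(arxivIdForSearch(entry));      // prints 2002.10248
        System.out.println(entry.containsKey("eprint"));  // prints false
    }
}
```

The trade-off is one extra copy per search in exchange for a lookup with no visible side effects on the user's library.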

@JavuesZhang
Contributor Author

> Wow that was quick 🚀 . Code looks good to me! Thanks for your contribution.

It was your advice that made me efficient! At first I tried to solve this problem by ignoring colons in the title, but that did not seem like a good approach.

Member

@Siedlerchr Siedlerchr left a comment

Thanks for the quick fix!

@Siedlerchr Siedlerchr merged commit f815050 into JabRef:main Apr 23, 2021
Siedlerchr added a commit that referenced this pull request Apr 24, 2021
…om.tngtech.archunit-archunit-junit5-api-0.18.0

* upstream/main:
  Fix exception when searching (#7659)
  Fixes #7660 (#7663)
  Fix for issue 5850: Journal abbreviations in UTF-8 not recognized (#7639)
  Fix SSLHandshake Exception by using bypass (#7657)
  Fix for issue 7633: Unable to download arXiv pdfs if Title contains curly brackets (#7652)
  Fix#7195 partly Opacity of disabled icon-buttons