Fix for issue 5804: Rewrite ACM fetcher #7733

Merged
merged 15 commits into from
May 20, 2021

Contributor

@ruanych ruanych commented May 12, 2021

Fixes #5804

  • Change in CHANGELOG.md described in a way that is understandable for the average user (if applicable)
  • Tests created for changes (if applicable)
  • Manually tested changed features in running JabRef (always required)
  • Screenshots added in PR description (for UI changes)
  • Checked documentation: Is the information available and up to date? If not, created an issue at https://github.com/JabRef/user-documentation/issues or, even better, submitted a pull request to the documentation repository.

As described in issue #5804, the ACM fetcher needs to be rewritten because the ACM website changed.
One solution is to extract the DOIs from the HTML of the search results page, then request the data for those DOIs in JSON format from the export interface, and finally parse it.

Search API: https://dl.acm.org/action/doSearch?AllField=<query_string>
Export API: https://dl.acm.org/action/exportCiteProcCitation?targetFile=custom-bibtex&format=bibTex&dois=<doi>,<doi>,...
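
The two-step flow above can be sketched with the standard library alone. This is only an illustration, not the code in this PR: the class `AcmUrlSketch`, its method names, and the assumed `href="/doi/..."` link shape are hypothetical, and the regular expression stands in for whatever HTML parsing the real parser uses.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; names are illustrative and not taken from JabRef.
final class AcmUrlSketch {
    static final String SEARCH_URL = "https://dl.acm.org/action/doSearch";
    static final String EXPORT_URL = "https://dl.acm.org/action/exportCiteProcCitation";

    // Assumption: result links look like href="/doi/10.1145/..."; the real markup may differ.
    private static final Pattern DOI_LINK =
            Pattern.compile("href=\"/doi/(10\\.\\d{4,9}/[^\"?#]+)\"");

    // Step 1: build the search URL for a free-text query.
    static String searchUrl(String query) {
        return SEARCH_URL + "?AllField=" + URLEncoder.encode(query, StandardCharsets.UTF_8);
    }

    // Step 2: collect the distinct DOIs from the result page, keeping their order.
    static List<String> extractDois(String html) {
        Set<String> dois = new LinkedHashSet<>();
        Matcher matcher = DOI_LINK.matcher(html);
        while (matcher.find()) {
            dois.add(matcher.group(1));
        }
        return new ArrayList<>(dois);
    }

    // Step 3: build the export URL for all DOIs at once (comma-separated).
    static String exportUrl(List<String> dois) {
        return EXPORT_URL + "?targetFile=custom-bibtex&format=bibTex&dois=" + String.join(",", dois);
    }

    public static void main(String[] args) {
        String html = "<a href=\"/doi/10.1145/3129790.3129810\">First</a>"
                + "<a href=\"/doi/10.1145/1015530.1015557\">Second</a>";
        System.out.println(searchUrl("machine learning"));
        System.out.println(exportUrl(extractDois(html)));
    }
}
```

Requesting all DOIs in a single export call keeps the number of HTTP requests per search at two, regardless of how many results the page lists.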

What I did:

  1. Rewrote ACMPortalFetcher and added ACMPortalParser.
  2. Modified ACMPortalFetcherTest and added ACMPortalParserTest.

Successfully fetched data with the ACM fetcher:
[Screenshot: ACM-Portal]
[Screenshot: ACM-Portal-3-imported]

@ruanych ruanych changed the title Fix for issue 5804 Fix for issue 5804: Rewrite ACM fetcher May 12, 2021

private StandardEntryType typeStrToEnum(String typeStr) {
StandardEntryType type;
typeStr = typeStr.toLowerCase(Locale.ENGLISH).replace("_", "");

Member

I think you can simplify this whole EntryType handling if you want to look up an enum constant by its string value:
StandardEntryType.valueOf(...)
Or alternatively, use the guava equivalent https://guava.dev/releases/19.0/api/docs/index.html?com/google/common/base/Enums.html

Contributor Author

The problem is that the constants of StandardEntryType use UpperCamelCase names, so I can't call StandardEntryType.valueOf(...) directly; I would first need a good way, ideally one already available in JabRef, to convert between UpperCamelCase and the all-caps, underscore-separated format.

Also, some type names do not match exactly. For example, the API returns PAPER_CONFERENCE, which corresponds to Conference in StandardEntryType.
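
To make the mismatch concrete, here is a small sketch with a hypothetical stand-in enum (`EntryType`, not JabRef's actual `StandardEntryType`). `valueOf` compares names exactly and case-sensitively, and throws `IllegalArgumentException` when nothing matches, so the all-caps names from the API are rejected.

```java
import java.util.Optional;

final class ValueOfSketch {
    // Hypothetical stand-in for StandardEntryType; only the naming style matters here.
    enum EntryType { Article, InProceedings, Conference }

    // Wraps valueOf so that a failed lookup becomes an empty Optional.
    static Optional<EntryType> lookup(String name) {
        try {
            return Optional.of(EntryType.valueOf(name));
        } catch (IllegalArgumentException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(lookup("Article"));          // prints Optional[Article]
        System.out.println(lookup("ARTICLE"));          // prints Optional.empty
        System.out.println(lookup("PAPER_CONFERENCE")); // prints Optional.empty
    }
}
```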

Member

@Siedlerchr Siedlerchr May 14, 2021

Ah, I understand. However, you can operate directly on the enum: stream over an EnumSet of all constants and you get an Optional that tells you whether a matching type was found.
This approach is simpler, and you don't need to create a new Map with the keys and values.

       Optional<StandardEntryType> foundType  = EnumSet.allOf(StandardEntryType.class).stream().filter(type-> type.getDisplayName().equals(strType)).findAny(); 

Contributor Author

Optional<StandardEntryType> foundType  = EnumSet.allOf(StandardEntryType.class).stream().filter(type-> type.getDisplayName().equals(strType)).findAny(); 

This is invalid if the input string is ARTICLE
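
A short sketch of why the lookup misses, using a hypothetical stand-in enum whose display names are assumed to equal the constant names (for example "Article"). The exact `equals` comparison rejects "ARTICLE". A case-insensitive variant, shown only for comparison, would accept it but still cannot map PAPER_CONFERENCE.

```java
import java.util.EnumSet;
import java.util.Optional;

final class DisplayNameLookupSketch {
    // Hypothetical stand-in for StandardEntryType; display names are an assumption.
    enum EntryType {
        Article("Article"),
        Conference("Conference");

        private final String displayName;

        EntryType(String displayName) {
            this.displayName = displayName;
        }

        String getDisplayName() {
            return displayName;
        }
    }

    // The suggested lookup: exact comparison against the display name.
    static Optional<EntryType> byDisplayName(String typeStr) {
        return EnumSet.allOf(EntryType.class).stream()
                .filter(type -> type.getDisplayName().equals(typeStr))
                .findAny();
    }

    // Variant for comparison: ignores case, but still needs names to match otherwise.
    static Optional<EntryType> byDisplayNameIgnoreCase(String typeStr) {
        return EnumSet.allOf(EntryType.class).stream()
                .filter(type -> type.getDisplayName().equalsIgnoreCase(typeStr))
                .findAny();
    }

    public static void main(String[] args) {
        System.out.println(byDisplayName("ARTICLE"));                    // prints Optional.empty
        System.out.println(byDisplayNameIgnoreCase("ARTICLE"));          // prints Optional[Article]
        System.out.println(byDisplayNameIgnoreCase("PAPER_CONFERENCE")); // prints Optional.empty
    }
}
```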

New version~

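// CaseFormat and Enums are Guava classes (com.google.common.base).
private StandardEntryType typeStrToEnum(String typeStr) {
    StandardEntryType type;
    if ("PAPER_CONFERENCE".equals(typeStr)) {
        // Special case: the API name has no identically named constant.
        type = StandardEntryType.Conference;
    } else {
        // e.g. "ARTICLE" -> "Article"
        String upperCamelTypeStr = CaseFormat.UPPER_UNDERSCORE.to(CaseFormat.UPPER_CAMEL, typeStr);
        // Fall back to Article when no constant has that name.
        type = Enums.getIfPresent(StandardEntryType.class, upperCamelTypeStr).or(StandardEntryType.Article);
    }
    return type;
}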
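
For readers without Guava at hand, the same mapping can be approximated with the standard library. This is a sketch under assumptions, not the PR's code: `EntryType` is a hypothetical stand-in for StandardEntryType, `upperUnderscoreToUpperCamel` is a hand-written substitute for Guava's CaseFormat conversion, and inputs such as "TECH_REPORT" are illustrative rather than confirmed API values.

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.stream.Collectors;

final class EntryTypeMappingSketch {
    // Hypothetical stand-in for StandardEntryType.
    enum EntryType { Article, Book, Conference, TechReport }

    // "TECH_REPORT" -> "TechReport": capitalize each underscore-separated word.
    static String upperUnderscoreToUpperCamel(String value) {
        return Arrays.stream(value.split("_"))
                .filter(word -> !word.isEmpty())
                .map(word -> word.substring(0, 1).toUpperCase(Locale.ENGLISH)
                        + word.substring(1).toLowerCase(Locale.ENGLISH))
                .collect(Collectors.joining());
    }

    static EntryType typeStrToEnum(String typeStr) {
        // Special case from the discussion: the names differ, so map it by hand.
        if ("PAPER_CONFERENCE".equals(typeStr)) {
            return EntryType.Conference;
        }
        String upperCamel = upperUnderscoreToUpperCamel(typeStr);
        // Unknown names fall back to Article, mirroring the version above.
        return Arrays.stream(EntryType.values())
                .filter(type -> type.name().equals(upperCamel))
                .findFirst()
                .orElse(EntryType.Article);
    }

    public static void main(String[] args) {
        System.out.println(typeStrToEnum("PAPER_CONFERENCE")); // prints Conference
        System.out.println(typeStrToEnum("TECH_REPORT"));      // prints TechReport
        System.out.println(typeStrToEnum("SOMETHING_ELSE"));   // prints Article
    }
}
```

Falling back to Article means an unrecognized type never fails the import, at the cost of silently mislabeling it.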

Member

@Siedlerchr Siedlerchr left a comment

Overall, this already looks good to me; just some minor changes/improvements.

@Siedlerchr Siedlerchr added the status: ready-for-review Pull Requests that are ready to be reviewed by the maintainers label May 15, 2021
Member

@koppor koppor left a comment

Good job!

Thank you for creating tests!

Small comments on the tests themselves.

@@ -93,7 +94,7 @@ private WebFetchers() {
set.add(new MathSciNet(importFormatPreferences));
set.add(new ZbMATH(importFormatPreferences));
// see https://github.com/JabRef/jabref/issues/5804
// set.add(new ACMPortalFetcher(importFormatPreferences));

Member

Just remove the disabled lines

Contributor Author

fixed.

Member

May I ask whether you committed that change?

Contributor Author

I have deleted these two lines

// see https://github.com/JabRef/jabref/issues/5804 
// set.add(new ACMPortalFetcher(importFormatPreferences));

Do you mean that I also need to delete this line?

// set.add(new GoogleScholar(importFormatPreferences));

But that belongs to a different fetcher.

Member

Sorry, I was too late. Everything is all right! 🎉

Contributor Author

ruanych commented May 18, 2021

@Siedlerchr @koppor Thank you very much for your help and suggestions.
The problems above have been fixed, and I also updated some of the code.

Member

@koppor koppor left a comment

Thank you for the quick follow-up.

Some small nitpicks remain 😇

Member

koppor commented May 20, 2021

Thank you for working on it! This brings JabRef a step forward!

@koppor koppor merged commit 3b3e7d6 into JabRef:main May 20, 2021
Siedlerchr added a commit that referenced this pull request May 21, 2021
* upstream/main:
  Add more unit tests (#7638)
  Fix for issue 5804: Rewrite ACM fetcher (#7733)
Development

Successfully merging this pull request may close these issues.

Rewrite ACM fetcher, because of ACM web site changes
3 participants