Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/enable paginated fetchers #7082

Merged
merged 9 commits into from
Nov 15, 2020

Conversation

DominikVoigt
Copy link
Contributor

This PR adds complex search query support for paginated fetchers.
It additionally implements the corresponding interfaces for a couple of fetchers (arXiv, Scholar, IEEE Xplore, Springer Link).
Refs #6236, #5507, koppor#369, koppor#347

  • Change in CHANGELOG.md described (if applicable)
  • Tests created for changes (if applicable)
  • Manually tested changed features in running JabRef (always required)
  • Screenshots added in PR description (for UI changes)
  • Checked documentation: Is the information available and up to date? If not created an issue at https://github.com/JabRef/user-documentation/issues or, even better, submitted a pull request to the documentation repository.

Signed-off-by: Dominik Voigt <dominik.ingo.voigt@gmail.com>
Signed-off-by: Dominik Voigt <dominik.ingo.voigt@gmail.com>
// if there are too much requests from the same IP address google is answering with a 503 and redirecting to a captcha challenge
// The caught IOException looks for example like this:
// java.io.IOException: Server returned HTTP response code: 503 for URL: https://ipv4.google.com/sorry/index?continue=https://scholar.google.com/scholar%3Fhl%3Den%26btnG%3DSearch%26q%3Dbpmn&hl=en&q=CGMSBI0NBDkYuqy9wAUiGQDxp4NLQCWbIEY1HjpH5zFJhv4ANPGdWj0
if (e.getMessage().contains("Server returned HTTP response code: 503 for URL")) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This struck me recently, isn't there a way to check if the exception is a more specific exception so that one could check for the status code number?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not quite understand what the benefits of that would be.
Could you provide me with an example to grasp your idea, please? :)

Signed-off-by: Dominik Voigt <dominik.ingo.voigt@gmail.com>
Copy link
Member

@tobiasdiez tobiasdiez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have currently quite a lot of code duplications. The main sources are:

  1. Complex and string-based queries have to be implemented separately, leading to mostly the same code. Suggestion: remove string-based queries completely
  2. Unpaged search methods always fall back to paged search method with page number 0. Suggestion: put these fallbacks in the general PagedSearchFetcher interface.

@@ -234,8 +228,31 @@ public String getName() {

@Override
public URL getComplexQueryURL(ComplexSearchQuery complexSearchQuery) throws URISyntaxException, MalformedURLException {
return getComplexQueryURL(complexSearchQuery, 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be the default implementation, i.e. push to PagedSearchBasedParserFetcher

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have currently quite a lot of code duplications. The main sources are:

  1. Complex and string-based queries have to be implemented separately, leading to mostly the same code. Suggestion: remove string-based queries completely
  2. Unpaged search methods always fall back to paged search method with page number 0. Suggestion: put these fallbacks in the general PagedSearchFetcher interface.

Thanks for your feedback! :)
Regarding 1:
I think that this is a good idea but will require more work than just replacing the method.
This is because for the WebSearchPane for instance the normal string-based search is used as a fallback case when the query could not be parsed.
Therefore I do not feel comfortable just tossing the string-based version.
However, I will address this in the upcoming weeks :)!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I implemented all other suggestions

Copy link
Member

@tobiasdiez tobiasdiez Nov 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good!

Concerning the first point, doesn't it work to parse the queries as follows:
author=me and title=something -> ComplexQuery[author: "me", title: "something", rest: ""]
author=me something -> ComplexQuery[author: "me", title: "", rest: "something"]
something -> ComplexQuery[author: "", title: "", rest: ""something]

Then you don't need any fall-back to a purely string-based search.

In my opinion, this question should be resolved before changing the fetcher in other ways. Otherwise you have a lot of overhead/code duplication now, that will be removed later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I now removed the normal performSearch all together :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot! If you now also remove the "complex" in the names, I'm super happy ;-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I now removed the complex from perform search :)
Regarding the query object, I believe that it is sensible to keep that name as it will be extended in the future such as structured information.

Move common methods into default implementation

Signed-off-by: Dominik Voigt <dominik.ingo.voigt@gmail.com>
Signed-off-by: Dominik Voigt <dominik.ingo.voigt@gmail.com>
Signed-off-by: Dominik Voigt <dominik.ingo.voigt@gmail.com>
Adapt query parser and add new tests

Signed-off-by: Dominik Voigt <dominik.ingo.voigt@gmail.com>
@koppor koppor merged commit 00e3409 into JabRef:master Nov 15, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants