adding DOAB to web search #8598

Merged
merged 12 commits into from
Apr 20, 2022

Contributor

@Mohamadi98 Mohamadi98 commented Mar 22, 2022

  • Change in CHANGELOG.md described in a way that is understandable for the average user (if applicable)
  • Tests created for changes (if applicable)
  • Manually tested changed features in running JabRef (always required)
  • Screenshots added in PR description (for UI changes)
  • Checked documentation: Is the information available and up to date? If not, I created an issue at https://github.com/JabRef/user-documentation/issues or, even better, I submitted a pull request to the documentation repository.

Member

@Siedlerchr Siedlerchr left a comment

I was surprised to learn that it returns JSON as well. My browser showed XML...

@@ -103,7 +103,7 @@ public ArgumentProcessor(String[] args, Mode startupMode, PreferencesService pre
  ParserResult result = new ParserResult(entries);
  result.setToOpenTab();
  return Optional.of(result);
- } catch (ParseException e) {
+ } catch (ParseException | IOException e) {
Member

Please do not add the IOException here

Contributor Author

Please do not add the IOException here

Sure, I will remove the exceptions that I added and move the method you mentioned.
I have a question: do you think more attributes should be added to the resulting BibTeX entry? The API metadata returns much more information. I looked at other fetchers and added the common attributes for a start; tell me if you would like other attributes added. Personally, I think a description should be added.

Member

You should import all fields that can somehow be mapped to BibTeX/biblatex,
e.g. abstract (that is very useful), language, location, and maybe add the terms under other as keywords?

};
}

private JSONArray toJsonArray(InputStream stream) throws IOException {
Member

Move that method to the JsonReader util class as well and handle the exception there. Then you don't need to throw and declare an IOException everywhere.

@Siedlerchr Siedlerchr added the "status: changes required" label Mar 22, 2022
@Mohamadi98
Contributor Author

I was surprised to learn that it returns JSON as well. My browser showed XML...

No, you are correct: it returns XML. I convert the XML to JSON and then to BibTeX because JSON is easier for me to work with. I should have explained that better in the description.

BibEntry entry = new BibEntry();
for (int i = 0; i < metadataArray.length(); i++) {
JSONObject dataObject = metadataArray.getJSONObject(i);
if (dataObject.getString("key").equals("dc.type")) {
Member

You can also simplify this using a switch.
Extract dataObject.get("key") to a variable and then you can do a nice switch on that variable.

see https://www.baeldung.com/java-switch#switch-expressions and also
https://www.baeldung.com/java-switch-pattern-matching
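
The suggestion above could look roughly like the following. This is only a minimal sketch: it uses a small record in place of the org.json objects so that it runs without dependencies, the class and method names are hypothetical, and the key `dc.title` as well as the field names on the right-hand side are assumptions (only `dc.type` and `dc.contributor.author` appear in this thread).

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class DoabKeySwitchSketch {

    /** One metadata item, standing in for the JSON object with "key" and "value". */
    record MetadataItem(String key, String value) { }

    /** Maps a DOAB metadata key to a field name, or "" if the key is ignored. */
    static String fieldFor(String key) {
        return switch (key) {
            case "dc.type" -> "type";
            case "dc.contributor.author" -> "author";
            case "dc.title" -> "title"; // assumed key, not shown in the thread
            default -> "";
        };
    }

    /** Collects the mapped fields; values of repeated keys are joined with " and ". */
    static Map<String, String> toFields(List<MetadataItem> items) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (MetadataItem item : items) {
            String key = item.key(); // extracted once, then switched on
            String field = fieldFor(key);
            if (!field.isEmpty()) {
                fields.merge(field, item.value(), (a, b) -> a + " and " + b);
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(toFields(List.of(
                new MetadataItem("dc.type", "book"),
                new MetadataItem("dc.contributor.author", "Nathan Jun"))));
    }
}
```

Keeping the key-to-field mapping in one switch expression means a new metadata key only needs one additional case line.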

@Mohamadi98
Contributor Author

@Siedlerchr Hi,
I have made the fixes you requested, but I have a question:
how can I add multiple values to a single BibTeX entry field? For example, how do I add multiple author values to the AUTHOR field?

@Siedlerchr
Member

Siedlerchr commented Mar 25, 2022

Hi, sorry for the late response. For the authors: if each author is in its own field, you can directly use new Author(...) for each; otherwise you can use AuthorList.parse()...
The same applies to the editor field. Many books only have editors and no authors.

Have a look at the CrossRef Fetcher.

private String toAuthors(JSONArray authors) {
    if (authors == null) {
        return "";
    }
    // input: list of {"given":"A.","family":"Riel","affiliation":[]}
    return IntStream.range(0, authors.length())
                    .mapToObj(authors::getJSONObject)
                    .map((author) -> new Author(
                            author.optString("given", ""), "", "",
                            author.optString("family", ""), ""))
                    .collect(AuthorList.collect())
                    .getAsFirstLastNamesWithAnd();
}

@Mohamadi98
Contributor Author

@Siedlerchr Hi,
sorry for not replying, I have been very busy the last few days. Can you review the latest commit and see if there are any changes you would like?

Member

@Siedlerchr Siedlerchr left a comment

In general, this already looks good. Some minor improvements:

};
}

private BibEntry JsonToBibEntry(JSONArray metadataArray) {
Member

lowercase method name

BibEntry entry = JsonToBibEntry(metadataArray);
return Collections.singletonList(entry);
}
// multiple results
Member

The comment can be removed, it should be obvious


public class DOABFetcher implements SearchBasedParserFetcher {
private static final String SEARCH_URL = "https://directory.doabooks.org/rest/search?";
// private static final String PEER_REVIEW_URL = " https://directory.doabooks.org/rest/peerReviews?";
Member

not needed?

Contributor Author

Not sure how you want to integrate it. Should it be a different fetcher, DOABPeerReviews for example, or the same fetcher with two options for the user?
I am actually not sure I understand what peer reviews mean in the world of books.

Member

I would ignore it for the moment.

Member

Remove the comment then

return entry;
}

private Author toAuthor(String author) {
Member

What happens if you have multiple authors?
https://directory.doabooks.org/handle/20.500.12854/79348?show=full
And I think you can use AuthorList.parse() as the names are already in BibTeX format

Member

@k3KAW8Pnf7mkmdSMPHz27 k3KAW8Pnf7mkmdSMPHz27 Mar 31, 2022

+1 using AuthorList.parse. There are a lot of weird edge-cases when it comes to authors :/
(see comment https://github.com/JabRef/jabref/pull/8598/files#r839925364 for a bit more discussion)

Contributor Author

What happens if you have multiple authors? https://directory.doabooks.org/handle/20.500.12854/79348?show=full And I think you can use AuthorList.parse() as the names are already in BibTeX format

I created a list of Author objects; all authors are added to it and it is then converted to an AuthorList. I will add a test case for multiple authors. It works the same way as multiple editors, and there is already a test case for that. But I agree with you, there is probably a better solution. Can you describe what AuthorList.parse() actually does and how to use it? I am not sure I completely understand it.

The nutshell interpretation of AuthorList is that it is intended to represent one or more authors written in bibtex format.

AuthorList.parse is intended to take a String consisting of one or more authors in bibtex format, and store them internally in a List that you can access directly through getAuthors() if you need to.

For this case, I'd probably go with

authorsList.addAll(
        AuthorList.parse(dataObject.getString("value"))
                  .getAuthors());

even if we should only be getting one author from doab, and it isn't in bibtex format. Probably preprocess by checking for (Ed.) as in #8598 (comment)

Is that explanation/example useful?

You seem to already have nailed down how to deal with output from AuthorList. Do you have any questions regarding it?

Contributor Author

Thanks, that was useful, and definitely a cleaner solution. I will implement this and also start thinking about how to handle edge cases like the one you mentioned here #8598 (comment), especially the author name that starts with Fatima, which is formatted differently from the other authors.

Contributor Author

I also noticed that for some results the number-of-pages field comes in a different format, as in this result: https://directory.doabooks.org/rest/search?query=%22UAV%E2%80%90Based%20Remote%20Sensing%20Volume%202%22&expand=metadata
This can be problematic because the returned value is then no longer an integer but a string.

public void TestPerformSearch() throws FetcherException {
List<BibEntry> entries;
entries = fetcher.performSearch("i open fire");
assertFalse(entries.isEmpty());
Member

assertFalse and assertTrue make it hard to see why a test fails. In this case it's better to compare the entry collections,
e.g. assertEquals(Collections.singletonList(expectedEntry), entries)

Contributor Author

yes, sure

Comment on lines 85 to 91
List<Author> authorsList = new ArrayList<>();
List<Author> editorsList = new ArrayList<>();
StringJoiner keywordJoiner = new StringJoiner(",");
for (int i = 0; i < metadataArray.length(); i++) {
JSONObject dataObject = metadataArray.getJSONObject(i);
switch (dataObject.getString("key")) {
case "dc.contributor.author" -> authorsList.add(toAuthor(dataObject.getString("value")));
Member

@k3KAW8Pnf7mkmdSMPHz27 k3KAW8Pnf7mkmdSMPHz27 Mar 31, 2022

In the interest of trying to deepen the comment in https://github.com/JabRef/jabref/pull/8598/files#r839909593

In what circumstance(s) is the metadataArray larger than 1 item?

If it can not be guaranteed that metadataArray is exactly 1 item (in which case we have only one author and one editor field) I'd probably suggest
authorsList.addAll(AuthorList.parse(dataObject.getString("value")).getAuthors()) even if it is a bit ugly :/ (which should also cover the weird edge case of there being an author field but no parsable author)

It is probably possible to use StringJoiner with and, but it feels a bit hacky :/

I don't know, I guess the best-case scenario is if metadataArray is only one item, but I guess we can't always get what we wish for 😛

Member

@k3KAW8Pnf7mkmdSMPHz27 k3KAW8Pnf7mkmdSMPHz27 Mar 31, 2022

Looking at it a bit closer, it seems that I've spoken too soon.
@Mohamadi98 do you know if there is any information on what format they are using with more details?

E.g.,

<metadata>
    <key>dc.contributor.author</key>
    <language>*</language>
    <value>Felipe Gonzalez Toro (Ed.)</value>
</metadata>
<metadata>
    <key>dc.contributor.author</key>
    <language>*</language>
    <value>Kamruzzaman, Md. (Liton)</value>
</metadata>
<metadata>
    <key>dc.contributor.author</key>
    <language>*</language>
    <value>Fátima Velez de Castro</value>
</metadata>

is a bit weird.

Also follow-up in #8576 (comment)

Contributor Author

The data returned from the API is in XML format, and the information that I used for the BibTeX entries is inside an array called metadata, which you can include in or exclude from the results; see https://directory.doabooks.org/rest/search?query=%22the+deliverance+of+open+access+books%22 for an example. So I agree with you, it is a little bit weird.
I convert the format to JSON because it's easier for me to work with.

Member

@Mohamadi98 Most entries seem to be in the correct format already; however, some seem to list editors in the authors field.
We suggest you apply a minimal heuristic:

  1. Check if the dc.contributor.author value contains (Ed.)
  2. If yes -> remove the (Ed.) part
  3. Parse the value using AuthorList.parse(); it takes care of the different name formats
  4. Repeat for other editors
  5. Put the result in the editors field

The other, normal case:

  1. Check if the dc.contributor.author value contains (Ed.)
  2. If no -> parse it using AuthorList.parse()
  3. Repeat for other authors
  4. Put the result in the authors field

It's a bit of a mess with a list of list of authors... and the naming doesn't really help (yep, it's a mess^^)

But if you're stuck, just ask @k3KAW8Pnf7mkmdSMPHz27
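
The heuristic described above can be sketched as follows. This is a dependency-free illustration, not the code that was merged: the class, record, and method names are hypothetical, and where the real fetcher would pass each cleaned value to AuthorList.parse(), this sketch simply keeps the cleaned strings.

```java
import java.util.ArrayList;
import java.util.List;

final class EditorMarkerSketch {

    private static final String EDITOR_MARKER = "(Ed.)";

    /** Contributor values split into authors and editors. */
    record Contributors(List<String> authors, List<String> editors) { }

    static Contributors split(List<String> contributorValues) {
        List<String> authors = new ArrayList<>();
        List<String> editors = new ArrayList<>();
        for (String value : contributorValues) {
            if (value.contains(EDITOR_MARKER)) {
                // Marker found: strip it and treat the name as an editor
                editors.add(value.replace(EDITOR_MARKER, "").trim());
            } else {
                // Normal case: the value is an author
                authors.add(value.trim());
            }
            // In JabRef, each cleaned value would then go through AuthorList.parse()
        }
        return new Contributors(authors, editors);
    }

    public static void main(String[] args) {
        System.out.println(split(List.of(
                "Felipe Gonzalez Toro (Ed.)",
                "Kamruzzaman, Md. (Liton)")));
    }
}
```

Note that a parenthesised part other than (Ed.), such as "(Liton)" in the sample data, is left untouched and still ends up in the authors list.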

Contributor Author

AuthorList.parse can also handle names that are not in (LastName, FirstName) format. You will see in one of the tests in the latest commit that names returned from the API in (FirstName LastName) format are parsed correctly, so this is nice.

private String toAuthorList(List<Author> authorsList) {
return AuthorList.of(authorsList).getAsFirstLastNamesWithAnd();
private String namePreprocessing(String name) {
name = String.valueOf(new StringBuilder(name).delete(name.length() - 5, name.length()));
Member

Why not a simple replace? e.g. name.replace("(Eds)","") ?

@Mohamadi98
Contributor Author

I think this currently works fine for most entries from the API.
There are still some entries where the API is not consistent in its format.
What do you think? @Siedlerchr @k3KAW8Pnf7mkmdSMPHz27

@Siedlerchr
Member

@Mohamadi98 Cool! Thanks for the work so far! I guess we can't handle all edge cases. For example, if you cannot parse the author correctly, you should nonetheless add the rest of the fields that are available, so the user has the option to manually edit/correct the entries.

Otherwise, looks good already. It would also be nice to implement the FullTextFetcher Interface, so that the PDF can be automatically downloaded. You can also do this as an extra PR.

@Mohamadi98
Contributor Author

Yes, actually all names are parsed correctly with respect to the result from the API, though maybe not with respect to the real-life name; in that case, as you said, the user can edit or correct it. I will just add a try/catch for the page-number case: there are a few results where the number of pages is in the format "(roman number), number", which causes an error because in this format the value is a String and not an integer as in the normal format. The try/catch should handle that fine.
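
A defensive parse of the page count along these lines might look like the sketch below. It works on plain strings rather than on the JSON object, the class and method names are hypothetical, and the sample value "xii, 250" is made up to match the "(roman number), number" pattern described above.

```java
import java.util.Optional;

final class PageCountSketch {

    /** Returns the page count if the raw value is a plain integer, otherwise empty. */
    static Optional<Integer> parsePages(String raw) {
        if (raw == null) {
            return Optional.empty();
        }
        try {
            return Optional.of(Integer.parseInt(raw.trim()));
        } catch (NumberFormatException e) {
            // Values such as "xii, 250" are not plain integers; skip the field
            // instead of failing the whole entry
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(parsePages("272"));
        System.out.println(parsePages("xii, 250"));
    }
}
```

Returning an empty Optional lets the caller leave the pages field unset while still filling in every other field of the entry.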

@Mohamadi98
Contributor Author

Sure, no problem. But to make sure I understand: the FullTextFetcher is supposed to find the page of a bib entry, for example a book, probably using its DOI or URI. Is that correct, or is it something else?

@Mohamadi98 Mohamadi98 marked this pull request as ready for review April 5, 2022 23:36
@Siedlerchr
Member

@Mohamadi98 Yes, a full-text fetcher takes a bib entry and then returns the PDF document for it, e.g. by checking for the DOI or any other field (e.g. title; this depends on the concrete fetcher). Whenever possible, a unique identifier like DOI or ISBN should be used to find an entry, as other fields can be ambiguous.

When you have an entry in JabRef and say "Find full text online", JabRef calls multiple fetchers until one returns the PDF file. The fetchers also have a trust level, e.g. if the entry or the PDF comes from the publisher website, it is considered more trustworthy than when it comes from a website like ResearchGate.
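
To make the idea concrete, here is a small stand-alone sketch. It does not use JabRef's actual FullTextFetcher interface: the `Fetcher` record, the integer trust score, and the choice to ask fetchers from most to least trusted are all assumptions made for illustration.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

final class FullTextLookupSketch {

    /** Hypothetical stand-in for a full-text fetcher: a trust score plus a lookup by DOI. */
    record Fetcher(String name, int trust, Function<String, Optional<String>> findPdfUrl) { }

    /** Asks fetchers from most to least trusted and returns the first PDF location found. */
    static Optional<String> findFullText(String doi, List<Fetcher> fetchers) {
        return fetchers.stream()
                .sorted(Comparator.comparingInt(Fetcher::trust).reversed())
                .map(fetcher -> fetcher.findPdfUrl().apply(doi))
                .flatMap(Optional::stream)
                .findFirst();
    }

    public static void main(String[] args) {
        List<Fetcher> fetchers = List.of(
                new Fetcher("aggregator", 1, doi -> Optional.of("https://example.org/copy.pdf")),
                new Fetcher("publisher", 5, doi -> Optional.empty()));
        System.out.println(findFullText("10.5040/9781501306785", fetchers));
    }
}
```

Because the stream is evaluated lazily after sorting, fetchers ranked below the first one that returns a result are never called.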

@koppor koppor added this to the v5.6 milestone Apr 11, 2022
@koppor
Member

koppor commented Apr 11, 2022

DevCall decision: @Siedlerchr tries this code locally and if it works, we merge. The FullTextFetcher can then go into a follow-up pull request.

@koppor
Member

koppor commented Apr 11, 2022

An entry to the CHANGELOG.md should be added, too.

@ThiloteE
Member

ThiloteE commented Apr 11, 2022

Searching for "politics". Found:

@Misc{Jun2011aap,
  author    = {Nathan Jun},
  date      = {2011},
  title     = {Anarchism and Political Modernity},
  doi       = {10.5040/9781501306785},
  language  = {English},
  type      = {book},
  abstract  = {This book is available as open access through the Bloomsbury Open Access programme and is available on www.bloomsburycollections.com. Anarchism and Political Modernity looks at the place of 'classical anarchism' in the postmodern political discourse, claiming that anarchism presents a vision of political postmodernity. The book seeks to foster a better understanding of why and how anarchism is growing in the present. To do so, it first looks at its origins and history, offering a different view from the two traditions that characterize modern political theory: socialism and liberalism. Such an examination leads to a better understanding of how anarchism connects with newer political trends and why it is a powerful force in contemporary social and political movements. This new volume in the Contemporary Anarchist Studies series offers a novel philosophical engagement with anarchism and contests a number of positions established in postanarchist theory. Its new approach makes a valuable contribution to an established debate about anarchism and political theory. It offers a new perspective on the emerging area of anarchist studies that will be of interest to students and theorists in political theory and anarchist studies.},
  keywords  = {Politics & International Relations, Political Theory and Philosophy (Politics), Social and Political Philosophy (Philosophy), Political and Legal Philosophy (Politics), Supplementary Standard},
  pages     = {272},
  publisher = {Bloomsbury Academic},
  uri       = {https://directory.doabooks.org/handle/20.500.12854/77385},
}

But when importing via DOI this emerges:

@Book{Jun_2012,
  author    = {Nathan Jun},
  date      = {2012},
  title     = {Anarchism and Political Modernity},
  doi       = {10.5040/9781501306785},
  publisher = {Continuum},
}

To do:

  • set entry type (type is probably the entry type).
    Example:
    @Book{citationkey, instead of @Misc{citationkey,
  • Would it be possible to fetch the URL? Maybe copy the uri (as it seems to be a stable link at doabooks.org) to the url field. Could be done in a follow up pull request. Nothing that blocks this one.
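
Both to-do items can be illustrated with a short sketch. It represents an entry as a plain map of field names to values instead of a JabRef BibEntry, and the class and method names are hypothetical; only the "book" type value and the uri/url fields come from the entries shown above.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

final class EntryTypeSketch {

    /** Picks an entry type from the DOAB "type" value, falling back to Misc. */
    static String entryTypeFor(String doabType) {
        if (doabType == null) {
            return "Misc";
        }
        return switch (doabType.trim().toLowerCase(Locale.ROOT)) {
            case "book" -> "Book";
            default -> "Misc"; // other type values are not shown in the thread
        };
    }

    /** Returns a copy of the fields with "url" filled from "uri" when "url" is absent. */
    static Map<String, String> withUrlFromUri(Map<String, String> fields) {
        Map<String, String> copy = new LinkedHashMap<>(fields);
        String uri = copy.get("uri");
        if (uri != null) {
            copy.putIfAbsent("url", uri);
        }
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(entryTypeFor("book"));
        System.out.println(withUrlFromUri(
                Map.of("uri", "https://directory.doabooks.org/handle/20.500.12854/77385")));
    }
}
```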

@ThiloteE
Member

Also, see following entries:


@Misc{Ledeneva2018tge,
  title     = {The Global Encyclopaedia of informality, Volume 2},
  abstract  = {Alena Ledeneva invites you on a voyage of discovery to explore society’s open secrets, unwritten rules and know-how practices. Broadly defined as ‘ways of getting things done’, these invisible yet powerful informal practices tend to escape articulation in official discourse. They include emotion-driven exchanges of gifts or favours and tributes for services, interest-driven know-how (from informal welfare to informal employment and entrepreneurship), identity-driven practices of solidarity, and power-driven forms of co-optation and control. The paradox, or not, of the invisibility of these informal practices is their ubiquity. Expertly practised by insiders but often hidden from outsiders, informal practices are, as this book shows, deeply rooted all over the world, yet underestimated in policy. Entries from the five continents presented in this volume are samples of the truly global and ever-growing collection, made possible by a remarkable collaboration of over 200 scholars across disciplines and area studies. By mapping the grey zones, blurred boundaries, types of ambivalence and contexts of complexity, this book creates the first Global Map of Informality. The accompanying database (www.in-formality.com) is searchable by region, keyword or type of practice, so do explore what works, how, where and why!},
  date      = {2018},
  doi       = {10.14324/111.9781787351899},
  editor    = {Alena Ledeneva},
  keywords  = {global informality, informality, informal practices, Russia},
  language  = {English},
  pages     = {568},
  publisher = {UCL Press},
  type      = {book},
  uri       = {https://directory.doabooks.org/handle/20.500.12854/37655},
}

The volume is in the title. Maybe Volume could be parsed from there?
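
One way to do that is a regular expression on the end of the title. The sketch below is an assumption about how it could be done, not something from the pull request; it only recognises a trailing "Volume <number>" and the class and method names are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class VolumeFromTitleSketch {

    // A trailing "Volume <number>", optionally preceded by a comma
    private static final Pattern TRAILING_VOLUME =
            Pattern.compile(",?\\s*Volume\\s+(\\d+)\\s*$", Pattern.CASE_INSENSITIVE);

    /** Returns the volume number found at the end of the title, if any. */
    static Optional<String> volumeOf(String title) {
        Matcher matcher = TRAILING_VOLUME.matcher(title);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    /** Returns the title with a trailing volume marker removed. */
    static String titleWithoutVolume(String title) {
        return TRAILING_VOLUME.matcher(title).replaceFirst("").trim();
    }

    public static void main(String[] args) {
        String title = "The Global Encyclopaedia of informality, Volume 2";
        System.out.println(volumeOf(title));
        System.out.println(titleWithoutVolume(title));
    }
}
```

Titles without a trailing volume marker are returned unchanged, so the helper can be applied to every entry.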


@Misc{Burnell2006gd,
  author    = {Peter Burnell},
  date      = {2006},
  title     = {Globalising Democracy},
  doi       = {10.4324/9780203965719},
  language  = {English},
  type      = {book},
  abstract  = {This volume brings together expert contributors to explore the intersection of two major contemporary themes: globalization, and the contribution that both domestic party politics and international party support make to democratization.         Globalising Democracy clearly shows what globalization means for domestic and international efforts to build effective political parties and competitive party systems in new and emerging democracies. Contrasting perspectives are presented through fresh case studies of European post-communist countries, Africa and Turkey. The reader is clearly shown how international party assistance is a manifestation and vehicle of globalization, and explores how it may be assessed in terms of:              global economic integration       the growth of global communications       the development and implications for party politics of multi-level governance.        This is the first book to analyze the impact of globalization on democracy and will be of great interest to all students of international relations, governance and politics.},
  keywords  = {assistance, party, aid, promotion, system, political, international, zeeuw, civil, society},
  publisher = {Taylor & Francis},
  uri       = {https://directory.doabooks.org/handle/20.500.12854/32659},
}
  • on the webpage you will find the ISBN. If possible, JabRef should fetch the ISBN
  • You can see that the abstract has these odd empty spaces in between. Is this because JabRef is doing some postprocessing or is this something from DOAB side? @koppor ?
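
Wherever the extra spaces come from, they could be collapsed before the abstract is stored. The following is a minimal sketch with a hypothetical helper name; it assumes the runs consist of ordinary whitespace characters, which the thread does not confirm.

```java
final class AbstractWhitespaceSketch {

    /** Collapses every run of whitespace into a single space and trims the ends. */
    static String collapseWhitespace(String text) {
        return text.replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        System.out.println(collapseWhitespace(
                "two major contemporary themes: globalization, and democratization.         Globalising Democracy"));
    }
}
```

By default `\s` in Java does not match a non-breaking space, so if the source data uses those, the pattern would need to be extended.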

@Misc{Crewe2021tao,
  author    = {Emma Crewe},
  date      = {2021},
  title     = {The Anthropology of Parliaments},
  doi       = {10.4324/9781003084488},
  language  = {English},
  type      = {book},
  abstract  = {The Anthropology of Parliaments offers a fresh, comparative approach to analysing parliaments and democratic politics, drawing together rare ethnographic work by anthropologists and politics scholars from around the world. Crewe’s insights deepen our understanding of the complexity of political institutions. She reveals how elected politicians navigate relationships by forging alliances and thwarting opponents; how parliamentary buildings are constructed as sites of work, debate and the nation in miniature; and how politicians and officials engage with hierarchies, continuity and change. This book also proposes how to study parliaments through an anthropological lens while in conversation with other disciplines. The dive into ethnographies from across Africa, the Americas, Asia, Europe, the Middle East and the Pacific Region demolishes hackneyed geo-political categories and culminates in a new comparative theory about the contradictions in everyday political work. This important book will be of interest to anyone studying parliaments but especially those in the disciplines of anthropology and sociology; politics, legal and development studies; and international relations.},
  keywords  = {Anthropology, Comparative politics, Political structures: democracy, Social and cultural anthropology},
  pages     = {242},
  publisher = {Taylor & Francis},
  uri       = {https://directory.doabooks.org/handle/20.500.12854/69644},
}
  • on the webpage of doabooks.org, you will find the "title" and "subtitle" of this entry. The biblatex documentation (pages 9, 19 and 26) shows that this is a field that probably should be fetched if you can. I would suggest putting it into the field subtitle. Of course, only if it is manageable to fetch it.
  • I checked the BibTeX documentation. There, subtitle does not need to be rendered, because subtitle does not exist for the entry type book, but this info is just for the record. Probably not relevant for this pull request.

@Siedlerchr
Member

@Mohamadi98 It would be cool if you could address the remaining comments so we can merge this

Member

@koppor koppor left a comment

Some small comments on the test organization.

I would also like to see .withField instead of .setField. See #8693 for an example of the outcome.

Comment on lines 202 to 225
@Test
public void TestPerformSearch() throws FetcherException {
List<BibEntry> entries;
entries = fetcher.performSearch("i open fire");
assertEquals(Collections.singletonList(David_Opal), entries);
}

@Test
public void TestPerformSearch2() throws FetcherException {
List<BibEntry> entries;
entries = fetcher.performSearch("the deliverance of open access books");
assertEquals(Collections.singletonList(Ronald_Snijder), entries);
}

@Test
public void TestPerformSearch3() throws FetcherException {
List<BibEntry> entries;
entries = fetcher.performSearch("Four Kingdom Motifs before and beyond the Book of Daniel");
assertEquals(Collections.singletonList(Andrew_Perrin), entries);
}

@Test
public void TestPerformSearch4() throws FetcherException {
List<BibEntry> entries;
entries = fetcher.performSearch("UAV Sensors for Environmental Monitoring");
assertEquals(Collections.singletonList(Felipe_Gonzalez), entries);
}

@Test
public void TestPerformSearch5() throws FetcherException {
List<BibEntry> entries;
entries = fetcher.performSearch("The symbiosis between information system project complexity and information system project success");
assertEquals(Collections.singletonList(Carl_Marnewick), entries);
}

@Test
public void TestPerformSearch6() throws FetcherException {
List<BibEntry> entries;
entries = fetcher.performSearch("UAV‐Based Remote Sensing Volume 2");
assertEquals(Collections.singletonList(Antonios_Tsourdos), entries);
}
Member

This should be converted to a parameterized test --> https://www.baeldung.com/parameterized-tests-junit-5

Contributor Author

This should be converted to a parameterized test --> https://www.baeldung.com/parameterized-tests-junit-5

Hi, do you mind taking a look at the test file in the last commit? I still want to use assertEquals or something similar, but I couldn't use it with a parameterized test. Any advice on that?

Contributor Author

The problem is that the value of the expected variable in the test now has to change with every new test case from the string array in ValueSource under ParameterizedTest.

}

@Test
public void TestGetName() {
Member

Method names should start with a lower case:

Suggested change
- public void TestGetName() {
+ public void testGetName() {

@Mohamadi98 Mohamadi98 closed this Apr 19, 2022
@Mohamadi98 Mohamadi98 reopened this Apr 19, 2022
@Mohamadi98
Contributor Author

Mohamadi98 commented Apr 19, 2022

To do:

  • set entry type (type is probably the entry type).
    Example:
    @Book{citationkey, instead of @Misc{citationkey,
  • Would it be possible to fetch the URL? Maybe copy the uri (as it seems to be a stable link at doabooks.org) to the url field. Could be done in a follow up pull request. Nothing that blocks this one.

Okay, I think this could be done. Can you explain what effect changing the type has? I don't think I fully understand that part of BibTeX.
For the URL part, having both url and uri can be done; I will add them.
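
A minimal sketch of the uri-to-url copy suggested above, assuming the fetched fields are held in a plain map. The names `UriToUrlCopier` and `withUrlFromUri` are hypothetical and not part of JabRef; a real fetcher would work on JabRef's own entry class instead.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only, not part of JabRef.
class UriToUrlCopier {

    // Returns a copy of the fields in which "url" mirrors "uri",
    // unless a "url" value is already present.
    static Map<String, String> withUrlFromUri(Map<String, String> fields) {
        Map<String, String> result = new LinkedHashMap<>(fields);
        String uri = result.get("uri");
        if (uri != null && !result.containsKey("url")) {
            result.put("url", uri);
        }
        return result;
    }
}
```

The original uri field is kept, so both fields end up in the entry, and an existing url is never overwritten.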

@Mohamadi98
Contributor Author

Mohamadi98 commented Apr 19, 2022

Also, see following entries:

@Misc{Ledeneva2018tge,
  title     = {The Global Encyclopaedia of informality, Volume 2},
  abstract  = {Alena Ledeneva invites you on a voyage of discovery to explore society’s open secrets, unwritten rules and know-how practices. Broadly defined as ‘ways of getting things done’, these invisible yet powerful informal practices tend to escape articulation in official discourse. They include emotion-driven exchanges of gifts or favours and tributes for services, interest-driven know-how (from informal welfare to informal employment and entrepreneurship), identity-driven practices of solidarity, and power-driven forms of co-optation and control. The paradox, or not, of the invisibility of these informal practices is their ubiquity. Expertly practised by insiders but often hidden from outsiders, informal practices are, as this book shows, deeply rooted all over the world, yet underestimated in policy. Entries from the five continents presented in this volume are samples of the truly global and ever-growing collection, made possible by a remarkable collaboration of over 200 scholars across disciplines and area studies. By mapping the grey zones, blurred boundaries, types of ambivalence and contexts of complexity, this book creates the first Global Map of Informality. The accompanying database (www.in-formality.com) is searchable by region, keyword or type of practice, so do explore what works, how, where and why!},
  date      = {2018},
  doi       = {10.14324/111.9781787351899},
  editor    = {Alena Ledeneva},
  keywords  = {global informality, informality, informal practices, Russia},
  language  = {English},
  pages     = {568},
  publisher = {UCL Press},
  type      = {book},
  uri       = {https://directory.doabooks.org/handle/20.500.12854/37655},
}

The volume is in the title. Maybe Volume could be parsed from there?

@Misc{Burnell2006gd,
  author    = {Peter Burnell},
  date      = {2006},
  title     = {Globalising Democracy},
  doi       = {10.4324/9780203965719},
  language  = {English},
  type      = {book},
  abstract  = {This volume brings together expert contributors to explore the intersection of two major contemporary themes: globalization, and the contribution that both domestic party politics and international party support make to democratization.         Globalising Democracy clearly shows what globalization means for domestic and international efforts to build effective political parties and competitive party systems in new and emerging democracies. Contrasting perspectives are presented through fresh case studies of European post-communist countries, Africa and Turkey. The reader is clearly shown how international party assistance is a manifestation and vehicle of globalization, and explores how it may be assessed in terms of:              global economic integration       the growth of global communications       the development and implications for party politics of multi-level governance.        This is the first book to analyze the impact of globalization on democracy and will be of great interest to all students of international relations, governance and politics.},
  keywords  = {assistance, party, aid, promotion, system, political, international, zeeuw, civil, society},
  publisher = {Taylor & Francis},
  uri       = {https://directory.doabooks.org/handle/20.500.12854/32659},
}
  • on the webpage you will find the ISBN. If possible, JabRef should fetch the ISBN
  • You can see that the abstract has these odd empty spaces in between. Is this because JabRef is doing some postprocessing or is this something from DOAB side? @koppor ?
@Misc{Crewe2021tao,
  author    = {Emma Crewe},
  date      = {2021},
  title     = {The Anthropology of Parliaments},
  doi       = {10.4324/9781003084488},
  language  = {English},
  type      = {book},
  abstract  = {The Anthropology of Parliaments offers a fresh, comparative approach to analysing parliaments and democratic politics, drawing together rare ethnographic work by anthropologists and politics scholars from around the world. Crewe’s insights deepen our understanding of the complexity of political institutions. She reveals how elected politicians navigate relationships by forging alliances and thwarting opponents; how parliamentary buildings are constructed as sites of work, debate and the nation in miniature; and how politicians and officials engage with hierarchies, continuity and change. This book also proposes how to study parliaments through an anthropological lens while in conversation with other disciplines. The dive into ethnographies from across Africa, the Americas, Asia, Europe, the Middle East and the Pacific Region demolishes hackneyed geo-political categories and culminates in a new comparative theory about the contradictions in everyday political work. This important book will be of interest to anyone studying parliaments but especially those in the disciplines of anthropology and sociology; politics, legal and development studies; and international relations.},
  keywords  = {Anthropology, Comparative politics, Political structures: democracy, Social and cultural anthropology},
  pages     = {242},
  publisher = {Taylor & Francis},
  uri       = {https://directory.doabooks.org/handle/20.500.12854/69644},
}
  • on the webpage of doabooks.org, you will find the "title" and "subtitle" of this entry. Biblatex documentation on page 9, 19 and 26 shows that this is a field that probably should be fetched if you can. I would suggest putting it into the field subtitle. Of course, only if it is manageable to fetch it.
  • I checked the BibTeX documentation. There, subtitle does not need to be rendered, because subtitle does not exist for the entry type book, but this info is just for the record. Probably not relevant for this pull request.

Actually, there is a field for volume in the API results, but it is not always present: for the example you mentioned there is none, but there is for other results. I think adding that field is a more general solution than parsing the volume number from the title string, which could cause problems in cases where the actual word "volume" is part of the title.

I can't find a field for the ISBN in the API results, maybe I could communicate with Ronald from the issue here #8576 to ask about ISBN.

For the abstract part, this is the format that comes directly from the API.

@Mohamadi98
Contributor Author

Mohamadi98 commented Apr 19, 2022

Unfortunately, I can't find a field for subtitle in the API results, maybe it could be fetched from the webpage, but I don't know of a way to access webpages from jabref and fetch data from the page.

@Mohamadi98
Contributor Author

Some small comments to the test organization.

I would also like to see .withField instead of .setField. See #8693 for an example of the outcome.

Sure, but can you explain the difference between setField and withField?

@ThiloteE
Member

Fixing the entry type is comparatively important (More important than e.g. fetching the missing subtitle field).

See here for a rough explanation: https://www.openoffice.org/bibliographic/bibtex-defs.html

In short:

When entering a reference in the database, the first thing to decide is what type of entry it is. No fixed classification scheme can be complete, but BibTeX provides enough entry types to handle almost any reference reasonably well.

References to different types of publications contain different information; a reference to a journal article might include the volume and number of the journal, which is usually not meaningful for a book. Therefore, database entries of different types have different fields.

As a consequence, citation styles render entries differently depending on the entry type. If the entry type is @misc, the bibliography will look different from one whose entries have the type @article.

The entry type is offered in the API probably via the following:

dc.type

You already fetched it, since it shows in the bibliography as

type = {book},

All that needs to be done here is to do the conversion.
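
The conversion could look roughly like the following sketch. It assumes the `dc.type` value arrives as a plain string. `DoabTypeMapper` and `toEntryType` are made-up names for illustration, entry types are represented as strings rather than JabRef's entry type classes, and only the `book` value is taken from the entries above.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of JabRef: maps a dc.type string to an entry type name.
class DoabTypeMapper {

    // Only "book" is confirmed by the fetched entries; extend as other values turn up.
    private static final Map<String, String> ENTRY_TYPES = Map.of("book", "Book");

    // Fallback for a missing or unknown type, matching the @Misc seen in the fetched entries.
    private static final String DEFAULT_ENTRY_TYPE = "Misc";

    static String toEntryType(String dcType) {
        if (dcType == null) {
            return DEFAULT_ENTRY_TYPE;
        }
        // Normalize case and surrounding whitespace before the lookup.
        String normalized = dcType.trim().toLowerCase(Locale.ROOT);
        return ENTRY_TYPES.getOrDefault(normalized, DEFAULT_ENTRY_TYPE);
    }
}
```

Falling back to Misc keeps unknown types harmless: the entry is still imported, just without a more specific type.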


I can't find a field for the ISBN in the API results, maybe I could communicate with Ronald from the issue here #8576 to ask about ISBN.

For this entry, click "show full item record" at the bottom left; you will see the ISBN information:

oapen.relation.isbn 9780415401845;9780415401838;9781134143894;9781134143887;9781134143849
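
Since the value is a semicolon-separated list, it would need to be split before use. A rough sketch, with `IsbnFieldSplitter` and `splitIsbns` as hypothetical names; which of the ISBNs ends up in the entry's isbn field is left open here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of JabRef: splits an oapen.relation.isbn value.
class IsbnFieldSplitter {

    static List<String> splitIsbns(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            return List.of();
        }
        return Arrays.stream(raw.split(";"))
                     .map(String::trim)
                     .filter(isbn -> !isbn.isEmpty()) // skip empty parts, e.g. from ";;"
                     .collect(Collectors.toList());
    }
}
```

For the value shown above, this yields five separate ISBN strings in their original order.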

Unfortunately, I can't find a field for subtitle in the API results, maybe it could be fetched from the webpage, but I don't know of a way to access webpages from jabref and fetch data from the page.

Same procedure for this entry yields the subtitle:

dc.title.alternative Entanglements in Democratic Politics

and I think fetching "imprint" was also missing, so here we go:

oapen.imprint Routledge

Finally, with regard to:

  • Maybe parse volume from title field?
  • the spaces in the abstract

These two issues are not important. I just noticed, but they are very low priority. They were more a question than a proposal. Don't do them. They do not prevent merging this pull request at all.


Overall, the entries I fetched look already pretty good! Good job!

@koppor
Member

koppor commented Apr 19, 2022

Some small comments to the test organization.
I would also like to see .withField instead of .setField. See #8693 for an example of the outcome.
Sure, but can you explain the difference between setField and withField?

withField is the fluent API, similar to a builder, whereas setField is old-school style.
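
To illustrate the difference, here is a small sketch using a stand-in class. `SimpleEntry` is hypothetical and much simpler than JabRef's actual entry class; it only shows why returning `this` allows chaining.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for illustration, not JabRef's entry class.
class SimpleEntry {

    private final Map<String, String> fields = new LinkedHashMap<>();

    // Old-school setter: changes the entry and returns nothing.
    void setField(String name, String value) {
        fields.put(name, value);
    }

    // Fluent variant: changes the entry and returns it, so calls can be chained.
    SimpleEntry withField(String name, String value) {
        fields.put(name, value);
        return this;
    }

    String getField(String name) {
        return fields.get(name);
    }
}
```

With the fluent variant, an expected entry in a test can be written as one expression, for example `new SimpleEntry().withField("title", "Globalising Democracy").withField("date", "2006")`, instead of one `setField` statement per field.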

@Mohamadi98 Mohamadi98 force-pushed the add-DOAB-to-web-search branch from 5ef9e06 to 4042bcf Compare April 20, 2022 08:57
@Mohamadi98
Contributor Author

I added some things. Can you check the last commit and let me know if you have any requests or feedback?

Also, I added oapen.relation.isbn, so if that field is in the API result it will be fetched and assigned to the BibEntry. However, although you found it on the webpage, it does not seem to appear in the API results I have seen, including the example you mentioned. Here is the API response for the query of the book you mentioned: https://directory.doabooks.org/rest/search?query=%22Globalising%20Democracy%22&expand=metadata

@koppor
Member

koppor commented Apr 20, 2022

Note that the "secret sauce" of Parameterized tests is:

        return Stream.of(
                Arguments.of(

We applied that pattern at e809885 (#8598)

koppor and others added 4 commits April 20, 2022 21:51
- Add CHANGELOG.md entry
- Fix checkstyle

Co-authored-by: Christoph <siedlerkiller@gmail.com>
Co-authored-by: Carl Christian Snethlage <50491877+calixtus@users.noreply.github.com>
…there are two results)

Co-authored-by: Christoph <siedlerkiller@gmail.com>
Co-authored-by: Carl Christian Snethlage <50491877+calixtus@users.noreply.github.com>
@koppor koppor force-pushed the add-DOAB-to-web-search branch from 203b152 to d677bf1 Compare April 20, 2022 19:52
@koppor koppor removed the status: changes required Pull requests that are not yet complete label Apr 20, 2022
@koppor koppor merged commit f220575 into JabRef:main Apr 20, 2022
@koppor
Member

koppor commented Apr 20, 2022

@Mohamadi98 Thank you for your efforts in implementing this. I hope you learned something here 🎉. Maybe you can take a look at our final version of the parameterized tests.

@ThiloteE In case there still is something missing, please open a new issue.

@ThiloteE
Member

Thank you Mohamadi! :)

@ThiloteE
Member

@Mohamadi98

By the way, if you write

"Fixes #8576", it will automatically close the issue when the pull request is merged :-)

@Mohamadi98
Contributor Author

@Mohamadi98 Thank you for your efforts in implementing this. I hope you learned something here 🎉. Maybe you can take a look at our final version of the parameterized tests.

@ThiloteE In case there still is something missing, please open a new issue.

I will definitely look at it. It was a pleasure, thanks!

@Mohamadi98
Contributor Author

Thank you Mohamadi! :)

It was fun, thank you!
