adding DOAB to web search #8598

Mohamadi98 · 2022-03-22T04:24:32Z

Fixes issue #8576, "Add Directory of Open Access Books (DOAB) fetcher to web search", by adding a DOAB fetcher to the list of web searches.
The DOABFetcher class is added to org.jabref.logic.importer.fetcher. It calls the API https://www.doabooks.org/en/resources/metadata-harvesting-and-content-dissemination of the library https://www.doabooks.org/ and converts the JSON-formatted results into BibTeX entries.
An instance of the class is added in the org.jabref.logic.importer.WebFetchers#getSearchBasedFetchers method.
Some tests are provided for the whole process.
This is an initial solution for review; there will be more optimizations.

Change in CHANGELOG.md described in a way that is understandable for the average user (if applicable)
Tests created for changes (if applicable)
Manually tested changed features in running JabRef (always required)
Screenshots added in PR description (for UI changes)
Checked documentation: Is the information available and up to date? If not, I created an issue at https://github.com/JabRef/user-documentation/issues or, even better, I submitted a pull request to the documentation repository.

Siedlerchr

I was surprised to learn that it returns JSON as well. My browser showed XML...

Siedlerchr · 2022-03-22T18:21:12Z

src/main/java/org/jabref/cli/ArgumentProcessor.java

@@ -103,7 +103,7 @@ public ArgumentProcessor(String[] args, Mode startupMode, PreferencesService pre
            ParserResult result = new ParserResult(entries);
            result.setToOpenTab();
            return Optional.of(result);
-        } catch (ParseException e) {
+        } catch (ParseException | IOException e) {


Please do not add the IOException here

Sure, I will remove the exceptions that I added and move the method you mentioned.
I have a question: do you think more attributes should be added to the resulting BibTeX? The API metadata returns much more information. I looked at other fetchers and added the common attributes for a start; tell me if you would like other attributes to be added. I think a description should be added.

You should import all fields that can somehow be mapped to BibTeX/biblatex,
e.g. abstract (that is very useful), language, location, and maybe add the terms under other as keywords?

Siedlerchr · 2022-03-22T18:55:41Z

src/main/java/org/jabref/logic/importer/fetcher/DOABFetcher.java

+        };
+    }
+
+    private JSONArray toJsonArray(InputStream stream) throws IOException {


Move that method to the JsonReader util class as well and handle the exception there. Then you don't need to throw and add an IOException everywhere.

Mohamadi98 · 2022-03-23T04:07:25Z

I was surprised to learn that it returns JSON as well. My browser showed XML...

No, you are correct: it returns XML. I convert the XML to JSON and then to BibTeX because JSON is easier for me to work with. I should have explained that better in the description.

Siedlerchr · 2022-03-23T22:05:04Z

src/main/java/org/jabref/logic/importer/fetcher/DOABFetcher.java

+        BibEntry entry = new BibEntry();
+        for (int i = 0; i < metadataArray.length(); i++) {
+            JSONObject dataObject = metadataArray.getJSONObject(i);
+            if (dataObject.getString("key").equals("dc.type")) {


Also, you can simplify this using a switch case.
Extract the dataObject.get("key") to a variable and then you can do a nice switch on the variable.

see https://www.baeldung.com/java-switch#switch-expressions and also
https://www.baeldung.com/java-switch-pattern-matching
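
As a rough illustration of the switch suggestion, here is a minimal sketch. It models each metadata item as a plain {key, value} pair rather than a JSONObject so that it stays self-contained. Only dc.contributor.author and dc.type appear in this thread; the other keys and the class and method names are assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class MetadataSwitchSketch {

    // Maps one metadata key to a BibTeX field name; returns null for keys that are skipped.
    // Keys other than dc.contributor.author and dc.type are assumed, not taken from the PR.
    static String toFieldName(String key) {
        return switch (key) {
            case "dc.contributor.author" -> "author";
            case "dc.type" -> "type";
            case "dc.title" -> "title";
            case "dc.language" -> "language";
            default -> null;
        };
    }

    // Each item is a {key, value} pair standing in for one JSONObject of the metadata array.
    static Map<String, String> toFields(List<String[]> metadata) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String[] item : metadata) {
            String key = item[0]; // extracted once, then switched on
            String field = toFieldName(key);
            if (field != null) {
                fields.put(field, item[1]);
            }
        }
        return fields;
    }
}
```

Compared with a chain of if/equals checks, the key is read once and every mapping sits on its own line.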

Mohamadi98 · 2022-03-25T03:10:41Z

@Siedlerchr Hi,
I have made the fixes you requested, but I have a question:
how can I add multiple values to a single BibTeX entry attribute? For example, how do I add multiple author values to the AUTHOR attribute?

Siedlerchr · 2022-03-25T22:55:12Z

Hi, sorry for the late response. For the authors: if each author is in its own field, you can directly use new Author(...) for each; otherwise you can use AuthorList.parse()...
The same applies to the editor field. Many books only have editors and not authors.

Have a look at the CrossRef Fetcher.

jabref/src/main/java/org/jabref/logic/importer/fetcher/CrossRef.java

Lines 149 to 162 in 6e99a33

    private String toAuthors(JSONArray authors) {
        if (authors == null) {
            return "";
        }
        // input: list of {"given":"A.","family":"Riel","affiliation":[]}
        return IntStream.range(0, authors.length())
                        .mapToObj(authors::getJSONObject)
                        .map((author) -> new Author(
                                author.optString("given", ""), "", "",
                                author.optString("family", ""), ""))
                        .collect(AuthorList.collect())
                        .getAsFirstLastNamesWithAnd();
    }

Mohamadi98 · 2022-03-31T04:16:34Z

@Siedlerchr Hi,
sorry for not replying; I have been very busy the last few days. Can you review the latest commit and see if there are any changes you would like?

Siedlerchr

In general looks already good. Some minor improvements

Siedlerchr · 2022-03-31T17:45:07Z

src/main/java/org/jabref/logic/importer/fetcher/DOABFetcher.java

+        };
+    }
+
+    private BibEntry JsonToBibEntry(JSONArray metadataArray) {


lowercase method name

Siedlerchr · 2022-03-31T17:45:27Z

src/main/java/org/jabref/logic/importer/fetcher/DOABFetcher.java

+                BibEntry entry = JsonToBibEntry(metadataArray);
+                return Collections.singletonList(entry);
+            }
+            // multiple results


The comment can be removed; it should be obvious

Siedlerchr · 2022-03-31T17:45:44Z

src/main/java/org/jabref/logic/importer/fetcher/DOABFetcher.java

+
+public class DOABFetcher implements SearchBasedParserFetcher {
+    private static final String SEARCH_URL = "https://directory.doabooks.org/rest/search?";
+    // private static final String PEER_REVIEW_URL = " https://directory.doabooks.org/rest/peerReviews?";


not needed?

Not sure how you want to integrate it: should it be a different fetcher (DOABPeerReviews, for example), or the same fetcher with two options for the user?
I am actually not sure I understand what peer reviews mean in the world of books.

I would ignore it for the moment.

Remove the comment then

Siedlerchr · 2022-03-31T18:07:24Z

src/main/java/org/jabref/logic/importer/fetcher/DOABFetcher.java

+        return entry;
+    }
+
+    private Author toAuthor(String author) {


What happens if you have multiple authors?
https://directory.doabooks.org/handle/20.500.12854/79348?show=full
And I think you can use AuthorList.parse() as the names are already in BibTeX format

~~+1 using AuthorList.parse. There are a lot of weird edge-cases when it comes to authors~~ :/
(see comment https://github.com/JabRef/jabref/pull/8598/files#r839925364 for a bit more discussion)

I created a list of Author objects; all authors are added to it and it is then converted to an AuthorList. I will add a test case for multiple authors. It works the same way as multiple editors, for which a test case is already provided. But I agree with you, I think there is a better solution. Can you describe what AuthorList.parse() actually does and how to use it? I am not sure I completely understand it.

The nutshell interpretation of AuthorList is that it is intended to represent one or more authors written in bibtex format.

AuthorList.parse is intended to take a String consisting of one or more authors in bibtex format, and store them internally in a List that you can access directly through getAuthors() if you need to.

For this case, I'd probably go with

authorsList.addAll( AuthorList.parse( dataObject.getString("value")) .getAuthors())

even if we should only be getting one author from doab, and it isn't in bibtex format. Probably preprocess by checking for (Ed.) as in #8598 (comment)

Is that explanation/example useful?

You seem to already have nailed down how to deal with output from AuthorList. Do you have any questions regarding it?

Thanks, that was useful, and definitely a cleaner solution. I will implement this, and also start thinking about solutions for edge cases like the one you mentioned here #8598 (comment), especially the author name that starts with Fatima, which is formatted differently from the other authors.

I also noticed that for some results the field for the number of pages can come in a different format, as in this result: https://directory.doabooks.org/rest/search?query=%22UAV%E2%80%90Based%20Remote%20Sensing%20Volume%202%22&expand=metadata
This can be problematic because the returned value is then a string rather than an integer.

Siedlerchr · 2022-03-31T18:10:24Z

src/test/java/org/jabref/logic/importer/fetcher/DOABFetcherTest.java

+    public void TestPerformSearch() throws FetcherException {
+        List<BibEntry> entries;
+        entries = fetcher.performSearch("i open fire");
+        assertFalse(entries.isEmpty());


assertFalse and assertTrue make it hard to see why a test fails. In this case it's better to compare the entry collections,
e.g. assertEquals(Collections.singletonList(expectedEntry), entries)

k3KAW8Pnf7mkmdSMPHz27 · 2022-03-31T18:54:11Z

src/main/java/org/jabref/logic/importer/fetcher/DOABFetcher.java

+        List<Author> authorsList = new ArrayList<>();
+        List<Author> editorsList = new ArrayList<>();
+        StringJoiner keywordJoiner = new StringJoiner(",");
+        for (int i = 0; i < metadataArray.length(); i++) {
+            JSONObject dataObject = metadataArray.getJSONObject(i);
+            switch (dataObject.getString("key")) {
+                case "dc.contributor.author" -> authorsList.add(toAuthor(dataObject.getString("value")));


~~In the interest of trying to deepen the comment in https://github.com/JabRef/jabref/pull/8598/files#r839909593~~

~~In what circumstance(s) is the metadataArray larger than 1 item?~~

If it can not be guaranteed that metadataArray is exactly 1 item (in which case we have only one author and one editor field) I'd probably suggest
authorsList.addAll(AuthorList.parse(dataObject.getString("value")).getAuthors()) even if it is a bit ugly :/ (which should also cover the weird edge case of there being an author field but no parsable author)

~~It is probably possible to use StringJoiner with and, but it feels a bit hacky~~ :/

~~I don't know, I guess the best-case scenario is if metadataArray is only one item, but I guess we can't always get what we wish for~~ 😛

Looking at it a bit closer, it seems that I've spoken too soon.
@Mohamadi98 do you know if there is any information on what format they are using with more details?

E.g.,

<metadata>
  <key>dc.contributor.author</key>
  <language>*</language>
  <value>Felipe Gonzalez Toro (Ed.)</value>
</metadata>
<metadata>
  <key>dc.contributor.author</key>
  <language>*</language>
  <value>Kamruzzaman, Md. (Liton)</value>
</metadata>
<metadata>
  <key>dc.contributor.author</key>
  <language>*</language>
  <value>Fátima Velez de Castro</value>
</metadata>

is a bit weird.

Also follow-up in #8576 (comment)

The data returned from the API is in XML format, and the information that I used for the BibTeX entries is inside an array called metadata, which you can add to or remove from the results; see https://directory.doabooks.org/rest/search?query=%22the+deliverance+of+open+access+books%22. So I agree with you, it is a little bit weird.
I convert the format to JSON because it's easier for me to work with.

@Mohamadi98 Most entries seem to be in the correct format already; however, some seem to indicate editors in the authors field.
We suggest you apply a minimal heuristic:

Check if the dc.contributor.author contains (Ed.)

If yes -> remove that (Ed.) part

Parse the value now using AuthorList.parse(); it takes care of different name formats

Repeat for other editors

Put in the editors field

The other normal case:

Check if the dc.contributor.author contains (Ed.)

If no -> Parse it using AuthorList.parse()

Repeat for other authors

Put in the authors field
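
The heuristic above can be sketched roughly as follows. This is only an illustration on plain strings: the class name, the regular expression and the extra marker variants such as (Eds) are assumptions, and in the fetcher the cleaned value would still go through AuthorList.parse() as described above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ContributorSketch {

    // Assumed marker variants: "(Ed.)", "(Eds.)", "(Ed)" and "(Eds)", in any letter case.
    private static final Pattern EDITOR_MARKER =
            Pattern.compile("\\(\\s*eds?\\.?\\s*\\)", Pattern.CASE_INSENSITIVE);

    final List<String> authors = new ArrayList<>();
    final List<String> editors = new ArrayList<>();

    // Sorts one dc.contributor.author value into the author or editor list.
    void add(String rawName) {
        Matcher marker = EDITOR_MARKER.matcher(rawName);
        if (marker.find()) {
            // Strip the marker and treat the remaining name as an editor.
            editors.add(marker.replaceAll("").trim());
        } else {
            authors.add(rawName.trim());
        }
    }
}
```

Matching the marker instead of cutting a fixed number of trailing characters keeps names intact when the marker is missing or spelled slightly differently.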

It's a bit of a mess with a list of list of authors... and the naming doesn't really help (yep, it's a mess^^)

But if you're stuck, just ask @k3KAW8Pnf7mkmdSMPHz27

AuthorList.parse can also handle names that are not in (LastName, FirstName) format. In one of the tests in the latest commit you will see that names returned by the API as (FirstName LastName) are parsed correctly, so this is nice.

Siedlerchr · 2022-04-03T14:20:53Z

src/main/java/org/jabref/logic/importer/fetcher/DOABFetcher.java

-    private String toAuthorList(List<Author> authorsList) {
-        return AuthorList.of(authorsList).getAsFirstLastNamesWithAnd();
+    private String namePreprocessing(String name) {
+        name = String.valueOf(new StringBuilder(name).delete(name.length() - 5, name.length()));


Why not a simple replace? e.g. name.replace("(Eds)","") ?

Mohamadi98 · 2022-04-04T20:40:20Z

I think this currently works fine for most entries of the API.
There are still some entries where the API is not consistent in its format.
What do you think? @Siedlerchr @k3KAW8Pnf7mkmdSMPHz27

Siedlerchr · 2022-04-05T15:59:42Z

@Mohamadi98 Cool! Thanks for the work so far! I guess we can't handle all edge cases. For example, if you cannot parse the author correctly, you should nonetheless add the rest of the fields that are available, so the user has the option to manually edit/correct the entries.

Otherwise, looks good already. It would also be nice to implement the FullTextFetcher Interface, so that the PDF can be automatically downloaded. You can also do this as an extra PR.

Mohamadi98 · 2022-04-05T23:13:15Z

Yes, actually all names are parsed correctly with respect to what the API returns, though maybe not correctly for the real-life name; in that case, as you said, the user can edit or correct it. I will just add a try/catch for the page-number case: a few results give the number of pages in the format "(roman number), number", which causes an error because the value is then a String rather than an integer as in the normal format. The try/catch should handle that fine.
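
One possible shape for that fallback, as a minimal sketch: the helper name is hypothetical, the value is taken as a plain Object instead of being read from a JSONObject, and the sample string "xii, 340" is made up to match the described format.

```java
class PagesSketch {

    // Returns the page count as text when the value is a plain integer,
    // otherwise keeps the raw text so the user can correct it later.
    static String pagesField(Object value) {
        if (value instanceof Number) {
            return String.valueOf(((Number) value).intValue());
        }
        String text = String.valueOf(value).trim();
        try {
            return String.valueOf(Integer.parseInt(text));
        } catch (NumberFormatException e) {
            // e.g. "xii, 340": not an integer, so it is passed through unchanged
            return text;
        }
    }
}
```

Keeping the raw text instead of dropping the field follows the advice above to still give the user something to edit.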

Mohamadi98 · 2022-04-05T23:24:52Z

Sure, no problem. To make sure I understand: the FullTextFetcher is supposed to find the page of a bib entry (for example, a book), probably using its DOI or URI. Is that correct, or is it something else?

Siedlerchr · 2022-04-08T21:44:29Z

@Mohamadi98 Yes, a full-text fetcher takes a bib entry and then returns the PDF document for it, e.g. by checking for the DOI or any other field (e.g. title; this depends on the concrete fetcher). Whenever possible, a unique identifier like DOI or ISBN should be used to find an entry, as other fields can be ambiguous.

When you have an entry in JabRef and say "Find full text online", JabRef calls multiple fetchers until one returns the PDF file. The fetchers also have a trust level, e.g. if the entry or the PDF comes from the publisher website, it is considered more trustworthy than when it comes from a website like ResearchGate.
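
To make the idea concrete, here is a simplified sketch of trying several finders until one yields a result. All names here (PdfFinder, Trust, findFirst) are hypothetical and are not JabRef's actual interfaces, and trying the most trusted source first is an assumption of this sketch rather than something stated above.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

class FullTextLookupSketch {

    // Hypothetical trust ranking; constants declared later are treated as more trustworthy.
    enum Trust { AGGREGATOR, PUBLISHER }

    // Hypothetical stand-in for a full-text fetcher: looks up a PDF location for a DOI.
    interface PdfFinder {
        Optional<String> findPdfUrl(String doi);

        Trust trust();
    }

    // Tries finders from most to least trusted and stops at the first hit.
    static Optional<String> findFirst(List<PdfFinder> finders, String doi) {
        return finders.stream()
                .sorted(Comparator.comparing(PdfFinder::trust).reversed())
                .map(finder -> finder.findPdfUrl(doi))
                .flatMap(Optional::stream)
                .findFirst();
    }
}
```

Because the stream is evaluated lazily after sorting, less trusted finders are only asked when the more trusted ones return nothing.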

koppor · 2022-04-11T20:10:59Z

DevCall decision: @Siedlerchr tries this code locally and if it works, we merge. The FullTextFetcher can then go into a follow-up pull request.

koppor · 2022-04-11T20:11:38Z

An entry to the CHANGELOG.md should be added, too.

ThiloteE · 2022-04-11T20:27:38Z

Searching for "politics". Found:

@Misc{Jun2011aap,
  author    = {Nathan Jun},
  date      = {2011},
  title     = {Anarchism and Political Modernity},
  doi       = {10.5040/9781501306785},
  language  = {English},
  type      = {book},
  abstract  = {This book is available as open access through the Bloomsbury Open Access programme and is available on www.bloomsburycollections.com. Anarchism and Political Modernity looks at the place of 'classical anarchism' in the postmodern political discourse, claiming that anarchism presents a vision of political postmodernity. The book seeks to foster a better understanding of why and how anarchism is growing in the present. To do so, it first looks at its origins and history, offering a different view from the two traditions that characterize modern political theory: socialism and liberalism. Such an examination leads to a better understanding of how anarchism connects with newer political trends and why it is a powerful force in contemporary social and political movements. This new volume in the Contemporary Anarchist Studies series offers a novel philosophical engagement with anarchism and contests a number of positions established in postanarchist theory. Its new approach makes a valuable contribution to an established debate about anarchism and political theory. It offers a new perspective on the emerging area of anarchist studies that will be of interest to students and theorists in political theory and anarchist studies.},
  keywords  = {Politics & International Relations, Political Theory and Philosophy (Politics), Social and Political Philosophy (Philosophy), Political and Legal Philosophy (Politics), Supplementary Standard},
  pages     = {272},
  publisher = {Bloomsbury Academic},
  uri       = {https://directory.doabooks.org/handle/20.500.12854/77385},
}

But when importing via DOI this emerges:

@Book{Jun_2012,
  author    = {Nathan Jun},
  date      = {2012},
  title     = {Anarchism and Political Modernity},
  doi       = {10.5040/9781501306785},
  publisher = {Continuum},
}

To do:

set entry type (type is probably the entry type).
Example:
@Book{citationkey, instead of @Misc{citationkey,
Would it be possible to fetch the URL? Maybe copy the uri (as it seems to be a stable link at doabooks.org) to the url field. Could be done in a follow up pull request. Nothing that blocks this one.

ThiloteE · 2022-04-11T22:56:19Z

Also, see following entries:

@Misc{Ledeneva2018tge,
  title     = {The Global Encyclopaedia of informality, Volume 2},
  abstract  = {Alena Ledeneva invites you on a voyage of discovery to explore society’s open secrets, unwritten rules and know-how practices. Broadly defined as ‘ways of getting things done’, these invisible yet powerful informal practices tend to escape articulation in official discourse. They include emotion-driven exchanges of gifts or favours and tributes for services, interest-driven know-how (from informal welfare to informal employment and entrepreneurship), identity-driven practices of solidarity, and power-driven forms of co-optation and control. The paradox, or not, of the invisibility of these informal practices is their ubiquity. Expertly practised by insiders but often hidden from outsiders, informal practices are, as this book shows, deeply rooted all over the world, yet underestimated in policy. Entries from the five continents presented in this volume are samples of the truly global and ever-growing collection, made possible by a remarkable collaboration of over 200 scholars across disciplines and area studies. By mapping the grey zones, blurred boundaries, types of ambivalence and contexts of complexity, this book creates the first Global Map of Informality. The accompanying database (www.in-formality.com) is searchable by region, keyword or type of practice, so do explore what works, how, where and why!},
  date      = {2018},
  doi       = {10.14324/111.9781787351899},
  editor    = {Alena Ledeneva},
  keywords  = {global informality, informality, informal practices, Russia},
  language  = {English},
  pages     = {568},
  publisher = {UCL Press},
  type      = {book},
  uri       = {https://directory.doabooks.org/handle/20.500.12854/37655},
}

The volume is in the title. Maybe Volume could be parsed from there?
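
A small sketch of how such parsing might look, assuming the volume always appears as a trailing "Volume <number>" in the title. The class name and the pattern are illustrative only; titles that phrase the volume differently would not match.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class VolumeFromTitleSketch {

    // Assumed shape: the title ends with "Volume <digits>", optionally preceded by a comma.
    private static final Pattern TRAILING_VOLUME =
            Pattern.compile("[,\\s]+Volume\\s+(\\d+)\\s*$", Pattern.CASE_INSENSITIVE);

    // Returns the volume number found at the end of the title, if any.
    static Optional<String> volume(String title) {
        Matcher matcher = TRAILING_VOLUME.matcher(title);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }
}
```

For the entry above, the title "The Global Encyclopaedia of informality, Volume 2" would give a volume of 2, while a title such as "Globalising Democracy" gives no volume.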

@Misc{Burnell2006gd,
  author    = {Peter Burnell},
  date      = {2006},
  title     = {Globalising Democracy},
  doi       = {10.4324/9780203965719},
  language  = {English},
  type      = {book},
  abstract  = {This volume brings together expert contributors to explore the intersection of two major contemporary themes: globalization, and the contribution that both domestic party politics and international party support make to democratization.         Globalising Democracy clearly shows what globalization means for domestic and international efforts to build effective political parties and competitive party systems in new and emerging democracies. Contrasting perspectives are presented through fresh case studies of European post-communist countries, Africa and Turkey. The reader is clearly shown how international party assistance is a manifestation and vehicle of globalization, and explores how it may be assessed in terms of:              global economic integration       the growth of global communications       the development and implications for party politics of multi-level governance.        This is the first book to analyze the impact of globalization on democracy and will be of great interest to all students of international relations, governance and politics.},
  keywords  = {assistance, party, aid, promotion, system, political, international, zeeuw, civil, society},
  publisher = {Taylor & Francis},
  uri       = {https://directory.doabooks.org/handle/20.500.12854/32659},
}

On the webpage you will find the ISBN. If possible, JabRef should fetch the ISBN.
You can see that the abstract has these odd empty spaces in between. Is this because JabRef is doing some postprocessing, or is this something from the DOAB side? @koppor ?

@Misc{Crewe2021tao,
  author    = {Emma Crewe},
  date      = {2021},
  title     = {The Anthropology of Parliaments},
  doi       = {10.4324/9781003084488},
  language  = {English},
  type      = {book},
  abstract  = {The Anthropology of Parliaments offers a fresh, comparative approach to analysing parliaments and democratic politics, drawing together rare ethnographic work by anthropologists and politics scholars from around the world. Crewe’s insights deepen our understanding of the complexity of political institutions. She reveals how elected politicians navigate relationships by forging alliances and thwarting opponents; how parliamentary buildings are constructed as sites of work, debate and the nation in miniature; and how politicians and officials engage with hierarchies, continuity and change. This book also proposes how to study parliaments through an anthropological lens while in conversation with other disciplines. The dive into ethnographies from across Africa, the Americas, Asia, Europe, the Middle East and the Pacific Region demolishes hackneyed geo-political categories and culminates in a new comparative theory about the contradictions in everyday political work. This important book will be of interest to anyone studying parliaments but especially those in the disciplines of anthropology and sociology; politics, legal and development studies; and international relations.},
  keywords  = {Anthropology, Comparative politics, Political structures: democracy, Social and cultural anthropology},
  pages     = {242},
  publisher = {Taylor & Francis},
  uri       = {https://directory.doabooks.org/handle/20.500.12854/69644},
}

On the webpage of doabooks.org, you will find the "title" and "subtitle" of this entry. The biblatex documentation (pages 9, 19 and 26) shows that this is a field that probably should be fetched if you can. I would suggest putting it into the subtitle field. Of course, only if it is manageable to fetch it.
I checked the BibTeX documentation. There, subtitle does not need to be rendered, because subtitle does not exist for the entry type book, but this info is just for the record. Probably not relevant for this pull request.

Siedlerchr · 2022-04-18T21:07:11Z

@Mohamadi98 It would be cool if you could address the remaining comments so we can merge this

koppor

Some small comments to the test organization.

I would also like to see .withField instead of .setField. See #8693 for an example of the outcome.

koppor · 2022-04-18T22:06:53Z

src/test/java/org/jabref/logic/importer/fetcher/DOABFetcherTest.java

+    @Test
+    public void TestPerformSearch() throws FetcherException {
+        List<BibEntry> entries;
+        entries = fetcher.performSearch("i open fire");
+        assertEquals(Collections.singletonList(David_Opal), entries);
+    }
+
+     @Test
+    public void TestPerformSearch2() throws FetcherException {
+        List<BibEntry> entries;
+        entries = fetcher.performSearch("the deliverance of open access books");
+        assertEquals(Collections.singletonList(Ronald_Snijder), entries);
+    }
+
+    @Test
+    public void TestPerformSearch3() throws FetcherException {
+        List<BibEntry> entries;
+        entries = fetcher.performSearch("Four Kingdom Motifs before and beyond the Book of Daniel");
+        assertEquals(Collections.singletonList(Andrew_Perrin), entries);
+    }
+
+    @Test
+    public void TestPerformSearch4() throws FetcherException {
+        List<BibEntry> entries;
+        entries = fetcher.performSearch("UAV Sensors for Environmental Monitoring");
+        assertEquals(Collections.singletonList(Felipe_Gonzalez), entries);
+    }
+
+    @Test
+    public void TestPerformSearch5() throws FetcherException {
+        List<BibEntry> entries;
+        entries = fetcher.performSearch("The symbiosis between information system project complexity and information system project success");
+        assertEquals(Collections.singletonList(Carl_Marnewick), entries);
+    }
+
+    @Test
+    public void TestPerformSearch6() throws FetcherException {
+        List<BibEntry> entries;
+        entries = fetcher.performSearch("UAV‐Based Remote Sensing Volume 2");
+        assertEquals(Collections.singletonList(Antonios_Tsourdos), entries);
+    }


This should be converted to a parameterized test --> https://www.baeldung.com/parameterized-tests-junit-5

Hi, do you mind taking a look at the test file in the last commit? I still want to use assertEquals or something similar, but I couldn't use it with the parameterized test. Any advice on that?

The problem is that the expected value now has to change with every test case taken from the string array in ValueSource under ParameterizedTest.

koppor · 2022-04-18T22:07:22Z

src/test/java/org/jabref/logic/importer/fetcher/DOABFetcherTest.java

+    }
+
+    @Test
+    public void TestGetName() {


Method names should start with a lower case:

Suggested change

public void TestGetName() {

public void testGetName() {

Mohamadi98 · 2022-04-19T05:02:47Z

To do:

set entry type (type is probably the entry type).
Example:
@Book{citationkey, instead of @Misc{citationkey,
Would it be possible to fetch the URL? Maybe copy the uri (as it seems to be a stable link at doabooks.org) to the url field. Could be done in a follow up pull request. Nothing that blocks this one.

Okay, I think this could be done. Can you explain what effect changing the type has? I don't think I fully understand that part of BibTeX.
For the url part, having both url and uri can be done; I will add them.

Mohamadi98 · 2022-04-19T05:37:33Z

Also, see following entries:

@Misc{Ledeneva2018tge,
  title     = {The Global Encyclopaedia of informality, Volume 2},
  abstract  = {Alena Ledeneva invites you on a voyage of discovery to explore society’s open secrets, unwritten rules and know-how practices. Broadly defined as ‘ways of getting things done’, these invisible yet powerful informal practices tend to escape articulation in official discourse. They include emotion-driven exchanges of gifts or favours and tributes for services, interest-driven know-how (from informal welfare to informal employment and entrepreneurship), identity-driven practices of solidarity, and power-driven forms of co-optation and control. The paradox, or not, of the invisibility of these informal practices is their ubiquity. Expertly practised by insiders but often hidden from outsiders, informal practices are, as this book shows, deeply rooted all over the world, yet underestimated in policy. Entries from the five continents presented in this volume are samples of the truly global and ever-growing collection, made possible by a remarkable collaboration of over 200 scholars across disciplines and area studies. By mapping the grey zones, blurred boundaries, types of ambivalence and contexts of complexity, this book creates the first Global Map of Informality. The accompanying database (www.in-formality.com) is searchable by region, keyword or type of practice, so do explore what works, how, where and why!},
  date      = {2018},
  doi       = {10.14324/111.9781787351899},
  editor    = {Alena Ledeneva},
  keywords  = {global informality, informality, informal practices, Russia},
  language  = {English},
  pages     = {568},
  publisher = {UCL Press},
  type      = {book},
  uri       = {https://directory.doabooks.org/handle/20.500.12854/37655},
}

The volume is in the title. Maybe Volume could be parsed from there?

@Misc{Burnell2006gd,
  author    = {Peter Burnell},
  date      = {2006},
  title     = {Globalising Democracy},
  doi       = {10.4324/9780203965719},
  language  = {English},
  type      = {book},
  abstract  = {This volume brings together expert contributors to explore the intersection of two major contemporary themes: globalization, and the contribution that both domestic party politics and international party support make to democratization.         Globalising Democracy clearly shows what globalization means for domestic and international efforts to build effective political parties and competitive party systems in new and emerging democracies. Contrasting perspectives are presented through fresh case studies of European post-communist countries, Africa and Turkey. The reader is clearly shown how international party assistance is a manifestation and vehicle of globalization, and explores how it may be assessed in terms of:              global economic integration       the growth of global communications       the development and implications for party politics of multi-level governance.        This is the first book to analyze the impact of globalization on democracy and will be of great interest to all students of international relations, governance and politics.},
  keywords  = {assistance, party, aid, promotion, system, political, international, zeeuw, civil, society},
  publisher = {Taylor & Francis},
  uri       = {https://directory.doabooks.org/handle/20.500.12854/32659},
}

On the webpage you will find the ISBN. If possible, JabRef should fetch the ISBN.
You can see that the abstract has these odd empty spaces in between. Is this because JabRef is doing some postprocessing or is this something from DOAB side? @koppor ?

@Misc{Crewe2021tao,
  author    = {Emma Crewe},
  date      = {2021},
  title     = {The Anthropology of Parliaments},
  doi       = {10.4324/9781003084488},
  language  = {English},
  type      = {book},
  abstract  = {The Anthropology of Parliaments offers a fresh, comparative approach to analysing parliaments and democratic politics, drawing together rare ethnographic work by anthropologists and politics scholars from around the world. Crewe’s insights deepen our understanding of the complexity of political institutions. She reveals how elected politicians navigate relationships by forging alliances and thwarting opponents; how parliamentary buildings are constructed as sites of work, debate and the nation in miniature; and how politicians and officials engage with hierarchies, continuity and change. This book also proposes how to study parliaments through an anthropological lens while in conversation with other disciplines. The dive into ethnographies from across Africa, the Americas, Asia, Europe, the Middle East and the Pacific Region demolishes hackneyed geo-political categories and culminates in a new comparative theory about the contradictions in everyday political work. This important book will be of interest to anyone studying parliaments but especially those in the disciplines of anthropology and sociology; politics, legal and development studies; and international relations.},
  keywords  = {Anthropology, Comparative politics, Political structures: democracy, Social and cultural anthropology},
  pages     = {242},
  publisher = {Taylor & Francis},
  uri       = {https://directory.doabooks.org/handle/20.500.12854/69644},
}

on the webpage of doabooks.org, you will find the "title" and "subtitle" of this entry. Biblatex documentation on page 9, 19 and 26 shows that this is a field that probably should be fetched if you can. I would suggest putting it into the field subtitle. Of course, only if it is manageable to fetch it.
I checked the BibTeX documentation. There, subtitle does not need to be rendered, because subtitle does not exist for the entry type book, but this info is just for the record. Probably not relevant for this pull request.

Actually, there is a field for volume in the API results, but it is not consistent: the example you mentioned has no volume field, while other results do. I think adding that field is a more general solution than parsing the volume number from the title string, which could cause problems in cases where the actual word "volume" is part of the title.

I can't find a field for the ISBN in the API results. Maybe I could contact Ronald from issue #8576 to ask about the ISBN.

As for the abstract, this is the format that comes directly from the API.

Mohamadi98 · 2022-04-19T05:43:42Z

on the webpage of doabooks.org, you will find the "title" and "subtitle" of this entry. Biblatex documentation on page 9, 19 and 26 shows that this is a field that probably should be fetched if you can. I would suggest putting it into the field subtitle. Of course, only if it is manageable to fetch it.
I checked Bibtex documentation. There, subtitle does not need to be rendered, because subtitle does not exist for the entry type books, but this info is just for the record. Probably not relevant this pull request.

Unfortunately, I can't find a field for subtitle in the API results. Maybe it could be fetched from the webpage, but I don't know of a way to access webpages from JabRef and fetch data from the page.

Mohamadi98 · 2022-04-19T05:45:33Z

Some small comments to the test organization.

I would also like to see .withField instead of .setField. See #8693 for an example of the outcome.

Sure, but can you explain the difference between setField and withField?

ThiloteE · 2022-04-19T09:02:23Z

Fixing the entry type is comparatively important (More important than e.g. fetching the missing subtitle field).

See here for a rough explanation: https://www.openoffice.org/bibliographic/bibtex-defs.html

In short:

When entering a reference in the database, the first thing to decide is what type of entry it is. No fixed classification scheme can be complete, but BibTeX provides enough entry types to handle almost any reference reasonably well.

References to different types of publications contain different information; a reference to a journal article might include the volume and number of the journal, which is usually not meaningful for a book. Therefore, database entries of different types have different fields.

As a consequence, citation-styles will render the entries differently based on entry type. If the entry type is @misc, the bibliography will look different as compared to entries with entry type @article.

The entry type is offered in the API probably via the following:

dc.type

You already fetched it, since it shows in the bibliography as

type = {book},

All that needs to be done here is to do the conversion.
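
The conversion mentioned above could look roughly like the following. This is a minimal sketch, not JabRef's actual implementation: the class `DoabTypeMapper`, the method `toEntryType`, and the `chapter` mapping are assumptions made for illustration; only the value `book` appears in the entries shown in this thread.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical helper (not part of JabRef): maps the DOAB "dc.type" value
// to a BibTeX entry type name, falling back to "Misc" for unknown values.
final class DoabTypeMapper {
    // Assumed mapping; only "book" is confirmed by the entries in this thread.
    private static final Map<String, String> TYPES = Map.of(
            "book", "Book",
            "chapter", "InCollection");

    private DoabTypeMapper() {
    }

    static String toEntryType(String dcType) {
        if (dcType == null) {
            return "Misc";
        }
        // Normalize case and whitespace before the lookup.
        String key = dcType.trim().toLowerCase(Locale.ROOT);
        return TYPES.getOrDefault(key, "Misc");
    }
}
```

With such a mapping, an item whose `dc.type` is `book` would be written as `@Book{...}` instead of `@Misc{...}`, and anything unrecognized would keep the current `@Misc` behavior.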

I can't find a field for the ISBN in the API results, maybe I could communicate with Ronald from the issue here #8576 to ask about ISBN.

For this entry, click "Show full item record" at the bottom left, and you will see the ISBN information:

oapen.relation.isbn	9780415401845;9780415401838;9781134143894;9781134143887;9781134143849
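
Since the `oapen.relation.isbn` value above packs several ISBNs into one semicolon-separated string, a fetcher would probably need to split it before storing it. The following is only a sketch under that assumption; `IsbnField` and `split` are hypothetical names, and how JabRef should store multiple ISBNs is not decided in this thread.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper (not part of JabRef): splits a semicolon-separated
// ISBN value, as shown in the full item record, into individual ISBNs.
final class IsbnField {
    private IsbnField() {
    }

    static List<String> split(String raw) {
        if (raw == null || raw.isBlank()) {
            return List.of();
        }
        return Arrays.stream(raw.split(";"))
                     .map(String::trim)
                     .filter(part -> !part.isEmpty())
                     .collect(Collectors.toList());
    }
}
```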

Unfortunately, I can't find a field for subtitle in the API results, maybe it could be fetched from the webpage, but I don't know of a way to access webpages from jabref and fetch data from the page.

Same procedure for this entry yields the subtitle:

dc.title.alternative	Entanglements in Democratic Politics

and I think fetching "imprint" was also missing, so here we go:

oapen.imprint	Routledge

Finally, with regard to:

Maybe parse volume from title field?
the spaces in the abstract

These two issues are not important. I just noticed, but they are very low priority. They were more a question than a proposal. Don't do them. They do not prevent merging this pull request at all.

Overall, the entries I fetched look already pretty good! Good job!

koppor · 2022-04-19T21:59:20Z

Some small comments to the test organization.
I would also like to see .withField instead of .setField. See #8693 for an example of the outcome.
sure, but can you explain the difference between "setfield" and "withfield".

withField is the fluent API, similar to a builder, whereas setField is old-school style.
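
To illustrate the difference in style, here is a minimal sketch. `SketchEntry` is a simplified stand-in written for this example and is not JabRef's `BibEntry`; the real method signatures may differ.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified stand-in for an entry class; NOT JabRef's BibEntry.
final class SketchEntry {
    private final Map<String, String> fields = new LinkedHashMap<>();

    // Setter style: stores the value and returns nothing, so calls cannot be chained.
    void setField(String name, String value) {
        fields.put(name, value);
    }

    // Fluent style: stores the value and returns the entry itself, so calls can be chained.
    SketchEntry withField(String name, String value) {
        fields.put(name, value);
        return this;
    }

    String getField(String name) {
        return fields.get(name);
    }
}

final class FluentDemo {
    private FluentDemo() {
    }

    // Setter style needs a local variable and one statement per field.
    static SketchEntry setterStyle() {
        SketchEntry entry = new SketchEntry();
        entry.setField("author", "Emma Crewe");
        entry.setField("title", "The Anthropology of Parliaments");
        return entry;
    }

    // Fluent style builds the same entry in a single expression.
    static SketchEntry fluentStyle() {
        return new SketchEntry()
                .withField("author", "Emma Crewe")
                .withField("title", "The Anthropology of Parliaments");
    }
}
```

Both methods produce the same fields; the fluent version tends to read better in tests, where an expected entry is often written as one expression.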

Mohamadi98 · 2022-04-20T09:11:53Z

I added some things. Can you check the last commit and let me know if you have any requests or feedback?

Also, I added oapen.relation.isbn, so if that field is in the API result it will be fetched and assigned to the BibEntry. However, although you found it on the webpage, it is missing from the API results I have seen, including the example you mentioned. Here is the API response for the query of the book you mentioned: https://directory.doabooks.org/rest/search?query=%22Globalising%20Democracy%22&expand=metadata
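
For reference, a query URL of the shape shown above can be assembled as follows. This is only a sketch: the class `DoabSearchUrl` and the method `forTitle` are hypothetical, and the endpoint and the `expand=metadata` parameter are taken from the URL in this comment rather than from the fetcher's code.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not part of JabRef): builds a DOAB search URL
// for an exact-phrase title query, mirroring the URL quoted in this thread.
final class DoabSearchUrl {
    private static final String SEARCH_URL = "https://directory.doabooks.org/rest/search";

    private DoabSearchUrl() {
    }

    static String forTitle(String title) {
        // Wrap the title in double quotes to request an exact phrase.
        String quoted = "\"" + title + "\"";
        // URLEncoder encodes spaces as '+'; the quoted URL uses %20, so convert them.
        // A literal '+' in the title is already encoded as %2B at this point.
        String encoded = URLEncoder.encode(quoted, StandardCharsets.UTF_8).replace("+", "%20");
        return SEARCH_URL + "?query=" + encoded + "&expand=metadata";
    }
}
```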

koppor · 2022-04-20T19:49:43Z

Note that the "secret sauce" of Parameterized tests is:

        return Stream.of(
                Arguments.of(

We applied that pattern at e809885 (#8598)

koppor · 2022-04-20T20:49:53Z

@Mohamadi98 Thank you for your efforts in implementing this. I hope you learned something here 🎉. Maybe you can take a look at our final version of the parameterized tests.

@ThiloteE In case there still is something missing, please open a new issue.

ThiloteE · 2022-04-20T22:15:04Z

Thank you Mohamadi! :)

ThiloteE · 2022-04-20T22:49:27Z

@Mohamadi98

Fix for issue Add Directory of Open Access Books (DOAB) fetcher to web search #8576 adding DOAB fetcher to the list of web searches

By the way, if you write

"Fixes #8576", it will automatically close the issue when the pull request is merged :-)

Mohamadi98 · 2022-04-21T06:05:29Z

@Mohamadi98 Thank you for your efforts in implementing this. I hope you learned something here 🎉. Maybe you can take a look at our final version of the parameterized tests.

@ThiloteE In case there still is something missing, please open a new issue.

Definitely will look at it, It was a pleasure, thanks

Mohamadi98 · 2022-04-21T06:05:49Z

Thank you Mohamadi! :)

It was fun, Thank you !

adding DOAB to web search

e3c9f95

Siedlerchr requested changes Mar 22, 2022

Siedlerchr added the status: changes required Pull requests that are not yet complete label Mar 22, 2022

Siedlerchr reviewed Mar 23, 2022

update on adding DOAB to web search

7772eb2

update on adding DOAB to web search

f0c7a1d

Siedlerchr requested changes Mar 31, 2022

k3KAW8Pnf7mkmdSMPHz27 reviewed Mar 31, 2022

update on adding DOAB to web search

d33c30a

Siedlerchr reviewed Apr 3, 2022

Mohamadi98 added 2 commits April 4, 2022 01:49

removing (Ed.) from names

16b7cc2

adding space between keywords

fa880cf

handling string data type at pages number

3e71137

Mohamadi98 marked this pull request as ready for review April 5, 2022 23:36

koppor added this to the v5.6 milestone Apr 11, 2022

koppor requested changes Apr 18, 2022

Mohamadi98 closed this Apr 19, 2022

Mohamadi98 reopened this Apr 19, 2022

update on bibtex fields and testing

4042bcf

Mohamadi98 force-pushed the add-DOAB-to-web-search branch from 5ef9e06 to 4042bcf Compare April 20, 2022 08:57

ThiloteE added the fetcher label Apr 20, 2022

koppor and others added 4 commits April 20, 2022 21:51

Minor fixes

2e8c7ae

- Add CHANGELOG.md entry - Fix checkstyle Co-authored-by: Christoph <siedlerkiller@gmail.com> Co-authored-by: Carl Christian Snethlage <50491877+calixtus@users.noreply.github.com>

"Right" usage of withField

e746f82

Fix style of parameterized tests - and remove one test case (because …

e809885

…there are two results) Co-authored-by: Christoph <siedlerkiller@gmail.com> Co-authored-by: Carl Christian Snethlage <50491877+calixtus@users.noreply.github.com>

Merge remote-tracking branch 'origin/main' into add-DOAB-to-web-search

d677bf1

koppor force-pushed the add-DOAB-to-web-search branch from 203b152 to d677bf1 Compare April 20, 2022 19:52

koppor removed the status: changes required Pull requests that are not yet complete label Apr 20, 2022

koppor merged commit f220575 into JabRef:main Apr 20, 2022

ThiloteE mentioned this pull request Apr 20, 2022

Add Directory of Open Access Books (DOAB) fetcher to web search #8576

Closed
