DOAJ and bibjson #208

Closed
oscargus opened this issue Oct 4, 2015 · 10 comments

Comments

@oscargus
Contributor

oscargus commented Oct 4, 2015

Just found the Directory of Open Access Journals (DOAJ). They provide a search API that returns the results as bibjson.

  • It would be nice to support searching DOAJ (direct link to PDFs etc)
  • bibjson import and export might be considered (import/conversion from bibjson is required for the DOAJ search)

Any preference for a JSON package? Is any already included in JabRef? (That said, I only have a vague idea of JSON parsers etc.)

@koppor
Member

koppor commented Oct 4, 2015

In Winery, I use the 2.x branch of Jackson. However, I rely on the automatic mapping of JAX-RS and seldom use a Jackson parsing component directly.

I would propose to create a separate library with a Java data model of BibJSON. That would help reusing the functionality in other projects such as CloudRef.

I would use the JAX-RS 2.0 client API. This is the more modern way instead of using the (old) Jersey-Client API. Instead of going with Apache CXF or Jersey, the Jackson JAX-RS Providers seem the way to go. If Jackson does not work, you are welcome to try with Apache CXF.

Use jackson-annotations to specify the mapping to/from JSON. jackson-databind is the module you need to go from JSON to POJOs. I am currently not sure whether you have to include it explicitly if you include the Jackson JAX-RS JSON Provider.

I hope that is enough information for googling :)

@koppor
Member

koppor commented Oct 10, 2015

I just realized that @stefan-kolb added (I think in #101) unirest. In the serialization section in their README.md, they also point to Jackson.

@oscargus
Contributor Author

Thanks for pointing these out. After playing around with Unirest (and org.json), it appears rather straightforward to implement this without POJOs. I see the potential beauty of having a bibjson <-> BibtexEntry POJO connection, but I cannot currently see how this should be done (partly because the BibtexEntry POJOs are not really POJOs in the sense that would be beneficial here, but also because it seems quite complicated to deal with all the potential variants of bibjson fields). Is it a waste of time to do a DOAJ import based on "manual" conversion from bibjson to BibtexEntry? Of course, I'll use the JSON functions, but by checking specific fields rather than mapping them to POJOs.
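As a rough illustration of the "manual" field-by-field conversion described above, here is a minimal sketch. It does not use org.json or JabRef's BibtexEntry. Instead, it assumes the bibjson object has already been parsed into nested `Map`/`List` values, and it writes into a plain map of BibTeX field names. The class name `BibJsonFieldMapper` is hypothetical, and the bibjson keys (`title`, `year`, `author[].name`, `identifier[].type`/`id`) are assumptions about what the DOAJ response contains.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch, not JabRef code: copies selected bibjson fields into BibTeX fields. */
final class BibJsonFieldMapper {

    private BibJsonFieldMapper() {
    }

    /** Expects the already-parsed bibjson object as nested maps and lists (an assumption). */
    static Map<String, String> toBibtexFields(Map<String, Object> bibJson) {
        Map<String, String> fields = new LinkedHashMap<>();
        copyIfString(bibJson, "title", fields, "title");
        copyIfString(bibJson, "year", fields, "year");

        // Authors: a list of objects, each assumed to carry a "name" string.
        // Names are copied untouched and joined in BibTeX style with " and ".
        List<String> names = new ArrayList<>();
        for (Map<?, ?> author : objectsIn(bibJson.get("author"))) {
            if (author.get("name") instanceof String) {
                names.add((String) author.get("name"));
            }
        }
        if (!names.isEmpty()) {
            fields.put("author", String.join(" and ", names));
        }

        // Identifiers: assumed to be a list of {"type": ..., "id": ...}; only the DOI is used.
        for (Map<?, ?> identifier : objectsIn(bibJson.get("identifier"))) {
            if ("doi".equals(identifier.get("type")) && identifier.get("id") instanceof String) {
                fields.put("doi", (String) identifier.get("id"));
            }
        }
        return fields;
    }

    /** Copies source[key] to target[field] only if the value is a string. */
    private static void copyIfString(Map<String, Object> source, String key,
                                     Map<String, String> target, String field) {
        Object value = source.get(key);
        if (value instanceof String) {
            target.put(field, (String) value);
        }
    }

    /** Returns the map elements of a list value, skipping anything else. */
    private static List<Map<?, ?>> objectsIn(Object value) {
        List<Map<?, ?>> result = new ArrayList<>();
        if (value instanceof List) {
            for (Object element : (List<?>) value) {
                if (element instanceof Map) {
                    result.add((Map<?, ?>) element);
                }
            }
        }
        return result;
    }
}
```

Missing or unexpectedly typed fields are simply skipped, which is one way to cope with the many bibjson variants without a full POJO model.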

A small problem here is that it appears as if the author order is not always correct/consistent. Also, the author formatting differs between journals, but still I guess it is better than manual import. :-)

@koppor
Member

koppor commented Oct 12, 2015

Fine with me. I also like working on the JSON more. Could be a nightmare to test, though... With a POJO model, we could separate the JSON -> POJO parser from the POJO -> BibTeX interpretation. Tests JSON -> BibTeX are also fine. Should also be manageable without much effort.

You can normalize the authors using net.sf.jabref.model.entry.AuthorList.fixAuthor_firstNameFirst(String). Then the authors always look nice. OK, we could make that configurable. The only alternative is net.sf.jabref.model.entry.AuthorList.fixAuthor_lastNameFirst(String) though, and I personally got used to reading the first name first.

@oscargus
Contributor Author

Regarding the authors, it is worse than that. The author field can say "Kopp Oliver", "Oliver Kopp", or "Kopp O" (and probably "O Kopp"). The third (and fourth) cases are at least feasible to detect, but I cannot see how to distinguish between the first two cases (no, there are no commas). It is probably better to just leave it untouched.

@oscargus
Contributor Author

Now I'm talking about the author field from DOAJ, not how it should look in the .bib file.

@koppor
Member

koppor commented Oct 13, 2015

I wasn't aware that the naming issue is not solved in bibjson. It seems that one has to look up locally in the database whether the name is already there? Or on the internet which name is more likely? In the case of the last two patterns, one can guess that the word with one character is the first name...
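The single-character guess could look roughly like the sketch below. It is only an illustration: `AuthorOrderGuess` is a hypothetical helper, not part of JabRef, and it assumes names consist of exactly two whitespace-separated words. When neither or both words look like an initial ("Kopp Oliver" vs. "Oliver Kopp"), it returns an empty result so the caller can leave the name untouched.

```java
import java.util.Optional;

/** Hypothetical helper, not part of JabRef: guesses the order of a two-word author name. */
final class AuthorOrderGuess {

    private AuthorOrderGuess() {
    }

    /**
     * Returns "first last" if exactly one of the two words is a single-letter
     * initial (optionally followed by a dot). Otherwise returns empty, because
     * the order cannot be determined from the string alone.
     */
    static Optional<String> firstNameFirst(String author) {
        String[] words = author.trim().split("\\s+");
        if (words.length != 2) {
            return Optional.empty();
        }
        boolean firstIsInitial = isInitial(words[0]);
        boolean secondIsInitial = isInitial(words[1]);
        if (firstIsInitial == secondIsInitial) {
            // Two full words (or two initials): ambiguous, so make no guess.
            return Optional.empty();
        }
        return Optional.of(firstIsInitial
                ? words[0] + " " + words[1]
                : words[1] + " " + words[0]);
    }

    /** True for a single letter, with or without a trailing dot ("O" or "O."). */
    private static boolean isInitial(String word) {
        String stripped = word.endsWith(".") ? word.substring(0, word.length() - 1) : word;
        return stripped.length() == 1 && Character.isLetter(stripped.charAt(0));
    }
}
```

A caller could fall back to the original string with `AuthorOrderGuess.firstNameFirst(name).orElse(name)`.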

@oscargus oscargus mentioned this issue Nov 3, 2015
9 tasks
@oscargus
Contributor Author

oscargus commented Nov 4, 2015

I guess that it is actually solved in bibjson (in the sense that it should be any of the valid bibtex formats). It is just that DOAJ does not always obtain correct data from its sources (some of which are "predatory" journals). I also noticed that for one of my papers, the author order was incorrect.

@matthiasgeiger
Member

As the PR regarding DOAJ is merged, can we close this, too?

@oscargus
Contributor Author

Closed as #287 is now merged.


3 participants