Fix checks for required fields for primary article #1258

Merged
ryscher merged 5 commits into main from require-related-metadata
Jul 5, 2023

Conversation

ahamelers
Collaborator

Closes datadryad/dryad-product-roadmap#2585

Also changed the import button label to sentence case.

@ahamelers ahamelers requested a review from ryscher June 29, 2023 15:51
@ahamelers ahamelers force-pushed the require-related-metadata branch from d93dbc1 to bfde155 Compare June 29, 2023 16:22
Contributor

@sfisher sfisher left a comment

This works pretty well for me, though I notice that the accepted DOI formatting differs from before: it now accepts only one style in order to submit. It validates and requires the bare-DOI identifier style rather than also accepting the URL style (which I think is the official DataCite style).

It actually functions no matter which style of DOI is entered: it autofills the page and also shows as an Article related work at the bottom of the page (with the full URL).

However, the validation will not accept both styles, even though everything else seems to function. IDK if that's a big deal.

For example, try it with importing "Nature Energy" and "https://doi.org/10.1038/s41560-023-01282-z" vs "10.1038/s41560-023-01282-z" in different datasets. Both will autofill and fill the related work, but only "10.1038/s41560-023-01282-z" is accepted as valid in the review page. 🤷‍♂️

Probably a dumb, minor thing, but what we accept is inconsistent with what the review page says. (And if we insist on only one or the other, I think the full URL is the one preferred by DataCite.)

@ryscher
Member

ryscher commented Jun 29, 2023

For ease of copy/paste, we should accept all 3: bare style, URL style, and namespace style ("doi:"), and convert them to whatever style we store internally.
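
A minimal sketch of the idea of accepting all three styles and converting to one, for illustration only. The helper name `normalize_doi`, the constant `DOI_PREFIXES`, and the choice of the URL style as the stored form are assumptions, not code from this pull request or from the Dryad codebase.

```ruby
# Hypothetical helper: accepts bare, "doi:", and URL-style DOIs.
# Prefixes that may precede the bare DOI (URL style or namespace style).
DOI_PREFIXES = %r{\A(?:https?://(?:dx\.)?doi\.org/|doi:\s*)}i

def normalize_doi(input)
  # Strip surrounding whitespace, then any recognized prefix.
  bare = input.to_s.strip.sub(DOI_PREFIXES, '')
  # Loose shape check: "10.", a registrant code, a slash, and a suffix.
  return nil unless bare.match?(%r{\A10\.\d{4,9}/\S+\z})

  # Assumption: the URL style is the stored form.
  "https://doi.org/#{bare}"
end
```

With this sketch, `normalize_doi('doi:10.1038/s41560-023-01282-z')` and `normalize_doi('10.1038/s41560-023-01282-z')` both yield the same URL-style string, and input that does not look like a DOI yields `nil`.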

@ahamelers
Collaborator Author

@ryscher @sfisher I just want to point out that, if you look at the changes, I did not create this DOI check at all. I've only made the check run whether or not the user has already entered the journal name (and added a check requiring that the journal name be entered). Presumably, any user who had entered a journal name was already getting this check! It existed and had tests written to check that it was operating. Are you now suggesting this existing check be modified or removed?

@ahamelers
Collaborator Author

Here's all the information I can discover about why this check was added (almost exactly a year ago!):

The commit: CDL-Dryad@5f9b5ac

Pull request: https://github.com/CDL-Dryad/dryad-app/pull/884

Release notes: https://github.com/CDL-Dryad/dryad-app/releases/tag/v0.7.32a


journal = " from #{@resource.identifier.publication_name}" if @resource.identifier.publication_name.present?

if @resource.identifier.import_info == 'published' &&
Contributor

Hi Audrey, this is the part that wasn't working, and it was changed in this pull request. Before, it had some conditions and returned early at the start of the article_id method. If it got past the early return, it simply checked that a primary article existed.

The logic here in lines 102-103 seems to be the following:

IF import_info == 'published' AND (the identifier has a publication DOI OR there is a primary article) THEN give an error.

I'm making another commit to fix this and another bug I found if someone puts in a bad identifier for the primary article first.

My new code is

        primary_article = @resource.related_identifiers.where(work_type: 'primary_article').first
        if @resource.identifier.import_info == 'published' && (
                      primary_article.nil? || primary_article.related_identifier.blank? || !primary_article.valid_doi_format?
                    )

It also just looks up the primary article from this resource. The identifier version looks through the whole history for a primary article, but we probably don't need to do that, and if their primary article is gone, they should probably change their selection among those 3 choices.
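
The condition above can be exercised outside Rails with a small stand-in, as a rough sketch only. `RelatedWork`, `primary_article_error?`, and the regex inside `valid_doi_format?` are hypothetical; the real model, its query, and its format check live in the app and may differ.

```ruby
# Stand-in for a related-work record; NOT the real Rails model.
RelatedWork = Struct.new(:work_type, :related_identifier) do
  # Hypothetical format check that accepts bare, "doi:", and URL styles.
  def valid_doi_format?
    related_identifier.to_s.strip.match?(%r{\A(?:https?://doi\.org/|doi:)?10\.\d{4,9}/\S+\z}i)
  end
end

# Mirrors the condition quoted above: with the 'published' import option,
# report an error when the primary article is missing, blank, or malformed.
def primary_article_error?(import_info, related_works)
  primary_article = related_works.find { |work| work.work_type == 'primary_article' }
  import_info == 'published' && (
    primary_article.nil? ||
      primary_article.related_identifier.to_s.strip.empty? ||
      !primary_article.valid_doi_format?
  )
end
```

In this sketch an empty list of related works with `'published'` selected is an error, while a primary article carrying a well-formed DOI in any of the three styles is not.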

Collaborator Author

@sfisher This part that you have quoted is just a variable to show a journal name in the error message if one is present. The error message previously had a journal name in it and I wanted to maintain that, but now I'm showing the same error even if the journal name is blank. What's not working about it?

Collaborator Author

@ahamelers ahamelers Jul 4, 2023

Okay, I think I can see in the old code that the primary_article DOI check was previously only triggered if the publication_doi field was blank. I guess when I was reconfiguring these checks I thought that was a bug, because this new change is a check that ensures the publication_doi field will never be left blank if the published article import option is selected, which means the primary_article DOI check will never be triggered, and there's an rspec test for it and everything! If it's preferred that it never be triggered, should I not just remove that primary_article check entirely?

@sfisher
Contributor

sfisher commented Jul 3, 2023

BTW, I added more commits that should fix this problem.

PS. The check that was there a year ago only checked that a DOI was filled IF they had selected "article" AND a publisher was filled in as I recall. I don't think it required anything to be filled if nothing was filled in the form (even if publisher was selected from radio buttons).

The diff shows that the code did change from the main branch and from a year ago. Maybe you were talking about a different section of code?

You can test the updated code by trying all 3 different DOI formats (and all seem to work as far as I can tell). I'm not sure why only some formats failed with this original pull request since I would've expected them all to fail. 🤷‍♂️

@ahamelers
Collaborator Author

> The check that was there a year ago only checked that a DOI was filled IF they had selected "article" AND a publisher was filled in as I recall. I don't think it required anything to be filled if nothing was filled in the form (even if publisher was selected from radio buttons).

Yes, that is what the pull request is fixing: it makes filling in these fields required.

@ahamelers
Collaborator Author

https://github.com/CDL-Dryad/dryad-app/pull/1258#discussion_r1252050222

Okay, I think I can see in the old code that the primary_article DOI check was previously only triggered if the publication_doi field was blank. I guess when I was reconfiguring these checks I thought that was a bug, because this new change is a check that ensures the publication_doi field will never be left blank if the published article import option is selected, which means the primary_article DOI check will never be triggered, and there's an rspec test for it and everything! If it's preferred that it never be triggered, should I not just remove that primary_article check entirely?

@ryscher
Member

ryscher commented Jul 5, 2023

Looks good to me. This is good to roll out now, and we can improve the overall interaction with a future redesign.

@ryscher ryscher merged commit 802d65a into main Jul 5, 2023
@ahamelers ahamelers deleted the require-related-metadata branch August 15, 2023 17:37

Successfully merging this pull request may close these issues.

Dataset submitted without journal metadata when "a manuscript in progress" was selected
3 participants