Handle additional cases in GraduationJob #2138

Merged
merged 1 commit into from
Apr 19, 2021
mark-dce
Contributor

We have diagnosed a long-standing bug that leaves ETDs in a published
but inactive state after the graduation job runs. This occurs when
the embargo expiration date for the work occurs before the date of
the graduation job run. This can happen:

  1. with shorter 6 month embargos when graduation is delayed
    because of department or registrar issues
  2. when an embargo was initially requested at submission, but the
    submission was later edited to remove the embargo request before
    graduation - this leaves a default six year embargo on the work,
    which the graduation job then truncates to end on the graduation
    date

Because of the process used to generate the registrar feed, graduation
dates always occur in the past in relation to the date of the graduation
job run - i.e. the registrar feed never contains future dated graduation
records. Therefore in case 2 above, the embargo expiration is set to a
date some number of days or weeks in the past. The same situation can
occur in case 1 when there is a multi-semester delay between the
ETD submission & approval and the final graduation date.

Hyrax includes a validation that prevents works from being saved
if they have an active embargo with an expiration date in the past.
A handful of works each month were failing this validation, and the
save operation in the GraduationJob was failing silently as a result.
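
The interaction between the truncation in case 2 and the future-date rule can be sketched in plain Ruby. This is only an illustration: `EmbargoedWork` and `embargo_errors` are hypothetical stand-ins, not Hyrax classes or methods, and the dates are illustrative (the graduation date is assumed to equal the release date seen in the log for ETD 9019s367m).

```ruby
require "date"

# Hypothetical stand-in for a work with an embargo; not a Hyrax class.
EmbargoedWork = Struct.new(:embargo_release_date) do
  # Mirrors the rule described above: an active embargo must end in the future.
  def embargo_errors(today)
    return [] if embargo_release_date.nil?
    embargo_release_date > today ? [] : ["Embargo release date Must be a future date"]
  end
end

job_run_date    = Date.new(2021, 4, 11)
graduation_date = Date.new(2020, 8, 18) # assumed; registrar feed dates are never in the future

# Case 2: a leftover default embargo (illustrative end date) on the work.
work = EmbargoedWork.new(Date.new(2026, 5, 1))
errors_before = work.embargo_errors(job_run_date)

# The graduation job truncates the embargo to end on the graduation date,
# which is already in the past on the day the job runs.
work.embargo_release_date = [work.embargo_release_date, graduation_date].min
errors_after = work.embargo_errors(job_run_date)
```

Before truncation the work passes the check; after truncation it cannot, because the graduation date always precedes the job run.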

This change set adds tests for the two cases described and makes
the necessary code modifications to successfully handle each case.
We have run the code against historical datasets loaded on the QA
environment and the publication bug did not occur for works that
are currently in the error state in the production environment.

To diagnose the problem, we added additional logging to GraduationJob
and were able to clearly see the issue as follows:

W, [2021-04-12T13:40:49.784025 #4023] WARN -- : [ActiveJob] [GraduationJob] [bbe82cf9-d2c7-4d99-9bde-0b74e8d4efd8] ETD 9019s367m embargo release is 2020-08-18T00:00:00+00:00
W, [2021-04-12T13:40:49.784191 #4023] WARN -- : [ActiveJob] [GraduationJob] [bbe82cf9-d2c7-4d99-9bde-0b74e8d4efd8] ETD 9019s367m saving work...
E, [2021-04-12T13:40:49.788255 #4023] ERROR -- : [ActiveJob] [GraduationJob] [bbe82cf9-d2c7-4d99-9bde-0b74e8d4efd8] Error performing GraduationJob (Job ID: bbe82cf9-d2c7-4d99-9bde-0b74e8d4efd8) from Sidekiq(default) in 6021.0ms: ActiveFedora::RecordInvalid (Validation failed: Embargo release date Must be a future date):

Relevant data from two ETDs processed on 2021-04-11 show the date conflict:

{
  "id":"9019s367m",
  "hasEmbargo_ssim":["5f8a5c88-64e8-4952-8a90-1a339d904cd0"],
  "embargo_length_ssi":"None - open access immediately",
  "suppressed_bsi":true,
  "embargo_release_date_dtsi":"2020-08-18T00:00:00Z"},
{
  "id":"9019s368w",
  "hasEmbargo_ssim":["ab9d4ac1-5b41-43ab-ade4-d567c33b7cbf"],
  "embargo_length_ssi":"6 months",
  "suppressed_bsi":true,
  "embargo_release_date_dtsi":"2021-02-18T00:00:00Z"},
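
As a rough check of the conflict, the release dates in these two records can be compared against the processing date. The sketch below copies only the relevant fields into a small JSON array; the variable names are illustrative and nothing here queries Solr.

```ruby
require "json"
require "date"

# Fields copied from the two records above.
records = JSON.parse(<<~JSON)
  [
    {"id": "9019s367m", "embargo_length_ssi": "None - open access immediately",
     "embargo_release_date_dtsi": "2020-08-18T00:00:00Z"},
    {"id": "9019s368w", "embargo_length_ssi": "6 months",
     "embargo_release_date_dtsi": "2021-02-18T00:00:00Z"}
  ]
JSON

processed_on = Date.new(2021, 4, 11)

# Number of days by which each release date precedes the processing date.
days_overdue = records.to_h do |record|
  release = Date.parse(record["embargo_release_date_dtsi"])
  [record["id"], (processed_on - release).to_i]
end

# Both values are positive, so both embargoes had already lapsed.
already_lapsed = days_overdue.values.all?(&:positive?)
```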

In each case, the embargo_release_date fell before the processing date,
which triggered the "Embargo release date Must be a future date"
validation failure. Because of the version of the save method being
called by the GraduationJob, these validation failures were silent: the
ETDs' workflow state was updated to published, but the updated
graduation date and visibility were not saved.
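
The description does not say which save variant the job originally called. The sketch below assumes the usual Rails-style contract, in which `save` returns `false` on a validation failure and `save!` raises. `FakeWork` and `RecordInvalid` are hypothetical stand-ins that only mimic that contract; they are not ActiveFedora classes.

```ruby
# Hypothetical error class standing in for a validation exception.
class RecordInvalid < StandardError; end

# Hypothetical stand-in that mimics the save / save! contract.
class FakeWork
  attr_reader :persisted

  def initialize(valid:)
    @valid = valid
    @persisted = false
  end

  # Returns false on validation failure; nothing is raised.
  def save
    return false unless @valid
    @persisted = true
  end

  # Raises on validation failure, so the caller cannot miss it.
  def save!
    save || raise(RecordInvalid, "Validation failed: Embargo release date Must be a future date")
  end
end

work = FakeWork.new(valid: false)

# Easy to overlook if the return value is never checked.
ignored_result = work.save # => false

raised = begin
  work.save!
  false
rescue RecordInvalid
  true
end
```

Under that assumption, a job that calls `save` without checking its result carries on as if the update succeeded, which matches the symptom of a published workflow state with stale graduation date and visibility.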

@mark-dce mark-dce force-pushed the fix_graduation_job branch 3 times, most recently from 90ea22c to 0e78a18 Compare April 17, 2021 21:16
@coveralls
coveralls commented Apr 17, 2021

Coverage increased (+0.02%) to 92.138% when pulling 941cb30 on fix_graduation_job into db22f6e on main.

@mark-dce mark-dce force-pushed the fix_graduation_job branch from 0e78a18 to 941cb30 Compare April 18, 2021 19:32
@fnibbit fnibbit merged commit c9ee414 into main Apr 19, 2021
@fnibbit fnibbit deleted the fix_graduation_job branch April 19, 2021 19:44