Skip to content

Commit

Permalink
Handle additional cases in GraduationJob
Browse files Browse the repository at this point in the history
We have diagnosed a long-standing bug that leaves ETDs in a published
but inactive state after the graduation job runs.  This occurs when
the embargo expiration date for the work occurrs before the date of
the graduation job run. This can happen
1. with shorter 6 month embargos when there is a delay in the graduation
   because of department or registrar issues
2. when and embargo was initially requested at submission, but the
   submission was later edited to remove the embargo request before
   graduation - this leaves a default six year embargo on the work
   which was trunctated to end on the graduation date by
   the graduation job

Because of the process used to generate the registrar feed, graduation
dates always occur in the past in relation to the date of the graduation
job run - i.e. the registrar feed never contains future dated graduation
reocrds.  Therefore in case 2 above, the embargo expiration is set to a
date some number of days or weeks in the past.  The same situation can
occur for case when when there is a multi-semester delay between the
ETD submission & approval in relation to the final graduation date.

Hyrax includes a validation that prevents works from being saved
if they have an active embargo with an expiration date in the past.
A handful of works each month were failing this validation and the save
operation in the GraduationJob was silently failing due to this
validation failure.

This change set adds tests for the two cases described and makes
the necessary code modifications to successfully handle each case.
We have run the code against historical datasets loaded on the QA
environment  and the publication bug did not occur for works that
are currently in the error state in the production environment.

To diagnose the problem, we added additional logging to GraduationJob
and were able to clearly see the issue as follows:
```
W, [2021-04-12T13:40:49.784025 #4023] WARN -- : [ActiveJob] [GraduationJob] [bbe82cf9-d2c7-4d99-9bde-0b74e8d4efd8] ETD 9019s367m embargo release is 2020-08-18T00:00:00+00:00
W, [2021-04-12T13:40:49.784191 #4023] WARN -- : [ActiveJob] [GraduationJob] [bbe82cf9-d2c7-4d99-9bde-0b74e8d4efd8] ETD 9019s367m saving work...
E, [2021-04-12T13:40:49.788255 #4023] ERROR -- : [ActiveJob] [GraduationJob] [bbe82cf9-d2c7-4d99-9bde-0b74e8d4efd8] Error performing GraduationJob (Job ID: bbe82cf9-d2c7-4d99-9bde-0b74e8d4efd8) from Sidekiq(default) in 6021.0ms: ActiveFedora::RecordInvalid (Validation failed: Embargo release date Must be a future date):
```

Relevant data from two ETDs processed on 2021-04-11 show the date conflict
```
{
  "id":"9019s367m",
  "hasEmbargo_ssim":["5f8a5c88-64e8-4952-8a90-1a339d904cd0"],
  "embargo_length_ssi":"None - open access immediately",
  "suppressed_bsi":true,
  "embargo_release_date_dtsi":"2020-08-18T00:00:00Z"},
{
  "id":"9019s368w",
  "hasEmbargo_ssim":["ab9d4ac1-5b41-43ab-ade4-d567c33b7cbf"],
  "embargo_length_ssi":"6 months",
  "suppressed_bsi":true,
  "embargo_release_date_dtsi":"2021-02-18T00:00:00Z"},
```
In each case, the embargo_release_date prior to the processing date was
triggering the "Embargo release date Must be a future date" validation
failure.  Because of the version of the save method being called by the
GraduationJob, these validations were silently failing and the ETDs'
workflow state was being updated to `published` but the updated
graduation date and visibility were not being saved successfully.
  • Loading branch information
mark-dce committed Apr 18, 2021
1 parent db22f6e commit 941cb30
Show file tree
Hide file tree
Showing 4 changed files with 101 additions and 27 deletions.
2 changes: 1 addition & 1 deletion app/actors/hyrax/actors/pregrad_embargo.rb
Original file line number Diff line number Diff line change
Expand Up @@ -40,7 +40,7 @@ def handle_malformed_data(env)
"is being interpreted as a request for no embargo on " \
"#{env.attributes[:title]}."

env.attributes.delete(:embargo_length)
# env.attributes.delete(:embargo_length)

{}
end
Expand Down
15 changes: 14 additions & 1 deletion app/jobs/graduation_job.rb
Original file line number Diff line number Diff line change
Expand Up @@ -21,7 +21,7 @@ def perform(work_id, graduation_date = Time.zone.today.to_s)
publish_object
ProquestJob.perform_later(@work.id)
send_notifications
@work.save
@work.save!
end

# @param [Date] graduation_date
Expand Down Expand Up @@ -60,6 +60,19 @@ def update_embargo_release_date
fs.embargo.embargo_release_date = embargo_release_date
fs.embargo.save
end

if embargo_release_date <= Time.zone.today
@work.visibility = @work.visibility_after_embargo if @work.visibility_after_embargo
@work.deactivate_embargo!
@work.embargo.save
@work.save
@work.file_sets.each do |fs|
fs.visibility = @work.visibility
fs.deactivate_embargo!
fs.save
end
end

Rails.logger.warn "Graduation Job: ETD #{@work.id} embargo release date set to #{@work.embargo.embargo_release_date}"
rescue => e
Rails.logger.error "Error updating embargo release date for work #{@work}: #{e}"
Expand Down
4 changes: 2 additions & 2 deletions spec/actors/hyrax/actors/pregrad_embargo_spec.rb
Original file line number Diff line number Diff line change
Expand Up @@ -35,10 +35,10 @@
'embargo_length' => InProgressEtd::NO_EMBARGO }
end

it "removes the embargo_length attribute" do
it "does not remove the embargo_length attribute" do
expect(env.attributes["embargo_length"]).to eq InProgressEtd::NO_EMBARGO
middleware.create(env)
expect(env.attributes["embargo_length"]).to eq nil
expect(env.attributes["embargo_length"]).to eq InProgressEtd::NO_EMBARGO
end
end

Expand Down
107 changes: 84 additions & 23 deletions spec/jobs/graduation_job_with_embargo_spec.rb
Original file line number Diff line number Diff line change
@@ -1,48 +1,67 @@
require 'rails_helper'
# For a work with an embargo, the GraduationJob should:
# For the supplied ETD and graduation date, the GraduationJob should:
# * set the degree_awarded date
# * publish the work (workflow transition)
# * update the relevant User object to have the post-graduation email address
# * update the embargo release date to the user's graduation date plus
# their requested embargo length
# * update the embargo release date to the user's graduation date plus their requested embargo length
# * expire the embargo if it has already passed (sometimes happens when graduation is delayed)
# * send notifications
describe GraduationJob, integration: true do
context "a student with a requested embargo", :perform_jobs do
let(:workflow) { WorkflowSetup.new("#{fixture_path}/config/emory/superusers.yml", "#{fixture_path}/config/emory/candler_admin_sets.yml", "/dev/null") }
describe GraduationJob, :perform_jobs, integration: true do
context "standard cases" do
let(:user) { FactoryBot.create(:user) }
let(:ability) { ::Ability.new(user) }
let(:for_embargo) { Hyrax::Actors::Environment.new(Etd.new, ability, attributes) }
let(:attributes) {
let(:for_embargo) { Hyrax::Actors::Environment.new(Etd.new, ability, attributes.merge(embargo)) }
let(:no_embargo) { Hyrax::Actors::Environment.new(Etd.new, ability, attributes.merge(open_access)) }
let(:open) { Hydra::AccessControls::AccessRight::VISIBILITY_TEXT_VALUE_PUBLIC }
let(:restricted) { Hydra::AccessControls::AccessRight::VISIBILITY_TEXT_VALUE_PRIVATE }
let(:attributes) {
Hash[
title: ['The Adventures of Cottontail Rabbit'],
depositor: user.user_key,
post_graduation_email: ['me@after.graduation.com'],
creator: ['Quest, June'],
school: ["Candler School of Theology"],
department: ["Divinity"],
uploaded_files: [FactoryBot.create(:primary_uploaded_file, user_id: user.id).id]
]
}
let(:embargo) {
Hash[
files_embargoed: true,
abstract_embargoed: true,
toc_embargoed: true,
embargo_length: '6 months',
uploaded_files: [file.id]
embargo_length: '6 months'
]
}
let(:six_years_from_today) { Time.zone.today + 6.years }
let(:file) { FactoryBot.create :primary_uploaded_file, user_id: user.id }
let(:open) { Hydra::AccessControls::AccessRight::VISIBILITY_TEXT_VALUE_PUBLIC }
let(:restricted) { Hydra::AccessControls::AccessRight::VISIBILITY_TEXT_VALUE_PRIVATE }
let(:open_access) {
Hash[
files_embargoed: false,
abstract_embargoed: false,
toc_embargoed: false,
embargo_length: InProgressEtd::NO_EMBARGO
]
}

before :all do
WorkflowSetup.new("#{fixture_path}/config/emory/superusers.yml", "#{fixture_path}/config/emory/candler_admin_sets.yml", "/dev/null").setup
end

before do
workflow.setup
allow(Hyrax::Workflow::DegreeAwardedNotification).to receive(:send_notification)
ActiveJob::Base.queue_adapter.filter = [AttachFilesToWorkJob]
Hyrax::CurationConcern.actor.create(for_embargo)
@etd_with_embargo_requested = for_embargo.curation_concern.id
end

it "performs the graduation process" do
let(:eight_months_ago) { 8.months.ago.beginning_of_day }
let(:two_months_ago) { eight_months_ago + 6.months }
let(:one_month_ago) { 1.month.ago.beginning_of_day }
let(:five_months_from_now) { one_month_ago + 6.months }
let(:six_years_from_today) { 6.years.from_now.beginning_of_day }

it "handles ETDs with embargos expiring in the future", :aggregate_failures do
# Before the GraduationJob is run
etd = Etd.find(@etd_with_embargo_requested)
Hyrax::CurationConcern.actor.create(for_embargo)
id = for_embargo.curation_concern.id
etd = Etd.find(id)
expect(etd.degree_awarded).to eq nil
expect(etd.embargo.embargo_release_date).to eq six_years_from_today
expect(etd.embargo_length).to eq "6 months"
Expand All @@ -53,12 +72,14 @@
expect(etd.file_sets.first)
.to have_attributes visibility: restricted
graduation_job = described_class.new
graduation_job.perform(etd.id, Time.zone.tomorrow)
graduation_job.perform(etd.id, one_month_ago)
etd.reload

# After the GraduationJob is run
# The ETD should now have a degree_awarded date
expect(etd.degree_awarded).to eq Time.zone.tomorrow
# The degree awarded date will always be in the past
# because of how the Registrar data is produced
expect(etd.degree_awarded).to eq one_month_ago

# An object must be "published" and "active" to be publicly visible
expect(etd.to_sipity_entity.workflow_state_name).to eq "published"
Expand All @@ -70,8 +91,9 @@

# The embargo_release_date of the ETD and any attached files should now
# equal the user's graduation date plus the requested embargo length
expect(etd.embargo.embargo_release_date).to eq Time.zone.tomorrow + 6.months
expect(etd.file_sets.first.embargo.embargo_release_date).to eq Time.zone.tomorrow + 6.months
# i.e. one month ago + 6 months = 5 months from now
expect(etd.embargo.embargo_release_date).to eq five_months_from_now
expect(etd.file_sets.first.embargo.embargo_release_date).to eq five_months_from_now

# Attached files should be restricted during the embargo period
expect(etd.file_sets.first).to have_attributes visibility: restricted
Expand All @@ -83,5 +105,44 @@
# Notifications have been sent that the degree was awarded and the ETD was published
expect(Hyrax::Workflow::DegreeAwardedNotification).to have_received(:send_notification)
end

it "handles ETDs with embargos ending before the job run", :aggregate_failures do
Hyrax::CurationConcern.actor.create(for_embargo)
etd_id = for_embargo.curation_concern.id
graduation_job = described_class.new
graduation_job.perform(etd_id, eight_months_ago)

etd = Etd.find(etd_id)
expect(etd.degree_awarded).to eq eight_months_ago
expect(etd.embargo_length).to eq "6 months"
expect(etd.to_sipity_entity.workflow_state_name).to eq "published"
expect(etd.state).to eq Vocab::FedoraResourceStatus.active

# The embargo should be deactivated (embargo_release_date = nil) and
# The embargo history should list the embargo_release_date of the expired embargo
# i.e. 8 months ago + 6 months = 2 months ago
expect(etd.embargo.embargo_history.last).to include two_months_ago.strftime('%Y-%m-%d')

# Embargo should be expired and attached files should be visible
expect(etd.file_sets.first.embargo.embargo_history.last).to include two_months_ago.strftime('%Y-%m-%d')
expect(etd.file_sets.first).to have_attributes visibility: open
end

it "handles ETDs without embargos", :aggregate_failures do
Hyrax::CurationConcern.actor.create(no_embargo)
etd_id = no_embargo.curation_concern.id
graduation_job = described_class.new
graduation_job.perform(etd_id, one_month_ago)

etd = Etd.find(etd_id)
expect(etd.degree_awarded).to eq one_month_ago
expect(etd.to_sipity_entity.workflow_state_name).to eq "published"
expect(etd.state).to eq Vocab::FedoraResourceStatus.active
expect(etd.file_sets.first).to have_attributes visibility: open
expect(etd.embargo_length).to eq "None - open access immediately" # i.e. InProgressEtd::NO_EMBARGO

# The etd should not have an embargo
expect(etd.embargo).to be_nil
end
end
end

0 comments on commit 941cb30

Please sign in to comment.