HYC 1678 - Dimensions ingest email report generation #1100

Merged 30 commits on May 30, 2024
Commits
1bee634 tweaking ingest service, introducing reporting service (davidcam-src, May 21, 2024)
8248933 syntax, including test suite (davidcam-src, May 21, 2024)
96ebbbe small changes to reporting service and ingest (davidcam-src, May 21, 2024)
d78b5c8 tests for report generation helper function (davidcam-src, May 21, 2024)
e1b8e2a slight phrasing change, updating test (davidcam-src, May 22, 2024)
31a7cec changes to ingest and reporting service to include cdr urls in email … (davidcam-src, May 23, 2024)
668b778 configuration for action mailer, slight refactor to report generation (davidcam-src, May 24, 2024)
a4857f1 syntax (davidcam-src, May 24, 2024)
3a7a21e test refactoring (davidcam-src, May 24, 2024)
d6b76fa rubocop, mailer preview and tests (davidcam-src, May 24, 2024)
862ed0f removing comment (davidcam-src, May 24, 2024)
5db8bab rubocop, renaming some stuff, updating preview class (davidcam-src, May 24, 2024)
0550464 adding comment (davidcam-src, May 25, 2024)
467f8b4 code climate (davidcam-src, May 25, 2024)
b1a3369 rubocop (davidcam-src, May 25, 2024)
86d40d2 addressing failing test (davidcam-src, May 27, 2024)
4d73750 logs for debugging pr (davidcam-src, May 28, 2024)
901b598 another log statement (davidcam-src, May 28, 2024)
58dc740 modified build step to save time (davidcam-src, May 28, 2024)
f63ff6e modified logging for debugging (davidcam-src, May 28, 2024)
38cc7a2 temporarily adding env variable to build yaml, logging (davidcam-src, May 28, 2024)
4554b58 rubocop (davidcam-src, May 28, 2024)
c71489a moving ingested publications to after virus checker stubbing (davidcam-src, May 28, 2024)
cb0d54b removing configuration stuff and debug logs, moving dimensions ingest… (davidcam-src, May 28, 2024)
958d32d simulating missing pdfs in email report preview and reporting service… (davidcam-src, May 28, 2024)
a4e975f updating build yml (davidcam-src, May 28, 2024)
f0a7f93 updating column title (davidcam-src, May 28, 2024)
2973f65 rolling back debugging related change to build yml (davidcam-src, May 28, 2024)
d1de3f4 implement suggested review changes (davidcam-src, May 29, 2024)
d6702ef updating view template (davidcam-src, May 29, 2024)
2 changes: 1 addition & 1 deletion app/mailers/application_mailer.rb
@@ -1,5 +1,5 @@
 # frozen_string_literal: true
 class ApplicationMailer < ActionMailer::Base
-  default from: 'from@example.com'
+  default from: 'no-reply@example.com'
   layout 'mailer'
 end
7 changes: 7 additions & 0 deletions app/mailers/dimensions_report_mailer.rb
@@ -0,0 +1,7 @@
# frozen_string_literal: true
class DimensionsReportMailer < ApplicationMailer
  def dimensions_report_email(report)
    @report = report
    mail(to: 'recipient@example.com', subject: report[:subject])
  end
end
9 changes: 5 additions & 4 deletions app/services/tasks/dimensions_ingest_service.rb
@@ -19,15 +19,16 @@ def initialize(config)
     def ingest_publications(publications)
       time = Time.now
       Rails.logger.info('Ingesting publications from Dimensions.')
-      res = {ingested: [], failed: [], time: time}
+      res = {ingested: [], failed: [], time: time, admin_set_title: @admin_set.title.first, depositor: @config['depositor_onyen']}

       publications.each.with_index do |publication, index|
         begin
           next unless publication.presence
-          process_publication(publication)
-          res[:ingested] << publication
+          article = process_publication(publication)
+          res[:ingested] << publication.merge('article_id' => article.id)
         rescue StandardError => e
-          res[:failed] << { publication: publication, error: [e.class.to_s, e.message] }
+          publication.delete('pdf_attached')
+          res[:failed] << publication.merge('error' => [e.class.to_s, e.message])
           Rails.logger.error("Error ingesting publication '#{publication['title']}'")
           Rails.logger.error [e.class.to_s, e.message, *e.backtrace].join($RS)
         end
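The hunk above changes the shape of the result hash: ingested entries gain an `article_id`, and failed entries become flat hashes carrying an `error` pair instead of a nested `{ publication:, error: }` wrapper. A minimal standalone sketch of that accumulation pattern follows. It runs without Rails; `FakeArticle`, `collect_ingest_results`, and the `processor` lambda are hypothetical stand-ins for the real article object and `process_publication`, not part of the PR.

```ruby
# Standalone sketch of the result-accumulation pattern; no Rails required.
# FakeArticle and `processor` are hypothetical stand-ins for illustration.
FakeArticle = Struct.new(:id)

def collect_ingest_results(publications, processor)
  res = { ingested: [], failed: [], time: Time.now }
  publications.each do |publication|
    # Plain-Ruby approximation of `next unless publication.presence`.
    next if publication.nil? || publication.empty?
    begin
      article = processor.call(publication)
      # Successful entries carry the new article's id alongside the source fields.
      res[:ingested] << publication.merge('article_id' => article.id)
    rescue StandardError => e
      # Failed entries drop 'pdf_attached' and gain an [class, message] pair.
      publication.delete('pdf_attached')
      res[:failed] << publication.merge('error' => [e.class.to_s, e.message])
    end
  end
  res
end

processor = lambda do |pub|
  raise ArgumentError, 'missing title' unless pub['title']
  FakeArticle.new("article-#{pub['id']}")
end

results = collect_ingest_results(
  [{ 'id' => 'pub.1', 'title' => 'A' }, { 'id' => 'pub.2', 'pdf_attached' => true }, {}],
  processor
)
```

Keeping both lists as flat publication hashes lets a downstream consumer read `title` and `id` the same way for successes and failures.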
41 changes: 41 additions & 0 deletions app/services/tasks/dimensions_reporting_service.rb
@@ -0,0 +1,41 @@
# frozen_string_literal: true
module Tasks
  class DimensionsReportingService
    def initialize(ingested_publications)
      @ingested_publications = ingested_publications
    end

    def generate_report
      report = { successfully_ingested_rows: [], marked_for_review_rows: [], failed_to_ingest_rows: [], subject: [], headers: { }}
      extracted_info = extract_publication_info
      formatted_time = @ingested_publications[:time].strftime('%B %d, %Y at %I:%M %p %Z')
      report[:subject] = "Dimensions Ingest Report for #{formatted_time}"
      report[:headers][:reporting_message] = "Reporting publications from dimensions ingest on #{formatted_time} by #{@ingested_publications[:depositor]}."
      report[:headers][:admin_set] = "Admin Set: #{@ingested_publications[:admin_set_title]}"
      report[:headers][:total_publications] = "Total Publications: #{extracted_info[:successfully_ingested].length + extracted_info[:failed_to_ingest].length + extracted_info[:marked_for_review].length}"
      report[:headers][:successfully_ingested] = "\nSuccessfully Ingested: (#{extracted_info[:successfully_ingested].length} Publications)"
      report[:headers][:marked_for_review] = "\nMarked for Review: (#{extracted_info[:marked_for_review].length} Publications)"
      report[:headers][:failed_to_ingest] = "\nFailed to Ingest: (#{extracted_info[:failed_to_ingest].length} Publications)"
      report[:successfully_ingested_rows] = extracted_info[:successfully_ingested]
      report[:marked_for_review_rows] = extracted_info[:marked_for_review]
      report[:failed_to_ingest_rows] = extracted_info[:failed_to_ingest]
      report
    end

    def extract_publication_info
      publication_info = {successfully_ingested: [], failed_to_ingest: [], marked_for_review: []}
      @ingested_publications[:ingested].map do |publication|
        publication_item = { title: publication['title'], id: publication['id'], url: "#{ENV['HYRAX_HOST']}/concern/articles/#{publication['article_id']}?locale=en", pdf_attached: publication['pdf_attached'] ? 'Yes' : 'No' }
        if publication['marked_for_review']
          publication_info[:marked_for_review] << publication_item
        else
          publication_info[:successfully_ingested] << publication_item
        end
      end
      @ingested_publications[:failed].map do |publication|
        publication_info[:failed_to_ingest] << { title: publication['title'], id: publication['id'], error: "#{publication['error'][0]} - #{publication['error'][1]}" }
      end
      publication_info
    end
  end
end
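To make the data flow through the reporting service concrete, here is a small plain-Ruby sketch of the input hash it expects and the row shapes it derives. The host value and the sample publications are made up for illustration, and the split is written with `Enumerable#partition` rather than the service's own loop, so treat it as an equivalent restatement under those assumptions and not as the merged code.

```ruby
# Sketch of the reporting input and the derived row shapes; no Rails required.
# `host` is a hypothetical value; the service itself reads ENV['HYRAX_HOST'].
host = 'https://cdr.example.org'

ingest_result = {
  time: Time.utc(2024, 5, 21, 10, 0, 0),
  admin_set_title: 'Open_Access_Articles_and_Book_Chapters',
  depositor: 'admin',
  ingested: [
    { 'title' => 'Paper A', 'id' => 'pub.1', 'article_id' => 'abc123', 'pdf_attached' => true },
    { 'title' => 'Paper B', 'id' => 'pub.2', 'article_id' => 'def456', 'marked_for_review' => true }
  ],
  failed: [
    { 'title' => 'Paper C', 'id' => 'pub.3', 'error' => ['StandardError', 'Test error'] }
  ]
}

# Same format string the service uses for the subject and the header message.
formatted_time = ingest_result[:time].strftime('%B %d, %Y at %I:%M %p %Z')

# Split ingested publications into "marked for review" and "successfully ingested".
review, ok = ingest_result[:ingested].partition { |pub| pub['marked_for_review'] }

to_row = lambda do |pub|
  { title: pub['title'], id: pub['id'],
    url: "#{host}/concern/articles/#{pub['article_id']}?locale=en",
    pdf_attached: pub['pdf_attached'] ? 'Yes' : 'No' }
end

ok_rows = ok.map(&to_row)
review_rows = review.map(&to_row)
failed_rows = ingest_result[:failed].map do |pub|
  { title: pub['title'], id: pub['id'], error: pub['error'].join(' - ') }
end
total = ok_rows.length + review_rows.length + failed_rows.length
```

Note that a missing `'pdf_attached'` key is treated the same as a false value, so publications without the flag are reported as `'No'`.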
113 changes: 113 additions & 0 deletions app/views/dimensions_report_mailer/dimensions_report_email.erb
@@ -0,0 +1,113 @@
<!-- app/views/dimensions_report_mailer/dimensions_report_email.html.erb -->
<!DOCTYPE html>
<html>
<head>
  <meta charset="UTF-8">
  <style>
    table {
      width: 100%;
      border-collapse: collapse;
    }
    th, td {
      border: 1px solid #dddddd;
      text-align: left;
      padding: 8px;
    }
    th {
      background-color: #f2f2f2;
    }
    .title-col {
      width: 30%;
    }
    .id-col {
      width: 20%;
    }
    .url-col {
      width: 30%;
    }
    .pdf-col {
      width: 5%;
    }
    .error-col {
      width: 15%;
    }
  </style>
</head>
<body>
  <p><%= @report[:headers][:reporting_message] %></p>
  <p><%= @report[:headers][:admin_set] %></p>
  <p><%= @report[:headers][:total_publications] %></p>

  <h2><%= @report[:headers][:successfully_ingested] %></h2>
  <table>
    <thead>
      <tr>
        <th class="title-col">Title</th>
        <th class="id-col">Dimensions ID</th>
        <th class="url-col">URL</th>
        <th class="pdf-col">PDF Attached</th>
        <th class="error-col">Error</th>
      </tr>
    </thead>
    <tbody>
      <% @report[:successfully_ingested_rows].each do |publication| %>
        <tr>
          <td class="title-col"><%= publication[:title] %></td>
          <td class="id-col"><%= publication[:id] %></td>
          <td class="url-col"><a href="<%= publication[:url] %>"><%= publication[:url] %></a></td>
          <td class="pdf-col"><%= publication[:pdf_attached] %></td>
          <td class="error-col">N/A</td>
        </tr>
      <% end %>
    </tbody>
  </table>

  <h2><%= @report[:headers][:marked_for_review] %></h2>
  <table>
    <thead>
      <tr>
        <th class="title-col">Title</th>
        <th class="id-col">Dimensions ID</th>
        <th class="url-col">URL</th>
        <th class="pdf-col">PDF Attached</th>
        <th class="error-col">Error</th>
      </tr>
    </thead>
    <tbody>
      <% @report[:marked_for_review_rows].each do |publication| %>
        <tr>
          <td class="title-col"><%= publication[:title] %></td>
          <td class="id-col"><%= publication[:id] %></td>
          <td class="url-col"><a href="<%= publication[:url] %>"><%= publication[:url] %></a></td>
          <td class="pdf-col"><%= publication[:pdf_attached] %></td>
          <td class="error-col">N/A</td>
        </tr>
      <% end %>
    </tbody>
  </table>

  <h2><%= @report[:headers][:failed_to_ingest] %></h2>
  <table>
    <thead>
      <tr>
        <th class="title-col">Title</th>
        <th class="id-col">Dimensions ID</th>
        <th class="url-col">URL</th>
        <th class="pdf-col">PDF Attached</th>
        <th class="error-col">Error</th>
      </tr>
    </thead>
    <tbody>
      <% @report[:failed_to_ingest_rows].each do |publication| %>
        <tr>
          <td class="title-col"><%= publication[:title] %></td>
          <td class="id-col"><%= publication[:id] %></td>
          <td class="url-col">N/A</td>
          <td class="pdf-col">N/A</td>
          <td class="error-col"><%= publication[:error] %></td>
        </tr>
      <% end %>
    </tbody>
  </table>
</body>
</html>
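The view above loops over report rows and emits one table row per publication. As a rough way to experiment with that row-rendering idea outside Rails, the sketch below uses Ruby's standard-library `ERB`. The template text and sample row are illustrative only and are not the project's view. One difference worth noting: Rails views HTML-escape `<%= %>` output by default, while plain `ERB` does not, so the sketch escapes the title explicitly.

```ruby
require 'erb'

# Minimal stand-in for the mailer view: a table body rendered from report rows.
# Template and data are illustrative assumptions, not the PR's template.
template = ERB.new(<<~HTML, trim_mode: '-')
  <tbody>
  <%- rows.each do |publication| -%>
    <tr><td><%= ERB::Util.html_escape(publication[:title]) %></td><td><%= publication[:pdf_attached] %></td></tr>
  <%- end -%>
  </tbody>
HTML

rows = [{ title: 'Soil & Water', pdf_attached: 'Yes' }]

# result_with_hash exposes each key as a local variable inside the template.
html = template.result_with_hash(rows: rows)
puts html
```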
119 changes: 119 additions & 0 deletions spec/mailers/dimensions_report_mailer_spec.rb
@@ -0,0 +1,119 @@
# frozen_string_literal: true
# spec/mailers/dimensions_report_mailer_spec.rb
require 'rails_helper'

RSpec.describe DimensionsReportMailer, type: :mailer do
  let(:config) {
    {
      'admin_set' => 'Open_Access_Articles_and_Book_Chapters',
      'depositor_onyen' => 'admin'
    }
  }
  let(:dimensions_ingest_test_fixture) do
    File.read(File.join(Rails.root, '/spec/fixtures/files/dimensions_ingest_test_fixture.json'))
  end

  let(:admin) { FactoryBot.create(:admin) }
  let(:admin_set) do
    FactoryBot.create(:admin_set, title: ['Open_Access_Articles_and_Book_Chapters'])
  end
  let(:permission_template) do
    FactoryBot.create(:permission_template, source_id: admin_set.id)
  end
  let(:workflow) do
    FactoryBot.create(:workflow, permission_template_id: permission_template.id, active: true)
  end
  let(:workflow_state) do
    FactoryBot.create(:workflow_state, workflow_id: workflow.id, name: 'deposited')
  end

  let(:pdf_content) { File.binread(File.join(Rails.root, '/spec/fixtures/files/sample_pdf.pdf')) }
  let(:test_err_msg) { 'Test error' }

  let(:fixed_time) { Time.new(2024, 5, 21, 10, 0, 0) }
  # Removing linkout pdf from some publications to simulate missing pdfs
  let(:test_publications) {
    all_publications = JSON.parse(dimensions_ingest_test_fixture)['publications']
    all_publications.each_with_index do |pub, index|
      pub.delete('linkout') if index.even?
    end
    all_publications
  }

  let(:failing_publication_sample) { test_publications[0..2] }
  let(:marked_for_review_sample) do
    test_publications[3..5].each { |pub| pub['marked_for_review'] = true }
    test_publications[3..5]
  end
  let(:successful_publication_sample) { test_publications[6..-1] }

  let(:ingest_service) { Tasks::DimensionsIngestService.new(config) }
  let(:ingested_publications) do
    ingest_service.ingest_publications(test_publications)
  end
  let(:report) { Tasks::DimensionsReportingService.new(ingested_publications).generate_report }

  before do
    ActiveFedora::Cleaner.clean!
    admin_set
    permission_template
    workflow
    workflow_state
    allow(Time).to receive(:now).and_return(fixed_time)
    allow(User).to receive(:find_by).with(uid: 'admin').and_return(admin)
    allow(AdminSet).to receive(:where).with(title: 'Open_Access_Articles_and_Book_Chapters').and_return([admin_set])
    stub_request(:head, 'https://test-url.com/')
      .to_return(status: 200, headers: { 'Content-Type' => 'application/pdf' })
    stub_request(:get, 'https://test-url.com/')
      .to_return(body: pdf_content, status: 200, headers: { 'Content-Type' => 'application/pdf' })
    allow(ingest_service).to receive(:process_publication).and_call_original
    allow(ingest_service).to receive(:process_publication).with(satisfy { |pub| failing_publication_sample.include?(pub) }).and_raise(StandardError, test_err_msg)
    marked_for_review_sample
    # stub virus checking
    allow(Hyrax::VirusCheckerService).to receive(:file_has_virus?) { false }
    # stub longleaf job
    allow(RegisterToLongleafJob).to receive(:perform_later).and_return(nil)
    # stub FITS characterization
    allow(CharacterizeJob).to receive(:perform_later)
    ingested_publications
  end

  describe 'dimensions_report_email' do
    let(:mail) { DimensionsReportMailer.dimensions_report_email(report) }

    it 'renders the headers' do
      expect(mail.subject).to eq(report[:subject])
      expect(mail.to).to eq(['recipient@example.com'])
      expect(mail.from).to eq(['no-reply@example.com'])
    end

    it 'renders the body' do
      expect(mail.body.encoded).to include(report[:headers][:reporting_message])
        .and include(report[:headers][:admin_set])
        .and include(report[:headers][:total_publications])
        .and include(report[:headers][:successfully_ingested])
        .and include(report[:headers][:marked_for_review])
        .and include(report[:headers][:failed_to_ingest])

      report[:successfully_ingested_rows].each do |publication|
        expect(mail.body.encoded).to include(publication[:title])
          .and include(publication[:id])
          .and include(publication[:url])
          .and include(publication[:pdf_attached])
      end

      report[:marked_for_review_rows].each do |publication|
        expect(mail.body.encoded).to include(publication[:title])
          .and include(publication[:id])
          .and include(publication[:url])
          .and include(publication[:pdf_attached])
      end

      report[:failed_to_ingest_rows].each do |publication|
        expect(mail.body.encoded).to include(publication[:title])
          .and include(publication[:id])
          .and include(publication[:error])
      end
    end
  end
end
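The spec's fixture preparation leans on two plain-Ruby behaviors: deleting `'linkout'` from even-indexed publications, and flagging a slice of the list in place. The second works because `Array#[]` with a range returns a new array that holds the same hash objects, so mutating the slice's elements also changes the original list. A small sketch with made-up publications (the ids and count are hypothetical, not taken from the fixture file):

```ruby
# Sample data is invented for illustration; the real spec loads a JSON fixture.
publications = Array.new(8) { |i| { 'id' => "pub.#{i}", 'linkout' => 'https://test-url.com/' } }

# Simulate missing PDFs: even-indexed publications lose their 'linkout' key.
publications.each_with_index { |pub, index| pub.delete('linkout') if index.even? }

# The slice is a new array, but its elements are the same hashes as in `publications`.
publications[3..5].each { |pub| pub['marked_for_review'] = true }

with_pdf_link = publications.select { |pub| pub.key?('linkout') }.map { |pub| pub['id'] }
flagged = publications.select { |pub| pub['marked_for_review'] }.map { |pub| pub['id'] }
```

Under these assumptions, only odd-indexed publications keep a PDF link, which matches the preview's choice to mark odd entries as having PDFs attached.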
38 changes: 38 additions & 0 deletions spec/mailers/previews/dimensions_report_mailer_preview.rb
@@ -0,0 +1,38 @@
# frozen_string_literal: true
# Preview all emails at http://localhost:3000/rails/mailers/dimensions_report_mailer
class DimensionsReportMailerPreview < ActionMailer::Preview
  def dimensions_report_email
    # Ensuring template works with a report generated after an ingest with a test fixture
    dimensions_ingest_test_fixture = File.read(File.join(Rails.root, '/spec/fixtures/files/dimensions_ingest_test_fixture.json'))
    test_publications = JSON.parse(dimensions_ingest_test_fixture)['publications']
    # Marking some publications for review
    test_publications[3..5].each { |pub| pub['marked_for_review'] = true }
    config = {
      'admin_set' => 'default',
      'depositor_onyen' => 'admin'
    }

    dimensions_ingest_service = Tasks::DimensionsIngestService.new(config)
    ingested_publications = dimensions_ingest_service.ingest_publications(test_publications)
    # Marking some successfully ingested publications as having attached PDFs, workaround to stubbing requests for them
    # Odd publications marked as pdf_attached for consistency with tests
    ingested_publications[:ingested].each_with_index do |pub, index|
      if index.odd?
        pub['pdf_attached'] = 'Yes'
      end
    end

    # Moving publications in the failing publication sample to the failed array and adding an error message
    failing_publication_sample = test_publications[0..2]
    ingested_publications[:ingested] = ingested_publications[:ingested].reject do |pub|
      failing_publication_sample.any? { |failing_pub| failing_pub['id'] == pub['id'] }
    end
    ingested_publications[:failed] = failing_publication_sample.map do |pub|
      pub.merge('error' => ['Test error', 'Test error message'])
    end

    dimensions_reporting_service = Tasks::DimensionsReportingService.new(ingested_publications)
    report = dimensions_reporting_service.generate_report
    DimensionsReportMailer.dimensions_report_email(report)
  end
end