Fix duplicate thumbnails for PDFs #2254

Merged
merged 2 commits into from
Jun 24, 2024

laritakr
Collaborator

@laritakr laritakr commented Jun 21, 2024

When text extraction fails on a PDF, the derivative job resubmits. This was resulting in 5 Hyrax::FileMetadata objects on a file set, all for the same thumbnail file, because jobs automatically resubmit 5 times.

In this PR, I commented out the failing extract_full_text method and also made thumbnail creation conditional on the thumbnail not already being there.
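
For readers who want to see the failure mode in isolation, here is a minimal sketch of why the retries duplicate the thumbnail and how an existence check avoids it. It is not the code changed in this PR: `FakeFileSet`, `ExtractionError`, `run_derivatives`, and `perform_with_retries` are hypothetical stand-ins, and the retry loop only imitates a job queue that resubmits a failing job 5 times.

```ruby
# Hypothetical stand-in for a file set; not a Hyrax class.
FakeFileSet = Struct.new(:file_metadata) do
  # True once a thumbnail record has been attached.
  def thumbnail?
    file_metadata.any? { |m| m[:use] == :thumbnail }
  end
end

# Hypothetical error standing in for the text extraction failure.
class ExtractionError < StandardError; end

# One attempt of the derivative job: attach a thumbnail record,
# then fail during text extraction (as described for PDFs above).
def run_derivatives(file_set, guard:)
  unless guard && file_set.thumbnail?
    file_set.file_metadata << { use: :thumbnail, name: "thumbnail.jpeg" }
  end
  raise ExtractionError, "text extraction failed"
end

# Imitates a queue that resubmits a failing job a fixed number of times.
# Returns how many metadata records the file set ends up with.
def perform_with_retries(file_set, attempts:, guard:)
  attempts.times do
    begin
      run_derivatives(file_set, guard: guard)
    rescue ExtractionError
      # A real queue would re-enqueue the job here.
    end
  end
  file_set.file_metadata.size
end

unguarded = perform_with_retries(FakeFileSet.new([]), attempts: 5, guard: false)
guarded   = perform_with_retries(FakeFileSet.new([]), attempts: 5, guard: true)
puts "unguarded: #{unguarded}, guarded: #{guarded}"
# prints "unguarded: 5, guarded: 1"
```

The guard makes the thumbnail step idempotent, so it stays safe even if some later step in the job fails again for a different reason.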

Screenshots

BEFORE

Screenshot 2024-06-21 at 5 39 48 PM

AFTER

Screenshot 2024-06-21 at 5 38 59 PM

@laritakr laritakr added the patch-ver for release notes label Jun 21, 2024
kirkkwang previously approved these changes Jun 21, 2024

github-actions bot commented Jun 21, 2024

Test Results

    3 files  ±0      3 suites  ±0   17m 32s ⏱️ -41s
2 027 tests ±0  1 983 ✅ ±0  44 💤 ±0  0 ❌ ±0 
2 054 runs  ±0  2 008 ✅ ±0  46 💤 ±0  0 ❌ ±0 

Results for commit 3a53fd4. ± Comparison against base commit d700f82.

This pull request removes 42 and adds 42 tests. Note that renamed tests count towards both.
spec.abilities.work_ability_spec ‑ Hyrax::Ability::WorkAbility when work depositor Etd permissions is expected not to be able to destroy 045f2a0a-6b0b-4b18-9dad-33049a6f0692
spec.abilities.work_ability_spec ‑ Hyrax::Ability::WorkAbility when work depositor Etd permissions is expected not to be able to edit d41ce974-27d5-4ec4-b594-33a597b741a9
spec.abilities.work_ability_spec ‑ Hyrax::Ability::WorkAbility when work depositor Etd permissions is expected not to be able to read c52d8261-faef-4fa0-8f1c-8b426a441e63
spec.abilities.work_ability_spec ‑ Hyrax::Ability::WorkAbility when work depositor Etd permissions is expected not to be able to update 059b321a-0df5-4d24-9e9b-ebe7ff471678
spec.abilities.work_ability_spec ‑ Hyrax::Ability::WorkAbility when work depositor FileSet permissions is expected not to be able to destroy 37a9fb8a-b658-46c0-b0b4-2337e17477e0
spec.abilities.work_ability_spec ‑ Hyrax::Ability::WorkAbility when work depositor FileSet permissions is expected not to be able to edit d7980c34-b39b-4b47-9040-95f85ab51258
spec.abilities.work_ability_spec ‑ Hyrax::Ability::WorkAbility when work depositor FileSet permissions is expected not to be able to read 95b7d4a4-e03e-4ec6-a4e9-cad942d380d9
spec.abilities.work_ability_spec ‑ Hyrax::Ability::WorkAbility when work depositor FileSet permissions is expected not to be able to update e284991a-c1e7-4367-a844-ce20dcc94dff
spec.abilities.work_ability_spec ‑ Hyrax::Ability::WorkAbility when work depositor GenericWork permissions is expected not to be able to destroy 8da00f81-2782-443a-b0dd-d1cdfdc6e27a
spec.abilities.work_ability_spec ‑ Hyrax::Ability::WorkAbility when work depositor GenericWork permissions is expected not to be able to edit 903bd28d-76b8-44fa-af30-3d965080446f
…
spec.abilities.work_ability_spec ‑ Hyrax::Ability::WorkAbility when work depositor Etd permissions is expected not to be able to destroy ae2027e7-41a6-4563-8e38-64382d085470
spec.abilities.work_ability_spec ‑ Hyrax::Ability::WorkAbility when work depositor Etd permissions is expected not to be able to edit 85e0e841-94b5-4232-8460-ea5907f8bfd9
spec.abilities.work_ability_spec ‑ Hyrax::Ability::WorkAbility when work depositor Etd permissions is expected not to be able to read 6b60de30-d0c1-4a51-b118-1034e28cb02b
spec.abilities.work_ability_spec ‑ Hyrax::Ability::WorkAbility when work depositor Etd permissions is expected not to be able to update c2a754dc-2fa5-4e13-99e3-15041f5bfa7f
spec.abilities.work_ability_spec ‑ Hyrax::Ability::WorkAbility when work depositor FileSet permissions is expected not to be able to destroy a0c2973a-efa5-4fca-a525-71a2d4338fab
spec.abilities.work_ability_spec ‑ Hyrax::Ability::WorkAbility when work depositor FileSet permissions is expected not to be able to edit a7d15699-026d-407a-9074-58b761ecc7ce
spec.abilities.work_ability_spec ‑ Hyrax::Ability::WorkAbility when work depositor FileSet permissions is expected not to be able to read 957f86f5-c26c-4199-8c70-238c2a6a906a
spec.abilities.work_ability_spec ‑ Hyrax::Ability::WorkAbility when work depositor FileSet permissions is expected not to be able to update 1a8660a6-26b6-4295-bc9e-213c9ef67be6
spec.abilities.work_ability_spec ‑ Hyrax::Ability::WorkAbility when work depositor GenericWork permissions is expected not to be able to destroy 6d7f2c96-00c6-49af-bab0-679ecab9f51d
spec.abilities.work_ability_spec ‑ Hyrax::Ability::WorkAbility when work depositor GenericWork permissions is expected not to be able to edit 45b1d792-5acd-4948-98ce-32781939a62f
…

♻️ This comment has been updated with latest results.

When text extraction fails on a PDF, the derivative job resubmits. This
resulted in 5 Hyrax::FileMetadata objects for the same thumbnail
file, because jobs automatically resubmit 5 times.

In this PR, I commented out the failing extract_full_text method. I
also made thumbnail creation conditional on the thumbnail not already
being there, for cases where the job runs again after creating the thumbnail.
@laritakr laritakr merged commit ea2cae2 into main Jun 24, 2024
9 checks passed
@laritakr laritakr deleted the file_set_derivatives branch June 24, 2024 16:40