WAVE / IFF: Accurate file offsets and new validation checks #534

Merged
carlwilson merged 14 commits into integration from fix/wave-ids-and-offsets on Dec 10, 2019

carlwilson
Member

Core:

In certain circumstances, submessages were being dropped when creating JhoveMessages.

IFF (AIFF & WAVE):

Check that chunk IDs only consist of characters in the printable ASCII range.
Check that spaces do not precede printable characters in chunk IDs.
Clarified error messages and improved offset reporting accuracy.
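
The two chunk ID checks listed above can be illustrated with a minimal sketch. This is not the code from this pull request: the class `ChunkIdSketch` and its method names are hypothetical, and it assumes a four-byte ID and a printable ASCII range of 0x20 through 0x7E. Whether an ID made up entirely of spaces should pass is left open here; the sketch lets it through because it violates neither rule.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration; not JHOVE's actual implementation. */
final class ChunkIdSketch {
    // Assumed ID field length for IFF/RIFF chunks.
    private static final int ID_LENGTH = 4;

    /** True if every byte lies in the printable ASCII range 0x20..0x7E. */
    static boolean hasOnlyPrintableAscii(byte[] id) {
        for (byte b : id) {
            int c = b & 0xFF; // treat the byte as unsigned
            if (c < 0x20 || c > 0x7E) {
                return false;
            }
        }
        return true;
    }

    /** True if no space is followed by a non-space character. */
    static boolean hasNoSpaceBeforePrintable(byte[] id) {
        boolean seenSpace = false;
        for (byte b : id) {
            if (b == 0x20) {
                seenSpace = true;
            } else if (seenSpace) {
                return false; // a printable character follows a space
            }
        }
        return true;
    }

    /** Combines the length check with both ID rules. */
    static boolean isValidId(byte[] id) {
        return id.length == ID_LENGTH
                && hasOnlyPrintableAscii(id)
                && hasNoSpaceBeforePrintable(id);
    }

    public static void main(String[] args) {
        String[] samples = {"fmt ", " fmt", "da a", "data"};
        for (String s : samples) {
            byte[] id = s.getBytes(StandardCharsets.US_ASCII);
            // Quote the ID so leading and trailing spaces stay visible.
            System.out.println("\"" + s + "\" -> " + isValidId(id));
        }
    }
}
```

Under these assumptions, trailing spaces as in "fmt " are accepted, while a leading or embedded space followed by a printable character is rejected.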

WAVE:

Added reporting for unrecognized data in the top-level RIFF structure.
Made the Table Length field of ds64 chunks optional to better align with the specification.
Reinstated WAVE-HUL-4 reporting which had been lost during refactoring.
Corrected WAVE-HUL-15 from an Error to an Informational message.
Retired WAVE-HUL-16, an unused duplicate of WAVE-HUL-19.
Clarified error messages and greatly improved offset reporting accuracy.
Moved all remaining message text into translatable resource files.
Added test files to demonstrate new validation checks.

Includes #468 from @david-russo

david-russo and others added 14 commits August 19, 2019 14:52

Values above the printable ASCII range no longer validate, nor IDs in
which spaces precede printable characters, per the IFF, AIFF, and RIFF
specifications.

- Improved offset reporting
- Reinstated WAVE-HUL-4 reporting
- Made more strings translatable
- Included reporting of unrecognized data in the
  top-most RIFF chunk as already existed for sub-chunks
Also stripped example wave files of unnecessary data.

Like unrecognized chunks, unrecognized list types are allowed but
should be skipped, so this message is now informational. This change
also allows file processing to continue after finding an unknown list,
instead of aborting prematurely.

This moves constants for chunk properties such as ID and size field
lengths into the Chunk class, and also adds chunk offset information
for all chunks.

- Retired WAVE-HUL-16 which was a part of previously pruned dead code.
- All chunk names are now surrounded by quotation marks to improve
  visibility of leading and trailing spaces and have been moved into
  translatable resource files.
- Improved accuracy of WAVE messages.

- bumped version and date for `AIFF-hul`, `WAVE-hul` and `TIFF-hul`;
- added version updates to baseline script;
- copied `WAV-hul` module results due to wholesale changes; and
- removed some text files from TIFF corpus.

@carlwilson carlwilson self-assigned this Dec 10, 2019
@carlwilson carlwilson added this to the v1.24-m4 Release milestone Dec 10, 2019
codecov bot commented Dec 10, 2019

Codecov Report

Merging #534 into integration will increase coverage by 0.01%.
The diff coverage is 100%.

@@                Coverage Diff                @@
##             integration     #534      +/-   ##
=================================================
+ Coverage          49.53%   49.54%   +0.01%     
- Complexity           985      986       +1     
=================================================
  Files                 55       55              
  Lines               7767     7765       -2     
  Branches            1373     1373              
=================================================
  Hits                3847     3847              
+ Misses              3450     3448       -2     
  Partials             470      470

| Impacted Files | Coverage Δ | Complexity Δ |
|---|---|---|
| ...rvard/hul/ois/jhove/messages/JhoveMessageImpl.java | 19.04% <100%> (-3.68%) | 4 <1> (-1) |
| ...in/java/edu/harvard/hul/ois/jhove/InfoMessage.java | 52.94% <0%> (+11.76%) | 5% <0%> (+1%) ⬆️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 267d8d7...4d95d89.

@carlwilson carlwilson merged commit f062eb5 into integration Dec 10, 2019
@carlwilson carlwilson deleted the fix/wave-ids-and-offsets branch December 10, 2019 01:50