Identify with format switch reports erroneous JPEG quality if actual value is undefined #260
Comments
As best I can see from identify and from exiftool, no quality value was recorded in the file. It may have been compressed at quality 50%, but that was not recorded, so IM will assume its default quality of 92. Someone else can double-check my assessment.
@fmw42 You're probably right, but IM's current behavior makes it impossible to distinguish between JPEGs that were actually compressed at 92% quality and JPEGs for which the quality is unknown, which isn't very helpful. Reporting some NaN value or even an empty string (I don't know if there are any IM conventions for this?) would be much more helpful. Knowing that IM cannot establish the quality is useful information in itself, whereas an arbitrary value that's indistinguishable from a meaningful estimate isn't!
BTW you mention the absence of quality level info in the metadata. I noticed that as well. From the Fotoforensics explainer (tab "Estimating Quality") I understand it's quite rare for JPEGs to have this info in the metadata, and even if it's there, it's often unreliable. It's not entirely clear to me how IM establishes the quality; I suspect that somewhere under the hood IM or some delegate library uses either the "Approximate Ratios" or the "Approximate Quantization Tables" method mentioned in the Fotoforensics piece, but from the code I can't quite figure out if this is indeed the case.
However, JPEGs created inside IM don't appear to contain quality-related metadata either. Nevertheless, "identify" is able to establish the quality! A quick example. First I create a new JPEG with 40% compression quality:
The output of the following ExifTool command doesn't contain anything related to the quality level:
Despite this, using "identify":
Result:
This is all straying slightly from the issue, which is really about the reporting. But since it came up in the conversation, and I had been doing some tests on this already, I might as well mention it in case it's of any use.
If you define a quality and add it to the file properly, then it will be in the metadata. Most of the JPGs that I have checked have a quality value. If no quality value is in the file, then you get 92. If the file does contain a quality value, it could also be 92, but in that case it will show up as a quality value in the file rather than being absent. An IM developer might comment further on this.
A bit of further digging seems to confirm that IM actually determines the JPEG compression quality from the quantization tables, and not from some pre-defined metadata field: Line 925 in bf9bc7f
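To illustrate the general idea of deriving a quality value from quantization tables, here is a minimal Python sketch. Note that this is not ImageMagick's actual heuristic: it assumes the encoder used the standard luminance table from Annex K of the JPEG specification, scaled the way libjpeg does, and it simply searches for a quality level that reproduces the table exactly. The names `scaled_table` and `estimate_quality` are made up for this example.

```python
# Standard luminance quantization table (JPEG specification, Annex K).
BASE_LUMA = [
    16, 11, 10, 16, 24, 40, 51, 61,
    12, 12, 14, 19, 26, 58, 60, 55,
    14, 13, 16, 24, 40, 57, 69, 56,
    14, 17, 22, 29, 51, 87, 80, 62,
    18, 22, 37, 56, 68, 109, 103, 77,
    24, 35, 55, 64, 81, 104, 113, 92,
    49, 64, 78, 87, 103, 121, 120, 101,
    72, 92, 95, 98, 112, 100, 103, 99,
]


def scaled_table(quality):
    """Scale the base table for a quality of 1..100, following libjpeg's scaling rule."""
    quality = max(1, min(100, quality))
    scale = 5000 // quality if quality < 50 else 200 - 2 * quality
    # Each entry is rounded and clamped to the baseline range 1..255.
    return [max(1, min(255, (value * scale + 50) // 100)) for value in BASE_LUMA]


def estimate_quality(table):
    """Return the quality whose scaled table matches exactly, or None if there is none."""
    table = list(table)
    for quality in range(1, 101):
        if scaled_table(quality) == table:
            return quality
    return None  # non-standard table: the quality is undefined, not 92
```

The point of returning `None` is that a table that does not come from the standard scaling has no well-defined quality level under this assumption, which is exactly the case that the reporting should make visible.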
For info: I did some further tests with a Python port of ImageMagick's JPEG quality heuristic. See the blog post below for details: This explains why it doesn't come up with a quality estimate for the file in my opening post (see sections "ImageMagick’s JPEG quality heuristic" and "Tracing all loop variables"). In the same post I also tentatively propose a modified heuristic that works for low-quality images that use non-standard quantization tables. I'm not entirely sure how much sense this makes, but I hope the post will provoke some responses from people who are better versed in the inner workings of the JPEG compression algorithm. All scripts and data I used for this are available here:
ImageMagick version
6.9.10-23
Operating system
Linux
Operating system, version and so on
Linux Mint 20.1 Ulyssa
Description
Identify with format switch reports erroneous JPEG quality fallback value if actual value is undefined.
Steps to Reproduce
Attached JPEG file was supplied to us by a vendor, who claims it was compressed at 50% quality. In an attempt to verify this claim, I first ran identify with verbose output:
The resulting output does not contain the "Quality" property (which is used for JPEG quality). So for some reason identify doesn't seem to be able to establish the quality level in this case (I'm not entirely sure why, but from what I read here estimating JPEG quality can be tricky at low quality levels). But if I run identify like this (which only reports the quality level):
Result:
Which is obviously wrong in this case! The file size is pretty much what I would expect for a 50% quality JPEG of this kind of material; a double-check with this tool by Neal Krawetz also confirmed that the actual quality must be quite low.
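Based on the observation above that the verbose output omits the "Quality" property when the quality cannot be established, a possible interim workaround is to parse the verbose output instead of relying on the format switch. The following is a minimal Python sketch; the helper name `parse_verbose_quality` is hypothetical, and it assumes the property appears on its own line as `Quality: <number>`.

```python
import re

# Assumption: the verbose output lists the property as "Quality: <integer>" on its own line.
QUALITY_LINE = re.compile(r"^\s*Quality:\s*(\d+)\s*$", re.MULTILINE)


def parse_verbose_quality(verbose_output):
    """Return the reported quality as an int, or None if no Quality property is present."""
    match = QUALITY_LINE.search(verbose_output)
    return int(match.group(1)) if match else None
```

A result of `None` then means "identify could not establish the quality", which keeps that case separate from a file that really reports 92.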
I had a quick look into the code, and found this:
https://github.com/ImageMagick/ImageMagick/blob/f5bdfdd62af7109ad105f8af4e28111e353edecd/MagickCore/property.c#L2725
I'm not a C programmer, but if I understand this correctly, this forces the reported image quality to a "92" fallback value if the actual quality is 0 (I assume this is used internally if the actual quality cannot be determined). If I'm correct, a better solution may be to use something that clearly indicates that the quality level is undefined here.
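To make the suggestion concrete, here is a Python paraphrase of how I read the current behaviour, next to one possible alternative. This is only a sketch: the function names are hypothetical, the assumption that 0 means "undefined" is mine, and the string `"undefined"` is just one of several possible placeholder values.

```python
UNDEFINED_QUALITY = 0  # assumption: 0 is used internally when the quality is unknown
FALLBACK_QUALITY = 92


def report_quality_current(quality):
    """Mimic the current reporting: an unknown quality is silently reported as 92."""
    return str(FALLBACK_QUALITY if quality == UNDEFINED_QUALITY else quality)


def report_quality_proposed(quality):
    """Possible alternative: make the unknown case explicit instead of substituting 92."""
    return "undefined" if quality == UNDEFINED_QUALITY else str(quality)
```

With the first function, an image of unknown quality and an image that really was compressed at 92 both yield `"92"`; with the second, only the latter does.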
(Side note: I also did some additional tests with low-quality JPEGs I made within ImageMagick, but for all of these identify was able to report the correct quality value. It's not clear to me why this is the case.)
Images