Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

avfilter/tonemap**: use more stable range and peak handling #472

Merged
merged 7 commits into from
Oct 5, 2024

Conversation

gnattu
Copy link
Member

@gnattu gnattu commented Oct 3, 2024

The main goal of this PR is to fix the unstable range and peak handling in our current tonemap filters, both software and hardware.

For consistency, now all filters will use 203nit as SDR reference white when tone mapping, and all brightness will be normalized to the same input / 10 scale. Previously both are only applied to BT.2390 which does not really make sense to make such differences.

Software tonemapx filter now handles range better, the difference between TV range and PC range now is fully handled by yuv2rgb and rgb2yuv conversions and all intermediate steps handled are in full range, which both simplified the processing and producing more accurate results. The current yuv and rgb conversion still exhibits some precision loss at highlight areas where the color may appear to "desaturate" itself. To make a more accurate handling we might have to re-write the code path and may introduce extra performance cost, so I want to leave it for future versions.

I also noticed that our current GPU filter implementation handles limited and full range conversion in an unstable manner where the resulted video may look "greener" than it should be and this may be very noticeable on some monitors/tvs. I implemented a new range scaling method for VideoToolbox which does the scaling in RGB instead of in YUV which produces better results. The port to CUDA and OpenCL is also on the way once I have better internet connection to verify it as I'm currently traveling. Now CUDA and OpenCL also uses this new stabler range conversion

To demonstrate the "green tint", look at the following example with hable TMO in maxRGB mode:

This is our current implementation

current

This is the corrected implementation

corrected

You may need to have a good enough monitor to tell the greenish tone though.

Changes

Issues

Might fix https://forum.jellyfin.org/t-green-tint-on-some-movies when also ported to cuda and opencl.

@gnattu gnattu requested a review from a team October 3, 2024 19:13
@nyanmisaka
Copy link
Member

+ if ((ret = compute_tonemap_lut(s, out->color_trc)) < 0)

There’s a typo, it should be in->color_trc and will fix the highlight issue in tonemapx.

@gnattu
Copy link
Member Author

gnattu commented Oct 3, 2024

Can't believe the problem is that simple

@nyanmisaka
Copy link
Member

nyanmisaka commented Oct 4, 2024

It seems that bt.2390 + input/10 does not perform well in this demo.

tonemapx-bt2390-1000nit

tonemapx-bt2390-10000nit

gpgpu-bt2390-1000nit

gpgpu-bt2390-10000nit

@gnattu
Copy link
Member Author

gnattu commented Oct 4, 2024

This particular demo is mastered in a way that abuses over-bright colors extensively which means the tone mapped intensity is that high and that's why the full HDR range 10000nit tone mapping looks more natural. In other words, not our issue, but the demo issue. Intel VPP and VideoToolbox will generate a high intensity tonemap as well.

The blackout area looks like an overflow to me but I don't see it on my computer... Which code path is that? AVX, SSE, NEON or C?

@nyanmisaka
Copy link
Member

nyanmisaka commented Oct 4, 2024

This particular demo is mastered in a way that abuses over-bright colors extensively which means the tone mapped intensity is that high and that's why the full HDR range 10000nit tone mapping looks more natural. In other words, not our issue, but the demo issue. Intel VPP and VideoToolbox will generate a high intensity tonemap as well.

Then only the brightness detection like libplacebo can alleviate this problem. But that experience is not good for me either.

The blackout area looks like an overflow to me but I don't see it on my computer... Which code path is that? AVX, SSE, NEON or C?

AVX/SSE + p010_2_p010 path. I found it with the NVENC encoder. (I haven't tested this on NEON yet)

./ffmpeg -i 'Samsung and RedBull See the Unexpected HDR UHD 4K Demo.ts' -c:v copy -an -sn -dn -y /tmp/demo.mp4

./ffmpeg -ss 00:00:24 -i '/tmp/demo.mp4' -an -sn -dn -vf "format=p010,setparams=colorspace=bt2020nc:color_primaries=bt2020:color_trc=smpte2084,tonemapx=tonemap=bt2390:desat=0:peak=100:format=p010" -vframes 1 -f image2 -qscale:v 2 -fps_mode passthrough -y /tmp/overflow.jpg
./ffmpeg -hwaccel cuda -hwaccel_output_format cuda -hwaccel_flags +unsafe_output -threads 2 -i 'Samsung and RedBull See the Unexpected HDR UHD 4K Demo.ts' -an -sn -dn -vf "setparams=colorspace=bt2020nc:color_primaries=bt2020:color_trc=smpte2084,hwdownload,format=p010,tonemapx=tonemap=bt2390:desat=0:peak=100:format=p010,format=p010,hwupload" -c:v hevc_nvenc -preset p1 -qp:v 20 -fps_mode passthrough -y /tmp/overflow.mp4

@gnattu
Copy link
Member Author

gnattu commented Oct 4, 2024

Then only the brightness detection like libplacebo can alleviate this problem. But that experience is not good for me either.

Commercially available videos won't be mastered in such way or at least most of them won't. So this is a minor problem to me.

AVX/SSE + p010_2_p010 path.

420p10 path is OK. I rarely test the p010 path though because it is almost useless for current use case. Will look at it later.

@nyanmisaka
Copy link
Member

Commercially available videos won't be mastered in such way or at least most of them won't. So this is a minor problem to me.

So it is acceptable to me too.

420p10 path is OK. I rarely test the p010 path though because it is almost useless for current use case. Will look at it later.

Yeah, currently I only use the p010 path with the HW encoder to accelerate encoding test videos.

@nyanmisaka
Copy link
Member

Some code style cleanup to make tonemapx and gpgpu filters consistent.
clean.patch

@gnattu
Copy link
Member Author

gnattu commented Oct 5, 2024

Then only the brightness detection like libplacebo can alleviate this problem. But that experience is not good for me either.

Alternatively, we can adjust the BT.2390 EETF similar to how it's done in libplacebo. They modify the 0.5 knee offset from the ITU standard to a 1.0 knee offset, which makes the curve to roll off earlier. This makes the overall brightness transition smoother for such videos with extreme highlights. Due to web being lagging behind we have a few more weeks for 10.10, so if we want to make such changes we can do it for 10.10 now. (In a separate PR)

Screenshot 2024-10-05 at 11 12 26 Screenshot 2024-10-05 at 11 12 42

@nyanmisaka
Copy link
Member

Then only the brightness detection like libplacebo can alleviate this problem. But that experience is not good for me either.

Alternatively, we can adjust the BT.2390 EETF similar to how it's done in libplacebo. They modify the 0.5 knee offset from the ITU standard to a 1.0 knee offset, which makes the curve to roll off earlier. This makes the overall brightness transition smoother for such videos with extreme highlights. Due to web being lagging behind we have a few more weeks for 10.10, so if we want to make such changes we can do it for 10.10 now. (In a separate PR)

We can indeed give it a try. How long ago did they add this change?

@gnattu
Copy link
Member Author

gnattu commented Oct 5, 2024

We can indeed give it a try. How long ago did they add this change?

About last year, and is now the default value in gpu-next:

haasn/libplacebo@4cadbe7#diff-2e046795c81fb2bcc2667705f88c6421fa988047f147ddb2333ec8da5f638b7bR503

@nyanmisaka
Copy link
Member

We can indeed give it a try. How long ago did they add this change?

About last year, and is now the default value in gpu-next:

haasn/libplacebo@4cadbe7#diff-2e046795c81fb2bcc2667705f88c6421fa988047f147ddb2333ec8da5f638b7bR503

Does 1.0 look good enough for most other videos? If so then there is no need to add an extra AVOption.

@gnattu
Copy link
Member Author

gnattu commented Oct 5, 2024

Does 1.0 look good enough for most other videos? If so then there is no need to add an extra AVOption.

We can borrow the param and translate it to knee_offset when current method is bt2390, as that parameter already has different meaning in different tonemap operators, and is currently not used in bt2390.

@gnattu gnattu force-pushed the fix-tonemap-peak-range branch 2 times, most recently from 9e3f40e to 5126148 Compare October 5, 2024 06:46
@gnattu gnattu requested a review from nyanmisaka October 5, 2024 06:50
Need to perform bit shift before saturation move to prevent overflow
@gnattu gnattu force-pushed the fix-tonemap-peak-range branch from 5126148 to 6cbbce0 Compare October 5, 2024 07:23
Copy link
Member

@nyanmisaka nyanmisaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

p016 is broken in avx/sse, but I think you can simply remove it since we don't have any use case for it.

p016

@gnattu
Copy link
Member Author

gnattu commented Oct 5, 2024

p016 is broken in avx/sse, but I think you can simply remove it since we don't have any use case for it.

Just remove it then. P016 will get overflowed in SIMD registers if we are handling them as int variants. If we ever want to support this use case we need to do a full re-implement that convert it to float and normalize to [0.0,1.0] like the GPUs.

@nyanmisaka
Copy link
Member

nyanmisaka commented Oct 5, 2024

p016 is broken in avx/sse, but I think you can simply remove it since we don't have any use case for it.

Just remove it then. P016 will get overflowed in SIMD registers if we are handling them as int variants. If we ever want to support this use case we need to do a full re-implement that convert it to float and normalize to [0.0,1.0] like the GPUs.

Not worth the effort, only NVDEC decoding RExt can produce this pixel format, and it's only really 12bit effective.

@gnattu gnattu merged commit 2e19320 into jellyfin Oct 5, 2024
26 checks passed
@gnattu gnattu deleted the fix-tonemap-peak-range branch October 5, 2024 10:47
@silajim
Copy link

silajim commented Oct 6, 2024

How hard would it be to have the reference sdr brightness user settable?

@gnattu
Copy link
Member Author

gnattu commented Oct 6, 2024

We are already using 100nit as SDR brightness and we do not recommend users to alter that because players/monitors will apply its own transfer curve for SDR inputs and a higher target peak usually result in bad highlight appearance.

@silajim
Copy link

silajim commented Oct 6, 2024

that's cool to know!!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants