feat: add patches for VT #340
This backports the necessary code from upstream to make VT work properly. It introduces scale_vt, overlay_videotoolbox, and interop between VT and OCL. It also fixes full-range color and HDR passthrough support on 6.0.1.
Overall LGTM, except for two patches that can be improved.
I ran the FATE tests with all patches applied and they complain. Please help me add these two patches.
diff --git a/debian/patches/0004-add-cuda-pixfmt-converter-impl.patch b/debian/patches/0004-add-cuda-pixfmt-converter-impl.patch
index 9ba8147..0a12be4 100644
--- a/debian/patches/0004-add-cuda-pixfmt-converter-impl.patch
+++ b/debian/patches/0004-add-cuda-pixfmt-converter-impl.patch
@@ -1,3 +1,15 @@
+Index: jellyfin-ffmpeg/tests/ref/fate/source
+===================================================================
+--- jellyfin-ffmpeg.orig/tests/ref/fate/source
++++ jellyfin-ffmpeg/tests/ref/fate/source
+@@ -22,6 +22,7 @@ compat/djgpp/math.h
+ compat/float/float.h
+ compat/float/limits.h
+ libavcodec/bitstream_template.h
++libavfilter/dither_matrix.h
+ tools/decode_simple.h
+ Use of av_clip() where av_clip_uintp2() could be used:
+ Use of av_clip() where av_clip_intp2() could be used:
Index: jellyfin-ffmpeg/libavfilter/dither_matrix.h
===================================================================
--- /dev/null
--
And add this to your
diff --git a/tests/fate/hevc.mak b/tests/fate/hevc.mak
index 20c2e5b..c5df8ea 100644
--- a/tests/fate/hevc.mak
+++ b/tests/fate/hevc.mak
@@ -222,7 +222,7 @@ FATE_HEVC-$(call ALLYES, HEVC_DEMUXER MOV_DEMUXER HEVC_PARSER HEVC_MP4TOANNEXB_B
fate-hevc-bsf-mp4toannexb: tests/data/hevc-mp4.mov
fate-hevc-bsf-mp4toannexb: CMD = md5 -i $(TARGET_PATH)/tests/data/hevc-mp4.mov -c:v copy -fflags +bitexact -f hevc
fate-hevc-bsf-mp4toannexb: CMP = oneline
-fate-hevc-bsf-mp4toannexb: REF = 1873662a3af1848c37e4eb25722c8df9
+fate-hevc-bsf-mp4toannexb: REF = 7d05a79c7a6665ae22c0043a4d83a811
fate-hevc-skiploopfilter: CMD = framemd5 -skip_loop_filter nokey -i $(TARGET_SAMPLES)/hevc-conformance/SAO_D_Samsung_5.bit -sws_flags bitexact
FATE_HEVC-$(call FRAMEMD5, HEVC, HEVC, HEVC_PARSER) += fate-hevc-skiploopfilter
--
Just in case someone in our downstream wants to run this in their builds. It doesn't mean much to us. |
FATE is fixed, but we have another problem. HDR passthrough works, but the color is not correct, and this is an upstream issue. I tested upstream 6.1 and it occurs there too. Red light looks too yellow. It may appear OK for scenes like flames, but it can be very wrong in certain scenes like a sunset. Tone-mapped SDR output is not affected by this. HandBrake does not have a similar issue, so I may post additional patches during the 10.9 cycle to port the correct implementation from there and see if we can get it fixed and upstreamed. |
I think I found the issue. The required metadata like |
A big difference for VideoToolbox is that such static metadata should be passed in during the session configuration stage, but it seems like ffmpeg handles it as per-frame data? What can I do to handle this? Hack the per-frame handler to modify the session config when the first frame comes in? |
Yes, ffmpeg handles HDR metadata per frame. Maybe you can defer finalizing the VT encoding session until the first frame arrives. |
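The deferral suggested above can be pictured with a small toy model. This is only a sketch of the lazy-configuration idea; `Frame`, `DeferredEncoder` and the metadata keys are hypothetical names made up for illustration and do not correspond to any VideoToolbox or ffmpeg API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Frame:
    """Hypothetical frame: a timestamp plus optional static metadata."""
    pts: int
    static_metadata: Optional[dict] = None


@dataclass
class DeferredEncoder:
    """Toy encoder that configures its session only when the first frame arrives."""
    session_config: Optional[dict] = None
    encoded_pts: List[int] = field(default_factory=list)

    def _configure_session(self, first_frame: Frame) -> None:
        # Session-level settings are fixed once, from the first frame's metadata.
        self.session_config = {"static_metadata": first_frame.static_metadata}

    def send_frame(self, frame: Frame) -> None:
        if self.session_config is None:
            self._configure_session(frame)
        # Later frames never touch the session configuration again.
        self.encoded_pts.append(frame.pts)


encoder = DeferredEncoder()
encoder.send_frame(Frame(pts=0, static_metadata={"example_key": "example_value"}))
encoder.send_frame(Frame(pts=1))
```

The trade-off of this shape is that nothing about the session is known until a frame has been sent, so any code that needs session parameters earlier would have to wait as well.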
It seems like the problem is more than the encoder. If I set |
Some HW decoders on other platforms fail to set the correct colorspace attributes for the output frames, so I usually override it based on the results of
Using See also, figure 3 & 4 in https://developer.nvidia.com/blog/nvidia-ffmpeg-transcoding-guide/ |
Strange. |
The upstream maintainer tested HLG->SDR using the command containing |
Can you use ffprobe to print the stream info of the input and output files?
|
The source:
{
"streams": [
{
"index": 0,
"codec_name": "hevc",
"codec_long_name": "H.265 / HEVC (High Efficiency Video Coding)",
"profile": "Main 10",
"codec_type": "video",
"codec_tag_string": "[0][0][0][0]",
"codec_tag": "0x0000",
"width": 3840,
"height": 2160,
"coded_width": 3840,
"coded_height": 2160,
"closed_captions": 0,
"film_grain": 0,
"has_b_frames": 2,
"sample_aspect_ratio": "1:1",
"display_aspect_ratio": "16:9",
"pix_fmt": "yuv420p10le",
"level": 150,
"color_range": "tv",
"color_space": "bt2020nc",
"color_transfer": "arib-std-b67",
"color_primaries": "bt2020",
"chroma_location": "left",
"field_order": "progressive",
"refs": 1,
"r_frame_rate": "30/1",
"avg_frame_rate": "30/1",
"time_base": "1/1000",
"extradata_size": 1087,
"disposition": {
"default": 0,
"dub": 0,
"original": 0,
"comment": 0,
"lyrics": 0,
"karaoke": 0,
"forced": 0,
"hearing_impaired": 0,
"visual_impaired": 0,
"clean_effects": 0,
"attached_pic": 0,
"timed_thumbnails": 0,
"non_diegetic": 0,
"captions": 0,
"descriptions": 0,
"metadata": 0,
"dependent": 0,
"still_image": 0
},
"tags": {
"ENCODER": "Lavc57.107.100 libx265"
}
}
]
}
The correct output:
{
"streams": [
{
"index": 0,
"codec_name": "hevc",
"codec_long_name": "H.265 / HEVC (High Efficiency Video Coding)",
"profile": "Main",
"codec_type": "video",
"codec_tag_string": "hvc1",
"codec_tag": "0x31637668",
"width": 1920,
"height": 1080,
"coded_width": 1920,
"coded_height": 1088,
"closed_captions": 0,
"film_grain": 0,
"has_b_frames": 0,
"sample_aspect_ratio": "1:1",
"display_aspect_ratio": "16:9",
"pix_fmt": "yuv420p",
"level": 123,
"color_range": "tv",
"color_space": "bt709",
"color_transfer": "bt709",
"color_primaries": "bt709",
"chroma_location": "left",
"field_order": "progressive",
"refs": 1,
"id": "0x1",
"r_frame_rate": "30/1",
"avg_frame_rate": "30/1",
"time_base": "1/15360",
"start_pts": 0,
"start_time": "0.000000",
"duration_ts": 1518592,
"duration": "98.866667",
"bit_rate": "8912514",
"nb_frames": "2966",
"extradata_size": 111,
"disposition": {
"default": 1,
"dub": 0,
"original": 0,
"comment": 0,
"lyrics": 0,
"karaoke": 0,
"forced": 0,
"hearing_impaired": 0,
"visual_impaired": 0,
"clean_effects": 0,
"attached_pic": 0,
"timed_thumbnails": 0,
"non_diegetic": 0,
"captions": 0,
"descriptions": 0,
"metadata": 0,
"dependent": 0,
"still_image": 0
},
"tags": {
"language": "und",
"handler_name": "VideoHandler",
"vendor_id": "[0][0][0][0]",
"encoder": "Lavc60.31.102 hevc_videotoolbox"
}
}
]
}
The wrong output:
{
"streams": [
{
"index": 0,
"codec_name": "hevc",
"codec_long_name": "H.265 / HEVC (High Efficiency Video Coding)",
"profile": "Main",
"codec_type": "video",
"codec_tag_string": "hvc1",
"codec_tag": "0x31637668",
"width": 1920,
"height": 1080,
"coded_width": 1920,
"coded_height": 1088,
"closed_captions": 0,
"film_grain": 0,
"has_b_frames": 0,
"sample_aspect_ratio": "1:1",
"display_aspect_ratio": "16:9",
"pix_fmt": "yuv420p",
"level": 123,
"color_range": "tv",
"color_space": "bt709",
"color_transfer": "bt709",
"color_primaries": "bt709",
"chroma_location": "left",
"field_order": "progressive",
"refs": 1,
"id": "0x1",
"r_frame_rate": "30/1",
"avg_frame_rate": "30/1",
"time_base": "1/15360",
"start_pts": 0,
"start_time": "0.000000",
"duration_ts": 1518592,
"duration": "98.866667",
"bit_rate": "9473045",
"nb_frames": "2966",
"extradata_size": 111,
"disposition": {
"default": 1,
"dub": 0,
"original": 0,
"comment": 0,
"lyrics": 0,
"karaoke": 0,
"forced": 0,
"hearing_impaired": 0,
"visual_impaired": 0,
"clean_effects": 0,
"attached_pic": 0,
"timed_thumbnails": 0,
"non_diegetic": 0,
"captions": 0,
"descriptions": 0,
"metadata": 0,
"dependent": 0,
"still_image": 0
},
"tags": {
"language": "und",
"handler_name": "VideoHandler",
"vendor_id": "[0][0][0][0]",
"encoder": "Lavc60.31.102 hevc_videotoolbox"
}
}
]
}
Both outputs use the exact command the upstream maintainer used, except for a different input file and the removal of |
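To make the comparison of the two ffprobe dumps less error-prone, a small helper can list the fields that differ. The sketch below uses a hand-copied subset of the "correct" and "wrong" streams shown above; the helper name `diff_streams` is made up for this illustration.

```python
def diff_streams(a: dict, b: dict) -> dict:
    """Return {key: (a_value, b_value)} for every key whose values differ."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# Subset of the fields from the two outputs quoted above.
correct = {
    "pix_fmt": "yuv420p",
    "color_range": "tv",
    "color_space": "bt709",
    "color_transfer": "bt709",
    "color_primaries": "bt709",
    "bit_rate": "8912514",
}
wrong = {
    "pix_fmt": "yuv420p",
    "color_range": "tv",
    "color_space": "bt709",
    "color_transfer": "bt709",
    "color_primaries": "bt709",
    "bit_rate": "9473045",
}

differences = diff_streams(correct, wrong)
print(differences)  # {'bit_rate': ('8912514', '9473045')}
```

For these two streams only the bit rate differs; the color tags reported by ffprobe are identical, so the stream-level tags alone do not reveal the wrong colors. With real files, the dictionaries could be loaded from `ffprobe -print_format json -show_streams` output instead of being typed by hand.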
And the most interesting part: it only affects the tone mapping. The HDR passthrough works fine using I don't think this is a problem inside the |
If hwaccel_output_format is not used, then ffmpeg creates an intermediate buffer and sets its color, passing it to the first filter scale_vt.
// set color func // decoder postproc callback |
I think we can use |
I've checked and it is not the attachments, very very weird. I even overwrote both the source and destination |
Submitted to upstream to see if they have a clue for it:
At the same time, I'm going to add a workaround in Jellyfin Server to disable
A very interesting thing is, using
./ffmpeg -hwaccel videotoolbox \
-i /Users/gnattu/Movies/ffmpeg-sample/4K_HLG.mkv \
-c:v hevc_videotoolbox \
-profile:v main \
-b:v 3M \
-vf format=p010,hwupload,scale_vt=w=iw/2:h=ih/2:color_matrix=bt709:color_primaries=bt709:color_transfer=bt709 \
-c:a copy \
-tag:v hvc1 \
test3.mp4
This easily gives me over 250 fps:
./ffmpeg -hwaccel videotoolbox \
-hwaccel_output_format videotoolbox_vld \
-i /Users/gnattu/Movies/ffmpeg-sample/4K_HLG.mkv \
-c:v hevc_videotoolbox \
-profile:v main \
-b:v 3M \
-vf scale_vt=w=iw/2:h=ih/2:color_matrix=bt709:color_primaries=bt709:color_transfer=bt709 \
-c:a copy \
-tag:v hvc1 \
test3.mp4
The "recommended supposed to be fast"
Now I'm even thinking about disabling the |
What if you use ffmpeg's built-in software decoder? What does the color look like then? |
The color is correct using the software decoder:
Same output as the hwupload, but much slower. |
This presumably proves that the videotoolbox hwaccel implementation in ffmpeg has quirks. We usually use software codecs as a basic reference. HLG only contains static metadata, so the results should be nearly identical every time you transcode. |
Our sub-filter triggered some bug in my
So need to change both the filter and Jellyfin Server. |
It should, but it does not. I'm not looking at it anymore given it does not improve performance at all. Need to find someone with an Intel Mac to test if this is still the case though. If using |
Can you share the ffmpeg CLI and the corresponding errors? |
You mean the subtitle ones? It is not a subtitle filter issue IMO; all error codes are from the Apple stack. The VT will raise The Metal will raise an assertion failure complaining that the rows are less than expected, which usually happens when the color depth does not match the texture's settings. That is what we have here: trying to load an 8-bit image into a 16-bit texture without any conversion. The (broken) CLI looks like this:
/Users/gnattu/src/jellyfin-ffmpeg-macos-apple/ffmpeg -analyzeduration 200M -probesize 1G -init_hw_device videotoolbox=vt -noautorotate -t 300 -i file:"/Users/gnattu/Movies/ffmpeg-sample/120FPS.mkv" -map_metadata -1 -map_chapters -1 -threads 0 -map 0:0 -map 0:1 -map -0:0 -codec:v:0 hevc_videotoolbox -tag:v:0 hvc1 -b:v 292000 -maxrate 292000 -bufsize 584000 -force_key_frames:0 "expr:gte(t,n_forced*3)" -g:v:0 360 -keyint_min:v:0 360 -filter_complex "alphasrc=s=426x238:r=60:start='0',format=bgra,subtitles=f='/Users/gnattu/Movies/ffmpeg-sample/120FPS.mkv':si=0:alpha=1:sub2video=1:fontsdir='/Users/gnattu/Library/Application Support/jellyfin/cache/attachments/2af2cfc6ff2a3b75ad3699da3de7061d',hwupload=derive_device=videotoolbox[sub];[0:v]hwupload,scale_vt=w=426:h=238[main];[main][sub]overlay_videotoolbox=eof_action=pass:repeatlast=0" -start_at_zero -codec:a:0 ac3 -ab 128000 -ar 48000 wtf.mp4
By adding |
Both of these work for me.
// qsv
ffmpeg -init_hw_device qsv=qs -filter_hw_device qs -i INPUT -an -sn -vf hwupload=extra_hw_frames=16 -f null -
...
Stream #0:0(und): Video: wrapped_avframe, qsv(tv, progressive), 1920x1080 [SAR 1:1 DAR 16:9], q=2-31, 200 kb/s, 29.97 fps, 29.97 tbn (default)
// d3d11va
ffmpeg -init_hw_device d3d11va=dx11 -filter_hw_device dx11 -i INPUT -an -sn -vf hwupload -f null -
...
Stream #0:0: Video: wrapped_avframe, d3d11(tv, bt709/unknown/unknown, progressive), 1920x1080 [SAR 1:1 DAR 16:9], q=2-31, 200 kb/s, 23.98 fps, 23.98 tbn (default)
Will this command fail on videotoolbox? My video source is HEVC Main10 1080p.
// videotoolbox
ffmpeg -init_hw_device videotoolbox=vt -filter_hw_device vt -i INPUT -an -sn -vf hwupload -f null - |
The pixel format output by the software decoder for HEVC Main is All fixed-function hardware (non-GPGPU impl) on x86_64 that I know of only support semi-planar and packed formats such as So perhaps commenting out |
Of course this will work. I can even append scale_vt to hwupload in this case. It is only when it is used in our burn-in filter chain that hwupload gets confused about picking the best pixel format. I haven't seen other cases where specifying the format is mandatory. The error is not from the VTTransfer of |
What did it even upload to? When I try to retrieve the pixel format type using |
You have to check for nullptr when using
So what you mean is that in
|
Seems to be this case:
The |
|
This is what I'm planning to do: detect the input color depth and specify an upload format when using subtitle burn-in. |
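A minimal sketch of that plan, written as a server-side helper that builds the main filter chain. The mapping of 8-bit input to `nv12` and 10-bit input to `p010` is an assumption made for this illustration, and the function names are hypothetical; only the `format=...,hwupload,scale_vt=...` shape is taken from the commands quoted in this thread.

```python
def upload_format_for_depth(bit_depth: int) -> str:
    """Pick an explicit upload pixel format from the input bit depth (assumed mapping)."""
    if bit_depth <= 8:
        return "nv12"
    if bit_depth <= 10:
        return "p010"
    raise ValueError(f"no upload format chosen for {bit_depth}-bit input")


def main_chain(bit_depth: int, width: int, height: int) -> str:
    """Build the main video chain with an explicit format before hwupload."""
    fmt = upload_format_for_depth(bit_depth)
    return f"format={fmt},hwupload,scale_vt=w={width}:h={height}"


print(main_chain(10, 426, 238))  # format=p010,hwupload,scale_vt=w=426:h=238
```

Pinning the format before `hwupload` means the upload no longer depends on which pixel format gets negotiated automatically, at the cost of one more branch in the chain-building logic.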
I imagine this would make the logic quite complex. Just curious, what if you use OpenCL filters globally and enable |
I tried more with this and hope to solve it with a dual conversion. For the overlay, we can first convert it to My plan is to check whether any of the inputs is in a non-bi-planar format, and perform an additional conversion if a non-bi-planar format is uploaded. The idea is that if the conversion is unavoidable, it is better to do it with Apple's native methods than with ffmpeg's implementation, as it will be hardware-accelerated. Is such an approach upstreamable? At the very least we can use it in our own fork, as it could reduce the complexity of the filter-chain building logic. |
Overlays that have an alpha channel and are not in BGRA format would already be uploaded to |
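The check described above could look roughly like the following. This is a sketch only: the set of bi-planar format names is an assumption chosen for illustration (it is not a list of what VideoToolbox accepts), and `needs_conversion` and `inputs_to_convert` are hypothetical helpers.

```python
from typing import List

# Assumed set of bi-planar pixel format names, for illustration only.
BIPLANAR_FORMATS = frozenset({"nv12", "nv16", "nv24", "p010", "p210", "p410"})


def needs_conversion(pix_fmt: str) -> bool:
    """True when the uploaded format is not bi-planar and needs an extra conversion."""
    return pix_fmt not in BIPLANAR_FORMATS


def inputs_to_convert(main_fmt: str, overlay_fmt: str) -> List[str]:
    """Name the inputs ("main", "overlay") that require the additional conversion."""
    candidates = (("main", main_fmt), ("overlay", overlay_fmt))
    return [name for name, fmt in candidates if needs_conversion(fmt)]


print(inputs_to_convert("yuv420p10le", "nv12"))  # ['main']
```

Keeping the decision in one place like this is what would let the conversion be skipped entirely when both inputs are already bi-planar.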
I want to check first, Is If so, maybe replace the intermediate format |
I've checked most of the formats, and it appears that |
You can restrict the filter's main and overlay formats, like this. |
I think this is ready. You may want to do more testing on the server side before merging this.
Issues
Step one for #339