feat(server): fully accelerated qsv #9689

Merged
merged 5 commits into from
May 23, 2024

mertalev (Contributor) commented May 23, 2024

Description

This PR adds end-to-end acceleration for Quick Sync transcoding. Similar to NVENC and RKMPP, tone-mapping is done with OpenCL. Based on my testing, OpenCL was even faster than native VPP tone-mapping while having more features and higher quality.

Testing

Tested with hardware decoding enabled and disabled, with tone-mapping enabled and disabled, and scaling set to either 720p or original.

Note: My processor is too new for the intel-compute-runtime in the server image, so I had to manually install the latest release here. I'll make a PR to apply this in the base image, but in the meantime it should only affect very new processors and only when hardware decoding is enabled.

Current transcoding time: 470s
With hardware decoding: 34s
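
The settings listed under Testing can be enumerated as a small matrix. The following is a minimal sketch, assuming every combination was exercised (the description does not state this explicitly); `TestCase` and `buildTestMatrix` are hypothetical names for illustration and are not part of the PR.

```typescript
// Hypothetical illustration of the test matrix described above.
// None of these names come from the immich codebase.
interface TestCase {
  hardwareDecoding: boolean;
  toneMapping: boolean;
  scaling: '720p' | 'original';
}

function buildTestMatrix(): TestCase[] {
  const cases: TestCase[] = [];
  // Cartesian product of the three settings mentioned in the description.
  for (const hardwareDecoding of [true, false]) {
    for (const toneMapping of [true, false]) {
      for (const scaling of ['720p', 'original'] as const) {
        cases.push({ hardwareDecoding, toneMapping, scaling });
      }
    }
  }
  return cases;
}

const testMatrix = buildTestMatrix();
```

Two on/off settings and two scaling targets give eight configurations in total.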


cloudflare-workers-and-pages bot commented May 23, 2024

Deploying immich with Cloudflare Pages

Latest commit: 5279a0e
Status: ✅  Deploy successful!
Preview URL: https://9b8702ea.immich.pages.dev
Branch Preview URL: https://feat-qsv-hw-decoding.immich.pages.dev


export class VAAPIConfig extends BaseHWConfig {
  getBaseInputOptions() {
    if (this.devices.length === 0) {
      throw new Error('No VAAPI device found');
    }

    let hwDevice = this.getPreferredHardwareDevice();
-   if (hwDevice === null) {
+   if (!hwDevice) {
Contributor


🙏

mertalev enabled auto-merge (squash) May 23, 2024 03:54
mertalev merged commit a5e8b45 into main May 23, 2024
23 of 24 checks passed
mertalev deleted the feat/qsv-hw-decoding branch May 23, 2024 03:58
mertalev changed the title from "feat(server): qsv hardware decoding and tone-mapping" to "feat(server): fully accelerated qsv" May 23, 2024
return [
'hwmap=derive_device=opencl',
`tonemap_opencl=${tonemapOptions.join(':')}`,
'hwmap=derive_device=vaapi:reverse=1',


hwmap=derive_device=vaapi:reverse=1
This will not be allowed in the future. It should be hwmap=derive_device=qsv:reverse=1
The QSV encoder's ability to accept the vaapi pixel format is experimental, and Intel decided not to promote it upstream.

Contributor Author


I used vaapi there because qsv throws an error:

[AVHWDeviceContext @ 0x24e400304c0] 0.0: Intel(R) OpenCL Graphics / Intel(R) Arc(TM) Graphics
[AVHWDeviceContext @ 0x24e400304c0] Intel QSV to OpenCL mapping function found (clCreateFromVA_APIMediaSurfaceINTEL).
[AVHWDeviceContext @ 0x24e400304c0] Intel QSV in OpenCL acquire function found (clEnqueueAcquireVA_APIMediaSurfacesINTEL).
[AVHWDeviceContext @ 0x24e400304c0] Intel QSV in OpenCL release function found (clEnqueueReleaseVA_APIMediaSurfacesINTEL).
[AVHWFramesContext @ 0x24e401d4740] The hardware pixel format 'vaapi' is not supported by the device type 'QSV'
[Parsed_hwmap_2 @ 0x24e40173000] Failed to initialise target frames context: -38.
[Parsed_hwmap_2 @ 0x24e40173000] Failed to configure output pad on Parsed_hwmap_2
Error reinitializing filters!
Failed to inject frame into filter network: Function not implemented
Error while processing the decoded data for stream #0:0
[AVIOContext @ 0x24e401907c0] Statistics: 0 bytes written, 0 seeks, 0 writeouts
Terminating demuxer thread 0
[AVIOContext @ 0x24e401902c0] Statistics: 1982895 bytes read, 2 seeks
Conversion failed!

Is there something else I would need to change to make this work? The full command is this:

ffmpeg -hwaccel qsv -async_depth 4 -threads 1 -i HDR.mp4 \
    -c:v h264_qsv -c:a copy -movflags faststart -fps_mode passthrough -map 0:0 -bf 7 -refs 5 -g 256 -v verbose \
    -vf hwmap=derive_device=opencl,tonemap_opencl=desat=0:format=nv12:matrix=bt709:primaries=bt709:range=pc:tonemap=hable:transfer=bt709,hwmap=derive_device=qsv:reverse=1 \
    -preset 7 -global_quality 23 SDR.mp4


I think you have to specify format=qsv after the hwmap to enable the indirect mapping:

hwmap=derive_device=qsv:reverse=1,format=qsv

Contributor Author


Thanks, that worked! I'll change it to use qsv.
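
For reference, the filter chain that this thread converges on could be assembled roughly as follows. This is a sketch only, not the code that was merged: `TonemapOptions` and `buildQsvTonemapFilters` are hypothetical names, and the option values are copied from the command quoted earlier in the thread.

```typescript
// Hypothetical helper, not the actual immich implementation.
// Field names mirror the tonemap_opencl arguments in the command quoted above.
interface TonemapOptions {
  desat: number;
  format: string;
  matrix: string;
  primaries: string;
  range: string;
  tonemap: string;
  transfer: string;
}

function buildQsvTonemapFilters(options: TonemapOptions): string[] {
  const tonemapArgs = [
    `desat=${options.desat}`,
    `format=${options.format}`,
    `matrix=${options.matrix}`,
    `primaries=${options.primaries}`,
    `range=${options.range}`,
    `tonemap=${options.tonemap}`,
    `transfer=${options.transfer}`,
  ];
  return [
    // Map the decoded frames to OpenCL for tone-mapping.
    'hwmap=derive_device=opencl',
    `tonemap_opencl=${tonemapArgs.join(':')}`,
    // Map back to QSV (not VAAPI), as suggested in the review.
    'hwmap=derive_device=qsv:reverse=1',
    // Per the review, this is needed for the reverse mapping to work.
    'format=qsv',
  ];
}

const filters = buildQsvTonemapFilters({
  desat: 0,
  format: 'nv12',
  matrix: 'bt709',
  primaries: 'bt709',
  range: 'pc',
  tonemap: 'hable',
  transfer: 'bt709',
});

// Joined with commas, this is the value that would be passed to -vf.
const vf = filters.join(',');
```

Keeping the filters as an array and joining them at the end matches the shape of the snippet under review, which returns a list of filter strings.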


3 participants