Rendering
The frames decoded by the rkmpp decoders are DRM_PRIME frames, and not all players support this type of frame. There are two common ways to render a DRM_PRIME frame.
The EGL interface, coming either from Mesa or from a vendor's custom GL implementation (e.g. the ARM Mali blobs), can import DRM_PRIME frames and render them on the screen. However, this rendering scheme is not yet implemented in all players. In this case the rendering flow is as follows.
 Kernel (Rockchip BSP)            UserSpace (Linux)
+----------------------+      +------------------------+
| Rockchip HW Decoders |      |                        |
|          ↓           |      |                        |
| mpp service    ------+------+---------> mpp          |
|                      |      |            ↓           |
| rgamulti       <-----+------+-librga -> ffmpeg       |
|                      |      |            ↓ (DRMPRIME)|
|                      |      |           player       |
|                      |      |            ↓           |
| GPU Driver     <-----+------+-------- mesa(EGL)      |
|           |          |      |                        |
|           +----------+------+--> mesa (pan*/lima*)   |
|                      |      |            ↓           |
| DRM/KMS (vop2)   <---+------+-------- DE (gnome/KDE) |
|          ↓           |      |                        |
| Display Out          |      |                        |
| (hdmi/dp/mipi,lvds)  |      |                        |
+----------------------+      +------------------------+
This flow is exactly like the one above, except that the GPU and DE part is bypassed and the player renders directly on the kernel's DRM interface. Since the GPU is bypassed, this is the most efficient rendering method, but it is supported by very few players (only Kodi as of writing).
 Kernel (Rockchip BSP)            UserSpace (Linux)
+----------------------+      +------------------------+
| Rockchip HW Decoders |      |                        |
|          ↓           |      |                        |
| mpp service    ------+------+---------> mpp          |
|                      |      |            ↓           |
| rgamulti       <-----+------+-librga -> ffmpeg       |
|                      |      |            ↓ (DRMPRIME)|
|                      |      |           player       |
|                      |      |            |           |
| DRM/KMS (vop2)   <---+------+------------+           |
|          ↓           |      |                        |
| Display Out          |      |                        |
| (hdmi/dp/mipi,lvds)  |      |                        |
+----------------------+      +------------------------+
When an FFmpeg decoder decodes the picture on the CPU (e.g. FFmpeg's native h264 decoder or libdav1d), the frames are placed in a memory buffer and are generally copied around several times. This flow is acceptable for 1080p, or even for 4K up to a point, but those copies introduce a lot of delay and choppiness in the video. Therefore the rkmpp decoders do not treat software frames as a priority, because this is not the intended use of a hardware decoder.
As mentioned above, neither of the two known rendering schemes should copy frames to another buffer if the desired performance is to be reached. Even with zero-copy rendering, producing large frames such as 8K at 60fps requires a lot of DDR bandwidth and is not even possible at generic DDR4 speeds. To tackle this problem, ARM introduced a frame compression mechanism called AFBC (Arm Frame Buffer Compression). The rkmpp decoders can generate AFBC compressed frames, but an AFBC capable Mesa and/or an AFBC capable DRM driver (vop2) is essential to actually reach 8K@60fps smoothly.
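To put the bandwidth concern into numbers, the sketch below estimates the raw traffic of a single pass over uncompressed frames. It assumes NV12 (8-bit 4:2:0, 12 bits per pixel) at 7680x4320, and the function name `frame_bandwidth` is only illustrative. Each frame is written once by the decoder and read at least once by the display path, so the real DDR traffic is a multiple of this figure.

```shell
#!/bin/sh
# Estimate raw (uncompressed) frame traffic in bytes per second.
# Usage: frame_bandwidth WIDTH HEIGHT BITS_PER_PIXEL FPS
frame_bandwidth() {
    echo $(( $1 * $2 * $3 / 8 * $4 ))
}

# 8K NV12 (12 bits per pixel) at 60 fps
frame_bandwidth 7680 4320 12 60   # prints 2985984000
```

That is roughly 3 GB/s for one write or one read of the stream, before any other memory user is counted.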
If AFBC is not supported by the rendering chain, there is still the option to decode the frames as AFBC compressed and then decompress them with the RGA filter in ffmpeg. This method also gives 8K@60fps performance levels, at the cost of a larger memory footprint (the result also depends on your DDR performance).
FFplay supports DRMPrime frames only when the decoder provides them through the VAAPI interface; otherwise FFplay has no way to take advantage of the rkmpp accelerated frames.
So there is currently no proper support for FFplay.
mpv supports DRMPrime frames through EGL.
To get basic support, run mpv with the syntax below:
mpv --profile=fast --hwdec=rkmpp path-to-file
- Limitations:
  - This gives rendering speeds of up to 4K@60fps. For anything faster you should activate AFBC mode because of the DDR bandwidth limit.
  - This will most likely not play 10bit files, because neither Mesa nor mpv currently supports the NV15 and NV20 10bit plane formats that the rkmpp decoders generate.
To work around those issues, ffmpeg can use RGA filters to decompress the AFBC compressed frames and to convert the 10bit NV15 and NV20 frames to formats more widely accepted by Mesa.
The mpv flags below run the rkmpp decoders in AFBC mode and pass the frames to the RGA filters. RGA converts NV15 frames to P010 and NV20 frames to P210.
mpv --profile=fast --hwdec=rkmpp --vd-lavc-o=afbc=on --vf=scale_rkrga=force_yuv=auto path-to-file
This gives true 10bit decoded rendering (if your display and Mesa actually support it). However, because P010 and P210 are not very efficient picture formats, this usage may still hit memory bandwidth limits around 8K@55fps. Improving performance here requires direct AFBC rendering support in both Mesa and mpv. I have tried several approaches but could not find a proper way to get AFBC support in mpv and Mesa in their current form. Maybe someone can take this up and improve it.
To improve performance further, a dynamic 10bit to 8bit conversion can be applied with RGA as below. This works exactly like the above, but converts NV15 frames to NV12 and NV20 frames to NV16.
mpv --profile=fast --hwdec=rkmpp --vd-lavc-o=afbc=on --vf=scale_rkrga=force_yuv=8bit path-to-file
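The choice between the three invocations above can be scripted. The following is a minimal sketch, not part of mpv or ffmpeg: the helper name `mpv_rkmpp_opts` is hypothetical, and it assumes the source pixel format is reported the way ffprobe names it (for example `yuv420p10le` for 10bit content). It only prints the options already shown on this page.

```shell
#!/bin/sh
# Print the mpv options from this page that suit a given source pixel format.
# Usage: mpv_rkmpp_opts PIX_FMT [auto|8bit]
mpv_rkmpp_opts() {
    pix_fmt=$1
    mode=${2:-auto}
    case "$pix_fmt" in
        *10le|*10be)
            # 10bit source: decode as AFBC and let RGA convert the frames
            echo "--profile=fast --hwdec=rkmpp --vd-lavc-o=afbc=on --vf=scale_rkrga=force_yuv=$mode"
            ;;
        *)
            # 8bit source: basic mode is enough up to about 4K@60
            echo "--profile=fast --hwdec=rkmpp"
            ;;
    esac
}

# Possible usage (needs ffprobe and mpv installed):
#   fmt=$(ffprobe -v error -select_streams v:0 -show_entries stream=pix_fmt -of csv=p=0 path-to-file)
#   mpv $(mpv_rkmpp_opts "$fmt" 8bit) path-to-file
```

For 8bit content above 4K@60 you would still want the AFBC variant, so treat the default branch as a starting point only.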
- Limitations:
  - In both cases, Mesa expects 64 byte aligned picture buffers, but mpp currently provides dynamically aligned frame buffers which may not be 64 byte aligned. If the picture width is not a multiple of 64 (for example 720x480), Mesa may not accept the provided frames over EGL.
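A quick way to see whether a given video is likely to run into the alignment limitation is to check its width. This is only a sketch: it assumes the line stride in bytes equals the picture width, as for the luma plane of an 8bit format without padding, and the helper name `is_64_aligned` is made up for illustration.

```shell
#!/bin/sh
# Succeed (exit status 0) when WIDTH is a multiple of 64.
# Usage: is_64_aligned WIDTH
is_64_aligned() {
    [ $(( $1 % 64 )) -eq 0 ]
}

for w in 720 1280 1920 3840; do
    if is_64_aligned "$w"; then
        echo "$w: aligned"
    else
        # Round up to the next multiple of 64 for reference
        echo "$w: not aligned (next multiple of 64 is $(( (w + 63) / 64 * 64 )))"
    fi
done
```

Of the common widths checked here, only 720 is not a multiple of 64.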
Tweaking mpv
- --profile=fast is only required when your Mesa is not fast enough to render the decoded frames. With a faster GL implementation such as Mali, or hopefully a future panthor, you do not have to enable it.
- --swapchain-depth=8 might help by increasing the delay, which reduces frames dropped because of a bottleneck somewhere in the rendering path.
- --msg-level=ffmpeg=debug or --msg-level=ffmpeg=trace can give extra useful information about the ffmpeg decoding process. The trace option might be overkill; debug should be enough.
- --vo=gpu-next enables the new GPU backend in mpv, which uses libplacebo. This might give slightly better performance.
- --ao=null --ao-null-untimed disables syncing video to audio. If you are testing from the command line without a proper audio backend, this prevents frame drops caused by missing audio sync.
Kodi supports DRMPrime frames through both EGL and KMS/GBM.
When the windowing system is X, there is no way to support DRMPrime frames, neither through EGL nor through GBM/KMS. When the windowing system is Wayland, you can get EGL support. When there is no windowing system and you start Kodi with GBM, you can get both EGL and GBM/KMS support.
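The three cases above can be summarised as a small lookup. This is an illustrative sketch only: the function name `kodi_prime_methods` is hypothetical, and it assumes the session type is available in `XDG_SESSION_TYPE` (set by systemd-logind on many distributions, with values such as `x11`, `wayland` or `tty`).

```shell
#!/bin/sh
# Map a session type to the DRM PRIME render methods described above.
# Usage: kodi_prime_methods SESSION_TYPE
kodi_prime_methods() {
    case "$1" in
        x11)     echo "none" ;;                   # X: no DRMPrime support
        wayland) echo "EGL" ;;                    # Wayland: EGL only
        tty)     echo "EGL, Direct to Plane" ;;   # no windowing system, Kodi on GBM
        *)       echo "unknown" ;;
    esac
}

# Fall back to "tty" when the variable is not set
kodi_prime_methods "${XDG_SESSION_TYPE:-tty}"
```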
Apply the following configuration to get DRM PRIME rendering over EGL.
settings->player->videos->render method
Allow using DRM PRIME Decoder=enable
Allow Hardware Acceleration with DRM PRIME=enable
Prime Render Method=EGL
- Limitations:
  - The same restrictions as for mpv without any RGA usage apply here as well. Unfortunately Kodi can not use FFmpeg filters and therefore can not make use of RGA.
  - 10bit formats will not work
  - AFBC improvements can not be used
  - Performance will be limited to 4K@60
  - Frames that are not 64 byte aligned will not be rendered
This type of rendering is the fastest method available. To run Kodi with GBM support, the active desktop environment must be stopped so that Kodi can interact with KMS directly. Start Kodi with:
FFMPEG_RKMPP_DEC_OPT="afbc=on" kodi --windowing=gbm --audio-backend=alsa
Note: Forcing the audio backend to alsa is not necessary if you have a proper pipewire configuration.
Then configure kodi to render directly over KMS planes.
settings->player->videos->render method
Allow using DRM PRIME Decoder=enable
Allow Hardware Acceleration with DRM PRIME=enable
Prime Render Method=Direct to Plane
As you might notice, the decoder runs in AFBC mode here, so this mode has no performance restriction: you should be able to get 8K@60 without hitting any DDR bandwidth limit.
- Limitations:
  - If your attached monitor's resolution is below 4K, you will not be able to render 8K frames properly, because vop2 only activates its 8K rendering and scaling capabilities when the attached monitor is actually an 8K monitor. This is a limitation of the Rockchip hardware.
  - As a general rule of thumb, 8K@60 is only reachable if the hardware actually allows it. On RK3588 this means only HEVC and H264 frames; the AV1 and VP9 decoders are limited to 4K performance on RK3588. If the device in use is not an RK3588, this depends on your actual hardware.
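The RK3588 codec limits listed above can be captured in a small helper, for example to warn before trying to play an 8K file. This sketch only encodes what is stated on this page for RK3588; the function name `rk3588_max_class` is hypothetical, and the codec names are assumed to follow ffprobe's `codec_name` values.

```shell
#!/bin/sh
# Highest resolution class the RK3588 decoders reach, per the limits listed above.
# Usage: rk3588_max_class CODEC
rk3588_max_class() {
    case "$1" in
        hevc|h264) echo "8K" ;;
        av1|vp9)   echo "4K" ;;
        *)         echo "unknown" ;;
    esac
}

# Possible usage (needs ffprobe installed):
#   codec=$(ffprobe -v error -select_streams v:0 -show_entries stream=codec_name -of csv=p=0 path-to-file)
#   rk3588_max_class "$codec"
```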
Moonlight automatically detects which FFmpeg decoders can create DRM Prime frames with hardware acceleration, and so it detects the rkmpp decoders automatically. If it does not, users can still force the rkmpp decoders with the environment variables below.
H264_DECODER_HINT=h264_rkmpp HEVC_DECODER_HINT=hevc_rkmpp AV1_DECODER_HINT=av1_rkmpp moonlight
As mentioned, however, it is not necessary to force the codecs. Make sure you have the latest Moonlight with the patch listed below, which fixes a regression related to the V4L2 codecs.
- Rockchip Linux
  - 0001-rga3_uncompact_fix.patch : The Rockchip kernel must be fixed to get P010 & P210 support; otherwise the rendered picture will be displayed broken.
  - 0002-vop2_rbga2101010_capability_fix.patch : When using Kodi in GBM mode, you will get a black screen if this patch is not applied. This is a bug in the Rockchip kernel.
- Librga
  - 0001-normalrga-cpp-add-10b-compact-endian-mode.patch : Librga must also be fixed to get P010 & P210 support; otherwise the rendered picture will be displayed broken.
- mpv
  - 0000-hwdec_drmprime-add-AV_PIX_FMT_NV16-support.patch
  - 0001-hwdec_drmprime-add-AV_PIX_FMT_P010-support.patch
  - 0002-hwdec_drmprime-add-AV_PIX_FMT_P210-support.patch : mpv needs these patches to support the individual formats over EGL. Some or all of them might already be merged in mainline mpv.
- KODI
  - 0001-windowing-gbm-Dynamic-plane-selection.patch : If not applied, video will be displayed on top of the OSD when Kodi is used in GBM mode.
  - 0002-VideoLayerBridgeDRMPRIME-Use-crop-fields-to-render-t.patch : If not applied, the decoded AFBC frames will be rendered with an offset of several pixels at the top. Both of these fixes have a PR in mainline Kodi.
- MOONLIGHT
  - 0001-Only-give-pixel_format-nv12-option-to-v4l2m2m-or-v4l.patch : If not applied, Moonlight will not be able to initialize the rkmpp decoders even if forced.