WIP: multi-context support #29

pH5 · 2020-01-23T16:01:35Z

Running multiple decoders in parallel on the same VADisplay requires a separate kernel context per VAContext. Furthermore, the VA-API requires surfaces to be allocated independently of the context.

We can achieve this on top of the V4L2 API by allocating and exporting DMA buffers from a separate, temporary kernel context, which can be closed immediately after allocation. The orphaned DMA buffers can then be reimported into the decoder contexts.

At this point it is unclear whether to store the Inter Y scaling matrix at index 1 (h.264 standard) or 3 [1]. Store it at both indices for now.
[1] https://lore.kernel.org/linux-media/HE1PR06MB40118B3C30939861DD91113CACBE0@HE1PR06MB4011.eurprd06.prod.outlook.com/T/#m60af013132990335d525e6e5600c5f5bd692cfbf
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
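The workaround can be sketched as follows. This is only an illustration: `struct scaling_matrix_sketch` and `store_inter_y_8x8()` are hypothetical names, and the sketch assumes the ambiguity concerns a six-entry array of 8x8 scaling lists.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a control payload holding six 8x8 lists. */
struct scaling_matrix_sketch {
	uint8_t scaling_list_8x8[6][64];
};

/*
 * Copy the Inter Y list into both candidate slots, so that the decoder
 * finds it whichever index the kernel interface settles on.
 */
static void store_inter_y_8x8(struct scaling_matrix_sketch *m,
			      const uint8_t inter_y[64])
{
	assert(m != NULL && inter_y != NULL);

	memcpy(m->scaling_list_8x8[1], inter_y, 64); /* h.264 standard order */
	memcpy(m->scaling_list_8x8[3], inter_y, 64); /* alternative from [1] */
}
```

The remaining slots are left untouched, so the duplication can be dropped again by removing one `memcpy()` once the index is settled.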

In RequestCreateSurfaces2, the S_FMT(CAP) may not set the desired format if the capture format is limited to the output format dimensions, unless the output format is set in advance. Use V4L2_PIX_FMT_H264_SLICE because we know that requires larger capture buffers to store motion vectors on Hantro G1. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>

This works around a runtime dynamic linker error:

    $ vainfo
    libva info: VA-API version 1.1.0
    libva info: va_getDriverName() returns -1
    libva info: User requested driver 'v4l2_request'
    libva info: Trying to open /usr/lib/dri/v4l2_request_drv_video.so
    libva error: dlopen of /usr/lib/dri/v4l2_request_drv_video.so failed: /usr/lib/dri/v4l2_request_drv_video.so: undefined symbol: tiled_to_planar
    libva info: va_openDriver() returns -1
    vaInitialize failed with error code -1 (unknown libva error),exit

Query buffer capabilities and verify that MMAP, DMABUF, and ORPHANED_BUFS capabilities are supported on the capture queue. This is required to allocate buffers on a temporary context, export to DMA buffers, and then orphan them by closing the temporary video fd. The orphaned DMA buffers can then be imported by multiple decoder contexts. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>

Always export the DMA buffers and store them in the surface in vaCreateSurfaces(2). Let vaAcquireBufferHandle() and vaExportSurfaceHandle() dup the stored dmabuf fds. This is in preparation for allocating DMA buffers on a temporary allocation context and reimporting them into the decoder contexts for multi-context support. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
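The ownership rule this implies can be sketched as below: the surface keeps the original fd, and every caller gets its own duplicate that it closes independently. `surface_dup_dmabuf()` is a hypothetical helper name, and using `F_DUPFD_CLOEXEC` rather than plain `dup()` is an assumption.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hand out a duplicate of the dmabuf fd stored in the surface.
 * The caller owns the returned fd and must close() it; the stored fd
 * stays valid until the surface itself is destroyed.
 * Returns the new fd, or -1 on error.
 */
static int surface_dup_dmabuf(int stored_fd)
{
	assert(stored_fd >= 0);

	return fcntl(stored_fd, F_DUPFD_CLOEXEC, 0);
}
```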

Let vaCreateSurfaces(2) allocate buffers on a temporary V4L2 context, export them to DMA buffers, and orphan them by closing the allocation context. The orphaned buffers are then imported into the decoder context upon use. This allows allocating an arbitrary number of surfaces (up to 32 at a time), exporting them to external APIs, and using them on multiple contexts. Adapt vaEndPicture and vaSyncSurface to (de)queue imported DMA buffers. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
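The allocate, export, orphan sequence might look roughly like the sketch below. It assumes a single-planar capture queue and leaves out format negotiation (`VIDIOC_S_FMT`), which real code needs before allocating. `alloc_orphaned_dmabufs()` is a hypothetical name, not the function used in the patch.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/videodev2.h>

/*
 * Allocate capture buffers on a temporary context, export each one as a
 * dmabuf fd, then close the context so that the buffers are orphaned.
 * Returns the number of fds written to dmabuf_fds, or -1 on error.
 */
static int alloc_orphaned_dmabufs(const char *video_path, unsigned int count,
				  int *dmabuf_fds)
{
	struct v4l2_requestbuffers req;
	unsigned int i = 0;
	int fd;

	assert(video_path != NULL && dmabuf_fds != NULL);

	fd = open(video_path, O_RDWR | O_CLOEXEC);
	if (fd < 0)
		return -1;

	memset(&req, 0, sizeof(req));
	req.count = count;
	req.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
	req.memory = V4L2_MEMORY_MMAP;
	if (ioctl(fd, VIDIOC_REQBUFS, &req) < 0)
		goto err;

	/* The driver may adjust the count; do not overrun the caller's array. */
	if (req.count > count)
		goto err;

	for (i = 0; i < req.count; i++) {
		struct v4l2_exportbuffer exp;

		memset(&exp, 0, sizeof(exp));
		exp.type = req.type;
		exp.index = i;
		exp.flags = O_RDWR | O_CLOEXEC;
		if (ioctl(fd, VIDIOC_EXPBUF, &exp) < 0)
			goto err;
		dmabuf_fds[i] = exp.fd;
	}

	/* Closing the context orphans the buffers; the dmabuf fds keep them alive. */
	close(fd);
	return (int)req.count;

err:
	while (i-- > 0)
		close(dmabuf_fds[i]);
	close(fd);
	return -1;
}
```

The returned fds stay usable after the temporary context is gone, which is what lets several decoder contexts import the same surface.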

Store the ID of the active decoder context in the render target surface when the surface state is changed to VASurfaceRendering in vaBeginPicture(). Clear it when the state is changed to VASurfaceDisplaying in vaSyncSurface(). Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>

Let each VA-API context create its own V4L2 context by opening a new video_fd. This allows operating multiple contexts at the same time.
- Queue and dequeue buffers on the per-context video_fd.
- Set h.264 controls on the per-context video_fd.
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>

pH5 and others added 29 commits January 23, 2020 16:45

v4l2: introduce v4l2_set_controls

2d07222

This can be used to reduce the number of issued ioctls by setting multiple controls at once. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>

v4l2: introduce v4l2_get_controls

c1261cc

This can be used to query codec mode controls, such as decode mode and start code for h.264. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>

h264: update to merged h.264 kernel interface

0923e90

Update to the merged stateless h.264 kernel interface, as of commit c3adb85745ca ("media: uapi: h264: Get rid of the p0/b0/b1 ref-lists"). Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>

h264: use v4l2_set_controls to reduce number of issued ioctls

fbde9f6

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>

h264: use v4l2_get_controls to query decode mode and start code

c7385a6

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>

h264: add H.264 Annex B start codes if required

b7aadc5

If the driver reports that it expects H.264 Annex B start codes, provide them. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
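For illustration, prefixing a NAL unit with the three-byte Annex B start code `00 00 01` can be sketched as below. `write_annex_b_nal()` is a hypothetical helper; how the driver's expectation is detected and where the slice data is staged are not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static const uint8_t annex_b_start_code[3] = { 0x00, 0x00, 0x01 };

/*
 * Write the start code followed by the NAL unit into dst.
 * Returns the number of bytes written, or 0 if dst is too small.
 */
static size_t write_annex_b_nal(uint8_t *dst, size_t dst_size,
				const uint8_t *nal, size_t nal_size)
{
	size_t total = sizeof(annex_b_start_code) + nal_size;

	assert(dst != NULL && nal != NULL);

	/* Reject size overflow and destinations that cannot hold the result. */
	if (total < nal_size || total > dst_size)
		return 0;

	memcpy(dst, annex_b_start_code, sizeof(annex_b_start_code));
	memcpy(dst + sizeof(annex_b_start_code), nal, nal_size);
	return total;
}
```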

h264: set pic_num in dpb

97a013c

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>

h264: set frame_num in slice_params

a33da99

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>

h264: extract nal_ref_idc and nal_unit_type

a422742

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
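Both fields live in the first byte of an H.264 NAL unit: one forbidden zero bit, two bits of nal_ref_idc, and five bits of nal_unit_type. A minimal extraction sketch, with the hypothetical helper name `parse_nal_header()`:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Split the first NAL unit byte into nal_ref_idc and nal_unit_type. */
static void parse_nal_header(uint8_t header, uint8_t *nal_ref_idc,
			     uint8_t *nal_unit_type)
{
	assert(nal_ref_idc != NULL && nal_unit_type != NULL);

	*nal_ref_idc = (header >> 5) & 0x3;	/* bits 6..5 */
	*nal_unit_type = header & 0x1f;		/* bits 4..0 */
}
```

For example, a header byte of 0x65 yields nal_ref_idc 3 and nal_unit_type 5, which is an IDR slice.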

h264: set max_num_ref_frames in SPS

6d59904

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>

h264: set profile_idc in SPS

a74198a

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>

h264: set idr_pic_id and dec_ref_pic_marking_bit_size

00080bf

This requires modifications in gst-plugins-bad, libva, and gstreamer-vaapi. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>

h264: set pic_order_cnt_bit_size

145fb8a

This requires modifications in gst-plugins-bad, libva, and gstreamer-vaapi. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>

h264: set num_ref_idx_l[01]_default_active_minus1 in PPS

9306beb

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>

Fix mplane support

abd2b2e

The mplane type should be selected based on the driver capabilities, not based on the selected pixel format. Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>

surface: add surface creation error path

d20b686

TODO: roll back surface creation and buffer mapping on error. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>

request: store video_path in driver data

f9d852f

To avoid reevaluating the environment variable in multiple places when reopening the video device, store video_path in struct request_data. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>

v4l2: add memory type to v4l2_create_buffers

2c1ea3a

Allow creating DMABUF slots on the capture queue by specifying memory type with a parameter to v4l2_create_buffers(). Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>

v4l2: add dmabuf (de)queue helpers

5957d64

Allow queuing and dequeuing imported DMA buffers on a capture queue. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>

move dmabuf slot creation from vaCreateSurfaces(2) into vaCreateContext

10d485d

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>

context: allocate output buffers with REQBUFS

654e91e

Since a new temporary context is created every time vaCreateSurfaces(2) is called, we can use VIDIOC_REQBUFS instead of VIDIOC_CREATE_BUFS to allocate the buffers. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>

wolfallein mentioned this pull request May 22, 2021

Fails to build against kernel 5.11.x #35

Open
