pH5 commented Jan 23, 2020

Running multiple decoders in parallel on the same VADisplay requires a separate kernel context per VAContext. Furthermore, the VA-API requires surfaces to be allocated independently of the context.

We can achieve this on top of the V4L2 API by allocating and exporting DMA buffers from a separate, temporary kernel context, which can be closed immediately after allocation. Reimporting the orphaned DMA buffers into the decoder contexts then allows the surfaces to be used on multiple contexts.

pH5 and others added 29 commits January 23, 2020 16:45
This can be used to reduce the number of ioctls issued,
by setting multiple controls at once.

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
This can be used to query codec mode controls,
such as decode mode and start code for h.264.

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Update to the merged stateless h.264 kernel interface, as of commit
c3adb85745ca ("media: uapi: h264: Get rid of the p0/b0/b1 ref-lists").

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
If the driver reports that it expects H.264 Annex B start codes,
provide them.

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
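
A minimal sketch of what providing start codes can look like, assuming the slice data arrives without a prefix and the driver's start code control has already been queried. The helper `copy_slice` and the `need_start_code` flag are hypothetical names used for illustration only.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Three-byte Annex B start code prefix from the H.264 specification. */
static const uint8_t annex_b_start_code[3] = { 0x00, 0x00, 0x01 };

/*
 * Hypothetical helper: copy one slice NAL unit into `dst`, prepending an
 * Annex B start code when `need_start_code` is non-zero.
 * Returns the number of bytes written, or 0 if `dst` is too small.
 */
static size_t copy_slice(uint8_t *dst, size_t dst_size,
                         const uint8_t *nal, size_t nal_size,
                         int need_start_code)
{
    size_t prefix = need_start_code ? sizeof(annex_b_start_code) : 0;

    if (nal_size > dst_size || prefix > dst_size - nal_size)
        return 0;

    if (prefix)
        memcpy(dst, annex_b_start_code, prefix);
    memcpy(dst + prefix, nal, nal_size);

    return prefix + nal_size;
}
```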
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
This requires modifications in gst-plugins-bad, libva, and
gstreamer-vaapi.

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
This requires modifications in gst-plugins-bad, libva, and
gstreamer-vaapi.

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
At this point it is unclear whether to store the Inter Y scaling matrix
at index 1 (h.264 standard) or 3 [1]. Store it at both indices for now.

[1] https://lore.kernel.org/linux-media/HE1PR06MB40118B3C30939861DD91113CACBE0@HE1PR06MB4011.eurprd06.prod.outlook.com/T/#m60af013132990335d525e6e5600c5f5bd692cfbf

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
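
The workaround can be sketched as follows. This assumes the ambiguity concerns the 8x8 scaling lists and that Intra Y sits at index 0; `struct scaling_matrix_8x8` is a hypothetical stand-in for the kernel control structure, not its real definition.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's array of 8x8 scaling lists. */
struct scaling_matrix_8x8 {
    uint8_t list[6][64];
};

/*
 * Store the Intra Y list at index 0 and the Inter Y list at both
 * candidate indices (1 and 3), so either interpretation finds it.
 */
static void fill_scaling_8x8(struct scaling_matrix_8x8 *dst,
                             const uint8_t intra_y[64],
                             const uint8_t inter_y[64])
{
    memcpy(dst->list[0], intra_y, 64);
    memcpy(dst->list[1], inter_y, 64);
    memcpy(dst->list[3], inter_y, 64);
}
```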
The mplane type should be selected based on the driver capabilities, not based
on the selected pixel format.

Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
In RequestCreateSurfaces2, the S_FMT(CAP) may not set the desired format
if the capture format is limited to the output format dimensions, unless
the output format is set in advance.

Use V4L2_PIX_FMT_H264_SLICE because we know that it requires larger capture
buffers to store motion vectors on Hantro G1.

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
This works around a runtime dynamic linker error:

  $ vainfo
  libva info: VA-API version 1.1.0
  libva info: va_getDriverName() returns -1
  libva info: User requested driver 'v4l2_request'
  libva info: Trying to open /usr/lib/dri/v4l2_request_drv_video.so
  libva error: dlopen of /usr/lib/dri/v4l2_request_drv_video.so failed:
    /usr/lib/dri/v4l2_request_drv_video.so: undefined symbol: tiled_to_planar
  libva info: va_openDriver() returns -1
  vaInitialize failed with error code -1 (unknown libva error),exit

TODO: roll back surface creation and buffer mapping on error.

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
To avoid reevaluating the environment variable in multiple places when
reopening the video device, store video_path in struct request_data.

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Query buffer capabilities and verify that MMAP, DMABUF, and
ORPHANED_BUFS capabilities are supported on the capture queue.

This is required to allocate buffers on a temporary context, export to
DMA buffers, and then orphan them by closing the temporary video fd.
The orphaned DMA buffers can then be imported by multiple decoder
contexts.

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Allow creating DMABUF slots on the capture queue by specifying memory
type with a parameter to v4l2_create_buffers().

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Allow queueing and dequeueing imported DMA buffers on a capture queue.

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Always export the DMA buffers and store them in the surface in
vaCreateSurfaces(2). Let vaAcquireBufferHandle() and
vaExportSurfaceHandle() dup the stored dmabuf fds.
This is in preparation for allocating DMA buffers on a temporary
allocation context and reimporting them into the decoder contexts
for multi-context support.

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
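
The dup-on-export idea can be illustrated with a small sketch. `struct exported_surface` and `surface_export_fd` are hypothetical simplifications; the real surface object holds far more state.

```c
#include <unistd.h>

/* Hypothetical surface record holding the dmabuf fd exported at creation. */
struct exported_surface {
    int dmabuf_fd;
};

/*
 * Hand out a duplicate of the stored fd so the caller can close its copy
 * without invalidating the one kept in the surface.
 * Returns the new fd, or -1 on error.
 */
static int surface_export_fd(const struct exported_surface *surface)
{
    return dup(surface->dmabuf_fd);
}
```

Because the surface keeps its own fd, the buffer stays importable into decoder contexts regardless of what the API user does with the duplicate.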
Let vaCreateSurfaces(2) allocate buffers on a temporary V4L2 context,
export them to DMA buffers, and orphan them by closing the allocation
context. The orphaned buffers are then imported into the decoder context
upon use.

This allows allocating an arbitrary number of surfaces (up to 32 at
a time), exporting them to external APIs, and using them on multiple
contexts.

Adapt vaEndPicture and vaSyncSurface to (de)queue imported DMA buffers.

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Store the ID of the active decoder context in the render target
surface when the surface state is changed to VASurfaceRendering in
vaBeginPicture(). Clear it when the state is changed to
VASurfaceDisplaying in vaSyncSurface().

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
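
The bookkeeping described here amounts to a small state transition, sketched below with hypothetical stand-in types. The enum values and `INVALID_CONTEXT_ID` only mimic the VA-API surface states and invalid ID; they are not the libva definitions.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the VA-API surface states and invalid ID. */
enum surface_status { SURFACE_READY, SURFACE_RENDERING, SURFACE_DISPLAYING };
#define INVALID_CONTEXT_ID UINT32_MAX

struct tracked_surface {
    enum surface_status status;
    uint32_t context_id; /* decoder context currently using the surface */
};

/* What vaBeginPicture() would do: claim the surface for a context. */
static void surface_begin(struct tracked_surface *s, uint32_t context_id)
{
    s->status = SURFACE_RENDERING;
    s->context_id = context_id;
}

/* What vaSyncSurface() would do: release the surface again. */
static void surface_sync(struct tracked_surface *s)
{
    s->status = SURFACE_DISPLAYING;
    s->context_id = INVALID_CONTEXT_ID;
}
```

With the context ID stored in the surface, a sync on the surface can find the video fd on which the buffer has to be dequeued.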
Let each VA-API context create its own V4L2 context by opening a new
video_fd.

This will allow operating multiple contexts at the same time.

- Queue and dequeue buffers on the per-context video_fd.
- Set h.264 controls on the per-context video_fd.

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
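
A minimal sketch of the per-context fd, assuming a mem2mem device where every open file handle gets its own kernel context. The structs are hypothetical, heavily simplified versions of the driver's bookkeeping; only the `video_path` and `video_fd` names come from the commit messages.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical, simplified driver data: the device path is stored once. */
struct request_data {
    const char *video_path;
};

/* Hypothetical, simplified decoder context with its private video fd. */
struct decoder_context {
    int video_fd;
};

/* Open a fresh fd on the same device node for this VA-API context. */
static int context_open(struct decoder_context *ctx,
                        const struct request_data *data)
{
    ctx->video_fd = open(data->video_path, O_RDWR | O_NONBLOCK);
    return ctx->video_fd < 0 ? -1 : 0;
}

/* Close the per-context fd and mark it as unused. */
static void context_close(struct decoder_context *ctx)
{
    if (ctx->video_fd >= 0)
        close(ctx->video_fd);
    ctx->video_fd = -1;
}
```

Buffer queueing and control setting for a context would then use `ctx->video_fd` rather than a fd shared by the whole display.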
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Since a new temporary context is created every time vaCreateSurfaces(2)
is called, we can use VIDIOC_REQBUFS instead of VIDIOC_CREATE_BUFS to
allocate the buffers.

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
2 participants