Description
Environment:
GPU NVIDIA GeForce RTX 5060 (Blackwell GB206, SM 12.0)
NVIDIA Driver 580.126.16 / 580.126.20 (Open)
Vulkan API 1.4.312
vdr-plugin-softhdcuvid 3.35.3+git20260315-355-b475c17-0yavdr0noblenoble
libplacebo349 7.349.0+git20241013-18-9e16c86f-1yavdr0
libplacebo338 6.338.2-2build1
OS Ubuntu 24.04.4 LTS (Noble)
Platform Proxmox VM with GPU-Passthrough
VDR crashes on shutdown with a segfault, reported by the Linux kernel in dmesg:
video display[7125]: segfault at 7d5456124760 ip 00007d5456124760
sp 00007d5688fdc818 error 15 likely on CPU 0
- ip == faulting address → call through an invalid function pointer
- error 15 (the kernel prints the code in hex, so 0x15) → protection fault on a present page, user mode, instruction fetch
- crash always occurs during teardown of the video display thread
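The reading of the dmesg line above can be sketched in a few lines of C. This is only an illustration: the bit layout is the architectural x86 page-fault error code, while the names `fault_info`, `decode_fault` and `looks_like_bad_call` are hypothetical helpers made up for this example.

```c
#include <stdbool.h>

/* x86 page-fault error code bits (architectural definition). */
enum {
    FAULT_PRESENT = 1 << 0, /* set: protection violation on a present page */
    FAULT_WRITE   = 1 << 1, /* set: the access was a write */
    FAULT_USER    = 1 << 2, /* set: the access came from user mode */
    FAULT_RSVD    = 1 << 3, /* set: a reserved bit was set in a paging entry */
    FAULT_INSTR   = 1 << 4  /* set: the access was an instruction fetch */
};

/* Hypothetical helper struct: one flag per error code bit. */
struct fault_info {
    bool present, write, user, reserved, instr_fetch;
};

/* Split the error code printed after "error" in dmesg into its flags.
 * The kernel prints the code in hexadecimal, so "error 15" is 0x15. */
static struct fault_info decode_fault(unsigned long long code)
{
    struct fault_info f;
    f.present     = (code & FAULT_PRESENT) != 0;
    f.write       = (code & FAULT_WRITE) != 0;
    f.user        = (code & FAULT_USER) != 0;
    f.reserved    = (code & FAULT_RSVD) != 0;
    f.instr_fetch = (code & FAULT_INSTR) != 0;
    return f;
}

/* Heuristic: the CPU faulted while fetching an instruction from the very
 * address it jumped to, which is typical for a call through a bad pointer. */
static bool looks_like_bad_call(unsigned long long fault_addr,
                                unsigned long long ip,
                                unsigned long long code)
{
    return fault_addr == ip && (code & FAULT_INSTR) != 0;
}
```

With the values from the log, `decode_fault(0x15)` sets `present`, `user` and `instr_fetch`, and `looks_like_bad_call(0x7d5456124760, 0x7d5456124760, 0x15)` is true.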
Sequence in VDR log immediately before the crash:
vdr: CuvidExit
vdr: video: video thread canceled
vdr: decoder thread exit
vdr: delete placebo ← last log entry before crash
kernel: video display[...]: segfault error 15
Root Cause
softhdcuvid initializes the Vulkan instance in InitPlacebo() in video.c with only two extensions:
const char *ext[2] = {"VK_KHR_xcb_surface", "VK_KHR_surface"};
iparams.num_extensions = 2;
iparams.extensions = ext;
VK_EXT_debug_utils is requested at neither the instance nor the device level.
libplacebo and libnvidia-glcore internally load vkCmdBeginDebugUtilsLabelEXT and vkCmdEndDebugUtilsLabelEXT via vkGetDeviceProcAddr – without the extension having been enabled.
Behavior depending on driver/GPU:
- Older GPUs/drivers: vkGetDeviceProcAddr returns NULL → null checks work → no crash
- RTX 5060 + Driver 580 (Blackwell SM 12.0): vkGetDeviceProcAddr returns a non-NULL but invalid pointer → null checks do not fire → SEGFAULT
Confirmed by Vulkan Validation Layer:
vkCmdBeginDebugUtilsLabelEXT(): function required extension
VK_EXT_debug_utils which has not been enabled.
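Since a NULL check alone is not reliable here, one defensive pattern is to gate the lookup on the list of extensions that were actually enabled. The following is a minimal sketch under assumptions, not code from softhdcuvid or libplacebo: `ext_enabled`, `load_if_enabled`, `load_fn` and `fake_loader` are hypothetical names, and `fake_loader` only stands in for a `vkGetDeviceProcAddr` that hands back a non-NULL pointer regardless of the enabled extensions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef void (*generic_fn)(void);
/* Hypothetical loader signature standing in for vkGetDeviceProcAddr. */
typedef generic_fn (*load_fn)(const char *symbol);

/* True if `name` is in the list of extensions enabled at creation time. */
static bool ext_enabled(const char *const *enabled, size_t count,
                        const char *name)
{
    for (size_t i = 0; i < count; i++) {
        if (strcmp(enabled[i], name) == 0)
            return true;
    }
    return false;
}

/* Only ask the loader for `symbol` when its owning extension is enabled;
 * otherwise report "not available" without trusting the loader at all. */
static generic_fn load_if_enabled(const char *const *enabled, size_t count,
                                  const char *ext, const char *symbol,
                                  load_fn loader)
{
    if (!ext_enabled(enabled, count, ext))
        return NULL;
    return loader(symbol);
}

/* Stand-in for a driver that always returns a non-NULL pointer. */
static void dummy_entry(void) {}
static generic_fn fake_loader(const char *symbol)
{
    (void)symbol;
    return dummy_entry;
}
```

With the original two-entry extension list, `load_if_enabled(..., "VK_EXT_debug_utils", "vkCmdBeginDebugUtilsLabelEXT", fake_loader)` yields NULL even though the loader itself would have returned a pointer.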
Fix
Two changes in video.c inside InitPlacebo():
- Add instance extension:
// Before:
const char *ext[2] = {"VK_KHR_xcb_surface", "VK_KHR_surface"};
iparams.num_extensions = 2;
// After:
const char *ext[3] = {"VK_KHR_xcb_surface", "VK_KHR_surface",
"VK_EXT_debug_utils"};
iparams.num_extensions = 3;
- Add device extension (after params.allow_software = false;):
static const char *dev_opt_ext[] = {"VK_EXT_debug_utils"};
params.opt_extensions = dev_opt_ext;
params.num_opt_extensions = 1;
opt_extensions is used so that device creation does not fail if the extension is unavailable on other systems.
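The same "do not fail if unavailable" idea could also be applied to the instance extension list, which the change above extends unconditionally. The sketch below is one possible approach, not part of the proposed patch: `build_ext_list` is a hypothetical helper, and the `available` array is assumed to hold names obtained elsewhere, for example from `vkEnumerateInstanceExtensionProperties`.

```c
#include <stddef.h>
#include <string.h>

/* Copy all required extension names into `out`, then append `optional`
 * only if it appears in `available`. Returns the number of entries
 * written, or 0 if the required names alone do not fit into `out`. */
static size_t build_ext_list(const char *const *required, size_t n_required,
                             const char *optional,
                             const char *const *available, size_t n_available,
                             const char **out, size_t out_cap)
{
    size_t n = 0;

    if (n_required > out_cap)
        return 0;
    for (size_t i = 0; i < n_required; i++)
        out[n++] = required[i];

    for (size_t i = 0; i < n_available; i++) {
        if (strcmp(available[i], optional) == 0) {
            if (n < out_cap)
                out[n++] = optional;
            break;
        }
    }
    return n;
}
```

The returned count and the `out` array would then be assigned to `iparams.num_extensions` and `iparams.extensions` in place of the fixed-size `ext` array.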
Relation to libplacebo Bug
A related bug was found and fixed simultaneously in libplacebo: CmdBeginDebugUtilsLabelEXT and CmdEndDebugUtilsLabelEXT were loaded as mandatory functions in vk_dev_funs[] without activating the extension (src/vulkan/context.c). Both fixes together completely resolve the segfault.
libplacebo fix (src/vulkan/context.c):
# Remove from the mandatory function list vk_dev_funs[]:
sed -i '/PL_VK_DEV_FUN(CmdBeginDebugUtilsLabelEXT)/d' src/vulkan/context.c
sed -i '/PL_VK_DEV_FUN(CmdEndDebugUtilsLabelEXT)/d' src/vulkan/context.c
Tested
Ubuntu 24.04.4 LTS (Noble)
RTX 5060 (Blackwell GB206, SM 12.0)
NVIDIA Open Driver 580.126.16 / 580.126.20
VDR starts and stops without segfault after the fix