diff --git a/docs/extension_support.md b/docs/extension_support.md
new file mode 100644
index 00000000..e1b81a8f
--- /dev/null
+++ b/docs/extension_support.md
@@ -0,0 +1,113 @@
+# Extension support in a layer
+
+It might be useful for some layers to implement an extension, such as
+`VK_EXT_frame_boundary`, even if the underlying driver does not support it.
+This page explains the general approach that needs to be taken, and the
+specific API modifications that need to be applied for specific extensions.
+
+The core libGPULayers framework allows you to expose additional extensions via
+the default `vkEnumerate*ExtensionProperties()` implementation, but per-layer
+code must implement the API modifications in any other functions as needed.
+
+## Exposing a new extension
+
+New extensions are advertised to applications by adding the extension string to
+the list returned by `vkEnumerate*ExtensionProperties()`. This functionality
+is provided in the common framework default functions. Layer implementations
+add the new extension information that they want to expose to either:
+
+* `Instance::injectedInstanceExtensions` for instance extensions.
+* `Instance::injectedDeviceExtensions` for device extensions.
+
+Device extensions will be removed from this list if we can detect that the
+underlying device already supports them, which means we can just pass through
+rather than emulating support.
+
+### Handling extended API entry points
+
+All entry points that are touched by an extension need to be intercepted with
+a `user_tag` version of that function, which will implement the functionality
+that the layer requires.
+
+If the driver beneath the layer actually supports the extension, the extended
+API parameters can be passed down to the driver without modification. This
+scenario can be detected by checking that the extension name is no longer in
+the `injectedDeviceExtensions` list, although the layer will probably want to
+cache this check to reduce performance overhead.
+
+If the driver beneath the layer does not support the extension, the extended
+API parameters should be rewritten to remove the extension before passing down
+to the driver. User structure inputs to the Vulkan API are usually marked as
+`const`, so we must take a safe-struct copy which we can modify and then pass
+that copy to the driver.
+
+Note that Vulkan specifies that components must ignore structures in the
+`pNext` chain that they do not understand:
+
+> Any component of the implementation (the loader, any enabled layers, and
+> drivers) must skip over, without processing (other than reading the `sType`
+> and `pNext` members) any extending structures in the chain not defined by
+> core versions or extensions supported by that component.
+
+Any extension structures can therefore be left in situ when being emulated, but
+any other API parameter modifications must be undone to hide the emulation.
+
+## Common extension notes
+
+This section is a set of brief notes about extensions that we have implemented,
+summarizing the changes needed and referencing where you can find an example
+of the changes if you need something similar.
+
+### VK_EXT_frame_boundary
+
+This extension allows applications to annotate arbitrary submit calls to
+indicate which frame the submitted work belongs to, instead of relying on
+`vkQueuePresentKHR()`. This can be useful for multi-threaded applications,
+where CPU processing for frames can overlap, and for applications which
+do not present frames, but that want to use tools such as RenderDoc that
+require frame boundaries.
+
+The `layer_gpu_timeline` layer is an example of a layer exposing this
+extension using emulation on devices that do not support it.
+
+#### Exposing extension
+
+Adding exposure handling:
+
+* Add `VK_EXT_frame_boundary` to `Instance::injectedDeviceExtensions`.
+* Populate the `VkPhysicalDeviceFrameBoundaryFeaturesEXT` structure in the
+  `VkPhysicalDeviceFeatures2.pNext` list returned by
+  `vkGetPhysicalDeviceFeatures2()`, forcing `frameBoundary` to `VK_TRUE`, if
+  the extension is "supported" but feature-disabled by the driver.
+* Query `VkPhysicalDeviceFrameBoundaryFeaturesEXT` in
+  `VkDeviceCreateInfo.pNext` to see if the application enabled the feature.
+
+#### Implementing extension
+
+Adding implementation handling:
+
+* Add `VkFrameBoundaryEXT` extension struct handling to:
+  * `vkQueueSubmit()`
+  * `vkQueueSubmit2()`
+  * `vkQueuePresentKHR()`
+  * `vkQueueBindSparse()`
+
+#### Implementation notes
+
+Most applications that I have seen using this extension use it to demarcate
+frames when a single submitting render thread performs off-screen rendering
+or compute work that never calls `vkQueuePresentKHR()`. In these systems just
+detecting the frame boundary flag in the extension structure passed to a queue
+submit is enough, which mirrors how we would use `vkQueuePresentKHR()` to do
+the same without this extension.
+
+It is possible for applications to have multiple concurrent frames being
+submitted in an overlapping manner, which can be handled by tagging work with
+the frame ID found in the extension structure for each `vkQueue*()` call. This
+will require downstream data handling to cope with overlapping frame
+submissions, which most of our layers do not handle, as it is rarely
+encountered.
+
+- - -
+
+_Copyright © 2025, Arm Limited and contributors._
diff --git a/generator/vk_layer/source/instance.cpp b/generator/vk_layer/source/instance.cpp
index 164ffb4f..70e705a0 100644
--- a/generator/vk_layer/source/instance.cpp
+++ b/generator/vk_layer/source/instance.cpp
@@ -38,10 +38,16 @@ static std::unordered_map> g_instances;
 const APIVersion Instance::minAPIVersion { 1, 1 };
 /* See header for documentation.
*/ -const std::vector Instance::extraExtensions { +const std::vector Instance::requiredDriverExtensions { VK_EXT_DEBUG_UTILS_EXTENSION_NAME }; +/* See header for documentation. */ +const std::vector> Instance::injectedInstanceExtensions {}; + +/* See header for documentation. */ +std::vector> Instance::injectedDeviceExtensions {}; + /* See header for documentation. */ void Instance::store( VkInstance handle, diff --git a/generator/vk_layer/source/instance.hpp b/generator/vk_layer/source/instance.hpp index 8660e3c9..b2cd47b5 100644 --- a/generator/vk_layer/source/instance.hpp +++ b/generator/vk_layer/source/instance.hpp @@ -143,7 +143,24 @@ class Instance static const APIVersion minAPIVersion; /** - * @brief The minimum set of instance extensions needed by this layer. + * @brief Required extensions from the driver. + * + * The layer will attempt to enable these even if the application does not. + */ + static const std::vector requiredDriverExtensions; + + /** + * @brief Additional instance extensions injected by the layer. + * + * The layer will expose these even if the driver does not. + */ + static const std::vector> injectedInstanceExtensions; + + /** + * @brief Additional device extensions injected by the layer. + * + * The layer will expose these even if the driver does not. Items are + * removed from the list if the driver already exposes the extension. */ - static const std::vector extraExtensions; + static std::vector> injectedDeviceExtensions; }; diff --git a/layer_example/source/instance.cpp b/layer_example/source/instance.cpp index 78a7941d..35c6b842 100644 --- a/layer_example/source/instance.cpp +++ b/layer_example/source/instance.cpp @@ -35,10 +35,18 @@ static std::unordered_map> g_instances; /* See header for documentation. */ -const APIVersion Instance::minAPIVersion {1, 1}; +const APIVersion Instance::minAPIVersion { 1, 1 }; /* See header for documentation. 
*/ -const std::vector Instance::extraExtensions {VK_EXT_DEBUG_UTILS_EXTENSION_NAME}; +const std::vector Instance::requiredDriverExtensions { + VK_EXT_DEBUG_UTILS_EXTENSION_NAME +}; + +/* See header for documentation. */ +const std::vector> Instance::injectedInstanceExtensions {}; + +/* See header for documentation. */ +std::vector> Instance::injectedDeviceExtensions {}; /* See header for documentation. */ void Instance::store(VkInstance handle, std::unique_ptr& instance) diff --git a/layer_example/source/instance.hpp b/layer_example/source/instance.hpp index cc05dcb8..acd7c91a 100644 --- a/layer_example/source/instance.hpp +++ b/layer_example/source/instance.hpp @@ -137,7 +137,24 @@ class Instance static const APIVersion minAPIVersion; /** - * @brief The minimum set of instance extensions needed by this layer. + * @brief Required extensions from the driver. + * + * The layer will attempt to enable these even if the application does not. + */ + static const std::vector requiredDriverExtensions; + + /** + * @brief Additional instance extensions injected by the layer. + * + * The layer will expose these even if the driver does not. + */ + static const std::vector> injectedInstanceExtensions; + + /** + * @brief Additional device extensions injected by the layer. + * + * The layer will expose these even if the driver does not. Items are + * removed from the list if the driver already exposes the extension. */ - static const std::vector extraExtensions; + static std::vector> injectedDeviceExtensions; }; diff --git a/layer_gpu_profile/source/instance.cpp b/layer_gpu_profile/source/instance.cpp index 71bd4c18..df4f55ac 100644 --- a/layer_gpu_profile/source/instance.cpp +++ b/layer_gpu_profile/source/instance.cpp @@ -35,13 +35,19 @@ static std::unordered_map> g_instances; /* See header for documentation. */ -const APIVersion Instance::minAPIVersion {1, 1}; +const APIVersion Instance::minAPIVersion { 1, 1 }; /* See header for documentation. 
*/ -const std::vector Instance::extraExtensions { +const std::vector Instance::requiredDriverExtensions { VK_EXT_DEBUG_UTILS_EXTENSION_NAME, }; +/* See header for documentation. */ +const std::vector> Instance::injectedInstanceExtensions {}; + +/* See header for documentation. */ +std::vector> Instance::injectedDeviceExtensions {}; + /* See header for documentation. */ void Instance::store(VkInstance handle, std::unique_ptr& instance) { diff --git a/layer_gpu_profile/source/instance.hpp b/layer_gpu_profile/source/instance.hpp index 854745f3..d2788599 100644 --- a/layer_gpu_profile/source/instance.hpp +++ b/layer_gpu_profile/source/instance.hpp @@ -143,7 +143,24 @@ class Instance static const APIVersion minAPIVersion; /** - * @brief The minimum set of instance extensions needed by this layer. + * @brief Required extensions from the driver. + * + * The layer will attempt to enable these even if the application does not. + */ + static const std::vector requiredDriverExtensions; + + /** + * @brief Additional instance extensions injected by the layer. + * + * The layer will expose these even if the driver does not. + */ + static const std::vector> injectedInstanceExtensions; + + /** + * @brief Additional device extensions injected by the layer. + * + * The layer will expose these even if the driver does not. Items are + * removed from the list if the driver already exposes the extension. 
*/ - static const std::vector extraExtensions; + static std::vector> injectedDeviceExtensions; }; diff --git a/layer_gpu_profile/source/layer_device_functions_debug.cpp b/layer_gpu_profile/source/layer_device_functions_debug.cpp index f975b385..74f352f5 100644 --- a/layer_gpu_profile/source/layer_device_functions_debug.cpp +++ b/layer_gpu_profile/source/layer_device_functions_debug.cpp @@ -42,7 +42,7 @@ VKAPI_ATTR void VKAPI_CALL layer_vkCmdDebugMarkerBeginEXT(VkCommandBuf auto* layer = Device::retrieve(commandBuffer); // Only instrument inside active frame of interest - if(layer->isFrameOfInterest) + if (layer->isFrameOfInterest) { auto& tracker = layer->getStateTracker(); auto& cb = tracker.getCommandBuffer(commandBuffer); @@ -67,7 +67,7 @@ VKAPI_ATTR void VKAPI_CALL layer_vkCmdDebugMarkerEndEXT(VkCommandBuffe auto* layer = Device::retrieve(commandBuffer); // Only instrument inside active frame of interest - if(layer->isFrameOfInterest) + if (layer->isFrameOfInterest) { auto& tracker = layer->getStateTracker(); auto& cb = tracker.getCommandBuffer(commandBuffer); @@ -93,7 +93,7 @@ VKAPI_ATTR void VKAPI_CALL layer_vkCmdBeginDebugUtilsLabelEXT(VkComman auto* layer = Device::retrieve(commandBuffer); // Only instrument inside active frame of interest - if(layer->isFrameOfInterest) + if (layer->isFrameOfInterest) { auto& tracker = layer->getStateTracker(); auto& cb = tracker.getCommandBuffer(commandBuffer); @@ -118,7 +118,7 @@ VKAPI_ATTR void VKAPI_CALL layer_vkCmdEndDebugUtilsLabelEXT(VkCommandB auto* layer = Device::retrieve(commandBuffer); // Only instrument inside active frame of interest - if(layer->isFrameOfInterest) + if (layer->isFrameOfInterest) { auto& tracker = layer->getStateTracker(); auto& cb = tracker.getCommandBuffer(commandBuffer); diff --git a/layer_gpu_support/source/device.cpp b/layer_gpu_support/source/device.cpp index f3cdeabb..f9c9664e 100644 --- a/layer_gpu_support/source/device.cpp +++ b/layer_gpu_support/source/device.cpp @@ -180,7 +180,7 @@ 
static void modifyDeviceRobustBufferAccess(Instance& instance, { if (enableRobustness) { - if(config->robustBufferAccess) + if (config->robustBufferAccess) { LAYER_LOG("Device feature already enabled: robustBufferAccess"); } @@ -190,9 +190,10 @@ static void modifyDeviceRobustBufferAccess(Instance& instance, config->robustBufferAccess = VK_TRUE; } } + if (disableRobustness) { - if(!config->robustBufferAccess) + if (!config->robustBufferAccess) { LAYER_LOG("Device feature already disabled: robustBufferAccess"); } diff --git a/layer_gpu_support/source/instance.cpp b/layer_gpu_support/source/instance.cpp index 71bd4c18..df4f55ac 100644 --- a/layer_gpu_support/source/instance.cpp +++ b/layer_gpu_support/source/instance.cpp @@ -35,13 +35,19 @@ static std::unordered_map> g_instances; /* See header for documentation. */ -const APIVersion Instance::minAPIVersion {1, 1}; +const APIVersion Instance::minAPIVersion { 1, 1 }; /* See header for documentation. */ -const std::vector Instance::extraExtensions { +const std::vector Instance::requiredDriverExtensions { VK_EXT_DEBUG_UTILS_EXTENSION_NAME, }; +/* See header for documentation. */ +const std::vector> Instance::injectedInstanceExtensions {}; + +/* See header for documentation. */ +std::vector> Instance::injectedDeviceExtensions {}; + /* See header for documentation. */ void Instance::store(VkInstance handle, std::unique_ptr& instance) { diff --git a/layer_gpu_support/source/instance.hpp b/layer_gpu_support/source/instance.hpp index df78da73..d24cf593 100644 --- a/layer_gpu_support/source/instance.hpp +++ b/layer_gpu_support/source/instance.hpp @@ -142,7 +142,24 @@ class Instance static const APIVersion minAPIVersion; /** - * @brief The minimum set of instance extensions needed by this layer. + * @brief Required extensions from the driver. + * + * The layer will attempt to enable these even if the application does not. 
+ */ + static const std::vector requiredDriverExtensions; + + /** + * @brief Additional instance extensions injected by the layer. + * + * The layer will expose these even if the driver does not. + */ + static const std::vector> injectedInstanceExtensions; + + /** + * @brief Additional device extensions injected by the layer. + * + * The layer will expose these even if the driver does not. Items are + * removed from the list if the driver already exposes the extension. */ - static const std::vector extraExtensions; + static std::vector> injectedDeviceExtensions; }; diff --git a/layer_gpu_support/source/layer_device_functions_image.cpp b/layer_gpu_support/source/layer_device_functions_image.cpp index e680e6d7..aad18102 100644 --- a/layer_gpu_support/source/layer_device_functions_image.cpp +++ b/layer_gpu_support/source/layer_device_functions_image.cpp @@ -122,10 +122,10 @@ VKAPI_ATTR VkResult VKAPI_CALL layer_vkCreateImage(VkDevice device, } // Create modifiable structures we can patch - vku::safe_VkImageCreateInfo newCreateInfoSafe(pCreateInfo); - auto* newCreateInfo = reinterpret_cast(&newCreateInfoSafe); + vku::safe_VkImageCreateInfo safeCreateInfo(pCreateInfo); + auto* newCreateInfo = reinterpret_cast(&safeCreateInfo); // We know we can const-cast here because this is a safe-struct clone - void* pNextBase = const_cast(newCreateInfoSafe.pNext); + void* pNextBase = const_cast(safeCreateInfo.pNext); // Create extra structures we can patch in VkImageCompressionControlEXT newCompressionControl = vku::InitStructHelper(); @@ -165,7 +165,7 @@ VKAPI_ATTR VkResult VKAPI_CALL layer_vkCreateImage(VkDevice device, // Add a config if not already configured by the application if (patchNeeded) { - vku::AddToPnext(newCreateInfoSafe, *compressionControl); + vku::AddToPnext(safeCreateInfo, *compressionControl); } return layer->driver.vkCreateImage(device, newCreateInfo, pAllocator, pImage); diff --git a/layer_gpu_timeline/source/CMakeLists.txt 
b/layer_gpu_timeline/source/CMakeLists.txt index efe08d75..86de9f87 100644 --- a/layer_gpu_timeline/source/CMakeLists.txt +++ b/layer_gpu_timeline/source/CMakeLists.txt @@ -54,6 +54,7 @@ add_library( layer_device_functions_render_pass.cpp layer_device_functions_trace_rays.cpp layer_device_functions_transfer.cpp + layer_instance_functions.cpp timeline_comms.cpp timeline_protobuf_encoder.cpp) diff --git a/layer_gpu_timeline/source/device.hpp b/layer_gpu_timeline/source/device.hpp index 937b0076..e3bb6b85 100644 --- a/layer_gpu_timeline/source/device.hpp +++ b/layer_gpu_timeline/source/device.hpp @@ -185,6 +185,14 @@ class Device */ static const std::vector createInfoPatches; + /** + * @brief Is this layer emulating VK_EXT_frame_boundary? + * + * Set to @c true if layer is emulating on top of a driver that doesn't + * support it, @c false if layer knows driver supports it. + */ + bool isEmulatingExtFrameBoundary { false }; + private: /** * @brief State tracker for this device. diff --git a/layer_gpu_timeline/source/instance.cpp b/layer_gpu_timeline/source/instance.cpp index 71bd4c18..0aaf9b1a 100644 --- a/layer_gpu_timeline/source/instance.cpp +++ b/layer_gpu_timeline/source/instance.cpp @@ -35,13 +35,21 @@ static std::unordered_map> g_instances; /* See header for documentation. */ -const APIVersion Instance::minAPIVersion {1, 1}; +const APIVersion Instance::minAPIVersion { 1, 1 }; /* See header for documentation. */ -const std::vector Instance::extraExtensions { +const std::vector Instance::requiredDriverExtensions { VK_EXT_DEBUG_UTILS_EXTENSION_NAME, }; +/* See header for documentation. */ +const std::vector> Instance::injectedInstanceExtensions {}; + +/* See header for documentation. */ +std::vector> Instance::injectedDeviceExtensions { + {VK_EXT_FRAME_BOUNDARY_EXTENSION_NAME, VK_EXT_FRAME_BOUNDARY_SPEC_VERSION} +}; + /* See header for documentation. 
*/ void Instance::store(VkInstance handle, std::unique_ptr& instance) { diff --git a/layer_gpu_timeline/source/instance.hpp b/layer_gpu_timeline/source/instance.hpp index cc05dcb8..acd7c91a 100644 --- a/layer_gpu_timeline/source/instance.hpp +++ b/layer_gpu_timeline/source/instance.hpp @@ -137,7 +137,24 @@ class Instance static const APIVersion minAPIVersion; /** - * @brief The minimum set of instance extensions needed by this layer. + * @brief Required extensions from the driver. + * + * The layer will attempt to enable these even if the application does not. + */ + static const std::vector requiredDriverExtensions; + + /** + * @brief Additional instance extensions injected by the layer. + * + * The layer will expose these even if the driver does not. + */ + static const std::vector> injectedInstanceExtensions; + + /** + * @brief Additional device extensions injected by the layer. + * + * The layer will expose these even if the driver does not. Items are + * removed from the list if the driver already exposes the extension. */ - static const std::vector extraExtensions; + static std::vector> injectedDeviceExtensions; }; diff --git a/layer_gpu_timeline/source/layer_device_functions.hpp b/layer_gpu_timeline/source/layer_device_functions.hpp index 1030b35f..38f78bea 100644 --- a/layer_gpu_timeline/source/layer_device_functions.hpp +++ b/layer_gpu_timeline/source/layer_device_functions.hpp @@ -504,3 +504,11 @@ VKAPI_ATTR VkResult VKAPI_CALL layer_vkQueueSubmit2KHR(VkQueue queue, uint32_t submitCount, const VkSubmitInfo2* pSubmits, VkFence fence); + +/* See Vulkan API for documentation. 
*/ +template <> +VKAPI_ATTR VkResult VKAPI_CALL layer_vkQueueBindSparse( + VkQueue queue, + uint32_t bindInfoCount, + const VkBindSparseInfo* pBindInfo, + VkFence fence); diff --git a/layer_gpu_timeline/source/layer_device_functions_queue.cpp b/layer_gpu_timeline/source/layer_device_functions_queue.cpp index 65f4ca59..a4ba1a05 100644 --- a/layer_gpu_timeline/source/layer_device_functions_queue.cpp +++ b/layer_gpu_timeline/source/layer_device_functions_queue.cpp @@ -29,8 +29,8 @@ #include "trackers/queue.hpp" #include - #include +#include extern std::mutex g_vulkanLock; @@ -93,6 +93,44 @@ static void emitCommandBufferMetadata(Device& layer, trackQueue.runSubmitCommandStream(LCS, workloadVisitor); } +/** + * @brief Check a pNext chain for a manual frame boundary marker. + * + * Emits the necessary metadata to emulate a vkQueuePresent. Note that this + * will generate a second metadata submit to be a container for any commands + * if there are submits remaining after the one tagged as end-of-frame. + * + * @param layer The layer context. + * @param queue The queue. + * @param pNext The submit pNext pointer. + * @param isLastSubmit Is this the last submit in the API call? + * @param workloadVisitor Visitor for the protobuf encoder. + */ +static void checkManualFrameBoundary( + Device* layer, + VkQueue queue, + const void* pNext, + bool isLastSubmit, + TimelineProtobufEncoder& workloadVisitor +) { + // Check for end of frame boundary + auto* ext = vku::FindStructInPNextChain(pNext); + if (ext && (ext->flags & VK_FRAME_BOUNDARY_FRAME_END_BIT_EXT)) + { + // Emulate a queue present to indicate end of frame + auto& tracker = layer->getStateTracker(); + tracker.queuePresent(); + + TimelineProtobufEncoder::emitFrame(*layer, tracker.totalStats.getFrameCount(), getClockMonotonicRaw()); + + // Emulate a new queue submit if work remains to submit + if (!isLastSubmit) + { + emitQueueMetadata(queue, workloadVisitor); + } + } +} + /* See Vulkan API for documentation. 
*/ template<> VKAPI_ATTR VkResult VKAPI_CALL layer_vkQueuePresentKHR(VkQueue queue, const VkPresentInfoKHR* pPresentInfo) @@ -106,13 +144,24 @@ VKAPI_ATTR VkResult VKAPI_CALL layer_vkQueuePresentKHR(VkQueue queue, auto& tracker = layer->getStateTracker(); tracker.queuePresent(); - // This is run with the lock held to ensure that all queue submit - // messages are sent sequentially to the host tool + // Create a modifiable structure we can patch + vku::safe_VkPresentInfoKHR safePresentInfo(pPresentInfo); + auto* newPresentInfo = reinterpret_cast(&safePresentInfo); + + // Remove emulated frame boundaries + if (layer->isEmulatingExtFrameBoundary) + { + vku::RemoveFromPnext(safePresentInfo, VK_STRUCTURE_TYPE_FRAME_BOUNDARY_EXT); + } + + // Note that we assume QueuePresent is _always_ the end of a frame. + // This is run with the lock held to ensure that all queue submit messages + // are sent sequentially to the host tool TimelineProtobufEncoder::emitFrame(*layer, tracker.totalStats.getFrameCount(), getClockMonotonicRaw()); // Release the lock to call into the driver lock.unlock(); - return layer->driver.vkQueuePresentKHR(queue, pPresentInfo); + return layer->driver.vkQueuePresentKHR(queue, newPresentInfo); } /* See Vulkan API for documentation. 
*/ @@ -142,6 +191,10 @@ VKAPI_ATTR VkResult VKAPI_CALL VkCommandBuffer commandBuffer = submit.pCommandBuffers[j]; emitCommandBufferMetadata(*layer, queue, commandBuffer, workloadVisitor); } + + // Check for end of frame boundary + bool isLast = i == submitCount - 1; + checkManualFrameBoundary(layer, queue, submit.pNext, isLast, workloadVisitor); } // Release the lock to call into the driver @@ -176,6 +229,10 @@ VKAPI_ATTR VkResult VKAPI_CALL VkCommandBuffer commandBuffer = submit.pCommandBufferInfos[j].commandBuffer; emitCommandBufferMetadata(*layer, queue, commandBuffer, workloadVisitor); } + + // Check for end of frame boundary + bool isLast = i == submitCount - 1; + checkManualFrameBoundary(layer, queue, submit.pNext, isLast, workloadVisitor); } // Release the lock to call into the driver @@ -210,9 +267,53 @@ VKAPI_ATTR VkResult VKAPI_CALL VkCommandBuffer commandBuffer = submit.pCommandBufferInfos[j].commandBuffer; emitCommandBufferMetadata(*layer, queue, commandBuffer, workloadVisitor); } + + // Check for end of frame boundary + bool isLast = i == submitCount - 1; + checkManualFrameBoundary(layer, queue, submit.pNext, isLast, workloadVisitor); } // Release the lock to call into the driver lock.unlock(); return layer->driver.vkQueueSubmit2KHR(queue, submitCount, pSubmits, fence); } + +/** + * See Vulkan API for documentation. + * + * Note: Modelling of this function is only implemented to support manual frame + * boundaries. There is no reporting of the workload associated with bind + * sparse submissions in the Mali timeline driver data model. 
+ */ +template <> +VKAPI_ATTR VkResult VKAPI_CALL layer_vkQueueBindSparse( + VkQueue queue, + uint32_t bindInfoCount, + const VkBindSparseInfo* pBindInfo, + VkFence fence +) { + LAYER_TRACE(__func__); + + // Hold the lock to access layer-wide global store + std::unique_lock lock {g_vulkanLock}; + auto* layer = Device::retrieve(queue); + + // Scan infos for frame boundaries + for (uint32_t i = 0; i < bindInfoCount; i++) + { + const auto& info = pBindInfo[i]; + + auto* ext = vku::FindStructInPNextChain(info.pNext); + if (ext && (ext->flags & VK_FRAME_BOUNDARY_FRAME_END_BIT_EXT)) + { + // Emulate a queue present to indicate end of frame + auto& tracker = layer->getStateTracker(); + tracker.queuePresent(); + TimelineProtobufEncoder::emitFrame(*layer, tracker.totalStats.getFrameCount(), getClockMonotonicRaw()); + } + } + + // Release the lock to call into the driver + lock.unlock(); + return layer->driver.vkQueueBindSparse(queue, bindInfoCount, pBindInfo, fence); +} diff --git a/layer_gpu_timeline/source/layer_instance_functions.cpp b/layer_gpu_timeline/source/layer_instance_functions.cpp new file mode 100644 index 00000000..0d36dfd2 --- /dev/null +++ b/layer_gpu_timeline/source/layer_instance_functions.cpp @@ -0,0 +1,113 @@ +/* + * SPDX-License-Identifier: MIT + * ---------------------------------------------------------------------------- + * Copyright (c) 2025 Arm Limited + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the "Software"), to + * deal in the Software without restriction, including without limitation the + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or + * sell copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. 
+ * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + * ---------------------------------------------------------------------------- + */ + +#include "instance.hpp" +#include "device.hpp" + +#include +#include + +extern std::mutex g_vulkanLock; + +/* See header for documentation. */ +template <> +VKAPI_ATTR VkResult VKAPI_CALL layer_vkCreateDevice( + VkPhysicalDevice physicalDevice, + const VkDeviceCreateInfo* pCreateInfo, + const VkAllocationCallbacks* pAllocator, + VkDevice* pDevice +) { + LAYER_TRACE(__func__); + + // Use the default function for the heavy-lifting + auto res = layer_vkCreateDevice(physicalDevice, pCreateInfo, pAllocator, pDevice); + if (res != VK_SUCCESS) + { + return res; + } + + // Cache flags indicating extension emulation + std::unique_lock lock {g_vulkanLock}; + auto* layer = Device::retrieve(*pDevice); + + static const std::string target { VK_EXT_FRAME_BOUNDARY_EXTENSION_NAME }; + for (auto& ext : layer->instance->injectedDeviceExtensions) + { + if (ext.first == target) + { + layer->isEmulatingExtFrameBoundary = true; + } + } + + return res; +} + +/* See Vulkan API for documentation. 
*/ +template <> +VKAPI_ATTR void VKAPI_CALL layer_vkGetPhysicalDeviceFeatures2( + VkPhysicalDevice physicalDevice, + VkPhysicalDeviceFeatures2* pFeatures +) { + LAYER_TRACE(__func__); + + // Hold the lock to access layer-wide global store + std::unique_lock lock { g_vulkanLock }; + auto* layer = Instance::retrieve(physicalDevice); + + // Release the lock to call into the driver + lock.unlock(); + layer->driver.vkGetPhysicalDeviceFeatures2(physicalDevice, pFeatures); + + // Patch the query response to show that it is supported + auto* ext = vku::FindStructInPNextChain(pFeatures->pNext); + if (ext) + { + ext->frameBoundary = VK_TRUE; + } +} + +/* See Vulkan API for documentation. */ +template <> +VKAPI_ATTR void VKAPI_CALL layer_vkGetPhysicalDeviceFeatures2KHR( + VkPhysicalDevice physicalDevice, + VkPhysicalDeviceFeatures2* pFeatures +) { + LAYER_TRACE(__func__); + + // Hold the lock to access layer-wide global store + std::unique_lock lock { g_vulkanLock }; + auto* layer = Instance::retrieve(physicalDevice); + + // Release the lock to call into the driver + lock.unlock(); + layer->driver.vkGetPhysicalDeviceFeatures2KHR(physicalDevice, pFeatures); + + // Patch the query response to show that it is supported + auto* ext = vku::FindStructInPNextChain(pFeatures->pNext); + if (ext) + { + ext->frameBoundary = VK_TRUE; + } +} diff --git a/layer_gpu_timeline/source/layer_instance_functions.hpp b/layer_gpu_timeline/source/layer_instance_functions.hpp new file mode 100644 index 00000000..c2e6f1d0 --- /dev/null +++ b/layer_gpu_timeline/source/layer_instance_functions.hpp @@ -0,0 +1,50 @@ +/* + * SPDX-License-Identifier: MIT + * ---------------------------------------------------------------------------- + * Copyright (c) 2025 Arm Limited + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the "Software"), to + * deal in the Software without restriction, including without limitation the + 
* rights to use, copy, modify, merge, publish, distribute, sublicense, and/or + * sell copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + * ---------------------------------------------------------------------------- + */ + +#pragma once + +#include + +// Functions for devices + +/* See Vulkan API for documentation. */ +template <> +VKAPI_ATTR VkResult VKAPI_CALL layer_vkCreateDevice( + VkPhysicalDevice physicalDevice, + const VkDeviceCreateInfo* pCreateInfo, + const VkAllocationCallbacks* pAllocator, + VkDevice* pDevice); + +/* See Vulkan API for documentation. */ +template <> +VKAPI_ATTR void VKAPI_CALL layer_vkGetPhysicalDeviceFeatures2( + VkPhysicalDevice physicalDevice, + VkPhysicalDeviceFeatures2* pFeatures); + +/* See Vulkan API for documentation. 
*/ +template <> +VKAPI_ATTR void VKAPI_CALL layer_vkGetPhysicalDeviceFeatures2KHR<user_tag>( + VkPhysicalDevice physicalDevice, + VkPhysicalDeviceFeatures2* pFeatures); diff --git a/source_common/framework/manual_functions.cpp b/source_common/framework/manual_functions.cpp index 2f68292a..a7bcbe6a 100644 --- a/source_common/framework/manual_functions.cpp +++ b/source_common/framework/manual_functions.cpp @@ -29,8 +29,11 @@ * implemented as library code which can be swapped for alternative * implementations on a per-layer basis if needed. */ - +#include +#include +#include #include +#include #include "framework/manual_functions.hpp" #include "utils/misc.hpp" @@ -381,7 +384,9 @@ std::vector cloneExtensionList(uint32_t extensionCount, const char* static void enableInstanceVkExtDebugUtils(vku::safe_VkInstanceCreateInfo& createInfo, const std::vector& supported) { - static const std::string target {VK_EXT_DEBUG_UTILS_EXTENSION_NAME}; + static const std::string target { + VK_EXT_DEBUG_UTILS_EXTENSION_NAME + }; // Test if the desired extension is supported. If supported list is // empty then we didn't query and assume extension is supported. @@ -411,7 +416,9 @@ void enableDeviceVkKhrTimelineSemaphore(Instance& instance, UNUSED(instance); UNUSED(physicalDevice); - static const std::string target {VK_KHR_TIMELINE_SEMAPHORE_EXTENSION_NAME}; + static const std::string target { + VK_KHR_TIMELINE_SEMAPHORE_EXTENSION_NAME + }; // We know we can const-cast here because createInfo is a safe-struct clone void* pNextBase = const_cast<void*>(createInfo.pNext); @@ -523,6 +530,42 @@ void enableDeviceVkExtImageCompressionControl(Instance& instance, } } +/* See header for documentation.
*/ +void emulateDeviceVkExtFrameBoundary(Instance& instance, + VkPhysicalDevice physicalDevice, + vku::safe_VkDeviceCreateInfo& createInfo, + std::vector& supported) +{ + UNUSED(instance); + UNUSED(physicalDevice); + UNUSED(supported); + + static const std::string target {VK_EXT_FRAME_BOUNDARY_EXTENSION_NAME}; + + // We only need to hide it if driver does not support it + bool isEmulated = false; + for (auto& ext : instance.injectedDeviceExtensions) + { + if (ext.first == target) + { + isEmulated = true; + break; + } + } + + if (!isEmulated) + { + return; + } + + // Mask extension if layer is emulating it + bool removed = vku::RemoveFromPnext(createInfo, VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FRAME_BOUNDARY_FEATURES_EXT); + if (removed) + { + LAYER_LOG("Device extension masked: %s", target.c_str()); + } +} + /** See Vulkan API for documentation. */ template <> PFN_vkVoidFunction layer_vkGetInstanceProcAddr(VkInstance instance, const char* pName) @@ -617,15 +660,130 @@ VkResult layer_vkEnumerateInstanceExtensionProperties(const char* p { LAYER_TRACE(__func__); - UNUSED(pProperties); + // Query for a layer + if (pLayerName) + { + // ... 
but not this layer + if (strcmp(pLayerName, layerProps[0].layerName)) + { + return VK_ERROR_LAYER_NOT_PRESENT; + } + + size_t count = Instance::injectedInstanceExtensions.size(); + + // Size query + if (!pProperties) + { + *pPropertyCount = static_cast<uint32_t>(count); + return VK_SUCCESS; + } + + // Property query, clamped to size of user array if smaller + size_t emitCount = std::min(count, static_cast<size_t>(*pPropertyCount)); + for (size_t i = 0; i < emitCount; i++) + { + const auto& ref = Instance::injectedInstanceExtensions[i]; + std::strcpy(pProperties[i].extensionName, ref.first.c_str()); + pProperties[i].specVersion = ref.second; + } + + *pPropertyCount = static_cast<uint32_t>(emitCount); + + if (count > emitCount) + { + return VK_INCOMPLETE; + } + + return VK_SUCCESS; + } + + // Note that unlike device extensions, there is no layering for this + // query, as we have no way of knowing what's beneath us - if (!pLayerName || strcmp(pLayerName, layerProps[0].layerName)) + return VK_ERROR_LAYER_NOT_PRESENT; +} + +static std::vector<VkExtensionProperties> get_driver_device_extensions( + Instance& layer, + VkPhysicalDevice gpu +) { + uint32_t queryCount = 0; + std::vector<VkExtensionProperties> query; + + // Query how many extensions to allocate for + auto err = layer.driver.vkEnumerateDeviceExtensionProperties(gpu, nullptr, &queryCount, nullptr); + if (err != VK_SUCCESS) { - return VK_ERROR_LAYER_NOT_PRESENT; + return {}; } - *pPropertyCount = 0; - return VK_SUCCESS; + // Allocate storage + query.resize(queryCount); + + // Query + err = layer.driver.vkEnumerateDeviceExtensionProperties(gpu, nullptr, &queryCount, query.data()); + if (err != VK_SUCCESS) + { + return {}; + } + + return query; +} + +static void get_extended_device_extensions( + Instance& layer, + std::vector<VkExtensionProperties>& extensions +) { + std::vector<std::string> passthroughExtensions; + + // For each extension in our extension list ... + for (auto& injectedExtension : layer.injectedDeviceExtensions) + { + const std::string& name = injectedExtension.first; + // Is it in the list already?
+ bool found = false; + for (const auto& driverExtension : extensions) + { + if (name == driverExtension.extensionName) + { + passthroughExtensions.emplace_back(name); + found = true; + break; + } + } + + // If not then add it to the list + if (!found) + { + LAYER_LOG("Injecting device extension: %s", name.c_str()); + VkExtensionProperties prop {}; + + // Populate the string, and guarantee it's NUL terminated + std::strncpy(prop.extensionName, name.c_str(), VK_MAX_EXTENSION_NAME_SIZE - 1); + prop.extensionName[VK_MAX_EXTENSION_NAME_SIZE - 1] = '\0'; + prop.specVersion = injectedExtension.second; + extensions.emplace_back(prop); + } + else + { + LAYER_LOG("Not injecting device extension: %s", name.c_str()); + } + } + + // Remove any found extensions from the injected list so that we can tell + // that the driver supports it and we didn't need to inject it + for (auto& ref : passthroughExtensions) + { + auto& tgt = layer.injectedDeviceExtensions; + for (auto it = tgt.begin(); it != tgt.end(); it++) + { + if (ref == it->first) + { + tgt.erase(it); + break; + } + } + } } /** See Vulkan API for documentation. */ @@ -637,23 +795,50 @@ VkResult layer_vkEnumerateDeviceExtensionProperties(VkPhysicalDevic { LAYER_TRACE(__func__); - UNUSED(pProperties); - - // Android layer enumeration will always pass a nullptr for the device - if (!gpu) + // Query for a layer + if (pLayerName) { - if (!pLayerName || strcmp(pLayerName, layerProps[0].layerName)) + // ... 
but not this layer + if (strcmp(pLayerName, layerProps[0].layerName)) { return VK_ERROR_LAYER_NOT_PRESENT; } - *pPropertyCount = 0; + size_t count = Instance::injectedDeviceExtensions.size(); + + // Size query + if (!pProperties) + { + *pPropertyCount = static_cast<uint32_t>(count); + return VK_SUCCESS; + } + + // Property query, clamped to size of user array if smaller + size_t emitCount = std::min(count, static_cast<size_t>(*pPropertyCount)); + for (size_t i = 0; i < emitCount; i++) + { + const auto& ref = Instance::injectedDeviceExtensions[i]; + std::strcpy(pProperties[i].extensionName, ref.first.c_str()); + pProperties[i].specVersion = ref.second; + } + + *pPropertyCount = static_cast<uint32_t>(emitCount); + + if (count > emitCount) + { + return VK_INCOMPLETE; + } + return VK_SUCCESS; } + // Query for a device, but on Android discovery may pass a null device + if (!gpu) + { + return VK_ERROR_LAYER_NOT_PRESENT; + } + // For other cases forward to the driver to handle it - assert(!pLayerName); - assert(gpu); // Hold the lock to access layer-wide global store std::unique_lock lock {g_vulkanLock}; @@ -661,7 +846,38 @@ VkResult layer_vkEnumerateDeviceExtensionProperties(VkPhysicalDevic // Release the lock to call into the driver lock.unlock(); - return layer->driver.vkEnumerateDeviceExtensionProperties(gpu, pLayerName, pPropertyCount, pProperties); + auto our_extensions = get_driver_device_extensions(*layer, gpu); + + // Lock again because we patch a shared structure based on driver response + lock.lock(); + get_extended_device_extensions(*layer, our_extensions); + lock.unlock(); + + // Size query + if (!pProperties) + { + *pPropertyCount = static_cast<uint32_t>(our_extensions.size()); + return VK_SUCCESS; + } + + // Property query, clamped to size of user array if smaller + size_t count2 = our_extensions.size(); + size_t emitCount2 = std::min(count2, static_cast<size_t>(*pPropertyCount)); + for (size_t i = 0; i < emitCount2; i++) + { + const auto& ref = our_extensions[i]; +
std::strcpy(pProperties[i].extensionName, ref.extensionName); + pProperties[i].specVersion = ref.specVersion; + } + + *pPropertyCount = static_cast<uint32_t>(emitCount2); + + if (count2 > emitCount2) + { + return VK_INCOMPLETE; + } + + return VK_SUCCESS; } /** See Vulkan API for documentation. */ @@ -670,20 +886,25 @@ VkResult layer_vkEnumerateInstanceLayerProperties(uint32_t* pProper { LAYER_TRACE(__func__); - if (pProperties) - { - size_t count = std::min(layerProps.size(), static_cast<size_t>(*pPropertyCount)); - if (count < layerProps.size()) - { - return VK_INCOMPLETE; - } + size_t count = layerProps.size(); - memcpy(pProperties, layerProps.data(), count * sizeof(VkLayerProperties)); - *pPropertyCount = count; + // Size query + if (!pProperties) + { + *pPropertyCount = static_cast<uint32_t>(count); return VK_SUCCESS; } - *pPropertyCount = layerProps.size(); + // Property query, clamped to size of user array if smaller + size_t emitCount = std::min(count, static_cast<size_t>(*pPropertyCount)); + std::memcpy(pProperties, layerProps.data(), emitCount * sizeof(VkLayerProperties)); + *pPropertyCount = static_cast<uint32_t>(emitCount); + + if (count > emitCount) + { + return VK_INCOMPLETE; + } + return VK_SUCCESS; } @@ -697,20 +918,25 @@ VkResult layer_vkEnumerateDeviceLayerProperties(VkPhysicalDevice gp UNUSED(gpu); - if (pProperties) - { - size_t count = std::min(layerProps.size(), static_cast<size_t>(*pPropertyCount)); - if (count < layerProps.size()) - { - return VK_INCOMPLETE; - } + size_t count = layerProps.size(); - memcpy(pProperties, layerProps.data(), count * sizeof(VkLayerProperties)); - *pPropertyCount = count; + // Size query + if (!pProperties) + { + *pPropertyCount = static_cast<uint32_t>(count); return VK_SUCCESS; } - *pPropertyCount = layerProps.size(); + // Property query, clamped to size of user array if smaller + size_t emitCount = std::min(count, static_cast<size_t>(*pPropertyCount)); + std::memcpy(pProperties, layerProps.data(), emitCount * sizeof(VkLayerProperties)); + *pPropertyCount = static_cast<uint32_t>(emitCount); + +
if (count > emitCount) + { + return VK_INCOMPLETE; + } + return VK_SUCCESS; } @@ -759,18 +985,18 @@ VKAPI_ATTR VkResult VKAPI_CALL layer_vkCreateInstance(const VkInsta } // Create modifiable structures we can patch - vku::safe_VkInstanceCreateInfo newCreateInfoSafe(pCreateInfo); - auto* newCreateInfo = reinterpret_cast(&newCreateInfoSafe); + vku::safe_VkInstanceCreateInfo safeCreateInfo(pCreateInfo); + auto* newCreateInfo = reinterpret_cast(&safeCreateInfo); // Patch updated application info - newCreateInfoSafe.pApplicationInfo->apiVersion = VK_MAKE_API_VERSION(0, newVersion.first, newVersion.second, 0); + safeCreateInfo.pApplicationInfo->apiVersion = VK_MAKE_API_VERSION(0, newVersion.first, newVersion.second, 0); // Enable extra extensions - for (const auto& newExt : Instance::extraExtensions) + for (const auto& newExt : Instance::requiredDriverExtensions) { if (newExt == VK_EXT_DEBUG_UTILS_EXTENSION_NAME) { - enableInstanceVkExtDebugUtils(newCreateInfoSafe, supportedExtensions); + enableInstanceVkExtDebugUtils(safeCreateInfo, supportedExtensions); } else { @@ -853,13 +1079,13 @@ VKAPI_ATTR VkResult VKAPI_CALL layer_vkCreateDevice(VkPhysicalDevic LAYER_LOG("Device API version %u.%u", apiVersion.first, apiVersion.second); // Create a modifiable structure we can patch - vku::safe_VkDeviceCreateInfo newCreateInfoSafe(pCreateInfo); - auto* newCreateInfo = reinterpret_cast(&newCreateInfoSafe); + vku::safe_VkDeviceCreateInfo safeCreateInfo(pCreateInfo); + auto* newCreateInfo = reinterpret_cast(&safeCreateInfo); // Apply all required patches to the VkDeviceCreateInfo for (const auto patch : Device::createInfoPatches) { - patch(*layer, physicalDevice, newCreateInfoSafe, supportedExtensions); + patch(*layer, physicalDevice, safeCreateInfo, supportedExtensions); } // Log extensions after patching for debug purposes diff --git a/source_common/framework/manual_functions.hpp b/source_common/framework/manual_functions.hpp index 18849d83..c066f301 100644 --- 
a/source_common/framework/manual_functions.hpp +++ b/source_common/framework/manual_functions.hpp @@ -23,6 +23,8 @@ * ---------------------------------------------------------------------------- */ +#pragma once + /** * @file * This module exposes common functionality used by layer entrypoints, @@ -31,7 +33,7 @@ */ #include - #include "device.hpp" +#include "device.hpp" #include "framework/device_dispatch_table.hpp" #include "framework/device_functions.hpp" #include "framework/instance_functions.hpp" @@ -229,3 +231,20 @@ void enableDeviceVkExtImageCompressionControl(Instance& instance, VkPhysicalDevice physicalDevice, vku::safe_VkDeviceCreateInfo& createInfo, std::vector& supported); + +/** + * Hide VK_EXT_frame_boundary if emulated on top of the driver. + * + * If the driver supports this already we don't need to do anything, but + * if the driver does not then we need to hide the support. + * + * @param instance The layer instance we are running within. + * @param physicalDevice The physical device we are creating a device for. + * @param createInfo The createInfo we can search to find user config. + * @param supported The list of supported extensions. + */ +void emulateDeviceVkExtFrameBoundary(Instance& instance, + VkPhysicalDevice physicalDevice, + vku::safe_VkDeviceCreateInfo& createInfo, + std::vector& supported); + diff --git a/source_common/trackers/render_pass.cpp b/source_common/trackers/render_pass.cpp index 4f921236..76203490 100644 --- a/source_common/trackers/render_pass.cpp +++ b/source_common/trackers/render_pass.cpp @@ -1,7 +1,7 @@ /* * SPDX-License-Identifier: MIT * ---------------------------------------------------------------------------- - * Copyright (c) 2022-2024 Arm Limited + * Copyright (c) 2022-2025 Arm Limited * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to