Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
27 commits
Select commit Hold shift + click to select a range
f43844d
first attempt
bob80905 Oct 16, 2025
38fde3c
Merge branch 'main' into add_unmapped_resources_tests
bob80905 Oct 16, 2025
d7908c7
Merge branch 'main' into add_unmapped_resources_tests
bob80905 Oct 17, 2025
a3b4ce1
first attempt
bob80905 Oct 17, 2025
2d371ed
self review
bob80905 Oct 17, 2025
284d8fe
add check access fully mapped
bob80905 Oct 17, 2025
9c6664b
make all 3 new functions more consistent between each other
bob80905 Oct 17, 2025
eda2715
rename test file name
bob80905 Oct 17, 2025
b58d924
address variable casing errors
bob80905 Oct 18, 2025
59a0b7c
address Justin and Damyan
bob80905 Oct 21, 2025
d3cbf91
fix build errors
bob80905 Oct 21, 2025
8d9cbbb
attempt to unify SRV
bob80905 Oct 21, 2025
fb4a631
fix texturelayout flag and another flag
bob80905 Oct 22, 2025
1c9752f
add metal XFAIL
bob80905 Oct 22, 2025
3e78f39
improve, but keep UAV the same
bob80905 Oct 24, 2025
fb86727
remove unwanted changes
bob80905 Oct 24, 2025
b9b202b
address confusing comment Justin pointed out
bob80905 Oct 24, 2025
361225b
clang-format
bob80905 Oct 24, 2025
eb89f86
address damyan, xfail intel
bob80905 Oct 24, 2025
c59e9e2
remove reserved CBV function, and capitalize var names
bob80905 Oct 24, 2025
8579b37
Merge branch 'main' into add_unmapped_resources_tests
bob80905 Oct 24, 2025
6a7878d
add some missing constts
bob80905 Oct 24, 2025
26f1045
Merge branch 'add_unmapped_resources_tests' of https://github.com/bob…
bob80905 Oct 24, 2025
06bf89c
self review
bob80905 Oct 25, 2025
e7f9a91
address Sarah
bob80905 Nov 6, 2025
2b2b9aa
add overflow assert
bob80905 Nov 7, 2025
7b8fa4c
fix assert condition, and update resource elty
bob80905 Nov 10, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions include/Support/Pipeline.h
Original file line number Diff line number Diff line change
Expand Up @@ -143,6 +143,7 @@ struct Resource {
std::optional<VulkanBinding> VKBinding;
Buffer *BufferPtr = nullptr;
bool HasCounter;
std::optional<uint32_t> TilesMapped;

bool isRaw() const {
switch (Kind) {
Expand Down
112 changes: 88 additions & 24 deletions lib/API/DX/Device.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -149,9 +149,14 @@ static D3D12_RESOURCE_DESC getResourceDescription(const Resource &R) {
const uint32_t Width =
R.isTexture() ? B.OutputProps.Width : getUAVBufferSize(R);
const uint32_t Height = R.isTexture() ? B.OutputProps.Height : 1;
const D3D12_TEXTURE_LAYOUT Layout = R.isTexture()
? D3D12_TEXTURE_LAYOUT_UNKNOWN
: D3D12_TEXTURE_LAYOUT_ROW_MAJOR;
D3D12_TEXTURE_LAYOUT Layout;
if (R.isTexture())
Layout = getDXKind(R.Kind) == SRV
? D3D12_TEXTURE_LAYOUT_64KB_UNDEFINED_SWIZZLE
: D3D12_TEXTURE_LAYOUT_UNKNOWN;
else
Layout = D3D12_TEXTURE_LAYOUT_ROW_MAJOR;

const D3D12_RESOURCE_FLAGS Flags =
R.isReadWrite() ? D3D12_RESOURCE_FLAG_ALLOW_UNORDERED_ACCESS
: D3D12_RESOURCE_FLAG_NONE;
Expand Down Expand Up @@ -244,9 +249,11 @@ class DXDevice : public offloadtest::Device {
ComPtr<ID3D12Resource> Upload;
ComPtr<ID3D12Resource> Buffer;
ComPtr<ID3D12Resource> Readback;
ComPtr<ID3D12Heap> Heap;
ResourceSet(ComPtr<ID3D12Resource> Upload, ComPtr<ID3D12Resource> Buffer,
ComPtr<ID3D12Resource> Readback)
: Upload(Upload), Buffer(Buffer), Readback(Readback) {}
ComPtr<ID3D12Resource> Readback,
ComPtr<ID3D12Heap> Heap = nullptr)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this defaulted to null but the others are not?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both CBV and UAV don't need to pass the Heap to the Resource bundle (yet) because only SRVs are using reserved resources by default, and that's the only case where the lifetime of the Heap need to be extended. The other cases will need non-null values, especially for functions like Bind* or CreateComputeCommands.

: Upload(Upload), Buffer(Buffer), Readback(Readback), Heap(Heap) {}
};

// ResourceBundle will contain one ResourceSet for a singular resource
Expand Down Expand Up @@ -521,51 +528,107 @@ class DXDevice : public offloadtest::Device {
addUploadEndBarrier(IS, Destination, R.isReadWrite());
}

UINT getNumTiles(std::optional<uint32_t> NumTiles, uint32_t Width) {
UINT Ret;
if (NumTiles.has_value())
Ret = static_cast<UINT>(*NumTiles);
else {
// Map the entire buffer by computing how many 64KB tiles cover it
Ret = static_cast<UINT>(
(Width + D3D12_TILED_RESOURCE_TILE_SIZE_IN_BYTES - 1) /
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could this operation overflow? Why is width a uint64?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently the arg that feeds into the width parameter is an int datatype. So it doesn't need to be a uint64, at least not yet. I'll change it.

Copy link
Collaborator

@spall spall Nov 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could width or d3d12_tiled_resource_tile_size_in_bytes be uint32 max or half that amount; could this operation overflow after you change width to be a uint32?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe D3D12_TILED_RESOURCE_TILE_SIZE_IN_BYTES can be any larger than 65536. However, width could be any arbitrary size, depending on how the developer specified it. So yes, I think this operation could overflow, though in exceedingly specific and pathological circumstances.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW: in D3D12, Width is a uint64 because it needs to be able to support describing buffers that are larger than 4gb.

Copy link
Contributor Author

@bob80905 bob80905 Nov 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, though for this API:
UpdateTileMappings
The pRangeTileCounts param is a UINT. Its value in my code comes directly from counting the number of tiles. So I don't think this should ever be > uint32_max.

D3D12_TILED_RESOURCE_TILE_SIZE_IN_BYTES);
// check for overflow
assert(Width < std::numeric_limits<UINT>::max() -
D3D12_TILED_RESOURCE_TILE_SIZE_IN_BYTES - 1);
}
return Ret;
}

llvm::Expected<ResourceBundle> createSRV(Resource &R, InvocationState &IS) {
ResourceBundle Bundle;

const D3D12_HEAP_PROPERTIES HeapProp =
CD3DX12_HEAP_PROPERTIES(D3D12_HEAP_TYPE_DEFAULT);
const D3D12_RESOURCE_DESC ResDesc = getResourceDescription(R);
const D3D12_HEAP_PROPERTIES UploadHeapProp =
CD3DX12_HEAP_PROPERTIES(D3D12_HEAP_TYPE_UPLOAD);
const D3D12_RESOURCE_DESC UploadResDesc =
CD3DX12_RESOURCE_DESC::Buffer(R.size());
const D3D12_HEAP_PROPERTIES UploadHeapProps =
CD3DX12_HEAP_PROPERTIES(D3D12_HEAP_TYPE_UPLOAD);

uint32_t RegOffset = 0;

for (const auto &ResData : R.BufferPtr->Data) {
llvm::outs() << "Creating SRV: { Size = " << R.size() << ", Register = t"
<< R.DXBinding.Register + RegOffset
<< ", Space = " << R.DXBinding.Space << " }\n";
<< ", Space = " << R.DXBinding.Space;

if (R.TilesMapped)
llvm::outs() << ", TilesMapped = " << *R.TilesMapped;
llvm::outs() << " }\n";

ComPtr<ID3D12Resource> Buffer;
if (auto Err = HR::toError(
Device->CreateCommittedResource(
&HeapProp, D3D12_HEAP_FLAG_NONE, &ResDesc,
D3D12_RESOURCE_STATE_COMMON, nullptr, IID_PPV_ARGS(&Buffer)),
"Failed to create committed resource (buffer)."))
if (auto Err =
HR::toError(Device->CreateReservedResource(
&ResDesc, D3D12_RESOURCE_STATE_COMMON, nullptr,
IID_PPV_ARGS(&Buffer)),
"Failed to create reserved resource (buffer)."))
return Err;

// Committed upload buffer
ComPtr<ID3D12Resource> UploadBuffer;
if (auto Err = HR::toError(
Device->CreateCommittedResource(
&UploadHeapProp, D3D12_HEAP_FLAG_NONE, &UploadResDesc,
&UploadHeapProps, D3D12_HEAP_FLAG_NONE, &UploadResDesc,
D3D12_RESOURCE_STATE_GENERIC_READ, nullptr,
IID_PPV_ARGS(&UploadBuffer)),
"Failed to create committed resource (upload buffer)."))
return Err;

// Initialize the SRV data
// Tile mapping setup (only skipped when TilesMapped is set to 0)
const UINT NumTiles = getNumTiles(R.TilesMapped, ResDesc.Width);
ComPtr<ID3D12Heap> Heap; // optional, only created if NumTiles > 0

if (NumTiles > 0) {
// Create a Heap large enough for the mapped tiles
D3D12_HEAP_DESC HeapDesc = {};
HeapDesc.Properties = CD3DX12_HEAP_PROPERTIES(D3D12_HEAP_TYPE_DEFAULT);
HeapDesc.Alignment = D3D12_DEFAULT_RESOURCE_PLACEMENT_ALIGNMENT;
HeapDesc.SizeInBytes = static_cast<UINT64>(NumTiles) *
D3D12_DEFAULT_RESOURCE_PLACEMENT_ALIGNMENT;
HeapDesc.Flags = D3D12_HEAP_FLAG_ALLOW_ALL_BUFFERS_AND_TEXTURES;

if (auto Err =
HR::toError(Device->CreateHeap(&HeapDesc, IID_PPV_ARGS(&Heap)),
"Failed to create heap for tiled SRV resource."))
return Err;

// Define one contiguous mapping region
const D3D12_TILED_RESOURCE_COORDINATE StartCoord = {0, 0, 0, 0};
D3D12_TILE_REGION_SIZE RegionSize = {};
RegionSize.NumTiles = NumTiles;
RegionSize.UseBox = FALSE;

const D3D12_TILE_RANGE_FLAGS RangeFlag = D3D12_TILE_RANGE_FLAG_NONE;
const UINT HeapRangeStartOffset = 0;
const UINT RangeTileCount = NumTiles;

ID3D12CommandQueue *CommandQueue = IS.Queue.Get();
CommandQueue->UpdateTileMappings(
Buffer.Get(), 1, &StartCoord, &RegionSize, // One region
Heap.Get(), 1, &RangeFlag, &HeapRangeStartOffset, &RangeTileCount,
D3D12_TILE_MAPPING_FLAG_NONE);
}

// Upload data initialization
void *ResDataPtr = nullptr;
if (auto Err = HR::toError(UploadBuffer->Map(0, nullptr, &ResDataPtr),
"Failed to acquire UAV data pointer."))
return Err;
memcpy(ResDataPtr, ResData.get(), R.size());
UploadBuffer->Unmap(0, nullptr);
if (SUCCEEDED(UploadBuffer->Map(0, NULL, &ResDataPtr))) {
memcpy(ResDataPtr, ResData.get(), R.size());
UploadBuffer->Unmap(0, nullptr);
} else {
return llvm::createStringError(std::errc::io_error,
"Failed to map SRV upload buffer.");
}

addResourceUploadCommands(R, IS, Buffer, UploadBuffer);

Bundle.emplace_back(UploadBuffer, Buffer, nullptr);
Bundle.emplace_back(UploadBuffer, Buffer, nullptr, Heap);
RegOffset++;
}
return Bundle;
Expand Down Expand Up @@ -684,6 +747,7 @@ class DXDevice : public offloadtest::Device {
llvm::outs() << "UAV: HeapIdx = " << HeapIdx << " EltSize = " << EltSize
<< " NumElts = " << NumElts
<< " HasCounter = " << R.HasCounter << "\n";

D3D12_CPU_DESCRIPTOR_HANDLE UAVHandle = UAVHandleHeapStart;
UAVHandle.ptr += HeapIdx * DescHandleIncSize;
ID3D12Resource *CounterBuffer = R.HasCounter ? RS.Buffer.Get() : nullptr;
Expand Down
8 changes: 4 additions & 4 deletions lib/API/VK/Device.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -776,8 +776,8 @@ class VKDevice : public offloadtest::Device {
std::errc::invalid_argument,
"No RenderTarget buffer specified for graphics pipeline.");
Resource FrameBuffer = {
ResourceKind::Texture2D, "RenderTarget", {}, {},
P.Bindings.RTargetBufferPtr, false};
ResourceKind::Texture2D, "RenderTarget", {}, {},
P.Bindings.RTargetBufferPtr, false, std::nullopt};
IS.FrameBufferResource.Size = P.Bindings.RTargetBufferPtr->size();
IS.FrameBufferResource.BufferPtr = P.Bindings.RTargetBufferPtr;
IS.FrameBufferResource.ImageLayout =
Expand All @@ -804,8 +804,8 @@ class VKDevice : public offloadtest::Device {
std::errc::invalid_argument,
"No Vertex buffer specified for graphics pipeline.");
const Resource VertexBuffer = {
ResourceKind::StructuredBuffer, "VertexBuffer", {}, {},
P.Bindings.VertexBufferPtr, false};
ResourceKind::StructuredBuffer, "VertexBuffer", {}, {},
P.Bindings.VertexBufferPtr, false, std::nullopt};
auto ExVHostBuf =
createBuffer(IS, VK_BUFFER_USAGE_TRANSFER_SRC_BIT,
VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT, VertexBuffer.size(),
Expand Down
1 change: 1 addition & 0 deletions lib/Support/Pipeline.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -291,6 +291,7 @@ void MappingTraits<offloadtest::Resource>::mapping(IO &I,
I.mapRequired("Name", R.Name);
I.mapRequired("Kind", R.Kind);
I.mapOptional("HasCounter", R.HasCounter, 0);
I.mapOptional("TilesMapped", R.TilesMapped);
I.mapRequired("DirectXBinding", R.DXBinding);
I.mapOptional("VulkanBinding", R.VKBinding);
}
Expand Down
152 changes: 152 additions & 0 deletions test/Feature/HLSLLib/PartiallyMappedResources.test
Original file line number Diff line number Diff line change
@@ -0,0 +1,152 @@
#--- source.hlsl

StructuredBuffer<int4> X : register(t0);
StructuredBuffer<int4> Y : register(t1);

RWStructuredBuffer<int> Out : register(u2);
RWStructuredBuffer<int> CAFM : register(u3);

[numthreads(1,1,1)]
void main() {
// 4096 int4's inside X or Y occupy 64KB of data.
// (4096 int4's * 4 ints * 4 bytes per int)
// So, any index into the buffer >= [4096] will access a new "tile"

int idx = 0;

uint status;
int4 Out0 = X.Load(0, status);
int CAFMResult = CheckAccessFullyMapped(status) ? 1 : 0;
if (CAFMResult){
CAFM[idx] = CAFMResult;
Out[idx] = Out0.x;
Out[idx + 4] = status;
}
idx += 1;

int4 Out1 = X.Load(4100, status);
CAFMResult = CheckAccessFullyMapped(status) ? 1 : 0;
if (CAFMResult){
CAFM[idx] = CAFMResult;
Out[idx] = Out1.x;
Out[idx + 4] = status;
}
idx += 1;

int4 Out2 = Y.Load(0, status);
CAFMResult = CheckAccessFullyMapped(status) ? 1 : 0;
if (CAFMResult){
CAFM[idx] = CAFMResult;
Out[idx] = Out2.x;
Out[idx + 4] = status;
}
idx += 1;

int4 Out3 = Y.Load(4100, status);
CAFMResult = CheckAccessFullyMapped(status) ? 1 : 0;
if (CAFMResult){
CAFM[idx] = CAFMResult;
Out[idx] = Out3.x;
Out[idx + 4] = status;
}
}

//--- pipeline.yaml

---
Shaders:
- Stage: Compute
Entry: main
DispatchSize: [1, 1, 1]
Buffers:
- Name: X
Format: Int32
Stride: 16
FillSize: 131072
FillValue: 9001
- Name: Y
Format: Int32
Stride: 16
FillSize: 131072
FillValue: 9002
- Name: Out
Format: Int32
Stride: 4
FillSize: 32
- Name: ExpectedOut
Format: Int32
Stride: 4
# first 4 values are the actual data retrieved. For non-resident loads, 0 is expected.
# last 4 values are the status. 1 is expected for resident memory, 0 for non-resident
Data: [9001, 0, 0, 0, 1, 0, 0, 0]
- Name: CAFM
Format: Int32
Stride: 4
FillSize: 16
FillValue: 0
- Name: ExpectedCAFM
Format: Int32
Stride: 4
# Only the first data access should be accessing fully mapped memory
Data: [1, 0, 0, 0]

Results:
- Result: Test
Rule: BufferExact
Actual: Out
Expected: ExpectedOut
- Result: TestCAFM
Rule: BufferExact
Actual: CAFM
Expected: ExpectedCAFM
DescriptorSets:
- Resources:
- Name: X
Kind: StructuredBuffer
DirectXBinding:
Register: 0
Space: 0
VulkanBinding:
Binding: 0
TilesMapped: 1
- Name: Y
Kind: StructuredBuffer
DirectXBinding:
Register: 1
Space: 0
VulkanBinding:
Binding: 1
TilesMapped: 0
- Name: Out
Kind: RWStructuredBuffer
DirectXBinding:
Register: 2
Space: 0
VulkanBinding:
Binding: 2
- Name: CAFM
Kind: RWStructuredBuffer
DirectXBinding:
Register: 3
Space: 0
VulkanBinding:
Binding: 3
#--- end

# Unimplemented https://github.com/llvm/llvm-project/issues/138910
# AND https://github.com/llvm/llvm-project/issues/99204

# Unimplemented https://github.com/llvm/llvm-project/issues/138910
# AND https://github.com/llvm/llvm-project/issues/99204
# XFAIL: Vulkan

# Bug https://github.com/llvm/offload-test-suite/issues/182
# Metal API seems to have problems with reserved resources
# XFAIL: Metal

# Bug https://github.com/llvm/offload-test-suite/issues/485
# XFAIL: Intel

# RUN: split-file %s %t
# RUN: %dxc_target -T cs_6_5 -Fo %t.o %t/source.hlsl
# RUN: %offloader %t/pipeline.yaml %t.o
Loading