@@ -64,30 +64,32 @@ more control over the entire pipeline by letting them directly process batches o
 vertices and primitives in compute-like shaders. This allows for more efficient culling,
 level-of-detail calculations, and custom pipeline logic—all on the GPU.

-An example of a Rust GPU mesh shader that outputs points:
+An example of a Rust GPU mesh shader that outputs a triangle:

 ```
 use spirv_std::arch::set_mesh_outputs_ext;
-use spirv_std::glam::{UVec2, Vec4};
+use spirv_std::glam::{UVec3, Vec4};
 use spirv_std::spirv;

 #[spirv(mesh_ext(
     threads(1),
-    output_vertices = 1,
+    output_vertices = 3,
     output_primitives_ext = 1,
-    output_points
+    output_triangles_ext
 ))]
 pub fn main(
-    #[spirv(position)] positions: &mut [Vec4; 1],
-    #[spirv(primitive_point_indices_ext)] indices: &mut [u32; 1],
+    #[spirv(position)] positions: &mut [Vec4; 3],
+    #[spirv(primitive_triangle_indices_ext)] indices: &mut [UVec3; 1],
 ) {
     unsafe {
-        set_mesh_outputs_ext(1, 1);
+        set_mesh_outputs_ext(3, 1);
     }

     positions[0] = Vec4::new(-0.5, 0.5, 0.0, 1.0);
+    positions[1] = Vec4::new(0.5, 0.5, 0.0, 1.0);
+    positions[2] = Vec4::new(0.0, -0.5, 0.0, 1.0);

-    indices[0] = 0;
+    indices[0] = UVec3::new(0, 1, 2);
 }
 ```

@@ -110,39 +112,21 @@ pub fn main() {
 [@Firestar99](https://github.com/firestar99) also added support for subgroups via
 [subgroup intrinsics](https://github.com/Rust-GPU/rust-gpu/pull/14).

-[Subgroups](https://www.khronos.org/blog/vulkan-subgroup-tutorial) are small groups of
-threads within a workgroup that can share data and perform synchronized operations more
+[Subgroups](https://www.khronos.org/blog/vulkan-subgroup-tutorial) allow a group of
+threads of vendor-defined size to share data and perform synchronized operations more
 efficiently. For example, using subgroup intrinsics you can:

 - Perform reductions (e.g., sum, min, max) across threads in a subgroup.
 - Share intermediate results without relying on global memory, reducing latency.
 - Implement algorithms like prefix sums or parallel sorting more effectively.

-Here is a simple Rust GPU example to demonstrate subgroup reduction:
-
-```rust
-use glam::UVec3;
-use spirv_std::spirv;
-
-unsafe fn subgroup_i_add_reduce(value: u32) -> u32 {
-    spirv_std::arch::subgroup_i_add(value)
-}
-
-#[spirv(compute(threads(32, 1, 1)))]
-pub fn main(#[spirv(local_invocation_id)] local_invocation_id: UVec3) {
-    unsafe {
-        subgroup_i_add_reduce(local_invocation_id.x);
-    }
-}
-```
-
 ## Added `TypedBuffer`

 [@eddyb](https://github.com/eddyb) and [@Firestar99](https://github.com/firestar99)
 [introduced `TypedBuffer`](https://github.com/Rust-GPU/rust-gpu/pull/16), an explicit
 way to declare inputs and outputs as buffers. This enables declaring an "array of buffer
-descriptors containing something" as is common in [bindless
-textures](https://computergraphics.stackexchange.com/questions/10794/binding-vs-bindless).
+descriptors containing something" as is common in
+[bindless](https://computergraphics.stackexchange.com/questions/10794/binding-vs-bindless).

 Here is an example of using
 [`TypedBuffer`](https://rust-gpu.github.io/rust-gpu/api/spirv_std/struct.TypedBuffer.html)