# Update PointCloud type to support client-side rendering #754
## Conversation
```proto
// The acquisition viewpoint of the data, specifying the sensor's pose. This is optional but highly recommended
// for sensor fusion and accurate visualization tasks.
optional common.v1.Pose viewpoint = 7;
```
Why not a frame of the sensor that captured it? That's what we currently do with point clouds in the visualizer: point clouds are returned from camera methods in local space and we child them to the camera frame. That allows us to show both local and world pose information.
I guess if this is just modifying the proto message, though, then we can still determine the frame using the above strategy. Would `viewpoint` here potentially mean some other kind of offset with respect to the sensor frame?
We can do either, but I am thinking more from an ergonomics standpoint. I imagine it would be easier to work with just the pose than with the entire frame object when using this API. I also feel that the Pose is less likely to change (at least in the near future) than the Frame.
This change introduces a `PointCloudHeader` to the `PointCloud` message, making the existing `bytes` payload self-describing. This change is additive, non-breaking, and significantly improves our ability to describe and render point clouds.

## Reasoning
The current `PointCloud` message efficiently transports data as a binary blob (`bytes point_cloud = 1;`). However, it lacks any descriptive metadata, forcing an implicit, out-of-band contract between the client and server to interpret the byte stream's structure (e.g., field order, data types, presence of color or intensity). This approach has three main drawbacks: the byte layout is not discoverable or verifiable at runtime; per-point attributes such as color and intensity cannot be described explicitly; and spatial context has to be carried separately through the `Transform` and `metadata`.
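For reference, the relevant part of the message today is just the opaque blob:

```proto
// Today: nothing in the message describes how to interpret these bytes.
message PointCloud {
  bytes point_cloud = 1;
}
```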
## Proposed Changes

To address this, I introduced the `PointCloudHeader` message. This header adheres to industry standards to explicitly define the structure of the data contained within the `point_cloud` byte array.

I updated the `PointCloud` message with a new field, `PointCloudHeader header = 2;`, to contain the metadata.

This change is non-breaking. Existing clients will continue to parse the `bytes` at field `1` as they always have and will ignore the new `header` field at tag `2`. New clients can check for the presence of the `header` to adopt more robust parsing logic, enabling them to handle varied and evolving point cloud structures. One possible shape for the header is sketched below.
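A minimal sketch of the idea; the field names, numbering, and the `PointCloudChannel` sub-message here are illustrative, not necessarily the exact schema in this PR:

```proto
// Illustrative only: one possible layout for the new header.
message PointCloudChannel {
  string name = 1;       // e.g. "position", "color", "intensity"
  uint32 item_size = 2;  // elements per point for this channel (3 for xyz)
  uint32 offset = 3;     // element offset of the channel within one stride
}

message PointCloudHeader {
  uint32 stride = 1;                        // total elements per point
  uint32 start = 2;                         // first point index, for partial updates
  repeated PointCloudChannel channels = 3;  // explicit per-attribute layout
}

message PointCloud {
  bytes point_cloud = 1;        // existing payload, unchanged
  PointCloudHeader header = 2;  // new self-describing metadata
}
```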
## Rendering Benefits

This enhancement directly benefits rendering performance by allowing the server to provide a GPU-optimized, interleaved memory layout. In this Array-of-Structures format, all attributes for a single point (e.g., position, color, intensity) are packed contiguously in the binary blob. A rendering client using a library like Three.js can leverage this for maximum efficiency. The entire `bytes` payload from the message is loaded into a single `TypedArray` and used to create a `THREE.InterleavedBuffer`.

The new `PointCloudHeader` provides the crucial `stride` value (the total number of elements for one point) needed for this step. From this single buffer, visualizers can create multiple `THREE.InterleavedBufferAttribute`s to define `position`, `color`, and other properties by specifying their individual `itemSize` and `offset` within the stride, all of which are now explicitly defined in the header. This "zero-copy" process is speedy because it avoids any client-side iteration or data restructuring. The data flows directly from the network into a memory layout that is highly cache-friendly for the GPU, ensuring that when a vertex is processed, all of its associated attributes are fetched in a single memory read, maximizing rendering throughput for large-scale point clouds.
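A sketch of that flow in Three.js, assuming a float32 payload and the illustrative header fields from the proto sketch above (`stride` and the per-channel `name`/`itemSize`/`offset` are assumptions, not the final schema):

```ts
import * as THREE from 'three';

// Hypothetical decoded header; mirrors the illustrative proto sketch above.
interface Channel { name: string; itemSize: number; offset: number }
interface Header { stride: number; channels: Channel[] }

// Build renderable points straight from the message payload: one TypedArray
// view over the bytes, one InterleavedBuffer, one attribute per channel.
function buildPointCloud(payload: ArrayBuffer, header: Header): THREE.Points {
  const data = new Float32Array(payload); // zero-copy view over the bytes
  const buffer = new THREE.InterleavedBuffer(data, header.stride);

  const geometry = new THREE.BufferGeometry();
  for (const ch of header.channels) {
    // Each attribute is just an (itemSize, offset) window into the one buffer.
    geometry.setAttribute(
      ch.name,
      new THREE.InterleavedBufferAttribute(buffer, ch.itemSize, ch.offset),
    );
  }

  const material = new THREE.PointsMaterial({ size: 0.01, vertexColors: true });
  return new THREE.Points(geometry, material);
}
```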
Beyond the initial transfer of a complete point cloud, this self-describing format is exceptionally well suited to streaming targeted updates to a client. For instance, if only a small region of the point cloud changes (e.g., an object moves or a sensor updates a specific area), the server can send a new `PointCloud` message containing only the data for the modified points by including the `start` field in the header. Upon receiving this partial update, the client can perform a highly efficient, targeted modification of the data on the GPU.

Instead of rebuilding the entire geometry, the client can use the new binary data to overwrite a specific portion of its existing `THREE.BufferAttribute` or `THREE.InterleavedBuffer`. The client can then tell the renderer to re-upload only the small, changed segment of the buffer to the GPU, avoiding the significant performance cost of transferring and processing the entire multi-million-point dataset for a minor change.
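A sketch of that partial-update path under the same assumptions; note that `addUpdateRange` is the Three.js r159+ API (older releases exposed a single `updateRange` property instead):

```ts
import * as THREE from 'three';

// Overwrite one region of an existing interleaved buffer and re-upload
// only that segment to the GPU. `start` and `stride` come from the
// hypothetical header; `payload` holds only the modified points.
function applyPartialUpdate(
  buffer: THREE.InterleavedBuffer,
  payload: ArrayBuffer,
  header: { start: number; stride: number },
): void {
  const update = new Float32Array(payload);
  const offset = header.start * header.stride;  // element offset into the buffer

  buffer.set(update, offset);                   // in-place overwrite
  buffer.addUpdateRange(offset, update.length); // mark only this segment dirty
  buffer.needsUpdate = true;                    // trigger the partial re-upload
}
```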
## Quick Example

I have already put together a working POC using the world state store service fake as an example. You can see the code here:
Screen.Recording.2025-10-06.at.9.28.46.AM.mov