-
Notifications
You must be signed in to change notification settings - Fork 158
trycopy: reimplement sparse_mmap's trycopy with inline asm #2294
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Use inline asm instead of global asm or a C implementation, which allows the compiler to inline the code. Use a custom exception-table-based approach for recovery instead of setjmp/longjmp on Unix (slow) or SEH exceptions on Windows (prevents inlining). Besides improving performance across the board, this allows us to use a single per-arch implementation of the primitives across all OSes. This should make it easier to add new primitives over time, if desired.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR extracts fault-tolerant memory access functionality from the sparse_mmap crate into a new dedicated trycopy crate. The extraction enables reuse of safe memory copy/read/write primitives across trust boundaries in virtualization contexts.
Key Changes
- Created new
trycopycrate with architecture-specific (x86_64/aarch64) inline assembly implementations for fault-recoverable memory operations - Replaced C/global_asm implementation in
sparse_mmapwith pure Rust inline assembly using recovery descriptors - Updated
sparse_mmap,guestmem, anduser_driverto depend on and use the newtrycopycrate
Reviewed Changes
Copilot reviewed 21 out of 22 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| support/trycopy/src/lib.rs | New crate entry point with public API and signal/exception handler installation |
| support/trycopy/src/x86_64.rs | x86_64-specific inline assembly implementations with recovery descriptors |
| support/trycopy/src/aarch64.rs | ARM64-specific inline assembly implementations with recovery descriptors |
| support/trycopy/src/memcpy.rs | Architecture-independent memory copy/set chunking logic |
| support/trycopy/Cargo.toml | New crate manifest with dependencies |
| support/trycopy/benches/perf.rs | Performance benchmarks for the new crate |
| support/sparse_mmap/src/lib.rs | Updated to use trycopy crate instead of internal implementation |
| support/sparse_mmap/src/unix.rs | Changed to call trycopy::initialize_try_copy() |
| support/sparse_mmap/src/windows.rs | Changed to call trycopy::initialize_try_copy() and removed unused constants |
| support/sparse_mmap/Cargo.toml | Removed cc build dependency and added trycopy dependency |
| support/sparse_mmap/build.rs | Removed (no longer needed without C compilation) |
| support/sparse_mmap/src/trycopy.c | Removed (replaced by Rust implementation) |
| support/sparse_mmap/src/trycopy_windows_x64.rs | Removed (replaced by new crate) |
| support/sparse_mmap/src/trycopy_windows_arm64.rs | Removed (replaced by new crate) |
| support/sparse_mmap/benches/perf.rs | Removed (moved to trycopy crate) |
| vm/vmcore/guestmem/src/lib.rs | Updated to use trycopy:: functions instead of sparse_mmap:: |
| vm/vmcore/guestmem/Cargo.toml | Added trycopy dependency |
| vm/devices/user_driver/src/vfio.rs | Updated to use trycopy:: functions |
| vm/devices/user_driver/Cargo.toml | Added trycopy dependency and removed optional sparse_mmap |
| flowey/flowey_lib_hvlite/src/_jobs/check_clippy.rs | Removed macOS-specific SPARSE_MMAP_NO_BUILD workaround |
| Cargo.toml | Added trycopy workspace member, removed cc dependency |
| Cargo.lock | Updated with new trycopy crate and dependency changes |
| let (offset, is_write) = if failure.address.is_null() { | ||
| // In the case of a general protection fault (#GP) the provided address is zero. | ||
| (0, src.is_none()) | ||
| } else if (dest..dest.wrapping_add(len)).contains(&failure.address) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this wraps does contains still work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But also, do we expect it to ever wrap?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we expect this to wrap, we just want to avoid having to use an unsafe fn here to do the math.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just use normal addition? Overflow checks won't be present in release builds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dest is a pointer. Normal addition is not defined.
| use recovery_section; | ||
|
|
||
| #[cfg(test)] | ||
| mod tests { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should file an issue for adding a fuzzer for this crate. We have one for sparse_mmap, but presumably one targeting this crate directly could exercise some paths sparse_mmap might not.
|
This PR modifies files containing For more on why we check whole files, instead of just diffs, check out the Rustonomicon |
| /// something like `.trycopy$a` and `.trycopy$z`, with the elements in between | ||
| /// in `.trycopy$b`. | ||
| /// | ||
| /// However, Rust/LLVM inline asm (but not global asm) seems to drop the '$', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Odd, any idea why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couldn't figure it out in the time I spent on it. I'll look into it more in the future.
|
Whoa, big binary size wins. Were those expected? |
Use inline asm instead of global asm or a C implementation, which allows the compiler to inline the code. Use a
custom exception-table-based approach for recovery instead of setjmp/longjmp on Unix (slow) or SEH exceptions on Windows (prevents inlining).
Besides improving performance across the board, this allows us to use a single per-arch implementation of the primitives across all OSes. This should make it easier to add new primitives over time, if desired.