[LLVM,X64] Add support for SHA intrinsics#37
Conversation
| auto [dst_ref, dst_part] = this->val_ref_single(inst->getOperand(0)); | ||
| auto [src_ref, src_part] = this->val_ref_single(inst->getOperand(1)); | ||
| ASM(SHA1MSG1rr, | ||
| dst_part.load_to_reg(), | ||
| src_part.load_to_reg()); |
There was a problem hiding this comment.
This destroys a value that might still be used. Use into_temporary, see crc32 above. Likewise for other cases.
| case llvm::Intrinsic::x86_sha256rnds2: { | ||
| auto [dst_ref, dst_part] = this->val_ref_single(inst->getOperand(0)); | ||
| auto [src_ref, src_part] = this->val_ref_single(inst->getOperand(1)); | ||
| auto [round_constants_ref, round_constants_part] = this->val_ref_single(inst->getOperand(2)); |
There was a problem hiding this comment.
Use load_to_specific and do this before loading any other values. FixedRegBackup should never be used outside of generated code.
There was a problem hiding this comment.
Note: you will have to call into_temporary first as otherwise it will break for fixed assignments.
| ; X64-LABEL: <sha1msg1>: | ||
| ; X64: sha1msg1 xmm0, xmm1 | ||
| ; X64-NEXT: ret | ||
| %res = call <4 x i32> @llvm.x86.sha1msg1(<4 x i32> %a, <4 x i32> %b) |
There was a problem hiding this comment.
Tests lack variant where destination reg cannot be reused.
|
Maybe just add an |
rust's Alternatively, if you'd prefer to generate code for all* architecture-specific intrinsics by scraping LLVM's TableGen files, I could instead work on that. |
|
Is this something that comes up while building reasonably common code? In that case, it would be interesting to know how exactly this comes about (I'd expect crypto intrinsics to be only used in crypto libraries and these routines typically don't benefit from inlining, so shouldn't be compiled that often). Or does it just happen when building some fairly/the standard libraries? Standard libraries tend to contain a lot of rarely used constructs, which tend to require a lot of effort. The primary goal for TPDE-LLVM should be to compile typical modules (95%?) as generated by front-ends without or with light optimizations. So unless a non-tiny amount of modules contain these intrinsics, I'd rather focus on missing support for more important IR constructs. Support for all intrinsics is most likely a non-goal: it adds a lot of code and substantially increases maintenance costs while providing very little benefit in practice. |
I encountered it when (perhaps overly ambitiously) trying to build pingora. As you said, I don't expect the SHA intrinsics specifically would be used outside of web/crypto code and the standard library. But most Rust projects will use an intrinsic somewhere in the dependency tree, and this remains a problem for rustc_codegen_cranelift on real-world code (rust-lang/rustc_codegen_cranelift#171). It currently implements intrinsics manually, but is looking into automating the process (rust-lang/rustc_codegen_cranelift#1547). Would you prefer for rustc_codegen_tpde to handle the intrinsics itself?
Thanks for the clarification. |
Adds the LLVM x86 sha1* and sha256* intrinsics, but not vsha512* (not supported by fadec yet).
I'm not convinced I've done the fixed register in SHA256RNDS2 properly. Apologies if I've misunderstood something.