HOL-Light: Add HOL-Light proof framework and x86 AVX2 NTT proof #640

jakemas · 2025-11-07T19:30:47Z

Resolves Hol-Light: Prove AVX2 NTT #338
Resolves HOL-Light: Autogenerate embedded byte-code #646
Resolves HOL-Light: Run proofs in CI #645
Resolves HOL-Light: Autogenerate assembly files #647

This PR includes:

An update to s2n-bignum version to support proofs
Addition of list_proofs.sh, build-proof.sh, dump_bytecode.ml utilities
Addition of mldsa_specs.ml and mldsa_utils.ml to support HOL-Light proofs
Addition of first HOL-Light proof for ML-DSA x86 AVX2 NTT in mldsa_ntt.ml
Addition of hol_light to scripts/tests to wrap and run proofs
READMEs for usage descriptions
Autogen updated to regenerate the embedded bytecode via dump_bytecode.ml when passed --update-hol-light-bytecode (add --dry-run to check without updating in place)
Autogen of HOL-Light assembly files via gen_hol_light_asm_file and gen_hol_light_asm
3 new CI jobs for HOL-Light proofs

See the added proofs/hol_light/x86_64/README.md for details.

Running make -C proofs/hol_light/x86_64 produces an mldsa_ntt.correct file that ends in the following:

Running time: 18147.000000 sec, Start unixtime: 1762547981.000000, End unixtime: 1762566128.000000

(~5 hours on my test EC2 m5.2xlarge; successful in 249m in CI).

Discussion

There is an issue where the x86 bytecode generated by clang on an ARM Mac differs from that generated by gcc on Linux. clang shortens some instructions by switching the source operands of VPADDD etc. to get a shorter encoding. As a result, the define_assert_from_elf check fails when compiling with clang on an ARM Mac, because the proof is written against the machine code as compiled with gcc on Linux (I do have proofs for both). We will have the same issue in mlkem-native when the x86 proofs arrive. So I am looking for opinions on: (1) being okay with the x86 proofs not working on clang ARM Mac environments (2) fix the selective swapping of source operands to VPADDD to "optimise" the instructions down to the shorter size on gcc Linux, so both compile consistently (3) adding both proof types. Likely all but (1) is out of scope of this PR -- just keeping notes.

mkannwischer

Thanks @jakemas - this is awesome progress!

I tried to run the proofs on a c8g.4xlarge in the nix develop .#hol_light shell, but
it does not work as x86_64-linux-gnu-objdump and x86_64-linux-gnu-as are not installed.
We can probably fix cross compilation in a follow-up PR -- running it on x86 now.

ubuntu@ip-172-31-34-86:~/mldsa-native$ tests hol_light -p mldsa_ntt
INFO  > HOL_LIGHT (1/1)    None        (native):         Starting HOL-Light proof for mldsa_ntt
ERROR > HOL_LIGHT (1/1)    None        (native):            FAILED (after 0s)
ERROR > HOL_LIGHT (1/1)    None        (native):         sh: line 1: x86_64-linux-gnu-as: command not found
sh: line 1: x86_64-linux-gnu-objdump: command not found
sh: line 1: x86_64-linux-gnu-as: command not found
make: *** [Makefile:70: mldsa/mldsa_ntt.o] Error 127

1 tests FAILED
* HOL-Light proof for mldsa_ntt

ubuntu@ip-172-31-34-86:~/mldsa-native$ make -C proofs/hol_light/x86_64 
make: Entering directory '/home/ubuntu/mldsa-native/proofs/hol_light/x86_64'
Preparing mldsa/mldsa_ntt.o ...
sh: line 1: x86_64-linux-gnu-as: command not found
AS: 
sh: line 1: x86_64-linux-gnu-objdump: command not found
OBJDUMP: 
[ -d mldsa ] || mkdir -p mldsa
cat mldsa/mldsa_ntt.S | gcc -E -xassembler-with-cpp - | tr ';' '\n' | x86_64-linux-gnu-as -o mldsa/mldsa_ntt.o -
sh: line 1: x86_64-linux-gnu-as: command not found
make: *** [Makefile:70: mldsa/mldsa_ntt.o] Error 127
make: Leaving directory '/home/ubuntu/mldsa-native/proofs/hol_light/x86_64'
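Both failures above come down to the cross binutils not being on PATH. A preflight check along the following lines could report that before make is invoked. This is only a sketch: the tool names are copied from the log, while the helper name missing_tools and its parameters are hypothetical and not part of the repository.

```python
import shutil

# Tool names as they appear in the error log above.
REQUIRED_TOOLS = ("x86_64-linux-gnu-as", "x86_64-linux-gnu-objdump")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools from `tools` that cannot be found on PATH.

    `which` is injectable so the check can be exercised without
    depending on what is installed on the current machine.
    """
    return [tool for tool in tools if which(tool) is None]
```

A wrapper could call missing_tools() first and print the missing names instead of surfacing the bare "command not found" from the shell.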

This aligns the lint script with mlkem-native:
- Fixing a typo that was discovered in #640
- Uncommenting the shell linting

I have no memories why the latter was commented out.

Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu>

mkannwischer

Thanks @jakemas. I took the liberty to also add it to the root README.

I verified that the proof runs okay.
I tested on an m4.4xlarge machine, and the proofs completed after 20393s.

I tried to understand the main theorem that is being proven - the bounds look good and are indeed much tighter than what we require.

It currently only works on x86, but we can address that in a follow-up.
Here is a list of follow-up issues that we should tackle:

@hanno-becker, WDYT?

hanno-becker

Thanks a lot @jakemas @jargh for the work! 🎉 It's great to see the first correctness proof for x86_64, let alone such a complicated one.

A few things:

1. The documentation still needs adjusting after copy-over from mlkem-native. Most importantly, we cannot claim to have verified all x86_64 assembly, which is currently stated at the top of proofs/hol_light/x86_64/README.md.
2. I would like to stick to a homogeneous assembly syntax, for now AT&T everywhere. We can uniformly change this at a later point, but I don't want different syntaxes across mlkem-native and mldsa-native, or even within mldsa-native.
3. We should have CI and autogen in place before we merge.

@jakemas Can you have a stab at those? 1 and 2 should be straightforward (note that simpasm should give you AT&T by default); if you have issues with 3, let us know. I may get to it as well, but the alpha release has priority.

hanno-becker · 2025-11-10T12:23:28Z

> (2) fix the selective swapping of source operands to VPADDD to "optimise" the instructions down to the shorter size on gcc Linux, so both compile consistently

If we can avoid the issue by swapping VPADD operands, I think we should do that. It should have no impact on the proof, right?

jakemas · 2025-11-10T17:57:57Z

> Thanks a lot @jakemas @jargh for the work! 🎉 It's great to see the first correctness proof for x86_64, let alone such a complicated one.
>
> A few things:
>
> 1. The documentation still needs adjusting after copy-over from mlkem-native. Most importantly, we cannot claim to have verified all x86_64 assembly, which is currently stated at the top of proofs/hol_light/x86_64/README.md.
> 2. I would like to stick to a homogeneous assembly syntax, for now AT&T everywhere. We can uniformly change this at a later point, but I don't want different syntaxes across mlkem-native and mldsa-native, or even within mldsa-native.
> 3. We should have CI and autogen in place before we merge.
>
> @jakemas Can you have a stab at those? 1 and 2 should be straightforward (note that simpasm should give you AT&T by default); if you have issues with 3, let us know. I may get to it as well, but the alpha release has priority.

OK, fixed (1) and (2). I will test on my x86 machine, then clean up the commits. I'd rather add the non-simpasm assembly as it appears upstream (https://github.com/pq-crystals/dilithium/blob/master/avx2/ntt.S), with macros etc., to demonstrate clearly the applicability to the reference. Fortunately, John requires in s2n-bignum that the AT&T versions are generated from the Intel ones, so we have both.

> We should have CI and autogen in place before we merge.

By this do you mean these merge first:
#645, #646, #647, #648 ?

I ask because the bytecode check happens as part of CI. Should that be part of this PR or another?

jakemas · 2025-11-10T19:12:16Z

Testing - Automatic bytecode generation/insertion

OK, I hooked up the dump_bytecode.ml utility to autogen. To test it, I changed the first byte of the bytecode in mldsa_ntt.ml to 0x99:

let mldsa_ntt_mc = define_assert_from_elf "mldsa_ntt_mc" "mldsa/mldsa_ntt.o"
(*** BYTECODE START ***)
[
  0x99; 0x0f; 0x1e; 0xfa;  (* ENDBR64 *)

The change was correctly detected by python scripts/autogen --update-hol-light-bytecode --dry-run:

error proofs/hol_light/x86_64/proofs/mldsa_ntt.ml
Autogenerated file proofs/hol_light/x86_64/proofs/mldsa_ntt.ml needs updating. Have you called scripts/autogen? Wrote new version to proofs/hol_light/x86_64/proofs/mldsa_ntt.ml.new.
20c20
<   0x99; 0x0f; 0x1e; 0xfa;  (* ENDBR64 *)
---
>   0xf3; 0x0f; 0x1e; 0xfa;  (* ENDBR64 *)

Running python scripts/autogen --update-hol-light-bytecode then fixed the bytecode, which I verified by running python scripts/autogen --update-hol-light-bytecode --dry-run again:

✓ Finalized and checked files are up to date (8.3s)
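The check-then-update cycle described above can be sketched roughly as follows. This is a minimal illustration and not the actual scripts/autogen code: the function name check_or_update and its parameters are hypothetical, and only the .new suffix and the dry-run behaviour are taken from the output shown above.

```python
import difflib
from pathlib import Path


def check_or_update(path, expected, dry_run):
    """Compare `expected` with the file at `path`.

    Returns True if the file is already up to date. Otherwise, in
    dry-run mode the expected content is written next to the file as
    `<path>.new` and a diff is printed; without dry-run the file is
    overwritten in place. In both of those cases False is returned.
    """
    path = Path(path)
    current = path.read_text()
    if current == expected:
        return True
    if dry_run:
        new_path = Path(str(path) + ".new")
        new_path.write_text(expected)
        diff = difflib.unified_diff(
            current.splitlines(), expected.splitlines(),
            fromfile=str(path), tofile=str(new_path), lineterm="")
        print("\n".join(diff))
    else:
        path.write_text(expected)
    return False
```

With this shape, a CI job would only ever call the function with dry_run=True and fail when it returns False, while a developer runs it without dry-run to bring the file back in sync.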

To generate the bytecode:

cd proofs/hol_light/x86_64 && make dump_bytecode

=== bytecode start: mldsa/mldsa_ntt.o ================
[
  0xf3; 0x0f; 0x1e; 0xfa;  (* ENDBR64 *)
  0xc5; 0xfd; 0x6f; 0x06;  (* VMOVDQA (%_% ymm0) (Memop Word256 (%% (rsi,0))) *)
  0xc4; 0xe2; 0x7d; 0x58; 0x8e; 0x84; 0x00; 0x00; 0x00;
                           (* VPBROADCASTD (%_% ymm1) (Memop Doubleword (%% (rsi,132))) *)
  0xc4; 0xe2; 0x7d; 0x58; 0x96; 0x24; 0x05; 0x00; 0x00;
                           (* VPBROADCASTD (%_% ymm2) (Memop Doubleword (%% (rsi,1316))) *)
  0xc5; 0xfd; 0x6f; 0x27;  (* VMOVDQA (%_% ymm4) (Memop Word256 (%% (rdi,0))) *)
  0xc5; 0xfd; 0x6f; 0xaf; 0x80; 0x00; 0x00; 0x00;
                           (* VMOVDQA (%_% ymm5) (Memop Word256 (%% (rdi,128))) *)
  0xc5; 0xfd; 0x6f; 0xb7; 0x00; 0x01; 0x00; 0x00;
                           (* VMOVDQA (%_% ymm6) (Memop Word256 (%% (rdi,256))) *)
  0xc5; 0xfd; 0x6f; 0xbf; 0x80; 0x01; 0x00; 0x00;
                           (* VMOVDQA (%_% ymm7) (Memop Word256 (%% (rdi,384))) *)
  0xc5; 0x7d; 0x6f; 0x87; 0x00; 0x02; 0x00; 0x00;
                           (* VMOVDQA (%_% ymm8) (Memop Word256 (%% (rdi,512))) *)
  0xc5; 0x7d; 0x6f; 0x8f; 0x80; 0x02; 0x00; 0x00;
                           (* VMOVDQA (%_% ymm9) (Memop Word256 (%% (rdi,640))) *)
  0xc5; 0x7d; 0x6f; 0x97; 0x00; 0x03; 0x00; 0x00;
                           (* VMOVDQA (%_% ymm10) (Memop Word256 (%% (rdi,768))) *)
  0xc5; 0x7d; 0x6f; 0x9f; 0x80; 0x03; 0x00; 0x00;
                           (* VMOVDQA (%_% ymm11) (Memop Word256 (%% (rdi,896))) *)
  0xc4; 0x62; 0x3d; 0x28; 0xe9;
                           (* VPMULDQ (%_% ymm13) (%_% ymm8) (%_% ymm1) *)
  0xc4; 0x41; 0x7e; 0x1

...
                           (* VMOVDQA (Memop Word256 (%% (rdi,960))) (%_% ymm3) *)
  0xc5; 0x7d; 0x7f; 0x9f; 0xe0; 0x03; 0x00; 0x00;
                           (* VMOVDQA (Memop Word256 (%% (rdi,992))) (%_% ymm11) *)
  0xc3                     (* RET *)
];;
==== bytecode end =====================================

Running time: 96.000000 sec, Start unixtime: 1762801050.000000, End unixtime: 1762801146.000000
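For illustration, the byte list can be recovered from output of this shape with a few lines of Python. This is a sketch that assumes the marker lines look exactly as printed above ("=== bytecode start" and "==== bytecode end"); the function name extract_bytes is hypothetical and is not how autogen is known to consume the dump.

```python
import re


def extract_bytes(dump_output):
    """Return the bytes listed between the start and end marker lines."""
    inside = False
    body = []
    for line in dump_output.splitlines():
        if line.startswith("=== bytecode start"):
            inside = True
        elif line.startswith("==== bytecode end"):
            inside = False
        elif inside:
            body.append(line)
    # Drop the (* ... *) disassembly comments before collecting hex tokens.
    text = re.sub(r"\(\*.*?\*\)", "", "\n".join(body), flags=re.S)
    return bytes(int(tok, 16) for tok in re.findall(r"0x[0-9a-fA-F]{2}", text))
```

Comparing the result against the bytes embedded in mldsa_ntt.ml is then a plain equality check on two bytes objects.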

jakemas · 2025-11-10T20:19:09Z

HOL-Light Proofs in CI

I've added .github/workflows/hol_light.yml, which adds some of this to CI (#645). I haven't added CI jobs before, so I followed the existing examples. I chose x86 runners for the proofs to best match the target environment, but I don't know whether the instances are set up to run long enough, or are powerful enough, for the proofs. @mkannwischer, please take a look and let me know what you think of the CI setup.

As in mlkem-native, this is a three-part test:

HOL-Light / HOL-Light bytecode check (pull_request) Successful in 6m
HOL-Light / HOL-Light interactive shell test (pull_request) Successful in 15m
HOL-Light / HOL Light proof for mldsa_ntt.S (pull_request) Successful in 249m

The interactive shell test in mlkem-native (https://github.com/pq-code-package/mlkem-native/blob/main/.github/workflows/hol_light.yml#L66-L68) runs needs "proofs/mlkem_poly_tobytes.ml";;. In mldsa-native, however, we have no small proof to run here. I tried running needs "proofs/mldsa_ntt.ml";; but in the interactive shell it takes longer than the actual CI job for HOL-Light / HOL Light proof for mldsa_ntt.S. So instead, I think it's reasonable to test loading the mldsa-specific environment, i.e. needs "x86/proofs/base.ml";; and needs "proofs/mldsa_specs.ml";;. When I add a smaller proof, I will swap this out for needs "proofs/mldsa_small_proof";;.

mkannwischer

Thanks @jakemas. The CI looks good to me.
Three minor comments on autogen above.

Can you also autogenerate mldsa_ntt.S please (see gen_hol_light_asm in mlkem-native)?

jakemas · 2025-11-11T19:09:19Z

> Thanks @jakemas. The CI looks good to me. Three minor comments on autogen above.
>
> Can you also autogenerate mldsa_ntt.S please (see gen_hol_light_asm in mlkem-native)?

Okay, also added autogen.

hanno-becker · 2025-11-11T20:01:46Z

> By this do you mean these merge first:
> #645, #646, #647, #648 ?

#648 and #646 are not required for this PR (in my mind), but we should have #645, #647.

jakemas · 2025-11-11T20:05:48Z

> > By this do you mean these merge first:
> > #645, #646, #647, #648 ?
>
> #648 and #646 are not required for this PR (in my mind), but we should have #645, #647.

Ok, they are implemented in this PR now. I have kept the description up to date.

jargh · 2025-11-11T20:06:34Z

On the problem of clang-based and gcc-based assemblers behaving differently (the former selectively swapping arguments of commutative operations like VPADD to get a more compact encoding), there could be an argument for forcing the optimization by switching the order in the code, in the hope that it becomes more platform-independent. A possible issue with doing that simply via the clean code is that whether the optimization triggers depends on which registers are used, so it might be tricky to enforce at the macro level. It would be interesting to know exactly when the swap gets triggered. Maybe it's when one argument is >= ymm8 and the other is <= ymm7 (which I think is the only situation that makes the encoding shorter), or maybe it's simply when one register code is greater than another.
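To make the encoding-length argument concrete, here is a small sketch that encodes the register-register form of VPADDD on ymm registers under the standard VEX rules (opcode VEX.256.66.0F FE /r). It only shows when a swap changes the length; it does not claim to reproduce the heuristic that any particular assembler uses. The function names are made up for this illustration, and operands are in Intel order (destination, first source, second source).

```python
def encode_vpaddd_ymm(dst, src1, src2):
    """Encode `vpaddd ymm<dst>, ymm<src1>, ymm<src2>` (register form),
    using the 2-byte VEX prefix whenever it is legal."""
    for reg in (dst, src1, src2):
        if not 0 <= reg <= 15:
            raise ValueError("register index must be in 0..15")
    r_inv = 0 if dst >= 8 else 1     # VEX.R (stored inverted) extends ModRM.reg
    b_inv = 0 if src2 >= 8 else 1    # VEX.B (stored inverted) extends ModRM.rm
    vvvv_inv = (~src1) & 0xF         # first source register, stored inverted
    tail = (vvvv_inv << 3) | (1 << 2) | 0b01   # L=1 (256-bit), pp=01 (0x66)
    modrm = 0xC0 | ((dst & 7) << 3) | (src2 & 7)
    if b_inv:
        # VEX.B is not needed, so the 2-byte form (0xC5) suffices.
        prefix = bytes([0xC5, (r_inv << 7) | tail])
    else:
        # src2 is ymm8..ymm15: only the 3-byte form (0xC4) carries VEX.B.
        # Second byte: R, X (unused, so 1), B, then map 0F (00001); W=0.
        prefix = bytes([0xC4, (r_inv << 7) | (1 << 6) | (b_inv << 5) | 0x01, tail])
    return prefix + bytes([0xFE, modrm])


def swap_shortens(src1, src2):
    """True if exchanging the two sources yields a shorter encoding."""
    return src2 >= 8 and src1 < 8
```

Under these rules, vpaddd ymm0, ymm1, ymm8 takes 5 bytes (c4 c1 75 fe c0) while the swapped vpaddd ymm0, ymm8, ymm1 takes 4 (c5 bd fe c1). For this instruction form, the swap therefore only pays off when the second source is ymm8 or above and the first is ymm7 or below, which matches the first of the two guesses above.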

jargh · 2025-11-11T20:08:35Z

That is, I'd instinctively vote for option (2) above if it isn't too troublesome:

> (2) fix the selective swapping of source operands to VPADDD to "optimise" the instructions down to the shorter size on gcc Linux, so both compile consistently

Add HOL-Light framework with NTT proof.

This includes:
- list_proofs.sh, build-proof.sh, dump_bytecode.ml utilities
- mldsa-specs.ml and mldsa-utils.ml to support HOL-Light proofs
- HOL-Light proof for ML-DSA x86 AVX2 NTT in mldsa_ntt.ml
- Addition of hol_light to scripts/tests to wrap and run proofs
- READMEs for usage descriptions
- Autogenerate embedded bytecode using dump_bytecode.ml
- Add CI jobs to test and run hol-light proofs
- Add autogen for asmb files

Signed-off-by: Jake Massimo <jakemas@amazon.com>

jakemas added the hol-light label Nov 7, 2025

jakemas force-pushed the hol-light-ntt branch from 0f77c6c to c321a88 Compare November 8, 2025 02:22

jakemas marked this pull request as ready for review November 8, 2025 02:22

jakemas requested a review from a team as a code owner November 8, 2025 02:22

mkannwischer requested changes Nov 8, 2025

mkannwischer mentioned this pull request Nov 8, 2025

lint: Consolidate with mlkem-native #641

Merged

mkannwischer force-pushed the hol-light-ntt branch from 87f4d53 to 24a1562 Compare November 9, 2025 03:48

This was referenced Nov 9, 2025

HOL-Light: Support running proofs cross-platform (on Linux) #643

Open

HOL-Light: Support running proofs on MacOS #644

Open

HOL-Light: Run proofs in CI #645

Closed

HOL-Light: Autogenerate embedded byte-code #646

Closed

mkannwischer approved these changes Nov 9, 2025

hanno-becker reviewed Nov 10, 2025

hanno-becker reviewed Nov 10, 2025

hanno-becker requested changes Nov 10, 2025

jakemas force-pushed the hol-light-ntt branch from 414790e to 3914869 Compare November 10, 2025 18:56

jakemas force-pushed the hol-light-ntt branch from 44e99b1 to 2b14a7b Compare November 10, 2025 20:29

jakemas changed the title from "Hol-Light: Add Hol-light proof framework and Forward NTT proof" to "Hol-Light: Add Hol-light proof framework and x86 AVX2 NTT proof" Nov 10, 2025

jakemas force-pushed the hol-light-ntt branch 2 times, most recently from c3596c3 to fdd4867 Compare November 11, 2025 03:09

mkannwischer requested changes Nov 11, 2025

jakemas force-pushed the hol-light-ntt branch from 42d9098 to 01dda75 Compare November 11, 2025 19:08

jakemas force-pushed the hol-light-ntt branch from 01dda75 to c997ca7 Compare November 11, 2025 19:11