@mkannwischer
Contributor

Buffers in sign.c are not currently forced to be aligned. This may harm performance and it may also lead to problems if a FIPS202 backend is used that requires alignment (e.g., in OpenTitan).
This commit adds alignment.
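
For readers unfamiliar with the technique, here is a minimal sketch of forcing alignment on stack buffers in C. It is not the code from this commit: the macro `EXAMPLE_ALIGN`, the 32-byte boundary, the buffer names and sizes, and both helper functions are hypothetical and chosen for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical alignment boundary and macro; the project's real macro
 * name and boundary may differ. */
#define EXAMPLE_ALIGN_BYTES 32
#if defined(__GNUC__) || defined(__clang__)
#define EXAMPLE_ALIGN __attribute__((aligned(EXAMPLE_ALIGN_BYTES)))
#elif defined(_MSC_VER)
#define EXAMPLE_ALIGN __declspec(align(EXAMPLE_ALIGN_BYTES))
#else
#error "No alignment attribute known for this compiler"
#endif

/* Returns 1 if the address p is a multiple of n bytes, 0 otherwise. */
int example_is_aligned(const void *p, size_t n)
{
  assert(n != 0);
  return ((uintptr_t)p % n) == 0;
}

/* Stand-in for a signing routine that keeps scratch buffers on the stack.
 * Without EXAMPLE_ALIGN, a uint8_t array is only guaranteed 1-byte
 * alignment, which a backend with stricter requirements may reject. */
int example_scratch_is_aligned(void)
{
  EXAMPLE_ALIGN uint8_t seedbuf[96]; /* illustrative size */
  EXAMPLE_ALIGN uint8_t mu[64];      /* illustrative size */

  memset(seedbuf, 0, sizeof seedbuf);
  memset(mu, 0, sizeof mu);

  return example_is_aligned(seedbuf, EXAMPLE_ALIGN_BYTES) &&
         example_is_aligned(mu, EXAMPLE_ALIGN_BYTES);
}
```

Calling `example_scratch_is_aligned()` should return 1 on compilers that honor the attribute. A compiler attribute is used here rather than C11 `_Alignas` only so that the sketch also builds in pre-C11 modes.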

@github-actions github-actions bot left a comment

Mac Mini (M1, 2020) benchmarks (opt)

Benchmark suite Current: 1c02fb9 Previous: 500fc82 Ratio
ML-DSA-44 keypair 46521 cycles 46426 cycles 1.00
ML-DSA-44 sign 132863 cycles 132751 cycles 1.00
ML-DSA-44 verify 47879 cycles 47842 cycles 1.00
ML-DSA-65 keypair 81495 cycles 81452 cycles 1.00
ML-DSA-65 sign 219348 cycles 219232 cycles 1.00
ML-DSA-65 verify 80138 cycles 80137 cycles 1.00
ML-DSA-87 keypair 132625 cycles 132755 cycles 1.00
ML-DSA-87 sign 280963 cycles 280940 cycles 1.00
ML-DSA-87 verify 130449 cycles 130324 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@github-actions github-actions bot left a comment

Mac Mini (M1, 2020) benchmarks (no-opt)

Benchmark suite Current: 1c02fb9 Previous: 500fc82 Ratio
ML-DSA-44 keypair 115289 cycles 115263 cycles 1.00
ML-DSA-44 sign 431838 cycles 431701 cycles 1.00
ML-DSA-44 verify 122210 cycles 122160 cycles 1.00
ML-DSA-65 keypair 197380 cycles 197422 cycles 1.00
ML-DSA-65 sign 701028 cycles 700956 cycles 1.00
ML-DSA-65 verify 197645 cycles 197682 cycles 1.00
ML-DSA-87 keypair 325371 cycles 325402 cycles 1.00
ML-DSA-87 sign 884428 cycles 884475 cycles 1.00
ML-DSA-87 verify 328680 cycles 328672 cycles 1.00

@oqs-bot oqs-bot left a comment

Intel Xeon 4th gen (c7i)

Benchmark suite Current: 1c02fb9 Previous: 500fc82 Ratio
ML-DSA-44 keypair 35668 cycles 35140 cycles 1.02
ML-DSA-44 sign 120943 cycles 121005 cycles 1.00
ML-DSA-44 verify 38257 cycles 38270 cycles 1.00
ML-DSA-65 keypair 61717 cycles 61949 cycles 1.00
ML-DSA-65 sign 199624 cycles 199347 cycles 1.00
ML-DSA-65 verify 62358 cycles 62507 cycles 1.00
ML-DSA-87 keypair 95434 cycles 95705 cycles 1.00
ML-DSA-87 sign 235690 cycles 235102 cycles 1.00
ML-DSA-87 verify 94300 cycles 94106 cycles 1.00

@oqs-bot oqs-bot left a comment

Intel Xeon 4th gen (c7i) (no-opt)

Benchmark suite Current: 1c02fb9 Previous: 500fc82 Ratio
ML-DSA-44 keypair 95214 cycles 95144 cycles 1.00
ML-DSA-44 sign 348280 cycles 349805 cycles 1.00
ML-DSA-44 verify 101250 cycles 100917 cycles 1.00
ML-DSA-65 keypair 163948 cycles 164738 cycles 1.00
ML-DSA-65 sign 566630 cycles 567854 cycles 1.00
ML-DSA-65 verify 164656 cycles 165287 cycles 1.00
ML-DSA-87 keypair 268428 cycles 267478 cycles 1.00
ML-DSA-87 sign 721414 cycles 723051 cycles 1.00
ML-DSA-87 verify 272196 cycles 272153 cycles 1.00

@github-actions github-actions bot left a comment

Arm Cortex-A72 (Raspberry Pi 4) benchmarks (opt)

Benchmark suite Current: 1c02fb9 Previous: 500fc82 Ratio
ML-DSA-44 keypair 225627 cycles 229220 cycles 0.98
ML-DSA-44 sign 652989 cycles 678770 cycles 0.96
ML-DSA-44 verify 230546 cycles 225880 cycles 1.02
ML-DSA-65 keypair 411684 cycles 402895 cycles 1.02
ML-DSA-65 sign 1110030 cycles 1101474 cycles 1.01
ML-DSA-65 verify 380777 cycles 390274 cycles 0.98
ML-DSA-87 keypair 675518 cycles 696993 cycles 0.97
ML-DSA-87 sign 1476505 cycles 1474544 cycles 1.00
ML-DSA-87 verify 643392 cycles 661120 cycles 0.97

@oqs-bot oqs-bot left a comment

Graviton4

Benchmark suite Current: 1c02fb9 Previous: 500fc82 Ratio
ML-DSA-44 keypair 69682 cycles 69610 cycles 1.00
ML-DSA-44 sign 213934 cycles 213776 cycles 1.00
ML-DSA-44 verify 72567 cycles 72483 cycles 1.00
ML-DSA-65 keypair 123376 cycles 123509 cycles 1.00
ML-DSA-65 sign 351004 cycles 350698 cycles 1.00
ML-DSA-65 verify 120714 cycles 120636 cycles 1.00
ML-DSA-87 keypair 201268 cycles 201283 cycles 1.00
ML-DSA-87 sign 449535 cycles 449295 cycles 1.00
ML-DSA-87 verify 198071 cycles 197962 cycles 1.00

@oqs-bot oqs-bot left a comment

AMD EPYC 3rd gen (c6a)

Benchmark suite Current: 1c02fb9 Previous: 500fc82 Ratio
ML-DSA-44 keypair 69327 cycles 69451 cycles 1.00
ML-DSA-44 sign 185276 cycles 185214 cycles 1.00
ML-DSA-44 verify 69090 cycles 69112 cycles 1.00
ML-DSA-65 keypair 119589 cycles 119727 cycles 1.00
ML-DSA-65 sign 296417 cycles 296567 cycles 1.00
ML-DSA-65 verify 115525 cycles 115540 cycles 1.00
ML-DSA-87 keypair 201953 cycles 201497 cycles 1.00
ML-DSA-87 sign 386238 cycles 388390 cycles 0.99
ML-DSA-87 verify 194055 cycles 193215 cycles 1.00

@oqs-bot oqs-bot left a comment

Intel Xeon 3rd gen (c6i)

Benchmark suite Current: 1c02fb9 Previous: 500fc82 Ratio
ML-DSA-44 keypair 57892 cycles 57537 cycles 1.01
ML-DSA-44 sign 180438 cycles 180698 cycles 1.00
ML-DSA-44 verify 61265 cycles 61256 cycles 1.00
ML-DSA-65 keypair 100161 cycles 99766 cycles 1.00
ML-DSA-65 sign 296472 cycles 296668 cycles 1.00
ML-DSA-65 verify 100142 cycles 100133 cycles 1.00
ML-DSA-87 keypair 154289 cycles 154399 cycles 1.00
ML-DSA-87 sign 352859 cycles 353033 cycles 1.00
ML-DSA-87 verify 152197 cycles 153913 cycles 0.99

@oqs-bot oqs-bot left a comment

Graviton2

Benchmark suite Current: 1c02fb9 Previous: 500fc82 Ratio
ML-DSA-44 keypair 115842 cycles 116083 cycles 1.00
ML-DSA-44 sign 377668 cycles 377918 cycles 1.00
ML-DSA-44 verify 120631 cycles 120647 cycles 1.00
ML-DSA-65 keypair 200379 cycles 200443 cycles 1.00
ML-DSA-65 sign 623778 cycles 623405 cycles 1.00
ML-DSA-65 verify 198368 cycles 198650 cycles 1.00
ML-DSA-87 keypair 327312 cycles 327674 cycles 1.00
ML-DSA-87 sign 789901 cycles 790953 cycles 1.00
ML-DSA-87 verify 325751 cycles 325005 cycles 1.00

@oqs-bot oqs-bot left a comment

Graviton4 (no-opt)

Benchmark suite Current: 1c02fb9 Previous: 500fc82 Ratio
ML-DSA-44 keypair 128230 cycles 128233 cycles 1.00
ML-DSA-44 sign 456819 cycles 457201 cycles 1.00
ML-DSA-44 verify 136326 cycles 136315 cycles 1.00
ML-DSA-65 keypair 220954 cycles 220681 cycles 1.00
ML-DSA-65 sign 745724 cycles 746297 cycles 1.00
ML-DSA-65 verify 220407 cycles 220298 cycles 1.00
ML-DSA-87 keypair 365062 cycles 365097 cycles 1.00
ML-DSA-87 sign 944498 cycles 944395 cycles 1.00
ML-DSA-87 verify 368915 cycles 368782 cycles 1.00

@oqs-bot oqs-bot left a comment

Graviton3

Benchmark suite Current: 1c02fb9 Previous: 500fc82 Ratio
ML-DSA-44 keypair 74020 cycles 74265 cycles 1.00
ML-DSA-44 sign 228117 cycles 228735 cycles 1.00
ML-DSA-44 verify 77989 cycles 78135 cycles 1.00
ML-DSA-65 keypair 130546 cycles 130413 cycles 1.00
ML-DSA-65 sign 378805 cycles 378295 cycles 1.00
ML-DSA-65 verify 129428 cycles 129150 cycles 1.00
ML-DSA-87 keypair 209721 cycles 211746 cycles 0.99
ML-DSA-87 sign 475412 cycles 479634 cycles 0.99
ML-DSA-87 verify 210448 cycles 210224 cycles 1.00

@oqs-bot oqs-bot left a comment

AMD EPYC 3rd gen (c6a) (no-opt)

Benchmark suite Current: 1c02fb9 Previous: 500fc82 Ratio
ML-DSA-44 keypair 135532 cycles 135579 cycles 1.00
ML-DSA-44 sign 539766 cycles 542304 cycles 1.00
ML-DSA-44 verify 148440 cycles 148860 cycles 1.00
ML-DSA-65 keypair 229235 cycles 228638 cycles 1.00
ML-DSA-65 sign 891864 cycles 892396 cycles 1.00
ML-DSA-65 verify 239018 cycles 237683 cycles 1.01
ML-DSA-87 keypair 373242 cycles 373275 cycles 1.00
ML-DSA-87 sign 1107433 cycles 1105513 cycles 1.00
ML-DSA-87 verify 386752 cycles 387408 cycles 1.00

@oqs-bot oqs-bot left a comment

AMD EPYC 4th gen (c7a)

Benchmark suite Current: 1c02fb9 Previous: 500fc82 Ratio
ML-DSA-44 keypair 42362 cycles 42819 cycles 0.99
ML-DSA-44 sign 131214 cycles 131179 cycles 1.00
ML-DSA-44 verify 44516 cycles 44463 cycles 1.00
ML-DSA-65 keypair 72890 cycles 73113 cycles 1.00
ML-DSA-65 sign 211540 cycles 212321 cycles 1.00
ML-DSA-65 verify 73583 cycles 73373 cycles 1.00
ML-DSA-87 keypair 109867 cycles 110402 cycles 1.00
ML-DSA-87 sign 248634 cycles 249028 cycles 1.00
ML-DSA-87 verify 109312 cycles 110526 cycles 0.99

@oqs-bot oqs-bot left a comment

Intel Xeon 3rd gen (c6i) (no-opt)

Benchmark suite Current: 1c02fb9 Previous: 500fc82 Ratio
ML-DSA-44 keypair 158254 cycles 157967 cycles 1.00
ML-DSA-44 sign 566174 cycles 566123 cycles 1.00
ML-DSA-44 verify 170210 cycles 169632 cycles 1.00
ML-DSA-65 keypair 271005 cycles 270786 cycles 1.00
ML-DSA-65 sign 925830 cycles 926328 cycles 1.00
ML-DSA-65 verify 275829 cycles 275770 cycles 1.00
ML-DSA-87 keypair 451739 cycles 451356 cycles 1.00
ML-DSA-87 sign 1183611 cycles 1182974 cycles 1.00
ML-DSA-87 verify 461482 cycles 461122 cycles 1.00

@oqs-bot oqs-bot left a comment

Graviton3 (no-opt)

Benchmark suite Current: 1c02fb9 Previous: 500fc82 Ratio
ML-DSA-44 keypair 138728 cycles 138769 cycles 1.00
ML-DSA-44 sign 493240 cycles 493618 cycles 1.00
ML-DSA-44 verify 148457 cycles 148348 cycles 1.00
ML-DSA-65 keypair 242785 cycles 242253 cycles 1.00
ML-DSA-65 sign 808642 cycles 809756 cycles 1.00
ML-DSA-65 verify 240667 cycles 240589 cycles 1.00
ML-DSA-87 keypair 396373 cycles 396617 cycles 1.00
ML-DSA-87 sign 1027322 cycles 1027256 cycles 1.00
ML-DSA-87 verify 401355 cycles 401370 cycles 1.00

@oqs-bot oqs-bot left a comment

Graviton2 (no-opt)

Benchmark suite Current: 1c02fb9 Previous: 500fc82 Ratio
ML-DSA-44 keypair 214651 cycles 214601 cycles 1.00
ML-DSA-44 sign 783494 cycles 795140 cycles 0.99
ML-DSA-44 verify 230237 cycles 230348 cycles 1.00
ML-DSA-65 keypair 384772 cycles 385942 cycles 1.00
ML-DSA-65 sign 1308561 cycles 1307471 cycles 1.00
ML-DSA-65 verify 375730 cycles 376490 cycles 1.00
ML-DSA-87 keypair 607346 cycles 607058 cycles 1.00
ML-DSA-87 sign 1624575 cycles 1625788 cycles 1.00
ML-DSA-87 verify 616851 cycles 617486 cycles 1.00

@oqs-bot oqs-bot left a comment

AMD EPYC 4th gen (c7a) (no-opt)

Benchmark suite Current: 1c02fb9 Previous: 500fc82 Ratio
ML-DSA-44 keypair 120112 cycles 120716 cycles 0.99
ML-DSA-44 sign 454067 cycles 454908 cycles 1.00
ML-DSA-44 verify 129876 cycles 130486 cycles 1.00
ML-DSA-65 keypair 204925 cycles 205460 cycles 1.00
ML-DSA-65 sign 734724 cycles 737598 cycles 1.00
ML-DSA-65 verify 209633 cycles 211808 cycles 0.99
ML-DSA-87 keypair 337263 cycles 338314 cycles 1.00
ML-DSA-87 sign 924352 cycles 926006 cycles 1.00
ML-DSA-87 verify 346278 cycles 348227 cycles 0.99

@github-actions github-actions bot left a comment

SpacemiT K1 8 (Banana Pi F3) benchmarks (no-opt)

Benchmark suite Current: 1c02fb9 Previous: 500fc82 Ratio
ML-DSA-44 keypair 825701 cycles 825424 cycles 1.00
ML-DSA-44 sign 3329492 cycles 3333116 cycles 1.00
ML-DSA-44 verify 917407 cycles 919802 cycles 1.00
ML-DSA-65 keypair 1403418 cycles 1402665 cycles 1.00
ML-DSA-65 sign 5466727 cycles 5451820 cycles 1.00
ML-DSA-65 verify 1470047 cycles 1466342 cycles 1.00
ML-DSA-87 keypair 2304543 cycles 2304877 cycles 1.00
ML-DSA-87 sign 6811011 cycles 6820100 cycles 1.00
ML-DSA-87 verify 2393203 cycles 2405200 cycles 1.00

@github-actions github-actions bot left a comment

Arm Cortex-A72 (Raspberry Pi 4) benchmarks (no-opt)

Benchmark suite Current: 1c02fb9 Previous: 500fc82 Ratio
ML-DSA-44 keypair 309668 cycles 323650 cycles 0.96
ML-DSA-44 sign 1219779 cycles 1217983 cycles 1.00
ML-DSA-44 verify 340805 cycles 334278 cycles 1.02
ML-DSA-65 keypair 571949 cycles 569989 cycles 1.00
ML-DSA-65 sign 1968617 cycles 1976813 cycles 1.00
ML-DSA-65 verify 541988 cycles 529535 cycles 1.02
ML-DSA-87 keypair 867053 cycles 873331 cycles 0.99
ML-DSA-87 sign 2514392 cycles 2474406 cycles 1.02
ML-DSA-87 verify 893328 cycles 896613 cycles 1.00

@github-actions github-actions bot left a comment

Arm Cortex-A76 (Raspberry Pi 5) benchmarks (opt)

Benchmark suite Current: 1c02fb9 Previous: 500fc82 Ratio
ML-DSA-44 keypair 115631 cycles 115648 cycles 1.00
ML-DSA-44 sign 377515 cycles 377237 cycles 1.00
ML-DSA-44 verify 120445 cycles 120227 cycles 1.00
ML-DSA-65 keypair 200315 cycles 200070 cycles 1.00
ML-DSA-65 sign 623677 cycles 622815 cycles 1.00
ML-DSA-65 verify 198258 cycles 198200 cycles 1.00
ML-DSA-87 keypair 327353 cycles 326756 cycles 1.00
ML-DSA-87 sign 789117 cycles 789956 cycles 1.00
ML-DSA-87 verify 325333 cycles 324403 cycles 1.00

@github-actions github-actions bot left a comment

Arm Cortex-A76 (Raspberry Pi 5) benchmarks (no-opt)

Benchmark suite Current: 1c02fb9 Previous: 500fc82 Ratio
ML-DSA-44 keypair 214078 cycles 213844 cycles 1.00
ML-DSA-44 sign 781369 cycles 782211 cycles 1.00
ML-DSA-44 verify 229700 cycles 230301 cycles 1.00
ML-DSA-65 keypair 384976 cycles 385110 cycles 1.00
ML-DSA-65 sign 1317745 cycles 1313702 cycles 1.00
ML-DSA-65 verify 375878 cycles 375555 cycles 1.00
ML-DSA-87 keypair 606967 cycles 606211 cycles 1.00
ML-DSA-87 sign 1621866 cycles 1622406 cycles 1.00
ML-DSA-87 verify 616960 cycles 617113 cycles 1.00

@github-actions github-actions bot left a comment

Arm Cortex-A55 (Snapdragon 888) benchmarks (opt)

Benchmark suite Current: 1c02fb9 Previous: 500fc82 Ratio
ML-DSA-44 keypair 291864 cycles 290605 cycles 1.00
ML-DSA-44 sign 930890 cycles 935466 cycles 1.00
ML-DSA-44 verify 295927 cycles 291719 cycles 1.01
ML-DSA-65 keypair 494217 cycles 495090 cycles 1.00
ML-DSA-65 sign 1525079 cycles 1561749 cycles 0.98
ML-DSA-65 verify 481578 cycles 476130 cycles 1.01
ML-DSA-87 keypair 842619 cycles 842023 cycles 1.00
ML-DSA-87 sign 2066804 cycles 2091738 cycles 0.99
ML-DSA-87 verify 823325 cycles 814481 cycles 1.01

@github-actions github-actions bot left a comment

Arm Cortex-A55 (Snapdragon 888) benchmarks (no-opt)

Benchmark suite Current: 1c02fb9 Previous: 500fc82 Ratio
ML-DSA-44 keypair 469824 cycles 468437 cycles 1.00
ML-DSA-44 sign 2212277 cycles 2222751 cycles 1.00
ML-DSA-44 verify 550073 cycles 546777 cycles 1.01
ML-DSA-65 keypair 781578 cycles 784138 cycles 1.00
ML-DSA-65 sign 3627152 cycles 3650589 cycles 0.99
ML-DSA-65 verify 851369 cycles 851427 cycles 1.00
ML-DSA-87 keypair 1265646 cycles 1268579 cycles 1.00
ML-DSA-87 sign 4500789 cycles 4549665 cycles 0.99
ML-DSA-87 verify 1371228 cycles 1372155 cycles 1.00

@mkannwischer mkannwischer marked this pull request as ready for review November 11, 2025 04:14
@mkannwischer mkannwischer requested a review from a team as a code owner November 11, 2025 04:14
Contributor

@hanno-becker hanno-becker left a comment

We should detect this automatically. I opened #669 to track this.

@mkannwischer mkannwischer force-pushed the align branch 2 times, most recently from 277d746 to e6ee07f Compare November 12, 2025 09:35
Buffers in sign.c are not currently forced to be aligned. This may harm
performance and it may also lead to problems if a FIPS202 backend is used
that requires alignment (e.g., in OpenTitan).
This commit adds alignment.

Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu>
@hanno-becker hanno-becker merged commit 7320306 into main Nov 12, 2025
259 checks passed
@hanno-becker hanno-becker deleted the align branch November 12, 2025 12:05