@eternal-flame-AD eternal-flame-AD commented Oct 29, 2025

This should not affect soft-core performance: Salsa20 is designed to be register-resident, and this change is only register wiring.
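For readers less familiar with the cipher, here is a minimal sketch of the Salsa20 quarter round and double round as given in the Salsa20 specification. It is not the code from this PR; the helper names `quarter_round`, `qr` and `double_round` are illustrative. It only shows why the 16-word state can stay in registers: every step is a 32-bit add, rotate or xor on words that are already loaded, so changing which word sits where costs nothing beyond renaming.

```rust
/// One Salsa20 quarter round on four state words (add, rotate, xor only).
#[inline(always)]
fn quarter_round(y0: u32, y1: u32, y2: u32, y3: u32) -> (u32, u32, u32, u32) {
    let z1 = y1 ^ y0.wrapping_add(y3).rotate_left(7);
    let z2 = y2 ^ z1.wrapping_add(y0).rotate_left(9);
    let z3 = y3 ^ z2.wrapping_add(z1).rotate_left(13);
    let z0 = y0 ^ z3.wrapping_add(z2).rotate_left(18);
    (z0, z1, z2, z3)
}

/// Apply a quarter round to the words at indices a, b, c, d of the state.
#[inline(always)]
fn qr(x: &mut [u32; 16], a: usize, b: usize, c: usize, d: usize) {
    let (z0, z1, z2, z3) = quarter_round(x[a], x[b], x[c], x[d]);
    x[a] = z0;
    x[b] = z1;
    x[c] = z2;
    x[d] = z3;
}

/// One double round: a column round followed by a row round.
/// Salsa20/8, /12 and /20 run this 4, 6 and 10 times respectively.
fn double_round(x: &mut [u32; 16]) {
    // Column round.
    qr(x, 0, 4, 8, 12);
    qr(x, 5, 9, 13, 1);
    qr(x, 10, 14, 2, 6);
    qr(x, 15, 3, 7, 11);
    // Row round.
    qr(x, 0, 1, 2, 3);
    qr(x, 5, 6, 7, 4);
    qr(x, 10, 11, 8, 9);
    qr(x, 15, 12, 13, 14);
}
```

With the indices fixed at compile time, an optimizing compiler can typically keep all sixteen words in registers across rounds on x86_64, though the exact allocation depends on the target and compiler version.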

TODO: pull in SSE2 and AVX2 dual-buffer cores and statically dispatch to them in place of the soft core when the round count is high enough to be profitable: https://github.com/eternal-flame-AD/scrypt-opt/blob/main/src/salsa20/x86_64.rs
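A rough sketch of what static dispatch on the round count could look like, assuming the round count is a const generic parameter. Everything here is hypothetical: the `Core` enum, `select_core`, `core_for`, and especially the `MIN_SIMD_ROUNDS` threshold, which is a placeholder and not a measured break-even point. The real cores would come from the linked file.

```rust
/// Hypothetical set of available cores; names are illustrative only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Core {
    Soft,
    Sse2,
    Avx2,
}

/// Placeholder break-even round count. The real value would have to come
/// from benchmarks; 12 is an assumption made for this sketch.
const MIN_SIMD_ROUNDS: usize = 12;

/// Pick a core from the round count and the available target features.
const fn select_core(rounds: usize, has_sse2: bool, has_avx2: bool) -> Core {
    if rounds < MIN_SIMD_ROUNDS {
        Core::Soft
    } else if has_avx2 {
        Core::Avx2
    } else if has_sse2 {
        Core::Sse2
    } else {
        Core::Soft
    }
}

/// Resolve the core at compile time: ROUNDS is a const generic and the
/// feature checks use `cfg!`, so no runtime branch is needed.
const fn core_for<const ROUNDS: usize>() -> Core {
    select_core(
        ROUNDS,
        cfg!(target_feature = "sse2"),
        cfg!(target_feature = "avx2"),
    )
}
```

Because `core_for` is a `const fn` over a const generic, a call such as `core_for::<8>()` folds to a constant, so the soft core path for low round counts would carry no dispatch overhead.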

Before:

test salsa12_bench1_16b   ... bench:          12.43 ns/iter (+/- 0.61) = 1333 MB/s
test salsa12_bench2_256b  ... bench:         159.03 ns/iter (+/- 12.04) = 1610 MB/s
test salsa12_bench3_1kib  ... bench:         619.33 ns/iter (+/- 6.29) = 1654 MB/s
test salsa12_bench4_16kib ... bench:       9,915.67 ns/iter (+/- 38.58) = 1652 MB/s
test salsa20_bench1_16b   ... bench:          18.40 ns/iter (+/- 0.23) = 888 MB/s
test salsa20_bench2_256b  ... bench:         253.31 ns/iter (+/- 11.76) = 1011 MB/s
test salsa20_bench3_1kib  ... bench:       1,001.52 ns/iter (+/- 7.56) = 1022 MB/s
test salsa20_bench4_16kib ... bench:      15,858.24 ns/iter (+/- 83.23) = 1033 MB/s
test salsa8_bench1_16b    ... bench:           9.59 ns/iter (+/- 0.13) = 1777 MB/s
test salsa8_bench2_256b   ... bench:         112.39 ns/iter (+/- 2.31) = 2285 MB/s
test salsa8_bench3_1kib   ... bench:         433.26 ns/iter (+/- 13.06) = 2364 MB/s
test salsa8_bench4_16kib  ... bench:       6,801.70 ns/iter (+/- 53.38) = 2409 MB/s

After:

test salsa12_bench1_16b   ... bench:          12.68 ns/iter (+/- 0.44) = 1333 MB/s
test salsa12_bench2_256b  ... bench:         158.09 ns/iter (+/- 5.91) = 1620 MB/s
test salsa12_bench3_1kib  ... bench:         617.26 ns/iter (+/- 15.70) = 1659 MB/s
test salsa12_bench4_16kib ... bench:       9,713.60 ns/iter (+/- 247.77) = 1686 MB/s
test salsa20_bench1_16b   ... bench:          18.43 ns/iter (+/- 1.11) = 888 MB/s
test salsa20_bench2_256b  ... bench:         255.13 ns/iter (+/- 11.43) = 1003 MB/s
test salsa20_bench3_1kib  ... bench:         994.71 ns/iter (+/- 4.67) = 1030 MB/s
test salsa20_bench4_16kib ... bench:      15,787.80 ns/iter (+/- 47.00) = 1037 MB/s
test salsa8_bench1_16b    ... bench:           9.74 ns/iter (+/- 0.13) = 1777 MB/s
test salsa8_bench2_256b   ... bench:         113.25 ns/iter (+/- 3.86) = 2265 MB/s
test salsa8_bench3_1kib   ... bench:         437.69 ns/iter (+/- 14.69) = 2343 MB/s
test salsa8_bench4_16kib  ... bench:       6,752.95 ns/iter (+/- 101.23) = 2426 MB/s

test salsa8_bench1_ks      ... bench:          34.39 ns/iter (+/- 0.57) = 1882 MB/s
test salsa8_bench1_ks_altn ... bench:          29.47 ns/iter (+/- 3.24) = 2206 MB/s
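
A note on reading the MB/s column: the figures above are consistent with throughput being computed from ns/iter truncated to whole nanoseconds, which would explain why 12.43 ns and 12.68 ns both report 1333 MB/s. The sketch below reproduces that arithmetic. The formula and the 64-byte size for the `ks` benchmarks are inferred from the numbers, not taken from the harness source, and `mb_per_s` is a hypothetical helper name.

```rust
/// Throughput in MB/s, truncating ns/iter to whole nanoseconds first.
/// Inferred from the reported figures; treat it as an assumption.
fn mb_per_s(bytes: u64, ns_per_iter: f64) -> u64 {
    // Truncate to integer nanoseconds, guarding against division by zero.
    let ns = (ns_per_iter as u64).max(1);
    bytes * 1000 / ns
}
```

For the 16-byte benchmarks this truncation is coarse: a one-nanosecond step changes the result by roughly 100 MB/s, so the ns/iter column is the better one to compare for small inputs.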

xref: RustCrypto/password-hashes#622

Signed-off-by: eternal-flame-AD <yume@yumechi.jp>