This is a pure (safe) Rust implementation of SHAKE128. The code should be easy to adapt to the rest of the SHA-3 hash family, since the general Keccak with a 1600-bit state is already implemented.
cd code
# Ensure that everything is clean and test runs
cargo clippy --all -- -D warnings
cargo test
# Build
RUSTFLAGS="-C target-cpu=native" cargo build --release
# Copy to PWD
cp target/release/shake128 ../
cd ../

Please first execute the build steps.
./shake128 [HASH SIZE (in bytes)] < /path/to/input/file

Note: generated using hyperfine on a ~78 MB file; see below the table for the full output.
| Implementation | Time consumed (absolute) |
|---|---|
| OpenSSL | 198 ms |
| Python (hashlib) | 209 ms |
| Rust (tiny-keccak, quoted on the Keccak website) | 203 ms |
| My implementation | 208 ms |
Output:
Benchmark 1: ./target/release/shake128 4096 < /initrd.img.old
Time (mean ± σ): 208.0 ms ± 1.6 ms [User: 187.3 ms, System: 20.7 ms]
Range (min … max): 206.0 ms … 211.4 ms 14 runs
Benchmark 2: openssl dgst -shake128 -xoflen 4096 < /initrd.img.old
Time (mean ± σ): 198.3 ms ± 2.3 ms [User: 184.9 ms, System: 13.3 ms]
Range (min … max): 196.3 ms … 204.2 ms 15 runs
Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
Benchmark 3: python3 -c 'import hashlib; import sys; print(hashlib.shake_128(sys.stdin.buffer.read()).hexdigest(4096))' < /initrd.img.old
Time (mean ± σ): 208.9 ms ± 1.7 ms [User: 193.5 ms, System: 15.3 ms]
Range (min … max): 206.8 ms … 212.4 ms 14 runs
Benchmark 4: ../keccak_ref/target/release/keccak_ref 4096 < /initrd.img.old
Time (mean ± σ): 203.4 ms ± 1.0 ms [User: 186.1 ms, System: 17.2 ms]
Range (min … max): 202.1 ms … 205.3 ms 14 runs
Summary
openssl dgst -shake128 -xoflen 4096 < /initrd.img.old ran
1.03 ± 0.01 times faster than ../keccak_ref/target/release/keccak_ref 4096 < /initrd.img.old
1.05 ± 0.01 times faster than ./target/release/shake128 4096 < /initrd.img.old
1.05 ± 0.01 times faster than python3 -c 'import hashlib; import sys; print(hashlib.shake_128(sys.stdin.buffer.read()).hexdigest(4096))' < /initrd.img.old
I tried to make the comparison fair, so every program reads its input from STDIN. However, the Python timing includes interpreter startup and compilation overhead, so the hashing itself is probably slightly faster than the measured figure suggests. Moreover, the two Rust programs are compiled with the -C target-cpu=native flag (OpenSSL is presumably built with comparable optimizations, so this seems fair for a benchmark).
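For reference, here is a minimal sketch of the STDIN-based input handling that such a comparison relies on. It is an assumption about how a CLI of this kind could be wired, not the repository's actual code: the helper names `parse_out_len`, `read_all` and `to_hex` are hypothetical, and the real binary may stream its input in blocks rather than buffer it all, as the Python one-liner does.

```rust
use std::io::{self, Read};

/// Parse the requested output length (in bytes) from the first CLI argument.
/// Hypothetical helper; the usage string mirrors the command line shown above.
fn parse_out_len(arg: Option<String>) -> Result<usize, String> {
    arg.ok_or_else(|| "usage: shake128 [HASH SIZE (in bytes)] < input".to_string())?
        .parse::<usize>()
        .map_err(|e| format!("invalid hash size: {}", e))
}

/// Read the whole input into memory from any reader (STDIN in the benchmark).
fn read_all<R: Read>(mut reader: R) -> io::Result<Vec<u8>> {
    let mut buf = Vec::new();
    reader.read_to_end(&mut buf)?;
    Ok(buf)
}

/// Render a digest as lowercase hexadecimal, the format the other tools print.
fn to_hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}
```

In a `main` function these would be called as `parse_out_len(std::env::args().nth(1))` and `read_all(std::io::stdin().lock())`, with the digest passed through `to_hex` before printing.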