continuation of multi-shard + ram bus #1084
Conversation
Related PRs:
- multi shards and continuation support: "This PR add …"
- shards mem records tracking: "refer …"
## Summary

### Sub tasks
- [x] $N = 2^n$ septic curve points accumulation (in **one layer**) using [Quark](https://eprint.iacr.org/2020/1275)
  - cmd: `cargo test --release --lib test_ecc_quark_prover --features "sanity-check" -- --nocapture`
- [x] `Global` chip @kunxian-xia
  - [x] constraints
  - [x] debugging `sum != p[0] + p[1]`
- [x] enable poseidon2
- [x] #1093
- [x] enable non power-of-two #1081

---------

Co-authored-by: Ming <hero78119@gmail.com>
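The "$N = 2^n$ points in **one layer**" sub-task above contrasts a layered (log-depth) fold with a single flat accumulation. The folding pattern can be illustrated with a toy sketch, using plain integers under `+` as a stand-in for septic-extension curve points under group addition; this is not the actual Ceno chip and omits the Quark-style sumcheck entirely:

```rust
// Toy illustration of accumulating N = 2^n elements.
// Integers with `+` stand in for septic curve points with group addition.

/// Layered accumulation: log2(N) halving layers, each summing adjacent pairs.
fn accumulate_layered(mut points: Vec<u64>) -> u64 {
    assert!(points.len().is_power_of_two() && !points.is_empty());
    while points.len() > 1 {
        points = points.chunks(2).map(|p| p[0] + p[1]).collect();
    }
    points[0]
}

/// "One layer" accumulation: a single flat pass over all N elements,
/// as one sumcheck instance would constrain it.
fn accumulate_one_layer(points: &[u64]) -> u64 {
    points.iter().sum()
}

fn main() {
    let points: Vec<u64> = (1..=8).collect(); // N = 2^3
    let layered = accumulate_layered(points.clone());
    let flat = accumulate_one_layer(&points);
    assert_eq!(layered, flat); // both compute the same group sum
    println!("{layered}");
}
```

Both routes compute the same group sum; doing it in one layer trades circuit depth for a wider single constraint, which is what the Quark approach proves succinctly.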
build on top of #1084

### changes
- [x] (cpu) migrate all tables to gkr-iop
- [x] fixing verifier logic and e2e
- [x] refactor code under gpu module
- [x] code cleanup & benchmark

### benchmark

**With CPU**: performance regressed by 3-4%.

| Benchmark | Median Time (s) | Median Change (%) |
|------------------------------|------------------|------------------------------------|
| fibonacci_max_steps_1048576 | 2.8744 | +4.06% (Performance has regressed) |
| fibonacci_max_steps_2097152 | 5.0004 | +3.31% (Performance has regressed) |
| fibonacci_max_steps_4194304 | 9.3081 | +3.08% (Performance has regressed) |

**With GPU** (GPU: 5070 Ti, CPU: 5900XT): a slight regression on the smaller sizes is expected; on the largest workload performance even improves, likely because there are fewer CUDA kernel invocations during e2e.

| Benchmark | GPU (New) | Δ Change (%) |
|------------------------------|-----------|------------------------------------|
| fibonacci_max_steps_1048576 | 0.979 s | +3.63% (Performance has regressed) |
| fibonacci_max_steps_2097152 | 1.344 s | +3.92% (Performance has regressed) |
| fibonacci_max_steps_4194304 | 2.222 s | -14.87% (Performance has improved) |

---------

Co-authored-by: kunxian xia <xiakunxian130@gmail.com>
**kunxian-xia** left a comment
LGTM 🎉
To deal with more than one shard and satisfy the offline memory-checking constraints
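The offline memory constraint across shards can be sketched as a multiset argument: every memory record a shard consumes (reads) must be produced (written) somewhere, and when execution is split across shards the per-shard records must still balance globally. Below is a minimal illustration with a hypothetical `(addr, ts, value)` record layout; the actual RAM bus proves this balance cryptographically (e.g. via fingerprinted multiset or logup-style arguments), not by literal counting:

```rust
use std::collections::HashMap;

/// Hypothetical memory-access record: (address, timestamp, value).
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct Record { addr: u32, ts: u64, value: u64 }

/// Per-shard trace: records the shard consumed (reads) and produced (writes).
struct ShardTrace { reads: Vec<Record>, writes: Vec<Record> }

/// Global check: across all shards, the multiset of consumed records must
/// equal the multiset of produced records. (Initial memory counts as writes
/// before the first shard; final memory state counts as reads after the last.)
fn globally_consistent(shards: &[ShardTrace]) -> bool {
    let mut count: HashMap<Record, i64> = HashMap::new();
    for s in shards {
        for w in &s.writes { *count.entry(*w).or_insert(0) += 1; }
        for r in &s.reads  { *count.entry(*r).or_insert(0) -= 1; }
    }
    count.values().all(|&c| c == 0)
}

fn main() {
    // Shard 0 writes addr 0x10 at ts=1; shard 1 reads that same record:
    // the two shards balance even though neither is consistent alone.
    let s0 = ShardTrace { reads: vec![], writes: vec![Record { addr: 0x10, ts: 1, value: 7 }] };
    let s1 = ShardTrace { reads: vec![Record { addr: 0x10, ts: 1, value: 7 }], writes: vec![] };
    assert!(globally_consistent(&[s0, s1]));

    // A read with no matching write anywhere breaks global consistency.
    let bad = ShardTrace { reads: vec![Record { addr: 0x20, ts: 2, value: 9 }], writes: vec![] };
    assert!(!globally_consistent(&[bad]));
    println!("ok");
}
```

The point of the continuation design is exactly the first case: a record written in one shard may be read in a later shard, so consistency can only be enforced over the union of all shard traces.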
related to #1061, #1063, #699, #698, #697, #696, #700
change scope
benchmarks
Benchmarked fibonacci on a single chunk on CPU (5900XT, 32 cores, 64 GB RAM), which shows no performance impact.