igzip/riscv64: Add adler32_rvv optimization for VLEN=128 #374

leiwen2025 · 2025-11-20T08:23:19Z

This PR adds an adler32_rvv implementation optimized for VLEN=128.

Benchmarked on the SG2044 platform, adler32_warm throughput rises from 3015.15 MB/s to 7541.43 MB/s, roughly 2.5x:

SG2044:
        new: adler32_warm: runtime =    3062471 usecs, bandwidth 23095 MB in 3.0625 sec = 7541.43 MB/s
        old: adler32_warm: runtime =    3062465 usecs, bandwidth 9233 MB in 3.0625 sec = 3015.15 MB/s

Signed-off-by: WenLei <lei.wen2@zte.com.cn>

pablodelara · 2025-11-24T15:23:54Z

@sunyuechi can you review this? Thanks!

sunyuechi · 2025-12-03T15:38:44Z

igzip/riscv64/igzip_isal_adler32_rvv128.S

+    addi    sp, sp, -32
+    sd      ra, 24(sp)
+    sd      s1, 16(sp)
+    sd      s2, 8(sp)


You can use the unused registers (at least a7 and t5) to reduce stack operations.

sunyuechi · 2025-12-03T15:43:56Z

igzip/riscv64/igzip_isal_adler32_rvv128.S

+    slli    s1, a0, 48
+    srli    s1, s1, 48              // s1: A = adler32 & 0xffff
+    srliw   s2, a0, 16              // s2: B = adler32 >> 16
+    add     s3, a1, a2              // s3 = end
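
For readers following the snippet above: the comments show the incoming checksum being split into its two 16-bit halves (A in the low half, B in the high half) and an end pointer being formed from the buffer and length. A minimal scalar sketch of the same setup in C follows. The function name `adler32_ref` and the per-byte modulo are illustrative assumptions and are not taken from the PR; only the A/B split and the `end = buf + len` computation mirror the diff.

```c
#include <stddef.h>
#include <stdint.h>

#define ADLER_MOD 65521u /* largest prime below 2^16 */

/* Hypothetical scalar reference, NOT the code under review. */
static uint32_t adler32_ref(uint32_t adler, const uint8_t *buf, size_t len)
{
    uint32_t a = adler & 0xffffu;   /* A: low 16 bits (s1 in the diff) */
    uint32_t b = adler >> 16;       /* B: high 16 bits (s2 in the diff) */
    const uint8_t *end = buf + len; /* end pointer (s3 in the diff) */

    while (buf < end) {
        a = (a + *buf++) % ADLER_MOD; /* A accumulates the bytes */
        b = (b + a) % ADLER_MOD;      /* B accumulates the running A values */
    }
    return (b << 16) | a;             /* repack as B:A */
}
```

Because the state is just the packed (B, A) pair, a checksum can be resumed by passing a previous result back in as `adler`.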



sunyuechi · 2025-12-05T10:51:00Z

igzip/riscv64/igzip_isal_adler32_rvv128.S

+    la      a7, factors
+    vle8.v  v0, (a7)
+    vmv.v.i v4, 0
+    vmv.v.i v8, 0


v4 hasn’t been modified, so you can just use v4.

leiwen2025 · 2025-12-06

Done, thanks for the review!
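
As background on the `factors` load in the snippet above: a common way to vectorize Adler-32 is to process the input in fixed-size blocks and fold each block into B with per-byte weights. The sketch below shows that block identity in plain C. It assumes, without confirmation from the diff shown here, that `factors` plays the role of such a weight table; the block size of 16 (one 128-bit vector of bytes) and the name `adler32_blocked` are likewise illustrative.

```c
#include <stddef.h>
#include <stdint.h>

#define ADLER_MOD 65521u
#define BLOCK 16u /* assumed: bytes in one 128-bit vector */

/* Hypothetical block-wise sketch, NOT the PR's implementation. */
static uint32_t adler32_blocked(uint32_t adler, const uint8_t *buf, size_t len)
{
    uint32_t a = adler & 0xffffu;
    uint32_t b = adler >> 16;

    while (len >= BLOCK) {
        uint32_t sum = 0, weighted = 0;
        for (uint32_t i = 0; i < BLOCK; i++) {
            sum += buf[i];                    /* plain byte sum feeds A */
            weighted += (BLOCK - i) * buf[i]; /* weights 16, 15, ..., 1 feed B */
        }
        /* B gains BLOCK copies of the old A plus the weighted byte sum. */
        b = (b + BLOCK * a + weighted) % ADLER_MOD;
        a = (a + sum) % ADLER_MOD;
        buf += BLOCK;
        len -= BLOCK;
    }
    while (len--) { /* scalar tail for the remaining bytes */
        a = (a + *buf++) % ADLER_MOD;
        b = (b + a) % ADLER_MOD;
    }
    return (b << 16) | a;
}
```

The inner loop is the part a vector unit can do in a few instructions: one reduction for the byte sum and one multiply-accumulate against the weight vector.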

sunyuechi · 2025-12-05T10:52:17Z

igzip/riscv64/igzip_isal_adler32_rvv128.S

+    mv      t2, t1
+1:
+    mv      a3, t5
+    mv      a4, t6


t5, t6 -> a3, a4
update a3, a4
a3, a4 -> t5, t6

This round trip doesn't seem to be needed here. Is it fine to just update t5 and t6 directly?

leiwen2025 · 2025-12-06

Done, thanks for the review!


leiwen2025 added 2 commits November 20, 2025 16:05

igzip/riscv64: Add RVV optimization for VLEN=128

dbebf57

Signed-off-by: WenLei <lei.wen2@zte.com.cn>

fix clang-format CI failure

208333e

Signed-off-by: WenLei <lei.wen2@zte.com.cn>

sunyuechi reviewed Dec 3, 2025

Update code

35e4d5d

Signed-off-by: WenLei <lei.wen2@zte.com.cn>

sunyuechi reviewed Dec 5, 2025

fix: Optimize register usage

7ed6de8

Signed-off-by: WenLei <lei.wen2@zte.com.cn>

leiwen2025 force-pushed the rv64-igzip-adler32rvv128 branch from 787635b to 7ed6de8 Compare December 6, 2025 11:24
