@yuzegao yuzegao commented Oct 30, 2025

Background

When Predixy runs in Cluster mode (e.g., with Kvrocks), it encodes the backend SCAN cursor together with the Server Group ID so a single client-side cursor can scan across groups. Kvrocks may return very large cursors due to its RocksDB iterator state.

Problem

SCAN loops never complete because the returned cursor never reaches 0.
Monitor output shows steadily increasing cursors being sent to the backend.

Root Cause

Predixy encodes the client-visible cursor as: clientCursor = (backendCursor << 10) | groupId.
Kvrocks cursors can be close to 2^64, so the left shift by 10 needs up to 74 bits and pushes the value beyond the 64-bit limit.
With 64-bit integers the shift overflows for any backend cursor of 2^54 or more, corrupting both the cursor and the embedded group ID, which leads to incorrect routing and an endless scan loop.
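
To make the overflow concrete, here is a minimal sketch of the 64-bit encoding described above. The names kGroupBits, kGroupMask, encode64, decodeBackend64 and survives64 are hypothetical and chosen for illustration; they are not taken from the Predixy source.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout, per the description: low 10 bits hold the group ID,
// the remaining bits hold the backend cursor.
constexpr unsigned kGroupBits = 10;
constexpr std::uint64_t kGroupMask = (std::uint64_t{1} << kGroupBits) - 1;

// 64-bit encoding: any bits of backendCursor at position 54 or above
// are shifted out of the word and silently lost.
std::uint64_t encode64(std::uint64_t backendCursor, std::uint64_t groupId) {
    assert(groupId <= kGroupMask);
    return (backendCursor << kGroupBits) | groupId;
}

// Recover the backend cursor from a 64-bit client cursor.
std::uint64_t decodeBackend64(std::uint64_t clientCursor) {
    return clientCursor >> kGroupBits;
}

// True when the backend cursor survives a 64-bit encode/decode round trip.
bool survives64(std::uint64_t backendCursor, std::uint64_t groupId) {
    return decodeBackend64(encode64(backendCursor, groupId)) == backendCursor;
}
```

In this sketch, survives64 holds for every backend cursor below 2^54 and fails from 2^54 upward. A backend cursor of exactly 2^54 decodes back to 0, so the next request would restart that group's scan from the beginning.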

Fix

Use 128-bit integers (__uint128_t) to safely parse, encode, and decode SCAN cursors.
Manually convert between strings and 128-bit integers (standard printf/stringstream do not support __uint128_t).
Preserve the existing 10-bit group encoding scheme.
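
The manual string conversion could look roughly like the following sketch. uint128ToString here is a free function named after the Util::uint128ToString() helper mentioned in the commit message, but its real signature is not shown in this PR description, so treat both functions (and parseUint128 in particular, which is a hypothetical name) as assumptions. It relies on the GCC/Clang __uint128_t extension.

```cpp
#include <algorithm>
#include <string>

// Parse a decimal string into a 128-bit unsigned integer.
// Returns false on empty input, non-digit characters, or overflow.
bool parseUint128(const std::string& s, __uint128_t& out) {
    if (s.empty()) return false;
    const __uint128_t max = ~static_cast<__uint128_t>(0);
    __uint128_t v = 0;
    for (char c : s) {
        if (c < '0' || c > '9') return false;
        unsigned d = static_cast<unsigned>(c - '0');
        // Reject the digit if v * 10 + d would exceed 2^128 - 1.
        if (v > (max - d) / 10) return false;
        v = v * 10 + d;
    }
    out = v;
    return true;
}

// Serialize a 128-bit unsigned integer as a decimal string.
std::string uint128ToString(__uint128_t v) {
    if (v == 0) return "0";
    std::string s;
    while (v > 0) {
        // Emit the least significant digit first, then reverse at the end.
        s.push_back(static_cast<char>('0' + static_cast<int>(v % 10)));
        v /= 10;
    }
    std::reverse(s.begin(), s.end());
    return s;
}
```

The largest value the 10-bit scheme can produce from a 64-bit backend cursor is 2^74 - 1, which these helpers would read and write as "18889465931478580854783".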

Implementation

src/Handler.cpp:
Request path: parse client cursor as __uint128_t, extract group via mask, derive backend cursor via right shift.
Response path: parse backend cursor as __uint128_t, left shift by 10, OR with group ID, then serialize to string.
Add debug logs tracing client/server cursors and group IDs.
src/Request.h:
Change signature: void adjustScanCursor(__uint128_t cursor);
src/Request.cpp:
Implement adjustScanCursor with manual 128-bit-to-string conversion.
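
The request and response paths listed above can be sketched as a pair of functions. This is an illustration under stated assumptions, not the code in src/Handler.cpp: DecodedCursor, decodeClientCursor and encodeClientCursor are hypothetical names, and the backend cursor is assumed to fit in 64 bits.

```cpp
#include <cassert>
#include <cstdint>

constexpr unsigned kGroupBits = 10;
constexpr __uint128_t kGroupMask = (static_cast<__uint128_t>(1) << kGroupBits) - 1;

// Result of splitting a client-visible cursor into its two parts.
struct DecodedCursor {
    std::uint64_t backendCursor;  // cursor to forward to the backend
    unsigned groupId;             // server group that owns this cursor
};

// Request path: extract the group via mask, derive the backend cursor
// via right shift. Bits above position 73 would be truncated here, so a
// real implementation may want to reject such input.
DecodedCursor decodeClientCursor(__uint128_t clientCursor) {
    DecodedCursor d;
    d.groupId = static_cast<unsigned>(clientCursor & kGroupMask);
    d.backendCursor = static_cast<std::uint64_t>(clientCursor >> kGroupBits);
    return d;
}

// Response path: widen to 128 bits before shifting so no bits are lost,
// then OR in the group ID.
__uint128_t encodeClientCursor(std::uint64_t backendCursor, unsigned groupId) {
    assert(groupId <= 1023);
    return (static_cast<__uint128_t>(backendCursor) << kGroupBits) | groupId;
}
```

The cast to __uint128_t has to happen before the shift. Shifting the 64-bit value first and widening afterwards would reproduce the original overflow.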

Compatibility and Performance

Requires compiler support for __uint128_t (GCC ≥ 4.6 or Clang ≥ 3.0).
Negligible CPU overhead; minor memory increase per cursor (16 bytes vs 8).
Cursor strings may be longer over the wire (up to 23 decimal digits for a 74-bit value, versus 20 digits for a 64-bit value); functional impact is minimal.

Testing and Verification

Run SCAN from a client through Predixy against Kvrocks in Cluster mode:
Cursor advances with large decimal values and eventually returns "0".
All keys are visited without duplication.
Enable Debug logs to verify:
Correct client → server decode (group extraction, right shift).
Correct server → client encode (left shift, OR group).
Final cursor "0" round trip.

…eusability

- Use __uint128_t to handle large cursor values from Kvrocks (up to 64-bit)
- Prevent integer overflow when encoding cursor with Group ID
- Add Util::uint128ToString() utility function to eliminate code duplication
- Refactor Handler.cpp and Request.cpp to use the new utility function
- Add debug logging for cursor transformation tracking

This fix resolves the SCAN cursor infinite-loop issue when using Predixy with Kvrocks in Cluster mode.
@yuzegao yuzegao force-pushed the fix/ScanCompatible64bitCursor branch from e268cdd to b059744 Compare October 30, 2025 10:36