
Feature Request: Expose ingestion tuning parameters for faster backfills #750

@Evan-Kim2028

Summary

The DeepBook indexer currently uses hardcoded defaults for the sui-indexer-alt-framework ingestion configuration. Exposing these parameters via CLI arguments would allow users to tune backfill performance based on their hardware and network capabilities.

Problem

When running backfills (e.g., syncing 1 week or 1 month of historical data), the indexer uses the framework's default ingestion settings:

  • checkpoint_buffer_size: 5,000
  • ingest_concurrency: 200
  • retry_interval_ms: 200

These defaults are conservative. Users with capable hardware and good network connectivity cannot tune these values without modifying the source code.

Proposed Solution

Expose three additional CLI arguments in main.rs:

```rust
/// Buffer size for checkpoints between ingestion and processing.
/// Higher values use more memory but provide smoother throughput.
#[clap(env, long, default_value = "5000")]
checkpoint_buffer_size: usize,

/// Number of concurrent checkpoint fetches from the remote store.
/// Higher values improve ingestion speed but increase network load.
#[clap(env, long, default_value = "200")]
ingest_concurrency: usize,

/// Retry interval for missing checkpoints in milliseconds.
/// Lower values reduce latency but may cause unnecessary retries.
#[clap(env, long, default_value = "200")]
retry_interval_ms: u64,
```
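Because each field carries `#[clap(env, long)]`, every parameter could be set by CLI flag, by environment variable, or left to fall back to the hardcoded default. A minimal stdlib-only sketch of that resolution order (the function and names here are illustrative, not part of clap or the framework):

```rust
// Illustrative sketch of the precedence `#[clap(env, long)]` gives a field:
// explicit CLI flag wins, then the environment variable, then the default.
fn resolve(cli: Option<usize>, env: Option<&str>, default: usize) -> usize {
    cli.or_else(|| env.and_then(|v| v.parse().ok()))
        .unwrap_or(default)
}

fn main() {
    // No flag, no env var: the default applies (current behavior is preserved).
    assert_eq!(resolve(None, None, 5000), 5000);
    // Env var set, no flag: the env value is used.
    assert_eq!(resolve(None, Some("15000"), 5000), 15000);
    // An explicit flag overrides both.
    assert_eq!(resolve(Some(800), Some("15000"), 5000), 800);
    println!("ok");
}
```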

Then pass these to a custom IngestionConfig instead of Default::default():

```rust
let ingestion_config = IngestionConfig {
    checkpoint_buffer_size,
    ingest_concurrency,
    retry_interval_ms,
    ..Default::default()
};

let mut indexer = Indexer::new(
    store,
    indexer_args,
    client_args,
    ingestion_config,  // instead of Default::default()
    // ...
);
```

Sample Performance Results

Testing with more aggressive settings on a local machine showed a meaningful speedup:

| Setting | Default | Aggressive |
| --- | --- | --- |
| checkpoint_buffer_size | 5,000 | 15,000 |
| ingest_concurrency | 200 | 800 |
| retry_interval_ms | 200 | 100 |
| Observed speed | ~450 cps | ~670 cps |
| Speedup | 1.0x | ~1.5x |

Note: These are sample results from a single test. Actual performance will vary based on hardware, network conditions, and the checkpoint store's capacity. The optimal values and maximum achievable speedup are not yet determined.
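To make the throughput numbers concrete, the wall-clock impact on a long backfill is simple division. The 5,000,000-checkpoint range below is purely hypothetical and chosen only for illustration; substitute the real size of your backfill window:

```rust
// Rough wall-clock estimate for a backfill at a given checkpoint rate (cps).
fn backfill_hours(checkpoints: f64, cps: f64) -> f64 {
    checkpoints / cps / 3600.0
}

fn main() {
    let range = 5_000_000.0; // hypothetical number of checkpoints to backfill
    let default_h = backfill_hours(range, 450.0);
    let tuned_h = backfill_hours(range, 670.0);
    // ~450 cps -> ~3.1 h; ~670 cps -> ~2.1 h for this hypothetical range
    println!("default: {:.1} h, tuned: {:.1} h", default_h, tuned_h);
    assert!(default_h > tuned_h);
}
```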

Benefits

  1. Faster backfills - Users can tune for their specific hardware/network
  2. No breaking changes - Defaults remain the same as current behavior
  3. Minimal code change - ~20 lines of additional code
  4. Aligns with framework capabilities - These parameters are already supported by sui-indexer-alt-framework, just not exposed

Additional Context

Happy to submit a PR if this approach looks reasonable.
