Summary
The DeepBook indexer currently uses hardcoded defaults for the sui-indexer-alt-framework ingestion configuration. Exposing these parameters via CLI arguments would allow users to tune backfill performance based on their hardware and network capabilities.
Problem
When running backfills (e.g., syncing 1 week or 1 month of historical data), the indexer uses the framework's default ingestion settings:
- `checkpoint_buffer_size`: 5,000
- `ingest_concurrency`: 200
- `retry_interval_ms`: 200
These defaults are conservative. Users with capable hardware and good network connectivity cannot tune these values without modifying the source code.
Proposed Solution
Expose three additional CLI arguments in `main.rs`:

```rust
/// Buffer size for checkpoints between ingestion and processing.
/// Higher values use more memory but provide smoother throughput.
#[clap(env, long, default_value = "5000")]
checkpoint_buffer_size: usize,

/// Number of concurrent checkpoint fetches from the remote store.
/// Higher values improve ingestion speed but increase network load.
#[clap(env, long, default_value = "200")]
ingest_concurrency: usize,

/// Retry interval for missing checkpoints in milliseconds.
/// Lower values reduce latency but may cause unnecessary retries.
#[clap(env, long, default_value = "200")]
retry_interval_ms: u64,
```
Then pass these to a custom `IngestionConfig` instead of `Default::default()`:

```rust
let ingestion_config = IngestionConfig {
    checkpoint_buffer_size,
    ingest_concurrency,
    retry_interval_ms,
    ..Default::default()
};

let mut indexer = Indexer::new(
    store,
    indexer_args,
    client_args,
    ingestion_config, // Instead of Default::default()
    // ...
)
```
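A side note on the `..Default::default()` struct-update syntax used above: only the named fields are overridden, and every other field keeps its default. A minimal std-only sketch with a stand-in config type (the real `IngestionConfig` lives in sui-indexer-alt-framework; the default values here are taken from the issue text, and the type is otherwise hypothetical):

```rust
// Stand-in for the framework's config type, to illustrate struct update.
#[derive(Debug)]
struct IngestionConfig {
    checkpoint_buffer_size: usize,
    ingest_concurrency: usize,
    retry_interval_ms: u64,
}

impl Default for IngestionConfig {
    fn default() -> Self {
        Self {
            checkpoint_buffer_size: 5000,
            ingest_concurrency: 200,
            retry_interval_ms: 200,
        }
    }
}

fn main() {
    // Only ingest_concurrency comes "from the CLI" here; the rest fall
    // through to Default::default().
    let config = IngestionConfig {
        ingest_concurrency: 800,
        ..Default::default()
    };
    assert_eq!(config.checkpoint_buffer_size, 5000); // untouched field keeps its default
    assert_eq!(config.ingest_concurrency, 800);      // overridden field takes the new value
    println!("{:?}", config);
}
```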
Sample Performance Results
Testing with more aggressive settings on a local machine showed a meaningful speedup (cps = checkpoints per second):
| Setting | Default | Aggressive |
| --- | --- | --- |
| checkpoint_buffer_size | 5,000 | 15,000 |
| ingest_concurrency | 200 | 800 |
| retry_interval_ms | 200 | 100 |
| Observed speed | ~450 cps | ~670 cps |
| Speedup | 1.0x | ~1.5x |
Note: These are sample results from a single test. Actual performance will vary based on hardware, network conditions, and the checkpoint store's capacity. The optimal values and maximum achievable speedup are not yet determined.
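For a sense of what those rates mean in practice, here is a quick back-of-envelope conversion from checkpoints per second to wall-clock hours. The 10,000,000-checkpoint backfill size is a made-up illustration, not a measured figure:

```rust
/// Hours needed to ingest `checkpoints` checkpoints at `cps` checkpoints/second.
fn backfill_hours(checkpoints: u64, cps: f64) -> f64 {
    checkpoints as f64 / cps / 3600.0
}

fn main() {
    let n = 10_000_000; // hypothetical backfill size
    println!("default:    {:.1} h", backfill_hours(n, 450.0)); // ~6.2 h
    println!("aggressive: {:.1} h", backfill_hours(n, 670.0)); // ~4.1 h
}
```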
Benefits
- Faster backfills - Users can tune for their specific hardware/network
- No breaking changes - Defaults remain the same as current behavior
- Minimal code change - ~20 lines of additional code
- Aligns with framework capabilities - These parameters are already supported by sui-indexer-alt-framework, just not exposed
Additional Context
Happy to submit a PR if this approach looks reasonable.
The sui-indexer-alt-framework documentation discusses tuning these values: https://docs.sui.io/guides/developer/advanced/custom-indexer/indexer-runtime-perf