Add batch insert operations to storage system #282

Open
wbssbw wants to merge 9 commits into obelisk:storage/refactor-apis from wbssbw:storage/batch-insert

wbssbw (Collaborator) commented Mar 12, 2026

Summary

Implements insert_batch and insert_batch_shared functions for the storage system, enabling atomic insertion of multiple key/value pairs in a single operation. This improves performance for bulk storage operations and reduces round trips between guest modules and the storage layer.

Changes

Runtime Storage Functions (runtime/plaid/src/functions/storage/)

  • Added insert_batch and insert_batch_shared host functions that accept JSON-serialized arrays of Item structs
  • Implemented storage limit accounting for batch operations, checking total size before writing any data
  • Properly accounts for existing keys by subtracting old value sizes from usage calculations
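
The limit accounting described above could look roughly like the following. This is a minimal sketch, not the code from this PR: the `Item` field names, the `BatchError` type, the `projected_usage` helper, the choice to count only value bytes, and the handling of duplicate keys within a batch are all assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the PR's `Item`; the field names are assumptions.
#[derive(Debug, Clone)]
pub struct Item {
    pub key: String,
    pub value: Vec<u8>,
}

/// Hypothetical error type for this sketch (not one of the PR's variants).
#[derive(Debug, PartialEq)]
pub enum BatchError {
    /// The batch would push usage past the limit; nothing has been written.
    LimitExceeded { projected: u64, limit: u64 },
}

/// Computes the usage that would result from applying `batch`, without
/// writing anything. Assumes usage is measured in value bytes only.
pub fn projected_usage(
    existing: &HashMap<String, Vec<u8>>,
    current_usage: u64,
    limit: u64,
    batch: &[Item],
) -> Result<u64, BatchError> {
    // Assumption: if a key appears more than once in the batch, only its
    // last value is stored, so only that value is counted.
    let mut last: HashMap<&str, &Item> = HashMap::new();
    for item in batch {
        last.insert(item.key.as_str(), item);
    }

    let mut added: u64 = 0;
    let mut freed: u64 = 0;
    for (key, item) in &last {
        added += item.value.len() as u64;
        // Overwriting an existing key frees the bytes of its old value.
        if let Some(old) = existing.get(*key) {
            freed += old.len() as u64;
        }
    }

    let projected = current_usage.saturating_sub(freed) + added;
    if projected > limit {
        Err(BatchError::LimitExceeded { projected, limit })
    } else {
        Ok(projected)
    }
}
```

Because the check runs over the whole batch before any write, a rejected batch leaves stored data and usage counters untouched.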

Storage Provider Implementations (runtime/plaid/src/storage/)

  • Added insert_batch trait method to StorageProvider
  • DynamoDB: implemented using TransactWriteItems for atomic batch writes
  • InMemory: implemented with write-locked batch insertion
  • Sled: returns Unimplemented error
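
As a rough illustration of the provider split above, the sketch below shows a trait method with a write-locked in-memory implementation and a backend that reports `Unimplemented`. It is deliberately synchronous and standard-library only; the real trait signature, the `Item` fields, the payload of `Unimplemented`, the `LockPoisoned` variant, and the `SledStub` type are assumptions, not the PR's actual definitions.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Hypothetical stand-in for the PR's `Item`; the field names are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct Item {
    pub key: String,
    pub value: Vec<u8>,
}

/// Simplified error type; `Unimplemented` is named in the PR, but its
/// payload here is a guess, and `LockPoisoned` is purely illustrative.
#[derive(Debug, PartialEq)]
pub enum StorageError {
    Unimplemented(String),
    LockPoisoned,
}

/// Synchronous simplification of a storage provider trait.
pub trait StorageProvider {
    fn insert_batch(&self, namespace: &str, items: &[Item]) -> Result<(), StorageError>;
}

/// In-memory backend: namespace -> (key -> value), guarded by one lock.
pub struct InMemory {
    data: RwLock<HashMap<String, HashMap<String, Vec<u8>>>>,
}

impl InMemory {
    pub fn new() -> Self {
        InMemory { data: RwLock::new(HashMap::new()) }
    }

    pub fn get(&self, namespace: &str, key: &str) -> Option<Vec<u8>> {
        let guard = self.data.read().ok()?;
        guard.get(namespace)?.get(key).cloned()
    }
}

impl StorageProvider for InMemory {
    fn insert_batch(&self, namespace: &str, items: &[Item]) -> Result<(), StorageError> {
        // The write lock is held for the whole loop, so readers see either
        // none of the batch or all of it.
        let mut guard = self.data.write().map_err(|_| StorageError::LockPoisoned)?;
        let ns = guard.entry(namespace.to_string()).or_default();
        for item in items {
            ns.insert(item.key.clone(), item.value.clone());
        }
        Ok(())
    }
}

/// Placeholder for a backend without batch support.
pub struct SledStub;

impl StorageProvider for SledStub {
    fn insert_batch(&self, _namespace: &str, _items: &[Item]) -> Result<(), StorageError> {
        Err(StorageError::Unimplemented(
            "insert_batch is not implemented for this backend".to_string(),
        ))
    }
}
```

Holding a single write lock across the loop is what gives the in-memory path its all-or-nothing visibility; a transactional backend would rely on the database's own transaction mechanism instead.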

Standard Library (runtime/plaid-stl/src/plaid/storage.rs)

  • Added Item struct for representing key/value pairs with Serde support
  • Exported insert_batch(&[Item]) and insert_batch_shared(namespace, &[Item]) functions for guest modules
  • Functions serialize items to JSON before passing to host

Testing

  • Added insert_batch test case to test_db module demonstrating bulk insertion with pipe-separated key=value pairs
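
For readers unfamiliar with the input format, the sketch below shows one way a test module might turn a pipe-separated string such as `k1=v1|k2=v2` into a list of items before a batch insert. The `parse_pairs` helper, the `Item` field names, and the error handling are assumptions for illustration; the actual test module may parse its input differently.

```rust
/// Hypothetical stand-in for the PR's `Item`; the field names are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct Item {
    pub key: String,
    pub value: Vec<u8>,
}

/// Parses `key=value` pairs separated by `|`. Empty segments are skipped;
/// a segment without `=` or with an empty key is reported as an error.
pub fn parse_pairs(input: &str) -> Result<Vec<Item>, String> {
    input
        .split('|')
        .filter(|segment| !segment.is_empty())
        .map(|segment| -> Result<Item, String> {
            // Split on the first '=' only, so values may contain '='.
            let (key, value) = segment
                .split_once('=')
                .ok_or_else(|| format!("missing '=' in segment {segment:?}"))?;
            if key.is_empty() {
                return Err(format!("empty key in segment {segment:?}"));
            }
            Ok(Item {
                key: key.to_string(),
                value: value.as_bytes().to_vec(),
            })
        })
        .collect()
}
```

The resulting `Vec<Item>` can then be passed as a slice to a batch insert call in a single step.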

Error Handling

  • Added new StorageError variants: Unimplemented, BuildError, BatchWriteError
  • Batch operations fail as a whole, writing no data, if the batch would exceed the storage limit

Rationale

Inserting many items one at a time requires a separate host function call and storage provider round trip for each item. Batch insertion reduces this overhead for bulk data scenarios and enables atomic multi-item writes where supported (e.g., DynamoDB transactions).

wbssbw marked this pull request as ready for review March 12, 2026 17:27
wbssbw requested review from michelemin and obelisk March 12, 2026 17:27
wbssbw force-pushed the storage/batch-insert branch from f0a8dbb to 1800586 March 12, 2026 18:30
wbssbw marked this pull request as draft March 12, 2026 18:31
wbssbw changed the base branch from main to storage/refactor-apis March 12, 2026 18:34
wbssbw force-pushed the storage/batch-insert branch from 1800586 to 7fa9ab7 March 12, 2026 18:50
wbssbw marked this pull request as ready for review March 12, 2026 18:58

obelisk (Owner) left a comment

Is there a reason sled is not implemented? Also pending discussion.
