Open
Labels
enhancement
Description
Summary
Add a batch processing endpoint for large-scale deduplication workloads.
Motivation
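Some use cases require processing thousands of chunks at once, which is impractical in a single synchronous request. A batch endpoint with asynchronous processing and webhook notifications lets clients submit the work once and collect the results when the job finishes.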
API Design
Submit Batch Job
POST /v1/batch
{
  "chunks": [...], // or "chunks_url": "s3://..."
  "options": {...},
  "webhook_url": "https://example.com/callback"
}
Response:
{
  "job_id": "batch_abc123",
  "status": "queued",
  "estimated_duration_seconds": 120
}
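As a rough illustration of the request shape above, the following sketch builds the JSON body for POST /v1/batch on the client side. The function name `build_batch_request` is hypothetical, chunks are assumed to be plain strings, and treating `chunks` and `chunks_url` as mutually exclusive is an assumption read from the "or" in the example, not a confirmed rule.

```python
import json
from typing import Any, Optional


def build_batch_request(
    chunks: Optional[list] = None,
    chunks_url: Optional[str] = None,
    options: Optional[dict] = None,
    webhook_url: Optional[str] = None,
) -> dict:
    """Build the body for POST /v1/batch (hypothetical client helper)."""
    # Assumption: exactly one input source is given, inline chunks or a URL.
    if (chunks is None) == (chunks_url is None):
        raise ValueError("provide exactly one of 'chunks' or 'chunks_url'")

    body: dict[str, Any] = {}
    if chunks is not None:
        body["chunks"] = list(chunks)
    else:
        body["chunks_url"] = chunks_url
    # Optional fields are left out entirely when not supplied.
    if options:
        body["options"] = options
    if webhook_url:
        body["webhook_url"] = webhook_url
    return body


if __name__ == "__main__":
    payload = build_batch_request(
        chunks=["first chunk", "second chunk"],
        webhook_url="https://example.com/callback",
    )
    print(json.dumps(payload, indent=2))
```

The resulting dictionary can be passed to whichever HTTP client the caller already uses.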
Check Status
GET /v1/batch/batch_abc123
Response:
{
  "job_id": "batch_abc123",
  "status": "processing", // one of: queued, processing, completed, failed
  "progress": 0.45,
  "chunks_processed": 450,
  "chunks_total": 1000
}
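A client that prefers polling over webhooks could loop on the status endpoint until the job reaches a terminal state. The sketch below is one possible shape for that loop; `wait_for_job` and its parameters are hypothetical, and `fetch_status` stands in for whatever function performs GET /v1/batch/{job_id} and returns the parsed JSON.

```python
import time
from typing import Any, Callable

# Terminal states, taken from the status list in the response example.
TERMINAL_STATES = {"completed", "failed"}


def wait_for_job(
    fetch_status: Callable[[str], dict],
    job_id: str,
    poll_interval: float = 2.0,
    timeout: float = 600.0,
    sleep: Callable[[float], Any] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Poll until the job is completed or failed, or raise TimeoutError."""
    deadline = clock() + timeout
    while True:
        status = fetch_status(job_id)
        if status["status"] in TERMINAL_STATES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout}s")
        sleep(poll_interval)


if __name__ == "__main__":
    # Canned responses stand in for real HTTP calls.
    canned = iter([
        {"job_id": "batch_abc123", "status": "queued"},
        {"job_id": "batch_abc123", "status": "processing", "progress": 0.45},
        {"job_id": "batch_abc123", "status": "completed", "progress": 1.0},
    ])
    final = wait_for_job(lambda _id: next(canned), "batch_abc123", sleep=lambda s: None)
    print(final["status"])
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real delays.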
Get Results
GET /v1/batch/batch_abc123/results
Response:
{
  "job_id": "batch_abc123",
  "status": "completed",
  "results": [...],
  "stats": {...}
}
Components
- Job queue (in-memory or Redis)
- Background worker
- Webhook notifications
- S3/GCS input support
- Result storage and retrieval
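To make the first two components concrete, here is a minimal sketch of the in-memory variant: a `queue.Queue` feeding a single background worker thread that updates the progress fields shown under Check Status. The class name `BatchJobRunner` and the `process_chunk` callback are hypothetical; the callback is a placeholder for the real deduplication step, which this issue does not specify.

```python
import queue
import threading
import uuid
from typing import Any, Callable


class BatchJobRunner:
    """In-memory job queue with one background worker (illustrative only)."""

    def __init__(self, process_chunk: Callable[[Any], Any]) -> None:
        self._process_chunk = process_chunk  # placeholder for the real per-chunk work
        self._queue: queue.Queue = queue.Queue()
        self._jobs: dict = {}
        self._lock = threading.Lock()
        threading.Thread(target=self._work, daemon=True).start()

    def submit(self, chunks: list) -> str:
        job_id = f"batch_{uuid.uuid4().hex[:6]}"
        with self._lock:
            self._jobs[job_id] = {
                "job_id": job_id, "status": "queued", "progress": 0.0,
                "chunks_processed": 0, "chunks_total": len(chunks), "results": [],
            }
        self._queue.put((job_id, list(chunks)))
        return job_id

    def status(self, job_id: str) -> dict:
        with self._lock:
            return {k: v for k, v in self._jobs[job_id].items() if k != "results"}

    def results(self, job_id: str) -> list:
        with self._lock:
            return list(self._jobs[job_id]["results"])

    def wait_idle(self) -> None:
        self._queue.join()  # blocks until every submitted job has been handled

    def _work(self) -> None:
        while True:
            job_id, chunks = self._queue.get()
            try:
                self._run(job_id, chunks)
            finally:
                self._queue.task_done()

    def _run(self, job_id: str, chunks: list) -> None:
        with self._lock:
            self._jobs[job_id]["status"] = "processing"
        try:
            for done, chunk in enumerate(chunks, start=1):
                result = self._process_chunk(chunk)
                with self._lock:
                    job = self._jobs[job_id]
                    job["results"].append(result)
                    job["chunks_processed"] = done
                    job["progress"] = done / len(chunks)
            final = {"status": "completed"}
        except Exception as exc:  # a failing chunk marks the whole job as failed
            final = {"status": "failed", "error": str(exc)}
        with self._lock:
            self._jobs[job_id].update(final)


if __name__ == "__main__":
    runner = BatchJobRunner(process_chunk=str.strip)
    job_id = runner.submit(["  a ", " b", "c  "])
    runner.wait_idle()
    print(runner.status(job_id))
```

A Redis-backed queue would replace `queue.Queue` and the in-process dictionary with shared storage so that workers can run in separate processes; failing the whole job on the first bad chunk is only one possible policy.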
Acceptance Criteria
- Process 10K+ chunks in a single batch
- Progress tracking via polling or webhook
- Results available for 24 hours
- Graceful handling of failures
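The 24-hour retention criterion could be met with a time-to-live check at read time. The sketch below assumes the clock starts when results are stored, which the criteria do not state; `ResultStore` and its methods are hypothetical names, and a production version would more likely rely on the expiry features of the chosen storage backend.

```python
import time
from typing import Any, Callable, Optional

RESULT_TTL_SECONDS = 24 * 60 * 60  # 24 hours, per the acceptance criteria


class ResultStore:
    """In-memory result storage that expires entries after a fixed TTL."""

    def __init__(
        self,
        ttl_seconds: float = RESULT_TTL_SECONDS,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: dict = {}  # job_id -> (stored_at, results)

    def put(self, job_id: str, results: Any) -> None:
        # Assumption: retention is measured from the moment results are stored.
        self._entries[job_id] = (self._clock(), results)

    def get(self, job_id: str) -> Optional[Any]:
        entry = self._entries.get(job_id)
        if entry is None:
            return None
        stored_at, results = entry
        if self._clock() - stored_at >= self._ttl:
            del self._entries[job_id]  # lazily evict expired results
            return None
        return results


if __name__ == "__main__":
    store = ResultStore()
    store.put("batch_abc123", ["result for chunk 1", "result for chunk 2"])
    print(store.get("batch_abc123"))
```

Lazy eviction keeps the example short; a periodic sweep would be needed to reclaim memory for results that are never requested again.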