Collection Module: Enhance smart batching #716

@nishika26

Description

Describe the current behaviour?
Currently, collections use a default batch size of 10, meaning files are uploaded to the vector store 10 at a time. If someone uploads 250+ files, that results in 25 or more upload and API calls on OpenAI's end, which adds polling overhead, causes other issues, and can make collection creation fail. We can't simply raise the default either: if someone uploads many large documents and they are processed together, that would strain the server's memory.
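To make the arithmetic above concrete, here is a minimal sketch of how the number of upload calls grows with a fixed batch size. The function name `upload_calls` is hypothetical and not taken from the codebase; it only assumes one upload call per batch, as described above.

```python
import math


def upload_calls(num_files: int, batch_size: int = 10) -> int:
    """Number of batches (and so upload calls) for a fixed batch size.

    Hypothetical helper for illustration; assumes one upload call per batch.
    """
    return math.ceil(num_files / batch_size)


print(upload_calls(250))  # 250 files in batches of 10
print(upload_calls(251))  # one extra file adds one more batch
```

With the current default, 250 files map to 25 calls, and every additional partial batch adds one more call to poll.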

Describe the enhancement you'd like
Enhance the module to support smart batching with the following improvements:

  • Add a file size column to the documents table
  • Add number-of-documents and total-size columns to the collection job table
  • Implement dynamic smart batching
  • Batch on the basis of an upper limit on either the total size of the documents or the number of documents, whichever is hit first
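
The "whichever is hit first" rule could be sketched roughly as below. This is only an illustration under assumptions, not the module's actual implementation: `Doc`, `size_bytes`, `smart_batches`, `max_batch_bytes` and `max_batch_docs` are hypothetical names, and the issue does not specify what the limits should be.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass(frozen=True)
class Doc:
    # Hypothetical stand-in for a row of the documents table.
    name: str
    size_bytes: int  # corresponds to the proposed file size column


def smart_batches(
    docs: Iterable[Doc],
    max_batch_bytes: int,
    max_batch_docs: int,
) -> Iterator[List[Doc]]:
    """Yield batches, closing each one when adding the next document
    would exceed the size limit or the count limit, whichever is hit first.

    A single document larger than max_batch_bytes still gets a batch
    of its own, so nothing is dropped.
    """
    batch: List[Doc] = []
    batch_bytes = 0
    for doc in docs:
        over_size = batch_bytes + doc.size_bytes > max_batch_bytes
        over_count = len(batch) >= max_batch_docs
        # Only close a batch that already holds at least one document.
        if batch and (over_size or over_count):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(doc)
        batch_bytes += doc.size_bytes
    if batch:
        yield batch


# Placeholder limits for illustration only.
docs = [Doc(f"file_{i}", size) for i, size in enumerate([4, 4, 4, 1, 1, 1, 1, 10])]
for batch in smart_batches(docs, max_batch_bytes=8, max_batch_docs=3):
    print([d.size_bytes for d in batch], sum(d.size_bytes for d in batch))
```

Many small files end up grouped up to the count limit, which keeps the number of upload calls down, while large files are cut off by the size limit, which bounds how much is processed together in memory.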

Metadata

Labels

enhancement (New feature or request)

Projects

Status

In Review
