I'd like to benefit from cheaper LLM processing by using the OpenAI Flex Processing API and the Groq Batch API.
As a fallback mechanism, when a batch is not finished within X minutes, I'd like the batch to be canceled and the requests to fall back to standard LLM processing.
https://developers.openai.com/api/docs/guides/flex-processing
For inspiration
vectorize-io/hindsight#365
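
A minimal sketch of the fallback idea, assuming the Python SDKs for OpenAI (`openai`) and Groq (`groq`): the Groq batch is polled until a deadline, canceled if it hasn't completed, and the same requests are re-run as standard chat completions; for OpenAI, the request is sent with `service_tier="flex"` and retried on the default tier after a timeout. The model name, the deadline value, and the `download_batch_results` helper are illustrative placeholders, not part of the proposal.

```python
import time

import openai
from groq import Groq
from openai import OpenAI

FALLBACK_AFTER_SECONDS = 15 * 60  # the "X minutes" deadline; illustrative value
POLL_INTERVAL_SECONDS = 30

groq_client = Groq()
openai_client = OpenAI()


def groq_batch_with_fallback(input_file_id: str, requests: list[dict]) -> list:
    """Submit a Groq batch; if it isn't done within the deadline, cancel it
    and fall back to standard (synchronous) chat completions."""
    batch = groq_client.batches.create(
        input_file_id=input_file_id,
        endpoint="/v1/chat/completions",
        completion_window="24h",
    )

    deadline = time.monotonic() + FALLBACK_AFTER_SECONDS
    while time.monotonic() < deadline:
        batch = groq_client.batches.retrieve(batch.id)
        if batch.status == "completed":
            return download_batch_results(batch)  # hypothetical helper
        if batch.status in ("failed", "expired", "cancelled"):
            break  # batch is already dead; no need to cancel it
        time.sleep(POLL_INTERVAL_SECONDS)
    else:
        # Deadline reached without completion: cancel before falling back.
        groq_client.batches.cancel(batch.id)

    # Fallback: run the same requests through standard LLM processing.
    return [
        groq_client.chat.completions.create(
            model=req["model"], messages=req["messages"]
        )
        for req in requests
    ]


def download_batch_results(batch) -> list:
    """Placeholder: download and parse the batch's JSONL output file."""
    raise NotImplementedError


def openai_flex_with_fallback(messages: list[dict], model: str = "o4-mini"):
    """Try Flex processing first; on timeout, retry with standard processing."""
    try:
        return openai_client.chat.completions.create(
            model=model,
            messages=messages,
            service_tier="flex",
            timeout=FALLBACK_AFTER_SECONDS,  # per-request timeout, in seconds
        )
    except openai.APITimeoutError:
        return openai_client.chat.completions.create(
            model=model,
            messages=messages,
            service_tier="default",
        )
```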