
feat(helm): add persistent volume for local model cache #861

Merged
nicoloboschi merged 2 commits into vectorize-io:main from isac322:feat/helm-model-cache-volume
Apr 7, 2026

Conversation

@isac322
Contributor

@isac322 isac322 commented Apr 3, 2026

Problem

With local reranker models (e.g., BAAI/bge-reranker-v2-m3) or local embedding models, the model files are downloaded again to /home/hindsight/.cache on every pod restart. This slows startup (a download of 1 GB or more) and wastes bandwidth.

Closes #860

Changes

  • values.yaml: Add api.persistence.modelCache and worker.persistence.modelCache config (disabled by default)
  • api-deployment.yaml: Mount PVC at /home/hindsight/.cache when enabled
  • api-model-cache-pvc.yaml: New PVC template for API model cache
  • worker-statefulset.yaml: Add volumeClaimTemplates for model cache when enabled
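
For illustration, the API-side changes above might render to something like the manifests below when `api.persistence.modelCache.enabled` is true. This is a sketch, not the chart's actual output: the claim name `hindsight-api-model-cache`, the volume name `model-cache`, the container name `api`, and the `ReadWriteOnce` access mode are all assumptions. Only the mount path, size, and storage class come from this PR.

```yaml
# Sketch of a rendered PVC for the API model cache.
# The name and access mode are assumed, not taken from the chart.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: hindsight-api-model-cache   # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce                 # assumed access mode
  storageClassName: standard        # from api.persistence.modelCache.storageClass
  resources:
    requests:
      storage: 5Gi                  # from api.persistence.modelCache.size
---
# Fragment of the API Deployment showing only the volume wiring.
spec:
  template:
    spec:
      containers:
        - name: api                 # hypothetical container name
          volumeMounts:
            - name: model-cache
              mountPath: /home/hindsight/.cache
      volumes:
        - name: model-cache
          persistentVolumeClaim:
            claimName: hindsight-api-model-cache
```

If the claim does use `ReadWriteOnce`, it can be attached to only one node at a time, which matters when the API Deployment runs several replicas on different nodes.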

Usage

api:
  persistence:
    modelCache:
      enabled: true
      size: 5Gi
      storageClass: "standard"

worker:
  persistence:
    modelCache:
      enabled: true
      size: 5Gi

Notes

  • Disabled by default, so there are no breaking changes
  • API uses a standalone PVC (Deployment)
  • Worker uses volumeClaimTemplates (StatefulSet) for per-replica storage
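
The per-replica storage described above could look roughly like the following StatefulSet. It is a minimal sketch under stated assumptions: the StatefulSet name, labels, image, replica count, and access mode are placeholders. Only the mount path and size come from this PR.

```yaml
# Sketch of a worker StatefulSet with a per-replica model cache.
# Names, labels, image, and replica count are hypothetical.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: hindsight-worker
spec:
  serviceName: hindsight-worker
  replicas: 2
  selector:
    matchLabels:
      app: hindsight-worker
  template:
    metadata:
      labels:
        app: hindsight-worker
    spec:
      containers:
        - name: worker
          image: example/hindsight-worker:latest   # placeholder image
          volumeMounts:
            - name: model-cache                    # must match the template name below
              mountPath: /home/hindsight/.cache
  volumeClaimTemplates:
    - metadata:
        name: model-cache
      spec:
        accessModes:
          - ReadWriteOnce                          # assumed access mode
        resources:
          requests:
            storage: 5Gi                           # from worker.persistence.modelCache.size
```

Kubernetes creates one claim per replica, named `<template>-<statefulset>-<ordinal>` (here `model-cache-hindsight-worker-0` and `model-cache-hindsight-worker-1`). By default these claims are kept when the StatefulSet is scaled down or deleted, so cached models survive but the storage has to be cleaned up manually.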

When using a local reranker (e.g., BAAI/bge-reranker-v2-m3) or local
embedding models, the models are downloaded to /home/hindsight/.cache
on every pod restart, causing slow startup and unnecessary bandwidth usage.

Add optional persistent volume support:
- api: PVC mounted at /home/hindsight/.cache
- worker: volumeClaimTemplate (StatefulSet) at same path

Disabled by default. Enable via:
  api.persistence.modelCache.enabled: true
  worker.persistence.modelCache.enabled: true

Closes vectorize-io#860

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@isac322 isac322 force-pushed the feat/helm-model-cache-volume branch from ab2e152 to 4de08f4 Compare April 3, 2026 09:49
Allow users to mount arbitrary volumes (configMaps, secrets, emptyDir,
etc.) into api and worker pods via values, following common helm chart
library conventions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
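
The second commit does not spell out the value keys it adds. As a hedged illustration of the convention it refers to, many charts expose `extraVolumes` and `extraVolumeMounts` lists. Those key names are an assumption here and should be checked against the chart's values.yaml; the ConfigMap name and mount paths are also hypothetical.

```yaml
# Hypothetical values: extraVolumes / extraVolumeMounts are assumed key names,
# following a common chart convention rather than this chart's confirmed schema.
api:
  extraVolumes:
    - name: custom-config
      configMap:
        name: my-config            # hypothetical ConfigMap
  extraVolumeMounts:
    - name: custom-config
      mountPath: /etc/custom       # hypothetical path
      readOnly: true

worker:
  extraVolumes:
    - name: scratch
      emptyDir: {}
  extraVolumeMounts:
    - name: scratch
      mountPath: /tmp/scratch      # hypothetical path
```

Each entry in the volumes list follows the standard Kubernetes pod `volumes` schema, and each mount entry follows the container `volumeMounts` schema, so any volume type the cluster supports can be used.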
Collaborator

@nicoloboschi nicoloboschi left a comment


awesome, thank you!
I wanted to add this some time ago but couldn't find the time, thanks!!!

@nicoloboschi nicoloboschi merged commit cefa755 into vectorize-io:main Apr 7, 2026
4 checks passed
@isac322 isac322 deleted the feat/helm-model-cache-volume branch April 7, 2026 07:26

Development

Successfully merging this pull request may close these issues.

Helm chart: add persistent volume for local model cache (reranker/embeddings)

2 participants