@ambicuity

Summary

This PR implements ChromaDB as a built-in target for CocoIndex, following the pattern established by the LanceDB target. This addresses issue #1214.

Changes

Core Implementation

  • Added python/cocoindex/targets/chromadb.py containing the ChromaDB connector (Phase 1 scope, see below)
  • Implements core CRUD operations (create, read, update, delete)
  • Schema mapping from CocoIndex types to ChromaDB documents
  • Support for vector embeddings storage
  • Proper error handling with helpful messages for missing dependencies
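As a rough illustration of how row-level changes could map onto ChromaDB collection calls, here is a minimal sketch. It is not the code in this PR: `apply_mutations` and its argument shapes are hypothetical, and the only ChromaDB calls assumed are `Collection.upsert(ids=..., embeddings=..., metadatas=...)` and `Collection.delete(ids=...)`. The collection is passed in, so the function works with any object exposing those two methods.

```python
from typing import Any, Mapping, Sequence, Tuple

# Hypothetical shape for illustration: id -> (embedding, metadata)
Upserts = Mapping[str, Tuple[Sequence[float], Mapping[str, Any]]]


def apply_mutations(collection: Any, upserts: Upserts, deletes: Sequence[str]) -> None:
    """Apply a batch of upserts and deletes to a ChromaDB-style collection.

    `collection` is expected to expose `upsert(ids=, embeddings=, metadatas=)`
    and `delete(ids=)`, as chromadb's Collection does.
    """
    if upserts:
        ids = list(upserts)
        collection.upsert(
            ids=ids,
            embeddings=[list(upserts[i][0]) for i in ids],
            metadatas=[dict(upserts[i][1]) for i in ids],
        )
    if deletes:
        # Skip the call entirely when there is nothing to delete.
        collection.delete(ids=list(deletes))
```

With the real library, `collection` could come from something like `chromadb.EphemeralClient().get_or_create_collection("docs")`, where the collection name is a placeholder. Create and update both go through `upsert` here, which keeps the write path idempotent.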

Dependencies

  • Added chromadb>=0.5.0 as optional dependency in pyproject.toml
  • Uses guarded import pattern with clear error messaging
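The guarded import pattern mentioned above can be sketched roughly as follows. The helper name `require_module` and the extras name in the install hint (`cocoindex[chromadb]`) are assumptions for illustration; the actual wording and extras key in this PR may differ.

```python
import importlib
from types import ModuleType


def require_module(name: str, extra: str) -> ModuleType:
    """Import an optional dependency, or fail with an actionable message.

    `extra` is the (assumed) name of the optional-dependency group that
    provides the package.
    """
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        raise ImportError(
            f"The '{name}' package is required for this target. "
            f"Install it with: pip install 'cocoindex[{extra}]'"
        ) from exc
```

A target module would call something like `require_module("chromadb", "chromadb")` at import time or on first use, so users without the optional dependency only see the error when they actually use the target.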

Features (Phase 1 MVP)

  • Basic ChromaDB connector with persistent and in-memory client support
  • Single key field support (matching LanceDB pattern)
  • Conversion of Python types to ChromaDB-compatible formats
  • UUID, Range, Vector, Struct, and Table type support
  • Metadata storage for non-vector fields
  • Embedding storage for vector fields
  • Collection lifecycle management (create/delete/reuse)
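To make the field mapping above more concrete, here is a minimal sketch of splitting one row into a ChromaDB ID, an embedding, and metadata. The function names and the exact conversion rules are assumptions, not the PR's implementation: scalars are passed through, UUIDs become strings, other values (ranges, structs, tables) are JSON-encoded, and `None` values are dropped. It also assumes at most one vector field per row, since a ChromaDB record holds a single embedding.

```python
import json
import uuid
from typing import Any, Collection, Mapping, Optional, Sequence


def to_metadata_value(value: Any) -> Any:
    """Convert a Python value to a scalar usable as ChromaDB metadata."""
    if isinstance(value, (str, int, float, bool)):
        return value
    if isinstance(value, uuid.UUID):
        return str(value)
    # Ranges, structs, and tables are serialized to a JSON string here.
    return json.dumps(value, default=str)


def split_row(
    key_fields: Sequence[str],
    row: Mapping[str, Any],
    vector_fields: Collection[str],
) -> tuple:
    """Return (doc_id, embedding_or_None, metadata) for one row."""
    if len(key_fields) != 1:
        raise ValueError(
            f"Exactly one key field is required, got {len(key_fields)}: {list(key_fields)}"
        )
    key_name = key_fields[0]
    doc_id = str(row[key_name])
    embedding: Optional[list] = None
    metadata: dict = {}
    for name, value in row.items():
        if name == key_name or value is None:
            continue
        if name in vector_fields:
            if embedding is not None:
                raise ValueError("Only one vector field per row is supported in this sketch")
            embedding = [float(x) for x in value]
        else:
            metadata[name] = to_metadata_value(value)
    return doc_id, embedding, metadata
```

JSON-encoding nested values keeps them round-trippable, at the cost of making them opaque to metadata filters.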

Testing

Tested manually against an ephemeral (in-memory) ChromaDB client. Automated tests are not part of this PR and will be added in a follow-up PR.

Future Work (Phase 2)

As discussed in #1214:

  • Vector index configuration with distance metrics
  • Advanced indexing options
  • Performance optimizations
  • Query capabilities

Checklist

  • Implementation follows LanceDB target pattern
  • Optional dependency properly configured
  • Error messages are helpful and actionable
  • Code includes docstrings
  • Single key field requirement enforced
  • Automated tests: not included, deferred to a follow-up PR
  • Documentation: not updated, deferred to a follow-up PR

Fixes #1214

- Implements basic ChromaDB connector with core CRUD operations
- Schema mapping for key/value fields to ChromaDB documents
- Supports vector embeddings storage
- Optional dependency setup
- Follows LanceDB target pattern

Related to cocoindex-io#1214
Add chromadb>=0.5.0 as optional dependency in pyproject.toml

Related to cocoindex-io#1214
Development

Successfully merging this pull request may close these issues.

[FEATURE] support ChromaDB as a builtin target
