[MEDI] Allow extending VectorStoreWriter #6972

@roji

VectorStoreWriter is currently sealed, but there are some good reasons to allow specialized implementations for specific databases, which would extend it. Such specialized implementations could be delivered with the MEVD providers; e.g. the Qdrant MEVD provider package could provide a QdrantStoreWriter (though we'd have to be OK with the added reference to MEDI.Abstractions).

Some reasons/motivations to specialize:

  • Work around provider-specific limitations. For example, we have a (temporary) hack to identify Qdrant, where we use GUIDs as keys (as opposed to strings in all other databases). We'd instead do that in QdrantStoreWriter (the alternative is to expose type support metadata from MEVD itself, see issue).
  • Different databases prefer different kinds of GUIDs, and index their preferred kind much more efficiently (though see #12182 and #11485 as better alternatives):
    • Generate UUIDv7 for PostgreSQL, where they're much more efficient for indexing than random UUIDv4
    • SQL Server has its own special "sequential GUIDs" (link).
  • Account for the differing levels of Top support across databases (see this)

Of course, unsealing VectorStoreWriter isn't enough: we'd also need to expose selected protected APIs to make it actually useful. An alternative is for specialized implementations to simply extend VectorStoreWriter and duplicate code, but I think there is enough actual logic in there to justify extensibility.
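
To make the idea concrete, here is a rough sketch of what a protected key-generation hook could look like. All type and member names below (SketchVectorStoreWriter, GenerateKey, CreateKeyForChunk and the two subclasses) are hypothetical stand-ins, not the actual MEDI or MEVD API; the only real API used is Guid.CreateVersion7(), which requires .NET 9 or later.

```csharp
using System;

// Hypothetical stand-in for an unsealed VectorStoreWriter; NOT the real MEDI type.
public class SketchVectorStoreWriter
{
    // Hypothetical protected hook: the default produces string keys,
    // matching what most databases accept.
    protected virtual object GenerateKey() => Guid.NewGuid().ToString();

    // Shared logic stays in the base class; only key creation varies.
    public object CreateKeyForChunk() => GenerateKey();
}

// Hypothetical Qdrant specialization: GUID keys instead of strings,
// replacing the provider-detection hack.
public sealed class SketchQdrantWriter : SketchVectorStoreWriter
{
    protected override object GenerateKey() => Guid.NewGuid();
}

// Hypothetical PostgreSQL specialization: time-ordered UUIDv7 keys (.NET 9+).
public sealed class SketchPostgresWriter : SketchVectorStoreWriter
{
    protected override object GenerateKey() => Guid.CreateVersion7();
}
```

With a hook like this, the base class keeps the common write logic and each provider package overrides only the database-specific parts. Which members would actually need to become protected is exactly the open design question.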
