-
Notifications
You must be signed in to change notification settings - Fork 499
PGVectorStore: use helpers to convert between nodes and storage #2232
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
PGVectorStore: use helpers to convert between nodes and storage #2232
Conversation
…ps can be rehydrated
🦋 Changeset detectedLatest commit: 58d4731 The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
@llamaindex/autotool
@llamaindex/community
@llamaindex/core
@llamaindex/env
@llamaindex/experimental
llamaindex
@llamaindex/node-parser
@llamaindex/readers
@llamaindex/tools
@llamaindex/wasm-tools
@llamaindex/workflow
@llamaindex/anthropic
@llamaindex/assemblyai
@llamaindex/aws
@llamaindex/clip
@llamaindex/cohere
@llamaindex/deepinfra
@llamaindex/deepseek
@llamaindex/discord
@llamaindex/excel
@llamaindex/fireworks
@llamaindex/google
@llamaindex/groq
@llamaindex/huggingface
@llamaindex/jinaai
@llamaindex/mistral
@llamaindex/mixedbread
@llamaindex/notion
@llamaindex/ollama
@llamaindex/openai
@llamaindex/ovhcloud
@llamaindex/perplexity
@llamaindex/portkey-ai
@llamaindex/replicate
@llamaindex/together
@llamaindex/vercel
@llamaindex/vllm
@llamaindex/voyage-ai
@llamaindex/xai
@llamaindex/astra
@llamaindex/azure
@llamaindex/chroma
@llamaindex/elastic-search
@llamaindex/firestore
@llamaindex/milvus
@llamaindex/mongodb
@llamaindex/pinecone
@llamaindex/postgres
@llamaindex/qdrant
@llamaindex/supabase
@llamaindex/upstash
@llamaindex/weaviate
commit: |
|
hi @marcusschiesser : Do you need anything from me to make this PR or #2237 ready to be merged? (I think those test failures are unrelated.) I'd love for this to be merged because I'm having real trouble getting my forked version to build in my build process. |
packages/core/package.json
Outdated
| "name": "@llamaindex/core", | ||
| "type": "module", | ||
| "version": "0.6.22", | ||
| "version": "0.6.23", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR shouldn't change the versions. Change set is doing this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry @jeremybmerrill there are also other files changing the versions, please revert them
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
geez, sorry about that. I think I've fixed it.
8b845f8 to
43b7d79
Compare
|
@jeremybmerrill please have a look, tests are failing: |
|
@marcusschiesser BLUF: I resolved the first issue; the second one raises a question for you. I have two other recommendations/questions. I've edited this branch to re-hydrate Running this test requires some non-documented local Postgresql setup -- there's a ".env.ci" that assumes a Postgres user named The second failure (expected In One possible solution would be to edit I think the best solution to this is to edit the test to accept |
|
|
@marcusschiesser Thanks! I will edit this PR to add some instructions for testing and to edit the test to accept |
…G.md to flag e2e tests
|
I've made the first two changes in the commit above -- and the e2e tests now pass. I agree that |
As described in #2225, when nodes are persisted to Postgresql and then re-hydrated with
query, they do not retainrelationshipkey/value pair. That means nodes lose their relationship to theirSOURCEdocument. This behavior is (as described in my comment in #2225) specific to the PGVectorStore class; SimpleVectorStore and WeaviateVectorStore properly retain `relationship.The root of the bug is that PGVectorStore doesn't use
nodeToMetadatato dehydrate the node to a single dict and doesn't usemetadataDictToNodeto re-hydrate it. Instead, PGVectorStore just stores the node'smetadata(ignoring the rest of the node data, discarding it).This PR uses
nodeToMetadataandmetadataDictToNode, imitating WeaviateVectorStore. Now, it works for me -- I can dehydrate and rehydrate nodes, and have them retain their relationships.Closes #2225
Before, incorrect value of the
metadatafield in Postgres -- all of these keys are arbitrary user-specific metadata, nothing LlamaIndexTS-related exceptcreate_date:After, with this PR:
I have not added tests with this PR because I'm bad at unit tests and there aren't any tests that I see for PGVectorStore. If required, I will futz around with Copilot and try to get some meaningful tests that pass. Let me know!