Refactor embedding operations and improve configuration #6
Fixes #2
Refactor code to use the `EmbeddingKey` class for embedding operations.

Introduce `EmbeddingKey` class:
- Add an `EmbeddingKey` class to encapsulate `text` and `model` in `src/operations.py`.
- Update `src/operations.py` to use `EmbeddingKey` instead of tuples for keys.
- Update `write_embedding_to_table`, `is_key_in_table`, `list_keys_in_table`, and `get_embedding_from_table` to use `EmbeddingKey`.

Refactor embedding operations:
- Add an `EmbeddingOperations` class in `src/embedding_operations.py` to encapsulate embedding-related operations.
- Move `pickle_embeddings`, `duckdb_embeddings`, and `get_similarity` into the `EmbeddingOperations` class.

Update embedding functions:
- Update the `pickle_embeddings` and `duckdb_embeddings` functions in `src/embedding.py` to use `EmbeddingKey`.

Add error handling:
- Add error handling in `src/openai_client.py`.
- Add error handling in `src/connection.py`.

Add configuration file:
- Add `config.yaml` with database name, model name, paths to files, OpenAI API key, and number of top documents.

Add tests:
- Add tests for the `EmbeddingOperations` class in `tests/test_embedding_operations.py`.
- Add tests for the `config.yaml` file in `tests/test_config.py`.
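
For reviewers who want a picture of the tuple-to-`EmbeddingKey` change, here is a minimal sketch of what such a key might look like. It is an assumption, not the code in `src/operations.py`: the description only says the class encapsulates `text` and `model`, so the frozen dataclass, the plain dict standing in for the table, and the model names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmbeddingKey:
    """Hypothetical key shape; the PR only states it wraps text and model."""

    text: str
    model: str


# A plain dict stands in for the real table (assumption for illustration).
table = {}

# frozen=True makes instances hashable, so they can replace (text, model) tuples.
key = EmbeddingKey(text="hello world", model="example-model")
table[key] = [0.1, 0.2, 0.3]

# Lookups compare by value, so an equal key built elsewhere finds the same row.
same_key = EmbeddingKey(text="hello world", model="example-model")
other_model = EmbeddingKey(text="hello world", model="another-model")
found = same_key in table
missing = other_model in table
```

Compared with a bare tuple, named fields make it harder to swap `text` and `model` by accident, and the frozen dataclass keeps the value-based equality and hashing that tuple keys already provided.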