fix: auto-detect OpenAI embedding provider when API key available (#72) by zonk1024 · Pull Request #583 · ourochronos/valence

zonk1024 · 2026-02-28T19:45:12Z

Problem

The embedding_provider config defaults to "local" but local embeddings aren't set up in most environments. When OPENAI_API_KEY is available, the provider should auto-select "openai". The MCP/server path worked because env vars were set at server startup, but direct Python API usage defaulted to "local" — causing failures in batch usage.

Fix

Added a Pydantic model_validator(mode='after') to CoreSettings in src/valence/core/config.py:

If embedding_provider is still the default "local" and OPENAI_API_KEY is set and VALENCE_EMBEDDING_PROVIDER was not explicitly configured → auto-select "openai"
Explicit user configuration via VALENCE_EMBEDDING_PROVIDER is always respected (no surprise overrides)

Tests

Added TestEmbeddingProviderAutoDetect in tests/core/test_config.py:

OPENAI_API_KEY set + no VALENCE_EMBEDDING_PROVIDER → provider becomes "openai"
VALENCE_EMBEDDING_PROVIDER=local explicit + OPENAI_API_KEY set → stays "local" (user override)
No key, no explicit provider → stays "local"

All 1716 tests pass (10 skipped).

Closes ourochronos/tracking#72

When OPENAI_API_KEY is set and VALENCE_EMBEDDING_PROVIDER has not been explicitly configured, the embedding provider now auto-selects 'openai' instead of falling back to the 'local' default (which requires a local model that is typically not set up). The batch Python API path was previously broken in environments where only an OpenAI API key is available, because the MCP/server path benefits from env-var injection at startup while direct API usage went straight to CoreSettings with its 'local' default. Changes: - Add model_validator(mode='after') _auto_select_embedding_provider to CoreSettings in src/valence/core/config.py - Import model_validator from pydantic - Add TestEmbeddingProviderAutoDetect test class in tests/core/test_config.py covering three scenarios: 1. OPENAI_API_KEY set + no VALENCE_EMBEDDING_PROVIDER → 'openai' 2. VALENCE_EMBEDDING_PROVIDER='local' explicitly set → stays 'local' 3. No key, no explicit provider → stays 'local' Closes ourochronos/tracking#72

Copilot

Pull request overview

This PR updates Valence’s configuration so CoreSettings.embedding_provider auto-selects "openai" when OPENAI_API_KEY is present and the provider was not explicitly configured, preventing unexpected failures when using the Python API with default settings.

Changes:

Added a @model_validator(mode="after") in CoreSettings to auto-select the OpenAI embedding provider when an API key is available.
Added unit tests covering env-var-driven auto-detection and explicit env override behavior.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File	Description
`src/valence/core/config.py`	Adds an “after” model validator to auto-select `embedding_provider="openai"` when `OPENAI_API_KEY` is set.
`tests/core/test_config.py`	Adds tests validating auto-detection and respecting explicit env configuration.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-02-28T19:47:36Z

tests/core/test_config.py

+
+        settings = CoreSettings()
+        assert settings.embedding_provider == "local"
+


These tests cover env-var-based overrides, but don’t cover the common direct-Python usage where a caller might explicitly pass embedding_provider via CoreSettings(embedding_provider=...). Given the validator logic, adding a test that an explicit embedding_provider="local" (via kwargs) is not auto-overridden when OPENAI_API_KEY is set would help prevent regressions.

Suggested change

def test_kwarg_local_override_respected(self, monkeypatch, clean_env):

"""When embedding_provider='local' is passed via kwargs, it should stay

'local' even if OPENAI_API_KEY is set (direct Python override)."""

monkeypatch.setenv("OPENAI_API_KEY", "sk-test-kwarg-local")

monkeypatch.delenv("VALENCE_EMBEDDING_PROVIDER", raising=False)

settings = CoreSettings(embedding_provider="local")

assert settings.embedding_provider == "local"

Copilot · 2026-02-28T19:47:36Z

src/valence/core/config.py

+            import os
+
+            # Check if VALENCE_EMBEDDING_PROVIDER was explicitly set
+            if not os.environ.get("VALENCE_EMBEDDING_PROVIDER"):


The explicit-configuration check relies on os.environ.get("VALENCE_EMBEDDING_PROVIDER"), but BaseSettings values can come from non-os.environ sources (e.g., passing embedding_provider="local" to CoreSettings(...), or values loaded from an env_file in non-test runs). In those cases this validator can incorrectly override an explicit local setting to openai. Prefer detecting whether the field was provided via settings inputs (e.g., using self.model_fields_set to see if embedding_provider was set) rather than inspecting os.environ directly.

Suggested change

import os

# Check if VALENCE_EMBEDDING_PROVIDER was explicitly set

if not os.environ.get("VALENCE_EMBEDDING_PROVIDER"):

# Only auto-select when the embedding_provider field was *not*

# explicitly provided via any settings source.

if "embedding_provider" not in self.model_fields_set:

Copilot · 2026-02-28T19:47:37Z

src/valence/core/config.py

+                self.embedding_provider = "openai"
+        return self
+
+        # ==========================================================================


There’s a section divider comment indented under the validator after return self, which is unreachable and makes the file structure confusing. Consider removing it or unindenting it so the section headers align consistently with the rest of the module.

Suggested change

# ==========================================================================

# ==========================================================================

Copilot AI review requested due to automatic review settings February 28, 2026 19:45

Copilot started reviewing on behalf of zonk1024 February 28, 2026 19:45 View session

zonk1024 merged commit b3d21d1 into main Feb 28, 2026
7 checks passed

Copilot AI reviewed Feb 28, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: auto-detect OpenAI embedding provider when API key available (#72)#583

fix: auto-detect OpenAI embedding provider when API key available (#72)#583
zonk1024 merged 1 commit intomainfrom
fix/72-embedding-provider-autodetect

zonk1024 commented Feb 28, 2026

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Feb 28, 2026

Uh oh!

Copilot AI Feb 28, 2026

Uh oh!

Copilot AI Feb 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants


		settings = CoreSettings()
		assert settings.embedding_provider == "local"

+    def test_kwarg_local_override_respected(self, monkeypatch, clean_env):
+        """When embedding_provider='local' is passed via kwargs, it should stay
+        'local' even if OPENAI_API_KEY is set (direct Python override)."""
+        monkeypatch.setenv("OPENAI_API_KEY", "sk-test-kwarg-local")
+        monkeypatch.delenv("VALENCE_EMBEDDING_PROVIDER", raising=False)
+        settings = CoreSettings(embedding_provider="local")
+        assert settings.embedding_provider == "local"

	# ==========================================================================
	# ==========================================================================

Conversation

zonk1024 commented Feb 28, 2026

Problem

Fix

Tests

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Feb 28, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 28, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 28, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants