
feat: add Hugging Face as a first-class inference provider#1747

Open
teknium1 wants to merge 2 commits into main from hermes/hermes-0ed29ee7

Conversation

@teknium1
Contributor

Summary

Salvage of PR #1171 by @davanstrien onto current main.

Registers Hugging Face Inference Providers (router.huggingface.co/v1) as a named provider. Users can now:

  • hermes chat --provider huggingface
  • Use hf:model-name syntax (e.g. hf:Qwen/Qwen3-235B-A22B-Thinking-2507)
  • Set HF_TOKEN in ~/.hermes/.env
  • Select from 18 curated open models via hermes model picker

OpenAI-compatible endpoint with automatic failover across providers (Groq, Together, SambaNova, etc.), free tier included ($0.10/month, no markup).

Changes vs original PR #1171

The original PR touched 3 files (auth.py, models.py, main.py). This salvage adds the missing integration points required by our current provider checklist:

| File | Original PR | This PR |
| --- | --- | --- |
| hermes_cli/auth.py | ✅ ProviderConfig | ✅ + aliases in resolve_provider() |
| hermes_cli/models.py | ✅ Models, labels, aliases, order | ✅ Same |
| hermes_cli/main.py | ✅ Labels, providers, choices, dispatch | ✅ Updated for current structure |
| hermes_cli/setup.py | ❌ Missing | ✅ provider_choices + setup flow |
| hermes_cli/config.py | ❌ Missing | ✅ HF_TOKEN + HF_BASE_URL in OPTIONAL_ENV_VARS |
| agent/model_metadata.py | ❌ Missing | ✅ Context window entries for all 18 models |
| .env.example | ❌ Missing | ✅ HF_TOKEN documentation |
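For context, the config.py and auth.py integration points in the table could look roughly like this. The identifiers `OPTIONAL_ENV_VARS`, `_PROVIDER_ALIASES`, and `resolve_provider` are taken from the PR text, but the exact data shapes shown are assumptions, not the real implementation.

```python
# Illustrative sketch of the new integration points; shapes are assumed.

# config.py: optional env vars so setup does not hard-require a token.
OPTIONAL_ENV_VARS = {
    "HF_TOKEN": "Hugging Face access token (https://huggingface.co/settings/tokens)",
    "HF_BASE_URL": "Override for the default https://router.huggingface.co/v1 endpoint",
}

# auth.py: aliases accepted by --provider, normalized to one canonical key.
_PROVIDER_ALIASES = {
    "hf": "huggingface",
    "hugging-face": "huggingface",
    "huggingface-hub": "huggingface",
}

def resolve_provider(name: str) -> str:
    """Normalize a user-supplied provider name to its canonical key."""
    key = name.strip().lower()
    return _PROVIDER_ALIASES.get(key, key)
```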

How to test

  1. Set HF_TOKEN in ~/.hermes/.env (get one at https://huggingface.co/settings/tokens)
  2. Run hermes model, select "Hugging Face Inference Providers"
  3. Pick a model and chat

Or directly: hermes chat --provider huggingface --model Qwen/Qwen3-235B-A22B-Thinking-2507
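Since the endpoint is OpenAI-compatible, it can also be exercised outside the CLI with nothing but the standard library. This is a minimal sketch under the PR's stated assumptions (router.huggingface.co/v1 base URL, HF_TOKEN bearer auth); `/chat/completions` is the standard OpenAI-compatible path, and error handling is omitted.

```python
# Minimal stdlib sketch of an OpenAI-compatible chat request to the
# Hugging Face router. Base URL and HF_TOKEN come from the PR text.
import json
import os
import urllib.request

def build_chat_request(model: str, prompt: str,
                       base_url: str = "https://router.huggingface.co/v1"):
    """Build (but do not send) a chat completion POST request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {os.environ.get('HF_TOKEN', '')}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# To actually send it:
#   urllib.request.urlopen(build_chat_request("Qwen/Qwen3-235B-A22B-Thinking-2507", "hi"))
```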

Test results

5081 passed, 32 failed (all pre-existing on main), 164 skipped. Zero regressions introduced.

Attribution

Contributor commit preserved with original authorship. Closes #1171.

davanstrien and others added 2 commits March 17, 2026 05:21
Register Hugging Face Inference Providers (router.huggingface.co/v1)
as a named provider alongside existing ones. Users can now:
- hermes chat --provider huggingface
- Use hf:model-name syntax (e.g. hf:Qwen/Qwen3-235B-A22B-Thinking-2507)
- Set HF_TOKEN in ~/.hermes/.env
- Select from 18 curated models via hermes model picker

OpenAI-compatible endpoint with automatic failover across providers
(Groq, Together, SambaNova, etc.), free tier included.

Files changed:
- hermes_cli/auth.py: ProviderConfig + aliases (hf, hugging-face, huggingface-hub)
- hermes_cli/models.py: _PROVIDER_MODELS, _PROVIDER_LABELS, _PROVIDER_ALIASES, _PROVIDER_ORDER
- hermes_cli/main.py: provider_labels, providers list, --provider choices, dispatch
- hermes_cli/setup.py: provider_choices, setup flow with token prompt
- hermes_cli/config.py: HF_TOKEN + HF_BASE_URL in OPTIONAL_ENV_VARS
- agent/model_metadata.py: context window entries for all curated HF models
- .env.example: HF_TOKEN documentation

Based on PR #1171 by @davanstrien. Salvaged onto current main with
additional completeness: setup.py flow, config.py env vars, auth.py
aliases, model_metadata context windows, .env.example.
- quickstart.md: add to provider table
- configuration.md: add to provider table, add dedicated section with
  usage examples, config.yaml snippet, routing suffixes, and token info;
  also fix pre-existing duplicate Alibaba Cloud entry
- environment-variables.md: add HF_TOKEN + HF_BASE_URL, add huggingface
  to HERMES_INFERENCE_PROVIDER values
- fallback-providers.md: add to supported providers table and
  auto-detection chain
@teknium1 force-pushed the hermes/hermes-0ed29ee7 branch from f3b6c9d to 11c926c on March 17, 2026 16:37
