
feat: add Hugging Face as a first-class inference provider #1171

Open
davanstrien wants to merge 2 commits into NousResearch:main from davanstrien:feat/huggingface-provider

Conversation


@davanstrien davanstrien commented Mar 13, 2026

Register Hugging Face Inference Providers (https://router.huggingface.co/v1) as a named provider alongside existing ones like Z.AI, Kimi, and MiniMax.

Users can now:

  • hermes chat --provider huggingface
  • Use hf:model-name syntax (e.g. hf:Qwen/Qwen3-235B-A22B-Thinking-2507)
  • Set HF_TOKEN in ~/.hermes/.env
  • Select from 18 curated models via hermes model picker

Curated model list sourced from models.dev. Users can also type any model name manually — it validates against the live /v1/models endpoint (120+ models available).
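As a rough sketch of that validation flow (function names here are illustrative, not the actual hermes_cli API), the manually typed name can be checked against the OpenAI-compatible `/v1/models` listing:

```python
import json
import urllib.request

ROUTER_BASE_URL = "https://router.huggingface.co/v1"

def list_model_ids(base_url=ROUTER_BASE_URL, token=None):
    """Fetch model ids from the OpenAI-compatible /models listing."""
    req = urllib.request.Request(f"{base_url}/models")
    if token:
        req.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(req) as resp:
        payload = json.load(resp)
    return [entry["id"] for entry in payload.get("data", [])]

def is_valid_model(model, available_ids):
    """Accept a manually typed model name only if the live listing knows it."""
    return model in available_ids
```

This keeps the curated list purely cosmetic: any of the 120+ router models passes validation as long as it appears in the live listing.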

Type of Change

  • ✨ New feature (non-breaking change that adds functionality)

Changes Made

  • hermes_cli/auth.py — Added huggingface to PROVIDER_REGISTRY with HF_TOKEN / HUGGING_FACE_HUB_TOKEN env vars and https://router.huggingface.co/v1 base URL
  • hermes_cli/models.py — Added curated model list (from models.dev), label, hf alias, and provider order entry
  • hermes_cli/main.py — Added HF to provider menu, dispatch, --provider CLI choices, and curated model picker list
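A minimal sketch of what the auth.py registry entry could look like. The `id` and `name` fields match the reviewed diff hunk; the `base_url` and `env_vars` field names are assumptions for illustration, not the actual ProviderConfig definition:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ProviderConfig:
    id: str
    name: str
    base_url: str
    env_vars: tuple[str, ...]  # checked in order when looking up an API key

PROVIDER_REGISTRY = {
    "huggingface": ProviderConfig(
        id="huggingface",
        name="Hugging Face",
        base_url="https://router.huggingface.co/v1",
        env_vars=("HF_TOKEN", "HUGGING_FACE_HUB_TOKEN"),
    ),
}
```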

How to Test

  1. Set HF_TOKEN in ~/.hermes/.env (get one at https://huggingface.co/settings/tokens)
  2. Run hermes model, select "Hugging Face Inference Providers"
  3. Pick a model from the curated list (e.g. Qwen/Qwen3-235B-A22B-Thinking-2507)
  4. Chat — verify completions and tool calling work

Or directly: hermes chat --provider huggingface --model Qwen/Qwen3-235B-A22B-Thinking-2507
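Under the hood this is a standard OpenAI-compatible chat completion against the router. A hedged sketch of the equivalent raw request (the helper name and shape are illustrative, not Hermes code):

```python
import json
import urllib.request

CHAT_URL = "https://router.huggingface.co/v1/chat/completions"

def build_request(model, prompt, token):
    """Build the raw POST request hermes would send for a single user turn."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        CHAT_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

Sending it with `urllib.request.urlopen(build_request(...))` should return the usual `choices[0].message.content` response shape.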

Tested locally

  • Provider registry loads correctly
  • validate_requested_model() accepts curated models against live API
  • Chat completions work via router.huggingface.co/v1
  • Tool calling works (function calls returned correctly)
  • Tested on macOS 15 (Apple Silicon)
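The tool-calling check can be reproduced with a standard OpenAI-style `tools` payload; the function schema below is illustrative, not taken from the Hermes codebase:

```python
def build_tool_call_payload(model):
    """Minimal chat payload advertising one function tool (illustrative schema)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
        "tools": [{
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Look up current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }],
    }
```

A working provider should answer with a `tool_calls` entry naming `get_weather` rather than a plain text completion.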

Checklist

Code

  • I've read the Contributing Guide
  • My commit messages follow Conventional Commits
  • I searched for existing PRs to make sure this isn't a duplicate
  • My PR contains only changes related to this fix/feature
  • I've run pytest tests/ -q and all tests pass
  • I've added tests for my changes
  • I've tested on my platform: macOS 15.2 (Apple Silicon)

Documentation & Housekeeping

  • I've updated relevant documentation — happy to add docs if wanted
  • I've updated cli-config.yaml.example if I added/changed config keys — can add HF_TOKEN example
  • I've considered cross-platform impact — N/A (no platform-specific code)

Once merged, we'll add Hermes Agent to the HF Inference Providers integrations page as well.

    ),
    "huggingface": ProviderConfig(
        id="huggingface",
        name="Hugging Face",

@davanstrien davanstrien Mar 13, 2026


maybe HuggingFace Inference Providers

Register Hugging Face Inference Providers (https://router.huggingface.co/v1)
as a named provider alongside existing ones like Z.AI, Kimi, and MiniMax.

Users can now:
- `hermes chat --provider huggingface`
- Use `hf:model-name` syntax (e.g. `hf:Qwen/Qwen3-235B-A22B-Thinking-2507`)
- Set HF_TOKEN in ~/.hermes/.env
- Select from 18 curated models via `hermes model` picker

Curated model list sourced from models.dev. Users can also type any model
name manually — it validates against the live /v1/models endpoint.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@davanstrien davanstrien force-pushed the feat/huggingface-provider branch from ce5223f to f0a24cf on March 13, 2026 at 14:56
@davanstrien

cc @hanouticelina @Wauplin for viz

@davanstrien davanstrien marked this pull request as ready for review March 13, 2026 14:59
Per feedback from @Wauplin — HUGGING_FACE_HUB_TOKEN is outdated.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
teknium1 pushed a commit that referenced this pull request Mar 17, 2026
Register Hugging Face Inference Providers (router.huggingface.co/v1)
as a named provider alongside existing ones. Users can now:
- hermes chat --provider huggingface
- Use hf:model-name syntax (e.g. hf:Qwen/Qwen3-235B-A22B-Thinking-2507)
- Set HF_TOKEN in ~/.hermes/.env
- Select from 18 curated models via hermes model picker

OpenAI-compatible endpoint with automatic failover across providers
(Groq, Together, SambaNova, etc.), free tier included.

Files changed:
- hermes_cli/auth.py: ProviderConfig + aliases (hf, hugging-face, huggingface-hub)
- hermes_cli/models.py: _PROVIDER_MODELS, _PROVIDER_LABELS, _PROVIDER_ALIASES, _PROVIDER_ORDER
- hermes_cli/main.py: provider_labels, providers list, --provider choices, dispatch
- hermes_cli/setup.py: provider_choices, setup flow with token prompt
- hermes_cli/config.py: HF_TOKEN + HF_BASE_URL in OPTIONAL_ENV_VARS
- agent/model_metadata.py: context window entries for all curated HF models
- .env.example: HF_TOKEN documentation

Based on PR #1171 by @davanstrien. Salvaged onto current main with
additional completeness: setup.py flow, config.py env vars, auth.py
aliases, model_metadata context windows, .env.example.
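The `hf:` prefix and alias handling listed above could work roughly like this (a hypothetical sketch; the real _PROVIDER_ALIASES table and resolver live in hermes_cli):

```python
# Illustrative alias table, mirroring the aliases named in the commit message.
_PROVIDER_ALIASES = {
    "hf": "huggingface",
    "hugging-face": "huggingface",
    "huggingface-hub": "huggingface",
}

def resolve_model_spec(spec):
    """Split 'hf:Qwen/Qwen3-...' into (provider, model); bare names have no provider."""
    prefix, sep, rest = spec.partition(":")
    if sep and prefix in _PROVIDER_ALIASES:
        return _PROVIDER_ALIASES[prefix], rest
    return None, spec
```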