
Add llama-cpp-python backend support#173

Draft
Copilot wants to merge 2 commits into main from copilot/add-llamacpp-support
Conversation

Contributor

Copilot AI commented Mar 2, 2026

Adds llama-cpp-python as a supported inference backend, consistent with the existing vllm and vllm_offline integrations and adapted from the outlines llamacpp model.

Changes

  • src/gimkit/models/llamacpp.py: New LlamaCpp subclass overriding __call__ with GIMKit input/output handling; _ensure_response_suffix injects RESPONSE_SUFFIX as a stop token; from_llamacpp factory function
  • src/gimkit/models/__init__.py / src/gimkit/__init__.py: Export from_llamacpp
  • pyproject.toml: New llamacpp optional dependency group (llama-cpp-python>=0.3.0)
  • tests/models/test_llamacpp.py: Tests for factory, call dispatch, error handling, and stop-token injection; uses an autouse fixture patching LlamaCppTokenizer so the suite runs without llama-cpp-python installed

Usage

from llama_cpp import Llama
from gimkit import from_llamacpp, guide

llm = Llama(model_path="model.gguf")
model = from_llamacpp(llm)

result = model(f"The capital of France is {guide(name='city')}.")
print(result.tags["city"].content)  # "Paris"
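A minimal sketch of the wrapper's call dispatch and error handling, as described in the changes. Class and method shapes here are assumptions for illustration; the real `LlamaCpp` subclass overrides `__call__` with full GIMKit input/output handling.

```python
class LlamaCpp:
    """Hypothetical sketch of the backend wrapper (names are assumptions)."""

    RESPONSE_SUFFIX = "</response>"  # assumed sentinel

    def __init__(self, model):
        self.model = model  # a llama_cpp.Llama instance (or any callable)

    def __call__(self, prompt, **kwargs):
        # Reject non-string prompts early with a clear error.
        if not isinstance(prompt, str):
            raise TypeError(f"expected a str prompt, got {type(prompt).__name__}")
        # Inject the response suffix as a stop token before delegating.
        stop = list(kwargs.get("stop") or [])
        if self.RESPONSE_SUFFIX not in stop:
            stop.append(self.RESPONSE_SUFFIX)
        kwargs["stop"] = stop
        return self.model(prompt, **kwargs)


def from_llamacpp(model):
    """Factory mirroring the from_vllm-style entry points (assumed shape)."""
    return LlamaCpp(model)
```

Because the wrapper only requires a callable, tests can pass a stub in place of a real `Llama` instance, which is how the suite can run without llama-cpp-python installed.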


Co-authored-by: Ki-Seki <60967965+Ki-Seki@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Add llamacpp support for model integration" to "Add llama-cpp-python backend support" on Mar 2, 2026