This repository is a local demo that connects Gemma 4 to claw-code-main through a 127.0.0.1-only path.
The goal is simple: show that a local model server can act as the default conversation model for a coding agent, trigger tool calls, and make real workspace edits without relying on a cloud provider.
Warning
This is an experiment-first demo repository, not a polished product. It mainly documents what was tested, what worked, and what did not. Development may continue, but it may also stop here.
This machine is constrained enough that model choice matters.
Development environment used for this demo:
- OS: Windows
- Shell: PowerShell
- CPU: AMD Ryzen 5 3500 6-Core Processor
- GPU: NVIDIA GeForce GTX 1660
- VRAM: 6 GB
- RAM: 15.94 GB
Because of those limits, Gemma 4 E4B was too tight for reliable local use in the tested paths, while Gemma 4 E2B was the practical target. In short:
- `E4B` was too large for this 6 GB GPU in the tested local setup
- `E2B` was the smallest meaningful Gemma 4 edge model for this machine
- the working local deployment path ended up being `Gemma 4 E2B Q4_0 GGUF` + `llama.cpp`
So although the repo explored multiple paths, the demo is intentionally centered on Gemma 4 E2B because that was the model that best matched the available CPU/GPU/RAM budget.
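As a rough sanity check on why `E2B` fits a 6 GB card while `E4B` does not, note that `Q4_0` stores a block of 32 weights in 18 bytes (16 bytes of 4-bit quants plus a 2-byte fp16 scale), i.e. about 4.5 bits per weight. A minimal sketch of that arithmetic (the parameter counts in the usage comment are illustrative, not official Gemma figures):

```python
def approx_gguf_size_gb(n_params_billion: float, bits_per_weight: float = 4.5) -> float:
    """Rough GGUF weight-file size estimate.

    Q4_0 averages ~4.5 bits per weight (18 bytes per 32-weight block).
    Ignores KV cache and runtime overhead, so treat the result as a floor.
    """
    return n_params_billion * 1e9 * bits_per_weight / 8 / 1e9

# Illustrative only: a ~2B-parameter model at Q4_0 needs roughly 1.1 GB
# for weights alone, leaving headroom on a 6 GB GPU; a ~4B model roughly
# doubles that before KV cache and context buffers are added.
```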
At a high level, the demo combines:
- a local `llama.cpp` server exposing an OpenAI-compatible API
- `claw-code-main` talking to that local server instead of a cloud model
- tool calls turning into real file reads and edits
- a browser-based `Streamlit` demo view for the workflow
- a small toy project that can be created or refined during the demo
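A minimal health-check sketch for the local server piece, assuming the `llama.cpp` server exposes its usual `GET /health` endpoint on `http://127.0.0.1:8080` (both the port and the endpoint are assumptions for illustration, not values read from this repo's scripts):

```python
import urllib.error
import urllib.request


def server_is_up(base_url: str = "http://127.0.0.1:8080", timeout: float = 2.0) -> bool:
    """Best-effort check that a local llama.cpp server is answering.

    Returns False on connection errors instead of raising, so callers
    can poll this during server startup.
    """
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False
```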
Main pieces in this repo:
- `apps/claw_gemma4_demo.py`
- `scripts/run_claw_gemma4_streamlit.ps1`
- `scripts/run_claw_code_main_with_gemma4.ps1`
- `scripts/run_gemma4_e2b_q4_0_llamacpp_server.ps1`
- `scripts/stop_llamacpp_server.ps1`
- `requirements.txt`
- `requirements-streamlit.txt`
- `demo/toy-task-board`
What is implemented:
- local Gemma 4 server startup and shutdown
- OpenAI-compatible routing for `claw-code-main`
- one-shot prompt execution
- persistent REPL-style live conversation
- tool trace inspection
- Streamlit controls for server lifecycle and demo flow
- a visual static toy project used as the demo target
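To illustrate the OpenAI-compatible routing, the sketch below builds the kind of chat-completion request a client would send to the local server. The base URL, model name, and temperature are illustrative assumptions, not values taken from this repo's configuration:

```python
import json
import urllib.request


def build_chat_request(base_url: str, prompt: str,
                       model: str = "gemma-4-e2b") -> urllib.request.Request:
    """Build an OpenAI-style /v1/chat/completions request for a local server."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Usage (requires the local server to actually be running):
# req = build_chat_request("http://127.0.0.1:8080", "Summarize app.js")
# with urllib.request.urlopen(req) as resp:
#     reply = json.load(resp)["choices"][0]["message"]["content"]
```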
Dependency files:
- `requirements.txt`: full environment snapshot generated from the current `.venv`
- `requirements-streamlit.txt`: smaller Streamlit-side dependency file for the demo UI layer
This demo would not exist without claw-code-main.
Thanks to the original author and maintainers for building the base agent runtime that made this integration possible. This repository only adds a local Gemma 4-focused demo layer on top of that work.
Upstream project:
The parts of claw-code-main used directly in this demo are:
- the OpenAI-compatible provider path
- `prompt` and `REPL` execution modes
- tool-calling
- `workspace-write` file operations
- session logs under `.claw/sessions` for trace inspection
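The tool-calling path consumes OpenAI-style `tool_calls` entries on assistant messages. A hedged sketch of parsing that shape follows; the `read_file` tool name and the exact message layout are illustrative, since the real schema is defined by `claw-code-main`:

```python
import json

# Hypothetical OpenAI-style assistant message carrying one tool call,
# similar in shape to what an agent runtime turns into a file read.
assistant_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [
        {
            "id": "call_0",
            "type": "function",
            "function": {
                "name": "read_file",
                "arguments": json.dumps({"path": "demo/toy-task-board/app.js"}),
            },
        }
    ],
}


def extract_tool_calls(message: dict) -> list:
    """Return (tool_name, parsed_arguments) pairs from an assistant message."""
    calls = []
    for call in message.get("tool_calls", []):
        fn = call["function"]
        calls.append((fn["name"], json.loads(fn["arguments"])))
    return calls
```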
This demo also depends directly on Gemma 4, especially the E2B model line that made a constrained local setup practical enough to test.
Thanks to the Gemma team for releasing a model family that is small enough to explore on local hardware while still being capable enough to drive a real coding-agent demo.
Launch the Streamlit demo:
```powershell
powershell -NoProfile -ExecutionPolicy Bypass -File .\scripts\run_claw_gemma4_streamlit.ps1
```

The Streamlit app is intended to show:
- local server startup
- conversation flow
- tool execution
- changed files
- demo logs and traces
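For the logs-and-traces part, the sketch below assumes session logs under `.claw/sessions` are JSON-lines files; that format is an assumption made for illustration, as the actual layout is defined by `claw-code-main`:

```python
import json
from pathlib import Path


def iter_session_events(sessions_dir: str):
    """Yield parsed events from *.jsonl session logs, oldest file first.

    Assumes one JSON object per line; skips blank lines. The real log
    format from claw-code-main may differ.
    """
    for log_file in sorted(Path(sessions_dir).glob("*.jsonl")):
        for line in log_file.read_text(encoding="utf-8").splitlines():
            if line.strip():
                yield json.loads(line)
```

This keeps trace inspection read-only: the demo UI can render events without ever mutating the agent's own session state.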
The current toy project used in the demo is:
- `demo/toy-task-board/index.html`
- `demo/toy-task-board/styles.css`
- `demo/toy-task-board/app.js`
- `demo/toy-task-board/README.md`
It is intentionally small, static, and visual enough for a live walkthrough.
Important clarification:
- the current contents of `demo/toy-task-board` are not an end-to-end Gemma-generated artifact
- the current version was manually arranged in this repository as a reading and presentation target for the demo
- it should be treated as a surface for inspecting the local agent workflow, not as proof that Gemma 4 autonomously produced the whole project in its current form
Known limitations:
- this repository is primarily about the local coding-agent loop, not full multimodal coverage
- `GGUF` + `llama.cpp` was the path that actually ran reliably here
- longer agent turns can still be slow or unstable
- the setup is tuned to this local Windows machine and this repo layout
- some Gemma 4 features explored during testing did not fit this hardware budget cleanly
- Windows tool execution is not fully cleaned up yet; some traces still show `bash`-style tool calls instead of a Windows-native execution path
- because of that, full development-level validation of the tool chain should still be treated as incomplete
Possible next steps:
- tighten the Streamlit UX further
- improve the readability of live conversation traces
- make the demo path shorter and more repeatable
- validate editing reliability over more real prompts
- simplify CPU/GPU bring-up and status reporting
- resolve the remaining Windows shell mismatch so the agent does not fall back to `bash`-style tool calls
- verify the development workflow at a deeper level after the Windows tool-path issue is fixed
- decide whether this remains a one-off experiment or becomes a maintained demo
The honest status of this repository is:
- the local demo works
- the integration path is real
- the repo is still experimental
- this is currently closer to “successfully tried and documented” than to “actively developed product”
That is intentional. The point of this repository is to preserve a working local demo and the decisions behind it, not to pretend that it is more finished than it is.

