ictseoyoungmin/Gemma4-Local-Agent-Windows


Gemma 4 + Claw-Code Local Demo

This repository is a local demo that connects Gemma 4 to claw-code-main over a loopback-only (127.0.0.1) path.

The goal is simple: show that a local model server can act as the default conversation model for a coding agent, trigger tool calls, and make real workspace edits without relying on a cloud provider.

Example of performing coding tasks

Streamlit demo screenshot

Results (Gemma4-E2B-Q4_0)

Streamlit demo results

Warning

This is an experiment-first demo repository, not a polished product. It mainly documents what was tested, what worked, and what did not. Development may continue, but it may also stop here.

Why Gemma 4 E2B

This machine is constrained enough that model choice matters.

Development environment used for this demo:

  • OS: Windows
  • Shell: PowerShell
  • CPU: AMD Ryzen 5 3500 6-Core Processor
  • GPU: NVIDIA GeForce GTX 1660
  • VRAM: 6 GB
  • RAM: 15.94 GB

Because of those limits, Gemma 4 E4B was too large for reliable local use in the tested paths, and Gemma 4 E2B became the practical target. In short:

  • E4B was too large for this 6 GB GPU in the tested local setup
  • E2B was the smallest meaningful Gemma 4 edge model for this machine
  • the working local deployment path ended up being Gemma 4 E2B Q4_0 GGUF + llama.cpp

So although the repo explored multiple paths, the demo is intentionally centered on Gemma 4 E2B, the model that best matched the available CPU/GPU/RAM budget.

What This Repository Demonstrates

  • a local llama.cpp server exposing an OpenAI-compatible API
  • claw-code-main talking to that local server instead of a cloud model
  • tool calls turning into real file reads and edits
  • a browser-based Streamlit demo view for the workflow
  • a small toy project that can be created or refined during the demo
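To make the first two bullets concrete, here is a minimal sketch of how a client could talk to the local llama.cpp server through its OpenAI-compatible endpoint. This is an illustration, not code from this repository: the port (8080, llama.cpp's default) and the model id are assumptions, and the real wiring lives inside claw-code-main's provider path.

```python
"""Sketch: calling a local llama.cpp server via its OpenAI-compatible API.
Port and model id are assumptions, not values from this repository."""
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8080/v1"  # assumed llama.cpp default port


def build_chat_request(prompt: str, model: str = "gemma4-e2b-q4_0") -> dict:
    """Build the JSON body for a /v1/chat/completions call."""
    return {
        "model": model,  # hypothetical model id
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }


def send_chat(prompt: str) -> dict:
    """POST the request to the local server (requires the server running)."""
    body = json.dumps(build_chat_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

Because the endpoint shape is OpenAI-compatible, claw-code-main can point its existing provider at 127.0.0.1 instead of a cloud URL and otherwise behave the same.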

Current Scope

What is implemented in this repo:

  • local Gemma 4 server startup and shutdown
  • OpenAI-compatible routing for claw-code-main
  • one-shot prompt execution
  • persistent REPL-style live conversation
  • tool trace inspection
  • Streamlit controls for server lifecycle and demo flow
  • a visual static toy project used as the demo target
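The server startup/shutdown piece can be sketched roughly as follows. The binary name, flags, and layer count below are illustrative assumptions (the real values live in this repo's PowerShell scripts); `-m`, `--port`, and `-ngl` are standard llama-server options, with `-ngl` controlling how many layers are offloaded to the 6 GB GPU.

```python
"""Sketch: local llama-server lifecycle management.
Flags and paths are illustrative, not taken from this repository's scripts."""
import subprocess


def build_server_command(model_path: str, port: int = 8080,
                         gpu_layers: int = 20) -> list:
    """Assemble a llama-server command line for a GGUF model."""
    return [
        "llama-server",
        "-m", model_path,         # path to the Q4_0 GGUF file
        "--port", str(port),      # loopback HTTP port
        "-ngl", str(gpu_layers),  # layers offloaded to the GPU
    ]


class LocalServer:
    """Start/stop wrapper, used conceptually like the Streamlit controls."""

    def __init__(self, model_path: str):
        self.cmd = build_server_command(model_path)
        self.proc = None  # subprocess.Popen once started

    def start(self) -> None:
        if self.proc is None:
            self.proc = subprocess.Popen(self.cmd)

    def stop(self) -> None:
        if self.proc is not None:
            self.proc.terminate()
            self.proc.wait()
            self.proc = None
```

In the demo, the Streamlit lifecycle buttons play the role of `start()` and `stop()` here.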

Dependency files:

Thanks to claw-code-main

This demo would not exist without claw-code-main.

Thanks to the original author and maintainers for building the base agent runtime that made this integration possible. This repository only adds a local Gemma 4-focused demo layer on top of that work.

Upstream project:

The parts of claw-code-main used directly in this demo are:

  • the OpenAI-compatible provider path
  • prompt and REPL execution modes
  • tool-calling
  • workspace-write file operations
  • session logs under .claw/sessions for trace inspection
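For the last bullet, trace inspection can be sketched as a small JSONL reader. The event shape below (`type`, `name`, `arguments` fields) is an assumption for illustration; claw-code-main's actual session format under .claw/sessions may differ.

```python
"""Sketch: pulling tool calls out of a JSON-lines session log.
The event schema here is assumed, not claw-code-main's documented format."""
import json
from pathlib import Path


def iter_tool_calls(session_file: Path):
    """Yield (tool_name, arguments) pairs from a JSONL session log."""
    for line in session_file.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue  # skip blank lines
        event = json.loads(line)
        if event.get("type") == "tool_call":  # assumed event type
            yield event.get("name"), event.get("arguments", {})
```

A reader like this is enough to answer the demo's key question: did the conversation actually turn into file reads and edits, and with what arguments.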

Thanks to Gemma 4

This demo also depends directly on Gemma 4, especially the E2B model line that made a constrained local setup practical enough to test.

Thanks to the Gemma team for releasing a model family that is small enough to explore on local hardware while still being capable enough to drive a real coding-agent demo.

How to Run

Launch the Streamlit demo:

powershell -NoProfile -ExecutionPolicy Bypass -File .\scripts\run_claw_gemma4_streamlit.ps1

The Streamlit app is intended to show:

  • local server startup
  • conversation flow
  • tool execution
  • changed files
  • demo logs and traces

Demo Target

The current toy project used in the demo is demo/toy-task-board.

It is intentionally small, static, and visual enough for a live walkthrough.

Important clarification:

  • the current contents of demo/toy-task-board are not an end-to-end Gemma-generated artifact
  • the current version was manually arranged in this repository as a reading and presentation target for the demo
  • it should be treated as a surface for inspecting the local agent workflow, not as proof that Gemma 4 autonomously produced the whole project in its current form

Known Limits

  • This repository is primarily about the local coding-agent loop, not full multimodal coverage
  • GGUF + llama.cpp was the path that actually ran reliably here
  • longer agent turns can still be slow or unstable
  • the setup is tuned to this local Windows machine and this repo layout
  • some Gemma 4 features explored during testing did not fit this hardware budget cleanly
  • Windows tool execution is not fully cleaned up yet; some traces still show bash-style tool calls instead of a Windows-native execution path
  • because of that, full development-level validation of the tool chain should still be treated as incomplete

TODO

  • tighten the Streamlit UX further
  • improve the readability of live conversation traces
  • make the demo path shorter and more repeatable
  • validate editing reliability over more real prompts
  • simplify CPU/GPU bring-up and status reporting
  • resolve the remaining Windows shell mismatch so the agent does not fall back to bash-style tool calls
  • verify the development workflow at a deeper level after the Windows tool-path issue is fixed
  • decide whether this remains a one-off experiment or becomes a maintained demo

Status

The honest status of this repository is:

  • the local demo works
  • the integration path is real
  • the repo is still experimental
  • this is currently closer to “successfully tried and documented” than to “actively developed product”

That is intentional. The point of this repository is to preserve a working local demo and the decisions behind it, not to pretend that it is more finished than it is.

About

claw-code, gemma4-E2B, Streamlit, Coding Agent Demo
