Goal. A hands-on lab to build text-only LLM applications with RAG, tool use, multi-agent reasoning, a text-adventure demo, Gradio UI, evaluation, and FastAPI deployment. The repo follows an 8-stage roadmap and delivers runnable Jupyter notebooks plus reusable Python modules.
- LLMAdapter (OSS-first): unified chat interface for Transformers / llama.cpp / Ollama.
- Chinese RAG pipeline: zh-friendly chunking, BGE embeddings, FAISS, reranking, grounded answers with citations.
- Tools & Function-Calling: JSON tool schema, web search (DuckDuckGo + trafilatura), calculator, file lookup, ReAct (schema sketch after this list).
- Multi-Agent Orchestrator: Researcher → Planner → Writer → Reviewer with a shared blackboard and retries.
- RAG-driven Text Adventure: world-knowledge KB, state machine, choices, save/load, style control, moderation.
- UI & API: Gradio multi-mode panel and FastAPI (REST + WebSocket, OpenAI-compatible route).
- Eval & Observability: Recall@k, nDCG, groundedness, ROUGE/chrF++, latency/tokens/VRAM, run reports.
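For flavor, here is a minimal function-calling schema of the kind Stage 3 registers, written as a Python dict; the field layout follows the common JSON-schema convention and is an assumption, not the repo's exact contract:

```python
# Hypothetical tool schema; the repo's actual contract may differ.
calculator_tool = {
    "name": "calculator",
    "description": "Evaluate a basic arithmetic expression.",
    "parameters": {
        "type": "object",
        "properties": {
            "expression": {"type": "string", "description": "e.g. '2 * (3 + 4)'"},
        },
        "required": ["expression"],
    },
}
```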
Repository layout:

```text
ragent-text-lab/
  notebooks/1..8/        # 61 notebooks across 8 stages (training-oriented walkthroughs)
  shared_utils/          # adapters/, rag/, agents/, prompts/, metrics/, ui/, utils/
  apps/                  # gradio_app/, api/
  configs/               # rag.yaml, agents.yaml, styles/*.yaml, ui/api configs
  scripts/               # setup_env.py, helpers
  tests/                 # minimal smoke and component tests
  data/ indices/ outs/   # git-ignored artifacts & caches
```
Shared model cache only. Large artifacts (models, indices, outputs) live outside git: `AI_CACHE_ROOT`, `data/`, `indices/`, `outs/`.
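A minimal sketch of honoring that cache from Python; mapping `AI_CACHE_ROOT` onto `HF_HOME` is an assumption about how the repo wires the Hugging Face cache:

```python
import os
from pathlib import Path

# Assumption: the repo points the Hugging Face cache at AI_CACHE_ROOT.
# Set HF_HOME before importing transformers so downloads land in the shared cache.
cache_root = Path(os.environ.get("AI_CACHE_ROOT", "data/")).expanduser()
os.environ.setdefault("HF_HOME", str(cache_root / "huggingface"))
```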
- Windows 10/11, PowerShell, Git, Conda (Miniconda/Anaconda), Python 3.11
- Optional CUDA GPU (8–12 GB VRAM recommended). CPU fallback is supported.
```powershell
# 1) create the skeleton (PowerShell)
#    -> see "Windows terminal: full bootstrap" in the docs/message

# 2) create env and install deps
conda create -y -n rag-agent python=3.11
conda activate rag-agent
pip install -r requirements-dev.txt
pre-commit install
nbstripout --install

# 3) prepare .env
Copy-Item .env.example .env

# 4) smoke: run FastAPI and Gradio (two terminals)
uvicorn apps.api.main:app --host 0.0.0.0 --port 8000
python apps/gradio_app/app.py
```
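Once uvicorn is up, a quick request against the OpenAI-compatible route doubles as a smoke test; the exact path and payload follow the OpenAI chat-completions convention and are assumptions about this API:

```python
import json
import urllib.request

# Assumption: the API mirrors OpenAI's /v1/chat/completions contract.
payload = {
    "model": "Qwen/Qwen2.5-7B-Instruct",
    "messages": [{"role": "user", "content": "ping"}],
}
req = urllib.request.Request(
    "http://localhost:8000/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read()))
```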
`.env` keys:

- `AI_CACHE_ROOT=/mnt/ai/cache` (shared model/data cache)
- `MODEL_ID`, `BACKEND=transformers|llama_cpp|ollama`
- `EMBEDDING_MODEL=BAAI/bge-m3`, `RERANKER_MODEL=BAAI/bge-reranker-base`
- See `configs/rag.yaml` and `configs/agents.yaml` for RAG/agent settings.
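A sketch of pulling those settings together at startup; it assumes `python-dotenv` and `PyYAML` rather than the repo's own config helpers:

```python
import os

import yaml                     # PyYAML -- an assumed dependency
from dotenv import load_dotenv  # python-dotenv -- likewise assumed

load_dotenv()  # loads AI_CACHE_ROOT, MODEL_ID, BACKEND, ... from .env
with open("configs/rag.yaml", encoding="utf-8") as f:
    rag_cfg = yaml.safe_load(f)

print(os.environ.get("BACKEND"), rag_cfg)  # keys depend on configs/rag.yaml
```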
Minimal LLMAdapter usage:

```python
from shared_utils.adapters.llm_adapter import LLMAdapter

adapter = LLMAdapter(model_id="Qwen/Qwen2.5-7B-Instruct")
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain RAG in 3 bullets."},
]
print(adapter.generate(messages, max_new_tokens=200, temperature=0.7))
```

The RAG pipeline in brief: load `configs/rag.yaml`, embed with BGE, index with FAISS, then retrieve → rerank → answer with citations (notebooks `2_rag_basics/nb10..nb15`).
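A compact sketch of that retrieve-then-rerank flow with `sentence-transformers` and `faiss`; the lab wraps this in `shared_utils/rag`, so treat the names and defaults here as assumptions:

```python
import faiss
from sentence_transformers import CrossEncoder, SentenceTransformer

docs = ["RAG 先检索后生成。", "FAISS 负责向量检索。", "重排序提升命中精度。"]

# Embed with BGE and index with inner product over normalized vectors.
embedder = SentenceTransformer("BAAI/bge-m3")
doc_vecs = embedder.encode(docs, normalize_embeddings=True)
index = faiss.IndexFlatIP(doc_vecs.shape[1])
index.add(doc_vecs)

# Retrieve top-k candidates, then rerank them with a cross-encoder.
query = "什么是 RAG？"
q_vec = embedder.encode([query], normalize_embeddings=True)
_, ids = index.search(q_vec, 3)
candidates = [docs[i] for i in ids[0]]
reranker = CrossEncoder("BAAI/bge-reranker-base")
scores = reranker.predict([(query, c) for c in candidates])
print(candidates[max(range(len(scores)), key=scores.__getitem__)])
```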
Roadmap (notebooks per stage):

- Stage 1 — Repo & LLMAdapter (8)
- Stage 2 — Chinese RAG Basics (10)
- Stage 3 — Tools / Function-Calling (10)
- Stage 4 — Multi-Agent Orchestrator (10)
- Stage 5 — RAG × Text-Adventure (8)
- Stage 6 — Gradio UI (6)
- Stage 7 — Eval & Observability (5)
- Stage 8 — API & Deploy (4)
Each notebook includes: goals, prerequisites, shared-cache bootstrap, MVP code, low-VRAM settings, smoke test, and a “when to use this” note.
Git workflow:

- Branch per notebook/feature: `feature/nb10-rag-index`, `feature/agent-orchestrator`, `feature/ui-gradio`, …
- Merge path: `feature/*` → `develop` → `main` (use `--no-ff`; tag releases per stage).
- Conventional Commits examples:
  - `feat(rag): build zh chunk + bge-m3 + FAISS index`
  - `feat(agent): 4-role orchestrator with retries`
  - `feat(ui): gradio modes (chat/rag/agents/game)`
  - `feat(api): openai-compatible /v1/chat/completions`
Security:

- Never print tokens; store secrets in `.env`.
- Tool calls are whitelisted and schema-validated (see the sketch after this list).
- Input sanitization (HTML/URL) and prompt-injection guards.
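A sketch of that whitelist-plus-schema guard; the names are hypothetical, and the repo's real checks live in `shared_utils`:

```python
# Hypothetical guard; the repo's actual validation lives in shared_utils.
ALLOWED_TOOLS = {"calculator", "web_search", "file_lookup"}

def validate_tool_call(name: str, args: dict, schema: dict) -> dict:
    """Reject non-whitelisted tools and arguments that violate the schema."""
    if name not in ALLOWED_TOOLS:
        raise ValueError(f"tool {name!r} is not whitelisted")
    missing = [k for k in schema.get("required", []) if k not in args]
    unknown = [k for k in args if k not in schema.get("properties", {})]
    if missing or unknown:
        raise ValueError(f"bad arguments: missing={missing}, unknown={unknown}")
    return args
```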
Contributing:

- Open an issue with context and a minimal test case.
- Use a feature branch and Conventional Commits.
- Ensure notebooks have a smoke cell and outputs are stripped.
- Run `pre-commit` and tests before opening a PR.
MIT (unless your organization requires a different license). Add your LICENSE file accordingly.
### (Optional) Save README from PowerShell
```powershell
@"
<PASTE THE ENTIRE MARKDOWN ABOVE>
"@ | Set-Content -Encoding utf8 README.md