A Gradio web interface for text generation (Gemma 3 4B as the default model).
- NVIDIA GPU (8GB+ VRAM)
- Python 3.11+
- CUDA 12.1+
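A quick way to sanity-check the interpreter against the Python requirement above (a minimal sketch; the helper function and its name are illustrative, not part of the project):

```python
import sys

def meets_minimum(version, minimum=(3, 11)):
    """Return True if a (major, minor) version tuple satisfies the minimum."""
    return tuple(version[:2]) >= minimum

# Check the interpreter running this script.
print(meets_minimum(sys.version_info))
```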
- Clone the repository:

```shell
git clone https://github.com/vpakarinen2/llm-text-gradio-webui.git
cd llm-text-gradio-webui
```
- Create/activate virtual environment:

```shell
python -m venv .venv

# Windows
.venv\Scripts\activate

# Linux/macOS
source .venv/bin/activate
```
- Install PyTorch with CUDA:

```shell
pip install torch --index-url https://download.pytorch.org/whl/cu121
```
- Install dependencies:

```shell
pip install -r requirements.txt --no-deps
```
- Create a `.env` file:

```env
EMBED_MODEL_ID=sentence-transformers/all-MiniLM-L6-v2
GRADIO_ANALYTICS_ENABLED=False
MAX_NEW_TOKENS=128
DEVICE=cuda
```
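A server typically reads these values at startup (often via `python-dotenv`). A minimal sketch of how the settings above might be parsed with only the standard library — the variable names come from the file above, but the loader function and its defaults are assumptions:

```python
import os

def load_settings(env=os.environ):
    """Read the .env-style settings, falling back to the documented defaults."""
    return {
        "embed_model_id": env.get(
            "EMBED_MODEL_ID", "sentence-transformers/all-MiniLM-L6-v2"
        ),
        # Gradio expects a boolean; treat anything but "true" as False.
        "gradio_analytics_enabled": env.get(
            "GRADIO_ANALYTICS_ENABLED", "False"
        ).lower() == "true",
        "max_new_tokens": int(env.get("MAX_NEW_TOKENS", "128")),
        "device": env.get("DEVICE", "cuda"),
    }
```

Passing a plain dict instead of `os.environ` makes the loader easy to unit-test.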
- Create a Hugging Face access token (Settings → Access Tokens).
- Log in:

```shell
huggingface-cli login
```
- Run the server:

```shell
python -m app.server
```
Ville Pakarinen (@vpakarinen2)

