A full-stack app for video understanding and semantic search. Upload a video; the backend transcribes it with Whisper, chunks and embeds the transcript, and lets you search or chat over the video. Built with:
- Backend: Django 5, Django REST Framework, Channels (in-memory channel layer)
- ML: faster-whisper, sentence-transformers, NumPy
- Storage: PostgreSQL
- Frontend: Next.js 15 (App Router), React 19, Tailwind CSS, Zustand
```
scenequery/
├─ backend/
│  ├─ server/          # Django project (ASGI+WSGI); settings, urls, asgi
│  ├─ videos/          # App: upload, process, search, chat (WebSockets)
│  ├─ manage.py
│  └─ requirements.txt
├─ frontend/
│  ├─ src/app/         # Next.js app router pages
│  ├─ src/lib/api.ts   # Calls backend REST endpoints
│  └─ package.json
└─ .env                # Project-wide environment (not committed)
```
- Upload a video and track processing progress live via WebSockets
- Speech-to-text transcription via faster-whisper
- Chunking and embedding of transcript with sentence-transformers
- Vector search over transcript segments to find the most relevant moments
- Optional chat over a single video, streaming tokens from OpenAI
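The chunking step above can be sketched as grouping consecutive transcript segments into word-budgeted chunks that keep the start timestamp of their first segment. This is a hypothetical illustration only — the function name, segment shape, and word budget are assumptions, not the project's actual code:

```python
def chunk_segments(segments, max_words=60):
    # segments: list of (start_seconds, text) pairs from the transcriber.
    # Groups consecutive segments into chunks of roughly max_words words,
    # keeping the start time of the first segment in each chunk so search
    # results can link back to a moment in the video.
    chunks, current, words, start = [], [], 0, None
    for seg_start, text in segments:
        if start is None:
            start = seg_start
        current.append(text)
        words += len(text.split())
        if words >= max_words:
            chunks.append((start, " ".join(current)))
            current, words, start = [], 0, None
    if current:
        chunks.append((start, " ".join(current)))
    return chunks
```

Each resulting `(start, text)` chunk would then be embedded and stored for vector search.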
- Node.js 18+ and a package manager (npm, pnpm, or yarn)
- Python 3.11+
- `ffmpeg` and `ffprobe` installed and on PATH (or set `FFMPEG_PATH`/`FFPROBE_PATH`)
- For OpenAI chat: an `OPENAI_API_KEY`
- PostgreSQL 13+ (required)
Follow these steps on Windows 10/11 using PowerShell. Administrator is not required unless installing system packages.
- Install prerequisites
- Node.js: Download LTS from https://nodejs.org/en/download and install.
- Python 3.11+: Download from https://www.python.org/downloads/windows/ and enable “Add python.exe to PATH”.
- Git (optional but recommended): https://git-scm.com/download/win
- PostgreSQL 13+: https://www.postgresql.org/download/windows/
  - During setup, note the superuser (default `postgres`) and its password.
  - After install, open “SQL Shell (psql)” and create the database:
    ```sql
    CREATE DATABASE scenequery;
    ```
- ffmpeg + ffprobe: Download static builds from https://www.gyan.dev/ffmpeg/builds/ or https://www.ffmpeg.org/download.html
  - Unzip to `C:\ffmpeg\` so the binaries are at `C:\ffmpeg\bin\ffmpeg.exe` and `C:\ffmpeg\bin\ffprobe.exe`.
  - Either add `C:\ffmpeg\bin` to your PATH, or set `FFMPEG_PATH` and `FFPROBE_PATH` in `.env`.
- Create a `.env` file at the repo root
  Copy and adjust the example below (see the Configuration section for full options):
```env
DJANGO_SECRET_KEY=your-dev-key
CORS_ALLOWED_ORIGINS=http://localhost:3000
PGDATABASE=scenequery
PGUSER=postgres
PGPASSWORD=your-postgres-password
PGHOST=localhost
PGPORT=5432

# Optional: OpenAI chat
# OPENAI_API_KEY=sk-...

# Optional: ffmpeg if not on PATH
FFMPEG_PATH=C:\\ffmpeg\\bin\\ffmpeg.exe
FFPROBE_PATH=C:\\ffmpeg\\bin\\ffprobe.exe

# Frontend
NEXT_PUBLIC_API_BASE=http://127.0.0.1:8000
```
- Backend setup (PowerShell)
```powershell
# From repo root
python -m venv backend/venv
backend/venv/Scripts/Activate.ps1
pip install -r backend/requirements.txt

# Initialize database
python backend/manage.py migrate

# Start HTTP + WebSocket dev server (Django runserver)
python backend/manage.py runserver 8000
```
- Frontend setup (PowerShell)
```powershell
# From repo root
cd frontend
npm install
npm run dev
```

The app will be available at http://localhost:3000 and the backend at http://127.0.0.1:8000.
Tips:
- If PowerShell blocks script execution, run `Set-ExecutionPolicy -Scope CurrentUser RemoteSigned` once.
- If ffmpeg is not found, verify PATH or `FFMPEG_PATH`/`FFPROBE_PATH`.
- Ensure the PostgreSQL service is running (check “Services” or use `pg_ctl` from the installation directory).
Open two terminals.
- Backend (Django)
```bash
# from repo root
python -m venv backend/venv
backend/venv/Scripts/activate        # Windows
# source backend/venv/bin/activate   # macOS/Linux
pip install -r backend/requirements.txt

# Create and migrate the DB
python backend/manage.py migrate

# Run the dev server (HTTP + WebSocket)
python backend/manage.py runserver 8000
```
- Frontend (Next.js)
```bash
# from repo root
cd frontend
npm install
# or: pnpm install / yarn install

# In dev, the frontend proxies to the backend base URL you configure
npm run dev
```
Then open http://localhost:3000 and ensure the backend is reachable at http://127.0.0.1:8000 (configurable).
Provide environment variables in a root `.env` (already gitignored). The backend loads this with `python-dotenv` in `backend/server/settings.py`.
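For illustration, this is roughly the effect `python-dotenv` has when loading the root `.env` (a simplified stdlib sketch — the project itself uses `python-dotenv`; this version omits quoting and variable interpolation):

```python
import os

def load_env(path):
    # Minimal illustration of dotenv-style loading: read KEY=VALUE lines,
    # skip blanks and comments, and populate os.environ without
    # overwriting variables that are already set in the environment.
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```

Because real environment variables win over `.env` values, you can temporarily override any setting per shell session.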
Backend:
- `DJANGO_SECRET_KEY` (default: `dev-insecure-secret-key`)
- `ALLOWED_HOSTS` (comma-separated, default: `localhost,127.0.0.1`)
- Database (PostgreSQL by default): `PGDATABASE`, `PGUSER`, `PGPASSWORD`, `PGHOST`, `PGPORT`
- Media/static: `MEDIA_ROOT` (default: `<backend>/.media`)
- CORS:
  - `CORS_ALLOW_ALL_ORIGINS` (default: `true`)
  - `CORS_ALLOWED_ORIGINS` (CSV; default: `http://localhost:3000`)
- Channels (WebSockets): in-memory channel layer is used; Redis is not required.
- ffmpeg/ffprobe location (see `backend/videos/utils/ffmpeg.py`):
  - `FFMPEG_PATH`, `FFPROBE_PATH` (if not on PATH)
- Whisper transcription (see `backend/videos/utils/transcription.py`):
  - `WHISPER_MODEL` (default: `small`)
  - `WHISPER_MODEL_PATH` (use a local model directory instead of downloading)
  - `ALLOW_MODEL_DOWNLOADS` (true/false, default: `true`)
  - `WHISPER_CACHE_DIR` or global `MODEL_CACHE_DIR`
  - `WHISPER_DEVICE` (`cpu`|`cuda`|`auto`, default: `cpu`)
  - `WHISPER_COMPUTE_TYPE` (e.g., `float32`, `float16`, `int8_float16`)
  - `WHISPER_CPU_THREADS` (int, default: `0` for runtime default)
  - `WHISPER_NUM_WORKERS` (int, default: `1`)
  - Tuning:
    - `WHISPER_VAD_FILTER` (true/false, default: `true`)
    - `WHISPER_BEAM_SIZE` (int, default: `1`)
    - `WHISPER_BEST_OF` (int, default: `1`)
    - `WHISPER_CONDITION_ON_PREV` (true/false, default: `false`)
    - `WHISPER_LANGUAGE` (e.g., `en`)
    - `WHISPER_TEMPERATURE` (float, default: `0`)
- Embeddings (see `backend/videos/utils/embeddings.py`):
  - `EMBED_MODEL` (default: `sentence-transformers/all-MiniLM-L6-v2`)
  - `EMBED_MODEL_PATH` (local directory override)
  - `ALLOW_MODEL_DOWNLOADS` (true/false)
  - `EMBED_CACHE_DIR` or global `MODEL_CACHE_DIR`
  - `EMBED_DEVICE` (`cuda` or `cpu`)
  - `EMBED_BATCH_SIZE` (default: `32`)
- OpenAI for chat streaming (see `backend/videos/consumers.py`):
  - `OPENAI_API_KEY` (required for chat)
  - `OPENAI_MODEL` (default: `gpt-4o-mini`)

Frontend:
- `NEXT_PUBLIC_API_BASE` (default: `http://127.0.0.1:8000`), used in `frontend/src/lib/api.ts`
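Several of the settings above are true/false strings read from the environment. Reading one might look like the following sketch (`env_bool` is a hypothetical helper written for this README, not code from the project):

```python
import os

def env_bool(name, default):
    # Interprets true/false-style settings such as WHISPER_VAD_FILTER or
    # ALLOW_MODEL_DOWNLOADS. Unset variables fall back to the documented
    # default; common truthy spellings are accepted.
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in ("1", "true", "yes", "on")
```

Anything other than a recognized truthy spelling (including typos) reads as false, so prefer the literal `true`/`false` forms shown in the defaults above.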
Example .env (minimal dev):
```env
# Backend
DJANGO_SECRET_KEY=your-dev-key
CORS_ALLOWED_ORIGINS=http://localhost:3000

# PostgreSQL
PGDATABASE=scenequery
PGUSER=postgres
PGPASSWORD=postgres
PGHOST=localhost
PGPORT=5432

# Optional: OpenAI chat
OPENAI_API_KEY=sk-...
OPENAI_MODEL=gpt-4o-mini

# Optional: ffmpeg if not on PATH
FFMPEG_PATH=C:\\ffmpeg\\bin\\ffmpeg.exe
FFPROBE_PATH=C:\\ffmpeg\\bin\\ffprobe.exe

# Frontend
NEXT_PUBLIC_API_BASE=http://127.0.0.1:8000
```
- Backend dev server: `python backend/manage.py runserver 8000`
- Frontend dev server: `npm run dev` in `frontend/`
- Visit the UI at http://localhost:3000
Uploads and derived files (frames, media) are saved under `MEDIA_ROOT` (default `backend/.media/`). In dev, `DEBUG=True` serves media at `/media/` via `backend/server/urls.py`.
For full ASGI support (HTTP + WebSockets) using Daphne instead of Django's dev server:
```powershell
# From repo root
backend/venv/Scripts/Activate.ps1
# Run from backend/ so the server package is importable
cd backend
daphne -b 127.0.0.1 -p 8000 server.asgi:application
```

Notes:
- If Windows Firewall prompts, allow access on Private networks.
- Keep `NEXT_PUBLIC_API_BASE` pointing to `http://127.0.0.1:8000` so the frontend talks to Daphne.
- Static files in production should be served by a web server or CDN; in dev `DEBUG=True` is sufficient.
Base URL defaults to http://127.0.0.1:8000.
- `POST /api/videos/` — upload a video file (form field: `file`)
- `GET /api/videos/<id>/` — get details about a video
- `GET /api/videos/<id>/search?q=...` — semantic search in transcript; returns best match and alternatives
- `GET /media/frames/<frame>.jpg` — preview frames rendered on demand
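The search endpoint's ranking step can be sketched with plain cosine similarity over stored chunk embeddings. This is a pure-Python illustration — the actual pipeline uses sentence-transformers and NumPy, and these helper names are assumptions:

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def best_match(query_vec, chunks):
    # chunks: list of (start_seconds, text, embedding) tuples.
    # Returns the chunk closest to the query embedding, plus the
    # remaining chunks ordered by similarity ("alternatives").
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c[2]), reverse=True)
    return ranked[0], ranked[1:]
```

In the real service the query string would first be embedded with the same model as the chunks before ranking.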
WebSocket endpoints (see `backend/videos/routing.py` and `backend/server/asgi.py`):
- `ws://127.0.0.1:8000/ws/videos/<id>/progress/` — processing progress events
- `ws://127.0.0.1:8000/ws/videos/<id>/chat/` — chat over a single video; send `{ type: "user_message", text: "..." }`
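A client for these sockets only needs the URL pattern and message shape listed above; both can be sketched as (hypothetical helpers written for this README, not project code):

```python
import json

def ws_url(base, video_id, channel):
    # Builds a WebSocket URL for the endpoints above.
    # channel is "progress" or "chat"; base is like "ws://127.0.0.1:8000".
    return f"{base}/ws/videos/{video_id}/{channel}/"

def user_message(text):
    # Serializes the documented chat payload: { type: "user_message", text }.
    return json.dumps({"type": "user_message", "text": text})
```

A real client would open the URL with any WebSocket library, send `user_message(...)` frames, and read streamed tokens from the server.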
- The channel layer is in-memory. Redis is not used in this project.
- PostgreSQL is the default DB. Ensure the `PG*` env vars are set and the database exists.
- Large model downloads: set `ALLOW_MODEL_DOWNLOADS=false` and point to a local `WHISPER_MODEL_PATH` and `EMBED_MODEL_PATH` if you work offline.
- GPU acceleration: set `WHISPER_DEVICE=cuda` and a compatible `WHISPER_COMPUTE_TYPE` (e.g., `float16`). Ensure GPU drivers and a CUDA runtime are available in your environment.
Backend:
```bash
# create a superuser
python backend/manage.py createsuperuser

# run tests (if/when added)
pytest  # if configured
```

Frontend:
```bash
npm run dev
npm run build
npm start
npm run lint
```
- ffmpeg/ffprobe not found: install ffmpeg and ensure both `ffmpeg` and `ffprobe` are on PATH, or set `FFMPEG_PATH`/`FFPROBE_PATH`.
- OpenAI chat errors: ensure `OPENAI_API_KEY` is set and the API is reachable from the backend process.
- WebSockets not connecting in dev: confirm the backend runs on `http://127.0.0.1:8000`, `ASGI_APPLICATION` is configured (it is), and that the browser origin is allowed by CORS. If using a different host/port, set `NEXT_PUBLIC_API_BASE` and `CORS_ALLOWED_ORIGINS` accordingly.
- Model downloads blocked: set `ALLOW_MODEL_DOWNLOADS=true` or provide local model paths and cache directories.
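The ffmpeg lookup described above (explicit env var wins, otherwise PATH) can be sketched as follows. This is an assumption based on this README, not the actual code in `backend/videos/utils/ffmpeg.py`:

```python
import os
import shutil

def resolve_ffmpeg():
    # Prefer an explicitly configured binary via FFMPEG_PATH; fall back
    # to searching PATH. Returns None if neither resolves, which is the
    # "ffmpeg/ffprobe not found" case in the troubleshooting list.
    explicit = os.environ.get("FFMPEG_PATH")
    if explicit and os.path.exists(explicit):
        return explicit
    return shutil.which("ffmpeg")
```

Running this interactively is a quick way to see which binary (if any) the backend would pick up.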
Proprietary/internal by default. Add your chosen license here if publishing.