A product-minded monorepo starter for detection-first computer vision apps built with Next.js and FastAPI.

It gives you a polished upload-to-inference UI, a typed OpenAPI contract, CPU-friendly starter pipelines, and a clean path into webcam capture, segmentation, and heavier model backends later.

Most computer-vision starters fall into one of two buckets:

- model notebooks with no product layer
- web templates with no real inference contract

This kit sits in the middle. It starts with a real product flow:

- upload an image
- run a detection-oriented pipeline
- inspect typed boxes, metrics, and image metadata
- keep the same contract when you add segmentation or webcam capture later

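The "typed boxes, metrics, and image metadata" step above can be sketched with plain dataclasses. This is a hypothetical mirror of the response shape, not the actual schema in `docs/openapi.yaml` — field names here are illustrative:

```python
from dataclasses import dataclass, field

@dataclass
class Box:
    # One detection: pixel-space bounding box plus label and confidence.
    label: str
    confidence: float
    x: int
    y: int
    width: int
    height: int

@dataclass
class AnalysisResult:
    # One inference response: which pipeline ran, its detected boxes,
    # free-form numeric metrics, and basic metadata about the image.
    pipeline: str
    boxes: list[Box] = field(default_factory=list)
    metrics: dict[str, float] = field(default_factory=dict)
    image: dict[str, int] = field(default_factory=dict)

result = AnalysisResult(
    pipeline="starter-detection",
    boxes=[Box("object", 0.87, 10, 20, 64, 48)],
    metrics={"box_count": 1.0},
    image={"width": 640, "height": 480},
)
```

Keeping every pipeline behind one response shape like this is what lets segmentation and webcam capture reuse the same frontend code later.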
## What You Get

- detection-first starter UX with annotated preview overlays
- inference-first architecture with a separate Next.js frontend and FastAPI backend
- shared OpenAPI contract in `docs/openapi.yaml`
- frontend API types generated with `openapi-typescript`
- optional webcam extension that reuses the same API surface
- first live segmentation extension with polygons, masks, and derived boxes
- CPU-first OpenCV sample pipelines that are easy to replace later
- root dev and verification scripts for a monorepo-style workflow
- GitHub Actions template CI

## Stack

- Next.js
- FastAPI
- OpenCV
- Docker Compose

## Included Pipelines

- `starter-detection`: default object-style detection flow for the main UI
- `foreground-segmentation`: first extension pipeline with polygons plus derived boxes
- `document-layout`: document-style region extraction for capture and scanning products
- `dominant-color`: metrics-only example for QA and analytics workflows

These pipelines are intentionally lightweight. They prove the repo shape and developer workflow without forcing you into toy logic forever. Swap them for YOLO, ONNX Runtime, PyTorch, TensorRT, or a hosted inference service when you are ready.

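To give a flavor of how small these pipelines can be, here is a metrics-only sketch in the spirit of `dominant-color`. It is a pure-NumPy stand-in, not the repo's actual implementation (the real pipelines use OpenCV):

```python
import numpy as np

def dominant_color(image: np.ndarray, levels: int = 4) -> dict[str, float]:
    """Quantize an HxWx3 uint8 image into coarse color bins and report
    the center of the most common bin plus its share of pixels."""
    step = 256 // levels
    # Cast up so the bin-key arithmetic below cannot overflow uint8.
    quantized = (image // step).reshape(-1, 3).astype(np.int64)
    # Encode each pixel's (r, g, b) bin triple as a single integer key.
    keys = quantized[:, 0] * levels * levels + quantized[:, 1] * levels + quantized[:, 2]
    counts = np.bincount(keys, minlength=levels ** 3)
    top = int(counts.argmax())
    r, g, b = top // (levels * levels), (top // levels) % levels, top % levels
    return {
        "r": (r + 0.5) * step,   # bin centers back in 0-255 space
        "g": (g + 0.5) * step,
        "b": (b + 0.5) * step,
        "share": float(counts[top]) / keys.size,
    }

# A solid-color image should report a single bin with share 1.0.
img = np.zeros((8, 8, 3), dtype=np.uint8)
img[..., 0] = 200
summary = dominant_color(img)
```

A pipeline like this returns only `metrics`, which is exactly why it makes a good QA/analytics example: no boxes or masks required.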
## Repo Shape

- `frontend/`: Next.js app shell, upload flow, webcam flow, and generated API types
- `backend/`: FastAPI service, pipeline registry, validation, and starter image logic
- `docs/`: OpenAPI contract and screenshot assets
- `scripts/`: root development and verification commands
- `.github/`: template CI workflow

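The pipeline registry mentioned under `backend/` can be sketched as a name-to-callable map with a small registration decorator. Names and signatures here are illustrative, not the repo's actual API — the real service adds request validation and typed response models:

```python
from typing import Callable

# Each pipeline takes raw uploaded image bytes and returns a
# JSON-serializable result dict.
Pipeline = Callable[[bytes], dict]

PIPELINES: dict[str, Pipeline] = {}

def register(name: str) -> Callable[[Pipeline], Pipeline]:
    """Decorator that publishes a pipeline under a stable public name."""
    def wrap(fn: Pipeline) -> Pipeline:
        PIPELINES[name] = fn
        return fn
    return wrap

@register("starter-detection")
def starter_detection(data: bytes) -> dict:
    # Placeholder body; the real pipeline decodes the bytes with OpenCV
    # and returns detected boxes.
    return {"pipeline": "starter-detection", "boxes": [], "metrics": {"bytes": len(data)}}

def run(name: str, data: bytes) -> dict:
    if name not in PIPELINES:
        raise KeyError(f"unknown pipeline: {name}")
    return PIPELINES[name](data)
```

The catalog route can then simply return `sorted(PIPELINES)`, so adding a pipeline is one decorated function with no route changes.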
## Quick Start

5. Run `npm run api:types`.
6. Run `npm run dev`.

Frontend: `http://localhost:3000`
Backend: `http://127.0.0.1:8000`

If you create `backend/.venv`, the root scripts will prefer that interpreter automatically.

## Commands

```bash
npm run api:types
npm run check
```

## Verification

The root check runs:

- frontend lint
- frontend typecheck
- frontend production build
- backend `pytest`
- backend `compileall`

## Contract Notes

- `docs/openapi.yaml` is the source of truth for the HTTP contract.
- `frontend/src/generated/openapi.ts` is generated from that spec.
- Run `npm run api:types` whenever backend payloads change.

## Recommended Growth Path

1. Keep the main story detection-first.
2. Add webcam polish once upload mode feels strong.
3. Add segmentation depth without changing the response boundary.
4. Introduce a real model adapter layer.
5. Split training and experimentation into a separate workspace later.

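The adapter layer in step 4 can be sketched as a small `Protocol`, so the CPU OpenCV starters and heavier backends stay interchangeable behind the same response boundary. The names below are illustrative, not the repo's actual interfaces:

```python
from typing import Protocol

class DetectionModel(Protocol):
    def detect(self, image_bytes: bytes) -> list[dict]:
        """Return detections as JSON-serializable dicts."""
        ...

class StubCvModel:
    # Stand-in for the CPU-first OpenCV starter pipeline. A YOLO or
    # ONNX Runtime adapter would implement the same method.
    def detect(self, image_bytes: bytes) -> list[dict]:
        return [{"label": "object", "confidence": 0.5}]

def analyze(model: DetectionModel, image_bytes: bytes) -> dict:
    # The HTTP layer depends only on the protocol, so swapping model
    # backends never changes the response shape the frontend sees.
    return {"boxes": model.detect(image_bytes)}
```

Because the route code depends on `DetectionModel` rather than a concrete class, the OpenAPI contract stays stable while the model stack evolves underneath it.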
The short public roadmap lives in [soon.md](./soon.md).