
Commit bbdfd08: Polish repo presentation
Parent: 563c2cb

File tree: 9 files changed (+467, -89 lines)

README.md (75 additions, 61 deletions):
````diff
@@ -1,8 +1,47 @@
 # nextjs-python-computer-vision-kit
 
-A full-stack starter monorepo for detection-first computer vision products built with Next.js and FastAPI.
+A product-minded monorepo starter for detection-first computer vision apps built with Next.js and FastAPI.
 
-It combines a polished frontend, a Python API designed for image-processing workloads, shared root scripts, a documented OpenAPI contract, and a sample detection pipeline that runs on CPU with OpenCV so teams can start shipping product workflows before committing to a heavier model stack.
+It gives you a polished upload-to-inference UI, a typed OpenAPI contract, CPU-friendly starter pipelines, and a clean path into webcam capture, segmentation, and heavier model backends later.
+
+<p>
+  <a href="#quick-start">Quick start</a> ·
+  <a href="#screenshots">Screenshots</a> ·
+  <a href="#what-you-get">What you get</a> ·
+  <a href="./soon.md">Roadmap</a>
+</p>
+
+## Screenshots
+
+![Vision console screenshot](docs/assets/vision-console.png)
+
+![Webcam extension screenshot](docs/assets/webcam-extension.png)
+
+## Why This Repo Exists
+
+Most computer-vision starters fall into one of two buckets:
+
+- model notebooks with no product layer
+- web templates with no real inference contract
+
+This kit sits in the middle. It starts with a real product flow:
+
+- upload an image
+- run a detection-oriented pipeline
+- inspect typed boxes, metrics, and image metadata
+- keep the same contract when you add segmentation or webcam capture later
+
+## What You Get
+
+- detection-first starter UX with annotated preview overlays
+- inference-first architecture with a separate Next.js frontend and FastAPI backend
+- shared OpenAPI contract in `docs/openapi.yaml`
+- generated frontend API types from `openapi-typescript`
+- optional webcam extension that reuses the same API surface
+- first live segmentation extension with polygons, masks, and derived boxes
+- CPU-first OpenCV sample pipelines that are easy to replace later
+- root dev and verification scripts for a monorepo-style workflow
+- GitHub Actions template CI
 
 ## Stack
 
@@ -15,32 +54,22 @@ It combines a polished frontend, a Python API designed for image-processing work
 - OpenCV
 - Docker Compose
 
-## Monorepo Structure
-
-- `frontend/`: Next.js app with a vision-console UI, API client helpers, and generated OpenAPI types
-- `backend/`: FastAPI service with health, pipeline catalog, and image-analysis routes
-- `docs/`: shared API contract
-- `scripts/`: root development and verification scripts
-- `.github/`: CI workflow for the template
-
-## Recommended Shape
-
-- architecture: inference-first
-- default demo: detection-first
-- optional frontend extension: webcam capture
-- later backend extension: segmentation
-- later workspace/package: training pipeline
+## Included Pipelines
 
-This keeps the template easy to understand while still leaving a clean path into more advanced CV workflows.
+- `starter-detection`: default object-style detection flow for the main UI
+- `foreground-segmentation`: first extension pipeline with polygons plus derived boxes
+- `document-layout`: document-style region extraction for capture and scanning products
+- `dominant-color`: metrics-only example for QA and analytics workflows
 
-## Why This Template Exists
+These pipelines are intentionally lightweight. They prove the repo shape and developer workflow without forcing you into toy logic forever. Swap them for YOLO, ONNX Runtime, PyTorch, TensorRT, or a hosted inference service when you are ready.
 
-Most computer-vision starters are either model notebooks with no product layer or web templates with no real inference shape. This template sits in the middle:
+## Repo Shape
 
-- product-minded frontend by default
-- backend structure ready for image upload, preprocessing, and model-serving extensions
-- typed API contract between the web app and the inference service
-- one-command local development from the repo root
+- `frontend/`: Next.js app shell, upload flow, webcam flow, and generated API types
+- `backend/`: FastAPI service, pipeline registry, validation, and starter image logic
+- `docs/`: OpenAPI contract and screenshot assets
+- `scripts/`: root development and verification commands
+- `.github/`: template CI workflow
 
 ## Quick Start
 
@@ -51,9 +80,11 @@ Most computer-vision starters are either model notebooks with no product layer o
 5. Run `npm run api:types`.
 6. Run `npm run dev`.
 
-Frontend: `http://localhost:3000`
+Frontend: `http://localhost:3000`
 Backend: `http://127.0.0.1:8000`
 
+If you create `backend/.venv`, the root scripts will prefer that interpreter automatically.
+
 ## Commands
 
 ```bash
@@ -63,45 +94,28 @@ npm run api:types
 npm run check
 ```
 
-## API Contract
+## Verification
 
-- `docs/openapi.yaml` is the source of truth for the shared HTTP contract.
-- `frontend/src/generated/openapi.ts` is generated from that spec with `openapi-typescript`.
-- Run `npm run api:types` whenever backend payloads change.
+The root check runs:
 
-## Sample Pipelines Included
+- frontend lint
+- frontend typecheck
+- frontend production build
+- backend `pytest`
+- backend `compileall`
 
-- `starter-detection`: the default object-style detection sample used by the main frontend flow
-- `foreground-segmentation`: the first live extension pipeline, returning region polygons and derived boxes
-- `document-layout`: document-oriented box extraction for scanning and capture products
-- `dominant-color`: metrics-only extension pipeline for QA and analytics
+## Contract Notes
 
-These are intentionally lightweight starter pipelines. They are there to prove the architecture and developer workflow, not to lock you into toy logic. Swap them for YOLO, ONNX Runtime, PyTorch, TensorRT, or a custom service when you are ready.
+- `docs/openapi.yaml` is the source of truth for the HTTP contract.
+- `frontend/src/generated/openapi.ts` is generated from that spec.
+- Run `npm run api:types` whenever backend payloads change.
 
-## What You Get
+## Recommended Growth Path
+
+1. Keep the main story detection-first.
+2. Add webcam polish once upload mode feels strong.
+3. Add segmentation depth without changing the response boundary.
+4. Introduce a real model adapter layer.
+5. Split training and experimentation into a separate workspace later.
 
-- reusable Next.js + Python computer-vision monorepo layout
-- upload-and-detect frontend starter UI
-- optional webcam capture mode that reuses the same inference contract
-- first segmentation extension pipeline using the same response boundary
-- FastAPI inference endpoint with typed response models
-- OpenCV-based sample processing that runs without a GPU
-- root scripts for local dev and checks
-- GitHub Actions workflow for frontend and backend verification
-- Docker Compose dev option
-
-## Notes
-
-- The backend in this starter is CPU-first on purpose so it is easier to clone, run, and extend.
-- The main story is intentionally detection-first so the template stays easy to explain and demo.
-- The current environment used to build this template did not have Python installed, so the frontend was verified locally but backend execution was prepared rather than run here.
-- If you move to heavier vision workloads, add a worker or model-service layer and keep the current API as the contract boundary.
-
-## Next Expansions
-
-- async job queue for long-running inference
-- persistent artifact storage
-- model registry and experiment tracking
-- richer segmentation overlays and mask visualizations
-- video ingestion pipelines
-- training or experiment workspace in a separate `ml/` or `training/` package
+The short public roadmap lives in [soon.md](./soon.md).
````
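The `foreground-segmentation` entry above advertises polygons plus derived boxes. Deriving an axis-aligned box from a polygon can be sketched as follows; the `Point` and `Box` shapes here are hypothetical stand-ins, not the kit's generated types (those live in `frontend/src/generated/openapi.ts`):

```typescript
// Hypothetical shapes for illustration only; the authoritative contract
// is docs/openapi.yaml and the types generated from it.
type Point = { x: number; y: number };
type Box = { x: number; y: number; width: number; height: number };

// Derive the smallest axis-aligned bounding box enclosing a polygon.
function polygonToBox(points: Point[]): Box {
  const xs = points.map((p) => p.x);
  const ys = points.map((p) => p.y);
  const x = Math.min(...xs);
  const y = Math.min(...ys);
  return { x, y, width: Math.max(...xs) - x, height: Math.max(...ys) - y };
}

// Example: a triangle spanning (10,20) to (30,40) yields a 20x20 box.
const derived = polygonToBox([
  { x: 10, y: 20 },
  { x: 30, y: 25 },
  { x: 15, y: 40 },
]);
```

A helper like this lets the segmentation extension keep feeding the same box-oriented overlay components the detection flow already uses.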

docs/assets/sample-scene.png (38.5 KB)

docs/assets/vision-console.png (523 KB)

docs/assets/webcam-extension.png (442 KB)
New file (95 additions, 0 deletions):

```tsx
import { AnalysisPreview } from "@/components/analysis-preview";
import { AnalysisResults } from "@/components/analysis-results";
import { docsDemoImagePath, docsPreviewResult } from "@/lib/docs-demo";

export default function DocsPreviewPage() {
  return (
    <main className="min-h-screen overflow-hidden px-6 py-8 lg:px-10 lg:py-10">
      <div className="mx-auto flex max-w-7xl flex-col gap-6">
        <section className="rounded-[36px] border border-black/10 bg-white/76 px-7 py-8 shadow-[0_32px_90px_rgba(10,20,25,0.12)] backdrop-blur-xl lg:px-10">
          <div className="flex flex-wrap items-center gap-3">
            <span className="rounded-full bg-[var(--accent-soft)] px-3 py-1 text-xs font-semibold uppercase tracking-[0.3em] text-[var(--foreground)]">
              Docs Preview
            </span>
            <span className="rounded-full border border-black/10 px-3 py-1 font-mono text-xs text-black/60">
              Detection + segmentation showcase
            </span>
          </div>

          <div className="mt-6 grid gap-4 lg:grid-cols-[1.2fr_0.8fr] lg:items-end">
            <div>
              <h1 className="max-w-4xl text-5xl font-semibold tracking-[-0.05em] text-[var(--foreground)]">
                Screenshot-ready preview of the kit&apos;s main product path.
              </h1>
              <p className="mt-4 max-w-3xl text-base leading-8 text-black/70">
                One static scene, one typed response shape, and the same polished
                overlay UI the public starter ships with.
              </p>
            </div>

            <div className="grid gap-3 rounded-[28px] border border-black/10 bg-[#13262e] p-5 text-white">
              <div>
                <p className="font-mono text-xs uppercase tracking-[0.3em] text-white/45">
                  Included in the template
                </p>
                <p className="mt-3 text-lg font-semibold tracking-tight">
                  Upload workflow, overlay controls, typed results, and the first
                  segmentation extension.
                </p>
              </div>
              <div className="flex flex-wrap gap-2">
                {["starter-detection", "foreground-segmentation", "webcam mode"].map(
                  (item) => (
                    <span
                      key={item}
                      className="rounded-full border border-white/10 bg-white/8 px-3 py-1 font-mono text-[11px] text-white/72"
                    >
                      {item}
                    </span>
                  ),
                )}
              </div>
            </div>
          </div>
        </section>

        <section className="grid gap-6 lg:grid-cols-[0.95fr_1.05fr]">
          <div className="rounded-[32px] border border-black/10 bg-white/78 p-6 shadow-[0_32px_90px_rgba(10,20,25,0.12)] backdrop-blur-xl">
            <div className="flex flex-wrap items-center gap-3">
              <span className="rounded-full bg-[var(--accent-soft)] px-3 py-1 text-xs font-semibold uppercase tracking-[0.3em] text-[var(--foreground)]">
                Vision Console
              </span>
              <span className="rounded-full border border-black/10 px-3 py-1 font-mono text-xs text-black/65">
                seeded demo state
              </span>
            </div>

            <div className="mt-6 space-y-3">
              <h2 className="text-2xl font-semibold tracking-tight text-[var(--foreground)]">
                Upload once, inspect detections, and keep the contract stable.
              </h2>
              <p className="max-w-xl text-sm leading-7 text-black/70">
                The docs preview uses a seeded response so the overlay, legend,
                segmentation controls, and review panel all stay screenshot-friendly.
              </p>
            </div>

            <AnalysisPreview
              fileName={docsPreviewResult.image.filename}
              previewDimensions={null}
              previewUrl={docsDemoImagePath}
              result={docsPreviewResult}
            />
          </div>

          <AnalysisResults
            emptyDescription=""
            emptyEyebrow=""
            emptyTitle=""
            result={docsPreviewResult}
          />
        </section>
      </div>
    </main>
  );
}
```
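The page above renders a seeded `docsPreviewResult` through the same components the live upload flow uses. On the live path, an unknown fetch payload has to be narrowed to the contract shape before rendering; a minimal runtime guard might look like the sketch below, where `Detection` and `AnalysisResult` are simplified stand-ins rather than the repo's actual response models:

```typescript
// Simplified stand-ins; the authoritative payload shape is defined in
// docs/openapi.yaml and the types generated from it.
type Detection = { label: string; confidence: number };
type AnalysisResult = { pipeline: string; detections: Detection[] };

// Narrow an unknown JSON payload to the expected result shape at runtime.
function isAnalysisResult(value: unknown): value is AnalysisResult {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Partial<AnalysisResult>;
  return (
    typeof v.pipeline === "string" &&
    Array.isArray(v.detections) &&
    v.detections.every(
      (d) =>
        typeof d === "object" &&
        d !== null &&
        typeof d.label === "string" &&
        typeof d.confidence === "number",
    )
  );
}
```

Guards like this are cheap insurance that the webcam and segmentation extensions keep returning the same boundary shape the UI expects.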
New file (128 additions, 0 deletions):

```tsx
import Image from "next/image";

import { AnalysisResults } from "@/components/analysis-results";
import { docsDemoImagePath, docsWebcamResult } from "@/lib/docs-demo";

export default function DocsPreviewWebcamPage() {
  return (
    <main className="min-h-screen overflow-hidden px-6 py-8 lg:px-10 lg:py-10">
      <div className="mx-auto flex max-w-7xl flex-col gap-6">
        <section className="rounded-[36px] border border-black/10 bg-white/76 px-7 py-8 shadow-[0_32px_90px_rgba(10,20,25,0.12)] backdrop-blur-xl lg:px-10">
          <div className="flex flex-wrap items-center gap-3">
            <span className="rounded-full bg-[var(--accent-soft)] px-3 py-1 text-xs font-semibold uppercase tracking-[0.3em] text-[var(--foreground)]">
              Webcam Preview
            </span>
            <span className="rounded-full border border-black/10 px-3 py-1 font-mono text-xs text-black/60">
              Same API, different input source
            </span>
          </div>

          <h1 className="mt-6 max-w-4xl text-5xl font-semibold tracking-[-0.05em] text-[var(--foreground)]">
            The webcam extension still looks and feels like the same product.
          </h1>
          <p className="mt-4 max-w-3xl text-base leading-8 text-black/70">
            Capture is a frontend concern. The review surface, pipeline selection,
            and result shape all stay aligned with the upload flow.
          </p>
        </section>

        <section className="grid gap-6 lg:grid-cols-[0.95fr_1.05fr]">
          <div className="rounded-[32px] border border-black/10 bg-white/78 p-6 shadow-[0_32px_90px_rgba(10,20,25,0.12)] backdrop-blur-xl">
            <div className="flex flex-wrap items-center gap-3">
              <span className="rounded-full bg-[var(--accent-soft)] px-3 py-1 text-xs font-semibold uppercase tracking-[0.3em] text-[var(--foreground)]">
                Optional Mode
              </span>
              <span className="rounded-full border border-black/10 px-3 py-1 font-mono text-xs text-black/65">
                seeded capture preview
              </span>
            </div>

            <div className="mt-6 space-y-3">
              <h2 className="text-2xl font-semibold tracking-tight text-[var(--foreground)]">
                Reuse the detection contract from a live camera frame.
              </h2>
              <p className="max-w-xl text-sm leading-7 text-black/70">
                This mock state mirrors the public webcam page, but with a seeded
                frame so the docs can show the extension path clearly.
              </p>
            </div>

            <div className="mt-8 space-y-5">
              <div className="rounded-[24px] border border-black/10 bg-white px-4 py-4 text-sm text-black/70">
                <p className="font-semibold text-[var(--foreground)]">Starter Detection</p>
                <p className="mt-2 leading-7">
                  Detection-first sample pipeline that returns object-style boxes and
                  confidence scores.
                </p>
                <div className="mt-3 flex flex-wrap gap-2">
                  {["object boxes", "confidence scores", "coverage metrics"].map((item) => (
                    <span
                      key={item}
                      className="rounded-full bg-[var(--accent-soft)] px-3 py-1 font-mono text-xs text-black/70"
                    >
                      {item}
                    </span>
                  ))}
                </div>
              </div>

              <div className="overflow-hidden rounded-[24px] border border-black/10 bg-[#12242c]">
                <div className="relative aspect-video w-full">
                  <Image
                    alt="Seeded webcam capture preview"
                    className="object-cover"
                    fill
                    sizes="(max-width: 1024px) 100vw, 560px"
                    src={docsDemoImagePath}
                    unoptimized
                  />
                  <div className="absolute inset-0 bg-[linear-gradient(180deg,rgba(18,36,44,0.08),rgba(18,36,44,0.18))]" />
                </div>
                <div className="flex flex-wrap items-center justify-between gap-3 px-4 py-4 text-sm text-white/75">
                  <p className="font-mono text-xs uppercase tracking-[0.3em] text-white/45">
                    Camera state: live
                  </p>
                  <div className="flex flex-wrap gap-2">
                    <button
                      className="rounded-full bg-[var(--accent)] px-4 py-2 font-medium text-[#1d1007]"
                      type="button"
                    >
                      Restart camera
                    </button>
                    <button
                      className="rounded-full border border-white/15 px-4 py-2 font-medium text-white"
                      type="button"
                    >
                      Stop
                    </button>
                    <button
                      className="rounded-full border border-white/15 bg-white/8 px-4 py-2 font-medium text-white"
                      type="button"
                    >
                      Capture and analyze
                    </button>
                  </div>
                </div>
              </div>

              <div className="rounded-[24px] border border-black/10 bg-[#fff4ea] px-4 py-4 text-sm text-black/70">
                <p className="font-semibold text-[var(--foreground)]">Keep upload as the main path</p>
                <p className="mt-2 leading-7">
                  The starter still teaches image upload first. Webcam stays here as a
                  believable extension once the base contract already feels solid.
                </p>
              </div>
            </div>
          </div>

          <AnalysisResults
            emptyDescription=""
            emptyEyebrow=""
            emptyTitle=""
            result={docsWebcamResult}
          />
        </section>
      </div>
    </main>
  );
}
```
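Both the upload and webcam flows select from the same pipeline catalog served by the backend. The catalog pattern can be as small as a keyed map; the ids below mirror the README's pipeline names, while the structure itself is a hypothetical sketch, not the FastAPI backend's actual registry:

```typescript
// Hypothetical catalog entry for illustration; in the kit the registry
// lives in the backend and is exposed through the shared OpenAPI contract.
type PipelineInfo = {
  id: string;
  summary: string;
  outputs: string[];
};

const pipelineCatalog = new Map<string, PipelineInfo>([
  ["starter-detection", {
    id: "starter-detection",
    summary: "object-style boxes and confidence scores",
    outputs: ["boxes", "metrics"],
  }],
  ["foreground-segmentation", {
    id: "foreground-segmentation",
    summary: "region polygons plus derived boxes",
    outputs: ["polygons", "boxes"],
  }],
]);

// List ids for a pipeline picker; look an entry up before sending a request.
function listPipelineIds(): string[] {
  return [...pipelineCatalog.keys()];
}
```

Keeping capture sources ignorant of pipeline internals is what lets the webcam page reuse the upload flow's picker and review surface unchanged.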
