Short public roadmap for the next upgrades to this computer-vision kit.
These are the highest-leverage next steps for the public template.
- Bring the full overlay experience to webcam mode, including labels, filters, and export.
- Add backend CI that runs the Python test suite in a real Python-enabled environment.
- Add fixture images and golden outputs so API and UI behavior stay easy to verify.
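A fixture-plus-golden check could look roughly like the sketch below. It assumes each golden file is a JSON list of detections with `label`, `score`, and `box` keys; that schema and the function names are hypothetical, not the template's actual contract.

```python
import json
import math
from pathlib import Path


def load_golden(path: Path) -> list:
    """Read a golden file: a JSON list of detection dicts (assumed schema)."""
    return json.loads(path.read_text(encoding="utf-8"))


def detections_match(actual: list, golden: list, tol: float = 1e-3) -> bool:
    """Compare two detection lists in order, tolerating small float drift."""
    if len(actual) != len(golden):
        return False
    for a, g in zip(actual, golden):
        if a["label"] != g["label"]:
            return False
        if not math.isclose(a["score"], g["score"], abs_tol=tol):
            return False
        if len(a["box"]) != len(g["box"]):
            return False
        if not all(math.isclose(x, y, abs_tol=tol) for x, y in zip(a["box"], g["box"])):
            return False
    return True
```

Comparing with a tolerance rather than exact equality keeps the check stable across platforms where floating-point results differ slightly.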
These upgrades make the starter more useful for real teams without changing the repo's shape.
- Add a model adapter layer for YOLO, ONNX Runtime, and hosted inference APIs.
- Add per-pipeline controls for confidence thresholds, box filtering, and segmentation cleanup.
- Add solo-class focus and hover linking in the preview overlay.
- Add a side-by-side original vs annotated review mode.
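One possible shape for the adapter layer and the per-pipeline controls is sketched below. The `Detection` type, the `ModelAdapter` protocol, and `run_pipeline` are illustrative names, and the box layout is an assumption; a real adapter would wrap YOLO, ONNX Runtime, or a hosted API behind the same `predict` method.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass(frozen=True)
class Detection:
    label: str
    score: float
    box: tuple  # (x1, y1, x2, y2); assumed layout


class ModelAdapter(Protocol):
    """Anything with a matching predict() satisfies the adapter contract."""

    def predict(self, image_bytes: bytes) -> list[Detection]: ...


class StubAdapter:
    """Stand-in backend that returns canned detections, useful in tests."""

    def __init__(self, canned: list[Detection]) -> None:
        self._canned = list(canned)

    def predict(self, image_bytes: bytes) -> list[Detection]:
        return list(self._canned)


def run_pipeline(
    adapter: ModelAdapter,
    image_bytes: bytes,
    min_score: float = 0.5,
    labels: Optional[frozenset] = None,
) -> list[Detection]:
    """Run the adapter, then apply confidence and label filters."""
    detections = adapter.predict(image_bytes)
    return [
        d
        for d in detections
        if d.score >= min_score and (labels is None or d.label in labels)
    ]
```

Keeping the filters outside the adapter means every backend gets the same threshold behavior without reimplementing it.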
These are larger product and platform expansions once the starter path feels mature.
- Add batch inference for multiple images in one request.
- Add async jobs for long-running inference workloads.
- Add video ingestion that reuses the same contract frame by frame.
- Add artifact storage and richer export flows.
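Batch and video support can both be thin wrappers around the existing single-image call. The sketch below assumes a hypothetical `infer_one(image_bytes) -> dict` function standing in for the current per-image contract; the wrapper names and the `stride` parameter are illustrative.

```python
from typing import Callable, Iterable, Iterator

# Hypothetical single-image contract: encoded image bytes in, result dict out.
InferFn = Callable[[bytes], dict]


def infer_batch(infer_one: InferFn, images: Iterable[bytes]) -> list:
    """Run the single-image contract over several images, preserving order."""
    return [infer_one(image) for image in images]


def infer_frames(
    infer_one: InferFn, frames: Iterable[bytes], stride: int = 1
) -> Iterator[tuple]:
    """Yield (frame_index, result) for every `stride`-th frame of a video."""
    if stride < 1:
        raise ValueError("stride must be >= 1")
    for index, frame in enumerate(frames):
        if index % stride == 0:
            yield index, infer_one(frame)
```

Because `infer_frames` is a generator, results can be streamed to the client or handed to an async job without holding a whole video's output in memory.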
Training should stay adjacent to the app, not mixed into the runtime path.
- Create a dedicated training/ workspace.
- Add dataset config templates for detection and segmentation.
- Add evaluation and regression scripts for sample predictions.
- Add experiment tracking hooks for metrics, artifacts, and model versions.
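A regression script for sample predictions might compare a new model's boxes against a stored baseline using intersection over union (IoU). This is a minimal sketch: the `(label, box)` tuple format, the `(x1, y1, x2, y2)` box layout, and the 0.9 threshold are assumptions chosen for illustration.

```python
def iou(a: tuple, b: tuple) -> float:
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def find_regressions(baseline: list, candidate: list, min_iou: float = 0.9) -> list:
    """Return baseline (label, box) pairs with no same-label candidate match."""
    missing = []
    for label, box in baseline:
        matched = any(
            c_label == label and iou(box, c_box) >= min_iou
            for c_label, c_box in candidate
        )
        if not matched:
            missing.append((label, box))
    return missing
```

An empty result means every baseline prediction still has a close match; anything returned is a candidate for review before promoting the new model version.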
The template itself is close to deploy-ready today:
- production Dockerfiles already exist for the frontend and backend
- release tags already publish images and a GitHub Release
- release smoke checks already validate the published images
The sign-language adaptation path is not deploy-ready yet.
Before treating that version as deployable, the next gaps to close are:
- add the actual sign-language inference pipeline in the backend
- define model artifact packaging and versioning
- set production CORS and environment values for the deployed frontend domain
- add a production-oriented deployment target or guide for a real host
- add regression checks for the sign-language model outputs
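For the CORS item, one framework-agnostic option is to read the allowed origins from an environment variable and validate them at startup. The variable name `ALLOWED_ORIGINS` and the localhost default below are assumptions, not settings the repo is known to define.

```python
import os
from urllib.parse import urlsplit


def parse_allowed_origins(raw: str) -> list:
    """Split a comma-separated origin list and reject anything that is not
    a bare http(s) origin (wildcards and paths are refused)."""
    origins = []
    for item in raw.split(","):
        origin = item.strip().rstrip("/")
        if not origin:
            continue
        parts = urlsplit(origin)
        if (
            parts.scheme not in ("http", "https")
            or not parts.netloc
            or parts.path
            or parts.query
            or parts.fragment
        ):
            raise ValueError(f"not a valid origin: {item!r}")
        origins.append(origin)
    return origins


def load_allowed_origins(env=os.environ) -> list:
    """Read ALLOWED_ORIGINS (hypothetical name), defaulting to a local dev origin."""
    return parse_allowed_origins(env.get("ALLOWED_ORIGINS", "http://localhost:3000"))
```

Failing fast on a malformed value, such as a stray `*`, avoids shipping a production backend that silently accepts requests from any origin.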
If you are extending the repo from here, the cleanest order is:
1. webcam overlay parity
2. backend CI + fixture-based verification
3. model adapter interface
That keeps the repo product-shaped while making it much easier to grow beyond the starter OpenCV pipelines.