NC-KTV (C++ Edition)

Next-Generation Professional Music Video Karaoke Maker

High-performance, native desktop application for creating professional karaoke videos with AI-powered vocal separation, automatic transcription, and synchronized lyrics.

Features • Installation • Architecture • Changelog

Overview

Welcome to the C++ Qt6 Rebirth of NC-KTV.

NC-KTV was originally written in Python. The engine has been completely overhauled in modern C++17 and Qt6 for higher performance, hardware-accelerated rendering, and a smoother editing experience. The rewrite improves processing speed and UI responsiveness, enabling real-time waveform rendering, precise audio seeking, and seamless subtitle processing.

NC-KTV automates the entire karaoke video creation workflow:

Import any media file natively via FFmpeg.
Extract vocals cleanly using UVR-compatible ONNX models (MDX-Net) with a native C++ Mixed-Radix FFT DSP pipeline.
Transcribe lyrics automatically (Whisper) or import industry standards.
Sync lyrics with sub-millisecond precision using the new Hardware-Accelerated Timeline and one-click Set Start/End controls.
Export to professional-grade formats (ASS, MP4, MKV) with GPU-accelerated encoding and custom resolution scaling.
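The export step can be pictured as choosing an FFmpeg encoder for the detected GPU and passing a scale filter for the target resolution. The following is a minimal sketch, not the application's actual code: the `GpuVendor` enum and the `encoderFor` and `buildExportArgs` helpers are hypothetical names introduced here, and the software fallback to `libx264` is an assumption. The encoder names `h264_nvenc`, `h264_qsv`, and `h264_amf` are FFmpeg's H.264 encoders for NVIDIA, Intel, and AMD hardware.

```cpp
#include <string>
#include <vector>

// Hypothetical enum for illustration; the application's own types are not shown in this README.
enum class GpuVendor { Nvidia, Intel, Amd, None };

// Map a GPU vendor to an FFmpeg H.264 encoder name.
// Falls back to the software encoder libx264 when no GPU encoder applies (an assumption).
std::string encoderFor(GpuVendor vendor) {
    switch (vendor) {
        case GpuVendor::Nvidia: return "h264_nvenc";
        case GpuVendor::Intel:  return "h264_qsv";
        case GpuVendor::Amd:    return "h264_amf";
        case GpuVendor::None:   break;
    }
    return "libx264";
}

// Build an FFmpeg argument list that scales the video to targetHeight
// (width -2 keeps the aspect ratio and rounds to an even number) and
// encodes it with the encoder chosen for the given vendor.
std::vector<std::string> buildExportArgs(const std::string& input,
                                         const std::string& output,
                                         GpuVendor vendor,
                                         int targetHeight) {
    return {
        "-y", "-i", input,
        "-vf", "scale=-2:" + std::to_string(targetHeight),
        "-c:v", encoderFor(vendor),
        "-c:a", "aac",
        output
    };
}
```

For example, `buildExportArgs("in.mkv", "out.mp4", GpuVendor::Nvidia, 720)` yields arguments for a 720p NVENC encode. Burning the ASS subtitles into the video would need an additional filter and is left out of this sketch.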

Core Features

AI Vocal Separation

High-Performance Bridge: Driven by the audio-separator Python library, so separation output matches UVR quality.
Hardware Acceleration: Automatic target detection for CUDA (NVIDIA) via PyTorch. Local GPU libraries in models/whisper/cudn12/ are auto-bundled into the portable build.
Full Model Support: Supports all UVR models including MDX-Net, VR Architecture, and Roformer (.onnx and .pth).

Hardware-Accelerated Studio Editor

Native Qt6 UI: Smooth 60+ fps rendering of complex timeline data via QPainter with hardware acceleration, featuring smart render debouncing.
Interactive Waveforms: Zoom, scrub, and manipulate gigabytes of audio data instantaneously using pixel-bucketing compression without UI blocking.
Precision Karaoke Builder Studio Mode: A new, fully vertical syllable-tracking interface. It features cascading blocks locked to a waveform displayed along the Y-axis on the left, a dedicated instant-update Lyrics Map sidebar for transcription tuning, fully synchronized auto-scrolling, and millisecond-accurate "Play Segment" vocal isolation.
One-Click Timing Sync: Dedicated "Set Start" and "Set End" buttons in the transport bar to instantly align lyrics to the current playhead.
GPU-Accelerated Export: High-performance video rendering utilizing NVENC (NVIDIA), QSV (Intel), or AMF (AMD) with configurable resolution scaling (1080p to 360p).
Dockable Workspace Panels: The Synchronization Queue and Properties panels are full QDockWidget instances — tear off, float, and re-dock them anywhere for a completely custom workspace layout.
In-Editor AI Support: Start Whisper transcription directly from editor mode, with language override parameters and a dimming modal overlay while it runs.
Live Karaoke Preview: Configurable zero-latency ASS subtitle rendering overlaid onto the active video track with a smooth horizontal linear wipe effect per word.
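The pixel-bucketing idea behind the interactive waveforms can be sketched as follows: instead of drawing every sample, the audio is reduced to one (min, max) pair per pixel column, so the drawing cost depends on the widget width rather than on the length of the audio. This is a minimal illustration under that assumption, not the application's implementation; `Peak` and `bucketPeaks` are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Lowest and highest sample value that fall into one pixel column.
struct Peak {
    float min;
    float max;
};

// Reduce `samples` to `pixelWidth` (min, max) pairs, one per pixel column.
// Returns an empty vector when there is nothing to draw.
std::vector<Peak> bucketPeaks(const std::vector<float>& samples, std::size_t pixelWidth) {
    std::vector<Peak> peaks;
    if (samples.empty() || pixelWidth == 0) return peaks;
    peaks.reserve(pixelWidth);

    const std::size_t n = samples.size();
    for (std::size_t px = 0; px < pixelWidth; ++px) {
        // Integer bucket bounds: column px covers samples [begin, end).
        std::size_t begin = px * n / pixelWidth;
        std::size_t end = (px + 1) * n / pixelWidth;
        // With more pixels than samples a bucket can be empty; reuse one sample.
        if (end <= begin) end = std::min(begin + 1, n);

        const auto first = samples.begin() + static_cast<std::ptrdiff_t>(begin);
        const auto last = samples.begin() + static_cast<std::ptrdiff_t>(end);
        const auto [lo, hi] = std::minmax_element(first, last);
        peaks.push_back({*lo, *hi});
    }
    return peaks;
}
```

A paint routine would then draw one vertical line per column from `min` to `max`. Caching the reduced data per zoom level is a common refinement, though whether NC-KTV does so is not stated here.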

Intelligent Transcription and Sync

Whisper Powered: Uses OpenAI's Whisper via a Python subprocess for robust, high-speed transcription.
Word-Level Precision: Automatic word-timestamp generation for perfect syllable alignment.
Targeted AI Control: Select a Whisper model (base, small, medium, large, turbo) in the UI to balance speed against accuracy, and set the language explicitly with an ISO 639-1 code (en, id, ms, ja, ko).
Tap-to-Sync Engine: Rebuilt event-driven synchronization for perfect rhythm matching.
Auto-Romanization: Lightning-fast transliteration of global scripts (Korean/Japanese to Latin).
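To show how word-level timestamps can drive a per-word wipe, the sketch below turns a list of timed words into one ASS `Dialogue` line using `\kf` karaoke tags, whose durations are given in centiseconds. It is an illustration only: the `Word` struct, the helper names, and the `Default` style name are assumptions, and the application's real exporter may structure its output differently.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// One transcribed word with its timing in milliseconds (hypothetical type).
struct Word {
    std::string text;
    int startMs;
    int endMs;
};

// Format milliseconds as an ASS timestamp, H:MM:SS.cc (cc = centiseconds, truncated).
std::string assTime(int ms) {
    const int cs = ms / 10;
    char buf[32];
    std::snprintf(buf, sizeof buf, "%d:%02d:%02d.%02d",
                  cs / 360000, (cs / 6000) % 60, (cs / 100) % 60, cs % 100);
    return buf;
}

// Build a single Dialogue line in which each word is filled by a \kf sweep.
// Silent gaps between words are emitted as empty \k syllables so that the
// following sweeps stay aligned with the audio.
std::string karaokeDialogue(const std::vector<Word>& words,
                            const std::string& style = "Default") {
    if (words.empty()) return {};

    std::string text;
    int cursorCs = words.front().startMs / 10;  // sweep position within the line
    for (std::size_t i = 0; i < words.size(); ++i) {
        // Clamp so overlapping or reversed timings never yield negative durations.
        const int startCs = std::max(words[i].startMs / 10, cursorCs);
        const int endCs = std::max(words[i].endMs / 10, startCs);
        if (startCs > cursorCs)
            text += "{\\k" + std::to_string(startCs - cursorCs) + "}";
        text += "{\\kf" + std::to_string(endCs - startCs) + "}" + words[i].text;
        if (i + 1 < words.size()) text += ' ';
        cursorCs = endCs;
    }
    return "Dialogue: 0," + assTime(words.front().startMs) + "," +
           assTime(words.back().endMs) + "," + style + ",,0,0,0,," + text;
}
```

Because ASS timestamps and karaoke durations are expressed in centiseconds, timings finer than 10 ms are truncated in this sketch.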

Gemini AI Integration

Transcribe with Gemini: One-click button in the Source Lyrics tab that compresses the active audio to a small MP3 file and opens your custom Gemini Gems link in the browser. Simply upload the MP3, copy Gemini's output, and click Paste & Sync in the app.
Smart Paste & Sync: Parses Gemini/AI transcription text (plain or LRC format with range timestamps like [00:15.15 - 00:19.30]) and directly loads it into the Synchronization Queue.
Configurable URL: Paste your own Gemini Gem link directly in the UI so the app always opens the right transcription tool.
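The range-timestamp format accepted by Paste & Sync, such as `[00:15.15 - 00:19.30]`, can be parsed with a single regular expression. The sketch below is one possible approach and not the application's parser; `TimedLine`, `toMs`, and `parseRangeLine` are hypothetical names, and treating the fractional part as a decimal fraction of a second (so `.15` means 150 ms) is an assumption based on the usual LRC convention.

```cpp
#include <optional>
#include <regex>
#include <string>

// One lyric line with start and end times in milliseconds (hypothetical type).
struct TimedLine {
    int startMs;
    int endMs;
    std::string text;
};

// Convert minutes, seconds, and a 1 to 3 digit fraction to milliseconds.
// The fraction is padded to three digits, so "15" becomes 150 ms.
static int toMs(const std::string& mm, const std::string& ss, std::string frac) {
    frac.resize(3, '0');
    return (std::stoi(mm) * 60 + std::stoi(ss)) * 1000 + std::stoi(frac);
}

// Parse a line such as "[00:15.15 - 00:19.30] some lyric".
// Returns std::nullopt for lines without a valid range timestamp, so the
// caller can treat them as plain, untimed text.
std::optional<TimedLine> parseRangeLine(const std::string& line) {
    static const std::regex pattern(
        R"(^\s*\[(\d{1,3}):(\d{2})\.(\d{1,3})\s*-\s*(\d{1,3}):(\d{2})\.(\d{1,3})\]\s*(.*)$)");
    std::smatch m;
    if (!std::regex_match(line, m, pattern)) return std::nullopt;

    TimedLine out{toMs(m[1].str(), m[2].str(), m[3].str()),
                  toMs(m[4].str(), m[5].str(), m[6].str()),
                  m[7].str()};
    if (out.endMs < out.startMs) return std::nullopt;  // reject reversed ranges
    return out;
}
```

Calling `parseRangeLine` on each pasted line and collecting the results gives a list that could be loaded into a queue, with unmatched lines falling back to plain text handling.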

Installation and Build

NC-KTV is now built using standard CMake and requires a modern C++17 compliant toolchain.

Prerequisites

MinGW-w64 (GCC 13.x): Part of the bundled Qt 6.8.2 toolchain.
CMake 3.25+: Essential for project configuration.
Qt 6.8.2: Core, Gui, Widgets, Multimedia, Network.
Python 3.10+: For the AI bridge (Whisper/UVR).

Building from Source (Windows)

NC-KTV now features a streamlined, high-performance build pipeline using MinGW and a custom PowerShell script that handles both C++ compilation and Python environment bundling.

# 1. Open PowerShell and navigate to the project root

# 2. Run the automated build script
# This script configures CMake, builds the C++ engine (Ninja),
# packages the Python AI bridge (PyInstaller), and assembles 
# the portable directory with all necessary DLLs.
.\build_portable_release.ps1

Once completed, the final portable application will be available in the root directory as ncktv.exe, with all dependencies (Qt, FFmpeg, ONNX, Python Bridge) properly staged.

For manual development/debugging:

Open the project in VS Code or Qt Creator.
Select the windows-debug or windows-release CMake preset.
Build using the standard CMake workflow (Ctrl+Shift+B in VS Code).

Architecture & Documentation

For a deep dive into the completely revamped C++ architecture, hardware-accelerated UI patterns, and the multithreaded audio pipeline, please see our dedicated Technical Documentation (DOCS.md).

Contributing

We welcome contributions to the NC-KTV C++ engine!

Please ensure PRs targeting core systems compile successfully across MSVC, GCC, and Clang.
Run the included GTest suite via ctest before opening a pull request.

License

This project is licensed under the MIT License - see the LICENSE file for details.
