Automatically select matching video clips from a given video collection based on lyrics.
On AutoDL, select GPU: RTX 3090 24GB, Framework: PyTorch
mkdir -p /root/autodl-tmp/huggingface
echo 'export HF_HOME=/root/autodl-tmp/huggingface' >> ~/.bashrc
echo 'export HF_ENDPOINT=https://hf-mirror.com' >> ~/.bashrc
source ~/.bashrc
sudo apt-get update
sudo apt-get install git-lfs
git lfs install
python -m pip install --upgrade pip
apt-get install ffmpeg
pip install ffmpeg-python pillow torchvision
pip install transnetv2-pytorch
pip install git+https://github.com/huggingface/transformers accelerate flash-attn
pip install qwen-vl-utils[decord] sentence_transformers openai
Upload the video files to the ./autodl-tmp/videos/ directory. Multiple formats are supported (e.g., mp4, mkv).
Upload the lyrics file to ./autodl-tmp/lyrics.lrc.
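The lyrics file is expected in LRC format, where each line carries one or more timestamp tags such as [01:23.45] followed by the lyric text. How the project's scripts read this file is not shown here, so the following is only a minimal sketch of LRC parsing; the function name `parse_lrc` is hypothetical and not part of this repository.

```python
import re
from typing import List, Tuple

# Matches one LRC timestamp tag such as [01:23.45] or [01:23].
# Metadata tags like [ar:Artist] do not start with digits and are not matched.
_TIMESTAMP = re.compile(r"\[(\d+):(\d{1,2})(?:[.:](\d{1,3}))?\]")


def parse_lrc(text: str) -> List[Tuple[float, str]]:
    """Return (start_time_in_seconds, lyric) pairs sorted by time.

    Hypothetical helper for illustration; the project's own parser may differ.
    """
    entries = []
    for line in text.splitlines():
        tags = list(_TIMESTAMP.finditer(line))
        if not tags:
            continue  # metadata or blank line
        # The lyric text is whatever follows the last timestamp tag.
        lyric = line[tags[-1].end():].strip()
        if not lyric:
            continue  # timestamp with no text (instrumental gap)
        for tag in tags:
            minutes, seconds, frac = tag.groups()
            start = int(minutes) * 60 + int(seconds)
            if frac:
                start += int(frac) / (10 ** len(frac))
            entries.append((float(start), lyric))
    entries.sort(key=lambda entry: entry[0])
    return entries
```

A line with several tags, for example a repeated chorus, yields one entry per tag, which keeps each timestamp available for matching against a clip.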
python generate_scenes.py
python generate_descriptions.py
python match_embedding.py
Manually edit and compose the video based on the content in ./autodl-tmp/best_matches.txt.
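The contents of match_embedding.py are not shown here. Judging by its name and the sentence_transformers dependency, the matching step presumably compares an embedding of each lyric line with an embedding of each scene description. The sketch below illustrates that idea with plain cosine similarity over precomputed vectors; it is an assumption about the approach, and `cosine_similarity` and `best_matches` are hypothetical names, not the project's API.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_matches(
    lyric_vecs: Sequence[Sequence[float]],
    scene_vecs: Sequence[Sequence[float]],
) -> List[int]:
    """For each lyric vector, return the index of the most similar scene vector.

    Assumed matching rule for illustration: highest cosine similarity wins.
    """
    if not scene_vecs:
        raise ValueError("at least one scene vector is required")
    return [
        max(
            range(len(scene_vecs)),
            key=lambda j: cosine_similarity(lyric_vec, scene_vecs[j]),
        )
        for lyric_vec in lyric_vecs
    ]
```

In practice the vectors would come from an embedding model, and a real pipeline might also prevent the same clip from being chosen for several lyric lines; this sketch does neither.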
This project uses the following models:
This project is licensed under the Apache License 2.0.