Improve background job reliability: Redis resilience + file-based fallbacks #8
brbrainerd wants to merge 3 commits into IliasHad:main
…allbacks; add optional GPU support; secure CIFS creds via env + override compose
- Enable CUDA 12.4 PyTorch for GPU transcription/analysis
- Add face_recognition + dlib dependencies
- Harden file watchers: depth limit, Syncthing/temp ignores, audio support
- Fix analysis result path mismatch (/app/apps/background-jobs/)
- Add start.ps1/stop.ps1 to gitignore (contain GPU_COUNT)
- Add findAudioFiles and findMediaFiles functions
- Update folder trigger route to detect audio folders
- Audio folders (x_audio, /audio) scan for audio extensions
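The watcher hardening listed in the commit notes (depth limit, Syncthing/temp ignores) could look roughly like the predicate below. This is only a sketch under stated assumptions: `shouldIgnore`, `MAX_DEPTH`, and the pattern list are hypothetical names and values, not the PR's actual code, and the Syncthing file names reflect Syncthing's usual temp-file naming rather than anything shown in this diff.

```typescript
// Hypothetical defaults; the PR's real limits and patterns are not shown here.
const MAX_DEPTH = 5;
const IGNORED_PATTERNS: RegExp[] = [
  /^\.syncthing\..*\.tmp$/,    // Syncthing in-progress transfer
  /^~syncthing~.*\.tmp$/,      // Syncthing temp-file variant seen on Windows
  /^\.stfolder$/,              // Syncthing folder marker
  /^\.stversions$/,            // Syncthing versioning directory
  /\.(tmp|part|crdownload)$/i, // generic partial-download suffixes
];

// Returns the path segments below `root`, or null if `filePath` is outside it.
function segmentsBelow(root: string, filePath: string): string[] | null {
  const norm = (p: string) => p.replace(/\\/g, "/").replace(/\/+$/, "");
  const r = norm(root);
  const f = norm(filePath);
  if (f !== r && !f.startsWith(r + "/")) return null;
  return f.slice(r.length).split("/").filter((s) => s.length > 0);
}

// True when the watcher should skip this path.
function shouldIgnore(root: string, filePath: string, maxDepth: number = MAX_DEPTH): boolean {
  const segments = segmentsBelow(root, filePath);
  if (segments === null) return true; // outside the watched root
  // Depth = number of directories between the root and the file itself.
  if (segments.length - 1 > maxDepth) return true;
  // Any ignored directory or file name along the path disqualifies it.
  return segments.some((seg) => IGNORED_PATTERNS.some((re) => re.test(seg)));
}
```

A watcher callback would call `shouldIgnore(watchRoot, changedPath)` before queueing a job, so half-synced files never reach the indexer.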
IliasHad left a comment
Thank you so much @brbrainerd for taking the time to make the first external PR, much appreciated. I left a couple of comments about the changes that you made.
Thank you for the contribution. Let's focus only on video, at least for now, because the system is built for video creators and people with a large archive of videos.
The root Docker Compose file is for people who want to use the pre-built Docker images from the GitHub Container Registry. If you want to build the images yourself or develop, you can use the Docker Compose file in the docker/ folder.
That's a good feature to have, thank you for adding it. But we should also add support in the backend and frontend for handling network drives.
Let's keep the .env.example file in the project root, thank you.
  onProgress?: ProgressCallback
): Promise<void> {
  return new Promise((resolve, reject) => {
    let resolved = false
Can you please elaborate on why we need a file-based fallback in this case, given that the WebSocket already sends a completion message when the transcription is done? The transcription service could be stuck, and with this file-based fallback the transcription would be marked as completed (because the file size hasn't changed) when in reality the transcription job has not completed.
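To make the concern concrete, here is a minimal sketch of waiting only for an explicit completion message, with a timeout that fails the job instead of inferring success from a file that stopped growing. The `{ type, job_id, payload }` shape follows the diff under review; `MessageSource`, `InMemorySource`, `waitForCompletion`, and the timeout value are hypothetical names introduced for illustration, not the project's API.

```typescript
// Message shape taken from the diff under review: { type, payload, job_id }.
interface JobMessage {
  type: string;
  job_id: string;
  payload?: unknown;
}

type Listener = (msg: JobMessage) => void;

// Hypothetical stand-in for the WebSocket client.
interface MessageSource {
  subscribe(listener: Listener): () => void; // returns an unsubscribe function
}

// Tiny in-memory source, used here only to exercise the sketch.
class InMemorySource implements MessageSource {
  private listeners: Listener[] = [];
  subscribe(listener: Listener): () => void {
    this.listeners.push(listener);
    return () => {
      this.listeners = this.listeners.filter((l) => l !== listener);
    };
  }
  emit(msg: JobMessage): void {
    this.listeners.slice().forEach((l) => l(msg));
  }
}

// Resolves only when the matching completion message arrives.
// A stalled service produces a timeout error, never a false "completed".
function waitForCompletion(
  source: MessageSource,
  jobId: string,
  completeType: string,
  timeoutMs: number
): Promise<unknown> {
  return new Promise((resolve, reject) => {
    let unsubscribe: () => void = () => {};
    const timer = setTimeout(() => {
      unsubscribe();
      reject(new Error(`Job ${jobId} timed out after ${timeoutMs} ms`));
    }, timeoutMs);
    unsubscribe = source.subscribe((msg) => {
      if (msg.job_id !== jobId || msg.type !== completeType) return;
      clearTimeout(timer);
      unsubscribe();
      resolve(msg.payload);
    });
  });
}
```

With this shape, a dropped WebSocket connection surfaces as a timeout that the caller can retry or report, which keeps "done" tied to a message the service actually sent.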
  onProgress: (progress: AnalysisProgress) => void
): Promise<{ analysis: Analysis; category: string }> {
  return new Promise((resolve, reject) => {
    let resolved = false
The same comment and observation as for the transcription file-based fallback apply to this frame analysis file fallback. We need to use the analysis_complete message from the WebSocket server.
  }

  // Ensure connection is alive, reconnect if needed
  public async ensureConnected(): Promise<boolean> {
Where are we using this function? If we aren't using it, can you please remove it?
const message = JSON.parse(data.toString())
const { type, payload, job_id } = message

// Handle ping/pong heartbeat from Python service
Thank you for adding NVIDIA CUDA support, but this will make the Docker image bigger for everyone, including users who don't have an NVIDIA GPU. Can we add a build argument to opt in to CUDA and, in the GitHub release YAML file, add it to the build strategy for the Docker image (with a separate tag for the image that has NVIDIA support)?
Summary
Improves reliability for long-running video indexing jobs and enables GPU acceleration.
Changes
Job Processing Reliability
GPU Acceleration
Watcher Hardening
Bug Fixes
Why?
Long videos caused stalls due to WebSocket timeouts and Redis ETIMEDOUT on Docker Desktop. File-based fallbacks ensure completion regardless of callback delivery.
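For the Redis ETIMEDOUT side of this, one common resilience pattern is capped exponential backoff on reconnect. The sketch below is an illustration only: `retryDelay` and its constants are hypothetical, and the PR does not show which Redis client or settings it uses. If the project uses ioredis, a function of this shape can be passed as its `retryStrategy` option, where a non-number return stops reconnecting.

```typescript
// Hypothetical tuning values, not taken from the PR.
const BASE_DELAY_MS = 200;
const MAX_DELAY_MS = 5000;
const MAX_ATTEMPTS = 20;

// Returns the delay in ms before the next reconnect attempt,
// or null to stop retrying. `attempt` starts at 1.
function retryDelay(attempt: number): number | null {
  if (attempt > MAX_ATTEMPTS) return null;
  const delay = BASE_DELAY_MS * 2 ** (attempt - 1); // 200, 400, 800, ...
  return Math.min(delay, MAX_DELAY_MS);             // never wait longer than the cap
}
```

Capping the delay keeps a long-running indexing job from waiting minutes to notice that Redis is reachable again, while the attempt limit lets a truly dead connection fail the job visibly.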