Skip to content

Conversation

@Hemil36
Copy link
Contributor

@Hemil36 Hemil36 commented Dec 7, 2025

This pull request introduces a real-time WebSocket–based progress reporting system for image-tagging jobs and refines both backend and frontend behavior to support accurate, live folder-level tagging status. It also corrects the previous misuse of the taggingCompleted flag and implements a reliable workflow for handling folders that have already been fully tagged.

Backend: Real-time Progress & Corrected Tagging Logic

WebSocket Infrastructure

  • Added a dedicated WebSocket manager (webSocket.py) that:
    • Tracks connected clients
    • Queues and coalesces progress events
    • Broadcasts updates with throttling
    • Replays the latest known state on reconnect
    • Runs a background dispatch loop for efficient event delivery

Image Processing Pipeline Enhancements

  • Updated image_util_classify_and_face_detect_images to:
    • Group images per folder
    • Emit folder-level progress after each processed image
    • Send progress updates at 5% increments and on completion
    • Mark completion via the new DB helper db_update_folder_tagging_completed

Corrected taggingCompleted Workflow

  • Implemented proper checks for already-tagged folders:
    • If a folder is already completed, the system skips reprocessing
    • A “completed” event is emitted immediately so the UI stays aligned
    • Eliminates inconsistent states between backend and frontend

Misc Backend Changes

  • Added /ws endpoint in main.py and initialized WebSocket manager on app startup
  • Switched from ProcessPoolExecutor to ThreadPoolExecutor for more efficient image-processing workloads

Frontend: Real-time Status & State Improvements

WebSocket-driven UI

  • Added WebSocket initialization (startWebSocket, addWsListener) during app startup
  • Receives real-time folder progress and completion events directly from backend

State Management Enhancements

  • Merged incoming progress updates with existing Redux folder state without overwriting unrelated entries
  • Removed periodic polling entirely; UI is now fully event-driven

Correct Handling of Already Completed Folders

  • UI immediately reflects “completed” status without extra API calls
  • Prevents stale or stuck progress indicators
  • Ensures consistent behavior across refreshes, reconnects, and multi-folder operations

Result

The system now provides:

  • Immediate, accurate, real-time progress visibility for AI image tagging
  • Smoother user experience with no polling and no stale states
  • Correct and reliable use of the taggingCompleted flag
  • Fully synchronized backend–frontend tagging workflow

All these changes significantly improve responsiveness, correctness, and developer experience.

Mermaid Chart - Create complex, visual diagrams with text -2025-12-07-062557

Summary by CodeRabbit

  • New Features

    • Real-time WebSocket progress with reconnect, event replay, and frontend subscription; UI applies job/folder progress (sets 100% on completion).
    • Batch image processing by folder with per-folder streamed progress and folder-level completion events.
  • Bug Fixes / Reliability

    • Improved error handling, resource cleanup, throttling/coalescing of progress updates, and thread-safe progress publishing.
    • Folders are explicitly marked tagging-completed and UI now merges incremental tagging updates.
  • Chores

    • DevTools enabled in development store configuration.

✏️ Tip: You can customize this high-level summary in your review settings.

@github-actions github-actions bot added backend bug Something isn't working frontend labels Dec 7, 2025
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 7, 2025

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 48.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and specifically summarizes the main changes: implementing WebSocket-based real-time progress reporting for tagging and fixing the taggingCompleted flag behavior.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b2e7687 and d942dbc.

📒 Files selected for processing (1)
  • backend/app/utils/images.py (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
backend/app/utils/images.py (2)
backend/app/utils/webSocket/webSocket.py (1)
  • publish_progress_from_thread (92-115)
backend/app/database/folders.py (1)
  • db_update_folder_tagging_completed (443-472)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Backend Tests
  • GitHub Check: Tauri Build Check (windows-latest)
  • GitHub Check: Tauri Build Check (macos-latest, --target aarch64-apple-darwin)
  • GitHub Check: Tauri Build Check (ubuntu-22.04)
🔇 Additional comments (4)
backend/app/utils/images.py (4)

8-9: LGTM! Imports support the new folder-based processing.

The imports are necessary for grouping images by folder and publishing progress via WebSocket.


107-110: LGTM! Docstring accurately describes the new behavior.


114-119: LGTM! Grouping enables folder-level progress tracking.

Images are properly grouped by folder_id. Note that images without a folder_id are skipped (Line 118), which is appropriate defensive programming.


178-186: LGTM! Proper resource cleanup with exception handling.

The finally block ensures both object_classifier and face_detector are properly closed even if processing fails. Each close() call is wrapped in its own try-except to prevent one failure from blocking the other.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (9)
frontend/src/hooks/useJobProgressUpdate.ts (1)

36-58: Consider simplifying the job update logic.

When prev[jobId] doesn't exist, you create prevJob with all the payload values, then immediately spread and overwrite with the same payload values in nextJob. This works correctly but is redundant.

A cleaner approach:

   setJobs(prev => {
-    const prevJob = prev[jobId] ?? {
-      jobId,
-      percent: payload.percent,
-      processed: payload.processed,
-      total: payload.total,
-      status: payload.status,
-      seq: incomingSeq,
-      ts: incomingTs
-    };
-
-    const nextJob: ApiJob = {
-      ...prevJob,
-      percent: payload.percent ?? prevJob.percent,
-      processed: payload.processed ?? prevJob.processed,
-      total: payload.total ?? prevJob.total,
-      status: payload.status ?? prevJob.status,
+    const prevJob = prev[jobId];
+    const nextJob: ApiJob = {
+      jobId,
+      percent: payload.percent ?? prevJob?.percent ?? 0,
+      processed: payload.processed ?? prevJob?.processed,
+      total: payload.total ?? prevJob?.total,
+      status: payload.status ?? prevJob?.status,
       seq: incomingSeq,
       ts: incomingTs
     };

     return { ...prev, [jobId]: nextJob };
   });
frontend/src/lib/ws.ts (1)

63-67: Consider logging when messages are dropped.

wsSend silently discards messages when the socket isn't open. For debugging purposes, a dev-mode warning could help identify timing issues.

 export function wsSend(obj: unknown) {
   if (ws && ws.readyState === WebSocket.OPEN) {
     ws.send(JSON.stringify(obj));
+  } else {
+    console.debug("[WS] send dropped, socket not open", obj);
   }
 }
backend/main.py (1)

138-148: Consolidate exception handling in WebSocket endpoint.

Both exception handlers perform the same action. Using a finally block would be cleaner and ensure cleanup even for unexpected exceptions.

 @app.websocket("/ws")
 async def websocket_endpoint(ws: WebSocket):
     await manager.register_ws(ws)
     try:
         while True:
             await ws.receive_text()
-    except WebSocketDisconnect:
-        await manager.unregister_ws(ws)
     except Exception:
-        await manager.unregister_ws(ws)
+        pass
+    finally:
+        await manager.unregister_ws(ws)
frontend/src/App.tsx (1)

41-41: Consider conditionally rendering DevTools in development only.

ReactQueryDevtools adds bundle weight and should ideally be excluded from production builds:

-        <ReactQueryDevtools initialIsOpen={false} />
+        {import.meta.env.DEV && <ReactQueryDevtools initialIsOpen={false} />}

Note: React Query DevTools automatically excludes itself from production bundles when using tree-shaking, but conditional rendering makes the intent explicit.

frontend/src/hooks/useFolderOperations.tsx (1)

99-108: Redundant dispatch when status is 'done'.

When status === 'done', you dispatch the tagging info twice: once at line 99 with the received percent, then again at lines 105-107 forcing 100%. Consolidate into a single dispatch:

           if (folder) {
+            const finalPercent = status === 'done' ? 100 : Math.round(percent);
             const taggingInfo: FolderTaggingInfo = {
               folder_id: jobId,
               folder_path: folder.folder_path,
-              tagging_percentage: Math.round(percent),
+              tagging_percentage: finalPercent,
             };
 
             dispatch(setTaggingStatus([taggingInfo]));
 
             if (status === 'done') {
               console.log(`[WS] AI tagging completed for folder: ${jobId}`);
-              dispatch(
-                setTaggingStatus([{ ...taggingInfo, tagging_percentage: 100 }]),
-              );
             }
           }
backend/app/utils/images.py (1)

164-164: Move import to the top of the file.

Inline imports inside functions are generally discouraged as they make dependencies harder to track and can have minor performance implications when called repeatedly. Since this function is called once per batch, the impact is minimal, but moving the import to the top of the file with other imports improves code organization.

backend/app/utils/webSocket/webSocket.py (3)

57-70: Add type hints for consistency.

The ws parameter lacks type annotation while other parts of the codebase use explicit types. Consider adding the type hint for consistency and IDE support.

-async def register_ws(ws):
+async def register_ws(ws: WebSocket):
     """Register incoming WebSocket connection (ws is a Starlette/FastAPI WebSocket)."""
     await ws.accept()
     async with _conn_lock:
         _active_connections.add(ws)


-async def unregister_ws(ws):
+async def unregister_ws(ws: WebSocket):
     async with _conn_lock:
         _active_connections.discard(ws)

125-125: Use logger instead of print for consistency.

The codebase uses a structured logging setup. Replace print() with the appropriate logger call.

+from app.logging.setup_logging import get_logger
+
+logger = get_logger(__name__)
+
 async def start_ws_manager_background_tasks(
     loop: Optional[asyncio.AbstractEventLoop] = None,
 ):
-    print("Starting WebSocket manager background tasks...")
+    logger.info("Starting WebSocket manager background tasks...")

178-261: Consider extracting duplicated throttling logic.

The throttling/coalescing decision logic (lines 195-211 and 238-254) is nearly identical. Extracting this to a helper function would improve maintainability.

def _should_send_immediately(payload: Dict[str, Any], job_id: Optional[str]) -> bool:
    """Determine if payload should be sent immediately or coalesced."""
    status = payload.get("status")
    percent = payload.get("percent")
    
    if status in ("done", "failed", "cancelled"):
        return True
    
    last = _last_sent_per_job.get(job_id)
    if last is None:
        return True
    
    now = time.time()
    last_percent = last.get("percent")
    last_ts = last.get("ts", 0)
    
    if percent is None:
        return True
    if abs(percent - (last_percent or 0)) >= _PERCENT_DELTA:
        return True
    if (now - last_ts) >= _MIN_INTERVAL:
        return True
    
    return False
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c37d8df and 1035d06.

📒 Files selected for processing (11)
  • backend/app/database/folders.py (1 hunks)
  • backend/app/utils/images.py (3 hunks)
  • backend/app/utils/webSocket/webSocket.py (1 hunks)
  • backend/main.py (4 hunks)
  • frontend/src/App.tsx (3 hunks)
  • frontend/src/app/store.ts (1 hunks)
  • frontend/src/features/folderSlice.ts (1 hunks)
  • frontend/src/hooks/useFolderOperations.tsx (3 hunks)
  • frontend/src/hooks/useJobProgressUpdate.ts (1 hunks)
  • frontend/src/lib/ws.ts (1 hunks)
  • frontend/src/types/FolderProgress.ts (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (5)
frontend/src/hooks/useFolderOperations.tsx (3)
frontend/src/features/folderSlice.ts (1)
  • setTaggingStatus (75-86)
frontend/src/lib/ws.ts (1)
  • addWsListener (57-61)
frontend/src/types/FolderStatus.ts (1)
  • FolderTaggingInfo (1-5)
frontend/src/hooks/useJobProgressUpdate.ts (1)
frontend/src/types/FolderProgress.ts (3)
  • JobId (2-2)
  • WsProgressMessage (14-25)
  • ApiJob (4-12)
backend/app/utils/images.py (5)
backend/app/utils/webSocket/webSocket.py (1)
  • publish_progress_from_thread (92-115)
backend/app/models/ObjectClassifier.py (3)
  • ObjectClassifier (9-33)
  • get_classes (15-25)
  • close (27-33)
backend/app/models/FaceDetector.py (3)
  • FaceDetector (15-79)
  • detect_faces (26-67)
  • close (69-79)
backend/app/database/images.py (2)
  • db_insert_image_classes_batch (293-324)
  • db_update_image_tagged_status (264-290)
backend/app/database/folders.py (1)
  • db_update_folder_tagging_completed (443-472)
frontend/src/App.tsx (2)
frontend/src/lib/ws.ts (1)
  • startWebSocket (53-55)
frontend/src/contexts/ThemeContext.tsx (1)
  • ThemeProvider (23-64)
backend/main.py (1)
backend/app/utils/webSocket/webSocket.py (3)
  • start_ws_manager_background_tasks (118-138)
  • register_ws (57-61)
  • unregister_ws (64-70)
🪛 GitHub Actions: PR Check Build
frontend/src/App.tsx

[error] 7-7: Cannot find module '@tanstack/react-query-devtools' or its corresponding type declarations.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Backend Tests
🔇 Additional comments (14)
frontend/src/types/FolderProgress.ts (1)

1-25: LGTM! Well-structured type definitions.

The interfaces properly capture the WebSocket progress message structure and provide good optional field handling for backward compatibility.

Minor: Line 1 is empty, which is unusual for a file start.

frontend/src/lib/ws.ts (1)

11-39: LGTM! Reconnection logic is sound.

The onclose handler correctly nullifies ws before scheduling reconnect, ensuring connect() can proceed. Exponential backoff with a 30s cap is appropriate.

frontend/src/app/store.ts (1)

20-20: LGTM!

Explicit devTools configuration is good for clarity, though Redux Toolkit's configureStore already defaults to enabling devTools only when process.env.NODE_ENV !== 'production'.

frontend/src/features/folderSlice.ts (1)

74-86: LGTM! Correct merge-based update strategy.

The change from replacing to merging taggingStatus properly supports incremental WebSocket updates without overwriting unrelated folder entries. With Redux Toolkit's Immer integration, you could simplify slightly:

for (const info of action.payload) {
  state.taggingStatus[info.folder_id] = info;
}

But the explicit copy approach also works fine.

backend/main.py (1)

56-63: LGTM! Proper executor lifecycle management.

The switch to ThreadPoolExecutor with graceful shutdown is correct. Consider making max_workers configurable via environment variable for different deployment scenarios.

frontend/src/App.tsx (1)

23-26: LGTM! Correct initialization patterns.

Using useState for QueryClient prevents recreation on re-renders. The useEffect with empty deps correctly initializes the WebSocket connection once on mount.

backend/app/database/folders.py (1)

443-472: LGTM!

The function follows the established pattern in this file for database operations with proper connection handling, parameterized queries, rollback on exception, and clear return semantics.

backend/app/utils/images.py (3)

8-9: LGTM!

The new imports are appropriate for the folder-based grouping and WebSocket progress publishing functionality.


114-119: LGTM!

Good defensive check to skip images without folder_id, preventing potential KeyError or processing issues downstream.


179-187: LGTM!

The resource cleanup properly ensures both object_classifier and face_detector are closed even if one fails, with appropriate exception logging.

backend/app/utils/webSocket/webSocket.py (4)

9-51: Global state initialization looks correct for single event loop usage.

The separation of asyncio.Lock for connection tracking and threading.Lock for sequence generation is appropriate. The recent events list is accessed from the async context which is single-threaded, and get_events_since returns a copy to avoid mutation issues.


73-89: LGTM!

The event retrieval and replay logic is clean. Returning a copy from get_events_since prevents external mutation, and replay_events properly handles send failures by breaking the loop.


92-115: LGTM with minor observation.

The thread-safe enqueuing via call_soon_threadsafe is correct. The defensive queue creation in _enqueue (lines 107-108) acts as a safety net, though the queue should already be initialized by start_ws_manager_background_tasks before _main_loop is set.


264-279: LGTM!

The periodic flusher correctly handles graceful shutdown via CancelledError and safely copies the pending dict before clearing to avoid mutation during iteration.

Comment on lines 70 to +124
useEffect(() => {
if (taggingStatusQuery.isError) {
console.error(
'Failed to fetch tagging status:',
taggingStatusQuery.error,
);

const errorMessage = taggingStatusQuery.errorMessage || 'Unknown error';
console.warn(`Tagging status query failed: ${errorMessage}`);
}
}, [
taggingStatusQuery.isError,
taggingStatusQuery.error,
taggingStatusQuery.errorMessage,
]);
const unsubscribe = addWsListener((msg: any) => {
// Defensive checks
if (!msg || typeof msg !== 'object' || !('type' in msg)) return;

const type = String(msg.type);

// Handle progress events from backend
// Message format: { type: 'progress', seq, ts, payload: { job_id, processed, total, percent, status } }
if (type === 'progress' && msg.payload) {
try {
const p = msg.payload as any;
const jobId = String(p.job_id);
const percent = Number(p.percent || 0);
const status = String(p.status || 'running');

// Check if this job_id corresponds to a folder_id
// Since backend sends job_id as folder_id for AI tagging operations
const folder = folders.find((f) => f.folder_id === jobId);

if (folder) {
// Convert progress event to FolderTaggingInfo format
const taggingInfo: FolderTaggingInfo = {
folder_id: jobId,
folder_path: folder.folder_path,
tagging_percentage: Math.round(percent),
};

// Update tagging status in Redux
dispatch(setTaggingStatus([taggingInfo]));

// If status is 'done', mark tagging as completed (100%)
if (status === 'done') {
console.log(`[WS] AI tagging completed for folder: ${jobId}`);
// Ensure it's set to exactly 100%
dispatch(
setTaggingStatus([{ ...taggingInfo, tagging_percentage: 100 }]),
);
}
}
} catch (e) {
console.error('[WS] Failed to process progress message', e);
}
}
});

// Cleanup listener on unmount
return () => {
try {
unsubscribe();
} catch (e) {
console.error('[WS] Error unsubscribing', e);
}
};
}, [dispatch, folders]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Dependency on folders causes frequent listener re-registration.

The WebSocket listener is recreated every time folders changes, which:

  1. Briefly disconnects the listener during re-subscription
  2. Could miss messages during the gap
  3. Creates unnecessary churn

Consider using a ref to access current folders without triggering re-subscription:

+import { useRef } from 'react';
+
 export const useFolderOperations = () => {
   const dispatch = useDispatch();
   const folders = useSelector(selectAllFolders);
+  const foldersRef = useRef(folders);
+  foldersRef.current = folders;
 
   // ... 
 
   useEffect(() => {
     const unsubscribe = addWsListener((msg: any) => {
       // ...
-      const folder = folders.find((f) => f.folder_id === jobId);
+      const folder = foldersRef.current.find((f) => f.folder_id === jobId);
       // ...
     });
 
     return () => {
-      try {
-        unsubscribe();
-      } catch (e) {
-        console.error('[WS] Error unsubscribing', e);
-      }
+      unsubscribe();
     };
-  }, [dispatch, folders]);
+  }, [dispatch]);

Committable suggestion skipped: line range outside the PR's diff.

export type WsListener = (msg: any) => void;
const listeners = new Set<WsListener>();

const WS_URL = "ws://localhost:8000/ws";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Hardcoded WebSocket URL will break in non-local environments.

The URL ws://localhost:8000/ws is hardcoded, which prevents the app from working in production or any environment where the backend isn't on localhost.

-const WS_URL = "ws://localhost:8000/ws"; 
+const WS_URL = import.meta.env.VITE_WS_URL ?? `ws://${window.location.hostname}:8000/ws`;

Alternatively, derive it from window.location to support both ws/wss protocols:

const WS_URL = `${window.location.protocol === 'https:' ? 'wss:' : 'ws:'}//${window.location.host}/ws`;
🤖 Prompt for AI Agents
In frontend/src/lib/ws.ts around line 9, the WebSocket URL is hardcoded to
ws://localhost:8000/ws which breaks non-local deployments; update the code to
derive the URL from window.location so it uses wss when the page is served over
https and ws otherwise, and uses the current host and path (/ws) instead of
localhost:8000; implement this by building the protocol based on
window.location.protocol and combining it with window.location.host and the /ws
path, and ensure this runs only in browser environments (guard against
server-side rendering) so no runtime errors occur.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
frontend/src/App.tsx (1)

2-2: Consider WebSocket lifecycle & StrictMode idempotency

startWebSocket() is called once per mount, but in React 19 dev with StrictMode, this effect can run twice, potentially opening duplicate connections unless startWebSocket (or connect) guards against it. You may also want a cleanup path to close the socket on unmount if the app is ever embedded or remounted.

Please double‑check that startWebSocket is idempotent in development and that the underlying connection is properly closed when no longer needed.

Also applies to: 11-11, 22-25

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1035d06 and ca950f5.

⛔ Files ignored due to path filters (1)
  • frontend/package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (5)
  • frontend/package.json (1 hunks)
  • frontend/src/App.tsx (3 hunks)
  • frontend/src/hooks/useJobProgressUpdate.ts (1 hunks)
  • frontend/src/lib/ws.ts (1 hunks)
  • frontend/src/types/FolderProgress.ts (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • frontend/package.json
🚧 Files skipped from review as they are similar to previous changes (3)
  • frontend/src/types/FolderProgress.ts
  • frontend/src/hooks/useJobProgressUpdate.ts
  • frontend/src/lib/ws.ts
🧰 Additional context used
🧬 Code graph analysis (1)
frontend/src/App.tsx (2)
frontend/src/lib/ws.ts (1)
  • startWebSocket (55-57)
frontend/src/contexts/ThemeContext.tsx (1)
  • ThemeProvider (23-64)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Backend Tests
  • GitHub Check: Tauri Build Check (macos-latest, --target aarch64-apple-darwin)
  • GitHub Check: Tauri Build Check (windows-latest)
  • GitHub Check: Tauri Build Check (ubuntu-22.04)
🔇 Additional comments (1)
frontend/src/App.tsx (1)

6-7: React Query integration looks correct and idiomatic

Using const [queryClient] = React.useState(() => new QueryClient()) and wrapping the app with QueryClientProvider is aligned with @tanstack/react-query v5 guidance and should ensure a stable client instance across renders. No issues from my side here.

If you upgraded react-query recently, just confirm there are no additional required provider props or config changes in the version you’re on.

Also applies to: 22-25, 28-40

Refactor progress publishing logic to always send 'running' status.
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
backend/app/utils/images.py (1)

148-148: Fix bucket calculation to match PR objectives (5% increments, not 2%).

The calculation percentage // 2 produces progress updates at 2% increments (0, 2, 4, 6...), but the PR objectives explicitly state "send updates at 5% increments and on completion." This causes 2.5× more frequent updates than intended, increasing network traffic and processing overhead.

Apply this diff to produce 5% increments:

-                bucket = int(percentage // 2) 
+                bucket = int(percentage // 5)  # 5%, 10%, 15%, ...
🧹 Nitpick comments (2)
backend/app/utils/images.py (2)

141-141: Consider simplifying the redundant condition.

The check 0 in classes is redundant because classes.count(0) >= 1 already guarantees that 0 is in classes.

Apply this diff for clarity:

-                if classes and 0 in classes and 0 < classes.count(0) < 7:
+                if classes and 1 <= classes.count(0) < 7:

163-163: Move import to top of file.

Imports should be placed at the module level for clarity, performance, and consistency with Python conventions.

Move this import to the top of the file (after line 9):

from app.database.folders import db_update_folder_tagging_completed
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ca950f5 and b2e7687.

📒 Files selected for processing (1)
  • backend/app/utils/images.py (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
backend/app/utils/images.py (3)
backend/app/utils/webSocket/webSocket.py (1)
  • publish_progress_from_thread (92-115)
backend/app/database/images.py (1)
  • db_insert_image_classes_batch (293-324)
backend/app/database/folders.py (1)
  • db_update_folder_tagging_completed (443-472)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Tauri Build Check (ubuntu-22.04)
  • GitHub Check: Tauri Build Check (macos-latest, --target aarch64-apple-darwin)
  • GitHub Check: Tauri Build Check (windows-latest)
  • GitHub Check: Backend Tests
🔇 Additional comments (5)
backend/app/utils/images.py (5)

8-9: LGTM!

The new imports support the folder-grouping and WebSocket progress reporting features. Both are used appropriately in the code.


107-110: LGTM!

The docstring accurately documents the new progress reporting behavior.


114-119: LGTM!

The folder-grouping logic correctly batches images by folder_id and appropriately skips images without folder association.


147-147: LGTM!

The progress reporting logic correctly publishes updates when crossing bucket thresholds or processing the final image. The duplicate "done" event issue from the previous review has been properly addressed by keeping the loop status as "running" only.

Also applies to: 149-160


178-186: LGTM!

The enhanced error handling in the finally block ensures both classifier and detector resources are properly closed even if one cleanup operation fails. The use of logger.exception provides detailed error context for debugging.

Comment on lines +165 to +176
db_update_folder_tagging_completed(folder_id, True)
logger.info(f"Folder {folder_id} tagging completed")
# Publish final done event after database update
publish_progress_from_thread(
{
"job_id": str(folder_id),
"processed": total,
"total": total,
"percent": 100.0,
"status": "done",
}
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Add exception handling to prevent one folder's failure from blocking others.

If db_update_folder_tagging_completed fails for one folder, the exception propagates and stops processing of all remaining folders. This cascading failure means that one folder's database issue prevents other (potentially healthy) folders from being tagged and marked complete.

Apply this diff to continue processing remaining folders even if one fails:

-            # Step 5: Mark folder tagging as completed after processing all images
-            from app.database.folders import db_update_folder_tagging_completed
-
-            db_update_folder_tagging_completed(folder_id, True)
-            logger.info(f"Folder {folder_id} tagging completed")
-            # Publish final done event after database update
-            publish_progress_from_thread(
-                {
-                    "job_id": str(folder_id),
-                    "processed": total,
-                    "total": total,
-                    "percent": 100.0,
-                    "status": "done",
-                }
-            )
+            # Step 5: Mark folder tagging as completed after processing all images
+            from app.database.folders import db_update_folder_tagging_completed
+
+            try:
+                db_update_folder_tagging_completed(folder_id, True)
+                logger.info(f"Folder {folder_id} tagging completed")
+                # Publish final done event after database update
+                publish_progress_from_thread(
+                    {
+                        "job_id": str(folder_id),
+                        "processed": total,
+                        "total": total,
+                        "percent": 100.0,
+                        "status": "done",
+                    }
+                )
+            except Exception as e:
+                logger.error(f"Failed to mark folder {folder_id} as completed: {e}")
+                # Publish error status so frontend knows this folder failed
+                publish_progress_from_thread(
+                    {
+                        "job_id": str(folder_id),
+                        "processed": total,
+                        "total": total,
+                        "percent": 100.0,
+                        "status": "error",
+                    }
+                )

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In backend/app/utils/images.py around lines 165 to 176, wrap the
db_update_folder_tagging_completed and the logger.info call in a try/except so a
failure updating one folder doesn't stop the loop; on exception catch Exception
as e and call logger.exception or logger.error with the folder_id and exception,
then continue processing; ensure publish_progress_from_thread is still invoked
for this folder (use different status like "done" on success and "error" on
failure) so downstream systems get a final event and the loop moves on.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend bug Something isn't working frontend

Projects

None yet

Development

Successfully merging this pull request may close these issues.

BUG : Infinite Stream of Folder Status Requests When AI Tagging Is Enabled

1 participant