feat: Cost Tracking Dashboard with LangWatch integration#36

Open
adubuc-cloudgeometry wants to merge 8 commits into main from
feature/cost-tracking-dashboard-ix

Collaborator

@adubuc-cloudgeometry adubuc-cloudgeometry commented Mar 18, 2026

Summary

  • Complete Usage & Cost Tracking dashboard powered by LangWatch trace data
  • Backend: LangWatchService with httpx client, Fernet key encryption, Redis caching, scroll pagination, ownership filtering
  • Frontend: UsagePage with summary cards, date picker, per-flow breakdown, expandable run detail, multi-flow selection, admin user filter
  • 4 API endpoints: GET /usage/, GET /usage/{flow_id}/runs, POST /usage/settings/langwatch-key, GET /usage/settings/langwatch-key/status
  • GlobalSettings DB model + Alembic migration for encrypted API key storage
  • LangWatch SDK upgraded from 0.2.x to 0.17.0 (langchain-core made optional in this release, resolving version conflict)
  • Fixed latent _api_key guard bug: was always truthy (function object), now uses get_api_key() from langwatch.state
  • Token capture monkey-patch for Anthropic/streaming responses + model name normalization for LangWatch pricing table
  • 670+ backend tests, full frontend unit tests, Playwright E2E spec
  • Feature-flagged behind ENABLE_USAGE_TRACKING

Test plan

  • pip install --dry-run langwatch==0.17.0 — zero dependency conflicts
  • All 5 SDK import paths verified compatible with v0.17.0
  • Flow execution test: "Anthropic chat" flow returns response
  • Usage endpoint returns cost data: $0.0107 for test run
  • get_api_key() correctly returns falsy when no key configured
  • Full test suite pass (pytest)
  • Frontend test suite pass (vitest)
  • Manual smoke test: run flows, verify traces + costs in dashboard UI

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features
    • Added Usage & Cost Tracking dashboard with aggregated cost metrics and per-flow breakdown.
    • Integrated LangWatch API for real-time usage data retrieval.
    • Added admin controls for API key configuration and management.
    • Implemented date range filtering and flow-level cost analysis.
    • Added feature flag to control usage tracking visibility and navigation access.

test222234 and others added 6 commits March 17, 2026 12:24
Add Usage & Cost Tracking dashboard (19 bugs found and fixed during QA):

Backend:
- LangWatchService with httpx client, Fernet key encryption, trace parsing
- GET /api/v1/usage/ — aggregated cost and invocation data
- GET /api/v1/usage/{flow_id}/runs — per-run detail
- POST /api/v1/usage/settings/langwatch-key — save/validate key
- GET /api/v1/usage/settings/langwatch-key/status — key status
- GlobalSettings model + Alembic migration (SQLite compatible)
- 670+ backend tests

Frontend:
- UsagePage with summary cards, date picker, user filter
- LangWatchKeyForm in Settings > General (admin only)
- Usage nav in Account Menu dropdown (feature-flagged)
- PageLayout wrapper with back navigation
- Error states: KEY_NOT_CONFIGURED, generic, retryable
- TanStack Query hooks with debounce

Bug fixes applied during QA (19 bugs):
- Fixed DI crashes: async generator wrapped in async with (BUG-C1, C2)
- Fixed null date crash in _fetch_from_langwatch (BUG-C3)
- Fixed unhandled httpx.HTTPStatusError (BUG-C4)
- Fixed key validation: wrong endpoint/auth header (GET→POST, Bearer→X-Auth-Token)
- Fixed cache key collision cross-user data leak (BUG-L4)
- Fixed SQLite migration NOW()→sa.func.now() (BUG-L9)
- Fixed datetime.utcnow deprecation (BUG-L8)
- Fixed httpx connection leak (BUG-L2)
- Fixed flow name collision in ownership filter (BUG-L5)
- Fixed frontend error throwing (raw JSON→Error instances) (BUG-I2, I3)
- Removed dead code and duplicate ownership checks
- Narrowed bare except Exception blocks

QA artifacts in QA/ and Research/ directories.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The LangWatch OTEL SDK serializes metadata.labels into span attributes,
but the search API only surfaces OTEL Resource attributes in the trace
metadata field. This made _filter_by_ownership find empty labels and
silently drop all traces, showing zero data in the Usage dashboard.

Fix:
- Add includeSpans:true to fetch payload so span data is available
- Extract flow name from root span (type:"workflow") as fallback
  when metadata.labels is empty (in both _parse_trace and
  _filter_by_ownership)
- Filter to workflow-type traces only to prevent N+1 over-counting
  (1 workflow + N component traces per execution)
- Add diagnostic logging for trace filtering

Verified: Usage dashboard now shows 5 invocations across 2 flows.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
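The label-then-span fallback described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from `_parse_trace` or `_filter_by_ownership`: the trace shape (a dict with `metadata.labels` and a `spans` list whose entries carry `type` and `name`) and the helper names `extract_flow_name` and `is_workflow_trace` are assumptions. It also takes the first workflow-type span rather than resolving the true root span.

```python
from typing import Any, Optional

FLOW_LABEL_PREFIX = "Flow: "


def extract_flow_name(trace: dict[str, Any]) -> Optional[str]:
    """Return the flow name for a trace, or None if it cannot be determined."""
    # Preferred source: a "Flow: <name>" label in the trace metadata.
    labels = (trace.get("metadata") or {}).get("labels") or []
    for label in labels:
        if isinstance(label, str) and label.startswith(FLOW_LABEL_PREFIX):
            return label[len(FLOW_LABEL_PREFIX):]
    # Fallback: the name of the first span whose type is "workflow".
    for span in trace.get("spans") or []:
        if span.get("type") == "workflow" and span.get("name"):
            return span["name"]
    return None


def is_workflow_trace(trace: dict[str, Any]) -> bool:
    """True when the trace contains a workflow-type span."""
    return any(s.get("type") == "workflow" for s in trace.get("spans") or [])
```

Filtering with something like `is_workflow_trace` before aggregating is one way to count each execution once instead of once per component trace.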
…,010,014)

UsagePage now renders all pre-built components:
- FlowBreakdownList with per-flow cost, invocations, and expandable runs
- EmptyStatePrompt with admin/user distinction (FR-003)
- ErrorState with retry button (FR-005)
- UserFilterDropdown populated from flow data, admin-only (FR-010)

Token capture: monkey-patch LangWatch SDK callback on_llm_end to extract
tokens from 3 locations (OpenAI token_usage, Anthropic usage,
LangChain usage_metadata) as fallback when SDK misses them.
Fixes $0.00 cost for Anthropic and streaming providers.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
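For readers unfamiliar with the three token locations, a rough sketch of the fallback lookup is below. It works on plain dicts and is not the actual monkey-patch. The helper name `extract_token_usage` is hypothetical, and the key names (`prompt_tokens`/`completion_tokens` for OpenAI-style output, `input_tokens`/`output_tokens` for Anthropic-style and LangChain `usage_metadata`) are assumptions about the payload shapes.

```python
from typing import Any, Optional


def extract_token_usage(
    llm_output: Optional[dict[str, Any]],
    usage_metadata: Optional[dict[str, Any]] = None,
) -> Optional[tuple[int, int]]:
    """Return (prompt_tokens, completion_tokens) from the first location that has both."""
    llm_output = llm_output or {}
    candidates = [
        # (container, prompt key, completion key)
        (llm_output.get("token_usage"), "prompt_tokens", "completion_tokens"),  # OpenAI-style
        (llm_output.get("usage"), "input_tokens", "output_tokens"),  # Anthropic-style
        (usage_metadata, "input_tokens", "output_tokens"),  # LangChain usage_metadata
    ]
    for container, prompt_key, completion_key in candidates:
        if not isinstance(container, dict):
            continue
        prompt = container.get(prompt_key)
        completion = container.get(completion_key)
        if prompt is not None and completion is not None:
            return int(prompt), int(completion)
    # Nothing found: the caller leaves the span metrics untouched.
    return None
```

Returning `None` rather than zeros keeps "no token data" distinguishable from a genuine zero-token response, which matters when the result drives cost calculation.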
FR-002: Flows/MCP Server sub-view toggle using existing tab pattern.
Backend already supports sub_view param — frontend now passes it.

FR-012: Checkbox multi-selection on flow breakdown rows with summed
totals summary bar. Selection auto-clears on date/filter changes.

New components: SubViewToggle.tsx, SelectionSummary.tsx
Modified: FlowBreakdownList.tsx (checkbox header + props),
FlowBreakdownRow.tsx (checkbox cell + props), UsagePage.tsx (state + wiring)

All 14 Must Have FRs now implemented.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add span-based fallback to fetch_flow_runs (same fix as _filter_by_ownership)
  so expanded run details show data instead of "No runs found"
- Add Owner column to flow breakdown table (shows owner_username)
- Show UserFilterDropdown for all admins (not just when >=2 users)
- Update colspan for expanded rows to accommodate new column

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Upgrade langwatch from >=0.2.11,<0.3.0 to ==0.17.0. This version
makes langchain-core optional (PR #2464), removing the hard
dependency conflict with our langchain-core<1.0.0 pin.

Fix latent bug on line 170: `self._client._api_key` accessed a
@module_property function object (always truthy) instead of the
API key value. Replace with `get_api_key()` from `langwatch.state`
which correctly returns '' (falsy) when no key is configured.

Verified: all SDK imports pass, flow execution works, usage endpoint
returns cost data ($0.0107 for Anthropic chat test run).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
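The "always truthy" guard is easy to reproduce in isolation. The sketch below uses a hypothetical `FakeClient` class rather than the LangWatch SDK, and only demonstrates the Python behavior behind the bug: a function or bound-method object is truthy regardless of what it would return when called.

```python
class FakeClient:
    """Hypothetical stand-in for an SDK client; not the real LangWatch class."""

    def __init__(self, key: str = "") -> None:
        self._key = key

    def _api_key(self) -> str:
        # A callable, so referring to it without calling yields a method object.
        return self._key


def guard_buggy(client: FakeClient) -> bool:
    # Tests the method object itself, which is always truthy.
    return bool(client._api_key)


def guard_fixed(client: FakeClient) -> bool:
    # Tests the returned value; '' is falsy when no key is configured.
    return bool(client._api_key())


unconfigured = FakeClient()
print(guard_buggy(unconfigured))  # True, even though no key is set
print(guard_fixed(unconfigured))  # False
```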
Contributor

coderabbitai bot commented Mar 18, 2026

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 77d90e4c-84b6-4449-bc7d-5bb4e0bfe441

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

Implements a comprehensive usage and cost tracking dashboard feature with LangWatch API integration across both backend and frontend. Adds new database models and migrations for settings storage, service layer for data aggregation and caching, multiple API endpoints for usage summary and flow runs, and a full frontend dashboard with filtering, date range selection, flow breakdown, and admin controls. Includes extensive research documentation analyzing bugs and implementation plans for related defect fixes.

Changes

Cohort / File(s) Summary
Research & Planning Documents
Research/bugs/cost-tracking-dashboard-ix-defects/*, Research/bugs/dashboard-completeness-audit/*, Research/bugs/trace-flow-matching-gap/*, QA/cost-tracking-dashboard-ix/findings.md
Comprehensive research, findings, and implementation plans documenting identified bugs (19 categorized issues), defect analysis (DI crashes, null date handling, async/await misuse), proposed fixes (7 micro-plans across 4 phases), and audit findings for navigation, backend, and frontend completeness. Research threads cover tracer integration, usage service matching logic, token capture gaps, and migration compatibility issues.
Database Layer
langbuilder/src/backend/base/langflow/alembic/versions/773db17e6029_add_global_settings_table.py, langbuilder/src/backend/base/langflow/services/database/models/global_settings.py, langbuilder/src/backend/base/langflow/services/database/models/__init__.py, langbuilder/src/backend/base/tests/migrations/test_global_settings_migration.py, langbuilder/src/backend/base/tests/models/test_global_settings_model.py
Adds new GlobalSettings SQLModel table for key/value configuration storage with encryption support (id, key, value, is_encrypted, created_at, updated_at, updated_by). Includes Alembic migration with upgrade/downgrade, model definition with timezone-aware datetime defaults, and comprehensive test coverage for migration and model behavior.
LangWatch Service Core
langbuilder/src/backend/base/langflow/services/langwatch/service.py, langbuilder/src/backend/base/langflow/services/langwatch/exceptions.py, langbuilder/src/backend/base/langflow/services/langwatch/schemas.py
Implements LangWatchService with HTTP client management, trace pagination and parsing, ownership filtering, Redis cache-aside pattern, and key management with encryption. Defines exception hierarchy (LangWatchError with 6 subclasses mapping to HTTP status codes) and Pydantic v2 schemas for request/response models (UsageQueryParams, UsageResponse, FlowUsage, etc.). Service handles LangWatch API interaction, data aggregation, and structured error responses.
LangWatch Tracing Integration
langbuilder/src/backend/base/langflow/services/tracing/langwatch.py
Enhances LangWatch callback with token extraction fallback logic supporting OpenAI, Anthropic, and LangChain streaming token formats. Patches on_llm_end to inject SpanMetrics when SDK did not capture tokens. Adds flow_id preservation alongside flow_name, model normalization helper, and API key fallback checks for trace context management.
Usage API Router
langbuilder/src/backend/base/langflow/api/v1/usage/router.py, langbuilder/src/backend/base/langflow/api/router.py
Introduces new Usage router with 4 endpoints: GET /usage/ (aggregated summary with optional user filtering), GET /usage/{flow_id}/runs (per-flow run details), POST /usage/settings/langwatch-key (admin-only key validation/save), GET /usage/settings/langwatch-key/status (key status reporting). Implements ownership enforcement, LangWatch error mapping to structured HTTP responses, and admin-only access controls.
Backend Test Suite - Unit/Integration
langbuilder/src/backend/base/tests/api/test_usage_endpoint.py, langbuilder/src/backend/base/tests/api/test_usage_router_skeleton.py, langbuilder/src/backend/base/tests/api/test_usage_router_registration.py, langbuilder/src/backend/base/tests/api/test_usage_security.py, langbuilder/src/backend/base/tests/api/test_flow_runs_endpoint.py, langbuilder/src/backend/base/tests/api/test_langwatch_key_endpoint.py, langbuilder/src/backend/base/tests/api/test_usage_api_integration.py
Comprehensive test coverage for router functionality including authorization enforcement, error handling, endpoint registration, schema validation, key management, and end-to-end integration via FastAPI TestClient with dependency overrides.
Backend Test Suite - Service Layer
langbuilder/src/backend/base/tests/services/test_langwatch_*.py
Extensive service tests covering HTTP client configuration, pagination/fetch behavior, trace parsing and aggregation, cache-aside logic with Redis error handling, encryption/decryption, exception hierarchy, ownership filtering, flow runs retrieval, key validation, and integration scenarios. Tests validate async behavior, API contracts, retry logic, and graceful error recovery.
Frontend - Core Page & Settings
langbuilder/src/frontend/src/pages/UsagePage/UsagePage.tsx, langbuilder/src/frontend/src/pages/UsagePage/__tests__/UsagePage.test.tsx, langbuilder/src/frontend/src/pages/SettingsPage/LangWatchKeyForm.tsx, langbuilder/src/frontend/src/pages/SettingsPage/__tests__/LangWatchKeyForm*.test.tsx, langbuilder/src/frontend/src/pages/SettingsPage/pages/GeneralPage/index.tsx
Implements UsagePage dashboard with date range picker, user filtering (admin-only), sub-view toggle (flows/mcp), and flow selection. Integrates LangWatchKeyForm in Settings for admin key configuration with validation and status display. Page handles loading, error, and empty states with proper error code handling and retry mechanisms.
Frontend - UI Components
langbuilder/src/frontend/src/pages/UsagePage/components/DateRangePicker.tsx, langbuilder/src/frontend/src/pages/UsagePage/components/UsageSummaryCards.tsx, langbuilder/src/frontend/src/pages/UsagePage/components/FlowBreakdownList.tsx, langbuilder/src/frontend/src/pages/UsagePage/components/FlowBreakdownRow.tsx, langbuilder/src/frontend/src/pages/UsagePage/components/FlowRunsTable.tsx, langbuilder/src/frontend/src/pages/UsagePage/components/EmptyStatePrompt.tsx, langbuilder/src/frontend/src/pages/UsagePage/components/ErrorState.tsx, langbuilder/src/frontend/src/pages/UsagePage/components/LoadingSkeleton.tsx, langbuilder/src/frontend/src/pages/UsagePage/components/SelectionSummary.tsx, langbuilder/src/frontend/src/pages/UsagePage/components/SubViewToggle.tsx, langbuilder/src/frontend/src/pages/UsagePage/components/UserFilterDropdown.tsx
Reusable UI components for the usage dashboard including date range picker with presets, summary metric cards, paginated flow breakdown table with selection, flow detail rows with run expansion, run detail table, empty/error state prompts, loading skeleton, selection summary aggregation, sub-view toggle, and user filter dropdown. All components include comprehensive unit tests.
Frontend - Hooks & Services
langbuilder/src/frontend/src/pages/UsagePage/hooks/useGetUsageSummary.ts, langbuilder/src/frontend/src/pages/UsagePage/hooks/useGetFlowRuns.ts, langbuilder/src/frontend/src/pages/UsagePage/hooks/useGetKeyStatus.ts, langbuilder/src/frontend/src/pages/UsagePage/hooks/__tests__/*, langbuilder/src/frontend/src/services/LangWatchService.ts, langbuilder/src/frontend/src/services/__tests__/LangWatchService.test.ts
Custom React Query hooks for usage summary, flow runs, and key status with configured staleTime, retry, and exponential backoff logic. LangWatchService provides API interaction layer with structured error handling, detail object parsing, and consistent error code/retryable flag propagation.
Frontend - Types & Constants
langbuilder/src/frontend/src/types/usage.ts, langbuilder/src/frontend/src/types/__tests__/usage.test.ts, langbuilder/src/frontend/src/hooks/useDebounce.ts, langbuilder/src/frontend/src/hooks/__tests__/useDebounce.test.ts
TypeScript interfaces for usage analytics (UsageQueryParams, UsageResponse, FlowUsage, RunDetail, etc.) and reusable useDebounce hook for debouncing value updates. Comprehensive type tests validate field presence and shape consistency.
Frontend - Navigation & Routes
langbuilder/src/frontend/src/customization/feature-flags.ts, langbuilder/src/frontend/src/routes.tsx, langbuilder/src/frontend/src/components/core/appHeaderComponent/components/AccountMenu/index.tsx, langbuilder/src/frontend/src/pages/UsagePage/index.ts
Feature-flag gated Usage route and AccountMenu navigation entry. ENABLE_USAGE_TRACKING flag controls route availability and Usage menu option, enabling conditional feature rollout.
Frontend - E2E Tests
langbuilder/src/frontend/tests/extended/features/usage-dashboard.spec.ts
Playwright test suite for usage dashboard covering skeleton rendering, data load, date range interactions, error states, and retry logic with mocked API responses.
Configuration
langbuilder/src/backend/base/pyproject.toml
Updates langwatch dependency from version range (>=0.2.11,<0.3.0) to fixed version (==0.17.0) for compatibility alignment.

Sequence Diagram(s)

sequenceDiagram
    participant User as User (Browser)
    participant Frontend as Frontend App
    participant API as Usage Router
    participant LWService as LangWatch Service
    participant LangWatch as LangWatch API
    participant Cache as Redis Cache
    participant DB as Database

    User->>Frontend: Navigate to /usage
    activate Frontend
    Frontend->>Frontend: Mount UsagePage, init state
    Frontend->>API: GET /api/v1/usage/?from_date=X&to_date=Y
    activate API
    API->>API: Enforce ownership, build allowed_flow_ids
    API->>LWService: get_usage_summary(params, allowed_flow_ids, is_admin=False)
    activate LWService
    LWService->>Cache: Check cache key usage:user:UUID:...
    alt Cache Hit
        Cache-->>LWService: Return cached summary
    else Cache Miss
        LWService->>DB: Get GlobalSettings for API key
        LWService->>LangWatch: POST /api/traces/search (with auth token)
        activate LangWatch
        LangWatch-->>LWService: Return traces with pagination
        deactivate LangWatch
        LWService->>LWService: _parse_trace, _filter_by_ownership, _aggregate_with_metadata
        LWService->>Cache: Write result with TTL=3600s
    end
    LWService-->>API: Return UsageResponse
    deactivate LWService
    API-->>Frontend: 200 + UsageResponse
    deactivate API
    Frontend->>Frontend: Render summary, flow breakdown list, selection controls
    deactivate Frontend
    User->>Frontend: Select flows, click "Last 7 days"
    activate Frontend
    Frontend->>Frontend: Update date range state (debounce 300ms)
    Frontend->>API: GET /api/v1/usage/?from_date=X&to_date=Y
    API->>LWService: get_usage_summary(...) → invalidate_cache first
    LWService->>Cache: Delete usage:user:* keys
    activate Cache
    Cache-->>LWService: Keys deleted
    deactivate Cache
    LWService->>LangWatch: Fetch fresh data
    LWService-->>API: Return updated response
    API-->>Frontend: Updated UsageResponse
    Frontend->>Frontend: Re-render with new data
    deactivate Frontend
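The cache-hit and cache-miss branches in the diagram follow the usual cache-aside shape. Below is a minimal synchronous sketch under stated assumptions: `InMemoryCache` is a hypothetical stand-in for Redis, the key layout after `usage:user:<id>` is invented for illustration, and the real service is presumably async.

```python
import json
import time
from typing import Callable, Optional

CACHE_TTL_SECONDS = 3600  # TTL shown in the diagram


class InMemoryCache:
    """Tiny stand-in for Redis with expiry; hypothetical, for illustration only."""

    def __init__(self) -> None:
        self._data: dict[str, tuple[float, str]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._data.get(key)
        if entry is None or entry[0] < time.monotonic():
            return None  # missing or expired
        return entry[1]

    def setex(self, key: str, ttl: int, value: str) -> None:
        self._data[key] = (time.monotonic() + ttl, value)


def build_cache_key(user_id: str, from_date: str, to_date: str) -> str:
    # The user id is part of the key so one user's summary is never served to another.
    return f"usage:user:{user_id}:{from_date}:{to_date}"


def get_usage_summary(cache: InMemoryCache, key: str, fetch: Callable[[], dict]) -> dict:
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)  # cache hit
    result = fetch()  # cache miss: call the upstream API
    cache.setex(key, CACHE_TTL_SECONDS, json.dumps(result))
    return result
```

Scoping the key per user is what guards against the cross-user collision listed among the QA fixes (BUG-L4).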

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 7

Note

Due to the large number of review comments, Critical severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
langbuilder/src/frontend/src/pages/SettingsPage/pages/GeneralPage/index.tsx (1)

17-17: ⚠️ Potential issue | 🟠 Major

Gate the LangWatch settings section with ENABLE_USAGE_TRACKING as well.

Right now the block is admin-gated only; it can still render when usage tracking is globally disabled, which breaks rollout consistency.

🛠️ Proposed fix
-import { ENABLE_PROFILE_ICONS } from "@/customization/feature-flags";
+import {
+  ENABLE_PROFILE_ICONS,
+  ENABLE_USAGE_TRACKING,
+} from "@/customization/feature-flags";
...
-      {isAdmin && (
+      {isAdmin && ENABLE_USAGE_TRACKING && (
         <div className="flex w-full flex-col gap-6">
           <div className="flex w-full items-start gap-6">
             <div className="flex w-full flex-col">

Also applies to: 167-181

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@langbuilder/src/frontend/src/pages/SettingsPage/pages/GeneralPage/index.tsx`
at line 17, Import and use the ENABLE_USAGE_TRACKING feature flag alongside the
existing admin check so the LangWatch settings section only renders when both
admin privileges and ENABLE_USAGE_TRACKING are enabled; specifically, add
ENABLE_USAGE_TRACKING to the imports (next to ENABLE_PROFILE_ICONS) and update
the conditional that renders the LangWatch settings block (the same conditional
around the LangWatch settings component/JSX) to require ENABLE_USAGE_TRACKING &&
isAdmin (or the existing admin boolean) in addition to the current checks.
🟠 Major comments (24)
langbuilder/src/backend/base/tests/services/test_langwatch_exceptions.py-53-53 (1)

53-53: ⚠️ Potential issue | 🟠 Major

Fix Ruff EM101/TRY003 in raised test exception message.

Line 53 raises with a string literal directly, which Ruff flags. Assign the message to a variable first.

🧹 Proposed lint-safe fix
     for exc_class in exception_classes:
+        message = "test message"
         with pytest.raises(LangWatchError):
-            raise exc_class("test message")
+            raise exc_class(message)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@langbuilder/src/backend/base/tests/services/test_langwatch_exceptions.py` at
line 53, Ruff flags raising an exception with a string literal (EM101/TRY003);
change the raise site that currently does raise exc_class("test message") to
first assign the message to a local variable (e.g., msg = "test message") and
then raise using that variable (raise exc_class(msg)); locate the occurrence by
looking for the exc_class raise in test_langwatch_exceptions.py and update it
accordingly.
langbuilder/src/frontend/src/pages/UsagePage/components/DateRangePicker.tsx-59-61 (1)

59-61: ⚠️ Potential issue | 🟠 Major

Use local-date formatting for presets to avoid calendar-date mismatches.

Line 60 converts to UTC before truncating to YYYY-MM-DD. For users outside UTC, the UTC calendar date can differ from the local calendar date. For example, for a user in UTC-5 at local time 2026-03-17 23:00, new Date().toISOString() returns 2026-03-18T04:00:00.000Z, so the preset shows "2026-03-18" instead of the local date "2026-03-17". As a result, preset ranges (lines 27, 36, 45, 54) can display incorrect dates and disagree with dates entered manually through <input type="date">, which uses the local timezone.

Proposed fix
 function toISODate(date: Date): string {
-  return date.toISOString().split("T")[0];
+  const year = date.getFullYear();
+  const month = String(date.getMonth() + 1).padStart(2, "0");
+  const day = String(date.getDate()).padStart(2, "0");
+  return `${year}-${month}-${day}`;
 }
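The date shift in the example above can be checked with a few lines of arithmetic. The sketch is in Python rather than the component's TypeScript and uses a fixed UTC-5 offset as an assumption. It only illustrates why truncating a UTC timestamp and reading local calendar fields can disagree.

```python
from datetime import datetime, timedelta, timezone

# 23:00 on 2026-03-17 in a UTC-5 zone, the case described above.
utc_minus_5 = timezone(timedelta(hours=-5))
local_now = datetime(2026, 3, 17, 23, 0, tzinfo=utc_minus_5)

# Analogue of toISOString().split("T")[0]: convert to UTC, then truncate.
utc_date = local_now.astimezone(timezone.utc).date().isoformat()

# Analogue of the proposed fix: read the calendar fields in the local zone.
local_date = local_now.date().isoformat()

print(utc_date)    # 2026-03-18
print(local_date)  # 2026-03-17
```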
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@langbuilder/src/frontend/src/pages/UsagePage/components/DateRangePicker.tsx`
around lines 59 - 61, The toISODate function currently uses date.toISOString()
which converts to UTC and causes calendar-date mismatches for non-UTC users;
update toISODate to produce a local YYYY-MM-DD string (e.g., build from
date.getFullYear(), date.getMonth()+1, date.getDate() with zero-padding or use
toLocaleDateString with an ISO-like locale such as 'sv'/'en-CA') so presets that
call toISODate (used for preset ranges) match the browser <input type="date">
local calendar values; replace the existing implementation of toISODate
accordingly to ensure all preset computations (the preset helpers referenced in
this file) use local dates.
langbuilder/src/backend/base/tests/services/test_langwatch_key_validation.py-22-22 (1)

22-22: ⚠️ Potential issue | 🟠 Major

Replace the key-like test literal to avoid secret-scanner false positives.

This constant shape is being flagged by Gitleaks and can block CI even though it’s test-only.

🛠️ Proposed fix
-VALID_API_KEY = "lw_test_valid_key_abc123"
+VALID_API_KEY = "langwatch_test_token"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@langbuilder/src/backend/base/tests/services/test_langwatch_key_validation.py`
at line 22, The constant VALID_API_KEY contains a key-like literal that triggers
secret scanners; replace its value with a clearly non-secret test placeholder
(e.g., "test-valid-api-key" or "VALID_API_KEY_PLACEHOLDER") or generate a benign
token via a fixture (pytest fixture using uuid4 or faker) so tests continue to
pass; update any tests that assert specific formatting to accept the placeholder
or generated value and keep the symbol name VALID_API_KEY unchanged.
langbuilder/src/backend/base/tests/services/test_langwatch_key_validation.py-29-35 (1)

29-35: ⚠️ Potential issue | 🟠 Major

Use public constructor instead; object.__setattr__() does not suppress SLF001 violations.

Ruff flags 11 SLF001 violations in this file. The suggested object.__setattr__() pattern does not resolve this—Ruff still flags the assignments as private member access.

The cleanest solution: use the public constructor that already exists:

def _make_service() -> LangWatchService:
    """Create a LangWatchService instance with mocked DB session and httpx client."""
    svc = LangWatchService(db_session=AsyncMock())
    svc.redis = None
    return svc

If the constructor cannot be used, suppress violations with # noqa: SLF001 (the pattern used elsewhere in the codebase). The same violations exist in test_langwatch_ownership.py—apply the fix consistently across both files.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@langbuilder/src/backend/base/tests/services/test_langwatch_key_validation.py`
around lines 29 - 35, The test helper _make_service should instantiate
LangWatchService via its public constructor instead of creating an instance via
LangWatchService.__new__ and assigning private members; change _make_service to
call LangWatchService(db_session=AsyncMock()) and then set svc.redis = None (or
if constructor truly cannot be used, add a "# noqa: SLF001" comment to the
offending assignment lines to suppress the Ruff SLF001 warnings); apply the same
change in test_langwatch_ownership.py to remove the SLF001 violations
consistently.
Research/bugs/dashboard-completeness-audit/thread-3-backend-audit.md-6-6 (1)

6-6: ⚠️ Potential issue | 🟠 Major

Remove machine-local absolute path from committed docs.

This embeds local user/workspace identifiers and makes the note non-portable. Use a repo-relative root reference instead.

🔒 Suggested edit
-**Codebase:** `/Users/cg-adubuc/cg-ai-msl-workspaces/orgs/4c1a52a5-c94b-4f56-a14b-704b5c2f4725/projects/83b7021c-55d2-4e01-bab2-3d59c760c2e6/main/langbuilder/`
+**Codebase:** `langbuilder/` (repository root)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Research/bugs/dashboard-completeness-audit/thread-3-backend-audit.md` at line
6, The committed doc contains a machine-local absolute path string (starts with
"/Users/") embedded in thread-3-backend-audit.md; replace that absolute path
with a repo-relative reference (e.g.,
"./Research/bugs/dashboard-completeness-audit/...") or a neutral placeholder
like "<repo-root>/Research/...", and commit the updated file so no local
user/workspace identifiers remain; search for any other "/Users/" occurrences in
the repository and update them similarly.
langbuilder/src/frontend/src/services/LangWatchService.ts-22-24 (1)

22-24: 🛠️ Refactor suggestion | 🟠 Major

Use the shared api controller instead of raw fetch in service calls.

This service currently bypasses the project-standard API layer.

As per coding guidelines, "Use the api controller from @/controllers/API for backend API calls."

Also applies to: 50-55, 73-78, 98-107

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@langbuilder/src/frontend/src/services/LangWatchService.ts` around lines 22 -
24, The service is directly calling fetch (e.g., the call using BASE_URL_API_V1
and searchParams) and must use the shared API controller instead; replace each
raw fetch call in LangWatchService (the usages around lines shown: the call at
start and the other occurrences) with the appropriate methods from the exported
api controller from "@/controllers/API" (e.g., api.get or api.post) while
passing the same URL path/params (drop BASE_URL_API_V1 if the api controller
already prefixes base), propagate credentials/headers via the api call options,
and keep error handling and response parsing the same; update the functions that
currently use fetch so they import and call the api controller instead of
window.fetch.
langbuilder/src/frontend/src/services/LangWatchService.ts-51-51 (1)

51-51: ⚠️ Potential issue | 🟠 Major

Encode flowId before putting it into the URL path.

Unencoded path segments can break requests when IDs contain reserved URL characters.

💡 Suggested fix
-    `${BASE_URL_API_V1}usage/${flowId}/runs?${searchParams}`,
+    `${BASE_URL_API_V1}usage/${encodeURIComponent(flowId)}/runs?${searchParams}`,
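The effect of encoding a path segment can be illustrated with Python's standard library, where `urllib.parse.quote` with `safe=""` plays a role roughly analogous to `encodeURIComponent`. The two differ on a few punctuation characters. The helper name `build_runs_path` and the sample id are hypothetical.

```python
from urllib.parse import quote, urlencode


def build_runs_path(flow_id: str, params: dict[str, str]) -> str:
    """Build the runs path with the flow id escaped as a single path segment."""
    # safe="" also escapes "/", so the id can never add extra path segments.
    return f"usage/{quote(flow_id, safe='')}/runs?{urlencode(params)}"


print(build_runs_path("a/b c?", {"from_date": "2026-03-01"}))
# usage/a%2Fb%20c%3F/runs?from_date=2026-03-01
```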
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@langbuilder/src/frontend/src/services/LangWatchService.ts` at line 51, The
URL is inserting an unencoded path segment using
`${BASE_URL_API_V1}usage/${flowId}/runs?${searchParams}` which can break for IDs
with reserved characters; update the caller/construction in LangWatchService.ts
to encode the path segment by replacing `flowId` with
`encodeURIComponent(flowId)` (keep `BASE_URL_API_V1` and `searchParams`
unchanged) so the request path becomes safe for all IDs.
langbuilder/src/backend/base/tests/api/test_langwatch_key_endpoint.py-67-67 (1)

67-67: ⚠️ Potential issue | 🟠 Major

Replace secret-like API key literals to avoid false-positive secret leaks in CI.

These test values are being detected by secret scanners; this can fail security gates even though they are fixtures.

💡 Suggested fix
-    body = SaveLangWatchKeyRequest(api_key="lw_live_abc123xyz")
+    body = SaveLangWatchKeyRequest(api_key="lw_test_unit_key_abc123xyz")
@@
-    body = SaveLangWatchKeyRequest(api_key="lw_live_abc123")
+    body = SaveLangWatchKeyRequest(api_key="lw_test_unit_key_abc123")

Also applies to: 163-163

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@langbuilder/src/backend/base/tests/api/test_langwatch_key_endpoint.py` at
line 67, Replace the hardcoded secret-like API key literal used to create the
test request (the SaveLangWatchKeyRequest instantiation that sets
api_key="lw_live_abc123xyz") with a non-secret test-safe placeholder or
generated value (e.g., "test-api-key" or a random UUID) so secret scanners don’t
flag fixtures; update all occurrences (including the other instance at the same
file) where the api_key literal is used to use the safe placeholder or a test
helper that returns a non-sensitive value.
langbuilder/src/frontend/src/pages/UsagePage/components/FlowBreakdownList.tsx-1-33 (1)

1-33: ⚠️ Potential issue | 🟠 Major

Clamp/reset pagination when the filtered flow list shrinks.

When flows changes, page can remain out-of-range, yielding an empty table even though data exists.

💡 Suggested fix
-import { useState } from "react";
+import { useEffect, useState } from "react";
@@
-  const totalPages = Math.ceil(flows.length / PAGE_SIZE);
+  const totalPages = Math.max(1, Math.ceil(flows.length / PAGE_SIZE));
   const pageFlows = flows.slice(page * PAGE_SIZE, (page + 1) * PAGE_SIZE);
+
+  useEffect(() => {
+    setPage((p) => Math.min(p, totalPages - 1));
+  }, [totalPages]);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@langbuilder/src/frontend/src/pages/UsagePage/components/FlowBreakdownList.tsx`
around lines 1 - 33, The paginated view can show an empty page when the incoming
flows array shrinks because the page state isn't clamped; add a useEffect that
watches flows (or flows.length) and PAGE_SIZE-derived totalPages and then call
setPage to clamp the current page into range (e.g., setPage(p => Math.min(p,
Math.max(0, Math.ceil(flows.length / PAGE_SIZE) - 1)))) so pageFlows (from
flows.slice(...)) never becomes out-of-range; implement this near the existing
page state and handlers and reference PAGE_SIZE, page, totalPages, setPage, and
pageFlows in the effect.
langbuilder/src/backend/base/langflow/services/tracing/langwatch.py-165-168 (1)

165-168: ⚠️ Potential issue | 🟠 Major

Preserve existing labels when adding the flow label.

This currently replaces the full labels array, which can drop previously attached labels on the trace metadata.

💡 Suggested fix
-        if metadata and "flow_name" in metadata:
-            self.trace.update(metadata=(self.trace.metadata or {}) | {
-                "labels": [f"Flow: {metadata['flow_name']}"],
-                "flow_id": self.flow_id,  # Stable identifier — survives flow renames
-            })
+        if metadata and "flow_name" in metadata:
+            current_metadata = self.trace.metadata or {}
+            labels = list(current_metadata.get("labels") or [])
+            flow_label = f"Flow: {metadata['flow_name']}"
+            if flow_label not in labels:
+                labels.append(flow_label)
+            self.trace.update(
+                metadata=current_metadata
+                | {
+                    "labels": labels,
+                    "flow_id": self.flow_id,  # Stable identifier — survives flow renames
+                }
+            )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@langbuilder/src/backend/base/langflow/services/tracing/langwatch.py` around
lines 165 - 168, The trace metadata update in self.trace.update replaces any
existing "labels" array, dropping previously attached labels; change the update
to merge existing labels with the new flow label instead of overwriting: read
current_labels = (self.trace.metadata or {}).get("labels", []), append or extend
with f"Flow: {metadata['flow_name']}" only if not already present, then call
self.trace.update with the merged "labels" and the existing flow_id value (keep
"flow_id": self.flow_id) so other metadata is preserved; modify the code around
self.trace.update(...) in langwatch.py to perform this merge before updating.
langbuilder/src/backend/base/langflow/services/tracing/langwatch.py-272-274 (1)

272-274: ⚠️ Potential issue | 🟠 Major

Tighten version normalization regex to avoid corrupting multi-digit model segments.

The current pattern can mutate values like gpt-4-32k into gpt-4.32k, which breaks model matching/pricing lookup.

💡 Suggested fix
-    normalized = re.sub(r"(\d)-(\d)", r"\1.\2", normalized)
+    normalized = re.sub(r"(?<=-)(\d)-(\d)(?=-|$)", r"\1.\2", normalized)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@langbuilder/src/backend/base/langflow/services/tracing/langwatch.py` around
lines 272 - 274, The normalization step in langwatch.py that assigns to the
variable "normalized" currently replaces digit-dash-digit sequences too broadly
(e.g., turns "gpt-4-32k" into "gpt-4.32k"); update the second re.sub call (the
one converting version dashes to dots) so it only matches single-digit
components by using lookaround checks that ensure neither side is adjacent to
other digits (i.e., only transform when both sides are standalone single
digits), leaving multi-digit segments like "32k" untouched.
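
The lookaround variant described in the prompt above can be sketched as follows. This is an illustration only: `normalize_version_dashes` is a hypothetical name, and it covers just the dash-to-dot step, not the rest of the normalization in `langwatch.py`.

```python
import re

# Match digit-dash-digit only when neither digit touches another digit,
# so "3-5" qualifies but the "4-3" inside "4-32k" does not.
_VERSION_DASH = re.compile(r"(?<!\d)(\d)-(\d)(?!\d)")


def normalize_version_dashes(model: str) -> str:
    """Turn single-digit version dashes into dots (hypothetical helper)."""
    return _VERSION_DASH.sub(r"\1.\2", model)
```

Note that this pattern differs from the one in the suggested fix, which anchors on surrounding dashes instead; both leave `gpt-4-32k` unchanged.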
langbuilder/src/backend/base/langflow/services/langwatch/schemas.py-13-15 (1)

13-15: ⚠️ Potential issue | 🟠 Major

Reject blank-only LangWatch keys here.

min_length=1 still accepts values like " ". That can be stored as a "configured" key and then cause every downstream LangWatch call to fail until an admin fixes it.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@langbuilder/src/backend/base/langflow/services/langwatch/schemas.py` around
lines 13 - 15, The SaveLangWatchKeyRequest currently uses Field(...,
min_length=1) which still allows whitespace-only strings; update the api_key
validation to reject blank-only values by either replacing the type with
pydantic.constr(strip_whitespace=True, min_length=1, max_length=500) for the
api_key field or adding a `@validator`('api_key') on the SaveLangWatchKeyRequest
model that checks api_key.strip() is not empty and raises a ValueError if it is;
ensure the validation prevents storing keys that are only whitespace.
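
The check a validator on `SaveLangWatchKeyRequest` would wrap can be sketched as a plain function. This is dependency-free on purpose; `clean_api_key` is a hypothetical name, and the 500-character limit is taken from the prompt above.

```python
def clean_api_key(raw: str, max_length: int = 500) -> str:
    """Strip the key and reject blank-only or oversized values.

    Hypothetical helper: a Pydantic validator could call this and let
    the ValueError surface as a normal validation failure.
    """
    key = raw.strip()
    if not key:
        raise ValueError("api_key must not be empty or whitespace-only")
    if len(key) > max_length:
        raise ValueError(f"api_key must be at most {max_length} characters")
    return key
```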
langbuilder/src/backend/base/tests/api/test_usage_router_registration.py-34-41 (1)

34-41: ⚠️ Potential issue | 🟠 Major

This doesn't validate main-router registration.

Every assertion loads langflow/api/v1/usage/router.py directly. If langflow/api/router.py stops including usage_router, this file still passes because _load_api_router() is never used.

Also applies to: 54-89

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@langbuilder/src/backend/base/tests/api/test_usage_router_registration.py`
around lines 34 - 41, Tests currently import langflow/api/v1/usage/router.py
directly instead of exercising the main router registration, so changes to
langflow/api/router.py could break registration unnoticed; update the tests to
call and use the helper _load_api_router() (which loads langflow.api.router) for
assertions instead of importing langflow.api.v1.usage.router directly, then
verify that the loaded module exposes or registers usage_router (or the expected
route objects) as before; replace direct imports in the assertions referenced
and in the block covering lines 54-89 with checks against the module returned by
_load_api_router() to validate main-router registration.
langbuilder/src/backend/base/tests/api/test_usage_security.py-30-58 (1)

30-58: ⚠️ Potential issue | 🟠 Major

Use the FastAPI client fixture to drive dependency injection tests instead of checking signatures and mocking exceptions.

The tests at these lines check endpoint parameter annotations with inspect.signature() and mock exceptions with side_effect, but never invoke endpoints through FastAPI's actual dependency injection system. This means:

  • A broken dependency declaration (e.g., wrong Depends() target) passes these tests silently.
  • Auth failures are mocked rather than validated as real 401/403 responses from FastAPI.
  • Middleware and request-level validation are completely bypassed.

Instead, use the client fixture from conftest.py to make actual HTTP requests and optionally override dependencies with app.dependency_overrides if you need to isolate external services. This ensures the endpoints behave correctly when FastAPI's DI system runs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@langbuilder/src/backend/base/tests/api/test_usage_security.py` around lines
30 - 58, The tests currently only inspect signatures and mock exceptions in
helper functions _stub_modules and _load_router, which bypass FastAPI's DI and
middleware; update the tests to use the FastAPI test client (use the client
fixture from conftest.py) to make real HTTP requests to the router returned by
_load_router, remove reliance on inspect.signature() and side_effect mocks for
auth paths, and when isolation is needed use app.dependency_overrides to replace
dependencies (for example override the auth dependency to simulate
allowed/denied users) so that responses are validated as real 401/403 and
middleware/validation run through the framework.
langbuilder/src/backend/base/tests/api/test_usage_endpoint.py-18-39 (1)

18-39: ⚠️ Potential issue | 🟠 Major

Add test isolation: save and restore sys.modules state in _load_router().

_stub_modules() mutates the process-global sys.modules dictionary and never restores it. After the first test runs, the mocked modules persist for all subsequent tests, causing order-dependent failures and import pollution. Use a fixture with try/finally or a context manager to save and restore the original sys.modules state.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@langbuilder/src/backend/base/tests/api/test_usage_endpoint.py` around lines
18 - 39, The tests currently mutate global sys.modules in _stub_modules() and
never restore it, causing import pollution; update _load_router() (or wrap
_stub_modules()) to snapshot the current sys.modules entries for the keys you
will modify, then populate the stubs, load the router, and in a finally block
restore sys.modules to its original state (replacing restored entries and
removing any inserted stub keys) so subsequent tests are unaffected; refer to
the functions _stub_modules and _load_router to locate where to take the
snapshot, insert mocks, and perform the restore.
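
The snapshot-and-restore approach described above might look like the sketch below. `stubbed_modules` is a hypothetical name, and the list of module names to stub is assumed to come from the existing `_stub_modules()` helper.

```python
import sys
from contextlib import contextmanager
from unittest.mock import MagicMock

_MISSING = object()  # sentinel for "module was not imported before stubbing"


@contextmanager
def stubbed_modules(names):
    """Temporarily replace sys.modules entries with MagicMock stubs."""
    # Snapshot only the keys we are about to touch.
    saved = {name: sys.modules.get(name, _MISSING) for name in names}
    try:
        for name in names:
            sys.modules[name] = MagicMock(name=name)
        yield
    finally:
        # Restore originals and drop any stub we inserted ourselves.
        for name, original in saved.items():
            if original is _MISSING:
                sys.modules.pop(name, None)
            else:
                sys.modules[name] = original
```

Wrapping the router load in `with stubbed_modules([...]):` keeps the stubs scoped to that block, so later tests see the real modules again.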
langbuilder/src/backend/base/langflow/services/langwatch/service.py-253-263 (1)

253-263: ⚠️ Potential issue | 🟠 Major

to_date currently excludes the selected day.

When the caller supplies a date, both paths normalize it to YYYY-MM-DDT00:00:00Z. That makes the end of the range the start of the chosen day, so everything later that day is dropped from usage summaries and run details. Convert date-only to_date values to the next midnight (end-exclusive) or to the end of day.

Also applies to: 686-696

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@langbuilder/src/backend/base/langflow/services/langwatch/service.py` around
lines 253 - 263, The code normalizes a date-only to_date to midnight, making the
end-exclusive range exclude the selected day; update the to_date handling in the
block that sets to_dt (and the similar block around lines 686-696) so that when
to_date is a date only you convert it to the next midnight (or end-of-day)
instead of the same day's midnight — i.e., if not isinstance(to_date, datetime)
create to_dt as midnight of (to_date + 1 day) in UTC (or set to_dt to the end of
that day) so end_ms calculated from to_dt includes the full selected day; ensure
you still preserve timezone-aware datetimes passed in.
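
A minimal sketch of the end-exclusive conversion, assuming the service works in UTC. `end_exclusive_bound` is a hypothetical name, and treating naive datetimes as UTC is an assumption of this sketch rather than something the PR states.

```python
from datetime import date, datetime, time, timedelta, timezone


def end_exclusive_bound(to_date):
    """Return the exclusive upper bound for a date or datetime input."""
    # datetime is a subclass of date, so it must be checked first.
    if isinstance(to_date, datetime):
        # Full timestamps are kept; naive ones are assumed to be UTC.
        if to_date.tzinfo is None:
            return to_date.replace(tzinfo=timezone.utc)
        return to_date
    # Date-only input: use the next midnight so the whole day is included.
    next_day = to_date + timedelta(days=1)
    return datetime.combine(next_day, time.min, tzinfo=timezone.utc)
```

With this bound, a filter of the form `start <= ts < end` keeps every trace recorded on the selected day.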
langbuilder/src/backend/base/tests/services/test_langwatch_service_integration.py-234-259 (1)

234-259: ⚠️ Potential issue | 🟠 Major

Patching one shared service inside each coroutine makes this test racy.

Each task patches and restores the same instance methods while the other tasks are still awaiting get_usage_summary(). One coroutine can restore the original implementations while another is mid-call, which makes the test flaky and can trigger real HTTP work. Patch once around gather() or give each task its own service instance.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@langbuilder/src/backend/base/tests/services/test_langwatch_service_integration.py`
around lines 234 - 259, The test is racy because each concurrent coroutine
patches the same service instance methods (_fetch_from_langwatch,
_filter_by_ownership, _aggregate_with_metadata) inside call_service while
asyncio.gather runs multiple coroutines; either move the patch context so it's
applied once around the gather() call (patch
service_no_redis._fetch_from_langwatch, service_no_redis._filter_by_ownership,
service_no_redis._aggregate_with_metadata before calling asyncio.gather and
restore after) or instantiate a fresh service per task and patch that instance
inside call_service (create new service objects and patch their methods) so no
two coroutines mutate the same object concurrently while invoking
get_usage_summary.
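
The first option, patching once around `gather()`, can be sketched with a stand-in service. `FakeService` and `run_concurrently` are hypothetical names; only `_fetch_from_langwatch` and `get_usage_summary` mirror names from the test under review.

```python
import asyncio
from unittest.mock import AsyncMock, patch


class FakeService:
    """Stand-in for the shared service instance (hypothetical)."""

    async def _fetch_from_langwatch(self):
        raise RuntimeError("real HTTP call attempted")

    async def get_usage_summary(self):
        await asyncio.sleep(0)  # yield so the tasks interleave
        return await self._fetch_from_langwatch()


async def run_concurrently(service, n=5):
    # The patch is entered once and stays active until every task finishes,
    # so no coroutine can restore the original while another is mid-call.
    fake_fetch = AsyncMock(return_value=["trace"])
    with patch.object(service, "_fetch_from_langwatch", new=fake_fetch):
        return await asyncio.gather(
            *(service.get_usage_summary() for _ in range(n))
        )
```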
langbuilder/src/backend/base/langflow/services/langwatch/service.py-712-727 (1)

712-727: ⚠️ Potential issue | 🟠 Major

Filter fetch_flow_runs() to workflow traces as well.

_fetch_from_langwatch() already documents that one execution creates N+1 traces. This path matches by label/name only, so component traces can be counted as independent runs and inflate total_runs_in_period. Reuse the same workflow-only filter here before building RunDetail.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@langbuilder/src/backend/base/langflow/services/langwatch/service.py` around
lines 712 - 727, The current trace collection in _fetch_from_langwatch() is
matching both workflow and component traces which inflates total_runs_in_period;
update the logic that builds flow_traces (the loop using flow_label and span
checks) to produce a workflow-only list (i.e., only traces that either have the
"Flow: {flow_name}" label or contain a span with type "workflow" and name ==
flow_name) and then use that filtered workflow-only list when constructing
RunDetail and counting runs (the same workflow-only filter used elsewhere should
be reused here to exclude component traces from total_runs_in_period). Ensure
you reference the existing variables flow_label, flow_traces, trace.get("spans")
and the surrounding _fetch_from_langwatch()/fetch_flow_runs() flow so only
workflow traces are passed into RunDetail construction and counting.
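
A sketch of a workflow-only filter, based on the rule quoted elsewhere in this review (a trace is kept when it has no spans or has a span with type "workflow"). The function names are hypothetical, and the trace shape is assumed to match the dictionaries built by the test fixtures.

```python
def is_workflow_trace(trace: dict) -> bool:
    """True for traces with no spans or with at least one workflow span."""
    spans = trace.get("spans") or []
    return not spans or any(span.get("type") == "workflow" for span in spans)


def workflow_traces(traces: list[dict]) -> list[dict]:
    """Drop component-level traces so each execution is counted once."""
    return [trace for trace in traces if is_workflow_trace(trace)]
```

Applying `workflow_traces` before building `RunDetail` objects would keep component traces out of `total_runs_in_period`.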
langbuilder/src/backend/base/tests/services/test_langwatch_service_integration.py-548-556 (1)

548-556: ⚠️ Potential issue | 🟠 Major

These tests assert the pre-mapping exception type.

_fetch_all_pages() now converts these non-401/403 responses into LangWatchUnavailableError, so these expectations will fail even when the service is behaving as implemented. Assert the service exception here, or move the HTTP-status assertions up to the router tests.

Also applies to: 573-581

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@langbuilder/src/backend/base/tests/services/test_langwatch_service_integration.py`
around lines 548 - 556, Tests currently expect httpx.HTTPStatusError
pre-mapping, but _fetch_all_pages() now converts non-401/403 HTTP errors into
LangWatchUnavailableError; update the assertions in tests that call
service_no_redis.get_usage_summary (and the similar block at 573-581) to assert
that a LangWatchUnavailableError is raised instead of inspecting
exc_info.value.response.status_code, or alternatively move HTTP status
assertions into router-level tests; specifically change the pytest.raises target
from httpx.HTTPStatusError to LangWatchUnavailableError for calls to
service_no_redis.get_usage_summary and adjust any subsequent assertions to check
the service-level exception.
langbuilder/src/backend/base/tests/services/test_langwatch_service_integration.py-323-330 (1)

323-330: ⚠️ Potential issue | 🟠 Major

These mocks no longer match validate_key()'s request.

The service now validates keys with POST /api/traces/search using X-Auth-Token. In the lifecycle test the mock is still pointed at the old analytics endpoint, and in the auth-header test the only POST /api/traces/search response is already consumed by get_usage_summary(), so validate_key() has no matching mocked response.

Also applies to: 492-518

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@langbuilder/src/backend/base/tests/services/test_langwatch_service_integration.py`
around lines 323 - 330, The test's mocked HTTP responses still target the old
analytics endpoint, but validate_key() now issues a POST to /api/traces/search
with an X-Auth-Token header; update the httpx_mock.add_response calls in the
lifecycle test (and similarly at lines 492-518) to add a POST response for URL
"/api/traces/search" (status 200, appropriate json body) and include a header
matcher for "X-Auth-Token" or add a separate mocked POST response for the key
used by validate_key() so it isn't consumed by get_usage_summary(); locate mocks
around service_with_redis and get_usage_summary to ensure validate_key() has its
own matching POST /api/traces/search response.
langbuilder/src/backend/base/tests/services/test_langwatch_service_integration.py-69-88 (1)

69-88: ⚠️ Potential issue | 🟠 Major

These fixture traces never survive the workflow filter.

get_usage_summary() only keeps traces with no spans or a type == "workflow" span. _make_trace() builds only llm spans, so the ownership and pagination tests below end up exercising an empty result set instead of real aggregation.

💡 Proposed fix
-        "spans": [{"span_id": f"span_{trace_id}", "type": "llm", "model": "gpt-4o"}],
+        "spans": [
+            {"span_id": f"workflow_{trace_id}", "type": "workflow", "name": flow_name},
+            {"span_id": f"span_{trace_id}", "type": "llm", "model": "gpt-4o"},
+        ],
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@langbuilder/src/backend/base/tests/services/test_langwatch_service_integration.py`
around lines 69 - 88, The test fixture builder _make_trace produces only spans
with type "llm", which get filtered out by get_usage_summary (it keeps traces
with no spans or a span.type == "workflow"), so update the fixture to produce
traces that survive the filter: change _make_trace (or its call sites in the
tests) to either omit the spans entirely for traces meant to be included or add
a span with "type": "workflow" (e.g., spans: [{"span_id": f"span_{trace_id}",
"type": "workflow", "model": "gpt-4o"}]) so ownership and pagination tests
exercise real aggregated results rather than an empty set.
langbuilder/src/backend/base/langflow/api/v1/usage/router.py-270-277 (1)

270-277: ⚠️ Potential issue | 🟠 Major

Whitespace-only keys still 500 on this path.

After strip(), api_key can be empty. validate_key() raises ValueError for that case, but this handler only catches LangWatchConnectionError, so a whitespace-only payload turns into an internal error instead of a 422.

💡 Proposed fix
     api_key = body.api_key.strip()
+    if not api_key:
+        raise HTTPException(
+            status_code=422,
+            detail={
+                "code": "INVALID_KEY",
+                "message": "Invalid API key. Please check your LangWatch account settings and try again.",
+            },
+        )
 
     # Validate key against LangWatch before saving
     try:
         is_valid = await langwatch.validate_key(api_key)
+    except ValueError:
+        raise HTTPException(
+            status_code=422,
+            detail={
+                "code": "INVALID_KEY",
+                "message": "Invalid API key. Please check your LangWatch account settings and try again.",
+            },
+        ) from None
     except LangWatchConnectionError as exc:
         _raise_langwatch_http_error(exc)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@langbuilder/src/backend/base/langflow/api/v1/usage/router.py` around lines
270 - 277, After stripping the incoming key (api_key = body.api_key.strip()) you
must handle the empty-key case and ValueError raised by langwatch.validate_key
so whitespace-only payloads return a 422 instead of a 500; either check if
api_key == "" and return a SaveKeyResponse(success=False, key_preview="",
message="...") or raise the appropriate HTTP 422 error, and/or catch ValueError
around await langwatch.validate_key(api_key) and map it to the same 422
response; keep the existing handling for LangWatchConnectionError via
_raise_langwatch_http_error.
langbuilder/src/backend/base/tests/api/test_usage_api_integration.py-27-36 (1)

27-36: ⚠️ Potential issue | 🟠 Major

Avoid mutating sys.modules at import time in this test module.

These MagicMock entries stay installed for the entire pytest process. That can hide real import/wiring problems in the usage router and can also break unrelated tests that import the same modules later. Apply these stubs with monkeypatch inside a fixture so they are restored after each test.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@langbuilder/src/backend/base/tests/api/test_usage_api_integration.py` around
lines 27 - 36, The test currently mutates sys.modules at import time using
_STUBS and MagicMock which persists across the pytest run; instead create a
pytest fixture (e.g., stub_external_modules) that uses
monkeypatch.setitem(sys.modules, mod, MagicMock()) for each name in _STUBS and
apply it to tests (either by adding it as an argument to test functions or
making it autouse=True) so the stubs are installed per-test and automatically
restored after each test; update references to remove the global import-time
loop and rely on the fixture to provide the temporary stubs.
langbuilder/src/backend/base/langflow/services/langwatch/service.py-985-994 (1)

985-994: ⚠️ Potential issue | 🟠 Major

Remove or implement the Redis caching provider.

The import at line 991 will fail—get_redis_client does not exist in lfx.services.deps. The exception handler catches this silently and sets redis_client=None, causing the cache-aside pattern to no-op. Either implement get_redis_client in the deps module, or remove the dead cache logic entirely to avoid confusion about whether caching is actually working.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@langbuilder/src/backend/base/langflow/services/langwatch/service.py` around
lines 985 - 994, The current import of get_redis_client from lfx.services.deps
is dead and always falls back to redis_client=None, so either implement the
provider or remove the dead cache plumbing: either add a concrete
get_redis_client function to lfx.services.deps that returns a Redis client (so
the try/import block in langwatch/service.py and the redis_client variable work
as intended), or delete the try/except import and the redis_client variable here
and remove/rewire any cache-aside logic referenced by
get_usage_summary/self.redis so the service no longer advertises Redis support;
update comments accordingly to avoid misleading TODOs.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 33544e60-0a7c-4bc8-aeed-242cd3d8dca7

📥 Commits

Reviewing files that changed from the base of the PR and between 98c487d and ebe79c1.

📒 Files selected for processing (113)
  • QA/cost-tracking-dashboard-ix/findings.md
  • Research/bugs/cost-tracking-dashboard-ix-defects/.meta.md
  • Research/bugs/cost-tracking-dashboard-ix-defects/notebook.md
  • Research/bugs/cost-tracking-dashboard-ix-defects/plans/01_fix_di_crashes.md
  • Research/bugs/cost-tracking-dashboard-ix-defects/plans/02_fix_service_crashes.md
  • Research/bugs/cost-tracking-dashboard-ix-defects/plans/03_fix_migration_and_model.md
  • Research/bugs/cost-tracking-dashboard-ix-defects/plans/04_fix_security_and_cache.md
  • Research/bugs/cost-tracking-dashboard-ix-defects/plans/05_fix_frontend_errors.md
  • Research/bugs/cost-tracking-dashboard-ix-defects/plans/06_fix_service_cleanup.md
  • Research/bugs/cost-tracking-dashboard-ix-defects/plans/07_fix_navigation.md
  • Research/bugs/cost-tracking-dashboard-ix-defects/plans/execution_log.md
  • Research/bugs/cost-tracking-dashboard-ix-defects/plans/phase_map.md
  • Research/bugs/cost-tracking-dashboard-ix-defects/plans/review-verdicts.md
  • Research/bugs/cost-tracking-dashboard-ix-defects/research-plan.md
  • Research/bugs/cost-tracking-dashboard-ix-defects/research.md
  • Research/bugs/cost-tracking-dashboard-ix-defects/thread-1-navigation-architecture.md
  • Research/bugs/cost-tracking-dashboard-ix-defects/thread-2-db-session-pattern.md
  • Research/bugs/cost-tracking-dashboard-ix-defects/thread-3-usage-router-audit.md
  • Research/bugs/dashboard-completeness-audit/notebook.md
  • Research/bugs/dashboard-completeness-audit/plans/01_wire_existing_components.md
  • Research/bugs/dashboard-completeness-audit/plans/02_fix_token_capture.md
  • Research/bugs/dashboard-completeness-audit/plans/03_build_subview_toggle.md
  • Research/bugs/dashboard-completeness-audit/plans/04_build_multi_select.md
  • Research/bugs/dashboard-completeness-audit/plans/phase_map.md
  • Research/bugs/dashboard-completeness-audit/research-plan.md
  • Research/bugs/dashboard-completeness-audit/research.md
  • Research/bugs/dashboard-completeness-audit/thread-1-zero-cost.md
  • Research/bugs/dashboard-completeness-audit/thread-2-missing-ui.md
  • Research/bugs/dashboard-completeness-audit/thread-3-backend-audit.md
  • Research/bugs/dashboard-completeness-audit/thread-4-frontend-audit.md
  • Research/bugs/trace-flow-matching-gap/implementation_plan.md
  • Research/bugs/trace-flow-matching-gap/notebook.md
  • Research/bugs/trace-flow-matching-gap/research-plan.md
  • Research/bugs/trace-flow-matching-gap/research.md
  • Research/bugs/trace-flow-matching-gap/thread-1-tracer.md
  • Research/bugs/trace-flow-matching-gap/thread-2-usage-service.md
  • Research/bugs/trace-flow-matching-gap/thread-3-real-data.md
  • langbuilder/src/backend/base/langflow/alembic/versions/773db17e6029_add_global_settings_table.py
  • langbuilder/src/backend/base/langflow/api/router.py
  • langbuilder/src/backend/base/langflow/api/v1/usage/__init__.py
  • langbuilder/src/backend/base/langflow/api/v1/usage/router.py
  • langbuilder/src/backend/base/langflow/services/database/models/__init__.py
  • langbuilder/src/backend/base/langflow/services/database/models/global_settings.py
  • langbuilder/src/backend/base/langflow/services/langwatch/__init__.py
  • langbuilder/src/backend/base/langflow/services/langwatch/exceptions.py
  • langbuilder/src/backend/base/langflow/services/langwatch/schemas.py
  • langbuilder/src/backend/base/langflow/services/langwatch/service.py
  • langbuilder/src/backend/base/langflow/services/tracing/langwatch.py
  • langbuilder/src/backend/base/pyproject.toml
  • langbuilder/src/backend/base/tests/api/test_flow_runs_endpoint.py
  • langbuilder/src/backend/base/tests/api/test_langwatch_key_endpoint.py
  • langbuilder/src/backend/base/tests/api/test_usage_api_integration.py
  • langbuilder/src/backend/base/tests/api/test_usage_endpoint.py
  • langbuilder/src/backend/base/tests/api/test_usage_router_registration.py
  • langbuilder/src/backend/base/tests/api/test_usage_router_skeleton.py
  • langbuilder/src/backend/base/tests/api/test_usage_security.py
  • langbuilder/src/backend/base/tests/migrations/test_global_settings_migration.py
  • langbuilder/src/backend/base/tests/models/test_global_settings_model.py
  • langbuilder/src/backend/base/tests/services/test_langwatch_api_spike.py
  • langbuilder/src/backend/base/tests/services/test_langwatch_caching.py
  • langbuilder/src/backend/base/tests/services/test_langwatch_encryption.py
  • langbuilder/src/backend/base/tests/services/test_langwatch_exceptions.py
  • langbuilder/src/backend/base/tests/services/test_langwatch_fetch.py
  • langbuilder/src/backend/base/tests/services/test_langwatch_flow_runs.py
  • langbuilder/src/backend/base/tests/services/test_langwatch_httpx_client.py
  • langbuilder/src/backend/base/tests/services/test_langwatch_key_validation.py
  • langbuilder/src/backend/base/tests/services/test_langwatch_ownership.py
  • langbuilder/src/backend/base/tests/services/test_langwatch_parsing.py
  • langbuilder/src/backend/base/tests/services/test_langwatch_schemas.py
  • langbuilder/src/backend/base/tests/services/test_langwatch_service_integration.py
  • langbuilder/src/backend/base/tests/services/test_langwatch_service_skeleton.py
  • langbuilder/src/frontend/src/components/core/appHeaderComponent/components/AccountMenu/index.tsx
  • langbuilder/src/frontend/src/customization/feature-flags.ts
  • langbuilder/src/frontend/src/hooks/__tests__/useDebounce.test.ts
  • langbuilder/src/frontend/src/hooks/useDebounce.ts
  • langbuilder/src/frontend/src/pages/SettingsPage/LangWatchKeyForm.tsx
  • langbuilder/src/frontend/src/pages/SettingsPage/__tests__/LangWatchKeyForm.test.tsx
  • langbuilder/src/frontend/src/pages/SettingsPage/__tests__/LangWatchKeyFormComprehensive.test.tsx
  • langbuilder/src/frontend/src/pages/SettingsPage/pages/GeneralPage/index.tsx
  • langbuilder/src/frontend/src/pages/UsagePage/UsagePage.tsx
  • langbuilder/src/frontend/src/pages/UsagePage/__tests__/UsagePage.test.tsx
  • langbuilder/src/frontend/src/pages/UsagePage/components/DateRangePicker.tsx
  • langbuilder/src/frontend/src/pages/UsagePage/components/EmptyStatePrompt.tsx
  • langbuilder/src/frontend/src/pages/UsagePage/components/ErrorState.tsx
  • langbuilder/src/frontend/src/pages/UsagePage/components/FlowBreakdownList.tsx
  • langbuilder/src/frontend/src/pages/UsagePage/components/FlowBreakdownRow.tsx
  • langbuilder/src/frontend/src/pages/UsagePage/components/FlowRunsTable.tsx
  • langbuilder/src/frontend/src/pages/UsagePage/components/LoadingSkeleton.tsx
  • langbuilder/src/frontend/src/pages/UsagePage/components/SelectionSummary.tsx
  • langbuilder/src/frontend/src/pages/UsagePage/components/SubViewToggle.tsx
  • langbuilder/src/frontend/src/pages/UsagePage/components/UsageSummaryCards.tsx
  • langbuilder/src/frontend/src/pages/UsagePage/components/UserFilterDropdown.tsx
  • langbuilder/src/frontend/src/pages/UsagePage/components/__tests__/DateRangePicker.test.tsx
  • langbuilder/src/frontend/src/pages/UsagePage/components/__tests__/EmptyStatePrompt.test.tsx
  • langbuilder/src/frontend/src/pages/UsagePage/components/__tests__/ErrorState.test.tsx
  • langbuilder/src/frontend/src/pages/UsagePage/components/__tests__/FlowBreakdownList.test.tsx
  • langbuilder/src/frontend/src/pages/UsagePage/components/__tests__/FlowRunsTable.test.tsx
  • langbuilder/src/frontend/src/pages/UsagePage/components/__tests__/LoadingSkeleton.test.tsx
  • langbuilder/src/frontend/src/pages/UsagePage/components/__tests__/UsageSummaryCards.test.tsx
  • langbuilder/src/frontend/src/pages/UsagePage/components/__tests__/UserFilterDropdown.test.tsx
  • langbuilder/src/frontend/src/pages/UsagePage/hooks/__tests__/useGetFlowRuns.test.ts
  • langbuilder/src/frontend/src/pages/UsagePage/hooks/__tests__/useGetKeyStatus.test.ts
  • langbuilder/src/frontend/src/pages/UsagePage/hooks/__tests__/useGetUsageSummary.test.ts
  • langbuilder/src/frontend/src/pages/UsagePage/hooks/useGetFlowRuns.ts
  • langbuilder/src/frontend/src/pages/UsagePage/hooks/useGetKeyStatus.ts
  • langbuilder/src/frontend/src/pages/UsagePage/hooks/useGetUsageSummary.ts
  • langbuilder/src/frontend/src/pages/UsagePage/index.ts
  • langbuilder/src/frontend/src/routes.tsx
  • langbuilder/src/frontend/src/services/LangWatchService.ts
  • langbuilder/src/frontend/src/services/__tests__/LangWatchService.test.ts
  • langbuilder/src/frontend/src/types/__tests__/usage.test.ts
  • langbuilder/src/frontend/src/types/usage.ts
  • langbuilder/src/frontend/tests/extended/features/usage-dashboard.spec.ts

Comment on lines +154 to +169
    from_date: Annotated[str | None, Query(description="ISO 8601 start date (YYYY-MM-DD)")] = None,
    to_date: Annotated[str | None, Query(description="ISO 8601 end date (YYYY-MM-DD)")] = None,
    user_id: Annotated[str | None, Query(description="Admin only: filter by user UUID")] = None,
    sub_view: Annotated[str, Query(description="flows | mcp")] = "flows",
) -> UsageResponse:
    """Return aggregated cost and invocation data.

    Non-admin users receive only their own flows (user_id param silently ignored).
    Admins can filter by user_id or retrieve all flows.
    """
    params = UsageQueryParams(
        from_date=from_date,
        to_date=to_date,
        user_id=user_id,
        sub_view=sub_view,
    )
⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

In FastAPI with Pydantic v2, if endpoint query params are declared as str and the route handler manually instantiates a Pydantic model from them, does a resulting ValidationError get converted to a 422 automatically or does it propagate as a 500 unless it is caught explicitly?

💡 Result:

In FastAPI with Pydantic v2, if endpoint query parameters are declared as str (or other primitives) and the route handler manually instantiates a Pydantic model from them (e.g., using MyModel.model_validate(data) where data is a dict from request.query_params), a resulting ValidationError does not get automatically converted to a 422 response. Instead, it propagates as a 500 Internal Server Error unless explicitly caught. FastAPI's automatic 422 handling via RequestValidationError only applies to validation during automatic parameter parsing from request data into function arguments (e.g., when using Pydantic models directly as query parameters with Query or Depends). When validation occurs manually inside the route handler, the plain pydantic.ValidationError (or ValueError from validators) is treated as an unhandled server exception, resulting in a 500. To return a 422, you must explicitly catch the ValidationError in the handler and raise HTTPException(status_code=422) or use a custom exception handler for ValidationError. This behavior is consistent across Pydantic v1 and v2, and is by design: 422 is for client request validation failures, while internal validation errors are server errors (500) unless handled.


🏁 Script executed:

# First, let's see the full router.py file to understand the implementation
wc -l langbuilder/src/backend/base/langflow/api/v1/usage/router.py

Repository: CloudGeometry/langbuilder

Length of output: 133


🏁 Script executed:

# Read the router file to understand the full context
cat -n langbuilder/src/backend/base/langflow/api/v1/usage/router.py

Repository: CloudGeometry/langbuilder

Length of output: 13141


🏁 Script executed:

# Find the UsageQueryParams definition
fd -t f -e py . langbuilder/src/backend/base/langflow/api/v1/usage/ | xargs grep -l "UsageQueryParams"

Repository: CloudGeometry/langbuilder

Length of output: 129


🏁 Script executed:

# Check for FlowRunsQueryParams as well
rg "class (UsageQueryParams|FlowRunsQueryParams)" -A 15

Repository: CloudGeometry/langbuilder

Length of output: 2207


🏁 Script executed:

# Check if there are any exception handlers in the router or parent modules
rg "@app.exception_handler|@router.exception_handler|except.*ValidationError|except.*RequestValidationError" langbuilder/src/backend/base/langflow/api/v1/usage/

Repository: CloudGeometry/langbuilder

Length of output: 51


Convert query parameters to use Pydantic models with Depends() for proper validation responses.

Query parameters like from_date, to_date, user_id, and sub_view arrive as strings, but are manually converted to UsageQueryParams (lines 164–169) after entering the handler. When Pydantic encounters invalid data—malformed dates, invalid UUIDs, or invalid Literal values—the ValidationError propagates as a 500 Internal Server Error instead of the standard 422 Unprocessable Entity response. FastAPI's automatic 422 conversion only works during parameter binding in the function signature, not within handler code.

Use Depends() on the query models directly or declare parameters with their target types (e.g., from_date: date | None = Query(...)) so validation occurs before the handler runs.

Also applies to: 206–214

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@langbuilder/src/backend/base/langflow/api/v1/usage/router.py` around lines
154 - 169, The handler currently accepts raw query strings (from_date, to_date,
user_id, sub_view) and then constructs UsageQueryParams inside the function (see
UsageQueryParams usage), which causes Pydantic validation errors to surface as
500s; change the endpoint signature to accept UsageQueryParams via Depends()
(e.g., params: UsageQueryParams = Depends()) so FastAPI performs validation and
returns 422 on bad input, remove the manual UsageQueryParams(...) construction,
and apply the same change to the second handler referenced (the other block that
builds UsageQueryParams) so both endpoints rely on Pydantic dependency injection
for proper validation; update imports to include Depends and ensure
UsageQueryParams is a FastAPI-compatible Pydantic model.

Comment on lines +375 to +433
# Query DB for flow metadata
stmt = (
    select(Flow.id, Flow.name, Flow.user_id, User.username)
    .join(User, Flow.user_id == User.id, isouter=True)
    .where(Flow.id.in_(allowed_flow_ids))
)
result = await self._db_session.exec(stmt)
rows = result.all()

# Build name → FlowMeta map
# NOTE: LangWatch trace labels only contain flow *names* (e.g., "Flow: My Bot"),
# not flow IDs. When two flows share a name, we cannot perfectly disambiguate.
# Heuristic: prefer the flow whose ID is in allowed_flow_ids; tie-break by
# most recently created.
flow_name_map: dict[str, FlowMeta] = {}
for row in rows:
    meta = FlowMeta(
        flow_id=row.id,
        user_id=row.user_id or UUID(int=0),
        username=row.username or "",
    )
    existing = flow_name_map.get(row.name)
    if existing is None:
        flow_name_map[row.name] = meta
    else:
        # Collision: two flows share a name.
        # Prefer the one whose ID is in allowed_flow_ids.
        new_allowed = row.id in allowed_flow_ids
        old_allowed = existing.flow_id in allowed_flow_ids
        if new_allowed and not old_allowed:
            flow_name_map[row.name] = meta
        elif new_allowed and old_allowed:
            # Both allowed (admin view) — prefer most recently created.
            if hasattr(row, "created_at") and row.created_at and (
                not hasattr(existing, "created_at")
                or not getattr(existing, "created_at", None)
                or row.created_at > existing.created_at
            ):
                flow_name_map[row.name] = meta

allowed_names = set(flow_name_map.keys())

# Filter traces
filtered: list[dict] = []
for trace in traces:
    metadata = trace.get("metadata") or {}
    labels: list = metadata.get("labels") or []
    flow_name = next(
        (lbl[6:] for lbl in labels if isinstance(lbl, str) and lbl.startswith("Flow: ")),
        None,
    )
    # Fallback: root workflow span name (OTEL SDK doesn't surface labels in API metadata)
    if flow_name is None:
        for span in trace.get("spans", []):
            if span.get("type") == "workflow":
                flow_name = span.get("name")
                break
    if flow_name in allowed_names:
        filtered.append(trace)

⚠️ Potential issue | 🔴 Critical

Matching ownership by flow_name is not safe.

If two users have flows with the same name, flow_name_map collapses them to one entry and the later flow_name in allowed_names check keeps every trace with that shared name. A user who owns one of the duplicates can therefore see another user's usage. This needs a fail-closed path for non-unique names unless the trace payload carries a stable flow identifier.
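
One way the fail-closed path could look, as a sketch only. It assumes the caller can list every flow in the organisation (not just the allowed ones), which the quoted query does not currently do. The function name `unambiguous_allowed_names` and the sample data are hypothetical.

```python
from collections import defaultdict
from uuid import UUID


def unambiguous_allowed_names(
    all_flows: list[tuple[UUID, str]],
    allowed_flow_ids: set[UUID],
) -> set[str]:
    """Return flow names that identify exactly one flow, and that flow is allowed.

    A name shared by two or more flows is dropped entirely, because a trace
    that carries only the name cannot be attributed to a single owner.
    """
    ids_by_name: dict[str, set[UUID]] = defaultdict(set)
    for flow_id, name in all_flows:
        ids_by_name[name].add(flow_id)
    return {
        name
        for name, ids in ids_by_name.items()
        if len(ids) == 1 and ids <= allowed_flow_ids
    }


# Hypothetical data: Alice and Bob each own a flow called "My Bot".
ALICE_BOT, BOB_BOT, ALICE_REPORT = UUID(int=1), UUID(int=2), UUID(int=3)
all_flows = [(ALICE_BOT, "My Bot"), (BOB_BOT, "My Bot"), (ALICE_REPORT, "Weekly Report")]
alice_names = unambiguous_allowed_names(all_flows, {ALICE_BOT, ALICE_REPORT})
```

The cost of failing closed is that Alice temporarily loses usage for her own "My Bot" until the name becomes unique or traces carry a flow ID. The benefit is that she never sees Bob's traces.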

🧰 Tools
🪛 Ruff (0.15.6)

[warning] 406-412: Use a single if statement instead of nested if statements

(SIM102)

Comment on lines +555 to +569
# User scope: specific UUID for filtered view, role-aware scoping for empty sets
if params.user_id:
    user_scope = str(params.user_id)
elif is_admin and len(allowed_flow_ids) == 0:
    user_scope = "admin:all"
elif len(allowed_flow_ids) == 0:
    user_scope = "user:none"
else:
    user_scope = "user"

# Date hash: compact representation of date range
date_str = f"{params.from_date}:{params.to_date}"
date_hash = hashlib.sha256(date_str.encode()).hexdigest()[:12]

return f"usage:{org_id}:{params.sub_view}:{user_scope}:{date_hash}"

⚠️ Potential issue | 🔴 Critical

The cache key is still shared across different principals.

params.user_id is honored even for non-admin requests, and every non-empty unfiltered view collapses to user_scope = "user". That lets a regular user collide with another user—or with an admin-filtered cache entry—and receive someone else's cached summary.

💡 Proposed fix
-        if params.user_id:
-            user_scope = str(params.user_id)
-        elif is_admin and len(allowed_flow_ids) == 0:
-            user_scope = "admin:all"
-        elif len(allowed_flow_ids) == 0:
-            user_scope = "user:none"
-        else:
-            user_scope = "user"
+        if is_admin and params.user_id:
+            user_scope = f"admin:user:{params.user_id}"
+        else:
+            role = "admin" if is_admin else "user"
+            flow_scope = ",".join(sorted(str(flow_id) for flow_id in allowed_flow_ids))
+            flow_hash = hashlib.sha256(flow_scope.encode()).hexdigest()[:12] if flow_scope else "none"
+            user_scope = f"{role}:{flow_hash}"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

Before:
# User scope: specific UUID for filtered view, role-aware scoping for empty sets
if params.user_id:
    user_scope = str(params.user_id)
elif is_admin and len(allowed_flow_ids) == 0:
    user_scope = "admin:all"
elif len(allowed_flow_ids) == 0:
    user_scope = "user:none"
else:
    user_scope = "user"
# Date hash: compact representation of date range
date_str = f"{params.from_date}:{params.to_date}"
date_hash = hashlib.sha256(date_str.encode()).hexdigest()[:12]
return f"usage:{org_id}:{params.sub_view}:{user_scope}:{date_hash}"

After:
# User scope: specific UUID for filtered view, role-aware scoping for empty sets
if is_admin and params.user_id:
    user_scope = f"admin:user:{params.user_id}"
else:
    role = "admin" if is_admin else "user"
    flow_scope = ",".join(sorted(str(flow_id) for flow_id in allowed_flow_ids))
    flow_hash = hashlib.sha256(flow_scope.encode()).hexdigest()[:12] if flow_scope else "none"
    user_scope = f"{role}:{flow_hash}"
# Date hash: compact representation of date range
date_str = f"{params.from_date}:{params.to_date}"
date_hash = hashlib.sha256(date_str.encode()).hexdigest()[:12]
return f"usage:{org_id}:{params.sub_view}:{user_scope}:{date_hash}"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@langbuilder/src/backend/base/langflow/services/langwatch/service.py` around
lines 555 - 569, The cache key logic in the cache key builder (the block using
params.user_id, is_admin, allowed_flow_ids and producing user_scope) collapses
many principals into the same "user" token causing cross-principal cache
collisions; change the branching so that when the view is not an
empty-unfiltered set (i.e., else branch where user_scope currently becomes
"user") you include a stable principal identifier in the scope (for example use
f"user:{params.user_id}" or f"user:{principal.id}" depending on available
principal object) instead of the literal "user"; keep the existing "admin:all"
and "user:none" tokens for the empty/unfiltered cases, and ensure params.user_id
is only used when present but non-empty to construct the per-principal scope
(update the user_scope assignment in this function to produce distinct values
like "user:{id}" or "admin:{id}" as appropriate).
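
A standalone sketch of the per-principal variant described in the prompt above. It assumes the requester's own ID is available to the key builder; the parameter `requester_id` and the function name `build_cache_key` are hypothetical. It also assumes every admin's unfiltered view is identical, which is why `admin:all` is still shared.

```python
import hashlib
from datetime import date
from typing import Optional
from uuid import UUID


def build_cache_key(
    org_id: str,
    sub_view: str,
    requester_id: UUID,
    is_admin: bool,
    filter_user_id: Optional[UUID],
    from_date: Optional[date],
    to_date: Optional[date],
) -> str:
    """Build a usage cache key that never collides across principals."""
    if is_admin and filter_user_id is not None:
        # Admin looking at one user's data.
        user_scope = f"admin:user:{filter_user_id}"
    elif is_admin:
        # Unfiltered admin view (assumed identical for every admin in the org).
        user_scope = "admin:all"
    else:
        # Non-admin: the filter is ignored and the key is bound to the requester.
        user_scope = f"user:{requester_id}"

    date_str = f"{from_date}:{to_date}"
    date_hash = hashlib.sha256(date_str.encode()).hexdigest()[:12]
    return f"usage:{org_id}:{sub_view}:{user_scope}:{date_hash}"


alice, bob = UUID(int=1), UUID(int=2)
jan = (date(2026, 1, 1), date(2026, 1, 31))
key_alice = build_cache_key("org", "flows", alice, False, None, *jan)
key_bob = build_cache_key("org", "flows", bob, False, None, *jan)
# A non-admin who passes someone else's ID still gets a key scoped to themselves.
key_bob_spoof = build_cache_key("org", "flows", bob, False, alice, *jan)
key_admin_on_alice = build_cache_key("org", "flows", bob, True, alice, *jan)
```

Keying on the requester rather than on a hash of the allowed flow IDs makes the scope easy to reason about, though two users with identical flow sets no longer share an entry.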

Comment on lines +22 to +34
optional_stubs = [
    "langflow.api.utils",
    "langflow.api.utils.core",
    "fastapi_pagination",
]
for mod in optional_stubs:
    if mod not in sys.modules:
        sys.modules[mod] = MagicMock()

# Also stub the lfx deps to avoid lfx.services.deps import errors
if "lfx.services.deps" not in sys.modules:
    sys.modules["lfx.services.deps"] = MagicMock()


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

file="langbuilder/src/backend/base/tests/api/test_usage_router_skeleton.py"

echo "sys.modules writes:"
rg -n 'sys\.modules\[[^]]+\]\s*=' "$file"

echo
echo "Cleanup operations (should exist if writes are isolated):"
rg -n 'del\s+sys\.modules\[|sys\.modules\.pop\(' "$file" || true

echo
echo "MagicMock stubs involved:"
rg -n 'MagicMock\(' "$file"

Repository: CloudGeometry/langbuilder

Length of output: 382


🏁 Script executed:

cat -n langbuilder/src/backend/base/tests/api/test_usage_router_skeleton.py

Repository: CloudGeometry/langbuilder

Length of output: 3941


🏁 Script executed:

fd conftest.py langbuilder/src/backend/base/tests/

Repository: CloudGeometry/langbuilder

Length of output: 51


🏁 Script executed:

fd conftest.py langbuilder/src/backend/

Repository: CloudGeometry/langbuilder

Length of output: 216


🏁 Script executed:

find langbuilder/src/backend -name conftest.py -type f

Repository: CloudGeometry/langbuilder

Length of output: 216


🏁 Script executed:

cat langbuilder/src/backend/tests/conftest.py

Repository: CloudGeometry/langbuilder

Length of output: 30660


🏁 Script executed:

# Check if there's a conftest between base/tests/ and the test file location
find langbuilder/src/backend/base -name conftest.py -type f

Repository: CloudGeometry/langbuilder

Length of output: 51


🏁 Script executed:

# Check if other test files in the same directory use monkeypatch or fixture cleanup
rg -l "monkeypatch|@pytest.fixture" langbuilder/src/backend/base/tests/api/ 2>/dev/null || echo "No files found with monkeypatch or fixture"

Repository: CloudGeometry/langbuilder

Length of output: 111


🏁 Script executed:

# List all test files in the directory to see testing patterns
ls -la langbuilder/src/backend/base/tests/api/

Repository: CloudGeometry/langbuilder

Length of output: 730


🏁 Script executed:

# Check if other test files in the same directory have similar sys.modules patterns
rg "sys\.modules" langbuilder/src/backend/base/tests/api/ -A 3 -B 1

Repository: CloudGeometry/langbuilder

Length of output: 4762


🏁 Script executed:

# Check how sys.modules cleanup is handled in other backend test files (if at all)
rg "sys\.modules\.pop\|del sys\.modules" langbuilder/src/backend/ -A 2 -B 2 | head -50

Repository: CloudGeometry/langbuilder

Length of output: 51


Replace global sys.modules stubbing with monkeypatch fixture to ensure proper test isolation.

The direct sys.modules[...] = MagicMock() writes (lines 29, 33) are process-global and never restored, polluting the module cache across test runs. This pattern appears in multiple files in this directory and violates test isolation principles. Use pytest's monkeypatch fixture instead, which automatically restores state after each test.

Suggested fix:

def test_router_module_importable(monkeypatch):
    """The usage router module can be loaded without errors."""
    monkeypatch.setitem(sys.modules, "langflow.api.utils", MagicMock())
    monkeypatch.setitem(sys.modules, "langflow.api.utils.core", MagicMock())
    monkeypatch.setitem(sys.modules, "fastapi_pagination", MagicMock())
    monkeypatch.setitem(sys.modules, "lfx.services.deps", MagicMock())
    
    mod = _load_router_module()
    assert mod is not None

Alternatively, refactor _load_router_module() to accept an optional monkeypatch parameter and apply stubs through it.
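
Where a helper runs outside a pytest test function, the standard library's `unittest.mock.patch.dict` gives the same restore-on-exit behaviour as `monkeypatch.setitem`. The sketch below swaps in that technique; the module names in `STUB_NAMES` are made up so that running it cannot shadow a real package.

```python
import sys
from unittest.mock import MagicMock, patch

# Hypothetical module names, chosen so they do not exist on any real system.
STUB_NAMES = ["hypothetical_pkg_xyz", "hypothetical_pkg_xyz.core"]


def stubs_visible_inside_context() -> bool:
    """Install stubs for the duration of the with-block only."""
    stubs = {name: MagicMock() for name in STUB_NAMES}
    with patch.dict(sys.modules, stubs):
        # Inside the block, imports of these names would resolve to the stubs.
        return all(sys.modules.get(name) is stubs[name] for name in STUB_NAMES)


inside_ok = stubs_visible_inside_context()
# After the block exits, patch.dict has restored sys.modules to its prior state.
leaked = [name for name in STUB_NAMES if name in sys.modules]
```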

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@langbuilder/src/backend/base/tests/api/test_usage_router_skeleton.py` around
lines 22 - 34, The tests currently assign MagicMock objects directly into
sys.modules using the optional_stubs list (e.g., "langflow.api.utils",
"langflow.api.utils.core", "fastapi_pagination") and also "lfx.services.deps",
which mutates global state; change the test to use pytest's monkeypatch fixture
(accept monkeypatch in test_router_module_importable) and call
monkeypatch.setitem(sys.modules, <module>, MagicMock()) for each module instead
of writing to sys.modules directly, then call _load_router_module() and assert
the result; alternatively, make _load_router_module accept a monkeypatch
parameter and apply the same monkeypatch.setitem calls there so test isolation
is preserved.

Comment on lines +89 to +95
def _make_db_with_flows(flow_ids: list[UUID]) -> AsyncMock:
    """Create a mock DB that returns a list of flow IDs."""
    mock_db = AsyncMock()
    mock_result = MagicMock()
    mock_result.fetchall.return_value = [(fid,) for fid in flow_ids]
    mock_db.execute.return_value = mock_result
    return mock_db

⚠️ Potential issue | 🔴 Critical

Fix the _make_db_with_flows() call signature.

The helper only accepts flow_ids, but Line 134 and Line 550 pass user_id=.... Those tests fail with TypeError before they ever reach the router logic.

🛠️ Minimal fix
-    db = _make_db_with_flows([], user_id=own_user_id)
+    db = _make_db_with_flows([])
...
-    db = _make_db_with_flows([flow_a_id], user_id=user_a_id)
+    db = _make_db_with_flows([flow_a_id])

Also applies to: 132-135, 549-550

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@langbuilder/src/backend/base/tests/api/test_usage_security.py` around lines
89 - 95, The helper _make_db_with_flows currently only accepts flow_ids but
tests call it with user_id=..., causing a TypeError; change the signature of
_make_db_with_flows to accept an optional user_id parameter (or **kwargs) so
calls with user_id succeed, keep existing behavior of returning a mock DB whose
execute().fetchall() yields [(fid,) for fid in flow_ids], and ensure the
function still returns AsyncMock as before—update any internal references to the
new parameter if needed but do not change the mocked result structure.
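
A sketch of the alternative described in the prompt above, where the helper tolerates `user_id` instead of the call sites dropping it. The helper is renamed `make_db_with_flows` here to mark it as illustrative, and the `_fetch_rows` driver and the SQL string are placeholders that are not taken from the test file.

```python
import asyncio
from typing import Optional
from unittest.mock import AsyncMock, MagicMock
from uuid import UUID


def make_db_with_flows(flow_ids: list[UUID], user_id: Optional[UUID] = None) -> AsyncMock:
    """Create a mock DB whose execute() result yields one row per flow ID.

    user_id is accepted only so existing call sites keep working; the mocked
    result does not depend on it.
    """
    mock_db = AsyncMock()
    mock_result = MagicMock()
    mock_result.fetchall.return_value = [(fid,) for fid in flow_ids]
    mock_db.execute.return_value = mock_result
    return mock_db


async def _fetch_rows(db: AsyncMock) -> list:
    # Placeholder query text; the mock ignores it.
    result = await db.execute("SELECT id FROM flow")
    return result.fetchall()


flow_a = UUID(int=1)
rows_with_user = asyncio.run(_fetch_rows(make_db_with_flows([flow_a], user_id=UUID(int=7))))
rows_without_user = asyncio.run(_fetch_rows(make_db_with_flows([])))
```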

Comment on lines +671 to +682
response = await service_no_redis.fetch_flow_runs(
    flow_id=FLOW_UUID_C,
    flow_name="Other Bot",
    query=FlowRunsQueryParams(
        from_date=date(2026, 1, 1),
        to_date=date(2026, 1, 31),
        limit=10,
    ),
    api_key=API_KEY,
    requesting_user_id=requesting_user,
    is_admin=False,
)

⚠️ Potential issue | 🔴 Critical

This test calls fetch_flow_runs() with a stale signature.

requesting_user_id and is_admin are not accepted by the current service API, so this raises TypeError before any assertion. The _db_session.exec setup is also unused in this code path now that ownership is enforced in the router.
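
To illustrate why the call fails before any assertion runs, here is a small sketch using `inspect.signature`. The stand-in `fetch_flow_runs` below only mirrors the four parameters this review says the service accepts; it is not the real method, and `call_matches_signature` is a hypothetical helper.

```python
import inspect


async def fetch_flow_runs(flow_id, flow_name, query, api_key):
    """Stand-in with the four parameters the review says the service accepts."""
    return []


def call_matches_signature(func, **kwargs) -> bool:
    """Return True if func could be called with exactly these keyword arguments."""
    try:
        inspect.signature(func).bind(**kwargs)
    except TypeError:
        # Same error class Python raises at call time for unexpected keywords.
        return False
    return True


current_call = dict(flow_id="f", flow_name="Other Bot", query=None, api_key="k")
stale_call = dict(current_call, requesting_user_id="u", is_admin=False)

current_ok = call_matches_signature(fetch_flow_runs, **current_call)
stale_ok = call_matches_signature(fetch_flow_runs, **stale_call)
```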

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@langbuilder/src/backend/base/tests/services/test_langwatch_service_integration.py`
around lines 671 - 682, The test calls fetch_flow_runs with outdated parameters
requesting_user_id and is_admin causing a TypeError; update the call to match
the current service API by removing those two arguments and pass only flow_id,
flow_name, query, api_key as required by fetch_flow_runs, and remove or stop
relying on the now-unused _db_session.exec setup in this test (or adjust
assertions to use the router-enforced ownership behavior instead of mocking
_db_session.exec); reference the fetch_flow_runs invocation in the test and any
_db_session.exec setup near it for the changes.

Comment on lines +56 to +68
it("shows error state when error occurs", () => {
  mockUseGetUsageSummary.mockReturnValue({
    data: undefined,
    isLoading: false,
    isError: true,
    error: new Error("API Error"),
  });

  render(<UsagePage />);

  expect(screen.getByTestId("usage-error-state")).toBeInTheDocument();
  expect(screen.getByText("Failed to load usage data")).toBeInTheDocument();
});

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check how UsagePage handles and displays errors
ast-grep --pattern 'Failed to load usage data'

Repository: CloudGeometry/langbuilder

Length of output: 51


🏁 Script executed:

#!/bin/bash
# Search for the error message using rg
rg "Failed to load usage data" langbuilder/

# Also search for UsagePage component file
find langbuilder/src/frontend/src/pages -name "UsagePage*" -type f | grep -v __tests__

Repository: CloudGeometry/langbuilder

Length of output: 415


🏁 Script executed:

#!/bin/bash
# Read the UsagePage component to see error handling
cat langbuilder/src/frontend/src/pages/UsagePage/UsagePage.tsx

Repository: CloudGeometry/langbuilder

Length of output: 4310


🏁 Script executed:

#!/bin/bash
# Get the full test file to understand context better
cat langbuilder/src/frontend/src/pages/UsagePage/__tests__/UsagePage.test.tsx

Repository: CloudGeometry/langbuilder

Length of output: 4002


🏁 Script executed:

#!/bin/bash
# Find and read the ErrorState component
find langbuilder/src/frontend/src/pages/UsagePage -name "ErrorState*" -type f

Repository: CloudGeometry/langbuilder

Length of output: 225


🏁 Script executed:

#!/bin/bash
# Also check the components directory structure
ls -la langbuilder/src/frontend/src/pages/UsagePage/components/

Repository: CloudGeometry/langbuilder

Length of output: 999


🏁 Script executed:

#!/bin/bash
# Read the ErrorState component
cat langbuilder/src/frontend/src/pages/UsagePage/components/ErrorState.tsx

Repository: CloudGeometry/langbuilder

Length of output: 2099


Fix test assertions to match actual ErrorState component behavior.

The test has multiple assertion failures:

  1. Data-testid mismatch: Test expects usage-error-state, but ErrorState renders error-state
  2. Error message mismatch: Test expects "Failed to load usage data", but ErrorState renders "An unexpected error occurred." (for unknown error codes) or code-specific messages from the ERROR_MESSAGES map

The ErrorState component handles errors by extracting the error code and looking it up in a predefined ERROR_MESSAGES map. For arbitrary errors like new Error("API Error") with no code, it defaults to "An unexpected error occurred." The hardcoded message in the test does not reflect this actual behavior.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@langbuilder/src/frontend/src/pages/UsagePage/__tests__/UsagePage.test.tsx`
around lines 56 - 68, The test assertions are out of sync with ErrorState:
change the data-testid assertion from "usage-error-state" to "error-state" and
update the expected message to match ErrorState behavior—either assert the
default text "An unexpected error occurred." for a plain Error, or provide an
error object with a matching code (use the same code keys as ERROR_MESSAGES) in
mockUseGetUsageSummary's error field so you can assert the specific
ERROR_MESSAGES[code] string; update the mockReturnValue(error: ...) and the
expect(screen.getByText(...)) accordingly in the UsagePage.test.tsx for
UsagePage and mockUseGetUsageSummary.

test222234 and others added 2 commits March 19, 2026 12:23
Cost Trend Chart:
- Add DailyCost schema + date bucketing in _aggregate_with_metadata()
- Install recharts, create CostTrendChart (AreaChart) component
- Wire chart into UsagePage between summary cards and flow breakdown
- Gap-fill daily_costs with zeros, cap at 366 days

Demo Mode (USAGE_DEMO_MODE=true):
- Short-circuit usage endpoints at router level with mock data
- Generate realistic multi-day costs using real flow names from DB
- Deterministic seeded random (weekday/weekend patterns, power law)
- Supports both summary and drill-down endpoints
- No LangWatch dependency when demo mode is on

Alembic Migration Fix:
- Fix global_settings table SQLite type mismatch (sa.UUID -> sa.Uuid)
- Prevents backend crash on restart without deleting database

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
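
The gap-filling and seeded demo data described in the commit message above could be sketched roughly as follows. This is not the code from the commit: the function names, the 80% activity rate, the Pareto shape and the weekend factor are all assumptions, and the cap is applied here by keeping the first 366 days, which the commit message does not specify.

```python
import random
from datetime import date, timedelta

MAX_DAYS = 366  # cap stated in the commit message


def demo_daily_costs(seed: int, from_date: date, to_date: date) -> dict[date, float]:
    """Deterministic mock costs: heavy-tailed, lower on weekends, some idle days."""
    rng = random.Random(seed)  # private generator, so the global RNG is untouched
    costs: dict[date, float] = {}
    day = from_date
    while day <= to_date:
        draw = rng.paretovariate(2.5)  # power-law style draw, always >= 1.0
        active = rng.random() < 0.8    # assumed share of days with any usage
        if active:
            weekday_factor = 1.0 if day.weekday() < 5 else 0.3
            costs[day] = round(0.01 * draw * weekday_factor, 4)
        day += timedelta(days=1)
    return costs


def gap_fill_daily_costs(
    costs_by_day: dict[date, float],
    from_date: date,
    to_date: date,
    max_days: int = MAX_DAYS,
) -> list[tuple[date, float]]:
    """Return one (day, cost) pair per calendar day, using 0.0 for missing days."""
    if to_date < from_date:
        return []
    span = min((to_date - from_date).days + 1, max_days)
    days = (from_date + timedelta(days=offset) for offset in range(span))
    return [(day, costs_by_day.get(day, 0.0)) for day in days]


start, end = date(2026, 1, 1), date(2026, 1, 14)
raw = demo_daily_costs(42, start, end)
series = gap_fill_daily_costs(raw, start, end)
capped = gap_fill_daily_costs({}, date(2024, 1, 1), date(2026, 12, 31))
```

Using a dedicated `random.Random(seed)` instance keeps repeated requests with the same seed and date range identical, which is what makes a demo dashboard stable across reloads.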
Platform-generated eval output should not be committed to the source
repo. Causes eval deadlock when the platform clones the repo for a
new project — the orchestrator finds old progress.json and asks
whether to resume or start fresh, blocking automated eval.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>