Add retry logic and adaptive rate limiting to stock data refresh #82

Open
eullrich wants to merge 2 commits into main from claude/fix-stock-cache-sync-cKKPv

eullrich (Owner) commented Feb 8, 2026

Summary

Enhance the stock data refresh scheduler with intelligent retry logic and adaptive rate limiting to improve reliability when fetching data from yfinance. The refresh process now distinguishes between fresh, stale, and failed data, retrying stale/failed tickers across multiple passes while dynamically adjusting delays based on failure rates.

Key Changes

Core Data Fetching (src/tradfi/core/data.py)

  • Added FetchOutcome enum to classify fetch results: FRESH (new data), STALE (cached fallback), FAILED (no data)
  • Added FetchResult NamedTuple to return detailed outcome information alongside stock data
  • Introduced fetch_stock_from_api_with_result() function that explicitly tracks outcome status
  • Extracted stock building logic into _build_stock() helper to reduce duplication
  • Added _apply_rate_limit() helper for consistent rate limiting across functions
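
To make the three outcomes concrete, here is a minimal sketch of what the two types could look like. The member values, the field names, the `dict` stand-in for the project's Stock object, and the `classify()` helper are all assumptions for illustration; the actual definitions live in src/tradfi/core/data.py and may differ.

```python
from enum import Enum
from typing import NamedTuple, Optional


class FetchOutcome(Enum):
    FRESH = "fresh"    # new data came back from the API
    STALE = "stale"    # API failed, cached data used as a fallback
    FAILED = "failed"  # no data at all


class FetchResult(NamedTuple):
    stock: Optional[dict]  # stand-in for the project's Stock object (assumption)
    outcome: FetchOutcome


def classify(api_data: Optional[dict], cached: Optional[dict]) -> FetchResult:
    """Hypothetical helper: prefer API data, fall back to cache, else fail."""
    if api_data is not None:
        return FetchResult(api_data, FetchOutcome.FRESH)
    if cached is not None:
        return FetchResult(cached, FetchOutcome.STALE)
    return FetchResult(None, FetchOutcome.FAILED)
```

Returning the outcome next to the data lets the caller tell a cached fallback apart from a real success, which a bare "stock or None" return value cannot express.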

Refresh Scheduler (src/tradfi/api/scheduler.py)

  • Implemented multi-pass retry logic: initial sweep followed by up to MAX_RETRY_PASSES (3) retry passes
  • Added adaptive delay adjustment: increases delay by DELAY_INCREASE_FACTOR (1.5x) when failure rate exceeds FAILURE_RATE_THRESHOLD (15%)
  • Delay is capped at MAX_DELAY (15 seconds), and an INTER_RETRY_PAUSE (30 seconds) is inserted between retry passes
  • Enhanced progress tracking with per-pass statistics (fresh/stale/failed counts)
  • Updated refresh state to include detailed outcome breakdown instead of just fetched/failed counts
  • Improved logging with per-pass progress and failure diagnostics

Cache Utilities (src/tradfi/utils/cache.py)

  • Added get_batch_cache_freshness() function to efficiently check freshness status of multiple tickers in a single database query
  • Returns "fresh", "stale", or "missing" status for each ticker
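
As an illustration of the single-query approach, the sketch below uses the standard-library sqlite3 module. The table name `stock_cache`, the `cached_at` column (Unix seconds), the `ttl_seconds` parameter, and the function name are assumptions; the real schema and signature in src/tradfi/utils/cache.py may differ.

```python
import sqlite3
import time
from typing import Dict, List, Optional


def batch_freshness_sketch(
    conn: sqlite3.Connection,
    tickers: List[str],
    ttl_seconds: float,
    now: Optional[float] = None,
) -> Dict[str, str]:
    """Classify each ticker as fresh, stale, or missing with one SELECT."""
    if not tickers:
        return {}
    now = time.time() if now is None else now
    # One placeholder per ticker keeps the query parameterized.
    placeholders = ",".join("?" for _ in tickers)
    rows = conn.execute(
        f"SELECT ticker, cached_at FROM stock_cache WHERE ticker IN ({placeholders})",
        list(tickers),
    ).fetchall()
    cached_at = dict(rows)
    status: Dict[str, str] = {}
    for ticker in tickers:
        if ticker not in cached_at:
            status[ticker] = "missing"
        elif now - cached_at[ticker] <= ttl_seconds:
            status[ticker] = "fresh"
        else:
            status[ticker] = "stale"
    return status
```

For very large universes, note that SQLite limits the number of bound parameters per statement, so a version like this might need to query in chunks.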

API Endpoints (src/tradfi/api/routers/refresh.py)

  • Updated UniverseStatsSchema to include fresh and stale fields alongside existing cached field
  • Modified get_universe_stats() endpoint to use new batch freshness checking
  • Updated endpoint documentation to reflect fresh vs stale distinction

Tests (tests/test_cache_refresh.py)

  • Added test helpers for creating mock FetchResult objects with different outcomes
  • Added TestBatchCacheFreshness test class covering batch freshness checking
  • Updated existing refresh tests to use new fetch_stock_from_api_with_result() function
  • Added tests for retry recovery and max retry limits
  • Tests verify adaptive delay behavior and multi-pass retry logic

Implementation Details

Retry Strategy: Tickers are only retried if they returned STALE or FAILED status. Successfully fetched (FRESH) tickers are not retried, reducing unnecessary API calls.
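
The retry strategy can be sketched as a loop over a shrinking list of pending tickers. This is a simplified, synchronous illustration: `refresh_with_retries`, the string outcomes, and `flaky_fetch` are hypothetical names, and the real scheduler also applies delays and pauses that are omitted here.

```python
from typing import Callable, Dict, List

MAX_RETRY_PASSES = 3  # value quoted in the PR description


def refresh_with_retries(
    tickers: List[str], fetch: Callable[[str], str]
) -> Dict[str, str]:
    """Run an initial sweep, then retry only tickers that were not fresh."""
    outcomes: Dict[str, str] = {}
    pending = list(tickers)
    for _ in range(1 + MAX_RETRY_PASSES):  # initial sweep + retry passes
        if not pending:
            break
        for ticker in pending:
            outcomes[ticker] = fetch(ticker)
        # Only stale or failed tickers carry over to the next pass.
        pending = [t for t in pending if outcomes[t] != "fresh"]
    return outcomes


# Usage: a fake fetcher where one ticker recovers and one never does.
attempts: Dict[str, int] = {}


def flaky_fetch(ticker: str) -> str:
    attempts[ticker] = attempts.get(ticker, 0) + 1
    if ticker == "BAD":
        return "failed"
    if ticker == "SLOW" and attempts[ticker] < 2:
        return "stale"
    return "fresh"


result = refresh_with_retries(["AAPL", "SLOW", "BAD"], flaky_fetch)
```

In this run `AAPL` is fetched once, `SLOW` recovers on the first retry pass, and `BAD` is attempted four times (one sweep plus three retries) before being left as failed.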

Adaptive Backoff: When more than 15% of the tickers in a pass come back stale or failed, the delay is multiplied by 1.5 (capped at 15s) to back off from potential yfinance rate limiting.
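
A small sketch of the delay adjustment, using the constants named in this description. The function name `next_delay` and its signature are assumptions for illustration, not the scheduler's actual code.

```python
FAILURE_RATE_THRESHOLD = 0.15  # 15%
DELAY_INCREASE_FACTOR = 1.5
MAX_DELAY = 15.0  # seconds


def next_delay(current_delay: float, not_fresh: int, total: int) -> float:
    """Return the delay for the next pass given this pass's results.

    not_fresh counts tickers that came back stale or failed.
    """
    if total == 0:
        return current_delay
    if not_fresh / total > FAILURE_RATE_THRESHOLD:
        # Back off, but never beyond the cap.
        return min(current_delay * DELAY_INCREASE_FACTOR, MAX_DELAY)
    return current_delay
```

For example, starting from a 2-second delay, a pass with 20 of 100 tickers not fresh raises the delay to 3 seconds, while a 12-second delay would be capped at 15 rather than growing to 18.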

Backwards Compatibility: The refresh stats response includes a legacy fetched field (fresh + stale count) for compatibility with existing clients.
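
For clients reading the stats, the relationship between the new and legacy fields could look like the following. The function name and exact key set are assumptions; only the `fetched = fresh + stale` relationship comes from this description.

```python
from typing import Dict


def build_refresh_stats(fresh: int, stale: int, failed: int) -> Dict[str, int]:
    """Hypothetical stats payload with the legacy field kept for old clients."""
    return {
        "fresh": fresh,
        "stale": stale,
        "failed": failed,
        "fetched": fresh + stale,  # legacy field: any ticker that returned data
    }
```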

In-Memory Config: Rate limit delays are modified in-memory only during refresh, avoiding disk I/O and ensuring the original delay is restored afterward.
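
One way to guarantee the original delay is restored, even if a refresh raises, is a context manager. This is a sketch of the idea only: `RateLimitConfig` and `temporary_delay` are hypothetical names, and the PR may implement the restore differently (for example with an explicit try/finally in the scheduler).

```python
from contextlib import contextmanager
from typing import Iterator


class RateLimitConfig:
    """Hypothetical in-memory holder for the current rate limit delay."""

    def __init__(self, delay: float) -> None:
        self.delay = delay


@contextmanager
def temporary_delay(config: RateLimitConfig, delay: float) -> Iterator[RateLimitConfig]:
    """Override the delay in memory and restore it on exit, even on error."""
    original = config.delay
    config.delay = delay
    try:
        yield config
    finally:
        config.delay = original
```

Inside a `with temporary_delay(config, 5.0):` block the refresh sees the adjusted delay; nothing is written to disk, and the previous value is back in place once the block exits.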

https://claude.ai/code/session_01TQSW9T8pmrieTbckXNpXRU

The daily auto-refresh was failing to capture all stocks in nasdaq100 and
russell2000 because: (1) failed fetches were never retried, (2) stale cache
fallbacks masqueraded as successes, (3) time.sleep() blocked the async
scheduler, and (4) there was no adaptive delay when yfinance rate-limited.

Changes:
- Add FetchResult/FetchOutcome types to distinguish fresh/stale/failed fetches
- Extract _build_stock() helper to avoid duplicating Stock construction
- Rewrite refresh_universe() with up to 3 retry passes for stale/failed tickers
- Add adaptive delay that increases when failure rate exceeds 15%
- Use asyncio.to_thread() to avoid blocking the event loop during fetches
- Set rate limit delay in-memory only (no unnecessary disk I/O)
- Add get_batch_cache_freshness() for efficient per-universe freshness reporting
- Expose fresh/stale/missing counts in /refresh/universes endpoint
- Replace print() with proper logging in data.py
- Update and expand tests (18 passing) including retry and max-pass tests
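
A minimal sketch of the asyncio.to_thread() change listed above, assuming Python 3.9 or later. `blocking_fetch` is a hypothetical stand-in for the synchronous yfinance fetch, including its rate-limit sleep; the real call sites are in the scheduler.

```python
import asyncio
import time
from typing import List


def blocking_fetch(ticker: str) -> str:
    """Stand-in for a synchronous fetch that sleeps for rate limiting."""
    time.sleep(0.01)
    return f"{ticker}:fresh"


async def refresh(tickers: List[str]) -> List[str]:
    results = []
    for ticker in tickers:
        # The blocking call runs in a worker thread, so the event loop
        # stays free to serve other requests while it sleeps.
        results.append(await asyncio.to_thread(blocking_fetch, ticker))
    return results
```

Calling `asyncio.run(refresh(["AAPL", "MSFT"]))` still fetches the tickers one at a time, so rate limiting is preserved; only the event loop is no longer blocked.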

https://claude.ai/code/session_01TQSW9T8pmrieTbckXNpXRU
…e-sync-cKKPv

# Conflicts:
#	src/tradfi/api/routers/refresh.py