Add retry logic and adaptive rate limiting to stock data refresh #82
Open
The daily auto-refresh was failing to capture all stocks in nasdaq100 and russell2000 because: (1) failed fetches were never retried, (2) stale cache fallbacks masqueraded as successes, (3) blocking time.sleep() in the async scheduler, (4) no adaptive delay when yfinance rate-limits.

Changes:
- Add FetchResult/FetchOutcome types to distinguish fresh/stale/failed fetches
- Extract _build_stock() helper to avoid duplicating Stock construction
- Rewrite refresh_universe() with up to 3 retry passes for stale/failed tickers
- Add adaptive delay that increases when failure rate exceeds 15%
- Use asyncio.to_thread() to avoid blocking the event loop during fetches
- Set rate limit delay in-memory only (no unnecessary disk I/O)
- Add get_batch_cache_freshness() for efficient per-universe freshness reporting
- Expose fresh/stale/missing counts in /refresh/universes endpoint
- Replace print() with proper logging in data.py
- Update and expand tests (18 passing) including retry and max-pass tests

https://claude.ai/code/session_01TQSW9T8pmrieTbckXNpXRU
…e-sync-cKKPv # Conflicts: # src/tradfi/api/routers/refresh.py
Summary
Enhance the stock data refresh scheduler with intelligent retry logic and adaptive rate limiting to improve reliability when fetching data from yfinance. The refresh process now distinguishes between fresh, stale, and failed data, retrying stale/failed tickers across multiple passes while dynamically adjusting delays based on failure rates.
Key Changes
Core Data Fetching (`src/tradfi/core/data.py`)
- `FetchOutcome` enum to classify fetch results: `FRESH` (new data), `STALE` (cached fallback), `FAILED` (no data)
- `FetchResult` NamedTuple to return detailed outcome information alongside stock data
- `fetch_stock_from_api_with_result()` function that explicitly tracks outcome status
- `_build_stock()` helper to reduce duplication
- `_apply_rate_limit()` helper for consistent rate limiting across functions

Refresh Scheduler (`src/tradfi/api/scheduler.py`)
- Up to `MAX_RETRY_PASSES` (3) retry passes
- Delay increases by `DELAY_INCREASE_FACTOR` (1.5x) when the failure rate exceeds `FAILURE_RATE_THRESHOLD` (15%)
- Delay is capped at `MAX_DELAY` (15 seconds), with an `INTER_RETRY_PAUSE` (30 seconds) between retry passes

Cache Utilities (`src/tradfi/utils/cache.py`)
- `get_batch_cache_freshness()` function to efficiently check the freshness status of multiple tickers in a single database query

API Endpoints (`src/tradfi/api/routers/refresh.py`)
- `UniverseStatsSchema` now includes `fresh` and `stale` fields alongside the existing `cached` field
- `get_universe_stats()` endpoint now uses the new batch freshness checking

Tests (`tests/test_cache_refresh.py`)
- `FetchResult` objects with different outcomes
- `TestBatchCacheFreshness` test class covering batch freshness checking
- Coverage of the `fetch_stock_from_api_with_result()` function

Implementation Details
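Batch Freshness Check: For reviewers who want a picture of the single-query freshness check listed under Cache Utilities, here is a minimal sketch. It is not the code in this PR: the SQLite backend, the `stock_cache` table, its `ticker`/`cached_at` columns, the `max_age_seconds` parameter, and the function name `batch_freshness_sketch` are all assumptions made for illustration.

```python
import sqlite3
import time


def batch_freshness_sketch(conn, tickers, max_age_seconds, now=None):
    """Classify each ticker as 'fresh', 'stale', or 'missing' using one query.

    Hypothetical schema: stock_cache(ticker TEXT, cached_at REAL epoch seconds).
    """
    if not tickers:
        return {}
    now = time.time() if now is None else now

    # One IN (...) query instead of one lookup per ticker.
    placeholders = ",".join("?" for _ in tickers)
    rows = conn.execute(
        f"SELECT ticker, cached_at FROM stock_cache WHERE ticker IN ({placeholders})",
        list(tickers),
    ).fetchall()
    cached_at = dict(rows)

    status = {}
    for ticker in tickers:
        if ticker not in cached_at:
            status[ticker] = "missing"
        elif now - cached_at[ticker] <= max_age_seconds:
            status[ticker] = "fresh"
        else:
            status[ticker] = "stale"
    return status
```

SQLite limits the number of bound parameters per statement, so a large universe such as russell2000 may need the ticker list split into chunks.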
Retry Strategy: Tickers are only retried if they returned STALE or FAILED status. Successfully fetched (FRESH) tickers are not retried, reducing unnecessary API calls.
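The retry strategy can be sketched roughly as follows. This is an illustration under assumptions, not the actual `refresh_universe()`: the `FetchResult` field names, the `refresh_with_retries` helper, the injected `fetch` callable, and the reading of "3 retry passes" as one initial pass plus up to three retries are all hypothetical.

```python
import asyncio
from enum import Enum
from typing import Callable, NamedTuple, Optional


class FetchOutcome(Enum):
    FRESH = "fresh"    # new data from the API
    STALE = "stale"    # cached fallback
    FAILED = "failed"  # no data at all


class FetchResult(NamedTuple):
    # Field names are assumed for this sketch.
    stock: Optional[dict]
    outcome: FetchOutcome


MAX_RETRY_PASSES = 3
INTER_RETRY_PAUSE = 30.0  # seconds


async def refresh_with_retries(
    tickers: list[str],
    fetch: Callable[[str], FetchResult],
    pause: float = INTER_RETRY_PAUSE,
) -> dict[str, FetchOutcome]:
    """Fetch every ticker once, then retry only the STALE/FAILED ones."""
    outcomes: dict[str, FetchOutcome] = {}
    pending = list(tickers)
    for attempt in range(1 + MAX_RETRY_PASSES):
        if not pending:
            break
        if attempt > 0:
            # Non-blocking pause between retry passes.
            await asyncio.sleep(pause)
        for ticker in pending:
            # Run the blocking fetch in a worker thread so the event loop stays free.
            result = await asyncio.to_thread(fetch, ticker)
            outcomes[ticker] = result.outcome
        # FRESH tickers drop out; everything else is retried on the next pass.
        pending = [t for t in pending if outcomes[t] is not FetchOutcome.FRESH]
    return outcomes
```

Each ticker keeps the outcome of its most recent attempt, so a ticker that was STALE on the first pass and FRESH on a retry is reported as FRESH.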
Adaptive Backoff: When a pass has >15% failure/stale rate, the delay increases by 1.5x (capped at 15s) to back off from potential yfinance rate limiting.
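The backoff rule can be expressed as a small pure function. The constants mirror the values named in this PR; the function name `next_delay` and its signature are assumptions for illustration, and the real scheduler may compute the rate differently.

```python
FAILURE_RATE_THRESHOLD = 0.15  # 15% of a pass came back STALE or FAILED
DELAY_INCREASE_FACTOR = 1.5
MAX_DELAY = 15.0  # seconds


def next_delay(current_delay: float, not_fresh: int, attempted: int) -> float:
    """Return the per-request delay to use for the next pass."""
    if attempted == 0:
        return current_delay
    failure_rate = not_fresh / attempted
    if failure_rate > FAILURE_RATE_THRESHOLD:
        # Back off, but never beyond the cap.
        return min(current_delay * DELAY_INCREASE_FACTOR, MAX_DELAY)
    return current_delay
```

For example, with a hypothetical starting delay of 2 seconds, a pass in which 20 of 100 tickers are not fresh raises the delay to 3 seconds, while a pass at exactly 15% leaves it unchanged because the threshold must be exceeded.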
Backwards Compatibility: The refresh stats response includes a legacy `fetched` field (fresh + stale count) for compatibility with existing clients.
In-Memory Config: Rate limit delays are modified in-memory only during refresh, avoiding disk I/O and ensuring the original delay is restored afterward.
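One way to guarantee that the original delay is restored, even if a refresh raises, is a context manager. The sketch below is hypothetical: `RateLimitConfig` and `temporary_delay` are stand-ins invented for illustration, not names from this PR.

```python
from contextlib import contextmanager
from dataclasses import dataclass
from typing import Iterator


@dataclass
class RateLimitConfig:
    # Hypothetical in-memory stand-in for the project's rate limit settings.
    delay: float = 2.0


@contextmanager
def temporary_delay(config: RateLimitConfig, delay: float) -> Iterator[RateLimitConfig]:
    """Override the delay in memory only, restoring the original on exit."""
    original = config.delay
    config.delay = delay
    try:
        # The caller may adjust config.delay further during the refresh.
        yield config
    finally:
        # Runs on success and on error; nothing is written to disk.
        config.delay = original
```

Because the restore happens in `finally`, any adaptive increases applied during the refresh are discarded once the block exits.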
https://claude.ai/code/session_01TQSW9T8pmrieTbckXNpXRU