Problem
repo-intel query onboard and repo-intel query can-i-help time out (>30s) on very large repos like TypeScript (81K files) and Deno (28K files). The areas() function iterates all file_activity entries and groups by directory, which becomes expensive at scale.
Repos affected
- microsoft/TypeScript (~81K files)
- denoland/deno (~28K files)
Potential fixes
- Cache areas() result within a query session (it's called by both onboard and can-i-help)
- Pre-compute directory groupings during merge_delta() instead of on-the-fly
- Add a file-count threshold: for repos with more than 10K files, sample or limit to recent files only
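
The first two fixes could be combined roughly as sketched below. This is only an illustration, not the actual repo-intel code: the `AreaIndex` class, the `{file_path: change_count}` delta shape, and the sorted `(directory, total)` return value are all assumptions made for the example, and the real `merge_delta()` and `areas()` signatures may differ.

```python
from collections import defaultdict
from pathlib import PurePosixPath


class AreaIndex:
    """Hypothetical index that keeps per-directory totals up to date."""

    def __init__(self):
        # directory -> total change count for files directly inside it
        self.changes_by_dir = defaultdict(int)
        # cached areas() result; None means "needs recomputing"
        self._cached_areas = None

    def merge_delta(self, delta):
        """Fold an assumed {file_path: change_count} delta into the totals.

        Work is proportional to the size of the delta, not to the
        total number of files in the repository.
        """
        for path, count in delta.items():
            directory = str(PurePosixPath(path).parent)
            self.changes_by_dir[directory] += count
        # Any new data invalidates the cached result.
        self._cached_areas = None

    def areas(self):
        """Return (directory, total) pairs, busiest first.

        The sorted list is built once and reused until the next
        merge_delta() call, so onboard and can-i-help can share it.
        """
        if self._cached_areas is None:
            self._cached_areas = sorted(
                self.changes_by_dir.items(),
                key=lambda item: (-item[1], item[0]),
            )
        return self._cached_areas


index = AreaIndex()
index.merge_delta({
    "src/compiler/checker.ts": 3,
    "src/compiler/parser.ts": 2,
    "tests/cases/a.ts": 1,
})
print(index.areas())
```

With this shape, a query only sorts the directory totals (typically far fewer entries than files), and repeated calls within one session reuse the cached list.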
Context
Discovered during 100-repo validation. 97/100 repos pass; these 2 time out.