Skip to content

Conversation

@Yavanosta
Copy link

@Yavanosta Yavanosta commented Oct 24, 2025

This CL improves perfrormance of UriIdentityService:

  1. Migrates to string keys and native Map for cache to reduce amount of URI objects creation/conversion to string.
  2. Improves cache cleanup algorithm by using quickSelect to pick median value instead of sorting the entire cache.

Changes measured on my local machine (MacBook Air M4):

  1. Case 1: inserting 1000 URIs on almost full cache (has 64000 items): 153ms -> 18ms
  2. Case 2: reading 50000 URIs from cache on almost full cache (has 65000 items): 297ms -> 81ms
  3. Case 3: cache trim 297ms -> 17ms

Fixes #273108

Copilot AI review requested due to automatic review settings October 24, 2025 13:14
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR improves the performance of the UriIdentityService by optimizing its internal caching mechanism. The changes reduce object creation overhead and improve cache cleanup efficiency.

Key changes:

  • Migrated from SkipList to native Map with string keys to reduce URI object creation
  • Refactored path casing logic into a separate PathCasingCache class
  • Replaced full-sort cache cleanup with quickSelect algorithm to find median value

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
src/vs/platform/uriIdentity/common/uriIdentityService.ts Refactored UriIdentityService to use Map with string keys, extracted PathCasingCache class, and optimized cache cleanup with quickSelect
src/vs/platform/uriIdentity/test/common/uriIdentityService.test.ts Added test to verify cache eviction respects access time when cache overflows

Comment on lines 142 to 157
Entry._clock = 0;
const times = [...this._canonicalUris.values()].map(e => e.time);
const median = quickSelect(
Math.floor(times.length / 2),
times,
(a, b) => a - b);
for (const [key, entry] of this._canonicalUris.entries()) {
if (entry.time < median) {
this._canonicalUris.delete(key);
} else {
return 0;
entry.time = 0;
}
});

Entry._clock = 0;
this._canonicalUris.clear();
const newSize = this._limit * 0.5;
for (let i = 0; i < newSize; i++) {
this._canonicalUris.set(entries[i][0], entries[i][1].touch());
}
Copy link

Copilot AI Oct 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The entry.time values are being reset to 0 after quickSelect has already used them to calculate the median, but entries with time < median are deleted while keeping others. This creates an inconsistency where the median calculation is based on old times, but the comparison uses those same old times. Additionally, after deletion, the remaining entries all have time = 0, which means the next cache cleanup will not properly distinguish access patterns. Consider resetting Entry._clock and entry.time values only after the cleanup is complete, or adjust the logic to maintain proper time tracking.

Copilot uses AI. Check for mistakes.
Copy link
Author

@Yavanosta Yavanosta Oct 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if I can correctly understand this AI generated commit. The logic behind the change if the following:

  1. We calculate median based on current items
  2. Then we remove items with time <= median and reset it to 0 for those which left.
  3. This logic utilizes the fact that we clean up exactly the half of the cache.
  4. All remain items will have time 0.
  5. On next cleanup all new items will have time >= 1. Items left from previous cleanup will have time === 0 if they have not been touched and >= 1 if they have been touched. If they have not been touched we do not care in their order, since all of them will be trimmed.
  6. So the algorithm guarantees that amount of items with time === 0 is always strictly less then a half, and this is enough for correct trim.

@Yavanosta Yavanosta force-pushed the UriIdentityService-perf branch from 528a459 to 01f4c01 Compare October 24, 2025 13:49
@bpasero bpasero assigned jrieken and unassigned dbaeumer Oct 24, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Performance: UriIdentityService uses non-optimal caching strategy

4 participants