
perf: use bin search to find TID entry #370

Open
cheb0 wants to merge 1 commit into main from 0-bin-search-for-tid-find

Conversation

@cheb0
Member

@cheb0 cheb0 commented Mar 3, 2026

Description

For some aggregations, half of the time is spent just finding the TID entry. Example:

_exists_:remote_addr | group by remote_addr

Total number of buckets found: 492756
Timings (before => after): 5481 ms => 2151 ms
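For context, a minimal self-contained sketch of the approach, replacing a linear scan with `sort.Search` over entries sorted by `StartTID`. Note that `Entry`, `findEntry`, and the field names here are hypothetical stand-ins, not the actual types in this repository:

```go
package main

import (
	"fmt"
	"sort"
)

// Entry is a hypothetical stand-in for the real TID entry type:
// each entry covers the TID range [StartTID, LastTID].
type Entry struct {
	StartTID, LastTID uint64
}

// findEntry locates the entry containing tid via binary search.
// Entries must be sorted by StartTID with non-overlapping ranges.
func findEntry(entries []Entry, tid uint64) (Entry, bool) {
	if len(entries) == 0 {
		return Entry{}, false
	}
	// Smallest index whose StartTID is strictly past tid.
	i := sort.Search(len(entries), func(j int) bool {
		return entries[j].StartTID > tid
	})
	if i == 0 {
		return Entry{}, false // tid precedes the first entry
	}
	e := entries[i-1]
	if tid > e.LastTID {
		return Entry{}, false // tid falls in a gap between entries
	}
	return e, true
}

func main() {
	entries := []Entry{{0, 9}, {10, 19}, {30, 39}}
	e, ok := findEntry(entries, 15)
	fmt.Println(e, ok) // {10 19} true
	_, ok = findEntry(entries, 25)
	fmt.Println(ok) // false: 25 falls in the gap between entries
}
```

This turns an O(n) scan per lookup into O(log n), which is where the reported speedup would come from on large bucket counts.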


  • I have read and followed all requirements in CONTRIBUTING.md;
  • I used LLM/AI assistance to make this pull request;

@codecov-commenter

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 71.43%. Comparing base (8aefcc9) to head (21ce78f).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #370   +/-   ##
=======================================
  Coverage   71.42%   71.43%           
=======================================
  Files         205      205           
  Lines       14910    14918    +8     
=======================================
+ Hits        10650    10656    +6     
- Misses       3484     3485    +1     
- Partials      776      777    +1     

☔ View full report in Codecov by Sentry.

@github-actions
Contributor

github-actions bot commented Mar 3, 2026

🔴 Performance Degradation

Some benchmarks have degraded compared to the previous run.
The full list of degraded benchmarks:
Name                          Previous (ad0347)  Current (ef4947)  Ratio  Verdict
AggDeep/size=1000-4           4737.00 ns/op      5572.00 ns/op     1.18   🔴
AggDeep/size=10000-4          47818.00 ns/op     54832.00 ns/op    1.15   🔴
AggWide/size=1000-4           4780.00 ns/op      5663.00 ns/op     1.18   🔴
FindSequence_Random/medium-4  12292.95 MB/s      10341.12 MB/s     0.84   🔴
                              86.31 ns/op        99.02 ns/op       1.15   🔴

Comment on lines +88 to 91
if i > 0 {
	entry := data.Entries[i-1]
	if tid <= entry.getLastTID() {
		return entry
Member

Seems like this check is redundant. We checked earlier (line 79) that there MUST exist an entry with StartTID <= tid && tid <= LastTID (at least at index 0), so we can rewrite it more simply:

for _, data := range t {
	from := data.Entries[0].StartTID
	to := data.Entries[len(data.Entries)-1].getLastTID()

	if tid < from || tid > to {
		continue
	}

	i := sort.Search(len(data.Entries), func(j int) bool {
		return data.Entries[j].StartTID > tid
	})

	return data.Entries[i-1]
}

What's important is that sort.Search comes with the following documentation:

Search uses binary search to find and return the smallest index i in [0, n) at which f(i) is true, assuming that on the range [0, n), f(i) == true implies f(i+1) == true.

And there is also:

If there is no such index, Search returns n.

So in this case sort.Search cannot return an index less than or equal to 0.

Member

Personally, I would add an invariant check here as well:

for _, data := range t {
	from := data.Entries[0].StartTID
	to := data.Entries[len(data.Entries)-1].getLastTID()

	if tid < from || tid > to {
		continue
	}

	i := sort.Search(len(data.Entries), func(j int) bool {
		return data.Entries[j].StartTID > tid
	})

	if i <= 0 {
		panic("invariant violation")
	}

	return data.Entries[i-1]
}

but that's unnecessary of course.


3 participants