
Conversation

@xuan-cao-swi
Contributor

Description

This is a hot fix for 7.0.1 that restarts the timer thread after fork.
It also adds a thread lock to synchronize changes to the token bucket variables.
Deprecated the config option :sample_rate (it appears to be no longer used).

Test (if applicable)

@xuan-cao-swi xuan-cao-swi requested review from a team as code owners December 2, 2025 19:06
end

# Starts replenishing the bucket
def start
Contributor


The token bucket replenishment approach here differs from JS/Python: it seems a timer thread adds tokens on an interval, while the other two implementations "replenish during consume", meaning that at the time a consume call is made, the interval between the last consume and now is calculated, and the tokens that accrued during that interval are added to the bucket.

Was there a reason you went with this different approach?
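For illustration, the "replenish during consume" approach described above can be sketched in Ruby as follows. The class and method names are hypothetical, not the actual apm-ruby or apm-python API:

```ruby
# Sketch of the JS/Python-style lazy token bucket: tokens accrue based on
# the time elapsed since the last consume, with no background timer thread.
class LazyTokenBucket
  def initialize(capacity:, rate_per_sec:)
    @capacity = capacity
    @rate = rate_per_sec
    @tokens = capacity
    @last_refill = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    @lock = Mutex.new
  end

  # Returns true and deducts tokens if enough have accrued, else false.
  def consume(count = 1)
    @lock.synchronize do
      refill
      return false if @tokens < count

      @tokens -= count
      true
    end
  end

  private

  # Credit the tokens that accrued since the last call, capped at capacity.
  def refill
    now = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    elapsed = now - @last_refill
    @last_refill = now
    @tokens = [@tokens + (elapsed * @rate), @capacity].min
  end
end
```

Since replenishment happens inside `consume`, the bucket needs no thread and therefore nothing to restart after fork.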

Contributor


A minor point: given this difference, and if you intend to keep this different implementation, it does not make sense to have a MAX_INTERVAL; there should just be a DEFAULT_INTERVAL of 1 second. The intention of MAX_INTERVAL in JS/Python is to guard against the scenario where no consume is called for a long time, i.e. the app sits there without receiving requests.
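To make the distinction concrete, here is an illustrative sketch (constant names and values are hypothetical, not taken from either codebase): a timer-based replenisher only needs a fixed tick, whereas the lazy consume-time approach clamps the elapsed time it credits, so an app that sat idle does not suddenly receive a huge burst of tokens.

```ruby
DEFAULT_INTERVAL = 1.0   # seconds between timer ticks (timer-thread approach)
MAX_INTERVAL     = 20.0  # cap on elapsed seconds credited at consume (lazy approach)

# In the lazy approach, only the clamped elapsed time is converted to tokens.
def credited_elapsed(seconds_since_last_consume)
  [seconds_since_last_consume, MAX_INTERVAL].min
end
```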

Contributor Author


I think the apm-ruby approach is similar to apm-js.

apm-js also seems to have a timer that runs a task adding tokens each interval. Although it is not a thread (JS is single-threaded), I think the setInterval function in JS creates an endless loop that keeps executing the task function.

I checked apm-python, and it indeed recalculates during each consume and update.
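The timer-based alternative being discussed can be sketched in Ruby like this (a minimal sketch with illustrative names; a Ruby thread plays the role of JS's setInterval):

```ruby
# Sketch of a timer-thread token bucket: a background thread adds tokens
# on a fixed interval, independent of when consume is called.
class TimerTokenBucket
  def initialize(capacity:, tokens_per_interval:, interval: 1.0)
    @capacity = capacity
    @per_interval = tokens_per_interval
    @interval = interval
    @tokens = capacity
    @lock = Mutex.new
  end

  # Starts the replenishing thread. Note that with this design a forked
  # child must call start again, since threads do not survive fork.
  def start
    @timer = Thread.new do
      loop do
        sleep @interval
        @lock.synchronize { @tokens = [@tokens + @per_interval, @capacity].min }
      end
    end
  end

  def running?
    !@timer.nil? && @timer.alive?
  end

  def consume(count = 1)
    @lock.synchronize do
      return false if @tokens < count

      @tokens -= count
      true
    end
  end
end
```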

Contributor


Ah, my bad; I just assumed the Python implementation was based on JS.

Question for @tammy-baylis-swi or @jerrytfleung now... do you remember the reason the Python implementation does not use a background thread to replenish tokens?

And @xuan-cao-swi, how does the "reinit thread for forked child process" work in Ruby? I'm not seeing anything obvious in this PR that corresponds to the Python register_at_fork.

Contributor Author


not seeing anything obvious in this PR

Right, there is no reinitialization of the thread like reset_on_fork in http_sampler.
Instead, update/update_from_hash will start the @timer thread after fork: after forking, the settings_request will run and update the settings, which restarts the timer thread via `start unless running`.
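The restart path described above can be sketched as follows. This is a hedged illustration of the mechanism, with simplified names that mirror the description rather than the exact apm-ruby source:

```ruby
# Illustrative replenisher showing restart-after-fork via a settings update.
class Replenisher
  def initialize
    @lock = Mutex.new
    @rate = 0
  end

  def running
    !@timer.nil? && @timer.alive?
  end

  def start
    # Stand-in for the real replenish loop.
    @timer = Thread.new { sleep }
  end

  # Applying fresh settings restarts the timer if it is not alive, which is
  # always the case in a freshly forked child: a forked process inherits
  # only the thread that called fork, so @timer is dead there.
  def update_from_hash(settings)
    @lock.synchronize { @rate = settings[:rate] if settings[:rate] }
    start unless running
  end
end
```

So no explicit at-fork hook is needed: the next settings update in the child revives the thread.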

Contributor


Re:

reason for Python implementation to not use a background thread to replenish tokens?

I'm wondering if it's due to the GIL, where in a CPU-bound application the thread might not actually run as consistently on interval as we'd like, and thus would not replenish tokens accurately.

Co-authored-by: Lin Lin <lin.lin@solarwinds.com>
if settings[:interval]
@interval = settings[:interval].clamp(0, MAX_INTERVAL)
SolarWindsAPM.logger.debug { "[#{self.class}/#{__method__}] Updated interval: #{@interval}ms" }
end
Contributor


The logic is hard to follow due to the many threads and methods involved, but IIUC the settings are updated in a single background thread that runs the update_settings logic, which uses a mutex to protect the @settings data. But I'm not seeing this mutex used to synchronize the reading of @settings, which seems to happen in get_settings. It seems we're missing the case of using the settings mutex to synchronize reads: multiple application threads could be calling shouldSample (and thus getting settings) while this settings update thread is trying to write?

Then, as part of update_settings, it calls this update_from_hash method, and we're using another mutex here to protect concurrent access to the @rate, @capacity, and @tokens variables during token consume/replenish, right?

I'm also still confused about why we could be getting settings[:interval] and needing to update it here.

Finally, I think the APM Python way is easier on the brain. One less thread to think about.
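The read-synchronization concern can be sketched like this (illustrative names only, not the actual apm-ruby classes): both the background updater and the application threads calling into sampling go through the same mutex, so a read never observes a half-applied update.

```ruby
# Sketch: one mutex guards both writes (background settings thread) and
# reads (application threads), closing the race described above.
class SettingsStore
  def initialize
    @mutex = Mutex.new
    @settings = {}
  end

  # Called from the single background settings-update thread.
  def update_settings(new_settings)
    @mutex.synchronize { @settings = new_settings.dup }
  end

  # Called from many application threads (e.g. inside should_sample);
  # the same mutex guards the read.
  def get_settings
    @mutex.synchronize { @settings.dup }
  end
end
```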

Contributor Author


Seems we're missing the case of using the settings mutex to synchronize reads

Yes, I will add them later.

I'm also still confused why we could be getting settings[:interval] and needing to update it here.

Yeah, the interval part should be removed.

APM Python way is easier on the brain.

Agreed. Especially with the extra thread, the mutex makes everything complex.

@xuan-cao-swi xuan-cao-swi requested a review from cheempz December 5, 2025 16:34
@xuan-cao-swi xuan-cao-swi changed the base branch from main to 7-0-patches December 5, 2025 19:03
Contributor

@cheempz cheempz left a comment


Thanks @xuan-cao-swi. As discussed, we can get this in and then actually revamp the thread safety logic. A minor point: we should also get the sample_rate deprecation into main if it's not already there.

@xuan-cao-swi xuan-cao-swi changed the title from "[DO NOT MERGE] NH-124861: restart timer thread" to "NH-124861: restart timer thread" Dec 5, 2025
@xuan-cao-swi xuan-cao-swi merged commit 5ff759a into 7-0-patches Dec 8, 2025
44 checks passed