-
Notifications
You must be signed in to change notification settings - Fork 3
NH-124861: restart timer thread #235
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| end | ||
|
|
||
| # Starts replenishing the bucket | ||
| def start |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The token bucket replenishment approach is different from JS/Python--it seems a timer thread on interval adds the token here, while the other two implementations "replenish during consume", meaning, at the time a token consume call is made, the interval between last consume and now is calculated, then the number of tokens that accrued during the interval is added.
Was there a reason you went with this different approach?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor point is that given this difference, and if you intend to keep this different implementation, it does not make sense to set a MAX_INTERVAL--there should just be a DEFAULT_INTERVAL that is 1 second. The intention with JS/Python MAX_INTERVAL is to guard against the scenario where no consume is called for a long time, i.e. app sits there w/o receiving request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think apm-ruby approach is similar to apm-js.
apm-js seems also has this timer that run task, which add token with duration of interval. Although it's not a thread (e.g. all single thread in js), but I think setInterval function from js create an endless loop to keep executing the task function.
I checked the apm-python, it indeed recalculates during each consume and update.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah my bad, i just assumed the Python implementation is based on JS.
Question for @tammy-baylis-swi or @jerrytfleung now... do you remember the reason for Python implementation to not use a background thread to replenish tokens?
And @xuan-cao-swi how does the "reinit thread for forked child process" work in Ruby--not seeing anything obvious in this PR that corresponds to the Python register_at_fork.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not seeing anything obvious in this PR
Right, there is no reinitialize the thread like reset_on_fork from http_sampler.
The update/update_from_hash will start the @timer thread after fork. Because after fork, the settings_request will start, and it will update the settings; it will restart the timer thread in line with start unless running.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re:
reason for Python implementation to not use a background thread to replenish tokens?
wondering if due to GIL where the thread might not actually run as consistently on interval as we'd like in a CPU bound application, thus not replenishing tokens accurately.
Co-authored-by: Lin Lin <lin.lin@solarwinds.com>
| if settings[:interval] | ||
| @interval = settings[:interval].clamp(0, MAX_INTERVAL) | ||
| SolarWindsAPM.logger.debug { "[#{self.class}/#{__method__}] Updated interval: #{@interval}ms" } | ||
| end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic is hard to follow due to many threads and methods involved, but IIUC the settings are being updated in a single background thread that runs the update_settings logic, which uses a mutex to protect the @settings data. But not seeing this mutex used to synchronize the reading of @settings which seems to happen in get_settings. Seems we're missing the case of using the settings mutex to synchronize reads, e.g. multiple application threads could be calling shouldSample (thus getting settings), while this settings update thread is trying to write it?
Then as part of update_settings it calls this update_from_hash method, and we're using another mutex here to protect concurrent access of the @rate, @capacity and @tokens variables during token consume/replenish, right?
I'm also still confused why we could be getting settings[:interval] and needing to update it here.
Finally, i think the APM Python way is easier on the brain. One less thread to think about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems we're missing the case of using the settings mutex to synchronize reads
Yes, I will add them later
I'm also still confused why we could be getting settings[:interval] and needing to update it here.
Yeah, the interval part should be removed
APM Python way is easier on the brain.
Agree, especially with the thread, the mutex will make everything complex.
cheempz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @xuan-cao-swi. As discussed we can get this in, then actually revamp the thread safety logic. Minor is we should also get the sample_rate deprecation into main if not already there.
Description
This is a hot fix for 7.0.1 that restart the timer thread after fork.
Also add the thread lock that synchronize the token bucket variable changes.
Depreciated the config option
:sample_rate(seems no longer used)Test (if applicable)