feat(ntx-builder): deactivate accounts which crash repeatedly #1712
SantiagoPittella merged 12 commits into next
    tracing::warn!(
        %account_id,
        crash_count = count,
        "Account blacklisted due to repeated crashes, skipping actor spawn"
Q: should we make this more explicit? Either use tracing::error!, since it signifies a bug in our impl, or deliberately add "BUG" to the message?
We should already be logging the crash itself, so this is probably okay at warn since it will repeat often for each account.
    assert_eq!(inactive_targets[0], inactive_id);
}

// BLACKLIST TESTS
nit: we might want to find a different term than "blacklist", e.g. repeated_failure_block_list, to avoid any confusion / FUD
I'd prefer something much shorter -- we're already overly verbose and stuttery throughout the codebase imo. Maybe deactivated_accounts
Mirko-von-Leipzig left a comment
Just naming nits :) I guess blacklist was the wrong term to use -- let's try deactivated instead 🙃
ebf9d91 to 870292b
c5f91d8 to d0c58c2
Co-authored-by: Mirko <48352201+Mirko-von-Leipzig@users.noreply.github.com>
dc72a14 to aab733a
@Mirko-von-Leipzig I added the
* feat(ntx-builder): blacklist accounts whose actors crash repeatedly
* add PR number to changelog entry
* docs: add changelog entry & remove duplicated ones
* review: replace blacklist with deactivated
* review: update docs
* review: rename actors with accounts
* review: improve log format

  Co-authored-by: Mirko <48352201+Mirko-von-Leipzig@users.noreply.github.com>
* review: improve traces
* review: move tests helpers as methods
* review: remove ShutdownReason struct, replace with Result
* chore: remove duplicated changelog entries

---------

Co-authored-by: Mirko <48352201+Mirko-von-Leipzig@users.noreply.github.com>
Implements account blacklisting for the NTX Builder (4th task from #1694).

Track crash counts per account in the coordinator. When an actor shuts down due to a `DbError`, its crash count is incremented. Once it reaches a configurable threshold, the account is blacklisted and `spawn_actor` skips it.

Only `DbError` shutdowns count as crashes because other shutdown reasons (`Cancelled`, `IdleTimeout`, `SemaphoreFailed`) are either intentional or system-wide and not indicative of a per-account bug.
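
The crash-counting rule described above can be sketched roughly as follows. This is a minimal, std-only illustration and not the PR's actual code: the `Coordinator` struct, its fields, the `Shutdown` enum, the `AccountId` alias, and the `bool` return from `spawn_actor` are all hypothetical stand-ins. Only the shutdown reason names and the `spawn_actor` name come from the description.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the real account ID type.
type AccountId = u64;

/// Hypothetical shutdown reasons; the names follow the PR description,
/// any payloads the real types carry are omitted.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Shutdown {
    Cancelled,
    IdleTimeout,
    SemaphoreFailed,
    DbError,
}

/// Hypothetical coordinator state: per-account crash counts and the threshold.
struct Coordinator {
    crash_counts: HashMap<AccountId, usize>,
    max_crashes: usize,
}

impl Coordinator {
    fn new(max_crashes: usize) -> Self {
        Self { crash_counts: HashMap::new(), max_crashes }
    }

    /// Records an actor shutdown. Only `DbError` is counted as a crash;
    /// the other reasons are intentional or system-wide.
    fn on_shutdown(&mut self, account_id: AccountId, reason: Shutdown) {
        if reason == Shutdown::DbError {
            *self.crash_counts.entry(account_id).or_insert(0) += 1;
        }
    }

    /// An account is deactivated once its crash count reaches the threshold.
    fn is_deactivated(&self, account_id: AccountId) -> bool {
        self.crash_counts.get(&account_id).copied().unwrap_or(0) >= self.max_crashes
    }

    /// Returns `true` if an actor would be spawned, `false` if the account is skipped.
    fn spawn_actor(&self, account_id: AccountId) -> bool {
        if self.is_deactivated(account_id) {
            // The real code logs a warning here before skipping.
            return false;
        }
        // Real actor spawning would happen here.
        true
    }
}
```

With a threshold of 2, an account that shuts down twice with `Shutdown::DbError` is skipped on the next `spawn_actor` call, while any number of `Cancelled` or `IdleTimeout` shutdowns leaves it eligible.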