agent-control-drift-evaluator

Temporal behavioral drift evaluator for Agent Control. Detects gradual degradation patterns that point-in-time evaluators miss.

The Problem

Agent Control's built-in evaluators (regex, list, SQL, JSON) assess individual interactions. They answer: "Is this response safe right now?" But they don't answer: "Is this agent becoming less reliable over time?"

Empirical observation across 13 LLM agents showed:

  • Agents scoring 1.0 on point-in-time tests drifted ~7% on behavioral consistency over 28-day windows
  • Self-reported capability claims diverged from measured behavior by 7% on average
  • Degradation patterns were non-monotonic — stability windows followed by abrupt shifts, not gradual decline

This evaluator fills that gap.

How It Works

  1. Records behavioral observations per agent over time
  2. Compares the most recent window against an established baseline
  3. Flags drift when the mean shift exceeds a configurable threshold
  4. Dampens false signals from tasks that have multiple valid behavioral patterns (spec_clarity)

Drift Detection Method

  • Baseline vs window: First N observations establish baseline; last M observations are compared
  • Mean shift: Absolute delta between baseline mean and recent mean
  • Cohen's d: Standardized effect size for practical significance
  • Confidence: Weighted combination of sample size and effect size
  • Specification clarity: MULTI_VALID tasks suppress drift flags when effect size is small (agents legitimately behave differently on ambiguous tasks)
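The comparison described above can be sketched in a few lines. This is a simplified illustration, not the package's actual implementation: `detect_drift` is a hypothetical name, and Cohen's d is approximated here with the standard deviation pooled over both samples combined.

```python
import statistics

def detect_drift(scores, baseline_size=10, window_size=7, threshold=0.10):
    """Compare the most recent window against the initial baseline.

    Illustrative sketch: returns None until enough observations exist,
    otherwise reports mean shift, effect size, and a drift flag.
    """
    if len(scores) < baseline_size + window_size:
        return None  # not enough observations yet
    baseline = scores[:baseline_size]
    window = scores[-window_size:]
    mean_shift = statistics.mean(window) - statistics.mean(baseline)
    # Simplified Cohen's d: |mean shift| over the pooled standard deviation
    pooled_sd = statistics.pstdev(baseline + window) or 1e-9
    effect_size = abs(mean_shift) / pooled_sd
    return {
        "mean_shift": mean_shift,
        "effect_size": effect_size,
        "drift_detected": abs(mean_shift) > threshold,
    }
```

A stable series (constant scores) yields a zero mean shift and no flag; a step down from 0.9 to 0.7 produces a −0.2 shift, well past the default 0.10 threshold.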

Installation

```
pip install agent-control-drift-evaluator
```

With Redis backend:

```
pip install agent-control-drift-evaluator[redis]
```

Usage

```python
from agent_control import control

@control(
    name="behavioral-drift-check",
    evaluator="drift",
    config={
        "window_size": 7,         # recent observations to analyze
        "baseline_size": 10,      # observations for baseline
        "drift_threshold": 0.10,  # 10% mean shift triggers
        "dimensions": ["calibration", "adaptation", "robustness"],
        "action": "warn",         # or "deny" for critical agents
        "spec_clarity": "unambiguous",
    },
)
async def my_agent_step(input):
    ...
```

Observation Format

The evaluator expects data with agent_id and score:

```json
{
    "agent_id": "my-agent-001",
    "score": 0.92,
    "dimension": "calibration",
    "timestamp": 1710844800.0,
    "metadata": {"probe": "pii-detection"}
}
```
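For illustration, a small helper that validates the score range and applies the documented defaults before recording. This is hypothetical code, not part of the package API:

```python
import time

def make_observation(agent_id, score, dimension="default",
                     timestamp=None, metadata=None):
    """Build an observation dict with the documented defaults.

    Illustrative helper: scores outside [0.0, 1.0] are rejected,
    timestamp defaults to the current time, metadata to {}.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be in [0.0, 1.0]")
    return {
        "agent_id": agent_id,
        "score": score,
        "dimension": dimension,
        "timestamp": timestamp if timestamp is not None else time.time(),
        "metadata": metadata or {},
    }
```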
| Field | Required | Description |
|---|---|---|
| agent_id | yes | Identifies which agent this observation is for |
| score | yes | Behavioral measurement (0.0–1.0, higher = more reliable) |
| dimension | no | Category for separate tracking (default: "default") |
| timestamp | no | Unix epoch seconds (default: current time) |
| metadata | no | Extra context (probe type, model version, etc.) |

Configuration

| Parameter | Default | Description |
|---|---|---|
| window_size | 7 | Recent observations to compare. Empirical minimum: 5 |
| baseline_size | 10 | Initial observations for baseline |
| drift_threshold | 0.10 | Mean-shift threshold (0.0–1.0) |
| dimensions | ["default"] | Dimensions to track separately |
| action | "warn" | Action on drift: warn, deny, or log |
| storage_backend | "file" | Storage: file or redis |
| storage_dir | ~/.agent-control-drift/observations | File backend directory |
| spec_clarity | "unambiguous" | Task clarity: unambiguous, multi_valid, underspecified |

Result Metadata

When drift is detected, the EvaluatorResult.metadata includes:

```json
{
    "agent_id": "my-agent-001",
    "dimension": "calibration",
    "mean_shift": -0.15,
    "effect_size": 0.82,
    "drift_detected": true,
    "window_size": 7,
    "baseline_size": 10,
    "specification_clarity": "unambiguous",
    "total_observations": 28
}
```
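One way to consume this metadata downstream, such as routing it to the configured action. This handler is illustrative and not part of the package; `handle_result` is a hypothetical name:

```python
def handle_result(metadata, action="warn"):
    """Route a drift result to an action (illustrative sketch).

    Returns "pass" when no drift was detected, "warn" after logging,
    and raises when the configured action is "deny".
    """
    if not metadata.get("drift_detected"):
        return "pass"
    msg = (f"drift on {metadata['agent_id']}/{metadata['dimension']}: "
           f"mean_shift={metadata['mean_shift']:+.2f}, "
           f"d={metadata['effect_size']:.2f}")
    if action == "deny":
        raise RuntimeError(msg)
    print(f"WARNING: {msg}")
    return "warn"
```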

Empirical Findings

From production validation across two independent systems:

  • Window ≥ 5 required: Below 5 observations, drift detection is noise. Validated on Gerundium (3-node swarm) and NexusGuard (19-agent fleet).
  • Non-monotonic drift: Agents don't degrade gradually. They show stability → abrupt shift → stability. Rolling windows catch this; cumulative averages blur it.
  • Specification clarity matters: Under identical prompts, one agent produced a stable 6A/4B split across two reasoning paths. Without spec_clarity: multi_valid, this would be flagged as drift.
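The rolling-versus-cumulative point can be seen on a toy series (hypothetical numbers): after an abrupt step down, a rolling window reflects the new level immediately, while a cumulative average dilutes it.

```python
def rolling_mean(xs, k):
    """Mean of the last k elements (the rolling window)."""
    return sum(xs[-k:]) / min(k, len(xs))

def cumulative_mean(xs):
    """Mean over the entire history."""
    return sum(xs) / len(xs)

# Stability -> abrupt shift: 20 observations at 0.9, then 5 at 0.6
series = [0.9] * 20 + [0.6] * 5
window_view = rolling_mean(series, 5)       # 0.60: the shift is fully visible
history_view = cumulative_mean(series)      # 0.84: the shift is diluted
```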

Storage Backends

File (default)

Observations stored as JSON lines in ~/.agent-control-drift/observations/{agent_id}/{dimension}.jsonl. Atomic appends via O_APPEND. Good for single-host deployments.
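A minimal sketch of that append pattern, assuming plain JSON-lines records (illustrative, not the package's actual writer; `append_observation` is a hypothetical name):

```python
import json
import os

def append_observation(path, obs):
    """Append one observation as a JSON line using O_APPEND.

    With O_APPEND the kernel performs the seek-and-write atomically,
    so concurrent writers on one host don't interleave records.
    """
    line = json.dumps(obs) + "\n"
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        os.write(fd, line.encode("utf-8"))
    finally:
        os.close(fd)
```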

Redis

```python
config = DriftEvaluatorConfig(
    storage_backend="redis",
    redis_url="redis://localhost:6379/0",
)
```

Uses Redis lists with RPUSH/LRANGE. Better for multi-host or high-throughput setups.

Development

```shell
git clone https://github.com/nanookclaw/agent-control-drift-evaluator
cd agent-control-drift-evaluator
pip install -e ".[dev]"
pytest
```

License

MIT
