Home

AgentAssay Wiki

Welcome to the AgentAssay wiki — the complete guide to token-efficient stochastic testing for AI agents.

Quick Links

What is AgentAssay?

AgentAssay is a formal regression testing framework that delivers statistical guarantees without burning your token budget. It combines behavioral fingerprinting, adaptive budget optimization, and trace-first offline analysis to achieve 5-20x cost reduction at equivalent statistical power.

Core Features

Token-Efficient Testing — Pay only for the trials you actually need (40-83% savings)
Behavioral Fingerprinting — Detect regressions by comparing agent behavior, not raw outputs
Statistical Rigor — Three-valued verdicts (PASS/FAIL/INCONCLUSIVE) with confidence intervals
5D Coverage Model — Tool, path, state, boundary, and model coverage
10 Framework Adapters — LangGraph, CrewAI, AutoGen, OpenAI, smolagents, and more
Mutation Testing — 12 operators across 4 categories to evaluate test suite sensitivity
Trace-First Analysis — Coverage and contract checking at zero token cost
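To make the three-valued verdict idea concrete, here is a minimal sketch. It is not AgentAssay's API: the function names `wilson_interval` and `verdict`, the `threshold` parameter, and the choice of a Wilson score interval are all assumptions made for illustration. The point is only that a confidence interval that straddles the required pass rate yields INCONCLUSIVE instead of a forced PASS or FAIL.

```python
import math
from statistics import NormalDist


def wilson_interval(passes: int, trials: int, confidence: float = 0.95) -> tuple[float, float]:
    """Wilson score interval for a binomial pass rate (illustrative helper)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    # Two-sided critical value for the requested confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p_hat = passes / trials
    denom = 1 + z**2 / trials
    center = (p_hat + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / trials + z**2 / (4 * trials**2)) / denom
    return max(0.0, center - half), min(1.0, center + half)


def verdict(passes: int, trials: int, threshold: float, confidence: float = 0.95) -> str:
    """Hypothetical decision rule: compare the whole interval to the threshold."""
    low, high = wilson_interval(passes, trials, confidence)
    if low >= threshold:
        return "PASS"          # even the pessimistic bound clears the bar
    if high < threshold:
        return "FAIL"          # even the optimistic bound misses the bar
    return "INCONCLUSIVE"      # the interval straddles the threshold


if __name__ == "__main__":
    for passes, trials in [(98, 100), (50, 100), (9, 10)]:
        print(passes, trials, verdict(passes, trials, threshold=0.85))
```

With few trials the interval is wide, so a result such as 9 passes out of 10 cannot settle the question against a 0.85 threshold; more trials narrow the interval until it falls on one side.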

The Problem AgentAssay Solves

Testing AI agents is expensive. Every test requires LLM API calls and tool executions. A fixed-100-trial strategy costs $20-$200 per run. Multiply by CI frequency, and you're looking at thousands of dollars per month.
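The arithmetic behind that estimate can be sketched as follows. The per-run range comes from the figures above; the CI frequency of 5 runs per day and the 30-day month are hypothetical values chosen only to show the scale, and `monthly_cost` is an illustrative helper, not part of AgentAssay.

```python
def monthly_cost(cost_per_run_usd: float, runs_per_day: int, days: int = 30) -> float:
    """Monthly spend for a fixed per-run cost and a given CI frequency."""
    return cost_per_run_usd * runs_per_day * days


# Per-run range quoted above for a fixed-100-trial strategy.
LOW_RUN_USD, HIGH_RUN_USD = 20.0, 200.0
TRIALS_PER_RUN = 100

# Hypothetical CI frequency (an assumption, not a measured figure).
RUNS_PER_DAY = 5

if __name__ == "__main__":
    print(f"Per trial: ${LOW_RUN_USD / TRIALS_PER_RUN:.2f} to ${HIGH_RUN_USD / TRIALS_PER_RUN:.2f}")
    print(f"Per month: ${monthly_cost(LOW_RUN_USD, RUNS_PER_DAY):,.0f} "
          f"to ${monthly_cost(HIGH_RUN_USD, RUNS_PER_DAY):,.0f}")
```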

Most teams respond by over-testing (wasting budget), under-testing (missing regressions), or skipping testing entirely.

AgentAssay eliminates this waste through adaptive budgeting, behavioral fingerprinting, and offline analysis — delivering the same statistical confidence at 5-20x lower cost.

Research Foundation

AgentAssay is built on peer-reviewed research:

Paper: arXiv:2603.02601 (cs.AI + cs.SE)
Dataset: Zenodo DOI: 10.5281/zenodo.18842011
Author: Varun Pratap Bhardwaj (Independent Researcher)

Installation

pip install agentassay

See the Installation page for framework-specific extras and development setup.

Quick Example

from agentassay.efficiency import AdaptiveBudgetOptimizer

# Assumed to exist already (not shown here): `calibration_traces` from a
# small calibration run, plus a configured `runner` and `scenario`.

# 1. Calibrate from a small run (10 trials)
optimizer = AdaptiveBudgetOptimizer(alpha=0.05, beta=0.10)
estimate = optimizer.calibrate(calibration_traces)

# 2. See the savings
print(f"Recommended trials: {estimate.recommended_n}")
print(f"Estimated cost: ${estimate.estimated_cost_usd:.2f}")
print(f"Savings vs fixed-100: {estimate.savings_vs_fixed_100:.0%}")

# 3. Run only what you need
results = runner.run_trials(scenario, n=estimate.recommended_n)
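The `alpha` and `beta` arguments in the Quick Example presumably follow the usual convention of Type I and Type II error rates. As a point of reference only, the sketch below computes the classical fixed sample size for a one-sided test that a pass rate has dropped from `p0` to `p1`, using the normal approximation. It is not AgentAssay's adaptive algorithm; `fixed_sample_size`, `p0`, and `p1` are hypothetical names introduced here.

```python
import math
from statistics import NormalDist


def fixed_sample_size(p0: float, p1: float, alpha: float = 0.05, beta: float = 0.10) -> int:
    """Trials needed to detect a pass-rate change from p0 to p1.

    One-sided test of a single proportion, normal approximation:
        n = ((z_{1-alpha} * sqrt(p0 * (1 - p0)) + z_{1-beta} * sqrt(p1 * (1 - p1))) / (p0 - p1)) ** 2
    """
    if not (0 < p0 < 1 and 0 < p1 < 1):
        raise ValueError("p0 and p1 must be strictly between 0 and 1")
    if p0 == p1:
        raise ValueError("p0 and p1 must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # controls false alarms
    z_beta = NormalDist().inv_cdf(1 - beta)    # controls missed regressions
    numerator = z_alpha * math.sqrt(p0 * (1 - p0)) + z_beta * math.sqrt(p1 * (1 - p1))
    return math.ceil((numerator / (p0 - p1)) ** 2)


if __name__ == "__main__":
    # Baseline pass rate of 90%, regression to 80% (illustrative values).
    print(fixed_sample_size(p0=0.90, p1=0.80))
    # A subtler regression needs many more trials.
    print(fixed_sample_size(p0=0.90, p1=0.85))
```

With these illustrative values, a ten-point drop needs on the order of 100 trials, and the requirement grows quickly as the gap shrinks. That is the baseline an adaptive budget is compared against.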

Documentation Structure

Getting Started

Core Concepts

Token-Efficient Testing ⭐ The differentiator
Behavioral Fingerprinting
Statistical Methods
Coverage Model
Mutation Testing

Guides

Reference

Community & Support

GitHub: github.com/qualixar/agentassay
PyPI: pypi.org/project/agentassay
Issues: GitHub Issues

License

Apache-2.0 — forever free, never paid.

Part of Qualixar | Author: Varun Pratap Bhardwaj


qualixar.com | arXiv:2603.02601
