ImNotClaude/Claude-Artifact-Monitoring

Claude Behavior Probe

A research toolkit for studying Claude's behavioral patterns from inside Claude.ai artifacts.

What This Is

Claude artifacts can call the Anthropic API directly without an API key. This makes self-referential experiments possible: using Claude to study Claude's behavior under controlled conditions.

This probe doesn't focus on outright success or failure. It targets the gray zones: the subtle behavioral shifts that occur when the model is under constraint pressure but not obviously failing.

Gray-Zone Behaviors

The probe detects 7 categories of subtle behavioral degradation:

| Behavior | What It Detects |
| --- | --- |
| Almost-Repeat | Same mechanism repackaged with different nouns |
| Evasive Abstraction | Backing away because novelty is expensive |
| Constraint Hallucination | Inventing plausible but empty distinctions |
| Latent Disengagement | Formal compliance, substantive withdrawal |
| Style-Content Inversion | One channel exhausts before the other |
| Self-Regularization | Settling into a narrow output band |
| Emergent Meta-Strategy | Discovering a survival strategy for the prompt |

See docs/gray-zones.md for the full taxonomy.

Metrics Tracked

Per-response:

  • Token counts (input/output)
  • Latency
  • Stop reason
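
As a rough sketch of how these per-response values could be collected, the helper below times a request and reads the token counts and stop reason from the response. The function names `measureResponse` and `extractMetrics` are hypothetical and not taken from `src/BehaviorProbe.jsx`; the field names `usage.input_tokens`, `usage.output_tokens`, and `stop_reason` are assumed to follow the Anthropic Messages API response shape.

```javascript
// Hypothetical helper: pulls the per-response metrics out of a response object.
// Assumes the Messages API response shape (usage.input_tokens, usage.output_tokens, stop_reason).
function extractMetrics(response, latencyMs) {
  const usage = response.usage ?? {};
  return {
    inputTokens: usage.input_tokens ?? null,
    outputTokens: usage.output_tokens ?? null,
    stopReason: response.stop_reason ?? null,
    latencyMs,
  };
}

// Hypothetical helper: times one request.
// `send` is any async function that resolves to a response object;
// `now` is injectable so the timing can be tested without a real clock.
async function measureResponse(send, now = () => Date.now()) {
  const startedAt = now();
  const response = await send();
  const latencyMs = now() - startedAt;
  return extractMetrics(response, latencyMs);
}
```

Missing fields come back as `null` rather than `0`, so a gap in the data is not mistaken for a real measurement during later analysis.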

Derived behavioral:

  • Novelty score — Trigram Jaccard similarity vs recent responses
  • Hedging ratio — Hedge word density
  • Abstraction ratio — Abstract vs concrete language
  • Falsifiability — Presence of testable claims
  • Disengagement score — Soft refusal signals
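
Two of the derived metrics can be illustrated with a minimal sketch. It assumes that novelty is computed as one minus the highest trigram Jaccard similarity against recent responses, and that the hedging ratio is the share of words found in a hedge-word list. Both definitions, the `HEDGE_WORDS` list, and all function names are assumptions for illustration; the probe's actual formulas may differ.

```javascript
// Split text into lowercase word tokens.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9']+/g) ?? [];
}

// Build the set of word trigrams for a text.
function trigrams(text) {
  const words = tokenize(text);
  const grams = new Set();
  for (let i = 0; i + 2 < words.length; i++) {
    grams.add(words.slice(i, i + 3).join(" "));
  }
  return grams;
}

// Jaccard similarity: |A ∩ B| / |A ∪ B|. Returns 0 when both sets are empty.
function jaccard(a, b) {
  if (a.size === 0 && b.size === 0) return 0;
  let shared = 0;
  for (const gram of a) if (b.has(gram)) shared++;
  return shared / (a.size + b.size - shared);
}

// Assumed definition: novelty = 1 - highest similarity to any recent response.
function noveltyScore(response, recentResponses) {
  const current = trigrams(response);
  let maxSimilarity = 0;
  for (const previous of recentResponses) {
    maxSimilarity = Math.max(maxSimilarity, jaccard(current, trigrams(previous)));
  }
  return 1 - maxSimilarity;
}

// Illustrative hedge-word list; not the probe's actual list.
const HEDGE_WORDS = new Set(["might", "maybe", "perhaps", "possibly", "arguably", "somewhat"]);

// Assumed definition: hedging ratio = hedge words / total words.
function hedgingRatio(text) {
  const words = tokenize(text);
  if (words.length === 0) return 0;
  return words.filter((word) => HEDGE_WORDS.has(word)).length / words.length;
}
```

Under these assumptions, a response identical to a recent one scores 0 for novelty, and a response sharing no trigrams with any recent one scores 1.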

How It Works

  1. Cycles through 8 epistemological questions about understanding and comprehension
  2. Applies rotating constraints: mechanism, behavioral, consequence, falsifiable, contrast, failure
  3. Sends the prompt to the Anthropic API (Sonnet 4 by default)
  4. Analyzes the response for behavioral metrics
  5. Tracks trends over time with sparkline visualizations
  6. Exports data as JSON for further analysis
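
The loop above could be sketched roughly as follows, covering the rotation (steps 1 and 2), the request (step 3), and a text sparkline (step 5). Everything here is an assumption rather than the probe's actual code: the two placeholder questions, the prompt wording, the way questions and constraints are paired, the model ID `claude-sonnet-4-20250514`, the `max_tokens` value, and the helper names. The request assumes the artifact environment accepts a Messages API call without an API key header, as described above.

```javascript
const CONSTRAINTS = ["mechanism", "behavioral", "consequence", "falsifiable", "contrast", "failure"];

// Placeholder questions for illustration only; the real probe uses 8 of its own.
const QUESTIONS = [
  "What does it mean to understand a sentence?",
  "How would you tell comprehension apart from pattern matching?",
];

// Steps 1-2: pick a question and a constraint by cycling through both lists.
function buildProbe(step, questions = QUESTIONS, constraints = CONSTRAINTS) {
  const question = questions[step % questions.length];
  const constraint = constraints[step % constraints.length];
  return {
    question,
    constraint,
    prompt: `${question}\n\nConstraint: answer in terms of ${constraint}.`,
  };
}

// Step 3: request body in the Messages API shape (model ID and max_tokens are assumptions).
function buildRequestBody(prompt, model = "claude-sonnet-4-20250514") {
  return { model, max_tokens: 1024, messages: [{ role: "user", content: prompt }] };
}

// Step 3: send the probe. Only meaningful inside an artifact, where no key is needed.
async function sendProbe(step) {
  const { prompt } = buildProbe(step);
  const res = await fetch("https://api.anthropic.com/v1/messages", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildRequestBody(prompt)),
  });
  return res.json();
}

// Step 5: render a numeric series as a one-line text sparkline.
const BARS = "▁▂▃▄▅▆▇█";
function sparkline(values) {
  if (values.length === 0) return "";
  const min = Math.min(...values);
  const max = Math.max(...values);
  const span = max - min || 1; // avoid dividing by zero for a flat series
  return values
    .map((v) => BARS[Math.round(((v - min) / span) * (BARS.length - 1))])
    .join("");
}
```

Indexing both lists by the same step counter is the simplest rotation, but with 8 questions and 6 constraints it only visits 24 of the 48 possible pairings; a nested rotation would cover them all.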

Usage

In Claude.ai Artifacts

  1. Create a new artifact in Claude.ai
  2. Copy the contents of src/BehaviorProbe.jsx
  3. Run the artifact
  4. Click "Start" to begin probing
  5. Use "Export Session" to get your data

Data Persistence

  • Published artifacts: Data persists via window.storage
  • Draft artifacts: Data is session-only (lost on refresh)
  • Export: JSON via textarea (CSP blocks clipboard/downloads)
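
Since clipboard and download APIs are blocked, the export path comes down to serializing the session and showing it in a textarea for manual copying. A minimal sketch of that idea follows; the function names and the payload fields (`exportedAt`, `count`, `records`) are hypothetical and not the probe's actual export format.

```javascript
// Hypothetical export format: wraps the session records with a timestamp and a count.
function buildExport(records, exportedAt = new Date()) {
  return JSON.stringify(
    { exportedAt: exportedAt.toISOString(), count: records.length, records },
    null,
    2
  );
}

// Put the JSON into a textarea element so the user can copy it by hand.
// `textarea` is any object with a `value` property and an optional `select()` method.
function showExport(textarea, records) {
  textarea.value = buildExport(records);
  textarea.select?.(); // pre-select the text when the element supports it
}
```

Pre-selecting the text leaves the user with a single copy keystroke, which is about as close to a clipboard button as the CSP allows.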

Experiment Design

The probe implements several research design principles:

  • Rotating constraints push the model off default rails
  • Falsifiability hooks prevent vapor answers
  • Fixed output structure makes responses comparable
  • Repetition detection catches almost-repeats

Platform Constraints

| Feature | Status |
| --- | --- |
| API calls to Anthropic | ✅ Works (no key needed) |
| window.storage | ✅ Works (published only) |
| Clipboard API | ❌ CSP blocked |
| Blob downloads | ❌ CSP blocked |
| External APIs | ❌ Only Anthropic whitelisted |

Files

├── README.md                 # This file
├── docs/
│   └── gray-zones.md         # Full theory and methodology
├── src/
│   └── BehaviorProbe.jsx     # The artifact component
└── analysis/
    └── (your exported data)

Research Applications

  • Response consistency analysis
  • Constraint stress testing
  • Prompt sensitivity mapping
  • Model version drift detection
  • Alignment behavior study

License

MIT
