Automates attribution-graph analysis via probe prompting: circuit-trace a prompt, auto-generate concept probes, profile feature activations, cluster supernodes.
graph-analysis sparse-autoencoders mechanistic-interpretability llm-interpretability research-tooling circuit-tracing attribution-graphs probe-prompting prompt-probing neuronpedia feature-activation supernodes cross-layer-transcoder
Updated Oct 30, 2025 - Python