Merged
1 change: 1 addition & 0 deletions astro.config.ts
@@ -63,6 +63,7 @@ export default defineConfig({
items: [
{ slug: "expanding-horizons/threads-context-and-caching" },
{ slug: "expanding-horizons/model-pricing" },
{ slug: "expanding-horizons/autoresearch" },
{ slug: "expanding-horizons/what-to-read-next" },
],
},
45 changes: 45 additions & 0 deletions src/content/docs/expanding-horizons/autoresearch.mdx
@@ -0,0 +1,45 @@
---
title: Autoresearch
description: A pattern where a coding agent runs semi-autonomous experiments to discover performance improvements or other optimizations.
---

import ExternalLink from "../../../components/ExternalLink.astro";
// import ClapButton from "../../../components/ClapButton.astro";

You've already seen how a closed feedback loop makes agents more autonomous — tests and scripts let them self-correct without waiting for you.
Autoresearch takes that idea further.
Instead of fixing one known problem, the agent explores a space of potential improvements on its own, running experiments and keeping what works.

It's particularly effective for optimization tasks where you can express the goal as a number.

## How it works

You give the agent two things:

- A **task description** — what to optimize, what constraints to respect, and what "success" means.
- A **benchmark script** — something the agent runs after each experiment to get a measurable result.
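The benchmark script can be very small. A minimal sketch in Python, where `work()` is a hypothetical stand-in for the code being optimized: it times the target and prints a single number the agent can read back after each experiment.

```python
# Hypothetical benchmark script: times a stand-in workload and prints
# one number (mean seconds per run) for the agent to parse.
import time


def work():
    # Stand-in for the code under optimization.
    return sum(i * i for i in range(100_000))


def benchmark(runs=20):
    start = time.perf_counter()
    for _ in range(runs):
        work()
    elapsed = time.perf_counter() - start
    return elapsed / runs  # mean seconds per run


if __name__ == "__main__":
    print(f"{benchmark():.6f}")
```

The only requirement is a stable, machine-readable output; anything the agent can run and compare across experiments will do.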

The agent then runs a loop: propose a change, apply it, measure it, keep it or revert it, repeat.
Each experiment is isolated, so results stay interpretable.
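That loop can be sketched in a few lines of Python. Everything here is hypothetical scaffolding — `propose_change`, `apply_change`, `revert_change`, and `run_tests` stand in for whatever the agent actually does — but it shows the shape: measure a baseline, try one change at a time, and only keep changes that pass the tests and improve the number.

```python
# Sketch of the autoresearch loop, assuming hypothetical helpers and a
# benchmark() that returns a lower-is-better score.
def autoresearch_loop(benchmark, propose_change, apply_change,
                      revert_change, run_tests, iterations=10):
    best = benchmark()  # baseline measurement before any change
    for _ in range(iterations):
        change = propose_change()
        apply_change(change)
        # Discard anything that breaks correctness or regresses the metric.
        if run_tests() and (score := benchmark()) < best:
            best = score  # keep the change; it becomes the new baseline
        else:
            revert_change(change)
    return best
```

The keep-or-revert step is what makes each experiment isolated: at any point the working tree reflects exactly the set of changes that survived.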

{/* <ClapButton slug="expanding-horizons/autoresearch/how-it-works" /> */}

## Why it works

Three conditions make autoresearch effective:

- **A measurable goal.** "Make it faster" becomes actionable when the agent can run a script and read a number.
Without a benchmark, there's no feedback loop.
- **A robust test suite.** Tests let the agent discard changes that break correctness.
Without them, the agent can't safely move fast.
- **Isolated experiments.** Trying one change at a time keeps results interpretable.
If everything changes at once, you can't tell what worked.

These conditions apply broadly — autoresearch works for performance, but also for any goal you can express as a script output.

Read more:

- <ExternalLink href="https://github.com/karpathy/autoresearch" />
- <ExternalLink href="https://simonwillison.net/2026/Mar/13/liquid/" />

{/* <ClapButton slug="expanding-horizons/autoresearch/why-it-works" /> */}
2 changes: 2 additions & 0 deletions src/data/links.csv
@@ -52,6 +52,7 @@ https://github.com/badlogic/pi-mono/blob/380236a003ec7f0e69f54463b0f00b3118d78f3
https://github.com/callstackincubator/agent-device,callstackincubator/agent-device: CLI to control iOS and Android devices for AI agents,Callstack,,2026-03-04
https://github.com/Expensify/App,Expensify/App,GitHub,,2026-03-04
https://github.com/github/github-mcp-server,GitHub - github/github-mcp-server: GitHub&#39;s official MCP Server · GitHub,,,2026-03-13
https://github.com/karpathy/autoresearch,GitHub - karpathy/autoresearch: AI agents running research on single-GPU nanochat training automatically · GitHub,,2026-03-06,2026-04-07
https://github.com/mattpocock/skills/blob/main/grill-me/SKILL.md,grill-me skill,Matt Pocock,2026-02-25,2026-03-06
https://github.com/mcp,GitHub MCP Registry,,,2026-03-13
https://github.com/microsoft/playwright-mcp,microsoft/playwright-mcp,Microsoft,,2026-03-13
@@ -86,6 +87,7 @@ https://registerspill.thorstenball.com/,Register Spill,Thorsten Ball,,2026-03-04
https://simonwillison.net/,Weblog,Simon Willison,,2026-03-04
https://simonwillison.net/2025/Jun/16/the-lethal-trifecta/,"The lethal trifecta for AI agents: private data, untrusted content, and external communication",Simon Willison,2025-06-16,2026-03-16
https://simonwillison.net/2026/Feb/10/showboat-and-rodney/,"Introducing Showboat and Rodney, so agents can demo what they’ve built",Simon Willison,2026-02-10,2026-03-24
https://simonwillison.net/2026/Mar/13/liquid/,"Shopify/liquid: Performance: 53% faster parse+render, 61% fewer allocations",,2026-03-13,2026-04-07
https://simonwillison.net/guides/agentic-engineering-patterns/anti-patterns/,Anti-patterns: things to avoid,Simon Willison,2026-03-04,2026-03-05
https://simonwillison.net/guides/agentic-engineering-patterns/linear-walkthroughs/,Linear walkthroughs,Simon Willison,2026-02-25,2026-03-04
https://simonwillison.net/guides/agentic-engineering-patterns/red-green-tdd/,Red/green TDD,Simon Willison,2026-02-23,2026-03-04