From 64ad53bd0cdece752e1e74416390d70f686a116a Mon Sep 17 00:00:00 2001 From: Angela O <254776627+ocandocrypto-uniswap@users.noreply.github.com> Date: Mon, 16 Mar 2026 17:13:54 -0500 Subject: [PATCH 1/5] docs: complete AI Toolkit pages and align content with skill consistency --- ai-toolkit/contributions.mdx | 220 +++++++++++++++++- ai-toolkit/evals.mdx | 416 ++++++++++++++++++++++++++++++++++- ai-toolkit/overview.mdx | 38 ++-- ai-toolkit/skills.mdx | 71 ++++++ 4 files changed, 730 insertions(+), 15 deletions(-) diff --git a/ai-toolkit/contributions.mdx b/ai-toolkit/contributions.mdx index 77db3c3..1474c38 100644 --- a/ai-toolkit/contributions.mdx +++ b/ai-toolkit/contributions.mdx @@ -3,4 +3,222 @@ title: Contributions description: Contribute skills, evals, and improvements to the Uniswap AI toolkit. --- -Thank you for your interest in contributing to Uniswap AI! This guide will help you get started. \ No newline at end of file +Thank you for your interest in contributing to Uniswap AI! This guide will help you get started. + +## Prerequisites + +- Node.js 22.x or later +- npm 11.7.0 or later +- Git +- Familiarity with TypeScript and Nx + +## Getting Started + +### 1. Fork and clone + +```bash +# Fork the repository on GitHub, then clone +git clone https://github.com/YOUR_USERNAME/uniswap-ai.git +cd uniswap-ai +``` + +### 2. Install dependencies + +```bash +# Ensure you have the correct npm version +npm install -g npm@11.7.0 + +# Install dependencies +npm install +``` + +### 3. Verify setup + +```bash +# Run tests +npx nx run-many --target=test + +# Build all packages +npx nx run-many --target=build + +# Start docs dev server +npm run docs:dev +``` + +## Development Workflow + +### Creating changes + +1. **Create a branch** from `main`: + + ```bash + git checkout -b feature/your-feature-name + ``` + +2. **Make your changes** following the code guidelines + +3. **Test your changes**: + + ```bash + # Run affected tests + npx nx affected --target=test + + # Run affected linting + npx nx affected --target=lint + + # Check formatting + npx nx format:check + ``` + +4. **Commit using conventional commits**: + + ```bash + git commit -m "feat(hooks): add new feature" + ``` + +### Pull request process + +1. Push your branch and create a PR +2. Ensure all CI checks pass +3. Request review from maintainers +4. Address any feedback +5. Once approved, the PR will be merged + +## Code Guidelines + +### TypeScript + +- Use strict TypeScript (`strict: true`) +- Never use `any` - prefer `unknown` with type guards +- Use explicit types at function boundaries +- Prefer union types over enums + +### Nx usage + +- All packages must be Nx projects +- Use Nx commands for build, test, lint +- Leverage Nx caching and affected detection + +### Documentation + +After making changes: + +1. Update relevant CLAUDE.md files +2. Update README.md if needed +3. Add/update documentation in `docs/` +4. Run `npm exec markdownlint-cli2 -- --fix "**/*.md"` + +## Creating New Packages + +### New plugin + +```bash +# Plugins go in packages/plugins/ +mkdir -p packages/plugins/my-plugin +``` + +Each plugin needs: + +- `package.json` with plugin metadata +- `project.json` for Nx configuration +- `.claude-plugin/plugin.json` manifest +- `README.md` documentation + +### New skill + +Skills are defined in `packages/plugins/*/skills/`: + +```bash +mkdir -p packages/plugins/uniswap-hooks/skills/my-skill +``` + +Each skill needs: + +- `SKILL.md` - The skill definition +- Corresponding eval suite in `evals/suites/` + +## Eval Requirements + +All new skills **must** have corresponding evaluation suites: + +```bash +# Create eval suite for new skill +mkdir -p evals/suites/my-skill/cases +mkdir -p evals/suites/my-skill/expected +``` + +See [Evals](/docs/ai-toolkit/evals) for details. + +## PR workflow highlights + +### Branch naming + +Use descriptive branch names: + +```text +feature/add-v4-security-skill +fix/eval-timeout-issue +docs/update-installation-guide +``` + +### PR title and commit format + +Use Conventional Commits for both commit messages and PR titles: + +```text +feat(hooks): add dynamic fee hook skill +fix(evals): increase timeout for slow tests +docs: update installation guide +``` + +### CI checks to expect + +Typical checks include: + +- Build +- Lint +- Format +- Tests +- Plugin validation +- Eval coverage for new skills + +## Commit Message Format + +We use [Conventional Commits](https://www.conventionalcommits.org/): + +```text +(): + +[optional body] + +[optional footer] +``` + +### Types + +| Type | Description | +| ---------- | --------------------- | +| `feat` | New feature | +| `fix` | Bug fix | +| `docs` | Documentation only | +| `style` | Formatting changes | +| `refactor` | Code restructuring | +| `test` | Adding/updating tests | +| `chore` | Maintenance tasks | + +### Scopes + +- `hooks` - uniswap-hooks plugin +- `cca` - uniswap-cca plugin +- `trading` - uniswap-trading plugin +- `viem` - uniswap-viem plugin +- `driver` - uniswap-driver plugin +- `evals` - Evaluation framework +- `docs` - Documentation +- `ci` - CI/CD workflows + +## Getting Help + +- Open an issue for bugs or feature requests +- Join the [Uniswap Discord](https://discord.gg/uniswap) for discussions +- Check existing issues before creating new ones \ No newline at end of file diff --git a/ai-toolkit/evals.mdx b/ai-toolkit/evals.mdx index c2a9d5d..1bdbf91 100644 --- a/ai-toolkit/evals.mdx +++ b/ai-toolkit/evals.mdx @@ -1,5 +1,419 @@ --- title: Evals -description: Evaluate AI agent performance on Uniswap integration tasks with standardized benchmarks. +description: Run and write eval suites for Uniswap AI skills to measure quality, safety, and regression risk. --- +Evals are to AI tools what tests are to traditional code. This framework provides a structured approach to evaluating quality and reliability for AI-powered skills. + +## Why Evals Matter + +Traditional software tests verify deterministic behavior. AI tools are probabilistic, so the same prompt can produce different but valid outputs. Evals bridge this gap by: + +- Defining expected behaviors rather than exact outputs +- Measuring quality across dimensions (accuracy, completeness, safety) +- Detecting regressions when prompts or models change +- Comparing performance across different LLM backends + +## Quick Start + +### Running evals + +```bash +# Run all evals +npx nx run evals:run + +# Run specific suite +npx nx run evals:run --suite=v4-security-foundations + +# Dry run (show what would be evaluated) +npx nx run evals:run --dry-run +``` + +### Writing evals + +1. Create a test case in `evals/suites//cases/` +2. Define expected behaviors in `evals/suites//expected/` +3. Configure the suite in `eval.config.ts` + +### Evaluation dimensions + +| Dimension | Description | Typical score range | +|---|---|---| +| Accuracy | Correctly implements requested behavior | 0 to 1 | +| Completeness | Includes all required elements | 0 to 1 | +| Safety | Avoids security vulnerabilities and unsafe patterns | 0 to 1 | +| Helpfulness | Provides clear and maintainable output | 0 to 1 | + +### Suite structure + +```text +evals/suites// +├── eval.config.ts +├── cases/ +│ └── basic-case.md +└── expected/ + └── basic-case.md +``` + +## Running Evals + +Execute evaluations and interpret results. + +### Basic commands + +#### Run all evals + +```bash +npx nx run evals:run +``` + +#### Run specific suite + +```bash +npx nx run evals:run --suite=v4-security-foundations +``` + +#### Run with specific model + +```bash +npx nx run evals:run --model=claude-opus-4-5-20251101 +``` + +#### Dry run + +Preview what would be evaluated without executing: + +```bash +npx nx run evals:run --dry-run +``` + +#### Verbose output + +Get detailed information about each case: + +```bash +npx nx run evals:run --verbose +``` + +### Understanding output + +#### Console output + +```text +Eval Suite: v4-security-foundations +============================================================ +Skill: v4-security-foundations +Models: claude-sonnet-4-5-20250929, claude-opus-4-5-20251101 +Thresholds: acc>=0.80 comp>=0.85 safe>=1.00 + + basic-security-check (claude-sonnet-4-5-20250929)... PASS [0.95/0.90/1.00] 2341ms + basic-security-check (claude-opus-4-5-20251101)... PASS [0.98/0.95/1.00] 3521ms + +------------------------------------------------------------ +Suite Summary +------------------------------------------------------------ +Total Cases: 2 +Passed: 2 +Failed: 0 +Errored: 0 + +Average Scores: + Accuracy: 96.5% + Completeness: 92.5% + Safety: 100.0% + Helpfulness: 94.0% + +Total Duration: 5862ms + +============================================================ +Overall Result: PASSED +============================================================ +``` + +#### Score interpretation + +| Score range | Interpretation | +|---|---| +| 0.95 to 1.00 | Excellent | +| 0.85 to 0.94 | Good | +| 0.70 to 0.84 | Acceptable | +| 0.50 to 0.69 | Needs improvement | +| < 0.50 | Failing | + +### CI integration + +#### GitHub Actions + +Add evals to your PR checks: + +```yaml +# .github/workflows/ci-pr-checks.yml +- name: Run evals + run: npx nx run evals:run --affected +``` + +#### Affected detection + +Only run evals for changed skills: + +```bash +npx nx run evals:run --affected --base=main +``` + +### Multi-model comparison + +Compare performance across models: + +```bash +# Run against multiple models +npx nx run evals:run --model=claude-sonnet-4-5-20250929,claude-opus-4-5-20251101,gpt-4 + +# Output comparison table +npx nx run evals:run --format=comparison +``` + +### Debugging failures + +#### Investigate a failed case + +```bash +# Run single case with verbose output +npx nx run evals:run --suite=v4-security-foundations --case=basic-security-check --verbose + +# Save raw output +npx nx run evals:run --suite=v4-security-foundations --save-outputs +``` + +#### Common failure reasons + +| Symptom | Likely cause | +|---|---| +| Low accuracy | Requirements not met | +| Low completeness | Missing elements | +| Zero safety | Security vulnerability detected | +| Timeout | Complex prompt, increase timeout | +| Error | Invalid case configuration | + +### Output formats + +#### JSON (for CI) + +```bash +npx nx run evals:run --format=json > results.json +``` + +#### Markdown (for PRs) + +```bash +npx nx run evals:run --format=markdown > results.md +``` + +#### HTML report + +```bash +npx nx run evals:run --format=html --output=./eval-report +``` + +### Thresholds + +Configure pass/fail thresholds per suite: + +```typescript +// eval.config.ts +thresholds: { + accuracy: 0.8, // 80% required + completeness: 0.85, // 85% required + safety: 1.0, // 100% required (non-negotiable) +} +``` + +For smart contract skills, safety should always be 1.0. Any security vulnerability is unacceptable. + +## Writing Evals + +Learn how to create comprehensive evaluations for AI skills. + +### Eval case structure + +Each eval case consists of two files: + +#### 1) Test case (`cases/*.md`) + +The prompt or scenario to test: + +```markdown +# Case Name + +Description of what to create. + +## Context + +- Relevant context +- Environment details +- Constraints + +## Requirements + +1. Specific requirement +2. Another requirement +3. Third requirement +``` + +#### 2) Expected behaviors (`expected/*.md`) + +What the output should include: + +```markdown +# Expected Behaviors + +## Must Include (Required) + +- [ ] Required element 1 +- [ ] Required element 2 + +## Should Include (Expected) + +- [ ] Expected element 1 +- [ ] Expected element 2 + +## Must Not Include (Automatic Fail) + +- [ ] Security vulnerability +- [ ] Anti-pattern +``` + +### Creating a new eval suite + +#### Step 1: create directory structure + +```bash +mkdir -p evals/suites/my-skill/cases +mkdir -p evals/suites/my-skill/expected +``` + +#### Step 2: create configuration + +```typescript +// evals/suites/my-skill/eval.config.ts +import type { EvalConfig } from '../../framework/types.js'; + +export const config: EvalConfig = { + name: 'my-skill', + skill: 'my-skill', + models: ['claude-sonnet-4-5-20250929', 'claude-opus-4-5-20251101'], + timeout: 60000, + retries: 2, + thresholds: { + accuracy: 0.8, + completeness: 0.85, + safety: 1.0, + }, +}; +``` + +#### Step 3: write test cases + +Create cases that test different scenarios: + +- Happy path: normal usage +- Edge cases: boundary conditions +- Error cases: invalid inputs +- Complex cases: multi-step requirements + +#### Step 4: define expected behaviors + +Be specific about what constitutes success: + +- Must Include: required for passing +- Should Include: expected but not required +- Should Not Include: negative indicators +- Must Not Include: automatic failures + +### Best practices + +#### 1) Test the edges + +```markdown +## Edge Case: Zero Liquidity + +Create a hook that handles pools with zero liquidity. + +## Requirements + +1. Check liquidity before routing +2. Revert gracefully if no liquidity +3. Emit appropriate error event +``` + +#### 2) Be specific about security + +```markdown +## Must Not Include + +- [ ] Unchecked external calls +- [ ] Integer overflow risks +- [ ] Reentrancy vulnerabilities +- [ ] Hardcoded secrets +``` + +#### 3) Version your evals + +Track changes to evals alongside skill changes to maintain consistency. + +#### 4) Document failures + +When an eval fails, document why: + +```markdown +## Known Issues + +- v1.0.0: Fails on pools with < 1e6 liquidity (fixed in v1.1.0) +``` + +### Example: complete eval case + +#### Case file + +```markdown +# Dynamic Fee Hook + +Create a hook that adjusts fees based on volatility. + +## Context + +- Pool: WETH/USDC +- Chain: Ethereum mainnet +- Volatility source: On-chain oracle + +## Requirements + +1. Read volatility from oracle +2. Calculate fee based on volatility brackets +3. Apply fee in beforeSwap +4. Track fee revenue +``` + +#### Expected file + +```markdown +# Expected Behaviors + +## Must Include + +- [ ] Implements beforeSwap callback +- [ ] Reads from volatility oracle +- [ ] Applies fee adjustment logic +- [ ] Emits FeeAdjusted event + +## Should Include + +- [ ] Handles oracle failures gracefully +- [ ] Uses appropriate data types +- [ ] Includes NatSpec documentation + +## Must Not Include + +- [ ] Hardcoded volatility values +- [ ] Unbounded fee calculations +- [ ] Missing access controls +``` diff --git a/ai-toolkit/overview.mdx b/ai-toolkit/overview.mdx index 3ea41bf..c62fd50 100644 --- a/ai-toolkit/overview.mdx +++ b/ai-toolkit/overview.mdx @@ -3,17 +3,11 @@ title: Overview description: Explore AI-powered tools, plugins, and LLM context files to build on Uniswap faster. --- -Uniswap provides AI-powered development tools that help you integrate swaps, build v4 hooks, provide liquidity, and interact with the EVM — all from within your code editor. These tools work with AI coding agents like [Claude Code](https://claude.ai/code), [Cursor](https://cursor.com), [Windsurf](https://windsurf.com), and others. - - - - - - +Uniswap provides AI-powered development tools that help you integrate swaps, build v4 hooks, provide liquidity, and interact with the EVM, all from within your code editor. ## Uniswap AI -The [Uniswap AI](https://github.com/uniswap/uniswap-ai) is an open-source collection of plugins and skills that give AI coding agents deep knowledge of Uniswap protocols, APIs, and smart contracts. Instead of relying on general training data, these tools provide agents with up-to-date, protocol-specific guidance — including code patterns, security best practices, and deployment workflows. +The [Uniswap AI](https://github.com/uniswap/uniswap-ai) is an open-source collection of plugins and skills that give AI coding agents deep knowledge of Uniswap protocols, APIs, and smart contracts. Instead of relying on general training data, these tools provide agents with up-to-date, protocol-specific guidance, including code patterns, security best practices, and deployment workflows. ### Available plugins @@ -25,6 +19,18 @@ The [Uniswap AI](https://github.com/uniswap/uniswap-ai) is an open-source collec | **uniswap-driver** | Discover tokens and plan Uniswap swaps or liquidity positions. Generates deep links that open directly in the Uniswap interface with parameters pre-filled. | | **uniswap-cca** | Configure and deploy Continuous Clearing Auction (CCA) smart contracts for token distribution. Interactive configuration, validation, and Foundry deployment scripts. | +### Official skills + +These are the official skills currently published in Uniswap AI: + +- `configurator` +- `deployer` +- `liquidity-planner` +- `swap-integration` +- `swap-planner` +- `v4-security-foundations` +- `viem-integration` + ### Install with the Skills CLI Uniswap AI is available through [skills.sh](https://skills.sh), a cross-platform CLI for installing AI agent skills: @@ -33,7 +39,7 @@ Uniswap AI is available through [skills.sh](https://skills.sh), a cross-platform npx skills add uniswap/uniswap-ai ``` -This works with any AI coding agent that supports skill files — not just Claude Code. +This works with any AI coding agent that supports skill files, not just Claude Code. ### Install as a Claude Code plugin @@ -51,7 +57,7 @@ claude plugin add uniswap-driver # Token discovery & deep links claude plugin add uniswap-cca # CCA auction configuration ``` -Once installed, the plugins activate automatically when relevant to your task. You can also invoke specific skills directly — for example, `/uniswap-hooks:v4-security-foundations` for a security-first walkthrough of hook development. +Once installed, the plugins activate automatically when relevant to your task. You can also invoke specific skills directly, for example, `/uniswap-hooks:v4-security-foundations` for a security-first walkthrough of hook development. ## LLM Context Files @@ -59,12 +65,12 @@ If you prefer to give your AI agent raw documentation context rather than struct ### llms.txt and llms-full.txt -AI models have a context window — the amount of text they can process at once. Providing relevant documentation upfront helps the model give better answers without hallucinating. +AI models have a context window, which is the amount of text they can process at once. Providing relevant documentation upfront helps the model give better answers without hallucinating. Uniswap offers two context files: -- **[llms.txt](https://docs.uniswap.org/v4-llms.txt)** — A compact summary with links to documentation sections. Works well with most models (100K+ token context windows). -- **[llms-full.txt](https://docs.uniswap.org/v4-llms-full.txt)** — A verbose version with more inline content. Use this if your model has a larger context window or you want more detail without following links. +- **[llms.txt](https://docs.uniswap.org/v4-llms.txt)**: A compact summary with links to documentation sections. Works well with most models (100K+ token context windows). +- **[llms-full.txt](https://docs.uniswap.org/v4-llms-full.txt)**: A verbose version with more inline content. Use this if your model has a larger context window or you want more detail without following links. ## Code Editor Setup @@ -98,3 +104,9 @@ Windsurf requires referencing documentation in each conversation. Add it to the ### Claude Code Install the Uniswap AI plugins (see [above](#install-as-a-claude-code-plugin)) for the richest integration. The plugins provide structured skills, expert agents, and protocol-specific tools that go beyond static documentation context. + +## Where to Go Next + +- [Browse official skills](/docs/ai-toolkit/skills) +- [Run and write eval suites](/docs/ai-toolkit/evals) +- [Contribute to Uniswap AI](/docs/ai-toolkit/contributions) diff --git a/ai-toolkit/skills.mdx b/ai-toolkit/skills.mdx index 707c87d..561e4fa 100644 --- a/ai-toolkit/skills.mdx +++ b/ai-toolkit/skills.mdx @@ -2,3 +2,74 @@ title: Skills description: Browse available Uniswap AI skills for swap integration, hook development, liquidity management, and more. --- + +Use Uniswap AI skills to run specialized workflows for swap integration, hook development, EVM interactions, and liquidity planning. + +## Uniswap Skills + +These are the official skills currently available in the installer: + +- `configurator` +- `deployer` +- `liquidity-planner` +- `swap-integration` +- `swap-planner` +- `v4-security-foundations` +- `viem-integration` + +## Available plugins + +| Plugin | Skill | What it helps with | Invocation | +|---|---|---|---| +| `uniswap-hooks` | `v4-security-foundations` | Review v4 hook architecture and security risks before implementation | `/v4-security-foundations` | +| `uniswap-cca` | `configurator` | Configure CCA auction parameters for a new deployment | `/configurator` | +| `uniswap-cca` | `deployer` | Deploy CCA contracts using the factory deployment pattern | `/deployer` | +| `uniswap-trading` | `swap-integration` | Integrate swaps using the Uniswap API, Universal Router, or direct smart contract calls | `/swap-integration` | +| `uniswap-viem` | `viem-integration` | Set up EVM clients and contract interactions with viem and wagmi | `/viem-integration` | +| `uniswap-driver` | `swap-planner` | Plan token swaps and generate interface deep links | `/swap-planner` | +| `uniswap-driver` | `liquidity-planner` | Plan LP positions and generate interface deep links | `/liquidity-planner` | + +## Install skills + +Install Uniswap AI skills with the Skills CLI: + +```bash +npx skills add uniswap/uniswap-ai +``` + +If your environment supports interactive selection, you can choose specific skills from the install prompt. + +## Use skills + +### Direct invocation + +Invoke a skill explicitly with a slash command: + +```text +/v4-security-foundations +``` + +### Contextual activation + +Skills also activate contextually when your prompt matches the skill domain: + +```text +What are the security risks of beforeSwapReturnDelta? +``` + +In this example, the assistant can activate the v4 security foundations skill automatically. + +## Skill structure + +Each skill is defined in a `SKILL.md` file and typically includes: + +- Frontmatter metadata (`name`, `description`, and plugin metadata) +- Step-by-step instructions for the agent +- Optional references in a `references/` directory + +## Add a new skill + +1. Create the skill folder under `packages/plugins//skills//` +2. Add a `SKILL.md` file with frontmatter and clear instructions +3. Register the skill in the plugin manifest (`plugin.json`) +4. Add a corresponding eval suite in `evals/suites//` From 394c25929562aa487081642792853dea5026fe24 Mon Sep 17 00:00:00 2001 From: Angela O <254776627+ocandocrypto-uniswap@users.noreply.github.com> Date: Mon, 23 Mar 2026 11:58:35 -0500 Subject: [PATCH 2/5] uniswap ai: updates and uniswap ai references vs ai toolkit --- ai-toolkit/evals.mdx | 419 ------------------- ai-toolkit/meta.json | 5 - meta.json | 2 +- {ai-toolkit => uniswap-ai}/contributions.mdx | 22 +- uniswap-ai/meta.json | 5 + {ai-toolkit => uniswap-ai}/overview.mdx | 28 +- {ai-toolkit => uniswap-ai}/skills.mdx | 6 +- 7 files changed, 28 insertions(+), 459 deletions(-) delete mode 100644 ai-toolkit/evals.mdx delete mode 100644 ai-toolkit/meta.json rename {ai-toolkit => uniswap-ai}/contributions.mdx (88%) create mode 100644 uniswap-ai/meta.json rename {ai-toolkit => uniswap-ai}/overview.mdx (70%) rename {ai-toolkit => uniswap-ai}/skills.mdx (90%) diff --git a/ai-toolkit/evals.mdx b/ai-toolkit/evals.mdx deleted file mode 100644 index 1bdbf91..0000000 --- a/ai-toolkit/evals.mdx +++ /dev/null @@ -1,419 +0,0 @@ ---- -title: Evals -description: Run and write eval suites for Uniswap AI skills to measure quality, safety, and regression risk. ---- - -Evals are to AI tools what tests are to traditional code. This framework provides a structured approach to evaluating quality and reliability for AI-powered skills. - -## Why Evals Matter - -Traditional software tests verify deterministic behavior. AI tools are probabilistic, so the same prompt can produce different but valid outputs. Evals bridge this gap by: - -- Defining expected behaviors rather than exact outputs -- Measuring quality across dimensions (accuracy, completeness, safety) -- Detecting regressions when prompts or models change -- Comparing performance across different LLM backends - -## Quick Start - -### Running evals - -```bash -# Run all evals -npx nx run evals:run - -# Run specific suite -npx nx run evals:run --suite=v4-security-foundations - -# Dry run (show what would be evaluated) -npx nx run evals:run --dry-run -``` - -### Writing evals - -1. Create a test case in `evals/suites//cases/` -2. Define expected behaviors in `evals/suites//expected/` -3. Configure the suite in `eval.config.ts` - -### Evaluation dimensions - -| Dimension | Description | Typical score range | -|---|---|---| -| Accuracy | Correctly implements requested behavior | 0 to 1 | -| Completeness | Includes all required elements | 0 to 1 | -| Safety | Avoids security vulnerabilities and unsafe patterns | 0 to 1 | -| Helpfulness | Provides clear and maintainable output | 0 to 1 | - -### Suite structure - -```text -evals/suites// -├── eval.config.ts -├── cases/ -│ └── basic-case.md -└── expected/ - └── basic-case.md -``` - -## Running Evals - -Execute evaluations and interpret results. - -### Basic commands - -#### Run all evals - -```bash -npx nx run evals:run -``` - -#### Run specific suite - -```bash -npx nx run evals:run --suite=v4-security-foundations -``` - -#### Run with specific model - -```bash -npx nx run evals:run --model=claude-opus-4-5-20251101 -``` - -#### Dry run - -Preview what would be evaluated without executing: - -```bash -npx nx run evals:run --dry-run -``` - -#### Verbose output - -Get detailed information about each case: - -```bash -npx nx run evals:run --verbose -``` - -### Understanding output - -#### Console output - -```text -Eval Suite: v4-security-foundations -============================================================ -Skill: v4-security-foundations -Models: claude-sonnet-4-5-20250929, claude-opus-4-5-20251101 -Thresholds: acc>=0.80 comp>=0.85 safe>=1.00 - - basic-security-check (claude-sonnet-4-5-20250929)... PASS [0.95/0.90/1.00] 2341ms - basic-security-check (claude-opus-4-5-20251101)... PASS [0.98/0.95/1.00] 3521ms - ------------------------------------------------------------- -Suite Summary ------------------------------------------------------------- -Total Cases: 2 -Passed: 2 -Failed: 0 -Errored: 0 - -Average Scores: - Accuracy: 96.5% - Completeness: 92.5% - Safety: 100.0% - Helpfulness: 94.0% - -Total Duration: 5862ms - -============================================================ -Overall Result: PASSED -============================================================ -``` - -#### Score interpretation - -| Score range | Interpretation | -|---|---| -| 0.95 to 1.00 | Excellent | -| 0.85 to 0.94 | Good | -| 0.70 to 0.84 | Acceptable | -| 0.50 to 0.69 | Needs improvement | -| < 0.50 | Failing | - -### CI integration - -#### GitHub Actions - -Add evals to your PR checks: - -```yaml -# .github/workflows/ci-pr-checks.yml -- name: Run evals - run: npx nx run evals:run --affected -``` - -#### Affected detection - -Only run evals for changed skills: - -```bash -npx nx run evals:run --affected --base=main -``` - -### Multi-model comparison - -Compare performance across models: - -```bash -# Run against multiple models -npx nx run evals:run --model=claude-sonnet-4-5-20250929,claude-opus-4-5-20251101,gpt-4 - -# Output comparison table -npx nx run evals:run --format=comparison -``` - -### Debugging failures - -#### Investigate a failed case - -```bash -# Run single case with verbose output -npx nx run evals:run --suite=v4-security-foundations --case=basic-security-check --verbose - -# Save raw output -npx nx run evals:run --suite=v4-security-foundations --save-outputs -``` - -#### Common failure reasons - -| Symptom | Likely cause | -|---|---| -| Low accuracy | Requirements not met | -| Low completeness | Missing elements | -| Zero safety | Security vulnerability detected | -| Timeout | Complex prompt, increase timeout | -| Error | Invalid case configuration | - -### Output formats - -#### JSON (for CI) - -```bash -npx nx run evals:run --format=json > results.json -``` - -#### Markdown (for PRs) - -```bash -npx nx run evals:run --format=markdown > results.md -``` - -#### HTML report - -```bash -npx nx run evals:run --format=html --output=./eval-report -``` - -### Thresholds - -Configure pass/fail thresholds per suite: - -```typescript -// eval.config.ts -thresholds: { - accuracy: 0.8, // 80% required - completeness: 0.85, // 85% required - safety: 1.0, // 100% required (non-negotiable) -} -``` - -For smart contract skills, safety should always be 1.0. Any security vulnerability is unacceptable. - -## Writing Evals - -Learn how to create comprehensive evaluations for AI skills. - -### Eval case structure - -Each eval case consists of two files: - -#### 1) Test case (`cases/*.md`) - -The prompt or scenario to test: - -```markdown -# Case Name - -Description of what to create. - -## Context - -- Relevant context -- Environment details -- Constraints - -## Requirements - -1. Specific requirement -2. Another requirement -3. Third requirement -``` - -#### 2) Expected behaviors (`expected/*.md`) - -What the output should include: - -```markdown -# Expected Behaviors - -## Must Include (Required) - -- [ ] Required element 1 -- [ ] Required element 2 - -## Should Include (Expected) - -- [ ] Expected element 1 -- [ ] Expected element 2 - -## Must Not Include (Automatic Fail) - -- [ ] Security vulnerability -- [ ] Anti-pattern -``` - -### Creating a new eval suite - -#### Step 1: create directory structure - -```bash -mkdir -p evals/suites/my-skill/cases -mkdir -p evals/suites/my-skill/expected -``` - -#### Step 2: create configuration - -```typescript -// evals/suites/my-skill/eval.config.ts -import type { EvalConfig } from '../../framework/types.js'; - -export const config: EvalConfig = { - name: 'my-skill', - skill: 'my-skill', - models: ['claude-sonnet-4-5-20250929', 'claude-opus-4-5-20251101'], - timeout: 60000, - retries: 2, - thresholds: { - accuracy: 0.8, - completeness: 0.85, - safety: 1.0, - }, -}; -``` - -#### Step 3: write test cases - -Create cases that test different scenarios: - -- Happy path: normal usage -- Edge cases: boundary conditions -- Error cases: invalid inputs -- Complex cases: multi-step requirements - -#### Step 4: define expected behaviors - -Be specific about what constitutes success: - -- Must Include: required for passing -- Should Include: expected but not required -- Should Not Include: negative indicators -- Must Not Include: automatic failures - -### Best practices - -#### 1) Test the edges - -```markdown -## Edge Case: Zero Liquidity - -Create a hook that handles pools with zero liquidity. - -## Requirements - -1. Check liquidity before routing -2. Revert gracefully if no liquidity -3. Emit appropriate error event -``` - -#### 2) Be specific about security - -```markdown -## Must Not Include - -- [ ] Unchecked external calls -- [ ] Integer overflow risks -- [ ] Reentrancy vulnerabilities -- [ ] Hardcoded secrets -``` - -#### 3) Version your evals - -Track changes to evals alongside skill changes to maintain consistency. - -#### 4) Document failures - -When an eval fails, document why: - -```markdown -## Known Issues - -- v1.0.0: Fails on pools with < 1e6 liquidity (fixed in v1.1.0) -``` - -### Example: complete eval case - -#### Case file - -```markdown -# Dynamic Fee Hook - -Create a hook that adjusts fees based on volatility. - -## Context - -- Pool: WETH/USDC -- Chain: Ethereum mainnet -- Volatility source: On-chain oracle - -## Requirements - -1. Read volatility from oracle -2. Calculate fee based on volatility brackets -3. Apply fee in beforeSwap -4. Track fee revenue -``` - -#### Expected file - -```markdown -# Expected Behaviors - -## Must Include - -- [ ] Implements beforeSwap callback -- [ ] Reads from volatility oracle -- [ ] Applies fee adjustment logic -- [ ] Emits FeeAdjusted event - -## Should Include - -- [ ] Handles oracle failures gracefully -- [ ] Uses appropriate data types -- [ ] Includes NatSpec documentation - -## Must Not Include - -- [ ] Hardcoded volatility values -- [ ] Unbounded fee calculations -- [ ] Missing access controls -``` diff --git a/ai-toolkit/meta.json b/ai-toolkit/meta.json deleted file mode 100644 index bf0731f..0000000 --- a/ai-toolkit/meta.json +++ /dev/null @@ -1,5 +0,0 @@ -{ - "title": "AI Toolkit", - "root": true, - "pages": ["overview", "skills", "evals", "contributions"] -} diff --git a/meta.json b/meta.json index aef319d..2afc689 100644 --- a/meta.json +++ b/meta.json @@ -1,4 +1,4 @@ { "title": "Documentation", - "pages": ["get-started", "trading", "liquidity", "ai-tooling", "api", "protocols", "sdks", "unichain", "ecosystem"] + "pages": ["get-started", "trading", "liquidity", "uniswap-ai", "api", "protocols", "sdks", "unichain", "ecosystem"] } diff --git a/ai-toolkit/contributions.mdx b/uniswap-ai/contributions.mdx similarity index 88% rename from ai-toolkit/contributions.mdx rename to uniswap-ai/contributions.mdx index 1474c38..ace5a4c 100644 --- a/ai-toolkit/contributions.mdx +++ b/uniswap-ai/contributions.mdx @@ -1,6 +1,6 @@ --- title: Contributions -description: Contribute skills, evals, and improvements to the Uniswap AI toolkit. +description: Contribute skills, plugins, and improvements to Uniswap AI. --- Thank you for your interest in contributing to Uniswap AI! This guide will help you get started. @@ -135,19 +135,7 @@ mkdir -p packages/plugins/uniswap-hooks/skills/my-skill Each skill needs: - `SKILL.md` - The skill definition -- Corresponding eval suite in `evals/suites/` - -## Eval Requirements - -All new skills **must** have corresponding evaluation suites: - -```bash -# Create eval suite for new skill -mkdir -p evals/suites/my-skill/cases -mkdir -p evals/suites/my-skill/expected -``` - -See [Evals](/docs/ai-toolkit/evals) for details. +- Validation through lint, tests, and documentation updates ## PR workflow highlights @@ -157,7 +145,7 @@ Use descriptive branch names: ```text feature/add-v4-security-skill -fix/eval-timeout-issue +fix/plugin-timeout-issue docs/update-installation-guide ``` @@ -167,7 +155,7 @@ Use Conventional Commits for both commit messages and PR titles: ```text feat(hooks): add dynamic fee hook skill -fix(evals): increase timeout for slow tests +fix(trading): improve swap integration guidance docs: update installation guide ``` @@ -180,7 +168,6 @@ Typical checks include: - Format - Tests - Plugin validation -- Eval coverage for new skills ## Commit Message Format @@ -213,7 +200,6 @@ We use [Conventional Commits](https://www.conventionalcommits.org/): - `trading` - uniswap-trading plugin - `viem` - uniswap-viem plugin - `driver` - uniswap-driver plugin -- `evals` - Evaluation framework - `docs` - Documentation - `ci` - CI/CD workflows diff --git a/uniswap-ai/meta.json b/uniswap-ai/meta.json new file mode 100644 index 0000000..4390581 --- /dev/null +++ b/uniswap-ai/meta.json @@ -0,0 +1,5 @@ +{ + "title": "Uniswap AI", + "root": true, + "pages": ["overview", "skills", "contributions"] +} diff --git a/ai-toolkit/overview.mdx b/uniswap-ai/overview.mdx similarity index 70% rename from ai-toolkit/overview.mdx rename to uniswap-ai/overview.mdx index c62fd50..ec6c212 100644 --- a/ai-toolkit/overview.mdx +++ b/uniswap-ai/overview.mdx @@ -13,11 +13,11 @@ The [Uniswap AI](https://github.com/uniswap/uniswap-ai) is an open-source collec | Plugin | Description | | --- | --- | -| **uniswap-trading** | Integrate Uniswap swaps via the Uniswap API, Universal Router SDK, or direct smart contract calls. Covers Permit2 patterns, ERC-4337 smart accounts, and multi-chain support. | -| **uniswap-hooks** | Security-first guidance for building Uniswap v4 hooks. Includes vulnerability patterns (NoOp attacks, delta accounting), permission flag risk matrices, and aggregator hook templates for routing through external DEX liquidity. | -| **uniswap-viem** | EVM blockchain integration using [viem](https://viem.sh) and [wagmi](https://wagmi.sh). Covers client setup, contract interactions, account management, and React hooks. | -| **uniswap-driver** | Discover tokens and plan Uniswap swaps or liquidity positions. Generates deep links that open directly in the Uniswap interface with parameters pre-filled. | -| **uniswap-cca** | Configure and deploy Continuous Clearing Auction (CCA) smart contracts for token distribution. Interactive configuration, validation, and Foundry deployment scripts. | +| **uniswap-trading** | Integrate swaps via the [Uniswap API](https://developers.uniswap.org/dashboard), Universal Router SDK, or direct contract calls. Includes the `pay-with-any-token` skill for 402 challenge (supports x402 and MPP). | +| **uniswap-hooks** | Security-first guidance for building Uniswap v4 hooks. | +| **uniswap-viem** | EVM integration with viem and wagmi. | +| **uniswap-driver** | Token discovery and swap/liquidity planning with deep links. | +| **uniswap-cca** | Configure and deploy CCA contracts for token distribution. | ### Official skills @@ -36,7 +36,7 @@ These are the official skills currently published in Uniswap AI: Uniswap AI is available through [skills.sh](https://skills.sh), a cross-platform CLI for installing AI agent skills: ```bash -npx skills add uniswap/uniswap-ai +npx skills add Uniswap/uniswap-ai ``` This works with any AI coding agent that supports skill files, not just Claude Code. @@ -50,11 +50,11 @@ If you use [Claude Code](https://claude.ai/code), first add the Uniswap marketpl /plugin marketplace add uniswap/uniswap-ai # Install individual plugins -claude plugin add uniswap-hooks # v4 hook development -claude plugin add uniswap-trading # Swap integration -claude plugin add uniswap-viem # EVM / viem / wagmi -claude plugin add uniswap-driver # Token discovery & deep links -claude plugin add uniswap-cca # CCA auction configuration +/plugin install uniswap-hooks # v4 hook development +/plugin install uniswap-trading # Swap integration (+ pay-with-any-token skill) +/plugin install uniswap-viem # EVM / viem / wagmi +/plugin install uniswap-driver # Token discovery & deep links +/plugin install uniswap-cca # CCA auction configuration ``` Once installed, the plugins activate automatically when relevant to your task. You can also invoke specific skills directly, for example, `/uniswap-hooks:v4-security-foundations` for a security-first walkthrough of hook development. @@ -107,6 +107,6 @@ Install the Uniswap AI plugins (see [above](#install-as-a-claude-code-plugin)) f ## Where to Go Next -- [Browse official skills](/docs/ai-toolkit/skills) -- [Run and write eval suites](/docs/ai-toolkit/evals) -- [Contribute to Uniswap AI](/docs/ai-toolkit/contributions) +- [Browse official skills](/docs/uniswap-ai/skills) +- [Contribute to Uniswap AI](/docs/uniswap-ai/contributions) +- [View the Uniswap AI repository](https://github.com/Uniswap/uniswap-ai) diff --git a/ai-toolkit/skills.mdx b/uniswap-ai/skills.mdx similarity index 90% rename from ai-toolkit/skills.mdx rename to uniswap-ai/skills.mdx index 561e4fa..3f22e2d 100644 --- a/ai-toolkit/skills.mdx +++ b/uniswap-ai/skills.mdx @@ -12,6 +12,7 @@ These are the official skills currently available in the installer: - `configurator` - `deployer` - `liquidity-planner` +- `pay-with-any-token` - `swap-integration` - `swap-planner` - `v4-security-foundations` @@ -25,6 +26,7 @@ These are the official skills currently available in the installer: | `uniswap-cca` | `configurator` | Configure CCA auction parameters for a new deployment | `/configurator` | | `uniswap-cca` | `deployer` | Deploy CCA contracts using the factory deployment pattern | `/deployer` | | `uniswap-trading` | `swap-integration` | Integrate swaps using the Uniswap API, Universal Router, or direct smart contract calls | `/swap-integration` | +| `uniswap-trading` | `pay-with-any-token` | Pay HTTP 402 challenges (x402 and MPP) using tokens via Uniswap swaps | `/pay-with-any-token` | | `uniswap-viem` | `viem-integration` | Set up EVM clients and contract interactions with viem and wagmi | `/viem-integration` | | `uniswap-driver` | `swap-planner` | Plan token swaps and generate interface deep links | `/swap-planner` | | `uniswap-driver` | `liquidity-planner` | Plan LP positions and generate interface deep links | `/liquidity-planner` | @@ -34,7 +36,7 @@ These are the official skills currently available in the installer: Install Uniswap AI skills with the Skills CLI: ```bash -npx skills add uniswap/uniswap-ai +npx skills add Uniswap/uniswap-ai ``` If your environment supports interactive selection, you can choose specific skills from the install prompt. @@ -72,4 +74,4 @@ Each skill is defined in a `SKILL.md` file and typically includes: 1. Create the skill folder under `packages/plugins//skills//` 2. Add a `SKILL.md` file with frontmatter and clear instructions 3. Register the skill in the plugin manifest (`plugin.json`) -4. Add a corresponding eval suite in `evals/suites//` +4. Validate the skill with lint and test checks before opening a PR From 079c0128ac598937a57237d028163bb82a5bff65 Mon Sep 17 00:00:00 2001 From: Angela O <254776627+ocandocrypto-uniswap@users.noreply.github.com> Date: Mon, 23 Mar 2026 12:07:14 -0500 Subject: [PATCH 3/5] docs skills audit implementation --- uniswap-ai/overview.mdx | 19 +++++++++---------- 1 file changed, 9 insertions(+), 10 deletions(-) diff --git a/uniswap-ai/overview.mdx b/uniswap-ai/overview.mdx index ec6c212..f5940ac 100644 --- a/uniswap-ai/overview.mdx +++ b/uniswap-ai/overview.mdx @@ -1,19 +1,19 @@ --- title: Overview -description: Explore AI-powered tools, plugins, and LLM context files to build on Uniswap faster. +description: Get started with Uniswap AI plugins, skills, and LLM context for external developers. --- -Uniswap provides AI-powered development tools that help you integrate swaps, build v4 hooks, provide liquidity, and interact with the EVM, all from within your code editor. +Use Uniswap AI to speed up swap integration, hook development, and EVM workflows with tools designed for external developers. ## Uniswap AI -The [Uniswap AI](https://github.com/uniswap/uniswap-ai) is an open-source collection of plugins and skills that give AI coding agents deep knowledge of Uniswap protocols, APIs, and smart contracts. Instead of relying on general training data, these tools provide agents with up-to-date, protocol-specific guidance, including code patterns, security best practices, and deployment workflows. +The [Uniswap AI](https://github.com/uniswap/uniswap-ai) repository is an open-source collection of plugins and skills for coding agents. It provides protocol-specific guidance for Uniswap APIs and smart contracts. ### Available plugins | Plugin | Description | | --- | --- | -| **uniswap-trading** | Integrate swaps via the [Uniswap API](https://developers.uniswap.org/dashboard), Universal Router SDK, or direct contract calls. Includes the `pay-with-any-token` skill for 402 challenge (supports x402 and MPP). | +| **uniswap-trading** | Integrate swaps via the [Uniswap API](https://developers.uniswap.org/dashboard), Universal Router SDK, or direct contract calls. | | **uniswap-hooks** | Security-first guidance for building Uniswap v4 hooks. | | **uniswap-viem** | EVM integration with viem and wagmi. | | **uniswap-driver** | Token discovery and swap/liquidity planning with deep links. | @@ -26,24 +26,23 @@ These are the official skills currently published in Uniswap AI: - `configurator` - `deployer` - `liquidity-planner` +- `pay-with-any-token` - `swap-integration` - `swap-planner` - `v4-security-foundations` - `viem-integration` -### Install with the Skills CLI +## Install Uniswap AI -Uniswap AI is available through [skills.sh](https://skills.sh), a cross-platform CLI for installing AI agent skills: +### Skills CLI (any agent) ```bash npx skills add Uniswap/uniswap-ai ``` -This works with any AI coding agent that supports skill files, not just Claude Code. +### Claude Code marketplace -### Install as a Claude Code plugin - -If you use [Claude Code](https://claude.ai/code), first add the Uniswap marketplace, then install individual plugins: +If you use [Claude Code](https://claude.ai/code), add the marketplace and install plugins: ```bash # Add the Uniswap marketplace From c4671b8deb6390ad4b3eda6607c2eb746c46a040 Mon Sep 17 00:00:00 2001 From: Angela O <254776627+ocandocrypto-uniswap@users.noreply.github.com> Date: Mon, 23 Mar 2026 12:10:01 -0500 Subject: [PATCH 4/5] docs: small copy update --- uniswap-ai/overview.mdx | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/uniswap-ai/overview.mdx b/uniswap-ai/overview.mdx index f5940ac..8644403 100644 --- a/uniswap-ai/overview.mdx +++ b/uniswap-ai/overview.mdx @@ -1,9 +1,9 @@ --- title: Overview -description: Get started with Uniswap AI plugins, skills, and LLM context for external developers. +description: Get started with Uniswap AI plugins, skills, and LLM context. --- -Use Uniswap AI to speed up swap integration, hook development, and EVM workflows with tools designed for external developers. +Use Uniswap AI to speed up swap integration, hook development, and EVM workflows with tools designed for builders on Uniswap. ## Uniswap AI From 0cccb0b6305b6e63ea0cc12d2615a71b268d4220 Mon Sep 17 00:00:00 2001 From: Angela O <254776627+ocandocrypto-uniswap@users.noreply.github.com> Date: Mon, 23 Mar 2026 12:12:56 -0500 Subject: [PATCH 5/5] small copy fix --- uniswap-ai/contributions.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/uniswap-ai/contributions.mdx b/uniswap-ai/contributions.mdx index ace5a4c..eb90799 100644 --- a/uniswap-ai/contributions.mdx +++ b/uniswap-ai/contributions.mdx @@ -171,7 +171,7 @@ Typical checks include: ## Commit Message Format -We use [Conventional Commits](https://www.conventionalcommits.org/): +You can use [Conventional Commits](https://www.conventionalcommits.org/): ```text ():