Add agent-browser remote interop and browse-fleet-subagents skill #38

Open
shrey150 wants to merge 24 commits into main from codex/skills-agent-browser-remote-fleet

@shrey150 shrey150 commented Feb 27, 2026

Summary

  • add agent-browser-remote skill to teach explicit Browserbase CDP handoff for agent-browser
  • add browse-fleet-subagents skill as a sub-agent orchestration pattern (not a CLI primitive)
  • add a helper script (browserbase-session.mjs) to create/close Browserbase sessions and export CDP env vars
  • update browser skill docs to emphasize focused snapshots and parallel work via sub-agents
  • update skills catalog in README
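
To illustrate the "export CDP env vars" step, here is a minimal shell sketch of printing `export` lines that a caller can evaluate. It is not the script from this PR. `BROWSERBASE_SESSION_ID` matches the variable used in the skill's close command, while `BROWSERBASE_CDP_URL`, the function name `emit_session_exports`, and the sample values are hypothetical.

```bash
# Hypothetical helper: print export lines for a newly created session.
# $1 = session id, $2 = CDP websocket URL returned when the session was created.
# BROWSERBASE_CDP_URL is an assumed variable name, not confirmed by this PR.
emit_session_exports() {
  session_id="$1"
  cdp_url="$2"
  # Single-quote the values so characters such as & and ? stay literal.
  # (Values that themselves contain a single quote are not handled here.)
  printf "export BROWSERBASE_SESSION_ID='%s'\n" "$session_id"
  printf "export BROWSERBASE_CDP_URL='%s'\n" "$cdp_url"
}

# Load the variables into the current shell (sample values only).
eval "$(emit_session_exports "sess_123" "wss://example.invalid/devtools?token=abc&x=1")"
echo "$BROWSERBASE_SESSION_ID"
```

Printing `export` lines rather than setting variables directly lets a child process hand values back to the parent shell, which is presumably why a create command would expose them this way.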

Naming decisions

  • keep agent-browser-remote for discoverability on explicit interop intent (agent-browser + remote/CDP)
  • use browse-fleet-subagents to make the sub-agent orchestration model explicit in the slug
  • avoid introducing a redundant stealth alias skill in this PR

Notes

  • docs intentionally avoid teaching legacy single-command fanout patterns
  • no CLI behavior changes in this repo (skills/docs/script only)

Note

Low Risk
Mostly documentation and skill catalog updates, plus a small standalone Node script that creates/closes Browserbase sessions via API; no runtime/CLI behavior in this repo is modified. Risk is limited to potential user confusion or session leaks if the helper script is misused.

Overview
Adds two new skills: agent-browser-remote (teaches explicit Browserbase CDP handoff for agent-browser, including a browserbase-session.mjs helper to create/close sessions and export env vars) and browse-fleet-subagents (documents a sub-agent fanout pattern with retries and cleanup).
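
The control flow behind "fanout with retries and cleanup" can be sketched as follows. This is only an illustration of the shape: in the skill the workers are sub-agents, not shell jobs, and background processes stand in for them here. The names `retry`, `worker`, and `WORK_DIR` are hypothetical.

```bash
# Retry a command up to $1 times; return 0 as soon as one attempt succeeds.
retry() {
  max_attempts="$1"; shift
  attempt=1
  until "$@"; do
    [ "$attempt" -ge "$max_attempts" ] && return 1
    attempt=$((attempt + 1))
  done
}

# Placeholder worker: a real one would drive its own browser session.
worker() {
  echo "processed $1" > "$WORK_DIR/$1.out"
}

WORK_DIR="$(mktemp -d)"
# Cleanup runs when the script exits, even if a worker failed.
trap 'rm -rf "$WORK_DIR"' EXIT

failed=0
pids=""
# Fan out: one background job per target, each with its own retry budget.
for target in a b c; do
  retry 3 worker "$target" &
  pids="$pids $!"
done
# Fan in: wait for every job and count the ones that never succeeded.
for pid in $pids; do
  wait "$pid" || failed=$((failed + 1))
done
echo "failed workers: $failed"
```

Registering cleanup before starting any worker means a failure partway through the fanout still releases resources, which is the session-leak risk the summary above calls out.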

Replaces the prior browser-automation skill/docs with a new browser skill centered on the daemon-based browse CLI (focused snapshots, remote-mode guidance, and examples), and updates the marketplace/README skill catalog accordingly. Updates the functions skill docs (adds MIT license, streamlines SKILL.md, and moves invocation/patterns/troubleshooting into skills/functions/REFERENCE.md).

Written by Cursor Bugbot for commit 9e5c1a3. This will update automatically on new commits.

openclaw and others added 22 commits February 21, 2026 17:10
No agent (Claude Code, OpenCode, OpenClaw, Cursor, Windsurf) reads
setup.json — only SKILL.md is loaded into context. The setup.json
pattern was invisible to all agents and therefore broken everywhere.

Replace with standard frontmatter fields:
- requires.bins: [browser] — agents skip/suppress the skill if the
  binary isn't installed, rather than always injecting it and hoping
  the agent notices setupComplete: false
- install.kind/pkg — agents that support auto-install can invoke
  npm install @browserbasehq/stagehand-cli automatically

Also remove the ".env file" reference for credentials — the env vars
are read from the environment, not specifically from a .env file.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ed frontmatter

requires.bins and install.kind/pkg are not part of any agent skills spec.
Replace with:
- compatibility field (valid per agentskills.io spec) to surface requirements
- inline `which browser || npm install` check in the skill body so the
  agent can self-heal without relying on non-standard frontmatter fields
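
A self-healing check of this kind could look roughly like the sketch below. It uses `command -v` instead of `which` because `command -v` is specified by POSIX, and it takes the install command as arguments so that no package name is hard-coded. `ensure_bin` is a hypothetical name, not something defined in the skill.

```bash
# Run the given install command only when the binary is missing from PATH.
# Usage: ensure_bin <binary> <install command...>
ensure_bin() {
  bin="$1"; shift
  if command -v "$bin" >/dev/null 2>&1; then
    return 0
  fi
  echo "$bin not found, running: $*" >&2
  "$@"
}

# `sh` is always on PATH, so the fallback command is never run here.
ensure_bin sh false && echo "sh is available"
```

With the names adopted later in this PR, the call would be something like `ensure_bin browse npm install -g @browserbasehq/browse-cli`. The `-g` flag is an assumption.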

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The agentskills.io spec requires the directory name to match the name
field in SKILL.md frontmatter. Since the skill is named "browser"
(invoked as /browser), the directory should be browser/ not browser-automation/.

Also fix inconsistent heading levels (### → ##).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Validated both skills against the skills-ref reference library and
fixed all issues found:

- Quote compatibility YAML value to fix strictyaml parse error
- Rewrite functions description with trigger keywords (schedule, webhook,
  cloud, cron, Browserbase Functions) -- the spec requires triggers in
  the description, not in the body
- Split functions SKILL.md into SKILL.md + REFERENCE.md for progressive
  disclosure (invocation examples, common patterns, troubleshooting)
- Remove "When to Use" body section from functions (redundant with
  description, invisible during skill discovery)
- Add license: Apache-2.0 and LICENSE.txt to both skills
- Add table of contents to browser REFERENCE.md (535 lines)
- Condense browser EXAMPLES.md from 8 repetitive examples to 4 diverse ones

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Update copyright year from 2025 to 2026 in both LICENSE.txt files
- Fix package name from @browserbasehq/stagehand-cli (doesn't exist)
  to @browserbasehq/browse-cli (actual npm package)
- Update CLI command from `browser` to `browse` to match the npm
  package binary name across SKILL.md, REFERENCE.md, and EXAMPLES.md

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add anti-bot stealth, CAPTCHA solving, residential proxy, and session
persistence details to the browser skill description and mode comparison
table. These trigger phrases help AI agents discover and select this
skill when users need to interact with protected or JavaScript-heavy
websites.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add requires.bins (browse CLI) and install spec so OpenClaw can gate
  the skill properly and auto-install the CLI
- Add homepage for ClawHub trust score
- Do NOT gate on env vars — local mode works without Browserbase keys
- Rewrite "Environment Selection" section with clear guidance on when
  to use local mode (simple pages) vs remote mode (protected sites,
  CAPTCHAs, bot detection, Cloudflare, geo-restricted content)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Session logs show the agent screenshots after every action (expensive,
slow) and ignores browse act/observe in favor of manual snapshot → click
ref loops. Update the skill to:

- Document browse snapshot and recommend it as default over screenshot
- Add guidance on when to use snapshot vs screenshot
- Steer toward browse act/observe over low-level ref-based commands
- Rewrite best practices to reflect snapshot-first, act-first workflow

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The SKILL.md documented commands (navigate, act, extract, observe) that
don't exist in the CLI. The actual commands are open, click, type, fill,
snapshot, etc. This caused agents to run nonexistent commands and fall
back to guessing.

- Replace all command docs with actual CLI syntax from browse --help
- Document snapshot-first workflow with element refs
- Add session management commands (stop, status, pages, tab_switch)
- Add "No active page" to troubleshooting
- Fix quick example to use real commands

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
browse stop doesn't always kill the daemon process. Add pkill fallback
for when the daemon is stuck with wsUrl: "unknown" after a SIGTERM.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… CLI commands

- Rename browser-automation → browser in README.md and marketplace.json
- Rewrite EXAMPLES.md: replace nonexistent commands (navigate, act,
  extract, observe, close) with real browse CLI commands (open, snapshot,
  click, type, fill, get, stop). 4 concrete examples including remote
  mode escalation.
- Rewrite REFERENCE.md: replace Stagehand/Playwright architecture with
  actual daemon-based CLI docs, all 20 real commands, env var config.
- SKILL.md: add "Activating Remote Mode" progressive disclosure section,
  fix get text to require selector argument.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add browse mode to SKILL.md commands list and REFERENCE.md. Rewrite
"Activating Remote Mode" as "Switching Between Local and Remote Mode"
using browse mode as the primary mechanism.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…rowse-cli; gate on 'browse' bin only

- Drop OpenClaw install block to avoid failed installs
- Make compatibility + setup check generic (no npm package name)
- Generalize REFERENCE note (avoid browse-cli v0.1.4 mention)

Co-authored-by: Kyle Jeong <Kylejeong2@users.noreply.github.com>
…basehq/browse-cli; gate on 'browse' bin only

- Drop OpenClaw install block to avoid failed installs
- Make compatibility + setup check generic (no npm package name)
- Generalize REFERENCE note (avoid browse-cli v0.1.4 mention)"

This reverts commit 5b74333.
Remove: get html, drag, highlight, is, execute references.
Add docs for: hover, newpage, eval, viewport, network capture,
snapshot --compact, screenshot --full-page, type --delay/--mistakes,
fill --no-press-enter, stop --force. Replace get html with get value
in SKILL.md.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ked commands

These commands are functional again after PR browserbase/stagent-cli#11
fixed the daemon startup (EPIPE crash) and restored selector command surfaces.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…NCE.md

These CLI features were verified working in browse-cli v0.1.5 but were
missing from the reference documentation:
- `refs` command for cached ref map lookup
- `open --wait` flag (networkidle/domcontentloaded) for SPAs
- `--json` global flag for structured output
- `--session` global flag for concurrent browser sessions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Match the #### heading + description + code block pattern used
throughout the rest of REFERENCE.md.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…hing

- Add Table of Contents to REFERENCE.md (matches functions/REFERENCE.md
  and top skills like terraform-skill, mcp-builder)
- Remove Typical Workflow and Local vs Remote Mode sections from
  REFERENCE.md — these were near-identical copies of SKILL.md content
- Condense "Switching Between Local and Remote Environment" in SKILL.md
  from 38 lines to 16 — keeps the signal detection list and credential
  setup, drops redundant env commands already shown in Commands section

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
shrey150 changed the title from "Add agent-browser remote interop and sub-agent fleet skills" to "Add agent-browser remote interop and browse-fleet-subagents skill" on Feb 27, 2026
@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.

"strict": false,
"skills": [
"./skills/browser-automation"
"./skills/browser"

New skills not registered in marketplace discovery catalog

High Severity

The two new skills (agent-browser-remote and browse-fleet-subagents) are added with full SKILL.md files and listed in the README.md skills table, but neither is registered in marketplace.json. Since marketplace.json is the discovery catalog that Claude Code uses to find and load skills, these skills won't be discoverable or installable through the plugin system. The browse plugin's skills array only contains ./skills/browser.
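
A check like the following could catch this kind of drift before review. It is a sketch under two assumptions: every skill directory contains a `SKILL.md`, and catalog entries are written as `"./skills/<name>"` as in the diff above. The function name `find_unregistered_skills` and the temporary layout are hypothetical.

```bash
# Print the name of every skill directory (one containing SKILL.md) that is
# not referenced in the catalog. $1 = skills root, $2 = path to marketplace.json.
find_unregistered_skills() {
  skills_root="$1"
  catalog="$2"
  for skill_md in "$skills_root"/*/SKILL.md; do
    [ -f "$skill_md" ] || continue
    name="$(basename "$(dirname "$skill_md")")"
    # -F matches the quoted entry literally, so "." is not treated as a wildcard.
    grep -qF "\"./skills/$name\"" "$catalog" || echo "$name"
  done
}

# Demo against a throwaway layout that mirrors the situation described above.
demo="$(mktemp -d)"
mkdir -p "$demo/skills/browser" "$demo/skills/agent-browser-remote"
touch "$demo/skills/browser/SKILL.md" "$demo/skills/agent-browser-remote/SKILL.md"
printf '{ "skills": [ "./skills/browser" ] }\n' > "$demo/marketplace.json"
missing="$(find_unregistered_skills "$demo/skills" "$demo/marketplace.json")"
echo "$missing"   # prints: agent-browser-remote
rm -rf "$demo"
```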

When done:

```bash
node scripts/browserbase-session.mjs close --session-id "$BROWSERBASE_SESSION_ID"
```

Script path unresolvable from user working directory

High Severity

The SKILL.md instructs Claude to run node scripts/browserbase-session.mjs as a relative path, but the script lives at skills/agent-browser-remote/scripts/browserbase-session.mjs relative to the repo root. When Claude Code follows these bash instructions, commands execute in the user's project directory, where no scripts/browserbase-session.mjs exists. Every other skill in this repo uses globally-installed CLI tools (e.g., browse, pnpm bb) rather than relative file paths.
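
One way to sidestep the working-directory problem is to anchor the script path at the skill's own directory. The sketch below assumes a variable such as `SKILL_DIR` holds the skill's install location. This PR does not define such a variable, and both the example value and the function name `skill_script_path` are hypothetical.

```bash
# Build a path to a file bundled with a skill, anchored at the skill directory
# instead of the caller's working directory.
# $1 = skill directory, $2 = path relative to that directory.
skill_script_path() {
  skill_dir="$1"
  rel_path="$2"
  # Strip one trailing slash so the joined path has no doubled separator.
  printf '%s/%s\n' "${skill_dir%/}" "$rel_path"
}

SKILL_DIR="/opt/skills/agent-browser-remote"   # example value only
script="$(skill_script_path "$SKILL_DIR" "scripts/browserbase-session.mjs")"
# Show the command that would be run; it is printed here, not executed.
echo "node $script close --session-id \"\$BROWSERBASE_SESSION_ID\""
```

The alternative implied by the comment above is to ship the helper as a globally installed CLI, which removes the path question entirely.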
