Draft
79 commits
9daa584
add playwright arguments to agent (#1066)
tkattkat Sep 10, 2025
f6f05b0
[docs] add info on not needing project id in browserbase session para…
chrisreadsf Sep 11, 2025
c886544
Export aisdk (#1058)
chrisreadsf Sep 15, 2025
87505a3
docs: update fingerprint settings to reflect the new session create c…
Kylejeong2 Sep 15, 2025
3c39a05
[docs] export aisdk (#1074)
chrisreadsf Sep 16, 2025
bf2d0e7
Fix zod peer dependency support (#1032)
miguelg719 Sep 16, 2025
7f38b3a
add stagehand agent to api (#1077)
tkattkat Sep 16, 2025
3a0dc58
add playwright screenshot option for browserbase env (#1070)
derekmeegan Sep 17, 2025
b7be89e
add webbench, chrome-based OS world, and ground truth to web voyager …
filip-michalsky Sep 18, 2025
df76f7a
Fix python installation instructions (#1087)
rsbryan Sep 19, 2025
b9c8102
update xpath in `observe_vantechjournal` (#1088)
seanmcguire12 Sep 20, 2025
536f366
Fix session create logs on api (#1089)
miguelg719 Sep 21, 2025
8ff5c5a
Improve failed act logs (#1090)
miguelg719 Sep 21, 2025
13b0603
initial commit
tkattkat Sep 22, 2025
c097fba
update agent types
tkattkat Sep 22, 2025
ae974a5
update logger type
tkattkat Sep 22, 2025
569e444
[docs] add aisdk workaround before npm release + add versions to work…
chrisreadsf Sep 22, 2025
4b26bf1
update log levels
tkattkat Sep 22, 2025
d6434d6
remove logger helper, and use inline
tkattkat Sep 22, 2025
a4b277e
extract changes
tkattkat Sep 22, 2025
0f376a2
remove unnecessary return values from tools
tkattkat Sep 22, 2025
5801b85
clean up action handler types
tkattkat Sep 22, 2025
9ad0e6d
remove aria tree caching
tkattkat Sep 22, 2025
5dc0d1d
move system prompt to prompt.ts
tkattkat Sep 22, 2025
8c0fd01
pass stagehand, instead of stagehandPage to agent (#1082)
tkattkat Sep 22, 2025
edce0cc
Merge remote-tracking branch 'origin/main' into stagehand-agent-impro…
tkattkat Sep 22, 2025
b2742dd
remove unnecessary type casting
tkattkat Sep 22, 2025
ab1ec2d
remove more unnecessary type casting
tkattkat Sep 22, 2025
dc2d420
img diff algo for screenshots (#1072)
filip-michalsky Sep 23, 2025
01a7de3
update aria tool
tkattkat Sep 23, 2025
f89b13e
Eval metadata (#1092)
miguelg719 Sep 23, 2025
47de9ea
update hasSearch
tkattkat Sep 23, 2025
2a5a5b6
improve tool return values
tkattkat Sep 23, 2025
4ff715b
remove unnecessary params
tkattkat Sep 23, 2025
4760b07
add store actions flag
tkattkat Sep 23, 2025
44be5e3
Merge remote-tracking branch 'origin/main' into stagehand-agent-impro…
tkattkat Sep 23, 2025
a8eddcc
merge changes
tkattkat Sep 23, 2025
61cede9
update goto
tkattkat Sep 23, 2025
a76407d
temp remove store actions flag
tkattkat Sep 23, 2025
c321896
add execution model to eval runner
tkattkat Sep 23, 2025
8a33ebc
add store actions flag
tkattkat Sep 23, 2025
843b60d
update prompt
tkattkat Sep 23, 2025
793c3d1
replace any types with proper typing
tkattkat Sep 23, 2025
6754ead
remove execution model
tkattkat Sep 23, 2025
4fd7bca
add exa dependency
tkattkat Sep 23, 2025
8bcf453
update docs
tkattkat Sep 24, 2025
4b3239d
changeset
tkattkat Sep 24, 2025
887e5d5
update prompt on drag and drop
tkattkat Sep 24, 2025
7fbbbf6
temp disable tool filtering
tkattkat Sep 24, 2025
966d92b
temp disable model routing
tkattkat Sep 24, 2025
529f226
add back model routing
tkattkat Sep 24, 2025
f8fdd5c
add move
tkattkat Sep 24, 2025
0125390
prompt changes
tkattkat Sep 24, 2025
a7bf3a7
update prompts
tkattkat Sep 24, 2025
97cba02
adjust prompt
tkattkat Sep 25, 2025
eb18df9
increase checkpoint interval
tkattkat Sep 25, 2025
937b378
update prompt
tkattkat Sep 25, 2025
ab941b1
update prompt
tkattkat Sep 25, 2025
108de3c
update evals cli docs (#1096)
miguelg719 Sep 26, 2025
a77eeea
change scroll to be percentage based
tkattkat Sep 26, 2025
d1821cb
adjust prompt
tkattkat Sep 26, 2025
ff0d942
update screenshot tool
tkattkat Sep 26, 2025
e0e6b30
adding support for new claude 4.5 sonnet agent model (#1099)
Kylejeong2 Sep 29, 2025
919eb3b
custom tool schema sbased on model
tkattkat Sep 29, 2025
057bdca
update tool routing
tkattkat Sep 29, 2025
9a8ee76
update scroll
tkattkat Sep 29, 2025
889cb6c
properly convert custom / mcp tools to anthropic cua format (#1103)
tkattkat Oct 1, 2025
a99aa48
Add current date and page url to agent context (#1102)
miguelg719 Oct 1, 2025
a1ad06c
Additional agent logging (#1104)
miguelg719 Oct 1, 2025
0791404
Include import statements in extract code examples (#1105)
victlue Oct 4, 2025
3ccf335
fix: missing URLs for `extract()` with array schema (#1107)
seanmcguire12 Oct 6, 2025
dda52f1
Support for new Gemini Computer Use Models (#1110)
miguelg719 Oct 7, 2025
9a29937
google cua docs (#1111)
jay-sahnan Oct 7, 2025
34da7d3
Version Packages (#1062)
github-actions[bot] Oct 7, 2025
ec5317c
Fix Python example in observe.mdx (#1113)
rsbryan Oct 7, 2025
c0fbc51
set default viewport when running on browserbase (#1114)
seanmcguire12 Oct 8, 2025
7da5b55
Version Packages (#1115)
github-actions[bot] Oct 8, 2025
46c0fa2
Merge remote-tracking branch 'origin/main' into stagehand-agent-impro…
tkattkat Oct 8, 2025
ceae3bd
fix type errors from merge
tkattkat Oct 8, 2025
5 changes: 0 additions & 5 deletions .changeset/pink-snakes-sneeze.md

This file was deleted.

5 changes: 5 additions & 0 deletions .changeset/six-oranges-report.md
@@ -0,0 +1,5 @@
---
"@browserbasehq/stagehand": patch
---

Enhanced Stagehand agent with smart model routing, an expanded toolset, and robust context management. For more information, see the [Stagehand agent docs](https://docs.stagehand.dev/basics/agent).
5 changes: 0 additions & 5 deletions .changeset/social-moles-wish.md

This file was deleted.

5 changes: 0 additions & 5 deletions .changeset/tired-cats-repeat.md

This file was deleted.

1 change: 1 addition & 0 deletions .gitignore
@@ -1,3 +1,4 @@
CLAUDE.md
node_modules/
/test-results/
/playwright-report/
41 changes: 38 additions & 3 deletions CHANGELOG.md
@@ -1,5 +1,43 @@
# @browserbasehq/stagehand

## 2.5.2

### Patch Changes

- [#1114](https://github.com/browserbase/stagehand/pull/1114) [`c0fbc51`](https://github.com/browserbase/stagehand/commit/c0fbc51a4b7e0b803af254501d2f89473124f0dc) Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - configure default viewport when running on browserbase

## 2.5.1

### Patch Changes

- [#1082](https://github.com/browserbase/stagehand/pull/1082) [`8c0fd01`](https://github.com/browserbase/stagehand/commit/8c0fd01c965a809b96c026f4674685e6445bc7d4) Thanks [@tkattkat](https://github.com/tkattkat)! - Pass stagehand object to agent instead of stagehand page

- [#1104](https://github.com/browserbase/stagehand/pull/1104) [`a1ad06c`](https://github.com/browserbase/stagehand/commit/a1ad06c5398db10db7a2a83075b808dc63a963f7) Thanks [@miguelg719](https://github.com/miguelg719)! - Fix logging for stagehand agent

- [#1066](https://github.com/browserbase/stagehand/pull/1066) [`9daa584`](https://github.com/browserbase/stagehand/commit/9daa58477111e1470f2b618a898738b5e1967cb6) Thanks [@tkattkat](https://github.com/tkattkat)! - Add playwright arguments to agent execute response

- [#1077](https://github.com/browserbase/stagehand/pull/1077) [`7f38b3a`](https://github.com/browserbase/stagehand/commit/7f38b3a3048ba28f81649c33c0d633c4853146bd) Thanks [@tkattkat](https://github.com/tkattkat)! - adds support for stagehand agent in the api

- [#1032](https://github.com/browserbase/stagehand/pull/1032) [`bf2d0e7`](https://github.com/browserbase/stagehand/commit/bf2d0e79da744b6b2a82d60e1ad05ca9fa811488) Thanks [@miguelg719](https://github.com/miguelg719)! - Fix for zod peer dependency support

- [#1014](https://github.com/browserbase/stagehand/pull/1014) [`6966201`](https://github.com/browserbase/stagehand/commit/6966201e2511eb897132d237d0b7712b48b3c7ab) Thanks [@tkattkat](https://github.com/tkattkat)! - Replace operator handler with base of new agent

- [#1089](https://github.com/browserbase/stagehand/pull/1089) [`536f366`](https://github.com/browserbase/stagehand/commit/536f366f868d115ffa84c2c92124ae05400dd8be) Thanks [@miguelg719](https://github.com/miguelg719)! - Fixed info logs on api session create

- [#1103](https://github.com/browserbase/stagehand/pull/1103) [`889cb6c`](https://github.com/browserbase/stagehand/commit/889cb6cec27f0fc07286a9263bdc4d559149a037) Thanks [@tkattkat](https://github.com/tkattkat)! - patch custom tool support in anthropic cua client

- [#1056](https://github.com/browserbase/stagehand/pull/1056) [`6a002b2`](https://github.com/browserbase/stagehand/commit/6a002b234dbf1ac7d1f180eeffdf66154fa7799b) Thanks [@chrisreadsf](https://github.com/chrisreadsf)! - remove need for duplicate project id if already passed to Stagehand

- [#1090](https://github.com/browserbase/stagehand/pull/1090) [`8ff5c5a`](https://github.com/browserbase/stagehand/commit/8ff5c5a4b2050fc581240ae1befcdc0cf9195873) Thanks [@miguelg719](https://github.com/miguelg719)! - Improve failed act error logs

- [#1014](https://github.com/browserbase/stagehand/pull/1014) [`6966201`](https://github.com/browserbase/stagehand/commit/6966201e2511eb897132d237d0b7712b48b3c7ab) Thanks [@tkattkat](https://github.com/tkattkat)! - replace operator agent with scaffold for new stagehand agent

- [#1107](https://github.com/browserbase/stagehand/pull/1107) [`3ccf335`](https://github.com/browserbase/stagehand/commit/3ccf335d943b43cd5249e4eeb5b1a8f2aff7fd3b) Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - fix: url extraction not working inside an array

- [#1102](https://github.com/browserbase/stagehand/pull/1102) [`a99aa48`](https://github.com/browserbase/stagehand/commit/a99aa48936ae3ce113172bce673809eaf5ef7ac1) Thanks [@miguelg719](https://github.com/miguelg719)! - Add current page and date context to agent

- [#1110](https://github.com/browserbase/stagehand/pull/1110) [`dda52f1`](https://github.com/browserbase/stagehand/commit/dda52f170de0bbbb6e9e684b2b0fa7c53fbe2ab9) Thanks [@miguelg719](https://github.com/miguelg719)! - Add support for new Gemini Computer Use models

## 2.5.0

### Minor Changes
@@ -233,15 +271,13 @@
We're thrilled to announce the release of Stagehand 2.0, bringing significant improvements to make browser automation more powerful, faster, and easier to use than ever before.

### 🚀 New Features

- **Introducing `stagehand.agent`**: A powerful new way to integrate SOTA Computer use models or Browserbase's [Open Operator](https://operator.browserbase.com) into Stagehand with one line of code! Perfect for multi-step workflows and complex interactions. [Learn more](https://docs.stagehand.dev/concepts/agent)
- **Lightning-fast `act` and `extract`**: Major performance improvements to make your automations run significantly faster.
- **Enhanced Logging**: Better visibility into what's happening during automation with improved logging and debugging capabilities.
- **Comprehensive Documentation**: A completely revamped documentation site with better examples, guides, and best practices.
- **Improved Error Handling**: More descriptive errors and better error recovery to help you debug issues faster.

### 🛠️ Developer Experience

- **Better TypeScript Support**: Enhanced type definitions and better IDE integration
- **Better Error Messages**: Clearer, more actionable error messages to help you debug faster
- **Improved Caching**: More reliable action caching for better performance
@@ -502,7 +538,6 @@
- [#316](https://github.com/browserbase/stagehand/pull/316) [`902e633`](https://github.com/browserbase/stagehand/commit/902e633e126a58b80b757ea0ecada01a7675a473) Thanks [@kamath](https://github.com/kamath)! - rename browserbaseResumeSessionID -> browserbaseSessionID

- [#296](https://github.com/browserbase/stagehand/pull/296) [`f11da27`](https://github.com/browserbase/stagehand/commit/f11da27a20409c240ceeea2003d520f676def61a) Thanks [@kamath](https://github.com/kamath)! - - Deprecate fields in `init` in favor of constructor options

- Deprecate `initFromPage` in favor of `browserbaseResumeSessionID` in constructor
- Rename `browserBaseSessionCreateParams` -> `browserbaseSessionCreateParams`

58 changes: 43 additions & 15 deletions docs/basics/agent.mdx
@@ -30,28 +30,28 @@ There are two ways to create agents in Stagehand:

### Computer Use Agents

Use computer use agents with specialized models from OpenAI or Anthropic:
Use computer use agents with specialized models from OpenAI, Anthropic, or Google:

<CodeGroup>
```typescript TypeScript
const agent = stagehand.agent({
provider: "anthropic",
model: "claude-sonnet-4-20250514",
provider: "google",
model: "gemini-2.5-computer-use-preview-10-2025",
instructions: "You are a helpful assistant that can use a web browser.",
options: {
apiKey: process.env.ANTHROPIC_API_KEY,
apiKey: process.env.GOOGLE_API_KEY,
},
});
await agent.execute("apply for a job at Browserbase")
```

```python Python
agent = stagehand.agent({
"provider": "anthropic",
"model": "claude-sonnet-4-20250514",
"provider": "google",
"model": "gemini-2.5-computer-use-preview-10-2025",
"instructions": "You are a helpful assistant that can use a web browser.",
"options": {
"apiKey": os.getenv("ANTHROPIC_API_KEY"),
"apiKey": os.getenv("GOOGLE_API_KEY"),
},
})
await agent.execute("apply for a job at Browserbase")
@@ -62,13 +62,41 @@ await agent.execute("apply for a job at Browserbase")

Use the agent without specifying a provider to utilize any model or LLM provider:

<Note>Non CUA agents are currently only supported in TypeScript</Note>
<Note>Stagehand agent is currently only supported in TypeScript</Note>

```typescript TypeScript
// Basic usage
const agent = stagehand.agent();
await agent.execute("apply for a job at Browserbase")
```

#### Recommended Configuration

For optimal performance, we recommend using Claude Sonnet 4 as the agent model with Gemini 2.5 Flash as the execution model:

```typescript TypeScript
const agent = stagehand.agent({
model: "anthropic/claude-sonnet-4-20250514", // Reliable reasoning and planning for the agent
executionModel: "google/gemini-2.5-flash", // Fast and reliable execution for stagehand primitives (act, extract, observe)
instructions: "You are a helpful assistant that can use a web browser.",
});

// Enable Claude-specific optimizations for best performance
await agent.execute({
instruction: "apply for a job at Browserbase",
storeActions: false, // Unlocks claude-specific tools
maxSteps: 25
});
```

<Tip>
**Why this configuration?** Claude Sonnet 4 provides excellent reasoning and planning, while Gemini 2.5 Flash offers fast execution for Stagehand primitives. Setting `storeActions: false` enables coordinate-based tools, giving a hybrid of Stagehand primitives and coordinate-based actions, but it removes the ability to turn your agent runs into repeatable, deterministic scripts.
</Tip>

<Note>
All configuration options are optional. The agent works well with default settings, but the configuration above is the one we recommend for best performance.
</Note>


## MCP Integrations

@@ -77,14 +105,14 @@ Agents can be enhanced with external tools and services through MCP (Model Conte
<CodeGroup>
```typescript TypeScript (Pass URL)
const agent = stagehand.agent({
provider: "openai",
model: "computer-use-preview",
provider: "google",
model: "gemini-2.5-computer-use-preview-10-2025",
integrations: [
`https://mcp.exa.ai/mcp?exaApiKey=${process.env.EXA_API_KEY}`,
],
instructions: `You have access to web search through Exa. Use it to find current information before browsing.`,
options: {
apiKey: process.env.OPENAI_API_KEY,
apiKey: process.env.GOOGLE_API_KEY,
},
});

@@ -99,12 +127,12 @@ const supabaseClient = await connectToMCPServer(
);

const agent = stagehand.agent({
provider: "openai",
model: "computer-use-preview",
provider: "google",
model: "gemini-2.5-computer-use-preview-10-2025",
integrations: [supabaseClient],
instructions: `You can interact with Supabase databases. Use these tools to store and retrieve data.`,
options: {
apiKey: process.env.OPENAI_API_KEY,
apiKey: process.env.GOOGLE_API_KEY,
},
});

@@ -123,7 +151,7 @@ Stagehand uses a 1024x768 viewport by default (the optimal size for Computer Use

## Available Models

Use specialized computer use models (e.g., `computer-use-preview` from OpenAI or `claude-sonnet-4-20250514` from Anthropic)
Use specialized computer use models (e.g., `gemini-2.5-computer-use-preview-10-2025` from Google or `claude-sonnet-4-20250514` from Anthropic)

<Card title="Available Models" icon="robot" href="/configuration/models">
Check out the guide on how to use different models with Stagehand.
18 changes: 17 additions & 1 deletion docs/basics/extract.mdx
@@ -36,6 +36,8 @@ Here is how an `extract` call might look for a single object:

<CodeGroup>
```typescript TypeScript
import { z } from 'zod/v3';

const item = await page.extract({
instruction: "extract the price of the item",
schema: z.object({
@@ -45,6 +47,8 @@ const item = await page.extract({
```

```python Python
from pydantic import BaseModel

class Extraction(BaseModel):
price: float

@@ -66,6 +70,8 @@ Here is how an `extract` call might look for a list of objects.

<CodeGroup>
```typescript TypeScript
import { z } from 'zod/v3';

const apartments = await page.extract({
instruction:
"Extract ALL the apartment listings and their details, including address, price, and square feet.",
@@ -84,6 +90,8 @@ console.log("the apartment list is: ", apartments);
```

```python Python
from pydantic import BaseModel

class Apartment(BaseModel):
address: str
price: str
@@ -180,6 +188,8 @@ You can provide additional context to your schema to help the model extract the

<CodeGroup>
```typescript TypeScript
import { z } from 'zod/v3';

const apartments = await page.extract({
instruction:
"Extract ALL the apartment listings and their details, including address, price, and square feet.",
@@ -196,6 +206,8 @@ const apartments = await page.extract({
```

```python Python
from pydantic import BaseModel, Field

class Apartment(BaseModel):
address: str = Field(..., description="the address of the apartment")
price: str = Field(..., description="the price of the apartment")
@@ -221,6 +233,8 @@ Here is how an `extract` call might look for extracting a link or URL. This also

<CodeGroup>
```typescript TypeScript
import { z } from 'zod/v3';

const extraction = await page.extract({
instruction: "extract the link to the 'contact us' page",
schema: z.object({
@@ -232,6 +246,8 @@ console.log("the link to the contact us page is: ", extraction.link);
```

```python Python
from pydantic import BaseModel, HttpUrl

class Extraction(BaseModel):
link: HttpUrl # note the usage of HttpUrl here

@@ -414,4 +430,4 @@ for page_num in page_numbers:
<Card title="Observe" icon="magnifying-glass" href="/basics/observe">
Analyze pages with observe()
</Card>
</CardGroup>
</CardGroup>
12 changes: 7 additions & 5 deletions docs/basics/observe.mdx
@@ -109,13 +109,15 @@ const { data } = await page.extract({
});
```
```python Python
# Use observe to validate elements before extraction
[ table ] = await page.observe("Find the data table")
# Use observe to find the specific section (table, form, list, etc.)
tables = await page.observe("Find the data table")
table = tables[0] # Get the first suggestion

# Extract data using the selector to minimize context
extraction = await page.extract(
"Extract data from the table",
schema=Data, # Pydantic schema
selector=table.selector # Reduce context scope needed for extraction
"Extract data from the table",
schema=TableData, # Pydantic schema
selector=table.selector # Focus extraction on just this table
)
```
</CodeGroup>
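The Python hunk above swaps list unpacking for explicit indexing. A minimal sketch of why that matters, using plain Python lists as stand-ins for `observe()` results: the `Suggestion` class and both helper functions are hypothetical illustrations, not Stagehand types, and the empty-list guard is an addition that the docs example does not include.

```python
from dataclasses import dataclass


@dataclass
class Suggestion:
    """Hypothetical stand-in for one observe() result."""
    selector: str


def first_by_unpacking(results):
    # Mirrors the old docs line: raises ValueError unless len(results) == 1.
    [table] = results
    return table


def first_by_index(results):
    # Mirrors the new docs lines: take the first suggestion.
    # The empty-list guard is an illustrative addition.
    if not results:
        raise LookupError("observe() returned no suggestions")
    return results[0]


# Two candidate tables, as observe() might plausibly return.
results = [Suggestion("xpath=//table[1]"), Suggestion("xpath=//table[2]")]

print(first_by_index(results).selector)

try:
    first_by_unpacking(results)
except ValueError as exc:
    print("unpacking failed:", exc)
```

Indexing tolerates any number of suggestions, while unpacking only succeeds when exactly one comes back, which is hard to guarantee for a natural-language query.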
20 changes: 4 additions & 16 deletions docs/configuration/browser.mdx
@@ -114,7 +114,7 @@ stagehand = Stagehand(
apiKey: process.env.BROWSERBASE_API_KEY,
projectId: process.env.BROWSERBASE_PROJECT_ID,
browserbaseSessionCreateParams: {
projectId: process.env.BROWSERBASE_PROJECT_ID!,
projectId: process.env.BROWSERBASE_PROJECT_ID!, // Optional: set automatically from the environment variable or the Stagehand projectId parameter
proxies: true,
region: "us-west-2",
timeout: 3600, // 1 hour session timeout
@@ -124,17 +124,11 @@ stagehand = Stagehand(
blockAds: true,
solveCaptchas: true,
recordSession: false,
os: "windows", // Valid: "windows" | "mac" | "linux" | "mobile" | "tablet"
viewport: {
width: 1920,
height: 1080,
},
fingerprint: {
browsers: ["chrome", "edge"],
devices: ["desktop"],
operatingSystems: ["windows", "macos"],
locales: ["en-US", "en-GB"],
httpVersion: 2,
},
},
userMetadata: {
userId: "automation-user-123",
@@ -149,7 +143,7 @@ stagehand = Stagehand(
api_key=os.getenv("BROWSERBASE_API_KEY"),
project_id=os.getenv("BROWSERBASE_PROJECT_ID"),
browserbase_session_create_params={
"project_id": os.getenv("BROWSERBASE_PROJECT_ID"),
"project_id": os.getenv("BROWSERBASE_PROJECT_ID"), # Optional: set automatically from the environment variable or the Stagehand project_id parameter
"proxies": True,
"region": "us-west-2",
"timeout": 3600, # 1 hour session timeout
@@ -159,17 +153,11 @@ stagehand = Stagehand(
"block_ads": True,
"solve_captchas": True,
"record_session": False,
"os": "windows", # "windows" | "mac" | "linux" | "mobile" | "tablet"
"viewport": {
"width": 1920,
"height": 1080,
},
"fingerprint": {
"browsers": ["chrome", "edge"],
"devices": ["desktop"],
"operating_systems": ["windows", "macos"],
"locales": ["en-US", "en-GB"],
"http_version": 2,
},
},
"user_metadata": {
"user_id": "automation-user-123",
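The new comments in `browser.mdx` say the session-level project ID is optional when it is already available from the environment or the Stagehand constructor. A rough sketch of how such a fallback could work: `resolve_project_id` is a hypothetical helper, and the precedence shown (session params, then constructor argument, then environment) is an assumption, not something the diff confirms about Stagehand's internals.

```python
import os


def resolve_project_id(session_params, stagehand_project_id=None, env=None):
    """Hypothetical helper: return the first project id that is set.

    Assumed precedence (not confirmed by the source): explicit session
    params, then the Stagehand constructor argument, then the environment.
    """
    env = os.environ if env is None else env
    return (
        session_params.get("project_id")
        or stagehand_project_id
        or env.get("BROWSERBASE_PROJECT_ID")
    )


# project_id omitted from the session params, supplied to Stagehand instead.
params = {"proxies": True, "region": "us-west-2"}
print(resolve_project_id(params, stagehand_project_id="proj_123", env={}))
```

Under this assumption, repeating `project_id` inside `browserbase_session_create_params` only matters when it should differ from the value Stagehand already has.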