@weeta-code

why

The Cerebras integration was non-functional in testing and outdated in both its model list and its implementation.

what changed

Changed how LLM calls are made with Cerebras: the client now uses the Vercel AI SDK instead of the OpenAI SDK. Also updated the list of Cerebras models.

test plan

I ran all Cerebras models through the following evals, and all passed: iframe_form_filling, amazon_add_to_cart, dropdown, extract_repo_name, allrecipes, imdb_movie_details, sciquest

tkattkat and others added 30 commits September 10, 2025 13:40
# why

solves browserbase#1060 
patch regression of playwright arguments being removed from agent
execute response

# what changed

agent.execute now returns playwright arguments in its response 

# test plan

tested locally
…ms to docs (browserbase#1065)

# why

reflect project id changes in docs

# what changed

advanced configuration comments

# test plan

reviewed via mintlify on localhost
# why

Easier to use for Custom LLM Clients and keep users up to date with our
aisdk file

# what changed

added export of aisdk to lib/index.ts

# test plan

build local stagehand, import local AISdkClient, run Azure Stagehand
session
…onfigu… (browserbase#1073)

…ration settings

# why

Updated docs to match the new fingerprint params in the Browserbase docs
here:
https://docs.browserbase.com/guides/stealth-customization#customization-options

# what changed

Update browser configuration docs to reflect the docs changes. 

# test plan
# why

Updating docs to reflect aisdk can be imported directly

# what changed

The model page

# test plan

Reviewed page with mintlify dev locally
# why

# what changed

# test plan
# why

Currently, we do not support stagehand agent within the api

# what changed

When api is enabled, stagehand agent now routes through the api 

# test plan

Tested locally
# why

Currently, using playwright screenshot command is not available when the
execution environment is Stagehand. A customer has indicated they would
prefer to use Playwright's native screenshot command instead of CDP when
using Browserbase as CDP screenshot causes unexpected behavior for their
target site.

# what changed

- added a StagehandScreenshotOptions type with useCDP argument added
- extended page type to accept custom stagehand screenshot options
- update screenshot proxy to default useCDP to true if the env is
browserbase and use playwright screenshot if false
- added eval for screenshot with and without cdp
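
A minimal sketch of the defaulting rule described in the list above. `StagehandScreenshotOptions` and `useCDP` are named in this change; the `Env` type, the other option fields, and the `resolveUseCDP` helper are hypothetical names used only for illustration, and the real screenshot proxy may be structured differently.

```ts
// Hypothetical environment label; the commit only says "if the env is browserbase".
type Env = "BROWSERBASE" | "LOCAL";

// Assumed shape: a couple of screenshot options plus the new useCDP flag.
interface StagehandScreenshotOptions {
  path?: string;
  fullPage?: boolean;
  useCDP?: boolean;
}

// Decide which screenshot path to take.
// Assumption: an explicit useCDP value always wins; otherwise CDP is the
// default only when running on Browserbase.
function resolveUseCDP(
  env: Env,
  options: StagehandScreenshotOptions = {},
): boolean {
  if (options.useCDP !== undefined) {
    return options.useCDP;
  }
  return env === "BROWSERBASE";
}
```

Under these assumptions, a caller on Browserbase who passes `{ useCDP: false }` would get Playwright's native screenshot path, which is the behavior the customer asked for.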

# test plan
- tested and confirmed functionality with eval and external example
script (not committed)
…rowserbase#1057)

# why

We want to build a best in class agent in stagehand.
Therefore, we need more eval benchmarks.

# what changed
- Added Web-bench evals dataset
- Added a subset of OS World evals - those that can be run in a chrome
browser (desktop-based tasks omitted)
- added LICENSE notices to the copied evals tasks
- Added ground truth / expected result to some WebVoyager tasks using
reference_answer.json from Browser Use public evals repo.

Improvements to `pnpm run evals -man` to better describe how to run
evals.

# test plan
Evals should run locally and bb for these new benchmarks.
# why
Initial instructions didn't mention uv or pip prerequisites and also
didn't mention venv. This fix reduces friction for first-timers.

# what changed
- added link to install uv
- added details for initializing venv
- adjusted code example accordingly

# test plan
docs change
# why
- webpage structure changed, needed to update the xpath in the expected
locator
… with LanguageModelV1 + LiteLLM works for python (browserbase#1086)

# why

1. aisdk not yet available through npm package
2. customLLM provider only works with LanguageModelV1
3. LiteLLM compatible providers are supported in python

# what changed

1. change docs to install stagehand from git repo
2. pin versions that use LanguageModelV1

# test plan

local test
# why

currently we pass stagehand page to agent, this results in our page
management having issues when facing new tabs

# what changed

the stagehand object is now passed instead of stagehandPage

# test plan

tested locally
# why

Our existing screenshot service is a dummy time-based triggered service.
It also does not trigger based on any actions of the agent.

# what changed
Added img hash diff algo (quick check with MSE, verify with SSIM algo)
to see if there was an actual UI change and only store ss in the buffer
if that is so.

Added ss interceptor which copies each screenshot the agent is taking to
a buffer (if different enough from the previous ss) to be later used for
evals.

- There's also a small refactor of the agent initialization config to
enable the screenshot collector service to be attached
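
The two-stage check described above (cheap MSE first, SSIM to confirm) might look roughly like the sketch below. It assumes screenshots have already been decoded into equally sized grayscale buffers, uses a simplified single-window SSIM rather than the usual sliding-window variant, and the function names and thresholds are hypothetical, not taken from the evals code.

```ts
// Mean squared error between two equally sized grayscale images (0-255 per pixel).
function mse(a: Uint8Array, b: Uint8Array): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("images must be non-empty and the same size");
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return sum / a.length;
}

// Simplified SSIM computed over the whole image as one window.
function globalSsim(a: Uint8Array, b: Uint8Array): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("images must be non-empty and the same size");
  }
  const n = a.length;
  let meanA = 0;
  let meanB = 0;
  for (let i = 0; i < n; i++) {
    meanA += a[i];
    meanB += b[i];
  }
  meanA /= n;
  meanB /= n;
  let varA = 0;
  let varB = 0;
  let cov = 0;
  for (let i = 0; i < n; i++) {
    const da = a[i] - meanA;
    const db = b[i] - meanB;
    varA += da * da;
    varB += db * db;
    cov += da * db;
  }
  varA /= n;
  varB /= n;
  cov /= n;
  // Standard SSIM stabilizing constants for 8-bit images.
  const c1 = (0.01 * 255) * (0.01 * 255);
  const c2 = (0.03 * 255) * (0.03 * 255);
  const numerator = (2 * meanA * meanB + c1) * (2 * cov + c2);
  const denominator = (meanA * meanA + meanB * meanB + c1) * (varA + varB + c2);
  return numerator / denominator;
}

// Hypothetical thresholds: skip SSIM when MSE is tiny, store only on low similarity.
function hasUiChanged(
  prev: Uint8Array,
  next: Uint8Array,
  mseThreshold = 30,
  ssimThreshold = 0.75,
): boolean {
  if (mse(prev, next) < mseThreshold) {
    return false; // quick check: images are nearly identical
  }
  return globalSsim(prev, next) < ssimThreshold; // verify with SSIM
}
```

The point of ordering the checks this way is that MSE is a single pass over the pixels, so most unchanged frames can be rejected before the more expensive similarity computation runs.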

# test plan
Tests pass locally

---------

Co-authored-by: Miguel <36487034+miguelg719@users.noreply.github.com>
Co-authored-by: miguel <miguelg71921@gmail.com>
# why
To help make sense of eval test cases and results

# what changed
Added metadata to eval runs, cleaned deprecated code

# test plan
# why

# what changed

# test plan
# why

anthropic released a new sota computer use model

# what changed

added claude-sonnet-4-5-20250929 as a model to the list

# test plan

ran evals
…ase#1103)

Why

Custom AI SDK tools and MCP integrations weren't working properly with
Anthropic CUA - parameters were empty {} and tools weren't tracked.

What Changed


- Convert Zod schemas to JSON Schema before sending to Anthropic (using
zodToJsonSchema)
- Track custom tool calls in the actions array
- Silence "Unknown tool name" warnings for custom tools

Test Plan

Tested with examples file. 

Parameters passed correctly ({"city":"San Francisco"} instead of {})
Custom tools execute and appear in actions array
No warnings
# why
To improve context

# what changed
Added current page and url to the system prompt

# test plan
# why
To inform the user throughout the agent execution process

# what changed
Added logs to tool calls, and on the stagehand agent handler

# test plan
- [x] tested locally
PR to make clearer the dependencies for `extract` (for those who haven't
used zod or pydantic before)

---------

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
# why
- before this change, when we convert `z.string().url()` to an ID, if it
was inside a `z.array()`, it was not getting converted back into a URL
- this meant that if you defined a schema like this:
```ts
schema: z.object({
  records: z.array(z.string().url()),
})
```
you would receive an array like this:
```
{
  records: [
    '0-302', '0-309',
    '0-316', '0-323',
    '0-330', '0-337',
    '0-344', '0-351',
    '0-358', '0-365'
  ]
}
```
- with this change, you will now receive the actual URLs, ie:
```
{
  records: [
    'https://www.archives.gov/files/research/jfk/releases/2025/0318/104-10003-10041.pdf',
    'https://www.archives.gov/files/research/jfk/releases/2025/0318/104-10004-10143%20(C06932208).pdf',
    'https://www.archives.gov/files/research/jfk/releases/2025/0318/104-10004-10143.pdf',
    'https://www.archives.gov/files/research/jfk/releases/2025/0318/104-10004-10156.pdf',
    'https://www.archives.gov/files/research/jfk/releases/2025/0318/104-10004-10213.pdf',
    'https://www.archives.gov/files/research/jfk/releases/2025/0318/104-10005-10321.pdf',
    'https://www.archives.gov/files/research/jfk/releases/2025/0318/104-10006-10247.pdf',
    'https://www.archives.gov/files/research/jfk/releases/2025/0318/104-10007-10345.pdf',
    'https://www.archives.gov/files/research/jfk/releases/2025/0318/104-10009-10021.pdf',
    'https://www.archives.gov/files/research/jfk/releases/2025/0318/104-10009-10222.pdf'
  ]
}
```

# what changed
- updated the `injectUrls` function so that when it hits an array and
there is no deeper path, it loops through the array and injects the
URLs
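
A minimal sketch of the array handling described above, assuming the IDs are looked up in a map from ID to URL. The name `injectUrlsSketch`, its signature, and its return-a-copy behavior are hypothetical; the real `injectUrls` may mutate in place or use a different path representation.

```ts
// Replace ID placeholders with their URLs along the given property path.
function injectUrlsSketch(
  node: unknown,
  path: string[],
  idToUrl: Map<string, string>,
): unknown {
  if (path.length === 0) {
    // No deeper path: the node is a single ID or an array of IDs.
    if (Array.isArray(node)) {
      return node.map((item) => injectUrlsSketch(item, [], idToUrl));
    }
    if (typeof node === "string") {
      return idToUrl.get(node) ?? node;
    }
    return node;
  }
  if (Array.isArray(node)) {
    // The path continues inside each array element.
    return node.map((item) => injectUrlsSketch(item, path, idToUrl));
  }
  if (node !== null && typeof node === "object") {
    const [key, ...rest] = path;
    const record = node as Record<string, unknown>;
    if (!(key in record)) {
      return node;
    }
    return { ...record, [key]: injectUrlsSketch(record[key], rest, idToUrl) };
  }
  return node;
}
```

With a schema like the `records` example above, calling this sketch with the path `["records"]` would turn each ID in the array back into its URL instead of leaving the raw IDs in place.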
# test plan
- evals
# why
Adding support for Gemini's new Computer Use model

# what changed
We partnered with Google Deepmind to help integrate and test their new
Computer Use models.

<img width="1238" height="655" alt="Screenshot 2025-10-07 at 1 14 44 PM"
src="https://github.com/user-attachments/assets/af0d854a-8e55-4937-a071-10335497f686"
/>

The new model tag `gemini-2.5-pro-computer-use-preview-10-2025` is
available for Stagehand Agent. You can try it today with the example
`cua-example.ts`

To learn more, check out the blog post
[https://www.browserbase.com/blog/evaluating-browser-agents](https://www.browserbase.com/blog/evaluating-browser-agents)

---------

Co-authored-by: tkattkat <tkat@tkat.net>
Co-authored-by: Kylejeong2 <kylejeong21@gmail.com>
Co-authored-by: Sameel <sameel.m.arif@gmail.com>
# why

# what changed

# test plan
This PR was opened by the [Changesets
release](https://github.com/changesets/action) GitHub action. When
you're ready to do a release, you can merge this and the packages will
be published to npm automatically. If you're not ready to do a release
yet, that's fine, whenever you add more changesets to main, this PR will
be updated.


# Releases
## @browserbasehq/stagehand@2.5.1

### Patch Changes

- [browserbase#1082](browserbase#1082)
[`8c0fd01`](browserbase@8c0fd01)
Thanks [@tkattkat](https://github.com/tkattkat)! - Pass stagehand object
to agent instead of stagehand page

- [browserbase#1104](browserbase#1104)
[`a1ad06c`](browserbase@a1ad06c)
Thanks [@miguelg719](https://github.com/miguelg719)! - Fix logging for
stagehand agent

- [browserbase#1066](browserbase#1066)
[`9daa584`](browserbase@9daa584)
Thanks [@tkattkat](https://github.com/tkattkat)! - Add playwright
arguments to agent execute response

- [browserbase#1077](browserbase#1077)
[`7f38b3a`](browserbase@7f38b3a)
Thanks [@tkattkat](https://github.com/tkattkat)! - adds support for
stagehand agent in the api

- [browserbase#1032](browserbase#1032)
[`bf2d0e7`](browserbase@bf2d0e7)
Thanks [@miguelg719](https://github.com/miguelg719)! - Fix for zod peer
dependency support

- [browserbase#1014](browserbase#1014)
[`6966201`](browserbase@6966201)
Thanks [@tkattkat](https://github.com/tkattkat)! - Replace operator
handler with base of new agent

- [browserbase#1089](browserbase#1089)
[`536f366`](browserbase@536f366)
Thanks [@miguelg719](https://github.com/miguelg719)! - Fixed info logs
on api session create

- [browserbase#1103](browserbase#1103)
[`889cb6c`](browserbase@889cb6c)
Thanks [@tkattkat](https://github.com/tkattkat)! - patch custom tool
support in anthropic cua client

- [browserbase#1056](browserbase#1056)
[`6a002b2`](browserbase@6a002b2)
Thanks [@chrisreadsf](https://github.com/chrisreadsf)! - remove need for
duplicate project id if already passed to Stagehand

- [browserbase#1090](browserbase#1090)
[`8ff5c5a`](browserbase@8ff5c5a)
Thanks [@miguelg719](https://github.com/miguelg719)! - Improve failed
act error logs

- [browserbase#1014](browserbase#1014)
[`6966201`](browserbase@6966201)
Thanks [@tkattkat](https://github.com/tkattkat)! - replace operator
agent with scaffold for new stagehand agent

- [browserbase#1107](browserbase#1107)
[`3ccf335`](browserbase@3ccf335)
Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - fix: url
extraction not working inside an array

- [browserbase#1102](browserbase#1102)
[`a99aa48`](browserbase@a99aa48)
Thanks [@miguelg719](https://github.com/miguelg719)! - Add current page
and date context to agent

- [browserbase#1110](browserbase#1110)
[`dda52f1`](browserbase@dda52f1)
Thanks [@miguelg719](https://github.com/miguelg719)! - Add support for
new Gemini Computer Use models

## @browserbasehq/stagehand-evals@1.1.0

### Minor Changes

- [browserbase#1057](browserbase#1057)
[`b7be89e`](browserbase@b7be89e)
Thanks [@filip-michalsky](https://github.com/filip-michalsky)! - added
web voyager ground truth (optional), added web bench, and subset of
OSWorld evals which run on a browser

### Patch Changes

- [browserbase#1072](browserbase#1072)
[`dc2d420`](browserbase@dc2d420)
Thanks [@filip-michalsky](https://github.com/filip-michalsky)! - improve
evals screenshot service - add img hashing diff to add screenshots and
change to screenshot intercepts from the agent

- Updated dependencies
\[[`8c0fd01`](browserbase@8c0fd01),
[`a1ad06c`](browserbase@a1ad06c),
[`9daa584`](browserbase@9daa584),
[`7f38b3a`](browserbase@7f38b3a),
[`bf2d0e7`](browserbase@bf2d0e7),
[`6966201`](browserbase@6966201),
[`536f366`](browserbase@536f366),
[`889cb6c`](browserbase@889cb6c),
[`6a002b2`](browserbase@6a002b2),
[`8ff5c5a`](browserbase@8ff5c5a),
[`6966201`](browserbase@6966201),
[`3ccf335`](browserbase@3ccf335),
[`a99aa48`](browserbase@a99aa48),
[`dda52f1`](browserbase@dda52f1)]:
    -   @browserbasehq/stagehand@2.5.1

## @browserbasehq/stagehand-examples@1.0.10

### Patch Changes

- Updated dependencies
\[[`8c0fd01`](browserbase@8c0fd01),
[`a1ad06c`](browserbase@a1ad06c),
[`9daa584`](browserbase@9daa584),
[`7f38b3a`](browserbase@7f38b3a),
[`bf2d0e7`](browserbase@bf2d0e7),
[`6966201`](browserbase@6966201),
[`536f366`](browserbase@536f366),
[`889cb6c`](browserbase@889cb6c),
[`6a002b2`](browserbase@6a002b2),
[`8ff5c5a`](browserbase@8ff5c5a),
[`6966201`](browserbase@6966201),
[`3ccf335`](browserbase@3ccf335),
[`a99aa48`](browserbase@a99aa48),
[`dda52f1`](browserbase@dda52f1)]:
    -   @browserbasehq/stagehand@2.5.1

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
# why

The original example used JavaScript destructuring syntax [table] which
doesn't work in Python. Fixed to use proper Python array indexing.

# what changed

fixed example to proper python syntax

# test plan

Co-authored-by: Steven Bryan <steven@mac.local.meter>
# why
- need to set default viewport when running on browserbase. previously,
we only defined the default inside the exported `StagehandConfig`
# what changed
- set default viewport to 1288 * 711 when running on browserbase
# test plan
- tested locally,
- regression evals
This PR was opened by the [Changesets
release](https://github.com/changesets/action) GitHub action. When
you're ready to do a release, you can merge this and the packages will
be published to npm automatically. If you're not ready to do a release
yet, that's fine, whenever you add more changesets to main, this PR will
be updated.


# Releases
## @browserbasehq/stagehand@2.5.2

### Patch Changes

- [browserbase#1114](browserbase#1114)
[`c0fbc51`](browserbase@c0fbc51)
Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - configure
default viewport when running on browserbase

## @browserbasehq/stagehand-evals@1.1.1

### Patch Changes

- Updated dependencies
\[[`c0fbc51`](browserbase@c0fbc51)]:
    -   @browserbasehq/stagehand@2.5.2

## @browserbasehq/stagehand-examples@1.0.11

### Patch Changes

- Updated dependencies
\[[`c0fbc51`](browserbase@c0fbc51)]:
    -   @browserbasehq/stagehand@2.5.2

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
renl and others added 10 commits October 9, 2025 17:18
Updated link in the Getting Started section to point to the correct
Quickstart Guide.

# why
Quickstart link in README leads to a non-existent page.
<img width="1556" height="763" alt="image"
src="https://github.com/user-attachments/assets/20a1a5b5-8534-43b4-89d5-e3a062b3965a"
/>

# what changed
Updated quickstart link in README to the correct quickstart address
`https://docs.stagehand.dev/first-steps/quickstart`

# test plan
Access new link to quickstart
# why

currently, for openai cua agent, we are handling keypress actions
incorrectly
currently, there is no way to pass a custom system prompt to the Google
cua agent

# what changed
- All key actions are now run through the mapKeyToPlaywright function to
ensure we properly map the agent's actions to valid Playwright
keys
- Custom system prompts now override the default system prompt for
Google Cua agent
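
To illustrate the kind of normalization the first bullet refers to, here is a rough sketch of a key mapper. The alias table and the name `mapKeySketch` are hypothetical and are not the contents of the actual `mapKeyToPlaywright` function; only the target names (`Enter`, `Escape`, `Control`, and so on) are key names that Playwright's keyboard API accepts.

```ts
// Hypothetical alias table: lowercased agent key names to Playwright key names.
const KEY_ALIASES = new Map<string, string>([
  ["enter", "Enter"],
  ["return", "Enter"],
  ["esc", "Escape"],
  ["escape", "Escape"],
  ["ctrl", "Control"],
  ["control", "Control"],
  ["cmd", "Meta"],
  ["command", "Meta"],
  ["alt", "Alt"],
  ["option", "Alt"],
  ["backspace", "Backspace"],
  ["del", "Delete"],
  ["delete", "Delete"],
  ["tab", "Tab"],
  ["up", "ArrowUp"],
  ["down", "ArrowDown"],
  ["left", "ArrowLeft"],
  ["right", "ArrowRight"],
]);

// Map an agent-provided key name to a Playwright key; unknown keys pass through.
function mapKeySketch(key: string): string {
  const alias = KEY_ALIASES.get(key.trim().toLowerCase());
  return alias !== undefined ? alias : key;
}
```

Routing every keypress through one function like this keeps the casing and naming differences between agent providers in a single place.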

# test plan
tested locally with google & openai cua agents

Fixes browserbase#1122
# why

currently when using stagehand agent through api, it returns early
without executing

# what changed

we now properly handle the options when none are present 

# test plan
tested locally, and tested across other cua agents to ensure no breaking
changes
# why
New model dropped

# what changed
Added support for haiku 4.5

# test plan
…ocs (browserbase#1140)

# why

We recently shipped updates to our MCP server that should be reflected
in the documentation.

# what changed

update tools list for MCP update, removing mentions of multisession,
adding experimental flag + get url tool

related PR:
browserbase/mcp-server-browserbase#123

# test plan
n/a
# why

Broken links. 

# what changed

Fixed broken links

# test plan

This PR.

---------

Co-authored-by: GG <guergabo@mac.local.meter>
# why

currently, we have no sense of what url an action was taken on and what
time it was taken when using agent

this is useful to have, because in the dashboard it will allow us to
filter the agent's actions by url, and display timestamps of the
individual actions

# what changed

- added pageUrl to every action 
- added timestamp to every action

the url is grabbed prior to the action being taken. This is because if
we do it after the action is taken, there is a chance the action could
have caused a navigation, which would result in the incorrect url for
the action
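
The ordering constraint above can be sketched as follows. `pageUrl` and `timestamp` are the field names from this change; `PageLike`, `RecordedAction`, and `performAndRecord` are hypothetical, the example is synchronous for brevity (real actions would be awaited), and capturing the timestamp before the action is an assumption, since the text only specifies the ordering for the URL.

```ts
// Minimal stand-in for a page object that can report its current URL.
interface PageLike {
  url(): string;
}

interface RecordedAction {
  type: string;
  pageUrl: string;
  timestamp: number;
}

// Read the URL before performing the action, because the action may navigate.
function performAndRecord(
  page: PageLike,
  type: string,
  perform: () => void,
): RecordedAction {
  const pageUrl = page.url(); // captured first, so a navigation cannot change it
  const timestamp = Date.now(); // assumption: also captured before the action
  perform();
  return { type, pageUrl, timestamp };
}
```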

# test plan

tested locally
# why
Make it easier to parse/filter/group evals

# what changed
Evals tagged with more granular metadata and error parsing

# test plan

---------

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
…vercel ai-sdk integration. As well as added updated models to types/models.ts & updated docs to include Cerebras key integration
@changeset-bot

changeset-bot bot commented Oct 17, 2025

🦋 Changeset detected

Latest commit: d404d3d

The changes in this PR will be included in the next version bump.


Contributor

@greptile-apps greptile-apps bot left a comment


Greptile Overview

Summary

This PR modernizes the Cerebras integration by migrating from a custom OpenAI-based implementation to the official @ai-sdk/cerebras integration, while adding 7 new Cerebras models.

Key Changes:

  • Refactored CerebrasClient.ts to delegate all functionality to AISdkClient using @ai-sdk/cerebras
  • Removed ~280 lines of custom OpenAI wrapper code in favor of native AI SDK support
  • Added 7 new models: llama-4 (maverick/scout), qwen-3 (multiple variants), and gpt-oss-120b
  • Updated documentation to include CEREBRAS_API_KEY environment variable
  • Exported CerebrasClient in lib/index.ts

Implementation:
The refactor follows the established pattern used by other AI SDK integrations in the codebase. The CerebrasClient now strips the cerebras- prefix from model names and creates a language model instance via createCerebras(), then wraps it with AISdkClient to handle all LLM operations consistently.

Testing:
PR author reports passing all evals (iframe_form_filling, amazon_add_to_cart, dropdown, extract_repo_name, allrecipes, imdb_movie_details, sciquest) with Cerebras models.

Confidence Score: 4/5

  • This PR is safe to merge with one minor process requirement needed.
  • The code changes are well-structured and follow existing patterns in the codebase. The migration to @ai-sdk/cerebras is a clear improvement that removes custom wrapper code. The author has tested across multiple evals. However, the PR is missing a changeset file which is required by project policy for documenting version bumps and changes.
  • No files require special attention - all changes are straightforward and follow established patterns. Only missing changeset needs to be added.

Important Files Changed

File Analysis

| Filename | Score | Overview |
|----------|-------|----------|
| lib/llm/CerebrasClient.ts | 5/5 | Refactored from OpenAI-based implementation to @ai-sdk/cerebras integration, delegating all logic to AISdkClient. Clean implementation with proper model name transformation. |
| types/model.ts | 5/5 | Added 7 new Cerebras model names (llama-4, qwen-3, gpt-oss) to the enum. All models follow existing naming convention. |
| lib/llm/LLMProvider.ts | 5/5 | Added new Cerebras models to modelToProviderMap. All mappings are correct and consistent. |
| docs/configuration/models.mdx | 5/5 | Added CEREBRAS_API_KEY to environment variable documentation. |
| lib/index.ts | 4/5 | Exported CerebrasClient for external use. Missing changeset file for this PR. |

Sequence Diagram

sequenceDiagram
    participant User
    participant LLMProvider
    participant CerebrasClient
    participant AISdkClient
    participant CerebrasAPI as Cerebras API

    User->>LLMProvider: getClient(modelName, clientOptions)
    LLMProvider->>LLMProvider: Check modelToProviderMap["cerebras-*"]
    LLMProvider->>CerebrasClient: new CerebrasClient({logger, cache, modelName, clientOptions})
    
    CerebrasClient->>CerebrasClient: Strip "cerebras-" prefix from modelName
    CerebrasClient->>CerebrasClient: createCerebras({apiKey, baseURL})
    CerebrasClient->>CerebrasClient: cerebrasProvider(actualModelName)
    CerebrasClient->>AISdkClient: new AISdkClient({model, logger, cache})
    AISdkClient-->>CerebrasClient: Return client instance
    CerebrasClient-->>LLMProvider: Return CerebrasClient instance
    LLMProvider-->>User: Return client
    
    User->>CerebrasClient: createChatCompletion(options)
    CerebrasClient->>AISdkClient: createChatCompletion(options)
    AISdkClient->>AISdkClient: Format messages to CoreMessage[]
    
    alt Has response_model
        AISdkClient->>CerebrasAPI: generateObject(model, messages, schema)
        CerebrasAPI-->>AISdkClient: Return structured object
    else No response_model
        AISdkClient->>CerebrasAPI: generateText(model, messages, tools)
        CerebrasAPI-->>AISdkClient: Return text/tool calls
    end
    
    AISdkClient->>AISdkClient: Transform response format
    AISdkClient-->>CerebrasClient: Return formatted response
    CerebrasClient-->>User: Return response

Additional Comments (1)

  1. lib/index.ts, line 1052 (link)

    style: Missing changeset - custom instruction requires all PRs with code changes to include a changeset. Create one using npx changeset.

    Context Used: Rule from dashboard - All PRs that make code changes must include a changeset to document the version bump and changes. (source)

5 files reviewed, 1 comment


@weeta-code
Author

@greptileai

Contributor

@greptile-apps greptile-apps bot left a comment


Greptile Overview

Summary

Refactored Cerebras integration from a custom OpenAI-compatible wrapper to use the official @ai-sdk/cerebras package through delegation to AISdkClient. This eliminates 300+ lines of custom formatting, caching, and error handling logic.

Key changes:

  • CerebrasClient now creates an AISdkClient instance and delegates all createChatCompletion calls to it
  • Added 7 new Cerebras model variants (Llama 4, Qwen 3, GPT-OSS models)
  • Updated documentation to include CEREBRAS_API_KEY environment variable
  • Properly strips cerebras- prefix before passing model name to the SDK
  • Follows the same delegation pattern used by other AI SDK integrations in the codebase

Minor type issue: The import of ClientOptions from types/model (which is OpenAIClientOptions | AnthropicClientOptions) doesn't perfectly match Cerebras SDK's expected options shape, though it works at runtime due to overlapping properties.

Confidence Score: 4/5

  • This PR is safe to merge with minimal risk
  • The refactoring significantly improves code maintainability by delegating to the battle-tested AISdkClient, reducing custom code by 300+ lines. The author tested all models across 7 different evals successfully. The only issue is a minor type mismatch with ClientOptions that doesn't affect runtime behavior. Score is 4 instead of 5 due to the type inconsistency that should ideally be cleaned up.
  • lib/llm/CerebrasClient.ts requires attention for the ClientOptions type import - consider creating Cerebras-specific options type

Important Files Changed

File Analysis

| Filename | Score | Overview |
|----------|-------|----------|
| lib/llm/CerebrasClient.ts | 3/5 | Refactored from OpenAI wrapper to AI SDK delegation pattern - removed 300+ lines of custom logic, now delegates to AISdkClient; minor issue with ClientOptions type mismatch |
| types/model.ts | 5/5 | Added 7 new Cerebras model variants to AvailableModelSchema |
| lib/llm/LLMProvider.ts | 5/5 | Added 7 new Cerebras models to modelToProviderMap |

Sequence Diagram

sequenceDiagram
    participant App as Application
    participant LLMProvider as LLMProvider
    participant CerebrasClient as CerebrasClient
    participant AISdkClient as AISdkClient
    participant CerebrasSDK as @ai-sdk/cerebras
    participant API as Cerebras API

    App->>LLMProvider: getClient("cerebras-llama-3.3-70b", options)
    LLMProvider->>LLMProvider: Check modelToProviderMap
    LLMProvider->>CerebrasClient: new CerebrasClient({logger, cache, modelName, clientOptions})
    CerebrasClient->>CerebrasClient: Strip "cerebras-" prefix
    CerebrasClient->>CerebrasSDK: createCerebras({apiKey, baseURL})
    CerebrasSDK-->>CerebrasClient: cerebrasProvider
    CerebrasClient->>CerebrasSDK: cerebrasProvider(actualModelName)
    CerebrasSDK-->>CerebrasClient: languageModel
    CerebrasClient->>AISdkClient: new AISdkClient({model, logger, cache, enableCaching})
    AISdkClient-->>CerebrasClient: client instance
    CerebrasClient-->>LLMProvider: CerebrasClient instance
    LLMProvider-->>App: client

    App->>CerebrasClient: createChatCompletion(options)
    CerebrasClient->>AISdkClient: createChatCompletion(options)
    AISdkClient->>AISdkClient: Check cache
    AISdkClient->>AISdkClient: Format messages
    AISdkClient->>CerebrasSDK: generateObject/generateText()
    CerebrasSDK->>API: HTTP Request
    API-->>CerebrasSDK: Response
    CerebrasSDK-->>AISdkClient: result
    AISdkClient->>AISdkClient: Cache result
    AISdkClient-->>CerebrasClient: formatted response
    CerebrasClient-->>App: response

Additional Comments (1)

  1. lib/llm/CerebrasClient.ts, line 3 (link)

    syntax: ClientOptions imported from types/model is defined as OpenAIClientOptions | AnthropicClientOptions, but @ai-sdk/cerebras expects a different shape ({apiKey?: string, baseURL?: string}). This works at runtime because the relevant properties overlap, but creates type confusion.

    Consider defining Cerebras-specific options inline or creating a new type.
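
One way to act on this suggestion is sketched below. The `{apiKey?: string, baseURL?: string}` shape is the one quoted in the comment; the names `CerebrasClientOptions` and `toCerebrasOptions` are hypothetical and do not exist in the codebase as far as this review shows.

```ts
// Options shape the review comment says @ai-sdk/cerebras expects.
interface CerebrasClientOptions {
  apiKey?: string;
  baseURL?: string;
}

// Hypothetical helper: narrow a broader options object to the two fields
// Cerebras uses, dropping anything that is provider-specific to others.
function toCerebrasOptions(
  options: { apiKey?: unknown; baseURL?: unknown } = {},
): CerebrasClientOptions {
  const result: CerebrasClientOptions = {};
  if (typeof options.apiKey === "string") {
    result.apiKey = options.apiKey;
  }
  if (typeof options.baseURL === "string") {
    result.baseURL = options.baseURL;
  }
  return result;
}
```

A dedicated type like this would make the overlap with the OpenAI and Anthropic option types explicit instead of relying on it implicitly at runtime.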

No files reviewed, 1 comment


@weeta-code
Author

@greptileai

Contributor

@greptile-apps greptile-apps bot left a comment


Greptile Overview

Summary

This PR modernizes the Cerebras integration by replacing a custom OpenAI-wrapper implementation with the official @ai-sdk/cerebras package. The changes significantly simplify the codebase by removing ~300 lines of custom message formatting, tool handling, and JSON parsing logic. The new implementation delegates all functionality to AISdkClient, which provides standardized handling through the Vercel AI SDK.

Key Changes:

  • Replaced openai dependency with @ai-sdk/cerebras
  • Removed custom message formatting and tool handling code
  • Added 7 new Cerebras models (GPT-OSS, Llama 4, Qwen 3 variants)
  • Updated documentation to include CEREBRAS_API_KEY setup
  • Included proper changeset for version bump

Technical Approach:
The refactor follows the delegation pattern used by other providers in the codebase. The CerebrasClient now acts as a thin wrapper that:

  1. Strips the cerebras- prefix from model names
  2. Initializes the Cerebras provider with API credentials
  3. Wraps the language model in AISdkClient for unified handling
  4. Remaps logger categories from "aisdk" to "cerebras"
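
The first three steps can be sketched without any external packages by using stand-in types. In the actual change the provider comes from `createCerebras()` in `@ai-sdk/cerebras` and the delegate is `AISdkClient`; here `LanguageModelLike`, `ProviderFactory`, `DelegateClient`, and `CerebrasClientSketch` are hypothetical placeholders, and the logger remapping in step 4 is omitted.

```ts
// Stand-in for the language model a provider returns (hypothetical shape).
interface LanguageModelLike {
  modelId: string;
}

// Stand-in for the function returned by createCerebras().
type ProviderFactory = (modelId: string) => LanguageModelLike;

// Stand-in for AISdkClient: just enough to show delegation.
class DelegateClient {
  private readonly model: LanguageModelLike;

  constructor(model: LanguageModelLike) {
    this.model = model;
  }

  createChatCompletion(prompt: string): string {
    return `[${this.model.modelId}] ${prompt}`;
  }
}

// Step 1: strip the "cerebras-" prefix to get the provider's model id.
function stripCerebrasPrefix(modelName: string): string {
  const prefix = "cerebras-";
  return modelName.startsWith(prefix)
    ? modelName.slice(prefix.length)
    : modelName;
}

// Steps 2 and 3: build the language model and wrap it in the delegate.
class CerebrasClientSketch {
  private readonly delegate: DelegateClient;

  constructor(modelName: string, provider: ProviderFactory) {
    const actualModelName = stripCerebrasPrefix(modelName);
    this.delegate = new DelegateClient(provider(actualModelName));
  }

  // Every call is forwarded unchanged to the delegate.
  createChatCompletion(prompt: string): string {
    return this.delegate.createChatCompletion(prompt);
  }
}
```

Keeping the wrapper this thin is what lets the provider-specific formatting and parsing code be deleted: the only Cerebras-specific logic left is the model name translation.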

The implementation is consistent with the repository's architecture and leverages generateObject/generateText from the AI SDK, aligning with custom instruction c6910a06-83cf-42d7-8d62-5bcf8d671b6f for reliable parsing.

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk
  • The implementation follows established patterns in the codebase, uses the official AI SDK package, includes proper testing (7 evals passed per PR description), has a complete changeset, and significantly reduces code complexity by removing custom implementations. No breaking changes to the public API.
  • No files require special attention

Important Files Changed

File Analysis

Filename Score Overview
lib/llm/CerebrasClient.ts 5/5 Replaced custom OpenAI-based implementation with @ai-sdk/cerebras, delegating all functionality to AISdkClient wrapper. Clean refactor with proper model name handling.

Sequence Diagram

sequenceDiagram
    participant Client as Stagehand Client
    participant Provider as LLMProvider
    participant CerebrasClient as CerebrasClient
    participant AISdkClient as AISdkClient
    participant CerebrasSDK as @ai-sdk/cerebras
    participant CerebrasAPI as Cerebras API
    
    Client->>Provider: Request with cerebras model
    Provider->>CerebrasClient: new CerebrasClient({logger, cache, modelName})
    CerebrasClient->>CerebrasClient: Strip "cerebras-" prefix from modelName
    CerebrasClient->>CerebrasSDK: createCerebras({apiKey, baseURL})
    CerebrasSDK-->>CerebrasClient: cerebrasProvider
    CerebrasClient->>CerebrasSDK: cerebrasProvider(actualModelName)
    CerebrasSDK-->>CerebrasClient: languageModel
    CerebrasClient->>AISdkClient: new AISdkClient({model, logger, cache})
    AISdkClient-->>CerebrasClient: client instance
    
    Client->>CerebrasClient: createChatCompletion(options)
    CerebrasClient->>AISdkClient: createChatCompletion(options)
    AISdkClient->>AISdkClient: Check cache
    AISdkClient->>AISdkClient: Format messages
    AISdkClient->>CerebrasSDK: generateObject/generateText
    CerebrasSDK->>CerebrasAPI: API Request
    CerebrasAPI-->>CerebrasSDK: Response
    CerebrasSDK-->>AISdkClient: Structured response
    AISdkClient->>AISdkClient: Cache response
    AISdkClient-->>CerebrasClient: Formatted result
    CerebrasClient-->>Client: Result

1 file reviewed, no comments


@weeta-code
Author

why

  • Stability & maintainability: The old Cerebras client re-implemented message formatting, tool handling, and JSON parsing on top of OpenAI's SDK with a baseURL override. This implementation was brittle and higher-maintenance.
  • Consistency: Other providers already standardize on the Vercel AI SDK via our AISdkClient. Moving Cerebras to the same pattern reduces any potential future drift.
  • Feature parity & easier upgrades: Using @ai-sdk/cerebras keeps us aligned with the provider's latest behavior and model roster without custom glue.
  • Code reduction: ~300 lines of custom logic deleted -> simpler, safer code.

what changed

  • Refactor: lib/llm/CerebrasClient.ts now: strips the cerebras- prefix to get the actual model id. Instantiates the Cerebras provider via createCerebras() from @ai-sdk/cerebras. Wraps the language model with our AISdkClient for unified request/response handling, logging, and (when enabled) caching.
  • Exports: lib/index.ts now exports CerebrasClient.
  • Model map: lib/llm/LLMProvider.ts maps new model ids to "cerebras".
  • Type enum: types/model.ts adds the latest Cerebras models.
  • Docs: docs/configuration/models.mdx now documents the API key configuration for Cerebras.
  • Housekeeping: Removed legacy OpenAI-with-baseURL path and the associated tool/JSON plumbing.

Compatibility notes

  • API surface: Callers still select models via cerebras-* names.
  • Env var: if anyone previously relied on an OpenAI API key with a Cerebras baseURL, they must set CEREBRAS_API_KEY instead.
  • Vision: hasVision remains false for Cerebras (unchanged behavior).

Test plan

  • Ran the existing eval suite on the following tests (all passed on every model tested):
    • iframe_form_filling, amazon_add_to_cart, dropdown,
      extract_repo_name, allrecipes, imdb_movie_details,
      sciquest
