Merged
8 changes: 6 additions & 2 deletions docs/content/docs/(configuration)/config.mdx
@@ -106,8 +106,10 @@ startup_delay_secs = 5
enabled = true
headless = true
evaluate_enabled = false
-executable_path = "/path/to/chrome" # optional, auto-detected
-screenshot_dir = "/path/to/screenshots" # optional, defaults to data_dir/screenshots
+persist_session = false # keep browser alive across worker lifetimes
+close_policy = "close_browser" # "close_browser", "close_tabs", or "detach"
+executable_path = "/path/to/chrome" # optional, auto-detected
+screenshot_dir = "/path/to/screenshots" # optional, defaults to data_dir/screenshots

# --- Agents ---
# At least one agent is required. First agent or the one with default = true
@@ -532,6 +534,8 @@ When branch/worker/cron dispatch happens before readiness is satisfied, Spacebot
| `enabled` | bool | true | Whether workers have browser tools |
| `headless` | bool | true | Run Chrome headless |
| `evaluate_enabled` | bool | false | Allow JavaScript evaluation |
| `persist_session` | bool | false | Keep browser alive across worker lifetimes. Tabs, cookies, and logins survive between tasks. Requires agent restart to take effect. |
| `close_policy` | string | `"close_browser"` | What happens on close: `"close_browser"` (kill Chrome), `"close_tabs"` (close tabs, keep browser), `"detach"` (disconnect, leave everything) |
| `executable_path` | string | None | Custom Chrome/Chromium path |
| `screenshot_dir` | string | None | Directory for screenshots |

78 changes: 57 additions & 21 deletions docs/content/docs/(features)/browser.mdx
@@ -42,8 +42,8 @@ Single `browser` tool with an `action` discriminator. All actions share one argu

| Action | Required Args | Description |
|--------|--------------|-------------|
-| `launch` | -- | Start Chrome. Must be called first. |
-| `close` | -- | Shut down Chrome and clean up all state. |
+| `launch` | -- | Start Chrome (or reconnect to a persistent browser). Must be called first. |
+| `close` | -- | Shut down, close tabs, or detach depending on `close_policy`. |

### Navigation

@@ -130,6 +130,8 @@ Browser config lives in `config.toml` under `[defaults.browser]` (or per-agent o
enabled = true # include browser tool in worker ToolServers
headless = true # run Chrome without a visible window
evaluate_enabled = false # allow JavaScript evaluation via the tool
persist_session = false # keep browser alive across worker lifetimes
close_policy = "close_browser" # what happens on close
executable_path = "" # custom Chrome binary path (auto-detected if empty)
screenshot_dir = "" # override screenshot storage location
```
@@ -143,34 +145,68 @@ id = "web-scraper"
[agents.browser]
evaluate_enabled = true # this agent's workers can run JS
headless = false # show the browser window for debugging
persist_session = true # keep tabs and logins between worker runs
close_policy = "detach" # workers disconnect without closing tabs
```

When `enabled = false`, the browser tool is not registered on worker ToolServers. Workers for that agent won't see it in their available tools.

### Close Policy

Controls what happens when a worker calls `close`:

| Policy | Behavior |
|--------|----------|
| `close_browser` | Kill Chrome and reset all state. Default. |
| `close_tabs` | Close all tracked tabs but keep Chrome running. |
| `detach` | Disconnect without touching tabs or the browser process. |

`close_policy` is most useful with `persist_session = true`. With `detach`, workers leave the browser exactly as they found it.
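
As a rough illustration of the three policies in the table above, the following Rust sketch models them as an enum and applies each one to a stand-in state. The variant names and `as_str` follow the `ClosePolicy` type referenced elsewhere in this PR; `DemoBrowser`, `apply_close`, and `ClosePolicy::parse` are hypothetical names used only for illustration, and the real `close` handler talks to Chrome rather than mutating a struct.

```rust
/// Sketch of the close policy enum; variant names follow the PR,
/// the derives and `parse` helper are assumptions.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum ClosePolicy {
    /// Kill Chrome and reset all state (the documented default).
    #[default]
    CloseBrowser,
    /// Close tracked tabs but keep Chrome running.
    CloseTabs,
    /// Disconnect without touching tabs or the process.
    Detach,
}

impl ClosePolicy {
    /// String form as written in `config.toml`.
    pub fn as_str(self) -> &'static str {
        match self {
            ClosePolicy::CloseBrowser => "close_browser",
            ClosePolicy::CloseTabs => "close_tabs",
            ClosePolicy::Detach => "detach",
        }
    }

    /// Parse a config string; unknown values yield `None`.
    pub fn parse(value: &str) -> Option<Self> {
        match value {
            "close_browser" => Some(ClosePolicy::CloseBrowser),
            "close_tabs" => Some(ClosePolicy::CloseTabs),
            "detach" => Some(ClosePolicy::Detach),
            _ => None,
        }
    }
}

/// Hypothetical stand-in for browser state (not the real `BrowserState`).
#[derive(Debug, Default)]
pub struct DemoBrowser {
    pub running: bool,
    pub tabs: Vec<String>,
}

/// Apply a close policy to the stand-in state.
pub fn apply_close(policy: ClosePolicy, browser: &mut DemoBrowser) {
    match policy {
        ClosePolicy::CloseBrowser => {
            browser.tabs.clear();
            browser.running = false;
        }
        ClosePolicy::CloseTabs => browser.tabs.clear(),
        // Detach leaves the browser exactly as it was found.
        ClosePolicy::Detach => {}
    }
}
```

Under these assumptions, `apply_close(ClosePolicy::Detach, &mut browser)` leaves `running` and `tabs` untouched, while `CloseTabs` empties `tabs` and keeps `running` set. `ClosePolicy::parse(raw).unwrap_or_default()` only approximates the loader: the loader in this PR falls back to the inherited base value for unknown strings, not to the enum default.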

## Persistent Sessions

By default, each worker launches its own Chrome process. When the worker finishes, the browser dies and all session state (cookies, tabs, logins) is lost.

With `persist_session = true`, all workers for an agent share a single browser instance via a `SharedBrowserHandle` held in `RuntimeConfig`. The browser and its tabs survive across worker lifetimes.

When a new worker calls `launch` on a persistent browser that's already running, it reconnects to the existing process, discovers all open tabs, and can continue where the previous worker left off. Combined with `close_policy = "detach"`, workers leave the browser untouched when they finish.

This is useful for:

- **Login persistence** -- log in once, reuse the session across multiple worker runs.
- **Watching the agent work** -- set `headless = false` to see a visible Chrome window that stays open.
- **Multi-step workflows** -- a worker can leave tabs open for a follow-up worker to pick up.

```toml
[defaults.browser]
headless = false # visible Chrome window
persist_session = true # keep alive across workers
close_policy = "detach" # workers just disconnect
```

Changing `persist_session` requires an agent restart.
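
The reconnect behavior described in this section can be sketched with plain standard-library types. This is a minimal model, not the actual implementation: `DemoState`, `DemoWorker`, and `SharedHandle` are hypothetical names, `std::sync::Mutex` stands in for whatever mutex the real `SharedBrowserHandle` uses, and "launching" only flips a flag instead of starting Chrome.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for the shared browser state.
#[derive(Debug, Default)]
pub struct DemoState {
    pub running: bool,
    pub tabs: Vec<String>,
}

/// Assumed shape of the shared handle: reference-counted, mutex-guarded state.
pub type SharedHandle = Arc<Mutex<DemoState>>;

/// A worker holding a clone of the shared handle.
pub struct DemoWorker {
    state: SharedHandle,
}

impl DemoWorker {
    pub fn new(state: SharedHandle) -> Self {
        Self { state }
    }

    /// Returns `true` when reconnecting to an already-running browser,
    /// `false` when this call started it.
    pub fn launch(&self) -> bool {
        let mut state = self.state.lock().expect("state lock poisoned");
        if state.running {
            true
        } else {
            state.running = true;
            false
        }
    }

    /// Record a newly opened tab in the shared state.
    pub fn open_tab(&self, url: &str) {
        let mut state = self.state.lock().expect("state lock poisoned");
        state.tabs.push(url.to_string());
    }

    /// Snapshot of the tabs currently tracked in the shared state.
    pub fn tabs(&self) -> Vec<String> {
        self.state.lock().expect("state lock poisoned").tabs.clone()
    }

    /// Model of `close_policy = "detach"`: drop this worker's handle
    /// and leave the shared state untouched.
    pub fn detach(self) {}
}
```

In this model, a first worker's `launch` returns `false` (it started the browser), and after it opens a tab and calls `detach`, a second worker built from a clone of the same handle gets `true` from `launch` and sees the tab the first worker left open.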

## Architecture

```
-Worker (Rig Agent)
-├── shell, file, exec, set_status (standard worker tools)
-└── browser (BrowserTool)
-    ├── Arc<Mutex<BrowserState>> (shared across tool invocations)
-    │   ├── Browser (chromiumoxide handle)
-    │   ├── pages: HashMap (target_id → Page)
-    │   ├── active_target (current tab)
-    │   └── element_refs (snapshot ref → ElementRef)
-    └── Config
-        ├── headless
-        ├── evaluate_enabled
-        └── screenshot_dir
+# Default mode (persist_session = false):
+Worker A → BrowserTool { own BrowserState } → own Chrome process
+Worker B → BrowserTool { own BrowserState } → own Chrome process
+
+# Persistent mode (persist_session = true):
+RuntimeConfig → SharedBrowserHandle (Arc<Mutex<BrowserState>>)
+Worker A → BrowserTool { shared state } ──┐
+Worker B → BrowserTool { shared state } ──┤→ single Chrome process
+Worker C → BrowserTool { shared state } ──┘
```

-Each worker gets its own `BrowserTool` instance with its own `BrowserState`. The state is behind `Arc<Mutex<>>` because the Rig tool trait requires `Clone`. The Chrome process (and its CDP WebSocket handler task) live for the lifetime of the worker.
+In both modes, `BrowserState` holds:
+- `Browser` -- chromiumoxide handle
+- `pages: HashMap` -- target_id to Page
+- `active_target` -- current tab
+- `element_refs` -- snapshot ref to ElementRef

-The CDP handler runs as a background tokio task that polls the WebSocket stream. It's spawned during `launch` and dropped when the browser closes or the worker completes.
+The state is behind `Arc<Mutex<>>` because the Rig tool trait requires `Clone`. The CDP handler runs as a background tokio task that polls the WebSocket stream, spawned during `launch`.
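
To make the two modes concrete, here is a small sketch of how a tool might pick its state: reuse the shared handle when one is configured, otherwise allocate its own. It also shows why `Arc<Mutex<...>>` fits a `Clone` requirement, since cloning the tool clones the pointer rather than the state. `DemoTool`, `DemoState`, and `SharedHandle` are hypothetical names; the `Option` parameter is modeled on the `shared_browser: Option<SharedBrowserHandle>` field added to `RuntimeConfig` in this PR, and `std::sync::Mutex` is an assumption.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for `BrowserState`, reduced to one field.
#[derive(Debug, Default)]
pub struct DemoState {
    pub active_target: Option<String>,
}

/// Assumed shape of the shared handle.
pub type SharedHandle = Arc<Mutex<DemoState>>;

/// Cloning the tool clones the `Arc`, so every clone sees the same state.
#[derive(Clone)]
pub struct DemoTool {
    state: SharedHandle,
}

impl DemoTool {
    /// Persistent mode passes `Some(handle)`; default mode passes `None`
    /// and the tool allocates private state.
    pub fn new(shared: Option<&SharedHandle>) -> Self {
        let state = match shared {
            Some(handle) => Arc::clone(handle),
            None => Arc::new(Mutex::new(DemoState::default())),
        };
        Self { state }
    }

    /// True when both tools point at the same underlying state.
    pub fn shares_state_with(&self, other: &DemoTool) -> bool {
        Arc::ptr_eq(&self.state, &other.state)
    }

    /// Set the current tab in the (possibly shared) state.
    pub fn set_active(&self, target_id: &str) {
        let mut state = self.state.lock().expect("state lock poisoned");
        state.active_target = Some(target_id.to_string());
    }

    /// Read the current tab from the (possibly shared) state.
    pub fn active(&self) -> Option<String> {
        self.state.lock().expect("state lock poisoned").active_target.clone()
    }
}
```

Two tools built with `DemoTool::new(None)` do not share state, while two built from the same `Some(&handle)` do. In either mode, a `clone()` of a tool observes changes made through the original.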

## Implementation

@@ -192,5 +228,5 @@ Element resolution: refs map to CSS selectors built from `[role='...']` and `[ar
- **No file upload/download** -- chromiumoxide doesn't expose file chooser interception. Use `shell` + `curl` for downloads.
- **No network interception** -- request blocking and response modification aren't exposed through the tool, though chromiumoxide supports it via raw CDP commands.
- **No cookie/storage management** -- not exposed as tool actions. Could be added if needed.
-- **Single browser per worker** -- each worker gets one Chrome process. No connection pooling across workers.
+- **Single browser per worker by default** -- with `persist_session = false` (default), each worker gets its own Chrome process. With `persist_session = true`, all workers for an agent share one browser and reconnect to existing tabs on `launch`.
- **Selector fragility** -- the `[role][aria-label]` selector strategy works for well-structured pages but can fail on pages with missing ARIA attributes. The `content` action + `evaluate` (when enabled) serve as fallbacks.
4 changes: 4 additions & 0 deletions interface/src/api/client.ts
@@ -573,6 +573,8 @@ export interface BrowserSection {
enabled: boolean;
headless: boolean;
evaluate_enabled: boolean;
persist_session: boolean;
close_policy: "close_browser" | "close_tabs" | "detach";
}

export interface SandboxSection {
@@ -655,6 +657,8 @@ export interface BrowserUpdate {
enabled?: boolean;
headless?: boolean;
evaluate_enabled?: boolean;
persist_session?: boolean;
close_policy?: "close_browser" | "close_tabs" | "detach";
}

export interface SandboxUpdate {
23 changes: 23 additions & 0 deletions interface/src/routes/AgentConfig.tsx
@@ -820,6 +820,29 @@ function ConfigSectionEditor({ sectionId, label, description, detail, config, on
value={localValues.evaluate_enabled as boolean}
onChange={(v) => handleChange("evaluate_enabled", v)}
/>
<ConfigToggleField
label="Persist Session"
description="Keep the browser alive across worker lifetimes. Cookies, tabs, and login sessions survive between tasks. Requires agent restart to take effect."
value={localValues.persist_session as boolean}
onChange={(v) => handleChange("persist_session", v)}
/>
<div className="flex flex-col gap-1.5">
<label className="text-sm font-medium text-ink">Close Policy</label>
<p className="text-tiny text-ink-faint">What happens when a worker calls &quot;close&quot; or finishes.</p>
<Select
value={localValues.close_policy as string}
onValueChange={(v) => handleChange("close_policy", v)}
>
<SelectTrigger className="border-app-line/50 bg-app-darkBox/30">
<SelectValue />
</SelectTrigger>
<SelectContent>
<SelectItem value="close_browser">Close Browser</SelectItem>
<SelectItem value="close_tabs">Close Tabs</SelectItem>
<SelectItem value="detach">Detach</SelectItem>
</SelectContent>
</Select>
</div>
</div>
);
case "sandbox":
13 changes: 13 additions & 0 deletions src/api/config.rs
@@ -1,4 +1,5 @@
use super::state::ApiState;
use crate::config::ClosePolicy;

use axum::Json;
use axum::extract::{Query, State};
@@ -73,6 +74,8 @@ pub(super) struct BrowserSection {
enabled: bool,
headless: bool,
evaluate_enabled: bool,
persist_session: bool,
close_policy: String,
}

#[derive(Serialize, Debug)]
@@ -199,6 +202,8 @@ pub(super) struct BrowserUpdate {
enabled: Option<bool>,
headless: Option<bool>,
evaluate_enabled: Option<bool>,
persist_session: Option<bool>,
close_policy: Option<ClosePolicy>,
}

#[derive(Deserialize, Debug)]
@@ -286,6 +291,8 @@ pub(super) async fn get_agent_config(
enabled: browser.enabled,
headless: browser.headless,
evaluate_enabled: browser.evaluate_enabled,
persist_session: browser.persist_session,
close_policy: browser.close_policy.as_str().to_string(),
},
sandbox: SandboxSection {
mode: match sandbox.mode {
@@ -672,6 +679,12 @@ fn update_browser_table(
if let Some(v) = browser.evaluate_enabled {
table["evaluate_enabled"] = toml_edit::value(v);
}
if let Some(v) = browser.persist_session {
table["persist_session"] = toml_edit::value(v);
}
if let Some(v) = browser.close_policy {
table["close_policy"] = toml_edit::value(v.as_str());
}
Ok(())
}

38 changes: 31 additions & 7 deletions src/config/load.rs
@@ -10,13 +10,14 @@ use super::providers::{
};
use super::toml_schema::*;
use super::{
-AgentConfig, ApiConfig, ApiType, Binding, BrowserConfig, CoalesceConfig, CompactionConfig,
-Config, CortexConfig, CronDef, DefaultsConfig, DiscordConfig, DiscordInstanceConfig,
-EmailConfig, EmailInstanceConfig, GroupDef, HumanDef, IngestionConfig, LinkDef, LlmConfig,
-McpServerConfig, McpTransport, MemoryPersistenceConfig, MessagingConfig, MetricsConfig,
-OpenCodeConfig, ProviderConfig, SlackCommandConfig, SlackConfig, SlackInstanceConfig,
-TelegramConfig, TelegramInstanceConfig, TelemetryConfig, TwitchConfig, TwitchInstanceConfig,
-WarmupConfig, WebhookConfig, normalize_adapter, validate_named_messaging_adapters,
+AgentConfig, ApiConfig, ApiType, Binding, BrowserConfig, ClosePolicy, CoalesceConfig,
+CompactionConfig, Config, CortexConfig, CronDef, DefaultsConfig, DiscordConfig,
+DiscordInstanceConfig, EmailConfig, EmailInstanceConfig, GroupDef, HumanDef, IngestionConfig,
+LinkDef, LlmConfig, McpServerConfig, McpTransport, MemoryPersistenceConfig, MessagingConfig,
+MetricsConfig, OpenCodeConfig, ProviderConfig, SlackCommandConfig, SlackConfig,
+SlackInstanceConfig, TelegramConfig, TelegramInstanceConfig, TelemetryConfig, TwitchConfig,
+TwitchInstanceConfig, WarmupConfig, WebhookConfig, normalize_adapter,
+validate_named_messaging_adapters,
};
use crate::error::{ConfigError, Result};

@@ -110,6 +111,21 @@ pub(super) fn warn_unknown_config_keys(content: &str) {
}
}

fn parse_close_policy(value: Option<&str>) -> Option<ClosePolicy> {
match value? {
"close_browser" => Some(ClosePolicy::CloseBrowser),
"close_tabs" => Some(ClosePolicy::CloseTabs),
"detach" => Some(ClosePolicy::Detach),
other => {
tracing::warn!(
value = other,
"unknown close_policy value, expected one of: close_browser, close_tabs, detach"
);
None
}
}
}

fn parse_otlp_headers(value: Option<String>) -> Result<HashMap<String, String>> {
let Some(raw) = value else {
return Ok(HashMap::new());
@@ -1393,6 +1409,9 @@ impl Config {
.screenshot_dir
.map(PathBuf::from)
.or_else(|| base.screenshot_dir.clone()),
persist_session: b.persist_session.unwrap_or(base.persist_session),
close_policy: parse_close_policy(b.close_policy.as_deref())
.unwrap_or(base.close_policy),
chrome_cache_dir: chrome_cache_dir.clone(),
}
})
@@ -1590,6 +1609,11 @@ impl Config {
.screenshot_dir
.map(PathBuf::from)
.or_else(|| defaults.browser.screenshot_dir.clone()),
persist_session: b
.persist_session
.unwrap_or(defaults.browser.persist_session),
close_policy: parse_close_policy(b.close_policy.as_deref())
.unwrap_or(defaults.browser.close_policy),
chrome_cache_dir: defaults.browser.chrome_cache_dir.clone(),
}),
mcp: match a.mcp {
22 changes: 22 additions & 0 deletions src/config/runtime.rs
@@ -9,6 +9,7 @@ use super::{
WarmupConfig, WarmupStatus, WorkReadiness, evaluate_work_readiness,
};
use crate::llm::routing::RoutingConfig;
use crate::tools::browser::SharedBrowserHandle;

/// Live configuration that can be hot-reloaded without restarting.
///
@@ -64,6 +65,12 @@ pub struct RuntimeConfig {
/// Wrapped in `Arc` so it can be shared with the `Sandbox` struct, which
/// reads the current mode dynamically on every `wrap()` call.
pub sandbox: Arc<ArcSwap<crate::sandbox::SandboxConfig>>,
/// Shared browser state for persistent sessions.
///
/// When `browser.persist_session = true`, all workers share this handle so
/// the browser process and tabs survive across worker lifetimes. When
/// `persist_session = false` this is `None` and each worker creates its own.
pub shared_browser: Option<SharedBrowserHandle>,
}

impl RuntimeConfig {
@@ -117,6 +124,11 @@ impl RuntimeConfig {
settings: ArcSwap::from_pointee(None),
secrets: ArcSwap::from_pointee(None),
sandbox: Arc::new(ArcSwap::from_pointee(agent_config.sandbox.clone())),
shared_browser: if agent_config.browser.persist_session {
Some(crate::tools::browser::new_shared_browser_handle())
} else {
None
},
}
}

@@ -188,6 +200,16 @@ impl RuntimeConfig {
.store(Arc::new(resolved.max_concurrent_branches));
self.max_concurrent_workers
.store(Arc::new(resolved.max_concurrent_workers));
let old_persist = self.browser_config.load().persist_session;
let new_persist = resolved.browser.persist_session;
if old_persist != new_persist {
tracing::warn!(
agent_id,
old = old_persist,
new = new_persist,
"persist_session changed — restart the agent for this to take effect"
);
}
self.browser_config.store(Arc::new(resolved.browser));
self.mcp.store(Arc::new(new_mcp.clone()));
self.history_backfill_count
2 changes: 2 additions & 0 deletions src/config/toml_schema.rs
@@ -364,6 +364,8 @@ pub(super) struct TomlBrowserConfig {
pub(super) evaluate_enabled: Option<bool>,
pub(super) executable_path: Option<String>,
pub(super) screenshot_dir: Option<String>,
pub(super) persist_session: Option<bool>,
pub(super) close_policy: Option<String>,
}

#[derive(Deserialize)]