Sync browser agent features from ConnectOnion CLI #4

@wu-changxing

Sync Browser Agent Features from ConnectOnion CLI Version

Summary

The standalone browser-agent and ConnectOnion's CLI browser agent (@connectonion/cli/browser-agent) have diverged significantly. This issue tracks features from the CLI version that should be adopted to improve UX and reliability.

Current State

  • Standalone: Multi-agent architecture, deep research capabilities, advanced features
  • CLI Version: Better UX, defensive error handling, platform optimizations
  • Last Sync: Never systematically synced

Features to Adopt from CLI Version

1. Home Directory Profile Management

Priority: High

Current: Profile at .co/chrome_profile (project-specific)
Proposed: Profile at ~/.co/browser_profile (persistent across projects)

Benefits:

  • Sessions persist across different projects
  • Users log in once; cookies persist until they expire or the profile is deleted
  • More intuitive behavior for users

Implementation:

# In WebAutomation.__init__() (requires: from pathlib import Path)
if profile_path:
    self.chrome_profile_path = str(profile_path)
else:
    self.chrome_profile_path = str(Path.home() / ".co" / "browser_profile")

Files to modify:

  • tools/web_automation.py (lines 40-43)
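
The fallback above could also be factored into a small helper so it can be unit-tested without launching a browser. This is only a sketch: `resolve_profile_path` and `DEFAULT_PROFILE_DIR` are hypothetical names, not part of either code base.

```python
from pathlib import Path
from typing import Optional, Union

# Hypothetical constant: the home-directory default proposed in this issue.
DEFAULT_PROFILE_DIR = Path.home() / ".co" / "browser_profile"


def resolve_profile_path(profile_path: Optional[Union[str, Path]] = None) -> str:
    """Return the explicit profile path if one is given, else the default."""
    if profile_path:
        return str(profile_path)
    return str(DEFAULT_PROFILE_DIR)
```

An explicit `profile_path` still wins, so a project that wants to keep using `.co/chrome_profile` can pass it in.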

2. macOS Chrome Binary Detection

Priority: Medium

Problem: Playwright's bundled Chromium is unsigned and crashes on macOS in non-headless mode

Solution:

import os
import sys
from typing import Optional

def open_browser(self, headless: Optional[bool] = None) -> str:
    launch_kwargs = dict(
        headless=headless,
        args=['--disable-blink-features=AutomationControlled'],
        ignore_default_args=['--enable-automation'],
        timeout=120000,  # milliseconds
    )

    # macOS fix: use system Chrome for non-headless mode.
    # Note: headless=None also takes this branch.
    if not headless and sys.platform == 'darwin':
        chrome_path = '/Applications/Google Chrome.app/Contents/MacOS/Google Chrome'
        if os.path.exists(chrome_path):
            launch_kwargs['executable_path'] = chrome_path

    # The profile directory is the path set in __init__ (see item 1)
    self.browser = self.playwright.chromium.launch_persistent_context(
        self.chrome_profile_path,
        **launch_kwargs,
    )
    # ... rest of open_browser (page setup, return value) unchanged

Files to modify:

  • tools/web_automation.py (open_browser method)
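
To make the platform check testable on any machine, the detection could be pulled out into a pure function. The sketch below assumes a hypothetical helper `system_chrome_path` whose `platform` and `exists` parameters exist only so tests can inject values; the Chrome path is the one given above.

```python
import os
import sys
from typing import Callable, Optional

MACOS_CHROME = "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"


def system_chrome_path(
    headless: Optional[bool],
    platform: str = sys.platform,
    exists: Callable[[str], bool] = os.path.exists,
) -> Optional[str]:
    """Return the system Chrome binary to use, or None to keep the default."""
    if headless:
        return None  # headless mode keeps the bundled Chromium
    if platform != "darwin":
        return None  # the workaround only applies to macOS
    return MACOS_CHROME if exists(MACOS_CHROME) else None
```

In `open_browser`, the result would be assigned to `launch_kwargs['executable_path']` only when it is not `None`.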

3. Smart URL Handling

Priority: Low

Enhancement: Auto-add https:// for partial URLs

Current:

def go_to(self, url: str) -> str:
    self.page.goto(url, wait_until="load")

Proposed:

def go_to(self, url: str) -> str:
    if not url.startswith(('http://', 'https://')):
        url = f'https://{url}' if '.' in url else f'http://{url}'

    self.page.goto(url, wait_until='domcontentloaded', timeout=30000)
    self.page.wait_for_timeout(2000)  # Wait for dynamic content
    self.current_url = self.page.url
    return f"Navigated to {self.current_url}"

Benefits:

  • Users can type example.com instead of https://example.com
  • More forgiving UX
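
The scheme rule in the proposed `go_to` can be isolated for testing. A minimal sketch, assuming a hypothetical helper named `normalize_url`; the `strip()` call is an addition not present in the snippet above.

```python
def normalize_url(url: str) -> str:
    """Add a scheme to a partial URL using the rule proposed above."""
    url = url.strip()  # assumption: tolerate stray whitespace from user input
    if url.startswith(("http://", "https://")):
        return url
    # Dotted names are treated as public hosts (https); bare names such as
    # "localhost:3000" fall back to http.
    return f"https://{url}" if "." in url else f"http://{url}"
```

One caveat of the dot heuristic: an address such as `127.0.0.1:8000` contains a dot and therefore gets `https://`, which may not be what a local development server expects.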

Files to modify:

  • tools/web_automation.py (go_to method, lines 80-84)

4. Defensive Null Checking

Priority: Medium

Issue: Methods don't check if browser is open, leading to cryptic errors

Pattern to adopt:

def method_name(self, ...) -> str:
    if not self.page:
        return "Browser not open"

    # ... rest of implementation

Apply to all methods in:

  • tools/web_automation.py: get_text, select_option, check_checkbox, wait_for_element, etc.
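
Rather than repeating the guard in every method, one option is a decorator. This is a sketch of an alternative, not what the CLI version does: `requires_page` and `DemoAutomation` are hypothetical names, and `DemoAutomation` only stands in for `WebAutomation`.

```python
import copy
import functools
from typing import Any, Callable


def requires_page(default: Any = "Browser not open") -> Callable:
    """Return `default` instead of calling the method when no page is open."""
    def decorator(method: Callable) -> Callable:
        @functools.wraps(method)  # keep the name and docstring of the tool
        def wrapper(self, *args, **kwargs):
            if not self.page:
                # Copy so a mutable default (e.g. a list) is never shared.
                return copy.copy(default)
            return method(self, *args, **kwargs)
        return wrapper
    return decorator


class DemoAutomation:
    """Stand-in for WebAutomation, for illustration only."""

    def __init__(self, page=None):
        self.page = page

    @requires_page()
    def get_current_url(self) -> str:
        """Get the current page URL."""
        return self.page.url

    @requires_page(default=[])
    def get_urls(self) -> list:
        """Placeholder body; the real method would query the page."""
        return ["https://example.com"]
```

The explicit `if not self.page` check from the pattern above is simpler to read in isolation; the decorator mainly pays off if the guard has to be applied to many methods consistently.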

5. Additional Helper Methods

Priority: Low

Add convenience methods from CLI version:

# Requires: from typing import List (used by get_urls below)
def get_current_url(self) -> str:
    """Get the current page URL."""
    if not self.page:
        return "Browser not open"
    return self.page.url

def get_current_page_html(self) -> str:
    """Get the HTML content of the current page."""
    if not self.page:
        return "Browser not open"
    return self.page.content()

def get_urls(self, domain_filter: str = "") -> List[str]:
    """Extract all unique URLs from the current page.

    Args:
        domain_filter: Only return URLs containing this string
    """
    if not self.page:
        return []

    urls = self.page.evaluate("""
        (filter) => {
            const seen = new Set();
            const result = [];
            for (const a of document.querySelectorAll('a[href]')) {
                const href = a.href;
                if (href && !seen.has(href) && (!filter || href.includes(filter))) {
                    seen.add(href);
                    result.push(href);
                }
            }
            return result;
        }
    """, domain_filter)
    return urls or []

def set_viewport(self, width: int, height: int) -> str:
    """Set the browser viewport size."""
    if not self.page:
        return "Browser not open"
    self.page.set_viewport_size({"width": width, "height": height})
    return f"Viewport set to {width}x{height}"

def wait(self, seconds: float) -> str:
    """Wait for a specified number of seconds."""
    if not self.page:
        return "Browser not open"
    self.page.wait_for_timeout(seconds * 1000)
    return f"Waited for {seconds} seconds"

Files to modify:

  • tools/web_automation.py (add after existing methods)
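
Note that `domain_filter` in `get_urls` is a substring match, so `example.com` would also match `notexample.com` or a URL that merely mentions the domain in its query string. If stricter matching turns out to be needed, a post-filter on the parsed hostname might look like the sketch below; `filter_urls_by_host` is a hypothetical helper, not part of the CLI version.

```python
from typing import Iterable, List
from urllib.parse import urlparse


def filter_urls_by_host(urls: Iterable[str], domain: str) -> List[str]:
    """Keep unique URLs whose host is `domain` or one of its subdomains."""
    domain = domain.lower().lstrip(".")
    seen = set()
    result = []
    for url in urls:
        host = (urlparse(url).hostname or "").lower()
        # An empty domain keeps everything, mirroring the JS filter above.
        matches = not domain or host == domain or host.endswith("." + domain)
        if matches and url not in seen:
            seen.add(url)
            result.append(url)
    return result
```

It could be applied to the list returned by `get_urls()` without changing the JavaScript that collects the links.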

6. Enhanced Screenshot Method

Priority: Low

Current signature:

def take_screenshot(self, filename: str = None) -> str:

Proposed signature:

def take_screenshot(self, url: str = None, path: str = "",
                   width: int = 1920, height: int = 1080,
                   full_page: bool = False) -> str:

New features:

  • Navigate to URL before screenshot
  • Control viewport size
  • Full-page capture option
  • Auto-generate timestamped filenames
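
The timestamped-filename behavior is not spelled out in this issue. One possible shape, assuming a hypothetical helper `screenshot_path`, a `screenshots` output directory, and a `screenshot_YYYYMMDD_HHMMSS.png` naming scheme (all three are assumptions):

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def screenshot_path(
    path: str = "",
    directory: str = "screenshots",
    now: Optional[datetime] = None,
) -> Path:
    """Return the caller's path, or a timestamped default when none is given."""
    if path:
        return Path(path)
    # `now` is injectable so tests can pin the timestamp.
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    return Path(directory) / f"screenshot_{stamp}.png"
```

`take_screenshot` would then pass the resulting path to the page's screenshot call together with `full_page`.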

Files to modify:

  • tools/web_automation.py (take_screenshot method)

7. Better Manual Login UX

Priority: Medium

Current:

def wait_for_manual_login(self, site_name: str = "the website") -> str:
    print(f"\n{'='*60}\n⏸️  MANUAL LOGIN REQUIRED\n{'='*60}")
    print(f"Please login to {site_name} in the browser window.")
    input("Press Enter to continue...")
    return f"User confirmed login to {site_name}"

Proposed:

def wait_for_manual_login(self, site_name: str = "the website") -> str:
    if not self.page:
        return "Browser not open"

    print(f"\n{'='*60}")
    print("  MANUAL LOGIN REQUIRED")
    print(f"{'='*60}")
    print(f"Please login to {site_name} in the browser window.")
    print("Once you're logged in and ready to continue:")
    print("  Type 'yes' or 'Y' and press Enter")
    print(f"{'='*60}\n")

    while True:
        response = input("Ready to continue? (yes/Y): ").strip().lower()
        if response in ('yes', 'y'):
            print("Continuing automation...\n")
            return f"User confirmed login to {site_name} - continuing"
        print("Please type 'yes' or 'Y' when ready.")

Files to modify:

  • tools/web_automation.py (wait_for_manual_login method, lines 218-223)
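
The confirmation loop blocks on `input()`, which makes the "invalid inputs" case awkward to test. A sketch of one way around that, assuming a hypothetical helper `confirm_ready` that takes the reader and writer as parameters:

```python
from typing import Callable


def confirm_ready(
    read: Callable[[str], str] = input,
    write: Callable[[str], None] = print,
) -> int:
    """Loop until the user answers yes/y; return the number of attempts."""
    attempts = 0
    while True:
        attempts += 1
        response = read("Ready to continue? (yes/Y): ").strip().lower()
        if response in ("yes", "y"):
            return attempts
        write("Please type 'yes' or 'Y' when ready.")
```

A test can then feed scripted answers instead of typing them:

```python
answers = iter(["", "no", " Y "])
messages = []
attempts = confirm_ready(lambda prompt: next(answers), messages.append)
```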

Implementation Plan

Phase 1: Critical UX Improvements

Phase 2: Platform Support

Phase 3: Nice-to-Have

Testing Checklist

After implementing:

  • Test on macOS in non-headless mode
  • Test profile persistence across different projects
  • Test with browser not opened (defensive checks)
  • Test manual login flow with invalid inputs
  • Verify all helper methods work as expected

Related Issues

  • This issue was created as part of the browser agent sync effort
  • Related: ConnectOnion CLI browser agent issue (link TBD)

Files to Modify

  • tools/web_automation.py - Main implementation file
  • README.md - Update documentation for new features
  • CLAUDE.md - Update development guidance

Migration Notes

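These changes are intended to be backward compatible, and most are additive. Two need care: moving the default profile from .co/chrome_profile to ~/.co/browser_profile means sessions stored in an existing project profile are not picked up unless profile_path is passed explicitly, and the new take_screenshot signature replaces the filename parameter, so calls that pass filename (by keyword, or positionally where url now comes first) must be updated.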

Labels: enhancement
