Overview

Browser Control lets EnConvo’s AI interact with your real browser — the same one where you’re logged into all your accounts. Unlike headless browser tools that start fresh with no cookies or sessions, Browser Control works with your existing browser instance, including all your login sessions, cookies, bookmarks, and extensions. This means the AI can:
  • Navigate to your dashboard without needing your credentials
  • Fill forms on sites where you’re already authenticated
  • Take screenshots of exactly what you see
  • Read page content from any tab
  • Click buttons, manage tabs, and automate workflows — all in your real browser

Real Browser, Real Sessions

Works with your existing login sessions — no re-authentication needed

Multi-Browser Support

Chrome, Edge, Brave, Arc, Vivaldi, Opera, Firefox, and more

86+ Actions

Navigate, click, type, screenshot, snapshot, eval, tab management, cookies, and much more

Background Mode

Automate without bringing the browser to the foreground

How It Works

Browser Control uses a lightweight companion extension installed in your browser. The extension communicates with EnConvo via WebSocket, allowing the AI to send commands and receive results.
  1. Install the Companion Extension: install the Enconvo Companion extension in your browser.
  2. Extension Connects Automatically: the extension connects to EnConvo via WebSocket when your browser is running.
  3. AI Sends Commands: when you ask the AI to interact with a webpage, it sends commands through the extension.
  4. Extension Executes & Returns Results: the extension performs the action in your browser and returns the result to the AI.
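
The request/response cycle in these steps can be pictured with a small sketch. The wire format is not documented on this page, so the field names used here (`id`, `action`, `params`, `ok`, `result`, `error`) and the `fake_extension` stand-in are assumptions for illustration only, not EnConvo's actual protocol.

```python
import itertools
import json

# Hypothetical message shapes: the real EnConvo protocol is not documented here.
_ids = itertools.count(1)

def make_command(action, **params):
    """Build a JSON command the AI side might send to the extension."""
    return json.dumps({"id": next(_ids), "action": action, "params": params})

def fake_extension(raw):
    """Stand-in for the extension: handles one action and echoes the request id."""
    msg = json.loads(raw)
    if msg["action"] == "navigate":
        result = {"url": msg["params"]["url"]}
        return json.dumps({"id": msg["id"], "ok": True, "result": result})
    return json.dumps({"id": msg["id"], "ok": False, "error": "unknown action"})

def match_response(request_raw, response_raw):
    """Pair a response with its request by id, then unwrap the result."""
    req, res = json.loads(request_raw), json.loads(response_raw)
    if req["id"] != res["id"]:
        raise ValueError("response does not belong to this request")
    if not res["ok"]:
        raise RuntimeError(res["error"])
    return res["result"]

request = make_command("navigate", url="https://example.com")
print(match_response(request, fake_extension(request)))
```

Tagging each command with an id is one common way to match replies to requests over a single WebSocket connection; whether EnConvo does the same is not stated in this page.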

Installing the Companion Extension

Setting Up

Choose Your Default Browser

By default, Browser Control uses your system’s default browser. You can override this:
  1. Open EnConvo Settings
  2. Find the Browser Control extension
  3. Set Default Browser to your preferred browser

Verify Connection

To check if the extension is connected:
enconvo browser_control status
This shows:
  • Which browsers are connected
  • Which browser is currently active
  • Whether the extension is running
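
If you want to run this check from a script, a minimal sketch might look like the following. It assumes the `enconvo` CLI is on your PATH; the helper name `browser_control_status` is hypothetical, and because the output format is not specified here the text is returned unparsed.

```python
import shutil
import subprocess

def browser_control_status(timeout=10):
    """Run `enconvo browser_control status` and return its raw text output.

    Returns None when the `enconvo` CLI is not on PATH or the command fails.
    """
    exe = shutil.which("enconvo")
    if exe is None:
        return None
    proc = subprocess.run(
        [exe, "browser_control", "status"],
        capture_output=True, text=True, timeout=timeout,
    )
    return proc.stdout if proc.returncode == 0 else None

status = browser_control_status()
print(status if status is not None else "enconvo CLI not available")
```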

Core Workflow

The typical pattern for AI browser automation:
  1. Navigate to a page: tell the AI to open a URL; it uses the navigate action.
  2. Take a snapshot: the AI takes a snapshot of the page, getting an accessibility tree with element references like @e1, @e2.
  3. Interact with elements: using the references, the AI can click buttons (@e3), fill inputs (@e5), select dropdowns, and so on.
  4. Re-snapshot after changes: after any navigation or DOM change, the AI takes a fresh snapshot to get updated references.

Since Browser Control uses your real browser sessions, you don’t need to handle login flows. Just navigate directly to authenticated pages; you’re already logged in.
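
The navigate, snapshot, interact, re-snapshot loop can be simulated with a toy model. `FakePage` and `StaleReference` below are hypothetical stand-ins, not EnConvo's API; the sketch only illustrates why references from an old snapshot should not be reused after the page changes.

```python
class StaleReference(Exception):
    """Raised when a reference from an older snapshot is used."""

class FakePage:
    """Toy stand-in for a browser tab; not EnConvo's real API."""

    def __init__(self):
        self.url = None
        self.generation = 0  # bumped on every navigation or DOM change

    def navigate(self, url):
        self.url = url
        self.generation += 1

    def snapshot(self):
        # Each reference remembers which page generation produced it.
        labels = ["Home", "Projects", "New Project"]
        return {f"@e{i}": (self.generation, label)
                for i, label in enumerate(labels, start=1)}

    def click(self, refs, ref):
        generation, label = refs[ref]
        if generation != self.generation:
            raise StaleReference(f"{ref} came from an outdated snapshot")
        self.generation += 1  # assume the click changed the DOM
        return label

page = FakePage()
page.navigate("https://example.com/dashboard")  # 1. navigate
refs = page.snapshot()                          # 2. snapshot
print(page.click(refs, "@e3"))                  # 3. interact
try:
    page.click(refs, "@e1")                     # old refs are now stale
except StaleReference as err:
    print("stale:", err)
refs = page.snapshot()                          # 4. re-snapshot
print(page.click(refs, "@e1"))
```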

What Can It Do?

Page Interaction

| Action | Description |
| --- | --- |
| Navigate | Open URLs, go back/forward in history |
| Click / Double-click | Click any element by reference, CSS selector, or text |
| Fill / Type | Enter text into inputs (React/Vue compatible) |
| Select | Choose dropdown options |
| Check / Uncheck | Toggle checkboxes and radio buttons |
| Press | Keyboard shortcuts like Enter, Ctrl+A, Cmd+C |
| Scroll | Scroll page or specific elements into view |
| Drag | Drag and drop between elements |
| Focus / Hover | Focus inputs or hover over elements |
| Submit Form | Submit form elements |

Content Reading

| Action | Description |
| --- | --- |
| Screenshot | Capture the visible tab as an image |
| Annotated Screenshot | Screenshot with numbered labels on interactive elements |
| Snapshot | Accessibility tree with @eN element references |
| Get Content | Extract page text and optionally HTML |
| Get Text / HTML / Value | Read specific element content |
| Get All Links | Extract all links from the page |
| Get Table Data | Extract HTML tables as structured JSON |
| Get Form Data | Read all form field values |
| Get Meta Tags | Extract SEO and OpenGraph metadata |
| Get Images | List all images with src, alt, and dimensions |
| Eval | Execute custom JavaScript in the page |

Browser Management

| Action | Description |
| --- | --- |
| Tab Management | Open, close, switch, and list tabs |
| Cookie Management | Get, set, remove, and clear cookies |
| Storage | Read/write localStorage and sessionStorage |
| State Save/Load | Save and restore browser state (cookies + storage) |
| Window Management | Open new windows |
| Zoom | Get, set, zoom in/out, reset zoom |
| Clear Cache | Clear the browser cache |

Inspection & Debugging

| Action | Description |
| --- | --- |
| Get Element Info | Bounding box, visibility, enabled state, ARIA info |
| Is Visible / Enabled / Checked | Quick boolean state checks |
| Is In Viewport | Check if element is visible without scrolling |
| Highlight | Visually highlight an element with colored outline |
| Console Messages | Capture console.log output |
| Page Errors | Capture JavaScript errors |
| Get Performance | Page load timing and resource metrics |
| Network Monitor | Track XHR and fetch requests |

Advanced

| Action | Description |
| --- | --- |
| Batch | Execute multiple actions in one request |
| Inject CSS | Add custom styles to the page |
| Remove Element | Remove elements from the DOM |
| Set Attribute | Modify HTML attributes |
| Toggle Class | Add/remove CSS classes |
| Emulate Device | Set viewport and user agent for mobile testing |
| Set Media | Emulate dark/light mode |
| Block Resources | Block images, scripts, or ads |
| Frame | Switch between iframes |

Examples

Reading the Current Page

Simply ask:
“What’s on the current page in my browser?”
The AI will use get_frontmost_browser_active_tab_content to read whatever page is currently in the foreground, regardless of which browser you’re using.

Filling a Form

“Go to example.com/signup and fill in the form with my name John Doe and email john@example.com”
The AI will:
  1. Navigate to the URL
  2. Snapshot the page to find form fields
  3. Fill in each field using the @eN references
  4. Submit the form

Taking a Screenshot

“Take a screenshot of my GitHub dashboard”
The AI navigates to GitHub (you’re already logged in), waits for the page to load, and captures a screenshot that displays directly in the chat.

Extracting Data

“Get all the links from the Hacker News front page”
The AI navigates to Hacker News and uses get_all_links to extract every link with its text and URL.

Background Automation

“In the background, check the price of AAPL on Google Finance”
With background: true, the browser stays minimized while the AI navigates, reads the price, and reports back.

Multi-Browser Support

Browser Control works with any browser that has the companion extension installed:
| Browser | Detection Method | Name in Commands |
| --- | --- | --- |
| Google Chrome | AppleScript | google_chrome |
| Microsoft Edge | Extension (userAgent) | edge |
| Brave | AppleScript | brave_browser |
| Arc | AppleScript | arc |
| Vivaldi | AppleScript | vivaldi |
| Opera | Extension (userAgent) | opera |
| Firefox | Extension (URL scheme) | firefox |

Browser Selection Priority

When you don’t specify a browser, EnConvo follows this order:
  1. Your manual setting — if you chose a default in Browser Control preferences
  2. System default browser — your macOS default (skips Safari since it doesn’t support the extension)
  3. Connected browsers — whichever connected browser was used most recently
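
The priority order above can be written as a short function. This is a sketch of the described logic, not EnConvo's implementation: the function name, its parameters, and the `"safari"` identifier are assumptions made for illustration.

```python
def pick_browser(manual_setting, system_default, connected_recent_first):
    """Return the browser name to use, or None if nothing is available.

    connected_recent_first: connected browsers, most recently used first.
    """
    # 1. A manual setting in Browser Control preferences wins.
    if manual_setting:
        return manual_setting
    # 2. Otherwise the system default, except Safari (no extension support).
    #    "safari" is an assumed identifier; it is not listed in the table above.
    if system_default and system_default != "safari":
        return system_default
    # 3. Otherwise the most recently used connected browser.
    return connected_recent_first[0] if connected_recent_first else None

print(pick_browser(None, "safari", ["arc", "google_chrome"]))  # arc
print(pick_browser("edge", "google_chrome", ["arc"]))          # edge
```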

Auto-Launch

If the target browser isn’t running, EnConvo will automatically launch it and wait for the extension to connect (up to 10 seconds).
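
Waiting for a connection with an upper bound is typically done by polling against a deadline. The sketch below shows that pattern with the 10-second limit as the default; the `is_connected` callback, the polling interval, and the function name are hypothetical, and EnConvo may well implement the wait differently.

```python
import time

def wait_for_connection(is_connected, timeout=10.0, interval=0.25,
                        clock=time.monotonic, sleep=time.sleep):
    """Poll is_connected() until it returns True or the timeout elapses."""
    deadline = clock() + timeout
    while True:
        if is_connected():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)

# Simulated extension that connects on the third poll.
polls = iter([False, False, True])
print(wait_for_connection(lambda: next(polls), interval=0.01))
```

Using a monotonic clock keeps the deadline correct even if the system time changes while waiting.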

Snapshot & Element References

The snapshot is one of the most powerful features. It produces an accessibility tree of the page:
[page] My Dashboard
  [navigation] Main Nav
    [link @e1] Home
    [link @e2] Projects
    [link @e3] Settings
  [main]
    [heading @e4] Welcome back, John
    [textbox @e5] "Search..." (placeholder)
    [button @e6] New Project
    [list]
      [listitem @e7] Project Alpha
      [listitem @e8] Project Beta
Interactive elements get @eN references that the AI can use directly:
  • click @e6 → clicks “New Project”
  • fill @e5 "my search term" → types in the search box
  • get_text @e4 → reads “Welcome back, John”
Important: Element references (@e1, @e2, etc.) are invalidated when the page changes. The AI always takes a fresh snapshot after navigation or DOM changes.
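
To make the reference format concrete, here is a small parser for snapshot text shaped like the example above. It assumes every referenced line looks like `[role @eN] label`; the actual snapshot grammar may include cases this regular expression does not cover, and `parse_refs` is a hypothetical helper.

```python
import re

# Matches lines such as: [button @e6] New Project
LINE = re.compile(r'\[(?P<role>[\w-]+) @(?P<ref>e\d+)\]\s*(?P<label>.*)')

def parse_refs(snapshot):
    """Map each @eN reference to its (role, label) pair."""
    refs = {}
    for line in snapshot.splitlines():
        match = LINE.search(line)
        if match:
            refs["@" + match["ref"]] = (match["role"], match["label"].strip())
    return refs

SNAPSHOT = """\
[page] My Dashboard
  [navigation] Main Nav
    [link @e1] Home
  [main]
    [heading @e4] Welcome back, John
    [textbox @e5] "Search..." (placeholder)
    [button @e6] New Project
"""

refs = parse_refs(SNAPSHOT)
print(refs["@e6"])
print([ref for ref, (role, _) in refs.items() if role == "link"])
```

Lines without a reference, such as `[page]` and `[main]`, are skipped because only referenced elements can be targeted.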

Troubleshooting

The companion extension isn’t installed or the browser isn’t running.
Solutions:
  1. Make sure your browser is running
  2. Install the extension from Chrome Web Store
  3. Or install manually from ~/.enconvo/chrome_extension/ (see Manual Installation above)
  4. Check chrome://extensions and make sure the extension is enabled
The extension connects via WebSocket to localhost:11225. Ensure:
  1. EnConvo is running
  2. No firewall is blocking port 11225
  3. Try disabling and re-enabling the extension
  4. Check the extension’s service worker for errors in chrome://extensions
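
A quick way to test the first two points is to check whether anything accepts TCP connections on that port. The sketch below uses only the Python standard library; `port_is_open` is a hypothetical helper, and a successful connection shows that something is listening on the port, not that it is EnConvo.

```python
import socket

def port_is_open(host="localhost", port=11225, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

print("port 11225 reachable:", port_is_open())
```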
Some pages restrict extension access:
  • chrome:// and browser internal pages cannot be controlled
  • Pages with strict Content Security Policy (CSP) may block injected scripts
  • Closed Shadow DOM elements cannot be accessed
Workaround: Use eval with caution, or try a different approach like keyboard shortcuts (press).
Check which browser is active:
enconvo browser_control status
You can specify a browser explicitly:
enconvo browser_control navigate --url "https://example.com" --browser "edge"
Or set a default in Browser Control preferences.
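
When scripting this, it can help to validate the browser name before building the command. The sketch below uses the `--url` and `--browser` flags shown above and the names from the "Name in Commands" column; `navigate_command` is a hypothetical helper, and since other browsers are also supported the name set here may be incomplete.

```python
import shlex

# Command names from the "Name in Commands" column of the browser table.
KNOWN_BROWSERS = {"google_chrome", "edge", "brave_browser", "arc",
                  "vivaldi", "opera", "firefox"}

def navigate_command(url, browser=None):
    """Build the argv for `enconvo browser_control navigate`."""
    argv = ["enconvo", "browser_control", "navigate", "--url", url]
    if browser is not None:
        if browser not in KNOWN_BROWSERS:
            raise ValueError(f"unknown browser name: {browser!r}")
        argv += ["--browser", browser]
    return argv

# Print the command as a shell-safe string instead of running it.
print(shlex.join(navigate_command("https://example.com", browser="edge")))
```

The resulting list can be passed to `subprocess.run` as-is, which avoids shell quoting problems with URLs that contain special characters.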
captureVisibleTab only captures the visible viewport. Make sure:
  1. The browser window is not minimized
  2. The target tab is the active tab
  3. The page has finished loading (use wait_for first)

AI Agents

Use Browser Control as a tool within AI agent workflows

Context Awareness

EnConvo can automatically read your current browser tab as context

MCP Servers

Extend browser capabilities with MCP-based web tools

Workflows

Automate multi-step browser tasks with visual workflows