AgentBrowser - Browser Automation with Playwright

Trust: ★★★☆☆ (0.90) · 0 validations · developer_reference

Published: 2026-05-10 · Source: crawler_authoritative

Situation

Guide for developers using the @mastra/agent-browser package to implement browser automation with accessibility-first element targeting using Playwright.

Insight

The @mastra/agent-browser package provides browser automation using Playwright with accessibility-first element targeting. Elements are identified by refs from the page’s accessibility tree (e.g., @e1, @e2), making interactions reliable across different page layouts. When an agent calls browser_snapshot, it receives a text representation of the page with these refs. The agent then uses refs with other tools to interact with elements. AgentBrowser requires a Chromium binary installed via Playwright for local launches, normally downloaded automatically during package installation. If launching fails with ‘browser executable is missing’, run npx playwright install chromium. A remote browser can be connected using the cdpUrl option without needing local Chromium.

Action

Install the package using npm (npm install @mastra/agent-browser), pnpm (pnpm add @mastra/agent-browser), Yarn (yarn add @mastra/agent-browser), or Bun (bun add @mastra/agent-browser). Create a browser instance with new AgentBrowser({ headless: false }) and assign it to an agent through the browser property. Configure the agent with a model (e.g., openai/gpt-5.4) and instructions that tell it to call browser_snapshot to get the current page state and element refs, use those refs to target elements for clicks and typing, and take another snapshot after each action to verify the result.

Result

The browser agent uses accessibility tree refs to target and interact with elements reliably across different page layouts. It also supports keyboard shortcuts and complex interactions.

Applicability conditions

Local launches require a Chromium binary installed via Playwright. If the browser executable is missing, run npx playwright install chromium. When connecting to a remote browser with the cdpUrl option, no local Chromium is needed.


Original content

AgentBrowser

The @mastra/agent-browser package provides browser automation using Playwright with accessibility-first element targeting. Elements are identified by refs from the page’s accessibility tree, making interactions reliable across different page layouts.

When to use AgentBrowser

Use AgentBrowser when you need:

  • Reliable element targeting through accessibility refs
  • Fine-grained control over browser actions
  • Playwright’s robust automation capabilities
  • Support for keyboard shortcuts and complex interactions

Quickstart

Install the package:

npm:

npm install @mastra/agent-browser

pnpm:

pnpm add @mastra/agent-browser

Yarn:

yarn add @mastra/agent-browser

Bun:

bun add @mastra/agent-browser

Create a browser instance and assign it to an agent:

import { Agent } from '@mastra/core/agent'
import { AgentBrowser } from '@mastra/agent-browser'
 
const browser = new AgentBrowser({
  headless: false,
})
 
export const browserAgent = new Agent({
  id: 'browser-agent',
  name: 'Browser Agent',
  model: 'openai/gpt-5.4',
  browser,
  instructions: `You are a web automation assistant.
 
When interacting with pages:
1. Use browser_snapshot to get the current page state and element refs
2. Use the refs (like @e1, @e2) to target elements for clicks and typing
3. After actions, take another snapshot to verify the result`,
})

Note: For local launches (the default), AgentBrowser requires a Chromium binary installed via Playwright. This is normally downloaded automatically when you install @mastra/agent-browser. If launching the browser fails with "browser executable is missing", run npx playwright install chromium. If you connect to a remote browser using the cdpUrl option, no local Chromium is needed.
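The choice between a local launch and a remote connection can be sketched as a small helper. This is a minimal illustration, not part of the package: resolveBrowserOptions, the BrowserOptions interface, the BROWSER_CDP_URL environment variable, and the ws://localhost:9222 endpoint are all hypothetical names and values. Only the headless and cdpUrl options come from this guide, and the real option type may include more fields.

```typescript
// Assumed shape: only the two options mentioned in this guide.
interface BrowserOptions {
  headless?: boolean;
  cdpUrl?: string;
}

// Hypothetical helper: prefer a remote browser when a CDP endpoint is given,
// otherwise fall back to a local headless launch (which needs the Chromium
// binary installed via Playwright).
function resolveBrowserOptions(cdpUrl?: string): BrowserOptions {
  const trimmed = cdpUrl ? cdpUrl.trim() : '';
  if (trimmed !== '') {
    return { cdpUrl: trimmed };
  }
  return { headless: true };
}

// Usage sketch (not executed here; assumes a Node.js environment):
//   const browser = new AgentBrowser(resolveBrowserOptions(process.env.BROWSER_CDP_URL))
```

Keeping the decision in one pure function makes it easy to run the same agent locally during development and against a remote browser elsewhere, without installing Chromium on the remote-only hosts.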

Element refs

AgentBrowser uses accessibility tree refs to identify elements. When an agent calls browser_snapshot, it receives a text representation of the page with refs like @e1, @e2, etc. The agent then uses these refs with other tools to interact with elements.
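To make the ref format concrete, here is a minimal sketch that collects the distinct refs from a snapshot string. The sample snapshot text is an assumed format for illustration only; the actual output of browser_snapshot may look different. extractRefs is a hypothetical helper, not a function of the package.

```typescript
// Hypothetical helper: return the distinct refs (e.g. @e1, @e2) found in a
// snapshot, in order of first appearance.
function extractRefs(snapshot: string): string[] {
  const matches: string[] = snapshot.match(/@e\d+/g) || [];
  return Array.from(new Set(matches));
}

// Assumed snapshot layout, for illustration only.
const sampleSnapshot = [
  '- heading "Sign in" [ref=@e1]',
  '- textbox "Username" [ref=@e2]',
  '- button "Submit" [ref=@e3]',
].join('\n');

console.log(extractRefs(sampleSnapshot));
```

A helper like this is mainly useful for logging or debugging which refs an agent had available at a given step, since the agent itself reads the refs directly from the snapshot text.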

Note: See AgentBrowser reference for all configuration options and tool details.
