AgentBrowser - Browser Automation Package

Trust: ★★★☆☆ (0.90) · 0 validations · developer_reference

Published: 2026-05-12 · Source: crawler_authoritative

Situation

Guide for using the @mastra/agent-browser package to configure browser automation with Playwright and accessibility-first element targeting for AI agents.

Insight

The @mastra/agent-browser package provides browser automation using Playwright with accessibility-first element targeting. Elements are identified by refs from the page’s accessibility tree (such as @e1 and @e2), which makes interactions reliable across different page layouts. Local launches require a Chromium binary, which is normally downloaded automatically during installation and can otherwise be installed with npx playwright install chromium. Remote browser connections use the cdpUrl configuration option and need no local Chromium. AgentBrowser works with two inspection tools: browser_snapshot returns a text representation of the page with element refs for targeting, and browser_screenshot captures PNG images for vision-capable models. Configuration options include headless mode and an excludeTools array, which disables specific tools (for example, browser_screenshot for non-vision models). The AgentBrowser instance is assigned to an agent through the browser property of the Agent configuration. The package also supports keyboard shortcuts and complex interactions.

Action

Install the package with npm (npm install @mastra/agent-browser), pnpm (pnpm add @mastra/agent-browser), yarn (yarn add @mastra/agent-browser), or bun (bun add @mastra/agent-browser). Create an AgentBrowser instance with the configuration options you need: headless (default true), an excludeTools array, and cdpUrl for remote browser connections. Assign the instance to an Agent through the browser property. In the agent instructions, tell the agent to call browser_snapshot to get the current page state and element refs, to use those refs to target elements for clicks and typing, and to take another snapshot after each action to verify the result. For non-vision models, set excludeTools: ['browser_screenshot'] to disable the screenshot tool. If a local browser launch fails with "browser executable is missing", run npx playwright install chromium.

Result

The result is a configured AgentBrowser instance that lets agents interact with web pages by targeting elements through accessibility tree refs, with support for snapshots, screenshots, and complex browser interactions.

Applicability conditions

Requires @mastra/core agent setup. For local launches, requires Chromium binary installed via Playwright (auto-downloaded or npx playwright install chromium). Remote browser connections via cdpUrl bypass the local Chromium requirement. The browser_screenshot tool requires vision-capable models.

Original content

AgentBrowser

The @mastra/agent-browser package provides browser automation using Playwright with accessibility-first element targeting. Elements are identified by refs from the page’s accessibility tree, making interactions reliable across different page layouts.

When to use AgentBrowser

Use AgentBrowser when you need:

Reliable element targeting through accessibility refs
Fine-grained control over browser actions
Playwright’s robust automation capabilities
Support for keyboard shortcuts and complex interactions

Quickstart

Install the package:

npm:

npm install @mastra/agent-browser

pnpm:

pnpm add @mastra/agent-browser

Yarn:

yarn add @mastra/agent-browser

Bun:

bun add @mastra/agent-browser

Create a browser instance and assign it to an agent:

import { Agent } from '@mastra/core/agent'
import { AgentBrowser } from '@mastra/agent-browser'
 
const browser = new AgentBrowser({
  headless: false,
})
 
export const browserAgent = new Agent({
  id: 'browser-agent',
  name: 'Browser Agent',
  model: 'openai/gpt-5.4',
  browser,
  instructions: `You are a web automation assistant.
 
When interacting with pages:
1. Use browser_snapshot to get the current page state and element refs
2. Use the refs (like @e1, @e2) to target elements for clicks and typing
3. After actions, take another snapshot to verify the result`,
})

Note: For local launches (the default), AgentBrowser requires a Chromium binary installed via Playwright. This is normally downloaded automatically when you install @mastra/agent-browser. If launching the browser fails with "browser executable is missing", run npx playwright install chromium. If you connect to a remote browser using the cdpUrl option, no local Chromium is needed.
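The choice between a local launch and a remote connection described in the note can be sketched as a small helper. This is a minimal sketch under stated assumptions, not part of the package: the BrowserOptions interface below only mirrors the three options named in this guide (headless, excludeTools, cdpUrl), and both the resolveBrowserOptions function and the BROWSER_CDP_URL variable name are hypothetical.

```typescript
// Local stand-in for the options this guide names. The real options type
// comes from @mastra/agent-browser and may contain more fields.
interface BrowserOptions {
  headless?: boolean
  excludeTools?: string[]
  cdpUrl?: string
}

// Hypothetical helper: prefer a remote browser when a CDP URL is supplied,
// otherwise fall back to a local headless launch (which needs Chromium).
function resolveBrowserOptions(
  env: Record<string, string | undefined>,
): BrowserOptions {
  // BROWSER_CDP_URL is an assumed variable name, not one the package defines.
  const cdpUrl = env['BROWSER_CDP_URL']?.trim()
  if (cdpUrl) {
    return { cdpUrl }
  }
  return { headless: true }
}
```

The returned object could then be handed to the constructor, for example new AgentBrowser(resolveBrowserOptions(process.env)), so that the same agent code runs against a local Chromium in development and a remote browser elsewhere.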

Screenshots

When the agent uses the browser_screenshot tool, it captures a PNG image of the current page and returns it as image content that vision-capable models can interpret directly.

Use screenshots when you need to visually inspect the page — for example, evaluating images, layout, or colors. For text or structured data, use browser_snapshot instead.

To disable the screenshot tool for models that do not support vision, use excludeTools:

const browser = new AgentBrowser({
  headless: false,
  excludeTools: ['browser_screenshot'],
})
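When the same setup serves both vision and non-vision models, the excludeTools value can be derived from a capability flag instead of being hard-coded. The sketch below assumes you track vision support yourself; the excludedToolsFor helper is hypothetical, and only the browser_screenshot tool name comes from this guide.

```typescript
// Tool name as given in this guide.
const SCREENSHOT_TOOL = 'browser_screenshot'

// Hypothetical helper: returns the tools to exclude for a given model.
// Non-vision models cannot interpret image content, so the screenshot
// tool is excluded for them; vision models keep every tool.
function excludedToolsFor(supportsVision: boolean): string[] {
  return supportsVision ? [] : [SCREENSHOT_TOOL]
}
```

A call such as new AgentBrowser({ headless: false, excludeTools: excludedToolsFor(false) }) would then match the configuration shown above.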

Element refs

AgentBrowser uses accessibility tree refs to identify elements. When an agent calls browser_snapshot, it receives a text representation of the page with refs like @e1, @e2, etc. The agent then uses these refs with other tools to interact with elements.
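The agent reads refs straight from the snapshot text, but it can also be useful to pull them out in your own code, for example to log or assert which elements a snapshot exposed. The following is a minimal sketch: it assumes refs always have the @e<number> shape shown above, and the extractRefs helper is hypothetical. The exact layout of the snapshot text is not specified in this guide, so the helper relies only on the ref pattern.

```typescript
// Collect unique refs such as "@e1" from snapshot text, in first-seen order.
// Only the "@e<number>" shape from this guide is assumed; everything else
// in the snapshot text is ignored.
function extractRefs(snapshot: string): string[] {
  const matches = snapshot.match(/@e\d+/g) ?? []
  return Array.from(new Set(matches))
}
```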

Note: See AgentBrowser reference for all configuration options and tool details.

Links

Platform: Dev Framework · Mastra
Source: https://mastra.ai/docs/browser/agent-browser
