Stagehand Browser Automation Package
Trust: ★★★☆☆ (0.90) · 0 validations · developer_reference
Published: 2026-05-13 · Source: crawler_authoritative
Situation
Guide for using the @mastra/stagehand package to enable AI-powered browser automation in Mastra agents with natural language element targeting and data extraction
Insight
The @mastra/stagehand package provides browser automation powered by the Stagehand SDK from Browserbase. Stagehand uses AI to understand page context and locate elements, so agents can target elements with natural language descriptions instead of explicit CSS selectors. Its key tools are stagehand_navigate for URL navigation, stagehand_act for performing actions described in natural language, stagehand_extract for pulling structured data from pages, stagehand_observe for listing the actions available on the current page, and stagehand_screenshot for capturing a PNG image of the current page. The StagehandBrowser class accepts configuration options including headless, model (e.g., 'openai/gpt-5.4'), env (set to 'BROWSERBASE' for Browserbase cloud browsers), apiKey, projectId, and excludeTools, which disables specific tools, such as stagehand_screenshot for models without vision support. The browser instance is assigned to an Agent via the browser property.
Action
Install the package with npm install @mastra/stagehand, pnpm add @mastra/stagehand, yarn add @mastra/stagehand, or bun add @mastra/stagehand. Import Agent from '@mastra/core/agent' and StagehandBrowser from '@mastra/stagehand', create a StagehandBrowser instance configured with headless and model, and pass it to the Agent constructor via the browser property. For Browserbase cloud integration, set env: 'BROWSERBASE' and provide apiKey and projectId from environment variables. To disable the screenshot tool for non-vision models, use excludeTools: ['stagehand_screenshot']. The stagehand_act tool accepts natural language action descriptions such as 'Press the Sign In button' or 'Type [email protected] in the email field', which Stagehand's AI interprets to find the matching element.
Result
The agent can interact with web pages using natural language commands. stagehand_act interprets and executes described actions. stagehand_extract returns structured JSON data from page content. stagehand_observe returns a list of available actions with element types. stagehand_screenshot returns PNG image content for vision-capable models to interpret.
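Because tool results come from an AI model reading a page, it can be worth checking their shape before using them. The following is a minimal sketch that assumes the result shapes match the examples shown in this document (a product object and a list of action/element pairs). The names ProductInfo, ObservedAction, isProductInfo, isObservedActionList, and parsePrice are hypothetical and are not part of @mastra/stagehand.

```typescript
// Hypothetical shapes, inferred from the example outputs in this document.
// They are not an official schema of @mastra/stagehand.
interface ProductInfo {
  name: string;
  price: string;
  availability: string;
}

interface ObservedAction {
  action: string;
  element: string;
}

// True for plain (non-null, non-array) objects.
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Type guard for an extraction result shaped like the product example.
function isProductInfo(value: unknown): value is ProductInfo {
  return (
    isRecord(value) &&
    typeof value.name === "string" &&
    typeof value.price === "string" &&
    typeof value.availability === "string"
  );
}

// Type guard for an observation result shaped like the login-form example.
function isObservedActionList(value: unknown): value is ObservedAction[] {
  return (
    Array.isArray(value) &&
    value.every(
      (item) =>
        isRecord(item) &&
        typeof item.action === "string" &&
        typeof item.element === "string",
    )
  );
}

// Pull the first number out of a price string; null when none is present.
function parsePrice(price: string): number | null {
  const match = price.replace(/,/g, "").match(/\d+(\.\d+)?/);
  return match ? Number(match[0]) : null;
}
```

Guards like these let downstream code reject a malformed result early instead of failing later on a missing field.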
Applicability conditions
Requires @mastra/core agent setup. Screenshot tool requires vision-capable models. Browserbase integration requires BROWSERBASE_API_KEY and BROWSERBASE_PROJECT_ID environment variables.
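The vision requirement can be encoded once so that non-vision models never receive the screenshot tool. This is a rough sketch: the helper browserOptionsFor and the BrowserOptions type are hypothetical, and only the option names (headless, model, excludeTools) and the tool name stagehand_screenshot come from this document. Whether a given model supports vision is left to the caller.

```typescript
// Hypothetical options type; it mirrors only the options named in this document.
interface BrowserOptions {
  headless: boolean;
  model: string;
  excludeTools?: string[];
}

// Build browser options, dropping the screenshot tool for models
// that the caller has flagged as lacking vision support.
function browserOptionsFor(model: string, supportsVision: boolean): BrowserOptions {
  const options: BrowserOptions = { headless: true, model };
  if (!supportsVision) {
    options.excludeTools = ["stagehand_screenshot"];
  }
  return options;
}
```

The returned object is intended to be passed to the StagehandBrowser constructor in place of an inline options literal.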
Original content
Stagehand
The @mastra/stagehand package provides browser automation using the Stagehand SDK from Browserbase. Stagehand uses AI to understand page context and locate elements, enabling natural language descriptions instead of explicit selectors.
When to use Stagehand
Use Stagehand when you need:
- Natural language element targeting (“select the login button”)
- AI-powered data extraction from pages
- Native Browserbase cloud integration
- Simpler tool interface for common actions
Quickstart
Install the package:
npm:
npm install @mastra/stagehand
pnpm:
pnpm add @mastra/stagehand
Yarn:
yarn add @mastra/stagehand
Bun:
bun add @mastra/stagehand
Create a browser instance and assign it to an agent:
import { Agent } from '@mastra/core/agent'
import { StagehandBrowser } from '@mastra/stagehand'
const browser = new StagehandBrowser({
  headless: false,
  model: 'openai/gpt-5.4',
})
export const stagehandAgent = new Agent({
  id: 'stagehand-agent',
  name: 'Stagehand Agent',
  model: 'openai/gpt-5.4',
  browser,
  instructions: `You are a web automation assistant.
Use stagehand tools to interact with pages:
- stagehand_navigate to go to URLs
- stagehand_act to perform actions described in natural language
- stagehand_extract to get structured data from the page
- stagehand_observe to find available actions on the page
- stagehand_screenshot to visually inspect the page`,
})
Natural language actions
When the agent uses the stagehand_act tool, it accepts natural language descriptions of actions:
- “Press the Sign In button”
- “Type '[email protected]' in the email field”
- “Select ‘United States’ from the country dropdown”
Stagehand’s AI interprets the action and finds the appropriate element on the page.
Data extraction
When the agent uses the stagehand_extract tool, it can pull structured data from pages.
Example instruction: “Extract the product name, price, and availability”
The tool returns structured data based on page content:
{
  "name": "Widget Pro",
  "price": "$29.99",
  "availability": "In Stock"
}
Observing actions
When the agent uses the stagehand_observe tool, it analyzes the current page and returns possible actions.
Example instruction: “What actions can I take on this login form?”
Returns a list of available actions:
[
  { "action": "Press 'Sign In' button", "element": "button" },
  { "action": "Type in 'Email' field", "element": "input" },
  { "action": "Open 'Forgot Password' link", "element": "a" }
]
Screenshots
When the agent uses the stagehand_screenshot tool, it captures a PNG image of the current page and returns it as image content that vision-capable models can interpret directly.
Use screenshots when you need to visually inspect the page — for example, evaluating images, layout, or colors. For text or structured data, use stagehand_extract or stagehand_observe instead.
const browser = new StagehandBrowser({
  headless: false,
  model: 'openai/gpt-5.4',
})
To disable the screenshot tool for models that do not support vision, use excludeTools:
const browser = new StagehandBrowser({
  headless: false,
  model: 'openai/gpt-5.4',
  excludeTools: ['stagehand_screenshot'],
})
Browserbase
Stagehand has native Browserbase integration for cloud browser infrastructure:
const browser = new StagehandBrowser({
  env: 'BROWSERBASE',
  apiKey: process.env.BROWSERBASE_API_KEY,
  projectId: process.env.BROWSERBASE_PROJECT_ID,
  model: 'openai/gpt-5.4',
})
Note: See StagehandBrowser reference for all configuration options.
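Since process.env values may be undefined, one option is to check the two Browserbase variables up front and fail with a clear message. The sketch below assumes a hypothetical helper, browserbaseOptionsFromEnv, and a hypothetical BrowserbaseOptions type. Only the variable names and the option names (env, apiKey, projectId, model) are taken from this document.

```typescript
// Hypothetical options type; it mirrors the Browserbase options shown above.
interface BrowserbaseOptions {
  env: "BROWSERBASE";
  apiKey: string;
  projectId: string;
  model: string;
}

// Read the Browserbase credentials from an environment-like map and
// throw a descriptive error if either one is missing or empty.
function browserbaseOptionsFromEnv(
  env: Record<string, string | undefined>,
  model: string,
): BrowserbaseOptions {
  const required = ["BROWSERBASE_API_KEY", "BROWSERBASE_PROJECT_ID"];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return {
    env: "BROWSERBASE",
    apiKey: env.BROWSERBASE_API_KEY as string,
    projectId: env.BROWSERBASE_PROJECT_ID as string,
    model,
  };
}
```

In a Node.js project the helper would typically be called with process.env as the first argument, and its result passed to the StagehandBrowser constructor.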
Links
- Platform: Dev Framework · Mastra
- Source: https://mastra.ai/docs/browser/stagehand