Stagehand - Natural Language Browser Automation

Trust: ★★★☆☆ (0.90) · 0 validations · developer_reference

Published: 2026-05-10 · Source: crawler_authoritative

Situation

Mastra documentation for configuring and using the @mastra/stagehand package to enable AI-powered browser automation with natural language element targeting.

Insight

The @mastra/stagehand package provides browser automation powered by the Stagehand SDK from Browserbase. It uses AI to understand page context and locate elements, enabling developers to use natural language descriptions instead of explicit CSS selectors or XPath. The package integrates with Mastra’s Agent system, providing four core tools: stagehand_navigate for URL navigation, stagehand_act for performing natural language actions, stagehand_extract for pulling structured data from pages, and stagehand_observe for analyzing available actions on the current page. The StagehandBrowser class accepts configuration options including headless mode (boolean), model specification (e.g., ‘openai/gpt-5.4’), and environment settings for Browserbase cloud integration with apiKey and projectId parameters.

Action

Install the package with npm install @mastra/stagehand, pnpm add @mastra/stagehand, yarn add @mastra/stagehand, or bun add @mastra/stagehand. Create a StagehandBrowser instance with the desired configuration options, then assign it to an Agent. Natural language actions are passed to the stagehand_act tool (e.g., ‘Press the Sign In button’, ‘Type [email protected] in the email field’, ‘Select United States from the country dropdown’). Data extraction uses stagehand_extract with an instruction such as ‘Extract the product name, price, and availability’ and returns structured JSON. The stagehand_observe tool analyzes the current page and returns the available actions as JSON objects with action and element properties. For Browserbase cloud integration, set env: ‘BROWSERBASE’ and pass apiKey and projectId, typically read from the BROWSERBASE_API_KEY and BROWSERBASE_PROJECT_ID environment variables.

Result

Agents configured with StagehandBrowser can interact with web pages using natural language commands, extract structured data from pages automatically, and discover available interactive elements without manual selector configuration.

Applicability conditions

Requires an @mastra/core agent setup. The examples use the openai/gpt-5.4 model; other models can be selected through the model option. Browserbase integration requires valid API credentials (API key and project ID).


Original content

Stagehand

The @mastra/stagehand package provides browser automation using the Stagehand SDK from Browserbase. Stagehand uses AI to understand page context and locate elements, enabling natural language descriptions instead of explicit selectors.

When to use Stagehand

Use Stagehand when you need:

  • Natural language element targeting (“select the login button”)
  • AI-powered data extraction from pages
  • Native Browserbase cloud integration
  • Simpler tool interface for common actions

Quickstart

Install the package:

npm:

npm install @mastra/stagehand

pnpm:

pnpm add @mastra/stagehand

Yarn:

yarn add @mastra/stagehand

Bun:

bun add @mastra/stagehand

Create a browser instance and assign it to an agent:

import { Agent } from '@mastra/core/agent'
import { StagehandBrowser } from '@mastra/stagehand'
 
const browser = new StagehandBrowser({
  headless: false,
  model: 'openai/gpt-5.4',
})
 
export const stagehandAgent = new Agent({
  id: 'stagehand-agent',
  name: 'Stagehand Agent',
  model: 'openai/gpt-5.4',
  browser,
  instructions: `You are a web automation assistant.
 
Use stagehand tools to interact with pages:
- stagehand_navigate to go to URLs
- stagehand_act to perform actions described in natural language
- stagehand_extract to get structured data from the page
- stagehand_observe to find available actions on the page`,
})

Natural language actions

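When the agent uses the stagehand_act tool, it passes a natural language description of the action to perform, for example:

  • “Press the Sign In button”
  • “Select United States from the country dropdown”

Stagehand’s AI interprets the description and finds the matching element on the page.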

Data extraction

When the agent uses the stagehand_extract tool, it can pull structured data from pages.

Example instruction: “Extract the product name, price, and availability”

The tool returns structured data based on page content:

{
  "name": "Widget Pro",
  "price": "$29.99",
  "availability": "In Stock"
}
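Because the extracted values depend on what the model finds on the page, it can help to check the result before using it. The following is a minimal sketch of such a check for the example instruction above. The ProductInfo interface and the isProductInfo and parsePrice helpers are hypothetical and not part of @mastra/stagehand; the sketch only assumes the result has the three string fields shown in the sample.

```typescript
// Hypothetical shape matching the sample output above; not exported by @mastra/stagehand.
interface ProductInfo {
  name: string
  price: string
  availability: string
}

// Type guard: true only when the value is an object with the three string fields.
function isProductInfo(value: unknown): value is ProductInfo {
  if (typeof value !== 'object' || value === null) return false
  const record = value as Record<string, unknown>
  return (
    typeof record['name'] === 'string' &&
    typeof record['price'] === 'string' &&
    typeof record['availability'] === 'string'
  )
}

// Converts a price string such as "$29.99" to a number.
// Returns null when no numeric value can be recovered.
function parsePrice(price: string): number | null {
  const cleaned = price.replace(/[^0-9.]/g, '')
  if (cleaned === '') return null
  const amount = Number(cleaned)
  return Number.isFinite(amount) ? amount : null
}

// The sample result from above, parsed as untrusted input.
const extracted: unknown = JSON.parse(
  '{"name":"Widget Pro","price":"$29.99","availability":"In Stock"}',
)
const product = isProductInfo(extracted) ? extracted : null
```

Keeping the check separate from the agent call means a page that yields missing or renamed fields is caught as null instead of surfacing later as an undefined property.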

Observing actions

When the agent uses the stagehand_observe tool, it analyzes the current page and returns possible actions.

Example instruction: “What actions can I take on this login form?”

Returns a list of available actions:

[
  { "action": "Press 'Sign In' button", "element": "button" },
  { "action": "Type in 'Email' field", "element": "input" },
  { "action": "Open 'Forgot Password' link", "element": "a" }
]
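The observed actions can be filtered before the agent decides what to do next. The sketch below assumes the result arrives as a JSON string whose entries have the action and element properties shown in the sample; whether the tool hands back a string or an already parsed array is not stated here, so treat that as an assumption. ObservedAction, parseObservedActions, and actionsForElement are hypothetical names used only for illustration.

```typescript
// Assumed entry shape, taken from the sample output above.
interface ObservedAction {
  action: string
  element: string
}

// Parses a JSON string and keeps only well-formed entries.
// JSON.parse throws on invalid JSON; a non-array value yields an empty list.
function parseObservedActions(json: string): ObservedAction[] {
  const parsed: unknown = JSON.parse(json)
  if (!Array.isArray(parsed)) return []
  return parsed.filter(
    (item): item is ObservedAction =>
      typeof item === 'object' &&
      item !== null &&
      typeof (item as Record<string, unknown>)['action'] === 'string' &&
      typeof (item as Record<string, unknown>)['element'] === 'string',
  )
}

// Returns the action descriptions that target a given element type.
function actionsForElement(actions: ObservedAction[], element: string): string[] {
  return actions.filter((a) => a.element === element).map((a) => a.action)
}

const observed = parseObservedActions(`[
  { "action": "Press 'Sign In' button", "element": "button" },
  { "action": "Type in 'Email' field", "element": "input" },
  { "action": "Open 'Forgot Password' link", "element": "a" }
]`)
const inputActions = actionsForElement(observed, 'input')
```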

Browserbase

Stagehand has native Browserbase integration for cloud browser infrastructure:

const browser = new StagehandBrowser({
  env: 'BROWSERBASE',
  apiKey: process.env.BROWSERBASE_API_KEY,
  projectId: process.env.BROWSERBASE_PROJECT_ID,
  model: 'openai/gpt-5.4',
})

Note: See StagehandBrowser reference for all configuration options.
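Since the Browserbase example reads its credentials from environment variables, a missing variable would otherwise reach the constructor as undefined. One way to fail early is sketched below. The option names (env, apiKey, projectId, model) mirror the example above, while BrowserbaseOptions and browserbaseOptionsFromEnv are hypothetical helpers that are not provided by @mastra/stagehand.

```typescript
// Option names mirror the Browserbase example above; the type itself is hypothetical.
interface BrowserbaseOptions {
  env: 'BROWSERBASE'
  apiKey: string
  projectId: string
  model: string
}

// A plain record is used in place of process.env so the helper has no Node.js dependency.
type Env = Record<string, string | undefined>

// Builds the options object, throwing if either credential is missing or empty.
function browserbaseOptionsFromEnv(env: Env, model: string): BrowserbaseOptions {
  const apiKey = env['BROWSERBASE_API_KEY']
  const projectId = env['BROWSERBASE_PROJECT_ID']

  const missing: string[] = []
  if (!apiKey) missing.push('BROWSERBASE_API_KEY')
  if (!projectId) missing.push('BROWSERBASE_PROJECT_ID')

  if (!apiKey || !projectId) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`)
  }
  return { env: 'BROWSERBASE', apiKey, projectId, model }
}
```

In a Node.js process the result could be passed along as new StagehandBrowser(browserbaseOptionsFromEnv(process.env, 'openai/gpt-5.4')), assuming the constructor accepts these fields as in the example above.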

Links

See also: