Get Started with Stagehand

Stagehand is an AI-powered web browsing framework that enables intelligent browser automation through natural language. By integrating Cerebras models, you can leverage ultra-fast inference for web scraping, form filling, automated testing, and complex multi-step workflows that feel responsive and natural.

Prerequisites

Before you begin, ensure you have:

Cerebras API Key - Get a free API key here
Browserbase Account - Visit Browserbase and create an account to get your API key and Project ID
Node.js 20 or higher - Stagehand requires a modern Node.js environment

Stagehand works best with fast inference providers like Cerebras. The ultra-low latency of Cerebras models (gpt-oss-120b, llama3.1-8b) enables near-instantaneous browser control decisions, making your automation agents significantly more responsive than traditional approaches.

Configure Stagehand with Cerebras

Create a new Stagehand project

The fastest way to get started is using the official Stagehand project creator, which sets up everything you need including dependencies:

npx create-browser-app@latest

Then follow the setup instructions:

cd my-stagehand-app
npm install
cp .env.example .env

The project template already includes zod and dotenv as dependencies—no need to install them separately.

Configure environment variables

Edit your .env file with your API credentials. You’ll need both Cerebras and Browserbase credentials to enable AI-powered browser automation:

BROWSERBASE_PROJECT_ID=your-browserbase-project-id-here
BROWSERBASE_API_KEY=your-browserbase-api-key-here
CEREBRAS_API_KEY=your-cerebras-api-key-here

You can find your Browserbase API Key and Project ID in the Browserbase Dashboard under the Overview section.
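
A missing credential usually only shows up later as a confusing runtime error, so it can help to check the three variables above before creating a Stagehand instance. The sketch below is one way to do that; `missingEnvVars` and `REQUIRED_VARS` are hypothetical names for illustration and are not part of Stagehand or the project template.

```typescript
// Variable names taken from the .env example above.
const REQUIRED_VARS = [
  "BROWSERBASE_PROJECT_ID",
  "BROWSERBASE_API_KEY",
  "CEREBRAS_API_KEY",
] as const;

// Hypothetical helper: returns the names of required variables that are unset or blank.
function missingEnvVars(
  env: Record<string, string | undefined>,
  required: readonly string[] = REQUIRED_VARS,
): string[] {
  return required.filter((name) => (env[name] ?? "").trim() === "");
}
```

In a script you would typically call `missingEnvVars(process.env)` right after `import 'dotenv/config'` and stop with a clear message if the returned list is not empty.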

Extract structured data from Hacker News

Let’s start with a practical example—extracting the top stories from Hacker News. This demonstrates how Stagehand can intelligently parse and structure real web content:

import 'dotenv/config';
import { Stagehand } from "@browserbasehq/stagehand";
import { z } from "zod/v3";

(async () => {
  try {
    const stagehand = new Stagehand({
      env: "BROWSERBASE",
      model: {
        modelName: "cerebras/gpt-oss-120b",
        apiKey: process.env.CEREBRAS_API_KEY,
      }
    });

    await stagehand.init();
    const page = stagehand.context.pages()[0];

    await page.goto("https://news.ycombinator.com");

    const result = await stagehand.extract(
      "Extract the top 5 stories with their titles and points",
      z.object({
        stories: z.array(z.object({
          title: z.string(),
          points: z.number(),
        })),
      })
    );

    console.log("Top Hacker News stories:", result.stories);
    await stagehand.close();
  } catch (error) {
    console.error("Error:", error);
    process.exit(1);
  }
})();

We’re using Cerebras’s gpt-oss-120b model here, which excels at structured data extraction tasks due to its strong reasoning capabilities and fast inference speed.

Perform actions with natural language

Now let’s use Stagehand’s act() method to interact with a website. This example searches GitHub for “stagehand browserbase” and extracts results:

import 'dotenv/config';
import { Stagehand } from "@browserbasehq/stagehand";
import { z } from "zod/v3";

(async () => {
  try {
    const stagehand = new Stagehand({
      env: "BROWSERBASE",
      model: {
        modelName: "cerebras/gpt-oss-120b",
        apiKey: process.env.CEREBRAS_API_KEY,
      }
    });

    await stagehand.init();
    const page = stagehand.context.pages()[0];

    await page.goto("https://github.com");

    // Use atomic instructions (best practice for reliable automation)
    await stagehand.act("Click the search button");
    await stagehand.act("Type 'stagehand browserbase' into the search input");
    await stagehand.act("Press Enter to submit the search");

    // Wait for results to load
    await new Promise(resolve => setTimeout(resolve, 2000));

    // Extract search results
    const result = await stagehand.extract(
      "Extract the repository names from the search results",
      z.object({
        repositories: z.array(z.string()),
      })
    );

    console.log("Found repositories:", result.repositories);
    await stagehand.close();
  } catch (error) {
    console.error("Error:", error);
    process.exit(1);
  }
})();

This example demonstrates Stagehand’s act() method: you describe what you want to do in plain English, and Stagehand figures out how to interact with the page. With Cerebras’s gpt-oss-120b model, each of these decisions typically returns with sub-second latency.

Best Practice: Use atomic, single-step instructions for reliable automation. Instead of “Click the search box and search for X”, break it into: “Click the search button”, “Type X into the input”, “Press Enter”.
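
One way to keep instructions atomic is to hold them in a list and run them one at a time, so a failure points at a single step. This is a minimal sketch: `runSteps` and the `Actor` interface are hypothetical names, and a Stagehand instance is only assumed to fit the `Actor` shape because it exposes `act()`.

```typescript
// Minimal shape needed by the helper; assumed to be satisfied by a Stagehand instance.
interface Actor {
  act(instruction: string): Promise<unknown>;
}

// Hypothetical helper: runs atomic instructions in order and reports which step failed.
async function runSteps(actor: Actor, steps: string[]): Promise<void> {
  for (let i = 0; i < steps.length; i++) {
    try {
      await actor.act(steps[i]);
    } catch (error) {
      throw new Error(`Step ${i + 1} failed ("${steps[i]}"): ${String(error)}`);
    }
  }
}
```

The GitHub search above would then become a single call with the three instructions passed as an array.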

Observe and analyze page content

Before taking action, you can use observe() to understand what’s available on a page. This is useful for dynamic decision-making in your automation workflows:

import 'dotenv/config';
import { Stagehand } from "@browserbasehq/stagehand";

(async () => {
  try {
    const stagehand = new Stagehand({
      env: "BROWSERBASE",
      model: {
        modelName: "cerebras/gpt-oss-120b",
        apiKey: process.env.CEREBRAS_API_KEY,
      }
    });

    await stagehand.init();
    const page = stagehand.context.pages()[0];

    await page.goto("https://www.producthunt.com");

    // Observe what actions are possible on the page
    const observations = await stagehand.observe(
      "What are the main interactive elements on this page?"
    );

    console.log("Available actions:", observations);

    // You can now decide which action to take based on observations
    if (observations.length > 0) {
      await stagehand.act(observations[0]);
    }

    await stagehand.close();
  } catch (error) {
    console.error("Error:", error);
    process.exit(1);
  }
})();

The observe() method analyzes the page and returns suggestions without taking action. Use this when you want to preview options before committing, or when building adaptive automation that responds to different page states.
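
Taking the first observation is not always what you want. As a rough sketch of adaptive selection, the helper below picks the first observation whose description mentions a keyword. It assumes each observation carries a `description` string; `pickByKeyword` and the `Observation` interface are hypothetical names for illustration.

```typescript
// Assumed shape: only the `description` field is relied on here.
interface Observation {
  description: string;
}

// Hypothetical helper: returns the first observation whose description contains the keyword.
function pickByKeyword<T extends Observation>(
  observations: T[],
  keyword: string,
): T | undefined {
  const needle = keyword.toLowerCase();
  for (const observation of observations) {
    if (observation.description.toLowerCase().indexOf(needle) !== -1) {
      return observation;
    }
  }
  return undefined;
}
```

If the helper returns `undefined`, the script can skip the action or fall back to a different instruction instead of acting on an unrelated element.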

Build multi-step workflows

Combine all three methods—extract(), act(), and observe()—to create sophisticated automation workflows. This example navigates multiple sites and gathers comparative data:

import 'dotenv/config';
import { Stagehand } from "@browserbasehq/stagehand";
import { z } from "zod/v3";

(async () => {
  try {
    const stagehand = new Stagehand({
      env: "BROWSERBASE",
      model: {
        modelName: "cerebras/gpt-oss-120b",
        apiKey: process.env.CEREBRAS_API_KEY,
      }
    });

    await stagehand.init();
    const page = stagehand.context.pages()[0];

    // Step 1: Search on GitHub
    console.log("Step 1: Searching GitHub...");
    await page.goto("https://github.com/search?q=browser+automation&type=repositories");

    const githubResult = await stagehand.extract(
      "Extract the name and description of the first repository",
      z.object({
        name: z.string(),
        description: z.string(),
      })
    );
    console.log("GitHub top repo:", githubResult);

    // Step 2: Search on npm
    console.log("Step 2: Searching npm...");
    await page.goto("https://www.npmjs.com/search?q=browser%20automation");

    const npmResult = await stagehand.extract(
      "Extract the name and description of the first npm package",
      z.object({
        name: z.string(),
        description: z.string(),
      })
    );
    console.log("npm top package:", npmResult);

    console.log("\nComparative research results:");
    console.log("- GitHub:", githubResult.name);
    console.log("- npm:", npmResult.name);

    await stagehand.close();
  } catch (error) {
    console.error("Error:", error);
    process.exit(1);
  }
})();

Cerebras’s fast inference enables multi-step workflows to complete quickly, making it practical to build efficient browser automation that navigates multiple sites and makes intelligent decisions at each step.
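
To see where time goes in a multi-step workflow, you can wrap each step in a small timer. This is only a sketch; `timeStep` and `StepTiming` are hypothetical names, and the measurement covers the whole step (page load plus inference), not model latency alone.

```typescript
interface StepTiming<T> {
  label: string;
  ms: number;
  value: T;
}

// Hypothetical helper: runs an async step and records its wall-clock duration.
async function timeStep<T>(label: string, fn: () => Promise<T>): Promise<StepTiming<T>> {
  const start = Date.now();
  const value = await fn();
  return { label, ms: Date.now() - start, value };
}
```

Each `extract()` call in the workflow above could be passed as the `fn` argument, with the returned `ms` logged next to the result.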

Why Use Cerebras with Stagehand?

Cerebras’s ultra-fast inference provides several key advantages for browser automation:

Real-time responsiveness - Sub-second inference enables agents to react instantly to page changes and dynamic content, making automation feel natural
Complex reasoning - Models like gpt-oss-120b handle sophisticated multi-step workflows and make intelligent decisions about ambiguous page elements
Cost-effective scaling - Fast inference shortens long-running automation tasks, which can lower their overall cost compared to slower providers
Reliable execution - Low latency reduces timeouts and improves task completion rates, especially for time-sensitive operations and dynamic web applications
Better developer experience - Near-instantaneous responses during development make debugging and iteration significantly faster

Key Features

Four Powerful Primitives

Stagehand provides four complementary approaches to browser automation:

act() - Execute actions using natural language instructions (click, type, navigate, scroll)
extract() - Pull structured data from pages using AI, with Zod schema validation
observe() - Discover available actions on any page without executing them
agent() - Automate entire workflows autonomously for complex multi-step tasks

Natural Language Control

Describe what you want to do in plain English, and Stagehand’s AI will figure out how to interact with the page. No need for brittle CSS selectors or XPath queries.

Structured Data Extraction

Use Zod schemas to define exactly what data you want to extract, and Stagehand will find and structure it for you with built-in validation.

Works Everywhere

Stagehand v3 is compatible with all Chromium-based browsers. It also offers integrations with Playwright, Puppeteer, and Selenium for developers who want to combine AI-powered automation with traditional browser control.

Available Models

Stagehand works with all Cerebras models for browser automation:

Model	Parameters	Best For
llama3.1-8b	8B	Fastest option for simple tasks and high-throughput scenarios
gpt-oss-120b	120B	Balanced speed and reasoning for most automation tasks
zai-glm-4.7	357B	Largest model, with strong reasoning for the most demanding tasks

Change the modelName parameter when creating your Stagehand instance to switch between models.
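
If you switch models often, a small factory keeps the `model` option in one place. The sketch below assumes every Cerebras model uses the same `cerebras/<name>` prefix shown in the examples for gpt-oss-120b; `cerebrasModelConfig` is a hypothetical helper, not a Stagehand API.

```typescript
// Model names from the table above.
const CEREBRAS_MODELS = ["llama3.1-8b", "gpt-oss-120b", "zai-glm-4.7"] as const;
type CerebrasModel = (typeof CEREBRAS_MODELS)[number];

// Hypothetical helper: builds the `model` option for a given Cerebras model.
// Assumption: all models follow the "cerebras/<name>" naming used for gpt-oss-120b.
function cerebrasModelConfig(model: CerebrasModel, apiKey: string | undefined) {
  return { modelName: `cerebras/${model}`, apiKey };
}
```

The result can be passed as the `model` field when constructing Stagehand, for example with `process.env.CEREBRAS_API_KEY` as the second argument.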

Next Steps

Explore Stagehand’s full documentation for advanced features like custom selectors and context management
Try different Cerebras models to optimize for your use case (speed vs. reasoning capability)
Check out Stagehand GitHub for more automation patterns and community examples
Migrate to GLM4.7: Ready to upgrade? Follow our migration guide to start using our latest model

Troubleshooting

Stagehand can't find an element on the page

If Stagehand struggles to locate page elements:

Use atomic instructions - Break complex actions into single steps. Instead of “Click the search box and search for ‘query’”, use three separate calls: “Click the search button”, “Type ‘query’ into the search input”, “Press Enter”
Be more specific in your instructions - Instead of “click the button”, try “click the blue submit button in the bottom right corner”
Wait for dynamic content - Some elements load via JavaScript. Add explicit waits: await new Promise(r => setTimeout(r, 2000));
Verify the element exists - Use observe() first to see what Stagehand detects on the page
Try a different model - gpt-oss-120b generally has better reasoning for complex pages than faster models

Example with explicit waiting:

await page.goto("https://example.com");
await new Promise(r => setTimeout(r, 2000)); // Wait for dynamic content
await stagehand.act("Click the load more button");
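
A fixed two-second sleep is either too long or too short on most pages. As an alternative sketch, the helper below polls a condition until it holds or a timeout passes. `waitFor` is a hypothetical helper written for this guide, not a Stagehand method, and the default timings are arbitrary.

```typescript
// Hypothetical helper: polls `condition` until it returns true or the timeout elapses.
// The condition is always checked at least once.
async function waitFor(
  condition: () => Promise<boolean>,
  timeoutMs = 10000,
  intervalMs = 250,
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    if (await condition()) return true;
    if (Date.now() >= deadline) return false;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

The condition could, for instance, call `observe()` and return whether any matching element was reported. Keep in mind that each poll of that kind costs one model call, so a longer interval may be preferable.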

How do I handle authentication and sessions?

Stagehand can handle authenticated sessions through Browserbase’s session management. For sites that require authentication:

Option 1: Let Stagehand handle the login flow

await page.goto("https://example.com/login");
await stagehand.act("Enter 'user@example.com' in the email field");
await stagehand.act("Enter the password in the password field");
await stagehand.act("Click the login button");
await new Promise(r => setTimeout(r, 2000));

Option 2: Use Browserbase session persistence

Create a session that persists authentication:

const stagehand = new Stagehand({
  env: "BROWSERBASE",
  browserbaseSessionID: "existing-session-id", // ID of an existing Browserbase session to reuse
  model: {
    modelName: "cerebras/gpt-oss-120b",
    apiKey: process.env.CEREBRAS_API_KEY,
  }
});

The session will maintain cookies and authentication state across runs.

What's the difference between act() and observe()?

observe() analyzes the page and returns suggestions without taking action. Use this when you want to preview options before committing, or when building agents that need to dynamically decide their next step.
act() executes an action based on your natural language instruction. This performs the actual browser interaction (clicking, typing, scrolling, etc.).

Recommended pattern: Use observe + act for more reliable automation:

// Get candidate actions first
const actions = await stagehand.observe("Click the sign in button");

// Execute the first action
await stagehand.act(actions[0]);

Both methods leverage Cerebras’s fast inference to understand page context and determine the best course of action.
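
The pattern above fails with an unclear error if `observe()` returns nothing, because `actions[0]` is then undefined. A guarded version might look like the sketch below. `actIfFound` and the `ObserveAndAct` interface are hypothetical names; a Stagehand instance is assumed to fit that shape.

```typescript
// Minimal shape needed by the helper; assumed to be satisfied by a Stagehand instance.
interface ObserveAndAct<A> {
  observe(instruction: string): Promise<A[]>;
  act(action: A): Promise<unknown>;
}

// Hypothetical helper: acts on the first candidate, or returns false when none was found.
async function actIfFound<A>(
  target: ObserveAndAct<A>,
  instruction: string,
): Promise<boolean> {
  const candidates = await target.observe(instruction);
  if (candidates.length === 0) return false;
  await target.act(candidates[0]);
  return true;
}
```

Returning a boolean lets the caller decide whether a missing element is an error or just a page state to handle differently.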

What's the difference between extract() and traditional web scraping?

Traditional web scraping requires:

Writing brittle CSS selectors that break when the page layout changes
Manual handling of dynamic content and pagination
Complex parsing logic to structure data

Stagehand’s extract() method:

Uses AI to intelligently locate data regardless of page structure changes
Handles dynamic content automatically by understanding visual context
Structures data according to your Zod schema with built-in validation
Works across different websites with similar content types without code changes

Example comparison:

Traditional scraping:

const titles = await page.$$eval('.story-title > a', els => 
  els.map(el => el.textContent)
);

Stagehand:

import { z } from "zod/v3";

const result = await stagehand.extract(
  "Get all story titles",
  z.object({ titles: z.array(z.string()) })
);
console.log(result.titles);

The Stagehand version will work even if the CSS classes change.

How do I handle rate limits and CAPTCHAs?

For rate limits:

Add delays between operations using await new Promise(r => setTimeout(r, 1000))
Use Browserbase’s session management to spread requests across multiple browser contexts
Implement exponential backoff retry logic
Consider using Cerebras’s faster models (llama3.1-8b) to reduce overall execution time

For CAPTCHAs and bot detection:

Browserbase provides features specifically designed to avoid detection:

Residential proxies to appear as real users
Realistic browser fingerprints
Proper browser context management

Example with retry logic:

async function extractWithRetry(stagehand, instruction, schema, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await stagehand.extract(instruction, schema);
    } catch (error) {
      if (i === maxRetries - 1) throw error;
      await new Promise(r => setTimeout(r, Math.pow(2, i) * 1000));
    }
  }
}
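
Retries handle failures after they happen. The first rate-limit suggestion, adding delays between operations, can be sketched as a helper that runs tasks one at a time with a pause in between. `runPaced` is a hypothetical name, and the one-second default simply mirrors the delay mentioned in the list above.

```typescript
// Hypothetical helper: runs tasks sequentially, waiting `delayMs` between consecutive tasks.
async function runPaced<T>(
  tasks: Array<() => Promise<T>>,
  delayMs = 1000,
): Promise<T[]> {
  const results: T[] = [];
  for (let i = 0; i < tasks.length; i++) {
    // No delay before the first task.
    if (i > 0) await new Promise((resolve) => setTimeout(resolve, delayMs));
    results.push(await tasks[i]());
  }
  return results;
}
```

Each task can wrap a call such as `extractWithRetry(...)`, which combines pacing with backoff.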

Which Cerebras model should I use for different tasks?

Choose your model based on the complexity of your automation task:

zai-glm-4.7 - Best for:

Complex, multi-step workflows with decision branching
Pages with ambiguous or hard-to-locate elements
Tasks requiring sophisticated reasoning about page content
When accuracy is more important than speed

gpt-oss-120b - Best for:

Structured data extraction from well-formatted pages
Balanced performance between speed and reasoning
General-purpose automation tasks
Great middle-ground for most use cases

llama3.1-8b - Best for:

Simple, repetitive tasks with clear page structure
High-volume automation where speed is critical
Extracting data from consistent page layouts
When you need maximum throughput

How do I debug when Stagehand isn't working as expected?

Several strategies can help diagnose issues:

Use observe() to see what Stagehand detects:

const observations = await stagehand.observe("What can I interact with?");
console.log("Stagehand sees:", observations);

Enable verbose logging to see Stagehand’s decision-making:

const stagehand = new Stagehand({
  env: "BROWSERBASE",
  verbose: 2, // Set to 1 or 2 for more detailed logs
  model: {
    modelName: "cerebras/gpt-oss-120b",
    apiKey: process.env.CEREBRAS_API_KEY,
  }
});

Use Browserbase’s live view to watch the browser in real-time during development. The session URL is logged when you call stagehand.init().
Start with simpler instructions and gradually add complexity to isolate where things break.
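
To switch log detail without editing code, the verbosity can be read from the environment. This sketch assumes a variable named `STAGEHAND_VERBOSE`, which is a hypothetical name chosen for this example, and assumes that 0 is the least verbose setting; `parseVerbose` is likewise a hypothetical helper.

```typescript
type VerboseLevel = 0 | 1 | 2;

// Hypothetical helper: maps a raw string to a verbosity level, defaulting to 0.
function parseVerbose(raw: string | undefined): VerboseLevel {
  if (raw === "1") return 1;
  if (raw === "2") return 2;
  return 0;
}
```

The result can be passed as the `verbose` option, for example `verbose: parseVerbose(process.env.STAGEHAND_VERBOSE)`.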

Additional Resources

Stagehand GitHub Repository - Source code, examples, and community discussions
Stagehand Documentation - Comprehensive guides and API reference
Browserbase Documentation - Browser infrastructure and session management
Cerebras Model Documentation - Learn about available models and their capabilities
GLM4.7 migration guide - Upgrade to the latest model
