Stagehand is an AI-powered web browsing framework that enables intelligent browser automation through natural language. By integrating Cerebras models, you can leverage ultra-fast inference for web scraping, form filling, automated testing, and complex multi-step workflows that feel responsive and natural.
Documentation Index
Fetch the complete documentation index at: https://inference-docs.cerebras.ai/llms.txt
Use this file to discover all available pages before exploring further.
Prerequisites
Before you begin, ensure you have:
- Cerebras API Key - Get a free API key here
- Browserbase Account - Visit Browserbase and create an account to get your API key and Project ID
- Node.js 20 or higher - Stagehand requires a modern Node.js environment
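These prerequisites can be checked at startup, before any browser session is created. The following is a minimal sketch, not part of Stagehand: the `checkPrerequisites` helper and the environment variable names (`CEREBRAS_API_KEY`, `BROWSERBASE_API_KEY`, `BROWSERBASE_PROJECT_ID`) are assumptions for illustration, so match them to whatever your own `.env` file defines.

```typescript
// Assumed variable names; adjust to match your own .env file.
const REQUIRED_VARS = [
  "CEREBRAS_API_KEY",
  "BROWSERBASE_API_KEY",
  "BROWSERBASE_PROJECT_ID",
] as const;

type Env = Record<string, string | undefined>;

// Returns a list of human-readable problems; an empty list means all checks passed.
function checkPrerequisites(env: Env, nodeVersion: string): string[] {
  const problems: string[] = [];

  // Node.js 20 or higher is required; only the major version is compared.
  const [majorText = ""] = nodeVersion.replace(/^v/, "").split(".");
  const major = Number.parseInt(majorText, 10);
  if (Number.isNaN(major) || major < 20) {
    problems.push(`Node.js 20+ required, found ${nodeVersion}`);
  }

  // Every credential must be present and non-blank.
  for (const name of REQUIRED_VARS) {
    const value = env[name];
    if (value === undefined || value.trim() === "") {
      problems.push(`Missing environment variable ${name}`);
    }
  }
  return problems;
}
```

In a Node.js script this could be called as `checkPrerequisites(process.env, process.versions.node)`, stopping early if the returned list is not empty.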
Configure Stagehand with Cerebras
Create a new Stagehand project
The project setup includes zod and dotenv as dependencies, so there is no need to install them separately.
Configure environment variables
Create a .env file with your API credentials. You’ll need both Cerebras and Browserbase credentials to enable AI-powered browser automation.
Extract structured data from Hacker News
The gpt-oss-120b model is used here; it excels at structured data extraction tasks due to its strong reasoning capabilities and fast inference speed.
Perform actions with natural language
Use the act() method to interact with a website. This example searches GitHub for “stagehand browserbase” and extracts results.
With the act() method, you describe what you want to do in plain English, and Stagehand figures out how to interact with the page. With Cerebras’s gpt-oss-120b model, these decisions happen in milliseconds.
Observe and analyze page content
Use observe() to understand what’s available on a page. This is useful for dynamic decision-making in your automation workflows.
The observe() method analyzes the page and returns suggestions without taking action. Use this when you want to preview options before committing, or when building adaptive automation that responds to different page states.
Build multi-step workflows
Combine the extract(), act(), and observe() methods to create sophisticated automation workflows. This example navigates multiple sites and gathers comparative data.
Why Use Cerebras with Stagehand?
Cerebras’s ultra-fast inference provides several key advantages for browser automation:
- Real-time responsiveness - Sub-second inference enables agents to react instantly to page changes and dynamic content, making automation feel natural
- Complex reasoning - Models like gpt-oss-120b handle sophisticated multi-step workflows and make intelligent decisions about ambiguous page elements
- Cost-effective scaling - Fast inference means lower costs for long-running automation tasks and dramatically reduced API usage compared to slower providers
- Reliable execution - Low latency reduces timeouts and improves task completion rates, especially for time-sensitive operations and dynamic web applications
- Better developer experience - Near-instantaneous responses during development make debugging and iteration significantly faster
Key Features
Four Powerful Primitives
Stagehand provides four complementary approaches to browser automation:
- act() - Execute actions using natural language instructions (click, type, navigate, scroll)
- extract() - Pull structured data from pages using AI, with Zod schema validation
- observe() - Discover available actions on any page without executing them
- agent() - Automate entire workflows autonomously for complex multi-step tasks
Natural Language Control
Describe what you want to do in plain English, and Stagehand’s AI will figure out how to interact with the page. No need for brittle CSS selectors or XPath queries.
Structured Data Extraction
Use Zod schemas to define exactly what data you want to extract, and Stagehand will find and structure it for you with built-in validation.
Works Everywhere
Stagehand v3 is compatible with all Chromium-based browsers. It also offers integrations with Playwright, Puppeteer, and Selenium for developers who want to combine AI-powered automation with traditional browser control.
Available Models
Stagehand works with all Cerebras models for browser automation:

| Model | Parameters | Best For |
|---|---|---|
| llama3.1-8b | 8B | Fastest option for simple tasks and high-throughput scenarios |
| gpt-oss-120b | 120B | Large model for demanding tasks |
| zai-glm-4.7 | 357B | Advanced model with strong reasoning capabilities |
Set the modelName parameter when creating your Stagehand instance to switch between models.
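As a rough illustration of switching models, the sketch below maps a task profile to one of the model IDs in the table and returns the `modelName` fragment of the options. The `TaskProfile` categories and the `modelOptions` helper are hypothetical, and Stagehand may expect a provider prefix on the model name, so check its documentation for the exact form.

```typescript
// Model IDs are taken from the table above. The task profiles are an
// illustrative simplification, not an official recommendation.
type CerebrasModel = "llama3.1-8b" | "gpt-oss-120b" | "zai-glm-4.7";
type TaskProfile = "high-throughput" | "demanding" | "reasoning-heavy";

const MODEL_FOR_PROFILE: Record<TaskProfile, CerebrasModel> = {
  "high-throughput": "llama3.1-8b",
  "demanding": "gpt-oss-120b",
  "reasoning-heavy": "zai-glm-4.7",
};

// Builds only the model-selection part of the options object.
// Other constructor options are intentionally left out of this sketch.
function modelOptions(profile: TaskProfile): { modelName: CerebrasModel } {
  return { modelName: MODEL_FOR_PROFILE[profile] };
}
```

The returned object can be spread into whatever options you already pass when creating the Stagehand instance, which keeps the model choice in one place.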
Next Steps
- Explore Stagehand’s full documentation for advanced features like custom selectors and context management
- Try different Cerebras models to optimize for your use case (speed vs. reasoning capability)
- Check out Stagehand GitHub for more automation patterns and community examples
- Migrate to GLM4.7: Ready to upgrade? Follow our migration guide to start using our latest model
Troubleshooting
Stagehand can't find an element on the page
- Use atomic instructions - Break complex actions into single steps. Instead of “Click the search box and search for ‘query’”, use three separate calls: “Click the search box”, “Type ‘query’ into the search input”, “Press Enter”
- Be more specific in your instructions - Instead of “click the button”, try “click the blue submit button in the bottom right corner”
- Wait for dynamic content - Some elements load via JavaScript. Add explicit waits: await new Promise(r => setTimeout(r, 2000));
- Verify the element exists - Use observe() first to see what Stagehand detects on the page
- Try a different model - gpt-oss-120b generally has better reasoning for complex pages than faster models
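A fixed two-second wait is simple but either wastes time or is too short. One alternative is to poll for a condition until a deadline passes. The `waitUntil` helper below is a hypothetical utility written for this guide, not a Stagehand API, and the default timings are arbitrary.

```typescript
const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Polls `check` until it returns true or the timeout elapses.
// Resolves to true on success and false on timeout.
async function waitUntil(
  check: () => Promise<boolean>,
  timeoutMs = 10_000,
  intervalMs = 250,
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    if (await check()) return true;
    // Stop if another sleep would run past the deadline.
    if (Date.now() + intervalMs > deadline) return false;
    await sleep(intervalMs);
  }
}
```

The `check` callback could, for example, call observe() and return true once the expected element shows up in its results.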
How do I handle authentication and sessions?
What's the difference between act() and observe()?
- observe() analyzes the page and returns suggestions without taking action. Use this when you want to preview options before committing, or when building agents that need to dynamically decide their next step.
- act() executes an action based on your natural language instruction. This performs the actual browser interaction (clicking, typing, scrolling, etc.).
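The observe-then-act pattern can be sketched as follows. The `ObservingPage` interface is a hypothetical stand-in in which each suggestion is a plain description string; the real Stagehand return types may differ, so treat this as an outline of the control flow only.

```typescript
// Hypothetical minimal shape; not the actual Stagehand type definitions.
interface ObservingPage {
  observe(instruction: string): Promise<string[]>;
  act(instruction: string): Promise<void>;
}

// Looks at what is available first, then acts only if a preferred option exists.
// Returns the suggestion that was acted on, or null if nothing matched.
async function actOnPreferred(
  page: ObservingPage,
  instruction: string,
  preferredKeywords: string[],
): Promise<string | null> {
  const suggestions = await page.observe(instruction);
  for (const keyword of preferredKeywords) {
    const match = suggestions.find((s) =>
      s.toLowerCase().includes(keyword.toLowerCase()),
    );
    if (match !== undefined) {
      await page.act(match);
      return match;
    }
  }
  return null; // Nothing suitable was observed, so no action was taken.
}
```

Returning null instead of throwing lets the caller decide what to do on pages where none of the preferred options are present.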
What's the difference between extract() and traditional web scraping?
Traditional web scraping requires:
- Writing brittle CSS selectors that break when the page layout changes
- Manual handling of dynamic content and pagination
- Complex parsing logic to structure data
The extract() method:
- Uses AI to intelligently locate data regardless of page structure changes
- Handles dynamic content automatically by understanding visual context
- Structures data according to your Zod schema with built-in validation
- Works across different websites with similar content types without code changes
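To make the validation step concrete, here is a dependency-free sketch of the kind of check a schema performs on extracted data. It is a stand-in written with plain type guards, not Zod itself, and the `Story` shape with `title` and `points` fields is a hypothetical example.

```typescript
// Hypothetical record shape, used only for illustration.
interface Story {
  title: string;
  points: number;
}

// Type guard: true only if the value has the expected fields and types.
function isStory(value: unknown): value is Story {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return (
    typeof record.title === "string" &&
    typeof record.points === "number" &&
    Number.isFinite(record.points)
  );
}

// Splits raw extraction output into valid records and rejected items.
function validateStories(raw: unknown): { valid: Story[]; rejected: unknown[] } {
  const items: unknown[] = Array.isArray(raw) ? raw : [];
  const valid: Story[] = [];
  const rejected: unknown[] = [];
  for (const item of items) {
    if (isStory(item)) {
      valid.push(item);
    } else {
      rejected.push(item);
    }
  }
  return { valid, rejected };
}
```

Keeping the rejected items, rather than discarding them, makes it easier to see when the model returned something in an unexpected shape.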
How do I handle rate limits and CAPTCHAs?
For rate limits:
- Add delays between operations using await new Promise(r => setTimeout(r, 1000))
- Use Browserbase’s session management to spread requests across multiple browser contexts
- Implement exponential backoff retry logic
- Consider using Cerebras’s faster models (llama3.1-8b) to reduce overall execution time
For CAPTCHAs:
- Residential proxies to appear as real users
- Realistic browser fingerprints
- Proper browser context management
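The exponential backoff suggested for rate limits can be sketched as a small generic wrapper. `withBackoff` is a hypothetical helper, not part of Stagehand or Browserbase, and the default attempt count and base delay are arbitrary starting points.

```typescript
const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Retries `operation` until it succeeds or maxAttempts is reached.
// The wait doubles after each failure: base, 2x base, 4x base, and so on.
async function withBackoff<T>(
  operation: () => Promise<T>,
  maxAttempts = 4,
  baseDelayMs = 1000,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      // Do not sleep after the final failed attempt.
      if (attempt < maxAttempts - 1) {
        await sleep(baseDelayMs * 2 ** attempt);
      }
    }
  }
  throw lastError;
}
```

Any rate-limited call can be wrapped this way, for example `withBackoff(() => doOneStep())`, where `doOneStep` stands for whichever automation call you want to retry.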
Which Cerebras model should I use for different tasks?
- Complex, multi-step workflows with decision branching
- Pages with ambiguous or hard-to-locate elements
- Tasks requiring sophisticated reasoning about page content
- When accuracy is more important than speed
- Structured data extraction from well-formatted pages
- Balanced performance between speed and reasoning
- General-purpose automation tasks
- Great middle-ground for most use cases
- Simple, repetitive tasks with clear page structure
- High-volume automation where speed is critical
- Extracting data from consistent page layouts
- When you need maximum throughput
How do I debug when Stagehand isn't working as expected?
- Use observe() to see what Stagehand detects.
- Enable verbose logging to see Stagehand’s decision-making.
- Use Browserbase’s live view to watch the browser in real-time during development. The session URL is logged when you call stagehand.init().
- Start with simpler instructions and gradually add complexity to isolate where things break.
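One way to isolate where things break is to run instructions one at a time and record the first one that fails. The sketch below assumes only an object with an `act(instruction)` method returning a promise; the `Actor` interface and `findFirstFailure` helper are hypothetical names introduced for illustration.

```typescript
// Hypothetical minimal shape of something that can perform an instruction.
interface Actor {
  act(instruction: string): Promise<unknown>;
}

// Runs instructions in order, logging the outcome and duration of each.
// Returns the index of the first failing instruction, or -1 if all succeed.
async function findFirstFailure(
  actor: Actor,
  instructions: string[],
  log: (message: string) => void = console.log,
): Promise<number> {
  for (let i = 0; i < instructions.length; i++) {
    const instruction = instructions[i];
    const started = Date.now();
    try {
      await actor.act(instruction);
      log(`ok   [${i}] ${instruction} (${Date.now() - started} ms)`);
    } catch (error) {
      log(`fail [${i}] ${instruction}: ${String(error)}`);
      return i;
    }
  }
  return -1;
}
```

Once the failing index is known, that single instruction can be reworded or split further without rerunning the whole workflow by hand.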
Additional Resources
- Stagehand GitHub Repository - Source code, examples, and community discussions
- Stagehand Documentation - Comprehensive guides and API reference
- Browserbase Documentation - Browser infrastructure and session management
- Cerebras Model Documentation - Learn about available models and their capabilities
- GLM4.7 migration guide - Upgrade to the latest model

