Browser-Use is an open-source project that empowers AI agents to control web browsers, enabling tasks such as automated web navigation, form filling, data extraction, and complex multi-step workflows. By combining Browser-Use with Cerebras’s ultra-fast inference, you can build responsive browser automation agents that execute tasks in real-time.

Prerequisites

Before you begin, ensure you have:
  • Cerebras API Key - Get a free API key here
  • Python 3.11 or higher - Browser-Use requires Python 3.11+ for optimal performance
  • Playwright - Browser-Use uses Playwright for browser automation
  • Basic understanding of async Python - Browser-Use uses asyncio for concurrent operations
Browser-Use works best with fast inference providers like Cerebras. The ultra-low latency of Cerebras models (gpt-oss-120b, qwen-3-32b, llama3.1-8b) enables near-instantaneous browser control decisions, making your automation agents significantly more responsive.

Configure Browser-Use

Step 1: Install required dependencies

First, install Browser-Use and its dependencies. Browser-Use will automatically install Playwright and other required packages:
pip install browser-use python-dotenv langchain-openai playwright
After installation, install the Playwright browsers. This downloads the necessary browser binaries (Chromium, Firefox, WebKit) that Playwright will use for automation:
playwright install
Step 2: Configure environment variables

Create a .env file in your project directory to store your API credentials securely. This keeps your API key out of your source code:
CEREBRAS_API_KEY=your-cerebras-api-key-here
The python-dotenv package (installed in Step 1) will load these variables automatically when you call load_dotenv().
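Before wiring up the agent, it can help to fail fast if the key didn't load. A minimal sketch (the `require_env` helper is illustrative, not part of browser-use or python-dotenv):

```python
import os

try:
    from dotenv import load_dotenv
    load_dotenv()  # load variables from .env if python-dotenv is installed
except ImportError:
    pass  # fall back to variables already in the environment

def require_env(name: str) -> str:
    """Return an environment variable's value, or raise a clear error."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"{name} is not set - check your .env file")
    return value

# Fails immediately with a readable message instead of a confusing
# 401 error deep inside the agent run:
# api_key = require_env("CEREBRAS_API_KEY")
```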
Step 3: Initialize the Browser-Use agent

Browser-Use integrates seamlessly with any OpenAI-compatible API. Set up a Browser-Use agent with Cerebras to leverage ultra-fast inference for browser automation:
import os
import asyncio
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from browser_use import Agent

# Load environment variables
load_dotenv()

# Wrapper class for browser-use compatibility with LangChain
class CerebrasLLM:
    def __init__(self, model="gpt-oss-120b"):
        self.llm = ChatOpenAI(
            model=model,
            api_key=os.getenv("CEREBRAS_API_KEY"),
            base_url="https://api.cerebras.ai/v1"
        )
        self.model = model
        self.model_name = model
        self.provider = "cerebras"
    
    async def ainvoke(self, *args, **kwargs):
        return await self.llm.ainvoke(*args, **kwargs)

# Initialize Cerebras LLM
llm = CerebrasLLM()

# Create the Browser-Use agent
agent = Agent(
    task="Go to google.com and search for 'Cerebras AI'",
    llm=llm,
)
This creates an agent that will use Cerebras’s gpt-oss-120b model to make decisions about browser actions. The agent can navigate websites, click elements, fill forms, and extract information based on your task description.
Step 4: Run your first browser automation task

Now let’s run a simple browser automation task. This example navigates to Wikipedia:
import os
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from browser_use import Agent

load_dotenv()

# Wrapper class for browser-use compatibility with LangChain
class CerebrasLLM:
    def __init__(self, model="gpt-oss-120b"):
        self.llm = ChatOpenAI(
            model=model,
            api_key=os.getenv("CEREBRAS_API_KEY"),
            base_url="https://api.cerebras.ai/v1"
        )
        self.model = model
        self.model_name = model
        self.provider = "cerebras"
    
    async def ainvoke(self, *args, **kwargs):
        return await self.llm.ainvoke(*args, **kwargs)

# Initialize Cerebras LLM
llm = CerebrasLLM()

# Create the agent and run the task
agent = Agent(task="Go to wikipedia.org", llm=llm)

async def main():
    await agent.run()

if __name__ == "__main__":
    import asyncio
    asyncio.run(main())
The agent will open a browser window and navigate to Wikipedia. With Cerebras’s fast inference, navigation decisions happen in milliseconds.
You may see some internal “items” errors in the browser-use logs - these are harmless and don’t affect navigation functionality. This is a known issue in browser-use v0.9.5 that will be fixed in future versions.
Step 5: Extract structured data from websites

You can point the agent at any website simply by changing the task string. Here's an example that navigates to the Cerebras website:
import os
import asyncio
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from browser_use import Agent

load_dotenv()

# Wrapper class for browser-use compatibility with LangChain
class CerebrasLLM:
    def __init__(self, model="gpt-oss-120b"):
        self.llm = ChatOpenAI(
            model=model,
            api_key=os.getenv("CEREBRAS_API_KEY"),
            base_url="https://api.cerebras.ai/v1"
        )
        self.model = model
        self.model_name = model
        self.provider = "cerebras"
    
    async def ainvoke(self, *args, **kwargs):
        return await self.llm.ainvoke(*args, **kwargs)

async def navigate_cerebras():
    llm = CerebrasLLM(model="qwen-3-32b")
    
    agent = Agent(
        task="Go to cerebras.ai",
        llm=llm,
    )
    
    result = await agent.run()
    print("Navigation completed")

if __name__ == "__main__":
    asyncio.run(navigate_cerebras())
Cerebras’s qwen-3-32b model is excellent for structured data extraction tasks due to its strong reasoning capabilities and fast inference speed.
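To turn navigation into structured extraction, one common pattern is to ask the agent to return JSON in its task description and then parse its final answer. A parsing sketch (the JSON shape and task wording here are assumptions, not a fixed browser-use contract):

```python
import json
import re
from dataclasses import dataclass

@dataclass
class PageSummary:
    title: str
    links: list

def parse_agent_json(text: str) -> PageSummary:
    """Pull the first JSON object out of the agent's final answer.

    Assumes the task asked for output shaped like:
    {"title": "...", "links": ["...", "..."]}
    """
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in agent output")
    data = json.loads(match.group(0))
    return PageSummary(title=data["title"], links=data.get("links", []))
```

Pair this with a task like "Go to cerebras.ai and return the page title and top navigation links as a JSON object with keys 'title' and 'links'", then feed the agent's final answer through `parse_agent_json`.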
Step 6: Customize browser behavior

The same pattern works for any destination. This example shows navigation to GitHub:
import os
import asyncio
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from browser_use import Agent

load_dotenv()

# Wrapper class for browser-use compatibility with LangChain
class CerebrasLLM:
    def __init__(self, model="gpt-oss-120b"):
        self.llm = ChatOpenAI(
            model=model,
            api_key=os.getenv("CEREBRAS_API_KEY"),
            base_url="https://api.cerebras.ai/v1"
        )
        self.model = model
        self.model_name = model
        self.provider = "cerebras"
    
    async def ainvoke(self, *args, **kwargs):
        return await self.llm.ainvoke(*args, **kwargs)

async def navigate_github():
    llm = CerebrasLLM()
    
    agent = Agent(
        task="Go to github.com",
        llm=llm,
    )
    
    result = await agent.run()
    print("Navigation completed")

if __name__ == "__main__":
    asyncio.run(navigate_github())
The agent will automatically handle browser initialization. With Cerebras’s ultra-fast inference, the agent can quickly navigate between pages.
Step 7: Build multi-step workflows

You can chain multiple navigation tasks together. This example demonstrates navigating to multiple pages in sequence:
import os
import asyncio
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from browser_use import Agent

load_dotenv()

# Wrapper class for browser-use compatibility with LangChain
class CerebrasLLM:
    def __init__(self, model="gpt-oss-120b"):
        self.llm = ChatOpenAI(
            model=model,
            api_key=os.getenv("CEREBRAS_API_KEY"),
            base_url="https://api.cerebras.ai/v1"
        )
        self.model = model
        self.model_name = model
        self.provider = "cerebras"
    
    async def ainvoke(self, *args, **kwargs):
        return await self.llm.ainvoke(*args, **kwargs)

async def multi_step_workflow():
    llm = CerebrasLLM()
    
    # Navigate to multiple pages
    pages = ["wikipedia.org", "cerebras.ai", "github.com", "python.org"]
    
    for page in pages:
        agent = Agent(
            task=f"Go to {page}",
            llm=llm,
        )
        await agent.run()
        print(f"Navigated to {page}")
    
    print("All navigations completed")

if __name__ == "__main__":
    asyncio.run(multi_step_workflow())
Cerebras’s fast inference enables multi-page navigation to complete quickly, making it practical to build efficient browser automation workflows.
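The loop above creates a fresh agent (and browser) for each page. An alternative is to fold all the steps into a single task string so one agent handles the whole sequence; a sketch (the helper below is illustrative, not a browser-use API):

```python
def build_sequential_task(pages: list) -> str:
    """Combine several navigation steps into one task description,
    so a single agent (and one browser session) handles the sequence."""
    steps = [f"{i}. Go to {page}" for i, page in enumerate(pages, start=1)]
    return "Complete these steps in order:\n" + "\n".join(steps)

# task = build_sequential_task(["wikipedia.org", "cerebras.ai", "github.com"])
# agent = Agent(task=task, llm=llm)  # one agent instead of one per page
```

Which approach is better depends on the workflow: one agent per page keeps failures isolated, while a single combined task avoids repeated browser startup.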

Why Use Cerebras with Browser-Use?

Cerebras’s ultra-fast inference provides several key advantages for browser automation:
  • Real-time responsiveness - Sub-second inference enables agents to react instantly to page changes and dynamic content
  • Complex reasoning - Models like gpt-oss-120b and zai-glm-4.6 can handle sophisticated multi-step workflows and make intelligent decisions
  • Cost-effective - Fast inference means lower costs for long-running automation tasks and reduced API usage
  • Reliable execution - Low latency reduces timeouts and improves task completion rates, especially for time-sensitive operations
  • Better user experience - Near-instantaneous responses make browser automation feel natural and responsive

Troubleshooting

If the agent struggles to locate page elements:
  1. Be more specific - Provide detailed descriptions of elements in your task (e.g., “the blue submit button in the top right”)
  2. Wait for page load - Some dynamic sites need time to render; add explicit wait instructions in your task
  3. Simplify selectors - Use clear, unique identifiers when possible (e.g., “the search box with placeholder ‘Enter query’”)
  4. Check for dynamic content - Some elements may load via JavaScript; ensure the page is fully loaded before interaction
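The gap between a vague and a specific task description can be large in practice. A quick illustration (the site and element names below are made up):

```python
# Vague - the agent has to guess which form, which fields, and which button
vague_task = "Go to the site and submit the form"

# Specific - names the URL, the fields, the values, the wait, and the button
specific_task = (
    "Go to example.com/contact, type 'Jane Doe' into the Name field, "
    "type '[email protected]' into the Email field, wait for the page to "
    "finish loading, then click the blue 'Send message' button in the "
    "bottom right"
)
```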
If you see import errors when running your script (for example, ModuleNotFoundError: No module named 'browser_use'), this usually means Browser-Use wasn't installed correctly. Reinstall it:
pip uninstall browser-use
pip install browser-use --upgrade
playwright install
Make sure you’re using Python 3.11 or higher. You can check your Python version with:
python --version
If you’re using an older version, consider using pyenv or conda to install Python 3.11+.
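You can also guard against an unsupported interpreter at runtime, so scripts fail with a clear message instead of an obscure import error. A small sketch (this helper is illustrative, not part of browser-use):

```python
import sys

def check_python_version(minimum=(3, 11)) -> bool:
    """Return True if the running interpreter meets the minimum version."""
    return sys.version_info[:2] >= minimum

# Place near the top of your script:
# if not check_python_version():
#     raise SystemExit("browser-use requires Python 3.11 or newer")
```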
Browser-Use can handle authenticated sessions. The agent automatically manages browser contexts and can persist sessions across runs. For sites that require authentication, you can:
  1. Let the agent handle the login flow as part of its task
  2. Use browser profiles to persist login state
  3. Pass cookies or session tokens programmatically
Example of including login in the task:
import os
from langchain_openai import ChatOpenAI
from browser_use import Agent

# Wrapper class for browser-use compatibility
class CerebrasLLM:
    def __init__(self, model="gpt-oss-120b"):
        self.llm = ChatOpenAI(
            model=model,
            api_key=os.getenv("CEREBRAS_API_KEY"),
            base_url="https://api.cerebras.ai/v1"
        )
        self.model = model
        self.model_name = model
        self.provider = "cerebras"
    
    async def ainvoke(self, *args, **kwargs):
        return await self.llm.ainvoke(*args, **kwargs)

llm = CerebrasLLM()

agent = Agent(
    task="Go to example.com, log in with username '[email protected]' and password from environment, then navigate to dashboard",
    llm=llm,
)
Browser-Use also works with Cerebras’s streaming API for real-time feedback:
import os
from langchain_openai import ChatOpenAI

# Wrapper class for browser-use compatibility with LangChain
class CerebrasLLM:
    def __init__(self, model="gpt-oss-120b", streaming=False):
        self.llm = ChatOpenAI(
            model=model,
            api_key=os.getenv("CEREBRAS_API_KEY"),
            base_url="https://api.cerebras.ai/v1",
            streaming=streaming
        )
        self.model = model
        self.model_name = model
        self.provider = "cerebras"
    
    async def ainvoke(self, *args, **kwargs):
        return await self.llm.ainvoke(*args, **kwargs)

llm = CerebrasLLM(streaming=True)
Streaming is particularly useful for long-running tasks where you want to see the agent’s reasoning in real-time. Learn more about streaming with Cerebras.
Headless mode (headless=True, default):
  • Browser runs in the background without a visible window
  • Faster execution and lower resource usage
  • Ideal for production environments and automated pipelines
Headed mode (headless=False):
  • Browser window is visible on your screen
  • Useful for debugging and development
  • Allows you to see exactly what the agent is doing
For development, start with headed mode to understand the agent’s behavior, then switch to headless mode for production deployments.