Build a Grounded Research Agent with Exa

This cookbook shows how to build a grounded research agent that can:

Search the web for current information with Exa
Hand the results to a Cerebras model through tool calling
Return answers with inline citations and a clean source list

Exa search returns clean page content (highlights) with every result, so a single search tool is enough to ground your AI agent.

Prerequisites

Before you begin, ensure you have:

A Cerebras API key
An Exa API key
Python 3.10+ or Node.js 18+

Install the dependencies:

pip install "exa-py>=2.0" openai python-dotenv

The Node.js examples use ES modules and top-level await. Save them with a .mjs extension (or set "type": "module" in your package.json) and run them with node file.mjs.

Then store your API keys in a .env file:

CEREBRAS_API_KEY=your-cerebras-api-key
EXA_API_KEY=your-exa-api-key

Get your keys here: Cerebras and Exa.
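
A missing key otherwise only shows up later, as a KeyError during client setup. The following is a minimal sketch of an early check, assuming the two variable names from the .env file above. The helper name missing_keys is hypothetical and is not part of either SDK.

```python
import os

REQUIRED_KEYS = ("CEREBRAS_API_KEY", "EXA_API_KEY")

def missing_keys(env=None):
    """Return the required variable names that are unset or empty in env."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name)]

# Example with an explicit mapping instead of the real environment.
print(missing_keys({"CEREBRAS_API_KEY": "abc"}))  # ['EXA_API_KEY']
```

In a real script you would call missing_keys() after load_dotenv() and stop with a clear message if the list is not empty.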

Step 1: Initialize the Clients

We use Exa for search and the OpenAI Python client, pointed at Cerebras’ OpenAI-compatible API, for agent reasoning and tool use.

import json
import os
import re
from dotenv import load_dotenv
from exa_py import Exa
from openai import OpenAI

load_dotenv()

exa = Exa(api_key=os.environ["EXA_API_KEY"])
exa.headers["x-exa-integration"] = "cerebras-integration"

cerebras = OpenAI(
    api_key=os.environ["CEREBRAS_API_KEY"],
    base_url="https://api.cerebras.ai/v1",
    default_headers={"X-Cerebras-3rd-Party-Integration": "exa"},
)

Step 2: Define the Exa Search Tool

The agent gets one tool: exa_search. It returns clean highlights for each result, with each source tagged [n] so the model can cite it. A finalize helper cleans up the model’s output and appends a numbered source list, so every answer ends with reliable citations.

sources = []
index_by_url = {}

def register(title, url):
    if url not in index_by_url:
        sources.append((title or url, url))
        index_by_url[url] = len(sources)
    return index_by_url[url]

def exa_search(query, type="auto", num_results=10, max_age_hours=None, **_):
    contents = {"highlights": True}
    if max_age_hours is not None:
        contents["max_age_hours"] = max_age_hours
    results = exa.search(query, type=type, num_results=num_results, contents=contents)
    return "\n\n".join(
        f"[{register(r.title, r.url)}] {r.title or r.url}\nURL: {r.url}\n{' '.join(r.highlights or [])}"
        for r in results.results
    )

# Remove stray citation markers (e.g. 【†L1-L9】) the model sometimes adds.
GARBAGE = re.compile(r"【[^】]*】|\d*†[^\s\]】]*】?|[【】†]")

def finalize(answer):
    answer = GARBAGE.sub("", answer)
    answer = re.sub(r"\[\[(\d+)\]\]", r"[\1]", answer).strip()
    if not sources:
        return answer
    lines = "\n".join(f"[{i}] {title} - {url}" for i, (title, url) in enumerate(sources, 1))
    return f"{answer}\n\nSources:\n{lines}"
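
To see the numbering and clean-up behavior without calling either API, the sketch below repeats register and finalize from above and feeds them made-up data. The titles, example.com URLs, and sample answer are illustrative assumptions, not real search results.

```python
import re

sources = []
index_by_url = {}

def register(title, url):
    # One stable number per unique URL; repeats reuse the first number.
    if url not in index_by_url:
        sources.append((title or url, url))
        index_by_url[url] = len(sources)
    return index_by_url[url]

GARBAGE = re.compile(r"【[^】]*】|\d*†[^\s\]】]*】?|[【】†]")

def finalize(answer):
    answer = GARBAGE.sub("", answer)
    answer = re.sub(r"\[\[(\d+)\]\]", r"[\1]", answer).strip()
    if not sources:
        return answer
    lines = "\n".join(f"[{i}] {title} - {url}" for i, (title, url) in enumerate(sources, 1))
    return f"{answer}\n\nSources:\n{lines}"

first = register("Example A", "https://example.com/a")
second = register(None, "https://example.com/b")    # no title: the URL is used instead
repeat = register("Example A again", "https://example.com/a")
print(first, second, repeat)  # 1 2 1

demo = finalize("Agents are in production [[1]]【†L1-L9】 and growing [2].")
print(demo)
```

The same URL always maps to the same number, so a source keeps its label even when several searches return it.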

Step 3: Register the Tool for the Model

The schema exposes the three search types so the model can choose faster or deeper search per query. Only query is required; everything else is optional.

tools = [
    {
        "type": "function",
        "function": {
            "name": "exa_search",
            "description": "Search the web with Exa and get clean, ready-to-use results. Best for current information, news, facts, people, and companies. Returns numbered sources [n] with title, URL, and highlights.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string", "description": "The search query."},
                    "type": {
                        "type": "string",
                        "enum": ["auto", "fast", "deep"],
                        "description": "Search strategy. 'auto' (default and recommended) balances quality and speed; 'fast' is the lowest-latency option; 'deep' is most thorough.",
                    },
                    "num_results": {
                        "type": "integer",
                        "description": "Number of results to return (1-100, default 10).",
                    },
                    "max_age_hours": {
                        "type": "integer",
                        "description": "Only accept cached pages newer than this many hours; older pages are refreshed before returning. Omit for no freshness limit, 0 to always fetch fresh content, or -1 to use cached content only.",
                    },
                },
                "required": ["query"],
            },
        },
    }
]

available_tools = {"exa_search": exa_search}
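
Models occasionally send arguments outside the schema, such as an unknown search type or an oversized num_results. One option is to normalize the arguments before calling the tool. The sketch below assumes the limits stated in the schema descriptions above; the helper name clean_search_args is hypothetical, and silently falling back to defaults (rather than raising) is a design choice you may not want.

```python
ALLOWED_TYPES = {"auto", "fast", "deep"}

def _is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)

def clean_search_args(args):
    """Return a copy of args restricted to the schema above, with safe defaults."""
    if not isinstance(args, dict):
        raise ValueError("arguments must be a JSON object")
    query = args.get("query")
    if not isinstance(query, str) or not query.strip():
        raise ValueError("query must be a non-empty string")
    cleaned = {"query": query.strip()}
    # Fall back to 'auto' when the model invents a search type.
    search_type = args.get("type", "auto")
    cleaned["type"] = search_type if search_type in ALLOWED_TYPES else "auto"
    # Clamp to the 1-100 range stated in the schema description.
    num_results = args.get("num_results", 10)
    cleaned["num_results"] = max(1, min(100, num_results)) if _is_int(num_results) else 10
    # Pass the freshness limit through only when it is a real integer.
    if _is_int(args.get("max_age_hours")):
        cleaned["max_age_hours"] = args["max_age_hours"]
    return cleaned

print(clean_search_args({"query": " ai agents ", "type": "turbo", "num_results": 500}))
# {'query': 'ai agents', 'type': 'auto', 'num_results': 100}
```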

Step 4: Run the Agent Loop

The core pattern is:

Ask the model what it needs
Let it call the search tool
Feed tool results back into the conversation
Stop when the model returns a final answer

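The loop is capped at six model calls and wraps each tool call in a try/except. Any tool error (malformed JSON arguments, a missing parameter, a failed search) is passed back to the model as a tool message, so the model can fix its input instead of crashing the run.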

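def run_research_agent(question):
    # Start each run with an empty source list so numbering restarts at [1].
    sources.clear()
    index_by_url.clear()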
    messages = [
        {
            "role": "system",
            "content": (
                "You are a research analyst. Use exa_search to find current sources, then answer "
                "the question. Cite sources inline as [n], matching the labels returned by "
                "exa_search (for example [1] or [2])."
            ),
        },
        {"role": "user", "content": question},
    ]

    for _ in range(6):
        response = cerebras.chat.completions.create(
            model="gpt-oss-120b",
            messages=messages,
            tools=tools,
            tool_choice="auto",
            max_completion_tokens=2000,
        )
        message = response.choices[0].message
        messages.append(message)

        if not message.tool_calls:
            return finalize(message.content or "")

        for tool_call in message.tool_calls:
            tool_fn = available_tools.get(tool_call.function.name)
            try:
                args = json.loads(tool_call.function.arguments)
                result = tool_fn(**args) if tool_fn else f"Unknown tool: {tool_call.function.name}"
            except Exception as e:
                result = f"Tool error ({type(e).__name__}): {e}. Adjust your arguments and try again."
            messages.append({"role": "tool", "tool_call_id": tool_call.id, "content": result})

    return "Could not produce a final answer within the step limit."
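
The error-handling path can be exercised offline. The sketch below pulls the try/except from the loop into a hypothetical dispatch function and swaps in a stub for exa_search, so it runs without network access or API keys. The stub and its output format are assumptions for illustration only.

```python
import json

def stub_search(query, num_results=10, **_):
    # Stand-in for exa_search so the example runs without network access.
    return f"[1] Result for {query!r} ({num_results} requested)"

available_tools = {"exa_search": stub_search}

def dispatch(name, raw_arguments):
    """Mirror the loop's try/except: always return a string for the tool message."""
    tool_fn = available_tools.get(name)
    try:
        args = json.loads(raw_arguments)
        return tool_fn(**args) if tool_fn else f"Unknown tool: {name}"
    except Exception as e:
        return f"Tool error ({type(e).__name__}): {e}. Adjust your arguments and try again."

ok = dispatch("exa_search", '{"query": "ai agents", "num_results": 3}')
bad_json = dispatch("exa_search", '{"query": ')        # truncated JSON
missing = dispatch("exa_search", '{"num_results": 3}') # required query omitted
unknown = dispatch("web_browse", '{"query": "x"}')     # tool not registered

for result in (ok, bad_json, missing, unknown):
    print(result)
```

In every case the model receives a string it can act on, which is what lets it retry with corrected arguments on the next step.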

Step 5: Try It on a Real Question

Now you can ask for a grounded answer. The agent searches the web, then writes a cited answer.

question = "How are AI agents being used in production today? Cite specific examples."
answer = run_research_agent(question)
print(answer)

Inline [n] markers map to the numbered Sources list at the end of the answer. Non-consecutive citations like [1], [2], and [4] are expected when the model cites only some of the results.
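
If you want to verify that mapping programmatically, a small check can flag citation numbers that have no matching source. This is a sketch that assumes the answer layout produced by finalize (body, blank line, then a "Sources:" block); the helper names cited_numbers and dangling_citations are hypothetical.

```python
import re

def cited_numbers(answer_with_sources):
    """Return the sorted citation numbers used in the body, before 'Sources:'."""
    body = answer_with_sources.split("\n\nSources:\n", 1)[0]
    return sorted({int(n) for n in re.findall(r"\[(\d+)\]", body)})

def dangling_citations(answer_with_sources, source_count):
    """Return cited numbers that have no entry in the source list."""
    return [n for n in cited_numbers(answer_with_sources) if not 1 <= n <= source_count]

sample = (
    "Claim one [1]. Claim two [2][4]. Claim three [7]."
    "\n\nSources:\n[1] A - https://example.com/a"
)
print(cited_numbers(sample))          # [1, 2, 4, 7]
print(dangling_citations(sample, 4))  # [7]
```

With the agent above you would pass len(sources) as source_count after a run.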

Complete Example

The full agent in a single file. Copy it into agent.py (or agent.mjs) and run it.

import json
import os
import re
from dotenv import load_dotenv
from exa_py import Exa
from openai import OpenAI

load_dotenv()

exa = Exa(api_key=os.environ["EXA_API_KEY"])
exa.headers["x-exa-integration"] = "cerebras-integration"

cerebras = OpenAI(
    api_key=os.environ["CEREBRAS_API_KEY"],
    base_url="https://api.cerebras.ai/v1",
    default_headers={"X-Cerebras-3rd-Party-Integration": "exa"},
)


sources = []
index_by_url = {}

def register(title, url):
    if url not in index_by_url:
        sources.append((title or url, url))
        index_by_url[url] = len(sources)
    return index_by_url[url]

def exa_search(query, type="auto", num_results=10, max_age_hours=None, **_):
    contents = {"highlights": True}
    if max_age_hours is not None:
        contents["max_age_hours"] = max_age_hours
    results = exa.search(query, type=type, num_results=num_results, contents=contents)
    return "\n\n".join(
        f"[{register(r.title, r.url)}] {r.title or r.url}\nURL: {r.url}\n{' '.join(r.highlights or [])}"
        for r in results.results
    )

# Remove stray citation markers (e.g. 【†L1-L9】) the model sometimes adds.
GARBAGE = re.compile(r"【[^】]*】|\d*†[^\s\]】]*】?|[【】†]")

def finalize(answer):
    answer = GARBAGE.sub("", answer)
    answer = re.sub(r"\[\[(\d+)\]\]", r"[\1]", answer).strip()
    if not sources:
        return answer
    lines = "\n".join(f"[{i}] {title} - {url}" for i, (title, url) in enumerate(sources, 1))
    return f"{answer}\n\nSources:\n{lines}"


tools = [
    {
        "type": "function",
        "function": {
            "name": "exa_search",
            "description": "Search the web with Exa and get clean, ready-to-use results. Best for current information, news, facts, people, and companies. Returns numbered sources [n] with title, URL, and highlights.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string", "description": "The search query."},
                    "type": {
                        "type": "string",
                        "enum": ["auto", "fast", "deep"],
                        "description": "Search strategy. 'auto' (default and recommended) balances quality and speed; 'fast' is the lowest-latency option; 'deep' is most thorough.",
                    },
                    "num_results": {
                        "type": "integer",
                        "description": "Number of results to return (1-100, default 10).",
                    },
                    "max_age_hours": {
                        "type": "integer",
                        "description": "Only accept cached pages newer than this many hours; older pages are refreshed before returning. Omit for no freshness limit, 0 to always fetch fresh content, or -1 to use cached content only.",
                    },
                },
                "required": ["query"],
            },
        },
    }
]

available_tools = {"exa_search": exa_search}


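def run_research_agent(question):
    # Start each run with an empty source list so numbering restarts at [1].
    sources.clear()
    index_by_url.clear()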
    messages = [
        {
            "role": "system",
            "content": (
                "You are a research analyst. Use exa_search to find current sources, then answer "
                "the question. Cite sources inline as [n], matching the labels returned by "
                "exa_search (for example [1] or [2])."
            ),
        },
        {"role": "user", "content": question},
    ]

    for _ in range(6):
        response = cerebras.chat.completions.create(
            model="gpt-oss-120b",
            messages=messages,
            tools=tools,
            tool_choice="auto",
            max_completion_tokens=2000,
        )
        message = response.choices[0].message
        messages.append(message)

        if not message.tool_calls:
            return finalize(message.content or "")

        for tool_call in message.tool_calls:
            tool_fn = available_tools.get(tool_call.function.name)
            try:
                args = json.loads(tool_call.function.arguments)
                result = tool_fn(**args) if tool_fn else f"Unknown tool: {tool_call.function.name}"
            except Exception as e:
                result = f"Tool error ({type(e).__name__}): {e}. Adjust your arguments and try again."
            messages.append({"role": "tool", "tool_call_id": tool_call.id, "content": result})

    return "Could not produce a final answer within the step limit."


question = "How are AI agents being used in production today? Cite specific examples."
answer = run_research_agent(question)
print(answer)

Summary

What We Built

A grounded research agent with:

Exa search for current source discovery, with page content (highlights) returned inline
Cerebras tool calling to plan searches and write cited answers
Reliable inline citations backed by a numbered source list

Next Steps

Use fast for low-latency chat assistants and deep for broader research tasks
Lower max_age_hours for newsy queries that need fresher content
Try other Exa API configurations in the Exa API Dashboard
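
One way to apply the first two suggestions is to keep named presets and expand them into keyword arguments for exa_search. This is only a sketch: the preset names, the result counts, and the 24-hour freshness window are illustrative assumptions, not recommended values from Exa or Cerebras.

```python
# Hypothetical presets; tune the values for your own workload.
PROFILES = {
    "chat": {"type": "fast", "num_results": 5},
    "research": {"type": "deep", "num_results": 10},
    "news": {"type": "auto", "num_results": 10, "max_age_hours": 24},
}

def search_kwargs(query, profile="research"):
    """Build keyword arguments for exa_search from a named preset."""
    if profile not in PROFILES:
        raise ValueError(f"Unknown profile: {profile}")
    return {"query": query, **PROFILES[profile]}

print(search_kwargs("latest agent frameworks", "news"))
# {'query': 'latest agent frameworks', 'type': 'auto', 'num_results': 10, 'max_age_hours': 24}
```

The result can be passed straight to the tool, for example exa_search(**search_kwargs("latest agent frameworks", "news")).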

Resources

Acknowledgements

Thank you to Ishan Goswami from Exa for his collaboration and feedback during the development of this cookbook.