Get Started with Exa - Cerebras Inference

Exa is one of the fastest and most accurate web search APIs, built for AI applications. By combining Exa’s search API with Cerebras Inference, you can ground responses in current web content while keeping agent latency low.

Prerequisites

Before you begin, ensure you have:

Cerebras API Key
Exa API Key
Python or Node.js

Configure Exa with Cerebras

Install required dependencies

Install the Exa SDK and the OpenAI client library. The OpenAI client is used to connect to Cerebras’ OpenAI-compatible API.

pip install "exa-py>=2.0" openai python-dotenv

npm install exa-js openai dotenv

The Node.js examples use ES modules and top-level await. Save them with a .mjs extension (or set "type": "module" in your package.json) and run them with node file.mjs.

Configure environment variables

Create a .env file in your project directory so your API keys stay out of your source code, and keep that file out of version control:

CEREBRAS_API_KEY=your-cerebras-api-key
EXA_API_KEY=your-exa-api-key

Get your keys here: Cerebras and Exa.
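A missing key otherwise shows up later as a KeyError or an authentication failure, so it can help to check for both variables up front. The following is a minimal sketch: `missing_keys` is a hypothetical helper, not part of either SDK, and it assumes only the two variable names from the .env file above.

```python
import os

REQUIRED_KEYS = ("CEREBRAS_API_KEY", "EXA_API_KEY")


def missing_keys(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name)]


# Call load_dotenv() first if the keys live in a .env file.
missing = missing_keys()
print("Missing:", ", ".join(missing) if missing else "none")
```

If the list is not empty, stop before creating the Exa or Cerebras clients and fix the .env file first.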

Perform your first grounded web search

This example uses Exa search to gather current web results, then asks a Cerebras model to combine them into a short answer.

import os
from dotenv import load_dotenv
from exa_py import Exa
from openai import OpenAI

load_dotenv()

exa = Exa(api_key=os.environ["EXA_API_KEY"])
exa.headers["x-exa-integration"] = "cerebras-integration"

cerebras = OpenAI(
    api_key=os.environ["CEREBRAS_API_KEY"],
    base_url="https://api.cerebras.ai/v1",
    default_headers={"X-Cerebras-3rd-Party-Integration": "exa"},
)

results = exa.search(
    "latest developments in AI agents",
    type="auto",
    num_results=10,
    contents={"highlights": True},
)

context = "\n\n".join(
    f"Source: {result.url}\n{' '.join(result.highlights or [])}"
    for result in results.results
)

response = cerebras.chat.completions.create(
    model="gpt-oss-120b",
    messages=[
        {
            "role": "system",
            "content": "You are a research assistant. Using only the search results provided, write a short summary. Do not include citation markers or footnotes.",
        },
        {
            "role": "user",
            "content": f"Search results:\n\n{context}\n\nSummarize the latest developments in AI agents.",
        },
    ],
)

print(response.choices[0].message.content)

import 'dotenv/config';
import Exa from 'exa-js';
import OpenAI from 'openai';

const exa = new Exa(process.env.EXA_API_KEY);
exa.headers.set('x-exa-integration', 'cerebras-integration');

const cerebras = new OpenAI({
  apiKey: process.env.CEREBRAS_API_KEY,
  baseURL: 'https://api.cerebras.ai/v1',
  defaultHeaders: { 'X-Cerebras-3rd-Party-Integration': 'exa' },
});

const results = await exa.search('latest developments in AI agents', {
  type: 'auto',
  numResults: 10,
  contents: { highlights: true },
});

const context = results.results
  .map((result) => `Source: ${result.url}\n${(result.highlights || []).join(' ')}`)
  .join('\n\n');

const response = await cerebras.chat.completions.create({
  model: 'gpt-oss-120b',
  messages: [
    {
      role: 'system',
      content: 'You are a research assistant. Using only the search results provided, write a short summary. Do not include citation markers or footnotes.',
    },
    {
      role: 'user',
      content: `Search results:\n\n${context}\n\nSummarize the latest developments in AI agents.`,
    },
  ],
});

console.log(response.choices[0].message.content);
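To see what the model actually receives, the context-building step can be run offline. This sketch replaces the live search response with stand-in objects (the `fake_results` data and the example.com URLs are made up for illustration) and applies the same join as the Python example above.

```python
from types import SimpleNamespace


def build_context(results):
    """Join each result's URL and highlights into one prompt-ready block."""
    return "\n\n".join(
        f"Source: {r.url}\n{' '.join(r.highlights or [])}" for r in results
    )


# Stand-in data shaped like the two fields the example reads: url and highlights.
fake_results = [
    SimpleNamespace(url="https://example.com/a", highlights=["First point.", "Second point."]),
    SimpleNamespace(url="https://example.com/b", highlights=None),
]

print(build_context(fake_results))
```

A result without highlights contributes only its source line, which is why the example guards with `or []`.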

Search types and freshness controls

Exa supports a few search modes with different speed and coverage tradeoffs:

Search type	Best for	Latency
`auto`	Best balance of quality and speed (default, recommended)	1s
`fast`	Lowest-latency search	450ms
`deep`	Most thorough search	4s-18s

Freshness is controlled with max_age_hours in Python or maxAgeHours in Node.js, set inside the contents option. It is optional: leave it out for no freshness limit, or set 0 to always fetch fresh content, 24 to accept cached pages up to one day old, or -1 to use cached content only.
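The freshness values can be mapped onto the contents option as sketched below. `build_contents` is a hypothetical helper, not part of exa-py; it mirrors how the tool-calling example later in this guide assembles its contents dictionary.

```python
def build_contents(max_age_hours=None):
    """Build the contents argument for exa.search (hypothetical helper)."""
    contents = {"highlights": True}
    # Compare against None, not truthiness: 0 is a valid setting.
    if max_age_hours is not None:
        contents["max_age_hours"] = max_age_hours
    return contents


print(build_contents())    # no freshness limit
print(build_contents(0))   # always fetch fresh content
print(build_contents(24))  # accept cached pages up to one day old
print(build_contents(-1))  # cached content only
```

The returned dictionary is then passed to the search call, for example `exa.search(query, contents=build_contents(24))`.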

Get page contents

Every Exa search result already includes page content (highlights), so you usually don’t need a separate call. We recommend using search for most grounding workflows. Reach for the Contents API when you already have a URL and want to get its highlights directly.

import os
from dotenv import load_dotenv
from exa_py import Exa

load_dotenv()

exa = Exa(api_key=os.environ["EXA_API_KEY"])
exa.headers["x-exa-integration"] = "cerebras-integration"

contents = exa.get_contents(
    ["https://openai.com/index/hello-gpt-4o/"],
    highlights=True,
)

for result in contents.results:
    print(result.url)
    print(" ".join(result.highlights or []))

import 'dotenv/config';
import Exa from 'exa-js';

const exa = new Exa(process.env.EXA_API_KEY);
exa.headers.set('x-exa-integration', 'cerebras-integration');

const contents = await exa.getContents(['https://openai.com/index/hello-gpt-4o/'], {
  highlights: true,
});

for (const result of contents.results) {
  console.log(result.url);
  console.log((result.highlights || []).join(' '));
}

Use Exa as a tool for grounded answers

Tool calling works well when you want a Cerebras model to decide when to search the web. This example exposes Exa search as a tool.

import os
import json
import re
from dotenv import load_dotenv
from exa_py import Exa
from openai import OpenAI

load_dotenv()

exa = Exa(api_key=os.environ["EXA_API_KEY"])
exa.headers["x-exa-integration"] = "cerebras-integration"

cerebras = OpenAI(
    api_key=os.environ["CEREBRAS_API_KEY"],
    base_url="https://api.cerebras.ai/v1",
    default_headers={"X-Cerebras-3rd-Party-Integration": "exa"},
)

sources = []
index_by_url = {}

def register(title, url):
    if url not in index_by_url:
        sources.append((title or url, url))
        index_by_url[url] = len(sources)
    return index_by_url[url]

def exa_search(query, type="auto", num_results=10, max_age_hours=None, **_):
    contents = {"highlights": True}
    if max_age_hours is not None:
        contents["max_age_hours"] = max_age_hours
    results = exa.search(query, type=type, num_results=num_results, contents=contents)
    return "\n\n".join(
        f"[{register(r.title, r.url)}] {r.title or r.url}\nURL: {r.url}\n{' '.join(r.highlights or [])}"
        for r in results.results
    )

available_tools = {"exa_search": exa_search}

tools = [
    {
        "type": "function",
        "function": {
            "name": "exa_search",
            "description": "Search the web with Exa and get clean, ready-to-use results. Best for current information, news, facts, people, and companies. Returns numbered sources [n] with title, URL, and highlights.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string", "description": "The search query."},
                    "type": {
                        "type": "string",
                        "enum": ["auto", "fast", "deep"],
                        "description": "Search strategy. 'auto' (default) balances quality and speed; 'fast' is the lowest-latency option; 'deep' is most thorough.",
                    },
                    "num_results": {
                        "type": "integer",
                        "description": "Number of results to return (1-100, default 10).",
                    },
                    "max_age_hours": {
                        "type": "integer",
                        "description": "Only accept cached pages newer than this many hours; older pages are refreshed before returning. Omit for no freshness limit, 0 to always fetch fresh content, or -1 to use cached content only.",
                    },
                },
                "required": ["query"],
            },
        },
    }
]

# Remove stray citation markers (e.g. 【†L1-L9】) the model sometimes adds.
GARBAGE = re.compile(r"【[^】]*】|\d*†[^\s\]】]*】?|[【】†]")

def finalize(answer):
    answer = GARBAGE.sub("", answer)
    answer = re.sub(r"\[\[(\d+)\]\]", r"[\1]", answer).strip()
    if not sources:
        return answer
    lines = "\n".join(f"[{i}] {title} - {url}" for i, (title, url) in enumerate(sources, 1))
    return f"{answer}\n\nSources:\n{lines}"

def answer_with_search(question):
    messages = [
        {
            "role": "system",
            "content": (
                "You are a research assistant. Use exa_search to find current information, then "
                "answer the question. Cite sources inline as [n], matching the labels returned "
                "by exa_search (for example [1] or [2])."
            ),
        },
        {"role": "user", "content": question},
    ]

    # Cap the loop at 6 model turns so repeated tool calls cannot run forever.
    for _ in range(6):
        response = cerebras.chat.completions.create(
            model="gpt-oss-120b",
            messages=messages,
            tools=tools,
            tool_choice="auto",
            max_completion_tokens=2000,
        )
        message = response.choices[0].message
        messages.append(message)

        if not message.tool_calls:
            return finalize(message.content or "")

        for tool_call in message.tool_calls:
            tool_fn = available_tools.get(tool_call.function.name)
            try:
                args = json.loads(tool_call.function.arguments)
                result = tool_fn(**args) if tool_fn else f"Unknown tool: {tool_call.function.name}"
            except Exception as e:
                result = f"Tool error ({type(e).__name__}): {e}. Adjust your arguments and try again."
            messages.append({"role": "tool", "tool_call_id": tool_call.id, "content": result})

    return "Could not produce a final answer within the step limit."

print(answer_with_search("What are the latest AI model releases, and what makes them notable?"))

import 'dotenv/config';
import Exa from 'exa-js';
import OpenAI from 'openai';

const exa = new Exa(process.env.EXA_API_KEY);
exa.headers.set('x-exa-integration', 'cerebras-integration');

const cerebras = new OpenAI({
  apiKey: process.env.CEREBRAS_API_KEY,
  baseURL: 'https://api.cerebras.ai/v1',
  defaultHeaders: { 'X-Cerebras-3rd-Party-Integration': 'exa' },
});

const sources = [];
const indexByUrl = new Map();

function register(title, url) {
  if (!indexByUrl.has(url)) {
    sources.push({ title: title || url, url });
    indexByUrl.set(url, sources.length);
  }
  return indexByUrl.get(url);
}

async function exaSearch({ query, type = 'auto', numResults = 10, maxAgeHours }) {
  const contents = { highlights: true };
  if (maxAgeHours !== undefined) contents.maxAgeHours = maxAgeHours;
  const results = await exa.search(query, { type, numResults, contents });
  return results.results
    .map(
      (r) =>
        `[${register(r.title, r.url)}] ${r.title || r.url}\nURL: ${r.url}\n${(r.highlights || []).join(' ')}`
    )
    .join('\n\n');
}

const availableTools = { exa_search: exaSearch };

const tools = [
  {
    type: 'function',
    function: {
      name: 'exa_search',
      description:
        'Search the web with Exa and get clean, ready-to-use results. Best for current information, news, facts, people, and companies. Returns numbered sources [n] with title, URL, and highlights.',
      parameters: {
        type: 'object',
        properties: {
          query: { type: 'string', description: 'The search query.' },
          type: {
            type: 'string',
            enum: ['auto', 'fast', 'deep'],
            description:
              "Search strategy. 'auto' (default) balances quality and speed; 'fast' is the lowest-latency option; 'deep' is most thorough.",
          },
          numResults: {
            type: 'integer',
            description: 'Number of results to return (1-100, default 10).',
          },
          maxAgeHours: {
            type: 'integer',
            description:
              'Only accept cached pages newer than this many hours; older pages are refreshed before returning. Omit for no freshness limit, 0 to always fetch fresh content, or -1 to use cached content only.',
          },
        },
        required: ['query'],
      },
    },
  },
];

function finalize(answer) {
  // Remove stray citation markers (e.g. 【†L1-L9】) the model sometimes adds.
  let text = (answer || '').replace(/【[^】]*】|\d*†[^\s\]】]*】?|[【】†]/g, '');
  text = text.replace(/\[\[(\d+)\]\]/g, '[$1]').trim();
  if (sources.length === 0) return text;
  const lines = sources.map((s, i) => `[${i + 1}] ${s.title} - ${s.url}`).join('\n');
  return `${text}\n\nSources:\n${lines}`;
}

async function answerWithSearch(question) {
  const messages = [
    {
      role: 'system',
      content:
        'You are a research assistant. Use exa_search to find current information, then answer the question. ' +
        'Cite sources inline as [n], matching the labels returned by exa_search (for example [1] or [2]).',
    },
    { role: 'user', content: question },
  ];

  // Cap the loop at 6 model turns so repeated tool calls cannot run forever.
  for (let step = 0; step < 6; step++) {
    const response = await cerebras.chat.completions.create({
      model: 'gpt-oss-120b',
      messages,
      tools,
      tool_choice: 'auto',
      max_completion_tokens: 2000,
    });

    const message = response.choices[0].message;
    messages.push(message);

    // An empty array is truthy in JavaScript, so check the length as well.
    if (!message.tool_calls?.length) {
      return finalize(message.content || '');
    }

    for (const toolCall of message.tool_calls) {
      const toolFn = availableTools[toolCall.function.name];
      let result;
      try {
        const args = JSON.parse(toolCall.function.arguments);
        result = toolFn ? await toolFn(args) : `Unknown tool: ${toolCall.function.name}`;
      } catch (err) {
        result = `Tool error (${err.name}): ${err.message}. Adjust your arguments and try again.`;
      }
      messages.push({ role: 'tool', tool_call_id: toolCall.id, content: result });
    }
  }

  return 'Could not produce a final answer within the step limit.';
}

console.log(await answerWithSearch('What are the latest AI model releases, and what makes them notable?'));
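The citation cleanup inside `finalize` can be checked without any API calls. The sketch below reuses the two regular expressions from the Python example on a made-up answer; the sample text is an assumption for illustration, not real model output, and `clean_citations` is a hypothetical name for that part of `finalize`.

```python
import re

# Same patterns as the Python example above.
GARBAGE = re.compile(r"【[^】]*】|\d*†[^\s\]】]*】?|[【】†]")
DOUBLE_BRACKETS = re.compile(r"\[\[(\d+)\]\]")


def clean_citations(answer):
    """Strip stray citation markers and normalize [[n]] to [n]."""
    answer = GARBAGE.sub("", answer)
    return DOUBLE_BRACKETS.sub(r"[\1]", answer).strip()


sample = "Agents improved【4†L1-L9】 in 2025 [[1]]."
print(clean_citations(sample))  # Agents improved in 2025 [1].
```

Well-formed markers such as [2] pass through unchanged, so the numbered source list appended by `finalize` still lines up with the inline citations.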

Next steps

Explore the Exa API Dashboard
See Build a Grounded Research Agent with Exa for a deeper tool-calling agent walkthrough
See Build Your Own Perplexity with Exa for a longer end-to-end Exa workflow
See Search Agent for another Cerebras + Exa research pattern
