- Search the web for current information with Exa
- Hand the results to a Cerebras model through tool calling
- Return answers with inline citations and a clean source list
Prerequisites
Before you begin, ensure you have:- A Cerebras API key
- An Exa API key
- Python 3.10+ or Node.js 18+
The Node.js examples use ES modules and top-level
await. Save them with a .mjs extension (or set "type": "module" in your package.json) and run them with node file.mjs..env file:
Step 1: Initialize the Clients
We use Exa for search and the OpenAI client against Cerebras’ OpenAI-compatible API for agent reasoning and tool use.Step 2: Define the Exa Search Tool
The agent gets one tool:exa_search. It returns clean highlights for each result, with each source tagged [n] so the model can cite it.
A finalize helper cleans up the model’s output and appends a numbered source list, so every answer ends with reliable citations.
Step 3: Register the Tool for the Model
The schema exposes the three search types so the model can choose faster or deeper search per query. Onlyquery is required; everything else is optional.
Step 4: Run the Agent Loop
The core pattern is:- Ask the model what it needs
- Let it call the search tool
- Feed tool results back into the conversation
- Stop when the model returns a final answer
Step 5: Try It on a Real Question
Now you can ask for a grounded answer. The agent searches the web, then writes a cited answer.Inline
[n] markers map to the numbered Sources list at the end of the answer. Non-consecutive citations like [1], [2], and [4] are expected when the model cites only some of the results.Complete Example
The full agent in a single file. Copy it intoagent.py (or agent.mjs) and run it.
Summary
What We Built
A grounded research agent with:- Exa search for current source discovery, with page content (highlights) returned inline
- Cerebras tool calling to plan searches and write cited answers
- Reliable inline citations backed by a numbered source list
Next Steps
- Use
fastfor low-latency chat assistants anddeepfor broader research tasks - Lower
max_age_hoursfor newsy queries that need fresher content - Try other Exa API config in Exa API Dashboard
Resources
- Cerebras Inference Docs
- Exa API Dashboard
- Get Started with Exa
- Build Your Own Perplexity with Exa
- Automating Search-Based Report Generation with a Multi-Agent AI Pipeline

