Prerequisites
Before you begin, ensure you have:
- Cerebras API Key - Sign up and get a free API key from the Cerebras Inference platform.
- Cartesia API Key - Visit Cartesia and create an account. Navigate to your profile or project settings to generate an API key.
- Python 3.10 or higher - Required for running the integration code.
- Cartesia Line environment - Install and configure the Cartesia CLI and Line SDK following the official Line getting-started guide.
Installation and Setup
Install required dependencies
Install the Cartesia Line SDK. It includes LiteLLM (used internally for LLM routing) as a dependency:
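As a sketch, the install command might look like this (package names are taken from this guide; `python-dotenv` is optional):

```shell
# Install the Cartesia Line SDK (bundles LiteLLM) plus optional .env support
pip install cartesia-line python-dotenv
```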
`cartesia-line` is the official Cartesia Line SDK for building production voice agents. It bundles LiteLLM for calling LLM providers like Cerebras. `python-dotenv` is optional but convenient for loading environment variables from a `.env` file.
Configure environment variables
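A minimal sketch of both configuration options described in this section (the key names come from this guide; the values are placeholders):

```shell
# Option 1: a .env file in your project directory (loaded via python-dotenv)
#   CEREBRAS_API_KEY=your-cerebras-api-key
#   CARTESIA_API_KEY=your-cartesia-api-key

# Option 2: export the keys directly in your shell
export CEREBRAS_API_KEY="your-cerebras-api-key"
export CARTESIA_API_KEY="your-cartesia-api-key"
```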
Create a `.env` file in your project directory to securely store your API keys. Alternatively, you can set these as environment variables in your shell. Cartesia Line uses your `CARTESIA_API_KEY` for audio orchestration, and LiteLLM (bundled with Line) uses `CEREBRAS_API_KEY` to call Cerebras models.
Verify Cerebras connectivity via LiteLLM
Before building a voice agent, verify that Cerebras Inference is reachable through LiteLLM (which Cartesia Line uses internally to call LLMs) by making a real API call. If the call prints a response, your Cerebras API key and LiteLLM routing are working correctly.
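A minimal connectivity check along those lines, assuming `litellm` is installed and `CEREBRAS_API_KEY` is set in your environment (the model id uses LiteLLM's `cerebras/` prefix convention):

```python
import os

MODEL = "cerebras/llama3.1-8b"  # LiteLLM routes cerebras/* model ids to Cerebras Inference


def check_cerebras(prompt: str = "Say hello in one short sentence.") -> str:
    """Make a real completion call to Cerebras through LiteLLM and return the text."""
    from litellm import completion  # imported lazily so this module loads without litellm

    response = completion(
        model=MODEL,
        messages=[{"role": "user", "content": prompt}],
        max_tokens=50,
    )
    return response.choices[0].message.content


if __name__ == "__main__" and os.environ.get("CEREBRAS_API_KEY"):
    print(check_cerebras())
```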
Build a Cartesia Line voice agent with Cerebras
With Line, you do not manually manage WebSockets, audio streams, or `pyaudio`. Instead, you implement your agent’s reasoning in code and let Line handle audio, speech-to-text, and text-to-speech using Cartesia’s Sonic and Ink models.
Below is a complete `main.py` that configures a Line `LlmAgent` to use Cerebras as the LLM provider. This configuration:
- Uses Cerebras Llama 3.1 8B (`cerebras/llama3.1-8b`) as the reasoning engine — the fastest option for low-latency voice interactions.
- Passes Cerebras-specific options via `LlmConfig.extra`, including the API base URL, provider name, and an integration tracking header.
- Lets Cartesia Line handle telephony/WebRTC audio, speech recognition (Ink), text-to-speech (Sonic), streaming, barge-in, and turn-taking.
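A sketch of that `main.py`. The names `LlmAgent` and `LlmConfig.extra` come from this guide, but the import path, constructor signatures, and the exact keys inside `extra` are assumptions — consult the Line SDK reference for the real API:

```python
import os

# Cerebras-specific routing options passed through LlmConfig.extra.
# The guide only says this carries the API base URL, provider name, and an
# integration tracking header; the key names below are assumptions.
CEREBRAS_EXTRA = {
    "api_base": "https://api.cerebras.ai/v1",
    "custom_llm_provider": "cerebras",
    "extra_headers": {"X-Integration": "cartesia-line"},  # hypothetical tracking header
}


def build_agent():
    """Construct a Line LlmAgent backed by Cerebras (import path assumed)."""
    from line.llm import LlmAgent, LlmConfig  # assumed import path

    config = LlmConfig(
        model="cerebras/llama3.1-8b",  # fastest Cerebras option for voice latency
        api_key=os.environ["CEREBRAS_API_KEY"],
        extra=CEREBRAS_EXTRA,
    )
    return LlmAgent(
        config=config,
        system_prompt="You are a concise, friendly voice assistant.",
    )
```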
No `cartesia.tts.websocket()` or `pyaudio` code is required — Line encapsulates all audio orchestration.
Talk to your agent
Start the voice agent server, then use the Cartesia CLI to connect to your local Line voice agent. This opens a bi-directional audio session where:
- Audio is streamed through Cartesia’s Sonic/Ink stack.
- User speech is transcribed and sent as text to your Cerebras-backed `LlmAgent`.
- The Cerebras model responds via LiteLLM, and Line converts the reply back to high-quality speech.
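Concretely, the start-and-connect steps above might look like the following (the `cartesia` CLI subcommand is an assumption — check `cartesia --help` or the Line getting-started guide for the exact command):

```shell
# 1. Start the voice agent server defined in main.py
python main.py

# 2. In another terminal, connect the Cartesia CLI to your local agent
#    (subcommand shown is illustrative)
cartesia chat
```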
Complete Example: Voice Agent with Custom Voice and Pre-Call Configuration
This self-contained example shows how to configure a `pre_call_handler` to programmatically select Sonic voices, languages, and TTS/STT settings.
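A sketch of such a handler. The `pre_call_handler` name comes from this guide, but the request shape and config keys (`voice_id`, `language`, the voice ids) are assumptions for illustration:

```python
# Placeholder Sonic voice ids keyed by language; substitute real ids from
# the Cartesia voice library.
VOICES = {
    "en": "sonic-english-voice-id",
    "es": "sonic-spanish-voice-id",
}


def pre_call_handler(call_request: dict) -> dict:
    """Choose voice and language per call before Line starts the audio session."""
    lang = call_request.get("metadata", {}).get("language", "en")
    return {
        "tts": {"voice_id": VOICES.get(lang, VOICES["en"]), "language": lang},
        "stt": {"language": lang},
    }
```

For example, a call whose metadata carries `{"language": "es"}` would be answered with the Spanish voice and Spanish STT.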
You can also add tools to your `LlmAgent` (e.g., database lookup, web search, CRM APIs) and let Line orchestrate tool calls and multi-agent handoffs.
Complete Example: Tiered Model Selection
This example shows how to select different Cerebras models per call based on metadata — useful for offering premium vs. standard tiers. All model names use the `cerebras/` prefix (e.g., `cerebras/llama3.1-8b`).
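The core of that selection logic can be sketched as a plain function mapping call metadata to a model id (the tier names and metadata shape are illustrative; the model ids come from the table below):

```python
# Map subscription tiers to Cerebras model ids (all use the cerebras/ prefix).
TIER_MODELS = {
    "premium": "cerebras/gpt-oss-120b",  # larger model for complex reasoning
    "standard": "cerebras/llama3.1-8b",  # fastest, lowest-latency option
}


def select_model(call_metadata: dict) -> str:
    """Pick a Cerebras model for this call; unknown tiers fall back to standard."""
    tier = call_metadata.get("tier", "standard")
    return TIER_MODELS.get(tier, TIER_MODELS["standard"])
```

For example, `select_model({"tier": "premium"})` returns `cerebras/gpt-oss-120b`, while an empty metadata dict falls back to `cerebras/llama3.1-8b`.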
Available Models
Cerebras offers several models optimized for voice AI applications that work seamlessly through Cartesia Line:
| Model | Parameters | Best For |
|---|---|---|
| llama3.1-8b | 8B | Fastest option — ideal for low-latency voice interactions |
| gpt-oss-120b | 120B | Complex reasoning and demanding tasks |
| zai-glm-4.7 | 357B | Advanced model with strong reasoning capabilities |
Next Steps
Explore Cartesia Line voice options
Browse Cartesia’s voice and agent configuration options.
Advanced Line examples
See production-grade examples and patterns for voice agents.
Cerebras + LiteLLM
Cartesia Line uses LiteLLM internally. Learn more about Cerebras + LiteLLM routing, retries, and fallbacks.
Cerebras models and tooling
- Cerebras Models – Explore available models and choose the best fit for latency, cost, and capability.
- Cerebras Tool Use – Add function calling and tool use on top of Cerebras models, then expose those tools through your Line voice agents.
- Migrate to GLM4.7 – Ready to upgrade? Follow the Cerebras migration guide to start using the latest zai-glm-4.7 model in your Line agents.
Additional Resources
- Cartesia Line SDK – Build production voice agents with Cartesia’s Line framework
- Cartesia Docs Home – Full Cartesia and Sonic/Ink documentation
- Cerebras Inference Docs – Full Cerebras API reference and integration guides

