Cartesia is a voice AI platform that provides ultra-realistic, real-time Text-to-Speech (TTS) models with industry-leading latency. By combining Cerebras Inference’s lightning-fast LLM responses with Cartesia’s natural-sounding voice synthesis, you can build highly responsive voice agents and conversational AI applications. This guide will walk you through integrating Cerebras models with Cartesia Line to create a complete voice AI pipeline.

Prerequisites

Before you begin, ensure you have:
  • Cerebras API Key - Sign up and get a free API key from the Cerebras Inference platform.
  • Cartesia API Key - Visit Cartesia and create an account. Navigate to your profile or project settings to generate an API key.
  • Python 3.10 or higher - Required for running the integration code.
  • Cartesia Line environment - Install and configure the Cartesia CLI and Line SDK following the official Line getting-started guide.

Installation and Setup

1

Install required dependencies

Install the Cartesia Line SDK. It includes LiteLLM (used internally for LLM routing) as a dependency:
pip install cartesia-line python-dotenv
  • cartesia-line is the official Cartesia Line SDK for building production voice agents. It bundles LiteLLM for calling LLM providers like Cerebras.
  • python-dotenv is optional but convenient for loading environment variables from a .env file.
2

Configure environment variables

Create a .env file in your project directory to securely store your API keys:
CEREBRAS_API_KEY=your-cerebras-api-key-here
CARTESIA_API_KEY=your-cartesia-api-key-here
Alternatively, you can set these as environment variables in your shell:
export CEREBRAS_API_KEY="your-cerebras-api-key-here"
export CARTESIA_API_KEY="your-cartesia-api-key-here"
Cartesia Line uses your CARTESIA_API_KEY for audio orchestration, and LiteLLM (bundled with Line) uses CEREBRAS_API_KEY to call Cerebras models.
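Because a missing key surfaces only later as a confusing auth error, it can help to check both variables up front. The helper below is a minimal sketch (the `missing_keys` function is illustrative, not part of any SDK):

```python
import os


def missing_keys(env, required=("CEREBRAS_API_KEY", "CARTESIA_API_KEY")):
    """Return the names of required API keys that are unset or empty in env."""
    return [name for name in required if not env.get(name)]


# Check the real environment; warn rather than raise so imports stay safe.
absent = missing_keys(os.environ)
if absent:
    print("Warning: missing environment variables:", ", ".join(absent))
```

Run this once at startup and fail fast if anything is reported missing.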
3

Verify Cerebras connectivity via LiteLLM

Before building a voice agent, verify that Cerebras Inference is reachable through LiteLLM (which Cartesia Line uses internally to call LLMs). This self-contained example makes a real API call:
import os

from dotenv import load_dotenv
import litellm

load_dotenv()

# LiteLLM routes to Cerebras when the model name starts with "cerebras/"
response = litellm.completion(
    model="cerebras/llama3.1-8b",
    api_key=os.getenv("CEREBRAS_API_KEY"),
    api_base="https://api.cerebras.ai/v1",
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "What is Cartesia Line used for? Answer in one sentence."}
    ],
    max_tokens=100,
    temperature=0.7,
    extra_headers={"X-Cerebras-3rd-Party-Integration": "cartesia-line"},
)

print(f"Model: {response.model}")
print(f"Response: {response.choices[0].message.content}")
print(f"Tokens used: {response.usage.total_tokens}")
If this prints a response, your Cerebras API key and LiteLLM routing are working correctly.
4

Build a Cartesia Line voice agent with Cerebras

With Line, you do not manually manage WebSockets, audio streams, or pyaudio. Instead, you implement your agent’s reasoning in code and let Line handle audio, speech-to-text, and text-to-speech using Cartesia’s Sonic and Ink models. Below is a complete main.py that configures a Line LlmAgent to use Cerebras as the LLM provider:
import os

from dotenv import load_dotenv
from line.llm_agent import LlmAgent, LlmConfig, end_call
from line.voice_agent_app import VoiceAgentApp
from line.events import CallStarted, UserTurnEnded, UserTextSent, CallEnded

load_dotenv()

async def get_agent(env, call_request):
    """
    Create a Cartesia Line LlmAgent backed by a Cerebras model.
    """
    llm_config = LlmConfig.from_call_request(
        call_request,
        fallback_system_prompt=(
            "You are a helpful, friendly voice assistant. "
            "Keep responses concise and natural for voice output."
        ),
        fallback_introduction=(
            "Hi there! How can I help you today?"
        ),
        temperature=0.7,
        max_tokens=256,
        extra={
            "api_base": "https://api.cerebras.ai/v1",
            "custom_llm_provider": "cerebras",
            "extra_headers": {
                "X-Cerebras-3rd-Party-Integration": "cartesia-line"
            },
        },
    )

    return LlmAgent(
        model="cerebras/llama3.1-8b",
        api_key=os.getenv("CEREBRAS_API_KEY"),
        tools=[end_call],
        config=llm_config,
        run_filter=[CallStarted, UserTurnEnded, UserTextSent, CallEnded],
    )

app = VoiceAgentApp(get_agent=get_agent)

print(f"VoiceAgentApp created: {type(app).__name__}")
print("Agent configured for Cerebras Llama 3.1 8B")
print("Start the server with: PORT=8000 uv run python main.py")
This configuration:
  • Uses Cerebras Llama 3.1 8B (cerebras/llama3.1-8b) as the reasoning engine — the fastest option for low-latency voice interactions.
  • Passes Cerebras-specific options via LlmConfig.extra, including the API base URL, provider name, and an integration tracking header.
  • Lets Cartesia Line handle telephony/WebRTC audio, speech recognition (Ink), text-to-speech (Sonic), streaming, barge-in, and turn-taking.
No direct use of cartesia.tts.websocket() or pyaudio is required — Line encapsulates all audio orchestration.
5

Talk to your agent

Start the voice agent server:
PORT=8000 uv run python main.py
Use the Cartesia CLI to connect to your local Line voice agent:
cartesia chat 8000
This opens a bi-directional audio session where:
  • Audio is streamed through Cartesia’s Sonic/Ink stack.
  • User speech is transcribed and sent as text to your Cerebras-backed LlmAgent.
  • The Cerebras model responds via LiteLLM, and Line converts the reply back to high-quality speech.
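Conceptually, each turn flows through a transcribe → reason → synthesize pipeline. The sketch below simulates that loop with plain functions; it is purely illustrative (Line performs these stages internally with Ink, your LlmAgent, and Sonic), and none of these function names exist in the SDK:

```python
def transcribe(audio: bytes) -> str:
    """Stand-in for Ink speech-to-text; here the 'audio' is just UTF-8 text."""
    return audio.decode("utf-8")


def reason(user_text: str) -> str:
    """Stand-in for the Cerebras-backed LlmAgent reply."""
    return f"You said: {user_text}"


def synthesize(reply_text: str) -> bytes:
    """Stand-in for Sonic text-to-speech; returns the reply as 'audio' bytes."""
    return reply_text.encode("utf-8")


def handle_turn(audio_in: bytes) -> bytes:
    """One user turn: speech in, speech out."""
    return synthesize(reason(transcribe(audio_in)))
```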

Complete Example: Voice Agent with Custom Voice and Pre-Call Configuration

This self-contained example shows how to configure a pre_call_handler to programmatically select Sonic voices, languages, and TTS/STT settings:
import os

from dotenv import load_dotenv
from line.llm_agent import LlmAgent, LlmConfig, end_call
from line.voice_agent_app import CallRequest, PreCallResult, VoiceAgentApp
from line.events import CallStarted, UserTurnEnded, UserTextSent, CallEnded

load_dotenv()

async def get_agent(env, call_request):
    """Create a Cartesia Line LlmAgent backed by Cerebras."""
    llm_config = LlmConfig.from_call_request(
        call_request,
        fallback_system_prompt=(
            "You are a friendly customer service representative. "
            "Be helpful, concise, and professional."
        ),
        fallback_introduction="Hello! How can I assist you today?",
        temperature=0.8,
        max_tokens=256,
        extra={
            "api_base": "https://api.cerebras.ai/v1",
            "custom_llm_provider": "cerebras",
            "extra_headers": {
                "X-Cerebras-3rd-Party-Integration": "cartesia-line"
            },
        },
    )

    return LlmAgent(
        model="cerebras/llama3.1-8b",
        api_key=os.getenv("CEREBRAS_API_KEY"),
        tools=[end_call],
        config=llm_config,
        run_filter=[CallStarted, UserTurnEnded, UserTextSent, CallEnded],
    )

async def pre_call_handler(call_request: CallRequest) -> PreCallResult:
    """Configure voice and TTS/STT settings before each call."""
    return PreCallResult(
        metadata={"tier": "premium"},
        config={
            "tts": {
                "voice": "a0e99841-438c-4a64-b679-ae501e7d6091",  # Cartesia voice ID
                "model": "sonic-3",
                "language": "en",
            },
        },
    )

app = VoiceAgentApp(get_agent=get_agent, pre_call_handler=pre_call_handler)

print(f"VoiceAgentApp created with pre_call_handler: {type(app).__name__}")
print("Configured with Sonic 3 voice and Cerebras Llama 3.1 8B")
You can also attach tools and multi-step workflows to your LlmAgent (e.g., database lookup, web search, CRM APIs) and let Line orchestrate tool calls and multi-agent handoffs.
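As a sketch of what such a tool might look like, a plain async function can wrap a backend lookup and be passed alongside end_call. The `lookup_order` function and its data are hypothetical; only the `tools=[...]` pattern comes from the examples above:

```python
import asyncio

# Hypothetical order database standing in for a real CRM or database call.
ORDERS = {"A-1001": "shipped", "A-1002": "processing"}


async def lookup_order(order_id: str) -> str:
    """Look up the shipping status of an order by its ID."""
    await asyncio.sleep(0)  # placeholder for real async I/O
    status = ORDERS.get(order_id)
    if status is None:
        return f"No order found for {order_id}."
    return f"Order {order_id} is {status}."

# In get_agent(), you would then register it on the agent, e.g.:
#   LlmAgent(..., tools=[end_call, lookup_order], ...)
```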

Complete Example: Tiered Model Selection

This example shows how to select different Cerebras models per call based on metadata — useful for offering premium vs. standard tiers:
import os

from dotenv import load_dotenv
import litellm
from line.llm_agent import LlmAgent, LlmConfig, end_call
from line.voice_agent_app import VoiceAgentApp
from line.events import CallStarted, UserTurnEnded, UserTextSent, CallEnded

load_dotenv()

# Available Cerebras models for voice agents (fastest to most capable)
CEREBRAS_MODELS = {
    "fast": "cerebras/llama3.1-8b",
    "powerful": "cerebras/gpt-oss-120b",
}

async def get_agent(env, call_request):
    """Select a Cerebras model based on call metadata for tiered experiences."""
    metadata = getattr(call_request, "metadata", {}) or {}
    tier = metadata.get("tier", "fast")
    model = CEREBRAS_MODELS.get(tier, CEREBRAS_MODELS["fast"])

    llm_config = LlmConfig.from_call_request(
        call_request,
        fallback_system_prompt="You are a helpful voice assistant.",
        fallback_introduction="Hi there! How can I help?",
        temperature=0.7,
        max_tokens=256,
        extra={
            "api_base": "https://api.cerebras.ai/v1",
            "custom_llm_provider": "cerebras",
            "extra_headers": {
                "X-Cerebras-3rd-Party-Integration": "cartesia-line"
            },
        },
    )

    return LlmAgent(
        model=model,
        api_key=os.getenv("CEREBRAS_API_KEY"),
        tools=[end_call],
        config=llm_config,
        run_filter=[CallStarted, UserTurnEnded, UserTextSent, CallEnded],
    )

app = VoiceAgentApp(get_agent=get_agent)

# Verify the LiteLLM → Cerebras path works
response = litellm.completion(
    model="cerebras/llama3.1-8b",
    api_key=os.getenv("CEREBRAS_API_KEY"),
    api_base="https://api.cerebras.ai/v1",
    messages=[{"role": "user", "content": "Say hello in exactly 3 words."}],
    max_tokens=20,
    extra_headers={"X-Cerebras-3rd-Party-Integration": "cartesia-line"},
)

print(f"VoiceAgentApp created: {type(app).__name__}")
print(f"Available tiers: {list(CEREBRAS_MODELS.keys())}")
print(f"Cerebras test response: {response.choices[0].message.content}")
When calling through LiteLLM from Line, always use the cerebras/ prefix (e.g., cerebras/llama3.1-8b).
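A small guard can normalize model names before they reach LiteLLM. This helper is illustrative only (not part of LiteLLM or Line):

```python
def with_cerebras_prefix(model: str) -> str:
    """Ensure a model name carries the 'cerebras/' provider prefix LiteLLM expects."""
    if model.startswith("cerebras/"):
        return model
    return f"cerebras/{model}"
```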

Available Models

Cerebras offers several models optimized for voice AI applications that work seamlessly through Cartesia Line:
Model        | Parameters | Best For
llama3.1-8b  | 8B         | Fastest option; ideal for low-latency voice interactions
gpt-oss-120b | 120B       | Complex reasoning and demanding tasks
zai-glm-4.7  | 357B       | Advanced model with strong reasoning capabilities

Next Steps

Explore Cartesia Line voice options

Browse Cartesia’s voice and agent configuration options.

Advanced Line examples

See production-grade examples and patterns for voice agents.

Cerebras + LiteLLM

Cartesia Line uses LiteLLM internally. Learn more about Cerebras + LiteLLM routing, retries, and fallbacks.

Cerebras models and tooling

  • Cerebras Models – Explore available models and choose the best fit for latency, cost, and capability.
  • Cerebras Tool Use – Add function calling and tool use on top of Cerebras models, then expose those tools through your Line voice agents.
  • Migrate to GLM4.7 – Ready to upgrade? Follow the Cerebras migration guide to start using the latest zai-glm-4.7 model in your Line agents.

Additional Resources