Sarah Chieng
August 11, 2025
To get started, you’ll need the following API keys:
  • Cerebras: the fastest inference on earth. Get an API key here.
  • LiveKit: build enterprise-grade realtime AI agents. Get your API key here.
  • Cartesia: an STT/TTS service with realistic voices. Get a free API key here.
If you have any questions, please reach out on the Cerebras Discord.

Step 1: Install Required Packages

First, let’s install all the necessary libraries, import everything we need, and configure our API credentials.
# Install all required packages for our voice sales agent
!pip install "livekit-agents[cartesia,silero,openai]" -q

print("✅ All packages installed successfully!")
import os
from livekit import agents
from livekit.agents import Agent, AgentSession, JobContext, WorkerOptions
from livekit.plugins import openai, silero, cartesia
from pathlib import Path

# IMPORTANT: Replace the empty strings below with your actual API keys
CARTESIA_API_KEY = ""
CEREBRAS_API_KEY = ""

# Set the API keys in environment variables (required for the services to work)
os.environ["CARTESIA_API_KEY"] = CARTESIA_API_KEY
os.environ["CEREBRAS_API_KEY"] = CEREBRAS_API_KEY

print("✅ Libraries imported and API keys configured")
print("🔑 Make sure CARTESIA_API_KEY and CEREBRAS_API_KEY above are set to your actual API keys!")
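
As an optional safeguard, a small check like the one below can catch empty keys before any service is called. This is only a minimal sketch: `missing_keys` and `REQUIRED_KEYS` are hypothetical names introduced here, not part of LiveKit, Cartesia, or Cerebras.

```python
import os

def missing_keys(names, env=None):
    """Return the names whose environment value is unset or blank."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name, "").strip()]

# Hypothetical usage: check the two keys this notebook sets above.
REQUIRED_KEYS = ["CARTESIA_API_KEY", "CEREBRAS_API_KEY"]
missing = missing_keys(REQUIRED_KEYS)
if missing:
    print("Missing API keys:", ", ".join(missing))
else:
    print("All required API keys are set")
```

Running it right after the configuration cell reports which keys still need a value, so an empty string shows up here and not as an authentication error later in the session.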

Step 2: Context Loading Function

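This function loads all files from the context/ directory, reads your sales documents, and formats them for the AI to reference during conversations. For now, we will add a sample products.json to the directory. The commands below write to /content/context, which matches the relative context/ path used by load_context() when the notebook's working directory is /content (the default in Google Colab).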
!mkdir -p /content/context

# Download sample Product JSON Data
!wget -nc -P /content/context \
     https://gist.githubusercontent.com/ShayneP/f373c26c5166d90446f2bc08baf9bf46/raw/products.json
def load_context():
    """Load all files from context directory"""
    context_dir = Path("context")
    context_dir.mkdir(exist_ok=True)

    all_content = ""
    for file_path in context_dir.glob("*"):
        if file_path.is_file():
            try:
                content = file_path.read_text(encoding='utf-8')
                all_content += f"\n=== {file_path.name} ===\n{content}\n"
            except (OSError, UnicodeDecodeError):
                # Skip files that can't be read or aren't valid UTF-8 text
                continue

    return all_content.strip() or "No files found"

print(load_context())
print("✅ Context loading function ready")
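
Because the whole context ends up in the system prompt, it may be worth capping its size before handing it to the model. The sketch below is one possible guard, not something the tutorial itself does: `cap_context` is a hypothetical helper, and the 20,000-character default is an arbitrary assumption that you would tune to your model's context window.

```python
def cap_context(text, max_chars=20_000, marker="\n[context truncated]"):
    """Return text unchanged if it fits, else cut it and append a marker.

    Assumes max_chars is at least len(marker); the limit is counted in
    characters, which is only a rough proxy for tokens.
    """
    if len(text) <= max_chars:
        return text
    keep = max(0, max_chars - len(marker))
    return text[:keep] + marker

# Hypothetical usage with a stand-in string in place of load_context()
long_text = "x" * 100
capped = cap_context(long_text, max_chars=50)
print(len(capped))
```

In the notebook, the call would wrap the loader, for example `cap_context(load_context())`.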

Step 3: Build the Sales Agent

In this step, we define the SalesAgent class — your AI voice assistant powered by:
  • Cerebras for natural language generation (via Llama 3.3 70B)
  • Cartesia TTS for text to speech
  • Cartesia STT (Ink-Whisper) for realtime speech to text
  • Silero VAD for voice activity detection
The agent loads our context and is instructed to answer user questions only from that information, which reduces the risk of hallucinated details. We also define an on_enter() method so that the agent greets users as soon as they join the room, making the experience feel conversational from the start.
All responses are spoken aloud, so we’ve added constraints to the prompt to avoid things like bullets or non-verbal symbols!
class SalesAgent(Agent):
    def __init__(self):
        # Load context once at startup
        context = load_context()
        print(f"📄 Loaded context: {len(context)} characters")

        llm = openai.LLM.with_cerebras(model="llama-3.3-70b")
        stt = cartesia.STT()
        tts = cartesia.TTS()
        vad = silero.VAD.load()

        # Put ALL context in system instructions
        instructions = f"""
        You are a sales agent communicating by voice. All text that you return
        will be spoken aloud, so don't use things like bullets, slashes, or any
        other unpronounceable punctuation.

        You have access to the following company information:

        {context}

        CRITICAL RULES:
        - ONLY use information from the context above
        - If asked about something not in the context, say "I don't have that information"
        - DO NOT make up prices, features, or any other details
        - Quote directly from the context when possible
        - Be a sales agent but only use the provided information
        """

        super().__init__(
            instructions=instructions,
            stt=stt, llm=llm, tts=tts, vad=vad
        )

    # This tells the Agent to greet the user as soon as they join, with some context about the greeting.
    async def on_enter(self):
        self.session.generate_reply(user_input="Give a short, 1 sentence greeting. Offer to answer any questions.")

print("✅ Sales Agent class ready")
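
The prompt constraints above ask the model to avoid bullets and symbols, but a model may still emit them occasionally. As a complementary idea, text could be cleaned before it is spoken. The sketch below is a hypothetical post-processing helper (`to_speakable` is not a LiveKit or Cartesia function), and where to hook it into the voice pipeline is left open.

```python
import re

def to_speakable(text):
    """Remove list markers and symbols that a TTS voice can't read naturally."""
    lines = []
    for line in text.splitlines():
        # Drop leading bullet or numbered-list markers such as "- ", "• ", "1. "
        line = re.sub(r"^\s*(?:[-*•]|\d+[.)])\s+", "", line)
        if line.strip():
            lines.append(line.strip())
    joined = " ".join(lines)
    # Remove markdown emphasis, heading, and code characters
    joined = re.sub(r"[*_#`]", "", joined)
    # Collapse any leftover runs of whitespace
    return re.sub(r"\s+", " ", joined).strip()

sample = "Our plans:\n- **Basic** plan\n- Pro plan\n1. Contact sales"
print(to_speakable(sample))
```

This is deliberately crude: it also strips underscores from identifiers, so it suits conversational replies better than technical text.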

Step 4: Run the Agent

Define the entry point function that LiveKit calls when someone joins a voice session. It connects to the room, creates the agent, and starts the conversation session. This also creates a web interface where you can talk to your agent directly in the notebook.
If you can’t unmute your microphone, try stopping the cell and running it again.
from livekit.agents import jupyter

async def entrypoint(ctx: JobContext):
    await ctx.connect()
    agent = SalesAgent()
    session = AgentSession()
    await session.start(room=ctx.room, agent=agent)

jupyter.run_app(
    WorkerOptions(entrypoint_fnc=entrypoint),
    jupyter_url="https://jupyter-api-livekit.vercel.app/api/join-token"
)

Challenges

The following optional sections expand on our agent.

Step 5: Multi-Agent Support

Now let’s enhance our sales agent with multi-agent capabilities, so that different specialists can handle different parts of the conversation. For this, we’ll import function_tool, which lets us expose methods as tools the LLM can call. Switching from one agent to another happens as the result of such a tool call.
from livekit.agents import function_tool

print("✅ Function tool import ready for multi-agent support")

Enhanced Sales Agent with Transfer Capabilities

Now let’s modify our SalesAgent so that it can transfer to other specialists. This is straightforward: when a function tool returns an instance of another Agent, that agent takes over the interaction.
class SalesAgent(Agent):
    def __init__(self):
        context = load_context()

        llm = openai.LLM.with_cerebras(model="llama-3.3-70b")
        stt = cartesia.STT()
        tts = cartesia.TTS()
        vad = silero.VAD.load()

        # Put ALL context in system instructions
        instructions = f"""
        You are a sales agent communicating by voice. All text that you return
        will be spoken aloud, so don't use things like bullets, slashes, or any
        other unpronounceable punctuation.

        You have access to the following company information:

        {context}

        CRITICAL RULES:
        - ONLY use information from the context above
        - If asked about something not in the context, say "I don't have that information"
        - DO NOT make up prices, features, or any other details
        - Quote directly from the context when possible
        - Be a sales agent but only use the provided information
        """

        super().__init__(
            instructions=instructions,
            stt=stt, llm=llm, tts=tts, vad=vad
        )

    # This tells the Agent to greet the user as soon as they join, with some context about the greeting.
    async def on_enter(self):
        print("Current Agent: 🏷️ Sales Agent 🏷️")
        self.session.generate_reply(user_input="Give a short, 1 sentence greeting. Offer to answer any questions.")

    @function_tool
    async def switch_to_tech_support(self):
        """Switch to a technical support rep"""
        await self.session.generate_reply(user_input="Confirm you are transferring to technical support")
        return TechnicalAgent()

    @function_tool
    async def switch_to_pricing(self):
        """Switch to pricing specialist"""
        await self.session.generate_reply(user_input="Confirm you are transferring to a pricing specialist")
        return PricingAgent()

print("✅ Sales Agent class ready")
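
To make the handoff idea concrete, here is a toy model in plain Python of "a tool returns another agent, and the session swaps to it". It is only an illustration of the pattern under simplified assumptions: the `Toy*` classes are hypothetical stand-ins and do not reflect how LiveKit implements handoffs internally.

```python
class ToyAgent:
    """Stand-in for an agent: maps a tool name to a callable returning the next agent."""
    name = "agent"

    def tools(self):
        return {}

class ToySales(ToyAgent):
    name = "sales"
    def tools(self):
        return {"technical": lambda: ToyTechnical(), "pricing": lambda: ToyPricing()}

class ToyTechnical(ToyAgent):
    name = "technical"
    def tools(self):
        return {"sales": lambda: ToySales(), "pricing": lambda: ToyPricing()}

class ToyPricing(ToyAgent):
    name = "pricing"
    def tools(self):
        return {"sales": lambda: ToySales(), "technical": lambda: ToyTechnical()}

class ToySession:
    """Holds the active agent and swaps it when a tool returns another agent."""
    def __init__(self, agent):
        self.agent = agent
        self.history = [agent.name]

    def call_tool(self, tool_name):
        tool = self.agent.tools().get(tool_name)
        if tool is None:
            return False  # the current agent has no such tool
        result = tool()
        if isinstance(result, ToyAgent):
            # A returned agent becomes the active one (the handoff)
            self.agent = result
            self.history.append(result.name)
        return True

toy_session = ToySession(ToySales())
toy_session.call_tool("pricing")
toy_session.call_tool("technical")
toy_session.call_tool("sales")
print(" -> ".join(toy_session.history))
```

Each agent only exposes tools for the other two roles, which mirrors how the real agents in this section each define two switch_to_... methods.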

Technical Specialist Agent

Our Sales Agent won’t be able to transfer anywhere unless we also define the other agents that it is going to be switching to. For the technical specialist, we’re going to change the prompt to focus on technical questions instead. We’re also going to change the voice so that it’s clear to the user that they’re speaking to a different agent. We’ll also include the tool calls to switch between agents.
class TechnicalAgent(Agent):
    """Technical specialist for detailed product specifications"""

    def __init__(self):
        context = load_context()

        llm = openai.LLM.with_cerebras(model="llama-3.3-70b")
        stt = cartesia.STT()
        tts = cartesia.TTS(voice="bf0a246a-8642-498a-9950-80c35e9276b5")
        vad = silero.VAD.load()

        instructions = f"""
        You are a technical specialist communicating by voice. All text that you return
        will be spoken aloud, so don't use things like bullets, slashes, or any
        other unpronounceable punctuation.

        You specialize in technical details, specifications, and implementation questions.
        Focus on technical accuracy and depth.

        You have access to the following company information:

        {context}

        CRITICAL RULES:
        - ONLY use information from the context above
        - Focus on technical specifications and features
        - Explain technical concepts clearly for non-technical users
        - DO NOT make up technical details

        You can transfer to other specialists:
        - Use switch_to_sales() to return to general sales
        - Use switch_to_pricing() for pricing questions
        """

        super().__init__(
            instructions=instructions,
            stt=stt, llm=llm, tts=tts, vad=vad
        )

    async def on_enter(self):
        """Called when entering this agent"""
        print("Current Agent: 💻 Technical Specialist 💻")
        await self.session.say("Hi, I'm the technical specialist. I can help you with detailed technical questions about our products.")

    @function_tool
    async def switch_to_sales(self):
        """Switch to a sales representative"""
        await self.session.generate_reply(user_input="Confirm you are transferring to the sales team")
        return SalesAgent()

    @function_tool
    async def switch_to_pricing(self):
        """Switch to pricing specialist"""
        await self.session.generate_reply(user_input="Confirm you are transferring to a pricing specialist")
        return PricingAgent()

print("✅ Technical Agent ready")

Pricing Specialist Agent

Next, we’ll define our last agent, the Pricing Specialist. Like the Technical Specialist, this agent has its own voice and its own prompt. We’ll finish it off with the same kind of tool calls, this time making the Technical Specialist and the Sales Agent available as transfer targets.
class PricingAgent(Agent):
    """Pricing specialist for budget and cost discussions"""

    def __init__(self):
        context = load_context()

        llm = openai.LLM.with_cerebras(model="llama-3.3-70b")
        stt = cartesia.STT()
        tts = cartesia.TTS(voice="4df027cb-2920-4a1f-8c34-f21529d5c3fe")
        vad = silero.VAD.load()

        instructions = f"""
        You are a pricing specialist communicating by voice. All text that you return
        will be spoken aloud, so don't use things like bullets, slashes, or any
        other unpronounceable punctuation.

        You specialize in pricing, budgets, discounts, and financial aspects.
        Help customers find the best value for their needs.

        You have access to the following company information:

        {context}

        CRITICAL RULES:
        - ONLY use pricing information from the context above
        - Focus on value proposition and ROI
        - Help customers understand pricing tiers and options
        - DO NOT make up prices or discounts

        You can transfer to other specialists:
        - Use switch_to_sales() to return to general sales
        - Use switch_to_technical() for technical questions
        """

        super().__init__(
            instructions=instructions,
            stt=stt, llm=llm, tts=tts, vad=vad
        )

    async def on_enter(self):
        """Called when entering this agent"""
        print("Current Agent: 💰 Pricing Agent 💰")
        await self.session.say("Hello, I'm the pricing specialist. I can help you understand our pricing options and find the best value for your needs.")

    @function_tool
    async def switch_to_sales(self):
        """Switch back to sales representative"""
        await self.session.generate_reply(user_input="Confirm you are transferring to the sales team")
        return SalesAgent()

    @function_tool
    async def switch_to_technical(self):
        """Switch to technical specialist"""
        await self.session.generate_reply(user_input="Confirm you are transferring to technical support")
        return TechnicalAgent()

print("✅ Simple Pricing Agent ready")
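
The three agents share most of their prompt text, so one possible refactor is to assemble the instructions from common parts. The following is a sketch under that assumption; `build_instructions` and `VOICE_PREAMBLE` are hypothetical names, and the context string is a placeholder for whatever load_context() returns.

```python
VOICE_PREAMBLE = (
    "You are a {role} communicating by voice. All text that you return "
    "will be spoken aloud, so don't use things like bullets, slashes, or any "
    "other unpronounceable punctuation."
)

def build_instructions(role, context, rules, focus=""):
    """Assemble a system prompt from the parts the three agents share."""
    sections = [VOICE_PREAMBLE.format(role=role)]
    if focus:
        sections.append(focus)
    sections.append("You have access to the following company information:")
    # The context is appended as-is, so braces in JSON files are safe
    sections.append(context)
    rule_lines = "\n".join(f"- {rule}" for rule in rules)
    sections.append("CRITICAL RULES:\n" + rule_lines)
    return "\n\n".join(sections)

pricing_prompt = build_instructions(
    role="pricing specialist",
    context="(output of load_context() goes here)",
    rules=[
        "ONLY use pricing information from the context above",
        "DO NOT make up prices or discounts",
    ],
    focus="You specialize in pricing, budgets, discounts, and financial aspects.",
)
print(pricing_prompt)
```

Each agent's __init__ could then pass the result as instructions, keeping only the role-specific wording in the class itself.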

Multi-Agent Entrypoint

Our new entrypoint needs to specify which agent starts the interaction. We want the Sales Agent to start the call, so we pass agent=SalesAgent() to session.start().
async def multi_agent_entrypoint(ctx: JobContext):
    """Simple multi-agent entry point"""
    await ctx.connect()

    # Create session
    session = AgentSession()

    # Start with sales agent
    await session.start(
        agent=SalesAgent(),
        room=ctx.room
    )

# Run the simple multi-agent system
jupyter.run_app(
    WorkerOptions(entrypoint_fnc=multi_agent_entrypoint),
    jupyter_url="https://jupyter-api-livekit.vercel.app/api/join-token"
)

Try it out!

With this multi-agent setup:
  1. Conversations start with the Sales Agent - The general sales representative who can answer basic questions
  2. Agents can transfer to specialists - When the conversation requires specialized knowledge:
    • Say “I need technical details” to transfer to the Technical Agent
    • Say “Let’s discuss pricing” to transfer to the Pricing Agent
  3. Specialists can transfer back - Any agent can transfer to any other agent.
As a stretch goal, you can add different context documents for each agent, or even fetch context by using external API calls!
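
One way to approach the per-agent context idea is to keep shared files at the top of the context directory and agent-specific files in a subfolder named after the agent. The sketch below assumes that layout, which is not part of the original tutorial; `load_context_for` is a hypothetical helper, and the files written in the demo are made-up sample data.

```python
import tempfile
from pathlib import Path

def load_context_for(agent_name, base_dir="context"):
    """Load shared files from base_dir plus files from base_dir/<agent_name>/."""
    base = Path(base_dir)
    candidates = []
    for folder in (base, base / agent_name):
        if folder.is_dir():
            candidates.extend(sorted(p for p in folder.glob("*") if p.is_file()))
    parts = []
    for file_path in candidates:
        try:
            content = file_path.read_text(encoding="utf-8")
        except (OSError, UnicodeDecodeError):
            continue  # skip unreadable or non-text files
        parts.append(f"=== {file_path.name} ===\n{content}")
    return "\n\n".join(parts) or "No files found"

# Demo in a temporary directory with made-up sample files
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "products.json").write_text('{"name": "Widget"}', encoding="utf-8")
    Path(tmp, "pricing").mkdir()
    Path(tmp, "pricing", "tiers.txt").write_text("Sample pricing notes", encoding="utf-8")
    pricing_context = load_context_for("pricing", base_dir=tmp)
    technical_context = load_context_for("technical", base_dir=tmp)

print(pricing_context)
```

With this layout, PricingAgent could call load_context_for("pricing") in its __init__ while the other agents see only the shared files plus their own folders.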