What is LiveKit?

LiveKit is an open-source platform that enables scalable, multi-user conferencing with WebRTC. It provides the tools you need to add real-time video, audio, and data capabilities to your applications. By combining LiveKit with Cerebras’s ultra-fast inference, you can build responsive voice AI agents that handle conversations with minimal latency. Learn more at LiveKit.io.

Prerequisites

Before you begin, ensure you have:
  • Cerebras API Key - Get a free API key here
  • OpenAI API Key - Required for speech-to-text (Whisper). Get one at OpenAI
  • LiveKit Account - Visit LiveKit Cloud and create an account to get your API credentials
  • Python 3.11 - 3.13 - LiveKit agents require Python < 3.14. Verify your version with python --version.
Cerebras provides ultra-fast LLM inference but does not currently offer speech-to-text (STT) models. This guide uses OpenAI’s Whisper for STT and Cerebras for the LLM, giving you the best of both worlds.
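The Python version constraint above can also be checked programmatically at startup. A minimal sketch; the `supported` helper is our own, not part of LiveKit:

```python
import sys

# Hypothetical helper mirroring the prerequisite: Python 3.11 - 3.13 (< 3.14).
def supported(version_info: tuple) -> bool:
    """Return True if this interpreter version can run LiveKit agents."""
    return (3, 11) <= version_info[:2] < (3, 14)

print(supported(sys.version_info))
```

Calling this once before importing the agents framework gives a clearer error than a failed install.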

Configure LiveKit with Cerebras

1

Create and activate a virtual environment

Set up an isolated Python environment for your project. This keeps dependencies organized and prevents conflicts with other projects.
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
2

Install LiveKit agents and dependencies

Install the LiveKit agents framework with the necessary plugins. This includes OpenAI-compatible clients (which we’ll use to connect to Cerebras), voice activity detection (VAD), and text-to-speech capabilities.
pip install 'livekit-agents[openai,silero,deepgram,cartesia,turn-detector]~=1.0' python-dotenv
The openai plugin allows LiveKit to work with any OpenAI-compatible API, including Cerebras Inference.
3

Configure environment variables

Create a .env file in your project directory with your API credentials. These credentials authenticate your application with Cerebras, OpenAI, and LiveKit services.
CEREBRAS_API_KEY=your-cerebras-api-key-here
OPENAI_API_KEY=your-openai-api-key-here
LIVEKIT_URL=your-livekit-url-here
LIVEKIT_API_KEY=your-livekit-api-key-here
LIVEKIT_API_SECRET=your-livekit-api-secret-here
Get your LiveKit credentials from the LiveKit Cloud dashboard:
  • LIVEKIT_URL: Your project URL (starts with wss://)
  • LIVEKIT_API_KEY and LIVEKIT_API_SECRET: Generate these in Settings → Keys
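It can help to fail fast on missing credentials before starting the agent. A small sketch; the `missing_settings` helper and key list are our own, matching the .env file above:

```python
import os

# The five settings this guide's .env file defines.
REQUIRED_KEYS = [
    "CEREBRAS_API_KEY", "OPENAI_API_KEY",
    "LIVEKIT_URL", "LIVEKIT_API_KEY", "LIVEKIT_API_SECRET",
]

def missing_settings(env: dict) -> list:
    """Return the required keys that are absent or empty in env."""
    return [k for k in REQUIRED_KEYS if not env.get(k)]

# Run against the real environment after load_dotenv() in your script:
print(missing_settings(dict(os.environ)))
```

An empty list means all five credentials are set.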
4

Create a basic voice agent

Build a complete voice AI agent that uses OpenAI Whisper for speech-to-text and Cerebras for ultra-fast LLM responses. Create a file named voice_agent.py:
import logging
import os
from dotenv import load_dotenv
from livekit import agents, api
from livekit.agents import AgentSession, Agent, RoomInputOptions
from livekit.plugins import openai, silero

load_dotenv()

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("voice-agent")

CEREBRAS_URL = "https://api.cerebras.ai/v1"
CEREBRAS_API_KEY = os.getenv("CEREBRAS_API_KEY")
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")

LIVEKIT_API_KEY = os.getenv("LIVEKIT_API_KEY")
LIVEKIT_API_SECRET = os.getenv("LIVEKIT_API_SECRET")
LIVEKIT_URL = os.getenv("LIVEKIT_URL")

class Assistant(Agent):
    def __init__(self) -> None:
        super().__init__(instructions="You are a helpful voice AI assistant.")

# Initialize OpenAI STT (Speech-to-Text) using Whisper
stt = openai.STT(
    model="whisper-1",
    api_key=OPENAI_API_KEY
)

# Initialize Cerebras LLM for ultra-fast conversation
llm = openai.LLM(
    model="llama-3.3-70b",
    api_key=CEREBRAS_API_KEY,
    base_url=CEREBRAS_URL
)

async def entrypoint(ctx: agents.JobContext):
    logger.info(f"Starting agent in room {ctx.room.name}")
    session = AgentSession(
        stt=stt,
        llm=llm,
        tts=openai.TTS(), # Use OpenAI TTS
        vad=silero.VAD.load(),
    )
    await session.start(
        room=ctx.room,
        agent=Assistant(),
    )
    await session.generate_reply(
        instructions="Greet the user and offer your assistance."
    )

if __name__ == "__main__":
    # Generate and display a token for testing in the Agents Playground
    if not LIVEKIT_API_KEY or not LIVEKIT_API_SECRET:
        print("Missing LIVEKIT_API_KEY or LIVEKIT_API_SECRET in .env file")
        print("Get these from your LiveKit Cloud dashboard: https://cloud.livekit.io/")
    else:
        token = api.AccessToken(LIVEKIT_API_KEY, LIVEKIT_API_SECRET) \
            .with_identity("test_user") \
            .with_grants(api.VideoGrants(
                room_join=True,
                room="test_room",
            ))
        jwt_token = token.to_jwt()

        BLUE, RESET = "\033[94m", "\033[0m"
        print(f"\n{BLUE}LiveKit Agent Ready to Connect!{RESET}")
        print(f"{BLUE}{'=' * 50}{RESET}")
        print(f"{BLUE}Connect at: https://agents-playground.livekit.io/{RESET}")
        print(f"{BLUE}URL: {LIVEKIT_URL}{RESET}")
        print(f"{BLUE}Token: {jwt_token}{RESET}")
        print(f"{BLUE}{'=' * 50}{RESET}")

    agents.cli.run_app(agents.WorkerOptions(entrypoint_fnc=entrypoint))
This example uses OpenAI’s Whisper for speech-to-text and OpenAI for text-to-speech, with Cerebras’s llama-3.3-70b providing the ultra-fast intelligence.
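A LiveKit access token is a standard JWT, so you can decode its (unverified) payload to sanity-check the room grant before connecting. This inspection helper is our own sketch, not part of the LiveKit SDK:

```python
import base64
import json

def jwt_payload(token: str) -> dict:
    """Decode the middle (payload) segment of a JWT without verifying it."""
    payload_b64 = token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(payload_b64))

# Example with a hand-built token body (not a real signature):
fake = "e30." + base64.urlsafe_b64encode(
    json.dumps({"video": {"room": "test_room", "roomJoin": True}}).encode()
).decode().rstrip("=") + ".sig"

print(jwt_payload(fake)["video"]["room"])  # prints "test_room"
```

Pasting the token your script prints into this helper confirms the room name and grants match what you expect.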
5

Run and test your voice agent

Start your voice agent with the LiveKit CLI. The agent will connect to your LiveKit room and wait for a user to join.
python voice_agent.py dev
To test your agent:
  1. Go to the LiveKit Agents Playground
  2. If authenticated, you’ll see available rooms and can join directly. Otherwise, use manual connection by entering the URL and token from your terminal (displayed in blue when you run the agent)
  3. Paste the token generated when running python voice_agent.py dev
  4. Click Connect
  5. Approve microphone access when Chrome prompts you (required for voice interaction)
  6. Speak into your microphone - the agent should respond!

Example Use Cases

Combining LiveKit with Cerebras enables powerful real-time AI applications:
  • Multimodal Assistants - Support text, voice, and screen sharing with an AI assistant that responds instantly.
  • Telehealth - Enable real-time AI support during virtual medical consultations with HIPAA-compliant infrastructure.
  • Call Centers - Automate inbound and outbound customer support with AI voice agents that handle multiple conversations simultaneously.
  • Real-time Translation - Translate conversations instantly across languages with minimal latency.
  • Interactive Education - Create voice-enabled tutoring systems that provide immediate feedback.
  • Voice Commerce - Build conversational shopping experiences with natural voice interactions.

Advanced Configuration

Using Different Cerebras Models

You can easily swap models based on your needs. Choose faster models for lower latency or more capable models for complex reasoning tasks.
import os
from livekit.plugins import openai

CEREBRAS_URL = "https://api.cerebras.ai/v1"
CEREBRAS_API_KEY = os.getenv("CEREBRAS_API_KEY")

# For faster responses with smaller model
llm = openai.LLM(
    model="llama3.1-8b",
    api_key=CEREBRAS_API_KEY,
    base_url=CEREBRAS_URL
)
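If you want to switch automatically rather than edit the model name by hand, a tiny selection helper works. This is a sketch of our own; the speed/capability trade-off follows the guidance above:

```python
# Hypothetical helper: pick a Cerebras model name by priority.
MODELS = {
    "speed": "llama3.1-8b",        # lower latency, smaller model
    "capability": "llama-3.3-70b",  # stronger reasoning, higher latency
}

def pick_model(priority: str = "capability") -> str:
    if priority not in MODELS:
        raise ValueError(f"unknown priority: {priority}")
    return MODELS[priority]

print(pick_model("speed"))  # prints "llama3.1-8b"
```

The returned name can be passed straight to openai.LLM(model=...) in the snippet above.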

Troubleshooting

  • Check your microphone permissions - Ensure your browser or application has access to your microphone.
  • Verify VAD settings - The Silero VAD may need tuning for your audio environment. Try adjusting min_speech_duration and min_silence_duration parameters.
  • Test STT independently - Make a direct API call to OpenAI Whisper to verify your audio is being transcribed correctly.
  • Use a smaller model - Try llama3.1-8b instead of llama-3.3-70b for faster responses. The 8B model typically responds 2-3x faster.
  • Check network connectivity - Ensure stable connections to both LiveKit and Cerebras endpoints. Use ping and traceroute to diagnose network issues.
  • Optimize instructions - Shorter, more focused system instructions lead to faster generation. Aim for instructions under 200 words.
  • Monitor token usage - Longer conversations accumulate context. Consider implementing context window management to keep prompts concise.
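The context-window suggestion above can be as simple as keeping the system prompt plus the last few turns. A minimal sketch; the `trim_history` helper and message shape are our assumptions, not a LiveKit API:

```python
def trim_history(messages: list, keep_last: int = 6) -> list:
    """Keep system messages plus only the keep_last most recent turns."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    return system + rest[-keep_last:]

history = [{"role": "system", "content": "Be brief."}] + [
    {"role": "user", "content": f"turn {i}"} for i in range(10)
]
print(len(trim_history(history)))  # prints 7: one system message + last 6 turns
```

Applying something like this before each LLM call keeps prompt size (and therefore latency) roughly constant over long conversations.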
  • Verify API keys - Double-check that your CEREBRAS_API_KEY and LiveKit credentials are correct and not expired.
  • Check base URL - Ensure you’re using https://api.cerebras.ai/v1 for the Cerebras endpoint (note the /v1 suffix).
  • Review firewall settings - LiveKit requires WebRTC connections, which may be blocked by some firewalls. Ensure UDP ports 50000-60000 are open.
  • Test connectivity - Verify you can reach both services:
curl -H "Authorization: Bearer $CEREBRAS_API_KEY" https://api.cerebras.ai/v1/models
  • Enable noise cancellation - Configure noise cancellation in your RoomInputOptions if needed for noisy environments.
  • Check sample rates - Ensure your audio input matches the expected sample rate for Whisper (16kHz). Mismatched sample rates can cause quality degradation.
  • Test TTS providers - Try a different TTS provider if your current one isn’t meeting your quality needs. LiveKit supports multiple TTS engines, including ElevenLabs, Cartesia, and Deepgram.
  • Monitor bandwidth - Poor audio quality can result from insufficient bandwidth. LiveKit automatically adjusts quality, but ensure you have at least 1 Mbps available.
Verify Python version - LiveKit agents require Python 3.11 - 3.13 (below 3.14, as noted in the prerequisites). Check your version:
python --version
Use pyenv for version management - If you need multiple Python versions:
pyenv install 3.11.5
pyenv local 3.11.5
Check async compatibility - Ensure you’re using async/await syntax correctly. LiveKit agents are fully asynchronous.
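As a minimal illustration of that requirement, entrypoint-style functions are coroutines, and any slow work inside them should be awaited rather than run blocking. This toy example is our own, with asyncio.sleep standing in for real awaited I/O such as an LLM or STT call:

```python
import asyncio

async def fetch_greeting() -> str:
    await asyncio.sleep(0)  # stand-in for awaited I/O (network, LLM, STT)
    return "hello"

async def main() -> None:
    # Awaiting keeps the event loop free for other agent work.
    print(await fetch_greeting())  # prints "hello"

asyncio.run(main())
```

Calling a coroutine without await (or doing blocking I/O inside one) is the most common source of silent hangs in async agent code.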
Remember to never commit your API keys to version control. Always use environment variables or secure secret management systems like AWS Secrets Manager or HashiCorp Vault.

Next Steps

Now that you have a working voice agent, explore these advanced features:
  • Model Selection - Try different Cerebras models to optimize for speed vs. capability based on your use case.
  • Custom Frontend - Build a custom client using the LiveKit client SDKs for web, iOS, or Android.
  • Production Deployment - Deploy your agent to production using LiveKit Cloud or self-hosted infrastructure.
  • Monitoring and Analytics - Implement logging and monitoring to track agent performance and user interactions.
  • See how to use the latest GLM4.6 with Cerebras - GLM4.6 migration guide

Additional Resources