Hume AI is an empathic AI platform for building emotionally intelligent voice agents. Its Empathic Voice Interface (EVI) combines speech recognition, emotion detection, and natural language understanding to create voice experiences that understand and respond to human emotions in real time. By integrating Cerebras’s ultra-fast inference with Hume AI’s emotional intelligence capabilities, you can build voice agents that are both lightning-fast and emotionally aware. Learn more at Hume AI.
The hume[microphone] package provides the official Hume SDK; the [microphone] extra pulls in the dependencies needed to record and play audio.
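Install it with pip (quote the extra so your shell doesn't expand the brackets):

pip install "hume[microphone]"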
2. Configure environment variables
Create a .env file in your project directory with your API credentials:
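A minimal .env might look like the following. CEREBRAS_API_KEY is the name used in the code samples below; the Hume variable name here is illustrative, so match whatever your application actually reads:

CEREBRAS_API_KEY=your-cerebras-api-key
HUME_API_KEY=your-hume-api-key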
You can find your credentials in the Hume AI dashboard under API Keys.
3. Initialize the Cerebras client
Set up the Cerebras client to handle language model inference. This client will process text-based interactions while Hume AI handles voice and emotion detection:
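import os
from openai import OpenAI
from dotenv import load_dotenv

load_dotenv()

# Initialize Cerebras client (OpenAI-compatible endpoint)
cerebras_client = OpenAI(
    api_key=os.getenv("CEREBRAS_API_KEY"),
    base_url="https://api.cerebras.ai/v1",
    default_headers={
        "X-Cerebras-3rd-Party-Integration": "Hume AI"
    }
)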
The X-Cerebras-3rd-Party-Integration header helps Cerebras track integration usage and provide better support.
4. Create a basic empathic response generator
Build a function that generates emotionally aware responses using Cerebras. This example shows how to incorporate emotional context into your prompts:
import os
from openai import OpenAI
from dotenv import load_dotenv

load_dotenv()

# Initialize Cerebras client
cerebras_client = OpenAI(
    api_key=os.getenv("CEREBRAS_API_KEY"),
    base_url="https://api.cerebras.ai/v1",
    default_headers={
        "X-Cerebras-3rd-Party-Integration": "Hume AI"
    }
)

def generate_empathic_response(user_message, emotions):
    """Generate an emotionally aware response using Cerebras."""
    # Build context-aware prompt with emotional information
    emotion_context = ", ".join([f"{e['name']}: {e['score']:.2f}" for e in emotions[:3]])
    system_prompt = f"""You are an empathic AI assistant. The user's current emotional state shows: {emotion_context}. Respond with appropriate empathy and understanding."""

    # Generate response using Cerebras
    response = cerebras_client.chat.completions.create(
        model="llama-3.3-70b",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_message}
        ],
        temperature=0.7,
        max_tokens=150
    )
    return response.choices[0].message.content

# Example usage
emotions = [
    {"name": "joy", "score": 0.75},
    {"name": "excitement", "score": 0.62},
    {"name": "curiosity", "score": 0.48}
]
response = generate_empathic_response(
    "I just got accepted to my dream university!",
    emotions
)
print(response)
This function takes user input and detected emotions, then generates contextually appropriate responses that acknowledge the user’s emotional state.
5. Process emotional context in conversations
Use detected emotions to generate contextually appropriate responses. This example shows how to integrate emotional intelligence into your Cerebras-powered applications:
import os
from openai import OpenAI
from dotenv import load_dotenv

load_dotenv()

# Initialize Cerebras client
cerebras_client = OpenAI(
    api_key=os.getenv("CEREBRAS_API_KEY"),
    base_url="https://api.cerebras.ai/v1",
    default_headers={
        "X-Cerebras-3rd-Party-Integration": "Hume AI"
    }
)

def process_with_emotion_context(user_message, emotions):
    """Generate responses that acknowledge emotional context."""
    # Build emotion-aware system prompt
    emotion_context = ", ".join([f"{e['name']}: {e['score']:.2f}" for e in emotions[:3]])
    system_prompt = f"""You are an empathic AI assistant. The user's current emotional state shows: {emotion_context}. Respond with appropriate empathy and understanding."""

    # Generate response using Cerebras
    response = cerebras_client.chat.completions.create(
        model="llama-3.3-70b",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_message}
        ],
        temperature=0.7,
        max_tokens=150
    )
    return response.choices[0].message.content

# Example: Process a message with detected emotions
emotions = [
    {"name": "frustration", "score": 0.68},
    {"name": "confusion", "score": 0.52},
    {"name": "determination", "score": 0.45}
]
response = process_with_emotion_context(
    "I've been trying to fix this bug for hours and nothing works.",
    emotions
)
print(response)
For real-time voice interactions with Hume AI’s EVI (Empathic Voice Interface), refer to the Hume AI EVI documentation. Voice features require WebSocket connections and are best suited for interactive applications.
6. Implement streaming responses
Use streaming to reduce latency when generating empathic responses. This provides immediate feedback to users:
import os
from openai import OpenAI
from dotenv import load_dotenv

load_dotenv()

# Initialize Cerebras client
cerebras_client = OpenAI(
    api_key=os.getenv("CEREBRAS_API_KEY"),
    base_url="https://api.cerebras.ai/v1",
    default_headers={
        "X-Cerebras-3rd-Party-Integration": "Hume AI"
    }
)

def stream_empathic_response(user_message, emotions):
    """Stream responses with emotional context for lower latency."""
    emotion_context = ", ".join([f"{e['name']}: {e['score']:.2f}" for e in emotions[:3]])
    system_prompt = f"""You are an empathic AI assistant. The user's current emotional state shows: {emotion_context}. Respond with appropriate empathy and understanding."""

    # Stream response from Cerebras
    stream = cerebras_client.chat.completions.create(
        model="llama-3.3-70b",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_message}
        ],
        temperature=0.7,
        max_tokens=150,
        stream=True
    )

    full_response = ""
    for chunk in stream:
        if chunk.choices[0].delta.content:
            content = chunk.choices[0].delta.content
            full_response += content
            print(content, end="", flush=True)
    print()  # New line after streaming
    return full_response

# Example usage
emotions = [
    {"name": "anxiety", "score": 0.72},
    {"name": "hope", "score": 0.58}
]
response = stream_empathic_response(
    "I'm nervous about my presentation tomorrow.",
    emotions
)
Streaming significantly improves the user experience by providing immediate feedback while the full response is still being generated.
Leverage Hume AI’s emotion detection to create more empathic and contextually appropriate responses:
import os
from openai import OpenAI
from dotenv import load_dotenv

load_dotenv()

# Initialize Cerebras client
cerebras_client = OpenAI(
    api_key=os.getenv("CEREBRAS_API_KEY"),
    base_url="https://api.cerebras.ai/v1",
    default_headers={
        "X-Cerebras-3rd-Party-Integration": "Hume AI"
    }
)

def build_emotion_aware_prompt(user_message, emotions):
    """Create prompts that incorporate emotional context."""
    # Get top three detected emotions
    top_emotions = sorted(emotions, key=lambda x: x['score'], reverse=True)[:3]
    emotion_str = ", ".join([f"{e['name']} ({e['score']:.0%})" for e in top_emotions])

    # Adjust tone based on dominant emotion
    if top_emotions[0]['name'] in ['sadness', 'distress', 'anxiety']:
        tone = "compassionate and supportive"
    elif top_emotions[0]['name'] in ['joy', 'excitement', 'amusement']:
        tone = "enthusiastic and celebratory"
    else:
        tone = "warm and understanding"

    system_prompt = f"""You are an empathic AI assistant. The user is currently expressing these emotions: {emotion_str}. Respond in a {tone} manner, acknowledging their emotional state appropriately."""
    return system_prompt
Build robust error handling for production voice agents:
import os
from openai import OpenAI
from dotenv import load_dotenv

load_dotenv()

# Initialize Cerebras client
cerebras_client = OpenAI(
    api_key=os.getenv("CEREBRAS_API_KEY"),
    base_url="https://api.cerebras.ai/v1",
    default_headers={
        "X-Cerebras-3rd-Party-Integration": "Hume AI"
    }
)

def robust_generate_response(user_message, emotions, max_retries=3):
    """Generate response with automatic retry and fallback logic."""
    # Build emotion-aware prompt
    emotion_context = ", ".join([f"{e['name']}: {e['score']:.2f}" for e in emotions[:3]])
    system_prompt = f"""You are an empathic AI assistant. The user's current emotional state shows: {emotion_context}. Respond with appropriate empathy and understanding."""

    for attempt in range(max_retries):
        try:
            response = cerebras_client.chat.completions.create(
                model="llama-3.3-70b",
                messages=[
                    {"role": "system", "content": system_prompt},
                    {"role": "user", "content": user_message}
                ],
                timeout=10.0
            )
            return response.choices[0].message.content
        except Exception as e:
            print(f"Attempt {attempt + 1} failed: {e}")
            if attempt == max_retries - 1:
                # Final fallback to simpler model
                print("Falling back to llama3.1-8b")
                response = cerebras_client.chat.completions.create(
                    model="llama3.1-8b",
                    messages=[
                        {"role": "user", "content": user_message}
                    ]
                )
                return response.choices[0].message.content

# Example usage
emotions = [
    {"name": "frustration", "score": 0.65},
    {"name": "determination", "score": 0.58}
]
response = robust_generate_response(
    "I need help understanding this concept.",
    emotions
)
print(response)
Hume AI’s EVI API uses WebSocket connections for real-time voice interactions. Make sure your network environment supports WebSocket connections and consider implementing reconnection logic for production applications.
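If you manage the connection yourself, a reconnection loop with exponential backoff might look like the sketch below. It uses the third-party websockets package, and the URL and message handler are placeholders, not Hume's actual endpoint or protocol:

import asyncio
import websockets  # third-party package: pip install websockets

async def connect_with_reconnect(url, handle_message, max_backoff=30):
    """Keep a WebSocket session alive, reconnecting with exponential backoff."""
    backoff = 1
    while True:
        try:
            async with websockets.connect(url) as ws:
                backoff = 1  # Reset after a successful connection
                async for message in ws:
                    handle_message(message)
        except (websockets.ConnectionClosed, OSError) as e:
            print(f"Connection lost ({e}); retrying in {backoff}s")
            await asyncio.sleep(backoff)
            backoff = min(backoff * 2, max_backoff)

# Example (placeholder URL, not Hume's real endpoint):
# asyncio.run(connect_with_reconnect("wss://example.com/evi", print))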
Which Cerebras model should I use for voice applications?
For real-time voice interactions, we recommend:
cerebras/llama3.1-8b - Best for ultra-low latency when speed is critical
cerebras/llama-3.3-70b - Best balance of quality and speed for most applications
cerebras/qwen-3-32b - Good alternative with strong multilingual support
Start with llama-3.3-70b and switch to llama3.1-8b if you need faster responses.
How do I handle rate limits?
Implement these strategies to manage rate limits:
Use exponential backoff with retry logic (see the error handling example above and the backoff sketch after this answer)
Cache common responses to reduce API calls
Implement request queuing for high-traffic scenarios
Monitor your usage and upgrade your plan if needed
Both Cerebras and Hume AI offer higher rate limits on paid plans.
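As an illustration of the backoff strategy, here is a sketch that retries a Cerebras call on rate-limit errors. It assumes the cerebras_client from step 3; openai.RateLimitError is the exception the OpenAI SDK raises on HTTP 429 responses:

import time
import openai

def create_with_backoff(messages, max_retries=5):
    """Retry a chat completion with exponential backoff on rate limits."""
    delay = 1.0
    for attempt in range(max_retries):
        try:
            return cerebras_client.chat.completions.create(
                model="llama-3.3-70b",
                messages=messages
            )
        except openai.RateLimitError:
            if attempt == max_retries - 1:
                raise  # Give up after the final attempt
            time.sleep(delay)
            delay *= 2  # Double the wait before the next try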
Can I use this integration for multiple languages?
Yes! Cerebras models like qwen-3-32b and llama-3.3-70b support multiple languages. Hume AI also provides multilingual emotion detection and text-to-speech capabilities. Check the Hume AI documentation for supported languages and features.
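A quick way to try this is to send a non-English message through the client from step 3. The model ID below drops the cerebras/ prefix to match the other code samples; adjust it if your account lists the model differently:

# Spanish-language request; reuses cerebras_client from step 3
response = cerebras_client.chat.completions.create(
    model="qwen-3-32b",
    messages=[{"role": "user", "content": "¿Puedes ayudarme a preparar mi presentación de mañana?"}]
)
print(response.choices[0].message.content)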
How do I improve emotion detection accuracy?
To get the best emotion detection results:
Use a quality microphone with minimal background noise
Ensure clear audio input with good signal-to-noise ratio
Allow sufficient speech samples (at least 2-3 seconds) for accurate analysis
Test with different voice settings to find optimal configuration
How do I manage API usage in production?
Implement request queuing and retry logic with exponential backoff
Cache responses for common queries (see the sketch after this list)
Monitor your usage in the Cerebras and Hume AI dashboards
Consider upgrading your plan for higher limits
Use streaming responses so users see output immediately, reducing abandoned and repeated requests
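As a sketch of the caching idea above, here is a minimal in-memory cache keyed on the message text and dominant emotion. It wraps the generate_empathic_response function from step 4 and is illustrative rather than production-ready:

_response_cache = {}

def cached_empathic_response(user_message, emotions):
    """Serve repeated queries from memory to avoid redundant API calls."""
    dominant = emotions[0]["name"] if emotions else "neutral"
    key = (user_message, dominant)
    if key not in _response_cache:
        _response_cache[key] = generate_empathic_response(user_message, emotions)
    return _response_cache[key]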
When building production voice agents, always implement proper error handling and graceful degradation. Voice interactions should continue even if one service experiences issues. Consider implementing fallback responses and offline capabilities.
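One simple form of graceful degradation is a canned reply when every attempt fails; a sketch, wrapping the robust_generate_response function from the error handling example:

FALLBACK_REPLY = (
    "I'm having trouble connecting right now, but I'm still listening. "
    "Could you repeat that in a moment?"
)

def respond_or_degrade(user_message, emotions):
    """Fall back to a canned empathic reply if all API attempts fail."""
    try:
        reply = robust_generate_response(user_message, emotions)
        return reply or FALLBACK_REPLY
    except Exception:
        return FALLBACK_REPLY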