What is Maxim?

Maxim is an AI observability and evaluation platform that helps teams monitor, trace, and improve their LLM applications in production. With Maxim, you can track model performance, debug issues, and gain insights into your AI workflows. Learn more at https://www.getmaxim.ai/

Prerequisites

Before you begin, ensure you have:
  • Cerebras API Key - Get a free API key at cloud.cerebras.ai.
  • Maxim Account - Visit Maxim and create an account or log in.
    • Go to Settings to generate your Maxim API key.
  • Python 3.11 or higher
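
The interpreter requirement can be checked up front. A minimal sketch (the helper name is illustrative, not part of any SDK):

```python
import sys

def meets_minimum(version=sys.version_info, minimum=(3, 11)):
    """True when the given (major, minor, ...) version satisfies the minimum."""
    return tuple(version[:2]) >= minimum

print(meets_minimum((3, 12, 0)))  # True
print(meets_minimum((3, 9, 7)))   # False
```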

Configure Maxim

Step 1: Install required dependencies

Run the following:
pip install maxim-py openai python-dotenv
Step 2: Configure environment variables

Create a .env file in your project directory:
CEREBRAS_API_KEY=your-cerebras-api-key-here
MAXIM_API_KEY=your-maxim-api-key-here
MAXIM_LOG_REPO_ID=your-maxim-log-repo-id-here
You can find your MAXIM_LOG_REPO_ID in your Maxim dashboard under Settings > Log Repositories.
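
Before wiring up the clients, it can help to confirm that all three variables actually resolve after load_dotenv() runs. A minimal sketch, assuming the variable names above (the missing_vars helper is illustrative, not part of maxim-py):

```python
REQUIRED_VARS = ["CEREBRAS_API_KEY", "MAXIM_API_KEY", "MAXIM_LOG_REPO_ID"]

def missing_vars(env: dict) -> list:
    """Return the required variables that are absent or empty in env.

    In practice you would pass os.environ after calling load_dotenv().
    """
    return [name for name in REQUIRED_VARS if not env.get(name)]

# With all three set, nothing is reported missing
example = {
    "CEREBRAS_API_KEY": "csk-...",
    "MAXIM_API_KEY": "mx-...",
    "MAXIM_LOG_REPO_ID": "repo-123",
}
print(missing_vars(example))                     # []
print(missing_vars({"MAXIM_API_KEY": "mx-..."}))  # ['CEREBRAS_API_KEY', 'MAXIM_LOG_REPO_ID']
```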
Step 3: Initialize Maxim and instrument your client

Set up the Maxim logger to automatically track all API calls to Cerebras:
import os
from dotenv import load_dotenv
from openai import OpenAI
from maxim import Maxim
from maxim.logger.openai import MaximOpenAIClient

# Load environment variables
load_dotenv()

# Initialize Maxim
maxim = Maxim({"api_key": os.getenv("MAXIM_API_KEY")})
logger = maxim.logger({"id": os.getenv("MAXIM_LOG_REPO_ID")})

# Create Cerebras OpenAI client
cerebras_client = OpenAI(
    api_key=os.getenv("CEREBRAS_API_KEY"),
    base_url="https://api.cerebras.ai/v1",
    default_headers={
        "X-Cerebras-3rd-Party-Integration": "MaximAI"
    }
)

# Wrap with Maxim for automatic tracing
client = MaximOpenAIClient(cerebras_client, logger=logger)

Step 4: Make your first traced request

Make API calls to Cerebras as usual. Maxim automatically captures the full request and response, along with token usage and latency:
import os
from dotenv import load_dotenv
from openai import OpenAI
from maxim import Maxim
from maxim.logger.openai import MaximOpenAIClient

load_dotenv()

# Initialize Maxim
maxim = Maxim({"api_key": os.getenv("MAXIM_API_KEY")})
logger = maxim.logger({"id": os.getenv("MAXIM_LOG_REPO_ID")})

# Create Cerebras client
cerebras_client = OpenAI(
    api_key=os.getenv("CEREBRAS_API_KEY"),
    base_url="https://api.cerebras.ai/v1",
    default_headers={
        "X-Cerebras-3rd-Party-Integration": "MaximAI"
    }
)

# Wrap with Maxim
client = MaximOpenAIClient(cerebras_client, logger=logger)

# Make traced request
response = client.chat.completions.create(
    model="gpt-oss-120b",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing in simple terms."},
    ],
    temperature=0.7,
    max_tokens=500,
)

print(response.choices[0].message.content)

Step 5: View traces in your dashboard

Log in to your Maxim dashboard to see all logged requests. You can filter by model, date range, or custom metadata, and view detailed information including conversation history, token usage, latency metrics, and model parameters.

Advanced Usage

Custom Metadata with Headers

You can add custom metadata to your traces using extra headers. This helps organize and filter your logs in Maxim by user, session, or feature.
import os
from dotenv import load_dotenv
from openai import OpenAI
from maxim import Maxim
from maxim.logger.openai import MaximOpenAIClient

load_dotenv()

# Initialize Maxim
maxim = Maxim({"api_key": os.getenv("MAXIM_API_KEY")})
logger = maxim.logger({"id": os.getenv("MAXIM_LOG_REPO_ID")})

# Create and wrap Cerebras client
cerebras_client = OpenAI(
    api_key=os.getenv("CEREBRAS_API_KEY"),
    base_url="https://api.cerebras.ai/v1",
    default_headers={"X-Cerebras-3rd-Party-Integration": "MaximAI"}
)
client = MaximOpenAIClient(cerebras_client, logger=logger)

# Add custom metadata via headers
response = client.chat.completions.create(
    model="gpt-oss-120b",
    messages=[
        {"role": "user", "content": "How do I reset my password?"},
    ],
    extra_headers={
        "x-maxim-session-id": "session_456",
        "x-maxim-generation-name": "password_reset_help"
    }
)
print(response.choices[0].message.content)

Streaming Responses

Maxim supports tracing streaming responses from Cerebras. The instrumentation automatically handles streaming data and captures the complete response.
import os
from dotenv import load_dotenv
from openai import OpenAI
from maxim import Maxim
from maxim.logger.openai import MaximOpenAIClient

load_dotenv()

# Initialize Maxim
maxim = Maxim({"api_key": os.getenv("MAXIM_API_KEY")})
logger = maxim.logger({"id": os.getenv("MAXIM_LOG_REPO_ID")})

# Create and wrap Cerebras client
cerebras_client = OpenAI(
    api_key=os.getenv("CEREBRAS_API_KEY"),
    base_url="https://api.cerebras.ai/v1",
    default_headers={"X-Cerebras-3rd-Party-Integration": "MaximAI"}
)
client = MaximOpenAIClient(cerebras_client, logger=logger)

# Stream response
response = client.chat.completions.create(
    model="gpt-oss-120b",
    messages=[
        {"role": "user", "content": "Write a short story about a robot."},
    ],
    stream=True,
)

for chunk in response:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
print()

Multi-Model Workflows

Track multiple LLM calls in sequence. Each call is automatically traced in your Maxim dashboard:
import os
from dotenv import load_dotenv
from openai import OpenAI
from maxim import Maxim
from maxim.logger.openai import MaximOpenAIClient

load_dotenv()

# Initialize Maxim
maxim = Maxim({"api_key": os.getenv("MAXIM_API_KEY")})
logger = maxim.logger({"id": os.getenv("MAXIM_LOG_REPO_ID")})

# Create and wrap Cerebras client
cerebras_client = OpenAI(
    api_key=os.getenv("CEREBRAS_API_KEY"),
    base_url="https://api.cerebras.ai/v1",
    default_headers={"X-Cerebras-3rd-Party-Integration": "MaximAI"}
)
client = MaximOpenAIClient(cerebras_client, logger=logger)

def research_workflow(query: str) -> str:
    """Multi-step workflow with multiple LLM calls."""
    
    # Step 1: Generate search queries
    search_response = client.chat.completions.create(
        model="gpt-oss-120b",
        messages=[
            {"role": "system", "content": "Generate 3 search queries for research."},
            {"role": "user", "content": query},
        ],
        extra_headers={"x-maxim-generation-name": "generate_queries"}
    )
    queries = search_response.choices[0].message.content
    
    # Step 2: Synthesize results
    synthesis_response = client.chat.completions.create(
        model="qwen-3-32b",
        messages=[
            {"role": "system", "content": "Synthesize research findings."},
            {"role": "user", "content": f"Queries: {queries}\n\nProvide a summary."},
        ],
        extra_headers={"x-maxim-generation-name": "synthesize_results"}
    )
    return synthesis_response.choices[0].message.content

result = research_workflow("What are the latest developments in quantum computing?")
print(result)

Monitoring and Alerts

Maxim provides powerful monitoring capabilities to help you track your application’s performance and set up alerts for issues or anomalies.

View Traces in Dashboard

  1. Log in to your Maxim dashboard
  2. Navigate to the Traces section to see all your logged requests
  3. Filter by model, date range, or custom metadata
  4. Click on individual traces to see detailed information including:
    • Full conversation history
    • Token usage and costs
    • Latency metrics
    • Model parameters
    • Custom metadata

Set Up Alerts

You can configure alerts to notify you of issues or anomalies. Visit the Maxim alerts documentation to learn how to:
  • Create alerts for high latency or error rates
  • Monitor token usage and costs
  • Track model performance metrics
  • Get notified via email, Slack, or webhooks

Evaluations and Testing

Maxim provides built-in evaluation capabilities through its dashboard to help you measure and improve your LLM application’s quality. All traced requests are automatically available for evaluation. To set up evaluations:
  1. Log in to your Maxim dashboard
  2. Navigate to Evaluations to create custom evaluators
  3. Define metrics like accuracy, relevance, and coherence
  4. Run evaluations on your traced requests
  5. View results and insights in the dashboard
Learn more about evaluations and testing in the Maxim documentation.

Troubleshooting

If your traces aren’t showing up in the Maxim dashboard:
  • Verify your MAXIM_API_KEY is correct and has the necessary permissions
  • Check that you wrapped your OpenAI client with MaximOpenAIClient before making API requests
  • Ensure your network allows outbound connections to Maxim’s API
  • Look for error messages in your application logs
  • Verify you’re using the latest version of maxim-py (run pip install --upgrade maxim-py)
If you’re seeing authentication errors:
  • Double-check that both CEREBRAS_API_KEY and MAXIM_API_KEY are set correctly in your .env file
  • Ensure you’ve called load_dotenv() before initializing the clients
  • Verify your API keys haven’t expired or been revoked
  • Check that your Cerebras API key is active at cloud.cerebras.ai
If streaming responses aren’t being captured:
  • Make sure you’re using the latest version of maxim-py (run pip install --upgrade maxim-py)
  • Verify that the instrumentation is applied before creating streaming requests
  • Check that you’re iterating through all chunks in the stream
  • Ensure you’re not catching and suppressing exceptions during streaming
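
The last two points matter because instrumentation typically finalizes a streamed trace only once the stream is exhausted. A minimal sketch of draining a stream to completion (the fake chunk list stands in for the real delta.content values from the API):

```python
def drain_stream(chunks):
    """Consume every chunk and return the assembled text.

    Iterating to exhaustion is the important part: breaking out of the
    loop early can leave the trace incomplete.
    """
    parts = []
    for chunk in chunks:
        if chunk:  # skip empty/None deltas (role-only or final chunks)
            parts.append(chunk)
    return "".join(parts)

# Stand-in for the delta.content values of a streamed response
fake_stream = iter(["Once ", "upon ", None, "a time."])
print(drain_stream(fake_stream))  # Once upon a time.
```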
If you notice higher latency than expected:
  • The instrumentation adds minimal overhead (typically less than 10ms)
  • Check your network connection to both Cerebras and Maxim APIs
  • Review your Maxim dashboard to identify if the latency is from the LLM call or logging
  • Consider using async logging if you need to minimize impact on response times
  • Verify you’re using the closest Cerebras region for optimal performance
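
To localize the slowdown yourself, time the wrapped call and compare it against the latency shown in the Maxim dashboard. A minimal sketch (fake_call stands in for a real client.chat.completions.create call):

```python
import time

def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed_seconds).

    Comparing this wall-clock figure with the dashboard's latency metric
    helps separate model latency from any logging overhead.
    """
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

# Stand-in for client.chat.completions.create(...)
def fake_call():
    time.sleep(0.01)
    return "ok"

result, elapsed = timed(fake_call)
print(f"{elapsed * 1000:.1f} ms")  # roughly 10 ms here
```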
If custom metadata isn’t showing up in your traces:
  • Ensure you’re attaching metadata via extra_headers as shown above (or, if you use the logger’s log() context manager, that it is used correctly)
  • Verify that metadata is passed as a dictionary with string keys
  • Check that you’re not exceeding metadata size limits (typically 10KB per trace)
  • Make sure the metadata is added before the API call is made
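
The key and size constraints above can be checked before sending. A minimal sketch (the 10KB figure is the typical limit quoted above; validate_metadata is illustrative, not part of maxim-py):

```python
import json

MAX_METADATA_BYTES = 10 * 1024  # typical 10KB-per-trace limit (assumed)

def validate_metadata(metadata: dict) -> list:
    """Return a list of problems with a metadata dict: non-string keys
    or a serialized payload over the size limit."""
    problems = []
    if not all(isinstance(k, str) for k in metadata):
        problems.append("all keys must be strings")
    if len(json.dumps(metadata).encode("utf-8")) > MAX_METADATA_BYTES:
        problems.append("serialized metadata exceeds 10KB")
    return problems

print(validate_metadata({"x-maxim-session-id": "session_456"}))  # []
print(validate_metadata({42: "oops"}))  # ['all keys must be strings']
```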

Next Steps

For additional support, visit the Maxim support page or contact their team directly.