What is Maxim?

Maxim is an AI observability and evaluation platform that helps teams monitor, trace, and improve their LLM applications in production. With Maxim, you can track model performance, debug issues, and gain insights into your AI workflows. Learn more at https://www.getmaxim.ai/

Prerequisites

Before you begin, ensure you have:
  • Cerebras API Key - Get a free API key at cloud.cerebras.ai.
  • Maxim Account - Visit Maxim and create an account or log in.
    • Go to Settings to generate your Maxim API key.
  • Python 3.11 or higher
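
The interpreter requirement can be checked up front. A minimal sketch (the helper name is illustrative, not part of any SDK):

```python
import sys

def meets_minimum(version=sys.version_info, minimum=(3, 11)):
    """True when the given (major, minor, ...) version satisfies the minimum."""
    return tuple(version[:2]) >= minimum

print(meets_minimum((3, 12, 0)))  # True
print(meets_minimum((3, 9, 7)))   # False
```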

Configure Maxim

Step 1: Install required dependencies

Run the following:
pip install maxim-py openai python-dotenv
Step 2: Configure environment variables

Create a .env file in your project directory:
CEREBRAS_API_KEY=your-cerebras-api-key-here
MAXIM_API_KEY=your-maxim-api-key-here
MAXIM_LOG_REPO_ID=your-maxim-log-repo-id-here
You can find your MAXIM_LOG_REPO_ID in your Maxim dashboard under Settings > Log Repositories.
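
Before wiring up the clients, it can help to confirm that all three variables actually resolve after load_dotenv() runs. A minimal sketch, assuming the variable names above (the missing_vars helper is illustrative, not part of maxim-py):

```python
REQUIRED_VARS = ["CEREBRAS_API_KEY", "MAXIM_API_KEY", "MAXIM_LOG_REPO_ID"]

def missing_vars(env: dict) -> list:
    """Return the required variables that are absent or empty in env.

    In practice you would pass os.environ after calling load_dotenv().
    """
    return [name for name in REQUIRED_VARS if not env.get(name)]

# With all three set, nothing is reported missing
example = {
    "CEREBRAS_API_KEY": "csk-...",
    "MAXIM_API_KEY": "mx-...",
    "MAXIM_LOG_REPO_ID": "repo-123",
}
print(missing_vars(example))                     # []
print(missing_vars({"MAXIM_API_KEY": "mx-..."}))  # ['CEREBRAS_API_KEY', 'MAXIM_LOG_REPO_ID']
```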
Step 3: Initialize Maxim and instrument your client

Set up the Maxim logger to automatically track all API calls to Cerebras:
import os
from dotenv import load_dotenv
from openai import OpenAI
from maxim import Maxim
from maxim.logger.openai import MaximOpenAIClient

# Load environment variables
load_dotenv()

# Initialize Maxim
maxim = Maxim({"api_key": os.getenv("MAXIM_API_KEY")})
logger = maxim.logger({"id": os.getenv("MAXIM_LOG_REPO_ID")})

# Create Cerebras OpenAI client
cerebras_client = OpenAI(
    api_key=os.getenv("CEREBRAS_API_KEY"),
    base_url="https://api.cerebras.ai/v1",
    default_headers={
        "X-Cerebras-3rd-Party-Integration": "MaximAI"
    }
)

# Wrap with Maxim for automatic tracing
client = MaximOpenAIClient(cerebras_client, logger=logger)

Step 4: Make your first traced request

Make API calls to Cerebras as usual. Maxim automatically captures the full request and response, along with token usage and latency:
import os
from dotenv import load_dotenv
from openai import OpenAI
from maxim import Maxim
from maxim.logger.openai import MaximOpenAIClient

load_dotenv()

# Initialize Maxim
maxim = Maxim({"api_key": os.getenv("MAXIM_API_KEY")})
logger = maxim.logger({"id": os.getenv("MAXIM_LOG_REPO_ID")})

# Create Cerebras client
cerebras_client = OpenAI(
    api_key=os.getenv("CEREBRAS_API_KEY"),
    base_url="https://api.cerebras.ai/v1",
    default_headers={
        "X-Cerebras-3rd-Party-Integration": "MaximAI"
    }
)

# Wrap with Maxim
client = MaximOpenAIClient(cerebras_client, logger=logger)

# Make traced request
response = client.chat.completions.create(
    model="gpt-oss-120b",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing in simple terms."},
    ],
    temperature=0.7,
    max_tokens=500,
)

print(response.choices[0].message.content)

Step 5: View traces in your dashboard

Log in to your Maxim dashboard to see all logged requests. You can filter by model, date range, or custom metadata, and view detailed information including conversation history, token usage, latency metrics, and model parameters.

Advanced Usage

Custom Metadata with Headers

You can add custom metadata to your traces using extra headers. This helps organize and filter your logs in Maxim by user, session, or feature.
import os
from dotenv import load_dotenv
from openai import OpenAI
from maxim import Maxim
from maxim.logger.openai import MaximOpenAIClient

load_dotenv()

# Initialize Maxim
maxim = Maxim({"api_key": os.getenv("MAXIM_API_KEY")})
logger = maxim.logger({"id": os.getenv("MAXIM_LOG_REPO_ID")})

# Create and wrap Cerebras client
cerebras_client = OpenAI(
    api_key=os.getenv("CEREBRAS_API_KEY"),
    base_url="https://api.cerebras.ai/v1",
    default_headers={"X-Cerebras-3rd-Party-Integration": "MaximAI"}
)
client = MaximOpenAIClient(cerebras_client, logger=logger)

# Add custom metadata via headers
response = client.chat.completions.create(
    model="gpt-oss-120b",
    messages=[
        {"role": "user", "content": "How do I reset my password?"},
    ],
    extra_headers={
        "x-maxim-session-id": "session_456",
        "x-maxim-generation-name": "password_reset_help"
    }
)
print(response.choices[0].message.content)

Streaming Responses

Maxim supports tracing streaming responses from Cerebras. The instrumentation automatically handles streaming data and captures the complete response.
import os
from dotenv import load_dotenv
from openai import OpenAI
from maxim import Maxim
from maxim.logger.openai import MaximOpenAIClient

load_dotenv()

# Initialize Maxim
maxim = Maxim({"api_key": os.getenv("MAXIM_API_KEY")})
logger = maxim.logger({"id": os.getenv("MAXIM_LOG_REPO_ID")})

# Create and wrap Cerebras client
cerebras_client = OpenAI(
    api_key=os.getenv("CEREBRAS_API_KEY"),
    base_url="https://api.cerebras.ai/v1",
    default_headers={"X-Cerebras-3rd-Party-Integration": "MaximAI"}
)
client = MaximOpenAIClient(cerebras_client, logger=logger)

# Stream response
response = client.chat.completions.create(
    model="gpt-oss-120b",
    messages=[
        {"role": "user", "content": "Write a short story about a robot."},
    ],
    stream=True,
)

for chunk in response:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
print()

Multi-Model Workflows

Track multiple LLM calls in sequence. Each call is automatically traced in your Maxim dashboard:
import os
from dotenv import load_dotenv
from openai import OpenAI
from maxim import Maxim
from maxim.logger.openai import MaximOpenAIClient

load_dotenv()

# Initialize Maxim
maxim = Maxim({"api_key": os.getenv("MAXIM_API_KEY")})
logger = maxim.logger({"id": os.getenv("MAXIM_LOG_REPO_ID")})

# Create and wrap Cerebras client
cerebras_client = OpenAI(
    api_key=os.getenv("CEREBRAS_API_KEY"),
    base_url="https://api.cerebras.ai/v1",
    default_headers={"X-Cerebras-3rd-Party-Integration": "MaximAI"}
)
client = MaximOpenAIClient(cerebras_client, logger=logger)

def research_workflow(query: str) -> str:
    """Multi-step workflow with multiple LLM calls."""
    
    # Step 1: Generate search queries
    search_response = client.chat.completions.create(
        model="gpt-oss-120b",
        messages=[
            {"role": "system", "content": "Generate 3 search queries for research."},
            {"role": "user", "content": query},
        ],
        extra_headers={"x-maxim-generation-name": "generate_queries"}
    )
    queries = search_response.choices[0].message.content
    
    # Step 2: Synthesize results
    synthesis_response = client.chat.completions.create(
        model="qwen-3-32b",
        messages=[
            {"role": "system", "content": "Synthesize research findings."},
            {"role": "user", "content": f"Queries: {queries}\n\nProvide a summary."},
        ],
        extra_headers={"x-maxim-generation-name": "synthesize_results"}
    )
    return synthesis_response.choices[0].message.content

result = research_workflow("What are the latest developments in quantum computing?")
print(result)

Monitoring and Alerts

Maxim provides powerful monitoring capabilities to help you track your application’s performance and set up alerts for issues or anomalies.

View Traces in Dashboard

  1. Log in to your Maxim dashboard
  2. Navigate to the Traces section to see all your logged requests
  3. Filter by model, date range, or custom metadata
  4. Click on individual traces to see detailed information including:
    • Full conversation history
    • Token usage and costs
    • Latency metrics
    • Model parameters
    • Custom metadata

Set Up Alerts

You can configure alerts to notify you of issues or anomalies. Visit the Maxim alerts documentation to learn how to:
  • Create alerts for high latency or error rates
  • Monitor token usage and costs
  • Track model performance metrics
  • Get notified via email, Slack, or webhooks

Evaluations and Testing

Maxim provides built-in evaluation capabilities through its dashboard to help you measure and improve your LLM application’s quality. All traced requests are automatically available for evaluation. To set up evaluations:
  1. Log in to your Maxim dashboard
  2. Navigate to Evaluations to create custom evaluators
  3. Define metrics like accuracy, relevance, and coherence
  4. Run evaluations on your traced requests
  5. View results and insights in the dashboard
Learn more about evaluations and testing in the Maxim documentation.

Troubleshooting

If your traces aren’t showing up in the Maxim dashboard:
  • Verify your MAXIM_API_KEY is correct and has the necessary permissions
  • Check that you wrapped your OpenAI client with MaximOpenAIClient before making API requests
  • Ensure your network allows outbound connections to Maxim’s API
  • Look for error messages in your application logs
  • Verify you’re using the latest version of maxim-py (run pip install --upgrade maxim-py)
If you’re seeing authentication errors:
  • Double-check that both CEREBRAS_API_KEY and MAXIM_API_KEY are set correctly in your .env file
  • Ensure you’ve called load_dotenv() before initializing the clients
  • Verify your API keys haven’t expired or been revoked
  • Check that your Cerebras API key is active at cloud.cerebras.ai
If streaming responses aren’t being captured:
  • Make sure you’re using the latest version of maxim-py (run pip install --upgrade maxim-py)
  • Verify that the instrumentation is applied before creating streaming requests
  • Check that you’re iterating through all chunks in the stream
  • Ensure you’re not catching and suppressing exceptions during streaming
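
The last two points matter because instrumentation typically finalizes a streamed trace only once the stream is exhausted. A minimal sketch of draining a stream to completion (the fake chunk list stands in for the real delta.content values from the API):

```python
def drain_stream(chunks):
    """Consume every chunk and return the assembled text.

    Iterating to exhaustion is the important part: breaking out of the
    loop early can leave the trace incomplete.
    """
    parts = []
    for chunk in chunks:
        if chunk:  # skip empty/None deltas (role-only or final chunks)
            parts.append(chunk)
    return "".join(parts)

# Stand-in for the delta.content values of a streamed response
fake_stream = iter(["Once ", "upon ", None, "a time."])
print(drain_stream(fake_stream))  # Once upon a time.
```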
If you notice higher latency than expected:
  • The instrumentation adds minimal overhead (typically less than 10ms)
  • Check your network connection to both Cerebras and Maxim APIs
  • Review your Maxim dashboard to identify if the latency is from the LLM call or logging
  • Consider using async logging if you need to minimize impact on response times
  • Verify you’re using the closest Cerebras region for optimal performance
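
To localize the slowdown yourself, time the wrapped call and compare it against the latency shown in the Maxim dashboard. A minimal sketch (fake_call stands in for a real client.chat.completions.create call):

```python
import time

def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed_seconds).

    Comparing this wall-clock figure with the dashboard's latency metric
    helps separate model latency from any logging overhead.
    """
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

# Stand-in for client.chat.completions.create(...)
def fake_call():
    time.sleep(0.01)
    return "ok"

result, elapsed = timed(fake_call)
print(f"{elapsed * 1000:.1f} ms")  # roughly 10 ms here
```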
If custom metadata isn’t showing up in your traces:
  • Ensure you’re attaching metadata via extra_headers as shown above (or, if you use the logger’s log() context manager, that it is used correctly)
  • Verify that metadata is passed as a dictionary with string keys
  • Check that you’re not exceeding metadata size limits (typically 10KB per trace)
  • Make sure the metadata is added before the API call is made
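
The key and size constraints above can be checked before sending. A minimal sketch (the 10KB figure is the typical limit quoted above; validate_metadata is illustrative, not part of maxim-py):

```python
import json

MAX_METADATA_BYTES = 10 * 1024  # typical 10KB-per-trace limit (assumed)

def validate_metadata(metadata: dict) -> list:
    """Return a list of problems with a metadata dict: non-string keys
    or a serialized payload over the size limit."""
    problems = []
    if not all(isinstance(k, str) for k in metadata):
        problems.append("all keys must be strings")
    if len(json.dumps(metadata).encode("utf-8")) > MAX_METADATA_BYTES:
        problems.append("serialized metadata exceeds 10KB")
    return problems

print(validate_metadata({"x-maxim-session-id": "session_456"}))  # []
print(validate_metadata({42: "oops"}))  # ['all keys must be strings']
```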

Next Steps

For additional support, visit the Maxim support page or contact their team directly.