What is Arize Phoenix?

Arize Phoenix is an open-source AI observability platform for monitoring, evaluating, and debugging LLM applications. Phoenix provides detailed tracing, evaluation capabilities, and debugging tools to help you understand and improve your AI systems in production. Learn more at https://arize.com/phoenix/. With Phoenix, you can:
  • Capture detailed traces of your LLM interactions
  • Evaluate model outputs with custom or pre-built evaluators
  • Debug retrieval and generation pipelines
  • Monitor performance and quality metrics
  • Analyze embeddings and vector search results

Prerequisites

Before you begin, ensure you have:
  • Cerebras API Key - Get a free API key here.
  • Python 3.11 or higher - Phoenix requires Python 3.11+. Check your version with python --version.
  • Phoenix Account (Optional) - While you can use Phoenix locally, creating a free account at Phoenix Cloud enables cloud-based tracing and team collaboration.

Configure Arize Phoenix

1. Install dependencies

Install Phoenix and the OpenAI SDK:
pip install arize-phoenix openai openinference-instrumentation-openai

2. Configure environment variables

Create a .env file in your project directory:
CEREBRAS_API_KEY=your-cerebras-api-key-here
PHOENIX_API_KEY=your-phoenix-api-key-here
PHOENIX_COLLECTOR_ENDPOINT=https://app.phoenix.arize.com/s/your-workspace-name
# For Phoenix Cloud instances created BEFORE June 24, 2025, also add:
PHOENIX_CLIENT_HEADERS=api_key=your-phoenix-api-key-here
Replace your-workspace-name with your actual Phoenix Cloud workspace name (e.g., sebastian-duerr). You can find your Phoenix API key and workspace name in your Phoenix Cloud dashboard.

Note: If your Phoenix Cloud instance was created before June 24, 2025, you must set PHOENIX_CLIENT_HEADERS with the api_key= prefix for authentication to work correctly.

3. Initialize Phoenix tracing

Set up Phoenix Cloud tracing with automatic instrumentation:
import os
from dotenv import load_dotenv
from phoenix.otel import register
from openai import OpenAI

load_dotenv()

# For Phoenix Cloud instances created BEFORE June 24, 2025,
# PHOENIX_CLIENT_HEADERS must be set in the environment before register()
# is called; load_dotenv() above handles this when it is defined in .env.

# Register with Phoenix Cloud - auto_instrument detects OpenAI SDK
tracer_provider = register(
    project_name="cerebras-integration",
    auto_instrument=True,
)

# Initialize Cerebras client
client = OpenAI(
    api_key=os.getenv("CEREBRAS_API_KEY"),
    base_url="https://api.cerebras.ai/v1",
    default_headers={
        "X-Cerebras-3rd-Party-Integration": "Arize Phoenix"
    }
)
The auto_instrument=True flag automatically instruments the OpenAI SDK to capture all API calls (including Cerebras) and sends detailed traces to Phoenix Cloud.
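If you prefer to instrument explicitly rather than rely on the auto_instrument flag, the openinference-instrumentation-openai package installed in step 1 exposes an instrumentor you can attach yourself (a minimal sketch, reusing the register() call above):
from openinference.instrumentation.openai import OpenAIInstrumentor

# Equivalent to auto_instrument=True for the OpenAI SDK alone:
# attach the instrumentor to the tracer provider returned by register()
tracer_provider = register(project_name="cerebras-integration")
OpenAIInstrumentor().instrument(tracer_provider=tracer_provider)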

4. Make your first traced request

Make a request to Cerebras. Phoenix will automatically capture the full trace:
import os
from dotenv import load_dotenv
from phoenix.otel import register
from openai import OpenAI

load_dotenv()

# For Phoenix Cloud instances created BEFORE June 24, 2025, PHOENIX_CLIENT_HEADERS
# must be set before register() is called; load_dotenv() above handles this.

tracer_provider = register(
    project_name="cerebras-integration",
    auto_instrument=True,
)

client = OpenAI(
    api_key=os.getenv("CEREBRAS_API_KEY"),
    base_url="https://api.cerebras.ai/v1",
    default_headers={
        "X-Cerebras-3rd-Party-Integration": "Arize Phoenix"
    }
)

response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain what observability means in AI systems."}
    ],
    temperature=0.7,
    max_tokens=500
)

print(response.choices[0].message.content)
After running this code, visit Phoenix Cloud to see your traces. You’ll see detailed information including conversation history, token usage, response latency, model parameters, and any errors or warnings.
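To cross-check the token counts Phoenix reports, you can also read them straight off the SDK response object (continuing from the script above):
# Token usage as reported by the API; Phoenix shows the same fields on the trace
usage = response.usage
print(f"prompt tokens:     {usage.prompt_tokens}")
print(f"completion tokens: {usage.completion_tokens}")
print(f"total tokens:      {usage.total_tokens}")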

Advanced Features

Streaming Responses

Phoenix fully supports streaming responses from Cerebras. Traces will capture the complete streamed output:
import os
from dotenv import load_dotenv
from phoenix.otel import register
from openai import OpenAI

load_dotenv()

# For Phoenix Cloud instances created BEFORE June 24, 2025, PHOENIX_CLIENT_HEADERS
# must be set before register() is called; load_dotenv() above handles this.

tracer_provider = register(
    project_name="cerebras-streaming",
    auto_instrument=True,
)

client = OpenAI(
    api_key=os.getenv("CEREBRAS_API_KEY"),
    base_url="https://api.cerebras.ai/v1",
    default_headers={
        "X-Cerebras-3rd-Party-Integration": "Arize Phoenix"
    }
)

stream = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[
        {"role": "user", "content": "Write a short story about AI."}
    ],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
The Phoenix trace will show the full streamed response along with timing information for each chunk.

Using Phoenix Evaluations

Phoenix includes a powerful evaluation library that can use Cerebras models to evaluate your LLM outputs:
import os
from dotenv import load_dotenv
from phoenix.otel import register
from phoenix.evals import create_classifier, evaluate_dataframe
from phoenix.evals.llm import LLM
import pandas as pd

load_dotenv()

# For Phoenix Cloud instances created BEFORE June 24, 2025, PHOENIX_CLIENT_HEADERS
# must be set before register() is called; load_dotenv() above handles this.

tracer_provider = register(
    project_name="cerebras-evals",
    auto_instrument=True,
)

# Create LLM instance for evaluations using Cerebras
llm = LLM(
    provider="openai",
    model="llama-3.3-70b",
    api_key=os.getenv("CEREBRAS_API_KEY"),
    base_url="https://api.cerebras.ai/v1"
)

# Create a relevance evaluator
relevance_evaluator = create_classifier(
    name="relevance",
    prompt_template="Is the answer relevant to the question?\n\nQuestion: {input}\nAnswer: {output}",
    llm=llm,
    choices={"relevant": 1.0, "irrelevant": 0.0},
)

# Prepare evaluation data
eval_data = pd.DataFrame({
    "input": ["What is the capital of France?"],
    "output": ["The capital of France is Paris."]
})

# Run evaluation
results = evaluate_dataframe(
    dataframe=eval_data,
    evaluators=[relevance_evaluator],
)

print(results)
Evaluation results are automatically logged to Phoenix, where you can analyze patterns and identify issues across your dataset.

Multi-Turn Conversations

Phoenix traces multi-turn conversations, making it easy to debug complex interactions:
import os
from dotenv import load_dotenv
from phoenix.otel import register
from openai import OpenAI

load_dotenv()

# For Phoenix Cloud instances created BEFORE June 24, 2025, PHOENIX_CLIENT_HEADERS
# must be set before register() is called; load_dotenv() above handles this.

tracer_provider = register(
    project_name="cerebras-multi-turn",
    auto_instrument=True,
)

client = OpenAI(
    api_key=os.getenv("CEREBRAS_API_KEY"),
    base_url="https://api.cerebras.ai/v1",
    default_headers={
        "X-Cerebras-3rd-Party-Integration": "Arize Phoenix"
    }
)

conversation = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "How do I read a file in Python?"}
]

# First turn
response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=conversation
)

conversation.append({
    "role": "assistant",
    "content": response.choices[0].message.content
})

# Follow-up question
conversation.append({
    "role": "user",
    "content": "What about writing to a file?"
})

response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=conversation
)

print(response.choices[0].message.content)
In the Phoenix UI, you’ll see the complete conversation flow with all turns traced together.
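By default, each chat.completions.create call is recorded as its own trace. To group both turns of the script above under one parent, you can open a manual span around them using the standard OpenTelemetry API on the tracer provider returned by register() (a sketch; the span name "conversation" is an arbitrary illustrative choice):
tracer = tracer_provider.get_tracer(__name__)

# Completions calls made inside this span become its children, so the
# whole exchange appears as a single trace in Phoenix
with tracer.start_as_current_span("conversation"):
    response = client.chat.completions.create(
        model="llama-3.3-70b",
        messages=conversation
    )
    # ...append the reply and the follow-up here, then make the second call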

Complete Example

Here’s a complete example showing all the setup and a traced request:
import os
from dotenv import load_dotenv
from phoenix.otel import register
from openai import OpenAI

load_dotenv()

# For Phoenix Cloud instances created BEFORE June 24, 2025, PHOENIX_CLIENT_HEADERS
# must be set before register() is called; load_dotenv() above handles this.

# Register with Phoenix Cloud
tracer_provider = register(
    project_name="cerebras-production",
    auto_instrument=True,
)

# Initialize Cerebras client
client = OpenAI(
    api_key=os.getenv("CEREBRAS_API_KEY"),
    base_url="https://api.cerebras.ai/v1",
    default_headers={
        "X-Cerebras-3rd-Party-Integration": "Arize Phoenix"
    }
)

# Make a request - automatically traced
response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[{"role": "user", "content": "Hello!"}]
)

print(response.choices[0].message.content)
Visit Phoenix Cloud to view your traces, analyze performance, and explore your LLM application’s behavior.

Troubleshooting

If you don’t see traces in Phoenix Cloud:
  • For instances created before June 24, 2025: Ensure you set PHOENIX_CLIENT_HEADERS=api_key=your-api-key in your .env file
  • Verify your PHOENIX_API_KEY environment variable is set correctly
  • Check that PHOENIX_COLLECTOR_ENDPOINT is set to https://app.phoenix.arize.com/s/your-workspace-name
  • Ensure you called register() with auto_instrument=True before making any API requests
  • Look for any error messages in your Python console (especially “401 Unauthorized”)
  • Confirm the arize-phoenix and openinference-instrumentation-openai packages are installed
If you’re getting “Failed to export span batch code: 401” errors:
  • This is an authentication issue. For Phoenix Cloud instances created before June 24, 2025, you must set PHOENIX_CLIENT_HEADERS=api_key=your-api-key in your environment
  • Make sure the environment variable is set before calling phoenix.otel.register()
  • Verify your API key is active in your Phoenix Cloud settings
  • Check that you’re using your workspace endpoint: https://app.phoenix.arize.com/s/your-workspace-name
If you’re getting connection errors:
  • Verify your CEREBRAS_API_KEY environment variable is set correctly
  • Ensure you’re using the correct base URL: https://api.cerebras.ai/v1
  • Check your internet connection and firewall settings
  • Try making a simple request without Phoenix to isolate the issue (see the sketch below)
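For example, the following strips the integration down to a bare Cerebras request with no register() call and no instrumentation, so a failure here points at the API connection rather than at tracing:
import os
from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()

# No Phoenix registration or instrumentation - a plain Cerebras request
client = OpenAI(
    api_key=os.getenv("CEREBRAS_API_KEY"),
    base_url="https://api.cerebras.ai/v1"
)

response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[{"role": "user", "content": "ping"}]
)
print(response.choices[0].message.content)
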
If Phoenix is consuming too much memory:
  • Consider using Phoenix Cloud instead of running locally for production workloads
  • Limit the number of traces stored locally by restarting Phoenix periodically
  • Use trace sampling for high-volume applications (see the sketch after this list)
  • Review the performance optimization guide in Phoenix docs
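
For the sampling suggestion, Phoenix's register() builds on the standard OpenTelemetry SDK, so one option is to configure a sampled tracer provider directly (a sketch using plain OpenTelemetry APIs; the /v1/traces path and the api_key header mirror the PHOENIX_COLLECTOR_ENDPOINT and PHOENIX_CLIENT_HEADERS conventions above, but verify both against your Phoenix version):
import os
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.sdk.trace.sampling import TraceIdRatioBased
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from openinference.instrumentation.openai import OpenAIInstrumentor

# Keep roughly 10% of traces; TraceIdRatioBased is a standard OTel sampler
tracer_provider = TracerProvider(sampler=TraceIdRatioBased(0.1))
tracer_provider.add_span_processor(
    BatchSpanProcessor(
        OTLPSpanExporter(
            endpoint=os.environ["PHOENIX_COLLECTOR_ENDPOINT"] + "/v1/traces",
            headers={"api_key": os.environ["PHOENIX_API_KEY"]},
        )
    )
)

# Instrument the OpenAI SDK against the sampled provider
OpenAIInstrumentor().instrument(tracer_provider=tracer_provider)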

Next Steps