What is Arize Phoenix?

Arize Phoenix is an open-source AI observability platform for monitoring, evaluating, and debugging LLM applications. Phoenix provides detailed tracing, evaluation capabilities, and debugging tools to help you understand and improve your AI systems in production. Learn more at https://arize.com/phoenix/. With Phoenix, you can:
  • Capture detailed traces of your LLM interactions
  • Evaluate model outputs with custom or pre-built evaluators
  • Debug retrieval and generation pipelines
  • Monitor performance and quality metrics
  • Analyze embeddings and vector search results

Prerequisites

Before you begin, ensure you have:
  • Cerebras API Key - Get a free API key here.
  • Python 3.11 or higher - Phoenix requires Python 3.11+. Check your version with python --version.
  • Phoenix Account (Optional) - While you can use Phoenix locally, creating a free account at Phoenix Cloud enables cloud-based tracing and team collaboration.

Configure Arize Phoenix

1. Install dependencies

Install Phoenix and the OpenAI SDK:
pip install arize-phoenix openai openinference-instrumentation-openai

2. Configure environment variables

Create a .env file in your project directory:
CEREBRAS_API_KEY=your-cerebras-api-key-here
PHOENIX_API_KEY=your-phoenix-api-key-here
PHOENIX_COLLECTOR_ENDPOINT=https://app.phoenix.arize.com/s/your-workspace-name
# For Phoenix Cloud instances created BEFORE June 24, 2025, also add:
PHOENIX_CLIENT_HEADERS=api_key=your-phoenix-api-key-here
Replace your-workspace-name with your actual Phoenix Cloud workspace name (e.g., sebastian-duerr). You can find your Phoenix API key and workspace name in your Phoenix Cloud dashboard.

Note: If your Phoenix Cloud instance was created before June 24, 2025, you must set PHOENIX_CLIENT_HEADERS with the api_key= prefix for authentication to work correctly.

3. Initialize Phoenix tracing

Set up Phoenix Cloud tracing with automatic instrumentation:
import os
from dotenv import load_dotenv
from phoenix.otel import register
from openai import OpenAI

load_dotenv()

# For Phoenix Cloud instances created BEFORE June 24, 2025,
# PHOENIX_CLIENT_HEADERS must be set in the environment before register()
# is called; load_dotenv() above handles this when it is defined in .env.

# Register with Phoenix Cloud - auto_instrument detects OpenAI SDK
tracer_provider = register(
    project_name="cerebras-integration",
    auto_instrument=True,
)

# Initialize Cerebras client
client = OpenAI(
    api_key=os.getenv("CEREBRAS_API_KEY"),
    base_url="https://api.cerebras.ai/v1",
    default_headers={
        "X-Cerebras-3rd-Party-Integration": "Arize Phoenix"
    }
)
The auto_instrument=True flag automatically instruments the OpenAI SDK to capture all API calls (including Cerebras) and sends detailed traces to Phoenix Cloud.
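If you prefer to instrument explicitly rather than rely on the auto_instrument flag, the openinference-instrumentation-openai package installed in step 1 exposes an instrumentor you can attach yourself (a minimal sketch, reusing the register() call above):
from openinference.instrumentation.openai import OpenAIInstrumentor

# Equivalent to auto_instrument=True for the OpenAI SDK alone:
# attach the instrumentor to the tracer provider returned by register()
tracer_provider = register(project_name="cerebras-integration")
OpenAIInstrumentor().instrument(tracer_provider=tracer_provider)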

4. Make your first traced request

Make a request to Cerebras. Phoenix will automatically capture the full trace:
import os
from dotenv import load_dotenv
from phoenix.otel import register
from openai import OpenAI

load_dotenv()

# For Phoenix Cloud instances created BEFORE June 24, 2025, PHOENIX_CLIENT_HEADERS
# must be set before register() is called; load_dotenv() above handles this.

tracer_provider = register(
    project_name="cerebras-integration",
    auto_instrument=True,
)

client = OpenAI(
    api_key=os.getenv("CEREBRAS_API_KEY"),
    base_url="https://api.cerebras.ai/v1",
    default_headers={
        "X-Cerebras-3rd-Party-Integration": "Arize Phoenix"
    }
)

response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain what observability means in AI systems."}
    ],
    temperature=0.7,
    max_tokens=500
)

print(response.choices[0].message.content)
After running this code, visit Phoenix Cloud to see your traces. You’ll see detailed information including conversation history, token usage, response latency, model parameters, and any errors or warnings.
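To cross-check the token counts Phoenix reports, you can also read them straight off the SDK response object (continuing from the script above):
# Token usage as reported by the API; Phoenix shows the same fields on the trace
usage = response.usage
print(f"prompt tokens:     {usage.prompt_tokens}")
print(f"completion tokens: {usage.completion_tokens}")
print(f"total tokens:      {usage.total_tokens}")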

Advanced Features

Streaming Responses

Phoenix fully supports streaming responses from Cerebras. Traces will capture the complete streamed output:
import os
from dotenv import load_dotenv
from phoenix.otel import register
from openai import OpenAI

load_dotenv()

# For Phoenix Cloud instances created BEFORE June 24, 2025, PHOENIX_CLIENT_HEADERS
# must be set before register() is called; load_dotenv() above handles this.

tracer_provider = register(
    project_name="cerebras-streaming",
    auto_instrument=True,
)

client = OpenAI(
    api_key=os.getenv("CEREBRAS_API_KEY"),
    base_url="https://api.cerebras.ai/v1",
    default_headers={
        "X-Cerebras-3rd-Party-Integration": "Arize Phoenix"
    }
)

stream = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[
        {"role": "user", "content": "Write a short story about AI."}
    ],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
The Phoenix trace will show the full streamed response along with timing information for each chunk.

Using Phoenix Evaluations

Phoenix includes a powerful evaluation library that can use Cerebras models to evaluate your LLM outputs:
import os
from dotenv import load_dotenv
from phoenix.otel import register
from phoenix.evals import create_classifier, evaluate_dataframe
from phoenix.evals.llm import LLM
import pandas as pd

load_dotenv()

# For Phoenix Cloud instances created BEFORE June 24, 2025, PHOENIX_CLIENT_HEADERS
# must be set before register() is called; load_dotenv() above handles this.

tracer_provider = register(
    project_name="cerebras-evals",
    auto_instrument=True,
)

# Create LLM instance for evaluations using Cerebras
llm = LLM(
    provider="openai",
    model="llama-3.3-70b",
    api_key=os.getenv("CEREBRAS_API_KEY"),
    base_url="https://api.cerebras.ai/v1"
)

# Create a relevance evaluator
relevance_evaluator = create_classifier(
    name="relevance",
    prompt_template="Is the answer relevant to the question?\n\nQuestion: {input}\nAnswer: {output}",
    llm=llm,
    choices={"relevant": 1.0, "irrelevant": 0.0},
)

# Prepare evaluation data
eval_data = pd.DataFrame({
    "input": ["What is the capital of France?"],
    "output": ["The capital of France is Paris."]
})

# Run evaluation
results = evaluate_dataframe(
    dataframe=eval_data,
    evaluators=[relevance_evaluator],
)

print(results)
Evaluation results are automatically logged to Phoenix, where you can analyze patterns and identify issues across your dataset.

Multi-Turn Conversations

Phoenix traces multi-turn conversations, making it easy to debug complex interactions:
import os
from dotenv import load_dotenv
from phoenix.otel import register
from openai import OpenAI

load_dotenv()

# For Phoenix Cloud instances created BEFORE June 24, 2025, PHOENIX_CLIENT_HEADERS
# must be set before register() is called; load_dotenv() above handles this.

tracer_provider = register(
    project_name="cerebras-multi-turn",
    auto_instrument=True,
)

client = OpenAI(
    api_key=os.getenv("CEREBRAS_API_KEY"),
    base_url="https://api.cerebras.ai/v1",
    default_headers={
        "X-Cerebras-3rd-Party-Integration": "Arize Phoenix"
    }
)

conversation = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "How do I read a file in Python?"}
]

# First turn
response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=conversation
)

conversation.append({
    "role": "assistant",
    "content": response.choices[0].message.content
})

# Follow-up question
conversation.append({
    "role": "user",
    "content": "What about writing to a file?"
})

response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=conversation
)

print(response.choices[0].message.content)
In the Phoenix UI, you’ll see the complete conversation flow with all turns traced together.
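By default, each chat.completions.create call is recorded as its own trace. To group both turns of the script above under one parent, you can open a manual span around them using the standard OpenTelemetry API on the tracer provider returned by register() (a sketch; the span name "conversation" is an arbitrary illustrative choice):
tracer = tracer_provider.get_tracer(__name__)

# Completions calls made inside this span become its children, so the
# whole exchange appears as a single trace in Phoenix
with tracer.start_as_current_span("conversation"):
    response = client.chat.completions.create(
        model="llama-3.3-70b",
        messages=conversation
    )
    # ...append the reply and the follow-up here, then make the second call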

Complete Example

Here’s a complete example showing all the setup and a traced request:
import os
from dotenv import load_dotenv
from phoenix.otel import register
from openai import OpenAI

load_dotenv()

# For Phoenix Cloud instances created BEFORE June 24, 2025, PHOENIX_CLIENT_HEADERS
# must be set before register() is called; load_dotenv() above handles this.

# Register with Phoenix Cloud
tracer_provider = register(
    project_name="cerebras-production",
    auto_instrument=True,
)

# Initialize Cerebras client
client = OpenAI(
    api_key=os.getenv("CEREBRAS_API_KEY"),
    base_url="https://api.cerebras.ai/v1",
    default_headers={
        "X-Cerebras-3rd-Party-Integration": "Arize Phoenix"
    }
)

# Make a request - automatically traced
response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[{"role": "user", "content": "Hello!"}]
)

print(response.choices[0].message.content)
Visit Phoenix Cloud to view your traces, analyze performance, and explore your LLM application’s behavior.

Troubleshooting

If you don’t see traces in Phoenix Cloud:
  • For instances created before June 24, 2025: Ensure you set PHOENIX_CLIENT_HEADERS=api_key=your-api-key in your .env file
  • Verify your PHOENIX_API_KEY environment variable is set correctly
  • Check that PHOENIX_COLLECTOR_ENDPOINT is set to https://app.phoenix.arize.com/s/your-workspace-name
  • Ensure you called register() with auto_instrument=True before making any API requests
  • Look for any error messages in your Python console (especially “401 Unauthorized”)
  • Confirm the arize-phoenix and openinference-instrumentation-openai packages are installed
If you’re getting “Failed to export span batch code: 401” errors:
  • This is an authentication issue. For Phoenix Cloud instances created before June 24, 2025, you must set PHOENIX_CLIENT_HEADERS=api_key=your-api-key in your environment
  • Make sure the environment variable is set before calling phoenix.otel.register()
  • Verify your API key is active in your Phoenix Cloud settings
  • Check that you’re using your workspace endpoint: https://app.phoenix.arize.com/s/your-workspace-name
If you’re getting connection errors:
  • Verify your CEREBRAS_API_KEY environment variable is set correctly
  • Ensure you’re using the correct base URL: https://api.cerebras.ai/v1
  • Check your internet connection and firewall settings
  • Try making a simple request without Phoenix to isolate the issue (see the sketch below)
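For example, the following strips the integration down to a bare Cerebras request with no register() call and no instrumentation, so a failure here points at the API connection rather than at tracing:
import os
from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()

# No Phoenix registration or instrumentation - a plain Cerebras request
client = OpenAI(
    api_key=os.getenv("CEREBRAS_API_KEY"),
    base_url="https://api.cerebras.ai/v1"
)

response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[{"role": "user", "content": "ping"}]
)
print(response.choices[0].message.content)
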
If Phoenix is consuming too much memory:
  • Consider using Phoenix Cloud instead of running locally for production workloads
  • Limit the number of traces stored locally by restarting Phoenix periodically
  • Use trace sampling for high-volume applications (see the sketch after this list)
  • Review the performance optimization guide in Phoenix docs
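
For the sampling suggestion, Phoenix's register() builds on the standard OpenTelemetry SDK, so one option is to configure a sampled tracer provider directly (a sketch using plain OpenTelemetry APIs; the /v1/traces path and the api_key header mirror the PHOENIX_COLLECTOR_ENDPOINT and PHOENIX_CLIENT_HEADERS conventions above, but verify both against your Phoenix version):
import os
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.sdk.trace.sampling import TraceIdRatioBased
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from openinference.instrumentation.openai import OpenAIInstrumentor

# Keep roughly 10% of traces; TraceIdRatioBased is a standard OTel sampler
tracer_provider = TracerProvider(sampler=TraceIdRatioBased(0.1))
tracer_provider.add_span_processor(
    BatchSpanProcessor(
        OTLPSpanExporter(
            endpoint=os.environ["PHOENIX_COLLECTOR_ENDPOINT"] + "/v1/traces",
            headers={"api_key": os.environ["PHOENIX_API_KEY"]},
        )
    )
)

# Instrument the OpenAI SDK against the sampled provider
OpenAIInstrumentor().instrument(tracer_provider=tracer_provider)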

Next Steps