What is AI Suite?

AI Suite is a Python library that provides a unified interface for interacting with multiple large language model (LLM) providers. With AI Suite, you can easily switch between different providers and models using the same codebase, making it simple to compare performance, cost, and accuracy across providers. By integrating Cerebras with AI Suite, you can leverage Cerebras’s ultra-fast inference speeds while maintaining the flexibility to use other providers when needed. Learn more at the AI Suite GitHub repository.

Prerequisites

Before you begin, ensure you have:
  • Cerebras API Key - Get a free API key here.
  • Python 3.11 or higher installed on your system.

Configure AI Suite

Step 1: Install AI Suite

Install the AI Suite library together with the Cerebras SDK using pip. AI Suite is a lightweight package that provides the unified interface, and the Cerebras SDK is the backend it uses to reach Cerebras:
pip install aisuite cerebras-cloud-sdk
If you want to compare Cerebras with other providers (as shown in the examples below), you’ll also need to install their SDKs:
pip install openai anthropic
Step 2: Set up your API key

Configure your Cerebras API key as an environment variable. AI Suite will automatically detect and use this key when making requests to Cerebras:
export CEREBRAS_API_KEY="your-cerebras-api-key-here"
For a more permanent solution, add this to your .env file:
CEREBRAS_API_KEY=your-cerebras-api-key-here
If you’re using other providers for comparison, set their API keys as well:
export OPENAI_API_KEY="your-openai-api-key-here"
export ANTHROPIC_API_KEY="your-anthropic-api-key-here"
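If you go the .env route, you can load the file at startup with the python-dotenv package (pip install python-dotenv), or with a minimal hand-rolled loader. The load_env_file helper below is an illustrative sketch, not part of AI Suite:

```python
import os

def load_env_file(path=".env"):
    """Minimal .env loader: set KEY=value pairs into os.environ.

    Skips blank lines and comments, and never overrides a key that
    is already set in the environment.
    """
    if not os.path.exists(path):
        return
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip().strip('"'))

load_env_file()  # call this before creating the AI Suite client
```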
Step 3: Initialize the AI Suite client

Create an AI Suite client instance. This single client can be used to access any supported LLM provider, including Cerebras:
import aisuite as ai

client = ai.Client()
The client automatically configures itself based on your environment variables, so no additional setup is needed.
Step 4: Make your first request

To use Cerebras models through AI Suite, prefix the model name with cerebras: followed by the model identifier. Here’s a simple example that generates a response:
import aisuite as ai

client = ai.Client()

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What are the benefits of fast inference?"}
]

response = client.chat.completions.create(
    model="cerebras:llama-3.3-70b",
    messages=messages,
    temperature=0.7
)

print(response.choices[0].message.content)
This code sends a chat completion request to Cerebras’s Llama 3.3 70B model and prints the response.

Compare Multiple Models

One of AI Suite’s key advantages is the ability to easily compare responses from different models. Here’s how to query multiple Cerebras models with the same prompt:
import aisuite as ai

client = ai.Client()

# Define different Cerebras models to compare
models = [
    "cerebras:llama-3.3-70b",
    "cerebras:qwen-3-32b",
    "cerebras:llama3.1-8b"
]

messages = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Write a Python function to calculate fibonacci numbers."}
]

for model in models:
    print(f"\n--- Response from {model} ---")
    response = client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=0.7
    )
    print(response.choices[0].message.content)
This approach allows you to:
  • Compare response quality across different Cerebras models
  • Benchmark inference speeds across model sizes
  • Test different models for specific use cases
  • Easily switch between models without changing your code structure
You can also compare Cerebras with other providers like OpenAI or Anthropic by adding their models to the list (e.g., "openai:gpt-4o", "anthropic:claude-opus-4-5"). Just make sure to install their SDKs and set the appropriate API keys as shown in the setup steps above.
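To turn the comparison loop above into a rough latency benchmark, you can time each call with time.perf_counter. The timed_completion helper below is our own sketch, not an AI Suite API:

```python
import time

def timed_completion(client, model, messages, **kwargs):
    """Run one chat completion and return (latency_seconds, text)."""
    start = time.perf_counter()
    response = client.chat.completions.create(
        model=model, messages=messages, **kwargs
    )
    latency = time.perf_counter() - start
    return latency, response.choices[0].message.content

# Usage (assumes client, models, and messages from the example above):
# for model in models:
#     latency, text = timed_completion(client, model, messages)
#     print(f"{model}: {latency:.2f}s, {len(text)} chars")
```

Wall-clock timing like this includes network latency, so run each model a few times and compare medians rather than single calls.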

Available Cerebras Models

You can use any of Cerebras’s production models through AI Suite by adding the cerebras: prefix:
  • cerebras:llama-3.3-70b - Best for complex reasoning, long-form content, and tasks requiring deep understanding
  • cerebras:qwen-3-32b - Balanced performance for general-purpose applications
  • cerebras:llama3.1-8b - Fastest option for simple tasks and high-throughput scenarios
  • cerebras:gpt-oss-120b - Large open-weight model for demanding tasks
  • cerebras:zai-glm-4.6 - Advanced 355B-parameter model with strong reasoning capabilities
For detailed information about each model’s capabilities and pricing, visit the Cerebras models page.

Advanced Usage

Adjusting Parameters

Customize model behavior with parameters like temperature, max_tokens, and top_p to fine-tune responses for your specific use case:
import aisuite as ai

client = ai.Client()

response = client.chat.completions.create(
    model="cerebras:llama-3.3-70b",
    messages=[{"role": "user", "content": "Explain quantum computing."}],
    temperature=0.3,  # Lower temperature for more focused responses
    max_tokens=500,   # Limit response length
    top_p=0.9        # Nucleus sampling parameter
)

print(response.choices[0].message.content)

Multi-Turn Conversations

Maintain context across multiple exchanges by building up your messages array:
import aisuite as ai

client = ai.Client()

messages = [
    {"role": "system", "content": "You are a helpful math tutor."},
    {"role": "user", "content": "What is the Pythagorean theorem?"}
]

# First response
response = client.chat.completions.create(
    model="cerebras:llama-3.3-70b",
    messages=messages
)

print(response.choices[0].message.content)

# Add assistant's response to conversation history
messages.append({
    "role": "assistant",
    "content": response.choices[0].message.content
})

# Continue the conversation
messages.append({
    "role": "user",
    "content": "Can you give me an example?"
})

response = client.chat.completions.create(
    model="cerebras:llama-3.3-70b",
    messages=messages
)

print(response.choices[0].message.content)
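If you find yourself repeating this append-and-resend pattern, a small wrapper can manage the history for you. The Chat class below is an illustrative sketch, not part of AI Suite:

```python
class Chat:
    """Keeps conversation history and appends each exchange automatically."""

    def __init__(self, client, model, system=None):
        self.client = client
        self.model = model
        self.messages = []
        if system:
            self.messages.append({"role": "system", "content": system})

    def send(self, text):
        self.messages.append({"role": "user", "content": text})
        response = self.client.chat.completions.create(
            model=self.model, messages=self.messages
        )
        reply = response.choices[0].message.content
        self.messages.append({"role": "assistant", "content": reply})
        return reply

# chat = Chat(client, "cerebras:llama-3.3-70b", system="You are a helpful math tutor.")
# print(chat.send("What is the Pythagorean theorem?"))
# print(chat.send("Can you give me an example?"))
```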

Frequently Asked Questions

How does AI Suite handle API keys for multiple providers?

AI Suite automatically detects API keys from environment variables based on the provider prefix. For Cerebras, it looks for CEREBRAS_API_KEY. You can set multiple provider keys (like OPENAI_API_KEY, ANTHROPIC_API_KEY) and AI Suite will use the appropriate key based on the model prefix in your request.
Does AI Suite support asynchronous requests?

Currently, AI Suite focuses on synchronous API calls. For async operations, you may need to wrap calls in your own async functions or use the provider’s native SDK directly. Check the AI Suite GitHub repository for updates on async support.
How should I handle errors across providers?

Different providers may have different error formats. Wrap your API calls in try-except blocks and handle provider-specific errors. AI Suite attempts to normalize responses, but error handling may vary by provider.
try:
    response = client.chat.completions.create(
        model="cerebras:llama-3.3-70b",
        messages=messages
    )
except Exception as e:
    print(f"Error: {e}")
    # Fallback to another provider or handle error
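Building on that try-except pattern, you can fall back through a list of models until one succeeds. The complete_with_fallback helper below is our own sketch, not an AI Suite API:

```python
def complete_with_fallback(client, models, messages, **kwargs):
    """Try each model in order; return the first successful response.

    Re-raises the last error if every model fails.
    """
    last_error = None
    for model in models:
        try:
            return client.chat.completions.create(
                model=model, messages=messages, **kwargs
            )
        except Exception as e:
            last_error = e
    raise last_error

# response = complete_with_fallback(
#     client,
#     ["cerebras:llama-3.3-70b", "openai:gpt-4o"],
#     messages,
# )
```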
Does AI Suite support advanced features like function calling?

AI Suite provides a unified interface for basic chat completions. Advanced features like function calling depend on the underlying provider’s capabilities. Check the specific provider’s documentation for feature availability and implementation details.
How do I balance cost and performance when choosing a model?

Use AI Suite to benchmark different models for your specific use case. Cerebras offers competitive pricing with ultra-fast inference speeds. Start with smaller models like cerebras:llama3.1-8b for simple tasks and reserve larger models for complex reasoning. Visit the Cerebras pricing page for detailed cost information.

Troubleshooting

If you see an error about missing API keys:
  • Verify your CEREBRAS_API_KEY environment variable is set correctly
  • Ensure you’re running your script in the same terminal session where you exported the variable
  • Try using a .env file with a library like python-dotenv for persistent configuration
  • Restart your Python interpreter or IDE after setting environment variables
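A quick way to confirm the key is actually visible to your Python process is to check os.environ directly. The check_api_keys helper below is a diagnostic sketch, not AI Suite functionality; it reports which keys are set without printing their values:

```python
import os

def check_api_keys(names=("CEREBRAS_API_KEY",)):
    """Print set/MISSING status for each variable; return the missing ones."""
    missing = [n for n in names if not os.environ.get(n)]
    for name in names:
        print(f"{name}: {'MISSING' if name in missing else 'set'}")
    return missing

# missing = check_api_keys(("CEREBRAS_API_KEY", "OPENAI_API_KEY"))
# if missing:
#     print("Export the missing keys before creating the client.")
```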
If you receive a model not found error:
  • Verify you’re using the correct model name format: cerebras:model-name
  • Check that the model name matches one of the available Cerebras models
  • Ensure there are no typos in the model identifier (note the hyphen in llama-3.3-70b)
  • Confirm you’re using a current production model, not a deprecated version
If you experience connection issues:
  • Verify your internet connection is stable
  • Check that your API key is valid and has not expired in your dashboard
  • Ensure you’re not hitting rate limits (check your usage in the dashboard)
  • Try a simple test request to isolate the issue
If responses seem slower than expected:
  • Cerebras typically provides the fastest inference speeds in the industry
  • Compare with other providers using the multi-model example above to benchmark
  • Check your network latency and consider your geographic location relative to Cerebras’s servers
  • Verify you’re not using an unnecessarily large model for simple tasks
  • Ensure you’re not rate-limited or experiencing API throttling
If you encounter import errors:
  • Verify AI Suite is installed: pip show aisuite
  • Ensure you’re using the correct import statement: import aisuite as ai
  • Check your Python version is 3.11 or higher: python --version
  • Try reinstalling the package: pip install --upgrade aisuite