QuickStart

To get started with a free API key, click here.

This QuickStart guide helps you make your first API call. If you are an experienced AI applications developer, you may prefer to go directly to the API reference documentation. If you would like to try the models on Cerebras’ Inference solution before making an API call, please visit the developer playground. This guide will walk you through:

Setting up your developer environment
Installing the Cerebras Inference library
Making your first request to the Cerebras API

Prerequisites

To complete this guide, you will need:

A Cerebras account
A Cerebras Inference API key
Python 3.7+ or TypeScript 4.5+

Set up your API key

The first thing you will need is a valid API key. Please visit this link and navigate to “API Keys” on the left nav bar. For security reasons, and to avoid configuring your API key each time, we recommend setting your API key as an environment variable. You can do this by running the following command in your terminal:

export CEREBRAS_API_KEY="your-api-key-here"
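As a minimal sketch of why the environment variable is useful, the Python snippet below reads the key at startup and fails early with a clear message if it is missing. The helper name `load_api_key` is hypothetical and is not part of the Cerebras library; only the variable name `CEREBRAS_API_KEY` comes from this guide.

```python
import os


def load_api_key(env_var: str = "CEREBRAS_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    Hypothetical helper for illustration; not part of the Cerebras library.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        # Fail early with a readable message instead of a later auth error.
        raise RuntimeError(f"{env_var} is not set; export it before running your script.")
    return key
```

Calling `load_api_key()` once at the top of a script keeps the key out of source code while surfacing a missing `export` immediately.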

Install the Cerebras Inference library

The Cerebras Inference library is available for download and installation through the Python Package Index (PyPI) and the npm package manager. To install the library, run either of the following commands in your terminal, based on your language of choice:

Note: You can also call the underlying API directly (see cURL request example below in Step 3).

pip install --upgrade cerebras_cloud_sdk

Making an API request

If your request is being blocked by CloudFront, ensure that a User-Agent header is included in your request.
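If you call the API directly rather than through the library, the sketch below shows one way to attach a User-Agent header using only the Python standard library. The endpoint URL, the Bearer authorization scheme, the User-Agent string, and the helper names `build_chat_request` and `send` are assumptions for illustration; check the API reference for the authoritative values.

```python
import json
import urllib.request

# Assumed endpoint; confirm against the API reference before relying on it.
API_URL = "https://api.cerebras.ai/v1/chat/completions"


def build_chat_request(api_key, prompt, model="llama-4-scout-17b-16e-instruct"):
    """Build (but do not send) a chat completion request with explicit headers."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
            # Any descriptive value works; this string is a placeholder.
            "User-Agent": "quickstart-example/0.1",
        },
        method="POST",
    )


def send(request):
    """Send the request and decode the JSON body of the response."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

For example, `send(build_chat_request(api_key, "Why is fast inference important?"))` would issue the call. Building the request separately makes it easy to inspect the headers before anything goes over the network.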

Once you have configured your API key, you are ready to send your first API request. The following code snippets demonstrate how to make an API request to the Cerebras API to perform a chat completion.

import os
from cerebras.cloud.sdk import Cerebras

client = Cerebras(
    # This is the default and can be omitted
    api_key=os.environ.get("CEREBRAS_API_KEY"),
)

chat_completion = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Why is fast inference important?",
        }
    ],
    model="llama-4-scout-17b-16e-instruct",
)

print(chat_completion)
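Printing `chat_completion` shows the whole response object. To get only the generated text, a sketch like the following may help. It assumes the response follows the common chat completions shape, with a `choices` list whose items carry a `message` with a `content` field. The helper `first_reply_text` and the stand-in object are hypothetical and exist only so the snippet runs without an API call.

```python
from types import SimpleNamespace


def first_reply_text(completion):
    """Return the text of the first choice in a chat completion response.

    Assumes the response exposes choices[0].message.content.
    """
    return completion.choices[0].message.content


# Stand-in object mimicking the assumed response shape (not real API output).
fake_completion = SimpleNamespace(
    choices=[
        SimpleNamespace(
            message=SimpleNamespace(
                role="assistant",
                content="Fast inference keeps applications responsive.",
            )
        )
    ]
)

print(first_reply_text(fake_completion))
```

With a real response, you would pass the `chat_completion` object from the request above in place of `fake_completion`.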

Next Steps

Visit our repositories for our Python and Node.js libraries.
Check out our API Reference to learn about the details of our available endpoints and request parameters.
Learn how to stream responses.
Learn about tool use.
