The Cerebras Inference API offers developers a low-latency solution for AI model inference, powered by Cerebras Wafer-Scale Engines and CS-3 systems. We invite developers to explore the new possibilities that our high-speed inference solution unlocks.

The Cerebras Inference API currently provides access to the following models:

| Model Name | Model ID | Parameters | Speed (tokens/s) |
| --- | --- | --- | --- |
| Llama 4 Scout | llama-4-scout-17b-16e-instruct | 109 billion | ~2600 |
| Llama 3.1 8B | llama3.1-8b | 8 billion | ~2200 |
| Llama 3.3 70B | llama-3.3-70b | 70 billion | ~2100 |
| Qwen 3 32B* | qwen-3-32b | 32 billion | ~2100 |
| DeepSeek R1 Distill Llama 70B* | deepseek-r1-distill-llama-70b | 70 billion | ~1700 |
* Qwen 3 32B is a hybrid reasoning model that can operate in two modes: with or without thinking tokens. Currently, only the default reasoning mode is supported. For queries where you do not want reasoning, you can suggest that the model skip it by appending /no_think to the prompt, for example: "Write a python script to calculate the area of a circle /no_think". See the sketch after these notes.
* DeepSeek R1 Distill Llama 70B is available in private preview. Please contact us to request access.
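As a minimal sketch of the /no_think suggestion, here is a request against qwen-3-32b using the same Python SDK shown in the QuickStart below. It assumes your API key is set in the CEREBRAS_API_KEY environment variable; note that /no_think is a hint to the model, not a guarantee.

import os

from cerebras.cloud.sdk import Cerebras

client = Cerebras(api_key=os.environ.get("CEREBRAS_API_KEY"))

# Appending /no_think to the prompt suggests that Qwen 3 answer
# directly, without emitting thinking tokens first.
response = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Write a python script to calculate the area of a circle /no_think",
        },
    ],
    model="qwen-3-32b",
)

print(response.choices[0].message.content)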

QuickStart Guide

Get started by building your first application using our QuickStart guide.

import os

from cerebras.cloud.sdk import Cerebras

# The client reads your API key from the CEREBRAS_API_KEY environment variable.
client = Cerebras(
    api_key=os.environ.get("CEREBRAS_API_KEY"),
)

chat_completion = client.chat.completions.create(
    messages=[
        {"role": "user", "content": "Why is fast inference important?"},
    ],
    model="llama-4-scout-17b-16e-instruct",
)

# The assistant's reply is in the first choice of the response.
print(chat_completion.choices[0].message.content)