Make your first API call and see what inference at thousands of tokens per second feels like. Already familiar with LLM APIs? Skip straight to the API reference or try the playground.Documentation Index
Fetch the complete documentation index at: https://inference-docs.cerebras.ai/llms.txt
Use this file to discover all available pages before exploring further.
Prerequisites
To complete this guide, you will need:- A Cerebras account (sign up free)
- A Cerebras Inference API key
- Python 3.10+ or TypeScript 4.5+
Set up your API key
Visit the Cloud Console and navigate to API Keys in the left nav bar to create a key.Set your API key as an environment variable so you don’t have to pass it with every request:Confirm the variable is set:
Install the SDK
Install the Cerebras SDK for your language of choice. You can also call the API directly with cURL (see Step 3).
Next Steps
- Choose a model — Find the right model for your use case in our model selection guide.
- Explore capabilities — Add streaming, tool calling, structured outputs, or reasoning to your application.
- Design for Cerebras — Learn architectural patterns that take full advantage of wafer-scale inference.
- Browse the API — See all available endpoints and parameters in the API reference.

