1. Get an OpenRouter API Key
First, you will need to create an OpenRouter API key. You’ll use this key to authenticate with OpenRouter and access the Cerebras provider.
- Go to API Keys in OpenRouter
- Click Create Key
- Give it a name and copy your API key
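Once you have the key, one common pattern is to store it in an environment variable so it never appears in your code. The variable name `OPENROUTER_API_KEY` below is a convention, not a requirement:

```shell
# Store the key for the current shell session (bash/zsh).
export OPENROUTER_API_KEY="your_openrouter_key_here"

# Confirm it is set without printing the full key.
echo "${OPENROUTER_API_KEY:0:8}..."
```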
2. Make an API Call
Here’s an example using Chat Completions to query Llama 3.3 70B on Cerebras. Be sure to replace `your_openrouter_key_here` with your actual API key.
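A minimal sketch of that call using only the Python standard library. The model slug `meta-llama/llama-3.3-70b-instruct` and the `provider` routing block are assumptions based on OpenRouter's documented conventions; adjust them to the exact slug shown on the OpenRouter model page.

```python
# Chat Completions request to Llama 3.3 70B on Cerebras via OpenRouter,
# using only the Python standard library (no SDK required).
import json
import os
import urllib.request

API_URL = "https://openrouter.ai/api/v1/chat/completions"


def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build the HTTP request; the `provider` block pins routing to Cerebras."""
    payload = {
        "model": "meta-llama/llama-3.3-70b-instruct",  # assumed model slug
        "messages": [{"role": "user", "content": prompt}],
        # OpenRouter-specific routing: prefer Cerebras, disable fallbacks.
        "provider": {"order": ["Cerebras"], "allow_fallbacks": False},
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    key = os.environ.get("OPENROUTER_API_KEY", "your_openrouter_key_here")
    req = build_request("Why is fast inference important?", key)
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    print(body["choices"][0]["message"]["content"])
```

The same request works with the official `openai` SDK by setting `base_url="https://openrouter.ai/api/v1"` and passing your OpenRouter key as the API key.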
3. Try Structured Outputs + Make a Tool Call
You did it — your first API call is complete! Now, let’s explore how to make your model smarter at handling tasks and more precise in how it formats its responses via structured outputs and tool calling. See the examples below for how to use both.
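The two request payloads below sketch what each feature looks like in the Chat Completions format. The schema and the `get_weather` tool are illustrative examples, not part of any real API; send either payload to the same endpoint as in Step 2.

```python
# Illustrative payloads for structured outputs and tool calling.
import json

MODEL = "meta-llama/llama-3.3-70b-instruct"  # assumed model slug

# Structured outputs: constrain the response to JSON matching a schema.
structured_payload = {
    "model": MODEL,
    "messages": [{"role": "user", "content": "Name a city and its population."}],
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "city_info",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "city": {"type": "string"},
                    "population": {"type": "integer"},
                },
                "required": ["city", "population"],
                "additionalProperties": False,
            },
        },
    },
}

# Tool calling: describe a function the model may choose to invoke.
tool_payload = {
    "model": MODEL,
    "messages": [{"role": "user", "content": "What's the weather in Toronto?"}],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical tool for illustration
                "description": "Get the current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
}

print(json.dumps(structured_payload, indent=2))
print(json.dumps(tool_payload, indent=2))
```

With structured outputs the response content is guaranteed-parseable JSON; with tool calling the model returns a `tool_calls` entry naming the function and its arguments, which your code then executes.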
Differences Between Cerebras Cloud and OpenRouter
Cerebras Cloud is primarily intended for free tier users and high-throughput startups that need a dedicated plan to handle their inference. OpenRouter acts as one of our "pay-as-you-go" providers.

FAQ
What context length can I run?
This varies by model. See our provider page to view max context length for each model.
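You can also look up a model's maximum context programmatically. OpenRouter's public `/api/v1/models` listing includes a `context_length` field per model; the helper below assumes that response shape.

```python
# Look up a model's maximum context length from OpenRouter's public
# /api/v1/models listing (no API key required to read it).
import json
import urllib.request

MODELS_URL = "https://openrouter.ai/api/v1/models"


def find_context_length(models, model_id):
    """Return the context_length for model_id from a models listing, or None."""
    for model in models:
        if model.get("id") == model_id:
            return model.get("context_length")
    return None


if __name__ == "__main__":
    with urllib.request.urlopen(MODELS_URL) as resp:
        listing = json.load(resp)["data"]
    print(find_context_length(listing, "meta-llama/llama-3.3-70b-instruct"))
```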
What additional latency can I expect when using Cerebras through OpenRouter?
Expect a small amount of additional latency: Cerebras Inference is only reachable through OpenRouter's proxy, so your request makes one extra hop after the initial API call.
Why do I see "Wrong API Format" when running the OpenRouter test code?
The official OpenRouter inference example uses a multimodal input call, which is not currently supported by Cerebras. To avoid this error, use the code provided in Step 2 of the tutorial above.
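In the Chat Completions format, multimodal requests send `content` as a list of typed parts, while text-only requests send a plain string. The comparison below is a sketch of that difference, which is the likely source of the error:

```python
# Message shapes in the Chat Completions format.

# Multimodal shape: `content` is a list of typed parts. This is the shape
# used in OpenRouter's official example and is not supported via Cerebras.
multimodal_message = {
    "role": "user",
    "content": [{"type": "text", "text": "Hello"}],
}

# Text-only shape: `content` is a plain string. Use this with Cerebras.
text_only_message = {
    "role": "user",
    "content": "Hello",
}

print(type(multimodal_message["content"]).__name__)
print(type(text_only_message["content"]).__name__)
```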