Cerebras Inference on OpenRouter
Learn how to use Cerebras Inference on OpenRouter.
This guide provides a step-by-step walkthrough for using the OpenRouter API to run inference on Cerebras hardware. For a complete list of Cerebras Inference-powered models available on OpenRouter, visit the OpenRouter site.
We currently support the Chat Completions endpoint via the OpenRouter platform; you can get started with just a few lines of code by following the steps below.
Get an OpenRouter API Key
First, you will need to create an OpenRouter API key. You’ll use this key to authenticate with OpenRouter and access the Cerebras provider.
- Go to API Keys in OpenRouter
- Click Create Key
- Give it a name and copy your API key
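Once you have the key, a common pattern is to keep it in an environment variable rather than hard-coding it into scripts. The variable name OPENROUTER_API_KEY below is a convention, not a requirement:

```shell
# Export the key so scripts and later examples can read it from the environment.
export OPENROUTER_API_KEY="your_openrouter_key_here"
```

Add the line to your shell profile if you want it available in every session.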
Make an API Call
Here’s an example using Chat Completions to query Llama 3.3 70B on Cerebras. Be sure to replace your_openrouter_key_here with your actual API key.
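A minimal sketch of that call using only the Python standard library. The model slug meta-llama/llama-3.3-70b-instruct and the provider-routing fields (order, allow_fallbacks) are assumptions based on OpenRouter's published conventions; check the OpenRouter model page for the exact slug:

```python
import json
import os
import urllib.request

API_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_request(api_key, model, messages, provider=None):
    """Assemble the URL, headers, and JSON body for a chat completion."""
    body = {"model": model, "messages": messages}
    if provider:
        # Provider routing pins the request to a specific provider
        # (here, Cerebras) instead of OpenRouter's default routing.
        body["provider"] = provider
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    return API_URL, headers, body

def chat(api_key, model, messages, provider=None):
    """Send the request and return the parsed JSON response."""
    url, headers, body = build_request(api_key, model, messages, provider)
    req = urllib.request.Request(
        url, data=json.dumps(body).encode("utf-8"), headers=headers
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Only hit the network when a key is actually configured.
if os.environ.get("OPENROUTER_API_KEY"):
    reply = chat(
        os.environ["OPENROUTER_API_KEY"],
        "meta-llama/llama-3.3-70b-instruct",  # assumed model slug
        [{"role": "user", "content": "Why does inference speed matter?"}],
        provider={"order": ["Cerebras"], "allow_fallbacks": False},
    )
    print(reply["choices"][0]["message"]["content"])
```

The same pattern works with the OpenAI Python SDK by pointing its base URL at https://openrouter.ai/api/v1.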
Try Structured Outputs + Make a Tool Call
You did it: your first API call is complete! Now let’s explore how to make your model smarter at handling tasks and more precise in how it formats its responses via structured outputs and tool calling. See the examples below for how to use both.
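A sketch of both request shapes, assuming OpenRouter's OpenAI-compatible tools and response_format fields. The get_weather tool and the weather_report schema are hypothetical examples, not part of any real API:

```python
import json

# Hypothetical tool: a function the model may choose to call.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

def tool_call_body(model, user_message):
    """Chat-completion body that lets the model call get_weather."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "tools": [WEATHER_TOOL],
        "tool_choice": "auto",  # model decides whether a tool call is needed
    }

def structured_output_body(model, user_message):
    """Chat-completion body that constrains the reply to a JSON schema."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "response_format": {
            "type": "json_schema",
            "json_schema": {
                "name": "weather_report",
                "strict": True,
                "schema": {
                    "type": "object",
                    "properties": {
                        "city": {"type": "string"},
                        "temperature_c": {"type": "number"},
                    },
                    "required": ["city", "temperature_c"],
                    "additionalProperties": False,
                },
            },
        },
    }

# Either body is POSTed to https://openrouter.ai/api/v1/chat/completions
# with the same Authorization header as a plain chat request.
print(json.dumps(tool_call_body("meta-llama/llama-3.3-70b-instruct",
                                "Weather in Paris?"), indent=2))
```

With tools, the response contains a tool_calls entry whose arguments you parse and execute yourself; with response_format, the message content itself is JSON matching your schema.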
Differences Between Cerebras Cloud and OpenRouter
Cerebras Cloud primarily serves free-tier users and high-throughput startups that need a dedicated plan for their inference workloads. OpenRouter acts as one of our “pay-as-you-go” providers.
Certain models, such as DeepSeek R1 Distill 70B, can only be accessed on Cerebras Cloud with a paid plan.