Production Models

Production models are fully supported offerings intended for use in production environments.

| Model Name | Model ID | Parameters | Speed (tokens/s) |
| --- | --- | --- | --- |
| Llama 4 Scout | llama-4-scout-17b-16e-instruct | 109 billion | ~2600 |
| Llama 3.1 8B | llama3.1-8b | 8 billion | ~2200 |
| Llama 3.3 70B | llama-3.3-70b | 70 billion | ~2100 |
| Qwen 3 32B | qwen-3-32b | 32 billion | ~2100 |
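
To make the Speed column concrete, the sketch below turns the approximate tokens/s figures listed above into a rough generation-time estimate. The dictionary and the function name `estimate_generation_seconds` are hypothetical helpers introduced for illustration, not part of any Cerebras SDK, and the estimate ignores network latency, queueing, and prompt processing.

```python
# Approximate output speeds (tokens/s), copied from the production models listed above.
# The dictionary name and helper below are illustrative, not an official API.
PRODUCTION_SPEED_TOKENS_PER_S = {
    "llama-4-scout-17b-16e-instruct": 2600,
    "llama3.1-8b": 2200,
    "llama-3.3-70b": 2100,
    "qwen-3-32b": 2100,
}


def estimate_generation_seconds(model_id: str, output_tokens: int) -> float:
    """Rough time to generate `output_tokens` at the listed speed.

    Ignores network latency, queueing, and prompt processing, so real
    end-to-end times will be higher.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return output_tokens / PRODUCTION_SPEED_TOKENS_PER_S[model_id]


if __name__ == "__main__":
    for model_id in PRODUCTION_SPEED_TOKENS_PER_S:
        seconds = estimate_generation_seconds(model_id, 1000)
        print(f"{model_id}: ~{seconds:.2f} s for 1000 output tokens")
```

Under these assumptions, 1000 output tokens on Llama 4 Scout take roughly 0.38 s. Because the listed speeds are approximate, treat the result as an order-of-magnitude guide only.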

Preview Models

Preview models are hosted on Cerebras with full accuracy and performance. They are intended for evaluation only and should not be used in production, because they may be discontinued on short notice.

| Model Name | Model ID | Parameters | Speed (tokens/s) |
| --- | --- | --- | --- |
| Qwen 3 235B | qwen-3-235b-a22b | 235 billion | ~1500 |
| DeepSeek R1 Distill Llama 70B* | deepseek-r1-distill-llama-70b | 70 billion | ~1700 |

* DeepSeek R1 Distill Llama 70B is available in private preview. Please contact us to request access.
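
One way to enforce the production/preview distinction in application code is a small registry that rejects preview model IDs unless the caller opts in explicitly. The following is a minimal sketch: `ModelInfo`, `MODELS`, and `resolve_model` are hypothetical names introduced here, and only the model IDs and tiers come from the lists on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelInfo:
    model_id: str
    tier: str  # "production" or "preview"
    private_preview: bool = False  # access must be requested


# Model IDs and tiers as listed on this page; the registry itself is illustrative.
MODELS = {
    m.model_id: m
    for m in (
        ModelInfo("llama-4-scout-17b-16e-instruct", "production"),
        ModelInfo("llama3.1-8b", "production"),
        ModelInfo("llama-3.3-70b", "production"),
        ModelInfo("qwen-3-32b", "production"),
        ModelInfo("qwen-3-235b-a22b", "preview"),
        ModelInfo("deepseek-r1-distill-llama-70b", "preview", private_preview=True),
    )
}


def resolve_model(model_id: str, *, allow_preview: bool = False) -> ModelInfo:
    """Return the registry entry, refusing preview models by default."""
    info = MODELS.get(model_id)
    if info is None:
        raise KeyError(f"unknown model ID: {model_id!r}")
    if info.tier == "preview" and not allow_preview:
        raise ValueError(
            f"{model_id!r} is a preview model and may be discontinued on short "
            "notice; pass allow_preview=True for evaluation use"
        )
    return info


if __name__ == "__main__":
    print(resolve_model("llama-3.3-70b"))
    print(resolve_model("qwen-3-235b-a22b", allow_preview=True))
```

Making preview use an explicit opt-in keeps evaluation experiments possible while preventing a preview model ID from reaching a production configuration unnoticed.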