Models
Supported Models
Production Models
Production models are fully supported offerings intended for use in production environments.
Model Name | Model ID | Parameters | Speed (tokens/s) |
---|---|---|---|
Llama 4 Scout | llama-4-scout-17b-16e-instruct | 109 billion | ~2600 |
Llama 3.1 8B | llama3.1-8b | 8 billion | ~2200 |
Llama 3.3 70B | llama-3.3-70b | 70 billion | ~2100 |
Qwen 3 32B | qwen-3-32b | 32 billion | ~2100 |
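The speed column can be turned into a rough estimate of generation time. The sketch below is illustrative only: the dictionary and the `estimate_generation_seconds` helper are hypothetical names, not part of any Cerebras SDK, and the figures are the approximate values from the table above. The estimate ignores network latency, queueing, and prompt processing.

```python
# Approximate output speeds (tokens/s), copied from the production table above.
# These are rounded figures, so treat every result as a ballpark number.
PRODUCTION_SPEEDS = {
    "llama-4-scout-17b-16e-instruct": 2600,
    "llama3.1-8b": 2200,
    "llama-3.3-70b": 2100,
    "qwen-3-32b": 2100,
}


def estimate_generation_seconds(model_id: str, output_tokens: int) -> float:
    """Estimate the seconds needed to generate `output_tokens` tokens.

    Hypothetical helper: divides the token count by the listed speed.
    Raises KeyError if the model ID is not in the table.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return output_tokens / PRODUCTION_SPEEDS[model_id]


if __name__ == "__main__":
    for model_id in PRODUCTION_SPEEDS:
        seconds = estimate_generation_seconds(model_id, 1000)
        print(f"{model_id}: ~{seconds:.2f} s per 1000 output tokens")
```

For example, at ~2600 tokens/s a 1300-token response would take roughly half a second of generation time.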
Preview Models
Preview models are hosted on Cerebras with full accuracy and performance. They are intended for evaluation only and should not be used in production, as they may be discontinued on short notice.
Model Name | Model ID | Parameters | Speed (tokens/s) |
---|---|---|---|
Qwen 3 235B | qwen-3-235b-a22b | 235 billion | ~1500 |
DeepSeek R1 Distill Llama 70B* | deepseek-r1-distill-llama-70b | 70 billion | ~1700 |
* DeepSeek R1 Distill Llama 70B is available in private preview. Please contact us to request access.
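Because preview models may be discontinued on short notice, it can help to guard production configuration against preview model IDs. The following is a minimal sketch, assuming the model IDs listed in the two tables above; the set names and the `require_production_model` function are hypothetical and not part of any Cerebras SDK.

```python
# Model IDs copied from the tables above. This is a snapshot that has to be
# maintained by hand; it is an assumption of this sketch, not an API response.
PRODUCTION_MODELS = frozenset({
    "llama-4-scout-17b-16e-instruct",
    "llama3.1-8b",
    "llama-3.3-70b",
    "qwen-3-32b",
})
PREVIEW_MODELS = frozenset({
    "qwen-3-235b-a22b",
    "deepseek-r1-distill-llama-70b",  # private preview, access on request
})


def require_production_model(model_id: str) -> str:
    """Return `model_id` unchanged if it is a production model.

    Raises ValueError for preview models and for IDs not listed here.
    """
    if model_id in PRODUCTION_MODELS:
        return model_id
    if model_id in PREVIEW_MODELS:
        raise ValueError(
            f"{model_id!r} is a preview model intended for evaluation only"
        )
    raise ValueError(f"{model_id!r} is not a known model ID")


if __name__ == "__main__":
    print(require_production_model("llama-3.3-70b"))
    try:
        require_production_model("qwen-3-235b-a22b")
    except ValueError as err:
        print(f"rejected: {err}")
```

Calling this check once at application startup, rather than per request, keeps the failure early and visible.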