| Category | Use Case | Large (>200B) | Medium (20B–200B) | Small (<20B) | Why Cerebras? |
|---|---|---|---|---|---|
| Code & Development | Code generation & reasoning | Kimi K2.6, GLM 5.1, GLM 4.7 (Public) | Gemma 4 31B, Qwen3 32B | | More reasoning over code, requirements, and edge cases without breaking developer flow. |
| | Code completion & bug fixing | MiniMax M2.5, GLM 4.7 (Public) | Gemma 4 31B, Qwen3 32B | | Generate, critique, and repair code in multiple passes at the speed you type. |
| | Terminal tasks | Kimi K2.6, GLM 5.1, MiniMax M2.5, GLM 4.7 (Public) | GPT OSS 120B (Public) | | Agents can reason between commands, inspect results, and continue acting while the experience remains interactive. |
| AI-Powered Apps | Agents with tool use | Kimi K2.6, MiniMax M2.5, GLM 5.1, GLM 4.7 (Public) | GPT OSS 120B (Public), Qwen3 32B | | More tool calls, plan/act/observe loops, and recovery attempts per turn, keeping users engaged. |
| | General reasoning & planning | Kimi K2.6, MiniMax M2.5, GLM 5.1, GLM 4.7 (Public) | Gemma 4 31B | | More planning, comparison, and verification steps within the same practical response window. |
| | Summarization | MiniMax M2.5 | Gemma 4 31B, GPT OSS 120B (Public) | | Longer context, deeper synthesis, and less aggressive compression without making users wait. |
| | Low-latency NLU & extraction | | GPT OSS 120B (Public) | | Inline extraction with validation, correction, and structured outputs fast enough for production workflows. |
| Vision & Multimodal | Vision & document understanding | Kimi K2.6 | Gemma 4 31B | | Richer reasoning across text, image, and other inputs while keeping multimodal workflows responsive. |

Migrate from Closed Models
If you’re moving from Claude, GPT, or Gemini, here are open-source alternatives available on Cerebras.

| Provider | Closed Source | Use Case | Open Source Alternatives |
|---|---|---|---|
| Claude | Claude Opus 4.7 | Complex multi-step reasoning where end-to-end correctness is crucial | Kimi K2.6, GLM 5.1 |
| | Claude Sonnet 4.6 | Multi-file refactors, agentic coding loops, code review | Kimi K2.6, GLM 5.1, MiniMax M2.5, GLM 4.7 |
| | Claude Haiku 4.5 | Customer support, classification, extraction, short-form generation, sub-agents in multi-agent systems | Gemma 4 31B, GPT OSS 120B, MiniMax M2.5 |
| OpenAI GPT | GPT 5.5 | Frontier reasoning, complex coding, long agentic chains | Kimi K2.6, GLM 5.1 |
| | GPT 5.4 Nano/Mini | Balanced reasoning and coding, sub-agents in multi-agent systems, structured tasks | MiniMax M2.5, GLM 4.7, Gemma 4 31B, GPT OSS 120B |
| Gemini | Gemini 3.1 Pro | Image understanding for coding, document analysis, and scientific tasks | Kimi K2.6, GLM 5.1 |
| | Gemini 3.1 Pro Flash & Flash Lite | Low-latency multimodal chat and tool calling for real-time UX | Gemma 4 31B, GPT OSS 120B, MiniMax M2.5 |
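
If you want to drive a migration from configuration rather than by hand, the table above can be encoded as a simple lookup. The sketch below is illustrative only: `MIGRATION_TABLE` and `open_alternatives` are hypothetical names, and the strings are the display names from the table, not API model identifiers, so you would still need to map each one to the model ID your account exposes.

```python
# Display names copied from the migration table above.
# Assumption: these are labels for humans, NOT API model IDs.
MIGRATION_TABLE = {
    "Claude Opus 4.7": ["Kimi K2.6", "GLM 5.1"],
    "Claude Sonnet 4.6": ["Kimi K2.6", "GLM 5.1", "MiniMax M2.5", "GLM 4.7"],
    "Claude Haiku 4.5": ["Gemma 4 31B", "GPT OSS 120B", "MiniMax M2.5"],
    "GPT 5.5": ["Kimi K2.6", "GLM 5.1"],
    "GPT 5.4 Nano/Mini": ["MiniMax M2.5", "GLM 4.7", "Gemma 4 31B", "GPT OSS 120B"],
    "Gemini 3.1 Pro": ["Kimi K2.6", "GLM 5.1"],
    "Gemini 3.1 Pro Flash & Flash Lite": ["Gemma 4 31B", "GPT OSS 120B", "MiniMax M2.5"],
}


def open_alternatives(closed_model: str) -> list[str]:
    """Return the table's open-source candidates for a closed model.

    Hypothetical helper: matching is case-insensitive on the display name,
    and an unknown model yields an empty list rather than an error.
    """
    key = closed_model.strip().lower()
    for name, alternatives in MIGRATION_TABLE.items():
        if name.lower() == key:
            return list(alternatives)  # copy so callers cannot mutate the table
    return []


if __name__ == "__main__":
    print(open_alternatives("claude sonnet 4.6"))
```

The lists keep the order in which the table lists the alternatives, so a caller could try the first entry and fall back to later ones during evaluation.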

