Not sure where to start? Ask the assistant: it can recommend a model based on your use case.

Use this guide to find the right model for your use case on Cerebras. All models listed below are served on **[dedicated endpoints](/dedicated/overview)** for enterprise workloads with reserved capacity. A subset is also accessible on **public endpoints** with no additional setup.

For architectural guidance on getting the most out of Cerebras speed, see [Designing for Cerebras](/resources/designing-for-cerebras).

| Category | Use Case | Large (>200B) | Medium (20B–200B) | Small (\<20B) | Why Cerebras? |
| --- | --- | --- | --- | --- | --- |
| Code & Development | Code generation & reasoning | [Kimi K2.6](/dedicated/overview)<br />[GLM 5.1](/dedicated/overview)<br />[GLM 4.7](/models/zai-glm-47) *(Public)* | [Gemma 4 31B](/models/gemma-4-31b) *(Public)*<br />[Qwen3 32B](/dedicated/overview) | | More reasoning over code, requirements, and edge cases without breaking developer flows. |
| | Code completion & bug fixing | [MiniMax M2.5](/dedicated/overview)<br />[GLM 4.7](/models/zai-glm-47) *(Public)* | [Gemma 4 31B](/models/gemma-4-31b) *(Public)*<br />[Qwen3 32B](/dedicated/overview) | | Generate, critique, and repair code in multiple passes at the speed you type. |
| | Terminal tasks | [Kimi K2.6](/dedicated/overview)<br />[GLM 5.1](/dedicated/overview)<br />[MiniMax M2.5](/dedicated/overview)<br />[GLM 4.7](/models/zai-glm-47) *(Public)* | [GPT OSS 120B](/models/openai-oss) *(Public)* | | Agents can reason between commands, inspect results, and continue acting while the experience remains interactive. |
| AI-Powered Apps | Agents with tool use | [Kimi K2.6](/dedicated/overview)<br />[MiniMax M2.5](/dedicated/overview)<br />[GLM 5.1](/dedicated/overview)<br />[GLM 4.7](/models/zai-glm-47) *(Public)* | [GPT OSS 120B](/models/openai-oss) *(Public)*<br />[Qwen3 32B](/dedicated/overview) | | More tool calls, plan/act/observe loops, and recovery attempts per turn, keeping users engaged. |
| | General reasoning & planning | [Kimi K2.6](/dedicated/overview)<br />[MiniMax M2.5](/dedicated/overview)<br />[GLM 5.1](/dedicated/overview)<br />[GLM 4.7](/models/zai-glm-47) *(Public)* | [Gemma 4 31B](/dedicated/overview) | | More planning, comparison, and verification steps within the same practical response window. |
| | Summarization | [MiniMax M2.5](/dedicated/overview) | [Gemma 4 31B](/dedicated/overview)<br />[GPT OSS 120B](/models/openai-oss) *(Public)* | | Longer context, deeper synthesis, and less aggressive compression without making users wait. |
| | Low-latency NLU & extraction | | [GPT OSS 120B](/models/openai-oss) *(Public)* | | Inline extraction with validation, correction, and structured outputs fast enough for production workflows. |
| Vision & Multimodal | Vision & document understanding | [Kimi K2.6](/dedicated/overview) | [Gemma 4 31B](/dedicated/overview) | | Richer reasoning across text, image, and other inputs while keeping multimodal workflows responsive. |

Looking for the full dedicated model catalog? See [Dedicated Endpoints](/dedicated/overview).

## Migrate from Closed Models

If you're moving from Claude, GPT, or Gemini, here are open-source alternatives available on Cerebras.

| Provider | Closed Source | Use Case | Open Source Alternatives |
| --- | --- | --- | --- |
| Claude | Claude Opus 4.7 | Complex multi-step reasoning where end-to-end correctness is crucial | [Kimi K2.6](/dedicated/overview)<br />[GLM 5.1](/dedicated/overview) |
| | Claude Sonnet 4.7 | Multi-file refactors, agentic coding loops, code review | [Kimi K2.6](/dedicated/overview)<br />[GLM 5.1](/dedicated/overview)<br />[MiniMax M2.5](/dedicated/overview)<br />[GLM 4.7](/models/zai-glm-47) |
| | Claude Haiku 4.5 | Customer support, classification, extraction, short-form generation, sub-agents in multi-agent systems | [Gemma 4 31B](/models/gemma-4-31b) *(Public)*<br />[GPT OSS 120B](/models/openai-oss)<br />[MiniMax M2.5](/dedicated/overview) |
| OpenAI GPT | GPT 5.5 | Frontier reasoning, complex coding, long agentic chains | [Kimi K2.6](/dedicated/overview)<br />[GLM 5.1](/dedicated/overview) |
| | GPT 5.4 Nano/Mini | Balanced reasoning and coding, sub-agents in multi-agent systems, structured tasks | [MiniMax M2.5](/dedicated/overview)<br />[GLM 4.7](/models/zai-glm-47)<br />[Gemma 4 31B](/models/gemma-4-31b) *(Public)*<br />[GPT OSS 120B](/models/openai-oss) |
| Gemini | Gemini 3.1 Pro | Image understanding for coding, document analysis, and scientific tasks | [Kimi K2.6](/dedicated/overview)<br />[GLM 5.1](/dedicated/overview) |
| | Gemini 3.1 Pro Flash & Flash Lite | Low-latency multimodal chat and tool calling for real-time UX | [Gemma 4 31B](/models/gemma-4-31b) *(Public)*<br />[GPT OSS 120B](/models/openai-oss)<br />[MiniMax M2.5](/dedicated/overview) |

Explore the full model catalog at [Dedicated Endpoints](/dedicated/overview), or get started with the [Quickstart](/quickstart).
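
If you want to apply the selection table in code, for example to pick a default model per feature, one option is to encode the rows you care about as a small lookup. The sketch below is illustrative only: `Model`, `CANDIDATES`, and `recommend` are hypothetical names, only four rows of the table are included, and the strings are the display names from this guide rather than API model IDs.

```python
from typing import NamedTuple


class Model(NamedTuple):
    name: str      # display name as written in the selection table
    public: bool   # True when the table marks the entry as "(Public)"


# Hypothetical encoding of four rows from the selection table above.
# Order follows the table: Large column first, then Medium.
CANDIDATES: dict[str, list[Model]] = {
    "code generation & reasoning": [
        Model("Kimi K2.6", False),
        Model("GLM 5.1", False),
        Model("GLM 4.7", True),
        Model("Gemma 4 31B", True),
        Model("Qwen3 32B", False),
    ],
    "terminal tasks": [
        Model("Kimi K2.6", False),
        Model("GLM 5.1", False),
        Model("MiniMax M2.5", False),
        Model("GLM 4.7", True),
        Model("GPT OSS 120B", True),
    ],
    "summarization": [
        Model("MiniMax M2.5", False),
        Model("Gemma 4 31B", False),
        Model("GPT OSS 120B", True),
    ],
    "low-latency nlu & extraction": [
        Model("GPT OSS 120B", True),
    ],
}


def recommend(use_case: str, public_only: bool = False) -> list[str]:
    """Return candidate model names for a use case, in table order.

    With public_only=True, keep only entries marked "(Public)".
    """
    key = use_case.strip().lower()
    if key not in CANDIDATES:
        raise KeyError(f"No table row encoded for use case: {use_case!r}")
    return [m.name for m in CANDIDATES[key] if m.public or not public_only]


if __name__ == "__main__":
    print(recommend("Summarization", public_only=True))
    print(recommend("Terminal tasks"))
```

Filtering on the public flag keeps the list limited to models that need no additional setup; dropping the filter returns every candidate served on dedicated endpoints for that row.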