
# Choose a Model

> Find the right open-source model for your workload on Cerebras, including alternatives for Claude, GPT, and Gemini.

<div className="wide-content" />

<Tip>
  Not sure where to start? <a href="?assistant=which%20model%20should%20I%20use%3F" target="_self">Ask the assistant</a> — it can recommend a model based on your use case.
</Tip>

Use this guide to find the right model for your use case on Cerebras. Every model listed below is available on **[dedicated endpoints](/dedicated/overview)**, which provide reserved capacity for enterprise workloads. Models marked *(Public)* are also available on **public endpoints** with no additional setup.

For architectural guidance on getting the most out of Cerebras speed, see [Designing for Cerebras](/resources/designing-for-cerebras).

| Category            | Use Case                        | Large (>200B)                                                                                                                                               | Medium (20B–200B)                                                                     | Small (\<20B) | Why Cerebras?                                                                                                      |
| ------------------- | ------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------- | ------------- | ------------------------------------------------------------------------------------------------------------------ |
| Code & Development  | Code generation & reasoning     | [Kimi K2.6](/dedicated/overview)<br />[GLM 5.1](/dedicated/overview)<br />[GLM 4.7](/models/zai-glm-47) *(Public)*                                          | [Gemma 4 31B](/dedicated/overview)<br />[Qwen3 32B](/dedicated/overview)              |               | More reasoning over code, requirements, and edge cases without breaking developer flows.                           |
|                     | Code completion & bug fixing    | [MiniMax M2.5](/dedicated/overview)<br />[GLM 4.7](/models/zai-glm-47) *(Public)*                                                                           | [Gemma 4 31B](/dedicated/overview)<br />[Qwen3 32B](/dedicated/overview)              |               | Generate, critique, and repair code in multiple passes at the speed you type.                                      |
|                     | Terminal tasks                  | [Kimi K2.6](/dedicated/overview)<br />[GLM 5.1](/dedicated/overview)<br />[MiniMax M2.5](/dedicated/overview)<br />[GLM 4.7](/models/zai-glm-47) *(Public)* | [GPT OSS 120B](/models/openai-oss) *(Public)*                                         |               | Agents can reason between commands, inspect results, and continue acting while the experience remains interactive. |
| AI-Powered Apps     | Agents with tool use            | [Kimi K2.6](/dedicated/overview)<br />[MiniMax M2.5](/dedicated/overview)<br />[GLM 5.1](/dedicated/overview)<br />[GLM 4.7](/models/zai-glm-47) *(Public)* | [GPT OSS 120B](/models/openai-oss) *(Public)*<br />[Qwen3 32B](/dedicated/overview)   |               | More tool calls, plan/act/observe loops, and recovery attempts per turn, keeping users engaged.                    |
|                     | General reasoning & planning    | [Kimi K2.6](/dedicated/overview)<br />[MiniMax M2.5](/dedicated/overview)<br />[GLM 5.1](/dedicated/overview)<br />[GLM 4.7](/models/zai-glm-47) *(Public)* | [Gemma 4 31B](/dedicated/overview)                                                    |               | More planning, comparison, and verification steps within the same practical response window.                       |
|                     | Summarization                   | [MiniMax M2.5](/dedicated/overview)                                                                                                                         | [Gemma 4 31B](/dedicated/overview)<br />[GPT OSS 120B](/models/openai-oss) *(Public)* |               | Longer context, deeper synthesis, and less aggressive compression without making users wait.                       |
|                     | Low-latency NLU & extraction    |                                                                                                                                                             | [GPT OSS 120B](/models/openai-oss) *(Public)*                                         |               | Inline extraction with validation, correction, and structured outputs fast enough for production workflows.        |
| Vision & Multimodal | Vision & document understanding | [Kimi K2.6](/dedicated/overview)                                                                                                                            | [Gemma 4 31B](/dedicated/overview)                                                    |               | Richer reasoning across text, image, and other inputs while keeping multimodal workflows responsive.               |

<Tip>
  Looking for the full dedicated model catalog? See [Dedicated Endpoints](/dedicated/overview).
</Tip>
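
The selection table above can also be read as a lookup. The following is a minimal Python sketch, not part of any Cerebras SDK: it encodes four of the table's rows in a dictionary and filters candidates by size tier and by the *(Public)* marker. The names `RECOMMENDATIONS`, `PUBLIC_MODELS`, and `recommend` are hypothetical, and the strings are the display names from the table, not API model IDs.

```python
# Illustrative encoding of four rows of the selection table above.
# All names here are assumptions made for this sketch, not a Cerebras API.

# Models the table marks as *(Public)*.
PUBLIC_MODELS = {"GLM 4.7", "GPT OSS 120B"}

# use case -> size tier -> display names, copied from the table.
RECOMMENDATIONS = {
    "Code generation & reasoning": {
        "large": ["Kimi K2.6", "GLM 5.1", "GLM 4.7"],
        "medium": ["Gemma 4 31B", "Qwen3 32B"],
    },
    "Agents with tool use": {
        "large": ["Kimi K2.6", "MiniMax M2.5", "GLM 5.1", "GLM 4.7"],
        "medium": ["GPT OSS 120B", "Qwen3 32B"],
    },
    "Summarization": {
        "large": ["MiniMax M2.5"],
        "medium": ["Gemma 4 31B", "GPT OSS 120B"],
    },
    "Low-latency NLU & extraction": {
        "large": [],
        "medium": ["GPT OSS 120B"],
    },
}


def recommend(use_case, size=None, public_only=False):
    """Return candidate models for a use case.

    size: "large", "medium", or None for both tiers (large first).
    public_only: keep only models marked *(Public)* in the table.
    """
    tiers = RECOMMENDATIONS[use_case]
    sizes = [size] if size else ["large", "medium"]
    models = [m for s in sizes for m in tiers.get(s, [])]
    if public_only:
        models = [m for m in models if m in PUBLIC_MODELS]
    return models
```

For example, `recommend("Summarization", public_only=True)` returns `["GPT OSS 120B"]`, the only summarization candidate the table marks as public.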

## Migrate from Closed Models

If you're moving from Claude, GPT, or Gemini, here are open-source alternatives available on Cerebras.

| Provider   | Closed Source                     | Use Case                                                                                               | Open Source Alternatives                                                                                                                               |
| ---------- | --------------------------------- | ------------------------------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------ |
| Claude     | Claude Opus 4.7                   | Complex multi-step reasoning where end-to-end correctness is crucial                                   | [Kimi K2.6](/dedicated/overview)<br />[GLM 5.1](/dedicated/overview)                                                                                   |
|            | Claude Sonnet 4.6                 | Multi-file refactors, agentic coding loops, code review                                                | [Kimi K2.6](/dedicated/overview)<br />[GLM 5.1](/dedicated/overview)<br />[MiniMax M2.5](/dedicated/overview)<br />[GLM 4.7](/models/zai-glm-47)       |
|            | Claude Haiku 4.5                  | Customer support, classification, extraction, short-form generation, sub-agents in multi-agent systems | [Gemma 4 31B](/dedicated/overview)<br />[GPT OSS 120B](/models/openai-oss)<br />[MiniMax M2.5](/dedicated/overview)                                    |
| OpenAI GPT | GPT 5.5                           | Frontier reasoning, complex coding, long agentic chains                                                | [Kimi K2.6](/dedicated/overview)<br />[GLM 5.1](/dedicated/overview)                                                                                   |
|            | GPT 5.4 Nano/Mini                 | Balanced reasoning and coding, sub-agents in multi-agent systems, structured tasks                     | [MiniMax M2.5](/dedicated/overview)<br />[GLM 4.7](/models/zai-glm-47)<br />[Gemma 4 31B](/dedicated/overview)<br />[GPT OSS 120B](/models/openai-oss) |
| Gemini     | Gemini 3.1 Pro                    | Image understanding for coding, document analysis, and scientific tasks                                | [Kimi K2.6](/dedicated/overview)<br />[GLM 5.1](/dedicated/overview)                                                                                   |
|            | Gemini 3.1 Pro Flash & Flash Lite | Low-latency multimodal chat and tool calling for real-time UX                                          | [Gemma 4 31B](/dedicated/overview)<br />[GPT OSS 120B](/models/openai-oss)<br />[MiniMax M2.5](/dedicated/overview)                                    |

Explore the full model catalog at [Dedicated Endpoints](/dedicated/overview), or get started with the [Quickstart](/quickstart).
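
As a rough illustration, the migration table above can be encoded as a mapping from a closed model to its listed alternatives, split by whether each alternative is marked *(Public)* in the selection table earlier on this page. This sketch covers a subset of rows; `MIGRATION`, `PUBLIC_MODELS`, and `alternatives_for` are hypothetical names, and the strings are display names, not API model IDs.

```python
# Illustrative encoding of a subset of the migration table above.
# All names here are assumptions made for this sketch, not a Cerebras API.

# Models marked *(Public)* in the selection table on this page.
PUBLIC_MODELS = {"GLM 4.7", "GPT OSS 120B"}

# closed model -> open-source alternatives, copied from the table.
MIGRATION = {
    "Claude Opus 4.7": ["Kimi K2.6", "GLM 5.1"],
    "Claude Sonnet 4.6": ["Kimi K2.6", "GLM 5.1", "MiniMax M2.5", "GLM 4.7"],
    "Claude Haiku 4.5": ["Gemma 4 31B", "GPT OSS 120B", "MiniMax M2.5"],
    "GPT 5.5": ["Kimi K2.6", "GLM 5.1"],
    "GPT 5.4 Nano/Mini": ["MiniMax M2.5", "GLM 4.7", "Gemma 4 31B", "GPT OSS 120B"],
    "Gemini 3.1 Pro": ["Kimi K2.6", "GLM 5.1"],
}


def alternatives_for(closed_model):
    """Split a closed model's alternatives into public and dedicated-only."""
    candidates = MIGRATION[closed_model]
    return {
        "public": [m for m in candidates if m in PUBLIC_MODELS],
        "dedicated_only": [m for m in candidates if m not in PUBLIC_MODELS],
    }
```

Calling `alternatives_for("Claude Haiku 4.5")` places GPT OSS 120B under `"public"` and Gemma 4 31B and MiniMax M2.5 under `"dedicated_only"`, which is one way to see which migrations need a dedicated endpoint.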
