The Cerebras Code MCP Server accelerates code generation in your existing IDEs and CLI tools (such as Cursor and Claude Code) by up to 20x compared to GPUs. It leverages Cerebras fast inference through the Model Context Protocol (MCP) and offers optional graceful fallback via OpenRouter. MCP is an open standard that lets AI models securely interact with tools, data, and editors. Rather than limiting models to plain-text chat, MCP grants them structured access to external systems such as your IDE, allowing them to read, write, and modify code under consistent rules. This makes model-driven coding safer, more reliable, and more predictable.
Documentation Index
Fetch the complete documentation index at: https://inference-docs.cerebras.ai/llms.txt
Use this file to discover all available pages before exploring further.
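As a rough sketch of how a tool might use that index, the snippet below extracts page URLs from llms.txt-style content. The sample text is hypothetical, not the real file; in practice you would fetch https://inference-docs.cerebras.ai/llms.txt first.

```python
# Sketch: discover documentation pages from an llms.txt-style index.
# The `sample` content below is illustrative, not the real file.
import re

def extract_doc_links(llms_txt: str) -> list[str]:
    """Return all URLs referenced in markdown-style [title](url) links."""
    return re.findall(r"\[[^\]]*\]\((https?://[^)]+)\)", llms_txt)

sample = """# Cerebras Inference Docs
- [Quickstart](https://inference-docs.cerebras.ai/quickstart)
- [Models](https://inference-docs.cerebras.ai/models)
"""

for url in extract_doc_links(sample):
    print(url)
```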
The Cerebras Code MCP server is currently in research preview and is open source here. We welcome contributions!
Set up your API key
You need a valid Cerebras API key. Please visit this link and sign up, then click on API Keys in the left navigation.
Optionally, create an OpenRouter key here to use as a fallback if you hit Cerebras rate limits.
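A minimal sketch of resolving both keys from the environment is shown below. The environment variable names `CEREBRAS_API_KEY` and `OPENROUTER_API_KEY` are assumptions; check the MCP server's README for the exact names it expects.

```python
# Sketch of key resolution with optional OpenRouter fallback.
# Env var names CEREBRAS_API_KEY / OPENROUTER_API_KEY are assumptions.
import os

def resolve_keys() -> dict:
    """Read the required Cerebras key and the optional fallback key."""
    cerebras_key = os.environ.get("CEREBRAS_API_KEY")
    if not cerebras_key:
        raise RuntimeError("CEREBRAS_API_KEY is required")
    return {
        "cerebras": cerebras_key,
        # The fallback key is optional: used only on Cerebras rate limits.
        "openrouter": os.environ.get("OPENROUTER_API_KEY"),
    }

os.environ.setdefault("CEREBRAS_API_KEY", "csk-demo")  # demo value only
print(resolve_keys()["cerebras"])
```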
Available Models
The Cerebras Code MCP Server supports all Cerebras models:

| Model | Parameters | Best For |
|---|---|---|
| llama3.1-8b | 8B | Fastest option for simple tasks and high-throughput scenarios |
| gpt-oss-120b | 120B | Strong open-weight model for demanding tasks |
| zai-glm-4.7 | 357B | Largest model, with strong reasoning capabilities |
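To illustrate selecting one of these models, the sketch below builds a chat-completion payload. Cerebras exposes an OpenAI-compatible chat API, but the exact payload shape and the helper function here are assumptions for illustration; verify against the Cerebras inference docs before use.

```python
# Hypothetical sketch: build a chat-completion request for a chosen model.
# The payload shape assumes an OpenAI-compatible API; verify in the docs.
def build_chat_request(model: str, prompt: str) -> dict:
    """Validate the model name and build a chat-completion payload."""
    supported = {"llama3.1-8b", "gpt-oss-120b", "zai-glm-4.7"}
    if model not in supported:
        raise ValueError(f"unknown model: {model}")
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

req = build_chat_request("llama3.1-8b", "Write a hello-world in C.")
print(req["model"])
```

Validating the model name up front gives a clear local error instead of a remote API failure when a typo slips into a config.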

