Documentation Index
Fetch the complete documentation index at: https://inference-docs.cerebras.ai/llms.txt
Use this file to discover all available pages before exploring further.
Build at the Speed of Cerebras
Experience real-time AI responses across coding, reasoning, voice, and agentic workloads with the world’s fastest AI inference.
Explore Models
View our available models, including performance specifications, rate limits, and pricing details.
Dedicated Endpoints
Private, high-performance inference endpoints with reserved capacity and guaranteed throughput for production workloads.
Start building
- Designing for Cerebras — Architectural patterns for building on wafer-scale inference.
- OpenAI Compatibility — Migrate your existing code with minimal changes.
- Integrations — Plug into popular AI frameworks and tools.
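Because the API is OpenAI-compatible, pointing existing code at Cerebras usually means changing only the base URL, API key, and model name. A minimal sketch of building such a request (the base URL `https://api.cerebras.ai/v1` and the model name `llama3.1-8b` are assumptions; confirm both against the models page):

```python
import json
import os

# Assumed OpenAI-compatible base URL for Cerebras inference; verify in the docs.
BASE_URL = "https://api.cerebras.ai/v1"

def build_chat_request(prompt: str, model: str = "llama3.1-8b"):
    """Assemble the URL, headers, and JSON body for a chat completion call.

    Reads the API key from the CEREBRAS_API_KEY environment variable
    (variable name is an assumption for illustration).
    """
    url = f"{BASE_URL}/chat/completions"
    headers = {
        "Authorization": f"Bearer {os.environ.get('CEREBRAS_API_KEY', '')}",
        "Content-Type": "application/json",
    }
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return url, headers, body

url, headers, body = build_chat_request("Hello!")
print(url)
print(json.dumps(body, indent=2))
```

Sending the request is then a single `POST` with any HTTP client; official OpenAI SDKs can typically be reused by passing the Cerebras base URL and key to the client constructor.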

