Analytics
The Analytics page has three tabs: Usage, Cached-Usage, and Cost. All dates and timestamps are displayed in UTC.
- Usage
- Cached-Usage
- Cost
Track request volume and token consumption over a selected date range. Toggle Show quotas to overlay your rate limit thresholds and see how close you are to your limits. Use Download Report to export the data as a CSV.
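As a rough illustration of working with an exported report, the sketch below sums tokens per day and compares the totals against a daily token quota. The column names (`date`, `model`, `requests`, `input_tokens`, `output_tokens`), the sample rows, and the quota value are all assumptions made for this example; check the header row of your own CSV and the Limits page for the real values.

```python
import csv
import io

# Hypothetical export layout: the real Download Report columns may differ.
SAMPLE_CSV = """date,model,requests,input_tokens,output_tokens
2025-01-01,model-a,120,40000,9000
2025-01-01,model-b,30,5000,1000
2025-01-02,model-a,200,70000,15000
"""


def daily_token_totals(csv_text):
    """Sum input and output tokens per date across all models."""
    totals = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        tokens = int(row["input_tokens"]) + int(row["output_tokens"])
        totals[row["date"]] = totals.get(row["date"], 0) + tokens
    return totals


def quota_usage(totals, daily_token_quota):
    """Return the fraction of the daily quota consumed on each date."""
    return {day: used / daily_token_quota for day, used in totals.items()}


totals = daily_token_totals(SAMPLE_CSV)
# 100_000 is a placeholder quota, not a real plan limit.
usage = quota_usage(totals, daily_token_quota=100_000)
for day, fraction in sorted(usage.items()):
    print(f"{day}: {fraction:.0%} of daily token quota")
```

To read a file saved from the console instead of the inline sample, pass its contents to `daily_token_totals`, for example `open("usage.csv").read()`, where the file name is again a placeholder.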
Tips
- Monitor quota headroom: Enable Show quotas on the Usage tab to see how close you are to your rate limits. If you're consistently near the ceiling, consider distributing traffic across projects or requesting a limit increase.
- Optimize caching: If cache hits are low on the Cached-Usage tab, review whether your prompts have a stable, shared prefix. Effective caching reduces Time to First Token (TTFT) for long-context workloads.
- Track costs by model: Filter the Cost tab by model to compare spend. This helps when deciding whether a smaller, faster model is sufficient for a given use case.
- Debug usage spikes: Narrow the date range to isolate when a spike started, then cross-reference with Logs to identify the source.

Logs
The Logs page has two tabs: Request Logs and Audit Logs. All dates and timestamps are displayed in UTC.
- Request Logs
- Audit Logs
Inspect individual API calls by filtering on model, API key, date range, or HTTP status code. The status code chart at the top gives you a quick visual of error rates over time. Use Download Report to export the current view as a CSV.

When contacting support about a failed request, include the Request ID from the log entry.
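The same filtering can be repeated offline on an exported report. The sketch below keeps only 4xx and 5xx rows, counts them by status code, and collects the Request IDs you might pass to support. The column names (`timestamp`, `request_id`, `model`, `status_code`), the ISO timestamp format, and the sample rows are assumptions for illustration, not the documented export format.

```python
import csv
import io
from collections import Counter
from datetime import datetime, timezone

# Hypothetical export layout and placeholder IDs; check your own CSV header.
SAMPLE_CSV = """timestamp,request_id,model,status_code
2025-01-05T10:00:00,req-001,model-a,200
2025-01-05T10:00:05,req-002,model-a,429
2025-01-05T10:01:10,req-003,model-b,500
2025-01-05T10:02:00,req-004,model-a,429
"""


def failed_requests(csv_text):
    """Return rows with a 4xx or 5xx status, with timestamps marked as UTC."""
    failures = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        status = int(row["status_code"])
        if status >= 400:
            # The console displays UTC, so attach the UTC zone explicitly.
            when = datetime.fromisoformat(row["timestamp"]).replace(
                tzinfo=timezone.utc
            )
            failures.append(
                {"when": when, "request_id": row["request_id"], "status": status}
            )
    return failures


failures = failed_requests(SAMPLE_CSV)
by_status = Counter(f["status"] for f in failures)
request_ids = [f["request_id"] for f in failures]
print(by_status)
print(request_ids)
```

Because each `when` value is timezone-aware, calling `.astimezone()` on it converts the UTC time to your system's local time when you need to line it up with local application logs.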
Limits
The Limits page displays a table of models available to your organization or project, along with their associated rate limits. The table shows each model's name, context length, limit type (requests or tokens), and quota by minute and day. Hourly limits are shown in the org-level view only. If you need higher limits, contact us or reach out to your account representative.

The values shown on the Limits page are specific to your plan and project. Your limits may differ from examples in the documentation.
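Because a model has both a per-minute and a per-day quota, the rate you can sustain around the clock is bounded by whichever is tighter. The sketch below works through that arithmetic. The limit values are made-up placeholders, and it assumes the daily quota is a simple counter that a steady load consumes evenly; it does not model how limits are actually enforced.

```python
MINUTES_PER_DAY = 24 * 60


def sustained_per_minute(per_minute_limit, per_day_limit):
    """Highest steady per-minute rate that stays within both quotas."""
    return min(per_minute_limit, per_day_limit / MINUTES_PER_DAY)


def minutes_until_daily_exhausted(per_minute_limit, per_day_limit):
    """Minutes of running at the full per-minute limit before the day quota is used up."""
    return per_day_limit / per_minute_limit


# Placeholder values; read the real ones from the Limits page.
rpm, rpd = 30, 14_400

print(sustained_per_minute(rpm, rpd))           # steady rate that lasts all day
print(minutes_until_daily_exhausted(rpm, rpd))  # time to exhaustion at full speed
```

With these placeholder numbers the daily quota is the tighter constraint: running at the full per-minute limit would use up the day's allowance in 480 minutes, while a steady 10 requests per minute would last the whole day.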
Per-Org vs. Per-Project Limits
The limits shown depend on your current console context:

| Console context | Limits shown |
|---|---|
| All Projects view | Org-level limits |
| Specific project selected | Project-level limits for that project |

