The Cerebras Inference API uses standard HTTP response status codes to indicate the success or failure of an API request. In cases of errors, the SDK throws specific exceptions that inherit from cerebras.cloud.sdk.APIError. This documentation outlines the error types, explains how to handle them, and provides examples for effective error management.
Error Types
All errors in the Cerebras Inference API inherit from cerebras.cloud.sdk.APIError. The main categories of errors are:
- cerebras.cloud.sdk.APIConnectionError: Raised when the library is unable to connect to the API.
- cerebras.cloud.sdk.APIStatusError: Raised when the API returns a non-success status code (4xx or 5xx).
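The hierarchy above can be illustrated with a small sketch. The classes below are local stand-ins that only mirror the names described in this section, so the example runs without the SDK installed; the simplified APIStatusError constructor and the classify helper are assumptions made for illustration. In real code, import the classes of the same name from cerebras.cloud.sdk.

```python
# Local stand-ins that mirror the hierarchy described above.
# The real classes live in cerebras.cloud.sdk.
class APIError(Exception):
    """Base class: catching this catches every SDK error."""


class APIConnectionError(APIError):
    """The library could not connect to the API."""


class APIStatusError(APIError):
    """The API returned a 4xx or 5xx status code."""

    def __init__(self, message: str, status_code: int) -> None:
        # Simplified constructor (an assumption made for this sketch).
        super().__init__(message)
        self.status_code = status_code


def classify(exc: Exception) -> str:
    """Name the category an exception falls into (hypothetical helper)."""
    if isinstance(exc, APIConnectionError):
        return "connection"
    if isinstance(exc, APIStatusError):
        return f"status {exc.status_code}"
    if isinstance(exc, APIError):
        return "other API error"
    return "not an API error"
```

Because both categories share the APIError base class, a single except APIError clause is enough when the caller does not need to tell them apart.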
HTTP Status Codes
| Status Code | Error Type |
|---|---|
| 400 | BadRequestError |
| 401 | AuthenticationError |
| 402 | PaymentRequired |
| 403 | PermissionDeniedError |
| 404 | NotFoundError |
| 422 | UnprocessableEntityError |
| 429 | RateLimitError |
| 500 | InternalServerError |
| 503 | ServiceUnavailable |
| N/A | APIConnectionError |
Handling Errors
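A minimal sketch of catching these errors around a chat completion request. complete_safely is a hypothetical helper, not part of the SDK. The sketch assumes that RateLimitError is the SDK's class for 429 responses and a subclass of APIStatusError, which is why it is caught before the more general class, and that APIStatusError exposes a status_code attribute.

```python
def complete_safely(client, **request):
    """Send a chat completion request and turn SDK errors into a short message.

    Hypothetical helper: returns (response, None) on success
    or (None, message) on failure.
    """
    # Imported inside the function so this module loads even where the SDK
    # is not installed; the SDK is only needed when the helper is called.
    from cerebras.cloud.sdk import (
        APIConnectionError,
        APIStatusError,
        RateLimitError,
    )

    try:
        return client.chat.completions.create(**request), None
    except APIConnectionError as exc:
        # No HTTP response was received at all.
        return None, f"could not reach the API: {exc}"
    except RateLimitError:
        # 429 response. Assumed to subclass APIStatusError, so it must be
        # listed before the general handler below.
        return None, "rate limited (429); back off before retrying"
    except APIStatusError as exc:
        # Any other 4xx or 5xx response.
        return None, f"API returned status {exc.status_code}"
```

A caller would pass an already constructed client plus the usual request arguments, for example complete_safely(client, model=..., messages=[...]), and check whether the second element of the result is None.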
To handle different types of errors, wrap the request in a try/except block and catch the relevant exception classes.
Retries
By default, certain errors are automatically retried 2 times with a short exponential backoff. These include:
- Connection errors
- 408 Request Timeout
- >= 500 Internal errors
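The retry rule above can be sketched as a small loop. This illustrates the described behaviour and is not the SDK's internal code: ConnectionFailed, StatusError, is_retryable and call_with_retries are hypothetical names, and the 0.5 second base delay is an assumed value, since the text only says the backoff is short.

```python
import time


class ConnectionFailed(Exception):
    """Stand-in for a connection error (hypothetical name)."""


class StatusError(Exception):
    """Stand-in for a non-success HTTP response (hypothetical name)."""

    def __init__(self, status_code: int) -> None:
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


def is_retryable(exc: Exception) -> bool:
    """Apply the three cases listed above."""
    if isinstance(exc, ConnectionFailed):
        return True
    if isinstance(exc, StatusError):
        return exc.status_code == 408 or exc.status_code >= 500
    return False


def call_with_retries(fn, max_retries: int = 2, base_delay: float = 0.5,
                      sleep=time.sleep):
    """Call fn(); on a retryable error, wait and try again up to max_retries times."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except Exception as exc:
            # Give up on the last attempt or on errors outside the list.
            if attempt == max_retries or not is_retryable(exc):
                raise
            # Exponential backoff: base_delay, then 2x, then 4x, and so on.
            sleep(base_delay * 2 ** attempt)
```

With the default of 2 retries, a request is attempted at most three times in total: the initial call plus two retries.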
Use the max_retries option to change the number of retries.
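A sketch of setting the retry count, assuming the client constructor and with_options() both accept a max_retries argument. make_client and create_with_retries are hypothetical wrappers added here for illustration.

```python
def make_client(max_retries: int = 0):
    """Build a client that retries each request max_retries times (0 disables retries)."""
    # Imported lazily: the SDK must be installed and an API key configured
    # by the time this function is called.
    from cerebras.cloud.sdk import Cerebras

    return Cerebras(max_retries=max_retries)


def create_with_retries(client, max_retries: int, **request):
    """Override the retry count for a single request, leaving the client default untouched."""
    return client.with_options(max_retries=max_retries).chat.completions.create(**request)
```

Setting the value on the client changes the default for every request, while the per-request form only affects the call it is chained onto.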
Timeouts
Requests time out after 1 minute by default. You can configure this with the timeout option.
On timeout, an APITimeoutError is thrown. Note that requests that time out are retried twice by default.
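A sketch of a per-request timeout with explicit handling of the timeout error, assuming with_options() accepts a timeout in seconds and that APITimeoutError can be imported from cerebras.cloud.sdk. create_with_timeout is a hypothetical helper.

```python
def create_with_timeout(client, seconds: float, **request):
    """Send one request with its own timeout; return None if it times out."""
    # Imported lazily so this module loads without the SDK installed.
    from cerebras.cloud.sdk import APITimeoutError

    try:
        return client.with_options(timeout=seconds).chat.completions.create(**request)
    except APITimeoutError:
        # With the default settings, the request has already been
        # retried before this exception reaches the caller.
        return None
```

Because timed-out requests are retried, the worst-case wait is longer than the timeout itself: with a 5 second timeout and the default two retries, a call can take roughly 15 seconds plus backoff before giving up.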
