Skip to main content
GET
/
public
/
v1
/
models
curl -sS 'https://api.cerebras.ai/public/v1/models' | jq
{
  "object": "list",
  "data": [
    {
      "id": "gpt-oss-120b",
      "object": "model",
      "created": 1754438400,
      "owned_by": "OpenAI",
      "name": "OpenAI GPT OSS",
      "description": "This model excels at efficient reasoning across science, math, and coding applications. It's ideal for real-time coding assistance, processing large documents for Q&A and summarization, agentic research workflows, and regulated on-premises workloads.",
      "hugging_face_id": "openai/gpt-oss-120b",
      "pricing": {
        "prompt": "0.00000035",
        "completion": "0.00000075"
      },
      "capabilities": {
        "streaming": true,
        "function_calling": true,
        "structured_outputs": true,
        "vision": false,
        "json_mode": true,
        "tools": true,
        "tool_choice": true,
        "parallel_tool_calls": false,
        "response_format": true,
        "reasoning": true
      },
      "supported_parameters": {
        "temperature": true,
        "top_p": true,
        "seed": true,
        "stop": true,
        "max_completion_tokens": true,
        "logprobs": true,
        "top_logprobs": true,
        "frequency_penalty": true,
        "presence_penalty": true,
        "logit_bias": true,
        "repetition_penalty": false
      },
      "architecture": {
        "modality": "text",
        "tokenizer": "GPT",
        "instruct_type": "harmony"
      },
      "limits": {
        "max_context_length": 131072,
        "max_completion_tokens": 40960,
        "requests_per_minute": null,
        "tokens_per_minute": null
      },
      "datacenter_locations": [],
      "deprecated": false,
      "preview": false,
      "quantization": "FP16/8 (weights only)"
    }
  ]
}

Documentation Index

Fetch the complete documentation index at: https://inference-docs.cerebras.ai/llms.txt

Use this file to discover all available pages before exploring further.

Retrieve details about publicly available Cerebras models without an API key, including context length, pricing, and supported features. This endpoint supports multiple response formats for compatibility with different platforms.
This endpoint is public and does not require an API key.

Supported Formats

The endpoint supports three response formats via the format query parameter:
  • Default (Cerebras) - Native Cerebras format
  • OpenRouter - OpenRouter-compatible format
  • HuggingFace - HuggingFace-compatible format

Query Parameters

format
string
Response format. Options: openrouter, huggingface. Omit for default Cerebras format.

Response

Default Format

object
string
The object type, which is always list for list responses or model for single model responses.
data
array
Array of model objects (only in list responses).

OpenRouter Format

data
array
Array of model objects with extended metadata.

HuggingFace Format

data
array
Array of model objects.
curl -sS 'https://api.cerebras.ai/public/v1/models' | jq
{
  "object": "list",
  "data": [
    {
      "id": "gpt-oss-120b",
      "object": "model",
      "created": 1754438400,
      "owned_by": "OpenAI",
      "name": "OpenAI GPT OSS",
      "description": "This model excels at efficient reasoning across science, math, and coding applications. It's ideal for real-time coding assistance, processing large documents for Q&A and summarization, agentic research workflows, and regulated on-premises workloads.",
      "hugging_face_id": "openai/gpt-oss-120b",
      "pricing": {
        "prompt": "0.00000035",
        "completion": "0.00000075"
      },
      "capabilities": {
        "streaming": true,
        "function_calling": true,
        "structured_outputs": true,
        "vision": false,
        "json_mode": true,
        "tools": true,
        "tool_choice": true,
        "parallel_tool_calls": false,
        "response_format": true,
        "reasoning": true
      },
      "supported_parameters": {
        "temperature": true,
        "top_p": true,
        "seed": true,
        "stop": true,
        "max_completion_tokens": true,
        "logprobs": true,
        "top_logprobs": true,
        "frequency_penalty": true,
        "presence_penalty": true,
        "logit_bias": true,
        "repetition_penalty": false
      },
      "architecture": {
        "modality": "text",
        "tokenizer": "GPT",
        "instruct_type": "harmony"
      },
      "limits": {
        "max_context_length": 131072,
        "max_completion_tokens": 40960,
        "requests_per_minute": null,
        "tokens_per_minute": null
      },
      "datacenter_locations": [],
      "deprecated": false,
      "preview": false,
      "quantization": "FP16/8 (weights only)"
    }
  ]
}

Retrieve Specific Model

You can also retrieve information about a specific model by appending the model ID to the endpoint:
GET https://api.cerebras.ai/public/v1/models/{model_id}

Path Parameters

model_id
string
required
The ID of the model to retrieve (e.g., gpt-oss-120b, zai-glm-4.7).
curl -sS 'https://api.cerebras.ai/public/v1/models/gpt-oss-120b' | jq
{
  "id": "gpt-oss-120b",
  "object": "model",
  "created": 1754438400,
  "owned_by": "OpenAI",
  "name": "OpenAI GPT OSS",
  "description": "This model excels at efficient reasoning across science, math, and coding applications. It's ideal for real-time coding assistance, processing large documents for Q&A and summarization, agentic research workflows, and regulated on-premises workloads.",
  "hugging_face_id": "openai/gpt-oss-120b",
  "pricing": {
    "prompt": "0.00000035",
    "completion": "0.00000075"
  },
  "capabilities": {
    "streaming": true,
    "function_calling": true,
    "structured_outputs": true,
    "vision": false,
    "json_mode": true,
    "tools": true,
    "tool_choice": true,
    "parallel_tool_calls": false,
    "response_format": true,
    "reasoning": true
  },
  "limits": {
    "max_context_length": 131072,
    "max_completion_tokens": 40960,
    "requests_per_minute": null,
    "tokens_per_minute": null
  },
  "deprecated": false,
  "preview": false,
  "quantization": "FP16/8 (weights only)"
}

Use Cases

Platform Integration - Use the OpenRouter or HuggingFace formats to integrate Cerebras models into existing platforms that support these standards. Model Discovery - Programmatically discover available models and their capabilities without authentication. Pricing Comparison - Compare pricing across different models using the structured pricing information. Feature Detection - Check which features (streaming, tools, JSON mode) are supported by each model.