> ## Documentation Index
> Fetch the complete documentation index at: https://inference-docs.cerebras.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# Public models

Retrieve details about publicly available Cerebras models without an API key, including context length, pricing, and supported features. This endpoint supports multiple response formats for compatibility with different platforms.

<Note>
  This endpoint is **public** and does not require an API key.
</Note>

## Supported Formats

The endpoint supports three response formats via the `format` query parameter:

* **Default (Cerebras)** - Native Cerebras format
* **OpenRouter** - OpenRouter-compatible format
* **HuggingFace** - HuggingFace-compatible format

## Query Parameters

<ParamField query="format" type="string">
  Response format. Options: `openrouter`, `huggingface`. Omit for default Cerebras format.
</ParamField>

## Response

### Default Format

<ResponseField name="object" type="string">
  The object type, which is always `list` for list responses or `model` for single model responses.
</ResponseField>

<ResponseField name="data" type="array">
  Array of model objects (only in list responses).

  <Expandable title="model properties">
    <ResponseField name="id" type="string">
      The model identifier (e.g., `gpt-oss-120b`).
    </ResponseField>

    <ResponseField name="object" type="string">
      The object type, which is always `model`.
    </ResponseField>

    <ResponseField name="created" type="integer">
      The Unix timestamp (in seconds) of when the model was created.
    </ResponseField>

    <ResponseField name="owned_by" type="string">
      The organization that owns the model.
    </ResponseField>

    <ResponseField name="name" type="string">
      Human-readable model name.
    </ResponseField>

    <ResponseField name="description" type="string">
      Description of the model's capabilities and recommended use cases.
    </ResponseField>

    <ResponseField name="hugging_face_id" type="string">
      The HuggingFace repository identifier for this model.
    </ResponseField>

    <ResponseField name="pricing" type="object">
      Pricing per token in USD.

      <Expandable title="pricing fields">
        <ResponseField name="prompt" type="string">Cost per prompt token.</ResponseField>
        <ResponseField name="completion" type="string">Cost per completion token.</ResponseField>
      </Expandable>
    </ResponseField>

    <ResponseField name="capabilities" type="object">
      Boolean flags for supported features.

      <Expandable title="capability fields">
        <ResponseField name="streaming" type="boolean">Whether streaming is supported.</ResponseField>
        <ResponseField name="function_calling" type="boolean">Whether function calling is supported.</ResponseField>
        <ResponseField name="structured_outputs" type="boolean">Whether structured outputs are supported.</ResponseField>
        <ResponseField name="vision" type="boolean">Whether vision/image input is supported.</ResponseField>
        <ResponseField name="json_mode" type="boolean">Whether JSON mode is supported.</ResponseField>
        <ResponseField name="tools" type="boolean">Whether tool use is supported.</ResponseField>
        <ResponseField name="tool_choice" type="boolean">Whether tool choice is supported.</ResponseField>
        <ResponseField name="parallel_tool_calls" type="boolean">Whether parallel tool calls are supported.</ResponseField>
        <ResponseField name="response_format" type="boolean">Whether response format control is supported.</ResponseField>
        <ResponseField name="reasoning" type="boolean">Whether the model supports reasoning.</ResponseField>
      </Expandable>
    </ResponseField>

    <ResponseField name="supported_parameters" type="object">
      Boolean flags for supported API parameters.
    </ResponseField>

    <ResponseField name="architecture" type="object">
      Model architecture details.

      <Expandable title="architecture fields">
        <ResponseField name="modality" type="string">Input/output modality (e.g., `text`).</ResponseField>
        <ResponseField name="tokenizer" type="string">Tokenizer type.</ResponseField>
        <ResponseField name="instruct_type" type="string">Instruction tuning format.</ResponseField>
      </Expandable>
    </ResponseField>

    <ResponseField name="limits" type="object">
      Model limits.

      <Expandable title="limit fields">
        <ResponseField name="max_context_length" type="integer">Maximum context window in tokens.</ResponseField>
        <ResponseField name="max_completion_tokens" type="integer">Maximum completion tokens.</ResponseField>
        <ResponseField name="requests_per_minute" type="integer">Requests per minute limit (null if unlimited).</ResponseField>
        <ResponseField name="tokens_per_minute" type="integer">Tokens per minute limit (null if unlimited).</ResponseField>
      </Expandable>
    </ResponseField>

    <ResponseField name="deprecated" type="boolean">
      Whether this model is deprecated.
    </ResponseField>

    <ResponseField name="preview" type="boolean">
      Whether this model is in preview.
    </ResponseField>

    <ResponseField name="quantization" type="string">
      Quantization method used for the model.
    </ResponseField>
  </Expandable>
</ResponseField>

### OpenRouter Format

<ResponseField name="data" type="array">
  Array of model objects with extended metadata.

  <Expandable title="extended properties">
    <ResponseField name="id" type="string">
      The model identifier.
    </ResponseField>

    <ResponseField name="hugging_face_id" type="string">
      The HuggingFace repository identifier.
    </ResponseField>

    <ResponseField name="name" type="string">
      Human-readable model name.
    </ResponseField>

    <ResponseField name="input_modalities" type="array">
      Supported input types (e.g., `["text"]`).
    </ResponseField>

    <ResponseField name="output_modalities" type="array">
      Supported output types (e.g., `["text"]`).
    </ResponseField>

    <ResponseField name="context_length" type="integer">
      Maximum context window size in tokens.
    </ResponseField>

    <ResponseField name="max_output_length" type="integer">
      Maximum output length in tokens.
    </ResponseField>

    <ResponseField name="pricing" type="object">
      Pricing information per token.

      <Expandable title="pricing fields">
        <ResponseField name="prompt" type="string">
          Cost per prompt token.
        </ResponseField>

        <ResponseField name="completion" type="string">
          Cost per completion token.
        </ResponseField>

        <ResponseField name="request" type="string">
          Cost per request (typically `"0"`).
        </ResponseField>

        <ResponseField name="image" type="string">
          Cost per image token (typically `"0"` for text-only models).
        </ResponseField>

        <ResponseField name="input_cache_read" type="string">
          Cost per cached input token read (typically `"0"`).
        </ResponseField>

        <ResponseField name="input_cache_write" type="string">
          Cost per cached input token write (typically `"0"`).
        </ResponseField>
      </Expandable>
    </ResponseField>

    <ResponseField name="supported_sampling_parameters" type="array">
      List of supported sampling parameters.
    </ResponseField>

    <ResponseField name="supported_features" type="array">
      List of supported features (e.g., `["tools", "json_mode", "structured_outputs"]`).
    </ResponseField>
  </Expandable>
</ResponseField>

### HuggingFace Format

<ResponseField name="data" type="array">
  Array of model objects.

  <Expandable title="properties">
    <ResponseField name="id" type="string">
      The model identifier.
    </ResponseField>

    <ResponseField name="hugging_face_id" type="string">
      The HuggingFace repository identifier.
    </ResponseField>

    <ResponseField name="object" type="string">
      The object type, which is always `model`.
    </ResponseField>

    <ResponseField name="created" type="integer">
      The Unix timestamp (in seconds) of when the model was created.
    </ResponseField>

    <ResponseField name="owned_by" type="string">
      The organization that owns the model.
    </ResponseField>

    <ResponseField name="pricing" type="object">
      Pricing in USD per million tokens.

      <Expandable title="pricing fields">
        <ResponseField name="input" type="number">
          Price per million input tokens.
        </ResponseField>

        <ResponseField name="output" type="number">
          Price per million output tokens.
        </ResponseField>
      </Expandable>
    </ResponseField>

    <ResponseField name="context_length" type="integer">
      Maximum context window size in tokens.
    </ResponseField>

    <ResponseField name="capabilities" type="object">
      Boolean flags for supported features.
    </ResponseField>
  </Expandable>
</ResponseField>

<RequestExample>
  ```bash cURL (Default) theme={null}
  curl -sS 'https://api.cerebras.ai/public/v1/models' | jq
  ```

  ```bash cURL (OpenRouter) theme={null}
  curl -sS 'https://api.cerebras.ai/public/v1/models?format=openrouter' | jq
  ```

  ```bash cURL (HuggingFace) theme={null}
  curl -sS 'https://api.cerebras.ai/public/v1/models?format=huggingface' | jq
  ```

  ```python Python theme={null}
  import httpx

  # Default format
  response = httpx.get("https://api.cerebras.ai/public/v1/models")
  models = response.json()

  # OpenRouter format
  response = httpx.get(
      "https://api.cerebras.ai/public/v1/models",
      params={"format": "openrouter"}
  )
  models = response.json()

  # HuggingFace format
  response = httpx.get(
      "https://api.cerebras.ai/public/v1/models",
      params={"format": "huggingface"}
  )
  models = response.json()
  ```

  ```javascript JavaScript theme={null}
  // Default format
  const response = await fetch('https://api.cerebras.ai/public/v1/models');
  const models = await response.json();

  // OpenRouter format
  const response = await fetch(
    'https://api.cerebras.ai/public/v1/models?format=openrouter'
  );
  const models = await response.json();

  // HuggingFace format
  const response = await fetch(
    'https://api.cerebras.ai/public/v1/models?format=huggingface'
  );
  const models = await response.json();
  ```
</RequestExample>

<ResponseExample>
  ```json Default Format theme={null}
  {
    "object": "list",
    "data": [
      {
        "id": "gpt-oss-120b",
        "object": "model",
        "created": 1754438400,
        "owned_by": "OpenAI",
        "name": "OpenAI GPT OSS",
        "description": "This model excels at efficient reasoning across science, math, and coding applications. It's ideal for real-time coding assistance, processing large documents for Q&A and summarization, agentic research workflows, and regulated on-premises workloads.",
        "hugging_face_id": "openai/gpt-oss-120b",
        "pricing": {
          "prompt": "0.00000035",
          "completion": "0.00000075"
        },
        "capabilities": {
          "streaming": true,
          "function_calling": true,
          "structured_outputs": true,
          "vision": false,
          "json_mode": true,
          "tools": true,
          "tool_choice": true,
          "parallel_tool_calls": false,
          "response_format": true,
          "reasoning": true
        },
        "supported_parameters": {
          "temperature": true,
          "top_p": true,
          "seed": true,
          "stop": true,
          "max_completion_tokens": true,
          "logprobs": true,
          "top_logprobs": true,
          "frequency_penalty": true,
          "presence_penalty": true,
          "logit_bias": true,
          "repetition_penalty": false
        },
        "architecture": {
          "modality": "text",
          "tokenizer": "GPT",
          "instruct_type": "harmony"
        },
        "limits": {
          "max_context_length": 131072,
          "max_completion_tokens": 40960,
          "requests_per_minute": null,
          "tokens_per_minute": null
        },
        "datacenter_locations": [],
        "deprecated": false,
        "preview": false,
        "quantization": "FP16/8 (weights only)"
      }
    ]
  }
  ```

  ```json OpenRouter Format theme={null}
  {
    "data": [
      {
        "id": "gpt-oss-120b",
        "hugging_face_id": "openai/gpt-oss-120b",
        "name": "OpenAI GPT OSS",
        "created": 1754438400,
        "input_modalities": ["text"],
        "output_modalities": ["text"],
        "quantization": "fp16",
        "context_length": 131072,
        "max_output_length": 40960,
        "pricing": {
          "prompt": "0.00000035",
          "completion": "0.00000075",
          "request": "0",
          "image": "0",
          "input_cache_read": "0",
          "input_cache_write": "0"
        },
        "supported_sampling_parameters": [
          "temperature",
          "top_p",
          "stop",
          "seed",
          "frequency_penalty",
          "presence_penalty",
          "max_tokens",
          "logprobs",
          "top_logprobs",
          "logit_bias"
        ],
        "supported_features": [
          "tools",
          "json_mode",
          "structured_outputs",
          "reasoning"
        ]
      }
    ]
  }
  ```

  ```json HuggingFace Format theme={null}
  {
    "data": [
      {
        "id": "gpt-oss-120b",
        "hugging_face_id": "openai/gpt-oss-120b",
        "object": "model",
        "created": 1754438400,
        "owned_by": "OpenAI",
        "context_length": 131072,
        "pricing": {
          "input": 0.35,
          "output": 0.75
        },
        "capabilities": {
          "streaming": true,
          "function_calling": true,
          "structured_outputs": true,
          "vision": false
        }
      }
    ]
  }
  ```
</ResponseExample>

## Retrieve Specific Model

You can also retrieve information about a specific model by appending the model ID to the endpoint:

```
GET https://api.cerebras.ai/public/v1/models/{model_id}
```

### Path Parameters

<ParamField path="model_id" type="string" required>
  The ID of the model to retrieve (e.g., `gpt-oss-120b`, `zai-glm-4.7`).
</ParamField>

<RequestExample>
  ```bash cURL theme={null}
  curl -sS 'https://api.cerebras.ai/public/v1/models/gpt-oss-120b' | jq
  ```

  ```python Python theme={null}
  import httpx

  model_id = "gpt-oss-120b"
  response = httpx.get(f"https://api.cerebras.ai/public/v1/models/{model_id}")
  model = response.json()
  ```

  ```javascript JavaScript theme={null}
  const modelId = 'gpt-oss-120b';
  const response = await fetch(
    `https://api.cerebras.ai/public/v1/models/${modelId}` 
  );
  const model = await response.json();
  ```
</RequestExample>

<ResponseExample>
  ```json Response theme={null}
  {
    "id": "gpt-oss-120b",
    "object": "model",
    "created": 1754438400,
    "owned_by": "OpenAI",
    "name": "OpenAI GPT OSS",
    "description": "This model excels at efficient reasoning across science, math, and coding applications. It's ideal for real-time coding assistance, processing large documents for Q&A and summarization, agentic research workflows, and regulated on-premises workloads.",
    "hugging_face_id": "openai/gpt-oss-120b",
    "pricing": {
      "prompt": "0.00000035",
      "completion": "0.00000075"
    },
    "capabilities": {
      "streaming": true,
      "function_calling": true,
      "structured_outputs": true,
      "vision": false,
      "json_mode": true,
      "tools": true,
      "tool_choice": true,
      "parallel_tool_calls": false,
      "response_format": true,
      "reasoning": true
    },
    "limits": {
      "max_context_length": 131072,
      "max_completion_tokens": 40960,
      "requests_per_minute": null,
      "tokens_per_minute": null
    },
    "deprecated": false,
    "preview": false,
    "quantization": "FP16/8 (weights only)"
  }
  ```
</ResponseExample>

## Use Cases

**Platform Integration** - Use the OpenRouter or HuggingFace formats to integrate Cerebras models into existing platforms that support these standards.

**Model Discovery** - Programmatically discover available models and their capabilities without authentication.

**Pricing Comparison** - Compare pricing across different models using the structured pricing information.

**Feature Detection** - Check which features (streaming, tools, JSON mode) are supported by each model.
