Structured outputs is a feature that can enforce consistent JSON outputs for models in the Cerebras Inference API. This is particularly useful when building applications that need to process AI-generated data programmatically. Some of the key benefits of using structured outputs are:
  • Reduced Variability: Ensures consistent outputs by adhering to predefined fields.
  • Type Safety: Enforces correct data types, preventing mismatches.
  • Easier Parsing & Integration: Enables direct use in applications without extra processing.

Tutorial: Structured Outputs using Cerebras Inference

In this tutorial, we’ll explore how to use structured outputs with the Cerebras Cloud SDK. We’ll build a simple application that generates movie recommendations and uses structured outputs to ensure the response is in a consistent JSON format.
Step 1: Initial Setup

First, ensure that you have completed steps 1 and 2 of our Quickstart Guide to set up your API key and install the Cerebras Cloud SDK. Then, initialize the Cerebras client and import the modules used in this tutorial.
import os
from cerebras.cloud.sdk import Cerebras
import json

client = Cerebras(
    api_key=os.environ.get("CEREBRAS_API_KEY")
)
Step 2: Defining the Schema

To ensure structured responses from the model, we’ll use a JSON schema to define our output structure. Start by defining your schema, which specifies the fields, their types, and which ones are required. For our example, we’ll define a schema for a movie recommendation that includes the title, director, and year:
Note: Every object in your schema that defines a required array must also set additionalProperties to false.
movie_schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "director": {"type": "string"},
        "year": {"type": "integer"},
    },
    "required": ["title", "director", "year"],
    "additionalProperties": False
}
Step 3: Using Structured Outputs

Next, use the schema in your API call by setting the response_format parameter to include both the type and your schema. Setting strict to true will enforce the schema. Setting strict to false will allow the model to return additional fields that are not specified in the schema, similar to JSON mode.
completion = client.chat.completions.create(
    model="gpt-oss-120b",
    messages=[
        {"role": "system", "content": "You are a helpful assistant that generates movie recommendations."},
        {"role": "user", "content": "Suggest a sci-fi movie from the 1990s"}
    ],
    response_format={
        "type": "json_schema", 
        "json_schema": {
            "name": "movie_schema",
            "strict": True,
            "schema": movie_schema
        }
    }
)

# Parse the JSON response
movie_data = json.loads(completion.choices[0].message.content)
print(json.dumps(movie_data, indent=2))
Sample output:
{
  "title": "Terminator 2: Judgment Day",
  "director": "James Cameron",
  "year": 1991
}
Now you have a structured JSON response from the model, which can be used in your application.
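If you run with strict set to false, or simply want a defensive check before passing the data on, the parsed response can be verified against the expected fields. The following is a minimal standard-library sketch, not part of the Cerebras SDK: parse_movie and EXPECTED_FIELDS are hypothetical names, and the hard-coded string stands in for completion.choices[0].message.content.

```python
import json

# Hypothetical mapping of movie_schema fields to the Python types we expect.
EXPECTED_FIELDS = {"title": str, "director": str, "year": int}

def parse_movie(content: str) -> dict:
    """Parse a JSON string and check it against EXPECTED_FIELDS."""
    data = json.loads(content)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for name, expected_type in EXPECTED_FIELDS.items():
        if name not in data:
            raise ValueError(f"missing required field: {name}")
        value = data[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected_type):
            raise ValueError(f"field {name!r} has type {type(value).__name__}")
    extra = set(data) - set(EXPECTED_FIELDS)
    if extra:
        raise ValueError(f"unexpected fields: {sorted(extra)}")
    return data

# Stand-in for completion.choices[0].message.content
sample = '{"title": "Terminator 2: Judgment Day", "director": "James Cameron", "year": 1991}'
movie = parse_movie(sample)
print(movie["title"], movie["year"])
```

With strict set to true this check should be redundant, but it can still serve as a cheap safeguard at the boundary of your application.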

Understanding Strict Mode

Strict mode guarantees that the model’s output will exactly match the JSON schema you provide. When strict is set to true, Cerebras employs constrained decoding to ensure schema conformance at the token level, making invalid outputs impossible.

Why Use Strict Mode

Without strict mode, you may encounter:
  • Malformed JSON that fails to parse
  • Missing required fields
  • Incorrect data types (e.g., "16" instead of 16)
  • Extra fields not defined in your schema
With strict mode enabled, you get:
  • Guaranteed valid JSON
  • Schema compliance: Every field matches your specification
  • Type safety: Correct data types for all properties
  • No retries needed: Eliminates error handling for schema violations

Enabling Strict Mode

Set strict to true in your response_format configuration:
response_format={
    "type": "json_schema",
    "json_schema": {
        "name": "my_schema",
        "strict": True,  # Enable constrained decoding
        "schema": your_schema
    }
}

Schema Requirements for Strict Mode

When using strict mode, you must set additionalProperties: false. This is required for every object in your schema.
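Nested objects are easy to overlook, so it can help to check a schema locally before sending it. The sketch below is a heuristic walk over a schema dictionary that reports every object missing additionalProperties: false. The function name find_open_objects is hypothetical, and the walk assumes the schema is built from plain dicts and lists.

```python
def find_open_objects(schema, path="#"):
    """Return paths of object schemas that do not set additionalProperties to False."""
    problems = []
    if isinstance(schema, dict):
        if schema.get("type") == "object" and schema.get("additionalProperties") is not False:
            problems.append(path)
        # Recurse into every value so nested properties, items, and $defs are covered.
        for key, value in schema.items():
            problems.extend(find_open_objects(value, f"{path}/{key}"))
    elif isinstance(schema, list):
        for index, value in enumerate(schema):
            problems.extend(find_open_objects(value, f"{path}/{index}"))
    return problems

# Example: the nested cast member object forgets additionalProperties.
example_schema = {
    "type": "object",
    "properties": {
        "cast": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {"name": {"type": "string"}},
                "required": ["name"],
            },
        }
    },
    "required": ["cast"],
    "additionalProperties": False,
}

print(find_open_objects(example_schema))
```

An empty list means every object found by the walk sets additionalProperties to false.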

Limitations in Strict Mode

The following limitations apply if strict is set to true in the JSON schema:
  • Recursive JSON schemas are not currently supported.
  • Maximum schema length is limited to 5000 characters.
  • Maximum nesting depth is 10 levels.
  • Maximum number of object properties is 500.
  • A schema may have a maximum of 500 enum values across all enum properties.
  • For a single enum property with string values, the total string length of all enum values cannot exceed 7500 characters when there are more than 250 enum values.
  • items: true is not supported for JSON schema array types.
  • items: false is supported when used with prefixItems for tuple-like arrays with validation rules.
  • $anchor keyword is not supported - use relative paths within definitions/references instead.
  • Use $defs instead of definitions for reusable schema components.
  • Additional informational fields meant as guidelines (not used in validation) are not supported.
For security reasons, external schema references are not supported.
  • Internal definitions are supported: "$ref": "#/$defs/cast_member"
  • Other reference access patterns are not recommended, and will be deprecated in future releases. See Schema References and Definitions for more info.
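Some of the numeric limits above can be checked locally before a request is made. The sketch below is an approximation under stated assumptions: it measures schema length as the length of json.dumps output and counts properties and enum values with a simple recursive walk, which may not match how the service measures them. The names check_strict_limits and count_properties_and_enums are hypothetical.

```python
import json

# Limits copied from the list above; how the service measures them is an assumption.
MAX_SCHEMA_CHARS = 5000
MAX_PROPERTIES = 500
MAX_ENUM_VALUES = 500

def count_properties_and_enums(node):
    """Walk a schema and return (total object properties, total enum values)."""
    props = enums = 0
    if isinstance(node, dict):
        if isinstance(node.get("properties"), dict):
            props += len(node["properties"])
        if isinstance(node.get("enum"), list):
            enums += len(node["enum"])
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        children = []
    for child in children:
        child_props, child_enums = count_properties_and_enums(child)
        props += child_props
        enums += child_enums
    return props, enums

def check_strict_limits(schema):
    """Return a list of human-readable limit violations (empty if none found)."""
    issues = []
    length = len(json.dumps(schema))
    if length > MAX_SCHEMA_CHARS:
        issues.append(f"schema is {length} characters (limit {MAX_SCHEMA_CHARS})")
    props, enums = count_properties_and_enums(schema)
    if props > MAX_PROPERTIES:
        issues.append(f"schema has {props} properties (limit {MAX_PROPERTIES})")
    if enums > MAX_ENUM_VALUES:
        issues.append(f"schema has {enums} enum values (limit {MAX_ENUM_VALUES})")
    return issues

small_schema = {
    "type": "object",
    "properties": {"title": {"type": "string"}, "year": {"type": "integer"}},
    "required": ["title", "year"],
    "additionalProperties": False,
}
print(check_strict_limits(small_schema))
```

Nesting depth and the per-enum string length rule are left out here because their exact definitions are not spelled out above.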

Schema References and Definitions

You can use $ref with $defs to define reusable schema components within your JSON schema. This is useful for avoiding repetition and creating more maintainable schemas.
schema_with_defs = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "director": {"$ref": "#/$defs/person"},
        "year": {"type": "integer"},
        "lead_actor": {"$ref": "#/$defs/person"},
        "studio": {"$ref": "#/$defs/studio"}
    },
    "required": ["title", "director", "year"],
    "additionalProperties": False,
    "$defs": {
        "person": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "age": {"type": "integer"}
            },
            "required": ["name"],
            "additionalProperties": False
        },
        "studio": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "founded": {"type": "integer"},
                "headquarters": {"type": "string"}
            },
            "required": ["name"],
            "additionalProperties": False
        }
    }
}
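Because only internal references are supported, a quick local check can catch references that point outside the schema or at a definition that does not exist. The sketch below assumes all definitions live in a top-level $defs block and that references use the "#/$defs/<name>" form shown above. The helper names collect_refs and unresolved_refs are hypothetical.

```python
def collect_refs(node):
    """Yield every "$ref" string found anywhere in the schema."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "$ref" and isinstance(value, str):
                yield value
            else:
                yield from collect_refs(value)
    elif isinstance(node, list):
        for item in node:
            yield from collect_refs(item)

def unresolved_refs(schema):
    """Return refs that are not "#/$defs/<name>" with <name> defined at the top level."""
    defs = schema.get("$defs", {})
    prefix = "#/$defs/"
    bad = []
    for ref in collect_refs(schema):
        name = ref[len(prefix):] if ref.startswith(prefix) else None
        if name is None or name not in defs:
            bad.append(ref)
    return bad

# Example: one valid ref, one ref to a missing definition, one external ref.
example = {
    "type": "object",
    "properties": {
        "director": {"$ref": "#/$defs/person"},
        "studio": {"$ref": "#/$defs/studio"},
        "poster": {"$ref": "https://example.com/poster.json"},
    },
    "additionalProperties": False,
    "$defs": {
        "person": {
            "type": "object",
            "properties": {"name": {"type": "string"}},
            "required": ["name"],
            "additionalProperties": False,
        }
    },
}

print(unresolved_refs(example))
```

Run against schema_with_defs above, this check should return an empty list, since both person and studio are defined.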

Advanced Schema Features

Your schema can include various JSON Schema features:
  • Fundamental Data Types: String, Number, Boolean, Integer, Object, Array, Enum, null.
  • Union Types: Use anyOf to allow the model to return one of multiple possible types (max of 5).
  • Nested structures: Define complex objects with nested properties, with support for up to 5 layers of nesting. You can also use $defs to define and reference reusable schema components.
  • Required fields: Specify which fields must be present.
  • Additional properties: Control whether extra fields are allowed. Note: the only accepted value is false, and every object in your schema that defines a required array must set additionalProperties to false.
  • Enums (value constraints): Use the enum keyword to whitelist the exact literals a field may take. See rating in the example below.
For example, a more complex schema might look like:
detailed_schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "director": {"type": "string"},
        "year": {"type": "integer"},
        "genres": {
            "type": "array",
            "items": {"type": "string"}
        },
        "rating": {
            "type": "string",
            "enum": ["G", "PG", "PG-13", "R"]
        },
        "cast": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "role": {"type": "string"}
                },
                "required": ["name"],
                "additionalProperties": False
            }
        }
    },
    "required": ["title", "director", "year", "genres"],
    "additionalProperties": False
}
When used with the API, you might get a response like:
{
  "title": "Jurassic Park",
  "director": "Steven Spielberg",
  "year": 1993,
  "genres": ["Science Fiction", "Adventure", "Thriller"],
  "cast": [
    {"name": "Sam Neill", "role": "Dr. Alan Grant"},
    {"name": "Laura Dern", "role": "Dr. Ellie Sattler"},
    {"name": "Jeff Goldblum", "role": "Dr. Ian Malcolm"}
  ]
}
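The union types mentioned in the feature list can be written with anyOf. The sketch below is illustrative only: the field name sequel_title is hypothetical, and pairing a string with null to model a value that may be absent is an assumption about a common JSON Schema pattern rather than a documented recommendation.

```python
import json

nullable_field_schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        # anyOf lets the value be either a string or null.
        "sequel_title": {"anyOf": [{"type": "string"}, {"type": "null"}]},
    },
    "required": ["title", "sequel_title"],
    "additionalProperties": False,
}

print(json.dumps(nullable_field_schema, indent=2))
```

The field stays in the required list, so the model must always emit the key, and null signals that there is no value.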

Working with Pydantic and Zod

Besides defining a JSON schema manually, you can use Pydantic (Python) or Zod (JavaScript) to create your schema and convert it to JSON. Pydantic’s model_json_schema method and the zodToJsonSchema function (from the zod-to-json-schema package) generate the JSON schema, which can then be used in the API call, as demonstrated in the workflow above.
from pydantic import BaseModel
import json

# Define your schema using Pydantic
class Movie(BaseModel):
    title: str
    director: str 
    year: int 

# Convert the Pydantic model to a JSON schema
movie_schema = Movie.model_json_schema()

# Print the JSON schema to verify it
print(json.dumps(movie_schema, indent=2))
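A generated schema does not necessarily include additionalProperties: false, which strict mode requires on every object. The sketch below post-processes a schema dictionary to add it. The helper forbid_extra_properties is hypothetical, and the generated dictionary is a hand-written stand-in resembling Pydantic output, not an exact copy of it.

```python
import copy

def forbid_extra_properties(schema: dict) -> dict:
    """Return a copy of schema with additionalProperties set to False on every object."""
    result = copy.deepcopy(schema)  # leave the caller's schema untouched

    def visit(node):
        if isinstance(node, dict):
            if node.get("type") == "object":
                node["additionalProperties"] = False
            for value in list(node.values()):
                visit(value)
        elif isinstance(node, list):
            for item in node:
                visit(item)

    visit(result)
    return result

# Hand-written stand-in resembling what Movie.model_json_schema() might return.
generated = {
    "title": "Movie",
    "type": "object",
    "properties": {
        "title": {"title": "Title", "type": "string"},
        "director": {"title": "Director", "type": "string"},
        "year": {"title": "Year", "type": "integer"},
    },
    "required": ["title", "director", "year"],
}

strict_ready = forbid_extra_properties(generated)
print(strict_ready["additionalProperties"])
```

Alternatively, Pydantic can emit this key itself when the model is configured to forbid extra fields, for example with extra="forbid" in the model configuration.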

JSON Mode

In addition to structured outputs, you can also use JSON mode to generate JSON responses from the model. This approach tells the model to return data in JSON format but doesn’t enforce a specific structure. The model decides what fields to include based on the context of your prompt.
We recommend using structured outputs with strict set to true whenever possible, as it provides more predictable and reliable results.
To use JSON mode, set the response_format parameter to json_object:
completion = client.chat.completions.create(
    model="gpt-oss-120b",
    messages=[
        {"role": "system", "content": "You are a helpful assistant that generates movie recommendations. Respond with a JSON object."},
        {"role": "user", "content": "Suggest a sci-fi movie from the 1990s"}
    ],
    response_format={"type": "json_object"}
)

Limitations

  • You must explicitly instruct the model to generate JSON through a system or user message.
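One way to guard against forgetting this instruction is to check the message list before sending it. The sketch below is a simple heuristic, not an SDK feature: ensure_json_instruction is a hypothetical helper that looks for the word "json" in any message and prepends a system message when none is found.

```python
def ensure_json_instruction(messages):
    """Return a message list that contains an explicit instruction to produce JSON."""
    mentions_json = any(
        "json" in str(message.get("content", "")).lower() for message in messages
    )
    if mentions_json:
        return list(messages)
    # Assumed wording; adjust the instruction to suit your prompt.
    instruction = {"role": "system", "content": "Respond only with a valid JSON object."}
    return [instruction, *messages]

plain = [{"role": "user", "content": "Suggest a sci-fi movie from the 1990s"}]
prepared = ensure_json_instruction(plain)
print(prepared[0]["role"], len(prepared))
```

The prepared list can then be passed as the messages argument alongside response_format={"type": "json_object"}.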

Structured Outputs vs JSON Mode

The table below summarizes the key differences between Structured Outputs and JSON Mode:
| Feature | Structured Outputs (strict) | Structured Outputs (non-strict) | JSON Mode |
| --- | --- | --- | --- |
| Outputs valid JSON | Yes | Yes (best-effort) | Yes |
| Adheres to schema | Yes (guaranteed) | Yes | No (flexible) |
| Extra fields allowed | No | Yes | No (flexible) |
| Constrained Decoding | Yes | No | No |
| Enabling | response_format: { type: "json_schema", json_schema: {"strict": true, "schema": ...} } | response_format: { type: "json_schema", json_schema: {"strict": false, "schema": ...} } | response_format: { type: "json_object" } |
tools and response_format cannot be used in the same request.
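If request parameters are assembled in several places, a small guard can surface this conflict before the request is sent. The sketch below is a hypothetical helper (build_request_kwargs is not part of the SDK) that collects keyword arguments and raises early when both options are supplied.

```python
def build_request_kwargs(model, messages, response_format=None, tools=None):
    """Assemble chat completion arguments, rejecting tools combined with response_format."""
    if response_format is not None and tools is not None:
        raise ValueError("tools and response_format cannot be used in the same request")
    kwargs = {"model": model, "messages": messages}
    if response_format is not None:
        kwargs["response_format"] = response_format
    if tools is not None:
        kwargs["tools"] = tools
    return kwargs

kwargs = build_request_kwargs(
    model="gpt-oss-120b",
    messages=[{"role": "user", "content": "Suggest a sci-fi movie from the 1990s. Respond in JSON."}],
    response_format={"type": "json_object"},
)
print(sorted(kwargs))
```

The resulting dictionary could then be unpacked into the call, as in client.chat.completions.create(**kwargs).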

Conclusion

Structured outputs with JSON schema enforcement ensures your AI-generated responses follow a consistent, predictable format. This makes it easier to build reliable applications that can process AI outputs programmatically without worrying about unexpected data structures or missing fields. Check out some of our other tutorials to learn more about other features of the Cerebras Inference SDK:
  • Tool Use: extending models’ capabilities to access tools to answer questions and perform actions
  • CePO: a reasoning framework for improving reasoning model abilities with test-time compute
  • Streaming: a feature for streaming responses from the model