Reasoning flags are currently only available for the OpenAI GPT OSS model.
To enable reasoning, use the reasoning_effort parameter within the chat.completions.create method. This parameter controls the amount of reasoning the model performs.
1. Initial Setup

Begin by importing the Cerebras SDK and setting up the client.
import os
from cerebras.cloud.sdk import Cerebras

client = Cerebras(
    # This is the default and can be omitted
    api_key=os.environ.get("CEREBRAS_API_KEY"),
)
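
os.environ.get returns None when the variable is unset. One way to surface that early with an explicit message is a small check before constructing the client. This is only a sketch: load_api_key is a hypothetical helper written for illustration, not part of the Cerebras SDK.

```python
import os


def load_api_key(env=os.environ, name="CEREBRAS_API_KEY"):
    """Return the API key from `env`, or raise if it is missing or blank.

    Hypothetical helper for illustration; not an SDK function.
    """
    key = (env.get(name) or "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before creating the client")
    return key
```

With this in place, the client above could be created as Cerebras(api_key=load_api_key()).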
2. Using Reasoning

Pass reasoning_effort to chat.completions.create to enable reasoning. The following request streams the response and prints each content delta as it arrives.
stream = client.chat.completions.create(
    messages=[
        {
            "role": "system",
            "content": "You are a helpful assistant."
        },
        {
            "role": "user",
            "content": "Say hello to the world."
        },
        {
            "role": "assistant",
            "content": "Hello, world! 🌍"
        }
    ],
    model="gpt-oss-120b",
    stream=True,
    max_completion_tokens=65536,
    temperature=1,
    top_p=1,
    reasoning_effort="medium"
)

for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")
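
If the full text is needed after streaming finishes, the deltas can be accumulated instead of printed. The sketch below assumes only the chunk shape used in the loop above (chunk.choices[0].delta.content). collect_stream and fake_chunk are hypothetical names, and the fake chunks stand in for a real stream so the example runs without network access.

```python
from types import SimpleNamespace


def collect_stream(stream):
    """Concatenate the content deltas from an iterable of streamed chunks."""
    parts = []
    for chunk in stream:
        if not chunk.choices:
            # Defensive: skip chunks that carry no choices.
            continue
        # A delta's content may be None, so fall back to an empty string.
        parts.append(chunk.choices[0].delta.content or "")
    return "".join(parts)


def fake_chunk(text):
    """Build a stand-in object shaped like a streamed chunk (illustration only)."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


demo_stream = [fake_chunk("Hello"), fake_chunk(None), fake_chunk(", world!")]
full_text = collect_stream(demo_stream)
```

In real use, the stream object returned by client.chat.completions.create(..., stream=True) would be passed to collect_stream in place of demo_stream.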

Reasoning Effort Levels

The reasoning_effort parameter accepts the following values:
  • "none" - No explicit reasoning effort specified (model-dependent default)
  • "low" - Minimal reasoning, faster responses
  • "medium" - Moderate reasoning
  • "high" - Extensive reasoning, more thorough analysis
The default depends on the model in use. For example, OpenAI models treat "none" as "medium".
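
A typo in the effort string is easy to miss, so it can help to validate the value on the client side before sending a request. The following is a minimal sketch, assuming the four values listed above are the complete set. ALLOWED_EFFORTS and reasoning_kwargs are hypothetical names, not part of the SDK.

```python
# Values taken from the list above; assumed to be exhaustive.
ALLOWED_EFFORTS = ("none", "low", "medium", "high")


def reasoning_kwargs(effort=None):
    """Return keyword arguments for chat.completions.create.

    When `effort` is None the parameter is omitted entirely, leaving the
    choice to the model's default.
    """
    if effort is None:
        return {}
    if effort not in ALLOWED_EFFORTS:
        raise ValueError(
            f"reasoning_effort must be one of {ALLOWED_EFFORTS}, got {effort!r}"
        )
    return {"reasoning_effort": effort}
```

The result can be unpacked into the call, for example client.chat.completions.create(model="gpt-oss-120b", messages=messages, **reasoning_kwargs("high")).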

Response Format

When reasoning is enabled, the response includes a reasoning field containing the model’s internal thought process:
{
  "id": "chatcmpl-xxx",
  "object": "chat.completion",
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "Hello, world!",
        "reasoning": "The user is asking for a simple greeting to the world. This is a straightforward request that doesn't require complex analysis. I should provide a friendly, direct response."
      }
    }
  ]
}
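
To keep the final answer separate from the reasoning text, the two fields can be read from the parsed response. The sketch below works on a plain dictionary shaped like the sample above (with the reasoning string shortened). split_reasoning is a hypothetical helper, and treating a missing reasoning field as None is an assumption made here rather than documented behavior.

```python
import json


def split_reasoning(response):
    """Return (content, reasoning) from a chat completion given as a dict."""
    message = response["choices"][0]["message"]
    # `reasoning` may be absent when reasoning is not enabled (assumption).
    return message.get("content", ""), message.get("reasoning")


# Shortened copy of the sample response shown above.
sample = json.loads("""
{
  "id": "chatcmpl-xxx",
  "object": "chat.completion",
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "Hello, world!",
        "reasoning": "The user is asking for a simple greeting to the world."
      }
    }
  ]
}
""")

answer, reasoning = split_reasoning(sample)
```

Only answer would normally be shown to end users; reasoning can be logged or inspected for debugging.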