Qwen 3 235B

We’ve deprecated the qwen-3-235b-a22b model to make way for enhanced versions that deliver superior performance on reasoning tasks. We recommend migrating to either Qwen 3 235B Instruct or Qwen 3 235B Thinking.

Context Length

Free Tier40k tokens

Paid Tiers131k tokens

Speed

~1500

tokens/sec

Input / Output

Input Formats JSON, plain text

Output FormatsJSON, plain text, structured

Pricing

Input

$0.60 / M tokens

Output

$1.20 / M tokens

Exploration pricing shown above is per million tokens. For volume discounts and enterprise features, see our pricing page.

Model Notes

Model ID: qwen-3-235b-a22b

Currently, Cerebras only supports the default reasoning mode. However, if you don't want the model to use reasoning for certain queries, or if you experience reduced accuracy when using this model with long contexts (e.g., 131k tokens), try appending /no_think to your prompt to disable the model's default reasoning behavior.

For example: Tell me about cats /no_think

When using thinking mode, we recommend setting temperature=0.6 and top_p=0.95, and avoid greedy decoding completely as it causes performance issues and repetitions.

Rate Limits

Tier	Requests/min	Input Tokens/min	Output Tokens/min	Daily Tokens
Free	30	64k	8k/request	1M

Endpoints

Chat Completions

Completions

Features

Reasoning

Streaming

Structured Outputs

Tool Calling

Multi-Turn Tool Calling

Need Higher Limits?

Reach out for custom pricing with our Enterprise tier for higher rate limits and dedicated support.Contact Sales

Get Started

Capabilities

Resources

Support

Context Length

Speed

Input / Output

Pricing

Model Notes

Rate Limits

Endpoints

Features

Need Higher Limits?