Model ID:
qwen-3-235b-a22b-thinking-2507
Model Stats
SPEED
~1700
tokens/sec
INPUT / OUTPUT
/
CONTEXT
Free Tier
65k tokens
Paid Tiers
131k tokens
MAX OUTPUT
Free Tier
32k tokens
Paid Tiers
40k tokens
Pricing
Input
$0.60 / M tokens
Output
$2.90 / M tokens
Developer pricing shown above is per million tokens. For volume discounts and enterprise features, see our pricing page.
Model Notes
This model supports only thinking mode. The default chat template automatically includes
<think>
tags, and it's normal to see output that contains only a closing </think>
tag without an explicit opening <think>
tag.This model tends to produce longer, more verbose responses. To prevent truncation, we recommend setting
max_completion_tokens
to 64,000 when using this model.In multi-turn conversations, the historical model output should contain only the final output portion and exclude thinking content. While the Jinja2 chat template handles this automatically, developers using other frameworks must manually ensure this best practice is implemented to maintain clean conversation history.
Rate Limits
Tier | Requests/min | Input Tokens/min | Daily Tokens |
---|---|---|---|
Free | 30 | 60k | 1M |
Developer | 1K | 1M | N/A |
Endpoints
Chat Completions
Completions
Features
Reasoning
Streaming
Structured Outputs
Tool Calling