Context Length
Free Tier64k tokens
Paid Tiers128k tokens
Speed
~2000
tokens/sec
Input / Output
Input Formats JSON, plain text
Output FormatsJSON, plain text, structured
Pricing
Input
$2.00 / M tokens
Output
$2.00 / M tokens
Exploration pricing shown above is per million tokens. For volume discounts and enterprise features, see our pricing page.
Model Notes
Model ID:
qwen-3-coder-480b
This model supports only non-thinking mode. It will not generate
<think></think>
tags.Tool calling with
strict: true
(constrained decoding) is not yet supported - standard tool calling remains fully functional.We recommend setting
temperature=0.7
and top_p=0.8
.Rate Limits
Tier | Requests/min | Input Tokens/min | Output Tokens/min | Daily Tokens |
---|---|---|---|---|
Free | 10 | 150k | 8k/request | 1M |
Endpoints
Chat Completions
Completions
Features
Streaming
Structured Outputs
Tool Calling
Multi-Turn Tool Calling