Skip to main content
This model will be deprecated on November 5, 2025
Model ID: qwen-3-coder-480b

Model Stats

SPEED
~2000
tokens/sec
INPUT / OUTPUT
/
CONTEXT
Free Tier
65k tokens
Paid Tiers
131k tokens
MAX OUTPUT
Free Tier
40k tokens
Paid Tiers
40k tokens

Pricing

Input
$2.00 / M tokens
Output
$2.00 / M tokens
Developer pricing shown above is per million tokens. For volume discounts and enterprise features, see our pricing page.

Model Notes

This model supports only non-thinking mode. It will not generate <think></think> tags.
Tool calling with strict: true (constrained decoding) is not yet supported - standard tool calling remains fully functional.
We recommend setting temperature=0.7 and top_p=0.8.

Rate Limits

TierRequests/minInput Tokens/minDaily Tokens
Free10150k1M
Developer1K1MN/A

Endpoints

Chat Completions
Completions

Features

Streaming
Structured Outputs
Tool Calling
Multi-Turn Tool Calling

Need Higher Limits?

Reach out for custom pricing with our Enterprise tier for higher rate limits and dedicated support.