Skip to main content
Model ID: gpt-oss-120b
Model card

Model Stats

SPEED
~3000
tokens/sec
INPUT / OUTPUT
/
CONTEXT
Free Tier
65k tokens
Paid Tiers
131k tokens
MAX OUTPUT
Free Tier
32k tokens
Paid Tiers
40k tokens

Pricing

Input
$0.35 / M tokens
Output
$0.75 / M tokens
Exploration pricing shown above is per million tokens. For volume discounts and enterprise features, see our pricing page.

Model Notes

Use the reasoning_effort parameter to control reasoning for this model. The default effort level is medium. Learn more in our reasoning guide.
When min_tokens is set, the model may generate EOS (End of Sequence) tokens which may cause parser failures. Use at your own risk.
This model may call tools that aren't directly specified due to it's training. Monitor for non-approved tools and reprompt with "you're hallucinating a tool call" to help the model self-correct and stick to provided tools.
For this model, our API maps the "system" role to developer-level instructions in our prompt hierarchy. See our OpenAI Compatibility guide for more details.

Rate Limits

TierRequests/minInput Tokens/minDaily Tokens
Free3060k1M
Developer1K1MN/A

Endpoints

Chat Completions
/v1/chat/completions

Capabilities

Reasoning
Streaming
Structured Outputs
Tool Calling

Need Higher Limits?

Reach out for custom pricing with our Enterprise tier for higher rate limits and dedicated support.