Skip to main content
Model ID: gpt-oss-120b

Model Stats

SPEED
~3000
tokens/sec
INPUT / OUTPUT
/
CONTEXT
Free Tier
65k tokens
Paid Tiers
131k tokens
MAX OUTPUT
Free Tier
32k tokens
Paid Tiers
40k tokens

Pricing

Input
$0.35 / M tokens
Output
$0.75 / M tokens
Developer pricing shown above is per million tokens. For volume discounts and enterprise features, see our pricing page.

Model Notes

When min_tokens is set, the model may generate EOS (End of Sequence) tokens which may cause parser failures. Use at your own risk.
This model may call tools that aren't directly specified due to it's training. Monitor for non-approved tools and reprompt with "you're hallucinating a tool call" to help the model self-correct and stick to provided tools.
For this model, our API maps the "system" role to developer-level instructions in our prompt hierarchy. See our OpenAI Compatibility guide for more details.

Rate Limits

TierRequests/minInput Tokens/minDaily Tokens
Free3060k1M
Developer1K1MN/A

Endpoints

Chat Completions

Features

Reasoning
Streaming
Structured Outputs
Tool Calling

Need Higher Limits?

Reach out for custom pricing with our Enterprise tier for higher rate limits and dedicated support.
I