Context Length
Free Tier64k tokens
Paid Tiers131k tokens
Speed
~3000
tokens/sec
Input / Output
Input Formats JSON, plain text
Output Formatsplain text, structured
Pricing
Input
$0.35 / M tokens
Output
$0.75 / M tokens
Exploration pricing shown above is per million tokens. For volume discounts and enterprise features, see our pricing page.
Model Notes
Model ID:
gpt-oss-120b
The following features are not yet supported:
- Tool calling with
- Response format with
- Response format with
-
-
- Logprobs with response format or tool usage (available for all other models)
- Tool calling with
strict: true
(constrained decoding)- Response format with
json_object
- Response format with
json_schema
when using strict: true
-
tool_choice: none
-
parallel_tool_calls: false
- Logprobs with response format or tool usage (available for all other models)
When
min_tokens
is set, the model may generate EOS (End of Sequence) tokens which may cause parser failures. Use at your own risk.This model may call tools that aren't directly specified due to it's training. Monitor for non-approved tools and reprompt with "you're hallucinating a tool call" to help the model self-correct and stick to provided tools.
For this model, our API maps the "system" role to developer-level instructions in our prompt hierarchy. See our OpenAI Compatibility guide for more details.
Rate Limits
Tier | Requests/min | Input Tokens/min | Output Tokens/min | Daily Tokens |
---|---|---|---|---|
Free | 30 | 64k | 8k/request | 1M |
Endpoints
Chat Completions
Features
Reasoning
Streaming
Structured Outputs
Tool Calling