Context Length

Free Tier: 64k tokens
Paid Tiers: 131k tokens

Speed

~3000 tokens/sec

Input / Output

Input Formats: JSON, plain text
Output Formats: plain text, structured

Pricing

Input: $0.25 / M tokens
Output: $0.69 / M tokens
Exploration pricing shown above is per million tokens. For volume discounts and enterprise features, see our pricing page.
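
As a rough illustration of the per-million-token pricing, the sketch below estimates the cost of a single request from its token counts. The function name `estimate_cost` and the constants are hypothetical names chosen for this example; the prices are the ones listed above, and volume discounts are not modeled.

```python
# Per-million-token prices in USD, copied from the pricing listed above.
INPUT_PRICE_PER_M = 0.25
OUTPUT_PRICE_PER_M = 0.69


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request, before any discounts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # A request with 10,000 input tokens and 2,000 output tokens.
    print(f"${estimate_cost(10_000, 2_000):.6f}")
```

Under these prices, a request with 10,000 input tokens and 2,000 output tokens comes to roughly $0.00388.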

Model Notes

Model ID: gpt-oss-120b
The following features are not yet supported:

- Tool calling with strict: true (constrained decoding)

- Response format with json_object

- Response format with json_schema when using strict: true

- tool_choice: none

- parallel_tool_calls: false

- Logprobs with response format or tool usage (available for all other models)
When min_tokens is set, the model may generate EOS (End of Sequence) tokens, which can cause parser failures. Use at your own risk.
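
One way to catch the unsupported features listed above before sending a request is a small client-side check. The sketch below assumes an OpenAI-style Chat Completions payload (a dict with `tools`, `response_format`, `tool_choice`, `parallel_tool_calls`, and `logprobs` keys); that payload shape is an assumption, not something this page specifies, and `find_unsupported` is a hypothetical helper name.

```python
from typing import Any


def find_unsupported(payload: dict[str, Any]) -> list[str]:
    """Return a description of each unsupported feature found in the payload.

    Assumes an OpenAI-style request body; adjust the key names if the
    actual API differs.
    """
    problems: list[str] = []

    # Tool calling with strict: true (constrained decoding).
    for tool in payload.get("tools") or []:
        if (tool.get("function") or {}).get("strict") is True:
            problems.append("tool calling with strict: true")
            break

    # Response format restrictions.
    response_format = payload.get("response_format") or {}
    fmt_type = response_format.get("type")
    if fmt_type == "json_object":
        problems.append("response format json_object")
    schema = response_format.get("json_schema") or {}
    if fmt_type == "json_schema" and schema.get("strict") is True:
        problems.append("response format json_schema with strict: true")

    if payload.get("tool_choice") == "none":
        problems.append("tool_choice: none")
    if payload.get("parallel_tool_calls") is False:
        problems.append("parallel_tool_calls: false")

    # Logprobs combined with a response format or tools.
    if payload.get("logprobs") and (response_format or payload.get("tools")):
        problems.append("logprobs with response format or tool usage")

    return problems
```

An empty list means none of the listed restrictions were detected; it does not guarantee the request is otherwise valid.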

Rate Limits

Tier   Requests/min   Input Tokens/min   Output Tokens/min   Daily Tokens
Free   30             64k                8k/request          1M
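
To stay under the Free tier's 30 requests/min, a client can throttle itself before calling the API. The sketch below is a minimal sliding-window limiter; `SlidingWindowLimiter` is a hypothetical class for illustration, and whether the service enforces a sliding or fixed window is not stated here, so treat the window behavior as an assumption. It does not track the token-based limits.

```python
import time
from collections import deque
from typing import Callable


class SlidingWindowLimiter:
    """Client-side limiter: at most max_requests per window_seconds."""

    def __init__(
        self,
        max_requests: int = 30,          # Free tier requests/min from the table
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._sent: deque = deque()      # timestamps of recent requests

    def wait_time(self) -> float:
        """Seconds to wait before another request fits (0.0 if it fits now)."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) < self.max_requests:
            return 0.0
        return self.window_seconds - (now - self._sent[0])

    def record(self) -> None:
        """Record that a request was just sent."""
        self._sent.append(self._clock())
```

A typical loop would call `time.sleep(limiter.wait_time())` before each request and `limiter.record()` right after sending it.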

Endpoints

Chat Completions
Completions

Features

Reasoning
Streaming
Structured Outputs
Tool Calling

Need Higher Limits?

Contact Sales for custom pricing on our Enterprise tier, which offers higher rate limits and dedicated support.