This model excels at efficient reasoning across science, math, and coding applications. It’s ideal for real-time coding assistance, processing large documents for Q&A and summarization, agentic research workflows, and regulated on-premises workloads.
gpt-oss-120b
strict: true (constrained decoding)
json_object
json_schema
when using strict: true
tool_choice: none
parallel_tool_calls: false
If min_tokens is set, the model may generate EOS (End of Sequence) tokens, which may cause parser failures. Use at your own risk.

Tier | Requests/min | Input Tokens/min | Output Tokens/min | Daily Tokens |
---|---|---|---|---|
Free | 30 | 64k | 8k/request | 1M |
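
To make the Free tier numbers concrete, here is a minimal client-side sketch that checks a request against the limits before sending it. The class `FreeTierBudget` and its methods are hypothetical helpers, not part of any official SDK. It assumes "64k" means 64,000 and "1M" means 1,000,000, and that daily tokens count both input and output; none of that is stated in the table. The 8k/request output cap is not enforced here.

```python
import time
from collections import deque

# Free-tier limits from the table above (numeric readings are assumptions).
REQUESTS_PER_MIN = 30
INPUT_TOKENS_PER_MIN = 64_000   # assumes "64k" == 64,000
DAILY_TOKENS = 1_000_000        # assumes "1M" == 1,000,000


class FreeTierBudget:
    """Hypothetical client-side tracker for the Free tier limits."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._window = deque()  # (timestamp, input_tokens) within the last 60 s
        self._daily_used = 0    # the caller is expected to reset this daily

    def _prune(self, now):
        # Drop entries that have aged out of the 60-second window.
        while self._window and now - self._window[0][0] >= 60:
            self._window.popleft()

    def can_send(self, input_tokens):
        """Return True if a request with this many input tokens fits the limits."""
        now = self._clock()
        self._prune(now)
        used = sum(tokens for _, tokens in self._window)
        return (
            len(self._window) < REQUESTS_PER_MIN
            and used + input_tokens <= INPUT_TOKENS_PER_MIN
            and self._daily_used + input_tokens <= DAILY_TOKENS
        )

    def record(self, input_tokens, output_tokens=0):
        """Record a request that was actually sent."""
        self._window.append((self._clock(), input_tokens))
        self._daily_used += input_tokens + output_tokens
```

Call `can_send(n)` before each request and `record(n, m)` after it completes. The server remains the source of truth, so a rejected request should still be handled even when this check passes.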
Chat Completions
Completions