Context Length

Free Tier8k tokens
Paid Tiers32k tokens

Speed

~2600
tokens/sec

Input / Output

Input Formats JSON, plain text
Output FormatsJSON, plain text, structured

Pricing

Input
$0.65 / M tokens
Output
$0.85 / M tokens
Exploration pricing shown above is per million tokens. For volume discounts and enterprise features, see our pricing page.

Model Notes

Model ID: llama-4-scout-17b-16e-instruct
Available through Hugging Face and OpenRouter.

Rate Limits

TierRequests/minInput Tokens/minOutput Tokens/minDaily Tokens
Free3060k8k/request1M
1300300k30k54M
2600600k60k115M
310001M100K185M
412001.2M120k255M
514501.45M145k365M

Endpoints

Chat Completions
Completions

Features

Streaming
Structured Outputs
Tool Calling
Tool Calling w/ Structured Outputs

Need Higher Limits?

Reach out for custom pricing with our Enterprise tier for higher rate limits and dedicated support.Contact Sales