Context Length
Free Tier: 8k tokens
Paid Tiers: 32k tokens
Speed
~2200 tokens/sec
Input / Output
Input Formats: JSON, plain text
Output Formats: JSON, plain text, structured
Pricing
Input: $0.10 / M tokens
Output: $0.10 / M tokens
Exploration pricing shown above is per million tokens. For volume discounts and enterprise features, see our pricing page.
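As a rough illustration of the per-million-token pricing, the sketch below estimates the cost of a single request. The function and constant names are hypothetical and not part of any SDK; the rates are the $0.10 / M token figures listed above, and volume discounts are ignored.

```python
# Illustrative cost estimate; names here are assumptions, not an official API.
# Rates are taken from the pricing listed above (USD per million tokens).
PRICE_PER_MILLION_USD = {"input": 0.10, "output": 0.10}


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one request."""
    total = (
        input_tokens * PRICE_PER_MILLION_USD["input"]
        + output_tokens * PRICE_PER_MILLION_USD["output"]
    )
    return total / 1_000_000


if __name__ == "__main__":
    # 6,000 prompt tokens plus 2,000 completion tokens.
    print(f"{estimate_cost_usd(6_000, 2_000):.4f}")  # prints 0.0008
```

Because input and output are priced identically here, only the total token count matters; the two rates are kept separate so the sketch still works if they diverge.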
Model Notes
Model ID: llama3.1-8b
Rate Limits
Tier | Requests/min | Input Tokens/min | Output Tokens/min | Daily Tokens |
---|---|---|---|---|
Free | 30 | 60k | 8k/request | 1M |
1 | 600 | 600k | 60k | 245M |
2 | 1000 | 1M | 100k | 415M |
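The limits in the table can be turned into a simple client-side pacing calculation. The sketch below is only an illustration: the class and function names are hypothetical, "k" and "M" are assumed to mean 1,000 and 1,000,000, and the output-token column is left out because the Free row states it per request rather than per minute.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierLimits:
    requests_per_min: int
    input_tokens_per_min: int
    daily_tokens: int


# Values copied from the rate-limit table (assuming k = 1,000 and M = 1,000,000).
LIMITS = {
    "free": TierLimits(30, 60_000, 1_000_000),
    "1": TierLimits(600, 600_000, 245_000_000),
    "2": TierLimits(1000, 1_000_000, 415_000_000),
}


def min_request_interval_s(tier: str) -> float:
    """Seconds to wait between evenly spaced requests to stay under requests/min."""
    return 60.0 / LIMITS[tier].requests_per_min


def max_requests_per_min(tier: str, input_tokens_per_request: int) -> int:
    """Requests per minute allowed once the input-token budget is also considered."""
    if input_tokens_per_request <= 0:
        raise ValueError("input_tokens_per_request must be positive")
    lim = LIMITS[tier]
    by_tokens = lim.input_tokens_per_min // input_tokens_per_request
    return min(lim.requests_per_min, by_tokens)


if __name__ == "__main__":
    print(min_request_interval_s("free"))       # prints 2.0
    print(max_requests_per_min("free", 4_000))  # prints 15
```

With 4,000-token prompts on the Free tier, the input-token budget (60k per minute) becomes the binding limit before the 30 requests per minute cap does.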
Endpoints
Chat Completions
Completions
Features
Streaming
Structured Outputs
Tool Calling
Tool Calling w/ Structured Outputs