Skip to main content
Model ID: llama3.1-8b

Model Stats

SPEED
~2200
tokens/sec
INPUT / OUTPUT
/
CONTEXT
Free Tier
8k tokens
Paid Tiers
32k tokens
MAX OUTPUT
Free Tier
8k tokens
Paid Tiers
8k tokens

Pricing

Input
$0.10 / M tokens
Output
$0.10 / M tokens
Developer pricing shown above is per million tokens. For volume discounts and enterprise features, see our pricing page.

Rate Limits

TierRequests/minInput Tokens/minDaily Tokens
Free3060k1M
Developer1K1MN/A

Endpoints

Chat Completions
Completions

Features

Streaming
Structured Outputs
Tool Calling
Tool Calling w/ Structured Outputs

Need Higher Limits?

Reach out for custom pricing with our Enterprise tier for higher rate limits and dedicated support.