Context Length

Free Tier: 8,192 tokens
Paid Tiers: Up to 32k tokens
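
As a rough illustration of how the context length constrains a request, the sketch below computes how many output tokens remain after a prompt of a given size. It assumes the context length covers prompt and generated tokens together, which this page does not state explicitly; `max_output_budget` and `FREE_TIER_CONTEXT` are hypothetical names used only for this example.

```python
# Free-tier context length, taken from the figure listed above.
FREE_TIER_CONTEXT = 8_192


def max_output_budget(prompt_tokens: int, context_limit: int = FREE_TIER_CONTEXT) -> int:
    """Return how many tokens are left for output after the prompt.

    Assumption: prompt and output tokens share one context window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    # Never report a negative budget; an oversized prompt leaves zero room.
    return max(context_limit - prompt_tokens, 0)
```

For example, `max_output_budget(8000)` leaves 192 tokens on the free tier, and any prompt longer than the limit leaves none.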

Speed

~2,200 tokens/sec

Input / Output

Input Formats: JSON, plain text
Output Formats: JSON, plain text, structured

Model Notes

Model ID: llama3.1-8b

Rate Limits

Tier    Requests/min    Input Tokens/min    Output Tokens/min    Daily Tokens
Free    30              60k                 -                    1M
1       600             600k                60k                  245M
2       1000            1M                  100k                 415M
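
One way to work with these limits on the client side is sketched below. It encodes the table as a dictionary and checks whether a planned one-minute workload stays inside a tier. The values assume the rows read as tier, requests/min, input tokens/min, output tokens/min, daily tokens, with "k" meaning 1,000 and "M" meaning 1,000,000, and with "-" treated as no listed limit. `RATE_LIMITS`, `min_request_interval`, and `fits_minute_budget` are hypothetical names, not part of any official SDK.

```python
from typing import Dict, Optional

# Limits transcribed from the table above (assumed reading of each row).
# None marks the "-" entry, treated here as "no limit listed".
RATE_LIMITS: Dict[str, Dict[str, Optional[int]]] = {
    "free": {"requests": 30, "input": 60_000, "output": None, "daily": 1_000_000},
    "1": {"requests": 600, "input": 600_000, "output": 60_000, "daily": 245_000_000},
    "2": {"requests": 1000, "input": 1_000_000, "output": 100_000, "daily": 415_000_000},
}


def min_request_interval(tier: str) -> float:
    """Seconds to wait between evenly spaced requests to stay under requests/min."""
    return 60.0 / RATE_LIMITS[tier]["requests"]


def fits_minute_budget(tier: str, requests: int, input_tokens: int, output_tokens: int) -> bool:
    """Check a planned one-minute workload against the per-minute limits."""
    limits = RATE_LIMITS[tier]
    planned = {"requests": requests, "input": input_tokens, "output": output_tokens}
    for key, value in planned.items():
        limit = limits[key]
        # Skip dimensions where the table lists no limit.
        if limit is not None and value > limit:
            return False
    return True
```

On the free tier this gives a spacing of 2 seconds between requests. The check only covers per-minute limits; tracking the daily token total would need a running counter on top of it.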

Endpoints

Chat Completions
Completions

Features

Streaming
Structured Outputs
Streaming w/ Structured Outputs
Tool Calling
Multi-Turn Tool Calling
Tool Calling w/ Structured Outputs

Need Higher Limits?

For higher rate limits and dedicated support, reach out about custom pricing on our Enterprise tier.
