Model Name | Model ID | Parameters | Speed (tokens/s) |
---|---|---|---|
Llama 4 Scout | llama-4-scout-17b-16e-instruct | 109 billion | ~2600 |
Llama 3.1 8B | llama3.1-8b | 8 billion | ~2200 |
Llama 3.3 70B | llama-3.3-70b | 70 billion | ~2100 |
Qwen 3 32B | qwen-3-32b | 32 billion | ~2600 |

Model Name | Model ID | Parameters | Speed (tokens/s) |
---|---|---|---|
Llama 4 Maverick | llama-4-maverick-17b-128e-instruct | 400 billion | ~1500 |
Qwen 3 235B Instruct | qwen-3-235b-a22b-instruct-2507 | 235 billion | ~1400 |
Qwen 3 235B Thinking | qwen-3-235b-a22b-thinking-2507 | 235 billion | ~1700 |
DeepSeek R1 Distill Llama 70B* | deepseek-r1-distill-llama-70b | 70 billion | ~1700 |
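
The speed column can be turned into a rough generation-time estimate by dividing the number of output tokens by the listed tokens/s. The sketch below is illustrative only: the dictionary copies the approximate figures from the tables, the function name `estimate_generation_seconds` is hypothetical, and the result ignores network latency, queueing, and prompt processing, so treat it as a lower bound rather than a measured value.

```python
# Approximate output speeds (tokens/s), copied from the tables above.
# These are the "~" figures, not guaranteed throughput.
APPROX_TOKENS_PER_SECOND = {
    "llama-4-scout-17b-16e-instruct": 2600,
    "llama3.1-8b": 2200,
    "llama-3.3-70b": 2100,
    "qwen-3-32b": 2600,
    "llama-4-maverick-17b-128e-instruct": 1500,
    "qwen-3-235b-a22b-instruct-2507": 1400,
    "qwen-3-235b-a22b-thinking-2507": 1700,
    "deepseek-r1-distill-llama-70b": 1700,
}


def estimate_generation_seconds(model_id: str, output_tokens: int) -> float:
    """Return a rough lower bound on the time to generate `output_tokens`.

    Hypothetical helper: assumes the listed speed is sustained for the
    whole response and ignores every other source of latency.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return output_tokens / APPROX_TOKENS_PER_SECOND[model_id]


if __name__ == "__main__":
    # Compare all listed models for a 1,000-token response, fastest first.
    ranked = sorted(
        APPROX_TOKENS_PER_SECOND,
        key=APPROX_TOKENS_PER_SECOND.get,
        reverse=True,
    )
    for model_id in ranked:
        seconds = estimate_generation_seconds(model_id, 1000)
        print(f"{model_id}: about {seconds:.2f} s per 1,000 output tokens")
```

Reasoning-style models typically emit many more output tokens per answer, so the token count passed in matters as much as the speed figure when comparing models this way.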