This guide will walk you step-by-step through using the Hugging Face InferenceClient to run inference on Cerebras hardware. Hugging Face acts as our “pay-as-you-go“ provider.
Install the Hugging Face Hub client
Create a new Hugging Face API key
Make an API call
"hf_your_api_key_here"
with your actual API key.What context length can I run?
What additional latency can I expect when using Cerebras through Hugging Face?
Why do I see “Wrong API Format“ when running the Hugging Face test code?