# Quickstart

> Make your first Cerebras API call in just minutes.

Get started with the world's fastest inference. Already familiar with LLM APIs? Skip straight to the [API reference](/api-reference/chat-completions) or try the [playground](https://cloud.cerebras.ai?utm_source=3pi_quickstart\&utm_campaign=docs).

<Steps>
  <Step title="Set up your API key">
    Visit the [Cloud Console](https://cloud.cerebras.ai?utm_source=3pi_quickstart\&utm_campaign=docs) and sign up or log in. Select **API Keys** in the left navigation bar to create a key. See [API Keys](/console/api-keys) for details.

    Set your API key as an environment variable so you don't have to pass it with every request:

    <CodeGroup>
      ```bash macOS / Linux theme={null}
      export CEREBRAS_API_KEY="your-api-key-here"
      ```

      ```powershell Windows (PowerShell) theme={null}
      $env:CEREBRAS_API_KEY = "your-api-key-here"
      ```

      ```bat Windows (CMD) theme={null}
      setx CEREBRAS_API_KEY "your-api-key-here"
      ```
    </CodeGroup>

    Confirm the variable is set:

    <CodeGroup>
      ```bash macOS / Linux theme={null}
      echo $CEREBRAS_API_KEY
      ```

      ```powershell Windows (PowerShell) theme={null}
      echo $env:CEREBRAS_API_KEY
      ```

      ```bat Windows (CMD) theme={null}
      echo %CEREBRAS_API_KEY%
      ```
    </CodeGroup>

    <Note>
      `export` and `$env:` set the variable for the current shell session only. `setx` on Windows persists the variable for future sessions but does not change the current one, so open a new terminal window before running the `echo` check. To persist the variable on macOS or Linux, add the `export` line to your `~/.zshrc`, `~/.bashrc`, or equivalent shell profile.
    </Note>
  </Step>

  <Step title="Install the SDK">
    Install the Cerebras SDK for your language of choice. You can also call the API directly with cURL (see Step 3).

    <CodeGroup>
      ```bash Python theme={null}
      pip install --upgrade cerebras_cloud_sdk
      ```

      ```bash Node.js theme={null}
      npm install @cerebras/cerebras_cloud_sdk@latest
      ```
    </CodeGroup>
  </Step>

  <Step title="Make your first API request">
    Run the following code to send a chat completion request:

    <CodeGroup>
      ```python Python theme={null}
      import os
      from cerebras.cloud.sdk import Cerebras

      client = Cerebras(
          api_key=os.environ.get("CEREBRAS_API_KEY"),
      )

      chat_completion = client.chat.completions.create(
          messages=[
              {
                  "role": "user",
                  "content": "Why is fast inference important?",
              }
          ],
          model="gpt-oss-120b",
      )

      print(chat_completion.choices[0].message.content)
      ```

      ```javascript Node.js theme={null}
      import Cerebras from '@cerebras/cerebras_cloud_sdk';

      const client = new Cerebras({
        apiKey: process.env['CEREBRAS_API_KEY'],
      });

      async function main() {
        const completion = await client.chat.completions.create({
          messages: [{ role: 'user', content: 'Why is fast inference important?' }],
          model: 'gpt-oss-120b',
        });

        console.log(completion.choices[0].message.content);
      }

      main();
      ```

      ```bash cURL theme={null}
      curl https://api.cerebras.ai/v1/chat/completions \
        -H "Content-Type: application/json" \
        -H "Authorization: Bearer ${CEREBRAS_API_KEY}" \
        -d '{
          "model": "gpt-oss-120b",
          "messages": [
            {"role": "user", "content": "Why is fast inference important?"}
          ]
        }'
      ```
    </CodeGroup>

    You should see a response like:

    ```
    Fast inference is important because it enables real-time interactions,
    reduces latency in production applications, and allows for more complex
    reasoning workflows within acceptable response times...
    ```
  </Step>
</Steps>
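
If you'd rather not install an SDK, the cURL request from step 3 can also be assembled with Python's standard library. The following is a minimal sketch, not an official client: `build_chat_request` and its `env` parameter are hypothetical names introduced here for illustration, while the URL, headers, and JSON body are copied from the cURL example. It only builds the request and checks that the key is present; it does not send anything.

```python
import json
import os
import urllib.request

# Endpoint taken from the cURL example above.
API_URL = "https://api.cerebras.ai/v1/chat/completions"


def build_chat_request(prompt, model="gpt-oss-120b", env=None):
    """Build (but do not send) the same POST request as the cURL example.

    `env` is a hypothetical hook for passing a custom mapping in tests;
    by default the real process environment is used.
    """
    env = os.environ if env is None else env
    api_key = env.get("CEREBRAS_API_KEY")
    if not api_key:
        # Fail fast with a clear message instead of waiting for a 401.
        raise RuntimeError("CEREBRAS_API_KEY is not set in this environment")

    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send the request. The JSON response is assumed to mirror the SDK objects shown in step 3 (`choices[0].message.content`); check the API reference before relying on that shape.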

## Common Errors

* `401 Unauthorized`: The API key is missing or invalid. Most often, `CEREBRAS_API_KEY` isn't set in the shell running your code. Re-run the `echo` command above in the same terminal to confirm. On Windows after `setx`, open a new terminal.
* `404 model not found`: The `model` ID is misspelled, deprecated, or not available on your account. See the full list of public models on the [Models page](/models/overview).
* `429 Too Many Requests`: You've hit a rate limit. Free accounts have lower per-minute limits than paid accounts. See [Rate limits](/support/rate-limits) for current quotas and how to request an increase.

For all status codes, see the [error reference](/support/error).
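
Of the errors above, only `429` is worth retrying; a `401` or `404` will keep failing until the key or model ID is fixed. One common way to handle this is exponential backoff, sketched below under some assumptions: `ApiStatusError` is a hypothetical stand-in for whatever exception your client raises (it is not a Cerebras SDK class), and `call_with_retry`, `max_attempts`, and `base_delay` are illustrative names and defaults rather than recommended values.

```python
import time


class ApiStatusError(Exception):
    """Hypothetical stand-in for a client error that carries an HTTP status code."""

    def __init__(self, status_code, message=""):
        super().__init__(message or f"HTTP {status_code}")
        self.status_code = status_code


def call_with_retry(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send(); retry only on 429, doubling the wait before each new attempt."""
    for attempt in range(1, max_attempts + 1):
        try:
            return send()
        except ApiStatusError as err:
            # 401 and 404 will not succeed on retry, so re-raise them immediately.
            # Also give up once the final attempt has been used.
            if err.status_code != 429 or attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

To use it, wrap the request in a function that raises `ApiStatusError` (or adapt the `except` clause to your client's own exception type) and pass that function as `send`. The SDK you use may already retry some failures on its own, so check its documentation before layering this on top.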

## Next Steps

* **Choose a model** — Find the right model for your use case in our [model selection guide](/models/choose-a-model).
* **Explore capabilities** — Add [streaming](/capabilities/streaming), [tool calling](/capabilities/tool-use), [structured outputs](/capabilities/structured-outputs), or [reasoning](/capabilities/reasoning) to your application.
* **Design for Cerebras** — Learn [architectural patterns](/resources/designing-for-cerebras) that take full advantage of wafer-scale inference.
* **Browse the API** — See all available endpoints and parameters in the [API reference](/api-reference/chat-completions).