# Streaming Responses

> Learn how to enable streaming responses in the Cerebras API.

<Tip>**To get started with a free API key, [click here](https://cloud.cerebras.ai?utm_source=3pi_streaming\&utm_campaign=capabilities).**</Tip>

The Cerebras API supports streaming responses: the message is sent back in chunks as it is generated, so it can be displayed incrementally. To enable streaming in Python, set the `stream` parameter to `True` in the `chat.completions.create` call. The call then returns an iterable that yields the chunks of the message.

In the Node.js SDK (TypeScript or JavaScript), set the `stream` property to `true` in the object passed to `chat.completions.create` and consume the returned stream with `for await`.

<Steps>
  <Step title="Initial Setup">
    Begin by importing the Cerebras SDK and setting up the client.

    <CodeGroup>
      ```python Python theme={null}
      import os
      from cerebras.cloud.sdk import Cerebras

      client = Cerebras(
          # This is the default and can be omitted
          api_key=os.environ.get("CEREBRAS_API_KEY"),
      )
      ```

      ```javascript Node.js theme={null}
      import Cerebras from '@cerebras/cerebras_cloud_sdk';

      const client = new Cerebras({
        apiKey: process.env['CEREBRAS_API_KEY'], // This is the default and can be omitted
      });
      ```
    </CodeGroup>
  </Step>

  <Step title="Streaming Responses">
    Pass `stream=True` (Python) or `stream: true` (Node.js) to `chat.completions.create`, then iterate over the returned stream and print each chunk's `delta.content` as it arrives.

    <CodeGroup>
      ```python Python theme={null}
      stream = client.chat.completions.create(
          messages=[
              {
                  "role": "user",
                  "content": "Why is fast inference important?",
              }
          ],
          model="gpt-oss-120b",
          stream=True,
      )

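      # Each chunk carries an incremental piece of the reply in `delta.content`.
      # The field can be empty on some chunks, hence the `or ""` fallback.
      for chunk in stream:
          print(chunk.choices[0].delta.content or "", end="", flush=True)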
      ```

      ```javascript Node.js theme={null}
      import Cerebras from '@cerebras/cerebras_cloud_sdk';

      const client = new Cerebras({
        apiKey: process.env['CEREBRAS_API_KEY'], // This is the default and can be omitted
      });

      async function main() {
        const stream = await client.chat.completions.create({
          messages: [{ role: 'user', content: 'Why is fast inference important?' }],
          model: 'gpt-oss-120b',
          stream: true,
        });
        for await (const chunk of stream) {
          process.stdout.write(chunk.choices[0]?.delta?.content || '');
        }
      }

      main();
      ```
    </CodeGroup>
  </Step>
</Steps>
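
The loops above print each chunk and then discard it. If you also need the complete message once the stream ends (for example, to append it to a conversation history), one option is to accumulate the pieces as they arrive. The following is a minimal sketch, not part of the SDK: `collect_stream` and `fake_chunk` are hypothetical helper names, and the stand-in chunks only mimic the `choices[0].delta.content` shape used in the examples above so that the snippet runs without an API key. The guard against a chunk with an empty `choices` list is a defensive assumption.

```python
from types import SimpleNamespace
from typing import Any, Iterable


def collect_stream(chunks: Iterable[Any], echo: bool = False) -> str:
    """Join the `delta.content` pieces of a stream into one string.

    Hypothetical helper: it relies only on the chunk shape shown in the
    examples above (`chunk.choices[0].delta.content`).
    """
    parts = []
    for chunk in chunks:
        # Assumption: skip any chunk that carries no choices at all.
        if not chunk.choices:
            continue
        piece = chunk.choices[0].delta.content
        if piece:  # ignore None or empty deltas
            if echo:
                # Display the text incrementally, as in the loop above.
                print(piece, end="", flush=True)
            parts.append(piece)
    return "".join(parts)


def fake_chunk(text):
    """Build a stand-in object shaped like a streamed chunk (demo only)."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


# Stand-in stream so the sketch runs offline; with the real client you
# would pass the object returned by `chat.completions.create(..., stream=True)`.
demo_stream = [
    fake_chunk("Fast "),
    fake_chunk(None),
    fake_chunk("inference"),
    SimpleNamespace(choices=[]),
]

full_text = collect_stream(demo_stream)
print(full_text)
```

With the client from the setup step, the same helper could be called as `collect_stream(stream, echo=True)` to print the reply incrementally and keep the full text at the end.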
