The Cerebras API supports streaming responses, allowing messages to be sent back in chunks and displayed incrementally as they are generated. To enable this feature, set the stream parameter to True within the chat.completions.create method. This will result in the API returning an iterable containing the chunks of the message.

Similarly, the same can be done in TypeScript by setting the stream property to true within the chat.completions.create method.

Was this page helpful?