Completion Request

model
string
required

Available options: llama3.1-8b, llama3.1-70b

prompt
string | array

The prompt(s) to generate completions for, encoded as a string, array of strings, array of tokens, or array of token arrays.
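The four accepted prompt shapes can be told apart by a simple type check. The sketch below is illustrative only: `prompt_kind` is a hypothetical client-side helper, not part of the API, and the token IDs in the usage lines are made-up values.

```python
from typing import List, Union

# The four shapes described for the `prompt` field.
Prompt = Union[str, List[str], List[int], List[List[int]]]


def _is_token(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def prompt_kind(prompt: Prompt) -> str:
    """Return which of the four documented prompt shapes `prompt` has."""
    if isinstance(prompt, str):
        return "string"
    if isinstance(prompt, list) and prompt:
        if all(isinstance(p, str) for p in prompt):
            return "array of strings"
        if all(_is_token(p) for p in prompt):
            return "array of tokens"
        if all(isinstance(p, list) and p and all(_is_token(t) for t in p)
               for p in prompt):
            return "array of token arrays"
    raise ValueError("prompt does not match any documented shape")


# Hypothetical examples; the token IDs are placeholders.
print(prompt_kind("Once upon a time"))
print(prompt_kind(["First prompt", "Second prompt"]))
print(prompt_kind([101, 2023, 2003]))
print(prompt_kind([[101, 2023], [101, 2003]]))
```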

max_tokens
integer | null

The maximum number of tokens that can be generated in the completion. The total number of input tokens plus generated tokens is limited by the model’s context length.

temperature
number | null

The sampling temperature to use, between 0 and 1.5. Higher values make the output more random, while lower values make it more focused and deterministic.
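To see why lower temperatures give more focused output, the sketch below applies the usual temperature-scaled softmax to a small set of logits. It illustrates the general technique only; how the service applies temperature internally is not specified here, and treating a temperature of 0 as greedy selection is an assumption of this example.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Convert logits to probabilities after dividing by the temperature."""
    if temperature <= 0:
        # Assumption: treat 0 as greedy decoding (all mass on the top logit).
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]  # made-up values
focused = softmax_with_temperature(logits, 0.5)
random_ish = softmax_with_temperature(logits, 1.5)
# The top token receives more probability at the lower temperature.
print(focused[0] > random_ish[0])
```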

top_p
number | null

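An alternative to sampling with temperature, known as nucleus sampling: the model considers only the most likely tokens whose cumulative probability mass reaches top_p. For example, 0.1 means only the tokens comprising the top 10% of probability mass are considered.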
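The following is a minimal sketch of the standard top-p (nucleus) filtering step, included to make the probability-mass wording concrete. It is not the service's implementation; `top_p_filter` and the example distribution are hypothetical.

```python
from typing import Dict


def top_p_filter(probs: Dict[str, float], top_p: float) -> Dict[str, float]:
    """Keep the most likely tokens until their cumulative mass reaches top_p,
    then renormalize the kept probabilities so they sum to 1."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept: Dict[str, float] = {}
    total = 0.0
    for token, p in ranked:
        kept[token] = p
        total += p
        if total >= top_p:
            break
    return {token: p / total for token, p in kept.items()}


# Made-up next-token distribution.
probs = {"cat": 0.5, "dog": 0.3, "eel": 0.2}
# With top_p = 0.7, "cat" alone (0.5) is not enough, so "dog" is kept too.
print(sorted(top_p_filter(probs, 0.7)))
```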

stream
boolean | null

Whether to stream back partial results as they are generated, rather than returning the full completion at once.

echo
boolean | null

Echo back the prompt in addition to the completion.

stop
array<string> | null

Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
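The effect of stop sequences on the returned text can be mimicked client-side. The sketch below is only an illustration of the described behavior (cut at the earliest stop sequence, and do not include it); `truncate_at_stop` is a hypothetical helper, and the actual truncation happens on the server.

```python
from typing import List, Optional

MAX_STOP_SEQUENCES = 4  # limit stated for the `stop` field


def truncate_at_stop(text: str, stop: Optional[List[str]]) -> str:
    """Cut `text` just before the earliest occurrence of any stop sequence."""
    if not stop:
        return text
    if len(stop) > MAX_STOP_SEQUENCES:
        raise ValueError("at most 4 stop sequences are allowed")
    positions = [text.find(s) for s in stop if s]
    positions = [p for p in positions if p != -1]
    return text[:min(positions)] if positions else text


raw = "Paris is the capital of France.\nQ: What about Spain?"
print(truncate_at_stop(raw, ["\nQ:", "###"]))
```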

user
string | null

A unique identifier representing your end user, which can help Cerebras monitor and detect abuse.

seed
integer | null

If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result. Determinism is not guaranteed.
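Putting the request parameters together, a request body might be assembled as sketched below. `build_completion_request` is a hypothetical helper: it only builds and validates a JSON body from the fields documented above, and it makes no assumption about the endpoint URL, authentication, or client library. Omitting unset fields instead of sending null is a choice made for this example.

```python
import json
from typing import Any, Dict, List, Optional, Union

ALLOWED_MODELS = {"llama3.1-8b", "llama3.1-70b"}  # options listed for `model`


def build_completion_request(
    model: str,
    prompt: Union[str, list],
    max_tokens: Optional[int] = None,
    temperature: Optional[float] = None,
    top_p: Optional[float] = None,
    stream: Optional[bool] = None,
    echo: Optional[bool] = None,
    stop: Optional[List[str]] = None,
    user: Optional[str] = None,
    seed: Optional[int] = None,
) -> Dict[str, Any]:
    """Validate the documented constraints and return a JSON-ready dict."""
    if model not in ALLOWED_MODELS:
        raise ValueError(f"unknown model: {model!r}")
    if temperature is not None and not 0 <= temperature <= 1.5:
        raise ValueError("temperature must be between 0 and 1.5")
    if stop is not None and len(stop) > 4:
        raise ValueError("at most 4 stop sequences are allowed")
    body = {
        "model": model, "prompt": prompt, "max_tokens": max_tokens,
        "temperature": temperature, "top_p": top_p, "stream": stream,
        "echo": echo, "stop": stop, "user": user, "seed": seed,
    }
    # Drop fields that were left unset.
    return {key: value for key, value in body.items() if value is not None}


body = build_completion_request(
    "llama3.1-8b", "Write a haiku about rivers.",
    max_tokens=64, temperature=0.2, stop=["\n\n"], seed=1234,
)
print(json.dumps(body, indent=2))
```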

Completion Response

choices
object[]
required

The list of completion choices the model generated for the input prompt.

created
integer | null
required

The Unix timestamp (in seconds) of when the completion was created.

id
string

A unique identifier for the completion.

model
string

The model used for completion.

object
string
required

The object type, which is always “text_completion”.

system_fingerprint
string

This fingerprint represents the backend configuration that the model runs with.

It can be used in conjunction with the seed request parameter to detect backend changes that might affect determinism.
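One way to use the fingerprint alongside seed is to compare it across two responses to otherwise identical requests, as in the sketch below. `backend_changed` is a hypothetical helper, and the fingerprint strings are invented placeholders rather than real values.

```python
from typing import Optional


def backend_changed(first: dict, second: dict) -> Optional[bool]:
    """Compare `system_fingerprint` across two completion responses.

    Returns True if the fingerprints differ, False if they match, and
    None if either response lacks a fingerprint (the field is optional).
    """
    fp_first = first.get("system_fingerprint")
    fp_second = second.get("system_fingerprint")
    if fp_first is None or fp_second is None:
        return None
    return fp_first != fp_second


# Invented placeholder responses for two requests sent with the same seed.
monday = {"system_fingerprint": "fp_example_a"}
tuesday = {"system_fingerprint": "fp_example_b"}
if backend_changed(monday, tuesday):
    print("Backend changed; identical seeds may yield different output.")
```

If the fingerprints match and the outputs still differ, the difference is consistent with the best-effort nature of seeded sampling rather than with a backend change.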

usage
object

Usage statistics for the completion request.
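A response can be checked against the fields listed above before it is used. The sketch below is a hypothetical helper that relies only on the top-level fields documented here; the contents of each `choices` entry and of `usage` are not specified in this section, so they are passed through untouched, and the sample response is invented.

```python
from datetime import datetime, timezone

REQUIRED_FIELDS = ("choices", "created", "object")  # marked required above


def summarize_completion(response: dict) -> dict:
    """Validate the top-level response fields and return a small summary."""
    missing = [name for name in REQUIRED_FIELDS if name not in response]
    if missing:
        raise ValueError(f"response is missing required fields: {missing}")
    if response["object"] != "text_completion":
        raise ValueError(f"unexpected object type: {response['object']!r}")
    created = response["created"]  # Unix timestamp in seconds, may be null
    created_at = (
        datetime.fromtimestamp(created, tz=timezone.utc)
        if created is not None else None
    )
    return {
        "id": response.get("id"),
        "model": response.get("model"),
        "created_at": created_at,
        "num_choices": len(response["choices"]),
        "usage": response.get("usage"),  # structure not documented here
    }


# Invented sample; the choice entry is left as an empty placeholder object.
sample = {
    "id": "example-id",
    "object": "text_completion",
    "created": 1700000000,
    "model": "llama3.1-8b",
    "choices": [{}],
}
print(summarize_completion(sample)["created_at"].isoformat())
```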