curl --request POST \
  --url https://api.cerebras.ai/management/v1/orgs/{org_name}/models:upload \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model_arch_id": "llama-3.3-70b",
  "model": {
    "weight_uri": "s3://my-bucket/model-weights",
    "version_aliases": [
      "production",
      "v1-stable"
    ],
    "system_fingerprint_suffix": "my-custom-suffix"
  }
}
'

Example response:

{
  "name": "orgs/my-org/models/llama-3.3-70b/versions/1",
  "done": false,
  "response": {
    "customer_s3_uri": "s3://my-bucket/model-weights",
    "version_aliases": [
      "production",
      "v1-stable"
    ],
    "sync_status": "queued"
  }
}

To grant Cerebras read access to your model weights, attach a bucket policy such as:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "<cerebras-provided-iam-role-arn>"
      },
      "Action": [
        "s3:GetObject",
        "s3:GetObjectVersion",
        "s3:ListBucket",
        "s3:GetBucketLocation"
      ],
      "Resource": [
        "arn:aws:s3:::<your-bucket-name>",
        "arn:aws:s3:::<your-bucket-name>/*"
      ]
    }
  ]
}
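The policy above can also be generated programmatically before attaching it to your bucket. A minimal sketch; the bucket name and role ARN below are placeholders, and the real ARN is supplied by Cerebras:

```python
import json


def make_cerebras_bucket_policy(bucket_name: str, cerebras_role_arn: str) -> str:
    """Build the read-only bucket policy shown above as a JSON string."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Cross-account principal: the Cerebras-provided IAM role.
                "Principal": {"AWS": cerebras_role_arn},
                "Action": [
                    "s3:GetObject",
                    "s3:GetObjectVersion",
                    "s3:ListBucket",
                    "s3:GetBucketLocation",
                ],
                # Both the bucket itself (for ListBucket) and its objects.
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",
                    f"arn:aws:s3:::{bucket_name}/*",
                ],
            }
        ],
    }
    return json.dumps(policy, indent=2)


# Example with hypothetical values:
print(make_cerebras_bucket_policy("my-bucket", "arn:aws:iam::123456789012:role/cerebras-access"))
```

The resulting JSON can be attached with `aws s3api put-bucket-policy --bucket my-bucket --policy file://policy.json`.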
Replace <your-bucket-name> with your S3 bucket name. The IAM role ARN (<cerebras-provided-iam-role-arn>) is provided by Cerebras and enables secure cross-account access to your model weights.

Authorization
Management API key, generated from the Management API keys section on the API keys page at https://cloud.cerebras.ai. Use the format: Bearer <MANAGEMENT_API_KEY>.

org_name
Cerebras customer management organization name, found under Management API keys on the API keys page at https://cloud.cerebras.ai.
Note: This is not to be confused with org_id.

model_arch_id
Name of the model architecture (e.g. llama3.1-8b, llama-3.3-70b).
weight_uri
S3 URI to customer-provided weights. Must match pattern: ^s3://[a-zA-Z0-9.\-_/~]+$.

version_aliases
Optional list of aliases that can be used to refer to this new version instead of the autogenerated integer number. Each alias must match pattern: ^[a-z][a-zA-Z0-9-]{0,126}[a-z0-9]$.
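The two patterns above can be checked client-side before calling the API. A minimal sketch using the documented regexes (the function name is illustrative, not part of the API):

```python
import re

# Patterns documented for weight_uri and version_aliases.
WEIGHT_URI_RE = re.compile(r"^s3://[a-zA-Z0-9.\-_/~]+$")
ALIAS_RE = re.compile(r"^[a-z][a-zA-Z0-9-]{0,126}[a-z0-9]$")


def validate_upload_fields(weight_uri: str, version_aliases: list[str]) -> list[str]:
    """Return a list of validation errors; empty if everything matches."""
    errors = []
    if not WEIGHT_URI_RE.match(weight_uri):
        errors.append(f"invalid weight_uri: {weight_uri!r}")
    for alias in version_aliases:
        if not ALIAS_RE.match(alias):
            errors.append(f"invalid alias: {alias!r}")
    return errors
```

Note that the alias pattern requires a lowercase first character, an alphanumeric last character, and an overall length of 2 to 128 characters.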
system_fingerprint_suffix
Optional custom suffix appended after the auto-generated fingerprint. The final fingerprint is fp_{backend_hash}_{model_arch_id}_v{version_id}_{custom_suffix} if a suffix is provided, or fp_{backend_hash}_{model_arch_id}_v{version_id} otherwise.
Must match pattern: ^[a-zA-Z0-9-]{1,64}$.
This suffix appears in the system_fingerprint field of inference responses, letting you identify which model version served a request.
Example:

curl -X POST "https://api.cerebras.ai/v1/chat/completions" \
  -H "Authorization: Bearer $CEREBRAS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "my-org-gpt-oss-120b", "messages": [{"content": "Hello!", "role": "user"}]}'

# Response includes:
{"system_fingerprint": "fp_70185065a4_gpt-oss-120b_v123_super-cool-version", ...}

In this example:

fp_70185065a4 - Cerebras backend configuration used to serve the request
gpt-oss-120b - Model architecture ID
123 - Model version ID
super-cool-version - Custom suffix set during upload

Successful Response
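Conversely, a response fingerprint can be split back into its parts. A sketch under the assumption that the architecture ID never contains a literal _v<digits> segment (true for the examples shown here):

```python
import re

# fp_{backend_hash}_{model_arch_id}_v{version_id}[_{custom_suffix}]
FP_RE = re.compile(r"^fp_([0-9a-f]+)_(.+?)_v(\d+)(?:_([A-Za-z0-9-]{1,64}))?$")


def parse_system_fingerprint(fp: str) -> dict:
    """Split a system_fingerprint into backend hash, arch ID, version, suffix."""
    m = FP_RE.match(fp)
    if not m:
        raise ValueError(f"unrecognized fingerprint: {fp!r}")
    backend_hash, arch, version, suffix = m.groups()
    return {
        "backend_hash": backend_hash,
        "model_arch_id": arch,
        "version_id": int(version),
        "suffix": suffix,  # None when no custom suffix was set at upload
    }
```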
name
Model version name in the format orgs/<org_name>/models/<model_arch_id>/versions/<version_id>. <version_id> is an auto-incrementing integer assigned by the system upon successful upload. This full model version name can be used for model deployment.

done
Whether the operation is complete.

response
Response data with model version information.
customer_s3_uri
S3 URI provided by the customer.

version_aliases
List of aliases that can be used to refer to this version.

sync_status
Status of the weight sync operation. Possible values: not_started, queued, in_progress, done, error, skipped.
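Because the upload response comes back with done: false and sync_status typically starting at queued, clients usually poll the operation until it reaches a terminal state. A minimal sketch with an injected fetch_status callable, so the HTTP call itself (which this document does not specify) stays abstract:

```python
import time
from typing import Callable

# Assumed terminal states, based on the sync_status values listed above.
TERMINAL_STATES = {"done", "error", "skipped"}


def wait_for_sync(fetch_status: Callable[[], str],
                  poll_interval: float = 5.0,
                  max_attempts: int = 60) -> str:
    """Poll until sync_status reaches a terminal state; return the final state."""
    status = "not_started"
    for attempt in range(max_attempts):
        status = fetch_status()
        if status in TERMINAL_STATES:
            return status
        if attempt < max_attempts - 1:
            time.sleep(poll_interval)
    raise TimeoutError(f"sync still {status!r} after {max_attempts} attempts")
```

In practice fetch_status would GET the operation and read response["sync_status"]; here it is stubbed so the control flow can be tested in isolation.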