POST /management/v1/orgs/{org_name}/models:upload
Upload model version
curl --request POST \
  --url https://api.cerebras.ai/management/v1/orgs/{org_name}/models:upload \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model_arch_id": "gpt-oss-120b",
  "model": {
    "weight_uri": "s3://my-bucket/model-weights",
    "version_aliases": [
      "production",
      "v1-stable"
    ],
    "system_fingerprint_suffix": "my-custom-suffix"
  }
}
'
Example response (200 - application/json)
{
  "name": "orgs/my-org/models/gpt-oss-120b/versions/1",
  "done": false,
  "response": {
    "customer_s3_uri": "s3://my-bucket/model-weights",
    "version_aliases": [
      "production",
      "v1-stable"
    ],
    "sync_status": "queued"
  }
}
This feature is in Private Preview. For access or more information, contact us or reach out to your account representative.
Uploads a new fine-tuned model version to Cerebras. A model version is a custom variant of a Cerebras-supported model architecture. Before calling this endpoint, you must configure an S3 bucket that grants cross-account access to Cerebras. See S3 Bucket Setup for instructions.

Authorizations

Authorization
string
header
required

Management API key generated from the Management API keys section on the API keys page at https://cloud.cerebras.ai. Use the format: Bearer <MANAGEMENT_API_KEY>

Path Parameters

org_name
string
required

Cerebras customer management organization name, found under Management API keys on the API keys page at https://cloud.cerebras.ai.

Note: org_name is not the same as org_id. Do not use one in place of the other.

Body

application/json
model_arch_id
string
required

Name of the model architecture (e.g. llama3.1-8b, gpt-oss-120b).

model
ModelVersionUploadRequestModel · object
required
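
As a rough illustration of how the path parameter, Authorization header, and body fit together, the following Python sketch builds the same request as the curl example using only the standard library. It constructs the request without sending it. The helper name build_upload_request is hypothetical, the fields inside model are taken from the example request above (this page does not expand the ModelVersionUploadRequestModel schema), and treating version_aliases and system_fingerprint_suffix as optional is an assumption.

```python
import json
import urllib.parse
import urllib.request

# Base URL taken from the curl example on this page.
BASE_URL = "https://api.cerebras.ai/management/v1"


def build_upload_request(org_name, management_api_key, model_arch_id,
                         weight_uri, version_aliases=None,
                         system_fingerprint_suffix=None):
    """Build (but do not send) the POST request for models:upload.

    Hypothetical helper. Field names mirror the example request;
    which of the `model` fields are optional is an assumption.
    """
    model = {"weight_uri": weight_uri}
    if version_aliases:
        model["version_aliases"] = list(version_aliases)
    if system_fingerprint_suffix is not None:
        model["system_fingerprint_suffix"] = system_fingerprint_suffix

    body = {"model_arch_id": model_arch_id, "model": model}

    # Percent-encode org_name so it is safe to place in the URL path.
    org = urllib.parse.quote(org_name, safe="")
    url = f"{BASE_URL}/orgs/{org}/models:upload"

    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {management_api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send the request. Keeping construction separate from sending makes it easy to inspect the URL and payload before anything is uploaded.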

Response

200 - application/json

Successful Response

name
string
required

Model version name in the format orgs/<org_name>/models/<model_arch_id>/versions/<version_id>, where <version_id> is an auto-incrementing integer that the system assigns upon successful upload. Use this full model version name when deploying the model.
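
If a client needs the individual components of the name, the documented format can be split with a small parser. This is a minimal sketch, not part of the API: parse_model_version_name and ModelVersionName are hypothetical names, and the pattern assumes that neither org_name nor model_arch_id contains a slash.

```python
import re
from typing import NamedTuple


class ModelVersionName(NamedTuple):
    org_name: str
    model_arch_id: str
    version_id: int


# Pattern follows the documented format:
# orgs/<org_name>/models/<model_arch_id>/versions/<version_id>
_NAME_RE = re.compile(r"^orgs/([^/]+)/models/([^/]+)/versions/(\d+)$")


def parse_model_version_name(name):
    """Split a full model version name into its three components."""
    match = _NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a model version name: {name!r}")
    org_name, model_arch_id, version_id = match.groups()
    return ModelVersionName(org_name, model_arch_id, int(version_id))
```

For the example response on this page, parse_model_version_name("orgs/my-org/models/gpt-oss-120b/versions/1") yields org_name "my-org", model_arch_id "gpt-oss-120b", and version_id 1.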

done
boolean
required

Whether the operation is complete.

response
ModelVersionUploadResponseData · object

Response data with model version information.
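
The response fields described above can be read defensively, since response is not marked as required. The sketch below is a hypothetical helper (summarize_upload_operation is not part of any Cerebras SDK) that pulls out the fields shown in the example response. The only sync_status value visible on this page is "queued", so the code makes no assumption about other values.

```python
import json


def summarize_upload_operation(payload):
    """Extract the key fields from a models:upload response body.

    `payload` is the raw JSON text returned by the endpoint.
    `response` may be absent, so its fields fall back to defaults.
    """
    operation = json.loads(payload)
    data = operation.get("response") or {}
    return {
        "name": operation["name"],   # required per this reference
        "done": operation["done"],   # required per this reference
        "sync_status": data.get("sync_status"),
        "version_aliases": data.get("version_aliases", []),
    }
```

Applied to the example response on this page, the summary has done set to False and sync_status set to "queued", which indicates the upload has been accepted but is not yet complete.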