curl --request POST \
  --url https://api.cerebras.ai/management/v1/orgs/{org_name}/models:upload \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model_arch_id": "llama-3.3-70b",
  "model": {
    "weight_uri": "s3://my-bucket/model-weights",
    "version_aliases": [
      "production",
      "v1-stable"
    ],
    "system_fingerprint_suffix": "my-custom-suffix"
  }
}
'

Example response:

{
  "name": "orgs/my-org/models/llama-3.3-70b/versions/1",
  "done": false,
  "response": {
    "customer_s3_uri": "s3://my-bucket/model-weights",
    "version_aliases": [
      "production",
      "v1-stable"
    ],
    "sync_status": "queued"
  }
}

To grant Cerebras read access to your model weights, attach a bucket policy such as:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "<cerebras-provided-iam-role-arn>"
      },
      "Action": [
        "s3:GetObject",
        "s3:GetObjectVersion",
        "s3:ListBucket",
        "s3:GetBucketLocation"
      ],
      "Resource": [
        "arn:aws:s3:::<your-bucket-name>",
        "arn:aws:s3:::<your-bucket-name>/*"
      ]
    }
  ]
}
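The policy above can also be generated programmatically before attaching it to your bucket. A minimal sketch; the bucket name and role ARN below are placeholders, and the real ARN is supplied by Cerebras:

```python
import json


def make_cerebras_bucket_policy(bucket_name: str, cerebras_role_arn: str) -> str:
    """Build the read-only bucket policy shown above as a JSON string."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Cross-account principal: the Cerebras-provided IAM role.
                "Principal": {"AWS": cerebras_role_arn},
                "Action": [
                    "s3:GetObject",
                    "s3:GetObjectVersion",
                    "s3:ListBucket",
                    "s3:GetBucketLocation",
                ],
                # Both the bucket itself (for ListBucket) and its objects.
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",
                    f"arn:aws:s3:::{bucket_name}/*",
                ],
            }
        ],
    }
    return json.dumps(policy, indent=2)


# Example with hypothetical values:
print(make_cerebras_bucket_policy("my-bucket", "arn:aws:iam::123456789012:role/cerebras-access"))
```

The resulting JSON can be attached with `aws s3api put-bucket-policy --bucket my-bucket --policy file://policy.json`.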
Replace <your-bucket-name> with your S3 bucket name. The IAM role ARN (<cerebras-provided-iam-role-arn>) is provided by Cerebras and enables secure cross-account access to your model weights.

Authorization
Management API key, generated from the Management API keys section on the API keys page at https://cloud.cerebras.ai. Use the format: Bearer <MANAGEMENT_API_KEY>.

org_name
Cerebras customer management organization name, found under Management API keys on the API keys page at https://cloud.cerebras.ai.
Note: This is not to be confused with org_id.

model_arch_id
Name of the model architecture (e.g. llama3.1-8b, llama-3.3-70b).
weight_uri
S3 URI to customer-provided weights. Must match pattern: ^s3://[a-zA-Z0-9.\-_/~]+$.

version_aliases
Optional list of aliases that can be used to refer to this new version instead of the autogenerated integer number. Each alias must match pattern: ^[a-z][a-zA-Z0-9-]{0,126}[a-z0-9]$.
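The two patterns above can be checked client-side before calling the API. A minimal sketch using the documented regexes (the function name is illustrative, not part of the API):

```python
import re

# Patterns documented for weight_uri and version_aliases.
WEIGHT_URI_RE = re.compile(r"^s3://[a-zA-Z0-9.\-_/~]+$")
ALIAS_RE = re.compile(r"^[a-z][a-zA-Z0-9-]{0,126}[a-z0-9]$")


def validate_upload_fields(weight_uri: str, version_aliases: list[str]) -> list[str]:
    """Return a list of validation errors; empty if everything matches."""
    errors = []
    if not WEIGHT_URI_RE.match(weight_uri):
        errors.append(f"invalid weight_uri: {weight_uri!r}")
    for alias in version_aliases:
        if not ALIAS_RE.match(alias):
            errors.append(f"invalid alias: {alias!r}")
    return errors
```

Note that the alias pattern requires a lowercase first character, an alphanumeric last character, and an overall length of 2 to 128 characters.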
system_fingerprint_suffix
Optional custom suffix appended after the auto-generated fingerprint. The final fingerprint is fp_{backend_hash}_{model_arch_id}_v{version_id}_{custom_suffix} if a suffix is provided, or fp_{backend_hash}_{model_arch_id}_v{version_id} otherwise.
Must match pattern: ^[a-zA-Z0-9-]{1,64}$.
This suffix appears in the system_fingerprint field of inference responses, letting you identify which model version served a request.
Example:

curl -X POST "https://api.cerebras.ai/v1/chat/completions" \
  -H "Authorization: Bearer $CEREBRAS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "my-org-gpt-oss-120b", "messages": [{"content": "Hello!", "role": "user"}]}'

# Response includes:
{"system_fingerprint": "fp_70185065a4_gpt-oss-120b_v123_super-cool-version", ...}

In this example:

fp_70185065a4 - Cerebras backend configuration used to serve the request
gpt-oss-120b - Model architecture ID
123 - Model version ID
super-cool-version - Custom suffix set during upload

Successful Response
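Conversely, a response fingerprint can be split back into its parts. A sketch under the assumption that the architecture ID never contains a literal _v<digits> segment (true for the examples shown here):

```python
import re

# fp_{backend_hash}_{model_arch_id}_v{version_id}[_{custom_suffix}]
FP_RE = re.compile(r"^fp_([0-9a-f]+)_(.+?)_v(\d+)(?:_([A-Za-z0-9-]{1,64}))?$")


def parse_system_fingerprint(fp: str) -> dict:
    """Split a system_fingerprint into backend hash, arch ID, version, suffix."""
    m = FP_RE.match(fp)
    if not m:
        raise ValueError(f"unrecognized fingerprint: {fp!r}")
    backend_hash, arch, version, suffix = m.groups()
    return {
        "backend_hash": backend_hash,
        "model_arch_id": arch,
        "version_id": int(version),
        "suffix": suffix,  # None when no custom suffix was set at upload
    }
```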
name
Model version name in the format orgs/<org_name>/models/<model_arch_id>/versions/<version_id>. <version_id> is an auto-incrementing integer assigned by the system upon successful upload. This full model version name can be used for model deployment.

done
Whether the operation is complete.

response
Response data with model version information.
customer_s3_uri
S3 URI provided by the customer.

version_aliases
List of aliases that can be used to refer to this version.

sync_status
Status of the weight sync operation. Possible values: not_started, queued, in_progress, done, error, skipped.
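Because the upload response comes back with done: false and sync_status typically starting at queued, clients usually poll the operation until it reaches a terminal state. A minimal sketch with an injected fetch_status callable, so the HTTP call itself (which this document does not specify) stays abstract:

```python
import time
from typing import Callable

# Assumed terminal states, based on the sync_status values listed above.
TERMINAL_STATES = {"done", "error", "skipped"}


def wait_for_sync(fetch_status: Callable[[], str],
                  poll_interval: float = 5.0,
                  max_attempts: int = 60) -> str:
    """Poll until sync_status reaches a terminal state; return the final state."""
    status = "not_started"
    for attempt in range(max_attempts):
        status = fetch_status()
        if status in TERMINAL_STATES:
            return status
        if attempt < max_attempts - 1:
            time.sleep(poll_interval)
    raise TimeoutError(f"sync still {status!r} after {max_attempts} attempts")
```

In practice fetch_status would GET the operation and read response["sync_status"]; here it is stubbed so the control flow can be tested in isolation.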