Flowise is an open-source low-code tool for developers to build customized LLM orchestration flows and AI agents. With its intuitive drag-and-drop interface, you can easily create complex AI workflows without writing extensive code. By integrating Cerebras with Flowise, you can leverage the world’s fastest AI inference to power your Flowise applications with ultra-low latency and high throughput.
This guide covers the ChatCerebras v3.0 node, which includes a model dropdown selector and automatic integration tracking. If you’re using an older version, consider updating Flowise to get these enhanced features.

Prerequisites

Before you begin, ensure you have:
  • Cerebras API Key - Get a free API key at cloud.cerebras.ai.
  • Flowise Installation - Install Flowise locally or use Flowise Cloud.
  • Node.js 18 or higher - Required for running Flowise locally.

Install Flowise

Install Flowise via NPM

The easiest way to get started with Flowise is to install it globally using NPM:
npm install -g flowise
Alternatively, you can use Docker:
docker run -d -p 3000:3000 flowiseai/flowise
Start Flowise

Once installed, start Flowise with:
flowise start
This will launch Flowise on http://localhost:3000. Open this URL in your browser to access the Flowise interface.

Configure Cerebras in Flowise

Create a new Chatflow

In the Flowise UI, create a new chatflow to house your Cerebras-powered application:
  1. Click on “Chatflows” in the left sidebar
  2. Click the “+Add New” button
  3. Give your chatflow a descriptive name like “Cerebras Chat Assistant”
Add the ChatCerebras node

Flowise has a dedicated ChatCerebras node for seamless integration:
  1. In the canvas, click the “+” button or drag from the left panel
  2. Search for “ChatCerebras” in the Chat Models category
  3. Drag the ChatCerebras node onto the canvas
Configure the ChatCerebras node

Click on the ChatCerebras node to open its configuration panel and configure the following settings:

Required Settings:
  1. Connect Credential: Click to add your Cerebras API Key
    • If this is your first time, click “Create New”
    • Enter your API key from cloud.cerebras.ai (starts with csk-)
    • Give it a name like “Cerebras API”
    • Click “Add”
  2. Model Name: Select from the dropdown:
    • llama-3.3-70b - Best for complex reasoning and long-form content
    • qwen-3-32b - Balanced performance for general-purpose tasks
    • llama3.1-8b - Fastest model, ideal for simple tasks (default)
    • gpt-oss-120b - Largest model for demanding tasks
    • zai-glm-4.6 - Advanced reasoning and complex problem-solving
Optional Settings (under Additional Parameters):
  • Temperature: Control randomness (0.0 to 1.0, default 0.9)
  • Max Tokens: Maximum response length
  • Top P: Nucleus sampling parameter
  • Streaming: Enable for real-time token generation (default: true)
The ChatCerebras node automatically:
  • Configures the correct API endpoint (https://api.cerebras.ai/v1)
  • Adds the integration tracking header for better support
No manual configuration is needed for either of these.
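Under the hood, these settings correspond to standard OpenAI-compatible chat-completion parameters. As a rough illustration only (the field names below follow the OpenAI-compatible API; this is not Flowise's internal code), the node's configuration maps onto a request body like this:

```python
# Hypothetical sketch: how the ChatCerebras node's settings map onto an
# OpenAI-compatible chat-completions request body.
def build_request_body(model="llama3.1-8b", temperature=0.9,
                       max_tokens=None, top_p=None, streaming=True):
    body = {
        "model": model,              # Model Name dropdown (llama3.1-8b is the default)
        "temperature": temperature,  # Temperature (0.0 to 1.0, default 0.9)
        "stream": streaming,         # Streaming toggle (default: true)
    }
    # Optional parameters are only sent when explicitly set
    if max_tokens is not None:
        body["max_tokens"] = max_tokens  # Max Tokens
    if top_p is not None:
        body["top_p"] = top_p            # Top P (nucleus sampling)
    return body

print(build_request_body(model="llama-3.3-70b", max_tokens=500))
```

This is only meant to make the meaning of each setting concrete; in practice the node builds the request for you.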
Connect additional nodes

Build out your chatflow by adding other nodes to create a complete application:
  1. Add a Prompt Template - Click “+” and search for “Prompt Template” to customize your system prompts
  2. Add Memory (optional) - Search for “Buffer Memory” or “Conversation Buffer Memory” to maintain conversation context
  3. Connect the nodes - Draw connections between nodes by clicking and dragging from output ports to input ports
A basic flow might look like:
Prompt Template → ChatCerebras → Output
Or with memory:
Prompt Template → Buffer Memory → ChatCerebras → Output
Test your chatflow

Once your nodes are connected, test your Cerebras-powered chatflow:
  1. Click the “Save” button in the top right
  2. Click the “Chat” icon to open the test interface
  3. Send a test message like “Hello! What can you help me with?”
  4. You should receive a response from your Cerebras-powered chatflow
If you encounter any errors, check that your API key is correct and the Base URL is set to https://api.cerebras.ai/v1.

Using Cerebras with the Flowise API

Flowise automatically generates REST APIs for your chatflows, allowing you to integrate Cerebras-powered AI into any application.
Get your Chatflow API endpoint

In the Flowise UI:
  1. Open your chatflow
  2. Click the “API” button in the top right
  3. Copy the API endpoint URL (e.g., http://localhost:3000/api/v1/prediction/your-chatflow-id)
Make API requests to your chatflow

You can now call your Cerebras-powered chatflow from any application:
import requests

# Prediction endpoint for your chatflow (replace the ID with your own)
url = "http://localhost:3000/api/v1/prediction/your-chatflow-id"

# "question" is the user message; "overrideConfig" overrides node settings per request
payload = {
    "question": "What is the capital of France?",
    "overrideConfig": {
        "temperature": 0.7
    }
}

headers = {
    "Content-Type": "application/json"
}

response = requests.post(url, json=payload, headers=headers)
response.raise_for_status()
result = response.json()

# The generated answer is returned in the "text" field
print(result["text"])
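The overrideConfig field can carry per-request overrides for the node's settings. As a small sketch, here is a helper function (hypothetical, not part of Flowise or any SDK) that assembles such payloads:

```python
# Hypothetical helper for building Flowise prediction payloads.
# "question" and "overrideConfig" are the fields the prediction API expects.
def build_prediction_payload(question, **overrides):
    payload = {"question": question}
    if overrides:
        # Any keyword arguments become per-request setting overrides
        payload["overrideConfig"] = overrides
    return payload

print(build_prediction_payload("What is the capital of France?", temperature=0.7))
# With no overrides, the overrideConfig field is omitted entirely
print(build_prediction_payload("Hello"))
```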
The chatflow will use Cerebras Inference to generate responses, giving you the speed and performance of Cerebras through Flowise’s convenient API.

Direct Integration with OpenAI SDK

For advanced users who want to use Cerebras directly in custom Flowise nodes or external applications, you can use the OpenAI SDK with Cerebras configuration:
import os
from openai import OpenAI

# Point the OpenAI client at the Cerebras API endpoint
client = OpenAI(
    api_key=os.getenv("CEREBRAS_API_KEY"),
    base_url="https://api.cerebras.ai/v1",
    default_headers={
        # Identifies Flowise integration traffic to Cerebras
        "X-Cerebras-3rd-Party-Integration": "flowise"
    }
)

response = client.chat.completions.create(
    model="llama-3.3-70b",  # plain model name; the base URL already targets Cerebras
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"}
    ],
    max_tokens=500,
    temperature=0.7
)

print(response.choices[0].message.content)

Advanced Configuration

Using Environment Variables

For production deployments, store your Cerebras API key as an environment variable:
export CEREBRAS_API_KEY=your-api-key-here
Then in Flowise, reference it in the ChatCerebras credential configuration using ${CEREBRAS_API_KEY}.

Streaming Responses

To enable streaming responses for real-time output:
  1. In the ChatCerebras node, enable “Streaming”
  2. Your API responses will now stream tokens as they’re generated
  3. This is particularly useful for long-form content generation and provides a better user experience
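When streaming is enabled, the model's output arrives as a sequence of small text deltas that the client concatenates. A minimal sketch of that accumulation step, assuming chunks shaped like the OpenAI SDK's streaming deltas (a real client would iterate over the live stream rather than a list):

```python
# Minimal sketch: accumulating streamed token deltas into a full response.
# The chunk shape mimics OpenAI-style streaming; a real client would read
# these from the live event stream instead of a prepared list.
def accumulate_stream(chunks):
    parts = []
    for chunk in chunks:
        delta = chunk.get("delta", "")
        if delta:
            parts.append(delta)  # use print(delta, end="") for live display
    return "".join(parts)

simulated = [{"delta": "Paris "}, {"delta": "is the "}, {"delta": "capital."}, {}]
print(accumulate_stream(simulated))  # → Paris is the capital.
```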

Using Multiple Cerebras Models

You can create different chatflows for different use cases:
  • Fast responses: Use llama3.1-8b for quick, simple queries
  • Complex reasoning: Use llama-3.3-70b for complex reasoning and long-form content
  • General purpose: Use qwen-3-32b for balanced performance
  • Long context: Use gpt-oss-120b for processing large documents
  • Advanced reasoning: Use zai-glm-4.6 for demanding tasks
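If you route requests programmatically rather than through separate chatflows, the same mapping can live in code. A small sketch (the use-case labels are illustrative, not a Flowise feature; the model names come from the list above):

```python
# Illustrative mapping of use cases to Cerebras model names.
MODEL_FOR_USE_CASE = {
    "fast": "llama3.1-8b",
    "complex_reasoning": "llama-3.3-70b",
    "general": "qwen-3-32b",
    "long_context": "gpt-oss-120b",
    "advanced_reasoning": "zai-glm-4.6",
}

def pick_model(use_case):
    # Fall back to the fast default model for unknown use cases
    return MODEL_FOR_USE_CASE.get(use_case, "llama3.1-8b")

print(pick_model("general"))       # → qwen-3-32b
print(pick_model("unknown-task"))  # → llama3.1-8b
```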

FAQ

Why am I getting an authentication error?

This usually means your Cerebras API key is incorrect or not properly configured:
  1. Verify your API key at cloud.cerebras.ai
  2. Make sure there are no extra spaces or characters in the API key field
  3. If using environment variables, ensure they’re properly loaded
  4. Try regenerating your API key if the issue persists

Why do I get a “model not found” error?

Ensure you’re using the correct model name format:
  • Use llama-3.3-70b
  • Use qwen-3-32b
  • Use llama3.1-8b
  • Use gpt-oss-120b
Refer to the Cerebras models documentation for the complete list of available models.

Why are responses slow?

If you’re experiencing slow responses:
  1. Check your internet connection
  2. Verify the Base URL is set to https://api.cerebras.ai/v1 (not http://)
  3. Try reducing the max_tokens parameter
  4. Consider using a faster model like llama3.1-8b for simpler tasks
  5. Check Cerebras status page for any service issues

Is the integration tracking header added automatically?

Yes! As of ChatCerebras v3.0, the X-Cerebras-3rd-Party-Integration: flowise header is automatically included in all requests. You don’t need to manually configure anything. This header helps Cerebras:
  • Track integration usage and performance
  • Provide better support for Flowise users
  • Identify and resolve integration-specific issues faster

Can I use Cerebras with Flowise Cloud?

Yes! The same configuration works with Flowise Cloud:
  1. Sign up at flowiseai.com
  2. Create a new chatflow
  3. Configure the ChatCerebras node as described above
  4. Your chatflow will use Cerebras Inference in the cloud

How do I switch between models?

Switching models is easy with the dropdown selector:
  1. Click on the ChatCerebras node in your chatflow
  2. Click the “Model Name” dropdown
  3. Select your desired model from the list (each has a description to help you choose)
  4. Save the chatflow
  5. Test with the new model
You can also create multiple chatflows with different models for different use cases. The dropdown makes it easy to see all available options at a glance.

Does Flowise add latency on top of Cerebras inference?

Flowise adds minimal overhead since it primarily orchestrates the workflow. The actual inference is performed directly by Cerebras, so you’ll experience the same ultra-low latency that Cerebras is known for. Any additional latency is typically negligible (< 50ms) and comes from Flowise’s workflow orchestration.
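You can verify that overhead yourself by timing requests end to end. A minimal sketch using Python's standard library (the timed function below is a stand-in so the example runs offline; substitute a real chatflow request):

```python
import time

# Measure wall-clock latency of any callable, e.g. a Flowise prediction request.
def timed(fn, *args, **kwargs):
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1000
    return result, elapsed_ms

# Stand-in for a real request so the example runs without a server
def fake_chatflow_call(question):
    return f"echo: {question}"

result, ms = timed(fake_chatflow_call, "Hello")
print(result, f"({ms:.2f} ms)")
```

Comparing a timed call through Flowise against a timed direct call to the Cerebras API gives a concrete measure of the orchestration overhead.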