How It Works
- Define the tool: Provide a name, description, and input parameters for each tool you want the model to access.
- Send the request: The prompt is sent along with available tool definitions in your API call.
- Model decides: The model analyzes the prompt and its available tools to decide if a tool can help answer the question. If it decides to use a tool, it responds with a structured output indicating which tool to call and what arguments to use.
- Execute the tool: The client application receives the model’s tool call request, executes the specified tool (such as calling an external API), and retrieves the result.
- Generate final response: The result from the tool is sent back to the model, which can then use this new information to generate a final, accurate response to the user.
Basic Tool Calling
1
Initial Setup
To begin, we need to import the necessary libraries and set up our Cerebras client.
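A minimal sketch of that setup, assuming the Cerebras Python SDK is installed and CEREBRAS_API_KEY is set in your environment:

```python
import json  # used later to parse tool call arguments
import os

from cerebras.cloud.sdk import Cerebras

# Initialize the client; the API key is read from the environment.
client = Cerebras(api_key=os.environ.get("CEREBRAS_API_KEY"))
```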
2
Setting Up the Tool
Our first step is to define the tool that our AI will use. In this example, we’re creating a simple calculator function that can perform basic arithmetic operations.
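One possible sketch, where the function name and its reliance on Python expression evaluation are illustrative choices rather than requirements:

```python
def calculate(expression):
    """Evaluate a basic arithmetic expression and return the result as a string."""
    try:
        # Restrict evaluation to arithmetic by removing access to builtins.
        result = eval(expression, {"__builtins__": {}}, {})
        return str(result)
    except Exception:
        return "Error: invalid expression"
```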
3
Defining the Tool Schema
Next, we define the tool schema. This schema acts as a blueprint for the AI, describing the tool’s functionality, when to use it, and what parameters it expects. It helps the AI understand how to interact with our custom tool effectively.
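A schema for the calculator above might look like the following; the field descriptions are illustrative:

```python
tools = [
    {
        "type": "function",
        "function": {
            "name": "calculate",
            "strict": True,
            "description": "A calculator that can perform basic arithmetic. Use this whenever math is required.",
            "parameters": {
                "type": "object",
                "properties": {
                    "expression": {
                        "type": "string",
                        "description": "The arithmetic expression to evaluate, e.g. '15 * 7'",
                    }
                },
                "required": ["expression"],
                "additionalProperties": False,
            },
        },
    }
]
```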
With strict: true enabled, tool call arguments are guaranteed to match your schema exactly through constrained decoding.
4
Making the API Call
With our tool and its schema defined, we can now set up the conversation for our AI. We will prompt the LLM in natural language to perform a simple calculation, then make the API call. This call sends our messages and tool schema to the LLM, allowing it to generate a response that may include tool use.
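A sketch of that request, building on the client and tools list defined above (the model name is an assumption; use any tool-capable model):

```python
messages = [
    {"role": "user", "content": "What is 15 multiplied by 7?"}
]

response = client.chat.completions.create(
    model="llama-3.3-70b",  # assumed model name
    messages=messages,
    tools=tools,
)
```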
5
Handling Tool Calls
Now that we’ve made the API call, we need to process the response and handle any tool calls the LLM might have made. The LLM decides, based on the prompt, whether to rely on a tool to respond to the user, so we need to check for tool calls and handle them appropriately.
In the code below, we first check whether the model’s response contains any tool calls. If one is present, we execute it and ensure the function is fulfilled correctly. The function call is logged to show that the model is requesting a tool call, and the result of the tool call is logged to clarify that it is not the model’s final output but rather the result of fulfilling its request. The result is then passed back to the model so it can continue generating a final response.
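A sketch of that handling logic, assuming the OpenAI-compatible response shape (message.tool_calls entries with a function name and JSON-encoded arguments, answered with role "tool" messages):

```python
choice = response.choices[0].message

if choice.tool_calls:
    tool_call = choice.tool_calls[0]
    if tool_call.function.name == "calculate":
        arguments = json.loads(tool_call.function.arguments)
        print(f"Model requested tool call: calculate({arguments['expression']})")

        # Execute the tool locally; this is our code running, not the model.
        result = calculate(arguments["expression"])
        print(f"Tool result (not the model's final answer): {result}")

        # Append the assistant's tool call and our result, then ask the model to continue.
        messages.append(choice)
        messages.append({
            "role": "tool",
            "tool_call_id": tool_call.id,
            "content": result,
        })

        final = client.chat.completions.create(
            model="llama-3.3-70b",  # assumed model name
            messages=messages,
        )
        print(final.choices[0].message.content)
```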
Strict Mode for Tool Calling
Strict mode ensures that the model generates tool call arguments that exactly match your defined schema. This is essential for building reliable agentic workflows where invalid parameters could break your application.
Why Strict Mode Matters for Tools
Without strict mode, tool calls might include:
- Wrong parameter types (e.g., "2" instead of 2)
- Missing required parameters
- Unexpected extra parameters
- Malformed argument JSON
Enabling Strict Mode
Set strict to true inside the function object of your tool definition:
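For illustration, a sketch with a hypothetical get_weather tool, showing where strict sits in the definition:

```python
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical example tool
            "strict": True,  # arguments will exactly match the schema below
            "description": "Get the current temperature for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"},
                },
                "required": ["city"],
                "additionalProperties": False,  # required in strict mode
            },
        },
    }
]
```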
Schema Requirements
When using strict mode, you must set additionalProperties: false. This is required for every object in your schema.
For information about schema limitations that apply when using strict mode, see Limitations in Strict Mode.
Strict Mode with Parallel Tool Calling
Strict mode works with parallel tool calling. When multiple tools are called simultaneously, each tool call’s arguments will conform to its respective schema:
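A sketch combining the two features, reusing the strict get_weather tool defined above (the model name is an assumption):

```python
response = client.chat.completions.create(
    model="llama-3.3-70b",  # assumed model name
    messages=[{"role": "user", "content": "What's the weather in Paris and in Rome?"}],
    tools=tools,  # the strict get_weather tool defined above
    parallel_tool_calls=True,
)

# Each parallel call's arguments are guaranteed to parse and match its tool's schema.
for tool_call in response.choices[0].message.tool_calls or []:
    args = json.loads(tool_call.function.arguments)
    print(tool_call.function.name, args)
```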
Multi-turn Tool Calling
Most real-world workflows require more than one tool invocation. Multi-turn tool calling lets a model call a tool, incorporate its output, and then, within the same conversation, decide whether it needs to call the tool (or another tool) again to finish the task. It works as follows:
- After every tool call, you append the tool response to messages, then ask the model to continue.
- The model itself decides when enough information has been gathered to produce a final answer.
- Continue calling client.chat.completions.create() until you get a message without tool_calls.
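A minimal version of that loop, reusing the calculator tool and client from the earlier steps (the model name is an assumption):

```python
messages = [{"role": "user", "content": "What is 15 multiplied by 7, minus 12?"}]

while True:
    response = client.chat.completions.create(
        model="llama-3.3-70b",  # assumed model name
        messages=messages,
        tools=tools,
    )
    message = response.choices[0].message

    # No tool calls means the model has produced its final answer.
    if not message.tool_calls:
        print(message.content)
        break

    # Otherwise, fulfill each tool call and let the model continue.
    messages.append(message)
    for tool_call in message.tool_calls:
        args = json.loads(tool_call.function.arguments)
        # In real code, dispatch on tool_call.function.name.
        result = calculate(args["expression"])
        messages.append({
            "role": "tool",
            "tool_call_id": tool_call.id,
            "content": result,
        })
```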
Parallel Tool Calling
Parallel tool calling allows a model to request multiple tool calls in a single response. For example, if a user asks “Is Toronto warmer than Montreal?”, the model needs to check the weather in both cities. Rather than making two separate requests, parallel tool calling enables the model to request both operations at once, reducing latency and improving efficiency. Parallel tool calling is most beneficial when:
- A single query requires multiple independent data points (e.g., comparing weather in different cities)
- Multiple tools need to be invoked that don’t have dependencies on each other
- You want to reduce the number of API calls and overall response time
Enable Parallel Tool Calling
You can explicitly control this behavior using the parallel_tool_calls parameter:
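For example (the model name is an assumption):

```python
response = client.chat.completions.create(
    model="llama-3.3-70b",  # assumed model name
    messages=messages,
    tools=tools,
    parallel_tool_calls=True,  # set to False to force at most one tool call per turn
)
```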
Example: Weather Comparison
Let’s walk through a complete example that demonstrates parallel tool calling by comparing weather in two cities.
1
Define the Weather Tool
First, we’ll create a simple weather function and define the tool in our schema:
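A sketch of the tool and its schema; the hard-coded temperatures are placeholder data for the example:

```python
def get_weather(city):
    """Return a stubbed current temperature for a city, in Celsius."""
    fake_data = {"Toronto": 22, "Montreal": 18}  # placeholder values
    temp = fake_data.get(city, 20)
    return json.dumps({"city": city, "temperature_c": temp})

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "strict": True,
            "description": "Get the current temperature for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name, e.g. 'Toronto'"},
                },
                "required": ["city"],
                "additionalProperties": False,
            },
        },
    }
]
```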
2
Make the API Call with Parallel Tool Calling Enabled
Now we’ll send a query that requires checking weather in two different cities:
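Building on the setup above (the model name is an assumption):

```python
messages = [{"role": "user", "content": "Is Toronto warmer than Montreal?"}]

response = client.chat.completions.create(
    model="llama-3.3-70b",  # assumed model name
    messages=messages,
    tools=tools,
    parallel_tool_calls=True,
)
```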
3
Handle Multiple Tool Calls
When parallel tool calling is enabled, the model’s response may contain multiple tool calls in the tool_calls array. We need to iterate through all of them:
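A sketch of that loop, appending one role "tool" message per call before asking the model for its final comparison:

```python
message = response.choices[0].message

if message.tool_calls:
    # Record the assistant turn that requested the tools.
    messages.append(message)

    # Execute every requested call and append each result.
    for tool_call in message.tool_calls:
        args = json.loads(tool_call.function.arguments)
        result = get_weather(args["city"])
        messages.append({
            "role": "tool",
            "tool_call_id": tool_call.id,
            "content": result,
        })

    # Send the results back so the model can compare the two cities.
    final = client.chat.completions.create(
        model="llama-3.3-70b",  # assumed model name
        messages=messages,
    )
    print(final.choices[0].message.content)
```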
