Introduction

As we explored in the previous section, large language model (LLM) applications can be significantly improved using the various components of agentic workflows. The first of these components that we’ll explore is tool use, which enables LLMs to perform more complex tasks than just text processing. 

In the context of AI agents, a tool is any external resource or capability that augments the core functionality of the LLM. The tools you’ll most often encounter when building AI agents are exposed through function calling, a subset of tool use that lets the LLM invoke predefined functions with specific parameters. These functions can perform calculations, retrieve data, or execute actions that the model itself cannot carry out directly.

To illustrate the value of tool use and function calling, let’s consider a financial analyst tasked with comparing the moving averages of two companies’ stock prices. 

Without function calling capabilities, an LLM would be of limited value to the analyst and would struggle to perform a detailed analysis. LLMs lack access to real-time or historical stock price data, making it difficult to work with up-to-date information. While they can explain concepts like moving averages and guide users through calculations, they aren’t reliable for precise mathematical operations. Additionally, because of their probabilistic nature, the results an LLM provides for complex calculations can be inconsistent or inaccurate.

Tool use and function calling address these limitations by allowing the LLM to:

  • Request specific stock data for the companies in question.

  • Invoke a dedicated function that accurately calculates the moving average.

  • Consistently produce precise results based on the provided data and specified parameters.

By utilizing a tool that performs an exact calculation of the moving average, the LLM can provide more reliable answers to the analyst. Using this very example, let’s build an AI agent with tool use capabilities to better understand the concept. 

Initial Setup

Before diving into building our tools, let’s begin by initializing the Cerebras Inference SDK. If this is your first time using our SDK, please visit our QuickStart guide for details on installation and obtaining API keys.

import os

from cerebras.cloud.sdk import Cerebras

client = Cerebras(
    api_key=os.environ.get("CEREBRAS_API_KEY"),
)

For the sake of simplicity, in this section we will only build a tool for calculating the moving average of stocks. We will handle the data loading step by reading local JSON files. In a production environment, this step could instead be handled by a tool that fetches real-time stock data from a financial API.

import json

with open("company_a.json") as f:
    company_a_data = json.load(f)

with open("company_b.json") as f:
    company_b_data = json.load(f)

available_data = {
    "company_a": company_a_data,
    "company_b": company_b_data,
}
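The moving average function defined in the next section reads a date and a closingPrice from each record, so each JSON file is presumably a list of objects with at least those two keys, ordered from oldest to newest. The sketch below shows that assumed shape with invented values. The validate_stock_data helper is hypothetical and not part of the SDK. It simply fails early if a file does not match the assumed layout.

```python
# Assumed record shape; the dates and prices are invented for illustration.
sample_data = [
    {"date": "2024-01-02", "closingPrice": 101.5},
    {"date": "2024-01-03", "closingPrice": 102.25},
    {"date": "2024-01-04", "closingPrice": 100.75},
]


def validate_stock_data(records):
    """Hypothetical helper: check each record has a date and a numeric closing price."""
    if not isinstance(records, list):
        raise ValueError("expected a list of records")
    for index, record in enumerate(records):
        for key in ("date", "closingPrice"):
            if key not in record:
                raise ValueError(f"record {index} is missing '{key}'")
        # float() raises ValueError or TypeError if the price is not numeric.
        float(record["closingPrice"])
    return len(records)


print(validate_stock_data(sample_data))  # 3
```

Running a check like this right after json.load surfaces malformed files before the model ever requests a calculation.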

Creating a Moving Average Calculator Tool

Now that we have initialized our client and have some data to work with, let’s build our first tool: a function that our LLM can call to calculate the moving average of a stock.

We’ll name our function calculate_moving_average. It computes the moving average of stock prices over a specified period. It first validates the input parameters and retrieves the relevant stock data. The function then iterates through the data, maintaining a sliding window of stock prices. For each day after the initial window, it calculates the average price within the window, rounds it to two decimal places, and stores the result along with the corresponding date. This continues until all num_days records have been processed, so the returned list contains num_days minus window_size moving averages.

def calculate_moving_average(
    data_reference: str, num_days: int, window_size: int
) -> list[dict]:
    """Return moving averages of closing prices as {"date", "movingAverage"} entries."""
    if data_reference not in available_data:
        raise ValueError(f"Invalid data reference. Available options: {list(available_data.keys())}")

    stock_data = available_data[data_reference]

    # Guard against a zero or negative window, which would divide by zero below.
    if window_size <= 0:
        raise ValueError("window_size must be a positive integer")

    if num_days < window_size:
        raise ValueError("num_days must be greater than or equal to window_size")

    if len(stock_data) < num_days:
        raise ValueError("Insufficient data for the specified number of days")

    # Keep only the most recent num_days records.
    recent_data: list[dict] = stock_data[-num_days:]
    moving_averages: list[dict] = []
    # Seed the window with the first window_size closing prices.
    price_window: list[float] = [
        float(item["closingPrice"]) for item in recent_data[:window_size]
    ]

    # Slide the window forward one day at a time.
    for i in range(window_size, num_days):
        current_data = recent_data[i]
        current_price = float(current_data["closingPrice"])

        price_window.append(current_price)
        price_window.pop(0)
        average = round(sum(price_window) / window_size, 2)

        moving_averages.append({"date": current_data["date"], "movingAverage": average})

    return moving_averages
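To make the sliding-window behavior concrete, here is a minimal, self-contained sketch that mirrors the same windowing logic on six invented closing prices. The sliding_averages function is a hypothetical stand-in written for illustration. It is not the tutorial’s function and it ignores dates.

```python
def sliding_averages(prices, window_size):
    """Hypothetical stand-in mirroring the windowing logic of calculate_moving_average."""
    # Seed the window with the first window_size prices.
    window = list(prices[:window_size])
    averages = []
    # For each later price, slide the window by one and record its average.
    for price in prices[window_size:]:
        window.append(price)
        window.pop(0)
        averages.append(round(sum(window) / window_size, 2))
    return averages


prices = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(sliding_averages(prices, window_size=3))  # [3.0, 4.0, 5.0]
```

The windows averaged here are [2, 3, 4], [3, 4, 5], and [4, 5, 6]. The average of the initial window itself (2.0) is not emitted, which matches the "for each day after the initial window" behavior described above.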

Tool Schema

In addition to our calculate_moving_average tool, we need to create a schema which provides context on when and how it can be used. You can think of the tool schema as a user manual for your AI agent. The more precise and informative your schema, the better equipped the AI becomes at determining when to utilize your tool and how to construct appropriate arguments. You can provide the schema to the Cerebras Inference API through the tools parameter, as described in the Tool Use section of the API documentation.

The schema is composed of three components: the name of the tool, a description of the tool, and what parameters it accepts.

For our schema, we’ll use Pydantic to ensure type safety and to simplify validating the arguments before they are passed to the function. This approach reduces the risk of errors that can arise from manually writing JSON schemas and keeps parameter definitions closely aligned with their usage in the code.

from pydantic import BaseModel, Field
from typing import Literal

class CalculateMovingAverageArgs(BaseModel):
    data_reference: Literal["company_a", "company_b"] = Field(..., description="The key to access specific stock data in the available_data dictionary.")
    num_days: int = Field(..., description="The number of recent days to consider for calculation.")
    window_size: int = Field(..., description="The size of the moving average window.")

tools = [
    {
        "type": "function",
        "function": {
            "name": "calculate_moving_average",
            "description": "Calculate the moving average of stock data for a specified number of recent days.",
            "parameters": CalculateMovingAverageArgs.model_json_schema(),
        },
    }
]
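For comparison, the sketch below writes the same parameters by hand as a plain JSON Schema dictionary, which is roughly what the Pydantic model is expected to generate. Treat it as an approximation: the generated schema typically also carries title entries and may differ in detail between Pydantic versions. The names manual_parameters and manual_tools are introduced here only for illustration.

```python
import json

# Hand-written approximation of the schema generated from the Pydantic model.
manual_parameters = {
    "type": "object",
    "properties": {
        "data_reference": {
            "type": "string",
            "enum": ["company_a", "company_b"],
            "description": "The key to access specific stock data.",
        },
        "num_days": {
            "type": "integer",
            "description": "The number of recent days to consider for calculation.",
        },
        "window_size": {
            "type": "integer",
            "description": "The size of the moving average window.",
        },
    },
    "required": ["data_reference", "num_days", "window_size"],
}

manual_tools = [
    {
        "type": "function",
        "function": {
            "name": "calculate_moving_average",
            "description": "Calculate the moving average of stock data for a specified number of recent days.",
            "parameters": manual_parameters,
        },
    }
]

# Printing the structure makes it easy to see exactly what the model receives.
print(json.dumps(manual_tools, indent=2))
```

Writing the schema out this way shows why generating it from a model is attractive: the enum values and required list have to be kept in sync with the function by hand.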

Integrating Tools Into the Workflow

Now that we have defined our calculate_moving_average function and created the corresponding tool schema, we need to integrate these components into our AI agent’s workflow. The next step is to set up the messaging structure and create a chat completion request using the Cerebras Inference SDK. This involves all of the standard components that comprise a chat completion request, including crafting an initial system message and the user’s query. We’ll then pass these messages along with our defined tools to the chat completion method. This setup allows the AI to understand the context, interpret the user’s request, and determine when and how to utilize the calculate_moving_average function to provide accurate responses.

messages = [
    {"role": "system", "content": "You are a helpful financial analyst. Use the supplied tools to assist the user."},
    {"role": "user", "content": "What's the 10-day moving average for company A over the last 50 days?"},
]

response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=messages,
    tools=tools,
)

Once the LLM receives the request, it determines whether it can answer the query without additional data or computations. If so, it provides a text-based reply, just like it would with any other request. This is typically the case for general knowledge questions or queries that don’t require the use of tools. In our code, we first check whether the model responded in this way. If so, we print the content.

content = response.choices[0].message.content
if content:
    print(content)

In cases where the LLM recognizes that answering the query requires specific data or calculations beyond its inherent capabilities, it opts for a function call.

We handle this by checking for a function call, similar to how we checked for a text response. When a function call is detected, the code extracts the function arguments, parses them, and executes the corresponding function.

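# tool_calls is empty when the model answered with plain text, so guard before indexing.
tool_calls = response.choices[0].message.tool_calls
if tool_calls:
    function_call = tool_calls[0].function
    if function_call.name == "calculate_moving_average":
        # The arguments arrive as a JSON string and must be parsed first.
        arguments = json.loads(function_call.arguments)
        result = calculate_moving_average(**arguments)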
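The handling above assumes a single known tool and well-formed arguments. A slightly more defensive pattern, sketched below, looks the tool up in a registry and turns failures into a message instead of an exception. The names TOOL_REGISTRY, dispatch_tool_call, and build_tool_message are hypothetical, and the add function is a toy tool used only to exercise the dispatcher. The "tool" role and tool_call_id fields follow the common OpenAI-style convention for returning tool results. That shape is an assumption here and should be checked against the Tool Use section of the API documentation.

```python
import json


def add(a: int, b: int) -> int:
    """Toy tool used only to exercise the dispatcher."""
    return a + b


# Hypothetical registry mapping tool names to Python callables.
TOOL_REGISTRY = {"add": add}


def dispatch_tool_call(name, arguments_json, registry=TOOL_REGISTRY):
    """Run the named tool and return (ok, payload) instead of raising."""
    tool = registry.get(name)
    if tool is None:
        return False, f"Unknown tool: {name}"
    try:
        # Arguments arrive as a JSON string; parse them, then call the tool.
        arguments = json.loads(arguments_json)
        return True, tool(**arguments)
    except (TypeError, ValueError) as error:
        # ValueError covers malformed JSON and validation errors raised by the tool;
        # TypeError covers missing or unexpected arguments.
        return False, f"Tool call failed: {error}"


def build_tool_message(tool_call_id, payload):
    """Assumed OpenAI-style message for handing a tool result back to the model."""
    return {"role": "tool", "tool_call_id": tool_call_id, "content": json.dumps(payload)}


ok, payload = dispatch_tool_call("add", '{"a": 2, "b": 3}')
print(ok, payload)  # True 5
print(build_tool_message("call_0", payload))
```

In this tutorial’s setting, the registry would map "calculate_moving_average" to the function defined earlier, and the resulting message could be appended to messages before a follow-up request so the model can phrase the final answer.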

Conclusion

Tool use and function calling significantly enhance the capabilities of large language models, enabling them to perform complex tasks beyond their core text processing abilities. As demonstrated in our financial analysis example, integrating tools allows LLMs to perform precise calculations and provide more reliable and accurate responses to user queries.

Our workflow in this example was a simple one, but the same steps apply to most tool use scenarios:

  1. Define the tool function (in our case, calculate_moving_average) and create a corresponding tool schema that clearly outlines its purpose and parameters.

  2. Make a chat completion request, including the defined tools alongside the user’s query and any system messages.

  3. Handle the LLM’s response, which may be either a text-based reply or a function call.

  4. If a function call is made, execute the specified function with the provided arguments and process the results.

We now know how tool use and function calling extend the capabilities of AI agents. In subsequent sections, we’ll explore how we can do even more with tool use, such as:

  • Multistep tool use, where the LLM chains together multiple function calls to solve more complex problems.

  • Parallel tool use, allowing the model to decide for itself which functions are appropriate for the given task, for increased flexibility.