CePO (Cerebras Planning & Optimization) is a framework that adds advanced reasoning capabilities to the Llama family of models by utilizing test-time compute. This approach enables Llama to address complex reasoning tasks that can be difficult for standard one-shot or instruct models.

CePO is implemented using Cerebras inference, which currently supports llama3.3-70b at 2,200 reasoning tokens/s. This level of inference speed enables efficient test-time computation for more sophisticated reasoning tasks.

How CePO Works

CePO demonstrates how additional test-time computation can improve Llama’s reasoning.

The process involves four main stages:

  1. Planning: The LLM produces a plan to solve a given problem step by step.

  2. Execution: The LLM executes the plan multiple times, generating multiple responses.

  3. Analysis: The model analyzes the responses to detect inconsistencies across executions, helping catch and correct mistakes.

  4. Best-of-N: Responses are evaluated within a Best-of-N framework that includes a structured confidence scoring mechanism.

Getting Started with CePO

1

Step 1: Prerequisites

CePO is built on the popular, open-source OptiLLM library. To get started, install OptiLLM and make sure you have the latest version of the Cerebras Inference SDK installed.

pip install --upgrade cerebras_cloud_sdk 
pip install --upgrade optillm

Next, configure your API key, which can be found in our developer platform.

export CEREBRAS_API_KEY='your_api_key_here'
2

Step 2: Run OptiLLM with CePO

Finally, run the OptiLLM script with CePO

optillm \
  --base-url https://api.cerebras.ai \
  --approach cepo 

If you would like to print intermediate states in the OptiLLM log, you can optionally add:

--cepo_print_output true 

Continued Research

Further work on CePO includes:

  • More advanced prompting frameworks that leverage comparative reasoning.

  • Synthetic data optimized for inference-time computation.

  • Enhanced verification mechanisms for complex reasoning chains.

If you have questions or would like to discuss your findings, please join the #research channel in our discord community.

Read More

For more information on the implementation details and results, see the full CePO announcement.

Was this page helpful?