Documentation Index
Fetch the complete documentation index at: https://inference-docs.cerebras.ai/llms.txt
Use this file to discover all available pages before exploring further.
What is KiloCode?
KiloCode is an AI-powered autonomous assistant for Visual Studio Code that helps you plan, build, and fix code. It combines the best features of popular coding assistants with unique capabilities, offering multiple interaction modes including architect, code, ask, debug, and orchestrator modes. By integrating KiloCode with Cerebras, you get access to ultra-fast inference speeds that make your coding workflow significantly more responsive and efficient. Learn more at kilocode.ai.Prerequisites
Before you begin, ensure you have:- Cerebras API Key - Get a free API key here.
- Visual Studio Code - Download and install from code.visualstudio.com.
- KiloCode Extension - Install from their Website.
Configure KiloCode with Cerebras
Open KiloCode in VS Code
Configure API provider
Enter your Cerebras credentials
- API Key: Enter your Cerebras API key
- Base URL:
https://api.cerebras.ai/v1 - Model: Choose from available Cerebras models:
gpt-oss-120b- Fastest option for simple tasks and high-throughput scenariosgpt-oss-120b- Largest model for the most demanding taskszai-glm-4.7- Advanced 357B parameter model with strong reasoning capabilities
zai-glm-4.7 or gpt-oss-120b for the best coding assistance experience with the best performance.Using KiloCode with Cerebras
Example: Building a Calculator App
Here’s a practical example of using KiloCode with Cerebras to build a full-stack calculator application:Provide your task
Review the plan
- Creating the FastAPI backend structure
- Implementing calculator operations as API endpoints
- Building the HTML/CSS frontend interface
- Setting up proper error handling
- Implementing fetch API calls for frontend-backend communication
Approve and execute
- Generate the necessary files in your workspace
- Write the backend and frontend code
- Set up the project structure with proper organization
- Provide instructions for installing dependencies and running the application
Verifying the Integration
To confirm that KiloCode is using Cerebras’s API correctly:- Response Speed: Cerebras provides significantly faster inference than typical GPU providers. You should notice near-instant responses for most queries.
- Token Usage: Monitor the token counter in the KiloCode panel to track your API usage.
- Model Name: Verify that your selected Cerebras model appears in the session info at the top of the panel.
- API Logs: Check the KiloCode output logs for successful API connections to
api.cerebras.ai.
Advanced Usage
Custom Interaction Modes
You can create custom interaction modes tailored to your specific development needs:- Navigate to the Edit section in KiloCode
- Create a new mode or modify existing ones
- Define custom prompts, system messages, and behaviors
- Save and activate your custom mode for specialized workflows
Working with Large Codebases
When working with large projects, consider these best practices:- Use architect mode first to plan changes and understand the impact across your codebase
- Break down complex tasks into smaller, manageable steps that can be executed incrementally
- Use orchestrator mode to coordinate multi-file changes while maintaining consistency
- Review each change before approving execution to maintain code quality
- Leverage context windows effectively by focusing on relevant files
Debugging Workflow
For effective debugging with KiloCode and Cerebras:- Switch to debug mode for specialized debugging assistance
- Provide error messages, stack traces, or describe the unexpected behavior
- Let KiloCode analyze the problem using Cerebras’s fast inference
- Review suggested fixes and explanations
- Apply changes incrementally and test after each modification
- Use ask mode to understand why bugs occurred and how to prevent them
Best Practices
- Be Specific: Provide clear, detailed prompts with context for better results. Include file names, function names, and expected behavior.
- Review Plans: Always review the assistant’s plan before execution to catch potential issues early.
- Iterative Approach: Break complex tasks into smaller steps and validate each step before moving forward.
- Use Appropriate Modes: Select the right interaction mode for your task to get optimized prompts and behaviors.
- Monitor Token Usage: Keep track of your API usage in the panel to manage costs and understand model efficiency.
- Leverage Speed: Take advantage of Cerebras’s fast inference to iterate quickly and experiment with different approaches.
- Provide Feedback: Use the feedback mechanisms in KiloCode to improve future responses.
Frequently Asked Questions
Which Cerebras model should I use for coding tasks?
Which Cerebras model should I use for coding tasks?
How does Cerebras improve my KiloCode experience?
How does Cerebras improve my KiloCode experience?
- Near-instant responses to your coding queries
- Faster iteration cycles when generating or refactoring code
- More responsive debugging and problem-solving
- Ability to experiment with multiple approaches quickly
- Reduced waiting time during complex multi-step tasks
Can I switch between different Cerebras models?
Can I switch between different Cerebras models?
Does KiloCode support streaming responses with Cerebras?
Does KiloCode support streaming responses with Cerebras?
How do I track my Cerebras API usage?
How do I track my Cerebras API usage?
- In KiloCode: The top bar shows token usage for the current session
- Cerebras Dashboard: Visit cloud.cerebras.ai to see detailed usage analytics, including total tokens, requests, and costs
What happens if I exceed my API rate limits?
What happens if I exceed my API rate limits?
- Wait for the rate limit to reset (typically one minute)
- Upgrade your Cerebras plan for higher rate limits
- Optimize your prompts to use fewer tokens
Next Steps
- Explore the KiloCode documentation for advanced features and customization options
- Try different Cerebras models to find the best fit for your workflow and coding style
- Experiment with custom interaction modes to optimize for your specific development needs
- Join the KiloCode community for tips, support, and sharing best practices
- Check out our Model Comparison Guide to understand which Cerebras model works best for different coding tasks
- Read about Cerebras’s ultra-fast inference technology to learn how it achieves industry-leading speeds
- Want to migrate from GLM4.7? Check out the GLM4.7 migration guide

