This cookbook demonstrates how to build a conversational agent that:
- Generates diverse arXiv search queries
- Searches and analyzes academic papers
- Downloads and processes PDFs with Unstructured.io
- Performs deep analysis, synthesizes research insights, and saves them as reports
What You’ll Learn
- PydanticAI Agent Architecture - Building conversational agents with tools
- Cerebras Integration - Using Cerebras LLMs with PydanticAI
- Pydantic Schemas - Type-safe structured outputs from LLMs
- Unstructured.io - High-quality PDF text extraction
- Tool Design - Creating effective agent tools with RunContext
Setup
Install Dependencies
%pip install -q pydantic-ai cerebras-cloud-sdk python-dotenv requests feedparser unstructured-client pydantic
Load API Keys
Get API keys to get started with Cerebras' super-fast inference and Unstructured's powerful document processing.
Next, we suggest adding the secrets via the Google Colab Secrets service, or via a .env file if you cloned the repository:
CEREBRAS_API_KEY=your-key-here
UNSTRUCTURED_API_KEY=your-key-here
import os
from dotenv import load_dotenv
load_dotenv()
required = ["CEREBRAS_API_KEY", "UNSTRUCTURED_API_KEY"]
missing = [k for k in required if not os.getenv(k)]
if missing:
raise RuntimeError(
f"Missing API keys: {', '.join(missing)}. Add them to .env file."
)
print("✅ API keys loaded")
Part 1: Pydantic Schemas
We use Pydantic models for type safety. Pydantic is a production-grade data-validation framework that helps create reliable LLM responses.
These schemas:
- Guide the LLM on expected output structure
- Validate responses automatically
- Provide type hints throughout the codebase
from typing import List, Optional, Dict, Any
from pydantic import BaseModel, Field
class ArxivQueries(BaseModel):
"""Structured output for arXiv query generation"""
queries: List[str] = Field(description="List of diverse search queries")
reasoning: str = Field(description="Why these queries were chosen")
class AbstractAnalysis(BaseModel):
"""Analysis of paper abstracts"""
key_themes: List[str] = Field(description="Main themes across papers")
top_papers_for_deep_analysis: List[str] = Field(
description="arXiv IDs of most relevant papers"
)
reasoning: str = Field(description="Why these papers were selected")
class PaperAnalysis(BaseModel):
"""Deep analysis of a single paper"""
arxiv_id: str
methods: str = Field(description="Methods and architectures used")
contributions: str = Field(description="Novel contributions")
limitations: Optional[str] = Field(default=None)
class ResearchDirection(BaseModel):
"""A future research direction"""
direction: str = Field(description="The research direction")
rationale: str = Field(description="Why this is important")
class ResearchOutput(BaseModel):
"""Final comprehensive research output"""
research_landscape_summary: str = Field(
description="Overview of the research landscape"
)
key_innovations: List[str] = Field(
description="Major innovations identified"
)
future_research_directions: List[ResearchDirection] = Field(
description="Suggested future research directions"
)
papers_analyzed: int = Field(description="Total papers analyzed")
queries_used: List[str] = Field(description="Search queries used")
Example: Using Schemas for Type Safety
Here’s how schemas validate LLM outputs:
# Example: Creating a validated ArxivQueries object
example_queries: ArxivQueries = ArxivQueries(
queries=["vision language models", "multimodal reasoning"],
reasoning="These queries cover both architecture and capability aspects"
)
print(f"Queries: {example_queries.queries}")
print(f"Reasoning: {example_queries.reasoning}")
# Example: Creating a validated ResearchOutput
example_output: ResearchOutput = ResearchOutput(
research_landscape_summary="The field is rapidly evolving...",
key_innovations=["Cross-modal attention", "Chain-of-thought prompting"],
future_research_directions=[
ResearchDirection(direction="Video reasoning", rationale="Temporal understanding is key")
],
papers_analyzed=10,
queries_used=["vision language models"]
)
print(f"\nPapers analyzed: {example_output.papers_analyzed}")
print(f"Innovations: {example_output.key_innovations}")
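Validation is most useful when it fails: a malformed LLM response raises a ValidationError instead of silently flowing downstream. Here is a minimal sketch of that failure path (ArxivQueries is redefined locally so the snippet is self-contained):

```python
from typing import List

from pydantic import BaseModel, Field, ValidationError


class ArxivQueries(BaseModel):
    """Mirrors the schema defined above."""
    queries: List[str] = Field(description="List of diverse search queries")
    reasoning: str = Field(description="Why these queries were chosen")


# Well-formed data validates cleanly
ok = ArxivQueries(queries=["vision language models"], reasoning="coverage")

# Malformed data (missing "reasoning") raises ValidationError instead of
# propagating a bad LLM response downstream
try:
    ArxivQueries(queries=["vision language models"])
except ValidationError as e:
    print(f"Rejected: {e.error_count()} validation error(s)")
```

This is the safety net the tools below rely on every time they call `ArxivQueries(**data)` on raw model output.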
Part 2: Dependencies & Configuration
The agent uses dependency injection via PydanticAI’s RunContext. This allows tools to access shared resources like API clients and caches.
from cerebras.cloud.sdk import AsyncCerebras
from unstructured_client import UnstructuredClient
class ResearchDeps(BaseModel):
"""Dependencies for the research agent"""
cerebras_client: Any
unstructured_client: Any
papers_cache: Dict[str, Dict[str, Any]] = Field(default_factory=dict)
start_year: int = 2020
max_papers_per_query: int = 15
max_papers_for_deep_analysis: int = 3
fulltext_excerpt_chars: int = 12000
model_config = {"arbitrary_types_allowed": True}
def create_research_deps(
start_year: int = 2020,
max_papers_for_deep_analysis: int = 3
) -> ResearchDeps:
"""Create research dependencies with API clients"""
return ResearchDeps(
cerebras_client=AsyncCerebras(
api_key=os.getenv("CEREBRAS_API_KEY"),
default_headers={"X-Cerebras-3rd-Party-Integration": "academic-research-agent"}
),
unstructured_client=UnstructuredClient(
api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")
),
start_year=start_year,
max_papers_for_deep_analysis=max_papers_for_deep_analysis
)
Part 3: Cerebras in Strict Mode
Important: Cerebras requires all tools to have the same strict parameter value. PydanticAI may generate tools with mixed values, which causes errors. We proactively avoid this with a prepare_tools hook that normalizes all tools to strict=False:
from dataclasses import replace
from pydantic_ai.tools import ToolDefinition
async def set_consistent_strict_param(
ctx: Any,
tool_defs: List[ToolDefinition]
) -> List[ToolDefinition]:
"""
Enforce consistent strict=False for all tools.
This addresses the error:
"Tools with mixed values for 'strict' are not allowed"
"""
return [replace(tool_def, strict=False) for tool_def in tool_defs]
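The hook above is a one-liner, but the pattern is worth seeing in isolation. This sketch uses a hypothetical FakeToolDef stand-in (the real ToolDefinition has more fields) to show how dataclasses.replace produces normalized copies without mutating the originals:

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class FakeToolDef:
    """Stand-in for pydantic_ai's ToolDefinition (sketch only)."""
    name: str
    strict: Optional[bool] = None


# Simulate the mixed strict values PydanticAI may generate
tool_defs: List[FakeToolDef] = [
    FakeToolDef("search_arxiv_papers", strict=True),
    FakeToolDef("save_research_report", strict=None),
]

# Same logic as set_consistent_strict_param: force strict=False everywhere
normalized = [replace(td, strict=False) for td in tool_defs]
print([td.strict for td in normalized])  # [False, False]
```

Because the dataclass is frozen, `replace` returns new instances; the original definitions are untouched.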
Part 4: Create the Agent
Now we instantiate the PydanticAI agent with:
- The Cerebras gpt-oss-120b model
- ResearchDeps for dependency injection
- A prepare_tools hook for strict mode
- A system prompt defining the agent's role
from pydantic_ai import Agent, RunContext
agent = Agent(
'cerebras:gpt-oss-120b',
deps_type=ResearchDeps,
prepare_tools=set_consistent_strict_param,
system_prompt="""You are an expert academic research assistant specializing in literature reviews.
You help researchers by:
1. Generating effective arXiv search queries
2. Searching and analyzing academic papers
3. Identifying key themes and innovations
4. Suggesting future research directions
You have access to tools for each step of the research process. Use them strategically
to conduct comprehensive literature reviews. When asked to research a topic:
1. First generate diverse search queries
2. Search arXiv with those queries
3. Analyze abstracts to identify most relevant papers
4. Download and analyze full papers
5. Synthesize findings into a comprehensive report
Be thorough, cite specific papers, and provide actionable insights."""
)
Part 5: Agent Tools
In PydanticAI, each tool is decorated with @agent.tool and receives RunContext[ResearchDeps] for dependency access.
import json
@agent.tool
async def generate_arxiv_queries(
ctx: RunContext[ResearchDeps],
topic: str,
num_queries: int = 5
) -> str:
"""
Generate diverse arXiv search queries for a research topic.
Args:
topic: The research topic to generate queries for
num_queries: Number of queries to generate (default: 5)
Returns:
JSON string with queries and reasoning
"""
print(f"\n🔍 Generating {num_queries} search queries for: {topic}")
prompt = f"""Generate {num_queries} diverse arXiv search queries for researching: "{topic}"
Make queries:
- Specific and targeted
- Cover different aspects/angles
- Use relevant technical terms
- Suitable for arXiv API search
Return JSON:
{{
"queries": ["query1", "query2", ...],
"reasoning": "why these queries cover the topic well"
}}"""
response = await ctx.deps.cerebras_client.chat.completions.create(
model="gpt-oss-120b",
messages=[{"role": "user", "content": prompt}],
response_format={"type": "json_object"},
temperature=1.0,
max_completion_tokens=12000
)
content = response.choices[0].message.content
data = json.loads(content)
# Validate with Pydantic schema
result: ArxivQueries = ArxivQueries(**data)
print(f"✓ Generated {len(result.queries)} queries")
for i, q in enumerate(result.queries, 1):
print(f" {i}. {q}")
return json.dumps(data)
import asyncio
import requests
import feedparser
@agent.tool
async def search_arxiv_papers(
ctx: RunContext[ResearchDeps],
queries: List[str]
) -> str:
"""
Search arXiv with multiple queries and cache results.
Args:
queries: List of search query strings
Returns:
Summary of papers found
"""
print(f"\n📚 Searching arXiv with {len(queries)} queries...")
all_papers: Dict[str, Dict[str, Any]] = {}
for query in queries:
search_url = "http://export.arxiv.org/api/query"
params = {
"search_query": f"all:{query}",
"start": 0,
"max_results": ctx.deps.max_papers_per_query,
"sortBy": "relevance",
"sortOrder": "descending"
}
try:
response = requests.get(search_url, params=params, timeout=30)
feed = feedparser.parse(response.content)
for entry in feed.entries:
# Skip entries without required fields
if not hasattr(entry, 'id') or not hasattr(entry, 'published'):
continue
if not hasattr(entry, 'title') or not hasattr(entry, 'summary'):
continue
arxiv_id = entry.id.split("/abs/")[-1]
if arxiv_id not in all_papers:
try:
year = int(entry.published[:4])
except (ValueError, TypeError):
continue
if year >= ctx.deps.start_year:
authors = []
if hasattr(entry, 'authors'):
authors = [author.name for author in entry.authors if hasattr(author, 'name')]
all_papers[arxiv_id] = {
"arxiv_id": arxiv_id,
"title": entry.title,
"authors": authors,
"year": year,
"abstract": entry.summary,
"link": getattr(entry, 'link', f"https://arxiv.org/abs/{arxiv_id}")
}
except Exception as e:
print(f" ⚠️ Query failed: {query[:50]}... ({str(e)[:50]})")
continue
await asyncio.sleep(1) # Rate limiting
# Cache papers in dependencies
ctx.deps.papers_cache.update(all_papers)
summary = f"Found {len(all_papers)} unique papers from {ctx.deps.start_year} onwards\n\n"
summary += "Top papers:\n"
for i, (arxiv_id, paper) in enumerate(list(all_papers.items())[:10], 1):
summary += f"{i}. [{paper['year']}] {arxiv_id} — {paper['title'][:80]}...\n"
print(f"✓ Found {len(all_papers)} papers")
return summary
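For reference, this is roughly the request URL the loop above constructs for a single query (a sketch using the stdlib urllib, with no network call; the tool itself uses requests, which encodes the params the same way):

```python
from urllib.parse import urlencode

# Example parameters for one query, matching the defaults in ResearchDeps
params = {
    "search_query": "all:vision language models",
    "start": 0,
    "max_results": 15,
    "sortBy": "relevance",
    "sortOrder": "descending",
}
url = "http://export.arxiv.org/api/query?" + urlencode(params)
print(url)
```

Note that the `all:` prefix searches every field (title, abstract, authors); the arXiv API also supports narrower prefixes such as `ti:` and `abs:`.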
@agent.tool
async def analyze_paper_abstracts(
ctx: RunContext[ResearchDeps],
topic: str,
max_papers: int = 20
) -> str:
"""
Analyze paper abstracts to identify key themes and select papers for deep analysis.
Args:
topic: The research topic
max_papers: Maximum papers to analyze (default: 20)
Returns:
JSON string with analysis results
"""
print(f"\n📊 Analyzing abstracts for: {topic}")
papers = list(ctx.deps.papers_cache.values())[:max_papers]
if not papers:
return json.dumps({
"key_themes": [],
"top_papers_for_deep_analysis": [],
"reasoning": "No papers in cache. Run search_arxiv_papers first."
})
abstracts_text = "\n\n---\n\n".join([
f"Paper {i+1} (arXiv:{p['arxiv_id']})\nTitle: {p['title']}\nAbstract: {p['abstract']}"
for i, p in enumerate(papers)
])
prompt = f"""Analyze these {len(papers)} paper abstracts for research on: "{topic}"
{abstracts_text}
Identify:
1. Key themes across papers
2. Top {ctx.deps.max_papers_for_deep_analysis} most relevant papers for deep analysis (by arXiv ID)
3. Reasoning for selections
Return JSON:
{{
"key_themes": ["theme1", "theme2", ...],
"top_papers_for_deep_analysis": ["arxiv_id1", "arxiv_id2", ...],
"reasoning": "explanation"
}}"""
response = await ctx.deps.cerebras_client.chat.completions.create(
model="gpt-oss-120b",
messages=[{"role": "user", "content": prompt}],
response_format={"type": "json_object"},
temperature=1.0,
max_completion_tokens=12000
)
content = response.choices[0].message.content
data = json.loads(content)
# Validate with Pydantic schema
result: AbstractAnalysis = AbstractAnalysis(**data)
print(f"✓ Identified {len(result.key_themes)} key themes")
print(f"✓ Selected {len(result.top_papers_for_deep_analysis)} papers for deep analysis")
return json.dumps(data)
from unstructured_client.models import operations, shared
@agent.tool
async def download_and_process_pdf(
ctx: RunContext[ResearchDeps],
arxiv_id: str
) -> str:
"""
Download and extract text from an arXiv paper PDF using Unstructured.io.
Args:
arxiv_id: The arXiv ID (e.g., "2301.12345")
Returns:
Extracted text excerpt
"""
print(f"\n📄 Processing PDF: {arxiv_id}")
# Check cache first
if arxiv_id in ctx.deps.papers_cache and "fulltext" in ctx.deps.papers_cache[arxiv_id]:
print(f"✓ Using cached fulltext")
return ctx.deps.papers_cache[arxiv_id]["fulltext"]
try:
# Download PDF from arXiv
pdf_url = f"https://arxiv.org/pdf/{arxiv_id}.pdf"
response = requests.get(pdf_url, timeout=120)
response.raise_for_status()
# Process with Unstructured.io Cloud API
req = operations.PartitionRequest(
partition_parameters=shared.PartitionParameters(
files=shared.Files(
content=response.content,
file_name=f"{arxiv_id}.pdf"
),
strategy=shared.Strategy.HI_RES,
pdf_infer_table_structure=True,
skip_infer_table_types=["image"]
)
)
resp = ctx.deps.unstructured_client.general.partition(request=req)
# Extract text from elements (elements are dicts, not objects)
text_parts: List[str] = []
for element in resp.elements:
# Handle both dict and object formats
if isinstance(element, dict):
text = element.get("text", "")
elif hasattr(element, "text"):
text = element.text
else:
text = ""
if text:
text_parts.append(text)
fulltext = "\n".join(text_parts)
# Limit length to stay within context window
excerpt = fulltext[:ctx.deps.fulltext_excerpt_chars]
# Cache for reuse
if arxiv_id in ctx.deps.papers_cache:
ctx.deps.papers_cache[arxiv_id]["fulltext"] = excerpt
print(f"✓ Extracted {len(fulltext):,} chars (using {len(excerpt):,} char excerpt)")
return excerpt
except Exception as e:
error_msg = f"Failed to process {arxiv_id}: {str(e)}"
print(f"⚠️ {error_msg}")
return error_msg
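The dict-vs-object handling above is easy to exercise in isolation. Here is a standalone sketch of the same normalization logic, with a hypothetical Obj class standing in for an Unstructured element object:

```python
from typing import Any, List


def extract_text(elements: List[Any]) -> str:
    """Normalize elements that may be dicts or objects into joined text."""
    parts: List[str] = []
    for element in elements:
        if isinstance(element, dict):
            text = element.get("text", "")
        elif hasattr(element, "text"):
            text = element.text
        else:
            text = ""
        if text:
            parts.append(text)
    return "\n".join(parts)


class Obj:
    """Hypothetical object-style element (stand-in for the SDK's type)."""
    text = "From an object"


# Dicts, objects, and text-less elements are all handled
print(extract_text([{"text": "From a dict"}, Obj(), {"type": "Image"}]))
```

Elements without text (such as the image element in the example) are silently skipped, which is the behavior the tool relies on.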
@agent.tool
async def deep_analyze_papers(
ctx: RunContext[ResearchDeps],
topic: str,
arxiv_ids: List[str]
) -> str:
"""
Perform deep analysis of papers using their full text.
Args:
topic: The research topic
arxiv_ids: List of arXiv IDs to analyze
Returns:
JSON string with deep analysis results
"""
print(f"\n🔬 Deep analyzing {len(arxiv_ids)} papers...")
analyses: List[PaperAnalysis] = []
for arxiv_id in arxiv_ids:
# Get fulltext (from cache or download)
if arxiv_id in ctx.deps.papers_cache and "fulltext" in ctx.deps.papers_cache[arxiv_id]:
fulltext = ctx.deps.papers_cache[arxiv_id]["fulltext"]
else:
fulltext = await download_and_process_pdf(ctx, arxiv_id)
if "Failed to process" in fulltext:
continue
prompt = f"""Analyze this paper in the context of research on: "{topic}"
Paper ID: {arxiv_id}
Full text excerpt:
{fulltext[:8000]}
Extract:
1. Methods and architectures used
2. Novel contributions
3. Limitations (if mentioned)
Return JSON:
{{
"arxiv_id": "{arxiv_id}",
"methods": "description",
"contributions": "description",
"limitations": "description or null"
}}"""
response = await ctx.deps.cerebras_client.chat.completions.create(
model="gpt-oss-120b",
messages=[{"role": "user", "content": prompt}],
response_format={"type": "json_object"},
temperature=1.0,
max_completion_tokens=12000
)
content = response.choices[0].message.content
data = json.loads(content)
# Validate with Pydantic schema
analysis: PaperAnalysis = PaperAnalysis(**data)
analyses.append(analysis)
print(f" ✓ Analyzed {arxiv_id}")
result = {
"papers": [a.model_dump() for a in analyses],
"count": len(analyses)
}
print(f"✓ Completed deep analysis of {len(analyses)} papers")
return json.dumps(result)
@agent.tool
async def synthesize_research_findings(
ctx: RunContext[ResearchDeps],
topic: str,
deep_analysis_json: str,
queries_used: List[str]
) -> str:
"""
Synthesize all research findings into a comprehensive report.
Args:
topic: The research topic
deep_analysis_json: JSON string from deep_analyze_papers
queries_used: List of search queries that were used
Returns:
JSON string with final research output
"""
print(f"\n🎯 Synthesizing research findings...")
deep_analysis = json.loads(deep_analysis_json)
# Safely get papers list with fallback
papers_list = deep_analysis.get('papers', [])
papers_count = deep_analysis.get('count', len(papers_list))
if not papers_list:
# If no papers key, the JSON might be a single paper analysis or different format
# Try to handle it gracefully
print("⚠️ No 'papers' key found in deep_analysis_json, attempting to parse as single paper")
if 'arxiv_id' in deep_analysis:
# It's a single paper analysis
papers_list = [deep_analysis]
papers_count = 1
else:
# Return a minimal synthesis
return json.dumps({
"research_landscape_summary": "Unable to synthesize - no paper analysis data available.",
"key_innovations": [],
"future_research_directions": [],
"papers_analyzed": 0,
"queries_used": queries_used
})
papers_text = "\n\n".join([
f"Paper {i+1} ({p.get('arxiv_id', 'unknown')}):\n"
f"Methods: {p.get('methods', 'Not specified')}\n"
f"Contributions: {p.get('contributions', 'Not specified')}\n"
f"Limitations: {p.get('limitations', 'Not specified')}"
for i, p in enumerate(papers_list)
])
prompt = f"""Synthesize research findings on: "{topic}"
Deep analysis of {papers_count} papers:
{papers_text}
Create a comprehensive research summary with:
1. Research landscape overview (2-3 paragraphs)
2. Key innovations (3-5 items)
3. Future research directions (3-5 items with rationale)
Return JSON:
{{
"research_landscape_summary": "overview text",
"key_innovations": ["innovation1", "innovation2", ...],
"future_research_directions": [
{{"direction": "direction1", "rationale": "why"}},
...
]
}}"""
response = await ctx.deps.cerebras_client.chat.completions.create(
model="gpt-oss-120b",
messages=[{"role": "user", "content": prompt}],
response_format={"type": "json_object"},
temperature=1.0,
max_completion_tokens=12000
)
content = response.choices[0].message.content
data = json.loads(content)
# Add metadata
data["papers_analyzed"] = papers_count
data["queries_used"] = queries_used
# Validate with Pydantic schema
result: ResearchOutput = ResearchOutput(**data)
print(f"✓ Synthesis complete!")
return json.dumps(data)
from datetime import datetime
from pathlib import Path
@agent.tool
def save_research_report(
ctx: RunContext[ResearchDeps],
topic: str,
research_output_json: str
) -> str:
"""
Save the research report to a file.
Args:
topic: The research topic
research_output_json: JSON string from synthesize_research_findings
Returns:
Path to saved file
"""
print(f"\n💾 Saving research report...")
output = json.loads(research_output_json)
# Create output directory
output_dir = Path("research_exports")
output_dir.mkdir(exist_ok=True)
# Generate filename
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
filename = f"research_analysis_{timestamp}.txt"
filepath = output_dir / filename
# Safely get values with defaults
papers_analyzed = output.get('papers_analyzed', 'N/A')
research_landscape_summary = output.get('research_landscape_summary', 'No summary available.')
key_innovations = output.get('key_innovations', [])
future_research_directions = output.get('future_research_directions', [])
queries_used = output.get('queries_used', [])
# Format report
report = f"""
{'=' * 80}
ACADEMIC RESEARCH ANALYSIS
{'=' * 80}
Topic: {topic}
Date: {datetime.now().strftime("%Y-%m-%d %H:%M:%S")}
Papers Analyzed: {papers_analyzed}
{'=' * 80}
RESEARCH LANDSCAPE
{'=' * 80}
{research_landscape_summary}
{'=' * 80}
KEY INNOVATIONS
{'=' * 80}
"""
for i, innovation in enumerate(key_innovations, 1):
report += f"{i}. {innovation}\n"
report += f"\n{'=' * 80}\nFUTURE RESEARCH DIRECTIONS\n{'=' * 80}\n\n"
for i, direction in enumerate(future_research_directions, 1):
if isinstance(direction, dict):
report += f"{i}. {direction.get('direction', 'Unknown')}\n"
report += f" Rationale: {direction.get('rationale', 'Not specified')}\n\n"
else:
report += f"{i}. {direction}\n\n"
report += f"{'=' * 80}\nSEARCH QUERIES USED\n{'=' * 80}\n\n"
for i, query in enumerate(queries_used, 1):
report += f"{i}. {query}\n"
report += f"\n{'=' * 80}\n"
# Save
filepath.write_text(report)
print(f"✓ Report saved: {filepath}")
return str(filepath)
Part 6: Conversational Interface
This function handles the conversation with the agent, including extracting the response from PydanticAI’s message structure:
async def chat_with_agent(research_question: str, deps: ResearchDeps) -> str:
"""
Have a conversation with the research agent.
Args:
research_question: The research question or instruction
deps: Research dependencies
Returns:
Agent's response text
"""
print(f"\n📋 Your request: {research_question}\n")
result = await agent.run(research_question, deps=deps)
# Extract response from PydanticAI result
# The result contains new_messages() with TextPart and ThinkingPart objects
new_msgs = result.new_messages()
if new_msgs:
last_msg = new_msgs[-1]
if hasattr(last_msg, 'parts'):
# Extract only TextPart content, skip ThinkingPart
text_parts: List[str] = []
for part in last_msg.parts:
if hasattr(part, 'content') and 'TextPart' in str(type(part)):
text_parts.append(part.content)
response = ' '.join(text_parts) if text_parts else str(last_msg)
else:
response = str(last_msg)
else:
response = str(result)
print("💬 AGENT RESPONSE")
print(f"\n{response}\n")
return response
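The part-filtering logic can be illustrated without running the agent. This sketch uses hypothetical stand-in classes in place of PydanticAI's real TextPart and ThinkingPart:

```python
# Stand-ins for pydantic_ai message parts (sketch only)
class TextPart:
    def __init__(self, content: str) -> None:
        self.content = content


class ThinkingPart:
    def __init__(self, content: str) -> None:
        self.content = content


parts = [
    ThinkingPart("planning the search..."),
    TextPart("Here is the summary."),
    TextPart("Done."),
]

# Same filter as chat_with_agent: keep TextPart content, drop ThinkingPart
text = " ".join(p.content for p in parts if "TextPart" in str(type(p)))
print(text)  # Here is the summary. Done.
```

The substring check on the type name is deliberately loose, so the same filter works whether the parts come from pydantic_ai or a stand-in.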
Part 7: Run the Agent!
Instantiate Our Previously Created Dependencies
deps = create_research_deps(
start_year=2023,
max_papers_for_deep_analysis=1
)
print(f" Start year: {deps.start_year}")
print(f" Max papers for deep analysis: {deps.max_papers_for_deep_analysis}")
print(f" Fulltext excerpt: {deps.fulltext_excerpt_chars:,} chars")
Example 1: Full Research Workflow
The agent will autonomously:
- Generate search queries
- Search arXiv
- Analyze abstracts
- Download and process PDFs
- Perform deep analysis
- Synthesize findings
- Save the report
research_question = """
Please conduct a comprehensive literature review on "vision-language models for multimodal reasoning".
Follow these steps:
1. Generate 3 diverse arXiv search queries
2. Search arXiv with those queries
3. Analyze the abstracts to identify key themes
4. Select the top 1 most relevant paper
5. Download and analyze that paper in depth
6. Synthesize the findings into a comprehensive report
7. Save the report to a file
Provide a summary of your findings at the end.
"""
response = await chat_with_agent(research_question, deps)
Example 2: Quick Abstract-Only Analysis
The agent adapts to simpler requests:
quick_question = """
What are the key themes in recent papers about "persuasive natural language generation"?
Just analyze abstracts, don't download full papers.
"""
response = await chat_with_agent(quick_question, deps)
Example 3: Follow-up Questions
The agent can answer follow-up questions:
followup = "What were the most innovative methods you found in those papers?"
response = await chat_with_agent(followup, deps)
Part 8: Inspect Results
View Cached Papers
print(f"Papers in cache: {len(deps.papers_cache)}")
print("\nCached papers:")
for i, (arxiv_id, paper) in enumerate(list(deps.papers_cache.items())[:5], 1):
print(f"{i}. {arxiv_id} — {paper['title'][:60]}...")
if 'fulltext' in paper:
print(f" ✓ Full text cached ({len(paper['fulltext']):,} chars)")
View Saved Reports
export_dir = Path("research_exports")
if export_dir.exists():
reports = sorted(export_dir.glob("*.txt"), key=lambda p: p.stat().st_mtime, reverse=True)
print(f"Saved reports ({len(reports)}):")
for report in reports[:5]:
size = report.stat().st_size
print(f" • {report.name} ({size:,} bytes)")
else:
print("No reports saved yet")
Summary
What We Built
A conversational academic research agent with:
- Seven tools for a complete research workflow
- PydanticAI for agent orchestration and tool management
- Cerebras gpt-oss-120b for fast, high-quality reasoning
- Unstructured.io for PDF text extraction
- Pydantic schemas for type-safe structured outputs
Key Patterns
- Cerebras Strict Mode: Use a prepare_tools hook to normalize all tools to strict=False
- Dependency Injection: Use RunContext[ResearchDeps] to share API clients and caches
- Schema Validation: Validate all LLM outputs with Pydantic models
- Error Resilience: Tools return error messages instead of raising exceptions
- Caching: Cache papers and full text to avoid redundant API calls
Next Steps
- Add semantic search with vector embeddings, rather than repeated keyword queries against arXiv's API
- Add a citation graph analysis
- Add multi-source search (PubMed, Semantic Scholar)
Acknowledgement
Thank you to the teams at Pydantic AI and Unstructured.io for their incredibly helpful input during the creation of this cookbook.
Also a shoutout to my colleagues Zhenwei Gao, Ryan Loney, and Sarah Chieng for great feedback on early versions.