
Concept: Reasoning & Problem-Solving Agents

Overview

This example demonstrates how to configure an LLM as a reasoning agent capable of analytical thinking and quantitative problem-solving. It bridges the gap between simple text generation and complex cognitive tasks.

What is a Reasoning Agent?

A reasoning agent is an LLM configured to perform logical analysis, mathematical computation, and multi-step problem-solving through careful system prompt design.

Human Analogy

Regular Chat                    Reasoning Agent
─────────────                  ──────────────────
"Can you help me?"            "I am a mathematician.
"Sure! What do you need?"     I analyze problems methodically
                              and compute exact answers."

The Reasoning Challenge

Why Reasoning is Hard for LLMs

LLMs are trained on text prediction, not explicit reasoning:

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  LLM Training                         β”‚
β”‚  "Predict next word in text"          β”‚
β”‚                                       β”‚
β”‚  NOT explicitly trained for:          β”‚
β”‚  β€’ Step-by-step logic                 β”‚
β”‚  β€’ Arithmetic computation             β”‚
β”‚  β€’ Tracking multiple variables        β”‚
β”‚  β€’ Systematic problem decomposition   β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

However, they can learn reasoning patterns from training data and be guided by system prompts.

Reasoning Through System Prompts

Configuration Pattern

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  System Prompt Components              β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚  1. Role: "Expert reasoner"            β”‚
β”‚  2. Task: "Analyze and solve problems" β”‚
β”‚  3. Method: "Compute exact answers"    β”‚
β”‚  4. Output: "Single numeric value"     β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
         ↓
   Reasoning Behavior
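A minimal sketch of assembling these four components into a chat-message payload. The function and message names here are illustrative, not taken from the example files:

```javascript
// Sketch: combine the four system-prompt components into one string.
// Wording follows the components in the box above.
function buildReasoningSystemPrompt() {
  const role = "You are an expert logical and quantitative reasoner.";
  const task = "Analyze real-world word problems involving counting and arithmetic.";
  const method = "Work through the problem methodically and compute exact answers.";
  const output = "Return the correct final number as a single value.";
  return [role, task, method, output].join(" ");
}

// The assembled prompt becomes the first message of the chat request.
const messages = [
  { role: "system", content: buildReasoningSystemPrompt() },
  { role: "user", content: "How many kilograms of potatoes are needed?" },
];
```

The system message is what turns a generic chat model into the "reasoning behavior" below; the user message carries only the problem itself.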

Types of Reasoning Tasks

Quantitative Reasoning (this example):

Problem β†’ Count entities β†’ Calculate β†’ Convert units β†’ Answer

Logical Reasoning:

Premises β†’ Apply rules β†’ Deduce conclusions β†’ Answer

Analytical Reasoning:

Data β†’ Identify patterns β†’ Form hypothesis β†’ Conclude

How LLMs "Reason"

Pattern Matching vs. True Reasoning

LLMs don't reason the way humans do, but they can approximate reasoning:

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  What LLMs Actually Do                      β”‚
β”‚                                             β”‚
β”‚  1. Pattern Recognition                     β”‚
β”‚     "This looks like a counting problem"    β”‚
β”‚                                             β”‚
β”‚  2. Template Application                    β”‚
β”‚     "Similar problems follow this pattern"  β”‚
β”‚                                             β”‚
β”‚  3. Statistical Inference                   β”‚
β”‚     "These numbers likely combine this way" β”‚
β”‚                                             β”‚
β”‚  4. Learned Procedures                      β”‚
β”‚     "I've seen this type of calculation"    β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

The Reasoning Process

Input: Complex Word Problem
         ↓
    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
    β”‚   Parse    β”‚  Identify entities and relationships
    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
         ↓
    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
    β”‚  Decompose β”‚  Break into sub-problems
    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
         ↓
    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
    β”‚  Calculate β”‚  Apply arithmetic operations
    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
         ↓
    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
    β”‚  Synthesizeβ”‚  Combine results
    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
         ↓
     Final Answer
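The four stages can be sketched as plain functions. The stand-in data (13 adults Γ— 1.5 kg, 3 kids Γ— 0.5 kg) comes from the serving figures quoted later in this document; the real potato problem has more entities and conditions:

```javascript
// Sketch of the Parse β†’ Decompose β†’ Calculate β†’ Synthesize pipeline,
// using simplified stand-in data for the potato problem.
const parse = () => ({ adults: 13, kids: 3 });        // identify entities
const decompose = ({ adults, kids }) => [             // break into sub-problems
  { count: adults, perPerson: 1.5 },
  { count: kids, perPerson: 0.5 },
];
const calculate = (parts) =>                          // apply arithmetic
  parts.map((p) => p.count * p.perPerson);
const synthesize = (results) =>                       // combine results
  results.reduce((a, b) => a + b, 0);

synthesize(calculate(decompose(parse()))); // β†’ 21 (kg)
```

A pure-prompting agent performs all four stages implicitly inside one generation step, which is exactly where the failure modes described below creep in.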

Problem Complexity Hierarchy

Levels of Reasoning Difficulty

Easy                                        Hard
β”‚                                             β”‚
β”‚  Simple    Multi-step   Nested    Implicit β”‚
β”‚  Arithmetic  Logic    Conditions  Reasoningβ”‚
β”‚                                             β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Examples:
Easy:    "What is 5 + 3?"
Medium:  "If 3 apples cost $2 each, what's the total?"
Hard:    "Count family members with complex relationships"

This Example's Complexity

The potato problem is highly complex:

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  Complexity Factors                     β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚  βœ“ Multiple entities (15+ people)       β”‚
β”‚  βœ“ Relationship reasoning (family tree) β”‚
β”‚  βœ“ Conditional logic (if married, then…)β”‚
β”‚  βœ“ Negative conditions (deceased people)β”‚
β”‚  βœ“ Special cases (dietary restrictions) β”‚
β”‚  βœ“ Multiple calculations                β”‚
β”‚  βœ“ Unit conversions                     β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Limitations of Pure LLM Reasoning

Why This Approach Has Issues

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  Problem: No External Tools        β”‚
β”‚                                    β”‚
β”‚  LLM must hold everything in       β”‚
β”‚  "mental" context:                 β”‚
β”‚  β€’ All entity counts               β”‚
β”‚  β€’ Intermediate calculations       β”‚
β”‚  β€’ Conversion factors              β”‚
β”‚  β€’ Final arithmetic                β”‚
β”‚                                    β”‚
β”‚  Result: Prone to errors           β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Common Failure Modes

1. Counting Errors:

Problem: "Count 15 people with complex relationships"
LLM: "14" or "16" (off by one)

2. Arithmetic Mistakes:

Problem: "13 adults Γ— 1.5 + 3 kids Γ— 0.5"
LLM: May get intermediate steps wrong

3. Lost Context:

Problem: Multi-step with many facts
LLM: Forgets earlier information

Improving Reasoning: Evolution Path

Level 1: Pure Prompting (This Example)

User β†’ LLM β†’ Answer
       ↑
   System Prompt

Limitations:

  • All reasoning internal to LLM
  • No verification
  • No tools
  • Hidden process

Level 2: Chain-of-Thought

User β†’ LLM β†’ Show Work β†’ Answer
       ↑
   "Explain your reasoning"

Improvements:

  • Visible reasoning steps
  • Can catch some errors
  • Still no tools

Level 3: Tool-Augmented (simple-agent)

User β†’ LLM ⟷ Tools β†’ Answer
       ↑    (Calculator)
   System Prompt

Improvements:

  • External computation
  • Reduced errors
  • Verifiable steps

Level 4: ReAct Pattern (react-agent)

User β†’ LLM β†’ Think β†’ Act β†’ Observe
        ↑      β”‚      β”‚      β”‚
    System  Reason   Tool  Result
    Prompt           Use     β”‚
               ↑             β”‚
               β””β”€β”€ Iterate β”€β”€β”˜

Best approach:

  • Explicit reasoning loop
  • Tool use at each step
  • Self-correction possible

System Prompt Design for Reasoning

Key Elements

1. Role Definition:

"You are an expert logical and quantitative reasoner"

Sets the mental framework.

2. Task Specification:

"Analyze real-world word problems involving..."

Defines the problem domain.

3. Output Format:

"Return the correct final number as a single value"

Controls response structure.

Design Patterns

Pattern A: Direct Answer (This Example)

Prompt: [Problem]
Output: [Number]

Pros: Concise, fast
Cons: No insight into the reasoning

Pattern B: Show Work

Prompt: [Problem] "Show your steps"
Output: Step 1: ... Step 2: ... Answer: [Number]

Pros: Transparent, debuggable
Cons: Longer, may still have errors

Pattern C: Self-Verification

Prompt: [Problem] "Solve, then verify"
Output: Solution + Verification + Final Answer

Pros: More reliable
Cons: Slower, uses more tokens
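Whichever pattern is used, the caller still has to pull the final number out of the model's response. A minimal extractor sketch that works for both a bare answer (Pattern A) and a show-work transcript (Patterns B and C), assuming the answer is the last number in the text:

```javascript
// Sketch: extract the final numeric answer from a model response.
// Assumes the conventions in this doc: the answer is the last number.
function extractFinalNumber(text) {
  const matches = text.match(/-?\d+(\.\d+)?/g);
  if (!matches) return null;
  return Number(matches[matches.length - 1]);
}

extractFinalNumber("21");                                     // Pattern A β†’ 21
extractFinalNumber("Step 1: 19.5\nStep 2: 1.5\nAnswer: 21");  // Pattern B β†’ 21
```

A "last number wins" heuristic is fragile if the model appends units or caveats after the answer, which is one reason Pattern A's strict "single value" output format is attractive.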

Real-World Applications

Use Cases for Reasoning Agents

1. Data Analysis:

Input: Dataset summary
Task: Compute statistics, identify trends
Output: Numerical insights

2. Planning:

Input: Goal + constraints
Task: Reason about optimal sequence
Output: Action plan

3. Decision Support:

Input: Options + criteria
Task: Evaluate and compare
Output: Recommended choice

4. Problem Solving:

Input: Complex scenario
Task: Break down and solve
Output: Solution

Comparison: Different Agent Types

                  Reasoning  Tools  Memory  Multi-turn
                  ─────────  ─────  ──────  ──────────
intro.js              βœ—        βœ—      βœ—        βœ—
translation.js        ~        βœ—      βœ—        βœ—
think.js (here)       βœ“        βœ—      βœ—        βœ—
simple-agent.js       βœ“        βœ“      βœ—        ~
memory-agent.js       βœ“        βœ“      βœ“        βœ“
react-agent.js        βœ“βœ“       βœ“      ~        βœ“

Legend:

  • βœ— = Not present
  • ~ = Limited/implicit
  • βœ“ = Present
  • βœ“βœ“ = Advanced/explicit

Key Takeaways

  1. System prompts enable reasoning: Proper configuration transforms an LLM into a reasoning agent
  2. Limitations exist: Pure LLM reasoning is prone to errors on complex problems
  3. Tools help: External computation (calculators, etc.) improves accuracy
  4. Iteration matters: Multi-step reasoning patterns (like ReAct) work better
  5. Transparency is valuable: Seeing the reasoning process helps debug and verify

Next Steps

After understanding basic reasoning:

  • Add tools: Let the agent use calculators, databases, APIs
  • Implement verification: Check answers, retry on errors
  • Use chain-of-thought: Make reasoning explicit
  • Apply ReAct pattern: Combine reasoning and tool use systematically

This example is the foundation for more sophisticated agent architectures that combine reasoning with external capabilities.