---
title: Multi-Agent Research Generator
emoji: 🔬
colorFrom: blue
colorTo: indigo
sdk: docker
sdk_version: 1.44.1
app_file: app.py
pinned: false
license: mit
---

# 🔬 Multi-Agent Research & Report Generator

> Orchestrates multiple specialized AI agents to autonomously research, analyze, fact-check, and produce structured professional reports with cited sources.

![Python](https://img.shields.io/badge/Python-3.10+-blue) ![LangGraph](https://img.shields.io/badge/LangGraph-Latest-green) ![Groq](https://img.shields.io/badge/Groq-Llama3.3-orange) ![Streamlit](https://img.shields.io/badge/UI-Streamlit-red)

---

## 🧠 What This Project Demonstrates

Most AI demos make a single LLM call and call it "AI research." This project does something fundamentally different — it separates research, analysis, fact-checking, and writing into specialized agents that communicate through a shared state graph.

```
Naive approach:    prompt → LLM → output
This project:      orchestrated multi-agent pipeline with 
                   conditional routing, critic patterns, 
                   and real source verification
```

This mirrors a pattern common in production multi-agent systems: specialized nodes, a shared state object, and a bounded revision loop.

---

## 🏗️ Agent Architecture

```
User Input: Research Topic
                ↓
    Orchestrator (LangGraph StateGraph)
                ↓
    ┌─────────────────────────────┐
    │      Research Agent         │
    │  Tavily web search          │
    │  3 targeted queries         │
    │  Real-time sources          │
    └──────────┬──────────────────┘
               ↓
    ┌─────────────────────────────┐
    │      Analyst Agent          │
    │  Synthesizes findings       │
    │  Identifies patterns        │
    │  Flags contradictions       │
    └──────────┬──────────────────┘
               ↓
    ┌─────────────────────────────┐
    │       Critic Agent          │
    │  Cross-references claims    │
    │  against Tavily sources     │  ← actual ground truth
    │  Assigns confidence score   │
    │  Flags unsupported claims   │
    └──────────┬──────────────────┘
               ↓
    ┌─────────────────────────────────────────┐
    │         Conditional Router              │
    │  needs_revision AND iterations < 2      │
    │    → back to Analyst (max 2 cycles)     │
    │  else → Writer Agent                    │
    └──────────┬──────────────────────────────┘
               ↓
    ┌─────────────────────────────┐
    │       Writer Agent          │
    │  Structured report          │
    │  Cited sources              │
    │  Confidence score disclosed │
    └──────────┬──────────────────┘
               ↓
    Professional Report + Confidence Score
```

---

## 🔑 Key Technical Decisions

### Why LangGraph State?
Each agent reads from and writes to a shared `ResearchState` TypedDict. No agent needs to know about other agents — they only interact with state. This makes the pipeline modular, debuggable, and extensible.

```python
from typing import List, TypedDict


class ResearchState(TypedDict):
    topic: str
    search_results: List[dict]   # Research Agent writes
    analysis: str                 # Analyst Agent writes
    critic_feedback: str          # Critic Agent writes
    confidence_scores: dict       # Critic writes, UI reads
    final_report: str             # Writer Agent writes
    current_step: str             # Orchestrator tracks
    iteration_count: int          # Prevents infinite loops
```
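
To make the "agents only interact with state" idea concrete, here is a minimal sketch of what a node can look like. The `analyst_node` body is a hypothetical stand-in (the real agent would call the LLM), and the trimmed-down state and the `content` key on search results are assumptions for illustration.

```python
from typing import List, TypedDict


class ResearchState(TypedDict, total=False):
    # Trimmed copy of the shared state, just enough for this sketch
    topic: str
    search_results: List[dict]
    analysis: str


def analyst_node(state: ResearchState) -> dict:
    """Hypothetical node: reads search_results, returns only the key it owns."""
    snippets = [r.get("content", "") for r in state.get("search_results", [])]
    # A real node would call the LLM here; joining snippets keeps the sketch runnable.
    return {"analysis": " ".join(s for s in snippets if s)}


state: ResearchState = {
    "topic": "solid-state batteries",
    "search_results": [{"url": "https://example.com/a", "content": "Cell costs fell."}],
}
# LangGraph merges the dict a node returns into the shared state;
# a plain dict update imitates that merge here.
state.update(analyst_node(state))
print(state["analysis"])
```

Because the node returns only `analysis`, it cannot accidentally overwrite keys owned by other agents, which is what keeps each agent testable in isolation.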

### Why Tavily for Critic Agent?
A naive critic evaluates the analyst's output using only the LLM's own knowledge, so one LLM instance ends up confirming another's biases. Our Critic Agent instead cross-references the analysis against the actual Tavily search results, which gives it external evidence to check claims against. That is the architectural difference between a critic that finds real issues and one that simply agrees.
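
As a rough illustration of cross-referencing against sources rather than model memory, the sketch below flags claims whose key terms are not covered by any retrieved source. The token-overlap heuristic, the function name `find_unsupported_claims`, and the `content` key are all assumptions for this example; the project's Critic may well delegate this judgment to the LLM with the sources in the prompt.

```python
import re
from typing import Dict, List


def _tokens(text: str) -> set:
    """Lowercased words of 4+ letters, a crude stand-in for 'key terms'."""
    return set(re.findall(r"[a-z]{4,}", text.lower()))


def find_unsupported_claims(
    claims: List[str], sources: List[Dict[str, str]], min_overlap: float = 0.5
) -> List[str]:
    """Return claims whose key terms are not mostly covered by any single source."""
    source_tokens = [_tokens(s.get("content", "")) for s in sources]
    unsupported = []
    for claim in claims:
        terms = _tokens(claim)
        if not terms:
            continue  # nothing checkable in this claim
        # Best coverage of this claim's terms across all sources
        best = max((len(terms & st) / len(terms) for st in source_tokens), default=0.0)
        if best < min_overlap:
            unsupported.append(claim)
    return unsupported


sources = [{"content": "Battery costs dropped sharply during 2023 according to industry reports."}]
claims = ["Battery costs dropped sharply.", "Quantum chips replaced silicon everywhere."]
print(find_unsupported_claims(claims, sources))
```

Word overlap cannot tell a supported claim from a negated one, so treat this as a way to see the data flow, not as a fact-checker.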

### Why iteration_count?
Without a cycle limit, a Critic that keeps requesting revisions would bounce the pipeline between Critic and Analyst indefinitely and exhaust free-tier rate limits. The Orchestrator increments this counter in a separate node (single responsibility), and the router caps revision cycles at 2 before forcing the pipeline to the Writer Agent.

### Why separate Orchestrator node for iteration?
Mixing counter increments into the Critic Agent violates single responsibility. Each node does exactly one thing — the Critic evaluates, the iterate node increments, the router decides. This makes debugging straightforward: if routing fails, only one node is responsible.
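
The split between incrementing and routing can be sketched as two small pure functions. The `needs_revision` flag, the node names `"analyst"` and `"writer"`, and the Critic → iterate → router ordering are assumptions taken from the diagram and the text above, not the project's exact code.

```python
from typing import TypedDict

MAX_REVISIONS = 2  # mirrors the "iterations < 2" condition in the diagram


class LoopState(TypedDict):
    needs_revision: bool   # assumed flag derived from the Critic's feedback
    iteration_count: int


def iterate_node(state: LoopState) -> dict:
    """Only job: bump the counter after a Critic pass."""
    return {"iteration_count": state["iteration_count"] + 1}


def route_after_critic(state: LoopState) -> str:
    """Only job: pick the name of the next node."""
    if state["needs_revision"] and state["iteration_count"] < MAX_REVISIONS:
        return "analyst"
    return "writer"


# Worst case: a Critic that always asks for another revision.
state: LoopState = {"needs_revision": True, "iteration_count": 0}
critic_passes = 0
while True:
    critic_passes += 1
    state.update(iterate_node(state))
    if route_after_critic(state) == "writer":
        break
print(critic_passes)  # 2
```

In LangGraph, a function shaped like `route_after_critic` is what gets passed to `StateGraph.add_conditional_edges`. Under the ordering assumed here, the Critic runs at most twice before the Writer takes over, even if it never stops asking for revisions.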

---

## 📊 Confidence Score — What It Actually Means

```
High score (75-100): Most claims in analysis are directly 
                     supported by Tavily search results

Low score (40-65):   Topic is speculative or emerging —
                     fewer verifiable claims in sources

Example:
  Pakistan job market report → 80/100  (established research exists)
  Iran-US war economic impact → 60/100 (speculative, fewer sources)
```

**Honest limitation:** The Critic verifies logical consistency and cross-references against search results. It cannot replace domain expert fact-checking for critical decisions.
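
One simple way to read the score is as the share of claims that have source support. The formula below is an assumption for illustration only; the actual Critic may ask the LLM for a score directly, and the function name `confidence_score` is hypothetical.

```python
def confidence_score(total_claims: int, unsupported_claims: int) -> int:
    """Percentage of claims with source support, as an integer from 0 to 100."""
    if total_claims <= 0:
        return 0  # nothing to verify, so report no confidence
    if not 0 <= unsupported_claims <= total_claims:
        raise ValueError("unsupported_claims must be between 0 and total_claims")
    supported = total_claims - unsupported_claims
    return round(100 * supported / total_claims)


print(confidence_score(10, 2))  # 80
print(confidence_score(5, 2))   # 60
```

A ratio like this only measures agreement with the retrieved sources, so it inherits whatever gaps or errors those sources contain.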

---

## 🛠️ Tech Stack

| Tool | Purpose | Why Free |
|------|---------|---------| 
| LangGraph | Agent orchestration | Open source |
| Groq API | LLM inference (Llama 3.3 70B) | Free tier, low-latency inference |
| Tavily API | Real-time web search | Free tier, 1000 searches/month |
| LangChain | Tool definitions | Open source |
| Streamlit | UI | Open source |

---

## 🚀 Setup & Installation

### Prerequisites
- Python 3.10+
- Groq API key — [console.groq.com](https://console.groq.com)
- Tavily API key — [tavily.com](https://tavily.com)

### Installation

```bash
# Clone repo
git clone https://github.com/yourusername/multi-agent-research
cd multi-agent-research

# Create virtual environment
python -m venv venv

# Activate it (run only the line for your OS)
venv\Scripts\activate       # Windows
source venv/bin/activate    # Mac/Linux

# Install dependencies
pip install -r requirements.txt

# Configure environment
cp .env.example .env
# Add your API keys to .env
```

### Environment Variables

```
GROQ_API_KEY=your_groq_key_here
TAVILY_API_KEY=your_tavily_key_here
```

> **Hugging Face Space users:** Set `GROQ_API_KEY` and `TAVILY_API_KEY` in your Space's **Settings → Variables and Secrets** — do NOT commit a `.env` file.
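
Since both keys are required, it helps to fail fast at startup with a clear message instead of hitting an authentication error mid-run. This is a minimal sketch; the helper name `load_api_keys` is hypothetical, and whether the project loads `.env` through `python-dotenv` or another mechanism is an assumption left outside the example.

```python
import os
from typing import Dict, Mapping

REQUIRED_KEYS = ("GROQ_API_KEY", "TAVILY_API_KEY")


def load_api_keys(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return the required keys, or raise naming every key that is missing."""
    missing = [name for name in REQUIRED_KEYS if not env.get(name, "").strip()]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name].strip() for name in REQUIRED_KEYS}


# Passing a plain dict makes the check easy to try without touching real secrets.
keys = load_api_keys({"GROQ_API_KEY": "demo-groq", "TAVILY_API_KEY": "demo-tavily"})
print(sorted(keys))
```

On a Hugging Face Space, secrets set in the Space settings are exposed as environment variables, so the same check works there without a `.env` file.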

### Run

```bash
# Streamlit UI
streamlit run app.py

# Terminal only
python main.py
```

---

## 📁 Project Structure

```
multi_agent_research/
├── agents/
│   ├── research_agent.py      # Tavily web search (3 queries)
│   ├── analyst_agent.py       # Synthesizes findings via LLM
│   ├── critic_agent.py        # Cross-references vs sources
│   └── writer_agent.py        # Produces final report
├── graph/
│   └── research_graph.py      # LangGraph StateGraph + routing
├── state/
│   └── research_state.py      # Shared state TypedDict
├── app.py                     # Streamlit UI (HF Space entrypoint)
├── main.py                    # Pipeline runner
├── requirements.txt           # Python dependencies
├── .env.example               # API key template
└── README.md
```

---

## ⚡ Performance

| Metric | Value |
|--------|-------|
| Average report time | ~30-60 seconds |
| Tavily queries per run | 3 (3 results each, 9 source results total) |
| Max revision cycles | 2 |
| Token usage per run | ~8,000-11,000 tokens |
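
The 3 × 3 figure comes from issuing three queries and keeping three results from each. The sketch below shows that collection step with the search call injected as a function, so it runs without a Tavily key. `collect_sources`, `fake_search`, and the `url`/`content` keys are assumptions for illustration; in the real agent the injected function would wrap the Tavily client.

```python
from typing import Callable, Dict, List

SearchFn = Callable[[str, int], List[Dict[str, str]]]


def collect_sources(
    queries: List[str], search: SearchFn, per_query: int = 3
) -> List[Dict[str, str]]:
    """Run each query once, keep up to per_query results, drop duplicate URLs."""
    seen = set()
    collected = []
    for query in queries:
        for result in search(query, per_query)[:per_query]:
            url = result.get("url", "")
            if url in seen:
                continue  # the same page often ranks for several queries
            seen.add(url)
            collected.append(result)
    return collected


def fake_search(query: str, max_results: int) -> List[Dict[str, str]]:
    """Stand-in for a real search client; returns distinct URLs per query."""
    return [
        {"url": f"https://example.com/{query}/{i}", "content": f"Result {i} for {query}"}
        for i in range(max_results)
    ]


sources = collect_sources(["overview", "statistics", "criticism"], fake_search)
print(len(sources))  # 9
```

Deduplicating by URL means a run can end up with fewer than nine sources when queries overlap, which also trims the tokens sent to the Analyst.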

---

## ⚠️ Known Limitations

- **Critic cannot verify all hallucinations** — it cross-references against Tavily results but cannot catch confidently stated errors absent from search results
- **Groq free tier** — 12,000 TPM limit may cause rate limiting on complex topics
- **Tavily free tier** — 1,000 searches/month; each run issues 3 queries (3 results each)
- **Report quality depends on search result quality** — niche or poorly documented topics produce lower confidence scores
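
For the Groq rate-limit case, a common mitigation is to retry with exponential backoff. The sketch below is one way to do that under stated assumptions: `RateLimitError` is a hypothetical stand-in for whatever exception the LLM client raises on a rate-limit response, and the delays are illustrative rather than tuned to the 12,000 TPM limit.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for the client's rate-limit exception."""


def call_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, waiting 2s, 4s, 8s, ... between attempts on rate-limit errors."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error to the caller
            sleep(base_delay * 2 ** attempt)
    raise ValueError("max_attempts must be at least 1")


# Demo: a call that is rate limited twice, then succeeds.
attempts = {"count": 0}
delays = []


def flaky_llm_call() -> str:
    attempts["count"] += 1
    if attempts["count"] <= 2:
        raise RateLimitError("simulated rate limit")
    return "ok"


result = call_with_backoff(flaky_llm_call, sleep=delays.append)
print(result, delays)
```

Injecting `sleep` keeps the demo instant; in the pipeline the default `time.sleep` would apply the real waits.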

---

## 🗺️ What I Learned

- LangGraph state management and conditional routing
- Critic pattern in multi-agent systems — and its honest limitations
- Why splitting work across specialized agents can outperform a single do-everything prompt
- Token optimization for free-tier LLM APIs
- Separation of concerns in agent design (single responsibility per node)

---

## 🔮 Future Improvements

- Parallel execution where dependencies allow (for example, running the Research Agent's three queries concurrently)
- Vector store memory for cross-session topic persistence
- PDF export for reports
- Domain-specific agent personas (legal, medical, financial)
- Human-in-the-loop approval before Writer Agent runs

---

*Built as Project 4 of an AI Engineering portfolio. Part of a progression from model fine-tuning → RAG systems → single agents → multi-agent orchestration.*