---
title: Multi-Agent Research Generator
emoji: 🔬
colorFrom: blue
colorTo: indigo
sdk: docker
sdk_version: 1.44.1
app_file: app.py
pinned: false
license: mit
---
# 🔬 Multi-Agent Research & Report Generator
> Orchestrates multiple specialized AI agents to autonomously research, analyze, fact-check, and produce structured professional reports with cited sources.
![Python](https://img.shields.io/badge/Python-3.10+-blue) ![LangGraph](https://img.shields.io/badge/LangGraph-Latest-green) ![Groq](https://img.shields.io/badge/Groq-Llama3.3-orange) ![Streamlit](https://img.shields.io/badge/UI-Streamlit-red)
---
## 🧠 What This Project Demonstrates
Most AI demos make a single LLM call and call it "AI research." This project takes a different approach: it separates research, analysis, fact-checking, and writing into specialized agents that communicate through a shared state graph.
```
Naive approach:  prompt → LLM → output
This project:    orchestrated multi-agent pipeline with
                 conditional routing, critic patterns,
                 and real source verification
```
The same pattern, specialized nodes coordinated through shared state, is common in production multi-agent systems.
---
## 🏗️ Agent Architecture
```
User Input: Research Topic
           ↓
Orchestrator (LangGraph StateGraph)
           ↓
┌─────────────────────────────┐
│ Research Agent              │
│ Tavily web search           │
│ 3 targeted queries          │
│ Real-time sources           │
└──────────┬──────────────────┘
           ↓
┌─────────────────────────────┐
│ Analyst Agent               │
│ Synthesizes findings        │
│ Identifies patterns         │
│ Flags contradictions        │
└──────────┬──────────────────┘
           ↓
┌─────────────────────────────┐
│ Critic Agent                │
│ Cross-references claims     │
│ against Tavily sources      │ ← actual ground truth
│ Assigns confidence score    │
│ Flags unsupported claims    │
└──────────┬──────────────────┘
           ↓
┌─────────────────────────────────────────┐
│ Conditional Router                      │
│ needs_revision AND iterations < 2       │
│   → back to Analyst (max 2 cycles)      │
│ else → Writer Agent                     │
└──────────┬──────────────────────────────┘
           ↓
┌─────────────────────────────┐
│ Writer Agent                │
│ Structured report           │
│ Cited sources               │
│ Confidence score disclosed  │
└──────────┬──────────────────┘
           ↓
Professional Report + Confidence Score
```
---
## 🔑 Key Technical Decisions
### Why LangGraph State?
Each agent reads from and writes to a shared `ResearchState` TypedDict. No agent needs to know about other agents; they only interact with state. This makes the pipeline modular, debuggable, and extensible.
```python
from typing import List, TypedDict


class ResearchState(TypedDict):
    """Shared state passed between every node in the graph."""
    topic: str
    search_results: List[dict]   # Research Agent writes
    analysis: str                # Analyst Agent writes
    critic_feedback: str         # Critic Agent writes
    confidence_scores: dict      # Critic writes, UI reads
    final_report: str            # Writer Agent writes
    current_step: str            # Orchestrator tracks
    iteration_count: int         # Prevents infinite loops
```
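To illustrate the read-from-state, write-to-state contract, here is a minimal sketch of a single node. The `analyst_node` name, its placeholder body, and the trimmed-down state class are assumptions for illustration, not the project's actual agent code; a real node would call the LLM where the summary string is built.

```python
from typing import List, TypedDict


class ResearchState(TypedDict, total=False):
    # Trimmed copy of the shared state, enough for this example.
    topic: str
    search_results: List[dict]
    analysis: str


def analyst_node(state: ResearchState) -> dict:
    """Hypothetical node: reads search_results, returns only the key it owns."""
    titles = [r.get("title", "untitled") for r in state.get("search_results", [])]
    # Placeholder for the LLM call: summarise what was found.
    summary = f"{len(titles)} sources on '{state['topic']}': " + "; ".join(titles)
    return {"analysis": summary}


state: ResearchState = {
    "topic": "solar storage",
    "search_results": [{"title": "Grid batteries"}, {"title": "Pumped hydro"}],
}
# Merge the partial update into the shared state, key by key.
state.update(analyst_node(state))
print(state["analysis"])
```

Because the node returns only `analysis`, it cannot accidentally overwrite another agent's output, which is what keeps the agents decoupled.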
### Why Tavily for Critic Agent?
A naive critic agent evaluates analyst output using only LLM knowledge, which means one LLM instance is confirming another's biases. Our Critic Agent cross-references every claim against actual Tavily search results, giving it real ground truth to check against. This is the architectural difference between a critic that finds real issues and one that just agrees.
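The idea of checking a claim against retrieved text, rather than against the model's own memory, can be sketched with a deliberately crude word-overlap test. This is a toy stand-in, not the project's Critic (which uses an LLM); the `is_supported` helper, the 0.6 threshold, and the assumption that each result dict carries a `content` field are all hypothetical.

```python
import re
from typing import List


def _words(text: str) -> set:
    """Lowercased words of 4+ letters, a crude stand-in for 'content words'."""
    return set(re.findall(r"[a-z]{4,}", text.lower()))


def is_supported(claim: str, sources: List[dict], threshold: float = 0.6) -> bool:
    """True if some source snippet shares enough words with the claim."""
    claim_words = _words(claim)
    if not claim_words:
        return False
    for source in sources:
        overlap = claim_words & _words(source.get("content", ""))
        if len(overlap) / len(claim_words) >= threshold:
            return True
    return False


# Hypothetical search result; the URL and text are made up for the example.
sources = [{"url": "https://example.com/a",
            "content": "Battery storage costs fell sharply during the last decade."}]

print(is_supported("Battery storage costs fell sharply.", sources))  # True
print(is_supported("Hydrogen trucks dominate freight.", sources))    # False
```

The second claim is rejected because nothing in the retrieved text backs it, regardless of how plausible it sounds to a language model.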
### Why iteration_count?
Without a cycle limit, the Critic → Analyst → Critic loop can run indefinitely and exhaust free-tier rate limits. The Orchestrator increments this counter in a separate node (single responsibility), and the router caps revision cycles at 2 before forcing the pipeline on to the Writer Agent.
### Why separate Orchestrator node for iteration?
Mixing counter increments into the Critic Agent violates single responsibility. Each node does exactly one thing: the Critic evaluates, the iterate node increments, the router decides. This makes debugging straightforward: if routing fails, only one node is responsible.
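The counter-and-router split described above can be sketched in plain Python. The function names, the `needs_revision` key living in state, and the order in which the iterate node runs are assumptions based on the router diagram, not the project's actual code. In LangGraph, a function like `route_after_critic` would typically be passed to `add_conditional_edges`.

```python
MAX_REVISIONS = 2  # cap taken from the router description


def iterate_node(state: dict) -> dict:
    """Only job: bump the revision counter."""
    return {"iteration_count": state.get("iteration_count", 0) + 1}


def route_after_critic(state: dict) -> str:
    """Only job: pick the name of the next node."""
    if state.get("needs_revision") and state.get("iteration_count", 0) < MAX_REVISIONS:
        return "analyst"
    return "writer"


# Simulate a critic that always asks for a revision.
state = {"needs_revision": True, "iteration_count": 0}
path = []
while True:
    next_node = route_after_critic(state)
    path.append(next_node)
    if next_node == "writer":
        break
    state.update(iterate_node(state))

print(path)  # ['analyst', 'analyst', 'writer']
```

Even with a critic that never approves, the pipeline reaches the Writer after two revision cycles.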
---
## 📊 Confidence Score: What It Actually Means
```
High score (75-100): Most claims in analysis are directly
                     supported by Tavily search results
Low score (40-65):   Topic is speculative or emerging;
                     fewer verifiable claims in sources
Example:
  Pakistan job market report  → 80/100 (established research exists)
  Iran-US war economic impact → 60/100 (speculative, fewer sources)
```
**Honest limitation:** The Critic verifies logical consistency and cross-references against search results. It cannot replace domain expert fact-checking for critical decisions.
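This README does not spell out how the score is computed. As one plausible reading, the sketch below assumes it is the percentage of claims that the Critic found support for; both helper functions are hypothetical. The band boundaries are the two ranges documented above, and scores that fall between them are reported as such rather than guessed.

```python
def confidence_score(supported: int, total: int) -> int:
    """Percentage of claims backed by at least one source, rounded."""
    if total == 0:
        return 0
    return round(100 * supported / total)


def band(score: int) -> str:
    """Map a score onto the two documented bands."""
    if 75 <= score <= 100:
        return "high"
    if 40 <= score <= 65:
        return "low"
    return "outside documented bands"


score = confidence_score(supported=8, total=10)
print(score, band(score))            # 80 high
print(band(confidence_score(3, 5)))  # low
```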
---
## 🛠️ Tech Stack
| Tool | Purpose | Why Free |
|------|---------|---------|
| LangGraph | Agent orchestration | Open source |
| Groq API | LLM inference (Llama 3.3 70B) | Free tier, fast inference |
| Tavily API | Real-time web search | Free tier, 1000 searches/month |
| LangChain | Tool definitions | Open source |
| Streamlit | UI | Open source |
---
## 🚀 Setup & Installation
### Prerequisites
- Python 3.10+
- Groq API key from [console.groq.com](https://console.groq.com)
- Tavily API key from [tavily.com](https://tavily.com)
### Installation
```bash
# Clone repo
git clone https://github.com/yourusername/multi-agent-research
cd multi-agent-research
# Create virtual environment
python -m venv venv
venv\Scripts\activate # Windows
source venv/bin/activate # Mac/Linux
# Install dependencies
pip install -r requirements.txt
# Configure environment
cp .env.example .env
# Add your API keys to .env
```
### Environment Variables
```
GROQ_API_KEY=your_groq_key_here
TAVILY_API_KEY=your_tavily_key_here
```
> **Hugging Face Space users:** Set `GROQ_API_KEY` and `TAVILY_API_KEY` in your Space's **Settings → Variables and Secrets**. Do NOT commit a `.env` file.
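Whether the keys come from a local `.env` file or from Space secrets, they reach the app as environment variables. A startup check along these lines can fail fast with a clear message; the `missing_keys` helper is a hypothetical sketch, not part of the project.

```python
import os

REQUIRED_KEYS = ("GROQ_API_KEY", "TAVILY_API_KEY")


def missing_keys(env=os.environ) -> list:
    """Return the required variable names that are unset or empty."""
    return [name for name in REQUIRED_KEYS if not env.get(name)]


# Demonstration with a plain dict standing in for the real environment.
print(missing_keys({"GROQ_API_KEY": "abc"}))  # ['TAVILY_API_KEY']
```

In `app.py` or `main.py`, calling `missing_keys()` with no arguments would check the real environment before any agent runs.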
### Run
```bash
# Streamlit UI
streamlit run app.py
# Terminal only
python main.py
```
---
## 📁 Project Structure
```
multi_agent_research/
├── agents/
│   ├── research_agent.py     # Tavily web search (3 queries)
│   ├── analyst_agent.py      # Synthesizes findings via LLM
│   ├── critic_agent.py       # Cross-references vs sources
│   └── writer_agent.py       # Produces final report
├── graph/
│   └── research_graph.py     # LangGraph StateGraph + routing
├── state/
│   └── research_state.py     # Shared state TypedDict
├── app.py                    # Streamlit UI (HF Space entrypoint)
├── main.py                   # Pipeline runner
├── requirements.txt          # Python dependencies
├── .env.example              # API key template
└── README.md
```
---
## ⚡ Performance
| Metric | Value |
|--------|-------|
| Average report time | ~30-60 seconds |
| Tavily searches per run | 9 (3 queries × 3 results) |
| Max revision cycles | 2 |
| Token usage per run | ~8,000-11,000 tokens |
---
## ⚠️ Known Limitations
- **Critic cannot verify all hallucinations:** it cross-references against Tavily results but cannot catch confidently stated errors absent from search results
- **Groq free tier:** 12,000 TPM limit may cause rate limiting on complex topics
- **Tavily free tier:** 1,000 searches/month; each run uses 9 searches
- **Report quality depends on search result quality:** niche or poorly documented topics produce lower confidence scores
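The free-tier numbers above translate into a rough usage budget. The arithmetic below uses only figures quoted in this README and ignores retries and revision cycles, so treat it as an estimate rather than a guarantee.

```python
TAVILY_MONTHLY_SEARCHES = 1_000
SEARCHES_PER_RUN = 9
GROQ_TOKENS_PER_MINUTE = 12_000
TOKENS_PER_RUN_LOW, TOKENS_PER_RUN_HIGH = 8_000, 11_000

# Whole runs that fit into the monthly Tavily quota.
runs_per_month = TAVILY_MONTHLY_SEARCHES // SEARCHES_PER_RUN

# Whole runs that fit into one minute of Groq token budget.
runs_per_minute_best = GROQ_TOKENS_PER_MINUTE // TOKENS_PER_RUN_LOW
runs_per_minute_worst = GROQ_TOKENS_PER_MINUTE // TOKENS_PER_RUN_HIGH

print(runs_per_month)                               # 111
print(runs_per_minute_best, runs_per_minute_worst)  # 1 1
```

So the Tavily quota allows roughly 111 reports a month, and the Groq limit leaves room for about one full run per minute, which is why back-to-back runs can hit rate limiting.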
---
## 🗺️ What I Learned
- LangGraph state management and conditional routing
- Critic pattern in multi-agent systems, and its honest limitations
- Why multi-agent genuinely outperforms single-agent for parallel specialization
- Token optimization for free-tier LLM APIs
- Separation of concerns in agent design (single responsibility per node)
---
## 🔮 Future Improvements
- Parallel agent execution (Research + Analyst simultaneously)
- Vector store memory for cross-session topic persistence
- PDF export for reports
- Domain-specific agent personas (legal, medical, financial)
- Human-in-the-loop approval before Writer Agent runs
---
*Built as Project 4 of an AI Engineering portfolio. Part of a progression from model fine-tuning → RAG systems → single agents → multi-agent orchestration.*