# πŸ”¬ AI Research Paper Analyst β€” Project Walkthrough

> **Automated Peer-Review System powered by Multi-Agent AI**
>
> Upload a research paper (PDF) β†’ receive a publication-ready peer review with methodology critique, novelty assessment, rubric scoring, and an Accept / Revise / Reject recommendation.

---

## 1. What Does This System Do?

| Input | Output |
|---|---|
| A single **PDF** research paper | A structured **peer-review report** with strengths, weaknesses, rubric scores, and a recommendation |

**Key stats:**

- **7 specialized AI agents** working in a sequential pipeline
- **5 custom tools** (PDF parsing, PII redaction, injection scanning, URL validation, citation search)
- **8 Pydantic schemas** enforcing structured JSON output from every agent
- **15-point binary rubric** for quality assurance
- **Gradio web UI** with 6 tabs for exploring every aspect of the review

---

## 2. System Architecture Flowchart

![System Architecture](docs/images/system_architecture.png)

---

## 3. Simplified Pipeline Flow

![Pipeline Flow](docs/images/pipeline_flow.png)

---

## 4. The 7 Agents

| # | Agent | LLM | Role | Key Output |
|---|---|---|---|---|
| 1 | πŸ›‘οΈ **Safety Guardian** | None (programmatic) | Gate β€” blocks unsafe docs before any LLM sees them | `SafetyReport` |
| 2 | πŸ“„ **Paper Extractor** | GPT-4o | Extract title, authors, abstract, methodology, findings | `PaperExtraction` |
| 3 | πŸ”¬ **Methodology Critic** | GPT-4o-mini | Evaluate study design, stats, reproducibility | `MethodologyCritique` |
| 4 | πŸ” **Relevance Researcher** | GPT-4o-mini | Search Semantic Scholar / OpenAlex for related work | `RelevanceReport` |
| 5 | ✍️ **Review Synthesizer** | GPT-4o-mini | Combine all insights into a peer-review draft | `ReviewDraft` |
| 6 | πŸ“ **Rubric Evaluator** | GPT-4o-mini | Score the draft on 15 binary criteria (pass β‰₯ 11/15) | `RubricEvaluation` |
| 7 | ✨ **Enhancer** | GPT-4o-mini | Fix rubric failures, produce publication-ready report | `FinalReview` |

---

## 5. The 5 Tools

| # | Tool | File | Used By | What It Does |
|---|---|---|---|---|
| 1 | πŸ“‘ **PDF Parser** | `tools/pdf_parser.py` | Safety Guardian, Paper Extractor | Extracts text from PDF using `pdfplumber`. Validates file type, existence, and size (≀ 20 MB). |
| 2 | πŸ”’ **PII Detector** | `tools/pii_detector.py` | Safety Guardian | Regex-based scan for emails, phone numbers, SSNs, credit cards. Replaces matches with `[REDACTED_TYPE]`. |
| 3 | 🚫 **Injection Scanner** | `tools/injection_scanner.py` | Safety Guardian | Checks text against 9 prompt-injection patterns (e.g. "ignore previous instructions", `[INST]`). Fail-safe: defaults to **unsafe** if scanning crashes. |
| 4 | 🌐 **URL Validator** | `tools/url_validator.py` | Safety Guardian | Extracts URLs via regex, checks them against a blocklist (bit.ly, tinyurl, `data:`, `javascript:`). Max 50 URLs per scan. |
| 5 | πŸ”Ž **Citation Search** | `tools/citation_search.py` | Relevance Researcher | Searches **Semantic Scholar** (with retry + backoff for rate limits). Falls back to **OpenAlex** if unavailable. Max 3 API calls per run. |

### Tool–Agent Assignment Map

![Tool-Agent Assignment](docs/images/tool_agent_map.png)

---

## 6. Pydantic Schemas (Structured Output)

Every agent is forced to output **validated JSON** through Pydantic schemas. If an agent's output doesn't match the schema, CrewAI automatically retries with a correction prompt.

| Schema | Key Fields |
|---|---|
| `SafetyReport` | `is_safe`, `pii_found`, `injection_detected`, `malicious_urls`, `risk_level` |
| `PaperExtraction` | `title`, `authors`, `abstract`, `methodology`, `key_findings`, `paper_type`, `extraction_confidence` |
| `MethodologyCritique` | `strengths`, `weaknesses`, `methodology_score` (1–10), `reproducibility_score` (1–10), `bias_risks` |
| `RelevanceReport` | `related_papers[]`, `novelty_score` (1–10), `field_context`, `gaps_addressed` |
| `ReviewDraft` | `summary`, `strengths_section`, `weaknesses_section`, `recommendation` (Accept/Revise/Reject) |
| `RubricEvaluation` | `scores{}` (15 binary criteria), `total_score` (0–15), `passed` (β‰₯ 11) |
| `FinalReview` | `executive_summary`, `strengths`, `weaknesses`, `recommendation`, `confidence_score`, `improvement_log` |

---

## 7. Safety & Guardrails β€” 5 Layers

![5-Layer Safety Architecture](docs/images/safety_layers.png)

**Key principle:** The Safety Guardian uses **zero LLM calls** β€” all safety decisions are deterministic regex/logic. This prevents prompt-injection attacks from manipulating the safety gate itself.

---

## 8. Rubric β€” 15 Binary Criteria

The Rubric Evaluator scores the review on **15 strict pass/fail criteria** (0 or 1 each). A review **passes** with β‰₯ 11/15.
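The aggregation behind `RubricEvaluation` is easy to state in code. A minimal sketch (criterion names and function name are illustrative, not the actual keys or helpers in the project):

```python
# Hedged sketch of the rubric aggregation: sum 15 binary criteria and apply
# the documented pass mark (>= 11/15). Not the project's actual code.
PASS_THRESHOLD = 11
NUM_CRITERIA = 15

def aggregate_rubric(scores: dict[str, int]) -> tuple[int, bool]:
    """Sum 15 binary (0/1) criterion scores and apply the pass threshold."""
    if len(scores) != NUM_CRITERIA:
        raise ValueError(f"expected {NUM_CRITERIA} criteria, got {len(scores)}")
    if any(v not in (0, 1) for v in scores.values()):
        raise ValueError("each criterion must be scored 0 or 1")
    total = sum(scores.values())
    return total, total >= PASS_THRESHOLD
```

For example, a review that passes 12 criteria yields `(12, True)`, while one that passes 10 yields `(10, False)` and is routed to the Enhancer.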
| # | Category | Criterion |
|---|---|---|
| 1 | πŸ“‹ Content | Title & authors correctly identified |
| 2 | πŸ“‹ Content | Abstract accurately summarized |
| 3 | πŸ“‹ Content | Methodology clearly described |
| 4 | πŸ“‹ Content | At least 3 distinct strengths |
| 5 | πŸ“‹ Content | At least 3 distinct weaknesses |
| 6 | πŸ“‹ Content | Limitations acknowledged |
| 7 | πŸ“‹ Content | Related work present (2+ papers) |
| 8 | πŸ”¬ Depth | Novelty assessed with justification |
| 9 | πŸ”¬ Depth | Reproducibility discussed |
| 10 | πŸ”¬ Depth | Evidence quality evaluated |
| 11 | πŸ”¬ Depth | Contribution to field stated |
| 12 | πŸ“ Quality | Recommendation justified with evidence |
| 13 | πŸ“ Quality | At least 3 actionable questions |
| 14 | πŸ“ Quality | No hallucinated citations |
| 15 | πŸ“ Quality | Professional tone and coherent structure |

---

## 9. Gradio UI β€” 6 Tabs

| Tab | What It Shows |
|---|---|
| πŸ“‹ **Executive Summary** | Recommendation (Accept/Revise/Reject), confidence, rubric score, paper info + download button |
| πŸ“ **Full Review** | Strengths, weaknesses, methodology & novelty assessments, author questions |
| πŸ“Š **Rubric Scorecard** | All 15 criteria with βœ…/❌ scores and per-criterion feedback |
| πŸ›‘οΈ **Safety Report** | PII findings, injection scan result, URL analysis |
| πŸ’Ž **Agent Outputs** | Raw structured JSON output from each of the 7 agents |
| βš™οΈ **Pipeline Logs** | Timestamped execution log + JSON run summary |

---

## 10. Tech Stack

| Package | Purpose |
|---|---|
| **CrewAI** β‰₯ 0.86.0 | Multi-agent orchestration framework |
| **OpenAI** β‰₯ 1.0.0 | LLM API β€” GPT-4o + GPT-4o-mini |
| **Gradio** β‰₯ 5.0.0 | Web UI |
| **pdfplumber** β‰₯ 0.11.0 | PDF text extraction |
| **Pydantic** β‰₯ 2.0.0 | Structured output validation |
| **python-dotenv** β‰₯ 1.0.0 | `.env` file loading |
| **requests** β‰₯ 2.31.0 | HTTP calls to Semantic Scholar / OpenAlex |

---

## 11. Project Structure

```
Homework5_agentincAI/
β”œβ”€β”€ app.py                      # Main pipeline + Gradio UI (1045 lines)
β”œβ”€β”€ requirements.txt            # Dependencies
β”œβ”€β”€ .env                        # OPENAI_API_KEY
β”‚
β”œβ”€β”€ agents/                     # CrewAI agent definitions
β”‚   β”œβ”€β”€ paper_extractor.py      # Step 1 β€” GPT-4o
β”‚   β”œβ”€β”€ methodology_critic.py   # Step 2a β€” GPT-4o-mini
β”‚   β”œβ”€β”€ relevance_researcher.py # Step 2b β€” GPT-4o-mini
β”‚   β”œβ”€β”€ review_synthesizer.py   # Step 3 β€” GPT-4o-mini
β”‚   β”œβ”€β”€ rubric_evaluator.py     # Step 4 β€” GPT-4o-mini
β”‚   └── enhancer.py             # Step 5 β€” GPT-4o-mini
β”‚
β”œβ”€β”€ tools/                      # Custom tools
β”‚   β”œβ”€β”€ pdf_parser.py           # PDF β†’ text
β”‚   β”œβ”€β”€ pii_detector.py         # PII scan & redact
β”‚   β”œβ”€β”€ injection_scanner.py    # Prompt injection detection
β”‚   β”œβ”€β”€ url_validator.py        # URL blocklist check
β”‚   └── citation_search.py      # Semantic Scholar / OpenAlex
β”‚
└── schemas/
    └── models.py               # All 8 Pydantic schemas
```

---

## 12. How to Run

```bash
# 1. Install dependencies
pip install -r requirements.txt

# 2. Set your OpenAI API key in .env
echo "OPENAI_API_KEY=your-key-here" > .env

# 3. Launch the app
python app.py
```

Open **http://localhost:7860** β†’ Upload a PDF β†’ Click **"Analyze Paper"** β†’ Wait 1–3 minutes β†’ Review across all 6 tabs.

---

*AI Research Paper Analyst β€” Homework 5, Agentic AI Bootcamp*
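
---

## Appendix: Deterministic Redaction, Sketched

The PII Detector (Section 5, tool 2) is pure regex with no LLM in the loop. A minimal sketch of that approach, using simplified stand-in patterns and a hypothetical `redact_pii` helper rather than the actual code in `tools/pii_detector.py`:

```python
import re

# Simplified stand-in patterns; the real pii_detector.py covers more PII
# types (phone numbers, credit cards) with stricter regexes.
PII_PATTERNS = {
    "EMAIL": r"[\w.+-]+@[\w-]+\.[\w.-]+",
    "SSN": r"\b\d{3}-\d{2}-\d{4}\b",
}

def redact_pii(text: str) -> tuple[str, dict[str, int]]:
    """Replace each PII match with [REDACTED_<TYPE>] and count matches per type."""
    counts: dict[str, int] = {}
    for label, pattern in PII_PATTERNS.items():
        text, n = re.subn(pattern, f"[REDACTED_{label}]", text)
        if n:
            counts[label] = n
    return text, counts
```

Because every decision is a regex match, the redaction is deterministic and cannot itself be steered by prompt-injection text in the paper, which is the point of the zero-LLM safety gate.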