| # π¬ AI Research Paper Analyst β Project Walkthrough | |
| > **Automated Peer-Review System powered by Multi-Agent AI** | |
| > | |
| > Upload a research paper (PDF) β receive a publication-ready peer review with methodology critique, novelty assessment, rubric scoring, and an Accept / Revise / Reject recommendation. | |
| --- | |
| ## 1. What Does This System Do? | |
| | Input | Output | | |
| |---|---| | |
| | A single **PDF** research paper | A structured **peer-review report** with strengths, weaknesses, rubric scores, and a recommendation | | |
| **Key stats:** | |
| - **7 specialized AI agents** working in a sequential pipeline | |
| - **5 custom tools** (PDF parsing, PII redaction, injection scanning, URL validation, citation search) | |
| - **8 Pydantic schemas** enforcing structured JSON output from every agent | |
| - **15-point binary rubric** for quality assurance | |
| - **Gradio web UI** with 6 tabs for exploring every aspect of the review | |
| --- | |
| ## 2. System Architecture Flowchart | |
|  | |
| --- | |
| ## 3. Simplified Pipeline Flow | |
|  | |
| --- | |
| ## 4. The 7 Agents | |
| | # | Agent | LLM | Role | Key Output | | |
| |---|---|---|---|---| | |
| | 1 | π‘οΈ **Safety Guardian** | None (programmatic) | Gate β blocks unsafe docs before any LLM sees them | `SafetyReport` | | |
| | 2 | π **Paper Extractor** | GPT-4o | Extract title, authors, abstract, methodology, findings | `PaperExtraction` | | |
| | 3 | π¬ **Methodology Critic** | GPT-4o-mini | Evaluate study design, stats, reproducibility | `MethodologyCritique` | | |
| | 4 | π **Relevance Researcher** | GPT-4o-mini | Search Semantic Scholar / OpenAlex for related work | `RelevanceReport` | | |
| | 5 | βοΈ **Review Synthesizer** | GPT-4o-mini | Combine all insights into a peer-review draft | `ReviewDraft` | | |
| | 6 | π **Rubric Evaluator** | GPT-4o-mini | Score the draft on 15 binary criteria (pass β₯ 11/15) | `RubricEvaluation` | | |
| | 7 | β¨ **Enhancer** | GPT-4o-mini | Fix rubric failures, produce publication-ready report | `FinalReview` | | |
| --- | |
| ## 5. The 5 Tools | |
| | # | Tool | File | Used By | What It Does | | |
| |---|---|---|---|---| | |
| | 1 | π **PDF Parser** | `tools/pdf_parser.py` | Safety Guardian, Paper Extractor | Extracts text from PDF using `pdfplumber`. Validates file type, existence, and size (β€ 20 MB). | | |
| | 2 | π **PII Detector** | `tools/pii_detector.py` | Safety Guardian | Regex-based scan for emails, phone numbers, SSNs, credit cards. Replaces matches with `[REDACTED_TYPE]`. | | |
| | 3 | π« **Injection Scanner** | `tools/injection_scanner.py` | Safety Guardian | Checks text against 9 prompt-injection patterns (e.g. "ignore previous instructions", `[INST]`). Fail-safe: defaults to **unsafe** if scanning crashes. | | |
| | 4 | π **URL Validator** | `tools/url_validator.py` | Safety Guardian | Extracts URLs via regex, checks against blocklist (bit.ly, tinyurl, `data:`, `javascript:`). Max 50 URLs per scan. | | |
| | 5 | π **Citation Search** | `tools/citation_search.py` | Relevance Researcher | Searches **Semantic Scholar** (with retry + backoff for rate limits). Falls back to **OpenAlex** if unavailable. Max 3 API calls per run. | | |
| ### ToolβAgent Assignment Map | |
|  | |
| --- | |
| ## 6. Pydantic Schemas (Structured Output) | |
| Every agent is forced to output **validated JSON** through Pydantic schemas. If an agent's output doesn't match the schema, CrewAI automatically retries with a correction prompt. | |
| | Schema | Key Fields | | |
| |---|---| | |
| | `SafetyReport` | `is_safe`, `pii_found`, `injection_detected`, `malicious_urls`, `risk_level` | | |
| | `PaperExtraction` | `title`, `authors`, `abstract`, `methodology`, `key_findings`, `paper_type`, `extraction_confidence` | | |
| | `MethodologyCritique` | `strengths`, `weaknesses`, `methodology_score` (1-10), `reproducibility_score` (1-10), `bias_risks` | | |
| | `RelevanceReport` | `related_papers[]`, `novelty_score` (1-10), `field_context`, `gaps_addressed` | | |
| | `ReviewDraft` | `summary`, `strengths_section`, `weaknesses_section`, `recommendation` (Accept/Revise/Reject) | | |
| | `RubricEvaluation` | `scores{}` (15 binary criteria), `total_score` (0β15), `passed` (β₯ 11) | | |
| | `FinalReview` | `executive_summary`, `strengths`, `weaknesses`, `recommendation`, `confidence_score`, `improvement_log` | | |
| --- | |
| ## 7. Safety & Guardrails β 5 Layers | |
|  | |
| **Key principle:** The Safety Guardian uses **zero LLM calls** β all safety decisions are deterministic regex/logic. This prevents prompt injection attacks from manipulating the safety gate itself. | |
| --- | |
| ## 8. Rubric β 15 Binary Criteria | |
| The Rubric Evaluator scores the review on **15 strict pass/fail criteria** (0 or 1 each). A review **passes** with β₯ 11/15. | |
| | # | Category | Criterion | | |
| |---|---|---| | |
| | 1 | π Content | Title & authors correctly identified | | |
| | 2 | π Content | Abstract accurately summarized | | |
| | 3 | π Content | Methodology clearly described | | |
| | 4 | π Content | At least 3 distinct strengths | | |
| | 5 | π Content | At least 3 distinct weaknesses | | |
| | 6 | π Content | Limitations acknowledged | | |
| | 7 | π Content | Related work present (2+ papers) | | |
| | 8 | π¬ Depth | Novelty assessed with justification | | |
| | 9 | π¬ Depth | Reproducibility discussed | | |
| | 10 | π¬ Depth | Evidence quality evaluated | | |
| | 11 | π¬ Depth | Contribution to field stated | | |
| | 12 | π Quality | Recommendation justified with evidence | | |
| | 13 | π Quality | At least 3 actionable questions | | |
| | 14 | π Quality | No hallucinated citations | | |
| | 15 | π Quality | Professional tone and coherent structure | | |
| --- | |
| ## 9. Gradio UI β 6 Tabs | |
| | Tab | What It Shows | | |
| |---|---| | |
| | π **Executive Summary** | Recommendation (Accept/Revise/Reject), confidence, rubric score, paper info + download button | | |
| | π **Full Review** | Strengths, weaknesses, methodology & novelty assessments, author questions | | |
| | π **Rubric Scorecard** | All 15 criteria with β /β scores and per-criterion feedback | | |
| | π‘οΈ **Safety Report** | PII findings, injection scan result, URL analysis | | |
| | π **Agent Outputs** | Raw structured JSON output from each of the 7 agents | | |
| | βοΈ **Pipeline Logs** | Timestamped execution log + JSON run summary | | |
| --- | |
| ## 10. Tech Stack | |
| | Package | Purpose | | |
| |---|---| | |
| | **CrewAI** β₯ 0.86.0 | Multi-agent orchestration framework | | |
| | **OpenAI** β₯ 1.0.0 | LLM API β GPT-4o + GPT-4o-mini | | |
| | **Gradio** β₯ 5.0.0 | Web UI | | |
| | **pdfplumber** β₯ 0.11.0 | PDF text extraction | | |
| | **Pydantic** β₯ 2.0.0 | Structured output validation | | |
| | **python-dotenv** β₯ 1.0.0 | `.env` file loading | | |
| | **requests** β₯ 2.31.0 | HTTP calls to Semantic Scholar / OpenAlex | | |
| --- | |
| ## 11. Project Structure | |
| ``` | |
| Homework5_agentincAI/ | |
| βββ app.py # Main pipeline + Gradio UI (1045 lines) | |
| βββ requirements.txt # Dependencies | |
| βββ .env # OPENAI_API_KEY | |
| β | |
| βββ agents/ # CrewAI agent definitions | |
| β βββ paper_extractor.py # Step 1 β GPT-4o | |
| β βββ methodology_critic.py # Step 2a β GPT-4o-mini | |
| β βββ relevance_researcher.py # Step 2b β GPT-4o-mini | |
| β βββ review_synthesizer.py # Step 3 β GPT-4o-mini | |
| β βββ rubric_evaluator.py # Step 4 β GPT-4o-mini | |
| β βββ enhancer.py # Step 5 β GPT-4o-mini | |
| β | |
| βββ tools/ # Custom tools | |
| β βββ pdf_parser.py # PDF β text | |
| β βββ pii_detector.py # PII scan & redact | |
| β βββ injection_scanner.py # Prompt injection detection | |
| β βββ url_validator.py # URL blocklist check | |
| β βββ citation_search.py # Semantic Scholar / OpenAlex | |
| β | |
| βββ schemas/ | |
| βββ models.py # All 8 Pydantic schemas | |
| ``` | |
| --- | |
| ## 12. How to Run | |
| ```bash | |
| # 1. Install dependencies | |
| pip install -r requirements.txt | |
| # 2. Set your OpenAI API key in .env | |
| echo "OPENAI_API_KEY=your-key-here" > .env | |
| # 3. Launch the app | |
| python app.py | |
| ``` | |
| Open **http://localhost:7860** β Upload a PDF β Click **"Analyze Paper"** β Wait 1β3 minutes β Review across all 6 tabs. | |
| --- | |
| *AI Research Paper Analyst β Homework 5, Agentic AI Bootcamp* | |