
# 🔬 AI Research Paper Analyst — Project Walkthrough

Automated Peer-Review System powered by Multi-Agent AI

Upload a research paper (PDF) → receive a publication-ready peer review with methodology critique, novelty assessment, rubric scoring, and an Accept / Revise / Reject recommendation.


## 1. What Does This System Do?

| Input | Output |
|-------|--------|
| A single PDF research paper | A structured peer-review report with strengths, weaknesses, rubric scores, and a recommendation |

Key stats:

- 7 specialized AI agents working in a sequential pipeline
- 5 custom tools (PDF parsing, PII redaction, injection scanning, URL validation, citation search)
- 8 Pydantic schemas enforcing structured JSON output from every agent
- 15-point binary rubric for quality assurance
- Gradio web UI with 6 tabs for exploring every aspect of the review

## 2. System Architecture Flowchart

*(Diagram: System Architecture)*


## 3. Simplified Pipeline Flow

*(Diagram: Pipeline Flow)*


## 4. The 7 Agents

| # | Agent | LLM | Role | Key Output |
|---|-------|-----|------|------------|
| 1 | 🛡️ Safety Guardian | None (programmatic) | Gate — blocks unsafe docs before any LLM sees them | `SafetyReport` |
| 2 | 📄 Paper Extractor | GPT-4o | Extract title, authors, abstract, methodology, findings | `PaperExtraction` |
| 3 | 🔬 Methodology Critic | GPT-4o-mini | Evaluate study design, stats, reproducibility | `MethodologyCritique` |
| 4 | 🔍 Relevance Researcher | GPT-4o-mini | Search Semantic Scholar / OpenAlex for related work | `RelevanceReport` |
| 5 | ✍️ Review Synthesizer | GPT-4o-mini | Combine all insights into a peer-review draft | `ReviewDraft` |
| 6 | 📝 Rubric Evaluator | GPT-4o-mini | Score the draft on 15 binary criteria (pass ≥ 11/15) | `RubricEvaluation` |
| 7 | ✨ Enhancer | GPT-4o-mini | Fix rubric failures, produce publication-ready report | `FinalReview` |
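The sequential hand-off between these agents can be sketched in plain Python. The functions below are stand-in stubs, not the project's actual CrewAI tasks; they only illustrate how each step consumes the previous step's output and how the gate short-circuits the pipeline:

```python
# Sketch of the sequential hand-off between the seven agents.
# All function bodies are illustrative stubs, not the real CrewAI tasks.
def safety_guardian(text: str) -> dict:
    # Agent 1: deterministic gate, no LLM call (see section 7).
    unsafe = "ignore previous instructions" in text.lower()
    return {"is_safe": not unsafe, "text": text}

def run_pipeline(text: str) -> dict:
    safety = safety_guardian(text)
    if not safety["is_safe"]:
        return {"blocked": True, "report": safety}  # unsafe docs stop here
    # Agents 2-7 each consume the previous agent's validated output;
    # these one-liners stand in for the real LLM-backed steps.
    extraction = {"title": text.splitlines()[0]}              # Paper Extractor
    draft = {"summary": f"Review of {extraction['title']}"}   # Synthesizer etc.
    return {"blocked": False, "review": draft}
```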

## 5. The 5 Tools

| # | Tool | File | Used By | What It Does |
|---|------|------|---------|--------------|
| 1 | 📑 PDF Parser | `tools/pdf_parser.py` | Safety Guardian, Paper Extractor | Extracts text from PDF using pdfplumber. Validates file type, existence, and size (≤ 20 MB). |
| 2 | 🔒 PII Detector | `tools/pii_detector.py` | Safety Guardian | Regex-based scan for emails, phone numbers, SSNs, credit cards. Replaces matches with `[REDACTED_TYPE]`. |
| 3 | 🚫 Injection Scanner | `tools/injection_scanner.py` | Safety Guardian | Checks text against 9 prompt-injection patterns (e.g. "ignore previous instructions", `[INST]`). Fail-safe: defaults to unsafe if scanning crashes. |
| 4 | 🌐 URL Validator | `tools/url_validator.py` | Safety Guardian | Extracts URLs via regex, checks against a blocklist (bit.ly, tinyurl, `data:`, `javascript:`). Max 50 URLs per scan. |
| 5 | 🔎 Citation Search | `tools/citation_search.py` | Relevance Researcher | Searches Semantic Scholar (with retry + backoff for rate limits). Falls back to OpenAlex if unavailable. Max 3 API calls per run. |
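As a flavor of how the scan-and-redact tools work, here is a minimal sketch of the PII Detector's approach. The two regexes are simplified guesses at the general technique, not the actual patterns in `tools/pii_detector.py`:

```python
import re

# Illustrative subset of the PII Detector: per-type regexes, with matches
# replaced by [REDACTED_TYPE]. These patterns are simplified guesses, not
# the ones in tools/pii_detector.py.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(text: str) -> tuple[str, list[str]]:
    """Redact PII and report which types were found."""
    found = []
    for label, pattern in PII_PATTERNS.items():
        if pattern.search(text):
            found.append(label)
            text = pattern.sub(f"[REDACTED_{label}]", text)
    return text, found
```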

### Tool–Agent Assignment Map

*(Diagram: Tool–Agent Assignment)*


## 6. Pydantic Schemas (Structured Output)

Every agent is forced to output validated JSON through Pydantic schemas. If an agent's output doesn't match the schema, CrewAI automatically retries with a correction prompt.

| Schema | Key Fields |
|--------|------------|
| `SafetyReport` | `is_safe`, `pii_found`, `injection_detected`, `malicious_urls`, `risk_level` |
| `PaperExtraction` | `title`, `authors`, `abstract`, `methodology`, `key_findings`, `paper_type`, `extraction_confidence` |
| `MethodologyCritique` | `strengths`, `weaknesses`, `methodology_score` (1–10), `reproducibility_score` (1–10), `bias_risks` |
| `RelevanceReport` | `related_papers[]`, `novelty_score` (1–10), `field_context`, `gaps_addressed` |
| `ReviewDraft` | `summary`, `strengths_section`, `weaknesses_section`, `recommendation` (Accept/Revise/Reject) |
| `RubricEvaluation` | `scores{}` (15 binary criteria), `total_score` (0–15), `passed` (≥ 11) |
| `FinalReview` | `executive_summary`, `strengths`, `weaknesses`, `recommendation`, `confidence_score`, `improvement_log` |
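As an example of the pattern, here is a hypothetical reconstruction of `SafetyReport` (the real definitions live in `schemas/models.py`). A malformed response raises `ValidationError`, which is the signal that triggers a retry with a correction prompt:

```python
from pydantic import BaseModel, Field, ValidationError

# Hypothetical reconstruction of SafetyReport: field names follow the
# table above, but the actual types live in schemas/models.py.
class SafetyReport(BaseModel):
    is_safe: bool
    pii_found: list[str] = Field(default_factory=list)
    injection_detected: bool = False
    malicious_urls: list[str] = Field(default_factory=list)
    risk_level: str = "low"

# A well-formed agent response validates cleanly...
report = SafetyReport.model_validate({"is_safe": True, "risk_level": "high"})

# ...while a malformed one raises ValidationError, the retry trigger.
try:
    SafetyReport.model_validate({"is_safe": "maybe"})
except ValidationError:
    print("invalid output - would retry with a correction prompt")
```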

## 7. Safety & Guardrails — 5 Layers

*(Diagram: 5-Layer Safety Architecture)*

Key principle: The Safety Guardian uses zero LLM calls — all safety decisions are deterministic regex/logic. This prevents prompt-injection attacks from manipulating the safety gate itself.
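The fail-safe convention behind this gate fits in a few lines. This is a sketch only; the pattern list is a small illustrative subset, not the 9 patterns actually checked by `tools/injection_scanner.py`:

```python
# Fail-safe injection check: any exception during scanning is treated as
# unsafe, so the gate fails closed rather than open. The pattern list is
# an illustrative subset, not the project's full set of 9.
INJECTION_PATTERNS = ("ignore previous instructions", "[inst]")

def is_injection_free(text) -> bool:
    try:
        lowered = text.lower()
        return not any(p in lowered for p in INJECTION_PATTERNS)
    except Exception:
        return False  # scanning crashed: default to unsafe
```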


## 8. Rubric — 15 Binary Criteria

The Rubric Evaluator scores the review on 15 strict pass/fail criteria (0 or 1 each). A review passes with ≥ 11/15.

| # | Category | Criterion |
|---|----------|-----------|
| 1 | 📋 Content | Title & authors correctly identified |
| 2 | 📋 Content | Abstract accurately summarized |
| 3 | 📋 Content | Methodology clearly described |
| 4 | 📋 Content | At least 3 distinct strengths |
| 5 | 📋 Content | At least 3 distinct weaknesses |
| 6 | 📋 Content | Limitations acknowledged |
| 7 | 📋 Content | Related work present (2+ papers) |
| 8 | 🔬 Depth | Novelty assessed with justification |
| 9 | 🔬 Depth | Reproducibility discussed |
| 10 | 🔬 Depth | Evidence quality evaluated |
| 11 | 🔬 Depth | Contribution to field stated |
| 12 | 📝 Quality | Recommendation justified with evidence |
| 13 | 📝 Quality | At least 3 actionable questions |
| 14 | 📝 Quality | No hallucinated citations |
| 15 | 📝 Quality | Professional tone and coherent structure |

## 9. Gradio UI — 6 Tabs

| Tab | What It Shows |
|-----|---------------|
| 📋 Executive Summary | Recommendation (Accept/Revise/Reject), confidence, rubric score, paper info + download button |
| 📝 Full Review | Strengths, weaknesses, methodology & novelty assessments, author questions |
| 📊 Rubric Scorecard | All 15 criteria with ✅/❌ scores and per-criterion feedback |
| 🛡️ Safety Report | PII findings, injection scan result, URL analysis |
| 💎 Agent Outputs | Raw structured JSON output from each of the 7 agents |
| ⚙️ Pipeline Logs | Timestamped execution log + JSON run summary |

## 10. Tech Stack

| Package | Purpose |
|---------|---------|
| CrewAI ≥ 0.86.0 | Multi-agent orchestration framework |
| OpenAI ≥ 1.0.0 | LLM API — GPT-4o + GPT-4o-mini |
| Gradio ≥ 5.0.0 | Web UI |
| pdfplumber ≥ 0.11.0 | PDF text extraction |
| Pydantic ≥ 2.0.0 | Structured output validation |
| python-dotenv ≥ 1.0.0 | `.env` file loading |
| requests ≥ 2.31.0 | HTTP calls to Semantic Scholar / OpenAlex |

## 11. Project Structure

```
Homework5_agentincAI/
├── app.py                         # Main pipeline + Gradio UI (1045 lines)
├── requirements.txt               # Dependencies
├── .env                           # OPENAI_API_KEY
│
├── agents/                        # CrewAI agent definitions
│   ├── paper_extractor.py         # Step 1 — GPT-4o
│   ├── methodology_critic.py      # Step 2a — GPT-4o-mini
│   ├── relevance_researcher.py    # Step 2b — GPT-4o-mini
│   ├── review_synthesizer.py      # Step 3 — GPT-4o-mini
│   ├── rubric_evaluator.py        # Step 4 — GPT-4o-mini
│   └── enhancer.py                # Step 5 — GPT-4o-mini
│
├── tools/                         # Custom tools
│   ├── pdf_parser.py              # PDF → text
│   ├── pii_detector.py            # PII scan & redact
│   ├── injection_scanner.py       # Prompt injection detection
│   ├── url_validator.py           # URL blocklist check
│   └── citation_search.py         # Semantic Scholar / OpenAlex
│
└── schemas/
    └── models.py                  # All 8 Pydantic schemas
```

## 12. How to Run

```bash
# 1. Install dependencies
pip install -r requirements.txt

# 2. Set your OpenAI API key in .env
echo "OPENAI_API_KEY=your-key-here" > .env

# 3. Launch the app
python app.py
```

Open http://localhost:7860 → upload a PDF → click "Analyze Paper" → wait 1–3 minutes → review the results across all 6 tabs.


AI Research Paper Analyst — Homework 5, Agentic AI Bootcamp