AI Research Paper Analyst – Project Walkthrough
Automated Peer-Review System powered by Multi-Agent AI
Upload a research paper (PDF) → receive a publication-ready peer review with methodology critique, novelty assessment, rubric scoring, and an Accept / Revise / Reject recommendation.
1. What Does This System Do?
| Input | Output |
|---|---|
| A single PDF research paper | A structured peer-review report with strengths, weaknesses, rubric scores, and a recommendation |
Key stats:
- 7 specialized AI agents working in a sequential pipeline
- 5 custom tools (PDF parsing, PII redaction, injection scanning, URL validation, citation search)
- 8 Pydantic schemas enforcing structured JSON output from every agent
- 15-point binary rubric for quality assurance
- Gradio web UI with 6 tabs for exploring every aspect of the review
2. System Architecture Flowchart
3. Simplified Pipeline Flow
4. The 7 Agents
| # | Agent | LLM | Role | Key Output |
|---|---|---|---|---|
| 1 | Safety Guardian | None (programmatic) | Gate: blocks unsafe docs before any LLM sees them | SafetyReport |
| 2 | Paper Extractor | GPT-4o | Extract title, authors, abstract, methodology, findings | PaperExtraction |
| 3 | Methodology Critic | GPT-4o-mini | Evaluate study design, stats, reproducibility | MethodologyCritique |
| 4 | Relevance Researcher | GPT-4o-mini | Search Semantic Scholar / OpenAlex for related work | RelevanceReport |
| 5 | Review Synthesizer | GPT-4o-mini | Combine all insights into a peer-review draft | ReviewDraft |
| 6 | Rubric Evaluator | GPT-4o-mini | Score the draft on 15 binary criteria (pass ≥ 11/15) | RubricEvaluation |
| 7 | Enhancer | GPT-4o-mini | Fix rubric failures, produce publication-ready report | FinalReview |
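The sequential hand-off between agents can be sketched in plain Python. This is a conceptual sketch only: the actual project uses CrewAI's orchestration, and the stub agents and shared-context dict below are illustrative assumptions, not the project's code.

```python
# Conceptual sketch of the sequential pipeline (the real project uses
# CrewAI; agent names and payload shapes here are illustrative only).

def run_pipeline(pdf_path, agents):
    """Run each agent in order, accumulating results in a shared context."""
    context = {"pdf_path": pdf_path}
    for name, agent_fn in agents:
        result = agent_fn(context)
        # The Safety Guardian acts as a hard gate: abort before any LLM runs.
        if name == "safety_guardian" and not result.get("is_safe", False):
            raise RuntimeError("Unsafe document - pipeline halted")
        context[name] = result
    return context

# Stub agents standing in for the real LLM-backed ones.
agents = [
    ("safety_guardian", lambda ctx: {"is_safe": True, "risk_level": "low"}),
    ("paper_extractor", lambda ctx: {"title": "Example Paper"}),
]
report = run_pipeline("paper.pdf", agents)
```

Because the gate raises before any downstream agent runs, an unsafe document never reaches an LLM-backed step.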
5. The 5 Tools
| # | Tool | File | Used By | What It Does |
|---|---|---|---|---|
| 1 | PDF Parser | tools/pdf_parser.py | Safety Guardian, Paper Extractor | Extracts text from the PDF using pdfplumber. Validates file type, existence, and size (≤ 20 MB). |
| 2 | PII Detector | tools/pii_detector.py | Safety Guardian | Regex-based scan for emails, phone numbers, SSNs, and credit cards. Replaces matches with [REDACTED_TYPE]. |
| 3 | Injection Scanner | tools/injection_scanner.py | Safety Guardian | Checks text against 9 prompt-injection patterns (e.g. "ignore previous instructions", [INST]). Fail-safe: defaults to unsafe if scanning crashes. |
| 4 | URL Validator | tools/url_validator.py | Safety Guardian | Extracts URLs via regex and checks them against a blocklist (bit.ly, tinyurl, data:, javascript:). Max 50 URLs per scan. |
| 5 | Citation Search | tools/citation_search.py | Relevance Researcher | Searches Semantic Scholar (with retry + backoff for rate limits); falls back to OpenAlex if unavailable. Max 3 API calls per run. |
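The Citation Search tool's retry-with-backoff-then-fallback behaviour can be sketched as follows. `fetch_s2` and `fetch_openalex` are hypothetical injected callables standing in for the real Semantic Scholar and OpenAlex HTTP calls; they are assumptions for illustration, not the tool's actual interface.

```python
import time

def search_with_fallback(query, fetch_s2, fetch_openalex,
                         max_retries=3, base_delay=1.0):
    """Try the primary source with exponential backoff, then fall back.

    Sketch only: fetch_s2 / fetch_openalex are injected stand-ins for the
    real Semantic Scholar and OpenAlex API calls.
    """
    for attempt in range(max_retries):
        try:
            return fetch_s2(query)
        except Exception:
            time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
    return fetch_openalex(query)  # secondary source after retries exhausted
```

Injecting the fetch functions keeps the retry logic testable without any network access.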
Tool → Agent Assignment Map
6. Pydantic Schemas (Structured Output)
Every agent is forced to output validated JSON through Pydantic schemas. If an agent's output doesn't match the schema, CrewAI automatically retries with a correction prompt.
| Schema | Key Fields |
|---|---|
| SafetyReport | is_safe, pii_found, injection_detected, malicious_urls, risk_level |
| PaperExtraction | title, authors, abstract, methodology, key_findings, paper_type, extraction_confidence |
| MethodologyCritique | strengths, weaknesses, methodology_score (1-10), reproducibility_score (1-10), bias_risks |
| RelevanceReport | related_papers[], novelty_score (1-10), field_context, gaps_addressed |
| ReviewDraft | summary, strengths_section, weaknesses_section, recommendation (Accept/Revise/Reject) |
| RubricEvaluation | scores{} (15 binary criteria), total_score (0–15), passed (≥ 11) |
| FinalReview | executive_summary, strengths, weaknesses, recommendation, confidence_score, improvement_log |
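A minimal sketch of what one of these schemas might look like, assuming Pydantic v2. Field names follow the table above, but the types, defaults, and constraints are assumptions for illustration, not the project's exact models.py.

```python
from typing import List
from pydantic import BaseModel, Field

class SafetyReport(BaseModel):
    """Sketch of the SafetyReport schema (types/defaults are assumed)."""
    is_safe: bool
    pii_found: List[str] = Field(default_factory=list)
    injection_detected: bool = False
    malicious_urls: List[str] = Field(default_factory=list)
    risk_level: str = "low"

# Validation rejects malformed agent output; a validation failure is what
# triggers the retry-with-correction behaviour described above.
report = SafetyReport.model_validate({"is_safe": True, "risk_level": "low"})
```

A `ValidationError` on malformed output gives the framework a concrete error message to feed back into the correction prompt.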
7. Safety & Guardrails – 5 Layers
Key principle: The Safety Guardian uses zero LLM calls; all safety decisions are deterministic regex/logic. This prevents prompt-injection attacks from manipulating the safety gate itself.
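A minimal sketch of such a deterministic, fail-safe scanner. The pattern list is an illustrative subset, not the project's full set of 9 patterns.

```python
import re

# Illustrative subset of prompt-injection patterns (the project lists 9).
INJECTION_PATTERNS = [
    r"ignore\s+(all\s+)?previous\s+instructions",
    r"\[INST\]",
    r"you\s+are\s+now\s+in\s+developer\s+mode",
]

def is_text_safe(text: str) -> bool:
    """Return True if the text looks safe.

    Purely regex-based: no LLM in the loop, so the gate itself cannot be
    prompt-injected. Fail-safe: any error during scanning counts as unsafe.
    """
    try:
        return not any(re.search(p, text, re.IGNORECASE)
                       for p in INJECTION_PATTERNS)
    except Exception:
        return False  # default to unsafe on failure
```

Defaulting to unsafe on error means a crash in the scanner can never silently wave a document through.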
8. Rubric – 15 Binary Criteria
The Rubric Evaluator scores the review on 15 strict pass/fail criteria (0 or 1 each). A review passes with ≥ 11/15.
| # | Category | Criterion |
|---|---|---|
| 1 | Content | Title & authors correctly identified |
| 2 | Content | Abstract accurately summarized |
| 3 | Content | Methodology clearly described |
| 4 | Content | At least 3 distinct strengths |
| 5 | Content | At least 3 distinct weaknesses |
| 6 | Content | Limitations acknowledged |
| 7 | Content | Related work present (2+ papers) |
| 8 | Depth | Novelty assessed with justification |
| 9 | Depth | Reproducibility discussed |
| 10 | Depth | Evidence quality evaluated |
| 11 | Depth | Contribution to field stated |
| 12 | Quality | Recommendation justified with evidence |
| 13 | Quality | At least 3 actionable questions |
| 14 | Quality | No hallucinated citations |
| 15 | Quality | Professional tone and coherent structure |
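The pass/fail aggregation over the 15 criteria is simple to sketch; the criterion keys below are illustrative placeholders, not the project's actual names.

```python
def evaluate_rubric(scores: dict) -> dict:
    """Aggregate 15 binary criteria into a total and a pass/fail verdict.

    Sketch only: criterion keys are illustrative.
    """
    assert all(v in (0, 1) for v in scores.values()), "criteria are binary"
    total = sum(scores.values())
    return {"total_score": total, "passed": total >= 11}

# Example: 12 criteria pass, 3 fail -> 12/15, which clears the >= 11 bar.
scores = {f"criterion_{i}": (1 if i <= 12 else 0) for i in range(1, 16)}
result = evaluate_rubric(scores)
# result == {"total_score": 12, "passed": True}
```

Any review scoring 10/15 or below is routed to the Enhancer agent for another pass at the failed criteria.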
9. Gradio UI β 6 Tabs
| Tab | What It Shows |
|---|---|
| Executive Summary | Recommendation (Accept/Revise/Reject), confidence, rubric score, paper info + download button |
| Full Review | Strengths, weaknesses, methodology & novelty assessments, author questions |
| Rubric Scorecard | All 15 criteria with pass/fail marks and per-criterion feedback |
| Safety Report | PII findings, injection scan result, URL analysis |
| Agent Outputs | Raw structured JSON output from each of the 7 agents |
| Pipeline Logs | Timestamped execution log + JSON run summary |
10. Tech Stack
| Package | Purpose |
|---|---|
| CrewAI ≥ 0.86.0 | Multi-agent orchestration framework |
| OpenAI ≥ 1.0.0 | LLM API (GPT-4o + GPT-4o-mini) |
| Gradio ≥ 5.0.0 | Web UI |
| pdfplumber ≥ 0.11.0 | PDF text extraction |
| Pydantic ≥ 2.0.0 | Structured output validation |
| python-dotenv ≥ 1.0.0 | .env file loading |
| requests ≥ 2.31.0 | HTTP calls to Semantic Scholar / OpenAlex |
11. Project Structure
Homework5_agentincAI/
├── app.py                       # Main pipeline + Gradio UI (1045 lines)
├── requirements.txt             # Dependencies
├── .env                         # OPENAI_API_KEY
│
├── agents/                      # CrewAI agent definitions
│   ├── paper_extractor.py       # Step 1 – GPT-4o
│   ├── methodology_critic.py    # Step 2a – GPT-4o-mini
│   ├── relevance_researcher.py  # Step 2b – GPT-4o-mini
│   ├── review_synthesizer.py    # Step 3 – GPT-4o-mini
│   ├── rubric_evaluator.py      # Step 4 – GPT-4o-mini
│   └── enhancer.py              # Step 5 – GPT-4o-mini
│
├── tools/                       # Custom tools
│   ├── pdf_parser.py            # PDF → text
│   ├── pii_detector.py          # PII scan & redact
│   ├── injection_scanner.py     # Prompt-injection detection
│   ├── url_validator.py         # URL blocklist check
│   └── citation_search.py       # Semantic Scholar / OpenAlex
│
└── schemas/
    └── models.py                # All 8 Pydantic schemas
12. How to Run
# 1. Install dependencies
pip install -r requirements.txt
# 2. Set your OpenAI API key in .env
echo "OPENAI_API_KEY=your-key-here" > .env
# 3. Launch the app
python app.py
Open http://localhost:7860 → upload a PDF → click "Analyze Paper" → wait 1–3 minutes → review the results across all 6 tabs.
AI Research Paper Analyst – Homework 5, Agentic AI Bootcamp



