# AI Research Paper Analyst: Complete Project Documentation

## Table of Contents

1. [Project Overview](#1-project-overview)
2. [System Architecture Flowchart](#2-system-architecture-flowchart)
3. [Pipeline Flow](#3-pipeline-flow)
4. [Agents](#4-agents)
5. [Tools](#5-tools)
6. [Pydantic Schemas](#6-pydantic-schemas)
7. [Gradio UI](#7-gradio-ui)
8. [Safety & Guardrails](#8-safety--guardrails)
9. [Tech Stack & Dependencies](#9-tech-stack--dependencies)
10. [Project Structure](#10-project-structure)
11. [How to Run](#11-how-to-run)

---

## 1. Project Overview

**AI Research Paper Analyst** is an automated peer-review system powered by a multi-agent AI pipeline. A user uploads a research paper (PDF), and the system produces a comprehensive, publication-ready peer review, including methodology critique, novelty assessment, rubric scoring, and a final Accept/Revise/Reject recommendation.

| Property | Value |
|---|---|
| **Framework** | CrewAI (multi-agent orchestration) |
| **LLM Backend** | OpenAI GPT-4o (extraction) + GPT-4o-mini (all other agents) |
| **Frontend** | Gradio 5.x |
| **Safety** | Programmatic (regex/logic-based); no LLM in the safety gate |
| **Output Format** | Structured JSON (Pydantic) rendered as Markdown |

---

## 2. System Architecture Flowchart

```mermaid
flowchart TD
    A["User Uploads PDF via Gradio UI"] --> B["File Validation"]
    B -->|Invalid| B_ERR["Return Error to UI"]
    B -->|Valid .pdf| C["GATE 1: Safety Guardian (Programmatic)"]

    subgraph SAFETY_GATE["Safety Gate: No LLM"]
        C --> C1["PDF Parser Tool: Extract raw text"]
        C1 --> C2["PII Detector Tool: Scan & redact PII"]
        C2 --> C3["Injection Scanner Tool: Check for prompt injections"]
        C3 --> C4["URL Validator Tool: Flag malicious URLs"]
        C4 --> C5{"is_safe?"}
    end

    C5 -->|UNSAFE| BLOCK["Block Document: Show Safety Report"]
    C5 -->|SAFE| D["Sanitized Text passed to Analysis Pipeline"]

    subgraph ANALYSIS_PIPELINE["Analysis Pipeline: CrewAI Sequential"]
        D --> E["STEP 1: Paper Extractor Agent (GPT-4o)"]
        E -->|PaperExtraction JSON| F["STEP 2a: Methodology Critic Agent (GPT-4o-mini)"]
        E -->|PaperExtraction JSON| G["STEP 2b: Relevance Researcher Agent (GPT-4o-mini)"]
        F -->|MethodologyCritique JSON| H["STEP 3: Review Synthesizer Agent (GPT-4o-mini)"]
        G -->|RelevanceReport JSON| H
        E -->|PaperExtraction JSON| H
        H -->|ReviewDraft JSON| I["STEP 4: Rubric Evaluator Agent (GPT-4o-mini)"]
        I -->|RubricEvaluation JSON| J["STEP 5: Enhancer Agent (GPT-4o-mini)"]
        H -->|ReviewDraft JSON| J
        E -->|PaperExtraction JSON| J
    end

    J -->|FinalReview JSON| K["Output Formatting"]

    subgraph OUTPUT["Gradio UI: 6 Tabs"]
        K --> K1["Executive Summary Tab"]
        K --> K2["Full Review Tab"]
        K --> K3["Rubric Scorecard Tab"]
        K --> K4["Safety Report Tab"]
        K --> K5["Agent Outputs Tab"]
        K --> K6["Pipeline Logs Tab"]
    end

    K2 --> DL["Download Full Report (.md)"]
```

### Simplified Agent Pipeline Flow

```mermaid
flowchart LR
    PDF["PDF Upload"] --> SG["Safety\nGuardian"]
    SG --> PE["Paper\nExtractor"]
    PE --> MC["Methodology\nCritic"]
    PE --> RR["Relevance\nResearcher"]
    MC --> RS["Review\nSynthesizer"]
    RR --> RS
    RS --> RE["Rubric\nEvaluator"]
    RE --> EN["Enhancer"]
    EN --> OUT["Final\nReport"]

    style SG fill:#ff6b6b,stroke:#c0392b,color:#fff
    style PE fill:#74b9ff,stroke:#2980b9,color:#fff
    style MC fill:#a29bfe,stroke:#6c5ce7,color:#fff
    style RR fill:#a29bfe,stroke:#6c5ce7,color:#fff
    style RS fill:#55efc4,stroke:#00b894,color:#fff
    style RE fill:#ffeaa7,stroke:#fdcb6e,color:#333
    style EN fill:#fd79a8,stroke:#e84393,color:#fff
```

### Data Flow Diagram

```mermaid
flowchart TD
    subgraph TOOLS["Tools Layer"]
        T1["pdf_parser_tool"]
        T2["pii_detector_tool"]
        T3["injection_scanner_tool"]
        T4["url_validator_tool"]
        T5["citation_search_tool"]
    end

    subgraph AGENTS["Agent Layer"]
        A1["Safety Guardian\n(Programmatic)"]
        A2["Paper Extractor\n(GPT-4o)"]
        A3["Methodology Critic\n(GPT-4o-mini)"]
        A4["Relevance Researcher\n(GPT-4o-mini)"]
        A5["Review Synthesizer\n(GPT-4o-mini)"]
        A6["Rubric Evaluator\n(GPT-4o-mini)"]
        A7["Enhancer\n(GPT-4o-mini)"]
    end

    subgraph SCHEMAS["Schema Layer (Pydantic)"]
        S1["SafetyReport"]
        S2["PaperExtraction"]
        S3["MethodologyCritique"]
        S4["RelevanceReport"]
        S5["ReviewDraft"]
        S6["RubricEvaluation"]
        S7["FinalReview"]
    end

    A1 -.->|uses| T1 & T2 & T3 & T4
    A2 -.->|uses| T1
    A4 -.->|uses| T5

    A1 -->|outputs| S1
    A2 -->|outputs| S2
    A3 -->|outputs| S3
    A4 -->|outputs| S4
    A5 -->|outputs| S5
    A6 -->|outputs| S6
    A7 -->|outputs| S7
```

---

## 3. Pipeline Flow

The system runs as a **sequential pipeline** with one safety gate and six analysis steps:

| Stage | Agent | LLM | Input | Output Schema | Tools Used |
|---|---|---|---|---|---|
| **Gate 1** | Safety Guardian | None (programmatic) | Raw PDF file | `SafetyReport` | pdf_parser, pii_detector, injection_scanner, url_validator |
| **Step 1** | Paper Extractor | GPT-4o | Sanitized text | `PaperExtraction` | pdf_parser |
| **Step 2a** | Methodology Critic | GPT-4o-mini | PaperExtraction JSON | `MethodologyCritique` | None |
| **Step 2b** | Relevance Researcher | GPT-4o-mini | PaperExtraction JSON | `RelevanceReport` | citation_search |
| **Step 3** | Review Synthesizer | GPT-4o-mini | Paper + Critique + Research | `ReviewDraft` | None |
| **Step 4** | Rubric Evaluator | GPT-4o-mini | Draft + Paper + Critique + Research | `RubricEvaluation` | None |
| **Step 5** | Enhancer | GPT-4o-mini | Draft + Rubric + Paper | `FinalReview` | None |

### Pipeline Error Handling

- Each agent step is wrapped in `try/except`, so a failure in one agent does not crash the pipeline.
- If an agent fails, its output defaults to `{"error": "..."}` and downstream agents work with available data.
- The Safety Gate blocks the entire pipeline if `is_safe=False` (prompt injection or malicious URLs detected).
- PII is always redacted before analysis, even for "safe" documents.
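
The per-agent error handling described above can be sketched roughly as follows. This is a minimal illustration, not the code in `app.py`: the helper name `run_step` and the exact error message format are assumptions made for the example.

```python
from typing import Any, Callable


def run_step(name: str, step: Callable[[], dict[str, Any]]) -> dict[str, Any]:
    """Run one agent step; on failure, return an error payload instead of raising.

    `run_step` is a hypothetical helper used only for this sketch.
    """
    try:
        return step()
    except Exception as exc:  # deliberately broad: one agent must not crash the run
        return {"error": f"{name} failed: {exc}"}


def failing_critic() -> dict[str, Any]:
    # Stand-in for an agent call that blows up (for example, an LLM timeout).
    raise RuntimeError("LLM timeout")


results = {
    "extraction": run_step("paper_extractor", lambda: {"title": "Example"}),
    "critique": run_step("methodology_critic", failing_critic),
}
```

Downstream steps can then check for an `"error"` key and continue with whatever data is available.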

---

## 4. Agents

### Agent 1: Safety Guardian (Programmatic)

| Property | Value |
|---|---|
| **File** | `app.py`, function `run_safety_check()` |
| **LLM** | None (fully programmatic) |
| **Purpose** | Gate that blocks unsafe documents before LLM analysis |
| **Tools** | pdf_parser, pii_detector, injection_scanner, url_validator |
| **Output** | `SafetyReport` |

Runs all 4 safety tools as Python functions directly (no CrewAI agent overhead). This is deterministic, fast (<1 second), and avoids LLM hallucinations in safety-critical decisions.

**Decision Logic:**
- `is_safe = (not injection_detected) AND (no malicious URLs)`
- Risk level: `high` if injection or malicious URLs, `medium` if PII found, `low` otherwise
- If `is_safe=False`, the pipeline is blocked and the user sees the Safety Report
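
The decision logic above is small enough to write out directly. The sketch below follows the three rules as stated; the function name `assess_risk` and its argument layout are assumptions for illustration and may not match `run_safety_check()`.

```python
def assess_risk(
    injection_detected: bool,
    malicious_urls: list[str],
    pii_found: list[str],
) -> tuple[bool, str]:
    """Return (is_safe, risk_level) following the rules listed above."""
    # Safe only when there is no injection and no malicious URL.
    is_safe = (not injection_detected) and (not malicious_urls)
    if not is_safe:
        risk_level = "high"
    elif pii_found:
        # PII alone does not block the document; it is redacted instead.
        risk_level = "medium"
    else:
        risk_level = "low"
    return is_safe, risk_level
```

Note that PII raises the risk level to `medium` without making the document unsafe, which matches the rule that PII is redacted rather than blocked.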

---

### Agent 2: Paper Extractor

| Property | Value |
|---|---|
| **File** | `agents/paper_extractor.py` |
| **LLM** | GPT-4o (temperature=0.1, seed=42) |
| **Role** | Research Paper Data Extractor |
| **Tools** | pdf_parser_tool |
| **Output** | `PaperExtraction` |
| **Max Iterations** | 3 |

Extracts structured metadata from the raw paper text: title, authors, abstract, methodology, key findings, contributions, limitations, references count, paper type, and extraction confidence level.

Uses GPT-4o (not mini) because extraction requires deep comprehension of the full paper.

---

### Agent 3: Methodology Critic

| Property | Value |
|---|---|
| **File** | `agents/methodology_critic.py` |
| **LLM** | GPT-4o-mini (temperature=0.1, seed=42) |
| **Role** | Research Methodology Evaluator |
| **Tools** | None (pure LLM reasoning) |
| **Output** | `MethodologyCritique` |
| **Max Iterations** | 5 |

Critically evaluates study design, statistical methods, sample sizes, reproducibility, and logical consistency. For theoretical papers, adapts criteria to assess logical rigor and proof completeness. Produces scores for methodology (1-10) and reproducibility (1-10).

---

### Agent 4: Relevance Researcher

| Property | Value |
|---|---|
| **File** | `agents/relevance_researcher.py` |
| **LLM** | GPT-4o-mini (temperature=0.1, seed=42) |
| **Role** | Related Work Analyst |
| **Tools** | citation_search_tool |
| **Output** | `RelevanceReport` |
| **Max Iterations** | 5 |

Searches for real related papers using Semantic Scholar / OpenAlex APIs. Assesses novelty by comparing against existing work. Produces a novelty score (1-10), field context, gaps addressed, and overlaps.

**Critical Rule:** Must NOT hallucinate citations. Only uses papers found by the search tool.

---

### Agent 5: Review Synthesizer

| Property | Value |
|---|---|
| **File** | `agents/review_synthesizer.py` |
| **LLM** | GPT-4o-mini (temperature=0.1, seed=42, max_tokens=4000) |
| **Role** | Peer Review Report Writer |
| **Tools** | None (synthesis only) |
| **Output** | `ReviewDraft` |
| **Max Iterations** | 3 |

Combines insights from Paper Extractor, Methodology Critic, and Relevance Researcher into a coherent peer-review draft with summary, strengths, weaknesses, assessments, recommendation (Accept/Revise/Reject), and questions for authors.

---

### Agent 6: Rubric Evaluator

| Property | Value |
|---|---|
| **File** | `agents/rubric_evaluator.py` |
| **LLM** | GPT-4o-mini (temperature=0.1, seed=42) |
| **Role** | Objective Quality Scorer |
| **Tools** | None (evaluation logic only) |
| **Output** | `RubricEvaluation` |
| **Max Iterations** | 3 |

Scores the review draft on **15 strict binary criteria** (0 or 1 each). Pass threshold: >= 11/15.

**15 Rubric Criteria:**

| # | Category | Criterion |
|---|---|---|
| 1 | Content Completeness | Title & authors correctly identified |
| 2 | Content Completeness | Abstract accurately summarized |
| 3 | Content Completeness | Methodology clearly described |
| 4 | Content Completeness | At least 3 distinct strengths |
| 5 | Content Completeness | At least 3 distinct weaknesses |
| 6 | Content Completeness | Limitations acknowledged |
| 7 | Content Completeness | Related work present (2+ papers) |
| 8 | Analytical Depth | Novelty assessed with justification |
| 9 | Analytical Depth | Reproducibility discussed |
| 10 | Analytical Depth | Evidence quality evaluated |
| 11 | Analytical Depth | Contribution to field stated |
| 12 | Review Quality | Recommendation justified with evidence |
| 13 | Review Quality | At least 3 actionable questions |
| 14 | Review Quality | No hallucinated citations |
| 15 | Review Quality | Professional tone and coherent structure |
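
As an illustration of how binary scores map to the pass decision, the sketch below totals 15 scores and applies the 11/15 threshold. In the pipeline the scores come from the LLM agent; the function name `summarize_rubric` and the criterion keys are hypothetical.

```python
PASS_THRESHOLD = 11  # from the text: pass at >= 11 of 15
NUM_CRITERIA = 15


def summarize_rubric(scores: dict[str, int]) -> dict:
    """Total a set of binary rubric scores and apply the pass threshold."""
    if len(scores) != NUM_CRITERIA:
        raise ValueError(f"expected {NUM_CRITERIA} criteria, got {len(scores)}")
    if any(value not in (0, 1) for value in scores.values()):
        raise ValueError("each criterion must score 0 or 1")
    total = sum(scores.values())
    return {
        "total_score": total,
        "failed_criteria": [name for name, value in scores.items() if value == 0],
        "passed": total >= PASS_THRESHOLD,
    }


# Hypothetical criterion keys; the real keys are defined by the agent prompt.
example_scores = {f"criterion_{i}": 1 for i in range(1, 16)}
example_scores["criterion_14"] = 0
summary = summarize_rubric(example_scores)
```

The `failed_criteria` list is what the Enhancer would need in order to target the criteria that scored 0.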

---

### Agent 7: Enhancer

| Property | Value |
|---|---|
| **File** | `agents/enhancer.py` |
| **LLM** | GPT-4o-mini (temperature=0.1, seed=42) |
| **Role** | Review Report Enhancer |
| **Tools** | None (writing/synthesis only) |
| **Output** | `FinalReview` |
| **Max Iterations** | 3 |

Takes the draft review + rubric feedback and produces a **complete, publication-ready** peer review report (800-1500 words). Fixes all rubric criteria that scored 0 while keeping content that passed. Produces the final executive summary, recommendation, confidence score, and improvement log.

---

## 5. Tools

### Tool 1: PDF Parser (`tools/pdf_parser.py`)

| Property | Value |
|---|---|
| **Library** | pdfplumber |
| **Assigned To** | Safety Guardian, Paper Extractor |
| **Input** | File path (string) |
| **Output** | Extracted text (string) or `"ERROR: ..."` |

**Guardrails:**
- File must be `.pdf`
- File must exist on disk
- File size max: 20 MB
- Minimum extractable text: 100 chars
- Never raises exceptions; returns error strings instead
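
A rough sketch of these guardrails, leaving out the `pdfplumber` extraction itself. The function names, the error wording, and the reading of "20 MB" as 20 × 1024 × 1024 bytes are assumptions; only the checks themselves come from the list above.

```python
import os
from typing import Optional

MAX_BYTES = 20 * 1024 * 1024  # assumed interpretation of the 20 MB limit
MIN_CHARS = 100


def validate_pdf_path(path: str) -> Optional[str]:
    """Return an "ERROR: ..." string if a file guardrail fails, else None."""
    if not path.lower().endswith(".pdf"):
        return "ERROR: file must be a .pdf"
    if not os.path.isfile(path):
        return "ERROR: file not found"
    if os.path.getsize(path) > MAX_BYTES:
        return "ERROR: file exceeds 20 MB"
    return None


def validate_extracted_text(text: str) -> Optional[str]:
    """Return an "ERROR: ..." string if too little text was extracted."""
    if len(text.strip()) < MIN_CHARS:
        return "ERROR: fewer than 100 characters of extractable text"
    return None
```

Returning strings rather than raising keeps the tool safe to call from an agent loop, where an unhandled exception would abort the step.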

---

### Tool 2: PII Detector (`tools/pii_detector.py`)

| Property | Value |
|---|---|
| **Approach** | Regex pattern matching |
| **Assigned To** | Safety Guardian |
| **Input** | Text to scan |
| **Output** | JSON with `findings`, `redacted_text`, `pii_count` |

**Patterns Detected:**
- Email addresses
- Phone numbers (US format)
- Social Security Numbers
- Credit card numbers

All matches are replaced with `[REDACTED_TYPE]` tokens.
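
The regex-based redaction can be sketched as below. The patterns are simplified stand-ins written for this example, not the ones in `tools/pii_detector.py`, and the shape of `findings` (a label-to-count mapping) is an assumption.

```python
import re

# Illustrative patterns only. Order matters: longer digit patterns run first
# so that a credit card number is not partially matched as a phone number.
PII_PATTERNS = [
    ("CREDIT_CARD", re.compile(r"\b(?:\d{4}[- ]?){3}\d{4}\b")),
    ("SSN", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    ("PHONE", re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")),
    ("EMAIL", re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")),
]


def redact_pii(text: str) -> dict:
    """Replace each PII match with a [REDACTED_TYPE] token and count the hits."""
    findings: dict[str, int] = {}
    for label, pattern in PII_PATTERNS:
        text, count = pattern.subn(f"[REDACTED_{label}]", text)
        if count:
            findings[label] = count
    return {
        "findings": findings,
        "redacted_text": text,
        "pii_count": sum(findings.values()),
    }


result = redact_pii("Contact jane.doe@example.org or 555-123-4567. SSN 123-45-6789.")
```

Regex matching of this kind is deterministic but coarse: it can miss unusual formats and can flag look-alike numbers in tables, which is a reasonable trade-off for a pre-LLM redaction pass.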

---

### Tool 3: Prompt Injection Scanner (`tools/injection_scanner.py`)

| Property | Value |
|---|---|
| **Approach** | Regex pattern matching (9 patterns) |
| **Assigned To** | Safety Guardian |
| **Input** | Text to scan |
| **Output** | JSON with `is_safe`, `suspicious_patterns`, `patterns_checked` |

**Patterns Checked:**
- "ignore previous instructions"
- "disregard above/previous"
- "forget everything/all/your instructions"
- "new instructions:"
- `[INST]` token
- `<|im_start|>` token
- `<|system|>` token
- "override safety"
- "jailbreak"

**Fail-safe:** If scanning itself fails, the document is treated as **unsafe**.
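
A minimal sketch of the scanner with its fail-safe behaviour. The nine regular expressions are one plausible encoding of the phrases listed above and are assumptions; the actual expressions in `tools/injection_scanner.py` may differ.

```python
import re

# One regex per listed phrase or token (assumed encodings).
INJECTION_PATTERNS = [
    r"ignore\s+previous\s+instructions",
    r"disregard\s+(?:above|previous)",
    r"forget\s+(?:everything|all|your\s+instructions)",
    r"new\s+instructions\s*:",
    r"\[INST\]",
    r"<\|im_start\|>",
    r"<\|system\|>",
    r"override\s+safety",
    r"jailbreak",
]


def scan_for_injection(text: str) -> dict:
    """Check text against every pattern; treat any scanner failure as unsafe."""
    try:
        hits = [p for p in INJECTION_PATTERNS if re.search(p, text, re.IGNORECASE)]
        return {
            "is_safe": not hits,
            "suspicious_patterns": hits,
            "patterns_checked": len(INJECTION_PATTERNS),
        }
    except Exception:
        # Fail-safe: if scanning itself breaks, report the document as unsafe.
        return {"is_safe": False, "suspicious_patterns": ["scan_error"], "patterns_checked": 0}
```

The `except` branch is what makes the tool fail closed: a bad input type or a regex error yields `is_safe=False` rather than letting the document through unchecked.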

---

### Tool 4: URL Validator (`tools/url_validator.py`)

| Property | Value |
|---|---|
| **Approach** | Regex extraction + blocklist matching |
| **Assigned To** | Safety Guardian |
| **Input** | Text to scan |
| **Output** | JSON with `total_urls`, `malicious_urls`, `is_safe` |

**Suspicious Indicators:**
- URL shorteners: bit.ly, tinyurl, t.co, goo.gl
- Dangerous protocols: data:, javascript:, file://
- Keywords: malware, phishing

Max 50 URLs checked per scan.

---

### Tool 5: Citation Search (`tools/citation_search.py`)

| Property | Value |
|---|---|
| **Primary API** | Semantic Scholar (with retry for HTTP 429) |
| **Fallback API** | OpenAlex (free, with more generous rate limits) |
| **Assigned To** | Relevance Researcher |
| **Input** | Search query (string, max 200 chars) |
| **Output** | Formatted text list of papers with title, authors, year, citations, abstract |

**Rate Limiting:**
- Max 3 API calls per analysis run (tracked globally)
- 10-second timeout per API call
- Exponential backoff for rate limits (1s, 2s, 4s)

**Fallback Chain:** Semantic Scholar -> OpenAlex -> "Search unavailable" message
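
The retry and fallback behaviour can be sketched without any network code by passing the two backends in as callables. Everything named here (`RateLimited`, `search_with_fallback`, the reading of the 1s/2s/4s delays as waits between retries) is an assumption for illustration; the global three-call budget and the 10-second timeout are left out.

```python
import time
from typing import Callable, Sequence


class RateLimited(Exception):
    """Hypothetical signal that a backend answered with HTTP 429."""


def search_with_fallback(
    query: str,
    primary: Callable[[str], str],
    fallback: Callable[[str], str],
    delays: Sequence[float] = (1.0, 2.0, 4.0),
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Try the primary backend with backoff on rate limits, then the fallback."""
    query = query[:200]  # enforce the documented query length limit
    for attempt in range(len(delays) + 1):
        try:
            return primary(query)
        except RateLimited:
            if attempt < len(delays):
                sleep(delays[attempt])  # wait 1s, 2s, 4s between retries
        except Exception:
            break  # any other failure: stop retrying the primary backend
    try:
        return fallback(query)
    except Exception:
        return "Search unavailable"


def always_limited(query: str) -> str:
    raise RateLimited()


waits: list[float] = []  # record the delays instead of actually sleeping
answer = search_with_fallback(
    "transformers", always_limited, lambda q: f"openalex: {q}", sleep=waits.append
)
```

Injecting `sleep` keeps the example fast to run and makes the backoff schedule easy to inspect in tests.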

---

## 6. Pydantic Schemas

All schemas inherit from `BaseAgentOutput`, which sets `extra="ignore"` for Gradio compatibility, so unexpected fields in an agent's output are dropped rather than causing a validation error.

**File:** `schemas/models.py`

### SafetyReport
```
is_safe: bool (default=False, fail-safe)
pii_found: list[str]
injection_detected: bool
malicious_urls: list[str]
sanitized_text: str
risk_level: "low" | "medium" | "high"
```

### PaperExtraction
```
title: str
authors: list[str]
abstract: str
methodology: str
key_findings: list[str]
contributions: list[str]
limitations_stated: list[str]
references_count: int
paper_type: "empirical" | "theoretical" | "survey" | "system" | "mixed"
extraction_confidence: "high" | "medium" | "low"
```

### MethodologyCritique
```
strengths: list[str]
weaknesses: list[str]
limitations: list[str]
methodology_score: int (1-10)
reproducibility_score: int (1-10)
suggestions: list[str]
bias_risks: list[str]
```

### RelevanceReport
```
related_papers: list[RelatedPaper]
novelty_score: int (1-10)
field_context: str
gaps_addressed: list[str]
overlaps_with_existing: list[str]
```

### RelatedPaper
```
title: str
authors: str
year: int
citation_count: int
relevance: str
```

### ReviewDraft
```
summary: str
strengths_section: str
weaknesses_section: str
methodology_assessment: str
novelty_assessment: str
related_work_context: str
questions_for_authors: list[str]
recommendation: "Accept" | "Revise" | "Reject"
confidence: int (1-5)
detailed_review: str
```

### RubricEvaluation
```
scores: dict[str, int]        (15 criteria, each 0 or 1)
total_score: int              (0-15)
failed_criteria: list[str]
feedback_per_criterion: dict[str, str]
passed: bool                  (True if total_score >= 11)
```

### FinalReview
```
executive_summary: str
paper_metadata: dict
strengths: list[str]
weaknesses: list[str]
methodology_assessment: str
novelty_assessment: str
related_work_context: str
questions_for_authors: list[str]
recommendation: "Accept" | "Revise" | "Reject"
confidence_score: int (1-5)
rubric_scores: dict[str, int]
rubric_total: int
improvement_log: list[str]
```

---

## 7. Gradio UI

The UI is a single-page Gradio Blocks application with **6 tabs**:

| Tab | Content | Component |
|---|---|---|
| Executive Summary | Recommendation, confidence, rubric score, paper info | Markdown + Download button |
| Full Review | Strengths, weaknesses, methodology, novelty, questions | Markdown |
| Rubric Scorecard | 15 criteria scores in 3 categories with feedback | Markdown (table) |
| Safety Report | PII findings, injection status, URL analysis | Markdown |
| Agent Outputs | Raw structured output from each of the 7 agents | Markdown |
| Pipeline Logs | Timestamped execution log + JSON summary | Textbox + Code |

### UI Features
- **Progress bar** with real-time status updates (e.g., "Agent 3/6: Searching Related Work...")
- **Download button** to export the full review as a `.md` file
- **File validation** that only accepts `.pdf` files

---

## 8. Safety & Guardrails

### Layered Safety Architecture

```mermaid
flowchart TD
    subgraph LAYER1["Layer 1: Input Validation"]
        IV1["File type check (.pdf only)"]
        IV2["File size check (max 20MB)"]
        IV3["Minimum text check (100+ chars)"]
    end

    subgraph LAYER2["Layer 2: Content Safety"]
        CS1["PII Detection & Redaction"]
        CS2["Prompt Injection Scanning"]
        CS3["URL Blocklist Validation"]
    end

    subgraph LAYER3["Layer 3: LLM Configuration"]
        LC1["Low temperature (0.1)"]
        LC2["Deterministic seed (42)"]
        LC3["Max iterations per agent"]
        LC4["Structured output (Pydantic)"]
    end

    subgraph LAYER4["Layer 4: Pipeline Resilience"]
        PR1["Per-agent try/except"]
        PR2["Graceful degradation"]
        PR3["API rate limiting (3 calls max)"]
        PR4["Timeout enforcement (10s)"]
    end

    subgraph LAYER5["Layer 5: Observability"]
        OB1["PipelineLogger: every step logged"]
        OB2["API key redaction in logs"]
        OB3["Execution summary with timing"]
    end

    LAYER1 --> LAYER2 --> LAYER3 --> LAYER4 --> LAYER5
```

### Key Principles
- **Fail-safe defaults**: `is_safe=False`, risk defaults to unsafe
- **No LLM in the safety gate**: All safety checks are deterministic regex/logic
- **PII always redacted**: Even for safe documents, PII is stripped before LLM analysis
- **Structured outputs**: Every agent uses Pydantic schemas enforced by CrewAI
- **No secrets in logs**: API keys are regex-redacted from all log output
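
As an illustration of the last principle, log messages can be passed through a small redaction function before they are written. The key pattern below (an `sk-` prefix followed by at least 20 URL-safe characters) and the replacement token are assumptions; the regex used by the project's `PipelineLogger` is not shown in this document.

```python
import re

# Assumed key shape for illustration: "sk-" plus 20 or more URL-safe characters.
API_KEY_RE = re.compile(r"sk-[A-Za-z0-9_-]{20,}")


def redact_secrets(message: str) -> str:
    """Replace anything that looks like an API key before the message is logged."""
    return API_KEY_RE.sub("[REDACTED_API_KEY]", message)


safe_line = redact_secrets("Calling OpenAI with key=sk-" + "a" * 24 + " (model=gpt-4o)")
```

Applying the redaction at the logging boundary means individual agents and tools do not each have to remember to scrub their own messages.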

---

## 9. Tech Stack & Dependencies

| Package | Version | Purpose |
|---|---|---|
| `crewai` | >= 0.86.0 | Multi-agent orchestration framework |
| `crewai-tools` | >= 0.17.0 | Tool wrapper utilities |
| `openai` | >= 1.0.0 | LLM API client (GPT-4o, GPT-4o-mini) |
| `pdfplumber` | >= 0.11.0 | PDF text extraction |
| `pydantic` | >= 2.0.0 | Structured output validation |
| `gradio` | >= 5.0.0 | Web UI framework |
| `python-dotenv` | >= 1.0.0 | Environment variable loading |
| `requests` | >= 2.31.0 | HTTP client for citation APIs |

### Environment Variables

| Variable | Required | Purpose |
|---|---|---|
| `OPENAI_API_KEY` | Yes | OpenAI API access (GPT-4o required) |

---

## 10. Project Structure

```
Homework5_agentincAI/
|-- app.py                          # Main application (pipeline + Gradio UI)
|-- requirements.txt                # Python dependencies
|-- README.md                       # HuggingFace Space metadata
|-- .env                            # Environment variables (API keys)
|-- .gitignore
|
|-- agents/                         # CrewAI agent definitions
|   |-- __init__.py
|   |-- paper_extractor.py          # Agent 2: Structured data extraction
|   |-- methodology_critic.py       # Agent 3: Methodology evaluation
|   |-- relevance_researcher.py     # Agent 4: Related work search
|   |-- review_synthesizer.py       # Agent 5: Draft review writer
|   |-- rubric_evaluator.py         # Agent 6: 15-criteria quality scorer
|   |-- enhancer.py                 # Agent 7: Final report polisher
|
|-- tools/                          # CrewAI tool definitions
|   |-- __init__.py
|   |-- pdf_parser.py               # PDF text extraction
|   |-- pii_detector.py             # PII detection & redaction
|   |-- injection_scanner.py        # Prompt injection detection
|   |-- url_validator.py            # URL blocklist validation
|   |-- citation_search.py          # Semantic Scholar / OpenAlex search
|
|-- schemas/                        # Pydantic output models
|   |-- __init__.py
|   |-- models.py                   # All 8 schema definitions
|
|-- test_components.py              # Component tests
|-- tests/                          # Test directory
```

---

## 11. How to Run

### Prerequisites
- Python 3.10+
- OpenAI API key with GPT-4o access

### Setup

```bash
# 1. Install dependencies
pip install -r requirements.txt

# 2. Create .env file
echo "OPENAI_API_KEY=your-key-here" > .env

# 3. Run the application
python app.py
```

The Gradio server listens on `0.0.0.0:7860`. On the same machine, open `http://localhost:7860` in a browser.

### Usage
1. Open the UI in your browser
2. Upload a research paper PDF (max 20 MB)
3. Click "Analyze Paper"
4. Wait 1-3 minutes for the pipeline to complete
5. Review results across all 6 tabs
6. Download the full report as Markdown

---

*Generated for AI Research Paper Analyst (Homework 5, Agentic AI Bootcamp)*