# PDF Inspector - Test Plan

## Overview
This test plan outlines verification steps for the PDF Inspector application using the provided example documents. Because all currently included examples are **untagged**, the plan focuses on verifying the untagged-document detection logic, the fallback heuristics (math detection, reading order), and error handling.

## Test Environment
- **URL**: http://127.0.0.1:7860
- **Browsers**: Chrome / Safari / Firefox (Any modern browser)

---

## 1. Test Case: Untagged Document Detection
**Target Document**: `test_document.pdf`

| Step | Action | Expected Result | Pass/Fail |
|------|--------|-----------------|-----------|
| 1.1 | Select `test_document.pdf` from Examples. | File loads into the input box. | |
| 1.2 | Click **Analyze** button. | Analysis completes; "Analysis Results" image appears. | |
| 1.3 | Check Summary Report. | **Alert**: "⚠️ Accessibility Alert: Untagged Document" is visible. | |
| 1.4 | Go to **Advanced Analysis** tab. | Tab opens. | |
| 1.5 | Open **4. Structure Tree Visualizer** and click **Extract**. | **Result**: "## No Structure Tree Found" message. | |

**Success Criteria**: The application correctly identifies the document as untagged and prevents structure-dependent tools from crashing.
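
The check behind steps 1.3 and 1.5 can be pictured with a minimal sketch. It assumes the document catalog has already been parsed into a plain dictionary; the function names and the dictionary shape are hypothetical and are not taken from the PDF Inspector code. `/StructTreeRoot` and `/MarkInfo` are the catalog entries the PDF specification uses for tagged documents. Requiring both is one possible rule, and a looser one (structure tree only) is equally plausible.

```python
def is_tagged(catalog: dict) -> bool:
    """Return True if the catalog declares a structure tree and is marked.

    `catalog` is a plain dict standing in for the PDF document catalog
    (hypothetical shape; a real PDF parser returns its own object types).
    """
    has_tree = catalog.get("/StructTreeRoot") is not None
    mark_info = catalog.get("/MarkInfo") or {}
    marked = bool(mark_info.get("/Marked", False))
    return has_tree and marked


def summary_alert(catalog: dict) -> str:
    """Return the alert text expected in step 1.3, or "" for tagged files."""
    if is_tagged(catalog):
        return ""
    return "⚠️ Accessibility Alert: Untagged Document"


# An empty catalog models the untagged example documents.
print(summary_alert({}))
```

Structure-dependent tools can call a guard like `is_tagged` first and return a message such as "## No Structure Tree Found" instead of attempting to walk a tree that does not exist.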

---

## 2. Test Case: Math & Visual Block Detection
**Target Document**: `18.1 Notes.pdf` (Handwritten/Math Slides)

| Step | Action | Expected Result | Pass/Fail |
|------|--------|-----------------|-----------|
| 2.1 | Select `18.1 Notes.pdf` from Examples. | File loads. | |
| 2.2 | Click **Analyze** button. | Analysis completes (~1-2 seconds). | |
| 2.3 | Inspect "Page overlay" image. | - **Red Boxes**: Detected around text blocks.<br>- **Math Highlight**: Math formulas (e.g., integrals, sums) are enclosed in their own bounding boxes. | |
| 2.4 | Check Summary Report. | **Alert**: "Untagged Document". <br> **Stats**: The "Math-like blocks detected" count is greater than 0. | |

**Success Criteria**: The heuristic regex-based math detection works on the text extracted from the slides.
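
To make the "Math-like blocks detected" statistic concrete, here is a minimal sketch of a regex-based heuristic. The pattern, the function name, and the sample text are illustrative assumptions; the application's actual regex is not shown in this plan and may differ.

```python
import re

# Hypothetical pattern: flags a block if it contains a math symbol,
# a LaTeX-style command, a simple equation, or a caret superscript.
MATH_PATTERN = re.compile(
    r"[∫∑∏√≤≥≠±∞∂]"                # common math symbols
    r"|\\(?:frac|sum|int|sqrt)\b"   # LaTeX-style commands
    r"|\b\w+\s*=\s*[\w(]"           # simple equations such as "y = mx"
    r"|\w\^\w"                      # superscripts written as x^2
)


def count_math_like_blocks(blocks: list[str]) -> int:
    """Count the text blocks that match the math heuristic at least once."""
    return sum(1 for text in blocks if MATH_PATTERN.search(text))


sample_blocks = [
    "Lecture 18.1 overview",
    "∫ f(x) dx from 0 to 1",
    "y = mx + b",
    "Office hours on Tuesday",
]
print(count_math_like_blocks(sample_blocks))  # 2
```

A heuristic like this only sees extracted text, so handwritten formulas that the extractor cannot read as characters would not be counted.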

---

## 3. Test Case: Screen Reader Simulation (Untagged Fallback)
**Target Document**: `logic.pdf` (Academic Text)

| Step | Action | Expected Result | Pass/Fail |
|------|--------|-----------------|-----------|
| 3.1 | Select `logic.pdf`. | File loads. | |
| 3.2 | Click **Analyze**. | Analysis completes. | |
| 3.3 | Go to **Advanced Analysis** -> **2. Screen Reader Simulator**. | Accordion opens. | |
| 3.4 | Set **Reading Order** to "Raw" or "TBLR". | Settings accepted. | |
| 3.5 | Click **Generate Transcript**. | **Result**: Transcript appears in the textbox.<br> **Header**: "⚠️ Simulated from visual order (PDF not tagged)".<br> **Content**: Contains readable text (e.g., "A Logical Interpretation..."). | |

**Success Criteria**: The simulator successfully uses the fallback logic (visual ordering) instead of crashing when no structure tree is present.
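
The visual-order fallback can be sketched as a top-to-bottom, left-to-right ("TBLR") sort. The `Block` type, the `line_tolerance` parameter, and both function names are assumptions made for illustration; coordinates are assumed to have their origin at the top-left with y increasing downward.

```python
from typing import NamedTuple


class Block(NamedTuple):
    text: str
    x0: float  # left edge
    y0: float  # top edge, with y increasing downward (assumed)


def tblr_order(blocks, line_tolerance=5.0):
    """Sort blocks top-to-bottom, then left-to-right within a visual line."""
    ordered = sorted(blocks, key=lambda b: b.y0)
    lines = []
    for block in ordered:
        # Blocks whose top edges are close to the line's first block share a line.
        if lines and abs(block.y0 - lines[-1][0].y0) <= line_tolerance:
            lines[-1].append(block)
        else:
            lines.append([block])
    return [b for line in lines for b in sorted(line, key=lambda b: b.x0)]


def transcript(blocks):
    """Build a transcript with the header expected in step 3.5."""
    header = "⚠️ Simulated from visual order (PDF not tagged)"
    return "\n".join([header] + [b.text for b in tblr_order(blocks)])


two_columns = [
    Block("right col line 1", 300, 101),
    Block("left col line 1", 50, 100),
    Block("left col line 2", 50, 120),
]
print(transcript(two_columns))
```

In this two-column sample the right column's first line is read between the two left-column lines, which is the kind of cross-column reading a purely visual ordering can produce.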

---

## 4. Test Case: Feature Availability Check (Negative Testing)
**Target Document**: Any of the above

| Step | Action | Expected Result | Pass/Fail |
|------|--------|-----------------|-----------|
| 4.1 | Open **5. Block-to-Tag Mapping**. | Accordion opens. | |
| 4.2 | Click **Map Blocks to Tags**. | **Result**: "## No Mappings Found" (because there are no tags). | |
| 4.3 | Open **3. Paragraph Detection** and click **Analyze**. | **Result**: Visual paragraphs are detected (green boxes), but **Semantic <P> Tags** count is 0. | |

**Success Criteria**: Features requiring tags explicitly state that tags are missing rather than showing empty/broken UIs.

---

## 5. Test Case: Landscape / Rotated Documents
- **Why**: Ensure overlays align correctly on rotated pages.
- **Test**:
  - Load a PDF with landscape pages (or 90-degree rotation).
  - Verify that the blue/red bounding boxes align with the text.
  - Verify that the reading order flows logically, starting from the top-left of the *visual* page.
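
For the rotated-page check, overlay alignment depends on mapping each bounding box from unrotated page coordinates into the displayed orientation. The sketch below shows that mapping under stated assumptions: a top-left origin with y increasing downward, clockwise rotation in multiples of 90 degrees (as with the PDF `/Rotate` page entry), and a hypothetical function name. It is not the application's implementation.

```python
def rotate_bbox(bbox, page_width, page_height, rotation):
    """Map (x0, y0, x1, y1) from the unrotated page to the displayed page.

    Assumes a top-left origin, y increasing downward, and a clockwise
    rotation of 0, 90, 180, or 270 degrees.
    """
    x0, y0, x1, y1 = bbox
    rotation %= 360
    if rotation == 0:
        corners = [(x0, y0), (x1, y1)]
    elif rotation == 90:
        corners = [(page_height - y0, x0), (page_height - y1, x1)]
    elif rotation == 180:
        corners = [(page_width - x0, page_height - y0),
                   (page_width - x1, page_height - y1)]
    elif rotation == 270:
        corners = [(y0, page_width - x0), (y1, page_width - x1)]
    else:
        raise ValueError("rotation must be a multiple of 90 degrees")
    xs, ys = zip(*corners)
    # Re-normalise so the result is again (left, top, right, bottom).
    return (min(xs), min(ys), max(xs), max(ys))


# A 200 x 100 page displayed with a 90-degree clockwise rotation.
print(rotate_bbox((10, 20, 50, 40), 200, 100, 90))  # (60, 10, 80, 50)
```

If overlays are drawn with the unrotated coordinates on a rotated render, boxes appear offset or transposed, which is the failure this test is meant to catch.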

## Known Limitations / Expected Behavior
*   **Untagged Alerts**: All examples provided are untagged; the alert is **expected behavior**.
*   **Reading Order**: Without tags, reading order is a guess. Columns might be read left-to-right across the page in "Raw" mode.