# PDF Inspector - Test Plan
## Overview
This test plan outlines verification steps for the PDF Inspector application using the provided example documents. Since all currently included examples are **untagged** documents, this plan focuses on verifying the "Untagged" detection logic, the fallback heuristics (math detection, reading order), and error handling.
## Test Environment
- **URL**: http://127.0.0.1:7860
- **Browsers**: Chrome / Safari / Firefox (Any modern browser)
---
## 1. Test Case: Untagged Document Detection
**Target Document**: `test_document.pdf`
| Step | Action | Expected Result | Pass/Fail |
|------|--------|-----------------|-----------|
| 1.1 | Select `test_document.pdf` from Examples. | File loads into the input box. | |
| 1.2 | Click **Analyze** button. | Analysis completes; "Analysis Results" image appears. | |
| 1.3 | Check Summary Report. | **Alert**: "⚠️ Accessibility Alert: Untagged Document" is visible. | |
| 1.4 | Go to **Advanced Analysis** tab. | Tab opens. | |
| 1.5 | Open **4. Structure Tree Visualizer** and click **Extract**. | **Result**: "## No Structure Tree Found" message. | |
**Success Criteria**: The application correctly identifies the document as untagged and prevents structure-dependent tools from crashing.
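
As a rough illustration of what "untagged" means for this test case, the sketch below scans the raw file bytes for a `/StructTreeRoot` entry, which a tagged PDF carries in its catalog. The function names `looks_tagged` and `summary_alert` are hypothetical and are not taken from the application; the byte scan is a deliberately crude stand-in that assumes the catalog is stored uncompressed.

```python
def looks_tagged(pdf_bytes: bytes) -> bool:
    """Crude check: does the raw file mention a structure tree root?

    Assumption: the catalog is stored uncompressed. A PDF that keeps its
    catalog inside a compressed object stream would be misreported, so a
    real implementation should parse the catalog with a PDF library.
    """
    return b"/StructTreeRoot" in pdf_bytes


def summary_alert(pdf_bytes: bytes) -> str:
    """Return the alert text expected in step 1.3, or an empty string."""
    if looks_tagged(pdf_bytes):
        return ""
    return "⚠️ Accessibility Alert: Untagged Document"


# Synthetic catalog fragments standing in for real files.
untagged_sample = b"<< /Type /Catalog /Pages 1 0 R >>"
tagged_sample = b"<< /Type /Catalog /Pages 1 0 R /StructTreeRoot 5 0 R >>"
print(summary_alert(untagged_sample))
print(repr(summary_alert(tagged_sample)))
```

To try it against a real example, read the file in binary mode (for instance `open("test_document.pdf", "rb").read()`) and pass the bytes to `summary_alert`.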
---
## 2. Test Case: Math & Visual Block Detection
**Target Document**: `18.1 Notes.pdf` (Handwritten/Math Slides)
| Step | Action | Expected Result | Pass/Fail |
|------|--------|-----------------|-----------|
| 2.1 | Select `18.1 Notes.pdf` from Examples. | File loads. | |
| 2.2 | Click **Analyze** button. | Analysis completes (~1-2 seconds). | |
| 2.3 | Inspect "Page overlay" image. | - **Red Boxes**: Detected around text blocks.<br>- **Math Highlight**: Math formulas (e.g., integrals, sums) should have their own bounding boxes. | |
| 2.4 | Check Summary Report. | **Alert**: "Untagged Document". <br> **Stats**: Should show > 0 "Math-like blocks detected". | |
**Success Criteria**: The heuristic regex-based math detection works on the text extracted from the slides.
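
To make the success criterion concrete, here is a minimal sketch of regex-based math detection over extracted text blocks. The pattern `MATH_PATTERN` and the helper `count_math_like` are assumptions made for illustration; the application's actual regex is not shown in this plan and may differ.

```python
import re

# Hypothetical pattern (an assumption, not the application's real regex):
# Unicode math symbols, LaTeX-style commands, single-letter assignments
# such as "x = 2", or exponents such as "x^2".
MATH_PATTERN = re.compile(
    r"[∫∑∏√≤≥≠±∞∂]"
    r"|\\(?:int|sum|frac|sqrt)\b"
    r"|\b[A-Za-z]\s*=\s*[-+]?[\dA-Za-z(]"
    r"|[A-Za-z0-9)]\^[\dA-Za-z({]"
)


def count_math_like(blocks):
    """Count the text blocks that contain at least one math-like match."""
    return sum(1 for text in blocks if MATH_PATTERN.search(text))


sample_blocks = [
    "∫ f(x) dx from 0 to 1",
    "Chapter 18 notes",
    "y = x^2 + 1",
    "See the next slide",
]
print(count_math_like(sample_blocks))
```

A count greater than zero for the blocks extracted from `18.1 Notes.pdf` corresponds to the "Math-like blocks detected" statistic checked in step 2.4.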
---
## 3. Test Case: Screen Reader Simulation (Untagged Fallback)
**Target Document**: `logic.pdf` (Academic Text)
| Step | Action | Expected Result | Pass/Fail |
|------|--------|-----------------|-----------|
| 3.1 | Select `logic.pdf`. | File loads. | |
| 3.2 | Click **Analyze**. | Analysis completes. | |
| 3.3 | Go to **Advanced Analysis** -> **2. Screen Reader Simulator**. | Accordion opens. | |
| 3.4 | Set **Reading Order** to "Raw" or "TBLR". | Settings accepted. | |
| 3.5 | Click **Generate Transcript**. | **Result**: Transcript appears in the textbox.<br> **Header**: "⚠️ Simulated from visual order (PDF not tagged)".<br> **Content**: Contains readable text (e.g., "A Logical Interpretation..."). | |
**Success Criteria**: The simulator successfully uses the fallback logic (visual ordering) instead of crashing when no structure tree is present.
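
The fallback exercised in this test case can be pictured as follows. This is a sketch under stated assumptions: the `Block` type, the `order_blocks` and `transcript` helpers, and the `line_tolerance` parameter are hypothetical, and coordinates are assumed to have a top-left origin with `y` growing downward. "Raw" keeps the extraction order, while "TBLR" sorts top-to-bottom, then left-to-right.

```python
from typing import List, NamedTuple


class Block(NamedTuple):
    x0: float   # left edge, top-left origin (assumption)
    y0: float   # top edge, y grows downward (assumption)
    text: str


HEADER = "⚠️ Simulated from visual order (PDF not tagged)"


def order_blocks(blocks: List[Block], mode: str = "Raw",
                 line_tolerance: float = 3.0) -> List[Block]:
    """Order blocks for reading when no structure tree is available."""
    if mode == "Raw":
        # Keep whatever order the text extraction produced.
        return list(blocks)
    if mode == "TBLR":
        # Bucket y into rows so small baseline differences do not
        # reorder blocks that sit on the same visual line.
        return sorted(blocks,
                      key=lambda b: (round(b.y0 / line_tolerance), b.x0))
    raise ValueError(f"unknown reading order: {mode!r}")


def transcript(blocks: List[Block], mode: str = "Raw") -> str:
    """Build a transcript with the fallback header from step 3.5."""
    lines = [b.text for b in order_blocks(blocks, mode)]
    return "\n".join([HEADER] + lines)


sample = [
    Block(300, 50, "right column"),
    Block(50, 51, "left column"),
    Block(50, 10, "title"),
]
print(transcript(sample, "TBLR"))
```

Note that a plain top-to-bottom, left-to-right sort interleaves columns in multi-column layouts, which is one reason untagged reading order remains a guess.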
---
## 4. Test Case: Feature Availability Check (Negative Testing)
**Target Document**: Any of the above
| Step | Action | Expected Result | Pass/Fail |
|------|--------|-----------------|-----------|
| 4.1 | Open **5. Block-to-Tag Mapping**. | Accordion opens. | |
| 4.2 | Click **Map Blocks to Tags**. | **Result**: "## No Mappings Found" (because there are no tags). | |
| 4.3 | Open **3. Paragraph Detection** and click **Analyze**. | **Result**: Visual paragraphs are detected (green boxes), but **Semantic <P> Tags** count is 0. | |
**Success Criteria**: Features requiring tags explicitly state that tags are missing rather than showing empty/broken UIs.
---
## 5. Test Case: Landscape / Rotated Documents
- **Why**: Ensure overlays align correctly on rotated pages.
- **Test**:
  - Load a PDF with landscape pages (or 90-degree rotation).
  - Verify that the blue/red bounding boxes align with the text.
  - Verify that "reading order" flows logically (e.g., starting from the top-left of the *visual* page).
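
For the rotated-page check, it can help to know where a box should end up after rotation. The sketch below maps a bounding box from the unrotated page to the displayed page for clockwise rotations of 0, 90, 180, or 270 degrees. The helper `rotate_rect` is hypothetical, and it assumes top-left-origin coordinates with `y` growing downward; the application may handle rotation differently.

```python
def rotate_rect(rect, page_width, page_height, rotation):
    """Map (x0, y0, x1, y1) on the unrotated page to the displayed page.

    Assumptions: top-left origin, y grows downward, and `rotation` is a
    clockwise angle in degrees (0, 90, 180, or 270). `page_width` and
    `page_height` describe the unrotated page.
    """
    x0, y0, x1, y1 = rect
    if rotation == 0:
        return (x0, y0, x1, y1)
    if rotation == 90:
        # A point (x, y) moves to (page_height - y, x).
        return (page_height - y1, x0, page_height - y0, x1)
    if rotation == 180:
        # A point (x, y) moves to (page_width - x, page_height - y).
        return (page_width - x1, page_height - y1,
                page_width - x0, page_height - y0)
    if rotation == 270:
        # A point (x, y) moves to (y, page_width - x).
        return (y0, page_width - x1, y1, page_width - x0)
    raise ValueError("rotation must be 0, 90, 180, or 270")


# A 40x20 box on a 200x100 page becomes a 20x40 box after a quarter turn.
print(rotate_rect((10, 20, 50, 40), 200, 100, 90))
```

If an overlay box appears at the unrotated coordinates on a rotated page, the mismatch usually shows up as boxes shifted toward one edge or with swapped width and height.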
## Known Limitations / Expected Behavior
* **Untagged Alerts**: All examples provided are untagged; the alert is **expected behavior**.
* **Reading Order**: Without tags, reading order is a guess. Columns might be read left-to-right across the page in "Raw" mode.