PDF Inspector - Test Plan
Overview
This test plan outlines verification steps for the PDF Inspector application using the provided example documents. Because all currently included examples are untagged, the plan focuses on the "Untagged" detection logic, the fallback heuristics (math detection, reading order), and error handling.
Test Environment
- URL: http://127.0.0.1:7860
- Browsers: Chrome / Safari / Firefox (Any modern browser)
1. Test Case: Untagged Document Detection
Target Document: test_document.pdf
| Step | Action | Expected Result | Pass/Fail |
|---|---|---|---|
| 1.1 | Select test_document.pdf from Examples. | File loads into the input box. | |
| 1.2 | Click Analyze button. | Analysis completes; "Analysis Results" image appears. | |
| 1.3 | Check Summary Report. | Alert: "⚠️ Accessibility Alert: Untagged Document" is visible. | |
| 1.4 | Go to Advanced Analysis tab. | Tab opens. | |
| 1.5 | Open 4. Structure Tree Visualizer and click Extract. | Result: "## No Structure Tree Found" message. | |
Success Criteria: The application correctly identifies the document as untagged and prevents structure-dependent tools from crashing.
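For reference while testing, the tagged/untagged distinction can be sketched as a check on the PDF document catalog. This is a minimal illustration, not the application's actual code: it assumes the catalog has already been read into a plain mapping by whatever PDF library is in use, and the function names are hypothetical. The `/StructTreeRoot` and `/MarkInfo` keys come from the PDF specification.

```python
from typing import Any, Mapping


def has_structure_tree(catalog: Mapping[str, Any]) -> bool:
    """True if the catalog has a /StructTreeRoot entry."""
    return catalog.get("/StructTreeRoot") is not None


def is_marked(catalog: Mapping[str, Any]) -> bool:
    """True if the catalog declares /MarkInfo << /Marked true >>."""
    mark_info = catalog.get("/MarkInfo") or {}
    return bool(mark_info.get("/Marked", False))


def is_tagged(catalog: Mapping[str, Any]) -> bool:
    """Hypothetical rule: treat a PDF as tagged only if both signals are present."""
    return has_structure_tree(catalog) and is_marked(catalog)


# An untagged document, like the bundled examples, has neither entry.
untagged_catalog = {"/Type": "/Catalog", "/Pages": {}}
print(is_tagged(untagged_catalog))
```

A document that sets `/Marked` but has no structure tree is still reported as untagged under this rule, which matches the "No Structure Tree Found" outcome expected in step 1.5.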
2. Test Case: Math & Visual Block Detection
Target Document: 18.1 Notes.pdf (Handwritten/Math Slides)
| Step | Action | Expected Result | Pass/Fail |
|---|---|---|---|
| 2.1 | Select 18.1 Notes.pdf from Examples. | File loads. | |
| 2.2 | Click Analyze button. | Analysis completes (~1-2 seconds). | |
| 2.3 | Inspect "Page overlay" image. | Red Boxes: Detected around text blocks. Math Highlight: Math formulas (e.g., integrals, sums) have their own bounding boxes. | |
| 2.4 | Check Summary Report. | Alert: "Untagged Document". Stats: "Math-like blocks detected" is greater than 0. | |
Success Criteria: The heuristic regex-based math detection works on the text extracted from the slides.
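To make the success criterion concrete, a regex-based math heuristic might look like the sketch below. The pattern and the function name `count_math_like_blocks` are assumptions made for illustration; the application's real pattern is not shown in this plan and may differ.

```python
import re
from typing import Iterable

# Hypothetical pattern: a text block counts as "math-like" if it contains
# a Unicode math symbol, a LaTeX-style command, or a simple equation.
MATH_PATTERN = re.compile(
    r"[∫∑∏√≤≥≠±∞∂]"                # Unicode math symbols
    r"|\\(?:int|sum|frac|sqrt)\b"    # LaTeX-style commands such as \frac
    r"|\b\w+\s*=\s*[\w(\-]"          # simple equations such as "x = 2y"
)


def count_math_like_blocks(blocks: Iterable[str]) -> int:
    """Count the text blocks that match the math heuristic at least once."""
    return sum(1 for text in blocks if MATH_PATTERN.search(text))


sample_blocks = [
    "The integral ∫ f(x) dx converges",
    "Plain sentence about logic.",
    "E = mc^2",
    r"\frac{a}{b}",
]
print(count_math_like_blocks(sample_blocks))
```

A heuristic like this can produce false positives on prose containing `key=value` text, so step 2.4 only requires a count greater than zero rather than an exact number.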
3. Test Case: Screen Reader Simulation (Untagged Fallback)
Target Document: logic.pdf (Academic Text)
| Step | Action | Expected Result | Pass/Fail |
|---|---|---|---|
| 3.1 | Select logic.pdf. | File loads. | |
| 3.2 | Click Analyze. | Analysis completes. | |
| 3.3 | Go to Advanced Analysis -> 2. Screen Reader Simulator. | Accordion opens. | |
| 3.4 | Set Reading Order to "Raw" or "TBLR". | Settings accepted. | |
| 3.5 | Click Generate Transcript. | Result: Transcript appears in the textbox. Header: "⚠️ Simulated from visual order (PDF not tagged)". Content: Contains readable text (e.g., "A Logical Interpretation..."). | |
Success Criteria: The simulator successfully uses the fallback logic (visual ordering) instead of crashing when no structure tree is present.
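The visual-order fallback can be pictured with a small sketch. It assumes "TBLR" means top-to-bottom, then left-to-right, and that each block carries the top-left corner of its bounding box in image coordinates (y grows downward). The `Block` type, `order_tblr`, and the `line_tolerance` parameter are hypothetical names, not the application's API.

```python
from typing import Iterable, List, NamedTuple


class Block(NamedTuple):
    text: str
    x0: float  # left edge of the bounding box
    y0: float  # top edge of the bounding box (y grows downward)


def order_tblr(blocks: Iterable[Block], line_tolerance: float = 5.0) -> List[Block]:
    """Order blocks top-to-bottom, then left-to-right within each visual row."""
    ordered: List[Block] = []
    row: List[Block] = []
    for block in sorted(blocks, key=lambda b: b.y0):
        # Start a new row once a block sits clearly below the current row's top.
        if row and block.y0 - row[0].y0 > line_tolerance:
            ordered.extend(sorted(row, key=lambda b: b.x0))
            row = []
        row.append(block)
    ordered.extend(sorted(row, key=lambda b: b.x0))
    return ordered


# Two-column page: left column at x=50, right column at x=300.
page = [
    Block("R2", 300, 121),
    Block("L1", 50, 100),
    Block("R1", 300, 101),
    Block("L2", 50, 120),
]
print([b.text for b in order_tblr(page)])
```

On the two-column sample the result interleaves the columns (L1, R1, L2, R2), which is the kind of behavior the Known Limitations section describes for untagged documents.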
4. Test Case: Feature Availability Check (Negative Testing)
Target Document: Any of the above
| Step | Action | Expected Result | Pass/Fail |
|---|---|---|---|
| 4.1 | Open 5. Block-to-Tag Mapping. | Accordion opens. | |
| 4.2 | Click Map Blocks to Tags. | Result: "## No Mappings Found" (because there are no tags). | |
| 4.3 | Open 3. Paragraph Detection and click Analyze. | Result: Visual paragraphs are detected (green boxes), but Semantic Tags count is 0. | |
Success Criteria: Features requiring tags explicitly state that tags are missing rather than showing empty/broken UIs.
1.6 Landscape / Rotated Documents
- Why: Ensure overlays align correctly on rotated pages.
- Test:
  - Load a PDF with landscape pages (or 90-degree rotation).
  - Verify that the blue/red bounding boxes align perfectly with the text.
  - Verify that "reading order" flows logically (e.g., top-left of the visual page).
Known Limitations / Expected Behavior
- Untagged Alerts: All examples provided are untagged; the alert is expected behavior.
- Reading Order: Without tags, reading order is a guess. Columns might be read left-to-right across the page in "Raw" mode.