# PDF Debugging Workflow

This guide details how to use the PDF Inspector tool to diagnose and remediate common PDF accessibility issues.

## 1. Initial Compatibility Check
**Goal**: Determine if the document requires major remediation before detailed analysis.

1.  **Upload the PDF**: Use the file uploader or select an example from the list.
2.  **Run Single Page Analysis**: Click "Analyze".
3.  **Check for Alerts**: Look for the "Accessibility Alert" box at the top of the summary.
    *   **Untagged Document**: If you see this, the document lacks the "Structure Tree" required for screen readers.
        *   *Remediation*: Open the source file (Word/PPT) and "Save as PDF" with tags enabled, or use Adobe Acrobat Pro's "Autotag" feature.
    *   **Scanned Page**: If you see this, the page is an image with no selectable text.
        *   *Remediation*: Perform Optical Character Recognition (OCR) using Adobe Acrobat or a similar tool.
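
Outside the tool, a rough first pass for the "Untagged Document" condition can be sketched as a raw byte search. This is only a heuristic, not how PDF Inspector itself works: the function name `looks_tagged` is hypothetical, and the search can miss a structure tree when the document catalog is stored inside a compressed object stream.

```python
def looks_tagged(pdf_bytes: bytes) -> bool:
    """Heuristic: report whether the raw file mentions /StructTreeRoot.

    A tagged PDF references its structure tree from the document catalog
    under the /StructTreeRoot key. Searching the raw bytes can produce a
    false negative if the catalog lives in a compressed object stream.
    """
    return b"/StructTreeRoot" in pdf_bytes


# Tiny in-memory samples standing in for real files (illustrative only).
tagged_sample = b"1 0 obj << /Type /Catalog /StructTreeRoot 5 0 R >> endobj"
untagged_sample = b"1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj"

print(looks_tagged(tagged_sample))
print(looks_tagged(untagged_sample))
```

For a real file, read it in binary mode and pass the bytes to the function. A negative result is a prompt to look closer, not proof that the document is untagged.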

## 2. Detailed Single-Page Inspection
**Goal**: Verify reading order and content types on a specific page.

1.  **Visual Inspection**: Look at the "Analysis Results" image.
    *   **Red Boxes**: Indicate detected text blocks.
    *   **Numbers**: Show the reading order.
2.  **Verify Reading Order**:
    *   Does the order (1, 2, 3...) follow the logical flow of the document?
    *   *Issue*: If columns are read left-to-right across the page instead of down the column, the reading order is broken.
    *   *Fix*: This usually requires manual retagging in Acrobat (Order panel).
3.  **Check for Artifacts**:
    *   Are headers/footers marked as text blocks? They should generally be tagged as artifacts so that screen readers skip them.
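
The column issue described above can be approximated numerically. The sketch below assumes a two-column page and that each block is available as an `(x0, y0, x1, y1)` bounding box listed in reading order; that data shape and the function name `column_switches` are assumptions for illustration, not the tool's export format. It counts how often consecutive blocks jump between the left and right halves of the page.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1)


def column_switches(boxes_in_reading_order: List[Box], page_width: float) -> int:
    """Count jumps between the left and right half of a two-column page."""
    mid = page_width / 2
    # Assign each block to column 0 (left) or 1 (right) by its horizontal center.
    columns = [0 if (x0 + x1) / 2 < mid else 1
               for x0, _, x1, _ in boxes_in_reading_order]
    return sum(1 for a, b in zip(columns, columns[1:]) if a != b)


left_top, left_bottom = (50, 700, 280, 750), (50, 600, 280, 690)
right_top, right_bottom = (320, 700, 550, 750), (320, 600, 550, 690)

# Down the left column, then down the right column: a single switch.
good_order = [left_top, left_bottom, right_top, right_bottom]
# Read across the page row by row: the order zig-zags between columns.
broken_order = [left_top, right_top, left_bottom, right_bottom]

print(column_switches(good_order, page_width=600))    # 1
print(column_switches(broken_order, page_width=600))  # 3
```

A count well above one on a two-column page suggests the blocks are being read across rather than down. Full-width blocks such as titles will skew the count, so treat the number as a hint only.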

## 3. Advanced Diagnostics
**Goal**: Deep dive into specific issues using the "Advanced Analysis" tab.

### Content Stream Inspector
*   **Use when**: Text looks correct visually but copies weirdly or reads wrong (e.g., "fi" ligature issues).
*   **Action**: Select a block and click "Extract Operators".
*   **Look for**: `TJ` or `Tj` operators showing garbled characters or strange spacing adjustments.
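
To make "strange spacing adjustments" concrete: inside a `TJ` array, the numbers between strings shift the next glyph, expressed in thousandths of a unit of text space. The following sketch scans extracted operator text for unusually large adjustments. It assumes the operators are available as plain text, uses a hypothetical helper name and threshold, and relies on a simplified regex that does not handle nested parentheses or a `]` inside a string.

```python
import re
from typing import List

# One TJ array: everything between "[" and "] TJ" (simplified).
TJ_ARRAY = re.compile(r"\[(.*?)\]\s*TJ", re.DOTALL)
# Tokens inside the array: literal string, hex string, or number.
TOKEN = re.compile(
    r"\((?:\\.|[^\\()])*\)"        # (literal string), allowing escapes
    r"|<[0-9A-Fa-f\s]*>"           # <hex string>
    r"|[-+]?(?:\d+\.?\d*|\.\d+)"   # numeric adjustment
)


def large_adjustments(stream: str, threshold: float = 200.0) -> List[float]:
    """Return TJ spacing adjustments whose magnitude is at least `threshold`."""
    found = []
    for array in TJ_ARRAY.finditer(stream):
        for token in TOKEN.finditer(array.group(1)):
            text = token.group(0)
            if text[0] in "(<":
                continue  # string operand, not a spacing adjustment
            value = float(text)
            if abs(value) >= threshold:
                found.append(value)
    return found


sample = "BT /F1 12 Tf [(Hel) -20 (lo) -750 (W) 15 (orld)] TJ ET"
print(large_adjustments(sample))  # [-750.0]
```

Small values are ordinary kerning. A large negative value often stands in for a space that was never encoded as a character, which is one reason copied text can run words together.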

### Screen Reader Simulator
*   **Use when**: You want to "hear" what a user hears.
*   **Action**: Select "NVDA" and click "Generate Transcript".
*   **Check**:
    *   Are headings announced as "Heading Level X"?
    *   Is alt text read for images?
    *   Is the reading order intelligible?

### Paragraph Detection
*   **Use when**: Text seems run-on or broken into too many fragments.
*   **Action**: Click "Analyze Paragraphs".
*   **Check**:
    *   **Visual vs. Semantic**: Large discrepancies suggest the `<P>` tags don't match the visual layout, which can confuse users navigating by paragraph.

### Structure Tree Visualizer
*   **Use when**: The document is tagged, but navigation is broken.
*   **Action**: Click "Extract Structure Tree".
*   **Check**:
    *   Hierarchy depth.
    *   Correct nesting (e.g., `L` -> `LI` -> `LBody`).
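
The list nesting check can be sketched as a small tree walk. The example assumes the extracted structure tree has been converted to nested `(tag, children)` tuples, which is a hypothetical representation chosen for illustration, and it checks only the `L` -> `LI` -> `LBody` chain mentioned above.

```python
from typing import List, Optional

# Each tag listed here must sit directly under the given parent tag.
REQUIRED_PARENT = {"LI": "L", "LBody": "LI"}


def nesting_errors(node, parent: Optional[str] = None, path: str = "") -> List[str]:
    """Walk a (tag, children) tree and report list tags with the wrong parent."""
    tag, children = node
    here = f"{path}/{tag}"
    errors = []
    expected = REQUIRED_PARENT.get(tag)
    if expected is not None and parent != expected:
        errors.append(f"{here}: expected parent {expected}, found {parent}")
    for child in children:
        errors.extend(nesting_errors(child, tag, here))
    return errors


good_tree = ("Document", [("L", [("LI", [("LBody", [])])])])
bad_tree = ("Document", [("L", [("LBody", [])])])  # LI level is missing

print(nesting_errors(good_tree))  # []
print(nesting_errors(bad_tree))
```

The same table-driven approach extends to other parent/child rules (for example table rows and cells) by adding entries to `REQUIRED_PARENT`.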

## 4. Batch Analysis for Large Documents
**Goal**: Identify problematic pages in a long report.

1.  **Go to Batch Analysis Tab**.
2.  **Run Batch**: Analyze 50-100 pages.
3.  **Review the Report**:
    *   **Issues Found**: Look for "Scanned Pages" or "Garbled Text".
    *   **Page List**: Use the list of page numbers to target your remediation efforts.
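
When the page list is long, collapsing it into ranges makes it easier to hand off for remediation. A minimal sketch, assuming the flagged pages have been copied out as a list of integers (the function name `page_ranges` is hypothetical):

```python
from typing import Iterable


def page_ranges(pages: Iterable[int]) -> str:
    """Collapse page numbers into a compact string such as '3-5, 9, 12-13'."""
    ordered = sorted(set(pages))
    if not ordered:
        return ""
    ranges = []
    start = prev = ordered[0]
    for page in ordered[1:]:
        if page == prev + 1:
            prev = page  # still inside the current consecutive run
            continue
        ranges.append((start, prev))
        start = prev = page
    ranges.append((start, prev))
    return ", ".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


print(page_ranges([12, 3, 4, 5, 9, 13]))  # 3-5, 9, 12-13
```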

## Summary Checklist
- [ ] Document is Tagged (`/StructTreeRoot` exists)
- [ ] Text is selectable (not an image/scan)
- [ ] Reading order is logical (columns handled correctly)
- [ ] Images have Alt Text (or are marked as artifacts)
- [ ] Headings use Heading tags (`<H1>`, `<H2>`), not just bold text