# PDF Debugging Workflow

This guide details how to use the PDF Inspector tool to diagnose and remediate common PDF accessibility issues.

## 1. Initial Compatibility Check

**Goal**: Determine if the document requires major remediation before detailed analysis.

1. **Upload the PDF**: Use the file uploader or select an example from the list.
2. **Run Single Page Analysis**: Click "Analyze".
3. **Check for Alerts**: Look for the "Accessibility Alert" box at the top of the summary.
    * **Untagged Document**: If you see this, the document lacks the "Structure Tree" required for screen readers.
        * *Remediation*: Open the source file (Word/PPT) and "Save as PDF" with tags enabled, or use Adobe Acrobat Pro's "Autotag" feature.
    * **Scanned Page**: If you see this, the page is an image with no selectable text.
        * *Remediation*: Perform Optical Character Recognition (OCR) using Adobe Acrobat or a similar tool.

## 2. Detailed Single-Page Inspection

**Goal**: Verify reading order and content types on a specific page.

1. **Visual Inspection**: Look at the "Analysis Results" image.
    * **Red Boxes**: Indicate detected text blocks.
    * **Numbers**: Show the reading order.
2. **Verify Reading Order**:
    * Does the order (1, 2, 3...) follow the logical flow of the document?
    * *Issue*: If columns are read left-to-right across the page instead of down each column, the reading order is broken.
    * *Fix*: This usually requires manual retagging in Acrobat (Order panel).
3. **Check for Artifacts**:
    * Are headers/footers marked as text blocks? They should generally be marked as artifacts so that screen readers ignore them.

## 3. Advanced Diagnostics

**Goal**: Investigate specific issues in depth using the "Advanced Analysis" tab.

### Content Stream Inspector

* **Use when**: Text looks correct visually but copies or reads aloud incorrectly (e.g., "fi" ligature issues).
* **Action**: Select a block and click "Extract Operators".
* **Look for**: `TJ` or `Tj` operators showing garbled characters or strange spacing adjustments.

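To make the `Tj`/`TJ` check above more concrete, here is a minimal Python sketch of what to look for in extracted operators. It is an illustration only, not how the PDF Inspector itself works: the function name `inspect_text_operators`, the `gap_threshold` value, and the sample content stream are all hypothetical, and the regular expressions only handle simple literal strings (no hex strings such as `<0041>` and no nested or unbalanced parentheses). Numbers inside a `TJ` array are position adjustments in thousandths of a text-space unit, and a large negative value adds visual space without emitting a space character.

```python
import re

# One text-showing operator: either "(string) Tj" or "[ ... ] TJ".
# Simplification: literal strings only, no nested parentheses or hex strings.
TEXT_OP = re.compile(
    r"\((?P<simple>(?:\\.|[^\\()])*)\)\s*Tj"
    r"|\[(?P<array>(?:\\.|[^\]\\])*)\]\s*TJ"
)
# An item inside a TJ array: a literal string or a numeric adjustment.
ARRAY_ITEM = re.compile(r"\(((?:\\.|[^\\()])*)\)|(-?\d+(?:\.\d+)?)")


def inspect_text_operators(stream, gap_threshold=200.0):
    """Return (operator, raw_text, large_adjustments) for each Tj/TJ found.

    gap_threshold is an assumed cut-off (in thousandths of a text-space
    unit) above which an adjustment is treated as a likely word gap.
    """
    findings = []
    for match in TEXT_OP.finditer(stream):
        if match.group("simple") is not None:
            findings.append(("Tj", match.group("simple"), []))
            continue
        pieces, large_gaps = [], []
        for text, number in ARRAY_ITEM.findall(match.group("array")):
            if number:
                value = float(number)
                if abs(value) >= gap_threshold:
                    large_gaps.append(value)
            else:
                pieces.append(text)
        findings.append(("TJ", "".join(pieces), large_gaps))
    return findings


# Hypothetical content stream fragment, invented for illustration.
sample = r"""
BT
/F1 12 Tf
(O\014ce hours) Tj
[(Open) -320 (daily)] TJ
ET
"""

for operator, text, gaps in inspect_text_operators(sample):
    note = f"  <- large spacing adjustments: {gaps}" if gaps else ""
    print(f"{operator}: {text!r}{note}")
```

In this sample, the `Tj` string contains the escape `\014` where a ligature glyph would be, which is the kind of code that copies as garbage when the font lacks a proper Unicode mapping. The `TJ` array separates "Open" and "daily" with a `-320` adjustment rather than a space character, so naive concatenation produces `Opendaily`.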
### Screen Reader Simulator

* **Use when**: You want to "hear" what a user hears.
* **Action**: Select "NVDA" and click "Generate Transcript".
* **Check**:
    * Are headings announced as "Heading Level X"?
    * Is alt text read for images?
    * Is the reading order intelligible?

### Paragraph Detection

* **Use when**: Text seems run-on or broken into too many fragments.
* **Action**: Click "Analyze Paragraphs".
* **Check**:
    * **Visual vs. Semantic**: Large discrepancies between the two suggest the `P` tags don't match the visual layout, which can confuse users navigating by paragraph.

### Structure Tree Visualizer

* **Use when**: The document is tagged, but navigation is broken.
* **Action**: Click "Extract Structure Tree".
* **Check**:
    * Hierarchy depth.
    * Correct nesting (e.g., `L` -> `LI` -> `LBody`).

## 4. Batch Analysis for Large Documents

**Goal**: Identify problematic pages in a long report.

1. **Go to Batch Analysis Tab**.
2. **Run Batch**: Analyze 50-100 pages.
3. **Review the Report**:
    * **Issues Found**: Look for "Scanned Pages" or "Garbled Text".
    * **Page List**: Use the list of page numbers to target your remediation efforts.

## Summary Checklist

- [ ] Document is Tagged (`/StructTreeRoot` exists)
- [ ] Text is selectable (not an image/scan)
- [ ] Reading order is logical (columns handled correctly)
- [ ] Images have Alt Text (or are marked as artifacts)
- [ ] Headings use Heading tags (`