Description
This project, pdf-to-image-python, is a utility that converts PDF documents into images while preserving each page's native proportions. It uses the PyMuPDF (fitz) engine to render PDF pages and provides both batch processing for entire documents and single-page extraction (specifically for cover pages). The resolution scaling is calculated from the target DPI, so the output density stays the same regardless of the source page size.
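The DPI-based scaling can be illustrated with a minimal sketch. It assumes the standard PDF convention of 72 points per inch, so the zoom factor is the target DPI divided by 72, applied equally to both axes to keep the aspect ratio. The helper names `zoom_for_dpi`, `pixel_size`, and `render_first_page`, as well as the default of 300 DPI, are hypothetical and are not taken from the project's scripts.

```python
PDF_POINTS_PER_INCH = 72  # PDF user space: 1 point = 1/72 inch


def zoom_for_dpi(target_dpi):
    """Return the uniform scale factor that maps PDF points to pixels."""
    return target_dpi / PDF_POINTS_PER_INCH


def pixel_size(width_pt, height_pt, target_dpi):
    """Approximate output size in pixels for a page measured in points."""
    # The same factor is used for width and height, so proportions are kept.
    width_px = round(width_pt * target_dpi / PDF_POINTS_PER_INCH)
    height_px = round(height_pt * target_dpi / PDF_POINTS_PER_INCH)
    return width_px, height_px


def render_first_page(pdf_path, png_path, target_dpi=300):
    """Render page 0 of a PDF to a PNG file (assumes PyMuPDF is installed)."""
    import fitz  # imported here so the helpers above work without PyMuPDF

    zoom = zoom_for_dpi(target_dpi)
    doc = fitz.open(pdf_path)
    try:
        page = doc.load_page(0)  # page indices are zero-based
        pix = page.get_pixmap(matrix=fitz.Matrix(zoom, zoom))
        pix.save(png_path)
    finally:
        doc.close()
```

For a US Letter page (612 x 792 points), `pixel_size(612, 792, 300)` gives `(2550, 3300)`. The dimensions of the pixmap PyMuPDF actually produces may differ by a pixel because of its own rounding.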
System Overview
graph TD
A[PDF Input] --> B{Process Type}
B -->|Single| C[single_page.py]
B -->|Batch| D[all_page.py]
C --> E[First Page Export]
D --> F[Full Document Export]
E --> G[Output Folder]
F --> G
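The two branches of the diagram differ only in which pages they export. A minimal sketch of that selection follows, assuming zero-based page indices as PyMuPDF uses. The function name `pages_to_export` and the mode strings are hypothetical, since the project keeps the two paths in separate scripts.

```python
def pages_to_export(mode, page_count):
    """Return the zero-based page indices a given mode would export."""
    if page_count < 1:
        return []  # nothing to export from an empty document
    if mode == "single":
        return [0]  # first page only (the cover)
    if mode == "batch":
        return list(range(page_count))  # every page in order
    raise ValueError(f"unknown mode: {mode!r}")
```

Both modes write into the same output folder, so only the page selection changes between them.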
Project Structure
pdf-to-image-python/
├── .gitignore          # Git ignore rules
├── PDF/                # Input PDF directory
├── output/             # Generated images directory (auto-created)
├── LICENSE             # Project license
├── README.md           # Main documentation
├── STACKS.md           # Technical stack audit
├── all_page.py         # Full PDF conversion script
├── requirements.txt    # Dependency list
└── single_page.py      # First page conversion script
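The layout above implies two file system steps: collecting the PDFs in `PDF/` and creating `output/` if it does not exist yet. A minimal sketch using `glob` and `os` follows. The function names and the `<name>_page_<n>.png` naming scheme are assumptions for illustration, not the project's actual conventions.

```python
import glob
import os


def find_pdfs(input_dir="PDF"):
    """Return the PDF files in input_dir, sorted for a stable order."""
    return sorted(glob.glob(os.path.join(input_dir, "*.pdf")))


def ensure_output_dir(output_dir="output"):
    """Create the output directory if it is missing and return its path."""
    os.makedirs(output_dir, exist_ok=True)  # no error if it already exists
    return output_dir


def page_image_path(pdf_path, page_index, output_dir="output"):
    """Build an output path for one page (hypothetical naming scheme)."""
    stem = os.path.splitext(os.path.basename(pdf_path))[0]
    # Page numbers in file names start at 1, indices start at 0.
    return os.path.join(output_dir, f"{stem}_page_{page_index + 1}.png")
```

Using `exist_ok=True` lets the script run repeatedly without checking for the directory first.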
Tech Stack
Audit of project files (excluding environment and cache):
| File Type | Count | Size (KB) |
|---|---|---|
| PDF (.pdf) | 16 | 15152 |
| PNG (.png) | 17 | 1952 |
| Python (.py) | 3 | 9.8 |
| Markdown (.md) | 3 | 4.3 |
| Text (.txt) | 2 | 0.1 |
| License | 1 | 1.1 |
Total Files: 42
Dependencies
- Python:
  - PyMuPDF (fitz): Core PDF rendering and processing.
  - argparse: Command-line argument parsing.
  - os: File system operations.
  - glob: Filename pattern matching.
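As an illustration of the `argparse` role listed above, here is a minimal sketch of a command-line parser. The flags `--input`, `--output`, and `--dpi` and their defaults are hypothetical and may not match the options the project's scripts actually accept.

```python
import argparse


def build_parser():
    """Build a parser with hypothetical options for a PDF-to-image script."""
    parser = argparse.ArgumentParser(
        description="Convert PDF pages to images."
    )
    # All option names and defaults below are assumptions for illustration.
    parser.add_argument("--input", default="PDF",
                        help="directory containing the source PDF files")
    parser.add_argument("--output", default="output",
                        help="directory where images are written")
    parser.add_argument("--dpi", type=int, default=300,
                        help="target resolution in dots per inch")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"input={args.input} output={args.output} dpi={args.dpi}")
```

Declaring `--dpi` with `type=int` makes `argparse` reject non-numeric values before any rendering starts.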
Applications
- Google Antigravity
- Google Gemini Pro
- Visual Studio Code
- Windows PowerShell