## Description
This project, `pdf-to-image-python`, is a utility that converts PDF documents into images while preserving each page's native aspect ratio. It uses the PyMuPDF (`fitz`) engine for rendering and offers both batch conversion of entire documents and single-page extraction of the first page (intended for cover pages). The scale factor is computed from the target DPI, so output resolution stays consistent regardless of the source page size.
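The DPI-based scaling described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names `zoom_for_dpi`, `pixel_size`, and `render_page` are hypothetical, and the 300 DPI default is an assumption. It relies on the fact that PDF page dimensions are expressed in points (1/72 inch), so the zoom factor is the target DPI divided by 72, applied equally on both axes to keep the aspect ratio.

```python
PDF_POINTS_PER_INCH = 72  # PDF page sizes are measured in points (1/72 inch)


def zoom_for_dpi(target_dpi: float) -> float:
    """Return the uniform scale factor that renders a page at target_dpi."""
    return target_dpi / PDF_POINTS_PER_INCH


def pixel_size(width_pt: float, height_pt: float, target_dpi: float) -> tuple:
    """Approximate output size in pixels for a page given in points.

    The same factor is used for width and height, so proportions are kept.
    PyMuPDF's own rounding may differ by a pixel.
    """
    zoom = zoom_for_dpi(target_dpi)
    return round(width_pt * zoom), round(height_pt * zoom)


def render_page(pdf_path: str, page_index: int, out_path: str,
                target_dpi: float = 300) -> None:
    """Render one page to an image file (hypothetical helper)."""
    import fitz  # PyMuPDF; imported here so the helpers above need no install

    zoom = zoom_for_dpi(target_dpi)
    with fitz.open(pdf_path) as doc:
        page = doc.load_page(page_index)
        # A uniform Matrix(zoom, zoom) scales both axes by the same amount.
        pix = page.get_pixmap(matrix=fitz.Matrix(zoom, zoom))
        pix.save(out_path)
```

For example, a US Letter page (612 x 792 points) rendered at 300 DPI comes out at roughly 2550 x 3300 pixels.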
## System Overview
```mermaid
graph TD
A[PDF Input] --> B{Process Type}
B -->|Single| C[single_page.py]
B -->|Batch| D[all_page.py]
C --> E[First Page Export]
D --> F[Full Document Export]
E --> G[Output Folder]
F --> G
```
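The two branches of the diagram differ mainly in which pages they export. The sketch below shows one way to express that choice, under stated assumptions: the mode names `"single"` and `"batch"`, the helper names, and the output file naming pattern are all hypothetical and are not taken from `single_page.py` or `all_page.py`.

```python
import os


def pages_to_export(page_count: int, mode: str) -> range:
    """Return the zero-based page indices to export for a given mode."""
    if mode == "single":
        # First page only; an empty document yields no pages.
        return range(min(1, page_count))
    if mode == "batch":
        return range(page_count)
    raise ValueError(f"unknown mode: {mode!r}")


def output_path(output_dir: str, pdf_path: str, page_index: int) -> str:
    """Build an image path in the output folder (assumed naming scheme)."""
    stem = os.path.splitext(os.path.basename(pdf_path))[0]
    return os.path.join(output_dir, f"{stem}_page_{page_index + 1:03d}.png")
```

Both modes write to the same output folder, so only the page range changes between them.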
## Project Structure
```text
pdf-to-image-python/
├── .gitignore # Git ignore rules
├── PDF/ # Input PDF directory
├── output/ # Generated images directory (auto-created)
├── LICENSE # Project license
├── README.md # Main documentation
├── STACKS.md # Technical stack audit
├── all_page.py # Full PDF conversion script
├── requirements.txt # Dependency list
└── single_page.py # First page conversion script
```
## Tech Stack
Audit of project files (excluding environment and cache):
| File Type | Count | Size (KB) |
| :--- | :--- | :--- |
| PDF (.pdf) | 16 | 15152 |
| PNG (.png) | 17 | 1952 |
| Python (.py) | 3 | 9.8 |
| Markdown (.md) | 3 | 4.3 |
| Text (.txt) | 2 | 0.1 |
| License | 1 | 1.1 |
**Total Files**: 42
## Dependencies
- **Python**:
  - `PyMuPDF` (imported as `fitz`): Core PDF rendering and processing.
  - `argparse` (standard library): Command-line argument parsing.
  - `os` (standard library): File system operations.
  - `glob` (standard library): Filename pattern matching.
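As a rough illustration of how the three standard-library modules listed above might fit together, the sketch below parses command-line options, finds PDFs in the input folder, and creates the output folder if it is missing. The flag names (`--input-dir`, `--output-dir`, `--dpi`), their defaults, and the helper names are assumptions made for this example; the actual scripts may use different options.

```python
import argparse
import glob
import os


def build_parser() -> argparse.ArgumentParser:
    """Create a CLI parser with hypothetical flags and defaults."""
    parser = argparse.ArgumentParser(
        description="Convert PDFs to images (illustrative sketch)."
    )
    parser.add_argument("--input-dir", default="PDF", help="folder with PDFs")
    parser.add_argument("--output-dir", default="output", help="image folder")
    parser.add_argument("--dpi", type=int, default=300, help="target DPI")
    return parser


def find_pdfs(input_dir: str) -> list:
    """Return the PDF files in input_dir, sorted for a stable order."""
    return sorted(glob.glob(os.path.join(input_dir, "*.pdf")))


def prepare_output_dir(output_dir: str) -> str:
    """Create the output folder if needed and return its path."""
    os.makedirs(output_dir, exist_ok=True)
    return output_dir
```

A script could then call `build_parser().parse_args()`, loop over `find_pdfs(args.input_dir)`, and hand each file to the PyMuPDF rendering step.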
## Applications
- Google Antigravity
- Google Gemini Pro
- Visual Studio Code
- Windows PowerShell