	## Description
	This project, `pdf-to-image-python`, converts PDF documents into images while preserving each page's native aspect ratio. It uses the PyMuPDF (`fitz`) engine to render pages and offers two modes: batch conversion of an entire document (`all_page.py`) and single-page extraction of the first page, intended for cover pages (`single_page.py`). The render scale is computed from the target DPI rather than from fixed pixel dimensions, so pages of any size are rendered at the same pixel density.
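
The DPI-based scaling described above can be sketched as follows. PDF page sizes are expressed in points (1/72 inch), so the zoom factor for a target DPI is `dpi / 72`. This is a minimal illustration, not the project's actual code: the helper names `zoom_for_dpi`, `pixel_size`, and `render_page` are hypothetical, and the 300 DPI default is an assumption.

```python
PDF_POINTS_PER_INCH = 72  # PDF user space: 1 point = 1/72 inch


def zoom_for_dpi(dpi):
    """Scale factor that maps PDF points to pixels at the given DPI."""
    return dpi / PDF_POINTS_PER_INCH


def pixel_size(width_pt, height_pt, dpi):
    """Approximate output size in pixels for a page measured in points."""
    zoom = zoom_for_dpi(dpi)
    # The same factor is applied to both axes, so proportions are kept.
    return round(width_pt * zoom), round(height_pt * zoom)


def render_page(pdf_path, page_index, out_path, dpi=300):
    """Render one page to an image file (hypothetical helper, assumed default DPI)."""
    import fitz  # PyMuPDF; imported lazily so the helpers above work without it

    with fitz.open(pdf_path) as doc:
        page = doc[page_index]
        zoom = zoom_for_dpi(dpi)
        pix = page.get_pixmap(matrix=fitz.Matrix(zoom, zoom))
        pix.save(out_path)
        return pix.width, pix.height
```

For a US Letter page (612 × 792 pt) at 300 DPI, `pixel_size` gives 2550 × 3300 pixels. The size PyMuPDF actually produces may differ by a pixel because of how it rounds the transformed page rectangle.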

	## System Overview

	```mermaid
	graph TD
	A[PDF Input] --> B{Process Type}
	B -->|Single| C[single_page.py]
	B -->|Batch| D[all_page.py]
	C --> E[First Page Export]
	D --> F[Full Document Export]
	E --> G[Output Folder]
	F --> G
	```
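
The branching in the diagram can be sketched as a small planning function that decides which pages to export for each mode. This is an illustrative sketch only: the function name `plan_outputs`, the mode strings, and the `<name>_page_NNN.png` naming scheme are assumptions, not taken from the project's scripts.

```python
import os


def plan_outputs(pdf_path, page_count, mode, out_dir="output"):
    """Return (page_index, image_path) pairs for the pages to export.

    mode "single" selects only the first page; "batch" selects every page.
    The file naming scheme is hypothetical.
    """
    if mode not in ("single", "batch"):
        raise ValueError(f"unknown mode: {mode!r}")
    if page_count < 1:
        return []
    stem = os.path.splitext(os.path.basename(pdf_path))[0]
    indices = [0] if mode == "single" else range(page_count)
    return [
        (i, os.path.join(out_dir, f"{stem}_page_{i + 1:03d}.png"))
        for i in indices
    ]
```

Separating the planning step from rendering keeps both modes on a single code path and makes the page selection easy to check without opening a PDF.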

	## Project Structure

	```text
	pdf-to-image-python/
	├── .gitignore # Git ignore rules
	├── PDF/ # Input PDF directory
	├── output/ # Generated images directory (auto-created)
	├── LICENSE # Project license
	├── README.md # Main documentation
	├── STACKS.md # Technical stack audit
	├── all_page.py # Full PDF conversion script
	├── requirements.txt # Dependency list
	└── single_page.py # First page conversion script
	```
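
Given this layout, input discovery and the auto-created output folder could look roughly like the sketch below. It uses only `glob` and `os` from the dependency list. The function names are hypothetical, and the default directory names are taken from the tree above.

```python
import glob
import os


def find_pdfs(pdf_dir="PDF"):
    """Return the PDF files in pdf_dir, sorted by path."""
    return sorted(glob.glob(os.path.join(pdf_dir, "*.pdf")))


def ensure_output_dir(out_dir="output"):
    """Create the output directory if it does not exist yet."""
    os.makedirs(out_dir, exist_ok=True)
    return out_dir
```

`exist_ok=True` makes repeated runs safe: the call succeeds whether or not the folder already exists.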

	## Tech Stack
	Audit of project files (excluding environment and cache):

	| File Type | Count | Size (KB) |
	| :--- | :--- | :--- |
	| PDF (.pdf) | 16 | 15152 |
	| PNG (.png) | 17 | 1952 |
	| Python (.py) | 3 | 9.8 |
	| Markdown (.md) | 3 | 4.3 |
	| Text (.txt) | 2 | 0.1 |
	| License | 1 | 1.1 |

	Total Files: 42
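
An audit like the one above could be produced with a short script along these lines. This is a sketch under assumptions: the source does not say how the figures were generated, and the excluded directory names and the function name `audit_files` are hypothetical.

```python
import os
from collections import defaultdict

# Assumed names for "environment and cache" directories.
EXCLUDED_DIRS = {".git", ".venv", "venv", "__pycache__"}


def audit_files(root="."):
    """Map each file extension to (file count, total size in KB)."""
    stats = defaultdict(lambda: [0, 0])
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune excluded directories in place so os.walk skips them.
        dirnames[:] = [d for d in dirnames if d not in EXCLUDED_DIRS]
        for name in filenames:
            ext = os.path.splitext(name)[1].lower() or "(none)"
            entry = stats[ext]
            entry[0] += 1
            entry[1] += os.path.getsize(os.path.join(dirpath, name))
    return {ext: (count, size / 1024) for ext, (count, size) in stats.items()}
```

Files without an extension, such as `LICENSE`, are grouped under `"(none)"`.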

	## Dependencies
	- Python:
	  - `PyMuPDF` (imported as `fitz`): Core PDF rendering and processing.
	  - `argparse` (standard library): Command-line argument parsing.
	  - `os` (standard library): File system operations.
	  - `glob` (standard library): Filename pattern matching.
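
As an illustration of how `argparse` might be wired into such a script, here is a minimal sketch. The flag names `--input`, `--output`, and `--dpi` and their defaults are assumptions for this example. The project's real command-line interface may differ.

```python
import argparse


def build_parser():
    """Build a command-line parser (flag names and defaults are hypothetical)."""
    parser = argparse.ArgumentParser(description="Convert PDF pages to images.")
    parser.add_argument("--input", default="PDF",
                        help="directory containing the input PDF files")
    parser.add_argument("--output", default="output",
                        help="directory where images are written")
    parser.add_argument("--dpi", type=int, default=300,
                        help="target resolution in dots per inch")
    return parser
```

Calling `build_parser().parse_args(["--dpi", "150"])` yields a namespace whose `dpi` is the integer 150, with `input` and `output` left at their defaults.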

	## Applications
	- Google Antigravity
	- Google Gemini Pro
	- Visual Studio Code
	- Windows PowerShell