---
tags:
- pdf
- document-processing
- pdf-manipulation
- python
- cli
- automation
language:
- en
license: mit
library_name: pdf-manipulator
pipeline_tag: other
---

# PDF Manipulator

![Python](https://img.shields.io/badge/Python-3.9%2B-blue?style=flat-square&logo=python)
![License](https://img.shields.io/badge/License-MIT-green?style=flat-square)
![Version](https://img.shields.io/badge/Version-1.0.0-orange?style=flat-square)
![Maintained](https://img.shields.io/badge/Maintained-Yes-brightgreen?style=flat-square)
![PRs Welcome](https://img.shields.io/badge/PRs-Welcome-blueviolet?style=flat-square)
![Platform](https://img.shields.io/badge/Platform-Linux%20%7C%20macOS%20%7C%20Windows-lightgrey?style=flat-square)

A comprehensive, single-file command-line toolkit for all PDF page manipulation operations. Merge, split, remove, rotate, crop, watermark, encrypt, number, reorder, and batch-process PDF files with a clean and intuitive CLI.

**Author:** algorembrant

---

## Features

| Feature           | Command        | Description                                         |
|-------------------|----------------|-----------------------------------------------------|
| Merge PDFs        | `merge`        | Combine multiple PDFs into one, or interleave pages |
| Split PDF         | `split`        | Split into individual pages or page ranges          |
| Remove Pages      | `remove`       | Remove one or more pages by number or range         |
| Extract Pages     | `extract`      | Extract specific pages into a new PDF               |
| Reorder Pages     | `reorder`      | Rearrange pages in any custom order                 |
| Rotate Pages      | `rotate`       | Rotate pages by 90, 180, or 270 degrees             |
| Reverse Pages     | `reverse`      | Reverse the page order                              |
| Duplicate Pages   | `duplicate`    | Duplicate specific pages N times                    |
| Insert Blank Page | `insert-blank` | Insert blank page before or after a position        |
| Insert PDF Pages  | `insert`       | Insert pages from another PDF at a position         |
| Replace Pages     | `replace`      | Replace pages with pages from another PDF           |
| Crop Pages        | `crop`         | Crop pages to a custom bounding box                 |
| Scale / Resize    | `scale`        | Scale pages by factor or resize to A4/letter        |
| Watermark         | `watermark`    | Add text or PDF watermark to all pages              |
| Stamp / Overlay   | `stamp`        | Overlay a stamp PDF on pages                        |
| Page Numbers      | `number`       | Add page numbers at any position                    |
| Encrypt           | `encrypt`      | Password-protect a PDF                              |
| Decrypt           | `decrypt`      | Remove password from a PDF                          |
| Metadata          | `metadata`     | View or edit PDF title, author, subject, keywords   |
| Bookmarks         | `bookmarks`    | List or add bookmark/outline entries                |
| Extract Text      | `text`         | Extract plain text from pages                       |
| Info / Inspect    | `info`         | Display page count, dimensions, and metadata        |
| N-Up Layout       | `nup`          | Arrange multiple pages per sheet (2x1, 2x2, etc.)   |
| Compress          | `compress`     | Losslessly compress PDF streams                     |
| Batch Remove      | `batch-remove` | Remove pages from all PDFs in a directory           |
| Batch Merge       | `batch-merge`  | Merge all PDFs in a directory into one              |
| Batch Split       | `batch-split`  | Split all PDFs in a directory into pages            |

---

## Requirements

- Python 3.9 or newer
- System dependency: **Poppler** (required for `nup` command only)

---

## Installation

### 1. Clone the Repository

```bash
git clone https://github.com/algorembrant/pdf-manipulator.git
cd pdf-manipulator
```

### 2. Install Python Dependencies

```bash
pip install -r requirements.txt
```

### 3. Install Poppler (Required for N-Up Layout)

The `nup` command uses `pdf2image`, which requires Poppler to be installed on your system.

| Platform      | Install Command |
|---------------|-----------------|
| Ubuntu/Debian | `sudo apt-get install -y poppler-utils` |
| macOS         | `brew install poppler` |
| Windows       | Download from https://github.com/oschwartz10612/poppler-windows/releases and add `bin/` to your PATH |

If you do not need the `nup` command, Poppler is not required.

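
To confirm that Poppler is reachable before running `nup`, a quick check along these lines may help. This is a minimal sketch and not part of the tool: it assumes `pdf2image` locates Poppler through the `pdftoppm` and `pdfinfo` executables on the PATH, and the helper names `find_poppler_tools` and `poppler_available` are hypothetical.

```python
import shutil

# Poppler executables that pdf2image is assumed to call
# (an assumption for illustration, not taken from this README).
POPPLER_TOOLS = ("pdftoppm", "pdfinfo")


def find_poppler_tools(tools=POPPLER_TOOLS):
    """Map each tool name to its resolved path, or None if it is not on the PATH."""
    return {tool: shutil.which(tool) for tool in tools}


def poppler_available(tools=POPPLER_TOOLS):
    """Return True only if every listed tool can be found on the PATH."""
    return all(path is not None for path in find_poppler_tools(tools).values())


if __name__ == "__main__":
    # Report each executable, then an overall verdict.
    for tool, path in find_poppler_tools().items():
        print(f"{tool}: {path or 'not found'}")
    print("Poppler available:", poppler_available())
```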

---

## Usage

### Page Range Syntax

| Syntax    | Meaning                     |
|-----------|-----------------------------|
| `3`       | Page 3 only                 |
| `1,3,5`   | Pages 1, 3, and 5           |
| `2-5`     | Pages 2 through 5 inclusive |
| `1,3-5,7` | Pages 1, 3, 4, 5, and 7     |

Pages are always 1-indexed (first page = 1).

---

### Step-by-Step Guide

#### Merge PDFs

```bash
# Merge two or more PDFs in order
python pdf_manipulator.py merge -i file1.pdf file2.pdf file3.pdf -o merged.pdf

# Interleave pages (page 1 from file1, page 1 from file2, page 2 from file1, ...)
python pdf_manipulator.py merge -i file1.pdf file2.pdf -o interleaved.pdf --interleave
```

#### Split PDF

```bash
# Split into individual pages (saved to a directory)
python pdf_manipulator.py split -i input.pdf -o ./split_pages

# Extract a range of pages into a single file
python pdf_manipulator.py split -i input.pdf -o ./split_pages --range 1-5
python pdf_manipulator.py split -i input.pdf -o ./split_pages --range 2,4,6
```

#### Remove Pages

```bash
# Remove page 3
python pdf_manipulator.py remove -i input.pdf -o output.pdf --pages 3

# Remove multiple pages
python pdf_manipulator.py remove -i input.pdf -o output.pdf --pages 1,3,5

# Remove a range of pages
python pdf_manipulator.py remove -i input.pdf -o output.pdf --pages 2-5

# Remove mixed selection
python pdf_manipulator.py remove -i input.pdf -o output.pdf --pages 1,3-5,7
```

#### Extract Pages

```bash
# Extract pages 1-3 into a new PDF
python pdf_manipulator.py extract -i input.pdf -o output.pdf --pages 1-3

# Extract specific pages
python pdf_manipulator.py extract -i input.pdf -o output.pdf --pages 2,4,6
```

#### Reorder Pages

```bash
# Place page 3 first, then page 1, then page 2, then page 4
python pdf_manipulator.py reorder -i input.pdf -o output.pdf --order 3,1,2,4
```

#### Rotate Pages

```bash
# Rotate all pages 90 degrees clockwise
python pdf_manipulator.py rotate -i input.pdf -o output.pdf --angle 90

# Rotate only pages 1 and 3 by 180 degrees
python pdf_manipulator.py rotate -i input.pdf -o output.pdf --angle 180 --pages 1,3

# Rotate pages 2-4 by 270 degrees
python pdf_manipulator.py rotate -i input.pdf -o output.pdf --angle 270 --pages 2-4
```

#### Reverse Page Order

```bash
python pdf_manipulator.py reverse -i input.pdf -o output.pdf
```

#### Duplicate Pages

```bash
# Duplicate page 2 so it appears 3 times in a row
python pdf_manipulator.py duplicate -i input.pdf -o output.pdf --pages 2 --times 3
```

#### Insert Blank Pages

```bash
# Insert a blank page after page 2
python pdf_manipulator.py insert-blank -i input.pdf -o output.pdf --after 2

# Insert a blank page before page 1
python pdf_manipulator.py insert-blank -i input.pdf -o output.pdf --before 1
```

#### Insert Pages from Another PDF

```bash
# Insert all pages from extra.pdf after page 3 of base.pdf
python pdf_manipulator.py insert -i base.pdf --insert-file extra.pdf -o output.pdf --after 3

# Insert before page 2
python pdf_manipulator.py insert -i base.pdf --insert-file extra.pdf -o output.pdf --before 2
```

#### Replace Pages

```bash
# Replace page 2 of base.pdf with page 1 of new.pdf
python pdf_manipulator.py replace -i base.pdf --replace-file new.pdf -o output.pdf --pages 2 --replace-pages 1
```

#### Crop Pages

```bash
# Crop all pages (coordinates in PDF points: left,bottom,right,top)
python pdf_manipulator.py crop -i input.pdf -o output.pdf --box "50,50,500,700"

# Crop only pages 1-3
python pdf_manipulator.py crop -i input.pdf -o output.pdf --box "50,50,500,700" --pages 1-3
```

#### Scale / Resize Pages

```bash
# Scale all pages to 50% of original size
python pdf_manipulator.py scale -i input.pdf -o output.pdf --factor 0.5

# Resize all pages to A4
python pdf_manipulator.py scale -i input.pdf -o output.pdf --to-size A4

# Resize to US Letter
python pdf_manipulator.py scale -i input.pdf -o output.pdf --to-size letter
```

#### Add Watermark

```bash
# Add a text watermark with defaults (red diagonal, 15% opacity)
python pdf_manipulator.py watermark -i input.pdf -o output.pdf --text "CONFIDENTIAL"

# Custom opacity and angle
python pdf_manipulator.py watermark -i input.pdf -o output.pdf --text "DRAFT" --opacity 0.3 --angle 45

# Use a PDF file as watermark
python pdf_manipulator.py watermark -i input.pdf -o output.pdf --watermark-pdf wm.pdf
```

#### Stamp / Overlay

```bash
# Overlay stamp.pdf on all pages
python pdf_manipulator.py stamp -i input.pdf -o output.pdf --stamp-pdf stamp.pdf

# Overlay on page 1 only
python pdf_manipulator.py stamp -i input.pdf -o output.pdf --stamp-pdf stamp.pdf --pages 1
```

#### Add Page Numbers

```bash
# Add page numbers at bottom center (default)
python pdf_manipulator.py number -i input.pdf -o output.pdf

# Custom position and starting number
python pdf_manipulator.py number -i input.pdf -o output.pdf --position bottom-right --start 1

# Custom format string
python pdf_manipulator.py number -i input.pdf -o output.pdf --position top-right --format "Page {n}"
```

Available positions: `bottom-center`, `bottom-left`, `bottom-right`, `top-center`, `top-left`, `top-right`

#### Encrypt / Decrypt

```bash
# Encrypt with a user password
python pdf_manipulator.py encrypt -i input.pdf -o output.pdf --user-pass mypassword

# Encrypt with both user and owner password
python pdf_manipulator.py encrypt -i input.pdf -o output.pdf --user-pass mypassword --owner-pass ownerpassword

# Decrypt / remove password
python pdf_manipulator.py decrypt -i encrypted.pdf -o decrypted.pdf --password mypassword
```

#### Metadata

```bash
# View metadata
python pdf_manipulator.py metadata -i input.pdf

# Set metadata fields
python pdf_manipulator.py metadata -i input.pdf -o output.pdf \
    --set-title "Annual Report 2024" \
    --set-author "algorembrant" \
    --set-subject "Finance" \
    --set-keywords "annual,report,finance"
```

#### Bookmarks / Outline

```bash
# List all bookmarks
python pdf_manipulator.py bookmarks -i input.pdf

# Add bookmarks
python pdf_manipulator.py bookmarks -i input.pdf -o output.pdf \
    --add "Introduction:1,Chapter 1:3,Chapter 2:8"
```

#### Extract Text

```bash
# Print text from all pages
python pdf_manipulator.py text -i input.pdf

# Extract text from pages 1-3 and save to file
python pdf_manipulator.py text -i input.pdf --pages 1-3 -o extracted.txt
```

#### PDF Info

```bash
python pdf_manipulator.py info -i input.pdf
```

#### N-Up Layout (Requires Poppler)

```bash
# 2 pages side-by-side on one sheet
python pdf_manipulator.py nup -i input.pdf -o output.pdf --layout 2x1

# 4 pages in a 2x2 grid on one sheet
python pdf_manipulator.py nup -i input.pdf -o output.pdf --layout 2x2
```

#### Compress

```bash
python pdf_manipulator.py compress -i input.pdf -o output.pdf
```

#### Batch Operations

```bash
# Remove page 1 (e.g. cover page) from all PDFs in a directory
python pdf_manipulator.py batch-remove --dir ./pdfs --pages 1 --suffix _no_cover

# Merge all PDFs in a directory into one
python pdf_manipulator.py batch-merge --dir ./pdfs -o merged_all.pdf

# Split all PDFs in a directory into individual pages
python pdf_manipulator.py batch-split --dir ./pdfs --out-dir ./split_output
```

---

## Notes

- All page numbers are 1-indexed (first page is page 1).
- The `nup` command requires Poppler to be installed on your system.
- For encrypted PDFs, use the `--password` flag with any command that reads them (decrypt first, or add password support per command as needed).
- Output directories are created automatically if they do not exist.

---

## License

MIT License. See [LICENSE](LICENSE) for details.

---

## Author

**algorembrant**
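
---

## Appendix: Page Range Parsing Sketch

The page range syntax described under Usage (`3`, `1,3,5`, `2-5`, `1,3-5,7`) can be expanded into a list of 1-indexed page numbers roughly as follows. This is an illustrative sketch, not the code in `pdf_manipulator.py`: the function name `parse_page_ranges`, its error handling, and the choice to drop duplicates while keeping first-seen order are all assumptions.

```python
def parse_page_ranges(spec, page_count):
    """Expand a spec such as "1,3-5,7" into a list of 1-indexed page numbers.

    Hypothetical helper: duplicates are dropped, first-seen order is kept,
    and malformed or out-of-range entries raise ValueError.
    """
    pages = []
    seen = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            raise ValueError(f"empty entry in page spec: {spec!r}")
        if "-" in part:
            # "2-5" means pages 2 through 5 inclusive.
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            # A bare number selects a single page.
            start = end = int(part)
        if start > end:
            raise ValueError(f"range runs backwards: {part!r}")
        if start < 1 or end > page_count:
            raise ValueError(f"{part!r} is outside 1-{page_count}")
        for page in range(start, end + 1):
            if page not in seen:
                seen.add(page)
                pages.append(page)
    return pages


if __name__ == "__main__":
    print(parse_page_ranges("1,3-5,7", page_count=10))  # [1, 3, 4, 5, 7]
```

Keeping first-seen order means an order-sensitive spec such as `3,1,2,4` comes back unchanged, while the inclusive `range(start, end + 1)` matches the "2 through 5 inclusive" wording in the syntax table.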