---
tags:
- pdf
- document-processing
- pdf-manipulation
- python
- cli
- automation
language:
- en
license: mit
library_name: pdf-manipulator
pipeline_tag: other
---

# PDF Manipulator

A comprehensive, single-file command-line toolkit for all PDF page manipulation operations. Merge, split, remove, rotate, crop, watermark, encrypt, number, reorder, and batch-process PDF files with a clean and intuitive CLI.

**Author:** algorembrant

---

## Features

| Feature | Command | Description |
|---------|---------|-------------|
| Merge PDFs | `merge` | Combine multiple PDFs into one, or interleave pages |
| Split PDF | `split` | Split into individual pages or page ranges |
| Remove Pages | `remove` | Remove one or more pages by number or range |
| Extract Pages | `extract` | Extract specific pages into a new PDF |
| Reorder Pages | `reorder` | Rearrange pages in any custom order |
| Rotate Pages | `rotate` | Rotate pages by 90, 180, or 270 degrees |
| Reverse Pages | `reverse` | Reverse the page order |
| Duplicate Pages | `duplicate` | Duplicate specific pages N times |
| Insert Blank Page | `insert-blank` | Insert blank page before or after a position |
| Insert PDF Pages | `insert` | Insert pages from another PDF at a position |
| Replace Pages | `replace` | Replace pages with pages from another PDF |
| Crop Pages | `crop` | Crop pages to a custom bounding box |
| Scale / Resize | `scale` | Scale pages by factor or resize to A4/letter |
| Watermark | `watermark` | Add text or PDF watermark to all pages |
| Stamp / Overlay | `stamp` | Overlay a stamp PDF on pages |
| Page Numbers | `number` | Add page numbers at any position |
| Encrypt | `encrypt` | Password-protect a PDF |
| Decrypt | `decrypt` | Remove password from a PDF |
| Metadata | `metadata` | View or edit PDF title, author, subject, keywords |
| Bookmarks | `bookmarks` | List or add bookmark/outline entries |
| Extract Text | `text` | Extract plain text from pages |
| Info / Inspect | `info` | Display page count, dimensions, and metadata |
| N-Up Layout | `nup` | Arrange multiple pages per sheet (2x1, 2x2, etc.) |
| Compress | `compress` | Losslessly compress PDF streams |
| Batch Remove | `batch-remove` | Remove pages from all PDFs in a directory |
| Batch Merge | `batch-merge` | Merge all PDFs in a directory into one |
| Batch Split | `batch-split` | Split all PDFs in a directory into pages |

---

## Requirements

- Python 3.9 or newer
- System dependency: **Poppler** (required for `nup` command only)

---

## Installation

### 1. Clone the Repository

```bash
git clone https://github.com/algorembrant/pdf-manipulator.git
cd pdf-manipulator
```

### 2. Install Python Dependencies

```bash
pip install -r requirements.txt
```

### 3. Install Poppler (Required for N-Up Layout)

The `nup` command uses `pdf2image`, which requires Poppler to be installed on your system.

| Platform | Install Command |
|----------|-----------------|
| Ubuntu/Debian | `sudo apt-get install -y poppler-utils` |
| macOS | `brew install poppler` |
| Windows | Download from https://github.com/oschwartz10612/poppler-windows/releases and add `bin/` to your PATH |

If you do not need the `nup` command, Poppler is not required.

---

## Usage

### Page Range Syntax

| Syntax | Meaning |
|--------|---------|
| `3` | Page 3 only |
| `1,3,5` | Pages 1, 3, and 5 |
| `2-5` | Pages 2 through 5 inclusive |
| `1,3-5,7` | Pages 1, 3, 4, 5, and 7 |

Pages are always 1-indexed (first page = 1).

---

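For readers scripting around the tool, the page range syntax above could be expanded as in the minimal sketch below. `parse_page_ranges` is a hypothetical helper written for illustration, not necessarily how `pdf_manipulator.py` parses its arguments; rejecting page 0 and reversed ranges such as `5-2` is an assumption.

```python
def parse_page_ranges(spec: str) -> list[int]:
    """Expand a spec such as "1,3-5,7" into 1-indexed page numbers.

    Hypothetical helper for illustration; the tool's own parser may
    treat duplicates or reversed ranges differently.
    """
    pages: list[int] = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            raise ValueError(f"empty entry in page spec {spec!r}")
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        # Assumption: pages start at 1 and ranges must be ascending
        if start < 1 or end < start:
            raise ValueError(f"invalid page range {part!r}")
        pages.extend(range(start, end + 1))
    return pages


print(parse_page_ranges("1,3-5,7"))  # [1, 3, 4, 5, 7]
```
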
### Step-by-Step Guide

#### Merge PDFs

```bash
# Merge two or more PDFs in order
python pdf_manipulator.py merge -i file1.pdf file2.pdf file3.pdf -o merged.pdf

# Interleave pages (page 1 from file1, page 1 from file2, page 2 from file1, ...)
python pdf_manipulator.py merge -i file1.pdf file2.pdf -o interleaved.pdf --interleave
```

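The interleaved page order described in the comment above can be sketched in plain Python. `interleave_order` is a hypothetical name, and the handling of inputs with different page counts (leftover pages of the longer file are appended) is an assumption; the tool itself may stop at the shortest file instead.

```python
from itertools import zip_longest


def interleave_order(page_counts: list[int]) -> list[tuple[int, int]]:
    """Return (file_index, page_number) pairs in interleaved order.

    Assumption: once a shorter file runs out, the remaining pages of
    the longer files simply follow in order.
    """
    columns = [range(1, count + 1) for count in page_counts]
    order = []
    for row in zip_longest(*columns):
        for file_index, page in enumerate(row):
            if page is not None:  # None marks an exhausted file
                order.append((file_index, page))
    return order


# file1.pdf has 3 pages, file2.pdf has 2 pages
print(interleave_order([3, 2]))
# [(0, 1), (1, 1), (0, 2), (1, 2), (0, 3)]
```
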
#### Split PDF

```bash
# Split into individual pages (saved to a directory)
python pdf_manipulator.py split -i input.pdf -o ./split_pages

# Extract a range of pages into a single file
python pdf_manipulator.py split -i input.pdf -o ./split_pages --range 1-5
python pdf_manipulator.py split -i input.pdf -o ./split_pages --range 2,4,6
```

#### Remove Pages

```bash
# Remove page 3
python pdf_manipulator.py remove -i input.pdf -o output.pdf --pages 3

# Remove multiple pages
python pdf_manipulator.py remove -i input.pdf -o output.pdf --pages 1,3,5

# Remove a range of pages
python pdf_manipulator.py remove -i input.pdf -o output.pdf --pages 2-5

# Remove mixed selection
python pdf_manipulator.py remove -i input.pdf -o output.pdf --pages 1,3-5,7
```

#### Extract Pages

```bash
# Extract pages 1-3 into a new PDF
python pdf_manipulator.py extract -i input.pdf -o output.pdf --pages 1-3

# Extract specific pages
python pdf_manipulator.py extract -i input.pdf -o output.pdf --pages 2,4,6
```

#### Reorder Pages

```bash
# Place page 3 first, then page 1, then page 2, then page 4
python pdf_manipulator.py reorder -i input.pdf -o output.pdf --order 3,1,2,4
```

#### Rotate Pages

```bash
# Rotate all pages 90 degrees clockwise
python pdf_manipulator.py rotate -i input.pdf -o output.pdf --angle 90

# Rotate only pages 1 and 3 by 180 degrees
python pdf_manipulator.py rotate -i input.pdf -o output.pdf --angle 180 --pages 1,3

# Rotate pages 2-4 by 270 degrees
python pdf_manipulator.py rotate -i input.pdf -o output.pdf --angle 270 --pages 2-4
```

#### Reverse Page Order

```bash
python pdf_manipulator.py reverse -i input.pdf -o output.pdf
```

#### Duplicate Pages

```bash
# Duplicate page 2 so it appears 3 times in a row
python pdf_manipulator.py duplicate -i input.pdf -o output.pdf --pages 2 --times 3
```

#### Insert Blank Pages

```bash
# Insert a blank page after page 2
python pdf_manipulator.py insert-blank -i input.pdf -o output.pdf --after 2

# Insert a blank page before page 1
python pdf_manipulator.py insert-blank -i input.pdf -o output.pdf --before 1
```

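To make the `--after` / `--before` positions concrete, the sketch below maps a 1-indexed position to a 0-based list index. `insertion_index` is a hypothetical helper, and requiring exactly one of the two options is an assumption about the CLI rather than documented behaviour.

```python
from typing import Optional


def insertion_index(page_count: int, after: Optional[int] = None,
                    before: Optional[int] = None) -> int:
    """Map a 1-indexed --after/--before position to a 0-based list index."""
    # Assumption: exactly one of the two positions is supplied
    if (after is None) == (before is None):
        raise ValueError("give exactly one of after or before")
    index = after if after is not None else before - 1
    if not 0 <= index <= page_count:
        raise ValueError("position is outside the document")
    return index


pages = ["p1", "p2", "p3"]
pages.insert(insertion_index(len(pages), after=2), "blank")
print(pages)  # ['p1', 'p2', 'blank', 'p3']
```
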
#### Insert Pages from Another PDF

```bash
# Insert all pages from extra.pdf after page 3 of base.pdf
python pdf_manipulator.py insert -i base.pdf --insert-file extra.pdf -o output.pdf --after 3

# Insert before page 2
python pdf_manipulator.py insert -i base.pdf --insert-file extra.pdf -o output.pdf --before 2
```

#### Replace Pages

```bash
# Replace page 2 of base.pdf with page 1 of new.pdf
python pdf_manipulator.py replace -i base.pdf --replace-file new.pdf -o output.pdf --pages 2 --replace-pages 1
```

#### Crop Pages

```bash
# Crop all pages (coordinates in PDF points: left,bottom,right,top)
python pdf_manipulator.py crop -i input.pdf -o output.pdf --box "50,50,500,700"

# Crop only pages 1-3
python pdf_manipulator.py crop -i input.pdf -o output.pdf --box "50,50,500,700" --pages 1-3
```

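Crop coordinates are given in PDF points, and one point is 1/72 of an inch. The sketch below converts millimetre measurements into a `left,bottom,right,top` string. `crop_box_arg` is a hypothetical helper; rounding to whole points is an assumption, since it is not stated whether `--box` accepts decimals.

```python
MM_PER_INCH = 25.4
POINTS_PER_INCH = 72


def mm_to_points(mm: float) -> float:
    """Convert millimetres to PDF points (1 pt = 1/72 inch)."""
    return mm * POINTS_PER_INCH / MM_PER_INCH


def crop_box_arg(left_mm: float, bottom_mm: float,
                 right_mm: float, top_mm: float) -> str:
    """Format a left,bottom,right,top box in whole points."""
    # Assumption: whole points are precise enough for the crop box
    values = (left_mm, bottom_mm, right_mm, top_mm)
    return ",".join(str(round(mm_to_points(value))) for value in values)


# Keep a 20 mm margin inside an A4 page (210 mm x 297 mm)
print(crop_box_arg(20, 20, 190, 277))  # 57,57,539,785
```

The resulting string can be passed as the value of `--box`.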
#### Scale / Resize Pages

```bash
# Scale all pages to 50% of original size
python pdf_manipulator.py scale -i input.pdf -o output.pdf --factor 0.5

# Resize all pages to A4
python pdf_manipulator.py scale -i input.pdf -o output.pdf --to-size A4

# Resize to US Letter
python pdf_manipulator.py scale -i input.pdf -o output.pdf --to-size letter
```

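The relationship between a named size and a scale factor can be sketched as follows. The page dimensions are the standard A4 and US Letter sizes in points; `fit_factor` is a hypothetical helper, and preserving the aspect ratio is an assumption, since the tool may stretch pages to the target size instead.

```python
PAGE_SIZES_PT = {
    "A4": (595.28, 841.89),    # 210 mm x 297 mm
    "letter": (612.0, 792.0),  # 8.5 in x 11 in
}


def fit_factor(width: float, height: float, target: str) -> float:
    """Largest uniform scale at which the page still fits the target size."""
    target_width, target_height = PAGE_SIZES_PT[target]
    return min(target_width / width, target_height / height)


# A US Letter page scaled to fit A4 is limited by its width
print(round(fit_factor(612.0, 792.0, "A4"), 4))  # 0.9727
```
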
#### Add Watermark

```bash
# Add a text watermark with defaults (red diagonal, 15% opacity)
python pdf_manipulator.py watermark -i input.pdf -o output.pdf --text "CONFIDENTIAL"

# Custom opacity and angle
python pdf_manipulator.py watermark -i input.pdf -o output.pdf --text "DRAFT" --opacity 0.3 --angle 45

# Use a PDF file as watermark
python pdf_manipulator.py watermark -i input.pdf -o output.pdf --watermark-pdf wm.pdf
```

#### Stamp / Overlay

```bash
# Overlay stamp.pdf on all pages
python pdf_manipulator.py stamp -i input.pdf -o output.pdf --stamp-pdf stamp.pdf

# Overlay on page 1 only
python pdf_manipulator.py stamp -i input.pdf -o output.pdf --stamp-pdf stamp.pdf --pages 1
```

#### Add Page Numbers

```bash
# Add page numbers at bottom center (default)
python pdf_manipulator.py number -i input.pdf -o output.pdf

# Custom position and starting number
python pdf_manipulator.py number -i input.pdf -o output.pdf --position bottom-right --start 1

# Custom format string
python pdf_manipulator.py number -i input.pdf -o output.pdf --position top-right --format "Page {n}"
```

Available positions: `bottom-center`, `bottom-left`, `bottom-right`, `top-center`, `top-left`, `top-right`.

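As a rough illustration of how `--start` and `--format` might combine, the sketch below builds one label per page. It assumes the `{n}` placeholder behaves like a Python `str.format` field and that the default label is the bare number; `page_labels` is a hypothetical helper, not the tool's code.

```python
def page_labels(page_count: int, start: int = 1, fmt: str = "{n}") -> list[str]:
    """Build the label for each page; {n} is replaced by the page number."""
    # Assumption: {n} is substituted with str.format semantics
    return [fmt.format(n=start + offset) for offset in range(page_count)]


print(page_labels(3, fmt="Page {n}"))  # ['Page 1', 'Page 2', 'Page 3']
print(page_labels(3, start=5))         # ['5', '6', '7']
```
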
#### Encrypt / Decrypt

```bash
# Encrypt with a user password
python pdf_manipulator.py encrypt -i input.pdf -o output.pdf --user-pass mypassword

# Encrypt with both user and owner password
python pdf_manipulator.py encrypt -i input.pdf -o output.pdf --user-pass mypassword --owner-pass ownerpassword

# Decrypt / remove password
python pdf_manipulator.py decrypt -i encrypted.pdf -o decrypted.pdf --password mypassword
```

#### Metadata

```bash
# View metadata
python pdf_manipulator.py metadata -i input.pdf

# Set metadata fields
python pdf_manipulator.py metadata -i input.pdf -o output.pdf \
  --set-title "Annual Report 2024" \
  --set-author "algorembrant" \
  --set-subject "Finance" \
  --set-keywords "annual,report,finance"
```

#### Bookmarks / Outline

```bash
# List all bookmarks
python pdf_manipulator.py bookmarks -i input.pdf

# Add bookmarks
python pdf_manipulator.py bookmarks -i input.pdf -o output.pdf \
  --add "Introduction:1,Chapter 1:3,Chapter 2:8"
```

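The `--add` value is a comma-separated list of `Title:page` entries. A minimal sketch of splitting such a string is shown below; `parse_bookmarks` is a hypothetical helper, and splitting on the last colon of each entry is an assumption about how titles containing colons would be treated.

```python
def parse_bookmarks(spec: str) -> list[tuple[str, int]]:
    """Split "Title:page,Title:page" into (title, 1-indexed page) pairs."""
    bookmarks = []
    for entry in spec.split(","):
        # rpartition splits on the last colon, so titles may contain colons
        title, _, page_text = entry.rpartition(":")
        if not title.strip():
            raise ValueError(f"bookmark entry {entry!r} needs a TITLE:PAGE form")
        bookmarks.append((title.strip(), int(page_text)))
    return bookmarks


print(parse_bookmarks("Introduction:1,Chapter 1:3,Chapter 2:8"))
# [('Introduction', 1), ('Chapter 1', 3), ('Chapter 2', 8)]
```

With this format a title cannot itself contain a comma, because the comma separates entries.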
#### Extract Text

```bash
# Print text from all pages
python pdf_manipulator.py text -i input.pdf

# Extract text from pages 1-3 and save to file
python pdf_manipulator.py text -i input.pdf --pages 1-3 -o extracted.txt
```

#### PDF Info

```bash
python pdf_manipulator.py info -i input.pdf
```

#### N-Up Layout (Requires Poppler)

```bash
# 2 pages side-by-side on one sheet
python pdf_manipulator.py nup -i input.pdf -o output.pdf --layout 2x1

# 4 pages in a 2x2 grid on one sheet
python pdf_manipulator.py nup -i input.pdf -o output.pdf --layout 2x2
```

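The geometry behind an N-up layout can be sketched without any PDF library. The example reads the layout string as COLUMNSxROWS, which is consistent with `2x1` placing two pages side by side, and fills cells left to right, top row first. `nup_cells` and the fill order are assumptions for illustration; the tool's actual rendering goes through `pdf2image` and may differ.

```python
def nup_cells(layout: str, sheet_width: float, sheet_height: float):
    """Return (left, bottom, width, height) for each cell in reading order.

    Assumption: the layout string is COLUMNSxROWS, so "2x1" means two
    columns in a single row.
    """
    columns, rows = (int(value) for value in layout.lower().split("x"))
    cell_width = sheet_width / columns
    cell_height = sheet_height / rows
    cells = []
    for row in range(rows):  # top row first
        for column in range(columns):
            left = column * cell_width
            bottom = sheet_height - (row + 1) * cell_height
            cells.append((left, bottom, cell_width, cell_height))
    return cells


# Cell rectangles for a 2x2 grid on a 600 x 800 point sheet
for cell in nup_cells("2x2", 600, 800):
    print(cell)
```
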
#### Compress

```bash
python pdf_manipulator.py compress -i input.pdf -o output.pdf
```

#### Batch Operations

```bash
# Remove page 1 (e.g. cover page) from all PDFs in a directory
python pdf_manipulator.py batch-remove --dir ./pdfs --pages 1 --suffix _no_cover

# Merge all PDFs in a directory into one
python pdf_manipulator.py batch-merge --dir ./pdfs -o merged_all.pdf

# Split all PDFs in a directory into individual pages
python pdf_manipulator.py batch-split --dir ./pdfs --out-dir ./split_output
```

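One plausible reading of `--suffix` is that it is inserted before the file extension of each output. The sketch below shows that naming rule with `pathlib`; `suffixed_name` is a hypothetical helper, and both the naming rule and writing outputs next to the inputs are assumptions.

```python
from pathlib import Path


def suffixed_name(pdf_path: Path, suffix: str) -> Path:
    """Insert a suffix before the extension: report.pdf -> report_no_cover.pdf."""
    # Assumption: the output stays in the same directory as the input
    return pdf_path.with_name(f"{pdf_path.stem}{suffix}{pdf_path.suffix}")


print(suffixed_name(Path("pdfs/report.pdf"), "_no_cover").name)
# report_no_cover.pdf
```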
---

## Notes

- All page numbers are 1-indexed (first page is page 1).
- The `nup` command requires Poppler to be installed on your system.
- For encrypted PDFs, supply the password with the `--password` flag. If a command does not accept that flag, run `decrypt` first or add password support to that command as needed.
- Output directories are created automatically if they do not exist.

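The Poppler requirement noted above can be checked from Python before running `nup`. The sketch looks for `pdftoppm`, one of the executables shipped with Poppler; `poppler_available` is a hypothetical helper and is not part of `pdf_manipulator.py`.

```python
import shutil


def poppler_available() -> bool:
    """Report whether Poppler's pdftoppm executable is on PATH."""
    return shutil.which("pdftoppm") is not None


if not poppler_available():
    print("Poppler not found: install it before running the nup command")
```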
---

## License

MIT License. See [LICENSE](LICENSE) for details.

---

## Author

**algorembrant**