| title: HTML to PDF Converter | |
| emoji: π | |
| colorFrom: purple | |
| colorTo: blue | |
| sdk: docker | |
| app_port: 7860 | |
| health_check: | |
|   path: /health | |
| # HTML to PDF Converter API | |
| Convert HTML files to PDF with automatic image embedding and page break management. Perfect for generating reports, presentations, and documents from HTML. | |
| ## Quick Start | |
| ### Basic Conversion (HTML only) | |
| ```bash | |
| curl -X POST https://abdallalswaiti-htmlpdfs.hf.space/convert \ | |
| -F "html_file=@your_file.html" \ | |
| -o output.pdf | |
| ``` | |
| ### With Images | |
| ```bash | |
| curl -X POST https://abdallalswaiti-htmlpdfs.hf.space/convert \ | |
| -F "html_file=@report.html" \ | |
| -F "images=@image1.png" \ | |
| -F "images=@image2.jpg" \ | |
| -F "images=@logo.svg" \ | |
| -o output.pdf | |
| ``` | |
| ### Custom Aspect Ratio | |
| ```bash | |
| curl -X POST https://abdallalswaiti-htmlpdfs.hf.space/convert \ | |
| -F "html_file=@presentation.html" \ | |
| -F "aspect_ratio=16:9" \ | |
| -F "auto_detect=false" \ | |
| -o slides.pdf | |
| ``` | |
| ## API Endpoints | |
| ### `POST /convert` | |
| Convert HTML file to PDF with optional images. | |
| **Parameters:** | |
| - `html_file` (required): HTML file to convert | |
| - `images` (optional): Image files referenced in HTML (can upload multiple) | |
| - `aspect_ratio` (optional): `16:9`, `1:1`, or `9:16` | |
| - `auto_detect` (optional): Auto-detect aspect ratio from HTML (default: `true`) | |
| **Response:** | |
| - PDF file (application/pdf) | |
| - Headers include metadata: aspect ratio, number of normalized image paths, PDF size | |
| ### `POST /convert-string` | |
| Convert HTML string to PDF (for HTML without external images). | |
| **Parameters:** | |
| - `html_content` (required): HTML content as string | |
| - `aspect_ratio` (optional): `16:9`, `1:1`, or `9:16` | |
| - `auto_detect` (optional): Auto-detect aspect ratio (default: `true`) | |
| **Example:** | |
| ```bash | |
| curl -X POST https://abdallalswaiti-htmlpdfs.hf.space/convert-string \ | |
| --form-string "html_content=<html><body><h1>Hello World</h1></body></html>" \ | |
| -o output.pdf | |
| ``` | |
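Where `requests` is not installed, the same call can be assembled with the Python standard library. The following is a minimal sketch rather than an official client: `build_convert_string_request` is a hypothetical helper name, and it assumes the endpoint accepts `multipart/form-data` fields exactly as the curl example sends them.

```python
import urllib.request
import uuid

API_URL = "https://abdallalswaiti-htmlpdfs.hf.space/convert-string"


def build_convert_string_request(html_content, aspect_ratio=None, auto_detect=None):
    """Build (but do not send) a multipart POST for /convert-string."""
    fields = {"html_content": html_content}
    if aspect_ratio is not None:
        fields["aspect_ratio"] = aspect_ratio
    if auto_detect is not None:
        fields["auto_detect"] = "true" if auto_detect else "false"

    # Hand-rolled multipart body: one part per text field, CRLF line endings
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
    parts.append(f"--{boundary}--\r\n")
    body = "".join(parts).encode("utf-8")

    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


request = build_convert_string_request("<h1>Hello World</h1>", aspect_ratio="1:1")
```

To send it, pass `request` to `urllib.request.urlopen(request, timeout=90)` and write the bytes returned by `.read()` to a `.pdf` file.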
| ### `GET /health` | |
| Health check endpoint. | |
| ```bash | |
| curl https://abdallalswaiti-htmlpdfs.hf.space/health | |
| ``` | |
| ## Features | |
| ### Automatic Image Path Normalization | |
| The API automatically converts complex image paths to simple filenames: | |
| **Before:** | |
| ```html | |
| <img src="../../../assets/images/logo.png"> | |
| <img src="images/photo.jpg"> | |
| ``` | |
| **After (automatically):** | |
| ```html | |
| <img src="logo.png"> | |
| <img src="photo.jpg"> | |
| ``` | |
| Just upload your images with the `images` parameter, and they'll work! | |
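The rewrite shown above can be approximated locally to preview what the service will see. This is only an illustrative sketch, not the service's code: `normalize_img_paths` is a hypothetical name, the regex handles plain quoted `src` attributes only, and leaving remote (`https://`, `//`) and `data:` sources untouched is an assumption.

```python
import posixpath
import re

# Matches src="..." or src='...' inside an <img> tag (deliberately simple)
IMG_SRC = re.compile(r'(<img\b[^>]*?\bsrc\s*=\s*)(["\'])(.*?)\2', re.IGNORECASE)


def normalize_img_paths(html):
    """Rewrite every local <img src> to its bare filename; return (html, count)."""
    count = 0

    def replace(match):
        nonlocal count
        prefix, quote, src = match.groups()
        # Assumption: remote and inline images are left as they are
        if re.match(r"(?:[a-z][a-z0-9+.-]*:|//)", src, re.IGNORECASE):
            return match.group(0)
        filename = posixpath.basename(src)
        if filename and filename != src:
            count += 1
            return f"{prefix}{quote}{filename}{quote}"
        return match.group(0)

    return IMG_SRC.sub(replace, html), count
```

The returned count is the number of paths that changed, which is the kind of figure the `X-Path-Replacements` response header reports.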
| ### Aspect Ratio Detection | |
| The API automatically detects aspect ratio from: | |
| - HTML `<meta name="viewport">` tags | |
| - CSS `aspect-ratio` properties | |
| - Keywords like "presentation", "slide" | |
| Supported ratios: | |
| - **16:9** - Landscape (presentations, slides) → A4 Landscape | |
| - **9:16** - Portrait (reports, documents) → A4 Portrait | |
| - **1:1** - Square (social media posts) → 210mm × 210mm | |
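To get a feel for how such detection could behave, here is a rough heuristic built from the three signals listed above. It is a sketch under assumptions, not the service's algorithm: the function name `guess_aspect_ratio`, the order in which signals are checked, and the `9:16` fallback are all illustrative choices.

```python
import re


def guess_aspect_ratio(html, default="9:16"):
    """Guess '16:9', '9:16' or '1:1' from the HTML (illustrative heuristic)."""
    text = html.lower()

    # 1. An explicit CSS declaration such as "aspect-ratio: 16 / 9"
    match = re.search(r"aspect-ratio\s*:\s*(\d+)\s*/\s*(\d+)", text)
    if match:
        width, height = int(match.group(1)), int(match.group(2))
        if width == height:
            return "1:1"
        return "16:9" if width > height else "9:16"

    # 2. An orientation hint inside a viewport meta tag
    viewport = re.search(r'<meta[^>]*name=["\']viewport["\'][^>]*>', text)
    if viewport:
        if "landscape" in viewport.group(0):
            return "16:9"
        if "portrait" in viewport.group(0):
            return "9:16"

    # 3. Keywords that suggest a slide deck
    if re.search(r"\b(presentation|slides?)\b", text):
        return "16:9"

    return default  # assumption: fall back to portrait
```

If a guess like this disagrees with what you want, passing `aspect_ratio` together with `auto_detect=false` sidesteps detection entirely.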
| ### Automatic Page Breaks | |
| The API intelligently handles page breaks: | |
| - Elements with classes: `.page`, `.slide`, `section.page` | |
| - Top-level `<section>`, `<article>`, `<div>` elements | |
| - Prevents breaking inside: headings, images, tables, code blocks | |
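Rules of this kind are usually expressed as print CSS. The sketch below shows one way to write them and inject them into a document before conversion; the stylesheet is an assumption about what such rules look like (the service's own CSS is not shown in this README), it covers only the class-based selectors, and `inject_print_css` is a hypothetical helper.

```python
PRINT_CSS = """
@media print {
  /* Start a new page after each page-like container */
  .page, .slide, section.page { break-after: page; page-break-after: always; }
  /* Avoid a trailing blank page after the last container */
  .page:last-child, .slide:last-child { break-after: auto; page-break-after: auto; }
  /* Keep headings, images, tables and code blocks in one piece */
  h1, h2, h3, h4, h5, h6, img, table, pre { break-inside: avoid; page-break-inside: avoid; }
}
"""


def inject_print_css(html):
    """Insert PRINT_CSS before </head>, or prepend it if there is no <head>."""
    style = f"<style>{PRINT_CSS}</style>"
    index = html.lower().find("</head>")
    if index == -1:
        return style + html
    return html[:index] + style + html[index:]
```

Both the modern `break-*` properties and the legacy `page-break-*` aliases are included so the rules apply across rendering engines.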
| ### Color Preservation | |
| All colors, backgrounds, and gradients are preserved in the PDF with `print-color-adjust: exact`. | |
| ## Usage Examples | |
| ### Example 1: Simple Report | |
| ```bash | |
| curl -X POST https://abdallalswaiti-htmlpdfs.hf.space/convert \ | |
| -F "html_file=@report.html" \ | |
| -o report.pdf | |
| ``` | |
| ### Example 2: Presentation with Images | |
| ```bash | |
| curl -X POST https://abdallalswaiti-htmlpdfs.hf.space/convert \ | |
| -F "html_file=@slides.html" \ | |
| -F "images=@chart1.png" \ | |
| -F "images=@chart2.png" \ | |
| -F "images=@logo.svg" \ | |
| -F "aspect_ratio=16:9" \ | |
| -o presentation.pdf | |
| ``` | |
| ### Example 3: Multiple Images from Directory | |
| ```bash | |
| curl -X POST https://abdallalswaiti-htmlpdfs.hf.space/convert \ | |
| -F "html_file=@document.html" \ | |
| $(for img in images/*.{png,jpg}; do echo "-F images=@$img"; done) \ | |
| -o document.pdf | |
| ``` | |
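The shell loop above splits on whitespace, so filenames containing spaces will break it. A Python equivalent avoids that. This is a small sketch with hypothetical helper names; the suffix list mirrors the formats this README says are supported, and the tuple layout is the one `requests` accepts for repeated file fields.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif", ".svg", ".webp", ".bmp"}


def collect_image_paths(directory):
    """Return the image files directly inside `directory`, sorted by name."""
    return sorted(
        path
        for path in Path(directory).iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
    )


def build_image_fields(paths):
    """Build repeated ("images", (filename, bytes)) pairs for a multipart upload."""
    return [("images", (path.name, path.read_bytes())) for path in paths]
```

With `requests`, the result can be combined with the HTML part, for example `files=[("html_file", ("document.html", html_bytes))] + build_image_fields(collect_image_paths("images"))`.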
| ### Example 4: Python Script | |
| ```python | |
| import requests | |
| # Optional parameters | |
| data = { | |
|     'aspect_ratio': '9:16', | |
|     'auto_detect': 'false', | |
| } | |
| # Open every upload in a context manager so the handles are closed afterwards | |
| with open('report.html', 'rb') as html, \ | |
|         open('image1.png', 'rb') as image1, \ | |
|         open('image2.jpg', 'rb') as image2: | |
|     # A list of (field, file) tuples lets the `images` field repeat | |
|     files = [ | |
|         ('html_file', html), | |
|         ('images', image1), | |
|         ('images', image2), | |
|     ] | |
|     # Make request | |
|     response = requests.post( | |
|         'https://abdallalswaiti-htmlpdfs.hf.space/convert', | |
|         files=files, | |
|         data=data, | |
|         timeout=90,  # the service allows up to 60 seconds per conversion | |
|     ) | |
| # Save PDF | |
| if response.status_code == 200: | |
|     with open('output.pdf', 'wb') as f: | |
|         f.write(response.content) | |
|     print("PDF generated successfully!") | |
| else: | |
|     print(f"Error: {response.status_code}") | |
|     print(response.text) | |
| ``` | |
| ### Example 5: JavaScript/Node.js | |
| ```javascript | |
| const FormData = require('form-data'); | |
| const fs = require('fs'); | |
| // require() works with node-fetch v2; v3 is ESM-only | |
| const fetch = require('node-fetch'); | |
| async function convertToPDF() { | |
|   const form = new FormData(); | |
|   // Add HTML file | |
|   form.append('html_file', fs.createReadStream('report.html')); | |
|   // Add images | |
|   form.append('images', fs.createReadStream('image1.png')); | |
|   form.append('images', fs.createReadStream('image2.jpg')); | |
|   // Optional parameters | |
|   form.append('aspect_ratio', '9:16'); | |
|   const response = await fetch( | |
|     'https://abdallalswaiti-htmlpdfs.hf.space/convert', | |
|     { | |
|       method: 'POST', | |
|       body: form | |
|     } | |
|   ); | |
|   if (response.ok) { | |
|     const buffer = await response.arrayBuffer(); | |
|     fs.writeFileSync('output.pdf', Buffer.from(buffer)); | |
|     console.log('PDF generated successfully!'); | |
|   } else { | |
|     console.error('Error:', await response.text()); | |
|   } | |
| } | |
| convertToPDF(); | |
| ``` | |
| ## HTML Best Practices | |
| ### For Multi-Page Documents | |
| Use page classes to control page breaks: | |
| ```html | |
| <div class="page"> | |
|   <h1>Page 1</h1> | |
|   <p>Content here...</p> | |
| </div> | |
| <div class="page"> | |
|   <h1>Page 2</h1> | |
|   <p>More content...</p> | |
| </div> | |
| ``` | |
| ### For Presentations (16:9) | |
| ```html | |
| <!DOCTYPE html> | |
| <html> | |
| <head> | |
|   <meta name="viewport" content="width=device-width, initial-scale=1.0, orientation=landscape"> | |
|   <style> | |
|     .slide { | |
|       width: 100vw; | |
|       height: 100vh; | |
|       display: flex; | |
|       flex-direction: column; | |
|       justify-content: center; | |
|       align-items: center; | |
|     } | |
|   </style> | |
| </head> | |
| <body> | |
|   <div class="slide"> | |
|     <h1>Slide 1</h1> | |
|     <img src="chart.png" alt="Chart"> | |
|   </div> | |
|   <div class="slide"> | |
|     <h1>Slide 2</h1> | |
|     <img src="graph.png" alt="Graph"> | |
|   </div> | |
| </body> | |
| </html> | |
| ``` | |
| ### For Reports (9:16) | |
| ```html | |
| <!DOCTYPE html> | |
| <html> | |
| <head> | |
|   <meta name="viewport" content="width=device-width, initial-scale=1.0, orientation=portrait"> | |
|   <style> | |
|     body { | |
|       font-family: Arial, sans-serif; | |
|       padding: 20px; | |
|     } | |
|     .page { | |
|       min-height: 100vh; | |
|     } | |
|   </style> | |
| </head> | |
| <body> | |
|   <section class="page"> | |
|     <h1>Annual Report 2024</h1> | |
|     <img src="logo.png" alt="Logo" style="width: 200px;"> | |
|     <p>Report content...</p> | |
|   </section> | |
| </body> | |
| </html> | |
| ``` | |
| ## Image Handling | |
| ### Supported Formats | |
| - PNG, JPG/JPEG | |
| - GIF, SVG | |
| - WebP, BMP | |
| ### Image Path Examples | |
| Your HTML can have **any** of these formats: | |
| ```html | |
| <!-- All of these work! --> | |
| <img src="logo.png"> | |
| <img src="images/logo.png"> | |
| <img src="../../../assets/images/logo.png"> | |
| <img src="./photos/image.jpg"> | |
| <!-- CSS backgrounds too --> | |
| <div style="background-image: url('bg.jpg')"></div> | |
| <div style="background-image: url('../images/bg.jpg')"></div> | |
| ``` | |
| Just upload the images: | |
| ```bash | |
| curl -X POST https://abdallalswaiti-htmlpdfs.hf.space/convert \ | |
| -F "html_file=@index.html" \ | |
| -F "images=@logo.png" \ | |
| -F "images=@bg.jpg" \ | |
| -o output.pdf | |
| ``` | |
| The API automatically: | |
| 1. Extracts filenames from paths | |
| 2. Normalizes all references to simple filenames | |
| 3. Saves images to the same directory as HTML | |
| 4. Generates PDF with all images embedded | |
| ## Troubleshooting | |
| ### Images Not Showing | |
| - Ensure image filenames match exactly (case-sensitive) | |
| - Upload ALL images referenced in your HTML | |
| - Check the `X-Path-Replacements` response header to confirm that image paths were normalized (the API does this automatically) | |
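A quick local check can catch the first two causes before uploading. The sketch below lists the image filenames an HTML document references and reports any that have no exact, case-sensitive match among the files you plan to upload. The helper names are hypothetical, and the regex only covers quoted `src` attributes and CSS `url(...)` values.

```python
import posixpath
import re

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif", ".svg", ".webp", ".bmp"}
REFERENCE = re.compile(
    r"""src\s*=\s*["']([^"']+)["']|url\(\s*["']?([^"')]+?)["']?\s*\)""",
    re.IGNORECASE,
)


def referenced_filenames(html):
    """Collect the bare filenames of local images referenced in the HTML."""
    names = set()
    for src, css_url in REFERENCE.findall(html):
        ref = (src or css_url).strip()
        if re.match(r"(?:[a-z][a-z0-9+.-]*:|//)", ref, re.IGNORECASE):
            continue  # remote or inline (data:) reference
        name = posixpath.basename(ref)
        if posixpath.splitext(name)[1].lower() in IMAGE_SUFFIXES:
            names.add(name)
    return names


def missing_uploads(html, uploaded_names):
    """Referenced filenames with no exact (case-sensitive) match among uploads."""
    return sorted(referenced_filenames(html) - set(uploaded_names))
```

For instance, a document that references `Logo.png` while the upload is named `logo.png` will be flagged, which is exactly the case-sensitivity mismatch described above.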
| ### Wrong Aspect Ratio | |
| - Set `auto_detect=false` and specify `aspect_ratio` manually | |
| - Check the HTML for viewport meta tags that may steer auto-detection toward a different ratio | |
| ### Page Breaks in Wrong Places | |
| - Add `class="no-page-break"` to elements that should stay together | |
| - Use `class="page-break"` to force breaks at specific points | |
| ### PDF Too Large | |
| - Optimize images before uploading (compress, resize) | |
| - Use appropriate image formats (WebP for photos, PNG for graphics) | |
| ## Response Headers | |
| The API includes useful metadata in response headers: | |
| - `X-Aspect-Ratio`: Detected or specified aspect ratio | |
| - `X-Path-Replacements`: Number of image paths normalized | |
| - `X-PDF-Size`: Size of generated PDF in bytes | |
| **Example:** | |
| ```bash | |
| curl -s -D - -o output.pdf -X POST https://abdallalswaiti-htmlpdfs.hf.space/convert \ | |
| -F "html_file=@test.html" | |
| # Response headers: | |
| # X-Aspect-Ratio: 9:16 | |
| # X-Path-Replacements: 3 | |
| # X-PDF-Size: 245678 | |
| ``` | |
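In a script, the same headers can be read from the response object. The helper below is a small sketch (`read_conversion_metadata` is a hypothetical name) that takes any header mapping, such as `response.headers` from `requests`, and returns the three documented values with the numeric ones converted to integers.

```python
def read_conversion_metadata(headers):
    """Pick the documented X-* headers out of a header mapping."""
    # Header names are case-insensitive, so normalize the keys first
    lookup = {key.lower(): value for key, value in headers.items()}

    def as_int(name):
        value = lookup.get(name)
        if value is not None and value.strip().isdigit():
            return int(value)
        return None  # header missing or not a plain number

    return {
        "aspect_ratio": lookup.get("x-aspect-ratio"),
        "path_replacements": as_int("x-path-replacements"),
        "pdf_size_bytes": as_int("x-pdf-size"),
    }
```

Missing headers come back as `None` instead of raising, so the helper also works on error responses.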
| ## Technical Details | |
| - **Engine**: Puppeteer (Chromium-based) | |
| - **Backend**: FastAPI (Python) | |
| - **Max Timeout**: 60 seconds per conversion | |
| - **Page Sizes**: | |
|   - 16:9 → A4 Landscape (297mm × 210mm) | |
|   - 9:16 → A4 Portrait (210mm × 297mm) | |
|   - 1:1 → Square (210mm × 210mm) | |
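Note that A4 is roughly 1.41:1, so a page labeled 16:9 is not literally 16:9. The arithmetic below uses the page sizes listed above to work out the largest box with the nominal ratio that fits on each page, which can help when sizing slide content. It is plain geometry; how the service actually scales content is not stated here, and `fit_ratio_on_page` is a hypothetical helper.

```python
# Page sizes from the list above, as (width_mm, height_mm)
PAGE_SIZES_MM = {
    "16:9": (297, 210),  # A4 landscape
    "9:16": (210, 297),  # A4 portrait
    "1:1": (210, 210),   # square
}


def fit_ratio_on_page(aspect_ratio):
    """Largest box with the nominal ratio that fits on the mapped page, in mm."""
    page_w, page_h = PAGE_SIZES_MM[aspect_ratio]
    ratio_w, ratio_h = (int(part) for part in aspect_ratio.split(":"))
    # Scale the ratio up until it touches the tighter page edge
    scale = min(page_w / ratio_w, page_h / ratio_h)
    return round(ratio_w * scale, 1), round(ratio_h * scale, 1)


print(fit_ratio_on_page("16:9"))  # (297.0, 167.1)
```

A true 16:9 slide therefore fills the full 297mm width of the landscape page but only about 167mm of its 210mm height.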
| ## License | |
| This API is provided as-is for public use on Hugging Face Spaces. | |
| ## Support | |
| For issues or questions, please visit the [Space discussion page](https://huggingface.co/spaces/abdallalswaiti/htmlpdfs/discussions). | |
| --- | |
| **Made with ❤️ using FastAPI and Puppeteer** | |