ABDALLALSWAITI committed on
Commit 25fef2d · verified · 1 Parent(s): efa25e7

Update README.md

  1. README.md +279 -391
README.md CHANGED
@@ -9,518 +9,406 @@ health_check:
9
  path: /health
10
  ---
11
 
12
- # HTML to PDF Converter API
13
 
14
- A powerful API to convert HTML documents to PDF with support for page breaks, image embedding, and multiple aspect ratios.
15
 
16
- ## Features
17
 
18
- - Convert HTML files to PDF
19
- - Support for multiple aspect ratios (16:9, 1:1, 9:16, auto)
20
- - Automatic aspect ratio detection from content
21
- - Image embedding support
22
- - Page break support
23
- - Base64 PDF output option
24
 
25
- # HTML to PDF Conversion - Complete Implementation Guide
26
-
27
- ## 🎯 Overview
28
-
29
- This guide covers the **optimized HTML to PDF conversion system** based on 2024-2025 Puppeteer best practices, implementing proper multi-page support, intelligent mode detection, and professional PDF generation.
30
-
31
- ---
32
-
33
-
34
- ## 🚀 Usage Examples
35
-
36
- ### Example 1: Auto-Detection (Recommended)
37
  ```bash
38
  curl -X POST https://abdallalswaiti-htmlpdfs.hf.space/convert \
39
- -F "html_file=@document.html" \
40
- -F "aspect_ratio=auto" \
41
- -F "mode=single" \
42
  -o output.pdf
43
  ```
44
 
45
- The system will:
46
- - Analyze HTML structure
47
- - Detect page breaks and content height
48
- - Choose optimal aspect ratio
49
- - Select best generation mode
50
 
51
- ### Example 2: Force Multi-Page Document
52
  ```bash
53
  curl -X POST https://abdallalswaiti-htmlpdfs.hf.space/convert \
54
  -F "html_file=@report.html" \
55
- -F "aspect_ratio=9:16" \
56
- -F "mode=multi" \
57
- -o report.pdf
 
58
  ```
59
 
60
- ### Example 3: Single-Page Infographic
61
- ```bash
62
- curl -X POST https://abdallalswaiti-htmlpdfs.hf.space/convert \
63
- -F "html_file=@infographic.html" \
64
- -F "aspect_ratio=1:1" \
65
- -F "mode=single" \
66
- -o infographic.pdf
67
- ```
68
 
69
- ### Example 4: Presentation Slides
70
  ```bash
71
  curl -X POST https://abdallalswaiti-htmlpdfs.hf.space/convert \
72
  -F "html_file=@presentation.html" \
73
  -F "aspect_ratio=16:9" \
74
- -F "mode=multi" \
75
  -o slides.pdf
76
  ```
77
 
78
- ### Example 5: With Images
79
- ```bash
80
- curl -X POST https://abdallalswaiti-htmlpdfs.hf.space/convert \
81
- -F "html_file=@document.html" \
82
- -F "images=@logo.png" \
83
- -F "images=@chart.jpg" \
84
- -o document.pdf
85
- ```
86
 
87
- ---
88
 
89
- ## 🎨 CSS Best Practices
90
 
91
- ### Essential CSS for Multi-Page PDFs
 
 
 
 
92
 
93
- ```css
94
- /* 1. Configure page size */
95
- @page {
96
- size: A4 portrait;
97
- margin: 20mm;
98
- }
99
 
100
- /* 2. Prevent unwanted breaks */
101
- .keep-together {
102
- page-break-inside: avoid;
103
- break-inside: avoid;
104
- }
105
-
106
- /* 3. Force page breaks */
107
- .new-page {
108
- page-break-before: always;
109
- break-before: page;
110
- }
111
-
112
- .page, .slide {
113
- page-break-after: always;
114
- break-after: page;
115
- }
116
 
117
- /* 4. Avoid breaking after headings */
118
- h1, h2, h3, h4, h5, h6 {
119
- page-break-after: avoid;
120
- break-after: avoid;
121
- }
122
 
123
- /* 5. Table header repetition */
124
- thead {
125
- display: table-header-group;
126
- break-inside: avoid;
127
- }
128
 
129
- /* 6. Prevent table row breaks */
130
- tr {
131
- page-break-inside: avoid;
132
- }
133
-
134
- /* 7. Keep images intact */
135
- img, figure {
136
- page-break-inside: avoid;
137
- break-inside: avoid;
138
- max-width: 100%;
139
- height: auto;
140
- }
141
 
142
- /* 8. Ensure backgrounds print */
143
- * {
144
- -webkit-print-color-adjust: exact !important;
145
- print-color-adjust: exact !important;
146
- }
147
  ```
148
 
149
- ### Print-Specific Styles
150
-
151
- ```css
152
- @media print {
153
- /* Hide elements that shouldn't print */
154
- .no-print {
155
- display: none;
156
- }
157
-
158
- /* Optimize for printing */
159
- body {
160
- font-size: 12pt;
161
- line-height: 1.4;
162
- }
163
-
164
- /* Avoid color backgrounds for text readability */
165
- .text-content {
166
- background: white;
167
- color: black;
168
- }
169
- }
170
- ```
171
 
172
- ---
173
 
174
- ## 📐 Aspect Ratios Guide
 
 
175
 
176
- ### **9:16 - Portrait (Default)**
177
- - **Best for**: Reports, documents, invoices, contracts
178
- - **Format**: A4 portrait (210mm × 297mm)
179
- - **Use case**: Traditional business documents
180
 
181
- ```html
182
- <!-- Auto-detected for standard documents -->
183
- <html>
184
- <head>
185
- <meta name="orientation" content="portrait">
186
- </head>
187
- ...
188
- ```
189
 
190
- ### **16:9 - Landscape**
191
- - **Best for**: Presentations, slides, dashboards
192
- - **Format**: A4 landscape (297mm × 210mm)
193
- - **Use case**: Wide-format content
194
 
 
195
  ```html
196
- <!-- Auto-detected for presentations -->
197
- <html>
198
- <head>
199
- <meta name="orientation" content="landscape">
200
- </head>
201
- <body>
202
- <div class="slide">Slide 1</div>
203
- <div class="slide">Slide 2</div>
204
- </body>
205
  ```
206
 
207
- ### **1:1 - Square**
208
- - **Best for**: Social media, Instagram posts, certificates
209
- - **Format**: 210mm × 210mm
210
- - **Use case**: Square format content
211
-
212
  ```html
213
- <!-- Specify explicitly -->
214
- <html data-aspect-ratio="1:1">
215
- ...
216
  ```
217
 
218
- ### **Auto - Dynamic**
219
- - **Best for**: Infographics, posters, or any content where the exact dimensions should be preserved.
220
- - **Format**: The PDF will have the same dimensions as the HTML content.
221
- - **Use case**: When you want the PDF to be a perfect snapshot of the HTML content.
222
 
223
- ---
224
 
225
- ## 🎯 Mode Selection Guide
 
 
 
226
 
227
- ### **Auto Mode (Recommended)**
228
- ```javascript
229
- mode: "auto"
230
- ```
231
- - System analyzes content
232
- - Detects page breaks in CSS
233
- - Measures content height
234
- - Chooses optimal mode automatically
235
-
236
- **Auto-detection logic:**
237
- - Finds `.page` or `.slide` classes → Multi-page
238
- - Content height > 2× viewport → Multi-page
239
- - Keywords like "infographic", "poster" → Single-page
240
- - Otherwise → Multi-page (safer default)
241
-
242
- ### **Multi-Page Mode**
243
- ```javascript
244
- mode: "multi"
245
- ```
246
- - Natural page breaks respected
247
- - Table headers repeat
248
- - Content flows across pages
249
- - Best for documents, reports, articles
250
 
251
- ### **Single-Page Mode**
252
- ```javascript
253
- mode: "single"
254
- ```
255
- - Everything on one page
256
- - Page sized to content
257
- - Perfect for infographics
258
- - No page breaks applied
259
-
260
- ---
261
 
262
- ## 🔧 Advanced Configuration
 
 
 
263
 
264
- ### Custom Page Sizes
265
-
266
- ```css
267
- /* Custom dimensions */
268
- @page {
269
- size: 8.5in 11in; /* Letter size */
270
- }
271
 
272
- @page {
273
- size: 420mm 594mm; /* A2 size */
274
- }
275
 
276
- @page {
277
- size: landscape; /* Generic landscape */
278
- }
279
- ```
280
 
281
- ### Headers and Footers
282
 
283
- ```css
284
- @page {
285
- @top-center {
286
- content: "Document Title";
287
- }
288
-
289
- @bottom-right {
290
- content: counter(page) " of " counter(pages);
291
- }
292
- }
293
  ```
294
 
295
- ### Margins
296
-
297
- ```css
298
- @page {
299
- margin: 25mm 20mm; /* top/bottom left/right */
300
- }
301
 
302
- @page {
303
- margin-top: 30mm;
304
- margin-bottom: 25mm;
305
- margin-left: 20mm;
306
- margin-right: 20mm;
307
- }
 
 
308
  ```
309
 
310
- ---
311
-
312
- ## 🐛 Troubleshooting
313
-
314
- ### Issue: Page breaks in wrong places
315
- **Solution**: Add `page-break-inside: avoid` to container elements
316
 
317
- ```css
318
- .section {
319
- page-break-inside: avoid;
320
- }
 
321
  ```
322
 
323
- ### Issue: Table headers not repeating
324
- **Solution**: Ensure proper CSS on `<thead>`
325
 
326
- ```css
327
- thead {
328
- display: table-header-group;
329
- break-inside: avoid;
330
- }
331
- ```
332
-
333
- ### Issue: Backgrounds not printing
334
- **Solution**: Add color-adjust property
335
 
336
- ```css
337
- * {
338
- -webkit-print-color-adjust: exact;
339
- print-color-adjust: exact;
340
  }
341
- ```
342
 
343
- ### Issue: Content cut off at page edges
344
- **Solution**: Add appropriate margins
345
-
346
- ```css
347
- @page {
348
- margin: 20mm;
349
- }
350
 
351
- body {
352
- padding: 0; /* Remove body padding when using @page margins */
 
 
353
  }
354
- ```
355
-
356
- ### Issue: Fonts not loading
357
- **Solution**: Ensure fonts are embedded or use web-safe fonts
358
 
359
- ```css
360
- @font-face {
361
- font-family: 'CustomFont';
362
- src: url(data:font/woff2;base64,...);
363
- }
 
364
  ```
365
 
366
- ---
367
-
368
- ## 📊 Performance Optimization
369
-
370
- ### 1. Reuse Browser Instance (for high-volume)
371
- Not implemented in the current single-request design, but useful for high-volume workloads:
372
 
373
  ```javascript
374
- // Keep one browser instance
375
- const browser = await puppeteer.launch();
376
-
377
- // Reuse for multiple PDFs
378
- for (let html of htmlList) {
379
- const page = await browser.newPage();
380
- await page.setContent(html);
381
- await page.pdf({...});
382
- await page.close();
 
383
  }
384
 
385
- await browser.close();
386
  ```
387
 
388
- ### 2. Optimize Images
389
- - Use compressed formats (WebP, optimized JPEG)
390
- - Resize to display dimensions
391
- - Embed as data URLs for self-contained PDFs
392
-
393
- ### 3. Minimize CSS
394
- - Remove unused styles
395
- - Combine selectors
396
- - Use shorthand properties
397
 
398
- ### 4. Reduce Wait Times
399
- ```javascript
400
- // Only wait for what's needed
401
- waitUntil: 'networkidle2' // Instead of networkidle0
402
- ```
403
 
404
- ---
405
 
406
- ## 🎓 Real-World Examples
407
-
408
- ### Invoice Template
409
  ```html
410
- <!DOCTYPE html>
411
- <html>
412
- <head>
413
- <style>
414
- @page { size: A4; margin: 20mm; }
415
- .invoice-header { page-break-after: avoid; }
416
- .invoice-items { page-break-inside: auto; }
417
- thead { display: table-header-group; }
418
- </style>
419
- </head>
420
- <body>
421
- <div class="invoice-header">
422
- <h1>INVOICE</h1>
423
- <p>Date: 2024-10-17</p>
424
- </div>
425
- <table class="invoice-items">
426
- <thead>
427
- <tr><th>Item</th><th>Qty</th><th>Price</th></tr>
428
- </thead>
429
- <tbody>
430
- <!-- Items -->
431
- </tbody>
432
- </table>
433
- </body>
434
- </html>
435
  ```
436
 
437
- ### Multi-Page Report
 
438
  ```html
439
  <!DOCTYPE html>
440
  <html>
441
  <head>
 
442
  <style>
443
- @page { size: A4 portrait; margin: 25mm; }
444
- .page { page-break-after: always; }
445
- .page:last-child { page-break-after: auto; }
446
- h1 { page-break-after: avoid; }
 
 
 
 
447
  </style>
448
  </head>
449
  <body>
450
- <div class="page">
451
- <h1>Executive Summary</h1>
452
- <p>Content...</p>
453
  </div>
454
- <div class="page">
455
- <h1>Details</h1>
456
- <p>Content...</p>
 
457
  </div>
458
  </body>
459
  </html>
460
  ```
461
 
462
- ### Social Media Infographic
 
463
  ```html
464
  <!DOCTYPE html>
465
  <html>
466
  <head>
 
467
  <style>
468
  body {
469
- width: 1080px;
470
- height: 1080px;
471
- margin: 0;
472
- display: flex;
473
- align-items: center;
474
- justify-content: center;
475
  }
476
  </style>
477
  </head>
478
  <body>
479
- <div style="text-align: center;">
480
- <h1>Your Infographic</h1>
481
- </div>
 
 
482
  </body>
483
  </html>
484
  ```
485
 
486
- ---
487
 
488
- ## 🎉 Key Takeaways
 
 
 
489
 
490
- 1. **Always use print media emulation** - It's critical for proper PDF generation
491
- 2. **Auto-detection is your friend** - Let the system choose the best settings
492
- 3. **CSS page breaks work** - When properly configured with print emulation
493
- 4. **Test with real content** - Different content types need different approaches
494
- 5. **Use semantic HTML** - `<thead>`, `<section>`, etc. help with page breaks
495
- 6. **Embed resources** - For self-contained PDFs, embed images and fonts
496
 
497
- ---
 
498
 
499
- ## 📚 Additional Resources
 
500
 
501
- - [Puppeteer PDF API Documentation](https://pptr.dev/api/puppeteer.pdfoptions)
502
- - [CSS Paged Media Module](https://www.w3.org/TR/css-page-3/)
503
- - [MDN: page-break-inside](https://developer.mozilla.org/en-US/docs/Web/CSS/page-break-inside)
504
- - [Chromium Print CSS Support](https://bugs.chromium.org/p/chromium/issues/list?q=component:Blink%3ECSS%3EPrint)
 
505
 
506
- ---
 
507
 
508
- ## 🚀 Deployment Checklist
509
 
510
- - [ ] Replace `puppeteer_pdf.js` with new version
511
- - [ ] Replace `app.py` with enhanced version
512
- - [ ] Test with multi-page document
513
- - [ ] Test with single-page infographic
514
- - [ ] Test with different aspect ratios
515
- - [ ] Test auto-detection
516
- - [ ] Verify table header repetition
517
- - [ ] Check page break behavior
518
- - [ ] Test with embedded images
519
- - [ ] Monitor server logs for errors
520
- - [ ] Test production workload
521
 
522
  ---
523
 
524
- **Version**: 2.0.0
525
- **Last Updated**: October 2025
526
- **Compatibility**: Puppeteer 23.x, Node.js 20.x
 
9
  path: /health
10
  ---
11
 
12
+ # HTML to PDF Converter API 📄
13
 
14
+ Convert HTML files to PDF with automatic image embedding and page break management. Perfect for generating reports, presentations, and documents from HTML.
15
 
16
+ ## 🚀 Quick Start
17
 
18
+ ### Basic Conversion (HTML only)
 
 
 
 
 
19
 
20
  ```bash
21
  curl -X POST https://abdallalswaiti-htmlpdfs.hf.space/convert \
22
+ -F "html_file=@your_file.html" \
 
 
23
  -o output.pdf
24
  ```
25
 
26
+ ### With Images
 
 
 
 
27
 
 
28
  ```bash
29
  curl -X POST https://abdallalswaiti-htmlpdfs.hf.space/convert \
30
  -F "html_file=@report.html" \
31
+ -F "images=@image1.png" \
32
+ -F "images=@image2.jpg" \
33
+ -F "images=@logo.svg" \
34
+ -o output.pdf
35
  ```
36
 
37
+ ### Custom Aspect Ratio
 
 
 
 
 
 
 
38
 
 
39
  ```bash
40
  curl -X POST https://abdallalswaiti-htmlpdfs.hf.space/convert \
41
  -F "html_file=@presentation.html" \
42
  -F "aspect_ratio=16:9" \
43
+ -F "auto_detect=false" \
44
  -o slides.pdf
45
  ```
46
 
47
+ ## 📋 API Endpoints
 
 
 
 
 
 
 
48
 
49
+ ### `POST /convert`
50
 
51
+ Convert an HTML file to PDF, optionally with uploaded images.
52
 
53
+ **Parameters:**
54
+ - `html_file` (required): HTML file to convert
55
+ - `images` (optional): Image files referenced in HTML (can upload multiple)
56
+ - `aspect_ratio` (optional): `16:9`, `1:1`, or `9:16`
57
+ - `auto_detect` (optional): Auto-detect aspect ratio from HTML (default: `true`)
58
 
59
+ **Response:**
60
+ - PDF file (application/pdf)
61
+ - Headers include metadata: aspect ratio, image count, PDF size
 
 
 
62
 
63
+ ### `POST /convert-string`
 
64
 
65
+ Convert an HTML string to PDF (for HTML without external images).
 
 
 
 
66
 
67
+ **Parameters:**
68
+ - `html_content` (required): HTML content as string
69
+ - `aspect_ratio` (optional): `16:9`, `1:1`, or `9:16`
70
+ - `auto_detect` (optional): Auto-detect aspect ratio (default: `true`)
 
71
 
72
+ **Example:**
 
73
 
74
+ ```bash
75
+ curl -X POST https://abdallalswaiti-htmlpdfs.hf.space/convert-string \
76
+ -F "html_content=<html><body><h1>Hello World</h1></body></html>" \
77
+ -o output.pdf
 
78
  ```
79
 
80
+ ### `GET /health`
 
81
 
82
+ Health check endpoint.
83
 
84
+ ```bash
85
+ curl https://abdallalswaiti-htmlpdfs.hf.space/health
86
+ ```
87
 
88
+ ## 🎨 Features
 
 
 
89
 
90
+ ### ✅ Automatic Image Path Normalization
 
 
 
 
 
 
 
91
 
92
+ The API automatically converts complex image paths to simple filenames:
 
 
 
93
 
94
+ **Before:**
95
  ```html
96
+ <img src="../../../assets/images/logo.png">
97
+ <img src="images/photo.jpg">
 
 
 
 
 
 
 
98
  ```
99
 
100
+ **After (automatically):**
 
 
 
 
101
  ```html
102
+ <img src="logo.png">
103
+ <img src="photo.jpg">
 
104
  ```
105
 
106
+ Just upload your images with the `images` parameter, and they'll work!
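
The normalization described above can be approximated in a few lines of Python. This is only a sketch of the idea, not the Space's actual code: the helper name `normalize_image_paths` and both regular expressions are assumptions made for illustration, and the sketch only looks at `src="..."` attributes and CSS `url(...)` references.

```python
import posixpath
import re

# Hypothetical patterns: quoted src attributes and CSS url(...) references.
SRC_ATTR = re.compile(r'''(src\s*=\s*)(["'])(.*?)\2''', re.IGNORECASE)
CSS_URL = re.compile(r'''url\(\s*(["']?)(.*?)\1\s*\)''', re.IGNORECASE)


def _basename(path):
    """Return the filename part of a relative path; leave URLs and data URIs alone."""
    if re.match(r'^(?:[a-z][a-z0-9+.-]*:|//)', path, re.IGNORECASE):
        return path
    return posixpath.basename(path) or path


def normalize_image_paths(html):
    """Return (new_html, replacements) with relative paths reduced to filenames."""
    count = 0

    def fix_src(match):
        nonlocal count
        new = _basename(match.group(3))
        if new != match.group(3):
            count += 1
        return f"{match.group(1)}{match.group(2)}{new}{match.group(2)}"

    def fix_url(match):
        nonlocal count
        new = _basename(match.group(2))
        if new != match.group(2):
            count += 1
        return f"url({match.group(1)}{new}{match.group(1)})"

    html = SRC_ATTR.sub(fix_src, html)
    html = CSS_URL.sub(fix_url, html)
    return html, count
```

The returned count plays the same role as a "number of paths normalized" figure. Absolute URLs and `data:` URIs are left untouched so that remote or already-embedded images keep working.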
 
 
 
107
 
108
+ ### ✅ Aspect Ratio Detection
109
 
110
+ The API automatically detects aspect ratio from:
111
+ - HTML `<meta name="viewport">` tags
112
+ - CSS `aspect-ratio` properties
113
+ - Keywords like "presentation", "slide"
114
 
115
+ Supported ratios:
116
+ - **16:9** - Landscape (presentations, slides) → A4 Landscape
117
+ - **9:16** - Portrait (reports, documents) → A4 Portrait
118
+ - **1:1** - Square (social media posts) → 210mm × 210mm
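
To make the detection rules concrete, here is a rough Python sketch of how such a heuristic could work. It is not the Space's implementation: the function name `detect_aspect_ratio`, the order in which the three signals are checked, the `orientation=` hint inside the viewport tag, and the `9:16` fallback are all assumptions for illustration.

```python
import re

SUPPORTED = ("16:9", "9:16", "1:1")


def detect_aspect_ratio(html, default="9:16"):
    """Guess an aspect ratio from viewport hints, then CSS, then keywords."""
    lowered = html.lower()

    # 1. Orientation hint inside a <meta name="viewport"> tag (assumed convention)
    meta = re.search(r'<meta[^>]*name=["\']viewport["\'][^>]*>', lowered)
    if meta:
        if "orientation=landscape" in meta.group(0):
            return "16:9"
        if "orientation=portrait" in meta.group(0):
            return "9:16"

    # 2. CSS declaration such as "aspect-ratio: 16 / 9"
    css = re.search(r'aspect-ratio\s*:\s*(\d+)\s*/\s*(\d+)', lowered)
    if css:
        ratio = f"{css.group(1)}:{css.group(2)}"
        if ratio in SUPPORTED:
            return ratio

    # 3. Keywords that suggest a slide deck
    if re.search(r'\b(presentation|slide)s?\b', lowered):
        return "16:9"

    # Nothing matched: fall back to the assumed default
    return default
```

A heuristic like this can misfire, for example when a report merely mentions the word "slide", which is one reason to pass `aspect_ratio` explicitly with `auto_detect=false` when the layout matters.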
 
119
 
120
+ ### ✅ Automatic Page Breaks
 
 
 
 
 
 
 
 
 
121
 
122
+ The API intelligently handles page breaks:
123
+ - Elements with classes: `.page`, `.slide`, `section.page`
124
+ - Top-level `<section>`, `<article>`, `<div>` elements
125
+ - Prevents breaking inside: headings, images, tables, code blocks
126
 
127
+ ### ✅ Color Preservation
 
 
 
 
 
 
128
 
129
+ All colors, backgrounds, and gradients are preserved in the PDF with `print-color-adjust: exact`.
 
 
130
 
131
+ ## 💡 Usage Examples
 
 
 
132
 
133
+ ### Example 1: Simple Report
134
 
135
+ ```bash
136
+ curl -X POST https://abdallalswaiti-htmlpdfs.hf.space/convert \
137
+ -F "html_file=@report.html" \
138
+ -o report.pdf
 
 
 
 
 
 
139
  ```
140
 
141
+ ### Example 2: Presentation with Images
 
 
 
 
 
142
 
143
+ ```bash
144
+ curl -X POST https://abdallalswaiti-htmlpdfs.hf.space/convert \
145
+ -F "html_file=@slides.html" \
146
+ -F "images=@chart1.png" \
147
+ -F "images=@chart2.png" \
148
+ -F "images=@logo.svg" \
149
+ -F "aspect_ratio=16:9" \
150
+ -o presentation.pdf
151
  ```
152
 
153
+ ### Example 3: Multiple Images from Directory
 
 
 
 
 
154
 
155
+ ```bash
156
+ curl -X POST https://abdallalswaiti-htmlpdfs.hf.space/convert \
157
+ -F "html_file=@document.html" \
158
+ $(for img in images/*.{png,jpg}; do echo "-F images=@$img"; done) \
159
+ -o document.pdf
160
  ```
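
The shell loop above breaks on filenames that contain spaces. As a possible alternative, the same file list can be built in Python. This is a minimal sketch: the helper names `collect_image_paths` and `build_image_fields` are hypothetical, and the suffix set simply mirrors the formats this README lists as supported.

```python
from pathlib import Path

# Suffixes mirror the supported formats listed in this README.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif", ".svg", ".webp", ".bmp"}


def collect_image_paths(directory):
    """Return image files in `directory`, sorted by path, skipping other files."""
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def build_image_fields(directory):
    """Build (field_name, path) pairs, one 'images' field per file."""
    return [("images", p) for p in collect_image_paths(directory)]
```

Each pair can then be opened in binary mode and sent as a repeated `images` form field by whichever HTTP client you use.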
161
 
162
+ ### Example 4: Python Script
 
163
 
164
+ ```python
165
+ import requests
 
 
 
 
 
 
 
166
 
167
+ # Prepare files as a list of (field, file) tuples so images can be appended
168
+ files = [
169
+     ('html_file', open('report.html', 'rb')),
170
+ ]
171
 
172
+ # Add images
173
+ images = [
174
+     ('images', open('image1.png', 'rb')),
175
+     ('images', open('image2.jpg', 'rb')),
176
+ ]
177
 
178
+ # Optional parameters
179
+ data = {
180
+     'aspect_ratio': '9:16',
181
+     'auto_detect': 'false'
182
  }
183
 
184
+ # Make request
185
+ response = requests.post(
186
+     'https://abdallalswaiti-htmlpdfs.hf.space/convert',
187
+     # One list carries the HTML file and every image
188
+     files=files + images,
189
+     data=data,
190
+ )
191
+
192
+ # Save PDF
193
+ if response.status_code == 200:
194
+     with open('output.pdf', 'wb') as f:
195
+         f.write(response.content)
196
+     print("PDF generated successfully!")
197
+ else:
198
+     print(f"Error: {response.status_code}")
199
+     print(response.text)
200
  ```
201
 
202
+ ### Example 5: JavaScript/Node.js
 
 
 
 
 
203
 
204
  ```javascript
205
+ const FormData = require('form-data');
206
+ const fs = require('fs');
207
+ const fetch = require('node-fetch');
208
+
209
+ async function convertToPDF() {
210
+ const form = new FormData();
211
+
212
+ // Add HTML file
213
+ form.append('html_file', fs.createReadStream('report.html'));
214
+
215
+ // Add images
216
+ form.append('images', fs.createReadStream('image1.png'));
217
+ form.append('images', fs.createReadStream('image2.jpg'));
218
+
219
+ // Optional parameters
220
+ form.append('aspect_ratio', '9:16');
221
+
222
+ const response = await fetch(
223
+ 'https://abdallalswaiti-htmlpdfs.hf.space/convert',
224
+ {
225
+ method: 'POST',
226
+ body: form
227
+ }
228
+ );
229
+
230
+ if (response.ok) {
231
+ const buffer = await response.arrayBuffer();
232
+ fs.writeFileSync('output.pdf', Buffer.from(buffer));
233
+ console.log('PDF generated successfully!');
234
+ } else {
235
+ console.error('Error:', await response.text());
236
+ }
237
  }
238
 
239
+ convertToPDF();
240
  ```
241
 
242
+ ## 📝 HTML Best Practices
 
 
 
 
 
 
 
 
243
 
244
+ ### For Multi-Page Documents
 
 
 
 
245
 
246
+ Use page classes to control page breaks:
247
 
 
 
 
248
  ```html
249
+ <div class="page">
250
+ <h1>Page 1</h1>
251
+ <p>Content here...</p>
252
+ </div>
253
+
254
+ <div class="page">
255
+ <h1>Page 2</h1>
256
+ <p>More content...</p>
257
+ </div>
 
258
  ```
259
 
260
+ ### For Presentations (16:9)
261
+
262
  ```html
263
  <!DOCTYPE html>
264
  <html>
265
  <head>
266
+ <meta name="viewport" content="width=device-width, initial-scale=1.0, orientation=landscape">
267
  <style>
268
+ .slide {
269
+ width: 100vw;
270
+ height: 100vh;
271
+ display: flex;
272
+ flex-direction: column;
273
+ justify-content: center;
274
+ align-items: center;
275
+ }
276
  </style>
277
  </head>
278
  <body>
279
+ <div class="slide">
280
+ <h1>Slide 1</h1>
281
+ <img src="chart.png" alt="Chart">
282
  </div>
283
+
284
+ <div class="slide">
285
+ <h1>Slide 2</h1>
286
+ <img src="graph.png" alt="Graph">
287
  </div>
288
  </body>
289
  </html>
290
  ```
291
 
292
+ ### For Reports (9:16)
293
+
294
  ```html
295
  <!DOCTYPE html>
296
  <html>
297
  <head>
298
+ <meta name="viewport" content="width=device-width, initial-scale=1.0, orientation=portrait">
299
  <style>
300
  body {
301
+ font-family: Arial, sans-serif;
302
+ padding: 20px;
303
+ }
304
+ .page {
305
+ min-height: 100vh;
 
306
  }
307
  </style>
308
  </head>
309
  <body>
310
+ <section class="page">
311
+ <h1>Annual Report 2024</h1>
312
+ <img src="logo.png" alt="Logo" style="width: 200px;">
313
+ <p>Report content...</p>
314
+ </section>
315
  </body>
316
  </html>
317
  ```
318
 
319
+ ## 🎯 Image Handling
320
 
321
+ ### Supported Formats
322
+ - PNG, JPG/JPEG
323
+ - GIF, SVG
324
+ - WebP, BMP
325
 
326
+ ### Image Path Examples
 
 
 
 
 
327
 
328
+ Your HTML can have **any** of these formats:
329
+ ```html
330
+ <!-- All of these work! -->
331
+ <img src="logo.png">
332
+ <img src="images/logo.png">
333
+ <img src="../../../assets/images/logo.png">
334
+ <img src="./photos/image.jpg">
335
+
336
+ <!-- CSS backgrounds too -->
337
+ <div style="background-image: url('bg.jpg')"></div>
338
+ <div style="background-image: url('../images/bg.jpg')"></div>
339
+ ```
340
 
341
+ Just upload the images:
342
+ ```bash
343
+ curl -X POST https://abdallalswaiti-htmlpdfs.hf.space/convert \
344
+ -F "html_file=@index.html" \
345
+ -F "images=@logo.png" \
346
+ -F "images=@bg.jpg" \
347
+ -o output.pdf
348
+ ```
349
 
350
+ The API automatically:
351
+ 1. Extracts filenames from paths
352
+ 2. Normalizes all references to simple filenames
353
+ 3. Saves images to the same directory as HTML
354
+ 4. Generates PDF with all images embedded
355
 
356
+ ## 🔧 Troubleshooting
357
+
358
+ ### Images Not Showing
359
+ - Ensure image filenames match exactly (case-sensitive)
360
+ - Upload ALL images referenced in your HTML
361
+ - Check that image paths are normalized (the API does this automatically)
362
+
363
+ ### Wrong Aspect Ratio
364
+ - Set `auto_detect=false` and specify `aspect_ratio` manually
365
+ - Check HTML for viewport meta tags that might override
366
+
367
+ ### Page Breaks in Wrong Places
368
+ - Add `class="no-page-break"` to elements that should stay together
369
+ - Use `class="page-break"` to force breaks at specific points
370
+
371
+ ### PDF Too Large
372
+ - Optimize images before uploading (compress, resize)
373
+ - Use appropriate image formats (WebP for photos, PNG for graphics)
374
+
375
+ ## 📊 Response Headers
376
+
377
+ The API includes useful metadata in response headers:
378
+
379
+ - `X-Aspect-Ratio`: Detected or specified aspect ratio
380
+ - `X-Path-Replacements`: Number of image paths normalized
381
+ - `X-PDF-Size`: Size of generated PDF in bytes
382
+
383
+ **Example:**
384
+ ```bash
385
+ curl -s -D - -o /dev/null -X POST https://abdallalswaiti-htmlpdfs.hf.space/convert \
386
+ -F "html_file=@test.html"
387
+
388
+ # Response headers:
389
+ # X-Aspect-Ratio: 9:16
390
+ # X-Path-Replacements: 3
391
+ # X-PDF-Size: 245678
392
+ ```
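
In a script, these headers can be collected into a small dictionary. The sketch below assumes only the three header names documented in this section; the function name `parse_conversion_headers` and the returned key names are illustrative, and missing headers are returned as `None` rather than treated as errors.

```python
def parse_conversion_headers(headers):
    """Extract the documented metadata headers into a plain dict.

    `headers` is any mapping of header names to string values.
    Header names are matched case-insensitively.
    """
    lookup = {name.lower(): value for name, value in headers.items()}

    def as_int(name):
        # Return the header as an int, or None if absent or not numeric
        value = lookup.get(name.lower())
        if value is not None and value.strip().isdigit():
            return int(value)
        return None

    return {
        "aspect_ratio": lookup.get("x-aspect-ratio"),
        "path_replacements": as_int("X-Path-Replacements"),
        "pdf_size_bytes": as_int("X-PDF-Size"),
    }
```

The function accepts any mapping, so the headers object returned by an HTTP client can be passed in directly.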
393
+
394
+ ## 🛠️ Technical Details
395
+
396
+ - **Engine**: Puppeteer (Chromium-based)
397
+ - **Backend**: FastAPI (Python)
398
+ - **Max Timeout**: 60 seconds per conversion
399
+ - **Page Sizes**:
400
+   - 16:9 → A4 Landscape (297mm × 210mm)
401
+   - 9:16 → A4 Portrait (210mm × 297mm)
402
+   - 1:1 → Square (210mm × 210mm)
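
The page-size mapping above can be written as a small lookup table, which is handy when a tool expects inches instead of millimetres. This is an illustrative sketch: the names `PAGE_SIZES_MM` and `page_size_inches` are hypothetical, and only the millimetre values come from this README. Note that A4's sides are in a ratio of roughly 1.41:1, so the `16:9` and `9:16` labels select an orientation rather than an exact ratio.

```python
MM_PER_INCH = 25.4

# Page sizes in millimetres (width, height), taken from the list above.
PAGE_SIZES_MM = {
    "16:9": (297, 210),  # A4 landscape
    "9:16": (210, 297),  # A4 portrait
    "1:1": (210, 210),   # square
}


def page_size_inches(aspect_ratio):
    """Return (width, height) in inches, rounded to two decimals."""
    width_mm, height_mm = PAGE_SIZES_MM[aspect_ratio]
    return (round(width_mm / MM_PER_INCH, 2), round(height_mm / MM_PER_INCH, 2))
```

An unsupported ratio raises `KeyError`, which surfaces a typo such as `4:3` immediately instead of silently falling back to a default size.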
403
+
404
+ ## πŸ“„ License
405
+
406
+ This API is provided as-is for public use on Hugging Face Spaces.
407
 
408
+ ## 🤝 Support
409
 
410
+ For issues or questions, please visit the [Space discussion page](https://huggingface.co/spaces/abdallalswaiti/htmlpdfs/discussions).
 
411
 
412
  ---
413
 
414
+ **Made with ❤️ using FastAPI and Puppeteer**