Usage Guide for Enhanced DOCX to PDF Converter
This guide explains how to use the enhanced DOCX to PDF converter, which was rebuilt from the original Gradio-based version as a FastAPI service.
Getting Started
Prerequisites
- Docker and Docker Compose installed
- At least 4GB of available RAM
- Internet connection for initial setup
Quick Start
- Clone or download this repository
- Navigate to the project directory
- Run the service:
  docker-compose up --build
- Access the API at http://localhost:8000
- View API documentation at http://localhost:8000/docs
API Endpoints
Convert Single DOCX File
POST /convert
Converts a single DOCX file to PDF.
Using Multipart File Upload:
curl -X POST "http://localhost:8000/convert" \
  -H "accept: application/json" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@document.docx"
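The same upload can be prepared from Python with only the standard library. This is a minimal sketch, not the project's own client: the helper names `build_multipart` and `build_convert_request` are hypothetical, and the form field name `file` is taken from the curl example above.

```python
import uuid
import urllib.request

DOCX_MIME = "application/vnd.openxmlformats-officedocument.wordprocessingml.document"


def build_multipart(field, filename, data):
    """Encode a single file as multipart/form-data; returns (body, content_type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {DOCX_MIME}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def build_convert_request(filename, data, base_url="http://localhost:8000"):
    """Prepare (but do not send) the POST /convert request."""
    body, content_type = build_multipart("file", filename, data)
    return urllib.request.Request(
        f"{base_url}/convert",
        data=body,
        headers={"Content-Type": content_type, "Accept": "application/json"},
        method="POST",
    )
```

To send it, pass the returned request to `urllib.request.urlopen` and decode the JSON body of the response.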
Using Base64 Content:
# First encode your file to base64 (strip line wraps so the value is a single line)
BASE64_CONTENT=$(base64 < document.docx | tr -d '\n')
# Then send the request; --data-urlencode keeps "+", "/" and "=" intact
curl -X POST "http://localhost:8000/convert" \
  -H "accept: application/json" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  --data-urlencode "file_content=$BASE64_CONTENT" \
  -d "filename=document.docx"
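Base64 text can contain `+`, `/` and `=`, which must be percent-encoded in a form body so the server does not read `+` as a space. The following sketch shows one way to build that body in Python; the function name `build_base64_form` is hypothetical, and the field names come from the curl example above.

```python
import base64
from urllib.parse import urlencode


def build_base64_form(docx_bytes, filename):
    """Return an application/x-www-form-urlencoded body for POST /convert."""
    encoded = base64.b64encode(docx_bytes).decode("ascii")
    # urlencode percent-encodes "+", "/" and "=" in the base64 text
    return urlencode({"file_content": encoded, "filename": filename}).encode("ascii")
```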
Response:
{
  "success": true,
  "pdf_url": "/download/abc123/document.pdf",
  "message": "Conversion successful"
}
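Since `pdf_url` is a path relative to the service, a client has to join it with the base URL before downloading. A minimal sketch, assuming the response shape shown above (the helper name `pdf_url_from_response` is hypothetical):

```python
import json
from urllib.parse import urljoin


def pdf_url_from_response(raw_json, base_url="http://localhost:8000"):
    """Return the absolute PDF URL, or raise ValueError with the API's error text."""
    result = json.loads(raw_json)
    if not result.get("success"):
        raise ValueError(result.get("error", "conversion failed"))
    return urljoin(base_url, result["pdf_url"])
```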
Batch Convert Multiple DOCX Files
POST /convert/batch
Converts multiple DOCX files in a single request.
curl -X POST "http://localhost:8000/convert/batch" \
  -H "accept: application/json" \
  -H "Content-Type: application/json" \
  -d '{
    "files": [
      {
        "file_content": "base64_encoded_content_1",
        "filename": "document1.docx"
      },
      {
        "file_content": "base64_encoded_content_2",
        "filename": "document2.docx"
      }
    ]
  }'
Response:
[
  {
    "success": true,
    "pdf_url": "/download/abc123/document1.pdf",
    "message": "Conversion successful"
  },
  {
    "success": false,
    "error": "Error description"
  }
]
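Because a batch response can mix successes and failures, a client usually needs to build the payload and then sort the results. The sketch below does both with hypothetical helpers; it assumes the response list keeps the same order as the request, which this guide does not state explicitly.

```python
import base64
import json


def build_batch_payload(files):
    """files maps filename -> raw DOCX bytes; returns the JSON request body."""
    return json.dumps({
        "files": [
            {"file_content": base64.b64encode(data).decode("ascii"), "filename": name}
            for name, data in files.items()
        ]
    })


def split_batch_results(filenames, results):
    """Pair each result with its input filename (assumes results keep request order)."""
    converted, failed = {}, {}
    for name, result in zip(filenames, results):
        if result.get("success"):
            converted[name] = result["pdf_url"]
        else:
            failed[name] = result.get("error", "unknown error")
    return converted, failed
```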
Download Converted PDF
GET /download/{temp_id}/{filename}
Downloads a converted PDF file.
curl -X GET "http://localhost:8000/download/abc123/document.pdf" \
  -o document.pdf
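For larger PDFs it can help to stream the download to disk rather than hold it in memory. A small sketch using only the standard library; the function name `download_pdf` and the 30 second timeout are illustrative choices, not part of the service.

```python
import shutil
import urllib.request


def download_pdf(url, dest_path, timeout=30):
    """Stream the response body to dest_path and return the number of bytes written."""
    with urllib.request.urlopen(url, timeout=timeout) as resp, open(dest_path, "wb") as out:
        shutil.copyfileobj(resp, out)
        return out.tell()
```

For example, `download_pdf("http://localhost:8000/download/abc123/document.pdf", "document.pdf")` mirrors the curl command above.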
Health Check
GET /health
Checks if the service is running.
curl -X GET "http://localhost:8000/health"
Response:
{
  "status": "healthy",
  "version": "2.0.0"
}
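The health endpoint is handy for waiting until the container is ready before sending conversions. The following is a sketch of a simple polling loop; `fetch_health` and `wait_until_healthy` are hypothetical names, and the retry count and delay are arbitrary defaults.

```python
import json
import time
import urllib.request


def fetch_health(base_url):
    """GET /health and return the decoded JSON body."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=5) as resp:
        return json.load(resp)


def wait_until_healthy(base_url="http://localhost:8000", fetch=fetch_health,
                       attempts=10, delay=1.0):
    """Poll until the service reports status "healthy"; False if it never does."""
    for _ in range(attempts):
        try:
            if fetch(base_url).get("status") == "healthy":
                return True
        except (OSError, ValueError):
            # connection refused, timeout, or a non-JSON body: try again
            pass
        time.sleep(delay)
    return False
```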
Browser Integration
The API includes full CORS support for direct browser integration. You can use the Fetch API or XMLHttpRequest to communicate directly with the service from web applications.
Example JavaScript Integration:
// Convert and download a file
async function convertDocxToPdf(file) {
  const formData = new FormData();
  formData.append('file', file);
  try {
    const response = await fetch('http://localhost:8000/convert', {
      method: 'POST',
      body: formData
    });
    const result = await response.json();
    if (result.success) {
      // Open PDF in new tab
      window.open('http://localhost:8000' + result.pdf_url, '_blank');
      // Or download directly
      const link = document.createElement('a');
      link.href = 'http://localhost:8000' + result.pdf_url;
      link.download = 'converted.pdf';
      link.click();
    } else {
      console.error('Conversion failed:', result.error);
    }
  } catch (error) {
    console.error('Network error:', error);
  }
}
Configuration
The service can be configured using environment variables:
| Variable | Description | Default |
|---|---|---|
| PORT | Application port | 8000 |
| MAX_FILE_SIZE | Maximum file size in bytes | 52428800 (50MB) |
| MAX_CONVERSION_TIME | Conversion timeout in seconds | 120 |
| TEMP_DIR | Temporary directory for conversions | /tmp/conversions |
| CORS_ORIGINS | CORS allowed origins | * |
Example with custom configuration:
PORT=8080 MAX_FILE_SIZE=104857600 docker-compose up
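To illustrate how these variables and defaults fit together, here is a sketch of reading them into a typed settings object. It is not the service's actual configuration code: the `Settings` class and `load_settings` function are hypothetical, and treating `CORS_ORIGINS` as a comma-separated list is an assumption.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    port: int
    max_file_size: int        # bytes
    max_conversion_time: int  # seconds
    temp_dir: str
    cors_origins: list


def load_settings(env=os.environ):
    """Read the documented variables, falling back to the documented defaults."""
    origins = env.get("CORS_ORIGINS", "*")
    return Settings(
        port=int(env.get("PORT", "8000")),
        max_file_size=int(env.get("MAX_FILE_SIZE", "52428800")),
        max_conversion_time=int(env.get("MAX_CONVERSION_TIME", "120")),
        temp_dir=env.get("TEMP_DIR", "/tmp/conversions"),
        # assumption: multiple origins are separated by commas
        cors_origins=[o.strip() for o in origins.split(",") if o.strip()],
    )
```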
File Handling
Supported File Types
- DOCX (Microsoft Word documents)
File Size Limits
- Default maximum: 50MB
- Configurable via the MAX_FILE_SIZE environment variable
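A client can check the file type and size before uploading to avoid a wasted request. A minimal sketch follows; the function name `upload_problem` is hypothetical, and it assumes a file exactly at the limit is still accepted, which the service may treat differently.

```python
import os
from typing import Optional

DEFAULT_MAX_FILE_SIZE = 52428800  # 50MB, the documented default


def upload_problem(filename: str, size_bytes: int,
                   max_size: int = DEFAULT_MAX_FILE_SIZE) -> Optional[str]:
    """Return a human-readable reason the upload would be rejected, or None if it looks fine."""
    if os.path.splitext(filename)[1].lower() != ".docx":
        return "Only DOCX files are accepted"
    if size_bytes > max_size:  # assumption: the limit itself is allowed
        return f"File is {size_bytes} bytes; limit is {max_size}"
    return None
```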
Storage
- Converted files are stored temporarily in the conversions directory
- This directory is mounted as a Docker volume for persistence
- Files are automatically cleaned up when the container is restarted
Error Handling
The API provides detailed error messages for troubleshooting:
- 400 Bad Request: Invalid input parameters
- 413 Payload Too Large: File exceeds size limits
- 500 Internal Server Error: Conversion failed
Example error response:
{
  "success": false,
  "error": "File too large"
}
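On the client side, the status code and the `error` field can be combined into a single message. The sketch below uses a hypothetical helper, `describe_failure`, and assumes error bodies follow the JSON shape shown above; it falls back to the status code alone when the body is not JSON.

```python
import json

# Meanings of the documented status codes
HINTS = {
    400: "Invalid input parameters",
    413: "File exceeds size limits",
    500: "Conversion failed",
}


def describe_failure(status, body):
    """Combine the documented meaning of the status code with the API's own error text."""
    hint = HINTS.get(status, "Unexpected response")
    try:
        detail = json.loads(body).get("error")
    except (ValueError, AttributeError):
        # body was not JSON, or not a JSON object
        detail = None
    return f"{status} {hint}: {detail}" if detail else f"{status} {hint}"
```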
Performance Considerations
Batch Processing
For converting multiple files, use the batch endpoint to reduce overhead:
curl -X POST "http://localhost:8000/convert/batch" \
  -H "Content-Type: application/json" \
  -d '{"files": [...]}'
Resource Usage
- Each conversion uses a separate LibreOffice instance
- Monitor memory usage for large files
- Consider scaling the service for high-volume usage
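Because each conversion starts its own LibreOffice instance, a client that sends many files at once may want to cap how many requests are in flight. One possible approach is a small thread pool, sketched here; `convert_all` is a hypothetical helper, `convert_one` stands for whatever function performs a single upload, and the default of two workers is an arbitrary starting point.

```python
from concurrent.futures import ThreadPoolExecutor


def convert_all(items, convert_one, max_workers=2):
    """Run convert_one over items with at most max_workers calls in flight; keeps input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(convert_one, items))
```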
Troubleshooting
Common Issues
Service won't start:
- Ensure Docker and Docker Compose are installed
- Check that port 8000 is not in use
- Verify sufficient system resources
Conversion fails:
- Check that the DOCX file is valid
- Verify file size is within limits
- Review logs with docker-compose logs
Download fails:
- Ensure the file hasn't been cleaned up
- Check the download URL is correct
Viewing Logs
docker-compose logs -f docx-to-pdf-enhanced
Testing
Run the test suite:
docker-compose run --rm docx-to-pdf-enhanced python3 -m pytest tests/
Deployment
See DEPLOYMENT_ENHANCED.md for detailed deployment instructions for production environments.
Security
- Files are validated for type and size
- Only DOCX files are accepted
- CORS can be configured for production use
- Run containers with minimal privileges
This enhanced version is designed as a robust, scalable service for converting DOCX files to PDF, with Arabic language support and formatting preservation.