| # Usage Guide for Enhanced DOCX to PDF Converter | |
| This guide explains how to use the enhanced DOCX to PDF converter, which has been rebuilt from the original Gradio-based version as a FastAPI service. | |
| ## Getting Started | |
| ### Prerequisites | |
| - Docker and Docker Compose installed | |
| - At least 4GB of available RAM | |
| - Internet connection for initial setup | |
| ### Quick Start | |
| 1. Clone or download this repository | |
| 2. Navigate to the project directory | |
| 3. Run the service: | |
| ```bash | |
| docker-compose up --build | |
| ``` | |
| 4. Access the API at `http://localhost:8000` | |
| 5. View API documentation at `http://localhost:8000/docs` | |
| ## API Endpoints | |
| ### Convert Single DOCX File | |
| **POST** `/convert` | |
| Converts a single DOCX file to PDF. | |
| #### Using Multipart File Upload: | |
| ```bash | |
| curl -X POST "http://localhost:8000/convert" \ | |
| -H "accept: application/json" \ | |
| -H "Content-Type: multipart/form-data" \ | |
| -F "file=@document.docx" | |
| ``` | |
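The same upload can be sketched in Python using only the standard library. This is a minimal sketch, not the project's client: `encode_multipart` and `build_convert_request` are hypothetical helper names, the form field name `file` is taken from the curl example above, and the base URL assumes the default local setup.

```python
import os
import uuid
import urllib.request

# Registered MIME type for .docx files.
DOCX_MIME = "application/vnd.openxmlformats-officedocument.wordprocessingml.document"


def encode_multipart(filename: str, data: bytes, field: str = "file") -> tuple[bytes, str]:
    """Return (body, content_type) for a single-file multipart/form-data upload.

    Assumes the filename contains no double quotes or line breaks.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {DOCX_MIME}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("ascii")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def build_convert_request(path: str, base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Build (but do not send) the POST /convert request for a local file."""
    with open(path, "rb") as fh:
        body, content_type = encode_multipart(os.path.basename(path), fh.read())
    return urllib.request.Request(
        f"{base_url}/convert",
        data=body,
        headers={"Content-Type": content_type, "Accept": "application/json"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen(build_convert_request("document.docx"))` would send the request; the JSON reply can then be read with `json.load`.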
| #### Using Base64 Content: | |
| ```bash | |
| # First encode your file to base64 as a single line (no line wrapping) | |
| BASE64_CONTENT=$(base64 -i document.docx | tr -d '\n') | |
| # Then send the request; --data-urlencode escapes '+', '/' and '=' in the base64 text | |
| curl -X POST "http://localhost:8000/convert" \ | |
| -H "accept: application/json" \ | |
| -H "Content-Type: application/x-www-form-urlencoded" \ | |
| --data-urlencode "file_content=$BASE64_CONTENT" \ | |
| --data-urlencode "filename=document.docx" | |
| ``` | |
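Base64 text can contain `+`, `/`, and `=`, and a form decoder reads a bare `+` as a space, so the value has to be percent-encoded before it is sent. The sketch below shows one way to build such a body in Python; `build_base64_form` is a hypothetical helper, and the field names `file_content` and `filename` are taken from the curl example.

```python
import base64
from urllib.parse import urlencode


def build_base64_form(data: bytes, filename: str) -> bytes:
    """Encode a document as the form body for the base64 variant of POST /convert."""
    # b64encode produces a single line without wrapping.
    encoded = base64.b64encode(data).decode("ascii")
    # urlencode percent-escapes '+', '/' and '=' so the server can recover them intact.
    return urlencode({"file_content": encoded, "filename": filename}).encode("ascii")
```

The returned bytes can be posted with the header `Content-Type: application/x-www-form-urlencoded`.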
| #### Response: | |
| ```json | |
| { | |
|   "success": true, | |
|   "pdf_url": "/download/abc123/document.pdf", | |
|   "message": "Conversion successful" | |
| } | |
| ``` | |
| ### Batch Convert Multiple DOCX Files | |
| **POST** `/convert/batch` | |
| Converts multiple DOCX files in a single request. | |
| ```bash | |
| curl -X POST "http://localhost:8000/convert/batch" \ | |
| -H "accept: application/json" \ | |
| -H "Content-Type: application/json" \ | |
| -d '{ | |
|   "files": [ | |
|     { | |
|       "file_content": "base64_encoded_content_1", | |
|       "filename": "document1.docx" | |
|     }, | |
|     { | |
|       "file_content": "base64_encoded_content_2", | |
|       "filename": "document2.docx" | |
|     } | |
|   ] | |
| }' | |
| ``` | |
| #### Response: | |
| ```json | |
| [ | |
|   { | |
|     "success": true, | |
|     "pdf_url": "/download/abc123/document1.pdf", | |
|     "message": "Conversion successful" | |
|   }, | |
|   { | |
|     "success": false, | |
|     "error": "Error description" | |
|   } | |
| ] | |
| ``` | |
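A batch request body and its response can be handled from Python roughly as follows. Both function names are hypothetical, and the sketch assumes results come back in the same order as the submitted files, which the example response suggests but does not state.

```python
import base64
import json


def build_batch_payload(files: dict[str, bytes]) -> str:
    """Serialize {filename: raw bytes} into the JSON body for POST /convert/batch."""
    entries = [
        {"file_content": base64.b64encode(data).decode("ascii"), "filename": name}
        for name, data in files.items()
    ]
    return json.dumps({"files": entries})


def split_batch_results(filenames: list[str], results: list[dict]) -> tuple[dict, dict]:
    """Pair results with filenames by position (an assumption); return (pdf_urls, errors)."""
    pdf_urls, errors = {}, {}
    for name, result in zip(filenames, results):
        if result.get("success"):
            pdf_urls[name] = result["pdf_url"]
        else:
            errors[name] = result.get("error", "unknown error")
    return pdf_urls, errors
```

Keeping failures in a separate mapping makes it straightforward to retry only the files that did not convert.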
| ### Download Converted PDF | |
| **GET** `/download/{temp_id}/{filename}` | |
| Downloads a converted PDF file. | |
| ```bash | |
| curl -X GET "http://localhost:8000/download/abc123/document.pdf" \ | |
| -o document.pdf | |
| ``` | |
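Because `pdf_url` in the conversion response is a path rather than a full URL, a client has to join it with the service's base URL before downloading. A minimal Python sketch, with hypothetical helper names and the default local base URL assumed:

```python
import shutil
import urllib.request
from urllib.parse import urljoin


def download_url(pdf_url: str, base_url: str = "http://localhost:8000") -> str:
    """Resolve the relative pdf_url from a conversion response against the service URL."""
    return urljoin(base_url, pdf_url)


def download_pdf(pdf_url: str, dest: str, base_url: str = "http://localhost:8000",
                 timeout: float = 30.0) -> None:
    """Stream the converted PDF to a local file without loading it all into memory."""
    with urllib.request.urlopen(download_url(pdf_url, base_url), timeout=timeout) as resp:
        with open(dest, "wb") as out:
            shutil.copyfileobj(resp, out)
```

For example, `download_pdf("/download/abc123/document.pdf", "document.pdf")` mirrors the curl command above.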
| ### Health Check | |
| **GET** `/health` | |
| Checks if the service is running. | |
| ```bash | |
| curl -X GET "http://localhost:8000/health" | |
| ``` | |
| #### Response: | |
| ```json | |
| { | |
|   "status": "healthy", | |
|   "version": "2.0.0" | |
| } | |
| ``` | |
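A script that starts the container and then submits files may want to wait until the health check passes. The following is one possible polling sketch; `fetch_health` and `wait_until_healthy` are hypothetical names, and the retry count and delay are arbitrary defaults.

```python
import json
import time
import urllib.request
from typing import Callable


def fetch_health(base_url: str = "http://localhost:8000") -> dict:
    """Call GET /health and return the decoded JSON body."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=5) as resp:
        return json.load(resp)


def wait_until_healthy(fetch: Callable[[], dict] = fetch_health,
                       attempts: int = 30, delay: float = 1.0) -> bool:
    """Poll until the service reports status "healthy"; return False if it never does."""
    for attempt in range(attempts):
        try:
            if fetch().get("status") == "healthy":
                return True
        except (OSError, ValueError):
            pass  # connection refused, timeout, or a non-JSON body: try again
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Passing the fetch function as a parameter keeps the polling logic testable without a running service.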
| ## Browser Integration | |
| The API sends CORS headers (by default `CORS_ORIGINS` is `*`, which allows any origin), so web applications can call the service directly from the browser with the Fetch API or XMLHttpRequest. | |
| ### Example JavaScript Integration: | |
| ```javascript | |
| // Convert a DOCX file, then open or download the resulting PDF | |
| async function convertDocxToPdf(file) { | |
|   const formData = new FormData(); | |
|   formData.append('file', file); | |
|   try { | |
|     const response = await fetch('http://localhost:8000/convert', { | |
|       method: 'POST', | |
|       body: formData | |
|     }); | |
|     const result = await response.json(); | |
|     if (result.success) { | |
|       const pdfUrl = 'http://localhost:8000' + result.pdf_url; | |
|       // Option 1: open the PDF in a new tab | |
|       // window.open(pdfUrl, '_blank'); | |
|       // Option 2: trigger a download (browsers may ignore `download` for cross-origin URLs) | |
|       const link = document.createElement('a'); | |
|       link.href = pdfUrl; | |
|       link.download = 'converted.pdf'; | |
|       link.click(); | |
|     } else { | |
|       console.error('Conversion failed:', result.error); | |
|     } | |
|   } catch (error) { | |
|     console.error('Network error:', error); | |
|   } | |
| } | |
| ``` | |
| ## Configuration | |
| The service can be configured using environment variables: | |
| | Variable | Description | Default | | |
| |----------|-------------|---------| | |
| | `PORT` | Application port | 8000 | | |
| | `MAX_FILE_SIZE` | Maximum file size in bytes | 52428800 (50MB) | | |
| | `MAX_CONVERSION_TIME` | Conversion timeout in seconds | 120 | | |
| | `TEMP_DIR` | Temporary directory for conversions | /tmp/conversions | | |
| | `CORS_ORIGINS` | CORS allowed origins | * | | |
| ### Example with custom configuration: | |
| ```bash | |
| PORT=8080 MAX_FILE_SIZE=104857600 docker-compose up | |
| ``` | |
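To illustrate how these variables and defaults fit together, here is a sketch of reading them in Python. It is not the service's actual configuration code: the `Settings` class and `load_settings` function are hypothetical, and treating `CORS_ORIGINS` as a comma-separated list is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Defaults mirror the table above."""
    port: int = 8000
    max_file_size: int = 52428800        # bytes (50 MB)
    max_conversion_time: int = 120       # seconds
    temp_dir: str = "/tmp/conversions"
    cors_origins: tuple[str, ...] = ("*",)


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read each documented variable, falling back to its default when unset."""
    origins = env.get("CORS_ORIGINS", "*")
    return Settings(
        port=int(env.get("PORT", "8000")),
        max_file_size=int(env.get("MAX_FILE_SIZE", "52428800")),
        max_conversion_time=int(env.get("MAX_CONVERSION_TIME", "120")),
        temp_dir=env.get("TEMP_DIR", "/tmp/conversions"),
        # Assumption: multiple origins are separated by commas.
        cors_origins=tuple(o.strip() for o in origins.split(",") if o.strip()),
    )
```

With the custom configuration shown above, `load_settings()` would report port 8080 and a 104857600-byte (100 MB) size limit.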
| ## File Handling | |
| ### Supported File Types | |
| - DOCX (Microsoft Word documents) | |
| ### File Size Limits | |
| - Default maximum: 50MB | |
| - Configurable via `MAX_FILE_SIZE` environment variable | |
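A client can check these limits before uploading and avoid a round trip that would be rejected. The helper below is a hypothetical sketch; it assumes the server accepts files up to and including the limit and checks only the file extension, not the file contents.

```python
from typing import Optional

DEFAULT_MAX_FILE_SIZE = 52428800  # bytes; the documented 50 MB default


def upload_problem(filename: str, size: int,
                   max_size: int = DEFAULT_MAX_FILE_SIZE) -> Optional[str]:
    """Return a human-readable reason the upload would be rejected, or None if it looks fine."""
    if not filename.lower().endswith(".docx"):
        return "only .docx files are supported"
    if size == 0:
        return "file is empty"
    if size > max_size:
        return f"file is {size} bytes; the limit is {max_size}"
    return None
```

A typical call would be `upload_problem(path, os.path.getsize(path))`, passing a different `max_size` if the service was started with a custom `MAX_FILE_SIZE`.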
| ### Storage | |
| - Converted files are stored temporarily in the `conversions` directory (see `TEMP_DIR` under Configuration) | |
| - This directory is mounted as a Docker volume for persistence | |
| - Files are automatically cleaned up when the container is restarted | |
| ## Error Handling | |
| The API provides detailed error messages for troubleshooting: | |
| - `400 Bad Request`: Invalid input parameters | |
| - `413 Payload Too Large`: File exceeds size limits | |
| - `500 Internal Server Error`: Conversion failed | |
| Example error response: | |
| ```json | |
| { | |
|   "success": false, | |
|   "error": "File too large" | |
| } | |
| ``` | |
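On the client side, the status code and the `error` field can be combined into a single message. This is a sketch with a hypothetical helper name; it assumes error bodies follow the JSON shape shown above and falls back gracefully when they do not.

```python
import json

# Meanings taken from the status list above.
STATUS_HINTS = {
    400: "invalid input parameters",
    413: "file exceeds size limits",
    500: "conversion failed",
}


def explain_failure(status: int, body: bytes) -> str:
    """Combine the HTTP status with the "error" field of the response, if present."""
    try:
        payload = json.loads(body)
        detail = payload.get("error") if isinstance(payload, dict) else None
    except ValueError:  # empty, non-JSON, or badly encoded body
        detail = None
    hint = STATUS_HINTS.get(status, "unexpected status")
    return f"HTTP {status} ({hint}): {detail or 'no error detail in response'}"
```

With `urllib`, the inputs are available on the raised `urllib.error.HTTPError` as `err.code` and `err.read()`.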
| ## Performance Considerations | |
| ### Batch Processing | |
| For converting multiple files, use the batch endpoint to reduce overhead: | |
| ```bash | |
| curl -X POST "http://localhost:8000/convert/batch" \ | |
| -H "Content-Type: application/json" \ | |
| -d '{"files": [...]}' | |
| ``` | |
| ### Resource Usage | |
| - Each conversion uses a separate LibreOffice instance | |
| - Monitor memory usage for large files | |
| - Consider scaling the service for high-volume usage | |
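Since each conversion starts its own LibreOffice instance, a client that submits many files may want to cap how many requests are in flight at once. The sketch below does this with a bounded thread pool; `convert_all` is a hypothetical helper, `convert_one` stands in for whatever function uploads a single file, and the limit of 2 is an arbitrary default.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable


def convert_all(paths: Iterable[str], convert_one: Callable[[str], dict],
                max_parallel: int = 2) -> dict[str, dict]:
    """Run convert_one over all paths with at most max_parallel requests in flight."""
    paths = list(paths)

    def attempt(path: str) -> dict:
        try:
            return convert_one(path)
        except Exception as exc:  # keep one failure from aborting the remaining files
            return {"success": False, "error": str(exc)}

    # The pool size bounds concurrency; map() returns results in input order.
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        results = list(pool.map(attempt, paths))
    return dict(zip(paths, results))
```

Lowering `max_parallel` trades throughput for lower peak memory use on the server.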
| ## Troubleshooting | |
| ### Common Issues | |
| 1. **Service won't start**: | |
| - Ensure Docker and Docker Compose are installed | |
| - Check that port 8000 is not in use | |
| - Verify sufficient system resources | |
| 2. **Conversion fails**: | |
| - Check that the DOCX file is valid | |
| - Verify file size is within limits | |
| - Review logs with `docker-compose logs` | |
| 3. **Download fails**: | |
| - Ensure the file hasn't been cleaned up | |
| - Check the download URL is correct | |
| ### Viewing Logs | |
| ```bash | |
| docker-compose logs -f docx-to-pdf-enhanced | |
| ``` | |
| ## Testing | |
| Run the test suite: | |
| ```bash | |
| docker-compose run --rm docx-to-pdf-enhanced python3 -m pytest tests/ | |
| ``` | |
| ## Deployment | |
| See [DEPLOYMENT_ENHANCED.md](DEPLOYMENT_ENHANCED.md) for detailed deployment instructions for production environments. | |
| ## Security | |
| - Files are validated for type and size | |
| - Only DOCX files are accepted | |
| - `CORS_ORIGINS` defaults to `*`; restrict it to trusted origins for production use | |
| - Run containers with minimal privileges | |
| This enhanced version provides a robust, scalable solution for converting DOCX files to PDF with excellent Arabic language support and formatting preservation. |