# Usage Guide for Enhanced DOCX to PDF Converter

This guide explains how to use the enhanced DOCX to PDF converter, which has been redesigned from the original Gradio-based version as a FastAPI service.

## Getting Started

### Prerequisites

- Docker and Docker Compose installed
- At least 4GB of available RAM
- Internet connection for initial setup

### Quick Start

1. Clone or download this repository
2. Navigate to the project directory
3. Run the service:
   ```bash
   docker-compose up --build
   ```
4. Access the API at `http://localhost:8000`
5. View API documentation at `http://localhost:8000/docs`

## API Endpoints

### Convert Single DOCX File

**POST** `/convert`

Converts a single DOCX file to PDF.

#### Using Multipart File Upload:

```bash
curl -X POST "http://localhost:8000/convert" \
  -H "accept: application/json" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@document.docx"
```

#### Using Base64 Content:

```bash
# First encode your file to base64 (strip line wraps so the value is one line)
BASE64_CONTENT=$(base64 < document.docx | tr -d '\n')

# Then send the request; --data-urlencode escapes the "+", "/" and "="
# characters that base64 output can contain
curl -X POST "http://localhost:8000/convert" \
  -H "accept: application/json" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  --data-urlencode "file_content=$BASE64_CONTENT" \
  -d "filename=document.docx"
```

#### Response:

```json
{
  "success": true,
  "pdf_url": "/download/abc123/document.pdf",
  "message": "Conversion successful"
}
```

### Batch Convert Multiple DOCX Files

**POST** `/convert/batch`

Converts multiple DOCX files in a single request.
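
The batch request body pairs each `filename` with its base64-encoded `file_content`. As a minimal sketch, assuming Python 3 is available on the client, such a body could be assembled as follows. `build_batch_payload` is a hypothetical helper name used for illustration and is not part of the service.

```python
import base64
import json
from pathlib import Path


def build_batch_payload(paths):
    """Return a JSON string shaped like the /convert/batch request body."""
    files = []
    for path in map(Path, paths):
        # Standard base64, decoded to ASCII text so it can be embedded in JSON.
        encoded = base64.b64encode(path.read_bytes()).decode("ascii")
        files.append({"file_content": encoded, "filename": path.name})
    return json.dumps({"files": files})
```

Writing the returned string to a file such as `payload.json` (a name chosen here for illustration) lets curl send it with `-d @payload.json`, which avoids pasting large base64 strings on the command line.
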
```bash
curl -X POST "http://localhost:8000/convert/batch" \
  -H "accept: application/json" \
  -H "Content-Type: application/json" \
  -d '{
    "files": [
      {
        "file_content": "base64_encoded_content_1",
        "filename": "document1.docx"
      },
      {
        "file_content": "base64_encoded_content_2",
        "filename": "document2.docx"
      }
    ]
  }'
```

#### Response:

```json
[
  {
    "success": true,
    "pdf_url": "/download/abc123/document1.pdf",
    "message": "Conversion successful"
  },
  {
    "success": false,
    "error": "Error description"
  }
]
```

### Download Converted PDF

**GET** `/download/{temp_id}/{filename}`

Downloads a converted PDF file.

```bash
curl -X GET "http://localhost:8000/download/abc123/document.pdf" \
  -o document.pdf
```

### Health Check

**GET** `/health`

Checks if the service is running.

```bash
curl -X GET "http://localhost:8000/health"
```

Response:

```json
{
  "status": "healthy",
  "version": "2.0.0"
}
```

## Browser Integration

The API includes full CORS support for direct browser integration. You can use the Fetch API or XMLHttpRequest to communicate directly with the service from web applications.
### Example JavaScript Integration:

```javascript
// Convert and download a file
async function convertDocxToPdf(file) {
  const formData = new FormData();
  formData.append('file', file);

  try {
    const response = await fetch('http://localhost:8000/convert', {
      method: 'POST',
      body: formData
    });

    const result = await response.json();

    if (result.success) {
      // Open PDF in new tab
      window.open('http://localhost:8000' + result.pdf_url, '_blank');

      // Or download directly
      const link = document.createElement('a');
      link.href = 'http://localhost:8000' + result.pdf_url;
      link.download = 'converted.pdf';
      link.click();
    } else {
      console.error('Conversion failed:', result.error);
    }
  } catch (error) {
    console.error('Network error:', error);
  }
}
```

## Configuration

The service can be configured using environment variables:

| Variable | Description | Default |
|----------|-------------|---------|
| `PORT` | Application port | 8000 |
| `MAX_FILE_SIZE` | Maximum file size in bytes | 52428800 (50MB) |
| `MAX_CONVERSION_TIME` | Conversion timeout in seconds | 120 |
| `TEMP_DIR` | Temporary directory for conversions | /tmp/conversions |
| `CORS_ORIGINS` | CORS allowed origins | * |

### Example with custom configuration:

```bash
PORT=8080 MAX_FILE_SIZE=104857600 docker-compose up
```

## File Handling

### Supported File Types

- DOCX (Microsoft Word documents)

### File Size Limits

- Default maximum: 50MB
- Configurable via `MAX_FILE_SIZE` environment variable

### Storage

- Converted files are stored temporarily in the `conversions` directory
- This directory is mounted as a Docker volume for persistence
- Files are automatically cleaned up when the container is restarted

## Error Handling

The API provides detailed error messages for troubleshooting:

- `400 Bad Request`: Invalid input parameters
- `413 Payload Too Large`: File exceeds size limits
- `500 Internal Server Error`: Conversion failed

Example error response:

```json
{
  "success": false,
  "error": "File too large"
}
```

## Performance Considerations

### Batch Processing

For converting multiple files, use the batch endpoint to reduce overhead:

```bash
curl -X POST "http://localhost:8000/convert/batch" \
  -H "Content-Type: application/json" \
  -d '{"files": [...]}'
```

### Resource Usage

- Each conversion uses a separate LibreOffice instance
- Monitor memory usage for large files
- Consider scaling the service for high-volume usage

## Troubleshooting

### Common Issues

1. **Service won't start**:
   - Ensure Docker and Docker Compose are installed
   - Check that port 8000 is not in use
   - Verify sufficient system resources

2. **Conversion fails**:
   - Check that the DOCX file is valid
   - Verify file size is within limits
   - Review logs with `docker-compose logs`

3. **Download fails**:
   - Ensure the file hasn't been cleaned up
   - Check the download URL is correct

### Viewing Logs

```bash
docker-compose logs -f docx-to-pdf-enhanced
```

## Testing

Run the test suite:

```bash
docker-compose run --rm docx-to-pdf-enhanced python3 -m pytest tests/
```

## Deployment

See [DEPLOYMENT_ENHANCED.md](DEPLOYMENT_ENHANCED.md) for detailed deployment instructions for production environments.

## Security

- Files are validated for type and size
- Only DOCX files are accepted
- CORS can be configured for production use
- Run containers with minimal privileges

This enhanced version provides a robust, scalable solution for converting DOCX files to PDF with excellent Arabic language support and formatting preservation.
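
As a closing illustration of the response shapes documented in this guide, the sketch below splits a batch response into download URLs and failures. It assumes Python 3 on the client. `summarize_batch` and its `base_url` parameter are hypothetical names, and the field names (`success`, `pdf_url`, `error`) are taken from the example responses in this guide rather than from the service's source.

```python
from urllib.parse import urljoin


def summarize_batch(results, base_url="http://localhost:8000"):
    """Split a parsed /convert/batch response into download URLs and failures."""
    downloads, failures = [], []
    for index, item in enumerate(results):
        if item.get("success"):
            # pdf_url is a server-relative path, so resolve it against the base URL.
            downloads.append(urljoin(base_url, item["pdf_url"]))
        else:
            # Keep the position so the caller can map a failure back to its input file.
            failures.append((index, item.get("error", "unknown error")))
    return downloads, failures
```

Each returned URL can then be fetched with the download endpoint, while the failure indices point back to the corresponding entries in the submitted `files` array.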