# PDF Report Generation Integration
## Summary
Integrated PDF generation into the report file service, using utilities from `folder/utils copy`. When enabled, reports are automatically converted to PDF as the final step of saving.
## Changes Made
### 1. Added PDF Conversion Utilities
**Files Created:**
- `src/utils/md_to_pdf.py` - Markdown to PDF conversion utility
- `src/utils/markdown.css` - CSS styling for PDF output
**Features:**
- Uses `md2pdf` library for conversion
- Includes error handling and graceful fallback
- Supports custom CSS styling
- Logs conversion status
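A minimal sketch of how a conversion utility with these features could look. The names `convert_markdown_to_pdf`, `Converter`, and `_default_converter` are hypothetical and are not taken from `src/utils/md_to_pdf.py`; the keyword arguments passed to `md2pdf` follow that library's README and may differ between versions.

```python
import logging
from pathlib import Path
from typing import Callable, Optional

logger = logging.getLogger(__name__)

# Hypothetical converter type: (pdf_path, markdown_text, css_path) -> None
Converter = Callable[[Path, str, Optional[Path]], None]


def _default_converter() -> Optional[Converter]:
    """Return a converter backed by md2pdf, or None if it cannot be imported."""
    try:
        from md2pdf.core import md2pdf  # optional dependency
    except (ImportError, OSError):
        return None

    def convert(pdf_path: Path, markdown_text: str, css_path: Optional[Path]) -> None:
        # Assumption: keyword names as in the md2pdf README; check your installed version.
        md2pdf(
            str(pdf_path),
            md_content=markdown_text,
            css_file_path=str(css_path) if css_path else None,
        )

    return convert


def convert_markdown_to_pdf(
    markdown_text: str,
    pdf_path: Path,
    css_path: Optional[Path] = None,
    converter: Optional[Converter] = None,
) -> Optional[Path]:
    """Write a PDF and return its path, or None if conversion is unavailable or fails."""
    converter = converter or _default_converter()
    if converter is None:
        logger.warning("md2pdf is not available; skipping PDF conversion")
        return None
    try:
        converter(pdf_path, markdown_text, css_path)
    except Exception:
        # Graceful fallback: log the error and let the caller carry on with markdown only.
        logger.exception("PDF conversion failed for %s", pdf_path)
        return None
    logger.info("PDF written to %s", pdf_path)
    return pdf_path
```

Passing the converter as a parameter keeps the fallback logic testable without `md2pdf` installed.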
### 2. Enhanced ReportFileService
**File:** `src/services/report_file_service.py`
**Changes:**
- Added `_save_pdf()` method to generate PDF from markdown
- Updated `save_report_multiple_formats()` to implement PDF generation
- PDF is generated when `report_file_format` is set to `"md_pdf"`
- Both markdown and PDF files are saved and returned
**Method Signature:**
```python
def _save_pdf(
    self,
    report_content: str,
    query: str | None = None,
) -> str:
    """Save report as PDF. Returns path to PDF file."""
```
### 3. Updated Graph Orchestrator
**File:** `src/orchestrator/graph_orchestrator.py`
**Changes:**
- Updated synthesizer node to use `save_report_multiple_formats()`
- Updated writer node to use `save_report_multiple_formats()`
- Both nodes now return PDF paths in result dict when available
- Result includes both `file` (markdown) and `files` (both formats) keys
**Result Format:**
```python
{
    "message": final_report,  # Report content
    "file": "/path/to/report.md",  # Markdown file
    "files": ["/path/to/report.md", "/path/to/report.pdf"],  # Both formats
}
```
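Since the PDF path is only present when conversion succeeded, a consumer of this dict should not assume `files` has two entries. A small sketch of a defensive lookup follows; `find_pdf_path` is a hypothetical helper, not part of the orchestrator.

```python
from pathlib import Path
from typing import Any, Optional


def find_pdf_path(result: dict[str, Any]) -> Optional[str]:
    """Return the first .pdf path listed under "files", or None if there is none."""
    for path in result.get("files") or []:
        if Path(path).suffix.lower() == ".pdf":
            return path
    return None


result = {
    "message": "# Report",
    "file": "/path/to/report.md",
    "files": ["/path/to/report.md", "/path/to/report.pdf"],
}
pdf_path = find_pdf_path(result)
```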
## Configuration
PDF generation is controlled by the `report_file_format` setting in `src/utils/config.py`:
```python
report_file_format: Literal["md", "md_html", "md_pdf"] = Field(
    default="md",
    description="File format(s) to save reports in.",
)
```
**Options:**
- `"md"` - Save only markdown (default)
- `"md_html"` - Save markdown + HTML (not yet implemented)
- `"md_pdf"` - Save markdown + PDF ✅ **Now implemented**
## Usage
### Enable PDF Generation
Set the environment variable or update settings:
```bash
REPORT_FILE_FORMAT=md_pdf
```
Or in code:
```python
from src.utils.config import settings
settings.report_file_format = "md_pdf"
```
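To illustrate how the environment value maps onto the allowed options, here is a standard-library sketch that validates `REPORT_FILE_FORMAT` against the same `Literal` values. It is a stand-in for illustration only: the project's settings class performs its own validation, and `read_report_file_format` is a hypothetical name.

```python
import os
from typing import Literal, Mapping, Optional, get_args

ReportFileFormat = Literal["md", "md_html", "md_pdf"]


def read_report_file_format(env: Optional[Mapping[str, str]] = None) -> str:
    """Read REPORT_FILE_FORMAT, defaulting to "md" and rejecting unknown values."""
    env = os.environ if env is None else env
    value = env.get("REPORT_FILE_FORMAT", "md")
    allowed = get_args(ReportFileFormat)
    if value not in allowed:
        raise ValueError(
            f"REPORT_FILE_FORMAT must be one of {allowed}, got {value!r}"
        )
    return value
```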
### Dependencies
PDF generation requires the `md2pdf` library:
```bash
pip install md2pdf
```
If `md2pdf` is not installed, the system will:
- Log a warning
- Continue with markdown-only saving
- Not fail the report generation
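One way to surface a missing dependency early, for example at startup, is to check whether the module can be located without importing it. This is a sketch under that assumption; `pdf_support_available` is a hypothetical helper and not necessarily how the service performs the check.

```python
import importlib.util
import logging

logger = logging.getLogger(__name__)


def pdf_support_available(module_name: str = "md2pdf") -> bool:
    """Return True if the module can be found on the import path."""
    available = importlib.util.find_spec(module_name) is not None
    if not available:
        logger.warning(
            "%s is not installed; reports will be saved as markdown only",
            module_name,
        )
    return available
```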
## File Output
When PDF generation is enabled:
1. Markdown file is always saved first
2. PDF is generated from the markdown content
3. Both file paths are returned in the result
4. Gradio interface can display/download both files
## Error Handling
- If PDF generation fails, markdown file is still saved
- Errors are logged but don't interrupt report generation
- Graceful fallback ensures reports are always available
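The output order and fallback behaviour can be sketched as follows. This is an illustration, not the code in `report_file_service.py`: `save_report`, its parameters, and the injected `write_pdf` callable are hypothetical, and the returned dict simply mirrors the result format shown earlier.

```python
import logging
from pathlib import Path
from typing import Callable

logger = logging.getLogger(__name__)


def save_report(
    report_content: str,
    output_dir: Path,
    report_file_format: str,
    write_pdf: Callable[[str, Path], None],
    stem: str = "report",
) -> dict[str, object]:
    """Save markdown, optionally add a PDF, and return the paths that were written."""
    output_dir.mkdir(parents=True, exist_ok=True)

    # Step 1: the markdown file is always written first.
    md_path = output_dir / f"{stem}.md"
    md_path.write_text(report_content, encoding="utf-8")
    files = [str(md_path)]

    # Step 2: the PDF is generated only when requested; a failure is logged, not raised.
    if report_file_format == "md_pdf":
        pdf_path = output_dir / f"{stem}.pdf"
        try:
            write_pdf(report_content, pdf_path)
            files.append(str(pdf_path))
        except Exception:
            logger.exception("PDF generation failed; keeping markdown only")

    # Step 3: return every path that was actually written.
    return {"message": report_content, "file": str(md_path), "files": files}
```

Because the markdown path is appended before the PDF step runs, a conversion error can never leave the caller without a report file.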
## Integration Points
PDF generation is triggered automatically whenever `report_file_format` is set to `"md_pdf"` and one of the following occurs:
1. The graph orchestrator synthesizer node completes
2. The graph orchestrator writer node completes
3. `save_report_multiple_formats()` is called
## Future Enhancements
- HTML format support (`md_html`)
- Custom PDF templates
- PDF metadata (title, author, keywords)
- PDF compression options
- Batch PDF generation