Joseph Pollack committed
Commit · 0854ced
1 Parent(s): f5a06d4
randomly adds a logger to gradio because no time to refactor
Files changed:
- .gitignore +1 -2
- REPORT_WRITING_AGENTS_ANALYSIS.md +182 -0
- src/app.py +2 -2
.gitignore
CHANGED
@@ -1,4 +1,5 @@
 folder/
+site/
 .cursor/
 .ruff_cache/
 # Python
@@ -74,7 +75,5 @@ htmlcov/
 chroma_db/
 *.sqlite3
 
-# Development directory (personal notes and planning)
-dev/
 
 # Trigger rebuild Wed Nov 26 17:51:41 EST 2025
REPORT_WRITING_AGENTS_ANALYSIS.md
ADDED
@@ -0,0 +1,182 @@
# Report Writing Agents Analysis

## Summary

This document identifies all agents and methods in the repository that generate reports or write to files.

## Key Finding

**All report-writing agents return their output in memory (markdown strings, or a response object wrapping markdown); NONE write directly to files.**

The agents generate report content but do not save it to disk. File writing would need to be added as a separate step.

---

## Report Writing Agents

### 1. WriterAgent
**File**: `src/agents/writer.py`

**Method**: `async def write_report(query, findings, output_length, output_instructions) -> str`

**Returns**: Markdown-formatted report string

**Purpose**: Generates final reports from research findings with numbered citations

**File Writing**: ❌ **NO** - Returns string only

**Key Features**:
- Validates inputs
- Truncates very long findings (max 50,000 chars)
- Retry logic (3 retries)
- Returns markdown with numbered citations

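The truncation and retry behaviour listed above could look roughly like the sketch below. It is not taken from `src/agents/writer.py`: the helper names `truncate_findings` and `call_with_retries` are hypothetical, and it assumes "3 retries" means up to three further attempts after the first failure and that any exception is worth retrying.

```python
import asyncio
from collections.abc import Awaitable, Callable

MAX_FINDINGS_CHARS = 50_000  # limit quoted in the feature list
MAX_RETRIES = 3              # retry count quoted in the feature list


def truncate_findings(findings: str, limit: int = MAX_FINDINGS_CHARS) -> str:
    """Cut findings to at most `limit` characters (hypothetical helper)."""
    return findings if len(findings) <= limit else findings[:limit]


async def call_with_retries(
    make_call: Callable[[], Awaitable[str]],
    retries: int = MAX_RETRIES,
) -> str:
    """Await `make_call()` once, then up to `retries` more times on failure."""
    last_error = None
    for _attempt in range(retries + 1):
        try:
            return await make_call()
        except Exception as exc:  # assumption: every exception is retryable
            last_error = exc
    raise RuntimeError("report generation failed after retries") from last_error
```

A caller would wrap the model call in a zero-argument coroutine function, for example `await call_with_retries(lambda: agent_call(truncate_findings(findings)))`, where `agent_call` stands in for whatever the agent actually invokes.
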
---

### 2. LongWriterAgent
**File**: `src/agents/long_writer.py`

**Methods**:
- `async def write_next_section(original_query, report_draft, next_section_title, next_section_draft) -> LongWriterOutput`
- `async def write_report(original_query, report_title, report_draft) -> str`

**Returns**:
- `write_next_section()`: `LongWriterOutput` object (structured output)
- `write_report()`: Complete markdown report string

**Purpose**: Iteratively writes report sections with proper citations and reference management

**File Writing**: ❌ **NO** - Returns a `LongWriterOutput` object or a string only

**Key Features**:
- Writes sections iteratively
- Reformats and deduplicates references
- Adjusts heading levels
- Aggregates references across sections

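As an illustration of what "deduplicates" and "aggregates references across sections" can mean in practice, here is a minimal sketch. The function `aggregate_references` is hypothetical and not the code in `src/agents/long_writer.py`; it assumes each section carries a plain list of reference strings and that two references are duplicates when they match after trimming whitespace.

```python
def aggregate_references(sections: list[list[str]]) -> list[str]:
    """Merge per-section reference lists into one numbered list.

    Duplicates are kept only once, in first-seen order.
    """
    seen: set[str] = set()
    merged: list[str] = []
    for refs in sections:
        for ref in refs:
            key = ref.strip()
            if key and key not in seen:
                seen.add(key)
                merged.append(key)
    # Number the surviving references starting from 1.
    return [f"[{i}] {ref}" for i, ref in enumerate(merged, start=1)]
```

Renumbering after the merge means in-text citation markers would also have to be remapped; that step is left out of this sketch.
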
---

### 3. ProofreaderAgent
**File**: `src/agents/proofreader.py`

**Method**: `async def proofread(query, report_draft) -> str`

**Returns**: Final polished markdown report string

**Purpose**: Proofreads and finalizes report drafts

**File Writing**: ❌ **NO** - Returns string only

**Key Features**:
- Combines sections
- Removes duplicates
- Adds summary
- Preserves references
- Polishes wording

---

### 4. ReportAgent
**File**: `src/agents/report_agent.py`

**Method**: `async def run(messages, thread, **kwargs) -> AgentRunResponse`

**Returns**: `AgentRunResponse` with markdown text in `messages[0].text`

**Purpose**: Generates structured scientific reports from evidence and hypotheses

**File Writing**: ❌ **NO** - Returns `AgentRunResponse` object

**Key Features**:
- Uses structured `ResearchReport` model
- Validates citations
- Returns markdown via `report.to_markdown()`

---

## File Writing Operations Found

### Temporary File Writing (Not Reports)

1. **ImageOCRService** (`src/services/image_ocr.py`)
   - `_save_image_temp(image) -> str`
   - Saves temporary images for OCR processing
   - Returns temp file path

2. **STTService** (`src/services/stt_gradio.py`)
   - `_save_audio_temp(audio_array, sample_rate) -> str`
   - Saves temporary audio files for transcription
   - Returns temp file path

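Both services follow a "write to a temp file, return its path" pattern. A minimal standard-library sketch of that pattern is shown here; `save_bytes_temp` is a hypothetical stand-in that takes raw bytes, whereas the real methods take an image or an audio array and presumably encode it first.

```python
import os
import tempfile


def save_bytes_temp(data: bytes, suffix: str = ".bin") -> str:
    """Write `data` to a new temporary file and return its path.

    The caller is responsible for deleting the file when done.
    """
    # mkstemp creates the file securely and returns an open descriptor.
    fd, path = tempfile.mkstemp(suffix=suffix)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    return path
```

Because the file outlives the function call, whoever consumes the path (OCR or transcription in the services above) should remove it afterwards, for example with `os.remove(path)`.
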
---

## Where Reports Are Used

### Graph Orchestrator
**File**: `src/orchestrator/graph_orchestrator.py`

- Line 642: `final_report = await long_writer_agent.write_report(...)`
- Returns string result, stored in graph context
- Final result passed through `AgentEvent` with `message` field

### Research Flows
**File**: `src/orchestrator/research_flow.py`

- `IterativeResearchFlow._create_final_report()`: Calls `writer_agent.write_report()`
- `DeepResearchFlow._create_final_report()`: Calls `long_writer_agent.write_report()`
- Both return strings

---

## Integration Points for File Writing

To add file writing capability, you would need to:

1. **After report generation**: Save the returned string to a file
2. **In graph orchestrator**: After `write_report()`, save to file and include path in result
3. **In research flows**: After `_create_final_report()`, save to file

### Example Implementation Pattern

```python
from datetime import datetime
from pathlib import Path


# Illustrative wrapper (name is hypothetical) so that `await` and `return` are valid
async def write_and_save_report(writer_agent) -> dict[str, str]:
    # After report generation
    report_content = await writer_agent.write_report(...)

    # Save to file
    output_dir = Path("/tmp/reports")  # or configurable
    output_dir.mkdir(parents=True, exist_ok=True)
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    file_path = output_dir / f"report_{timestamp}.md"

    with open(file_path, "w", encoding="utf-8") as f:
        f.write(report_content)

    # Return a status message and the file path
    return {
        "message": "Report generated successfully",
        "file": str(file_path),
    }
```

---

## Recommendations

1. **Add file writing utility**: Create a helper function to save reports to files
2. **Make it optional**: Add configuration flag to enable/disable file saving
3. **Use temp directory**: Save to temp directory by default, allow custom path
4. **Include in graph results**: Modify graph orchestrator to optionally save and return file paths
5. **Support multiple formats**: Consider saving as both `.md` and potentially `.pdf` or `.html`

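Recommendations 1 to 3 could be covered by a single small helper. The sketch below is one possible shape, not existing repository code: the name `save_report`, the `enabled` flag, and the `reports` subfolder under the system temp directory are all assumptions made for illustration.

```python
import tempfile
from datetime import datetime
from pathlib import Path
from typing import Optional


def save_report(
    content: str,
    output_dir: Optional[Path] = None,
    enabled: bool = True,
) -> Optional[Path]:
    """Save a markdown report and return its path, or None when saving is disabled."""
    if not enabled:
        return None
    # Default to a "reports" folder under the system temp directory.
    target_dir = output_dir or Path(tempfile.gettempdir()) / "reports"
    target_dir.mkdir(parents=True, exist_ok=True)
    # Microseconds in the timestamp reduce the chance of name collisions.
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S_%f")
    file_path = target_dir / f"report_{timestamp}.md"
    file_path.write_text(content, encoding="utf-8")
    return file_path
```

Returning `None` when disabled lets callers attach the path to their result only when a file was actually written. In a real integration the flag would more likely come from configuration than from a function argument.
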
---

## Current State

- ✅ **Report Generation**: Fully implemented
- ❌ **File Writing**: Not implemented
- ✅ **File Output Integration**: Recently added (see previous work on `event_to_chat_message`)

The infrastructure to handle file outputs in Gradio is in place, but the agents themselves do not yet write files. They would need to be enhanced or wrapped to add file writing capability.

src/app.py
CHANGED
@@ -30,7 +30,7 @@ from src.agent_factory.judges import HFInferenceJudgeHandler, JudgeHandler, Mock
 from src.orchestrator_factory import create_orchestrator
 from src.services.audio_processing import get_audio_service
 from src.services.multimodal_processing import get_multimodal_service
-
+import structlog
 from src.tools.clinicaltrials import ClinicalTrialsTool
 from src.tools.europepmc import EuropePMCTool
 from src.tools.pubmed import PubMedTool
@@ -38,7 +38,7 @@ from src.tools.search_handler import SearchHandler
 from src.utils.config import settings
 from src.utils.models import AgentEvent, OrchestratorConfig
 
-
+logger = structlog.get_logger()
 
 
 def configure_orchestrator(