Joseph Pollack committed on
Commit
0854ced
·
1 Parent(s): f5a06d4

randomly adds a logger to gradio because no time to refactor

Files changed (3)
  1. .gitignore +1 -2
  2. REPORT_WRITING_AGENTS_ANALYSIS.md +182 -0
  3. src/app.py +2 -2
.gitignore CHANGED
@@ -1,4 +1,5 @@
  folder/
+ site/
  .cursor/
  .ruff_cache/
  # Python
@@ -74,7 +75,5 @@ htmlcov/
  chroma_db/
  *.sqlite3

- # Development directory (personal notes and planning)
- dev/

  # Trigger rebuild Wed Nov 26 17:51:41 EST 2025
REPORT_WRITING_AGENTS_ANALYSIS.md ADDED
@@ -0,0 +1,182 @@
+ # Report Writing Agents Analysis
+
+ ## Summary
+
+ This document identifies all agents and methods in the repository that generate reports or write to files.
+
+ ## Key Finding
+
+ **All report-writing agents return their output in memory (markdown strings, or objects wrapping markdown) - NONE write directly to files.**
+
+ The agents generate report content but do not save it to disk. File writing would need to be added as a separate step.
+
+ ---
+
+ ## Report Writing Agents
+
+ ### 1. WriterAgent
+ **File**: `src/agents/writer.py`
+
+ **Method**: `async def write_report(query, findings, output_length, output_instructions) -> str`
+
+ **Returns**: Markdown formatted report string
+
+ **Purpose**: Generates final reports from research findings with numbered citations
+
+ **File Writing**: ❌ **NO** - Returns string only
+
+ **Key Features**:
+ - Validates inputs
+ - Truncates very long findings (max 50,000 chars)
+ - Retry logic (3 retries)
+ - Returns markdown with numbered citations
+
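The WriterAgent features listed above (input validation, truncation at 50,000 characters, bounded retries) could look roughly like the sketch below. This is not the code in `src/agents/writer.py`: the names `truncate_findings`, `write_with_retry`, and the `generate` callable are hypothetical, and "3 retries" is interpreted here as at most three attempts in total.

```python
import asyncio

MAX_FINDINGS_CHARS = 50_000  # limit quoted in the analysis
MAX_ATTEMPTS = 3             # assumption: "3 retries" read as three attempts total


def truncate_findings(findings: str, limit: int = MAX_FINDINGS_CHARS) -> str:
    """Cut findings down to at most `limit` characters."""
    return findings if len(findings) <= limit else findings[:limit]


async def write_with_retry(generate, query: str, findings: str) -> str:
    """Validate inputs, truncate findings, then call `generate` with retries.

    `generate` is a hypothetical async callable standing in for the model call.
    """
    if not query.strip():
        raise ValueError("query must not be empty")
    findings = truncate_findings(findings)
    last_error = None
    for _attempt in range(MAX_ATTEMPTS):
        try:
            return await generate(query, findings)
        except Exception as exc:  # broad on purpose in this sketch
            last_error = exc
    raise RuntimeError(
        f"report generation failed after {MAX_ATTEMPTS} attempts"
    ) from last_error
```

A real implementation would likely catch narrower exception types and wait between attempts; both are left out to keep the sketch short.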
+ ---
+
+ ### 2. LongWriterAgent
+ **File**: `src/agents/long_writer.py`
+
+ **Methods**:
+ - `async def write_next_section(original_query, report_draft, next_section_title, next_section_draft) -> LongWriterOutput`
+ - `async def write_report(original_query, report_title, report_draft) -> str`
+
+ **Returns**:
+ - `write_next_section()`: `LongWriterOutput` object (structured output)
+ - `write_report()`: Complete markdown report string
+
+ **Purpose**: Iteratively writes report sections with proper citations and reference management
+
+ **File Writing**: ❌ **NO** - Returns a `LongWriterOutput` object or a string only
+
+ **Key Features**:
+ - Writes sections iteratively
+ - Reformats and deduplicates references
+ - Adjusts heading levels
+ - Aggregates references across sections
+
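To illustrate what "deduplicates references" and "aggregates references across sections" might involve, here is a minimal sketch. It assumes each section carries its own 1-based reference list; the function name `merge_references` and the data shapes are hypothetical and not taken from `src/agents/long_writer.py`.

```python
def merge_references(
    section_refs: list[list[str]],
) -> tuple[list[str], list[dict[int, int]]]:
    """Merge per-section reference lists into one deduplicated list.

    Returns the merged list plus, for each section, a mapping from the
    section's local citation number (1-based) to the global number.
    """
    merged: list[str] = []
    index_of: dict[str, int] = {}
    mappings: list[dict[int, int]] = []
    for refs in section_refs:
        mapping: dict[int, int] = {}
        for local_number, ref in enumerate(refs, start=1):
            key = ref.strip()
            if key not in index_of:
                # First time this reference is seen: give it the next global number.
                merged.append(key)
                index_of[key] = len(merged)
            mapping[local_number] = index_of[key]
        mappings.append(mapping)
    return merged, mappings
```

The per-section mappings are what a caller would use to rewrite `[n]` markers in each section body so they point at the global list.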
+ ---
+
+ ### 3. ProofreaderAgent
+ **File**: `src/agents/proofreader.py`
+
+ **Method**: `async def proofread(query, report_draft) -> str`
+
+ **Returns**: Final polished markdown report string
+
+ **Purpose**: Proofreads and finalizes report drafts
+
+ **File Writing**: ❌ **NO** - Returns string only
+
+ **Key Features**:
+ - Combines sections
+ - Removes duplicates
+ - Adds summary
+ - Preserves references
+ - Polishes wording
+
+ ---
+
+ ### 4. ReportAgent
+ **File**: `src/agents/report_agent.py`
+
+ **Method**: `async def run(messages, thread, **kwargs) -> AgentRunResponse`
+
+ **Returns**: `AgentRunResponse` with markdown text in `messages[0].text`
+
+ **Purpose**: Generates structured scientific reports from evidence and hypotheses
+
+ **File Writing**: ❌ **NO** - Returns `AgentRunResponse` object
+
+ **Key Features**:
+ - Uses structured `ResearchReport` model
+ - Validates citations
+ - Returns markdown via `report.to_markdown()`
+
+ ---
+
+ ## File Writing Operations Found
+
+ ### Temporary File Writing (Not Reports)
+
+ 1. **ImageOCRService** (`src/services/image_ocr.py`)
+    - `_save_image_temp(image) -> str`
+    - Saves temporary images for OCR processing
+    - Returns temp file path
+
+ 2. **STTService** (`src/services/stt_gradio.py`)
+    - `_save_audio_temp(audio_array, sample_rate) -> str`
+    - Saves temporary audio files for transcription
+    - Returns temp file path
+
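Both services above follow a "write to a temp file, return its path" pattern. A generic version of that pattern, using only the standard library, might look like this. The helper name `save_bytes_temp` is hypothetical; the real `_save_image_temp` and `_save_audio_temp` take image and audio inputs and are not shown in this commit.

```python
import os
import tempfile


def save_bytes_temp(data: bytes, suffix: str = ".bin") -> str:
    """Write `data` to a new temporary file and return its path.

    delete=False keeps the file on disk after the handle is closed,
    so the caller is responsible for removing it when done.
    """
    with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as tmp:
        tmp.write(data)
        return tmp.name


# Usage: create the file, hand the path to a consumer, then clean up.
path = save_bytes_temp(b"example payload", suffix=".png")
os.remove(path)
```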
+ ---
+
+ ## Where Reports Are Used
+
+ ### Graph Orchestrator
+ **File**: `src/orchestrator/graph_orchestrator.py`
+
+ - Line 642: `final_report = await long_writer_agent.write_report(...)`
+ - Returns string result, stored in graph context
+ - Final result passed through `AgentEvent` with `message` field
+
+ ### Research Flows
+ **File**: `src/orchestrator/research_flow.py`
+
+ - `IterativeResearchFlow._create_final_report()`: Calls `writer_agent.write_report()`
+ - `DeepResearchFlow._create_final_report()`: Calls `long_writer_agent.write_report()`
+ - Both return strings
+
+ ---
+
+ ## Integration Points for File Writing
+
+ To add file writing capability, you would need to:
+
+ 1. **After report generation**: Save the returned string to a file
+ 2. **In graph orchestrator**: After `write_report()`, save to file and include path in result
+ 3. **In research flows**: After `_create_final_report()`, save to file
+
+ ### Example Implementation Pattern
+
+ ```python
+ import tempfile
+ from datetime import datetime
+ from pathlib import Path
+
+ # Illustrative wrapper; the function name and kwargs are placeholders
+ async def generate_and_save_report(writer_agent, **report_kwargs):
+     # After report generation
+     report_content = await writer_agent.write_report(**report_kwargs)
+
+     # Save to file (system temp directory here; should be configurable)
+     output_dir = Path(tempfile.gettempdir()) / "reports"
+     output_dir.mkdir(parents=True, exist_ok=True)
+     timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
+     file_path = output_dir / f"report_{timestamp}.md"
+     file_path.write_text(report_content, encoding="utf-8")
+
+     # Return a status message and the file path
+     return {
+         "message": "Report generated successfully",
+         "file": str(file_path),
+     }
+ ```
+
+ ---
+
+ ## Recommendations
+
+ 1. **Add file writing utility**: Create a helper function to save reports to files
+ 2. **Make it optional**: Add configuration flag to enable/disable file saving
+ 3. **Use temp directory**: Save to temp directory by default, allow custom path
+ 4. **Include in graph results**: Modify graph orchestrator to optionally save and return file paths
+ 5. **Support multiple formats**: Consider saving as both `.md` and potentially `.pdf` or `.html`
+
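Recommendations 1 to 3 can be combined into one small helper. The sketch below is a suggestion only: `save_report`, its `enabled` flag, and its `output_dir` parameter are hypothetical names and do not exist in the repository.

```python
import tempfile
from datetime import datetime
from pathlib import Path
from typing import Optional


def save_report(
    content: str,
    enabled: bool = True,
    output_dir: Optional[Path] = None,
) -> Optional[Path]:
    """Optionally save a markdown report and return its path.

    Returns None when saving is disabled. Defaults to a "reports"
    folder under the system temp directory unless `output_dir` is given.
    """
    if not enabled:
        return None
    if output_dir is not None:
        target_dir = Path(output_dir)
    else:
        target_dir = Path(tempfile.gettempdir()) / "reports"
    target_dir.mkdir(parents=True, exist_ok=True)
    # Microseconds in the timestamp reduce the chance of name collisions.
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S_%f")
    path = target_dir / f"report_{timestamp}.md"
    path.write_text(content, encoding="utf-8")
    return path
```

Returning `None` when disabled lets a caller attach the path to the result only when a file was actually written.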
+ ---
+
+ ## Current State
+
+ ✅ **Report Generation**: Fully implemented
+ ❌ **File Writing**: Not implemented
+ ✅ **File Output Integration**: Recently added (see previous work on `event_to_chat_message`)
+
+ The infrastructure to handle file outputs in Gradio is in place, but the agents themselves do not yet write files. They would need to be enhanced or wrapped to add file writing capability.
+
src/app.py CHANGED
@@ -30,7 +30,7 @@ from src.agent_factory.judges import HFInferenceJudgeHandler, JudgeHandler, Mock
  from src.orchestrator_factory import create_orchestrator
  from src.services.audio_processing import get_audio_service
  from src.services.multimodal_processing import get_multimodal_service
- # import structlog
+ import structlog
  from src.tools.clinicaltrials import ClinicalTrialsTool
  from src.tools.europepmc import EuropePMCTool
  from src.tools.pubmed import PubMedTool
@@ -38,7 +38,7 @@ from src.tools.search_handler import SearchHandler
  from src.utils.config import settings
  from src.utils.models import AgentEvent, OrchestratorConfig

- # logger = structlog.get_logger()
+ logger = structlog.get_logger()


  def configure_orchestrator(