mishrabp committed on
Commit 6ca6468 · verified · 1 Parent(s): d87994b

Upload folder using huggingface_hub

Files changed (8):
1. Dockerfile +18 -16
2. README.md +4 -145
3. SETUP.md +18 -0
4. appagents/planner_agent.py +1 -3
5. core/logger.py +1 -1
6. requirements.txt +31 -0
7. run.py +23 -8
8. ui/app.py +119 -369
Dockerfile CHANGED

```diff
@@ -1,31 +1,33 @@
-FROM python:3.12-slim
+# Use official Python slim image
+FROM python:3.11-slim
 
+# Set environment variables
 ENV PYTHONUNBUFFERED=1 \
-    DEBIAN_FRONTEND=noninteractive \
-    PYTHONPATH=/app:$PYTHONPATH
+    PIP_NO_CACHE_DIR=1 \
+    DEBIAN_FRONTEND=noninteractive
 
+# Set working directory
 WORKDIR /app
 
-# System deps
+# Install system dependencies
 RUN apt-get update && apt-get install -y \
-    git build-essential curl \
+    git \
+    build-essential \
+    curl \
     && rm -rf /var/lib/apt/lists/*
 
-# Install uv
-RUN curl -LsSf https://astral.sh/uv/install.sh | sh
-ENV PATH="/root/.local/bin:$PATH"
+# Copy requirements file
+COPY requirements.txt .
 
-# Copy project metadata
-COPY pyproject.toml .
-COPY uv.lock .
+# Install Python dependencies
+RUN pip install --upgrade pip
+RUN pip install -r requirements.txt
 
-# Install dependencies using uv, then export and install with pip to system
-RUN uv sync --frozen --no-dev && \
-    uv pip install -e . --system
-
-# Copy your source code
+# Copy the rest of the app
 COPY . .
 
+# Expose port for Streamlit
 EXPOSE 7860
 
+# Command to run the Streamlit app
 CMD ["streamlit", "run", "ui/app.py", "--server.port=7860", "--server.address=0.0.0.0", "--server.headless=true"]
```
README.md CHANGED

````diff
@@ -1,9 +1,9 @@
 ---
-title: AI Deep Researcher # Give your app a title
+title: Deep Research App # Give your app a title
 emoji: 🤖 # Pick an emoji
 colorFrom: indigo # Theme start color
 colorTo: blue # Theme end color
-sdk: docker # SDK type
+sdk: gradio # SDK type
 sdk_version: "4.39.0" # Example Gradio version
 app_file: ui/app.py # <-- points to your app.py inside ui/
 pinned: false
@@ -22,6 +22,7 @@ To achieve this, the project integrates the following technologies and AI featur
 - **SendGrid** (for emailing report)
 - **LLMs** - (OpenAI, Geminia, Groq)
 
+
 ## How it works?
 The system is a multi-agent solution, where each agent has a specific responsibility:
 
@@ -40,152 +41,10 @@ The system is a multi-agent solution, where each agent has a specific responsibi
    - Reads results from all search agents.
    - Generates a well-formatted, consolidated report.
 
-5. **Email Agent (not functional at present)**
+5. **Email Agent**
    - Responsible for sending the report via email using SendGrid.
 
 6. **Orchestrator**
    - The entry point of the system.
    - Facilitates communication and workflow between all agents.
 
-## Project Folder Structure
-
-```
-deep-research/
-├── ui/
-│   ├── app.py               # Main Streamlit application entry point
-│   └── __pycache__/         # Python bytecode cache
-├── appagents/
-│   ├── __init__.py          # Package initialization
-│   ├── orchestrator.py      # Orchestrator agent - coordinates all agents
-│   ├── planner_agent.py     # Planner agent - builds structured query plans
-│   ├── guardrail_agent.py   # Guardrail agent - validates user input
-│   ├── search_agent.py      # Search agent - performs web searches
-│   ├── writer_agent.py      # Writer agent - generates consolidated reports
-│   ├── email_agent.py       # Email agent - sends reports via email (not functional)
-│   └── __pycache__/         # Python bytecode cache
-├── core/
-│   ├── __init__.py          # Package initialization
-│   ├── logger.py            # Centralized logging configuration
-│   └── __pycache__/         # Python bytecode cache
-├── tools/
-│   ├── __init__.py          # Package initialization
-│   ├── google_tools.py      # Google search utilities
-│   ├── time_tools.py        # Time-related utility functions
-│   └── __pycache__/         # Python bytecode cache
-├── prompts/
-│   ├── __init__.py          # Package initialization (if present)
-│   ├── planner_prompt.txt   # Prompt for planner agent (if present)
-│   ├── guardrail_prompt.txt # Prompt for guardrail agent (if present)
-│   ├── search_prompt.txt    # Prompt for search agent (if present)
-│   └── writer_prompt.txt    # Prompt for writer agent (if present)
-├── Dockerfile               # Docker configuration for container deployment
-├── pyproject.toml           # Project metadata and dependencies (copied from root)
-├── uv.lock                  # Locked dependency versions (copied from root)
-├── README.md                # Project documentation
-└── run.py                   # Script to run the application locally (if present)
-```
-
-## File Descriptions
-
-### UI Layer (`ui/`)
-- **app.py** - Main Streamlit web application that provides the user interface. Handles:
-  - Text input for research queries
-  - Run/Download buttons (PDF, Markdown)
-  - Real-time streaming of results
-  - Display of final research reports
-  - Session state management
-  - Button enable/disable during streaming
-
-### Agents (`appagents/`)
-- **orchestrator.py** - Central coordinator that:
-  - Manages the multi-agent workflow
-  - Handles communication between all agents
-  - Streams results back to the UI
-  - Implements the research pipeline
-
-- **planner_agent.py** - Creates a structured plan for the query:
-  - Breaks down user query into actionable research steps
-  - Defines search queries and research angles
-
-- **guardrail_agent.py** - Validates user input:
-  - Checks for inappropriate content
-  - Ensures compliance with policies
-  - Stops workflow if violations detected
-
-- **search_agent.py** - Executes web searches:
-  - Performs parallel web searches
-  - Integrates with Google Search / Serper API
-  - Gathers raw research data
-
-- **writer_agent.py** - Generates final report:
-  - Consolidates search results
-  - Formats findings into structured markdown
-  - Creates well-organized research summaries
-
-- **email_agent.py** - Email delivery (not functional):
-  - Intended to send reports via SendGrid
-  - Currently not integrated in the workflow
-
-### Core Utilities (`core/`)
-- **logger.py** - Centralized logging configuration:
-  - Provides consistent logging across agents
-  - Handles log levels and formatting
-
-### Tools (`tools/`)
-- **google_tools.py** - Google/Serper API wrapper:
-  - Executes web searches
-  - Handles API authentication and response parsing
-
-- **time_tools.py** - Utility functions:
-  - Time-related operations
-  - Timestamp management
-
-### Configuration Files
-- **Dockerfile** - Container deployment:
-  - Builds Docker image with Python 3.12
-  - Installs dependencies using `uv`
-  - Sets up Streamlit server on port 7860
-  - Configures PYTHONPATH for module imports
-
-- **pyproject.toml** - Project metadata:
-  - Package name: "agents"
-  - Python version requirement: 3.12
-  - Lists all dependencies (OpenAI, LangChain, Streamlit, etc.)
-
-- **uv.lock** - Dependency lock file:
-  - Ensures reproducible builds
-  - Pins exact versions of all dependencies
-
-## Key Technologies
-
-| Component | Technology | Purpose |
-|-----------|-----------|---------|
-| LLM Framework | OpenAI Agents | Multi-agent orchestration |
-| Web Search | Serper API / Google Search | Research data gathering |
-| Web UI | Streamlit | User interface and interaction |
-| Document Export | ReportLab | PDF generation from markdown |
-| Async Operations | AsyncIO | Parallel agent execution |
-| Dependencies | UV | Fast Python package management |
-| Containerization | Docker | Cloud deployment |
-
-## Running Locally
-
-```bash
-# Install dependencies
-uv sync
-
-# Set environment variables defined in .env.name file
-export OPENAI_API_KEY="your-key"
-export SERPER_API_KEY="your-key"
-
-# Run the Streamlit app
-python run.py
-```
-
-## Deployment
-
-The project is deployed on Hugging Face Spaces as a Docker container:
-- **Space**: https://huggingface.co/spaces/mishrabp/deep-research
-- **URL**: https://huggingface.co/spaces/mishrabp/deep-research
-- **Trigger**: Automatic deployment on push to `main` branch
-- **Configuration**: `.github/workflows/deep-research-app-hf.yml`
````
SETUP.md ADDED

````diff
@@ -0,0 +1,18 @@
+### Setting up .venv
+```bash
+conda create --prefix /home/azureuser/ws/agenticai/projects/deep-research/.venv python=3.11 -y
+
+conda activate /home/azureuser/ws/agenticai/projects/deep-research/.venv
+
+conda deactivate
+
+uv pip install --upgrade -r requirements.txt
+```
+
+### Run Unit Tests
+```bash
+pytest -v tests/test_data_agent.py
+
+python -m pytest -v
+
+```
````
appagents/planner_agent.py CHANGED

```diff
@@ -31,14 +31,12 @@ groq_api_key = os.getenv('GROQ_API_KEY')
 groq_client = AsyncOpenAI(base_url=GROQ_BASE_URL, api_key=groq_api_key)
 groq_model = OpenAIChatCompletionsModel(model="groq/compound", openai_client=groq_client)
 
-openai_model = "gpt-4.1-mini"
-
 # Note: Many models do not like tool call and json output_schema used together.
 
 planner_agent = Agent(
     name="PlannerAgent",
     instructions=INSTRUCTIONS,
-    model=openai_model,
+    model=gemini_model,
     tools=[TimeTools.current_datetime],
     output_type=WebSearchPlan,
     input_guardrails=[guardrail_against_unparliamentary],
```
core/logger.py CHANGED

```diff
@@ -14,7 +14,7 @@ def log_call(func):
         print(f"[{timestamp}] 🚀 Calling: {func.__name__}({arg_list})")
         try:
             result = func(*args, **kwargs)
-            # print(f"[{timestamp}] ✅ Finished: {func.__name__}")
+            print(f"[{timestamp}] ✅ Finished: {func.__name__}")
             return result
         except Exception as e:
             print(f"[{timestamp}] ❌ Error in {func.__name__}: {e}")
```
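The hunk above re-enables the "Finished" log line inside the project's `log_call` decorator. A minimal self-contained sketch of such a decorator, inferred from the lines visible in the diff (the timestamp format and the behavior on error are assumptions):

```python
import functools
from datetime import datetime

def log_call(func):
    """Log entry, completion, and errors of a function call."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        timestamp = datetime.now().strftime("%H:%M:%S")
        arg_list = ", ".join(
            [repr(a) for a in args] + [f"{k}={v!r}" for k, v in kwargs.items()]
        )
        print(f"[{timestamp}] 🚀 Calling: {func.__name__}({arg_list})")
        try:
            result = func(*args, **kwargs)
            print(f"[{timestamp}] ✅ Finished: {func.__name__}")
            return result
        except Exception as e:
            print(f"[{timestamp}] ❌ Error in {func.__name__}: {e}")
            raise
    return wrapper

@log_call
def add(a, b):
    return a + b

print(add(2, 3))  # → 5, with Calling/Finished lines printed around it
```

`functools.wraps` keeps `__name__` and the docstring of the decorated function intact, which matters when multiple agents share one logger.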
requirements.txt ADDED

```diff
@@ -0,0 +1,31 @@
+openai>=1.85.0
+    # via
+    #   agents (pyproject.toml)
+    #   autogen-ext
+    #   langchain-openai
+    #   openai-agents
+    #   semantic-kernel
+openai-agents>=0.0.17
+    # via agents (pyproject.toml)
+python-dotenv>=1.0.1
+requests>=2.31.0
+    # via
+    #   agents (pyproject.toml)
+    #   autogen-ext
+    #   langchain-openai
+    #   openai-agents
+    #   semantic-kernel
+yfinance>=0.2.27
+    # via tools/news_tools.py, tools/yahoo_tools.py
+gradio>=3.34.0
+    # via autogen-ext
+sendgrid>=6.9.7
+    # via tools/email_tools.py
+mcp==1.9.3
+    # via
+    #   agents (pyproject.toml)
+    #   autogen-ext
+    #   mcp-server-fetch
+    #   openai-agents
+mcp-server-fetch==2025.1.17
+    # via agents (pyproject.toml)
```
run.py CHANGED

```diff
@@ -1,11 +1,26 @@
 import os
-import subprocess
 import sys
+import importlib
 
-# Use module execution to guarantee Streamlit runs inside the current interpreter
-subprocess.run([
-    sys.executable, "-m", "streamlit",
-    "run",
-    os.path.join("ui", "app.py"),
-    "--server.runOnSave", "true"
-])
+def main():
+    # Root directory of the project
+    root_dir = os.path.dirname(os.path.abspath(__file__))
+
+    # Ensure the root and ui folder are on the Python path
+    ui_path = os.path.join(root_dir, "ui")
+    for p in [root_dir, ui_path]:
+        if p not in sys.path:
+            sys.path.insert(0, p)
+
+    print("🚀 Starting Gradio app (ui/app.py)...\n")
+
+    # Import and launch the UI
+    app_module = importlib.import_module("ui.app")
+
+    if hasattr(app_module, "ui"):
+        app_module.ui.launch(inbrowser=True)
+    else:
+        print("❌ Could not find `ui` object in ui/app.py")
+
+if __name__ == "__main__":
+    main()
```
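The new `run.py` extends `sys.path`, imports the UI module dynamically, and launches an attribute it expects to find there. That pattern can be sketched in isolation; the stdlib `json` module stands in here for `ui.app`, and `load_app` is a hypothetical helper, not part of the repository:

```python
import importlib
import os
import sys

def load_app(module_name: str, attr: str):
    """Dynamically import a module and return the named attribute, or None."""
    # Mirror run.py's sys.path handling: make the working directory importable.
    root_dir = os.getcwd()
    if root_dir not in sys.path:
        sys.path.insert(0, root_dir)

    module = importlib.import_module(module_name)
    return getattr(module, attr, None)

# Stdlib `json` as a stand-in for `ui.app`, `dumps` for the `ui` launch object:
dumps = load_app("json", "dumps")
print(dumps({"status": "ok"}))  # prints {"status": "ok"}

# A missing attribute returns None instead of raising, as run.py's hasattr check does:
missing = load_app("json", "no_such_attr")
```

Returning `None` for a missing attribute (rather than raising `AttributeError`) lets the caller print a friendly message, which is what `run.py` does when `ui` is absent.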
ui/app.py CHANGED

The file was effectively rewritten (@@ -1,432 +1,182 @@), so both versions are shown in full.

Before:

```python
import streamlit as st
import asyncio
import time
import html
from datetime import datetime, UTC
from io import BytesIO

from dotenv import load_dotenv
from reportlab.platypus import SimpleDocTemplate, Paragraph
from reportlab.lib.styles import getSampleStyleSheet

from appagents.orchestrator import Orchestrator
from agents import SQLiteSession

load_dotenv(override=True)

# --------------------
# Page config
# --------------------
st.set_page_config(page_title="Deep Research AI", layout="wide")

# --------------------
# Session-state init
# --------------------
if "session_store" not in st.session_state:
    st.session_state.session_store = {}

if "session_id" not in st.session_state:
    st.session_state.session_id = str(id(st))

if "final_report" not in st.session_state:
    st.session_state.final_report = ""

if "button_disabled" not in st.session_state:
    st.session_state.button_disabled = False


# (dark mode removed - UI uses single light theme)

# --------------------
# CSS for theme-agnostic layout
# --------------------
THEME_AGNOSTIC_CSS = """
<style>
:root {
    color-scheme: light dark;
}

.block-container {
    max-width: 90% !important;
    margin-left: 5% !important;
    margin-right: 5% !important;
    padding-top: 1.5rem !important;
    padding-bottom: 2rem !important;
}

/* Use system foreground/background colors */
body {
    color: var(--text-color);
    background-color: var(--bg-color);
}

h1, h2, h3, h4, h5, h6 {
    font-size: 2.2rem !important;
    text-align: left !important;
    color: inherit !important;
    font-weight: 600 !important;
}

/* Text areas - inherit system colors */
textarea, .stTextArea > div > div > textarea {
    background-color: inherit !important;
    color: inherit !important;
    font-size: 1.05rem !important;
    border: 1px solid var(--border-color) !important;
}

/* Buttons - proper button styling */
.stButton > button, .stDownloadButton > button {
    border: 2px solid currentColor !important;
    border-radius: 6px !important;
    padding: 10px 20px !important;
    font-weight: 600 !important;
    cursor: pointer !important;
    transition: all 0.2s ease !important;
    background-color: transparent !important;
    color: inherit !important;
    min-width: 150px !important;
    min-height: 44px !important;
}

.stButton > button:hover, .stDownloadButton > button:hover {
    background-color: rgba(0, 0, 0, 0.1) !important;
    transform: translateY(-2px) !important;
    box-shadow: 0 4px 12px rgba(0, 0, 0, 0.15) !important;
}

.stButton > button:active, .stDownloadButton > button:active {
    transform: translateY(0) !important;
}

/* Download buttons */
.stDownloadButton > button {
    width: 180px !important;
    height: 48px !important;
}

/* Text and paragraphs */
p, span, div {
    color: inherit !important;
}

/* Code blocks */
code {
    padding: 2px 4px;
    border-radius: 3px;
}

/* Info, success, error, warning boxes */
.stAlert {
    border-radius: 6px !important;
}

/* Markdown content */
.stMarkdown {
    color: inherit !important;
}

/* List items */
ul, ol, li {
    color: inherit !important;
}

/* Links */
a {
    text-decoration: none;
}

a:hover {
    text-decoration: underline;
}

/* Ensure sufficient contrast for readability */
.stApp {
    font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, 'Helvetica Neue', Arial, sans-serif;
    line-height: 1.6;
}

/* Progress bar visibility */
.stProgress > div > div > div {
    background-color: currentColor !important;
    opacity: 0.5 !important;
}

/* Remove text truncation */
.stMarkdown {
    max-height: none !important;
    overflow: visible !important;
}

/* Responsive buttons layout */
@media (max-width: 768px) {
    h1, h2, h3 {
        font-size: 1.6rem !important;
    }

    .stButton > button {
        width: 100% !important;
        height: auto !important;
        padding: 10px !important;
    }

    .stDownloadButton > button {
        width: 100% !important;
        height: auto !important;
        padding: 10px !important;
    }
}

/* Tablet devices */
@media (min-width: 769px) and (max-width: 1024px) {
    .block-container {
        max-width: 85% !important;
    }

    h1, h2, h3 {
        font-size: 1.8rem !important;
    }
}

/* Desktop devices */
@media (min-width: 1025px) {
    .block-container {
        max-width: 90% !important;
    }

    h1, h2, h3 {
        font-size: 2.2rem !important;
    }
}
</style>
"""

st.markdown(THEME_AGNOSTIC_CSS, unsafe_allow_html=True)

st.markdown(THEME_AGNOSTIC_CSS, unsafe_allow_html=True)

# --------------------
# Helpers: orchestrator streaming
# --------------------
async def run_async_chunks(query: str, session_id: str):
    if session_id not in st.session_state.session_store:
        st.session_state.session_store[session_id] = SQLiteSession(f"session_{session_id}.db")
    session = st.session_state.session_store[session_id]
    orchestrator = Orchestrator(session=session)
    async for chunk in orchestrator.run(query):
        yield chunk

def safe_title_from_query(q: str):
    q = q.strip()
    if not q:
        return "Untitled Report"
    first_line = q.splitlines()[0]
    # limit length for title
    return (first_line[:80] + "...") if len(first_line) > 80 else first_line

# --------------------
# Export helpers
# --------------------
def make_pdf_bytes(text: str) -> bytes:
    """Convert markdown text to PDF with proper formatting."""
    buf = BytesIO()
    doc = SimpleDocTemplate(buf, topMargin=0.5*72, bottomMargin=0.5*72, leftMargin=0.75*72, rightMargin=0.75*72)
    styles = getSampleStyleSheet()
    story = []

    # parse markdown: headings, lists, bold, italic
    lines = text.split("\n")
    for line in lines:
        stripped = line.strip()

        if not stripped:
            story.append(Paragraph(" ", styles["Normal"]))  # empty line
            continue

        # heading levels
        if stripped.startswith("# "):
            story.append(Paragraph(html.escape(stripped[2:]), styles["Heading1"]))
        elif stripped.startswith("## "):
            story.append(Paragraph(html.escape(stripped[3:]), styles["Heading2"]))
        elif stripped.startswith("### "):
            story.append(Paragraph(html.escape(stripped[4:]), styles["Heading3"]))
        elif stripped.startswith("- ") or stripped.startswith("* "):
            # bullet list
            story.append(Paragraph("• " + html.escape(stripped[2:]), styles["Normal"]))
        elif stripped[0].isdigit() and ". " in stripped[:4]:
            # numbered list
            story.append(Paragraph(html.escape(stripped), styles["Normal"]))
        else:
            # regular paragraph with basic markdown formatting
            # escape first, then replace with safe formatting tags
            p_text = html.escape(stripped)

            # handle **bold** (convert escaped ** back and wrap in <b> tags)
            p_text = p_text.replace("&lt;b&gt;", "<b>").replace("&lt;/b&gt;", "</b>")
            # Simple approach: replace **text** with <b>text</b>
            import re
            p_text = re.sub(r'\*\*(.+?)\*\*', r'<b>\1</b>', p_text)
            p_text = re.sub(r'__(.+?)__', r'<b>\1</b>', p_text)
            # handle *italic* → <i>italic</i> carefully (avoid double replacement)
            p_text = re.sub(r'\*([^*]+?)\*', r'<i>\1</i>', p_text)
            p_text = re.sub(r'_([^_]+?)_', r'<i>\1</i>', p_text)

            story.append(Paragraph(p_text, styles["Normal"]))

    doc.build(story)
    buf.seek(0)
    return buf.read()

def make_md_bytes(text: str) -> bytes:
    return text.encode("utf-8")

def make_html_bytes(text: str, title="Deep Research Report") -> bytes:
    # simple HTML wrapper, escape content and preserve newlines
    body = "<br/>".join(html.escape(text).split("\n"))
    html_doc = f"""<!doctype html>
<html>
<head>
<meta charset="utf-8">
<title>{html.escape(title)}</title>
<style>body{{font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, 'Helvetica Neue', Arial; padding:24px; max-width:900px; margin:auto; line-height:1.6; color: #0b1220; background: #ffffff }}</style>
</head>
<body>
<h1>{html.escape(title)}</h1>
<div>{body}</div>
</body>
</html>"""
    return html_doc.encode("utf-8")

# --------------------
# Streaming runner (final output replaces trace)
# --------------------
def run_streaming(query: str, final_ph, status_ph):
    session_id = st.session_state.session_id

    # placeholders
    # status_ph = st.empty()
    progress_ph = st.empty()

    # reset final_report
    st.session_state.final_report = ""
    # track only the last received chunk
    last_chunk = ""
    progress_val = 0
    progress_bar = progress_ph.progress(progress_val)

    # ensure any prior final output is cleared while streaming
    try:
        final_ph.empty()
    except Exception:
        pass
    # status_ph.info("🔎 Researching streaming (final result only)...")

    async def _stream():
        nonlocal progress_val, last_chunk
        status_ph.info("Streaming... receiving data")
        bStartChunkCollected = False
        async for chunk in run_async_chunks(query, session_id):
            # start collecting chunks once we see one beginning with #
            if not bStartChunkCollected and chunk.strip().startswith("#"):
                bStartChunkCollected = True

            if bStartChunkCollected:
                last_chunk += chunk
                # render accumulated markdown in real-time so user sees content streaming
                status_ph.markdown(last_chunk)

            progress_val = min(progress_val + 2, 98)
            progress_bar.progress(progress_val)

    # run async generator (compatibility fallback)
    try:
        asyncio.run(_stream())
    except RuntimeError:
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)
        loop.run_until_complete(_stream())
        loop.close()
    except Exception as e:
        # on exception, re-enable button and show error
        st.session_state.button_disabled = False
        status_ph.error(f"❌ Error during research: {str(e)}")
        progress_ph.empty()
        return

    # finalize
    progress_bar.progress(100)
    status_ph.success("✅ Research complete!")

    # set final_report to only the last yield (trim surrounding whitespace)
    md_text = last_chunk.strip()
    st.session_state.final_report = md_text
    progress_ph.empty()

    # re-enable button after completion
    st.session_state.button_disabled = False

    # history saving disabled (kept minimal in-memory state only)

    # render final output as Markdown into the dedicated placeholder
    # Use Streamlit's markdown renderer so headings, lists, links render correctly.
    if st.session_state.final_report:
        final_ph.markdown(st.session_state.final_report)
    else:
        final_ph.empty()

    # rerun to reflect button re-enable and final output
    st.rerun()

# Sidebar removed per UI request. Dark-mode and history removed.


# --------------------
# Main UI
# --------------------
st.title("🧠 Deep Research (Powered by Agentic AI)")
st.write("What topic would you like to research?")

query = st.text_area("Enter your research topic", value="Most popular free MLOps & LLMOps tools in 2025.", height=50, label_visibility="collapsed")

# Action row with buttons
col1, col2, col3, col4 = st.columns([2.0, 2.0, 2.0, 2.0])

with col1:
    run_clicked = st.button("🚀 Run Deep Research", key="run", disabled=st.session_state.button_disabled)

# PDF and MD download buttons appear inline after a final_report exists
if st.session_state.final_report:
    with col2:
        # PDF generator stream - create bytes on demand
        pdf_bytes = make_pdf_bytes(st.session_state.final_report)
        st.download_button("📄 Download PDF", data=pdf_bytes, file_name="report.pdf", mime="application/pdf")

    with col3:
        # Markdown
        md_bytes = make_md_bytes(st.session_state.final_report)
        st.download_button("📝 Download MD", data=md_bytes, file_name="report.md", mime="text/markdown")

# placeholder for final report (used so streaming traces can be cleared)
final_ph = st.empty()

# placeholder for streaming status and progress updates
status_ph = st.empty()

# Run research if requested; disable button on click and re-run
if run_clicked and query.strip():
    st.session_state.button_disabled = True
    st.rerun()

# Execute streaming if button was disabled (i.e., on the rerun after click)
if st.session_state.button_disabled and query.strip():
    run_streaming(query.strip(), final_ph, status_ph)
elif not st.session_state.button_disabled:
    # if final_report exists (e.g., from previous run), show it in the final placeholder
    if st.session_state.final_report:
        # final_ph.markdown(f"<div class='report-box'>{st.session_state.final_report}</div>", unsafe_allow_html=True)
        final_ph.markdown(st.session_state.final_report, unsafe_allow_html=True)
else:
    st.info("Enter a topic and press Run. Final report will replace streaming traces.")

# small debug caption
st.caption(f"Session: {st.session_state.session_id}")
```

After:

```python
import streamlit as st
import asyncio
from dotenv import load_dotenv

from appagents.orchestrator import Orchestrator
from agents import SQLiteSession

load_dotenv(override=True)

# Page config
st.set_page_config(page_title="Deep Research AI", layout="wide")

# ---------- CSS: center & 80% width, nicer typography ----------
CUSTOM_CSS = """
<style>
/* Make main container 80% width and centered */
.block-container {
    max-width: 80% !important;
    margin-left: auto !important;
    margin-right: auto !important;
    padding-top: 1.5rem;
    padding-bottom: 2rem;
}

/* Larger title */
h1 {
    font-size: 2.2rem !important;
    text-align: center;
    margin-bottom: 0.25rem;
}

/* report box: preserve whitespace, allow overflow scrolling */
.report-box {
    background: #ffffff;
    padding: 24px;
    border-radius: 12px;
    border: 1px solid #e9ecef;
    box-shadow: 0 6px 18px rgba(23,43,77,0.04);
    font-size: 1.05rem;
    line-height: 1.65;
    white-space: pre-wrap; /* preserve newlines */
    word-wrap: break-word;
    overflow-wrap: break-word;
    max-height: 70vh; /* allow vertical scrolling if very long */
    overflow: auto;
}

/* Input area style */
textarea, .stTextArea>div>div>textarea {
    font-size: 1.05rem !important;
}

/* center the Run button under the textarea */
.run-btn {
    display:flex;
    justify-content:center;
    align-items:center;
    margin-top: 12px;
}

/* progress / status spacing */
.progress-area {
    margin-top: 12px;
    margin-bottom: 10px;
}
</style>
"""
st.markdown(CUSTOM_CSS, unsafe_allow_html=True)

# ---------- session-store for persistent SQLiteSession ----------
if "session_store" not in st.session_state:
    st.session_state.session_store = {}

if "session_id" not in st.session_state:
    # stable per-tab session id
    st.session_state.session_id = str(id(st))


async def run_async_chunks(query: str, session_id: str):
    """
    Async generator: yields chunks from orchestrator.run(query)
    """
    # create or reuse persistent SQLiteSession
    if session_id not in st.session_state.session_store:
        st.session_state.session_store[session_id] = SQLiteSession(
            f"session_{session_id}.db"
        )

    session = st.session_state.session_store[session_id]
    orchestrator = Orchestrator(session=session)

    async for chunk in orchestrator.run(query):
        yield chunk


def run_streaming(query: str):
    """
    Streamlit-friendly runner: updates spinner, progress bar and the output placeholder.
    Stores full output in st.session_state to avoid truncation between reruns.
    """

    session_id = st.session_state.session_id

    # placeholders
    status_ph = st.empty()
    progress_ph = st.empty()
    output_ph = st.empty()

    # keep full output in session_state so it survives reruns while streaming
    if "full_output" not in st.session_state:
        st.session_state.full_output = ""

    # reset before new run
    st.session_state.full_output = ""
    progress_value = 0
    progress_bar = progress_ph.progress(progress_value)

    # spinner + async loop
    status_ph.info("🔎 Processing — streaming results now...")

    async def _stream():
        nonlocal progress_value, progress_bar
        # naive increment step; will cap at 98 until finished
        async for chunk in run_async_chunks(query, session_id):
            # append chunk
            st.session_state.full_output += chunk

            # update progress (move forward slowly, final step sets 100)
            progress_value = min(progress_value + 2, 98)
            progress_bar.progress(progress_value)

            # render full output inside a styled div that preserves whitespace
            output_html = f"<div class='report-box'>{st.session_state.full_output}</div>"
            output_ph.markdown(output_html, unsafe_allow_html=True)

    # run the async generator
    try:
        asyncio.run(_stream())
    except RuntimeError:
        # If the event loop is already running (e.g., in some environments),
        # fallback to creating a new loop and running until complete.
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)
        loop.run_until_complete(_stream())
        loop.close()

    # final rendering and progress completion
    progress_bar.progress(100)
    status_ph.success("✅ Completed research — full report below.")
    # ensure final full output rendered (in case last chunk didn't render)
    output_html = f"<div class='report-box'>{st.session_state.full_output}</div>"
    output_ph.markdown(output_html, unsafe_allow_html=True)


# ---------- UI ----------
st.title("🧠 Deep Research (Powered by Agentic AI)")

st.write("What topic would you like to research?")
query = st.text_area(
    "",  # no label to keep compact
    value="The impact of AI on USA stock market performance in 2025.",
    height=140,
)

# centered run button
col1, col2, col3 = st.columns([1, 2, 1])
with col2:
    run_clicked = st.button("🚀 Run Deep Research", key="run_button", help="Click to start research")

if run_clicked and query.strip():
    run_streaming(query.strip())
else:
    # If we already have previous output, show it (keeps output visible after page reruns)
    if "full_output" in st.session_state and st.session_state.full_output:
        output_html = f"<div class='report-box'>{st.session_state.full_output}</div>"
        st.markdown(output_html, unsafe_allow_html=True)
    else:
        st.info("Enter a topic above and press Run to start the research agent.")

# Optional: small footer that shows session id for debugging
st.write("")
st.caption(f"Session: {st.session_state.session_id}")
```
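The new `run_streaming` accumulates chunks from an async generator while advancing a progress value capped at 98 until the stream ends. The core pattern, stripped of Streamlit (the chunk source below is a stand-in for `orchestrator.run`, not the real orchestrator):

```python
import asyncio

async def fake_chunks():
    """Stand-in for orchestrator.run(query): yields report fragments."""
    for part in ["# Report\n", "Finding one. ", "Finding two."]:
        await asyncio.sleep(0)  # simulate awaiting the next chunk
        yield part

async def collect(progress_cap: int = 98, step: int = 2):
    """Accumulate chunks and advance a capped progress value, as run_streaming does."""
    full_output = ""
    progress = 0
    async for chunk in fake_chunks():
        full_output += chunk
        progress = min(progress + step, progress_cap)  # cap at 98 while streaming
    progress = 100  # final step marks completion
    return full_output, progress

full_output, progress = asyncio.run(collect())
print(progress)  # → 100
```

Capping the bar below 100 during the loop avoids showing "done" before the generator is exhausted; the final assignment to 100 mirrors the `progress_bar.progress(100)` call after the stream completes.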