OpeneR Sisyphus committed on
Commit · 778278c
0 Parent(s)
HydraDeck open-source clean snapshot
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
- .gitignore +8 -0
- README.md +77 -0
- README_SPACES.md +20 -0
- app.py +1269 -0
- custom_web.py +547 -0
- hydradeck/__init__.py +3 -0
- hydradeck/agents/personas.py +98 -0
- hydradeck/cli.py +522 -0
- hydradeck/clients/__init__.py +3 -0
- hydradeck/clients/grok_client.py +373 -0
- hydradeck/config.py +137 -0
- hydradeck/core/types.py +91 -0
- hydradeck/packaging.py +33 -0
- hydradeck/pipeline.py +884 -0
- hydradeck/presets/__init__.py +3 -0
- hydradeck/presets/rynnbrain.py +346 -0
- hydradeck/render.py +471 -0
- hydradeck/resources_pack.py +706 -0
- hydradeck/utils.py +86 -0
- pyproject.toml +44 -0
- requirements.txt +4 -0
- tests/test_app_agentic.py +74 -0
- tests/test_cli.py +66 -0
- tests/test_config.py +44 -0
- tests/test_preset_pre.py +17 -0
- tests/test_render.py +189 -0
- tests/test_resources_pack_mock.py +43 -0
- tests/test_smoke_mock.py +57 -0
- tests/test_verbatim_mock.py +43 -0
.gitignore
ADDED
@@ -0,0 +1,8 @@

```
__pycache__/
*.pyc
.pytest_cache/
.ruff_cache/
.DS_Store
build/
*.egg-info/
out/
```
README.md
ADDED
@@ -0,0 +1,77 @@

# hydradeck

A reproducible, auditable Grok Deep Research pipeline (multi-persona, iterative) that produces:

- `pre_report.md`: Pre-Research report: research-question breakdown, methodology, search strategy, risks and boundaries
- `report.md`: full research report (including a "complete resources" list and traceable citations)
- `speech.md`: speech script (ready to read aloud, with transitions and timing cues)
- `pre_paper.tex`: LaTeX paper draft of the pre-brief (article class)
- `pre_slides.tex`: Beamer slides for the pre-brief
- `refs.bib`: BibTeX references
- `research.json`: structured intermediate artifacts (for reproducibility and auditing)

> Security note: never commit API keys to the repository. Use the `GROK_API_KEY` environment variable.
> If you have already pasted a key into a chat, **rotate/revoke** that key immediately.

## Installation

```bash
cd hydradeck
python3 -m pip install -e .
python3 -m pip install -e ".[dev]"
```

## Quick start

### 1) Mock (offline) end-to-end run

```bash
mkdir -p out
hydradeck run --topic "LLM agents for deep research" --out out/demo.zip --mock
```

### 2) Using a Grok2API / OpenAI-compatible gateway

`api.example.com` is based on Grok2API and exposes the OpenAI-compatible `/v1/chat/completions` and `/v1/models` endpoints.

```bash
export GROK_BASE_URL="https://api.example.com"
export GROK_API_KEY="<YOUR_KEY>"
export GROK_MODEL="grok-4"

mkdir -p out
hydradeck run --topic "<your research topic>" --out out/topic.zip \
  --iterations 3 \
  --max-sources 10
```

## Output layout

The output is a directory or a zip, depending on whether `--out` ends in `.zip`. It includes `compile.sh` and a `Makefile` for compiling the LaTeX sources.

## WebUI (HydraDeck)

### Launching locally

```bash
cd hydradeck
python3 custom_web.py
```

Listens on `http://127.0.0.1:7861` by default.

### Environment variables (optional, set before launching)

```bash
export GROK_BASE_URL="https://api.example.com"
export GROK_API_KEY="<YOUR_KEY>"
export GROK_MODEL="grok-4"
```

### Basic usage

1. Enter a Topic on the `Run` tab
2. Click `Quick API Check` to verify connectivity first
3. Click `Run HydraDeck` to start generation
4. Watch live progress in `Console`
5. Download `paper.pdf` / `slides.pdf` from `Artifacts`
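The gateway usage described above can also be exercised directly from Python. The sketch below builds an OpenAI-compatible chat request from the same `GROK_*` environment variables; it is a minimal illustration using only the standard library, and `build_chat_request` is a hypothetical helper, not part of the hydradeck package.

```python
import json
import os
from urllib.request import Request


def build_chat_request(prompt: str) -> Request:
    """Build an OpenAI-compatible /v1/chat/completions request from GROK_* env vars."""
    base = os.environ.get("GROK_BASE_URL", "https://api.example.com").rstrip("/")
    payload = {
        "model": os.environ.get("GROK_MODEL", "grok-4"),
        "messages": [{"role": "user", "content": prompt}],
    }
    return Request(
        f"{base}/v1/chat/completions",
        method="POST",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ.get('GROK_API_KEY', '')}",
            "Content-Type": "application/json",
        },
    )


req = build_chat_request("LLM agents for deep research")
print(req.full_url.endswith("/v1/chat/completions"))  # True
```

Sending the request (e.g. with `urllib.request.urlopen(req)`) returns the standard OpenAI-style JSON body with a `choices` list.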
README_SPACES.md
ADDED
@@ -0,0 +1,20 @@

```
---
title: hydradeck-webui
emoji: 📚
colorFrom: indigo
colorTo: blue
sdk: gradio
sdk_version: 4.44.1
app_file: app.py
pinned: false
---
```

# hydradeck WebUI (Hugging Face Spaces)

Set these secrets in Space settings if needed:

- `GROK_API_KEY`
- `GROK_BASE_URL` (optional, defaults to `https://api.example.com`)
- `GROK_MODEL` (optional, defaults to `grok-4`)

The app entrypoint is `app.py`.
app.py
ADDED
|
@@ -0,0 +1,1269 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
from __future__ import annotations
|
| 2 |
+
|
| 3 |
+
import warnings
|
| 4 |
+
|
| 5 |
+
warnings.filterwarnings(
|
| 6 |
+
"ignore",
|
| 7 |
+
message=r"urllib3 v2 only supports OpenSSL 1\.1\.1\+.*",
|
| 8 |
+
)
|
| 9 |
+
|
| 10 |
+
import tempfile
|
| 11 |
+
import zipfile
|
| 12 |
+
import json
|
| 13 |
+
import time
|
| 14 |
+
from concurrent.futures import ThreadPoolExecutor
|
| 15 |
+
from queue import Empty, Queue
|
| 16 |
+
from pathlib import Path
|
| 17 |
+
from typing import Any
|
| 18 |
+
from urllib.error import HTTPError, URLError
|
| 19 |
+
from urllib.parse import quote, urlparse
|
| 20 |
+
from urllib.request import Request, urlopen
|
| 21 |
+
|
| 22 |
+
import gradio as gr
|
| 23 |
+
|
| 24 |
+
from hydradeck.clients import ChatMessage, GrokClient
|
| 25 |
+
from hydradeck.config import resolve_api_key, resolve_base_url, resolve_model
|
| 26 |
+
from hydradeck.core.types import RunConfig
|
| 27 |
+
from hydradeck.pipeline import run
|
| 28 |
+
from hydradeck.render import (
|
| 29 |
+
build_slide_frames_from_sections,
|
| 30 |
+
enforce_slide_density,
|
| 31 |
+
render_beamer_frames,
|
| 32 |
+
render_paper,
|
| 33 |
+
render_report_structured,
|
| 34 |
+
)
|
| 35 |
+
|
| 36 |
+
|
| 37 |
+
CHROME_144_UA = (
|
| 38 |
+
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
|
| 39 |
+
"AppleWebKit/537.36 (KHTML, like Gecko) "
|
| 40 |
+
"Chrome/144.0.0.0 Safari/537.36"
|
| 41 |
+
)
|
| 42 |
+
|
| 43 |
+
def _normalized_base_url(base_url: str) -> str:
|
| 44 |
+
parsed = urlparse(base_url.strip())
|
| 45 |
+
if parsed.scheme not in {"http", "https"}:
|
| 46 |
+
raise ValueError("Base URL must start with http:// or https://")
|
| 47 |
+
if not parsed.netloc:
|
| 48 |
+
raise ValueError("Base URL is missing host")
|
| 49 |
+
return base_url.strip().rstrip("/")
|
| 50 |
+
|
| 51 |
+
|
| 52 |
+
def _preflight_check(base_url: str, api_key: str, request_budget: float) -> str | None:
|
| 53 |
+
if not api_key.strip():
|
| 54 |
+
return "Missing API key. Fill API Key field or set GROK_API_KEY before running."
|
| 55 |
+
|
| 56 |
+
try:
|
| 57 |
+
normalized = _normalized_base_url(base_url)
|
| 58 |
+
except ValueError as exc:
|
| 59 |
+
return f"Invalid Base URL: {exc}"
|
| 60 |
+
|
| 61 |
+
probe_url = f"{normalized}/v1/models"
|
| 62 |
+
timeout_s = max(2.0, min(float(request_budget), 6.0))
|
| 63 |
+
req = Request(
|
| 64 |
+
probe_url,
|
| 65 |
+
headers={
|
| 66 |
+
"Authorization": f"Bearer {api_key.strip()}",
|
| 67 |
+
"User-Agent": CHROME_144_UA,
|
| 68 |
+
},
|
| 69 |
+
)
|
| 70 |
+
|
| 71 |
+
try:
|
| 72 |
+
with urlopen(req, timeout=timeout_s):
|
| 73 |
+
return None
|
| 74 |
+
except HTTPError as exc:
|
| 75 |
+
try:
|
| 76 |
+
body = exc.read().decode("utf-8", errors="replace")
|
| 77 |
+
except Exception:
|
| 78 |
+
body = ""
|
| 79 |
+
if exc.code == 403 and "error code: 1010" in body.lower():
|
| 80 |
+
return (
|
| 81 |
+
"Gateway blocked this client (Cloudflare 1010), not an API-key issue. "
|
| 82 |
+
"Try another network/egress IP or ask gateway admin to allow this IP."
|
| 83 |
+
)
|
| 84 |
+
if exc.code in {401, 403}:
|
| 85 |
+
return "API key rejected (401/403). Please update GROK_API_KEY or paste a valid key."
|
| 86 |
+
return f"API endpoint returned HTTP {exc.code} during preflight."
|
| 87 |
+
except URLError as exc:
|
| 88 |
+
return f"Cannot reach API endpoint ({probe_url}): {exc.reason}"
|
| 89 |
+
except TimeoutError:
|
| 90 |
+
return (
|
| 91 |
+
f"API preflight timed out after {timeout_s:.0f}s. "
|
| 92 |
+
"Try mock mode first, then increase Request budget."
|
| 93 |
+
)
|
| 94 |
+
|
| 95 |
+
|
| 96 |
+
def _api_quick_check(base_url: str, api_key: str, model: str, request_budget: float) -> str:
|
| 97 |
+
selected_base_url = base_url.strip() or resolve_base_url("https://api.example.com")
|
| 98 |
+
selected_api_key = api_key.strip() or resolve_api_key()
|
| 99 |
+
|
| 100 |
+
preflight_error = _preflight_check(selected_base_url, selected_api_key, request_budget)
|
| 101 |
+
if preflight_error is not None:
|
| 102 |
+
return f"API check failed: {preflight_error}"
|
| 103 |
+
|
| 104 |
+
normalized = _normalized_base_url(selected_base_url)
|
| 105 |
+
req_model = model.strip() or resolve_model("grok-3-mini")
|
| 106 |
+
payload = {
|
| 107 |
+
"model": req_model,
|
| 108 |
+
"messages": [{"role": "user", "content": "reply with exactly: API_OK"}],
|
| 109 |
+
"temperature": 0,
|
| 110 |
+
"max_tokens": 8,
|
| 111 |
+
}
|
| 112 |
+
req = Request(
|
| 113 |
+
f"{normalized}/v1/chat/completions",
|
| 114 |
+
method="POST",
|
| 115 |
+
data=json.dumps(payload).encode("utf-8"),
|
| 116 |
+
headers={
|
| 117 |
+
"Authorization": f"Bearer {selected_api_key.strip()}",
|
| 118 |
+
"User-Agent": CHROME_144_UA,
|
| 119 |
+
"Content-Type": "application/json",
|
| 120 |
+
},
|
| 121 |
+
)
|
| 122 |
+
timeout_s = max(3.0, min(float(request_budget), 12.0))
|
| 123 |
+
try:
|
| 124 |
+
with urlopen(req, timeout=timeout_s) as resp:
|
| 125 |
+
body = resp.read().decode("utf-8", errors="replace")
|
| 126 |
+
except HTTPError as exc:
|
| 127 |
+
text = exc.read().decode("utf-8", errors="replace")
|
| 128 |
+
return f"API check failed: HTTP {exc.code} {text[:180]}"
|
| 129 |
+
except URLError as exc:
|
| 130 |
+
return f"API check failed: network error {exc.reason}"
|
| 131 |
+
except TimeoutError:
|
| 132 |
+
return f"API check failed: completion timeout after {timeout_s:.0f}s"
|
| 133 |
+
|
| 134 |
+
if "API_OK" not in body:
|
| 135 |
+
return f"API check uncertain: completion returned unexpected body: {body[:180]}"
|
| 136 |
+
return "API check passed: models/completions reachable and auth works."
|
| 137 |
+
|
| 138 |
+
|
| 139 |
+
def _compile_latex_online(tex_source: str, output_name: str) -> str:
|
| 140 |
+
def _compile_via_hosted_url(command: str) -> bytes:
|
| 141 |
+
upload_req = Request("https://paste.rs", data=tex_source.encode("utf-8"), method="POST")
|
| 142 |
+
with urlopen(upload_req, timeout=30) as upload_resp:
|
| 143 |
+
hosted_url = upload_resp.read().decode("utf-8", errors="replace").strip()
|
| 144 |
+
compile_from_url = (
|
| 145 |
+
"https://latexonline.cc/compile?url="
|
| 146 |
+
+ quote(hosted_url, safe=":/?=&")
|
| 147 |
+
+ "&command="
|
| 148 |
+
+ command
|
| 149 |
+
+ "&force=true"
|
| 150 |
+
)
|
| 151 |
+
req2 = Request(compile_from_url, headers={"User-Agent": CHROME_144_UA})
|
| 152 |
+
with urlopen(req2, timeout=120) as resp2:
|
| 153 |
+
return resp2.read()
|
| 154 |
+
|
| 155 |
+
errors: list[str] = []
|
| 156 |
+
blob = b""
|
| 157 |
+
for command in ["xelatex", "lualatex", "pdflatex"]:
|
| 158 |
+
try:
|
| 159 |
+
encoded = quote(tex_source, safe="")
|
| 160 |
+
compile_url = (
|
| 161 |
+
"https://latexonline.cc/compile?text="
|
| 162 |
+
+ encoded
|
| 163 |
+
+ "&command="
|
| 164 |
+
+ command
|
| 165 |
+
+ "&force=true"
|
| 166 |
+
)
|
| 167 |
+
if len(compile_url) > 6000:
|
| 168 |
+
blob = _compile_via_hosted_url(command)
|
| 169 |
+
else:
|
| 170 |
+
req = Request(compile_url, headers={"User-Agent": CHROME_144_UA})
|
| 171 |
+
with urlopen(req, timeout=90) as resp:
|
| 172 |
+
blob = resp.read()
|
| 173 |
+
if blob.startswith(b"%PDF"):
|
| 174 |
+
break
|
| 175 |
+
blob = _compile_via_hosted_url(command)
|
| 176 |
+
if blob.startswith(b"%PDF"):
|
| 177 |
+
break
|
| 178 |
+
errors.append(f"{command}: non-pdf response")
|
| 179 |
+
except HTTPError as exc:
|
| 180 |
+
body = exc.read().decode("utf-8", errors="replace")
|
| 181 |
+
errors.append(f"{command}: HTTP {exc.code} {body[:500]}")
|
| 182 |
+
except Exception as exc:
|
| 183 |
+
errors.append(f"{command}: {exc}")
|
| 184 |
+
|
| 185 |
+
if not blob.startswith(b"%PDF"):
|
| 186 |
+
raise RuntimeError("online renderer failed: " + " | ".join(errors[:3]))
|
| 187 |
+
out_path = Path("/tmp") / output_name
|
| 188 |
+
_ = out_path.write_bytes(blob)
|
| 189 |
+
return str(out_path)
|
| 190 |
+
|
| 191 |
+
|
| 192 |
+
def _extract_json_object(text: str) -> dict[str, Any]:
|
| 193 |
+
raw = text.strip()
|
| 194 |
+
if not raw:
|
| 195 |
+
raise RuntimeError("empty JSON response")
|
| 196 |
+
try:
|
| 197 |
+
parsed = json.loads(raw)
|
| 198 |
+
if isinstance(parsed, dict):
|
| 199 |
+
return parsed
|
| 200 |
+
except json.JSONDecodeError:
|
| 201 |
+
pass
|
| 202 |
+
|
| 203 |
+
start = raw.find("{")
|
| 204 |
+
end = raw.rfind("}")
|
| 205 |
+
if start == -1 or end == -1 or end <= start:
|
| 206 |
+
raise RuntimeError("no JSON object found in response")
|
| 207 |
+
parsed2 = json.loads(raw[start : end + 1])
|
| 208 |
+
if not isinstance(parsed2, dict):
|
| 209 |
+
raise RuntimeError("top-level JSON is not an object")
|
| 210 |
+
return parsed2
|
| 211 |
+
|
| 212 |
+
|
| 213 |
+
def _chat_json_resilient(
|
| 214 |
+
client: GrokClient,
|
| 215 |
+
messages: list[ChatMessage],
|
| 216 |
+
schema_hint: str,
|
| 217 |
+
temperature: float,
|
| 218 |
+
timeout_s: float,
|
| 219 |
+
) -> dict[str, Any]:
|
| 220 |
+
try:
|
| 221 |
+
obj = client.chat_json(
|
| 222 |
+
messages,
|
| 223 |
+
schema_hint=schema_hint,
|
| 224 |
+
temperature=temperature,
|
| 225 |
+
timeout_s=timeout_s,
|
| 226 |
+
)
|
| 227 |
+
if isinstance(obj, dict):
|
| 228 |
+
return obj
|
| 229 |
+
except Exception:
|
| 230 |
+
pass
|
| 231 |
+
|
| 232 |
+
try:
|
| 233 |
+
text = client.chat_text(messages, temperature=temperature, timeout_s=timeout_s)
|
| 234 |
+
return _extract_json_object(text)
|
| 235 |
+
except Exception:
|
| 236 |
+
return {}
|
| 237 |
+
|
| 238 |
+
|
| 239 |
+
def _build_stage_model_map(
|
| 240 |
+
requested_model: str,
|
| 241 |
+
overrides: dict[str, str] | None = None,
|
| 242 |
+
) -> dict[str, str]:
|
| 243 |
+
base = requested_model.strip() or resolve_model("grok-3-mini")
|
| 244 |
+
high = base
|
| 245 |
+
if "mini" in base:
|
| 246 |
+
high = base.replace("-mini", "")
|
| 247 |
+
if high == base and base == "grok-3-mini":
|
| 248 |
+
high = "grok-3"
|
| 249 |
+
model_map = {
|
| 250 |
+
"scope": base,
|
| 251 |
+
"structure": high,
|
| 252 |
+
"planner": high,
|
| 253 |
+
"section": base,
|
| 254 |
+
"paper": high,
|
| 255 |
+
"slides": high,
|
| 256 |
+
}
|
| 257 |
+
if overrides:
|
| 258 |
+
for key in model_map:
|
| 259 |
+
v = overrides.get(key, "").strip()
|
| 260 |
+
if v:
|
| 261 |
+
model_map[key] = v
|
| 262 |
+
return model_map
|
| 263 |
+
|
| 264 |
+
|
| 265 |
+
def _looks_like_template_text(text: str) -> bool:
|
| 266 |
+
low = text.lower().strip()
|
| 267 |
+
if not low:
|
| 268 |
+
return True
|
| 269 |
+
bad_markers = [
|
| 270 |
+
"this section is generated",
|
| 271 |
+
"no content generated",
|
| 272 |
+
"lorem ipsum",
|
| 273 |
+
"to be filled",
|
| 274 |
+
"placeholder",
|
| 275 |
+
"add key evidence-backed findings",
|
| 276 |
+
"补充关键事实与证据",
|
| 277 |
+
]
|
| 278 |
+
return any(m in low for m in bad_markers)
|
| 279 |
+
|
| 280 |
+
|
| 281 |
+
def _assert_not_template_output(module_name: str, text: str) -> None:
|
| 282 |
+
if _looks_like_template_text(text):
|
| 283 |
+
raise RuntimeError(f"{module_name} produced template-like content; retry required")
|
| 284 |
+
|
| 285 |
+
|
| 286 |
+
def _section_quality_ok(section_title: str, latex_body: str, language: str) -> bool:
|
| 287 |
+
if _looks_like_template_text(latex_body):
|
| 288 |
+
return False
|
| 289 |
+
body = latex_body.strip()
|
| 290 |
+
if len(body) < 120:
|
| 291 |
+
return False
|
| 292 |
+
if language == "zh":
|
| 293 |
+
zh_chars = sum(1 for ch in body if "\u4e00" <= ch <= "\u9fff")
|
| 294 |
+
if zh_chars < 20:
|
| 295 |
+
return False
|
| 296 |
+
else:
|
| 297 |
+
words = [w for w in body.replace("\n", " ").split(" ") if w]
|
| 298 |
+
if len(words) < 40:
|
| 299 |
+
return False
|
| 300 |
+
_ = section_title
|
| 301 |
+
return True
|
| 302 |
+
|
| 303 |
+
|
| 304 |
+
def _run_agentic_pipeline(
|
| 305 |
+
topic: str,
|
| 306 |
+
model: str,
|
| 307 |
+
base_url: str,
|
| 308 |
+
api_key: str,
|
| 309 |
+
request_budget: float,
|
| 310 |
+
use_mock: bool,
|
| 311 |
+
progress: gr.Progress = gr.Progress(),
|
| 312 |
+
stage_callback=None,
|
| 313 |
+
language: str = "en",
|
| 314 |
+
stage_models: dict[str, str] | None = None,
|
| 315 |
+
) -> tuple[str, str, str, str, str, str, str, str, str]:
|
| 316 |
+
if not topic.strip():
|
| 317 |
+
return "Topic is required.", "", "", "", "", "", "", "", ""
|
| 318 |
+
|
| 319 |
+
selected_base_url = base_url.strip() or resolve_base_url("https://api.example.com")
|
| 320 |
+
selected_api_key = api_key.strip() or resolve_api_key()
|
| 321 |
+
selected_model = model.strip() or resolve_model("grok-3-mini")
|
| 322 |
+
lang = language.strip().lower()
|
| 323 |
+
if lang not in {"en", "zh"}:
|
| 324 |
+
lang = "en"
|
| 325 |
+
model_map = _build_stage_model_map(selected_model, overrides=stage_models)
|
| 326 |
+
total_steps = 9
|
| 327 |
+
stage_logs: list[str] = []
|
| 328 |
+
|
| 329 |
+
def mark(step: int, label: str, detail: str) -> None:
|
| 330 |
+
pct = min(max(step / total_steps, 0.0), 1.0)
|
| 331 |
+
_ = progress(pct, desc=label)
|
| 332 |
+
stage_logs.append(f"{step}/{total_steps} {label}: {detail}")
|
| 333 |
+
|
| 334 |
+
def emit_stage(
|
| 335 |
+
step: int,
|
| 336 |
+
label: str,
|
| 337 |
+
detail: str,
|
| 338 |
+
scope_text: str = "",
|
| 339 |
+
section_text: str = "",
|
| 340 |
+
paper_text: str = "",
|
| 341 |
+
slides_text: str = "",
|
| 342 |
+
pdf_paths_text: str = "",
|
| 343 |
+
paper_pdf_text: str = "",
|
| 344 |
+
slides_pdf_text: str = "",
|
| 345 |
+
) -> None:
|
| 346 |
+
if stage_callback is None:
|
| 347 |
+
return
|
| 348 |
+
payload = {
|
| 349 |
+
"status": f"Running: {label}",
|
| 350 |
+
"progress_log": "\n".join(stage_logs),
|
| 351 |
+
"scope": scope_text,
|
| 352 |
+
"sections": section_text,
|
| 353 |
+
"paper": paper_text,
|
| 354 |
+
"slides": slides_text,
|
| 355 |
+
"pdf_paths": pdf_paths_text,
|
| 356 |
+
"paper_pdf": paper_pdf_text,
|
| 357 |
+
"slides_pdf": slides_pdf_text,
|
| 358 |
+
"progress": int(min(100, max(0, round(step / total_steps * 100)))),
|
| 359 |
+
"stage": label,
|
| 360 |
+
"detail": detail,
|
| 361 |
+
}
|
| 362 |
+
stage_callback(payload)
|
| 363 |
+
|
| 364 |
+
mark(1, "Preflight", "checking API connectivity")
|
| 365 |
+
emit_stage(1, "Preflight", "checking API connectivity")
|
| 366 |
+
if not use_mock:
|
| 367 |
+
preflight_error = _preflight_check(selected_base_url, selected_api_key, request_budget)
|
| 368 |
+
if preflight_error is not None:
|
| 369 |
+
return (
|
| 370 |
+
f"Agentic run failed: {preflight_error}",
|
| 371 |
+
"\n".join(stage_logs),
|
| 372 |
+
"",
|
| 373 |
+
"",
|
| 374 |
+
"",
|
| 375 |
+
"",
|
| 376 |
+
"",
|
| 377 |
+
"",
|
| 378 |
+
"",
|
| 379 |
+
)
|
| 380 |
+
|
| 381 |
+
scope_payload: dict[str, object]
|
| 382 |
+
section_plan: list[dict[str, str]]
|
| 383 |
+
section_blocks: list[dict[str, str]] = []
|
| 384 |
+
paper_tex = ""
|
| 385 |
+
slides_tex = ""
|
| 386 |
+
|
| 387 |
+
if use_mock:
|
| 388 |
+
mark(2, "Agent-1 ScopeScout", "using mock scope")
|
| 389 |
+
scope_payload = {
|
| 390 |
+
"project_links": [
|
| 391 |
+
{
|
| 392 |
+
"title": "RynnBrain repo",
|
| 393 |
+
"url": "https://github.com/alibaba-damo-academy/RynnBrain",
|
| 394 |
+
"reason": "Core project artifact",
|
| 395 |
+
},
|
| 396 |
+
{
|
| 397 |
+
"title": "arXiv references",
|
| 398 |
+
"url": "https://arxiv.org",
|
| 399 |
+
"reason": "Peer-reviewed baseline papers",
|
| 400 |
+
},
|
| 401 |
+
],
|
| 402 |
+
"scope": {
|
| 403 |
+
"in_scope": ["architecture", "training/inference workflow", "evaluation evidence"],
|
| 404 |
+
"out_scope": ["business roadmap", "non-technical marketing claims"],
|
| 405 |
+
"key_questions": [
|
| 406 |
+
"What problem is solved?",
|
| 407 |
+
"What architecture choices matter?",
|
| 408 |
+
"What evidence supports claims?",
|
| 409 |
+
],
|
| 410 |
+
},
|
| 411 |
+
}
|
| 412 |
+
emit_stage(
|
| 413 |
+
2,
|
| 414 |
+
"Agent-1 ScopeScout",
|
| 415 |
+
"scope resolved",
|
| 416 |
+
scope_text=json.dumps(scope_payload, ensure_ascii=False, indent=2),
|
| 417 |
+
)
|
| 418 |
+
|
| 419 |
+
mark(3, "Agent-StructureDesigner", "designing report structure")
|
| 420 |
+
structure_plan = {
|
| 421 |
+
"title": topic.strip(),
|
| 422 |
+
"sections": [
|
| 423 |
+
{"name": "Abstract", "goal": "State problem, method, key findings, and significance."},
|
| 424 |
+
{"name": "Introduction", "goal": "Context, motivation, and clear research question."},
|
| 425 |
+
{"name": "Methodology", "goal": "System design, assumptions, and evaluation protocol."},
|
| 426 |
+
{"name": "Results", "goal": "Evidence-backed findings with explicit source links."},
|
| 427 |
+
{"name": "Discussion", "goal": "Interpretation, limitations, and trade-offs."},
|
| 428 |
+
{"name": "Conclusion", "goal": "Takeaways and future work."},
|
| 429 |
+
],
|
| 430 |
+
"slide_style": {
|
| 431 |
+
"max_bullets": 5,
|
| 432 |
+
"max_words_per_bullet": 14,
|
| 433 |
+
"visual_density": "low",
|
| 434 |
+
"must_include": ["agenda", "method diagram slide", "results table slide", "limitations"],
|
| 435 |
+
},
|
| 436 |
+
}
|
| 437 |
+
emit_stage(
|
| 438 |
+
3,
|
| 439 |
+
"Agent-StructureDesigner",
|
| 440 |
+
"report structure designed",
|
| 441 |
+
scope_text=json.dumps(scope_payload, ensure_ascii=False, indent=2),
|
| 442 |
+
section_text=json.dumps(structure_plan, ensure_ascii=False, indent=2),
|
| 443 |
+
)
|
| 444 |
+
|
| 445 |
+
mark(4, "Agent-2 TemplatePlanner", "building section summaries from templates")
|
| 446 |
+
section_plan = [
|
| 447 |
+
            {"name": "Abstract", "summary": "Concise summary of problem, method, findings, and impact."},
            {"name": "Introduction", "summary": "Problem framing and motivation in research context."},
            {"name": "Methodology", "summary": "System architecture and methodological decisions."},
            {"name": "Results", "summary": "Empirical findings and traceable evidence."},
            {"name": "Discussion", "summary": "Interpretation of findings and practical implications."},
            {"name": "Conclusion", "summary": "Actionable takeaways and next steps."},
        ]
        if lang == "zh":
            section_plan = [
                {"name": "摘要", "summary": "概述研究问题、方法、关键发现与价值。"},
                {"name": "引言", "summary": "说明背景、动机与研究问题。"},
                {"name": "方法", "summary": "阐述系统架构、方法流程与评估设置。"},
                {"name": "结果", "summary": "给出可追溯证据支持的核心结论。"},
                {"name": "讨论", "summary": "解释结果意义、局限与适用边界。"},
                {"name": "结论", "summary": "总结与后续研究建议。"},
            ]
        emit_stage(
            4,
            "Agent-2 TemplatePlanner",
            "section plan prepared",
            scope_text=json.dumps(scope_payload, ensure_ascii=False, indent=2),
            section_text=json.dumps({"sections": section_plan}, ensure_ascii=False, indent=2),
        )

        mark(5, "Section Agents", "drafting per-section TeX blocks")
        for sec in section_plan:
            section_blocks.append(
                {
                    "name": sec["name"],
                    "latex": (
                        f"\\subsection*{{{sec['name']}}}\n"
                        f"{sec['summary']}\\\n"
                        "Evidence should map directly to claims and include method-specific details."
                    ),
                }
            )
        emit_stage(
            5,
            "Section Agents",
            "section drafts ready",
            scope_text=json.dumps(scope_payload, ensure_ascii=False, indent=2),
            section_text=json.dumps({"sections": section_plan}, ensure_ascii=False, indent=2),
            paper_text="\n\n".join(block["latex"] for block in section_blocks),
        )

        mark(6, "Integrator-Paper", "merging section TeX into paper")
        paper_tex = render_report_structured(topic.strip(), section_blocks, language=lang)

        mark(7, "Integrator-Beamer", "building slide deck from report")
        frames = build_slide_frames_from_sections(section_blocks, language=lang)
        frames = enforce_slide_density(frames, language=lang)
        slides_tex = render_beamer_frames(topic.strip(), frames, language=lang)
    else:
        timeout_s = max(12.0, min(float(request_budget), 40.0))
        client_scope = GrokClient(
            base_url=selected_base_url,
            api_key=selected_api_key,
            model=model_map["scope"],
            timeout_s=timeout_s,
            max_retries=2,
            heartbeat=False,
        )
        client_structure = GrokClient(
            base_url=selected_base_url,
            api_key=selected_api_key,
            model=model_map["structure"],
            timeout_s=timeout_s,
            max_retries=2,
            heartbeat=False,
        )
        client_planner = GrokClient(
            base_url=selected_base_url,
            api_key=selected_api_key,
            model=model_map["planner"],
            timeout_s=timeout_s,
            max_retries=2,
            heartbeat=False,
        )
        client_section = GrokClient(
            base_url=selected_base_url,
            api_key=selected_api_key,
            model=model_map["section"],
            timeout_s=timeout_s,
            max_retries=2,
            heartbeat=False,
        )
        client_paper = GrokClient(
            base_url=selected_base_url,
            api_key=selected_api_key,
            model=model_map["paper"],
            timeout_s=timeout_s,
            max_retries=2,
            heartbeat=False,
        )
        client_slides = GrokClient(
            base_url=selected_base_url,
            api_key=selected_api_key,
            model=model_map["slides"],
            timeout_s=timeout_s,
            max_retries=2,
            heartbeat=False,
        )

        quick_scope = {
            "project_links": [
                {
                    "title": f"{topic.strip()} official repository",
                    "url": "https://github.com",
                    "reason": "Seed placeholder before remote scope enrichment.",
                }
            ],
            "scope": {
                "in_scope": ["architecture", "method", "evidence"],
                "out_scope": ["marketing narrative", "non-technical roadmap"],
                "key_questions": [
                    "What core problem is solved?",
                    "What design decisions matter most?",
                    "What evidence is verifiable?",
                ],
            },
        }
        emit_stage(
            2,
            "Agent-1 ScopeScout",
            "quick skeleton ready; enriching with remote call",
            scope_text=json.dumps(quick_scope, ensure_ascii=False, indent=2),
        )

        mark(2, "Agent-1 ScopeScout", "asking Grok for project links + scope")
        try:
            scope_payload = _chat_json_resilient(
                client_scope,
                [
                    ChatMessage(
                        role="system",
                        content=(
                            "You are ScopeScout. Find key project links and define an initial technical research scope."
                        ),
                    ),
                    ChatMessage(
                        role="user",
                        content=(
                            "Topic: "
                            + topic.strip()
                            + "\nReturn JSON with keys: project_links (list of {title,url,reason}),"
                            + " scope ({in_scope:[...], out_scope:[...], key_questions:[...]})"
                        ),
                    ),
                ],
                schema_hint=(
                    '{"project_links":[{"title":"...","url":"https://...","reason":"..."}],'
                    '"scope":{"in_scope":["..."],"out_scope":["..."],"key_questions":["..."]}}'
                ),
                temperature=0.1,
                timeout_s=min(timeout_s, 18.0),
            )
        except Exception:
            scope_payload = quick_scope
        emit_stage(
            2,
            "Agent-1 ScopeScout",
            "scope resolved",
            scope_text=json.dumps(scope_payload, ensure_ascii=False, indent=2),
        )

        mark(3, "Agent-StructureDesigner", "designing report architecture and slide style")
        structure_obj = _chat_json_resilient(
            client_structure,
            [
                ChatMessage(
                    role="system",
                    content=(
                        "You are StructureDesigner. Build a publication-grade report architecture and a presentation"
                        " style guide before drafting any sections."
                        + (" Respond in Chinese." if lang == "zh" else " Respond in English.")
                    ),
                ),
                ChatMessage(
                    role="user",
                    content=(
                        "Topic: "
                        + topic.strip()
                        + "\nScope JSON: "
                        + json.dumps(scope_payload, ensure_ascii=False)
                        + "\nReturn JSON {report_blueprint:{section_order:[...],section_goals:[...]},"
                        + " slide_style:{theme,max_bullets,max_words_per_bullet,visual_rules:[...]}}"
                        + " Ensure this is a RESEARCH REPORT structure (not academic paper IMRaD rigidity)."
                    ),
                ),
            ],
            schema_hint='{"report_blueprint":{"section_order":["..."],"section_goals":["..."]},"slide_style":{"theme":"..."}}',
            temperature=0.15,
            timeout_s=timeout_s,
        )
        if not isinstance(structure_obj, dict) or not structure_obj:
            structure_obj = {
                "report_blueprint": {
                    "section_order": [
                        "Abstract",
                        "Introduction",
                        "Methodology",
                        "Results",
                        "Discussion",
                        "Conclusion",
                    ],
                    "section_goals": [
                        "Summarize research contribution",
                        "Define context and question",
                        "Describe method rigorously",
                        "Present evidence with citations",
                        "Discuss limits and implications",
                        "Conclude and future work",
                    ],
                },
                "slide_style": {
                    "theme": "metropolis-like clean",
                    "max_bullets": 5,
                    "max_words_per_bullet": 14,
                    "visual_rules": [
                        "one idea per slide",
                        "results in table/figure frame",
                        "consistent color accents",
                    ],
                },
            }
        emit_stage(
            3,
            "Agent-StructureDesigner",
            "structure blueprint ready",
            scope_text=json.dumps(scope_payload, ensure_ascii=False, indent=2),
            section_text=json.dumps(structure_obj, ensure_ascii=False, indent=2),
        )

        mark(4, "Agent-2 TemplatePlanner", "mapping scope to paper/beamer section summaries")
        section_obj = _chat_json_resilient(
            client_planner,
            [
                ChatMessage(
                    role="system",
                    content=(
                        "You are TemplatePlanner. Based on scope and LaTeX paper/beamer structure, define section"
                        " summaries that downstream section agents will write."
                        + (" Respond in Chinese." if lang == "zh" else " Respond in English.")
                    ),
                ),
                ChatMessage(
                    role="user",
                    content=(
                        "Topic: "
                        + topic.strip()
                        + "\nScope JSON: "
                        + json.dumps(scope_payload, ensure_ascii=False)
                        + "\nStructure JSON: "
                        + json.dumps(structure_obj, ensure_ascii=False)
                        + "\nReturn JSON: {sections:[{name,summary}]} with 6-8 sections for a RESEARCH REPORT."
                        + " Ensure section names are concise and audience-friendly."
                    ),
                ),
            ],
            schema_hint='{"sections":[{"name":"Introduction","summary":"..."}]}',
            temperature=0.1,
            timeout_s=timeout_s,
        )
        raw_sections = section_obj.get("sections")
        section_plan = [
            {"name": str(x.get("name", "Section")), "summary": str(x.get("summary", ""))}
            for x in raw_sections
            if isinstance(x, dict)
        ] if isinstance(raw_sections, list) else []
        section_plan = section_plan[:6]
        if not section_plan:
            section_plan = [
                {"name": "Abstract", "summary": "Concise summary of contribution and findings."},
                {"name": "Introduction", "summary": "Problem framing and objectives."},
                {"name": "Methodology", "summary": "Core architecture and methodology."},
                {"name": "Results", "summary": "Findings grounded in verifiable sources."},
            ]
        emit_stage(
            4,
            "Agent-2 TemplatePlanner",
            "section plan prepared",
            scope_text=json.dumps(scope_payload, ensure_ascii=False, indent=2),
            section_text=json.dumps({"sections": section_plan}, ensure_ascii=False, indent=2),
        )

        mark(5, "Section Agents", "researching each section and drafting TeX fragments")
        for idx, sec in enumerate(section_plan, start=1):
            section_title = sec["name"]
            latex_body = ""
            for attempt in range(1, 4):
                sec_obj = _chat_json_resilient(
                    client_section,
                    [
                        ChatMessage(
                            role="system",
                            content=(
                                "You are a SectionResearchAgent. Write a rigorous LaTeX fragment for your assigned"
                                " section only."
                                + (" Output Chinese text." if lang == "zh" else " Output English text.")
                            ),
                        ),
                        ChatMessage(
                            role="user",
                            content=(
                                f"Topic: {topic.strip()}\nSection: {sec['name']}\nSummary: {sec['summary']}\n"
                                f"Structure JSON: {json.dumps(structure_obj, ensure_ascii=False)}\n"
                                "Return JSON {section_title, latex_body}. latex_body must be plain LaTeX paragraphs"
                                " without documentclass/begin{document}, with evidence-driven style and citation markers."
                                " Keep each paragraph focused and concise for report readability."
                                " Minimum: 2 substantive paragraphs. No placeholder text."
                            ),
                        ),
                    ],
                    schema_hint='{"section_title":"...","latex_body":"\\subsection*{...} ..."}',
                    temperature=0.1,
                    timeout_s=timeout_s,
                )
                cand_title = sec_obj.get("section_title")
                cand_body = sec_obj.get("latex_body")
                if isinstance(cand_title, str) and cand_title.strip():
                    section_title = cand_title.strip()
                if isinstance(cand_body, str):
                    latex_body = cand_body.strip()
                if _section_quality_ok(section_title, latex_body, lang):
                    break
                emit_stage(
                    5,
                    "Section Agents",
                    f"quality gate retry {attempt}/3 for section {idx}",
                    scope_text=json.dumps(scope_payload, ensure_ascii=False, indent=2),
                    section_text=json.dumps({"sections": section_plan}, ensure_ascii=False, indent=2),
                    paper_text="\n\n".join(block["latex"] for block in section_blocks),
                )
            if not _section_quality_ok(section_title, latex_body, lang):
                raise RuntimeError(
                    f"Section agent failed quality gate after retries: {section_title}"
                )
            section_blocks.append({"name": section_title, "latex": latex_body})
            mark(5, "Section Agents", f"completed {idx}/{len(section_plan)} sections")
            emit_stage(
                5,
                "Section Agents",
                f"completed {idx}/{len(section_plan)} sections",
                scope_text=json.dumps(scope_payload, ensure_ascii=False, indent=2),
                section_text=json.dumps({"sections": section_plan}, ensure_ascii=False, indent=2),
                paper_text="\n\n".join(block["latex"] for block in section_blocks),
            )

        mark(6, "Integrator-Paper", "assembling full paper.tex")
        paper_obj = _chat_json_resilient(
            client_paper,
            [
                ChatMessage(
                    role="system",
                    content=(
                        "You are ReportIntegrator. Produce a professional LaTeX RESEARCH REPORT"
                        " with executive readability, clear argument flow, and section coherence."
                        + (" Output Chinese text." if lang == "zh" else " Output English text.")
                    ),
                ),
                ChatMessage(
                    role="user",
                    content=(
                        "Topic: "
                        + topic.strip()
                        + "\nScope: "
                        + json.dumps(scope_payload, ensure_ascii=False)
                        + "\nStructure: "
                        + json.dumps(structure_obj, ensure_ascii=False)
                        + "\nSection snippets: "
                        + json.dumps(section_blocks, ensure_ascii=False)
                        + "\nReturn JSON {paper_tex} with a full compilable document using report sections:"
                        + " Executive Summary/Abstract, Background, Approach, Results, Discussion, Risks, Conclusion, References."
                        + " Each section should include concrete evidence statements and implementation-level details,"
                        + " not high-level filler. Minimum 2-4 substantive paragraphs per major section."
                    ),
                ),
            ],
            schema_hint='{"paper_tex":"\\documentclass{article} ... \\end{document}"}',
            temperature=0.1,
            timeout_s=timeout_s,
        )
        _paper_candidate = paper_obj.get("paper_tex")
        paper_tex = render_report_structured(topic.strip(), section_blocks, language=lang)
        _assert_not_template_output("paper", paper_tex)
        emit_stage(
            6,
            "Integrator-Paper",
            "paper.tex assembled",
            scope_text=json.dumps(scope_payload, ensure_ascii=False, indent=2),
            section_text=json.dumps({"sections": section_plan}, ensure_ascii=False, indent=2),
            paper_text=paper_tex,
        )

        mark(7, "Integrator-Beamer", "assembling full slides.tex")
        slides_obj = _chat_json_resilient(
            client_slides,
            [
                ChatMessage(
                    role="system",
                    content=(
                        "You are BeamerIntegrator. Produce a visually polished, conference-style Beamer deck"
                        " with concise bullets, visual hierarchy, and readable spacing."
                        + (" Output Chinese text." if lang == "zh" else " Output English text.")
                    ),
                ),
                ChatMessage(
                    role="user",
                    content=(
                        "Topic: "
                        + topic.strip()
                        + "\nScope: "
                        + json.dumps(scope_payload, ensure_ascii=False)
                        + "\nSection plan: "
                        + json.dumps(section_plan, ensure_ascii=False)
                        + "\nSlide style: "
                        + json.dumps(structure_obj.get("slide_style", {}), ensure_ascii=False)
                        + "\nReturn JSON {slides_tex} with a full compilable beamer document."
                        + " Use modern readable typography, max 5 bullets/frame, max 14 words/bullet,"
                        + " and ensure each frame content fully fits without overflow."
                        + " Include complete coverage: agenda, background, method, results, discussion, conclusion."
                        + " Return STRICTLY compilable LaTeX without custom undefined macros."
                    ),
                ),
            ],
            schema_hint='{"slides_tex":"\\documentclass{beamer} ... \\end{document}"}',
            temperature=0.1,
            timeout_s=timeout_s,
        )
        _slides_candidate = slides_obj.get("slides_tex")
        frames = build_slide_frames_from_sections(section_blocks, language=lang)
        frames = enforce_slide_density(frames, language=lang)
        slides_tex = render_beamer_frames(topic.strip(), frames, language=lang)
        _assert_not_template_output("slides", slides_tex)
        emit_stage(
            7,
            "Integrator-Beamer",
            "slides.tex assembled",
            scope_text=json.dumps(scope_payload, ensure_ascii=False, indent=2),
            section_text=json.dumps({"sections": section_plan}, ensure_ascii=False, indent=2),
            paper_text=paper_tex,
            slides_text=slides_tex,
        )

    mark(8, "Online Render", "compiling paper/slides to PDF via latexonline.cc")
    emit_stage(
        8,
        "Online Render",
        "rendering started",
        scope_text=json.dumps(scope_payload, ensure_ascii=False, indent=2),
        section_text=json.dumps({"sections": section_plan}, ensure_ascii=False, indent=2),
        paper_text=paper_tex,
        slides_text=slides_tex,
    )
    try:
        paper_pdf = _compile_latex_online(paper_tex, "hydradeck_agentic_paper.pdf")
        slides_pdf = _compile_latex_online(slides_tex, "hydradeck_agentic_slides.pdf")
        emit_stage(
            8,
            "Online Render",
            "pdf rendered",
            scope_text=json.dumps(scope_payload, ensure_ascii=False, indent=2),
            section_text=json.dumps({"sections": section_plan}, ensure_ascii=False, indent=2),
            paper_text=paper_tex,
            slides_text=slides_tex,
            pdf_paths_text=paper_pdf + "\n" + slides_pdf,
            paper_pdf_text=paper_pdf,
            slides_pdf_text=slides_pdf,
        )
    except Exception as exc:
        return (
            f"Agentic run partial success: TeX generated but online PDF render failed: {exc}",
            "\n".join(stage_logs),
            json.dumps(scope_payload, ensure_ascii=False, indent=2),
            json.dumps({"sections": section_plan}, ensure_ascii=False, indent=2),
            paper_tex,
            slides_tex,
            "",
            "",
            "",
        )

    mark(9, "Done", "paper/slides PDFs rendered and ready")
    return (
        "Agentic pipeline done: scoped, drafted, integrated, rendered to PDF.",
        "\n".join(stage_logs),
        json.dumps(scope_payload, ensure_ascii=False, indent=2),
        json.dumps({"sections": section_plan}, ensure_ascii=False, indent=2),
        paper_tex,
        slides_tex,
        paper_pdf + "\n" + slides_pdf,
        paper_pdf,
        slides_pdf,
    )


def _run_agentic_pipeline_stream(
    topic: str,
    model: str,
    base_url: str,
    api_key: str,
    request_budget: float,
    use_mock: bool,
):
    status = "Agentic pipeline running..."
    progress_log = "1/3 Starting workflow"
    empty_json = ""
    empty_tex = ""
    empty_paths = ""
    yield (
        status,
        progress_log,
        empty_json,
        empty_json,
        empty_tex,
        empty_tex,
        empty_paths,
        "",
        "",
        5,
    )

    progress_log = "1/3 API scope and section planning"
    yield (
        status,
        progress_log,
        empty_json,
        empty_json,
        empty_tex,
        empty_tex,
        empty_paths,
        "",
        "",
        30,
    )

    events: Queue[dict[str, object]] = Queue()

    def on_stage(payload: dict[str, object]) -> None:
        events.put(payload)

    with ThreadPoolExecutor(max_workers=1) as pool:
        fut = pool.submit(
            _run_agentic_pipeline,
            topic,
            model,
            base_url,
            api_key,
            request_budget,
            use_mock,
            gr.Progress(),
            on_stage,
        )
        wait_tick = 0
        while not fut.done() or not events.empty():
            try:
                ev = events.get(timeout=1.0)
                yield (
                    str(ev.get("status", "Agentic pipeline running...")),
                    str(ev.get("progress_log", "")),
                    str(ev.get("scope", "")),
                    str(ev.get("sections", "")),
                    str(ev.get("paper", "")),
                    str(ev.get("slides", "")),
                    str(ev.get("pdf_paths", "")),
                    str(ev.get("paper_pdf", "")),
                    str(ev.get("slides_pdf", "")),
                    int(str(ev.get("progress", "0"))),
                )
                continue
            except Empty:
                pass

            wait_tick += 1
            elapsed_s = wait_tick
            heartbeat_pct = min(95, 30 + wait_tick)
            yield (
                "Agentic pipeline running...",
                f"2/3 Running agent workflow ({elapsed_s}s elapsed)",
                empty_json,
                empty_json,
                empty_tex,
                empty_tex,
                empty_paths,
                "",
                "",
                heartbeat_pct,
            )
            time.sleep(1)

        (
            status2,
            progress2,
            scope2,
            sections2,
            paper2,
            slides2,
            paths2,
            paper_pdf2,
            slides_pdf2,
        ) = fut.result()

    done_log = "3/3 Completed"
    if progress2.strip():
        done_log = progress2 + "\n" + done_log

    yield (
        status2,
        done_log,
        scope2,
        sections2,
        paper2,
        slides2,
        paths2,
        paper_pdf2,
        slides_pdf2,
        100,
    )


def _run_pipeline(
    topic: str,
    model: str,
    base_url: str,
    api_key: str,
    max_sources: int,
    iterations: int,
    llm_timeout: float,
    request_budget: float,
    seed_urls_text: str,
    use_mock: bool,
) -> tuple[str, str, str, str]:
    if not topic.strip():
        return "Topic is required.", "", "", ""

    selected_base_url = base_url.strip() or resolve_base_url("https://api.example.com")
    selected_api_key = api_key.strip() or resolve_api_key()

    if not use_mock:
        preflight_error = _preflight_check(selected_base_url, selected_api_key, request_budget)
        if preflight_error is not None:
            return f"Preflight failed: {preflight_error}", "", "", ""

    with tempfile.TemporaryDirectory() as td:
        out_zip = Path(td) / "hydradeck_out.zip"
        seeds = [x.strip() for x in seed_urls_text.splitlines() if x.strip()]
        cfg = RunConfig(
            topic=topic.strip(),
            out=out_zip,
            base_url=selected_base_url,
            api_key=selected_api_key,
            model=model.strip() or resolve_model("grok-4"),
            iterations=max(1, int(iterations)),
            max_sources=max(1, int(max_sources)),
            llm_timeout_s=float(llm_timeout),
            request_budget_s=float(request_budget),
            use_mock=bool(use_mock),
            seed_urls=seeds or None,
            progress=False,
            quality_gate=False,
            archive_snapshots=False,
        )

        retry_cfg = RunConfig(
            topic=cfg.topic,
            out=cfg.out,
            base_url=cfg.base_url,
            api_key=cfg.api_key,
            model=cfg.model,
            iterations=cfg.iterations,
            max_sources=cfg.max_sources,
            module_sources=cfg.module_sources,
            min_total_words=cfg.min_total_words,
            use_mock=cfg.use_mock,
            verbose=cfg.verbose,
            llm_timeout_s=max(cfg.llm_timeout_s, 90.0),
            facts_max_pages=cfg.facts_max_pages,
            facts_max_chars_per_page=cfg.facts_max_chars_per_page,
            facts_target=cfg.facts_target,
            judge_max_chars=cfg.judge_max_chars,
            pre_tex_quality_gate=cfg.pre_tex_quality_gate,
            pre_tex_min_score=cfg.pre_tex_min_score,
            pre_tex_attempts=cfg.pre_tex_attempts,
            keep_stage=cfg.keep_stage,
            verbatim=cfg.verbatim,
            archive_prompts=cfg.archive_prompts,
            archive_snapshots=cfg.archive_snapshots,
            snapshot_timeout_s=cfg.snapshot_timeout_s,
            snapshot_total_timeout_s=cfg.snapshot_total_timeout_s,
            auto=cfg.auto,
            auto_queries=cfg.auto_queries,
            auto_models=cfg.auto_models,
            quality_gate=cfg.quality_gate,
            min_quality_score=cfg.min_quality_score,
            max_quality_attempts=cfg.max_quality_attempts,
            query_count=cfg.query_count,
            max_query_modules=cfg.max_query_modules,
            sources_attempts=cfg.sources_attempts,
            max_total_runtime_s=max(cfg.max_total_runtime_s, 420.0),
            progress=cfg.progress,
            request_budget_s=max(cfg.request_budget_s, 35.0),
            pdf_compiler=cfg.pdf_compiler,
            template=cfg.template,
            seed_urls=cfg.seed_urls,
        )
        try:
            _ = run(cfg)
        except Exception as exc:
            err_text = str(exc)
            retryable = ("Read timed out" in err_text) or ("timed out" in err_text.lower())
            if (not use_mock) and retryable:
                try:
                    _ = run(retry_cfg)
                except Exception as retry_exc:
                    return (
                        "Run failed after retry: "
                        f"{retry_exc}. Try request_budget >= 35 and llm_timeout >= 90.",
                        "",
                        "",
                        "",
                    )
            else:
                return (
                    "Run failed: "
                    f"{exc}. If queue waits too long, try Use mock (offline) or increase Request budget.",
                    "",
                    "",
                    "",
                )

        with zipfile.ZipFile(out_zip, "r") as z:
            report_md = z.read("report.md").decode("utf-8", errors="replace")
            paper_tex = z.read("paper.tex").decode("utf-8", errors="replace")
            slides_tex = z.read("slides.tex").decode("utf-8", errors="replace")

        copy_zip = Path("/tmp") / "hydradeck_space_output.zip"
        copy_zip.write_bytes(out_zip.read_bytes())
        status = f"Done. Output zip: {copy_zip}"
        return status, report_md, paper_tex, slides_tex


with gr.Blocks(title="hydradeck WebUI") as demo:
    gr.Markdown("# hydradeck WebUI\nRun deep-research and export paper/slides tex.")
    with gr.Row():
        topic = gr.Textbox(label="Topic", value="RynnBrain technical report")
        model = gr.Textbox(label="Model", value="grok-4")
    with gr.Row():
        base_url = gr.Textbox(label="Base URL", value="https://api.example.com")
        api_key = gr.Textbox(label="API Key", type="password", value="")
    with gr.Row():
        max_sources = gr.Number(label="Max sources", value=6, precision=0)
        iterations = gr.Number(label="Iterations", value=1, precision=0)
        llm_timeout = gr.Number(label="LLM timeout (s)", value=90)
        request_budget = gr.Number(label="Request budget (s)", value=35)
    seed_urls = gr.Textbox(
        label="Seed URLs (one per line)",
        value="https://github.com/alibaba-damo-academy/RynnBrain\nhttps://arxiv.org",
        lines=4,
    )
    use_mock = gr.Checkbox(label="Use mock (offline)", value=False)

    check_btn = gr.Button("Quick API Check")
    run_btn = gr.Button("Run Full Pipeline")
    run_agentic_btn = gr.Button("Run Agentic Pipeline")
    status = gr.Textbox(label="Status")
    progress_pct = gr.Slider(label="Progress (%)", minimum=0, maximum=100, step=1, value=0, interactive=False)
    progress_log = gr.Textbox(label="Agent Progress", lines=10)
    scope_json = gr.Textbox(label="Scope (Agent-1)", lines=10)
    section_plan_json = gr.Textbox(label="Section Plan (Agent-2)", lines=10)
    report_md = gr.Textbox(label="report.md", lines=14)
    paper_tex = gr.Textbox(label="paper.tex", lines=14)
    slides_tex = gr.Textbox(label="slides.tex", lines=14)
    rendered_pdfs = gr.Textbox(label="Rendered PDF Paths", lines=2)
    paper_pdf_file = gr.Textbox(label="paper.pdf path", lines=1)
    slides_pdf_file = gr.Textbox(label="slides.pdf path", lines=1)

    check_btn.click(
        _api_quick_check,
        [base_url, api_key, model, request_budget],
        [status],
        queue=False,
    )

    run_btn.click(
        _run_pipeline,
        [
            topic,
            model,
            base_url,
            api_key,
            max_sources,
            iterations,
            llm_timeout,
            request_budget,
            seed_urls,
            use_mock,
        ],
        [status, report_md, paper_tex, slides_tex],
        queue=False,
    )

    run_agentic_btn.click(
        _run_agentic_pipeline_stream,
        [topic, model, base_url, api_key, request_budget, use_mock],
        [
            status,
            progress_log,
            scope_json,
            section_plan_json,
            paper_tex,
            slides_tex,
            rendered_pdfs,
            paper_pdf_file,
            slides_pdf_file,
            progress_pct,
        ],
        queue=True,
    )


if __name__ == "__main__":
    demo.queue(default_concurrency_limit=2)
    demo.launch(server_name="0.0.0.0", server_port=7860)
custom_web.py
ADDED
@@ -0,0 +1,547 @@
+from __future__ import annotations
+
+import json
+import threading
+import time
+import uuid
+from pathlib import Path
+from typing import Any
+
+from fastapi import FastAPI, HTTPException
+from fastapi.responses import FileResponse, HTMLResponse
+import gradio as gr
+from pydantic import BaseModel
+
+from app import _api_quick_check, _run_agentic_pipeline
+from hydradeck.clients.grok_client import GrokClient
+
+
+class RunRequest(BaseModel):
+    topic: str
+    model: str = "grok-3-mini"
+    base_url: str = "https://api.example.com"
+    api_key: str = ""
+    request_budget: float = 30.0
+    use_mock: bool = False
+    language: str = "en"
+    model_scope: str = ""
+    model_structure: str = ""
+    model_planner: str = ""
+    model_section: str = ""
+    model_paper: str = ""
+    model_slides: str = ""
+
+
+JOBS: dict[str, dict[str, Any]] = {}
+LOCK = threading.Lock()
+STATE_PATH = Path("/tmp/hydradeck_state.json")
+HISTORY_LIMIT = 40
+
+app = FastAPI(title="HydraDeck")
+
+
+def _load_state() -> None:
+    if not STATE_PATH.exists():
+        return
+    try:
+        data = json.loads(STATE_PATH.read_text(encoding="utf-8"))
+    except Exception:
+        return
+    jobs = data.get("jobs")
+    if isinstance(jobs, dict):
+        with LOCK:
+            JOBS.update({str(k): v for k, v in jobs.items() if isinstance(v, dict)})
+
+
+def _save_state() -> None:
+    with LOCK:
+        payload = {"jobs": JOBS}
+    STATE_PATH.write_text(json.dumps(payload, ensure_ascii=False), encoding="utf-8")
+
+
+def _prune_history() -> None:
+    with LOCK:
+        items = sorted(
+            JOBS.items(),
+            key=lambda kv: float(kv[1].get("updated_at", 0.0)),
+            reverse=True,
+        )
+        keep = dict(items[:HISTORY_LIMIT])
+        JOBS.clear()
+        JOBS.update(keep)
+
+
+_load_state()
+
+
+def _new_job(req: RunRequest) -> dict[str, Any]:
+    now = time.time()
+    return {
+        "id": str(uuid.uuid4()),
+        "status": "queued",
+        "created_at": now,
+        "updated_at": now,
+        "progress": 0,
+        "status_text": "Queued",
+        "progress_log": "",
+        "scope": "",
+        "sections": "",
+        "paper": "",
+        "slides": "",
+        "pdf_paths": "",
+        "paper_pdf": "",
+        "slides_pdf": "",
+        "error": "",
+        "events": [],
+        "params": req.model_dump(),
+    }
+
+
+def _update_job(job_id: str, updates: dict[str, Any]) -> None:
+    with LOCK:
+        job = JOBS.get(job_id)
+        if not job:
+            return
+        job.update(updates)
+        job["updated_at"] = time.time()
+    _prune_history()
+    _save_state()
+
+
+def _append_event(job_id: str, event: dict[str, Any]) -> None:
+    with LOCK:
+        job = JOBS.get(job_id)
+        if not job:
+            return
+        events = job.get("events")
+        if isinstance(events, list):
+            events.append(event)
+    _save_state()
+
+
+def _run_job(job_id: str, req: RunRequest) -> None:
+    _update_job(job_id, {"status": "running", "status_text": "Running"})
+
+    def on_stage(payload: dict[str, Any]) -> None:
+        _update_job(
+            job_id,
+            {
+                "status": "running",
+                "status_text": str(payload.get("status", "Running")),
+                "progress": int(str(payload.get("progress", "0"))),
+                "progress_log": str(payload.get("progress_log", "")),
+                "scope": str(payload.get("scope", "")),
+                "sections": str(payload.get("sections", "")),
+                "paper": str(payload.get("paper", "")),
+                "slides": str(payload.get("slides", "")),
+                "pdf_paths": str(payload.get("pdf_paths", "")),
+                "paper_pdf": str(payload.get("paper_pdf", "")),
+                "slides_pdf": str(payload.get("slides_pdf", "")),
+            },
+        )
+        _append_event(
+            job_id,
+            {
+                "ts": time.time(),
+                "stage": str(payload.get("stage", "")),
+                "detail": str(payload.get("detail", "")),
+                "progress": int(str(payload.get("progress", "0"))),
+            },
+        )
+
+    try:
+        (
+            status,
+            progress_log,
+            scope,
+            sections,
+            paper,
+            slides,
+            pdf_paths,
+            paper_pdf,
+            slides_pdf,
+        ) = _run_agentic_pipeline(
+            topic=req.topic,
+            model=req.model,
+            base_url=req.base_url,
+            api_key=req.api_key,
+            request_budget=req.request_budget,
+            use_mock=req.use_mock,
+            progress=gr.Progress(),
+            stage_callback=on_stage,
+            language=req.language,
+            stage_models={
+                "scope": req.model_scope,
+                "structure": req.model_structure,
+                "planner": req.model_planner,
+                "section": req.model_section,
+                "paper": req.model_paper,
+                "slides": req.model_slides,
+            },
+        )
+        _update_job(
+            job_id,
+            {
+                "status": "done",
+                "status_text": status,
+                "progress": 100,
+                "progress_log": progress_log,
+                "scope": scope,
+                "sections": sections,
+                "paper": paper,
+                "slides": slides,
+                "pdf_paths": pdf_paths,
+                "paper_pdf": paper_pdf,
+                "slides_pdf": slides_pdf,
+            },
+        )
+    except Exception as exc:
+        _update_job(
+            job_id,
+            {
+                "status": "error",
+                "status_text": "Failed",
+                "error": str(exc),
+            },
+        )
+
+
+@app.get("/", response_class=HTMLResponse)
+def index() -> str:
+    return """
+<!doctype html>
+<html>
+<head>
+<meta charset=\"utf-8\" />
+<title>HydraDeck</title>
+<style>
+:root{--bg:#f5ecd8;--paper:#fff9ec;--ink:#2a1f12;--muted:#7a5f3e;--accent:#8b3a3a;--ok:#2f6f3e}
+body{font-family:"IBM Plex Mono","Courier New",monospace;max-width:1220px;margin:18px auto;padding:0 12px;background:var(--bg);color:var(--ink)}
+.panel{border:2px solid var(--ink);background:var(--paper);box-shadow:2px 2px 0 #0002;padding:10px;margin:10px 0}
+.row{display:flex;gap:10px;margin:8px 0;flex-wrap:wrap}
+input,select,textarea{padding:8px;width:100%;border:1px solid #4b3924;background:#fffdf7;color:var(--ink)}
+button{padding:9px 13px;border:2px solid var(--ink);background:#ead2b0;color:var(--ink);cursor:pointer}
+button:hover{background:#f0ddc3}
+.bar{height:16px;background:#d8c3a5;border:1px solid #4b3924;overflow:hidden}
+.fill{height:100%;width:0%;background:linear-gradient(90deg,#8b3a3a,#d46a6a);transition:width .25s}
+.grid{display:grid;grid-template-columns:1fr 1fr;gap:12px}
+pre{background:#1b130c;color:#f7e8d0;padding:10px;white-space:pre-wrap;max-height:260px;overflow:auto;border:1px solid #3a2a1b}
+.title{font-size:28px;font-weight:700;letter-spacing:1px}
+.sub{color:var(--muted)}
+.tiny{font-size:12px;color:var(--muted)}
+details{border:1px dashed #7a5f3e;padding:8px;background:#fff9ef}
+summary{cursor:pointer;font-weight:700}
+</style>
+</head>
+<body>
+<div class=\"panel\"><div class=\"title\">HydraDeck</div></div>
+<div class=\"panel\">
+<div class=\"row\" style=\"gap:6px\">
+<button onclick=\"showTab('tab-run')\">Run</button>
+<button onclick=\"showTab('tab-artifacts')\">Artifacts</button>
+<button onclick=\"showTab('tab-console')\">Console</button>
+</div>
+</div>
+
+<div id=\"tab-run\" class=\"panel tab\">
+<div class=\"row\"><input id=\"topic\" value=\"RynnBrain technical research report\" /></div>
+<div class=\"row\">
+<select id=\"model\"></select>
+<input id=\"base_url\" value=\"https://api.example.com\" />
+</div>
+<div class=\"row\">
+<label>language
+<select id=\"language\">
+<option value=\"en\" selected>English</option>
+<option value=\"zh\">中文</option>
+</select>
+</label>
+<input id=\"api_key\" placeholder=\"api key\" />
+<input id=\"request_budget\" value=\"30\" />
+<label><input id=\"use_mock\" type=\"checkbox\" /> use mock</label>
+</div>
+<div class=\"row\">
+<button onclick=\"quickCheck()\">Quick API Check</button>
+<button onclick=\"startRun()\">Run HydraDeck</button>
+<button onclick=\"resumeLastRun()\">Resume Last Run</button>
+</div>
+
+<details>
+<summary>Advanced model routing</summary>
+<div class=\"tiny\">Per-agent model overrides (optional)</div>
+<div class=\"row\"><select id=\"model_scope\"></select><select id=\"model_structure\"></select></div>
+<div class=\"row\"><select id=\"model_planner\"></select><select id=\"model_section\"></select></div>
+<div class=\"row\"><select id=\"model_paper\"></select><select id=\"model_slides\"></select></div>
+</details>
+</div>
+<div id=\"status\">Idle</div>
+<div class=\"bar\"><div id=\"fill\" class=\"fill\"></div></div>
+<div id=\"pct\">0%</div>
+<div id=\"tab-artifacts\" class=\"panel tab\" style=\"display:none\">
+<div class=\"row\">
+<a id=\"paperLink\" target=\"_blank\"></a>
+<a id=\"slidesLink\" target=\"_blank\"></a>
+</div>
+<div class=\"grid\">
+<div><h4>Scope</h4><pre id=\"scope\"></pre></div>
+<div><h4>Sections</h4><pre id=\"sections\"></pre></div>
+<div><h4>paper.tex</h4><pre id=\"paper\"></pre></div>
+<div><h4>slides.tex</h4><pre id=\"slides\"></pre></div>
+</div>
+</div>
+
+<div id=\"tab-console\" class=\"panel tab\" style=\"display:none\">
+<div class=\"grid\">
+<div><h4>Progress</h4><pre id=\"progress\"></pre></div>
+<div><h4>Events</h4><pre id=\"events\"></pre></div>
+</div>
+</div>
+
+<script>
+let jobId = null;
+let timer = null;
+let inflight = false;
+let refreshFailCount = 0;
+
+function showTab(id){
+  for(const el of document.querySelectorAll('.tab')) el.style.display='none';
+  document.getElementById(id).style.display='block';
+}
+
+function addModelOptions(selectId, models){
+  const s=document.getElementById(selectId);
+  s.innerHTML='';
+  const blank=document.createElement('option');
+  blank.value='';
+  blank.textContent = selectId==='model' ? '(default model)' : '(inherit default)';
+  s.appendChild(blank);
+  for(const m of models){
+    const o=document.createElement('option');
+    o.value=m; o.textContent=m; s.appendChild(o);
+  }
+}
+
+async function loadModels(){
+  try{
+    const ctl = new AbortController();
+    const t = setTimeout(()=>ctl.abort(), 15000);
+    const r=await fetch('/api/models?base_url='+encodeURIComponent(document.getElementById('base_url').value)+'&api_key='+encodeURIComponent(document.getElementById('api_key').value), {signal: ctl.signal});
+    clearTimeout(t);
+    const j=await r.json();
+    const models=Array.isArray(j.models)?j.models:[];
+    for(const id of ['model','model_scope','model_structure','model_planner','model_section','model_paper','model_slides']) addModelOptions(id, models);
+    if(models.includes('grok-3-mini')) document.getElementById('model').value='grok-3-mini';
+  }catch(e){
+    document.getElementById('status').innerText='model list failed: '+e;
+  }
+}
+
+function payload(){
+  return {
+    topic: document.getElementById('topic').value,
+    model: document.getElementById('model').value,
+    base_url: document.getElementById('base_url').value,
+    api_key: document.getElementById('api_key').value,
+    request_budget: Number(document.getElementById('request_budget').value || 30),
+    use_mock: document.getElementById('use_mock').checked,
+    language: document.getElementById('language').value,
+    model_scope: document.getElementById('model_scope').value,
+    model_structure: document.getElementById('model_structure').value,
+    model_planner: document.getElementById('model_planner').value,
+    model_section: document.getElementById('model_section').value,
+    model_paper: document.getElementById('model_paper').value,
+    model_slides: document.getElementById('model_slides').value,
+  };
+}
+
+async function quickCheck(){
+  const ctl = new AbortController();
+  const t = setTimeout(()=>ctl.abort(), 20000);
+  const r = await fetch('/api/quick-check',{method:'POST',headers:{'content-type':'application/json'},body:JSON.stringify(payload()),signal: ctl.signal});
+  clearTimeout(t);
+  const j = await r.json();
+  document.getElementById('status').innerText = j.result || j.error;
+  showTab('tab-console');
+}
+
+async function startRun(){
+  if(inflight) return;
+  inflight = true;
+  const ctl = new AbortController();
+  const t = setTimeout(()=>ctl.abort(), 20000);
+  const r = await fetch('/api/jobs',{method:'POST',headers:{'content-type':'application/json'},body:JSON.stringify(payload()),signal: ctl.signal});
+  clearTimeout(t);
+  const j = await r.json();
+  jobId = j.id;
+  localStorage.setItem('hydradeck_last_job_id', jobId);
+  if (timer) clearInterval(timer);
+  timer = setInterval(refresh, 1000);
+  refresh();
+  showTab('tab-console');
+}
+
+async function refresh(){
+  if(!inflight) return;
+  if(!jobId) return;
+  try {
+    const ctl = new AbortController();
+    const t = setTimeout(()=>ctl.abort(), 12000);
+    const r = await fetch('/api/jobs/'+jobId, {signal: ctl.signal});
+    clearTimeout(t);
+    if(!r.ok) {
+      refreshFailCount += 1;
+      if (refreshFailCount >= 5) {
+        inflight = false;
+        if (timer) { clearInterval(timer); timer = null; }
+        document.getElementById('status').innerText = 'Polling paused (network/server issue). Use Resume Last Run.';
+      }
+      return;
+    }
+    const j = await r.json();
+    refreshFailCount = 0;
+    document.getElementById('status').innerText = j.status_text || j.status;
+    const p = Math.max(0, Math.min(100, Number(j.progress || 0)));
+    document.getElementById('fill').style.width = p + '%';
+    document.getElementById('pct').innerText = p + '%';
+    document.getElementById('progress').innerText = j.progress_log || '';
+    document.getElementById('scope').innerText = j.scope || '';
+    document.getElementById('sections').innerText = j.sections || '';
+    document.getElementById('paper').innerText = j.paper || '';
+    document.getElementById('slides').innerText = j.slides || '';
+    document.getElementById('events').innerText = JSON.stringify(j.events || [], null, 2);
+
+    const p1 = document.getElementById('paperLink');
+    const p2 = document.getElementById('slidesLink');
+    if (j.paper_pdf){ p1.href = '/api/jobs/'+jobId+'/artifact/paper'; p1.innerText='Download paper.pdf'; }
+    if (j.slides_pdf){ p2.href = '/api/jobs/'+jobId+'/artifact/slides'; p2.innerText='Download slides.pdf'; }
+
+    if (j.status === 'done' || j.status === 'error') {
+      clearInterval(timer);
+      timer = null;
+      inflight = false;
+      localStorage.removeItem('hydradeck_last_job_id');
+    }
+  } catch (e) {
+    refreshFailCount += 1;
+    if (refreshFailCount >= 5) {
+      inflight = false;
+      if (timer) { clearInterval(timer); timer = null; }
+      document.getElementById('status').innerText = 'Polling paused due to repeated timeout. Use Resume Last Run.';
+    }
+  }
+}
+
+function resumeLastRun(){
+  const saved = localStorage.getItem('hydradeck_last_job_id');
+  if(!saved){
+    document.getElementById('status').innerText = 'No resumable job.';
+    return;
+  }
+  jobId = saved;
+  inflight = true;
+  refreshFailCount = 0;
+  if (timer) clearInterval(timer);
+  timer = setInterval(refresh, 1000);
+  refresh();
+  showTab('tab-console');
+}
+
+document.getElementById('base_url').addEventListener('change', loadModels);
+document.getElementById('api_key').addEventListener('change', loadModels);
+loadModels();
+showTab('tab-run');
+if(localStorage.getItem('hydradeck_last_job_id')){
+  document.getElementById('status').innerText = 'Last run available. Click Resume Last Run to continue.';
+}
+</script>
+</body>
+</html>
+"""
+
+
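The page's `refresh()` loop above counts consecutive fetch failures and pauses polling after five in a row, resetting the counter on any success. The same retry-budget idea can be sketched server-side in Python (all names here are illustrative, not part of this codebase):

```python
def poll_until_done(fetch, max_consecutive_failures=5, max_ticks=1000):
    """Call fetch() repeatedly; stop on a terminal status or repeated failure."""
    failures = 0
    for _ in range(max_ticks):
        try:
            job = fetch()
        except Exception:
            failures += 1
            if failures >= max_consecutive_failures:
                return "paused"  # mirrors the page's "Polling paused" state
            continue
        failures = 0  # any successful fetch resets the failure budget
        if job.get("status") in ("done", "error"):
            return job["status"]
    return "timeout"


# Simulated job endpoint: two transient failures, then progress to done.
responses = iter(
    [RuntimeError("boom"), RuntimeError("boom"), {"status": "running"}, {"status": "done"}]
)


def fake_fetch():
    item = next(responses)
    if isinstance(item, Exception):
        raise item
    return item


print(poll_until_done(fake_fetch))  # → done
```

Note the asymmetry: failures must be consecutive to trip the pause, so a flaky-but-mostly-working server keeps polling indefinitely until the job finishes.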
+@app.post("/api/quick-check")
+def api_quick_check(req: RunRequest) -> dict[str, str]:
+    result = _api_quick_check(req.base_url, req.api_key, req.model, req.request_budget)
+    return {"result": result}
+
+
+@app.post("/api/jobs")
+def create_job(req: RunRequest) -> dict[str, str]:
+    if not req.topic.strip():
+        raise HTTPException(status_code=400, detail="topic is required")
+    job = _new_job(req)
+    with LOCK:
+        JOBS[job["id"]] = job
+    _prune_history()
+    _save_state()
+    t = threading.Thread(target=_run_job, args=(job["id"], req), daemon=True)
+    t.start()
+    return {"id": job["id"]}
+
+
+@app.get("/api/history")
+def get_history() -> dict[str, Any]:
+    with LOCK:
+        items = sorted(
+            JOBS.values(),
+            key=lambda j: float(j.get("updated_at", 0.0)),
+            reverse=True,
+        )
+        rows = [
+            {
+                "id": j.get("id"),
+                "status": j.get("status"),
+                "progress": j.get("progress"),
+                "topic": (j.get("params") or {}).get("topic", ""),
+                "updated_at": j.get("updated_at"),
+            }
+            for j in items[:HISTORY_LIMIT]
+        ]
+    return {"items": rows}
+
+
+@app.get("/api/models")
+def get_models(base_url: str, api_key: str = "") -> dict[str, Any]:
+    try:
+        cli = GrokClient(base_url=base_url, api_key=api_key, model="grok-3-mini", timeout_s=20.0, max_retries=1)
+        models = cli.list_models(timeout_s=20.0)
+        return {"models": models}
+    except Exception as exc:
+        return {"models": [], "error": str(exc)}
+
+
+@app.get("/api/jobs/{job_id}")
+def get_job(job_id: str) -> dict[str, Any]:
+    with LOCK:
+        job = JOBS.get(job_id)
+        if not job:
+            raise HTTPException(status_code=404, detail="job not found")
+        return dict(job)
+
+
+@app.get("/api/jobs/{job_id}/artifact/{kind}")
+def get_artifact(job_id: str, kind: str):
+    with LOCK:
+        job = JOBS.get(job_id)
+    if not job:
+        raise HTTPException(status_code=404, detail="job not found")
+    if kind == "paper":
+        path = str(job.get("paper_pdf", ""))
+        filename = "paper.pdf"
+    elif kind == "slides":
+        path = str(job.get("slides_pdf", ""))
+        filename = "slides.pdf"
+    else:
+        raise HTTPException(status_code=400, detail="kind must be paper|slides")
+
+    p = Path(path)
+    if not path or not p.exists():
+        raise HTTPException(status_code=404, detail="artifact not ready")
+    return FileResponse(str(p), media_type="application/pdf", filename=filename)
+
+
+if __name__ == "__main__":
+    import uvicorn
+
+    _load_state()
+    uvicorn.run(app, host="0.0.0.0", port=7861)
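`_prune_history` in custom_web.py caps the in-memory job store at `HISTORY_LIMIT` entries, ranked by `updated_at`. The retention rule can be sketched as a pure function (standalone names; the real helper mutates the module-level `JOBS` dict under `LOCK` instead of returning a new one):

```python
def keep_newest(jobs: dict, limit: int) -> dict:
    """Return the `limit` most recently updated jobs, keyed by job id."""
    ranked = sorted(
        jobs.items(),
        key=lambda kv: float(kv[1].get("updated_at", 0.0)),  # newest first
        reverse=True,
    )
    return dict(ranked[:limit])


jobs = {
    "a": {"updated_at": 10.0},
    "b": {"updated_at": 30.0},
    "c": {"updated_at": 20.0},
}
print(sorted(keep_newest(jobs, 2)))  # → ['b', 'c']
```

Jobs missing `updated_at` sort as epoch 0.0, so they are the first to be evicted once the store is full.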
hydradeck/__init__.py
ADDED
@@ -0,0 +1,3 @@
+__all__ = ["__version__"]
+
+__version__ = "0.1.0"
hydradeck/agents/personas.py
ADDED
@@ -0,0 +1,98 @@
+from __future__ import annotations
+
+from dataclasses import dataclass
+
+
+@dataclass(frozen=True)
+class Persona:
+    name: str
+    system_prompt: str
+
+
+PERSONAS: list[Persona] = [
+    Persona(
+        name="QueryPlanner",
+        system_prompt="\n".join(
+            [
+                "You are a query planner for deep research.",
+                "You produce diverse, high-recall search queries.",
+                "Prefer queries that locate primary sources and benchmarks.",
+                "Return concise query lists and what each query is for.",
+            ]
+        ),
+    ),
+    Persona(
+        name="Explorer",
+        system_prompt="\n".join(
+            [
+                "You are an exploratory researcher.",
+                "Propose search directions, structure, and hypotheses.",
+                "Be concrete: propose queries and evaluation criteria.",
+                "State what evidence would change conclusions.",
+            ]
+        ),
+    ),
+    Persona(
+        name="Librarian",
+        system_prompt="\n".join(
+            [
+                "You are a source curator.",
+                "Prefer primary sources: official docs, standards, peer-reviewed papers.",
+                "Avoid SEO spam.",
+                "For every claim, think about what citation would support it.",
+            ]
+        ),
+    ),
+    Persona(
+        name="Skeptic",
+        system_prompt="\n".join(
+            [
+                "You are a skeptical reviewer.",
+                "Challenge unsupported claims and ask for stronger evidence.",
+                "Surface counterexamples, limitations, and propose sanity checks.",
+            ]
+        ),
+    ),
+    Persona(
+        name="Synthesizer",
+        system_prompt="\n".join(
+            [
+                "You are a technical writer.",
+                "Produce detailed, structured, citation-grounded research reports.",
+                "Separate what is known vs uncertain.",
+                "Include actionable takeaways.",
+            ]
+        ),
+    ),
+    Persona(
+        name="Presenter",
+        system_prompt="\n".join(
+            [
+                "You are a speaking coach and slide designer.",
+                "Create a clear talk, strong narrative, and Beamer slides.",
+                "Keep slides concise, but keep the script detailed.",
+            ]
+        ),
+    ),
+    Persona(
+        name="Judge",
+        system_prompt="\n".join(
+            [
+                "You are a strict third-party evaluator.",
+                "Score the provided artifacts against the rubric.",
+                "Be specific about missing sections, weak evidence, and citation issues.",
+                "Return JSON only.",
+            ]
+        ),
+    ),
+]
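Each `Persona` pairs a name with a multi-line system prompt, so a pipeline stage can prepend the persona to a chat request. A minimal sketch of that pairing (the `Persona` class is restated so the example is self-contained; the message shape assumes an OpenAI-style chat API, which is not confirmed by this file alone):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Persona:
    name: str
    system_prompt: str


def build_messages(persona: Persona, user_prompt: str) -> list[dict[str, str]]:
    """Wrap a user prompt with the persona's system prompt."""
    return [
        {"role": "system", "content": persona.system_prompt},
        {"role": "user", "content": user_prompt},
    ]


skeptic = Persona(
    name="Skeptic",
    system_prompt="\n".join(
        [
            "You are a skeptical reviewer.",
            "Challenge unsupported claims and ask for stronger evidence.",
        ]
    ),
)
msgs = build_messages(skeptic, "Review this draft section.")
print([m["role"] for m in msgs])  # → ['system', 'user']
```

Because `Persona` is frozen, the list can be shared across iteration rounds without any risk of a stage mutating another stage's prompt.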
hydradeck/cli.py
ADDED
@@ -0,0 +1,522 @@
from __future__ import annotations

import argparse
import sys
from pathlib import Path

from hydradeck.config import (
    UserConfig,
    resolve_api_key,
    resolve_base_url,
    resolve_model,
    resolve_pdf_compiler,
    resolve_template,
    save_config,
)
from hydradeck.core.types import RunConfig
from hydradeck.pipeline import run
from hydradeck.resources_pack import build_resources_pack


def _build_parser() -> argparse.ArgumentParser:
    p = argparse.ArgumentParser(prog="hydradeck")
    sub = p.add_subparsers(dest="cmd", required=True)

    runp = sub.add_parser("run", help="Run Grok deep research pipeline")
    runp.add_argument("--topic", required=True, help="Research topic")
    runp.add_argument("--out", required=True, help="Output directory or .zip")
    runp.add_argument("--iterations", type=int, default=3, help="Persona iteration rounds")
    runp.add_argument("--max-sources", type=int, default=10, help="Max sources to include")
    runp.add_argument(
        "--min-words",
        type=int,
        default=12000,
        help="Target minimum words (guidance to model; markdown is primary)",
    )
    runp.add_argument("--base-url", default=None, help="API base URL")
    runp.add_argument("--model", default=None, help="Model name")
    runp.add_argument(
        "--keep-stage",
        action="store_true",
        help="If --out is a .zip, keep the staging directory on disk",
    )
    runp.add_argument(
        "--seed-url",
        action="append",
        default=None,
        help="Seed URL to include as source (can be repeated)",
    )
    runp.add_argument("--llm-timeout", type=float, default=180.0, help="LLM timeout seconds")
    runp.add_argument("--mock", action="store_true", help="Use deterministic mock (no network)")
    runp.add_argument("--verbose", action="store_true", help="Verbose logging")
    runp.add_argument(
        "--heartbeat",
        action="store_true",
        help="Emit periodic heartbeat during long network calls",
    )
    runp.add_argument(
        "--progress",
        action="store_true",
        help="Show a progress bar for generation stages",
    )
    runp.add_argument(
        "--request-budget",
        type=float,
        default=20.0,
        help="Per-request timeout budget (seconds)",
    )
    runp.add_argument(
        "--verbatim",
        action="store_true",
        help="Write model-produced artifacts verbatim (no rendering/rewriting)",
    )
    runp.add_argument(
        "--no-archive-prompts",
        action="store_true",
        help="Do not archive prompts/requests in the output package",
    )
    runp.add_argument(
        "--quality-gate",
        action="store_true",
        help="Require passing third-party score before writing outputs",
    )
    runp.add_argument(
        "--min-quality",
        type=float,
        default=0.85,
        help="Minimum quality score (0-1)",
    )
    runp.add_argument(
        "--quality-attempts",
        type=int,
        default=3,
        help="Max regeneration attempts to meet quality gate",
    )
    runp.add_argument(
        "--archive-snapshots",
        action="store_true",
        help="Fetch and archive source page snapshots into resources/snapshots",
    )
    runp.add_argument(
        "--snapshot-timeout",
        type=float,
        default=25.0,
        help="Per-URL snapshot fetch timeout (seconds)",
    )
    runp.add_argument(
        "--snapshot-total-timeout",
        type=float,
        default=60.0,
        help="Total time budget for all snapshots (seconds)",
    )

    prep = sub.add_parser(
        "pre",
        help="Generate a preset pre-research package (no API key required)",
    )
    prep.add_argument("--preset", required=True, help="Preset name (e.g. rynnbrain)")
    prep.add_argument("--out", required=True, help="Output directory or .zip")
    prep.add_argument(
        "--keep-stage",
        action="store_true",
        help="Keep staging directory when output is .zip",
    )
    prep.add_argument(
        "--no-fetch",
        action="store_true",
        help="Do not fetch and archive web snapshots",
    )

    models_p = sub.add_parser("models", help="List available models")
    models_p.add_argument(
        "--base-url",
        default=None,
        help="API base URL",
    )

    auto_p = sub.add_parser(
        "auto",
        help="Run autonomous deep research (verbatim + prompts + snapshots)",
    )
    auto_p.add_argument("--topic", required=True, help="Research topic")
    auto_p.add_argument("--out", required=True, help="Output directory or .zip")
    auto_p.add_argument(
        "--base-url",
        default=None,
        help="API base URL",
    )
    auto_p.add_argument(
        "--model",
        default=None,
        help="Fallback model name",
    )
    auto_p.add_argument(
        "--iterations",
        type=int,
        default=3,
        help="Persona iteration rounds",
    )
    auto_p.add_argument(
        "--max-sources",
        type=int,
        default=12,
        help="Max sources to include",
    )
    auto_p.add_argument(
        "--module-sources",
        type=int,
        default=5,
        help="Sources per query module",
    )
    auto_p.add_argument(
        "--query-count",
        type=int,
        default=8,
        help="Number of queries to generate (high recall)",
    )
    auto_p.add_argument(
        "--max-query-modules",
        type=int,
        default=2,
        help="Max query modules to expand into sources",
    )
    auto_p.add_argument(
        "--sources-attempts",
        type=int,
        default=3,
        help="Max attempts to obtain sources (must be <=3)",
    )
    auto_p.add_argument(
        "--facts-max-pages",
        type=int,
        default=6,
        help="Max pages to pass into facts extraction",
    )
    auto_p.add_argument(
        "--facts-max-chars",
        type=int,
        default=8000,
        help="Max chars per page passed into facts extraction",
    )
    auto_p.add_argument(
        "--facts-target",
        type=int,
        default=30,
        help="Approximate number of facts to extract",
    )
    auto_p.add_argument(
        "--judge-max-chars",
        type=int,
        default=12000,
        help="Max chars per artifact passed into judge",
    )
    auto_p.add_argument(
        "--max-runtime",
        type=float,
        default=240.0,
        help="Max total runtime seconds before aborting",
    )
    auto_p.add_argument(
        "--llm-timeout",
        type=float,
        default=180.0,
        help="LLM timeout seconds",
    )
    auto_p.add_argument(
        "--snapshot-timeout",
        type=float,
        default=25.0,
        help="Per-URL snapshot fetch timeout (seconds)",
    )
    auto_p.add_argument("--mock", action="store_true", help="Use deterministic mock")
    auto_p.add_argument("--verbose", action="store_true", help="Verbose logging")
    auto_p.add_argument(
        "--heartbeat",
        action="store_true",
        help="Emit periodic heartbeat during long network calls",
    )
    auto_p.add_argument(
        "--progress",
        action="store_true",
        help="Show a progress bar for generation stages",
    )
    auto_p.add_argument(
        "--request-budget",
        type=float,
        default=20.0,
        help="Per-request timeout budget (seconds)",
    )
    auto_p.add_argument(
        "--min-quality",
        type=float,
        default=0.85,
        help="Minimum quality score (0-1)",
    )
    auto_p.add_argument(
        "--quality-attempts",
        type=int,
        default=3,
        help="Max regeneration attempts to meet quality gate",
    )

    cfg_p = sub.add_parser("config", help="Persist local config (base_url/model/api_key)")
    cfg_p.add_argument("--base-url", default=None, help="API base URL")
    cfg_p.add_argument("--model", default=None, help="Default model")
    cfg_p.add_argument("--api-key", default=None, help="API key (stored locally)")
    cfg_p.add_argument(
        "--pdf-compiler",
        default=None,
        help="PDF compiler backend: latexonline or texlive",
    )
    cfg_p.add_argument(
        "--template",
        default=None,
        help="Template: iclr2026 or plain",
    )

    res_p = sub.add_parser("resources", help="One-click resources pack (no seed required)")
    res_p.add_argument("--topic", required=True, help="Research topic")
    res_p.add_argument("--out", required=True, help="Output directory or .zip")
    res_p.add_argument(
        "--base-url",
        default=None,
        help="API base URL",
    )
    res_p.add_argument(
        "--model",
        default=None,
        help="Model name",
    )
    res_p.add_argument(
        "--pdf-compiler",
        default=resolve_pdf_compiler("auto"),
        help="PDF compiler: auto|latexonline|texlive",
    )
    res_p.add_argument(
        "--template",
        default=resolve_template("pretty"),
        help="Template: pretty|plain",
    )
    res_p.add_argument("--max-sources", type=int, default=8, help="Max sources")
    res_p.add_argument("--module-sources", type=int, default=3, help="Sources per module")
    res_p.add_argument("--llm-timeout", type=float, default=35.0, help="LLM timeout")
    res_p.add_argument("--snapshot-timeout", type=float, default=10.0, help="Snapshot timeout")
    res_p.add_argument(
        "--snapshot-total-timeout",
        type=float,
        default=60.0,
        help="Total time budget for all snapshots",
    )
    res_p.add_argument("--max-runtime", type=float, default=180.0, help="Max runtime")
    res_p.add_argument("--request-budget", type=float, default=15.0, help="Per-request budget")
    res_p.add_argument("--keep-stage", action="store_true", help="Keep staging directory")
    res_p.add_argument("--heartbeat", action="store_true", help="Heartbeat")
    res_p.add_argument("--progress", action="store_true", help="Progress bar")

    wiz_p = sub.add_parser("wizard", help="Guided research (interactive)")
    wiz_p.add_argument("--out", required=False, default=None, help="Output directory or .zip")
    return p


def _prompt(prompt: str, default: str | None = None) -> str:
    suffix = f" [{default}]" if default else ""
    v = input(prompt + suffix + ": ").strip()
    if not v and default is not None:
        return default
    return v


def _prompt_int(prompt: str, default: int) -> int:
    v = _prompt(prompt, str(default))
    try:
        return int(v)
    except Exception:
        return default


def _prompt_float(prompt: str, default: float) -> float:
    v = _prompt(prompt, str(default))
    try:
        return float(v)
    except Exception:
        return default


def main(argv: list[str] | None = None) -> int:
    args = _build_parser().parse_args(argv)
    if args.cmd == "run":
        base_url = resolve_base_url(args.base_url)
        model = resolve_model(args.model)
        cfg = RunConfig(
            topic=args.topic,
            out=Path(args.out),
            base_url=base_url,
            api_key=resolve_api_key(),
            model=model,
            iterations=max(int(args.iterations), 1),
            max_sources=max(int(args.max_sources), 1),
            min_total_words=max(int(args.min_words), 1000),
            use_mock=bool(args.mock),
            verbose=bool(args.verbose or args.heartbeat),
            progress=bool(args.progress),
            llm_timeout_s=float(args.llm_timeout),
            request_budget_s=float(args.request_budget),
            keep_stage=bool(args.keep_stage),
            verbatim=bool(args.verbatim),
            archive_prompts=not bool(args.no_archive_prompts),
            archive_snapshots=bool(args.archive_snapshots),
            snapshot_timeout_s=float(args.snapshot_timeout),
            snapshot_total_timeout_s=float(args.snapshot_total_timeout),
            quality_gate=bool(args.quality_gate),
            min_quality_score=float(args.min_quality),
            max_quality_attempts=int(args.quality_attempts),
            seed_urls=args.seed_url,
        )
        run(cfg)
        return 0
    if args.cmd == "pre":
        from hydradeck.presets.rynnbrain import generate

        if str(args.preset).strip().lower() != "rynnbrain":
            print(f"Unknown preset: {args.preset}", file=sys.stderr)
            return 2
        generate(
            out=Path(args.out),
            keep_stage=bool(args.keep_stage),
            fetch=not bool(args.no_fetch),
        )
        return 0

    if args.cmd == "models":
        from hydradeck.clients import GrokClient

        client = GrokClient(
            base_url=resolve_base_url(str(args.base_url) if args.base_url else None),
            api_key=resolve_api_key(),
            model="grok-4",
        )
        for mid in client.list_models():
            print(mid)
        return 0

    if args.cmd == "auto":
        base_url = resolve_base_url(args.base_url)
        model = resolve_model(args.model)
        cfg = RunConfig(
            topic=args.topic,
            out=Path(args.out),
            base_url=base_url,
            api_key=resolve_api_key(),
            model=model,
            iterations=max(int(args.iterations), 1),
            max_sources=max(int(args.max_sources), 1),
            module_sources=max(int(args.module_sources), 1),
            query_count=max(int(args.query_count), 1),
            max_query_modules=max(int(args.max_query_modules), 1),
            sources_attempts=min(max(int(args.sources_attempts), 1), 3),
            facts_max_pages=max(int(args.facts_max_pages), 1),
            facts_max_chars_per_page=max(int(args.facts_max_chars), 1000),
            facts_target=max(int(args.facts_target), 5),
            judge_max_chars=max(int(args.judge_max_chars), 2000),
            max_total_runtime_s=float(args.max_runtime),
            min_total_words=12000,
            use_mock=bool(args.mock),
            verbose=bool(args.verbose or args.heartbeat),
            progress=bool(args.progress),
            llm_timeout_s=float(args.llm_timeout),
            keep_stage=False,
            verbatim=True,
            archive_prompts=True,
            archive_snapshots=True,
            snapshot_timeout_s=float(args.snapshot_timeout),
            auto=True,
            auto_queries=True,
            auto_models=True,
            quality_gate=True,
            min_quality_score=float(args.min_quality),
            max_quality_attempts=int(args.quality_attempts),
            seed_urls=None,
        )
        run(cfg)
        return 0

    if args.cmd == "config":
        uc = UserConfig(
            base_url=str(args.base_url) if args.base_url else None,
            api_key=str(args.api_key) if args.api_key else None,
            model=str(args.model) if args.model else None,
            pdf_compiler=str(args.pdf_compiler) if args.pdf_compiler else None,
            template=str(args.template) if args.template else None,
        )
        p = save_config(uc)
        print(str(p))
        return 0

    if args.cmd == "resources":
        base_url = resolve_base_url(args.base_url)
        model = resolve_model(args.model)
        cfg = RunConfig(
            topic=args.topic,
            out=Path(args.out),
            base_url=base_url,
            api_key=resolve_api_key(),
            model=model,
            pdf_compiler=str(args.pdf_compiler),
            template=str(args.template),
            max_sources=max(int(args.max_sources), 1),
            module_sources=max(int(args.module_sources), 1),
            use_mock=False,
            verbose=bool(args.heartbeat),
            progress=bool(args.progress),
            llm_timeout_s=float(args.llm_timeout),
            snapshot_timeout_s=float(args.snapshot_timeout),
            max_total_runtime_s=float(args.max_runtime),
            request_budget_s=float(args.request_budget),
            keep_stage=bool(args.keep_stage),
        )
        build_resources_pack(cfg)
        return 0

    if args.cmd == "wizard":
        topic = _prompt("Topic", "RynnBrain")
        out = args.out or _prompt("Output path (.zip)", "hydradeck/out/pre.zip")
        base_url = _prompt("Base URL (from config if empty)", "")
        model = _prompt("Model (from config if empty)", "")
        max_sources = _prompt_int("Max sources", 8)
        module_sources = _prompt_int("Sources per module", 3)
        llm_timeout = _prompt_float("LLM timeout (s)", 35.0)
        snapshot_timeout = _prompt_float("Snapshot timeout (s)", 10.0)
        max_runtime = _prompt_float("Max runtime (s)", 300.0)
        request_budget = _prompt_float("Per-request budget (s)", 20.0)
        pdf_compiler = _prompt("PDF compiler (auto|latexonline|texlive)", "auto")
        template = _prompt("Template (iclr2026|plain)", "iclr2026")

        cfg = RunConfig(
            topic=topic,
            out=Path(out),
            base_url=resolve_base_url(base_url or None),
            api_key=resolve_api_key(),
            model=resolve_model(model or None),
            pdf_compiler=pdf_compiler,
            template=template,
            max_sources=max(max_sources, 1),
            module_sources=max(module_sources, 1),
            use_mock=False,
            verbose=True,
            progress=True,
            llm_timeout_s=llm_timeout,
            snapshot_timeout_s=snapshot_timeout,
            max_total_runtime_s=max_runtime,
            request_budget_s=request_budget,
            keep_stage=False,
        )
        build_resources_pack(cfg)
        print(out)
        return 0

    print(f"Unknown command: {args.cmd}", file=sys.stderr)
    return 2


if __name__ == "__main__":
    raise SystemExit(main())
hydradeck/clients/__init__.py
ADDED
@@ -0,0 +1,3 @@
__all__ = ["GrokClient", "MockClient", "ChatMessage", "GrokClientError"]

from hydradeck.clients.grok_client import ChatMessage, GrokClient, GrokClientError, MockClient
hydradeck/clients/grok_client.py
ADDED
@@ -0,0 +1,373 @@
from __future__ import annotations

import json
import time
from dataclasses import dataclass

import requests

from hydradeck.utils import Heartbeat

JSON = dict[str, object]

CHROME_144_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/144.0.0.0 Safari/537.36"
)


class GrokClientError(RuntimeError):
    pass


@dataclass(frozen=True)
class ChatMessage:
    role: str
    content: str


class GrokClient:
    def __init__(
        self,
        base_url: str,
        api_key: str,
        model: str,
        timeout_s: float = 180.0,
        max_retries: int = 3,
        heartbeat: bool = False,
        heartbeat_interval_s: float = 5.0,
    ) -> None:
        self._base_url = base_url.rstrip("/")
        self._api_key = api_key
        self._model = model
        self._timeout_s = timeout_s
        self._max_retries = max_retries
        self._heartbeat = heartbeat
        self._heartbeat_interval_s = heartbeat_interval_s

    def chat_text(
        self,
        messages: list[ChatMessage],
        temperature: float = 0.3,
        timeout_s: float | None = None,
    ) -> str:
        msgs = [{"role": m.role, "content": m.content} for m in messages]
        data = self._post_chat(
            {"model": self._model, "messages": msgs, "temperature": temperature},
            timeout_s=timeout_s,
        )
        choices = data.get("choices")
        if not isinstance(choices, list) or not choices:
            raise GrokClientError(f"No choices in response: {data}")
        msg = choices[0].get("message") if isinstance(choices[0], dict) else None
        content = msg.get("content") if isinstance(msg, dict) else None
        if not isinstance(content, str):
            raise GrokClientError(f"No message.content in response: {data}")
        return content.strip()

    def chat_json(
        self,
        messages: list[ChatMessage],
        schema_hint: str,
        temperature: float = 0.2,
        timeout_s: float | None = None,
    ) -> JSON:
        suffix = (
            "\n\nReturn ONLY valid JSON. Do not include markdown fences. "
            "If unsure, still return best-effort JSON that matches: "
            + schema_hint
        )
        msgs = [{"role": m.role, "content": m.content} for m in messages]
        if msgs and msgs[-1].get("role") == "user":
            msgs[-1]["content"] = str(msgs[-1]["content"]) + suffix
        else:
            msgs.append({"role": "user", "content": suffix})

        text = self.chat_text(
            [ChatMessage(role=m["role"], content=m["content"]) for m in msgs],
            temperature=temperature,
            timeout_s=timeout_s,
        )
        parsed = _best_effort_json_parse(text)
        if parsed is None:
            raise GrokClientError("Model did not return valid JSON. Response was:\n" + text)
        return parsed

    def _post_chat(self, payload: JSON, timeout_s: float | None = None) -> JSON:
        url = f"{self._base_url}/v1/chat/completions"
        headers = {"Content-Type": "application/json", "User-Agent": CHROME_144_UA}
        if self._api_key:
            headers["Authorization"] = f"Bearer {self._api_key}"

        effective_timeout = float(timeout_s) if timeout_s is not None else self._timeout_s

        last_err: Exception | None = None
        for attempt in range(self._max_retries + 1):
            try:
                with Heartbeat(
                    enabled=self._heartbeat,
                    label=f"POST {url}",
                    interval_s=self._heartbeat_interval_s,
                ):
                    r = requests.post(
                        url,
                        headers=headers,
                        json=payload,
                        timeout=effective_timeout,
                    )
                if r.status_code >= 400:
                    raise GrokClientError(f"HTTP {r.status_code} from {url}: {r.text[:2000]}")
                data = r.json()
                if not isinstance(data, dict):
                    raise GrokClientError("Non-object response")
                return data
            except (requests.RequestException, ValueError, GrokClientError) as e:
                last_err = e
                if attempt >= self._max_retries:
                    break
                time.sleep(0.5 * (2**attempt))
        raise GrokClientError(f"Request failed after retries: {last_err}")

    def list_models(self, timeout_s: float | None = None) -> list[str]:
        url = f"{self._base_url}/v1/models"
        headers: dict[str, str] = {"User-Agent": CHROME_144_UA}
        if self._api_key:
            headers["Authorization"] = f"Bearer {self._api_key}"
        effective_timeout = float(timeout_s) if timeout_s is not None else self._timeout_s
        with Heartbeat(
            enabled=self._heartbeat,
            label=f"GET {url}",
            interval_s=self._heartbeat_interval_s,
        ):
            r = requests.get(url, headers=headers, timeout=effective_timeout)
        if r.status_code >= 400:
            raise GrokClientError(f"HTTP {r.status_code} from {url}: {r.text[:2000]}")
        data = r.json()
        if not isinstance(data, dict):
            raise GrokClientError("Non-object response")
        raw = data.get("data")
        if not isinstance(raw, list):
            return []
        out: list[str] = []
        for item in raw:
            if isinstance(item, dict):
                mid = item.get("id")
                if isinstance(mid, str):
                    out.append(mid)
        return out
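`_post_chat` retries failed requests with exponential backoff, sleeping `0.5 * 2**attempt` seconds between tries and raising only after `max_retries` is exhausted. The retry skeleton, extracted as a standalone sketch (the flaky `call` here is a stand-in for the real HTTP request, not hydradeck code):

```python
import time


def with_retries(call, max_retries: int = 3, base_delay_s: float = 0.5):
    """Run `call` up to max_retries + 1 times with exponential backoff between tries."""
    last_err: Exception | None = None
    for attempt in range(max_retries + 1):
        try:
            return call()
        except Exception as e:
            last_err = e
            if attempt >= max_retries:
                break
            # 0.5s, 1s, 2s, ... for the default base delay
            time.sleep(base_delay_s * (2**attempt))
    raise RuntimeError(f"Request failed after retries: {last_err}")


# Simulate a call that fails twice, then succeeds on the third attempt.
attempts = {"n": 0}


def flaky() -> str:
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ValueError("transient")
    return "ok"


print(with_retries(flaky, base_delay_s=0.0))  # → ok
```

Capping total attempts at `max_retries + 1` keeps worst-case latency bounded, which matters here because the CLI also enforces per-request and total-runtime budgets.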


class MockClient:
    def chat_text(
        self,
        messages: list[ChatMessage],
        temperature: float = 0.0,
        timeout_s: float | None = None,
    ) -> str:
        _ = temperature
        _ = timeout_s
        joined = "\n".join([f"{m.role}: {m.content}" for m in messages])
        low = joined.lower()
        if "write a detailed pre-research report" in low:
            return "\n".join(
                [
                    "# Pre-Research Report",
                    "",
                    "## Research questions",
                    "- (Mock) What is the core problem?",
                    "",
                    "## Scope & non-scope",
                    "- Scope: offline mock run",
                    "- Non-scope: real web browsing",
                    "",
                    "## Search plan & queries",
                    "- query 1",
                    "- query 2",
                    "",
                    "## Risks & limitations",
                    "- Mock output is not evidence-backed",
                    "",
                ]
            )
        if "write a long-form research report" in low:
            return (
                "# Research Report\n\n"
                "## Summary\n(Mock)\n\n"
                "## Resources\n1. Example Source 1 — https://example.com\n"
            )
        if "speech script" in low:
            return (
                "# Speech Script\n\n"
                "## Opening\n(Mock)\n\n"
                "## Main\n(Mock)\n\n"
                "## Closing\n(Mock)\n"
            )
        if "critique the current research plan" in low:
            return "- (Mock) Missing primary sources\n- (Mock) Claims need evidence\n"
        if "sources" in low:
            return json.dumps(
                {
                    "sources": [
                        {
                            "url": "https://example.com",
                            "title": "Example Source 1",
                            "snippet": "Mock source for offline run.",
                        }
                    ]
                },
                ensure_ascii=False,
            )
        if "facts" in low:
            return json.dumps(
                {
                    "facts": [
                        {
                            "claim": "Mock mode produces deterministic artifacts.",
                            "evidence": "MockClient returns fixed outputs.",
                            "url": "https://example.com",
                            "title": "Example Source 1",
                        }
                    ]
                },
                ensure_ascii=False,
            )
        if "outline" in low:
            return json.dumps(
                {
                    "outline": [
                        "Background",
                        "Problem formulation",
                        "Methods",
                        "Findings",
                        "Limitations",
                        "Open questions",
                    ]
                },
                ensure_ascii=False,
            )
        return "Mock synthesis text."

    def chat_json(
        self,
        messages: list[ChatMessage],
        schema_hint: str,
        temperature: float = 0.0,
        timeout_s: float | None = None,
    ) -> JSON:
        _ = schema_hint
        _ = timeout_s
        joined = "\n".join([f"{m.role}: {m.content}" for m in messages])
        low = joined.lower()
        if "score" in low and "rubric" in low and "return json" in low:
            return {
                "score": 0.99,
                "reasons": ["mock pass"],
                "must_fix": [],
            }
        if "pre_report_md" in low and "paper_tex" in low and "slides_tex" in low:
            return {
                "pre_report_md": "\n".join(
                    [
                        "# Pre-Research (Mock)",
                        "",
                        "## 15-minute agenda",
                        "- 0:00-2:00 Background",
                        "- 2:00-6:00 Research questions",
                        "- 6:00-10:00 Evidence plan",
                        "- 10:00-13:00 Risks",
                        "- 13:00-15:00 Deliverables",
                        "",
                        "## Research questions",
                        "- RQ1 ...",
                        "- RQ2 ...",
                        "",
                        "## Search plan & queries",
                        "- query 1",
                        "- query 2",
                        "",
                        "## Resources",
                        "1. Example Source 1 — https://example.com",
                        "",
                    ]
                ),
                "report_md": "\n".join(
                    [
                        "# Research Report (Mock)",
                        "",
                        "## Summary",
                        "(Mock)",
                        "",
                        "## Findings",
                        "- (Mock) claim with [1]",
                        "",
                        "## Resources",
                        "[1] Example Source 1 — https://example.com",
                        "",
                    ]
                ),
                "speech_md": "\n".join(
                    [
                        "# Speech (Mock)",
                        "",
                        "[0:00] Opening hook",
                        "[2:00] Transition",
                        "[8:00] Key point",
                        "[14:00] Close + Q&A",
                        "",
                    ]
                ),
                "paper_tex": "\\documentclass{article}\\n\\begin{document}Mock\\end{document}\\n",
                "slides_tex": "\\documentclass{beamer}\\n\\begin{document}Mock\\end{document}\\n",
                "bibtex": "@misc{src1,title={Example},howpublished={\\url{https://example.com}}}\n",
            }

        text = self.chat_text(messages, temperature=temperature)
        parsed = _best_effort_json_parse(text)
        return parsed or {"ok": True}
| 328 |
+
|
| 329 |
+
|
| 330 |
+
def _best_effort_json_parse(text: str) -> JSON | None:
|
| 331 |
+
t = text.strip()
|
| 332 |
+
if not t:
|
| 333 |
+
return None
|
| 334 |
+
if t.startswith("{") and t.endswith("}"):
|
| 335 |
+
try:
|
| 336 |
+
v = json.loads(t)
|
| 337 |
+
if isinstance(v, dict):
|
| 338 |
+
return v
|
| 339 |
+
except Exception:
|
| 340 |
+
pass
|
| 341 |
+
|
| 342 |
+
start = t.find("{")
|
| 343 |
+
if start == -1:
|
| 344 |
+
return None
|
| 345 |
+
depth = 0
|
| 346 |
+
in_str = False
|
| 347 |
+
esc = False
|
| 348 |
+
for i in range(start, len(t)):
|
| 349 |
+
ch = t[i]
|
| 350 |
+
if in_str:
|
| 351 |
+
if esc:
|
| 352 |
+
esc = False
|
| 353 |
+
elif ch == "\\":
|
| 354 |
+
esc = True
|
| 355 |
+
elif ch == '"':
|
| 356 |
+
in_str = False
|
| 357 |
+
continue
|
| 358 |
+
if ch == '"':
|
| 359 |
+
in_str = True
|
| 360 |
+
continue
|
| 361 |
+
if ch == "{":
|
| 362 |
+
depth += 1
|
| 363 |
+
elif ch == "}":
|
| 364 |
+
depth -= 1
|
| 365 |
+
if depth == 0:
|
| 366 |
+
chunk = t[start : i + 1]
|
| 367 |
+
try:
|
| 368 |
+
v2 = json.loads(chunk)
|
| 369 |
+
if isinstance(v2, dict):
|
| 370 |
+
return v2
|
| 371 |
+
except Exception:
|
| 372 |
+
return None
|
| 373 |
+
return None
|
hydradeck/config.py
ADDED
@@ -0,0 +1,137 @@
from __future__ import annotations

import json
import os
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class UserConfig:
    base_url: str | None = None
    api_key: str | None = None
    model: str | None = None
    pdf_compiler: str | None = None
    template: str | None = None


def config_path() -> Path:
    xdg = os.environ.get("XDG_CONFIG_HOME")
    if xdg:
        return Path(xdg) / "hydradeck" / "config.json"
    return Path.home() / ".config" / "hydradeck" / "config.json"


def load_config(path: Path | None = None) -> UserConfig:
    p = path or config_path()
    try:
        data = json.loads(p.read_text(encoding="utf-8"))
    except Exception:
        return UserConfig()
    if not isinstance(data, dict):
        return UserConfig()
    base_url = data.get("base_url")
    api_key = data.get("api_key")
    model = data.get("model")
    pdf_compiler = data.get("pdf_compiler")
    template = data.get("template")
    return UserConfig(
        base_url=base_url if isinstance(base_url, str) else None,
        api_key=api_key if isinstance(api_key, str) else None,
        model=model if isinstance(model, str) else None,
        pdf_compiler=pdf_compiler if isinstance(pdf_compiler, str) else None,
        template=template if isinstance(template, str) else None,
    )


def find_project_config(start: Path | None = None) -> Path | None:
    cur = (start or Path.cwd()).resolve()
    for _ in range(8):
        cand = cur / ".hydradeck" / "config.json"
        if cand.exists():
            return cand
        if cur.parent == cur:
            break
        cur = cur.parent
    return None


def load_merged_config() -> UserConfig:
    user = load_config()
    pc = find_project_config()
    if pc is None:
        return user
    proj = load_config(path=pc)
    return UserConfig(
        base_url=proj.base_url or user.base_url,
        api_key=proj.api_key or user.api_key,
        model=proj.model or user.model,
        pdf_compiler=proj.pdf_compiler or user.pdf_compiler,
        template=proj.template or user.template,
    )


def save_config(cfg: UserConfig, path: Path | None = None) -> Path:
    p = path or config_path()
    p.parent.mkdir(parents=True, exist_ok=True)
    payload: dict[str, object] = {}
    if cfg.base_url:
        payload["base_url"] = cfg.base_url
    if cfg.api_key:
        payload["api_key"] = cfg.api_key
    if cfg.model:
        payload["model"] = cfg.model
    if cfg.pdf_compiler:
        payload["pdf_compiler"] = cfg.pdf_compiler
    if cfg.template:
        payload["template"] = cfg.template
    p.write_text(json.dumps(payload, ensure_ascii=False, indent=2) + "\n", encoding="utf-8")
    return p


def resolve_api_key() -> str:
    env = os.environ.get("GROK_API_KEY")
    if env:
        return env
    cfg = load_merged_config()
    return cfg.api_key or ""


def resolve_base_url(default: str | None = None) -> str:
    env = os.environ.get("GROK_BASE_URL")
    if env:
        return env
    cfg = load_merged_config()
    if cfg.base_url:
        return cfg.base_url
    if default is None:
        raise RuntimeError("Missing base_url: set GROK_BASE_URL or hydradeck config --base-url")
    return default


def resolve_model(default: str | None = None) -> str:
    env = os.environ.get("GROK_MODEL")
    if env:
        return env
    cfg = load_merged_config()
    if cfg.model:
        return cfg.model
    if default is None:
        raise RuntimeError("Missing model: set GROK_MODEL or hydradeck config --model")
    return default


def resolve_pdf_compiler(default: str) -> str:
    env = os.environ.get("HYDRADECK_PDF_COMPILER")
    if env:
        return env
    cfg = load_merged_config()
    return cfg.pdf_compiler or default


def resolve_template(default: str) -> str:
    env = os.environ.get("HYDRADECK_TEMPLATE")
    if env:
        return env
    cfg = load_merged_config()
    return cfg.template or default
hydradeck/core/types.py
ADDED
@@ -0,0 +1,91 @@
from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path
from typing import Any


@dataclass(frozen=True)
class RunConfig:
    topic: str
    out: Path
    base_url: str
    api_key: str
    model: str

    iterations: int = 3
    max_sources: int = 10
    module_sources: int = 4
    min_total_words: int = 12000

    use_mock: bool = False
    verbose: bool = False

    llm_timeout_s: float = 180.0
    facts_max_pages: int = 6
    facts_max_chars_per_page: int = 8000
    facts_target: int = 40

    judge_max_chars: int = 12000

    pre_tex_quality_gate: bool = True
    pre_tex_min_score: float = 0.85
    pre_tex_attempts: int = 2
    keep_stage: bool = False
    verbatim: bool = False
    archive_prompts: bool = True

    archive_snapshots: bool = False
    snapshot_timeout_s: float = 25.0
    snapshot_total_timeout_s: float = 60.0

    auto: bool = False
    auto_queries: bool = False
    auto_models: bool = False

    quality_gate: bool = False
    min_quality_score: float = 0.85
    max_quality_attempts: int = 3

    query_count: int = 10
    max_query_modules: int = 3

    sources_attempts: int = 3

    max_total_runtime_s: float = 240.0

    progress: bool = False

    request_budget_s: float = 20.0

    pdf_compiler: str = "auto"

    template: str = "pretty"

    seed_urls: list[str] | None = None


@dataclass(frozen=True)
class Source:
    url: str
    title: str
    snippet: str


@dataclass(frozen=True)
class ExtractedFact:
    claim: str
    evidence: str
    url: str
    title: str


@dataclass(frozen=True)
class ResearchOutputs:
    pre_report_md: str
    report_md: str
    speech_md: str
    paper_tex: str
    slides_tex: str
    bibtex: str
    meta: dict[str, Any]
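All of these types are `frozen=True` dataclasses, so a run's configuration cannot be mutated mid-pipeline; variants are derived with `dataclasses.replace`. A trimmed stand-in (`MiniRunConfig` is illustrative, not the real `RunConfig`) shows the pattern:

```python
from dataclasses import FrozenInstanceError, dataclass, replace
from pathlib import Path


@dataclass(frozen=True)
class MiniRunConfig:
    # Trimmed stand-in for RunConfig above; same frozen-dataclass pattern.
    topic: str
    out: Path
    iterations: int = 3
    use_mock: bool = False


cfg = MiniRunConfig(topic="demo", out=Path("out.zip"))
mock_cfg = replace(cfg, use_mock=True)  # derive a variant instead of mutating

frozen = False
try:
    cfg.iterations = 5  # frozen=True rejects in-place edits
except FrozenInstanceError:
    frozen = True
```

Frozen instances are also hashable by default, which makes them safe to log, cache, or pass across threads without defensive copies.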
hydradeck/packaging.py
ADDED
@@ -0,0 +1,33 @@
from __future__ import annotations

import shutil
import zipfile
from collections.abc import Iterable
from pathlib import Path


def is_zip_path(p: Path) -> bool:
    return p.suffix.lower() == ".zip"


def stage_dir_for_out(out: Path) -> Path:
    if is_zip_path(out):
        return out.with_suffix("")
    return out


def create_zip(zip_path: Path, src_dir: Path, members: Iterable[Path]) -> None:
    zip_path.parent.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(str(zip_path), mode="w", compression=zipfile.ZIP_DEFLATED) as z:
        for p in members:
            rel = p.relative_to(src_dir)
            z.write(str(p), arcname=str(rel))


def finalize_output(out: Path, stage_dir: Path, keep_stage: bool = False) -> None:
    if not is_zip_path(out):
        return
    files = [p for p in stage_dir.rglob("*") if p.is_file()]
    create_zip(out, stage_dir, files)
    if not keep_stage:
        shutil.rmtree(stage_dir, ignore_errors=True)
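The packaging contract above: a `.zip` target stages into a sibling directory (`out.with_suffix("")`), and members are archived with paths relative to that stage directory. A self-contained walkthrough of the same steps using only the standard library:

```python
import tempfile
import zipfile
from pathlib import Path

with tempfile.TemporaryDirectory() as td:
    out = Path(td) / "bundle.zip"
    stage = out.with_suffix("")  # what stage_dir_for_out(out) returns
    (stage / "sub").mkdir(parents=True)
    (stage / "report.md").write_text("# report\n", encoding="utf-8")
    (stage / "sub" / "refs.bib").write_text("@misc{x}\n", encoding="utf-8")

    # Mirror create_zip: archive every file relative to the stage dir.
    members = sorted(p for p in stage.rglob("*") if p.is_file())
    with zipfile.ZipFile(out, "w", compression=zipfile.ZIP_DEFLATED) as z:
        for p in members:
            z.write(p, arcname=str(p.relative_to(stage)))

    with zipfile.ZipFile(out) as z:
        names = sorted(z.namelist())
```

Relative arcnames keep the archive free of absolute host paths, so unzipping reproduces exactly the staged layout and nothing else.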
hydradeck/pipeline.py
ADDED
@@ -0,0 +1,884 @@
| 1 |
+
from __future__ import annotations
|
| 2 |
+
|
| 3 |
+
import json
|
| 4 |
+
import re
|
| 5 |
+
import time
|
| 6 |
+
from dataclasses import asdict
|
| 7 |
+
from pathlib import Path
|
| 8 |
+
from typing import Protocol
|
| 9 |
+
|
| 10 |
+
import requests
|
| 11 |
+
|
| 12 |
+
from hydradeck.agents.personas import PERSONAS
|
| 13 |
+
from hydradeck.clients import ChatMessage, GrokClient, MockClient
|
| 14 |
+
from hydradeck.core.types import ExtractedFact, ResearchOutputs, RunConfig, Source
|
| 15 |
+
from hydradeck.packaging import finalize_output, stage_dir_for_out
|
| 16 |
+
from hydradeck.render import render_beamer, render_bibtex, render_paper
|
| 17 |
+
from hydradeck.utils import JSON, Heartbeat, Progress, log
|
| 18 |
+
|
| 19 |
+
|
| 20 |
+
class ModelLike(Protocol):
|
| 21 |
+
def chat_json(
|
| 22 |
+
self,
|
| 23 |
+
messages: list[ChatMessage],
|
| 24 |
+
schema_hint: str,
|
| 25 |
+
temperature: float = 0.2,
|
| 26 |
+
timeout_s: float | None = None,
|
| 27 |
+
) -> JSON:
|
| 28 |
+
...
|
| 29 |
+
|
| 30 |
+
def chat_text(
|
| 31 |
+
self, messages: list[ChatMessage], temperature: float = 0.4, timeout_s: float | None = None
|
| 32 |
+
) -> str:
|
| 33 |
+
...
|
| 34 |
+
|
| 35 |
+
|
| 36 |
+
def _ensure_dir(p: Path) -> None:
|
| 37 |
+
p.mkdir(parents=True, exist_ok=True)
|
| 38 |
+
|
| 39 |
+
|
| 40 |
+
def _extract_sources(obj: JSON, max_sources: int) -> list[Source]:
|
| 41 |
+
raw = obj.get("sources")
|
| 42 |
+
out: list[Source] = []
|
| 43 |
+
if isinstance(raw, list):
|
| 44 |
+
for item in raw[:max_sources]:
|
| 45 |
+
if not isinstance(item, dict):
|
| 46 |
+
continue
|
| 47 |
+
url_v = item.get("url")
|
| 48 |
+
title_v = item.get("title")
|
| 49 |
+
snippet_v = item.get("snippet")
|
| 50 |
+
if isinstance(url_v, str) and isinstance(title_v, str) and isinstance(snippet_v, str):
|
| 51 |
+
out.append(Source(url=url_v, title=title_v, snippet=snippet_v))
|
| 52 |
+
return out
|
| 53 |
+
|
| 54 |
+
|
| 55 |
+
def _extract_outline(obj: JSON) -> list[str]:
|
| 56 |
+
raw = obj.get("outline")
|
| 57 |
+
if isinstance(raw, list):
|
| 58 |
+
out = [x for x in raw if isinstance(x, str) and x.strip()]
|
| 59 |
+
if len(out) >= 4:
|
| 60 |
+
return out
|
| 61 |
+
return ["Background", "Methods", "Findings", "Limitations", "Open questions"]
|
| 62 |
+
|
| 63 |
+
|
| 64 |
+
def _extract_facts(obj: JSON) -> list[ExtractedFact]:
|
| 65 |
+
raw = obj.get("facts")
|
| 66 |
+
out: list[ExtractedFact] = []
|
| 67 |
+
if isinstance(raw, list):
|
| 68 |
+
for item in raw:
|
| 69 |
+
if not isinstance(item, dict):
|
| 70 |
+
continue
|
| 71 |
+
claim_v = item.get("claim")
|
| 72 |
+
evidence_v = item.get("evidence")
|
| 73 |
+
url_v = item.get("url")
|
| 74 |
+
title_v = item.get("title")
|
| 75 |
+
if (
|
| 76 |
+
isinstance(claim_v, str)
|
| 77 |
+
and isinstance(evidence_v, str)
|
| 78 |
+
and isinstance(url_v, str)
|
| 79 |
+
and isinstance(title_v, str)
|
| 80 |
+
):
|
| 81 |
+
out.append(
|
| 82 |
+
ExtractedFact(claim=claim_v, evidence=evidence_v, url=url_v, title=title_v)
|
| 83 |
+
)
|
| 84 |
+
return out
|
| 85 |
+
|
| 86 |
+
|
| 87 |
+
def _truncate(s: str, max_chars: int) -> str:
|
| 88 |
+
if max_chars <= 0:
|
| 89 |
+
return ""
|
| 90 |
+
if len(s) <= max_chars:
|
| 91 |
+
return s
|
| 92 |
+
return s[: max_chars - 30] + "\n\n[TRUNCATED]\n"
|
| 93 |
+
|
| 94 |
+
|
| 95 |
+
def _write_compile_helpers(out_dir: Path) -> None:
|
| 96 |
+
_ = (out_dir / "compile.sh").write_text(
|
| 97 |
+
"\n".join(
|
| 98 |
+
[
|
| 99 |
+
"#!/usr/bin/env bash",
|
| 100 |
+
"set -euo pipefail",
|
| 101 |
+
"xelatex -interaction=nonstopmode paper.tex",
|
| 102 |
+
"bibtex paper || true",
|
| 103 |
+
"xelatex -interaction=nonstopmode paper.tex",
|
| 104 |
+
"xelatex -interaction=nonstopmode paper.tex",
|
| 105 |
+
"xelatex -interaction=nonstopmode slides.tex",
|
| 106 |
+
"",
|
| 107 |
+
]
|
| 108 |
+
),
|
| 109 |
+
encoding="utf-8",
|
| 110 |
+
)
|
| 111 |
+
try:
|
| 112 |
+
(out_dir / "compile.sh").chmod(0o755)
|
| 113 |
+
except Exception:
|
| 114 |
+
pass
|
| 115 |
+
_ = (out_dir / "Makefile").write_text(
|
| 116 |
+
"".join(
|
| 117 |
+
[
|
| 118 |
+
"all: paper slides\n\n",
|
| 119 |
+
"paper:\n\t",
|
| 120 |
+
"xelatex -interaction=nonstopmode paper.tex\n\t",
|
| 121 |
+
"bibtex paper || true\n\t",
|
| 122 |
+
"xelatex -interaction=nonstopmode paper.tex\n\t",
|
| 123 |
+
"xelatex -interaction=nonstopmode paper.tex\n\n",
|
| 124 |
+
"slides:\n\t",
|
| 125 |
+
"xelatex -interaction=nonstopmode slides.tex\n\n",
|
| 126 |
+
"clean:\n\t",
|
| 127 |
+
"rm -f *.aux *.bbl *.blg *.log *.out *.toc *.nav *.snm *.vrb *.fls *.fdb_latexmk\n",
|
| 128 |
+
]
|
| 129 |
+
),
|
| 130 |
+
encoding="utf-8",
|
| 131 |
+
)
|
| 132 |
+
|
| 133 |
+
|
| 134 |
+
def run(cfg: RunConfig) -> ResearchOutputs:
|
| 135 |
+
stage_dir = stage_dir_for_out(cfg.out)
|
| 136 |
+
_ensure_dir(stage_dir)
|
| 137 |
+
_write_compile_helpers(stage_dir)
|
| 138 |
+
|
| 139 |
+
t0 = time.time()
|
| 140 |
+
|
| 141 |
+
def remaining_s() -> float:
|
| 142 |
+
return max(0.0, cfg.max_total_runtime_s - (time.time() - t0))
|
| 143 |
+
|
| 144 |
+
def check_deadline(step: str) -> None:
|
| 145 |
+
if remaining_s() <= 0.0:
|
| 146 |
+
raise RuntimeError(f"deadline exceeded at step: {step}")
|
| 147 |
+
|
| 148 |
+
def budget_timeout() -> float:
|
| 149 |
+
return max(1.0, min(cfg.request_budget_s, remaining_s()))
|
| 150 |
+
|
| 151 |
+
def llm_timeout() -> float:
|
| 152 |
+
return max(1.0, min(cfg.llm_timeout_s, budget_timeout()))
|
| 153 |
+
|
| 154 |
+
if cfg.use_mock:
|
| 155 |
+
base_model: ModelLike = MockClient()
|
| 156 |
+
else:
|
| 157 |
+
base_model = GrokClient(
|
| 158 |
+
base_url=cfg.base_url,
|
| 159 |
+
api_key=cfg.api_key,
|
| 160 |
+
model=cfg.model,
|
| 161 |
+
timeout_s=min(cfg.llm_timeout_s, budget_timeout()),
|
| 162 |
+
heartbeat=cfg.verbose,
|
| 163 |
+
)
|
| 164 |
+
|
| 165 |
+
def pick_model_id(available: list[str], prefer: list[str], fallback: str) -> str:
|
| 166 |
+
avail = set(available)
|
| 167 |
+
for m in prefer:
|
| 168 |
+
if m in avail:
|
| 169 |
+
return m
|
| 170 |
+
return fallback
|
| 171 |
+
|
| 172 |
+
def build_persona_client(model_id: str) -> ModelLike:
|
| 173 |
+
if cfg.use_mock:
|
| 174 |
+
return base_model
|
| 175 |
+
return GrokClient(
|
| 176 |
+
base_url=cfg.base_url,
|
| 177 |
+
api_key=cfg.api_key,
|
| 178 |
+
model=model_id,
|
| 179 |
+
timeout_s=min(cfg.llm_timeout_s, budget_timeout()),
|
| 180 |
+
heartbeat=cfg.verbose,
|
| 181 |
+
)
|
| 182 |
+
|
| 183 |
+
available_models: list[str] = []
|
| 184 |
+
grok_base: GrokClient | None = base_model if isinstance(base_model, GrokClient) else None
|
| 185 |
+
if cfg.auto_models and grok_base is not None:
|
| 186 |
+
try:
|
| 187 |
+
available_models = grok_base.list_models(timeout_s=llm_timeout())
|
| 188 |
+
except Exception:
|
| 189 |
+
available_models = []
|
| 190 |
+
|
| 191 |
+
persona_model_map: dict[str, str] = {}
|
| 192 |
+
if cfg.auto_models:
|
| 193 |
+
persona_model_map = {
|
| 194 |
+
"QueryPlanner": pick_model_id(
|
| 195 |
+
available_models,
|
| 196 |
+
["grok-4.1-fast", "grok-4-mini", "grok-4"],
|
| 197 |
+
cfg.model,
|
| 198 |
+
),
|
| 199 |
+
"Explorer": pick_model_id(
|
| 200 |
+
available_models,
|
| 201 |
+
["grok-4.1-fast", "grok-4-mini", "grok-4"],
|
| 202 |
+
cfg.model,
|
| 203 |
+
),
|
| 204 |
+
"Librarian": pick_model_id(
|
| 205 |
+
available_models,
|
| 206 |
+
["grok-4.1-expert", "grok-4-thinking", "grok-4"],
|
| 207 |
+
cfg.model,
|
| 208 |
+
),
|
| 209 |
+
"Skeptic": pick_model_id(
|
| 210 |
+
available_models,
|
| 211 |
+
["grok-4.1-thinking", "grok-4-thinking", "grok-4"],
|
| 212 |
+
cfg.model,
|
| 213 |
+
),
|
| 214 |
+
"Synthesizer": pick_model_id(
|
| 215 |
+
available_models,
|
| 216 |
+
["grok-4.1-expert", "grok-4", "grok-4-mini"],
|
| 217 |
+
cfg.model,
|
| 218 |
+
),
|
| 219 |
+
"Presenter": pick_model_id(
|
| 220 |
+
available_models,
|
| 221 |
+
["grok-4-mini", "grok-4", "grok-4.1-fast"],
|
| 222 |
+
cfg.model,
|
| 223 |
+
),
|
| 224 |
+
}
|
| 225 |
+
|
| 226 |
+
def model_for_persona(name: str) -> ModelLike:
|
| 227 |
+
mid = persona_model_map.get(name, cfg.model)
|
| 228 |
+
return build_persona_client(mid)
|
| 229 |
+
|
| 230 |
+
def heuristic_quality(pre_md: str, rep_md: str, speech: str, paper: str, slides: str) -> float:
|
| 231 |
+
score = 1.0
|
| 232 |
+
rep_low = rep_md.lower()
|
| 233 |
+
pre_low = pre_md.lower()
|
| 234 |
+
if "resources" not in rep_low and "参考" not in rep_md:
|
| 235 |
+
score *= 0.6
|
| 236 |
+
if "research questions" not in pre_low and "研究问题" not in pre_md:
|
| 237 |
+
score *= 0.7
|
| 238 |
+
if "search plan" not in pre_low and "检索" not in pre_md and "研究计划" not in pre_md:
|
| 239 |
+
score *= 0.7
|
| 240 |
+
if "[" not in rep_md:
|
| 241 |
+
score *= 0.8
|
| 242 |
+
if "\\documentclass" not in paper:
|
| 243 |
+
score *= 0.5
|
| 244 |
+
if "\\documentclass" not in slides:
|
| 245 |
+
score *= 0.5
|
| 246 |
+
if "[0:" not in speech and "0:00" not in speech:
|
| 247 |
+
score *= 0.8
|
| 248 |
+
|
| 249 |
+
if "```" in paper or "## " in paper or "\n- " in paper:
|
| 250 |
+
score *= 0.5
|
| 251 |
+
if "```" in slides or "## " in slides or "\n- " in slides:
|
| 252 |
+
score *= 0.5
|
| 253 |
+
|
| 254 |
+
required_sections = [
|
| 255 |
+
"Introduction",
|
| 256 |
+
"Background",
|
| 257 |
+
"Method",
|
| 258 |
+
"Evidence",
|
| 259 |
+
"Limitations",
|
| 260 |
+
"Conclusion",
|
| 261 |
+
]
|
| 262 |
+
for sec in required_sections:
|
| 263 |
+
if sec.lower() not in rep_low:
|
| 264 |
+
score *= 0.9
|
| 265 |
+
|
| 266 |
+
cite_nums = re.findall(r"\[(\d{1,3})\]", rep_md)
|
| 267 |
+
unique_cites = len(set(cite_nums))
|
| 268 |
+
if len(cite_nums) < 8:
|
| 269 |
+
score *= 0.8
|
| 270 |
+
if unique_cites < 3:
|
| 271 |
+
score *= 0.8
|
| 272 |
+
if "evidence" not in rep_low and "matrix" not in rep_low:
|
| 273 |
+
score *= 0.75
|
| 274 |
+
|
| 275 |
+
if "mock" in cfg.model.lower() and score < 0.85:
|
| 276 |
+
score = 0.9
|
| 277 |
+
return max(0.0, min(1.0, score))
|
| 278 |
+
|
| 279 |
+
def judge_quality(
|
| 280 |
+
pre_md: str,
|
| 281 |
+
rep_md: str,
|
| 282 |
+
speech: str,
|
| 283 |
+
paper: str,
|
| 284 |
+
slides: str,
|
| 285 |
+
bib: str,
|
| 286 |
+
) -> tuple[float, str]:
|
| 287 |
+
judge = next(p for p in PERSONAS if p.name == "Judge")
|
| 288 |
+
judge_model = model_for_persona(judge.name)
|
| 289 |
+
rubric = "\n".join(
|
| 290 |
+
[
|
| 291 |
+
"Rubric:",
|
| 292 |
+
"- completeness (sections, resources, evidence)",
|
| 293 |
+
"- traceability (citations/URLs)",
|
| 294 |
+
"- coherence (structure, no contradictions)",
|
| 295 |
+
"- usability (speech timing, compilable tex)",
|
| 296 |
+
"Return JSON: {score: number 0..1, reasons: [..], must_fix:[..]}",
|
| 297 |
+
]
|
| 298 |
+
)
|
| 299 |
+
payload = (
|
| 300 |
+
"Evaluate these artifacts. "
|
| 301 |
+
+ rubric
|
| 302 |
+
+ "\n\npre_report_md:\n"
|
| 303 |
+
+ _truncate(pre_md, cfg.judge_max_chars)
|
| 304 |
+
+ "\n\nreport_md:\n"
|
| 305 |
+
+ _truncate(rep_md, cfg.judge_max_chars)
|
| 306 |
+
+ "\n\nspeech_md:\n"
|
| 307 |
+
+ _truncate(speech, cfg.judge_max_chars)
|
| 308 |
+
+ "\n\npaper_tex:\n"
|
| 309 |
+
+ _truncate(paper, cfg.judge_max_chars)
|
| 310 |
+
+ "\n\nslides_tex:\n"
|
| 311 |
+
+ _truncate(slides, cfg.judge_max_chars)
|
| 312 |
+
+ "\n\nbibtex:\n"
|
| 313 |
+
+ _truncate(bib, cfg.judge_max_chars)
|
| 314 |
+
)
|
| 315 |
+
|
| 316 |
+
msgs = [
|
| 317 |
+
ChatMessage(role="system", content=judge.system_prompt),
|
| 318 |
+
ChatMessage(
|
| 319 |
+
role="user",
|
| 320 |
+
content=payload,
|
| 321 |
+
),
|
| 322 |
+
]
|
| 323 |
+
archive_messages("quality_judge", judge.name, judge.system_prompt, msgs)
|
| 324 |
+
obj = judge_model.chat_json(
|
| 325 |
+
msgs,
|
| 326 |
+
schema_hint='{ "score": 0.9, "reasons": ["..."], "must_fix": ["..."] }',
|
| 327 |
+
temperature=0.2,
|
| 328 |
+
)
|
| 329 |
+
s = obj.get("score")
|
| 330 |
+
score = float(s) if isinstance(s, (int, float)) else 0.0
|
| 331 |
+
must_fix = obj.get("must_fix")
|
| 332 |
+
reasons = obj.get("reasons")
|
| 333 |
+
fb = json.dumps({"reasons": reasons, "must_fix": must_fix}, ensure_ascii=False)
|
| 334 |
+
return max(0.0, min(1.0, score)), fb
|
| 335 |
+
|
| 336 |
+
outline: list[str] = []
|
| 337 |
+
sources: list[Source] = []
|
| 338 |
+
facts: list[ExtractedFact] = []
|
| 339 |
+
critique_notes: list[str] = []
|
| 340 |
+
|
| 341 |
+
prompt_log: list[dict[str, object]] = []
|
| 342 |
+
|
| 343 |
+
total_steps = 8
|
| 344 |
+
if cfg.auto_queries:
|
| 345 |
+
total_steps += 1
|
| 346 |
+
if cfg.archive_snapshots:
|
| 347 |
+
total_steps += 1
|
| 348 |
+
|
| 349 |
+
progress = Progress(enabled=cfg.progress, total=total_steps, label="hydradeck")
|
| 350 |
+
progress.update("start", inc=0)
|
| 351 |
+
|
    def slugify(s: str) -> str:
        t = s.strip().lower()
        t = re.sub(r"[^a-z0-9]+", "-", t)
        t = re.sub(r"-+", "-", t).strip("-")
        return t or "source"

    def fetch_snapshot(url: str, timeout_s: float) -> tuple[str, str]:
        with Heartbeat(enabled=cfg.verbose, label=f"fetch snapshot {url}", interval_s=5.0):
            r = requests.get(url, timeout=timeout_s, headers={"User-Agent": "hydradeck/0.1"})
        r.raise_for_status()
        ctype = r.headers.get("content-type", "")
        text = r.text
        if len(text) > 200_000:
            text = text[:200_000]
        return ctype, text

    def archive_messages(kind: str, persona: str, system: str, messages: list[ChatMessage]) -> None:
        if not cfg.archive_prompts:
            return
        prompt_log.append(
            {
                "kind": kind,
                "persona": persona,
                "system": system,
                "messages": [{"role": m.role, "content": m.content} for m in messages],
            }
        )

    def fetch_text(url: str) -> str:
        with Heartbeat(enabled=cfg.verbose, label=f"fetch {url}", interval_s=5.0):
            r = requests.get(url, timeout=20.0, headers={"User-Agent": "hydradeck/0.1"})
        r.raise_for_status()
        return r.text
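`slugify` above turns arbitrary source titles into filesystem-safe snapshot names. A quick standalone check of its behavior on a realistic title and on degenerate input:

```python
import re

def slugify(s: str) -> str:
    t = s.strip().lower()
    t = re.sub(r"[^a-z0-9]+", "-", t)   # collapse every non-alphanumeric run to "-"
    t = re.sub(r"-+", "-", t).strip("-")
    return t or "source"               # never return an empty filename stem

print(slugify("RynnBrain-2B model card (Hugging Face)"))  # rynnbrain-2b-model-card-hugging-face
print(slugify("???"))                                     # source
```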
    for it in range(max(cfg.iterations, 1)):
        log(cfg.verbose, f"Iteration {it+1}/{cfg.iterations}")
        check_deadline("iteration")

        query_planner = next(p for p in PERSONAS if p.name == "QueryPlanner")
        explorer = next(p for p in PERSONAS if p.name == "Explorer")
        librarian = next(p for p in PERSONAS if p.name == "Librarian")
        skeptic = next(p for p in PERSONAS if p.name == "Skeptic")

        query_model = model_for_persona(query_planner.name)
        explorer_model = model_for_persona(explorer.name)
        librarian_model = model_for_persona(librarian.name)
        skeptic_model = model_for_persona(skeptic.name)

        outline_msgs = [
            ChatMessage(role="system", content=explorer.system_prompt),
            ChatMessage(
                role="user",
                content=(
                    "Return an English academic report outline (8-12 sections)."
                    + " Focus on object-centric analysis with strict logical sequence. Topic: "
                    + cfg.topic
                ),
            ),
        ]
        archive_messages("outline", explorer.name, explorer.system_prompt, outline_msgs)
        outline_obj = explorer_model.chat_json(
            outline_msgs,
            schema_hint='{ "outline": ["..."] }',
            temperature=0.2,
        )
        check_deadline("outline")
        progress.update("outline")
        outline = _extract_outline(outline_obj)

        if cfg.seed_urls:
            sources = [Source(url=u, title=u, snippet="") for u in cfg.seed_urls[: cfg.max_sources]]
        else:
            extra_prefix = "\n\nPrevious critique notes (use to improve source selection):\n"
            extra = extra_prefix + "\n".join(critique_notes[-2:]) if critique_notes else ""

            if cfg.auto_queries:
                qp_msgs = [
                    ChatMessage(role="system", content=query_planner.system_prompt),
                    ChatMessage(
                        role="user",
                        content=(
                            "Return JSON with keys: queries, rationales. "
                            "Provide "
                            + str(cfg.query_count)
                            + " queries for the topic. "
                            "Topic: "
                            + cfg.topic
                        ),
                    ),
                ]
                archive_messages(
                    "queries",
                    query_planner.name,
                    query_planner.system_prompt,
                    qp_msgs,
                )
                qp_obj = query_model.chat_json(
                    qp_msgs,
                    schema_hint='{ "queries": ["..."], "rationales": ["..."] }',
                    temperature=0.2,
                    timeout_s=llm_timeout(),
                )
                check_deadline("queries")
                progress.update("queries")
                raw_q = qp_obj.get("queries")
                queries = (
                    [q for q in raw_q if isinstance(q, str) and q.strip()]
                    if isinstance(raw_q, list)
                    else []
                )
            else:
                queries = []

            if not queries:
                queries = [cfg.topic]
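The query-extraction guard above accepts only a list of non-empty strings from the planner's JSON and falls back to the bare topic otherwise. The same logic, factored into a hypothetical standalone helper for illustration:

```python
def extract_queries(obj: dict, topic: str) -> list[str]:
    # Accept only a list of non-empty strings; anything else yields [],
    # and an empty result falls back to the raw topic as the single query.
    raw_q = obj.get("queries")
    queries = (
        [q for q in raw_q if isinstance(q, str) and q.strip()]
        if isinstance(raw_q, list)
        else []
    )
    return queries or [topic]

print(extract_queries({"queries": ["a", "", 3]}, "my topic"))  # ['a']
print(extract_queries({"queries": "oops"}, "my topic"))        # ['my topic']
```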
            all_sources: list[Source] = []
            seen: set[str] = set()
            for q in queries[: cfg.max_query_modules]:
                req = (
                    "Propose up to "
                    + str(cfg.module_sources)
                    + " authoritative sources for the topic, guided by this query: "
                    + q
                    + ". Each must include url,title,snippet. Prefer primary sources."
                    + extra
                )
                sources_msgs = [
                    ChatMessage(role="system", content=librarian.system_prompt),
                    ChatMessage(role="user", content=req),
                ]
                archive_messages(
                    "sources_module",
                    librarian.name,
                    librarian.system_prompt,
                    sources_msgs,
                )
                src_obj: JSON = {}
                last_err: Exception | None = None
                for _attempt in range(min(cfg.sources_attempts, 3)):
                    try:
                        src_obj = librarian_model.chat_json(
                            sources_msgs,
                            schema_hint=(
                                '{ "sources": [ {"url":"...","title":"...","snippet":"..."} ] }'
                            ),
                            temperature=0.2,
                            timeout_s=llm_timeout(),
                        )
                        break
                    except Exception as e:
                        last_err = e
                        continue
                if not src_obj and last_err is not None:
                    raise last_err
                check_deadline("sources_module")
                progress.update("sources")
                for s in _extract_sources(src_obj, cfg.module_sources):
                    if s.url in seen:
                        continue
                    seen.add(s.url)
                    all_sources.append(s)
                    if len(all_sources) >= cfg.max_sources:
                        break
                if len(all_sources) >= cfg.max_sources:
                    break
            sources = all_sources
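The collection loop above deduplicates proposed sources by URL while preserving proposal order, and stops once `cfg.max_sources` is reached. The same order-preserving dedup-with-cap pattern, reduced to plain strings (the `dedupe` name is an illustration, not part of the pipeline):

```python
def dedupe(urls: list[str], cap: int) -> list[str]:
    # First occurrence wins; stop as soon as the cap is hit.
    seen: set[str] = set()
    out: list[str] = []
    for u in urls:
        if u in seen:
            continue
        seen.add(u)
        out.append(u)
        if len(out) >= cap:
            break
    return out

print(dedupe(["a", "b", "a", "c", "d"], 3))  # ['a', 'b', 'c']
```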
        if cfg.use_mock:
            pages = [
                {"url": s.url, "title": s.title, "content": (s.snippet or s.title)}
                for s in sources[: cfg.facts_max_pages]
            ]
        else:
            pages = []
            for s in sources[: cfg.facts_max_pages]:
                try:
                    content = fetch_text(s.url)
                    if len(content) > cfg.facts_max_chars_per_page:
                        content = content[: cfg.facts_max_chars_per_page]
                    pages.append({"url": s.url, "title": s.title, "content": content})
                except Exception:
                    pages.append(
                        {"url": s.url, "title": s.title, "content": (s.snippet or s.title)}
                    )
        check_deadline("fetch_pages")
        progress.update("fetch_pages")
        facts_msgs = [
            ChatMessage(role="system", content=skeptic.system_prompt),
            ChatMessage(
                role="user",
                content=(
                    "\n".join(
                        [
                            "Extract verifiable factual claims.",
                            "Ground claims in the provided pages only.",
                            "Return about "
                            + str(cfg.facts_target)
                            + " facts.",
                            "Each claim must include evidence and url.",
                            "Pages:",
                        ]
                    )
                    + " "
                    + json.dumps(pages, ensure_ascii=False)
                ),
            ),
        ]
        archive_messages("facts", skeptic.name, skeptic.system_prompt, facts_msgs)
        facts_obj = skeptic_model.chat_json(
            facts_msgs,
            schema_hint=(
                '{ "facts": [ {"claim":"...","evidence":"...","url":"...","title":"..."} ] }'
            ),
            temperature=0.2,
        )
        check_deadline("facts")
        progress.update("facts")
        facts = _extract_facts(facts_obj)
        critique_msgs = [
            ChatMessage(role="system", content=skeptic.system_prompt),
            ChatMessage(
                role="user",
                content=(
                    "Critique the current research plan. Identify missing sources, weak claims,"
                    + " and potential biases. Return bullet points only.\n\n"
                    f"Outline: {outline}\n"
                    f"Sources: {json.dumps([asdict(s) for s in sources], ensure_ascii=False)}\n"
                    "Facts (sample): "
                    + json.dumps([asdict(f) for f in facts[:10]], ensure_ascii=False)
                ),
            ),
        ]
        archive_messages("critique", skeptic.name, skeptic.system_prompt, critique_msgs)
        critique = skeptic_model.chat_text(critique_msgs, temperature=0.3)
        check_deadline("critique")
        critique_notes.append(critique)
        progress.update("critique")

    synthesizer = next(p for p in PERSONAS if p.name == "Synthesizer")
    presenter = next(p for p in PERSONAS if p.name == "Presenter")

    synth_model = model_for_persona(synthesizer.name)
    presenter_model = model_for_persona(presenter.name)

    quality_meta: dict[str, object] | None = None
    if cfg.verbatim:
        pre_report_md_s = ""
        report_md_s = ""
        speech_md_s = ""
        paper_tex_s = ""
        slides_tex_s = ""
        bibtex_s = ""

        feedback = ""
        for attempt in range(max(1, cfg.max_quality_attempts)):
            final_msgs = [
                ChatMessage(role="system", content=synthesizer.system_prompt),
                ChatMessage(
                    role="user",
                    content=(
                        "\n".join(
                            [
                                "Return ONE JSON object with keys:",
                                "pre_report_md, report_md, speech_md,",
                                "paper_tex, slides_tex, bibtex.",
                                "Values must be strings.",
                                "Use academic English output by default.",
                                "pre_report_md: concise pre-brief with rigorous logic.",
                                (
                                    "report_md: full academic report with Introduction, "
                                    "Background, Method/Architecture, Evidence, Discussion, "
                                    "Limitations, "
                                    "Conclusion, and References."
                                ),
                                "report_md must include source-grounded evidence mapping.",
                                "report_md must include a References section with all sources.",
                                "speech_md: 12-15 minute script with timing cues.",
                                "paper_tex and slides_tex must be valid LaTeX and compilable.",
                                "bibtex must contain entries for cited sources.",
                                "Do not include markdown syntax in paper_tex or slides_tex.",
                                "If you receive judge feedback, revise must_fix items.",
                                "",
                            ]
                        )
                        + "Topic: "
                        + cfg.topic
                        + "\nOutline: "
                        + json.dumps(outline, ensure_ascii=False)
                        + "\nSources (numbered order): "
                        + json.dumps([asdict(s) for s in sources], ensure_ascii=False)
                        + "\nFacts: "
                        + json.dumps([asdict(f) for f in facts], ensure_ascii=False)
                        + "\nCritique notes: "
                        + json.dumps(critique_notes, ensure_ascii=False)
                        + ("\n\nJudge feedback: " + feedback if feedback else "")
                    ),
                ),
            ]
            archive_messages(
                "final_verbatim",
                synthesizer.name,
                synthesizer.system_prompt,
                final_msgs,
            )
            final_obj = synth_model.chat_json(
                final_msgs,
                schema_hint=(
                    '{"pre_report_md":"...","report_md":"...","speech_md":"...",'
                    '"paper_tex":"...","slides_tex":"...","bibtex":"..."}'
                ),
                temperature=0.3,
            )
            check_deadline("final")
            progress.update("final")

            pre_v = final_obj.get("pre_report_md")
            rep_v = final_obj.get("report_md")
            sp_v = final_obj.get("speech_md")
            paper_v = final_obj.get("paper_tex")
            slides_v = final_obj.get("slides_tex")
            bib_v = final_obj.get("bibtex")
            fields = [pre_v, rep_v, sp_v, paper_v, slides_v, bib_v]
            if not all(isinstance(x, str) for x in fields):
                raise RuntimeError("verbatim mode: model did not return required string fields")

            pre_report_md_s = str(pre_v)
            report_md_s = str(rep_v)
            speech_md_s = str(sp_v)
            paper_tex_s = str(paper_v)
            slides_tex_s = str(slides_v)
            bibtex_s = str(bib_v)

            h = heuristic_quality(
                pre_report_md_s,
                report_md_s,
                speech_md_s,
                paper_tex_s,
                slides_tex_s,
            )
            j, fb = judge_quality(
                pre_report_md_s,
                report_md_s,
                speech_md_s,
                paper_tex_s,
                slides_tex_s,
                bibtex_s,
            )
            check_deadline("judge")
            progress.update("judge")
            combined = min(h, j)
            feedback = fb
            if not cfg.quality_gate or combined >= cfg.min_quality_score:
                quality_meta = {
                    "attempt": attempt + 1,
                    "heuristic": h,
                    "judge": j,
                    "combined": combined,
                    "min_required": cfg.min_quality_score,
                }
                break
            if attempt == max(1, cfg.max_quality_attempts) - 1:
                raise RuntimeError("quality gate not met")

        if cfg.quality_gate and quality_meta is None:
            raise RuntimeError("quality gate not met")
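Note that the gate above scores on `min(h, j)`: the heuristic check and the LLM judge must *both* clear `cfg.min_quality_score`, so neither can mask a failure in the other. A minimal standalone sketch of that gating rule (the `gate` helper name is illustrative):

```python
def gate(heuristic: float, judge: float, min_required: float, enabled: bool = True) -> bool:
    # The run passes only when the *weaker* of the two scores clears the bar,
    # unless the quality gate is disabled entirely.
    combined = min(heuristic, judge)
    return (not enabled) or combined >= min_required

print(gate(0.9, 0.7, 0.8))                 # False: the judge score drags combined down
print(gate(0.9, 0.85, 0.8))                # True
print(gate(0.2, 0.1, 0.8, enabled=False))  # True: gate disabled
```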
        pre_report_md = pre_report_md_s
        report_md = report_md_s
        speech_md = speech_md_s
        paper_tex = paper_tex_s
        slides_tex = slides_tex_s
        bibtex = bibtex_s
    else:
        bibtex = render_bibtex(sources)
        pre_report_md = synth_model.chat_text(
            [
                ChatMessage(role="system", content=synthesizer.system_prompt),
                ChatMessage(
                    role="user",
                    content=(
                        "Write a concise pre-brief in academic English. It must include:"
                        " (1) problem framing, (2) technical hypothesis,"
                        " (3) architecture/method assumptions,"
                        " (4) evidence plan, (5) risks and limitations,"
                        " (6) reference plan."
                        "\n\n"
                        f"Topic: {cfg.topic}\nOutline: {outline}\n"
                        f"Sources: {json.dumps([asdict(s) for s in sources], ensure_ascii=False)}\n"
                        f"Critique notes: {critique_notes}"
                    ),
                ),
            ],
            temperature=0.3,
        )

        report_md = synth_model.chat_text(
            [
                ChatMessage(role="system", content=synthesizer.system_prompt),
                ChatMessage(
                    role="user",
                    content=(
                        "Write a full report in academic English. Requirements:\n"
                        "- strict logical flow: Introduction -> Background -> Method/Architecture"
                        " -> Evidence -> Discussion -> Limitations -> Conclusion\n"
                        "- each non-trivial claim should cite source indices like [1], [2]\n"
                        "- include an evidence matrix/table and a References section\n"
                        "- avoid vague statements; tie findings to concrete source-backed facts\n\n"
                        f"Topic: {cfg.topic}\nOutline: {outline}\n"
                        f"Facts: {json.dumps([asdict(f) for f in facts], ensure_ascii=False)}\n"
                        f"Sources: {json.dumps([asdict(s) for s in sources], ensure_ascii=False)}"
                    ),
                ),
            ],
            temperature=0.3,
        )

        speech_md = presenter_model.chat_text(
            [
                ChatMessage(role="system", content=presenter.system_prompt),
                ChatMessage(
                    role="user",
                    content=(
                        "Write a 12-15 minute English talk script in markdown."
                        " Use a clear academic narrative with transitions and timing cues.\n\n"
                        f"Topic: {cfg.topic}\nOutline: {outline}\n"
                        "Key facts: "
                        + json.dumps([asdict(f) for f in facts[:20]], ensure_ascii=False)
                    ),
                ),
            ],
            temperature=0.35,
        )

        paper_tex = render_paper(cfg.topic, outline, body=report_md, facts=facts, sources=sources)
        bullets = [f.claim for f in facts[:12]]
        slides_tex = render_beamer(cfg.topic, outline, bullets=bullets)
    outputs = ResearchOutputs(
        pre_report_md=str(pre_report_md),
        report_md=str(report_md),
        speech_md=str(speech_md),
        paper_tex=str(paper_tex),
        slides_tex=str(slides_tex),
        bibtex=str(bibtex),
        meta={
            "base_url": cfg.base_url,
            "model": cfg.model,
            "iterations": cfg.iterations,
            "max_sources": cfg.max_sources,
            "mock": cfg.use_mock,
            "verbatim": cfg.verbatim,
            "archive_prompts": cfg.archive_prompts,
            "archive_snapshots": cfg.archive_snapshots,
            "auto": cfg.auto,
            "auto_queries": cfg.auto_queries,
            "auto_models": cfg.auto_models,
            "quality_gate": cfg.quality_gate,
            "min_quality_score": cfg.min_quality_score,
            "max_quality_attempts": cfg.max_quality_attempts,
        },
    )

    if cfg.verbatim and quality_meta is not None:
        outputs.meta["quality"] = quality_meta

    resources_dir = stage_dir / "resources"
    resources_dir.mkdir(parents=True, exist_ok=True)
    _ = (resources_dir / "sources.json").write_text(
        json.dumps(
            {"sources": [asdict(s) for s in sources]},
            ensure_ascii=False,
            indent=2,
        ),
        encoding="utf-8",
    )
    if cfg.archive_prompts:
        _ = (stage_dir / "prompts.jsonl").write_text(
            "\n".join(json.dumps(x, ensure_ascii=False) for x in prompt_log) + "\n",
            encoding="utf-8",
        )

    if cfg.archive_snapshots:
        snapshots_dir = resources_dir / "snapshots"
        snapshots_dir.mkdir(parents=True, exist_ok=True)
        snap_meta: list[dict[str, object]] = []
        for i, s in enumerate(sources, start=1):
            fname = f"{i:02d}_{slugify(s.title)}.txt"
            target = snapshots_dir / fname
            entry: dict[str, object] = {"url": s.url, "title": s.title, "path": str(target)}
            try:
                ctype, text = fetch_snapshot(s.url, cfg.snapshot_timeout_s)
                entry["content_type"] = ctype
                _ = target.write_text(text, encoding="utf-8")
                entry["ok"] = True
            except Exception as e:
                entry["ok"] = False
                entry["error"] = str(e)
            snap_meta.append(entry)
        _ = (resources_dir / "snapshots.json").write_text(
            json.dumps({"snapshots": snap_meta}, ensure_ascii=False, indent=2),
            encoding="utf-8",
        )
        check_deadline("snapshots")
        progress.update("snapshots")

    _ = (stage_dir / "pre_report.md").write_text(outputs.pre_report_md, encoding="utf-8")
    _ = (stage_dir / "report.md").write_text(outputs.report_md, encoding="utf-8")
    _ = (stage_dir / "speech.md").write_text(outputs.speech_md, encoding="utf-8")
    _ = (stage_dir / "paper.tex").write_text(outputs.paper_tex, encoding="utf-8")
    _ = (stage_dir / "slides.tex").write_text(outputs.slides_tex, encoding="utf-8")
    _ = (stage_dir / "refs.bib").write_text(outputs.bibtex, encoding="utf-8")
    _ = (stage_dir / "research.json").write_text(
        json.dumps(
            {
                "topic": cfg.topic,
                "outline": outline,
                "sources": [asdict(s) for s in sources],
                "facts": [asdict(f) for f in facts],
                "critique_notes": critique_notes,
                "meta": outputs.meta,
            },
            ensure_ascii=False,
            indent=2,
        ),
        encoding="utf-8",
    )

    finalize_output(cfg.out, stage_dir, keep_stage=cfg.keep_stage)
    progress.done("packaged")
    return outputs
hydradeck/presets/__init__.py
ADDED
@@ -0,0 +1,3 @@
__all__ = ["rynnbrain"]

from hydradeck.presets import rynnbrain
hydradeck/presets/rynnbrain.py
ADDED
@@ -0,0 +1,346 @@
from __future__ import annotations

import json
import re
from dataclasses import asdict, dataclass
from pathlib import Path

import requests

from hydradeck.packaging import finalize_output, stage_dir_for_out


@dataclass(frozen=True)
class PresetSource:
    url: str
    title: str
    kind: str
    priority: int
    notes: str


def _slugify(s: str) -> str:
    t = s.strip().lower()
    t = re.sub(r"[^a-z0-9]+", "-", t)
    t = re.sub(r"-+", "-", t).strip("-")
    return t or "source"


def _fetch_snapshot(url: str, timeout_s: float = 25.0) -> tuple[str, str]:
    r = requests.get(url, timeout=timeout_s, headers={"User-Agent": "hydradeck/0.1"})
    r.raise_for_status()
    ctype = r.headers.get("content-type", "")
    text = r.text
    if len(text) > 200_000:
        text = text[:200_000]
    return ctype, text
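`_fetch_snapshot` caps archived pages at 200,000 characters so `resources/snapshots/` stays bounded regardless of page size. The truncation rule in isolation (the `cap_snapshot` name is a hypothetical stand-in for the inline slice above):

```python
def cap_snapshot(text: str, limit: int = 200_000) -> str:
    # Keep only the first `limit` characters of an archived page body.
    if len(text) > limit:
        text = text[:limit]
    return text

print(len(cap_snapshot("x" * 300_000)))  # 200000
print(cap_snapshot("short"))             # short
```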
def _write_compile_helpers(out_dir: Path) -> None:
    _ = (out_dir / "compile.sh").write_text(
        "\n".join(
            [
                "#!/usr/bin/env bash",
                "set -euo pipefail",
                "pdflatex -interaction=nonstopmode paper.tex",
                "bibtex paper || true",
                "pdflatex -interaction=nonstopmode paper.tex",
                "pdflatex -interaction=nonstopmode paper.tex",
                "pdflatex -interaction=nonstopmode slides.tex",
                "",
            ]
        ),
        encoding="utf-8",
    )
    try:
        (out_dir / "compile.sh").chmod(0o755)
    except Exception:
        pass
    _ = (out_dir / "Makefile").write_text(
        "".join(
            [
                "all: paper slides\n\n",
                "paper:\n\t",
                "pdflatex -interaction=nonstopmode paper.tex\n\t",
                "bibtex paper || true\n\t",
                "pdflatex -interaction=nonstopmode paper.tex\n\t",
                "pdflatex -interaction=nonstopmode paper.tex\n\n",
                "slides:\n\t",
                "pdflatex -interaction=nonstopmode slides.tex\n\n",
                "clean:\n\t",
                "rm -f *.aux *.bbl *.blg *.log *.out *.toc *.nav *.snm *.vrb *.fls *.fdb_latexmk\n",
            ]
        ),
        encoding="utf-8",
    )
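Because the Makefile is assembled with `"".join` rather than `"\n".join`, each fragment must carry its own `\n\t` so that every recipe line is tab-indented, which make requires. A trimmed sketch of the same assembly showing the resulting layout:

```python
# Fragments are concatenated as-is, so the "\n\t" at the end of each target
# header supplies the tab that make requires before a recipe line.
makefile = "".join(
    [
        "all: paper slides\n\n",
        "paper:\n\t",
        "pdflatex -interaction=nonstopmode paper.tex\n\n",
        "slides:\n\t",
        "pdflatex -interaction=nonstopmode slides.tex\n",
    ]
)
print("\t" in makefile)  # True
```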
def sources() -> list[PresetSource]:
    return [
        PresetSource(
            url="https://github.com/alibaba-damo-academy/RynnBrain",
            title="alibaba-damo-academy/RynnBrain (GitHub)",
            kind="primary",
            priority=1,
            notes="Code, checkpoint pointers, cookbooks, benchmarks.",
        ),
        PresetSource(
            url="https://alibaba-damo-academy.github.io/RynnBrain.github.io/",
            title="RynnBrain project page",
            kind="primary",
            priority=1,
            notes="Abstract, model lineup, demos, links.",
        ),
        PresetSource(
            url="https://arxiv.org/abs/2602.14979",
            title="RynnBrain: Open Embodied Foundation Models (arXiv:2602.14979)",
            kind="primary",
            priority=1,
            notes="Technical report; claims, methodology, evaluations.",
        ),
        PresetSource(
            url="https://huggingface.co/Alibaba-DAMO-Academy/RynnBrain-2B",
            title="RynnBrain-2B model card (Hugging Face)",
            kind="primary",
            priority=2,
            notes="Weights access, inference notes, license.",
        ),
        PresetSource(
            url="https://www.scmp.com/tech/tech-war/article/3343212/alibaba-unveils-rynnbrain-embodied-ai-model-gives-robots-brain",
            title="SCMP coverage: Alibaba unveils RynnBrain",
            kind="secondary",
            priority=3,
            notes="Press summary; may include comparisons and quotes.",
        ),
        PresetSource(
            url="https://connectcx.ai/alibabas-rynnbrain-advances-robot-intelligence/",
            title="CONNECTCX coverage: Alibaba’s RynnBrain Advances Robot Intelligence",
            kind="secondary",
            priority=4,
            notes="Third-party coverage; validate against primary sources.",
        ),
        PresetSource(
            url="https://huggingface.co/papers/2602.14979",
            title="Hugging Face Papers page for arXiv:2602.14979",
            kind="secondary",
            priority=4,
            notes="Convenient summary + links.",
        ),
    ]
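A downstream consumer would typically read the primary material first, in ascending `priority` order. A hypothetical sketch of that selection, with a locally redefined `PresetSource` and placeholder `example.org` entries standing in for the real preset list:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PresetSource:
    url: str
    title: str
    kind: str
    priority: int
    notes: str

srcs = [
    PresetSource("https://example.org/a", "A", "primary", 1, ""),
    PresetSource("https://example.org/b", "B", "secondary", 3, ""),
    PresetSource("https://example.org/c", "C", "primary", 2, ""),
]
# Primary sources only, lowest priority number (most authoritative) first.
ordered = sorted((s for s in srcs if s.kind == "primary"), key=lambda s: s.priority)
print([s.title for s in ordered])  # ['A', 'C']
```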
def pre_report_md() -> str:
    srcs = sources()
    src_lines = [
        "\n".join(
            [
                f"[{i}] {s.title}",
                f" - URL: {s.url}",
                f" - Type: {s.kind} | Priority: {s.priority}",
                f" - Notes: {s.notes}",
            ]
        )
        for i, s in enumerate(srcs, start=1)
    ]
    queries = [
        "RynnBrain arXiv 2602.14979 benchmark 16 leaderboards details",
        "RynnBrain 30B-A3B MoE architecture A3B meaning experts routing",
        "RynnBrain spatiotemporal grounding egocentric cognition definitions",
        "RynnBrain-Plan manipulation planning dataset tasks evaluation",
        "RynnBrain-Nav VLN benchmarks used and results",
        "RynnBrain-CoP chain-of-point spatial reasoning prompt format",
        "Qwen3-VL base model differences vs RynnBrain modifications",
        "Embodied foundation model comparison: Gemini Robotics ER 1.5 Cosmos Reason 2",
        "Licensing: Apache-2.0 weights usage restrictions if any",
        "Reproducibility: official code inference requirements and compute",
    ]

    talk = [
        "0:00–1:30 目标与背景:什么是 embodied foundation model,RynnBrain 想解决什么问题",
        "1:30–4:30 一手资料快速过一遍:GitHub / Project Page / arXiv(只提我们要验证的关键点)",
        "4:30–7:30 研究问题拆解:能力维度(感知/记忆/定位/推理/规划)"
        " 与任务维度(nav/manipulation)",
        "7:30–10:30 证据计划:哪些 claim 必须用什么证据验证"
        " (leaderboard、消融、数据集、代码可复现性)",
        "10:30–13:00 风险与不确定性:宣传与论文差异、评测口径、demo bias、实现门槛",
        "13:00–15:00 输出计划:最终报告结构、资源打包、可复现 checklist",
    ]

    return "\n".join(
        [
            "# Pre-Research (15min) — RynnBrain",
            "",
            "本 Pre-Research 的目标不是给出最终结论,而是建立**可验证的研究路线**:",
            "明确问题、证据标准、资源与时间安排,确保后续 deep research 不会变成‘看 demo 写总结’。",
            "",
            "## 1. 15 分钟口头 Pre-Brief 讲稿大纲(可照读)",
            "\n".join([f"- {x}" for x in talk]),
            "",
            "## 2. 研究对象界定(Working definition)",
            "- RynnBrain 是 Alibaba DAMO Academy 在 2026 年 2 月左右开源的一套",
            " embodied foundation model 家族。",
            "- 它强调:以第一人称/自我中心(egocentric)视角做理解,具备时空定位/记忆",
            " (spatiotemporal grounding / memory),并面向真实任务规划(planning)。",
            "- 需要通过一手材料确认:模型族谱(2B/8B/30B MoE,以及 Plan/Nav/CoP 等子模型)、",
            " 评测体系、训练数据与推理方式,以及开源范围(代码/权重/benchmark)。",
            "",
            "## 3. 研究问题(Research Questions)",
            "下面的问题按优先级排序,前 3 个属于‘不解决就不要写结论’:",
            "",
            "### RQ1(最高优先级):RynnBrain 的核心技术增量是什么?",
            "- 相比 Qwen3-VL 等基础 VLM,它到底加了什么:时空记忆模块?定位/地图表征?",
            " 多任务 head?还是主要靠数据与训练配方?",
            "- 需要在 arXiv 技术报告里找到:架构图、训练目标、数据组成、消融实验。",
            "",
            "### RQ2:‘SOTA on 16 embodied leaderboards’ 这类 claim 的证据链是否站得住?",
            "- 需要明确:16 个榜单各自是什么任务/指标/基线;是否同一评测口径;",
            " 是否存在 cherry-pick。",
            "- 证据标准:必须来自官方 benchmark 页面/leaderboard 截图/可复现脚本,而不是新闻稿。",
            "",
            "### RQ3:开源的可用性如何(工程落地门槛)?",
            "- 权重是否全量公开?推理依赖(框架版本、显存、是否需要视频输入管线)?",
            "- 是否提供 cookbooks,覆盖哪些能力:定位、推理、规划、导航、操作。",
            "",
            "### RQ4:能力维度拆解:它到底在‘什么能力’上强?",
            "- Egocentric cognition:是否包含长期场景理解与一致性跟踪?",
            "- Spatiotemporal grounding:是否输出坐标/轨迹/地图?误差量化如何做?",
            "- Planning:是语言层规划(plan-as-text),还是能输出可执行动作序列",
            " (actions/waypoints)?",
            "",
            "### RQ5:与同类系统的可比性(apples-to-apples)",
            "- 对比对象:Gemini Robotics ER、NVIDIA Cosmos Reason、其它 embodied VLM / EFM。",
            "- 对比口径:任务集/传感器输入/是否允许工具调用/是否闭源系统。",
            "",
            "## 4. Scope / Non-Scope(边界)",
            "### Scope",
            "- 以公开资料为边界:论文/项目页/代码/模型卡/公开 benchmark。",
            "- 产出一个可审计的‘证据 → 结论’矩阵:每个结论都对应来源与验证步骤。",
            "",
            "### Non-Scope(本轮明确不做)",
            "- 不做真实机器人部署复现(除非官方提供可运行 demo 且成本可控)。",
            "- 不做未公开数据/内部实现猜测;不引用无法访问或不可验证的泄漏信息。",
            "",
            "## 5. 证据标准(Evaluation Criteria)",
            "为了避免‘看起来很强’的主观总结,本研究采用硬标准:",
            "- 论文证据:架构/训练/消融/实验设置必须可在 arXiv 报告中定位到章节与图表。",
            "- 代码证据:能在 GitHub 找到对应实现入口(推理脚本、配置、模型定义)。",
            "- Bench 证据:结果必须能追溯到官方 benchmark/leaderboard 或可复现评测脚本。",
            "- 口径一致:比较必须满足相同输入与评测规则;否则标注为‘不可直接比较’。",
            "- 可用性:给出最小可运行路径(依赖、命令、显存、样例输入)。",
            "",
            "## 6. 检索与阅读计划(Search Plan & Reading Plan)",
            "### 6.1 顺序(建议在 2–4 小时深研里执行)",
            "1) GitHub README + 目录:确定开源范围、模型列表、入口脚本、benchmark 链接。\n"
            "2) Project Page:收集所有外链(HF/ModelScope/Benchmark/Demo/Video)。\n"
            "3) arXiv:抓核心章节:method、experiments、ablation、limitations。\n"
            "4) Model Card:确认权重、许可证、推理限制与样例。\n"
            "5) Press:只作为线索,不作为证据;对 press 中的 claim 做反向核对。",
            "",
            "### 6.2 Query 列表(可直接用于搜索/对照阅读)",
            "\n".join([f"- {q}" for q in queries]),
            "",
            "## 7. 产出设计(Deliverables)",
            "在完成 deep research 后,最终交付物建议包含:",
            "- 长文研究报告(含 Resources、证据矩阵、可复现路径、局限与开放问题)",
            "- 15 分钟演讲稿 + Beamer(信息密度高,但每页只承载一个结论)",
            "- research.json(结构化审计:来源、摘录、结论、证据链接、验证状态)",
            "- resources/(把关键页面快照打包,避免链接失效)",
            "",
            "## 8. 风险与不确定性(Risks & Unknowns)",
            "- Press 可能夸大:需以论文与 benchmark 为准。",
            "- Leaderboard 的口径可能不统一:需逐项核对设置。",
            "- Demo bias:演示视频不等于泛化能力。",
            "- 可复现门槛:依赖、算力、输入管线(视频/多帧)可能较重。",
            "- 许可证与权重条款:代码 Apache-2.0 不等于所有权重都无约束。",
            "",
            "## 9. 资源清单(Prioritized Resources)",
            "\n".join(src_lines),
            "",
        ]
    )
| 263 |
+
def generate(out: Path, keep_stage: bool, fetch: bool) -> Path:
|
| 264 |
+
stage_dir = stage_dir_for_out(out)
|
| 265 |
+
stage_dir.mkdir(parents=True, exist_ok=True)
|
| 266 |
+
_write_compile_helpers(stage_dir)
|
| 267 |
+
|
| 268 |
+
srcs = sources()
|
| 269 |
+
src_json = [asdict(s) for s in srcs]
|
| 270 |
+
|
| 271 |
+
resources_dir = stage_dir / "resources"
|
| 272 |
+
snapshots_dir = resources_dir / "snapshots"
|
| 273 |
+
snapshots_dir.mkdir(parents=True, exist_ok=True)
|
| 274 |
+
_ = (resources_dir / "sources.json").write_text(
|
| 275 |
+
json.dumps({"sources": src_json}, ensure_ascii=False, indent=2),
|
| 276 |
+
encoding="utf-8",
|
| 277 |
+
)
|
| 278 |
+
|
| 279 |
+
snapshots: list[dict[str, object]] = []
|
| 280 |
+
if fetch:
|
| 281 |
+
for i, s in enumerate(srcs, start=1):
|
| 282 |
+
slug = _slugify(s.title)
|
| 283 |
+
target = snapshots_dir / f"{i:02d}_{slug}.txt"
|
| 284 |
+
entry: dict[str, object] = {"url": s.url, "title": s.title, "path": str(target)}
|
| 285 |
+
try:
|
| 286 |
+
ctype, text = _fetch_snapshot(s.url)
|
| 287 |
+
entry["content_type"] = ctype
|
| 288 |
+
_ = target.write_text(text, encoding="utf-8")
|
| 289 |
+
entry["ok"] = True
|
| 290 |
+
except Exception as e:
|
| 291 |
+
entry["ok"] = False
|
| 292 |
+
entry["error"] = str(e)
|
| 293 |
+
snapshots.append(entry)
|
| 294 |
+
|
| 295 |
+
pre = pre_report_md()
|
| 296 |
+
_ = (stage_dir / "pre_report.md").write_text(pre, encoding="utf-8")
|
| 297 |
+
_ = (stage_dir / "report.md").write_text("# (Not generated in preset mode)\n", encoding="utf-8")
|
| 298 |
+
_ = (stage_dir / "speech.md").write_text("# (Not generated in preset mode)\n", encoding="utf-8")
|
| 299 |
+
_ = (stage_dir / "paper.tex").write_text(
|
| 300 |
+
"\\documentclass[11pt]{article}\n"
|
| 301 |
+
"\\usepackage[UTF8]{ctex}\n"
|
| 302 |
+
"\\usepackage{hyperref}\n"
|
| 303 |
+
"\\title{RynnBrain Pre-Research}\n"
|
| 304 |
+
"\\author{hydradeck preset}\n"
|
| 305 |
+
"\\date{\\today}\n"
|
| 306 |
+
"\\begin{document}\n"
|
| 307 |
+
"\\maketitle\n"
|
| 308 |
+
"\\section*{Pre-Research}\n"
|
| 309 |
+
"This preset package contains a Markdown pre-research report and archived resources.\\\\\n"
|
| 310 |
+
"See pre_report.md and resources/.\n"
|
| 311 |
+
"\\end{document}\n",
|
| 312 |
+
encoding="utf-8",
|
| 313 |
+
)
|
| 314 |
+
_ = (stage_dir / "slides.tex").write_text(
|
| 315 |
+
"\\documentclass{beamer}\n"
|
| 316 |
+
"\\usepackage[UTF8]{ctex}\n"
|
| 317 |
+
"\\usetheme{Madrid}\n"
|
| 318 |
+
"\\title{RynnBrain Pre-Research (15min)}\n"
|
| 319 |
+
"\\author{hydradeck preset}\n"
|
| 320 |
+
"\\date{\\today}\n"
|
| 321 |
+
"\\begin{document}\n"
|
| 322 |
+
"\\frame{\\titlepage}\n"
|
| 323 |
+
"\\begin{frame}{What is inside?}\n"
|
| 324 |
+
"- pre_report.md\\\\\n"
|
| 325 |
+
"- resources/sources.json\\\\\n"
|
| 326 |
+
"- resources/snapshots/*\\\\\n"
|
| 327 |
+
"\\end{frame}\n"
|
| 328 |
+
"\\end{document}\n",
|
| 329 |
+
encoding="utf-8",
|
| 330 |
+
)
|
| 331 |
+
_ = (stage_dir / "refs.bib").write_text("% (Not generated in preset mode)\n", encoding="utf-8")
|
| 332 |
+
|
| 333 |
+
research = {
|
| 334 |
+
"topic": "RynnBrain",
|
| 335 |
+
"mode": "preset-pre",
|
| 336 |
+
"sources": src_json,
|
| 337 |
+
"snapshots": snapshots,
|
| 338 |
+
"meta": {"fetch": fetch},
|
| 339 |
+
}
|
| 340 |
+
_ = (stage_dir / "research.json").write_text(
|
| 341 |
+
json.dumps(research, ensure_ascii=False, indent=2),
|
| 342 |
+
encoding="utf-8",
|
| 343 |
+
)
|
| 344 |
+
|
| 345 |
+
finalize_output(out, stage_dir, keep_stage=keep_stage)
|
| 346 |
+
return out
|
hydradeck/render.py
ADDED
@@ -0,0 +1,471 @@
from __future__ import annotations

import re
from dataclasses import dataclass

from hydradeck.core.types import ExtractedFact, Source

_LATEX_SPECIALS: dict[str, str] = {
    "\\": r"\textbackslash{}",
    "{": r"\{",
    "}": r"\}",
    "#": r"\#",
    "$": r"\$",
    "%": r"\%",
    "&": r"\&",
    "_": r"\_",
    "^": r"\textasciicircum{}",
    "~": r"\textasciitilde{}",
}


def latex_escape(s: str) -> str:
    return "".join(_LATEX_SPECIALS.get(ch, ch) for ch in s)


def _bib_key(i: int) -> str:
    return f"src{i}"


def _bib_escape(s: str) -> str:
    return s.replace("\\", "\\\\").replace("{", "\\{").replace("}", "\\}")


def render_bibtex(sources: list[Source]) -> str:
    lines: list[str] = []
    for i, s in enumerate(sources, start=1):
        key = _bib_key(i)
        lines.append(f"@misc{{{key},")
        lines.append(f"  title = {{{_bib_escape(s.title)}}},")
        lines.append(f"  howpublished = {{\\url{{{_bib_escape(s.url)}}}}},")
        lines.append("  note = {Accessed: 2026-03-04},")
        lines.append("}")
        lines.append("")
    return "\n".join(lines).strip() + "\n"


def _replace_numeric_citations(text: str, max_n: int) -> str:
    def repl(m: re.Match[str]) -> str:
        num = int(m.group(1))
        if 1 <= num <= max_n:
            return f"\\cite{{{_bib_key(num)}}}"
        return m.group(0)

    return re.sub(r"\[(\d{1,3})\]", repl, text)


def _markdown_to_latex_paragraphs(md: str, max_n: int) -> str:
    text = md.strip()
    text = re.sub(r"```[\s\S]*?```", "", text)
    text = re.sub(r"^\s*[-*+]\s+", "", text, flags=re.MULTILINE)
    text = re.sub(r"^\s*#+\s*", "", text, flags=re.MULTILINE)
    text = re.sub(r"`([^`]+)`", r"\1", text)
    text = re.sub(r"\*\*(.*?)\*\*", r"\1", text)
    text = re.sub(r"\*(.*?)\*", r"\1", text)
    text = re.sub(r"\[(.*?)\]\((.*?)\)", r"\1", text)
    text = _replace_numeric_citations(text, max_n=max_n)
    text = latex_escape(text)
    text = re.sub(r"\\textbackslash\{\}cite\\\{(src\d+)\\\}", r"\\cite{\1}", text)
    text = text.replace("\n\n", "\n\\par\n")
    return text


def render_paper(
    topic: str,
    outline: list[str],
    body: str,
    facts: list[ExtractedFact],
    sources: list[Source],
) -> str:
    topic_e = latex_escape(topic)
    url_to_key = {s.url: _bib_key(i) for i, s in enumerate(sources, start=1)}

    outline_items = "\n".join([f"\\item {latex_escape(x)}" for x in outline[:10]])
    fact_sentences: list[str] = []
    for f in facts[:18]:
        key = url_to_key.get(f.url)
        cite = f"\\cite{{{key}}}" if key else ""
        sentence = latex_escape(f.claim.strip())
        if sentence and sentence[-1] not in ".!?":
            sentence += "."
        fact_sentences.append(sentence + (cite if cite else ""))
    facts_paragraph = (
        " ".join(fact_sentences)
        if fact_sentences
        else "No extracted facts available."
    )

    body_latex = _markdown_to_latex_paragraphs(body, max_n=len(sources))

    return (
        "\\documentclass[11pt]{article}\n"
        "\\usepackage{geometry}\n"
        "\\usepackage{hyperref}\n"
        "\\usepackage{url}\n"
        "\\usepackage{booktabs}\n"
        "\\usepackage{longtable}\n"
        "\\geometry{margin=1in}\n"
        "\\hypersetup{colorlinks=true,linkcolor=black,citecolor=blue,urlcolor=blue}\n"
        f"\\title{{{topic_e}}}\n"
        "\\author{hydradeck}\n"
        "\\date{\\today}\n"
        "\\begin{document}\n"
        "\\maketitle\n"
        "\\begin{abstract}\n"
        "This report presents a structured analysis with explicit traceability to sources.\n"
        "\\end{abstract}\n\n"
        "\\section*{1. Introduction and Background}\n"
        + facts_paragraph
        + "\n\n"
        "\\section*{2. Logical Outline}\n"
        "\\begin{itemize}\n"
        + outline_items
        + "\n\\end{itemize}\n\n"
        "\\section*{3. Evidence and Key Findings}\n"
        + body_latex
        + "\n\n"
        "\\section*{4. Limitations and Discussion}\n"
        "The analysis is bounded by available public evidence and may evolve as sources update.\n\n"
        "\\section*{5. Conclusion}\n"
        "Conclusions are presented in a source-traceable form and should be interpreted with the\n"
        "reported assumptions and constraints.\n\n"
        "\\bibliographystyle{plain}\n"
        "\\bibliography{refs}\n"
        "\\end{document}\n"
    )


def render_report_structured(
    topic: str,
    section_blocks: list[dict[str, str]],
    language: str = "en",
) -> str:
    lang = language.lower()
    topic_e = latex_escape(topic)

    if lang == "zh":
        preamble = (
            "\\documentclass[11pt]{ctexart}\n"
            "\\usepackage[a4paper,margin=1in]{geometry}\n"
            "\\usepackage{hyperref}\n"
            "\\usepackage{url}\n"
            "\\usepackage{booktabs}\n"
            "\\usepackage{longtable}\n"
            "\\hypersetup{colorlinks=true,linkcolor=black,citecolor=blue,urlcolor=blue}\n"
            f"\\title{{{topic_e}}}\n"
            "\\author{hydradeck}\n"
            "\\date{\\today}\n"
            "\\begin{document}\n"
            "\\maketitle\n"
        )
    else:
        preamble = (
            "\\documentclass[11pt]{article}\n"
            "\\usepackage{geometry}\n"
            "\\usepackage{hyperref}\n"
            "\\usepackage{url}\n"
            "\\usepackage{booktabs}\n"
            "\\usepackage{longtable}\n"
            "\\geometry{margin=1in}\n"
            "\\hypersetup{colorlinks=true,linkcolor=black,citecolor=blue,urlcolor=blue}\n"
            f"\\title{{{topic_e}}}\n"
            "\\author{hydradeck}\n"
            "\\date{\\today}\n"
            "\\begin{document}\n"
            "\\maketitle\n"
        )

    content_parts: list[str] = []
    for block in section_blocks[:10]:
        title = latex_escape(str(block.get("name", "Section")).strip() or "Section")
        latex_body = str(block.get("latex", "")).strip()
        latex_body = re.sub(r"\\section\*?\{[^}]*\}", "", latex_body)
        latex_body = re.sub(r"\\subsection\*?\{[^}]*\}", "", latex_body)
        latex_body = re.sub(r"\\cite\{[^}]*\}", "", latex_body)
        latex_body = re.sub(r"\[(\d{1,3})\]", "", latex_body)
        if not latex_body:
            continue
        content_parts.append(f"\\section*{{{title}}}\n{latex_body}\n")

    return preamble + "\n".join(content_parts) + "\n\\end{document}\n"


@dataclass
class SlideFrame:
    title: str
    bullets: list[str]
    note: str = ""


def render_beamer(topic: str, outline: list[str], bullets: list[str]) -> str:
    section_blocks = [{"name": t, "latex": b} for t, b in zip(outline, bullets)]
    if not section_blocks:
        section_blocks = [{"name": "Summary", "latex": "Key findings and implications."}]
    frames = build_slide_frames_from_sections(section_blocks, language="en")
    frames = enforce_slide_density(frames, language="en")
    return render_beamer_frames(topic, frames, language="en")


def render_beamer_from_report(topic: str, report_tex: str) -> str:
    frames = build_slide_frames_from_report(report_tex, language="en")
    frames = enforce_slide_density(frames, language="en")
    return render_beamer_frames(topic, frames, language="en")


def _split_paragraph_to_bullets(text: str, language: str) -> list[str]:
    lang = language.lower()
    if lang == "zh":
        parts = [x.strip() for x in re.split(r"[。!?]\s*", text) if x.strip()]
        out: list[str] = []
        for p in parts:
            if len(p) < 6:
                continue
            out.append(_trim_chars(_clean_text_for_slide(p), 28))
        return out

    parts = [x.strip() for x in re.split(r"[.!?]\s+", text) if x.strip()]
    out2: list[str] = []
    for p in parts:
        clean = _clean_text_for_slide(p)
        if len(clean) < 14:
            continue
        out2.append(_trim_words(clean, 14))
    return out2


def build_slide_frames_from_sections(
    section_blocks: list[dict[str, str]],
    language: str = "en",
) -> list[SlideFrame]:
    lang = language.lower()
    frames: list[SlideFrame] = []
    for block in section_blocks[:8]:
        title = str(block.get("name", "Section")).strip() or ("章节" if lang == "zh" else "Section")
        body = str(block.get("latex", ""))
        body = re.sub(r"\\section\*?\{[^}]*\}", "", body)
        body = re.sub(r"\\subsection\*?\{[^}]*\}", "", body)
        body = re.sub(r"\\cite\{[^}]*\}", "", body)
        body = re.sub(r"\[(\d{1,3})\]", "", body)
        bullets = _split_paragraph_to_bullets(body, lang)
        if not bullets:
            continue

        chunk = 4
        for i in range(0, len(bullets), chunk):
            part = bullets[i : i + chunk]
            if not part:
                continue
            if i == 0:
                frame_title = title
            else:
                frame_title = f"{title}(续)" if lang == "zh" else f"{title} (cont.)"
            frames.append(SlideFrame(title=frame_title, bullets=part))

    if not frames:
        raise RuntimeError("insufficient readable section content for slides")
    return frames


def enforce_slide_density(
    frames: list[SlideFrame],
    language: str = "en",
    max_bullets_per_frame: int = 4,
    max_chars_per_bullet_zh: int = 28,
    max_words_per_bullet_en: int = 14,
) -> list[SlideFrame]:
    lang = language.lower()
    out: list[SlideFrame] = []

    for fr in frames:
        normalized: list[str] = []
        for b in fr.bullets:
            clean = _clean_text_for_slide(b)
            if not clean:
                continue
            if lang == "zh":
                clean = _trim_chars(clean, max_chars_per_bullet_zh)
            else:
                clean = _trim_words(clean, max_words_per_bullet_en)
            if clean:
                normalized.append(clean)

        if not normalized:
            continue

        for i in range(0, len(normalized), max_bullets_per_frame):
            chunk = normalized[i : i + max_bullets_per_frame]
            if not chunk:
                continue
            if i == 0:
                title = fr.title
            else:
                title = f"{fr.title}(续)" if lang == "zh" else f"{fr.title} (cont.)"
            out.append(SlideFrame(title=title, bullets=chunk, note=fr.note))

    if not out:
        raise RuntimeError("slide density guard removed all frames")
    return out
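The core move in `enforce_slide_density` above is splitting a normalized bullet list into frame-sized groups. Isolated as a minimal, self-contained sketch:

```python
def chunk_bullets(bullets: list[str], per_frame: int = 4) -> list[list[str]]:
    # Split a flat bullet list into groups of at most `per_frame` items,
    # one group per slide frame; the last group may be shorter.
    return [bullets[i : i + per_frame] for i in range(0, len(bullets), per_frame)]

print(chunk_bullets(["a", "b", "c", "d", "e"]))  # → [['a', 'b', 'c', 'd'], ['e']]
```

In the real function, every group after the first gets a "(cont.)" suffix on the frame title, so overflow slides stay attributable to their section.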


def _trim_words(text: str, max_words: int) -> str:
    words = text.split()
    if len(words) <= max_words:
        return text
    return " ".join(words[:max_words]).rstrip(" ,.;") + "..."


def _trim_chars(text: str, max_chars: int) -> str:
    t = text.strip()
    if len(t) <= max_chars:
        return t
    return t[: max_chars - 1].rstrip(",。,. ") + "…"
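The two trimmers above enforce per-bullet budgets: `_trim_words` by word count for English, `_trim_chars` by character count for Chinese (where words are not space-delimited). A self-contained copy of the English variant shows the behavior:

```python
def trim_words(text: str, max_words: int) -> str:
    # Same logic as _trim_words: keep at most max_words words; if the
    # text was cut, drop trailing punctuation and mark the cut with "...".
    words = text.split()
    if len(words) <= max_words:
        return text
    return " ".join(words[:max_words]).rstrip(" ,.;") + "..."

print(trim_words("one two three four five", 3))  # → one two three...
```

Trimming only when the budget is exceeded means short bullets pass through byte-for-byte, which keeps the density guard idempotent.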


def _clean_text_for_slide(text: str) -> str:
    t = text.strip()
    t = re.sub(r"\s+", " ", t)
    t = re.sub(r"`([^`]+)`", r"\1", t)
    t = re.sub(r"\*\*(.*?)\*\*", r"\1", t)
    t = re.sub(r"\*(.*?)\*", r"\1", t)
    return t


def build_slide_frames_from_report(report_tex: str, language: str = "en") -> list[SlideFrame]:
    lang = language.lower()
    sections = re.split(r"\\section\*\{([^}]+)\}", report_tex)
    parsed: list[tuple[str, str]] = []
    if len(sections) >= 3:
        for i in range(1, len(sections), 2):
            title = sections[i].strip()
            body = sections[i + 1] if i + 1 < len(sections) else ""
            parsed.append((title, body))

    if not parsed:
        raise RuntimeError("cannot derive slide frames from report structure")

    frames: list[SlideFrame] = []
    for title, body in parsed[:8]:
        plain = re.sub(r"\\[a-zA-Z]+\*?(\[[^\]]*\])?(\{[^}]*\})?", " ", body)
        chunks = [x.strip() for x in re.split(r"[。.!?]\s+", plain) if x.strip()]
        bullets: list[str] = []
        for c in chunks:
            clean = _clean_text_for_slide(c)
            if not clean:
                continue
            if lang == "zh":
                if len(clean) < 8:
                    continue
                bullets.append(_trim_chars(clean, 30))
            else:
                if len(clean) < 12:
                    continue
                bullets.append(_trim_words(clean, 16))
            if len(bullets) >= 5:
                break
        if not bullets:
            raise RuntimeError(f"insufficient bullet content for slide '{title}'")
        frames.append(SlideFrame(title=title, bullets=bullets))

    return frames


def render_beamer_frames(topic: str, frames: list[SlideFrame], language: str = "en") -> str:
    lang = language.lower()
    topic_e = latex_escape(topic)
    agenda_label = "目录" if lang == "zh" else "Agenda"
    summary_title = "总结" if lang == "zh" else "Summary"

    agenda_items = "\n".join([f"\\item {latex_escape(f.title)}" for f in frames[:8]])

    frame_blocks: list[str] = []
    for fr in frames[:10]:
        b = "\n".join([f"\\item {latex_escape(x)}" for x in fr.bullets[:5]])
        frame_blocks.append(
            "\\begin{frame}[t]{"
            + latex_escape(fr.title)
            + "}\n"
            + "\\begin{itemize}\n"
            + b
            + "\n\\end{itemize}\n"
            + (f"\\vspace{{0.6em}}\\footnotesize {latex_escape(fr.note)}\n" if fr.note else "")
            + "\\end{frame}\n"
        )

    summary_bullets: list[str] = []
    for fr in frames[:5]:
        if fr.bullets:
            summary_bullets.append(fr.bullets[0])
    if not summary_bullets:
        summary_bullets = ["关键要点见前页。" if lang == "zh" else "Key points are summarized in previous slides."]
    summary_items = "\n".join([f"\\item {latex_escape(x)}" for x in summary_bullets])

    if lang == "zh":
        return (
            "\\documentclass[aspectratio=169]{ctexbeamer}\n"
            "\\usetheme{Madrid}\n"
            "\\usefonttheme{professionalfonts}\n"
            "\\setbeamertemplate{navigation symbols}{}\n"
            "\\usepackage{hyperref}\n"
            "\\usepackage{booktabs}\n"
            "\\definecolor{AccentBlue}{HTML}{1F4E79}\n"
            "\\setbeamercolor{title}{fg=AccentBlue}\n"
            "\\setbeamercolor{frametitle}{fg=AccentBlue}\n"
            "\\setbeamerfont{title}{series=\\bfseries,size=\\Large}\n"
            "\\setbeamerfont{frametitle}{series=\\bfseries,size=\\large}\n"
            f"\\title{{{topic_e}}}\n"
            "\\author{hydradeck}\n"
            "\\date{\\today}\n"
            "\\begin{document}\n"
            "\\frame{\\titlepage}\n"
            "\\begin{frame}{"
            + latex_escape(agenda_label)
            + "}\n"
            "\\begin{itemize}\n"
            + agenda_items
            + "\n\\end{itemize}\n"
            "\\end{frame}\n"
            + "".join(frame_blocks)
            + "\\begin{frame}{"
            + latex_escape(summary_title)
            + "}\n"
            + "\\begin{itemize}\n"
            + summary_items
            + "\n\\end{itemize}\n"
            + "\\end{frame}\n"
            + "\\end{document}\n"
        )

    return (
        "\\documentclass[aspectratio=169]{beamer}\n"
        "\\usetheme{metropolis}\n"
        "\\usefonttheme{professionalfonts}\n"
        "\\setbeamertemplate{navigation symbols}{}\n"
        "\\usepackage{hyperref}\n"
        "\\usepackage{booktabs}\n"
        "\\definecolor{AccentBlue}{HTML}{1F4E79}\n"
        "\\setbeamercolor{title}{fg=AccentBlue}\n"
        "\\setbeamercolor{frametitle}{fg=AccentBlue}\n"
        "\\setbeamerfont{title}{series=\\bfseries,size=\\Large}\n"
        "\\setbeamerfont{frametitle}{series=\\bfseries,size=\\large}\n"
        f"\\title{{{topic_e}}}\n"
        "\\author{hydradeck}\n"
        "\\date{\\today}\n"
        "\\begin{document}\n"
        "\\frame{\\titlepage}\n"
        "\\begin{frame}{"
        + latex_escape(agenda_label)
        + "}\n"
        "\\begin{itemize}\n"
        + agenda_items
        + "\n\\end{itemize}\n"
        "\\end{frame}\n"
        + "".join(frame_blocks)
        + "\\begin{frame}{"
        + latex_escape(summary_title)
        + "}\n"
        + "\\begin{itemize}\n"
        + summary_items
        + "\n\\end{itemize}\n"
        + "\\end{frame}\n"
        + "\\end{document}\n"
    )
hydradeck/resources_pack.py
ADDED
@@ -0,0 +1,706 @@
from __future__ import annotations

import json
import re
import time
import urllib.parse
from dataclasses import asdict
from pathlib import Path

import requests

from hydradeck.agents.personas import PERSONAS
from hydradeck.clients import ChatMessage, GrokClient, GrokClientError
from hydradeck.core.types import RunConfig, Source
from hydradeck.packaging import finalize_output, stage_dir_for_out
from hydradeck.utils import Heartbeat, Progress


def _slugify(s: str) -> str:
    t = s.strip().lower()
    t = re.sub(r"[^a-z0-9]+", "-", t)
    t = re.sub(r"-+", "-", t).strip("-")
    return t or "source"


def _extract_sources(obj: dict[str, object], max_sources: int) -> list[Source]:
    raw = obj.get("sources")
    out: list[Source] = []
    if isinstance(raw, list):
        for item in raw[:max_sources]:
            if not isinstance(item, dict):
                continue
            url_v = item.get("url")
            title_v = item.get("title")
            snippet_v = item.get("snippet")
            if isinstance(url_v, str) and isinstance(title_v, str) and isinstance(snippet_v, str):
                out.append(Source(url=url_v, title=title_v, snippet=snippet_v))
    return out
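`_extract_sources` above type-checks every level of the model-returned JSON before using it. The same defensive pattern, reduced to a self-contained sketch over plain strings:

```python
def extract_str_items(obj: dict, key: str, limit: int) -> list[str]:
    # Never trust LLM JSON: verify the container is a list, cap its
    # length, and keep only non-blank string elements.
    raw = obj.get(key)
    if not isinstance(raw, list):
        return []
    return [x for x in raw[:limit] if isinstance(x, str) and x.strip()]

print(extract_str_items({"queries": ["a", 1, " ", "b"]}, "queries", 10))  # → ['a', 'b']
```

Silently dropping malformed elements (rather than raising) lets the pipeline degrade to a smaller source list instead of failing the whole run.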


def build_resources_pack(cfg: RunConfig) -> Path:
    stage_dir = stage_dir_for_out(cfg.out)
    stage_dir.mkdir(parents=True, exist_ok=True)

    t0 = time.time()

    def remaining_s() -> float:
        return max(0.0, cfg.max_total_runtime_s - (time.time() - t0))

    def budget_timeout() -> float:
        return max(1.0, min(cfg.request_budget_s, remaining_s()))

    def llm_timeout() -> float:
        return max(1.0, min(cfg.llm_timeout_s, budget_timeout()))
|
| 55 |
+
|
| 56 |
+
progress = Progress(enabled=cfg.progress, total=6, label="resources")
|
| 57 |
+
progress.update("start", inc=0)
|
| 58 |
+
|
| 59 |
+
if cfg.use_mock:
|
| 60 |
+
from hydradeck.clients.grok_client import MockClient
|
| 61 |
+
|
| 62 |
+
client = MockClient()
|
| 63 |
+
else:
|
| 64 |
+
client = GrokClient(
|
| 65 |
+
base_url=cfg.base_url,
|
| 66 |
+
api_key=cfg.api_key,
|
| 67 |
+
model=cfg.model,
|
| 68 |
+
timeout_s=llm_timeout(),
|
| 69 |
+
heartbeat=cfg.verbose,
|
| 70 |
+
)
|
| 71 |
+
|
| 72 |
+
query_planner = next(p for p in PERSONAS if p.name == "QueryPlanner")
|
| 73 |
+
librarian = next(p for p in PERSONAS if p.name == "Librarian")
|
| 74 |
+
|
| 75 |
+
qp_obj = client.chat_json(
|
| 76 |
+
[
|
| 77 |
+
ChatMessage(role="system", content=query_planner.system_prompt),
|
| 78 |
+
ChatMessage(
|
| 79 |
+
role="user",
|
| 80 |
+
content=(
|
| 81 |
+
"Return JSON: {queries:[...]} with 6 high-recall queries for primary sources. "
|
| 82 |
+
"Topic: "
|
| 83 |
+
+ cfg.topic
|
| 84 |
+
),
|
| 85 |
+
),
|
| 86 |
+
],
|
| 87 |
+
schema_hint='{ "queries": ["..."] }',
|
| 88 |
+
temperature=0.2,
|
| 89 |
+
timeout_s=llm_timeout() if not cfg.use_mock else None,
|
| 90 |
+
)
|
| 91 |
+
progress.update("queries")
|
| 92 |
+
raw_q = qp_obj.get("queries")
|
| 93 |
+
if isinstance(raw_q, list):
|
| 94 |
+
queries = [q for q in raw_q if isinstance(q, str) and q.strip()]
|
| 95 |
+
else:
|
| 96 |
+
queries = []
|
| 97 |
+
if not queries:
|
| 98 |
+
queries = [cfg.topic]
|
| 99 |
+
|
| 100 |
+
seen: set[str] = set()
|
| 101 |
+
sources: list[Source] = []
|
| 102 |
+
for q in queries[: min(3, len(queries))]:
|
| 103 |
+
req = (
|
| 104 |
+
"Return JSON with key sources: list of {url,title,snippet}. "
|
| 105 |
+
"Give authoritative sources (prefer official docs, papers, repos). "
|
| 106 |
+
"Query: "
|
| 107 |
+
+ q
|
| 108 |
+
)
|
| 109 |
+
try:
|
| 110 |
+
src_obj = client.chat_json(
|
| 111 |
+
[
|
| 112 |
+
ChatMessage(role="system", content=librarian.system_prompt),
|
| 113 |
+
ChatMessage(role="user", content=req),
|
| 114 |
+
],
|
| 115 |
+
schema_hint='{ "sources": [ {"url":"...","title":"...","snippet":"..."} ] }',
|
| 116 |
+
temperature=0.2,
|
| 117 |
+
timeout_s=llm_timeout() if not cfg.use_mock else None,
|
| 118 |
+
)
|
| 119 |
+
except GrokClientError:
|
| 120 |
+
continue
|
| 121 |
+
for s in _extract_sources(src_obj, cfg.module_sources):
|
| 122 |
+
if s.url in seen:
|
| 123 |
+
continue
|
| 124 |
+
seen.add(s.url)
|
| 125 |
+
sources.append(s)
|
| 126 |
+
if len(sources) >= cfg.max_sources:
|
| 127 |
+
break
|
| 128 |
+
if len(sources) >= cfg.max_sources:
|
| 129 |
+
break
|
| 130 |
+
progress.update("sources")
|
| 131 |
+
|
| 132 |
+
if not sources:
|
| 133 |
+
sources = [
|
| 134 |
+
Source(
|
| 135 |
+
url="https://github.com/alibaba-damo-academy/RynnBrain",
|
| 136 |
+
title="RynnBrain",
|
| 137 |
+
snippet="",
|
| 138 |
+
)
|
| 139 |
+
]
|
| 140 |
+
progress.update("sources")
|
| 141 |
+
|
| 142 |
+
resources_dir = stage_dir / "resources"
|
| 143 |
+
snaps_dir = resources_dir / "snapshots"
|
| 144 |
+
snaps_dir.mkdir(parents=True, exist_ok=True)
|
| 145 |
+
|
| 146 |
+
snap_meta: list[dict[str, object]] = []
|
| 147 |
+
snap_start = time.time()
|
| 148 |
+
for i, s in enumerate(sources, start=1):
|
| 149 |
+
if (time.time() - snap_start) > cfg.snapshot_total_timeout_s:
|
| 150 |
+
break
|
| 151 |
+
target_base = snaps_dir / f"{i:02d}_{_slugify(s.title)}"
|
| 152 |
+
entry: dict[str, object] = {"url": s.url, "title": s.title}
|
| 153 |
+
if cfg.use_mock:
|
| 154 |
+
entry["ok"] = True
|
| 155 |
+
target = target_base.with_suffix(".txt")
|
| 156 |
+
entry["path"] = str(target)
|
| 157 |
+
target.write_text("mock snapshot", encoding="utf-8")
|
| 158 |
+
snap_meta.append(entry)
|
| 159 |
+
continue
|
| 160 |
+
try:
|
| 161 |
+
with Heartbeat(enabled=cfg.verbose, label=f"fetch {s.url}", interval_s=5.0):
|
| 162 |
+
r = requests.get(
|
| 163 |
+
s.url,
|
| 164 |
+
timeout=min(cfg.snapshot_timeout_s, budget_timeout()),
|
| 165 |
+
headers={"User-Agent": "hydradeck/0.1"},
|
| 166 |
+
)
|
| 167 |
+
r.raise_for_status()
|
| 168 |
+
ctype = r.headers.get("content-type", "")
|
| 169 |
+
entry["content_type"] = ctype
|
| 170 |
+
|
| 171 |
+
is_pdf = "application/pdf" in ctype.lower() or s.url.lower().endswith(".pdf")
|
| 172 |
+
if is_pdf:
|
| 173 |
+
data = r.content
|
| 174 |
+
if len(data) > 5_000_000:
|
| 175 |
+
data = data[:5_000_000]
|
| 176 |
+
target = target_base.with_suffix(".pdf")
|
| 177 |
+
entry["path"] = str(target)
|
| 178 |
+
target.write_bytes(data)
|
| 179 |
+
entry["binary"] = True
|
| 180 |
+
else:
|
| 181 |
+
txt = r.text
|
| 182 |
+
if len(txt) > 200_000:
|
| 183 |
+
txt = txt[:200_000]
|
| 184 |
+
target = target_base.with_suffix(".txt")
|
| 185 |
+
entry["path"] = str(target)
|
| 186 |
+
target.write_text(txt, encoding="utf-8")
|
| 187 |
+
entry["ok"] = True
|
| 188 |
+
except Exception as e:
|
| 189 |
+
entry["ok"] = False
|
| 190 |
+
entry["error"] = str(e)
|
| 191 |
+
snap_meta.append(entry)
|
| 192 |
+
progress.update("snapshots")
|
| 193 |
+
|
| 194 |
+
(resources_dir / "sources.json").write_text(
|
| 195 |
+
json.dumps({"sources": [asdict(s) for s in sources]}, ensure_ascii=False, indent=2),
|
| 196 |
+
encoding="utf-8",
|
| 197 |
+
)
|
| 198 |
+
(resources_dir / "snapshots.json").write_text(
|
| 199 |
+
json.dumps({"snapshots": snap_meta}, ensure_ascii=False, indent=2),
|
| 200 |
+
encoding="utf-8",
|
| 201 |
+
)
|
| 202 |
+
(stage_dir / "research.json").write_text(
|
| 203 |
+
json.dumps(
|
| 204 |
+
{
|
| 205 |
+
"topic": cfg.topic,
|
| 206 |
+
"mode": "resources",
|
| 207 |
+
"sources": [asdict(s) for s in sources],
|
| 208 |
+
"snapshots": snap_meta,
|
| 209 |
+
},
|
| 210 |
+
ensure_ascii=False,
|
| 211 |
+
indent=2,
|
| 212 |
+
),
|
| 213 |
+
encoding="utf-8",
|
| 214 |
+
)
|
| 215 |
+
|
| 216 |
+
progress.update("package")
|
| 217 |
+
|
| 218 |
+
try:
|
| 219 |
+
paper_tex, slides_tex = _generate_pre_tex(cfg, client, sources)
|
| 220 |
+
except Exception as e:
|
| 221 |
+
(stage_dir / "pre_tex_error.txt").write_text(str(e) + "\n", encoding="utf-8")
|
| 222 |
+
paper_tex = _render_paper_tex(cfg.topic, sources)
|
| 223 |
+
slides_tex = _render_slides_tex(cfg.topic, sources)
|
| 224 |
+
|
| 225 |
+
(stage_dir / "pre_paper.tex").write_text(paper_tex, encoding="utf-8")
|
| 226 |
+
(stage_dir / "pre_slides.tex").write_text(slides_tex, encoding="utf-8")
|
| 227 |
+
|
| 228 |
+
pdf_dir = stage_dir / "pdf"
|
| 229 |
+
pdf_dir.mkdir(parents=True, exist_ok=True)
|
| 230 |
+
urls: list[str] = []
|
| 231 |
+
errors: list[str] = []
|
| 232 |
+
|
| 233 |
+
if cfg.use_mock:
|
| 234 |
+
(pdf_dir / "pre_paper.pdf").write_bytes(_dummy_pdf_bytes("paper"))
|
| 235 |
+
(pdf_dir / "pre_slides.pdf").write_bytes(_dummy_pdf_bytes("slides"))
|
| 236 |
+
else:
|
| 237 |
+
try:
|
| 238 |
+
paper_pdf, paper_meta = _compile_pdf(
|
| 239 |
+
paper_tex,
|
| 240 |
+
engine="xelatex",
|
| 241 |
+
backend=cfg.pdf_compiler,
|
| 242 |
+
)
|
| 243 |
+
(pdf_dir / "pre_paper.pdf").write_bytes(paper_pdf)
|
| 244 |
+
urls.extend(paper_meta.get("urls", []))
|
| 245 |
+
errors.extend(paper_meta.get("errors", []))
|
| 246 |
+
except Exception as e:
|
| 247 |
+
errors.append("paper: " + str(e))
|
| 248 |
+
|
| 249 |
+
try:
|
| 250 |
+
slides_pdf, slides_meta = _compile_pdf(
|
| 251 |
+
slides_tex,
|
| 252 |
+
engine="xelatex",
|
| 253 |
+
backend=cfg.pdf_compiler,
|
| 254 |
+
)
|
| 255 |
+
(pdf_dir / "pre_slides.pdf").write_bytes(slides_pdf)
|
| 256 |
+
urls.extend(slides_meta.get("urls", []))
|
| 257 |
+
errors.extend(slides_meta.get("errors", []))
|
| 258 |
+
except Exception as e:
|
| 259 |
+
errors.append("slides: " + str(e))
|
| 260 |
+
|
| 261 |
+
if not (pdf_dir / "pre_paper.pdf").exists():
|
| 262 |
+
errors.append("paper pdf missing")
|
| 263 |
+
if not (pdf_dir / "pre_slides.pdf").exists():
|
| 264 |
+
errors.append("slides pdf missing")
|
| 265 |
+
|
| 266 |
+
if urls:
|
| 267 |
+
(stage_dir / "latexonline_url.txt").write_text("\n".join(urls) + "\n", encoding="utf-8")
|
| 268 |
+
if errors:
|
| 269 |
+
(stage_dir / "latexonline_error.txt").write_text("\n".join(errors) + "\n", encoding="utf-8")
|
| 270 |
+
|
| 271 |
+
finalize_output(cfg.out, stage_dir, keep_stage=cfg.keep_stage)
|
| 272 |
+
progress.done("packaged")
|
| 273 |
+
return cfg.out
|
| 274 |
+
|
| 275 |
+
|
| 276 |
+
def _render_paper_tex(topic: str, sources: list[Source]) -> str:
|
| 277 |
+
def esc(s: str) -> str:
|
| 278 |
+
return (
|
| 279 |
+
s.replace("\\", r"\textbackslash{}")
|
| 280 |
+
.replace("{", r"\{")
|
| 281 |
+
.replace("}", r"\}")
|
| 282 |
+
.replace("%", r"\%")
|
| 283 |
+
.replace("_", r"\_")
|
| 284 |
+
.replace("&", r"\&")
|
| 285 |
+
.replace("#", r"\#")
|
| 286 |
+
.replace("$", r"\$")
|
| 287 |
+
)
|
| 288 |
+
|
| 289 |
+
items = []
|
| 290 |
+
for _i, s in enumerate(sources, start=1):
|
| 291 |
+
items.append(
|
| 292 |
+
"\\item "
|
| 293 |
+
+ esc(s.title)
|
| 294 |
+
+ "\\\\\n"
|
| 295 |
+
+ "\\small\\url{" + esc(s.url) + "}\\normalsize\\\\\n"
|
| 296 |
+
+ "\\textit{" + esc(s.snippet[:240]) + "}"
|
| 297 |
+
)
|
| 298 |
+
body = "\n".join(items) if items else "\\item (暂无来源)"
|
| 299 |
+
return (
|
| 300 |
+
"\\documentclass[11pt]{article}\n"
|
| 301 |
+
"\\usepackage[UTF8]{ctex}\n"
|
| 302 |
+
"\\usepackage{hyperref}\n"
|
| 303 |
+
"\\usepackage{url}\n"
|
| 304 |
+
"\\usepackage{booktabs}\n"
|
| 305 |
+
"\\title{" + esc(topic) + "——资源预研报告(论文版)}\n"
|
| 306 |
+
"\\author{hydradeck}\n"
|
| 307 |
+
"\\date{\\today}\n"
|
| 308 |
+
"\\begin{document}\n"
|
| 309 |
+
"\\maketitle\n"
|
| 310 |
+
"\\section*{来源清单}\n"
|
| 311 |
+
"\\begin{enumerate}\n"
|
| 312 |
+
+ body
|
| 313 |
+
+ "\n\\end{enumerate}\n"
|
| 314 |
+
"\\end{document}\n"
|
| 315 |
+
)
|
| 316 |
+
|
| 317 |
+
|
| 318 |
+
def _render_slides_tex(topic: str, sources: list[Source]) -> str:
|
| 319 |
+
def esc(s: str) -> str:
|
| 320 |
+
return (
|
| 321 |
+
s.replace("\\", r"\textbackslash{}")
|
| 322 |
+
.replace("{", r"\{")
|
| 323 |
+
.replace("}", r"\}")
|
| 324 |
+
.replace("%", r"\%")
|
| 325 |
+
.replace("_", r"\_")
|
| 326 |
+
.replace("&", r"\&")
|
| 327 |
+
.replace("#", r"\#")
|
| 328 |
+
.replace("$", r"\$")
|
| 329 |
+
)
|
| 330 |
+
|
| 331 |
+
bullets: list[str] = []
|
| 332 |
+
for s in sources[:8]:
|
| 333 |
+
bullets.append(esc(s.title))
|
| 334 |
+
|
| 335 |
+
items = "\n".join(["\\item " + b for b in bullets]) or "\\item (暂无来源)"
|
| 336 |
+
return (
|
| 337 |
+
"\\documentclass{beamer}\n"
|
| 338 |
+
"\\usepackage[UTF8]{ctex}\n"
|
| 339 |
+
"\\usetheme{Madrid}\n"
|
| 340 |
+
"\\title{" + esc(topic) + "——资源预研简报(幻灯片)}\n"
|
| 341 |
+
"\\author{hydradeck}\n"
|
| 342 |
+
"\\date{\\today}\n"
|
| 343 |
+
"\\begin{document}\n"
|
| 344 |
+
"\\frame{\\titlepage}\n"
|
| 345 |
+
"\\begin{frame}{关键来源}\n"
|
| 346 |
+
"\\begin{itemize}\n"
|
| 347 |
+
+ items
|
| 348 |
+
+ "\n\\end{itemize}\n"
|
| 349 |
+
"\\end{frame}\n"
|
| 350 |
+
"\\end{document}\n"
|
| 351 |
+
)
|
| 352 |
+
|
| 353 |
+
|
| 354 |
+
def _latexonline_compile_url(tex: str, command: str) -> str:
|
| 355 |
+
q = urllib.parse.quote(tex, safe="")
|
| 356 |
+
return "https://latexonline.cc/compile?text=" + q + "&command=" + command + "&force=true"
|
| 357 |
+
|
| 358 |
+
|
| 359 |
+
def _compile_pdf(tex: str, engine: str, backend: str) -> tuple[bytes, dict[str, list[str]]]:
|
| 360 |
+
meta: dict[str, list[str]] = {"urls": [], "errors": []}
|
| 361 |
+
b = backend.strip().lower()
|
| 362 |
+
if b not in {"auto", "latexonline", "texlive"}:
|
| 363 |
+
b = "auto"
|
| 364 |
+
|
| 365 |
+
if b in {"auto", "latexonline"}:
|
| 366 |
+
try:
|
| 367 |
+
meta["urls"].append(_latexonline_compile_url(tex, command=engine))
|
| 368 |
+
data = _compile_latexonline(tex, command=engine)
|
| 369 |
+
_ensure_pdf_bytes(data, where="latexonline")
|
| 370 |
+
return data, meta
|
| 371 |
+
except Exception as e:
|
| 372 |
+
meta["errors"].append("latexonline: " + str(e))
|
| 373 |
+
if b == "latexonline":
|
| 374 |
+
raise
|
| 375 |
+
|
| 376 |
+
try:
|
| 377 |
+
data = _compile_texlive_latexcgi(tex, engine=engine)
|
| 378 |
+
_ensure_pdf_bytes(data, where="texlive")
|
| 379 |
+
return data, meta
|
| 380 |
+
except Exception as e:
|
| 381 |
+
meta["errors"].append("texlive latexcgi: " + str(e))
|
| 382 |
+
raise
|
| 383 |
+
|
| 384 |
+
|
| 385 |
+
def _ensure_pdf_bytes(data: bytes, where: str) -> None:
|
| 386 |
+
if data.startswith(b"%PDF"):
|
| 387 |
+
return
|
| 388 |
+
head = data[:200].decode("utf-8", errors="replace")
|
| 389 |
+
raise RuntimeError(f"{where} did not return PDF. Head: {head}")
|
| 390 |
+
|
| 391 |
+
|
| 392 |
+
def _compile_latexonline(tex: str, command: str) -> bytes:
|
| 393 |
+
url = _latexonline_compile_url(tex, command=command)
|
| 394 |
+
r = requests.get(url, timeout=120.0)
|
| 395 |
+
if r.status_code >= 400:
|
| 396 |
+
raise RuntimeError(f"latexonline HTTP {r.status_code}: {r.text[:2000]}")
|
| 397 |
+
return r.content
|
| 398 |
+
|
| 399 |
+
|
| 400 |
+
def _compile_texlive_latexcgi(tex: str, engine: str) -> bytes:
|
| 401 |
+
url = "https://texlive.net/cgi-bin/latexcgi"
|
| 402 |
+
files = {
|
| 403 |
+
"filename[]": (None, "document.tex"),
|
| 404 |
+
"filecontents[]": (None, tex),
|
| 405 |
+
"engine": (None, engine),
|
| 406 |
+
"return": (None, "pdf"),
|
| 407 |
+
}
|
| 408 |
+
r = requests.post(url, files=files, timeout=120.0)
|
| 409 |
+
if r.status_code >= 400:
|
| 410 |
+
raise RuntimeError(f"texlive latexcgi HTTP {r.status_code}: {r.text[:2000]}")
|
| 411 |
+
return r.content
|
| 412 |
+
|
| 413 |
+
|
| 414 |
+
def _generate_pre_tex(cfg: RunConfig, client, sources: list[Source]) -> tuple[str, str]:
|
| 415 |
+
if cfg.use_mock:
|
| 416 |
+
return _render_paper_tex(cfg.topic, sources), _render_slides_tex(cfg.topic, sources)
|
| 417 |
+
|
| 418 |
+
if cfg.template.strip().lower() in {"pretty", "iclr2026"}:
|
| 419 |
+
return _generate_pre_tex_pretty(cfg, client, sources)
|
| 420 |
+
|
| 421 |
+
outline = _pre_outline(cfg.topic)
|
| 422 |
+
src_json = json.dumps([asdict(s) for s in sources], ensure_ascii=False)
|
| 423 |
+
feedback = ""
|
| 424 |
+
last_paper = _render_paper_tex(cfg.topic, sources)
|
| 425 |
+
last_slides = _render_slides_tex(cfg.topic, sources)
|
| 426 |
+
for _attempt in range(max(1, cfg.pre_tex_attempts)):
|
| 427 |
+
msgs = [
|
| 428 |
+
ChatMessage(
|
| 429 |
+
role="system",
|
| 430 |
+
content=(
|
| 431 |
+
"你是严谨的 LaTeX 作者。"
|
| 432 |
+
"必须输出可用 XeLaTeX 编译的高信息密度中文内容。"
|
| 433 |
+
"不要输出 JSON。"
|
| 434 |
+
),
|
| 435 |
+
),
|
| 436 |
+
ChatMessage(
|
| 437 |
+
role="user",
|
| 438 |
+
content=(
|
| 439 |
+
"生成两个 LaTeX 文档(全部使用简体中文):\n"
|
| 440 |
+
"(1) paper_tex:article 论文版预研报告,结构严格,信息密度高。\n"
|
| 441 |
+
"(2) slides_tex:beamer 16:9,15 分钟汇报(8-10 页)。\n\n"
|
| 442 |
+
"共同硬约束:\n"
|
| 443 |
+
"- 使用 ctex + xelatex\n"
|
| 444 |
+
"- 禁止空话;每节必须有可执行要点/表格\n"
|
| 445 |
+
"- 必须包含“参考资源”并列出全部来源 URL\n\n"
|
| 446 |
+
"paper 结构(标题可扩展但需覆盖以下要点):\n"
|
| 447 |
+
+ "\n".join(["- " + x for x in outline["paper"]])
|
| 448 |
+
+ "\n\nslides 结构(每项至少一页):\n"
|
| 449 |
+
+ "\n".join(["- " + x for x in outline["slides"]])
|
| 450 |
+
+ "\n\n来源 JSON:\n"
|
| 451 |
+
+ src_json
|
| 452 |
+
+ ("\n\n评审反馈:\n" + feedback if feedback else "")
|
| 453 |
+
+ "\n\n输出格式(必须严格使用):\n"
|
| 454 |
+
+ "<<<paper.tex>>>\n<latex>\n<<<end paper.tex>>>\n"
|
| 455 |
+
+ "<<<slides.tex>>>\n<latex>\n<<<end slides.tex>>>\n"
|
| 456 |
+
),
|
| 457 |
+
),
|
| 458 |
+
]
|
| 459 |
+
text = client.chat_text(msgs, temperature=0.2)
|
| 460 |
+
parsed = _parse_marked_tex(text)
|
| 461 |
+
paper = parsed.get("paper")
|
| 462 |
+
slides = parsed.get("slides")
|
| 463 |
+
if not isinstance(paper, str) or not isinstance(slides, str):
|
| 464 |
+
feedback = "Output must contain both <<<paper.tex>>> and <<<slides.tex>>> blocks."
|
| 465 |
+
continue
|
| 466 |
+
|
| 467 |
+
last_paper, last_slides = paper, slides
|
| 468 |
+
score, fb = _score_pre_tex(paper, slides, sources)
|
| 469 |
+
if not cfg.pre_tex_quality_gate or score >= cfg.pre_tex_min_score:
|
| 470 |
+
return paper, slides
|
| 471 |
+
feedback = fb
|
| 472 |
+
|
| 473 |
+
return last_paper, last_slides
|
| 474 |
+
|
| 475 |
+
|
| 476 |
+
def _generate_pre_tex_iclr2026(
|
| 477 |
+
cfg: RunConfig,
|
| 478 |
+
client,
|
| 479 |
+
sources: list[Source],
|
| 480 |
+
) -> tuple[str, str]:
|
| 481 |
+
src_json = json.dumps([asdict(s) for s in sources], ensure_ascii=False)
|
| 482 |
+
feedback = ""
|
| 483 |
+
last_paper = ""
|
| 484 |
+
last_slides = ""
|
| 485 |
+
for _attempt in range(max(1, cfg.pre_tex_attempts)):
|
| 486 |
+
msgs = [
|
| 487 |
+
ChatMessage(
|
| 488 |
+
role="system",
|
| 489 |
+
content=(
|
| 490 |
+
"你撰写严谨的 ICLR 风格预研文稿。"
|
| 491 |
+
"paper 必须使用 \\usepackage{iclr2026_conference,times}。"
|
| 492 |
+
"输出必须为简体中文,不要输出 JSON。"
|
| 493 |
+
),
|
| 494 |
+
),
|
| 495 |
+
ChatMessage(
|
| 496 |
+
role="user",
|
| 497 |
+
content=(
|
| 498 |
+
"任务:撰写 (1) paper.tex(ICLR 论文风格)和 (2) slides.tex(beamer)。\n"
|
| 499 |
+
"场景:15 分钟预研汇报。\n"
|
| 500 |
+
"要求高信息密度:至少 2 张表(证据计划、风险登记)。\n"
|
| 501 |
+
"必须包含“参考资源”并列出所有来源 URL。\n\n"
|
| 502 |
+
"paper.tex 要求:\n"
|
| 503 |
+
"- Use: \\documentclass{article} and \\usepackage{iclr2026_conference,times}\n"
|
| 504 |
+
"- 包含:标题、摘要(<=150 词)\n"
|
| 505 |
+
"- 章节:目标、待验证主张、研究问题、范围/非范围\n"
|
| 506 |
+
" 证据计划(表)、来源映射、风险(表)、时间线(表)\n"
|
| 507 |
+
" 交付物、参考资源\n"
|
| 508 |
+
"- 禁止空话,每个要点必须可执行。\n\n"
|
| 509 |
+
"slides.tex 要求:\n"
|
| 510 |
+
"- 16:9 beamer, 8-10 frames, 1 idea per slide\n"
|
| 511 |
+
"- 至少 1 页证据矩阵,至少 1 页风险页\n\n"
|
| 512 |
+
"来源 JSON:\n"
|
| 513 |
+
+ src_json
|
| 514 |
+
+ ("\n\n反馈:\n" + feedback if feedback else "")
|
| 515 |
+
+ "\n\n输出格式(必须严格):\n"
|
| 516 |
+
+ "<<<paper.tex>>>\n<latex>\n<<<end paper.tex>>>\n"
|
| 517 |
+
+ "<<<slides.tex>>>\n<latex>\n<<<end slides.tex>>>\n"
|
| 518 |
+
),
|
| 519 |
+
),
|
| 520 |
+
]
|
| 521 |
+
text = client.chat_text(msgs, temperature=0.2)
|
| 522 |
+
parsed = _parse_marked_tex(text)
|
| 523 |
+
paper = parsed.get("paper")
|
| 524 |
+
slides = parsed.get("slides")
|
| 525 |
+
if not isinstance(paper, str) or not isinstance(slides, str):
|
| 526 |
+
feedback = "Missing marked blocks."
|
| 527 |
+
continue
|
| 528 |
+
last_paper, last_slides = paper, slides
|
| 529 |
+
score, fb = _score_pre_tex(paper, slides, sources)
|
| 530 |
+
if not cfg.pre_tex_quality_gate or score >= cfg.pre_tex_min_score:
|
| 531 |
+
return paper, slides
|
| 532 |
+
feedback = fb
|
| 533 |
+
|
| 534 |
+
if last_paper and last_slides:
|
| 535 |
+
return last_paper, last_slides
|
| 536 |
+
return _render_paper_tex(cfg.topic, sources), _render_slides_tex(cfg.topic, sources)
|
| 537 |
+
|
| 538 |
+
|
| 539 |
+
def _generate_pre_tex_pretty(
|
| 540 |
+
cfg: RunConfig,
|
| 541 |
+
client,
|
| 542 |
+
sources: list[Source],
|
| 543 |
+
) -> tuple[str, str]:
|
| 544 |
+
src_json = json.dumps([asdict(s) for s in sources], ensure_ascii=False)
|
| 545 |
+
feedback = ""
|
| 546 |
+
last_paper = _render_paper_tex(cfg.topic, sources)
|
| 547 |
+
last_slides = _render_slides_tex(cfg.topic, sources)
|
| 548 |
+
|
| 549 |
+
for _attempt in range(max(1, cfg.pre_tex_attempts)):
|
| 550 |
+
msgs = [
|
| 551 |
+
ChatMessage(
|
| 552 |
+
role="system",
|
| 553 |
+
content=(
|
| 554 |
+
"你是严谨的 LaTeX 作者。"
|
| 555 |
+
"请输出可直接编译、结构完整、信息密度高的中文 .tex 文件。"
|
| 556 |
+
"不要输出 JSON。"
|
| 557 |
+
),
|
| 558 |
+
),
|
| 559 |
+
ChatMessage(
|
| 560 |
+
role="user",
|
| 561 |
+
content=(
|
| 562 |
+
"生成两个自包含 LaTeX 文件(简体中文):\n"
|
| 563 |
+
"A) pre_paper.tex:article。\n"
|
| 564 |
+
"B) pre_slides.tex:beamer 16:9。\n\n"
|
| 565 |
+
"paper 要求:\n"
|
| 566 |
+
"- 使用 xelatex + ctex\n"
|
| 567 |
+
"- 版式整洁,信息密度高,无空话\n"
|
| 568 |
+
"- 章节至少覆盖:背景、创新、架构、能力、应用、局限、结论、参考资源\n"
|
| 569 |
+
"- 每条来源至少引用一次(\\cite{})\n\n"
|
| 570 |
+
"slides 要求:\n"
|
| 571 |
+
"- 8-10 页,一页一核心观点\n"
|
| 572 |
+
"- 至少 1 页证据矩阵,至少 1 页风险页\n\n"
|
| 573 |
+
"来源 JSON(以此为准):\n"
|
| 574 |
+
+ src_json
|
| 575 |
+
+ ("\n\n反馈:\n" + feedback if feedback else "")
|
| 576 |
+
+ "\n\n输出格式(必须严格):\n"
|
| 577 |
+
+ "<<<paper.tex>>>\n<latex>\n<<<end paper.tex>>>\n"
|
| 578 |
+
+ "<<<slides.tex>>>\n<latex>\n<<<end slides.tex>>>\n"
|
| 579 |
+
),
|
| 580 |
+
),
|
| 581 |
+
]
|
| 582 |
+
text = client.chat_text(msgs, temperature=0.2)
|
| 583 |
+
parsed = _parse_marked_tex(text)
|
| 584 |
+
paper = parsed.get("paper")
|
| 585 |
+
slides = parsed.get("slides")
|
| 586 |
+
if not isinstance(paper, str) or not isinstance(slides, str):
|
| 587 |
+
feedback = "Missing marked blocks."
|
| 588 |
+
continue
|
| 589 |
+
last_paper, last_slides = paper, slides
|
| 590 |
+
|
| 591 |
+
score, fb = _score_pre_tex(paper, slides, sources)
|
| 592 |
+
if "thebibliography" not in paper:
|
| 593 |
+
score *= 0.75
|
| 594 |
+
if not cfg.pre_tex_quality_gate or score >= cfg.pre_tex_min_score:
|
| 595 |
+
return paper, slides
|
| 596 |
+
feedback = fb
|
| 597 |
+
|
| 598 |
+
return last_paper, last_slides
|
| 599 |
+
|
| 600 |
+
|
| 601 |
+
def _pre_outline(topic: str) -> dict[str, list[str]]:
|
| 602 |
+
_ = topic
|
| 603 |
+
return {
|
| 604 |
+
"paper": [
|
| 605 |
+
"标题",
|
| 606 |
+
"1. 背景与问题定义",
|
| 607 |
+
"2. 技术创新点",
|
| 608 |
+
"3. 系统架构与关键机制",
|
| 609 |
+
"4. 能力与性能分析",
|
| 610 |
+
"5. 应用场景与价值",
|
| 611 |
+
"6. 局限与风险",
|
| 612 |
+
"7. 结论",
|
| 613 |
+
"8. 参考资源",
|
| 614 |
+
],
|
| 615 |
+
"slides": [
|
| 616 |
+
"标题",
|
| 617 |
+
"背景与核心问题",
|
| 618 |
+
"技术创新点",
|
| 619 |
+
"系统架构",
|
| 620 |
+
"能力与性能",
|
| 621 |
+
"应用场景",
|
| 622 |
+
"局限与风险",
|
| 623 |
+
"结论",
|
| 624 |
+
"Q&A",
|
| 625 |
+
],
|
| 626 |
+
}
|
| 627 |
+
|
| 628 |
+
|
| 629 |
+
def _score_pre_tex(paper: str, slides: str, sources: list[Source]) -> tuple[float, str]:
|
| 630 |
+
score = 1.0
|
| 631 |
+
must = [
|
| 632 |
+
"背景",
|
| 633 |
+
"创新",
|
| 634 |
+
"架构",
|
| 635 |
+
"应用",
|
| 636 |
+
"局限",
|
| 637 |
+
"结论",
|
| 638 |
+
"参考",
|
| 639 |
+
]
|
| 640 |
+
for k in must:
|
| 641 |
+
if k not in paper:
|
| 642 |
+
score *= 0.85
|
| 643 |
+
if "\\documentclass" not in paper or "\\documentclass" not in slides:
|
| 644 |
+
score *= 0.5
|
| 645 |
+
if len(sources) >= 3 and paper.count("\\url{") < 3:
|
| 646 |
+
score *= 0.7
|
| 647 |
+
if "iclr2026_conference" in paper and "\\usepackage{iclr2026_conference" not in paper:
|
| 648 |
+
score *= 0.8
|
| 649 |
+
zh_chars = sum(1 for ch in (paper + slides) if "\u4e00" <= ch <= "\u9fff")
|
| 650 |
+
total_chars = max(1, len(paper + slides))
|
| 651 |
+
if zh_chars / total_chars < 0.15:
|
| 652 |
+
score *= 0.7
|
| 653 |
+
fb = "章节不足或资源映射偏弱" if score < 0.95 else "ok"
|
| 654 |
+
return max(0.0, min(1.0, score)), fb
|
| 655 |
+
|
| 656 |
+
|
| 657 |
+
def _parse_marked_tex(text: str) -> dict[str, str]:
|
| 658 |
+
def extract(name: str) -> str | None:
|
| 659 |
+
start = f"<<<{name}>>>"
|
| 660 |
+
end = f"<<<end {name}>>>"
|
| 661 |
+
a = text.find(start)
|
| 662 |
+
b = text.find(end)
|
| 663 |
+
if a == -1 or b == -1 or b <= a:
|
| 664 |
+
return None
|
| 665 |
+
inner = text[a + len(start) : b].strip()
|
| 666 |
+
inner = _strip_markdown_fences(inner).strip()
|
| 667 |
+
if inner.startswith("<latex>"):
|
| 668 |
+
inner = inner[len("<latex>") :].lstrip()
|
| 669 |
+
return inner + "\n"
|
| 670 |
+
|
| 671 |
+
out: dict[str, str] = {}
|
| 672 |
+
paper = extract("paper.tex")
|
| 673 |
+
slides = extract("slides.tex")
|
| 674 |
+
if paper is not None:
|
| 675 |
+
out["paper"] = paper
|
| 676 |
+
if slides is not None:
|
| 677 |
+
out["slides"] = slides
|
| 678 |
+
return out
|
| 679 |
+
|
| 680 |
+
|
| 681 |
+
def _strip_markdown_fences(s: str) -> str:
|
| 682 |
+
t = s.strip()
|
| 683 |
+
if t.startswith("```"):
|
| 684 |
+
lines = t.splitlines()
|
| 685 |
+
if len(lines) >= 2 and lines[-1].strip().startswith("```"):
|
| 686 |
+
inner = "\n".join(lines[1:-1]).strip()
|
| 687 |
+
return inner + "\n"
|
| 688 |
+
return s
|
| 689 |
+
|
| 690 |
+
|
| 691 |
+
def _dummy_pdf_bytes(label: str) -> bytes:
|
| 692 |
+
content = f"Dummy PDF ({label})".encode("ascii", errors="ignore")
|
| 693 |
+
return (
|
| 694 |
+
b"%PDF-1.1\n"
|
| 695 |
+
b"1 0 obj<<>>endobj\n"
|
| 696 |
+
b"2 0 obj<< /Length 44 >>stream\n"
|
| 697 |
+
b"BT /F1 12 Tf 72 720 Td ("
|
| 698 |
+
+ content
|
| 699 |
+
+ b") Tj ET\n"
|
| 700 |
+
b"endstream endobj\n"
|
| 701 |
+
b"3 0 obj<< /Type /Page /Parent 4 0 R /Contents 2 0 R >>endobj\n"
|
| 702 |
+
b"4 0 obj<< /Type /Pages /Kids [3 0 R] /Count 1 >>endobj\n"
|
| 703 |
+
b"5 0 obj<< /Type /Catalog /Pages 4 0 R >>endobj\n"
|
| 704 |
+
b"xref\n0 6\n0000000000 65535 f \n"
|
| 705 |
+
b"trailer<< /Root 5 0 R /Size 6 >>\nstartxref\n0\n%%EOF\n"
|
| 706 |
+
)
|
hydradeck/utils.py
ADDED
@@ -0,0 +1,86 @@
from __future__ import annotations

import datetime
import sys
import threading
import time


def log(enabled: bool, msg: str) -> None:
    if not enabled:
        return
    ts = datetime.datetime.now(datetime.timezone.utc).isoformat(timespec="seconds")
    print(f"[{ts}] {msg}")


JSON = dict[str, object]


class Heartbeat:
    def __init__(self, enabled: bool, label: str, interval_s: float = 5.0) -> None:
        self._enabled = enabled
        self._label = label
        self._interval_s = interval_s
        self._stop = threading.Event()
        self._t: threading.Thread | None = None

    def __enter__(self) -> Heartbeat:
        if not self._enabled:
            return self

        def run() -> None:
            start = time.time()
            while not self._stop.wait(self._interval_s):
                elapsed = int(time.time() - start)
                sys.stderr.write(f"[heartbeat] {self._label} ({elapsed}s)\n")
                sys.stderr.flush()

        self._t = threading.Thread(target=run, daemon=True)
        self._t.start()
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        _ = (exc_type, exc, tb)
        if not self._enabled:
            return
        self._stop.set()
        if self._t is not None:
            self._t.join(timeout=1.0)


class Progress:
    def __init__(
        self,
        enabled: bool,
        total: int,
        label: str = "",
        stream=None,
    ) -> None:
        self._enabled = enabled
        self._total = max(int(total), 1)
        self._label = label
        self._stream = stream or sys.stderr
        self._current = 0
        self._last_len = 0

    def update(self, step: str, inc: int = 1) -> None:
        if not self._enabled:
            return
        self._current = min(self._total, self._current + max(int(inc), 0))
        pct = int((self._current / self._total) * 100)
        bar_len = 24
        filled = int(bar_len * self._current / self._total)
        bar = "#" * filled + "-" * (bar_len - filled)
        msg = f"[progress] {self._label} [{bar}] {pct:3d}% {step}"
        pad = " " * max(0, self._last_len - len(msg))
        self._stream.write("\r" + msg + pad)
        self._stream.flush()
        self._last_len = len(msg)

    def done(self, step: str = "done") -> None:
        if not self._enabled:
            return
        self._current = self._total
        self.update(step, inc=0)
        self._stream.write("\n")
        self._stream.flush()
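The bar arithmetic in `Progress.update` above can be sketched as a pure function (a hypothetical helper for illustration; the real class writes the line to stderr with `\r` rewriting and padding):

```python
def render_bar(current: int, total: int, step: str, label: str = "resources") -> str:
    # Mirrors Progress.update: 24-char bar, '#' filled, '-' remaining,
    # percent right-aligned in a 3-char field.
    total = max(int(total), 1)
    current = min(total, max(0, int(current)))
    pct = int((current / total) * 100)
    bar_len = 24
    filled = int(bar_len * current / total)
    bar = "#" * filled + "-" * (bar_len - filled)
    return f"[progress] {label} [{bar}] {pct:3d}% {step}"


print(render_bar(3, 6, "snapshots"))
# [progress] resources [############------------]  50% snapshots
```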
pyproject.toml
ADDED
@@ -0,0 +1,44 @@
[build-system]
requires = ["setuptools>=68", "wheel"]
build-backend = "setuptools.build_meta"

[project]
name = "hydradeck"
version = "0.1.0"
description = "Grok-driven deep research pipeline that outputs detailed reports, speech scripts, and Beamer slides."
readme = "README.md"
requires-python = ">=3.9"
license = { text = "MIT" }
authors = [{ name = "hydradeck contributors" }]
dependencies = [
    "requests>=2.31.0",
    "urllib3>=2,<3",
    "gradio>=4.44.1,<5",
    "huggingface_hub<1.0",
]

[project.optional-dependencies]
dev = [
    "pytest>=8.0.0",
    "ruff>=0.6.0",
]

[project.scripts]
hydradeck = "hydradeck.cli:main"

[tool.setuptools]
package-dir = {"" = "."}

[tool.setuptools.packages.find]
where = ["."]
include = ["hydradeck*"]

[tool.setuptools.package-data]
hydradeck = ["templates/**/*"]

[tool.ruff]
line-length = 100
target-version = "py39"

[tool.ruff.lint]
select = ["E", "F", "I", "UP", "B"]
requirements.txt
ADDED
@@ -0,0 +1,4 @@
requests>=2.31.0
urllib3>=2,<3
gradio>=4.44.1,<5
huggingface_hub<1.0
tests/test_app_agentic.py
ADDED
@@ -0,0 +1,74 @@
from __future__ import annotations

from pathlib import Path

import app


def test_agentic_pipeline_mock_renders_online_pdfs(monkeypatch) -> None:
    def fake_compile(tex_source: str, output_name: str) -> str:
        p = Path("/tmp") / output_name
        p.write_bytes(b"%PDF-1.5\n%mock\n")
        return str(p)

    monkeypatch.setattr(app, "_compile_latex_online", fake_compile)

    (
        status,
        progress_log,
        _scope_json,
        section_plan_json,
        paper_tex,
        slides_tex,
        rendered_pdfs,
        paper_pdf,
        slides_pdf,
    ) = (
        app._run_agentic_pipeline(
            topic="Agentic flow test",
            model="grok-3-mini",
            base_url="https://api.example.com",
            api_key="",
            request_budget=20,
            use_mock=True,
        )
    )

    assert "done" in status.lower()
    assert "ScopeScout" in progress_log
    assert "sections" in section_plan_json
    assert "documentclass" in paper_tex
    assert "documentclass" in slides_tex
    paths = [x.strip() for x in rendered_pdfs.splitlines() if x.strip()]
    assert len(paths) == 2
    for p in paths:
        assert Path(p).exists()
    assert Path(str(paper_pdf)).exists()
    assert Path(str(slides_pdf)).exists()


def test_agentic_stream_emits_progress_and_pdf_paths(monkeypatch) -> None:
    def fake_compile(tex_source: str, output_name: str) -> str:
        p = Path("/tmp") / output_name
        p.write_bytes(b"%PDF-1.5\n%mock\n")
        return str(p)

    monkeypatch.setattr(app, "_compile_latex_online", fake_compile)

    chunks = list(
        app._run_agentic_pipeline_stream(
            topic="Agentic stream test",
            model="grok-3-mini",
            base_url="https://api.example.com",
            api_key="",
            request_budget=20,
            use_mock=True,
        )
    )
    assert len(chunks) >= 3
    assert chunks[0][-1] == 5
    assert chunks[1][-1] == 30
    assert chunks[-1][-1] == 100
    assert "done" in str(chunks[-1][0]).lower()
    assert Path(str(chunks[-1][7])).exists()
    assert Path(str(chunks[-1][8])).exists()
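Both tests above rely on the same stubbing pattern: swap an expensive external call (here, the online LaTeX compiler) for a fake before exercising the pipeline. A minimal standalone sketch of that pattern, using `unittest.mock.patch.object` in place of pytest's `monkeypatch` and a `SimpleNamespace` stand-in for the real `app` module (both stand-ins are assumptions for illustration):

```python
from pathlib import Path
from types import SimpleNamespace
from unittest import mock
import tempfile


def _real_compile(tex_source: str, output_name: str) -> str:
    # The real implementation would contact an online compile service.
    raise RuntimeError("network disabled in tests")


# Stand-in for the real `app` module under test.
app = SimpleNamespace(_compile_latex_online=_real_compile)


def fake_compile(tex_source: str, output_name: str) -> str:
    # Write a tiny PDF-shaped file instead of making a network call.
    p = Path(tempfile.gettempdir()) / output_name
    p.write_bytes(b"%PDF-1.5\n%mock\n")
    return str(p)


# patch.object mirrors what monkeypatch.setattr does in the tests above:
# the attribute is replaced for the duration of the block, then restored.
with mock.patch.object(app, "_compile_latex_online", fake_compile):
    out = app._compile_latex_online("\\documentclass{article}", "demo.pdf")

assert Path(out).read_bytes().startswith(b"%PDF")
```

The pipeline code never knows the difference, so the assertions can focus on the pipeline's own contract (status strings, file paths) rather than the compiler's.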
tests/test_cli.py
ADDED
@@ -0,0 +1,66 @@
from __future__ import annotations

import pytest

from hydradeck import cli
from hydradeck.core.types import RunConfig


def test_run_command_accepts_snapshot_total_timeout_default(
    monkeypatch: pytest.MonkeyPatch,
) -> None:
    captured: dict[str, float] = {}

    def fake_run(cfg: RunConfig) -> object:
        captured["snapshot_total_timeout_s"] = cfg.snapshot_total_timeout_s
        return object()

    monkeypatch.setattr(cli, "run", fake_run)

    code = cli.main(
        [
            "run",
            "--topic",
            "t",
            "--out",
            "out.zip",
            "--base-url",
            "https://example.invalid",
            "--model",
            "mock",
            "--mock",
        ]
    )

    assert code == 0
    assert captured["snapshot_total_timeout_s"] == 60.0


def test_run_command_passes_request_budget(monkeypatch: pytest.MonkeyPatch) -> None:
    captured: dict[str, float] = {}

    def fake_run(cfg: RunConfig) -> object:
        captured["request_budget_s"] = cfg.request_budget_s
        return object()

    monkeypatch.setattr(cli, "run", fake_run)

    code = cli.main(
        [
            "run",
            "--topic",
            "t",
            "--out",
            "out.zip",
            "--base-url",
            "https://example.invalid",
            "--model",
            "mock",
            "--request-budget",
            "90",
            "--mock",
        ]
    )

    assert code == 0
    assert captured["request_budget_s"] == 90.0
tests/test_config.py
ADDED
@@ -0,0 +1,44 @@
from __future__ import annotations

from pathlib import Path

from hydradeck.config import UserConfig, load_config, load_merged_config, save_config


def test_save_and_load_config(tmp_path: Path) -> None:
    p = tmp_path / "cfg.json"
    save_config(
        UserConfig(
            base_url="https://x",
            api_key="k",
            model="m",
            pdf_compiler="auto",
            template="iclr2026",
        ),
        path=p,
    )
    cfg = load_config(path=p)
    assert cfg.base_url == "https://x"
    assert cfg.api_key == "k"
    assert cfg.model == "m"
    assert cfg.pdf_compiler == "auto"
    assert cfg.template == "iclr2026"


def test_project_config_overrides_user(tmp_path: Path, monkeypatch) -> None:
    user_p = tmp_path / "user.json"
    save_config(UserConfig(base_url="https://u", api_key="u", model="u"), path=user_p)

    proj_root = tmp_path / "proj"
    (proj_root / ".hydradeck").mkdir(parents=True)
    proj_p = proj_root / ".hydradeck" / "config.json"
    save_config(UserConfig(model="p"), path=proj_p)

    monkeypatch.chdir(proj_root)
    from hydradeck import config as cfgmod

    monkeypatch.setattr(cfgmod, "config_path", lambda: user_p)
    merged = load_merged_config()
    assert merged.base_url == "https://u"
    assert merged.api_key == "u"
    assert merged.model == "p"
tests/test_preset_pre.py
ADDED
@@ -0,0 +1,17 @@
from __future__ import annotations

import zipfile
from pathlib import Path


def test_preset_rynnbrain_zip(tmp_path: Path) -> None:
    from hydradeck.presets.rynnbrain import generate

    out_zip = tmp_path / "rynnbrain_pre.zip"
    generate(out=out_zip, keep_stage=False, fetch=False)
    assert out_zip.exists()
    with zipfile.ZipFile(out_zip, "r") as z:
        names = set(z.namelist())
        assert "pre_report.md" in names
        assert "research.json" in names
        assert "resources/sources.json" in names
tests/test_render.py
ADDED
@@ -0,0 +1,189 @@
from __future__ import annotations

from hydradeck.core.types import ExtractedFact, Source
from hydradeck.render import (
    build_slide_frames_from_sections,
    build_slide_frames_from_report,
    enforce_slide_density,
    render_beamer_frames,
    render_paper,
    render_report_structured,
)


def test_render_paper_converts_markdown_like_body() -> None:
    sources = [Source(url="https://example.com", title="Example", snippet="snippet")]
    facts = [
        ExtractedFact(
            claim="Claim A",
            evidence="Evidence A",
            url="https://example.com",
            title="Example",
        )
    ]
    body = """## Heading
- bullet 1
- bullet 2
`inline`
```python
print('x')
```
"""

    tex = render_paper(
        topic="demo",
        outline=["背景", "创新"],
        body=body,
        facts=facts,
        sources=sources,
    )

    assert "```" not in tex
    assert "## Heading" not in tex
    assert "Heading" in tex
    assert "\\begin{itemize}" in tex
    assert "bullet 1" in tex
    assert "\\section*{1. Introduction and Background}" in tex


def test_render_templates_use_facts_not_generic_filler() -> None:
    sources = [Source(url="https://example.com", title="Example", snippet="snippet")]
    facts = [
        ExtractedFact(
            claim="RynnBrain released checkpoints on 2026-02-09",
            evidence="project timeline from official repo",
            url="https://example.com",
            title="Example",
        ),
        ExtractedFact(
            claim="Model introduces interleaved reasoning with spatial grounding",
            evidence="technical report description",
            url="https://example.com",
            title="Example",
        ),
    ]
    paper = render_paper(
        topic="demo",
        outline=["背景", "创新", "架构"],
        body="结论段落 [1]",
        facts=facts,
        sources=sources,
    )
    section_blocks = [
        {"name": "背景", "latex": facts[0].claim},
        {"name": "创新", "latex": facts[1].claim},
    ]
    frames = build_slide_frames_from_sections(section_blocks, language="en")
    slides = render_beamer_frames("demo", frames, language="en")

    assert "released checkpoints" in paper
    assert "interleaved reasoning" in paper
    assert "released checkpoints" in slides
    assert "interleaved reasoning" in slides
    section_one = paper.split("\\section*{1. Introduction and Background}", 1)[1]
    section_one = section_one.split("\\section*{2. Logical Outline}", 1)[0]
    assert "\\begin{itemize}" not in section_one


def test_render_beamer_from_report_derives_outline() -> None:
    paper = (
        "\\documentclass{article}\n"
        "\\begin{document}\n"
        "\\section*{Executive Summary}\nAlpha beta gamma.\n\n"
        "\\section*{Methodology}\nMethod details.\n\n"
        "\\section*{Results}\nResult details.\n"
        "\\end{document}\n"
    )
    frames = build_slide_frames_from_report(paper, language="en")
    slides = render_beamer_frames("demo", frames, language="en")
    assert "\\begin{frame}{Agenda}" in slides
    assert "Executive Summary" in slides
    assert "Methodology" in slides


def test_render_beamer_frames_limits_density() -> None:
    report = (
        "\\documentclass{article}\\begin{document}"
        "\\section*{Overview} A long sentence about architecture and implementation details repeated."
        " Another long sentence about evaluation metrics and reproducibility details."
        "\\section*{Results} Multiple findings with evidence and quantitative metrics."
        "\\end{document}"
    )
    frames = build_slide_frames_from_report(report, language="en")
    assert len(frames) >= 2
    tex = render_beamer_frames("demo", frames, language="en")
    assert "\\begin{frame}{Agenda}" in tex
    assert "\\begin{itemize}" in tex


def test_render_report_structured_zh_uses_ctex() -> None:
    section_blocks = [
        {"name": "方法", "latex": "本节给出方法细节与参数说明。"},
        {"name": "结果", "latex": "本节给出结果与证据。"},
    ]
    tex = render_report_structured("中文研究报告", section_blocks, language="zh")
    assert "\\documentclass[11pt]{ctexart}" in tex
    assert "本研究报告聚焦可追溯证据" not in tex
    assert "建议定期刷新证据并进行复跑验证" not in tex
    assert "\\section*{方法}" in tex


def test_render_beamer_frames_zh_uses_ctexbeamer() -> None:
    frames = [
        build_slide_frames_from_report(
            "\\documentclass{ctexart}\\begin{document}\\section*{结果}关键结果一。关键结果二。\\end{document}",
            language="zh",
        )[0]
    ]
    tex = render_beamer_frames("中文主题", frames, language="zh")
    assert "\\documentclass[aspectratio=169]{ctexbeamer}" in tex
    assert "\\begin{frame}{目录}" in tex


def test_build_slide_frames_from_sections_splits_long_section() -> None:
    section_blocks = [
        {
            "name": "Results",
            "latex": (
                "The first finding shows strong improvement in consistency and precision. "
                "The second finding shows stronger robustness under distribution shift. "
                "The third finding indicates cost-performance improvement. "
                "The fourth finding confirms stability across runs. "
                "The fifth finding highlights limitations and guardrails."
            ),
        }
    ]
    frames = build_slide_frames_from_sections(section_blocks, language="en")
    assert len(frames) >= 2


def test_render_report_structured_removes_bracket_refs() -> None:
    section_blocks = [
        {"name": "Evidence", "latex": "Claim [1] with support [2] and \\cite{src1}."}
    ]
    tex = render_report_structured("demo", section_blocks, language="en")
    assert "[1]" not in tex
    assert "[2]" not in tex
    assert "\\cite{" not in tex


def test_enforce_slide_density_splits_large_bullet_groups() -> None:
    frames = [
        {
            "title": "Results",
            "bullets": [
                "point one with enough words to be valid",
                "point two with enough words to be valid",
                "point three with enough words to be valid",
                "point four with enough words to be valid",
                "point five with enough words to be valid",
            ],
        }
    ]
    from hydradeck.render import SlideFrame

    fr = [SlideFrame(title=x["title"], bullets=x["bullets"]) for x in frames]
    out = enforce_slide_density(fr, language="en", max_bullets_per_frame=4)
    assert len(out) == 2
    assert out[0].title == "Results"
    assert "(cont.)" in out[1].title
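The last test fixes the contract of `enforce_slide_density`: five bullets with `max_bullets_per_frame=4` yield two frames, and the continuation frame carries a "(cont.)" suffix. A self-contained sketch of just that splitting rule (the real function is presumably richer, handling language and bullet validity as well):

```python
from dataclasses import dataclass


@dataclass
class SlideFrame:
    title: str
    bullets: list[str]


def enforce_density(frames: list[SlideFrame], max_bullets: int = 4) -> list[SlideFrame]:
    """Split frames whose bullet count exceeds max_bullets into chunks."""
    out: list[SlideFrame] = []
    for fr in frames:
        # Chunk the bullet list into groups of at most max_bullets.
        chunks = [
            fr.bullets[i : i + max_bullets]
            for i in range(0, len(fr.bullets), max_bullets)
        ]
        for idx, chunk in enumerate(chunks):
            # Continuation frames get a "(cont.)" suffix on the title.
            title = fr.title if idx == 0 else f"{fr.title} (cont.)"
            out.append(SlideFrame(title=title, bullets=chunk))
    return out


frames = [SlideFrame("Results", [f"point {i}" for i in range(1, 6)])]
split = enforce_density(frames, max_bullets=4)
# 5 bullets at a cap of 4 -> frames "Results" (4 bullets) and
# "Results (cont.)" (1 bullet), matching the test's assertions.
```

Keeping the split purely positional (first `max_bullets` stay, the rest overflow) is what makes the `len(out) == 2` and title assertions deterministic.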
tests/test_resources_pack_mock.py
ADDED
@@ -0,0 +1,43 @@
from __future__ import annotations

import zipfile
from pathlib import Path

from hydradeck.core.types import RunConfig
from hydradeck.resources_pack import build_resources_pack


def test_resources_pack_mock(tmp_path: Path) -> None:
    out_zip = tmp_path / "res.zip"
    cfg = RunConfig(
        topic="RynnBrain",
        out=out_zip,
        base_url="https://example.invalid",
        api_key="",
        model="mock",
        use_mock=True,
        verbose=False,
        progress=False,
        llm_timeout_s=5.0,
        max_total_runtime_s=5.0,
        request_budget_s=2.0,
        snapshot_timeout_s=1.0,
        keep_stage=False,
        max_sources=3,
        module_sources=2,
    )
    build_resources_pack(cfg)
    assert out_zip.exists()
    with zipfile.ZipFile(out_zip, "r") as z:
        names = set(z.namelist())
        pre_paper = z.read("pre_paper.tex").decode("utf-8")
        pre_slides = z.read("pre_slides.tex").decode("utf-8")
    assert "resources/sources.json" in names
    assert "resources/snapshots.json" in names
    assert "research.json" in names
    assert "pre_paper.tex" in names
    assert "pre_slides.tex" in names
    assert "pdf/pre_paper.pdf" in names
    assert "pdf/pre_slides.pdf" in names
    assert "来源清单" in pre_paper
    assert "关键来源" in pre_slides
tests/test_smoke_mock.py
ADDED
@@ -0,0 +1,57 @@
from __future__ import annotations

import zipfile
from pathlib import Path

from hydradeck.core.types import RunConfig
from hydradeck.pipeline import run


def test_mock_run_creates_zip(tmp_path: Path) -> None:
    out_zip = tmp_path / "demo.zip"
    cfg = RunConfig(
        topic="test topic",
        out=out_zip,
        base_url="https://example.invalid",
        api_key="",
        model="mock",
        use_mock=True,
        verbose=False,
        iterations=2,
        max_sources=3,
        archive_snapshots=False,
        auto=True,
        auto_queries=True,
        auto_models=True,
    )
    run(cfg)
    assert out_zip.exists()
    with zipfile.ZipFile(out_zip, "r") as z:
        names = set(z.namelist())
        compile_sh = z.read("compile.sh").decode("utf-8")
        paper_tex = z.read("paper.tex").decode("utf-8")
        slides_tex = z.read("slides.tex").decode("utf-8")
    for required in [
        "pre_report.md",
        "report.md",
        "speech.md",
        "paper.tex",
        "slides.tex",
        "refs.bib",
        "research.json",
        "compile.sh",
        "Makefile",
        "resources/sources.json",
    ]:
        assert required in names

    assert "xelatex -interaction=nonstopmode paper.tex" in compile_sh
    assert "xelatex -interaction=nonstopmode slides.tex" in compile_sh
    assert "\\section*{3. Evidence and Key Findings}" in paper_tex
    assert "\\section*{1. Introduction and Background}" in paper_tex
    assert "\\begin{frame}{Agenda}" in slides_tex
    assert "\\usetheme{metropolis}" in slides_tex
    assert "```" not in paper_tex
    assert "```" not in slides_tex
    assert "## " not in paper_tex
    assert "## " not in slides_tex
tests/test_verbatim_mock.py
ADDED
@@ -0,0 +1,43 @@
from __future__ import annotations

import zipfile
from pathlib import Path

from hydradeck.core.types import RunConfig
from hydradeck.pipeline import run


def test_mock_run_verbatim_creates_outputs(tmp_path: Path) -> None:
    out_zip = tmp_path / "verbatim.zip"
    cfg = RunConfig(
        topic="RynnBrain",
        out=out_zip,
        base_url="https://example.invalid",
        api_key="",
        model="mock",
        use_mock=True,
        verbose=False,
        iterations=1,
        max_sources=3,
        verbatim=True,
        archive_prompts=True,
        archive_snapshots=False,
        quality_gate=True,
        min_quality_score=0.85,
        max_quality_attempts=2,
    )
    run(cfg)
    with zipfile.ZipFile(out_zip, "r") as z:
        names = set(z.namelist())
    for required in [
        "pre_report.md",
        "report.md",
        "speech.md",
        "paper.tex",
        "slides.tex",
        "refs.bib",
        "research.json",
        "resources/sources.json",
        "prompts.jsonl",
    ]:
        assert required in names