OCTFlow v1 (2026-06-21): 13 weights (core+downstream) + S0-S7 report + experiment log; public release
- .gitattributes +1 -0
- README.md +50 -42
- octflow-raev2-code.tar.gz +2 -2
- results/EXPERIMENT_LOG.md +89 -0
- results/octflow_downstream_report.html +3 -0
- results/result_jsons.tar.gz +3 -0
- results/zeroshot_spectrum.json +52 -0
- weights/sd3_vb_denoise_100k_step4000.pt +3 -0
- weights/sd3_vb_disccup_v3b_step4000.pt +3 -0
- weights/sd3_vb_faz_v3b_step4000.pt +3 -0
- weights/sd3_vb_fluid_v3b_step4000.pt +3 -0
- weights/sd3_vb_layer_100k_step30000.pt +3 -0
- weights/sd3_vb_layer_100k_v3b_step15000.pt +3 -0
- weights/sd3_vb_mask2img_step12000.pt +3 -0
- weights/sd3_vb_octa500_100k_step4000.pt +3 -0
- weights/sd3_vb_octavessel_v3b_step4000.pt +3 -0
- weights/sd3_vb_stageC_v3a_step30000.pt +2 -2
- weights/sd3_vb_vessel_v3b_step4000.pt +3 -0
.gitattributes
CHANGED
@@ -39,3 +39,4 @@ results/figs/qual_denoise.png filter=lfs diff=lfs merge=lfs -text
 results/figs/radar.png filter=lfs diff=lfs merge=lfs -text
 results/figs/spectrum.png filter=lfs diff=lfs merge=lfs -text
 results/manuscript/octflow_main.pdf filter=lfs diff=lfs merge=lfs -text
+results/octflow_downstream_report.html filter=lfs diff=lfs merge=lfs -text
README.md
CHANGED
@@ -2,48 +2,56 @@
-- foundation-model
-# OCTFlow —
-hf download MaybeRichard/OCTFlow --local-dir
 license: other
 tags:
 - ophthalmology
+- OCT
+- fundus
 - medical-imaging
+- diffusion
+- stable-diffusion-3
+- segmentation
+- instruction-tuning
+library_name: diffusers
 ---

+# OCTFlow v1: Unified Generative Instruction Foundation Model for Ophthalmic Imaging
+
+**Frozen v1 · 2026-06-21.** OCTFlow is a single SD3-medium model (16-channel KL-VAE + 2B MMDiT, rectified flow),
+instruction-tuned in the Vision-Banana style as an ophthalmic foundation model. One set of weights handles generation,
+multi-scheme segmentation, and denoising across 9 imaging modalities; the **text prompt** selects the task. `hf download MaybeRichard/OCTFlow --revision v1`.
+
+## What it does (one base, switch by prompt)
+- **Generation (T2I)**: 9 modalities: OCT B-scan, color fundus, SLO, UWF fundus, OCTA en-face, OCT en-face, FA, slit-lamp, IR-SLO.
+- **Instruction segmentation**: OCT retinal layers (9/5/3 + **arbitrary unseen counts** + single-layer selection + any colors), OCT fluid (IRF/SRF/PED), OCTA FAZ and large vessels, fundus vessels and disc/cup, OCTA500 5-layer.
+- **Denoising**: OCT speckle removal (generative image-to-image).
+- **Disease classification**: frozen-feature linear probe (14 OCT/fundus/UWF tasks).
+- **Semantic synthesis and data augmentation**: mask→image generation; labeled synthetic data that can be shared without exposing patient images.
+
+## Honest results (see `results/octflow_downstream_report.html` + `results/EXPERIMENT_LOG.md`)
+Per task, OCTFlow is **competitive but not SOTA**; what sets it apart is unification, zero-shot instruction generalization, and shareable data.
+- **Classification (14 tasks, linear probe, mean top-1)**: DINOv2 0.849 ≥ RETFound 0.844 ≥ **ours 0.837** > DINOv3 0.824 > MIRAGE 0.808 > VisionFM 0.796. Ours ranks 3rd of 7 backends (the seventh is the random baseline at 0.621 reported in the experiment log) and is never #1 on any single task.
+- **Segmentation (vs end-to-end fine-tuned FMs, evaluated at 512 under a matched protocol)**: ours is competitive; it loses most tasks to the strongest fine-tuned FM (especially DINOv2) by 0.03–0.09, wins OCTA large-vessel, and ties on FIVES vessel.
+- **Denoising**: fine-tuned FMs match or beat ours on PSNR/SSIM; ours wins on perceptual LPIPS (0.135 vs 0.37–0.41, lower is better).
+- **Data augmentation (classification §5 / segmentation §6)**: synthetic data matches or trails classical augmentation on raw accuracy gain; its value lies in shareable, customizable data rather than in that gain.
+- **★ Zero-shot instruction generalization (the moat)**: trained only on the 9/5/3-layer schemes; on **unseen** layer counts (8/7/6/4/2) the mean foreground mIoU is **0.441**, on par with **0.430** on the seen schemes; cross-device zero-shot on OCTA500 (never trained on) reaches a binary retina IoU of **0.897**. A discriminative FM or U-Net with a fixed output head cannot do this.
+
+## Weights (`weights/`, optimizer state stripped, bf16)
+| file | role |
+|---|---|
+| `sd3_multimodal_base_v2_step240000.pt` | T2I base (generation + probe backbone) |
+| `sd3_oct_stageA_v3_step20000.pt` | OCT domain-adapt init (warm-start) |
+| `sd3_vb_stageC_v3a_step30000.pt` | **instruction model** (multi-scheme seg + zero-shot, §7) |
+| `sd3_vb_layer_100k_step30000.pt` | 9-layer seg anchor (v3a recipe) |
+| `sd3_vb_layer_100k_v3b_step15000.pt` | 9-layer seg + decoded-loss (sharper thin layers) |
+| `sd3_vb_mask2img_step12000.pt` | mask→image generator (§6 semantic synthesis) |
+| `sd3_vb_denoise_100k_step4000.pt` | OCT denoiser |
+| `sd3_vb_{disccup,faz,fluid,octavessel,vessel}_v3b_step4000.pt` | downstream seg specialists |
+| `sd3_vb_octa500_100k_step4000.pt` | OCTA500 5-layer specialist |
+
+## Usage
+1. `hf download MaybeRichard/OCTFlow --revision v1 --local-dir octflow_v1`
+2. Untar `octflow-raev2-code.tar.gz` and install with `uv sync` (PyTorch 2.10+cu128). Load checkpoints with **`torch.load(..., weights_only=False)`**.
+3. Put a weight at `pilot/path1/results/<run>/checkpoints/`, set the dataset paths, and run the matching script in `pilot/path1/scripts/foundation/` (e.g. `seg_instr_eval.py`, `zeroshot_spectrum.py`, `denoise_eval.py`).
+
+## Notes / limitations
+- Pilot-scale; **not clinically validated** (external multi-center validation and a reader study are future work).
+- Built on the gated `stabilityai/stable-diffusion-3-medium-diffusers`; you must accept its license.
+- Environment: base conda, torch 2.10.0+cu128, diffusers 0.37, transformers 5.3.
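The usage steps in the README above can be sketched in Python. This is a minimal sketch, not the repository's own tooling: the helper names (`checkpoint_dest`, `fetch_weight`, `peek_checkpoint`) and any run name passed in are hypothetical, and the internal layout of the `.pt` files is not documented in the README, so the last helper only lists top-level keys. `hf_hub_download` and `torch.load` are the standard `huggingface_hub` and PyTorch calls. Because `weights_only=False` unpickles arbitrary objects, use it only on files from a source you trust.

```python
from pathlib import Path

REPO_ID = "MaybeRichard/OCTFlow"
REVISION = "v1"


def checkpoint_dest(code_root, run_name, weight_file):
    """Build the path step 3 expects: pilot/path1/results/<run>/checkpoints/<file>.

    `run_name` is a placeholder chosen by the user; the README does not fix it.
    """
    return (Path(code_root) / "pilot" / "path1" / "results"
            / run_name / "checkpoints" / Path(weight_file).name)


def fetch_weight(weight_file, local_dir="octflow_v1"):
    """Download one weight file from the repo (needs huggingface_hub and network)."""
    # Imported lazily so the path helper above works without the dependency.
    from huggingface_hub import hf_hub_download
    return hf_hub_download(REPO_ID, filename=f"weights/{weight_file}",
                           revision=REVISION, local_dir=local_dir)


def peek_checkpoint(path):
    """Load a checkpoint on CPU and report its top-level structure."""
    import torch
    ckpt = torch.load(path, map_location="cpu", weights_only=False)
    if isinstance(ckpt, dict):
        return list(ckpt.keys())
    return type(ckpt).__name__


# Example: where a downloaded instruction-model weight would be placed
# under an unpacked code tree, for a hypothetical run called "my_run".
dest = checkpoint_dest("octflow_code", "my_run",
                       "weights/sd3_vb_stageC_v3a_step30000.pt")
```

Keeping the network and PyTorch imports inside the functions lets the path helper be used on a machine that has not installed the full environment yet.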
octflow-raev2-code.tar.gz
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:12f444e8020f838c863d2e784d8ad825f96db3b9c153c158d3d3c385233c821d
+size 4529154
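The Git LFS pointer above records a SHA-256 digest and a byte size, which is enough to check a downloaded file. The following is a minimal sketch under the assumption that the pointer follows the usual three-line `key value` layout; the function names are illustrative and not part of any Git LFS tool.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file's size and SHA-256 agree with the pointer."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False  # cheap check first; avoids hashing a truncated download
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:12f444e8020f838c863d2e784d8ad825f96db3b9c153c158d3d3c385233c821d
size 4529154
"""
pointer = parse_lfs_pointer(POINTER)
```

Calling `matches_pointer("octflow-raev2-code.tar.gz", pointer)` on a local copy would then confirm the download is complete; the same check applies to the multi-gigabyte weight files, where the size comparison alone catches most interrupted transfers.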
results/EXPERIMENT_LOG.md
ADDED
@@ -0,0 +1,89 @@
+# OCTFlow Downstream Evaluation: Experiment Log (consolidated from the full session)
+
+- **Model**: SD3-medium (KL-VAE 16ch + 2B MMDiT, rectified flow) + Vision-Banana-style instruction tuning
+- **Environment**: 8× NVIDIA L20Y 80GB (`h800`, shared); base conda (torch2.10 / diffusers0.37 / sklearn / timm / smp / lpips); `HF_HUB_OFFLINE=1`
+- **Checkpoints**: base `sd3_multimodal_base_v2/step-0240000.pt` (probe + generation), instruction model `sd3_vb_stageC_v3a/step-0030000.pt`, layer-segmentation anchor `sd3_vb_layer_100k/step-0030000.pt`
+- **Deliverables**: `reports/octflow_downstream_report.html` (§0–§7, wide-screen layout, ~20MB) + this log
+- **Overall tone**: every dimension is honestly reported as **competitive, not SOTA**; the only exclusive capabilities are **zero-shot instruction generalization (§7)** and shareable data.
+
+---
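+
+## 1. Experiment timeline and results
+
+### §2 Multi-task classification probe (14 tasks / 7 backends, frozen-feature linear probe)
+- **Motivation**: as an ophthalmic foundation model, benchmark against RETFound/MIRAGE/VisionFM/DINOv2/DINOv3.
+- **Method**: frozen encoder features + linear probe; this session expanded from 7 tasks to **14** (+OCT-C8 (8 classes)/OCTDL/OCTID/Srinivasan/NEH/OLIVES/Messidor); the leading backends (ours/DINOv2/RETFound) were run with 3 seeds.
+- **Results (mean top-1 over 14 tasks)**: DINOv2 **0.849** ≥ RETFound 0.844 ≥ **ours 0.837** > DINOv3 0.824 > MIRAGE 0.808 > VisionFM 0.796 ≫ random 0.621.
+- **Conclusion**: **ours ranks 3rd of 7, only ~0.01 behind the leaders, but its per-task rank across the 14 tasks is always #2–#5 and never first** (MIRAGE takes most OCT tasks, DINOv2 takes fundus/UWF). Competitive, not SOTA; well ahead of the OCT-only MIRAGE/VisionFM (they collapse on fundus/UWF, while ours holds up across all three modalities).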
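+
+### §3 Segmentation: our model vs **fairly fine-tuned** FMs vs supervised U-Net
+- **Key correction**: the FM baselines were previously **frozen probes (eval@224)**, an unfair lower bound, which led to the wrong conclusion that we "crush the FMs". This session switched to **end-to-end fine-tuned FMs** (encoder unfrozen + dense decoder head + CE+Dice, **eval@512**, 3 seeds, `fm_seg_ft.py`).
+- **Results (miou_fg, ours vs strongest fine-tuned FM vs U-Net)**: 9-layer 0.53 / **0.59** / n/a; vessel 0.51 / 0.51 / **0.66**; OCTA large-vessel **0.53** / 0.50 / 0.61; FAZ 0.66 / **0.72** / 0.70; fluid 0.63 / **0.66** / 0.78; disc/cup 0.74 / **0.80** / 0.79; OCTA500 0.82 / 0.84 / **0.85**.
+- **Conclusion (honest, overturning the earlier version)**: under the fair protocol, ours is **slightly behind the strongest FM fine-tuned separately for each task on most tasks (especially DINOv2, by 0.03–0.09)**; it leads only on OCTA large-vessel and ties on vessel. On blob-like targets the fine-tuned FMs often overtake the U-Net; on thin vessels both fall below the CNN U-Net (ViT patch-resolution bottleneck).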
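+
+### §4 Denoising: our model (generative) vs fairly fine-tuned FMs
+- **Method**: the FMs are also fine-tuned end to end with L1+LPIPS (`fm_denoise_ft.py`).
+- **Results**: the fine-tuned FMs' **PSNR/SSIM are actually ≥ ours** (MIRAGE-FT 28.2 dB/0.77 vs ours 27.6/0.71); **ours leads only on perceptual LPIPS** (0.135 vs 0.37–0.41, about 3×).
+- **Fix**: the false color in the FM columns of the comparison figure came from my decoder head predicting 3 independent channels, which is unconstrained for grayscale OCT. It now predicts 1 luminance channel and copies it to 3 channels, so the inter-channel difference is 0 in every column.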
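+
+### §5 Generative data augmentation (classification)
+- **Method**: added `--mode aug` to `train_classifier.py` (a count-matched strong classical-augmentation control); real / +classical augmentation / +synthetic (generated by our base) × data size × 3 seeds.
+- **Results**: Kermany (easy, 95%): +synthetic > real only (+0.5) but **≈ classical augmentation**; EyePACS DR grading (hard, 50–66%): +synthetic ≈ real only and **clearly loses to classical augmentation (−1.6 on average)**.
+- **Conclusion**: the claim that synthetic data alone raises accuracy does not hold (≈ on the easy task, < classical augmentation on the hard task).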
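+
+### §6 Generative semantic-synthesis augmentation (segmentation, the most complete paradigm)
+- **Method**: trained a **mask→image** generator (`train_vb_sd3.py --mask2img`: condition = colored layer map, output = OCT; warm-started from Stage A*; 12k steps); used **real GT masks** (zero label noise) to generate OCT images with varied appearance → (real mask + synthetic image) pairs; ResNet-50 U-Net, real/classical augmentation/synthetic × annotation budget × 3 seeds.
+- **Results (miou_fg, N=100/250/500/1000)**: real only 14.5/25.3/27.3/45.5; +classical augmentation 18.3/31.7/49.0/56.6; +synthetic 18.6/32.4/38.5/54.2. +synthetic is far better than no augmentation (+4 to +11), but against classical augmentation it **ties at low annotation budgets (100/250) and loses by 2 to 10 points at high budgets (500/1000)**.
+- **Conclusion**: even in the most complete paradigm, the gain from synthetic augmentation is ≈/< classical augmentation and does not reliably surpass it.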
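+
+### §7 Zero-shot instruction generalization (**the moat, core result**)
+- **Method**: `zeroshot_spectrum.py`: the same instruction model (trained only on 9/5/3 layers) is tested on **layer counts never seen in training** (8/7/6/4/2, added to `seg_schemes.SCHEMES` but absent from `SCHEME_WEIGHTS`, so training never samples them).
+- **Results (N=120, miou_fg)**: seen mean **0.430** vs unseen mean **0.441** (unseen is even slightly higher); cross-device zero-shot on OCTA500: binary retina IoU **0.897**, 5-class mIoU 0.480. In the montage, the number of layers visibly follows the prompt from 9 bands down to 2 regions, with geometry preserved.
+- **Conclusion**: the model composes layer counts from the prompt rather than memorizing them. **A discriminative FM / U-Net with a single fixed output head cannot do this; to change the segmentation scheme they must retrain a network, whereas we only change one prompt. This is the paper's real and only exclusive differentiator.**
+
+---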
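+
+## 2. Failures / negative results / pitfalls (honest record)
+
+**Earlier conclusions overturned (corrected this session)**
+- "Generative segmentation/denoising crushes the FMs" → only because the FMs used frozen probes; after fair fine-tuning, ours is slightly behind on most tasks.
+- Hero capability matrix "top-tier segmentation / best denoising" → same cause; must be revised or downgraded.
+- "Synthetic augmentation is the real selling point (mixed>real)" → overstated by the old single-seed runs without controls; under rigorous evaluation it is ≈/< classical augmentation.
+- "Linear probe beats RETFound" → a fluke on the single Kermany task; after expanding to 14 tasks the result is competitive, not SOTA.
+
+**Key pitfalls (for future self-checks)**
+- **Protocol mismatch**: frozen probe vs end-to-end fine-tuning is hugely unfair (thin vessels 0.0→0.5); comparisons must use the same protocol.
+- **Saturated tasks mask effects**: Kermany at 95% has almost no headroom, so augmentation effects cannot be measured there; they only show on hard or low-annotation tasks.
+- **False color in denoising**: do not predict multiple channels for a grayscale modality; predict 1 channel and copy it.
+- **Half-written checkpoint race**: a 16GB checkpoint still being written was picked up by `ls -t|head -1` → "failed finding central directory"; confirm the checkpoint is fully written before evaluation or generation (this session worked around it by running evaluation serially after training, plus a `sleep 40` in §6).
+- **Mask granularity**: mask→image generation must use the **full 9-layer** mask (matching the paired GT label), otherwise thin layers do not match.
+- **DINOv2 resolution**: the probe uses 224 while other places use 518; keep this consistent.
+
+---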
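+
+## 3. Successes / positive conclusions
+- **§7 zero-shot instruction generalization**: unseen layer counts ≈ seen (0.44/0.43) plus cross-device 0.90; the only exclusive capability, backed by hard evidence.
+- **A fair, honest, complete downstream benchmark**: §0–§7, all with controls, multiple seeds (for the leading backends), and a wide-screen report.
+- **mask→image generator**: can generate realistic OCT matching the 9-layer geometry (semantic synthesis is feasible).
+- **Unified base**: one model covers generation + all segmentation schemes (including unseen ones) + denoising by changing the prompt.
+
+---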
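+
+## 4. Honest positioning (narrative skeleton for the paper)
+**A single generative instruction generalist**: competitive but not SOTA per task (it does not take first place in classification, segmentation, denoising, or augmentation), but
+① a unified base where one prompt covers generation + 7 segmentation types + denoising; ② **zero-shot instruction generalization (§7, which discriminative models cannot do)**; ③ shareable, customizable data (privacy, filling in rare classes). The core of the Nature submission is ② + ③ + coverage, not any single-task SOTA.
+
+---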
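+
+## 5. Next steps
+1. **Manuscript alignment (top priority)**: sync the honest conclusions into `octflow_main.tex`: cut overclaims such as "crushes the FMs / best at segmentation and denoising / augmentation is the real selling point"; make §7 zero-shot instruction generalization the main result and redraw the hero matrix to match the data.
+2. **Clinical (user side, external gate)**: multi-center external validation + clinician reader study (a hard requirement for Nature).
+3. **Optional**: after adding octa/oct_enface data, retrain to separate the two (task #51, waiting on hospital data); extend §7 with two more instruction axes, single-layer selection and color following.
+4. **Disk cleanup**: see below; ~1TB of intermediate weights awaiting cleanup (needs confirmation).
+
+---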
+
+## 6. Artifacts / file cleanup summary
+- **Scripts (this session, all under `pilot/path1/scripts/foundation/` or `scripts/`)**: `fm_seg_ft.py`/`fm_denoise_ft.py` (fair FM fine-tuning), `train_vb_sd3.py --mask2img`, `gen_mask2img.py`, `compare_panels_ft.py`/`compare_denoise_ft.py`, `train_classifier.py` (+aug/+label), `build_seg_aug_manifests.py`, `unet_seg_baseline.py` (+strong-aug), `zeroshot_spectrum.py`, `multitask_probe.py` (+7 tasks), `build_multitask_report.py` (§0–§7), and the various `run_*.sh`.
+- **Code review**: no rule violations (`weights_only=False` in place everywhere), no fatal bugs (all experiments verified); only robustness notes (half-written checkpoint race, hard-coded paths, eval-only unseen schemes added to `seg_schemes.SCHEMES`).
+- **Disk cleanup (A+B executed, 2026-06-21)**: `results/` **1.5T → 396G (~1.1T freed)**. Deleted 61 intermediate-step checkpoints (keeping only the latest/anchor per experiment) + 2 superseded directories (the original `sd3_vb_vessel`, and `sd3_vb_lesion_fluid`). **Kept**: base_v2/240k, stageA_v3/20k, stageC_v3a/30k, layer_100k/30k, layer_100k_v3b/15k, mask2img/12k, long_9mod/150k, the latest (step-4000) of each downstream _100k/_v3b run, and `foundation/` (24G of result JSON + figures). **Tier-C not deleted** (privacy_gen×3 / denoise_ft / stageC_v3b2, kept at the user's request). Script: `/tmp/cleanup_ckpts.sh` (retained for review).
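The half-written checkpoint race recorded in the log above ("failed finding central directory" after `ls -t|head -1` picked up a file still being written) can be guarded against without a fixed `sleep`. The following is a minimal sketch, assuming the checkpoints are saved in PyTorch's default zip-based format, whose central directory is written last; the function names and the settle interval are illustrative and not taken from the repository's scripts.

```python
import time
import zipfile
from pathlib import Path


def is_complete_checkpoint(path, settle_seconds=2.0):
    """True if the file size is stable and a zip end-of-central-directory record exists."""
    path = Path(path)
    if not path.is_file():
        return False
    size_before = path.stat().st_size
    time.sleep(settle_seconds)  # give an in-progress writer time to grow the file
    if path.stat().st_size != size_before:
        return False
    # A truncated zip archive has no end-of-central-directory record yet.
    return zipfile.is_zipfile(path)


def latest_complete_checkpoint(ckpt_dir, pattern="*.pt", settle_seconds=2.0):
    """Newest checkpoint (by mtime) that passes the completeness check, else None."""
    candidates = sorted(
        Path(ckpt_dir).glob(pattern),
        key=lambda p: p.stat().st_mtime,
        reverse=True,
    )
    for candidate in candidates:
        if is_complete_checkpoint(candidate, settle_seconds):
            return candidate
    return None
```

An evaluation script could call `latest_complete_checkpoint(...)` in place of the shell one-liner and skip or retry when it returns `None`. A more robust alternative on the training side is to write to a temporary name and rename the file once saving finishes, since a rename within one filesystem is atomic on POSIX systems.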
results/octflow_downstream_report.html
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d8a57e0bd9b092b550905da5c9045c1a46290236666cb69307ed99fb54246ca1
+size 20172787
results/result_jsons.tar.gz
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:cd240c3610995c00b03dcf4935c8e9da1ecadd463511a24e3ebed83927c81d13
+size 93006
results/zeroshot_spectrum.json
ADDED
@@ -0,0 +1,52 @@
+{
+  "9": {
+    "miou_incl_bg": 0.4169086099216114,
+    "miou_fg": 0.3750156815460237,
+    "n_layers": 9,
+    "seen": true
+  },
+  "5": {
+    "miou_incl_bg": 0.5372676917430389,
+    "miou_fg": 0.4604574883726739,
+    "n_layers": 5,
+    "seen": true
+  },
+  "3": {
+    "miou_incl_bg": 0.5336960968249542,
+    "miou_fg": 0.4548362254640317,
+    "n_layers": 3,
+    "seen": true
+  },
+  "8": {
+    "miou_incl_bg": 0.47966075374834255,
+    "miou_fg": 0.44272555344730624,
+    "n_layers": 8,
+    "seen": false
+  },
+  "7": {
+    "miou_incl_bg": 0.5102969715125301,
+    "miou_fg": 0.46952078938563657,
+    "n_layers": 7,
+    "seen": false
+  },
+  "6": {
+    "miou_incl_bg": 0.516031219093973,
+    "miou_fg": 0.4724559032195191,
+    "n_layers": 6,
+    "seen": false
+  },
+  "4": {
+    "miou_incl_bg": 0.4319120527269317,
+    "miou_fg": 0.37931733275814766,
+    "n_layers": 4,
+    "seen": false
+  },
+  "2": {
+    "miou_incl_bg": 0.5501978233780155,
+    "miou_fg": 0.43936878197981843,
+    "n_layers": 2,
+    "seen": false
+  },
+  "_seen_mean_fg": 0.43010313179424314,
+  "_unseen_mean_fg": 0.4406776721580856
+}
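The two summary fields in the JSON above can be recomputed from the per-scheme entries, which is a quick way to confirm the 0.430 (seen) and 0.441 (unseen) figures quoted in the README. The following is a small sketch using only the `miou_fg` and `seen` fields copied from the file; the function name is illustrative, and a plain arithmetic mean over schemes is an assumption about how the summary was produced.

```python
import json

# Per-scheme values copied from results/zeroshot_spectrum.json (other fields omitted).
SPECTRUM_JSON = """
{
  "9": {"miou_fg": 0.3750156815460237, "seen": true},
  "5": {"miou_fg": 0.4604574883726739, "seen": true},
  "3": {"miou_fg": 0.4548362254640317, "seen": true},
  "8": {"miou_fg": 0.44272555344730624, "seen": false},
  "7": {"miou_fg": 0.46952078938563657, "seen": false},
  "6": {"miou_fg": 0.4724559032195191, "seen": false},
  "4": {"miou_fg": 0.37931733275814766, "seen": false},
  "2": {"miou_fg": 0.43936878197981843, "seen": false},
  "_seen_mean_fg": 0.43010313179424314,
  "_unseen_mean_fg": 0.4406776721580856
}
"""


def seen_unseen_means(spectrum):
    """Average foreground mIoU separately over seen and unseen layer schemes."""
    groups = {"seen": [], "unseen": []}
    for key, entry in spectrum.items():
        if key.startswith("_"):
            continue  # skip the precomputed summary fields
        groups["seen" if entry["seen"] else "unseen"].append(entry["miou_fg"])
    return {name: sum(values) / len(values) for name, values in groups.items()}


spectrum = json.loads(SPECTRUM_JSON)
means = seen_unseen_means(spectrum)
print(f"seen {means['seen']:.3f}  unseen {means['unseen']:.3f}")
```

Note that the per-scheme spread is wider than the gap between the two means: the unseen 4-layer scheme (0.379) sits close to the seen 9-layer scheme (0.375), while the 6- and 7-layer schemes score highest.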
weights/sd3_vb_denoise_100k_step4000.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8d917cce08bd25c3c80866c0e08f8f7aff0a3a8418d7d1dcd059158d78f989ff
+size 8340765905
weights/sd3_vb_disccup_v3b_step4000.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9982ac013f6cd1d06fe19ba78044879eb10370ef046892c2e9d9393fabcc582c
+size 8340764533
weights/sd3_vb_faz_v3b_step4000.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e7e5055a51c2a7db5ce07270cd1675a623b74eb7cc97fe8fe70060bf876d19b4
+size 8340759045
weights/sd3_vb_fluid_v3b_step4000.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1bcbdd170aa5a0a734ad3e6ac3610d8668efed92cb647da95a8722950f04f85c
+size 8340761789
weights/sd3_vb_layer_100k_step30000.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0ec7d47a1f40f57ef1a11fc4f19fe18b690ccac2dba742e9bb768502f4430e84
+size 8340764533
weights/sd3_vb_layer_100k_v3b_step15000.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:59080de19c069b11f1467bda325340ec98ac5a6184b870f3e16eed55d4da7ed8
+size 8340770021
weights/sd3_vb_mask2img_step12000.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0c103dd73e740426256bd88a86daa0669043c5a5f3dbf2f27528e46c5895d2a9
+size 8340761789
weights/sd3_vb_octa500_100k_step4000.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:295512ddea29362bbc89ce44cd91acbdf9621ca6778a8c5b7415192f6d12436b
+size 8340765905
weights/sd3_vb_octavessel_v3b_step4000.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ef8c83141396e36ad968a73eb8b713ecf09d6f29db6e2c14ccca601e35770ba8
+size 8340768649
weights/sd3_vb_stageC_v3a_step30000.pt
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:253c3a60d0f2fadbc3c5f09c30a2d4eed2f173d79d0241cd90688abae027d038
+size 8340764533
weights/sd3_vb_vessel_v3b_step4000.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1d7282986845171780b5fcb60d9491dac34b5bc59b1b15d102716449ceb8fe0b
+size 8340763161