MaybeRichard committed
Commit 44d64c2 · verified · 1 Parent(s): 08d2566

OCTFlow Path-1 code + stripped weights (Stage A* + v3a + v1/v2)

README.md ADDED
@@ -0,0 +1,70 @@
+ ---
+ license: other
+ tags:
+ - oct
+ - ophthalmology
+ - segmentation
+ - stable-diffusion-3
+ - instruction-tuning
+ - medical-imaging
+ ---
+
+ # OCTFlow: Path 1 (SD3 backbone) code + weights
+
+ Reusable code and checkpoints for the OCTFlow pilot: an ophthalmic multimodal
+ generative model that does **prompt-controlled OCT retinal-layer segmentation**
+ (Vision-Banana-style instruction tuning on a Stable Diffusion 3 medium backbone).
+
+ This repo is for **continuing the work on a new machine**; the dataset is hosted
+ separately. Optimizer state has been stripped from the checkpoints (warm-start and
+ inference only need `model` weights).
+
+ ## Contents
+
+ | File | What |
+ |---|---|
+ | `octflow-raev2-code.tar.gz` | Full RAEv2 working tree (src/ engine + pilot/path1/ Path-1 code, configs, scripts). Excludes results/, .git/, pretrained_models/, data/. |
+ | `weights/sd3_oct_stageA_v3_step20000.pt` | **Stage A\*** — SD3 medium fine-tuned on Topcon OCT (T2I domain adaptation). The warm-start base for all Stage C runs. |
+ | `weights/sd3_vb_stageC_v3a_step30000.pt` | **v3a (best)** — multi-prompt instruction tuning. Follows prompts for 9/5/3-layer + arbitrary colors + single-layer selection; zero-shot adapts to new layer schemes. |
+ | `weights/sd3_vb_stageC_v1_step20000.pt` | (optional) v1 specialist, prob_seg=0.3, single fixed 10-color prompt. |
+ | `weights/sd3_vb_stageC_v2_step20000.pt` | (optional) v2 specialist, prob_seg=0.5. |
+
+ Each `.pt` holds `{step, model, ema, config}` (no optimizer). `model` is a
+ `SD3Transformer2DModel` with `pos_embed.proj` expanded 16→32 input channels
+ (channel-concat image conditioning).
+
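A minimal sketch of the checkpoint layout described above, working on a plain dict. The helper names `strip_checkpoint` and `pick_weights` are hypothetical and not taken from the repo; the assumption that `ema` holds an alternative set of model weights is ours, and the `optimizer` key in the usage example is only illustrative.

```python
from typing import Any, Dict

# Keys the README says each released .pt holds. Anything else in a training
# checkpoint (for example a hypothetical "optimizer" entry) is dropped.
KEPT_KEYS = ("step", "model", "ema", "config")


def strip_checkpoint(ckpt: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `ckpt` restricted to the released keys."""
    missing = [key for key in KEPT_KEYS if key not in ckpt]
    if missing:
        raise KeyError(f"checkpoint is missing expected keys: {missing}")
    return {key: ckpt[key] for key in KEPT_KEYS}


def pick_weights(ckpt: Dict[str, Any], use_ema: bool = False) -> Any:
    """Select which weights to load: raw `model` or (assumed) `ema` weights."""
    return ckpt["ema"] if use_ema else ckpt["model"]


if __name__ == "__main__":
    training_ckpt = {
        "step": 20000,
        "model": {"pos_embed.proj.weight": "tensor placeholder"},
        "ema": {"pos_embed.proj.weight": "ema tensor placeholder"},
        "config": {"name": "example"},
        "optimizer": {"state": "large optimizer state"},  # illustrative only
    }
    released = strip_checkpoint(training_ckpt)
    print(sorted(released))
```

With PyTorch installed, the dict would typically come from `torch.load(path, map_location="cpu")`. Depending on how `config` was pickled, recent PyTorch versions may require `weights_only=False`, which should only be used for files you trust.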
+ ## Key results (v3a)
+
+ - **Instruction following**: prompt 9/5/3 layers → outputs 6.95/4.36/2.85 layers; shuffled-color prompt mIoU 0.456 ≈ canonical 0.461 (the model reads the prompt's color map).
+ - **Cross-device zero-shot (OCTA500, native 5-layer prompt)**: binary retina IoU **0.538 → 0.897** vs the single-prompt pilot.
+ - **Per-scheme mIoU (incl. bg, N=150)**: 9-layer 0.461 / 5-layer 0.526 / 3-layer 0.610.
+ - vs OCT-RAE backbone: 10-class strict mIoU 0.023 → 0.507 (22×).
+
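For readers unfamiliar with the metrics quoted above, here is a small sketch of mean IoU including background and of binary retina IoU on flattened label maps. It is not the repo's evaluation code: treating label 0 as background, skipping classes absent from both maps, and the function names are all assumptions made for illustration.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean IoU over classes 0..num_classes-1 (class 0 assumed to be background).

    Classes that appear in neither map are skipped rather than counted as 0 or 1.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of pixels")
    ious = []
    for cls in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == cls and t == cls)
        union = sum(1 for p, t in zip(pred, target) if p == cls or t == cls)
        if union:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


def binary_iou(pred: Sequence[int], target: Sequence[int]) -> float:
    """IoU of 'any retinal layer' vs background: every non-zero label is foreground."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of pixels")
    inter = sum(1 for p, t in zip(pred, target) if p != 0 and t != 0)
    union = sum(1 for p, t in zip(pred, target) if p != 0 or t != 0)
    return inter / union if union else 0.0


if __name__ == "__main__":
    pred = [0, 0, 1, 1, 2, 2]
    target = [0, 1, 1, 1, 2, 0]
    print(mean_iou(pred, target, num_classes=3))
    print(binary_iou(pred, target))
```

The "22×" figure in the last bullet is the plain ratio of the two mIoU values: 0.507 / 0.023 ≈ 22.04.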
+ ## Restore on a new server
+
+ ```bash
+ # 1. download this repo
+ hf download <this-repo-id> --repo-type model --local-dir octflow_restore
+
+ # 2. unpack code
+ mkdir -p RAEv2 && tar xzf octflow_restore/octflow-raev2-code.tar.gz -C RAEv2
+ cd RAEv2
+
+ # 3. env (uv) + put weights back where run.sh expects them
+ #    (we are now inside RAEv2/, so the download dir is one level up)
+ uv sync # or: conda env + pip install diffusers transformers torch ...
+ mkdir -p pilot/path1/results/sd3_oct_stageA_v3/checkpoints
+ mkdir -p pilot/path1/results/sd3_vb_stageC_v3a/checkpoints
+ cp ../octflow_restore/weights/sd3_oct_stageA_v3_step20000.pt pilot/path1/results/sd3_oct_stageA_v3/checkpoints/step-0020000.pt
+ cp ../octflow_restore/weights/sd3_vb_stageC_v3a_step30000.pt pilot/path1/results/sd3_vb_stageC_v3a/checkpoints/step-0030000.pt
+
+ # 4. point configs/scripts at the new dataset root, then see pilot/path1/run.sh
+ ```
+
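The two `cp` commands in step 3 follow a visible naming pattern: `<run>_step<N>.pt` goes to `<results root>/<run>/checkpoints/step-<N zero-padded to 7 digits>.pt`. The sketch below encodes that pattern. The function name `restore_path` is hypothetical, and applying the same rule to the optional v1/v2 checkpoints is an assumption, since the README only shows the copies for Stage A* and v3a.

```python
import re
from pathlib import PurePosixPath

# Release file names look like "<run>_step<N>.pt".
_WEIGHT_NAME = re.compile(r"^(?P<run>.+)_step(?P<step>\d+)\.pt$")


def restore_path(weight_file: str, results_root: str = "pilot/path1/results") -> PurePosixPath:
    """Map a released weight file to the checkpoint path used in the cp commands."""
    match = _WEIGHT_NAME.match(PurePosixPath(weight_file).name)
    if match is None:
        raise ValueError(f"unexpected weight file name: {weight_file}")
    step = int(match["step"])
    return PurePosixPath(results_root) / match["run"] / "checkpoints" / f"step-{step:07d}.pt"


if __name__ == "__main__":
    for name in (
        "weights/sd3_oct_stageA_v3_step20000.pt",
        "weights/sd3_vb_stageC_v3a_step30000.pt",
    ):
        print(name, "->", restore_path(name))
```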
+ SD3 medium base weights (`stabilityai/stable-diffusion-3-medium-diffusers`) are
+ downloaded from HF at runtime, not bundled here.
+
+ ## Reproduce / next step
+
+ The full pipeline is `pilot/path1/run.sh`. Next planned step is **v3b**:
+ decoded-space loss (palette CE + soft Dice + thin-layer weighting) to fix the
+ generalist tax and weak thin layers (RPE/GCL). Clinical scope is the macula.
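The README names soft Dice with thin-layer weighting as part of the planned v3b loss but does not specify it. The following is only a generic illustration of a class-weighted soft Dice term on flattened per-class probabilities, not the authors' loss. The function name, the weighting scheme, and the `eps` smoothing are assumptions.

```python
from typing import Sequence


def weighted_soft_dice_loss(
    probs: Sequence[Sequence[float]],
    onehot: Sequence[Sequence[float]],
    class_weights: Sequence[float],
    eps: float = 1e-6,
) -> float:
    """Class-weighted soft Dice loss.

    probs[c][i]  : predicted probability of class c at pixel i
    onehot[c][i] : 1.0 if pixel i belongs to class c, else 0.0
    class_weights: larger values emphasise a class (e.g. thin layers)
    """
    total = 0.0
    weight_sum = 0.0
    for p_c, g_c, weight in zip(probs, onehot, class_weights):
        inter = sum(p * g for p, g in zip(p_c, g_c))
        denom = sum(p_c) + sum(g_c)
        dice = (2.0 * inter + eps) / (denom + eps)
        total += weight * (1.0 - dice)
        weight_sum += weight
    return total / weight_sum


if __name__ == "__main__":
    target = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
    soft_pred = [[0.9, 0.8, 0.2, 0.1], [0.1, 0.2, 0.8, 0.9]]
    # Weight the second class twice as heavily, as one might for a thin layer.
    print(weighted_soft_dice_loss(soft_pred, target, class_weights=[1.0, 2.0]))
```

Because Dice is a ratio of overlap to total mass per class, a thin layer with few pixels contributes as much to the unweighted sum as a thick one; the explicit weights push further in that direction.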
octflow-raev2-code.tar.gz ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f1f4703eb902ed2849f291e8f96d14433c70db8a311063f8808dadbff2c57305
+ size 1072660574
weights/sd3_oct_stageA_v3_step20000.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:794136b2ccfe297adf566969366ddaab9c6fe0f9e7eb50aecc1c018dd34dc440
+ size 8340371276
weights/sd3_vb_stageC_v1_step20000.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:26b4d8c76a59f7622e5641fe6200831cfe32f3bbe0cb2d17e2ccc9e925b2f97a
+ size 8340762482
weights/sd3_vb_stageC_v2_step20000.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:fd2999818ef21280b808cb2f2404c86652a2d461eae8a09a56cc94ea82367874
+ size 8340762482
weights/sd3_vb_stageC_v3a_step30000.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e34057970598b7b2f89180c5f7461a146a8c2da2037122a8fd5037506234cf71
+ size 8340763852
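Each pointer file above records the SHA-256 digest and byte size of the real file, so a multi-gigabyte download can be checked against it. A minimal sketch using only the standard library follows; the helper names `parse_lfs_pointer` and `verify_download` are hypothetical, and the local path in the usage example is a placeholder.

```python
import hashlib
from pathlib import Path
from typing import Dict, Union


def parse_lfs_pointer(text: str) -> Dict[str, Union[str, int]]:
    """Parse the 'version / oid / size' lines of a Git LFS pointer file."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if line:
            key, _, value = line.partition(" ")
            fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algo}")
    return {"version": fields["version"], "sha256": digest, "size": int(fields["size"])}


def verify_download(path: Union[str, Path], pointer: Dict[str, Union[str, int]]) -> bool:
    """Check a downloaded file's size and SHA-256 against a parsed pointer."""
    file_path = Path(path)
    if file_path.stat().st_size != pointer["size"]:
        return False  # cheap check first, before hashing gigabytes
    digest = hashlib.sha256()
    with file_path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == pointer["sha256"]


if __name__ == "__main__":
    pointer = parse_lfs_pointer(
        "version https://git-lfs.github.com/spec/v1\n"
        "oid sha256:e34057970598b7b2f89180c5f7461a146a8c2da2037122a8fd5037506234cf71\n"
        "size 8340763852\n"
    )
    local = Path("octflow_restore/weights/sd3_vb_stageC_v3a_step30000.pt")  # placeholder path
    if local.exists():
        print("checksum ok:", verify_download(local, pointer))
```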